Molecular Orbital Calculations for Biological Systems
Topics in Physical Chemistry A Series of Advanced Textbooks and Monographs Series Editor, Donald G. Truhlar
F. Iachello and R. E. Levine, Algebraic Theory of Molecules
P. Bernath, Spectra of Atoms and Molecules
J. Cioslowski, Electronic Structure Calculations on Fullerenes and Their Derivatives
E. R. Bernstein, Chemical Reactions in Clusters
J. Simmons and J. Nichols, Quantum Mechanics in Chemistry
G. A. Jeffrey, An Introduction to Hydrogen Bonding
S. Scheiner, Hydrogen Bonding: A Quantum Chemical Perspective
T. G. Dewey, Fractals in Molecular Biophysics
A.-M. Sapse, ed., Molecular Orbital Calculations for Biological Systems
Molecular Orbital Calculations for Biological Systems
EDITED BY
Anne-Marie Sapse
New York
Oxford
Oxford University Press
1998
Oxford University Press Oxford New York Athens Auckland Bangkok Bogota Buenos Aires Calcutta Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Paris Sao Paulo Singapore Taipei Tokyo Toronto Warsaw and associated companies in Berlin Ibadan
Copyright © 1998 by Oxford University Press, Inc. Published by Oxford University Press, Inc. 198 Madison Avenue, New York, New York 10016 Oxford is a registered trademark of Oxford University Press All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press. Library of Congress Cataloging-in-Publication Data Molecular orbital calculations for biological systems / edited by Anne-Marie Sapse. p. cm.—(Topics in physical chemistry) Includes bibliographical references and index. ISBN 0-19-509873-0 1. Molecular orbitals. 2. Biomolecules. 3. Peptides. 4. Amino acids. 5. Antineoplastic agents. I. Sapse, Anne-Marie. II. Series: Topics in physical chemistry series. QP517.M66M65 1998 547'.70448—dc21 97-37834
9 8 7 6 5 4 3 2 1 Printed in the United States of America on acid-free paper
This book is dedicated to my husband, Marcel Sapse, and to my daughter, Danielle Sapse, whom I thank for their support. Anne-Marie Sapse
Preface
The application of quantum chemical calculations to biological systems has been made possible by huge advances in computer facilities and by the creation of better computer programs, capable of handling large systems. This book describes some of the quantum chemical methods used for such calculations, together with some widely used computer programs. Chapter 1 gives a short description of ab initio methods, Hartree-Fock and post-Hartree-Fock, focusing on the Gaussian computer programs. Chapter 2 describes semi-empirical calculations and their applications to biological systems. Chapter 3 addresses the electrostatic properties of molecules, as determined by quantum-chemical methods. The density functional method is discussed in chapter 4. Chapter 5 compares theoretically obtained parameters with experimental data. The second part of the book, consisting of chapters 6 and 7, describes the application of ab initio calculations to such biological systems as anti-cancer drugs (chapter 6) and amino acids and peptides (chapter 7). The book is addressed mainly to biochemists who would like to augment experimental studies with theoretical calculations.
Contents
Contributors
Introduction
1 Ab Initio Calculations, Anne-Marie Sapse
2 An Introduction to the Theoretical Basis of Semi-Empirical Quantum-Mechanical Methods for Biological Chemists, Nigel G. J. Richards
3 The Molecular Electrostatic Potential: A Tool for Understanding and Predicting Molecular Interactions, Jane S. Murray and Peter Politzer
4 Applications of Density Functional Theory to Biological Systems, Tomasz A. Wesolowski and Jacques Weber
5 On Comparing Experimental and Calculated Structural Parameters, Lothar Schafer and John D. Ewbank
6 Ab Initio Studies of Anti-Cancer Drugs, Anne-Marie Sapse
7 Ab Initio Calculations of Amino Acids and Peptides, Lothar Schafer, Susan Q. Newton, and Xiaoqin Jiang
Index
Contributors
John D. Ewbank, Department of Chemistry, University of Arkansas, Fayetteville, Ark. 72701, USA
Xiaoqin Jiang, Department of Chemistry, University of Arkansas, Fayetteville, Ark. 72701, USA
Jane S. Murray, Department of Chemistry, University of New Orleans, New Orleans, La. 70148, USA
Susan Q. Newton, Department of Chemistry, University of Arkansas, Fayetteville, Ark. 72701, USA
Peter Politzer, Department of Chemistry, University of New Orleans, New Orleans, La. 70148, USA
Nigel G. J. Richards, Department of Chemistry, University of Florida, Gainesville, Fl. 32611-7200, USA
Anne-Marie Sapse, City University of New York, Graduate School and John Jay College, New York, N.Y. 10019, USA and Rockefeller University, New York, N.Y. 10021, USA
Lothar Schafer, Department of Chemistry, University of Arkansas, Fayetteville, Ark. 72701, USA
Jacques Weber, Universite de Geneve, Department of Physical Chemistry, Geneva, Switzerland
Tomasz A. Wesolowski, Universite de Geneve, Department of Physical Chemistry, Geneva, Switzerland
Introduction
In its early stages, more than fifty years ago, molecular quantum mechanics began to be used to study the structure of matter by relating the properties of atoms and molecules to the positions and interactions of electrons and nuclei. The basic equation of quantum mechanics, the Schrödinger differential equation, was applied to chemical systems starting with the simplest one, the hydrogen atom. Even for this simple system, solving the Schrödinger equation requires advanced calculus methods using special functions such as Legendre polynomials. For any system featuring more than one electron, starting with the helium atom, it has long been recognized that the Schrödinger equation cannot be solved analytically, so approximate methods had to be developed. Accordingly, progress in quantum chemistry is made in two directions: refining the methods for the exact treatment of small systems such as atoms and small molecules, and finding reasonable approximate models for the treatment of larger systems. Both paths of research make use of computational chemistry, which simulates chemical structures and reactions numerically. The results of the simulations are tested by comparing them with experimental data, when such data are available, and are subsequently used for predictions. In this way, a few hours of computer simulation can provide information which might require months of laboratory work. Short-lived intermediates which cannot be isolated experimentally can be characterized via computer modeling. Transition states can be identified and characterized. An example of the usefulness of computation is the search for a putative intermediate of a reaction, which would normally take many hours of laboratory work. If the theoretical work shows that this intermediate does not represent a minimum on the energy hypersurface, and as such could not be isolated, much experimental work is saved.
The information obtained via computational studies on a system concerns its structure and its reactivity. The structure is determined by geometry optimization (finding the set of geometrical parameters, such as bond lengths, bond angles, and dihedral angles, which enables the system to adopt the lowest possible energy state). In addition to the geometry, calculations provide both the value of the total energy of the system (the sum of the electronic energy and the energy of the nuclei) and the binding energies of monomers to form oligomers (usually expressed as the energy of the oligomer minus the sum of the energies of the monomers). The method also provides a description of the wave function of the system, with occupied and virtual orbitals, for the ground state or for excited states, net atomic and bond charges, as well as electric fields, spin distributions, and the vibrational frequencies resulting from the vibration of the atoms within the molecule. The chemical reactivity of compounds is studied by transition state location, activation energy calculations, and the relative energies of products versus reactants.

These calculations are now applied to a large variety of systems, but even as early as 1953, the pioneering work of Alberte and Bernard Pullman opened the door for applying quantum chemistry to the investigation of biological systems. Their early papers investigate the possibility that the carcinogenic activity of polynuclear aromatic hydrocarbons is related to their electronic structure. For instance, Alberte Pullman, in the paper "Complements on the Factors Determining the Existence of Carcinogenic Activity in the Aromatic Hydrocarbons" (presented at the Séances de l'Académie Française and subsequently published in Comptes Rendus des Séances de l'Académie Française), uses the LCAO (linear combinations of atomic orbitals) method to calculate the energies of ortho and para polarization of some aromatic hydrocarbons, and tries to correlate them with carcinogenic activity for prediction purposes. Bernard Pullman and Jeanne Baudet, in "The Metabolism of Carcinogenic Hydrocarbons," use the structure parameters obtained previously for quinoline systems to calculate the bond index and free valence index in the "M" region of aromatic hydrocarbons, and to describe an epoxy formation.
Although the correlation between structural properties of aromatic hydrocarbons and their carcinogenic properties proved to be much more complicated than was hoped, this type of calculation opened the door to the application of quantum chemistry to biological systems. The calculations are applied not only to cancer-related problems, but also to the study of amino acids, peptides, nucleotides, and therapeutic agents other than anti-cancer drugs. The size of the systems is still a limiting factor, but huge strides are being made in writing programs which can handle larger systems. Ab initio calculations can now be performed on systems which, not many years ago, could only be treated with semi-empirical or empirical methods. Researchers are striving to find the optimum combination of accuracy and expediency, and the ultimate goal is to reduce the computational effort at no cost in accuracy. The subsequent chapters describe various quantum-chemical methods, compare them with experimental results, and discuss their applications to such biological systems as amino acids, peptides, carcinostatic drugs, and DNA fragments. Proteins and large DNA fragments cannot as yet be treated with quantum-chemical methods, owing to their size, but progress is being made continuously.
1 Ab Initio Calculations Anne-Marie Sapse
Various difficulties of classical physics, including its inadequate description of atoms and molecules, led to new ways of visualizing physical realities, ways which are embodied in the methods of quantum mechanics. Quantum mechanics is based on the description of particle motion by a wave function, $\Psi$, satisfying the Schrödinger equation, which in its "time-independent" form is:

$$-\frac{h^{2}}{8\pi^{2}m}\nabla^{2}\Psi + V\Psi = E\Psi$$

or, for short: $H\Psi = E\Psi$. In this equation, $H$, the Hamiltonian operator, is defined by $H = -(h^{2}/8\pi^{2}m)\nabla^{2} + V$, where $h$ is Planck's constant ($6.6 \times 10^{-34}$ J s), $m$ is the particle's mass, $\nabla^{2}$ is the sum of the second partial derivatives with respect to $x$, $y$, and $z$, and $V$ is the potential energy of the system. As such, the Hamiltonian operator is the sum of the kinetic energy operator and the potential energy operator. (Recall that an operator is a mathematical expression which manipulates the function that follows it in a certain way. For example, the operator $d/dx$ placed before a function differentiates that function with respect to $x$.) $E$ represents the total energy of the system and is a number, not an operator. The wave function $\Psi$ contains all the information about the system that is available within the limits of the Heisenberg uncertainty principle, which states that the exact position and velocity of a microscopic particle cannot be determined simultaneously. Therefore, the information provided by $\Psi(x,t)$ is in terms of probability: $|\Psi(x,t)|^{2}\,\Delta x$ is the probability of finding the particle between $x$ and $x + \Delta x$ at time $t$.

The Schrödinger equation applied to atoms will thus describe the motion of each electron in the electrostatic field created by the positive nucleus and by the other electrons. When the equation is applied to molecules, the nuclei, owing to their much larger mass, move so slowly that their relative motion is considered negligible compared with that of the electrons (Born-Oppenheimer approximation). Accordingly, the electronic distribution in a molecule depends on the position of the nuclei and not on their motion. The kinetic energy operator for the nuclei is considered to be zero. For a many-electron molecule, the Hamiltonian operator can thus be written as the sum of the electrons' kinetic energy term, which in turn is the sum of the individual electrons' kinetic energies, and the electronic and nuclear potential energy terms. The first term can be expressed as:

$$-\frac{h^{2}}{8\pi^{2}m}\sum_{i}\nabla_{i}^{2}$$
where the sum is taken over the electrons. The electronic potential energy is due to the attraction between the positive nuclei and the negative electrons, which can be expressed as:

$$-\sum_{i}\sum_{I}\frac{Z_{I}e^{2}}{|R_{I} - r_{i}|}$$

where $i$ represents, as before, the summation over electrons and $I$ is the summation over nuclei. $Z_{I}$ is the charge of the $I$th nucleus and $|R_{I} - r_{i}|$ is the distance between the $I$th nucleus and the $i$th electron. To this term, one must add the term representing the repulsion between electrons:

$$\sum_{i}\sum_{j>i}\frac{e^{2}}{|r_{i} - r_{j}|}$$

where $|r_{i} - r_{j}|$ represents the distance between electron $i$ and electron $j$. The nuclear potential energy, that is, the repulsion between nuclei, is given by

$$\sum_{I}\sum_{J>I}\frac{Z_{I}Z_{J}e^{2}}{|R_{I} - R_{J}|}$$
where $|R_{I} - R_{J}|$ is the distance between nucleus $I$ and nucleus $J$, and must also be added. In this way, the Schrödinger equation $H\Psi = E\Psi$ describes the motion of electrons in the electrostatic field of fixed nuclei. As mentioned in the introduction, this equation cannot be solved analytically for systems larger than the hydrogen atom. Therefore, a number of approximations have to be introduced.

When the Schrödinger equation is applied to atoms, the wave function is made up of a set of functions called atomic orbitals, each of which corresponds to a given energy state containing a number of electrons determined by Pauli's exclusion principle. If the exact form of these functions is known, the exact energy of the system can be computed. If the exact function is not known, an educated guess can be used. The Variation Principle states that the expectation value of the energy computed from any acceptable trial function $\Psi$ is never lower than the exact energy of the system. Accordingly, minimizing the energy with respect to the parameters characterizing a wave function yields equations whose solutions are the set of parameters giving the energy closest to the exact energy of the system that is obtainable with a wave function of that particular form.

In order to find a good approximate wave function, one uses the Hartree-Fock procedure. Indeed, the main reason the Schrödinger equation is not solvable analytically is the presence of interelectronic repulsion terms of the form $e^{2}/|r_{i} - r_{j}|$. In the absence of these terms, the equation for an atom with $n$ electrons could be separated into $n$ hydrogen-like equations. The Hartree-Fock method, also called the Self-Consistent-Field method, regards all electrons except one (called, for instance, electron 1) as forming a cloud of electric charge through which electron 1 moves. The electronic cloud is characterized by its charge density which is, in turn, a function of the atomic orbitals describing the electrons. Once the interaction between electron 1 and the cloud is calculated, the Schrödinger equation is solved and improved atomic orbitals are obtained, which replace the initial guess. The new set of orbitals is used to calculate a new charge density, which leads to an even better set of atomic orbitals. This iteration is repeated until the change between successive cycles falls below a chosen threshold.

When the Hartree-Fock method is applied to molecules, molecular orbitals are used instead of atomic orbitals. To construct the molecular orbitals, one widely used approximation is LCAO (linear combinations of atomic orbitals). According to molecular orbital theory, the total wave function of the system is written as a combination of molecular orbitals, $\psi_{i}$, which are complemented by spin functions describing electrons in terms of spin $+\tfrac{1}{2}$ ($\alpha$) or $-\tfrac{1}{2}$ ($\beta$). The Hartree-Fock method involves the calculation of integrals of atomic functions. The computation time required grows approximately as $N^{4}$, where $N$ is the number of basis functions and therefore increases with the number of atoms. For a large system, this makes the calculations a formidable task. Indeed, integrals carried out for atomic orbitals which are exponential functions of the form $e^{-ar}$, where $r$ is the distance from the electron to the origin, are very cumbersome. To facilitate the task, these functions were replaced by Gaussian functions of the form $e^{-ar^{2}}$, which greatly shorten the computation time. However, since atomic orbitals are not of Gaussian form, they had to be expanded in a series of Gaussians, with the general form:

$$g = C\,e^{-\alpha r^{2}}$$
where $\alpha$ is a constant determining the radial extent of the function and $C$ is also a constant. These "primitive" Gaussians are combined to form the actual basis functions, which are called "contracted" Gaussians. The atomic orbitals are then expressed as:

$$\varphi = \sum_{\mu} c_{\mu}\chi_{\mu}$$

where $c_{\mu}$ are coefficients and $\chi_{\mu}$ are the contracted Gaussians.

A number of computer programs were devised in order to perform calculations on different systems. One of the most widely used is the series of Gaussian programs, originally written at Carnegie-Mellon University in Pittsburgh. These programs establish sets of basis functions and use them for Hartree-Fock and post-Hartree-Fock calculations. The smallest basis set used is called STO-3G, where STO stands for Slater-type orbitals, which are the s, p, and d orbitals of atoms. In the STO-3G case, s orbitals are used for hydrogen atoms and s and p orbitals for the other atoms. Each Slater-type orbital is expanded into three Gaussian functions. In general, this set is used for systems too big to allow the use of larger basis sets. Other minimal basis sets are STO-4G, STO-5G, and STO-6G. The value of the computed energy of the system depends on the number of Gaussians used. However, energy differences, optimum geometries, and atomic charges are fairly insensitive to the size of the expansion. In most cases, optimum bond distances calculated with STO-3G agree closely with experimental ones. For instance, the C-F bond length in CH3F is calculated to be 1.384 Å and is found to be 1.385 Å experimentally. The C-C bond lengths are usually reliably computed, as in ethane, where the calculated value is 1.538 Å v. 1.531 Å experimentally. The C-N bond length in HCN is found to be 1.153 Å v. 1.154 Å, showing excellent agreement. The C-O bond is slightly overestimated, with 1.217 Å v. 1.203 Å, and the C-N bond length in peptide bonds is also predicted to be longer than experimental values. The complete set of A-B bond lengths, for A and B being C, N, O, and P, shows a mean absolute deviation from experiment of 0.03 Å.
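The Variation Principle and the use of Gaussian functions described above can be illustrated together on the hydrogen atom. The following is a minimal sketch, not part of the original text: it assumes atomic units and a single normalized trial function of the form exp(-alpha r^2), for which the energy expectation value has the standard closed form 3*alpha/2 - 2*sqrt(2*alpha/pi). The function names and the golden-section minimizer are illustrative choices.

```python
import math

def trial_energy(alpha: float) -> float:
    """Energy (hartree) of the hydrogen atom for the normalized trial
    function exp(-alpha * r**2), in atomic units.

    Kinetic energy expectation value:    3 * alpha / 2
    Nuclear attraction expectation value: -2 * sqrt(2 * alpha / pi)
    """
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)

def minimize_golden(f, lo: float, hi: float, tol: float = 1e-8) -> float:
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return 0.5 * (a + b)

EXACT_ENERGY = -0.5  # exact ground-state energy of the hydrogen atom, hartree

alpha_opt = minimize_golden(trial_energy, 0.01, 2.0)
e_opt = trial_energy(alpha_opt)

print(f"optimal exponent   : {alpha_opt:.5f}")
print(f"variational energy : {e_opt:.5f} hartree")
print(f"exact energy       : {EXACT_ENERGY:.5f} hartree")
```

Analytically, the minimum lies at alpha = 8/(9 pi) with an energy of -4/(3 pi) hartree, which is above the exact value of -0.5 hartree, as the Variation Principle requires. The gap is one way to see why a single Gaussian is a poor stand-in for an exponential orbital and why several Gaussians are combined in sets such as STO-3G.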
The bond angles are also satisfactorily reproduced in most cases. This is also true for dihedral angles, which are sometimes predicted more accurately than with the larger, double-zeta sets. For instance, experimental evidence favors the gauche structure for 1,2-difluoroethane. STO-3G calculations lead to a gauche structure, while 4-31G calculations predict a trans structure. The 4-31G basis set belongs to the series of split-valence basis sets. These feature two or more Slater-type orbitals for each atomic orbital of the valence electrons, while the core electrons are described by one Slater-type orbital. For instance, in double-zeta sets, there are two s orbitals for hydrogen, s and s'. For nonhydrogen atoms of the second row, the valence electrons are described by s, s', p, and p' orbitals, while the core electrons are described by one s orbital. In one of the most used double-zeta sets, 6-31G, the core orbital is expanded in a series of six Gaussian functions, while of the two functions describing each valence orbital, one is expanded in a series of three Gaussians and the other is approximated by one Gaussian. The triple-zeta sets, such as 6-311G, describe the valence electrons by three Slater-type orbitals. In this case, one is expanded in a series of three Gaussians and the other two are each approximated by one Gaussian. Some improvement is obtained for the geometry optimization, as compared to the minimal basis sets. For instance, the 4-31G basis set shows a mean absolute deviation from experiment of only 0.11 Å. However, these basis sets predict angles that are too large for some systems, such as the HOH bond angle in water, HNH in ammonia, and HOC in various alcohols. The energies predicted, though, are much lower than those predicted by the minimal basis sets, and energy differences or reaction energies are much more reliable.
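The growth in basis set size from minimal to split-valence sets can be made concrete by counting contracted basis functions. The sketch below is an illustration added here, not part of the original text: it assumes molecules built only from hydrogen and the atoms Li to F, uses the s and p counts described above, and the table and function names are hypothetical.

```python
# Contracted basis functions per atom for the basis sets named in the text.
# "heavy" means an atom from Li to F (core 1s plus 2s and 2p valence shell).
#   STO-3G : H -> s            ; heavy -> 1s, 2s, 2p(x, y, z)
#   6-31G  : H -> s, s'        ; heavy -> core 1s + (s, p) + (s', p')
#   6-311G : H -> s, s', s''   ; heavy -> core 1s + three (s, p) shells
FUNCTIONS_PER_ATOM = {
    "STO-3G": {"H": 1, "heavy": 5},
    "6-31G": {"H": 2, "heavy": 9},
    "6-311G": {"H": 3, "heavy": 13},
}

HEAVY_ATOMS = {"Li", "Be", "B", "C", "N", "O", "F"}

def count_basis_functions(formula: dict, basis: str) -> int:
    """Total number of contracted basis functions for a molecular formula
    given as {element symbol: number of atoms}."""
    per_atom = FUNCTIONS_PER_ATOM[basis]
    total = 0
    for element, n_atoms in formula.items():
        if element == "H":
            total += n_atoms * per_atom["H"]
        elif element in HEAVY_ATOMS:
            total += n_atoms * per_atom["heavy"]
        else:
            raise ValueError(f"element {element!r} is outside this sketch")
    return total

water = {"O": 1, "H": 2}
alanine = {"C": 3, "H": 7, "N": 1, "O": 2}

for name, formula in (("water", water), ("alanine", alanine)):
    for basis in FUNCTIONS_PER_ATOM:
        print(f"{name:8s} {basis:7s} {count_basis_functions(formula, basis):4d}")

# Rough cost ratio for alanine if the time grows as the fourth power of
# the number of basis functions.
ratio = (count_basis_functions(alanine, "6-31G")
         / count_basis_functions(alanine, "STO-3G")) ** 4
print(f"6-31G v. STO-3G cost ratio for alanine: about {ratio:.0f}")
```

Under these assumptions alanine goes from 37 functions with STO-3G to 68 with 6-31G and 99 with 6-311G, which is why the choice of basis set is dictated so strongly by the size of the system.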
A further improvement of the basis sets consists of the introduction of polarization functions, such as d orbitals on nonhydrogen atoms or p orbitals on hydrogens. Indeed, while split-valence basis sets allow the orbitals to change size but not shape, polarization functions possess angular momentum beyond that required for the ground state of the atom. Some of the most used basis sets are 6-31G*, which adds d orbitals on nonhydrogen atoms, and 6-31G**, which also adds p orbitals on the hydrogens. For systems containing sulfur, such as amino acids or thiols, if the system is too large to warrant the use of the 6-31G* set, the 3-21G* basis set can be used, which places d orbitals only on the sulfur. When the species described is an anion or, more generally, electron-rich, it is advisable to add diffuse functions to the basis set. These functions allow orbitals to occupy a larger region of space. The most used types are 6-31+G*, which adds diffuse s- and p-type functions to nonhydrogen atoms, and 6-31++G*, which also adds a diffuse s function on each hydrogen. The description of negatively charged amino acids and, in general, of carboxylate ions benefits from the use of diffuse functions.

Besides total energies, energy differences for such processes as oligomerization, chemical reaction, and complex formation, and optimum molecular geometries, Hartree-Fock calculations provide the net atomic charge on each atom of the system under investigation. One of the methods for the calculation of these charges is the Mulliken Population Analysis. Using the electron density function concept, Mulliken proposed that the total atomic charge on an atom X can be calculated as the atomic number of X minus its gross atomic population. The gross population is the sum of the net population of the functions associated only with atom X and half of the overlap population of the functions shared between atom X and the atoms bound to it.
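The bookkeeping of the Mulliken analysis can be followed on the smallest possible case: two atoms, one basis function on each, and one doubly occupied molecular orbital. This is only a sketch under those assumptions; the coefficients, the overlap value, and the function name are illustrative and do not come from any calculation in this chapter.

```python
import math

def mulliken_two_center(c_a, c_b, overlap, z_a, z_b, n_electrons=2):
    """Mulliken charges for a single molecular orbital c_a*chi_a + c_b*chi_b
    holding n_electrons, where chi_a sits on atom A and chi_b on atom B.

    z_a and z_b are the nuclear charges (or valence electron counts) against
    which the gross populations are compared.
    """
    # Normalize the orbital, taking the overlap of chi_a and chi_b into account.
    norm = math.sqrt(c_a**2 + c_b**2 + 2.0 * c_a * c_b * overlap)
    c_a, c_b = c_a / norm, c_b / norm

    net_a = n_electrons * c_a**2                 # population only on A
    net_b = n_electrons * c_b**2                 # population only on B
    overlap_population = 2.0 * n_electrons * c_a * c_b * overlap

    # Mulliken's rule: the overlap population is split half and half.
    gross_a = net_a + 0.5 * overlap_population
    gross_b = net_b + 0.5 * overlap_population
    return z_a - gross_a, z_b - gross_b

# Homonuclear case (H2-like): equal coefficients give zero charges.
q_h1, q_h2 = mulliken_two_center(1.0, 1.0, overlap=0.66, z_a=1, z_b=1)
print(f"symmetric bond : {q_h1:+.3f} {q_h2:+.3f}")

# Polarized bond: a larger coefficient on A pulls electron density toward A.
q_a, q_b = mulliken_two_center(1.0, 0.5, overlap=0.40, z_a=1, z_b=1)
print(f"polarized bond : {q_a:+.3f} {q_b:+.3f}")
```

The equal split of the overlap population is the defining, and most criticized, choice of the scheme, which is one reason charges fitted to the electrostatic potential are often preferred.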
An even more reliable method of calculation of the atomic charges is the electrostatic potential-derived charge, devised by Kollman. This method, available in the Gaussian-92 program, assigns point charges to fit the computed electrostatic potential at a number of points on or near the van der Waals surface.
All of the above calculations can be performed on two kinds of systems: closed-shell systems and open-shell systems. In the former, electrons are assigned to orbitals in pairs and the total spin is zero, so the multiplicity is 1. In this case, the restricted Hartree-Fock method (RHF) can be applied. For radicals with doublet or triplet states, the unrestricted Hartree-Fock method (UHF) has to be applied. In this method, $\alpha$ and $\beta$ electrons (spin up and spin down) are assigned to different spatial orbitals, so there are two distinct sets: $\psi_{i}^{\alpha}$ and $\psi_{i}^{\beta}$.
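The choice between RHF and UHF follows from the electron count and the number of unpaired electrons. A small sketch of that bookkeeping is given here; the element table is deliberately short and the function names are illustrative, not taken from any program.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9, "P": 15, "S": 16}

def electron_count(formula: dict, charge: int = 0) -> int:
    """Number of electrons for a formula {element: count} and a net charge."""
    return sum(ATOMIC_NUMBER[el] * n for el, n in formula.items()) - charge

def spin_state(formula: dict, charge: int, n_unpaired: int):
    """Return (multiplicity, method) for a given number of unpaired electrons.

    The multiplicity is 2S + 1 with S = n_unpaired / 2, that is n_unpaired + 1.
    """
    n_electrons = electron_count(formula, charge)
    if n_unpaired > n_electrons or (n_electrons - n_unpaired) % 2 != 0:
        raise ValueError("this number of unpaired electrons is impossible "
                         f"for {n_electrons} electrons")
    multiplicity = n_unpaired + 1
    method = "RHF" if multiplicity == 1 else "UHF"
    return multiplicity, method

alanine = {"C": 3, "H": 7, "N": 1, "O": 2}
methyl_radical = {"C": 1, "H": 3}

print(electron_count(alanine), spin_state(alanine, charge=0, n_unpaired=0))
print(electron_count(methyl_radical),
      spin_state(methyl_radical, charge=0, n_unpaired=1))
```

Neutral alanine has an even number of electrons and can be treated as a closed-shell singlet, whereas the methyl radical has an odd number of electrons and must be treated as an open-shell doublet.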
Post-Hartree-Fock Methods

The Hartree-Fock method adequately describes the ground state of most molecules. However, the exact wave function should take into account the fact that electrons repel each other and need "breathing space." To maintain this breathing space, the electrons should be allowed to make use of energy levels which are normally empty in the ground state. In other words, terms describing excited states should be added to the ground state wave function. This is accomplished by replacing the Hartree-Fock form of the wave function, a single determinant, with a linear combination of determinants in which each determinant represents a particular electronic configuration. Since the wave function now consists of the sum of these determinants, it represents a "configuration interaction." The energy obtained with the corrected function is lower than the one obtained with the single determinant function. The difference between the two energies is due to the inclusion of electron correlation in the former, and it is called the "correlation energy."

There are systems in whose description the inclusion of the correlation energy is mandatory. For instance, chemical reactions in which a bond is broken cannot be studied without correlation energy, because the correlation energy of a pair of electrons in a bond differs greatly from that of the same electrons once they are separated by the breaking of the bond. Another example of a calculation requiring the inclusion of correlation energy is the description of aromatic-aromatic interactions. These interactions are particularly important in protein studies, owing to the presence of the aromatic side-chains of such amino acids as tryptophan and tyrosine in the hydrophobic regions of the interior of proteins. It is possible that these interactions play a role in the inactivation of enzymes by inhibitors, as seen in the inactivation of chymotrypsin by 6-chloro-2-pyrone compounds, or in the carboxypeptidase A-inactivator complex.
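The lowering of the energy by configuration interaction can be seen in the smallest model: the Hartree-Fock determinant mixed with a single excited determinant. The sketch below diagonalizes that 2 x 2 problem in closed form. The three energies are illustrative numbers chosen for the example, not results for any molecule, and the function name is hypothetical.

```python
import math

def two_configuration_ci(e_ground: float, e_excited: float,
                         coupling: float) -> float:
    """Lowest eigenvalue of the symmetric 2 x 2 matrix

        | e_ground   coupling  |
        | coupling   e_excited |

    which is the configuration-interaction energy when the ground
    determinant is mixed with one excited determinant.
    """
    mean = 0.5 * (e_ground + e_excited)
    half_gap = 0.5 * (e_excited - e_ground)
    return mean - math.sqrt(half_gap**2 + coupling**2)

# Illustrative values in hartree (assumptions, not computed data).
E_HF = -1.10        # energy of the single-determinant wave function
E_EXCITED = -0.40   # energy of the excited determinant
COUPLING = 0.18     # matrix element connecting the two determinants

e_ci = two_configuration_ci(E_HF, E_EXCITED, COUPLING)
correlation_energy = e_ci - E_HF

print(f"single determinant : {E_HF:.5f} hartree")
print(f"two configurations : {e_ci:.5f} hartree")
print(f"correlation energy : {correlation_energy:.5f} hartree")
```

Whatever the sign of the coupling, the mixed energy can only be equal to or lower than the single-determinant energy, which is the behavior described above for the corrected wave function.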
Aromatic-aromatic interactions, which are among the interactions responsible for hydrophobic binding in proteins, have been found to exhibit an experimental binding energy of around 2.4 kcal/mol. The calculations of Jorgensen and Severance, using intermolecular potential functions in a Coulomb plus Lennard-Jones format, find energies of interaction for benzene-benzene complexes from 1.7 to 2.31 kcal/mol. In a study of these interactions, as relevant to the binding of substrate and inhibitors to dihydrofolate reductase, ab initio calculations have been applied to such systems. It was discovered that in order to obtain reliable results, the correlation energy had to be added to the Hartree-Fock energy of interaction (see chapter 6).

A number of methods are available for estimating the correlation energy of a system. One of the most precise, full configuration interaction, requires too much computational effort to be applied to anything but very small systems. Therefore, limited configuration interaction is used for larger systems. The simplest method that improves on Hartree-Fock energies is CID, or Configuration Interaction Doubles, which includes double substitutions only. A somewhat better energy is obtained with CISD, which includes both single and double substitutions. Higher CIs are now available, but none of these methods can as yet be applied to large biological systems.

An alternative way to compute the correlation energy is the Møller-Plesset perturbation method. The correlation energy is treated as a perturbation and the electronic Hamiltonian is expressed as:

$$H_{\lambda} = H_{0} + \lambda V$$

where $H_{0}$ is the noncorrelated Hamiltonian and $V = (H - H_{0})$, where $H$ is the correlated Hamiltonian and $\lambda$ is a parameter. The exact $\Psi$ and $E$ are expanded in powers of $\lambda$, and by taking into account successive powers of $\lambda$ one obtains the MP2, MP3, or MP4 terms. In order to keep the computational time reasonable, only MP2 terms can be used for larger systems. For instance, in the case of aromatic-aromatic interactions, MP2 calculations yielded realistic values for the binding energies.
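The idea of expanding the energy in powers of the perturbation parameter can be tried on a two-level model, where the exact answer is known. This is a sketch of ordinary Rayleigh-Schrödinger perturbation theory, not of the Møller-Plesset method itself: the unperturbed levels e0 and e1 and the coupling v are illustrative numbers, and the labels second order and fourth order correspond only loosely to MP2 and MP4.

```python
import math

def exact_ground_energy(e0: float, e1: float, v: float, lam: float = 1.0) -> float:
    """Lowest eigenvalue of H0 + lam*V with H0 = diag(e0, e1) and V coupling
    the two levels through the off-diagonal element v."""
    mean = 0.5 * (e0 + e1)
    half_gap = 0.5 * (e1 - e0)
    return mean - math.sqrt(half_gap**2 + (lam * v) ** 2)

def perturbation_energy(e0: float, e1: float, v: float,
                        order: int, lam: float = 1.0) -> float:
    """Ground-state energy summed through the given order in lam.

    For this model the odd-order corrections vanish, and
        E(2) = -v**2 / (e1 - e0)
        E(4) = +v**4 / (e1 - e0)**3
    """
    gap = e1 - e0
    energy = e0
    if order >= 2:
        energy += -(lam * v) ** 2 / gap
    if order >= 4:
        energy += (lam * v) ** 4 / gap**3
    return energy

E0, E1, V = -1.10, -0.40, 0.18  # hartree, illustrative values only

exact = exact_ground_energy(E0, E1, V)
second = perturbation_energy(E0, E1, V, order=2)
fourth = perturbation_energy(E0, E1, V, order=4)

print(f"zeroth order : {E0:.5f}")
print(f"second order : {second:.5f}")
print(f"fourth order : {fourth:.5f}")
print(f"exact        : {exact:.5f}")
```

In this model most of the correction is already recovered at second order, which is consistent with the practice described above of stopping at MP2 when the system is large.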
The Gaussian Programs

Suppose one wanted to study a biological system, possibly as small as an amino acid or as big as a polypeptide or a small nucleotide strand. There are a number of computer programs equipped to handle such systems, including the Gaussian programs. The Gaussian programs are the product of Gaussian, Inc., Pittsburgh, Pennsylvania. They perform quantum chemical calculations, using either semi-empirical methods such as AM1, MINDO/3, and PM3, or the ab initio methods discussed previously. In order to apply the program to the study of a given system within the ab initio frame, one has to indicate whether the calculations are to be at the Hartree-Fock level or at a post-Hartree-Fock level. Accordingly, after the job cards, a card will contain a keyword: HF for Hartree-Fock; MP2 for Møller-Plesset of 2nd order; MP3 and MP4 for Møller-Plesset terms of 3rd and 4th order; CIS for Configuration Interaction, Singles; QCISD for Quadratic Configuration Interaction, Singles and Doubles; and QCISD(T) for Quadratic Configuration Interaction, Singles, Doubles (and Triples). Then, the basis set has to be selected. Let us take as an example the amino acid alanine. It is a small enough system that a fairly large basis set, such as 6-31G*, can be used. Suppose one is interested in the optimum geometry of the molecule, as obtained at the Hartree-Fock level, and in the energy corresponding to this geometry. Then a geometry optimization has to be performed, so the card will read:

HF/6-31G* OPT

After this, there is a blank line and then a line which indicates the name of the job, in this case, "alanine." After another blank line, it is necessary to indicate the charge and spin multiplicity of the system. Since this is a neutral system, the charge is zero, and since it is a closed-shell system (total spin is zero), the multiplicity, 2S + 1, is 1. Next, one has to specify the initial geometry of the system.
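The layout of the input just described (keyword line, blank line, job name, blank line, charge and multiplicity, geometry) can be assembled mechanically. The helper below is a hypothetical sketch, not part of the Gaussian programs. Two assumptions are made that the text does not state: the keyword line is written with a leading "#", as in current Gaussian input files, and the input ends with a blank line. Water, with approximate textbook geometry values, stands in for the molecule so that the example is complete.

```python
def build_gaussian_input(route: str, title: str, charge: int,
                         multiplicity: int, geometry_lines: list) -> str:
    """Assemble the text of a Gaussian-style input file.

    route           e.g. "HF/6-31G* OPT"
    title           free-text name of the job
    charge          net charge of the system
    multiplicity    spin multiplicity, 2S + 1
    geometry_lines  Cartesian or Z-matrix lines, one string per atom
    """
    if multiplicity < 1:
        raise ValueError("multiplicity 2S + 1 must be at least 1")
    lines = [
        f"# {route}",                 # keyword line (leading '#' is assumed)
        "",                           # blank line
        title,                        # name of the job
        "",                           # blank line
        f"{charge} {multiplicity}",   # charge and spin multiplicity
    ]
    lines.extend(geometry_lines)
    # End with a blank line after the geometry.
    return "\n".join(lines) + "\n\n"

# Illustrative Z-matrix for water (approximate values, angstroms and degrees).
water_zmatrix = [
    "O",
    "H 1 0.96",
    "H 1 0.96 2 104.5",
]

text = build_gaussian_input("HF/6-31G* OPT", "water optimization",
                            charge=0, multiplicity=1,
                            geometry_lines=water_zmatrix)
print(text)
```

For alanine the same function would be called with the title "alanine", charge 0, multiplicity 1, and the lines of the chosen geometry specification.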
This can be done in two ways: specifying the X, Y, Z coordinates of each atom, or, more commonly, defining a "Z-matrix", which specifies the way atoms are bound to each other, the bond lengths, the bond angles, and the dihedral angles. It is necessary to number the atoms of the molecule, as in the matrix below, which shows the numbering for alanine. It is more convenient to number first the nonhydrogen atoms, the "skeleton" of the molecule, and then to add the necessary hydrogens. Any chosen name can be given to bonds and to angles, as long as one is consistent and correctly assigns the initial values to the name of the parameter. The Z-matrix for alanine can read:

C1
C2   1 C1C2
C3   1 C1C3   2 C3C2C1
N4   1 N4C1   3 N4C1C3   2 N4C1C3C2
O5   2 O5C2   1 O5C2C1   3 O5C2C1C3
O6   2 O6C2   1 O6C2C1   5 180.0
H7   1 H7C1   2 H7C1C2   3 H7C1C2C3
H8   4 H8N4   1 H8N4C1   2 H8N4C1C2
H9   4 H9N4   1 H9N4C1   8 NHN 1
H10  3 H10C3  1 H10C3C1  2 H10C3C1C2
H11  3 H11C3  1 H11C3C1  10 120.0
H12  3 H12C3  1 H12C3C1  10 -120.0
H13  6 H13O6  2 H13O6C2  5 H13O6C2O5
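A Z-matrix of this kind can be checked mechanically before it is submitted. The sketch below uses a transcription of the alanine listing above and verifies two things: that every atom refers only to atoms defined on earlier lines, and that the number of internal coordinates equals 3N - 6 for N atoms. The parser is a hypothetical illustration that handles only this simple layout, and it treats the trailing "1" on the H9 line as a format flag (the third value being a second bond angle rather than a dihedral) that it merely skips.

```python
ALANINE_ZMATRIX = """
C1
C2   1 C1C2
C3   1 C1C3   2 C3C2C1
N4   1 N4C1   3 N4C1C3   2 N4C1C3C2
O5   2 O5C2   1 O5C2C1   3 O5C2C1C3
O6   2 O6C2   1 O6C2C1   5 180.0
H7   1 H7C1   2 H7C1C2   3 H7C1C2C3
H8   4 H8N4   1 H8N4C1   2 H8N4C1C2
H9   4 H9N4   1 H9N4C1   8 NHN 1
H10  3 H10C3  1 H10C3C1  2 H10C3C1C2
H11  3 H11C3  1 H11C3C1  10 120.0
H12  3 H12C3  1 H12C3C1  10 -120.0
H13  6 H13O6  2 H13O6C2  5 H13O6C2O5
"""

def is_number(token: str) -> bool:
    """True if the token is a literal value rather than a parameter name."""
    try:
        float(token)
        return True
    except ValueError:
        return False

def parse_zmatrix(text: str) -> list:
    """Return a list of (label, reference atoms, values) for each line."""
    rows = []
    for line in text.strip().splitlines():
        tokens = line.split()
        label, rest = tokens[0], tokens[1:]
        n_pairs = min(3, len(rest) // 2)  # a trailing format flag is ignored
        refs = [int(rest[2 * k]) for k in range(n_pairs)]
        values = [rest[2 * k + 1] for k in range(n_pairs)]
        rows.append((label, refs, values))
    return rows

def summarize_zmatrix(rows: list) -> dict:
    """Validate the references and count the internal coordinates."""
    for position, (label, refs, _values) in enumerate(rows, start=1):
        if len(refs) != min(position - 1, 3):
            raise ValueError(f"{label}: wrong number of reference atoms")
        if len(set(refs)) != len(refs) or any(not 1 <= r < position for r in refs):
            raise ValueError(f"{label}: refers to an undefined or repeated atom")
    all_values = [v for _label, _refs, values in rows for v in values]
    return {
        "atoms": len(rows),
        "internal_coordinates": len(all_values),
        "fixed_values": sum(1 for v in all_values if is_number(v)),
        "named_parameters": len({v for v in all_values if not is_number(v)}),
    }

summary = summarize_zmatrix(parse_zmatrix(ALANINE_ZMATRIX))
print(summary)
```

For the 13 atoms of alanine, 3N - 6 gives 33 internal coordinates, which is the number of entries in the listing: 30 named parameters and 3 values entered as numbers.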
In this form, C1 defines a point, C2 is bound to it and defines a line, and C3, which is also bound to C1, defines a plane together with C1 and C2. The position of N4 relative to this plane is fixed by the dihedral angle N4C1C3C2, that is, the angle made by the N4C1C3 plane with the C2C1C3 plane. If the angle N4C1C3C2 is zero or 180 degrees, the four atoms lie in one plane, with N4 syn or anti to C2, respectively. If not, the angle has to be given the appropriate value. The above parameters can be used in two ways: either kept frozen at a certain value or optimized. The program performs the optimization by setting the derivatives of the energy with respect to each parameter to zero and solving the equations thus obtained. Initial values have to be assigned to all the parameters. Usually, when no special information is available, these are typical experimental values of bond lengths and angles. For instance, a C-C single bond will be assigned a length of 1.54 Å, and the bond angles at an sp3-hybridized carbon a value of 109.5 degrees. One notices that in the above example, the dihedral angles of hydrogens 11 and 12 have been assigned fixed values of 120 and -120 degrees. This is due to the symmetry of the methyl group. It would also be possible, again because of the methyl group's symmetry, to give the same name to the C3-H bond lengths and to the H-C3-C1 angles of all three methyl hydrogens. This will cut down on computational time. Then those lines are:

    H10   3  H10C3    1  H10C3C1    2  H10C3C1C2
    H11   3  H10C3    1  H10C3C1    10 120.0
    H12   3  H10C3    1  H10C3C1    10 -120.0

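How a program turns Z-matrix entries into Cartesian coordinates can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of any particular package: the function name `place_atom`, the C-N bond length of 1.47 Å, and the 120 degree dihedral are assumptions made for the example, while 1.54 Å and 109.5 degrees are the initial values quoted above.

```python
import numpy as np

def place_atom(a, b, c, bond, angle_deg, dihedral_deg):
    """Position of a new atom D from one Z-matrix line (hypothetical helper).

    bond         : distance D-a
    angle_deg    : angle D-a-b
    dihedral_deg : dihedral D-a-b-c
    """
    theta = np.radians(angle_deg)
    phi = np.radians(dihedral_deg)
    ab = (b - a) / np.linalg.norm(b - a)       # unit vector a -> b
    bc = c - b
    u = bc - np.dot(bc, ab) * ab               # part of b -> c perpendicular to a-b
    u = u / np.linalg.norm(u)
    w = np.cross(u, ab)                        # third axis of the local frame
    direction = (np.cos(theta) * ab
                 + np.sin(theta) * (np.cos(phi) * u + np.sin(phi) * w))
    return a + bond * direction

# Skeleton atoms C1, C2, C3, N4 of alanine with illustrative initial values.
r_cc, r_cn, tetra = 1.54, 1.47, 109.5          # r_cn is an assumed value
c1 = np.array([0.0, 0.0, 0.0])                 # C1 defines a point
c2 = np.array([r_cc, 0.0, 0.0])                # C2 defines a line
t = np.radians(tetra)
c3 = c1 + r_cc * np.array([np.cos(t), np.sin(t), 0.0])   # C3 defines a plane
# Line "N4 1 N4C1 3 N4C1C3 2 N4C1C3C2": bonded to 1, angle with 3, dihedral with 2
n4 = place_atom(c1, c3, c2, r_cn, tetra, 120.0)
print(np.round(n4, 3))
```

The first three atoms are placed by hand because they have no complete set of three reference atoms; every later atom can be placed with the same helper.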
Once the choice about the particular computation desired is made (for instance, geometry optimization, single-point energy, or vibrational frequencies) and the Z-matrix is written, the program is run and the output has to be analyzed. The first information to be retrieved from the output is the value of the total energy. A number of iteration cycles are used to obtain a converged value. When convergence is not obtained, the program will stop. One way to avoid this is to relax the convergence threshold. Another way is to change some parameters whose values might cause the lack of convergence. The names given to the parameters do not have to describe what the parameter is. For instance, H10C3 can be called c3 for short, whereas H10C3C1 could be called d3 and the other parameters
called c1 or d1 or any other chosen name, as long as they are distinguished from each other. After the program has been executed, the output will provide the SCF energy and will go through several cycles of changing the geometrical parameters, aiming to bring the first derivative of the energy with respect to each parameter as close as possible to zero while the eigenvalues of the Hessian matrix (the diagonalized matrix of the second derivatives of the energy with respect to the geometrical parameters) remain positive. The second derivatives have to be positive in order to ensure a minimum and not a maximum of the energy. If one of the eigenvalues is negative, the given state is not a minimum but a saddle point, characterizing a transition state. If the point of the calculation is to search for a transition state, the TS command has to be used together with the optimization command. When the predicted change in energy and the maximum displacement fall under some threshold, the optimization is completed. Other properties of the system are obtained, such as the energy levels of the molecular orbitals, as well as their composition in terms of Slater-type orbitals, the net atomic charges of each atom, the charges on nonhydrogen atoms with the charges of their attached hydrogens summed in, dipole and higher moments, and Fermi contact analysis for systems with nonzero spin. Electrostatic fields can also be computed. If the command "frequency" is given by including the keyword "freq" on the line containing the basis set, the vibrational frequencies are computed, as well as the zero-point vibrational energy, the heat capacities, and the entropies. Thus, the free energy of the system can be obtained for a given temperature. Even if the eigenvalues of the second-derivative matrix (the Hessian matrix) are all positive, a better way of checking whether the stationary point is indeed a minimum is to make sure that all the frequencies are real and not imaginary.
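The eigenvalue test described above can be illustrated with a short Python sketch. It is a minimal illustration assuming a Hessian expressed in internal coordinates (so that no zero eigenvalues from overall translation or rotation appear); the function name and the tolerance are hypothetical choices, not part of any program's interface.

```python
import numpy as np

def classify_stationary_point(hessian, tol=1e-8):
    """Classify a stationary point from the eigenvalues of its Hessian."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    n_negative = int(np.sum(eigenvalues < -tol))
    if n_negative == 0:
        return "minimum"
    if n_negative == 1:
        return "transition state (first-order saddle point)"
    return f"higher-order saddle point ({n_negative} negative eigenvalues)"

# Model surfaces: E = x^2 + y^2 has a minimum at the origin,
# E = x^2 - y^2 has a saddle point there.
print(classify_stationary_point([[2.0, 0.0], [0.0, 2.0]]))
print(classify_stationary_point([[2.0, 0.0], [0.0, -2.0]]))
```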
The computation of the frequencies is more accurate than that of the eigenvalues, and it constitutes a more stringent criterion. The following chapters will present applications of the ab initio method to different biochemical systems, such as amino acids, peptides, and anti-tumor drugs.

References

Foresman, J. B. and A. Frisch. 1993. Exploring Chemistry with Electronic Structure Methods. Gaussian Inc., Pittsburgh, Pa.
Hehre, W. J., L. Radom, P. v. R. Schleyer, and J. A. Pople. 1986. Ab Initio Molecular Orbital Theory. Wiley, New York.
Levine, I. N. 1991. Quantum Chemistry. Prentice Hall, Englewood Cliffs, NJ.
Schaefer III, H. F. 1977. Applications of Electronic Structure Theory. Plenum, New York and London.
2 An Introduction to the Theoretical Basis of Semi-Empirical Quantum-Mechanical Methods for Biological Chemists

Nigel G. J. Richards

Computational methods that can be employed to investigate fundamental questions concerning the complex chemical and structural behavior of biological molecules such as proteins [1], carbohydrates [2], and nucleic acids [3] have traditionally been limited by the large number of atoms that comprise even the simplest system of biochemical interest. As a consequence, highly parameterized, empirical force field methods have been developed that describe the energy of macromolecular structures as a function of the spatial locations of the atomic nuclei [4-7]. In combination with algorithms for simulating molecular dynamics, these classical models allow relatively accurate calculations of the structural and thermodynamic properties associated with proteins and nucleic acids [8-11]. On the other hand, empirical approaches cannot be used to model molecular behavior that is directly dependent on electrons and their energies. For example, no information can be obtained concerning the electronic spectra of macromolecule/ligand complexes [12], or electron transfer reactions such as those that occur within the photosynthetic reaction center [13], in nitrogenase, an enzyme involved in nitrogen fixation [14], or in cytochrome c oxidase, which catalyzes the reduction of oxygen in the last step of aerobic respiration [15]. Accurate modeling of transition states, excited states, and intermediates in biological catalysis [16,17] requires application of quantum-mechanical (QM) representations since all of these phenomena depend on the distribution and/or excitation of electrons. At present, the most accurate ab initio algorithms for calculating electronic structure cannot be applied to systems comprised of hundreds of atoms, as such calculations scale as $N^4$ to $N^7$ on most workstations, where N is the number of functions used in constructing the many-electron, molecular wavefunction [18].
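To give a feel for what such scaling laws imply, the following Python sketch computes by how much the cost grows when the number of functions N is increased by a given factor. It is simple arithmetic on the power laws only; the function name is hypothetical, prefactors and hardware effects are ignored, and the exponent 3 is included because that is the scaling quoted later in this chapter for semi-empirical methods.

```python
def cost_ratio(size_factor, exponent):
    """Growth in cost when N grows by size_factor, assuming cost ~ N**exponent."""
    return size_factor ** exponent

# A tenfold larger system under the scaling exponents mentioned in the text.
for exponent in (3, 4, 7):
    print(f"N^{exponent}: cost grows by a factor of {cost_ratio(10, exponent):,}")
```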
Even with the implementation of ab initio codes optimized for use on parallel computing engines, and density functional approaches [19], it is likely that high-accuracy QM calculations in the near future will remain limited to systems that comprise tens, rather than hundreds, of nonhydrogen atoms. Semi-empirical quantum-mechanical methods combine fundamental theoretical treatments of electronic behavior with parameters obtained from experiment to obtain approximate wavefunctions for molecules composed of hundreds of atoms [20-22]. Originally developed in response to the need to evaluate the electronic properties of organic molecules, especially those possessing unusual structures and/or chemical reactivity in organic chemistry,
these methods have now been widely implemented in commercial software packages that allow nonexperts to perform complex calculations, and visualize electronic and molecular properties [23]. Semi-empirical algorithms for obtaining many-electron wavefunctions scale only as $N^3$ on most workstations, allowing larger systems to be treated by these methods relative to ab initio techniques. While many calculations are still performed on isolated, "gas-phase" structures, recent theoretical advances in the construction of solvation models mean that the effects of solvent upon structure and chemical behavior can be investigated computationally [24-27]. The development of methods that can be used to obtain semi-empirical wavefunctions that completely describe all of the atoms in small proteins is also under active investigation [28]. Unfortunately, the nature of the approximations employed to solve the equations that yield the wavefunctions describing the electronic properties limits both the accuracy of specific methods and the molecular structures to which they can be applied. Various flavors of semi-empirical models therefore exist, each with its strengths and weaknesses, and care must be taken when choosing the best model for describing the system of biochemical interest. The material in this chapter seeks to provide an overview of the theoretical basis, and practical application, of semi-empirical methods for the nonexpert with minimal interest in the mathematical basis of quantum-mechanical calculations. Although many of the equations underlying a number of semi-empirical theories and solvation models are presented in this discussion, no effort is made to prove specific results because excellent detailed derivations are available elsewhere [20]. There is a bewildering variety of different models, defined by acronyms such as CNDO [29], INDO [30], MNDO [31], MINDO/3 [30], AM1 [32], SINDO1 [33], SAM1 [34], and PM3 [35].
The interpretation of the fundamental equations in chemical terms is therefore an essential element of understanding the strengths and limitations of each of these semi-empirical techniques in solving biochemical problems. On the other hand, this discussion cannot aim to provide a comprehensive picture of the vast, and ever-increasing, literature describing the use of semi-empirical calculations in biological chemistry. Methods selected for discussion are those that appear to be widely used, given their implementation in popular software packages. Algorithms based on modern valence bond theories [36,37], and those which avoid matrix multiplication and diagonalization by forming localized orbitals from hybrid atomic orbitals [38], are not presented here, although these techniques may have significant implications for the future treatment of biochemical problems. More comprehensive insights into specific problems and methods are provided by a number of excellent specialist articles and reviews [20,21,39-42], which may be consulted by interested readers.

Developing the Fundamental Equations: Hartree-Fock Theory

Most semi-empirical models are based on the fundamental equations of Hartree-Fock theory. In the following section, we develop these equations for a molecular system composed of A nuclei and N electrons in the stationary state. Assuming that the atomic nuclei are fixed in space (the Born-Oppenheimer approximation), the electronic wavefunction obeys the time-independent Schrödinger equation:

$$H(1, 2, 3, \ldots, N)\,\Psi_i = E_i\,\Psi_i \tag{2.1}$$

where the many-electron Hamiltonian H(1, 2, 3, ..., N), in atomic units, has the following form:

$$H(1, 2, 3, \ldots, N) = -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^2 - \sum_{i=1}^{N}\sum_{a=1}^{A}\frac{Z_a}{R_{ia}} + \sum_{i<j}\frac{1}{r_{ij}} \tag{2.2}$$

where $Z_a$ is the charge on the ath nucleus, $R_{ia}$ is the distance from the ith electron to the ath nucleus, and $r_{ij}$ is the distance between electrons i and j. The solutions of the Schrödinger equation [Eq. (2.1)] are therefore a series of eigenfunctions $\Psi_i$ of the Hamiltonian that describe the spatial distribution of the electrons about the nuclei associated with a particular energy, $E_i$. These eigenfunctions $\Psi_i$ may be complex, having real and imaginary terms, and are usually chosen to be orthonormal so that the following results hold:

$$\int \Psi_i^{*}\Psi_i\,d\tau = 1 \tag{2.3A}$$

$$\int \Psi_i^{*}\Psi_j\,d\tau = 0 \quad (i \neq j) \tag{2.3B}$$

The normalization condition [Eq. (2.3A)] ensures that the probability of finding the electrons somewhere in all space is unity. Unfortunately, finding exact analytical solutions of the many-electron Schrödinger equation (2.1) is not mathematically possible and so methods must be found to obtain approximate solutions. Although many-electron wavefunctions are complex functions, the energy of the molecular system must be an observable, and therefore real, quantity. The mathematical consequence of this condition is that the Hamiltonian operator H(1, 2, 3, ..., N) is Hermitian, and so the following result is true for all orthonormal solutions of the Schrödinger equation [Eq. (2.1)]:

$$\int \Psi_i^{*} H \Psi_j\,d\tau = \left(\int \Psi_j^{*} H \Psi_i\,d\tau\right)^{*} \tag{2.4}$$

Algebraic manipulation of Eq. (2.1) then allows the energy, $E_i$, to be expressed in an alternate form:

$$E_i = \frac{\int \Psi_i^{*} H \Psi_i\,d\tau}{\int \Psi_i^{*}\Psi_i\,d\tau} \tag{2.5}$$

This suggests a strategy for obtaining the true wavefunction describing the molecular system. Any function $\Psi_{trial}$ that can be constructed as an approximate solution to the many-electron Schrödinger equation will yield an energy [Eq. (2.5)] that is no lower than the value $E_0$ associated with the exact ground-state solution $\Psi_0$:

$$E_{trial} = \frac{\int \Psi_{trial}^{*} H \Psi_{trial}\,d\tau}{\int \Psi_{trial}^{*}\Psi_{trial}\,d\tau} \geq E_0 \tag{2.6}$$

This condition is termed the variational principle. Thus, the trial wavefunction can be optimized using standard techniques [43] until the system energy is minimized. At this point, the final solution can be regarded as $\Psi_0$ for all practical purposes. It is clear, however, that the wavefunction that is obtained following this iterative procedure will depend on the assumptions employed in the optimization procedure. Unfortunately, the many-electron wavefunction itself does not necessarily provide insight into the chemistry of complex molecules as it describes the electronic distribution over the whole system. It is therefore assumed that the true many-electron wavefunction $\Psi_i$ can be represented as the product of a series of independent, one-electron wavefunctions:

$$\Psi_i(1, 2, \ldots, N) = \psi_1(1)\,\psi_2(2)\cdots\psi_N(N) \tag{2.7}$$

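As a brief aside on the variational principle stated above, the following Python sketch optimizes a one-parameter trial function for the hydrogen atom. It assumes the trial function exp(-alpha r^2), for which the energy expectation value in atomic units is the closed form E(alpha) = 3 alpha/2 - 2 sqrt(2 alpha/pi); the grid search is an illustrative choice rather than the optimization technique used in production codes. The exact ground-state energy of hydrogen is -0.5 hartree, so every trial energy must lie above this value.

```python
import numpy as np

def trial_energy(alpha):
    """Energy (hartree) of the hydrogen atom for the trial function exp(-alpha r^2)."""
    return 1.5 * alpha - 2.0 * np.sqrt(2.0 * alpha / np.pi)

# Scan the variational parameter and keep the lowest energy found.
alphas = np.linspace(0.01, 2.0, 20000)
energies = trial_energy(alphas)
best = np.argmin(energies)
alpha_opt, e_opt = alphas[best], energies[best]

print(f"optimal alpha  = {alpha_opt:.4f}  (analytic 8/(9 pi) = {8 / (9 * np.pi):.4f})")
print(f"minimum energy = {e_opt:.4f}  (exact ground state = -0.5000)")
```

The optimized energy remains above the exact value because a single Gaussian cannot reproduce the true exponential wavefunction, which illustrates how the result depends on the assumed form of the trial function.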
The product form in Eq. (2.7) is termed an independent electron model, and terms such as $\psi_1(1)$ are termed molecular orbitals (MOs). This equation is equivalent to assuming that the probabilities of electrons occupying the same region of space are independent, i.e., that each electron moves in the averaged field of the bare nuclei and the other (N - 1) electrons. At this point in the derivation, so as to simplify the notation, the subscript for a particular solution to the Schrödinger equation (2.1) and its associated energy will be dropped. Thus Eq. (2.7) can be rewritten as:

$$\Psi(1, 2, \ldots, N) = \psi_1(1)\,\psi_2(2)\cdots\psi_N(N) \tag{2.8}$$

The exact form of the wavefunction is also conditioned, however, by the observation that electrons possess spin quantum numbers of either +1/2 or -1/2. Consequently, physically correct solutions to the Schrödinger equation (2.1) must be antisymmetric with respect to the interchange of any two electrons. Mathematically, this condition can be written as:

$$P_{ij}\,\Psi(1, \ldots, i, \ldots, j, \ldots, N) = \Psi(1, \ldots, j, \ldots, i, \ldots, N) = -\Psi(1, \ldots, i, \ldots, j, \ldots, N)$$

where $P_{ij}$ is a permutation operator that interchanges the coordinates and spins of electrons i and j. This requirement places restrictions on the ways in which MOs can be combined to form the wavefunction describing the complete system. An additional consequence of the existence of two possible spin states for electrons is that a single MO can be occupied by two electrons of opposite spin without violating the Pauli exclusion principle. From this point, we shall assume that the number of electrons in our system, N, is even, so that all occupied MOs contain two electrons. In order to illustrate the effects of the permutation operator on the form of allowed wavefunctions, consider the situation in which a molecule has two electrons contained within a single MO, $\psi_1$. As each electron has two spin states, usually indicated as $\alpha$ and $\beta$, then one possible expression for the molecular wavefunction might be:

$$\Psi(1, 2) = \psi_1(1)\alpha(1)\,\psi_1(2)\beta(2)$$

in which the spin states of the electrons in each MO are explicitly shown. This simple expression cannot represent the true wavefunction, however, as it does not behave correctly when subject to the permutation operator $P_{12}$:

$$P_{12}\,\Psi(1, 2) = \psi_1(2)\alpha(2)\,\psi_1(1)\beta(1) \neq -\Psi(1, 2)$$

The simplest solution to this problem is to construct an antisymmetric wavefunction using a linear combination of one-electron wavefunctions. For two electrons, this takes the following form:

$$\Psi(1, 2) = \psi_1(1)\alpha(1)\,\psi_1(2)\beta(2) - \psi_1(2)\alpha(2)\,\psi_1(1)\beta(1)$$

Imposing the normalization condition Eq. (2.3A) then gives the final result:

$$\Psi(1, 2) = \frac{1}{\sqrt{2}}\left[\psi_1(1)\alpha(1)\,\psi_1(2)\beta(2) - \psi_1(2)\alpha(2)\,\psi_1(1)\beta(1)\right]$$

which can be more usefully expressed in the form of a determinant as:

$$\Psi(1, 2) = \frac{1}{\sqrt{2}}\begin{vmatrix} \psi_1(1)\alpha(1) & \psi_1(1)\beta(1) \\ \psi_1(2)\alpha(2) & \psi_1(2)\beta(2) \end{vmatrix} \tag{2.15}$$

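The antisymmetry of a determinantal wavefunction can be checked numerically. The Python sketch below is a minimal illustration with a hypothetical one-dimensional Gaussian as the spatial orbital; each electron is represented by a (position, spin) pair, rows of the matrix correspond to electrons, and columns to spin orbitals.

```python
import math
import numpy as np

def spatial(x):
    """Hypothetical one-dimensional spatial orbital (unnormalized Gaussian)."""
    return np.exp(-x ** 2)

def spin_orbital(spin_label):
    """Spin orbital: spatial part times a spin function ('alpha' or 'beta')."""
    def chi(electron):
        x, s = electron
        return spatial(x) * (1.0 if s == spin_label else 0.0)
    return chi

def slater(orbitals, electrons):
    """Determinantal wavefunction; rows are electrons, columns are spin orbitals."""
    matrix = np.array([[chi(e) for chi in orbitals] for e in electrons])
    return np.linalg.det(matrix) / math.sqrt(math.factorial(len(electrons)))

orbitals = [spin_orbital("alpha"), spin_orbital("beta")]
e1, e2 = (0.3, "alpha"), (-0.7, "beta")

psi_12 = slater(orbitals, [e1, e2])
psi_21 = slater(orbitals, [e2, e1])          # electrons interchanged (rows swapped)
same_spin = slater(orbitals, [(0.3, "alpha"), (-0.7, "alpha")])

print(psi_12, psi_21)    # equal magnitude, opposite sign
print(same_spin)         # vanishes: two alpha electrons cannot share one MO
```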
Applying the permutation operator $P_{12}$ is therefore equivalent to interchanging rows of the determinant in Eq. (2.15). Having devised a method for constructing many-electron wavefunctions as a product of MOs, the final problem concerns the form of the many-electron Hamiltonian, which contains terms describing the interaction of a given electron with (a) the fixed atomic nuclei and (b) the remaining (N - 1) electrons. The first step is therefore to decompose H(1, 2, 3, ..., N) into a sum of operators $H_1$ and $H_2$, where:

$$H_1 = \sum_{i=1}^{N}\left(-\frac{1}{2}\nabla_i^2 - \sum_{a=1}^{A}\frac{Z_a}{R_{ia}}\right) \tag{2.16}$$

$$H_2 = \sum_{i<j}\frac{1}{r_{ij}} \tag{2.17}$$

The energy associated with the many-electron wavefunction is then given by:

$$E = \int \Psi^{*} H_1 \Psi\,d\tau + \int \Psi^{*} H_2 \Psi\,d\tau \tag{2.18}$$

The first, one-electron term is readily simplified by realizing that all of the N electrons in the molecule are indistinguishable. This integral describes the motion of each electron about the fixed atomic nuclei in the absence of all other electrons, and can therefore be written as:

$$\int \Psi^{*} H_1 \Psi\,d\tau = N\int \Psi^{*} H^{core}(1)\,\Psi\,d\tau, \qquad H^{core}(1) = -\frac{1}{2}\nabla_1^2 - \sum_{a=1}^{A}\frac{Z_a}{R_a} \tag{2.19}$$

where $Z_a$ is the charge on the ath nucleus and $R_a$ is the distance from the electron to the ath nucleus. The one-electron integral in Eq. (2.19) can then be evaluated by substitution of the MO expressions obtained from the relevant Slater determinant [Eq. (2.15)] and the application of the requirements for MO orthonormality in the resulting integral expressions [Eq. (2.3)]. After a substantial amount of algebraic manipulation, it can be shown that the relevant integral can be expressed as the sum of simple one-electron integrals:

$$\int \Psi^{*} H_1 \Psi\,d\tau = 2\sum_{i=1}^{N/2}\int \psi_i^{*}(1)\,H^{core}(1)\,\psi_i(1)\,d\tau_1$$

Finally:

$$\int \Psi^{*} H_1 \Psi\,d\tau = 2\sum_{i=1}^{N/2} H_{ii}$$

where

$$H_{ii} = \int \psi_i^{*}(1)\,H^{core}(1)\,\psi_i(1)\,d\tau_1$$

Note that the sum must be doubled because there are two electrons in every occupied MO, as this analysis assumes that the system has equal numbers of electrons with spins of opposite sign.
The second, two-electron term in Eq. (2.18) is much harder to evaluate, as it describes the energy of an electron as it moves through the average field created by the remaining (N - 1) electrons. After a considerable amount of mathematical manipulation, however, the energetic contribution of the two-electron interactions can be written as

$$\int \Psi^{*} H_2 \Psi\,d\tau = \sum_{i=1}^{N/2}\sum_{j=1}^{N/2}\left(2J_{ij} - K_{ij}\right)$$

where

$$J_{ij} = \int\!\!\int \psi_i^{*}(1)\,\psi_j^{*}(2)\,\frac{1}{r_{12}}\,\psi_i(1)\,\psi_j(2)\,d\tau_1\,d\tau_2$$

and

$$K_{ij} = \int\!\!\int \psi_i^{*}(1)\,\psi_j^{*}(2)\,\frac{1}{r_{12}}\,\psi_j(1)\,\psi_i(2)\,d\tau_1\,d\tau_2$$

As before, $r_{12}$ is the distance between electrons 1 and 2. Computation of $J_{ij}$ and $K_{ij}$ is clearly more difficult than for the one-electron integrals $H_{ii}$, but both of these two-electron integrals have a simple physical interpretation. The Coulomb integrals $J_{ij}$ reflect the total repulsion energy that would arise if all of the electrons moved independently in their MOs. The exchange integrals $K_{ij}$ enter the energy expression with a negative sign that lowers the total energy, and represent a reduction in the interaction between electrons with parallel spins in different MOs $\psi_i$ and $\psi_j$, which arises from the requirement for wavefunction antisymmetry. The final equation for the energy of a given many-electron wavefunction in terms of its spatial MOs [Eq. (2.18)] can then be written as:

$$E = 2\sum_{i} H_{ii} + \sum_{i}\sum_{j}\left(2J_{ij} - K_{ij}\right) \tag{2.25}$$

where the summations run over the occupied MOs. Other particularly useful ways of writing (2.25) are:

$$E = \sum_{i}\left(\varepsilon_i + H_{ii}\right) \tag{2.26}$$

or

$$E = 2\sum_{i}\varepsilon_i - \sum_{i}\sum_{j}\left(2J_{ij} - K_{ij}\right) \tag{2.27}$$

where

$$\varepsilon_i = H_{ii} + \sum_{j}\left(2J_{ij} - K_{ij}\right) \tag{2.28}$$

The individual terms, $\varepsilon_i$, in Eq. (2.27) are termed one-electron orbital energies and correspond to the ionization potential ($-\varepsilon_i$) of an electron in MO $\psi_i$, assuming that no reorganization of the core nuclei, or the other (N - 1) electrons, takes place during ionization. The total energy of the system [Eq. (2.27)] is clearly not the sum of the one-electron energies because electron-electron interaction terms are included twice. Using this expression for the total electronic energy, application of the variational principle yields the following set of differential equations to obtain the optimized spatial MOs, $\psi_i$, for the molecule:

$$F\,\psi_i = \varepsilon_i\,\psi_i \tag{2.29}$$

with

$$F = H^{core} + \sum_{j}\left(2J_j - K_j\right) \tag{2.30}$$

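The equivalence of the closed-shell energy expressions discussed above, written in terms of the one-electron integrals, Coulomb integrals, exchange integrals, and orbital energies, can be verified numerically. The integral values in this Python sketch are hypothetical numbers chosen only for illustration (with K_ii = J_ii, as required); they do not describe any real molecule.

```python
import numpy as np

# Hypothetical integrals (hartree) for two doubly occupied MOs.
H = np.array([-4.0, -2.5])                  # one-electron integrals H_ii
J = np.array([[1.0, 0.6], [0.6, 0.8]])      # Coulomb integrals J_ij
K = np.array([[1.0, 0.1], [0.1, 0.8]])      # exchange integrals K_ij (K_ii = J_ii)

G = 2.0 * J - K                             # two-electron terms (2 J_ij - K_ij)
eps = H + G.sum(axis=1)                     # orbital energies eps_i

e_from_integrals = 2.0 * H.sum() + G.sum()  # E = 2 sum H_ii + sum (2J - K)
e_from_eps_and_h = np.sum(eps + H)          # E = sum (eps_i + H_ii)
e_from_eps_only = 2.0 * eps.sum() - G.sum() # E = 2 sum eps_i - sum (2J - K)

print(e_from_integrals, e_from_eps_and_h, e_from_eps_only)
print(2.0 * eps.sum())   # simple sum of orbital energies: counts repulsion twice
```

All three expressions give the same total, whereas the plain sum of orbital energies differs from it by exactly the double-counted electron-electron term.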
In Eq. (2.30), F is the Fock operator and $H^{core}$ is the Hamiltonian describing the motion of an electron in the field of the spatially fixed atomic nuclei. The operators $J_j$ and $K_j$ introduce the effects of electrons in the other occupied MOs. Hence, when i = j, $J_i$ (= $K_i$) is the potential from the other electron that occupies the same MO, $\psi_i$. $K_j$ is termed the exchange potential and does not have a simple functional form as it describes the effect of wavefunction antisymmetry on the correlation of electrons with identical spin. Although simple in form, Eq. (2.29) (which is obtained after relatively complex mathematical analysis) represents a system of differential equations that are impractical to solve for systems of any interest to biochemists. Furthermore, the orbital solutions do not allow a simple association of molecular properties with individual atoms, which is the model most useful to experimental chemists and biochemists. A series of soluble linear equations, however, can be derived by assuming that the MOs can be expressed as a linear combination of atomic orbitals (LCAO) [44]:

$$\psi_i = \sum_{\mu} c_{\mu i}\,\phi_\mu \tag{2.31}$$

The basis set for any given calculation is defined by the number and functional form of the atomic orbitals (AOs) used to construct the MOs, $\psi_i$. Three types of basis sets are commonly employed in computational studies. A minimal basis set is comprised of AOs up to and including those in the valence shell of atoms in the molecule. An extended basis employs AOs lying outside the valence shell in addition to those used in the minimal basis set. For example, for ammonia, NH3, the minimal basis would comprise the 1s orbitals on each hydrogen, and the 1s, 2s, and 2p orbitals on nitrogen. Adding 3s, 3p, 3d functions on nitrogen or 2s and 2p functions on hydrogen would then yield an extended basis set. Deriving the wavefunction for any given MO is then merely a matter of adjusting the coefficients multiplying each AO until the total energy of the molecular system cannot be lowered further. In addition, orthonormality of the MOs leads to the following condition:

$$\sum_{\mu}\sum_{\nu} c_{\mu i}^{*}\,c_{\nu j}\,S_{\mu\nu} = \delta_{ij}$$

where

$$S_{\mu\nu} = \int \phi_\mu^{*}(1)\,\phi_\nu(1)\,d\tau_1$$

and $\delta_{ij}$ is the Kronecker delta. Note that $\delta_{ij}$ equals zero if i ≠ j, and is unity when i = j. $S_{\mu\nu}$ is termed the overlap integral. Note that AOs are usually distinguished from MOs by being indexed using Greek characters, where $\phi_\mu$ and $\phi_\nu$ can be AOs centered on the same atom or different atoms. In extracting atom-based properties from the many-electron wavefunction, it is also useful to define a density matrix P, the elements of which, $P_{\mu\nu}$, are constructed from the coefficients of the AOs in a given, occupied MO, using the following expression:

$$P_{\mu\nu} = 2\sum_{i}^{occ} c_{\mu i}^{*}\,c_{\nu i}$$

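A small numerical example may help to make the density matrix concrete. The Python sketch below assumes a two-electron, two-AO model (an H2-like minimal basis) with a hypothetical overlap integral of 0.6; the function name is illustrative. The columns of C hold the AO coefficients of the bonding and antibonding MOs, and only the bonding MO is doubly occupied.

```python
import numpy as np

def density_matrix(coefficients, n_occupied):
    """P_mu,nu = 2 * sum over occupied MOs of conj(c_mu,i) * c_nu,i."""
    c_occ = coefficients[:, :n_occupied]
    return 2.0 * c_occ.conj() @ c_occ.T

s = 0.6                                            # hypothetical overlap integral
S = np.array([[1.0, s], [s, 1.0]])
bonding = 1.0 / np.sqrt(2.0 * (1.0 + s))
antibonding = 1.0 / np.sqrt(2.0 * (1.0 - s))
C = np.array([[bonding, antibonding],
              [bonding, -antibonding]])            # columns are MOs

P = density_matrix(C, n_occupied=1)
n_electrons = np.trace(P @ S)                      # sum over mu, nu of P * S

print(P)
print(n_electrons)   # recovers the two electrons of the model
```

Summing the density matrix elements weighted by the overlap integrals recovers the total number of electrons, with the off-diagonal contribution representing the charge shared between the two centers.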
The quantity $P_{\mu\nu}$ can then be regarded as the electronic population of the atomic overlap distribution $\phi_\mu\phi_\nu$. Diagonal terms such as $P_{\mu\mu}$ give the net electronic charge residing in the AO $\phi_\mu$ for a given MO, while off-diagonal elements, in which $\phi_\mu$ and $\phi_\nu$ are centered on different atoms, indicate the amount of bonding between the two atoms in the MO. The equation for the total electronic energy using a given AO basis, derived using standard algebraic manipulation, can then be written as:

$$E = \sum_{\mu\nu} P_{\mu\nu} H_{\mu\nu} + \frac{1}{2}\sum_{\mu\nu\lambda\sigma} P_{\mu\nu} P_{\lambda\sigma}\left[(\mu\nu|\lambda\sigma) - \frac{1}{2}(\mu\lambda|\nu\sigma)\right] \tag{2.35}$$

where

$$H_{\mu\nu} = \int \phi_\mu^{*}(1)\,H^{core}(1)\,\phi_\nu(1)\,d\tau_1 \tag{2.36}$$

$$(\mu\nu|\lambda\sigma) = \int\!\!\int \phi_\mu^{*}(1)\,\phi_\nu(1)\,\frac{1}{r_{12}}\,\phi_\lambda^{*}(2)\,\phi_\sigma(2)\,d\tau_1\,d\tau_2 \tag{2.37}$$

and

$$(\mu\lambda|\nu\sigma) = \int\!\!\int \phi_\mu^{*}(1)\,\phi_\lambda(1)\,\frac{1}{r_{12}}\,\phi_\nu^{*}(2)\,\phi_\sigma(2)\,d\tau_1\,d\tau_2 \tag{2.38}$$

From (2.37) and (2.38), it is evident that the number of two-electron integrals that must be evaluated scales as the fourth power of the number of basis functions, N, employed in the calculation. Application of the variational principle to (2.35) leads to the Roothaan equations [44] from which the coefficients of the AOs in each MO, $\psi_i$, with the energy $\varepsilon_i$ can be determined:

$$\sum_{\nu}\left(F_{\mu\nu} - \varepsilon_i S_{\mu\nu}\right)c_{\nu i} = 0 \tag{2.39}$$

where

$$F_{\mu\nu} = H_{\mu\nu} + \sum_{\lambda\sigma} P_{\lambda\sigma}\left[(\mu\nu|\lambda\sigma) - \frac{1}{2}(\mu\lambda|\nu\sigma)\right] \tag{2.40}$$

The important thing about the Roothaan equations (2.39) for finding the LCAO self-consistent field (SCF) MOs of the closed-shell system is that they are linear algebraic equations rather than the differential equations derived using the Fock operator F [Eq. (2.29)]. Matrix transformation methods therefore exist for their solution, and construction of the many-electron wavefunction describing the molecular system is straightforward, given values for the coefficients of the AOs in a given MO. On the other hand, the elements of the Hartree-Fock matrix, $F_{\mu\nu}$, are dependent upon those of the density matrix, $P_{\mu\nu}$, which are themselves constructed from the coefficients of the AOs, which are what we are trying to calculate by solving the Roothaan equations! In practice, given the availability of modern computing power, an iterative procedure is employed in the calculation in which an initial set of coefficients is assumed and used to construct the elements, $P_{\mu\nu}$, of the density matrix. The initial Hartree-Fock matrix can then be evaluated and used in the secular equations [Eq. (2.39)] to derive a new set of coefficients, which are then used in the construction of a new Hartree-Fock matrix, and the procedure is repeated. The calculation is terminated when the difference between the old and new sets of coefficients is less than a user-specified tolerance. Moreover, further changes in the values of the coefficients should not affect the total electronic energy calculated using the AOs. As with any iterative scheme, the goal is to achieve convergence in the values of the coefficients in a minimum number of steps. The initial choice of coefficients is therefore important in determining the number of iterations required to obtain a self-consistent wavefunction. In addition, choosing the correct AO basis for performing accurate calculations usually represents a compromise between computational accuracy and the size of the system that may be studied.
Determining the correct choice is often difficult, as indicated by the numerous published discussions of making such a choice [45].
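The iterative procedure described above can be sketched compactly in Python. This is a bare-bones illustration under stated assumptions, not a production algorithm: the integrals are supplied by the caller as arrays (`eri[m, n, l, s]` holding the two-electron integral (mn|ls)), the initial guess is a zero density matrix (equivalent to starting from the core Hamiltonian), convergence is tested on the density matrix rather than on the coefficients themselves, and no convergence acceleration is used. All function names are hypothetical.

```python
import numpy as np

def build_fock(h_core, density, eri):
    """F = H + sum P [(mn|ls) - 1/2 (ml|ns)] for a closed-shell system."""
    coulomb = np.einsum("ls,mnls->mn", density, eri)
    exchange = np.einsum("ls,mlns->mn", density, eri)
    return h_core + coulomb - 0.5 * exchange

def scf(h_core, overlap, eri, n_electrons, tol=1e-8, max_iter=100):
    h_core = np.asarray(h_core, dtype=float)
    eri = np.asarray(eri, dtype=float)
    n_occ = n_electrons // 2
    # Symmetric orthogonalization, X = S^(-1/2), turns the Roothaan
    # equations into an ordinary eigenvalue problem.
    s_val, s_vec = np.linalg.eigh(np.asarray(overlap, dtype=float))
    x = s_vec @ np.diag(s_val ** -0.5) @ s_vec.T
    density = np.zeros_like(h_core)
    for iteration in range(1, max_iter + 1):
        fock = build_fock(h_core, density, eri)
        orbital_energies, c_prime = np.linalg.eigh(x.T @ fock @ x)
        coefficients = x @ c_prime
        c_occ = coefficients[:, :n_occ]
        new_density = 2.0 * c_occ @ c_occ.T
        change = np.max(np.abs(new_density - density))
        density = new_density
        if change < tol:
            break
    else:
        raise RuntimeError("SCF did not converge")
    fock = build_fock(h_core, density, eri)
    energy = 0.5 * np.sum(density * (h_core + fock))   # electronic energy
    return energy, orbital_energies, coefficients, iteration

# One basis function, two electrons, hypothetical integrals:
# the result must equal 2*H_11 + J_11 = -2.0 + 0.5.
energy, _, _, n_iter = scf([[-1.0]], [[1.0]], np.full((1, 1, 1, 1), 0.5), 2)
print(energy, n_iter)
```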
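Although the above discussion assumes that all MOs are occupied by two electrons, it turns out that the basic ideas can be extended to open-shell molecules in which there are unequal numbers of electrons in the two spin states. Without showing the complicated mathematics, we note that the wavefunction can be determined by constructing a Fock matrix for each of the two spin states and then solving two sets of coupled Roothaan equations:

$$\sum_{\nu}\left(F_{\mu\nu}^{\alpha} - \varepsilon_i^{\alpha} S_{\mu\nu}\right)c_{\nu i}^{\alpha} = 0, \qquad \sum_{\nu}\left(F_{\mu\nu}^{\beta} - \varepsilon_i^{\beta} S_{\mu\nu}\right)c_{\nu i}^{\beta} = 0$$

where

$$F_{\mu\nu}^{\alpha} = H_{\mu\nu} + \sum_{\lambda\sigma}\left[P_{\lambda\sigma}(\mu\nu|\lambda\sigma) - P_{\lambda\sigma}^{\alpha}(\mu\lambda|\nu\sigma)\right], \qquad F_{\mu\nu}^{\beta} = H_{\mu\nu} + \sum_{\lambda\sigma}\left[P_{\lambda\sigma}(\mu\nu|\lambda\sigma) - P_{\lambda\sigma}^{\beta}(\mu\lambda|\nu\sigma)\right]$$

and

$$P_{\mu\nu} = P_{\mu\nu}^{\alpha} + P_{\mu\nu}^{\beta}$$

The superscripts $\alpha$ and $\beta$ indicate the spin state of the electrons in the many-electron wavefunction. Although many biologically important compounds, particularly metalloproteins [46], exist in states with unpaired electrons, our work has not involved the study of open-shell systems. Readers who wish to apply semi-empirical methods in the study of such structures should consult more specialized discussions [47,48]. In my experience, handling the complications that arise in treating systems with unpaired electrons should probably be left to professional theoreticians!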
Solving the Fundamental Equations: Semi-Empirical Methods

Solving the Roothaan equations [Eq. (2.39)] is obviously a lot of work, the amount of which scales as $N^4$, where N is the number of basis functions used to obtain the MOs. The biochemist's frustration is compounded on realizing that many of the two-electron integrals involved are very small, or have zero magnitude. Hence, one obvious strategy to increase the number of atoms in the QM calculation would be to ignore these integrals altogether! While this simple approach clearly introduces errors into the calculated molecular energies and other electronic properties, this dilemma can be resolved by correcting the remaining matrix elements in an empirical manner by calibrating the theoretical model with experimental measurements. The exact strategy for avoiding the hard work in ab initio calculations, and the parameters used to fix up the approximations introduced into the calculated MOs, are what differentiate the bewildering variety of semi-empirical methods available in modern computer codes. Note, however, that an important requirement of any parameterization scheme is that the wavefunction and system energy must be rotationally invariant. The effect of moving to a parameterization scheme on computational effort can be understood by assuming that all four-center, two-electron integrals do not have to be computed. The slow step in solving the Roothaan equations becomes one of matrix diagonalization, a process that scales as $N^3$, which is the fundamental reason for the applicability of these methods to systems of sufficient size to be biochemically interesting. Semi-empirical theories based on molecular orbital theory all seek to retain the essential physics underlying the chemical system, and many are surprisingly successful! Thus, semi-empirical QM calculations in which all valence electrons are taken into account correctly predict the relative stabilities of electrons in atomic energy levels, the spatial character of all bonding orbitals, and electrostatic repulsion between electrons. As noted above, many integrals describing electron repulsion are very small in magnitude, especially those involving overlap distributions such as $\phi_\mu(1)\phi_\nu(1)$ when $\mu \neq \nu$. The simplest semi-empirical approach, termed the zero-differential overlap (ZDO) approximation, is therefore to assume that these integrals can be ignored. Mathematically expressed, this is equivalent to the following:

$$(\mu\nu|\lambda\sigma) = \delta_{\mu\nu}\,\delta_{\lambda\sigma}\,(\mu\mu|\lambda\lambda)$$

As usual, $\delta_{\mu\nu}$ is the Kronecker delta. In addition, the corresponding overlap integrals $S_{\mu\nu}$ are neglected in normalizing the MOs, while the core integrals $H_{\mu\nu}$, which involve significant overlap terms, are treated in a semi-empirical manner to retain possible bonding effects. These approximations eliminate many of the difficult two-electron integrals and all three- and four-center integrals become zero. In effect, chemical bonds are allowed using orbitals on adjacent atoms only. After making these adjustments, the modified Roothaan equations assume the following form:

$$\sum_{\nu}\left(F_{\mu\nu} - \varepsilon_i\,\delta_{\mu\nu}\right)c_{\nu i} = 0$$

where the elements of the Fock matrix are given by:

$$F_{\mu\mu} = H_{\mu\mu} - \frac{1}{2}P_{\mu\mu}(\mu\mu|\mu\mu) + \sum_{\lambda} P_{\lambda\lambda}(\mu\mu|\lambda\lambda), \qquad F_{\mu\nu} = H_{\mu\nu} - \frac{1}{2}P_{\mu\nu}(\mu\mu|\nu\nu) \quad (\mu \neq \nu)$$

The simplest theory that is consistent with these requirements employs the complete neglect of differential overlap (CNDO) [29]. This semi-empirical approach will be discussed in some detail, albeit without extensive mathematical justification, as it illustrates the type of approximations that are made in more advanced theories. In addition to the assumptions outlined above, the remaining Coulomb-type integrals are reduced to a single value $\gamma_{AB}$ that depends only on the nature of atoms A and B with which $\phi_\mu$ and $\phi_\lambda$ are associated, respectively, and not on the actual type of orbitals that overlap. This is equivalent to stating:

$$(\mu\mu|\lambda\lambda) = \gamma_{AB} \quad \text{for all } \mu \text{ on atom A and all } \lambda \text{ on atom B} \tag{2.47}$$

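The practical effect of the ZDO approximation can be checked with a short Python sketch. It builds a two-electron integral array in which only integrals of the type (mm|ll) survive, counts how many of the N^4 elements remain, and confirms that the general closed-shell Fock matrix expression then collapses to the simple ZDO form F_mm = H_mm - 1/2 P_mm (mm|mm) + sum_l P_ll (mm|ll) and F_mn = H_mn - 1/2 P_mn (mm|nn). The matrices are random, hypothetical numbers used only to test the algebra, and the function names are illustrative.

```python
import numpy as np

def fock_general(h, p, eri):
    """General closed-shell Fock matrix; eri[m, n, l, s] = (mn|ls)."""
    coulomb = np.einsum("ls,mnls->mn", p, eri)
    exchange = np.einsum("ls,mlns->mn", p, eri)
    return h + coulomb - 0.5 * exchange

def zdo_integrals(gamma):
    """Integral array under ZDO: only (mm|ll) = gamma[m, l] is nonzero."""
    n = gamma.shape[0]
    eri = np.zeros((n, n, n, n))
    for m in range(n):
        for l in range(n):
            eri[m, m, l, l] = gamma[m, l]
    return eri

def fock_zdo(h, p, gamma):
    """Fock matrix written directly in its ZDO form."""
    f = h - 0.5 * p * gamma                          # -1/2 P_mn (mm|nn)
    f[np.diag_indices_from(f)] += gamma @ np.diag(p) # + sum_l P_ll (mm|ll)
    return f

rng = np.random.default_rng(0)
n = 4
h = rng.normal(size=(n, n))
h = 0.5 * (h + h.T)
p = rng.normal(size=(n, n))
p = 0.5 * (p + p.T)
gamma = rng.uniform(0.1, 1.0, size=(n, n))
gamma = 0.5 * (gamma + gamma.T)

eri = zdo_integrals(gamma)
print(np.count_nonzero(eri), "of", eri.size, "integrals survive")
print(np.allclose(fock_general(h, p, eri), fock_zdo(h, p, gamma)))
```

In the CNDO model the array `gamma` is simplified further, since every pair of orbitals on atoms A and B shares the single value of Eq. (2.47).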
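The CNDO expressions for the elements of the Fock matrix can then be expressed as:

$$F_{\mu\mu} = H_{\mu\mu} - \frac{1}{2}P_{\mu\mu}\gamma_{AA} + P_{AA}\gamma_{AA} + \sum_{B \neq A} P_{BB}\gamma_{AB} \tag{2.48}$$

where

$$F_{\mu\nu} = H_{\mu\nu} - \frac{1}{2}P_{\mu\nu}\gamma_{AB} \quad (\mu \neq \nu)$$

and

$$P_{BB} = \sum_{\lambda \text{ on } B} P_{\lambda\lambda}$$
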
In Eq. (2.48), the summation is taken over all atoms other than A, and $P_{BB}$ is the total electron density associated with atom B, i.e., the summation is over all AOs on B. The problem is now to derive suitable expressions for the one-electron elements $H_{\mu\nu}$ in a manner consistent with the neglect of orbital overlap. Without detailing the derivation, the diagonal Fock matrix elements take the following form, in the CNDO model, for $\phi_\mu$ on atom A:

$$F_{\mu\mu} = U_{\mu\mu} + \left(P_{AA} - \frac{1}{2}P_{\mu\mu}\right)\gamma_{AA} + \sum_{B \neq A}\left[-Q_B\gamma_{AB} + \left(Z_B\gamma_{AB} - V_{AB}\right)\right]$$

where

$$U_{\mu\mu} = \int \phi_\mu(1)\left[-\frac{1}{2}\nabla^2 - V_A\right]\phi_\mu(1)\,d\tau_1$$

and

$$V_{AB} = \int \phi_\mu(1)\,V_B\,\phi_\mu(1)\,d\tau_1$$

with $V_A$ and $V_B$ denoting the potentials due to the cores of atoms A and B. While the terms in these equations look complex, many have a straightforward physical interpretation. For example, $U_{\mu\mu}$ represents the energy of the AO, $\phi_\mu$, in the bare field of the core of the atom on which $\phi_\mu$ is located, and can therefore be approximated by experimental measurements of ionization potential. $P_{AA}$ is the total electron density associated with atom A, the summation being made over all orbitals centered on this atom, and $Q_B$ is the net charge on atom B ($Z_B - P_{BB}$). Finally, changing the sign of $V_{AB}$ gives the interaction of any valence electron on atom A with the core of atom B. The off-diagonal elements of the Fock matrix can be calculated from the following expression whether $\phi_\mu$ and $\phi_\nu$ are both centered on atom A, or are on separate atoms A and B, respectively:

$$F_{\mu\nu} = \beta_{AB}\,S_{\mu\nu} - \frac{1}{2}P_{\mu\nu}\gamma_{AB}$$

Note that if $\phi_\mu$ and $\phi_\nu$ are both centered on atom A, $S_{\mu\nu}$ is set to zero and $\gamma_{AB}$ is replaced by $\gamma_{AA}$. The total electronic energy in the CNDO model can then be expressed as:

$$E = \frac{1}{2}\sum_{\mu\nu} P_{\mu\nu}\left(H_{\mu\nu} + F_{\mu\nu}\right)$$

In summary, to perform a semi-empirical calculation using the CNDO model, values are required for the overlap integrals, $S_{\mu\nu}$, the core Hamiltonian elements, $U_{\mu\mu}$ and $V_{AB}$, the electron repulsion integrals, $\gamma_{AB}$, and the bonding parameters $\beta_{AB}$. Without going into detail, $S_{\mu\nu}$, $\gamma_{AB}$, and $V_{AB}$ can be calculated using analytical procedures, while the core integrals, $U_{\mu\mu}$, are determined from observed atomic energy levels. So as to minimize the amount of empirical parameterization, it is also assumed that the parameters, $\beta_{AB}$, describing the bonding between atoms A and B can be derived from atom-specific parameters:

$$\beta_{AB} = \frac{1}{2}\left(\beta_A + \beta_B\right)$$

The bonding parameters $\beta_A$ can then be determined by standard parameterization approaches in which their systematic adjustment gives the optimum agreement between experimental and calculated molecular properties. Full details of two parameterization schemes, CNDO/1 and CNDO/2, have been described, which differ primarily in the formulae used to evaluate $S_{\mu\nu}$ and $V_{AB}$ [29,49]. Development of CNDO/2 was necessary since the CNDO/1 model gave an unrealistic attraction between neutral atoms separated by several ångströms [50]. Semi-empirical models that have been reported subsequent to CNDO employ similar strategies to reduce the number of integrals that have to be evaluated explicitly, and so the following nonmathematical discussion of more modern models will only detail important differences in basic assumptions. In Intermediate Neglect of Differential Overlap (INDO) calculations, two-electron integrals centered on the same atom are calculated explicitly [30,51], the extra computational effort yielding an improvement in the molecular geometries computed using the INDO model. Theoretical development of the original formulation of INDO has proceeded in two directions. Initially, improvements aimed primarily at obtaining high-quality predictions of molecular geometries, heats of formation, dipole moments, and ionization potentials resulted in the Modified Intermediate Neglect of Differential Overlap (MINDO/3) model [30]. The most important philosophical difference is the treatment of almost all quantities in the Fock matrix and energy expression as adjustable parameters. Exponents of the AOs determining the basis set are also allowed to vary and the core integrals $\beta_{AB}$ are no longer taken to be the average of a limited number of atom-specific parameters. When combined with systematic parameterization studies using accurate data for a small number of structures, the MINDO/3 model gives calculated molecular properties in good qualitative, and often quantitative, agreement with experimental measurements. Subsequent refinements give the Modified Neglect of Differential Overlap (MNDO) [31] model, which is parameterized from measurements of heats of formation, molecular geometry, dipole moments, and ionization potentials.
Once again, orbital exponents, the resonance integral $\beta$ and the core integrals are considered as parameters in the fitting algorithm, the core-core repulsion term being treated as a function of electron-electron repulsion:
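In the usual MNDO formulation this term is written as shown below. The symbols are labeled here for illustration: $Z_A$ and $Z_B$ are the core charges, $R_{AB}$ is the internuclear distance, $(s_A s_A | s_B s_B)$ is the two-center repulsion integral between s-type charge distributions, and $\alpha_A$, $\alpha_B$ are atomic parameters.

$$E_N^{\mathrm{MNDO}}(A,B) = Z_A Z_B \,(s_A s_A | s_B s_B)\left[1 + e^{-\alpha_A R_{AB}} + e^{-\alpha_B R_{AB}}\right]$$

At large separation the bracket tends to 1, so the core-core term reduces to the electron-electron repulsion scaled by the core charges.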
The MNDO model has been applied widely since its implementation in the MOPAC computer program, the extensive capabilities of which have been described52. Parameters are now available for many elements of biological interest including lithium31, fluorine53, sulfur54, zinc55 and iodine56. More importantly for biological calculations, software for MNDO calculations is being implemented on massively parallel compute engines, significantly increasing the size of the systems that may be analyzed57,58. While the MNDO model gives excellent calculated results for the properties of a wide range of organic and inorganic compounds, its application to biologically important structures has been limited by an inability to reproduce hydrogen bonding patterns32. In order to correct this deficiency, and the overestimation of interatomic repulsion when atoms are separated by distances just greater than those of chemical bonds, the MNDO model was modified and reparameterized, leading to the AM1 (ref. 32) and PM3 (ref. 35) semi-empirical models. Although retaining the basic MNDO equations, these two models employ additional terms for core-core repulsion that contain more adjustable parameters:
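In the standard AM1 and PM3 formulation the additional terms are sums of Gaussian functions attached to each atom. A sketch of that form, with $E_N^{\mathrm{MNDO}}$ used here as a label for the unmodified MNDO core-core repulsion:

$$E_N(A,B) = E_N^{\mathrm{MNDO}}(A,B) + \frac{Z_A Z_B}{R_{AB}}\left[\sum_i a_i(A)\, e^{-b_i(A)\,\left(R_{AB} - c_i(A)\right)^2} + \sum_j a_j(B)\, e^{-b_j(B)\,\left(R_{AB} - c_j(B)\right)^2}\right]$$

Each Gaussian has an amplitude $a$, a width parameter $b$, and a center $c$, so a term with negative amplitude centered just beyond bonding distance lowers the repulsion in exactly the region where MNDO overestimates it.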
where $a_i(A)$, $b_i(A)$, and $c_i(A)$ are adjustable parameters. These new terms correct for the excessive interatomic repulsion, and improve the ability of the model to handle hydrogen bonding. On the other hand, the original seven parameters in the MNDO model are increased significantly in AM1 (to between 13 and 16), increasing the amount of experimental data required for parameterization. At present, parameters for eleven elements have been developed, including C, H, N, O, F, I, and P. The AM1 model has been employed extensively in calculations on biological systems to obtain accurate representations of ground state geometries and molecular electrostatics. While the theory underlying the PM3 model is identical to that used in AM1, the parameterization strategy is completely different, in that all quantities that are used to construct the elements of the Fock matrix, and hence the expression for the total system energy, are considered as adjustable. As a result, five additional parameters for treating the one-center, two-electron integrals are required by PM3 compared to AM1, increasing the amount of experimental data needed to parameterize PM3. Automated procedures for optimizing parameters are therefore required that develop parameters simultaneously for multiple elements. In its initial parameterization, the PM3 model can be applied to structures containing C, H, N, O, F, Br, Cl, I, P, S, and Al. The PM3 and AM1 models are both available in recent versions of MOPAC and are widely used in computational chemistry.
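The effect of the extra Gaussian terms can be illustrated numerically. The following is a minimal sketch, not the MOPAC implementation: the function names are invented for illustration, the two-center integral is passed in as a plain number rather than computed, all parameter values are hypothetical, and consistent units are assumed throughout.

```python
import math


def mndo_core_repulsion(z_a, z_b, gamma_ss, alpha_a, alpha_b, r_ab):
    """MNDO-style core-core repulsion.

    gamma_ss stands in for the (sA sA | sB sB) integral at distance r_ab;
    here it is supplied by the caller (an assumption of this sketch).
    """
    screening = 1.0 + math.exp(-alpha_a * r_ab) + math.exp(-alpha_b * r_ab)
    return z_a * z_b * gamma_ss * screening


def gaussian_correction(z_a, z_b, r_ab, gaussians_a, gaussians_b):
    """AM1/PM3-style extra term: Gaussians (a, b, c) on each atom,
    summed and scaled by Z_A * Z_B / R_AB."""
    total = 0.0
    for a, b, c in list(gaussians_a) + list(gaussians_b):
        total += a * math.exp(-b * (r_ab - c) ** 2)
    return z_a * z_b / r_ab * total


# Hypothetical parameters, chosen only to show the behaviour.
r = 2.0
base = mndo_core_repulsion(1.0, 1.0, 0.5, 2.0, 2.0, r)
# One attractive Gaussian centred at r = 2.0 reduces the repulsion there.
corrected = base + gaussian_correction(1.0, 1.0, r, [(-0.1, 5.0, 2.0)], [])
```

With no Gaussians the correction is zero and the MNDO value is recovered, which mirrors the statement that AM1 and PM3 retain the basic MNDO equations.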
Electronic Spectra: Excited-State Wavefunctions and Configuration Interaction (CI)

In theoretical studies parallel to the development of MINDO/3 and MNDO calculations, the INDO model has been refined and parameterized to reproduce electronic spectra, i.e., energetic transitions corresponding to absorption in the UV-visible region. An implicit assumption in the theoretical treatment to this point has been that the many-electron wavefunction describing the closed-shell system can be represented as a single determinant. Such an approach yields a set of MOs that are occupied by two electrons, and a set of unoccupied, or "virtual," orbitals that are still solutions of the Schrödinger equation (Fig. 2.1A). Virtual MOs do not contribute to the molecular energy as they contain no electrons, and so the ground-state electronic configuration at the Hartree-Fock level possesses the minimum total energy. Other configurations can be written, however, in which electrons are "promoted" from filled orbitals into virtual orbitals (Fig. 2.1B), the resulting state having a higher energy than the ground-state system. Excited states can also be generated by redistributing multiple electrons. For example, in double and triple excitations, two and three electrons are promoted to virtual orbitals, respectively. The many-electron wavefunction obtained from a single determinant includes none of these excited states, and may therefore yield significant errors in the total energy calculated using ab initio methods. One solution to this problem is the configuration interaction (CI) method48, in which more flexible wavefunctions are constructed by inclusion of excited state configurations. The key mathematical idea underlying CI is to assume that the wavefunction is a linear combination of determinants, $D_i$, each of which describes a different orbital occupation scheme:

$$\Psi = \sum_i c_i D_i$$

As before, the total energy, E, can be minimized by optimizing the mixing coefficients $c_i$ so that $\partial E / \partial c_i$ is zero.
Fig. 2.1 (A) Representation of the electronic ground state for a closed-shell system in which all of the lowest energy MOs contain two electrons of opposite spin. (B) Two example configurations for singlet excited states of the QM system that involve promotion of a single electron to a previously unoccupied, or virtual, MO. Note that the spins of the two unpaired electrons are antiparallel.

In the case of a system comprising two configurations, this leads to the same sort of determinantal equation that was obtained when we minimized the energy of an MO by mixing two AOs, except that the integration is carried out over the determinants:

$$\begin{vmatrix} H_{11} - E & H_{12} - E S_{12} \\ H_{12} - E S_{12} & H_{22} - E \end{vmatrix} = 0$$

where

$$H_{ij} = \int D_i \hat{H} D_j \, d\tau$$

and

$$S_{ij} = \int D_i D_j \, d\tau$$

This procedure is therefore equivalent to mixing two configurations to obtain two new approximate wavefunctions, and the approach is easily extended to include any number of determinants $D_i$. Although simple in form, solving these equations requires significant effort given that each configuration, $D_i$, contains the products of many MOs that are themselves constructed from the basis set of AOs. Although the CI method can take many forms, the most widely used starts from a reference configuration that is the lowest SCF state of the correct symmetry. For closed-shell systems, this is the ground-state Hartree-Fock wavefunction in which the n lowest SCF orbitals are doubly occupied, represented by $|\Psi_0\rangle$. Labeling the occupied spin orbitals as a, b, c, ... and the virtual spin orbitals by p, q, r, ..., the determinant $|\Psi_a^p\rangle$, corresponding to a singly excited configuration, can be obtained from $|\Psi_0\rangle$ by promoting an electron from orbital a to orbital p (i.e., spin orbital a is replaced in the determinant by spin orbital p). Similarly, the determinant $|\Psi_{ab}^{pq}\rangle$ corresponds to a doubly excited configuration. The CI expansion can then be written as:
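In the standard notation for this expansion, with $|\Psi_0\rangle$ the reference determinant, $|\Psi_a^p\rangle$ and $|\Psi_{ab}^{pq}\rangle$ the singly and doubly excited determinants, and the $c$ coefficients as the variational mixing coefficients, the wavefunction takes the form:

$$|\Phi\rangle = c_0 |\Psi_0\rangle + \sum_{a,p} c_a^p\, |\Psi_a^p\rangle + \sum_{\substack{a<b \\ p<q}} c_{ab}^{pq}\, |\Psi_{ab}^{pq}\rangle + \sum_{\substack{a<b<c \\ p<q<r}} c_{abc}^{pqr}\, |\Psi_{abc}^{pqr}\rangle + \cdots$$

The ordering conditions such as $a<b$ and $p<q$ are what prevent the same determinant from appearing more than once.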
The restrictions on each sum ensure that determinants are only counted once in constructing the wavefunction. Hence, every electronic state is expressed as a linear combination of singly, doubly, triply, etc., excited configurations. The total number of determinants is obviously large, being on the order of $10^9$ for good basis sets, although symmetry constraints can decrease the number that need to be explicitly evaluated. In addition, if the properties of singlet excited states are of most interest, then determinants that describe triplet state configurations can be ignored in the expansion. For many calculations of electronic spectra on systems containing many atoms, the CI is truncated to include only states produced by the excitation of a single electron (configuration interaction singles, CIS). The utility of this approach to reducing the total number of determinants required in these computations arises from the behavior of the matrix elements calculated to solve the secular problem. Remarkably, the element involving the reference SCF determinant and a singly excited configuration is equivalent to the matrix element of the Fock operator evaluated using the occupied and virtual spin orbitals, represented by a and p, respectively, and is therefore zero. This result is termed Brillouin's theorem and means that the Hamiltonian does not couple the SCF reference determinant to any determinant describing a singly excited state, i.e., that:
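Written out, with $\hat{f}$ denoting the Fock operator and $|\Psi_0\rangle$, $|\Psi_a^p\rangle$ the reference and singly excited determinants, Brillouin's theorem reads:

$$\langle \Psi_0 | \hat{H} | \Psi_a^p \rangle = \langle a | \hat{f} | p \rangle = 0$$

The off-diagonal Fock matrix element vanishes because the converged SCF orbitals are eigenfunctions of $\hat{f}$, and distinct eigenfunctions are orthogonal.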
Switching to a more symbolic notation for the truncated singles CI, we get the expression:
where The resulting Hamiltonian matrix can then be written in the following form, where each entry corresponds to a block of numbers:
More simply put, the ground-state energy of the molecular system is not affected by $|S\rangle$, although the charge distribution and electric dipole are dependent on the excited state wavefunction, for which single electronic excitations play a dominant role. The vertical energies required to promote an electron into an unoccupied MO, together with the corresponding transition moments governing the intensity of the absorption, are often calculated well with a singles-only CI. The exact expressions, together with the detailed mathematical derivations, are too complicated for this discussion, which seeks only to provide qualitative insight into the techniques by which electronic spectra can be calculated. Full details of the relevant equations have been widely reviewed48. On the basis of these ideas, the INDO/S model has been parameterized to reproduce electronic spectra, and gives remarkably good results for molecules comprising many atoms59,60. More importantly, parameters are available for all first- and second-row elements as well as many of the transition metals61,62. In biological applications, INDO/S has also been extremely useful in studies of the spectroscopic properties of the "special pair" of chlorophyll molecules in the photosynthetic reaction center12, and of other metal-containing enzymes such as nitrogenase63.
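The consequence of Brillouin's theorem for a singles-only CI can be checked with a small numerical experiment. The sketch below assumes an orthonormal set of determinants and uses made-up matrix elements in arbitrary energy units; `cis_hamiltonian` is a hypothetical helper name, not part of any package. The zero coupling blocks mean the reference energy appears unchanged among the eigenvalues, and the vertical excitation energies come entirely from the singles block.

```python
import numpy as np


def cis_hamiltonian(e0, h_singles):
    """Assemble the singles-only CI matrix.

    The first row and column belong to the SCF reference determinant;
    its couplings to the singly excited determinants are zero by
    Brillouin's theorem, leaving a block-diagonal matrix.
    """
    h_singles = np.asarray(h_singles, dtype=float)
    n = h_singles.shape[0]
    h = np.zeros((n + 1, n + 1))
    h[0, 0] = e0
    h[1:, 1:] = h_singles
    return h


# Hypothetical reference energy and singles block (illustration only).
e0 = -10.0
h_ss = np.array([[-9.6, 0.05],
                 [0.05, -9.4]])

energies = np.linalg.eigvalsh(cis_hamiltonian(e0, h_ss))  # ascending order
excitation_energies = energies[1:] - energies[0]
```

Here the lowest eigenvalue equals `e0`, and the excitation energies are the eigenvalues of the singles block measured from the reference energy.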
Including the Effects of Environment in Semi-Empirical Calculations

To this point, an implicit assumption in this discussion of semi-empirical calculations has been that the system of computational interest is isolated. This is therefore equivalent to understanding its chemical behavior in the gas phase. Most interesting molecular phenomena take place in condensed phases or in solution, however, and hence methods to introduce the effects of the local environment into the wavefunction are the subject of much research effort. Such techniques will be particularly important in using theoretical methods to probe the structure and chemical reactivity of biologically important systems, given that water has unique properties as a solvent. The simplest method to treat the local environment is obviously just to include a sufficient number of atoms so as to model both the system of interest and its surroundings. The addition of a single water molecule to the quantum-mechanical system, however, adds at least six basis functions, as s and p valence orbitals must be employed as a basis for the semi-empirical MOs, resulting in significantly increased computational effort. Fortunately, alternate strategies for modifying the semi-empirical Hamiltonian, that introduce relatively little additional computational cost, have been implemented in the past five years, allowing the investigation of solvation effects upon chemical reactions64 and molecular spectra65. While a comprehensive overview of these theoretical approaches is not possible here, the theory underlying three methods that are available in widely employed software packages will be briefly discussed. Conceptually, the self-consistent reaction field (SCRF) model is the simplest method for inclusion of environment implicitly in the semi-empirical Hamiltonian24, and has been the subject of several detailed reviews24,25,66.
In SCRF calculations, the QM system of interest (solute) is placed into a cavity within a polarizable medium of dielectric constant $\varepsilon$ (Fig. 2.2). For ease of computation, the cavity is assumed to be spherical and have a radius $r_0$, although expressions similar to those outlined below have been developed for ellipsoidal cavities67.

Fig. 2.2 Self-Consistent Reaction Field (SCRF) model for the inclusion of solvent effects in semi-empirical calculations. The solvent is represented as an isotropic, polarizable continuum of macroscopic dielectric $\varepsilon$. The solute occupies a spherical cavity of radius $r_0$, and has a dipole moment of $\mu$. The molecular dipole induces an opposing dipole in the solvent medium, the magnitude of which is dependent on $\varepsilon$.

Using ideas from classical electrostatics, we can show that the interaction potential can be expressed as a function of the charge and multipole moments of the solute. For ease of analysis, we shall assume that this expansion can be truncated so that only the solute charge and dipole moment contribute significantly to the solvation energy. The first two terms in the expansion have the following form:
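In the classical reaction-field treatment of a spherical cavity these two terms are commonly written as below (Gaussian-type units; the subscripts Born and Onsager are labels adopted here, with $Q$ the solute charge, $\mu$ its dipole moment, $\varepsilon$ the dielectric constant and $r_0$ the cavity radius):

$$E_{\mathrm{Born}} = -\frac{1}{2}\left(1 - \frac{1}{\varepsilon}\right)\frac{Q^2}{r_0}, \qquad E_{\mathrm{Onsager}} = -\frac{\varepsilon - 1}{2\varepsilon + 1}\,\frac{\mu^2}{r_0^3}$$

Both terms vanish for $\varepsilon = 1$, as they must for a solute in the gas phase.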
The first term, $E_{\mathrm{Born}}$, is the self-energy of a charged solute while the second, $E_{\mathrm{Onsager}}$, describes the polarization of the medium by the solute dipole. The total energy of the system is then obtained by addition of these terms to the energy of the solute computed using QM methods. A key assumption here is that the wavefunction can be divided into solute and solvent parts, meaning that exchange and charge transfer processes cannot occur. The total energy, E, of the system including the contribution from the polarization energy of the medium, can then be written as:
where $|\psi\rangle$ describes the isolated solute, and the last term is included only if the solute has a total charge, Q (ref. 26). Assuming that the total system energy (solute plus solvent) can be treated quantum mechanically, the variational principle can be applied to introduce the effects of the environment into the unperturbed Fock operator for the isolated system, $F_0$. This is accomplished by including a term describing the interaction of the electronic component of the dipole moment of the solute, $\mu$, with the polarizable medium, and the resulting Fock operator, $F'$, can be written in the following simple form:
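In the Onsager-type SCRF formulation this operator is usually written as follows, where $\hat{\mu}$ is the dipole operator and $\langle\psi|\hat{\mu}|\psi\rangle$ is its expectation value for the current solute wavefunction:

$$F' = F_0 - g(\varepsilon)\,\langle \psi | \hat{\mu} | \psi \rangle \cdot \hat{\mu}$$

Because the added term depends on the dipole computed from the wavefunction, it must be updated in every SCF cycle, which is the origin of the name "self-consistent reaction field."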
where $g(\varepsilon) = 2(\varepsilon - 1)/[(2\varepsilon + 1)\,r_0^3]$ is the Onsager reaction factor, reflecting the ability of the molecular dipole to polarize the surrounding medium of bulk dielectric constant, $\varepsilon$. Note that the dipole operator $\hat{\mu}$ is a one-electron operator, and that the Born term is merely additive. An advantage to this approach is that gradients required for geometric optimization are easy to obtain and directly include solute, solvent, and their interaction. The choice of an appropriate cavity radius, $r_0$, is an important issue in SCRF calculations, especially as most biological molecules are not spherical, and various proposals have been investigated68. The simplest method of determining $r_0$ remains the following: $r_0^3 = 3M_r/(4\pi N_A d)$, where $N_A$ is Avogadro's number, $M_r$ is the relative mass of the molecule of interest, and d is the density of the solute. The field induced in the medium by the dipole moment of the QM system therefore lowers the energy of the semi-empirical SCF ground state for a neutral solute by an amount $\Delta E$:
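In the Onsager picture this stabilization is $\Delta E = -\tfrac{1}{2}\,g(\varepsilon)\,\mu^2$. A minimal numerical sketch of the cavity radius, reaction factor, and stabilization follows. It assumes Gaussian-type units in which $\mu^2/r_0^3$ is an energy, the function names are invented for illustration, and the water-like inputs are example values rather than data from the text.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def cavity_radius_angstrom(molar_mass_g_mol, density_g_cm3):
    """Radius of a sphere whose volume is the molecular volume M / (N_A d)."""
    volume_cm3 = molar_mass_g_mol / (AVOGADRO * density_g_cm3)
    radius_cm = (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius_cm * 1.0e8  # cm -> Angstrom


def onsager_factor(eps, r0):
    """g(eps) = 2 (eps - 1) / ((2 eps + 1) r0^3)."""
    return 2.0 * (eps - 1.0) / ((2.0 * eps + 1.0) * r0 ** 3)


def dipole_stabilization(eps, r0, mu):
    """Energy lowering -1/2 g(eps) mu^2 for a neutral dipolar solute."""
    return -0.5 * onsager_factor(eps, r0) * mu ** 2


# Example: a solute with the molar mass and density of liquid water.
r0 = cavity_radius_angstrom(18.015, 0.997)
g_water = onsager_factor(78.0, r0)
```

The factor is zero for $\varepsilon = 1$ and saturates at $1/r_0^3$ for very large $\varepsilon$, so the result is far more sensitive to the choice of cavity radius than to the exact dielectric constant of a polar solvent.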
where $g(\varepsilon)$ is the Onsager reaction factor. In obtaining expressions for evaluating the gradient of the modified SCRF semi-empirical wavefunction, however, care must be taken to treat contributions arising from the solvent polarization energy correctly69. For example, the derivative of the energy expression with respect to nuclear coordinates does not vanish for the surroundings unlike the solute, and takes the form:
where the second term involves atomic orbitals $\phi_\mu$ and the associated elements of the density matrix $P_{\mu\nu}$ used in the calculation. Although d /dq is zero for ZDO theories, such as AM1 and PM3, the second term in Eq. (2.70) does not vanish and is difficult to calculate. In practice, numerical differentiation is employed to find the gradient terms required for geometry optimization using a full SCF at each step of the calculation. These simple ideas can be extended to derive similar expressions for the influence of the solvent on excited state wavefunctions, allowing calculation of solution phase electronic spectra using the CIS formalism. It is usually assumed that the dielectric response of the medium can be divided into two parts describing (a) nuclear reorganization, i.e., reorientation of the solvent due to photo-dependent electronic excitation in the solute, and (b) electronic polarization in the medium. Excitation to give the solute excited state occurs in $10^{-15}$ to $10^{-18}$ seconds, which is insufficient time for the solvent nuclei to change their position in response to the excited-state wavefunction. On the other hand, polarization of the electrons of the surrounding medium can occur on this timescale. Therefore, the theoretical description of solvent effects can be assumed to reflect only the dipole moment of the excited state, $\mu^*$. Detailed mathematical analysis can be used to show that the elements of the CI matrix using the SCRF model have the following form if $\Psi_I$ is the wavefunction describing the Ith excited state:
where and is the refractive index of the medium. Inclusion of solvent electronic polarization into the SCF calculation using perturbation theory then yields the following result26,70:
where $\Delta E^*_{abs}$ is the energy change in both the solute and solvent that occurs immediately after electronic excitation, and $\Psi_0$ and $\Psi_I$ are the wavefunctions describing the ground and excited states, respectively. Given that the absorption wavelength is a function of the relative energies of the ground and excited states, the shift of the peak in the electronic spectrum of the molecule in solution compared to that observed in the gas phase, $\Delta E$, can be approximated by:
The shift, $\Delta E$, is therefore determined by ($\mu_0 - \mu^*$), and therefore if the ground-state dipole is greater than that of the excited state, the absorption band will be blue-shifted. Equally, if there is a separation of charge in the excited state, the band will be red-shifted. One of the drawbacks to the SCRF model is that the QM system is placed into a spherical, or ellipsoidal, cavity within the medium representing the solvent. This limitation offsets the computational ease of SCRF calculations when applied to structures of biochemical interest given that these molecules possess irregular shapes. Recent theoretical developments have therefore focussed on extending continuum representations of solvent to handle cavities of arbitrary shape24,27,71,72. Two recent approaches to this problem will be discussed in this chapter that illustrate models with potential application in biochemical calculations. The first method extends charge screening models originally developed for conducting materials, and has therefore been termed COSMO (COnductor-like Screening MOdel)72. In this approach, the solute is placed into a cavity of arbitrary shape within a dielectric continuum medium of permittivity $\varepsilon$ representing the solvent. The interface between the cavity and the dielectric is described by the solvent accessible surface (SAS)73. Evaluation of solvation effects on the solute then requires finding the polarization of the medium in response to the molecular charge distribution, as reflected by the screening charge densities $\sigma(r)$:
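In the COSMO literature the dielectric boundary condition that fixes these densities is usually stated in the following form, with $\sigma(r)$ the screening charge density, $n(r)$ the surface normal, and $E^-(r)$ the field just inside the surface:

$$4\pi\varepsilon\,\sigma(r) = (\varepsilon - 1)\; n(r)\cdot E^-(r)$$

Because $E^-(r)$ itself contains the field of the screening charges, the unknown $\sigma$ appears on both sides, which is why a closed-form solution is available only for very simple cavity shapes.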
where n(r) is the surface normal vector at position r, and $E^-(r)$ is the total electric field on the inner side of the surface at r arising from the solute charge distribution and screening charges. As Eq. (2.79) cannot be solved analytically for an arbitrarily shaped surface, numerical methods are used in which the total SAS is divided into M small segments, $S_\mu$, of constant charge density $\sigma_\mu$. The total charge density is then described by the M-dimensional vector, $\sigma$. Note that the following equations employ electrostatic energy units for simplicity and that therefore the factor of $1/(4\pi\varepsilon_0)$ is dropped from all energy expressions. Continuum electrostatic theory indicates that dielectric screening energies scale as $(\varepsilon - 1)/(\varepsilon + b)$, where b varies over the range 0-2. Hence, for calculations on biological systems in aqueous solution, for which $\varepsilon$ is 78, the solvent screening can be approximated very well by the corresponding effects that occur in a conductor. For any surface enclosing a set of n molecular charges $Q_i$, the screening charge distribution on the solute surface created by solvent polarization and its associated energy can be determined by dividing the surface into a large number M of small segments $S_\mu$ of constant charge density $\sigma_\mu$ centered at $R_\mu$. If the area of any given segment is $|S_\mu|$, then its total charge, $q_\mu$, is given by:

$$q_\mu = \sigma_\mu\,|S_\mu|$$
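Then the interaction energy $b_{i\mu}$ of a unit charge at position $r_i$ with a unit charge on $S_\mu$ can be approximated as:

$$b_{i\mu} \approx \frac{1}{|r_i - R_\mu|}$$

Similarly, the electrostatic interaction, $a_{\mu\nu}$, of unit charges on different segments of the surface, $S_\mu$ and $S_\nu$, can be written as:

$$a_{\mu\nu} = \frac{1}{|S_\mu|\,|S_\nu|}\int_{S_\mu}\!\int_{S_\nu}\frac{d^2s\;d^2s'}{|s - s'|}$$

When $\mu = \nu$, detailed mathematical analysis yields the following good approximation to $a_{\mu\mu}$:

$$a_{\mu\mu} \approx 3.8\,|S_\mu|^{-1/2}$$

For all other cases, $\mu \neq \nu$, the interaction elements can be approximated by:

$$a_{\mu\nu} \approx \frac{1}{|R_\mu - R_\nu|}$$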
Representing all of the solute charges by the vector Q, the surface charges by q, and the matrices formed by elements $a_{\mu\nu}$ and $b_{i\mu}$ as A and B respectively, the total electrostatic energy of the system can be written:
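In matrix notation, and in the conductor limit, this energy is commonly written as the sum of solute-solute, solute-surface, and surface-surface interactions:

$$E(q) = \tfrac{1}{2}\,Q\,C\,Q + Q\,B\,q + \tfrac{1}{2}\,q\,A\,q$$

Only the last two terms depend on the screening charges, so they alone determine the response of the solvent.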
where C is the Coulomb matrix in which the elements, in electrostatic units, take the form:
and Realizing that the screening charges, q, must be positioned so as to minimize the energy of the system, differentiation and simplification gives the amount of stabilization energy, E, due to the presence of the continuum solvent:
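Setting the derivative of the energy with respect to q to zero gives $q^* = -A^{-1}B^{T}Q$ and a stabilization $\Delta E = -\tfrac{1}{2}\,Q\,B\,A^{-1}B^{T}Q$. The sketch below reproduces that algebra numerically. It assumes the quadratic energy form with a symmetric, positive-definite A, takes B with one row per solute charge and one column per surface segment, and uses small made-up matrices; `cosmo_screening` is a hypothetical name, and a linear solve stands in for the explicit matrix inversion.

```python
import numpy as np


def cosmo_screening(Q, B, A):
    """Return the energy-minimizing screening charges and the stabilization.

    Q : (n,) solute charges
    B : (n, M) solute-segment interaction elements
    A : (M, M) segment-segment interaction elements (symmetric)
    """
    Q = np.asarray(Q, dtype=float)
    B = np.asarray(B, dtype=float)
    A = np.asarray(A, dtype=float)
    # Stationarity condition: B^T Q + A q = 0
    q = -np.linalg.solve(A, B.T @ Q)
    # Q B q + 1/2 q A q collapses to 1/2 Q B q at the minimum.
    delta_e = 0.5 * Q @ B @ q
    return q, delta_e


# Hypothetical two-charge solute and three-segment surface.
Q = np.array([1.0, -0.5])
B = np.array([[1.0, 0.8, 0.6],
              [0.5, 0.7, 0.9]])
A = np.array([[4.0, 1.0, 0.5],
              [1.0, 4.0, 1.0],
              [0.5, 1.0, 4.0]])

q_star, delta_e = cosmo_screening(Q, B, A)
```

The stabilization is negative for any nonzero solute charge distribution, since A is positive definite.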
Therefore calculation of the polarization component of the solvation energy requires only the construction of the matrices A and B, after the initial evaluation of the solvent accessible surface and its assignment into small segments (Fig. 2.3)74. When used in combination with semi-empirical methods, the first time-consuming step is the inversion of matrix A. The use of $A^{-1}$ is also time-consuming as B is an n × M matrix, and two matrix multiplication steps must be undertaken to obtain $BA^{-1}B$. Although the COSMO method can also be employed in ab initio calculations, we note that n scales as the square of the number of basis functions N, making the effort to obtain $BA^{-1}B$ proportional to $N^4$! Although derivation of Eq. (2.88) assumed that point charges were associated with atoms in the solute, it is easily generalized to charge distributions arising from the many-electron wavefunction describing the molecular system. Thus, in the MO formalism, the charge density distributions $\rho(r_i)$ must be considered instead of point charges $Q_i$ at positions $r_i$. These can be computed from the following expression:

$$\rho(r_i) = \sum_{\kappa\lambda} P_{\kappa\lambda}\,\phi_\kappa(r_i)\,\phi_\lambda(r_i)$$
where $\phi_\kappa(r_i)$ and $\phi_\lambda(r_i)$ are basis AOs describing the position of the overlap charge density, and $P_{\kappa\lambda}$ describes the amount of charge corresponding to the overlap of these basis functions (equivalent to $Q_i$ in the derivation given above). In this extension of the method, the summation index i is replaced by a double indexation $\kappa$ and $\lambda$, denoting the combinations of basis functions. The corresponding elements in the B matrix then take the form:

$$b_{\kappa\lambda,\mu} \approx \int \frac{\phi_\kappa(r)\,\phi_\lambda(r)}{|r - R_\mu|}\,d^3r$$

Fig. 2.3 The solvent accessible surface (SAS) area corresponds to that mapped out by the center of a sphere representing the solvent molecule (gray) as it is rolled over the van der Waals surface of the solute (light gray). In the COSMO model, the SAS is then divided into a series of segments of area $|S_\mu|$ and charge density $\sigma_\mu$, centered at a position $R_\mu$.
This integral is that of a core-electron interaction and therefore available through solution of the many-electron wavefunction using a variety of methods. To this point, the theory has been developed assuming that the medium is a conductor rather than a polarizable solvent of finite dielectric $\varepsilon$. Fortunately, Eq. (2.88) can be extended to solvents by the introduction of a correction factor, $f(\varepsilon)$:
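In the COSMO literature this factor has the simple form below and is applied as a scaling of the conductor result. The symbol $\Delta E_{\mathrm{conductor}}$ is a label introduced here for the screening energy obtained in the conductor limit.

$$f(\varepsilon) = \frac{\varepsilon - 1}{\varepsilon + \tfrac{1}{2}}, \qquad \Delta E(\varepsilon) \approx f(\varepsilon)\,\Delta E_{\mathrm{conductor}}$$

For water, with $\varepsilon = 78$, this gives $f \approx 0.981$, so the conductor limit already accounts for about 98% of the screening.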
Use of this correction term yields a relative error in the screening energies due to the solvent of less than $1/(2\varepsilon)$. Perhaps the most important feature of the COSMO approach, however, is the fact that analytical gradients are easily computed from the energy expression, allowing geometry optimization of the QM solute in the presence of solvent. In general, other continuum models require the calculation of gradients using numerical methods, resulting in increased computing times and lower rates of convergence. Despite the theoretical complexity of the COSMO algorithm, it has been successfully coded into the MOPAC software package, and can be employed in calculations using either the AM1 or PM3 semi-empirical models75. A series of alternative, generally parameterized methods for introducing the effects of solvent into semi-empirical calculations are termed SMx, where the value of x represents the type and quality of parameterization27,76-81. These methods have potential value in studying solvation effects on the structure, electronic spectra, and reactivity of biologically interesting compounds, and are available in the AMSOL software package. Once again, the solvation energy is related to the shape of the QM solute in a continuum representation of solvent, but cavity, dispersion, solvent structure, and local-field polarization terms are present. The usefulness of this approach for biochemical systems lies in the fact that parameters have been derived for many different types of atom, including phosphorus, when in an aqueous environment27. On the other hand, the actual parameter values used in the model depend on whether the AM1 or PM3 semi-empirical model is used for determining the solute properties. Although there are a number of discussions concerning the fundamental theory of this approach and parameter development27,76, it will be briefly reviewed here to illustrate similarities and differences with the two previous solvation models. As usual, the solvent is assumed to be an isotropic, polarizable continuum, and therefore the standard free energy of the QM solute in solution can be expressed as:

$$G^\circ(\mathrm{sol}) = G^\circ(g) + \Delta G^\circ_S$$
where $\Delta G^\circ_S$ is the free energy of solvation and $G^\circ(g)$ is the gas-phase solute energy. In deriving the model, the solvation energy term is partitioned into a number of independent contributions:
in which $E_{EN}(\mathrm{sol})$ and $E_{EN}(g)$ are the sum of the electronic kinetic, and electron-nuclear Coulombic, energies in the presence and absence of solvent, respectively. $\Delta G_{vib}$ and $\Delta G_{elec}$ are the vibrational and electronic free energy changes that occur on moving the solute from gas-phase into solution. These two terms make small contributions compared to $G_P(\mathrm{sol})$ and $G^\circ_C(\mathrm{sol})$, which represent the polarization free energy and the energy required to make room for the solute in the solvent (cavity free energy), respectively. Note that the partitioning of the energetic contributions arising from polarization and cavity formation is an essential assumption in this solvation model. The classical Born expression for the polarization free energy of a spherical ion of net charge q can be written as82:
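In Gaussian-type units, with $\varepsilon$ the dielectric constant of the medium and $\alpha$ the effective ionic radius, the Born expression is:

$$G_P = -\frac{1}{2}\left(1 - \frac{1}{\varepsilon}\right)\frac{q^2}{\alpha}$$

The energy is always stabilizing for $\varepsilon > 1$ and grows as the ion becomes smaller or more highly charged.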
where $\varepsilon$ is the dielectric constant of the medium containing the charge and $\alpha$ is the effective ionic radius of the ion. For multi-centered systems, the Born equation can be generalized so that:
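The generalized Born expression is usually written as a double sum over atomic centers, with $q_k$ the partial charge on center k and $\gamma_{kk'}$ an effective Coulomb interaction between centers k and k':

$$G_P = -\frac{1}{2}\left(1 - \frac{1}{\varepsilon}\right)\sum_{k}\sum_{k'} q_k\, q_{k'}\,\gamma_{kk'}$$

For a single center with $\gamma_{kk} = 1/\alpha$, the double sum collapses to the Born formula for an isolated ion.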
where k and k' are atomic centers, and $\gamma_{kk'}$ is a Coulomb integral that is very hard to evaluate for complex charge distributions. The difficulty in obtaining analytical expressions for $\gamma_{kk'}$ therefore limits efforts to obtain the polarization energy for complex structures, and so stimulated the development of an empirical function, initially for use in energy minimization and molecular dynamics simulations83,84, of the following form:
where $r_{kk'}$ is the distance between charges and $\alpha_k$ is the effective radius of the kth ion. The term $\gamma_{kk'}$ must then be chosen such that this expression becomes the Born equation in the case of a single ion, and yields the usual electrostatic interaction for two charged particles separated by a distance $r_{kk'}$. The following expression for the empirical function $\gamma_{kk'}$ satisfies both of these constraints:
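In generalized Born treatments of this type the function is commonly written as $\gamma_{kk'} = [\,r_{kk'}^2 + \alpha_k \alpha_{k'}\,C_{kk'}(r_{kk'})\,]^{-1/2}$, with a leading screening term $C^{(0)}_{kk'} = \exp[-r_{kk'}^2/(d^{(0)}\alpha_k\alpha_{k'})]$. The sketch below assumes only this leading term (any additional localized correction is omitted), uses hypothetical radii and charges in atomic-type units, and checks both limiting cases numerically. The function names are illustrative and are not taken from AMSOL.

```python
import math


def gamma_gb(r, alpha_k, alpha_kp, d0=4.0):
    """Effective Coulomb function with the leading screening term only."""
    product = alpha_k * alpha_kp
    screening = math.exp(-r * r / (d0 * product))
    return 1.0 / math.sqrt(r * r + product * screening)


def polarization_energy(charges, radii, coords, eps):
    """G_P = -1/2 (1 - 1/eps) * sum_k sum_k' q_k q_k' gamma_kk'.

    The double sum includes the k = k' self terms (r = 0), which
    reproduce the Born energy of each center.
    """
    total = 0.0
    for qk, ak, xk in zip(charges, radii, coords):
        for qkp, akp, xkp in zip(charges, radii, coords):
            total += qk * qkp * gamma_gb(math.dist(xk, xkp), ak, akp)
    return -0.5 * (1.0 - 1.0 / eps) * total


# Limiting cases with a hypothetical radius of 2.0:
born_limit = gamma_gb(0.0, 2.0, 2.0)        # equals 1 / alpha
coulomb_limit = gamma_gb(100.0, 2.0, 2.0)   # approaches 1 / r
single_ion = polarization_energy([1.0], [2.0], [(0.0, 0.0, 0.0)], 78.0)
```

At zero separation the function returns $1/\alpha$, recovering the Born equation, and at large separation it returns $1/r$, the ordinary Coulomb interaction.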
in which $d^{(0)}$ is an empirically optimized constant that is usually set equal to 4.0. This expression is a modification of that originally employed in calculations of solvation free energies using classical force-field approaches83 in that an additional localized function, $C^{(1)}_{kk'}$, is included which has the following form27:
The adjustable parameter $d^{(1)}$ is made nonzero for only a limited range of atom pairs, such as bonded and geminal O-O, and vicinal N-H, interactions, so as to correct systematic deficiencies exhibited for interatomic distances in a localized range. $C^{(1)}_{kk'}$ is a complicated function that is therefore only computed under well-defined circumstances and will not be discussed further here27. Evaluation of the polarization energies for the multi-centered QM system therefore only requires the derivation of values for $q_k$ and $\alpha_k$. In the case of a single atom of charge $q_k$, $\alpha_k$ can be approximated by a "coulomb radius," $\rho_k$, chosen such that:
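In the original SMx parameterizations the charge dependence of this radius is usually given by an arctangent switching function of the following kind, where $\rho_k^{(0)}$, $\rho_k^{(1)}$, $q_k^{(0)}$ and $q_k^{(1)}$ are parameters:

$$\rho_k = \rho_k^{(0)} + \rho_k^{(1)}\left[-\frac{1}{\pi}\arctan\!\left(\frac{q_k + q_k^{(0)}}{q_k^{(1)}}\right) + \frac{1}{2}\right]$$

For large positive $q_k$ the bracket tends to 0 and $\rho_k \to \rho_k^{(0)}$, while for large negative $q_k$ it tends to 1 and $\rho_k \to \rho_k^{(0)} + \rho_k^{(1)}$, so anions are assigned larger radii than cations.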
where $\rho_k^{(0)}$, $\rho_k^{(1)}$, and $q_k^{(0)}$ are adjustable parameters that can be derived from experimental energies of solvation in aqueous solution. In most implementations of SMx models, $q_k^{(1)}$ is fixed at 0.1. The function varies smoothly from an empirical limit of $\rho_k^{(0)}$ for large positive $q_k$ to ($\rho_k^{(1)} + \rho_k^{(0)}$) for large negative $q_k$. For atoms in multi-centered molecules, $\alpha_k$ must be evaluated using a complex numerical integration procedure85. Since the solvent accessible area contributed by the kth atom is an essential element in determining $\alpha_k$ numerically, this dielectric screening algorithm again has the advantage that arbitrarily shaped solutes are modeled within a realistic solvent cavity. Having established expressions, Eqs. (2.95)-(2.99), for evaluating the polarization energy due to the charge distribution in the solute, the following expression for the free energy of the QM system in the Hartree-Fock SCF formalism can be written:
where $P_{\mu\nu}$, $H_{\mu\nu}$, and $F^{(0)}_{\mu\nu}$ are the density, one-electron, and Fock matrix elements, respectively, in the absence of solvation terms. The indices $\mu$ and $\nu$ are summed over the AO basis and $Z_k$ is the valence nuclear charge of the kth atom, defined as the nuclear charge minus the number of core electrons. Although a number of methods exist for obtaining the partial charge of the kth atom, $q_k$, a Mulliken population analysis86 is generally used in the SMx solvation models implemented in the AMSOL package, such that:
where the summation includes only orbitals, $\mu$, that are on the kth atom. Given that errors in evaluating partial charges should be systematic, the actual method for deriving these atomic parameters is not a major issue in this solvation approach. At fixed $\gamma_{kk'}$, and considering the variation in the solution free energy with respect to the coefficients defining the AO contributions to the MOs, the following expression can be derived27:
where
In Eq. (2.103) the summations in the second term that modifies the gas-phase Fock matrix element represent the Born polarization energy, and therefore the primed summations are taken over AOs associated only with the kth atom. Moreover, only the diagonal elements of the gas-phase, solute Fock matrix are modified by the presence of the continuum solvent due to the appearance of the Kronecker delta function, $\delta_{\mu\nu}$, which is zero unless $\mu = \nu$. The critical advantage to including solvent polarization effects in the Fock operator explicitly is that the density matrix and the converged MOs are determined self-consistently. Hence, the molecular charge distribution can relax in the field of the polarizable solvent87-89. This redistribution of electrons may represent a significant contribution to the overall free energy of solvation. After solving the problem of introducing solvent polarization into the semi-empirical model, only the free energy associated with cavity formation, $G^\circ_C(\mathrm{sol})$, remains to be determined. In common with many other models90-93, this term can be considered to be proportional to either the van der Waals surface area or the solvent accessible surface of the solute (Fig. 2.3). Using the solvent accessible surface has the advantage that it is proportional to the number of solvent molecules in the first solvation sphere. If the small contributions arising from $\Delta G_{vib}$ and $\Delta G_{elec}$ are ignored, the cavity free energy associated with dissolving the solute can be expressed as:
In Eq. (2.105), the assumed proportionality between solvent-accessible surface area and cavity energy is shown explicitly, and the proportionality constant, $\sigma_k$, therefore depends on the type of the kth atom interacting with the solvent. $A_k(\beta_k)$ is the complex function describing the solvent-accessible surface area, and depends on $\beta_k$, which is defined by the following expression:
$$\beta_k = R_k + R_{sol}$$
where $R_k$ is the van der Waals radius for the kth atom and $R_{sol}$ is the radius of the sphere encompassing an explicit solvent molecule. In most calculations in which water is the solvent, $R_{sol}$ has the value 1.4 Å. Note that the solvent-accessible area does not include regions that overlap with the cognate spheres computed for all other atoms surrounding the kth atom, and must therefore be reevaluated during each geometry optimization step to reflect changes in the molecular conformation. By using both ionic and neutral solutes in the parameterization procedures, it was possible to optimize the parameters contributing to the polarization energy separately from those required to obtain the energetics of cavity formation. In common with similar approaches that relate solvent-accessible surface to cavity free energy90-93, the simple SM1 model required careful parameterization, and assumed that atoms interacted with solvent in a manner independent of their immediate molecular environment and their hybridization76. In more recent implementations of the SMx approach, $\sigma_k$ parameters are selected for particular atoms based on properties determined from the SCF wavefunction that is evaluated during calculation of the solute and solvent polarization energies27. On the other hand, the inclusion of more parameters in the solvation model requires access to substantial amounts of experimental data for the solvation free energies of molecules in the training set94,95.
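As a rough illustration of how the ingredients of this solvation model combine, the following Python sketch evaluates Mulliken charges, a polarization term, and a surface-area cavity term for a toy system. It is only a sketch under stated assumptions: the Mulliken charge is taken in its usual NDDO form (core charge minus the diagonal density matrix elements on that atom), the polarization energy is assumed to have the generalized Born form $G_P = -\tfrac{1}{2}(1 - 1/\varepsilon)\sum_{k,k'} q_k q_{k'} \gamma_{kk'}$, and the cavity term is assumed to be a sum of surface tensions times exposed areas. All function names, the pre-computed $\gamma_{kk'}$ values, and the exposed areas are hypothetical inputs, not quantities taken from AMSOL.

```python
def mulliken_charges(core_charges, diag_density, orbital_atom):
    """q_k = Z_k minus the sum of diagonal density elements on atom k.

    orbital_atom[m] gives the index of the atom carrying orbital m
    (hypothetical bookkeeping for this sketch).
    """
    charges = list(core_charges)
    for p_mumu, atom in zip(diag_density, orbital_atom):
        charges[atom] -= p_mumu
    return charges


def polarization_energy(charges, gamma, epsilon):
    """Assumed generalized Born form: -1/2 (1 - 1/eps) sum q_k q_k' gamma_kk'."""
    total = 0.0
    for k, q_k in enumerate(charges):
        for kp, q_kp in enumerate(charges):
            total += q_k * q_kp * gamma[k][kp]
    return -0.5 * (1.0 - 1.0 / epsilon) * total


def probe_radius(r_vdw, r_sol=1.4):
    """Radius of the solvent-accessible sphere: van der Waals plus solvent radius."""
    return r_vdw + r_sol


def cavity_energy(surface_tensions, exposed_areas):
    """Assumed cavity term: sum over atoms of sigma_k times exposed area A_k."""
    return sum(s * a for s, a in zip(surface_tensions, exposed_areas))


# Toy two-atom example with made-up numbers (arbitrary energy units).
q = mulliken_charges([1.0, 1.0], [1.2, 0.8], [0, 1])
g_p = polarization_energy(q, [[2.0, 1.0], [1.0, 2.0]], epsilon=78.4)
g_c = cavity_energy([0.01, 0.02], [10.0, 20.0])
```

In a real calculation the exposed areas would have to be recomputed whenever the geometry changes, because they depend on the overlap between neighboring solvent-accessible spheres.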
Extending Semi-Empirical Calculations to Model Protein Structure and Enzyme Reaction Mechanisms
Although continuum solvation models do appear to reproduce the structural and spectroscopic properties of many molecules in solution, parameterization remains an issue in studies involving solvents other than water. In addition, the extension of these approaches to study proteins embedded in anisotropic environments, such as cell membranes, is clearly a difficult undertaking96. As a result, several theoretical studies have been undertaken to develop semi-empirical methods that can calculate the electronic properties of very large systems, such as proteins28,97,98. The principal problem in describing systems comprising many basis functions is the method for solving the semi-empirical SCF equations:
where $\psi$ is the orthonormal set of MOs that are linear combinations of AOs [Eq. (2.31)], and F is the Fock matrix (Fig. 2.4A). As a consequence, an iterative procedure that scales as $N^3$, where N is the number of basis functions, must be used to construct the Fock matrix and solve the SCF equations to obtain the MOs. The two algorithmic steps from which the $N^3$-dependence arises are computation of the density matrix elements, $P_{\mu\nu}$ [Eq. (2.34)], and diagonalization of the Fock matrix in order to obtain the MO energies. Extension of semi-empirical models to large molecules, therefore, requires methods that obviate both of these steps. One interesting approach that has been recently described28 recognizes that the only necessary condition for obtaining an SCF is that all of the Fock matrix elements coupling the occupied and unoccupied sets of MOs must be zero. Therefore, the description of the molecular system in terms of localized MOs (LMOs) provides an approach to develop an algorithm that can treat systems composed of large numbers of atoms. While conventional MOs extend over all of the atoms in a molecule, LMOs correspond to the electronic structure represented by Lewis diagrams and are therefore highly localized in three-dimensional space. A limitation of the LMO method, however, is that calculations employing such orbitals are restricted to closed-shell systems, but this is not a problem in almost all biological applications.
Fig. 2.4 (A) Flowchart showing the basic operations required to perform an SCF calculation. (B) Flowchart showing the steps required to carry out the same calculation using LMOs.
Using LMOs offers no advantages in computational efficiency over conventional semi-empirical approaches for small systems. On the other hand, calculations on large molecules benefit for several reasons. First, all interactions between occupied and virtual LMOs located on atoms separated by distances greater than a user-defined cutoff can be annihilated. As the number of atoms defining the QM system, and hence its size, increases, the number of interaction terms that require evaluation becomes proportionately smaller. Second, the number of virtual LMOs with which any specific occupied LMO can interact depends on the number of atoms and bonds used in defining the LMO, and for large molecular systems, the number of annihilations becomes linearly dependent on the number of atoms. Third, the matrix elements that must be evaluated in constructing the density matrix are limited in this model. For example, if the LMO does not span a specific atom, J, in the molecule, then that LMO cannot contribute to any density matrix elements involving atom J. As a result, the computational effort in constructing the density matrix scales linearly with the number of LMOs. Finally, the calculation of LMO energies and occupied-virtual interaction energies can be limited to the atoms upon which the LMOs are localized, since the size of an LMO depends only on the local electronic structure. The computational effort to determine the many-electron wavefunction, excluding that required to treat long-range electrostatic interactions, therefore scales linearly with the number of LMOs.
When conventional algorithms for solving the SCF equations are compared with those involving an LMO description of the molecular system (Fig. 2.4B), the consequence is that the diagonalization step ($N^3$-dependent) is replaced by an annihilation procedure (N-dependent). An additional step is introduced, however, in the LMO SCF procedure to manipulate the initial set of LMOs so as to eliminate small orbital contributions (vide infra).
A significant problem in applying the LMO approach to large molecules, on the other hand, is obtaining an initial set of LMOs that satisfy a number of constraints, including orthogonality. For example, LMOs should be localized, at most, on only two atoms, and the total numbers of occupied and virtual LMOs describing the system must be equal to the corresponding numbers of occupied and virtual MOs. While the exact details of the methods proposed for constructing LMOs according to these requirements are too complex for discussion here, generation of LMOs from combinations of AOs, selection of which is based on consideration of Lewis structures, automatically yields orthonormal LMOs. Thus, pairs of AOs, $\phi_\mu$ and $\phi_\nu$, are combined to give diatomic bonding, $\psi_i$, and anti-bonding, $\psi_j$, LMOs so that the LMOs diagonalize the matrix:
in which F is the Fock operator and $\varepsilon_{ii}$ and $\varepsilon_{jj}$ are the LMO energies. The initial AOs are hybrid orbitals in an sp basis set, and the LMOs are polarized due to the different electronegativities of the hybrid AOs. Simple rules can then be used to assign any unused AOs to the sets of occupied or virtual LMOs. For example, hybrid AOs corresponding to either lone pairs or unused orbitals on anionic species, and unused orbitals on cations, are considered to be occupied and virtual LMOs, respectively. Although this approach to generating an initial set of LMOs means that the treatment of hypervalent systems or open-shell systems, such as organic radicals, is not feasible, calculations on most molecules of biological interest are amenable to expression in terms of LMOs. As noted, all Fock matrix elements connecting the sets of occupied and virtual orbitals must be zero, so that the following equation holds for the model:
$$\langle \psi_i^{occ} | F | \psi_j^{virt} \rangle = 0$$
Due to their spatial localization, it follows that the interaction energy of an occupied LMO with any distant virtual LMO will be zero, and so the computational problem becomes reduced to annihilating matrix elements connecting LMOs that are close in space. These LMOs can be easily identified from the molecular connectivity table given the requirement that any allowed LMO spans one or two atoms. The Fock matrix element, $F_{ij}$, takes the form:
$$F_{ij} = \sum_{\mu} \sum_{\nu} c_{\mu i} F_{\mu\nu} c_{\nu j}$$
In the annihilation procedure, for any occupied LMO, $\psi_i$, all of the proximate virtual LMOs, $\psi_j$, are identified and mixed so as to generate two new LMOs having zero interaction, using the following expressions:
where
and
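The mixing step described here can be illustrated with a generic two-orbital (Jacobi) rotation. The Python sketch below is not the published set of expressions, which did not survive in this text; it simply assumes that the occupied and virtual LMOs are mixed as $\psi_i' = c\,\psi_i + s\,\psi_j$ and $\psi_j' = -s\,\psi_i + c\,\psi_j$, with the rotation angle chosen so that the new occupied-virtual Fock element vanishes. The function names and the toy energies are hypothetical.

```python
import math


def annihilate(eps_i, eps_j, f_ij):
    """Rotate an occupied/virtual pair so that their Fock coupling becomes zero.

    eps_i, eps_j: diagonal Fock elements (LMO energies) before mixing.
    f_ij: occupied-virtual Fock matrix element before mixing.
    Returns (c, s, new_eps_i, new_eps_j, new_f_ij).
    """
    gap = eps_i - eps_j
    if gap == 0.0:
        # Degenerate pair: any 45-degree rotation removes the coupling.
        theta = math.copysign(math.pi / 4.0, f_ij) if f_ij != 0.0 else 0.0
    else:
        # Small-angle solution of tan(2*theta) = 2*f_ij / (eps_i - eps_j).
        theta = 0.5 * math.atan(2.0 * f_ij / gap)
    c, s = math.cos(theta), math.sin(theta)
    new_eps_i = c * c * eps_i + 2.0 * c * s * f_ij + s * s * eps_j
    new_eps_j = s * s * eps_i - 2.0 * c * s * f_ij + c * c * eps_j
    new_f_ij = (c * c - s * s) * f_ij - c * s * gap
    return c, s, new_eps_i, new_eps_j, new_f_ij


def mix(coeffs_i, coeffs_j, c, s):
    """Apply the same rotation to the AO coefficient vectors of the two LMOs."""
    new_i = [c * a + s * b for a, b in zip(coeffs_i, coeffs_j)]
    new_j = [-s * a + c * b for a, b in zip(coeffs_i, coeffs_j)]
    return new_i, new_j


# Toy example: occupied LMO at -10, virtual LMO at +2, coupling 0.5 (arbitrary units).
c, s, e_occ, e_virt, coupling = annihilate(-10.0, 2.0, 0.5)
occ, virt = mix([1.0, 0.0], [0.0, 1.0], c, s)
```

Note that after mixing, the occupied coefficient vector acquires a nonzero entry on the centre that originally carried only the virtual orbital, which is the growth in LMO size that has to be controlled after each iteration.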
One consequence of this annihilation algorithm is that the number of atoms involved in a specific LMO increases as a result of mixing the original LMOs. The wavefunction describing the new occupied LMO has intensity not only on atoms originally in the LMO but also on atoms with which the virtual LMO was associated. If no action is taken, then the number of atoms spanned by a given LMO increases until every LMO includes contributions, albeit extremely small ones in most instances, from every atom in the QM system. Consequently, after each iteration to solve the SCF equations, the contributions to each LMO from individual atoms are examined, and if those associated with a specific atom, J, are small, then atom J is deleted from the LMO. In practice, the number of atoms that contribute to LMOs appears to reach a limit of 100-130 as the number of atoms in the molecule increases. Given this procedure for constructing and annihilating LMOs, the remaining issues in this method arise from the need to evaluate the one- and two-electron integrals in building the elements of the Fock matrix. Unlike conventional SCF methods, the LMO approach does not include all of the interatomic interactions, as many of the corresponding two-electron integrals are not used in the calculation. For example, if the density matrix element $P_{\mu\nu}$ is zero, then the associated energy term is independent of $H_{\mu\nu}$ and $F_{\mu\nu}$. For the standard functional groups found in biological molecules, the bond order between atoms i and j is extremely small given that the overlap of AOs centered on these atoms is essentially zero when their distance apart, $R_{ij}$, is more than 5-7 Å. In semi-empirical models, the one-electron integrals, $H_{\mu\nu}$, depend on the AO overlap, taking the following form:
$$H_{\mu\nu} = \tfrac{1}{2} S_{\mu\nu} (\beta_\mu + \beta_\nu)$$
where $S_{\mu\nu}$ is the amount of orbital overlap and $\beta_\mu$ and $\beta_\nu$ are parameters. Therefore, one-electron integrals involving atoms that are separated by more than 7 Å are not calculated in this semi-empirical approach to studying biological macromolecules. Methods for eliminating the evaluation of two-electron integrals in a similar manner, using a distance cutoff, are more difficult given that electron-electron repulsion and exchange are longer range effects. Detailed analysis of the various types of multi-polar interactions represented by these integrals, however, suggests that when the atoms upon which the AOs are centered are separated by distances greater than 30 Å, only charge-charge interactions contribute significantly to the energy. Similarly, at separation distances of 6-30 Å, many two-electron integrals have negligible value and need not be computed. Long-range electrostatic effects are therefore proportional to the net charges, $Q_i$ and $Q_j$, on the atoms that are involved in the pairwise interaction, where:
$$Q_i = Z_i - \sum_{\mu \in i} P_{\mu\mu}$$
and $Q_j$ is defined analogously. As usual, $Z_i$ is the nuclear charge on the ith atom. Without presenting the details of the analysis, the long-range electrostatic energy contribution, $E_{tr}$, can be calculated from the following expression:
The use of the distance-dependent multiplier, $f(R_{ij})$, ensures that electrostatic interactions arising from the two-electron integrals evaluated for atoms separated by less than the cutoff distance are not counted twice. Hence, $f(R_{ij})$ is zero for atoms separated by less than the cutoff distance and unity for all other separation distances. $\gamma_{ij}$ is a simple Coulomb integral involving the s-type AOs on atoms i and j. A number of simple correction terms to treat the behavior of lone-pair electrons correctly then yield the final model of the electrostatic interactions, which can be used to avoid calculating (and storing) large numbers of two-electron integrals. This theoretical treatment has been implemented in the MOZYME package, and offers the possibility that semi-empirical calculations may become routine for systems comprising thousands of atoms.
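The role of the distance-dependent multiplier can be made concrete with a small Python sketch. It assumes, as one plausible reading of the description above, that the long-range term is a pairwise sum of $f(R_{ij})\,Q_i\,Q_j\,\gamma_{ij}$, and it replaces the s-type Coulomb integral $\gamma_{ij}$ by its point-charge limit $1/R_{ij}$, which is an approximation introduced here for illustration only. The lone-pair correction terms are omitted, and the function names and the 30 Å default cutoff are assumptions of the sketch rather than MOZYME settings.

```python
import math

# e^2 / (4 * pi * eps0) expressed in eV * Angstrom
COULOMB_EV_ANGSTROM = 14.399645


def switch(r_ij, cutoff):
    """f(R_ij): zero inside the cutoff, unity at or beyond it."""
    return 0.0 if r_ij < cutoff else 1.0


def long_range_energy(positions, net_charges, cutoff=30.0):
    """Pairwise charge-charge energy (eV) for atom pairs beyond the cutoff.

    positions: list of (x, y, z) tuples in Angstrom.
    net_charges: net atomic charges Q_i in units of e.
    """
    energy = 0.0
    n_atoms = len(positions)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r_ij = math.dist(positions[i], positions[j])
            f = switch(r_ij, cutoff)
            if f == 0.0:
                # Pair is handled by explicit two-electron integrals instead.
                continue
            gamma_ij = COULOMB_EV_ANGSTROM / r_ij  # point-charge stand-in
            energy += f * net_charges[i] * net_charges[j] * gamma_ij
    return energy


# Two opposite unit charges 40 Angstrom apart contribute; at 10 Angstrom they would not.
e_far = long_range_energy([(0.0, 0.0, 0.0), (40.0, 0.0, 0.0)], [1.0, -1.0])
e_near = long_range_energy([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)], [1.0, -1.0])
```

The double loop is quadratic in the number of atoms, which is acceptable for a demonstration; only the net charges need to be stored, rather than the individual two-electron integrals.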
Applying Semi-Empirical Models to Biological Problems
Having established the theoretical basis of the most widely used semi-empirical approaches for computing structural properties and reactivities of small molecules, practical aspects of their application to the study of biological problems must be briefly considered. For example, several semi-empirical models are implemented in most commercial software packages, and so care must be taken to choose the most suitable method for describing the system of interest. Such a choice demands a working knowledge of the fundamental assumptions underlying the method. Whatever semi-empirical model is chosen, careful calibration of calculations with experimental data on model compounds related to the system of interest should always be undertaken, and it must be remembered that any computational approach can only be as accurate as the experimental, or theoretical, data from which the parameters were derived. On the other hand, there seems little point in carrying out very accurate calculations to predict molecular properties that can only be measured with high experimental error. In this section, therefore, we discuss the strengths and limitations of several of the models outlined above. Our intent is to highlight issues that should be considered when using semi-empirical calculations to understand biological systems. For readers wishing more comprehensive information, there are numerous reviews available that provide detailed comparisons of these methods for a stunning variety of chemical problems39-41. Before discussing specific examples, however, we should make one further point. Computational studies almost always represent a compromise between accuracy and the amount of work required to obtain an answer. For example, semi-empirical models that yield optimized structures in excellent agreement with experiment may only be parameterized for a small number of elements.
Given that parameter development is often a nontrivial problem, many studies must therefore be carried out using less accurate models that are parameterized for more elements. More importantly for the prediction of photochemical reactivity, there is not yet any single semi-empirical model that can reproduce both molecular geometries and electronic spectra accurately. Although of historic importance in the development of semi-empirical models, CNDO models are not routinely employed in studying the structure and energetics of biological molecules. On the other hand, CNDO/S calculations of electronic spectra are widely used, giving useful insight into the behavior of large systems. Given the reliability, functionality and ease of use of the MOPAC software package52, semi-empirical calculations on organic and biological structures are dominated by the modified INDO models (MINDO/3, MNDO, AM1, and PM3) that are implemented in this program. MINDO/3, despite giving good estimates of ground-state molecular properties, is little used due to the limited range of elements for which parameters are available. The problem is compounded as not all pairwise bonds are parameterized. For example, in biological applications, the absence of parameters for the P-O bond is a considerable limitation. On the other hand, in a few very specific situations, MINDO/3 calculations may be more accurate than those using the AM1 model99. The MNDO, AM1, and PM3 models are parameterized to handle 21 (H, Li, Be, B, C, N, O, F, Al, Si, P, S, Cl, Cr, Zn, Ge, Br, Sn, I, Hg, Pb), 12 (H, B, C, N, O, F, Si, P, S, Cl, Br, I), and 12 (H, C, N, O, F, Al, Si, P, S, Cl, Br, I) elements, respectively. All three models give excellent results in extensive comparisons of their ability to reproduce molecular properties such as geometry, dipole moment and heat of formation42,52. Particular problems with MNDO include the tendency of the model to underestimate the stability of sterically crowded structures, and, of more importance for its application to biological problems, a failure to predict both the energies and geometric arrangements of hydrogen bonds. For example, the hydrogen bond in the water dimer is computed to be worth 1.0 kcal/mol, substantially lower than the experimental value of 5.5 kcal/mol. The AM1 and PM3 models address both of these problems, and are more accurate than MNDO for compounds containing the elements for which parameters are available. Each model, however, has its strengths and weaknesses. For example, while geometries calculated using PM3 appear to agree better with experimental structures, AM1-derived dipole moments are generally better than those computed using either MNDO or PM3. Perhaps a more serious limitation of the AM1 model for biological applications is associated with the parameters for phosphorus, leading to problems in predicting the structure of $\mathrm{P_4O_6}$.
Equally, the PM3 model seems less capable of obtaining accurate structures for highly conjugated systems, such as porphyrins41, for which optimized geometries tend to be more puckered than those observed experimentally. Both the AM1 and PM3 models, however, yield an energy of 5.5 kcal/mol for the linear hydrogen bond in the water dimer, although AM1 predicts this interaction to be slightly nonlinear. Although these models reproduce the energy and overall geometry of the water-water hydrogen bond, they predict the O-O distance to be 2.77 Å rather than the 3.00 Å calculated using ab initio methods. A more important limitation is that the barrier to rotation in formamide ($\mathrm{HCONH_2}$) computed using the PM3 model is almost nonexistent, and that planar nitrogen atoms, such as those in amides and enamines, are often calculated to prefer pyramidal geometries52. It is likely that these defects will be remedied in future studies; in the meantime, they can be overcome in MOPAC by constraining the nitrogen to planarity. Finally, repulsions between lone pairs are not always well represented in the theoretical model underlying MNDO, AM1 or PM3. As an example, none of these methods predicts the observed geometry for hydrazine or $\mathrm{ClF_3}$41. The SINDO1 semi-empirical model has been less well investigated, although this method gives good molecular geometries. Molecular dipole moments and heats of formation do not, however, appear to be predicted as accurately, although the SINDO1 model includes d symmetry functions that appear to give better predictions for hypervalent compounds, such as those involving phosphorus100. Whereas there are a number of semi-empirical models available for computing molecular geometries and heats of formation, the INDO/S model is very widely employed in studies of electronic spectra and molecular photochemistry59. Several reasons account for the popularity of this semi-empirical method.
First, the model has been calibrated at the CIS level for a wide variety of elements60-62, including those in the first two series of transition metals and the lanthanides. Parameterization was carried out using experimental structures and consequently INDO/S is not usually employed to predict relative molecular energies or ground-state geometries. On the other hand, the INDO/S model accurately predicts the transition energies associated with low-lying $\pi \to \pi^*$ and $n \to \pi^*$ bands in compounds containing
hydrogen, and first- and second-row elements41. Second, the accuracy of the theoretical spectra appears to improve with increasing molecular size, making INDO/S useful for studying the spectroscopic properties of biologically interesting systems such as the special pair of chlorophylls in the photosynthetic reaction center12. Finally, the INDO/S model has been implemented in the ZINDO software package, which has been distributed to many laboratories throughout the world. Therefore the use of a combination of methods is recommended for modeling the spectral properties of biological systems. For example, in our calculations concerning the effects of protein-ligand interactions on ligand spectra and photochemical reactivity65,101, we routinely employ the AM1 and INDO/S models for geometry optimization and the computation of electronic spectra, respectively.
Evaluating Continuum Solvation Models
Given its relative computational simplicity, the SCRF solvation model has been extensively investigated. The usefulness of this approach in studying solvent effects on molecular structure was demonstrated in studies on the preferred tautomer of heteroaromatic compounds that can interact with solvent through hydrogen bonding (Fig. 2.5). In general, inclusion of the SCRF term in the Fock operator predicted the preferred tautomer, especially when there was a significant difference in the magnitude of the dipole moments of the possible tautomeric species, as in ring systems containing two heteroatoms102. Further, there was no apparent correlation between the differences in $\Delta H_f$ between tautomers calculated without the inclusion of solvent effects and the experimentally observed tautomeric equilibrium constants. It was also noted that the SCRF model does not include contributions from specific hydrogen-bond interactions between solute and solvent. Ab initio calculations on complexes between the tautomers of 2-hydroxypyridine and explicit water molecules confirmed that the inclusion of such interactions does lead to small changes in the optimized heteroaromatic structures. These effects were not of sufficient magnitude, however, to alter the preferred tautomer predicted by the AM1-SCRF calculations. Subsequent studies have confirmed the utility of the SCRF model in computing ground-state molecular properties despite the assumption that the solute occupies a spherical cavity in the medium. The effects of solvation on electronic spectroscopy have also been investigated using the SCRF model, and it appears that the inclusion of specific hydrogen bonding interactions between solute and solvent is important if calculations are to predict experimental data with a reasonable level of accuracy65,102.
Thus, in calculations aimed at reproducing the $n \to \pi^*$ and $\pi \to \pi^*$ absorptions in nitrogen and oxygen heterocycles, respectively, the best agreement of theoretical and experimental spectra was obtained when the optimized supermolecular complex, comprised of the heterocyclic solute and hydrogen-bonded waters, was used in the SCRF. These results clearly indicate that care must be taken to include key solvent molecules in semi-empirical treatments of biologically interesting systems in aqueous solution.
Fig. 2.5 Representative tautomerization equilibria studied using semi-empirical calculations and the SCRF or SMx solvation models.
Studies employing the COSMO solvation model do not appear to be as widespread, despite its implementation in MOPAC V6.0. The ability of COSMO to treat ionic equilibria has been demonstrated by work on glycine, which can exist as a zwitterion 1 and in an unionized form 2 (Fig. 2.6). In the gas phase, the predominant structure is 2, presumably due to the high self-energy of the charged carboxylate and amino groups. AM1 semi-empirical calculations on the isolated zwitterion show the presence of a hydrogen bonding interaction between an N-H and the carboxylate oxygen that distorts the bond angle about the central carbon atom72. On including the electrostatic screening effects of water using COSMO, the zwitterion becomes more stable than the neutral form by approximately 8.3 kcal/mol, an amount that is in good agreement with the experimental value of 10.1 kcal/mol103. The increased stability of the zwitterion appears to be associated with the ability of the bond angles to assume their equilibrium geometries due to the absence of the hydrogen bonding interaction that is present in the calculated gas-phase structure (Fig. 2.6). Thus COSMO effectively reproduces the electrostatic screening effects of the solvent for ionized species. In more recent work on the electronic basis for stereoselective epoxidation, the energetic effects of organic solvent on the transition state for the reaction were evaluated using COSMO in combination with PM3 calculations75. On the other hand, the COSMO solvation model does not contain any explicit treatment of the energetic contributions arising from cavity formation and associated entropic changes in the solvent. Therefore calculated energies and geometries for structures that either possess small dipole moments or are comprised mainly of hydrophobic functionality, such as alkyl chains and aromatic substituents, may not necessarily agree with experimental values.
Fig. 2.6 Comparison of the calculated structures for glycine in the gas phase and in water (COSMO solvation model). Note that the central bond angle in the zwitterionic form 1 is distorted by the hydrogen bond (length 1.96 Å) computed for this structure in the gas phase. When solvation effects are included in the calculation using COSMO, the electrostatic interaction is reduced in magnitude due to charge screening by water, and the bond angle distortion is no longer present.
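To give a feel for what the glycine energy differences quoted in the text imply, the short Python sketch below converts an energy difference into an equilibrium constant. It assumes, and this is an assumption rather than something stated in the source, that the 8.3 and 10.1 kcal/mol values can be treated as free-energy differences at 298.15 K; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(delta_g_kcal, temperature=298.15):
    """K = exp(-dG / RT) for a free-energy change dG in kcal/mol."""
    return math.exp(-delta_g_kcal / (R_KCAL * temperature))


# Neutral -> zwitterion, treated as a free-energy change (assumption).
k_calc = equilibrium_constant(-8.3)   # COSMO/AM1 value quoted in the text
k_expt = equilibrium_constant(-10.1)  # experimental value quoted in the text
ratio = k_expt / k_calc
```

Under this assumption, both values place the equilibrium overwhelmingly on the side of the zwitterion, while the 1.8 kcal/mol discrepancy still corresponds to more than an order of magnitude in the population ratio.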
The SMx solvation models have been extensively tested and applied to studies of solvation effects for a wide variety of systems27,76-81. As might be expected, given the complexity of these models and the resulting large number of parameters, AM1-SMx and PM3-SMx calculations can be applied to ions and neutral molecules in solvents of varying polarity. Early tests of the model therefore involved calculating proton transfer free energies in aqueous solution and acid-base equilibria79. Not only did these computations correctly reproduce the increase in basicity of ammonia when in water relative to other amines, but acetate was shown to have a greater free energy of solvation compared to phenoxide, in agreement with experiment. The AM1-SM1 semi-empirical model also predicted the preferred tautomeric forms of several compounds, including 2-hydroxypyridine27, in water without the need to include explicit solvent molecules as in the case of the previous studies employing the AM1-SCRF approach102. Although some deficiencies in the SM1 model have been identified, the principal limitation on the accuracy of free energies calculated using this approach appears to be the semi-empirical treatment of the solute in the gas phase. The negligible errors in the solvation energies may, however, arise from cancellation of systematic errors in the computed solvation energies of the solutes. Analyses of the changes in the distribution of electrons in the optimized structure for the solute in solvent relative to the gas phase also indicate that the SMx models do simulate solvation effects in a chemically reasonable fashion (Fig. 2.7). Hence, the dipole moment of 4-pyridone calculated using the AM1-SM2 solvation model is larger than that for the AM1-optimized gas-phase structure. The energy required for the charge distribution of the solvated structure is more than offset by the favorable polarization of the solvent in response to the increased dipole moment of the solute. One cautionary note that is applicable to most parameterized solvation models concerns the use of these continuum methods for computing the solvation energies of conformationally flexible compounds. As parameterization is usually performed using the surface area values determined for a single low-energy conformation of each of the structures in the training set, the model may be unable to predict solvation energies for molecules that undergo large conformational changes on moving from the gas phase to solution104,105. Nevertheless, the SMx solvation models are simple to implement within semi-empirical calculations and analytical expressions are available for calculating gradients, making these approaches useful in geometry optimization and transition state modeling.
Fig. 2.7 Comparison of the Mulliken charges computed for 4-pyridone using the AM1 semi-empirical model in (a) the gas phase, and (b) water (bold) using the SM2 solvation model. The effect of including solvent is to increase the magnitude of the molecular dipole.
The LMO approach to computing the electronic properties of entire proteins has been implemented in the MOZYME software package, and the method is under active investigation and refinement. In preliminary studies, the heats of formation of a variety of proteins have been calculated for the X-ray crystallographic geometries. These structures contained approximately 600-4100 atoms, and convergence for the single-point SCF occurred in 4-24 hours on a Sun SPARC 2 workstation. Geometry optimizations on small segments of bacteriorhodopsin have also been performed using the LMO algorithm. Detailed experiments to calibrate the accuracy of this method remain to be performed, although application of this semi-empirical model to the study of enzyme reaction mechanisms and electron transfer processes is an exciting prospect.
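The timings quoted for MOZYME allow a rough consistency check on the claimed linear scaling. The Python sketch below estimates the effective exponent p in t ∝ N^p from two (size, time) points. It assumes, purely for illustration, that the smallest structure (about 600 atoms) corresponds to the shortest run (about 4 hours) and the largest (about 4100 atoms) to the longest (about 24 hours); the source gives only the ranges, so this pairing is an assumption, and the function names are hypothetical.

```python
import math


def scaling_exponent(n_small, t_small, n_large, t_large):
    """Effective exponent p in t ~ N**p estimated from two (N, t) points."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)


def projected_time(n_small, t_small, n_large, exponent):
    """Time expected for the larger system if t scaled as N**exponent."""
    return t_small * (n_large / n_small) ** exponent


p = scaling_exponent(600, 4.0, 4100, 24.0)
cubic_hours = projected_time(600, 4.0, 4100, 3.0)
```

With this pairing the effective exponent comes out slightly below one, whereas an N^3 method starting from the same 4-hour point would need well over a thousand hours for the largest structure.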
Summary
The increased interest in biochemical systems and the fundamental physical chemical principles governing the design of proteins and nucleic acids has provided a significant impetus to the development of semi-empirical models that can treat large molecules and/or the effects of solvation. In consequence, methods now exist that have potential application in modeling the electronic properties of biological systems using a completely quantum-mechanical description. The methods outlined in this chapter should therefore complement "mixed-mode" calculations, which combine classical and QM treatments of enzymes in order to model the molecular basis of biochemical catalysis and protein function17,106,107.
Acknowledgments
The discussion of the theoretical basis of semi-empirical methods would not have been possible without the help and advice of several friends and colleagues, including Drs. Yngve Öhrn, James Stewart, Michael Zerner, Henry Rzepa, and Marshall Cory, Jr. I am also indebted to the authors of many previous reviews of semi-empirical methods and to Anne-Marie Sapse for her patience.
References
1. Brooks, C. L. III, M. Karplus, and B. M. Pettitt. 1988. Adv. Chem. Phys. 71, 1-249.
2. Ha, S., J. Gao, B. Tidor, J. W. Brady, and M. Karplus. 1991. J. Am. Chem. Soc. 113, 1553-1557.
3. Malhotra, A., H. A. Gabb, and S. C. Harvey. 1993. Curr. Op. Struct. Biol. 3, 241-246.
4. Cornell, W. D., P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, Jr., D. M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell, and P. A. Kollman. 1995. J. Am. Chem. Soc. 117, 5179-5197.
5. Maple, J. E., M. T. Hwang, T. P. Stockfish, U. Dinur, M. Waldman, C. S. Ewig, and A. T. Hagler. 1994. J. Comput. Chem. 15, 162-182.
6. Jorgensen, W. L., and J. Tirado-Rives. 1988. J. Am. Chem. Soc. 110, 1657-1666.
7. Glennon, T. M., Y.-J. Zheng, S. LeGrand, T. A. Shutzberg, and K. M. Merz, Jr. 1994. J. Comput. Chem. 15, 1019-1040.
8. McCammon, J. A., and S. C. Harvey. 1987. Dynamics of Proteins and Nucleic Acids. Cambridge University Press, Cambridge.
9. Brooks, C. L. III. 1995. Curr. Op. Struct. Biol. 5, 211-215.
10. Schmitz, U., N. B. Ulyanov, A. Kumar, and T. L. James. 1993. J. Mol. Biol. 234, 373-389.
11. Mizushima, N., D. Spellmeyer, S. Hirono, D. Pearlman, and P. A. Kollman. 1991. J. Biol. Chem. 266, 11801-11809.
12. Thompson, M. L., and M. C. Zerner. 1991. J. Am. Chem. Soc. 113, 8210-8215.
13. Parson, W. W., Z.-T. Chu, and A. Warshel. 1990. Biochim. Biophys. Acta 1017, 251-272.
14. Chan, M. K., J. Kim, and D. C. Rees. 1993. Science 260, 792-794.
15. Tsukihara, T., H. Aoyama, E. Yamashita, T. Tomizaki, H. Yamaguchi, K. Shinzawa-Itoh, R. Nakashima, R. Yaono, and S. Yoshikawa. 1996. Science 272, 1136-1144.
16. Warshel, A. 1991. Computer Modeling of Chemical Reactions in Enzymes and Solutions. Wiley, New York. 17. Bash, P. A., M. J. Field, R. C. Davenport, G. A. Petsko, D. Ringe, and M. Karplus. 1991. Biochemistry 30, 5826-5832. 18. Bartlett, R. D., and J. F. Stanton in Reviews in Computational Chemistry, Vol. 5, K. B. Lipkowitz and D. B. Boyd, Eds. 1994. VCH, New York. 65-169. 19. Bartolotti, L. J., and K. Flurchick in Reviews in Computational Chemistry, Vol. 7. K. B. Lipkowitz and D. B. Boyd, Eds. 1995. VCH, New York. 187-216. 20. Pople, J. A., and D. L. Beveridge. 1970. Approximate Molecular Orbital Methods. McGraw-Hill, New York. 21. Dewar, M. J. S. 1969. The Molecular Orbital Theory of Organic Chemistry. McGrawHill, New York. 22. Thiel, W. 1988. Tetrahedron 44, 7393-7408. 23. Purvis, G. D. III. 1991. J. Comput.-Aided Mol. Des. 5, 55-80. 24. Tomasi, J., and M. Persico. 1994. Chem. Rev. 94, 2027-2094. 25. Tapia, O., and O. Goscinski. 1975. Mol. Phys. 29, 1653-1661. 26. Karelson, M., and M. C. Zerner. 1992. J. Phys. Chem. 96, 6949-6957. 27. Cramer, C. J., and D. G. Truhlar. 1992. J. Comput.-Aided Mol. Des. 6, 629-666. 28. Stewart, J. J. P. 1996. Intern. J. Quantum Chem. 58, 133-146. 29. Pople, J. A., D. P. Santry, and G. A. Segal. 1965. J. Phys. Chem. 43, S129-S135. 30. Bingham, R. C., M. J. S . Dewar, and D. H. Lo. 1975. J. Am. Chem. Soc. 97, 1285-1293. 31. Dewar, M. J. S., and W. Thiel. 1977. J. Am. Chem. Soc. 99, 4899--4907.
BASIS OF SEMI-EMPIRICAL QUANTUM-MECHANICAL METHODS 47
32. Dewar, M. J. S., E. Zoebisch, E. F. Healy, and J. J. P. Stewart. 1985. J. Am Chem. Soc. 107, 3902-3909. 33. Nanda, D. N., and K. Jug. 1980. Theor. Chim. Acta 57, 95-106. 34. Dewar, M. J. S., C. Jie, and J. Yu. 1993. Tetrahedron 49, 5003-5038. 35. Stewart, J. J. P. 1989. J. Comput. Chem. 10, 209-220. 36. Pauling, L. 1960. The Nature of the Chemical Bond, 3rd Ed. Cornell University Press, Ithaca. 37. Heitler, W., and F. Z. London. 1927. Z. Phys. 44, 455--472. 38. Cullen, J. M., and M. C. Zerner. 1982. J. Chem. Phys. 77, 4088--4109. 39. Thiel, W. 1996. Adv. Chem. Phys. 93, 703-757. 40. Clark, T. 1985. A Handbook of Computational Chemistry, Wiley, New York. 41. Zerner, M. C. in Reviews in Computational Chemistry, Vol. 2. K. B. Lipkowitz and D. B. Boyd, Eds. 1991. VCH, New York. 313-365. 42. Stewart, J. J. P. in Reviews in Computational Chemistry, Vol. 1. K. B. Lipkowitz and D. B. Boyd, Eds. 1990. VCH, New York. 45-81. 43. Fletcher, R. 1980. Practical Methods of Optimization, Unconstrained Optimization, Vol. 1. Wiley, New York. 44. Roothaan, C. C. J. 1951. Rev. Mod. Phys. 23, 69-89. 45. Feller, D., and E. R. Davidson in Reviews in Computational Chemistry, Vol. 1. K. B. Lipkowitz and D. B. Boyd, Eds. 1990. VCH, New York. 1--43. 46. Fontecave, M., H. Jornvall, and P. Reichard. 1992. Adv. Enzymol. Relat. Areas Mol. Biol. 65, 147-183. 47. Pople, J. A., and R. K. Nesbet. 1954. J. Chem. Phys. 22, 571-572. 48. Szabo, A., and N. Ostlund. 1990. Modern Quantum Chemistry. Macmillan, New York. 49. Pople, J. A., and G. A. Segal. 1966. J. Chem. Phys. 44, 3289-3296. 50. Pople, J. A., and G. A. Segal. 1965. J. Chem. Phys. 43, S136-S151. 51. Pople, J. A., D. L. Beveridge, and P. A. Dobosh. 1967. J. Chem. Phys. 47, 2026-2033. 52. Stewart, J. J. P. 1990. J. Comput- Aided Mol. Des. 4, 1-103. 53. Dewar, M. J. S., and H. S. Rzepa. 1978. J. Am. Chem. Soc. 100, 58-67. 54. Dewar, M. J. S., and C. H. Reynolds. 1986. J. Comput. Chem. 7, 140-143. 55. Dewar, M. J. S., and K. M. Merz. 1986. 
Organometallics 5, 1494-1496. 56. Dewar, M. J. S., E. F. Healy, and J. J. P. Stewart. 1984. J. Comput. Chem. 5, 358-362. 57. Bakowies, D., and W. J. Thiel. 1991. J. Am Chem. Soc. 113, 3704-3714. 58. Thiel, W., and D. G. Green in Methods and Techniques in Computational Chemistry, METECC-95. E. Clementi and G. Corongiu, Eds. 1995. Cagliari: STEF, 141-168. 59. Zerner, M. C., G. H. Loew, R. F. Kirchner, and U. T. Mueller-Westerhoff. 1980. J. Am. Chem. Soc. 102, 589-599. 60. Bacon, A. D., and M. C. Zerner. 1979. Theor. Chim. Acta 53, 21-54. 61. Anderson, W. P., T. Cundari, R. Drago, and M. C. Zerner. 1990. Inorg. Chem. 29, 1-3. 62. Anderson, W. P., T. Cundari, and M. C. Zerner. 1990. Intern. J. Quantum Chem. 39, 31--45. 63. Stavrev, K. K., and M. C. Zerner. 1996. Chem. Eur. J. 2, 83-87. 64. Alagona, C., R. Cammi, C. Ghio, and J. Tomasi. 1993. Theor. Chim. Acta 85,167-187. 65. Richards, N. G. J. and M. G. Cory, Jr. 1992. Intern. J. Quantum Chem., Quantum Biol. Symp. 19, 65-76. 66. Warshel, A., and S. T. Russell. 1984. Q. Rev. Biophys. 17, 283--422. 67. Karelson, M. M., T. Tamm, A. R. Katritzky, S. J. Cato, and M. C. Zerner. 1989. Tetrahedron Comput. Methodol. 2, 295-304. 68. Rinaldi, D., M. F. Ruiz-Lopez, and J. L. Rivail. 1983. J. Chem. Phys. 78, 834-838. 69. Rzepa, H. S., M.-Y. Yi, M. M. Karelson, and M. C. Zerner. 1991. J.C.S. Perkin Trans. 2, 635-637. 70. Brunschwig, B. S., S. Eherenson, and N. Sutin. 1987. J. Phys. Chem. 91, 4714--4723. 71. Hoshi, H. H., M. Sakurai, Y. Inoue, and R. Chujo. 1987. J. Chem. Phys. 87, 1107--1115. 72. Klamt, A., and G. Schuurmann. 1993. J.C.S. Perkin Trans. 2, 799-805. 73. Connolly, M. L. 1983. Science 221, 709--713.
48
MOLECULAR ORBITAL CALCULATIONS FOR BIOLOGICAL SYSTEMS
74. Silla, F, Tunon, I. and J. L. Pascual-Ahuir. 1991. J. Comput. Chem. 12, 1077-1088. 75. Casher, O., D. O'Hagan, C. A. Rosenkranz, H. S. Rzepa, and N. A. Zaidi. 1993. Chem. Commun. 1337-1340. 76. Cramer, C. J., and D. G. Truhlar. 1991. J. Am. Chem. Soc. 113, 8305-8311. 77. Cramer, C. J., and D. G. Truhlar. 1992. Science 256, 213-217. 78. Cramer, C. J., and D. G. Truhlar. 1992. J. Comput. Chem. 13, 1089-1097. 79. Cramer, C. J., and D. G. Truhlar. 1991. J. Am. Chem. Soc. 113, 8552-8554. 80. Giesen, D. J., J. W. Stoner, C. J. Cramer, and D. G. Truhlar. 1995. J. Am. Chem. Soc. 117, 1057-1068. 81. Giesen, D. J., C. J. Cramer, and D. G. Truhlar. 1995. J. Phys. Chem. 99, 7137-7146. 82. Rashin, A. A., and B. Honig. 1985. J. Phys. Chem. 89, 5588-5593. 83. Still, W. C., A. Tempczyk, R. C. Hawley, and T. F Hendrickson. 1990. J. Am. Chem. Soc. 112, 6127-6129. 84. Arald, J.-C., A. Nicholls, K. Sharp, B. Honig, A. Tempczyk, T. F. Hendrickson, and W. C. Still. 1991. J. Am. Chem. Soc. 113, 1454-1455. 85. Hasel, W., T. F Hendrickson, and W. C. Still. 1988. Tetrahedron Comput. Methodol. 1, 103-116. 86. Mullikin, R. S. 1955. J. Chem. Phys. 23, 1833-1840. 87. Kozaki, T., M. Morihasi, and O. Kikuchi. 1989. J. Am. Chem. Soc. 111, 1547-1552. 88. Rivail, J. L., B. Terryn, D. Rinaldi, and M. F Ruiz-Lopez. 1985. J. Mol. Struct. (Theochem) 120, 387--400. 89. Stienke, T, E. Hansele, and T. J. Clark. 1989. J. Am Chem. Soc. 111, 9107-9109. 90. Hermann, R. B. 1972. J. Phys. Chem. 76, 2754-2759. 91. Eisenberg, D., and A. D. McLachlan. 1986. Nature(Lond.) 319, 199-203. 92. Harris, M. J., T. Higuchi, and J. H. Rytting. 1973. J. Phys. Chem. 77, 2694-2703. 93. Ooi, T., M. Ootabake, G. Nemethy, and H. A. Scheraga. 1987. Proc. Natl. Acad. Sci., USA 84, 3086-3090. 94. Ben-Nairn, A., and Y. Marcus. 1984. J. Chem. Phys. 81, 2016-2025. 95. Pearson, R. G. 1986. J. Am. Chem. Soc. 108, 6109-6114. 96. von Heijne, G. 1994. Annu. Rev. Biophys. Biomol. Struct. 23, 167-192. 97. Ferenczy, G. G., J. J. Rivail, P. 
R. Surjan, and G. Naray-Szabo. 1992. J. Comput. Chem. 13, 830-837. 98. Thery, V., D. Rinaldi, J. L. Rivail, B. Maigret, and G. G. Ferenczy. 1994. J. Comput. Chem. 15, 269-282. 99. Halim, H., N. Heinrich, W. Koch, J. Schmidt, and G. Frenking. 1986. J. Comput. Chem. 7, 93-104. 100. Jug, K., R. Iffert, and J. Schulz, 1987. Intern. J. Quantum Chem. 32, 265-277. 101. Cory, M. G. Jr., N. G. J. Richards, and M. C. Zerner in Modeling the Hydrogen Bond, ACS Symp. Ser. #569. D. A. Smith, Ed., 1994, American Chemical Society, Washington, D.C. 222-234. 102. Karelson, M. M., A. R. Katritzky, M. Szafran, and M. C. Zerner. 1989. J. Org. Chem. 54, 6030-6034. 103. Rzepa, H. S., and M. Y. Yi. 1991. J.C.S. Perkin Trans. 531-537. 104. Richards, N. G. J., P. B. Williams, and M. S. Tute. 1992. Intern. J. Quantum Chem. 44, 219-233. 105. Richards, N. G. J., P. B. Williams, and M. S. Tute, 1991. Intern. J. Quantum Chem., Quantum Biol. Symp. 18, 299-316. 106. Field, M. J., P. A. Bash, and J. Karplus. 1990. J. Comput. Chem. 11, 700-733. 107. Gao, J., in Reviews in Computational Chemistry, Vol. 7. K. B. Lipkowitz and D. B. Boyd, Eds., 1995, VCH, New York. 119-185.
3
The Molecular Electrostatic Potential: A Tool for Understanding and Predicting Molecular Interactions
Jane S. Murray and Peter Politzer
The quest for improved methods for elucidating and predicting the reactive behavior of molecules and other chemical species is a continuing theme of theoretical chemistry. This has led to the introduction of a variety of indices of reactivity; some are rather arbitrary, while others are more or less directly related to real physical properties. They have been designed and are used to provide some quantitative measure of the chemical activities of various sites and/or regions of the molecule. In this chapter our focus is on one of these indices, the electrostatic potential V(r) that is created in the space around a molecule by its nuclei and electrons. V(r) can be computed rigorously, given the electronic density function ρ(r), by Eq. (3.1).
$$V(\mathbf{r}) = \sum_A \frac{Z_A}{|\mathbf{R}_A - \mathbf{r}|} - \int \frac{\rho(\mathbf{r}')\,d\mathbf{r}'}{|\mathbf{r}' - \mathbf{r}|} \qquad (3.1)$$
Z_A is the charge on nucleus A, located at R_A. V(r) is the potential created by the static charge distribution of the molecule; it can also be regarded as the exact interaction energy of this charge distribution with a proton situated at the point r. Of course such a proton would, in reality, produce some polarization of the molecule's electronic density, which is not taken into account if ρ(r) corresponds to the unperturbed ground state, as is normally the case. Despite this inconsistency, the pioneering work of Scrocco, Tomasi, and their collaborators (Bonaccorsi, Scrocco, and Tomasi 1970; Bonaccorsi, Scrocco, and Tomasi 1971) demonstrated the usefulness of the electrostatic potential as a guide to molecular interactive behavior, a role in which it is now well established (Berthier et al. 1972; Berthod and Pullman 1975; Berthod and Pullman 1978; Bonaccorsi et al. 1972a; Bonaccorsi et al. 1972b; Bonaccorsi et al. 1975; Giessner-Prettre and Pullman 1975; Lavery, Corbin and Pullman 1982; Lavery, Pullman, and Pullman 1980; Lavery and Pullman 1981; Perahia and Pullman 1978; Politzer and Daiker 1981; Politzer, Laurence, and Jayasuriya 1985; Politzer and Murray 1990; Politzer and Murray 1991; Politzer and Truhlar 1981; Pullman and Berthod 1976; Pullman and Pullman 1980; Pullman and Pullman 1981b; Scrocco and Tomasi 1973; Scrocco and Tomasi 1978). Until a few years ago, its extensive applications focused upon either (a) the extrema of V(r), its most negative and (more recently) positive values, or (b) the qualitative pattern of the electrostatic potential's positive and negative regions, plotted either on planes through the molecule or on its surface. More recently, a third approach has been pursued, which involves quantifying certain key features of the overall pattern of the electrostatic potential and relating them to macroscopic properties (Murray et al. 1994; Murray and Politzer 1994). 
[For a current exposition of a variety of applications of molecular electrostatic potentials, see Murray and Sen (1996).]
Unlike many of the other quantities used now and earlier as indices of reactivity, the electrostatic potential V(r) is a real physical property, one that can be determined experimentally by diffraction methods as well as computationally (Politzer and Truhlar 1981). It is through this potential that a molecule is first "seen" or "felt" by other chemical species. Equation (3.1) shows that the potential at any point r is the sum of a positive term, representing the contribution of the nuclei, and a negative one, reflecting the distribution of electrons. The sign of V(r) is determined by the term that dominates at the point in question. For a neutral atom, the electrostatic potential is positive everywhere (Politzer and Murray 1991; Sen and Politzer 1989); negative regions arise only in the space surrounding a molecule or an anion.
Our discussion in this chapter will focus on the use of the electrostatic potential as a means to understanding and predicting chemical interactions. First, we will examine some of its properties and important features. Next, we will discuss methodology. Finally, we will review some recent applications of the electrostatic potential in areas such as hydrogen bonding, molecular recognition, and the understanding and prediction of a variety of physicochemical properties related to molecular interactions.
Our intent has not been to provide a complete survey of the ways in which the potential has been used, many of which are described elsewhere (Politzer and Daiker 1981; Politzer, Laurence, and Jayasuriya 1985; Politzer and Murray 1990; Politzer and Murray 1991; Politzer and Truhlar 1981; Scrocco and Tomasi 1973), but rather to focus on some diverse examples.
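As a brief worked illustration (ours, not taken from the chapter) of the statement that the potential of a neutral atom is positive everywhere, consider the ground-state hydrogen atom in atomic units. The only assumptions are the exact 1s density and the standard result for the Coulomb integral of a spherical exponential density:

$$\rho(r) = \frac{e^{-2r}}{\pi}, \qquad \int \frac{\rho(\mathbf{r}')\,d\mathbf{r}'}{|\mathbf{r}' - \mathbf{r}|} = \frac{1}{r} - e^{-2r}\left(1 + \frac{1}{r}\right)$$

$$V(r) = \frac{1}{r} - \left[\frac{1}{r} - e^{-2r}\left(1 + \frac{1}{r}\right)\right] = e^{-2r}\left(1 + \frac{1}{r}\right) > 0$$

The nuclear term 1/r is never fully cancelled by the electronic term at any finite r, so V(r) decays exponentially with distance but does not change sign.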
3.1 Background
The expression for V(r), given as Eq. (3.1), follows from the definition of electrical potential, which will be reviewed here. Any distribution of electrical charge creates a potential V(r) in the surrounding space. For an assembly of point charges Q_i located at positions r_i, this electrical potential is simply a sum of Coulombic potentials, as given in Eq. (3.2).
$$V(\mathbf{r}) = \sum_i \frac{Q_i}{|\mathbf{r}_i - \mathbf{r}|} \qquad (3.2)$$
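Q_i may be either positive or negative in sign. If the charge distribution is continuous, then integration replaces the summation in Eq. (3.2), giving Eq. (3.3).
$$V(\mathbf{r}) = \int \frac{D(\mathbf{r}')\,d\mathbf{r}'}{|\mathbf{r}' - \mathbf{r}|} \qquad (3.3)$$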
D(r) is the total charge density; its sign can vary as a function of r. If we consider a molecule as having a static but continuous distribution of electronic charge around a rigid nuclear framework, then its electrical or "electrostatic" potential will have a term similar to Eq. (3.2), with Q_i being the positive charges of the nuclei, Z_A, and a term similar to Eq. (3.3), with D(r) being replaced by the electronic density function ρ(r). Since ρ(r) is customarily defined as a positive function [unlike D(r)], the second term on the right side of Eq. (3.1) comes in with a negative sign. The net result is Eq. (3.1). Our purpose in reviewing this background is to show explicitly that Eq. (3.1) follows from basic electrostatics. It should be noted that the electrostatic potential V(r) is related rigorously to the total charge density D(r) through Poisson's equation, Eq. (3.4).
$$\nabla^2 V(\mathbf{r}) = -4\pi D(\mathbf{r}) \qquad (3.4)$$
V(r) accordingly plays a key role in the very fundamental and rapidly developing area of density functional theory (Parr 1983; Parr and Yang 1989). An important aspect of this has been the development of relationships, both exact and approximate, between the energies of atoms and molecules and the electrostatic potentials at their nuclei (Levy, Clement, and Tal 1981; Politzer 1980; Politzer 1981; Politzer 1987). More recently, we have shown that the position of the minimum potential along a bond provides us with a realistic set of covalent radii (Wiener et al. 1996) and that its magnitude is related to the bond dissociation energy (Wiener et al. 1997). The significance of these findings is discussed elsewhere (Politzer and Murray 1996).
The sign of the electrostatic potential V(r) in any particular region around a molecule, which depends upon whether the effects of the nuclei or electrons are dominant, is a key to assessing its reactivity there. Regions in which V(r) is negative in sign are those to which electrophiles are initially attracted, in particular to the most negative potentials, Vmin. These Vmin are typically between 1 and 2 Å from the nuclear framework and are associated with one of the following three molecular features: (a) heteroatoms with lone pairs, such as O, N, F, S, P, Cl, Se, As, and Br; (b) pi regions, such as are found in aromatic and double- and triple-bonded systems; and (c) strained bonds (Politzer and Murray 1990). Sites susceptible to nucleophilic attack can also be identified and ranked by means of positive electrostatic potential regions, but it is necessary to analyze the latter at distances at least 1 to 2 Å away from the nuclei, e.g., in planes removed from the molecular framework (Murray, Lane, and Politzer 1990; Politzer, Abrahmsen, and Sjoberg 1984; Politzer et al. 1984) or on molecular surfaces (Murray et al. 1991b; Murray and Politzer 1991; Murray and Politzer 1992; Pullman, Perahia, and Cauchy 1979; Sjoberg and Politzer 1990).
This is because the electrostatic potentials of atoms and molecules have local maxima only at the nuclei (Pathak and Gadre 1990). To identify sites for nucleophilic attack, it is necessary, accordingly, to look for the most positive values in planes or on surfaces that are at some distance from the nuclei. (These are, of course, not true local maxima.) As mentioned earlier, the electrostatic potential around a free neutral atom is positive everywhere (Politzer and Murray 1991; Sen and Politzer 1989), due to the very highly concentrated positive charge of the nucleus in contrast to the dispersed negative charges of the electrons. It is when atoms interact to form molecules that regions of negative potential may and usually do develop as a consequence of the subtle electronic rearrangements that accompany the process.
Figures 3.1 and 3.2 show calculated electrostatic potentials for guanine (1), in the plane of the molecule and on the molecular surface, respectively. Both figures show negative potentials associated with N3, N7, and the carbonyl oxygen, with the latter two overlapping to form a strong and extensive negative region on one side of the molecule. Both Figs. 3.1 and 3.2 allow us to rank N7 as the site most susceptible to electrophilic interactions, with N3 and the carbonyl oxygen being fairly similar but less negative than N7. This is consistent with the experimental observation that N7 is the favored site for the protonation and alkylation of guanine (Fiskin and Beer 1965; Lawley 1957), with some alkylation also occurring at the oxygen (Friedman, Mahapatra, and Stevenson 1963), which is more accessible than N3 (Figs. 3.1 and 3.2). Focusing next on positive regions of potential, it is clear that the surface shown in Fig. 3.2 is superior to the contour map in Fig. 3.1 in revealing the relative magnitudes of the positive potentials associated with the hydrogens. Specifically, Fig. 3.2 shows that the hydrogens of the amine group and the one bonded to N1 are more positive than those on the five-membered ring; the former would accordingly be predicted to be more favored for hydrogen bonding. This is indeed found to be the case; guanine hydrogen bonds to negative sites on cytosine through an amine hydrogen and that on N1.

Fig. 3.1 Calculated electrostatic potential of guanine (1), in kcal/mole, in the plane of the molecular framework. Dashed contours correspond to negative potentials. The positions of the most negative potentials are indicated; their values are -92.6, -72.6, and -69.2.

Fig. 3.2 Calculated electrostatic potential on the molecular surface of guanine (1). Three ranges of V(r) are depicted, in kcal/mole. These are: white for V(r) < 0; light gray for V(r) from 0 to 10; dark gray for V(r) > 10.

Uses of the electrostatic potential that will be emphasized in this chapter will be both qualitative and quantitative. It is important to recognize, however, that these applications are based on the following exact interpretations of V(r):
1. Given a point charge ±Q located at the point r, then ±QV(r) is equal to the electrostatic interaction energy between the unpolarized molecule and the point charge.
2. In a perturbation theory treatment of the total (not just electrostatic) interaction between the molecule and the point charge, ±QV(r) is the first-order term in the expression for the total interaction energy (which would include polarization and other effects).
3. The negative gradient of ±QV(r) equals the electrostatic force that is exerted by the molecule's unperturbed charge distribution upon the point charge ±Q.
As just mentioned, ±QV(r) is an energy quantity. Even though V(r) itself is a potential, not an energy, it is customary to express V(r) in units of energy (e.g., kcal/mole). This is actually QV(r) with Q equal to +1.
Since the electrostatic potential is closely related to the electronic density, it may be useful to discuss how the information that can be obtained from V(r) differs from that provided by ρ(r). Both are real physical properties, related by Eqs. (3.1) and (3.4). An important difference between V(r) and ρ(r) is that the electrostatic potential explicitly reflects the net effect of all of the nuclei and electrons at each point in space, whereas the electron density directly represents only the concentration of electrons at each point. A molecule's interaction with another chemical system is affected by its total charge distribution, both positive and negative, and thus can be better understood in terms of its electrostatic potential than its electronic density alone. Examples illustrating this point have been discussed elsewhere (Politzer and Daiker 1981; Politzer and Murray 1991).
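The first and third interpretations can be illustrated with a minimal Python sketch. It assumes a toy charge distribution made of two point charges (a hypothetical stand-in for a molecule, not data from this chapter), evaluates the Coulomb sum of Eq. (3.2) in atomic units, and obtains the force by a central finite difference. The function names and the two-charge model are illustrative choices only.

```python
import math

HARTREE_TO_KCAL_PER_MOL = 627.5095  # atomic units of energy -> kcal/mol

def distance(a, b):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def potential(r, charges):
    """V(r) for point charges given as (Q_i, (x, y, z)) pairs, atomic units."""
    return sum(q / distance(r, pos) for q, pos in charges)

def interaction_energy(q_probe, r, charges):
    """Interpretation 1: Q * V(r), the energy of a probe charge at r."""
    return q_probe * potential(r, charges)

def force(q_probe, r, charges, h=1e-5):
    """Interpretation 3: F = -grad[Q V(r)], by central finite differences."""
    components = []
    for k in range(3):
        plus = list(r)
        minus = list(r)
        plus[k] += h
        minus[k] -= h
        dv = potential(plus, charges) - potential(minus, charges)
        components.append(-q_probe * dv / (2.0 * h))
    return tuple(components)

# Hypothetical toy "molecule": +0.4 at the origin, -0.4 at z = 2 bohr.
toy = [(0.4, (0.0, 0.0, 0.0)), (-0.4, (0.0, 0.0, 2.0))]
point = (0.0, 0.0, 4.0)

v = potential(point, toy)
e_probe = interaction_energy(+1.0, point, toy)
print(f"V = {v:.4f} au, i.e. {e_probe * HARTREE_TO_KCAL_PER_MOL:.1f} kcal/mol for Q = +1")
print("force on the +1 probe (au):", force(+1.0, point, toy))
```

With Q = +1 the energy in kcal/mol is numerically the potential expressed in the customary energy units, and the negative z component of the force shows the positive probe being drawn toward the negative charge.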
3.2 Methodology
Although Eq. (3.1) is an exact formula for the electrostatic potential due to a set of nuclei {Z_A} and an electronic density ρ(r), the latter is generally obtained computationally from an ab initio or semi-empirical wave function or, more recently, from density functional procedures and is therefore necessarily approximate. It follows that the resulting V(r) is also approximate. Within this framework, given any particular ρ(r), V(r) can be evaluated rigorously (all of the integrals in Eq. (3.1) being calculated exactly) or approximately (Politzer and Daiker 1981). With the ready availability of software packages like the Gaussian series, the former is certainly the more widely used procedure. We will discuss some aspects of this first. We will then briefly examine some approximate methods for obtaining V(r) from ρ(r); detailed analyses of these procedures are found elsewhere (Politzer and Daiker 1981; Politzer, Laurence, and Jayasuriya 1985; Tomasi 1982).
3.2.1 Dependence of Rigorously Evaluated V(r) Upon Computational Level
How does a rigorously calculated electrostatic potential depend upon the computational level at which ρ(r) was obtained? Most ab initio calculations of V(r) for reasonably sized molecules are based on self-consistent field (SCF) or near-Hartree-Fock wave functions and therefore do not reflect electron correlation in the computation of ρ(r). It is true that the availability of supercomputers and high-powered workstations has made post-Hartree-Fock calculations of V(r) (which include electron correlation) a realistic possibility even for molecules with 5 to 10 first-row atoms; however, there is reason to believe that such computational levels are usually not necessary and not warranted. The Møller-Plesset theorem states that properties computed from Hartree-Fock wave functions using one-electron operators, as is V(r), are correct through first order (Møller and Plesset 1934); any errors are no more than second-order effects.
It has been shown that the electrostatic potentials of formamide calculated at near-Hartree-Fock (HF/6-31G*) and post-Hartree-Fock (MP2/6-31G*) levels are qualitatively similar (Politzer and Murray 1991). Both computational approaches predict the oxygen to be the preferred site for electrophilic attack (Seminario, Murray, and Politzer 1991). It is further noteworthy that SCF results obtained with minimal basis sets (e.g., HF/STO-3G and HF/STO-5G) are also in good agreement with those calculated at the higher computational levels. We feel justified in restating our earlier conclusion that varying the ab initio computational level does not greatly affect the overall pattern of the electrostatic potential for a given molecule (Politzer and Daiker 1981; Politzer and Murray 1991). There are, however, certainly differences in detail; the exact locations and magnitudes of the potential minima may change to some degree, and not always predictably (Luque, Illas, and Orozco 1990; Politzer and Daiker 1981; Politzer and Murray 1991; Seminario, Murray, and Politzer 1991). The key point is that a generally reliable picture of the electrostatic potential can be obtained with an SCF wave function, even if only of minimum basis set quality (Boyd and Wang 1989; Daudel et al. 1978; Gatti, MacDougall, and Bader 1988; Luque, Illas, and Orozco 1990; Politzer and Daiker 1981; Politzer and Murray 1991; Seminario, Murray, and Politzer 1991). However, we have found it advisable to include polarization functions for molecules with second-row atoms, even at the minimum basis set level. There are indications that the semi-empirical MNDO and AM1 methods also yield qualitatively reliable electrostatic potentials (Ferenczy, Reynolds, and Richards 1990; Luque, Illas, and Orozco 1990; Luque and Orozco 1990).
Within the past ten years, density functional procedures (Dahl and Avery 1984; Labanowski and Andzelm 1991; Parr and Yang 1989; Seminario and Politzer 1995) have emerged as an extremely promising alternative to the more traditional ab initio and semi-empirical procedures for computing molecular properties. Density functional theory is based on the Hohenberg-Kohn theorem (Hohenberg and Kohn 1964), according to which all of the electronic properties of a chemical system, including the energy, are determined by the electronic density. Important features of this approach are that it takes account of electron correlation but nevertheless requires considerably less computer time and space than do comparable ab initio techniques. The effectiveness of density functional procedures for computing molecular electrostatic potentials and other molecular properties is still being explored (Labanowski and Andzelm 1991; Laidig 1994; Murray et al. 1992; Seminario and Politzer 1995; Sola et al. 1996). Electrostatic potentials obtained by a local density functional method were shown to be similar to those from SCF calculations (Murray et al. 1992). More recently, analyses of density distributions obtained by density functional techniques (Laidig 1994; Sola et al. 1996) suggest that they provide generally reliable distributions of charge. These results are encouraging in regard to the use of density functional methods for obtaining electrostatic properties.
3.2.2 Approximate Evaluation of V(r)
With the continuing surge of development in computing capabilities, approximate methods for evaluating V(r) are now generally used only for very large molecular systems, such as those studied in nucleic-acid, protein, and other biomolecular research.
Historically, the most widely used approximate procedures for computing the electrostatic potential have been those based upon multipole expansions (Etchebest, Lavery, and Pullman 1982; Politzer and Daiker 1981; Rabinowitz, Namboodiri, and Weinstein 1986; Tomasi 1982; Williams 1988; Williams 1991). Such representations can approach the rigorously computed V(r) to varying degrees, depending on the number of terms (i.e., quadrupole, octupole, etc.) that are included. Terminating the expansion after the monopole terms [which corresponds to using a set of point charges to obtain V(r)] is the simplest possibility, the results of which obviously depend on the number, locations, and magnitudes of the point charges (Politzer and Daiker 1981). Overall, this approach has had only limited success. [On the other hand, there continues to be considerable interest in using the molecular electrostatic potential as a basis for obtaining physically meaningful atomic charges (Besler, Merz, and Kollman 1990; Breneman and Wiberg 1990; Chirlian and Francl 1987; Francl et al. 1996; Williams 1991; Williams and Yan 1988; Woods et al. 1990).] Expansions through the quadrupole terms have been shown to yield V(r) comparable with that obtained rigorously from the same ρ(r) (Murray et al. 1990; Rabinowitz, Namboodiri, and Weinstein 1986). This success has stemmed from the recognition that an electronic density function written in terms of a Gaussian basis set can be expressed as a finite multicenter expansion (Rabinowitz, Namboodiri, and Weinstein 1986), with the centers not limited to the nuclei.
Another methodology for computing the electrostatic potential that has been of interest for a number of years involves representing V(r) of large systems as a combination or superposition of contributions from their constituent units or fragments (Bonaccorsi et al. 1980; Nagy, Angyan, and Naray-Szabo 1987; Naray-Szabo 1979; Politzer and Daiker 1981; Pullman and Pullman 1981a; Scrocco and Tomasi 1973; Tomasi 1981). Quite recently, Breneman has developed a method in which V(r) is computed from densities obtained through "transferable atom equivalents" (Breneman 1996); the resulting electrostatic potentials are of ab initio quality.
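The effect of truncating a multipole expansion can be seen in a small Python sketch. It is only an illustration under simplifying assumptions: the "exact" potential is that of two hypothetical point charges rather than a molecular ρ(r), and the expansion is taken about a single center (the origin), whereas the methods cited above use multicenter expansions. The textbook Cartesian form V ≈ q/R + μ·r/R³ + ½ Σ Θ_jk r_j r_k/R⁵ is used, with Θ the traceless quadrupole tensor.

```python
import math

def norm(v):
    """Length of a 3-D vector."""
    return math.sqrt(sum(c * c for c in v))

def exact_potential(r, charges):
    """Coulomb sum over point charges (q, position), atomic units."""
    return sum(q / norm([a - b for a, b in zip(r, pos)]) for q, pos in charges)

def multipole_potential(r, charges, order):
    """Single-center expansion about the origin, truncated after `order`
    (0 = monopole, 1 = dipole, 2 = quadrupole); valid outside the charges."""
    R = norm(r)
    total = sum(q for q, _ in charges) / R                      # monopole term
    if order >= 1:
        mu = [sum(q * pos[k] for q, pos in charges) for k in range(3)]
        total += sum(m * c for m, c in zip(mu, r)) / R**3       # dipole term
    if order >= 2:
        for j in range(3):
            for k in range(3):
                # Traceless quadrupole component Theta_jk.
                theta = sum(
                    q * (3.0 * pos[j] * pos[k] - (norm(pos) ** 2 if j == k else 0.0))
                    for q, pos in charges
                )
                total += 0.5 * theta * r[j] * r[k] / R**5       # quadrupole term
    return total

# Hypothetical charges (not from the chapter): +0.5 at z = +1, -0.3 at z = -1 bohr.
toy = [(0.5, (0.0, 0.0, 1.0)), (-0.3, (0.0, 0.0, -1.0))]
point = (0.0, 0.0, 10.0)

exact = exact_potential(point, toy)
for order, name in enumerate(["monopole", "dipole", "quadrupole"]):
    approx = multipole_potential(point, toy, order)
    print(f"{name:10s} {approx:.6f}  |error| = {abs(approx - exact):.2e}")
```

For this particular point the error shrinks with each added term. That is not guaranteed at every point or for every charge distribution, since successive terms can alternate in sign.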
3.3 Some Applications
3.3.1 Analysis of Noncovalent Interactions
Noncovalent interactions, both inter- and intramolecular, are of considerable importance in determining the physical properties of molecules. Such interactions can be classified as hydrogen-bonding or non-hydrogen-bonding. In this section we will explore some recent uses of the electrostatic potential in the analysis of both types.
3.3.1.1 Family-Independent Relationships Between Computed Electrostatic Potentials on Molecular Surfaces and Solute Hydrogen Bond Acidity/Basicity
In view of the well-established importance of the electrostatic component in hydrogen bonding (Benzel and Dykstra 1983; Buckingham and Fowler 1985; Kollman 1977; Legon and Millen 1987; Lin and Dykstra 1986; Umeyama and Morokuma 1977), it is not surprising that the molecular electrostatic potential V(r) has been found to be an effective means for analyzing and correlating hydrogen-bonding interactions (Espinosa et al. 1996; Hagelin et al. 1995; Kollman et al. 1975; Leroy, Louterman-Leloup, and Ruelle 1976c; Murray and Politzer 1991; Murray and Politzer 1992; Murray, Ranganathan, and Politzer 1991; Politzer and Daiker 1981; Politzer and Murray 1991). For example, it has been used successfully to predict the sites and directionality of hydrogen bonds in a variety of systems, including many hydrogen-bonded dimers (Kollman et al. 1975; Leroy, Louterman-Leloup, and Ruelle 1976a; Leroy, Louterman-Leloup, and Ruelle 1976b; Leroy, Louterman-Leloup, and Ruelle 1976c). Specifically, the positions of the most negative potentials, Vmin, associated with the hydrogen-bond-accepting heteroatoms of isolated gas-phase molecules were shown to be effective for predicting the sites and therefore directionality of hydrogen bonds to that particular heteroatom (Kollman et al. 1975; Leroy, Louterman-Leloup, and Ruelle 1976a; Leroy, Louterman-Leloup, and Ruelle 1976b; Leroy, Louterman-Leloup, and Ruelle 1976c).
In addition, a good correlation was found between calculated hydrogen bond energies and the value of V(r) at a fixed distance from the hydrogen-bond-accepting molecule in a series of complexes between HF and various acceptors (Kollman et al. 1975). In order to further explore the relationship between the magnitude of the Vmin in the vicinity of a hydrogen-bond-accepting heteroatom and its tendency to form a hydrogen bond, we studied the relationship between Vmin and the solvent hydrogen-bond-accepting parameter β (Murray, Ranganathan, and Politzer 1991). β is one of the "solvatochromic parameters" introduced by Kamlet et al. (Kamlet et al. 1983; Kamlet, Abboud, and Taft 1981; Kamlet et al. 1979; Kamlet, Solomonovici, and Taft 1979; Kamlet and Taft 1976) in the course of an extended effort to separate, identify, and quantify various types of solvent effects upon experimentally measurable solution properties (e.g., rate constants, equilibrium constants, and IR, NMR, ESR, and UV/vis absorption maxima and intensities). It is interpreted as providing a measure of a solvent's ability to accept a proton in a solute-to-solvent hydrogen bond (Kamlet et al. 1983). We showed that there exist good correlations between β and Vmin, computed at the HF/STO-5G*//HF/STO-3G* level, for four series of oxygen- or nitrogen-containing molecules (Murray, Ranganathan, and Politzer 1991): azines, primary amines, alkyl esters, and molecules containing double-bonded oxygens.
The success of this simple approach led us to consider correlating some value of the electrostatic potential associated with a hydrogen-bond-donating molecule with the solvatochromic parameter α. The latter is viewed as indicative of a solvent's ability to donate a proton in a solute-solvent hydrogen bond (Kamlet et al. 1983). Because there are no true maxima associated with any regions of a molecule's V(r) away from its nuclei, we chose to compute the electrostatic potential on molecular surfaces defined by a contour of the electronic density (e.g., the 0.002 au or 0.001 au contour). Good correlations were found between α and the most positive value of the electrostatic potential on the surface, VS,max, for a group of -OH and a group of alkyl-hydrogen-bond donors (Murray and Politzer 1992). These calculations were also carried out at the HF/STO-5G*//HF/STO-3G* level.
In the preceding studies, we had focused upon the hydrogen-bond basicity and acidity of solvents. Our next step was to investigate whether our calculated Vmin and VS,max would also correlate with Abraham et al.'s more recently developed scales of solute hydrogen-bond basicity and acidity (Abraham et al. 1988; Abraham et al. 1989a; Abraham et al. 1990; Abraham et al. 1989b), designated β₂ᴴ and α₂ᴴ, respectively. α₂ᴴ and β₂ᴴ had been obtained from equilibrium constants for the formation of 1:1 complexes between a solute molecule and a given reference base or acid, respectively, in CCl4.
We found good correlations between VS,max and α2H and between Vmin and β2H for groups of molecules (Murray and Politzer 1992). The relationships are very similar to those found for the solvent parameters α and β (Murray, Ranganathan, and Politzer 1991). These findings confirmed that the calculated electrostatic potential, which refers to the molecule in the gas phase, can be quantitatively related to its tendency to form hydrogen bonds in solution, whether as a part of the solvent interacting with a solute or as a solute molecule forming a 1:1 complex with a reference system.

The correlations that have been described are all family dependent, a different one applying to each different group of structurally related molecules. Our next objective, accordingly, was to ascertain whether they could perhaps be made more general by improving the quality of the wave functions used to calculate the electrostatic potentials and, in the case of the hydrogen-bond-acceptor molecules, by using surface electrostatic potential minima (VS,min) instead of the three-dimensional spatial minima (Vmin). Optimized structures and surface electrostatic potentials were computed for eighteen hydrogen-bond-donating and thirty-three hydrogen-bond-accepting molecules, at the HF/6-31G* level (Hagelin et al. 1995). The eighteen hydrogen-bond donors are listed in Table 3.1 along with their experimentally derived and statistically corrected¹ α2H values and our calculated VS,max. At the HF/6-31G* level, an excellent general correlation was found between α2H (corrected) and VS,max; the latter is invariably associated with the hydrogen(s) to be donated in the hydrogen bond (Hagelin et al. 1995). This relationship, given as Eq. (3.5), is shown in Fig. 3.3. The correlation coefficient is 0.991 and the standard deviation is 0.04.
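As a rough check on this family-independent correlation, the sketch below fits a least-squares line to the corrected α2H values and the VS,max entries transcribed from Table 3.1. It is only an illustration and not the procedure used in the original study; the function name `linear_fit` and the data layout are assumptions introduced here.

```python
import math

# (VS,max in kcal/mole, corrected alpha2H) pairs transcribed from Table 3.1.
# Where the table gives a value in parentheses, that corrected value is used.
DONORS = [
    (22.3, -0.13), (27.8, -0.01), (33.5, 0.02), (30.2, 0.07),
    (36.7, 0.20), (36.3, 0.20), (47.0, 0.33), (47.8, 0.37),
    (48.3, 0.41), (49.6, 0.44), (56.4, 0.55), (62.1, 0.57),
    (58.4, 0.60), (59.7, 0.61), (64.1, 0.67), (72.5, 0.82),
    (70.1, 0.86), (73.3, 0.95),
]


def linear_fit(pairs):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = linear_fit(DONORS)
print(f"alpha2H ~ {slope:.4f} * VS,max + ({intercept:.3f}), r = {r:.3f}")
```

With the tabulated data the fit should give a slope close to 0.0196 per kcal/mole, an intercept close to -0.56, and a correlation coefficient of about 0.99, in line with the statistics quoted above and with the predicted column of Table 3.1.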
For the group of thirty-three hydrogen-bond-accepting molecules in Table 3.2, we chose to seek correlations directly with the equilibrium constant, KHB, for 1:1 complexation of
Table 3.1. Properties of some hydrogen-bond donors.a,b

Molecule           α2H (corrected)   α2H (predicted)   VS,max (kcal/mole)
CH3COCH3           0.04 (-0.13)      -0.12             22.3
CH3CN              0.09 (-0.01)      -0.01             27.8
CH3NO2             0.12 (0.02)        0.10             33.5
CH2Cl2             0.13 (0.07)        0.04             30.2
CHCl3              0.20               0.16             36.7
C6H5NH2            0.26 (0.20)        0.15             36.3
CH3CH2OH           0.33               0.36             47.0
CH3OH              0.37               0.38             47.8
pyrrole            0.41               0.39             48.3
indole             0.44               0.41             49.6
CH3COOH            0.55               0.55             56.4
CF3CH2OH           0.57               0.66             62.1
C6H5OH             0.60               0.59             58.4
2-naphthol         0.61               0.61             59.7
p-C6H4(Cl)OH       0.67               0.70             64.1
p-C6H4(OH)NO2      0.82               0.86             72.5
(CF3)3COH          0.86               0.82             70.1
CF3COOH            0.95               0.88             73.3

a The α2H values are from R. W. Taft. A statistical correction has been applied to obtain the values given in parentheses.
b The VS,max and predicted α2H values have been reported in reference 91.
Fig. 3.3 Calculated electrostatic potential on the molecular surface of 1,3-bisphenylurea (13). Three ranges of V(r) are depicted, in kcal/mole. These are: white for V(r) < 0; light gray for V(r) from 0 to 10; dark gray for V(r) > 10.
the acceptors with p-fluorophenol. This is because the KHB in Table 3.2 are from different sources and it seemed preferable to use the actual measured quantity (KHB) rather than one defined in terms of it. As in the case of the hydrogen-bond acidity correlations, we have listed the statistically corrected² values of log KHB in Table 3.2, along with our calculated VS,min for each molecule (Hagelin et al. 1995). Only four of the acceptor molecules needed statistical corrections; these are (CH3CH2S)2, dithiane, dioxane, and pyrimidine, each of which has two identical hydrogen-bond-accepting heteroatoms per molecule.

Table 3.2. Properties of some hydrogen-bond acceptors.a,b

Molecule              log KHB (corrected)   log KHB (predicted)   VS,min (kcal/mole)
benzene               -0.50                 -0.53                 -20.1
Cl3CCN                -0.26                  0.29                 -29.1
(CH3CH2S)2            -0.10 (-0.40)         -0.17                 -24.0
(CH3CH2)2S             0.11                  0.08                 -26.9
dithiane               0.24 (-0.06)         -0.26                 -23.0
tetramethylsulfide     0.30                  0.14                 -27.5
ClH2CCN                0.39                  0.87                 -35.6
HCO2CH3                0.69                  1.58                 -43.4
F3CCH2NH2              0.72                  0.39                 -30.3
(3,5-Cl2)pyridine      0.80                  0.41                 -30.5
C6H5CHO                0.80                  1.27                 -40.0
CH3OH                  0.82                  1.27                 -40.0
CH3CN                  0.91                  1.45                 -42.0
CH3CO2CH3              1.00                  1.32                 -40.5
(CH3CH2)2O             1.01                  0.91                 -36.0
CH3CH2OH               1.02                  1.29                 -40.2
dioxane                1.03 (0.73)           0.73                 -34.0
CH3COCH3               1.18                  1.38                 -41.2
tetrahydrofuran        1.26                  1.33                 -40.7
pyrimidine             1.35 (1.05)           0.88                 -35.7
(CH3CH2)3P=S           1.46                  1.00                 -37.0
(CH3)2NCN              1.56                  1.88                 -46.8
cyclopropylamine       1.64                  1.25                 -39.8
NH3                    1.68                  1.98                 -47.8
pyridine               1.88                  1.37                 -41.1
(4-CH3)pyridine        2.03                  1.51                 -42.6
(CH3)2NCHO             2.10                  2.02                 -48.3
CH3NH2                 2.15                  1.69                 -44.6
CH3(CH2)3NH2           2.17                  1.70                 -44.7
(CH3)2NCOCH3           2.44                  2.07                 -48.8
(CH3)2S=O              2.53                  2.85                 -57.5
(1-CH3)imidazole       2.60                  2.20                 -50.3
pyridine N-oxide       2.76                  2.42                 -52.7

a Most of the log KHB values were obtained from M. Berthelot. A statistical correction has been applied to obtain the values in parentheses.
b All of the data in Table 3.2 have been reported in reference 91.
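Each corrected entry in Table 3.2 differs from the measured log KHB by 0.30, which is what one would expect if the correction divides KHB by the number of equivalent acceptor sites (two in each of the four cases). The sketch below assumes that form of the correction; the exact procedure is not spelled out in this section, and the function name and the `n_sites` argument are illustrative only.

```python
import math


def statistically_corrected_log_khb(log_khb, n_sites):
    """Per-site log KHB, ASSUMING KHB is divided by the number of
    equivalent hydrogen-bond-accepting sites in the molecule."""
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    return log_khb - math.log10(n_sites)


# Measured log KHB values for the four two-site acceptors in Table 3.2.
MEASURED = {
    "(CH3CH2S)2": -0.10,
    "dithiane": 0.24,
    "dioxane": 1.03,
    "pyrimidine": 1.35,
}

corrected = {
    name: round(statistically_corrected_log_khb(value, 2), 2)
    for name, value in MEASURED.items()
}
print(corrected)
```

Under this assumption the four results reproduce the parenthesized values in the table (-0.40, -0.06, 0.73, and 1.05).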
Fig. 3.4 Plot of calculated values, in (kcal/mole)^2, versus Hildebrand solubility parameters δ, in MPa^(1/2), for the molecules given in Table 3.3. The linear correlation coefficient and standard deviation are 0.930 and 1.9 MPa^(1/2), respectively.
We found a reasonable general linear relationship between VS,min and the corrected log KHB values, given as Eq. (3.6) and shown in Fig. 3.4.

\[ \log K_{\mathrm{HB}}(\mathrm{corrected}) = -9.030 \times 10^{-2}\, V_{S,\min} - 2.341 \qquad (3.6) \]
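Eq. (3.6) can be applied directly to a computed VS,min. The short sketch below evaluates it for three entries of Table 3.2 and compares the result with the predicted column; the function name is an assumption introduced for illustration, and VS,min is taken in kcal/mole as in the table.

```python
def predicted_log_khb(vs_min):
    """Eq. (3.6): log KHB (corrected) from VS,min given in kcal/mole."""
    return -9.030e-2 * vs_min - 2.341


# VS,min values (kcal/mole) transcribed from Table 3.2.
EXAMPLES = {
    "benzene": -20.1,
    "(CH3)2S=O": -57.5,
    "pyridine N-oxide": -52.7,
}

predictions = {name: round(predicted_log_khb(v), 2) for name, v in EXAMPLES.items()}
print(predictions)
```

The values obtained (-0.53, 2.85, and 2.42) agree with the predicted log KHB entries listed for these molecules in Table 3.2.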
The correlation coefficient is 0.902 and the standard deviation is 0.39. That this correlation is of a lower quality than that between α2H and VS,max may be due in part to the fact that VS,max is always on a hydrogen, while VS,min is on a variety of different heteroatoms. Overall, we have shown that family-independent correlations can be obtained for solute hydrogen bond acidity and basicity, as quantitated by α2H and log KHB. These are well represented at the HF/6-31G* level by an electrostatic potential term alone, VS,max or VS,min, respectively (Hagelin et al. 1995).

3.3.1.2 The Analysis of Non-Hydrogen-Bonding Noncovalent Interactions Using Surface Electrostatic Potentials

The calculated molecular electrostatic potential is particularly well suited for the analysis of noncovalent interactions, which do not involve making or breaking covalent bonds and which occur without any extensive polarization or charge transfer between the interacting species. As we have discussed in the previous section, V(r) has been shown to be useful
both as a guide to sites and directional preferences for hydrogen bonds and as an indicator of hydrogen-bond-donating and -accepting tendencies. In this section, we will discuss the application of the electrostatic potential to the analysis of other types of noncovalent interactions that we will classify as non-hydrogen-bonding.

"Halogen" Bonding. Certain directional preferences have been observed in the orientations of halogen-containing organic molecules in the crystalline state (Murray-Rust et al. 1983; Ramasubba, Parthasarathy, and Murray-Rust 1986). When the halogen X is chlorine, bromine, or iodine, electrophilic portions of neighboring molecules generally tend to interact with it in a "side-on" manner, nearly normal to the C-X bond, whereas nucleophilic regions usually interact nearly "head-on," along the C-X axis at the X end (Ramasubba, Parthasarathy, and Murray-Rust 1986). However, interactions with a fluorine in a C-F bond tend to be only by electrophiles, with the approaches somewhere intermediate between "side-on" and "head-on." The fact that nucleophiles interact at all with chlorine, bromine, and iodine in crystalline organic environments may seem inconsistent with the overall electron-attracting natures of these halogens and the resulting negative electrostatic potentials associated with them (Murray, Lane, and Politzer 1990; Politzer, Laurence, and Jayasuriya 1985; Politzer and Murray 1991). However, we have recently demonstrated that all of the directional preferences mentioned above can be predicted from an analysis of the potentials computed on the molecular surfaces of a series of halogenated methanes, including CF4, CCl4, and CBr4 (Brinck, Murray, and Politzer 1992b).

[Scheme: for X = Cl, Br, I, "head-on" interaction with nucleophiles and "side-on" interaction with electrophiles; for X = F, interaction with electrophiles.]
Our calculated surface electrostatic potentials for CCl4 and CBr4 show the anticipated negative regions around the chlorines and bromines, except at the outer ends, which are actually positive (Brinck, Murray, and Politzer 1992b). The negative rings around the sides of the Cl and Br have surface minima at angles of 102° and 96° with the C-Cl and C-Br axes, respectively. These results are consistent with the observed orientational preferences of both electrophiles and nucleophiles interacting with C-Cl and C-Br bonds in organic crystals. In contrast, the surface potential of CF4 is negative along the sides and at the ends of the fluorines, with the VS,min forming angles of 132° with the C-F bonds (Brinck, Murray, and
Politzer 1992b). This is consistent with fluorine in organic crystals interacting only with electrophiles and in an intermediate-type orientation. The positive potentials at the ends of the chlorines and bromines in CCl4 and CBr4 suggest that they should be able to interact with negative portions of other systems. This has indeed been observed, e.g., with the π electrons of aromatic rings such as benzene or p-xylene (Gotch, Garrett, and Zwier 1991; Ham 1953; Hooper 1964) and with the lone pair regions of pyridine (Dumas, Peurichard, and Gomel 1978), tetrahydrofuran (Dumas, Peurichard, and Gomel 1978), quinuclidine (2) (Blackstock, Lorand, and Kochi 1987) and diaza[2.2.2]octane (3) (Blackstock, Lorand, and Kochi 1987). Lorand and Spek (Lorand and Spek) have introduced the term "halogen-bonding" to designate this type of electrostatic interaction between the ends of the larger halogens (Cl, Br, and I) in carbon-halogen bonds and the electron-donating portions of other molecules.

We have also shown that the electrostatic potentials computed on the molecular surfaces of the mixed-halogen derivatives CHFCl2, CF3Cl, CF3Br, and CF3I give qualitatively the same pattern as was seen for the methane systems containing only one type of halogen. The fluorines are again negative everywhere, while chlorine, bromine, and iodine are negative around the sides but positive at the ends (Murray, Lane, and Politzer 1995b; Politzer and Murray 1995).
These results are relevant to spectroscopic studies showing that a variety of hydrogen-free fluorocarbons (e.g., CF3Cl, C2F5Cl, CF3Br, and C2F5Br) can act as hydrogen bond breakers (DiPaulo and Sandorfy 1974). This capability has been linked to the anesthetic potencies of halocarbons; it has been suggested that molecules such as CF3Cl and CF3Br disrupt preexisting hydrogen bonds by displacing the proton donors and forming "halogen" bonds.

Interactions Involving Benzene and Its Derivatives. The surface electrostatic potential of benzene has a symmetrical pattern with negative regions above and below the six-membered ring, due to the π electrons, and positive regions encircling the molecule with surface maxima associated with the hydrogens (Sjoberg 1989). This simple pattern can explain the existence of both the T-shaped structure (4) and the parallel-displaced structure (5) that have been reported experimentally and theoretically for the benzene dimer (Hobza, Selzle, and Schlag 1993). On the other hand, this V(r) pattern argues against a sandwich-type structure (6), and indeed this has been found computationally to be less stable than 4 and 5 (Hobza, Selzle, and Schlag 1990). The surface potential of benzene is also consistent with the crystalline orientation of benzene molecules, which is essentially a three-dimensional extension of the T-shaped dimer 4 (Cox et al. 1958).

The interactive properties of the derivatives of benzene vary widely, depending upon the nature of the substituent and its influence upon the aromatic ring. The effects of substituents have been categorized and quantified through the introduction of first the Hammett and then the Taft constants, which were obtained through the analysis of linear free-energy relationships (Exner 1988). The electrostatic potentials of benzene derivatives provide another means of ascertaining how the substituents affect the interactive behavior of the aromatic systems (Murray, Paulsen, and Politzer 1994).
In the course of a study of the surface electrostatic potentials of a group of C6H5X molecules, where X = NH2, OH, OCH3, CH3, F, Cl, Br, I, CHO, CN, and NO2, we have found that the respective surface V(r) can be categorized into three main groups, depending upon whether X is (a) a resonance donor, (b) strongly electron-withdrawing, or (c) a halogen. Each of these groups will be discussed separately.

The relatively strongly resonance-donating substituents -NH2, -OH, and -OCH3 produce very similar surface V(r) patterns (Murray, Paulsen, and Politzer 1994). The regions above and below the aromatic rings are more negative than those in benzene, and even stronger negative potentials are found in the vicinities of the heteroatoms (N or O); these are attributed to the lone pair electrons of the heteroatoms. These surface V(r) patterns are consistent with aniline, phenol, and anisole acting as bifunctional bases (Berthelot 1992). In toluene, on the other hand, with the weakly electron-donating methyl substituent, the negative regions above and below the ring are only very slightly strengthened relative to benzene. The methyl group does introduce some asymmetry into the surface V(r) pattern, but otherwise changes it relatively little from that of benzene, suggesting that alkyl substituents should not be classified together with the stronger electron donors but instead should be viewed as slight perturbations of benzene's hydrogens.

We have found that strongly electron-withdrawing substituents, such as -CN, -NO2, and -CHO, either totally eliminate the negative regions above and below the aromatic ring, as in benzonitrile and nitrobenzene, or significantly weaken them, as in the case of benzaldehyde (Murray, Paulsen, and Politzer 1994). Molecules of this type have strong negative regions of potential associated with certain heteroatoms of their functional groups, such as the oxygens of -NO2 and -CHO and the nitrogen of -CN.
From an analysis of the electrostatic potentials, it would be predicted that electrophilic intermolecular interactions should occur in the vicinities of these heteroatom negative regions (Murray, Paulsen, and Politzer 1994). Indeed Berthelot (Berthelot 1992) has found benzonitrile, nitrobenzene, and benzaldehyde to be monofunctional oxygen or nitrogen bases. The positive V(r) regions above nitrobenzene and benzonitrile suggest that these may serve as sites for nucleophilic interactions. Indeed our results for nitrobenzene and other nitroaromatics (Murray, Lane, and Politzer 1990; Politzer, Abrahmsen, and Sjoberg 1984) are consistent with the observed interactions of these molecules with hydroxide and alkoxide ions to form Meisenheimer complexes, e.g., as shown for 1,3,5-trinitrobenzene (7) below.

The monohalogenated derivatives C6H5X, where X = F, Cl, Br, and I, have negative regions of potential above and below their aromatic rings (Murray, Paulsen, and Politzer 1994). However, they are all weaker than those of benzene, due to the net electron-attracting nature of the halogen. There is also a weak negative region associated with each of the halogen atoms. Chloro-, bromo-, and iodobenzene have an additional interesting feature: the surface potential at the end of the chlorine, bromine, or iodine is positive (Murray, Paulsen, and Politzer 1994), suggesting a tendency for interactions with nucleophiles at these sites ("halogen" bonding), as we have discussed above. The overall pattern of the surface V(r) of the halogenated benzenes suggests that they will undergo weak electrophilic interactions above and below their aromatic rings and through the halogens, in addition to weak nucleophilic interactions at the ends of the halogens in chloro-, bromo-, and iodobenzene.

The surface electrostatic potentials of benzene derivatives demonstrate how the substituent can significantly alter the pattern of the surface potential. Such effects are a key to understanding and predicting the noncovalent interactions that these types of molecules will undergo.

Azine Interactions. The surface potentials of the azines pyridine (8), pyrimidine (9), pyrazine (10), s-triazine (11), and s-tetrazine (12) show several distinct patterns. The most negative potentials in each are associated with the ring nitrogens; they become less negative as the number of ring nitrogens increases (Politzer and Murray 1990).
Though 8-12 are viewed as having varying degrees of aromaticity, only pyridine, 8, shows any negative region extending above and below the six-membered ring. 9-12 have increasingly stronger positive regions of surface potential above the ring. This is in striking contrast to benzene, which is negative in this region.
The surface potentials of 9-12 suggest that they will interact with nucleophiles above and below their six-membered rings and with electrophiles through their ring nitrogens. Indeed, the surface potential of 12 helps to explain the formation of its crystal structure, its dimerization, and its complexes with other molecules, such as HCl, H2O, and C2H2 (Politzer et al. 1992b). For example, in the s-tetrazine crystal, the planes of adjacent molecules are perpendicular to one another (Bertinotti, Giacomello, and Liquori 1956), consistent with the negative N-N portions of each being positioned above the positive ring centers of its neighbors.

Diphenylurea Crystallization. 1,3-bisphenylurea (13) is the parent compound of a large family of derivatives, most of which do not cocrystallize with guest molecules (Etter et al. 1990). Even when put into solution with strong hydrogen bond acceptors, e.g., dimethyl sulfoxide (DMSO), triphenylphosphine oxide (TPPO), and tetrahydrofuran (THF), most diphenylureas crystallize with other molecules of the same kind in the connectivity pattern shown below (14), instead of forming cocrystals (e.g., 15).
We have proposed that the tendency for 13 to form homomeric rather than guest-host crystals is largely due to a relatively strong and nonlocalized electrostatic attraction between diphenylurea molecules (Murray et al. 1991a). The surface electrostatic potential of 13, shown in Fig. 3.3, shows an extended negative region along the top edge of the molecule and a long positive one along the bottom edge. The suggested nonlocalized electrostatic interaction between the top and bottom edges of adjacent molecules, more extensive than hydrogen bonding, apparently provides sufficient stability that homomeric crystal formation is not disrupted even by the presence of very strong hydrogen bond acceptors in solution during crystallization.

3.3.2 Molecular Recognition

The initial step in many important classes of biological processes, including drug-receptor and enzyme-substrate interactions, is one of "recognition." A receptor "recognizes" that an approaching molecule has certain key features that will promote their mutual interaction. This recognition is believed to occur when the two species involved in the interaction are at a relatively large separation and precedes the formation of any covalent bond. The electrostatic potential V(r) is well suited for analyzing processes based on "recognition," because V(r) is a physically meaningful representation of how a molecule is perceived by a system in its vicinity. It is through their potentials that the two species involved in the interaction first "see" each other. Therefore, it is not surprising that the electrostatic potential has been shown to be an effective means of analyzing and elucidating recognition processes (Cheney 1982; Hayes and Kollman 1976; Loew and Berkowitz 1975; Martin et al. 1975; Martin et al. 1983a; Martin et al. 1983b; Martinelli and Petrongolo 1980; Murray, Evans, and Politzer 1990; Murray et al. 1986; Naray-Szabo 1983; Osman, Weinstein, and Topiol 1981; Petrongolo, Preston, and Kaufman 1978; Petrongolo and Tomasi 1975; Platt and Silverman 1996; Sheridan and Allen 1981; Spark, Winkler, and Andrews 1982; Thomson and Brandt 1983; Weinstein, Osman, and Green 1979; Weinstein et al. 1981a; Weinstein et al. 1981b). Several illustrative examples of this use of the electrostatic potential will be summarized below.

The first example involves the molecule 5-hydroxytryptamine (16), also known as serotonin and 5-HT.
16 is a neurotransmitter that interacts with receptors both in the brain and in peripheral tissues. The electrostatic potentials of 16 and other hydroxytryptamines have been found to have two characteristic minima on each side (above and below) of the indole portions of the molecules (Weinstein, Osman, and Green 1979; Weinstein et al. 1981a; Weinstein et al. 1981b). One of these is associated with the six-membered ring, the other with the hydroxyl oxygen. An "orientation vector" can be drawn for each hydroxytryptamine, connecting these two minima along the potential gradient between them. It was found that the degree to which the direction of this vector deviates from that in 5-HT is related to the relative affinity of that molecule for 5-HT receptors. Apparently the direction of the vector is indicative of how readily the molecule can achieve the preferred orientation relative to the receptor.
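The comparison of orientation vectors can be sketched numerically. The example below treats each vector as the straight line joining the two minima (a simplification of the path along the potential gradient described above) and reports the angle by which one vector deviates from a reference. The coordinates are hypothetical placeholders, not values from the cited studies, and the function names are assumptions introduced here.

```python
import math


def orientation_vector(ring_minimum, hydroxyl_minimum):
    """Straight-line vector from the ring minimum to the hydroxyl minimum."""
    return tuple(h - r for r, h in zip(ring_minimum, hydroxyl_minimum))


def deviation_angle_deg(reference, candidate):
    """Angle in degrees between two orientation vectors."""
    dot = sum(a * b for a, b in zip(reference, candidate))
    norm = math.sqrt(sum(a * a for a in reference)) * math.sqrt(
        sum(b * b for b in candidate)
    )
    if norm == 0.0:
        raise ValueError("orientation vectors must have nonzero length")
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Hypothetical positions of the two V(r) minima, in angstroms.
reference = orientation_vector((0.0, 0.0, 1.8), (3.0, 0.0, 1.8))
analogue = orientation_vector((0.0, 0.0, 1.8), (3.0, 3.0, 1.8))

angle = deviation_angle_deg(reference, analogue)
print(f"deviation from reference: {angle:.1f} degrees")
```

In this picture, a larger deviation angle for a given hydroxytryptamine would correspond to a vector direction further from that of 5-HT.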
This type of reasoning has explained the experimental finding that 5-HT and d-lysergic acid diethylamide (LSD, 17) act on the same receptors (Weinstein, Osman, and Green 1979; Weinstein et al. 1981a; Weinstein et al. 1981b). This would not have been predicted by looking at the structures of 16 and 17; however, the electrostatic potentials of the two exhibit important similarities. The C12-C13 double bond in LSD produces a minimum in V(r) that mimics the one associated with the OH group in 16. The net result is that the electrostatic potential of 17 shows the key features that are required for the molecule to interact effectively with 5-HT receptors. It is interesting to note that when compound 18 was tested experimentally, its affinity for an LSD/5-HT receptor was lower by a factor of 10^-2 than that of either 5-HT or LSD, but comparable to that of tryptamine (5-HT without the hydroxyl group).

Our second example involves substituted dibenzo-p-dioxins and their analogues. 2,3,7,8-Tetrachlorodibenzo-p-dioxin (19, TCDD) is the prototype of a group of halogenated aromatic hydrocarbons which have varying degrees of toxicity, ranging from virtually none, for the parent compound dibenzo-p-dioxin (20), to very high, as in the case of TCDD (Long and Hansson 1983; Poland and Knutson 1982). In the course of investigating factors that lead to effective interactions of certain members of this class of compounds with the receptor believed to initiate their toxic responses, we have computed the electrostatic potentials of 19 and 20, as well as nine other mono- to tetrahalogenated dibenzo-p-dioxins, dibenzofuran (21), 2,3,7,8-tetrachlorodibenzofuran (22), and three other analogues of TCDD, 23-25 (Murray, Evans, and Politzer 1990; Murray and Politzer 1987; Murray et al. 1986; Politzer 1988; Sjoberg et al. 1990).
Laboratory studies have reported an excellent correlation between the toxicities of the dibenzo-p-dioxins and related compounds and their abilities to induce aryl hydrocarbon hydroxylase (AHH) activity (Poland and Knutson 1982), suggesting that some mechanistic features may be common to both the toxic and AHH-inducing activities of these compounds. Indeed, both the toxicities and the AHH-inducing activities of the dibenzo-p-dioxins have been found to correlate well with binding affinities to a cytosolic receptor (Poland, Greenlee, and Kende 1979; Poland and Knutson 1982). Certain structural features have been identified as being associated with high degrees of toxicity, AHH induction, and receptor binding for the dibenzo-p-dioxins and related compounds (Poland and Knutson 1982). These are as follows:
1. The molecules should be essentially planar and rectangular, with dimensions of roughly 3 × 10 Å.
2. At least three of the four lateral positions (2,3,7,8; see 19) should have halogen substituents.
3. The activity induced by halogen substituents decreases in going from bromine to chlorine to fluorine.
4. At least one ring position should remain unsubstituted.

Clearly, other molecular frameworks exist, besides dibenzo-p-dioxin, that can approximately meet the size and shape requirements, for example, 22-25. These also have four lateral positions chlorinated. It is interesting to look at the levels of biological activity of these analogues, in comparison to TCDD. 22, 2,3,7,8-tetrachlorodibenzofuran, and 23, 2,3,6,7-tetrachlorobiphenylene, have activities that are, respectively, slightly less than and very similar to that of TCDD (Poland, Greenlee, and Kende 1979; Poland and Knutson 1982). On the other hand, 24 and 25 are much less active than is TCDD.

We have shown that the biological activities of 22-25 can be understood in terms of the degree to which their electrostatic potentials mimic that of TCDD (Murray, Evans, and Politzer 1990). Since a molecule encounters a receptor at some distance from itself, the electrostatic potentials have been computed either in planes 1.75 Å above the framework of the molecule or on molecular surfaces. To provide a basis for understanding the potentials of TCDD and 22-25, it is instructive to first consider that of the unsubstituted parent molecule dibenzo-p-dioxin, 20. Its V(r) is weakly negative above the outer aromatic rings and strongly negative in the areas surrounding the oxygen atoms, while the potentials above the lateral regions are positive in sign (Murray et al. 1986; Sjoberg et al. 1990). The replacement of the lateral hydrogens by chlorines to give TCDD results in a complete transformation of the V(r) pattern. In TCDD, at 1.75 Å above the plane, there are no negative regions associated with either the aromatic rings or the central oxygens; however, V(r) is now negative above the lateral positions (Murray et al. 1986).
The electrostatic potential of 22 is similar to that of TCDD, but lacks the horizontal plane of symmetry of the latter. The potentials of TCDD and 23, which share the same degree of symmetry, are an even closer match. The similarity in the biological activities of TCDD, 22, and 23 shows that the oxygens in TCDD and 22 are not necessary for high activity. In fact, they can even be an inhibiting influence, as is apparently the case for 24, which has a potential pattern similar to that of TCDD but is much less active, presumably because the regions of negative potential near the central carbonyl oxygens in 24 are stronger than those in TCDD and 22. V(r) for 25 also differs significantly from that of TCDD in that the negative regions of the "lateral" chlorines actually overlap on one side of the framework, so that there is not a true extended positive region separating them.

Our electrostatic potential analyses of TCDD, 22-25, and a number of other dibenzo-p-dioxins have allowed us to make some generalizations about the V(r) pattern that appears to lead to high biological activity for this class of halogenated aromatics. These are listed below:
1. Biological activity appears to require negative potentials above all or most of the lateral positions, with optimum values of the minima, at 1.75 Å above the plane, being about -13 kcal/mole (at the STO-5G level) (Murray, Evans, and Politzer 1990; Murray and Politzer 1987; Murray et al. 1986; Politzer 1988; Sjoberg et al. 1990).
2. The negative regions of V(r) above the lateral positions of the molecule should be separated by a large central region of positive V(r).
3. Negative regions of V(r) associated with central oxygens are not necessary for high activity; on the contrary, in systems that do have oxygens in or bonded to the center ring, it is important that the oxygen potentials be relatively weak and small (Murray, Evans, and Politzer 1990; Murray and Politzer 1987; Politzer 1988; Sjoberg et al. 1990).
Although the actual structure of the receptor binding site is not known, some theoretical modeling computations based on experimental competitive binding studies support a stacking interaction model (Long, McKinney, and Pedersen 1987; McKinney et al. 1985; McKinney, Long, and Pederson 1984; McKinney and Pederson 1986). In this model, the toxigen is envisioned as being involved in a recognition step at a porphine-like binding site (26). For the specific case of TCDD, the most favorable interaction has been found to be one where the molecular planes of 19 and 26 are parallel to one another and separated by 3.38 Å (McKinney, Long, and Pederson 1984), with the dioxin oxygens roughly above the unsubstituted nitrogens of 26. It has been shown in previous work (Politzer and Daiker 1981; Politzer and Murray 1990; Politzer and Murray 1991) that heterocyclic nitrogens, such as the doubly coordinated ones in 26, have large and strongly negative potentials associated with them. We have suggested that the observed need for small and weak negative oxygen potentials in the active dibenzo-p-dioxins, dibenzofurans, and other analogues containing central oxygens may be to avoid repulsive interactions with negative regions of V(r) above the center of the receptor, e.g., the doubly coordinated nitrogens in 26. This example of the electrostatic potentials of the dibenzo-p-dioxins shows how the patterns associated with high activity may be used to infer information about the actual receptor. The same approach can be used for other drug-receptor and toxigen-receptor systems.

3.3.3 Statistically Based Interaction Indices Derived from Electrostatic Potentials Computed on Molecular Surfaces: A General Interaction Properties Function (GIPF)

3.3.3.1 Background

In the preceding sections we have discussed how the electrostatic potential can be used successfully to study molecular phenomena involving noncovalent interactions.
We have shown that the patterns of positive and negative V(r) and the positions and values of V(r) extrema can be useful in understanding and predicting the most favorable sites and orientations for noncovalent interactions, e.g., hydrogen or "halogen" bonding, and for interpreting the recognition of a molecule by a receptor.
In recent years, we have extended our analysis to include certain statistically defined features of the surface electrostatic potential. Our purpose has been to expand the capabilities of V(r) for quantitatively describing macroscopic properties that reflect noncovalent molecular interactions. This has led to the development of the General Interaction Properties Function (GIPF), described by Eq. (3.7):

$$\text{Property} = f\left[\text{area},\ \Pi,\ \sigma_{tot}^{2},\ \nu,\ V_{S,max},\ V_{S,min}\right] \tag{3.7}$$
The macroscopic property of interest, e.g., heat of vaporization, is represented in terms of some subset of the computed quantities on the right side of Eq. (3.7). The latter are measures of various aspects of a molecule's interactive behavior, with all but surface area being defined in terms of the electrostatic potential computed on the molecular surface. VS,max and VS,min, the most positive and most negative values of V(r) on the surface, are site-specific; they indicate the tendencies and most favorable locations for nucleophilic and electrophilic interactions. In contrast, Π, σ²tot, and ν are statistically based global quantities, which are defined in terms of the entire molecular surface. Π is a measure of local polarity, σ²tot indicates the degree of variability of the potential on the surface, and ν is a measure of the electrostatic "balance" between the positive and negative regions of V(r) (Murray et al. 1994; Murray and Politzer 1994).

The macroscopic properties that have been represented successfully by variations of Eq. (3.7) include boiling points (Murray et al. 1993a), critical constants (temperatures, pressures, and volumes) (Murray et al. 1993a), partition coefficients (Brinck, Murray, and Politzer 1993; Murray, Brinck, and Politzer 1993), solubilities in supercritical fluids (Murray et al. 1993b; Politzer et al. 1992a; Politzer et al. 1993), heats of vaporization (Murray, Lane, and Politzer 1995b; Murray and Politzer 1994), heats of sublimation (Politzer et al. 1997), heats of fusion (Murray, Brinck, and Politzer 1996), liquid and crystal densities (Murray, Brinck, and Politzer 1996), surface tension (Murray, Brinck, and Politzer 1996), diffusion constants (Politzer, Murray, and Flodmark 1996), C60 solubilities (Murray, Gagarin, and Politzer 1995), and nitroaromatic and nitroheterocyclic impact sensitivities (Murray, Lane, and Politzer 1995a).
A key point to note is that liquid, solid, and solution properties are being expressed solely in terms of quantities computed for individual molecules; environmental factors are not explicitly taken into account. In this section, we will first define and discuss the global quantities Π, σ²tot, and ν. This will be followed by a review of some earlier and current applications of this approach.
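As a rough illustration of how a GIPF-type relationship might be obtained in practice, the sketch below fits a property to two surface-derived descriptors by ordinary least squares. Everything specific in it is an assumption made for illustration: the functional form (square roots of the area and of the product of ν and σ²tot, plus a constant), the function names, and the data. The data are synthetic, generated from known coefficients so that the fit can be checked; they are not taken from any of the cited studies.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_gipf(descriptors, values):
    """Least-squares fit of value = a*sqrt(area) + b*sqrt(nu_sigma2) + c.

    The functional form is a hypothetical choice for this sketch.
    """
    design = [[math.sqrt(area), math.sqrt(nu_sigma2), 1.0]
              for area, nu_sigma2 in descriptors]
    size = 3
    # Normal equations: (X^T X) coeffs = X^T y
    normal = [[sum(row[i] * row[j] for row in design) for j in range(size)]
              for i in range(size)]
    moment = [sum(row[i] * y for row, y in zip(design, values))
              for i in range(size)]
    return solve_linear(normal, moment)


# Synthetic (area, nu*sigma2_tot) pairs; NOT measured or computed data.
descriptors = [(100.0, 4.0), (144.0, 9.0), (121.0, 25.0),
               (169.0, 1.0), (196.0, 16.0)]
true_coeffs = (2.0, 3.0, -5.0)  # coefficients used to generate the test data
values = [true_coeffs[0] * math.sqrt(a) + true_coeffs[1] * math.sqrt(s)
          + true_coeffs[2] for a, s in descriptors]

coeffs = fit_gipf(descriptors, values)
print([round(c, 6) for c in coeffs])
```

Because the synthetic data are noise-free, the fitted coefficients should reproduce the generating ones to within floating-point error. With real data one would also report the correlation coefficient and standard deviation of the fit.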
3.3.3.2 Methodology

The first step in our procedure is to compute an optimized structure for each molecule and then to use this geometry to compute the electronic density and the electrostatic potential. A large portion of our work in this area has been carried out at the SCF/STO-5G*//SCF/STO-3G* level, although some other basis sets have also been used. We then compute V(r) on 0.28 bohr grids over molecular surfaces defined as the 0.001 au contour of the electronic density (Bader et al. 1987). The numbers of points on these grids are converted to surface areas (Å²), and VS,max and VS,min are determined. Our statistically based interaction indices Π, σ²tot, and ν are then calculated according to Eqs. (3.8) through (3.10).
$$\Pi = \frac{1}{n}\sum_{i=1}^{n}\left|V(\mathbf{r}_i) - \bar{V}_S\right| \tag{3.8}$$

$$\sigma_{tot}^{2} = \sigma_{+}^{2} + \sigma_{-}^{2} = \frac{1}{n_{+}}\sum_{i=1}^{n_{+}}\left[V^{+}(\mathbf{r}_i) - \bar{V}_S^{+}\right]^{2} + \frac{1}{n_{-}}\sum_{j=1}^{n_{-}}\left[V^{-}(\mathbf{r}_j) - \bar{V}_S^{-}\right]^{2} \tag{3.9}$$

$$\nu = \frac{\sigma_{+}^{2}\,\sigma_{-}^{2}}{\left[\sigma_{tot}^{2}\right]^{2}} \tag{3.10}$$

V(ri) is the value of V(r) at point ri on the surface, and V̄S is the average over the n surface points. Similarly, V+(ri) and V−(rj) are the positive and negative values of V(r) on the surface, n+ and n− are the numbers of such points, and V̄S+ and V̄S− are their averages:

$$\bar{V}_S^{+} = \frac{1}{n_{+}}\sum_{i=1}^{n_{+}} V^{+}(\mathbf{r}_i), \qquad \bar{V}_S^{-} = \frac{1}{n_{-}}\sum_{j=1}^{n_{-}} V^{-}(\mathbf{r}_j)$$

Statistically, Π is the average deviation of V(r) on the molecular surface; we view it as being indicative of the local polarity, or charge separation, that is present even in molecules having zero dipole moments (Brinck, Murray, and Politzer 1992a), e.g., BF3 and p-dinitrobenzene. We have shown that Π correlates in a general fashion with several empirical polarity scales and with the dielectric constant (Brinck, Murray, and Politzer 1992a; Murray et al. 1994).

σ²tot is the total variance of V(r) on the molecular surface, equal to the sum of the positive and negative variances, σ²+ and σ²−, which are calculated separately. It is a measure of the variability within the positive and negative regions of the surface potential; because the terms in Eq. (3.9) are squared, σ²tot is particularly sensitive to the extremes of V(r). We have found it to be an effective indicator of a molecule's overall tendency for noncovalent electrostatic interactions (Murray et al. 1994; Murray and Politzer 1994). In some instances it is preferable to use σ²+ or σ²− alone, instead of σ²tot (Brinck, Murray, and Politzer 1993; Murray, Brinck, and Politzer 1993; Politzer, Murray, and Flodmark 1996). The former refer specifically to tendencies for nucleophilic and electrophilic noncovalent interactions, respectively.

The function of ν, defined by Eq. (3.10), is to give the degree of balance between the positive and negative potentials on the surface (Murray et al. 1994; Murray et al. 1993a; Murray et al. 1993b). When σ²+ and σ²− are equal, ν attains a maximum value of 0.250; accordingly, the closer ν is to 0.250, the better able is the molecule to interact to a similar extent (whether strongly or weakly) through both its positive and negative potentials.
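The statistical quantities just described can be sketched in a few lines of code. The example below is a minimal illustration, not the authors' program: it assumes that the surface potential is available as a plain list of values, that every surface point carries equal weight, and that points with V(r) exactly zero belong to neither the positive nor the negative set. The function names and the sample values are hypothetical.

```python
def variance(samples):
    """Population variance (divide by the number of samples); 0.0 if empty."""
    if not samples:
        return 0.0
    mean = sum(samples) / len(samples)
    return sum((v - mean) ** 2 for v in samples) / len(samples)


def surface_indices(potentials):
    """Compute Pi, the positive/negative/total variances, and nu.

    `potentials` holds V(r) sampled at points on the molecular surface.
    """
    n = len(potentials)
    mean_all = sum(potentials) / n
    # Pi: average absolute deviation from the surface mean.
    pi = sum(abs(v - mean_all) for v in potentials) / n
    # Positive and negative regions are treated separately.
    sigma2_plus = variance([v for v in potentials if v > 0])
    sigma2_minus = variance([v for v in potentials if v < 0])
    sigma2_tot = sigma2_plus + sigma2_minus
    # nu: balance between the two variances; at most 0.25.
    nu = sigma2_plus * sigma2_minus / sigma2_tot ** 2 if sigma2_tot > 0 else 0.0
    return {
        "pi": pi,
        "sigma2_plus": sigma2_plus,
        "sigma2_minus": sigma2_minus,
        "sigma2_tot": sigma2_tot,
        "nu": nu,
    }


# Made-up surface potentials (kcal/mole), chosen so the two variances are equal.
sample = [10.0, 20.0, 30.0, -5.0, -15.0, -25.0]
indices = surface_indices(sample)
print(indices)
```

For this made-up sample the positive and negative variances are equal, so ν reaches its limiting value of 0.25, while Π is the mean absolute deviation from the overall surface average.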
Our most frequent use of ν has been as a factor in the product νσ²tot (Murray et al. 1994; Murray, Brinck, and Politzer 1996; Murray et al. 1993a; Murray et al. 1993b; Murray, Lane, and Politzer 1995b; Murray and Politzer 1994), which has been found to be a key term in representing properties that reflect the electrostatic interactions of a molecule with others of its own kind, e.g., boiling points and critical temperatures (Murray et al. 1993a), surface tension (Murray, Brinck, and Politzer 1996), and heats of vaporization (Murray and Politzer 1994) and sublimation (Politzer et al. 1997).

For illustrative purposes, Table 3.3 gives Π, σ²+, σ²−, σ²tot, ν, and νσ²tot for thirty molecules of a variety of types. More complete compilations can be found elsewhere (Murray et al. 1994; Murray et al. 1993a; Murray and Politzer 1994). The molecules in Table 3.3 are listed in order of increasing Π. It should be noted that some of the larger Π values are for molecules having zero dipole moments but, nevertheless, considerable internal charge separation, e.g., perfluorobenzene and 1,3,5-trinitrobenzene. Although Π and σ²tot may seem to be measuring similar effects, the data in Table 3.3 clearly show that these are quite different quantities. Π covers a range from 2 to 20 kcal/mole for most organic molecules, while σ²tot ranges from 3 to over 300 (kcal/mole)² (Murray et al. 1994); more important, they do
Table 3.3. Calculated global properties and experimentally derived Hildebrand parameters (δ) for a group of organic molecules.

Molecule                      Π      σ²+     σ²−    σ²tot      ν     νσ²tot     δ
cyclohexane                  2.16     2.5     0.7     3.2    0.171    0.55    16.8
n-octane                     2.32     2.6     1.0     3.6    0.201    0.72    15.3
n-hexane                     2.33     2.7     0.9     3.6    0.188    0.68    14.9
n-pentane                    2.35     2.8     0.9     3.6    0.194    0.70    14.5
1,3-butadiene                4.50     7.6     7.5    15.1    0.250    3.78    14.5
toluene                      4.63     6.8    11.1    17.9    0.236    4.22    18.2
benzene                      4.83     7.1     9.2    16.3    0.246    4.01    18.8
naphthalene                  5.12     8.1     7.8    15.9    0.250    3.98    20.3
carbon tetrachloride         5.22    28.8     2.5    31.3    0.073    2.28    17.6
phenanthrene                 5.28     9.7     7.1    16.8    0.244    4.10    20.0
anthracene                   5.30     8.8     6.8    15.6    0.246    3.84    20.3
diethyl ether                6.68     8.0   129.8   137.8    0.055    7.58    15.1
chloroform                   7.54    53.5     7.4    60.9    0.107    6.52    18.8
1-butanol                    7.54    35.0   165.9   201.0    0.144   28.94    23.1
pyridine                     8.55    18.5   212.3   230.8    0.074   17.08    21.7
2-propanol                   8.70    35.5   184.2   219.7    0.135   29.66    24.5
chloroethane                 9.00    14.3    28.4    42.7    0.223    9.52    17.0
acetone                      9.40    15.9   159.8   175.7    0.082   14.41    20.0
dichloromethane              9.66    46.3    13.8    60.1    0.177   10.64    20.0
ethanol                     10.05    45.1   182.4   227.5    0.159   36.17    26.4
perfluorobenzene            10.35    39.1     6.1    45.3    0.116    5.25
N,N-dimethylformamide       11.07    18.6   158.8   177.4    0.094   16.68    24.8
iodoform                    12.02    20.3    24.0    44.3    0.248   10.99    20.3
nitrobenzene                12.13    16.7   105.2   121.9    0.118   14.38    21.7
methanol                    12.79    49.6   181.5   231.0    0.169   39.04    29.2
dimethylsulfoxide           15.39    24.3   271.7   296.0    0.075   22.20    26.4
acetonitrile                17.12    23.6   167.8   191.4    0.108   20.67    24.0
formamide                   17.31    85.5   233.6   319.1    0.196   62.54    36.4
1,3,5-trinitrobenzene       18.70   105.3    47.4   152.7    0.214   32.68
nitromethane                19.90    34.4    81.7   116.0    0.209   24.24    25.2

Units are: kcal/mole for Π; (kcal/mole)² for σ²+, σ²−, σ²tot, and νσ²tot; MPa^1/2, i.e., (J/cm³)^1/2, for δ. ν is dimensionless.
not necessarily vary in the same direction. For the molecules in Table 3.3, the linear correlation coefficient between Π and σ²tot is only 0.721. It is interesting to look at the relative magnitudes of σ²+ and σ²− in relation to the known interactive behavior of some of the molecules. For example, the σ²+ values for diethyl ether, pyridine, and acetone are low, all under 20 (kcal/mole)², while σ²− for these three molecules is in each case over 125 (kcal/mole)². Diethyl ether, pyridine, and acetone are all known to
be good hydrogen-bond acceptors, but not good hydrogen-bond donors. Our V(r) results would predict this behavior. These σ²+ and σ²− values are reflected in the electrostatic balance term ν, which is between 0.05 and 0.08 for these molecules. Examples of molecules with ν approaching the limit of 0.250 are the aromatics in Table 3.3. The latter interact to a similar degree through both their positive and negative regions of V(r). The molecule with the highest value of σ²tot in Table 3.3 is formamide, which also has a relatively high ν. This combination yields the largest νσ²tot in Table 3.3. Figure 3.4 shows a fair correlation between νσ²tot and the Hildebrand solubility parameter δ (linear correlation coefficient = 0.930), which makes intuitive sense. The Hildebrand parameter, which is often used to characterize liquids, is defined as the square root of the cohesive energy density (Barton 1991), while νσ²tot can be viewed as reflecting how strongly a molecule interacts with others of the same kind (Murray et al. 1994).

3.3.3.3 Applications of GIPF

Table 3.4 presents our GIPF relationships for some properties that can be regarded as involving noncovalent interactions. The equations for the properties that are characteristic of pure compounds (normal boiling point, critical temperature, volume and pressure, heat of vaporization, and surface tension) invariably include area in some form and nearly always (critical volume being an exception) also a term containing νσ²tot. As we have mentioned earlier, the latter has emerged as being important for properties that are determined by how well a molecule interacts with its own kind. For example, νσ²tot for 1-butanol is greater than that of diethyl ether [28.94 vs. 7.5 (kcal/mole)²], largely because of their relative ν values, 0.144 and 0.055. These reflect the greater ability of 1-butanol to interact through both its positive and negative regions, while diethyl ether is primarily limited to its negative V(r).
The case of diethyl ether is typical of other compounds which operate as hydrogen-bond acceptors but not as donors; although (C2H5)2O has a strong negative potential near its oxygen, this does not promote highly favorable interactions with other molecules of the same kind because their positive potentials are so weak. To show how these factors affect a physical property, the normal boiling points of 1-butanol and diethyl ether are 390 and 308 K, respectively. Also in Table 3.4 are some solubility relationships (including partition coefficients) and one transport property. In these cases, the molecule in question is interacting with molecules of other kinds, and the product νσ²tot is found to be of less importance. Instead, σ²+ and σ²− often appear in the equations, along with terms involving molecular size.
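The comparison of 1-butanol and diethyl ether can be checked directly from the variances listed in Table 3.3. The sketch below recomputes ν and νσ²tot from the tabulated σ²+ and σ²− values; it relies on the identity νσ²tot = σ²+σ²−/(σ²+ + σ²−), which follows from Eq. (3.10) and shows why the product is held down by whichever variance is smaller. The function names are hypothetical, and small differences from the tabulated entries are to be expected because the table values are rounded.

```python
def balance(sigma2_plus, sigma2_minus):
    """nu = sigma2_plus * sigma2_minus / sigma2_tot**2, as in Eq. (3.10)."""
    total = sigma2_plus + sigma2_minus
    return sigma2_plus * sigma2_minus / total ** 2


def balanced_variance(sigma2_plus, sigma2_minus):
    """nu * sigma2_tot, which simplifies to sigma2_plus*sigma2_minus/sigma2_tot."""
    return sigma2_plus * sigma2_minus / (sigma2_plus + sigma2_minus)


# (sigma2_plus, sigma2_minus) in (kcal/mole)^2, copied from Table 3.3.
table_3_3 = {
    "1-butanol": (35.0, 165.9),
    "diethyl ether": (8.0, 129.8),
}

results = {
    name: (balance(plus, minus), balanced_variance(plus, minus))
    for name, (plus, minus) in table_3_3.items()
}

for name, (nu, product) in results.items():
    print(f"{name}: nu = {nu:.3f}, nu*sigma2_tot = {product:.2f} (kcal/mole)^2")
```

Recomputed this way, the product comes out near 28.9 (kcal/mole)² for 1-butanol and near 7.5 (kcal/mole)² for diethyl ether. The ether's large negative variance contributes little because its positive variance is so small.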
3.4 Summary

The use of the electrostatic potential in analyzing and predicting molecular interactive behavior and properties has increased remarkably over the past 25 years. In 1980, it was still reasonable to hope to at least mention, in one lengthy review chapter (Politzer and Daiker 1981), all of the papers that had been published in this area. In 1996, such an objective would be ridiculous. This popularity can be attributed to (a) the insight that V(r) can provide, especially into noncovalent interactions, and (b) the widespread availability of computational software packages of which it has become a standard feature. In this chapter, we have sought to convey some appreciation of the sort of questions that can be and have been addressed by means of the electrostatic potential, and further to indicate
Table 3.4. Some GIPF relationships.(a,b)

Relationship(c)                                                  N      R       S.D.    Ref.

Normal boiling point, Tbp:
  Tbp = […](area) + […]^0.5                                      100    0.948   37.0    158
Heat of vaporization, ΔHv:
  ΔHv = […](area)^0.5 + […]^0.5                                  40     0.971   2.03    25
Critical temperature, Tc:
  Tc = […](area)^0.5 + […]^0.25                                  66     0.909   60.7    158
Critical volume, Vc:
  Vc = […](area)^1.5 + […]                                       58     0.986   15.2    158
Critical pressure, Pc:
  Pc = […](area) + […]/area                                      57     0.910   4.8     158
Octanol/water partition coefficient, Pow:
  log Pow = […](area) + […]
Solubility in supercritical CO2 at 14 MPa and 308 K:
  ln(sol) = […](vol)^−1.5 − […]                                  21     0.95
Solubility of C60 in organic solvents:
  log(sol × 10⁴) = […]                                           20     0.954   0.475   167
Surface tension, γ:
  […]                                                            26     0.923   4.75    165
Diffusion constants of developing agents, D, in dry gelatin:
  D × 10⁷ = […](area) […]                                        10     0.990   0.09    166

a N is number of systems in data base; R is correlation coefficient; S.D. is standard deviation.
b Units: Tbp, K; Tc, K; Vc, cm³/mole; Pc, bar; ΔHv, kJ/mole; γ, dyn/cm; D, cm²/sec.
c All coefficients are positive numbers.
Ref. 162 is also listed in the source; its row is not recoverable.
some possible future directions. In particular, we believe that quantities derived from V(r), such as Π and σ²tot, will find increasing application in quantitatively describing macroscopic properties based on noncovalent interactions. Biological systems should provide some fruitful areas for exploration, e.g., drug-receptor binding constants. Overall, a continuing extensive use of the electrostatic potential to analyze an expanding array of phenomena can be anticipated.

Acknowledgment

We greatly appreciate the financial support of the Office of Naval Research, through contract N00014-97-1-0066 and Program Officer Dr. Richard S. Miller.
Notes

1. The statistical correction to α₂ᴴ is applied to those molecules having N similar hydrogens available for hydrogen bonding [82]. α₂ᴴ is defined by α₂ᴴ = (log K_A^H + 1.1)/4.636, where K_A^H is the equilibrium constant for a 1:1 complex of the donor and a reference acceptor. Then K_A^H = N K′_A^H, where K′_A^H is the corrected value. Accordingly, log K_A^H = log N + log K′_A^H and α₂ᴴ (corrected) = (log K′_A^H + 1.1)/4.636.

2. The statistical correction to log KHB is applied to those molecules having N indistinguishable atoms that can accept a hydrogen bond. Then KHB = N K′HB, where K′HB is the corrected value, and log K′HB = log KHB − log N.

References

Abraham, M. H., P. P. Duce, P. L. Grellier, D. V. Prior, J. J. Morris, and P. J. Taylor. 1988. "A Thermodynamically Based Scale of Solute Hydrogen-bond Acidity." Tetrahedron Letts. 29, 1587.
Abraham, M. H., P. L. Grellier, D. V. Prior, P. P. Duce, J. J. Morris, and P. J. Taylor. 1989a. "A Scale of Solute Hydrogen-bond Acidity Based on log K Values for Complexation in Tetrachloromethane." J. Chem. Soc., Perkin Trans. 2, 699.
Abraham, M. H., P. L. Grellier, D. V. Prior, J. J. Morris, and P. J. Taylor. 1990. "A Scale of Solute Hydrogen-bond Basicity using log K Values for Complexation in Tetrachloromethane." J. Chem. Soc., Perkin Trans. 2, 521.
Abraham, M. H., P. L. Grellier, D. V. Prior, J. J. Morris, P. J. Taylor, C. Laurence, and M. Berthelot. 1989b. "A Thermodynamically-Based Scale of Solute Hydrogen Bond Basicity." Tetrahedron Letts. 30, 2571.
Bader, R. F. W., M. T. Carroll, J. R. Cheeseman, and C. Chang. 1987. "Properties of Atoms in Molecules: Atomic Volumes." J. Am. Chem. Soc. 109, 7968.
Barton, A. F. M. 1991. Handbook of Solubility Parameters and Other Cohesion Parameters. CRC Press, Boca Raton, FL.
Benzel, M. A., and C. E. Dykstra. 1983. "The Nature of Hydrogen Bonding in the NN-HF, OC-HF, and HCN-HF Complexes." J. Chem. Phys. 78, 4052.
Berthelot. 1992. Personal communication.
Berthier, G., R. Bonaccorsi, E. Scrocco, and J. Tomasi. 1972. "The Electrostatic Molecular Potential for Imidazole, Pyrazole, Oxazole and Isoxazole." Theor. Chim. Acta 26, 101.
Berthod, H., and A. Pullman. 1975. "The Molecular Electrostatic Potential of the Dimethyl Phosphate Anion: An Ab Initio Study." Chem. Phys. Lett. 32, 233.
Berthod, H., and A. Pullman. 1978. "Electrostatic Molecular Potential Hydration and Cation Binding Scheme of C3-Endo-gg-Ribose." Theor. Chim. Acta 47, 59.
Bertinotti, F., G. Giacomello, and A. M. Liquori. 1956. "The Structure of Heterocyclic Compounds Containing Nitrogen. I. Crystal and Molecular Structure of s-Tetrazine." Acta Cryst. 9, 510.
Besler, B. H., K. M. Merz, Jr., and P. A. Kollman. 1990. "Atomic Charges Derived from Semiempirical Methods." J. Comp. Chem. 11, 431.
Blackstock, S. C., J. P. Lorand, and J. K. Kochi. 1987. "Charge Transfer Interactions of Amines with Tetrahalomethanes. X-ray Crystal Structures of the Donor-Acceptor Complexes of Quinuclidine and Diazabicyclo-[2.2.2]octane with Carbon Tetrabromide." J. Org. Chem. 52, 1451.
Bonaccorsi, R., C. Ghio, E. Scrocco, and J. Tomasi. 1980. "The Effect of Intramolecular Interactions on the Transferability Properties of Localized Descriptions of Chemical Groups." Israel J. Chem. 19, 109.
Bonaccorsi, R., A. Pullman, E. Scrocco, and J. Tomasi. 1972a. "N- versus O-Proton Affinities of the Amide Group: Ab Initio Electrostatic Molecular Potentials." Chem. Phys. Lett. 12, 622.
Bonaccorsi, R., A. Pullman, E. Scrocco, and J. Tomasi. 1972b. "The Molecular Electrostatic Potentials for the Nucleic Acid Bases: Adenine, Thymine and Cytosine." Theor. Chim. Acta 24, 51.
Bonaccorsi, R., E. Scrocco, and J. Tomasi. 1970. "Molecular SCF Calculations for the Ground State of Some Three-Membered Ring Molecules: (CH2)3, (CH2)2NH, (CH2)2NH2+, (CH2)2O, (CH2)2S, (CH2)2CH2, and N2CH2." J. Chem. Phys. 52, 5270.
Bonaccorsi, R., E. Scrocco, and J. Tomasi. 1971. "Molecular SCF Calculations for the Ground State of Some Three-Membered Ring Molecules: Cis and Trans Diaziridine, Oxaziridine and the Corresponding Imminium Ions." Theor. Chim. Acta 21, 17.
Bonaccorsi, R., E. Scrocco, J. Tomasi, and A. Pullman. 1975. "Ab Initio Molecular Electrostatic Potentials, Guanine Compared to Adenine." Theor. Chim. Acta 36, 339.
Boyd, R. J., and L.-C. Wang. 1989. "The Effect of Electron Correlation on the Topological and Atomic Properties of the Electron Density Distributions of Molecules." J. Comp. Chem. 10, 367.
Breneman, C., and M. Martinov. 1996. "The Use of the Electrostatic Potential Fields in QSAR and QSPR." In Molecular Electrostatic Potentials: Concepts and Applications, edited by J. S. Murray and K. D. Sen. Elsevier, Amsterdam.
Breneman, C. M., and K. B. Wiberg. 1990. "Determining Atom-Centered Monopoles from Molecular Electrostatic Potentials. The Need for High Sampling Density in Formamide Computational Analysis." J. Comp. Chem. 11, 361.
Brinck, T., J. S. Murray, and P. Politzer. 1992a. "Quantitative Determination of the Total Local Polarity (Charge Separation) in Molecules." Mol. Phys. 76, 609.
Brinck, T., J. S. Murray, and P. Politzer. 1992b. "Surface Electrostatic Potentials of Halogenated Methanes as Indicators of Directional Intermolecular Interactions." Int. J. Quant. Chem., Quant. Biol. Symp. 19, 57.
Brinck, T., J. S. Murray, and P. Politzer. 1993. "Octanol/Water Partition Coefficients Expressed in Terms of Solute Molecular Surface Areas and Electrostatic Potentials." J. Org. Chem. 58, 7070.
Buckingham, A. D., and P. W. Fowler. 1985. "A Model for the Geometries of van der Waals Complexes." Can. J. Chem. 63, 2018.
Cheney, B. V. 1982. "Structural Factors Affecting Aryl Hydrocarbon Hydroxylase Induction of Dibenzo-p-Dioxins and Dibenzofurans." Int. J. Quant. Chem. 21, 445.
Chirlian, L. E., and M. M. Francl. 1987. "Atomic Charges Derived from Electrostatic Potentials: A Detailed Study." J. Comp. Chem. 8, 894.
Cox, E. G., D. W. Cruickshank, and J. A. S. Smith. 1958. Proc. Roy. Soc. A 247, 1.
Dahl, J. P., and J. Avery, eds. 1984. Local Density Approximations in Quantum Chemistry and Solid State Physics. Plenum Press, New York.
Daudel, R., H. Leronzo, R. Cimiraglia, and J. Tomasi. 1978. "Dependence of the Electrostatic Molecular Potential upon the Basis Set and the Method of Calculation of 1 the Wave Function. Case of the Ground3A, and A1 States of Formaldehyde." Int. J. Quant. Chem. 13, 537.
DiPaulo, T., and C. Sandorfy. 1974. "On the Hydrogen Bond Breaking Ability of Fluorocarbons Containing Higher Halogens." Can. J. Chem. 52, 3612.
Dumas, J.-M., H. Peurichard, and M. Gomel. 1978. "Base Interactions as Models of Weak Charge Transfer Interactions: Comparison with Strong Charge-Transfer and Hydrogen-bond Interactions." J. Chem. Res. (S), 54.
Espinosa, E., C. Lecomte, N. E. Ghermani, J. Devemy, M. M. Rohmer, M. Benard, and E. Molins. 1996. "Hydrogen Bonds: First Quantitative Agreement between Electrostatic Potential Calculations from Experimental X-(X+N) and Theoretical Ab Initio SCF Models." J. Am. Chem. Soc. 118, 2501.
Etchebest, C., R. Lavery, and A. Pullman. 1982. "The Calculations of Molecular Electrostatic Potential from a Multipole Expansion Based on Localized Orbitals and Developed at Their Centroids: Accuracy and Applicability for Macromolecular Computations." Theor. Chim. Acta 62, 17.
Etter, M. C., Z. Urbanczyk-Lipowska, M. Zia-Ebrahimi, and T. W. Pananto. 1990. "Hydrogen Bond Directed Cocrystallization and Molecular Recognition Properties of Diaryl Ureas." J. Am. Chem. Soc. 112, 8415.
Exner, O. 1988. Correlation Analysis of Chemical Data. Plenum Press, New York.
Ferenczy, G. G., C. A. Reynolds, and W. G. Richards. 1990. "Semiempirical AM1 Electrostatic Potentials and AM1 Electrostatic Potential Derived Charges: A Comparison with ab initio Values." J. Comp. Chem. 11, 159.
Fiskin, A. M., and M. Beer. 1965. "Determination of Base Sequence in Nucleic Acids with the Electron Microscope. IV. Nucleoside Complexes with Certain Metal Ions." Biochem. 4, 1287.
Francl, M. M., C. Carey, L. E. Chirlian, and D. M. Gange. 1996. "Charges Fit to Electrostatic Potentials. 2. Can Atomic Charges be Unambiguously Fit to Electrostatic Potentials?" J. Comp. Chem. 17, 367.
Friedman, O. M., G. N. Mahapatra, and R. Stevenson. 1963. "The Methylation of Deoxyribonucleosides by Diazomethane." Biochem. Biophys. Acta 68, 144.
Gatti, C., P. J. MacDougall, and R. F. W. Bader. 1988. "Effect of Electron Correlation on the Topological Properties of Molecular Charge Distributions." J. Chem. Phys. 88, 3792.
Giessner-Prettre, C., and A. Pullman. 1975. "On the Molecular Electrostatic Potential Obtained from CNDO Wave Functions." Theor. Chim. Acta 37, 335.
Gotch, A. J., A. W. Garrett, and T. S. Zwier. 1991. "The Ham Bands Revisited: Spectroscopy and Photophysics of the C6H6-CCl4 Complex." J. Phys. Chem. 95, 9699.
Hagelin, J., T. Brinck, M. Berthelot, J. S. Murray, and P. Politzer. 1995. "Family Dependent Relationships Between Computed Molecular Surface Quantities and Solute Hydrogen Bond Acidity/Basicity and Solute-Induced Methanol O-H Infrared Frequency Shifts." Can. J. Chem. 73, 483.
Ham, J. S. 1953. "New Electronic State in Benzene." J. Chem. Phys. 21, 756.
Hayes, D. M., and P. Kollman. 1976. "Electrostatic Potentials of Proteins. 2. Role of Electrostatics in a Possible Catalytic Mechanism for Carboxypeptidase A." J. Am. Chem. Soc. 98, 7811.
Hobza, P., H. L. Selzle, and E. W. Schlag. 1990. "Floppy Structure of the Benzene Dimer: Ab Initio Calculation on the Structure and Dipole Moment."
Hobza, P., H. L. Selzle, and E. W. Schlag. 1993. "New Structure for the Most Stable Isomer of the Benzene Dimer: A Quantum Chemical Study." J. Phys. Chem. 97, 3937.
Hohenberg, P., and W. Kohn. 1964. "Inhomogeneous Electron Gas." Phys. Rev. B 136, 864.
Hooper, H. O. 1964. "Lack of Charge Transfer in Aromatic Charge-Transfer Complexes." J. Chem. Phys. 41, 599.
Kamlet, M. J., J.-L. M. Abboud, M. H. Abraham, and R. W. Taft. 1983. "Linear Solvation Energy Relationships. 23. A Comprehensive Collection of the Solvatochromic Parameters, π*, α, and β, and Some Methods for Simplifying the Generalized Solvatochromic Equation." J. Org. Chem. 48, 2877.
Kamlet, M. J., J.-L. M. Abboud, and R. W. Taft. 1981. "Linear Solvation Energy Relationship." Prog. Phys. Org. Chem. 13, 485.
Kamlet, M. J., M. E. Jones, J.-L. M. Abboud, and R. W. Taft. 1979. "Linear Solvation Energy Relationships. Part 2. Correlations of Electronic Spectral Data for Aniline Indicators with Solvent π* and β Values." J. Chem. Soc., Perkin Trans. 2, 342.
Kamlet, M. J., A. Solomonovici, and R. W. Taft. 1979. "Linear Solvation Energy Relationships. 5. Correlations between Infrared Δν Values and the β Scale of Hydrogen Bond Acceptor Basicities." J. Am. Chem. Soc. 101, 3734.
Kamlet, M. J., and R. W. Taft. 1976. "The Solvatochromic Comparison Method. I. The β-Scale of Solvent Hydrogen-Bond Acceptor (HBA) Basicities." J. Am. Chem. Soc. 98, 377.
Kollman, P. 1977. "A General Analysis of Noncovalent Intermolecular Interactions." J. Am. Chem. Soc. 99 (15), 4875-4894.
Kollman, P., J. McKelvey, A. Johansson, and S. Rothenberg. 1975. "Theoretical Studies of
Hydrogen-Bond Dimers. Complexes Involving HF, H2O, NH3, HCl, H2S, PH3, HCN, HNC, HCP, CH2NH, H2CS, H2CO, CH4, CF3H, C2H2, C6H6, F− and H3O+." J. Am. Chem. Soc. 97, 955.
Labanowski, J. K., and J. W. Andzelm, Eds. 1991. Density Functional Methods in Chemistry. Springer, Berlin.
Laidig, K. E. 1994. "Density Functional Methods and the Spatial Distribution of Electronic Charge." Chem. Phys. Lett. 225, 285.
Lavery, R., S. Corbin, and B. Pullman. 1982. "The Molecular Electrostatic Potential and Steric Accessibility of C-DNA." Theor. Chim. Acta 60, 513.
Lavery, R., A. Pullman, and B. Pullman. 1980. "The Electrostatic Molecular Potential of Yeast tRNAPhe. III. The Molecular Potential and the Steric Accessibility Associated with the Phosphate Groups." Theor. Chim. Acta 57, 233.
Lavery, R., and B. Pullman. 1981. "Molecular Electrostatic Potential on the Surface Envelopes of Macromolecules: B-DNA." Int. J. Quant. Chem. 20, 259.
Lawley, P. D. 1957. "The Relative Reactivities of Deoxyribonucleotides and of the Bases of DNA towards Alkylating Agents." Biochem. Biophys. Acta 26, 450.
Legon, A. C., and D. J. Millen. 1987. "Directional Character, Strength and Nature of the Hydrogen Bond in Gas-Phase Dimers." Acc. Chem. Res. 20, 39.
Leroy, G., G. Louterman-Leloup, and P. Ruelle. 1976a. "Contribution to the Theoretical Study of the Hydrogen Bond. II. The Dimers of Hydrogen Sulfide." Bull. Soc. Chim. Belg. 85, 219.
Leroy, G., G. Louterman-Leloup, and P. Ruelle. 1976b. "Contribution to the Theoretical Study of the Hydrogen Bond. III. The Dimers of Hydrogen Fluoride." Bull. Soc. Chim. Belg. 85, 229.
Leroy, G., G. Louterman-Leloup, and P. Ruelle. 1976c. "Contribution to the Theoretical Study of the Hydrogen Bond. I. The Dimers of Water." Bull. Soc. Chim. Belg. 85, 205.
Levy, M., S. C. Clement, and Y. Tal. 1981. "Correlation Energies from Hartree-Fock Electrostatic Potentials at Nuclei and Generation of Electrostatic Potentials from Asymptotic and Zero-Order Information." In Chemical Applications of Atomic and Molecular Electrostatic Potentials, P. Politzer and D. G. Truhlar, Eds. Plenum Press, New York.
Lin, S., and C. E. Dykstra. 1986. Chem. Phys. 107, 343.
Loew, G., and D. S. Berkowitz. 1975. "Quantum Chemical Studies of Morphinelike Opiate Narcotic Analgesics: I. Effect of N-substituent Variations." J. Med. Chem. 18, 656.
Long, G., J. McKinney, and L. Pedersen. 1987. "Polychlorinated Dibenzofuran (PCDF) Binding to the Ah Receptor(s) and Associated Enzyme Induction. Theoretical Model based on Molecular Parameters." Quant. Struct.-Act. Relat. 6, 1.
Long, J. R., and D. J. Hansson. 1983. "Dioxin Report: A C&EN Special Issue." Chemical and Engineering News, June 6, p. 20.
Lorand, J. P., and A. L. Spek. Private communication.
Luque, F. J., F. Illas, and M. Orozco. 1990. "Comparative Study of the Molecular Electrostatic Potential Obtained from Different Wavefunctions. Reliability of the Semiempirical MNDO Wavefunction." J. Comp. Chem. 11, 416.
Luque, F. J., and M. Orozco. 1990. "Reliability of the AM1 Wavefunction to Compute Molecular Electrostatic Potentials." Chem. Phys. Lett. 168, 269.
Martin, M., R. Carbo, C. Petrongolo, and J. Tomasi. 1975. "Structure-Activity Relationships of Phenylalamine. A Comparison of Quantum Mechanical SCF "ab initio" and Semiempirical Calculations." J. Am. Chem. Soc. 97, 1338.
Martin, M., F. Sanz, M. Campillo, L. Pardo, J. Perez, and J. Turmo. 1983a. "Quantum Chemical Study of the Molecular Patterns of MAO Inhibitors and Substrates." Int. J. Quant. Chem. 23, 1627.
Martin, M., F. Sanz, M. Campillo, L. Pardo, J. Perez, and J. Turmo. 1983b. "Quantum Chemical Structure Activity Relationships on β-Carbolines as Natural Monoamine Oxidase Inhibitors." Int. J. Quant. Chem. 23, 1643.
Martinelli, A., and C. Petrongolo. 1980. "Ab Initio Study of the Internal Rotation of
the Electrostatic Molecular Potential of Tazolol and Comparison with Similar Compounds." J. Phys. Chem. 84, 105.
McKinney, J. D., T. Darden, J. A. Lyerly, and L. G. Pederson. 1985. "QSAR and Related Compound Binding to the Ah Receptor(s). Theoretical Model Based on Molecular Parameters and Molecular Mechanics." Quant. Struct.-Act. Relat. 4, 166.
McKinney, J. D., G. A. Long, and L. G. Pederson. 1984. "QSAR and Dioxin Binding to Cytosol Receptors: A Theoretical Model Based on Molecular Parameters." Quant. Struct.-Act. Relat. 3, 99.
McKinney, J. D., and L. G. Pederson. 1986. "Biological Activity of Polychlorinated Biphenyls (PCBs) Related to Conformational Structure." Biochem. J. 240, 621.
Møller, C., and M. S. Plesset. 1934. "Note on an Approximation Treatment for Many-Electron Systems." Phys. Rev. 46, 618.
Murray, J. S., T. Brinck, and P. Politzer. 1996. "Relationships of Molecular Surface Electrostatic Potentials to Some Macroscopic Properties." Chem. Phys. 204, 289.
Murray, J. S., T. Brinck, P. Lane, K. Paulsen, and P. Politzer. 1994. "Statistically-Based Interaction Indices Derived From Molecular Surface Electrostatic Potentials; A General Interaction Properties Function (GIPF)." J. Mol. Struct. (Theochem) 307, 55.
Murray, J. S., T. Brinck, and P. Politzer. 1993. "Partition Coefficients of Nitroaromatics Expressed in Terms of Their Molecular Surface Areas and Electrostatic Potentials." J. Phys. Chem. 97, 13807.
Murray, J. S., P. Evans, and P. Politzer. 1990. "A Comparative Analysis of the Electrostatic Potentials of Some Structural Analogues of 2,3,7,8-Tetrachlorodibenzo-p-dioxin and of Related Aromatic Systems." Int. J. Quant. Chem. 37, 271.
Murray, J. S., S. G. Gagarin, and P. Politzer. 1995. "Representation of C60 Solubilities in Terms of Computed Molecular Surface Electrostatic Potentials and Areas." J. Phys. Chem. 99, 12081.
Murray, J. S., M. E. Grice, P. Politzer, and M. C. Etter. 1991a. "A Computational Analysis of Some Diaryl Ureas in Relation to Their Observed Crystalline Hydrogen Bonding Patterns." Mol. Eng. 1, 95.
Murray, J. S., M. E. Grice, P. Politzer, and J. R. Rabinowitz. 1990. "Evaluation of a Finite Multipole Expansion Technique for the Computation of Electrostatic Potentials of Dibenzo-p-dioxins and Related Systems." J. Comp. Chem. 11, 112.
Murray, J. S., P. Lane, T. Brinck, K. Paulsen, M. E. Grice, and P. Politzer. 1993a. "Relationships of Critical Constants and Boiling Points to Computed Molecular Surface Properties." J. Phys. Chem. 97, 9369.
Murray, J. S., P. Lane, T. Brinck, and P. Politzer. 1993b. "Relationships Between Computed Molecular Properties and Solute/Solvent Interactions in Supercritical Solutions." J. Phys. Chem. 97, 5144.
Murray, J. S., P. Lane, T. Brinck, P. Politzer, and P. Sjoberg. 1991b. "Electrostatic Potentials on the Molecular Surface of Some Cyclic Ureides." J. Phys. Chem. 95, 844.
Murray, J. S., P. Lane, and P. Politzer. 1990. "Electrostatic Potential Analysis of the π Regions of Some Naphthalene Derivatives." J. Mol. Struct. (Theochem) 209, 163.
Murray, J. S., P. Lane, and P. Politzer. 1995a. "Relationships Between Impact Sensitivities and Molecular Surface Electrostatic Potentials of Nitroaromatic and Nitroheterocyclic Molecules." Mol. Phys. 85, 1.
Murray, J. S., P. Lane, and P. Politzer. 1995b. "Special Relationships of Fluorinated Methane/Ethane Boiling Points and Heats of Vaporization to Molecular Properties." J. Mol. Struct. (Theochem) 342, 15.
Murray, J. S., K. Paulsen, and P. Politzer. 1994. "Molecular Surface Electrostatic Potentials in the Analysis of Non-Hydrogen-Bonding Noncovalent Interactions." Proc. Ind. Acad. Sci. (Chem. Sci.) 106, 267.
Murray, J. S., and P. Politzer. 1987. "Electrostatic Potentials of Some Dibenzo-p-dioxins in Relation to their Biological Activities." Theor. Chim. Acta 72, 507.
Murray, J. S., and P. Politzer. 1991. "Correlations Between the Solvent Hydrogen-Bond-Donating Parameter α and the Calculated Molecular Surface Electrostatic Potential." J. Org. Chem. 56, 6715.
Murray, J. S., and P. Politzer. 1992. "Relationships between Solute Hydrogen Bond Acidity/Basicity and the Calculated Electrostatic Potential." J. Chem. Res. (S), 109.
Murray, J. S., and P. Politzer. 1994. "A General Interaction Properties Function (GIPF): An Approach to Understanding and Predicting Molecular Interactions." In Quantitative Treatments of Solute/Solvent Interactions, J. S. Murray and P. Politzer, Eds. Elsevier, Amsterdam.
Murray, J. S., S. Ranganathan, and P. Politzer. 1991. "Correlations Between the Solvent Hydrogen Bond Acceptor Parameter β and the Calculated Molecular Electrostatic Potential." J. Org. Chem. 56, 3734.
Murray, J. S., J. M. Seminario, M. C. Concha, and P. Politzer. 1992. "An Analysis of Molecular Electrostatic Potentials Obtained by a Local Density Functional Approach." Int. J. Quant. Chem. 44, 113.
Murray, J. S., and K. D. Sen, Eds. 1996. Molecular Electrostatic Potentials: Concepts and Applications. Vol. 3, Theoretical and Computational Chemistry. Elsevier, Amsterdam.
Murray, J. S., B. A. Zilles, K. Jayasuriya, and P. Politzer. 1986. "Comparative Analysis of the Electrostatic Potentials of Dibenzofuran and Some Dibenzo-p-Dioxins." J. Am. Chem. Soc. 108, 915.
Murray-Rust, P., W. C. Stallings, C. T. Monti, R. K. Preston, and J. P. Glusker. 1983. "Intermolecular Interactions of the C-F Bond: The Crystallographic Environment of Fluorinated Carboxylic Acids and Related Structures." J. Am. Chem. Soc. 105, 3206.
Nagy, P., J. G. Angyan, and G. Naray-Szabo. 1987. "Molecular Electrostatic Fields From Bond Fragments." Int. J. Quant. Chem. 31, 927.
Naray-Szabo, G. 1979. "Electrostatic Isopotential Maps for Large Biomolecules." Int. J. Quant. Chem. 16, 265.
Naray-Szabo, G. 1983. "Unusually Large Electrostatic Field Effect of the Buried Aspartate in Serine Proteinases: Source of Catalytic Power." Int. J. Quant. Chem. 23, 723.
Osman, R., H.
Weinstein, and S. Topiol. 1981. "Models for Active Sites of Metalloenzymes. II." Ann. N.Y. Acad. Sci. 367, 356. Parr, R. G. 1983. "Density Functional Theory." Ann. Rev. Phys. Chem. 34, 631. Parr, R. G., and W. Yang. 1989. Density-Functional Theory of Atoms and Molecules. Oxford University Press, New York. Pathak, R. K., and S. R. Gadre. 1990. "Maximal and Minimal Characteristics of Molecular Electrostatic Potentials." J. Chem. Phys. 93, 1770. Perahia, D., and A. Pullman. 1978. "The Molecular Electrostatic Potentials of the Complementary Base Pairs of DNA." Theor. Chim. Acta 48, 263. Petrongolo, C., H. J. T. Preston, and J. J. Kaufman. 1978. "Ab Initio LCAO-MO-SCF Calculations of the Electrostatic Molecular Potential of Chlorpromazine and Promazone." Int. J. Quant. Chem. 13, 457. Petrongolo, C., and J. Tomasi. 1975. "The Use of the Electrostatic Molecular Potential in Quantum Pharmacology. I. Ab Initio Results." Int. J. Quant. Chem., Quant. Biol. Symp. 2, 181. Platt, D. E., and D. Silverman. 1996. "Registration, Orientation, and Similarity of Molecular Electrostatic Potentials Through Multipole Matching." J. Comp. Chem. 17, 358. Poland, A., W. F. Greenlee, and A. S. Kende. 1979. "Studies on the Mechanism of Action of the Chlorinated Dibenzo-p-dioxins and Related Compounds." Ann. N.Y. Acad. Sci. 320, 151. Poland, A., and J. C. Knutson. 1982. "2,3,7,8-Tetrachlorodibenzo-p-dioxin and Related Halogenated Aromatic Hydrocarbons: Examination of the Mechanism of Toxicity." Ann. Rev. Pharmacol. Toxicol. 22, 517. Politzer, P. 1980. "Observations on the Significance of the Electrostatic Potentials at the Nuclei of Atoms and Molecules." Israel J. Chem. 19, 224. Politzer, P. 1981. "Relationships between Energies of Atoms and Molecules and the Electrostatic Potentials at their Nuclei." In Chemical Applications of Atomic and
82
MOLECULAR ORBITAL CALCULATIONS FOR BIOLOGICAL SYSTEMS
Molecular Electrostatic Potentials. P. Politzer and D. G. Truhlar, eds. Plenum Press, New York. Politzer, P. 1987. "Single-Particle Density in Physics and Chemistry." In Single-Particle Density in Physics and Chemistry. N. H. March and B. M. Deb, eds. Academic Press, New York. Politzer, P. 1988. "Computational Approaches to the Identification of Suspect Toxic Molecules." Tox. Letts. 43, 257. Politzer, P., L. Abrahmsen, and P. Sjoberg. 1984. "The Effects of Amine and Nitro Substituents upon the Electrostatic Potential of an Aromatic Ring." J. Am. Chem. Soc. 106, 855. Politzer, P., and K. C. Daiker. 1981. "Models for Chemical Reactivity." In The Force Concept in Chemistry. B. M. Deb, ed. Van Nostrand Reinhold, New York. Politzer, P., P. Lane, J. S. Murray, and T. Brinck. 1992a. "Investigation of Relationships Between Solute Molecule Surface Electrostatic Potentials and Solubilities in Supercritical Fluids." J. Phys. Chem. 96, 7938. Politzer, P., P. R. Laurence, L. Abrahmsen, B. A. Zilles, and P. Sjoberg. 1984. "The Aromatic C-NO2 Bond as a site for Nucleophilic Attack." Chem. Phys. Lett. 111, 75. Politzer, P., P. R. Laurence, and K. Jayasuriya. 1985. "Structure-Activity Correlation in Mechanism Studies and Prediction Toxicology." Env. Health Persp. 61, 191. Politzer, P., and J. S. Murray. 1990. "Chemical Applications of Molecular Electrostatic Potentials." Trans. Amer. Cryst. Assoc. 26, 23. Politzer, P., and J. S. Murray. 1991. "Molecular Electrostatic Potentials and Chemical Reactivity." In Reviews in Computational Chemistry. K. B. Lipkowitz and D. B. Boyd, eds. VCH Publishers, New York. Politzer, P., and J. S. Murray. 1995. "General and Theoretical Aspects of the C-X bonds X=F, C1, Br, I): Integration of Theory and Experiment." In Supplement D2: The Chemistry of the Halides, Pseudo-halides and Azides, Part 1. S. Patai, and Z. Rappoport, eds. Wiley, New York. Politzer, P., and J. S. Murray. 1996. 
"Relationships of Electrostatic Potentials to Intrinsic Molecular Properties." In Molecular Electrostatic Potentials: Concepts and Applications. J. S. Murray and K. D. Sen, eds. Elsevier, Amsterdam. Politzer, P., J. S. Murray, and P. Flodmark. 1996. "Relationship Between Measured Diffusion Coefficients and Calculated Molecular Surface Properties." J. Phys. Chem. 100, 5538. Politzer, P., J. S. Murray, M. E. Grice, M. DeSalvo, and E. Miller. 1997. "Calculation of Heats of Sublimation and Solid Phase Heats of Formation." Mol. Phys. In press. Politzer, P., J. S. Murray, P. Lane, and T. Brinck. 1993. "Relationships between Solute Molecular Properties and Solubility in Supercritical CO2." J. Phys. Chem. 97, 729. Politzer, P., J. S. Murray, J. M. Seminario, and R. S. Miller. 1992b. "Computational Analysis of Dinitramine and Chlorine Derivatives of Benzene and s-Tetrazine." J. Mol. Struct. (Theochem) 262, 155. Politzer, P., and D. G. Truhlar, Eds. 1981. Chemical Applications of Atomic and Molecular Electrostatic Potentials. Plenum Press, New York. Pullman, A., and H. Berthod. 1976. "Cation Binding to Biomolecules. The Screening of the Electrostatic Potential of the Phosphate Group by Mono- and Divalent Cations." Chem. Phys. Lett. 41, 205. Pullman, A., and B. Pullman. 1980. "Electrostatic Effect of Macromolecular Structure on the Biochemical Reactivity of the Nucleic Acids. Significance for Chemical Carcinogenesis." Int. J. Quant. Chem., Quant. Biol. Symp. 7, 245. Pullman, A., and B. Pullman. 1981a. "The Electrostatic Molecular Potential of the Nucleic Acids." In Chemical Applications of Atomic and Molecular Electrostatic Potentials. Plenum Press, New York. Pullman, A., and B. Pullman. 1981b. "The Electrostatic Molecular Potential of the Nucleic Acids." In Chemical Applications of Atomic and Molecular Electrostatic Potentials. P. Politzer and D. G. Truhlar, Eds. Plenum Press, New York. Pullman, B., D. Perahia, and D. Cauchy. 1979. "The Molecular Electrostatic Potential of
THE MOLECULAR ELECTROSTATIC POTENTIAL
83
the B-DNAhelix. VI. The regions of the base pairs in poly(dG.dC) and polly (dA.dT)." Nucleic Acids Res. 6, 3821. Rabinowitz, J. R., K. Namboodiri, and H. Weinstein. 1986. "A Finite Expansion Method for the Calculation and Interpretation of Molecular Electrostatic Potentials." Int. J. Quant. Chem. 29, 1697. Ramasubba, N., R. Parthasarathy, and P. Murray-Rust. 1986. "Angular Preferences of Intermolecular Forces Around Halogen Centers: Preferred Directions of Approach of Electrophiles and Nucleophiles around the Carbon-Halogen Bond." J. Am. Chem. Soc. 108, 4308. Scrocco, E., and J. Tomasi. 1973. "The Electrostatic Molecular Potential as a Tool for the Interpretation of Molecular Properties." In Topics in Current Chemistry. SpringerVerlag, Berlin. Scrocco, E., and J. Tomasi. 1978. "Electronic Molecular Structure, Reactivity and Intermolecular Forces: A Heuristic Interpretation by Means of Electrostatic Molecular Potentials." Adv. Quant. Chem. 11, 115. Seminario, J. M., J. S. Murray, and P. Politzer. 1991. "First-Principles Theoretical Methods for the Calculation of Electronic Charge Densities and Electrostatic Potentials." In The Application of Charge Density Research to Chemistry and Drug Design. Plenum Press, New York. Seminario, J. M., and P. Politzer, Eds. 1995. Modern Density Functional Theory: A Tool for Chemistry. Vol. 2, Theoretical and Computational Chemistry. Elsevier, Amsterdam. Sen, K. D., and P. Politzer. 1989. "Characteristic Features of the Electrostatic Potentials of Singly-Negative Monatomic Ions." J. Chem. Phys. 90, 4370. Sheridan, R. P., and L. C. Allen. 1981. "The Active Site Electrostatic Potential of Human Carbonic Anhydrase." J. Am. Chem. Soc. 103, 1544. Sjoberg, P. 1989. Calculated Properties at Molecular Surfaces: Guides to Chemical Reactivity. Ph.D. dissertation, University of New Orleans, New Orleans, La. Sjoberg, P., J. S. Murray, T. Brinck, P. Evans, and P. Politzer. 1990. 
"The Use of the Electrostatic Potential at the Molecular Surface in Recognition Interactions: Dibenzop-dioxins and Related Systems." Journal of Molecular Graphics 8, 81. Sjoberg, P., and P. Politzer. 1990. "The Use of the Electrostatic Potential at the Molecular Surface to Interpret and Predict Nucleophilic Processes." J. Phys. Chem. 94, 3959. Sola, M., J. Mestres, R. Carbo, and M. Duran. 1996. "A Comparative Analysis by Means of Quantum Molecular Similarity Measures of Density Distributions Derived from Conventional ab initio and Density Functional Methods." J. Chem. Phys. 104, 636. Spark, M. J., D. A. Winkler, and P. R. Andrews. 1982. "Conformational Analysis of Folates and Folate Analogues." Int. J. Quant. Chem., Quant. Biol. Symp. 9, 321. Thomson, C., and R. Brandt. 1983. "Theoretical Investigations of the Structure of Potential Inhibitors of the Enzyme Glyoxalase-I." Int. J. Quant. Chem., Quant. Biol. Symp. 10. Tomasi, J. 1981. "Use of the Electrostatic Potential as a Guide to Understanding Molecular Properties." In Chemical Applications of Atomic and Molecular Electrostatic Potentials. Plenum Press, New York. Tomasi, J. 1982. "Electrostatic Molecular Potential Model and Its Application to the Study of Molecular Aggregations." In Molecular Interactions. H. Ratajezak and W. T. Orville-Thomas, eds. Wiley, New York. Umeyama, H., and K. Morokuma. 1977. "The Origin of Hydrogen Bonding. An Energy Decomposition Study." J. Am. Chem. Soc. 99, 1316. Weinstein, H., R. Osman, and J. P. Green. 1979. "The Molecular Basis of StructureActivity Relationships: Quantum Chemical Recognition Mechanisms in DrugReceptor Interactions." In Computer-Assisted Drug Design. E. C. Olson and R. E. Christofferson, eds. American Chemical Society, Washington, D.C. Weinstein, H., R. Osman, J. P. Green, and S. Topiol. 1981a. "Electrostatic Potentials as Descriptors of Molecular Reactivity: The Basis for Some Successful Predictions of Biological Activity." 
In Chemical Applications of Atomic and Molecular Electrostatic Potentials. P. Politzer and D. G. Truhlar, Eds. Plenum Press, New York.
84
MOLECULAR ORBITAL CALCULATIONS FOR BIOLOGICAL SYSTEMS
Weinstein, H., R. Osman, S. Topiol, and J. P. Green. 1981b. "A Quantum Chemical Studies on Molecular Determinants for Drug Action." Ann. N. Y. Acad. Sci. 367, 434. Wiener, J. J. M., M. E. Grice, J. S. Murray, and P. Politzer. 1996. "Molecular Electrostatic Potentials as Indicators of Covalent Radii." J. Chem. Phys. 104, 5109. Wiener, J. J. M., J. S. Murray, M. E. Grice, and P. Politzer. 1997. "Relationships Between Bond Dissociation Energies, Electronic Density Minima and Electrostatic Potential Minima." Mol. Phys. In press. Williams, D. E. 1988. "Representation of the Molecular Electrostatic Potential by Atomic Multipole and Bond Dipole Models." J. Comp. Chem. 9, 745. Williams, D. E. 1991. "Net Atomic Charge and Multipole Models for the Ab Initio Molecular Electrostatic Potential." In Reviews in Computational Chemistry. K. B. Lipkowitz and D. B. Boyd, eds. VCH Publisher, New York. Williams, D. E., and J.-M. Yan. 1988. "Point-Charge Models for Molecules Derived from Least-Squares Fitting of the Electric Potential." Adv. Atomic Mol. Phys. 23, 87. Woods, R. J., M. Khalil, W. Pell, S. H. Moffat, and V. H. Smith. 1990. "Derivation of Net Atomic Charges from Molecular Electrostatic Potentials." J. Comp. Chem. 11, 297.
4 Applications of Density Functional Theory to Biological Systems

Tomasz Adam Wesolowski
Jacques Weber
The term biological systems may be used in reference to a wide class of polyatomic systems. They can be defined as minimal functional units which perform specific biological functions: enzymatic reactions, transport across membranes, or photosynthesis. At present, such systems as a whole are not amenable to quantum-chemical studies because of their large size. The smallest enzymes are built of a few thousand atoms (e.g., lysozyme consists of 129 amino-acid subunits [1]), the smallest nucleic acids are of similar size (e.g., t-RNA molecules consist of about 80 nucleotide subunits [2]), whereas biological membranes are even larger and include different biological macromolecules embedded in a phospholipid medium [3].

On the other hand, a common-sense definition of the term biological systems refers to any chemical molecule or molecular complex which is involved in biological or biochemical processes. The latter definition, which will be used throughout this review, covers not only complete functional units performing biological functions but also fragments of such units. Theoretical studies have provided data on properties of such fragments and have helped in understanding biological processes at the molecular level. Depending upon the size of such fragments, they can be studied by means of various quantum-chemical methods. Molecular systems of up to a few thousand atoms can be studied using semi-empirical methods [4]. For Hartree-Fock or Kohn-Sham density functional theory (DFT) calculations, the current size limit is a few hundred atoms [5,6]. (Throughout the text, Hartree-Fock refers to ab initio Self-Consistent Field calculations using the approximation of linear combination of atomic orbitals.) When the desired accuracy requires the calculation of electron correlation at the ab initio level, only systems containing no more than a few tens of atoms can be treated [7].
Therefore, a theoretician aiming at the elucidation of biological processes by quantum-mechanical calculations faces two crucial issues. The first one is the selection of a fragment for modeling at the quantum-mechanical level. The second one is the assessment of the effects associated with parts of the system which cannot be modeled at the quantum-mechanical level. In this review, the DFT studies of biological systems are divided into two groups corresponding to different ways of addressing the second aforementioned issue. The first one
consists of DFT studies of molecular systems in which the interaction of a selected fragment with the remaining parts of the whole biological system was neglected. The second one includes studies of fragments of larger biological systems in which such interactions were accounted for in the model. Approximations underlying the presented approaches will be discussed in detail.
4.1 DFT Studies of Isolated Fragments of Large Biological Systems

4.1.1 The Kohn-Sham Method and Its Applications

4.1.1.1 The Method

Today, density functional methods following the Kohn-Sham formalism [15] are considered a valuable alternative to the traditional ab initio quantum-chemical models. In principle, they, too, are based on a parameter-free theory, i.e., they attempt to find solutions "from first principles" to the SCF mean-field model of the electronic structure, while treating the electron correlation problem differently than the post-Hartree-Fock techniques. The resulting one-electron equations involve an approximate expression of the exchange-correlation potential and their solution requires substantially reduced computational effort compared to that of conventional ab initio methods. Consequently, density functional methods can be applied to large systems, such as coordination compounds; organometallic, inorganic, and biological systems; and new materials. Several books and reviews on the density functional methodology have appeared recently [8-13]. Here, we shall only summarize its main features.

The basic idea of density functional theory is to use, instead of the electronic wavefunction $\Psi(1, 2, \ldots, n)$, the electron density $\rho(\mathbf{r})$ as the variable of the system. This replacement does not involve any loss of generality or any approximations. It has been shown by Hohenberg and Kohn that the ground state energy of a multi-electron system is completely and uniquely determined by its density, although the explicit functional dependence of the energy on density is not exactly known [14]. However, the energy functional satisfies the variational principle, i.e., it is minimal for the true electron density of the system. Following Kohn and Sham [15], for an n-electron system, it is expressed by the following exact formula (throughout the text, atomic units are used):

$$E[\rho] = \sum_i n_i \int \psi_i^*(\mathbf{r}) \left( -\tfrac{1}{2} \nabla^2 \right) \psi_i(\mathbf{r}) \, d\mathbf{r} - \sum_A \int \frac{Z_A}{|\mathbf{r} - \mathbf{R}_A|} \, \rho(\mathbf{r}) \, d\mathbf{r} + \frac{1}{2} \iint \frac{\rho(\mathbf{r}) \rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r} \, d\mathbf{r}' + E_{xc}[\rho] \qquad (4.1)$$
where:

• the first term is the kinetic energy of a reference system of noninteracting electrons with the same total electron density as the actual system of interacting electrons [11], $\psi_i$ and $n_i$ being spin orbitals and their occupation numbers, respectively;
• the second term represents the usual potential energy arising through electron-nuclei interactions, the summation running over the nuclei with charges $Z_A$ located at $\mathbf{R}_A$;
• the third term is the classical Coulomb repulsion energy of electrons with density $\rho(\mathbf{r})$;
• $E_{xc}$ accounts for the energy of exchange interactions, correlation effects, and the difference between the exact kinetic energy and that of the reference system of noninteracting electrons with the density $\rho(\mathbf{r})$.

The first three terms in Eq. 4.1 have their counterparts in the equations of conventional ab initio methodologies, whereas the fourth one ($E_{xc}$) requires a more thorough discussion. Equation 4.1 can be viewed as the definition of $E_{xc}$. Actually, the major problems of DFT are due to $E_{xc}$, as there is no exact analytical formula for this term in the case of n-electron systems; approximations thus have to be sought. Applying the variational principle to the energy given by Eq. 4.1, Kohn and Sham reformulated the density functional theory by deriving a set of one-electron Hartree-like equations leading to the Kohn-Sham orbitals $\psi_i(\mathbf{r})$ involved in the calculation of $\rho(\mathbf{r})$ [15]. The Kohn-Sham (KS) equations are written as follows:

$$\left[ -\tfrac{1}{2} \nabla^2 - \sum_A \frac{Z_A}{|\mathbf{r} - \mathbf{R}_A|} + \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, d\mathbf{r}' + V_{xc}(\mathbf{r}) \right] \psi_i(\mathbf{r}) = \varepsilon_i \, \psi_i(\mathbf{r}) \qquad (4.2)$$
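where the expression in brackets is the effective one-electron Kohn-Sham Hamiltonian $h_{KS}$, the exchange-correlation potential $V_{xc}$, which contains multi-electron effects, being defined as:

$$V_{xc}(\mathbf{r}) = \frac{\delta E_{xc}[\rho]}{\delta \rho(\mathbf{r})} \qquad (4.3)$$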
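Because the Kohn-Sham Hamiltonian depends on the density, which is itself built from the orbitals that the Hamiltonian yields, the equations have to be solved iteratively until the input and output densities agree. The following is a minimal sketch of such a self-consistency loop, not a real DFT code. It assumes a toy two-site model with two electrons in one doubly occupied orbital; the model itself, the parameter names (`t`, `delta`, `u`, `mix`), and the Hartree-like site potential `u * n / 2` are illustrative assumptions that do not come from the text.

```python
import math


def lowest_orbital(a, d, b):
    """Lowest eigenpair of the symmetric 2x2 matrix [[a, b], [b, d]], b != 0."""
    e = 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    x, y = b, e - a                     # unnormalized eigenvector for eigenvalue e
    norm = math.hypot(x, y)
    return e, (x / norm, y / norm)


def scf_two_site(t=1.0, delta=1.0, u=2.0, mix=0.5, tol=1e-10, max_iter=200):
    """Self-consistent site occupations of a toy two-site model (hypothetical).

    t     : hopping between the two sites
    delta : bare energy of site 2 relative to site 1
    u     : strength of the density-dependent (Hartree-like) site potential
    mix   : linear mixing factor applied to the density update
    """
    n = [1.0, 1.0]                      # initial guess: one electron per site
    e = 0.0
    for iteration in range(1, max_iter + 1):
        # Build the effective one-electron Hamiltonian from the current density.
        a = 0.5 * u * n[0]
        d = delta + 0.5 * u * n[1]
        # Solve the one-electron problem and doubly occupy the lowest orbital.
        e, (c1, c2) = lowest_orbital(a, d, -t)
        n_out = [2.0 * c1 * c1, 2.0 * c2 * c2]
        # Compare output and input densities, then mix them for the next cycle.
        change = max(abs(n_out[0] - n[0]), abs(n_out[1] - n[1]))
        n = [(1.0 - mix) * n[i] + mix * n_out[i] for i in range(2)]
        if change < tol:
            return n, e, iteration, True
    return n, e, max_iter, False


if __name__ == "__main__":
    occupations, orbital_energy, cycles, converged = scf_two_site()
    print(converged, cycles, occupations, orbital_energy)
```

Linear mixing of the old and new densities is only one of several damping strategies; it is used here because it is the simplest one that keeps the iteration stable for this model.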
Note that the Kohn-Sham Hamiltonian $h_{KS}$ [Eq. (4.2)] is a local operator, uniquely determined by the electron density [15]. This is the main difference with respect to the Hartree-Fock equations, which contain a nonlocal operator, namely the exchange part of the potential operator. In addition, the KS equations incorporate the correlation effects through $V_{xc}$, whereas these are lacking in the Hartree-Fock SCF scheme. Nevertheless, though the latter model cannot be considered a special case of the KS equations, there are some similarities between the Hartree-Fock and the Kohn-Sham methods, as both lead to a set of one-electron equations that describe an n-electron system.

In principle, the KS equations would lead to the exact electron density, provided the exact analytic formula of the exchange-correlation energy functional $E_{xc}$ were known. However, in practice, approximate expressions of $E_{xc}$ must be used, and the search for adequate functionals for this term is probably the greatest challenge of DFT [8]. The simplest model was proposed by Kohn and Sham: if the system is such that its electron density varies slowly, the local density approximation (LDA) may be introduced:

$$E_{xc}^{LDA}[\rho] = \int \rho(\mathbf{r}) \, \varepsilon_{xc}(\rho(\mathbf{r})) \, d\mathbf{r} \qquad (4.4)$$
where $\varepsilon_{xc}(\rho)$ is the exchange and correlation energy per particle of a uniform and homogeneous electron gas of density $\rho$. $V_{xc}$ may then be easily deduced from this approximate expression of $E_{xc}$ by using Eq. 4.3 and the KS equations can be solved. As for Hartree-Fock based methods, the SCF procedure must be used since the $h_{KS}$ Hamiltonian depends explicitly upon $\rho(\mathbf{r})$. As shown by Parr and Yang [8], the exchange and correlation contributions to $\varepsilon_{xc}(\rho)$ can be separated as:

$$\varepsilon_{xc}(\rho) = \varepsilon_x(\rho) + \varepsilon_c(\rho) \qquad (4.5)$$
and the exchange part is usually taken from the electron gas theory:

$$\varepsilon_x(\rho) = -\frac{3}{4} \left( \frac{3}{\pi} \right)^{1/3} \rho^{1/3} \qquad (4.6)$$
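If no correlation is introduced ($\varepsilon_c = 0$), the KS equations reduce to the well-known Xα method proposed by Slater [22] as a simplification of the Hartree-Fock scheme with a local exchange operator:

$$V_{X\alpha}(\mathbf{r}) = -\frac{3}{2} \, \alpha \left( \frac{3}{\pi} \right)^{1/3} \rho(\mathbf{r})^{1/3} \qquad (4.7)$$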
where α is an adjustable parameter. Actually, the Xα formalism is the simplest DFT method based on the LDA approximation, and a large number of more sophisticated exchange-correlation potentials have been proposed by various authors. In the first generation of DFT-based methods, often referred to as local schemes, the $\rho^{1/3}$ functional is retained for $\varepsilon_x$, whereas analytical formulas derived from parametrizations of the correlation energy obtained from Monte Carlo calculations for the electron gas are used [23,24] for $\varepsilon_c$. The LDA formalism, using the exchange-correlation functionals which are believed to represent closely the limit of the local approximation, is applicable to systems with slowly varying electron densities, a situation which is rarely encountered in atoms and molecules. Nevertheless, experience has shown that LDA leads to surprisingly good results for the geometry and the electronic structure of a broad range of molecular systems, including transition metal compounds [12] and molecules of biological importance [18].

However, because the LDA significantly overestimates bonding energies, of both intra- and intermolecular nature, the second generation of DFT methods was developed in the 1980s. These methods introduce the electron density gradient in the expansion of the exchange and correlation functionals to account for inhomogeneities in the electron distribution [12,17]. The general form of the exchange-correlation energy functional is retained while nonlocal (NL) gradient-dependent correction terms appear in the expansion of $E_{xc}$. Equation 4.4 is thus transformed into:

$$E_{xc}[\rho] = \int \rho(\mathbf{r}) \, \varepsilon_{xc}(\rho(\mathbf{r})) \, d\mathbf{r} + E_{xc}^{NL}[\rho, \nabla\rho] \qquad (4.8)$$
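Before turning to the gradient corrections, the local exchange term can be made concrete with a small numerical sketch. It evaluates the LDA exchange energy, i.e., the integral of $\rho \, \varepsilon_x(\rho)$ with the electron-gas expression $\varepsilon_x(\rho) = -\frac{3}{4} (3/\pi)^{1/3} \rho^{1/3}$, for a hydrogen-like 1s density $\rho(r) = e^{-2r}/\pi$ and compares a radial quadrature with the closed-form value of the same integral. The choice of test density, the function names, and the grid parameters are assumptions made for illustration only, and the spin-unpolarized expression is applied to a one-electron density purely as an exercise in evaluating the functional.

```python
import math

# Prefactor of the electron-gas exchange energy per particle (atomic units).
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def eps_x(rho):
    """Exchange energy per particle of a uniform electron gas of density rho."""
    return -C_X * rho ** (1.0 / 3.0)


def rho_1s(r):
    """Hydrogen-like 1s density (illustrative choice), normalized to one electron."""
    return math.exp(-2.0 * r) / math.pi


def lda_exchange_energy(density, r_max=30.0, n_steps=30000):
    """Trapezoidal radial quadrature of the integral of rho * eps_x(rho)."""
    h = r_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        r = k * h
        rho = density(r)
        integrand = 4.0 * math.pi * r * r * rho * eps_x(rho)
        weight = 0.5 if k in (0, n_steps) else 1.0   # trapezoid end-point weights
        total += weight * integrand
    return total * h


def lda_exchange_energy_1s_closed_form():
    """Analytic value of the same integral for the 1s density above."""
    return -(81.0 / 256.0) * 3.0 ** (1.0 / 3.0) / math.pi ** (2.0 / 3.0)


if __name__ == "__main__":
    print(lda_exchange_energy(rho_1s))
    print(lda_exchange_energy_1s_closed_form())
```

The closed form follows from integrating $\rho^{4/3}$ over all space for the exponential density; agreement between the two numbers is a check on the quadrature, not on the quality of the LDA itself.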
Various analytical formulas for $E_{xc}^{NL}$ have been proposed, the most popular ones being reported in Table 4.1. Experience has shown that nonlocal (also called gradient-corrected) functionals lead to significantly improved bond energies and geometries of organic [19,20] and transition-metal-containing systems [11,17] compared to LDA. Finally, in the third generation of DFT schemes, a portion of the exact Hartree-Fock (HF) exchange energy is mixed with the DFT exchange-correlation term, using the adiabatic connection method recently proposed by Becke [28,29], so as to eliminate the self-interaction effects. These schemes, also known as hybrid HF-DFT methods, have been shown recently by several researchers to be remarkably accurate in predicting various molecular properties [29,31]. Most recently, successful attempts to incorporate terms depending on the electron density Laplacian into the exchange-correlation functionals were also reported [32].

Table 4.1. Acronyms used in the text for parametrizations of the exchange-correlation (Exc) functional. The acronyms for the separate exchange (Ex) and correlation (Ec) components of Exc are specified when applicable. Throughout the text, density functional calculations following the Kohn-Sham formalism are referred to as DFT(XXX), where XXX stands either for the acronym of the approximate exchange-correlation functional or for the acronyms of the exchange and correlation functionals, separated by the "/" symbol.

Exc                             Ex           Ec

Local functionals (LDA)
Xα [22]                         Xα           none
SVWN                            S [21]       VWN [23]
HL [25]
GL [26]
BH [27]
PZ [24]

Nonlocal functionals (post-LDA)
BLYP                            B88 [42]     LYP [43]
B-Half-and-Half-LYP [28] (a)
ACM [29]
B3LYP [29] (b)
PW86/P86                        PW86 [39]    P86 [40]
PW91 [41]
B88/P86                         B88 [42]     P86 [40]
B88/LAP1                        B88 [42]     LAP1 [32]
PW86/LAP1                       PW86 [39]    LAP1 [32]

(a) The description of the original "half-and-half" approach can be found in Ref. 28. B-Half-and-Half-LYP was obtained in a similar way, but the LYP correlation functional was used instead of the SVWN one.
(b) In the original work by Becke [29], the three-parameter hybrid exchange-correlation functional involves a linear combination of the B88, SVWN, and PW91 functionals with the coefficients fitted to experimental data. The B3LYP functional was obtained in a similar way, but the LYP correlation functional was used instead of the PW91 one.

The formalism of Kohn and Sham has been implemented in several programs. Among the most widely used are: ADF [33], which uses Slater basis sets; DGauss [34], deMon/KS [35], and Gaussian92/DFT [36], which use Gaussian basis sets; DMol [37], which uses numerical basis sets; and the programs implementing Car-Parrinello dynamics [38], which use plane waves.

4.1.1.2 Applications

The Kohn-Sham equations have been applied to study gas-phase properties of many systems of biological interest. Most such studies have been made for relatively small molecular systems for which the conventional post-Hartree-Fock ab initio methods could also be applied. For such systems, the DFT calculations mainly served to determine the range of applicability of known approximations of the exchange-correlation functional in the Kohn-Sham equations. There is also a still-growing body of DFT results concerning larger systems which are of biological interest and for which DFT is the only available quantum-chemical method that includes correlation effects. The following section presents applications of the KS formalism to the study of properties of isolated molecules, molecular complexes, and chemical reactions.

Organic Molecules of Biological Interest. The works by Andzelm and Wimmer [19] and by Johnson et al. [20] were instrumental in extending the range of applicability of the Kohn-Sham formalism to the domain of molecular systems of biological relevance. For a large sample of organic molecules (including some of biological interest, for instance H2O, formic acid, formamide, and pyrimidine), molecular properties derived from the DFT calculations were compared to the ones stemming from the post-Hartree-Fock calculations and from experiment (when available). Geometries, atomization energies, dipole moments, and vibrational frequencies were analyzed. Andzelm and Wimmer [19] performed DFT(SVWN) and DFT(B88/P86) calculations. It was found that for small molecules containing C, N, O, H, and F atoms, the DFT equilibrium bond lengths agreed with experiment within 0.01-0.02 Å, whereas the bond and dihedral angles agreed within 1-2 degrees. The DFT(SVWN) vibrational frequencies were too low, but as close to experiment as those obtained from second-order Møller-Plesset (MP2) calculations. The studies by Johnson et al. [20], in which several local and nonlocal functionals were applied (S/null, SVWN, S/LYP, B88/null, B88/VWN, and BLYP), led to similar conclusions. Similar analyses to those by Andzelm and Wimmer and by Johnson et al.
were performed specifically for molecules of biological interest by St.-Amant et al. [44]. Functional groups which are commonly present in molecules of biological interest (mainly proteins) were investigated. The largest molecules were analogs of glycine and alanine dimers. Geometries, conformational energies, and dipole moments were obtained from the DFT(SVWN) and DFT(B88/P86) calculations. Equilibrium geometries, relative energies of different conformers, dipole moments, and molecular electrostatic potentials were analyzed. For the conformational analysis, 175 conformers of 79 organic molecules were selected. For 21 molecules, an analysis of the dipole moments was performed. Based on 35 comparisons of conformational energies for which experimental results were available, the authors concluded that the DFT(B88/P86) results were significantly better than the DFT(SVWN) ones. The root mean square (RMS) of the corresponding energy differences amounted to 0.8 kcal/mol and 0.5 kcal/mol for SVWN and B88/P86, respectively. The analysis of molecular geometries showed that the DFT(B88/P86) bond lengths were systematically overestimated compared to experimentally measured values. The average overestimation of the bond lengths ranged from 0.007 Å for CC double bonds to 0.06 Å for SS bonds. The corresponding differences between the experimental and MP2 results were significantly lower. The experimental and DFT(B88/P86) bond angles agreed within 1 degree. The MP2 and DFT(B88/P86) results for bond angles were at the same level of accuracy. Compared to experimental results, both the DFT(SVWN) and the DFT(B88/P86) calculations overestimated dipole moments (the average deviation amounted to 11.7% and 9.1%, respectively). Smaller deviations (6.6% and 5.5%, respectively) were obtained in calculations involving an augmented basis set which included an extra set of diffuse s and p functions on heavy atoms and a diffuse s function on hydrogen. The authors concluded that the DFT(B88/P86)
led to results approaching the MP2 ones and that they were better than the ones obtained from the Hartree-Fock calculations. Rashin et al. [45] obtained the dipole moments of 32 molecules of biological relevance by means of the DFT(SVWN) and DFT(B88/P86) calculations. The results showed a rather weak dependence of calculated dipole moments on the functional form of the exchange-correlation functional but a strong dependence on the basis set.

Numerous studies of properties of individual organic molecules of biological interest were reported recently. They dealt with molecules which represented fragments of proteins, biological membranes, and nucleic acids, as well as drugs and drug analogs. Some results will be discussed below. Adamo, Barone, and collaborators [46-48] studied isolated glycine, the smallest building block of proteins. Results showed that the DFT(B3LYP) conformational analysis led to results comparable to the ones obtained from the post-Hartree-Fock calculations. Salahub and collaborators studied the conformational equilibria in glycine and malonaldehyde using B88/P86, B88/LYP, ACM, B88/LAP1, and PW86/LAP1 functionals [49]. Both molecules contain internal hydrogen bonds and the relative energy differences between the most stable conformers are within 1 kcal/mol. The DFT(B88/P86), DFT(PW86/P86), and DFT(PW91) calculations led to the wrong order of stability for the two conformers of glycine. The DFT(BLYP) and DFT(ACM) calculations predicted the right order of stability but the relative energies were underestimated. Very good results were obtained using the DFT(B88/LAP1) and DFT(PW86/LAP1) calculations for both the relative energies of the glycine conformers and the energy of the hydrogen bond in malonaldehyde.

Florian and Johnson [50] calculated vibrational frequencies in isolated formamide using the DFT calculations at the LDA (SVWN) and post-LDA (B88/LYP) levels. The DFT frequencies were compared with the ones derived from the Hartree-Fock and MP2 calculations, and from experiment.
The authors found that the DFT(B88/LYP) frequencies were more in line with experiment than the MP2 ones. The DFT(SVWN) calculations led to a geometry, force constants, and infrared spectra fully comparable to the MP2 results. The equilibrium geometry and vibrational frequencies of formamide were also the subject of studies by Andzelm et al.51. It was found that the DFT(B88/P86) calculations led to frequencies in better agreement with experiment than those obtained from the CISD calculations.

Oie et al.52 studied the potential energy surfaces of 11 small conjugated molecules relevant to the conformational equilibria of large biomolecules. The DFT(B88/P86) barriers to rotation around the conjugated bond were compared to those derived from the MP2 and MP4 calculations. The geometries of the minima located by both methodologies were in excellent agreement (the largest torsional angle difference was 2.7 degrees). The rotational barriers were in satisfactory agreement. The differences between the MP4 and the DFT relative energies were within 0.13-1.39 kcal/mol for non-amide molecules, whereas they were larger (2.71-4.88 kcal/mol) for molecules containing the amide group.

The DFT calculations were used to predict NMR parameters of several molecules of biological importance54. Predicting NMR spectra is a challenging problem because of their strong dependence on the electronic structure and geometry. Salahub and collaborators reported several papers on this subject55-57. Malkin et al.55 reported very good agreement between the spin-spin coupling constants JCC, JCH, and JHH derived from the post-Hartree-Fock and DFT calculations, and from experiment, for several organic molecules. In the DFT calculations, the following approximate exchange-correlation functionals were used: SVWN, B88/P86, PW86/P86, and PW91. Compared to the post-Hartree-Fock methods, the DFT calculations led to worse coupling constants for systems containing lone pairs. A strong dependence of the estimated Fermi contact contribution to the coupling constants on the form of the approximate exchange-correlation functional was pointed out. Malkin et al.56 combined the Sum-Over-States perturbation theory (SOS) with the DFT calculations to derive shielding constants. The shielding constants did not appear to be sensitive to the analytic form of the exchange-correlation functional. The method was applied to several organic molecules, including a model dipeptide. The shielding constants obtained with the DFT and post-Hartree-Fock electron densities were in very good agreement. In a few cases, the shielding constants obtained by means of the SOS calculations combined with the Hartree-Fock electron densities were qualitatively wrong (e.g., for the ozone molecule). In such cases, the SOS calculations combined with the DFT electron densities led to shielding constants in good agreement with experiment. Malkin et al.57 also reported very good agreement between the experimental and the DFT/SOS calculated shielding constants for glycine. Case58 investigated the effect of ring currents on NMR shielding constants by means of the DFT calculations. The rings studied included those commonly found in proteins and nucleic acids. The shielding constants were calculated for a methane molecule placed in several positions relative to the ring. The calculations provided the data needed to derive structural parameters from measured chemical shifts in proteins and nucleic acids.

The DFT calculations have also been applied to investigate radicals of biological interest48,53,54,59,60. Eriksson et al.53 calculated the hyperfine structure of small radicals built of H, N, C, O, F, and Cl atoms. It was found that the anisotropic hyperfine couplings are relatively insensitive to basis-set effects and to the form of the exchange-correlation functional.
The isotropic hyperfine couplings were, however, strongly dependent on the approximate form of the exchange-correlation functional. The best results, in excellent agreement with experiment, were obtained using the PW86/P86 functional for all neutral and cationic radicals, whereas for halide-containing anions the hyperfine structures were less accurate. The disagreement between experiment and the DFT results was attributed to the wrong DFT equilibrium geometries for these compounds. Barone et al. obtained good ESR features for the glycine radical using the DFT calculations at the LDA (SVWN) level48. O'Malley and Collins studied semiquinone anions, which are formed in the electron transfer reactions of photosynthesis, by the DFT(B3LYP) calculations59. Very accurate hyperfine coupling constants were obtained, but a strong basis-set influence on the results was observed. For ring 13C atoms, basis sets of at least full double-zeta quality were required. The hyperfine coupling constants for 1H, 17O, and other 13C atoms were less basis-set dependent. Jensen et al.60 derived the spin densities of 3-methylindole from the DFT(B3LYP) calculations. This molecule was considered a model of the tryptophan-191 radical of the enzyme cytochrome-c-peroxidase. The agreement between the experimental spin densities of tryptophan-191 of cytochrome-c-peroxidase and the DFT spin densities of the cation radical of 3-methylindole supported the conclusion that the tryptophan-191 radical is a cation radical. The spin densities derived from the MP2 and DFT(B3LYP) calculations were in qualitative disagreement.

The components of nucleic acids have been the subject of continuous DFT studies61-65,67-69. Jasien and Fitzgerald calculated dipole moments and polarizabilities for a series of molecules of biological interest, including the nucleic acid bases (adenine, thymine, cytosine, and guanine) and their pairs (adenine-thymine and cytosine-guanine)61.
A good correlation between DFT(HL), experimental, and MP2 results was obtained for dipole moments and polarizabilities. More detailed analyses of DFT(SVWN) and DFT(B88/P86) results, which included vibrational frequencies, were reported for isolated bases and their
pairs by Santamaria and Vasques63. Similar analyses were made by Estrin et al.62, who used the LDA(SVWN), B88/P86, and PW86/P86 functionals. Sponer et al.65 compared the MP2 and the DFT energies for the cytosine dimer in the base-stacking conformation. Some features of the MP2 potential energy surface (e.g., the twist displacement of the bases) were well reproduced by the DFT calculations; however, qualitatively wrong results were obtained for others (the vertical displacement). (The reported DFT energies were obtained by combining energies derived from the Kohn-Sham calculations, DFT(BLYP) or DFT(B3LYP), with empirical terms representing dispersion.) Bakalarski et al.64 studied the properties of isolated N-methylated nucleic bases in their fundamental tautomeric forms by means of the DFT(B3LYP), Hartree-Fock, and MP2 calculations. The dipole moments, rotational constants, and molecular electrostatic potentials were calculated. The DFT molecular properties (rotational constants, dipole moments) agreed better with experiment than the corresponding results from the Hartree-Fock calculations and were comparable to the MP2 ones. The electrostatic potential (as measured by the magnitudes of fitted atomic charges) stemming from the DFT calculations was closer to the MP2 one than to the electrostatic potential given by the Hartree-Fock calculations.

The tautomerism of heterocyclic compounds has been the subject of much interest because of its biological implications for base pair formation in nucleic acids. A proper assessment of the relative stability of the different tautomeric forms of isolated heterocyclic compounds seems, therefore, to be a prerequisite for modeling biological processes involving nucleic acids. For such compounds, the relative energies of the most stable tautomeric forms are typically small (in the range of 1-2 kcal/mol), which makes quantum-chemical studies difficult.
The relative energies of the most stable tautomers are frequently of the same order of magnitude as the differences among the energies obtained from various post-Hartree-Fock calculations. Since the differences between the zero-point energies of the most stable tautomers are typically in the 1 kcal/mol range, significant computational effort is required to calculate vibrational frequencies accurately. The lack of conclusive ab initio results makes it difficult to assess the quality of the DFT results. A number of DFT studies on this subject have appeared recently62,66-71.

The structure, relative energetics of tautomeric forms, and vibrational frequencies of uracil and cytosine were studied by Estrin et al.62, who applied the LDA (GL) and post-LDA (B88/P86 and PW86/P86) functionals. Good structures and dipole moments were obtained for all functionals, including the LDA. The DFT calculations with different post-LDA exchange functionals (B88 and PW86) led to similar relative energies of the tautomers. For all functionals, the DFT calculations predicted that the dioxo form of uracil is the most stable, in agreement with experiment and the high-level post-Hartree-Fock calculations. For cytosine, it was found that the energies of the three most stable tautomers were within 1 kcal/mol. Two of the most stable tautomers predicted by the DFT calculations had been observed experimentally.

Hall et al.70 studied the energies of the tautomeric forms of 2-hydroxypyridine and cytosine by means of the DFT calculations using the B88/VWN, BLYP, and the corresponding hybrid functionals. For each tautomeric form considered, the DFT and post-Hartree-Fock (QCISD(T) and MP4) calculations were made at the Hartree-Fock optimized geometry. For both molecules, the experimental relative energies of the most stable tautomeric forms are very close to each other (within 1.0 kcal/mol). The calculations did not lead to conclusive results for such small energy differences.
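To see why energy differences of about 1 kcal/mol are so demanding, it can help to translate them into equilibrium populations. The following minimal Python sketch assumes a simple two-state Boltzmann model at 298.15 K with equal degeneracies and no entropic or zero-point corrections; the function name `boltzmann_populations` and the sample energy gaps are illustrative and are not taken from the studies cited above.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol K)

def boltzmann_populations(rel_energies_kcal, temperature=298.15):
    """Return Boltzmann populations for states with the given relative energies."""
    rt = R_KCAL * temperature
    weights = [math.exp(-e / rt) for e in rel_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical tautomer pairs separated by 0.5, 1.0, and 2.0 kcal/mol.
for gap in (0.5, 1.0, 2.0):
    p_low, p_high = boltzmann_populations([0.0, gap])
    print(f"gap = {gap:.1f} kcal/mol -> populations {p_low:.2f} : {p_high:.2f}")
```

Under these assumptions a gap of 1 kcal/mol corresponds to a population ratio of roughly 84:16, so an error of that size in a computed relative energy can be enough to change which tautomer is predicted to dominate.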
The DFT(B88/VWN) and DFT(BLYP) calculations predicted a larger stability of the keto form, which is in disagreement with experiment but in line with the MP2 results. The relative energies obtained using hybrid functionals agreed better with the MP4 ones. Adamo and Lelj66 reported DFT(SVWN) and DFT(B3LYP) studies of relative
energies of 2-pyridone and its alternative tautomeric form, 2-hydroxypyridine. The calculations were aimed at studying solvent effects, which will be discussed later in this chapter. Turning to the gas-phase results, the DFT calculations predicted that 2-pyridone was less stable than 2-hydroxypyridine, in agreement with the results obtained from the post-Hartree-Fock calculations. The stabilization energy of 2-hydroxypyridine amounted to -1.0, -3.8, -0.56, -1.20, and -0.47 kcal/mol for the Hartree-Fock, MP2, CI, DFT(SVWN), and DFT(B3LYP) calculations, respectively. In a similar study, Barone and Adamo71 reported good agreement between the DFT(B3LYP) and the MP2 results for the relative energies of 2-pyridone, 2-hydroxypyridine, and the transition state. Kwiatkowski, Leszczynski, and collaborators obtained equilibrium geometries, vibrational spectra, relative energies of tautomers, dipole moments, and rotational constants of adenine67, cytosine68, and 2(1H)-pyridone69 by means of the DFT(B3LYP) and MP2 calculations. From comparisons between the DFT and the MP2 results, a general conclusion was drawn that, except for the relative energies of tautomers and the intensities of the IR absorption bands, the DFT results matched those stemming from the MP2 calculations and were in good agreement with the available experimental data. The authors concluded that the conventional ab initio post-Hartree-Fock methods are more reliable for predicting the relative energies of different tautomers, especially if these energies are small.

The computational advantages of the Kohn-Sham formalism make it a method of choice for conformational studies of large and flexible molecules. Many molecules of biological interest belong to this class, and there is a continuously growing interest in applying the DFT calculations as an alternative to the semi-empirical or the Hartree-Fock calculations72-75. Topol and Burt72 studied the conformational equilibria of 1,2-difluoroethane and inositol.
Molecules containing inositol are commonly found in biomembranes as one of the few head groups of phospholipids. The conformational analysis demonstrated that the LDA calculations (DFT(BH)) can properly describe the ordering of small (1 kcal/mol) differences between the energies of local minima on the potential energy surface. The authors also showed that the unscaled vibrational frequencies were in reasonable agreement with experiment. The conformational energies of inositols were also studied by Liang et al.73, who performed a conformational analysis by means of the Hartree-Fock, MP2, DFT(SVWN), and DFT(B88/P86) calculations. Eight conformers were considered. Theoretical studies on such compounds are very valuable because of the lack of experimental conformational data. Unfortunately, the results of this analysis were not conclusive because of the small differences between the energies of the different conformers. Some common trends in the DFT(B88/P86) and the MP2 results were observed. Oie et al.74 performed a conformational analysis of ethylene glycol, a model compound for the diols that are employed as the central chemical core of several HIV-1 protease inhibitors. Geometries and relative conformational energies of 10 conformers were obtained by means of the Hartree-Fock, MP2, MP4, DFT(SVWN), and DFT(B88/P86) calculations. All the conformations studied were found to lie within 4.5 kcal/mol, indicating considerable flexibility of the ethylene glycol molecule. This conclusion was supported by the results obtained from all theoretical methods. Rabinovitz et al.75 studied the conformational properties of cyclopentapolycyclic aromatic hydrocarbons (PAHs) by means of the semi-empirical, Hartree-Fock, and DFT(SVWN) methods. Polycyclic hydrocarbons are known for their carcinogenic activity. Because of their relatively large size, theoretical studies of their reactivity remain a challenge to quantum chemistry.
The energetics and geometry of the carbocation pairs which are formed upon protonation of PAHs were studied by means of the DFT calculations combined with other methods for geometry optimization.
Metallo-organic Systems. Metallo-organic complexes can be found in a variety of biological systems that range from metallo-organic drugs to metallo-enzymes76. Pioneering studies of metallo-organic systems of biological importance were made by Aizman and Case77, who used the multiple scattering method (MSXα)16 to study a model of the active center of proteins containing the 4Fe-4S cluster. The same method was used to investigate Fe(SR)4 (R = H, CH3) clusters78, which mimic the active site of iron-sulfur proteins, and also to investigate ferryl intermediates79. The reader is encouraged to see the review by Case80 for early applications of the Xα theory to metallo-organic systems. More recently, the SCF-Xα scattered wave method was employed by Solomon and collaborators81,82 to investigate a series of copper(II) peroxide structures, in an attempt to rationalize the properties of such complexes of relevance to hemocyanin and tyrosinase. Indeed, the latter systems are examples of copper-containing metallo-proteins that reversibly bind and react with dioxygen. The results of these calculations were used to derive Cu-O and O-O bonding interactions, magnetic exchange interactions, charge and spin distributions, and excited-state transition energies, which all compared favorably with the corresponding experimental data. In particular, these investigations showed that, even at the Xα level of approximation, the DFT methods were able to elucidate the main features of the active sites of metallo-proteins and to describe in a coherent and reliable way their possible reaction mechanisms with dioxygen. Ghosh et al. reported LDA(SVWN) studies of oxo(porphyrinato)iron(IV) complexes83. These compounds have been detected in various peroxidases and are believed to be involved in the reaction mechanisms of other heme enzymes, such as cytochromes P450.
A very good Fe-O distance and unscaled stretching frequencies, in excellent agreement with CASSCF results published elsewhere, were obtained. Case and collaborators84,85 reported studies of spin coupling in (Fe4S4)3+ using the post-LDA calculations. The parameters of the spin Hamiltonian were estimated using the DFT energies of a high-spin state as well as of two different broken-symmetry states. The parameters were compared to those derived from experimental measurements of the temperature dependence of the magnetic susceptibility. Good overall agreement between theory and experiment was found. Bray and Deeth applied the DFT(SVWN) method to investigate active site models of xanthine oxidase86. The authors addressed the issue of the presence of (OH)- at the Mo(VI) active site. The geometries of several active site models differing in the number of ligands were obtained by geometry optimization at the DFT level. The resulting metal-ligand distances were compared with those stemming from extended X-ray absorption fine structure (EXAFS) experiments. The best agreement between the calculated and the EXAFS distances was observed for five-coordinate models in which (OH)- was one of the ligands. The results of the calculations supported the hypothesis that the oxidation of xanthine to uric acid involves the presence of (OH)- at the oxidized, five-coordinate active site.

Clusters of metal ions and water molecules represent a special case of metallo-organic systems of biological interest. Most biochemical reactions take place in aqueous solution. Since metal ions are present in biological solvents, all thermodynamic considerations of metal ions bound to biological macromolecules must involve the thermodynamics of ions in water. Clusters of water molecules and monovalent ions were studied by Combariza and Kestner87.
For these systems, the Hartree-Fock calculations lead to underestimated interaction energies, and reliable ab initio results can be obtained only with approaches that include electron correlation. The authors found that the DFT(B3LYP) calculations led to
excellent energies, comparable to the MP2 ones and significantly better than those obtained from the Hartree-Fock calculations.

Parrinello and collaborators investigated several metallo-organic systems of biological interest88-91. A particular implementation of the Kohn-Sham formalism (see the section The Car-Parrinello Method and Its Applications), in which pseudopotentials are used for core electrons, the Kohn-Sham orbitals are expanded in plane waves, and the energy minimization is not constrained to the Born-Oppenheimer surface, was applied to obtain ground-state properties at the equilibrium geometry. The structure, vibrational frequencies, and electronic properties of cisplatin and other Pt(II) complexes were studied by Carloni et al.88 and by Tornaghi et al.89 and compared with the available experimental data. These compounds are known for their anti-tumor activity. The studies of their properties represent an initial step in modeling their biological activity, which involves interaction with DNA. Good agreement was shown between the available experimental data and the results of the DFT(B88/P86) calculations. Differences between the biological activity of cisplatin and that of carboplatin were rationalized by pointing out qualitative differences between their electronic structures. Lamoen and Parrinello90 studied the electronic and structural properties of porphyrin, porphyrazine, and their magnesium and palladium derivatives. The electronic properties of these compounds are relevant to oxygen and electron transport processes in biological systems. Molecules similar to free-base porphyrin can be found in important biomolecules such as hemoglobin and chlorophyll. The structural results from the calculations at the LDA level (PZ) were very close to those stemming from experiment and from the MP2 calculations. In addition, the electronic structure compared well with ab initio CASSCF calculations.
Carloni et al.91 applied the DFT(PZ) calculations to investigate the electronic structure of various models of oxidized and reduced Cu,Zn superoxide dismutase. The first stage of the enzymatic reaction involves electron transfer from superoxide to the Cu(II) ion. The theoretical investigations provided a detailed description of the electronic structure of the molecules involved in the electron transfer process. The effect of the charged groups present in the active center on the electron transfer process was analyzed, and the Arg141 residue was shown to play a crucial role.

Thermochemistry. The DFT calculations with gradient-dependent functionals are very useful in thermochemistry. In contrast to the earlier models, namely Xα, which gives erratic results, and LDA, which systematically overestimates binding energies, the post-LDA calculations yield good results, with average errors of the order of 6 kcal/mol in standard thermochemical tests29,92,93. Exchange-correlation functionals based on the adiabatic connection formula for the exchange-correlation hole29 reduce this error further, to roughly 2 kcal/mol29,30. At this level of accuracy, the DFT calculations can be considered a promising tool for predicting the thermochemical properties of molecules of biological interest. Several DFT studies of atomization energies, ionization potentials, electron and proton affinities, heats of reaction, and bond energies in systems of biological relevance were reported recently19,20,29,30,32,92-96,98. Clementi and Chakravorty94 calculated the atomization energies of about 50 organic molecules, ranging from ones as small as HF to ones as large as pyridine. Several approximate exchange-correlation functionals, both LDA and post-LDA, were used. None of the functionals considered led to atomization energies of chemical accuracy.
Atomization energies obtained by means of nonlocal functionals were systematically closer to the experiment than the corresponding results obtained from the LDA calculations.
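As a reminder of how the benchmarked quantities are defined, the sketch below computes an atomization energy as the sum of the atomic energies minus the molecular energy, and a mean absolute error against reference values. It is written in Python; the function names, the conversion factor of 627.5095 kcal/mol per hartree, and all numerical inputs are illustrative assumptions, not data from the studies cited here.

```python
HARTREE_TO_KCAL = 627.5095  # approximate conversion, kcal/mol per hartree

def atomization_energy(e_molecule, atom_energies):
    """Atomization energy in kcal/mol from total energies given in hartree."""
    return (sum(atom_energies) - e_molecule) * HARTREE_TO_KCAL

def mean_absolute_error(calculated, reference):
    """Mean absolute deviation between paired calculated and reference values."""
    if len(calculated) != len(reference):
        raise ValueError("lists must have equal length")
    return sum(abs(c - r) for c, r in zip(calculated, reference)) / len(calculated)

# Hypothetical total energies (hartree) of a diatomic molecule and its two atoms.
e_molecule = -100.450
e_atoms = [-0.500, -99.725]
print(f"atomization energy: {atomization_energy(e_molecule, e_atoms):.1f} kcal/mol")

# Hypothetical calculated and reference atomization energies (kcal/mol).
calculated = [143.0, 232.5, 399.0]
reference = [141.2, 232.2, 405.5]
print(f"mean absolute error: {mean_absolute_error(calculated, reference):.2f} kcal/mol")
```

A mean absolute error of this kind is what is meant when a functional is said to reach, or to miss, chemical accuracy over a test set.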
Andzelm and Wimmer19 studied the energetics of several reactions by means of the DFT(SVWN) and DFT(B88/P86) calculations. The DFT(B88/P86) bond separation energies for a set of typical organic reactions involving H, C, N, and O atoms were within 7 kcal/mol of the experimental values. A similar level of accuracy was seen for the dissociation of bonds formed by C, N, and O atoms leading to radical products; the DFT(B88/P86) energies were comparable to those derived from the MP2 and MP4 calculations. Larger errors were detected for breaking bonds involving the F atom. Fournier and DePristo96 calculated bond energies in several small compounds containing disulfide bonds, which are known to stabilize the tertiary structure of proteins. Bond dissociation energies are generally overestimated when LDA(SVWN) is used, whereas the PW86/P86 functional brings them to within 5 kcal/mol of the experimental values. Lee et al.97 investigated the HF, HCl, and H2S dissociation reactions in water clusters. Good agreement between the DFT(BLYP) and the MP2 results was reported for binding energies (within 2 kcal/mol) and geometries. The studies also showed that at least four water molecules are needed for the ionic products of dissociation to coexist.

Chandra and Goursot98 reported DFT(SVWN), DFT(PW86/P86), DFT(B88/P86), DFT(PW86/LAP1), and DFT(B88/LAP1) studies of the proton affinities of the following organic molecules: H2CO, CH3CHO, CH3OH, C2H5OH, HCOOH, and CH3COOH. The SVWN proton affinities were systematically overestimated by 6-10 kcal/mol compared to the experimental results. Significantly better results were obtained using nonlocal functionals. The PW86/P86, B88/P86, and PW86/LAP1 functionals also overestimated the proton affinities, by 4.3-6.16, 2.2-4.4, and 0.9-2.2 kcal/mol, respectively. The best results were obtained from the DFT(B88/LAP1) calculations, which led to proton affinities that differed from the experimental values by -0.9 to +0.36 kcal/mol.
These results are consistent with the original work by Proynov et al.32, in which the LAP1 functional was introduced to calculate the atomization energies of several organic molecules, including pyridine and pyridazine, and to derive heats of reaction for simple organic reactions. The structure and the proton affinity of triazene, a molecule related to triazenium ions, some of which are known as putative carcinogens, were the subject of DFT investigations by Schmiedenkamp et al.99 The DFT(SVWN), DFT(PW86/P86), and DFT(B88/P86) results were compared with those obtained from the Hartree-Fock and MP2 calculations. The structural and energetic results obtained from the nonlocal DFT calculations compared well with the MP2 ones. In a subsequent paper100 the authors performed similar analyses for 19 small organic compounds. The protonation sites were at either nitrogen or oxygen atoms. The results were not sensitive to the analytical form of the approximate exchange-correlation functional. Good results were obtained for nitrogen proton affinities, whereas oxygen affinities were systematically underestimated.

Hydrogen-Bonded Molecular Complexes. Gas-phase hydrogen-bonded complexes frequently attract attention because of their relevance to similar complexes in biomolecular systems. Hydrogen bonds contribute to the stabilization energy of large biomolecules such as nucleic acids and proteins. The nucleoside bases (adenine, thymine, uracil, cytosine, and guanine), which are the building blocks of nucleic acids, form hydrogen-bonded complexes of several types. The two polynucleotide chains of B-DNA, for instance, are held together by hydrogen-bonded purine-pyrimidine base pairs101. In proteins, hydrogen bonds are abundant because the polypeptide backbone is built of groups that can form hydrogen bonds (NH as a donor and CO as an acceptor) and because several amino-acid side chains may be involved in hydrogen bond formation as donors (tryptophan and arginine) or both as acceptors and
as donors (asparagine, glutamine, serine, threonine), or as either donors or acceptors depending on the pH (lysine, glutamic and aspartic acid, tyrosine, histidine)102. Polar groups of proteins frequently form hydrogen bonds with solvent molecules (mostly water) or with other macromolecules103,104. In addition, strong hydrogen bonds are known to play a central role in several enzymatic reactions105,106. Consequently, any nonempirical attempt to study the stability of common biomolecules or enzymatic activity requires methods that can accurately describe hydrogen bonds.

Hydrogen bonds represent a great challenge to quantum chemistry when chemical accuracy is required, as was demonstrated for the hydrogen fluoride dimer by Latajka and Bouteiller107. For this system, detailed studies of a broad family of density functional methods using large basis sets (triple-zeta with diffuse functions and multiple sets of polarization functions) showed the good performance of the DFT methods compared to the conventional ab initio ones: Hartree-Fock, MP2, and quadratic configuration interaction with single and double excitations (QCISD).

The water dimer, which bears more relevance to biological systems, was recently investigated by many researchers. Results obtained using various implementations of the Kohn-Sham formalism were reported87,109-116,119,122,124,125,128. Table 4.2 collects selected results. From Table 4.2, it can be seen that the LDA calculations, regardless of the form of the exchange-correlation functional, led to intermolecular distances that were too small (by about 0.15 Å) and to binding energies overestimated by about 2-3 kcal/mol. The water dimer properties (energy, dipole moment, geometry, and vibrational frequencies) predicted by the DFT calculations at the post-LDA level were significantly closer to experiment than the corresponding LDA results.
The nonlocal functionals led to oxygen-oxygen distances ranging from 2.886 Å to 2.943 Å, which, although still too short, were fairly close to the experimental distance of 2.976 Å117. The interaction energies corrected for the basis set superposition error (BSSE) and obtained at the post-LDA level were affected to some extent by the form of the exchange-correlation functional. A comparison of results obtained with the same basis set shows that the interaction energy may vary by as much as 1.5 kcal/mol depending on the choice of the approximate functional. In most cases, the equilibrium intermolecular distance increased as the basis set was improved. The dependence of the DFT results on the basis set used to expand the Kohn-Sham orbitals is illustrated in Table 4.3, which collects equilibrium properties of the water dimer obtained with the same exchange-correlation functional (B88/P86) but with different basis sets. It can be seen from Table 4.3 that the basis set superposition error was significantly reduced in the calculations with the largest basis sets. Surprisingly, the results obtained with the plane-wave expansion of the Kohn-Sham orbitals, which should be free from the BSSE, differ from those obtained with the largest Gaussian basis set. One of the reasons for the discrepancy might be the periodicity in the plane-wave calculations (the length of the cubic unit cell was set to 8.464 Å), or the poor convergence of the plane-wave expansion for systems with a relatively flat potential energy surface. The differences between the results reported in Ref. 109 and in Ref. 114 for the same orbital basis set are due to differences in the numerical evaluation of the exchange-correlation potential, which was fitted using different sets of auxiliary functions. Finally, the calculated value of the dipole moment of the water dimer is very sensitive to the basis set.
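The size of the basis set superposition error implied by these results can be read off as the difference between the corrected and the uncorrected interaction energies. The short Python sketch below does this for two B88/P86 entries of Table 4.3, the smallest and the largest Gaussian basis sets; the function name `bsse_estimate` is an illustrative choice, and the pairing of uncorrected and corrected values with each basis set is an assumption read from the table.

```python
def bsse_estimate(e_int, e_int_corrected):
    """BSSE in kcal/mol, taken as corrected minus uncorrected interaction energy."""
    return e_int_corrected - e_int

# (Eint, Eint(BSSE)) in kcal/mol for the B88/P86 functional, as read from Table 4.3.
entries = {
    "(631/31/1) for O; (31/1) for H": (-6.4, -4.5),
    "(7111111/11111111/111) for O; (7111/111) for H": (-4.53, -4.37),
}

for basis, (e_int, e_corrected) in entries.items():
    print(f"{basis}: BSSE = {bsse_estimate(e_int, e_corrected):.2f} kcal/mol")
```

With these numbers the estimate drops from about 1.9 kcal/mol to about 0.16 kcal/mol on going from the smallest to the largest Gaussian basis set.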
Besides the water dimer, larger clusters of water molecules were extensively investigated by means of the DFT calculations87,111-114,127,128. Laasonen et al.113 studied the structure, the energies, and the vibrational frequencies of small water clusters (up to eight molecules)
Table 4.2. Calculated properties of the water dimer: the interaction energy (Eint) in kcal/mol and the intermolecular distance (ROO) in Å.

| Eint | Eint(BSSE) | ROO | Exc | Basis set | Ref. |
|---|---|---|---|---|---|
| -7.57 | -7.20 | 2.746 | Xα | 6-31++G(2d,2p) | 124 |
| -9.16 | -8.75 | 2.710 | S/VWN | 6-31G** | 109 |
| -8.27 | -7.93 | 2.715 | S/VWN | 6-31++G(2d,2p) | 124 |
| -9.0 | -8.30 | 2.719; 2.719 | PZ; PZ | aug-cc-pVDZ; plane waves (energy cutoff 80 Ry) | 111; 110 |
| -5.993 | -5.604 | 2.877 | PW86/P86 | 6-31G** | 109 |
| -5.96 | -5.80 | 2.893 | PW86/P86 | (7111111/11111111/111) for O; (7111/111) for H | 114 |
| -4.51 | -4.16 | 2.886 | B88/P86 | 6-31G** | 109 |
| -4.53 | -4.37 | 2.913 | B88/P86 | (7111111/11111111/111) for O; (7111/111) for H | 114 |
| | -4.10 | 2.925 | B88/P86 | Plane waves (energy cutoff 80 Ry) | 110 |
| -6.52 | -5.38 | 2.890 | BLYP | DZ(d,p) | 128 |
| -5.6 | -4.8 | 2.912 | BLYP | 6-31++G** | 119 |
| -4.31 | -3.95 | 2.943 | BLYP | 6-31++G(2d,2p) | 124 |
| -5.20 | -4.18 | 2.938 | BLYP | TZ(1df,2pd) | 128 |
| -6.0 | -5.2 | 2.886 | B3LYP | 6-31++G** | 119 |
| -4.82 | -4.48 | 2.915 | B3LYP | 6-31+G(2d,2p) | 124 |
| -4.57 | -4.5 | 2.917 | B3LYP | aug-cc-pVTZ | 115 |
| -6.2 | -5.4 | 2.845 | B3/P86 | 6-31++G** | 119 |
| -5.4 ± 0.7 | | 2.976 | Exp. | | 117 |
Table 4.3. Water dimer properties calculated using the B88/P86 exchange-correlation functional and different basis sets: the interaction energy (Eint) in kcal/mol, the intermolecular distance (ROO) in Å, and the dipole moment in Debye.

Eint         Eint(BSSE)   ROO     Dipole   Basis set                                        Ref.
-6.4         -4.5         2.898   2.38     (631/31/1) for O; (31/1) for H                   122
-4.17        -2.19        2.914   2.70     (5211/311/1) for O; (51/1) for H                 114
-4.51        -4.16        2.886   2.38     (5211/411/1) for O; (41/1) for H                 109
-4.44        -4.18        2.926   2.78     (5211/411/1) for O; (41/1) for H                 114
-6.1         -3.7         2.895   2.20     (6311/311/1) for O; (311/1) for H                122
-4.69        -4.2         2.886   2.62     aug-cc-pVDZ*                                     115a
-4.4         n/a          2.894   2.48     aug-cc-pVDZ                                      111
-4.72        -4.28        2.903   2.62     (73111/521/1) for O; (721/1) for H               114
-4.54        -4.47        2.917   2.62     (73111/521/1) + 2d for O; (721/1) + 2d for H     114
-4.53        -4.37        2.913   2.61     (7111111/11111111/111) for O; (7111/111) for H   114
n/a          -4.10        2.925   n/a      Plane waves (energy cutoff 80 Ry)                110
-5.4 ± 0.7   n/a          2.976   2.60     Exp.                                             117,118
a The [6s5p4d/5s3p] aug-cc-pVDZ* basis set was obtained by deleting higher angular momentum functions from [6s5p4d3f2g/5s4p3d2f] aug-cc-pVDZ.

by means of DFT(PZ) and DFT(B88/P86) calculations, which employed pseudopotentials, the plane-wave expansion of the Kohn-Sham orbitals, and the Car-Parrinello38 technique for geometry optimization. The authors found that for clusters built of three to six water molecules the ring structures were the most stable, in agreement with the Hartree-Fock results. For the cluster built of eight water molecules, the cubic structure appeared to be the most stable. Lee et al. studied the binding energies of water clusters (up to 20 molecules) by means of DFT(SVWN), DFT(B88/P86), and DFT(BLYP) calculations. The DFT(B88/P86) and DFT(BLYP) binding energies were similar and significantly smaller than the DFT(SVWN) ones for all clusters studied. The authors extrapolated their results for finite clusters to obtain the binding energy per water molecule in the condensed phase. The binding energies obtained this way were in excellent agreement with the experimental binding energy in ice; the values were -11.38 kcal/mol, -11.98 kcal/mol, and -11.35 kcal/mol for DFT(B88/P86), DFT(BLYP), and experiment, respectively. Lee, Sosa, and Novoa97 reported a DFT conformational analysis of several water clusters. New stable minima involving ionic species were found in clusters built of five or eight water molecules. Xantheas111 applied the Kohn-Sham equations to calculate the structure, energies, and vibrational frequencies of water clusters (up to six molecules). The PZ exchange-correlation functional and several combinations of the S and B88 exchange functionals with the VWN, LYP, and P86 correlation functionals were used. Larger clusters were studied using only the functionals that gave good properties of the water dimer (BLYP and B88/P86). Comparisons between the MP2 and DFT results were made for clusters built of up to four molecules.
The structures and energetics derived from the DFT calculations were in good agreement with the MP2 results. The DFT(BLYP) and MP2 binding energies agreed within 3.5 kcal/mol
whereas the oxygen-oxygen distances differed by 0.02 Å, 0.002 Å, and 0.000 Å for the dimer, the trimer, and the tetramer, respectively. More recently, the structure, the energies, and the vibrational frequencies of clusters built of up to eight water molecules were studied by Clementi and collaborators114 by means of DFT(B88/P86) and DFT(PW86/P86) calculations. Good agreement between the DFT and MP2 results was found. Suhai128 investigated the water dimer and an infinite chain of hydrogen-bonded water molecules by means of DFT and post-Hartree-Fock calculations. For the infinite system, the DFT(BLYP), MP2, and MP4 binding energies were within 0.2 kcal/mol of one another, whereas the corresponding interatomic distances were within 0.04 Å. A similar agreement was reported for the water dimer. Reports on many further hydrogen-bonded systems of biological importance appeared recently in the literature61,63,109,112,119-126,128,144. Jasien and Fitzgerald61 reported DFT(HL) calculations at the LDA level for nucleic acid base pairs. Compared to ab initio results, the DFT(HL) interaction energy in the guanine-cytosine and adenine-thymine base pairs was overestimated by about 80%. Similar results were recently obtained from DFT(SVWN) calculations by Santamaria and Vasquez63, who studied changes of the geometry and the electronic structure in nucleic acid bases upon base pair formation. In the same paper, very good interaction energies obtained using the B88/P86 exchange-correlation functional were reported. A formamide-water complex was studied by Sim et al.109 by means of DFT(SVWN), DFT(PW86/P86), and DFT(B88/P86) calculations. The LDA led to qualitatively wrong results for the conformational energies of this hydrogen-bonded complex. Furthermore, poor performance of the LDA calculations was observed in studies of conformational equilibria in malonaldehyde, a molecule with an internal hydrogen bond.
Hobza et al.119 studied the formamide-formamidine complex using the BLYP and B3LYP functionals. The hydrogen bond in this complex was considered a model for the hydrogen bond in the adenine-thymine base pair. A negligible variation of the structural results with the choice of the exchange-correlation functional was noted. The DFT energies were significantly better than those obtained from the Hartree-Fock calculations, and they were close to the MP2 ones. Florian and Johnson120 applied local DFT(SVWN) and nonlocal DFT(BLYP) calculations to obtain the structure, the energy, the vibrational frequencies, and the force constants of the formamide dimer. The DFT(BLYP) results were in very good agreement with the MP2 ones obtained with the same basis set. The dissociation energy amounted to -17.1 and -17.3 kcal/mol for MP2/6-31G(d,p) and DFT(BLYP)/6-31G(d,p), respectively. Chojnacki et al.144 obtained similar dissociation energies (-15.37, -15.25, and -14.75 kcal/mol using the DZVP, TZVP, and TZ2P basis sets, respectively) in their DFT(BLYP) studies of proton transfer in the formic acid dimer (see the Chemical Reactions section). The ammonia dimer was studied by several researchers121,123,124. Zhu and Yang121 reported excellent agreement between the DFT(B88/PW86) and the MP2 interaction energies (3.3 vs. 3.2 kcal/mol). Both DFT and MP2 predicted that the staggered quasi-linear structure (Cs) is more stable than the symmetric cyclic one (C2h). The DFT(BLYP) calculations by Kieninger and Suhai123 provided a similar picture of the conformational equilibria of this system. Novoa and Sosa124 reported studies on several hydrogen-bonded complexes, including the NH3-NH3 dimer. Several local and nonlocal functionals were applied. The DFT(B3LYP) and MP2 geometries agreed within 0.03 Å, and the corresponding energies agreed within 0.3 kcal/mol.
Kieninger and Suhai122 applied DFT(SVWN), DFT(B88/P86), and DFT(PW86/P86) calculations to obtain the equilibrium structure and energetics of the methanol dimer. All exchange-correlation functionals led to overestimated binding energies (-5.8, -5.0, -4.4, and -3.2 ± 0.4 kcal/mol for SVWN, B88/P86, PW86/P86, and experiment, respectively). Topol et al.125 reported DFT(B88/P86) calculations for the same system together with other hydrogen-bonded complexes relevant to hydrogen bonding in biological macromolecules (hydrogen-bonded complexes formed by water, methanol, ethanol, formic acid, acetic acid, and trifluoroacetic acid). The calculations served to verify whether the energetics derived from the DFT calculations could reproduce the available experimental data. Ten out of twelve dimerization energies agreed with experimental data within 1 kcal/mol. Larger differences (2 kcal/mol) were obtained for the acetic acid and formic acid dimers. Han and Suhai126 reported DFT(Xα), DFT(S/LYP), DFT(SVWN), DFT(B3LYP), and DFT(BLYP) calculations on the N-methylacetamide-water complex. The N-methylacetamide molecule may be considered one of the simplest models of the main chain of proteins. Conformational equilibria of clusters of N-methylacetamide with one to three water molecules were studied using DFT(B88/null), DFT(Xα), DFT(S/LYP), DFT(SVWN), DFT(B3LYP), and DFT(BLYP) calculations. The DFT(B3LYP) results compared most favorably with those stemming from the MP2 calculations. Stronger hydrogen bonds usually involve charged molecules. Several charged hydrogen-bonded systems of biological relevance were studied using the KS formalism.122,129,130 Kieninger and Suhai122 applied the DFT(SVWN), DFT(B88/P86), and DFT(PW86/P86) methods to study the NH3···NH4+ complex.
The BSSE-corrected interaction enthalpies obtained using nonlocal functionals were in fair agreement with the results deduced from experiment and with those derived from the MP4 calculations. The corresponding values amounted to -32.0, -26.0, -27.6, -25.9, and -24.8 kcal/mol for DFT(SVWN), DFT(B88/P86), DFT(PW86/P86), MP4, and experiment, respectively. Pudzianowski129 applied DFT(B3LYP) and DFT(BLYP) calculations to study geometries, vibrational frequencies, and energies in the following hydrogen-bonded complexes: H3O+···H2O, NH4+···H2O, CH3NH3+···H2O, NH4+···NH3, CH3NH3+···NH3, OH-···H2O, CH3O-···H2O, CN-···H2O, HCC-···H2O, and HCOO-···H2O. The DFT(B3LYP) and DFT(BLYP) results were in fair agreement with the MP2 results. The root-mean-square deviation between the DFT and the MP2 complexation enthalpies amounted to 0.7 and 1.1 kcal/mol for B3LYP and BLYP, respectively. From the basis set dependence of the DFT results, it was concluded that nonlocal DFT calculations require diffuse atomic functions. Sule and Nagy130 studied the properties of hydrogen diformate. This compound represents a simple model of the carboxyl-carboxylate dyad, which plays the central role in the active center of aspartate proteases. The authors compared the DFT results with those derived from MP2 and MP4 calculations. Various approximate exchange-correlation functionals (SVWN, B88/P86, B3LYP, BLYP, and ACM) were used. For isolated monomer properties (dipole moment, bond lengths, and valence bond angles), the DFT results were at least at the MP2 accuracy level. The complexation enthalpy and the minimum energy structure calculated using the B88/P86 functional agreed better with experiment than those obtained from the MP2 calculations. Two conformers of the dimer were found: a nonsymmetric one bound by two hydrogen bonds and a symmetric one bound by three hydrogen bonds.
Their complexation energies were very similar (35.7 and 34.9 kcal/mol), and the corresponding structures were identified with experimental ones. The LDA calculations led to overestimated complexation energies. In contrast to the results obtained for weaker hydrogen-bonded complexes, the dimer structure obtained using the LDA was remarkably good, whereas the nonlocal functionals (B3LYP and BLYP) gave worse structures. The basis set effects were extensively discussed.

From the results presented in a large number of reports on DFT studies of hydrogen-bonded systems, the following conclusions can be drawn. Firstly, DFT calculations at the LDA level systematically overestimated binding energies and frequently underestimated intermolecular distances. The geometries and vibrational frequencies obtained by means of LDA calculations were remarkably good in only a few cases, such as strongly bound complexes. Secondly, the nonlocal functionals led to significantly better energetic results regardless of the analytical form of the approximate functional. For systems where the LDA leads to intermolecular distances that are too small, nonlocal functionals also led to improved geometries. The electrostatic properties, such as dipole moments, were affected more by the basis set than by the particular form of the approximate exchange-correlation functional (including the LDA).

Chemical Reactions. Theoretical studies of chemical reactions usually deal with the analysis of the Born-Oppenheimer potential energy surface (PES). In the PES approach, the non-adiabatic and tunneling effects are neglected. This approximation can be justified in most cases, especially for reactions involving only rearrangements of heavy atoms. These effects can be fully accounted for in quantum molecular dynamics studies. An efficient implementation of the quantum molecular dynamics methodology for studying chemical reactions remains, however, a challenge to the theory131.
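To make the link between a barrier on the PES and a reaction rate concrete, the sketch below uses the Eyring expression of conventional transition-state theory, k = (kB T / h) exp(-ΔG‡ / RT), with a transmission coefficient of one (no tunneling, in line with the approximation just described). The choice of this rate expression and the function names are assumptions made for illustration; the studies reviewed here are not claimed to have used exactly this form.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J s
R_KCAL = 1.987204e-3        # gas constant, kcal/(mol K)


def eyring_rate(barrier_kcal, temperature=298.15):
    """Unimolecular rate constant (s^-1) for an activation free energy in kcal/mol."""
    prefactor = K_B * temperature / H_PLANCK
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


def rate_ratio(barrier_error_kcal, temperature=298.15):
    """Factor by which the rate is overestimated if the barrier is too low by this amount."""
    return math.exp(barrier_error_kcal / (R_KCAL * temperature))


# An underestimate of 1.4 kcal/mol already changes the room-temperature rate
# by roughly one order of magnitude.
factor = rate_ratio(1.4)
```

The exponential dependence is why errors of a few kcal/mol in computed barrier heights, which are typical of the functionals discussed in this section, translate into rate constants that are wrong by orders of magnitude.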
In PES studies, quantum-mechanical calculations supply the following characteristics of the stationary points (reactants, products, and transition state) on the PES: the geometry, the energy, and the second derivatives of the energy, which are subsequently used to obtain the reaction heat and the reaction rate at a given temperature. Practically all approaches of quantum chemistry have been used to investigate stationary points on the PES. Usually, post-Hartree-Fock calculations are needed to obtain reliable energies at the transition state as well as at the equilibrium geometries. Very often, such studies involve the calculation of energy derivatives (to locate stationary points and to calculate the partition functions), which significantly increases the computational effort required. The Kohn-Sham formalism offers a promising alternative for such studies, although the currently known approximations to the exchange-correlation functional leave much room for improvement. Owing to the complexity and the large size of the systems involved in biochemical reactions, the system-size scaling of a theoretical method becomes the critical issue. In this respect, DFT calculations following the Kohn-Sham formalism outperform the post-Hartree-Fock methods. Arguably, the Kohn-Sham approach in which an approximate exchange-correlation functional is used is less 'empirical' than semi-empirical methods. Several DFT studies of chemical reactions were reported for model systems that might be considered crude models of biochemical reactions.51,132-136,145,146,178,183 Fan and Ziegler132 initiated DFT studies of transition state structures and energies. The transition state and the energies in the CH3NC → CH3CN isomerization process were calculated by means of the Hartree-Fock-Slater method (Xα). The DFT results were similar to those determined by post-Hartree-Fock ab initio methods.
The same process was studied later by means of Kohn-Sham calculations with both LDA (SVWN) and gradient-dependent (B88/P86) functionals133. Results that compare well with experiment were obtained at the LDA level, which indicates that gradient corrections were not important for this reaction. In the same paper, studies of the abstraction of a hydrogen atom from methane by a methyl
radical showed that the LDA calculations led to incorrect energetic and structural results, whereas the calculations with the B88/P86 functional were in good agreement with experiment. Andzelm et al.134 studied the potential energy surfaces for simple organic reactions: N2 dissociation and CH bond dissociation in methane, acetylene, ethylene, and the vinyl radical. The DFT(B88/P86) results compared well with the post-Hartree-Fock (MP4, CCSDT-1) ones. The LDA(SVWN) potential energy surface compared less favorably with the one stemming from the post-Hartree-Fock calculations. Similar studies for a wider class of chemical reactions were made by Baker et al.135 using the BLYP and ACM functionals. The reactions considered included radical and closed-shell reactions with barrier heights ranging from a few to about 50 kcal/mol. The DFT energies and geometries of the stationary points were compared with those calculated using several quantum-chemical methods. The general conclusion was that the DFT methods tend to underestimate barrier heights, particularly for radical reactions. Deng et al. investigated the X- + CH3X → XCH3 + X- reaction (X = F, Cl, Br, or I), which serves as a prototype for the SN2 reaction. The DFT(B88/P86) complexation energies agreed within 1.5 kcal/mol with the experimentally measured values, whereas those derived from the DFT(SVWN) calculations were significantly overestimated (by up to 5 kcal/mol). The DFT(B88/P86) complexation energies were in fair agreement with the MP4 results. The differences between the DFT and the MP4 energies were substantially larger at the transition states (up to 8 kcal/mol) than at the equilibrium geometries. Andzelm et al.51 applied DFT(SVWN), DFT(B88/P86), and DFT(ACM) calculations to study conformational equilibria, vibrational spectra, and reactions involving formic acid.
The results were compared with those stemming from post-Hartree-Fock (CISD, CCSDT-1, and MP4) calculations. The structures of the stable conformers and of the transition state for internal rotation were obtained at the LDA level. The relative energies of the stationary points on the potential energy surface calculated at the post-LDA level were in good agreement with experiment and with the post-Hartree-Fock calculations. For instance, the relative energies of the local minima (cis and trans conformers) were within about 1 kcal/mol. At the transition states, the post-LDA energetic results were within the range of experimental results, and they were in good agreement with the results of the best post-Hartree-Fock calculations. The agreement was somewhat better for DFT(ACM) than for DFT(B88/P86), which tended to underestimate the transition state energies. The study of Andzelm et al.51 represents one of the few cases where the LDA geometries were more in line with experiment (and the post-Hartree-Fock calculations) than those derived from the post-LDA calculations. The isomerization of the formaldehyde radical, both isolated and in the presence of assisting water molecules, was studied by Barone and Adamo146 using the DFT(SVWN), DFT(B3LYP), DFT(B88/P86), and post-Hartree-Fock (MP2, QCISD) approaches. This reaction presents a challenging test for theoretical models: high-level post-Hartree-Fock calculations are needed to obtain the correct energetics. The authors showed that the geometries, harmonic frequencies, and relative energies of the reactant, transition state, and products obtained by means of the B3LYP functional were better than those obtained from the MP2 calculations. Other functionals (SVWN and B88/P86) led to worse results. Stanton and Merz136 studied the addition of carbon dioxide to zinc hydroxide as a model for the zinc metalloenzyme human carbonic anhydrase II.
It was shown that the LDA calculations (DFT(SVWN)) were not reliable for locating transition state structures, whereas the post-LDA ones (DFT(B88/P86)) led to transition state structures and energies in good agreement with the MP2 results (bond lengths agreed within 0.05 Å and energies within 1.0 kcal/mol). In the same paper, a comparable agreement between the DFT(B88/P86) and the MP2 results was reported for the reaction of carbon dioxide with H2 leading to formaldehyde.

Compared to studies of reactions involving rearrangements of heavier atoms, theoretical studies of proton transfer reactions present an additional difficulty. Owing to the small mass of the proton, tunneling effects cannot be excluded a priori. Interest in such reactions arises largely because proton transfer forms a key step in several biochemical reactions. Theoretical studies of proton transfer reactions by means of DFT-based methods that take into account the quantum nature of the proton are at an early stage of development137. The ab initio dynamics of the proton transfer in the formamide-water complex is an example of such a study138 for a system of biochemical relevance. When the quantum properties of the proton are neglected, the same transition-state-locating algorithms as those used for reactions involving heavier atoms may be used. Several studies of the Born-Oppenheimer potential energy surface in reactions involving proton transfer were reported recently139-144. Mijoule et al.139 studied the proton transfer in the H2O···H3O+ complex by means of DFT(B88/P86), DFT(PW86/P86), MP2, and MP4 calculations. Good agreement between the DFT and ab initio results was found for the equilibrium geometry. The equilibrium oxygen-oxygen distances agreed within 0.05 Å. The interaction energies were -37.87, -35.6, and -31.6 kcal/mol for the MP4, DFT(PW86/P86), and DFT(B88/P86) methods, respectively. For larger distances between the oxygen atoms, the double-minimum potential energy curve for the proton transfer obtained from the MP4 calculations was poorly reproduced by the DFT calculations (the barrier height was underestimated by approximately 67%).
The same system was studied by Barone et al.140,141, who confirmed the DFT(B88/P86) results of Mijoule et al. The potential energy curves obtained with the ACM or B3LYP functionals were significantly better, but the barrier height was still underestimated by about 30%. Stanton and Merz142 investigated proton transfer in several symmetric systems, [H2O-H-OH2]+, [HO-H-OH]-, [CH3-H-CH3]-, [NH3-H-NH3]+, and [NH2-H-NH2]-, by means of DFT(SVWN), DFT(BLYP), and MP2 calculations. Compared to the MP2 results, the DFT(SVWN) complexation energies were overestimated by more than 10 kcal/mol. Even larger differences in barrier heights were seen for all systems studied. The DFT(BLYP) results were within 1-4 kcal/mol of the MP2 ones. Zhang et al.143 studied proton transfer in formamidine, which belongs to the amidine class of compounds exhibiting antibiotic, antifungal, and anesthetic activities. The barriers for the proton transfer in the gas phase and in the presence of one, two, or three water molecules calculated by means of DFT(SVWN), DFT(HL), DFT(BLYP), DFT(B88/P86), DFT(ACM), DFT(B3LYP), and DFT(B-Half-and-Half-LYP) were compared with the MP2, MP4(SDTQ), or CCSD(T) results. The highest-level (CCSD(T)) calculations led to the conclusion that the gas-phase barrier of 48.8 kcal/mol is reduced to about 20 kcal/mol upon adding one, two, or three water molecules forming cyclic hydrogen bonds with formamidine. The LDA calculations led to an underestimated gas-phase barrier and to an overestimated stabilizing effect of the assisting water molecules on the barrier height. Although all gradient-dependent functionals led to results in qualitative agreement with the CCSD(T) calculations, the DFT results obtained with the B-Half-and-Half-LYP functional were clearly superior. The DFT(B-Half-and-Half-LYP) potential energies agreed with the CCSD(T) ones within 2.5 kcal/mol. The proton transfer in the formic acid dimer was the subject of DFT(SVWN), DFT(ACM), DFT(B88/PW91), DFT(BLYP), DFT(B88/P86), and DFT(PW86/P86) studies by Chojnacki et al.144. All nonlocal functionals led to dissociation energies (from 14.75 to 19.52 kcal/mol) in reasonable agreement with experiment (the most recent experimental value amounts to 16.4 ± 0.1 kcal/mol; see Ref. 144 for an extensive list of experimental data) and with the MP2 result (14.76 kcal/mol). The LDA calculations overestimated the binding energy by about 13 kcal/mol. Compared with the barrier height of 8.16 kcal/mol obtained from the MP2 calculations, the DFT barriers for double proton transfer were underestimated. The barrier heights ranged from 3.13 to 5.58 kcal/mol depending on the choice of nonlocal exchange-correlation functional, being significantly better than those derived from the Hartree-Fock calculations.

4.1.2 The Car-Parrinello Method and Its Applications

4.1.2.1 The Method

The Kohn-Sham theory made a dramatic impact in the field of ab initio molecular dynamics. In 1985, Car and Parrinello38 introduced a new formalism for studying the dynamics of molecular systems, in which the total energy functional defined as in the Kohn-Sham formalism proved to be instrumental for practical applications. In the Car-Parrinello (CP) method, the equations of motion are based on a Lagrangian (LCP) that includes fictitious degrees of freedom associated with the electronic state. It is defined as:
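The displayed definition does not survive in this copy. A form consistent with the symbols listed next, given here as an assumption rather than as the authors' exact expression (μ denotes the fictitious electron mass, and the sums run over the Nbasis coefficients of each of the Norbital orbitals and over the Nnuclei nuclei), would be:

```latex
L_{CP} = \sum_{j=1}^{N_{orbital}} \sum_{i=1}^{N_{basis}} \frac{1}{2}\,\mu\,\lvert \dot{C}_{ij} \rvert^{2}
       + \sum_{I=1}^{N_{nuclei}} \frac{1}{2}\, M_{I}\, \dot{\mathbf{R}}_{I}^{2}
       - E\bigl[\{C_{ij}\},\{\mathbf{R}_{I}\}\bigr]
```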
where
• Nbasis is the number of basis functions used to expand the Kohn-Sham orbitals,
• Norbital is the number of Kohn-Sham orbitals ψi,
• μ is an arbitrary parameter representing the fictitious mass of the electron (usually 300-500 a.u.),
• Nnuclei is the number of nuclei,
• Cij is the coefficient of the ith basis function in the expansion of the jth Kohn-Sham orbital,
• E is the total energy of the system, calculated as in Eq. 4.1,
• RI are the coordinates of nucleus I,
• MI is the mass of nucleus I.
The dynamics of the nuclear coordinates (RI) and that of the expansion coefficients (Cij) are governed by a generalized steepest-descent procedure that involves solving two sets of dynamical equations:
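The two displayed equations do not survive in this copy. For a Lagrangian of the Car-Parrinello type with an orthonormal basis, the standard equations of motion take the following form; this is offered as a sketch under that assumption, with μ the fictitious electron mass and the multipliers written with indices j and k because both of their indices label orbitals:

```latex
\mu\, \ddot{C}_{ij} = -\frac{\partial E}{\partial C_{ij}^{*}} + \sum_{k=1}^{N_{orbital}} \Lambda_{jk}\, C_{ik},
\qquad
M_{I}\, \ddot{\mathbf{R}}_{I} = -\frac{\partial E}{\partial \mathbf{R}_{I}}
```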
where Λij are Lagrange multipliers associated with the orthonormalization constraint on the Kohn-Sham orbitals ψj.
One may consider the above equations as a generalization of Born-Oppenheimer dynamics, in which the electrons always stay on the Born-Oppenheimer surface. For a given conformation of the nuclei, the numerical value of the fictitious mass associated with the electronic degrees of freedom determines how far the electron density is allowed to deviate from the Born-Oppenheimer one. Each consecutive step along the trajectory, which involves electronic and nuclear degrees of freedom, can be obtained without determining the exact Born-Oppenheimer electron density.

4.1.2.2 Applications

To date, only a limited number of systems of biological interest have been studied using ab initio molecular dynamics based on the Car-Parrinello method. This is mostly due to the still significant computational requirements of the method and the relatively short times (ps range) that can be reached in a reasonable computer simulation. This relatively short time scale leaves a large domain of biological phenomena beyond the scope of CP ab initio molecular dynamics. It is also important to point out that many computationally robust implementations of CP dynamics based on the plane-wave expansion of the Kohn-Sham orbitals lose their efficiency when applied to large disordered materials such as biological systems. On the other hand, energy minimization within the framework of the Car-Parrinello method offers a promising alternative to approaches that solve the Kohn-Sham equations by expanding the electron density in localized orbitals. Structural and electronic properties of selected points on the Born-Oppenheimer potential energy surface were reported for several molecules of biological interest88-91,110,113 (see the section The Kohn-Sham Method and Its Application). One of the most important advantages of the DFT and ab initio methods is the ability to study chemical reactions without introducing system-dependent parameters.
Only a few studies by ab initio molecular dynamics were reported for systems of biological interest147-150. The studies of liquid heavy water (D2O) by Laasonen et al.147 were the first applications of the Car-Parrinello ab initio molecular dynamics formalism to a system of biological relevance. Average structural and electronic properties were analyzed on the basis of a 2 ps trajectory. Better structural properties were obtained using the B88/P86 functional than with the PW86/P86 one, and they were in satisfactory agreement with experiment. The average value of the dipole moment of the water molecule in the liquid phase was 2.66 ± 0.004 Debye, in line with experiment. The diffusion coefficient obtained from the Einstein relation was in excellent agreement with experiment (D = (2.1 ± 1) × 10^-5 cm^2 s^-1 vs. D = 2.4 × 10^-5 cm^2 s^-1). The agreement is, however, of limited importance considering that the relative error of the diffusion coefficient estimated from this very short simulation was around 50%. Curioni et al.148 studied the protonation of 1,3-dioxane and 1,3,5-trioxane by means of CP molecular dynamics simulations. The dynamics of both molecules was followed for a few ps after protonation. The simulation provided a detailed picture of the evolution of both the geometry and the electronic structure, which helped to rationalize some experimental observations. CP molecular dynamics simulations were applied by Tuckerman et al.149,150 to study the dynamics of hydronium (H3O+) and hydroxide (OH-) ions in liquid water. These ions are involved in charge transfer processes in liquid water: H2O-H+···OH2 → H2O···H+-OH2 and HOH···OH- → HO-···HOH. For the solvated H3O+ ion, a picture consistent with experiment emerged from the simulation. The simulation showed that the H3O+ ion forms a complex with water molecules, the structure of which oscillates between those of the H5O2+ and H9O4+ clusters as a result of frequent proton transfers. During a considerable fraction of the time (40%), the proton could not be assigned to any oxygen in the liquid. In such intermediate states, neighboring water molecules form a strong hydrogen bond characterized by a small distance between the oxygen atoms (ROO = 2.5 Å). On the basis of the good agreement between the results of the simulation and the experimental data reported for H3O+ solvation, and of the fact that the non-adiabatic electron dynamics and the quantum nature of the proton were neglected in the applied model, the authors concluded that the neglected effects were of secondary importance. This observation, without being the ultimate proof, justifies the frequently made assumption that the quantum nature of the proton can be neglected in studies of proton transfer in liquids or in biological materials.
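The Einstein relation mentioned above for the diffusion coefficient connects D to the long-time slope of the mean-square displacement, MSD(t) ≈ 6Dt in three dimensions. As a minimal sketch (the function name, the synthetic data, and the least-squares fit are illustrative assumptions, not the procedure of Laasonen et al.), D can be estimated as follows:

```python
ANGSTROM2_PER_PS_TO_CM2_PER_S = 1.0e-4  # 1 A^2/ps = 1e-16 cm^2 / 1e-12 s


def diffusion_coefficient(times_ps, msd_angstrom2):
    """Diffusion coefficient in cm^2/s from the least-squares slope of MSD(t)/6."""
    n = len(times_ps)
    mean_t = sum(times_ps) / n
    mean_m = sum(msd_angstrom2) / n
    num = sum((t - mean_t) * (m - mean_m)
              for t, m in zip(times_ps, msd_angstrom2))
    den = sum((t - mean_t) ** 2 for t in times_ps)
    slope = num / den  # A^2 per ps
    return slope / 6.0 * ANGSTROM2_PER_PS_TO_CM2_PER_S


# Synthetic, noise-free MSD built from D = 0.24 A^2/ps (2.4e-5 cm^2/s),
# sampled over a 2 ps window like the trajectory discussed in the text.
times = [0.2 * k for k in range(1, 11)]
msd = [6.0 * 0.24 * t for t in times]
d_estimate = diffusion_coefficient(times, msd)

# Relative uncertainty implied by the quoted value (2.1 +/- 1) x 10^-5 cm^2/s.
relative_error = 1.0 / 2.1
```

With real trajectory data the MSD is noisy and only approaches linear behaviour at long times, which is why a 2 ps run gives an uncertainty of about half the value itself.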
4.2 DFT Studies of Molecular Systems Embedded in Their Biological Environment

One of the most distinctive features of elementary biochemical processes is that they take place in a very inhomogeneous microscopic environment that can be described neither as a liquid nor as a solid. When quasi-macroscopic characteristics such as the concentration of ions, electric charge density, electric polarization, dielectric function, elasticity, hydrophobicity, and electric conductivity are considered, most of them are not uniform within the units performing biological functions (such as enzymes, membranes, and nucleic acids). For instance, the dielectric function of solvated biological macromolecules exhibits characteristics of both a liquid and a solid151,152. Theoretical modeling of molecules embedded in such a medium is therefore a real challenge. In contrast to DFT studies of isolated molecules, where there is a strong link between applications to biological systems and general developments in the theory of density functionals, approaches used for modeling the properties of molecules embedded in the biological microscopic environment combine developments in many fields. These fields include DFT, statistical physics, dielectric theory, and the theory of liquids. In this chapter, we concentrate on methodological developments rather than on particular applications. For each approach discussed, the underlying assumptions and an outline of the formalism are given.

4.2.1 DFT Studies of Molecules in an External Electrostatic Field

DFT studies of the perturbation of the electronic structure of a molecule bound to an enzyme were pioneered by Bajorath et al.153-155. In these studies, the electrostatic potential arising from the enzyme's electric charges (Vext) was included in the KS Hamiltonian:
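Equation 4.12 itself is not reproduced in this copy. Assuming the usual Kohn-Sham one-electron equations in atomic units, with the external potential simply added to the effective potential (the symbols V_nuc, V_xc, ψ_i, and ε_i are notational assumptions, not taken from the source), it would read:

```latex
\Bigl[ -\tfrac{1}{2}\nabla^{2} + V_{nuc}(\mathbf{r})
      + \int \frac{\rho(\mathbf{r}')}{\lvert \mathbf{r}-\mathbf{r}' \rvert}\, d\mathbf{r}'
      + V_{xc}[\rho](\mathbf{r}) + V_{ext}(\mathbf{r}) \Bigr] \psi_{i}(\mathbf{r})
  = \varepsilon_{i}\, \psi_{i}(\mathbf{r})
```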
In the Kohn-Sham Hamiltonian, the SVWN exchange-correlation functional was used. Equation 4.12 was applied to calculate the electron density of folate, dihydrofolate, and NADPH (reduced nicotinamide adenine dinucleotide phosphate) bound to the enzyme dihydrofolate reductase. For each molecule investigated, the electron density was compared with that of the isolated molecule (i.e., with Vext = 0). A very strong polarizing effect of the enzyme's electric field was seen. The largest deformations of the bound molecule's electron density were localized. The calculations for folate and dihydrofolate helped to rationalize the role of some ionizable groups in the catalytic activity of this enzyme. The results are,
however, of only qualitative importance because of some simplifications in the model. In particular, the effect of solvent molecules on the electric field was neglected. The polar water molecules and counterions surrounding the protein can modify the electric field in a very nonuniform way. Another factor that may influence the results is the well-known poor quality of DFT molecular polarizabilities, which are at the accuracy level of those derived from Hartree-Fock calculations regardless of the exchange-correlation functional107,108,156,157 used in the KS formalism. This conclusion extends to other electrostatic properties such as dipole and quadrupole moments, as was recently shown by deLuca et al.158 for several organic molecules.

4.2.2 DFT/Reaction Field Models

4.2.2.1 The Methods

Most biological reactions take place in a highly polarizable medium that contains mobile polar water molecules, reorientable polar groups, and mobile ions. For macroscopic media, the energetics of an electric charge distribution placed in the vicinity of a polarizable medium can be described by means of classical dielectric theory159. A group of approaches was developed based on the assumption that the same theory is valid for microscopic charge densities embedded in the microscopic environment. In these approaches, the interaction of the molecule (or molecules) with the surrounding microscopic environment is described by means of the classical theory of dielectrics. According to it, the polarization of the surrounding medium by a given distribution of electrostatic charge (ρ) is described without considering the microscopic details of the medium. The contribution of the polarized medium to the electrostatic potential (the potential of the reaction field,160 Φ_RF) can be obtained as a linear integral of ρ (Eq. 4.13) whose kernel G(R, R') is the Green function of the Poisson equation (Eq. 4.14). In Eq. 4.14, ε(R) is the local value of the dielectric constant and δ(R, R') is the Dirac delta function.

In 1973,161 Rinaldi and Rivail proposed an approach that combines the quantum-mechanical level of description of molecules with the macroscopic concept of the reaction field. A similar approach was introduced by Tapia and Goscinski in 1975162. In this approach, the electron density of a solvated molecule (ρ) is calculated using the SCF procedure in which the isolated-molecule Hamiltonian Hgas is replaced by the solvated-molecule Hamiltonian Hsol (Eq. 4.15).
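The displayed forms of Eqs. 4.13-4.15 are not reproduced in this text. Under the definitions given in the surrounding sentences, and assuming Gaussian units, they can be sketched as follows. How the direct (vacuum) Coulomb term is separated from the reaction part of G is a matter of convention that cannot be recovered from the text, so the block should be read as an outline rather than as the original equations.

```latex
% Sketch only, reconstructed from the prose definitions of G, epsilon, delta,
% H_gas and H_sol; prefactors and sign conventions are assumptions.
\begin{align}
\Phi_{RF}(\mathbf{R}) &= \int G(\mathbf{R},\mathbf{R}')\,\rho(\mathbf{R}')\,d\mathbf{R}'
  \tag{4.13} \\
\nabla\cdot\bigl[\varepsilon(\mathbf{R})\,\nabla G(\mathbf{R},\mathbf{R}')\bigr]
  &= -4\pi\,\delta(\mathbf{R},\mathbf{R}')
  \tag{4.14} \\
H_{sol} &= H_{gas} + \Phi_{RF}
  \tag{4.15}
\end{align}
```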
There is a fundamental difference between Eqs. 4.12 and 4.15 despite their apparent similarity. The term Φ_RF in Eq. 4.15 depends on the electron density (see Eq. 4.13), whereas the term Vext in Eq. 4.12 is constant throughout the SCF procedure. To reflect this fact, the approach based on Eqs. 4.13-4.15 is frequently called the Self-Consistent Reaction Field (SCRF) method. (Throughout the text, XXX/SCRF denotes combined quantum-mechanical/reaction field calculations where XXX specifies the quantum-mechanical method.)
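The difference between a fixed external field and a self-consistent reaction field can be illustrated with a deliberately simple toy model, which is not the quantum-mechanical procedure described above: a polarizable point dipole in a spherical cavity, with the Onsager reaction field factor standing in for the full Φ_RF. The function names, the parameter values, and the use of atomic units are assumptions made for illustration.

```python
def onsager_factor(eps, a):
    """Onsager reaction field factor g for a dipole in a sphere of radius a."""
    return 2.0 * (eps - 1.0) / ((2.0 * eps + 1.0) * a ** 3)


def scrf_dipole(mu0, alpha, eps, a, tol=1e-12, max_iter=200):
    """Iterate dipole <-> reaction field until both are mutually consistent.

    mu0   : gas-phase dipole moment
    alpha : polarizability of the solute
    Returns (converged dipole, reaction field, number of iterations).
    """
    g = onsager_factor(eps, a)
    mu = mu0
    for iteration in range(1, max_iter + 1):
        field = g * mu                 # the field depends on the current dipole
        mu_new = mu0 + alpha * field   # the dipole responds to that field
        if abs(mu_new - mu) < tol:
            return mu_new, field, iteration
        mu = mu_new
    raise RuntimeError("reaction field iteration did not converge")


def fixed_field_dipole(mu0, alpha, field):
    """Response to a field that is held constant, as V_ext is in Eq. 4.12."""
    return mu0 + alpha * field


if __name__ == "__main__":
    mu_scrf, field, n_iter = scrf_dipole(mu0=1.0, alpha=10.0, eps=80.0, a=4.0)
    mu_fixed = fixed_field_dipole(1.0, 10.0, onsager_factor(80.0, 4.0) * 1.0)
    print(mu_scrf, mu_fixed, n_iter)
```

In this model the iteration has the closed-form limit mu0 / (1 - alpha * g), which is larger than the one-shot response to a fixed field; the same feedback is what distinguishes Eq. 4.15 from Eq. 4.12.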
Once the electron density of the embedded molecule has been evaluated by the SCRF calculations, the free-energy component that is due to the solvent polarization can be expressed through the interaction of the reaction field potential with the electrons and the nuclei of the embedded molecule; in that expression, Zi denotes the electric charge of nucleus i. The nonelectrostatic components of the free energy, such as the energy of cavity formation Gcav or components that take into account atomistic details of the medium (interactions between atoms inside the cavity and those in the medium), are calculated using empirical approximations (see Reference 164 for a review or 165 for recent developments). These terms do not affect the SCF procedure, since their dependence on the electron density is usually neglected.

The evaluation of the reaction field potential for a given distribution of electric charges inside the cavity (ρ) is a classical electrostatic problem. The techniques applied for the SCRF calculations were recently reviewed in detail by Tomasi and Persico.164 The simplest ones are models where the reaction field potential can be expressed analytically for a given distribution of electric charge inside the cavity. In particular, spherical161 or ellipsoidal cavity163 models are relevant to the study of solvent effects on molecular properties. The most general approaches, based on the numerical solution of the Poisson equation (by finite difference or finite element algorithms), are applicable to dielectrics with nonuniform dielectric properties. The SCRF calculations using a spherical cavity require a negligible increase of CPU time over that of the corresponding gas-phase SCF calculations. On the other hand, calculating Φ_RF with the finite difference method for cavities of irregular shape requires more computer time because the reaction field potential and the associated matrix elements must be evaluated at each SCF step. This leads to an increase of the total CPU time by a factor of 2-4 compared with that needed for gas-phase calculations168. The SCRF approach became a standard tool167 for estimating solvent effects and was combined with various quantum chemical methods ranging from semi-empirical161 to post-Hartree-Fock ab initio ones.
It can also be combined with the Kohn-Sham formalism, in which case the Kohn-Sham Hamiltonian (Eq. 4.2) is used for the gas-phase Hamiltonian in Eq. 4.15. The effective Kohn-Sham Hamiltonian for the system embedded in the dielectric environment is then the Kohn-Sham Hamiltonian of Eq. 4.2 augmented by the reaction field potential Φ_RF.
Several techniques for calculating Φ_RF were recently combined with the Kohn-Sham formalism. They include:
Multipole Expansion for a Spherical Cavity170. For a molecule embedded in a polarizable medium, this method requires little more computational effort than the calculations for an isolated molecule. For a cavity of radius a, the electrostatic potential is expressed as a series in spherical harmonics164.
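For a sphere, the response of the continuum to each multipole order is known in closed form (the classical Kirkwood result, of which the Born and Onsager formulas are the l = 0 and l = 1 cases). The sketch below evaluates these factors and the corresponding solvation energies for a point charge and a rigid point dipole at the cavity centre. It is an illustration under stated assumptions (atomic units, a single-centre expansion, hypothetical function names) and not the implementation used in the cited work.

```python
def kirkwood_factor(l, eps, a):
    """Reaction field factor of multipole order l for a sphere of radius a."""
    return (l + 1) * (eps - 1.0) / ((l + (l + 1) * eps) * a ** (2 * l + 1))


def born_energy(q, eps, a):
    """Electrostatic solvation energy of a point charge q (l = 0 term)."""
    return -0.5 * q * q * kirkwood_factor(0, eps, a)


def onsager_energy(mu, eps, a):
    """Electrostatic solvation energy of a rigid dipole mu (l = 1 term)."""
    return -0.5 * mu * mu * kirkwood_factor(1, eps, a)


if __name__ == "__main__":
    # Truncating the series after l = 1 keeps only these two contributions.
    print(born_energy(1.0, 80.0, 2.0), onsager_energy(1.0, 80.0, 4.0))
    # Replacing eps = 80 by a very large value changes the charge term by ~1%.
    print(born_energy(1.0, 80.0, 2.0) / born_energy(1.0, 1.0e12, 2.0))
```

The strong dependence on the radius (a to the power 2l + 1 in the denominator) is one way to see why the choice of cavity radius matters so much in practice.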
In this series, Y_lm(θ, φ) is the spherical harmonic with total angular momentum l and projection m along the z axis. The coefficients B_lm depend on the distribution of electric charge inside the cavity. The dielectric constant of the region outside the cavity is uniform, and its numerical value is taken from experimental measurements for the corresponding bulk material. In the above multipole series, terms higher than the dipole are frequently neglected170. Since the shape of most molecules of biological interest is complicated, a spherical cavity with an effective radius is a rather serious simplification. The radius of the cavity is usually calculated from experimental molecular volumes. The choice of the cavity radius has been the subject of much discussion (see Ref. 172).

Multipole Expansion for an Ellipsoidal Cavity171. For this cavity shape, analytical formulas similar to those for spherical cavities were found163. Representing the cavity with an ellipsoid increases the range of applicability of the SCRF method compared with spherical cavity calculations. The ellipsoid parameters are usually determined using geometrical criteria166. For both spherical and ellipsoidal cavities, the convergence of the multipole expansion can be substantially improved when a multicenter multipole expansion of the embedded electron density is applied instead of a single-center expansion. Both spherical and ellipsoidal cavity calculations are of limited applicability to biological systems because of the idealization of the molecular shape and the assumed uniformity of the dielectric properties of the region outside the cavity.

Apparent Surface Charges (ASC)169,173. This method, also referred to as the Boundary Element Method (BEM), overcomes the major drawback of spherical or ellipsoidal cavity calculations, namely the idealization of the cavity shape.
It is also possible to introduce the nonhomogeneity of the polarizable medium by defining several regions, each characterized by a different dielectric constant. In this method, the electrostatic potential is obtained as a sum of two components (Eq. 4.20): one arising from the charges inside the cavity (ρ) and the other arising from the charges (σ_12) on the dividing surfaces. The surface charges are coupled to the electrostatic potential through a boundary condition (Eq. 4.21) that involves n12, the vector normal to the dividing surface oriented from region 1 to region 2, and the polarization vectors Pi (i = 1, 2). The polarization vectors are in turn expressed through the electrostatic potential (Eq. 4.22) using ε_i, the dielectric constant assigned to region i. Since the surface charges depend on the electrostatic potential (Eq. 4.20), Eqs. 4.20-4.22 are solved iteratively, leading to self-consistent surface charges. At the end of this procedure, the surface charges and the electrostatic potential satisfy the boundary condition specified in Eq. 4.21. In practical applications, this self-consistent procedure for calculating the reaction field potential is coupled to the self-consistent procedure that governs the solution of the Kohn-Sham equations. A special case of an infinite dielectric constant outside the cavity
(ε_out = ∞) was combined with the DFT formalism by Truong and Stefanovich174. Because of its large value, the dielectric constant of liquid water can be replaced by infinity without causing significant errors in the reaction field potential inside the cavity. Apart from aqueous solutions, however, this type of modeling is of limited applicability to biological systems, which are usually characterized by large but nonuniform local values of the dielectric constant.

The Finite Difference Method (FD)168,169. This is a general method applicable to systems with arbitrarily chosen local dielectric properties. In this method, the electrostatic potential (Φ_RF) is obtained by solving the discretized Poisson equation (Eq. 4.23).
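To make the idea of a discretized Poisson equation with a position-dependent dielectric constant concrete, the following sketch solves the one-dimensional analogue on a uniform grid with fixed boundary values. It is a minimal illustration and not the three-dimensional solver of the cited work: the dielectric constant is assigned to the intervals between grid nodes, Gaussian units are assumed, and the tridiagonal system is solved directly with the Thomas algorithm. All names are hypothetical.

```python
import math


def solve_poisson_1d(eps_half, rho, h, phi_left, phi_right):
    """Solve d/dx(eps dphi/dx) = -4*pi*rho on a uniform grid.

    eps_half : dielectric constant on each of the n intervals between nodes
    rho      : charge density at the n - 1 interior nodes
    h        : grid spacing
    Returns the potential at all n + 1 nodes, boundaries included.
    """
    n = len(eps_half)
    m = n - 1
    if len(rho) != m:
        raise ValueError("rho must have one entry per interior node")
    # Row k: -e_left*phi[k-1] + (e_left + e_right)*phi[k] - e_right*phi[k+1]
    diag = [eps_half[k] + eps_half[k + 1] for k in range(m)]
    lower = [-eps_half[k] for k in range(m)]
    upper = [-eps_half[k + 1] for k in range(m)]
    rhs = [4.0 * math.pi * rho[k] * h * h for k in range(m)]
    rhs[0] += eps_half[0] * phi_left       # fold boundary values into the rhs
    rhs[-1] += eps_half[-1] * phi_right
    # Thomas algorithm: forward elimination, then back substitution.
    c = [0.0] * m
    d = [0.0] * m
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for k in range(1, m):
        denom = diag[k] - lower[k] * c[k - 1]
        c[k] = upper[k] / denom
        d[k] = (rhs[k] - lower[k] * d[k - 1]) / denom
    phi = [0.0] * m
    phi[-1] = d[-1]
    for k in range(m - 2, -1, -1):
        phi[k] = d[k] - c[k] * phi[k + 1]
    return [phi_left] + phi + [phi_right]


if __name__ == "__main__":
    # Low-dielectric region (eps = 1) next to a water-like region (eps = 80).
    eps = [1.0] * 4 + [80.0] * 4
    print(solve_poisson_1d(eps, [0.0] * 7, 1.0, 1.0, 0.0))
```

In the two-region example almost the entire potential drop occurs in the low-dielectric half, which is the screening effect a nonuniform ε(R) is meant to capture.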
Equation 4.23 was also generalized (the Poisson-Boltzmann equation) to include the effect of nonuniformly distributed mobile ions on the reaction field potential. This makes the finite difference method the most suitable technique for obtaining the reaction field potential in DFT/SCRF studies of biological systems. Such calculations involve, however, increased computational effort, as exemplified by the fourfold increase in computational time of the finite difference SCRF/DFT calculations for trans-methylacetamide compared with the gas-phase calculations168. Several approaches were proposed to define the boundary between the high and low dielectric regions175. For large biomolecules, the surface dividing the low and high dielectric regions can be determined using simple geometric criteria such as the solvent accessibility176.

4.2.2.2 Applications

Although the Kohn-Sham equations form the quantum mechanical core of the DFT/SCRF methods, the final energies obtained by these methods also depend on other features of a particular DFT/SCRF implementation. It is important, therefore, to stress that DFT/SCRF represents a whole family of methods. Particular implementations may differ in the following details:
• DFT: the exchange-correlation functional and basis sets;
• Reaction Field: the determination of the cavity shape, the dielectric constant(s), the method used to calculate the reaction field potential, and the presence of counterions (for finite difference calculations);
• Nonelectrostatic terms: the parametrizations for the cavity formation energy, dispersion attraction, and short-range repulsion, or for special terms describing the energy of hydrogen bonding.

Conformational Equilibria. The solvent effect on conformational equilibria is a typical problem studied using the DFT/SCRF methods. The presence of the environment may affect the free energy of a given conformer, change its equilibrium conformation, or even destabilize a particular conformation.
The DFT/SCRF calculations have been applied to study such effects using various KS methods as well as different techniques for calculating Φ_RF. Hydration free enthalpies of different conformers of formamide and allyl vinyl ether were studied using the ellipsoidal cavity model and the BLYP functional170,177. Very good agreement between the MP2/SCRF and the DFT(BLYP)/SCRF results was shown for the
solvent-induced increase of the barrier for rotation about the C-N bond in formamide. The calculated values (both close to 1.5 kcal/mol) were, however, smaller than the experimental estimate (close to 8 kcal/mol) based on NMR measurements in a polystyrene matrix and in water. For the Claisen rearrangement in allyl vinyl ether, both the MP2/SCRF and the DFT(BLYP)/SCRF results were discouraging. The experimental estimate of the solvent-induced reduction of the barrier amounts to 3.2-3.7 kcal/mol, whereas the SCRF results predict only a negligible solvent effect. The solvated H3N...HBr complex was studied by Ruiz-Lopez et al.171 using the ellipsoidal cavity in DFT(B88/P86)/SCRF calculations, which demonstrated that the solvent stabilizes the ionic pair structure. The authors found only one conformational minimum of the complex, whose electronic structure corresponded to that of an ionic pair. Foresman et al.175 applied DFT(B3LYP)/SCRF calculations to obtain the polar solvent effect on conformational equilibria in furfuraldehyde and on the C-C rotational barrier of (2-nitrovinyl)amine. The authors demonstrated that the poor results obtained using either spherical or ellipsoidal cavities can be significantly improved by performing the SCRF calculations for a cavity of molecular shape.

The reaction field based DFT methods were also applied to study solvent effects on tautomeric equilibria in several molecules of biological interest66,71. Adamo and Lelj66 studied solvent effects on the formamide ⇌ formamidic acid and the 2-pyridone ⇌ 2-hydroxypyridine equilibria. Both the LDA (SVWN) and the post-LDA (B88/P86) results were reported. The spherical cavity calculations led to very good agreement with experiment, in line with results from SCRF calculations in which the solute was described at the post-Hartree-Fock level (MP2 and CI). Barone and Adamo71 reported a continuation of this work for 2-pyridone.
Some microscopic features were added by including explicit water molecules in the cavity. Relative energies of tautomers and barriers for the direct and the water-assisted proton transfer were calculated using the B3LYP functional for the solute Kohn-Sham Hamiltonian. Good agreement between the DFT and the MP2 results was reported. The calculations, in agreement with experiment, predicted that 2-hydroxypyridine is the most stable tautomer in polar solvents.

Thermochemistry. Chen et al.168 combined the Kohn-Sham formalism with finite difference calculations of the reaction field potential. The effect of mobile ions on the reaction field potential Φ_RF was accounted for by means of the Poisson-Boltzmann equation. The authors used the DFT(B88/P86)/SCRF method to study solvation energies, dipole moments of solvated molecules, and absolute pKa values for a variety of small organic molecules. The list of molecules studied with this approach was subsequently extended182. A simplified version, in which the reaction field was calculated only at the end of the SCF cycle, was applied to study redox potentials of several iron-sulphur clusters181. Rashin et al. combined the Kohn-Sham equations with either the finite difference or the boundary element method to calculate the reaction field169, and they evaluated solvent effects on molecular properties and hydration enthalpies for several organic molecules of biological relevance. For neutral molecules, the calculated solvation enthalpies were in very good agreement with experimental results (the difference amounted to 10% of the experimental values). Larger relative differences were reported for charged molecules. The differences between the LDA (SVWN) and the post-LDA (B88/P86) results were not significant. Baldridge et al.185 studied the effect of solvent screening on the interactions between formaldehyde and Na+ or Ca2+ cations.
In these studies, the cation and the formaldehyde molecule were placed inside a cavity of molecular shape. The reaction field potential was calculated by means of the finite difference method. The authors discussed in detail the numerical implementation of the SCRF approach. It was demonstrated that the electronic structure of a molecule inside the cavity depended strongly on the way in which the atomistic environment was replaced by a continuous medium. To calculate free energies of solvation for several organic molecules, Fortunelli and Tomasi applied the boundary element method for the reaction field in the DFT/SCRF framework173. The authors demonstrated that the DFT/SCRF results obtained with the B88 exchange functional and with either the P86 or the LYP correlation functional are significantly closer to the experimental ones than those stemming from HF/SCRF calculations. The authors used the same cavity parameters for the HF/SCRF and DFT/SCRF calculations, which makes it possible to attribute the apparent superiority of the DFT/SCRF results to the density functional component of the model. The boundary element method appeared to be computationally very efficient. The DFT/SCRF calculations required only a few percent more CPU time than the corresponding gas-phase SCF calculations. Truong and Stefanovich calculated the free energies of hydration for 17 neutral and 35 charged organic molecules by applying a special variant of boundary element calculations: the screening conductor model, which corresponds to an infinite dielectric constant outside the cavity174. Hydration free energies were calculated by means of either HF/SCRF or DFT/SCRF calculations and cavities of molecular shape. The DFT(B3LYP)/SCRF results were slightly better than those obtained from the HF/SCRF calculations for both neutral and charged species. Good correlation with experimental hydration free energies was seen; in individual cases (the HPO3 molecule, for instance), however, the relative error of the calculated solvent effects was around 50%.
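The screening conductor idea can be sketched in a few lines. With an infinite dielectric constant outside the cavity, the apparent surface charges must cancel the solute potential on the cavity surface, which gives a linear system for the charges that can be relaxed iteratively. The example below uses a spherical cavity only so that the result can be checked against the Born formula; the Fibonacci point set, the diagonal self-interaction factor 1.07, the Gauss-Seidel sweeps, and all names are choices made for this illustration and are not details taken from the cited work.

```python
import math


def fibonacci_sphere(n, radius):
    """Return n roughly uniformly spaced points on a sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = golden_angle * k
        points.append((radius * r * math.cos(phi),
                       radius * r * math.sin(phi),
                       radius * z))
    return points


def conductor_surface_charges(solute, radius, n=80, sweeps=2000, tol=1e-10):
    """Surface charges q solving A q = -phi for a conductor-like boundary.

    solute : list of (charge, (x, y, z)) point charges inside the cavity
    Returns (surface points, surface charges).
    """
    pts = fibonacci_sphere(n, radius)
    area = 4.0 * math.pi * radius ** 2 / n
    self_term = 1.07 * math.sqrt(4.0 * math.pi / area)  # assumed diagonal
    phi = [sum(q / math.dist(p, pos) for q, pos in solute) for p in pts]
    a = [[self_term if i == j else 1.0 / math.dist(pts[i], pts[j])
          for j in range(n)] for i in range(n)]
    charges = [0.0] * n
    for _ in range(sweeps):              # Gauss-Seidel relaxation
        change = 0.0
        for i in range(n):
            s = sum(a[i][j] * charges[j] for j in range(n) if j != i)
            new = (-phi[i] - s) / a[i][i]
            change = max(change, abs(new - charges[i]))
            charges[i] = new
        if change < tol:
            break
    return pts, charges


def reaction_potential(point, pts, charges):
    """Potential generated at a point by the apparent surface charges."""
    return sum(q / math.dist(point, p) for p, q in zip(pts, charges))


if __name__ == "__main__":
    pts, q = conductor_surface_charges([(1.0, (0.0, 0.0, 0.0))], radius=2.0)
    print(sum(q), reaction_potential((0.0, 0.0, 0.0), pts, q))
```

For a unit charge at the centre of a cavity of radius 2 (atomic units), the total apparent charge should be close to -1 and the reaction potential at the centre close to -0.5, the Born value for an infinite dielectric constant, up to discretization error.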
The spherical cavity DFT(BLYP)/SCRF model was applied by Siegbahn180 in studies of the solvated H3O+...OH− system. The DFT/SCRF calculations were made for (H3O+...OH−)·(H2O)n clusters, where n = 0, 1, 2, or 3 denotes the number of explicit water molecules within the cavity. In principle, a series of DFT/SCRF calculations for an increasing number of explicit water molecules and a matching cavity size should lead to converging energies. The free energy results of Siegbahn did not, however, reach convergence even for the largest clusters investigated. The calculated solvent effects varied by 7 kcal/mol depending on the number of explicit water molecules. The polar solvent effect on the relative acidities of carboxylic acids and enols was studied by Wiberg et al.179 by means of DFT(B3LYP)/SCRF and MP2/SCRF calculations. Both methods reproduced the experimental solvent effects well.

Chemical Reactions. The reaction field approach has also been applied to study chemical reactions in polar media. Such studies involve the analysis of the Born-Oppenheimer potential energy surface (PES). As in gas-phase studies, the PES approach neglects the quantum nature of the nuclei. In the condensed phase, however, the PES approach using the ab initio/SCRF method involves an additional approximation, which neglects the dynamic correlations between the structure of the embedded molecular system and the structure of the microscopic environment. The DFT/SCRF energies correspond to a fully equilibrated microenvironment; consequently, only reactions that are slow on the time scale of the dynamics of the microscopic environment can be treated. As in the case of gas-phase reactions, the energies and geometries of the stationary points on the potential energy surface have been used to derive heats of reaction and reaction rates, and to investigate reaction mechanisms in the condensed phase.
Several chemical reactions of biochemical interest were modeled by means of the DFT/SCRF approach178,183,184.
The solvent effect on the reaction OH− + CO2 → HCO3− is of great relevance for modeling biochemical reactions because the enzyme carbonic anhydrase speeds up this reaction significantly. It was studied by means of the DFT/SCRF method using the BLYP functional178. The MP4 and DFT gas-phase reaction profiles were similar, the largest energy differences being about 2 kcal/mol. The gas-phase enthalpies of reaction from the MP4 and DFT calculations were similar (−44.1 and −43.4 kcal/mol for MP4 and DFT, respectively). Two variants of the DFT/SCRF calculations were applied to estimate the solvent effect on the transition state energetics. In the first one, an idealized spherical cavity and a multipole expansion were used for the reaction field calculations. In the second one, the boundary element method was used for a cavity of molecular shape. Both methods led to very similar results in the transition state region. The relative free energies calculated using these two methods amounted to 9.2 and 9.8 kcal/mol, in good agreement with experiment (12.1 kcal/mol).

Truong and Stefanovich183 studied solvent effects in the SN2 reaction Cl− + CH3Cl → ClCH3 + Cl− using SCRF/DFT calculations. Gas-phase structures and energies of the reactants, products, and transition state were obtained using Hartree-Fock, MP2, DFT(B3LYP), and DFT(B-Half-and-Half-LYP) calculations. Excellent agreement between the experimental and DFT(B-Half-and-Half-LYP) gas-phase structures was observed for CH3Cl. For the transition state, for which an experimental structure is not available, the DFT(B-Half-and-Half-LYP) results were very close to the MP2 ones. Two sets of DFT/SCRF calculations were performed, differing in the numerical value of the dielectric constant outside the cavity (ε = 80 or ε = ∞). For the infinite dielectric constant, a computationally very efficient approach was used (the screening conductor model). The solvent effects on energies derived from the MP2/SCRF and DFT/SCRF calculations were in good agreement with experiment.
For both gas-phase and solvent calculations, the MP2 and the DFT(B-Half-and-Half-LYP) results were very similar. In the reaction field calculations, it was found that replacing ε = 80 by ε = ∞ has a negligible effect on the energetics. Radkiewicz et al.184 explored the mechanism of aspartic acid racemization by means of DFT(B3LYP)/SCRF calculations. The DFT/SCRF calculations provided a quantitative rationalization of the rapid racemization observed at succinimide residues in proteins. The proposed reaction mechanism was supported by the computed increase of the acidity of the succinimide residue in aqueous solution compared with the gas phase.

On the basis of these results, it can be concluded that DFT/SCRF calculations usually lead to results similar to those from MP2/SCRF calculations, provided that post-LDA exchange-correlation functionals are used. Energetic effects exceeding 2 kcal/mol compare well with experiment in most systems investigated. In cases where DFT/SCRF and experimental results differ by more than 1-2 kcal/mol, the cause of the discrepancy is obscured by the diversity of approximations and simplifications within the DFT/SCRF framework.

4.2.3 DFT/Molecular Mechanics Coupled Methods

In this section, a group of related approaches is discussed in which the continuum dielectric description of the microscopic environment is replaced by a more detailed model that takes into account the atomic details of the structure and the dynamics of the microscopic environment. These models will be referred to here as coupled DFT/Molecular Mechanics (DFT/MM). For a general overview of coupled ab initio/Molecular Mechanics methods, see the recent reviews by Åqvist and Warshel186 and by Gao187.
The most important feature of the DFT/MM methodology is the partitioning of the whole system into two regions and the application of a different level of description to each of them. This contrasts with ab initio molecular dynamics (e.g., on the Born-Oppenheimer potential energy surface or following the Car-Parrinello formalism), in which the same approximations are made for all parts of the system. A selected region is described at the quantum mechanical level (Kohn-Sham equations), whereas the rest of the system is described at the more approximate (molecular mechanics) level. The electron density of the quantum-mechanical part (ρ) is calculated repeatedly for the many geometries used to derive averaged properties. The mean value of an observable represented by the operator O is obtained as an ensemble average of its expectation value (Eq. 4.24), in which the density matrix is taken at a given geometry of the microscopic environment and the average runs over conformations of the microscopic environment. The sample of conformations represents a given thermodynamic ensemble. The density is obtained in an SCF procedure with an effective Hamiltonian composed of three terms (Eq. 4.25).
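The displayed forms of Eqs. 4.24 and 4.25 are not reproduced in this text. From the prose definitions they can be sketched as below; the hat notation, the trace, and the subscript on the angle brackets are notational assumptions rather than the original typography.

```latex
% Sketch only: ensemble average of a quantum-mechanical expectation value,
% and the three-term effective Hamiltonian described in the text.
\begin{align}
\langle O \rangle &= \Bigl\langle \operatorname{Tr}\bigl(\hat{\rho}\,\hat{O}\bigr)
                     \Bigr\rangle_{\mathrm{MicroEnv}}
  \tag{4.24} \\
H_{\mathrm{eff}} &= H_{\mathrm{gas}} + H_{\mathrm{MicroEnv}} + H^{0}_{\mathrm{MicroEnv}}
  \tag{4.25}
\end{align}
```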
In Eq. 4.25:
• Hgas corresponds to the gas-phase Kohn-Sham Hamiltonian of the embedded electron density;
• HMicroEnv represents the interactions between the molecules of the environment and the embedded molecule(s);
• H0MicroEnv is the energy of the molecules of the microenvironment (this term does not depend on the embedded electron density and is not involved in the construction of the Fock matrix).

The DFT/MM approach has been applied to study equilibrium properties as well as chemical reactions. Several DFT/MM implementations were developed, differing in the strategy for approximating the HMicroEnv and H0MicroEnv terms and in the way the statistical sample of conformations is generated. These implementations are briefly presented below.

Stanton et al. combined the Kohn-Sham formalism with molecular dynamics to study solvation enthalpies of several molecules189. Their implementation of Eq. 4.25 can be outlined as follows:
• H0MicroEnv is calculated using the standard approximations of classical molecular mechanics (i.e., pair potentials and analytical expressions for strain energy190);
• HMicroEnv is represented by Eq. 4.26, where
• j runs over all atoms of the embedded molecule;
• i runs over all atoms in the microscopic environment;
• Aij and Bij are constants that depend on the geometry of neither the embedded molecule nor the atoms of the environment;
• Qi is the net charge on the ith atom of the microscopic environment.

The first term in Eq. 4.26 represents van der Waals forces between atoms of the microscopic environment and the embedded molecule. This term is not involved in the construction of the Fock matrix. The second term represents Coulomb interactions between the embedded electron density and the electric charge distribution in the environment, which is approximated by point charges. Stanton et al. applied the model based on Eqs. 4.24-4.26 to study solvent effects on the structure and the energetics of the following complexes: water dimer, formaldehyde/water, and formaldehyde/hydrogen chloride191. The authors found that the DFT/MM approach performed well in most cases studied and, in general, outperformed the semiempirical/MM approach. The interaction energies and structures obtained using this method appeared to be insensitive to the choice of a particular nonlocal exchange-correlation functional.

Alternative implementations of the DFT/MM approach based on approximations for Heff similar to those in Eqs. 4.25 and 4.26 appeared recently in the literature188,192-194. Wei and Salahub investigated a solvated water molecule192 and proton transfer in a solvated H5O2+ cluster193 by means of combined DFT/molecular dynamics simulations. The LDA (SVWN) and the post-LDA (PW86/P86) exchange-correlation functionals were used for the solute Kohn-Sham Hamiltonian (Hgas in Eq. 4.25). The molecular dynamics was performed using periodic boundary conditions and was governed by a potential energy including empirical pair potentials for the solute-solvent and solvent-solvent terms.
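The two kinds of coupling terms described for Eq. 4.26, a pairwise 12-6 term that stays outside the Fock matrix and a point-charge Coulomb term that acts on the embedded density, can be sketched as follows. The grid representation of the density, the inclusion of the solute nuclei in the Coulomb energy, the sign convention (electrons carry charge -1, atomic units), and all function names are assumptions made for illustration and do not reproduce any particular implementation.

```python
import math


def pair_term(qm_atoms, mm_atoms, a_coef, b_coef):
    """Sum over i (environment) and j (embedded) of A_ij/R^12 - B_ij/R^6."""
    energy = 0.0
    for j, r_j in enumerate(qm_atoms):
        for i, (r_i, _charge) in enumerate(mm_atoms):
            r = math.dist(r_j, r_i)
            energy += a_coef[i][j] / r ** 12 - b_coef[i][j] / r ** 6
    return energy


def mm_potential(point, mm_atoms):
    """Electrostatic potential of the environment point charges Q_i."""
    return sum(q / math.dist(point, r_i) for r_i, q in mm_atoms)


def coulomb_term(grid, weights, density, nuclei, mm_atoms):
    """Coulomb energy of the embedded molecule in the point-charge field.

    grid, weights, density : quadrature points, weights and electron density
    nuclei                 : list of (position, Z) for the embedded molecule
    """
    electrons = -sum(w * rho * mm_potential(p, mm_atoms)
                     for p, w, rho in zip(grid, weights, density))
    cores = sum(z * mm_potential(r_z, mm_atoms) for r_z, z in nuclei)
    return electrons + cores


if __name__ == "__main__":
    mm = [((0.0, 0.0, 3.0), -0.8)]                 # one environment atom
    qm = [(0.0, 0.0, 0.0)]                         # one embedded atom
    print(pair_term(qm, mm, [[1.0]], [[2.0]]))
    print(coulomb_term([(0.0, 0.0, 0.5)], [1.0], [1.0],
                       [((0.0, 0.0, 0.0), 1.0)], mm))
```

Only `mm_potential` would enter the one-electron matrix elements in an SCF calculation; the pair term is a constant for a given geometry, which is the distinction the text draws between the two terms.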
As in the approach of Stanton et al.189, only the electrostatic component of HMicroEnv was involved in the construction of the Fock matrix; the dispersion and repulsion interactions were neglected in the effective potential. The calculated binding energy of the solvated quantum water molecule and its dipole moment did not depend significantly on the choice of exchange-correlation functional (either SVWN or PW86/P86). They depended more strongly on the basis set. The dipole moment of the water molecule increased upon solvation by 0.43 and 0.66 Debye at the DZVP and TZVP+ levels, respectively, in line with experiment. The calculations predicted that the proton transfer barrier is higher in water than in the gas phase by 3 kcal/mol.

Tuñón et al.194 studied the water molecule in liquid water. The sample of conformations of the microscopic environment (water in this case) was obtained using the Monte Carlo technique. The energy was calculated as in the approach of Stanton et al.189, that is, using Eqs. 4.25 and 4.26. The solvent-induced increase of the dipole moment amounted to 0.61 Debye, in line with the results of Wei and Salahub and close to the experimental value of 0.75 Debye. The solvation enthalpy amounted to −12.6 kcal/mol, while the value calculated by Wei and Salahub and the experimental one were −10.4 kcal/mol and −9.9 kcal/mol, respectively.

Truong and Stefanovich188 applied DFT/Monte Carlo simulations to estimate solvent effects on the binding enthalpies of several complexes of the Me...(H2O)n type (Me = Li+, Na+, F−, or Cl−; n = 1, 2). The approximations for HMicroEnv were similar to those of Stanton and Merz. In order to speed up the DFT calculations, the authors evaluated the accurate SCF energy only for selected conformations of the microscopic environment, whereas they performed first-order perturbation calculations for intermediate conformations.
The gain in computer time is proportional to the ratio Npert/NSCF between the number of self-consistent (NSCF) and perturbative (Npert) calculations. For each cluster considered, the enthalpy of binding was calculated at Npert/NSCF ranging from 1:1, which
corresponds to the standard implementation of the DFT/MM formalism, to 1:2000. For the Li+ and Na+ clusters, replacing the SCF calculations by the perturbative ones had a negligible effect on the calculated energies. For the Cl− and F− clusters, the effect on the solvation enthalpies was as little as 0.2-0.4 kcal/mol at Npert/NSCF equal to 1:2000.

Wesolowski and Warshel197 introduced a DFT-based approach in which all short-range terms in the effective Hamiltonian (Eq. 4.25) were derived entirely from density functional theory and were involved in the construction of the Fock matrix195,196. In this approach, HMicroEnv is expressed using explicit functionals of the electron density (Eq. 4.27).
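The displayed forms of Eqs. 4.27 and 4.28 are not reproduced in this text. A form consistent with the description that follows (two electrostatic terms from the environment, then exchange-correlation and kinetic contributions that depend on both densities) is sketched below. The grouping of the exchange-correlation terms and the functional-derivative notation are assumptions; the definition of the nonadditive kinetic energy as a difference of T_s values is the standard one.

```latex
% Sketch only, reconstructed from the prose description of the terms.
\begin{align}
H_{\mathrm{MicroEnv}}[\rho,\rho_{Env}](\mathbf{r})
  &= V^{Env}_{Nucl}(\mathbf{r})
   + \int \frac{\rho_{Env}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'
   + \frac{\delta E_{xc}[\rho+\rho_{Env}]}{\delta\rho}
   - \frac{\delta E_{xc}[\rho]}{\delta\rho}
   + \frac{\delta T_{s}^{nadd}[\rho,\rho_{Env}]}{\delta\rho}
  \tag{4.27} \\
T_{s}^{nadd}[\rho,\rho_{Env}]
  &= T_{s}[\rho+\rho_{Env}] - T_{s}[\rho] - T_{s}[\rho_{Env}]
  \tag{4.28}
\end{align}
```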
In Eq. 4.27, ρ and ρ_Env are the electron densities of the quantum-mechanical subsystem and the microscopic environment, respectively, and Tnadd is the nonadditive kinetic energy functional, defined (Eq. 4.28) as the difference between Ts evaluated for the total density ρ + ρ_Env and the sum of Ts evaluated for ρ and ρ_Env separately. Here Ts[ρ] denotes the kinetic energy of a reference system of noninteracting electrons with density ρ. The first two terms in Eq. 4.27 represent the electrostatic potential generated by the atomic nuclei (V_NuclEnv) and the electrons in the microscopic environment. The remaining terms represent the quantum-mechanical components of the interaction between the electron densities ρ and ρ_Env: dispersion interaction and Pauli repulsion. It is worthwhile to note that all interactions between the embedded molecule and its environment are calculated at the DFT level and depend neither on empirical parameters nor on the chemical composition of the system studied. This approach might be computationally advantageous provided that the kinetic energy component of the effective potential can be accurately calculated using an explicit functional of the electron density. It was recently shown200 that gradient-dependent kinetic energy functionals with correct asymptotic behavior can be used for this purpose. It is also worthwhile to note that in the present formulation (FDFT, Frozen Density Functional Theory196) the electron density of each molecule in the microscopic environment is fixed at its initial value (Frozen Density). The total electron density ρ_Env of the microscopic environment changes for each geometry because of the movements of the molecules of the microscopic environment. This simplification can be straightforwardly removed by allowing the frozen density of each molecule of the microscopic environment to relax199. This approach was used to obtain solvation free energy differences197 and to study the solvent effect on the energy barrier height in a model proton transfer reaction198. Very good agreement was obtained between free energies calculated by means of empirical force fields (which depend on the chemical composition of the system studied and are fitted to experimental results) and the parameter-free DFT results.
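The nonadditive kinetic energy can be made concrete with the simplest explicit density functional, the Thomas-Fermi expression. This is used here only because it fits in a few lines; as noted above, gradient-dependent functionals were found to be appropriate for this purpose, so the sketch illustrates the bookkeeping rather than a recommended functional. Densities are represented on an arbitrary quadrature grid, and the names are hypothetical.

```python
import math

# Thomas-Fermi constant (3/10)(3 pi^2)^(2/3) in atomic units.
C_TF = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)


def t_thomas_fermi(density, weights):
    """Thomas-Fermi kinetic energy: C_TF * integral of rho^(5/3)."""
    return C_TF * sum(w * rho ** (5.0 / 3.0) for rho, w in zip(density, weights))


def t_nonadditive(rho_a, rho_b, weights):
    """T[rho_a + rho_b] - T[rho_a] - T[rho_b] on a common grid."""
    total = [a + b for a, b in zip(rho_a, rho_b)]
    return (t_thomas_fermi(total, weights)
            - t_thomas_fermi(rho_a, weights)
            - t_thomas_fermi(rho_b, weights))


if __name__ == "__main__":
    w = [0.5, 0.5, 0.5]
    print(t_nonadditive([1.0, 0.2, 0.0], [0.0, 0.3, 2.0], w))   # overlapping
    print(t_nonadditive([1.0, 0.0, 0.0], [0.0, 0.0, 2.0], w))   # disjoint
```

With this functional the nonadditive term vanishes when the two densities do not overlap and is positive where they do, which is how it contributes a short-range repulsion between the embedded molecule and its environment.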
The coupled DFT/MM formalism can be regarded as an intermediate approximation between ab initio molecular dynamics and classical molecular mechanics. As such, its range of applicability extends to problems that molecular mechanics cannot treat, chemical reactions for instance. The possibility of restricting the quantum-mechanical treatment to well-localized regions also makes it computationally advantageous over supermolecule ab initio simulations. Importantly, the formalism is the same whether it is applied to biochemical reactions or to reactions taking place in another microscopic environment. This makes it possible to test any implementation on problems for which there
are accurate experimental data (reactions in solution, for example) and then to apply the same model in investigations of biochemical reactions. This contrasts with the DFT/SCRF approaches, where moving from bulk solvents to biological environments requires additional approximations (e.g., assuming specific local values of the dielectric constant).
4.3 Prospects for the Future

4.3.1 Refinements of the Kohn-Sham Formalism

The current state of the Kohn-Sham formalism and its applications to electronic structure calculations has been presented in many books and reviews. The reader is referred to the very recent paper by Kohn, Becke, and Parr13 and references therein. New approaches aimed at finding accurate approximate exchange-correlation functionals continue to emerge in the literature30,201,203,208,209,212,214. Owing to the availability of correlated ab initio electron densities of high accuracy, the exact exchange-correlation potential (V_xc) could be calculated for atoms (see Ref.204 and references therein) and also for soluble model systems202. Comparisons of the exact V_xc with approximate ones give hints on how to improve the approximate exchange-correlation functionals or to derive new ones202,204-207,209,210. Two directions of development can be foreseen. The first is the search for an accurate, parameter-free, general exchange-correlation functional. Such a functional almost certainly cannot be accurately approximated using relatively simple analytical functions such as those of the SVWN and PW91 functionals. The second is the development of functionals of relatively simple analytical form that contain empirical parameters and have well-defined domains of applicability to chemical problems. The second route seems more promising for modeling biomolecules, as is evidenced by the generally good results obtained by means of the B3LYP functional.

4.3.2 Toward Linear System-Size Scaling

Quantum-mechanical methods that scale linearly with the system size are the "Holy Grail" of quantum chemists, especially of those studying large, disordered molecular systems such as the ones of biological interest. The Kohn-Sham formalism already represents a great achievement with its N^3 scaling.
Several computer implementations reduce this scaling further, for example the Fast Multipole Method for the evaluation of Coulomb integrals6, the fast assembly of the Coulomb matrix215, or fast procedures for fitting the electron density216. The density functional elongation method217, designed to study large aperiodic polymers like DNA molecules, also significantly reduces the canonical N^3 scaling of the Kohn-Sham method. Developments in the implementation of the Kohn-Sham approach have also helped to formulate alternative formalisms aimed at approaching linear scaling195,218-227. In particular, the "divide-and-conquer" approach of Yang218 has attracted much attention.

4.3.3 Time-Dependent DFT

The time-dependent extension of DFT has not yet been well explored, although its firm theoretical basis was given by Runge and Gross228 more than a decade ago. Most recent
studies showed a good performance of the presently known exchange-correlation functionals when used within the time-dependent formalism to obtain frequency-dependent polarizabilities108. A natural subject of forthcoming studies would be the electron transfer reaction229 in biomolecular complexes.

4.3.4 Analysis of the Electron Density

The electron density can be analyzed directly at various levels of sophistication. In a typical implementation of DFT, the electron density is known numerically on a large number of grid points, which makes such an analysis straightforward. Wang et al.230 showed a qualitative agreement between electron densities from DFT and QCISD calculations for several small organic molecules. The electron density may be used to determine the molecular shape175 and to find similarities between large molecules231,232. Maps of the electron density deformation are useful for quantitative discussion of the factors stabilizing molecules.211 The electron density obtained from DFT calculations also might be used to derive such local properties as the molecular electrostatic potential, reactivity descriptors (Fukui function, hardness, softness)233-237, the 'electron localization function' (ELF)238, or atomic properties defined within the 'atoms-in-molecules' formalism239. Thanks to the increasing availability of high-resolution molecular graphics devices, the analysis of the electron density (both direct and indirect) can be expected to become a supplementary tool that helps us understand the molecular mechanisms of biochemical reactions at the atomistic level.
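Because the density is available on a grid, reactivity descriptors of the kind mentioned above reduce to simple array arithmetic. The following sketch is a hedged illustration, not the procedure of any particular program: it uses the common finite-difference definitions of the Fukui functions, f+(r) = ρ_{N+1}(r) - ρ_N(r) and f-(r) = ρ_N(r) - ρ_{N-1}(r), and assumes the convention η = (I - A)/2 for hardness and S = 1/(2η) for softness (other conventions differ by a factor of two). The one-dimensional Gaussian "densities" and all function names are hypothetical placeholders for densities computed at a fixed geometry.

```python
import numpy as np


def fukui_functions(rho_n, rho_n_plus_1, rho_n_minus_1):
    """Finite-difference Fukui functions on a common grid.

    f_plus  = rho(N+1) - rho(N): sites favored for nucleophilic attack.
    f_minus = rho(N) - rho(N-1): sites favored for electrophilic attack.
    """
    f_plus = np.asarray(rho_n_plus_1) - np.asarray(rho_n)
    f_minus = np.asarray(rho_n) - np.asarray(rho_n_minus_1)
    return f_plus, f_minus


def global_descriptors(ionization_potential, electron_affinity):
    """Chemical potential, hardness, and softness from I and A (same energy unit).

    Assumed convention: eta = (I - A) / 2 and S = 1 / (2 * eta).
    """
    mu = -0.5 * (ionization_potential + electron_affinity)
    eta = 0.5 * (ionization_potential - electron_affinity)
    softness = 1.0 / (2.0 * eta)
    return mu, eta, softness


def gaussian(x, center, width):
    """Normalized 1D Gaussian used to build the toy densities."""
    return np.exp(-0.5 * ((x - center) / width) ** 2) / (width * np.sqrt(2.0 * np.pi))


# Toy one-dimensional densities (hypothetical, for illustration only).
x = np.linspace(-12.0, 12.0, 2401)
dx = x[1] - x[0]
rho_n_minus_1 = 3.0 * gaussian(x, 0.0, 0.9)            # N - 1 = 3 electrons
rho_n = 4.0 * gaussian(x, 0.0, 1.0)                    # N = 4 electrons
rho_n_plus_1 = rho_n + gaussian(x, 1.5, 1.5)           # N + 1 = 5 electrons

f_plus, f_minus = fukui_functions(rho_n, rho_n_plus_1, rho_n_minus_1)

# Each Fukui function should integrate to one electron.
print(np.sum(f_plus) * dx, np.sum(f_minus) * dx)

# Hypothetical I and A values, in arbitrary energy units.
print(global_descriptors(10.0, 2.0))
```

In a real application the three densities would come from separate calculations on the N-1, N, and N+1 electron systems evaluated on the same grid, and the sums would use the quadrature weights of that grid rather than a constant spacing.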
4.4 Concluding Remarks

DFT calculations at the LDA level only rarely lead to good energetic results for biological systems. The energetic results are usually erratic, although such properties of isolated molecules as equilibrium geometries, vibrational frequencies, ionization potentials, dipole moments, quadrupole moments, and dipole polarizabilities are frequently in fair agreement with experiment. None of the several attempts to improve local functionals has led to a universal functional better than SVWN. In particular, LDA fails to describe hydrogen-bonded systems reliably. In the search for approximate exchange-correlation functionals applicable to chemical problems, significant progress was achieved with the introduction of nonlocality by means of gradient-dependent terms. Such functionals made it possible for density functional theory to enter the domain of biomolecular modeling. Good prospects for future applications of density functional theory to model systems of biochemical interest are justified by the following observations:
• MP2-level accuracy of the Kohn-Sham results in most tests;
• favorable scaling (N^3 in the conventional Kohn-Sham method), which is instrumental for large systems;
• the possibility of incorporating the microenvironment at the atomistic level into the formalism in a straightforward manner to study large biological systems;
• the possibility of investigating dynamical properties by means of ab initio molecular dynamics simulations (the Car-Parrinello method, for instance);
• a good description of the metallo-organic complexes present in many biological systems;
• expected progress in the implementation of the time-dependent formalism.
There are also important features of current implementations of density functional theory that leave some biological processes beyond the scope of DFT studies:
• Modeling of biological systems frequently requires that the accuracy of calculated energy differences be at the kT level (about 0.6 kcal/mol, that is, less than 1.0 kcal/mol, at room temperature). In conventional ab initio methods, such accuracy has been achieved because of effective error cancellation, which does not always occur in Kohn-Sham calculations.
• In modeling some chemical reactions (especially those involving radicals), DFT frequently gives erratic results.
• Common approximate exchange-correlation functionals give poor results for van der Waals complexes.

Acknowledgments

The authors would like to thank Dr. Olivier Parisel for critical reading of the manuscript and useful discussions. This work is a part of the Project 20-41830.94 of the Swiss National Science Foundation.

References

1. Blake, C. C. F., D. F. Koenig, G. A. Mair, A. C. T. North, D. C. Phillips, and V. R. Sarma. 1965. Structure of Hen Egg-white Lysozyme: A Three-dimensional Fourier Synthesis at 2 Å Resolution. Nature 206, 757.
2. Kim, S. H., F. L. Suddath, G. J. Quigley, A. McPherson, J. L. Sussman, A. H. J. Wang, N. C. Seeman, and A. Rich. 1974. Three-Dimensional Tertiary Structure of Yeast Phenylalanine Transfer RNA. Science 185, 435.
3. Bretscher, M. S. 1973. Membrane Structure: Some general principles. Science 181, 622.
4. Stewart, J. J. P. 1996. Applications of Localized Molecular-Orbitals to the Solution of Semiempirical Self-Consistent-Field Equations. Int. J. Quant. Chem. 58, 133.
5. Challacombe, M., E. Schwegler, and J. Almlof. 1995. Linear scaling computations of the Hartree-Fock exchange matrix. J. Chem. Phys. 105, 2726.
6. White, C. A., B. G. Johnson, P. M. W. Gill, and M. Head-Gordon. 1996. Linear scaling density functional calculations via the continuous fast multipole method. Chem. Phys. Lett. 253, 268.
7. Haser, M., J. Almlof, and G. E. Scuseria. 1991. The equilibrium geometry of C60 as predicted by second-order (MP2) perturbation theory. Chem. Phys. Lett. 181, 497.
8. Parr, R. G., and W. Yang. 1989. Density Functional Theory of Atoms and Molecules. New York, Oxford University Press.
9. Kryachko, E. S., and E. V. Ludena. 1981. Density-Functional Theory of Many Electron Systems. Kluwer, Dordrecht.
10. March, N. H. 1992. Electron Density Theory of Atoms and Molecules. New York, Academic Press.
11. Ziegler, T. 1991. Density functional theory as a practical tool in studies of organometallic energetics and kinetics. Beating the heavy metal blues with DFT. Chem. Rev. 91, 651.
12. Ziegler, T. 1995. Density Functional Theory as a Practical Tool in Studies of Organometallic Energetics and Kinetics. Beating the Heavy Metal Blues with DFT. Can. J. Chem. 73, 743.
13. Kohn, W., A. D. Becke, and R. G. Parr. 1996. Density Functional Theory of Electronic Structure. J. Phys. Chem. 100, 12974.
14. Hohenberg, P., and W. Kohn. 1964. Inhomogeneous Electron Gas. Phys. Rev. B136, 864.
15. Kohn, W., and L. J. Sham. 1965. Self-Consistent Equations Including Exchange and Correlation Effects. Phys. Rev. 140A, 1133.
16. Slater, J. C., and K. H. Johnson. 1972. Self-Consistent-Field Xα Cluster Method for Polyatomic Molecules and Solids. Phys. Rev. B 5, 844.
17. Salahub, D. R. 1987. Ab Initio Methods in Quantum Chemistry, K. P. Lawley, ed., Adv. Chem. Phys. 69, 447. Chichester, Wiley.
18. St-Amant, A. Density Functional Methods in Biomolecular Modeling, Reviews in Computational Chemistry 7, K. Lipkowitz and D. B. Boyd, eds. New York, VCH Publishers, 217-259.
19. Andzelm, J., and E. Wimmer. 1992. Density functional Gaussian-type-orbital approach to molecular geometries, vibrations, and reaction energies. J. Chem. Phys. 96, 1280.
20. Johnson, B. G., P. M. W. Gill, and J. A. Pople. 1996. The performance of a family of density functional methods. J. Chem. Phys. 98, 5612.
21. Dirac, P. A. M. 1930. Note on Exchange Phenomena in the Thomas Atom. Proc. Cambridge Philos. Soc. 26, 376.
22. Slater, J. C. 1951. A simplification of the Hartree-Fock Method. Phys. Rev. 81, 385.
23. Vosko, S. H., L. Wilk, and M. Nusair. 1980. Accurate spin-dependent electron liquid correlation energies for local spin density calculations: a critical analysis. Can. J. Phys. 58, 1200.
24. Perdew, J. P., and A. Zunger. 1981. Self-interaction correction to the density-functional approximations for many-electron systems. Phys. Rev. B 23, 5048.
25. Hedin, L., and B. I. Lundqvist. 1971. Explicit local exchange-correlation potentials. J. Phys. C 4, 2064.
26. Gunnarsson, O., and B. I. Lundqvist. 1976. Exchange and correlation in atoms, molecules, and solids by the spin-density-functional formalism. Phys. Rev. B 13, 4274.
27. von Barth, U., and L. Hedin. 1972. A local exchange-correlation potential for the spin polarized case: I. J. Phys. C 5, 1629.
28. Becke, A. D. A new mixing of Hartree-Fock and local density theories. J. Chem. Phys. 98, 1372.
29. Becke, A. D. Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 98, 5648.
30. Becke, A. D. Density-functional thermochemistry. IV. A new dynamical correlation functional and implications for exact-exchange mixing. J. Chem. Phys. 104, 1040.
31. Laming, G. J., N. C. Handy, and W. H. Miller. 1995. Comparison of the Gaussian and Bessel Function Exchange Functionals with the Hartree-Fock Exchange for Molecules. J. Phys. Chem. 99, 1880.
32. Proynov, E. I., E. I. Ruiz, A. Vela, and D. R. Salahub. 1995. Determining and Extending the Domain of Exchange and Correlation Functionals. Int. J. Quant. Chem. (Symp) 29, 61.
33. ADF 2.0.1, Department of Theoretical Chemistry, Vrije Universiteit, Amsterdam, The Netherlands. For a description of the ADF program, see: E. J. Baerends, D. E. Ellis, and P. Ros. 1973. Chem. Phys. 2, 2993.
34. DGauss is available as a part of the UniChem software from Cray Research, Eagan, MN. For a description of the DGauss program, see: J. Andzelm and E. Wimmer. 1991. J. Chem. Phys. 96, 1280.
35. St-Amant, A. 1992. Ph.D. Thesis, Universite de Montreal.
36. Frisch, M. J., G. W. Trucks, M. Head-Gordon, P. M. W. Gill, M. W. Wong, J. B. Foresman, B. G. Johnson, H. B. Schlegel, M. A. Robb, E. S. Replogle, R. Gomperts, J. L. Andres, K. Raghavachari, J. S. Binkley, C. Gonzalez, R. L. Martin, D. J. Fox, D. J. Defrees, J. Baker, J. J. P. Stewart, and J. A. Pople. 1993. GAUSSIAN 92/DFT. Gaussian Inc., Pittsburgh, Pa.
37. The DMol program is distributed by Biosym Technologies, Inc. of San Diego, Ca. For a description of the DMol program, see: B. Delley. 1990. J. Chem. Phys. 92, 508.
38. Car, R., and M. Parrinello. 1985. Unified Approach for Molecular Dynamics and Density-Functional Theory. Phys. Rev. Lett. 55, 2471.
39. Perdew, J. P., and Y. Wang. 1986. Accurate and simple density functional for the electronic exchange energy: Generalized gradient approximation. Phys. Rev. B 33, 8800.
40. Perdew, J. P. 1986. Density-functional approximation for the correlation energy of the inhomogeneous electron gas. Phys. Rev. B 33, 8822.
41. Perdew, J. P., and Y. Wang. 1991. Electronic Structure of Solids '91. P. E. Ziesche and H. Eschrig, eds. Berlin, Akademie Verlag, 11.
42. Becke, A. D. 1988. Density-functional exchange energy approximation with correct asymptotic behavior. Phys. Rev. A 38, 3098.
43. Lee, C., W. Yang, and R. G. Parr. 1988. Development of the Colle-Salvetti correlation energy formula into a functional of the electron density. Phys. Rev. B 37, 785.
44. St-Amant, A., W. D. Cornell, P. Kollman, and T. A. Halgren. 1995. Calculation of Molecular Geometries, Relative Conformation Energies, Dipole Moments, and Molecular Electrostatic Potential Fitted Charges of Small Organic Molecules of Biochemical Interest by Density Functional Theory. J. Comp. Chem. 16, 1483.
45. Rashin, A. A., L. Young, I. A. Topol, and S. K. Burt. 1994. Molecular dipole moments calculated with density functional theory. Chem. Phys. Lett. 230, 182.
46. Lelj, F., C. Adamo, and V. Barone. 1994. Role of Hartree-Fock exchange in density functional theory. Some aspects of the conformational potential energy surface of glycine in the gas phase. Chem. Phys. Lett. 230, 189.
47. Barone, V., C. Adamo, and F. Lelj. 1995. Conformational behavior of gaseous glycine by a density functional approach. J. Chem. Phys. 102, 364.
48. Barone, V., C. Adamo, A. Grand, F. Jolibois, Y. Brunel, and R. Subra. 1995. Structure and ESR Features of Glycine Radical. J. Am. Chem. Soc. 117, 12618.
49. Sirois, S., E. I. Proynov, D. Nguyen, and D. R. Salahub, to be published. Hydrogen-Bonding in Glycine and Malonaldehyde. The Lapl Correlation Functional.
50. Florian, J., and B. Johnson. 1994. Comparison and Scaling of Hartree-Fock and Density Functional Harmonic Force Fields. I. Formamide Monomer. J. Phys. Chem. 98, 3681.
51. Andzelm, J. W., D. T. Nguyen, R. Eggenberger, D. R. Salahub, and A. T. Hagler. 1995. Applications of the Adiabatic Connection Method to Conformational Equilibria and Reactions Involving Formic Acid. Computers and Chemistry 19, 145.
52. Oie, T., I. A. Topol, and S. K. Burt. 1995. Ab initio and Density Functional Studies on Internal-Rotation and Corresponding Transition-States in Conjugated Molecules. J. Phys. Chem. 99, 905.
53. Eriksson, L. A., O. L. Malkina, V. G. Malkin, and D. R. Salahub. 1994. The hyperfine structures of small radicals from density functional theory. J. Chem. Phys. 100, 5066.
54. Malkin, V. G., O. L. Malkina, L. A. Eriksson, and D. R. Salahub. 1995. The Calculation of NMR and ESR Spectroscopy Parameters Using Density Functional Theory in: Theoretical and Computational Chemistry, vol. 1, Density Functional Calculations, P. Politzer and J. M. Seminario, eds., Amsterdam, Elsevier.
55. Malkin, V. G., O. L. Malkina, and D. R. Salahub. 1994. Calculation of spin-spin coupling constants using density functional theory. Chem. Phys. Lett. 221, 91.
56. Malkin, V. G., O. L. Malkina, M. E. Casida, and D. R. Salahub. 1994. Nuclear Magnetic Resonance Shielding Tensors Calculated with a Sum-Over-States Density Functional Perturbation Theory. J. Am. Chem. Soc. 116, 5898.
57. Malkin, V. G., O. L. Malkina, and D. R. Salahub. 1995. Influence of Intramolecular Interactions on 13C NMR Shielding Tensor in Solid α-Glycine. J. Am. Chem. Soc. 117, 3294.
58. Case, D. A. 1995. Calibration of Ring-Current Effects in Proteins and Nucleic Acids. Journal of Biomolecular NMR 6, 341.
59. O'Malley, P., and S. J. Collins. 1996. Density functional studies of free radicals: accurate geometry and hyperfine coupling prediction for semiquinone anions. Chem. Phys. Lett. 259, 296.
60. Jensen, G. M., D. B. Goodin, and S. W. Bunte. 1996. Density Functional and MP2 Calculations of Spin Densities of Oxidized 3-Methylindole: Models for Tryptophan Radicals. J. Phys. Chem. 100, 954.
61. Jasien, P. G., and G. Fitzgerald. 1990. Molecular dipole moments and polarizabilities from local density functional calculations: Applications to DNA base pairs. J. Chem. Phys. 93, 2554.
62. Estrin, D. A., L. Paglieri, and G. Corongiu. 1994. A Density Functional Study of Tautomerism of Uracil and Cytosine. J. Phys. Chem. 98, 5653.
63. Santamaria, R., and A. Vasquez. 1994. Structural and Electronic Property Changes of the Nucleic Acid Bases upon Base Pair Formation. J. Comp. Chem. 15, 981.
64. Bakalarski, G., P. Grochowski, J. S. Kwiatkowski, B. Lesyng, and J. Leszczynski. 1996. Molecular and electrostatic properties of the N-methylated nucleic acid bases by density functional theory. Chem. Phys. 204, 301.
65. Sponer, J., J. Leszczynski, and P. Hobza. 1996. Base Stacking in Cytosine Dimer. A Comparison of Correlated Ab Initio Calculations with Three Empirical Models and Density Functional Theory Calculations. J. Comp. Chem. 17, 841.
66. Adamo, C., and F. Lelj. 1994. Equilibrium solvent effects in the framework of density functional theory. Application to the study of the thermodynamics of some organic and inorganic tautomeric equilibria. Chem. Phys. Lett. 223, 54.
67. Nowak, M. J., L. Lapinski, J. S. Kwiatkowski, and J. Leszczynski. 1996. Molecular Structure and Infrared Spectra of Adenine: Experimental Matrix Isolation and Density Functional Theory Study of Adenine N-15 Isotopomers. J. Phys. Chem. 100, 3527.
68. Kwiatkowski, J. S., and J. Leszczynski. 1996. Molecular Structure and Vibrational IR Spectra of Cytosine and its thio and seleno Analogs by Density Functional Theory and Conventional ab initio Calculations. J. Phys. Chem. 100, 941.
69. Kwiatkowski, J. S., and J. Leszczynski. 1996. 2(1H)-Pyridone system and its thio and seleno analogs: Density Functional Theory Versus Conventional ab initio Calculations. Journal of Molecular Structure 367, 325.
70. Hall, R. J., N. A. Burton, I. H. Hillier, and P. E. Young. 1994. Tautomeric equilibria in 2-hydroxypyridine and in cytosine. An assessment of density functional methods, including gradient corrections. Chem. Phys. Lett. 220, 129.
71. Barone, V., and C. Adamo. 1995. Density Functional Study of Intrinsic and Environmental Effects in the Tautomeric Equilibrium of 2-Pyridone. J. Chem. Phys. 99, 15062.
72. Topol, I. A., and S. K. Burt. 1993. The calculations of small molecular conformation energy differences by density functional method. Chem. Phys. Lett. 204, 611.
73. Liang, C., C. S. Ewig, T. R. Stouch, and A. T. Hagler. 1994. Ab Initio Studies of Lipid Model Species. 2. Conformational Analysis of Inositols. J. Am. Chem. Soc. 116, 3904.
74. Oie, T., I. A. Topol, and S. K. Burt. 1992. Ab Initio and Density Functional Calculations on Ethylene Glycol. J. Phys. Chem. 98, 1121.
75. Rabinowitz, J. R., and S. B. Little. 1994. Comparison of Quantum-Mechanical Methods to Compute the Biologically Relevant Reactivities of Cyclopenta-Polycyclic Aromatic Hydrocarbons. Int. J. Quant. Chem. 52, 681.
76. Kaim, W., and B. Schwederski. 1994. Bioinorganic Chemistry: Inorganic Elements in the Chemistry of Life. An Introduction and Guide. New York, Wiley.
77. Aizman, A., and D. A. Case. 1982. Electronic Structure Calculations on Active Site Models for 4-FE,4-S Iron-Sulfur Proteins. J. Am. Chem. Soc. 104, 3269.
78. Noodleman, L., J. G. Norman Jr., J. H. Osborne, A. Aizman, and D. A. Case. 1985. Models for Ferredoxins: Electronic Structures of Iron-Sulfur Clusters with One, Two, and Four Iron Atoms. J. Am. Chem. Soc. 107, 3418.
79. Sontum, S. F., and D. A. Case. 1985. Electronic Structures of Active Site Models for Compounds I and II of Peroxidase. J. Am. Chem. Soc. 107, 4013.
80. Case, D. A. 1982. Electronic Structure Calculations Using the Xα Method. Annu. Rev. Phys. Chem. 33, 151.
81. Ross, P. K., and E. I. Solomon. 1991. An Electronic Structural Comparison of Copper-Peroxide Complexes of Relevance to Hemocyanin and Tyrosinase Active Sites. J. Am. Chem. Soc. 113, 3246.
82. Baldwin, M. J., P. K. Ross, J. E. Pate, Z. Tyeklar, K. D. Karlin, and E. I. Solomon. 1991. Spectroscopic and Theoretical Studies of an End-On Peroxide-Bridged Coupled Binuclear Copper(II) Model Complex of Relevance to Active Sites in Hemocyanin and Tyrosinase. J. Am. Chem. Soc. 113, 8671.
83. Ghosh, A., J. Almlof, and L. Que Jr. 1994. Density Functional Theoretical Study of Oxo(porphyrinato)iron(IV) Complexes, Models of Peroxidase Compounds I and II. J. Phys. Chem. 98, 5516.
84. Mouesca, J. M., L. Noodleman, and D. A. Case. 1995. Density Functional Calculations of Spin Coupling in (Fe4S4)3+ Clusters. Int. J. Quant. Chem. s22, 95.
85. Noodleman, L., C. Y. Peng, D. A. Case, and J. M. Mouesca. 1995. Orbital Interactions, Electron Delocalization and Spin Coupling in Iron-Sulfur Clusters. Coordination Chemistry Reviews 144, 199.
86. Bray, M. R., and R. J. Deeth. 1996. A density functional study of active site models of xanthine oxidase. Inorganic Chemistry 35, 5720.
87. Combariza, J. E., and N. R. Kestner. 1995. Density Functional Study of Short-Range Interaction Forces between Ion and Water Molecules. J. Phys. Chem. 99, 2717.
88. Carloni, P., W. Andreoni, J. Hutter, A. Curioni, P. Giannozzi, and M. Parrinello. 1995. Structure and bonding in cisplatin and other Pt(II) complexes. Chem. Phys. Lett. 234, 50.
89. Tornaghi, E., W. Andreoni, P. Carloni, J. Hutter, and M. Parrinello. 1995. Carboplatin versus cisplatin: density functional approach to their molecular properties. Chem. Phys. Lett. 246, 469.
90. Lamoen, D., and M. Parrinello. 1996. Geometry and electronic-structure of porphyrins and porphyrazines. Chem. Phys. Lett. 248, 309.
91. Carloni, P., P. E. Blochl, and M. Parrinello. 1995. Electronic Structure of the Cu, Zn Superoxide Dismutase Active Site and Its Interactions with the Substrate. J. Phys. Chem. 99, 1338.
92. Becke, A. 1992. Density-functional thermochemistry. I. The Effect of exchange-only gradient corrections. J. Chem. Phys. 96, 2155.
93. Becke, A. 1992. Density-functional thermochemistry. II. The Effect of the Perdew-Wang generalized-gradient correlation correction. J. Chem. Phys. 97, 9173.
94. Clementi, E., and S. J. Chakravorty. 1990. Comparative study of density functional models to estimate molecular atomization energies. J. Chem. Phys. 93, 2591.
95. Tschinke, V., and T. Ziegler. 1991. Gradient corrections to the Hartree-Fock-Slater exchange and their influence on bond energy calculations. Theor. Chim. Acta 81, 81.
96. Fournier, R., and A. E. DePristo. 1992. Predicted bond energies in peroxides and disulfides by density functional methods. J. Chem. Phys. 96, 1183.
97. Lee, C., C. Sosa, M. Planas, and J. J. Novoa. 1996. A theoretical study of the ionic dissociation of HF, HCl, and H2S in water clusters. J. Phys. Chem. 104, 7081.
98. Chandra, A. K., and A. Goursot. 1996. Calculation of Proton Affinities Using Density Functional Procedures: A Critical Study. J. Phys. Chem. 100, 11596.
99. Schmiedenkamp, A. M., I. A. Topol, S. K. Burt, H. Razafinjanahary, H. Chermette, T. Pfatzgraff, and C. J. Michejda. 1994. Triazene Proton Affinities: A comparison between Density Functional, Hartree-Fock, and Post-Hartree-Fock Methods. J. Comp. Chem. 875, 875.
100. Schmiedenkamp, A. M., I. A. Topol, and C. J. Michejda. 1995. Proton Affinities of Molecules Containing Nitrogen and Oxygen: Comparing Density-Functional Results to Experiment. Theoretica Chimica Acta 92, 83.
101. Watson, J. D., and F. H. C. Crick. 1953. A structure of deoxyribose nucleic acid. Nature 171, 737.
102. Stryer, L. 1997. Biochemistry. New York, W. H. Freeman and Company.
103. Phillips, D. C. 1996. The three-dimensional structure of an enzyme molecule. Sci. Amer. 215, 78.
104. Kraut, J., and D. A. Mathews. 1987. Biological Macromolecules and Assemblies. F. A. Jurnak and A. McPherson, eds. New York, Wiley. 3, 1.
105. Warshel, A., A. Papazyan, and P. Kollman. 1985. On Low-barrier Hydrogen Bonds and Enzyme Catalysis. Science 269, 102.
106. Cleland, W. W., and M. M. Kreevoy. 1985. On Low-Barrier Hydrogen Bonds and Enzyme Catalysis. Science 269, 103.
107. Latajka, Z., and Y. Bouteiller. 1994. Application of density functional methods for the study of hydrogen-bonded systems: The hydrogen fluoride dimer. J. Chem. Phys. 101, 9793.
108. van Gisbergen, S. J. A., V. P. Osinga, O. V. Gritsenko, R. van Leeuwen, J. G. Snijders, and E. J. Baerends. 1996. Improved density functional theory for frequency-dependent polarizabilities, by the use of an exchange-correlation potential with correct asymptotic behavior. J. Chem. Phys. 105, 3142.
109. Sim, F., A. St-Amant, I. Papai, and D. R. Salahub. 1992. Gaussian Density Functional Calculations on Hydrogen-Bonded Systems. J. Am. Chem. Soc. 114, 4391.
110. Laasonen, K., F. Csajka, and M. Parrinello. 1992. Water dimer properties in the gradient-corrected density functional theory. Chem. Phys. Lett. 194, 172.
111. Xantheas, S. 1995. Ab initio studies of cyclic water clusters (H2O)n, n = 1-3. III. Comparison of density functional with MP2 results. J. Chem. Phys. 102, 4505.
112. Lee, C., H. Chen, and G. Fitzgerald. 1994. Chemical bonding in water clusters. J. Chem. Phys. 102, 1266.
113. Laasonen, K., M. Parrinello, R. Car, C. Lee, and D. Vanderbilt. 1993. Structures of small water clusters using gradient-corrected density functional theory. Chem. Phys. Lett. 207, 208.
114. Estrin, D. A., L. Paglieri, G. Corongiu, and E. Clementi. 1996. Small Clusters of Water Molecules Using Density Functional Theory. J. Phys. Chem. 100, 8701.
115. Kim, K., and K. D. Jordan. 1994. Comparison of Density Functional and MP2 Calculations on the Water Monomer and Dimer. J. Phys. Chem. 98, 10089.
116. Del Bene, J. E., W. B. Person, and K. Szczepaniak. 1995. Properties of Hydrogen-Bonded Complexes Obtained from B3LYP Functional with 6-31G(d,p) and 6-31+G(d,p) Basis Sets: Comparison with MP2/6-31+G(d,p) Results and Experimental Data. J. Phys. Chem. 99, 10705.
117. Fredin, L., B. Nelander, and G. Ribbegard. 1977. Infrared spectrum of the water dimer in solid nitrogen. I. Assignment and force constant calculations. J. Chem. Phys. 66, 4065.
118. Odutola, J. A., and T. R. Dyke. 1980. Partially deuterated water dimers: Microwave spectra and structure. J. Chem. Phys. 72, 5062.
119. Hobza, P., J. Sponer, and T. Reschel. 1996. Density Functional Theory and Molecular Clusters. J. Comp. Chem. 17, 1315.
120. Florian, J., and B. Johnson. 1995. Structure, Energetics, and Force Fields of the Cyclic Formamide Dimer: MP2, Hartree-Fock, and Density Functional Study. J. Phys. Chem. 99, 5899.
121. Zhu, T., and W. Yang. 1994. Structure of the Ammonia dimer Studied by Density Functional Theory. Int. J. Quant. Chem. 49, 613.
122. Kieninger, M., and S. Suhai. 1994. Density Functional Studies on Hydrogen-Bonded Complexes. Int. J. Quant. Chem. 52, 465.
123. Kieninger, M., and S. Suhai. 1996. Conformational and Energetic Properties of the Ammonia Dimer-Comparison of Post-Hartree-Fock and Density Functional Methods. J. Comp. Chem. 17, 1508.
124. Novoa, J. J., and C. Sosa. 1995. Evaluation of the Density Functional Approximation on the Computation of Hydrogen Bond Interactions. J. Phys. Chem. 99, 15837.
125. Topol, I. A., S. K. Burt, and A. A. Rashin. 1995. Can contemporary density functional theory yield accurate thermodynamics for hydrogen bonding? Chem. Phys. Lett. 247, 112.
126. Han, W.-G., and S. Suhai. 1996. Density Functional Studies on N-Methylacetamide-Water Complex. J. Phys. Chem. 100, 3942.
127. Lee, C., C. Sosa, and J. J. Novoa. 1995. Evidence of the existence of dissociated molecules in water clusters. J. Chem. Phys. 103, 4360.
128. Suhai, S. 1995. Density Functional Studies of the Hydrogen-Bonded Network in an Infinite Water Polymer. J. Phys. Chem. 99, 1172.
129. Pudzianowski. 1995. A Systematic Appraisal of Density Functional Methodologies for Hydrogen Bonding in Binary Ionic Complexes. J. Phys. Chem. 100, 4781.
130. Sule, P., and A. Nagy. 1996. Density Functional study of strong hydrogen-bonded systems: The hydrogen diformate complex. J. Chem. Phys. 104, 8524.
131. Bowman, J. M., and G. C. Schatz. 1995. Theoretical Studies of Polyatomic Bimolecular Reaction Dynamics. Annu. Rev. Phys. Chem. 46, 169.
132. Fan, L., and T. Ziegler. 1990. The application of density functional theory to the optimization of transition state structures. I. Organic migration reactions. J. Chem. Phys. 92, 3645.
133. Fan, L., and T. Ziegler. 1992. Nonlocal Density Functional Theory as a Practical Tool in Calculations on Transition States and Activation Energies. Applications to Elementary Reaction Steps in Organic Chemistry. J. Am. Chem. Soc. 114, 10890.
134. Andzelm, J., C. Sosa, and R. A. Eades. 1993. Theoretical Study of Chemical Reactions Using Density Functional Methods with Nonlocal Corrections. J. Phys. Chem. 97, 4664.
135. Baker, J., M. Muir, and J. Andzelm. 1995. A study of some organic reactions using density functional theory. J. Chem. Phys. 102, 2063.
136. Stanton, R. V., and K. M. Merz Jr. 1994. Density functional transition states of organic and organometallic reactions. J. Chem. Phys. 100, 434.
137. Truong, T. N., and W. Duncan. 1994. A new direct ab initio dynamics method for calculating thermal rate constants from density-functional theory. J. Chem. Phys. 101, 7408.
138. Bell, R. L., and T. N. Truong. 1994. Direct ab initio dynamics studies of proton transfer in hydrogen-bond systems. J. Chem. Phys. 101, 10442.
139. Mijoule, C., Z. Latajka, and D. Borgis. 1993. Density functional theory applied to proton-transfer systems. A numerical test. Chem. Phys. Lett. 208, 364.
140. Barone, V., L. Orlandini, and C. Adamo. 1994. Proton-transfer in model hydrogen-bonded systems by a density-functional approach. Chem. Phys. Lett. 231, 295.
141. Barone, V., L. Orlandini, and C. Adamo. 1995. Proton-Transfer in Small Model Systems: a Density-Functional Study. Int. J. Quant. Chem. 56, 697.
142. Stanton, R. V., and K. M. Merz Jr. 1994. Density functional study of symmetric proton transfers. J. Chem. Phys. 101, 6658.
143. Zhang, Q., R. Bell, and T. N. Truong. 1995. Ab initio and Density Functional Studies of Proton Transfer Reactions in Multiple Hydrogen Bond Systems. J. Phys. Chem. 99, 592.
144. Chojnacki, H., J. Andzelm, D. T. Nguyen, and A. Sokalski. 1995. Preliminary Density Functional Calculations on the Formic Acid Dimer. Computers and Chemistry 19, 181.
145. Deng, L., V. Branchadell, and T. Ziegler. 1994. Potential Energy Surface of the Gas-Phase SN2 Reactions X- + CH3X = XCH3 + X- (X = F, Cl, Br, I): A Comparative Study by Density Functional Theory and ab initio Methods.
146. Barone, V., and C. Adamo. 1994. Theoretical study of direct and water-assisted isomerization of formaldehyde radical cation. A comparison between density functional and post-Hartree-Fock approaches. Chem. Phys. Lett. 224, 432.
147. Laasonen, K., M. Sprik, M. Parrinello, and R. Car. 1993. Ab Initio Liquid Water. J. Chem. Phys. 99, 9080.
148. Curioni, A., W. Andreoni, J. Hutter, H. Schiffer, and M. Parrinello. 1994. Density-Functional-Theory-Based Molecular Dynamics Study of 1,2,5-Trioxane and 1,3-Dioxane Protolysis. J. Am. Chem. Soc. 116, 11251.
149. Tuckerman, M., K. Laasonen, M. Sprik, and M. Parrinello. 1995. Ab initio molecular dynamics simulation of the solvation and transport of hydronium and hydroxyl ions in water. J. Chem. Phys. 103, 150.
150. Tuckerman, M., K. Laasonen, M. Sprik, and M. Parrinello. 1995. Ab initio Molecular Dynamics Simulation of the Solvation and Transport of H3O+ and OH- Ions in Water. J. Phys. Chem. 99, 5749.
151. Grant, E. H., R. J. Shephard, and G. P. South. 1978. Dielectric Behaviour of Biological Macromolecules in Solution. Oxford, Clarendon Press.
152. Pethig, R. 1992. Protein-Water Interactions Determined by Dielectric Methods. Annu. Rev. Phys. Chem. 43, 177.
128
MOLECULAR ORBITAL CALCULATIONS FOR BIOLOGICAL SYSTEMS
153. Bajorath, J., D. H. Kitson, G. Fitzgerald, J. Andzelm, J. Kraut, and A. T. Hagler. 1991. Local Density Functional Calculations on a Protein System: Folate and Escherichia Coli dihydrofolate reductase. Electron Redistribution on Binding of a Substrate to an Enzyme. PROTEINS 9, 217. 154. Bajorath, J., Z. Li, G. Fitzgerlad, D. H. Kitson, M. Farnum, R. M. Fine, J. Kraut, and A. T. Hagler. 1991. Changes in the Electron Density of the Cofactor NADPH on Binding to E. coli Dihydrofolate Reductase. PROTEINS 11, 263. 155. Bajorath, J., J. Kraut, Z. Li, D. H. Kitson, and A. T. Hagler. 1991. Theoretical studies on the dihydrofolate reductase mechanism: electronic polarization of bound substrates. Proc. Natl. Acad. Sci. USA 88, 6423. 156. McDowell, S. A. C., R. D. Amos, and N. C. Handy. 1995. Molecular polarisabilities— a comparison of density functional theory with standard ab initio methods. Chem. Phys. Lett. 235, 1. 157. Dickson, R.M. and A.D. Becke. 1996. Local Density-Functional Polarizabilities and Hyperpolarizabilities at the Basis-Set Limit. J. Phys. Chem. 100, 16105. 158. de Luca, G., N. Russo, E. Sicilia, and M. Toscano. 1996. Molecular quadrupole moments, second moments, and diamagnetic susceptibilities evaluated using the generalized gradient approximation in the framework of Gaussian density functional method. J. Chem. Phys. 105, 3206. 159. Bottcher, C. F. J. 1973. Theory of Electronic Polarization. New York, Elsevier Scientific Publishing Company. 160. Jackson, J. D. 1978. Classical Electrodynamics. New York, Wiley. 161. Rinaldi, D. and J.-L. Rivail. 1973. Molecular Polarizability and Dielectric Effect of Medium in the Liquid Phase. Theoretical Study of the Water Molecule and Its Dimer. Theoret. Chim. Acta 32, 57 (in French). 162. Tapia, O. and O. Goscinski. 1975. Self-consistent reaction field theory of solvent effects. Mol. Phys. 29, 1653. 163. Rivail, J. L. and B. Terryn. 1982. 
Energie libre d'une distribution de charges electriques separe d'un milieu dielectrique infini par une cavite ellipsoidale quelconque. Application a l'etiude de la solvation des molecules. J. Chim. Phys. Chim. Biol. 79, 2. 164. Tomasi, J. and M. Persico. 1994. Molecular Interactions in Solution: An Overview of Methods Based on Continuous Distributions of the Solvent. Chemical Reviews 94, 2027. 165. Marten, B., K. Kim, C. Cortis, R. A. Friesner, R. B. Murphy, M. N. Ringnalda, D. Sitkoff, and B. Honig. 1996. New Model for Calculation of Solvation Free Energies: Correction to Self-consistent Reaction Field Continuum Dielectric Theory for ShortRange Hydrogen-Bonding Effects. J. Phys. Chem. 100, 11775. 166. Rinaldi, D., J. L. Rivail, and N. Rguini. 1992. Fast Geometry Optimization in SelfConsistent Reaction Field Computations on Solvated Molecules. J. Comput. Chem. 13, 675. 167. Rivail, J. L., D. Rinaldi, and M. F. Riuz-Lopez. 1991. Theoretical and Computational Models for Organic Chemistry. S. J. Formosinho et al., eds. Kluwer, Dordrecht, 79-92. 168. Chen, J. L., L. Noodleman, D. A. Case, and D. Bashford. 1994. Incorporating Solvation Effects into Density-Functional Electronic-Structure Calculations. J. Phys. Chem. 98, 11059. 169. Rashin, A. A., M.A. Bukatin, J. Andzelm, and T. Hagler. 1994. Incorporation of reaction field effects into density functional calculations for molecules of arbitrary shape. Biophys. Chem. 51, 375. 170. Davidson, M. M., I. H. Hilier, R. J. Hall, and N. A. Burton. 1994. Effect of Solvent on the Claisen Rearrangement of Allyl VinylEther using Ab Initio Continuum Methods. J. Am. Chem. Soc. 116, 9294. 171. Riuz-Lopez, M. F., F. Bohr, M. T. C. Martins-Costa, and D. Rinaldi. 1994. Studies of solvent effects using density functional theory. Co-operative interactions in H3N . . HBr proton transfer. Chem. Phys. Lett. 221, 109.
APPLICATIONS OF DENSITY FUNCTIONAL THEORY TO BIOLOGICAL SYSTEMS
129
172. Wong, M. W., M. J. Frisch, and K. B. Wiberg. 1991. Solvent Effects. 1. The Mediation of Electrostatic Effects by Solvents. J. Amer. Chem. Soc. 113, 4776. 173. Fortunelli, A. and J. Tomasi. 1994. The implementation of density functional theory within the polarizable continuum model for solvation. Chem. Phys. Lett. 231, 34. 174. Truong, T. N. and E. V. Stefanovich. 1995. A new method for incorporating solvent effects into classical, ab initio molecular-orbital and density functional theory frameworks for arbitrary shape cavity. Chem. Phys. Lett. 240, 253. 175. Foresman, J. B., T. A. Keith, K. B. Wiberg, J. Snoonian, and M. J. Frisch. 1996. Solvent Effects. 5. Influence of Cavity Shape, Truncation of Electrostatics, and Electron Correlation ab initio reaction field calculations. J. Phys. Chem. 100, 16098. 176. You, T. and D. A. Bashford. 1995. An Analytical Algorithm for the Rapid Determination of the Solvent Accessibility of Points in a Three-Dimensional Lattice around a Solute Molecule. J. Comp. Chem. 16, 743. 177. Hall, R. J., M. M. Davidson, N. A. Burton, and I. H. Killer. 1995. Combined Density Functional Self-Consistent Reaction Field Model of Solvation. J. Phys. Chem. 99, 921. 178. Davidson, M. M., I. H. Hillier, R. J. Hall, and N. A. Burton. 1994. Modeling the reaction OH-- + CO2-HCO3-- in the gas-phase and in aqueous-solution: a combined density functional continuum approach. Mol. Phys. 83, 327. 179. Wiberg, K. B., J. Ochterskim, and A. Streitwieser. 1996. Origin of the Acidity of Enols and Carboxylic Acids. J. Am. Chem. Soc. 118, 8291. 180. Siegbahn, P. E. M. 1996. Models for the Description of the H2O+ and OH-- Ions in Water. J. Comp. Chem. 17, 1099. 181. Muesca, J. M., J. L. Chen, L. Noodleman, D. Bashford, and D. A. Case. 1994. Density Functional/Poisson-Boltzmann Calculations of Redox Potentials for Iron-Sulfur Clusters. J. Am. Chem. Soc. 116, 11898. 182. Richardson, W. H., C. Peng, D. Bashford, L. Noodleman, and D. A. Case. 1996. 
Incorporating Solvating Effects into Density Functional Theory: Calculation of Absolute Acidities. DFT95 Proceedings. Intl. J. Quant. Chem. xxx. 183. Truong, T. N. and E. V. Stefanovich. 1995. Hydration Effects on Reaction Profiles: An ab initio Dielectric Continuum Study on the SN2CL-- + CH3Cl Reaction. J. Phys. Chem. 99, 14700. 184. Radkiewicz, J. L., H. Zipse, S. Clarke, and K. N. Houk. 1996. Acclerated Racemization of Aspartic Acid and Asparagine Residues via Succinimidine Intermediates: An ab initio Theoretical Exploration of Mechanism. J. Am. Chem. Soc. 118, 9148. 185. Baldridge, K., R. Fine, and A. Hagler. 1996. The Effect of Solvent Screening in Quantum Mechanical Calculations in Protein Systems. J. Comp. Chem. 1216. 186. Aquist, J. and A. Warshel. 1993. Simulation of Enzyme Reactions Using Valance Bond Force Fields and Other Hybrid Quantum/Classical Approaches. Chem. Rev. 93, 2523. 187. Gao, J. 1996. Reviews in computational chemistry vol. 30. K. B. Lipkowitz and D. B. Boyd, eds. New York, VCH Publishers. 119 Methods and Applications of Combined Quantum Mechanical and Molecular Mechanical Potentials. 188. Truong, T. N. and E. Stefanovich. 1996. Development of a perturbative approach for Monte Carlo simulations using hybrid ab initio QM/MM method. Chem. Phys. Lett. 256, 348. 189. Stanton, R. V., D. S. Hartsough, and K. M. Merz Jr. 1993. Calculation of Solvation Free Energies Using a Density Functional/Molecular Dynamics Coupled Potential. J. Phys. Chem. 97, 11868. 190. Weiner, S. J., P. A. Kollman, D. A. Case, U. C. Singh, C. Ohio, G. Alagona, S. Profeta, and P. Weiner. 1984. A New Force Field for Molecular Mechanical Simulation of Nucleic Acids and Proteins. J. Am. Chem. Soc. 106, 765. 191. Stanton, R. V., D. S. Hartough, and K. M. Merz Jr. 1995. An Examination of a Density Functional/Molecular Mechanical Coupled Potential. J. Comp. Chem. 16, 113. 192. Wei, D. and D. R. Salahub. 1994. A combined density functional and molecular dy-
130
MOLECULAR ORBITAL CALCULATIONS FOR BIOLOGICAL SYSTEMS
namics simulation of a quantum water molecule in aqueous solution. Chem. Phys. Lett. 224, 291. 193. Wei, D. and D. R. Salahub. 1994. Hydrated proton clusters and solvent effects on the proton transfer barrier: A density functional study. J. Chem. Phys. 101, 7633. 194. Tunon, I., M. T. C. Martins-Costa, C. Millot, M. F. Ruiz-Lopez, and J. L. Rivail. 1996. A Coupled Density Functional-Molecular Mechanics MonteCarlo Simulation Method: The Water Molecule in Liquid Water. J. Comp. Chem. 17, 19. 195. Cortona, P. 1991. Self-consistently determined properties of solids without bandstructure calculations. Phys. Rev. B 44, 8454. 196. Wesolowski, T. A. and A. Warshel. 1993. Frozen Density Functional Approach for ab initio Calculations of Solvated Molecules. 197. Wesolowski, T. A. and A. Warshel. 1994. Ab initio Free Energy Perturbation Calculations of Solvation Free Energy Using Frozen Density Functional Approach. J. Phys. Chem. 98, 5183. 198. Wesolowski, T. A., R. Muller, and A. Warshel. 1996. Ab initio Frozen Density Functional Calculations of Proton Transfer Reactions in Solution. J. Phys. Chem. 100, 15444. 199. Wesolowski, T. A. and J. Weber. 1996. Kohn-Sham equations with constrained electron density: an iterative evaluation of the ground-state electron density of interacting molecules. Chem. Phys. Lett. 248, 71. 200. Wesolowski, T. A., H. Chermette, and J. Weber. Accuracy of Approximate Kinetic Energy Functionals in the Model of Kohn-Sham Equations with Constrained Electron Density: the FH-NCH complex as a Test Case. J. Chem. Phys. In press. 201. Laming, G.J., V. Termath, and N.C. Handy. 1993. A general purpose exchange-correlation energy functional. J. Chem. Phys. 99, 8765. 202. Filippi, C., C. J. Umrigar, and M. Taut. 1994. Comparison of exact and approximate density functionals for an exactly soluble models. J. Chem. Phys. 100, 1290. 203. Proynov, E. L, A. Vela, and D. R. Salahub. 1994. 
Gradient-free exchange-correlation functional beyond the local-spin-density approximation. Phys. Rev. A 50, 2421. 204. Gritsenko, O. V., R. van Leeuven, and E. J. Baerends. 1996. Molecular exchangecorrelation Kohn-Sham potential and energy density from ab initio first- and secondorder density matrices: Examples for XH (X = Li, B, F). J. Chem. Phys. 104, 8535. 205. van Leeuven, R. and E. Baerends. 1994. Exchange-correlation potential with correct asymptotic behavior. Phys. Rev. A 49, 2421. 206. Zhao, Q., R. C. Morrison, and R. G. Parr. 1994. From electron density to Kohn-Sham kinetic energies, orbital energies, exchange-correlation potentials, and exchangecorrelation energies. Phys. Rev. A 50, 2138. 207. Morrison, R. C., Q. Zhao, R. C. Morrison, and R. G. Parr. 1995. Solution of the KohnSham equations using reference densities from accurate, correlated wave functions for the neutral atoms helium through argon. Phys. Rev. A 51, 1980. 208. Lembarki, A., F. Regemont, and H. Chermette. 1995. Gradient-corrected exchange potential with the correct asymptotic behavior and the corresponding exchange-energy functional obtained from virial theorem. Phys. Rev. A 52, 3704. 209. Ingamells, V. E. and N. C. Handy. 1996. Towards accurate exchange-correlation potentials for molecules. Chem. Phys. Lett. 248, 373. 210. Wang, Y. and R. Parr. Construction of exact Kohn-Sham orbitals from a given electron density. Phys. Rev. A 47, R1591. 211. Wiberg, K. B., P. v .R. Scheyer, and A. Streitwieser. 1996. The role of hydrogens in stabilizing organic ions. Can. J. Chem. 74, 892. 212. Becke, A. and M. E. Russel. 1989. Exchange holes in inhomogeneous systems: A coordinate-space model. Phys. Rev. A 39, 3761. 213. Glossman, M. D., L. C. Balbas, A. Rubio, and J. A. Alonso. 1994. Nonlocal Exchange and Kinetic Energy Density Functionals with Correct Asymptotic Behavior for Electronic Systems. Int. J. Q. Chem. 49, 171. 214. Gritsenko, O. V., N. A. Cordero, A. Rubio, L. C. Balbas, and J. A. Alonso. 
1993.
APPLICATIONS OF DENSITY FUNCTIONAL THEORY TO BIOLOGICAL SYSTEMS
215. 216. 217. 218. 219. 220. 221. 222. 223. 224. 225. 226. 227. 228. 229. 230. 231. 232. 233. 234. 235. 236. 237.
131
Weighted-Density Exchange and Local-Density Coulomb Correlation Energy Functionals for Finite Systems—Applications to Atoms. Phys. Rev. A 48, 4197. Challacombe, M. E. Schwegler, and J. Almlof. 1996. Fast assembly of the Coulomb matrix: A quantum chemical tree code. J. Chem. Phys. 104, 4686. Gallant, R. T. and A. St-Amant. 1996. Linear scaling for the charge density fitting procedure of the linear combination of Gaussian-type orbitals density functional method. Chem. Phys. Lett. 256, 569. Aoki, Y., S. Suhai, and A. Imamura. 1994. A Density Functional Elengation Method for Theoretical Synthesis of Aperiodic Polymers. Intl. J. Quant. Chem. 52, 367. Yang, W. 1991. Direct Calculation of Electron Density in Density-Functional Theory: Implementation for Benzene and Tetrapeptide. Phys. Rev. A 44, 7823. Baroni, S. and P. Giannozzi. 1992. Towards Very Large Scale Electronic Structure Calculations. Europhys. Lett. 17, 547. Mauri, F., G. Galli, and R. Car. 1993. Orbital formulation for electronic-structure calculations with linear system-size scaling. Phys. Rev. B 47, 9973. Mauri, F. and G. Galli. 1994. Electronic-structure calculations and molecular-dynamics simulations with linear system-size scaling. Phys. Rev. B 50, 4316. Galli, G. and M. Parrinello. 1992. Large Scale Electronic Structure Calculations. Phys. Rev. Lett. 69, 3547. Ordejon, P., D. A. Drabold, M. P. Grumback, and R. M. Martin. 1993. Unconstrained Minimization Approach for Electronic Computations Which Scales Linearly with System Size. Phys. Rev. B 14646. Li, X. P., R. W. Nunes, and D. Vanderbildt. 1993. Density Matrix Electronic Structure Method with Linear System-Size Scaling. Phys. Rev. B 48, 10891. Pearson, M., E. Smargiassi, and P. A. Madden. Ab initio molecular dynamics with an orbital-free density functional. J. Phys. Cond. Matter 5, 3221. Nicholson, D. M. C., G. M. Stocks, Y. Wang, W. A. Shelton, Z. Szostek, and W. M. Temmereman. 1994. 
Stationary nature of the density-functional free energy: Application to accelerated multiple-scattering calculations. Phys. Rev. b 50, 14686. Abrokosov, I. A., A. M. N. Niklasson, S. I. Simak, B. Johansson, A. V. Ruban, and H. L. Skriver. 1996. Order-N Green's Function Technique for Local Environment Effects in Alloys. Phys. Rev. Lett. 76, 4203. Runge, E., and E. K. U. Gross. 1984. Density Functional Theory for Time-Dependent Systems. Phys. Rev. Lett. 52, 997. Samanta, A. A., and S. K. Gosh. 1995. Density functional approach to the solvent effects on the dynamics of nonadiabatic electron transfer reactions. J. Chem. Phys. 102, 3172. Wang, J., B. G. Johnson, B. J. Russel, and L. A. Eriksson. 1996. Electron Densities of Several Small Molecules As Calculated from Density Functional Theory. J. Phys. Chem. 100. Carbo, R., B. Calabuig, L. Vera, and E. Basalu. 1994. Molecular Quantum Similarity: Theoretical Framework, Ordering principles, and Visualization Techniques. 25, 253. Mestres, J., M. Sola, M. Duran, and R. Carbo. 1996. On the Calculation of Ab Initio Quantum Molecular Similarities for Large Systems: Fitting the Electron Density. J. Comp. Chem. 15,1113. Parr, R. G., and W. Yang. 1984. Density Functional Approach to the Frontier-Electron Theory of Chemical Reactivity. J. Chem. Soc. 106, 4049. Yang, W., and R. G. Parr. 1985. Hardness, softness and the Fukui function in the electronic theory of metals and clusters. Proc. Natl. Acad. Sci. USA 82, 6723. Lee, C., W. Yang, and R. G. Parr. 1987. Local Softness and Chemical Reactivity in the Molecules CO, SCN , and H2CO. J. Mol. Struct (Theochem) 163, 305. Mendez, F., and M. Galvan. 1991. Nucleophilic Attacks on Maleic Anhydride: A Density Functional Approach. Density Functional Methods in Chemistry, J. K. Labanowski and J. Andzelm, eds. New York, Springer-Verlag, 387--400. DeProft, F., J. M. L. Martin, and P. Geerlings. 1996. Calculation of molecular elec-
132
MOLECULAR ORBITAL CALCULATIONS FOR BIOLOGICAL SYSTEMS
trostatic potentials and Fukui functions using density functional methods. Chem. Phys. Lett. 256, 400. 238. Becke, A. D., and K. E. Edgecombe. 1990. A simple measure of electron localization in atomic and molecular systems. J. Chem. Phys. 92, 5397. 239. Bader, R. F. W., and T. T. Nguyen-Dang. 1981. Quantum Theory of Atoms in Molecules--Dalton Revisited. Adv. Quant. Chem. 14, 63.
5 On Comparing Experimental and Calculated Structural Parameters
Lothar Schäfer and John D. Ewbank
The Effects of Molecular Vibrations

The tacit assumption underlying all science is that, of two competing theories, the one in closer agreement with experiment is the better one. In structural chemistry the same principle applies but, when calculated and experimental structures are compared, closer is not necessarily better. Specifically, structures from ab initio calculations should not be identical to their experimental counterparts as these are observed. This is so because ab initio geometries refer to nonexistent, vibrationless states at the minimum of potential energy, whereas structural observables represent specifically defined averages over distributions of vibrational states. In general, if one wants to make meaningful comparisons between calculated and experimental molecular structures, one must have recourse to statistical formalisms that describe the effects of vibration on the observed parameters.

Among the parameters of interest to structural chemists, internuclear distances are especially important because other variables, such as bond angles, dihedral angles, and even crystal spacings, can be readily derived from them. However, how a rigid torsional angle derived from an ab initio calculation compares with the corresponding experimental value in a molecule subject to vibrational anharmonicity is not so easy to determine. The same holds for the lattice parameters of a molecule in a dynamical crystal, and for their temperature dependence as a function of the molecular potential energy surface. In contrast, vibrational effects are readily defined and best described for internuclear distances, both bonded and nonbonded. In general, all observed internuclear distances are vibrationally averaged parameters. Due to anharmonicity, the average values change from one vibrational state to the next and, in a molecular ensemble distributed over several states, they are temperature dependent.
All these aspects dictate the need for statistical definitions of the various conceivable averages, or structure types. In addition, since the two main tools for quantitative structure determination in the vapor phase (gas electron diffraction and microwave spectroscopy) interact with molecular ensembles in different ways, certain operational definitions are also needed for a precise understanding of experimental structures.

To illustrate how the operations of an experimental technique affect the nature of its observables, gas electron diffraction shall be used as an example. Considering the mechanics of diffraction is also instructive because it shows that the concept of structure is intrinsically misleading when it is applied to molecules. Molecular structure implies that molecules are some sort of regular objects, whose atoms reside at the vertices of a rigid framework, like balls pinned to the ends of sticks. With increasing sophistication, this image of molecules evolves such that the atoms oscillate about their assigned positions in the alleged framework, but the model still misses the essence of what molecules really are. That is, they are quantum objects, existing in the form of probability amplitudes, whose properties can be rationalized in terms of classical concepts, such as vibrational motion and internuclear separation, but which in themselves are not real in the same way as ordinary classical objects. In accordance with the quantum nature of molecules, the outcome of an individual measurement of a molecular property (for example, of an internuclear distance) cannot be predicted precisely; only average values over a large number of measurements are predictable. It is the purpose of the following section to describe the nature of these averages. For simplicity we shall begin by considering the scattering of electrons by a diatomic molecule.
Operational Definitions of Internuclear Distance Types

Structural Observables Derived from Gas Electron Diffraction

The scattering of electrons by an atom pair separated by a rigid distance, $r$, averaged over all possible orientations in space, gives rise to a molecular scattering intensity function which can be represented in the following way¹:

$$ I(s) = g(s)\,\frac{\sin(sr)}{r} \tag{5.1} $$

where $g(s)$ is a function of the atomic scattering factors, which measure the power of atoms to scatter electrons, and $s$ is the scattering space variable, a function of scattering angle and electron wavelength. Equation (5.1) defines the electron diffraction operator, $[\sin(sr)/r]$, for hypothetical, rigid molecules. It can be used as a basis to derive diffraction intensities for real systems; i.e., structures which, in the classical description, exhibit molecular vibrations.

The nature of the internuclear distance, $r$, is the object of interest in this chapter. In Eq. (5.1) it has the meaning of an instantaneous distance; i.e., at the instant when a single electron is scattered by a particular molecule, $r$ is the value that is evoked by the measurement in accordance with the probability density of the molecular state. Thus, when electrons are scattered by an ensemble of molecules in a given vibrational state $v$, characterized by the wave function $\psi_v(r)$, the molecular intensities, $I_v(s)$, are obtained by averaging the electron diffraction operator over the vibrational probability density:

$$ I_v(s) = g(s)\int_0^\infty |\psi_v(r)|^2\,\frac{\sin(sr)}{r}\,dr \tag{5.2} $$

In Eq. (5.2), the function $|\psi_v(r)|^2/r = P_v(r)/r$ is an example of a so-called radial distribution (RD) function, in the form in which it is obtained from gas-electron diffraction, in this case for a particular vibrational state of a diatomic molecule. It is seen that the molecular intensity curve is the Fourier transform of $P_v(r)/r$. Conversely, by inversion, the RD function is the Fourier transform of the molecular intensities:

$$ \frac{P_v(r)}{r} = \frac{2}{\pi}\int_0^\infty \frac{I_v(s)}{g(s)}\,\sin(sr)\,ds \tag{5.3} $$
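The relation between an intensity curve and its radial distribution function can be checked numerically. The following is a minimal sketch under stated assumptions, not the procedure used for the figures in this chapter: the vibrational ground state is taken to be harmonic, so that the probability density is a Gaussian; $g(s)$ is set to 1; and the equilibrium distance `R_E` and amplitude `L_AMP` are illustrative round numbers rather than data for any real molecule. The intensity is obtained by averaging $\sin(sr)/r$ over the probability density, and the sine transform of the result is compared with $P(r)/r$.

```python
import math

R_E = 1.50     # assumed equilibrium distance in angstrom (illustrative only)
L_AMP = 0.05   # assumed rms vibrational amplitude in angstrom (illustrative only)

def prob_density(r):
    """Gaussian probability density of a harmonic ground state, normalized to 1."""
    norm = 1.0 / (L_AMP * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-((r - R_E) ** 2) / (2.0 * L_AMP ** 2))

def trapezoid(values, step):
    """Trapezoid rule for equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def intensity(s, n_r=200):
    """Average of sin(sr)/r over the probability density, with g(s) = 1."""
    lo, hi = R_E - 8.0 * L_AMP, R_E + 8.0 * L_AMP
    dr = (hi - lo) / n_r
    grid = [lo + k * dr for k in range(n_r + 1)]
    return trapezoid([prob_density(r) * math.sin(s * r) / r for r in grid], dr)

# Tabulate the intensity once on a grid in s; the Gaussian damping makes
# the curve negligible well before S_MAX.
S_MAX, N_S = 100.0, 1000
DS = S_MAX / N_S
S_GRID = [k * DS for k in range(N_S + 1)]
I_TABLE = [intensity(s) for s in S_GRID]

def radial_distribution(r):
    """Sine transform of the tabulated intensities, which recovers P(r)/r."""
    samples = [i_s * math.sin(s * r) for s, i_s in zip(S_GRID, I_TABLE)]
    return (2.0 / math.pi) * trapezoid(samples, DS)

if __name__ == "__main__":
    for r in (1.40, 1.45, 1.50, 1.55, 1.60):
        print(r, radial_distribution(r), prob_density(r) / r)
```

The two printed columns should agree closely, which is simply the statement that the sine transform is its own inverse up to the factor $2/\pi$. Truncating the intensities at a smaller `S_MAX` would broaden the recovered peak, as happens with experimental data of finite angular range.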
[Figure 5.1 panel title: CS, intensity vs. vibrational level (v = 0-10)]
Fig. 5.1 Sample $I_v(s)$ curves for various vibrational states of carbon monosulfide, CS. These curves were calculated² in accordance with Eq. (5.2), using $\psi_v(r)$ functions obtained by solving Schrödinger's equation with an experimental potential energy surface derived from molecular spectroscopy.
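One way to get a feeling for why the intensity curves differ from one vibrational state to the next is to look at how the spread of the probability density grows with the quantum number. The sketch below is only a harmonic-oscillator estimate and is an assumption of this example, not the anharmonic treatment used for Fig. 5.1: for a harmonic state $v$ the mean-square displacement is $(2v+1)\,h/(8\pi^2 c\,\mu\,\tilde{\omega})$, with reduced mass $\mu$ and harmonic wavenumber $\tilde{\omega}$. The masses and the wavenumber passed in the usage lines are assumed inputs chosen to resemble a CS-like diatomic.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C_CM = 2.99792458e10    # speed of light, cm/s
AMU = 1.66053907e-27    # atomic mass unit, kg

def rms_amplitude(v, mass_a, mass_b, wavenumber):
    """Root-mean-square displacement (angstrom) of harmonic state v.

    mass_a, mass_b are atomic masses in u; wavenumber is in cm^-1.
    Uses <x^2>_v = (2v + 1) h / (8 pi^2 c mu w).
    """
    mu = mass_a * mass_b / (mass_a + mass_b) * AMU   # reduced mass, kg
    mean_square = (2 * v + 1) * H / (8.0 * math.pi ** 2 * C_CM * mu * wavenumber)
    return math.sqrt(mean_square) * 1.0e10           # metres to angstrom

if __name__ == "__main__":
    # Assumed inputs: masses of 12.0 u and 31.972 u, wavenumber 1285 cm^-1.
    for v in range(4):
        print(v, rms_amplitude(v, 12.0, 31.972, 1285.0))
```

Because the amplitude grows as the square root of $(2v+1)$, the oscillations of the intensity curve for a higher state are damped more strongly at large $s$. A real potential is anharmonic, so these numbers indicate the trend only.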
Molecular ensembles usually exist in some distribution over several vibrational states. Denoting the population of the $v$th state by $p_v$, the total molecular intensity function can then be calculated by summing over the states:

$$ \bar{I}_{mol}(s) = \sum_v p_v\, I_v(s) \tag{5.4} $$
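The population-weighted sum over states is simple to write down in code. The sketch below rests on assumptions that are not part of the original treatment: the level energies are taken to be harmonic, $E_v = \tilde{\omega}\,(v + 1/2)$ in wavenumber units, the populations are normalized so that they sum to one, and the wavenumber used in the usage lines is an assumed input. The function names are hypothetical.

```python
import math

K_B_CM = 0.695035  # Boltzmann constant in cm^-1 per kelvin

def boltzmann_populations(wavenumber, temperature, n_levels):
    """Normalized populations p_v of harmonic levels, v = 0 .. n_levels - 1.

    Energies are measured from the v = 0 level; the zero-point term
    cancels on normalization.
    """
    weights = [math.exp(-wavenumber * v / (K_B_CM * temperature))
               for v in range(n_levels)]
    total = sum(weights)
    return [w / total for w in weights]

def mixed_intensity(populations, state_intensities):
    """Population-weighted sum of the state intensities at one value of s."""
    return sum(p * i for p, i in zip(populations, state_intensities))

if __name__ == "__main__":
    # Assumed harmonic wavenumber of 1285 cm^-1 and 11 levels (v = 0-10).
    for temperature in (1000.0, 5000.0):
        p = boltzmann_populations(1285.0, temperature, 11)
        print(temperature, [round(x, 3) for x in p[:4]])
```

At the lower temperature nearly all of the weight sits in the ground state, whereas at the higher temperature many excited states contribute, which is why a thermally averaged curve is a composite of many state curves.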
It is useful to illustrate the concepts introduced above by considering some graphic examples. In Fig. 5.1, $I_v(s)$ curves are shown for various vibrational states of carbon monosulfide, CS. These curves were calculated² using wave functions, $\psi_v(r)$, obtained by solving Schrödinger's equation with a ground-state potential energy surface for CS determined from spectroscopic measurements. In Fig. 5.2, the Fourier transforms of the intensities of Fig. 5.1 are shown, Boltzmann-weighted at the arbitrarily selected temperatures of T = 1000 K and 5000 K. That is, in accordance with Eqs. (5.2)-(5.4), the curves shown are $p_v P_v(r)/r$ curves, where $p_v = \exp(-E_v/kT)$. In Fig. 5.3, the average molecular intensities of CS, $\bar{I}_{mol}(s)$, and their Fourier transform are shown, calculated for an equilibrium molecular ensemble at 2000 K. The characteristic sinusoidal form of gas-electron diffraction intensities is clearly seen for this simple molecule but, in accordance with Eq. (5.4), the curve shown is really a composite, containing $p_v$-weighted contributions from many vibrational states. The same is true for the RD-curve, or the Fourier transform of the intensities. A single maximum of probability density is seen in Fig. 5.3, which is the superposition of the maxima of many individual curves.

Equations (5.2)-(5.4) and Figs. 5.1-5.3 illustrate the nature of the structural observables obtained from gas-electron diffraction: the intensity data provide internuclear distances which are weighted averages of the expectation values of the individual vibrational molecular states. This presentation clearly illustrates that the temperature-dependent, observable distribution averages are conceptually quite different from the singular, nonobservable, and temperature-independent equilibrium distances, usually denoted $r_e$-type distances, obtained from ab initio geometry optimizations.

[Figure 5.2 panels: CS Boltzmann (1000 K), top; CS Boltzmann (5000 K), bottom; horizontal axis: distance]
Fig. 5.2 Radial distribution curves, $p_v|\psi_v(r)|^2/r$, for different vibrational states of carbon monosulfide, CS, calculated² for Boltzmann distributions, with $p_v = \exp(-E_v/kT)$, at T = 1000 K (top) and T = 5000 K (bottom), arbitrarily selected for the sake of illustration, where $E_v$ is the energy level of state $v$. The figure conveys an impression of how state-average distance values, which can be derived from experimental spectroscopic data, differ from distribution-average values, derived from electron diffraction data for an ensemble of molecules at a given vibrational temperature. Both "observables" in turn differ from the unobservable stateless equilibrium distances, which are temperature-independent in the Born-Oppenheimer approximation.

Fig. 5.3 Average molecular intensities, $\bar{I}_{mol}(s)$, (left) and their Fourier transform (right) calculated for CS in a Boltzmann distribution at 2000 K, using the data shown in Figs. 5.1 and 5.2.

For polyatomic molecules the situation is somewhat more complex but essentially the same. The effect of intramolecular motion upon the scattering of fast electrons by molecular gases was first described by Debye³ for the particular case of a molecular ensemble at thermal equilibrium. The corresponding average molecular intensity function can be expressed in the following way:

$$ \bar{I}_{mol}(s) = \sum_{i<j} g_{ij}(s)\int_0^\infty P_T(r_{ij})\,\frac{\sin(s r_{ij})}{r_{ij}}\,dr_{ij} \tag{5.5} $$

where $P_T(r_{ij})$ is the probability density function, $T$ is the vibrational temperature of the molecules, and $r_{ij}$ denotes the internuclear distances, both bonded and nonbonded, between all i,j-atom pairs. Again, a Fourier transform of the molecular intensities yields the RD-curve, $P_T(r)/r$, which is a composite curve for polyatomics, containing a series of probability maxima at different distances, some of them possibly closely spaced and overlapping.

Starting with the Debye equation and using a number of simplifying assumptions, various authors contributed to the development of a general equation for molecular electron diffraction intensities, in the final form given by Bartell and Kuchitsu⁴. For every atom pair in a polyatomic molecule,

$$ I_{ij}(s) = g_{ij}(s)\,\exp\!\left(-\tfrac{1}{2}\langle l^2\rangle s^2\right)\frac{\sin\!\left[s\,(r_a - \kappa s^2)\right]}{r_a} \tag{5.6} $$
where $r_a$ is the average internuclear distance, $\langle l^2\rangle$ is the mean square amplitude of vibration, and $\kappa$ is an asymmetry parameter related to anharmonicity. Equation (5.6) defines the $r_a$-type distances that are usually reported for structural studies by gas electron diffraction, the subscript denoting the fact that the distance appears in the argument of the intensity equation.

To appreciate the meaning of $r_a$-parameters, note that this kind of distance is specifically linked to the operations connected with conventional gas electron diffraction. That is, $r_a$-distances are structural parameters which are obtained when the data are analyzed using Eq. (5.6). The mode of analysis in this case leaves a definite imprint on the nature of the resulting observables. For example, due to the assumptions made in deriving this equation, its validity is limited to equilibrium ensembles of quasi-rigid molecules that exhibit small amplitude motions. Thus, it is measurably inaccurate⁵ when systems with large amplitude motions are investigated, such as molecules at elevated temperatures. Furthermore, $r_a$ parameters are geometrically inconsistent, and most analyses are performed without any knowledge of the anharmonicity parameters $\kappa$. The geometrical inconsistency is described by the Bastiansen-Morino shrinkage effect⁶, which states that, in the presence of bending vibrations, the nonbonded Y···Y distance in a linear Y-X-Y system is smaller than the sum of the X-Y bonded distances. The problem of the unknown $\kappa$-values needed for evaluating $r_a$ geometries from electron-diffraction data is usually sidestepped by setting them equal to zero, or by using some approximate ad hoc value in the data analysis. Imperfections of this kind have a definite effect on the outcome of an electron diffraction study and limit the accuracy of Eq. (5.6) to several tenths of a picometer (several thousandths of an Å), which is nearly an order of magnitude above⁵ the ultimate precision of the experiment as afforded by modern technology.

During the last decades, a large body of structural information has been derived from gas-electron diffraction studies. The corresponding results are nearly exclusively reported in the literature in terms of $r_a$-distances, or the equivalent thermal average internuclear distances, which are denoted $r_g$. The $r_g$ distances are defined by the relation $r_a \approx r_g - \langle l^2\rangle/r_a$. Alternative methods for interpreting gas-electron diffraction data are possible, for example, in terms of $r_e$-geometries⁵, but they are currently too complex to apply in routine structural analyses, because they require detailed information on the molecular potential energy surface which is not usually available.

Other Experimental Observables

Similar operational definitions have to be taken into account for every experimental tool of structural chemistry to define the meaning of the observables that it provides⁶. In microwave spectroscopy, for example, structural information is obtained from the rotational constants
which are determined from the observed transitions between rotational energy levels. In studies of this kind, ro-distances are between effective nuclear positions derived from the BO rotational constants of the zero-point vibrational level. The qualifying term "effective" is used to transcribe the fact that the effects of rotation-vibration interactions on the Bo constants, or on the nuclear positions in a rigid molecular framework, are neglected. Alternatively, rs-distances in microwave spectroscopy are distances between effective nuclear positions derived from isotopic differences in rotational constants, where the subscript denotes "substitution." Distances from X-ray crystallography are derived from procedures following an entirely different methodology. In contrast to electrons which are sensitive to the overall charge distribution in a molecule, nuclear and electronic, X-rays are sensitive to only the electronic charges. Thus, intramolecular distances from X-ray crystallography are between maxima of electron density. Since these maxima are usually shifted away from the nuclei into a bond, their separations are not exactly internuclear and their values may differ significantly from those determined for the same molecule by either gas electron diffraction or microwave spectroscopy. In addition, compared to the latter, data analysis of X-ray crystallography involves fundamentally different concepts. In the gas and at equilibrium, molecules are randomly oriented, separated by large and variable distances, and large amplitude internal motions are possible. In the crystal, data analysis is based on repetitive units, many internuclear distances are fixed in regions of strong interactions—with significant effects on local molecular geometries—and large amplitude motions are prohibited. All these conditions are essentially different from those of the isolated molecules observed in gas-phase experiments or considered in ab initio calculations. 
In particular, the temperature dependence of anisotropic effects in crystals is difficult to predict.

Statistical Definitions of Internuclear Distance Types

Operational definitions of molecular structure are needed to clarify experimental significance. In addition, some statistical notation is needed to clarify physical meaning. All statistical definitions hinge on the minimum of potential energy in a bound electronic state, which defines the equilibrium geometry or re-internuclear distance type. Consider a pair of atoms i and j frozen at their equilibrium positions and denote the connection between them as the local z-axis. In this case $r_{ij} = r_e$. In a vibrating molecule the nuclear positions can be averaged over the vibrational states. The distances between these average positions, the so-called vibrational average or rv-distances, are then defined in the following way⁶:

$$r_v = r_e + \langle \Delta z \rangle_v \qquad (5.7)$$

where minor expansion terms have been omitted⁶, and $\langle \Delta z \rangle_v$ is the expectation value of the linear displacement in the vth vibrational state. In the case of the vibrational ground state, v = 0, the corresponding zero-point average distance is

$$r_z = r_e + \langle \Delta z \rangle_0 \qquad (5.8)$$

where the subscript denotes "zero-point." (Note that, for v = 0, the notation used for rv is not r0, because the latter is used for the effective structures derived from the B0 constants in microwave spectroscopy.) In addition to rv and rz, for a system at equilibrium, the thermal average distance at temperature T is often used and denoted as $r_\alpha$:

$$r_\alpha = r_e + \langle \Delta z \rangle_T \qquad (5.9)$$

where $\langle \Delta z \rangle_T$ is the thermal average value of the linear displacements, or the Boltzmann-weighted average over all vibrational states. At 0 K, the thermal average distance is often reported in the literature as $r_\alpha^0$; it follows that $r_\alpha^0 = r_z$. Equations (5.7)-(5.9) define distances between average nuclear positions. A different type of average is obtained when the internuclear distances, not the positions, are averaged. The meaning of this subtle shift in language is clear when the mathematical relation is considered. Thermal average internuclear distances, or rg-parameters, are related to re in the following way:

$$r_g = r_e + \langle \Delta r \rangle_T \qquad (5.10)$$

where $\langle \Delta r \rangle_T$ is the average value of the change in internuclear distance at thermal equilibrium. $\Delta z$ is just one component of $\Delta r$. In polyatomic molecules, where vibrations perpendicular to the z-axis occur, $\Delta r$ is also dependent on the perpendicular displacements, $\Delta x$ and $\Delta y$. In this case, as a rule,

$$\langle \Delta r \rangle_T \neq \langle \Delta z \rangle_T , \quad \text{so that} \quad r_g \neq r_\alpha .$$
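The distinction between averaging positions and averaging distances can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the equilibrium distance and the displacement samples are hypothetical, the samples are weighted equally rather than by Boltzmann factors, and the function names are invented for this example. The sketch computes the distance between average positions and the average of the instantaneous distances for a pair of atoms whose equilibrium bond lies along the local z-axis, and compares the latter with the leading-order geometric expansion in the perpendicular displacements.

```python
import math


def average_position_distance(r_e, displacements):
    """Distance between average nuclear positions: r_e + <dz>."""
    mean_dz = sum(dz for _, _, dz in displacements) / len(displacements)
    return r_e + mean_dz


def average_distance(r_e, displacements):
    """Average of the instantaneous internuclear distances: r_e + <dr>."""
    distances = [
        math.sqrt(dx ** 2 + dy ** 2 + (r_e + dz) ** 2)
        for dx, dy, dz in displacements
    ]
    return sum(distances) / len(distances)


# Hypothetical equilibrium distance (in angstroms) and equally weighted
# relative displacements (dx, dy, dz); both are assumptions for illustration.
R_E = 1.100
SAMPLES = [(dx, 0.0, dz) for dx in (-0.10, 0.10) for dz in (-0.05, 0.05)]

r_alpha = average_position_distance(R_E, SAMPLES)
r_g = average_distance(R_E, SAMPLES)

# Leading-order expansion: <dr> is close to <dz> + <dx^2 + dy^2> / (2 r_e).
mean_perp_sq = sum(dx ** 2 + dy ** 2 for dx, dy, _ in SAMPLES) / len(SAMPLES)
r_g_leading_order = r_alpha + mean_perp_sq / (2 * R_E)

print(f"distance between average positions: {r_alpha:.4f}")
print(f"average of distances:               {r_g:.4f}")
print(f"leading-order estimate:             {r_g_leading_order:.4f}")
```

Because the displacements along z cancel in this symmetric sample, the distance between average positions equals the equilibrium value, while the average of the distances is longer: every perpendicular displacement lengthens the instantaneous distance regardless of its sign.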
Interactions between Computational and Experimental Procedures in Structural Chemistry

Some specific numerical examples taken from the literature⁷ demonstrate the necessity of strictly adhering to precise definitions whenever structural parameters are used in detailed molecular analyses. In CH4, r0(C-H) = 1.094(1) Å and rs(C-H) = 1.107(1) Å. In formic acid, HCOOH, rs(C-H) = 1.097(5) Å; ra(C-H) = 1.103(3) Å; rs(C-O) = 1.202(10) Å; and ra(C-O) = 1.214(1) Å. In formamide, HCONH2, rs(N-H) = 1.002(3) Å; rg(N-H) = 1.027(6) Å; rs(C-O) = 1.219(12) Å; and rg(C-O) = 1.212(3) Å. In methyl bromide, CH3Br, rs(C-Br) = 1.939(1) Å and re(C-Br) = 1.933(2) Å. The numbers in parentheses are error limits, which are often neglected when theoretical and experimental structural parameters are compared. They show that, in many cases, the differences between different types of the same parameter exceed the error limits and must be taken into account in precise comparisons.

There are really two ways in which computational procedures and experimental techniques interact in structural analyses. On the one hand, the experimental data are needed to check and normalize the theoretical procedures. On the other hand, ab initio geometries are useful in guiding structural studies in all experiments in which the number of unknowns exceeds the number of observables. For example, it is a characteristic weakness of gas-electron diffraction that, as a rule, closely spaced bond distances and angles cannot be resolved from the diffraction data. At the same time, it is a characteristic strength of ab initio geometries that differences between chemically similar parameters are resolved with particular accuracy. This complementary relationship has made it possible to use differences between similar parameters in ab initio geometries as constraints in the least-squares analysis of diffraction data.
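The idea of a difference constraint can be sketched with simple arithmetic. The example below is a deliberately reduced illustration, not the published refinement procedure: it assumes that the diffraction data fix only the multiplicity-weighted mean of two closely spaced bond distances, and that a calculation supplies their difference. The function name, the multiplicities, and all numerical values are hypothetical.

```python
def resolve_pair(weighted_mean, calculated_difference, n_short=1, n_long=1):
    """Split an unresolved mean distance using a calculated difference.

    Solves the two conditions
        (n_short * r_short + n_long * r_long) / (n_short + n_long) = weighted_mean
        r_long - r_short = calculated_difference
    for the two individual distances.
    """
    total = n_short + n_long
    r_short = weighted_mean - n_long * calculated_difference / total
    r_long = r_short + calculated_difference
    return r_short, r_long


# Hypothetical input: two equivalent short bonds and one long bond whose
# mean is taken as 1.530 angstroms, with an assumed calculated difference
# of 0.012 angstroms between the long and the short bond.
r_short, r_long = resolve_pair(1.530, 0.012, n_short=2, n_long=1)
print(f"r_short = {r_short:.3f}, r_long = {r_long:.3f}")
```

Only the difference enters as outside information; the absolute scale of both distances is still set by the experimental mean.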
Such a joint experimental and computational approach is the basis of Molecular Orbital Calculation-Constrained Electron Diffraction (MOCED)⁸. Introducing calculated differences, rather than absolute values, into the data interpretation largely eliminates problems related to the different natures of ab initio re and electron diffraction ra-parameters, which imply that re ≠ ra. As it turns out, calculated differences are also advantageous to use because they are much less basis-set dependent than absolute values, and much of the methodological error in the ab initio techniques cancels. MOCED-related techniques have become essential in gas-electron diffraction studies of complicated systems. They are often the only means⁸ for (a) finding a structure of a compound that is in agreement with the data; (b) identifying the components in a conformational mixture; and (c) giving plausible details for bond distance patterns which are not resolved in the diffraction data.

In microwave spectroscopy, the situation is characterized by the fact that the number of independent pieces of observable data is often restricted to three rotational constants, while the number of degrees of freedom (3N-6 or 3N-5, where N is the number of atoms) can be significantly larger. The number of independent experimental rotational constants can be effectively increased by recording the spectra of isotopically substituted species, but that is not always possible. Thus, structural studies by microwave spectroscopy are often seriously underdetermined, and extraneous information is useful in aiding the spectroscopic assignment. In early investigations⁹ involving the joint use of ab initio gradient geometries and microwave spectroscopic data, it was perhaps a surprise that the rotational constants predicted from the calculated structures were consistently found in close agreement with the experimental B0 constants, without any correction for differences in parameter types. The close agreement made it possible to arrive at conclusions which the microwave data alone would not have supported. For example, in the case of methyl hydrazinocarboxylate⁹, it was not possible to identify the observed conformer or to assign the spectrum of an isotopic species without the help of ab initio calculations.
Similarly, in the case of 2-methylallyl alcohol⁹, the calculations made it possible to detect the syn-form of this compound, which had previously not been assigned due to the complexity of the spectrum; and in 2-methoxyethylamine⁹ the prediction of exact details of local geometry by ab initio calculations led to the discovery of its trans form, which had remained undetected in a previous study because of the extraordinary coincidence of a large number of misleading factors. The most striking study of this kind has perhaps been the case of glycine, where ab initio calculations first predicted¹⁰ that the most populated state had been overlooked in spectroscopic work¹¹, and then guided new experiments which led to the discovery of the undetected conformational state¹²,¹³. Experiences of this kind firmly established, in the course of time, the utility of ab initio geometries in microwave spectroscopy, and their use has become a routine matter in investigations of this kind.

X-ray crystallography is known for the fact that the number of experimental observables is usually very large and sufficient to determine the structures even of complex molecules. In addition, bond lengths and angles in crystal structures are often so significantly affected by solid-state interactions that individual parameters may vary unpredictably from their ab initio counterparts. This situation seems to preclude a fruitful interaction between the two methods, and X-ray crystallographic structures are not a useful check of the accuracy of ab initio geometries. However, when the same type of parameter is determined for a large number of molecules of the same class, the average values derived from the series of molecules are very close to the calculated ones. This can be done, for example, for the backbone bond distances and angles in proteins, which can be predicted from ab initio calculations as functions of the φ,ψ-torsional angles with the same accuracy that is afforded by high-resolution protein crystallography¹⁴⁻¹⁶. Since protein crystal structures are frequently also underdetermined, the calculated geometry trends have been useful in inspiring the formulation of a new set of so-called ideal geometries (standard values of bond lengths and angles), which are needed as restraints in crystallographic data refinement¹⁷.
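The numerical examples quoted at the beginning of this section can be checked mechanically. The sketch below reads values written in the parenthesized notation used there, such as 1.002(3), and asks whether two parameters differ by more than the sum of their error limits. Treating the parenthesized digits as symmetric limits in the last quoted decimal place, and using the sum of the limits as the criterion, are assumptions of this illustration; the function names are hypothetical.

```python
import re


def parse_parameter(text):
    """Convert a string such as '1.002(3)' into (value, error limit)."""
    match = re.fullmatch(r"\s*(\d+)\.(\d+)\((\d+)\)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized value: {text!r}")
    whole, fraction, error_digits = match.groups()
    value = float(f"{whole}.{fraction}")
    # The error digits refer to the last quoted decimal places.
    error = int(error_digits) * 10.0 ** (-len(fraction))
    return value, error


def exceeds_error_limits(first, second):
    """True if the two values differ by more than the sum of their limits."""
    value_1, error_1 = parse_parameter(first)
    value_2, error_2 = parse_parameter(second)
    return abs(value_1 - value_2) > error_1 + error_2


# Pairs of distance types for the same bond, as quoted in the text (angstroms).
EXAMPLES = {
    "formamide N-H (rs vs rg)": ("1.002(3)", "1.027(6)"),
    "formamide C-O (rs vs rg)": ("1.219(12)", "1.212(3)"),
    "methyl bromide C-Br (rs vs re)": ("1.939(1)", "1.933(2)"),
}

for label, (first, second) in EXAMPLES.items():
    verdict = "differ" if exceeds_error_limits(first, second) else "overlap"
    print(f"{label}: {verdict}")
```

With this criterion the N-H distances in formamide and the C-Br distances in methyl bromide differ beyond their limits, while the two C-O values in formamide overlap, which is consistent with the statement that the differences matter in many, but not all, cases.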
Conclusions

The material presented in this chapter illustrates that the study of the dynamics of molecular structures reveals subtle aspects which are conceptually simple, but still somewhat hidden from general awareness. In addition to recognizing that the parameters obtained from different experimental techniques are different by nature, it is important to realize that experimental structure determination is not always an automatic and straightforward process, but involves value judgments which may affect the outcome of data interpretation. Thus, it takes a good deal of experience to determine which structures reported in the literature are reliable, and which ones are not, regardless of the claims of the authors. In general, the following rules can be given for structural comparisons of experimental and calculated data:

1. Due to the characteristics of the technique, comparisons of rs-parameters in microwave spectroscopy are not meaningful⁶ within 0.01 Å, particularly if one or several atoms are close to a principal axis of rotation. When r0-structures are determined from three rotational constants for a species with more than three degrees of freedom, no estimate is possible of the significance of the results.
2. For structures obtained by gas-electron diffraction, it is not meaningful to compare internuclear distances to within several thousandths of an Å. This is particularly true when bond lengths are closely spaced and not resolved in the diffraction data. In such cases the uncertainties can be >0.05 Å, even though they may be reported with higher precision.
3. For parameters taken from crystal structures, no systematic predictions can be made of how the intermolecular interactions in the crystal may affect distances and angles. These potential solid-state effects preclude comparisons within a level of several hundredths of an Å for bond lengths and several degrees for angles.
Operational effects are particularly large for C-H bonds, which are characteristically shorter in crystallographic studies than those found by the other methods.

Notes

1. "Stereochemical Applications of Gas-Phase Electron Diffraction," I. Hargittai and M. Hargittai, eds., Parts A and B, VCH, New York (1988).
2. A. A. Ischenko, L. Schafer, J. Y. Luo, and J. D. Ewbank, "Structural and Vibrational Kinetics by Stroboscopic Gas Electron Diffraction: The 193 nm Photodissociation of CS2," J. Phys. Chem., 98 (1994) 8673-8678.
3. P. Debye, "The Influence of Intramolecular Atomic Motion on Electron Diffraction Diagrams," J. Chem. Phys., 9 (1941) 55-60.
4. K. Kuchitsu and L. S. Bartell, "Effects of Anharmonicity of Molecular Vibrations on the Diffraction of Electrons. II. Interpretation of Experimental Structural Parameters," J. Chem. Phys., 35 (1961) 1945-1949.
5. A. A. Ischenko, J. D. Ewbank, and L. Schafer, "Direct Evaluation of Equilibrium Molecular Geometries Using Real-Time Gas Electron Diffraction," J. Phys. Chem., 98 (1994) 4287-4300.
6. K. Kuchitsu and S. J. Cyvin, "Representation and Experimental Determination of the Geometry of Free Molecules," in Molecular Structures and Vibrations, S. J. Cyvin, ed., Elsevier, Amsterdam (1972).
7. J. H. Callomon, E. Hirota, K. Kuchitsu, W. J. Lafferty, A. G. Maki, C. S. Pote, I. Buck, and B. Starck, "Structure Data of Free Polyatomic Molecules," K. H. Hellwege and A. M. Hellwege, eds., Landolt-Börnstein, New Series, Vol. II.7, Springer, New York (1976).
8. L. Schafer, J. D. Ewbank, K. Siam, N-S. Chiu, and H. L. Sellers, "Molecular Orbital Constrained Electron Diffraction Studies: The Concerted Use of Electron Diffraction and Quantum Chemical Calculations," in Stereochemical Applications of Gas-Phase Electron Diffraction, I. Hargittai and M. Hargittai, eds., Part A, Chap. 9, pp. 301-319, VCH, New York (1988).
9. L. Schafer, K. Siam, J. D. Ewbank, W. Caminati, and A. C. Fantoni, "Some Surprising Applications of Ab Initio Geometries in Microwave Spectroscopic Conformational Analyses," in Modeling of Structures and Properties of Molecules, Z. B. Maksic, ed., Chap. 4, pp. 79-90, E. Horwood Pub., Chichester, England (1987).
10. H. L. Sellers and L. Schafer, "Investigations Concerning the Apparent Contradiction Between the Microwave Structure and the Ab Initio Calculations of Glycine," J. Am. Chem. Soc., 100 (1978) 7728-7729.
11. R. D. Brown, P. D. Godfrey, J. W. V. Storey, and M. P. Bassez, "Microwave Spectrum and Conformation of Glycine," J. C. S. Chem. Comm. (1978) 547-548.
12. L. Schafer, H. L. Sellers, F. J. Lovas, and R. D. Suenram, "Theory Versus Experiment: The Case of Glycine," J. Am. Chem. Soc., 102 (1980) 6566-6568.
13. (a) R. D. Suenram and F. J. Lovas, "Millimeter Wave Spectrum of Glycine," J. Mol. Spectrosc., 72 (1978) 372-382. (b) R. D. Suenram and F. J. Lovas, "Millimeter Wave Spectrum of Glycine: A New Conformer," J. Am. Chem. Soc., 102 (1980) 7180-7184.
14. L. Schafer, M. Cao, and M. J. Meadows, "Predictions of Protein Backbone Bond Distances and Angles from First Principles," Biopol., 35 (1995) 603-606.
15. L. Schafer and M. Cao, "Predictions of Protein Backbone Bond Distances and Angles from First Principles," J. Mol. Struct., 333 (1995) 201-208.
16. X. Jiang, M. Cao, B. Teppen, S. Q. Newton, and L. Schafer, "Predictions of Protein Backbone Bond Distances and Angles from First Principles: Systematic Comparisons of Calculated N-C(a)-C' Angles with High-Resolution Protein Crystallographic Results," J. Phys. Chem., 99 (1995) 10521-10525.
17. P. A. Karplus, "Experimentally Observed Conformation-Dependent Geometry and Hidden Strain in Proteins," Protein Science, 5 (1996) 1406-1420.

References

Ab Initio Calculations of Amino Acids

Abraham, R. J., and B. Hudson. 1985. "Charge Calculations in Molecular Mechanics. III: Amino Acids and Peptides." J. Comput. Chem. 6, 173-181.
Alper, J. S., H. Dothe, and M. A. Lowe. 1992. "Scaled Quantum Mechanical Calculation of the Vibrational Structure of the Solvated Glycine Zwitterion." Chem. Phys. 161, 199-209.
Barron, L. D., A. R. Gargaro, L. Hecht, and P. L. Polavarapu. 1991. "Experimental and Ab Initio Theoretical Vibrational Raman Optical Activity of Alanine." Spectrochimica Acta 47A, 1001-1016.
Basch, H., and W. J. Stevens. 1990. "The Structure of Glycine-Water H-Bonded Complexes." Chem. Phys. Letters 169, 275-280.
Bonaccorsi, R., P. Palla, and J. Tomasi. 1984. "Conformational Energy of Glycine in Aqueous Solutions and Relative Stability of the Zwitterionic and Neutral Forms. An Ab Initio Study." J. Am. Chem. Soc. 106, 1945-1950.
Bouchonnet, S., and Y. Hoppilliard. 1992. "Proton and Sodium Ion Affinities of Glycine and Its Sodium Salt in the Gas Phase. Ab Initio Calculations." Org. Mass Spectrometry 27, 71-76.
Cao, M., S. Q. Newton, J. Pranata, and L. Schafer. 1995. "Ab Initio Conformational Analysis of Alanine." J. Mol. Struct. 332, 251-267.
Chipot, C., B. Maigret, J.-L. Rivail, and H. A. Scheraga. 1992. "Modeling Amino Acid Side Chains. 1. Determination of Net Atomic Charges from Ab Initio Self-Consistent-Field Molecular Electrostatic Properties." J. Phys. Chem. 96, 10276-10284.
Clementi, E., F. Cavallone, and R. Scordamaglia. 1977. "Analytic Potentials from 'ab Initio' Computations for the Interaction Between Biomolecules. 1. Water with Amino Acids." J. Am. Chem. Soc. 99, 5531-5545.
Csaszar, A. G. 1995. "On the Structures of Free Glycine and α-Alanine." J. Mol. Struct. 346, 141-152.
Csaszar, A. G. 1992. "Conformers of Gaseous Glycine." J. Am. Chem. Soc. 114, 9568-9575.
de Dios, A. C., J. G. Pearson, and E. Oldfield. 1993. "Chemical Shifts in Proteins: An Ab Initio Study of Carbon-13 Nuclear Magnetic Resonance Chemical Shielding in Glycine, Alanine, and Valine Residues." J. Am. Chem. Soc. 115, 9768-9773.
Depke, G., N. Heinrich, and H. Schwarz. 1984. "On the Gas Phase Chemistry of Ionized Glycine and Its Enol. A Combined Experimental and Ab Initio Molecular Orbital Study." Int. J. Mass Spectrom. Ion Processes 62, 99-117.
Destro, R., R. Bianchi, and G. Morosi. 1989. "Electrostatic Properties of L-Alanine from X-ray Diffraction at 23 K and Ab Initio Calculations." J. Phys. Chem. 93, 4447-4457.
Destro, R., R. E. Marsh, and R. Bianchi. 1988. "A Low-Temperature (23 K) Study of L-Alanine." J. Phys. Chem. 92, 966-973.
Ding, Y., and K. Krogh-Jespersen. 1992. "The Glycine Zwitterion Does Not Exist in the Gas Phase: Results from a Detailed Ab Initio Electronic Structure Study." Chem. Phys. Letters 199, 261-266.
Dixon, D. A., and W. N. Lipscomb. 1976. "Electronic Structure and Bonding of the Amino Acids Containing First Row Atoms." J. Biol. Chem. 251, 5992-6000.
Dykstra, C. E., R. A. Chiles, and M. D. Garrett. 1981. "Recent Computational Developments with the Self-Consistent Electron Pairs Method and Application to the Stability of Glycine Conformers." J. Comput. Chem. 2, 266-272.
Ewbank, J. D., V. J. Klimkowski, K. Siam, and L. Schafer. 1987. "Conformational Analysis of the Methyl Ester of Alanine by Gas Electron Diffraction and Ab Initio Geometry Optimization." J. Mol. Struct. 160, 275-285.
Gatti, C., R. Bianchi, R. Destro, and F. Merati. 1992. "Experimental v. Theoretical Topological Properties of Charge Density Distributions. An Application to the L-alanine Molecule Studied by X-ray Diffraction at 23 K." J. Mol. Struct. (Theochem) 255, 409-433.
Hu, C.-H., M. Shen, and H. F. Schaefer, III. 1993. "Glycine Conformational Analysis." J. Am. Chem. Soc. 115, 2923-2929.
Jensen, F. 1992. "Structure and Stability of Complexes of Glycine and Glycine Methyl Analogues with H+, Li+, and Na+." J. Am. Chem. Soc. 114, 9533-9537.
Jensen, J. H., and M. S. Gordon. 1991. "The Conformational Potential Energy Surface of Glycine: A Theoretical Study." J. Am. Chem. Soc. 113, 7917-7924.
Kikuchi, O., T. Matsuoka, H. Sawahata, and O. Takahashi. 1994. "Ab Initio Molecular Orbital Calculations Including Solvent Effects by Generalized Born Formula. Conformation of Zwitterionic Forms of Glycine, Alanine and Serine in Water." J. Mol. Struct. (Theochem) 305, 79-87.
Kikuchi, O., T. Natsui, and T. Kozaki. 1990a. "MNDO Effective Charge Model Study of Conformations of Zwitterionic and Neutral Forms of Glycine, Alanine and Serine in the Gas Phase and in Solution." J. Mol. Struct. (Theochem) 207, 103-114.
Kikuchi, O., and H. Wang. 1990b. "Parity-Violating Energy Shift of Glycine, Alanine, and Serine in the Zwitterionic Forms: Calculation Using HFO-NG Basis Sets." Bull. Chem. Soc. Jpn. 63, 2751-2754.
Klimkowski, V. J., J. D. Ewbank, J. N. Scarsdale, L. Schafer, and C. Van Alsenoy. 1985. "Conformational Analysis and Molecular Structures of Valine Methyl Ester." J. Mol. Struct. (Theochem) 124, 175-182.
Klimkowski, V. J., J. N. Scarsdale, and L. Schafer. 1983a. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. 25. Conformational Analysis of Methyl Propanoate and Comparison with the Methyl Ester of Glycine." J. Comput. Chem. 4, 494-498.
Klimkowski, V. J., L. Schafer, L. Van Den Enden, C. Van Alsenoy, and W. Caminati. 1983b. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 28. Comparison of the Observed Ground State Rotational Constants of the Methyl Ester of Glycine with the Rotational Constants Calculated for Some Planar and Non-planar Gradient Geometries." J. Mol. Struct. (Theochem) 105, 169-174.
Kokpol, S. U., P. B. Doungdee, S. V. Hannongbua, B. M. Rode, and J. P. Limtrakul. 1988. "Ab Initio Study of the Hydration of the Glycine Zwitterion." J. Chem. Soc., Faraday Trans. 2, 84, 1789-1792.
Laurence, P. R., and C. Thomson. 1982. "The Boron Analogue of Glycine: A Theoretical Investigation of Structure and Properties." J. Mol. Struct. (Theochem) 88, 37-43.
Laurence, P. R., and C. Thomson. 1981. "A Comparison of the Results of PCILO and Ab Initio SCF Calculations for the Molecules Glycine, Cysteine and N-Acetyl-Glycine." Theoret. Chim. Acta (Berl.) 58, 121-124.
Lelj, F., C. Adamo, and V. Barone. 1992. "Role of Hartree-Fock Exchange in Density Functional Theory. Some Aspects of the Conformational Potential Energy Surface of Glycine in the Gas Phase." Chem. Phys. Letters 230, 189-195.
Lindroos, J., M. Perakyla, J.-P. Bjorkroth, and T. A. Pakkanen. 1992. "Ab Initio Models for Receptor-Ligand Interactions in Proteins. Part 1. Models for asparagine, glutamine, serine, threonine and tyrosine." J. Chem. Soc. Perkin Trans. 2, 2271-2277.
Luke, B. T., A. G. Gupta, G. H. Loew, J. G. Lawless, and D. H. White. 1984. "Theoretical Investigation of the Role of Clay Edges in Prebiotic Peptide Bond Formation. I. Structures of Acetic Acid, Glycine, H2SO4, H3PO4, Si(OH)4, and Al(OH)4-." Int. J. Quantum Chem. Quantum Biol. Symp. 11, 117-135.
Masamura, M. 1988a. "Reliability of AM1 in Conformational Analysis of Unionized Amino Acids." J. Mol. Struct. (Theochem) 168, 227-234.
Masamura, M. 1988b. "Reliability of AM1 in Determining the Equilibrium Structures of Unionized Amino Acids." J. Mol. Struct. (Theochem) 164, 299-311.
Masamura, M. 1987. "Reliability of MNDO in Determining the Equilibrium Structures of Unionized Amino Acids." J. Mol. Struct. (Theochem) 152, 293-303.
Mezey, P. G., J. J. Ladik, and S. Suhai. 1979. "Non-Empirical SCF MO Studies on the Protonation of Biopolymer Constituents I. Protonation of Amino Acids." Theoret. Chim. Acta (Berl.) 51, 323-329.
Millefiori, S., and A. Millefiori. 1983. "On the Relative Stability of Glycine Conformers. The Role of Electron Correlation." J. Mol. Struct. (Theochem) 91, 391-393.
Ni, X., X. Shi, and L. Ling. 1988. "An Interaction Potential Between an Alanine Zwitterion and a Water Molecule Based on Ab Initio Calculations." Int. J. Quantum Chem. 34, 527-533.
No, K. T., K. H. Cho, O. Y. Kwon, M. S. Jhon, and H. A. Scheraga. 1994. "Determination of Proton Transfer Energies and Lattice Energies of Several Amino Acid Zwitterions." J. Phys. Chem. 98, 10742-10749.
Pagliarin, R., G. Sello, and M. Sisti. 1994. "Model Studies for Predicting the Diastereoselectivity in the Condensation of Aldehydes with Zinc and Copper Complexes of Amino Acid Derivatives. Part 1. Analysis and realisation of the models." J. Mol. Struct. (Theochem) 312, 251-259.
Palla, P., C. Petrongolo, and J. Tomasi. 1980. "Internal Rotation Potential Energy for the Glycine Molecule in Its Zwitterionic and Neutral Forms. A Comparison among Several Methods." J. Phys. Chem. 84, 435-442.
Peters, D., and J. Peters. 1982. "Quantum Theory of the Structure and Bonding in Proteins Part 13. The β branched hydrocarbon side chains valine and isoleucine." J. Mol. Struct. (Theochem) 88, 157-170.
Ramek, M. 1990a. "Ab Initio SCF Investigation of β-Alanine." J. Mol. Struct. (Theochem) 208, 301-355.
Ramek, M. 1990b. "Intramolecular Hydrogen Bonding in Neutral Glycine, β-Alanine, γ-Aminobutyric Acid, and δ-Aminopentane Acid." Int. J. Quantum Chem. Quantum Biol. Symp. 17, 45-53.
Ramek, M., and V. K. W. Cheng. 1992. "On the Role of Polarization Functions in SCF Calculations of Glycine and Related Systems with Intramolecular Hydrogen Bonding." Int. J. Quantum Chem. Quantum Biol. Symp. 19, 15-26.
Ramek, M., V. K. W. Cheng, R. F. Frey, S. Q. Newton, and L. Schafer. 1991. "The Case of Glycine Continued: Some Contradictory SCF Results." J. Mol. Struct. (Theochem) 235, 1-10.
Ramek, M., F. A. Momany, D. M. Miller, and L. Schafer. 1996. "On the Importance of Full Geometry Optimization in Correlation-Level Ab Initio Molecular Conformational Analyses." J. Mol. Struct. 375, 189-191.
Ranghino, G., E. Clementi, and S. Romano. 1983. "Lysinium, Argininium, Glutamate, and Aspartate Ions in Water Solution." Biopolymers 22, 1449-1460.
Ringnalda, M. N., Y. Won, and R. A. Friesner. 1990. "Pseudospectral Hartree-Fock Calculations on Glycine." J. Chem. Phys. 92, 1163-1173.
Sapse, A.-M., and D. C. Jain. 1986. "Guanine and Adenine-Amino Acids Interactions: An Ab Initio Study." Int. J. Quantum Chem. 29, 23-29.
Sapse, A.-M., M. Mezei, D. C. Jain, and C. Unson. 1994. "Ab Initio Study of Aspartic and Glutamic Acid: Supplementary Evidence for Structural Requirements at Position 9 for Glucagon Activity." J. Mol. Struct. (Theochem) 306, 225-233.
Schäfer, L., S. Q. Kulp-Newton, K. Siam, V. J. Klimkowski, and C. Van Alsenoy. 1990a. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 71. Conformational analysis and structural study of valine and threonine." J. Mol. Struct. (Theochem) 209, 373-385.
Schäfer, L., H. L. Sellers, F. J. Lovas, and R. D. Suenram. 1980. "Theory versus Experiment: The Case of Glycine." J. Am. Chem. Soc. 102, 6566-6568.
Schäfer, L., K. Siam, V. J. Klimkowski, J. D. Ewbank, and C. Van Alsenoy. 1990b. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment Part 69. Conformational analysis and structural study of cysteine." J. Mol. Struct. (Theochem) 204, 361-372.
Schäfer, L., C. Van Alsenoy, J. N. Scarsdale, V. J. Klimkowski, and J. D. Ewbank. 1981. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment Part 18. Conformational Analysis and Molecular Structure of Glycine Methyl Ester." J. Comput. Chem. 2, 410-413.
Sellers, H. L., and L. Schäfer. 1979. "Ab Initio Equilibrium Structures of Unionized Amino Acids: Alanine." Chem. Phys. Letters 63, 609-611.
Sellers, H. L., and L. Schäfer. 1978. "Investigations Concerning the Apparent Contradiction Between the Microwave Structure and the Ab Initio Calculations of Glycine." J. Am. Chem. Soc. 100, 7728-7729.
Siam, K., V. J. Klimkowski, J. D. Ewbank, C. Van Alsenoy, and L. Schafer. 1984. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment Part 39. Conformational analysis of glycine and alanine." J. Mol. Struct. (Theochem) 110, 171-182.
Singh, U. C., F. K. Brown, P. A. Bash, and P. A. Kollman. 1987. "An Approach to the Application of Free Energy Perturbation Methods Using Molecular Dynamics: Applications to the Transformations of CH3OH → CH3CH3, H3O+ → NH4+, Glycine → Alanine, and Alanine → Phenylalanine in Aqueous Solution and to H3O+(H2O)3 → NH4+(H2O)3 in the Gas Phase." J. Am. Chem. Soc. 109, 1607-1614.
Sokalski, W. A., K. Maruszewski, P. C. Hariharan, and J. J. Kaufman. 1989. "Library of Cumulative Atomic Multipole Moments: II. Neutral and Charged Amino Acids." Int. J. Quantum Chem. Quantum Biol. Symp. 16, 119-164.
Sordo, J. A., M. Probst, G. Corongiu, S. Chin, and E. Clementi. 1987. "Ab Initio Pair Potentials for the Interactions Between Aliphatic Amino Acids." J. Am. Chem. Soc. 109, 1702-1708.
Sukumar, N., and G. A. Segal. 1986. "Effect of Aqueous Solvation upon the Electronic Excitation Spectrum of the Glycine Zwitterion: A Theoretical CI Study Using a Fractional Charge Model." J. Am. Chem. Soc. 108, 6880-6884.
Sulzbach, H. M., P. v. R. Schleyer, and H. F. Schaefer, III. 1994. "Interrelationship Between Conformation and Theoretical Chemical Shifts. Case Study on Glycine and Glycine Amide." J. Am. Chem. Soc. 116, 3967-3972.
Tarakeshwar, P., and S. Manogaran. 1994. "Conformational Effects on Vibrational Frequencies of Cysteine and Serine: An Ab Initio Study." J. Mol. Struct. (Theochem) 305, 205-224. Tranter, G. E. 1985a. "The Parity Violating Energy Differences Between the Enantiomers of -amino Acids." Mol. Phys. 56, 825-838. Tranter, G. E. 1985b. "The Parity-Violating Energy Differences Between the Enantiomers of -amino Acids." Chem. Phys. Letters 120, 93-96. Van Alsenoy, C., S. Kulp, K. Siam, V. J. Klimkowski, J. D. Ewbank, and L. Schafer. 1988. "Ab Initio Studies of Structural Features Not easily Amenable to Experiment Part 63. Conformational analysis and structural study of serine." J. Mol. Struct. (Theochem) 181, 169-178. Van Alsenoy, C., J. N. Scarsdale, and L. Schafer. 1982. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment Part 24. Molecular structures and conformational analyses of the methyl esters of formic acid, acetic acid and alanine." J. Mol. Struct. (Theochem) 90, 297-304. Van Alsenoy, C., J. N. Scarsdale, H. L. Sellers, and L. Schafer. 1981. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. The Molecular structures of Two Low-Energy Forms of Unionized Serine." Chem. Phys. Letters 80, 124-126. Vijay, A., and D. N. Sathyanarayana. 1992. "Theoretical Study of the Ground-State Vibrations of Nonionized Glycine." J. Phys. Chem. 96, 10735--10739. Vishveshwara, S., and J. A. Pople. 1977. "Molecular Orbital Theory of the Electronic Structures of Organic Compounds. 32. Conformations of Glycine and Related Systems." J. Am. Chem. Soc. 99, 2422-2426. Voogd, J., J. L. Derissen, and F. B. van Duijneveldt. 1981. "Calculation of Proton-Transfer Energies and Electrostatic Lattice Energies of Various Amino Acids and Peptides Using CNDO/2 and Ab Initio SCF Methods." I. Am. Chem. Soc. 103, 7701-7706. Williams, R. W., V. F. Kalasinsky, and A. H. Lowrey. 1993. 
"Scaled Quantum Mechanical Force Field for Cis- and Trans-glycine in Acidic Solution." J. Mol. Struct. (Theochem) 281, 157-171. Wright, L. R., and R. F. Borkman. 1980. "Ab Initio Self-Consistent Field Calculations on Some Small Amino Acids." J. Am. Chem. Soc. 102, 6207-6210. Wright, L. R., R. F. Borkman, and A. M. Gabrielli. 1982. "Protonation of Glycine: An Ab Initio Self-Consistent Field Study." J. Phys. Chem. 86, 3951--3956. Yu, D., D. A. Armstrong, and A. Rauk. 1992. "Hydrogen Bonding and Internal Rotation Barriers of Glycine and Its Zwitterion (Hypothetical) in the Gas Phase." Can. J. Chem. 70, 1762-1772. Ab Initio Calculations of Peptides Aida, M. 1993. "Theoretical Studies on Hydrogen Bonding Interactions Between Peptide Units." Bull. Chem. Soc. Jpn. 66, 3423--3429. Aizman, A., and D. A. Case. 1982. "Electronic Structure Calculations on Active Site Models for 4-Fe, 4-S Iron-Sulfur Proteins." J. Am. Chem. Soc. 104, 3269--3279. Amodeo, P., and V. Barone. "A New General Form of Molecular Force Fields. Application to Intra- and Interresidue Interactions in Peptides." J. Am. Chem. Soc. 114, 90859093. Bakhshi, A. K., and J. Ladik. 1986. "Ab Initio Study of the Effect of Side-Chain Reactions on the Electronic Structure of Proteins." Chem. Phys. Letters 129, 269-274. Bakhshi, A. K., J. Ladik, and P. Otto. 1989. "Ab Initio Study of the Effect of Cation Binding on the Electronic Structure of Proteins." J. Mol. Struct. 198, 143-158. Bakhshi, A. K., P. Otto, and J. Ladik. 1988. "On the Electronic Structure and Conduction Properties of Aperiodic Proteins: Study of Six-Component Polypeptide Chains." J. Mol. Struct. (Theochem) 180, 113--123. Bakhshi, A. K., P. Otto, C.-M. Liegener, E. Rehm, and I. Ladik. 1990. "Modeling of Real 20-Component Protein Chains: Determination of the Electronic Density of States of
148
MOLECULAR ORBITAL CALCULATIONS FOR BIOLOGICAL SYSTEMS
Aperiodic Seven-Component Polypeptide Chains Containing Strongly Different Amino Acid Residues." Int. J. Quantum Chem. 38, 573-583.
Balazs, A. 1991. "Notes on the Local Flexibility of the Polypeptide Backbone Part I. Mean amplitudes of thermal motions in the dipeptide model." J. Mol. Struct. 245, 111-117.
Barone, V., F. Fraternali, and P. L. Cristinziano. 1990. "Sensitivity of Peptide Conformation to Methods and Geometrical Parameters. A Comparative Ab Initio and Molecular Mechanics Study of Oligomers of α-Aminoisobutyric Acid." Macromolecules 23, 2038-2044.
Bellido, M. N., and J. A. C. Rullmann. 1989. "Atomic Charge Models for Polypeptides Derived from Ab Initio Calculations." J. Comput. Chem. 10, 479-87.
Bohm, H.-J. 1993. "Ab Initio SCF Calculations on Low-Energy Conformers of N-Acetylglycylglycine N'-Methylamide." J. Am. Chem. Soc. 115, 6152-6158.
Bohm, H.-J., and S. Brode. 1995. "Ab Initio SCF Calculations on Low-Energy Conformers of Cyclohexaglycine." J. Comput. Chem. 16, 146-153.
Bohm, H.-J., and S. Brode. 1991. "Ab Initio SCF Calculations on Low-Energy Conformers of N-Acetyl-N'-methylalaninamide and N-Acetyl-N'-methylglycinamide." J. Am. Chem. Soc. 113, 7129-7135.
Bour, P., and T. A. Keiderling. 1993. "Ab Initio Simulations of the Vibrational Circular Dichroism of Coupled Peptides." J. Am. Chem. Soc. 115, 9602-9607.
Caillet, J., P. Claverie, and B. Pullman. 1978. "Effect of the Crystalline Environment upon the Rotational Conformation about the N-Cα and Cα-C' Bonds (Φ and Ψ) in Amides and Peptides." Theoret. Chim. Acta (Berl.) 47, 17-26.
Chang, C., and R. F. W. Bader. 1992. "Theoretical Construction of a Polypeptide." J. Phys. Chem. 96, 1654-1662.
Cheam, T. C. 1993. "Normal Mode Analysis of Alanine Dipeptide in the Crystal Conformation Using a Scaled Ab Initio Force Field." J. Mol. Struct. 295, 259-271.
Cheam, T. C. 1992. "Normal Mode Analysis of Glycine Dipeptide in Crystal Conformation Using a Scaled Ab Initio Force Field." J. Mol. Struct. 274, 289-309.
Cheam, T. C., and S. Krimm. 1990. "Ab Initio Force Fields of Alanine Dipeptide in Four Non-Hydrogen Bonded Conformations." J. Mol. Struct. (Theochem) 206, 173-203.
Cheam, T. C., and S. Krimm. 1989a. "Ab Initio Force Fields of Glycine Dipeptide in C5 and C7 Conformations." J. Mol. Struct. 193, 1-34.
Cheam, T. C., and S. Krimm. 1989b. "Ab Initio Force Fields of Alanine Dipeptide in C5 and C7 Conformations." J. Mol. Struct. (Theochem) 188, 15-43.
Cheam, T. C., and S. Krimm. 1986. "Vibrational Properties of the Peptide N-H Bond as a Function of Hydrogen-Bond Geometry: an Ab Initio Study." J. Mol. Struct. 146, 175-189.
Cheam, T. C., and S. Krimm. 1985. "Infrared Intensities of Amide Modes in N-methylacetamide and Poly(Glycine I) From Ab Initio Calculations of Dipole Moment Derivatives of N-methylacetamide." J. Chem. Phys. 82, 1631-1641.
Chesnut, D. B., and C. G. Phung. 1991. "Ab Initio Determination of Chemical Shielding in a Model Dipeptide." Chem. Phys. Letters 183, 505-509.
Day, R. S., S. Suhai, and J. Ladik. 1981. "Electronic Structure in Large Finite Aperiodic Polypeptide Chains." Chem. Phys. 62, 165-169.
Dive, G., D. Dehareng, and J. M. Ghuysen. 1994. "Detailed Study of a Molecule in a Molecule: N-Acetyl-L-tryptophanamide in an Active Site Model of α-Chymotrypsin." J. Am. Chem. Soc. 116, 2548-2556.
Endredi, G., C.-M. Liegener, M. A. McAllister, A. Perczel, J. Ladik, and I. G. Csizmadia. 1994. "Peptide Models 8. The Use of a Modified Romberg Formalism for the Extrapolation of Molecular Properties from Oligomers to Polymers. Polyalanine Diamide in Its "Extended Like" or (βL)n or (C5)n Conformation." J. Mol. Struct. (Theochem) 306, 1-7.
Faerman, C. H., and S. L. Price. 1990. "A Transferable Distributed Multipole Model for the Electrostatic Interactions of Peptides and Amides." J. Am. Chem. Soc. 112, 4915-4926.
Fernandez, B., M. A. Rios, and L. Carballeira. 1991. "Molecular Mechanics (MM2) and Conformational Analysis of Compounds with N-C-O Units. Parametrization of the Force Field and Anomeric Effect." J. Comput. Chem. 12, 78-90.
Fischer, S., R. L. Dunbrack, Jr., and M. Karplus. 1994. "Cis-Trans Imide Isomerization of the Proline Dipeptide." J. Am. Chem. Soc. 116, 11931-11937.
Fowler, P. W., and G. J. Moore. 1988. "Calculation of the Magnitude and Orientation of Electrostatic Interactions Between Small Aromatic Rings in Peptides and Proteins: Implications for Angiotensin II." Biochem. Biophys. Res. Commun. 153, 1296-1300.
Frey, R. F., J. Coffin, S. Q. Newton, M. Ramek, V. K. W. Cheng, F. A. Momany, and L. Schafer. 1992. "Importance of Correlation-Gradient Geometry Optimization for Molecular Conformational Analyses." J. Am. Chem. Soc. 114, 5369-5377.
Gaspar, R., Jr., and R. Gaspar. 1983. "Ab Initio Molecular Fragment Calculations with Pseudopotentials: Model Peptide Studies." Int. J. Quantum Chem. 24, 767-771.
Gould, I. R., W. D. Cornell, and I. H. Hillier. 1994. "A Quantum Mechanical Investigation of the Conformational Energetics of the Alanine and Glycine Dipeptides in the Gas Phase and in Aqueous Solution." J. Am. Chem. Soc. 116, 9250-9256.
Gould, I. R., and I. H. Hillier. 1993. "Solvation of Alanine Dipeptide: a Quantum Mechanical Treatment." J. Chem. Soc., Chem. Commun. 951-952.
Gould, I. R., and P. A. Kollman. 1992. "Ab Initio SCF and MP2 Calculations on Four Low-Energy Conformers of N-Acetyl-N'-methylalaninamide." J. Phys. Chem. 96, 9255-9258.
Grant, J. A., R. L. Williams, and H. A. Scheraga. 1990. "Ab Initio Self-Consistent Field and Potential-Dependent Partial Equalization of Orbital Electronegativity Calculations of Hydration Properties of N-Acetyl-N'-Methyl-Alanineamide." Biopolymers 30, 929-949.
Gresh, N., A. Pullman, and P. Claverie. 1985. "Theoretical Studies of Molecular Conformation. II: Application of the SIBFA Procedure to Molecules Containing Carbonyl and Carboxylate Oxygens and Amide Nitrogens." Theoret. Chim. Acta (Berl.) 67, 11-32.
Guo, H., and M. Karplus. 1994. "Solvent Influence on the Stability of the Peptide Hydrogen Bond: A Supramolecular Cooperative Effect." J. Phys. Chem. 98, 7104-7105.
Hadzi, D., M. Hodoscek, D. Turk, and V. Harb. 1988. "Theoretical Investigations of Structure and Enzymatic Mechanisms of Aspartyl Proteinases Part 2. Ab initio calculations on some possible initial steps of proteolysis." J. Mol. Struct. (Theochem) 181, 71-80.
Head-Gordon, T., M. Head-Gordon, M. J. Frisch, C. L. Brooks III, and J. A. Pople. 1991. "Theoretical Study of Blocked Glycine and Alanine Peptide Analogues." J. Am. Chem. Soc. 113, 5989-5997.
Head-Gordon, T., M. Head-Gordon, M. J. Frisch, C. Brooks III, and J. A. Pople. 1989. "A Theoretical Study of Alanine Dipeptide and Analogs." Int. J. Quantum Chem. Quantum Biol. Symp. 16, 311-322.
Jensen, J. H., K. K. Baldridge, and M. S. Gordon. 1992. "Uncatalyzed Peptide Bond Formation in the Gas Phase." J. Phys. Chem. 96, 8340-8351.
Jewsbury, P., S. Yamamoto, T. Minato, M. Saito, and T. Kitagawa. 1994. "The Proximal Residue Largely Determines the CO Distortion in Carbonmonoxy Globin Proteins. An Ab Initio Study of a Heme Prosthetic Unit." J. Am. Chem. Soc. 116, 11586-11587.
Jiao, D., M. Barfield, and V. J. Hruby. 1993. "Ab Initio IGLO Study of the φ- and ψ-Angle Dependence of the 13C Chemical Shifts in the Model Peptide N-Acetyl-N'-methylglycinamide." J. Am. Chem. Soc. 115, 10883-10887.
Kertesz, M., J. Koller, and A. Azman. 1980. "On the Electronic Structure of Periodic Polyglycine." Int. J. Quantum Chem. Quantum Biol. Symp. 7, 177-179.
Kleier, D. A., and W. N. Lipscomb. 1977. "Molecular Orbital Study of Polypeptides. Conformational and Electronic Structure of Polyglycine." Int. J. Quantum Chem. Quantum Biol. Symp. 4, 73-86.
Klimkowski, V. J., L. Schafer, F. A. Momany, and C. Van Alsenoy. 1985.
"Local Geometry
Maps and Conformational Transitions Between Low-Energy Conformers of N-Acetyl-N'-Methyl Glycine Amide: An Ab Initio Study at the 4-21G Level with Gradient Relaxed Geometries." J. Mol. Struct. (Theochem) 124, 143-153.
Ladik, J., P. Otto, A. K. Bakhshi, and M. Seel. 1986. "Quantum Mechanical Treatment of Biopolymers as Solids: Possible Implications for Carcinogenesis." Int. J. Quantum Chem. 29, 597-617.
Ladik, J., A. Sutjianto, and P. Otto. 1991. "Improved Band Structures of Some Homopolypeptides with Aliphatic Side Chains and of the Four Nucleotide Base Stacks: Estimation of Their Fundamental Gap." J. Mol. Struct. (Theochem) 228, 271-276.
Liegener, C.-M., A. K. Bakhshi, P. Otto, and J. Ladik. 1989. "Effects of Correlation and Hydration on the Electronic Structure of Aperiodic Polypeptides." J. Mol. Struct. (Theochem) 188, 205-212.
Liegener, C.-M., A. Sutjianto, and J. Ladik. 1990. "The Treatment of Electron Correlation in Aperiodic Systems. III. Application to Polypeptides." Chem. Phys. 145, 385-388.
Mavri, J., F. Avbelj, and D. Hadzi. 1989. "Conformation of N-Acetyl-L-Pro-D-Ala-N'-Methyl Tripeptide Empirical, Semi-Empirical MO and Ab Initio MO Calculations." J. Mol. Struct. (Theochem) 187, 307-315.
McAllister, M. A., A. Perczel, P. Csaszar, and I. G. Csizmadia. 1993a. "Peptide Models 5. Topological Features of Molecular Mechanics and Ab Initio 4D-Ramachandran Maps. Conformational Data for Ac-L-Ala-L-Ala-NHMe and For-L-Ala-L-Ala-NH2." J. Mol. Struct. (Theochem) 288, 181-198.
McAllister, M. A., A. Perczel, P. Csaszar, W. Viviani, J.-L. Rivail, and I. G. Csizmadia. 1993b. "Peptide Models 4. Topological Features of Molecular Mechanics and Ab Initio 2D-Ramachandran Maps. Conformational Data for For-Gly-NH2, For-L-Ala-NH2, Ac-L-Ala-NHMe and For-L-Val-NH2." J. Mol. Struct. (Theochem) 288, 161-179.
Mehrotra, P. K., M. Mezei, and D. L. Beveridge. 1984. "Monte Carlo Determination of the Internal Energies of Hydration for the Ala Dipeptide in the C7, C5, αR, and PII Conformations." Int. J. Quantum Chem. Quantum Biol. Symp. 11, 301-308.
Mezei, M., P. K. Mehrotra, and D. L. Beveridge. 1985. "Monte Carlo Determination of the Free Energy and Internal Energy of Hydration for the Ala Dipeptide at 25°C." J. Am. Chem. Soc. 107, 2239-2245.
Miick, S. M., G. V. Martinez, W. R. Fiori, A. P. Todd, and G. L. Millhauser. 1992. "Short Alanine-Based Peptides May Form 3₁₀-Helices and Not α-Helices in Aqueous Solution." Nature 359, 653-655.
Mirkin, N. G., and S. Krimm. 1990. "Vibrational Dynamics of the Cis Peptide Group." J. Am. Chem. Soc. 112, 9016-9017.
Nakagawa, S., and H. Umeyama. 1981. "Molecular Orbital Study of the Effects of Ionic Amino Acid Residues on Proton Transfer Energetics in the Active Site of Carboxypeptidase A." Chem. Phys. Letters 81, 503-507.
No, K. T., J. A. Grant, M. S. Jhon, and H. A. Scheraga. 1990. "Determination of Net Atomic Charges Using a Modified Partial Equalization of Orbital Electronegativity Method 2. Application to Ionic and Aromatic Molecules as Models for Polypeptides." J. Phys. Chem. 94, 4740-4746.
Oie, T., G. H. Loew, S. K. Burt, J. S. Binkley, and R. D. MacElroy. 1982. "Ab Initio Study of Catalyzed and Uncatalyzed Amide Bond Formation as a Model for Peptide Bond Formation: Ammonia-Formic Acid and Ammonia-Glycine Reactions." Int. J. Quantum Chem. Quantum Biol. Symp. 9, 223-245.
Oie, T., G. H. Loew, S. K. Burt, and R. D. MacElroy. 1983. "Ab Initio Study of Catalyzed and Uncatalyzed Amide Bond Formation as a Model for Peptide Bond Formation: Ammonia-Glycine Reactions." J. Comput. Chem. 4, 449-460.
Otto, P., and A. Sutjianto. 1991. "Electron Correlation Effects on the Energy Band Structure of Polyglycine." J. Mol. Struct. (Theochem) 231, 277-282.
Perczel, A., J. G. Angyan, M. Kajtar, W. Viviani, J.-L. Rivail, J.-F. Marcoccia, and I. G. Csizmadia. 1991a. "Peptide Models. 1.
Topology of Selected Peptide Conformational
Potential Energy Surfaces (Glycine and Alanine Derivatives)." J. Am. Chem. Soc. 113, 6256-6265.
Perczel, A., O. Farkas, and I. G. Csizmadia. 1996. "Peptide Models XVI. The Identification of Selected HCO-L-SER-NH2 Conformers via a Systematic Grid Search Using Ab Initio Potential Energy Surfaces." J. Comput. Chem. 17, 821-834.
Perczel, A., M. Kajtar, J.-F. Marcoccia, and I. G. Csizmadia. 1991b. "The Utility of the Four-Dimensional Ramachandran Map for the Description of Peptide Conformations." J. Mol. Struct. (Theochem) 232, 291-319.
Perczel, A., M. A. McAllister, P. Csaszar, and I. G. Csizmadia. 1994. "Peptide Models. IX. A Complete Conformational Set of For-Ala-Ala-NH2 from Ab Initio Computations." Can. J. Chem. 72, 2050-2070.
Perczel, A., M. A. McAllister, P. Csaszar, and I. G. Csizmadia. 1993. "Peptide Models 6. New β-Turn Conformations from Ab Initio Calculations Confirmed by X-ray Data of Proteins." J. Am. Chem. Soc. 115, 4849-4858.
Perczel, A., W. Viviani, and I. G. Csizmadia. 1992. "Peptide Conformational Potential Energy Surfaces and Their Relevance to Protein Folding" in Molecular Aspects of Biotechnology: Computational Models and Theories, Bertran, J., ed., Kluwer Academic Publishers, 39-82.
Peters, D., and J. Peters. 1984a. "Quantum Theory of the Structure and Bonding in Proteins Part 17. The unionised aspartic acid dipeptide." J. Mol. Struct. (Theochem) 109, 149-159.
Peters, D., and J. Peters. 1984b. "Quantum Theory of the Structure and Bonding in Proteins Part 16. The asparagine dipeptide." J. Mol. Struct. (Theochem) 109, 137-148.
Peters, D., and J. Peters. 1982a. "Quantum Theory of the Structure and Bonding in Proteins Part 15. The threonine dipeptide." J. Mol. Struct. (Theochem) 90, 321-334.
Peters, D., and J. Peters. 1982b. "Quantum Theory of the Structure and Bonding in Proteins Part 14. The serine dipeptide." J. Mol. Struct. (Theochem) 90, 305-320.
Peters, D., and J. Peters. 1982c. "Quantum Theory of the Structure and Bonding in Proteins Part 12. Conformational analysis of side chains and the ethyl group as a model side chain." J. Mol. Struct. (Theochem) 88, 137-156.
Peters, D., and J. Peters. 1981a. "Quantum Theory of the Structure and Bonding in Proteins Part 10. The C10 hydrogen bonds and β bends in peptides and proteins." J. Mol. Struct. (Theochem) 85, 267-277.
Peters, D., and J. Peters. 1981b. "Quantum Theory of the Structure and Bonding in Proteins Part 9. The proline dipeptide." J. Mol. Struct. (Theochem) 85, 257-265.
Peters, D., and J. Peters. 1981c. "Quantum Theory of the Structure and Bonding in Proteins Part 8. The alanine dipeptide." J. Mol. Struct. (Theochem) 85, 107-123.
Peters, D., and J. Peters. 1980a. "Quantum Theory of the Structure and Bonding in Proteins Part 7. The α helix and the hydrogen bonding in the tetrapeptide." J. Mol. Struct. 69, 249-263.
Peters, D., and J. Peters. 1980b. "Quantum Theory of the Structure and Bonding in Proteins Part 6. Factors governing the formation of hydrogen bonds in proteins and peptides." J. Mol. Struct. 68, 255-270.
Peters, D., and J. Peters. 1980c. "Quantum Theory of the Structure and Bonding in Proteins Part 5. Further studies on the C10 hydrogen bond of the tripeptide." J. Mol. Struct. 68, 243-253.
Peters, D., and J. Peters. 1979. "Quantum Theory of the Structure and Bonding in Proteins Part 2. The simple dipeptide." J. Mol. Struct. 53, 103-119.
Price, S. L., C. H. Faerman, and C. W. Murray. 1991. "Toward Accurate Transferable Electrostatic Models for Polypeptides: A Distributed Multipole Study of Blocked Amino Acid Residue Charge Distributions." J. Comput. Chem. 12, 1187-1197.
Price, S. L., and A. J. Stone. 1992. "Electrostatic Models for Polypeptides: Can We Assume Transferability?" J. Chem. Soc. Faraday Trans. 88, 1755-1763.
Probst, M. M., and B. M. Rode. 1984. "Quantum Chemical Investigations on the Complexes of Ca2+ and Zn2+ with Aliphatic Dipeptides." Inorg. Chim. Acta 92, 75-78.
Ragazzi, M., D. R. Ferro, and E. Clementi. 1979. "Analytical Potentials from Ab Initio Computations for the Interaction Between Biomolecules. V. Formyl-triglycyl Amide and Water." J. Chem. Phys. 70, 1040-1050.
Ramek, M., A.-M. Kelterer, B. J. Teppen, and L. Schafer. 1995. "Theoretical Structure Investigation of N-acetyl L-proline amide." J. Mol. Struct. 352/353, 59-70.
Ramani, R., and R. J. Boyd. 1981. "Ab-initio Molecular Orbital Study of the cis/trans Conformations of the Peptide Bond." Int. J. Quantum Chem. Quantum Biol. Symp. 8, 117-127.
Rao, B. G., R. F. Tilton, and U. C. Singh. 1992. "Free Energy Perturbation Studies on Inhibitor Binding to HIV-1 Proteinase." J. Am. Chem. Soc. 114, 4447-4452.
Roux, B. 1993. "Non-additivity in Cation-peptide Interactions. A Molecular Dynamics and Ab Initio Study of Na+ in the Gramicidin Channel." Chem. Phys. Letters 212, 231-240.
Ryan, J. A., and J. L. Whitten. 1972. "Self-Consistent Field Studies of Glycine and Glycylglycine. The Simplest Example of a Peptide Bond." J. Am. Chem. Soc. 94, 2396-2400.
Sapse, A.-M., S. B. Daniels, and B. W. Erickson. 1988. "Ab Initio Calculations for N-Acetylalanylglycine Amide." Tetrahedron 44, 999-1006.
Sapse, A.-M., L. M. Fugler, and D. Cowburn. 1986. "An Ab Initio Study of Intermolecular Hydrogen Bonding Between Small Peptide Fragments." Int. J. Quantum Chem. 29, 1241-1251.
Sapse, A.-M., D. C. Jain, D. de Gale, and T. C. Wu. 1990. "Solvent Effect and Librational Entropy Calculations on N-Acetylalanylglycine Amide." J. Comput. Chem. 11, 573-575.
Sapse, A.-M., L. Mallah-Levy, S. B. Daniels, and B. W. Erickson. 1987. "The γ Turn: Ab Initio Calculations on Proline and N-Acetylproline Amide." J. Am. Chem. Soc. 109, 3526-3529.
Sarai, A., and M. Saito. 1985. "Theoretical Studies on the Interaction of Proteins with Base Pairs. II. Effect of External H-Bond Interactions on the Stability of Guanine-Cytosine and Non-Watson-Crick Pairs." Int. J. Quantum Chem. 28, 399-409.
Sarai, A., and M. Saito. 1984. "Theoretical Studies on the Interaction of Proteins with Base Pairs. I. Ab Initio Calculation for the Effect of H-Bonding Interaction of Proteins on the Stability of Adenine-Uracil Pair." Int. J. Quantum Chem. 25, 527-533.
Sawaryn, A., and J. S. Yadav. 1982. "Ab Initio Studies on the Nonplanarity of a Peptide Unit: Calculations on Model Compounds." Int. J. Quantum Chem. 22, 547-556.
Scarsdale, J. N., C. Van Alsenoy, V. J. Klimkowski, L. Schafer, and F. A. Momany. 1983. "Ab Initio Studies of Molecular Geometries. 27. Optimized Molecular Structures and Conformational Analysis of N-Acetyl-N-methylalaninamide and Comparison with Peptide Crystal Data and Empirical Calculations." J. Am. Chem. Soc. 105, 3438-3445.
Schafer, L., V. J. Klimkowski, F. A. Momany, H. Chuman, and C. Van Alsenoy. 1984. "Conformational Transitions and Geometry Differences Between Low-Energy Conformers of N-Acetyl-N'-Methyl Alanineamide: An Ab Initio Study at the 4-21G Level with Gradient Relaxed Geometries." Biopolymers 23, 2335-2347.
Schafer, L., S. Q. Newton, M. Cao, A. Peeters, C. Van Alsenoy, K. Wolinski, and F. A. Momany. 1993. "Evaluation of the Dipeptide Approximation in Peptide Modeling by Ab Initio Geometry Optimizations of Oligopeptides." J. Am. Chem. Soc. 115, 272-280.
Schafer, L., C. Van Alsenoy, and J. N. Scarsdale. 1982. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. 23. Molecular Structures and Conformational Analysis of the Dipeptide N-acetyl-N'-methyl Glycyl Amide and the Significance of Local Geometries for Peptide Structures." J. Chem. Phys. 76, 1439-1444.
Scheiner, S., and L. Wang. 1993. "Hydrogen Bonding and Proton Transfers of the Amide Group." J. Am. Chem. Soc. 115, 1958-1963.
Shang, H. S., and T. Head-Gordon. 1994. "Stabilization of Helices in Glycine and Alanine Dipeptides in a Reaction Field Model of Solvent." J. Am. Chem. Soc. 116, 1528-1532.
Shipman, L. L., and R. E. Christoffersen. 1973. "Ab Initio Calculations on Large Molecules Using Molecular Fragments. Polypeptides of Glycine." J. Am. Chem. Soc. 95, 4733-4744.
Siam, K., V. J. Klimkowski, C. Van Alsenoy, J. D. Ewbank, and L. Schafer. 1987. "Ab Initio Geometry Refinement of Some Selected Structures of the Model Dipeptide N-Acetyl N'-Methyl Serine Amide." J. Mol. Struct. (Theochem) 152, 261-270.
Siam, K., S. Q. Kulp, J. D. Ewbank, and L. Schafer. 1989. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment Part 64. Conformational analysis and local geometry maps of the model dipeptide N-acetyl N'-methyl serine amide." J. Mol. Struct. (Theochem) 184, 143-157.
Skala, L., and P. Pancoska. 1988. "Interpolation Formula for Physical Properties of Polypeptides as a Function of the Number of Amino Acid Residues." Chem. Phys. 125, 21-30.
Sokalski, W. A., D. A. Keller, R. L. Ornstein, and R. Rein. 1993. "Multipole Correction of Atomic Monopole Models of Molecular Charge Distribution. I. Peptides." J. Comput. Chem. 14, 970-976.
Sordo, J. A., T. L. Sordo, G. M. Fernandez, R. Gomperts, S. Chin, and E. Clementi. 1989. "A Systematic Study on the Basis Set Superposition Error in the Calculation of Interaction Energies of Systems of Biological Interest." J. Chem. Phys. 90, 6361-6370.
Stern, P. S., M. Chorev, M. Goodman, and A. T. Hagler. 1983. "Computer Simulation of the Conformational Properties of Retro-Inverso Peptides. II. Ab Initio Study, Spatial Electron Distribution, and Population Analysis of N-Formylglycine Methylamide, N-Formyl N'-Acetyldiaminomethane, and N-Methylmalonamide." Biopolymers 22, 1901-1917.
Sternberg, U., F.-T. Koch, and M. Mollhoff. 1994. "New Approach to the Semiempirical Calculation of Atomic Charges for Polypeptides and Large Molecular Systems." J. Comput. Chem. 15, 524-531.
Sugawara, Y., A. Y. Hirakawa, and M. Tsuboi. 1984. "In-Plane Force Constants of the Peptide Group: Least-Squares Adjustment Starting from Ab Initio Values of N-Methylacetamide." J. Mol. Spectrosc. 108, 206-214.
Suhai, S. 1985. "Perturbation Theoretical Calculation of Optical Effects in Polypeptides." J. Mol. Struct. (Theochem) 123, 97-108.
Torii, H., and M. Tasumi. 1993. "Infrared Intensities of Vibrational Modes of an α-helical Polypeptide: Calculations Based on the Equilibrium Charge/Charge Flux (ECCF) Model." J. Mol. Struct. 300, 171-179.
Tranter, G. E. 1986. "Parity-violating Energy Differences and the Origin of Biomolecular Homochirality." J. Theor. Biol. 119, 467-479.
Van Alsenoy, C., M. Cao, S. Q. Newton, B. Teppen, A. Perczel, I. G. Csizmadia, F. A. Momany, and L. Schafer. 1993. "Conformational Analysis and Structural Study by Ab Initio Gradient Geometry Optimizations of the Model Tripeptide N-formyl L-alanyl L-alanine Amide." J. Mol. Struct. (Theochem) 286, 149-163.
Van Duijnen, P. T., and B. T. Thole. 1982. "Cooperative Effects in α-Helices: An Ab Initio Molecular-Orbital Study." Biopolymers 21, 1749-1761.
Viviani, W., J.-L. Rivail, and I. G. Csizmadia. 1993a. "Peptide Models II. Intramolecular Interactions and Stable Conformations of Glycine, Alanine, and Valine Peptide Analogues." Theor. Chim. Acta 85, 189-197.
Viviani, W., J.-L. Rivail, A. Perczel, and I. G. Csizmadia. 1993b. "Peptide Models. 3. Conformational Potential Energy Hypersurface of Formyl-L-valinamide." J. Am. Chem. Soc. 115, 8321-8329.
Voisin, C., and A. Cartier. 1993. "Determination of Distributed Polarizabilities to be Used for Peptide Modeling." J. Mol. Struct. (Theochem) 286, 35-5.
Walker, P. D., and P. G. Mezey. 1994. "Ab Initio Quality Electron Densities for Proteins: A MEDLA Approach." J. Am. Chem. Soc. 116, 12022-12032.
Walker, P. D., and P. G. Mezey. 1993. "Molecular Electron Density Lego Approach to Molecule Building." J. Am. Chem. Soc. 115, 12423-12430.
Weiner, S. J., U. C. Singh, T. J. O'Donnell, and P. A. Kollman. 1984. "Quantum and Molecular Mechanical Studies on Alanyl Dipeptide." J. Am. Chem. Soc. 106, 6243-6245.
Williams, D. E. 1990. "Alanyl Dipeptide Potential-Derived Net Atomic Charges and Bond Dipoles, and Their Variation with Molecular Conformation." Biopolymers 29, 1367-1386.
Wright, L. R., and R. F. Borkman. 1982. "Ab Initio Self-Consistent Field Studies of the Peptides Gly-Gly, Gly-Ala, Ala-Gly, and Gly-Gly-Gly." J. Phys. Chem. 86, 3956-3962.
Yang, W. 1992. "Electron Density as the Basic Variable: a Divide-and-Conquer Approach to the Ab Initio Computation of Large Molecules." J. Mol. Struct. (Theochem) 255, 461-479.
Zhang, K., D. M. Zimmerman, A. Chung-Phillips, and C. J. Cassady. 1993. "Experimental and Ab Initio Studies of the Gas-Phase Basicities of Polyglycines." J. Am. Chem. Soc. 115, 10812-10822.

Amino Acids and Peptides

Barlow, D. J., and J. M. Thornton. 1988. "Helix Geometry in Proteins." J. Mol. Biol. 201, 601-619.
Boggs, J. E. 1988. "Interaction of Theoretical Chemistry with Gas-Phase Electron Diffraction," in Stereochemical Applications of Gas-Phase Electron Diffraction, Hargittai, I., and Hargittai, M., eds., Vol. B, Chap. 10, 455-475. New York, VCH Publishers.
Boggs, J. E. 1983. "The Integration of Structure Determination by Computation, Electron Diffraction and Microwave Spectroscopy." J. Mol. Struct. 97, 1-16.
Boggs, J. E., and F. R. Cordell. 1981. "Accurate Ab Initio Gradient Calculation of the Structures and Conformations of Some Boric and Fluoroboric Acids. Basis-Set Effects on Angles Around Oxygen." J. Mol. Struct. (Theochem) 76, 329-347.
Boggs, J. E., M. von Carlowitz, and S. von Carlowitz. 1982. "Symmetry of the Methyl Group in Molecules of the Type CH3YH2X." J. Phys. Chem. 86, 157-159.
Brown, R. D., P. D. Godfrey, J. W. V. Storey, and M.-P. Bassez. 1978. "Microwave Spectrum and Conformation of Glycine." J. Chem. Soc., Chem. Commun. 547-548.
Caminati, W., A. C. Fantoni, L. Schafer, K. Siam, and C. Van Alsenoy. 1986. "Conformational and Structural Analysis of Methyl Hydrazinocarboxylate by Microwave Spectroscopy and Ab Initio Geometry Refinements." J. Am. Chem. Soc. 108, 4364-4367.
Caminati, W., A. C. Fantoni, B. Velino, K. Siam, L. Schafer, J. D. Ewbank, and C. Van Alsenoy. 1987a. "Conformational Equilibrium and Internal Hydrogen Bonding in 2-Methylallyl Alcohol: Detection of a Second Conformer by Microwave Spectroscopy on the Basis of Ab Initio Structure Calculations." J. Mol. Spectrosc. 124, 72-81.
Caminati, W., K. Siam, J. D. Ewbank, and L. Schafer. 1987b. "Interpretation of the Microwave Spectrum of 2-Methoxy Ethylamine Using Its Ab Initio Structures." J. Mol. Struct. 158, 237-247.
Caminati, W., B. Velino, M. Dakkouri, L. Schafer, K. Siam, and J. D. Ewbank. 1987c. "Reinvestigation of the Microwave Spectrum of Cyanocyclobutane: Assignment of the Axial Conformer." J. Mol. Spectrosc. 123, 469-475.
Cao, M., and L. Schafer. 1993. "Viewpoint 6 - Characteristic Aspects of GG Sequences and the Importance of Constitutional Properties for Conformational Entropies." J. Mol. Struct. (Theochem) 284, 235-242.
Chiu, N. S., H. L. Sellers, L. Schafer, and K. Kohata. 1979. "Molecular Orbital Constrained Electron Diffraction Studies. Conformational Behavior of 1,2-Dimethylhydrazine." J. Am. Chem. Soc. 101, 5883-5889.
Chuman, H., F. A. Momany, and L. Schafer. 1984. "Backbone Conformations, Bend Structures, Helix Structures, and Other Tests of an Improved Conformational Energy Program for Peptides: ECEPP83." Int. J. Pept. Prot. Res. 24, 233-248.
Cremer, D. 1981. "Theoretical Determination of Molecular Structure and Conformation Part X. Geometry and puckering potential of azetidine, (CH2)3NH, combination of electron diffraction and ab initio studies." J. Mol. Struct. 75, 225-240.
de Smedt, J., F. Vanhoutegem, C. Van Alsenoy, H. J. Geise, and L. Schafer. 1992. "Empirical Corrections of SCF Geometries With Special Examples from 4-21G Calculations." J. Mol. Struct. 259, 289-305.
Doms, L., H. J. Geise, C. Van Alsenoy, L. Van den Enden, and L. Schafer. 1985. "The Molecular Orbital Constrained Electron Diffraction (MOCED) Structural Model of Quadricyclane Determined by Electron Diffraction Combined with Ab Initio Calculations of Potential and Geometrical Parameters." J. Mol. Struct. 129, 299-314.
Eliel, E. L., N. L. Allinger, S. J. Angyal, and G. A. Morrison. 1965. "Conformational Analysis." New York, Interscience.
Frey, R. F., M. Cao, S. Q. Newton, and L. Schafer. 1993. "Electron Correlation Effects in Aliphatic Non-Bonded Interactions: Comparison of N-Alkane MP2 and HF Geometries." J. Mol. Struct. (Theochem) 285, 99-113.
Geise, H. J., and W. Pyckhout. 1988. "Self-Consistent Molecular Models from a Combination of Electron Diffraction, Microwave, and Infrared Data Together with High-Quality Theoretical Calculations," in Stereochemical Applications of Gas-Phase Electron Diffraction, Hargittai, I., and Hargittai, M., eds., Vol. A, Chap. 10, 321-346. New York, VCH Publishers.
Godfrey, P. D., S. Firth, L. D. Hatherley, R. D. Brown, and A. P. Pierlot. 1993. "Millimeter-Wave Spectroscopy of Biomolecules: Alanine." J. Am. Chem. Soc. 115, 9687-9691.
Godfrey, P. D., and R. D. Brown. 1995. "Shape of Glycine." J. Am. Chem. Soc. 117, 2019-2023.
Godfrey, P. D., R. D. Brown, and F. M. Rodgers. 1996. "The Missing Conformers of Glycine and Alanine: Relaxation in Supersonic Jets." J. Mol. Struct. 376, 65-81.
Hehre, W. J., L. Radom, P. v. R. Schleyer, and J. A. Pople. 1986. "The Performance Of The Model," in Ab Initio Molecular Orbital Theory, Chap. 16, 133-344. New York, Wiley.
Iijima, K., K. Tanaka, and S. Onuma. 1991. "Main Conformer of Gaseous Glycine: Molecular Structure and Rotational Barrier from Electron Diffraction Data and Rotational Constants." J. Mol. Struct. 246, 257-266.
Jiang, X., M. Cao, B. J. Teppen, S. Q. Newton, and L. Schafer. 1995a. "Predictions of Protein Backbone Structural Parameters from First Principles: Systematic Comparisons of Calculated N-C(α)-C' Angles with High-Resolution Protein Crystallographic Results." J. Phys. Chem. 99, 10521-10525.
Jiang, X., M. Cao, S. Q. Newton, L. Schafer, and E. F. Paulus. 1995b. "Predictions of Peptide and Protein Backbone Structural Parameters from First Principles. IV: Systematic Comparisons of Calculated N-C(α)-C' Angles with Peptide Crystal Structures." Electronic J. Theo. Chem. 1, 11-17.
Jiang, X., C.-H. Yu, M. Cao, S. Q. Newton, E. F. Paulus, and L. Schafer. 1997. "Φ/Ψ Torsional Dependence of Peptide and Protein Backbone Bond-Lengths and Bond-Angles: Comparison of Crystallographic and Calculated Parameters." J. Mol. Struct. 403, 83-93.
Kabsch, W., and C. Sander. 1983. "Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features." Biopolymers 22, 2577-2637.
Karplus, P. A. 1996. "Experimentally Observed Conformation-Dependent Geometry and Hidden Strain in Proteins." Protein Sci. 5, 1406-1420.
Kitano, M., and K. Kuchitsu. 1974. "Molecular Structure of Formamide as Studied by Gas Electron Diffraction." Bull. Chem. Soc. Japan 47, 67-72.
Klimkowski, V. J., J. D. Ewbank, C. Van Alsenoy, J. N. Scarsdale, and L. Schafer. 1982. "Molecular Orbital Constrained Electron Diffraction Studies. 4. Conformational Analysis of the Methyl Ester of Glycine." J. Am. Chem. Soc. 104, 1476-1480.
Kohata, K., T. Fukuyama, and K. Kuchitsu. 1979. "Molecular Structure and Conformation of 1,2-Dimethylhydrazine Studied by Gas Electron Diffraction." Chem. Lett.
257-260.
Lewis, P. N., F. A. Momany, and H. A. Scheraga. 1973a. "Energy Parameters For Polypeptides: Conformational Energy Analysis Of The N-Acetyl N'-Methyl Amides of The Twenty Naturally Occurring Amino Acids." Isr. J. Chem. 11, 121-152.
Lewis, P. N., F. A. Momany, and H. A. Scheraga. 1973b. "Chain Reversals In Proteins." Biochim. Biophys. Acta 303, 211-229.
MacArthur, M. W., and J. M. Thornton. 1996. "Deviations from Planarity of the Peptide Bond in Peptides and Proteins." J. Mol. Biol. 264, 1180-1195.
Marsh, R. E., and J. Donohue. 1967. "Crystal Structures of Amino Acids and Peptides." Adv. Prot. Chem. 22, 235-256.
McKean, D. C., J. E. Boggs, and L. Schafer. 1984. "CH Bond Length Variations Due to the Intramolecular Environment: a Comparison of the Results Obtained by the Method of Isolated CH Stretching Frequencies and by Ab Initio Gradient Calculations." J. Mol. Struct. 116, 313-330.
Milner-White, E. J., and R. Poet. 1987. "Loops, Bulges, Turns, and Hairpins in Proteins." Trends Biochem. Sci. 12, 189-192.
Mislow, K. 1965. Introduction to Stereochemistry. Reading, Mass. Benjamin/Cummings.
Mislow, K., and M. Raban. 1967. "Stereoisomeric Relationships of Groups in Molecules," in Topics in Stereochemistry, Allinger, N. L., and Eliel, E. L., eds. Vol. 1, 1-37. New York, Interscience.
Momany, F. A., V. J. Klimkowski, and L. Schafer. 1990. "On the Use of Conformationally Dependent Geometry Trends from Ab Initio Dipeptide Studies to Refine Potentials for the Empirical Force Field CHARMM." J. Comput. Chem. 11, 654-662.
Momany, F. A., R. Rone, H. Kunz, R. F. Frey, S. Q. Newton, and L. Schafer. 1993. "Geometry Optimization, Energetics and Solvation Studies on Four- and Five-membered Cyclic and Disulfide-bridged Peptides, Using the Programs QUANTA3.3 and CHARMm 22." J. Mol. Struct. 286, 1-18.
Møller, C., and M. S. Plesset. 1934. "Note on an Approximate Treatment for Many-Electron Systems." Phys. Rev. 46, 618-622.
Nakata, M., H. Takeo, C. Matsumura, K. Yamanouchi, K. Kuchitsu, and T. Fukuyama. 1981. "Structures of 1,2-Dimethylhydrazine Conformers as Determined by Microwave Spectroscopy and Gas Electron Diffraction." Chem. Phys. Letters 83, 246-249.
Norden, T. D., S. W. Staley, W. H. Taylor, and M. D. Harmony. 1986. "On the Electronic Character of Methylenecyclopropene: Microwave Spectrum, Structure, and Dipole Moment." J. Am. Chem. Soc. 108, 7912-7918.
Pople, J. A., R. Krishnan, H. B. Schlegel, and J. S. Binkley. 1979. "Derivative Studies in Hartree-Fock and Møller-Plesset Theories." Int. J. Quantum Chem. Quantum Chem. Symp. 13, 225-241.
Pople, J. A., and M. Gordon. 1967. "Molecular Orbital Theory of the Electronic Structure of Organic Compounds. I. Substituent Effects and Dipole Moments." J. Am. Chem. Soc. 89, 4253-4261.
Pulay, P. 1979a. "An Efficient Ab Initio Gradient Program." Theoret. Chim. Acta (Berl.) 50, 299-312.
Pulay, P. 1969. "Ab Initio Calculation of Force Constants and Equilibrium Geometries in Polyatomic Molecules. I. Theory." Mol. Phys. 17, 197-204.
Pulay, P., G. Fogarasi, F. Pang, and J. E. Boggs. 1979b. "Systematic Ab Initio Gradient Calculation of Molecular Geometries, Force Constants, and Dipole Moment Derivatives." J. Am. Chem. Soc. 101, 2550-2560.
Pullman, B., and A. Pullman. 1974. "Molecular Orbital Calculations on the Conformation of Amino Acid Residues of Proteins." Adv. Protein Chem. 28, 347-526.
Ramachandran, G. N., and V. Sasisekharan. 1968. "Conformation Of Polypeptides And Proteins." Adv. Protein Chem. 23, 283-238.
Richardson, J. S., and D. C. Richardson. 1989. "Principles and Patterns of Protein Conformation," in Prediction of Protein Structure and the Principles of Protein Conformation, Fasman, G., ed., 1-98. New York, Plenum.
ON COMPARING EXPERIMENTAL AND CALCULATED STRUCTURAL PARAMETERS
157
Richardson, J. S. 1981. "The Anatomy and Taxonomy of Protein Structure." Adv. Protein Chem. 34, 167--339. Sasisekharan, V. 1962. "Stereochemical Criteria for Polypeptide and Protein Structures," in Collagen, Ramanathan, N., ed. 39--78. Madras, India, Wiley. Schafer, L., M. Cao, and M. J. Meadows. 1995a. "Predictions of Protein Backbone Bond Distances and Angles from First Principles." Biopolymers 35, 603-606. Schafer, L., and M. Cao. 1995b. "Predictions of Protein Backbone Bond Distances and Angles from First Principles." J. Mol. Struct. 333, 201-208. Schafer, L. 1983. "The Ab Initio Gradient Revolution in Structural Chemistry: the Importance of Local Molecular Geometries and the Efficacy of Joint Quantum Mechanical and Experimental Procedures." J. Mol. Struct. 100, 51--73. Schafer, L., I. S. Bin Drees, R. F. Frey, C. Van Alsenoy, and J. D. Ewbank. 1995c. "Molecular Orbital Constrained Gas Electron Diffraction Study of N-Acetyl N'MEthyl Alanine Amide." J. Mol. Struct. (Theochem) 338, 71-82. Schafer, L., J. D. Ewbank, V. J. Klimkowski, K. Siam, and C. Van Alsenoy. 1986. "predictions of Relative Structural Trends from Ab Initio Derived Standard Geometry Functions." J. Mol. Struct. (Theochem) 135, 141-158. Schafer, L., J. D. Ewbank, K. Siam, N. S. Chiu, and H. L. Sellers. 1988a. "Molecular Orbital Constrained Electron Diffraction (MOCED) Studies: The Concerted Use of Electron Diffraction and Quantum Chemical Calculations," in "Stereochemical Applications of Gas-Phase Electron Diffraction," Hargittai, L, and Hargittai, M., eds., Vol. A, Chap. 9, 301-320. New York, VCH Publisher. Schafer, L., M. Cao, M. Ramek, B. J. Teppen, S. Q. Newton, and K. Siam. "Conformational Geometry Functions: Additivity and Cooperative Effects." J. Mol. Struct., in press. Schafer, L., and K. Siam. 1988b. "Comment on: Accuracy of Ab Initio C-H Bond Length Differences and Their Correlation with Isolated C-H Stretching Frequencies." J. Chem. Phys. 88, 7255-7256. Schafer, L., K. Siam, J. D. 
Ewbank, W. Caminati, and A. C. Fantoni. 1987. "Some Surprising Applications of Ab Initio Gradient Geometries in Microwave Spectroscopic Analyses," in "Modeling of Structures and Properties of Molecules," Maksic, J. B., ed., Chap. 4, 79--90. Chichester, England, E. Horwood, Publ. Comp. Schafer, L. 1991. "The Mutation of Chemistry: The Rising Importance of Ab Initio Computational Techniques in Chemical Research." J. Mol. Struct. (Theochem) 230, 5-11. Schafer, L., C. Van Alsenoy, and J. N. Scarsdale. 1982. "Estimates for Systematic Empirical Corrections of Consistent 4-21G Ab Initio Geometries and Their Correlations to Total Energy Group Increments." J. Mol. Struct. (Theochem) 86, 349-364. Schafer, L., C. Van Alsenoy, and L. Van den Enden. 1984. "The Possible Chirality Of Tetrahedral Carbon Atoms with Two Substituents of Identical Constitution." J. Chem. Ed. 61 ,945--947. Schei, S. H. 1984a. "3-Chloro-l-Butene: Gas-Phase Molecular Structure and Conformations as Determined by Electron Diffraction and by Molecular Mechanics and Ab Initio Calculations. J. Mol. Struct. 118, 319-332. Schei, S. H., A. Almenningen, and J. Almlof, 1984b. "1,2,4,5-Tetrafluorobenzene: Molecular Structure as Determined by Gas-Phase Electron Diffraction and by Ab Initio Calculations." J. Mol. Struct. 112, 301-308. Scheraga, H. A. 1968. "Calculations of Conformations of Polypeptides." Adv. Phys. Org. Chem. 6, 103--184. Sibanda, B. L., and J. M. Thornton. 1985. "( -Hairpin families in Globular Proteins." Nature 316, 170--174. Skancke, A., and J. E. Boggs. 1978. "The Molecular Structures of Methylcyclopropane, Cyclopropylamine and Cyclopropyl Lithium." J. Mol. Struct. 50, 173--182. Staley, S. W., T. D. Norden, W. H. Taylor, and M. D. Harmony. 1987. "Electronic Structure of Cyclopropenone and Its Relationship to Methylenccyclopropene. Evaluation of Criteria for Aromaticity." J. Am. Chem. Soc. 109, 7641-7647.
158
MOLECULAR ORBITAL CALCULATIONS FOR BIOLOGICAL SYSTEMS
Suenram, R. D., and F. J. Lovas. 1980. "Millimeter Wave Spectrum of Glycine. A New Conformer." J. Am. Chem. Soc. 102, 7180-7184. Suenram, R. D., and F. J. Lovas. 1978. "Millimeter Wave Spectrum of Glycine." J. Mol. Spectrosc. 72, 372--382. Teeter, M. M., S. M. Roe, and N. H. Heo. 1993. "Atomic Resolution Crystal Structure of the Hydrophobic Protein Crambin at 130 K." J. Mol. Biol. 239, 292-311. Teppen, B. J., M. Cao, R. F. Frey, C. Van Alsenoy, D. M. Miller, and L. Schafer. 1994a. "An Investigation into Intramolecular Hydrogen Bonding: Impact of Basis Set and Electron Correlation on the Ab Initio Conformational Analysis of 1,2-Ethanediol and 1,2,3Propanetriol." I. Mol. Struct. (Theochem) 314, 169-190. Teppen, B. J., D. M. Miller, M. Cao, R. F. Frey, S. Q. Newton, F. A. Momany, M. Ramek, and L. Schafer. 1994b. "Investigation of Electron Correlation Effects on Molecular Geometries." J. Mol. Struct. (Theochem) 311, 9-17. Van Den Enden, L., C. Van Alsenoy, J. N. Scarsdale, V. J. Klimkowski, and L. Schafer. 1983. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. 29. Conformational Analysis Of Glycine Aldenyde." J. Mol. Struct. 105, 407--415. Van Alsenoy, C. 1988. "Ab Initio Calculations on Large Molecules: The Multiplicative Integral Approximation." J. Comput. Chem. 9, 620-626. Van Hemelrijk, D., L. Van den Enden, H. J. Geise, H. L. Sellers, and L. Schafer. 1980. "Structure Determination of 1-Butene by Gas Electron Diffraction, Microwave Spectroscopy, Molecular Mechanics, and Molecular Orbital Constrained Electron Diffraction." J. Am. Chem. Soc. 102, 2189-2195. Venkatachalam, C. M. 1968. "Stereochemical Criteria for Polypeptides and Proteins. V. Conformation of a System of Three Linked Peptide Units." Biopol. 6, 1425-1436. von Carlowitz, S., H. Oberhammer, H. Willner, and J. E. Boggs. 1986. "Structural Determination of a Recalcitrant Molecule (S2F4)." J. Mol. Struct. 100, 161-177. von Carlowitz, S., W. Zeil, P. Pulay, and J. E. Boggs. 1982. 
"The Molecular Structure, Vibrational Force Field, Spectral Frequencies, and Infrared Intensities of CH3POF2." J. Mol. Struct. (Theochem) 87, 113-124.
6
Ab Initio Studies of Anti-Cancer Drugs
Anne-Marie Sapse
Anti-Cancer Drugs
Cancer is an extraordinarily complicated group of diseases characterized by the loss of normal control over the maintenance of cellular organization in the tissues. It is still not completely understood how much of the disease is of genetic, viral, or environmental origin. The result, however, is that cancer cells possess growth advantages over normal cells, which damages the host through local pressure effects, destruction of tissues, and secondary systemic effects. As such, a goal of cancer therapy is the destruction of cancer cells via chemotherapeutic agents or radiation. Since the late 1940s, when Farber treated leukemia with methotrexate, cancer therapy with cytotoxic drugs has made enormous progress. Chemotherapy is usually integrated with other treatments such as surgery, radiotherapy, and immunotherapy, and it is clear that, after surgery, it is effective against solid tumors. This is because only systemic therapy can attack micrometastases. The rationale for using chemotherapy is the control of tumor-cell populations via a killing mechanism. The major problem in this approach is the lack of selectivity of chemotherapeutic agents. Some agents do preferentially kill cancer cells, but no agent has yet been synthesized that kills only cancer cells and does not affect normal cells. Unfortunately, normal tissues are affected, giving rise to a multitude of side effects. In addition to drugs exhibiting cytotoxic activity, antiproliferative drugs are also formulated. According to their mode of action, anti-cancer drugs are divided into several classes:
alkylating agents
antimetabolites
DNA intercalators
mitotic inhibitors
lexitropsins
drugs which bind covalently to DNA
Experimental studies of these molecules are complemented and enhanced by theoretical studies. Some of the theoretical studies use molecular mechanics methods, while others apply ab initio or semi-empirical quantum-chemistry methods. Most of these molecules are large and, besides their structures and properties, it is important to investigate their interaction with DNA fragments (themselves large molecules). Ab initio calculations cannot always be applied to the whole system. Therefore, models are used and, through a judicious choice of the entities investigated, the calculations can shed light on the problem and provide enough information to complement the experimental studies. The next pages provide examples of ab initio calculations applied to some anti-cancer drugs.
Alkylating Agents
Agents that alkylate biological macromolecules act via reactions in which a leaving group, such as a halogen or ammonium ion, is displaced from the alkylating agent by a nucleophilic atom on the macromolecule. The order of the reaction depends on the nature of the alkylating agent; for instance, methanesulfonates, ethylamines, and epoxides give second-order reactions (first order in the alkylating agent and first order in the nucleophile). Among the most active alkylating agents used against cancer are nitrosoureas and aziridines.
Nitrosoureas and Mitomycins
As early as 1961, J. A. Montgomery submitted N-methyl-N-nitrosourea to be screened for anti-cancer activity. Even though its activity was not great, nitrosourea turned out to be the only one out of seven agents active against intraperitoneally or intravenously implanted leukemias, and it also showed activity against intracerebrally implanted leukemic cells. That demonstrated the capacity of nitrosoureas to cross the blood-brain barrier and spurred researchers to look for more active nitrosoureas. Of those, N-(2-haloethyl)-N-nitrosoureas were found to exhibit superior activity, especially the chloro and fluoro compounds.
The anti-tumor and cytotoxic activity of nitrosoureas is related to the ability of the drug to alkylate DNA.1 The decomposition of chloroethyl nitrosourea proceeds spontaneously under physiological conditions to produce alkylating moieties and isocyanates that might function as carbamoylating agents.2 Urea precursors of the N-nitroso compounds are not biologically active. However, the N-nitroso group labilizes the bond between the nitrogen and the adjacent carbon atom, so the compound decomposes spontaneously at physiological pH. Thus, electrophilic species are generated, which are ultimately responsible for the DNA attack. These agents could be either diazohydroxides or diazonium ions, a possible carbenium ion having been eliminated by kinetic evidence.1 In the formation of diazohydroxides, experimental evidence3 on the aqueous decomposition of specifically labelled bis(2-chloroethyl) nitrosourea (BCNU) indicates a relative contribution of about 80% to 20% for the competing pathways that form the E and Z diazohydroxide configurational isomers, respectively. The computed estimate of the energy barrier for configurational inversion of the Z to E isomer of methyldiazohydroxide is 49 kcal/mol, using ab initio calculations with total geometry optimization, with the 6-31G* basis set. It was postulated that the alkylating agent R-N=N-OH probably existed with a short lifetime inside the target cell.4 It was found that certain DNA lesions, such as guanine O6 alkylation, appear more harmful to the cell than the relatively innocuous guanine N7 alkylation.5 Indeed, attack at O6 of
guanine leads to adduct formation (interstrand cross-linking), which is a lethal event for the cell.6 The proposed sequence in the formation of DNA cross-links by haloethylnitrosoureas begins with the generation of the haloethyldiazohydroxides. These transfer the haloethyl moiety to a nucleophilic site on one DNA strand. This monoadduct reacts with a second nucleophilic site on the complementary strand, through the displacement of the labile halogen. Thus, an ethylene bridge is formed between the two strands. It was hypothesized that the decisive factor in both carcinogenesis and carcinostatics is not the absolute extent of the DNA modification, but the relative amount of the O6 adduct that is formed.7 This was related to the repair mechanism of DNA. Indeed, in normal human cells, the repair mechanism removes the initial haloethyl lesion at O6 before the cross-link can be formed. This process somewhat spares normal cells during treatment with alkylating agents. Experimental findings show that tumor cells are deficient in the removal mechanism.8 Such cells are designated as phenotype "Mer-," while the normal cells, capable of removing the monoadduct from O6, are termed "Mer+." A better understanding of the molecular basis of the activity of nitrosoureas is obtained by complementing experimental data with ab initio calculations. Ab initio calculations were performed on nitrosourea systems with two goals: (a) to help elucidate the mechanism of DNA alkylation by the electrophilic species R-N=N-OH, which are produced by the decomposition of nitrosoureas, and (b) to investigate possible effects of the presence of alkali metal cations on the activity of nitrosoureas and the alkyldiazohydroxides resulting from their decomposition. The first objective was accomplished by performing ab initio Hartree-Fock calculations on some diazohydroxide compounds, such as methyl, ethyl, and chloroethyl diazohydroxides.
The main goal of these calculations was to compare the LUMO composition and charge distribution between the different species and their conformers. The question to be answered was whether or not frontier orbital theory could explain the experimentally found preference for attack at O6 or N7 of guanine. Indeed, if the alkylation of DNA proceeds through a soft nucleophile attack, the LUMO composition of the electrophile is of great importance. If the reaction occurs via a hard nucleophile attack, the net atomic charges are more significant, since the interaction with DNA is electrostatically controlled. This approach is called the Hard-Soft Acid-Base (HSAB) concept.9 Therefore, guanine itself had to be subjected to ab initio calculations in order to obtain the HOMO participation and net atomic charges of O6 and N7. Calculations performed on the E and Z methyl and ethyl diazohydroxides,10,11 complemented by the calculations on guanine, have thus explained the experimental finding that methylnitrosourea alkylates mostly the N7 of guanine, while ethylnitrosourea alkylates the O6 site with a better ratio to N7 than the methyl species. In order to gather more information about this problem, it was deemed worthwhile to follow the energetics of the alkylation reaction of water by methyl-, ethyl-, and fluoroethyldiazonium ions. The main goal of these calculations was to establish whether transition-state calculations can provide information about the hard versus soft electrophilic character of these species.12 Computations at the Hartree-Fock and MP2 levels were performed using the 6-31G* basis set. It was found that, both at the Hartree-Fock level and when correlation energy effects were included, the ethyl and fluoroethyl species do not show the presence of a transition state, while the methyl species shows a transition state with a small barrier. It was concluded that transition-state computations cannot shed light on the character of these species.
The interaction between H+, Li+, and Na+ cations and nitrosourea, as well as methyl, ethyl, and chloroethyl diazohydroxides (E and Z isomers) has been investigated with ab initio calculations, using the STO-6G and the 3-21G basis sets.13 The resulting affinities show
a strong binding of the cations to these molecules. It was suggested that administering the nitrosoureas in conjunction with cations such as lithium could somewhat enhance the effect of the drug. Experimental studies of the decomposition of a specific nitrosourea, 1-2-chloroethylsulphinylethyl-3-cyclohexyl-1-nitrosourea, as well as of its alkyl-substituted analogs, have provided evidence for the intermediacy of novel 1,2-oxathietanes.14 These molecules were studied with ab initio calculations,15 using the 3-21G* basis set, which adds d functions to the sulfur atoms. 1,2-Oxathietane, as well as its products of decomposition SO, C2H6, H2CO, and H2CS, have been geometry optimized and their energies have been calculated. In addition, a mechanism of cleavage via a biradical intermediate was investigated. Ab initio calculations showed such a mechanism to be plausible. Another study16 investigated the effect of benzene ring fusion on the reactivity of 1,2-oxathietane. Ab initio calculations were performed using the 3-21G* and the 6-31G* basis sets, at the Hartree-Fock and MP2 levels. It was found that the allowed (8s + 2s) cycloreversion is energetically unfavorable. A subsequent experimental and theoretical study17 favors biradical intermediates in the valence tautomerism of benzoxathiete to monothio-o-benzoquinone. While nitrosoureas show anti-tumor activity, they might also show carcinogenic activity. In contrast, nitrosamines are strongly carcinogenic. However, the ultimate DNA-attacking agent, the diazonium species, is the same for nitrosoureas and nitrosamines. The difference is that nitrosoureas decompose spontaneously, while nitrosamines require an enzymatic oxidation in order to decompose. The oxidation which was found to produce carcinogenic species is the hydroxylation at the α-position to the N-nitroso group. The α-hydroxynitrosamines are very unstable, so it has been proposed that their esters may represent the transportable forms.
Another class of DNA alkylating agents, the mitomycins, proved to be most promising in clinical trials. Among these, mitomycin C, shown in Fig. 6.1, exhibits significant anti-tumor activity. Its mechanism of activation consists of a complex bioreductive process. The first step is the reduction to hydroquinone, followed by a loss of methanol. This reaction facilitates the opening of the aziridine ring, resulting in the formation of an amine group. Upon the rearrangement of the π bonds, mitomycin becomes a bifunctional alkylating agent.20 An alternative mechanism proposed involves the opening of the aziridine ring upon DNA attack.21 In the absence of reducing agents, mitomycin C can be activated by reducing the pH of the medium. This is called "acidic activation," and it triggers the opening of the aziridine ring. It was found that while the reductive activation leads to attack at N2, the acidic activation leads to preferential attack at N7. This difference has been explained by Tomasz et al.22 as being due to hard versus soft electrophilic attack on guanine. This is an area where quantum chemical calculations are useful.23 More recently24 it has been suggested that there is increased reductive activation at low pH. Metal complexation or enzymatic protonation were also thought to activate mitomycin.25 Consequently, Sapse et al.26 calculated the affinities of the Li+ and Na+ ions for the aziridine and ethylene oxide rings and compared them to the proton affinities of these species. While aziridines, as part of mitomycin, exhibit anti-tumor activity, ethylene oxides are carcinogens, and they are also activated by ring opening. In addition to affinities, that work also calculated the activation energy for the aziridine ring opening under nucleophilic attack by ammonia in the presence of an ammonium ion attached to the nitrogen of the aziridine ring. The calculations were ab initio, using the 6-31G* and the 6-21G basis sets, as implemented by the Gaussian-80 computer program, with complete geometry optimization of all the species under investigation.
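The cation affinities discussed here are differences between the total energies of separately optimized species. A minimal sketch of that bookkeeping follows, assuming total energies in hartree are already available from some quantum-chemistry program. The function name and the numerical energies in the usage lines are placeholders chosen for illustration, not values from the studies cited, and zero-point and thermal corrections are ignored.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor


def cation_affinity(e_base, e_cation, e_complex):
    """Affinity (kcal/mol) of a neutral base B for a cation X+.

    Defined as the energy released by B + X+ -> BX+, that is
    E(B) + E(X+) - E(BX+), with all total energies in hartree.
    For a bare proton E(X+) is zero, since H+ has no electrons.
    """
    return (e_base + e_cation - e_complex) * HARTREE_TO_KCAL_PER_MOL


# Placeholder energies (hartree), used only to exercise the function.
e_base = -133.000      # hypothetical optimized base
e_li = -7.235          # hypothetical Li+ energy
e_complex = -140.310   # hypothetical optimized base-Li+ complex
print(f"{cation_affinity(e_base, e_li, e_complex):.1f} kcal/mol")
```

A positive result means the complex is more stable than the separated fragments. Comparing the same difference across basis sets is one simple way to see how strongly a small basis set inflates the affinity.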
It was found that the proton affinities of aziridine and ethylene oxide were 232.18 and 193.87 kcal/mol, respectively, at the 6-31G* level, while the corresponding lithium affinities were 47.28 and 43.45 kcal/mol. The activation energy for attack by NH3 on the aziridine ring bearing an ammonium ion attached to the nitrogen was found to be 43.6 kcal/mol for a cis attack and 30.7 kcal/mol for a trans attack, using the MINDO/3 semi-empirical method. These energies are slightly higher than those calculated by Hopfinger27 for the protonated aziridine ring, which are 36 and 23 kcal/mol, respectively. In a subsequent study, the proton and lithium ion affinities of aziridine as part of the mitomycin molecule were calculated.28 Ab initio calculations were performed using the minimal STO-3G basis set, whose use was made necessary by the size of the system. However, the results significantly overestimated the affinities, owing to the basis set superposition error.29 In order to estimate the error, the lithium ion affinity of two smaller systems, acetylene and phenol, was calculated with the STO-3G, 6-31G, and 6-31G* basis sets, presuming that larger basis sets decrease the superposition error. Using these results for mitomycin (even though, mitomycin being a larger molecule, the error should be larger), a value of 42 kcal/mol for the lithium ion affinity of the aziridine ring was obtained, which is not very different from the one obtained for aziridine itself. These calculations were performed in the gas phase, without taking the solvent into consideration. However, it is probable that in an aqueous medium the lithium affinity of aziridine will be smaller, since a lithium ion dissolved by itself in water will feature a larger energy of interaction with the solvent than a lithium ion attached to a neutral molecule.
Indeed, according to Born's equation,30 the energy of interaction between a solvent and a charged entity increases with the dielectric constant of the solvent and with the charge of the solute, and decreases with the size of the cavity formed by the solute in the solvent. Gersten and Sapse31 have extended the Born equation to systems where charges are set at different points in the cavity. The form of the equation for a spherical cavity is:
where Q are the individual charges located at coordinates r, P are Legendre polynomials, ε is the dielectric constant of the solvent, and a is the radius of the cavity. The equation has been examined for a number of systems,32 among them lithium aziridine and lithium oxirane. In order to implement the above equation, the position and magnitude of the charges inside the cavity have to be known. The positions were obtained by Hartree-Fock ab initio geometry optimization, using the 6-31G and the 6-31G* basis sets, and the charges were obtained by Mulliken population analysis. To examine the effect of the solvent on the binding energy of lithium to aziridine and oxirane, we calculated the interaction energy with the solvent for the complexes, for the lithium ion, and for aziridine and oxirane. By subtracting the sum of the solvation energies of the lithium ion and aziridine or oxirane from the interaction energy of the complex, the energy of stabilization or destabilization of the complex formation is found and added to the gas-phase energy of formation. It was stated before that the gas-phase lithium affinities of aziridine and oxirane were found to be 47 and 43 kcal/mol, respectively. These values might be somewhat overestimated, since the Born formula used to compute the electrostatic attraction between Li and water may underestimate it. It can thus be seen that in certain situations the ab initio gas-phase computations should be augmented by solvent effects in order to afford reliable results.
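The ingredients named above (point charges at known positions, Legendre polynomials, a dielectric constant, and a cavity radius) also appear in Kirkwood's classical multipole expansion for charges inside a spherical cavity. The sketch below implements that classical expansion, which reduces to the Born formula for a single centered charge. It is an assumption that this matches the structure of the Gersten and Sapse expression, and nothing here is taken from their paper. The function names, the truncation order, and the choice of a cavity interior with unit dielectric constant are illustrative only. Charges are in units of e, distances in angstrom, and energies in kcal/mol.

```python
import math

COULOMB_KCAL = 332.0637  # e^2/(4*pi*eps0) in kcal*angstrom/mol


def legendre(n, x):
    """Legendre polynomial P_n(x) from the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def born_energy(charge, radius, eps):
    """Born energy of a single charge at the center of a spherical cavity."""
    return -COULOMB_KCAL * charge ** 2 / (2.0 * radius) * (1.0 - 1.0 / eps)


def kirkwood_energy(charges, positions, radius, eps, n_max=20):
    """Solvent interaction energy of point charges in a spherical cavity.

    positions are (x, y, z) tuples measured from the cavity center;
    n_max truncates the multipole sum (an arbitrary illustrative choice).
    """
    norms = [math.sqrt(x * x + y * y + z * z) for x, y, z in positions]
    if any(r >= radius for r in norms):
        raise ValueError("all charges must lie inside the cavity")
    total = 0.0
    for qi, pos_i, ri in zip(charges, positions, norms):
        for qj, pos_j, rj in zip(charges, positions, norms):
            if ri > 0.0 and rj > 0.0:
                dot = sum(a * b for a, b in zip(pos_i, pos_j))
                cos_t = max(-1.0, min(1.0, dot / (ri * rj)))
            else:
                cos_t = 1.0  # a centered charge contributes only to n = 0
            for n in range(n_max + 1):
                weight = (n + 1) * (1.0 - eps) / ((n + 1) * eps + n)
                radial = (ri * rj) ** n / radius ** (2 * n + 1)
                total += 0.5 * qi * qj * weight * radial * legendre(n, cos_t)
    return COULOMB_KCAL * total


def binding_energy_in_solvent(gas_binding, solv_complex, solv_ion, solv_base):
    """Gas-phase binding energy plus the solvation difference.

    All arguments are energies with stabilizing values negative.
    """
    return gas_binding + (solv_complex - solv_ion - solv_base)
```

Because a bare ion in a small cavity is solvated more strongly than the same charge spread over a larger complex, the solvation difference in the last function is typically positive, which weakens the binding relative to the gas phase.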
Anti-Metabolites
An antimetabolite interferes with the normal cellular metabolites. For instance, it can act as an inhibitor of one or more enzymes whose substrates are metabolites. Others are incorporated into macromolecules instead of the metabolites. Development of antimetabolites exhibiting anti-cancer activity met with the greatest success for analogues of metabolites involved in the biosynthesis of nucleic acids and of cofactors containing nitrogenous bases. Compounds such as 5-fluorouracil and methotrexate are remarkably effective against human cancers, even though they feature host toxicity. 5-Fluorouracil, as well as other fluoropyrimidines, is converted to monophosphate species that inhibit thymidylate synthase (the enzyme which catalyzes the conversion of deoxyuridine monophosphate to thymidine monophosphate by a methyl transfer). These compounds are also converted to the triphosphates, which are then incorporated into RNA, impairing its function. Incorporation of such residues into DNA also leads to the impairment of DNA. All three mechanisms contribute to the cytotoxicity of these molecules. However, complete remissions necessitate the use of these drugs in conjunction with other anti-cancer agents. Folic acid and its metabolites, called folates, are essential to the cell's functions. They act as coenzymes in many biochemical processes. Folate-dependent enzymes are vital to rapidly dividing cell populations, such as neoplastic or normal stem cells. Therefore, they are a target for anti-folates in anti-cancer treatment. Among these drugs, the dihydrofolate reductase (DHFR) inhibitors are used clinically with a certain amount of success. They belong to two major classes: the classical anti-folates, which feature a polar amino-acid side chain terminus, and those containing nonpolar side chains, called lipophilic or nonclassical anti-folates.
DHFR has been the object of intense research for the last few decades. The enzyme catalyzes the NADPH-dependent reduction of 7,8-dihydrofolate to 5,6,7,8-tetrahydrofolate, a chemical which participates in the thymidylate synthesis cycle. Thus, the enzyme is crucial in the synthesis of thymidine monophosphate as well as in various one-carbon unit transfer reactions. Among its inhibitors are methotrexate (MTX), trimethoprim, and other derivatives of pyrimidines, triazines, pteridines, and related heterocyclic compounds. Some of these inhibitors, such as MTX, bind more tightly to the Escherichia coli enzyme than does the substrate dihydrofolate. This fact has been attributed to ion-pair formation between protonated MTX and a negative carboxyl, presumably Asp-27, as well as to hydrophobic interactions.33 One of the greatest problems in treatment with MTX and other anti-folates is that the cancer cells develop immunity to the drugs. It has been found34 that this immunity is due mainly to DHFR mutations in which some amino-acid residues are replaced by others which do not bind to anti-folates. The desire to better understand the mechanism of binding of anti-folates to DHFR, so that this problem can be remedied, has led to numerous experimental studies. In addition, theoretical studies have complemented the attempts to elucidate the mechanism. In order to gain insight into the DHFR-MTX binding, Singh and Benkovic33 performed free-energy perturbation studies, using Zwanzig statistical perturbation theory35 as implemented in molecular dynamics. The program used was Amber, with the electrostatic partial charges of the MTX fragments and of the perturbed amino-acid residues computed ab initio with the 6-31G* basis set. The study concludes that Tyr 31 has a better interaction with the inhibitor than Phe 31, which suggests that the desolvation of the phenol group is important in the free energy of binding.
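The statistical perturbation formula underlying such free-energy perturbation studies estimates the free-energy difference between two states from energy differences sampled in one of them: ΔA = -kT ln⟨exp(-(U1 - U0)/kT)⟩, averaged over state 0. A minimal sketch of that estimator follows. The function name and the sample values are hypothetical, and in practice the samples would come from a molecular dynamics trajectory rather than a hand-written list; this is not the Amber implementation.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def zwanzig_free_energy(delta_u, temperature=300.0):
    """Free-energy change (kcal/mol) from samples of U1 - U0 (kcal/mol).

    The samples must be collected in the reference state 0. The result is
    -kT * ln(mean(exp(-delta_u / kT))).
    """
    kt = K_B_KCAL * temperature
    # Shift by the smallest sample so the exponentials cannot overflow.
    shift = min(delta_u)
    mean_exp = sum(math.exp(-(du - shift) / kt) for du in delta_u) / len(delta_u)
    return shift - kt * math.log(mean_exp)


# Hypothetical energy differences (kcal/mol) between a perturbed and a
# reference Hamiltonian, evaluated on reference-state configurations.
samples = [0.8, 1.1, 0.9, 1.3, 1.0]
print(f"{zwanzig_free_energy(samples):.3f} kcal/mol")
```

Because the exponential average is dominated by the lowest energy differences, the estimate converges well only when the two states overlap substantially, which is why a mutation such as Phe to Tyr is usually split into many small perturbation windows.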
Indeed, in the mutant, the Tyr 31 phenolic group is almost parallel to the MTX pteridine group, while in the wild-type complex, the phenyl group of Phe 31 is perpendicular. The hydrophobic interactions are more active in the binding than the Asp 27 salt bridge. Other theoretical calculations comprise QSAR studies, such as those undertaken by Koehler et al.36 and Doweyko.37 Since hydrophobic interactions contribute largely to the DHFR-inhibitor binding, it was deemed interesting to apply ab initio methods to the description of aromatic-aromatic interactions. Aromatic side chains of amino acids such as phenylalanine, tryptophan, and tyrosine are generally found in the interior of proteins, in hydrophobic regions. In some proteins they mediate helix-helix contacts. It is to be expected that agents containing aromatic groups could interact with proteins via aromatic-aromatic interactions, as, for instance, proven by X-ray studies of biphenyl compounds which inhibit sickle-cell hemoglobin gelation. Burley and Petsko38 define aromatic-aromatic interactions as the interactions between pairs of aromatic side chains set at distances between 4.5 and 7 Å. The systems they studied show energies of interaction between 0 and 2 kcal/mol. Levitt and Perutz39 calculated energies of hydrogen bonding between aromatic rings and polar groups, which they termed "aromatic hydrogen bonds." They calculated those energies to range between 2.7 and 4.9 kcal/mol, using potential energy functions consisting of a van der Waals term and an electrostatic term. Such interactions might play a role in the inactivation of some enzymes by inhibitors, as in the inactivation of chymotrypsin by 6-chloro-2-pyrone compounds or in the carboxypeptidase A-inactivator complex.40 The simplest aromatic-aromatic system, the benzene-benzene complex, has been studied by Jorgensen and Severance41a using intermolecular potential functions in a Lennard-Jones plus Coulomb format. Energies of interaction for several conformations range from 1.70 to 2.31 kcal/mol, with distances between the ring centers between 3.77 and 5.19 Å. Sapse et al.41b performed ab initio calculations on benzene-benzene as a model for the interaction between a phenylalanine residue and the benzoyl portion of a typical folate compound. In addition, that study examined the complexes formed between a benzene ring and a polar serine which acts as a hydrogen donor, between a benzene ring and the amino group of a meta diamino-imidazole (MDI) group (present in some anti-folates), and between serine as a hydrogen donor and the diamino-imidazole group as the electron donor. The basis sets used for the calculations were the STO-3G, the 3-21G, and the 6-31G basis sets, as implemented by the Gaussian-88 computer program. The energies of interaction were computed using the supermolecule approach, in which the sum of the energies of the subsystems is subtracted from the energy of the complex. All the species were geometry optimized. Since aromatic-aromatic interactions are due largely to dispersion forces, the correlation energy should account for most of the binding. In order to estimate it, MP2/6-31G* calculations had to be performed. Owing to the required computational effort, these calculations were performed only for the benzene-benzene complex, and it was found that they indeed account for most of the binding energy (2.5 kcal/mol versus 0.6 kcal/mol for the Hartree-Fock calculation). This value has been added to the Hartree-Fock energies of the other systems, resulting in interaction energies between 1.9 and 3.6 kcal/mol. In the complex formed by a hydrogen of one benzene ring attracted to the center of the other benzene ring, the distance between the hydrogen and the center of the ring is found to be 3.15 Å at the Hartree-Fock level and 2.6 Å at the MP2/6-31G* level.
The value of 2.6 Å is identical to the distance between a hydrogen of the phenyl ring of Phe 34 and the diamino-imidazole moiety of MTX in the human DHFR complex with MTX, as visualized with the Quanta computer modeling program. This result shows that these systems require correlation energy calculations. The highest energy of interaction for these systems is obtained for the complex formed by a hydrogen of MDI and the center of a benzene ring. The lowest energy is obtained for the interaction between serine and MDI. This result agrees with experimental findings that mutations which change Phe 31 to serine significantly reduce MTX binding to DHFR.42 Mutations of Phe 34 to serine also greatly reduce MTX binding. The difference in binding energies between MTX and benzene (as a model for phenylalanine) and MTX and serine cannot account for such large differences in the binding. It is probable that serine is set at a much larger distance from MTX, destroying the interaction completely, and Phe 34 might interact with a second site of MTX in the wild-type enzyme. Another class of DHFR inhibitors features a triazine ring attached to larger moieties. Welsh43 used AM1 calculations to obtain qualitative proton affinities for the different nitrogens of triazines. Sapse et al.44 have performed ab initio calculations at the Hartree-Fock level, using the 6-31G basis set, in order to obtain accurate values for the proton affinities. In addition, they have calculated the energy of interaction between the triazine ring and a carboxylate ion, as shown in Figs. 6.2 and 6.3, as well as the energy of interaction between the triazine ion and a formamide molecule, as shown in Fig. 6.4. The highest protonation energy, 261.6 kcal/mole, has been found for N1, followed by 243.4 kcal/mole for N3 and 195.0 kcal/mole for N4 (the amino nitrogen). The complexes formed with the carboxylate ion are of two kinds: those of the neutral triazine ring and those of the protonated one.
For the former, a binding energy of 32.72 kcal/mole is obtained. The latter is examined in two conformations: one which is a zwitterion, with the proton set on the triazine ring, and which features a binding energy of 131.4 kcal/mole, and one which sets the proton on the carboxylate ion, neutralizing its negative charge and thus giving rise to a neutral complex which is not bound. The binding energy between the amino group of the triazine and a formamide molecule is 11.7 kcal/mole. The fact that N1 is preferentially protonated is in agreement with crystal data obtained for free triazines and for enzyme-bound triazines in ternary complex with enzyme and enzyme cofactor (NADPH)45 and also with the difference spectroscopy evidence46 that the N1 of the DHFR-bound MTX is protonated. As the above examples have shown, the application of quantum-chemical calculations to the study of DHFR-inhibitor interaction can shed additional light on the problem.
Fig. 6.2
Fig. 6.3
Fig. 6.4
Lexitropsins and Hoechst Agents
The regulation of gene expression by control proteins (promoters, repressors) in both prokaryotes and eukaryotes requires the specific recognition of both single-stranded and double-stranded nucleic acid base sequences. Control proteins and xenobiotics utilize two different channels of information in reading the base sequence in double-helical DNA. In general, the major groove is employed by control proteins, with some exceptions when interactions extend to the minor groove, while the minor groove is utilized by certain polymerases and antibiotics. From the point of view of artificial control of gene expression, the minor groove offers advantages in that it is more accessible to attack than the major groove. This may be the reason for the evolutionary development of minor-groove selective antibiotics by micro-organisms to combat competitors. Nature provides examples of sequence-selective minor-groove binders, such as the antibiotics netropsin, distamycin, anthelvencin, and kikumycin. Evidence from biochemical pharmacology indicates that these oligopeptides act to block the template function of DNA by binding selectively to AT sequences in the minor groove.
Analysis of the structural requirements for the molecular recognition of DNA by netropsin and distamycin suggested that appropriate structural modification of the antibiotic could lead to an alteration of the DNA sequence recognition. The resulting compounds have been called "lexitropsins," or information-reading molecules. While the natural products exhibit only moderate anti-cancer and anti-viral activities, the lexitropsins to which they give rise are capable of recognizing longer and alternative sequences and exhibit considerably enhanced potency as anti-cancer and anti-viral (including HIV-1) agents. Thus, systematic examination of the structural requirements for the molecular recognition of DNA by natural oligopeptides suggested that appropriate structural modification of the parent molecule could provide more potent agents, not only in terms of cytotoxicity but also with enhanced inhibitory action against important cellular targets including topoisomerases and reverse transcriptases.47 The components of molecular recognition that determine the sequence selectivity of these molecules include electrostatic attraction between ligand and DNA, hydrogen bonds from the amide NH groups, which bridge the strands, to the exposed adenine N3 and thymine O2 on adjacent sites, and hydrogen bonds between inward-directed heteroatoms and guanine NH2 sites. The semantophoric, or information-reading, process is evidently carried out largely by van der Waals nonbonded contacts between the ligand and the surface of the minor groove. Such hydrophobic interactions have also been invoked in the recognition of the lac repressor.48 Pullman and co-workers49 performed theoretical calculations on lexitropsins, using empirical methods, and concluded that the binding of the drug to DNA is due not only to hydrogen bonding but also to electrostatic effects, mainly the interaction between the electrostatic field in the groove of DNA and the field of the drug. Pullman proposed that the lower affinity of netropsin and distamycin for GC sequences as compared to AT sequences is due not only to the steric hindrance afforded by the guanine NH2 but also to the fact that the GC sequences feature a less negative potential than the AT sequences and, consequently, interact less efficiently with monocationic or biscationic oligopeptide antibiotics. Accumulating experimental evidence from complementary strand footprinting, high-field NMR analysis of ligand-oligonucleotide complexes, and microcalorimetry50 has provided tests for the theoretical predictions. A number of studies of lexitropsins and Hoechst agents make use of ab initio calculations to complement the experimental and molecular modeling results. Due to the large size of most of these molecules, it is necessary to perform fragment-wise calculations if a larger basis set is to be used. For smaller systems, the whole molecule can be subjected to geometry optimization. Two of the small lexitropsins investigated with ab initio calculations are amidinomycin and noformycin.51 The former was isolated from Streptomyces flavochromogenes and was shown to have the 1R,3S configuration. The latter was isolated from Nocardia formica and shown to feature the 4S(+) configuration. Both have been shown to exhibit activity against a variety of plant and animal viruses.52 These two molecules, shown in Figs. 6.5 and 6.6, feature an amidine group at one end and, at the other, an aminopyrrolidinium ion in noformycin, replaced by an aminocyclopentane ring in amidinomycin. Both molecules have only one peptidic bond. While larger molecules feature greater biological potency, shorter oligopeptides seem to be more discriminating in terms of sequence preference. Another point of interest of these molecules is the presence of chiral carbons, which may play an important role in the recognition of DNA.
For example, chiral ligands should be isohelical with the minor groove, and indeed, marked differences were observed in the binding of 4S(+)-anthelvencin and dihydrokikumycin, which are natural products, and their 4R enantiomers. As the size of the molecule decreases and the chiral center accounts for a greater portion of it, the difference in binding increases. The charges of the molecule are important for binding. When the amidine group accepts a proton and becomes an amidinium ion, noformycin and amidinomycin become biscationic and bind strongly to AT sequences. These two molecules have large pKa values at both basic sites and, as such, are biscationic at physiological pH. The molecules were subjected to complete geometry optimization via ab initio calculations, using the 6-31G basis set. The energies of the molecules, as well as the net atomic charges, were calculated.
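One quantity that can be derived from a set of net atomic charges is a point-charge estimate of the dipole moment. The sketch below is illustrative only: the charges and coordinates are hypothetical, and the point-charge sum is an approximation to, not a substitute for, the dipole moment obtained from the full ab initio wave function. For a charged species such as a biscationic ligand, the result also depends on the choice of origin, which here is simply the coordinate origin.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e*Å expressed in debye

def point_charge_dipole(charges, coords):
    """Dipole magnitude (debye) from net atomic charges (e) and positions (Å)."""
    # Component-wise sum of q_i * r_i over all atoms.
    mu = [sum(q * r[k] for q, r in zip(charges, coords)) for k in range(3)]
    return E_ANGSTROM_TO_DEBYE * math.sqrt(sum(c * c for c in mu))

# Hypothetical two-site model: +0.5 e and -0.5 e separated by 1 Å.
print(point_charge_dipole([0.5, -0.5], [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]))
```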
Fig. 6.5
Fig. 6.6
The geometrical parameters obtained for amidinomycin were in good agreement with experimental values. No experimental data were found for noformycin. In both molecules, the rings exhibit puckering, with the cyclopentane ring of amidinomycin lying orthogonal to the plane of the rest of the molecule. The barrier to rotation around the C4-C5 bond in noformycin was found to be about 6 kcal/mole at the 3-21G level, with the value of 180.0° for this torsional angle corresponding to the lowest energy (-655.6086 au). The molecules in their most stable configuration were displayed using the Quanta computer graphics program and were inserted into the minor groove of a B-DNA fragment, (GCGAATTCGC)2, from which the water molecules had been removed. The binding energy is the sum of van der Waals and electrostatic interactions. For amidinomycin, the 1S-3R, 1R-3R, 1R-3S, and 1S-3S isomers were considered. The strongest binding isomer was 1S-3R, followed closely by the other isomers, in the order RR, RS, and SS. Since the difference within the pairs SS and RS, and RR and SR, is small, while the difference between pairs is larger, it appears that the second chiral center is responsible for most of the difference in binding between isomers. For noformycin, the two isomers differ considerably in binding abilities, with the S isomer binding more strongly than the R isomer. In general, noformycin exhibits greater binding than amidinomycin. This is probably due to strong hydrogen-bond formation and to a greater dipole moment (7.69 D for noformycin vs. 5.58 D for amidinomycin, as calculated by ab initio methods). Both molecules show a preference for the AT sequence, but bind somewhat to a GC DNA fragment. One of the crucial factors determining effective minor-groove binding of ligands is related to the repeat distance of the nucleotide units of DNA and the van der Waals and hydrogen-bond contacts generated by the interactions. This is sometimes referred to as the phasing problem.
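A binding energy expressed as a sum of van der Waals and electrostatic interactions can be sketched as a double loop over ligand and receptor atoms. The text does not give the functional form or parameters used by the modeling program, so everything specific below is an assumption: a 12-6 Lennard-Jones term plus a Coulomb term with no dielectric screening, geometric-mean combining rules, and made-up atom parameters.

```python
import math

COULOMB_KCAL = 332.06  # kcal*Å/(mol*e^2), the constant commonly used in force fields

def pair_energy(r, eps, sigma, qi, qj):
    """Lennard-Jones plus Coulomb energy (kcal/mol) for one pair at distance r (Å)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6) + COULOMB_KCAL * qi * qj / r

def nonbonded_energy(ligand, receptor):
    """Sum of pair energies over all ligand-receptor atom pairs.

    Each atom is a tuple (x, y, z, charge, eps, sigma). Geometric-mean
    combining rules for eps and sigma are an assumption of this sketch.
    """
    total = 0.0
    for xa, ya, za, qa, ea, sa in ligand:
        for xb, yb, zb, qb, eb, sb in receptor:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            total += pair_energy(r, math.sqrt(ea * eb), math.sqrt(sa * sb), qa, qb)
    return total

# Hypothetical example: one cationic ligand atom 3.5 Å from one anionic receptor atom.
ligand = [(0.0, 0.0, 0.0, 1.0, 0.1, 3.5)]
receptor = [(3.5, 0.0, 0.0, -1.0, 0.1, 3.5)]
print(nonbonded_energy(ligand, receptor))
```

With this sign convention a negative total indicates net attraction; the binding energies quoted in the text are reported as positive magnitudes.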
It has been shown53 that as the ligand increases in size, the hydrogen-bond and van der Waals contacts become out of phase with the spacing between nucleotides. In the case of noformycin and amidinomycin, the N1-N4 and N1-N5 distances, ranging from 9.05 to 9.26 Å (for the 1S,3R isomer), are close to the "ideal" value of 9.2 Å found for a potent minor-groove binder. In contrast, the natural isomer of amidinomycin, which is 1R,3S, features distances of 9.71 and 9.50 Å, and this might explain its reduced binding. Ab initio calculations have also been applied to certain prototype lexitropsins in which one of the N-methylpyrrole moieties of netropsin was replaced by 1-methylimidazole.54 The rationale for this replacement was an attempt to invoke hydrogen bonding between the
inward-directed imidazole nitrogen and the NH2 group of guanine in the minor groove, in order to make these lexitropsins more prone to binding to GC sequences. The molecules were too large to allow complete geometry optimization using a double-zeta basis set. Accordingly, each molecule was split into three overlapping segments, which were geometry optimized at the Hartree-Fock level using the 6-31G basis set. The inter-segment parameters, including dihedral angles, were optimized using the STO-3G basis set, keeping the rest of the parameters at their 6-31G optimized values. The conformation in which the two central rings form an angle of 63°, the guanidinium group is co-planar with the ring it is attached to, and the amidinium group is set at 99° to the ring it is attached to was found to be the most stable. The optimized molecule was modeled with the Quanta program and inserted into the minor groove of a B-DNA segment, (GCGAATTCGC)2, with AATT being the site targeted for insertion, after the water molecules were removed. The procedure for insertion made use of the rotational-translational matrix, aiming for the minimum energy. Besides the optimized conformation, the binding of two other conformations was investigated: a completely planar one and one which kept the angle between the rings at 63° but positioned the rest of the molecule co-planar with the respective rings. These two conformations were found to be higher in energy than the optimized one by 4.5 kcal/mole and 7.9 kcal/mole, respectively. This last conformation afforded the best binding to DNA, due to the fact that the guanidinium and amidinium moieties did not engage in steric clashes with the interior of the minor groove. The binding of the molecule to DNA fragments containing GC sequences has been found to be substantially lower, in spite of the possibility of additional hydrogen-bond formation.
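Inserting a rigid ligand by means of a rotational-translational matrix means applying a rotation and a translation to every atomic coordinate and keeping the pose of lowest energy. The text does not describe how the search was carried out, so the coarse grid search below is an assumption made purely for illustration, as are the restriction to rotations and shifts along a single axis, the function names, and the toy scoring function.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix for a rotation by theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(coords, rot, trans):
    """Apply r' = R r + t to each point; the internal geometry is unchanged."""
    return [
        tuple(sum(rot[i][k] * p[k] for k in range(3)) + trans[i] for i in range(3))
        for p in coords
    ]

def best_pose(coords, score, angles, shifts):
    """Grid search over z-rotations and z-translations; returns (energy, angle, shift)."""
    best = None
    for angle in angles:
        for dz in shifts:
            pose = transform(coords, rotation_z(angle), (0.0, 0.0, dz))
            energy = score(pose)
            if best is None or energy < best[0]:
                best = (energy, angle, dz)
    return best

# Toy score (hypothetical): penalize the distance of the first atom from z = 2 Å.
ligand = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_score = lambda pose: (pose[0][2] - 2.0) ** 2
print(best_pose(ligand, toy_score, [0.0, math.pi / 2], [0.0, 1.0, 2.0, 3.0]))
```

In practice the score would be a ligand-DNA interaction energy and the search would cover all six rigid-body degrees of freedom; the restricted search here only keeps the sketch short.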
The concept of replacing the pyrrole rings of naturally occurring oligopeptides by heteroatomic rings capable of acting as hydrogen acceptors, in order to bind with the NH2 group of guanine acting as a proton donor, raises the question of how the proton affinities of different heteroatomic rings compare. To answer this question, we performed molecular orbital calculations of the proton affinities of N-methyl imidazole, N-methyl oxazole, and N-methyl thiazole. The Hartree-Fock calculations used the 6-31G basis set as implemented in the Gaussian 86 computer program.55 Protonated and unprotonated species were calculated, and the proton affinities were obtained as the difference between the energy of the protonated species and the energy of the unprotonated ones. In addition, 6-31G* calculations of the energy were performed using the 6-31G optimized geometry. The proton affinities thus obtained show a substantial decrease from nitrogen, to oxygen, to sulfur. This may be due to the fact that, according to the Mulliken population analysis, the nitrogen and the oxygen of the neutral species are negatively charged, while the sulfur is positive. It is thus not surprising that nitrogen and oxygen attract the proton more readily than sulfur. These results are in agreement with NMR experiments56 which have shown that a thiazole-containing lexitropsin particularly avoids GC sequences. This is due to the large size of the sulfur atom, which increases the steric hindrance, and to the lower ability of the sulfur to act as a proton acceptor. Instead, it is indicated that the thiazole ring intercalates between the DNA bases. Computations performed on an N-methylated imidazole showed that the presence of the methyl group increases the proton affinity of the other nitrogen. In contrast, a peptide substituent at a carbon adjacent to the nitrogen decreases its proton affinity.
This can be explained by the conjugation stabilization of the peptide-substituted neutral form, which is absent in the protonated form. This result seems to indicate that heteroatomic rings linked by
bonds other than peptidic ones might be better proton acceptors and consequently bind better to GC DNA sequences. Recently, considerable attention has been focused on the synthesis of lexitropsins with amide-bond surrogates that are resistant to enzymatic degradation. Several isosteres of the amide bond have been prepared, such as methylenamino, methylenethio, methylenoxy, and others.57 Due to the introduction of the Lawesson reagent, which converts amides into thioamides,58 the study of thiopeptide analogues of substrates and regulators was given new impetus. A number of studies have confirmed the resistance of the thioamide function to enzymatic attack. Studies of the distamycin/DNA complex have shown that the carbonyl group of the peptidic bond does not participate in the binding, and thus can be replaced by a thiocarbonyl group. The thioamide bond in lexitropsins may therefore prevent enzymatic deformylation followed by oxidative degradation, without affecting the binding to DNA. Lown et al.57 have synthesized thioformyldistamycin and reported its resistance to enzymatic and acid-catalyzed hydrolysis. NMR experiments confirmed the existence of E and Z configurations about the thioamide bond. The footprinting methodology was used in order to examine the DNA sequence selectivity of the two forms of the drug. The experimental procedures were complemented by theoretical calculations, in order to determine the preferred conformation of the thioformyl group. Ab initio calculations at the Hartree-Fock level, using the 3-21G* basis set, which places d orbitals on the sulfur, were performed on 3-thioformylamino-N-methylpyrrole as a model for the drug. The Z and E isomers were geometry optimized. The E conformation proved to be more stable than the Z conformation by 17.3 kcal/mole. In order to estimate the rotational barrier between E and Z, a structure with the SC bond perpendicular to the rest of the molecule was examined and found to be higher in energy than E by 26.3 kcal/mole.
This barrier is high enough to prevent the two isomers from interconverting readily. Therefore, two different modes of binding could be anticipated: in the E conformation, the sulfur atom points toward the floor of the minor groove, while in the Z isomer, the nonacidic formyl hydrogen faces the floor of the groove. One important aspect of the binding of lexitropsins to the minor groove of DNA is the presence of positive groups at one or both ends of the long flexible strands of the ligand. As mentioned before, netropsin features a guanidinium moiety at one end and an amidinium group at the other; distamycin features an amidinium group at one end; and anthelvencin, kikumycin, and the smaller compound noformycin exhibit an aminopyrrolidinium group at one end and an amidinium group at the other. These groups provide the positive charges which bind electrostatically to the negative potential present in the minor groove of DNA. However, there is the possibility that they also form hydrogen bonds with the DNA bases, contributing to the binding of the molecule to the minor groove. There is also the possibility that they act as intercalators. Differences between the strengths of binding of these positive groups to the DNA bases might partly account for differences in the binding of lexitropsins. For instance, netropsin and anthelvencin are very similar, except for the fact that the former features a guanidinium group at one end and the latter features an aminopyrrolidinium group. These two molecules show a large difference in their DNA binding, as shown by the experiments of Lown and co-workers,58 who find that netropsin has a binding free energy of 53 kcal/mole while the two enantiomers of anthelvencin have a binding free energy of only 33 kcal/mole. Accordingly, it is interesting to evaluate the binding energy between these positive moieties and a DNA base.59 The complex formed by thymine with the guanidinium ion (Figs.
6.7, 6.8) and the complex formed by thymine with the aminopyrrolidinium ion (Figs. 6.9, 6.10) were subjected to ab initio (Hartree-Fock) calculations using the 6-31G and the 3-21G basis sets, in order to obtain the optimum geometry and the binding energy. The latter is defined as the difference between the energy of the complex and the sum of the energies of the subspecies. As can be seen, Figs. 6.7 and 6.9 show the binding at O1, while Figs. 6.8 and 6.10 show the binding at O2. Thymine and the guanidinium and pyrrolidinium ions were optimized with the 6-31G basis set. The parameters thus obtained were kept frozen in the complexes, and the O1-H5 distances as well as the C1O1H5, C2O2H5, N3H5O1, and N3H5O2 angles were optimized with the 3-21G basis set. The dihedral angles H5O2C2N2 and N3H5O2C2 and the corresponding angles with O1 instead of O2 were also optimized at the 3-21G level, starting with values of 0°, 90°, and 180°. The complexes formed by H5 bound to O1 were found to be slightly less stable than the ones with H5 bound to O2, for both the guanidinium and the pyrrolidinium complexes. Both types of complexes feature bifurcated hydrogen bonds, with O2 bound to H5 and H6. The double binding is achieved at some cost to the linearity of the hydrogen bond. The most stable complexes have been found to be planar. The binding energy of the guanidinium ion is found to be higher by 2.9 kcal/mole than that of the aminopyrrolidinium ion. This difference agrees with the fact that netropsin binds better than anthelvencin to DNA, but it is too small to account for the experimental difference in binding.
Fig. 6.7
Fig. 6.8
Fig. 6.9
Fig. 6.10
The Hoechst Agent
Another class of DNA minor-groove binders comprises Hoechst 33258 and its analogues. Hoechst 33258 is an anthelmintic bis-benzimidazole derivative whose interaction with DNA has been characterized as selective for minor-groove regions rich in AT sequences. However, the drug can also interact with contiguous GC pairs. Electric linear dichroism experiments indicate that the 2-amino group of guanine prevents strong binding of the drug to the minor groove of DNA, so the binding probably proceeds via an intercalating mechanism. A series of benzimidazole analogues has been designed to alter the DNA binding toward different base sequences. It was thought that in order to investigate further the binding of the Hoechst agent to DNA, one must determine its optimum geometry by theoretical calculations and dock the molecule into the minor groove.60 In order to be able to use ab initio calculations with a double-zeta Gaussian basis set (3-21G), fragments of the molecule were subjected to complete geometric optimization. These fragments were the benzimidazole ring, the phenol ring, and the N-methyl piperazine ring.
The geometries thus obtained were kept frozen in the molecule, in which the inter-ring parameters (bond lengths, bond angles, and dihedral angles) were optimized using the STO-3G basis set. Using these results, the Insight II molecular modeling program was used to dock the molecule into the minor groove of three DNA fragments: a (GCGAATTCGC)2 fragment, a (GCGCATATGCGC)2 fragment, and a (GGGGGGGGGGGG)2 fragment. The rationale was to compare the binding to AATT and ATAT segments as well as to segments formed only of G nucleotides. Two tautomers were investigated: the one represented in Fig. 6.11 and the one represented in Fig. 6.12. The first one shows a minimum in energy for the angle between the benzimidazole rings (the N1C7C8C9 dihedral angle) at α = 180.0°. The angles between the benzimidazole rings and the phenol and the N-methyl piperazine rings, β and γ, take values of 0.0° and 8.0°, respectively. The second tautomer shows, for the most stable conformation, α = 0.0°, β = 0.0°, and γ = 52.0°. The best binding to DNA is obtained with the first tautomer featuring α = 60.0° and being attached to the ATAT segment. Lown et al.61 have synthesized analogues of the Hoechst agent in which the benzene moiety of the benzimidazole ring has been replaced by a pyridine moiety. In another series of analogues, one of the nitrogen atoms of the imidazole ring was replaced by oxygen. Two of these compounds have been subjected to fragment-wise geometric optimization using the 3-21G basis set, with the dihedral angles of the whole molecule, as well as the interfragment parameters, being optimized with the STO-3G basis set, keeping the rest of the molecule's parameters frozen at their 3-21G values. Subsequently, the molecules were docked into the minor groove of DNA fragments similar to those described previously. It was found that the compound featuring one nitrogen in each imidazole ring replaced by an oxygen atom binds better to the DNA fragment than the pyridine-containing compound. The best binding is obtained with the AATT DNA fragment; the oxygen-containing compound binds at its optimum geometry, whereas the other compound requires an inter-ring angle of 30° (a geometry differing by almost 10 kcal/mole in energy from the optimum geometry, which features an inter-ring angle of 180°). It is also interesting to notice that the binding proceeds best with the piperazine ring closest to the minor groove and the rest of the molecule "hovering" over it.
Fig. 6.11
Fig. 6.12
There are other anti-tumor drugs which are being investigated with molecular orbital calculations. For instance, two categories of drugs, esperamicins and calicheamicins, are characterized by a ring containing two triple bonds. These bonds open upon DNA attack, giving rise to a biradical which binds covalently to DNA. The opening reaction is currently being studied in our group with ab initio methods (unpublished data). In conclusion, we can say that molecular orbital methods, sometimes in conjunction with molecular modeling methods, are a valuable tool in the investigation of anti-tumor drugs.
References
1. Montgomery, J. A. 1976. Cancer Treat. Rep. 60, 651.
2. Montgomery, J. A., R. James, G. S. McCaleb, and T. P. Johnston. 1967. J. Med. Chem. 10, 668.
3. Lown, J. W., and S.M.S. Chauhan. 1982. J. Org. Chem. 47, 851.
4. Lown, J. W., S.M.S. Chauhan, R. R. Koganty, and A.-M. Sapse. 1984. J. Amer. Chem. Soc. 106, 6401.
5. Loveless, A. 1969. Nature 223, 206. Loveless, A., and C. L.
Hampton. 1969. Mutat. Res. 7, 1. Singer, B. 1977. J. Toxicol. Environ. Health 2, 1279. Lawley, P. D. 1980. Brit. Med. Bull. 36, 19. Singer, B., and J. T. Kusmierek. 1982. Ann. Rev. Biochem. 52, 655.
6. Kohn, K. W. 1977. Cancer Res. 37, 1450.
7. Goth, R., and M. F. Rajewsky. 1974. Z. Krebsforsch. 82, 37. Goth, R., and M. F. Rajewsky. 1974. Proc. Natl. Acad. Sci. 71, 639.
8. Day, R. S., C.H.J. Ziolkowsky, D. A. Scudiero, S. A. Meyer, and M. R. Mattern. 1989. Carcinogenesis 1, 21.
9. Pearson, R. B., and J. M. Songstad. 1967. J. Amer. Chem. Soc. 89, 1827. Saville, B. 1967. Angew. Chem. Int. 6, 928.
10. Allen, E. B. 1985. Ph.D. Thesis, City University of New York.
11. Sapse, A.-M., E. B. Allen, and J. W. Lown. 1988. J. Amer. Chem. Soc. 110, 5371.
12. Sapse, A.-M., and D. C. Jain. 1993. J. Phys. Org. Chem. 6, 243.
13. Sapse, A.-M., E. B. Allen, and L. Fugler-Domenico. 1987. Cancer Investigation 5, 559. Sapse, A.-M. 1987. Revue Roumaine de Chimie 31, 1071.
14. Lown, J. W., and R. R. Koganty. 1983. J. Amer. Chem. Soc. 105, 126.
15. Naghipur, A., J. W. Lown, D. C. Jain, and A.-M. Sapse. 1988. Can. J. Chem. 66, 1890.
16. Naghipur, A., K. Reszka, A.-M. Sapse, and J. W. Lown. 1989. J. Amer. Chem. Soc. 111, 258.
17. Naghipur, A., K. Reszka, J. W. Lown, and A.-M. Sapse. 1990. Can. J. Chem. 67, 625.
18. Demeunynck, M., J. W. Lown, and A.-M. Sapse. 1989. Can. J. Chem. 67, 625.
19. Sapse, A.-M., D. S. Sapse, and D. C. Jain. 1994. Theor. Chim. Acta 88, 111.
20. Szybalski, W., and V. N. Iyer. 1964. Fed. Proc. 23, 946.
21. Moore, H. W. 1977. Science 197, 527.
22. Tomasz, M. 1994. In Topics in Molecular and Structural Biology, 2, 311. S. Neidle and M. Waring, eds. New York, Macmillan.
23. Pullman, A., and B. Pullman. 1980. Int. J. Quant. Chem. Quant. Biol. Symp. 7, 245.
24. Gustafson, D. L., and C. Pristos. 1992. J. Natl. Cancer Inst. 84, 1180.
25. Iyengar, B. S., S. M. Sami, T. Takahashi, E. E. Sikorski, and W. A. Remers. 1986. J. Med. Chem. 29, 1760.
26. Sapse, A.-M., J. D. Bunce, and D. C. Jain. 1984. J. Amer. Chem. Soc. 106, 6579.
27. Kikuchi, O., A. J. Hopfinger, and G. Klopman. 1980. Biopolymers 19, 325.
28. Sapse, A.-M., and J. D. Bunce. 1985. J. Biol. Phys. 13, 39.
29. Boys, S. F., and F. Bernardi. 1970. Mol. Phys. 19, 553.
30. Born, M. 1920. Phys. Z. 1, 45.
31. Gersten, J. I., and A.-M. Sapse. 1981. J. Phys. Chem. 85, 3407.
32. Gersten, J. I., and A.-M. Sapse. 1985. J. Comp. Chem. 6, 481.
33. Singh, U. C., and S. J. Benkovic. 1988. Proc. Natl. Acad. Sci. 85, 9519.
34. Schweitzer, B. I., A. P. Dicker, and J. R. Bertino. 1990. FASEB J. 4, 2441. Benkovic, S. J., C. A. Fierke, and A. M. Naylor. 1988. Science 239, 1105.
35. Zwanzig, H. W. 1954. J. Chem. Phys. 22, 1420.
36. Koehler, M. G., K. Rowberg-Sebacher, and A. J. Hopfinger. 1988. Arch. Biochem. Biophys. 266, 152.
37. Doweyko, A. M. 1988. J. Med. Chem. 31, 1396.
38. Burley, S. K., and G. A. Petsko. 1985. Science 229, 23.
39. Levitt, M., and M. F. Perutz. 1988. J. Mol. Biol. 201, 751.
40. Ringe, D. B., D. B. Seaton, M. Gelb, and R. H. Abeles. 1984. Biochemistry 24, 64.
41a. Jorgensen, W. L., and D. L. Severance. 1990. J. Amer. Chem. Soc. 112, 768.
41b. Sapse, A.-M., B. I. Schweitzer, A. P. Dicker, J. R. Bertino, and V. Frecer. 1992. Int. J. Pep. Prot. Res. 39, 18.
42. Schweitzer, B. I., S. Srimatkandada, H. Gritsman, R. Sheridan, R. Venkataraghavan, and J. R. Bertino. 1989. J. Biol. Chem. 264, 20786.
43. Welsh, W. J. 1990. J. Comp. Chem. 11, 644.
44. Sapse, A.-M., M. C. Waltham, and J. R. Bertino. 1994. Cancer Investigation 12, 469.
45. Voltz, K. W., D. A. Matthews, R. A. Alden, et al. 1982. J. Biol. Chem. 257, 2528.
46. Hood, K., and G. C. Roberts. 1978. Biochem. J. 171, 357. Howell, E. E., J. E. Villafranca, M. S. Warren, et al. 1986. Science 231, 1123.
47. Lown, J. W. 1990. In Molecular Basis of Specificity in Nucleic Acid-Drug Interaction, 103. London, Kluwer Academic Pub. Lown, J. W. 1992. Antiviral Res. 17, 179. Wang, W., and J. W. Lown. 1992. J. Med. Chem. 35, 2890.
48. Wartell, R. M., J. E. Larson, and R. D. Wells. 1974. J. Biol. Chem. 249, 6719.
49. Pullman, B. 1989. Adv. Drug Res. 18, 1.
50. Krowicki, K., and J. W. Lown. 1987. J. Org. Chem. 52, 3493. Lown, J. W. 1988. Anti-Cancer Drug Design 3, 25. Lee, M., R. G. Shea, J. A. Hartley, J. W. Lown, K. Kissinger, J. C. Dabrowiak, G. Vesnauer, K. J. Breslauer, and R. T. Pon. 1989. J. Molec. Recogn. 2, 6.
51. Sapse, A.-M., W. Feng, L. Fugler-Domenico, S. Kabir, T. Joseph, and J. W. Lown. 1993. J. Biomol. Struct. Dyn. 10, 709.
52. Kaneda, M., S. Nakamura, and Y. Iitaka. 1980. J. Antibiot. Ser. A 33, 778. Diana, G. D. 1973. J. Med. Chem. 16, 857.
53. Rao, K. E., J. Zimmermann, and J. W. Lown. 1991. J. Org. Chem. 56, 786.
54. Mazurek, P., W. Feng, K. Shukla, A.-M. Sapse, and J. W. Lown. 1991. J. Biomol. Struct. Dyn. 9, 299.
55. Kabir, S., and A.-M. Sapse. 1991. J. Comp. Chem. 12, 1142.
56. Kumar, S., M. Jaseja, J. Zimmermann, B. Yadagiri, R. T. Pon, A.-M. Sapse, and J. W. Lown. 1990. J. Biomol. Struct. Dyn. 8, 99.
57. Zimmermann, J., K. E. Rao, T. Joseph, A.-M. Sapse, and J. W. Lown. 1991. J. Biomol. Struct. Dyn. 9, 599.
58. Lee, M., R. G. Shea, J. A. Hartley, K. Kissinger, R. T. Pon, G. Vesnauer, K. J. Breslauer, J. C. Dabrowiak, and J. W. Lown. 1989. J. Amer. Chem. Soc. 111, 345.
59. Sapse, A.-M., S. Kabir, and G. Snyder. 1995. THEOCHEM 339, 227.
60. Sapse, A.-M., D. S. Sapse, D. C. Jain, and J. W. Lown. 1995. J. Biomol. Struct. Dyn. 12, 857.
61. Sapse, A.-M., D. C. Jain, and J. W. Lown. 1997. J. Biomol. Struct. Dyn. 14, 475.
7
Ab Initio Calculations of Amino Acids and Peptides
Lothar Schafer, Susan Q. Newton, and Xiaoqin Jiang
7.1 Theory versus Experiment: The Case of Glycine
In the late seventies and early eighties, a small number of researchers who had the requisite resources at their disposal made the first systematic attempts to employ the newly emerging quantum-chemical computational tools in experimental studies of molecular structures (for reviews see Boggs 1983G, 1988G; Schafer et al. 1983G, 1987G, 1988aG; and Geise et al. 1988G). In this process, which provided an important testing ground for the evolution of computational chemistry, the ab initio geometric optimizations of glycine (Sellers et al. 1978AA) represent a special landmark. In a sequence of events unprecedented in conformational chemistry, the results of the optimizations first suggested the existence of a hidden conformation that had remained undetected in two independent microwave spectroscopic studies of the compound (Brown et al. 1978G; Suenram et al. 1978G), and then guided new experiments that led to the detection of the missing state (Suenram et al. 1980G; Schafer et al. 1980AA). In the first microwave investigations of glycine (Brown et al. 1978G; Suenram et al. 1978G), experiments whose success represents a considerable achievement because glycine is difficult to work with in the vapor phase, the observed transitions were assigned to the cyclic form, C, of the compound. In both studies, it was emphasized that other conformers of glycine could have been present but were not detected in the microwave spectra because their line intensities were weaker than those of C. Nevertheless, since C was observed but not the stretched form, S (see Fig. 7.1), Brown et al. concluded that "the most likely conformation of glycine in the vapor state" is C and that the experimental result was in conflict with ab initio calculations of glycine by Vishveshwara and Pople (1977AA) in which S was found to be more stable than C. Brown et al.
(1978G): "The microwave spectrum of glycine vapor has been measured and analyzed; it is in the molecular form with a dipole moment of 4.5-4.6 D and probably having conformation (C), which is in conflict with a recent theoretical study that implies that conformation (S) is more stable."

Fig. 7.1 The stretched form of glycine, S, and the cyclic form, C, with the planar N-C-C=O configuration.

In the sixties and seventies, theoretical chemistry had developed a somewhat uncertain reputation. At that time, the computational means did not really allow the performance of reliable calculations of molecules large enough to be useful in laboratory research. In response to this predicament, a large number of approximate procedures were developed and received with great enthusiasm, but they often failed to live up to initial expectations. In this situation, it was natural to expect an experimental result to be in contrast to a theoretical study. In addition, the ab initio study of glycine by Vishveshwara and Pople (1977AA), apart from being a pioneering achievement, had its own characteristic problems, because it consisted of a series of single-point energy calculations. That is, conformational energies were calculated at rigid or standard geometries without geometry optimization. Thus, the results were intrinsically somewhat inconclusive, because one can never predict exactly how optimization will affect the relative energies of unoptimized geometries. In view of this controversy, Sellers et al. (1978AA) and Schafer et al. (1980AA) optimized several structures of glycine using Pulay's gradient method (Pulay 1969G and 1979aG) at the HF/4-21G level (Pulay et al. 1979bG). These calculations, too, indicated that S was more stable than C, but yielded important additional results. First of all, rotational constants calculated with the ab initio geometry of C agreed within a few tenths of a percent with the experiment. Furthermore, the calculations reinforced the suggestion that the microwave experiments and ab initio calculations were not really in conflict because the stretched form, S, was indeed a highly populated state of glycine but had remained undetected in the spectroscopic experiments. This suggestion seemed plausible because the HF/4-21G dipole moment for S was significantly smaller than that calculated for C. Since intensities of transitions in the microwave region depend on the squares of the dipole moment components, it was reasonable to expect that C had a considerable advantage in line intensity over S.
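The intensity argument can be illustrated with a rough numerical sketch. It assumes that relative line strength scales as population times the square of the total dipole moment (the text refers to dipole components, so this is a simplification) and that populations follow a Boltzmann distribution. The dipole moment of C (4.5 D) is the value quoted from Brown et al., and the energy separation (5.9 kJ/mol) is the experimental value of Suenram et al. (1980G) cited later in this chapter. The dipole moment of S and the temperature are hypothetical placeholders, not values from the text.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)

def boltzmann_ratio(delta_e_kj_mol, temperature_k):
    """Population of the upper state relative to the lower state."""
    return math.exp(-delta_e_kj_mol / (R_KJ * temperature_k))

def relative_intensity(population_ratio, mu_upper, mu_lower):
    """Line-strength ratio upper/lower, assuming intensity ~ population * mu**2."""
    return population_ratio * (mu_upper / mu_lower) ** 2

DELTA_E_KJ = 5.9        # E(C) - E(S), experimental value quoted in the chapter
MU_C = 4.5              # dipole moment of C in debye (Brown et al., as quoted)
MU_S = 1.0              # HYPOTHETICAL dipole moment of S in debye
TEMPERATURE_K = 450.0   # HYPOTHETICAL vapor temperature

pop_ratio_c_to_s = boltzmann_ratio(DELTA_E_KJ, TEMPERATURE_K)
intensity_ratio_c_to_s = relative_intensity(pop_ratio_c_to_s, MU_C, MU_S)

print(f"population C/S: {pop_ratio_c_to_s:.2f}")
print(f"intensity  C/S: {intensity_ratio_c_to_s:.2f}")
```

With these placeholder inputs the less populated conformer still gives the stronger lines, which is the qualitative point of the argument; the actual ratio depends on the real dipole components and temperature.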
Thus, a significantly populated conformational state could easily have been overlooked, if the microwave spectrum of an equilibrium mixture of glycine was dominated by the spectrum of the less populated conformer. To some extent, the close agreement between the HF/4-21G geometry of C and the microwave spectrum was a surprising result. In 1978, the ability to calculate structures for a molecule of the size of glycine with such a quantitative agreement with experimental data was a new dimension in structural chemistry. Thus, at that time, the case was generally considered an isolated instance of serendipity. In contrast, Suenram and Lovas (1980G) considered the possibility that the accuracy found for C might very well also allow accurate calculations of the microwave spectrum of S from its ab initio structure. Therefore, they used the frequencies calculated from the HF/4-21G geometry of S as a guide in the further search for this conformer in the microwave spectrum. Indeed, when the experiments were
performed, weak transitions were found (Suenram et al. 1980G) in the predicted range, their assignment was made in terms of S, the predicted rotational constants from the ab initio structure were within a few tenths of a percent of the experimental values (Schafer et al. 1980AA), and S was found to be more stable than C. Thus, a general rule of experimental conformational chemistry was confirmed: molecular conformations observed do not allow one to make any inferences regarding conformations not observed. In view of the advances in computational chemistry during the last decade, the case of glycine now seems simple enough and nearly trivial. However, in hindsight, the study of glycine and related computational research performed in other laboratories at that time symbolized a turning point, revealing as they did the unexpected accuracy of HF/ab initio geometries. A whole string of similar discoveries followed in rapid sequence, all involving errors in experimental studies, some involving missing conformational states, originally undetected in experiments of various types, but later discovered in new experiments guided by the ab initio calculations. For example, errors were discovered in the structural studies of cyclopropylamine (Skancke et al. 1978G) and CH3POF2 (von Carlowitz et al. 1982G), when the ab initio geometries disagreed with the experimental results. Using constraints from ab initio geometries in the gas-electron diffraction study of dimethyl hydrazine (Chiu et al. 1979G), evidence for the existence of the outer-outer form was detected, first overlooked in a conventional gas-electron diffraction study (Kohata et al. 1979G), but then confirmed by a microwave investigation (Nakata et al. 1981G). Similarly, the axial form of monocyanocyclobutane, originally thought not to exist, was discovered in its microwave spectrum (Caminati et al. 1987cG), when a new search was guided by its calculated geometry.
In the same way, quantitatively accurate geometry predictions played an important role in the discoveries of the syn-form of 2-methyl allyl alcohol (Caminati et al. 1987aG), of trans-2-methoxy-ethylamine (Caminati et al. 1987bG), and the assignment of the spectrum of methyl hydrazinocarboxylate (Caminati et al. 1986G). Eventually, the power of combining experimental procedures with results from ab initio gradient calculations was firmly established (Boggs et al. 1982G, 1983G, 1988G; Geise et al. 1988G; van Hemelrijk et al. 1980G; von Carlowitz et al. 1983G; Doms et al. 1985G; Schei et al. 1984aG, 1984bG; Cremer et al. 1981G; Norden et al. 1986G; Staley et al. 1987G). Their application is now a routine matter, and an important conclusion resulting from these studies is that ab initio geometries (that is, differences between parameters of the same type in organic molecules) can be calculated with high accuracy, often even exceeding the resolution of details by experiments.
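The step from a calculated geometry to predicted rotational constants, which guided the search for S, can be sketched as follows. This is a minimal rigid-rotor illustration, not the procedure used in the cited studies: it builds the inertia tensor about the center of mass, takes its principal moments, and converts them to constants via B = h/(8π²I). Because no Cartesian coordinates of glycine are given in the text, water is used as a stand-in molecule, with an assumed O-H distance of 0.9572 Å and an assumed H-O-H angle of 104.52°; all function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34    # J s
AMU = 1.66053906660e-27      # kg
ANGSTROM = 1.0e-10           # m

def inertia_tensor(masses, coords):
    """Inertia tensor (amu * angstrom^2) about the center of mass."""
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    t = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        x, y, z = (c[k] - com[k] for k in range(3))
        t[0][0] += m * (y * y + z * z)
        t[1][1] += m * (x * x + z * z)
        t[2][2] += m * (x * x + y * y)
        t[0][1] -= m * x * y
        t[0][2] -= m * x * z
        t[1][2] -= m * y * z
    t[1][0], t[2][0], t[2][1] = t[0][1], t[0][2], t[1][2]
    return t

def symmetric_eigenvalues(a):
    """Eigenvalues of a real symmetric 3x3 matrix (closed trigonometric form), ascending."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det / 2.0))
    phi = math.acos(r) / 3.0
    e_max = q + 2.0 * p * math.cos(phi)
    e_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return sorted([e_min, 3.0 * q - e_min - e_max, e_max])

def rotational_constants_mhz(masses_amu, coords_angstrom):
    """Rigid-rotor constants (A, B, C) in MHz, with A >= B >= C."""
    moments = symmetric_eigenvalues(inertia_tensor(masses_amu, coords_angstrom))
    factor = H_PLANCK / (8.0 * math.pi ** 2 * AMU * ANGSTROM ** 2) / 1.0e6
    return tuple(factor / moment for moment in moments)

# Stand-in geometry (ASSUMED values): water, planar, in the xy-plane.
r_oh, half_angle = 0.9572, math.radians(104.52 / 2.0)
water_masses = [15.9949, 1.00783, 1.00783]
water_coords = [(0.0, 0.0, 0.0),
                (r_oh * math.sin(half_angle), r_oh * math.cos(half_angle), 0.0),
                (-r_oh * math.sin(half_angle), r_oh * math.cos(half_angle), 0.0)]

A, B, C = rotational_constants_mhz(water_masses, water_coords)
print(f"A = {A:.0f} MHz, B = {B:.0f} MHz, C = {C:.0f} MHz")
```

Feeding an optimized geometry of a conformer into such a routine gives the constants from which transition frequencies can be predicted; comparing them with fitted experimental constants is what "within a few tenths of a percent" refers to.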
7.2 Conformational Properties of Amino Acids

The ability of the ab initio gradient method (Pulay 1969G) to predict molecular geometries in precise agreement with experimental structures inspired refinements of hundreds of structures of basic organic functional groups at the HF/4-21G level, revealing structural details with a hitherto unknown resolution (McKean et al. 1984G; Schafer et al. 1983G, 1986G). In the area of amino acids, early optimizations were extended to various residues other than glycine, including alanine (Sellers et al. 1979AA; Siam et al. 1984AA; Masamura 1988aAA, 1988bAA), serine (van Alsenoy et al. 1981AA; Masamura 1988aAA, 1988bAA; Tarakeshwar et al. 1994AA), cysteine (Laurence et al. 1981AA; Schafer et al. 1990bAA; Tarakeshwar et al. 1994AA), valine (Schafer et al. 1990aAA), threonine (Schafer et al. 1990aAA), and aspartic and glutamic acid (Sapse et al. 1994AA). Experimental investigations of volatile derivatives in the vapor phase consistently confirmed the stability of the stretched form of the amino acid backbone, as in the methyl esters of glycine (Klimkowski et al. 1982G, 1983bAA), alanine (Ewbank et al. 1987AA), and valine (Klimkowski et al. 1985AA). In addition to renewed microwave investigations (Godfrey et al. 1995G, 1996G), the vapors of glycine were studied by gas electron diffraction (Iijima et al. 1991G). Interest also extended to glycine analogs, such as the boron analog of glycine (Laurence et al. 1982AA), the hydrazino analog of the methyl ester of glycine, methyl hydrazinocarboxylate (Caminati et al. 1986G), glycine aldehyde (van den Enden et al. 1983G), and other related compounds, such as β-alanine (Ramek 1990aAA) and γ-aminobutyric acid (Ramek 1990bAA). Klimkowski et al. (1983aAA) compared methyl propanoate with the methyl ester of glycine.

In aqueous solution and in the solid state, the most stable form of amino acids is the zwitterion. Numerous quantum-chemical studies on the molecular properties of zwitterions are available (Alper et al. 1992AA; Bonaccorsi et al. 1984AA; Depke et al. 1984AA; Destro et al. 1988AA; Ding et al. 1992AA; Kikuchi et al. 1990aAA, 1990bAA; Kokpol et al. 1988AA; Ni et al. 1988AA; No et al. 1994AA; Palla et al. 1980AA; Ranghino et al. 1983AA; Singh et al. 1987AA; Sokalski et al. 1989AA; Sukumar et al. 1986AA; Voogd et al. 1981AA; Williams et al. 1993AA; Wright et al. 1980AA).

In the gas phase, amino acids are neutral and the conformational properties of the backbone are determined by the torsions 1 (H-N-C-C), 2 (N-C-C=O), and 3 (O=C-O-H). In the planar structures of glycine, two meaningful arrangements can be expected for each of these torsions, in the vicinity of ±60° or ±120° for 1, and 0° or 180° for 2 and 3. The possible combinations yield eight different conformers. The number increases when both planar and nonplanar geometries are considered, as is necessary in some cases.
A summary of the possible conformations of glycine is given in Fig. 7.2. The most stable form of the amino-acid backbone, regardless of residue, is the stretched conformation, S, characterized by a bifurcated attractive interaction (see Fig. 7.1) between the amino hydrogens and carbonyl oxygen. The next stable conformation is the cyclic form, C, characterized by a ring formed by the interaction between the hydroxyl and amino groups. The characteristic bifurcated nonbonded interaction of S can also be directed at the C-O group, rather than C=O, leading to another stable form, denoted as C' in Fig. 7.2. Less stable forms can be derived from S when one or two attractive nonbonded interactions are decoupled, as in S*, S*', and S**, respectively. Similarly, decoupling attractive interactions in C leads to C* and C**. For several of these conformers planar or nonplanar forms can be discussed. They are denoted in Fig. 7.2 by the subscripts "p" or "np," respectively. In amino acids other than glycine, the backbone torsions can be significantly affected by the side group (Sellers et al. 1979AA; Csaszar 1995AA; Cao et al. 1995AA). For example, in alanine the backbone of the S-form is no longer planar, and no controversy arises concerning the planarity or not of the energy minimum of C. Even more pronounced effects on backbone torsions and conformational stabilities are found in systems such as serine, in which the side group can strongly interact with, and twist, different parts of the backbone (van Alsenoy et al. 1988AA). Renewed interest in the recent past has led to advanced calculations of amino acids, involving various basis sets and levels of ab initio theory (Ringnalda et al. 1990AA; Jensen et al. 1991AA; Ramek et al. 1991AA; Csaszar 1992AA, 1995AA; Hu et al. 1993AA; Cao et al. 1995AA). These studies include geometry optimization at the post-HF level, following some early pioneering single-point explorations of electron correlation effects (Dykstra et al. 1981AA; Millefiori et al. 1983AA).
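The count of eight planar backbone arrangements follows from two choices for each of the three torsions. A minimal sketch of that enumeration is given below. The torsion names and idealized values come from the text; treating the ± pair of the H-N-C-C torsion as a single arrangement (one value for each amino hydrogen) is an assumption of this sketch, and no attempt is made to map the combinations onto the labels of Fig. 7.2.

```python
from itertools import product

# Idealized torsion values in degrees, as named in the text for planar glycine.
# ASSUMPTION: for H-N-C-C only the magnitude is listed, because the +/- pair is
# taken to describe the two amino hydrogens of one and the same arrangement.
TORSION_OPTIONS = {
    "H-N-C-C": (60, 120),
    "N-C-C=O": (0, 180),
    "O=C-O-H": (0, 180),
}

def enumerate_backbones(options):
    """Return every combination of one value per torsion."""
    names = list(options)
    combos = product(*(options[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

conformers = enumerate_backbones(TORSION_OPTIONS)
for index, conformer in enumerate(conformers, start=1):
    print(index, conformer)
print("number of idealized planar arrangements:", len(conformers))
```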
Fig. 7.2 Conformations of glycine. Because of the basic features of the cyclic and stretched forms, the letters C and S have been assigned to all forms in which the N-C-C-O torsion is syn or anti, respectively. In the cyclic form there is an attractive nonbonded interaction between the hydroxyl-H and nitrogen lone pair. At some levels of ab initio theory the planar form, Cp, is the energy minimum of C; at other levels C is characterized by a dual-well potential energy function, with the nonplanar form, Cnp, as the minimum. When the attractive interaction characterizing the C form is decoupled by rotation about the C-O bond, the less stable forms, C* and C*np, result. By decoupling all stabilizing N...H or O...H interactions, the high energy form, C**, is obtained. In the stretched form of glycine, S, there is a characteristic bifurcating attractive interaction between the amino-hydrogens and the carbonyl (C=O) oxygen (see Fig. 7.1). In a related form, but of type C', the characteristic bifurcation is extended to the carboxyl (C-O) oxygen. When one or two of the interactions that stabilize S are decoupled, the forms S*, S*', and S** result. Again, except for the case of S, different levels of ab initio theory disagree on whether planar or nonplanar forms are the energy minima of some of the S-forms.
Among the advanced calculations, those performed at the HF-level of theory with large basis sets are somewhat disappointing because, as far as exact energy differences are concerned, they are not converged. For the S and C forms of glycine, for example, the HF energy differences range from 7.5 to 13.6 kJ/mol for various basis sets (Frey et al. 1992P). In addition, different levels of ab initio theory (Jensen et al. 1991AA; Ramek et al. 1991AA; Csaszar 1992AA, 1995AA; Hu et al. 1993AA; Cao et al. 1995AA) do not agree on the exact location of the minimum of the cyclic form, at Cp or Cnp, even though the most advanced levels seem to converge on Cnp. It is a particularly disturbing aspect of the HF energy difference between S and C that calculations with large basis sets can deviate more from the experimental value than those performed with smaller bases (Frey et al. 1992P). Thus, enlarging the basis does not necessarily lead to improved HF energies, as is often implied. In the case of glycine, addition of zero point energies can lead to even poorer agreement with experiment (Frey et al. 1992P). In view of such results, consideration of electron correlation seems necessary to obtain accurate conformational energies of amino acids and related systems. For the MP2/6-311G** optimized structures of S and C, the calculated energy difference is 3.5 kJ/mol (Frey et al. 1992P), which compares to an experimental value of 5.9 ± 1.8 kJ/mol (Suenram et al. 1980G). Zero point energy corrections very likely will add ~1 kJ/mol to ΔE, as they do in the HF calculations, improving the agreement between experiment and the MP2/6-311G** energy for the optimized structures. When MP2 energies are calculated at HF-geometries, i.e., at MP2-unoptimized geometries, the error due to the lack of geometry optimization can amount to ~2 kJ/mol. Thus, geometry optimization seems to be a prerequisite for obtaining accurate energies at the correlated level of theory.

Of all the amino acids, glycine and alanine have been investigated most thoroughly. In spite of the fact that the computations have not yet attained a level in which they are totally converged, the results indicate that the conformational equilibria in the vapor phase of glycine and alanine are more complex than has been observed experimentally. A detailed discussion has been given by Godfrey et al. (1995G, 1996G). Thus, the conformational properties of gaseous amino acids may continue for some time to be a stimulating challenge to both computational and experimental chemists.
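The comparison of calculated and measured S/C energy differences can be laid out as a short worked check. The numbers are those quoted in the text (HF range 7.5 to 13.6 kJ/mol, MP2/6-311G** 3.5 kJ/mol, zero point correction of roughly 1 kJ/mol, experiment 5.9 ± 1.8 kJ/mol); the function names are hypothetical, and treating the quoted uncertainty as a simple acceptance window is an assumption of this sketch.

```python
EXPERIMENT_KJ = 5.9         # E(C) - E(S), Suenram et al. 1980G, as quoted
EXPERIMENT_SIGMA_KJ = 1.8   # quoted uncertainty

def deviation(calculated, experiment=EXPERIMENT_KJ):
    """Signed deviation of a calculated energy difference from experiment."""
    return calculated - experiment

def within_uncertainty(calculated, experiment=EXPERIMENT_KJ, sigma=EXPERIMENT_SIGMA_KJ):
    """True when the calculated value lies inside the quoted error bar."""
    return abs(calculated - experiment) <= sigma

CANDIDATES_KJ = {
    "HF, lower end of basis-set range": 7.5,
    "HF, upper end of basis-set range": 13.6,
    "MP2/6-311G** (optimized)": 3.5,
    "MP2/6-311G** + ~1 kJ/mol ZPE": 3.5 + 1.0,  # the ZPE value is approximate
}

for label, value in CANDIDATES_KJ.items():
    print(f"{label:36s} {value:5.1f}  dev {deviation(value):+5.1f}  "
          f"inside error bar: {within_uncertainty(value)}")
```

On these figures the uncorrected MP2 value falls just outside the experimental window and the zero-point-corrected value falls inside it, which is the improvement the text anticipates.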
7.3 The Importance of Full Geometry Optimization in Correlated-Level Ab Initio Molecular Conformational Analyses

The statement that MP2 geometry optimization is essential for the computation of precise MP2 energies is in contrast to the generally accepted procedure of calculating correlated-level conformational energies at HF-optimized geometries, avoiding the expense of post-HF geometry optimization. In the late seventies and early eighties, quantum chemistry made a leap in accuracy and reliability, when analytical gradients became available (Pulay 1969G), because their employment in conformational analyses eliminated the uncertainty related to the lack of geometry optimization. Analytical correlated-level gradients, too, are now available (Pople et al. 1979G) within the second-order perturbative Møller-Plesset (MP2) formalism (Møller et al. 1934G), but geometries are often not optimized at this level of theory because of the considerable computing time required for the process. In a recent paper Csaszar (1995AA) emphasized the validity of this procedure, stating that: "single-point MP2 energies obtained at HF geometries for amino acids are rather accurate as conventional wisdom would imply"; and "correlated-level geometry optimizations can usually be avoided even if nearly quantitative accuracy is sought in relative energy predictions." In order to determine what effects geometry optimization (or the lack of it) can have on correlated-level conformational energies, we have executed a series of studies (Frey et al. 1992P, 1993G; Teppen et al. 1994aG, 1994bG) in which geometries devoid of correlation (i.e., HF-optimized) were compared with the same geometries when correlation was switched on by MP2 optimization.
MP2 calculations are not a true substitute for advanced electron correlation calculations, such as full configuration interaction variational methods but, in combination with extended basis sets, they are generally considered to include a large part of the correlation energy that is neglected in HF theory (Hehre et al. 1986G). In the following we shall denote as "geometry errors" the errors connected with MP2 energies determined at MP2-unoptimized (i.e., HF-optimized) geometries. In addition to glycine and the model dipeptide N-formyl alanine amide (Frey et al. 1992P), the question of geometry errors has been explored in other systems with nonbonded interactions of seminal significance, such as hydrogen bonding (Teppen et al. 1994aG; Ramek 1996AA) and aliphatic interactions (Frey et al. 1993G). In each case, structures were compared which are devoid of correlation (i.e., HF-gradient optimized) with structures in which correlation was included (i.e., MP2-gradient optimized). Some general trends are beginning to emerge from these studies, which can be summarized in the following way (Teppen et al. 1994bG; Ramek 1996AA): Geometry errors increase with the size of a system when the number of nonbonded interactions increases, and folded conformations are differently affected than extended ones. That is, MP2-energies calculated at HF-geometries can be inaccurate, because geometry errors are not necessarily the same in different conformations of the same molecule. Regarding the molecular structures it can be shown that, even when differences between MP2 geometries and the HF equivalents are small, they can be significant. Size-effects are apparent, for example, in the hydrocarbon series (Frey et al. 1993G), n-butane, n-pentane, and n-hexane, where the geometry error in MP2/6-311G** relative energies increases systematically with increasing molecular size; i.e., from 90 cal/mol in the case of n-butane, to 280 cal/mol for n-pentane, and 460 cal/mol for n-hexane. Similarly, for several conformations of the relatively small system, ethylene glycol (Teppen et al. 1994aG), the geometry error in MP2/6-311G** relative energies is relatively small, i.e., up to 150 cal/mol, but exceeds 400 cal/mol in the optimized structures of the larger system, glycerol (Teppen et al. 1994aG).
If one extrapolates to even larger systems with additional hydroxyl groups, such as carbohydrates, the impression is unavoidable that any conclusions based on unoptimized energies will not be very meaningful. The importance of folding derives from the torsional sensitivity (TS) (Cao et al. 1993G) of a given conformation. TS measures how nonbonded distances at a given point in torsional space, and their contributions to nonbonded energy, are affected by small changes in backbone torsional angles (Cao et al. 1993G). TS is typically high for gauche forms (G), and low for trans forms (T). Extended structures, in which all main chain torsions are in the T-state, cannot give in to the subtle dispersion forces established by electron correlation in the same way as folded forms do, in which several main chain torsions are in G-states (see Table 7.1). As a consequence, the MP2 structures of all T-forms are rather similar to their HF counterparts, in contrast to forms with G-sequences, where the opposite is the case. This constitutional difference in response to correlated-level geometry optimization is the source of potentially serious geometry errors. The analysis above is supported by the hydrocarbon series (Frey et al. 1993G) where the error increases systematically with the number of G forms, amounting to ~100 cal/mol for the TG form of n-pentane and the GTT form of n-hexane; 280 cal/mol for GG n-pentane, 320 cal/mol for TGG n-hexane, and 460 cal/mol for GGG n-hexane. For the latter, the effect is sufficient to cause an inversion in the order of stability, invalidating a frequently quoted rule according to which alkane stability decreases with the addition of G torsions. Quantitative MP2-calculations currently cannot be performed for larger alkanes, but the progression of errors in the available series is so clear that unoptimized correlated energies will very likely be meaningless for large aliphatic systems, such as parts of globular proteins or lipids.
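The trend of geometry error with the number of gauche torsions can be tabulated directly from the figures quoted above. The sketch below groups the quoted errors by gauche count and averages them; the values are those given in the text for the hydrocarbon series (the two ~100 cal/mol entries are approximate), and the helper names are hypothetical.

```python
# Geometry errors in MP2/6-311G** relative energies (cal/mol), as quoted in the text.
GEOMETRY_ERRORS_CAL = {
    ("n-pentane", "TG"): 100,   # quoted as ~100
    ("n-hexane", "GTT"): 100,   # quoted as ~100
    ("n-pentane", "GG"): 280,
    ("n-hexane", "TGG"): 320,
    ("n-hexane", "GGG"): 460,
}

def gauche_count(conformation_label):
    """Number of gauche (G) torsions in a label such as 'TGG'."""
    return conformation_label.count("G")

def mean_error_by_gauche_count(errors):
    """Average the quoted errors over all conformers with the same gauche count."""
    buckets = {}
    for (_, label), error in errors.items():
        buckets.setdefault(gauche_count(label), []).append(error)
    return {count: sum(values) / len(values) for count, values in sorted(buckets.items())}

trend = mean_error_by_gauche_count(GEOMETRY_ERRORS_CAL)
for count, error in trend.items():
    print(f"{count} gauche torsion(s): mean geometry error {error:.0f} cal/mol")
```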
Table 7.1. Comparison of nonbonded distances (in Å) in the HF/6-311G** and MP2/6-311G** optimized structures of various conformations of n-hexane [a]

            C1-C4    C1-C5    C1-C6
TTT-HF      4.78     5.75     7.51
TTT-MP2     4.76     5.75     7.48
GTT-HF      3.51     5.29     6.18
GTT-MP2     3.39     5.23     6.06
GTG-HF      3.49     5.53     5.61
GTG-MP2     3.36     5.45     5.59
TGG-SCF     4.89     4.33     5.69
TGG-MP2     4.89     4.06     5.47
GGG-HF      3.80     4.26     5.69
GGG-MP2     3.72     4.04     5.43

[a] The Table shows that the HF- and MP2-structures are very similar for stretched (all-trans, TTT) conformations. In contrast, HF- and MP2-structures can differ significantly for forms with sequential gauche (GG) torsions. The values are taken from R. F. Frey, M. Cao, S. Q. Newton, and L. Schafer, J. Mol. Struct., 285 (1993) 99.

In glycine (Csaszar 1995AA), the energy difference between the C and S forms is 570 cal/mol for the fully optimized MP2 geometries in calculations employing the B1 basis set (Csaszar 1995AA) and 980 cal/mol for MP2/B1 at HF/B1 geometries. For alanine corresponding differences are 140 cal/mol and 470 cal/mol, respectively (Csaszar 1995AA). An error of some 400 cal/mol seems small for isolated amino acids but if it is frequently repeated and accumulated, as when these compounds are residues in polymers, the effects on the results of a model calculation can be devastating. As to the effects of MP2-optimization on HF geometries, they are in some cases significant and in others small, but, even when small, they are not necessarily insignificant. As to the former, MP2-optimized hydrogen bond lengths, for example, differ from the HF-optimized values by 0.17 Å in N-formyl alanine amide (Frey et al. 1992P), and 0.19 Å in glycerol (Teppen et al. 1994aG). In the same way, MP2- and HF-optimized nonbonded distances in hydrocarbons differ (Frey et al. 1993G) by 0.3 Å in a sensitive region of the nonbonded potential energy surface. One cannot evaluate conformational energies and nonbonded interactions correctly, if effects of this magnitude are disregarded in large systems where the nonbonded interactions represent one of the most important factors determining conformational stability and structure. In cases in which the differences between HF-optimized structures and MP2-geometries are small, they need not be negligible. For example, Godfrey et al.
(1993G) recently recorded the microwave spectroscopic data of gaseous alanine and interpreted the observed transitions with the help of ab initio geometries. On the basis of their analysis they concluded (Godfrey et al. 1993G), "From comparison (of rotational constants and dipole moment components) with those predicted from ab initio molecular orbital calculations at the HF/6-31G** level, we have been able unambiguously to identify the two observed conformers as alanine I and alanine III." In contrast to this statement, the assignment of the microwave data is not unambiguous at all and two conformations are equally well in agreement with the same set of experimental constants, when the rotational constants and dipole moment components of alanine are predicted from MP2-optimized geometries (Cao et al. 1995AA). Thus, additional information is needed to make the data interpretation unambiguous.

In view of the results that are available so far, the suggestion is unavoidable that, with the exception of small and quasi-rigid molecules, correlated-level energies evaluated at HF-geometries are potentially misleading. When bond lengths, angles, and torsions of a molecular model are wrong, calculated energies are also wrong because nonbonded interactions are not evaluated correctly. In general, it seems prudent to avoid as much as possible any known sources of error, no matter how insignificant they might seem. In organic compounds molecular dispersion forces are manifested in subtle features of geometry. Excluding these features from conformational analyses vitiates all evaluations of nonbonded interactions, excluding the very basis of conformational stability.

Fig. 7.3 Contour graph illustrating the concept of torsional sensitivity (TS). Changes in the 1,5-CH3/CH3 nonbonded H...H distances in n-pentane, Δr, are plotted which occur when the torsional angles τ1 = C1-C2-C3-C4 and τ2 = C2-C3-C4-C5 change by Δτ1 = Δτ2 = 1°. The plotted values are |∂r/∂τ1|Δτ1 + |∂r/∂τ2|Δτ2. TS measures how sensitive nonbonded distances at a given point in τ1, τ2-space, and their contributions to nonbonded energy, are to changes in backbone torsional angles. Maximum TS is found at (τ1, τ2) equal to (-40°, -40°), which is close to the GG regions of hydrocarbons and the helical regions of peptides and proteins. Minimum TS occurs in the vicinity of (180°, 180°) and is close to the C5- or β-regions of peptides. The numerical values of this graph were taken from M. Cao and L. Schafer, J. Mol. Struct., 284 (1993) 235.
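The idea behind torsional sensitivity can be sketched numerically. The example below is a simplified stand-in for the quantity plotted in Fig. 7.3, not a reproduction of the published calculation: it uses the C1...C5 carbon distance of an idealized n-pentane skeleton instead of the CH3/CH3 H...H distances, assumes a fixed C-C bond length of 1.53 Å and a fixed C-C-C angle of 112°, and evaluates the sum of absolute distance changes for 1° steps in each torsion by central finite differences. All names are hypothetical, and the sign convention of the torsions is internal to the sketch.

```python
import math

BOND = 1.53                    # ASSUMED C-C bond length in angstrom
ANGLE = math.radians(112.0)    # ASSUMED C-C-C valence angle

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _unit(u):
    norm = math.sqrt(sum(a * a for a in u))
    return tuple(a / norm for a in u)

def place_atom(a, b, c, torsion_deg):
    """Place atom d bonded to c, with angle b-c-d = ANGLE and torsion a-b-c-d."""
    phi = math.radians(torsion_deg)
    bc = _unit(_sub(c, b))
    n = _unit(_cross(_sub(b, a), bc))   # normal of the a-b-c plane
    m = _cross(n, bc)                   # in-plane axis perpendicular to bc
    local = (-BOND * math.cos(ANGLE),
             BOND * math.sin(ANGLE) * math.cos(phi),
             BOND * math.sin(ANGLE) * math.sin(phi))
    return tuple(c[k] + local[0] * bc[k] + local[1] * m[k] + local[2] * n[k]
                 for k in range(3))

def c1_c5_distance(t1_deg, t2_deg):
    """C1...C5 distance of the idealized pentane skeleton for torsions (t1, t2)."""
    c1 = (0.0, 0.0, 0.0)
    c2 = (BOND, 0.0, 0.0)
    c3 = (BOND - BOND * math.cos(ANGLE), BOND * math.sin(ANGLE), 0.0)
    c4 = place_atom(c1, c2, c3, t1_deg)
    c5 = place_atom(c2, c3, c4, t2_deg)
    return math.dist(c1, c5)

def torsional_sensitivity(t1_deg, t2_deg, step_deg=1.0, h=0.01):
    """|dr/dt1| * step + |dr/dt2| * step, derivatives by central differences."""
    d1 = (c1_c5_distance(t1_deg + h, t2_deg) - c1_c5_distance(t1_deg - h, t2_deg)) / (2.0 * h)
    d2 = (c1_c5_distance(t1_deg, t2_deg + h) - c1_c5_distance(t1_deg, t2_deg - h)) / (2.0 * h)
    return abs(d1) * step_deg + abs(d2) * step_deg

for point in [(180.0, 180.0), (-60.0, -60.0), (-40.0, -40.0)]:
    print(point, f"TS = {torsional_sensitivity(*point):.4f} angstrom per degree step")
```

In this toy model the all-trans point is a stationary point of the distance, so its sensitivity vanishes, while folded regions respond strongly to small torsional changes, in line with the qualitative statement that TS is low for T and high for G forms.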
Fig. 7.4 Surface plot of the contour diagram given in Fig. 7.3, illustrating the concept of torsional sensitivity. The numerical values were taken from M. Cao and L. Schafer, J. Mol. Struct. 284 (1993) 235.
Fig. 7.5 Illustration of how dispersion forces affect gauche (G) conformations. Compared to structures with gauche forms devoid of dispersion forces (i.e., HF-optimized), structures with gauche forms subject to dispersion forces (MP2 optimized) contract in such a way that the 1...5 nonbonded interactions in an attractive part of the van der Waals potential are shortened. Thus, in GG-pentane (shown above), MP2-optimized torsional angles are contracted by several degrees compared to the HF-optimized geometry, causing a reduction in the 1...5 nonbonded distances by several tenths of an Å. For additional details and the numerical values see R. F. Frey, M. Cao, S. Q. Newton, and L. Schafer, J. Mol. Struct. 285 (1993) 99.
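The contraction described in Fig. 7.5 can be read off Table 7.1 by subtracting the MP2 distances from the HF distances for each conformation of n-hexane. The sketch below does this with the tabulated values; the row labelled TGG-SCF in the table is assumed here to be the HF result, and the function names are hypothetical.

```python
# Nonbonded distances from Table 7.1, in the column order (C1-C4, C1-C5, C1-C6).
# ASSUMPTION: the "TGG-SCF" row of the table is treated as the HF entry.
TABLE_7_1 = {
    "TTT": {"HF": (4.78, 5.75, 7.51), "MP2": (4.76, 5.75, 7.48)},
    "GTT": {"HF": (3.51, 5.29, 6.18), "MP2": (3.39, 5.23, 6.06)},
    "GTG": {"HF": (3.49, 5.53, 5.61), "MP2": (3.36, 5.45, 5.59)},
    "TGG": {"HF": (4.89, 4.33, 5.69), "MP2": (4.89, 4.06, 5.47)},
    "GGG": {"HF": (3.80, 4.26, 5.69), "MP2": (3.72, 4.04, 5.43)},
}

def contractions(table):
    """HF minus MP2 distance for every column, rounded to the table's precision."""
    return {
        conformer: tuple(round(hf - mp2, 2) for hf, mp2 in zip(rows["HF"], rows["MP2"]))
        for conformer, rows in table.items()
    }

def largest_contraction(table):
    """Largest HF-to-MP2 shortening found in each conformer."""
    return {conformer: max(values) for conformer, values in contractions(table).items()}

largest = largest_contraction(TABLE_7_1)
for conformer, value in largest.items():
    print(f"{conformer}: largest contraction {value:.2f}")
```

The all-trans form changes by a few hundredths at most, whereas the forms with sequential gauche torsions contract by more than two tenths, which matches the footnote of Table 7.1.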
Fig. 7.6 The torsional dependence of the central C-C bond distances in some substituted ethanes. All curves are graphs of relative values taken from L. Schafer, J. D. Ewbank, V. J. Klimkowski, K. Siam, and C. van Alsenoy, J. Mol. Struct. 135 (1986) 141, and L. Schafer, M. Cao, M. Ramek, B. J. Teppen, S. Q. Newton, and K. Siam, J. Mol. Struct. in press.
7.4 The Concept of Local Geometry

If one rotates about a C-C single bond in a compound of type X-C-C-Y, at each step of this torsional motion there are electron redistributions, and the bond lengths and angles will change. This phenomenon is the basis of the local nature of molecular structure. Several examples are illustrated in Figs. 7.6 and 7.7. By the concept of local geometry (Schafer et al. 1982P), the efficacy of local perturbations in shaping equilibrium geometries is emphasized in contrast to invariant or "standard" geometries (Marsh et al. 1967G; Pople et al. 1967G; Scheraga 1968G). By employing the latter, it is implied that the bond distances and angles in molecules are essentially constant at different locations of the potential energy surfaces (PES), or characteristic atom groups are assumed to possess an enhanced local symmetry that is not maintained by real systems. In real systems, structural subunits often display significant parameter variations from one point of the PES to the next, and bond lengths and angles are local in the sense that they depend on where they are in a molecule, and where the molecule is on its PES. Variations in local geometries of organic compounds are often subtle and below the resolution of most experimental structural techniques. Thus, the full extent of structural variability was at first not so much apparent from experimental structural studies, but from ab initio gradient optimized geometries (Schafer 1982G, 1983G). Consider, for example, the simple structure of formamide (Fig. 7.8). Albeit more than twenty years old, the combined electron diffraction and microwave study of this compound (Kitano et al. 1974G) is still state-of-the-art in the sense that no investigation of this kind has been performed with higher resolution. The experimental structure is characterized by
Fig. 7.7 The torsional dependence of the terminal X-C-C and C-C-Y angles in some substituted ethanes. All curves are graphs of relative values taken from L. Schafer, J. D. Ewbank, V. J. Klimkowski, K. Siam, and C. van Alsenoy, J. Mol. Struct. 135 (1986) 141, and L. Schafer, M. Cao, M. Ramek, B. J. Teppen, S. Q. Newton, and K. Siam, J. Mol. Struct., in press.
Fig. 7.8 Structural parameters for formamide determined from gas-electron diffraction (values with uncertainty estimates in parentheses taken from M. Kitano and K. Kuchitsu, Bull. Chem. Soc. Japan, 47 (1974) 67) and HF/4-21G calculations (values taken from H. L. Sellers, V. J. Klimkowski, and L. Schafer, Chem. Phys. Lett. 58 (1978) 541).
the fact that the experimental data contain information only on the average value of the N-H bond lengths. In contrast, the HF/4-21G gradient geometry optimization yields (Fig. 7.8) two N-H bonds that differ at a level of several thousandths of an Å, below the microwave/electron diffraction resolution of this parameter. The difference is expected, because one of the N-H bonds is cis to the carbonyl group and thus involved in an attractive nonbonded interaction, while the one trans to C=O is not. From other systems it is known (Schafer 1982G) that interacting N-H and O-H bonds are usually lengthened compared with noninteracting bonds. Thus, the conclusion must be that the experimental method employed does not allow one to determine subtle features of this kind. The assumption of an attractive interaction between N-H and C=O in formamide is supported (Fig. 7.8) by the angular relation Hcis-N-C(=O) < Htrans-N-C(=O), a feature of both the experimental structure and the calculated geometry. Subtle structural trends of this kind are ubiquitous, involving all types of bond lengths and angles. They are clearly visible in ab initio structures because, in contrast to experiments, the calculated results are not affected by statistical noise and a systematic study of the entire conformational space of a molecule is possible, including energy maxima. The particular accuracy of HF/4-21G calculations in determining such trends has been documented for a large number of compounds (see McKean et al. 1984G). A special case illustrating the significance of local geometries involves the chirality of tetrahedral carbon atoms with two substituents of identical constitution.
7.5 The Conformationally Dependent Chirality of Glycine

One of the fundamental concepts of structural chemistry is that of molecular asymmetry or chirality. The most typical example is that of a tetrahedral carbon atom with four different substituents, C(abcd), which can adopt two different arrangements that are nonsuperimposable mirror images of one another. Such a carbon atom is usually called asymmetric or chiral. In contrast, when two of the substituents are alike, as in C(abc2), the system is usually termed symmetrical or achiral, except for a special class of compounds which are prochiral; i.e., they have no asymmetric site, but are able to react asymmetrically with an asymmetric active site. Additional sources of asymmetry in C(abc2) systems arise from conformational degrees of freedom. Gauche butane, for example, can be correctly described as rapidly interconverting D,L pairs (Mislow 1965G; Eliel et al. 1965G; Mislow et al. 1967G).

Amino acids are characteristic examples of compounds with an asymmetric carbon atom, with the exception of glycine which, since its α-carbon carries two hydrogens, is often said to be without an asymmetric carbon atom. As a typical C(abc2) system, glycine can be used (Schafer et al. 1984G) to illustrate the conformationally dependent chirality of tetrahedral carbon atoms with two substituents of identical constitution. That is, in compounds containing the glycine residue, some conformations usually exist in which the α-carbon is asymmetric, and others in which it is not. Fig. 7.9 depicts the local geometry around the α-carbon obtained from HF/4-21G geometry optimizations (Schafer et al. 1984G) for the αR-helical conformation of the model dipeptide N-acetyl N'-methyl glycine amide. In this form, the two C(α)-H bonds and H-C(α)-X angles are not identical. Hence, the α-carbon of the glycine residue in this system is asymmetric, because it carries four different substituents, C(α)(NHR)(COR)HH', which can be arranged as two nonsuperimposable mirror images. In contrast, the extended (C5) form of N-acetyl N'-methyl glycine amide (Fig. 7.9) contains a symmetry plane, and the α-carbon is symmetric.

Fig. 7.9 In the α-helical conformation of the model dipeptide N-acetyl N'-methyl glycine amide (left), the C-H bond lengths and H-C-X angles at the α-carbon are different (1.081 Å and 1.078 Å for C-H; and 109.3° and 109.7° for H-C-N). Thus, the α-carbon is asymmetric. In contrast, in the C5 conformation of N-acetyl N'-methyl glycine amide (right), bonds and angles at C(α) are identical, there is a molecular symmetry plane, and the α-carbon is symmetric. (All values from Schafer et al. 1984.)

Conformationally dependent chirality is a common feature of molecules that contain carbon atoms with identical substituents. Another example is presented in Fig. 7.10. In dihydroxymethane, the conformation in which the H-O-C-O angles are trans has a plane of symmetry, and the carbon atom is of the type C(a2b2). In contrast, when one of the O-H bonds is rotated out of the plane, the structural parameters change in an asymmetric way. The system is now of type C(aa'bb') and the mirror images cannot be superimposed.
Fig. 7.10 When the H-O-C-O torsions in dihydroxymethane, CH2(OH)2, are 180° and form a planar backbone (structure on the left), the H-C-O angles (110.9°), C-H (1.086 Å), O-H (0.963 Å), and C-O bond lengths (1.425 Å) are identical, the molecule has a mirror plane, and the carbon atom is symmetric. In contrast, when one of the O-H bonds is rotated out of the plane, equivalent bond lengths and angles are different (1.078 Å and 1.085 Å for C-H; 1.438 Å and 1.417 Å for C-O; 0.963 Å and 0.965 Å for O-H; and 111.9° and 109.4° for H-O-C). Thus, in the structure on the right, the carbon atom is asymmetric with four different substituents. (All values from Schafer et al. 1984G.)
7.6 Ab Initio Calculations of Dipeptides and Oligopeptides

Many of the conformational properties of peptide systems, including protein conformation, can be approximated in terms of the local interactions encountered in dipeptides, where the two torsional angles φ (N-C(α)) and ψ (C(α)-C') are the main conformational variables. N-acetyl N'-methyl alanine amide, shown in Fig. 7.11, is a model dipeptide that has been the subject of numerous computational studies. The dipeptide model implies that the peptide unit is basically rigid and planar, which is a good first-order approximation, with characteristic deviations (M. W. MacArthur, 1996G). The effectiveness of the model and the significance of the main chain φ and ψ torsions for protein folding were first discovered by Sasisekharan (1962G), but the φ,ψ plots of peptide conformational properties are usually called Ramachandran plots, following the review paper by Ramachandran and Sasisekharan (1968G). Thus, the Sasisekharan plots met the same fate as many other important discoveries in science which were not specifically named after their inventor.

Fig. 7.11 N-acetyl N'-methyl alanine amide. Identification of the φ (N-C(α)) and ψ (C(α)-C') torsional angles.

The low-energy conformations of dipeptides and their features were reliably identified in the very early computational work, based on empirical and semi-empirical procedures (see Lewis et al. 1973aG; Pullman et al. 1974G). They include the following regions in φ,ψ-space, identified by their approximate φ,ψ-values: C7eq (φ = -80°, ψ = +80°), C5 (φ = -155°, ψ = +155°), C7ax (φ = +78°, ψ = -64°), β2 (φ = -134°, ψ = +38°), αR (φ = -74°, ψ = -45°), and αL (φ = +54°, ψ = +54°). A systematic description of the expected energy minima has been given by Head-Gordon et al. (1991P). Two characteristic low-energy forms of N-acetyl N'-methyl alanine amide (ALA) are shown in Fig. 7.12. The C7eq form shown in Fig. 7.12 is the global energy minimum of ALA and is characterized by an intraresidue stabilizing interaction between the N-H and C=O groups, forming an O-C-N-C-C-N-H seven-membered nonplanar ring with the β-carbon in equatorial position. The axial arrangement of the side group in this conformer is also stable, yielding the C7ax form. The C5 form shown in Fig. 7.12 is characterized by a five-membered ring closed by a stabilizing nonbonded interaction, involving O-C-C-N-H. This is the extended form of peptides.
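The approximate φ,ψ-values listed above can be used to label an arbitrary conformation by its closest low-energy region. The following is a minimal sketch, not a published classification scheme: the nearest-centre rule, the function names, and the ASCII labels (including `beta2` for the centre at φ = -134°, ψ = +38°) are assumptions made for illustration. The only point it is meant to show is that distances in φ,ψ-space must respect the 360° periodicity of both torsions.

```python
import math

# Approximate region centres (phi, psi) in degrees, as quoted in the text.
# The ASCII labels and the nearest-centre rule are illustrative assumptions.
REGION_CENTRES = {
    "C7eq": (-80.0, 80.0),
    "C5": (-155.0, 155.0),
    "C7ax": (78.0, -64.0),
    "beta2": (-134.0, 38.0),
    "alphaR": (-74.0, -45.0),
    "alphaL": (54.0, 54.0),
}


def angular_difference(a, b):
    """Smallest signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def nearest_region(phi, psi):
    """Return (label, distance in degrees) of the closest listed centre."""
    best_label, best_distance = None, float("inf")
    for label, (phi0, psi0) in REGION_CENTRES.items():
        distance = math.hypot(angular_difference(phi, phi0),
                              angular_difference(psi, psi0))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label, best_distance
```

For example, `nearest_region(175.0, -170.0)` is assigned to C5 even though the raw numbers look far from (-155°, +155°), because the wrap-around at ±180° is taken into account.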
Fig. 7.12 The C7eq (bottom) and C5 (top) forms of N-acetyl N'-methyl alanine amide.
Fig. 7.13 Characteristic nonbonded interactions in N-acetyl N'-methyl serine amide. Torsional angles to construct this figure were taken from K. Siam, S. Q. Kulp, J. D. Ewbank, and L. Schafer, J. Mol. Struct., 184 (1989) 143.
When the side group is more complex than in ALA, interactions with the backbone can significantly affect the relative energies of the various conformers. This is the case, for example, in N-acetyl N'-methyl serine amide (SER). As shown in Fig. 7.13, various conformers of SER form complex polycyclic systems (Siam et al. 1987P, 1989P) with simultaneous five-, six-, and seven-membered rings. Compared to the simpler ALA and GLY, the conformational energy surface of SER shows that increasing the complexity of the amino acid side group leads to deeper energy minima on the PES and, at the same time, makes it possible to construct minimum-energy conformers in different regions which are energetically equivalent.

In Fig. 7.14 we present a summary of the nomenclature used by protein crystallographers for the various regions of φ,ψ-space, and approximate limits for the observed distributions of dipeptide conformations, as presented by Karplus (1996G). It is seen in Fig. 7.14 that, among the forms identified above, the C7-forms correspond to the crystallographic γ and γ' (Fig. 7.14); C5 corresponds to the region of β-sheets, or the crystallographic βS (Fig. 7.14); ' is close to ; αR is the right-handed α-helix region; αL is the mirror image of αR; and δR is commonly referred to as the bridge region. It is also seen from Fig. 7.14 that, for residues in proteins, the most heavily populated regions of φ,ψ-space include the β-sheet and α-helix regions.

Fig. 7.14 Nomenclature for characteristic regions of peptide φ,ψ-space taken from Karplus (1996). The frequencies of observed peptide conformations in protein crystal structures decrease from areas enclosed by a heavy solid line to regions enclosed by a plain solid line, to dashed outlines. Areas outside the dashed lines are disallowed in peptide conformational space. The lines are an approximate rendering of the exact contours given by Karplus (1996).

The first geometry optimizations of dipeptides, or more precisely of the model dipeptides ALA and the glycine homolog, GLY, were performed in the early eighties (Schafer et al. 1982P; Scarsdale et al. 1983P), using Pulay's gradient method (Pulay 1969G) at the double-zeta HF/4-21G level (Pulay 1979G). These calculations were performed somewhat before their time, because they demanded an enormous computing effort. For example, a single cycle in the geometry optimization of GLY required about 40 hours of CPU time on an IBM mainframe of that era. Approximately 30 cycles were needed to relax a single structure, and four structures were included in the first paper on GLY (Schafer et al. 1982P), which, at a standard rate of a thousand dollars per hour (even though considered "paper" money), represented a block of computing time not easily obtained.

Among the results engendered by the early ab initio geometry optimizations of GLY and ALA (Schafer et al. 1982P; Scarsdale et al. 1983P; Schafer et al. 1984P) and their continuations (Siam 1987P, 1989P), two aspects are particularly noteworthy. These are (a) that the αR right-handed helix is not an energy minimum in φ,ψ-space; and (b) that, for the complete description of the conformational properties of peptides, conformational geometry maps are as important as conformational energy maps.

The right-handed α-helix, or the helix region in general, is one of the most important conformational regions of peptides in proteins (see Fig. 7.14). Thus, force-field parameter developments for empirical energy calculations of peptides have typically aimed at producing an energy minimum at αR. In contrast, when a series of geometry optimizations of ALA was started with a right-handed α-helical structure (Schafer et al. 1984P), it did not lead to an energy minimum for this conformer, but the path of optimization continued through the bridge region into the vicinity of . Details are shown in Fig. 7.15.
Fig. 7.15 Empirical energy modeling procedures of dipeptides traditionally attempted to locate a minimum of potential energy in the αR-helical region (φ ~ -70°, ψ ~ -40°), because of the frequent population of this region in proteins. In contrast, when the geometry of the model dipeptide N-acetyl N'-methyl alanine amide was optimized at the ab initio HF/4-21G level, somewhat unexpectedly the calculations led from the helical region through the bridge to lower energies in the ' region, without finding a minimum for the αR-helical region (L. Schafer, V. J. Klimkowski, F. A. Momany, H. Chuman, and C. Van Alsenoy, Biopol., 23 (1984) 2335). The energy values of the figure were taken from the paper quoted above and represent approximate values because of the level of calculation and because geometry was not optimized at each step, but they illustrate very well the lack of an energy minimum at αR. More recently, confirming experimental evidence was found from gas-electron diffraction, indicating that the α-helical form is not significantly populated in vapors of N-acetyl N'-methyl alanine amide at 500 K (Schafer et al. 1995cG).
More advanced calculations of ALA are now easily performed and have confirmed the early results (Head-Gordon et al. 1991P). The best interpretation of this aspect of the ab initio calculations of ALA is that helix stability is due to long-range or cooperative effects. Indeed, evidence for cooperative effects in α-helices was already discovered in 1982 by some pioneering ab initio single-point energy calculations of oligopeptides (van Duijnen 1982P).

The conclusion that the α-helix is not an energy minimum in peptide φ,ψ-space is also supported by a recent gas-electron diffraction analysis of ALA (Schafer et al. 1995cG). In order to interpret the gas electron diffraction (GED) intensities for a molecule as complex as ALA, one must proceed by constructing various theoretical models that represent conformational mixtures of interest, and compare them with the experimental data using least-squares refinement procedures. Due to the random nature of molecular ensembles in the vapor phase, many models can be constructed for complex molecules that agree with the same set of GED intensity data in a satisfactory way, so that one cannot distinguish between them. Nevertheless, as it turned out, when model intensities for conformational mixtures of ALA were calculated and refined from the GED data by least-squares procedures subject to geometrical constraints taken from ab initio geometries, no physically meaningful model conformational mixture containing αR could be found. This finding provides definite evidence that, at ~500 K, the temperature of the GED study, the αR-helical form is not a significantly populated conformer.

The GED study of ALA yields a second interesting result (Schafer et al. 1995cG). In most calculations of ALA, regardless of method, the C7eq form is consistently the global energy minimum, about 1 kcal/mol lower in energy than C5. This result seems to be in conflict with the fact that the C5-region in proteins is much more frequently populated than C7eq.
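To put the energy gap of roughly 1 kcal/mol in perspective, the sketch below evaluates a simple two-state Boltzmann ratio at the GED temperature of about 500 K. This model is not taken from the chapter: the two-state treatment, the function name, and in particular the entropy difference of 2 cal/(mol K) used in the second call are hypothetical values chosen only to show how a modest entropy advantage can offset such an energy gap.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def population_ratio(delta_e_kcal, temperature_k, delta_s_cal=0.0):
    """Two-state ratio N(upper)/N(lower) from a free-energy difference.

    delta_e_kcal: energy of the upper form minus the lower form, kcal/mol.
    delta_s_cal:  entropy of the upper form minus the lower form, cal/(mol K).
    Both the two-state model and any entropy value passed in are assumptions.
    """
    delta_g = delta_e_kcal - temperature_k * delta_s_cal / 1000.0
    return math.exp(-delta_g / (R_KCAL * temperature_k))


# Energy alone: C5 lying 1 kcal/mol above C7eq at 500 K.
energy_only = population_ratio(1.0, 500.0)

# Hypothetical entropy advantage of 2 cal/(mol K) for C5 at the same temperature.
with_entropy = population_ratio(1.0, 500.0, delta_s_cal=2.0)
```

Under these assumptions the energy-only ratio is about 0.37, whereas the hypothetical entropy term brings the two populations to parity.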
Again, one might invoke long-range effects but must also consider that the C7eq form is in a conformational region of high torsional sensitivity (Cao 1993G), whereas the torsional sensitivity for C5 is low, indicating that the latter has a considerably higher entropy than the former. In this situation, it is important that the GED data of ALA allow the conclusion (Schafer et al. 1995cG) that the C5 form in vapors of ALA is not less populated than C7eq.

The significance of cooperativity in helical conformations is further supported by the fact that stable 3₁₀-type helical forms were found in HF/4-21G geometry optimizations of the model hexapeptides, N-formyl pentaglycine amide and N-formyl pentaalanine amide (Schafer et al. 1993P). The latter is shown in Fig. 7.16, demonstrating that computational procedures which do not yield helical energy minima for dipeptides can establish stable helical forms for larger systems in which stabilizing long-range interactions are possible. The fact that the calculations yielded a 3₁₀-type rather than an α-helix is in agreement with experimental observations (Miick 1992P) according to which short alanine-based peptides may form 3₁₀-helices and not α-helices in aqueous solution. It is also interesting to note that helical forms can be stabilized in dipeptides by the presence of the reaction field of a solvent (Shang et al. 1994P), indicating that helical stabilization in dipeptides is due to the environment.

Fig. 7.16 3₁₀-type helical conformation obtained by HF/4-21G geometry optimizations of N-formyl pentaalanine amide (Schafer et al. 1993P).

Due to advances in computational software and equipment, calculations which required days to complete only ten years ago can now be performed within minutes on a workstation of modest size. Accordingly, ab initio studies with geometry optimization are now being published in large numbers, involving model dipeptides at the post-HF level (Frey et al. 1992P), HF calculations of dipeptides with various residues in addition to GLY and ALA (Bohm 1993P; Bohm et al. 1991P; Cheam 1992P, 1993P; Cheam et al. 1985P, 1986P, 1989P, 1990P; Dive et al. 1994P; Fischer et al. 1994P; Gould et al. 1992P, 1993P, 1994P; Klimkowski 1985P; McAllister 1993bP; Perczel et al. 1991aP, 1996P; Ramek et al. 1995P; Sapse et al. 1986P, 1987P, 1988P, 1990P; Viviani et al. 1993aP; Weiner et al. 1984P), and oligopeptides (Bohm et al. 1995P; McAllister 1993aP; Perczel et al. 1992P, 1993P, 1994P; Schafer et al. 1993P; van Alsenoy et al. 1993P). Studies of oligopeptides profit a great deal from computational procedures specifically developed for ab initio geometry optimizations of large molecules, such as the Multiplicative Integral Approximation programmed by van Alsenoy (1988G).

Oligopeptides are interesting because they allow the exploration of features not afforded by dipeptides. For example, Perczel et al. (1993P) investigated the structural properties of β-turns in conformations of N-formyl alanine alanine amide, identifying a large number of triamide structures as structural subunits of globular proteins as determined by X-ray crystallography. Double-type-I and double-type-II β-turns were found by Bohm et al. (1995P) in their study of cyclohexaglycine. Comparisons between bends in N-formyl pentaglycine amide and N-formyl pentaalanine amide were made by Schafer et al. (1993P). When the starting structure for the refinement of the β-turn of pentagly was a classic type II bend, the structure resulting from the HF/4-21G geometry optimization (shown in Fig. 7.17) was an unusual bend not common in proteins. Its dihedral angles are unique due to the interactions between end groups. The result is of interest because it shows that the interaction between two residues which are not part of the bend can significantly affect the torsional states of residues in the bend. Because of their great significance for protein conformation, detailed systematic classifications of bend structures can be found in various summaries (Barlow et al. 1988G; Kabsch et al. 1983G; Lewis et al. 1973bG; Milner-White et al. 1987G; Sibanda et al. 1985G; and Richardson et al. 1981G, 1989G) following the pioneering work by Venkatachalam (1968G).
Fig. 7.17 Bend structure resulting from the HF/4-21G geometry refinement of a type-II bend of N-formyl pentaglycine amide. The torsional angles in this structure are not common in proteins due to the effects of the end groups on the dihedral angles in the bend.
7.7 The Importance of Peptide Conformational Geometry Maps

Conformational geometry maps are important for peptides because their structures (the main chain bond lengths and angles) change considerably as one walks across the φ,ψ-surface. An example is presented in Fig. 7.18, in which the variation of the N-C(α)-C' angle of ALA in φ,ψ-space is shown. Relative values are plotted which were obtained from HF/4-21G calculations (Schafer et al. in press, G) and which show that the changes in this important backbone parameter exceed 10° as one proceeds from the β-region through the bridge into the α-helical region. It is obvious that, if such geometrical trends are neglected in the modeling of peptides and proteins, the error introduced by this neglect along an extended peptide chain can be significant. Indeed, when force-field parameters for empirical energy calculations of peptides were adjusted to include the structural trends obtained from the ab initio geometries, considerable improvements in the modeling results were achieved (H. Chuman et al. 1984G; F. A. Momany et al. 1990G, 1993G).

Fig. 7.18 Plots of relative N-C(α)-C' angle values (surfaces of differences, in degrees, relative to the values at φ = ψ = 180°) for the φ,ψ-space of ALA. The top surface represents values directly calculated for ALA as a whole by HF/4-21G geometry optimizations; the center surface represents simulated parameter values which were obtained using the conformational geometry function additivity principle as described in the text. The bottom surface is the difference, top minus center. All surfaces were plotted with the same scale factor, but offset by arbitrary and constant amounts for the sake of graphical clarity. The numerical values used to construct this figure were taken from L. Schafer, M. Cao, M. Ramek, B. J. Teppen, S. Q. Newton, and K. Siam, J. Mol. Struct., in press.

The dependence of bond lengths and angles on associated torsional angles can be described by conformational geometry functions (CGF), which have the property of being approximately additive (L. Schafer et al. 1986G, in press, G). CGF additivity arises from the fact that the interactions encountered during torsional motion in a complex molecule can be approximately represented as the sum of the interactions encountered by individual structural components. For the case of ALA, for example, it is shown in Fig. 7.19 that the φ-rotation can be considered as the rotation of three bonds or groups, C-CH3, C-H, and C-(CONHCH3), against a stationary (H)N(COCH3) background. The interactions encountered during this rotation in ALA are the same as those encountered during the N-C rotation of N-ethyl acetamide, NEA, plus those encountered during the φ-rotation in N-acetyl N'-methyl glycine amide, GLY, offset by a phase difference of 120°, minus the interactions encountered during the N-C rotation in N-methyl acetamide, NMA. For the ψ-rotation in ALA (Fig. 7.20), the corresponding combination involves N-methyl propanamide, NMPA, plus GLY, minus NMA. For more details concerning the addition of CGF see Schafer et al. (in press, G).

Fig. 7.19 Illustration of the additivity of conformational geometry functions for the φ-rotation in (CH3CO)(H)N-C(CH3)(H)(CONHCH3) (ALA). During the torsional motion about the N-C(α) bond of ALA, the interactions within the system are the same as those encountered during the N-C torsion in N-ethyl acetamide (NEA), plus those encountered during the N-C(α) torsion in N-acetyl N'-methyl glycine amide offset by 120° as shown (GLY), minus those encountered during the N-C torsion in N-methyl acetamide (NMA).

Fig. 7.20 Illustration of the additivity of conformational geometry functions for the ψ-rotation in (CH3NH)(O=)C-C(CH3)(H)(NHCOCH3) (ALA). During the torsional motion about the C(α)-C' bond of ALA, the interactions within the system are the same as those encountered during the C(=O)-C torsion in N-methyl propanamide (NMPA), plus those encountered during the C(α)-C' torsion in N-acetyl N'-methyl glycine amide offset by 120° as shown (GLY), minus those encountered during the C-C torsion in N-methyl acetamide (NMA).

When the additivity scheme described above is applied to the N-C(α)-C' angle of ALA, using the HF/4-21G ab initio geometries of all the species involved (Schafer et al. in press, G), the surface called "simulated" in Fig. 7.18 is obtained. It is seen from Fig. 7.18 that the simulated surface is similar to the exact surface (the one directly calculated for ALA as a whole by HF/4-21G geometry optimizations) in many regions of φ,ψ-space where the additivity of CGF is closely obeyed. In other regions relatively large deviations occur, as shown by the difference surface (Fig. 7.18), indicating the action of cooperative effects. It is an important aspect of this feature that the topology of the directly calculated and simulated surfaces is essentially similar and that additivity holds more closely for relative trends (parameter differences referred to a selected point of φ,ψ-space) than for absolute parameter values, where pronounced cooperative effects are frequently found (Schafer et al. in press, G).
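The additivity rule described above for the rotation about the N-C(α) bond (ALA ≈ NEA + GLY offset by 120° − NMA) can be written down as a short sketch. Everything numerical here is hypothetical: the one-term cosine functions stand in for the tabulated HF/4-21G conformational geometry functions, their coefficients are invented for illustration, and the sign convention of the 120° offset is an assumption, since the text only states that a phase difference of 120° is applied.

```python
import math


def toy_cgf(base, amplitude, periodicity):
    """Hypothetical one-term cosine model of a parameter vs. torsion (degrees)."""
    def cgf(torsion_deg):
        return base + amplitude * math.cos(math.radians(periodicity * torsion_deg))
    return cgf


def simulate_ala_parameter(phi_deg, nea, gly, nma, gly_offset_deg=120.0):
    """Additive estimate: ALA(phi) ~ NEA(phi) + GLY(phi + offset) - NMA(phi).

    The sign of the offset is an assumption; the text gives only its size.
    """
    return nea(phi_deg) + gly(phi_deg + gly_offset_deg) - nma(phi_deg)


# Invented component functions (values in degrees), for illustration only.
nea = toy_cgf(110.0, 1.5, 3)
gly = toy_cgf(112.0, 2.0, 1)
nma = toy_cgf(110.0, 1.0, 3)


def relative_trend(phi_deg, reference_deg=180.0):
    """Simulated value relative to a reference torsion, as plotted in Fig. 7.18."""
    return (simulate_ala_parameter(phi_deg, nea, gly, nma)
            - simulate_ala_parameter(reference_deg, nea, gly, nma))
```

Working with `relative_trend` rather than the absolute value mirrors the observation in the text that additivity holds more closely for differences referred to a selected point than for absolute parameter values.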
7.8 Predictions of Protein Backbone Bond Distances and Angles from First Principles

It is always good to ask to what extent the results from ab initio calculations agree with experimental results. In general it is reasonable to expect that small energy differences are difficult to obtain accurately from ab initio calculations, but calculated geometry trends, particularly for organic compounds, are usually found in very good agreement with experimental molecular structures determined in the vapor phase. The agreement with crystal structures is an entirely different matter, and for a long time it was not clear what significance the ab initio peptide local geometries might have for protein crystal structures.

In order to study this question in a more systematic way, we have recently optimized 144 different structures of ALA at the HF/4-21G level, covering the entire φ/ψ-space by a 30° grid (Schafer et al. 1995aG, 1995bG). From the resulting coordinates of ALA, analytical functions were derived for the most important main chain structural parameters, such as N-C(α), C(α)-C', and N-C(α)-C', expanding them in terms of natural cubic spline parameters. In fact, Fig. 7.18 is an example of the type of conformational geometry map that can be derived from this procedure. The spline-function representation of the conformational dependence of N-C(α)-C' allows one to calculate this parameter at any desired point in φ/ψ-space. Thus, the relevance of the trends shown in Fig. 7.18 for protein crystal structures can now be explored by calculating the values for N-C(α)-C' at the φ/ψ-torsions found for the residues in protein crystal structures and comparing the two sets of parameters.

As an example, in Fig. 7.21 the N-C(α)-C' angles calculated in this way for each residue of crambin are plotted on top of the corresponding values taken from its crystal structure (Teeter et al. 1993G). It is seen from Fig. 7.21 that the agreement is excellent, even though the calculated values were derived from the model dipeptide (Schafer et al. 1995aG, 1995bG) of a single residue in a hypothetical vibrationless vacuum state, while the comparison is with residues of various types in an extended and vibrationally averaged peptide chain in a crystal environment; the rms deviation between calculated and experimental values is 1.6° for the 0.83 Å resolution crystal structure. An interesting discrepancy is found (Fig. 7.21) for residues 7 to 16 of crambin, which form an α-helical strand (Teeter et al. 1993G). In this part of the crambin chain the calculated (dipeptide) values for N-C(α)-C' are consistently too large, by ~2°. We have interpreted this feature (Schafer et al. 1995aG, 1995bG) as an indication of helix compression due to cooperative effects which are present in extended chains but not in the dipeptide moiety. A similar phenomenon but with the opposite sign, β-expansion, is found for β-sheets (Jiang et al. 1995aG, 1995bG).

In order to evaluate the utility of the ab initio structures of ALA in a more extensive way, they were compared (Jiang 1995aG, 1995bG) with the N-C(α)-C' angles of some forty high-resolution protein crystal structures. For the ith residue of each selected protein (Jiang 1995aG, 1995bG), the crystallographic N-C(α)-C' angle (cryst i) was compared with the value calculated (calc i) at the crystallographic φi/ψi-torsions. The average rms deviation between the two sets of parameters for the selected proteins was found to be 2.9°, which compares to observed angle variations exceeding 10°. When the cryst i and calc i values were ordered by regions in φ/ψ-space defined by a 30° grid and region-average values were calculated, the average rms deviation between the two sets in the most populated regions was 1.2° (Jiang 1995aG, 1995bG). In the α-helical region the dipeptide-based angle predictions were found to be on average 1.5° too high compared to experimental values, indicating helix compression in proteins compared to dipeptides. Similarly, in the β-region the calculated angles are on average 2.1° too low, indicating β-expansion in extended sheets compared to dipeptides.

Protein crystal structures depend to some extent on the modeling procedure that is used to analyze the diffraction data.
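The workflow described in this section (tabulate a backbone parameter on the 30° grid of 144 structures, evaluate it at the torsions of each residue, and compare with observed values by an rms deviation) can be outlined as follows. This is a sketch under stated assumptions, not the authors' program: periodic bilinear interpolation is used here as a simple stand-in for the natural cubic splines named in the text, and `toy_surface` is an invented function that merely supplies numbers in place of the HF/4-21G results.

```python
import math

STEP = 30.0             # grid spacing in degrees, as in the text
N = int(360 / STEP)     # 12 points per axis, 12 * 12 = 144 grid structures


def toy_surface(phi, psi):
    """Invented stand-in for a calculated N-C(alpha)-C' angle (degrees)."""
    return 110.0 + 3.0 * math.cos(math.radians(phi)) + 2.0 * math.cos(math.radians(psi))


def build_grid(angle_function):
    """Tabulate angle_function(phi, psi) at -180, -150, ..., 150 degrees."""
    return [[angle_function(-180.0 + i * STEP, -180.0 + j * STEP)
             for j in range(N)] for i in range(N)]


def interpolate(grid, phi, psi):
    """Periodic bilinear interpolation (simplified stand-in for cubic splines)."""
    x = ((phi + 180.0) % 360.0) / STEP
    y = ((psi + 180.0) % 360.0) / STEP
    i0, j0 = int(x) % N, int(y) % N
    i1, j1 = (i0 + 1) % N, (j0 + 1) % N   # wrap around at +/-180 degrees
    tx, ty = x - int(x), y - int(y)
    return ((1 - tx) * (1 - ty) * grid[i0][j0] + tx * (1 - ty) * grid[i1][j0]
            + (1 - tx) * ty * grid[i0][j1] + tx * ty * grid[i1][j1])


def rms_deviation(calculated, observed):
    """Root-mean-square deviation between two equally long lists of angles."""
    squares = [(c - o) ** 2 for c, o in zip(calculated, observed)]
    return math.sqrt(sum(squares) / len(squares))


grid = build_grid(toy_surface)
```

Given a list of per-residue torsions, predicted angles would be obtained as `[interpolate(grid, phi, psi) for phi, psi in torsions]` and passed to `rms_deviation` together with the crystallographic values.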
In order to ascertain that the results described above are not an artifact caused by crystal structure refinement, the same comparisons were also made (Jiang 1995bG, 1997G) with the crystal structures of oligopeptides, whose small-molecule data are analyzed without any model restraints. The same agreement was found (Jiang 1995bG, 1997G) as with proteins. More recently, in a brilliant paper, Karplus (1996G) has created a new and improved database for protein structures which provides direct experimental evidence that a definite correlation exists between the φ/ψ-torsions and the main chain structural parameters of proteins. The results will undoubtedly have some impact on the refinement of protein crystal data since deviations from torsion-independent ideal geometries are now used as a goodness-of-fit criterion. Instead it is suggested to establish detailed dictionaries of flexible geometry functions for use in empirical peptide and protein modeling.

Fig. 7.21 Comparison of calculated and observed N-C(α)-C' main chain bond angles in crambin. Angle values are plotted for each residue with the numbering of the crystal structure (Teeter et al. 1993G). The experimental values were extracted from the crystal structure using the MSI (at the time of the original study BIOSYM) InsightII/Discover software package (MSI, 9685 Scranton Road, San Diego, CA 92121). The calculated values (simulated) were obtained by taking the crystallographic φ and ψ values of each residue as input to a program (PEPPSII) that calculates peptide main chain structural parameters as functions of φ and ψ as described in the text. The data base used in the program has been published in L. Schafer, M. Cao, M. Ramek, B. J. Teppen, S. Q. Newton, and K. Siam, J. Mol. Struct., in press.

The analysis of peptide and protein crystal structures described above is summarized in Figs. 7.22 and 7.23. In the α-helical regions of peptides (no. 13, Fig. 7.22) and proteins (no. 26, Fig. 7.23) the helix compression described above is clearly seen. In the vicinity of the β-region of proteins (nos. 1-5, Fig. 7.23) the β-expansion described above is seen. In the case of the peptides this trend is less well established (Fig. 7.22), we think because of the small number of examples currently available.

Fig. 7.22 Comparison of calculated and observed (x-ray) mean N-C(α)-C' bond angles for oligopeptides. Regions of φ/ψ-space and region numbering are identified in the lower graph. All numerical values were taken from Jiang et al. (1997G). Values plotted are the region-average values of the observed and calculated N-C(α)-C' angles for the most populated regions (N 3) of a set of oligopeptides selected as described in the reference quoted above. Experimental values (x-ray) were taken from the Cambridge Crystallographic Data File (Allen, Davies, Galloy, Johnson, Kennard, Macrae, Mitchell, Smith & Watson, 1991). The calculated values were obtained by taking the φ- and ψ-values of each residue in the crystal structures as input to a program that calculates the values of peptide main chain parameters as functions of φ and ψ. Further details are described in the text.

Fig. 7.23 Comparison of calculated and observed (x-ray) mean N-C(α)-C' bond angles for 37 proteins selected as described by Jiang et al. (1997G). This reference is also the source of the values plotted, which are the region-average values of the observed and calculated angles. Regions of φ/ψ-space and region numbering are explained in the lower graph. Experimental values (x-ray) were taken from the Brookhaven Protein Data Bank (Chemistry Department, Building 555; Brookhaven National Laboratory, Box 5000, Upton N.Y. 11973-5000). The calculated values were obtained as described in the text.

7.9 Conclusions

Finding close agreement between the result of a calculation and an established body of empirical data is a delightful experience. To those of us who have witnessed the evolution of the computational tools during their own lifetime, the process has at times appeared like a revolution and a mutation of chemistry (Schafer 1983G, 1991G). The reference section given at the end of this chapter demonstrates that ab initio calculations of amino acids and peptides are at the base of a very active field and have become a powerful source of information on these important molecules.

References

Abraham, R. J., and B. Hudson. 1985. "Charge Calculations in Molecular Mechanics. III: Amino Acids and Peptides," J. Comput. Chem. 6, 173-181.
Alper, J. S., H. Dothe, and M. A. Lowe. 1992. "Scaled Quantum Mechanical Calculation of the Vibrational Structure of the Solvated Glycine Zwitterion," Chem. Phys. 161, 199-209.
Barron, L. D., A. R. Gargaro, L. Hecht, P. L. Polavarapu. 1991. "Experimental and Ab Initio Theoretical Vibrational Raman Optical Activity of Alanine," Spectrochimica Acta 47A, 1001-1016.
Basch, H., and W. J. Stevens. 1990. "The Structure of Glycine-Water H-Bonded Complexes," Chem. Phys. Letters 169, 275-280.
Bonaccorsi, R., P. Palla, and J. Tomasi. 1984. "Conformational Energy of Glycine in Aqueous Solutions and Relative Stability of the Zwitterionic and Neutral Forms. An Ab Initio Study," J. Am. Chem. Soc. 106, 1945-1950.
Bouchonnet, S., and Y. Hoppilliard. 1992. "Proton and Sodium Ion Affinities of Glycine and Its Sodium Salt in the Gas Phase. Ab Initio Calculations," Org. Mass Spectrometry 27, 71-76.
Cao, M., S. Q. Newton, J. Pranata, and L. Schafer. 1995. "Ab Initio Conformational Analysis of Alanine," J. Mol. Struct. 332, 251-267.
Chipot, C., B. Maigret, J.-L. Rivail, and H. A. Scheraga. 1992. "Modeling Amino Acid Side Chains. 1. Determination of Net Atomic Charges from Ab Initio Self-Consistent-Field Molecular Electrostatic Properties," J. Phys. Chem. 96, 10276-10284.
Clementi, E., F. Cavallone, and R. Scordamaglia. 1977. "Analytical Potentials from ab Initio Computations for the Interaction Between Biomolecules. 1. Water with Amino Acids," J. Am. Chem. Soc. 99, 5531-5545.
Csaszar, A. G. 1995. "On the Structures of Free Glycine and α-Alanine," J. Mol. Struct. 346, 141-152.
Csaszar, A. G. 1992. "Conformers of Gaseous Glycine," J. Am. Chem. Soc. 114, 9568-9575.
de Dios, A. C., J. G. Pearson, and E. Oldfield. 1993. "Chemical Shifts in Proteins: An Ab Initio Study of Carbon-13 Nuclear Magnetic Resonance Chemical Shielding in Glycine, Alanine, and Valine Residues," J. Am. Chem. Soc. 115, 9768-9773.
Depke, G., N. Heinrich, and H. Schwarz. 1984. "On the Gas Phase Chemistry of Ionized Glycine and Its Enol. A Combined Experimental and Ab Initio Molecular Orbital Study," Int. J. Mass Spectrom. Ion Processes 62, 99-117.
Destro, R., R. Bianchi, and G. Morosi. 1989. "Electrostatic Properties of L-Alanine from X-ray Diffraction at 23 K and Ab Initio Calculations," J. Phys. Chem. 93, 4447-4457.
Destro, R., R. E. Marsh, and R. Bianchi. 1988. "A Low-Temperature (23 K) Study of L-Alanine," J. Phys. Chem. 92, 966-973.
Ding, Y., and K. Krogh-Jespersen. 1992. "The Glycine Zwitterion Does Not Exist in the Gas Phase: Results from a Detailed Ab Initio Electronic Structure Study," Chem. Phys. Letters 199, 261-266.
Dixon, D. A., and W. N. Lipscomb. 1976. "Electronic Structure and Bonding of the Amino Acids Containing First Row Atoms," J. Biol. Chem. 251, 5992-6000.
Dykstra, C. E., R. A. Chiles, and M. D. Garrett. 1981. "Recent Computational Developments with the Self-Consistent Electron Pairs Method and Application to the Stability of Glycine Conformers," J. Comput. Chem. 2, 266-272.
Ewbank, J. D., V. J. Klimkowski, K. Siam, and L. Schafer. 1987. "Conformational Analysis of the Methyl Ester of Alanine by Gas Electron Diffraction and Ab Initio Geometry Optimization," J. Mol. Struct. 160, 275-285.
Gatti, C., R. Bianchi, R. Destro, and F. Merati. 1992. "Experimental vs. Theoretical Topological Properties of Charge Density Distributions. An Application to the L-Alanine Molecule Studied by X-ray Diffraction at 23 K," J. Mol. Struct. (Theochem) 255, 409-433.
Hu, C.-H., M. Shen, and H. F. Schaefer III. 1996. "Glycine Conformational Analysis," J. Am. Chem. Soc. 115, 2923-2929.
Jensen, F. 1992. "Structure and Stability of Complexes of Glycine and Glycine Methyl Analogues with H+, Li+, and Na+," J. Am. Chem. Soc. 114, 9533-9537.
Jensen, J. H., and M. S. Gordon. 1991. "The Conformational Potential Energy Surface of Glycine: A Theoretical Study," J. Am. Chem. Soc. 113, 7917-7924.
Kikuchi, O., T. Matsuoka, H. Sawahata, and O. Takahashi. 1994. "Ab Initio Molecular Orbital Calculations Including Solvent Effects by Generalized Born Formula. Conformation of Zwitterionic Forms of Glycine, Alanine and Serine in Water," J. Mol. Struct. (Theochem) 305, 79-87.
Kikuchi, O., T. Natsui, and T. Kozaki. 1990a. "MNDO Effective Charge Model Study of Conformations of Zwitterionic and Neutral Forms of Glycine, Alanine and Serine in the Gas Phase and in Solution," J. Mol. Struct. (Theochem) 305, 79-87.
Kikuchi, O., and H. Wang. 1990b. "Parity-Violating Energy Shift of Glycine, Alanine, and Serine in the Zwitterionic Forms: Calculation Using HFO-NG Basis Sets," Bull. Chem. Soc. Jpn. 63, 2751-2754.
Klimkowski, V. J., J. D. Ewbank, J. N. Scarsdale, L. Schafer, and C. Van Alsenoy. 1985. "Conformational Analysis and Molecular Structures of Valine Methyl Ester," J. Mol. Struct. (Theochem) 124, 175-182.
Klimkowski, V. J., J. N. Scarsdale, and L. Schafer. 1983a. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. 25. Conformational Analysis of Methyl Propanoate and Comparison with the Methyl Ester of Glycine," J. Comput. Chem. 4, 494-498.
Klimkowski, V. J., L. Schafer, L. Van Den Enden, C. Van Alsenoy, and W. Caminati. 1983b. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 28. Comparison of the Observed Ground State Rotational Constants of the Methyl Ester of Glycine with the Rotational Constants Calculated for Some Planar and Non-planar Gradient Geometries," J. Mol. Struct. (Theochem) 105, 169-174.
Kokpol, S. U., P. B. Doungdee, S. V. Hannongbua, B. M. Rode, and J. P. Limtrakul. 1988. "Ab Initio Study of the Hydration of the Glycine Zwitterion," J. Chem. Soc., Faraday Trans. 2, 84, 1789-1792.
Laurence, P. R., and C. Thomson. 1982. "The Boron Analogue of Glycine: A Theoretical Investigation of Structure and Properties," J. Mol. Struct. (Theochem) 88, 37-43.
Laurence, P. R., and C. Thomson. 1981. "A Comparison of the Results of PCILO and Ab Initio SCF Calculations for the Molecules Glycine, Cysteine and N-Acetyl-Glycine," Theoret. Chim. Acta (Berl.) 58, 121-124.
Lelj, F., C. Adamo, and V. Barone. 1994. "Role of Hartree-Fock Exchange in Density Functional Theory. Some Aspects of the Conformational Potential Energy Surface of Glycine in the Gas Phase," Chem. Phys. Letters 230, 189-195.
Lindroos, J., M. Perakyla, J.-P. Bjorkroth, and T. A. Pakkanen. 1992. "Ab Initio Models for Receptor-Ligand Interactions in Proteins. Part 1. Models for asparagine, glutamine, serine, threonine and tyrosine," J. Chem. Soc. Perkin Trans. 2, 2271-2277.
Luke, B. T., A. G. Gupta, G. H. Loew, J. G. Lawless, and D. H. White. 1984. "Theoretical Investigation of the Role of Clay Edges in Prebiotic Peptide Bond Formation. I. Structures of Acetic Acid, Glycine, H2SO4, H3PO4, Si(OH)4, and Al(OH)4-," Int. J. Quantum Chem. Quantum Biol. Symp. 11, 117-135.
Masamura, M. 1988a. "Reliability of AM1 in Conformational Analysis of Unionized Amino Acids," J. Mol. Struct. (Theochem) 168, 227-234.
Masamura, M. 1988b. "Reliability of AM1 in Determining the Equilibrium Structures of Unionized Amino Acids," J. Mol. Struct. (Theochem) 164, 299-311.
Masamura, M. 1987. "Reliability of MNDO in Determining the Equilibrium Structures of Unionized Amino Acids," J. Mol. Struct. (Theochem) 152, 293-303.
Mezey, P. G., J. J. Ladik, and S. Suhai. 1979. "Non-Empirical SCF MO Studies on the Protonation of Biopolymer Constituents I. Protonation of Amino Acids," Theoret. Chim. Acta (Berl.) 51, 323-329.
Millefiori, S., and A. Millefiori. 1983. "On the Relative Stability of Glycine Conformers. The Role of Electron Correlation," J. Mol. Struct. (Theochem) 91, 391-393.
Ni, X., X. Shi, and L. Ling. 1988. "An Interaction Potential Between an Alanine Zwitterion and a Water Molecule Based on Ab Initio Calculations," Int. J. Quantum Chem. 34, 527-533.
No, K. T., K. H. Cho, O. Y. Kwon, M. S. Jhon, and H. A. Scheraga. 1994. "Determination of Proton Transfer Energies and Lattice Energies of Several Amino Acid Zwitterions," J. Phys. Chem. 98, 10742-10749.
Pagliarin, R., G. Sello, and M. Sisti. 1994. "Model Studies for Predicting the Diastereoselectivity in the Condensation of Aldehydes with Zinc and Copper Complexes of Amino Acid Derivatives. Part 1. Analysis and realisation of the models," J. Mol. Struct. (Theochem) 312, 251-259.
Palla, P., C. Petrongolo, and J. Tomasi. 1980. "Internal Rotation Potential Energy for the Glycine Molecule in Its Zwitterionic and Neutral Forms. A Comparison among Several Methods," J. Phys. Chem. 84, 435-442.
Peters, D., and J. Peters. 1982. "Quantum Theory of the Structure and Bonding in Proteins Part 13. The β branched hydrocarbon side chains valine and isoleucine," J. Mol. Struct. (Theochem) 88, 157-170.
Ramek, M. 1990a. "Ab Initio SCF Investigation of β-Alanine," J. Mol. Struct. (Theochem) 208, 301-355.
Ramek, M. 1990b. "Intramolecular Hydrogen Bonding in Neutral Glycine, β-Alanine, γ-Aminobutyric Acid, and δ-Aminopentane Acid," Int. J. Quantum Chem. Quantum Biol. Symp. 17, 45-53.
Ramek, M., and V.K.W. Cheng. 1992. "On the Role of Polarization Functions in SCF Calculations of Glycine and Related Systems with Intramolecular Hydrogen Bonding," Int. J. Quantum Chem. Quantum Biol. Symp. 19, 15-26.
Ramek, M., V.K.W. Cheng, R. F. Frey, S. Q. Newton, and L. Schafer. 1991. "The Case of Glycine Continued: Some Contradictory SCF Results," J. Mol. Struct. (Theochem) 235, 1-10.
Ramek, M., F. A. Momany, D. M. Miller, and L. Schafer. 1996. "On the Importance of Full Geometry Optimization in Correlation-Level Ab Initio Molecular Conformational Analyses," J. Mol. Struct. 375, 189-191.
Ranghino, G., E. Clementi, and S. Romano. 1983. "Lysinium, Argininium, Glutamate, and Aspartate Ions in Water Solution," Biopolymers 22, 1449-1460.
Ringnalda, M. N., Y. Won, and R. A. Friesner. 1990. "Pseudospectral Hartree-Fock Calculations on Glycine," J. Chem. Phys. 92, 1163-1173.
Sapse, A.-M., and D. C. Jain. 1980. "Guanine and Adenine-Amino Acids Interactions: An Ab Initio Study," Int. J. Quantum Chem. 29, 23-29.
Sapse, A.-M., M. Mezei, D. C. Jain, and C. Unson. 1994. "Ab Initio Study of Aspartic and Glutamic Acid: Supplementary Evidence for Structural Requirements at Position 9 for Glucagon Activity," J. Mol. Struct. (Theochem) 306, 225-233.
Schafer, L., S. Q. Kulp-Newton, K. Siam, V. J. Klimkowski, and C. Van Alsenoy. 1990a. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 71. Conformational analysis and structural study of valine and threonine," J. Mol. Struct. (Theochem) 209, 373-385.
Schafer, L., H. L. Sellers, F. J. Lovas, and R. D. Suenram. 1980. "Theory versus Experiment: The Case of Glycine," J. Am. Chem. Soc. 102, 6566-6568.
Schafer, L., K. Siam, V. J. Klimkowski, J. D. Ewbank, and C. Van Alsenoy. 1990b. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 69. Conformational analysis and structural study of cysteine," J. Mol. Struct. (Theochem) 204, 361-372.
Schafer, L., C. Van Alsenoy, J. N. Scarsdale, V. J. Klimkowski, and J. D. Ewbank. 1981. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 18. Conformational Analysis and Molecular Structure of Glycine Methyl Ester," J. Comput. Chem. 2, 410-413.
Sellers, H. L., and L. Schafer, "Ab Initio Equilibrium Structures of Unionized Amino Acids: Alanine," Chem. Phys. Letters 63, 609-611.
Sellers, H. L., and L. Schafer. 1978. "Investigations Concerning the Apparent Contradiction Between the Microwave Structure and the Ab Initio Calculations of Glycine," J. Am. Chem. Soc. 100, 7728-7729.
Siam, K., V. J. Klimkowski, J. D. Ewbank, C. Van Alsenoy, and L. Schafer. 1984. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 39. Conformational analysis of glycine and alanine," J. Mol. Struct. (Theochem) 110, 171-182.
Singh, U. C., F. K. Brown, P. A. Bash, and P. A. Kollman. 1987. "An Approach to the Application of Free Energy Perturbation Methods Using Molecular Dynamics: Applications to the Transformations of CH3OH → CH3CH3, H3O+ → NH4+, Glycine → Alanine, and Alanine → Phenylalanine in Aqueous Solution and to H3O+(H2O)3 → NH4+(H2O)3 in the Gas Phase," J. Am. Chem. Soc. 109, 1607-1614.
Sokalski, W. A., K. Maruszewski, P. C. Hariharan, and J. J. Kaufman. 1989. "Library of Cumulative Atomic Multipole Moments: II. Neutral and Charged Amino Acids," Int. J. Quantum Chem. Quantum Biol. Symp. 16, 119-164.
Sordo, J. A., M. Probst, G. Corongiu, S. Chin, and E. Clementi. 1987. "Ab Initio Pair Potentials for the Interactions between Aliphatic Amino Acids," J. Am. Chem. Soc. 109, 1702-1708.
Sukumar, N., and G. A. Segal. 1986. "Effect of Aqueous Solvation upon the Electronic Excitation Spectrum of the Glycine Zwitterion: A Theoretical CI Study Using a Fractional Charge Model," J. Am. Chem. Soc. 108, 6880-6884.
Sulzbach, H. M., P.v.R. Schleyer, and H. F. Schaefer III. 1994. "Interrelationship Between Conformation and Theoretical Chemical Shifts. Case Study on Glycine and Glycine Amide," J. Am. Chem. Soc. 116, 3967-3972.
Tarakeshwar, P., and S. Manogaran. 1994. "Conformational Effects on Vibrational Frequencies of Cysteine and Serine: An Ab Initio Study," J. Mol. Struct. (Theochem) 305, 205-224.
Tranter, G. E. 1985a. "The Parity Violating Energy Differences between the Enantiomers of α-Amino Acids," Mol. Phys. 56, 825-838.
Tranter, G. E. 1985b. "The Parity-Violating Energy Differences between the Enantiomers of α-Amino Acids," Chem. Phys. Letters 120, 93-96.
Van Alsenoy, C., S. Kulp, K. Siam, V. J. Klimkowski, J. D. Ewbank, and L. Schafer. 1988. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 63. Conformational analysis and structural study of serine," J. Mol. Struct. (Theochem) 181, 169-178.
Van Alsenoy, C., J. N. Scarsdale, and L. Schafer. 1982. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. Part 24. Molecular structures and conformational analyses of the methyl esters of formic acid, acetic acid and alanine," J. Mol. Struct. (Theochem) 90, 297-304.
Van Alsenoy, C., J. N. Scarsdale, H. L. Sellers, and L. Schafer. 1981. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. The Molecular Structures of Two Low-Energy Forms of Unionized Serine," Chem. Phys. Letters 80, 124-126.
Vijay, A., and D. N. Sathyanarayana. 1992. "Theoretical Study of the Ground-State Vibrations of Nonionized Glycine," J. Phys. Chem. 96, 10735-10739.
Vishveshwara, S., and J. A. Pople. 1977. "Molecular Orbital Theory of the Electronic Structures of Organic Compounds. 32. Conformations of Glycine and Related Systems," J. Am. Chem. Soc. 99, 2422-2426.
Voogd, J., J. L. Derissen, and F. B. van Duijneveldt. 1981. "Calculation of Proton-Transfer Energies and Electrostatic Lattice Energies of Various Amino Acids and Peptides Using CNDO/2 and Ab Initio SCF Methods," J. Am. Chem. Soc. 103, 7701-7706.
Williams, R. W., V. F. Kalasinsky, and A. H. Lowrey. 1993. "Scaled Quantum Mechanical Force Field for Cis- and Trans-glycine in Acidic Solution," J. Mol. Struct. (Theochem) 281, 157-171.
Wright, L. R., and R. F. Borkman. 1980. "Ab Initio Self-Consistent Field Calculations on Some Small Amino Acids," J. Am. Chem. Soc. 102, 6207-6210.
Wright, L. R., R. F. Borkman, and A. M. Gabrielli. 1982. "Protonation of Glycine: An Ab Initio Self-Consistent Field Study," J. Phys. Chem. 86, 3951-3956.
Yu, D., D. A. Armstrong, and A. Rauk. 1992. "Hydrogen Bonding and Internal Rotation Barriers of Glycine and Its Zwitterion (Hypothetical) in the Gas Phase," Can. J. Chem. 70, 1762-1772.

Ab Initio Calculations of Peptides

Aida, M. 1993. "Theoretical Studies on Hydrogen Bonding Interactions between Peptide Units," Bull. Chem. Soc. Jpn. 66, 3423-3429.
Aizman, A., and D. A. Case. 1980. "Electronic Structure Calculations on Active Site Models for 4-Fe, 4-S Iron-Sulfur Proteins," J. Am. Chem. Soc. 104, 3269-3279.
Amodeo, P., and V. Barone. 1992. "A New General Form of Molecular Force Fields. Application to Intra- and Interresidue Interactions in Peptides," J. Am. Chem. Soc. 114, 9085-9093.
Bakhshi, A. K., and J. Ladik. 1986. "Ab Initio Study of the Effect of Side-Chain Reactions on the Electronic Structure of Proteins," Chem. Phys. Letters 129, 269-274.
Bakhshi, A. K., J. Ladik, and P. Otto. 1989. "Ab Initio Study of the Effect of Cation Binding on the Electronic Structure of Proteins," J. Mol. Struct. 198, 143-158.
Bakhshi, A. K., P. Otto, and J. Ladik. 1988. "On the Electronic Structure and Conduction Properties of Aperiodic Proteins: Study of Six-Component Polypeptide Chains," J. Mol. Struct. (Theochem) 180, 113-123.
Bakhshi, A. K., P. Otto, C.-M. Liegener, E. Rehm, and J. Ladik. 1990. "Modeling of Real 20-Component Protein Chains: Determination of the Electronic Density of States Different Amino Acid Residues," Int. J. Quantum Chem. 38, 573-583.
Balazs, A. 1991. "Notes on the Local Flexibility of the Polypeptide Backbone Part I. Mean amplitudes of thermal motions in the dipeptide model," J. Mol. Struct. 245, 111-117.
Barone, V., F. Fraternali, and P. L. Cristinziano. 1990. "Sensitivity of Peptide Conformation to Methods and Geometrical Parameters. A Comparative Ab Initio and Molecular Mechanics Study of Oligomers of α-Aminoisobutyric Acid," Macromolecules 23, 2038-2044.
Bellido, M. N., and J.A.C. Rullmann. 1989. "Atomic Charge Models for Polypeptides Derived from Ab Initio Calculations," J. Comput. Chem. 10, 479-487.
Bohm, H.-J. 1993. "Ab Initio SCF Calculations on Low-Energy Conformers of N-Acetylglycylglycine N'-Methylamide," J. Am. Chem. Soc. 115, 6152-6158.
Bohm, H.-J., and S. Brode. 1995. "Ab Initio SCF Calculations on Low-Energy Conformers of Cyclohexaglycine," J. Comput. Chem. 16, 146-153.
Bohm, H.-J., and S. Brode. 1991. "Ab Initio SCF Calculations on Low-Energy Conformers of N-Acetyl-N'-methylalaninamide and N-Acetyl-N'-methylglycinamide," J. Am. Chem. Soc. 113, 7129-7135.
Bour, P., and T. A. Keiderling. 1993. "Ab Initio Simulations of the Vibrational Circular Dichroism of Coupled Peptides," J. Am. Chem. Soc. 115, 9602-9607.
Caillet, J., P. Claverie, and B. Pullman. 1978. "Effect of the Crystalline Environment upon the Rotational Conformation about the N-Cα and Cα-C' Bonds (Φ and Ψ) in Amides and Peptides," Theoret. Chim. Acta (Berl.) 47, 17-26.
Chang, C., and R.F.W. Bader. 1992. "Theoretical Construction of a Polypeptide," J. Phys. Chem. 96, 1654-1662.
Cheam, T. C. 1993. "Normal Mode Analysis of Alanine Dipeptide in the Crystal Conformation Using a Scaled Ab Initio Force Field," J. Mol. Struct. 295, 259-271.
Cheam, T. C. 1992. "Normal Mode Analysis of Glycine Dipeptide in Crystal Conformation Using a Scaled Ab Initio Force Field," J. Mol. Struct. 274, 289-309.
Cheam, T. C., and S. Krimm. 1990. "Ab Initio Force Fields of Alanine Dipeptide in Four Non-Hydrogen Bonded Conformations," J. Mol. Struct. (Theochem) 206, 173-203.
Cheam, T. C., and S. Krimm. 1989a. "Ab Initio Force Fields of Glycine Dipeptide in C5 and C7 Conformations," J. Mol. Struct. 193, 1-34.
Cheam, T. C., and S. Krimm. 1989b. "Ab Initio Force Fields of Alanine Dipeptide in C5 and C7 Conformations," J. Mol. Struct. (Theochem) 188, 15-43.
Cheam, T. C., and S. Krimm. 1986. "Vibrational Properties of the Peptide N-H Bond as a Function of Hydrogen-Bond Geometry: An Ab Initio Study," J. Mol. Struct. 146, 175-189.
Cheam, T. C., and S. Krimm. 1985. "Infrared Intensities of Amide Modes in N-Methylacetamide and Poly(Glycine I) From Ab Initio Calculations of Dipole Moment Derivatives of N-Methylacetamide," J. Chem. Phys. 82, 1631-1641.
Chestnut, D. B., and C. G. Phung. 1991. "Ab Initio Determination of Chemical Shielding in a Model Dipeptide," Chem. Phys. Letters 183, 505-509.
Day, R. S., S. Suhai, and J. Ladik. 1981. "Electronic Structure in Large Finite Aperiodic Polypeptide Chains," Chem. Phys. 62, 165-169.
Dive, G., D. Dehareng, and J. M. Ghuysen. 1994. "Detailed Study of a Molecule in a Molecule: N-Acetyl-L-tryptophanamide in an Active Site Model of α-Chymotrypsin," J. Am. Chem. Soc. 116, 2548-2556.
Endredi, G., C.-M. Liegener, M. A. McAllister, A. Perczel, J. Ladik, and I. G. Csizmadia. 1994. "Peptide Models 8. The Use of a Modified Romberg Formalism for the Extrapolation of Molecular Properties from Oligomers to Polymers. Polyalanine Diamide in Its "Extended Like" or ( L) or (C5)n Conformation," J. Mol. Struct. (Theochem) 306, 1-7.
Faerman, C. H., and S. L. Price. 1990. "A Transferable Distributed Multipole Model for the Electrostatic Interactions of Peptides and Amides," J. Am. Chem. Soc. 112, 4915-4926.
Fernandez, B., M. A. Rios, and L. Carballeira. 1991. "Molecular Mechanics (MM2) and Conformational Analysis of Compounds with N-C-O Units. Parametrization of the Force Field and Anomeric Effect," J. Comput. Chem. 12, 78-90.
Fischer, S., R. L. Dunbrack, Jr., and M. Karplus. 1994. "Cis-Trans Imide Isomerization of the Proline Dipeptide," J. Am. Chem. Soc. 116, 11931-11937.
Fowler, P. W., and G. J. Moore. 1988. "Calculation of the Magnitude and Orientation of Electrostatic Interactions Between Small Aromatic Rings in Peptides and Proteins: Implications for Angiotensin II," Biochem. Biophys. Res. Commun. 153, 1296-1300.
Frey, R. F., J. Coffin, S. Q. Newton, M. Ramek, V.K.W. Cheng, F. A. Momany, and L. Schafer. 1992. "Importance of Correlation-Gradient Geometry Optimization for Molecular Conformational Analyses," J. Am. Chem. Soc. 114, 5369-5377.
Gaspar, R., Jr., and R. Gaspar. 1983. "Ab Initio Molecular Fragment Calculations with Pseudopotentials: Model Peptide Studies," Int. J. Quantum Chem. 24, 767-771.
Gould, I. R., W. D. Cornell, and I. H. Hillier. 1994. "A Quantum Mechanical Investigation of the Conformational Energetics of the Alanine and Glycine Dipeptides in the Gas Phase and in Aqueous Solution," J. Am. Chem. Soc. 116, 9250-9256.
Gould, I. R., and I. H. Hillier, "Solvation of Alanine Dipeptide: A Quantum Mechanical Treatment," J. Chem. Soc., Chem. Commun. 951-952.
Gould, I. R., and P. A. Kollman. 1992. "Ab Initio SCF and MP2 Calculations on Four Low-Energy Conformers of N-Acetyl-N'-methylalaninamide," J. Phys. Chem. 96, 9255-9258.
Grant, J. A., R. L. Williams, and H. A. Scheraga. 1990. "Ab Initio Self-Consistent Field and Potential-Dependent Partial Equalization of Orbital Electronegativity Calculations of Hydration Properties of N-Acetyl-N'-Methyl-Alanineamide," Biopolymers 30, 929-949.
Gresh, N., A. Pullman, and P. Claverie. 1985. "Theoretical Studies of Molecular Conformation. II: Application of the SIBFA Procedure to Molecules Containing Carbonyl and Carboxylate Oxygens and Amide Nitrogens," Theoret. Chim. Acta (Berl.) 67, 11-32.
Guo, H., and M. Karplus. 1994. "Solvent Influence on the Stability of the Peptide Hydrogen Bond: A Supramolecular Cooperative Effect," J. Phys. Chem. 98, 7104-7105.
Hadzi, D., M. Hodoscek, D. Turk, and V. Harb. 1988. "Theoretical Investigations of Structure and Enzymatic Mechanisms of Aspartyl Proteinases Part 2. Ab initio calculations on some possible initial steps of proteolysis," J. Mol. Struct. (Theochem) 181, 71-80.
Head-Gordon, T., M. Head-Gordon, M. J. Frisch, C. L. Brooks III, and J. A. Pople. 1991. "Theoretical Study of Blocked Glycine and Alanine Peptide Analogues," J. Am. Chem. Soc. 113, 5989-5997.
Head-Gordon, T., M. Head-Gordon, M. J. Frisch, C. Brooks III, and J. A. Pople. 1989. "A Theoretical Study of Alanine Dipeptide and Analogs," Int. J. Quantum Chem. Quantum Biol. Symp. 16, 311-322.
Jensen, J. H., K. K. Baldridge, and M. S. Gordon. 1992. "Uncatalyzed Peptide Bond Formation in the Gas Phase," J. Phys. Chem. 96, 8340-8351.
Jewsbury, P., S. Yamamoto, T. Minato, M. Saito, and T. Kitagawa. 1994. "The Proximal Residue Largely Determines the CO Distortion in Carbonmonoxy Globin Proteins. An Ab Initio Study of a Heme Prosthetic Unit," J. Am. Chem. Soc. 116, 11586-11587.
Jiao, D., M. Barfield, and V. J. Hruby. 1993. "Ab Initio IGLO Study of the φ- and ψ-Angle Dependence of the 13C Chemical Shifts in the Model Peptide N-Acetyl-N'-methylglycinamide," J. Am. Chem. Soc. 115, 10883-10887.
Kertesz, M., J. Koller, and A. Azman. 1980. "On the Electronic Structure of Periodic Polyglycine," Int. J. Quantum Chem. Quantum Biol. Symp. 7, 177-179.
Kleier, D. A., and W. N. Lipscomb. 1977. "Molecular Orbital Study of Polypeptides. Conformational and Electronic Structure of Polyglycine," Int. J. Quantum Chem. Quantum Biol. Symp. 4, 73-86.
Klimkowski, V. J., L. Schafer, F. A. Momany, and C. Van Alsenoy. 1985. "Local Geometry Maps and Conformational Transitions Between Low-Energy Conformers of N-Acetyl-N'-Methyl Glycine Amide: An Ab Initio Study at the 4-21G Level with Gradient Relaxed Geometries," J. Mol. Struct. (Theochem) 124, 143-153.
Ladik, J., P. Otto, A. K. Bakhshi, and M. Seel. 1986. "Quantum Mechanical Treatment of Biopolymers as Solids: Possible Implications for Carcinogenesis," Int. J. Quantum Chem. 29, 597-617.
Ladik, J., A. Sutjianto, and P. Otto. 1991. "Improved Band Structures of Some Homopolypeptides with Aliphatic Side Chains and of the Four Nucleotide Base Stacks: Estimation of Their Fundamental Gap," J. Mol. Struct. (Theochem) 228, 271-276.
Liegener, C.-M., A. K. Bakhshi, P. Otto, and J. Ladik. 1989. "Effects of Correlation and
Hydration on the Electronic Structure of Aperiodic Polypeptides," J. Mol. Struct. (Theochem) 188, 205-212.
Liegener, C.-M., A. Sutjianto, and J. Ladik. 1990. "The Treatment of Electron Correlation in Aperiodic Systems. III. Application to Polypeptides," Chem. Phys. 145, 385-388.
Mavri, J., F. Avbelj, and D. Hadzi. 1989. "Conformation of N-Acetyl-L-Pro-D-Ala-N'-Methyl Tripeptide Empirical, Semi-Empirical MO and Ab Initio MO Calculations," J. Mol. Struct. (Theochem) 187, 307-315.
McAllister, M. A., A. Perczel, P. Csaszar, and I. G. Csizmadia. 1993a. "Peptide Models 5. Topological Features of Molecular Mechanics and Ab Initio 4D-Ramachandran Maps. Conformational Data for Ac-L-Ala-L-Ala-NHMe and For-L-Ala-L-Ala-NH2," J. Mol. Struct. (Theochem) 288, 181-198.
McAllister, M. A., A. Perczel, P. Csaszar, W. Viviani, J.-L. Rivail, and I. G. Csizmadia. 1993b. "Peptide Models 4. Topological Features of Molecular Mechanics and Ab Initio 2D-Ramachandran Maps. Conformational Data for For-Gly-NH2, For-L-Ala-NH2, Ac-L-Ala-NHMe and For-L-Val-NH2," J. Mol. Struct. (Theochem) 288, 161-179.
Mehrotra, P. K., M. Mezei, and D. L. Beveridge. 1984. "Monte Carlo Determination of the Internal Energies of Hydration for the Ala Dipeptide in the C7, C5, αR, and PII Conformations," Int. J. Quantum Chem. Quantum Biol. Symp. 11, 301-308.
Mezei, M., P. K. Mehrotra, and D. L. Beveridge. 1985. "Monte Carlo Determination of the Free Energy and Internal Energy of Hydration for the Ala Dipeptide at 25°C," J. Am. Chem. Soc. 107, 2239-2245.
Miick, S. M., G. V. Martinez, W. R. Fiori, A. P. Todd, and G. L. Millhauser. 1992. "Short Alanine-Based Peptides May Form 3₁₀-Helices and Not α-Helices in Aqueous Solution," Nature 359, 653-655.
Mirkin, N. G., and S. Krimm. 1990. "Vibrational Dynamics of the Cis Peptide Group," J. Am. Chem. Soc. 112, 9016-9017.
Nakagawa, S., and H. Umeyama. 1981. "Molecular Orbital Study of the Effects of Ionic Amino Acid Residues on Proton Transfer Energetics in the Active Site of Carboxypeptidase A," Chem. Phys. Letters 81, 503-507.
No, K. T., J. A. Grant, M. S. Jhon, and H. A. Scheraga. 1990. "Determination of Net Atomic Charges Using a Modified Partial Equalization of Orbital Electronegativity Method. 2. Application to Ionic and Aromatic Molecules as Models for Polypeptides," J. Phys. Chem. 94, 4740-4746.
Oie, T., G. H. Loew, S. K. Burt, J. S. Binkley, and R. D. MacElroy. 1982. "Ab Initio Study of Catalyzed and Uncatalyzed Amide Bond Formation as a Model for Peptide Bond Formation: Ammonia-Formic Acid and Ammonia-Glycine Reactions," Int. J. Quantum Chem. Quantum Biol. Symp. 9, 223-245.
Oie, T., G. H. Loew, S. K. Burt, and R. D. MacElroy. 1983. "Ab Initio Study of Catalyzed and Uncatalyzed Amide Bond Formation as a Model for Peptide Bond Formation: Ammonia-Glycine Reactions," J. Comput. Chem. 4, 449-460.
Otto, P., and A. Sutjianto. 1991. "Electron Correlation Effects on the Energy Band Structure of Polyglycine," J. Mol. Struct. (Theochem) 231, 277-282.
Perczel, A., J. G. Angyan, M. Kajtar, W. Viviani, J.-L. Rivail, J.-F. Marcoccia, and I. G. Csizmadia. 1991a. "Peptide Models. 1. Topology of Selected Peptide Conformational Potential Energy Surfaces (Glycine and Alanine Derivatives)," J. Am. Chem. Soc. 113, 6256-6265.
Perczel, A., O. Farkas, and I. G. Csizmadia. 1996. "Peptide Models XVI. The Identification of Selected HCO-L-SER-NH2 Conformers via a Systematic Grid Search Using Ab Initio Potential Energy Surfaces," J. Comput. Chem. 17, 821-834.
Perczel, A., M. Kajtar, J.-F. Marcoccia, and I. G. Csizmadia. 1991b. "The Utility of the Four-Dimensional Ramachandran Map for the Description of Peptide Conformations," J. Mol. Struct. (Theochem) 232, 291-319.
Perczel, A., M. A. McAllister, P. Csaszar, and I. G. Csizmadia. 1994. "Peptide Models. IX. A Complete Conformational Set of For-Ala-Ala-NH2 from Ab Initio Computations," Can. J. Chem. 72, 2050-2070.
Perczel, A., M. A. McAllister, P. Csaszar, and I. G. Csizmadia. 1993. "Peptide Models 6. New β-Turn Conformations from Ab Initio Calculations Confirmed by X-ray Data of Proteins," J. Am. Chem. Soc. 115, 4849-4858.
Perczel, A., W. Viviani, and I. G. Csizmadia. 1992. "Peptide Conformational Potential Energy Surfaces and Their Relevance to Protein Folding," in Molecular Aspects of Biotechnology: Computational Models and Theories, Bertran, J., ed., pp. 39-82. Kluwer Academic Publishers.
Peters, D., and J. Peters. 1984a. "Quantum Theory of the Structure and Bonding in Proteins Part 17. The unionised aspartic acid dipeptide," J. Mol. Struct. (Theochem) 109, 149-159.
Peters, D., and J. Peters. 1984b. "Quantum Theory of the Structure and Bonding in Proteins Part 16. The asparagine dipeptide," J. Mol. Struct. (Theochem) 109, 137-148.
Peters, D., and J. Peters. 1982a. "Quantum Theory of the Structure and Bonding in Proteins Part 15. The threonine dipeptide," J. Mol. Struct. (Theochem) 90, 321-334.
Peters, D., and J. Peters. 1982b. "Quantum Theory of the Structure and Bonding in Proteins Part 14. The serine dipeptide," J. Mol. Struct. (Theochem) 90, 305-320.
Peters, D., and J. Peters. 1982c. "Quantum Theory of the Structure and Bonding in Proteins Part 12. Conformational analysis of side chains and the ethyl group as a model side chain," J. Mol. Struct. (Theochem) 88, 137-156.
Peters, D., and J. Peters. 1981a. "Quantum Theory of the Structure and Bonding in Proteins Part 10. The C10 hydrogen bonds and β bends in peptides and proteins," J. Mol. Struct. (Theochem) 85, 267-277.
Peters, D., and J. Peters. 1981b. "Quantum Theory of the Structure and Bonding in Proteins Part 9. The proline dipeptide," J. Mol. Struct. (Theochem) 85, 257-265.
Peters, D., and J. Peters. 1981c. "Quantum Theory of the Structure and Bonding in Proteins Part 8. The alanine dipeptide," J. Mol. Struct. (Theochem) 85, 107-123.
Peters, D., and J. Peters. 1980a. "Quantum Theory of the Structure and Bonding in Proteins Part 7. The α helix and the hydrogen bonding in the tetrapeptide," J. Mol. Struct. 69, 249-263.
Peters, D., and J. Peters. 1980b. "Quantum Theory of the Structure and Bonding in Proteins Part 6. Factors governing the formation of hydrogen bonds in proteins and peptides," J. Mol. Struct. 68, 255-270.
Peters, D., and J. Peters. 1980c. "Quantum Theory of the Structure and Bonding in Proteins Part 5. Further studies of the C10 hydrogen bond of the tripeptide," J. Mol. Struct. 68, 234-253.
Peters, D., and J. Peters. 1979. "Quantum Theory of the Structure and Bonding in Proteins Part 2. The simple dipeptide," J. Mol. Struct. 53, 103-119.
Price, S. L., C. H. Faerman, and C. W. Murray. 1991. "Toward Accurate Transferable Electrostatic Models for Polypeptides: A Distributed Multipole Study of Blocked Amino Acid Residue Charge Distributions," J. Comput. Chem. 12, 1187-1197.
Price, S. L., and A. J. Stone. 1992. "Electrostatic Models for Polypeptides: Can We Assume Transferability?," J. Chem. Soc. Faraday Trans. 88, 1755-1763.
Probst, M. M., and B. M. Rode. 1984. "Quantum Chemical Investigations on the Complexes of Ca2+ and Zn2+ with Aliphatic Dipeptides," Inorg. Chim. Acta 92, 75-78.
Ragazzi, M., D. R. Ferro, and E. Clementi. 1979. "Analytical Potentials from Ab Initio Computations for the Interaction Between Biomolecules. V. Formyl-triglycyl Amide and Water," J. Chem. Phys. 70, 1040-1050.
Ramek, M., A.-M. Kelterer, B. J. Teppen, and L. Schafer. 1995. "Theoretical Structure Investigation of N-acetyl L-proline amide," J. Mol. Struct. 352/353, 59-70.
Ramani, R., and R. J. Boyd. 1981. "Ab-initio Molecular Orbital Study of the cis/trans Conformations of the Peptide Bond," Int. J. Quantum Chem. Quantum Biol. Symp. 8, 117-127.
Rao, B. G., R. F. Tilton, and U. C. Singh. 1992. "Free Energy Perturbation Studies on Inhibitor Binding to HIV-1 Proteinase," J. Am. Chem. Soc. 114, 4447-4452.
Roux, B. 1993. "Non-additivity in Cation-peptide Interactions. A Molecular Dynamics and
Ab Initio Study of Na+ in the Gramicidin Channel," Chem. Phys. Letters 212, 231-240. Ryan, J. A., and J. L. Whitten. 1972. "Self-Consistent Field Studies of Glycine and Glycylglycine. The Simplest Example of a Peptide Bond," J. Am. Chem. Soc. 94, 2396-2400. Sapse, A.-M., S. B. Daniels, and B. W. Erickson. 1988. "Ab Initio Calculations for N-Acetylalanylglycine Amide," Tetrahedron 44, 999-1006. Sapse, A.-M., L. M. Fugler, and D. Cowburn. 1986. "An Ab Initio Study of Intermolecular Hydrogen Bonding Between Small Peptide Fragments," Int. J. Quantum Chem. 29, 1241-1251. Sapse, A.-M., D. C. Jain, D. de Gale, and T. C. Wu. 1990. "Solvent Effect and Librational Entropy Calculations on N-Acetylalanylglycine Amide," J. Comput. Chem. 11, 573-575. Sapse, A.-M., L. Mallah-Levy, S. B. Daniels, and B. W. Erickson. 1987. "The γ Turn: Ab Initio Calculations on Proline and N-Acetylproline Amide," J. Am. Chem. Soc. 109, 3526-3529. Sarai, A., and M. Saito. 1985. "Theoretical Studies on the Interaction of Proteins with Base Pairs. II. Effect of External H-Bond Interactions on the Stability of Guanine-Cytosine and Non-Watson-Crick Pairs," Int. J. Quantum Chem. 28, 399-409. Sarai, A., and M. Saito. 1984. "Theoretical Studies on the Interaction of Proteins with Base Pairs. I. Ab Initio Calculation for the Effect of H-Bonding Interaction of Proteins on the Stability of Adenine-Uracil Pair," Int. J. Quantum Chem. 25, 527-533. Sawaryn, A., and J. S. Yadav. 1982. "Ab Initio Studies on the Nonplanarity of a Peptide Unit: Calculations on Model Compounds," Int. J. Quantum Chem. 22, 547-556. Scarsdale, J. N., C. Van Alsenoy, V. J. Klimkowski, L. Schafer, and F. A. Momany. 1983. "Ab Initio Studies of Molecular Geometries. 27. Optimized Molecular Structures and Conformational Analysis of N-Acetyl-N-methylalaninamide and Comparison with Peptide Crystal Data and Empirical Calculations," J. Am. Chem. Soc. 105, 3438-3445. Schafer, L., V. J. Klimkowski, F. A. Momany, H. Chuman, and C. Van Alsenoy. 1984. 
"Conformational Transitions and Geometry Differences between Low-Energy Conformers of N-Acetyl-N'-Methyl Alanineamide: An Ab Initio Study at the 4-21G Level with Gradient Relaxed Geometries," Biopolymers 23, 2335-2347. Schafer, L., S. Q. Newton, M. Cao, A. Peeters, C. Van Alsenoy, K. Wolinski, and F. A. Momany. 1993. "Evaluation of the Dipeptide Approximation in Peptide Modeling by Ab Initio Geometry Optimizations of Oligopeptides," J. Am. Chem. Soc. 115, 272-280. Schafer, L., C. Van Alsenoy, and J. N. Scarsdale. 1982. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. 23. Molecular Structures and Conformational Analysis of the Dipeptide N-acetyl-N' -methyl Glycyl Amide and the Significance of Local Geometries for Peptide Structures," J. Chem. Phys. 76, 1439--1444. Scheiner, S., and L. Wang. 1993. "Hydrogen Bonding and Proton Transfers of the Amide Group," J. Am. Chem. Soc. 115, 1958-1963. Shang, H. S., and T. Head-Gordon. 1994. "Stabilization of Helices in Glycine and Alanine Dipeptides in a Reaction Field Model of Solvent," J. Am. Chem. Soc. 116, 1528--1532. Shipman, L. L., and R. E. Christoffersen. 1973. "Ab Initio Calculations on Large Molecules Using Molecular Fragments. Polypeptides of Glycine," J. Am. Chem. Soc. 95, 4733--4744. Siam, K., V. J. Klimkowski, C. Van Alsenoy, J. D. Ewbank, and L. Schafer. 1987. "Ab Initio Geometry Refinement of Some Selected Structures of the Model Dipeptide N-Acetyl N'-Methyl Serine Amide," J. Mol. Struct. (Theochem) 152, 261-270. Siam, K., S. Q. Kulp, J. D. Ewbank, and L. Schafer. 1989. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment Part 64. Conformational analysis and local geometry maps of the model dipeptide N-acetyl N'-methyl serine amide," J. Mol. Struct. (Theochem) 184, 143-157.
Skala, L., and P. Pancoska. 1988. "Interpolation Formula for Physical Properties of Polypeptides as a Function of the Number of Amino Acid Residues," Chem. Phys. 125, 21-30. Sokalski, W. A., D. A. Keller, R. L. Ornstein, and R. Rein. 1993. "Multipole Correction of Atomic Monopole Models of Molecular Charge Distribution. I. Peptides," J. Comput. Chem. 14, 970-976. Sordo, J. A., T. L. Sordo, G. M. Fernandez, R. Gomperts, S. Chin, and E. Clementi. 1989. "A Systematic Study on the Basis Set Superposition Error in the Calculation of Interaction Energies of Systems of Biological Interest," J. Chem. Phys. 90, 6361-6370. Stern, P. S., M. Chorev, M. Goodman, and A. T. Hagler. 1983. "Computer Simulation of the Conformational Properties of Retro-Inverso Peptides. II Ab Initio Study, Spatial Electron Distribution, and Population Analysis of N-Formylglycine Methylamide, N-Formyl N'-Acetyldiaminomethane, and N-Methylmalonamide," Biopolymers 22, 1901-1917. Sternberg, U., F.-T. Koch, and M. Mollhoff. 1994. "New Approach to the Semiempirical Calculation of Atomic Charges for Polypeptides and Large Molecular Systems," J. Comput. Chem. 15, 524-531. Sugawara, Y., A. Y. Hirakawa, and M. Tsuboi. 1984. "In-Plane Force Constants of the Peptide Group: Least-Squares Adjustment Starting from Ab Initio Values of N-Methylacetamide," J. Mol. Spectrosc. 108, 206-214. Suhai, S. 1985. "Perturbation Theoretical Calculation of Optical Effects in Polypeptides," J. Mol. Struct. (Theochem) 123, 97-108. Torii, H., and M. Tasumi. 1996. "Infrared Intensities of Vibrational Modes of an α-helical Polypeptide: Calculations Based on the Equilibrium Charge/Charge Flux (ECCF) Model," J. Mol. Struct. 300, 171-179. Tranter, G. E. 1986. "Parity-violating Energy Differences and the Origin of Biomolecular Homochirality," J. Theor. Biol. 119, 467-479. Van Alsenoy, C., M. Cao, S. Q. Newton, B. Teppen, A. Perczel, I. G. Csizmadia, F. A. Momany, and L. Schafer. 1993. 
"Conformational Analysis and Structural Study by Ab Initio Gradient Geometry Optimizations of the Model Tripeptide N-formyl L-alanyl L-alanine Amide," J. Mol. Struct. (Theochem) 286, 149-163. Van Duijnen, P. T., and B. T. Thole. 1982. "Cooperative Effects in α-Helices: An Ab Initio Molecular-Orbital Study," Biopolymers 21, 1749-1761. Viviani, W., J.-L. Rivail, and I. G. Csizmadia. 1993a. "Peptide Models II. Intramolecular Interactions and Stable Conformations of Glycine, Alanine, and Valine Peptide Analogues," Theor. Chim. Acta 85, 189-197. Viviani, W., J.-L. Rivail, A. Perczel, and I. G. Csizmadia. 1993b. "Peptide Models. 3. Conformational Potential Energy Hypersurface of Formyl-L-valinamide," J. Am. Chem. Soc. 115, 8321-8329. Voisin, C., and A. Cartier. 1993. "Determination of Distributed Polarizabilities to be Used for Peptide Modeling," J. Mol. Struct. (Theochem) 286, 35-45. Walker, P. D., and P. G. Mezey. 1994. "Ab Initio Quality Electron Densities for Proteins: A MEDLA Approach," J. Am. Chem. Soc. 116, 12022-12032. Walker, P. D., and P. G. Mezey. 1993. "Molecular Electron Density Lego Approach to Molecule Building," J. Am. Chem. Soc. 115, 12423-12430. Weiner, S. J., U. C. Singh, T. J. O'Donnell, and P. A. Kollman. 1984. "Quantum and Molecular Mechanical Studies on Alanyl Dipeptide," J. Am. Chem. Soc. 106, 6243-6245. Williams, D. E. 1990. "Alanyl Dipeptide Potential-Derived Net Atomic Charges and Bond Dipoles, and Their Variation with Molecular Conformation," Biopolymers 29, 1367-1386. Wright, L. R., and R. F. Borkman. 1982. "Ab Initio Self-Consistent Field Studies of the Peptides Gly-Gly, Gly-Ala, Ala-Gly, and Gly-Gly-Gly," J. Phys. Chem. 86, 3956-3962. Yang, W. 1992. "Electron Density as the Basic Variable: a Divide-and-Conquer Approach
to the Ab Initio Computation of Large Molecules," J. Mol. Struct. (Theochem) 255, 461-479. Zhang, K., D. M. Zimmerman, A. Chung-Phillips, and C. J. Cassady. 1993. "Experimental and Ab Initio Studies of the Gas-Phase Basicities of Polyglycines," J. Am. Chem. Soc. 115, 10812-10822.
Amino Acids and Peptides
Barlow, D. J., and J. M. Thornton. 1988. "Helix Geometry in Proteins," J. Mol. Biol. 201, 601-619. Boggs, J. E. "Interaction of Theoretical Chemistry with Gas-Phase Electron Diffraction," in Stereochemical Applications of Gas-Phase Electron Diffraction, Hargittai, I., and Hargittai, M. eds., Vol. B, Chap. 10, 455-475. New York, VCH Publishers. Boggs, J. E. 1983. "The Integration of Structure Determination by Computation, Electron Diffraction and Microwave Spectroscopy," J. Mol. Struct. 97, 1-16. Boggs, J. E., and F. R. Cordell. 1981. "Accurate Ab Initio Gradient Calculation of the Structures and Conformations of Some Boric and Fluoroboric Acids. Basis-Set Effects on Angles Around Oxygen," J. Mol. Struct. (Theochem) 76, 329-347. Boggs, J. E., M. von Carlowitz, and S. von Carlowitz. 1982. "Symmetry of the Methyl Group in Molecules of the Type CH3YH2X," J. Phys. Chem. 86, 157-159. Brown, R. D., P. D. Godfrey, J.W.V. Storey, and M.-P. Bassez. 1978. "Microwave Spectrum and Conformation of Glycine," J. Chem. Soc., Chem. Commun. 547-548. Caminati, W., A. C. Fantoni, L. Schafer, K. Siam, and C. Van Alsenoy. 1986. "Conformational and Structural Analysis of Methyl Hydrazinocarboxylate by Microwave Spectroscopy and Ab Initio Geometry Refinements," J. Am. Chem. Soc. 108, 4364-4367. Caminati, W., A. C. Fantoni, B. Velino, K. Siam, L. Schafer, J. D. Ewbank, and C. Van Alsenoy. 1987a. "Conformational Equilibrium and Internal Hydrogen Bonding in 2-Methylallyl Alcohol: Detection of a Second Conformer by Microwave Spectroscopy on the Basis of Ab Initio Structure Calculations," J. Mol. Spectrosc. 124, 72-81. Caminati, W., K. Siam, J. D. Ewbank, and L. Schafer. 1987b. 
"Interpretation of the Microwave Spectrum of 2-Methoxy Ethylamine Using Its Ab Initio Structures," J. Mol. Struct. 158, 237-247. Caminati, W., B. Velino, M. Dakkouri, L. Schafer, K. Siam, and J. D. Ewbank. 1987c. "Reinvestigation of the Microwave Spectrum of Cyanocyclobutane: Assignment of the Axial Conformer," J. Mol. Spectrosc. 123, 469-475. Cao, M., and L. Schafer. 1993. "Viewpoint 6 - Characteristic Aspects of GG Sequences and the Importance of Constitutional Properties for Conformational Entropies," J. Mol. Struct. (Theochem) 284, 235-242. Chiu, N. S., H. L. Sellers, L. Schafer, and K. Kohata. 1979. "Molecular Orbital Constrained Electron Diffraction Studies. Conformational Behavior of 1,2-Dimethylhydrazine," J. Am. Chem. Soc. 101, 5883-5889. Chuman, H., F. A. Momany, and L. Schafer. 1984. "Backbone Conformations, Bend Structures, Helix Structures, and Other Tests of an Improved Conformational Energy Program for Peptides: ECEPP83," Int. J. Pept. Prot. Res. 24, 233-248. Cremer, D. 1981. "Theoretical Determination of Molecular Structure and Conformation Part X. Geometry and puckering potential of azetidine, (CH2)3NH, combination of electron diffraction and ab initio studies," J. Mol. Struct. 75, 225-240. de Smedt, J., F. Vanhoutegem, C. Van Alsenoy, H. J. Geise, and L. Schafer. 1992. "Empirical Corrections of SCF Geometries With Special Examples from 4-21G Calculations," J. Mol. Struct. 259, 289-305. Doms, L., H. J. Geise, C. Van Alsenoy, L. Van den Enden, and L. Schafer. 1985. "The Molecular Orbital Constrained Electron Diffraction (MOCED) Structural Model of Quadricyclane Determined by Electron Diffraction Combined with Ab Initio Calculations of Potential and Geometrical Parameters," J. Mol. Struct. 129, 299-314.
Eliel, E. L., N. L. Allinger, S. J. Angyal, and G. A. Morrison. 1965. "Conformational Analysis," New York, Interscience. Frey, R. F., M. Cao, S. Q. Newton, and L. Schafer. 1993. "Electron Correlation Effects in Aliphatic Non-Bonded Interactions: Comparison of n-Alkane MP2 and HF Geometries," J. Mol. Struct. (Theochem) 285, 99-113. Geise, H. J., and W. Pyckhout. 1988. "Self-Consistent Molecular Models from a Combination of Electron Diffraction, Microwave, and Infrared Data Together with High-Quality Theoretical Calculations," in Stereochemical Applications of Gas-Phase Electron Diffraction, Hargittai, I. and Hargittai, M. eds., Vol. A, Chap. 10, 321-346. VCH Publishers. Godfrey, P. D., S. Firth, L. D. Hatherley, R. D. Brown, and A. P. Pierlot. 1993. "Millimeter-Wave Spectroscopy of Biomolecules: Alanine," J. Am. Chem. Soc. 115, 9687-9691. Godfrey, P. D., and R. D. Brown. 1995. "Shape of Glycine," J. Am. Chem. Soc. 117, 2019-2023. Godfrey, P. D., R. D. Brown, and F. M. Rodgers. 1996. "The Missing Conformers of Glycine and Alanine: Relaxation in Supersonic Jets," J. Mol. Struct. 376, 65-81. Hehre, W. J., L. Radom, P.V.R. Schleyer, and J. A. Pople. 1986. In "The Performance Of The Model," Ab Initio Molecular Orbital Theory, 133-344. New York, Wiley. Iijima, K., K. Tanaka, and S. Onuma. 1991. "Main Conformer of Gaseous Glycine: Molecular Structure and Rotational Barrier from Electron Diffraction Data and Rotational Constants," J. Mol. Struct. 246, 257-266. Jiang, X., M. Cao, B. J. Teppen, S. Q. Newton, and L. Schafer. 1995a. "Predictions of Protein Backbone Structural Parameters from First Principles: Systematic Comparisons of Calculated N-C(α)-C' Angles with High-Resolution Protein Crystallographic Results," J. Phys. Chem. 99, 10521-10525. Jiang, X., M. Cao, S. Q. Newton, L. Schafer, and E. F. Paulus. 1995. "Predictions of Peptide and Protein Backbone Structural Parameters from First Principles. 
IV: Systematic Comparisons of Calculated N-C(α)-C' Angles with Peptide Crystal Structures," Electronic J. Theo. Chem. 1, 11-17. Jiang, X., C.-H. Yu, M. Cao, S. Q. Newton, E. F. Paulus, and L. Schafer. 1997. "φ/ψ-Torsional Dependence of Peptide and Protein Backbone Bond-Lengths and Bond-Angles: Comparison of Crystallographic and Calculated Parameters," J. Mol. Struct. Kabsch, W., and C. Sander. 1983. "Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features," Biopol. 22, 2577-2637. Karplus, P. A. 1996. "Experimentally Observed Conformation-Dependent Geometry and Hidden Strain in Proteins," Protein Sci. 5, 1406-1420. Kitano, M., and K. Kuchitsu. 1974. "Molecular Structure of Formamide as Studied by Gas Electron Diffraction," Bull. Chem. Soc. Japan 47, 67-72. Klimkowski, V. J., J. D. Ewbank, C. Van Alsenoy, J. N. Scarsdale, and L. Schafer. 1982. "Molecular Orbital Constrained Electron Diffraction Studies. 4. Conformational Analysis of the Methyl Ester of Glycine," J. Am. Chem. Soc. 104, 1476-1480. Kohata, K., T. Fukuyama, and K. Kuchitsu. 1979. "Molecular Structure and Conformation of 1,2-Dimethylhydrazine Studied by Gas Electron Diffraction," Chem. Lett. 257-260. Lewis, P. N., F. A. Momany, and H. A. Scheraga. 1973a. "Energy Parameters For Polypeptides: Conformational Energy Analysis Of The N-Acetyl N'-Methyl Amides Of The Twenty Naturally Occurring Amino Acids," Isr. J. Chem. 11, 121-152. Lewis, P. N., F. A. Momany, and H. A. Scheraga. 1973b. "Chain Reversals in Proteins," Biochim. Biophys. Acta 303, 211-229. MacArthur, M. W., and J. M. Thornton. 1996. "Deviations from Planarity of the Peptide Bond in Peptides and Proteins," J. Mol. Biol. 264, 1180-1195. Marsh, R. E., and J. Donohue. 1967. "Crystal Structures of Amino Acids and Peptides," Adv. Prot. Chem. 22, 235-256. McKean, D. C., J. E. Boggs, and L. Schafer. 1984. "CH Bond Length Variations Due to the
Intramolecular Environment: A Comparison of the Results Obtained by the Method of Isolated CH Stretching Frequencies and by Ab Initio Gradient Calculations," J. Mol. Struct. 116, 313-330. Milner-White, E. J., and R. Poet. 1987. "Loops, Bulges, Turns, and Hairpins in Proteins," Trends Biochem. Sci. 12, 189-192. Mislow, K. 1965. Introduction to Stereochemistry. Reading, Mass., Benjamin/Cummings. Mislow, K., and M. Raban. 1967. "Stereoisomeric Relationships of Groups in Molecules," in Topics in Stereochemistry, Allinger, N. L., and Eliel, E. L., eds. Vol. 1, 1-38. New York: Interscience. Momany, F. A., V. J. Klimkowski, and L. Schafer. 1990. "On The Use Of Conformationally Dependent Geometry Trends From Ab Initio Dipeptide Studies To Refine Potentials For The Empirical Force Field CHARMM," J. Comp. Chem. 11, 654-662. Momany, F. A., R. Rone, H. Kunz, R. F. Frey, S. Q. Newton, and L. Schafer. 1993. "Geometry Optimization, Energetics and Solvation Studies on Four- and Five-membered Cyclic and Disulfide-bridged Peptides, Using the Programs QUANTA3.3 and CHARMm 22," J. Mol. Struct. 286, 1-18. Møller, C., and M. S. Plesset. 1934. "Note On An Approximate Treatment For Many-Electron Systems," Phys. Rev. 46, 618-622. Nakata, M., H. Takeo, C. Matsumura, K. Yamanouchi, K. Kuchitsu, and T. Fukuyama. 1981. "Structures of 1,2-Dimethylhydrazine Conformers as Determined by Microwave Spectroscopy and Gas Electron Diffraction," Chem. Phys. Letters 83, 246-249. Norden, T. D., S. W. Staley, W. H. Taylor, and M. D. Harmony. 1986. "On the Electronic Character of Methylenecyclopropene: Microwave Spectrum, Structure, and Dipole Moment," J. Am. Chem. Soc. 108, 7912-7918. Pople, J. A., R. Krishnan, H. B. Schlegel, and J. S. Binkley. 1979. "Derivative Studies in Hartree-Fock and Møller-Plesset Theories," Int. J. Quantum Chem. Quantum Chem. Symp. 13, 225-241. Pople, J. A., and M. Gordon. 1967. "Molecular Orbital Theory of the Electronic Structure of Organic Compounds. I. 
Substituent Effects and Dipole Moments," J. Am. Chem. Soc. 89, 4253--4261. Pulay, P. 1979a. "An Efficient Ab Initio Gradient Program," Theoret. Chim. Acta (Berl.) 50, 299-312. Pulay, P. 1969. "Ab Initio Calculation of Force Constants and Equilibrium Geometries in Polyatomic Molecules. I. Theory," Mol. Phys. 17, 197-204. Pulay, P., G. Fogarasi, F. Pang, and J. E. Boggs. 1979b. "Systematic Ab Initio Gradient Calculation of Molecular Geometries, Force Constants, and Dipole Moment Derivatives," J. Am. Chem. Soc. 101, 2550-2560. Pullman, B., and A. Pullman. 1974. "Molecular Orbital Calculations on the Conformation of Amino Acid Residues of Proteins," Adv. Protein Chem. 28, 347-526. Ramachandran, G. N., and V. Sasisekharan. 1968. "Conformation Of Polypeptides And Proteins," Adv. Protein Chem. 23, 283--438. Richardson, J. S., and D. C. Richardson. 1989. "Principles and Patterns of Protein Conformation," in Prediction of Protein Structure and the Principles of Protein Conformation, Fasman G., ed., 1-98. New York, Plenum Press. Richardson, J. S. 1981. "The Anatomy and Taxonomy of Protein Structure," Adv. Protein Chem. 34, 167-339. Sasisekharan, V. 1962. "Stereochemical Criteria For Polypeptide and Protein Structures," in Collagen, Ramanathan, N., ed. 39-78. Madras, India: Wiley. Schafer, L., M. Cao, and M. J. Meadows. 1995a. "Predictions of Protein Backbone Bond Distances and Angles from First Principles," Biopolymers 35, 603-606. Schafer, L., and M. Cao. 1995b. "Predictions of Protein Backbone Bond Distances and Angles from First Principles," J. Mol. Struct. 333, 201-208. Schafer, L. 1983. "The Ab Initio Gradient Revolution in Structural Chemistry: the Importance of Local Molecular Geometries and the Efficacy of Joint Quantum Mechanical and Experimental Procedures," J. Mol. Struct. 100, 51-73.
Schafer, L., I. S. Bin Drees, R. F. Frey, C. Van Alsenoy, and J. D. Ewbank. 1995. "Molecular Orbital Constrained Gas Electron Diffraction Study of N-Acetyl N'-Methyl Alanine Amide," J. Mol. Struct. (Theochem) 338, 71-82. Schafer, L., J. D. Ewbank, V. J. Klimkowski, K. Siam, and C. Van Alsenoy. 1986. "Predictions of Relative Structural Trends From Ab Initio Derived Standard Geometry Functions," J. Mol. Struct. (Theochem) 135, 141-158. Schafer, L., J. D. Ewbank, K. Siam, N. S. Chiu, and H. L. Sellers. 1988a. "Molecular Orbital Constrained Electron Diffraction (MOCED) Studies: The Concerted Use of Electron Diffraction and Quantum Chemical Calculations," in Stereochemical Applications of Gas-Phase Electron Diffraction, Hargittai, I. and Hargittai, M. eds., Vol. A, Chap. 9, 301-320. New York: VCH Publishers. Schafer, L., M. Cao, M. Ramek, B. J. Teppen, S. Q. Newton, and K. Siam. In press. "Conformational Geometry Functions: Additivity and Cooperative Effects," J. Mol. Struct. Schafer, L., and K. Siam. 1988b. "Comment on: Accuracy of Ab Initio C-H Bond Length Differences and Their Correlation with Isolated C-H Stretching Frequencies," J. Chem. Phys. 88, 7255-7256. Schafer, L., K. Siam, J. D. Ewbank, W. Caminati, and A. C. Fantoni. 1987. "Some Surprising Applications of Ab Initio Gradient Geometries in Microwave Spectroscopic Analyses," in Modeling of Structures and Properties of Molecules, Maksic, Z. B. ed. Chap. 4, 79-90. Chichester, England, E. Horwood Publ. Comp. Schafer, L. 1991. "The Mutation of Chemistry: The Rising Importance of Ab Initio Computational Techniques in Chemical Research," J. Mol. Struct. (Theochem) 230, 5-11. Schafer, L., C. Van Alsenoy, and J. N. Scarsdale. 1982. "Estimates for Systematic Empirical Corrections of Consistent 4-21G Ab Initio Geometries and Their Correlations to Total Energy Group Increments," J. Mol. Struct. (Theochem) 86, 349-364. Schafer, L., C. Van Alsenoy, and L. Van den Enden. 1984. 
"The Possible Chirality Of Tetrahedral Carbon Atoms with Two Substituents of Identical Constitution," J. Chem. Ed. 61, 945-947. Schei, S. H. 1984a. "3-Chloro-1-Butene: Gas-Phase Molecular Structure and Conformations as Determined by Electron Diffraction and by Molecular Mechanics and Ab Initio Calculations," J. Mol. Struct. 118, 319-332. Schei, S. H., A. Almenningen, and J. Almlof. 1984b. "1,2,4,5-Tetrafluorobenzene: Molecular Structure as Determined by Gas-Phase Electron Diffraction and by Ab Initio Calculations," J. Mol. Struct. 112, 301-308. Scheraga, H. A. 1968. "Calculations of Conformations of Polypeptides," Adv. Phys. Org. Chem. 6, 103-184. Sibanda, B. L., and J. M. Thornton. 1985. "β-Hairpin Families in Globular Proteins," Nature 316, 170-174. Skancke, A., and J. E. Boggs. 1978. "The Molecular Structures of Methylcyclopropane, Cyclopropylamine and Cyclopropyl Lithium," J. Mol. Struct. 50, 173-182. Staley, S. W., T. D. Norden, W. H. Taylor, and M. D. Harmony. 1987. "Electronic Structure of Cyclopropenone and Its Relationship to Methylenecyclopropene. Evaluation of Criteria for Aromaticity," J. Am. Chem. Soc. 109, 7641-7647. Suenram, R. D., and F. J. Lovas. 1980. "Millimeter Wave Spectrum of Glycine. A New Conformer," J. Am. Chem. Soc. 102, 7180-7184. Suenram, R. D., and F. J. Lovas. 1978. "Millimeter Wave Spectrum of Glycine," J. Mol. Spectrosc. 72, 372-382. Teeter, M. M., S. M. Roe, and N. H. Heo. 1993. "Atomic Resolution Crystal Structure of the Hydrophobic Protein Crambin at 130 K," J. Mol. Biol. 239, 292-311. Teppen, B. J., M. Cao, R. F. Frey, C. Van Alsenoy, D. M. Miller, and L. Schafer. 1994. "An Investigation into Intramolecular Hydrogen Bonding: Impact of Basis Set and Electron Correlation on the Ab Initio Conformational Analysis of 1,2-Ethanediol and 1,2,3-Propanetriol," J. Mol. Struct. (Theochem) 314, 169-190.
Teppen, B. J., D. M. Miller, M. Cao, R. F. Frey, S. Q. Newton, F. A. Momany, M. Ramek, and L. Schafer. 1994b. "Investigation of Electron Correlation Effects on Molecular Geometries," J. Mol. Struct. (Theochem) 311, 9-17. Van den Enden, L., C. Van Alsenoy, J. N. Scarsdale, V. J. Klimkowski, and L. Schafer. 1983. "Ab Initio Studies of Structural Features Not Easily Amenable to Experiment. 29. Conformational Analysis of Glycine Aldehyde," J. Mol. Struct. 105, 407-415. Van Alsenoy, C. 1988. "Ab Initio Calculations on Large Molecules: The Multiplicative Integral Approximation," J. Comput. Chem. 9, 620-626. Van Hemelrijk, D., L. Van den Enden, H. J. Geise, H. L. Sellers, and L. Schafer. 1980. "Structure Determination of 1-Butene by Gas Electron Diffraction, Microwave Spectroscopy, Molecular Mechanics, and Molecular Orbital Constrained Electron Diffraction," J. Am. Chem. Soc. 102, 2189-2195. Venkatachalam, C. M. 1968. "Stereochemical Criteria for Polypeptides and Proteins. V. Conformation of a System of Three Linked Peptide Units," Biopol. 6, 1425-1436. von Carlowitz, S., H. Oberhammer, H. Willner, and J. E. Boggs. 1983. "Structural Determination of a Recalcitrant Molecule (S2F4)," J. Mol. Struct. 100, 161-177. von Carlowitz, S., W. Zeil, P. Pulay, and J. E. Boggs. 1982. "The Molecular Structure, Vibrational Force Field, Spectral Frequencies, and Infrared Intensities of CH3POF2," J. Mol. Struct. (Theochem) 87, 113-124.
Index
ab initio calculations, 3-10 of alkylating agents, 160-164 of anti-metabolites, 164-178 drawbacks of, 11 absorption wavelength, 28 AM1 N-acetyl-N'-methyl alanine amide, 196-200, 202, 203, 205 N-acetyl N'-methyl serine amide, 197 achiral systems, 193 acidic activation, 163 acidity, hydrogen bond, 56-57 ACM functionals, 104 activation energy, XIV additivity, of conformational geometry functions, 202-203 adenine, density functional theory studies of, 94 adiabatic connection method, 88-89 alanine basis set for, 8-9 conformational studies of, 188-189 alkylating agents, ab initio calculations, 160-164 allyl vinyl ether, conformational studies, 112--113 alpha-carbon conformations, 194 alpha helix regions, of proteins, 197, 198, 200, 206 amidinomycin, 169-170
amino acids, 97, 98. See also names of specific amino acids conformational properties of, 183-186 use of diffuse functions, 6 model, 22-23, 31, 41 for calculating electrostatic potential, 55 ammonia, 101 AMSOL computer program, 32 Andzelm, J., 90 anharmonicity, 138 anions, basis sets for, 6 anisotropic environments, 35 annihilation procedure, 37, 38-39 antibiotics, 168. See also names of specific antibiotics anti-cancer drugs, classes of, 159 anti-metabolites, ab initio calculations, 164-178 apparent surface charges, 111--112 aromatic-aromatic interactions, 7, 165-166 aryl hydrocarbon hydroxylase, 68 aspartic acid, racemization, 115 asymmetric carbons, 194 atomic charge, 6,21 atomic distance, correcting for, 33 atomic interactions, in localized molecular orbitals, 39 atomic orbitals, XIV, 4, 17, 18, 28
atomic orbitals (continued) charge density and, 5 energy, 21 generation of localized molecular orbitals from, 38 atomic overlap distribution, 17 atomic scattering factors, 134 atomization energies, 96 atoms electrostatic potential of neutral, 50, 51 Schrodinger equation applied to, 3, 4 attractive interaction, 184, 193 azines, electrostatic potentials of, 64-65 aziridine, proton affinities, 163-164 basicity, hydrogen bond, 56-57 basis sets, 17 for calculating electrostatic potential, 54 in conformational studies of amino acids, 185 double and triple zeta, 6 for Gaussian programs, 8-9 influence on density functional theory results, 98-100 Slater-type orbitals, 5-6 split-valence, 6 Bastiansen-Morino shrinkage effect, 138 benzene, 73 -benzene complex, 165-166 electrostatic potential of, 62-64 ring fusion, 162 benzonitrile, 63 beta expansion, 205 beta sheets, 197 beta turns, in oligopeptides, 201 binding, minor-groove, 170 binding energy, XIII, 170 of water clusters, 100 biological systems, definition, 85 BLYP functionals, 104, 115 bond angles calculated using Kohn-Sham method, 90 calculated using Slater-type orbital basis sets, 6 in crystalline structures, 141 dependence of calculated energies on, 189 for oligopeptides, 207 for proteins, 208 in Z-matrices, 8-9
bond cleavage, 7 bond dissociation, 104 bond distances calculated using Slater-type orbital basis sets, 5-6 protein backbone, 204-209 bond energies, 97 bonding. See also hydrogen bonding atomic, 18 iodine and bromine, 61-62 parameters, 21 bond lengths calculated using Kohn-Sham method, 90 in crystalline structures, 141 relation to calculated energies, 188 Born expression, 163-164 for polarization free energy, 32, 34 Born-Oppenheimer approximation, 3, 12 boundary element method, 111--112 bridge region, of proteins, 197, 198, 202 Brillouin's theorem, 25 bromine, 61-62 1-butanol, 73, 74 calicheamycins, 178 cancer cells, immunity to drugs, 165 carbonic anhydrase, 104-105, 115 carbon monosulfide, vibrational states, 135, 136 carboxylate ions, use of diffuse functions, 6 carcinogenic activity, XIV Car-Parrinello method, 106-108 cavities, 26, 33 choice of radius, 27 electric charges within, 110-112, 164 free energy of formation, 34 shape, 29 size, 163 charge density, 5, 29, 50, 51 distributions, 30, 53 chemical reactions density functional theory studies of, 103-106 self-consistent reaction field studies of, 114-115 chemical reactivity, XIV chemotherapy, 159 chirality, of glycine, 193-195
chlorine bonding, 61-62 cisplatin, 96 closed-shell systems, 7, 18, 24 CNDO. See complete neglect of differential overlap cohesive energy density, 74 complete neglect of differential overlap (CNDO), 20-22 complexation energies, 104 conductor-like screening model (COSMO), 29-31, 42-43 configuration interaction, 7-8, 23-25, 25, 186 conformational analysis, 112-113, 194 of amino acids, 183-186, 188-189 of dipeptides, 195-202 conformational equilibria, 94, 112-113, 186 conformational geometry functions, 202, 203, 204 conformation energies of dipeptides, 196 effect of geometric optimization, 186-191 of inositol, 94 conjugated systems, in the PM3 model, 41 continuum solvation models, 42-45 control proteins, 168 convergence, 18, 185, 186 in Gaussian programs, 9-10 cooperativity, in helical conformations, 200, 204 copper complexes, 95 correlation energy, 7-8, 87, 186, 189. See also exchange-correlation COSMO. See conductor-like screening model Coulomb integrals, 16, 20 evaluating, 119 Coulomb interactions, 117 Coulomb matrix, 30 crambin, 205, 206 crystalline state, 61, 141, 142. See also X-ray crystallography of diphenylurea, 65-66 experimental versus calculated results, 204-205 cutoff distance, 39-40
cyclic conformation, of amino acids, 184-186, 188 cyclopropylamine, 183 cytosine, density functional theory and Møller-Plesset studies of, 93, 94 Debye equation, 138 delta function. See Dirac delta function; Kronecker delta density functional elongation method, 119 density functional theory, 51, 85. See also Kohn-Sham method compared to MP2 and MP4, 102 considerations of environment, 108-119 coupled with molecular mechanics, 115-119 effect of basis set on results, 98-100 frozen, 118 to predict NMR parameters, 91-92 studies of adenine, 94 studies of chemical reactions, 103-106 studies of heterocyclic compounds, 93 studies of thermochemistry, 96-97 time dependent, 119-120 density matrix, 17, 18, 28, 39 diazohydroxide compounds, Hartree-Fock calculations on, 161 diazonium species, 162 dielectric constant, 26, 32, 163 of liquid water, 112 dielectric theory, 109 diethyl ether, 73, 74 diffuse functions, 6 diffusion coefficient, 107 1,2-difluoroethane, conformational equilibria, 94 dihydrofolate reductase inhibitors, 164-166 dihydroxymethane, conformational studies, 194 dimethyl hydrazine, gas electron diffraction study of, 183 1,3-dioxane, protonation of, 107 dioxins, toxicity of, 67-70 dipeptides, ab initio conformational studies, 195-202 diphenylurea, crystallization, 65-66 dipole moments, 27, 28, 117, 182, 188 calculated using Kohn-Sham method, 91
dipole moments (continued) of nucleic acids, 92-93 Dirac delta function, 109 dispersion forces, 166, 189, 190 distamycin, 168, 169 DNA alkylation, by nitrosoureas, 160-161 DNA fragments, XIV DNA repair, 161 DNA sequence selectivity, 172
electric field, total, 29 electric linear dichroism, 175 electron charge, 17 electron configuration, ground state, 23 electron correlation, in conformational studies of amino acids, 186 electron density, 21, 53, 54, 55, 86, 87, 107, 120 in frozen density functional theory, 118 function, 6, 49 gradient, 88 of a solvated molecule, 109 electron diffraction operator, 134 electron distance, 16 electron distribution, 13 inhomogeneities in, 88 of molecules, 3 electronic energy, 16, 21 potential, 4 electronic repulsion, 20, 22, 39 electronic spectrum, 134 shift in peak, 28-29 electron motion, 15 electron withdrawers, 63 electrostatic effects, 39, 52 of lexitropsins, 168-169 electrostatic energy, 30 electrostatic fields, density functional theory studies of, 108-109 electrostatic potential, 109 of azines, 64-65 of benzene, 62-64 calculating, 54, 110-112 of cytosine, 93 definition, 50 -derived charge, 6 generated by atomic nuclei, 118 molecular, 49 sign of, 51 use to analyze noncovalent interactions, 56-66 variance on molecular surface, 71, 72-74
energy differences, 6, 185 enthalpies of ammonia complexes, 102 solvation, 116 entropy, of dipeptide conformations, 200 environment, density functional theory studies of, 108-119 enzymes inactivation, 7 modeling reaction mechanisms, 35-40 equilibrium differences, 137 equilibrium mixture, of glycine, 182 esperamicins, 178 n-ethyl acetamide, 204 ethylene glycol, conformational studies, 94 ethylene oxide, proton affinities, 163 exact energy, 4 exchange-correlation parameterizations for functional, 89 potential, 17, 87, 88, 119 excited states effects of solvent, 28 wave functions, 23-25 exclusion principle, 4, 14 extended X-ray absorption fine structure (EXAFS), 95 fast multipole method, 119 Fermi contact, 10, 92 finite difference method, 112 fluorine bonding, 61-62 fluorocarbons, 62 fluoropyrimidines, mechanism of action, 164 Fock matrix, 20, 22, 35, 117, 118 Fock operator, 17, 18 introducing environment effects, 27 folding, 187, 195, 197 folic acid, 164 footprinting methodology, 172 force-field parameters, for peptides, 198 formamide, 73, 101 conformational studies, 112-113 electrostatic potential, 54 local geometry, 191, 193 structural parameters, 140 vibrational frequencies, 91
formamidine, 105-106
formic acid
  density functional theory studies, 104
  structural parameters, 140
Fourier transformation, 134
free energy, 10, 32, 33
  of cavity formation, 34
  component, 110
  of hydration, 114
  solvation, 114, 118
frequency command, in Gaussian programs, 10
frozen density functional theory, 118
full configuration interaction, 7
furfuraldehyde, conformational studies, 113
gas electron diffraction, 133-138, 142
  interpreting intensities, 200
  study of dimethyl hydrazine, 183
gas-phase properties, Kohn-Sham method applied to, 89-90
gauche forms, 187
Gaussian functions, use in Hartree-Fock method, 5
Gaussian programs, 8-10
general interaction properties function, 70-74, 75
geometric optimization, XIII, 6, 175
  calculation time for dipeptides, 197, 200
  effect on conformational energies, 186-191
  in Gaussian programs, 9
geometry
  ab initio, 140, 141
  local, 191-193
  specifying initial in Gaussian programs, 8-9
geometry errors, 187
glycine, 141, 184
  chirality, 193-195
  conformations, 185, 188
  history of geometric optimizations, 181-183, 197-198
  Kohn-Sham analysis, 90, 91
  shielding constants, 92
gradient method, 88, 103, 182, 183, 197
Green function, 109
ground state, 23, 24, 25, 29, 49, 86
  vibrational, 139
guanidinium ion, complex with thymine, 172-175
guanine, 163
  alkylation, 161
  electrostatic potential, 51-53
halogenated aromatics, biological activity, 69-70
halogens, 61-62, 64
Hamiltonian operator, 3, 12-13, 15, 17, 25
  Kohn-Sham, 87
hard-soft, acid-base concept, 161
Hartree-Fock matrix, 18
Hartree-Fock method, 4-7, 12-19. See also post-Hartree-Fock methods
  applied to diazohydroxide compounds, 161
  basic operations required, 36
  for calculating electrostatic potential, 54
  computation time, 5
  in conformational studies of amino acids, 183, 185, 186
  restricted and unrestricted, 7
heat of formation, 45
Heisenberg uncertainty principle, 3
helical forms, of dipeptides, 193-201
helix compression, 205, 209
Hermite polynomials, XIII
heteroatoms, 51
heterocyclic compounds, density functional theory studies of, 93
Hildebrand parameters, 60, 73
Hoechst agents, 169, 175-178
Hohenberg-Kohn theorem, 55
hydrocarbons, geometric optimization, 187
hydrogen bond acceptors, 59
hydrogen bonding, 22, 188
  density functional theory studies of, 97-103
  energies, 165
  inclusion in self-consistent reaction field model, 42
  in the MNDO model, 41
  use of electrostatic potential to analyze, 56-60
hydrogen diformate, 102
hydronium ion, 107-108
hydrophobic interactions, 165-166, 168
hydroxyl ion, 107-108
5-hydroxytryptamine, 66-67
INDO. See intermediate neglect of differential overlap
inositol, conformational equilibria, 94
interaction energy, 49, 52, 87, 163
  of metalloorganic complexes, 95-96
  of occupied localized molecular orbital, 38
interaction potential, 26
intermediate neglect of differential overlap (INDO) model, 22, 23, 41-42
intermediates, characterizing, XIII
intermolecular potential functions, 165
internuclear distances, 133, 142
  operational definitions of types, 134-139
  statistical definitions of types, 139-140
  thermal average, 138, 139
iodine bonding, 61-62
ionization potential, 16, 21
iron complexes, 95
isomerization, 103
  of formaldehyde radical, 104
kinetic energy operator, 3
Kohn-Sham method, 85-89
  application to organic molecules, 90-95
  compared to Hartree-Fock, 87
  computer programs implementing, 89
  in potential energy surface studies, 103
  use to calculate electrostatic potential, 110-112
Kronecker delta, 20, 34
Lawesson reagent, 172
least squares diffraction, 140, 200
Legendre polynomials, XIII
lexitropsins, 168-175
ligand size, 170
linear combinations of atomic orbitals, XIV, 5, 17, 18
linear system-size scaling, 119
lithium ion affinity, of aziridine, 163-164
local density approximation, 87-88, 102-103, 120
  applied to water molecules, 98
localized molecular orbitals (LMOs), 35-40
  use to compute protein properties, 45
local schemes, 88
Lovas, F. J., 182
d-lysergic acid diethylamide, 67
macroscopic properties, 71, 75
matrix diagonalization, 19
metalloorganic complexes, application of Kohn-Sham method, 95-96
methane, structural parameters, 140
methanol, 102
methotrexate, 165-166
2-methoxy ethylamine, 141
methylacetamide, 102, 204
2-methylallyl alcohol, 141
methylbromide, structural parameters, 140
methyl hydrazinocarboxylate, 141
3-methylindole, 92
microwave spectroscopy, 138-139, 141, 142, 181, 182, 183, 188
MINDO. See modified intermediate neglect of differential overlap
mitomycins, ab initio calculations on, 162-164
MNDO. See modified neglect of differential overlap
modified intermediate neglect of differential overlap (MINDO) model, 22
modified neglect of differential overlap (MNDO) model, 22, 40-41
  for calculating electrostatic potential, 55
molecular charge distribution, 34
molecular connectivity table, 38
molecular energy, 13
molecular intensities, 134, 135, 137
molecular mechanics, coupled with density functional theory, 115-119
molecular orbital calculation-constrained electron diffraction, 140-141
molecular orbitals, 14-19. See also localized molecular orbitals
  in Hartree-Fock method, 5
  virtual, 23
molecular recognition, 66-70
molecular scattering intensity function, 134
Møller-Plesset method (MP2), 8, 54
  compared to Kohn-Sham, 90, 91
  geometric optimization, 186-189
monocyanocyclobutane, 183
MOPAC computer program, 22, 31, 40, 41, 42
motion, 15. See also torsions; vibration
  equations of, 106
MOZYME computer program, 40, 45
MP2 model. See Møller-Plesset method
Mulliken population analysis, 6, 34, 164, 171
multiple scattering method, 95
multiplicative integral approximation, 201
multipole expansions, 55
  for a cavity, 110-111
netropsin, 168, 169
nitrobenzene, 63, 73
nitrosamines, 162
nitrosoureas, DNA alkylation by, 160-162
NMR parameters, predicted by density functional theory calculations, 91-92
noformycin, 169-170
nonadiabatic effects, 103
noncovalent interactions, use of electrostatic potential to analyze, 56-66
nonlocal gradient-dependent correction, 88
nuclear coordinates, dynamics of, 106
nuclear magnetic resonance. See NMR
nuclear potential energy, 4
nuclear reorganization, 28
nucleic acids, 51-53, 97, 161, 163
  density functional theory studies of, 92-94, 101
nucleophilic attack, sites for, 51
nucleus, charge of, 4
O6 adducts, 161
oligopeptides
  ab initio conformation studies, 195-202
  bond angles, 207
Onsager reaction factor, 27, 28
open-shell systems, 7, 19
operator, definition of, 3
orbitals. See atomic orbitals; molecular orbitals
orbital theory, applied to DNA damage by nitrosoureas, 161
overlap, 39
  charge density, 30
  integral, 17
1,2-oxathienes, 162
parameters
  absolute, 204
  experimental versus calculated, 133-134, 140-141
Pauli's exclusion principle, 4, 14
peptides, conformational geometry maps, 202-204
permutation operator, 14
perturbation theory, 52, 117
  sum-over-states, 92
  Zwanzig statistical, 165
phi/psi torsions, 205, 206
pi regions, 51
Planck's constant, 3
PM3 model, 22-23, 31, 41
Poisson-Boltzmann equation, 112
Poisson's equation, 51
polarity, local, 71-73
polarizabilities, of nucleic acids, 92-93
polarization, 27, 28, 109
  component, 30
  energy, 32, 34
  functions, 6
  inclusion in calculations of electrostatic potential, 54
polycyclic aromatic hydrocarbons, conformational properties of, 94
Pople, J. A., 181, 182
post-Hartree-Fock methods, 7-8
  of electrostatic potential, 54
potential energy, 3-4
potential energy surfaces, 103, 104, 105, 114, 191, 197
  of cytosine, 93
  studied using Kohn-Sham method, 91
prochiral systems, 194
protein
  crystallography, 141
  nomenclature, 197, 198
proteins, XIV, 7
  alpha helix regions, 197, 198, 200
  backbone bond distances, 204-209
  control, 168
  semi-empirical methods for modeling structure, 35-40
  use of localized molecular orbitals to study, 45
proton affinities, 97
  of aziridine and ethylene oxide, 163
  of heteroatomic rings, 171-172
proton transfer, 105, 107, 117, 118
  free energy, 44
Pulay's gradient method, 182, 197
Pullman, Alberte and Bernard, XIV
pyrazine, electrostatic potentials of, 64-65
pyridine, 73
  electrostatic potentials of, 64-65
2-pyridone, density functional theory studies of, 94
pyrimidine, electrostatic potentials of, 64-65
radial distribution function, 134
radicals, 104
  density functional theory calculations applied to, 92
Ramachandran plots, 196
refractive index, 28
repulsion energy, 16, 87
  interelectronic, 4, 20, 22, 39
resonance donors, 63
Roothaan equations, 18, 19-23
rotation, in oligopeptides, 204
rotational constants, 139, 141, 182, 188
Sasisekharan plots, 196
scattering
  angle, 134
  intensity function, 134
  wave method, 95
Schafer, L., 182
Schrödinger equation, XIII, 3-7, 12-13, 135
screening charge distribution, 29
screening conductor model, 115. See also conductor-like screening model
screening energies, 31
self-consistent field method. See Hartree-Fock method
self-consistent fields, and Roothaan equations, 18
self-consistent reaction fields, 26-31, 42, 109-115
Sellers, H. L., 182
semi-empirical methods, 11-12. See also Roothaan equations
  applied to biological problems, 40-42
  applied to proteins and enzymes, 35-40
  effects of environment, 26-35
semi-quinone, density functional theory calculations, 92
serine, conformations, 184
serotonin. See 5-hydroxytryptamine
shrinkage effect, Bastiansen-Morino, 138
simulated surfaces, 204
single-point energy calculations, 182, 186
size effects, 187
Slater method, 88, 103
Slater-type orbitals, basis sets, 5-6
SMx models, 31-35, 44-45
solute charge, 27
solvation energy, 32
solvatochromic parameters, 56-57
solvent accessible surface, 29, 34, 35
solvent effects, 29, 42-45, 115
spin coupling, 91, 95
spin densities, 92
spin functions, 5
spin quantum numbers, 14
spin states, 8, 19
spline function, 205
stacking interaction model, 70
static charge distribution, molecular, 49
strained bonds, 51
stretched conformation, of amino acids, 184-186, 188
structural analysis, XIII-XIV. See also conformational analysis
  experimental versus calculated parameters, 140-141
  guidelines for, 142
Suenram, R. D., 182
sulfur, basis sets for systems containing, 6
sum-over-states perturbation theory, 92
superoxide dismutase, 96
surface charges. See also potential energy surfaces
  apparent, 111-112
symmetrical systems, 193
Taft constants, 62
tautomers, 42, 44
  conformational studies, 113
  of heterocyclic compounds, 93
temperature, vibrational, 137
thermochemistry
  density functional calculations, 96-97
  self-consistent reaction field studies, 113-114
thymidylate synthesis cycle, 165
thymine, complex with guanidinium ion, 172-175
toluene, 73
torsional dependence, 191, 192
torsional sensitivity, 187, 189, 190, 200
torsions, 189, 191
  phi/psi, 205, 206
  tau, 184
total energy, XIII, 3, 16, 23
  in Gaussian programs, 9
transferable atom equivalents, 56
trans forms, 187
transition barrier heights, 104
transition intensities, of glycine, 182
transition state, XIV
  density functional theory studies of, 103
triamides, as subunits of globular protein, 201
triazenes, density functional theory studies of, 97
triazines, 166-168
1,3,5-trioxane, protonation of, 107
tunneling effects, 103, 105
uncertainty principle. See Heisenberg uncertainty principle
vacuum state, 205
van der Waals forces, 117
van der Waals radius, 35
variation principle, 4, 13, 16, 18, 86
  applied to environment effects, 27
vertical energies, 25
vibration, 138, 139
  energy, 32
vibrational frequencies
  of formamide, 91
  in Gaussian programs, 10
  of heterocyclic compounds, 93
vibrational probability density, 134
vibrational temperature, 137
Vishveshwara, S., 181, 182
water, heavy, 107
water molecules, 117
  density functional theory studies of, 98-101
  dielectric constant, 112
  and metal ions, 95-96
wave functions, XIV, 3, 4, 13-15, 134
  antisymmetric, 14
  excited state, 23-25
  in post-Hartree-Fock methods, 7
  total, 5
Wimmer, E., 90
xanthine oxidase, 95
Xα method. See Slater method
X-ray absorption fine structure, extended (EXAFS), 95
X-ray crystallography, 139, 141, 201
zero-differential overlap, 20
zero point energies, 186
zinc hydroxide, 104-105
ZINDO computer program, 42
Z-matrix, 8-9
Zwanzig statistical perturbation theory, 165
zwitterions
  of amino acids, 184
  of glycine, 42-43