CHEMISTRY RESEARCH AND APPLICATIONS
QUANTUM FRONTIERS OF ATOMS AND MOLECULES
MIHAI V. PUTZ EDITOR
Nova Science Publishers, Inc. New York
Copyright © 2011 by Nova Science Publishers, Inc. All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or otherwise without the written permission of the Publisher. For permission to use material from this book please contact us: Telephone 631-231-7269; Fax 631-231-8175 Web Site: http://www.novapublishers.com NOTICE TO THE READER The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers’ use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works. Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication. This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS. 
Additional color graphics may be available in the e-book version of this book. LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA Quantum frontiers of atoms and molecules / editor, Mihai V. Putz. p. cm. Includes index. ISBN 978-1-61324-867-6 (eBook) 1. Chemical bonds--Mathematical models. 2. Dirac equation. 3. Quantum chemistry. I. Putz, Mihai V. QD461.Q36 2009 541'.28--dc22 2010001746
CONTENTS

Foreword

Chapter 1
Fulfilling Dirac’s Promise on Quantum Chemical Bond
Mihai V. Putz

Chapter 2
Duality within the Structure of Complementarity: Right Where It Has No Place to Be
Constantin Antonopoulos

Chapter 3
Complementarity Out of Context: Essay on the Rationality of Bohr’s Thought
Constantin Antonopoulos

Chapter 4
Molecular Integrals over Slater-Type Orbitals. From Pioneers to Recent Developments
P.E. Hoggan, M.B. Ruiz and T. Özdoğan

Chapter 5
Tunneling Dynamics and Its Signatures in Coupled Systems
S. Ghosh and S.P. Bhattacharyya

Chapter 6
Theoretical Calculation of the Low Laying Electronic States of the Molecular Ion CsH+ with Spin-Orbit Effects
M. Korek and H. Jawhari

Chapter 7
Theoretical Explanation of Light Amplifying by Polyethylene Foil
Vjekoslav Sajfert, Dušan Popov, Stevo Jacimovski and Bratislav Tosic

Chapter 8
Anharmonic Effects in Normal Mode Vibrations: Their Role in Biological Systems
Attila Bende

Chapter 9
Emergent Properties in Bohmian Chemistry
Jan C.A. Boeyens

Chapter 10
The Algebraic Chemistry of Molecules and Reactions
Cynthia Kolb Whitney

Chapter 11
Quantum and Electrodynamic Versatility of Electronegativity and Chemical Hardness
Mihai V. Putz

Chapter 12
Physics and Chemistry of Carbon in the Light of Shell-Nodal Atomic Model
G.P. Shpenkov

Chapter 13
Molecular Modeling of the Peanut Lectin - Carbohydrate Interaction by Means of the Hybrid QM/MM Method
Alexei N. Pankratov, Nikolay A. Bychkov and Olga M. Tsivileva

Chapter 14
Electron Density Distributions of Heterocycles: A Shortcoming of the Resonance Model
Ricardo A. Mosquera, Marcos Mandado, Laura Estévez and Nicolás Otero

Chapter 15
Electromerism in Small Molecule Activation by Metal Centers of Biological Relevance
Radu Silaghi-Dumitrescu

Chapter 16
Structural Modelling of Nano-Carbons and Composites
Mihai Popescu and Florinel Sava

Chapter 17
Nanostructure Design - between Science and Art
Mircea V. Diudea

Chapter 18
Quantifying Structural Complexity of Graphs: Information Measures in Mathematical Chemistry
Matthias Dehmer, Frank Emmert-Streib, Yury Robertovich Tsoy and Kurt Varmuza

Chapter 19
Topological Indices of Nanostructures
Ali Reza Ashrafi

Chapter 20
On Uniform Representation of Proteins by Distance Matrix
M. Randić, M. Vračko, M. Novič and D. Plavšić

Chapter 21
Timisoara Spectral – Structure Activity Relationship (SpectralSAR) Algorithm: From Statistical and Algebraic Fundamentals to Quantum Consequences
Mihai V. Putz and Ana-Maria Putz

Chapter 22
On Plots in QSAR/QSPR Methodologies
Emili Besalú, Jesus Vicente de Julián Ortiz and Lionello Pogliani

Chapter 23
Application of the Fuzzy Logic Theory to QSPR-QSAR Studies
Pablo R. Duchowicz and Eduardo A. Castro

Chapter 24
Modeling the Toxicity of Alcohols. Topological Indices versus Van Der Waals Molecular Descriptors
Dan Ciubotariu, Vicentiu Vlaia, Ciprian Ciubotariu, Tudor Olariu and Mihai Medeleanu

Index
FOREWORD ATOMS AND MOLECULES AS THE QUANTUM FRONTIERS OF LIFE Elements of Nature: elements of Life: atoms and molecules. From the earlier time of rational modeling of the Universe, the atomistic vision (Leucippus & Democritus) gave the mechanistic vision upon which all objects are created by unitary elementary entities; they eventually later found the autonomic conceptualization by Leibniz’s monads — the veritable frontiers, two-sided coins, of Creation. On one side they realize the first step from the nonorganized vacuum towards the matter structure, while on the other hand preserve the entangled connection and unity between all the created/observed (and beyond) world. From the epistemological point of view, the Atom at the base – frontier of micro-to-macro Universe marks a drastic unification of the ancient beliefs and myths of the Elements of Nature: water, earth, fire, air – all of them being in one way or another fluidization, sublimation, exaltation or aeration of atomic systems in various conjugations or composites; moreover, Plato’s geometric elements attributing the octahedrons to air, cubes to earth, tetrahedrons to fire, icosahedrons to water, while dodecahedrons build the constellation and paradise – appears as a further atomic combination, yet in a more organized paradigm. Atoms therefore — the first frontier of microcosm, the most pre-eminent physical system in which the fermions (electrons) live in a boson environment (atoms as a whole) — so simple in principle to describe and speculate upon, so difficult to model and quantify. The reason is because of being material and also immaterial, since it spans in principle the whole Universe by its wavefunction. Moreover, all living Nature behaves like its most simple benchmark: Hydrogen. 
We put under the Hydrogenic wave-function: life and death follows its localized-delocalized curvatures, all events encountered as the Hydrogen wave-function shape; acceleration of particles and bodies, the maximum metastable-equilibrium — the climax of life, followed by the smooth descent toward the zero ground of being, leaving nevertheless space for believers regarding the eventual unification of minus and plus infinites, such that all of Nature works in cycles, with the satisfaction of energy conservation. This is Nature — this is Physics; but maybe it is not enough. Just when more than one electron is stabilized in an atomic system — the atomic world diversifies in such manner that the multiple can spring out of Monads, the Elements become now the Periodic Elements, the Chemistry arises. As such the Chem listens by the words of his father Noah in bringing new life in the Universe by dissemination, and cross-fertilization of the earth’s seeds, as such the cheos gives the fluids of Life, and archeos
the archetypes of living, the paradigms of Nature. Living with more than one electron in an orbital, the appearance of duality, of spinning, that’s Chemistry. As a name enough obscure, partly mythical from the dark side of Egypt, partly theological from the sacral combination it provides (El-Chemia: Alchemy) towards the molecular world. With this we arrive at the bonding mystery; it is called in general the chemical bond, but stays as the frontier of the bigger world to come: the Bios. The biological component of Nature may be approached through the generalization of the chemical bonding idea, or the resonance between the ionic and covalent bindings, to the ligand-receptor one. Such picture is fully feasible after rooting in the hard and soft acids and bases theory in which terms any chemical reaction can in principle be formulated. On the other hand, for the bio-, eco-, and pharmaco-logical involved systems the chemical bond appears unstable respecting the isolated systems, due to the environment damping; in other words, a toxicological action (meaning a chemical binding with a toxic effect) is diminishing in time as a consequence of metabolic actions (intern factors) and of the environment (extern factors). This way, it raises the challenge in formulating a theory of the chemical bond variable in time at the biomolecular level, specialized on the in vitro enzymic case, and then extended to the in vivo situations in order to better cover the record of the (non)toxic or therapeutic actions in organisms and the environment. Recent attention was jointly focused on eco-, bio-, as well as on pharmacosciences. In this stage new pharmacophoric reactivity indices may be formulated with the help of quantum molecular topology combined with the graph theory (due to its versatility to be iteratively computed from atoms to molecules) producing a direct quantification of the toxic or therapeutic effect for a given chemical. 
The Hydrogenic atoms, the many-electron atoms, and the (bio-)molecules: Physics, Chemistry, and Biology, and the wholeness they cover in a rheological manner with the help of quanta, fields and particles, in reciprocal transformation, combination and excitation: the thin red line of the Universe. Life’s frontiers: the quantum-verse, the true nature of the uni-verse! With the ever-expanding development of cutting-edge technology and its direct impact on life and the environment, this volume unfolds a lucid review of the main conceptual realms of electronic matter at the level of atoms and molecules. It aims to offer a unitary perspective on the quantum principles as applied to many-electron states, in both isolated and interacting contexts, to the chemical bond and bonding, and to the relation between physical-chemical structure and the chemical-biological reactivity and activity manifestations of nature. The volume gives broad and deep coverage of electronic matter across physically, chemically and biologically ordered systems of increasing complexity, gradually presenting the structure of matter from its physical to its chemical to its biological manifestations in an interdisciplinary, cross-fertilizing manner. With balanced contents provided by important scientists worldwide who have had a valuable impact on quantum fundamentals and applications, the book presents and reviews avant-garde contributions for the 21st century. In fact, the present volume aims to serve the unification of the physical-chemical-biological manifestation of atoms in molecules and in nanostructures by expanding the quantum frontiers in conjunction with relativity, topology, information theory and graph theory.
The book successfully balances among the physical, chemical and biological sides of the quantum theory and of its applications emphasizing both conceptual and computational sides while experiment is addressed only for reference; in this regard the book is more focused on why rather than how quantum effects are produced with
the inner belief that, this way, the second issue is contained in the first. The book is addressed to a broad audience as well as to advanced researchers, combining the epistemological, heuristic and philosophical aspects of the quantum manifestation of matter in atoms, molecules and their combinations in complex nano- and bio-structures. On the other hand, the book seeks to show how far quantum theory furnishes the background and the framework in which the simplest as well as the most complex electronic structures may unfold and evolve in an interacting environment. Overall, the search for unity among the forms in which chemical bonding manifests at various levels of matter organization has become a very active interdisciplinary field in recent years, being one of the main goals of the nanosciences. The quantum paradigm of bonding unification, through the formulation of a minimal set of concepts and quantities having as much universal multi-electronic relevance as possible, represents a real challenge for the conceptualization and prescription of viable applied directions for nanosystems, from atoms to biomolecules. We are therefore confident that the present edited volume will play a special role in advancing scientific knowledge in this priority domain of research. Last, but not least, the Editor and Authors sincerely thank the whole NOVA team involved in the present challenging editorial venture, and especially NOVA Vice-President Nadya Columbus, for kind assistance and patience throughout all stages of publication.
Mihai V. PUTZ (Volume Editor)
Chemistry Department
West University of Timişoara
Romania
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 1-19
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 1
FULFILLING DIRAC’S PROMISE ON QUANTUM CHEMICAL BOND Mihai V. Putz* Laboratory of Computational and Structural Physical Chemistry, Chemistry Department, West University of Timişoara, Str. Pestalozzi No.16, Timisoara, RO-300115, Romania
Abstract One of the most fundamental issues of quantum chemistry, the forming and electronic description of bonding, is here unfolded at the level of Dirac’s theory, in conjunction with the density functional concept of chemical action or, equivalently, with the beloved electronegativity for practical applications of encountering atoms in molecules. The resulting chemical bonding equation allows for the first time the geometrical identification of molecular region along bonding in which the parallel and anti-parallel spin-electrons coexist in antibonding and bonding states, respectively.
1. Introduction
After the 1933 Nobel Prize was jointly awarded to Schrödinger and Dirac for their “productive” contributions to atomic theory through their theories of the electron, the impression arose that at least the chemical world of phenomena would be entirely explained [1-17]. This is no minor matter, given the huge importance that the chemistry of atoms and molecules has in life and in metabolism [18]. Yet it soon became clear that chemical systems are not as simple as the physical models imply. For instance, no spherical molecule exists to which the principles of the special orthogonal groups could be transferred; more complicated still, chemical systems consist of many, but countable, electrons and are therefore not amenable to the thermodynamic physical limit (N → ∞); or, more subtly, the chemical bond supports both parallel and anti-parallel spins in different states, known as anti-bonding and bonding states, respectively, the former being the first involved in chemical reactivity because it belongs to the so-called valence state [3-5].
Molecular orbital theory attempts to explain this intriguing behavior by the antisymmetry rule for the spin-electron wave function of a given state: the total Pauli-antisymmetric wave function may be obtained by combining either a symmetric spin function with an antisymmetric coordinate wave function (for anti-bonding states) or an antisymmetric spin function with a symmetric coordinate wave function (for bonding states). This remains only a qualitative description, however, with no practical consequence for the geometrical description of how anti-bonding parallel spins coexist along the bond [11-13]. One can say that, since electrons are waves, they virtually occupy the whole bonding space with an antisymmetric coordinate wave that ultimately does not interfere with the symmetric one coming from the bonding wave; this is true, but it too is a qualitative argument [8]. Therefore, one should next be interested in describing the geometrical locus along the chemical bond where the bonding and anti-bonding waves behave as within resonators (since they are stationary in the formed molecule), and eventually in explaining why they do not interfere (i.e., they are orthogonal states belonging to separate Hilbert spaces). All this should be done directly with the help of Dirac’s theory, since it is the spin problem that is to be answered, and not by employing the indirect Schrödinger argument of coordinate symmetry [9]. To this end, the present work aims to take a step forward in elucidating the mystery of “the chemical bond” by combining the fundamental Dirac equation and its spinorial solution, which generalizes the Schrödinger equation, with the recently advanced chemical action description of bonding [19,20].

* Tel: +40-256-592633; Fax: +40-256-592620
2. Dirac-Schrödinger Equivalence
It is well known that the basic Dirac equation takes the time-dependent operator form [1,21-23]

$$ i\hbar\,\partial_t[\Psi] = \hat{H}_{Dir}[\Psi] \tag{1} $$

in a shape very similar to the Schrödinger one, however with the Dirac Hamiltonian specialization

$$ \hat{H}_{Dir} = \hat{H}^{0}_{Dir} + \hat{v}(x) \tag{2a} $$

with the free-particle and applied-potential components

$$ \hat{H}^{0}_{Dir} = -i\hbar c\,\hat{\vec{\alpha}}\cdot\vec{\nabla} + mc^{2}\hat{\beta}, \tag{2b} $$

$$ \hat{v}(x) = V(x)\hat{\beta}, \tag{2c} $$

respectively, where $m$ is the particle mass, $c$ the speed of light and $\hbar$ the reduced Planck constant, while the special operators $\hat{\vec{\alpha}}$, $\hat{\beta}$ assume the Dirac 4D representation

$$ \hat{\alpha}_k = \begin{bmatrix} 0 & \hat{\sigma}_k \\ \hat{\sigma}_k & 0 \end{bmatrix}, \quad k = 1,2,3, \tag{3a} $$

$$ \hat{\beta} = \begin{bmatrix} \hat{1} & 0 \\ 0 & -\hat{1} \end{bmatrix} \tag{3b} $$

in terms of the bi-dimensional Pauli and unit matrices (operators)

$$ \hat{\sigma}_0 = \hat{1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \hat{\sigma}_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \hat{\sigma}_2 = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \quad \hat{\sigma}_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \tag{4} $$
and with the wave function taking the so-called spinorial (bi-dimensional) equivalent form

$$ [\Psi] = \begin{bmatrix} \varphi\, e^{-\frac{i}{\hbar}E\cdot t} \\ \phi\, e^{+\frac{i}{\hbar}E\cdot t} \end{bmatrix} = \begin{cases} \begin{bmatrix} \varphi \\ 0 \end{bmatrix} e^{-\frac{i}{\hbar}E\cdot t}, & E > 0 \ \text{(anti-bonding states)} \\[2ex] \begin{bmatrix} 0 \\ \phi \end{bmatrix} e^{+\frac{i}{\hbar}E\cdot t}, & E < 0 \ \text{(bonding states)} \end{cases} \tag{5} $$

However, the question also arises whether the general Dirac equation (1) may be reduced or transformed so as to represent the eigen-equation for the electronic states of a given quantum system. To this end, by closely analyzing the form of eq. (1) with all its contributions, one may reduce the free-motion Dirac operator to the working form [8]

$$ \hat{D} = -i\hbar\hat{A}\hat{\sigma}_0\frac{\partial}{\partial t} + i\hbar\hat{\sigma}_1\frac{\partial}{\partial x_1} + i\hbar\hat{\sigma}_2\frac{\partial}{\partial x_2} + i\hbar\hat{\sigma}_3\frac{\partial}{\partial x_3} + \hat{C} \tag{6a} $$

and employ it in the stationary operatorial equation:
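$$ 0 = \hat{D}[\Psi] = \hat{D}\begin{bmatrix} \Psi_1 e^{i(kx-\omega t)} \\ \Psi_2 e^{-i(kx-\omega t)} \end{bmatrix} = \hat{D}\begin{bmatrix} \zeta \\ \xi \end{bmatrix} \tag{6b} $$

with the oscillatory phase written so as to agree with the prescription of eq. (5) under the Planck energy-frequency identification

$$ E = \hbar\omega \tag{7} $$

Under these conditions, the time and coordinate derivatives yield

$$ \frac{\partial}{\partial t}[\Psi] = i\omega\begin{bmatrix} -\zeta \\ \xi \end{bmatrix}, \tag{8a} $$

$$ \frac{\partial}{\partial x_k}[\Psi] = -ik_k\begin{bmatrix} -\zeta \\ \xi \end{bmatrix} \tag{8b} $$

which reduce the stationary condition (6b) to the form

$$ \left(\hbar\omega\hat{A}\hat{1} + \hbar k_k\hat{\sigma}_k\hat{1}\right)\begin{bmatrix} -\zeta \\ \xi \end{bmatrix} + \hat{C}\begin{bmatrix} \zeta \\ \xi \end{bmatrix} = 0 \tag{9} $$

Choosing now appropriately (which stands for the optimization procedure) the matrices

$$ \hat{A} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad \hat{C} = \begin{bmatrix} 0 & 2m \\ 0 & 0 \end{bmatrix} \tag{10} $$

the last form (9) further rearranges as

$$ \begin{bmatrix} \hbar k_k\hat{\sigma}_k & 2m\hat{1} \\ \hbar\omega\hat{1} & \hbar k_k\hat{\sigma}_k \end{bmatrix}\begin{bmatrix} -\zeta \\ \xi \end{bmatrix} = 0 \tag{11a} $$

leaving the system:

$$ \begin{cases} -\hbar k_k\hat{\sigma}_k\zeta + 2m\hat{1}\xi = 0 \\ -\hbar\omega\hat{1}\zeta + \hbar k_k\hat{\sigma}_k\xi = 0 \end{cases} \tag{11b} $$

Now, solving the first equation of the system (11b) for one variable, say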
$$ \xi = \frac{\hbar k_k\hat{\sigma}_k\hat{1}}{2m}\zeta = \frac{p_k\hat{\sigma}_k\hat{1}}{2m}\zeta \tag{12a} $$

and substituting into the second one, one obtains:

$$ E\hat{1}\zeta = \frac{(p_k\hat{\sigma}_k)(p_k\hat{\sigma}_k)}{2m}\zeta = \frac{p_k^{2}\hat{\sigma}_k^{2}}{2m}\zeta = \frac{p^{2}}{2m}\hat{1}\zeta \tag{12b} $$

where the above Planck relationship was supplemented by the companion de Broglie one for the momentum,

$$ \hbar k_k = p_k \tag{13} $$

while the basic property of the Pauli matrices

$$ \hat{\sigma}_k^{2} = \hat{1} \tag{14} $$

applies, as may be immediately verified from their realization in eq. (4). Nevertheless, eq. (12b) represents in fact the eigen-equation for the free motion, supporting the latent generalization to the bound state, in either anti-bonding or bonding existence

$$ E\zeta = \varepsilon\zeta, \quad E\xi = \varepsilon\xi \tag{15} $$
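The step from (12a) to (12b) rests on the identity $(p_k\hat{\sigma}_k)^2 = p^2\hat{1}$, in which the cross terms cancel because distinct Pauli matrices anticommute. A minimal numerical sketch of this check follows. The mass, the momentum components and the test spinor are arbitrary illustrative values (not taken from the source), units with $\hbar = 1$ are assumed, and the helper names are hypothetical.

```python
# Illustrative check of eqs. (11b)-(12b); all numbers are arbitrary test values.
# Units with hbar = 1 are assumed, so hbar*k_k = p_k and hbar*omega = E.

SIGMA = [
    [[0, 1], [1, 0]],        # sigma_1, eq. (4)
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]


def p_dot_sigma(p):
    """Return the 2x2 matrix p_k * sigma_k (summed over k = 1, 2, 3)."""
    return [[sum(p[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matvec(a, v):
    """Apply a 2x2 matrix to a 2-component spinor."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]


m = 1.0                      # hypothetical mass
p = (0.3, -0.4, 1.2)         # hypothetical momentum components
ps = p_dot_sigma(p)
p_squared = sum(c * c for c in p)

# (p.sigma)^2 should equal p^2 times the 2x2 unit matrix.
square = matmul(ps, ps)
identity_error = max(abs(square[i][j] - (p_squared if i == j else 0))
                     for i in range(2) for j in range(2))

# Build xi from an arbitrary zeta via eq. (12a), then test both rows of the
# system (11b) with E = p^2 / (2m), as implied by eq. (12b).
energy = p_squared / (2 * m)
zeta = [1.0, 0.5j]
xi = [c / (2 * m) for c in matvec(ps, zeta)]
residual_first = max(abs(-a + 2 * m * b)
                     for a, b in zip(matvec(ps, zeta), xi))
residual_second = max(abs(-energy * a + b)
                      for a, b in zip(zeta, matvec(ps, xi)))

print(identity_error, residual_first, residual_second)
```

All three printed quantities should vanish to floating-point accuracy, which is only a consistency check of the algebra and not a statement about any particular physical system.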
The result (15) is interesting because it dispels several odd perceptions about the Dirac equation and its meaning; specifically, it follows that:
• The Dirac equation is formally related to the time-dependent Schrödinger one, while producing the same eigen-problems, thus describing in essence the same nature of the electronic motion;
• The spin information modeled by the bi-dimensional spinors is not necessarily a relativistic effect (apart from completing the 2+2=4 relativistic framework dimension of the Dirac equation) but merely a quantum one, since each spinor separately fulfills the eigen-value problem;
• The two spinors of the Dirac equation may be associated with the bonding (for negative energies) and anti-bonding (for positive energies) states of a system, being thus suited for the physical modeling of the chemical bond, alongside the common interpretation of the negative/positive spectrum of free positronic/electronic energies in the Dirac Sea.
However, before effectively proceeding to the description of chemical bonding based on the Dirac equation, one needs some background on the recent non-orbital quantum modeling of the chemical bond.
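The whole of Section 2 relies on the matrix representation (3a)-(4). As a quick sanity check, the sketch below assembles the 4×4 Dirac matrices from the Pauli blocks and verifies the standard algebra (each matrix squares to the unit matrix, and distinct matrices anticommute). This is an illustration only: the helper names are hypothetical and plain nested lists are used instead of a linear-algebra library.

```python
# Illustrative check (not from the source): build the 4x4 Dirac matrices of
# eqs. (3a)-(3b) from the Pauli matrices of eq. (4) and verify their algebra.


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def block(top_left, top_right, bottom_left, bottom_right):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [top_left[i] + top_right[i] for i in range(2)]
    bottom = [bottom_left[i] + bottom_right[i] for i in range(2)]
    return top + bottom


def anticommutator(a, b):
    """Return ab + ba."""
    return add(matmul(a, b), matmul(b, a))


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
MINUS_ONE2 = [[-1, 0], [0, -1]]
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]

ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]    # eq. (3a)
BETA = block(ONE2, ZERO2, ZERO2, MINUS_ONE2)          # eq. (3b)
ONE4 = block(ONE2, ZERO2, ZERO2, ONE2)
ZERO4 = block(ZERO2, ZERO2, ZERO2, ZERO2)

squares_ok = matmul(BETA, BETA) == ONE4 and all(
    matmul(a, a) == ONE4 for a in ALPHA)
alpha_beta_ok = all(anticommutator(a, BETA) == ZERO4 for a in ALPHA)
alpha_alpha_ok = all(anticommutator(ALPHA[i], ALPHA[j]) == ZERO4
                     for i in range(3) for j in range(3) if i != j)

print(squares_ok, alpha_beta_ok, alpha_alpha_ok)
```

Exact comparison is safe here because every entry is a small integer or a multiple of the imaginary unit, so no rounding occurs.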
3. Quantum Chemical Bond
3.1. Binding Functions and the Chemical Bond
Employing the dimensional quantum-relativity relationship

$$ \langle\text{energy}\rangle\cdot\langle\text{distance}\rangle \sim \text{Joule}\cdot\text{meter} \sim \hbar\cdot c \tag{16} $$

the chemical binding functions were recently introduced [19]:

$$ f_\alpha(\lambda, C_A) = 1 - \Omega\lambda C_A = \begin{cases} 1, & \lambda \to 0 \\ -\infty, & \lambda \to \infty \end{cases} \tag{17a} $$

$$ f_\beta(\lambda, C_A) = \exp(-\Omega\lambda C_A) = \begin{cases} 1, & \lambda \to 0 \\ 0, & \lambda \to \infty \end{cases} \tag{17b} $$

called the anti-bonding and bonding functions, respectively, on the grounds of their asymptotic behavior. The introduced Ω-factor ensures that the functions (17) are dimensionless, being set as

$$ \Omega = \frac{1}{\hbar c} = 0.506773\cdot 10^{-3}\ \text{eV}^{-1}\,\text{Å}^{-1} \tag{18} $$

where $\lambda$ stands for the localization distance and $C_A$ for the chemical action [19,20]

$$ C_A = \int \rho(\vec{x})V(\vec{x})\,d\vec{x} \cong \chi[\rho] \tag{19} $$

expressed in Å (ångström) and eV (electron-volts), respectively. Note that in eq. (19) the equivalence between the chemical action and the electronegativity $\chi[\rho]$, as electronic density functionals, was assumed on the basis of their similar nature in convoluting the applied potential with the electronic density $\rho$ concerned [24], although electronegativity is currently developed as a generalization of the chemical action, depending on it and showing a more complex density functional expression at various localization levels [25,26]. Nevertheless, the binding functions (17) combine reciprocally within a paradigmatic AB molecule, with the coordinate system centered on A, to provide the electronic pair-localization region within the bond length $R_{AB}$ by means of the binding equations (see Figure 1) [19],

$$ \text{(I)}: \ f_\alpha^{A}(\lambda_I, \chi_A) = f_\beta^{B}(R_{AB} - \lambda_I, \chi_B), \tag{20a} $$

$$ \text{(II)}: \ f_\alpha^{B}(R_{AB} - \lambda_{II}, \chi_B) = f_\beta^{A}(\lambda_{II}, \chi_A) \tag{20b} $$

as the interval $\lambda_{II} - \lambda_I$ or as the single point $\lambda_{II} = \lambda_I$ for hetero- and homo-bonding systems, i.e. those having different or identical isolated electronegativities $\chi_A$ and $\chi_B$, respectively.
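The binding equations (20) are transcendental, so the points I and II are most easily located numerically. The following is a minimal sketch, not the author's procedure: it implements the functions (17) literally, solves (20a) and (20b) by plain bisection, and uses the unit settings quoted in the captions of Figures 1 and 2 ($\hbar = c = 1$, hence $\Omega = 1$, and $R_{AB} = 1$). The function names are hypothetical, and the value of $\hbar c$ in eV·Å is the standard one, included only to show where the number in eq. (18) comes from.

```python
import math

# hbar*c in eV*Angstrom (standard value); its inverse reproduces eq. (18).
HBAR_C_EV_ANGSTROM = 1973.269804
OMEGA_EV_ANGSTROM = 1.0 / HBAR_C_EV_ANGSTROM


def f_alpha(lam, chi, omega=1.0):
    """Anti-bonding function, eq. (17a)."""
    return 1.0 - omega * lam * chi


def f_beta(lam, chi, omega=1.0):
    """Bonding function, eq. (17b)."""
    return math.exp(-omega * lam * chi)


def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def binding_points(chi_a, chi_b, r_ab, omega=1.0):
    """Return (lambda_I, lambda_II) solving eqs. (20a) and (20b)."""
    def eq_i(lam):
        return f_alpha(lam, chi_a, omega) - f_beta(r_ab - lam, chi_b, omega)

    def eq_ii(lam):
        return f_alpha(r_ab - lam, chi_b, omega) - f_beta(lam, chi_a, omega)

    return bisect(eq_i, 0.0, r_ab), bisect(eq_ii, 0.0, r_ab)


# Settings of Figure 1 (equal electronegativities) and Figure 2 (chi_A = 2 chi_B = 2).
homo = binding_points(chi_a=1.0, chi_b=1.0, r_ab=1.0)
hetero = binding_points(chi_a=2.0, chi_b=1.0, r_ab=1.0)

print("Omega [1/(eV*Angstrom)]:", OMEGA_EV_ANGSTROM)
print("Figure 1 settings:", homo)
print("Figure 2 settings:", hetero)
```

Under this literal reading of eqs. (20) and with Ω = 1, the two roots for equal electronegativities are mirror images about the bond midpoint, and they approach each other as the product $\Omega\chi R_{AB}$ becomes small, as it does in eV and Å units. For the Figure 2 settings the interval widens and its midpoint moves towards A, the more electronegative centre, in line with the discussion of the two figures.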
Figure 1. Geometrical loci of the sigma-bonding (σ-B, in blue), anti-bonding (⎤-B, in red), no-bonding (∅-B, in orange), and pi-bonding (π-B, in green) regions for chemical binding under equal electronegativity influences of two systems A and B, obtained from equations (20) with the binding functions (17) for the constants and parameter settings $\hbar = c = 1$, $\chi_A = \chi_B = 1$, $R_{AB} = 1$.
In each of the above equations (20) the binding “points” I and II appear as the informational crossing (transfer) between the anti-bonding and bonding functions of both the A and B systems, driven by their electronegativities; moreover, they fix all the types of bonding regions involved (see Figures 1 and 2) as follows [19]:
• the sigma-bonding region (σ-B in Figures 1 and 2) is uniquely defined and has no “nodes” or discontinuities; it is delimited by the area under the crossing bonding functions inside the pairing interval bordered by the projections of the points I and II on the bond; it corresponds to the established bonding obtained from the composed wave-function density $|\Psi_A + \Psi_B|^2$ in conventional molecular orbital (MO) theory [4];
• the anti-bonding region (⎤-B in Figures 1 and 2) is represented by the two parts spanning the space from the systems A and B up to the sigma-bonding limit, being defined by the area under the bonding functions $f_\beta^{A}$ and $f_\beta^{B}$ but outside the interval fixed by the projections of points I and II on the bond length; it corresponds to the anti-bonding state density $|\Psi_A - \Psi_B|^2$ with separated parallel electronic spins in MO theory [4];
• the no-bonding region (∅-B in Figures 1 and 2) is composed of two parts, one on each side of the sigma-bonding region, formed by the area delimited by all the binding functions (17) around the binding points I and II, outside their projected interval on the bond length, which intersect neither each other nor the bond length;
• the pi-bonding region (π-B in Figures 1 and 2) spans the bond length entirely without crossing it, thus having nodes on it; it results from the area defined by all the binding functions (17) around the binding points I and II, partially outside and partially inside (with a common point inside) the projected interval of the points I and II on the bond; this region is therefore compatible with the established pi-bond type of MO theory.
Figure 2. Geometrical loci of the bonding regions as in Figure 1 for chemical binding under different electronegativity influences of two systems A and B, obtained from equations (20) with the binding functions (17) for the constants and parameter settings $\hbar = c = 1$, $\chi_A = 2\chi_B = 2$, $R_{AB} = 1$.
Note that the difference between the equal and different electronegativity influences on bonding in Figures 1 and 2 is reflected in the shift of the sigma-bonding region towards the more electronegative bonding component, together with a slight enlargement of the interval spanned by the projections of points I and II on the bond in Figure 2; moreover, the bond of Figure 2 shows a slight decrease of the sigma-bonding apex value of the binding probability, now about 0.5 compared with the value of 0.6 recorded in Figure 1. This leads to a meaningful consequence for describing the covalent (Figure 1) and ionic (Figure 2) characters of chemical bonding in terms of quantum tunneling through the sigma-bonding region:
• covalent binding is characterized by a symmetric, higher and thinner well of electrons, which are more localized in the middle of the bond;
• ionic binding features a dissymmetric taller and thicker well of electrons, with the pairing electrons more delocalized towards the more electronegative component.
The bonding regions may appear as the consequence of an equilibrium between the binding functions (17), which carry the density functional information either as chemical action or as electronegativity. As a consequence, all the binding regions identified above are defined within the positive (0, 1) range of the binding functions (17), allowing a natural probabilistic interpretation of their interior. Nevertheless, it remains to explore their influence on bonding within the Dirac equation framework, and how the Dirac equation in general influences the chemical bonding phenomenology when the spin (or spinors) are involved. This is addressed in what follows.
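3.2. Chemical Bond by Dirac Equation
The above spinorial identification of bonding and anti-bonding states, see eq. (5), may now be combined with the introduced bonding and anti-bonding functions (17) to create the actual working binding spinor:

$$ [\Psi] = \begin{bmatrix} \left(1 - \dfrac{\lambda\chi}{\hbar c}\right)\exp\left(-\dfrac{i}{\hbar}E\cdot t\right) \\[2ex] \exp\left(-\dfrac{\lambda\chi}{\hbar c}\right)\exp\left(+\dfrac{i}{\hbar}E\cdot t\right) \end{bmatrix} \tag{21} $$

where the previous chemical action dependence is here replaced by the more general (density functional) electronegativity. Next, we impose the condition that the spinor (21) fulfill the Dirac equation (1); for this we express the terms involved separately, with the (bi-dimensional) unit and other Dirac operators understood to act on both the upper and lower spinorial components, so that the implicit total dimension of the wave function is completed to the four-dimensional space:
• the time derivative Dirac term is directly computed as:

$$ i\hbar\,\partial_t[\Psi] = \begin{bmatrix} i\hbar\left(1 - \dfrac{\lambda\chi}{\hbar c}\right)\left(-\dfrac{i}{\hbar}E\right)\exp\left(-\dfrac{i}{\hbar}E\cdot t\right) \\[2ex] i\hbar\,\dfrac{i}{\hbar}E\exp\left(-\dfrac{\lambda\chi}{\hbar c}\right)\exp\left(+\dfrac{i}{\hbar}E\cdot t\right) \end{bmatrix} = \begin{bmatrix} E\left(1 - \dfrac{\lambda\chi}{\hbar c}\right)\exp\left(-\dfrac{i}{\hbar}E\cdot t\right) \\[2ex] -E\exp\left(-\dfrac{\lambda\chi}{\hbar c}\right)\exp\left(+\dfrac{i}{\hbar}E\cdot t\right) \end{bmatrix} \tag{22} $$

• the space coordinate Dirac derivative requires the simple derivative

$$ \partial_k\lambda = \partial_k\sqrt{\lambda_k\lambda_k} = \frac{\lambda_k}{\lambda} \tag{23a} $$

and yields:

$$ \begin{aligned} -i\hbar c\,\hat{\alpha}_k\partial_k[\Psi] &= \begin{bmatrix} -i\hbar c\,\hat{\alpha}_k\left(-\dfrac{\lambda_k\chi}{\hbar c\lambda}\right)\exp\left(-\dfrac{i}{\hbar}E\cdot t\right) \\[2ex] -i\hbar c\,\hat{\alpha}_k\left(-\dfrac{\lambda_k\chi}{\hbar c\lambda}\right)\exp\left(-\dfrac{\lambda\chi}{\hbar c}\right)\exp\left(+\dfrac{i}{\hbar}E\cdot t\right) \end{bmatrix} \\ &= \begin{bmatrix} 0 & \hat{\sigma}_k \\ \hat{\sigma}_k & 0 \end{bmatrix}\begin{bmatrix} i\dfrac{\chi}{\lambda}\lambda_k\exp\left(-\dfrac{i}{\hbar}E\cdot t\right) \\[2ex] i\dfrac{\chi}{\lambda}\lambda_k\exp\left(-\dfrac{\lambda\chi}{\hbar c}\right)\exp\left(+\dfrac{i}{\hbar}E\cdot t\right) \end{bmatrix} \\ &= \begin{bmatrix} i\dfrac{\chi}{\lambda}\hat{\sigma}_k\lambda_k\exp\left(-\dfrac{\lambda\chi}{\hbar c}\right)\exp\left(+\dfrac{i}{\hbar}E\cdot t\right) \\[2ex] i\dfrac{\chi}{\lambda}\hat{\sigma}_k\lambda_k\exp\left(-\dfrac{i}{\hbar}E\cdot t\right) \end{bmatrix} \end{aligned} \tag{23b} $$

• the free + potential term:

$$ \begin{aligned} \left(mc^{2} + V(x)\right)\hat{\beta}[\Psi] &= \begin{bmatrix} mc^{2} + V(x) & 0 \\ 0 & -\left(mc^{2} + V(x)\right) \end{bmatrix}\begin{bmatrix} \left(1 - \dfrac{\lambda\chi}{\hbar c}\right)\exp\left(-\dfrac{i}{\hbar}E\cdot t\right) \\[2ex] \exp\left(-\dfrac{\lambda\chi}{\hbar c}\right)\exp\left(+\dfrac{i}{\hbar}E\cdot t\right) \end{bmatrix} \\ &= \begin{bmatrix} \left(mc^{2} + V(x)\right)\left(1 - \dfrac{\lambda\chi}{\hbar c}\right)\exp\left(-\dfrac{i}{\hbar}E\cdot t\right) \\[2ex] -\left(mc^{2} + V(x)\right)\exp\left(-\dfrac{\lambda\chi}{\hbar c}\right)\exp\left(+\dfrac{i}{\hbar}E\cdot t\right) \end{bmatrix} \end{aligned} \tag{24} $$

With these, the Dirac equation (1) now provides the system of equations: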
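$$
\hat 1E\left(1-\frac{\lambda\chi}{\hbar c}\right)e^{-\frac{i}{\hbar}E\cdot t}
= i\frac{\chi}{\lambda}\hat\sigma_k\lambda_k\,e^{-\frac{\lambda\chi}{\hbar c}}e^{\frac{i}{\hbar}E\cdot t}
+ \hat 1\left(mc^2+V(x)\right)\left(1-\frac{\lambda\chi}{\hbar c}\right)e^{-\frac{i}{\hbar}E\cdot t},
\tag{25a}
$$
$$
-\hat 1E\,e^{-\frac{\lambda\chi}{\hbar c}}e^{\frac{i}{\hbar}E\cdot t}
= i\frac{\chi}{\lambda}\hat\sigma_k\lambda_k\,e^{-\frac{i}{\hbar}E\cdot t}
- \hat 1\left(mc^2+V(x)\right)e^{-\frac{\lambda\chi}{\hbar c}}e^{\frac{i}{\hbar}E\cdot t}
\tag{25b}
$$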
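Next, isolating from the second equation (25b) the term containing the covariant product (the time-dependent phase factors being set aside):
$$
i\frac{\chi}{\lambda}\hat\sigma_k\lambda_k = \hat 1\left(mc^2+V(x)-E\right)e^{-\frac{\lambda\chi}{\hbar c}}
\tag{26a}
$$
it is then replaced in the first equation (25a) to obtain:
$$
\left(mc^2+V(x)-E\right)\left(1-\frac{\lambda\chi}{\hbar c}+e^{-2\frac{\lambda\chi}{\hbar c}}\right)=0
\tag{26b}
$$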
whose solutions express the energy conservation
$$
E = mc^2+V(x)
\tag{27}
$$
and the Dirac adapted bonding equation
$$
-e^{-2\frac{\lambda\chi}{\hbar c}} = 1-\frac{\lambda\chi}{\hbar c}
\tag{28a}
$$
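The step from (25a) and (26a) to (26b), and the two factors behind (27) and (28a), can be written out explicitly. The following is only a worked sketch: the shorthands $\xi=\lambda\chi/\hbar c$ and $M=mc^2+V(x)$ are introduced here for compactness and are not part of the original notation, and the time-dependent phase factors are left aside.

```latex
\begin{aligned}
\hat 1\,E\,(1-\xi)
  &= \underbrace{i\frac{\chi}{\lambda}\hat\sigma_k\lambda_k}_{=\,\hat 1\,(M-E)\,e^{-\xi}\ \text{by (26a)}}\;e^{-\xi}
     + \hat 1\,M\,(1-\xi) \\
\Rightarrow\quad 0
  &= (M-E)\,e^{-2\xi} + (M-E)\,(1-\xi) \\
  &= (M-E)\left(1-\xi+e^{-2\xi}\right)
\end{aligned}
```

The product vanishes either when the first factor is zero, $E=M$, which is (27), or when the second factor is zero, $-e^{-2\xi}=1-\xi$, which is (28a).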
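Yet, due to the negative sign on the left hand side of eq. (28a), one may infer that it is just one solution of a quadratic equation, say
$$
e^{-4\frac{\lambda\chi}{\hbar c}} \cong 1-2\frac{\lambda\chi}{\hbar c}
\tag{28b}
$$
providing the second, accompanying solution
$$
e^{-2\frac{\lambda\chi}{\hbar c}} \cong 1-\frac{\lambda\chi}{\hbar c}
\tag{28c}
$$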
However, eqs. (28a) and (28c) differ formally not only in sign but also in the approximate nature of the latter, which comes from the form (28b) in the short-range regime of the binding distance, $\lambda \cong 0$. Nevertheless, the appearance of two
(±) forms of the Dirac chemical bonding equation is in accordance with the Dirac positive/negative manifestation of energies, corresponding to the electronic/positronic motions within the Dirac Seas, respectively. Moreover, for the description of the chemical bond the difference in sign allows further mixing of the bonding equations for a paradigmatic AB molecule, generating more bonding points and thus modeling in more detail the bonding with spins in bonding and anti-bonding states. Therefore, the actual working Dirac binding functions are:
• The Dirac anti-bonding function remains the same as given within the density kernel approach by eq. (17a):
$$
f_{\alpha}^{Dir}(\lambda,\chi) = 1-\frac{\lambda\chi}{\hbar c}
\tag{29}
$$
• The Dirac bonding function is modified with respect to the previous one given by eq. (17b), while being two-fold:
$$
f_{\beta(+)}^{Dir}(\lambda,\chi) = \exp\left(-2\frac{\lambda\chi}{\hbar c}\right),
\tag{30a}
$$
$$
f_{\beta(-)}^{Dir}(\lambda,\chi) = -\exp\left(-2\frac{\lambda\chi}{\hbar c}\right)
\tag{30b}
$$
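Now, the bonding geometric loci are determined, for the molecule AB, by the system of equations:
$$
\mathrm{(I)}:\quad f_{\alpha}^{A}(\lambda_{I},\chi_{A}) = f_{\beta(+)}^{Dir-B}(R_{AB}-\lambda_{I},\chi_{B}),
\tag{31a}
$$
$$
\mathrm{(II)}:\quad f_{\alpha}^{B}(R_{AB}-\lambda_{II},\chi_{B}) = f_{\beta(+)}^{Dir-A}(\lambda_{II},\chi_{A}),
\tag{31b}
$$
$$
\mathrm{(III)}:\quad f_{\alpha}^{A}(\lambda_{III},\chi_{A}) = f_{\beta(-)}^{Dir-A}(\lambda_{III},\chi_{A}),
\tag{31c}
$$
$$
\mathrm{(IV)}:\quad f_{\alpha}^{B}(R_{AB}-\lambda_{IV},\chi_{B}) = f_{\beta(-)}^{Dir-B}(R_{AB}-\lambda_{IV},\chi_{B})
\tag{31d}
$$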
which is regarded as the Dirac generalization of the previous system of eqs. (20) by means of the last two equations, which quantify the “interference” effect of the anti-bonding and the negative bonding functions belonging to the same atom, to be transferred towards a virtual bonding partner. The representations of Figures 3-4 show how the Dirac binding functions and equations (29)-(31) provide more insight into the modeling of chemical bonding compared with the previous density functional ones of Figures 1 and 2.
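As a minimal numerical sketch of the system (31), assuming ħ = c = 1 and R_AB = 1 as in the captions of Figures 3-4, the following Python snippet encodes the Dirac binding functions (29)-(30) and locates the four crossing points by bisection. The function names, the bisection helper and the search bracket [-1, 2] are illustrative assumptions (the bracket is adequate for electronegativities of order one or larger), not part of the original treatment.

```python
import math

HBAR = C = 1.0   # assumed units, as in the figure captions
R_AB = 1.0       # assumed bond length

def f_alpha(lam, chi):
    """Dirac anti-bonding function, eq. (29)."""
    return 1.0 - lam * chi / (HBAR * C)

def f_beta(lam, chi, sign=+1):
    """Dirac bonding functions: eq. (30a) for sign=+1, eq. (30b) for sign=-1."""
    return sign * math.exp(-2.0 * lam * chi / (HBAR * C))

def bisect(g, lo, hi, tol=1e-12):
    """Plain bisection; requires g(lo) and g(hi) to have opposite signs."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

def bonding_points(chi_a, chi_b, lo=-1.0, hi=2.0):
    """Crossing points I-IV of eqs. (31a)-(31d), in units of R_AB."""
    equations = {
        "I":   lambda l: f_alpha(l, chi_a) - f_beta(R_AB - l, chi_b, +1),
        "II":  lambda l: f_alpha(R_AB - l, chi_b) - f_beta(l, chi_a, +1),
        "III": lambda l: f_alpha(l, chi_a) - f_beta(l, chi_a, -1),
        "IV":  lambda l: f_alpha(R_AB - l, chi_b) - f_beta(R_AB - l, chi_b, -1),
    }
    return {name: bisect(g, lo, hi) for name, g in equations.items()}

points = bonding_points(2.0, 2.0)
for name, lam in sorted(points.items(), key=lambda kv: kv[1]):
    print(f"{name:>3}: lambda = {lam:.6f} R_AB")
```

Under these assumptions, calling `bonding_points(chi, chi)` with chi = 1, 2 and 3 gives the orderings of the four points that can be compared with the configurations discussed for Figures 3a, 3b and 3d.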
The differences come from two basic facts: through the Dirac equation the bonding function (17b) takes two forms, i.e. it splits into one positive and one negative, see eqs. (30a) and (30b), respectively, while also having a modified argument. By contrast, the anti-bonding function (17a) is preserved by the Dirac treatment in both form and multiplicity. Due to this fact, depending on the electronegativity differences between the bonding partners, the anti-bonding spin state may be located at various positions between the mixed (positive) bonding–anti-bonding crossing points I and II, eqs. (31a) and (31b), and the self (negative) bonding–anti-bonding crossing points III and IV, eqs. (31c) and (31d). Moreover, for equal electronegativities three types of parallel spin (anti-bonding) separation may arise, as illustrated by Figures 3a, 3b and 3d, i.e. delocalized outside, precisely localized at the edge, and delocalized inside of the sigma-bonding region, respectively, while Figure 3c illustrates the case when the bonding pair is precisely localized on the bond. Specifically, following the bonding points delivered by the system (31), one has, for the equal electronegativity cases of Figures 3a-3d, the following configurations for the respective electronegativity values:
Figure 3a:
$$
\chi_A=\chi_B=1:\quad
\mathrm{IV}\;\underbrace{<}_{\uparrow}\;\mathrm{II}\;\underbrace{<}_{\uparrow\downarrow}\;\mathrm{I}\;\underbrace{<}_{\uparrow}\;\mathrm{III}
\tag{32a}
$$
providing the anti-bonding (parallel) spins delocalized outside of the region with delocalized anti-parallel pairing electrons;
Figure 3b:
$$
\chi_A=\chi_B=2:\quad
\mathrm{IV}\;\underbrace{=}_{\uparrow}\;\mathrm{I}\;\underbrace{<}_{\uparrow\downarrow}\;\mathrm{II}\;\underbrace{=}_{\uparrow}\;\mathrm{III}
\tag{32b}
$$
providing the precise localization of the anti-bonding (parallel) spins at the margins of the region with delocalized anti-parallel pairing electrons;
Figure 3c:
$$
\chi_A=\chi_B=2.22:\quad
\mathrm{I}\;\underbrace{<}_{\uparrow}\;\mathrm{III}\;\underbrace{=}_{\uparrow\downarrow}\;\mathrm{IV}\;\underbrace{<}_{\uparrow}\;\mathrm{II}
\tag{32c}
$$
providing the precise localization of the anti-parallel pairing electrons at half of the bond length, with the anti-bonding (parallel) spins delocalized outside of it;
Figure 3d:
$$
\chi_A=\chi_B=3:\quad
\mathrm{I}\;\underbrace{<}_{\uparrow}\;\mathrm{III}\;\underbrace{<}_{\uparrow\downarrow}\;\mathrm{IV}\;\underbrace{<}_{\uparrow}\;\mathrm{II}
\tag{32d}
$$
providing the limited delocalization of the anti-bonding (parallel) spins at the margins of the region with delocalized anti-parallel pairing electrons.
Figure 3a. Geometrical loci of the bonding regions as in Figure 1 for chemical binding from equal electronegativity influences of two systems A and B throughout equations (31) with Dirac binding functions (29) and (30), for the constants and parametric settings $\hbar = c = 1$, $\chi_A = \chi_B = 1$, $R_{AB} = 1$.
Figure 3b. Geometrical loci of the bonding regions as in Figure 1 for chemical binding from equal electronegativity influences of two systems A and B throughout equations (31) with Dirac binding functions (29) and (30), for the constants and parametric settings $\hbar = c = 1$, $\chi_A = \chi_B = 2$, $R_{AB} = 1$.
Figure 3c. Geometrical loci of the bonding regions as in Figure 1 for chemical binding from equal electronegativity influences of two systems A and B throughout equations (31) with Dirac binding functions (29) and (30), for the constants and parametric settings $\hbar = c = 1$, $\chi_A = \chi_B = 2.2$, $R_{AB} = 1$.
Figure 3d. Geometrical loci of the bonding regions as in Figure 1 for chemical binding from equal electronegativity influences of two systems A and B throughout equations (31) with Dirac binding functions (29) and (30), for the constants and parametric settings $\hbar = c = 1$, $\chi_A = \chi_B = 3$, $R_{AB} = 1$.
Figure 4. Geometrical loci of the bonding regions as in Figure 2 for chemical binding from different electronegativity influences of two systems A and B throughout equations (31) with Dirac binding functions (29) and (30), for the constants and parametric settings $\hbar = c = 1$, $\chi_A = 2\chi_B = 2$, $R_{AB} = 1$.
It is worth observing that the case of equal electronegativities features the so-called critical electronegativity, which fulfills, for instance, equation (31c) at half of the bond length:
$$
f_{\alpha}^{A}\left(\frac{R_{AB}}{2},\chi\right) = f_{\beta(-)}^{Dir-A}\left(\frac{R_{AB}}{2},\chi\right)
\tag{33}
$$
which gives information on the precise localization of the pairing of anti-parallel electronic spins. Likewise, the precise localization of the anti-bonding parallel spins of eq. (32b), namely at the critical distances
$$
\lambda_{I} = \lambda_{IV} = 0.445571\ [R_{AB}],
\tag{34a}
$$
$$
\lambda_{II} = \lambda_{III} = 0.554429\ [R_{AB}]
\tag{34b}
$$
on the actual bond length scale, is a candidate for universal application for the given (known) equal electronegativity stipulated by case (32b), in adequate units. However, both cases of eqs. (33) and (34), the one providing the localization of the paired anti-parallel spins at half of the bond length, $R_{AB}/2$, for the critical electronegativity, and the other the critical localization of the parallel spins of the anti-bonding states for equal electronegativities
equal to twice the current units, respectively, furnish important practical results when one wishes to control the magnetic properties of a quantum material composed of two aggregates. Figure 4 is nothing other than the extension of Figure 3a to the ionic case, just as Figure 2 differed from Figure 1. Still, for the present case it is worth remarking that, owing to the Dirac treatment, i.e. to the involvement of the negative bonding function of type (30b), the present model of the anti-bonding state, although delocalized in Figures 3a and 4, remains finite, in contrast to the infinite asymptotic space coverage in Figures 1 and 2. The final remark regards the height of the bonding probability, which is more and more contracted with increasing equal electronegativity in Figures 3a-3d, while for different electronegativities an appreciable contraction of the width of the sigma bonding region is recorded, see Figures 2 and 4. Overall, the inclusion of the binding functions of eqs. (17) into the Dirac spinor of eq. (21) leads to the so-called Dirac binding functions (30), which enlarge, through the binding equations (31), the cases of bonding and anti-bonding spin states, while being able to identify the specific situations in which either parallel or anti-parallel spin states are precisely localized along the bond length, with presumably higher importance for designing and controlling atoms-in-molecules spin-based reactivity, and nano-composites [27].
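The critical electronegativity defined by eq. (33) can be estimated numerically. The sketch below, assuming ħ = c = 1 and R_AB = 1 as in the figure captions, solves eq. (33) for χ by bisection, using the anti-bonding function (29) and the negative bonding function (30b). The helper names and the bracket [1, 4] are illustrative assumptions.

```python
import math

HBAR = C = 1.0   # assumed units
R_AB = 1.0       # assumed bond length

def residual(chi, lam=R_AB / 2):
    """Left minus right side of eq. (33): f_alpha - f_beta(-) at lambda = R_AB/2."""
    x = lam * chi / (HBAR * C)
    return (1.0 - x) + math.exp(-2.0 * x)

def critical_chi(lo=1.0, hi=4.0, tol=1e-12):
    """Bisection for the root of residual(chi) on an assumed bracket."""
    r_lo = residual(lo)
    if r_lo * residual(hi) > 0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        r_mid = residual(mid)
        if r_lo * r_mid <= 0:
            hi = mid
        else:
            lo, r_lo = mid, r_mid
    return 0.5 * (lo + hi)

chi_c = critical_chi()
print(f"critical electronegativity ~ {chi_c:.4f}")
```

Under these assumptions the root comes out close to the value 2.22 quoted in (32c).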
4. Conclusion

Although containing a great amount of quantum and relativistic information about the electronic existence, origin, and motion, the Dirac equation is usually used to correct the atomic and molecular energies, possibly through the density functional Dirac-Kohn-Sham equation, which employs the Dirac-Coulomb Hamiltonian (in the so-called no-pair approximation) [17]:
$$
\left[c\,\hat{\vec\alpha}\cdot\hat{\vec p} + \left(mc^2+V(\vec r)\right)\hat\beta + V_{xc}\left(\rho(\vec r)\right)\right]\varphi_j(\vec r) = \varepsilon_j\varphi_j(\vec r)
\tag{35}
$$
with
$$
V(\vec r) = \int\frac{\rho(\vec r\,')}{|\vec r-\vec r\,'|}\,d\vec r\,' - \sum_{A}\frac{Z_A}{|\vec R_A-\vec r|} + \sum_{A\neq B}\frac{Z_AZ_B}{|\vec R_A-\vec R_B|} + \dots
\tag{36a}
$$
$$
V_{xc}\left(\rho(\vec r)\right) = \frac{\delta E_{xc}[\rho(\vec r)]}{\delta\rho(\vec r)}
\tag{36b}
$$
$$
\rho(\vec r) = \sum_{j}^{occ} n_j\,\varphi_j^{+}(\vec r)\,\varphi_j(\vec r)
\tag{36c}
$$
where $n_j$ stand for the occupancy numbers and where the exchange-correlation energy may be appropriately chosen according to the desired approximation framework [28].
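To make the density expression (36c) concrete, the following sketch evaluates it at a single point for two four-component spinor orbitals. The spinor values and occupancies are hypothetical numbers chosen only for illustration; they do not come from any actual Dirac-Kohn-Sham calculation.

```python
def density(orbitals, occupations):
    """rho = sum_j n_j * phi_j^+ phi_j at one point, following eq. (36c).

    orbitals:    list of 4-component spinor values (complex) at that point
    occupations: list of occupancy numbers n_j
    """
    rho = 0.0
    for phi, n in zip(orbitals, occupations):
        # phi^+ phi is the sum of squared moduli of the spinor components
        rho += n * sum((c.conjugate() * c).real for c in phi)
    return rho

# Hypothetical spinor values at one grid point (illustration only).
phi_1 = [0.6 + 0.0j, 0.0j, 0.0 + 0.1j, 0.0j]
phi_2 = [0.0j, 0.5 + 0.0j, 0.0j, 0.0 - 0.2j]

rho = density([phi_1, phi_2], [1.0, 1.0])
print(rho)
```

Here each orbital contributes the squared norm of its four components, weighted by its occupancy.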
The present approach goes beyond the energetic description of bonding, evaluating the geometrical (physical) phenomenology of bonding by employing the Dirac equation for the binding functions abstracted from the density kernel approximation [19]:
$$
\underbrace{\exp\left(-\frac{\lambda\chi_A[\rho(\vec r)]}{\hbar c}\right)}_{\text{bonding function of atom A}}
\cong
\underbrace{1-\frac{\lambda\chi_B[\rho(\vec r)]}{\hbar c}}_{\text{anti-bonding function of atom B}}
\tag{37}
$$
Remarkably, when the functions (37) are taken as the bonding and anti-bonding parts of the Dirac spinor and plugged into the Dirac equation, they provide, besides the energy conservation (as the control, correct result), also the quantum relativistic (Dirac) binding equations (28). The resulting binding functions differ from the earlier ones (17) only in the bonding terms (30), which are now two-fold degenerate, in accordance with the positive-negative Dirac energetic Seas [1,10,21]; yet a difference exists also in the bonding argument of eqs. (28), which is doubled with respect to the previous non-relativistic one of (17b). These special features of the Dirac treatment of binding provide a diversity of situations, depending on the reciprocal electronegativity influences of the adducts, including, however, the situations of precise localization of the bonding pair and of the anti-bonding non-pairing electrons along the bond length. This may lead to important practical applications in spin-driven chemical reactivity, in nano-structured composites, as well as in bio-materials, through controlling the binding and the bonding electrons by electronegativity. Overall, there is also proof that the electronegativity concept and its realization may serve as the main ingredient in bonding modeling, being related to the most intimate quantum-relativistic structures of matter [29]. These ideas are very fruitful and should be further unfolded and applied in forthcoming works aiming to model Chemical Bonding and Reactivity.
References

[1] Dirac, P.A.M. The Quantum theory of the electron. Proc. Roy. Soc. A 1928, 117, 610-624.
[2] Bethe, H.A.; Salpeter, E.E. Quantum Mechanics of One- and Two-Electron Atoms; Springer-Verlag, Berlin, 1957.
[3] Slater, J.C. Quantum Theory of Atomic Structure, Vol. 2; McGraw-Hill, New York, 1960.
[4] McWeeney, R.C.; Sutcliffe, B.T. Methods of Molecular Quantum Mechanics; Academic Press, New York, 1969.
[5] Wilson, S. Electron Correlation in Molecules; Clarendon Press, Oxford, 1984.
[6] Lindgren, I.; Morrison, J. Atomic Many-Body Theory; Springer-Verlag, Berlin, 1982.
[7] Malli, G.L. (Ed.) Relativistic Effects in Atoms, Molecules, and Solids; Plenum Press, New York, 1983.
[8] Boeyens, J.C.A. New Theories for Chemistry; Elsevier, Amsterdam, 2005.
[9] Hoffman, E.O. (Ed.) Progress in Quantum Chemistry Research; Nova Science Publisher, New York, 2007.
[10] Grant, I.P.; Quiney, H.M. Foundations of the relativistic theory of atomic and molecular structure. Adv. At. Mol. Phys. 1988, 23, 37-86.
[11] Pyykkö, P. Relativistic Effects in Structural Chemistry. Chem. Rev. 1988, 88, 563-594.
[12] Ziegler, T.; Snijders, J.G.; Baerends, E.J. Relativistic effects on bonding. J. Chem. Phys. 1981, 74, 1271-1284.
[13] Schwarz, W.H.E. Fundamentals of relativistic effects in chemistry, in Theoretical Models of Chemical Bond. Part 2. The Concept of the Chemical Bond, Ed. Z.B. Maksić; Springer, Berlin, 1990, p. 593-643.
[14] Buenker, R.J.; Chandra, P. Application of configuration interaction for the study of relativistic effects in atoms and molecules. Pure Appl. Chem. 1988, 60, 167-173.
[15] Hess, B.A. Applicability of the non-pair equation with free-particle projection operators to atomic and molecular structure calculations. Phys. Rev. A 1985, 32, 756-763.
[16] Buenker, R.J.; Chandra, P.; Hess, B.A. Matrix representation of the relativistic kinetic energy operator: two-component variational procedure for the treatment of many-electron atoms and molecules. Chem. Phys. 1984, 84, 1-9.
[17] Liu, W.; Dolg, M. Benchmark calculations for lanthanide atoms: Calibration of ab initio and density-functional methods. Phys. Rev. A 1998, 57, 1721-1728.
[18] Putz, M.V. (Ed.) Chemical Bond and Bonding, special issue of Int. J. Mol. Sci. 2007-2009, http://www.mdpi.com/journal/ijms/special_issues/bond_bonding.
[19] Putz, M.V. Chemical action and chemical bonding. J. Mol. Struct. THEOCHEM 2009, 900, 64-70.
[20] Putz, M.V. Levels of a unified theory of chemical interaction. Int. J. Chem. Model. 2009, 1, 141-147.
[21] Schweber, S.S. An Introduction to Relativistic Quantum Field Theory; Harper and Row, New York, 1964.
[22] Jauch, J.M.; Rohrlich, F. The Theory of Photons and Electrons, 2nd ed.; Springer, New York, 1976.
[23] Harriman, J.E. Theoretical Foundations of Electronic Spin Resonance; Academic Press, New York, 1978.
[24] Putz, M.V. Contributions within Density Functional Theory with Applications in Chemical Reactivity Theory and Electronegativity; Dissertation.com, Parkland, Florida, 2003.
[25] Putz, M.V. Systematic Formulation for Electronegativity and Hardness and Their Atomic Scales within Density Functional Softness Theory. Int. J. Quantum Chem. 2006, 106, 361-386.
[26] Putz, M.V. Absolute and Chemical Electronegativity and Hardness; Nova Science Publisher, New York, 2008.
[27] Putz, M.V. (Ed.) Advances in Quantum Chemical Bonding Structures; Transworld Research Network, Kerala, 2008.
[28] Putz, M.V. Density functionals of chemical bonding. Int. J. Mol. Sci. 2008, 9, 1050-1095.
[29] Putz, M.V. Can Quantum-Mechanical Description of Chemical Bond Be Considered Complete?, in Quantum Chemistry Research Trends, Mikas P. Kaisas (Ed.); Nova Science Publishers Inc., New York, 2007, pp. 3-5.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 21-40
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 2
DUALITY WITHIN THE STRUCTURE OF COMPLEMENTARITY: RIGHT WHERE IT HAS NO PLACE TO BE

Constantin Antonopoulos*
Department of Applied Mathematics and Physics, National Technological University of Athens, 28 Oktovriou (Patision) 42, 10682 Athens, Greece
Abstract

Bohr defines as complementary a pair of concepts “united in the classical mode of description”. But waves and particles are not united in this mode of description. They are separated in it. And from now on it only gets worse: Waves and particles are mutually exclusive even in Classical Mechanics. And Classical Mechanics does not contain the quantum. In other words, the incompatibility between waves and particles is self-sufficient and fact-independent and therefore such as cannot even be attributed to the quantum. But then again Complementarity (CTY) must at all costs be attributed to the quantum. In consequence, the sort of incompatibility afforded by Wave-Particle Duality (WPD) is the wrong sort for CTY, which is dependent on the quantum. Once this is realized, CTY is in search of a new foundation. This foundation is not really new at all and is offered in abundance in Bohr’s conception of the wholeness of the quantum of action. CTY is what happens to our structural conceptual scheme when it is applied to the atom. Once CTY is properly derived in this way, Bohr’s ideas are further utilized to effect a reduction of WPD to the all-pervasive, underlying Wholeness. It will then follow that WPD is a derivative instance in Bohr’s doctrine and not the primitive axiom that almost everyone assumes it to be.
* E-mail address: [email protected]

1. Why a Reappraisal Is Necessary

The closing decade of the twentieth century, and perhaps the one immediately preceding it, have witnessed a literal explosion of philosophers’ interest in Bohr’s quantum philosophy,
considered now as a purely philosophical subject in its own right, and pursued after increasingly complex pathways of contemporary semantics and epistemology. Accounts exemplary in finesse and subtlety, by far transcending the clumsy, older presentations of semi-instructed physicists, appear all around us, capable for the first time of doing real justice to “the depth of Bohr’s thought” (Hooker states [p.174] that he has had no cause to change his earlier views on Bohr, as presented in his all-time classic monograph, see Hooker, 1972. I mention this because I will be quoting essential passages from this monograph and very little from this present work; see Hooker, 1994, 155-194). Yet despite the number, the extent, the competence and the sophistication of the works recently written on Bohr’s obscure genius and its products, the issue of the relation between Duality and Complementarity (CTY hereafter), if any, seems to be left untouched on the whole and pretty much where it stood back in the forties (with possibly one other exception; see D. Murdoch’s work of 1987). Since 1984 I have made it clear [in a paper first presented in front of a live audience, comprising the results of my doctoral thesis, in a conference held in Athens, titled Forms of Physical Determinism; the paper proposed a compact argument for deriving energy-time CTY on the basis of discontinuity alone, and “without the intervention of duality”, see Antonopoulos 1985, 95; See also E. Marquit’s reference to this work, Marquit 1988, 193] that I consider Wave-Particle Duality (WPD hereafter) an inelegant, facile, confused and, what is of essence, an incoherent method for deriving the relations of mutual exclusion demanded by CTY. 
It was therefore a matter of particular personal interest for me to observe whether the above noted sophistication of recent Bohrian scholars has at last branded the continuing survival of Duality in Bohr’s system as the anachronism it is, or whether, competence, depth and subtlety notwithstanding, it has otherwise resigned itself to a venerable tradition. The answer to this question was not particularly satisfying to me, so I presently intend to do something about it. The line of authors who have explicitly rejected Duality as a basis of CTY is sadly not a long one, and the fingers of one hand suffice to do the numbering. I have actually found two of them: Max Born and Adolf Grünbaum. Max Born makes it clear that, one: he fully endorses Bohr’s CTY; two: he rejects the idea of waves altogether, claiming that “waves are waves of probability. They determine the supply of particles, that is to say, their distribution in space and time” [Born, 1951, 157; the author’s italics]. Hence, there are no real waves for the particles to be complementary to. Yet he otherwise speaks most approvingly of CTY [Born, 1969, 98]. To him the relationship of CTY must therefore pertain to and obtain between pairs of concepts other than the “wave” and the “particle”. Which are my sentiments exactly. I nevertheless doubt whether the reasons for Born’s dismissal of Duality as a basis, or even an instance, of CTY share much in common with my own. In all probability this is but another case of two people having reasoned to the same conclusion from different premises. But with A. Grünbaum it is not so. Here a good deal is shared: <
> [Grünbaum, 1957, 717; italics in the original].
This is much closer to the complementary scheme of things I have in mind. Yet the position I am about to defend is far stronger than Grünbaum’s. In fact, it is quite extreme: “waves” and “particles” are not just a left over from classical methods and habits, and hence totally unsuitable for a concept as unclassical as that of CTY, though left over and all the rest they certainly are. Waves and particles are a logical impossibility as a putative basis of CTY. It is not without interest to see how, if only by stark contrast to the ‘extreme’ thesis I am going to defend, WPD fares in recent, hence presumably well informed (?) literature: “Complementarity appears to have been introduced by Bohr in response to the wave-particle duality, which becomes the wave-particle complementarity in his framework” [Plotnitsky, 1994, 68]. To complete this untroubled, and slightly museum-piece picture of things with what this author is really aiming at, there is then introduced the irrationality lurking in the concept of WPD: The author (predictably) infers that “wave-particle complementarity is thus defined in anti-epistemological terms” [ibid.] and is, to the same extent, the scientific counterpart of contemporary French obscurantism, involving Derrida, deconstructionism and a curtailed form of Hegelian dialectics, from which French philosophy finds it hard to at last emancipate itself, the author quoted very probably included. Having acknowledged subtlety and finesse in the texts of contemporary Bohrians does not necessarily include all Bohrians nor all texts. It does not include the passage just cited nor, for that matter, the one now to be cited, although its author, Henry Folse, is undoubtedly a very well informed and very attentive reader of Bohr’s text on nearly all other topics and occasions. But even to a scholar of so high a fidelity to the original as Folse is, the idea that WPD may not be what is really behind CTY has never occurred; or, perhaps, because of it.
For suggesting, as Grünbaum has, that CTY is something other than WPD, or that it is inconsistent with it, as I have, is not high fidelity. It is heresy. In consequence, Folse writes that:
• “by 1925, when matrix mechanics appeared, both wave and particle representations seemed necessary to describe the full range of phenomena in which atomic systems and radiation were to be observed” [Folse, 1985, 83; author’s italics].
or that
• “by 1925 Bohr came to believe that the key to the new framework which would resolve the inconsistencies between the classical and the quantum ideas was to work with both particle and wave pictures” [86; author’s italics].
or that
• “what was needed [for Bohr] was not a victory for the particle picture or the wave picture [for] by 1925 he was convinced that neither was to be discarded in favor of the other” [Ibid.; slightly rephrased].
Ergo, wave-particle CTY, of which there are plenty more samples in the relevant section of Folse’s (otherwise excellent) book, some preceding the ones I quoted, some succeeding them and some in between, convincingly backed up by bulky historical evidence too direct to question, such as remembrances of Bohr by Heisenberg et al., private correspondence, notes taken in a hurry, indeed everything it takes to make Grünbaum and myself (and perhaps Max Born a trifle) weep bitter tears at our heretic presumption.
So I guess it is high time we got some of our own back and cause, if not bitter tears as such, then at least a modicum of worry to the crowded orthodoxy, relying on evidence drawn not from letters, nor from recollections, nor from ill remembered phrases and flashes of memory nor, finally, from Bohr’s still confused ideas of 1925, but from the first official statement of CTY as it is specified in the first published volume of Bohr’s work: <> [Bohr, 1934, 19]. Here then: since CTY is the combination of features united in the classical mode of description, if CTY obtains between waves and particles, waves and particles are united in the classical mode of description. This routine definition of CTY, known to all to be as commonplace as the ABC of the doctrine, should give Plotnitsky and Folse something to really think about; for it is not waves and particles which are united in the classical mode of description and separated in the quantum. If anything, waves and particles are (quasi)‘united’ in the quantum mode of description and separated in the classical [And even there the (quasi)‘unity’ is only pretentious; complementarists do say that they are properties of one and the same micro-entity but the unrelenting classical opposition between waves (large) and particles (small) not only is retained full force in QM. It is actually utilized to derive CTY]. In consequence, if there is one possibility which this important passage excludes, this is the possibility of wave-particle CTY. On the other hand, what this common, but at the same time unique, passage establishes as the proper pairs of complementary concepts are none other than those which Grünbaum has already specified above, viz. the two pairs of classical conjugate concepts, whose products (not unexpectedly) yield a unit of physical action: Et and pq. The very ones which also enter Heisenberg’s Uncertainty Relations (UR hereafter), in many ways Bohr’s CTY expressed quantitatively. 
For it is these, it goes without saying, which are united in the classical mode of description and separated in the quantum. Not waves and particles. In consequence, treating WPD as the basis, or an instance, of CTY creates a definite and insurmountable hermeneutic impossibility. Bohr’s own words directly forbid it. If they are adhered to, WPD is the last thing in the world that fits the description. Folse is too alert a reader of Bohr’s text (quite unlike Plotnitsky and his un-welcome improvisations) to have missed a point of such importance. But he receives it rather differently than I: “It is clear that in this original statement of Bohr’s new viewpoint, the complementary relationship holds between space-time coordination and the principle of causality, both of which were combined in the classical framework. Only later does Bohr speak of complementarity between wave picture and particle picture” [Folse, 1985, 114.]. If “later” in this connection is supposed to mean “in his mature years” it won’t do. Maturity to the extent of self-contradiction is no development or enrichment of the doctrine. Bohr may have included wave-particle CTY “later”, if Folse so wishes. But it is just as true, if not indeed much more so, that the CTY between the classical conjugate concepts he has never retracted or abandoned and only sheer folly would lead him to do a thing like that. CTY has the exact same age that Heisenberg’s UR have, and in Heisenberg’s UR the concepts which are mutually exclusive are still today the action products Et and pq. Hence, E with t and p with q are still as complementary as ever. Are also the wave and the particle, which were “later” added to complete the doctrine, of a like nature? If so, then on
the basis of Bohr’s own definition, CTY should now obtain between concepts which are both: united and separated in the classical mode of description, since the classical concepts E,t and p,q correspond to the former account, the wave and the particle to the latter. In consequence, unless Folse is to explicitly reject classical concept CTY for the sake of wave-particle CTY, an idea whose pronouncement alone seems suicidally absurd, since the latter is said to support the former, we shall inevitably have to make room for both of them. In other words, allow for CTIES between both concepts united and concepts separated in the classical theory. Some definition of CTY this is. I don’t know whether to call it a contradiction, an ugly patchwork or just an empty formula which licenses everything. Let there be no mistake about that. Bohr’s own words, as explicitly voiced in the quoted, inconvenient passage, make wave-particle CTY exegetically impossible. And this, until now, only as far as exegesis goes. But there is more. For the exegetic impossibility implicitly contains a logical one, thus far concealed by the focus of emphasis. So let us reveal it, if only in moderation, for not all the story should be told at so early a stage. I have claimed that wave-particle CTY fails to satisfy the standards of Bohr’s definition, because waves and particles are not united, as the definition requires; they are separated in classical mechanics. Separated they certainly are. But then again, without any help from the quantum. For nothing of this sort exists in classical mechanics. Nevertheless, wave and particle are as incompatible as ever, even in classical mechanics. Waves (large) and particles (small) are self-sufficiently incompatible. So why need the quantum to separate them? The incompatibility between “large” and “small” is fact-independent and therefore cannot even relate to the quantum.
But then again, as I’m sure Folse would be the first to agree, CTY has to relate to the quantum. How then can it be based on WPD, which, by definition, cannot? Complementarists, who make a habit of easy, sweeping solutions, will not find an easy way out of this one. Nor, strangely, will anybody else.
2. The Relevance of the Classical Framework

Probing deeper into the logical structure of Bohr’s relevant passage we can discern not only which actual concepts should qualify as members of CTY; we can, through Bohr’s thrifty but pregnant words, also discern why. Why, that is, Bohr thinks that such concepts can still complement one another in QM, despite their separation therein. They can, indeed they must, because they are united in the classical mode of description. Hence, the prototype sets as candidates for the relation of CTY only such concepts as are first shown to be united in the classical mode of description. (But waves and particles are not united in this mode of description. Still, people say that they are ‘complementary’.) The demand that concepts complementary in QM must first be shown to be united in Cl.M (classical mechanics) is not perchance a weak, dispensable analogy, to be dealt with or waived aside on the physical aphorism that QM has already undone Cl.M, so what’s the point of calling them complementary in the first place. For the demand to treat them as complementary in QM, iff united in Cl.M, is not itself a physical argument at all. It is a transcendental one. The pressures to call them complementary are not factual. They are conceptual. The interference of the quantum may well undermine the physics of the situation, which is in any case contingent and therefore transitory. But the conceptual aspects are not as
Constantin Antonopoulos
easily removable because of this and Bohr was the last person ready to remove them [Bohr emphatically denies that the classical concepts can be replaced by different concepts in QM; whence, of course, their resulting CTY. See for greater details my essay, Antonopoulos, 1987, passim]. To put the point more forcefully, Et and pq are not complementary in QM just because they are united in Cl.M. Far more than merely establishing the unity of Et and pq Cl.M. is itself established on the presupposition of such unity; for the unity spoken of antedates scientific systemization and is the rock bottom, transcendental foundation of its very possibility. Whence the Kantian dimension of the point [Kant’s Transcendental Aesthetic, treating space and time, and Transcendental Analytic, treating causality, combine to furnish the transcendental substratum, whose joint fulfillment alone will confer upon a provisional entity the status of an event. Compare this with Favreholdt’s description of Bohrian epistemology: “(All of this) fits well with Bohr’s insistence that space, time and causality are necessary forms of sensibility in human knowledge”, see Favreholdt, 1994, 81; for Bohr’s transcendentalism, see also Honner, 1994, 142]. This, then, is the reason why momentum and position must be considered complementary: Because they constitute equally necessary components of one and the same phenomenon of motion; as Folse elegantly puts it, “Motion is change of position in time” [Op.cit., 57], hence, the idea of motion without simultaneous reference to both of these constituents, p and q, borders on conceptual impossibility. Whence, of course, the very roots of the representational, intuitive, epistemological and even logical problems related with QM and CTY, which demand their separation. However, in spite of their physical separation in QM, position and momentum still remain inerasable conceptual correlatives, for their correlation was other than physical to begin with. 
Hence, when split apart by the quantum of action, they should still be considered complementary, because, even so, the two of them belong to one another, if motion of any kind is to ever result. And in belonging to one another, they therefore complement one another. (NB: Mutual complementation is also necessary to CTY!) Here now is why energy and time should be considered as complementary. They are because the following principle, elementary in its simplicity, still continues to obtain in spite of everything: P.1: An object can only occupy a particular energy state at some time, or over a time, if at all; which, in turn, has very little to do with QM or even Cl.M, for that matter. For it is itself but a special case of a more general and fundamental principle, which underlies it: P: An object occupies a state at some time, if at all (energy states included). I hold this to be an important conceptual truth for its alternatives lead to absurdity. To show this, I will restate P in the following, Kantian form; P': What happens, happens within time, therefore at a time (or over a time). That is to say, cannot happen outside time (or outside the flow of time). To then suppose it can, may be magic or miracle but it certainly isn’t physics. And, in any case, cannot be referring to what we normally call an event. Since, therefore, P, P.1 and P' leave little room for coherent alternatives, energy states as much as anything else must needs be associable with a time [Take, for instance, the following assertion by D.M.Mackay: “Energy must always be associated with a tract of time”, see
Duality within the Structure of Complementarity…
Mackay, 1958, 112; italics the author’s; what is so special about this assertion is that it comes in the text just after the author has derived the energy-time uncertainty! Even that result is not powerful enough to erase the transcendental correlatedness of the two concepts]. And hence, even if energy and time are no longer directly associable at the physical level, due to the quantum, they nevertheless retain their formerly established conceptual link and so their very correlatedness. For the very arguments establishing such link are not themselves empirical and contingent. They are transcendental. And this kind of link lingers on even in the face of adverse empirical conditions, revealing its true origin. The time of its occurrence belongs to an event and, conversely, the very notion of timing something can in turn only refer to an event. Hence, the two complement one another and should be regarded as complementary in all the cases, where the physical conditions obtaining prevent their hitherto joint application; which brings us directly to Bohr’s second string of definitions: <> [Op. cit, 10]. In other words, such and only such pairs of concepts qualify as complementary, which, in the presence of any one member of the pair that is empirically realized, the absent member is still as necessary for a wholesome, cogent representation of reality, as the present member is; which is only sound, plain sense. One must never forget that, as can be verified from Bohr’s own words, mutual exclusion is only a necessary condition for CTY. Not a sufficient one. The other necessary condition trivially being, I would suppose, the capacity of the alternatingly excluded member of the pair to complement its complementary. The epistemological insanity lurking within QM is that it precisely seeks to drive a wedge and set apart such pairs of concepts, which make good sense only when taken together. 
This is the nature of the conceptual pressures exerted upon us by the fundamental structure of the classical framework, retrodictively demanding its toll even in the face of the quantum. Can I put the point in plain, human, unpretentious terms? A concept A is complementary with a concept B, iff it would have been preferable for human knowledge to have obtained A together with B, rather than without it. Normally, this is what CTY is all about. Now let us try and apply all these rather commonplace facts about CTY to the case of waves and particles and see what follows. Waves are large (extended) and particles are small (local). And assigning them both simultaneously to one and the same thing would result in a silly contradiction. Is it then preferable for human knowledge to be able to obtain the wave together with the particle, rather than without it? For this is what their alleged CTY would boil down to. What is the fundamental epistemological problem raised by QM and its complementary account? It is, for example, that it renders a classical position incompatible with a classical momentum, perennially frustrating Einstein, who branded the theory incomplete for this very reason [Einstein, Podolsky, Rosen, 1935, 777-780]. Was Einstein complaining about quantum incompleteness because QM forbids the co-existence (or the ‘co-verifiability’) of waves and particles? Was that what Einstein’s demand for completeness was about? Here are some further, impressive ‘similarities’(?) between classical concept CTY and wave-particle CTY, ‘similarities’ which, I’m sure, all concerned have noticed long before I have. To start with, energy and time or position and momentum are attributes which can be jointly assigned to a macro-system. Wave and particle are not attributes which can be jointly
assigned to a macro-system, except on pain of blatant contradiction. Who can deny such similarity? Accordingly, for energy and time or position and momentum the descriptive problem raised by CTY, or by QM as such, is that they cannot be observed together, when all concerned feel they should. So, one must conclude, the problem would be removed if they were observed together. But the descriptive problem raised by wave-particle CTY is not that these two cannot be observed together. They cannot but, so far, no one has called this a problem. The problems here would start, if wave and particle could be observed together, though, thankfully, they cannot. For some pairs of concepts, it would seem, their CTY is a deep frustration. But for some others, their CTY is pure salvation! Something here has gone terribly wrong, terribly not just due to the magnitude of the mistake but, indeed, due mainly to how easy it was to avoid it. In view of these remarks I trust no one will invoke experiments to save the day and argue that, since it is experiments which force upon us the “two natures”, we have no choice but to comply. At this point experiments are irrelevant, nor have I built my case on them. Experiments can at best establish the existence of waves and particles. Not their complementarity. CTY is a conceptual doctrine and the citing of brute experimental facts, however persistent or inexplicable, is but a lame excuse for calling them complementary, thus making a virtue of necessity. The deplorable logic of “experiments are like that, so what can we do?” does not per se necessitate the application of CTY. How about doing nothing? In my view, to argue in this way is to just take hold of a problem and, through its sheer longevity and inexplicability, decide abruptly to turn it into its own solution, because it is a problem. Duality experiments do not establish the complementarity of waves and particles. They just establish their observation.
And he is surely a poor reader of Bohr, who conflates these two. If they also establish a contradiction in the world at least behind the scenes [See also Folse, 1985, 83-4], then, at best, this is what CTY is supposed to handle. Not therefore what CTY is supposed to be. Indeed, for CTY to be able to handle the contradiction, it must itself be distinct from the contradiction. Otherwise, and were the contradiction also a ‘CTY’, the problems plaguing the contradiction would eo ipso become the problems now plaguing CTY; which is exactly what they have become, only spreading more and more butter on the bread of Bohr’s adversaries. And then, rather than solving the problem of WPD, CTY would itself become a part of this problem. That alone is excellent reason for wanting to separate CTY from WPD. Hence, WPD experiments do not establish wave-particle CTY. Then what does? There are some who find it consoling to suppose that “wave” belongs to “particle” because, they say, some of the concepts which even I have admitted as authentically belonging to one another, viz. p with q and E with t, are themselves connected, some with the wave and some with the particle [This confused contention is based on a tragically erroneous reading of de Broglie’s p=h/λ, mistaking p, or mv, as the particle property, and so (correctly) taking λ as the remaining, i.e. the wave property; but p is no longer the particle property in this relation. p is the magnitude which is here determined by the wave, by simply assigning to λ a unique value; then the fraction h/λ can also receive a definite value; but a unique value for λ, if it is to be assigned, entails the use of the wave picture, indeed the picture of a harmonic wave, which makes recourse to the ‘particle’ impossible a priori, at least if complementarists themselves are to be believed. So, if p is the particle property, p has to be defined in the absence of the particle, whose property it is! For a lengthier demonstration of the point see
Antonopoulos, 2004]. So if, they say, p belongs with q, so must “wave” belong with “particle”, which are respectively connected to them. I honestly hope that no serious arguer will attempt this circle against me. To assume that “waves” belong together with “particles” simply because p and q also do, is to take for granted the very thing I have already called into question. For I have been arguing that this is precisely what the relation between wave and particle does not do and cannot do; namely, come even close to comparing with the relation between p and q or that between E and t in the first place. To therefore suggest against me that the wave is complementary to the particle, because these two ‘correspond’ with p and q or E and t, which latter are complementary, is to freely assume that they do so correspond and this is the last thing I am presently inclined to allow. In a nutshell: wave and particle are contradictory concepts, exactly as contradictory as “yes” is with “no”. But energy with time and position with momentum are not contradictory concepts. So the whole nonsense of ‘associating’ waves and particles with energies and times or positions and momenta is as sound and rational a practice as urging us to treat as contradictory concepts which we have ourselves admitted are not contradictory. Other than that, Plotnitsky sees no problem in calling waves complementary with particles, and then proceeds full speed to the understanding of CTY in “anti-epistemological” terms. Were CTY as incoherent as all that, it would definitely be anti-epistemological and, besides, every single one of the things which its opponents have claimed it to be for nearly seventy years now. Sadly, this practice is adopted also by Folse, who should at least know better [In the Editorial to the 1994 collection, see Folse, Faye, eds., 1994; Folse, if he is indeed the co-author of the piece (I suppose he must be), speaks as if it is still evident to him that CTY≈WPD.
There is, however, a slight revision: “The connection between wave-particle dualism and the complementarity of the dynamic and kinematic properties remains a problematic issue for the analysis of Bohr’s philosophy and for the interpretation of QM”, see Editorial, 1994, xvi. I would say it does! But I think I may have an answer to it, see Section 5]. Well, if this is orthodoxy, I’d choose heresy any day. An exasperation shared by all heretics, I’m sure.
3. On the Two Kinds of Incompatibility
If something is behind the mutual exclusion of the classical conjugate concepts, Et and pq, and this cannot be WPD, then what can it be? Bohr is pretty clear about what it is. All we need to do is listen:
Complementarity is a term suited to embrace the features of individuality of quantum phenomena. [Bohr, 1958, 39]
“The fundamental postulate of the indivisibility of the quantum of action [...] forces us to adopt a mode of description designated as complementary” [Bohr, 1934, 10].
Quantum theory is characterized by the emphasis on the feature of wholeness connected with the quantum of action. [Bohr, 1963, 53]
Bohr stresses here that what is behind CTY of the classical conjugate concepts is not some ‘duality’ or other but, indeed, and very directly so, the indivisibility of the elementary quantum; that is to say, its “wholeness”. Further exegetic efforts disclose remarkable persistence on this point:
The essence of quantum theory may be expressed in the quantum postulate, which attributes to any atomic process an essential discontinuity or, rather, individuality. This postulate implies a renunciation of the causal, space-time description of atomic phenomena [Bohr, 1934, 43].
Once again it is the individuality of the processes, or their discontinuity, which is behind the failure to coordinate the causal and the spatiotemporal description. CTY has appeared in these passages several times already and still the word “duality” is spectacularly absent from the text. And there is more. Bohr is well known to have attempted to generalize CTY over the (philosophical) domains of Action, Mind and Life, where he claims that CTIES comparable to those of QM can be detected. It is interesting to see why he thinks so in these other three domains, where quantum wholeness is not present: In general philosophical perspective [...] in other fields of knowledge, we are confronted with situations reminding us of the situation in quantum physics. Thus, the integrity of living organisms and the characteristics of conscious individuals present features of wholeness, the account of which implies a typically (!) complementary mode of description. [Bohr, 1963, 7]
So, after all, a comparable wholeness is present in these other three domains, analogously inviting application of the doctrine. Wholeness, it should be stressed, which implies a typically complementary mode of description. How much more typically holistic does this mode of description have to be, to at last detach itself from Duality? I think that after seventy-five years of CTY complementarists must simply accept the facts as they are and stop foisting on Bohr the dualities of QM, or those of the Copenhagen Interpretation (a caricature of Bohr’s real doctrines, if there ever was one), or those of their own experiences with WPD, in case they are also physicists. The man speaks differently. The man himself never said “complementarity is a description based on duality”, as P.K.Feyerabend said in his name some forty-five years ago [1958, 94] and as Plotnitsky or Folse plus very many others still say today. It is only natural to assume that, if Bohr thought CTY was based on Duality, this is what he would have said right from the start, just as Feyerabend has, rather than unendingly repeat that it is based on indivisibility and wholeness instead. But I still have an ace up my sleeve, to determine once and for all whether it is the indivisibility of the quantum that is behind CTY, or whether it is its ‘duality’ which really is. Either way, concerning the quantum of action one thing is certain. It is the quantum which is behind CTY and nothing but the quantum, independently of what other properties it is deemed to possess. I trust no complementarist will deny me that. This, besides, is what the quantum uncertainties, ΔEΔt≥h and ΔpΔq≥h, also say. For h→0 both uncertainties would vanish. Suppose then that it is essentially WPD which is behind the mutual exclusion of the four conjugated parameters, i.e. the two pairs of classical concepts. However, the wave and the particle are mutually exclusive even in classical mechanics. And classical mechanics does not contain the quantum.
How then can CTY, which according to all quoted Bohrian passages is dependent on the quantum, be the synonym of a pair of concepts, wave and particle, which are mutually exclusive without the presence of the quantum? For wave and particle exclude one another even in classical mechanics and therefore do so independently of the quantum. Some CTY this is! When I have emphatically announced in the opening section of this essay that it is logically impossible for WPD to be the basis of CTY I was not exaggerating. It is indeed no less than logically impossible to so be.
The sole way that this inevitable conclusion can be avoided is to then argue that WPD is the quantum. Whereupon CTY, by being dependent on the quantum, would eo ipso be dependent on WPD. Accounts and interpretations of this sort are anything but hard to come by and, it should be mentioned, are generally considered as the ‘cultivated’ or the ‘sophisticated’ version of the ordinary and vulgar version of Duality. Needless to say, they are nothing of the kind. They are as incoherent and confused as any other, less sophisticated and far more vulgar presentation of wave-particle CTY available. Suppose we examine this contention closer: Originally, wave and particle are two things; in fact, two incompatible things; the quantum of action just one. How then are we to understand the claim that WPD and the quantum are one? Is this supposed to mean that the wave and the particle cease to be two incompatible things, becoming one, or that the quantum ceases to be one, becoming two incompatible things? Isn’t all this asking of us to swallow unresistingly more and more logical curiosities as we go along, rather than less and less, as it would be if the answers given us were only clear, convincing and reasonable? Identifications have their own strict rules. What we have here as candidates for identification are a pair of mutually exclusive concepts, on the one hand, and a (so far) unique concept, on the other, in whose frame the former two are alleged to be joined; but treating a pair of concepts as mutually exclusive presupposes treating them as not being able to be joined in the frame of a third concept, if it presupposes anything. Else there is no point in calling them mutually exclusive in the first place. In consequence, any identification of the sort presently proposed is impossible to implement without severe loss of identity of either of the participants. 
Either the two hitherto incompatible concepts will lose their identity, and cease being mutually exclusive (to no benefit of wave-particle CTY!), or else they will retain their identity, warranting mutual exclusion, and then, if WPD is the quantum, for the quantum to partake in the identity, the quantum will have to be mutually exclusive to ... itself. Which conclusion is more incoherent and absurd than even the vulgar versions of wave-particle CTY it seeks to replace. If complementarists are to be taken seriously they must give up such symptoms of exasperating double talk and start talking some sense. WPD cannot constitute the foundation of CTY because it yields the wrong sort of mutual exclusion. But then, if there is such a thing as the wrong sort, there must also be the right sort as well. In other words, there must be two kinds of mutual exclusion, that is to say, two kinds of incompatibility, only one of which is suitable for CTY. This brings us to the title heading of this section. Consider, then, the propositions “A>B” and “A=B”. These are certainly incompatible. Consider next the proposition “your money or your life!” as pronounced by the proverbial bandit. Are these two cases of incompatibility anything at all like one another? They are not. They are diametric opposites. The former, logical incompatibility is a consequence of definitions and is therefore necessary (as Leibniz would say). The latter, factual incompatibility is a mere consequence of obtaining facts and is therefore contingent (as Leibniz would say). Contingent, that is to say, upon a fact (e.g. the quantum) [A preliminary formulation of this distinction I have given in Antonopoulos, 1988, 323; Its complete and comprehensive formulation can be found in Antonopoulos, 1994, 187-9, its application to the EPR-Bohr debate in Antonopoulos, 1996, 1998 and its application to the problem of nonlocality in Antonopoulos, 1997a, 1999 and, partly, 2005]. 
This we may entitle “the prohibitive fact” [Antonopoulos, 1994, 188] (e.g. again the quantum). Hence, on the whole, there is fact-dependent and there is fact-independent incompatibility. And Wave-Particle Duality is of the latter sort. Not of the former. This, I believe, suffices to make all the mess perfectly clear as it suffices, I hope, to make clear why it is a mess. WPD, in expressing a logical incompatibility, which its own defenders go out of their way to establish (!), is therefore fact-independent and, as such, cannot even relate to the quantum. At least not coherently, though otherwise it certainly can. And is therefore unsuitable not just for CTY but for QM itself on the whole. Most people assume that WPD is the mystery of QM, the sole quantum mystery there is, the rest of them, if any, presumably reducible to its narrower or at worst its wider frame. Three physicists, for instance, one of whom, R.Feynman, is quite well known and popular, introduce the reader to their discussion of Duality (i.e. the double slit experiment) in the following way: “We choose to examine a phenomenon which is impossible, absolutely impossible, to explain in any classical way, and which has in it the heart of quantum mechanics. In reality, it contains the only mystery. In telling you how it works we will have told you about the basic peculiarities of all quantum mechanics” [Feynman, Leighton & Sands, 1965, 1-1; italics the authors’]. That it is a mystery, I can see as much as the next man. That it is a quantum mystery, I cannot. Since WPD incorporates a logical and therefore a fact-independent type of incompatibility, hence a type of incompatibility that is given a priori, I simply see no way in sight of how to even relate it with the quantum, which is a fact, and only given a posteriori; and so with QM.
4. Wholeness and the Proper Logic for Complementarity
Having regarded the distinction between logical and factual incompatibility of paramount importance for the proper understanding of QM and CTY and, most definitely, of the true place of Duality in either of the two former, I have been systematically scanning the quantum literature for traces of its presence in other authors. The following two accounts, which really come close, are the best that I have been able to come up with: “Bohr believes that while it has seemed to us at the macro-level of classical physics that the conditions were in general satisfied for the joint applicability of all classical concepts, we have discovered this century that this is not accurate and that the conditions required for the applicability of some classical concepts are actually incompatible with those required for the applicability of other classical concepts. This is the burden of the doctrine (B4). [=Complementarity] This conclusion is necessitated by the discovery of the quantum of action and only because of its existence. It is not therefore a purely conceptual discovery that could have been made a priori through a more critical analysis of classical concepts. It is a discovery of the factual absence of the conditions required for the joint applicability of certain classical concepts” [Hooker, 1972, 137; italics the author’s]. The distinction between the two types of incompatibility previously spoken of is certainly presupposed in this passage. For where there is a factual absence (in italics) forbidding joint applicability, there must also be a logical one, to which the former is implicitly contrasted. But Hooker did not pursue the issue further. In fact, he gave it up altogether later on and, indeed, repudiated it! [Hooker wrote to me: «These remarks... lead me to be cautious about your insistence “that wave-particle duality must be abolished in Bohr’s philosophy as well as
in quantum theory itself”», Hooker, letter dated 15 August 1983, upon receiving a draft of my doctoral thesis. Of course, my rejection was based on the distinction under consideration. In subsequent communication he even questions the overall validity of the distinction. In his letter of 18th December 1989 he tells me that regarding “formal” as opposed to “factual” aspects of the problem at hand “naturalists like me (him) cannot make a sharp distinction between the two kinds of truth”. Well, up there he has! Let’s only hope first thoughts are best thoughts]. The second passage faintly reflecting the same logical differences between CTY and Duality, which I have been insisting upon, belongs to G.Holton: “Position and momentum are not mutually exclusive notions since both are needed to specify the state of the system. But they are complementary in the restricted sense that they cannot both at the same time be ascertained with arbitrary precision [...] In contrast (!), the wave-particle aspects of matter are complementary and mutually exclusive” [Holton, 1970, 1050; italics the author’s]. Position and momentum are not mutually exclusive notions, that is to say, they are not really incompatible, and so it takes a fact to make them so – “in a restricted sense”. “By contrast”, wave and particle are complementary and mutually exclusive. That is to say, logically incompatible right from day one and, therefore, independently of the quantum (which detail escapes Holton’s attention). Hence, so far as the distinction between incompatibilities goes, this passage is practically identical with my point. But here all affinity stops. Holton sees no obstacle in speaking of restricted as opposed to unrestricted forms of CTY and accommodates them both under the same broad conceptual roof. 
But if it makes no difference, whether two concepts are restrictedly or unrestrictedly incompatible for being commonly called complementary, it makes no difference either, whether two concepts are restrictedly or unrestrictedly incompatible in the first place. So the difference is denied right within the very reasoning which detects it. This is why, besides, this potentially shrewd remark remains entirely idle in the work referred to [Antonopoulos, 1988]. Having designed the appropriate logical instrument for CTY, the sole available for permitting it to relate to the quantum, I shall now proceed to its application to indicate at least structurally what I take the logic of CTY to be. However, I will do so in purely abstract and formal terms, perhaps too abstract and too formal at that, but such as can surely make CTY accessible to any fellow philosophers, namely, the people I am presently addressing. Yet staying close to at least the spirit of Bohr’s words and that of QT I will still make indivisibility the centre of my arguments. Abstraction of the sort I have in mind will involve a slight modification of terminology. In physical terms I have spoken of CTY as presupposing a fact-dependent type of incompatibility. Since I will now discuss purely abstract (numerical) entities I will add the following, already implied provision. Fact-dependent incompatibility is conditional incompatibility. Conditional, that is, on the very fact it is dependent upon. Since all fact-dependent incompatibility is eo ipso conditional, the modification is minimal and valid. Consider therefore a simple equation of the form A+B=x, where all three variables, A, B and x, are to be receiving values exclusively from the field of natural numbers, i.e. positive integers without zero. (Some say because zero is not a number. I would say because zero is not a positive number.)
Then for every value ascription to x such that x>1, it is always possible for both the other two variables of the equation, A and B, to receive equally precise and definite values themselves. E.g. for the ascription x=7, one can correlatively assign A=2 and B=5 in satisfaction of the equation, or any other mutually constrained values he chooses,
values the availability of which increases in direct proportion to the value assigned to x; and decreases in comparable manner. So then consider what happens for the limiting ascription, x=1. Now it will no longer be possible for the other two variables, A and B, to both receive a precise and definite value. There is now but a single value available, 1, and this latter cannot be distributed among the two variables, A and B. It will therefore be assigned as a whole either to A or to B but not to both. (Observe how naturally talk of wholes enters this description.) If it is assigned as a whole to A, and this is the only way it can be assigned, then we get: A=1, B=indefinite, since zero is not a possible natural ascription. And if it is assigned as a whole to B, and this is the only way it can be assigned, we get: B=1, A=indefinite, for the same reason. Due to the indivisibility of 1 in this particular number field, the two variables simultaneously referred to it are thereby driven to mutual exclusion. This, I submit, is (the logic of) Complementarity. The incompatibility of simultaneous and definite value assignments for A and B is conditional on the value assignment to x. For any value ascription x>1 A and B are compatible. For x=1 they are not. We have not assumed an incompatibility between the variables A and B themselves. Their incompatibility is extrinsic to their nature, not intrinsic to it as is the case with Duality. For any value assignment x>1 it disappears as much as it will, if we change the entire number field from that of the naturals to that of the fractional numbers, now permitting simultaneous value assignments even in the case that x=1. (This would correspond to a return to the classical ontology.)
Mutual exclusion is in this case ascription-dependent, fully corresponding to the fact-dependent incompatibility required by the indivisibility of the elementary quantum, of which 1, as a natural, is presently the formal representation; that is to say, ascription-dependent and not a priori. And, it goes without saying, it is this very possibility which allows us to link the resulting variable incompatibility with the indivisibility of the elementary entity, to which they are simultaneously referred and applied. The whole logic of the scheme is: Two into one won’t go. (This is a logical truth.) Incidentally, and since there has been a fleeting reference to Einstein’s demand for quantum completeness, which is something essential to the cogency of CTY, the structure of the foregoing model makes it perfectly clear that the disjunctive (and definite) value ascriptions available are the best possible ones allowed by the nature of the mathematical phenomenon at hand and, respectively, by the nature of the comparably indivisible quantum. The fact that either value ascription is available, prior to actually ascribing any, will not go to show that they are both assignable simultaneously.
5. The Reduction of Duality

The structure of the numerical model on the basis of which I have designed the logic appropriate for CTY, not only for staying true to the actual words of Bohr, naming indivisibility as the raison d’être of CTY (an exegetic requirement), but also for warranting the possibility of even relating CTY with indivisibility (a logical requirement), is a structure utterly opposed to anything we can extract from WPD. A priori incompatibility cannot relate to the world, and therefore cannot relate to whatever exists in the world, which is predicated as indivisible; in other words, the quantum. This alone is reason enough to dismiss WPD as a possible basis for CTY or a possible basis for anything. Having demonstrated this
unsuitability and having set the record straight on the basis of Bohr’s so poorly understood doctrines, I would consider my obligations fulfilled and my investigation concluded. So far as I’m concerned they are. But so far as Bohr is they aren’t. For Bohr has an explanation for WPD, deriving it as the consequence of deeper-lying realities, that is to say, of Indivisibility. This, come to think of it, is anything but surprising and certainly anything but incongruous. So I will stretch the limits of this enquiry somewhat further as an ultimate test of the unity and the consistency of his thought. The aim of this last section is therefore to indicate the relevance of Indivisibility (or Wholeness) to the emergence of WPD. I will commence with my own version and proceed with Bohr’s. Here then, in a nutshell, is my version:

• AXIOM: If an indivisible thing is successfully predicated, the predication will embrace the entire thing. (This is an analytic truth.)
• THEOREM: An indivisible thing can only manifest undivided aspects of itself.
It would then follow that alternative predications of an indivisible thing, in being impossible to contain in their rightful place each, will embrace the indivisible thing in its entirety, its “wholeness”, and hence conflict with one another in ways which would never obtain, if this thing was subdivisible and therefore such as to allow its predications to be kept separate, and thus distinct and combinable. Wholeness will not permit its partial predications, which for an ordinary object could certainly be distinct and therefore consistently attributable. Hence, any pair of alternative, successful predications, A and B, of an indivisible entity will successively apply to its entirety, now representing the entity as all-A, and therefore as no-B, or as all-B, and therefore as no-A, which two, given the appropriate context, will most certainly conflict. The appropriate context is, in this case, the Part-Whole relation, which is, essentially, what the Axiom and Theorem, above formulated, are all about. If no parts of an indivisible thing can ever be obtained, the “would-have-been” part (for any other, subdivisible entity) will now emerge as the whole thing instead. And then the two (holistic) predications will have to conflict, for the “would-have-been” part, peacefully coexistent with another of its kind, will now extend to the entire thing, blocking the other part out; and conversely. The question may arise, whether it is at all legitimate to introduce parts in the case of indivisible things, even if permanently behind the scenes, and derive the conflict on their implicit presence, which is precisely what I’ve done. Indivisible things should have no parts, one might point out. This is not so. It is true that something simple has to be indivisible, e.g. the unit of my complementary model. But the converse is not likewise true; something indivisible does not have to be simple. It may be possessed of parts, the difference being that they’d have to be inseparable parts. 
Nor are we in want of concrete examples. The good old proposition is such an indivisible, though structural whole. The properties of truth and falsehood emerge with the entire proposition as such. They do not accompany any of the propositional terms, when separately regarded, not even in analogy. The proposition is a whole which, as metaphysical tradition has so correctly grasped, is greater than the sum of its parts. So also in my argument each resulting ‘part’ is greater than itself, becoming the whole. If that is denied, no predication of an indivisible entity will ever be successful and then we’d simply have no problem in the first place.
A quick regress to several of the Bohrian passages I have quoted, and to many which I have not, directly reveals that in his general epistemology Bohr conceives of classical mechanics as the methodology of subdividing, analyzing and “dissecting” its objects. Of course, if the objects turn out to be indivisible, rather than being themselves dissected, they will dissect us, that is to say, the structural conceptual scheme which we habitually apply to our objects. And will divide it in two, rendering it complementary. (My rudimentary ‘atomic’ model shows exactly how this is brought about.) Were the question to ever arise, and it has, whether Bohr adopts any Kantian-like category of the understanding as an ‘a priori’ formula for future experience, that would surely have to be the Category of Divisibility. We can only make sense of divisible, analyzable, dissectible phenomena. We are habitual ‘analyzers’ [This is, incidentally, Henri Bergson’s view of the understanding as well, incapable of grasping the duration, i.e. the wholeness of organisms and our own uninterrupted flow of consciousness. I have already quoted one Bohrian passage to this end but there are countless to choose from, see Antonopoulos, 1997c, where all are gathered]. Hence, when confronted with indivisible phenomena we cannot but persist in our “customary ways” (a favorite Bohrian motto) of analysis and subdivision, ways that now are bound to be frustrated. One consequence of this practice is, of course, CTY. Our structural, i.e. composite conceptual scheme will not be accommodated by a structureless, simple reality, symbolized by “1” in my model. The scheme can only apply to it complementarily (Two into one won’t go). The other consequence of this practice is, what else, the wave-particle duality.
Indivisible entities, when differently dissected will either retreat and become inaccessible or else emerge by manifesting undivided aspects of themselves, which, given the conditions of dissection currently obtaining, namely, our specific macro-inspired experiments, are the sole permitted to be realized in the setting. To overcome dissection and stay true to their nature atomic phenomena must make up for the severed part of their identity by ‘replacing’ it, as it were, by whatever other properties the dissecting environment allows them to manifest. So in a dissecting A-environment they will emerge as if all-A and in a B-dissecting environment as if all-B, which are evidently inconsistent. Depending on the dissection –or predication– chosen, the indivisible thing will respond accordingly, if it will respond at all. And will emerge transformed, manifesting properties which it can be independently shown to not autonomously possess. The contradictions resulting from atomic phenomena, when too closely looked at (dissected) are not original. They are nothing but the combined result of their indivisibility and our own attempts at subdividing them. The contradiction is not inherent in the world itself. The contradiction is of our own making, when due to our persistent macro-habits we do to an object something contrary to its own nature. This is what Wave-Particle Duality is all about for Bohr, if it is anything, and, it should be stressed by the by, there is really nothing subjectivist in this situation. The thing is indivisible, so it behaves indivisibly, so it reacts indivisibly, so it interacts indivisibly, so it is observed indivisibly. When others see “the human subject” in all this, I see nothing but an entity always staying true to its independently asserted nature (So none of the usual, subjectivist fairy tales please! – See also the next Chapter). 
Bohr’s way of reducing WPD to the underlying quantum wholeness is similar to mine and is distinguished from it only in the respect of introducing indivisible interactions, when I have found it more convenient to speak of directly indivisible entities instead. I cannot say whether this deviation contains elements of substance in it or is just an idiosyncratic peculiarity. Be that as it may, Bohr proceeds in his reduction by the following steps:
“The finite magnitude of the quantum of action prevents altogether a sharp distinction between the phenomenon and the agency by which it is observed” [Bohr, 1934, 11]. Object-device inseparability is, as is well known, the opening stage of Bohrian wholeness, directly connected to the indivisibility of the quantum itself [also Hooker, 1972, 195]. Once this is laid down, the next step consists in dragging to surface the potential consequences that such inseparability is expected to produce, when we are warned of “the impossibility of any sharp separation between the behavior of atomic objects and the interaction with measuring instruments” [Bohr, 1958, 39]. From this premise Bohr proceeds to obtain immediately below that “under these circumstances an essential element of ambiguity is involved in ascribing conventional physical attributes to atomic objects, as is at once evident in the dilemma regarding the corpuscular and wave properties of electrons and photons, where we have to do with contrasting pictures, each referring to an essential aspect of empirical evidence” [ibid.]. All three passages now jointly entail that such contrasting pictures emerge during experiments because of the impossibility of separating between atomic object and measuring instrument for the duration of the experiment. That is to say, Duality exists because Wholeness exists first. Any denial of the proposed connection between the two, intending to posit WPD as fundamental and irreducible in Bohr’s system of interdependent axioms, must now take its case with these revealing and restrictive passages. And also take its case with those that follow: <> [Ibid.].

The whole conception from beginning to end is epitomized in the following, comprehensive way:

(a) Phenomena like individual, atomic processes, due to their very nature, are essentially determined by the interaction, between the objects in question and the measuring instruments.
[Hence] (b) No result of an experiment concerning [such] a phenomenon can be interpreted as giving information about independent properties of the objects but is inherently connected with a definite situation in the description of which the measuring instruments interacting with the objects also enter essentially.
[So] (c) This fact gives the straightforward (!) explanation of the apparent contradictions which appear when results about atomic objects are tentatively combined into a self-contained picture of the object [Bohr, 1958, 25-6].

What Bohr is saying, therefore, though it is clear enough to be plain to all, is roughly this: If during an indivisible interaction atomic object and measuring instrument form an inseparable whole, in the sense of becoming indistinguishable, then they are one thing. But if object and instrument become one thing in a given measurement and another thing in a different measurement, then one and the same thing will appear as different to itself as the two measurements are to one another. This corresponds with my own alternative predications, extending to the entirety of the atomic object whereupon, by being uneven to one another,
will compel the indivisible object, capable only of displaying undivided aspects of itself, to appear uneven to itself. Bohr implies the exact same thing: “In the study of atomic phenomena we are presented with the situation where the repetition of an experiment with the same arrangement may lead to different recordings, and experiments with different arrangements may give results which at first sight seem contradictory” [Bohr, 1958, 18]. Hence, in view of all the preceding, namely, my analysis and that of Bohr’s, it is simply impossible for Duality to be anything else save a created situation. It nowhere follows from any of the previous considerations that Duality is a foundation of the doctrine. In this vein I would also call the reader’s attention to the following point. The reduction of WPD to indivisibility was here not the inductive procedure which many suppose it to be, beginning from collected and contradictory experimental results, and inferring therefrom that the devices of observation must have something to do with the contradictions. This is the eternal (and pernicious) route which takes us from Duality to ‘wholeness’ between observational results and the devices by means of which they were acquired; which reasoning essentially continues to posit WPD as the sole physical basis behind quantum effects. The reduction of Duality to wholeness was here furnished as a deductive process, commencing from wholeness and advancing to the emergence of the contradictory predications of it, where wholeness itself was taken as an independently ascertainable property of the indivisible object, entailing the ‘contradictions’ when treated in disregard of its true nature. Whether or not we will obtain inconsistent predications thereafter depends solely on the Theorem, that an atomic thing will only manifest undivided aspects of itself. Psychologically, we had to see it first to become aware of it. 
But the logical order is strictly one way in either derivation, mine or Bohr’s. Wholeness precedes and Duality follows, which is what a proper reduction consists of. This, I submit, is all there is behind Bohr’s account of wave-particle duality and this is its true status in his doctrines; namely, but a second order consequence of a presupposed wholeness and of our own attempts at subdividing it. It is therefore a derivative notion and not a primitive axiom at all, as it has been so frequently asserted all the way to the very present. It is, so to speak, a side product of indivisibility, not an ontological property inherently subsisting in the things themselves, and as a side product it should be regarded both in Bohr’s philosophy and QM as such. In this light Duality is divested of any special privilege in delivering all the other, subordinate propositions of the doctrine, least of all of Complementarity. There is but one such protagonist in Bohr’s doctrines and its name is not “wave-particle duality” at all. It has a far simpler, one-word name: Wholeness.
References

[1] Antonopoulos, C. “Discontinuous Alterations of State and the Question of Determinism”. Determinism in Physics, Gutenberg, Athens 1985.
[2] Antonopoulos, C. “The Uncertainty Relations of Energy and Time and the Conflict between Discontinuity and Duality”. Microphysical Reality and Quantum Formalism, Kluwer, Dordrecht 1988, vol. 2.
[3] Antonopoulos, C. “Innate Ideas, Categories and Objectivity”. Philosophia Naturalis, 26, 2, 1989.
[4] Antonopoulos, C. “Indivisibility and Duality; A Contrast”. Physics Essays, 7, 2, 1994.
[5] Antonopoulos, C. “Bohr on Nonlocality; The Facts and the Fiction”. Philosophia Naturalis, 33, 2, 1996.
[6] Antonopoulos, C. “A Schism in Quantum Physics or How Locality Can Be Salvaged”. Philosophia Naturalis, 34, 1, 1997.
[7] Antonopoulos, C. “Time as Non-Observational Knowledge; How to Straighten Out ΔEΔt≥h”. International Studies in the Philosophy of Science, 11, 2, 1997.
[8] Antonopoulos, C. “(In Refutation of) Complementary Conceptual Schemes; The Objective Metaphysics of Complementarity”. Idealistic Studies, 27, 2, 1997.
[9] Antonopoulos, C. “The Remaining Alternative of Bell’s Theorem”. Physics Essays, 12, 1, 1999.
[10] Antonopoulos, C. “Reciprocity, Complementarity and Minimal Action”. Annales de la Fondation Louis de Broglie, 29, 3, 2004.
[11] Antonopoulos, C. “Investigating Incompatibility: How to Reconcile Complementarity with EPR”. Annales de la Fondation Louis de Broglie, 30, 1, 2005.
[12] Bohr, N. Atomic Theory and the Description of Nature. Cambridge University Press, 1934.
[13] Bohr, N. Atomic Physics and Human Knowledge. Wiley & Sons, New York, 1958.
[14] Bohr, N. Essays 1958-62 on Atomic Physics and Human Knowledge. R. Clay & Co, Suffolk 1963.
[15] Born, M. The Restless Universe. Dover Publications, New York 1951.
[16] Born, M. Physics in My Generation. Springer-Verlag, New York 1969.
[17] Einstein, A., Podolsky, B., Rosen, N. “Can Quantum Mechanical Description of Physical Reality Be Considered Complete?”. Physical Review, 47, May 1935.
[18] Favrholdt, D. “Niels Bohr and Realism”. Niels Bohr and Contemporary Philosophy, Faye, J. & Folse, H. eds., Kluwer, Dordrecht 1994.
[19] Feynman, R., Leighton, R., Sands, M. The Feynman Lectures on Physics. Addison-Wesley, California 1965.
[20] Feyerabend, P.K. “Complementarity”. Proceedings of the Aristotelian Society, supplementary vol. xxxii, 1958.
[21] Folse, H. The Philosophy of Niels Bohr. North Holland, Amsterdam 1985.
[22] Folse, H. “Bohr’s Framework of Complementarity and the Realism Debate”. Niels Bohr and Contemporary Philosophy, Faye, J., Folse, H., eds. Kluwer, Dordrecht, 1994.
[23] Grünbaum, A. “Complementarity in Quantum Physics and its Philosophical Generalization”. The Journal of Philosophy, lvi 23, 1957.
[24] Holton, G. “The Roots of Complementarity”. Daedalus, 99, 1970.
[25] Honner, J. “Description and Deconstruction: Niels Bohr and Modern Philosophy”. Niels Bohr and Contemporary Philosophy, 1994.
[26] Hooker, C.A. “The Nature of Quantum Mechanical Reality”. Paradigms and Paradoxes, University of Pittsburgh Press, 1972.
[27] Hooker, C.A. “Bohr and the Crisis of Empirical Intelligibility: An Essay on the Depth of Bohr’s Thought and Our Philosophical Ignorance”. Niels Bohr and Contemporary Philosophy, 1994.
[28] Mackay, D.M. “Complementarity”. Proceedings of the Aristotelian Society, supplementary vol. xxxii, 1958.
[29] Marquit, E. “Determinism in Physics, 1985: Book Review”. Foundations of Physics Letters, 1, 2, 1988.
[30] Murdoch, D. Niels Bohr’s Philosophy of Physics. Cambridge University Press, 1987.
[31] Plotnitsky, A. Complementarity. London 1994.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 41-60
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 3
COMPLEMENTARITY OUT OF CONTEXT: ESSAY ON THE RATIONALITY OF BOHR’S THOUGHT

Constantin Antonopoulos*
Department of Applied Mathematics and Physics, National Technological University of Athens, 28 Oktovriou (Patision) 42, 10682 Athens, Greece
Abstract

It has been suggested by several people that Bohr’s Complementarity (CTY) is a form of context-dependence. In defense of the realism and objectivism of Bohr’s doctrine, I provide the following refutations:

A. Position and momentum are not the same word in different contexts. They are different words in the same context, namely a single instance of motion. Hence they are complementary in a single context rather than across two. This refutes a contextualist reading of CTY.
B. CTY does not resemble Kuhnian paradigms, unless it is identified with Duality. But the incompatibility between waves and particles is self-sufficient, and hence makes the quantum redundant. Hence CTY, which is dependent on the quantum, is unlike Duality; hence unlike paradigms.
C. The foundation of CTY is shown to be Indivisibility, that is to say, Atomism. An atomic singularity, thus conceived, splits in two and renders complementary a pair of classical concepts, E with t and p with q, on the basis of the analytic truth that two into one won’t go, except of course complementarily. For this kind of singularity Bohr is shown to speak as a realist. He says it “in principle resists an analysis”, implying that we try and fail. The failure however is due to the thing, not to the analysis. Bohr is therefore a metaphysical realist and, as an atomist, a metaphysical realist twice over.
* E-mail address: [email protected]

1. Complementarity Contextualized

The view that Bohr is a nonrealist, due mainly to the ideas of older physicists inadequately instructed in matters philosophical, has recently dropped out of fashion. It was a deserved fate but there are still some pockets of fierce resistance left in this country [see
Note 1] and abroad. The roles have slightly changed, however. Bohr is no longer labeled as an out and out “idealist”. He is now depicted as a contextualist, hence still a nonrealist. One version of the point is delivered by his grandson, Christian: In order to prepare for my comparison between Complementarism and Anti-Realism let me attempt an Anti-Realist’s approach to these problems: According to Post-Tractatus Wittgenstein, problems (philosophical problems, it is) in principle arise because of the rules of language usage being violated. In the case of man’s free will the problem is that it— dependent (roughly) on whether we look into our selves or study other beings—seems consistent with experience to assume both that we do have, and that we don’t have a free will; with the two contentions being of course mutually inconsistent.— And the cure for this apparent inconsistency is the understanding that we though using the same word, scil. will, still are not using it with the same meaning, because we have changed the context. [...] Now let’s see how the same problems are attacked with Bohr’s complementarity analysis. A case for complementarity arises, if I understand Bohr correctly, when our use of two (or more) words, vocables, terms, etc., involves us in inconsistencies and ambiguities, because these words, though intimately related in ordinary usage, as a consequence of the different conditions for their use actually belong in different contexts. [Chr. Bohr, correspondence, received 18 April 97. All italics mine. Other than that, the author’s text is quoted unchanged.] This, therefore, is an identification of CTY with context-dependence spreading, as it seems, throughout the entire conceptual matrix. But it is anything but a solitary proposal; not by any means. For that matter, it stands in the company of a good deal of influential names in the issue at hand, past and present, such as those of N.R. Hanson, P.K. 
Feyerabend, Th.Kuhn (as interpreted by others, however; cf. C.Glymour), Gonzalo Munevar (editor of the Kluwer volume dedicated to Feyerabend [see Note 2]), and last (in my list, at least) C.Glymour himself [see Note 3]. Nor would I ever venture to launch an attack against “contextual CTY” drawing merely from the limited confines of my personal correspondence with Chr. Bohr, however close to the man himself he might otherwise have been, without cross referring it with available literature. From the list of complementarian contextualists above offered I would only wish to exempt a single name as a potential target of my criticisms that of N.R.Hanson. I have always interpreted Hanson far more realistically than most and probably far more than Hanson himself intended. I have, in fact, received Hanson’s transitory—and expendable— quantum quasi-contextualism as but a more elaborate, semantic version of underlying realities, which to me have always signaled a deep going macro-micro domain dualism; this is a view which I myself espouse and which Bohr also had. Hanson’s Patterns of Discovery may abound with (seemingly) contextualist talk, but one that is not particularly difficult to de-contextualize: “Classical mechanics as a whole is said to be but a special case of quantum mechanics. They are apparently clusters of statements in the same language. Statements and languages however do not work in this way [Hanson, 1961, 151]. Finite and transfinite arithmetic, Euclidean and non-Euclidean geometries, the language of time and the language of space, the language of mind and the language of body, these show themselves to be different and discontinuous on just this principle: What can be meaningfully said in one case may express nothing intelligible in the other” [Op.cit., 152]. 
I myself entertain not the slightest doubt, as neither does he, that the rift between quantum and macro phenomena is only evaded by the mathematical scheme, in other words by the quantitative cover up, convenienced by the numerical convergence of classical and
quantum laws in the case of large numbers, a technique, on the whole, especially designed to patch up the irreducible qualitative discrepancies between the two classes of events. But my own reaction is, in due accordance, that the asymptotic nature of the two languages results from the nature of the phenomena themselves and is to that extent only derivative. It is the phenomena that manifest the primordial discrepancy; not the languages, which but follow. The fact that physicists are professionally inclined to nourish dreams of integration and unification of the micro with the macro within some grand comprehensive scheme or other is no evidence of the essential, besides the numerical, continuity between the two realms. Hooker describes this (classical) presupposition, coming natural to Einsteinians, with the following words: “All complex objects consist of definite structures of the fundamental elements which are their constituents” [Hooker, 1972, 70.]. Where, Hooker notes, it is the word “definite” which carries the primary ontological weight. But this untroubled assumption can be shown to falter even within a mathematical frame, provided it is suitably devised. Here, then, is the frame in question, manifesting –by analogy–within the confines of a single rudimentary model all the requisite properties, starting from atomicity upwards and all the way to CTY, the quantum uncertainties (UR hereafter), Bohrian holism as such and the micro-macro dualism previously referred to plus, as a bonus, the completeness of the quantum mechanical description of micro phenomena, a consequence of course of the repeatedly noted micro-macro dualism. Consider an elementary equation of the form A+B=x, where all variables, A, B and x are to receive values exclusively from the field of naturals (for obvious reasons), that is to say, of positive integers, not including zero. (Some say because zero is not a number. I would say because zero is not a positive number.) 
Once the previous condition is posited, then what follows is that for any value ascription to x, ranging from 2 to infinity, we shall at all times be capable of assigning definite and simultaneous bound values to variables A and B. But for the unique value ascription, x=1, this is no longer possible. For x=1 and therefore for A+B=1, we can either ascribe the entire, non-distributive value, 1, to A, whereupon A=1 and B=indeterminate or, conversely, ascribe it to B, whereupon, B=1 and A=indeterminate. (Zero is not a permissible value ascription.) This, I will argue in the appropriate places, is Bohr’s doctrine of wholeness. In my numerical model the wholeness is that of 1 as the smallest natural in existence; the atom at the basis of the natural sequence. This wholeness takes us almost immediately to the “mutual exclusion” imported within the components of the schema |A+B|, because the schema itself is structural, whereas 1 in this field is simple and devoid of structure and cannot be divided in consistency with the premises of the argument. We shall later on see Bohr referring to phenomena comparable to 1 as a natural, when aiming at defending the completeness of the complementary description, by stressing that they “in principle resist such an analysis”; which is a realist, and a dualist, thesis, for if atomic phenomena resist analysis, as macro objects do not, both classes of phenomena turn out to behave exactly as independently predicated; which is much closer to hard core realism, than to any other option, because reality behaves as independently predicated. But the rift between macro and micro, in the frame of the model the dualism between x=1 or x>1, is none the less irreducible, and in this case even mathematically irreducible, though to no noticeable hindrance to a realist account of the situation. 
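The dichotomy between x=1 and x>1 can also be stated compactly. The following is only a sketch; the symbol S(x) for the set of joint value ascriptions is notation introduced here for illustration and does not appear in the original argument.

```latex
\[
S(x) \;=\; \{\, (A,B) \in \mathbb{N}_{>0} \times \mathbb{N}_{>0} \;:\; A + B = x \,\},
\qquad |S(x)| \;=\; x - 1 .
\]
\[
|S(1)| = 0, \qquad |S(x)| \ge 1 \ \text{ for every natural } x \ge 2 .
\]
```

On this reading the schema A+B=x is the same in both cases; what differs is whether the set of joint ascriptions is empty, and that depends solely on the value assigned to x.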
“If S can be used to express an intelligible statement in one context”, says Hanson, “but not in another, it would be natural to conclude that the languages involved in these different contexts were different and
discontinuous” [Hanson, 1961, 151]. What is unnatural, however, is to conclude that therefore the linguistic phenomenon is thereby elemental and primitive, with no traceable ontological roots to anchor it upon. Mutual exclusion of value ascriptions in this model results from alternating conceptions of reality: x>1 or x= 1, the “macro” (=sub-divisible) and the atomic, respectively. And atomicity is a predicate of reality, not of complementary “languages”. Bohr is pretty clear about this: The fundamental postulate of the indivisibility of the quantum of action forces us to adopt a mode of description designated as complementary in the sense that any given application of classical concepts precludes the simultaneous use of other classical concepts which in a different connection are equally necessary for the elucidation of the phenomena [Bohr, 1934, 10]. And, I take it, the quantum of action is a property of the world. Not the property of a language, whatever that might come to mean in this weird connection and however such a language might have then been adopted. Bohr himself has no particular qualms, concerning how this language is adopted. It is forced upon us by the restrictions of reality, that is to say, indivisibility, affecting the current operation of our descriptions. My own numerical model of CTY reflects just this sort of dependence. The mutual exclusion between the related variables, A and B, in the context of my model results from the predication of atomicity to 1 and this predication is independent of the schema |A+B| per se, which latter is the “language”. There is nowhere discernible within the confines of the schema, that it must needs receive values from the field of naturals and at exclusion of other possibilities. It could just as well receive values from the fractionals, depending on what universe of discourse we are after representing, and then no mutual exclusion of simultaneous and definite value ascriptions would ever come up. 
The provision that the values be drawn from the naturals is extrinsic to the schema, and is therefore the foundation for its subsequent complementary reaction, when the limiting case x=1 commands it. “...=1” is not a “vocable” in any coherent sense here, as Bohr the grandson has suggested to me [C. Bohr, April 97]. If anything has to be a “vocable”, it is the schema |A+B|. And “...=1” is what “forces us”, to use Bohr’s own words, to adopt a mode of account designated as complementary. A and B, considered in themselves, are nothing of the sort. Just try x=2, 3, 4 or any other natural up to infinity. A and B will never enter a relation of mutual exclusion, because A and B are not mutually exclusive in themselves, independently of selected assignments. They are rendered so for the specific (and unique) ascription x=1, but otherwise are co-satisfiable. |A+B| can handle all the cases except the simple. The quantum is not just a miniature of the macro. To this extent, therefore, Hanson is right: “How can intelligible empirical assertions become unintelligible just because quantum numbers get smaller? Conversely, how can unintelligible clusters of symbols become meaningful just because quantum numbers get larger?” [Op.cit. 152]. Well, how can A+B>1 yield coherent answers but A+B=1 not? They do because they refer to incompatibly predicated aspects of reality; aspects which, as I am at pains to stress, are both there. The atomic and the macro. It is not that the meaning of the variables, A or B, has changed or even that the meaning of the entire schema |A+B| as a whole has, now being intelligible now unintelligible. It is that the conditions of its application to reality have changed, those being now atomic, now divisible; hence, now jointly, now disjunctively permitting definite value ascriptions. |A+B| responds to their alternation without changing its meaning. It just responds to them by rearranging its conditions of internal relatability; which is what CTY is all about.
Complementarity Out of Context
Were we to suppose any different, mutual exclusion itself would be jeopardized or falsely attributed; to postulate a change of meaning in the schema |A+B| as we alternate between x=1 and x>1 value assignments is exactly the same as postulating that, if books A and B cannot fit together within a drawer of a size smaller than their sum of surfaces, they have eo ipso changed their shape! And this is patently absurd. Not just because this supposition is evidently false, as it clearly is. But because it is even compatible with the supposition that, in having changed their shapes thus, the books could now very possibly fit in the said drawer. And this is surely contradictory to the assumption. The books are no different. The drawer is. Similarly, the self identity of the schema |A+B| stays rigid across whatever number field we might consider, natural or fractional. |A+B| can receive values either from the field of naturals or the field of fractionals. Yet its internal identity stays rigid across its transfer from one field to the other. The only transformation apparent, comes with the limiting ascription x=1 in the field of the naturals. And this ascription introduces a dichotomy not between the two number fields, as it were, but one within the self same number field instead, that of the naturals alone. Which latter, I would suppose, should feature here as a single context. Perhaps complementarian contextualists could explain to us how, in this case, |A+B| manifests properties of mutual exclusion in a single context, that of the naturals for x=1, and no such thing when |A+B| travels across one number field to another, for any natural ascription x>1. Concepts which behave complementarily in a single context for the ascription x=1, still behave identically across two distinct contexts, the naturals as opposed to the fractionals, for any natural value ascription greater than 1. Hence, mutual exclusion is due to a factor other than change of context. 
But not other than the ascription x=1 of a context. I submit therefore that Hanson’s view above quoted is dualistic and realistic. The unintelligibility, as quantum numbers get smaller, results from not having any more numbers at our disposal. Reality itself is running out on us. Until, for x=1, the tank is empty. And the intelligibility is restored, as quantum numbers get larger, because now we have as many of them as we care to count up to. “The monads [=1] have no windows”, Leibniz warns us [Monadology, prop. 7]. But the rest of things open theirs to us. What Hanson understandably refers to as fluctuations of intelligibility are essentially fluctuations of available reality. Whatever happens to our concepts thereafter is purely conditional on just how much reality there is. This is a ‘contextual’ account of Bohr’s correspondence principle and, by extension, of his CTY, that I can quite comfortably live with. Other such accounts I certainly cannot. Time, then, to rebut them.
2. Complementarity: A Stumbling Block on the Way to Context
C. Bohr began his contextualist narrative of CTY by briefly mentioning the word “will” as potentially emerging in distinct, hence uncommunicating, contexts, pointing to Wittgenstein as the ultimate source for this suspicion [see Note 4]. I will briefly indicate my initial reply, in order to isolate whatever elements will be requisite for the quantum integration of the point. First, let me remind all concerned of the exact position of context-dependence: The meaning of the same word differs in different contexts. Now then: “free–caused” and “mind–body” are either pairs of words semantically connected or else words not semantically connected. Suppose first they are not connected. Then they are simply two different words,
Constantin Antonopoulos
signifying two different concepts. Then they will trivially not have the same meaning, since they are different words to begin with. And this rules out all contextual treatment ex hypothesi. So suppose they are semantically connected instead, as they clearly seem to be. But then this semantic connection of theirs can be none other than their very opposition. For if “free” is just taken to mean “not caused”, and “mental” to mean “not bodily”, and vice versa, i.e. the only provision at our disposal for semantically connecting the two pairs of words (otherwise we fall back to the previous horn: different words), then, unless I’m horribly mistaken, such pairs of terms have now been turned into straightforward contradictories. And contradictory descriptions must have a unique reference, if to be at all contradictory. If you say “this car is moving” and I say “this car is at rest” but point at different cars, we are certainly not disagreeing. Again there is no profit of sorts for the contextualist here. In fact, my former impasse still persists. Either these pairs of concepts are unconnected, whereupon their difference in meaning is semantically trivial and not susceptible to a contextualist analysis, or else they are connected, but are then contradictories and hence such, as designating and belonging to a unique context, antithetically predicated; The context being, a single act. Now to quantum CTY; the contextualist motto is still what it always has been (or is it?); Different meanings (in different contexts) of the same word. But the quantum or, for that matter, the classical terms “position” and “momentum” are certainly not the same word. They are two different words. Why should we ever expect a meaning “variance” between words which are different in any case, and which C.Glymour somewhat (self) forgetfully conflates? “Meaning holism has the consequence that if beliefs about a collection of sentences change, so do the meanings of the terms of the sentences. 
This phenomenon is sometimes called meaning variance” [Glymour, 1992, 124, the author’s italics]. Then, extending this to p and q, which should normally be elements of a single collection, he concludes: “Just as no world of experience combines different conceptual schemes, no reality we can experience, even indirectly through our experiments, combines precise position and precise momentum.” [Glymour, 1992, 128, my italics]. “Just as...?” Just as what? When we speak of meaning variance, what exactly is it, which the variance is a variance of? A different word? A different word does not yield “variance of meaning”, in the sense here demanded. It just yields a difference of meaning because it is a different word; period. And so far as I can see momentum and position are different words of a collection. Not the same word in two collections. For although, perhaps, different conceptual schemes are not combinable, on whose authority are momentum and position elements of different conceptual schemes? Are they not, then, constituents of a single phenomenon of motion? Classically, the velocity, U, is given by U= ds/dt, where ds in the limiting case is position, derived by means of velocity and, indeed, conversely, since velocity is itself a derivative of position. To compare these two with different conceptual schemes is, in a word, nonsensical. As constituents of a single phenomenon of motion, they evidently belong to the same context. “Motion is change of position in time”, as H.Folse puts it concisely [Folse, 1985, 57]. This, if it is evidence of anything, is evidence that they belong, by definition, to one and the same context. And if the mutual determinations of position and momentum get in each other’s way at all in QM, this is because they belong to the same context in the first place. 
For were they unrelated instead, their determinations would be independent of one another and hence attained in combinable separation (see below), in utter contrast to Glymour’s different conceptual schemes.
This situation, though it does lead directly to their CTY, rather than leading to their claimed context dependence at the same time, obeys instead laws of logic and semantics which are the very negation of Glymour’s ‘analogy’. Let us consider mass and position instead. Mass is a concept definable in contexts differing, or even unrelated, to contexts appropriate for defining position. The two of them are, for that matter, as distinct, unconnected and divergent as we care to imagine. The question of where an object is, is the last question relevantly entering issues of specifying its contained quantity of matter (or inertia). The two of them belong, one is entitled to declare, to totally alien contexts. Then, perhaps, C.Glymour would care to enlighten us just how it happens that, wide disparateness of context notwithstanding, position and mass are still not complementary quantities in QM! And how, instead, position and momentum, which are clearly constituents of a single phenomenon of motion, and are therefore closely bound to one another within the confines of a single context instead, do happen to so be. Well, to me the answer is simple. CTY has nothing to do with context dependence and, I venture to propose, a lot to do with context independence. A question, then, for some people at Pittsburgh University: How can two contextually unrelated concepts be compatible, and definable with arbitrary precision, while two contextually connected ones, and both immersed within a single context, motion, be complementary? If I have done my quantum homework, as some contextualists no doubt also should, it is actually because position and mass are thus unrelated, that they are thereby jointly, rather than disjunctively, definable within the frame of the theory. Because, in other words, their mutual determinations are effectively isolatable, never interfering in any form with one another. 
Hence, position and mass are both known and they are both known simply because they have nothing at all in common. Their contextual diversity is not their CTY. It is their compatibility. On the contrary, position and momentum cannot both be known because, unluckily, they have a lot in common. Their contextual identity is their CTY. It’s not therefore as if CTY is a tacit form of context-dependence or, even, vice versa. If anything, to call a pair of concepts complementary, in Bohr’s sense, is to call these concepts context– insensitive. Which is a point going beyond the confines of the present approach, and generalizable over all relevant cases. The very notion of CTY rules out context–dependence. ‘Complementarian’ contextualists have got the truths all backwards; again.
3. The Duality Route to Context
Gonzalo Munevar and P. K. Feyerabend have developed between themselves a parallel version of subjecting complementary relationships to context–dependence. This, Munevar explains, was at Feyerabend’s instigation [Munevar, 1995, 6]. Well, surely that won’t make things better for either of them! Here the scheme of things, unlike the cautious formulation earlier preferred by Chr. Bohr, assumes all the features of grandiose ambition, reaching giddy heights when, together with complementary conceptual schemes of different cultures, we are told of complementary conceptual schemes between humans and bats! [Munevar, 1995, 6-8.] The term “CTY” here mainly serves the purpose of stressing the equality in epistemic status of the disjunctive world accounts. Possibly, even, the equal epistemic status between human and bat cognition: <<
produce information in one frame that is not logically, conceptually, theoretically, or mathematically equivalent to any produced in another, even if it is presumably about the same aspect of “reality”>> [Munevar, 1995, 6; italics the author’s.]. This, I would imagine, is a big step away from momentum-position or energy-time CTY. We no longer have to do with CTIES established between concepts collectively belonging to a certain conceptual scheme and rift apart by indivisibility, or do with indivisibility itself for that matter. Rather than dealing with disjunctiveness between concepts constitutive of a conceptual scheme, such as the products Et and pq formerly of classical mechanics, we are now dealing with one conceptual scheme disjunctive to another. Whence, of course, the irresistible temptation on the author’s part to involve Kuhnian Paradigms in the discussion, not an uncommon supposition in the matter at hand, as we shall soon see, though at least one wished that he wouldn’t have to drag “bat cognition” together with them just to drive the point home [Munevar, 1995, 6]. The analogy supporting all this, or in any case what Munevar takes an analogy to be, is to be located, he claims, to quantum experiments disclosing corpuscular and/or undulatory properties of micro-systems: “As Bohr pointed out, one experimental arrangement that enables us to see the electron behave as a wave complements, but also rules out, another arrangement that enables us to see the electron behave as a particle. The wave and the particle descriptions are thus complementary –there is no sense in which they are equivalent. The same may obtain between descriptions produced in different frames” [Munevar, 1995, 6; the italicized words directly connect CTY with context]. So there is really no escape, is there? 
After ninety years of complementary theorizing, Bohrian scholars, methodic or perfunctory, aficionado or freelance, subtle or populist, still exhaust what little imagination they can muster to take us back to the ‘CTY’ of waves and particles. Contextualists regard themselves as avant-garde progressivists in life, art, politics, and philosophy as well as in science. But on this matter they just never learn; from Feyerabend’s early formulation of the thesis, “CTY is a description based on duality” [Feyerabend, 1958, 94], repeated a few years later (“...duality as well as the idea of CTY based upon it” [Feyerabend, 1962, 223]), we proceed to more recent times with an occasional, but not too decisive, and always backstepping, improvement: <> [Folse, 1985, 114]. Well, close enough. But back steps are, it seems, the rule in the pathway to CTY. Thus, even more recently we are once again assured that “CTY is introduced by Bohr in response to the wave-particle duality, which becomes wave-particle complementarity in his framework” [Plotnitsky, 1994, 68]. The message transmitted has now become all the more ominous for the handful of objectivist complementarists still alive (counting Folse and myself) and the even fewer rationalist ones (just counting myself, presumably). According to this author “wave-particle CTY is, thus, defined in anti-epistemological terms” [ibid.] and hence serves almost as the scientific Trojan horse that throws the gates wide to contemporary French obscurantism, Derrida, deconstructionism and a curtailed version of Hegelian dialectics, from which last French philosophy simply finds it impossible to emancipate itself, the author included, it would seem.
Bohr, surely, has on occasion spoken of wave-particle CTY, though mostly as a rather degenerate instance of the concept and, chiefly, as a derivative situation in QM, rather than as the primitive axiom which these authors take it to be. But, and that’s what matters, he has also spoken very differently: “The term Complementarity, which is already coming into use, may be suited to remind us of the fact that it is the combination of features which are united in the classical mode of description but appear separated in the quantum theory” [Bohr, 1934, 19]. But it is not waves and particles which are united in the classical mode of description and separated in the quantum. It is the conjugated classical concepts, whose products, Et and pq, yield a unit of action. To say nothing of the fact that, on the very words of most complementarists of the persuasion in question, wave and particle are just about ‘united’ in the quantum mode, besides being separated in the classical and, I would add, everywhere else as well. Once again contextualists have got the truth backwards. There has been, I strongly suspect, too much casual reading of Bohr’s text and too much poor logic involved. Let me then demonstrate, and I mean demonstrate, not just indicate, suggest or argue, why wave-particle duality literally cannot be an instance, let alone the basis for CTY. The first step I will take hand in hand with Hooker’s account: “Bohr believes that while it has seemed to us at the macro level of classical physics that the conditions were in general satisfied for the joint [see Note 5] applicability of all classical concepts, we have discovered this century that this is not accurate and that the conditions required for the applicability of some classical concepts are actually incompatible with those required for the applicability of other classical concepts. This is the burden of the doctrine (B4) [Doctrine (B4)=CTY]. 
This conclusion is necessitated by the discovery of the quantum of action and only because of its existence. [...] It is a discovery of the factual absence of the conditions required for the joint applicability of certain classical concepts” [Hooker, 1972, 137; dark letters for the author’s italics]. In a word, CTY and its quantitative expression by the two UR are conditional on the quantum. (As if it could ever be otherwise.) Precisely as the mutual exclusion of the two variables of my own complementary model, A and B, is conditional on whether the value ascription is x=1 or x>1. And this entails a conditional form of incompatibility; conditional, in this case, on the quantum of action itself. But the incompatibility of waves and particles is not conditional on anything. Waves (large) and particles (small) are incompatible by definition and hence incompatible in the ways of logic. That is to say, necessarily incompatible. This is why, besides, it is called a “duality” in the first place. And in being necessarily – rather than conditionally – incompatible, waves and particles are also incompatible in classical mechanics. And classical mechanics does not contain the quantum. Stated in the simplest of terms, waves and particles are mutually exclusive without any help from the quantum. Were CTY to be grounded on the incompatibility afforded us by the wave and the particle, CTY would equally have no need of the quantum. This sort of incompatibility is self contained and obtains independently of the quantum. So then would a CTY based upon it, in direct contradiction with the premises of the argument. Wave–particle duality, rather than being the basis of CTY and/or the UR, is in actual fact a situation yielding exactly the sort of incompatibility which renders the quantum redundant. And this is patently absurd. Duality, I submit, yields the wrong sort of incompatibility. Let me rephrase this. The wrong sort, if there ever was one. 
Duality is an unmanageable idea and the sooner one gets rid of it altogether, the
better for CTY. In more recent years this is beginning to tentatively dawn upon commentators, at long last: “The connection between wave-particle dualism and the complementarity of the dynamic and the kinematic properties remains a problematic issue for the analysis of Bohr’s philosophy and for the interpretation of QM” [Faye & Folse, Editorial, 1994, xvi]. So then does, to put it mildly, the ambitious plan of associating CTY with context–dependence. If Duality were retained, together with its correlative subjectivism (see Munevar’s account earlier cited), the task would look more promising. If, that is, we incorrigibly stick to the old ways, always known to die hard, and continue to identify CTY with “the technique, applicable to all domains of experience, where the phenomena depend on the conditions on which they appear” [Petersen, 1968, 242], this would (albeit still erratically) suggest a plausible similarity with Kuhnian Paradigms, capable, if their author is to be believed, of making us see things because we have been operating within this Paradigm, while other researchers, operating within the confines of a different Paradigm, would literally see different things, because secluded in that other Paradigm [Kuhn, 1970, 110ff.]. Either way, the complementary or the paradigmatic, the phenomena experienced would intrinsically depend on the conditions of their observation, either as wave and particle properties, depending on the experimental design chosen but arising from a unique referent, or as noncommunicating, “incommensurable” and Paradigm-dependent visions of a per se inaccessible referent (as in Munevar’s account). If there is a temptation at all in the task of tracing down parallelisms between CTY and context-dependence, this is it: namely, Duality. This is the sole (good) reason behind the intended comparison. Thus: <<What properties the world exhibits depends on what we ask of it. Niels Bohr called this phenomenon “complementarity”. [...] 
Changing the experiments we conduct is like changing conceptual schemes or paradigms: we experience a different world>> [Glymour, 1992, 127-28]. This is really most regrettable, when affirmed by so many. That what the world exhibits may have something to do with what we ask of it, is, so far as I can see, at best a necessary condition for the answers we receive. Not a sufficient one. Were it not so, why pray, do we get Complementarity of the classical concepts in the first place, right when we were all so complacently confident beforehand that, asking in the way we had, we would get their compatibility instead? My own answer [1987] is quite simple: We do, because the world refuses to comply with our preferred conceptual instructions and goes its own way. And if that’s not evidence for realism, then I recommend to the authors above to rethink their thoughts on this as carefully as I promise to do mine. But none of this is really necessary. Once Duality is stripped bare of its pretenses, namely that it matches the logical structure of CTY, when it cannot even coherently relate to the quantum, then I suspect that not even contextualists will be all that eager to assimilate such incoherence in their scheme of things. But when it comes to contextualists, one should never be too sure.
4. Incommensurability: Last Resort – and Last Exit
Feyerabend claims that: “The combination of duality with the second set of empirical premises introduced (the Einstein-de Broglie relations) shows that the duality [see Note 6]
between the wave and the particle properties of matter may also be interpreted as a duality between two sets of variables, e.g. position and momentum” [Feyerabend, 1962, 225]. Well, if this is so, I suppose I just have to repeat for position and momentum what just a few pages ago I said for waves and particles. The incompatibility between waves and particles is self-sufficient and has no need of the quantum. In consequence, if the incompatibility between position and momentum mirrors the former, as Feyerabend asserts, it will likewise have no need of the quantum. Some interpretation of ΔpΔq≥h this is: one that derives ΔpΔq≥h without the quantum! Now to further disasters. Consider the following argument: Position and momentum are either (a) incompatible by virtue of Logic, or (b) not incompatible by virtue of Logic. Suppose first that (a). Then, if position and momentum are incompatible by virtue of Logic, just as Duality also is, they must be incompatible across the board. But then they must be incompatible even in classical mechanics. And then QM and Cl.M. would not be incommensurable at all, for they would both manifest the same type of incompatibility between position and momentum, now holding across theories. (As Zeno was the first to contend!) Suppose then that (b). Position and momentum are not incompatible by virtue of Logic. Then they must be incompatible due to specific conditions, obtaining in QM but not also in Cl.M. But then their incompatibility in QM is due to reasons other than their meaning. Consequently, position and momentum have not changed their meanings in QM. Therefore QM and Cl.M. are not incommensurable theories, despite their evident contrast in all else. Thus, both exhaustive alternatives refute context–dependence. But they do not refute Complementarity. In fact, they confirm it down to its last letter. 
“The combination of features which are united in the classical mode of description but appear separated in the quantum theory”, is what we have witnessed Bohr demand in [1934, 19] for the establishment of a complementary relationship. And if the classical concepts are merely factually [see Note 7] incompatible in QM (due, of course, to the premise that h>0, which could have gone the other way) but compatible otherwise, hence compatible also in CM, then the classical concepts behave exactly as Bohr’s definition of CTY prescribes in both the theories considered. But they behave contrary to what context-dependence prescribes for both theories considered, for when separated in QM by a premise extrinsic to their root core meaning, they eo ipso enter a relation of incompatibility without a change of their meanings, even though now, qua incompatible (NB: factually), they operate in a different physical theory. Hence, though QM and CM are certainly different physical theories due to this, they are anything but incommensurable. Hence, there is no such thing as context-dependence, entailing the incommensurability between CM and QM, and if there is none such in their case, there is none such anywhere at all, the quantum revolution being the strongest case ever for conceptual revolution. I began intent on freeing CTY from the unappealing clutches of context-dependence. What this has come down to, however, is a refutation of context-dependence per se. In the impasse I have posed to Feyerabend and contextualism, above, the alternatives “logical or factual incompatibility” are exhaustive. There is no ‘middle’ type of incompatibility. And the concepts themselves, which are thereby involved, simply manage to preserve their semantic core, whether we treat them as logically or as factually incompatible instead. For if the former, CM and QM are thereby united. 
And there would then be no incommensurability; And if the latter, then only extrinsically incompatible, whereupon CM and QM would be different theories, as they clearly are via the said incompatibility, but still different without incommensurability.
“Either-or” therefore, contextualism is rebutted per se. Consider this a bonus of the point, rather than a side effect.
5. The Concept of Action Quantization
In view of the mistakes and confusions thus far listed, it is crucial to get our primary assumptions right, before we ever embark on the issue of what CTY actually is. So it is essential to start building from rock bottom upwards. Olivier Darrigol writes: <<Einstein’s form of the quantum condition fitted well with de Broglie’s idea that action played the role of a phase. Langevin called the action variables “the cyclic periods” of the action integral. This denomination implied that Langevin regarded action as a periodic function>> [Darrigol, 1993, 330]. The meaning of this most illuminating account is that in QM the least possible exemplification of a dynamical quantity, E or p, can only be recorded over a period; a wave period. If this period is expressed in terms of the frequency, ν, we obtain E=hν and so Et=h. If it is expressed in terms of the wavelength, λ, we obtain p=h/λ and so pλ=h, as two alternative expressions of minimal action, h. The idea, stated in more precise terms, discloses that the basic quantum relation, E=hν or E≈ν, signifies that E, exactly like ν, can no longer be defined at an instant dt→0, as was classically assumed, but only over a period t>0, whose boundary instants {t1,t2} are here given by the frequency of this period, ν. Correspondingly, p also cannot be determined at a point location, dq→0, but only over a distance, whose boundary points {q1,q2} are given by the wavelength of this period, λ. It is because E cannot be defined at an instant dt→0, but can only be mapped over a limiting period t>0, that the resulting product of action Et is not arbitrarily reducible to a diminishing value, Et→0, and is rendered a quantum of action, h, instead. Clearly, in the case that E could be recorded at an instant, taking that instant as narrow as we might wish, leading to a t→0, the product Et would be a vanishing quantity and no quantum of action would result. 
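The two expressions of minimal action just stated, Et=h and pλ=h, can be checked numerically. The following is a minimal sketch, assuming that t stands for one full wave period 1/ν; the function names and the sample frequency and wavelength are hypothetical choices made for illustration only.

```python
import math

H = 6.62607015e-34  # Planck constant in J*s (exact SI value)

def action_over_period(frequency_hz):
    """Product E*t for one wave period, with E = h*nu and t = 1/nu."""
    energy = H * frequency_hz       # E = h * nu
    period = 1.0 / frequency_hz     # assumed reading of t: one period
    return energy * period

def action_over_wavelength(wavelength_m):
    """Product p*lambda, with p = h / lambda."""
    momentum = H / wavelength_m     # p = h / lambda
    return momentum * wavelength_m

# Sample values (illustrative only): the product does not depend on them.
for nu in (1.0e9, 5.0e14, 3.0e18):
    print(nu, math.isclose(action_over_period(nu), H, rel_tol=1e-12))
for lam in (1.0e-3, 5.0e-7, 1.0e-10):
    print(lam, math.isclose(action_over_wavelength(lam), H, rel_tol=1e-12))
```

Whatever frequency or wavelength is supplied, the product comes out as h, which is the sense in which the text calls it a limiting, non-vanishing unit of action.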
But since a t={t1,t2}>0 as opposed to a t→0 is a time latitude, and q={q1,q2}>0 a place latitude as opposed to a q→0, it immediately follows that the dynamical quantities E and p cannot be defined within an arbitrarily narrow space-time interval, as was classically assumed, but only over such limiting space-time latitudes, in other words only within accuracies not smaller than periods, instead of instants, and not smaller than distances, instead of points. Hence, E can only be defined over a Δt and p only over a Δq. The quantum uncertainties already announce themselves in the mere unpacking of the concept of quantized action. So let us proceed to derive them. [See also Appendix.]
ΔEΔt≥h: If there is such a thing as a shortest time permissible, i.e. a time limit, imposed on the conditions warranting the very manifestation of E, this being a time limit of dimensions Δt={t1,t2} or ν, then any subsequent narrowing of this interval, of the order, say, Δt'={t1,t2}/2 can only mean that the overall energy determination will only be reciprocally affected and, therefore, reciprocally inaccurate. For if we require at least a time length of dimensions Δt, if we are to determine the energy with an accuracy ΔE=n, where n is sufficiently small to stand for a high E approximation, then, all other conditions being identical, at half the time formerly allowed, i.e. within Δt/2, we can only expect to end up
with an uncertainty ΔE=2n if the action product itself, of which E and t are the components, is to always remain constant, i.e. h. And so on, reciprocally, for any other diminution. In other words, if energy can never be defined within an instant t→0, and hence is only to be defined over the boundaries {t1,t2}=Δt, as is dictated by the assumption that the action unit which is the product of components E and t is to remain constant (or minimal) at all times, then the optimal definitions of these two action components, E and t, cannot themselves be any better than the said limiting product, Et. And therefore the joint errors in the definitions of these two action components, E and t, can at best be equal to, or if not, then greater than, this limiting product Et. Hence, in symbols, ΔEΔt≥Et. But Et=h. So ΔEΔt≥h.
ΔpΔq≥h: By strict analogy with energy before, if momentum can only be defined over a distance and therefore not at a point location Δq→0 (whence also Zeno’s paradox), hence only over a distance of dimensions Δq={q1,q2}(=λ), it equally follows that any attempt at its definition within spatial boundaries narrower than those specified will be as inaccurate as the distance itself employed for its definition is taken shorter. Same as before, if we need at least a distance of dimensions {q1,q2}=λ=Δq, if we are to determine the momentum within an accuracy Δp=n, where n is sufficiently small to stand for a high momentum approximation, then, all other conditions being identical, at half the distance formerly allowed, i.e. at {q1,q2}/2, we can only expect to end up with an uncertainty Δp=2n, if the action product itself, of which p and q are the components, is to always remain constant, i.e. h. 
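The reciprocity invoked in both derivations (half the interval, twice the uncertainty) can be made concrete with a short sketch. It takes the text's bound at equality, so that the product of the two spreads stays fixed at h; the function name `min_conjugate_spread` and the starting interval are hypothetical.

```python
import math

H = 6.62607015e-34  # Planck constant in J*s (exact SI value)

def min_conjugate_spread(delta):
    """Smallest spread of the conjugate quantity allowed when the product
    of the two spreads is held at h (the text's bound, taken at equality).
    `delta` may be a time latitude (giving an energy spread) or a place
    latitude (giving a momentum spread)."""
    return H / delta

# Start from an arbitrary, purely illustrative time latitude and halve it.
delta_t = 1.0e-15  # seconds, hypothetical starting value
for k in range(4):
    dt = delta_t / 2**k
    de = min_conjugate_spread(dt)
    # each halving of dt doubles de, while the product dt*de remains h
    print(dt, de, math.isclose(dt * de, H, rel_tol=1e-12))
```

The same function serves for the pair Δq and Δp, since the argument in the text treats the two cases by strict analogy.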
So, in general, if momentum cannot be defined at a space point, but is only to be defined over the spatial boundaries {q1,q2}=Δq, as is dictated by the assumption that the product pλ is to remain constant (or minimal) at all times, then the optimal definitions of these action components, p and q, cannot themselves be any better than the limiting product pλ. And then, consequently, the joint inaccuracies in the definitions of p and q can at best be equal to, or if not then greater than, this limiting product pλ; hence, in symbols, ΔpΔq≥pλ. But pλ=h; so ΔpΔq≥h. And thus the CTY between energy and time and the corresponding CTY between position and momentum have been derived, simply and, I believe, elegantly, from the mere analysis of the quantization of action alone. The definition of quantized action included, it has all taken me but a page. And CTY (or the UR), in this deduction, were shown to be strictly dependent on the quantum; which is the primary requirement. Let us then embed this result, i.e. the UR thus derived, in the deeper spirit of CTY. The structure of quantized action (its limiting value) demands a delicate balance to be maintained overall, which, once upset by leaning too heavily on either side of the balancing scale (defining either complementary quantity involved too accurately) can only result in leaning too lightly on the other side (total loss of its complementary), because the quantum they jointly comprise (Et or pλ) has itself to stay rigid either way. A sharp instant of duration Δt→0 at a sharp point Δq→0, though themselves ideally accurate, will eo ipso make the definitions of E and p, respectively, impossible, if energies are to be defined only over periods, momenta only over distances. 
And, conversely, when energies and momenta are to be defined only over periods and distances, respectively, then their determinations will have been attained within margins way too wide and hence way too inaccurate, when compared (as above) with instants of duration Δt→0 and points of dimensions Δq→0; all this because of the inward reciprocity [see Note 8] possessed by the indivisible action block. Expand it at one end, it will eo ipso contract at the other, in directly reciprocal manner, to keep itself a
Constantin Antonopoulos
constant, undiminishing quantity. This, then, is indivisibility. And this is how such indivisibility forces us to adopt a mode of description designated as complementary. Well, then. Has there been any ‘duality’ assumed? None has. For what I have said, and what Darrigol has said before me, was not that a “particle” is associated with a wave period, but that action so is. Hence, reading Duality into this argument is quite simply a misnomer. What this argument discloses is that the quantities E and p must “stretch out” over periods or distances, if to at all materialize and be determined. And not that particles must so stretch. To read “particles” into the quantities themselves, E and p, is not what the two quantum relations entail but what the two quantum relations preclude. The restriction that energy is to be defined over a period rather than at an instant, says nothing at all about particles. It just says how long it takes to (minimally) define energy. That the energy thus defined is that of a particle is but a gross non sequitur and the last thing that the relation E=hν, i.e. the relation that E≈ν, will ever license. Energy is connected with the frequency, is what the relation itself prescribes, hence by definition not with a particle. And the restriction that momenta be defined over a distance, now related in spatial terms with the said period, rather than at a point, is the very thing which makes reference to a particle literally impossible, for particles can only have a locally concentrated momentum, not one spreading over a distance. Particles, therefore, emerge in these relations from literally nowhere. They are simply imported by an act of fiat, whereas so far as the two quantum relations are concerned, particles only originate in the great outside and it is from there alone, that they are introduced in the action scheme. 
To persist in ‘associating’ the dynamical quantities E and p with the presence of a particle (a phrase usually attributed to de Broglie) is but the product of a classical remnant within relations whose presence alone should be sufficient warning against this very practice. The thought behind it all is, “if there is energy and momentum, there must be a particle around”, a loan from classical preconceptions, if there ever was one. Not even the face of a particle is inscribed in these relations. It is action, not particles, which is periodic, and hence ‘wavelike’. And action as such is not even material, in the sense that particles are. For it is a process; not a thing. Once Duality is shut off from the picture, either on the side of waves or on that of particles, the only backdoor to contextualizing CTY in the ways indicated by Munevar, Feyerabend or Glymour, is shut together with it. Were Duality a coherent way of stating the quantum situation, then the relativism that has become part and parcel of this notion would provide the basis for a fusion between CTY and context-dependence, itself a method of regarding meaning – and fact (cf. Paradigms) – relative to a context, i.e. what Duality is also supposed to do in its usual descriptions. But Duality, as I trust I have established, does very many other things as well, none of them too good for the theory (as several have already surmised). Duality is a quantum myth and a lethal one at that. Hence, context-dependence has really nothing to go by for their planned assimilation, except, that is, that context-dependence may be a myth too, and a lethal one at that, whence the obstacles to their unity are thereby removed. Now to another matter; an inherent part of wave-particle duality reasoning is the one we have extracted from Petersen’s (alleged) definition of CTY, as a method of handling such phenomena, as those which depend on their own observation. 
Which is a very definite form of relativism (and one, I will not deny, that Bohr has frequently encouraged, when addressing audiences or ‘cultures’ which he had a mind, or a diplomacy, to flatter); but what relativism
Complementarity Out of Context
and what observations? In my own –primordial- derivation of the two UR, and their correlated CTY, there were no measurements assumed at all. Only strictly purified quantum axioms and their immediate logical entailments, i.e. the two UR derived as theorems. I have done nothing but unpack the contents of the concept of quantization. I have not said a word about ‘measurements’ or ‘observations’ or any such irrelevancies. But CTY has been derived, simply and directly, from such a concept alone and despite my not once having introduced or even so much as hinted at observations. What I have derived, valid or not, I have derived prior to observations; which is precisely what it takes in the first place, if the resulting theorems, the two UR, are to be subsequently confirmed by experiment, if at all. CTY, therefore, has no need of any ‘measurements’ to come forth, as it had none in my |A+B|=1 model, where mutual exclusion was warranted a priori. To proceed to measurements directly, as so many physicists are professionally prone to do, implying in practice that there can be no uncertainties, or no CTY, unless a measurement is first performed, as if the purely ontological counterparts of these notions are literally absent when no measurement is in process to then spring forth from zero only upon measurement, not only makes the practice of experimenting utterly incomprehensible. What are the measurements about? Themselves? But, in addition, and due precisely to such obscure experimental self-reference, this tendency makes talk of context–dependence and Paradigms almost inevitable. If measured uncertainties only measure measured uncertainties (!), rather than limitations built into reality itself, then surely there is nothing behind them, except a shapeless, amorphous void and not the atom (i.e. the foundation). My reply is really quite simple. I have assumed no measurements. Hence whatever CTY was deduced in the process, was deduced from an ontological premise and that alone. 
In addition, in my account of CTY and the UR, the quantized products Et and pq (or pλ) are not the same words featuring in different contexts. They are different words featuring in the same context, namely, the one, unique wave period, and thereby the one, unique unit of action they jointly comprise. This is why they are complementary. And this refutes context-dependence not just as a possible reading of Complementarity but as a possible reading of anything. Does the period t, over which E is defined to yield Et, belong to a context other than E? Does the distance λ, over which momentum is defined to yield pλ, belong to a context other than p? Do, that is, concepts operating in a certain context, serve in defining concepts that operate in a different context and still define them? This is the sole way contextualism could ever handle the present situation and it is a way that leads it to but a blatant contradiction. But that is nothing new.
6. Tying Loose Ends
There are, I believe, not too many of them in the argument. But there are points unmentioned or points only hinted at, such as dualism, atomism, foundations, and, ultimately, realism, which need to be brought to focus. An atomist is a metaphysicist, twice over. Once, because he is a realist, and this is a metaphysical position, twice because he is an atomist and atoms (since Leibniz) are said to be anything but physical. Kant’s Second Antinomy between the Composite and the Simple reflects the same thing [402 ff]. Atomists, in addition, are foundationalists. Ancient atomism began as a response to Zeno’s paradoxes, which led infinite divisibility of processes to a literal dead
end. I shall name but one such paradox, that of the stadium runner (though the arrow paradox is saturated with complementary connotations), to drive the point home. The runner, in order to cross the stadium must take the step which is the most crucial step of all that follow; the first ever. But there is no first step to take, if distances are infinitely divisible [see Note 9]. The first step would be right where he’s standing; to realize this clearly, suppose you ask someone to start counting –from the beginning. He will go “one, two, three, four ...” and so on. No problem. Now ask him to start counting, again from the beginning, but count fractional numbers instead. He cannot. There just is no place, from where to start. It is a logically impossible task to count fractional numbers from the beginning. He will remain as speechless as Zeno’s runner is said to have remained motionless. There is nowhere to start from and, respectively, nowhere to go. Atomism seems the sole escape from this riddle. Atomism is a way to start and therefore atomism is a form of foundationalism. It is however, once atoms are closer examined, a form of dualism as well. Atoms, to circumvent perpetual divisibility, turn out to consistently lack a structure, for structure is a (near) synonym of inward separability; and so of divisibility. And what lacks a structure cannot become the part of a thing possessed of one. Hence, strictly speaking, atoms cannot be the parts of anything. One would think that this undermines the very essence of their introduction, namely, that of offering a foundation. I am of the opposite inclination myself. If atoms were within the structures, which they are supposed to support, then they would themselves become part of the structure in need of the support. And it would be then that the foundation would be shaken, not now. Hence an atomist can hardly avoid being a dualist. And can therefore hardly avoid running across Hanson’s paradox. 
As (‘quantum’) numbers get smaller, unintelligibility will result, not perchance because we have changed languages, which in a way we have, but because we are running out of reality. And it is this that ‘changes’ the language, rendering it complementary, whereas it was unified previously, and will again become so, as (‘quantum’) numbers start growing larger. As my numerical model of CTY endeavors to disclose, and as the directly quantal derivation of CTY per se affirms, the atom is just too narrow to make room for all our macro attributes in any one attempt at fitting them to it. But it will make room for them disjunctively, which is what CTY is all about. Two into one won’t go –this is an analytic truth– and, despite our efforts, the indivisible will only manifest undivided aspects of itself; that is to say, complementary ones. If we lean too hard towards the way of the prospective attributes E and p, the other two remaining, prospective attributes, t and q, will vanish from the picture, and conversely, because the quantum is indivisible and can respond affirmatively to either pair of them only in distinct succession, if at all. Language does what Hanson notices only because it conforms to the restrictions. Is this nonrealism? On whose irresponsible definition? If the atomic resists accommodating all our macro attributes at any one time, disjunctively manifesting only undivided aspects of itself [see Note 10], in what exact fashion, pray, does it behave differently than its specified nature? And what other inference would be consistent with the premises of the argument, viz. that of indivisibility? In accordance: “In quantum mechanics we are not dealing with an arbitrary renunciation of a more detailed analysis, but with the recognition that such an analysis is in principle excluded“ [Bohr, 1958, 62]. This is because: “Any consistent use of the concept of the quantum of action refers to phenomena resisting such an analysis” [Bohr, 1963, 94].
How on earth, then, when an object resists its analysis, can this have anything to do with context-dependence or subject-dependence, if it came to that? It can only have to do with the object. Phenomena resisting analysis, our analysis, and frustrating our attempts at subdividing them because they are unanalyzable (NB: in principle), is the closest approximation to hard core realism ever. The quantum resists its analysis as any true atom should in strictest accordance with its own, independent nature. The language of CTY simply preserves and mirrors this effect by inwardly splitting up in classical halves because the quantum it seeks to describe in corresponding halves itself just won’t. Namely, exactly what my numerical model in the naturals succeeds in preserving. For A+B=1, it will be the variables, not the Unit, that will undergo the splitting. The resulting descriptions, which are complementary in a language (that of classical physics) rather than two languages complementary to one another, as some find it convenient or consoling to suppose, have everything to do with the independent predication of the world as atomistic and non-distributive and nothing at all to do with context-dependence, or its ideological aims [see Note 11]. Bohr is, quite simply, a modern day atomist; and so a dualist, a foundationalist and a metaphysical realist. Relativism and context-dependence have little or nothing to do with any of that. Human rights apart, they have little to do with anything.
Appendix
I have presented the derivations of section 5 twice already, in Refs. 3 and 4. Now, I will present an additional proof, not presented before, which reconfirms them. Assume first that E=hν. Then assume that we know the energy, E, within the margins of an error ΔE. Then, tautologically, ΔE=Δ(hν) and, since ν=1/t, ΔE=Δ(h/t). Then, since “Δ” cannot concern h, we obtain ΔE=hΔ(1/t). Now, it already becomes clear even at this early stage of the argument that Δ(1/t) is an error in the time, derived tautologically from the initial premise, E=hν, leading (tautologically) to ΔE=Δ(hν). In my initial derivation, t>0 = Δt, since, in accordance with it, the very fact that E cannot be defined in an instant, t→0, but only over an interval t>0, is the raison d’être of an uncertainty in time. Now it is known that, in accordance with Error Theory, the expression Δ(1/t) is equal to Δt/t². Therefore, by substituting Δt/t² in ΔE=hΔ(1/t), we obtain: ΔE=hΔt/t². Since, however, in my initial derivation, t>0=Δt, the denominator of the fraction becomes (Δt)². Hence, ΔE=hΔt/(Δt)². This, in turn, yields ΔE=h/Δt and this, finally, ΔEΔt=h. The point of importance is this: Unless it is assumed in this line of reasoning that t>0=Δt, which is the exact same assumption as in my initial derivation, ΔEΔt=h will never follow from ΔE=hΔν, itself derived tautologically from E=hν, thus contradicting the theory. In short, if E=hν implicitly entails ΔE=Δ(hν), then ΔE=Δ(hν) can only be turned into ΔEΔt=h – and indeed how can it not be? – iff t>0=Δt, in other words, iff the t of the primary relation, E=hν, is interpreted as a limiting time latitude for the definition of the energy, exactly as postulated in my first deduction. This second proof of ΔEΔt=h not only reconfirms the first but, in addition, furnishes in the process a by and large self-contained deduction of ΔEΔt=h, fully crosschecking with the first.
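The chain of steps in this appendix can be laid out in symbols as follows. This is only a restatement of the argument above, under two assumptions made here for the sake of the sketch: that “Error Theory” refers to first-order error propagation, and that the sign of the derivative is dropped so that all errors are magnitudes.

```latex
\begin{align}
E &= h\nu = \frac{h}{t}, \\
\Delta E &= \Delta\!\left(\frac{h}{t}\right) = h\,\Delta\!\left(\frac{1}{t}\right), \\
\Delta\!\left(\frac{1}{t}\right) &\approx \left|\frac{d}{dt}\,\frac{1}{t}\right|\Delta t = \frac{\Delta t}{t^{2}}, \\
\Delta E &= \frac{h\,\Delta t}{t^{2}}
  \;\xrightarrow{\;t=\Delta t\;}\;
  \frac{h\,\Delta t}{(\Delta t)^{2}} = \frac{h}{\Delta t}, \\
\Delta E\,\Delta t &= h .
\end{align}
```

The substitution t=Δt in the fourth line is the step singled out in the text: without it the chain stops at ΔE=hΔt/t² and the product ΔEΔt does not reduce to h.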
Notes
1. Mainly by Dr. V. Karakostas. See his “Nature of Physical Reality in the Light of Quantum Nonseparability” [Oviedo, Spain, 7–13 August 2003] or his “Nonseparability, Potentiality and the Context-Dependence of Quantum Objects”. Dialectica, 2005 (to be published).
2. Beyond Reason: Essays on the Philosophy of Paul Feyerabend. Gonzalo Munevar ed., Kluwer, Dordrecht, 1991.
3. Plus, that is, eight other people signing the text. John Earman and Wesley Salmon are two of them. (See second paragraph from the end, Section 3.)
4. The general idea is that “when language-games change, then there is change in concepts, and with the concepts the meanings of words change” [Wittgenstein, 1977, 66].
5. The very expression “joint applicability” of the classical concepts, eo ipso excludes waves and particles as its possible recipients, which are – trivially – not jointly applicable in classical mechanics. Only the variables of the products Et and pq are.
6. The correct thing to say, of course, is that the two quantum relations, when submitted to a Fourier treatment, will alternatively represent the system as either an extended, plane wave or as a localizable wave packet. The former allows the determinations of E and p, since a unique wavetrain is employed, the latter, the determinations of t and q, since a superposition of many waves is employed. And thus the classical concepts, E and p, on the one side, t and q, on the other, will through this process assume an incompatibility analogous to that between “one” and “many”. The sole duality, therefore, derivable from this reasoning is that between waves and wave packets. That is to say, of waves alone, though in antithetic wave profile configurations. Particles are absent even here.
7. This is what Hooker has earlier referred to as “the factual absence” for classical compatibility. For a thorough analysis of logical vs factual incompatibility see my essays [2, 1994, pp. 187-9 and 4, 223-5, 2005].
8. I can hardly overemphasize the importance of this word, soon to be abandoned by Bohr for the sake of the dominant term, CTY, [1934, 19], although the former shows exactly how complementary counter happenings at either end of the indivisible cluster result from uneven balance. The two notions, I should think, go hand in hand.
9. “More strikingly, Zeno shows that no body can ever start its journey: it can never take the first step, since there is no first step to take” [Barnes, 1982, 262]. This, of course, is Kant’s problem in the Second Antinomy, that between Composite and Simple and the frustration thereby of constructing the former by repetitive accumulations of the latter. It is anticipated, in a uniquely important degree, by Zeno’s paradox of extension: “How can a line of finite length be divided into infinitely many parts of finite length? And how can a line which is made up of lengthless parts add up to a line which has length?” [Harrison, 1996, 273].
10. Dimly, Duality is discerned in the distant horizon. What can only manifest undivided aspects of itself, will reasonably manifest such aspects as may conflict with one another, when what should otherwise be only a part of the entity, must
now extend over its whole profile instead; and may thus conflict with its alternative, indivisibly manifested profile. Duality is not a primitive assumption in this syllogism. It is only a side product of Indivisibility.
11. The aim is, of course, an equalitarian ideology, offered in abundance through ‘humanitarian’ incommensurability. Incommensurable scientific beliefs, let alone moral, social, political or aesthetic ones, are impossible to bring to conflict, therefore impossible to select from. They are all just as good (or just as bad, if it came to that). For the essentially ideological character of context-dependence see [Katz, 1979, 364] and, it goes without saying, A. Sokal’s pseudo-paper [Longino, 1997, 119]. But Feyerabend himself has beaten them all to it, openly confessing that his scientific relativism is preferable because more humane [1971, 33; and 1978 passim]. Scientific truth must be silenced, if it conflicts with politics.
References
[1] Antonopoulos, C. “Innate Ideas, Categories and Objectivity”. Philosophia Naturalis, 26, 2, 1987.
[2] Antonopoulos, C. “Indivisibility and Duality; A Contrast”. Physics Essays, 7, 2, 1994.
[3] Antonopoulos, C. “Investigating Incompatibility: How to Reconcile Complementarity With EPR”. Annales de la Fondation Louis de Broglie, 30, 1, 2005.
[4] Antonopoulos, C. “Making the Quantum of Relevance”. Journal for General Philosophy of Science, 36, 2, 2005.
[5] Barnes, J. The Presocratic Philosophers. Routledge and Kegan Paul, London, 1982.
[6] Bohr, N. Atomic Theory and the Description of Nature. Cambridge University Press, 1934.
[7] Bohr, N. Atomic Physics and Human Knowledge. New York, 1958.
[8] Bohr, N. Essays 1958-62 on Atomic Physics and Human Knowledge. Suffolk, 1963.
[9] Darrigol, O. “Strangeness and Soundness in de Broglie’s Early Works”. Physis, 30, 1993.
[10] Feyerabend, P.K. “Complementarity”. Proceedings of the Aristotelian Society, supplementary volume xxxii, 1958.
[11] Feyerabend, P.K. “Problems in Microphysics”. Frontiers of Science and Philosophy, R. Colodny, ed., University of Pittsburgh Press, 1962.
[12] Feyerabend, P.K. “How to Be a Good Empiricist”. The Philosophy of Science, P.H. Nidditch ed., Oxford University Press, 1971.
[13] Feyerabend, P.K. Science in a Free Society. NLB, 1978.
[14] Faye, J., Folse, H. eds. Niels Bohr and Contemporary Philosophy. Dordrecht, 1994.
[15] Folse, H. The Philosophy of Niels Bohr. Amsterdam, 1985.
[16] Hanson, N.R. Patterns of Discovery. Cambridge University Press, 1961.
[17] Harrison, C. “The Three Arrows of Zeno”. Synthese, 107, 1996.
[18] Hooker, C.A. “The Nature of Quantum Mechanical Reality”. Paradigms and Paradoxes, The University of Pittsburgh Press, 1972.
[19] Kant, Imm. The Critique of Pure Reason. Transl. N.K. Smith, Macmillan, 1973.
[20] Karakostas, V. “The Nature of Physical Reality in the Light of Quantum Nonseparability”. Oviedo, Spain, 7–13 August 2003.
[21] Karakostas, V. “Nonseparability, Potentiality and the Context-Dependence of Quantum Objects”. Dialectica, 2005 (to be published).
[22] Katz, J. “Semantics and Conceptual Change”. The Philosophical Review, 88, 1979.
[23] Kuhn, Th. The Structure of Scientific Revolutions. The University of Chicago Press, 1970.
[24] Leibniz, W. Monadology. The European Philosophers from Descartes to Nietzsche. M.C. Beardsley ed., New York, 1960.
[25] Longino, H.E. “Alan Sokal’s ‘Transgressing Boundaries’”. International Studies in the Philosophy of Science, 11, 2, 1997.
[26] Mackay, D.M. “Complementarity”. Proceedings of the Aristotelian Society, supplementary volume xxxii, 1958.
[27] Munevar, G. “Bohr and Evolutionary Relativism”. Explorations in Knowledge, xii, 2, 1995.
[28] Munevar, G., ed. Beyond Reason: Essays on the Philosophy of Paul Feyerabend. Dordrecht, 1991.
[29] Petersen, A. “On the Philosophical Significance of the Correspondence Argument”. Boston Studies in the Philosophy of Science, Volume 1966-68.
[30] Rosenfeld, L. “Foundations of Quantum Theory and Complementarity”. Nature, 190, 1961.
[31] Plotnitsky, A. Complementarity. London, 1994.
[32] Salmon, Marilee H., Earman, J., Glymour, C., Lennox, J., Machamer, P., McGuire, J.E., Norton, J.D., Salmon, Wesley C., Schaffner, K.N. Introduction to the Philosophy of Science. Prentice Hall, New Jersey, 1992.
[33] Wittgenstein, L. On Certainty. Transl. by G.E. Anscombe, B. Blackwell, Oxford, 1977.
In: Electrostatics: Theory and Applications Editor: Camille L. Bertrand, pp. 61-89
ISBN 978-1-61668-549-2 © 2010 Nova Science Publishers, Inc.
Chapter 4
MOLECULAR INTEGRALS OVER SLATER-TYPE ORBITALS. FROM PIONEERS TO RECENT DEVELOPMENTS
Philip E. Hoggan1,∗, María Belén Ruiz2,† and Telhat Özdoğan3
1 LASMEA, UMR 6602 CNRS, University Blaise Pascal, 24 avenue des Landais, BP 80026, 63171 AUBIERE Cedex, France
2 Department of Theoretical Chemistry of the Friedrich-Alexander-University Erlangen-Nürnberg, Egerlandstraße 3, D-91058 Erlangen, Germany
3 Department of Physics, Faculty of Arts and Sciences, Rize University, 53100 Rize, Turkey
Abstract
It can readily be demonstrated that atomic and molecular orbitals must decay exponentially at long range. They should also possess cusps when two particles approach each other. Therefore, Slater orbitals are the natural basis functions in quantum molecular calculations. Their use was hindered over the last four decades by integration problems. Consequently, Slater orbitals were replaced by Gaussian expansions in molecular calculations (in spite of the more rapid decay and absent cusps of Gaussians). From the 90s until today considerable effort has been made by several groups to develop efficient algorithms, which have resulted in new computer programs for polyatomic molecules. The key ideas of the different methods of integration (one-center expansion, Gauss transform, Fourier transform, use of Sturmians and elliptical co-ordinate methods) are reviewed here, together with their advantages and disadvantages, and the latest developments within the field. A recent approximation separating the variables of the Coulomb operator will be described, as well as its usefulness in molecular calculations. Recently, due to developments in computer technology and in the accuracy of experiments, there is a renewed interest in the use of Slater orbitals as basis functions for Configuration Interaction (CI) and Hylleraas-CI atomic and molecular calculations, and in density functional and density matrix theories.
∗ E-mail address: [email protected]
† E-mail address: [email protected]
Keywords: Slater orbitals, computer programs, Kato conditions, accurate molecular wavefunctions
1. Introduction
Slater-type orbitals (STO) [1] are the natural basis functions in quantum molecular calculations. Nevertheless, their use has been rather restricted, mostly due to mathematical integration difficulties. Even today there are no simple general algorithms to solve all the integrals appearing in a Hartree-Fock (HF) or Configuration Interaction (CI) molecular calculation, where integrals involving up to four atomic centers may appear. In spite of these difficulties, research on Slater orbitals has always continued. The reason is that large basis sets of Gaussian orbitals (GTO) and large wave function expansions are required to perform more accurate calculations of the energy and properties of ever larger systems of interest. As a consequence, those calculations need enormous computational times. In 1981, at a congress in Tallahassee on Slater type orbitals, Milan Randic described the situation: “Gaussian functions are not the first choice in theoretical chemistry. They are used (...) primarily because molecular integrals can be evaluated, not because they possess desirable properties. Today this may be a valid reason for their use, but tomorrow they may be thought of as bastard surrogates, which served their purpose in the transition period, have no longer viable merits and will fall into oblivion” [2]. The use of an expansion of GTOs instead of an STO was then a pragmatic solution, originally intended for solving the problems in the calculation of the first molecules on early mainframe computers. The GTO expansion, together with the wide distribution of computer programs like GAUSSIAN, has encouraged the use of GTOs for accurate calculations of large systems. The limits are receding with respect to the size of the systems and the dimension of the wave function, e.g. HF calculations of clusters of hundreds of atoms and CI calculations including hundreds of thousands of Slater determinants. 
In spite of the rapid development of computer technology and the availability of supercomputers, the computational times are unreasonably long, so that the computational chemist is restricted, e.g. when numerous test calculations have to be performed. This motivates the search for basis functions of which fewer would give a good CI, in particular. The possibility of using Slater orbitals, where a minimal basis would consist of one function per atom, would provide a forward impulse to theoretical and computational chemistry. Since the difficulties are of a purely mathematical nature, e.g. definite integrations, it would be worthwhile pursuing investigations of Slater orbitals. The purpose of this paper is to explain the key ideas about Slater orbitals for readers outside the field. It is beyond our scope to review the whole work of all the authors in this field, which would deserve a longer treatment. The history of Slater orbitals and of the first computer programs using them is presented, and the computer programs currently in use are listed. The STO and GTO are defined and compared. The methods used in the literature are explained, recalling the key ideas on which these methods are based. The latest developments within the field are reported.
2. Early History of the Slater Orbitals
The history of STOs is the history of theoretical chemistry. In 1928 Slater [1] simplified the hydrogen-like orbitals (which are eigenfunctions of the Hamiltonian for a one-electron atom), obtaining the orbitals which bear his name. Curiously, Slater at that time called these orbitals Hartree orbitals.
Brief time-line of events in molecular work over Slater type orbitals to date:
1928 Slater and London.
1929 Hylleraas: He atom.
1933 James and Coolidge: Hylleraas calculations on H2.
1949 Roothaan LCAO paper.
1950 Boys: first Gaussian expansion of STO published.
1951 Two-center Coulomb Integrals. Roothaan formulae.
1954 Boys and Shavitt ’Automated calculations’.
1958 Tauber: Work on analytic two-center Exchange integrals: Poisson equation.
1962 Scrocco: first publishes STO work (in Italian), but with a programme. This follows early molecular work in 1951-53 [3, 4].
1963 Clementi produces tables of optimised single zeta basis sets for atoms. Shavitt B-Functions described.
1970 The Journal of Chemical Physics published work on STO codes by E. Scrocco and R. Stevens. Gaussian 70 prepared for QCPE by J. Pople and R. Ditchfield.
1973 E. J. Baerends: numerical integration over STO used for ADF DFT code.
1978 Filter and Steinborn: Fourier transform work. B-functions and plane-wave expansion of Coulomb operator.
1981 ETO conference in Tallahassee. Weatherford and Jones.
1994 First STOP (Slater Type Orbital Package, QCPE 667 1996) code. Bouferguène and Hoggan.
2001 First SMILES (Slater Molecular Integrals for Large Electronic Systems) code. Fernández Rico, López et al.
2008 Gill: Coulomb resolution.
Very soon, researchers working with Slater at MIT broached the problem of evaluating the two-electron integrals in this basis. During the 1950s the Chicago group led by Mulliken took on the task of evaluating all the molecular integrals. Roothaan treated the Coulomb and Hybrid two-center integrals [5,6], Rüdenberg the exchange integrals [7]¹. Among the many authors who were working around the world on the solution of the necessary integrals one may mention Masao Kotani in Japan [8], who wrote the famous tables of integrals which bear his name and which were widely used. Coulson in Oxford (England) proposed a method to evaluate the three- and four-center integrals [9]; Löwdin in Uppsala [10] and a young American scientist called Harris [11] were also involved. Work in the early 50s mostly focused on integrals over STO. The interest was to make the first theoretical calculations of some molecules, starting with the diatomic systems H2 and N2. For three-center molecules the problem of integration was encountered (orbital translation). Mulliken and Roothaan called this “The bottleneck of Quantum Chemistry” [12], Mulliken mentioning it in his Nobel Lecture in 1966, on the molecular orbital method. Boys in Cambridge published his landmark paper [13] containing the evaluation of three- and four-center integrals using Gaussian functions, for which he derived the so-called product theorem: the product of two Gaussian functions located on different centers is a new Gaussian function located on a new center. Thus four-center electron distributions could be reduced to single-center distributions and evaluation was analytically facilitated. Boys regarded his work as an existence theorem. It was to change the course of molecular computations. Note that the product theorem for Slater orbitals leads to complicated infinite sums, making evaluation awkward compared with the simple closed forms for Gaussians. 
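The Gaussian product theorem mentioned above can be checked numerically in a few lines. The following is a minimal sketch for unnormalised s-type Gaussians only; the function names, exponents and centers are hypothetical choices made for illustration and are not taken from Boys' paper. It uses the standard identity exp(-α|r-A|²)·exp(-β|r-B|²) = K·exp(-p|r-P|²), with p = α+β, P = (αA+βB)/p and K = exp(-(αβ/p)|A-B|²).

```python
import math

def s_gaussian(alpha, center, r):
    """Unnormalised s-type Gaussian exp(-alpha * |r - center|^2)."""
    d2 = sum((ri - ci) ** 2 for ri, ci in zip(r, center))
    return math.exp(-alpha * d2)

def gaussian_product(alpha, a, beta, b):
    """Return (p, P, K) such that
    exp(-alpha|r-a|^2) * exp(-beta|r-b|^2) = K * exp(-p|r-P|^2)."""
    p = alpha + beta                                   # combined exponent
    P = tuple((alpha * ai + beta * bi) / p             # new center, on the line a-b
              for ai, bi in zip(a, b))
    ab2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))  # squared distance |a-b|^2
    K = math.exp(-alpha * beta / p * ab2)              # constant prefactor
    return p, P, K

# Hypothetical exponents and centers, chosen only for illustration.
alpha, a = 0.8, (0.0, 0.0, 0.0)
beta, b = 1.3, (0.0, 0.0, 1.4)
p, P, K = gaussian_product(alpha, a, beta, b)

# Compare the two-center product with the single-center form at a few points.
for r in [(0.1, -0.2, 0.3), (1.0, 1.0, 1.0), (0.0, 0.0, 0.7)]:
    lhs = s_gaussian(alpha, a, r) * s_gaussian(beta, b, r)
    rhs = K * s_gaussian(p, P, r)
    print(f"{lhs:.12e}  {rhs:.12e}")
```

The two printed columns should agree to rounding error at every point. No comparably simple closed form exists for a product of two exponentials exp(-ζ|r-A|) on different centers, which is the difficulty with Slater orbitals noted in the text.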
In 1954 Boys, Shavitt et al [14] expanded Slater orbitals into Gaussians to perform quantum mechanical calculations. In 1963 Clementi presented his tables of optimised basis sets of Slater orbitals [15]. Later Pople would base his programs on Boys' pragmatism.
3. History of the STO Computer Programs
The first (and surely the last) manual calculation of a molecule, the N2 molecule, was done by Scherr in 1956. It required the work of two (some accounts say three) men for two years. Afterwards this calculation was reproduced by the first digital computer calculation [16, 17], which took 35 minutes. In 1962 Shull initiated the Quantum Chemistry Program Exchange (QCPE) at Indiana University. The first automatic computer program was POLYATOM [18], which nevertheless used GTOs with SCF-LCAO. The program was developed at MIT in 1963 when Slater was there. In 1963 the program IBMOL [19] was developed by Clementi and others when he visited the Chicago group. In 1968 an STO code was developed by Scrocco and Tomasi from Pisa. Preliminary work by Scrocco is reported in Italian as early as 1962 [4].
Footnote 1: The two-center two-electron integrals are classified according to the centers a, b. Writing them according to the charge distributions [Ω(1)|Ω(2)], the Coulomb integrals are [aa|bb], the hybrids [aa|ab] and the exchange integrals [ab|ab]. The most difficult are the exchange integrals, because the charge distribution of every electron is located over two centers.
Molecular Integrals over Slater-Type Orbitals
This program was also used by Berthier in France. The program ALCHEMY was originally developed in 1968 using Slater orbitals by Clementi and the staff of the IBM laboratory in San Jose [20]; afterwards, the new ALCHEMY 2 by Bagus and others used GTOs. The program DERIC [21], by Hagstrom in 1972, performs STO calculations on two-center molecules. In the 80s, the advent of GAUSSIAN [22] saw development in the STO field hibernate somewhat. By the 90s several groups around the world had developed new STO computer programs which are now distributed. The program STOP, by Bouferguène and Hoggan [23], was first published in 1996. It is based on the single-center strategy and was first presented in 1994 at the 8th ICQC in Prague. New versions appeared, the latest (parallel) in 2009. Then in 1998 a program using B-functions was written by Steinborn, Weniger, Homeier et al in Regensburg [24]. The program SMILES by Fernández Rico, López, Ema and Ramírez in Madrid appeared in 1998, and new versions have appeared, the latest in 2004 for HF and CI calculations on molecules [25]. The program CADPAC [26] in Cambridge uses techniques like density fitting, involving auxiliary Slater type orbital basis sets, to perform Hartree-Fock and Density Functional Theory (DFT) calculations with a reduced number of indices in the requisite integrals. The aim was to obtain better Nuclear Magnetic Resonance (NMR) chemical shifts from a basis that includes nuclear cusps. In the density functional theory field, the 2001 program ADF (Amsterdam Density Functional) [27], begun in 1973 by Baerends et al, uses Slater orbitals for its calculations. This much-used package offers a very extensive series of atomic basis sets for input, including most elements. It follows a numerical grid strategy and this review will not detail it. The program ATMOL of Bunge et al performs large, highly accurate CI calculations on atoms using Slater orbitals [28].
In the twenty-first century much interest is concentrated on generating more efficient calculation algorithms, on the use of non-integer Slater orbitals, on the numerical solution of integrals when using B-functions and on electron correlation when using Hylleraas wave functions.
4. Slater Orbitals & Gaussian Orbitals
It is well known that hydrogen-like orbitals are the solutions of the Schrödinger equation for a one-electron atom. For helium and atoms with more electrons the Schrödinger equation has no analytical solution, due to the potential term $1/r_{ij}$ which correlates the (otherwise) independent electrons. It is assumed that for systems with N ≥ 2 this form of the exponential, $e^{-\alpha r}$, will be the asymptote of the formal solution. The hydrogen-like orbitals have nodes, e.g. the 2s orbital is of the form $(1 - br)\,e^{-\alpha r}$, and orbitals of higher quantum number are similar, but STOs are node-less. A related problem appears for Gaussians. In 1928 Slater [1] regarded the hydrogen-like orbitals as polynomials in r which make the calculations messy, and proposed the use of single powers of r, i.e. linear combinations of hydrogen-like terms. A picture which helps to visualize the differences between Slater and Gaussian orbitals is the representation of the 1s orbital function of both types, see Figure 1.
Figure 1. Comparison of the shapes of STO and GTO functions.
STOs represent well the electron density near the nucleus (cusp) and far from the nucleus (correct asymptotic decay). STOs thus resemble the true orbitals. Conversely, the GTOs have an erroneous shape both near the nucleus (no cusp) and far from it. One can observe that far from the nucleus the GTOs tend to zero much faster than STOs. When a 1s STO is reproduced using 3 GTOs (the so-called minimal GTO basis), the resulting orbital has the shape of a Gauss curve with no cusp, see Figure 2. Many GTOs are necessary to reproduce a single STO, and even then the electron cusp at the nucleus is missing. This is one of the reasons for the slow convergence of the wave function solutions to the exact (HF or CI) result. In general, if the basis function is not a formal solution of the Schrödinger equation its convergence is slower. That means that more Slater determinants are required to obtain the same result. Thus Slater orbitals show faster convergence when their number is increased. Another advantage of Slater orbitals is the size of the basis: one orbital per electron is of reasonable quality and multiple-zeta basis sets converge fast to the Hartree-Fock limit. Therefore, the number of integrals to be evaluated is dramatically smaller. CI is spectacularly more efficient. Finally, conceptually the Slater orbitals give a more intuitive description of the atomic orbitals and of the molecular orbitals (MO) formed with them. The disadvantages of Slater orbitals have already been mentioned: the three- and four-center two-electron integrals are the bottleneck. There is no general analytical solution for them, which would be the most effective and fastest way of calculation. Instead there are a number of approximate methods of calculation, involving infinite series or truncated approximations to the Coulomb operator itself. They will be treated in the next Sections.
The radial Slater functions do not represent the bonding region adequately, so it is necessary to add higher angular momentum functions. It is nevertheless possible to use linear combinations restoring radial nodes. This approach is advocated particularly for ADF, where the hydrogen-like basis is obtained by fixing the coefficients for combining Slater functions. Another disadvantage is that, since the times of Roothaan and Rüdenberg, some of the two-center integrals have been solved for a co-axial conformation of the atomic coordinate systems (the z-axes point to each other) that is not the molecular frame. Therefore rotations and reflections are necessary. These problems have been solved, but the solution requires additional calculations [29].
Figure 2. Construction of a STO with 3 GTOs.
Nowadays, Slater orbitals are used in atomic calculations, especially in highly accurate calculations on atoms using Hylleraas wave functions (with explicit $r_{ij}$ dependence), and also in diatomics. They are used in DFT and in density matrix theories. Traditionally they have been used in semi-empirical calculations where, of course, the three- and four-center integrals were neglected. The Gaussian orbitals are generally used in standard quantum mechanical calculations. As explained, they are not shaped like analytical orbitals and have no cusp at the nucleus; for that reason they are not good for the calculation of properties where the density at the nucleus has to be well described. Also the radial dependence is not well represented, and the number of integrals increases dramatically with the dimension of the basis. The major advantage of GTOs is the existence of a product theorem. Over many years, workers have improved the calculation of the necessary integrals and achieved a considerable speed-up. For example, a Laplace transform of the Coulomb operator enables three- and four-center integrals to be calculated like two-center integrals. In conclusion, the main defects of GTO expansions are the absent cusp, which slows the convergence, and the large number of integrals to be computed.
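The cusp and tail behaviour discussed in this section can be made concrete with a few lines of code. The sketch below compares a normalised 1s STO with exponent 1 against a three-Gaussian contraction. The exponents and coefficients are the commonly tabulated STO-3G fit for that exponent; they are quoted here as an assumption and do not come from this chapter.

```python
import math

ZETA = 1.0
# Commonly tabulated STO-3G fit for zeta = 1 (assumed values, not from this text).
EXPONENTS = (0.109818, 0.405771, 2.227660)
COEFFS = (0.444635, 0.535328, 0.154329)

def sto_1s(r, zeta=ZETA):
    """Normalised 1s Slater orbital, (zeta^3/pi)^(1/2) exp(-zeta r)."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)

def sto3g_1s(r):
    """Contraction of three normalised s-type Gaussians."""
    total = 0.0
    for alpha, c in zip(EXPONENTS, COEFFS):
        norm = (2.0 * alpha / math.pi) ** 0.75   # normalisation of exp(-alpha r^2)
        total += c * norm * math.exp(-alpha * r * r)
    return total

def slope(f, r, h=1e-5):
    """Forward-difference estimate of df/dr."""
    return (f(r + h) - f(r)) / h

for r in (0.0, 0.5, 1.0, 3.0, 6.0, 10.0):
    print(f"r = {r:5.1f}   STO = {sto_1s(r):.6e}   3 GTOs = {sto3g_1s(r):.6e}")

print("slope at nucleus, STO   :", slope(sto_1s, 0.0))
print("slope at nucleus, 3 GTOs:", slope(sto3g_1s, 0.0))
```

The STO has a finite negative slope at r = 0 (the cusp), whereas the slope of any finite Gaussian contraction vanishes there. At large r the contraction falls well below the exponential, in line with the qualitative picture of Figures 1 and 2.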
5. Types of Exponentially Decaying Orbital, Based on Eigenfunctions for One-Electron Atoms
In general one calls Slater-type orbitals those with an exponential radial factor of the form $r^n e^{-\alpha r}$, for n a positive integer (or 0). The atom-centered Slater orbitals are defined as:
$$\varphi_{nlm}(\mathbf{r}) = r^{n-1} e^{-\alpha r}\, Y_l^m(\theta, \phi), \tag{1}$$
where n, l, m are the quantum numbers. $Y_l^m(\theta, \phi)$ are the spherical harmonics defined using the Condon-Shortley phase:
$$Y_l^m(\theta, \phi) = (-1)^m \left[\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2} P_l^m(\cos\theta)\, e^{im\phi}, \tag{2}$$
$P_l^m(\cos\theta)$ are the associated Legendre functions. The spherical harmonics are eigenfunctions of the angular momentum operator $\hat{L}^2$ and of its z-projection $\hat{L}_z$. The complex spherical harmonics are used mainly in atoms and in developing theories, because it is easier to work out general formulae and derivations with them. The real spherical harmonics are linear combinations of the complex ones. These are used mainly in molecules. Note that they are written using polar coordinates. They can also be straightforwardly converted into Cartesian Slater orbitals by the substitution:
$$x = r \sin\theta \cos\phi, \tag{3}$$
$$y = r \sin\theta \sin\phi, \tag{4}$$
$$z = r \cos\theta, \tag{5}$$
obtaining in general:
$$\chi_{nlm}(\mathbf{r}) = x^{n_x} y^{n_y} z^{n_z}\, r^{n-1} e^{-\alpha r}. \tag{6}$$
Cartesian Slater type orbitals are very seldom used compared with Cartesian Gaussians, which are an almost systematic choice. When the principal quantum number n in Eq. (1) is a non-integer we have the NISTOs (Non Integer Slater Orbitals). The main difficulty when working with these orbitals is that during the derivations a binomial has to be used with a non-integer power, which leads to an infinite expansion. These orbitals are widely investigated at present [30]. The additional flexibility of using non-integer quantum numbers brings a lowering in the energy results. There is also the possibility of transforming the polar coordinates into elliptical coordinates. Traditionally the elliptical Slater orbitals have been used as basis functions for two-center molecules [31]-[33]. These orbitals are known to lead to lower energy results, see Ref. [34]. Using $\xi = \lambda_1 = (r_a + r_b)/R$ and $\eta = \mu_1 = (r_a - r_b)/R$ (cf. Eq. (17)):
$$\varphi_{nlm}(\mathbf{r}) = \xi^n \eta^l (\xi^2 - 1)^{m/2} (1 - \eta^2)^{m/2} e^{-\alpha\xi} e^{im\phi}, \tag{7}$$
where ξ, η, φ are the elliptical coordinates. Now we go to orbitals which are linear combinations of Slater orbitals: B-functions [35], hydrogen-like orbitals and Sturmians [36]. The B-functions are defined in terms of reduced Bessel functions. These orbitals have some helpful properties, like a compact Fourier transform. Written in the form
$$B_{nlm}(\mathbf{r}) = \sum_{j=1}^{n} \frac{(2n-j-1)!}{2^{2n+l-1}\,(n+l)!\,(n-j)!\,(j-1)!}\, (\zeta r)^{l+j-1} e^{-\zeta r}\, Y_l^m(\theta, \phi), \tag{8}$$
one can see that they are a linear combination of Slater orbitals. The angular parts are the spherical harmonics. The hydrogen-like orbitals, which are solutions of the Schrödinger equation for the hydrogen atom, have a radial part which contains a Laguerre polynomial. The polynomial and the exponent coefficient depend on the atomic number Z and the principal quantum number n:
$$\chi_{nlm}(\mathbf{r}) = N_{nl}\, L^{2l+2}_{n-l-1}\!\left(\frac{2Zr}{n}\right) r^l e^{-Zr/n}\, Y_l^m(\theta, \phi). \tag{9}$$
Due to that fact, the hydrogen-like orbitals (for finite n) do not form a complete set; they need the orbitals of the continuum to be complete. This is important for the convergence of the solutions. Shull and Löwdin [37] realized that this was due to the dependence of the exponent Z/n on n, which dilates the orbitals, and they proposed the following orbitals, in which these exponents were replaced by adjustable parameters, i.e. the usual orbital exponents:
$$\chi_{nlm}(\mathbf{r}) = N_{nl}\, L^{2l+2}_{n-l-1}(2\alpha r)\, r^l e^{-\alpha r}\, Y_l^m(\theta, \phi), \tag{10}$$
so these orbitals form a complete set. These orbitals were subsequently called Coulomb Sturmians because they fulfill the so-called Sturm-Liouville theorem for eigenfunctions of such differential equations, with central Coulomb attraction. In Section 7 (Methods in the Literature) we will see how these kinds of orbitals have been used.
6. Types of Integral over Slater Orbitals
Due to the form of the Hamiltonian and of its expectation value we find the following kinds of integrals. First come the integrals which appear when using Hartree-Fock and CI wave functions, i.e. ab initio methods in general. The integrals are classified according to the number of electrons and centers which are linked. We present them in order of difficulty.
6.1. One-Electron Integrals
These are the one- and two-center overlap integrals $\langle a|b\rangle$, the kinetic energy integrals $\langle a|-\tfrac{1}{2}\nabla^2|b\rangle$ and the two-center nuclear attraction integrals $\langle a|1/r_b|b\rangle$. Another case of one-electron integral is the three-center nuclear attraction integral, originating from the nuclear attraction operators in the Hamiltonian: $\langle a|1/r_c|b\rangle$.
6.2. Two-Electron Integrals
They can involve up to four centers, because of the determinant giving the wave-function and thus the four orbitals which form the integral. They are classified according to the number of centers. The two-center integrals have traditionally been the most investigated; they have the following nomenclature. In the Coulomb integrals the charge distribution of every electron is located at one center: [aa|bb]. In the hybrid integrals one charge distribution is located at one center and the other over two centers: [aa|ab] and their equivalents [bb|ab]. The exchange integral is more difficult; in the case of different exponents it leads to an infinite sum. Every electron is located over two centers: [ab|ab]. To solve these integrals a change to elliptical co-ordinates is useful. The Coulomb operator in elliptical co-ordinates contains associated Legendre functions of the first and second kind, for which integration is very difficult. In the case of slightly different exponents there are some singularities. In actual calculations, the Coulomb and hybrid integrals are calculated exactly, and numerous methods exist. The exchange integrals are calculated with great accuracy. The three-center integrals are of several types: [aa|bc], [ab|ac]. For different exponents there is no general solution. The four-center integrals are of the type [ab|cd].
6.3. Three- and Four-Electron Integrals
They appear in the Hylleraas-CI method [38] when using one inter-electronic distance $r_{ij}$ per configuration. For the two-center case they have been solved generally by Budzinski [39]. Cases with three or more centers have not been solved yet. These can be many-center integrals, as every electron from the right and the left in the expectation value may be on a different center. The three-electron integrals range from the easier type, e.g. $[aa|r_{12} r_{13}|ab|bb]$, to the most difficult, $[ab|r_{12} r_{13}|ab|ab]$. Four-electron ones are of the type $[aa|r_{12} r_{13}/r_{14}|bb|ab|bb]$, and so on. For three or more centers one would find three- and four-electron integrals with as many centers as the molecule has, up to 8. These integrals are still not solved. Interest nowadays focuses on the solution of two- and three-center molecules using explicitly correlated methods.
7. Methods in the Literature
In this section the main methods from the literature for evaluating the three- and four-center integrals over Slater orbitals will be explained. The methods are approximate because they rely on transformations, expansions or numerical integrations. Therefore they are not as effective as analytical integration would be. Nevertheless, these methods make the evaluation of these integrals possible, and the programs are even competitive with those using Gaussians.
7.1. Single-Center Expansion
The single-center expansion method requires expanding the Slater orbitals located at different centers about only one of them and then performing the integrations as for atoms. The translation method consists in selecting an atom as origin and then translating the other orbitals from their atoms to that origin. Therefore both methods are essentially the same. A function centered at A is expanded at another point B as follows:
$$\varphi_{A_i} = \sum_{j=1}^{\infty} \left[\int \varphi_{A_i}\, \chi_{B_j}\, d\tau\right] \chi_{B_j}. \tag{11}$$
This formula is due to Smeyers [40]. The requisite coefficients are the integrals in brackets. The different methods of single-center expansion differ in the way these coefficients are calculated.
This method was first proposed by Barnett and Coulson [9] in 1956 using radial orbitals (s-orbitals) and was called the zeta function method because of expansions in terms of successive derivatives with respect to exponents. The method has similarities with the alpha function method of Löwdin [10]. Harris and Michels [41] extended the method to orbitals of general angular momentum in 1965. This method has been used by Smeyers, Jones, Guseinov, Fernández Rico et al, and others. The idea is the translation of an orbital from one point to another. The translation of a spherical harmonic is a finite expansion; the translation of the radial part is nevertheless an infinite expansion. This situation is best explained with the formula of Guseinov [42]:
$$\chi_{n,l,m}(\zeta, \mathbf{r}_A) = \sum_{n'=1}^{\infty} \sum_{l'=0}^{n'-1} \sum_{m'=-l'}^{l'} V_{nlm,\,n'l'm'}(\zeta, \mathbf{R}_{AB})\, \chi_{n',l',m'}(\zeta, \mathbf{r}_B), \tag{12}$$
where V are the coefficients of the expansion. The method is very stable but it requires computation of a lot of terms to obtain sufficient correct decimal digits, therefore this method needs very long computational times.
7.2. Gaussian Expansion
This is the Boys-Shavitt method [43], which consists in solving some integrals over Slater orbitals by expanding them into a finite series of Gaussians:
$$e^{-\alpha r} = \sum_{i=1}^{N_G} c_i\, e^{-\alpha_i r^2}, \tag{13}$$
where $c_i$ and $\alpha_i$ are obtained by a least-squares fit. This method, with some improvements, is used at present in the program SMILES [25]. As $N_G$ is usually larger than the number of primitives used in purely Gaussian basis sets, the number of integrals to calculate is large. The method is very stable and robust. It requires lengthy computational times to get accurate integral values.
7.3. Gaussian Transform Method
The Gaussian transform method of Shavitt and Karplus (1965) [44] has probably been the most used method. It consists in the Laplace transform of the exponential function, here exemplified by the simplest one, i.e. a 1s orbital:
$$e^{-\alpha r} = \frac{\alpha}{2\sqrt{\pi}} \int_0^{\infty} s^{-3/2}\, e^{-\alpha^2/(4s)}\, e^{-s r^2}\, ds. \tag{14}$$
Every Slater exponential within the integral is transformed into a Gaussian one; the price is that the remaining integrals over s, which have a special form, have to be solved numerically. This is the disadvantage of the method.
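The transform in Eq. (14) is easy to verify numerically. The sketch below evaluates its right-hand side with a plain trapezoidal rule after the substitution s = exp(t); the function name, the integration window and the step count are illustrative choices, not part of the Shavitt-Karplus method itself, and the routine assumes r > 0.

```python
import math

def slater_from_gaussian_transform(alpha, r, t_min=-20.0, t_max=10.0, steps=3000):
    """Right-hand side of Eq. (14), integrated over t = ln(s) by the trapezoidal rule.

    With s = exp(t) and ds = s dt, the integrand becomes
    exp(-t/2 - alpha^2/(4 s) - s r^2), which decays very fast at both ends
    of the window for r > 0.
    """
    h = (t_max - t_min) / steps
    total = 0.0
    for k in range(steps + 1):
        t = t_min + k * h
        s = math.exp(t)
        exponent = -0.5 * t - alpha * alpha / (4.0 * s) - s * r * r
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoidal end-point weights
        total += weight * math.exp(exponent)
    return alpha / (2.0 * math.sqrt(math.pi)) * h * total

for alpha, r in ((1.0, 1.0), (1.2, 0.5), (0.7, 3.0)):
    approx = slater_from_gaussian_transform(alpha, r)
    exact = math.exp(-alpha * r)
    print(f"alpha = {alpha}, r = {r}: transform = {approx:.12f}, exp(-alpha r) = {exact:.12f}")
```

In an actual integral code the s-integration is carried out after the Gaussian integrals over the electron coordinates, so the quadrature above only illustrates the identity, not the full method.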
7.4. Fourier-Transform Method
The B-functions, Eq. (8), proposed by Filter and Steinborn in 1978 [35], have a highly compact Fourier transform. The group of Steinborn has developed this method [24]. The evaluation of integrals using B-functions leads to some integrals including a Bessel function of the first kind, which is oscillatory:
$$\int_0^{\infty} r^n e^{-\alpha r} J_{l+1/2}(rx)\, dr. \tag{15}$$
To evaluate these, Safouhi [45,46] used the SD-transform, due to Sidi [47], which consists in substituting for this integral a sine integral which has the same behavior. It needs numerical integration.
Figure 3. Transformation from polar to elliptical coordinates.
7.5. Use of Sturmians
The Sturmians were proposed by Shull and Löwdin in 1956 [37]. Smeyers used the Sturmians to evaluate three-center nuclear attraction integrals using the one-center expansion [40]. Guseinov also used them in 2001 [48]. The Sturmians, Eq. (10), satisfy the Sturm-Liouville theorem:
$$\nabla^2 S_{n,l}^m = \left[\alpha^2 - \frac{2\alpha n}{r}\right] S_{n,l}^m. \tag{16}$$
The so-called Coulomb Sturmians orthogonalise the Coulomb potential in their argument. This generally applies to the attraction term, at least for one-electron functions. Geminals useful for explicit correlation have also been used. A seminal text by Avery gives more details on this subject to the interested reader [36].
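As a quick consistency check of Eq. (16), consider the simplest member, n = 1, l = 0, whose radial form is (up to normalisation, which is omitted here as an assumption of this sketch) a bare exponential. Only the radial part of the Laplacian contributes:

```latex
\begin{align*}
S_{1,0}^{0}(r) &\propto e^{-\alpha r}, \\
\nabla^2 e^{-\alpha r}
  &= \frac{1}{r^2}\frac{d}{dr}\!\left(r^2 \frac{d}{dr} e^{-\alpha r}\right)
   = \frac{1}{r^2}\frac{d}{dr}\!\left(-\alpha r^2 e^{-\alpha r}\right) \\
  &= \left[\alpha^2 - \frac{2\alpha}{r}\right] e^{-\alpha r},
\end{align*}
```

which is Eq. (16) with n = 1. The same exponent α serves every n; only the strength of the 1/r term changes, in contrast with the hydrogen-like orbitals of Eq. (9), where the exponent Z/n changes with n.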
7.6. Elliptic Coordinate Method
The elliptic coordinate method is the transformation of the polar orbital coordinates into elliptical ones λ, µ according to Figure 3. The two polar coordinate systems point towards each other, so that the elliptical angle φ coincides with the polar angle φ. This transformation is:
$$r_{1a} = \frac{R}{2}(\lambda_1 + \mu_1), \qquad r_{1b} = \frac{R}{2}(\lambda_1 - \mu_1), \tag{17}$$
$$\cos\theta_{1a} = \frac{1 + \lambda_1\mu_1}{\lambda_1 + \mu_1}, \qquad \cos\theta_{1b} = \frac{1 - \lambda_1\mu_1}{\lambda_1 - \mu_1}, \tag{18}$$
$$\sin\theta_{1a} = \frac{[(\lambda_1^2 - 1)(1 - \mu_1^2)]^{1/2}}{\lambda_1 + \mu_1}, \qquad \sin\theta_{1b} = \frac{[(\lambda_1^2 - 1)(1 - \mu_1^2)]^{1/2}}{\lambda_1 - \mu_1}. \tag{19}$$
The volume element and the domain change are:
$$\int_0^{\infty} r^2\, dr \int_0^{\pi} \sin\theta\, d\theta \int_0^{2\pi} d\phi \;\rightarrow\; \frac{R^3}{8} \int_1^{\infty} d\lambda_1 \int_{-1}^{+1} d\mu_1\, (\lambda_1^2 - \mu_1^2) \int_0^{2\pi} d\phi_1. \tag{20}$$
The method has been used by numerous authors: Mulliken, Rieke, Orloff, Rüdenberg, Roothaan, Eyring, Randic, Saika, Yoshimine, Maslen and Trefry, Guseinov, Bosanac, Harris, Fernández Rico, López, Özdoğan and many others. Some types of three-electron integrals have recently been solved by Özdoğan and Ruiz using this method [49].
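The simplest use of the transformation (17)-(20) is the overlap of two normalised 1s STOs with the same exponent ζ at a separation R. By Eq. (17) the orbital product depends only on λ, and the integral reduces to the elementary auxiliary integrals A_i(p) (integral of λ^i exp(-pλ) from 1 to infinity) and B_j(0) (integral of µ^j from -1 to +1), which reappear as Eqs. (29) and (30). The sketch below is an illustration under these assumptions; the function names are hypothetical, and the result is compared with the textbook closed form exp(-p)(1 + p + p²/3), p = ζR.

```python
import math

def A_aux(i_max, p):
    """A_i(p) = int_1^inf x^i exp(-p x) dx for i = 0..i_max, by upward recurrence."""
    values = [math.exp(-p) / p]
    for i in range(1, i_max + 1):
        # Integration by parts: A_i = (i * A_{i-1} + exp(-p)) / p
        values.append((i * values[-1] + math.exp(-p)) / p)
    return values

def B_aux_zero(j):
    """B_j(0) = int_{-1}^{+1} x^j dx (equal exponents, so no exponential factor)."""
    return 2.0 / (j + 1) if j % 2 == 0 else 0.0

def overlap_1s_1s(zeta, R):
    """Overlap of two normalised 1s STOs (same exponent) in elliptical coordinates."""
    p = zeta * R
    A = A_aux(2, p)
    # Orbital product: (zeta^3/pi) exp(-zeta (r_a + r_b)) = (zeta^3/pi) exp(-p lambda).
    # Volume element, Eq. (20): (R^3/8) (lambda^2 - mu^2) dlambda dmu dphi; phi gives 2 pi.
    double_integral = A[2] * B_aux_zero(0) - A[0] * B_aux_zero(2)
    return zeta ** 3 * R ** 3 / 4.0 * double_integral

def overlap_closed_form(zeta, R):
    """Textbook closed form for the same overlap."""
    p = zeta * R
    return math.exp(-p) * (1.0 + p + p * p / 3.0)

for R in (0.5, 1.4, 4.0):
    print(R, overlap_1s_1s(1.0, R), overlap_closed_form(1.0, R))
```

For unequal exponents the argument of B_j no longer vanishes and the B integrals need their own evaluation scheme; that case is deliberately left out of this sketch.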
8. General Two-electron Exponential Type Orbital Integrals in Poly-Atomics without Orbital Translations
8.1. Introduction
Now, the Coulomb resolution will be presented. This is a readily controlled approximation for separating the variables in the Coulomb operator $1/r_{12}$ which, in recent work by Gill and by Hoggan, is shown to spell the end of exponential orbital translations and the ensuing integral bottlenecks. This section advocates the use of atomic orbitals which have a direct physical interpretation, i.e. hydrogen-like orbitals. They are Exponential Type Orbitals (ETOs). Until 2008, products of such orbitals on different atoms were difficult to manipulate for the evaluation of two-electron integrals. The difficulty was mostly due to cumbersome orbital translations involving slowly convergent infinite sums. These are completely eliminated using Coulomb resolutions. They provide an excellent approximation that reduces these integrals to a sum of one-electron overlap-like integral products that each involve orbitals on at most two centers. Such two-center integrals are separable in prolate spheroidal coordinates. They are thus readily evaluated. Only these integrals need to be re-evaluated to change basis functions. The above is still valid for three-center integrals. In four-center integrals, the resolutions require translating one potential term per product. This is outlined here and detailed elsewhere. Numerical results are reported for the H2 dimer and the CH3F molecule.
The choice between Gaussian and exponential basis sets for molecules is at present usually made for reasons of convenience. In fact, it appears to be constructive to regard them as complementary, depending on the specific physical property required from molecular electronic structure calculations. As regards exponential type orbitals (ETOs) such as Slater functions, two-electron integrals seem difficult to evaluate because the general three- and four-center integrals evaluated by the usual methods require orbital translations. Some workers avoid the problem using large Gaussian expansions, as in SMILES [50, 51].
It would be helpful to devise a separation of the variables of integration. This would eliminate orbital translations, although some other translations remain, involving a simple analytic potential. The present work describes a breakthrough in two-electron integral calculations as a result of Coulomb operator resolutions. This separates the independent variables of the operator and gives rise to simple analytic potentials. The two-electron integrals are replaced by sums of overlap-like one-electron integral products. One potential term in these products requires translation in four-center terms, which is significantly simpler to carry out than the translation of the orbitals. This implies a speed-up for all basis sets, including Gaussians. The improvement is most spectacular for exponential type orbitals. A change of basis set is also facilitated, as only these one-electron integrals need to be changed. The Gaussian and exponential type orbital basis sets are, therefore, interchangeable in a given program. The timings of exponential type orbital calculations are no longer significantly greater than for a Gaussian basis when a given accuracy is sought for molecular electronic properties. Numerical values for all two-electron integrals evaluated using Coulomb resolutions, as well as total energies, will be tabulated for the H2 dimer and the CH3F molecule.
8.2. Basis Sets
Although the majority of electronic quantum chemistry uses Gaussian expansions of atomic orbitals [13, 43], the present work uses exponential type orbital (ETO) basis sets which satisfy Kato's conditions for atomic orbitals: they possess a cusp at the nucleus and decay exponentially at long distances from it [52]-[54]. It updates a 'real chemistry' interest beginning around 1970 and detailed elsewhere [3, 4, 15, 27, 44, 55, 56]. Slater type orbitals (STOs) [57, 58] are considered here. STOs allow us to use routines from the STOP package [23, 59] directly. The integrals may be evaluated after Gaussian expansion or expressed as overlaps to obtain a speed-up [60]. Exponents may be optimized by solving a secular determinant, as in [61].
8.3. Programming Strategy
Firstly, the ideal ab initio code would rapidly switch from one type of basis function to another. Secondly, the chemistry of molecular electronic structure must be used to the very fullest extent. This implies using atoms in molecules (AIM) and diatomics in molecules (DIM) from the outset, following Bader (in an implementation due to Rico et al [50]) and Tully [62] (implemented in our previous work [59, 63]), respectively. The natural choice of atomic orbitals, i.e. the Sturmians or hydrogen-like orbitals, lends itself to the AIM approach. To a good approximation, core eigenfunctions for the atomic Hamiltonian remain unchanged in the molecule. Otherwise, atom pairs are the natural choice, particularly if the Coulomb resolution recently advocated by Gill is used. This leads us to products of auxiliary overlaps which are either literally one- or two-centered, or have one factor of the product where a simple potential function is translated to one atomic center. The Slater basis set nightmare of the Gegenbauer addition theorem is completely avoided. Naturally, the series of products required for, say, a four-center two-electron integral may require 10 or even 20 terms to converge to chemical accuracy when at least one atom pair is bound, but the auxiliaries are easy to evaluate recursively and to re-use. Unbound pairs may be treated using approximate methods. Now, the proposed switch in basis set may also be accomplished just by re-evaluating the auxiliary overlaps. Furthermore, the exchange integrals are greatly simplified, in that the products of overlaps just involve a two-orbital product instead of a homogeneous density. The resulting cpu-time growth of the calculation is $n^2$ for SCF, rather than $n^4$. Further gains may be obtained by extending the procedure to post-Hartree-Fock techniques involving explicit correlation, since the $r_{12}^{-1}$ integrals involving more than two electrons, which previously soon led to bottlenecks, are also just products of overlaps. This Coulomb resolution is diagonal in Fourier space in some cases.
8.4. Avoiding ETO Translations for Two-Electron Integrals over Three and Four Centers
Previous work on separation of integration variables is difficult to apply, in contrast to the case for Gaussians [64, 65]. Recent work by Gill et al [66] proposes a resolution of the Coulomb operator in terms of potential functions $\phi_i$, which are characterized by examining Poisson's equation. In addition, they must ensure rapid convergence of the implied sum in the resulting expression for Coulomb integrals $J_{12}$ as products of "auxiliaries", i.e. overlap integrals, as detailed in [66]:
$$J_{12} = \langle \rho(\mathbf{r}_1)\, \phi_i(\mathbf{r}_1) \rangle\, \langle \phi_i(\mathbf{r}_2)\, \rho(\mathbf{r}_2) \rangle, \quad \text{with implied summation over } i. \tag{21}$$
This technique can be readily generalized to exchange and multi-center two-electron integrals. For two-center terms it is helpful to define structure harmonics by Fourier transforms, limiting evaluation to non-zero terms [67]. Note, however, that in four-center integrals, the origin of only one of the potential functions may be chosen to coincide with an atomic (nuclear) position. Define the potential functions [67]: $\phi_i = 2^{3/2}\, \phi_{nl}(r)\, Y_l^m(\theta, \phi)$. Omitting the spherical harmonic term gives the radial factors:
$$\phi_{nl}(r) = \int_0^{+\infty} h_n(x)\, j_l(rx)\, dx, \tag{22}$$
with $j_l(x)$ denoting the spherical Bessel function. Here, $h_n(x)$ is the nth member of any set of functions that are complete and orthonormal on the interval [0, +∞), such as the nth order polynomial function (i.e. polynomial factor of an exponential). The choice made in [66] is to use parabolic cylinder functions (see also another application [51]), i.e. functions with the even order Hermite polynomials as a factor. This is not the only possibility, and a more natural and convenient choice is based on the Laguerre polynomials $L_n(x)$. Define:
$$h_n(x) = \sqrt{2}\, L_n(2x)\, e^{-x}. \tag{23}$$
These polynomial functions are easy to use and lead to the following analytical expressions for the first two terms in the potential defined in (22), writing $V_{nl}(r)$ for the radial factor:
$$V_{00}(r) = \sqrt{2}\, \frac{\tan^{-1}(r)}{r}, \tag{24}$$
$$V_{10}(r) = \sqrt{2} \left[ \frac{\tan^{-1}(r)}{r} - \frac{2}{1 + r^2} \right]. \tag{25}$$
Furthermore, the higher n expressions of $V_{n0}(r)$ all resemble (25) (see [67] eq (23)):
$$V_{n0}(r) = \frac{\sqrt{2}}{r} \left( \tan^{-1}(r) + \sum_{k=1}^{n} (-1)^k\, \frac{\sin(2k \tan^{-1}(r))}{k} \right), \tag{26}$$
and analytical expressions of $V_{nl}(r)$ with non-zero l are also readily obtained by recurrence. These radial potentials can generally be expressed in terms of hypergeometric functions, whether the choice of polynomial is the present one, i.e. Laguerre, or Hermite polynomials, as in [66]. This structure has been used to confirm the results of [67] using a rapid code in C [68]. Spherical harmonics are translated using Talman's approach [69]. The displaced potential in one factor of the product of 'auxiliaries' from four-center integrals is readily expanded in two-center overlaps, after applying Euler's hypergeometric transformation [70, 71]. The auxiliary overlap integrals $\langle \rho(\mathbf{r}_1)\, \phi_i(\mathbf{r}_1) \rangle$ and $\langle \phi_i(\mathbf{r}_2)\, \rho(\mathbf{r}_2) \rangle$ will involve densities obtained from atomic orbitals centered on two different atoms in exchange and multi-center two-electron integrals. The overlap integrals required in an ETO basis are thus of the type:
$$\langle \psi_a(\mathbf{r}_1)\, \psi_b(\mathbf{r}_1)\, \phi_i(\mathbf{r}_1) \rangle = \sum_{\mu=0}^{\mu_{max}} N_\mu(n_1, n_2, n_i, l_i, |m_i|, \alpha, \beta)\; s(n_1, l_1, m, n_2, l_2, \alpha, \beta), \tag{27}$$
with $\alpha = \zeta_1 R$ and $\beta = \zeta_2 R$, where $\zeta_1$ and $\zeta_2$ are the Slater exponents. In three-center overlaps, $N_\mu$ is a normalised Racah coefficient [71]. In two-center cases the sum reduces to a single normalisation term, $N_0$. A Fourier transform approach is also being investigated, extending [67]. The real space core overlaps then take the form:
$$s(n_1, l_1, m, n_2, l_2, \alpha, \beta) = D_{l_1, l_2, m} \sum_{ij} Y^{\lambda}_{ij}\, A_i\!\left(\tfrac{1}{2}(\alpha + \beta)\right) B_j\!\left(\tfrac{1}{2}(\alpha - \beta)\right), \tag{28}$$
where $Y^{\lambda}_{ij}$ is a matrix with integer elements uniquely determined from n, l and m. $D_{l_1, l_2, m}$ is a coefficient that is independent of the principal quantum number. It is obtained upon expanding the product of two Legendre functions in this co-ordinate system. Symmetry conditions imply that only $m_1 = m_2 = m$ leads to non-zero coefficients:
$$A_i\!\left(\tfrac{1}{2}(\alpha + \beta)\right) = \int_1^{\infty} \exp\!\left(-\tfrac{1}{2}(\alpha + \beta)\mu\right) \mu^i\, d\mu, \tag{29}$$
$$B_j\!\left(\tfrac{1}{2}(\alpha - \beta)\right) = \int_{-1}^{1} \exp\!\left(-\tfrac{1}{2}(\alpha - \beta)\nu\right) \nu^j\, d\nu. \tag{30}$$
Here, recurrence relations on the auxiliary integrals A and B lead to those for the requisite core integrals [72, 73]. These integrals may be pre-calculated and stored. Such integrals appear for two-center exchange integrals and three- and four-center integrals (although just in one factor for three-center Coulomb terms). Note that exchange integrals require distinct orbitals ψa and ψb. In the atomic case, they must differ in at least one of n, l, m or ζ. In the two-center case, the functions centered at a and b may be the same: the product does not correspond to a single-center density, since it is two-centered. Equation (27) then illustrates the relationship to the one-electron two-center overlap integral, although it clearly includes the extra potential term from the Coulomb operator resolution. This tacitly assumes that the potential obtained from the Coulomb operator resolution is centered on one of the atoms. Whilst this choice can be made for one pair in a four-center product, it cannot be made for the second. There remains a single translation of this potential, in one of the two auxiliaries in a product representing a four-center integral, and none otherwise. This method obviates the need for efficient evaluation of the infinite series that arise from orbital translations. These series are eliminated in the Coulomb operator resolution approach, since only orbitals on two centers remain in the one-electron overlap-like auxiliaries. These can be evaluated with no orbital translation, in prolate spheroidal co-ordinates, or by Fourier transformation [67, 71].
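The auxiliary integrals of Eqs. (29) and (30) can be sketched in a few lines. The example below is a minimal illustration, not the STOP implementation: A_i(p) is generated by the upward recurrence A_i = (e^{-p} + i A_{i-1})/p obtained by integration by parts, while B_j(q) is summed from its power series in q, a choice made here because all terms for a given j have the same sign, so no cancellation occurs at small q. As a usage example, the textbook two-center overlap of two normalised 1s Slater orbitals, S = (ζa ζb)^{3/2} R^3/4 [A_2 B_0 - A_0 B_2], is evaluated. This formula and all function names are assumptions introduced for illustration and do not appear in the text.

```python
import math


def aux_A(nmax, p):
    """A_i(p) = int_1^inf mu^i exp(-p mu) dmu for i = 0..nmax, with p > 0."""
    if p <= 0.0:
        raise ValueError("p must be positive")
    e = math.exp(-p)
    a = [e / p]
    for i in range(1, nmax + 1):
        # Integration by parts: A_i = (exp(-p) + i * A_{i-1}) / p
        a.append((e + i * a[i - 1]) / p)
    return a


def aux_B(nmax, q, tol=1e-17):
    """B_j(q) = int_-1^1 nu^j exp(-q nu) dnu for j = 0..nmax, by power series."""
    b = []
    for j in range(nmax + 1):
        k = j % 2  # only powers k with j + k even survive the integration
        term = (-q) ** k / math.factorial(k)
        total = 0.0
        while True:
            contrib = 2.0 * term / (j + k + 1)
            total += contrib
            if abs(contrib) <= tol * abs(total):
                break
            term *= q * q / ((k + 1) * (k + 2))
            k += 2
        b.append(total)
    return b


def overlap_1s_1s(zeta_a, zeta_b, R):
    """Overlap of two normalised 1s STOs a distance R apart (atomic units)."""
    p = 0.5 * R * (zeta_a + zeta_b)  # (alpha + beta) / 2
    q = 0.5 * R * (zeta_a - zeta_b)  # (alpha - beta) / 2
    A = aux_A(2, p)
    B = aux_B(2, q)
    return (zeta_a * zeta_b) ** 1.5 * R ** 3 / 4.0 * (A[2] * B[0] - A[0] * B[2])


if __name__ == "__main__":
    # Equal exponents: compare with the closed form exp(-p) (1 + p + p^2/3)
    R, zeta = 1.402, 1.0
    p = zeta * R
    print(overlap_1s_1s(zeta, zeta, R), math.exp(-p) * (1.0 + p + p * p / 3.0))
```

Upward recurrence is stable for A_i. For B_j the analogous recurrence loses digits when j exceeds |q|, which is why the series is used in this sketch.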
8.5. Numerical Results of Coulomb Resolutions: Efficiency
First, a test system built up of four hydrogen atoms is studied. The second example is the full RHF calculation of CH3F using the Coulomb resolutions.

Consider the H2 molecule and its dimer/aggregates. In an s-orbital basis, all two-center integrals are known analytically, because they can be integrated by separating the variables in prolate spheroidal co-ordinates. A modest s-orbital basis is therefore chosen, simply to demonstrate accuracy on a rapid calculation for which some experimental data could be corroborated. The purpose of this section is to compare evaluations using the Coulomb resolution to the exact values, obtained analytically. The IBM Fortran compiler used is assumed to be reliable to 14 decimals in double precision. The worst values in the Coulomb resolution approximation have 10 correct decimals for two-center integrals with a 25-term sum. Timings are then compared for translation of a Slater type orbital basis to a single center (STOP) [59], for the Poisson equation solution using a DIM (diatomics-in-molecules, or atom pair) strategy, and for the overlap auxiliary method, to show that the last is by far the fastest approach for a given accuracy (the choice adopted is a sufficient six decimals, for convenient, reliable output).

The H2 molecule is taken with an interatomic distance of 1.402 atomic units (a.u.). One- and two-center Coulomb integrals may be obtained analytically and Coulomb resolution values compare well with them [66]. The two-center exchange integrals are dominated by an exponential of the interatomic distance and thus all have values close to 0.3. The table is not the full set. All index '15' terms, involving 1sa1(1) 1sb1(2), are given to illustrate symmetry relations. Note that this is by no means the best possible basis set for H2, since it is limited to l = 0 functions (simply to ensure that even the two-center exchange integral has an analytic closed form). The total energy obtained for the isolated H2 molecule is -1.1284436 a.u., as compared to a Hartree-Fock limit estimate of -1.1336296 a.u. Nevertheless, the Van der Waals well, observed at 6.4 a.u. with a depth of 0.057 kcal/mol (from Raman studies), is quite reasonably reproduced [74].

Table 1. Atomic exchange integrals (6 distinct single center values between pairs of different AOs).

AO      zeta        Label    [a(1)b(2)a′(2)b′(1)]    Value
1sa1    1.042999    1        1212                    0.720716
1sa2    1.599999    2        1313                    0.585172
2sa1    1.615000    3        1414                    0.610192
2sa2    1.784059    4        2323                    0.557878
1sb1    1.042999    5        2424                    0.607927
1sb2    1.599999    6        3434                    0.602141
2sb1    1.615000    7        2121                    0.720716
2sb2    1.784059    8        3232                    0.557878

Table 2. Two-center exchange integrals. All pair permutations possible. Some are identical by symmetry.

Labels    Value
1515      0.319902
1516      0.285009
1517      0.325644
1518      0.324917
1527      0.291743
1528      0.293736
1538      0.329543
2525      0.260034
2516      0.254814
2517      0.290533
2518      0.290149

Dimer geometry: rectangular and planar. The distance between two hydrogen atoms of neighboring molecules is 6 a.u. Note that this alone justifies the expression dimer: the geometry corresponds to two almost completely separate molecules. However, the method is applicable in any geometry (for 3 a.u., all three- and four-center integrals evaluated by Coulomb resolution agree with those of STOP to at least 6 decimals). Timings on an IBM RS6000 Power 6 workstation for the dimer (all four-center integrals, in msec): STOP: 12; POISSON: 10; OVERLAP: 2. The total dimer energy is -2.256998 a.u. This corresponds to a well depth of 0.069 kcal/mol, which may be considered reasonable in view of the basis set.

Table 3a. Orbital exponents.

AO No.    n    l    m    zeta
01        1    0    0    5.6727
02        2    0    0    1.6083
3-5       2    1    m    1.5679
06        1    0    0    8.5600
07        2    0    0    2.5600
8-10      2    1    m    2.5200
H         1    0    0    1.2400

Table 3b. Selected examples of three-center exchange integrals.

Integral                          Value
⟨2s_C 2s_F | 2s_C 1s_Ha⟩          0.497048510 × 10^-1
⟨2s_C 2s_F | 2s_C 1s_Ha⟩          0.842056635 × 10^-2
⟨2s_C 1s_F | 1s_C 1s_Ha⟩          0.573790540 × 10^-3
⟨2s_C 1s_F | 2s_C 1s_Ha⟩          0.378918525 × 10^-2
⟨1s_C 2pz_F | 2pz_C 1s_Ha⟩        0.158758344 × 10^-2
⟨2s_C 2pz_F | 2pz_C 1s_Ha⟩        0.525834208 × 10^-2
⟨2pz_C 1s_F | 1s_C 1s_Ha⟩         0.102532536 × 10^-2
⟨2pz_C 1s_F | 2s_C 1s_Ha⟩         0.677276818 × 10^-2
⟨1s_C 1s_F | 1s_C 1s_Ha⟩          0.109900118 × 10^-6
⟨1s_C 1s_F | 2s_C 1s_Ha⟩          0.679454131 × 10^-6
⟨1s_C 2s_F | 1s_C 1s_Ha⟩          0.144631297 × 10^-2
⟨1s_C 2s_F | 2s_C 1s_Ha⟩          0.423559085 × 10^-2
⟨2pz_C 2s_F | 1s_C 1s_Ha⟩         0.111210955 × 10^-1
⟨2pz_C 2s_F | 1s_C 1s_Ha⟩         0.673814908 × 10^-1

Integral                          Value
⟨2s_F 1s_Ha | 1s_F 2s_C⟩          0.101405594 × 10^-2
⟨2s_F 1s_Ha | 2s_F 2s_C⟩          0.934135949 × 10^-2
⟨2s_F 1s_Ha | 2pz_F 2s_C⟩         -0.844295091 × 10^-2
⟨2s_F 1s_Ha | 1s_F 2pz_C⟩         0.181323479 × 10^-2
⟨2s_F 1s_Ha | 2s_F 2pz_C⟩         0.137964387 × 10^-1
⟨2s_F 1s_Ha | 2pz_F 2pz_C⟩        -0.113501125 × 10^-1
⟨1s_Ha 2s_F | 1s_Ha 2s_C⟩         0.1252319411 × 10^-1
⟨1s_Ha 2s_F | 1s_Ha 2pz_C⟩        -0.159149899 × 10^-2
⟨1s_Ha 2pz_F | 1s_Ha 2pz_C⟩       0.177290873 × 10^-2
⟨1s_F 1s_Hb | 2s_F 1s_C⟩          0.228777210 × 10^-4
⟨1s_Hb 2s_F | 1s_Hb 1s_C⟩         0.193963837 × 10^-2
⟨2s_C 1s_Ha | 1s_C 1s_Hb⟩         0.2034841982 × 10^-1
⟨1s_C 1s_Ha | 1s_C 1s_Hb⟩         0.7154932331 × 10^-2
⟨2s_C 1s_Ha | 2s_C 1s_Hb⟩         0.1137390852
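The quoted well depth follows directly from the two total energies given above. The short sketch below reproduces the arithmetic, assuming the standard conversion factor of 627.5095 kcal/mol per hartree (a value not stated in the text) and taking the interaction energy as the dimer energy minus twice the monomer energy. The variable names are illustrative.

```python
# Conversion factor assumed here; it is not quoted in the text.
HARTREE_TO_KCAL_PER_MOL = 627.5095

e_monomer = -1.1284436  # a.u., isolated H2 in the s-orbital basis
e_dimer = -2.256998     # a.u., rectangular planar (H2)2

# Interaction energy: dimer energy minus two separated monomers
interaction_hartree = e_dimer - 2.0 * e_monomer
well_depth_kcal = -interaction_hartree * HARTREE_TO_KCAL_PER_MOL

print(f"interaction energy: {interaction_hartree:.7f} a.u.")
print(f"well depth: {well_depth_kcal:.4f} kcal/mol")
```

The difference of about 1.1 × 10^-4 a.u. corresponds to roughly 0.07 kcal/mol, consistent with the 0.069 kcal/mol quoted and of the same order as the experimental 0.057 kcal/mol.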
8.6. Selected Exchange Integrals for the CH3F Molecule (Evaluated Using the Coulomb Resolution)
Geometry and exponents are those of previous work [75]: tetrahedral angles, with C-H 2.067 and C-F 2.618 a.u. No symmetry is assumed, but geometric relationships are observed, as well as those due to m values, at least to the nano-Hartree accuracy chosen. For illustrative purposes, three-center exchange integrals are tabulated in a real basis. Timings on an IBM RS6000 Power 6 workstation for all two-electron integrals: STOP: 1.21 s, OVERLAP: 0.17 s. All the two-electron integrals agree to better than six significant figures with those obtained using the STOP software package [59]. The factor limiting precision in this study is the accuracy of input. The values of Slater exponents and geometric parameters are required to at least the accuracy demanded of the integrals, and the fundamental constants are needed to greater precision.
8.7. Conclusions
A remarkable gain in simplicity is provided by Coulomb operator resolutions [66], which now enable exponential type orbital translations to be completely avoided in ab initio molecular electronic structure calculations. The breakthrough that Coulomb resolutions represent in the ETO algorithm strategy (in particular with the convenient choice of Laguerre polynomials) stems from a well-controlled approximation, analogous to the resolution of the identity. The convergence has been shown to be rapid in all cases [67]. The applications to H2 dimer Van der Waals complexes and CH3F use a general code within the STOP package [59]. They show that the Coulomb resolution can be used to give fast and accurate results for basis sets of s and p Slater type orbitals. Generalisation is in progress. Numerical values for the H2 dimer geometry and interaction energy agree well with complete ab initio potential energy surfaces obtained using very large Gaussian basis sets and data from vibrational spectroscopy [74].
9. Explicitly Correlated Methods for Molecules
The application and development of such methods to determine accurately the ground and excited states and the properties of diatomic and triatomic molecules is very promising, and more interesting for the computational chemist than the atomic case. There is nowadays a growing interest in this field. Subroutines and programs which perform these calculations are often requested in the community. The investigation of these integrals should be approached within the Molecular Orbital (MO) method [76], because the MO wave function is the simplest wave function for a molecular system. As Coulson [77] discussed, the MO method permits the visualization of electrons and nuclei and the interpretation of individual electrons and their orbital exponents better than wave functions written in elliptical coordinates. The wave functions constructed with elliptical orbitals are of two types: the so-called James-Coolidge [78] wave functions (one exponent, alpha), recently extended to the two-alpha case [33], and Kolos-Wolniewicz [79, 80] wave functions (with both orbital exponents, alpha and beta). Both have been applied to the H2 molecule. The elliptical wave functions are the natural representation of a two-center problem, but for three-center and larger molecules the use of the MO method becomes necessary. Frost [81] used the MO method and the Correlated Molecular Orbital (CMO) method in H2 calculations. About the extension of the method he wrote: "The extension of CMO-type wave functions to more complex molecules does not seem feasible at the present time. The new integrals which will be introduced would involve more than two centers if more nuclei were involved and higher atomic orbitals than 1s if more electrons were considered, and their evaluation would be extremely difficult".
Recently, impressive calculations using Hylleraas wave functions have been done for H2 (see Table 4). Hylleraas [33], Iterative Complement Interaction (ICI) [82] and explicitly correlated Gaussian (ECG) [83] calculations of the hydrogen molecule, Hylleraas calculations on HeH+ and some other species [84], leading to −2.9710784698 a.u. using 9576 configurations, and calculations of He2 using 4800 optimized ECG configurations, with energy −5.80748359014 a.u. [83], have achieved the highest known accuracy in molecules. Picohartree accuracy exceeds that of chemical measurements (e.g. a micro cm^-1, a nano eV or a micro cal/mol), although one must recall that, according to Drake [85], only half of the digits of the energy are retained in the calculation of properties. Note also that input exponents, distances and some fundamental constants may limit the accuracy of calculations compared with measurements, and that molecules may not be rigid. Eventually, dynamics and the effect of the Born-Oppenheimer approximation should be included. Hylleraas-CI (Hy-CI) was applied in 1976 to the LiH molecule by Clary [32] using elliptical STOs. For two-center molecules, the three-electron and four-electron integrals occurring in Hy-CI have been developed by Budzinski [39]. Another type of explicitly correlated wave function uses Gaussian orbitals. Clementi et al. extended Hy-CI to molecules using Gaussian orbitals [86] and applied it to the calculation of H3. The ECG wave function is also appropriate for molecules [83, 87], since the inter-electronic distance r12 enters through a Gaussian exponent. This leads to results which are comparable with Hylleraas calculations [83].
The R12 wave function proposed by Kutzelnigg and Klopper [88, 89] has the merits of fulfilling the cusp condition, of using Gaussian functions, thereby avoiding the three- and four-center integration problems, and of including r12 explicitly for electrons 1 and 2, close to the nucleus, where the probability that r12 = 0 is larger. These electrons are also present in any system from the helium atom onwards. The r12 variation influences the energy. The R12 wave function, developed for molecular calculations, is nowadays widely used and combined with all kinds of methods. The three- and four-electron integrals that occur are calculated in terms of two-electron ones. Due to the use of a single r12 value, the accuracy achieved for atomic calculations is lower than that of Hy and Hy-CI calculations. Recent improvements of the method [90]-[92] can achieve microhartree-accurate energies for chemically interesting systems.

Table 4. Highly accurate calculations on the H2 molecule with different types of wave functions at R = 1.4011 a.u.

Year    Authors                 Type of w.f.    Confs.    Energy (a.u.)
1933    James and Coolidge      JC              5         -1.1735
1960    Kolos and Roothaan      KR                        -1.17214
1968    Kolos and Wolniewicz    KW                        -1.174475
1995    Wolniewicz              KW              833       -1.17447467
2006    Sims and Hagstrom       JC              7034      -1.17447593139984
2007    Nakatsuji               ICI             6776      -1.17447571400027
2008    Cencek and Szalewicz    ECG,opt         4800      -1.17447571400135
Short wave function expansions lead to very good results. When a large number of configurations is used (up to 10000), the energies are accurate beyond the picohartree level, whereas a CI wave function would need on the order of millions of configurations.
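To put picohartree accuracy in perspective, the sketch below converts one picohartree into the units mentioned above and computes the gap between the ICI and ECG energies listed in Table 4. The conversion factors (219474.6313632 cm^-1, 27.211386 eV and 627.5095 kcal/mol per hartree) are standard values assumed here, not quoted in the text. Python's decimal module is used so that the 15-digit energies are subtracted exactly.

```python
from decimal import Decimal

# Standard conversion factors per hartree (assumed, not from the text)
HARTREE_IN_CM1 = Decimal("219474.6313632")
HARTREE_IN_EV = Decimal("27.211386")
HARTREE_IN_KCAL_MOL = Decimal("627.5095")

PICO = Decimal("1e-12")
NANO = Decimal("1e-9")
MICRO = Decimal("1e-6")

# One picohartree expressed in micro cm^-1, nano eV and micro cal/mol
pico_in_micro_cm1 = PICO * HARTREE_IN_CM1 / MICRO
pico_in_nano_ev = PICO * HARTREE_IN_EV / NANO
pico_in_micro_cal_mol = PICO * HARTREE_IN_KCAL_MOL * 1000 / MICRO

# Energies of the last two rows of Table 4 (a.u.)
e_ici = Decimal("-1.17447571400027")
e_ecg = Decimal("-1.17447571400135")
gap = e_ici - e_ecg

print("1 pEh =", pico_in_micro_cm1, "micro cm^-1")
print("1 pEh =", pico_in_nano_ev, "nano eV")
print("1 pEh =", pico_in_micro_cal_mol, "micro cal/mol")
print("ICI - ECG =", gap, "a.u.")
```

One picohartree is thus a fraction of each of the quoted units, and the two most recent entries of Table 4 differ by about one picohartree.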
10. Highly Accurate Calculations Using STOs
Another problem appearing in these calculations is digital erosion. When many operations are carried out and numbers of similar value are subtracted, some digits can be lost, leading to erroneous results. Quadruple precision avoids this: about 30 decimal digits are correct on our computer. Another possibility is high precision arithmetic software. Some programs are available, such as Bailey's MPFUN [93] and the Brent and Miller program packages [94, 95]. One present-day example of the use of Slater orbitals is the highly accurate calculation of small molecules using explicitly correlated wave functions, i.e. wave functions in which the inter-electronic coordinate r_ij is included explicitly. These are the Hylleraas and Hylleraas-CI wave functions and the ICI method, which may be compared with explicitly correlated Gaussians (ECG) and the R12 method.
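Digital erosion is easy to reproduce. The sketch below subtracts two nearly equal square roots in double precision and loses every significant digit. It then repeats the subtraction with 34 decimal digits using Python's decimal module, used here only as a stand-in for quadruple precision or a package such as MPFUN, and finally with an algebraically equivalent form that avoids the subtraction. The test function and all names are chosen for illustration and are not taken from the text.

```python
import math
from decimal import Decimal, localcontext


def diff_sqrt_float(x):
    """sqrt(x + 1) - sqrt(x) in double precision (prone to cancellation)."""
    return math.sqrt(x + 1.0) - math.sqrt(x)


def diff_sqrt_decimal(x, digits=34):
    """Same subtraction carried out with `digits` decimal digits (x an int)."""
    with localcontext() as ctx:
        ctx.prec = digits
        xd = Decimal(x)
        return (xd + 1).sqrt() - xd.sqrt()


def diff_sqrt_stable(x):
    """Equivalent form 1 / (sqrt(x + 1) + sqrt(x)): no subtraction involved."""
    return 1.0 / (math.sqrt(x + 1.0) + math.sqrt(x))


if __name__ == "__main__":
    print("double precision:", diff_sqrt_float(1e16))
    print("34 decimal digits:", diff_sqrt_decimal(10 ** 16))
    print("rearranged, double:", diff_sqrt_stable(1e16))
```

In double precision 1e16 + 1 is not representable, so the naive difference collapses, while the extended-precision and rearranged evaluations both recover a value close to 5 × 10^-9. Higher precision and reformulation are complementary remedies.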
11. Closing Remarks
We conclude with the words of G. Berthier: "GTOs are like medicine, you have to use them as long as they are healing, but once they don't work any more, you must change them" (Gaston Berthier, Interview, Paris, 2nd June 1997). Recently, a whole book, "Recent Advances in Computational Chemistry: Molecular Integrals over Slater Orbitals", was dedicated to a mathematical review of methods of integration over Slater orbitals and Hylleraas wave functions [96].
Acknowledgements

The authors would like to thank very much Profs. Milan Randic, Ante Graovac, Roberto Todeschini and Peter Otto for their interest in Slater orbitals.
References

[1] Slater J.C. Central fields and Rydberg formulas in wave mechanics. Phys. Rev. 1928 31, 333-343.
[2] Randic M. International Conference on ETO Multicenter Molecular Integrals, Tallahassee, 1981, C. A. Weatherford and H. Jones Eds.: Reidel, Dordrecht, 1982, pp. 141-155.
[3] Scrocco E. and Salvatti O. Ric. Sci. 1951 21, 1629; ibid 1952 22, 1766; ibid 1953 23, 98.
[4] Petrongolo C., Scrocco E. and Tomasi J. Minimal-Basis-Set LCAO SCF MO calculations for the ground state of O3, NO2, NOF, and OF2 molecules. J. Chem. Phys. 1968 48, 407-411, and Refs. therein.
[5] Roothaan C.C.J. A study of two-center integrals useful in calculations on molecular structure. I. J. Chem. Phys. 1951 19, 1445-1458.
[6] Wahl A.C., Cade P.E., and Roothaan C.C.J. Study of two-center integrals useful in calculations on molecular structure. V. General methods for diatomic integrals applicable to digital computers. J. Chem. Phys. 1964 41, 2578-2599.
[7] Rüdenberg K. A study of two-center integrals useful in calculations of molecular structure. II. The two-center exchange integrals. J. Chem. Phys. 1951 19, 1459-1477.
[8] Kotani M., Amemiya A. and Simose T. Tables of integrals useful for the calculations of molecular energies. Proc. Phys. Math. Soc. Japan 1938 20 extra No. 1, 1-70; Kotani M., and Amemiya A. Tables of integrals useful for the calculations of molecular energies. II. Proc. Phys. Math. Soc. Japan 1940 22 extra No. 1, 1-28.
[9] Barnett M.P., and Coulson C.A. The evaluation of integrals occurring in the theory of molecular structure. Parts I & II. Phil. Trans. R. Soc. Lond. A 1951 243, 221-249.
[10] Löwdin P.O. Quantum theory of cohesive properties of solids. Adv. Phys. 1956 5, 1-172.
[11] Harris F.E. Gaussian wave functions for polyatomic molecules. Rev. Mod. Phys. 1963 35, 558-568.
[12] Mulliken R.S, and Roothaan C.C.J. Broken bottlenecks and the future of molecular Quantum Mechanics. Proc. Natl. Acad. Sci. (U.S.) 1959 45, 394-398.
[13] Boys S.F. Electronic wave functions. I. A general method of calculation for the stationary states of any molecular system. Proc. Roy. Soc. (London) 1950 A200, 542-554.
[14] Shavitt I., and Karplus M. Multicenter integrals in molecular Quantum Mechanics. J. Chem. Phys. 1962 36, 550-551.
[15] Clementi E. and Raimondi D.L. Atomic screening constants from SCF functions. J. Chem. Phys. 1963 38, 2686-2689.
[16] Scherr C.W. An SCF LCAO MO study of N2. J. Chem. Phys. 1955 23, 569-578.
[17] Smith S.J. and Sutcliffe B.T. The Development of Computational Chemistry in the United Kingdom, in Reviews in Computational Chemistry, Lipkowitz K.B. and Boyd D.B. Eds. 1997 10 pp. 271-316.
[18] POLYATOM: Newmann D.B., Basch H., Korregay R.L., Snyder L.C., Moskowitz J., Hornback C., and Liebman P. Quantum Chemistry Program Exchange, Indiana University, No. 199; Csizmadia I.G., Harrison M.C., Moscowitz J.W., Sutcliffe B.T. Non-empirical LCAO-MO-SCF-CI calculations on organic molecules with Gaussian type functions. Theoret. Chim. Acta (Berl.) 1966 6, 191-216.
[19] IBMOL: David D.J. CDC 6600 Version. Technical Report of C.C. ENSJF and Lab. de Chimie ENS, Paris 1969; Clementi E., Davis D.R. Electronic structure of large molecular systems. J. Comput. Phys. 1967 1, 223-244; Veillard A. IBMOL: Computation of wave-functions for molecules of general geometry, Version 4; IBM Research Laboratory, San Jose.
[20] McLean A.D., Yoshimine M., Lengsfield B.H., Bagus P.S. and Liu B. ALCHEMY II: A Research Tool for Molecular Electronic Structure and Interactions, in Modern Techniques in Computational Chemistry, MOTECC 91, Clementi E. Ed., Elsevier B.V. (Leiden): 1991 pp. 233-353.
[21] DERIC: Diatomic Integrals over Slater-Type Orbitals. Hagstrom S.A. Quantum Chemistry Program Exchange, Indiana University No. 252, 1974.
[22] GAUSSIAN 70: Hehre W. J., Lathan W.A., Ditchfield R., Newton M.D., and Pople J.A. Gaussian 70, Quantum Chemistry Program Exchange, Program No. 237 1970.
[23] STOP: A Slater-type orbital package for molecular electronic structure determination. Bouferguene A., Fares M., and Hoggan P.E., Int. J. Quantum Chem. 1996 57, 801-810.
[24] Homeier H.H.H., Joachim Weniger E., and Steinborn E.O. Programs for the evaluation of overlap integrals with B functions, Comp. Phys. Comm. 1992 72, 269-87; Homeier H.H.H and Steinborn E.O. Programs for the evaluation of nuclear attraction integrals with B functions, Comp. Phys. Comm. 1993 77, 135-151.
[25] SMILES: Fernández Rico J., López R., Ema I., and Ramírez G. Reference program for molecular calculations with Slater-type orbitals. J. Comp. Chem. 1998 19(11), 1284-1293; ibid Electrostatic potentials and fields from density expansions of deformed atoms in molecules. J. Comp. Chem. 2004 25, 1347-1354.
[26] CADPAC: The Cambridge Analytic Derivatives Package. Amos R.D., Alberts I.L., Andrews J.S., Colwell S.M., Handy N.C., Jayatilaka D., Knowles P.J., Kobayashi R., Laidig K.E., Laming G., Lee A.M., Maslen P.E., Murray C.W., Rice J.E., Simandiras E.D., Stone A.J., Su M.D. and Tozer D.J, Cambridge, 1995.
[27] ADF: Amsterdam Density Functional, available at http://www.scm.com/.
[28] ATMOL: Atomic and Molecular CI calculations. Bunge C.F. et al, 1965-2005.
[29] Pinchon D. and Hoggan P.E. Rotation matrices for real spherical harmonics: general rotations of atomic orbitals in space-fixed axes. J. Phys. A: Math. Theor. 2007 40, 1597-1610.
[30] Ozdogan T. Unified treatment for the evaluation of arbitrary multicenter molecular integrals over Slater-type orbitals with noninteger principal quantum numbers. Int. J. Quantum Chem. 2003 92, 419-427.
[31] Hagstrom S.A. and Shull H. The nature of the two-electron chemical bond. III. Natural orbitals for H2. Rev. Mod. Phys. 1963 35, 624-629.
[32] Clary D.C. Variational calculations on many-electron diatomic molecules using Hylleraas-type wave functions. Mol. Phys. 1977 34, 793-811.
[33] Sims J.S. and Hagstrom S.A. High precision variational calculations for the Born-Oppenheimer energies of the ground state of the hydrogen molecule. J. Chem. Phys. 2006 124, 094101.
[34] Ruiz M.B. and Peuker K. Mathematical techniques in molecular calculations using Slater orbitals, in Recent Advances in Computational Chemistry: Molecular Integrals over Slater Orbitals, Özdoğan T. and Ruiz M.B. Eds., Transworld Research Network: Kerala, India, 2008 pp. 100-144.
[35] Filter E. and Steinborn E.O. Extremely compact formulas for molecular two-center one-electron integrals and Coulomb integrals over Slater-type atomic orbitals. Phys. Rev. A 1978 18, 1-11.
[36] Avery J. Hyperspherical Harmonics and Generalized Sturmians, Kluwer: Boston, 2000; Avery J. Many-center Coulomb Sturmians and Shibuya-Wulfman integrals. Int. J. Quantum Chem. 2004 100, 121-130; Red E. and Weatherford C.A. Derivation of a general formula for the Shibuya-Wulfman matrix. Int. J. Quantum Chem. 2004 100, 208-213.
[37] Shull H. and Löwdin P.O. Superposition of configurations and natural spin orbitals. Applications to the He problem. J. Chem. Phys. 1959 30, 617-626.
[38] Sims J.S. and Hagstrom S.A. Combined Configuration-Interaction-Hylleraas-type wave-function study of the ground state of the beryllium atom. Phys. Rev. A 1971 4, 908-916; Sims J.S. and Hagstrom S.A. One-center r_ij integrals over Slater-type orbitals. J. Chem. Phys. 1971 55, 4699-4710.
[39] Budzinski J. Evaluation of two-center, three- and four-electron integrals over Slater-type orbitals in elliptical coordinates. Int. J. Quantum Chem. 2004 97, 832-843.
[40] Smeyers Y.G. About evaluation of many-center molecular integrals. Theoret. Chim. Acta (Berl.) 1966 4, 452-459.
[41] Harris F.E. and Michels H.H. Multicenter integrals in Quantum Mechanics. I. Expansion of Slater-type orbitals about a new origin. J. Chem. Phys. 1965 43, S165-S169.
[42] Guseinov I.I. Unified analytical treatment of multicenter multielectron integrals of central and noncentral interaction potentials over Slater orbitals using Ψα-ETOs. J. Chem. Phys. 2003 119, 4614-4619.
[43] Boys S.F., Cook G.B., Reeves C.M., and Shavitt I. Automatic Fundamental Calculations of Molecular Structure. Nature 1956 178, 1207-1209.
[44] Shavitt I. Methods in Computational Physics 2, NY, Eds. Adler, S.Fernbach, M. Rotenberg, Academic Press: NY 1963 pp. 1-45.
[45] Berlu L., Safouhi H. and Hoggan P. Fast and accurate evaluation of three-center, two-electron Coulomb, hybrid and three-center nuclear attraction integrals over Slater-type orbitals using the S transformation. Int. J. Quantum Chem. 2004 99, 221-235.
[46] Safouhi H. and Berlu L. The Fourier transform method and the SD approach for the analytical and numerical treatment of multicenter overlap-like quantum similarity integrals. J. Comp. Phys. 2006 216, 19-36.
[47] Levin D., Sidi A. Two new classes of nonlinear transformations for accelerating the convergence of infinite integrals and series. Appl. Math. Comp. 1981 9, 175-215; Sidi A. The numerical evaluation of very oscillatory infinite integrals by extrapolation. Math. Comp. 1982 38, 517-529.
[48] Guseinov I.I. New complete orthonormal sets of exponential-type orbitals and their application to translation of Slater orbitals. Int. J. Quantum Chem. 2002 90, 114-118.
[49] Özdoğan T., and Ruiz M.B. Evaluation of three-center repulsion integrals over Slater orbitals, in preparation.
[50] Fernández Rico J., López R., Ramírez G., Ema I. and Ludeña E.V. Analytical method for the representation of atoms-in-molecules densities. J. Comp. Chem. 2004 25, 1355-1363.
[51] Pinchon D. and Hoggan P.E. Gaussian approximation of exponential type orbitals based on B functions. Int. J. Quantum Chem. 2009 109, 135-148.
[52] Kato T. On the eigenfunctions of many-particle systems in quantum mechanics. Commun. Pure Applied Math. 10 151-177.
[53] Kato T. Schrödinger Operators, Graffi Ed., Springer Verlag: Berlin, 1985 pp. 1-38.
[54] Agmon S. Lectures on Exponential Decay of Solutions of Second Order Elliptic Equations: Bounds on Eigenfunctions of N-Body Schrödinger Operators, Princeton University: Princeton, NJ, 1982.
[55] Stevens R.M. Geometry optimization in the computation of barriers to internal rotation. J. Chem. Phys. 1970 52, 1397-1402.
[56] Jones H.W. International Conference on ETO Multicenter Integrals, (Tallahassee, USA 1981), Weatherford C.A. and Jones H.W. Eds., Reidel: Dordrecht 1982; Int. J. Quantum Chem., Special Issue in memory of Jones H.W., Weatherford C.A. and Jones H.W. Eds., 2004 100(2), pp. 63-243.
[57] Slater J.C. Atomic shielding constants. Phys. Rev. 1930 36, 57-64.
[58] Slater J.C. Analytic atomic wave functions. Phys. Rev. 1932 42, 33-43.
[59] Bouferguène A. and Hoggan P.E., QCPE, Programme No. 667, 1996.
[60] Weatherford C.A., Red E., Joseph D. and Hoggan P.E. Poisson's equation solution of Coulomb integrals in atoms and molecules. Mol. Phys. 2006 104, 1385-1389.
[61] Avery J. and Avery J. Hyperspherical Harmonics and Generalized Sturmians, Kluwer: Boston 2007.
[62] Tully J.C. Diatomics-in-molecules potential energy surfaces. I. First-row triatomic hydrides. J. Chem. Phys. 1973 58, 1396-1410.
[63] Weatherford C.A., Red E. and Hoggan P.E. Solution of Poisson's equation using spectral forms. Mol. Phys. 2005 103, 2169-2172.
[64] Cesco J.C., Pérez J.E., Denner C.C., Giubergia G.O. and Rosso A.E. Error bounds for direct evaluation of four-center molecular integrals. Applied Num. Math. 2005 55(2), 173-190, and references therein.
[65] Shao Y., White C.A. and Head-Gordon M. Efficient evaluation of the Coulomb force in density-functional theory calculations. J. Chem. Phys. 2001 114, 6572-6577.
[66] Varganov S.A., Gilbert A.T.B., Deplazes E. and Gill P.M.W. Resolutions of the Coulomb operator. J. Chem. Phys. 2008 128, 201104.
[67] Gill P.M.W. and Gilbert A.T.B. Resolutions of the Coulomb Operator. II. The Laguerre Generator, Chem. Phys. 2009 356, 86-92.
[68] Pinchon D. and Hoggan P.E., unpublished code.
[69] Talman J.D. Expression for overlap integrals of Slater orbitals. Phys. Rev. A 1993 48, 243-249.
[70] Whittaker E.T. and Watson G.N., A Course in Modern Analysis, Cambridge University Press, 4th Ed.: Cambridge, England, 1990.
[71] Pinchon D. and Hoggan P.E., Translating Coulomb potentials over Slater type orbitals, in preparation.
[72] Guseinov I.I., Ozmen A., Atav U., Uksel H. Computation of overlap integrals over Slater-type orbitals using auxiliary functions. Int. J. Quantum Chem. 1998 67, 199-204.
[73] Hoggan, P.E. DSc Thesis, 1991, Theoretical study of physisorption and its influence on reactivity. Appendix 2. (in French, available at INIST).
[74] Hinde R.J. A six-dimensional H2-H2 potential energy surface for bound state spectroscopy. J. Chem. Phys. 2008 128, 154308.
[75] Absi N. and Hoggan P.E. Analytical evaluation of molecular electronic integrals using Poisson's equation: Exponential-type orbitals and atom pairs. Int. J. Quantum Chem. 2006 106, 2881-2888.
[76] Mulliken R.S. Electronic structures of polyatomic molecules and valence VI. On the method of Molecular Orbitals. J. Chem. Phys. 1935 3, 375-378.
[77] Coulson C.A. The energy and screening constants of the hydrogen molecule. Trans. Faraday Soc. 1937 33, 1479-1492.
[78] James H.M. and Coolidge A.S. The ground state of the hydrogen molecule. J. Chem. Phys. 1933 1, 825-835.
[79] Kolos W. and Roothaan C.C.J. Correlated orbitals for the ground state of the hydrogen molecule. Rev. Mod. Phys. 1960 32, 205-210.
[80] Kolos W. and Wolniewicz L. Potential curves for the $X\,^1\Sigma_g^+$, $b\,^3\Sigma_u^+$, and $C\,^1\Pi_u$ states of the hydrogen molecule. J. Chem. Phys. 1965 43, 2429-2441; ibid 1968 49, 404-410.
[81] Frost A.A. and Braunstein J. Hydrogen molecule energy calculation by correlated molecular orbitals. J. Chem. Phys. 1951 19, 1133-1138.
[82] Kurokawa Y., Nakashima H. and Nakatsuji H. Free iterative-complement-interaction calculations of the hydrogen molecule. Phys. Rev. A 2005 72, 062502.
[83] Cencek W. and Szalewicz K. Ultra-high accuracy calculations for hydrogen molecule and helium dimer. Int. J. Quantum Chem. 2008 108, 2191-2198.
[84] Zhou B.L., Zhu J.M., and Yan Z.C. Ground state energy of HeH+. Phys. Rev. A 2006 73, 064503.
[85] Drake G.W.F. High precision theory of atomic helium. Phys. Scr. 1999 T83, 83-92.
[86] Frey D., Preiskorn A., Lie G.C., and Clementi E. HYCOIN: Hylleraas Configuration Interaction method using Gaussian functions, in Modern Techniques in Computational Chemistry: MOTECC-90, Clementi E. Ed., ESCOM Science Publ.: Leiden 1990, Chapter 5, pp. 57-97.
[87] Rychlewski J. and Komasa J. Explicitly correlated functions in variational calculations, in Explicitly Correlated Wave Functions in Chemistry and Physics, Rychlewski J. Ed., Kluwer Academic Publishers: Netherlands 2004, pp. 91-147.
[88] Kutzelnigg W. r12-dependent terms in the wave function as closed sums of partial wave amplitudes for large l. Theor. Chim. Acta 1985 68, 445-469.
[89] Klopper W. and Kutzelnigg W. Møller-Plesset calculations taking care of the correlation cusp. Chem. Phys. Lett. 1987 134, 17-22.
[90] Klopper W. and Noga J. Linear R12 terms in Coupled Cluster theory, in Explicitly Correlated Wave Functions in Chemistry and Physics, Rychlewski J. Ed., Kluwer Academic Publishers: Netherlands 2004, pp. 149-183.
[91] Cardoen W., Gdanitz R.J., and Simons J. Transition-state energy and geometry, exothermicity, and van der Waals wells on the F + H2 → FH + H ground-state surface calculated at the r12-ACPF-2 level. J. Phys. Chem. A 2006 110, 564-571.
[92] Explicitly Correlated Wave Functions in Chemistry and Physics, Rychlewski J. Ed., Kluwer Academic Publishers: Netherlands, 2004.
[93] Bailey D.H. High-precision software directory. Available from: http://crd.lbl.gov/~dhbailey/mpdist/mpdist.html
[94] Brent R.P. A Fortran multiple-precision arithmetic package, ACM Transactions on Mathematical Software (TOMS), v.4 n.1, pp.57-70, 1978.
[95] Miller A.J. Alan Miller's Fortran software. Available from: http://users.bigpond.net.au/amiller/
[96] Recent Advances in Computational Chemistry: Molecular Integrals over Slater Orbitals, Özdoğan T. and Ruiz M.B. Eds., Transworld Research Network: Kerala, India, 2008.
In: Electrostatics: Theory and Applications Editor: Camille L. Bertrand, pp. 91-110
ISBN 978-1-61668-549-2 © 2010 Nova Science Publishers, Inc.
Chapter 5
TUNNELING DYNAMICS AND ITS SIGNATURES IN COUPLED SYSTEMS
S. Ghosh and S.P. Bhattacharyya∗
Department of Physical Chemistry, Indian Association for the Cultivation of Science, Jadavpur, Calcutta 700 032, INDIA
Abstract
It has been demonstrated through numerical experiments that tunneling in a symmetric double well may be either suppressed or enhanced by a Morse oscillator coupled to it, depending on the form of the coupling. An external time varying electric field that causes the 0+ − 0− transition in the double well affects the tunneling rate. A well defined minimum in the rate is observed for λ = λc, for which maximum energy transfer from the double well to the Morse mode takes place. If the field is chosen to cause the 0 → 1 transition in the Morse mode, the dissociation probability is influenced by the tunneling mode, being maximized for a particular value λ = λ′c. In the uncoupled system, a field of similar intensity practically fails to cause any dissociation. The tunneling dynamics is also analyzed for a particle with a coordinate dependent mass, as is often assumed for charge transport in heterostructures. The tunneling time, and hence the rate, is seen to be very significantly affected by the nature of the coordinate dependence of the tunneling mass.
1. Introduction
As a purely quantum phenomenon, tunneling is ubiquitous in microscopic systems. It pervades all areas of physics, chemistry and biology, and ever since its use in nuclear physics to explain the emission of α particles from atomic nuclei, its importance has been increasingly recognized. The study of electron tunneling in condensed matter physics has led to the Josephson effect and the tunneling diode [1]. The tunneling of a proton or a hydrogen atom has often been invoked in chemistry to explain unusual features of chemical reaction rates or mechanisms. In fact, atom tunneling has been recognized to be important in biology and in the science of various types of materials. Starting from atom tunneling reactions in quantum solid hydrogen or in solid or liquid helium, there have been studies of rather unusual aspects of tunneling reactions of organic substances, tunneling insertion reactions of carbenes and heavy particle tunneling. The roles of atom tunneling reactions of vitamin C in the suppression of mutation, and of vitamin E in its antioxidant, pro-oxidant and regeneration reactions, have attracted serious attention from biologists. The possible occurrence of spontaneous tunneling elimination of hydrogen molecules from hydrocarbon cations has led to serious questions about our conventional idea of a stable chemical structure [2]. The relevance of tunneling in the interpretation of molecular and crystal structure in very low temperature regimes can hardly be overestimated. It is appropriate, therefore, to review briefly how the idea of tunneling was developed and exploited in different fields of science.

∗ E-mail address: [email protected]
2. Historical Development
A. Tunneling in Physics

The theory of α radioactivity proposed by Gamow [3] explained the law of exponential decay, $P(t) = N(t)/N_0 = e^{-\Gamma t}$, by solving the Schrödinger equation for the α particle inside the nucleus, where the attractive nuclear force and the Coulomb repulsion were assumed to provide an effective barrier confining the α particle. Gamow imposed the 'outgoing wave' boundary condition at large distances from the center of the nucleus and found that the Schrödinger equation has no solutions for real energies, while it does have solutions for complex energies. Gamow interpreted the imaginary part of the energy as the decay width $\Gamma/2$ and obtained a relation between $\Gamma/2$ and the energy of the emitted α particle. The use of a complex energy meant the use of a non-hermitian hamiltonian, and the idea was criticized because quantum mechanics works with hermitian operators. The same result was later obtained by Bohr by considering states with real energies and working with a hermitian hamiltonian [4]. The idea of resonant tunneling was introduced by Gurney, who realized that particles with low energies that match the quasi-stationary energies of the nucleus could easily penetrate the barrier [5]. The importance of this idea in artificial disintegration can hardly be overestimated.

The idea of tunneling was soon exploited in other areas of physics. Notably, many attempts were made to relate the dynamics of the electron current in metal-semiconductor systems to the tunneling of electrons in solids. The discovery of transistors in 1947 rekindled interest in the tunneling of electrons in solids, the occurrence of which was conclusively proved by L. Esaki (1957), who discovered the tunneling diode [6]. Close on the heels of the discovery of the tunneling diode, Giaever found that if one or both of the metals are superconducting, the voltage-current plots can be used to measure the energy gaps in superconductors [7]. Josephson discovered that superconductors separated by a thin layer of insulating oxide provide a system in which a second current (over and above Giaever's current), the so-called supercurrent, exists, and that it is caused by the tunneling of electrons in pairs [8].
B. Tunneling in Chemistry

Tunneling has a long history in chemical kinetics. Traditionally, curvature in Arrhenius plots of rate constants has been interpreted as a signature of tunneling. It is generally very difficult to observe such curvature in Arrhenius plots for gas phase chemical reactions, while it is a commonplace occurrence in chemical reactions in condensed phases below 100 K. Accurate theoretical calculations, however, indicate that tunneling can contribute significantly to the reaction rates even at room temperature, where the Arrhenius plots are very nearly linear [8]. In fact, 'tunneling' can be taken to be synonymous with a chemical reaction occurring at energies less than the barrier energy ($E_a$). The barrier arises on the $3N - 6$ dimensional potential energy surface in the Born-Oppenheimer approximation as the nuclei of the $N$-atom reactive system move, breaking and making bonds. Traditionally one identifies the one dimensional minimum energy path ($x$) as the reaction path, along which an effective potential energy $V_{eff}(x)$ is defined by adding the vibrational energies $\varepsilon(x)$ associated with the nuclear motion perpendicular to the reaction path [9]. This vibrationally adiabatic approximation reduces the multidimensional problem to a one dimensional problem, for which tunneling has a unique definition. The tunneling probabilities calculated in this way are generally somewhat too small owing to the neglect of the reaction path curvature. Several alternative approaches have been explored for a better representation of the tunneling path, for example, the least action ground state method [10] and the 'tunneling tube method' [11].

The barrier energy on the reaction path can be identified with the activation energy ($E_a$) in the Arrhenius rate law $k = A e^{-E_a/k_B T}$, which has been recognized as the central law of chemical kinetics. Many attempts have been made to derive this law. The transition state theory (TST) of Eyring [12] was the first attempt in this direction. The TST was derived assuming complete thermodynamic equilibrium, wherein the possibility of reverse transition was neglected; this was later taken care of in an ad hoc manner by introducing a transmission coefficient in the pre-exponential factor. Kramers [13] treated an elementary rate process in the presence of a medium as a generalized Brownian motion in a potential field and showed how the rate constant would be influenced by the viscosity of the medium. Kramers' results have been derived from the TST applied to an extended system of the reactants and a reservoir of oscillators which provide a frictional force and a random force along the reactive mode [14]. Kim and Hynes [15] and Truhlar et al. [16] introduced additional coordinates describing the dynamics of the solvent mode along with the reagent mode within the framework of TST and derived results equivalent to those obtained by Kramers.

Chemical reactions at sufficiently low temperatures are marked by remarkably special features. Goldanskii [17] showed that the rate constant of a chemical process has a non-zero low temperature limit as $T \to 0$ and introduced the concept of a crossover temperature ($T_c$), which divides the whole temperature interval into over-barrier and under-barrier regimes. Goldanskii assumed that the tunneling particle moves through the saddle point for obtaining $T_c$. A tunneling particle may not, however, move in the traditional classical manner along the adiabatic reaction coordinate through the first order saddle point, but may take a short cut wherein it experiences a higher barrier but a shorter tunneling length.
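As a small numerical illustration of the Arrhenius law quoted above, the following sketch evaluates $k = A e^{-E_a/k_B T}$ and checks that $\ln k$ is linear in $1/T$ with slope $-E_a/k_B$, which is the behaviour whose breakdown (curvature) is traditionally read as a signature of tunneling. The function names, the choice of atomic units and the sample barrier are assumptions made for illustration only; they are not taken from the chapter.

```python
import math

# Boltzmann constant in hartree per kelvin (atomic units of energy)
K_B_HARTREE_PER_K = 3.166811563e-6


def arrhenius_rate(prefactor, barrier, temperature):
    """Arrhenius rate k = A * exp(-Ea / (kB * T)); barrier in hartree, T in kelvin."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    return prefactor * math.exp(-barrier / (K_B_HARTREE_PER_K * temperature))


def arrhenius_slope(barrier):
    """Slope of ln(k) versus 1/T for a pure Arrhenius law, equal to -Ea / kB."""
    return -barrier / K_B_HARTREE_PER_K


# Hypothetical barrier of 0.01 hartree: T * ln(k / A) is the same at every temperature
for temperature in (100.0, 200.0, 300.0):
    k = arrhenius_rate(1.0, 0.01, temperature)
    print(temperature, temperature * math.log(k))
```

A measured rate that falls on this straight line is over-barrier in character; a rate that levels off at low temperature, as in the low temperature limit discussed in the text, departs from it.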
In the absence of highly precise data for the reaction systems, numerical calculations for obtaining the tunneling rate constant become difficult. One is then left with the alternative of relating the special features of tunneling chemical reactions to the changes in the reaction barrier due to the intermolecular vibrations [18]. Atom transfer reactions are the most thoroughly investigated reactions [19]. Here the transferring atom moves between two potential wells. A reversible transfer leads to tunneling level splitting (the spectral signature of tunneling), while an irreversible transfer leads to a chemical reaction (the signature of tunneling dynamics). If the period of inter-well quantum oscillation is smaller than the relaxation time, coherent transfer takes place, while the process is incoherent in the opposite situation. If the two time scales are comparable, the process is better described by density matrix methods. Ivanov and Kozhushner [20] showed that the time period of oscillation ($\Omega^{-1}$) and the particle transfer time (the tunneling time $\tau$) play an important role. If $\Omega\tau \ll 1$, the particle transfer is instantaneous and takes place at predetermined positions of the reagents. If $\Omega\tau \gg 1$, the transfer takes place at average positions of the reagents and the surrounding molecules. If $\Omega\tau \approx 1$, the tunneling particle adjusts its position to those of the reagents and the surroundings. The synchronous motion of the reagents and the surroundings may either hasten or delay the particle transfer [22]. In a tunneling reaction three subsystems take part, of which the electronic and the intramolecular subsystems constitute the fast variables and the intermolecular subsystem constitutes the slow variables. Using the Fermi golden rule and a modified theory of radiationless transitions, an expression for the tunneling rate constant has been obtained [21]:

$$k_{tunneling} = \frac{2\pi}{\hbar} A_{\nu_i} \sum_{\nu_f} \langle \phi_{\nu_f} | V(R) | \phi_{\nu_i} \rangle^2 \, \delta(\hbar\nu_f + \Delta E - \hbar\nu_i).$$
The transition matrix element depends delicately on the distance between the reagents, which means that intermolecular vibrations play a dominant role in shaping the tunneling rate constant. Such vibrations cause variations in the tunneling distance. Vibrations that bring the reagents closer are called promotive modes; they are taken into account by invoking either the Einstein or the Debye model of promotive modes [22-24], depending upon the situation. It has been shown that in atom tunneling reactions in the solid phase the low temperature limit of $k_{tunneling}$ exists for nonendothermic processes only. Along with intermolecular vibrations (which can change the inter-reagent separation), medium reorganization and under-barrier friction play significant roles in determining the temperature and pressure dependence of tunneling rate constants. The effect of reorganization of the medium alone leads to a temperature dependence that is independent of the form of the barrier. We note that the tunneling particle is assumed to interact with phonons only in the initial and final states, but not during the course of tunneling. The interaction of the tunneling particle with phonons in the under-barrier regime leads to the appearance of frictional effects that depend rather sensitively on the form of the barrier and its modulations, if any.
C. Tunneling in Coupled Systems

With the preceding background in view, our purpose in this chapter is to investigate typical signatures of tunneling dynamics in several coupled systems. In one of these, the tunneling mode is described by a symmetric double well potential which is coupled to, let us say, a bond stretching mode. The latter is represented by a typical Morse potential, and the form and strength of the coupling are assumed to vary. We propose to investigate how the mode of coupling with the bond stretching mode influences the tunneling dynamics in the double well and vice versa. If the bond stretching mode is externally driven, how does it affect the tunneling dynamics? These questions are addressed in subsequent sections. Normally, the tunneling particle is assumed to have a fixed mass, constant in space and time. However, in charge transfer and tunneling of electrons through heterostructures, the effective mass of the electron is often assumed to be coordinate dependent. Could the coordinate dependence of the mass of the tunneling particle have a typical signature on the tunneling dynamics? We have investigated the question through numerical experiments. Let us now focus on the general methodology used in our explorations.
3. The Method
For the calculations, we have used the time dependent Fourier grid hamiltonian method [25-29]. The basic framework of the approach adopted in the present series of calculations is described in this section.
A. Dynamics of the Coupled System in the Absence of Driving

Let $H_0(x, y)$ be the hamiltonian of a particle of mass $m$ moving on the $x$-$y$ plane in a potential $V_0(x, y)$, where $V_0(x, y)$ is assumed to be additively separable into a double well potential $V_1(x)$ and a Morse potential $V_2(y)$. Thus

$$H_0(x, y) = T_0(x) + T_0(y) + V_0(x, y) = T_0(x) + V_1(x) + T_0(y) + V_2(y) = H_0(x) + H_0(y). \tag{1}$$
Let us assume that $x$ represents the tunneling coordinate and $y$ stands for a bond stretching coordinate. To be more specific, $H_0(x)$ is taken to represent the hamiltonian describing the motion of a quantum particle in a symmetric double well potential $V_1(x)$, while $H_0(y)$ is assumed to represent the motion along a bond stretching coordinate described by an appropriate Morse potential $V_2(y)$. We may now introduce a coupling between the two modes through the interaction potential $V_{int}(x, y)$, so that the total hamiltonian $H(x, y)$ that describes the coupled system can be written as

$$H(x, y) = H_0(x, y) + \lambda V_{int}(x, y). \tag{2}$$
Here λ is the strength of the coupling between the tunneling and bond stretching coordinates. The purpose of the present study is to investigate how the tunneling dynamics is affected by the coupling with the bond stretching mode. Let us assume that the eigenfunctions $\phi_i(x)$ of $H_0(x)$ and $\chi_i(y)$ of $H_0(y)$ are known to start with. The states of the coupled system can be described by a superposition of the products of the eigenstates of the uncoupled systems [$H_0(x)$ and $H_0(y)$]. Thus we may write

$$|\psi(x, y, t)\rangle = \sum_{i}^{n_i} \sum_{j}^{n_j} c_{ij}(t)\, |\phi_i(x)\chi_j(y)\rangle. \tag{3}$$
The product functions generated from the eigenfunctions of the uncoupled systems described by the hamiltonians $H_0(x)$ and $H_0(y)$ serve as the basis for the coupled system. The introduction of the coupling term $V_{int}(x, y)$ causes the superposition in equation 3 to evolve in time. The combination coefficients $c_{ij}$ are therefore taken to be time dependent. Time evolution equations for the coefficients are obtained from the time dependent Schrödinger equation by following the standard procedure and are given by

$$i\hbar \frac{dc_{ij}}{dt} = \sum_{k,l}^{m_x, m_y} c_{kl}\, \langle \phi^x_i \chi^y_j | H(x, y) | \phi^x_k \chi^y_l \rangle \tag{4}$$

for $i = 1, \dots, m_x$; $j = 1, \dots, m_y$, where $H(x, y)$ is the total hamiltonian of equation 2. We may call it a time dependent configuration interaction type of formalism. To find the time independent basis functions $|\phi_i(x)\rangle$ and $|\chi_j(y)\rangle$ we make use of the Fourier grid hamiltonian (FGH) technique [29] and compute the eigenfunctions and eigenvalues of the uncoupled model systems as follows:

$$H_0(x)\,|\phi_i(x)\rangle = \varepsilon^x_i\, |\phi_i(x)\rangle, \tag{5}$$

where

$$|\phi_i(x)\rangle = \sum_{p=1}^{n_x} w^x_{pi}\, |x_p\rangle \Delta x, \tag{6}$$

and

$$H_0(y)\,|\chi_i(y)\rangle = \varepsilon^y_i\, |\chi_i(y)\rangle, \tag{7}$$

where

$$|\chi_i(y)\rangle = \sum_{q=1}^{n_y} w^y_{qi}\, |y_q\rangle \Delta y. \tag{8}$$
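To make the grid construction behind equations 5 to 8 concrete, the sketch below builds a one dimensional FGH matrix in the form usually quoted for an odd number of uniformly spaced grid points: a kinetic part obtained by summing plane-wave kinetic energies $T_l$ over the grid, plus a potential that is diagonal on the grid. It is only an illustration and not the authors' code; the function name `fgh_hamiltonian`, the spacing convention `dx = length / n_points`, the grid origin and the use of plain Python lists are all assumptions. Diagonalizing the returned matrix with any symmetric eigensolver would give the grid amplitudes $w_{pi}$ and energies $\varepsilon_i$.

```python
import math


def fgh_hamiltonian(potential, x_min, length, n_points, mass, hbar=1.0):
    """Return (grid, H) with H the FGH matrix on a uniform periodic grid.

    Assumes an odd number of points and a grid spacing dx = length / n_points.
    """
    if n_points % 2 == 0:
        raise ValueError("this sketch assumes an odd number of grid points")
    dx = length / n_points
    dk = 2.0 * math.pi / (n_points * dx)          # momentum-grid spacing
    n_half = (n_points - 1) // 2
    # Kinetic energies of the plane waves, T_l = (hbar * l * dk)**2 / (2 m)
    t_l = [(hbar * l * dk) ** 2 / (2.0 * mass) for l in range(1, n_half + 1)]
    # The kinetic block depends only on the index difference d = |i - j|
    kinetic = []
    for d in range(n_points):
        s = sum(math.cos(2.0 * math.pi * l * d / n_points) * t_l[l - 1]
                for l in range(1, n_half + 1))
        kinetic.append(2.0 * s / n_points)
    grid = [x_min + p * dx for p in range(n_points)]
    h = [[kinetic[abs(i - j)] for j in range(n_points)] for i in range(n_points)]
    for p, x in enumerate(grid):
        h[p][p] += potential(x)                   # the potential is diagonal on the grid
    return grid, h


# Double well a*x**4 - b*x**2 + c with the Table 1 parameters (atomic units);
# the grid origin x_min = -3 is an assumption of this sketch.
grid, h = fgh_hamiltonian(lambda x: 0.1 * x**4 - 0.12 * x**2 + 0.04,
                          x_min=-3.0, length=6.0, n_points=151, mass=250.0)
```

A useful sanity check of such a matrix is the free particle: with a zero potential, the grid vector $\cos(2\pi p/N)$ should be an eigenvector with eigenvalue $T_1$.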
In equations (6) and (8), $w^x_{pi}$ and $w^y_{qi}$ are the grid point amplitudes along the $x$ and $y$ coordinates, respectively. We note that the coordinates are uniformly discretized, $\Delta x$ and $\Delta y$ being the uniform grid spacings along the $x$ and $y$ coordinates, respectively. Using equations (6) and (8), the time evolution equations of the combining coefficients $c_{ij}$ ($i = 1, \dots, n_x$; $j = 1, \dots, n_y$) of equation 4 reduce to

$$i\hbar \frac{dc_{ij}}{dt} = \sum_{k,l} c_{kl} \sum_{p,q}^{n_x, n_y} w^x_{pi} w^y_{qj}\, H(x_p, y_q)\, w^x_{pk} w^y_{ql}\, (\Delta x \Delta y)^2 = \sum_{k,l} c_{kl} H_{ij,kl}. \tag{9}$$

In matrix form, equation 9 can be written as

$$i\hbar\, \dot{C}(t) = H C(t). \tag{10}$$
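Equation 10 is a linear system of first order differential equations for the coefficient vector, so any explicit integrator can be used to step it forward. The sketch below uses the classical fourth order Runge-Kutta step as a simpler stand-in; it is not the sixth order scheme used in this chapter. The two-state hamiltonian, the splitting `delta` and the step size are hypothetical values chosen only so that the result can be compared with the known two-level oscillation.

```python
import math


def rk4_step(h, c, dt, hbar=1.0):
    """One classical RK4 step for i*hbar*dC/dt = H C, with H a constant matrix."""
    def deriv(vec):
        return [-1j / hbar * sum(h_ij * v for h_ij, v in zip(row, vec)) for row in h]

    k1 = deriv(c)
    k2 = deriv([ci + 0.5 * dt * k for ci, k in zip(c, k1)])
    k3 = deriv([ci + 0.5 * dt * k for ci, k in zip(c, k2)])
    k4 = deriv([ci + dt * k for ci, k in zip(c, k3)])
    return [ci + dt / 6.0 * (a + 2.0 * b + 2.0 * d + e)
            for ci, a, b, d, e in zip(c, k1, k2, k3, k4)]


def propagate(h, c0, dt, n_steps, hbar=1.0):
    """Propagate the coefficient vector c0 over n_steps steps of length dt."""
    c = list(c0)
    for _ in range(n_steps):
        c = rk4_step(h, c, dt, hbar)
    return c


# Hypothetical two-state model in a left/right localized basis with splitting delta
delta = 0.002
h_two_state = [[0.0, -delta / 2.0], [-delta / 2.0, 0.0]]
c_final = propagate(h_two_state, [1.0 + 0.0j, 0.0 + 0.0j], dt=1.0, n_steps=1571)
print(abs(c_final[1]) ** 2)   # population transferred to the second state
```

For a time dependent hamiltonian the same step applies if the matrix is re-evaluated at the intermediate times; the constant-matrix form is kept here for brevity.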
The time integration may be done by employing the sixth order Runge-Kutta method. When a quantum mechanical particle moves in a double well, there is a nonzero probability that the particle tunnels from one well to the other, whatever its energy relative to the barrier top. Let us suppose the particle was initially localized in the left well of the uncoupled symmetric double well. The lowest energy localized states ($\phi_L$ and $\phi_R$) can be described by linear combinations of the two lowest energy eigenstates of even ($\psi_0^+$) and odd ($\psi_0^-$) parity, respectively:

$$\phi_L = \frac{1}{\sqrt{2}} (\psi_0^+ + \psi_0^-), \tag{11}$$

$$\phi_R = \frac{1}{\sqrt{2}} (\psi_0^+ - \psi_0^-), \tag{12}$$

where $\phi_L$ and $\phi_R$ represent the states localized in the left and the right well, respectively. In the absence of any coupling between the tunneling mode and the bond stretching mode, the particle tunnels coherently from the left well to the right well. If one takes $\phi_L$ to be the initial state, the tunneling probability is obtained by calculating the probability of finding the particle in the right well at energies less than the barrier energy. The corresponding probability $P_R(t)$ at any particular instant is given by (the barrier is located at $x = 0$)

$$P_R(t) = \int_0^{\infty} |\psi(x, t)|^2\, dx. \tag{13}$$

In the two dimensional case, the corresponding probability can be computed by using

$$P_R(t) = \int_{y=-\infty}^{+\infty} \int_{x=0}^{+\infty} |\psi(x, y, t)|^2\, dx\, dy. \tag{14}$$
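For the uncoupled double well, the coherent oscillation described above can be worked out in closed form within a two-state picture. The short derivation below is a sketch under stated assumptions: $E_+$ and $E_-$ denote the energies of $\psi_0^+$ and $\psi_0^-$ (symbols introduced here for illustration), and $\phi_R$ is taken to lie entirely at $x > 0$, so that the integral in equation 13 is approximated by the overlap with $\phi_R$.

```latex
\begin{aligned}
\psi(x,t) &= \frac{1}{\sqrt{2}}\left(\psi_0^{+}\,e^{-iE_+t/\hbar} + \psi_0^{-}\,e^{-iE_-t/\hbar}\right),
  \qquad \psi(x,0) = \phi_L,\\
\langle\phi_R|\psi(t)\rangle &= \tfrac{1}{2}\left(e^{-iE_+t/\hbar} - e^{-iE_-t/\hbar}\right),\\
P_R(t) &\approx \left|\langle\phi_R|\psi(t)\rangle\right|^2
  = \sin^2\!\left(\frac{\Delta E\,t}{2\hbar}\right),
  \qquad \Delta E = E_- - E_+ .
\end{aligned}
```

Under these assumptions the first complete transfer to the right well occurs at $t = \pi\hbar/\Delta E$, which is how the rate extracted from the $P_R(t)$ plot is tied to the tunneling splitting.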
In the CI formalism used by us in the FGH basis, equation 14 can be written as

$$P_R(t) = \sum_{i,j}^{n_i, n_j} |c_{ij}|^2 \sum_{p=\frac{n_x - 1}{2}}^{n_x} \sum_{q=1}^{n_y} w^x_{pi}\, w^y_{qj}\, \Delta x \Delta y. \tag{15}$$
The rate of tunneling can then be obtained from the average slope of the $P_R(t)$ versus $t$ plot. Alternatively, the tunneling rate can be calculated from the average rate of change of $\langle x(t) \rangle$ with time, where $\langle x(t) \rangle$ is given by

$$\langle x(t) \rangle = \langle \psi(x, y, t) | x | \psi(x, y, t) \rangle. \tag{16}$$

Substituting the expression for $|\psi(x, y, t)\rangle$ from equation (3), we get

$$\langle x(t) \rangle = \sum_{i,j,k} c^*_{ij}(t)\, c_{kj}(t)\, \langle \phi^x_i | x | \phi^x_k \rangle \tag{17}$$

$$= \sum_{i,j,k} c^*_{ij}\, c_{kj} \sum_{p}^{n_x} w^x_{pi}\, x_p\, w^x_{pk}. \tag{18}$$
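Equation 18 translates directly into a triple sum over the expansion coefficients and a single sum over the grid. The sketch below is one possible transcription, assuming that the grid amplitudes are stored as `wx[p][i]` and normalized so that the basis vectors are orthonormal on the grid (any $\Delta x$ factor is taken to be absorbed in the amplitudes); the function name and the two-point toy data are hypothetical.

```python
import math


def mean_position(c, wx, x_grid):
    """<x> = sum_{i,j,k} conj(c_ij) c_kj sum_p wx[p][i] x_p wx[p][k]  (equation 18)."""
    n_i = len(c)
    n_j = len(c[0])
    # Position matrix in the x basis: x_mat[i][k] = sum_p wx[p][i] * x_p * wx[p][k]
    x_mat = [[sum(wx[p][i] * x_grid[p] * wx[p][k] for p in range(len(x_grid)))
              for k in range(n_i)] for i in range(n_i)]
    total = 0.0 + 0.0j
    for i in range(n_i):
        for k in range(n_i):
            for j in range(n_j):
                total += c[i][j].conjugate() * c[k][j] * x_mat[i][k]
    return total.real


# Toy two-point grid with an even and an odd basis vector (hypothetical data)
s = 1.0 / math.sqrt(2.0)
x_grid = [-1.0, 1.0]
wx = [[s, s], [s, -s]]          # wx[p][i]: column 0 is even, column 1 is odd
c_left = [[s], [s]]             # equal mixture of the even and odd states
print(mean_position(c_left, wx, x_grid))
```

With this toy data the equal mixture of the even and odd vectors is localized on the grid point at $x = -1$, mirroring the construction of $\phi_L$ in equation 11.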
The computed rate of tunneling is related to the tunneling splitting in the double well [30]. When the double well is coupled to a bond stretching mode described by a Morse oscillator, we may envisage three possibilities:
(a) The bond stretching mode enhances the tunneling rate. We may call it a promoting mode.
(b) The bond stretching mode reduces the tunneling rate. We call it a suppressing mode.
(c) The bond stretching mode does not affect the tunneling rate at all. It acts as a passive mode.
We anticipate that the type of effect depends on the nature of the coupling (see later).
B. Dynamics of the Coupled System in the Presence of External Driving

When a spatially homogeneous external electric field is applied along the $x$ direction, the perturbed hamiltonian $H(x, t)$ can be represented as

$$H(x, t) = T(x) + V_1(x) + V(x, t) \tag{19}$$

$$= H_0(x) + V(x, t), \tag{20}$$

where

$$V(x, t) = x e \varepsilon_x \sin(\omega t). \tag{21}$$

Here $\varepsilon_x$ is the field intensity, $\omega$ is the field frequency and $e$ represents the charge carried by an electron. Similarly, with an electric field applied along the $y$ direction,

$$H(y, t) = H_0(y) + V(y, t), \tag{22}$$

where

$$V(y, t) = y e \varepsilon_y \sin(\omega t). \tag{23}$$

The time evolution equations now turn out to be (with the field along the $x$ direction)

$$i\hbar \frac{dc_{ij}}{dt} = \sum_{k,l} c_{kl} \sum_{p,q}^{n_x, n_y} w^x_{pi} w^y_{qj} \left( H(x_p, t) + H_0(y_q) + \lambda V_{int}(x_p, y_q) \right) w^x_{pk} w^y_{ql}\, (\Delta x \Delta y)^2 \tag{24}$$

for $i = 1, \dots, n_x$; $j = 1, \dots, n_y$. These equations can be integrated numerically once the $c_{kl}$'s at $t = 0$ are provided.
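The driving terms of equations 21 and 23 are simple enough to write down directly. The sketch below evaluates the field potential on a coordinate and shows how it would enter a single hamiltonian matrix element; the function names are hypothetical, the charge is set to one atomic unit as an assumption, and the sample intensity and frequency are the values quoted later in the chapter for the field coupled to the tunneling coordinate.

```python
import math


def field_potential(coord, t, intensity, omega, charge=1.0):
    """V(coord, t) = coord * e * eps * sin(omega * t), as in equations 21 and 23."""
    return coord * charge * intensity * math.sin(omega * t)


def driven_matrix_element(h0_ij, coord_ij, t, intensity, omega, charge=1.0):
    """Element of H0 + V(t), given <i|H0|j> and the coordinate element <i|coord|j>."""
    return h0_ij + charge * intensity * math.sin(omega * t) * coord_ij


# Field parameters in atomic units (intensity 0.01, frequency 0.00227)
intensity, omega = 0.01, 0.00227
quarter_period = math.pi / (2.0 * omega)
print(field_potential(1.0, quarter_period, intensity, omega))   # maximum tilt at x = 1
```

Because the perturbation is odd in the coordinate, it tilts the two wells of a symmetric double well in opposite directions at every instant, which is why it couples states of opposite parity.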
4. Results and Discussion
A. Tunneling in the Coupled System in Absence of External Driving

We take the model potentials

$$V_1(x) = a x^4 - b x^2 + c, \tag{25}$$

$$V_2(y) = D \left( 1 - e^{-\beta (y - y_e)} \right)^2. \tag{26}$$
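The location of the wells and the height of the barrier follow from equation 25 by elementary calculus. The worked lines below are a sketch added for orientation; $V_{\text{barrier}}$ is a symbol introduced here, not one used in the chapter.

```latex
\begin{aligned}
\frac{dV_1}{dx} &= 4ax^3 - 2bx = 0 \;\Rightarrow\; x = 0, \quad x = \pm\sqrt{\frac{b}{2a}},\\
V_1(0) &= c, \qquad V_1\!\left(\pm\sqrt{b/2a}\right) = c - \frac{b^2}{4a},\\
V_{\text{barrier}} &= V_1(0) - V_1\!\left(\pm\sqrt{b/2a}\right) = \frac{b^2}{4a}.
\end{aligned}
```

With the Table 1 values $a = 0.1$, $b = 0.12$ and $c = 0.04$ (atomic units), the minima lie at $x = \pm\sqrt{0.6} \approx \pm 0.775$, the well bottoms are at 0.004 and the barrier height is 0.036.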
The system parameters are listed in Table 1. The FGH calculations have been done with 151 grid points along each coordinate. The grid lengths along the $x$ axis and the $y$ axis are 6 a.u. and 10 a.u., respectively. When $V_1(x)$ and $V_2(y)$ are coupled through an interaction potential $V_{int}(x, y)$ with coupling strength λ, the shape of the two dimensional potential energy surface (PES) is modified depending on the functional form of $V_{int}(x, y)$ and the strength λ. Figure 1a shows the two dimensional PES when $V_1(x)$ and $V_2(y)$ are uncoupled, i.e. λ = 0. Figure 1b shows the two dimensional PES for $V_{int}(x, y) = \lambda xy$ (λ = 0.001). Evidently, for this functional form the symmetry of the double well potential is gradually distorted (along the $y$ axis). The left well becomes deeper than the right well. As the coupling strength increases, the asymmetry between the two wells increases and the coherence in the tunneling process is progressively disturbed. Figure 2a shows the plot of the computed tunneling rate versus the coupling strength. The initial wave function $\psi(x, y, t = 0)$ corresponds to the state in which the Morse mode and the tunneling mode are both in the ground state. From figure 2a we can clearly conclude that the tunneling rate decreases almost linearly with increasing coupling strength for the interaction potential $V_{int}(x, y) = \lambda xy$.

Figure 1. Potential energy surfaces for different forms of the interaction potential $V_{int}(x, y)$: (a) $V_{int}(x, y) = 0$; (b) $V_{int}(x, y) = \lambda xy$; (c) $V_{int}(x, y) = \lambda x^2 y^2$; (d) $V_{int}(x, y) = \lambda (x^2 y + x y^2)$.

Figures 1c and 1d show the two dimensional PES for the functional forms $V_{int}(x, y) = \lambda x^2 y^2$ and $V_{int}(x, y) = \lambda (x^2 y + y^2 x)$, respectively (with λ = 0.001). In figure 1c the symmetry of the double well potential remains intact but the barrier height of the double well decreases (along the $y$ axis). As λ increases with the functional form $V_{int}(x, y) = \lambda x^2 y^2$, the barrier height gradually decreases and, as a consequence, the tunneling rate increases almost linearly, which is very clearly reflected in figure 2b. If $V_{int}(x, y)$ is taken to have the form $\lambda (x^2 y + y^2 x)$, the PES of figure 1d is obtained, in which the barrier height of the double well potential increases and the symmetry of the double well potential is lost. The tunneling rate decreases (figure 2c) as λ increases. These results indicate that the form ($V_{int}(x, y)$) and the strength (λ) of the interaction potential between the symmetric double well (tunneling mode) and the Morse oscillator (bond stretching mode) are very important in determining the nature and the extent of the influence that the coupling can have on the tunneling dynamics in the double well.

Figure 2. Tunneling rate in the coupled system in the absence of external driving when (a) $V_{int}(x, y) = \lambda xy$; (b) $V_{int}(x, y) = \lambda x^2 y^2$; (c) $V_{int}(x, y) = \lambda (x^2 y + x y^2)$.
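The qualitative changes in the PES described above can be checked with a few lines of code. The sketch below evaluates the coupled potential for the three coupling forms and compares the two wells at the equilibrium bond length. It rests on several assumptions that should be kept in mind: the Table 1 entry listed as $x_e$ is used as the Morse equilibrium position $y_e$, the wells are probed at the minima of the uncoupled double well rather than at the shifted minima of the coupled surface, and all function and dictionary names are hypothetical.

```python
import math

# Table 1 parameters (atomic units); Y_E is the entry tabulated as x_e (assumption)
A, B, C = 0.1, 0.12, 0.04
D, BETA, Y_E = 0.135, 0.731, 2.23

# Coordinate part of each interaction potential; the strength lam multiplies it
COUPLINGS = {
    "xy": lambda x, y: x * y,
    "x2y2": lambda x, y: x * x * y * y,
    "x2y+xy2": lambda x, y: x * x * y + x * y * y,
}


def v_double_well(x):
    """Symmetric double well of equation 25."""
    return A * x**4 - B * x**2 + C


def v_morse(y):
    """Morse potential of equation 26."""
    return D * (1.0 - math.exp(-BETA * (y - Y_E))) ** 2


def v_total(x, y, lam, kind):
    """Coupled potential V1(x) + V2(y) + lam * (coordinate part of the coupling)."""
    return v_double_well(x) + v_morse(y) + lam * COUPLINGS[kind](x, y)


X_MIN = math.sqrt(B / (2.0 * A))   # well position of the uncoupled double well


def barrier_from_left_well(lam, kind, y=Y_E):
    """Barrier top at x = 0 minus the left-well value, at fixed bond length y."""
    return v_total(0.0, y, lam, kind) - v_total(-X_MIN, y, lam, kind)


for kind in COUPLINGS:
    left = v_total(-X_MIN, Y_E, 0.001, kind)
    right = v_total(X_MIN, Y_E, 0.001, kind)
    print(kind, left, right, barrier_from_left_well(0.001, kind))
```

Within these approximations the bilinear coupling lowers the left well relative to the right one, while the $x^2 y^2$ coupling keeps the wells equal and lowers the barrier, in line with the trends read off figure 1.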
B. Tunneling Dynamics in the Coupled System in the Presence of an External Electric Field Coupled to the Tunneling Coordinate

We apply an external time varying electric field along the tunneling coordinate with an intensity of 0.01 a.u. and a frequency of 0.00227 a.u. This frequency matches the 0 → 1 transition frequency ($\omega_{0 \to 1}$) of the double well. We take the product of the lowest even parity state of the double well and the ground state of the Morse oscillator as the initial state:

$$\psi(x, y, t = 0) = \phi_0^+ \chi_1. \tag{27}$$
For the uncoupled composite system, λ = 0 and we see the particle execute a to and fro movement between the two wells. Figure 3a shows the plot of ⟨x⟩ versus t for the uncoupled double well-Morse oscillator system. The variation of ⟨x⟩ is very regular and periodic. The corresponding quantum phase space diagram (figure 3b) shows that the particle is symmetrically distributed in the two wells (the symmetry about the ⟨x⟩ = 0 line is maintained). The tunneling rate is calculated from the average slope of the ⟨x⟩ versus t plots. When the symmetric double well is coupled to the Morse oscillator with a coupling term $V_{int}(x, y) = \lambda xy$, the dynamics is modified. The variation of ⟨x⟩ with t for λ = 0.0001 is shown in figure 3c; it is substantially different from the corresponding picture for the uncoupled system (figure 3a), although it also exhibits regularity and periodicity. The 'quantum phase space' picture (figure 3d) is still symmetric about the ⟨x⟩ = 0 line, but the area occupied in the phase space diagram has decreased substantially. This indicates a transfer of some energy from the tunneling mode to the bond stretching mode. Figure 3e displays the 'quantum phase space' structure along the y direction (the bond stretching mode). The occupied region of the phase space in figure 3e is rather small, which indicates that the energy transfer from the x mode to the y mode (i.e. from the tunneling mode to the bond stretching or Morse mode) is also small for λ = 0.0001. However, because of this small energy transfer the tunneling rate is reduced.

Figure 3. Tunneling dynamics in the presence of an external time varying field coupled to the tunneling coordinate x with $V_{int}(x, y) = \lambda xy$: (a) ⟨x⟩ versus t profile at λ = 0; (b) 'quantum phase space' diagram at λ = 0 for the tunneling mode; (c) ⟨x⟩ versus t profile at λ = 0.0001; (d) 'quantum phase space' diagram at λ = 0.0001 for the tunneling coordinate; (e) 'quantum phase space' diagram at λ = 0.0001 for the bond stretching mode.

We have investigated the dynamics at different values of λ and found that the energy transfer from the x mode to the y mode becomes maximum at λ = 0.003. The tunneling rate in turn passes through a minimum at λ = 0.003 (figure 4a). Figure 4b shows the variation of ⟨x⟩ with time for λ = 0.003. At this value of λ the 'quantum phase space' diagram for the Morse mode (figure 4c) occupies a larger area, which again indicates that the energy transfer from the tunneling mode to the Morse mode is higher. The population of the first excited state of the Morse oscillator grows with oscillations (figure 4d), which is reflected in the increase of the amplitude of the oscillation of the average bond length ⟨y⟩ (figure 4e). It is clear, therefore, that direct excitation by an external field coupled to the tunneling mode can influence the bond stretching mode coupled to it.
Figure 4. Tunneling dynamics in the presence of an external time varying field coupled to the tunneling coordinate with $V_{int}(x, y) = \lambda xy$: (a) tunneling rate versus coupling strength λ; (b) ⟨x⟩ versus t profile for λ = 0.003 along the tunneling coordinate; (c) 'quantum phase space' diagram for λ = 0.003; (d) population of the ground and the first excited states of the bond stretching mode for λ = 0.003; (e) ⟨y⟩ versus t profile at λ = 0.003 for the bond stretching mode.

C. Tunneling Dynamics in the Presence of External Field Coupled to the Bond Stretching Mode

If we locally excite the Morse mode with an external time varying electric field of intensity 0.01 a.u. and frequency $\omega_{0 \to 1}(y)$ (the frequency of the 0 → 1 transition in the Morse mode), we do not observe any significant probability of bond dissociation in the uncoupled system. But if we introduce an interaction potential $V_{int}(x, y) = \lambda xy$, the dissociation probability slowly rises as λ increases. Table 3 shows the values of the computed dissociation probability and tunneling rate for various values of the coupling strength (λ). At λ = 0.004 we observe a dissociation probability of 0.37, which is the maximum for the given intensity. From Table 3 it is clear that tunneling is strongly impeded when the tunneling mode is coupled to a stretching mode that is locally excited. The quantum phase space picture along the tunneling coordinate at λ = 0.004 is depicted in figure 5a. Figure 5b exhibits the corresponding picture for the stretching mode at λ = 0.004. Figures 5c and 5d show the 'quantum phase space' diagrams for the stretching mode for λ = 0.001 and λ = 0.006, respectively. In both cases the area occupied in the phase space is smaller than what is observed for λ = 0.004 (figure 5b). This indicates that the maximum energy transfer from the symmetric double well to the Morse mode takes place at λ = 0.004, which is reflected in the relatively high dissociation probability achieved at the given intensity of the external field. Figure 5e shows the plot of the computed dissociation probability against time at λ = 0.004. The dissociation probability grows fast and attains a relatively high value. The tunneling, on the other hand, is quenched completely. Thus, excitations in the non-tunneling mode of the coupled system can be exploited for controlling tunneling.

Figure 5. Tunneling dynamics in the presence of an external time varying field coupled to the bond stretching coordinate with $V_{int}(x, y) = \lambda xy$: (a) 'quantum phase space' diagram for λ = 0.004 along the tunneling coordinate; (b) 'quantum phase space' diagram for λ = 0.004 along the bond stretching coordinate; (c) 'quantum phase space' diagram for λ = 0.001 along the bond stretching coordinate; (d) 'quantum phase space' diagram for λ = 0.006 along the bond stretching coordinate; (e) dissociation probability versus time for λ = 0.004.
D. Tunneling Dynamics of a Particle with Coordinate Dependent Mass

Tunneling of electrons through heterostructures is complicated by the interaction of the tunneling particles with many scattering centres. Instead of taking these complexities directly into account in the calculation, it is often expedient to replace the real system by a model in which the tunneling potential remains unaffected, but the tunneling particle is assumed to have a coordinate dependent mass. The question that arises now concerns the signature of the coupling of the tunneling particle with the lattice, or equivalently of the coordinate dependent mass [31], on the dynamics of the tunneling. We have carried out a series of numerical experiments within the framework of the basic methodology described in Section 3. The numerical experiments are done with a symmetric double well potential and are subdivided into two classes:
1. The mass of the tunneling particle varies symmetrically along the tunneling coordinate.
2. The mass variation is asymmetric.

Figure 6. Pattern of coordinate dependent mass variation: (a) the mass distribution is gaussian; (b) the mass distribution has a minimum at the barrier top.

Under category 1, three possibilities have been investigated: (a) the tunneling mass is a constant $m_0$ everywhere; (b) the tunneling particle has mass $m_0$ at $x = 0$, where the barrier height is maximum, while away from the barrier top the mass decreases, more specifically, the mass has a gaussian profile along the tunneling coordinate (figure 6a), $m(x) = m_0 e^{-\beta x^2}$; and (c) the tunneling mass is $m_0$ at the bottom of the left or the right well and it decreases as it approaches the barrier top at $x = 0$ (figure 6b), i.e. $m(x) = m_0 e^{-\beta (x^2 - a^2)}$, $x = \pm a$ being the locations of the well minima. Figure 7a shows the dynamics when the tunneling mass ($m_0$) is fixed along $x$. It is perfectly coherent and the particle oscillates back and forth between the right and the left wells. When the tunneling mass is higher in the wells and lower in the barrier, the tunneling becomes slower, while still remaining coherent (figure 7b). However, if the tunneling mass is lower in the well and increases as it approaches the barrier, the tunneling rate is increased very significantly and the coherent oscillation frequency becomes much larger (figure 7c). In such a situation a very significant increase in the tunneling rate through a heterostructure should be seen. It is clear that mass concentration in the wells lowers the tunneling rate while mass concentration in the barrier region enhances tunneling.

Figure 7. The pattern of tunneling dynamics displayed when (a) the tunneling mass is independent of x, i.e. $m(x) = m_0$; (b) the tunneling mass has a maximum at the bottom of the wells; (c) the tunneling mass has a maximum on the barrier top.

Figure 8. The mass distribution plot when it is sharply peaked at $x = -\alpha$.

Under category 2, we have considered an asymmetric distribution of the tunneling mass along the tunneling coordinate by assuming

$$m(x) = \frac{m_0}{|x + \alpha|}. \tag{28}$$
The function is peaked at x = −α (figure 8). The larger the value of α, the larger is the shift of the peak towards the region x < 0. Figure 8 shows the mass variation pattern along the tunneling coordinate for different values of α. The tunneling time [32] has been computed for each value of α, with m0 taken to be the proton mass, and is reported in Table 4. It is clearly seen that the tunneling time increases as α increases, meaning that the rate of tunneling decreases. An asymmetric mass variation of the type displayed in figure 8 along the tunneling coordinate could therefore lead to quenching of tunneling. The coupling with the lattice ('environment') can thus modulate the tunneling current in heterostructures.

Table 1. Morse and symmetric double well potential parameters used in model calculations

Parameter    Value (a.u.)
a            0.1
b            0.12
c            0.04
D            0.135
β            0.731
xe           2.23
µ            250.0

Table 2. Energies of the Morse and the double well oscillators

Oscillator          1st state energy (a.u.)   2nd state energy (a.u.)
Double well         0.02270                   0.02496
Morse oscillator    0.00439                   0.01267

Table 3. Computed dissociation probability and tunneling rate in a symmetric double well and Morse oscillator system for various strengths of the coupling parameter λ

λ        Dissociation probability   Tunneling rate (a.u.⁻¹)
0        0.04                       0.000769
0.001    0.05                       0
0.002    0.1                        0
0.003    0.18                       0
0.004    0.37                       0
0.005    0.05                       0
0.006    0.05                       0

Table 4. Computed tunneling time for various α

α      Tunneling time
0.5    5.0 × 10⁵
1.0    1.0 × 10⁶
1.5    3.3 × 10⁶
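The numerical experiments of this section rest on solving the Schrödinger equation with a mass that depends on x. As a minimal sketch, and not the Fourier grid implementation used by the authors, the block below builds a finite-difference Hamiltonian with the symmetric kinetic operator $-\frac{1}{2}\frac{d}{dx}\frac{1}{m(x)}\frac{d}{dx}$ in units where ħ = 1. This operator ordering is an assumption, since the chapter does not state which ordering was used. The quartic double well, the grid and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def pdm_hamiltonian(x, potential, mass):
    """Finite-difference H = -(1/2) d/dx [1/m(x)] d/dx + V(x) on a uniform grid.

    Assumptions (not taken from the chapter): hbar = 1, symmetric operator
    ordering, and a wavefunction that vanishes outside the grid.
    """
    h = x[1] - x[0]
    # Inverse mass evaluated half-way between neighbouring grid points.
    midpoints = np.concatenate(([x[0] - h / 2], x + h / 2))
    inv_m = 1.0 / mass(midpoints)
    diagonal = (inv_m[:-1] + inv_m[1:]) / (2 * h**2) + potential(x)
    off_diagonal = -inv_m[1:-1] / (2 * h**2)
    return np.diag(diagonal) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)

def tunneling_splitting(x, potential, mass):
    """Energy gap between the two lowest states of the double well."""
    energies = np.linalg.eigvalsh(pdm_hamiltonian(x, potential, mass))
    return energies[1] - energies[0]

def double_well(x, v0=5.0, a=1.0):
    """Hypothetical symmetric double well with minima at x = +a and x = -a."""
    return v0 * ((x / a) ** 2 - 1.0) ** 2

def gaussian_mass(x, m0=1.0, beta=0.5):
    """Gaussian profile m(x) = m0 exp(-beta x^2), as in case (b) of category 1."""
    return m0 * np.exp(-beta * x**2)

def constant_mass(m0):
    return lambda x: np.full_like(x, m0)

x = np.linspace(-3.0, 3.0, 301)
split_light = tunneling_splitting(x, double_well, constant_mass(1.0))
split_heavy = tunneling_splitting(x, double_well, constant_mass(2.0))
split_gauss = tunneling_splitting(x, double_well, gaussian_mass)
# For a coherent two-level oscillation the period is 2*pi/splitting (hbar = 1).
period_light = 2 * np.pi / split_light
```

In this two-level picture a smaller splitting means a longer oscillation period, which is how a tunneling time can be read off from the lowest pair of eigenvalues. The sketch is meant only to show the structure of such a calculation and makes no claim to reproduce the values in Table 4.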
5. Conclusion
In a tunneling system where the tunneling motion of the particle along one coordinate is coupled to non-tunneling motion along another coordinate, the coupling leaves its signature on the dynamics in different ways. It can quench tunneling, depending upon the strength and form of the coupling. Similarly, the tunneling motion can also affect the dynamics along the non-tunneling (e.g. bond stretching) coordinate. Experimentally, little seems to be known about the possible impact that excitation in the tunneling coordinate could have on the bond stretching and the dissociation dynamics of a mode coupled to the tunneling coordinate, and vice versa. Work along these lines could enrich our knowledge of the dynamics of coupled quantum systems.
References
[1] M. Razavi, Quantum Theory of Tunneling, World Scientific, 2003.
[2] R. P. Bell, The Tunnel Effect in Chemistry, Chapman & Hall, London, 1980.
[3] G. Gamow, The quantum theory of nuclear disintegration, Nature 1928, 122, 805–806; Zur Quantentheorie des Atomkernes, Z. Phys. 1928, 51, 204–212.
[4] M. Born, Zur Theorie des Kernzerfalls, Z. Phys. 1928, 58, 306–321.
[5] R. W. Gurney, Nuclear levels and artificial disintegration, Nature 1929, 123, 565–566.
[6] L. Esaki, Long journey into tunneling, Proc. IEEE 1974, 62, 825–831.
[7] I. Giaever, Electron tunneling and superconductivity, Science 1974, 183, 1253–1258.
[8] B. D. Josephson, The discovery of tunneling supercurrents, Science 1974, 184, 527–530.
[9] G. C. Schatz, Tunnelling in bimolecular collisions, Chem. Rev. 1987, 87, 81–89; Quantum effects in gas phase bimolecular chemical reactions, Ann. Rev. Phys. Chem. 1988, 39, 317–340.
[10] D. G. Truhlar, A. D. Isaacson, B. C. Garrett, Theory of chemical reaction dynamics, D. C. Clary, Dordrecht Ed.; NATO ASI series. Series C, Mathematical and physical sciences; vol. 170. CRC Press, Boca Raton, FL, 1985, Vol. 4, pp. 65–137.
[11] B. C. Garrett and D. G. Truhlar, A least action variational method for calculating multidimensional tunneling probabilities for chemical reactions, J. Chem. Phys. 1983, 79, 4931–4938.
[12] H. Ushiyama and K. Takatsuka, Semiclassical study on multidimensional effects in tunneling chemical reactions: tunneling paths and tunneling tubes, J. Chem. Phys. 1997, 106, 7023–7035.
[13] H. Eyring, The activated complex in chemical reactions, J. Chem. Phys. 1935, 3, 107–115.
[14] H. A. Kramers, Brownian motion in a field of force and the diffusion model of chemical reactions, Physica 1940, 7, 284–304.
[15] P. Hänggi, P. Talkner, M. Borkovec, Reaction-rate theory: fifty years after Kramers, Rev. Mod. Phys. 1990, 62, 251–270.
[16] H. J. Kim, J. T. Hynes, A theoretical model for SN1 ionic dissociation in solution. 1. Activation free energetics and transition-state structure, J. Am. Chem. Soc. 1992, 114, 10508–10528.
[17] D. G. Truhlar, G. K. Schenter, B. C. Garrett, Inclusion of nonequilibrium continuum solvation effects in variational transition state theory, J. Chem. Phys. 1993, 98, 5756–5770.
[18] V. I. Goldanskii, Role of the tunneling effect in the kinetics of chemical reactions at low temperatures, Dokl. Akad. Nauk. Phys. Chem. 1959, 124, 1261–1264.
[19] V. A. Benderskii, V. I. Goldanskii, A. A. Ovchinnikov, Effect of molecular motion on low temperature and other anomalously fast chemical reactions in solid phase, Chem. Phys. Lett. 1980, 73, 492–495.
[20] T. Miyazaki, Ed., Atom Tunneling Phenomena in Physics, Chemistry and Biology, Springer, Berlin, 2004, 36.
[21] G. K. Ivanov and M. A. Kozhushner, Sov. J. Chem. Phys. 1983, 2, 1299.
[22] G. K. Ivanov, M. A. Kozhushner, L. I. Trakhtenberg, Temperature dependence of cryochemical H-tunneling reactions, J. Chem. Phys. 2000, 113, 1992–2002.
[23] V. I. Goldanskii, L. I. Trakhtenberg and V. N. Fleurov, Tunneling Phenomena in Chemical Physics, Gordon and Breach Science Publishers, New York, 1989.
[24] L. I. Trakhtenberg, V. L. Klochikhin, S. Ya. Pshezhetskii, Theory of tunnel transitions of atoms in solids, Chem. Phys. 1982, 69, 121–134.
[25] L. I. Trakhtenberg in ref. 18, 46–47.
[26] C. C. Marston, G. G. Balint-Kurti, The Fourier grid Hamiltonian method for bound state eigenvalues and eigenfunctions, J. Chem. Phys. 1989, 91, 3571–3576.
[27] S. Adhikari, P. Dutta and S. P. Bhattacharyya, A time-dependent Fourier grid Hamiltonian method. Formulation and application to the multiphoton dissociation of a diatomic molecule in intense laser field, Chem. Phys. Lett. 1992, 91, 574–579.
[28] P. Dutta, S. Adhikari and S. P. Bhattacharyya, Fourier grid Hamiltonian method for bound states of multidimensional systems. Formulation and preliminary applications to model systems, Chem. Phys. Lett. 1993, 212, 677–684.
[29] S. Adhikari, P. Dutta and S. P. Bhattacharyya, Properties, dynamics, and electronic structure of atoms and molecules: applications of a local grid method for modeling chemical dynamics at a mean-field level, Int. J. Quant. Chem. 1996, 59, 109–117.
[30] G. G. Balint-Kurti, C. L. Ward, R. N. Dixon, A. J. Mulholland, The calculation of product quantum state distributions and partial cross sections in time-dependent molecular collision and photodissociation theory, Comput. Phys. Commun. 1991, 63, 126–134.
[31] A. V. Kolesnikov, A. P. Silin, Quantum mechanics with coordinate-dependent mass, Phys. Rev. B 1999, 59, 7596–7599.
[32] K. Maji, C. K. Mondal, S. P. Bhattacharyya, Tunneling time and tunneling dynamics, Int. Rev. Phys. Chem. 2007, 26, 647–670.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 111-140
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 6
THEORETICAL CALCULATION OF THE LOW LAYING ELECTRONIC STATES OF THE MOLECULAR ION CSH+ WITH SPIN-ORBIT EFFECTS
M. Korek* and H. Jawhari
Faculty of Science, Physics Department, Beirut Arab University, P.O.Box 11-5020 Riad El Solh, Beirut 1107 2809, Lebanon
Abstract
Research on ultracold molecules is a current and great challenge in the spectroscopic study of alkali dimers, because of their importance in the cooling and trapping of atoms and molecules and their role in high precision spectroscopy and Bose-Einstein condensation (BEC). Using the high reliability of the ab initio technique combined with the easily amenable phenomenological core polarization concept, the theoretical calculation of the electronic structure of the molecular ion CsH+ has been performed. This ion is treated as a one-electron system in which the interaction between the outer electron and the atomic core of Cs+ is modelled through a non-empirical relativistic effective one-electron core potential. The lowest 71 electronic states of the ion CsH+ have been calculated for the molecular states ²Λ(+) and Ω dissociating into the 16 asymptotes considered, i.e. up to 16 states ²Σ⁺, 9 states ²Π, 4 states ²Δ, 25 states Ω=1/2, 13 states Ω=3/2 and 4 states Ω=5/2. Some avoided crossings are pointed out for the symmetries ²Σ⁺, ²Π, Ω=1/2 and Ω=3/2; their positions rAC and the energy differences ΔEAC at these positions have been determined. For 19 bound states, the harmonic vibrational constant ωe, the equilibrium internuclear distance re and the electronic transition energy with respect to the ground state Te have been calculated. Using the canonical functions approach, we calculate in the present work Ev, Bv and Dv of the molecular ion CsH+ up to the vibrational level v = 19 for 17 electronic states. From the calculated values of Ev for a given vibrational level v, and by using a cubic spline interpolation between each two consecutive points of the potential energy curves, the turning points rmin and rmax have been determined for these bound states. Permanent dipole moments $M_a(r)$ as well as all non-zero transition dipole moments $M_{ab}(r)$ ($a \neq b$) have been calculated for each electronic state (a, b) under consideration and over the whole range of r investigated here. The comparison of the present results with those available in the literature shows very good agreement.

* E-mail address: [email protected]
1. Introduction
Beginning in the late 1980s, methods that use laser light to cool and confine atoms at unprecedentedly low temperatures have made a dramatic impact on atomic physics. The cooling and manipulation of cold molecules is likely to open up new branches of research. As a gas of molecules is cooled, the average velocity of the molecules decreases and the spread of their velocities narrows. This is important not only for studying molecular physics, but also for studying fundamental physics. The internal structure of certain molecules provides an ideal "laboratory" for sensitive measurements of fundamental physical quantities. Ultracold molecules are a current and great challenge in the spectroscopic study of alkali dimers, because of their importance in the cooling and trapping of atoms [1,2] and molecules [3] and their role in high precision spectroscopy [4], Bose-Einstein condensation (BEC) [5], atomic clocks, ultrasensitive isotope detection, quantum information processing, and ultracold collisions. At ultracold temperatures, the collisions of atoms, which may be characterized by s-wave scattering lengths, have received considerable attention because of their importance in the cooling and trapping of atoms and molecules [6] and their role in high precision spectroscopy [7]. Collisions of ions involve higher-order partial waves because of the long-range attractive polarization forces and because of the possibility of charge transfer. Ion-atom charge transfer collisions are of great interest both theoretically and experimentally. Another especially promising area will be the study of collisions between ultracold molecules, in a regime where they behave like waves, perhaps giving rise to a new chemistry [8]. They may also allow for the study of collective quantum effects in molecular systems, including BEC [9].
There are proposed experiments to study polar molecular systems in order to measure the electron's permanent electric dipole moment (EDM), the lifetime of long-lived energy levels, and the effects of dipole-dipole interactions on the properties of molecular samples [10]. Ultracold polar molecules interact with each other via highly anisotropic electric dipole-dipole forces, providing access to qualitatively new regimes previously unavailable with ultracold homonuclear (nonpolar) systems [11-12]. Novel phenomena are expected from ultracold polar molecules, such as new features in the phase diagrams of degenerate states [13] and anisotropic collisions caused by the anisotropic dipole-dipole interaction [14]. The sympathetic cooling of these mixtures has led to the achievement of simultaneous quantum degeneracy of bosonic and fermionic species, producing BEC and Fermi-Bose mixtures [15-16]. Theory groups developed methods to map problems such as the BEC-BCS crossover, superfluid phases, Bose glasses and spin-charge separation from solid state quantum systems to the pure world of degenerate quantum gases [17-19]. Meanwhile, a growing community developed the technique of atom chips. From this dataset and the experience of three generations of atom chip experiments on hand, a truly next-generation experiment was designed. Furthermore, the interaction of bosons and fermions with attractive and repulsive interactions can be studied without the need for a Feshbach resonance, as the interaction between Rb and Li is repulsive and the interaction between Rb and K is attractive [20]. Recent advances in precision control of the optical spectrum emitted by a femtosecond laser have made a revolutionary impact on the fields of optical frequency metrology (via self-referenced, ultra-broad bandwidth frequency combs) and ultrafast optical science (via carrier-envelope phase stabilized pulse trains). These advances have, in effect, provided an entirely new class of optical sources for experimental investigations that combine femtosecond lasers with cold atoms and molecules.

In the past three decades a limited number of theoretical studies have been performed for the molecular ions AH+ (where A is an alkali atom). The potential energy curves of these molecular ions are needed to understand ion-atom collisions [21] and to serve as input for diatomics-in-diatomics studies of the potential energy surfaces for A+H2 collisions [22]. These systems are interesting for the theory of chemical bonding since each involves a single valence electron. Interest in alkali dimers is closely related to developments in ultracold atom trapping, which are at the root of photoassociation spectroscopy. The charge transfer in collisions of neutral alkali atoms with protons affects the ionization balance in the atmospheres of planets, dwarf stars, and the interstellar medium [23-27]. At low temperature, such as in ultracold experiments, collision energies are much less than 1 eV, and radiative charge transfer may become dominant over nonradiative charge transfer. Neutralization of H+ is of interest in the area of plasma fusion as a method of energetic neutral beam injection in fusion reactors. At lower energies, this charge exchange process is used to make metastable hydrogen [H(2s)] for atomic experiments [28,29] and to create spin-polarized proton beams for injection into large accelerators [30]. Because of these results, the lack of theoretical calculations for a number of alkali dimers, and their use in quantum computing to create qubits [31], we have recently carried out theoretical calculations for these molecules and their ions [32-38].
Using an improvement on the ab initio pseudo-potential method [39-45], we investigate in this chapter the lowest 29 electronic states in the Λ-representation (neglecting the spin-orbit effect), the lowest 42 electronic states in the Ω-representation (including the spin-orbit effect), and the spectroscopic constants of the regularly bound states. Based on the canonical functions approach [46-48], a rovibrational study has been carried out to calculate the eigenvalues Ev, the rotational constants Bv, the centrifugal distortion constants Dv, and the abscissas of the turning points (rmin, rmax) for 17 electronic states. Moreover, the dipole moment functions and the transition dipole moments are calculated for many of the states in the Ω-representation.
2. The Theory
A. Ab initio Calculation
For the molecular ion CsH+ the energies of the molecular states including the spin-orbit effect, Ω = 1/2, 3/2 and 5/2, have been obtained from the treatment of the total Hamiltonian $H_t = H_e + W_{SO}$, where $H_e$ is the Hamiltonian in the Born-Oppenheimer approximation for the calculation of the energies of the molecular states labelled $^{2S+1}\Lambda^{(+/-)}$ and $W_{SO}$ is the spin-orbit pseudo-potential. The spin-orbit (SO) effects are considered for Cs while they are neglected for H. The CsH+ ion is treated as a one-electron system in which the interaction between the outer electron and the atomic core of Cs+ is modelled through a non-empirical relativistic effective one-electron core potential of the Durand and Barthelat type [42-44]. The electron-core interaction is represented by the effective potential
\[ V(r) = \sum_{\ell=0}^{2} U_\ell(r)\, P_\ell \]
where ℓ is the orbital angular momentum and $P_\ell$ is the projection operator on the subspace defined by the spherical harmonics $Y_{\ell m}$ with a given ℓ. $U_\ell(r)$ is written as
\[ U_\ell(r) = \sum_{i=1}^{2} c_i\, r^{n_i} e^{-\alpha_i r^2} \]
with c, n and α adjusted to fit the energy and wave functions of the valence Hartree-Fock orbitals. Core-valence effects, including core polarization and core-valence correlation, are taken into account by using an ℓ-dependent core-polarization potential of the Foucrault et al. type [49]
\[ V_{cpp} = -\frac{1}{2} \sum_k \alpha_k\, \mathbf{f}_k \cdot \mathbf{f}_k \]
where the index k labels the ionic cores, $\alpha_k$ is the static dipole polarizability of the ionic core, and $\mathbf{f}_k$ is the electric field acting on the ionic core k due to the valence electrons and the other core. The ℓ-dependent form proposed by Foucrault et al. [49] is
\[ \mathbf{f}_k = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{+\ell} F_\ell(r_{ik}, r_k^{\ell})\, |\ell m\rangle_k \, {}_k\langle \ell m| \]
where $|\ell m\rangle_k$ are spherical harmonics centered on the core k and $r_k^{\ell}$ are cut-off parameters. For the one-valence-electron atom Cs the parameters $r_k^{\ell}$ have been determined in order to reproduce the experimental values of the ionization potential (IP) as well as the atomic transition energies. In this way $r_k^{\ell}$ has been obtained for ℓ = 0, 1, 2, and we have chosen $r_k^{3} = r_k^{2}$ [50]. The parameters defining the core-polarization potentials and the comparison of the calculated IPs and atomic transition energies with the experimental values for the Cs atom are given in Ref. [51]. The core-core interaction is evaluated as the ground state energy of the molecular ion RbH²⁺ instead of the approximation 1/r, which is not accurate enough for this species, at least for small values of the internuclear distance. In the present calculation including the spin-orbit effect, the total Hamiltonian $H_t$ is diagonalized in the basis of the SΛΣ states, yielding the relativistic adiabatic Ω states. The symmetry used in this calculation is C∞v, with a common set of molecular orbitals for all symmetries. Semi-empirical spin-orbit pseudo-potentials have been designed for the Cs atom [52-53]. The present investigation of the electronic structure including the spin-orbit effect for the molecular ion CsH+ has been performed by using the package CIPSO (Configuration Interaction by Perturbation of a multiconfiguration wave function with Spin-Orbit interaction) of the Laboratoire de Physique Quantique, Toulouse, France, which allows a full CI calculation as well as perturbative CI calculations with SO effects.
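To make the two potential forms above concrete, the following is a minimal sketch that evaluates the radial factor $U_\ell(r) = \sum_i c_i r^{n_i} e^{-\alpha_i r^2}$ and the core-polarization energy $-\frac{1}{2}\alpha_k \mathbf{f}_k \cdot \mathbf{f}_k$ of a single core. It is not the CIPSO implementation. The coefficients passed to `semilocal_radial` are hypothetical, the cut-off functions are left out, and the field is taken to be that of a bare unit point charge; only the Cs+ polarizability of 15.117 a₀³ is taken from Table 1 of this chapter.

```python
import numpy as np

def semilocal_radial(r, c, n, alpha):
    """U_l(r) = sum_i c_i * r**n_i * exp(-alpha_i * r**2) for one value of l."""
    r = np.asarray(r, dtype=float)
    return sum(ci * r**ni * np.exp(-ai * r**2) for ci, ni, ai in zip(c, n, alpha))

def point_charge_field(charge, source, at):
    """Electric field (atomic units) at position `at` from a point charge at `source`."""
    d = np.asarray(at, dtype=float) - np.asarray(source, dtype=float)
    return charge * d / np.linalg.norm(d) ** 3

def cpp_energy(alpha_core, field):
    """Contribution of one core to V_cpp: -(1/2) * alpha_k * f_k . f_k."""
    field = np.asarray(field, dtype=float)
    return -0.5 * alpha_core * (field @ field)

# Hypothetical one-term radial factor: 2 * exp(-0.5 r^2) evaluated at r = 1.
u_example = semilocal_radial(1.0, c=[2.0], n=[0], alpha=[0.5])

# Polarization of the Cs+ core (alpha = 15.117 a0^3) by a proton 10 a0 away,
# with no cut-off function applied (a simplifying assumption).
ALPHA_CS = 15.117
field_on_cs = point_charge_field(1.0, source=[0.0, 0.0, 10.0], at=[0.0, 0.0, 0.0])
v_cpp = cpp_energy(ALPHA_CS, field_on_cs)
```

For a single point charge at distance R the energy reduces to $-\alpha/(2R^4)$, which shows why the polarization term matters mainly at short internuclear distance.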
B. Vibration-Rotation Calculation
In the Rayleigh-Schrödinger perturbation theory the eigenvalue $E_{vJ}$ and the eigenfunction $\Psi_{vJ}$ are given respectively by
\[ E_{vJ} = \sum_{n=0} e_n \lambda^n \tag{1} \]
\[ \Psi_{vJ}(r) = \sum_{n=0} \phi_n(r)\, \lambda^n \tag{2} \]
where r is the internuclear distance, v and J are respectively the vibrational and rotational quantum numbers, $\lambda = J(J+1)$, and $e_0 = E_v$, $e_1 = B_v$, $e_2 = -D_v$, …; $\phi_0$ is the pure vibrational wave function and $\phi_n$ its rotational corrections. By replacing Eqs. (1) and (2) into the radial Schrödinger equation [54-58]
\[ \left[ \frac{d^2}{dr^2} + \frac{2\mu}{\hbar^2}\left(E_{vJ} - U(r)\right) - \frac{\lambda}{r^2} \right] \Psi_{vJ}(r) = 0 \tag{3} \]
one can write [36]
\[ \phi_0''(r) + [e_0 - U(r)]\, \phi_0(r) = 0 \tag{4} \]
\[ \phi_1''(r) + [e_0 - U(r)]\, \phi_1(r) = -[e_1 - R(r)]\, \phi_0(r) \tag{5-1} \]
\[ \phi_2''(r) + [e_0 - U(r)]\, \phi_2(r) = -[e_1 - R(r)]\, \phi_1(r) - e_2\, \phi_0(r) \tag{5-2} \]
\[ \phi_n''(r) + [e_0 - U(r)]\, \phi_n(r) = R(r)\, \phi_{n-1}(r) - \sum_{m=1}^{n} e_m\, \phi_{n-m}(r) \tag{5-n} \]
where $R(r) = 1/r^2$. The first equation is the pure vibrational Schrödinger equation and the remaining equations are called the rotational Schrödinger equations. One may project Eqs. (5) onto $\phi_0$ and find [47]
\[ \langle \phi_0 | \phi_0 \rangle\, e_1 = \langle \phi_0 | \frac{1}{r^2} | \phi_0 \rangle \tag{6-1} \]
\[ \langle \phi_0 | \phi_0 \rangle\, e_2 = \langle \phi_0 | \frac{1}{r^2} | \phi_1 \rangle - e_1 \langle \phi_0 | \phi_1 \rangle \tag{6-2} \]
\[ \langle \phi_0 | \phi_0 \rangle\, e_n = \langle \phi_0 | \frac{1}{r^2} | \phi_{n-1} \rangle - \sum_{m=1}^{n-1} e_m \langle \phi_{n-m} | \phi_0 \rangle \tag{6-n} \]
Once $e_0$ is calculated from Eq. (4), $e_1$, $e_2$, $e_3$, … can be obtained by using Eqs. (5) and (6) alternately.
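The alternation between Eqs. (5) and (6) can be illustrated numerically. The sketch below is not the canonical functions method of Refs. [46-48]; it swaps in a plain finite-difference matrix representation and solves Eq. (5-1) in the eigenbasis of the vibrational problem. It works in the reduced units of Eq. (4), in which the factor 2μ/ħ² is absorbed into the energies and the potential. The harmonic test potential and the grid are hypothetical.

```python
import numpy as np

def rotational_expansion(r, potential, v=0):
    """Return (e0, e1, e2) of Eq. (1) for vibrational level v on a uniform grid r."""
    h = r[1] - r[0]
    n = r.size
    # Eq. (4) as a matrix problem: H phi0 = e0 phi0 with H = -d2/dr2 + U(r).
    H = (np.diag(2.0 / h**2 + potential)
         - np.diag(np.full(n - 1, 1.0 / h**2), 1)
         - np.diag(np.full(n - 1, 1.0 / h**2), -1))
    energies, states = np.linalg.eigh(H)
    e0, phi0 = energies[v], states[:, v]
    R = 1.0 / r**2
    # Eq. (6-1); eigh returns unit-norm vectors, so <phi0|phi0> = 1.
    e1 = phi0 @ (R * phi0)
    # Eq. (5-1) rearranged as (H - e0) phi1 = (e1 - R) phi0 and solved in the
    # eigenbasis of H, leaving out phi0 so that phi1 is orthogonal to phi0.
    rhs = (e1 - R) * phi0
    others = np.arange(n) != v
    coeff = (states[:, others].T @ rhs) / (energies[others] - e0)
    phi1 = states[:, others] @ coeff
    # Eq. (6-2).
    e2 = phi0 @ (R * phi1) - e1 * (phi0 @ phi1)
    return e0, e1, e2

# Hypothetical harmonic well centred at r = 3 (reduced units).
r = np.linspace(1.0, 5.0, 400)
U = 50.0 * (r - 3.0) ** 2
e0, e1, e2 = rotational_expansion(r, U)
```

With these inputs `e1` plays the role of $B_v$ and `-e2` that of $D_v$. A direct way to check the result is to diagonalise the same matrix with the centrifugal term $\lambda/r^2$ added and compare the lowest eigenvalue with $e_0 + e_1\lambda + e_2\lambda^2$.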
3. The Results
Atomic Calculation
The electronic structure of the molecular ion CsH+ is studied with and without the spin-orbit (SO) coupling. The spin-orbit effects have been studied by using the package CIPSO (Configuration Interaction by Perturbation including Spin-Orbit coupling) of the Laboratoire de Physique Quantique, Toulouse, France. The values of the cut-off parameters involved in the polarization potentials are given in Table 1 [38,59].

Table 1. Parameters for the polarization potential of the Cs and H atoms

Atom   α (a₀³)   rk0 (a₀)   rk1 (a₀)   rk2 (a₀)
Cs     15.117    2.6915     1.8505     2.8070
H      1.000     1.000      1.000      1.000

Table 2. Energies of the lowest lying levels of the H-atom

Configuration   Term   J     Etheoretical (cm⁻¹)   Eexperimental (cm⁻¹)   δE/E (%)
1s              ²S     1/2   0.00                  0.00                   0.00
2p              ²P°    1/2   82312.7386            82258.9206             0.07
                       3/2   82324.9802            82259.2865             0.08
2s              ²S     1/2   82324.9802            82258.9559             0.08
3p              ²P°    1/2   97568.0092            97492.2130             0.08
                       3/2   97568.0092            97492.3214             0.08
3s              ²S     1/2   97568.0092            97492.2235             0.08
3d              ²D     3/2   97568.0092            97492.3212             0.08
                       5/2   97568.0092            97492.3574             0.08
4p              ²P°    1/2   102948.0576           102823.8505            0.12
                       3/2   102948.0576           102823.8962            0.12
4s              ²S     1/2   102948.0576           102823.8549            0.12
4d              ²D     3/2   102948.0576           102823.8961            0.12
                       5/2   102948.0576           102823.9114            0.12
Average relative error                                                    0.09
By using the Gaussian basis set given in Ref. [32] for the cesium atom and the modified cc-pV6Z basis [60] for the hydrogen atom, we calculate the energy levels up to 4d for H and up to 8s for Cs. The comparison of these values, in Tables 2 and 3, with those obtained experimentally [61-63] shows good agreement, with an average relative error δE/E = 0.09% for the H atom and 0.23% for the Cs atom. This agreement confirms the validity and the accuracy of the basis sets chosen in this calculation.

Table 3. Energies of the lowest lying levels of the Cs-atom

Configuration   Term   J     Etheoretical (cm⁻¹)   Eexperimental (cm⁻¹)   δE/E (%)
6s              ²S     1/2   0.00                  0.00                   0.00
6p              ²P°    1/2   11172.7488            11178.2686             0.04
                       3/2   11728.5025            11732.3079             0.03
5d              ²D     3/2   14480.8502            14499.2584             0.12
                       5/2   14579.0300            14596.8423             0.12
7s              ²S     1/2   18537.3276            18535.529              ~0.00
7p              ²P°    1/2   21837.0533            21765.35               0.32
                       3/2   22019.7714            21946.396              0.33
6d              ²D     3/2   22776.7197            22588.8210             0.83
                       5/2   22818.5012            22631.6863             0.82
8s              ²S     1/2   24354.4958            24317.0185             0.01
Average relative error                                                    0.23
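The relative errors quoted for the H atom can be checked with a few lines of arithmetic. The sketch below recomputes δE/E from the theoretical and experimental columns of Table 2. Taking the error of the ground level, whose reference energy is zero by definition, to be zero is a convention assumed here so that the average matches the tabulated one.

```python
# Levels of the H atom in the order of Table 2 (cm^-1).
theory = [0.00, 82312.7386, 82324.9802, 82324.9802,
          97568.0092, 97568.0092, 97568.0092, 97568.0092, 97568.0092,
          102948.0576, 102948.0576, 102948.0576, 102948.0576, 102948.0576]
experiment = [0.00, 82258.9206, 82259.2865, 82258.9559,
              97492.2130, 97492.3214, 97492.2235, 97492.3212, 97492.3574,
              102823.8505, 102823.8962, 102823.8549, 102823.8961, 102823.9114]

def relative_error_percent(calculated, reference):
    """|calculated - reference| / reference in percent; 0 when the reference is 0."""
    if reference == 0:
        return 0.0
    return abs(calculated - reference) / reference * 100.0

errors = [relative_error_percent(t, e) for t, e in zip(theory, experiment)]
average_error = sum(errors) / len(errors)
```

Rounded to two decimals, the individual entries and their mean agree with the last column of Table 2.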
3.3. Spin-Orbit Effects Neglected
For the molecular ion CsH+ the potential energy curves (PECs) of the ²Λ(+) states have been investigated in the range 3.0a₀ ≤ r ≤ 60a₀ of the internuclear distance for the states dissociating into the 16 asymptotes considered, up to the dissociation limit H(4d ²D3/2,5/2) + Cs+(¹S0), i.e. up to 16 states ²Σ⁺, 9 states ²Π and 4 states ²Δ, as displayed in Table 4. This table also reports the calculated energies at the dissociation limits; the comparison of these values with those obtained experimentally [62-63] shows very good agreement, with an average relative error of 0.06%. The PECs versus the internuclear distance for the ²Λ(+) states are plotted in Figs. 1-3. Among the 29 calculated PECs, 22 states are found to be attractive. For each bound state the harmonic vibrational constant ωe, the rotational constant Be, the equilibrium internuclear distance re and the electronic transition energy with respect to the ground state Te are calculated by fitting the energy data around the equilibrium position to a polynomial in the internuclear distance r; these values are given in Table 5. The comparison of our calculated values of re and Be for the ground state with those available in the literature shows good agreement, with relative errors of 3.6% and 7.5% respectively. Double-minimum potentials are obtained for the states (14)²Σ⁺, (16)²Σ⁺ and (9)²Π; the minima of these potentials are given in Table 5. No comparison is possible for the other results since they are given here for the first time.
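The polynomial fit used to extract re, ωe and Be can be sketched as follows. This is an illustration under stated assumptions, not the authors' fitting code: the polynomial degree, the fitting window, the conversion constant and the Morse test curve with its reduced mass are all hypothetical choices, and everything is kept in atomic units until the final conversion to cm⁻¹.

```python
import numpy as np

HARTREE_TO_CM = 219474.63  # conversion from hartree to cm^-1

def spectroscopic_constants(r, energy, mu, degree=6, half_width=0.4):
    """Fit E(r) near its minimum and return (r_e in bohr, omega_e and B_e in cm^-1).

    r and energy are in atomic units; mu is the reduced mass in electron masses.
    """
    i0 = int(np.argmin(energy))
    window = np.abs(r - r[i0]) <= half_width + 1e-12
    x = r[window] - r[i0]                    # shift the origin for a stable fit
    poly = np.polyfit(x, energy[window], degree)
    roots = np.roots(np.polyder(poly))       # stationary points of the fit
    real_roots = roots[np.abs(roots.imag) < 1e-8].real
    x_e = real_roots[np.argmin(np.abs(real_roots))]
    force_constant = np.polyval(np.polyder(poly, 2), x_e)
    r_e = r[i0] + x_e
    omega_e = np.sqrt(force_constant / mu)   # harmonic frequency in hartree
    b_e = 1.0 / (2.0 * mu * r_e**2)          # rotational constant in hartree
    return r_e, omega_e * HARTREE_TO_CM, b_e * HARTREE_TO_CM

# Hypothetical Morse curve used only to exercise the fit.
D, a, r_eq, mu = 0.1, 0.6, 6.37, 1837.0
r = np.arange(3.0, 20.0, 0.05)
morse = D * (1.0 - np.exp(-a * (r - r_eq))) ** 2
r_e, omega_e, b_e = spectroscopic_constants(r, morse, mu)
```

For a Morse curve the exact harmonic force constant is $2Da^2$, so the fitted ωe can be compared with $\sqrt{2Da^2/\mu}$ as a consistency check.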
Table 4. Numbering of the various Λ-states of CsH+ correlated adiabatically to the 16 lowest dissociation limits

Cs              H               ²Σ⁺   ²Π   ²Δ   Etheoretical (cm⁻¹)   Eexperimental (cm⁻¹)   δE/E (%)
¹S0             1s ²S1/2        1               0                     0                      0.00
6s ²S1/2        ¹S0             2               78268.9598            78268.9598             0.00
¹S0             2p ²P1/2,3/2    3     1         82318.8594            82259.1035             0.07
¹S0             2s ²S1/2        4               82324.9802            82258.9559             0.08
6p ²P1/2,3/2    ¹S0             5     2         89719.5854            89724.2480             ~0.00
5d ²D3/2,5/2    ¹S0             6     3    1    92798.8999            92817.0101             0.02
7s ²S1/2        ¹S0             7               96806.2874            96804.4888             ~0.00
¹S0             3s ²S1/2        8               97568.0092            97492.2235             0.08
¹S0             3p ²P1/2,3/2    9     4         97568.0092            97492.2672             0.08
¹S0             3d ²D3/2,5/2    10    5    2    97568.0092            97492.3393             0.08
7p ²P1/2,3/2    ¹S0             11    6         100197.3722           100124.8328            0.07
6d ²D3/2,5/2    ¹S0             12    7    3    101066.5703           100879.2135            0.18
8s ²S1/2        ¹S0             13              102623.4556           102585.9783            0.03
¹S0             4s ²S1/2        14              102948.0576           102823.8549            0.12
¹S0             4p ²P1/2,3/2    15    8         102948.0576           102823.8733            0.12
¹S0             4d ²D3/2,5/2    16    9    4    102948.0576           102823.9037            0.12
Average relative error                                                                       0.06
Figure 1. Potential energy curves of the states ²Σ⁺ of the molecule CsH+ (E in cm⁻¹ versus R in Bohr).

Figure 2. Potential energy curves of the states ²Π of the molecule CsH+ (E in cm⁻¹ versus R in Bohr).
Table 5. Calculated spectroscopic constants for the various (k)²Λ states of CsH+

State       Te (cm⁻¹)       re (Å)            ωe (cm⁻¹)   Be (cm⁻¹)
(1)²Σ⁺      0.00            3.372, 3.25 (a)   460.350     1.479, 1.59 (a)
(2)²Σ⁺      204220.06       3.467             409.50      1.402
(3)²Σ⁺      401701.74       4.611             263.99      0.792
(4)²Σ⁺      414566.28       5.178             129.26      0.628
(5)²Σ⁺      418211.74       7.854             115.45      0.273
(6)²Σ⁺      423296.11       9.699             96.250      0.179
(7)²Σ⁺      428459.83       11.65             81.54       0.124
(8)²Σ⁺      432489.77       15.185            41.21       0.073
(9)²Σ⁺      433523.86       24.742            6.017       0.029
(10)²Σ⁺     435366.04       16.978            46.20       0.058
(11)²Σ⁺     438110.28       19.817            40.05       0.042
(12)²Σ⁺     441472.00       11.932            66.71       0.117
(14)²Σ⁺     444554.93       13.921            62.13       0.086
            Max 444583.4    at r = 14.4
            443451.88       23.834            32.93       0.029
(15)²Σ⁺     444571.45       17.765            72.78       0.053
(16)²Σ⁺     447113.83       15.030            78.56       0.074
            Max 447239.8    at r = 16.6
            447111.94       25.870            7.20        0.025
(1)²Π       204796.07       3.519             344.80      1.359
(2)²Π       414065.64       3.425             457.16      1.437
(3)²Π       424463.77       6.946             53.227      0.344
(4)²Π       429363.48       9.583             59.41       0.183
(6)²Π       438641.40       14.171            36.11       0.083
(8)²Π       444031.66       18.80             17.15       0.047
(9)²Π       447229.32       13.270            46.83       0.095
            Max 447249.3    at r = 14
            447118.05       23.338            7.14        0.030

(a) Ref. [64].
These PECs present quite complex forms (humps and wells) at short and large values of the internuclear distance, which are due to either crossings or avoided crossings between states. The internuclear distance at the avoided crossing, rAC, and the energy difference ΔEAC between the two corresponding states at these points are given in Table 6 for the different states. As an illustration, the avoided crossing between the states (9)²Σ⁺ and (8)²Σ⁺ is shown in Fig. 4.
Figure 3. Potential energy curves of the states ²Δ of the molecule CsH+ (E in cm⁻¹ versus R in Bohr).
Table 6. Some avoided crossings between ²Λ⁺ states of the molecular ion CsH+

(n+1) State / (n) State   rAC (Bohr)   ΔEAC (cm⁻¹)
(8)²Σ⁺ / (7)²Σ⁺           12.4         1573.97
(9)²Σ⁺ / (8)²Σ⁺           10.4         1461.58
(9)²Σ⁺ / (8)²Σ⁺           18.4         33.14
(12)²Σ⁺ / (11)²Σ⁺         8.3          543.19
(12)²Σ⁺ / (11)²Σ⁺         11.5         547.72
(12)²Σ⁺ / (11)²Σ⁺         20.6         504.07
(15)²Σ⁺ / (14)²Σ⁺         6.4          357.90
(15)²Σ⁺ / (14)²Σ⁺         17.8         473.14
(15)²Σ⁺ / (14)²Σ⁺         32           118.38
(16)²Σ⁺ / (15)²Σ⁺         10.6         154.25
(5)²Π / (4)²Π             8.2          1886.88
(9)²Π / (8)²Π             8.3          1630.83
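The entries of Table 6 are positions where the gap between two adjacent adiabatic curves passes through a local minimum. As a minimal sketch of how such points can be located on tabulated curves, the function below scans the energy difference for interior local minima. The two-state model used to exercise it is hypothetical; its crossing point and coupling are chosen so that the gap mimics the (9)²Σ⁺/(8)²Σ⁺ entry at 18.4 Bohr. A local minimum of the gap is only a candidate, and deciding whether it is a genuine avoided crossing requires inspecting the wave functions.

```python
import numpy as np

def gap_minima(r, e_lower, e_upper):
    """Return (r, gap) pairs at interior local minima of e_upper - e_lower."""
    r = np.asarray(r, dtype=float)
    gap = np.asarray(e_upper, dtype=float) - np.asarray(e_lower, dtype=float)
    inner = gap[1:-1]
    is_minimum = (inner < gap[:-2]) & (inner <= gap[2:])
    return [(r[i], gap[i]) for i in np.where(is_minimum)[0] + 1]

# Hypothetical two-state model: two linear diabatic curves crossing at r_c
# with a constant coupling, giving adiabatic curves separated by 2*coupling at r_c.
r = np.linspace(10.0, 25.0, 1501)
r_c, slope, coupling, center = 18.4, 40.0, 16.57, 434200.0
half_gap = np.sqrt((slope * (r - r_c)) ** 2 + coupling**2)
candidates = gap_minima(r, center - half_gap, center + half_gap)
r_ac, delta_e_ac = candidates[0]
```

On real PECs the grid is usually coarser than in this model, so an interpolation of the gap around the located grid point would normally be added before quoting rAC and ΔEAC.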
Figure 4. Avoided crossing between the (9)²Σ⁺ and (8)²Σ⁺ states of the molecule CsH+ (E in cm⁻¹ versus R in Bohr).
3.4. Spin-Orbit Effects Included
Energy calculations in the Ω-representation are performed for the states corresponding to the 16 lowest dissociation limits, i.e. up to H(4d ²D3/2,5/2) + Cs+(¹S0), in the range of internuclear distance r from 3a₀ to 60a₀. Consequently 25 states of Ω=1/2, 13 states of Ω=3/2 and 4 states of Ω=5/2 are correlated adiabatically, as shown in Table 7. The PECs of these states are displayed in Figs 7, 8 and 9 respectively. Among these calculated PECs 24 states are found to be attractive and the others are repulsive.

Table 7. Numbering of the various Ω-states of CsH+ correlated adiabatically to the 25 lowest dissociation limits

Cs              H               Ω=1/2    Ω=3/2    Ω=5/2
¹S0             1s ²S1/2        1
6s ²S1/2        ¹S0             2
¹S0             2p ²P1/2,3/2    3,4      1
¹S0             2s ²S1/2        5
6p ²P1/2,3/2    ¹S0             6,7      2
5d ²D3/2,5/2    ¹S0             8,9      3,4      1
7s ²S1/2        ¹S0             10
¹S0             3s ²S1/2        11
¹S0             3p ²P1/2,3/2    12,13    5
¹S0             3d ²D3/2,5/2    14,15    6,7      2
7p ²P1/2,3/2    ¹S0             16,17    8
6d ²D3/2,5/2    ¹S0             18,19    9,10     3
8s ²S1/2        ¹S0             20
¹S0             4s ²S1/2        21
¹S0             4p ²P1/2,3/2    22,23    11
¹S0             4d ²D3/2,5/2    24,25    12,13    4
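The way Λ states combine into Ω states when the total Hamiltonian is diagonalized can be illustrated with a textbook toy model, which is not the CIPSO procedure and uses no pseudo-potential. It assumes a single p electron with an atomic-like coupling $\zeta\, \boldsymbol{\ell} \cdot \mathbf{s}$, so that a ²Σ⁺ and a ²Π state of energies `e_sigma` and `e_pi` give a 2×2 block for Ω=1/2 and a single level for Ω=3/2. The function name and the choice of ζ, taken here from the experimental Cs 6p fine-structure splitting in Table 3, are illustrative assumptions.

```python
import numpy as np

def p_shell_omega_levels(e_sigma, e_pi, zeta):
    """Omega = 1/2 and 3/2 energies from one 2Sigma+ and one 2Pi state.

    Toy model: one p electron with H_SO = zeta * l.s, written in the basis
    {2Sigma+(1/2), 2Pi(1/2)} for Omega = 1/2.
    """
    coupling = zeta / np.sqrt(2.0)
    block_half = np.array([[e_sigma, coupling],
                           [coupling, e_pi - zeta / 2.0]])
    omega_half = np.linalg.eigvalsh(block_half)   # ascending order
    omega_three_half = e_pi + zeta / 2.0
    return omega_half, omega_three_half

# At dissociation towards Cs(6p) the Sigma and Pi curves are degenerate.
E_6P_HALF, E_6P_THREE_HALF = 11178.2686, 11732.3079   # cm^-1, Table 3
zeta = 2.0 / 3.0 * (E_6P_THREE_HALF - E_6P_HALF)
centroid = E_6P_HALF + zeta
omega_half, omega_three_half = p_shell_omega_levels(centroid, centroid, zeta)
```

With degenerate inputs the Ω=1/2 block returns the two atomic fine-structure levels, which is the pattern behind the pairs such as "6,7" in the Ω=1/2 column of Table 7; at shorter distance, where the ²Σ⁺ and ²Π curves separate, the same block mixes them.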
For each bound state the harmonic vibrational constant ωe, the rotational constant Be, the equilibrium internuclear distance re and the electronic transition energy with respect to the ground state Te are calculated by fitting the energy data around the equilibrium position to a polynomial in the internuclear distance r. These values are given in Tables 8 and 9, together with the main parents ²Λ(+) of the states Ω=1/2 and Ω=3/2 near the equilibrium positions. It should be noted that such an identification was not possible for the other states since their minima are situated close to the crossings between ²Λ(+) states.

Figure 7. Potential energy curves of the states Ω=1/2 of CsH+ (E in cm⁻¹ versus R in Bohr).

Figure 8. Potential energy curves of the states Ω=3/2 of the molecule CsH+ (E in cm⁻¹ versus R in Bohr).

Figure 9. Potential energy curves of the states Ω=5/2 of CsH+ (E in cm⁻¹ versus R in Bohr).
Since the spin-orbit coupling has no effect on the Σ state, we compare these results with those available in the literature without spin-orbit coupling. For the ground state, this comparison shows very good agreement, with relative differences of 3.6% and 7.5% for re and Be, respectively. No comparison is made for the other results, since they are given here for the first time.
Table 8. Calculated spectroscopic constants for the various (n)Ω=1/2 states of CsH+ n[(k)2s+1Λ]
Te(cm-1)
re(Å)
ωe(cm-1)
Be(cm-1)
(1) [(1)2Σ+]
0.00
3.372 3.25a
460.356
1.479 1.59a
(2) [(2)2Σ+]
204220.06
3.467
409.473
1.402
(3) [(1)2Π]
204795.65
3.519
345.114
1.359
(4) [(3)2Σ+]
401700.63
4.611
263.463
0.792
3.426 At r=3.9 5.177
456.004
1.437
(5) [(4)2Σ+]
414063.89 Max 414752.7 414565.97
129.213
0.628
(6)
414792.88
4.169
480.602
0.974
418211.24
7.855
115.627
0.272
(8) [(6) Σ ]
423222.19
9.696
96.473
0.179
(9) [(3)2Π]
424450.23
7.574
1105.84
0.541
(10) [(7)2Σ+]
428383.70
11.694
78.041
0.123
(11) [(4) Π]
429284.02
10.256
58.061
0.159
(12) [(8)2Σ+]
432489.53
15.182
41.239
0.073
(15) [(10)2Σ+]
435366.45
16.977
46.185
0.058
(16)
438617.06 Max 438026.4 438110.06
12.252 At r=13.7 19.815
26.045
0.082
40.064
0.042
(17) [(6) Π]
438666.78
15.846
95.567
0.066
(18) [(12)2Σ+]
441471.64
11.934
66.424
0.117
(21) [(14) Σ ]
443451.88
23.834
32.939
0.029
(22)
444555.10 Max 444583.4 444026.53 444571.46 Max 444632.2 444080.58
12.624 At r=13.8 17.430 17.765 At r=18.5 30.278
92.079
0.085
58.353 72.777
0.044 0.053
15.702
0.017
(24) [(16) Σ ]
447113.96
15.019
79.332
0.075
(25)
447237.12 Max 447249.2 447225.70 Max 447239.8 447118.11
12.224 At r=13.5 15.187 At r=15.9 23.973
67.025
0.081
225.477
0.042
102.062
0.028
2
(5) [(2) Π]
(7) [(5)2Σ+] 2 +
2
(16) [(11)2Σ+] 2
2 +
(22) [(8)2Π] (23) [(15)2Σ+] (23) 2 +
(25) [(9)2Π] (25) [(9)2Π] (a) Ref.[64].
Double minima potentials are obtained for the states (5)Ω=1/2, (16)Ω=1/2, (22)Ω=1/2, (23)Ω=1/2 and (13)Ω=3/2, while the (25)Ω=1/2 is a triple well potential state. Table 9. Calculated spectroscopic constants for the various (n) Ω=3/2 states of CsH+ n[(k)2s+1Λ]
Te(cm-1)
re(Å)
ωe(cm-1)
Be(cm-1)
(1) [(1)2Π]
204796.49
3.519
344.896
1.359
2
414067.25
3.424
458.206
1.438
2
(4) [(3) Π]
424505.55
7.170
14.270
0.295
(5)
414565.97
5.177
129.213
0.628
(2) [(2) Π]
(8) [(6)2Π]
438666.91
14.144
36.317
0.084
2
444033.05
18.801
17.144
0.047
2
447229.54 Max 447249.4 447118.05
13.272 At r=13.4 23.337
46.757
0.095
7.246
0.030
(11) [(8) Π] (13) [(9) Π] (13) [(9)2Π]
At the equilibrium internuclear distance re, the SO splittings of the states (1, 2, 3, 6, 8, 9)2Π have been identified and evaluated; the differences between the lowest and highest energies are 0.84 cm-1, 3.36 cm-1, 0.13 cm-1, 6.52 cm-1, 3.84 cm-1 and 0.6 cm-1, respectively. In the Ω-representation the PECs present avoided crossings of quite complex form (humps and wells) at short and large values of the internuclear distance, which are due to either crossings or avoided crossings of the Λ-states. The internuclear distance at the avoided crossing, rAC, and the energy difference ΔEAC between the two corresponding states at these points are given in Tables 10 and 11. Two of these avoided crossings are drawn, as illustration, in Figures 10 and 11.
Table 10. Some avoided crossings between Ω=1/2 states of CsH+. rAC and ΔEAC are respectively the internuclear distance and the energy difference at the avoided crossing between the two corresponding states (n+1)Ω=1/2/(n)Ω=1/2
rAC(Bohr)
ΔEAC(cm-1)
(3)Ω=1/2/(2) Ω=1/2 (5)Ω=1/2/(4) Ω=1/2 (6)Ω=1/2/(5) Ω=1/2 (8)Ω=1/2/(7) Ω=1/2 (9)Ω=1/2/(8) Ω=1/2 (11)Ω=1/2/(10) Ω=1/2
5.1 4.7 7.7 5.2 13.9 17.7 6.1
25.18 169.97 46.92 137.64 109.66 378.17 8.87
Crossing between (n)state/(m)state (2)2Σ+/(1)2Π (3)2Σ+/(2)2Π (4)2Σ+/(2)2Π (5)2Σ+/(3)2Π (6)2Σ+/(3)2Π (7)2Σ+/(4)2Π (8)2Σ+/(5)2Π
(13)Ω=1/2/(12) Ω=1/2
12.1 18.8 18.2
81.59 18.14 52.56
(8)2Σ+/(5)2Π (9)2Σ+/(5)2Π (9)2Σ+/(5)2Π
(14)Ω=1/2/(13) Ω=1/2
Avoided crossing (n)state/(m)state
Table 10. Continued
(n+1)Ω =1/2/(n)Ω=1/2
rAC(Bohr)
ΔEAC(cm-1)
Crossing between (n)state/(m)state
(16)Ω=1/2/(15) Ω=1/2
11.4
65.71
(10)2Σ+/(6)2Π
(17)Ω=1/2/(16) Ω=1/2
29.4
66.70
(11)2Σ+/(6)2Π
8.8
3.26
(11)2Σ+/(7)2Π
11.5
519.22
(12)2Σ+/(11)2Σ+
20.6
502.78
(12)2Σ+/(11)2Σ+
8.3
545.70
(12)2Σ+/(11)2Σ+
11.1
121.37
(12)2Σ+/(7)2Π
12
88.07
(12)2Σ+/(7)2Π
17.7
4.62
(12)2Σ+/(7)2Π
34.5
62.83
(12)2Σ+/(7)2Π
16
44.80
(13)2Σ+/(8)2Π
21.7
18.20
(14)2Σ+/(8)2Π
25.8
6.20
(14)2Σ+/(8)2Π
35
73.10
(14)2Σ+/(8)2Π
32.2
163
56.5
5.5
(15)2Σ+/(8)2Π
26.4
8.5
(16)2Σ+/(9)2Π
31
13.5
(16)2Σ+/(9)2Π
(18)Ω=1/2/(17) Ω=1/2
(19)Ω=1/2/(18) Ω=1/2
(21)Ω=1/2/(20) Ω=1/2 (22)Ω=1/2/(21) Ω=1/2
(23)Ω=1/2/(22) Ω=1/2 (25)Ω=1/2/(24) Ω=1/2
Avoided crossing (n)state/(m)state
(15)2Σ+/(14)2Σ+
Table 11. Some avoided crossings between Ω=3/2 states of CsH+

(n+1)Ω=3/2/(n)Ω=3/2     rAC(Bohr)   ΔEAC(cm-1)   Crossing between      Avoided crossing
                                                 (n)state/(m)state     (n)state/(m)state
(6)Ω=3/2/(5)Ω=3/2       9           63.07        (4)2Π/(2)2Δ
(7)Ω=3/2/(6)Ω=3/2       8.2         1848.77                            (5)2Π/(4)2Π
(8)Ω=3/2/(7)Ω=3/2       4.6         14.75        (5)2Π/(3)2Δ
(9)Ω=3/2/(8)Ω=3/2       7.4         20.47        (6)2Π/(3)2Δ
(12)Ω=3/2/(11)Ω=3/2     11.8        26.63        (8)2Π/(4)2Δ
Each Ω-state, with a few exceptions, has more than one main parent Λ-state. This is shown in Tables 12 and 13, which give the percentage of each parent Λ-state over a given range of the internuclear distance of the Ω-states.
Figure 10. Avoided crossing between (8)Ω=1/2 and (9)Ω=1/2 is due to crossing between (6)2Σ+ and (3)2Π. [Axes: E (cm-1) versus R (Bohr); curves shown: (8)Ω=1/2, (9)Ω=1/2, (6)Σ+, (3)Π.]
Figure 11. Avoided crossing between (22) Ω=1/2 and (23) Ω=1/2 is due to avoided crossing between (14) 2Σ+ and (15) 2Σ+.
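The link between a crossing of two parent curves and an avoided crossing of the resulting Ω-states can be illustrated with a generic two-state model. This is a sketch under stated assumptions, not the procedure used to build Tables 10 and 11: the linear diabatic curves, the coupling of 50 cm-1 and the function names are hypothetical. Two coupled curves repel each other, the smallest gap occurs where the uncoupled curves cross, and that gap equals twice the coupling.

```python
import math


def adiabatic_pair(e1, e2, coupling):
    """Eigenvalues (lower, upper) of the 2x2 matrix [[e1, V], [V, e2]]."""
    mean = 0.5 * (e1 + e2)
    half_gap = math.hypot(0.5 * (e1 - e2), coupling)
    return mean - half_gap, mean + half_gap


def locate_avoided_crossing(r_grid, lower, upper):
    """Grid point with the smallest gap between two adjacent curves."""
    gaps = [u - l for l, u in zip(lower, upper)]
    i = min(range(len(gaps)), key=gaps.__getitem__)
    return r_grid[i], gaps[i]


# Hypothetical diabatic curves (cm-1) crossing at r = 13.9 Bohr,
# coupled by a constant V = 50 cm-1 (illustrative numbers only).
r_grid = [13.0 + 0.01 * i for i in range(201)]
diab_a = [424600.0 - 300.0 * (r - 13.9) for r in r_grid]
diab_b = [424600.0 + 100.0 * (r - 13.9) for r in r_grid]
pairs = [adiabatic_pair(a, b, 50.0) for a, b in zip(diab_a, diab_b)]
lower = [p[0] for p in pairs]
upper = [p[1] for p in pairs]

r_ac, gap = locate_avoided_crossing(r_grid, lower, upper)
print(f"r_AC = {r_ac:.2f} Bohr, gap = {gap:.2f} cm-1")
```

Applied to two tabulated adjacent Ω-curves, `locate_avoided_crossing` alone would return estimates of rAC and ΔEAC on the computed grid.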
Table 12. Parent states for the potential energy curves of (n)Ω=1/2

(n)     From    to      %       State
(1)     3       60.5    100     (1)2Σ+
(2)     3       5       3.4     (1)2Π
        5       60.5    96.5    (2)2Σ+
(3)     3       5       3.4     (2)2Σ+
        5       60.5    96.5    (1)2Π
(4)     3       4.7     2.9     (2)2Π
        4.7     60.5    97      (3)2Σ+
(5)     3       4.7     2.9     (3)2Σ+
        4.7     7.9     5.5     (2)2Π
        7.9     60.5    91.5    (4)2Σ+
(6)     3       7.8     8.3     (4)2Σ+
        7.8     60.5    91.6    (2)2Π
(7)     3       5       3.4     (3)2Π
        5       60.5    96.5    (5)2Σ+
(8)     3       5.1     3.6     (5)2Σ+
        5.1     14      15.4    (3)2Π
        14      60.5    80.9    (6)2Σ+
(9)     3       4.7     2.9     (4)2Π
        4.7     13.9    16      (6)2Σ+
        13.9    60.5    81.1    (3)2Π
(10)    3       4.7     2.9     (6)2Σ+
        4.7     17.6    22.4    (4)2Π
        17.6    60.5    74.6    (7)2Σ+
(11)    3       17.6    25.3    (7)2Σ+
        17.6    60.5    74.6    (4)2Π
(12)    3       6       5.2     (8)2Σ+
        6       12      10.4    (5)2Π
        12      60.5    84.3    (8)2Σ+
(13)    3       6       5.2     (5)2Π
        6       12.4    11.1    (8)2Σ+
        12.4    60.5    83.6    (5)2Π
(14)    3       60.5    100     (9)2Σ+
(15)    3       11.5    14.7    (6)2Π
        11.5    60.5    85.2    (10)2Σ+
(16)    3       4.7     2.9     (7)2Π
        4.7     11.3    11.4    (10)2Σ+
        11.3    60.5    85.6    (6)2Π
(17)    3       5       3.4     (10)2Σ+
        5       8.7     6.4     (7)2Π
        8.7     60.5    90.1    (11)2Σ+
(18)    3       8.8     10      (11)2Σ+
        8.8     60.5    89.9    (7)2Π
(19)    3       60.5    100     (12)2Σ+
(20)    3       6.6     5.7     (8)2Π
        6.6     11.8    9       (13)2Σ+
        11.8    16      7.3     (8)2Π
        16      60.5    77.9    (13)2Σ+
(21)    3       6.7     6.4     (13)2Σ+
        6.7     11.5    8.3     (8)2Π
        11.5    16      7.8     (13)2Σ+
        16      60.5    77.4    (8)2Π
(22)    3       3.7     1.2     (13)2Σ+
        3.7     5.2     2.6     (9)2Π
        5.2     21.6    28.5    (14)2Σ+
        21.6    60.5    67.6    (8)2Π
(23)    3       5.2     3.8     (14)2Σ+
        5.2     13.3    14      (9)2Π
        13.3    60.5    82.1    (15)2Σ+
(24)    3       13.3    17.9    (15)2Σ+
        13.3    26.4    22.7    (9)2Π
        26.4    32      9.7     (16)2Σ+
        32      44      20.8    (9)2Π
        44      60.5    28.8    (16)2Σ+
(25)    3       26.4    41      (15)2Σ+
        26.4    31      8       (9)2Π
        31      44      22.6    (15)2Σ+
        44      60.5    28.3    (9)2Π
Table 13. Parent states for the potential energy curves of (n)Ω=3/2

(n)     From    to      %       State
(1)     3       60.5    100     (1)2Π
(2)     3       60.5    100     (2)2Π
(3)     3       60.5    100     (1)2Δ
(4)     3       60.5    100     (3)2Π
(5)     3       9       10.4    (2)2Δ
        9       60.5    89.5    (4)2Π
(6)     3       9       10.4    (4)2Π
        9       60.5    89.5    (2)2Δ
(7)     3       21.8    32.6    (5)2Π
        21.8    60.5    67.3    (2)2Δ
(8)     3       4.5     2.6     (5)2Π
        4.5     7.2     4.6     (3)2Δ
        7.2     60.5    92.7    (6)2Π
(9)     3       7.2     7.3     (6)2Π
        7.2     60.5    92.6    (3)2Δ
(10)    3       60.5    100     (7)2Π
(11)    3       11.7    15.1    (4)2Δ
        11.7    60.5    84.8    (8)2Π
(12)    3       11.7    15.1    (8)2Π
        11.7    60.5    84.8    (4)2Δ
(13)    3       60.5    100     (9)2Π
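The percentages in Tables 12 and 13 appear to be the share of the scanned range, from 3 to 60.5 in the units of the tables, that is covered by each parent state. This reading is an inference from the tabulated numbers and is not stated in the text; most entries agree with it to within the last digit, and several look truncated rather than rounded. The function name and the constants below are assumptions made for illustration.

```python
R_MIN, R_MAX = 3.0, 60.5   # scanned range as listed for single-parent states


def parent_percentage(r_from, r_to, r_min=R_MIN, r_max=R_MAX):
    """Share of the scanned range covered by [r_from, r_to], in percent."""
    return 100.0 * (r_to - r_from) / (r_max - r_min)


# Ranges taken from Table 13 for the (8)Ω=3/2 state.
segments = [(3.0, 4.5, "(5)2Π"), (4.5, 7.2, "(3)2Δ"), (7.2, 60.5, "(6)2Π")]
for r_from, r_to, parent in segments:
    print(f"{parent}: {parent_percentage(r_from, r_to):.1f}%")
```

The three shares add up to 100% by construction, which gives a quick consistency check on any row of the two tables.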
3.5. The Vibration-Rotation Calculation

The canonical functions approach [1, 2, 3] enables us to calculate the eigenvalue energy Ev, the rotational constant Bv and the centrifugal distortion constant Dv at any vibrational level, even near dissociation. However, this approach fails where avoided crossings between states occur, because of the breakdown of the Born-Oppenheimer approximation at these points. Here, for CsH+, these constants have been calculated for 8 states in the Λ-representation and 9 states in the Ω-representation, up to the vibrational level v=19. Using a cubic spline interpolation between each two consecutive energy values of the PECs, together with the eigenvalue energies Ev, the abscissas of the turning points (rmin, rmax) of these states have been calculated. The constants for the states (1, 2, 3)2Σ+ are reported in Tables 14 to 16; the others are given in Ref. [59]. These values are not compared with other results, since they are given here for the first time.

Table 14. Values for the eigenvalues (Ev), the abscissas of the turning points (rmin, rmax), the rotational constant (Bv), and the centrifugal distortion constant (Dv) for the state (1)2Σ+ of the molecular ion CsH+

v    Ev(cm-1)    rmin(Å)   rmax(Å)   Bv(cm-1)   Dv×10^5(cm-1)
0    214.728     3.156     3.716     1.423      7.036
1    587.566     3.049     4.131     1.276      9.568
2    873.605     2.997     4.552     1.326      11.934
3    1089.894    2.966     5.035     0.971      14.5043
Table 15. Values for the eigenvalues (Ev), the abscissas of the turning points (rmin, rmax), the rotational constant (Bv), and the centrifugal distortion constant (Dv) for the state (2)2Σ+ of the molecular ion CsH+

v    Ev(cm-1)    rmin(Å)   rmax(Å)   Bv(cm-1)   Dv×10^5(cm-1)
0    190.687     3.240     3.849     1.3260     7.9254
1    501.900     3.134     4.319     1.1693     10.5019
2    737.346     3.083     4.795     1.0219     11.8595
3    921.226     3.051     5.287     0.8949     12.8388
Table 16. Values for the eigenvalues (Ev), the abscissas of the turning points (rmin, rmax), the rotational constant (Bv), and the centrifugal distortion constant (Dv) for the state (3)2Σ+ of the molecular ion CsH+

v    Ev(cm-1)    rmin(Å)   rmax(Å)   Bv(cm-1)   Dv×10^5(cm-1)
0    129.638     4.290     5.014     0.7785     2.9326
1    377.385     4.096     5.384     0.7490     3.0982
2    608.215     3.981     5.696     0.7169     3.3159
3    820.972     3.899     5.999     0.6819     3.5967
4    1014.498    3.835     6.314     0.6435     3.9675
5    1187.593    3.785     6.655     0.6011     4.4562
6    1339.148    3.745     7.038     0.5544     5.0841
7    1468.346    3.714     7.484     0.5031     5.8789
8    1574.907    3.690     8.018     0.4475     6.8411
9    1659.412    3.761     8.676     0.3887     7.9306
10   1723.533    3.658     9.504     0.3287     9.0650
11   1769.979    3.648     10.564    0.2700     10.1569
12   1802.086    3.641     11.938    0.2149     11.1810
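The turning-point step described in Section 3.5 can be sketched as follows. The text does not say which spline end conditions were used, so natural end conditions are assumed here; the function names, the bisection tolerance and the harmonic test curve are likewise hypothetical and are not CsH+ data. The sketch builds a cubic spline through the tabulated curve and solves V(r) = Ev on either side of the minimum.

```python
from bisect import bisect_right


def natural_cubic_spline(x, y):
    """Return a callable natural cubic spline through the points (x[i], y[i])."""
    n = len(x) - 1
    h = [x[i + 1] - x[i] for i in range(n)]
    # Tridiagonal system for the interior second derivatives m[1..n-1].
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
           for i in range(1, n)]
    for k in range(1, n - 1):                  # forward elimination
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    m = [0.0] * (n + 1)                        # natural ends: m[0] = m[n] = 0
    for k in range(n - 2, -1, -1):             # back substitution
        m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def spline(t):
        i = min(max(bisect_right(x, t) - 1, 0), n - 1)
        a, b = x[i + 1] - t, t - x[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (y[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return spline


def turning_points(r, v, e_level, tol=1e-10):
    """Turning points (rmin, rmax) where the splined curve equals e_level."""
    i_min = min(range(len(v)), key=v.__getitem__)
    if e_level <= v[i_min]:
        raise ValueError("level lies below the tabulated minimum")
    spline = natural_cubic_spline(r, v)

    def bisect_root(lo, hi):
        # spline - e_level changes sign between lo and hi
        f_lo = spline(lo) - e_level
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            f_mid = spline(mid) - e_level
            if (f_mid > 0.0) == (f_lo > 0.0):
                lo, f_lo = mid, f_mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    j = i_min                                   # walk inward wall
    while j > 0 and v[j] <= e_level:
        j -= 1
    k = i_min                                   # walk outward wall
    while k < len(r) - 1 and v[k] <= e_level:
        k += 1
    if v[j] <= e_level or v[k] <= e_level:
        raise ValueError("level is not bracketed by the tabulated curve")
    return bisect_root(r[j], r[j + 1]), bisect_root(r[k - 1], r[k])


# Hypothetical harmonic curve V = 1000 (r - 3.5)^2 in cm-1, r in Angstrom.
r = [2.5 + 0.05 * i for i in range(41)]
v = [1000.0 * (ri - 3.5) ** 2 for ri in r]
print(turning_points(r, v, 300.0))
```

For a real PEC the same call would be repeated for each Ev, with energies measured on the same scale as the tabulated curve.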
3.6. Dipole Moment

Knowledge of the permanent and transition dipole moments is essential. For the molecular ion CsH+ we calculate the transition electric dipole moment and the permanent electric dipole moment of the different bound Ω-states as functions of the internuclear distance r. These data provide information about the most efficient scheme for forming the CsH+ molecular ion. Moreover, our ab initio potentials for the excited states can be used to identify the complex behavior of the transition dipole moment as a function of r. Permanent dipole moments $M_a^a(r)$ as well as all non-zero transition dipole moments $M_{ab}(r)$,

$$ M_{ab} = \langle \Psi_e^a | \mu_e(r) | \Psi_e^b \rangle , $$

have been calculated for each pair of electronic states (a, b) under consideration and over the whole range of r investigated here. $\Psi_e^a$ and $\Psi_e^b$ are respectively the electronic wave functions of two different electronic states and $\mu_e(r)$ is the permanent electronic dipole moment. This dipole moment function has been calculated for most of the Ω-states [59]. In Fig. 15 we show, as illustration, the transition dipole moment between the states (22)Ω=1/2 → (21)Ω=1/2. The three peaks, at 21.7, 25.8 and 35.0 Bohr respectively, correspond to crossings or avoided crossings between the states (14)2Σ+ and (8)2Π, as shown in Figures 16, 17 and 18 and given in Table 10.
Figure 15. Variation of the transition dipole moment between the (22)Ω=1/2 → (21)Ω=1/2 states. [Vertical axis labelled Re(r); horizontal axis R (Bohr).]
Figure 16. Crossing between (14)2Σ+ and (8)2Π states. [Axes: E (cm-1) versus R (Bohr), 21 to 22.6 Bohr; curves shown: (21)Ω=1/2, (22)Ω=1/2, (14)Σ+, (8)Π.]
Figure 17. Crossing between (14)2Σ+ and (8)2Π states. [Axes: E (cm-1) versus R (Bohr), 24.8 to 26.8 Bohr; curves shown: (21)Ω=1/2, (22)Ω=1/2, (14)Σ+, (8)Π.]
Figure 18. Crossing between (14)2Σ+ and (8)2Π states. [Axes: E (cm-1) versus R (Bohr), 33 to 38 Bohr; curves shown: (21)Ω=1/2, (22)Ω=1/2, (14)Σ+, (8)Π.]
4. Conclusion

Using an ab initio approach, the potential energy has been calculated for 16 states 2Σ+, 9 states 2Π, 4 states 2Δ, 25 states Ω=1/2, 13 states Ω=3/2 and 4 states Ω=5/2 of the molecular ion CsH+. For 19 bound states, the harmonic vibrational constant ωe, the internuclear distance re and the electronic transition energy with respect to the ground state Te have been calculated. Based on the canonical functions method, Ev, Bv and Dv have been calculated up to the vibrational level v = 19 for 17 electronic states. From the calculated values of Ev for a given vibrational level v, and by using a cubic spline interpolation between each two consecutive points of the potential energy curves, the turning points rmin and rmax have been determined for these bound states. Permanent dipole moments $M_a^a(r)$ as well as all non-zero transition dipole moments $M_a^b(r)$ (a≠b) have been calculated for each pair of electronic states (a, b) under consideration and over the whole range of r investigated here. The comparison of the present results with those available in the literature shows very good agreement.
References

[1] John Weiner, Vanderlei S. Bagnato, Sergio Zilio, Paul S. Julienne, Rev. Mod. Phys. 71, 1 (1999).
[2] R. Côté and A. Dalgarno, Phys. Rev. A 58, 498 (1998).
[3] A. Fioretti, D. Comparat, A. Crubellier, O. Dulieu, F. Masnou-Seeuws, and P. Pillet, Phys. Rev. Lett. 80, 4402 (1998); T. Takekoshi, B. M. Patterson, and R. J. Knize, ibid. 81, 5109 (1998); A. N. Nikolov, E. E. Eyler, X. T. Wang, J. Li, H. Wang, W. C. Stwalley, and P. L. Gould, ibid. 82, 703 (1999); N. Balakrishnan, R. C. Forrey, and A. Dalgarno, ibid. 80, 3224 (1998).
[4] R. Côté, A. Dalgarno, and Y. Sun, Phys. Rev. Lett. 74, 3581 (1995).
[5] D. J. Heinzen, R. Wynar, P. D. Drummond, and K. V. Kheruntsyan, Phys. Rev. Lett. 84, 5029 (2000).
[6] A. N. Nikolov, E. E. Eyler, X. T. Wang, J. Li, H. Wang, W. C. Stwalley, and P. L. Gould, Phys. Rev. Lett. 82, 703 (1999).
[7] W. C. Stwalley and H. Wang, J. Mol. Spectrosc. 195, 194 (1999); R. Côté and A. Dalgarno, ibid. 195, 236 (1999).
[8] M. Mark, T. Kraemer, P. Waldburger, J. Herbig, C. Chin, H.-C. Naegerl, R. Grimm, Phys. Rev. Lett. 99, 113201 (2007).
[9] H. Bethlem and G. Meijer, Int. Rev. Phys. Chem. 22, 73 (2003).
[10] O. Docenko, M. Tamanis, J. Zaharova, R. Ferber, A. Pashov, H. Knöckel and E. Tiemann, J. Chem. Phys. 124, 174310 (2006).
[11] I. Klincare, J. Zaharova, M. Tamanis, R. Ferber, A. Zaitsevskii, E. A. Pazyuk, and A. V. Stolyarov, Phys. Rev. A 76, 032511 (2007).
[12] O. Docenko, M. Tamanis, R. Ferber, A. Pashov, H. Knöckel, and E. Tiemann, Eur. Phys. J. D 31, 205 (2004).
[13] J. M. Sage, S. Sainis, T. Bergeman, and D. DeMille, Phys. Rev. Lett. 94, 203001 (2005).
[14] D. Wang, J. Qi, M. F. Stone, O. Nikolayeva, H. Wang, B. Hattaway, S. D. Gensemer, P. L. Gould, E. E. Eyler, and W. C. Stwalley, Phys. Rev. Lett. 93, 243005 (2004).
[15] G. Roati, F. Riboli, G. Modugno, and M. Inguscio, Phys. Rev. Lett. 89, 150403 (2002).
[16] G. Modugno, G. Ferrari, G. Roati, R. J. Brecha, A. Simoni, M. Inguscio, Science 294, 1320 (2001); G. Modugno, G. Roati, F. Riboli, F. Ferlaino, R. J. Brecha, M. Inguscio, Science 297, 2240 (2002).
[17] Roberto Casalbuoni and Giuseppe Nardulli, Rev. Mod. Phys. 76(1), 263 (2004).
[18] Hui Hu, Xia-Ji Liu, and Peter D. Drummond, Phys. Rev. Lett. 98(7), 070403 (2007).
[19] D. S. Petrov, G. E. Astrakharchik, D. J. Papoular, C. Salomon, and G. V. Shlyapnikov, Phys. Rev. Lett. 99(13), 130407 (2007).
[20] Christoph Graf vom Hagen, Thesis for the degree of Doctor of Natural Sciences, Ruperto-Carola University of Heidelberg, Germany (2008).
[21] H. Scheidt, G. Speiss, A. Valance, and P. Pradel, J. Phys. B 11, 2665 (1978).
[22] L. Kahn, P. Baybutt, and D. J. Truhlar, J. Chem., 65, 3826 (1976).
[23] T. J. Millar, P. R. A. Farquhar, and K. Willacy, Astron. Astrophys., Suppl. Ser. 121, 139 (1997).
[24] G. E. Langer, C. F. Prosser, and C. Sneden, Astron. J. 100, 216 (1990).
[25] M. Samland, Astrophys. J. 496, 155 (1998).
[26] A. Natta and C. Giovanardi, Astrophys. J. Lett. 356, 646 (1990).
[27] J. K. Watson and D. M. Meyer, Astrophys. J. Lett. 473, L127 (1996).
[28] R. R. Lewis and W. L. Williams, Phys. Rev. Lett. 59B, 70 (1975).
[29] E. A. Hinds, Phys. Rev. 44, 374 (1980).
[30] J. P. Lawrence, G. G. Ohlson and J. L. McKibben, Phys. Lett. 28B, 594 (1969).
[31] D. De Mille, Phys. Rev. Lett. 88, 067901 (2002).
[32] M. Korek, A. R. Allouche, M. Kobeissi, A. Chaalan, M. Dagher, K. Fakherddin and M. Aubert-Frécon, Chem. Phys. 256, 1 (2000).
[33] M. Korek, A. R. Allouche, K. Fakherddine, and A. Chaalan, Can. J. Phys. 78, 977 (2000).
[34] M. Korek and A. R. Allouche, J. Phys. B: At. Mol. Opt. Phys. 34, 3689 (2001).
[35] M. Korek, A. R. Allouche, and S. N. Abdul al, Can. J. Phys. 80, 1025 (2002).
[36] M. Korek, G. Younes, and A. R. Allouche, Int. J. Quant. Chem. 92, 376 (2003).
[37] M. Korek and G. Younes, Int. J. Quant. Chem. 101, 84 (2005).
[38] M. Korek, K. Badreddine, A. R. Allouche, Can. J. Phys. In press.
[39] D. Maynau and J. P. Daudey, Chem. Phys. 81, 273 (1981).
[40] W. Müller and W. Meyer, J. Chem. Phys. 80, 3311 (1984).
[41] F. Spiegelmann, D. Pavolini, and J. P. Daudey, J. Phys. B: At. Mol. Opt. Phys. 22, 2456 (1989).
[42] J. C. Barthelat and Ph. Durand, Gazz. Chem. Ital. 108, 225 (1978).
[43] Ph. Durand and J. C. Barthelat, Theor. Chim. Acta 38, 283 (1975).
[44] D. Pavolini, T. Gustavson, F. Spiegelmann and J. P. Daudey, J. Phys. B: At. Mol. Opt. Phys. 22, 1721 (1989).
[45] W. R. Wadt and P. J. Hay, J. Chem. Physics 82, 299 (1985).
[46] H. Kobeissi, M. Dahger, M. Korek and A. Chaalan, J. Comput. Chem. 4, 218 (1983).
[47] M. Korek and H. Kobeissi, J. Comput. Chem. 13, 1103 (1992).
[48] M. Korek, Comput. Phys. Commun. 119, 169 (1999).
[49] M. Foucrault, Ph. Millié, and J. P. Daudey, J. Chem. Phys. 96, 1257 (1992).
[50] S. Magnier and Ph. Millié, Phys. Rev. A 53, 204 (1996).
[51] A. R. Allouche, M. Korek, K. Fakherddin, A. Chaalan, M. Dagher, F. Taher and M. Aubert-Frécon, J. Phys. B: At. Mol. Opt. Phys. 33, 2307 (2000).
[52] A. R. Allouche, G. Nicolas, J. C. Barthelat and F. Spiegelmann, J. Chem. Phys. 96, 7646 (1992).
[53] S. Rousseau, A. R. Allouche, and M. Aubert-Frécon, J. Mol. Spectrosc. 203, 235 (2000).
[54] M. Korek and H. Kobeissi, Can. J. Phys. 73, 559 (1995).
[55] M. Korek, Can. J. Phys. 75, 795 (1997).
[56] M. Korek and K. Fakhreddine, Can. J. Phys. 78, 969 (2000).
[57] M. Korek, B. Hamdan and K. Fakhreddine, Physica Scripta 61, 66 (2000).
[58] G. Herzberg, Spectra of diatomic molecule, Van Nostrand, Toronto, 1950.
[59] H. Jawhari, Master Thesis, Beirut Arab University (2008).
[60] See EPAPS Document No. E-JCPSA6-129-605840 for supplementary tables. For more information on EPAPS, see http://www.aip.org/pubservs/epaps.html.
[61] S. A. Hammoud, Master Thesis, Beirut Arab University (2007).
[62] National Institute of Standards and Technology (NIST): Tables of Spectra of Hydrogen, Carbon, Nitrogen, and Oxygen Atoms and Ions, C. E. Moore, edited by J. W. Gallagher. CRC Handbook of Chemistry and Physics, Edition 76 (CRC Press, Boca Raton, FL), 336 pp. (1993).
[63] National Institute of Standards and Technology, http://physics.nist.gov/PhysRefData/ASD/
[64] L. Von Szentpaly, P. Fuentealba, H. Preuss and H. Stoll, Chem. Phys. Letters 93, 8 (1982).
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 141-155
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 7
THEORETICAL EXPLANATION OF LIGHT AMPLIFYING BY POLYETHYLENE FOIL

Vjekoslav Sajfert1,a, Dušan Popov2,b, Stevo Jacimovski3,c and Bratislav Tosic4,d
1 Technical Faculty “Mihajlo Pupin”, Zrenjanin, Serbia
2 Universitatea “Politehnica”, Timisoara, Romania
3 Police Academy, Belgrade, Serbia
4 Vojvodina Academy of Science and Arts, Novi Sad, Serbia

a E-mail address: [email protected].
b E-mail address: [email protected]. Corresponding author.
c E-mail address: [email protected].
d E-mail address: [email protected].

Abstract

In connection with the experimental result that polyethylene foil amplifies the transmitted light about three times, we propose two theoretical explanations of this phenomenon. One of them is that the several amplified peaks are a consequence of the formation of solitons in a polyethylene chain, whose velocities are close to the velocity of sound. The formation of solitons, together with the boundary conditions in a polyethylene macromolecular chain containing about thirty monomers, leads to the amplification of light. The second explanation requires the introduction of homeopolar excitons in polymer macromolecules. The energy gap of homeopolar excitons and the width of the homeopolar exciton zone are of the same order of magnitude. This means that transitions in a very wide zone give light quanta which are able to amplify the initial light. In order to avoid confusion and misunderstandings, we wish to point out the following. Atoms and molecules as a whole are treated classically (transition through potential barriers, for example). The exception to this rule is the phonon theory of crystals, where the phonon is considered as a quantum of a boson field, i.e. in the theory of mechanical oscillations, molecules and atoms as a whole are treated quantum mechanically. On the other hand, elementary excitations in crystals such as excitons, vibrons, spin waves and ferroelectric excitations, which arise from changes in some parts of atoms or molecules, are treated exclusively quantum mechanically. In the analyses of this work, the excitations of an individual molecular subsystem (i.e. the quantum objects) serve as an explanation of the light amplification by a polymer chain.
1. Introduction

In previous works [1,2] it was found experimentally that polyethylene foil noticeably amplifies the intensity of light, and that the amplification is proportional to the foil thickness. Seven lines of a mercury lamp were separated, and each of them was amplified. The experimental procedure is described in detail in [1] and will therefore not be repeated here; we quote only the results of the experiment. The seven lines of the mercury lamp were amplified 3–4 times after the light passed through a polyethylene foil. The wavelengths of the amplified peaks were 254, 315, 366, 436, 506, 577 and 624 nanometers. The seven separated initial lines are most probably a consequence of impurities in the Hg, since for absolutely pure Hg only one level exists, at λ = 253.7 nm. We shall try to explain the amplification of the incident peaks within the framework of two models. One model is based on the idea that solitons appearing in polyethylene molecular chains, whose velocity is close to the velocity of sound, create the conditions for light amplification. The second model requires introducing homeopolar excitons into the system of possible excitations of the polyethylene chain. In this case, the amplification of light by polyethylene arises as a consequence of the fact that the width of the homeopolar exciton zone is of the same order of magnitude as the excitation energy of an individual polyethylene monomer. The amplification of the peaks was produced by one polyethylene foil of thickness 2 mm; the peaks were amplified 3–4 times by this foil. The amplification produced by two foils (thickness: 4 mm) could not be registered, since the devices were not able to register the height of the peaks. This means that the amplification increases sharply with the thickness of the polyethylene foil. The theoretical explanation of this amplification of light intensity given in [1] was based on population inversion of electrons. Namely, it was assumed that metastable energy levels exist in the polyethylene chain, where electrons gather and from which they pass coherently to the ground state. This coherent transition, as in a laser, leads to the energy amplification. In the work [2], the amplification was explained by exciton and soliton transitions. The excitons were not of dipole-dipole type but of exchange (homeopolar) type, since polyethylene monomers form linear polyethylene macromolecules (polyethylene chains) through homeopolar forces whose potentials are of the order of magnitude of the excitation energy of an electron in an isolated monomer. The behavior of such excitons is analyzed in this work with the goal of explaining the amplification of light.
2. Excitons and Solitons in Infinite Polyethylene Chains

Frenkel excitons [3] most often appear in molecular crystals, where dipole-dipole interactions propagate the excitations of an isolated molecule produced by visible light. The excitation energy of an isolated molecule lies between 2.5 eV and 5 eV, while the energies of dipole-dipole interactions are 50–100 times lower. In such a situation, the exciton energy and the energy of visible photons are practically identical. Exchange forces (covalent bonding of monomers) are responsible for forming the linear polyethylene chain from C2H4 monomers [4,5]. Since the potential energy between two electrons is repulsive, it is of positive sign. On the other hand, if the electron spins are parallel, the configurational part of the wave function must be antisymmetric, and this leads to negative exchange integrals, which bind the monomers in the chain. Consequently, if one isolated monomer of the chain is excited by a quantum of visible light, this excitation propagates along the chain in the field of covalent forces. The matrix elements of the covalent potential are of the order of five electronvolts, i.e. of approximately the same magnitude as the excitation energy of an isolated molecule. This is the essential difference with respect to excitons in molecular crystals. In polyethylene chains the width of the exciton zone is practically the same as the excitation energy of an isolated monomer. This means that in polyethylene chains excitons can appear whose energy is two or three times higher than the energy of the visible light which produces them. This amplification is registered in the experiment described in the previous section. The mechanical oscillations interacting with excitons lead to the formation of new quasiparticles, solitons [6], which can have higher energy than the exciton. Besides, solitons are quasiparticles of stable form which move superfluidly and have a longer luminescence time than excitons. The luminescence time is about $10^{-8}$ s for singlet excitons, $10^{-3}$ s for triplet excitons (corresponding to parallel electron spins) [7], and about $10^{-2}$ s to $10^{-1}$ s for solitons [8]. The long luminescence times mean that the metastable states last sufficiently long to cause coherent illumination. The above ideas will be demonstrated on the simple case of two-level monomer excitation. The electronic Hamiltonian of the monomer chain can be written in the following way:
$$ H = \sum_{n,s} \varepsilon_s\, a_{ns}^{+} a_{ns} + \frac{1}{2} \sum_{n,m} \sum_{s_1,s_2,s_3,s_4} W_{nm}(s_1,s_2,s_3,s_4)\, a_{ns_1}^{+} a_{ms_3}^{+} a_{ms_4} a_{ns_2} \tag{1} $$

where $a$ are Fermi operators of electrons localized on a monomer and $\varepsilon_s$ (which are of the order of 2.5–5 eV) are the excitation energies of an isolated monomer. The matrix elements $W$ include Coulomb and exchange forces and are given by

$$ W_{nm} = \int d^3\xi_n\, d^3\xi_m\, \Psi_{s_1}^{*}(\xi_n)\, \Psi_{s_3}^{*}(\xi_m)\, \frac{e^2}{|n-m|}\, \Psi_{s_4}(\xi_m)\, \Psi_{s_2}(\xi_n) \tag{2} $$

where $\xi$ are the internal coordinates of the monomers. Because we consider a two-level scheme, the indices $s_1, s_2, s_3, s_4$ take only two values: 0, corresponding to the ground state, and $f$, corresponding to the excited state.
Here and below, to simplify the writing of equations, we use in the denominator the notation $|n-m| \equiv |\xi_n - \xi_m|$.

Introducing the monomer excitation operators $P^{+} = a_f^{+} a_0$ and $P = a_0^{+} a_f$, we easily conclude that they are closed in the electron subspace $(1_0, 0_f)$ and $(0_0, 1_f)$. The operators $P^{+}$ and $P$ are Pauli operators, which in the lowest ASQ approximation can be replaced by Bose operators $B^{+}$ and $B$. Extracting from the Hamiltonian only the terms quadratic in $P$ and replacing $P$ with the Bose operators $B$, we obtain the following excitonic Hamiltonian in the ASQ approximation (ground-state terms as well as the terms proportional to $P^{+}P^{+}$ and $PP$ are omitted):

$$ H = \sum_n \left[ \varepsilon_f - \varepsilon_0 - V(0,0,0,0) + \frac{V(f,f,0,0) + V(0,0,f,f)}{2} \right] B_n^{+} B_n + \sum_{n,m} \frac{1}{2} \left[ W_{nm}(f,0,0,f) + W_{nm}(0,f,f,0) \right] B_n^{+} B_m \tag{3} $$

where $V = \sum_m W_{n-m}$.
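It will now be shown that $W(f,f,0,0)$ and $W(0,0,f,f)$ are positive Coulomb terms, while $W(f,0,0,f)$ and $W(0,f,f,0)$ are negative exchange terms. The antisymmetric electron pair function is given by:

$$ \Psi_{nm} = \frac{1}{\sqrt{2}} \left[ \Psi_f(\xi_n)\Psi_0(\xi_m) - \Psi_f(\xi_m)\Psi_0(\xi_n) \right] \tag{4} $$

and, as a consequence of this fact, the integral of the electron-electron interaction $\int d^3\xi_n\, d^3\xi_m\, \Psi_{nm}^{*} \frac{e^2}{|n-m|} \Psi_{nm}$ consists of four integrals:

$$
\begin{aligned}
\int d^3\xi_n\, d^3\xi_m\, \Psi_{nm}^{*} \frac{e^2}{|n-m|} \Psi_{nm}
= \frac{1}{2} \Big\{ & \int d^3\xi_n\, d^3\xi_m\, \Psi_f^{*}(\xi_n) \Psi_0^{*}(\xi_m) \frac{e^2}{|n-m|} \Psi_0(\xi_m) \Psi_f(\xi_n) \\
& + \int d^3\xi_n\, d^3\xi_m\, \Psi_0^{*}(\xi_n) \Psi_f^{*}(\xi_m) \frac{e^2}{|n-m|} \Psi_f(\xi_m) \Psi_0(\xi_n) \\
& - \int d^3\xi_n\, d^3\xi_m\, \Psi_f^{*}(\xi_n) \Psi_0^{*}(\xi_m) \frac{e^2}{|n-m|} \Psi_f(\xi_m) \Psi_0(\xi_n) \\
& - \int d^3\xi_n\, d^3\xi_m\, \Psi_0^{*}(\xi_n) \Psi_f^{*}(\xi_m) \frac{e^2}{|n-m|} \Psi_0(\xi_m) \Psi_f(\xi_n) \Big\} \\
= \frac{1}{2} \big[ & W_{nm}(f,f,0,0) + W_{nm}(0,0,f,f) + W_{nm}(f,0,0,f) + W_{nm}(0,f,f,0) \big]
\end{aligned}
\tag{5}
$$

In the last line the minus signs of the two exchange integrals are understood to be absorbed into $W_{nm}(f,0,0,f)$ and $W_{nm}(0,f,f,0)$, in line with the statement above that these terms are negative.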
Introducing the notations

$$ \Delta = \varepsilon_f - \varepsilon_0, \qquad D = \frac{V(f,f,0,0) + V(0,0,f,f)}{2} - V(0,0,0,0), \qquad -W_{nm} = \frac{1}{2} \left[ W_{nm}(f,0,0,f) + W_{nm}(0,f,f,0) \right] \tag{6} $$

and taking the nearest-neighbor approximation ($W_{n,n\pm 1} = W$), the Hamiltonian (3) can be written as (see [5,6]):

$$ H_{exc} = (\Delta + D) \sum_n B_n^{+} B_n - W \sum_n B_n^{+} \left( B_{n+1} + B_{n-1} \right); \qquad W > 0 \tag{7} $$
By means of the Fourier transformation

$$ B_n = \frac{1}{\sqrt{N}} \sum_k e^{ikan} B_k \tag{8} $$

the Hamiltonian (7) goes over to

$$ H_{exc} = \sum_k E_k B_k^{+} B_k \tag{9} $$

where

$$ E_k \equiv E_{exc}(k) = \Delta + D - 2W \cos ak \tag{10} $$
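Equation (10) can be evaluated directly. The sketch below uses the illustrative values quoted in this section for the exchange-coupled chain (Δ = 3 eV, D = 1 eV, W = 2.5 eV) and the dipole-dipole estimate of 0.06 eV for D − 2W cos ak; the function name is an assumption introduced for illustration.

```python
import math


def exciton_energy(ak, delta, d, w):
    """Eq. (10): E_exc(k) = Delta + D - 2 W cos(ak), all energies in eV."""
    return delta + d - 2.0 * w * math.cos(ak)


# Values quoted in this section for the exchange-coupled chain (eV).
DELTA, D, W = 3.0, 1.0, 2.5

e_top = exciton_energy(math.pi, DELTA, D, W)   # band top, reached at ak = pi
band_width = 4.0 * W                           # E(ak = pi) - E(ak = 0)
ratio = e_top / DELTA                          # relative to the monomer excitation

# Dipole-dipole case quoted in the text: D - 2 W cos(ak) is about 0.06 eV.
e_top_dipole = DELTA + 0.06

print(f"exchange chain: E_max = {e_top:.2f} eV, E_max / Delta = {ratio:.2f}")
print(f"band width 4W = {band_width:.2f} eV")
print(f"dipole-dipole:  E_max = {e_top_dipole:.2f} eV")
```

The factor of three between the band top and Δ comes entirely from the assumed W ≈ 2.5 eV; a smaller exchange integral would reduce it in proportion.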
Now we can estimate the exciton energy in a polymer chain with exchange interaction. We take $\Delta = 3$ eV, $D = 1$ eV (the matrix elements $W(f,f,0,0)$ are larger than the matrix elements $W(0,0,0,0)$) and $W \approx 2.5$ eV. The maximal exciton energy, reached for $ak = \pi$, is then $\Delta + D + 2W = 9$ eV, which is 3 times higher than the excitation energy (the energy of the photons). This corresponds to the results of the quoted experiment. In the case of dipole-dipole interaction we can take $D - 2W\cos ak = 0.06$ eV, so that $E_{exc}(k)_{max} = 3.06$ eV. This example points out a sharp difference between exciton energies in systems with dipole-dipole interactions and those in systems with exchange interactions. Finally, we briefly outline the soliton characteristics of this exciton system. To the excitonic Hamiltonian we add the phonon Hamiltonian:
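$$H_{ph} = \frac{1}{2M}\sum_n p_n^2 + \frac{C}{2}\sum_n \left(u_n - u_{n-1}\right)^2 \tag{11}$$

where $M$ is the mass of a monomer, $C$ is the Hooke's constant of the monomer chain, $u_n$ are the displacements of the monomers and $p_n = M\dot{u}_n$ are the corresponding momenta. The exciton-phonon Hamiltonian is given as follows:

$$H_{ep} = \sum_n \left[2\frac{\partial D}{\partial a}B_n^{+}B_n - \frac{\partial W}{\partial a}B_n^{+}\left(B_{n+1} + B_{n-1}\right)\right] a\,\frac{\partial u}{\partial (na)} \tag{12}$$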
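Using the complete Hamiltonian of the system

$$H = H_{exc} + H_{ph} + H_{ep} \tag{13}$$

and the equations of motion for $B$ we obtain

$$E B_n = (\Delta + D)B_n - W\left(B_{n+1} + B_{n-1}\right) + \left[2\frac{\partial D}{\partial a}B_n - \frac{\partial W}{\partial a}\left(B_{n+1} + B_{n-1}\right)\right] a\,\frac{\partial u}{\partial (na)} \tag{14}$$

and

$$M\frac{\partial^2 u_n}{\partial t^2} = C\left(u_{n+1} + u_{n-1} - 2u_n\right) \tag{15}$$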
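The equations can be averaged over the coherent states

$$\langle\Psi_{ep}|\,e^{\beta_n(B_n - B_n^{+})}\,B_n\,e^{-\beta_n(B_n - B_n^{+})}\,|\Psi_{ep}\rangle = \langle\Psi_{ep}|\,e^{\beta_n(B_n - B_n^{+})}\,B_n^{+}\,e^{-\beta_n(B_n - B_n^{+})}\,|\Psi_{ep}\rangle = \beta_n; \quad |\Psi_{ep}\rangle = |n_{exc}\rangle|n_{ph}\rangle; \quad \beta_n^{*} = \beta_n \tag{16}$$

and

$$\langle\Psi_{ep}|\,e^{\frac{i}{\hbar}\gamma_n p_n}\,u_n\,e^{-\frac{i}{\hbar}\gamma_n p_n}\,|\Psi_{ep}\rangle = -\gamma_n; \quad \gamma^{*} = \gamma \tag{17}$$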
After that we go over to the continuous variable $na \to x$ and to the common variable $\xi = x - vt$, where $v$ is the velocity of the soliton. After these operations, equation (15) goes over to:

$$\left(v^2 - s^2\right)\frac{d^2\gamma}{d\xi^2} = 0 \tag{18}$$

where

$$s^2 = a^2\frac{C}{M} \tag{19}$$

is the square of the velocity of sound in the chain. The boundary conditions for this equation are

$$\gamma(0) = 0, \quad \left.\frac{d\gamma}{d\xi}\right|_{\xi=0} = M^{-1}p_0 \tag{20}$$
where $p_0$ is the initial momentum which produces the mechanical oscillations. Incidentally, one of the explanations of the appearance of tsunamis is based on such initial conditions. By means of conditions (20) we obtain

$$\frac{d\gamma}{d\xi} = \frac{M^{-2}p_0^2}{v^2 - s^2} \tag{21}$$
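Equation (14), written in the continuum limit, becomes after the substitution (21):

$$\frac{d^2\beta}{d\xi^2} - \frac{E - \left[\Delta + D - 2W + \dfrac{2aM^{-2}p_0^2}{v^2 - s^2}\left(\dfrac{\partial V}{\partial a} - \dfrac{\partial W}{\partial a}\right)\right]}{a^2\left(W + a\dfrac{\partial W}{\partial a}\right)}\,\beta = 0 \tag{22}$$

If the fraction multiplying $\beta$ in (22) is positive, the solution of (22) is

$$\beta = c_1\cosh Q\xi = c_1\cosh Q(x - vt) \tag{23}$$

where

$$Q = \sqrt{\frac{E - \left[\Delta + D - 2W + \dfrac{2aM^{-2}p_0^2}{v^2 - s^2}\left(\dfrac{\partial V}{\partial a} - \dfrac{\partial W}{\partial a}\right)\right]}{a^2\left(W + a\dfrac{\partial W}{\partial a}\right)}} \tag{24}$$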
The details of further calculations can be found in ref. [6, pp. 27-31] and will not be reproduced here. For our aims it is important to note that the wave function of the soliton is proportional to the reciprocal value of $\beta$, i.e.

$$\Psi_{sol} = \frac{c_2}{\beta} = \frac{1}{\cosh Q(x - vt)} \tag{25}$$

The function $\Psi_{sol}$ is taken as $\Psi_{sol} = \cosh^{-1}[Q(x - vt)]$ since $Q$ contains an arbitrary parameter, the soliton energy $E$. The soliton energy $E_{sol}$ can be determined from the normalization condition:

$$\frac{1}{a}\int_{-\infty}^{+\infty} d\xi\, \Psi_{sol}^2 = 1 \tag{26}$$

which reduces to

$$\frac{2}{aQ} = 1 \tag{27}$$

wherefrom we obtain

$$E = \Delta + D + 2W + \frac{2aM^{-2}p_0^2}{v^2 - s^2}\left(\frac{\partial V}{\partial a} - \frac{\partial W}{\partial a}\right) + 4a\frac{\partial W}{\partial a} \tag{28}$$
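The step from the normalization condition (26) to the relation (27) relies on the integral of $\cosh^{-2}(Q\xi)$ over the whole axis being equal to $2/Q$. The sketch below checks this numerically with a plain trapezoidal rule; the lattice constant `a = 1.0` is an arbitrary assumption made only for illustration, and the helper name is hypothetical.

```python
import math

def sech2_integral(q, half_width=40.0, steps=200_000):
    """Trapezoidal estimate of the integral of cosh(q*xi)**-2 over [-L, L].

    L = half_width / q, so the integrand is negligible at both end points.
    """
    limit = half_width / q
    h = 2.0 * limit / steps
    total = 0.0
    for i in range(steps + 1):
        xi = -limit + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight / math.cosh(q * xi) ** 2
    return total * h

a = 1.0        # lattice constant in arbitrary units (assumption)
q = 2.0 / a    # value of Q required by Eq. (27)

# Left-hand side of Eq. (26) for Psi_sol = 1 / cosh(Q * xi).
norm = sech2_integral(q) / a
print(f"(1/a) * integral of Psi_sol^2 = {norm:.6f}")
```

Inserting $Q = 2/a$ into the square of (24) then gives the energy (28).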
The soliton wave function (25) has a stable form and propagates practically superfluidly. Comparing (28) with (10) we see that the soliton energy is somewhat higher than the exciton energy, but when $v \to s$, i.e. when the soliton velocity approaches the sound velocity, the soliton energy becomes noticeably higher than the exciton energy. This means that in the soliton model of excitations in polyethylene, the amplification can be considered a consequence of the sharp increase of soliton energies when the velocity of the solitons becomes very close to the velocity of sound.
3. Green’s Function of Homeopolar Excitons
The general form of the electron Hamiltonian, when the overlap of the electron wave functions of neighboring molecules is weak, is given by

$$H = \sum_{n=0}^{N}\sum_s \varepsilon_s a_{ns}^{+}a_{ns} + \frac{1}{2}\sum_{n,m=0}^{N}\ \sum_{s_1,s_2,s_3,s_4=0}^{\infty} W_{n-m}(s_1,s_2,s_3,s_4)\, a_{ns_1}^{+}a_{ns_2}a_{ns_3}^{+}a_{ns_4} \tag{29}$$

Here $a^{+}$ and $a$ are electron operators, $\varepsilon_s$ is the energy of an electron in an isolated monomer and $W$ are the matrix elements of the repulsive electron potential $e^2/|n-m|$ taken over products of configuration electron functions and spin electron functions. For two electrons forming bonds between two monomers the spin function is even (it corresponds to parallel spins), while the configurational function must be odd. This gives attractive transition potentials between monomers whose order of magnitude is 5 eV, i.e. of the same order of magnitude as the excitation energies of electrons in an isolated monomer. It should be noted that in the case of dipole-dipole interactions (Frenkel excitons) the attractive potentials between molecules are of the order of $10^{-2}$ eV. Using the well-known procedure taken from the theory of Frenkel excitons for two-level schemes of electron excitations [3, 4], and going over to the approximate second quantization approach, we can write the Hamiltonian (29) in the nearest-neighbor approximation as:
$$H = \Delta\sum_{n=0}^{N} B_n^{+}B_n + \sum_{n=0}^{N}\left(D_{n,n+1} + D_{n,n-1}\right)B_n^{+}B_n - \sum_{n=0}^{N} B_n^{+}\left(J_{n,n+1}B_{n+1} + J_{n,n-1}B_{n-1}\right) \tag{30}$$

where

$$\Delta = \varepsilon_f - \varepsilon_0; \quad D_{n,n\pm 1} = \frac{W_{n,n\pm 1}(f,f,0,0) + W_{n,n\pm 1}(0,0,f,f) - W_{n,n\pm 1}(0,0,0,0)}{2}; \quad J_{n,n\pm 1} = \frac{W_{n,n\pm 1}(f,0,0,f) + W_{n,n\pm 1}(0,f,f,0)}{2}; \quad D > 0,\ J > 0 \tag{31}$$

Here $\varepsilon_f$ denotes the energy of the excited electron, while $\varepsilon_0$ denotes the electron energy in the ground state. Assuming that the polyethylene linear chain has N monomers (N is 30-50 monomers) we must introduce the boundary conditions

$$D_{0,-1} = D_{N,N+1} = 0; \quad J_{0,-1} = J_{N,N+1} = 0 \tag{32}$$
The exciton system with Hamiltonian (30) will be analyzed by means of the “real space” Green’s function method. The double-time Green’s functions are

$$G_{nm}(t) = \langle\langle B_n(t)\,|\,B_m^{+}(0)\rangle\rangle = \theta(t)\,\langle[B_n(t), B_m^{+}(0)]\rangle; \quad \theta(t) = \begin{cases} 1, & t > 0 \\ 0, & t < 0 \end{cases} \tag{33}$$

where $[\ ,\ ]$ is the commutator, $\langle\ \rangle$ is the thermal average over the grand canonical ensemble and $\theta(t)$ is the Heaviside function. Using the equations of motion for the Bose operators $B$ and taking into account the boundary conditions (32) we obtain, after the time-frequency Fourier transformation, the system of three difference equations defining the Green’s functions,
$$J\left[G_{n+1,m}(\omega) + G_{n-1,m}(\omega)\right] + \rho\, G_{n,m}(\omega) = \frac{i}{2\pi}\delta_{n,m}; \quad 1 \le n \le N-1 \tag{34}$$

$$J G_{1,m}(\omega) + (\rho + D)\,G_{0,m}(\omega) = \frac{i}{2\pi}\delta_{0,m}; \quad n = 0 \tag{35}$$

$$J G_{N-1,m}(\omega) + (\rho + D)\,G_{N,m}(\omega) = \frac{i}{2\pi}\delta_{N,m}; \quad n = N \tag{36}$$

where

$$\rho = E - \Delta - 2D \tag{37}$$
It can be shown that the system of equations (34) - (36) reduces to the single equation (39) by the substitution:

$$G_{n,m}(\omega) = \sum_{\nu=1}^{N+1} A_\nu(m,\omega)\,F_\nu(n) \tag{38}$$

and

$$\sum_{\nu=0}^{N}\left(2J\cos\varphi_\nu + \rho\right)A_\nu(m,\omega)\,F_\nu(n) = \frac{i}{2\pi}\delta_{n,m}; \quad n = 0, 1, 2, \ldots, N \tag{39}$$

where

$$F_\nu(n) = \sin(n+1)\varphi_\nu - \frac{D}{J}\sin n\varphi_\nu \tag{40}$$
The parameters $\varphi_\nu$ are the solutions of the transcendental equation

$$\sin(N+2)\varphi_\nu - 2\frac{D}{J}\sin(N+1)\varphi_\nu + \left(\frac{D}{J}\right)^2\sin N\varphi_\nu = 0 \tag{41}$$

The Kronecker symbol will be represented as follows

$$\delta_{n,m} = \sum_{\nu=1}^{N+1} F_\nu(n)\,\Phi_\nu(m); \quad n, m \in \{0, 1, 2, \ldots, N\} \tag{42}$$

Putting equation (42) into (39) we obtain

$$A_\nu(n,\omega) = \Phi_\nu(n)\,g_\nu(\omega) \tag{43}$$
where

$$g_\nu(\omega) = \frac{i}{2\pi}\,\frac{1}{\omega - \Omega_\nu} \tag{44}$$

After substituting equations (42) - (44) into equation (39), we get

$$\Omega_\nu = \frac{E_\nu}{\hbar}; \quad E_\nu = \Delta + 2D - 2J\cos\varphi_\nu \tag{45}$$
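The spectral intensity of the Green’s functions

$$G_{n,m}(\omega) = \frac{i}{2\pi}\sum_{\nu=0}^{N}\frac{F_\nu(n)\,\Phi_\nu(m)}{\omega - \Omega_\nu} \tag{46}$$

is given by [5, 6]

$$I_{n,m}(\omega) = \sum_{\nu=1}^{N+1}\frac{F_\nu(n)\,\Phi_\nu(m)}{e^{\hbar\omega/k_B T} - 1}\,\delta(\omega - \Omega_\nu) \tag{47}$$

By means of the formula (3.19) we can determine the correlation function

$$C_{n,m}(t) = \langle B_m^{+}(0)B_n(t)\rangle = \int_{-\infty}^{\infty} d\omega\, e^{-i\omega t}\, I_{n,m}(\omega) = \sum_{\nu=1}^{N+1}\frac{F_\nu(n)\,\Phi_\nu(m)}{e^{E_\nu/k_B T} - 1}\, e^{-\frac{i}{\hbar}E_\nu t} \tag{48}$$

and the concentration of excitons

$$C_{n,m}(0) = \langle B_m^{+}(0)B_n(0)\rangle = \sum_{\nu=1}^{N+1}\frac{F_\nu(n)\,\Phi_\nu(m)}{e^{E_\nu/k_B T} - 1} \tag{49}$$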
This formula gives the exciton energies in analysed polyethylene chain.
4. The Exciton Wave Function of Polyethylene Chain
Equation (45) gives the energies of excitons in the polymer chain. Using this formula we can find the transitions whose wavelengths correspond to the experimentally obtained wavelengths quoted in the Introduction. Besides these wavelengths, we shall determine the probabilities of finding an exciton at a given energy level, as well as the probabilities of exciton transitions between two energy levels. These probabilities can be found by means of the one-particle exciton wave function

$$|\Psi_\nu\rangle = \sum_{n=1}^{N+1} A_\nu(n)\,B_n^{+}|0\rangle \tag{50}$$
On the basis of the equations of motion and the boundary conditions, one obtains the system of homogeneous difference equations for the coefficients $A_\nu(n)$:

$$A_{n+1} + A_{n-1} + \rho A_n = 0; \quad 1 \le n \le N-1$$
$$A_1 + \left(\rho + \frac{D}{W}\right)A_0 = 0; \quad n = 0$$
$$A_{N-1} + \left(\rho + \frac{D}{W}\right)A_N = 0; \quad n = N \tag{51}$$

which, by the substitution

$$A_n = \sum_{\nu=1}^{N+1} A_\nu\left[\sin(n+1)\varphi_\nu - \frac{D}{J}\sin n\varphi_\nu\right] = \sum_{\nu=1}^{N+1} A_\nu F_\nu(n) \tag{52}$$

goes over to the single equation

$$\sum_{\nu=1}^{N+1}\left(2J\cos\varphi_\nu + \rho\right)F_\nu(n) = 0; \quad n \in (0, 1, 2, \ldots, N) \tag{53}$$

where $F_\nu(n)$ is given by (40) and the parameters $\varphi_\nu$ are the real solutions of equation (41). Consequently, we finally have

$$B_\nu^{+}|0\rangle = \sum_{n=0}^{N} F_\nu(n)\,B_n^{+}|0\rangle \tag{54}$$
Multiplying equation (54) by $\Phi_\nu(m)$, summing both sides of the resulting equality over $\nu$ and taking into account (42), we obtain:

$$B_n^{+}|0\rangle = \sum_{\nu=1}^{N+1}\Phi_\nu(n)\,B_\nu^{+}|0\rangle \tag{55}$$

The normalised function $B_\nu^{+}|0\rangle$ is given by

$$B_\nu^{+}|0\rangle = \frac{1}{\sqrt{\sum_{n=0}^{N} F_\nu(n)^2}}\sum_{n=0}^{N} F_\nu(n)\,B_n^{+}|0\rangle \tag{56}$$

while the normalised function $B_n^{+}|0\rangle$ is:

$$B_n^{+}|0\rangle = \frac{1}{\sqrt{\sum_{\nu=1}^{N+1}\Phi_\nu(n)^2}}\sum_{\nu=1}^{N+1}\Phi_\nu(n)\,B_\nu^{+}|0\rangle \tag{57}$$
The dependence of the exciton energies on $\varphi_\nu$, obtained from (45) for the values $\Delta = 3$ eV; $D = 2.5$ eV; $J = 2.7$ eV, is given in Figure 1.
Figure 1. The energy levels of excitons in the polyethylene chain.
Table 1. Experimental and theoretical wavelengths of transitions leading to amplifications

λexp (nm)   λtheor (nm)   Transition ν → ν'   Amplification factor
254         256           32 → 18             4.46
315         314           27 → 18             4.16
366         363           24 → 17             3.81
436         439           32 → 22             4.46
506         503           23 → 18             3.67
577         574           30 → 23             4.39
624         625           29 → 23             4.33
Experimental and theoretical wavelengths of the peaks, as well as the theoretical amplification of the peaks, are given in Table 1. The theoretical values of λ are determined as $\lambda = hc\,(E_\nu - E_\mu)^{-1}$; $\nu > \mu$.
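The conversion from a level difference to a wavelength is a one-line calculation. The sketch below assumes energies in eV and returns nanometres; the constant `HC_EV_NM` is the rounded CODATA value of $hc$ and the function name is illustrative.

```python
HC_EV_NM = 1239.841984  # h*c in eV*nm (rounded CODATA value)

def transition_wavelength_nm(e_upper_ev, e_lower_ev):
    """lambda = h c / (E_nu - E_mu) for a transition from level nu down to level mu."""
    gap = e_upper_ev - e_lower_ev
    if gap <= 0.0:
        raise ValueError("the upper level must lie above the lower level")
    return HC_EV_NM / gap

# Energy gap corresponding to the 254 nm mercury line of Table 1.
gap_254 = HC_EV_NM / 254.0
print(f"254 nm corresponds to a level difference of {gap_254:.3f} eV")
```

Used with the levels $E_\nu$ of equation (45), this function yields the theoretical wavelengths of the kind listed in Table 1.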
On the other hand, the probabilities of finding excitons on the levels before luminescence are given in Table 2.

Table 2. The probabilities of existence at stable levels, $P_\nu = \Phi_\nu^2 \big/ \sum_{\nu=1}^{N+1}\Phi_\nu^2$, where $\Phi_\nu = \frac{1}{32}\sum_{n=0}^{31}\Phi_\nu(n)$

Pν     Φν         Pν (%)
P23    0.299281   8.956
P22    0.072849   0.531
P18    0.265614   7.055
P17    0.152059   2.312
Table 3. The transition probabilities of exciton to stable levels, $P_{\nu,\mu} = \Phi_\nu\Phi_\mu \big/ \sum_{\nu=1}^{N+1}\Phi_\nu^2$

Pν;μ      Pν,μ (%)
P32;18    3.454
P30;23    7.794
P32;23    6.289
P27;18    0.796
P32;22    0.947
P24;17    3.652
P23;18    7.949
Similarly, the transition probabilities from a higher level $E_\nu$ to the stable levels $E_\mu$ quoted in Table 2 are presented in Table 3.
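The tabulated percentages can be cross-checked from the tabulated amplitudes. The sketch below assumes that the normalization sum $\sum_\nu \Phi_\nu^2$ equals 1; this is an assumption, adopted because it reproduces the printed values to the quoted precision. The dictionary and function names are illustrative.

```python
# Averaged amplitudes Phi_nu of the stable levels, copied from Table 2.
PHI = {23: 0.299281, 22: 0.072849, 18: 0.265614, 17: 0.152059}

def level_probability_percent(phi, norm=1.0):
    """P_nu = Phi_nu^2 / sum(Phi^2), in percent; norm = 1 is an assumption."""
    return 100.0 * phi * phi / norm

def transition_probability_percent(phi_nu, phi_mu, norm=1.0):
    """P_{nu,mu} = Phi_nu * Phi_mu / sum(Phi^2), in percent; norm = 1 is an assumption."""
    return 100.0 * phi_nu * phi_mu / norm

for level, phi in PHI.items():
    print(f"P{level} = {level_probability_percent(phi):.3f} %")

print(f"P23;18 = {transition_probability_percent(PHI[23], PHI[18]):.3f} %")
```

Only the transition 23 → 18 of Table 3 can be checked this way, since the amplitudes of the other upper levels are not tabulated.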
Conclusion
In this work two possible mechanisms of light amplification by polyethylene foil, registered in experiments with a mercury lamp, were analyzed. In these experiments the light, after passing through the polyethylene foil, was about 3 times more intense than in the cases when the foil was absent. In the first model it was assumed that solitons form in the polymer chain with velocities close to the velocity of sound, and that this causes the light amplification. In the second model the concept of homeopolar excitons having approximately equal gap and zone width was introduced. By means of the Green's function analysis and the wave function of homeopolar excitons, the metastable levels of homeopolar excitons and the probabilities of transitions between levels were determined. The calculated theoretical values were in good agreement with the experimental data.
Acknowledgements
This work was supported by the Serbian Ministry of Science and Technology: Grant No 141044, by Vojvodina Academy of Sciences and Arts, and by the Provincial Secretariat for Science and Technological Development of the Autonomous Province of Vojvodina (Project 114-451-00615/2007-06).
References
[1] Janjić J. D., Tošić B. S., Scientific Bulletin of the “Politehnica” University of Timisoara, Romania, Transactions on Mathematics and Physics 2006, 51, 80.
[2] Janjić J. D., Tošić B. S., Sajfert V., “Luminescence Mechanism of Light Amplifying by Polyethylene”, Bulletin of the “Politehnica” University of Timisoara, Romania, Transactions on Mathematics and Physics 2008, 53, 91.
[3] Agranovich V. M., JETP 1959, 37, 430.
[4] Agranovich V. M., Theory of Excitons; Nauka: Moscow, Russia, 1978.
[5] Tyablikov S. V., Methods in the Quantum Theory of Magnetism; Plenum Press: New York, USA, 1967.
[6] Tošić B. S., Statistical Physics; Faculty of Natural Sciences: Novi Sad, Serbia, 1978 (in Serbian).
[7] Masterson W. W., Hurley C. N., Chemical Principles and Reactions; Saunders College Publishing: Philadelphia, USA, 1993; pp. 579.
[8] Brown W. H., Introduction in Organic Chemistry; Saunders College Publishing: Philadelphia, USA, 1997; pp. 428.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 157-189
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 8
ANHARMONIC EFFECTS IN NORMAL MODE VIBRATIONS: THEIR ROLE IN BIOLOGICAL SYSTEMS
Attila Bende
Molecular and Biomolecular Physics Department, National Institute for Research and Development of Isotopic and Molecular Technologies, Donath street, No: 65 – 103, Ro-400293, Cluj-Napoca, ROMANIA
Abstract
The intra- and intermolecular hydrogen-bond dynamics play an important role in the thermal stability and molecular functionality of biomolecules like DNA base pairs and proteins. Thermal relaxation, conformational changes and ultrafast nonradiative decays in molecules, especially in biomolecules, are realized, among other physical effects, with the help of molecular vibrations. Intermolecular hydrogen bonds or weak chemical bonds (covalent bonds including H atoms) usually present large anharmonic normal mode vibrations. The anharmonic effects of normal mode vibrations for some small molecular model systems (formamide and urea) as well as for guanine-cytosine and adenine-thymine DNA base pairs are presented, considering DFT and ab initio second order Møller-Plesset (MP2) theoretical methods. The role of basis set superposition errors in harmonic and anharmonic frequency calculations is briefly discussed. It was observed that anharmonic effects can significantly change the blue- or red-shifted harmonic frequency values. Large anharmonic effects were found in the case of protonated molecular structures, especially for the Hoogsteen conformation of guanine-cytosine base pairs.
1. Introduction
Hydrogen bonding is ubiquitous in nature and governs a wide array of chemical and biological processes ranging from local structure in molecular liquids to the structure and folding dynamics of proteins [1, 2]. Although the hydrogen bond is well studied, its low-frequency vibrations (the large-amplitude motions involving stretching and bending along the actual hydrogen-bond coordinates) have rarely been investigated [3]. Information about these vibrations offers exceptional insight into the potential energy surface of the interaction and so further enhances our understanding of the hydrogen bond and its impact on molecular structure and dynamics. The “C=O···H-N” type hydrogen bond (H-bond) is one of the most frequently occurring Van der Waals (VdW) bonds in biological systems. It can be found as a main component of the interaction between DNA base pairs or in protein α-helices and β-sheets. In this sense, not only is the knowledge of molecular structures important, but also their dynamics, which in essence represent their biological functionality.
Vibrational spectroscopy is a powerful tool for getting information about the complex dynamics of atoms in H-bonds. In particular, such calculations provide sensitive information on the potential energy surfaces of these systems [4, 5]. In infrared (IR) spectroscopic studies, the most evident effects of hydrogen bonding are the red shift of the high-frequency X–H···Y stretching mode, its intensity increase and band broadening; the latter is often accompanied by the development of peculiar band shapes. The large increase of the bandwidth, the band asymmetry, the appearance of subsidiary absorption maxima and minima such as Evans windows, peculiar isotope and temperature effects, and the similarity of the line-shape features when passing from the condensed to the gas phase [6, 7] are the challenge for theories of H-bonds. Computational studies are essential in this context, since comparison of calculations to the measured spectroscopic data provides a test of the potential surface [8–11]. While calculations at the level of the harmonic approximation are very useful, they are often not of sufficient accuracy [8, 9, 12–14]. One approach is to apply empirical scaling factors in order to represent the anharmonic effects [15, 16]. However, such an approach, while useful in many cases, has no fundamental (quantum mechanical) basis. Moreover, it does not provide any insights into the nature of the anharmonic part of the potential, which itself is of great interest. Calculations of the spectra of large molecules beyond the harmonic approximation remain a major challenge.
The goal of this Chapter is to give an accurate description of intermolecular normal modes and to present different intermolecular interaction effects which could influence the monomer type vibrations, considering the formamide and urea dimer cases and the guanine-cytosine DNA base pair molecular system.
2. Theoretical Background and Computational Methods
Regarding the theoretical methods, it is well established that an accurate description of hydrogen bonds (HBs), i.e. one with an error bar of less than 0.04 eV (1 kcal/mol) in the predicted HB strength, requires ab initio techniques which include electron correlation. Thus the underestimation of the HB strength observed in Hartree-Fock (HF) calculations (where electron correlation effects are missing) is overcome using correlated methods like second order Møller-Plesset perturbation theory (MP2) or coupled cluster (CC) methods. It is, however, necessary to use very large, high-quality basis sets to expand the wave function and to get reliable HB properties. This fact, together with the need for correlated methods, makes such studies computationally very expensive and applicable only to molecular complexes of at most a few tens of atoms. Therefore strategies to study HBs with an accuracy similar to MP2 or higher levels of theory, but computationally less expensive, are needed. In this vein, density functional theory (DFT) is a method that includes electron correlation. Unfortunately, the accuracy of DFT in describing the HB interaction relies on the functional applied to approximate the electronic exchange-correlation (XC) contribution. To overcome this problem, we first try to find those XC functionals which can describe HBs with an accuracy close to that of MP2. In the last few years a number of significant works have appeared in the literature which compare the results of different DFT functionals with those obtained using the MP2 method. Besides the accurate electron correlation methods, based on the perturbational, coupled-cluster or Kohn-Sham (KS) scheme, the basis set superposition errors (BSSE) present a very important source of discrepancies. Many papers have indicated that the impact of BSSE on the geometries of weakly bound systems is smaller for DFT methods than for ab initio methods such as MP2, but their influence cannot be neglected even in the case of DFT [17–19]. The reason why the BSSE is similar in the HF and KS methods is that the KS method does not require the highly oscillatory basis functions which are needed to describe electron correlation in traditional electronic structure methods, and which are virtually impossible to include at saturation.
2.1. The BSSE Problem
The BSSE is a purely “mathematical effect” and it is an important problem to solve when we study a weakly bonded molecular complex. This effect appears only as a result of the use of finite basis sets: the description of a monomer within the supermolecule is actually better than the one obtained for the free monomer with the same basis set, so the individual monomers are described incompletely. Due to BSSE, the calculated interaction energies show too deep minima and the computed potential energy surface (PES) is distorted. The most important and straightforward a posteriori correction scheme, the so-called “function counterpoise” (CP), or simply the Boys–Bernardi method, was introduced by Jansen and Ros [20] and, independently, by Boys and Bernardi [21] in 1969/1970. In this CP scheme, the monomer energies are recalculated using the whole supermolecular basis set and these corrected monomer energies are used in the molecular interaction energy calculations. A conceptually different way to handle the BSSE problem is to apply the “chemical Hamiltonian approach” (CHA) for intermolecular complexes, proposed by Mayer [22] in 1983. (For a detailed review on CHA, see Ref. [23].) By using the a priori CHA method one can eliminate the nonphysical terms of the Hamiltonian, which leads to wave functions free from the nonphysical delocalizations caused by BSSE. Using this CHA scheme, several approaches have been developed both at the HF [24–37] and correlated [38–45] levels of theory to study the structures and interaction energies of different van der Waals and H-bonded systems.
The CP Scheme
The simplest definition of the uncorrected interaction energy between two molecules is the difference of the supermolecular energy and the sum of the free monomer energies, each calculated in its own basis set:

$$\Delta E_{AB}^{unc} = E_{AB}(AB) - E_A(A) - E_B(B), \tag{1}$$

where $E_{AB}(AB)$, $E_A(A)$, and $E_B(B)$ denote the total energy of the AB “supermolecule” and the energy of the A and B monomers, respectively. The notations in parentheses indicate that basis sets corresponding to (sub)system A, B, and AB, respectively, were used. To compute the correct value of the interaction energy $\Delta E_{AB}^{unc}$, we need to use (nearly) complete basis sets on the supermolecule and on each monomer, which is usually impossible in practice.
In the CP scheme, the interaction energy $\Delta E_{AB}^{CP}$ is defined as the difference of the supermolecule and monomer energies, all computed in the same supermolecule basis set:

$$\Delta E_{AB}^{CP} = E_{AB}(AB) - E_A(AB) - E_B(AB) \tag{2}$$

Using Eqs. (1) and (2), one can define the BSSE content in the interaction energy as

$$\delta E_{BSSE} = \Delta E_{AB}^{unc} - \Delta E_{AB}^{CP} = E_A(AB) - E_A(A) + E_B(AB) - E_B(B) \tag{3}$$

According to Eq. (3), the CP-corrected potential energy surface (PES) of the dimer becomes

$$E^{CP}(AB) = \Delta E_{AB}^{unc} - \delta E_{BSSE} = \Delta E_{AB}^{unc} - E_A(AB) + E_A(A) - E_B(AB) + E_B(B) \tag{4}$$

Equation (4) shows that by considering only the intermolecular internal coordinates as optimized parameters one has to calculate three different total energies to determine the CP-corrected PES.
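The bookkeeping of Eqs. (1)–(3) can be sketched in a few lines. The function below is only an illustration: its name and argument names are hypothetical, and the five input energies in the example are made-up numbers in hartree, not results of any calculation reported here.

```python
def counterpoise(e_ab_ab, e_a_a, e_b_b, e_a_ab, e_b_ab):
    """Return (uncorrected interaction energy, CP interaction energy, BSSE).

    e_x_y is the total energy of system x computed in the basis set of y,
    following the E_X(Y) notation of Eqs. (1)-(3).
    """
    de_unc = e_ab_ab - e_a_a - e_b_b      # Eq. (1)
    de_cp = e_ab_ab - e_a_ab - e_b_ab     # Eq. (2)
    bsse = de_unc - de_cp                 # Eq. (3)
    return de_unc, de_cp, bsse

# Hypothetical total energies in hartree, for illustration only.
E_AB_AB = -339.1000
E_A_A = E_B_B = -169.5400
E_A_AB = E_B_AB = -169.5420   # monomers lowered by the partner's basis functions

de_unc, de_cp, bsse = counterpoise(E_AB_AB, E_A_A, E_B_B, E_A_AB, E_B_AB)
print(f"uncorrected: {de_unc:.4f}  CP-corrected: {de_cp:.4f}  BSSE: {bsse:.4f}")
```

Because a monomer energy computed in the larger dimer basis cannot lie above its own-basis value, the BSSE defined by Eq. (3) is negative or zero, and the CP-corrected interaction is less attractive than the uncorrected one.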
The CHA Scheme
In the alternative a priori CHA scheme introduced by Mayer [22, 23], one omits the BSSE-caused terms of the Hamiltonian, which is a conceptually different way of handling the BSSE problem. The CHA procedure permits the supermolecule calculations to remain consistent with those for the free monomers performed in their original basis sets. The basic idea of Mayer’s scheme is that one can divide the Born–Oppenheimer Hamiltonian into two parts:

$$\hat{H}_{BO} = \hat{H}_{CHA} + \hat{H}_{BSSE} \tag{5}$$

where $\hat{H}_{CHA}$ is the BSSE-free part of the Hamiltonian and $\hat{H}_{BSSE}$ is the “unphysical” part of the Hamiltonian that is responsible for the BSSE. The only difficulty of this scheme is that the resulting “physical”, BSSE-free Hamiltonian $\hat{H}_{CHA}$ is not Hermitian. Based on this CHA Hamiltonian, Mayer and Vibók developed different SCF-type equations [28]:

$$\hat{H}_{CHA}\Psi_{CHA} = \Lambda\Psi_{CHA}, \tag{6}$$

$$E_{CHA/CE} = \frac{\langle\Psi_{CHA}|\hat{H}_{BO}|\Psi_{CHA}\rangle}{\langle\Psi_{CHA}|\Psi_{CHA}\rangle}. \tag{7}$$

In the CHA framework [32–35] described by Eqs. (6)–(7), the non-Hermitian CHA Hamiltonian $\hat{H}_{CHA}$ is used only to provide the BSSE-free wave function (or a perturbative approximation to it), but the energy should be calculated using the conventional (Hermitian) Born–Oppenheimer Hamiltonian $\hat{H}_{BO}$, which makes the question of how one should calculate the second-order energy correction nontrivial. The zeroth-order Hamiltonian is defined in terms of the CHA Fockian [45]:

$$\hat{H}_0 = \sum_p \varepsilon_p\, \hat{\varphi}_p^{+}\hat{\tilde{\varphi}}_p^{-}, \tag{8}$$

where $\hat{\varphi}_p^{+}$ and $\hat{\tilde{\varphi}}_p^{-}$ are the creation and annihilation operators.
To obtain the first-order wave function in the perturbation theory, the non-Hermitian CHA Hamiltonian can be partitioned as

$$\hat{H}_{CHA} = \hat{H}_0 + \hat{V}_{CHA}, \tag{9}$$

where $\hat{H}_0$, defined by Eq. (8), is the Møller–Plesset type unperturbed Hamiltonian, which is also non-Hermitian, and $\hat{V}_{CHA}$ represents the perturbation. At the same time, the perturbation energy can be obtained by considering the following partition of the Born–Oppenheimer Hamiltonian $\hat{H}_{BO}$:

$$\hat{H}_{BO} = \hat{H}_0 + \hat{V} \tag{10}$$

Then the energy up to second order can be presented as

$$E^{(2)} = \frac{\langle\Psi_0|\hat{H}_{BO}|\Psi_0\rangle}{\langle\Psi_0|\Psi_0\rangle} + J_2, \tag{11}$$

where $J_2$ is the generalized Hylleraas functional:

$$J_2 = \frac{1}{\langle\Psi_0|\Psi_0\rangle}\left[2\,\mathrm{Re}\,\langle\hat{Q}\Psi_1|\hat{V}|\Psi_0\rangle - \mathrm{Re}\,\langle\Psi_1|\hat{H}_0 - E_0|\Psi_1\rangle\right], \tag{12}$$

Here $\Psi_0$ is the unperturbed wave function, $E_0$ is the zeroth-order energy ($\hat{H}_0\Psi_0 = E_0\Psi_0$), $\Psi_1$ is the first-order wave function of the perturbation, and $\hat{Q}$ is the projection operator onto the orthogonal complement of $\Psi_0$. These equations define our working formula at the second-order perturbation level. This formalism is called “CHA/MP2” theory [40].
2.2. Anharmonic Approach of Normal Mode Vibrations
Computation of the harmonic force field, and implicitly of the harmonic vibrational frequencies, often gives results which are very far from the experimentally measured values. The explanation is quite simple: beyond the harmonic approximation, many other contributions such as solvent effects, anharmonic corrections or cluster effects are not taken into account. Usually, all of these corrections are handled by multiplying the harmonic frequency values by a scaling factor. Unfortunately, the collective contribution of the different approximations cannot always be fitted into such a scaling scheme. It is therefore very important to study them separately, considering different theoretical models which describe these effects.
The linear relationship between the normal coordinates $Q$ and the Cartesian displacement coordinates $X$ is:

$$Q = L^{+}M^{1/2}X, \tag{13}$$

where, by convention, all the components of $Q$ and $X$ vanish at the reference geometry, $M$ is the diagonal matrix of atomic masses, and $L$ is the matrix of (columnwise) eigenvectors of the mass-weighted Cartesian force constant matrix $M^{-1/2}FM^{-1/2}$. The second-derivative matrix over normal coordinates, $\Phi$, is

$$\Phi = L^{+}M^{-1/2}FM^{-1/2}L \tag{14}$$

and is diagonal when evaluated at the equilibrium geometry, with eigenvalues $\lambda$ proportional to the squares of the harmonic vibrational frequencies $\omega$. In this way we can evaluate the third and fourth order energy derivatives with respect to normal coordinates by numerical differentiation of analytical Hessian matrices at geometries displaced by small increments $\delta Q$ from the reference geometry [46, 47]:

$$\Phi_{ijk} = \frac{1}{3}\left(\frac{\Phi_{jk}(\delta Q_i) - \Phi_{jk}(-\delta Q_i)}{2\delta Q_i} + \frac{\Phi_{ki}(\delta Q_j) - \Phi_{ki}(-\delta Q_j)}{2\delta Q_j} + \frac{\Phi_{ij}(\delta Q_k) - \Phi_{ij}(-\delta Q_k)}{2\delta Q_k}\right) \tag{15}$$

$$\Phi_{ijkk} = \frac{\Phi_{ij}(\delta Q_k) + \Phi_{ij}(-\delta Q_k) - 2\Phi_{ij}(0)}{\delta Q_k^2} \tag{16}$$

$$\Phi_{iikk} = \frac{1}{2}\left(\frac{\Phi_{ii}(\delta Q_k) + \Phi_{ii}(-\delta Q_k) - 2\Phi_{ii}(0)}{\delta Q_k^2} + \frac{\Phi_{kk}(\delta Q_i) + \Phi_{kk}(-\delta Q_i) - 2\Phi_{kk}(0)}{\delta Q_i^2}\right) \tag{17}$$

In the case of nonlinear molecules these computations require the Hessian matrices at no more than 6N-11 different points, N being the number of atoms in the molecule. Expanding the vibrational Hamiltonian $H_{vib}$ in the vibrational wavefunctions within the framework of second-order perturbation theory, one obtains the following expression:

$$\langle v_i|H_{vib}|v_i\rangle = \xi_0 + \sum_i \omega_i\left(n_i + \frac{1}{2}\right) + \sum_{i\le j}\xi_{ij}\left(n_i + \frac{1}{2}\right)\left(n_j + \frac{1}{2}\right) \tag{18}$$

where the ξ constants are simple functions of the cubic and quartic force constants.
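The finite-difference formulas (15) and (17) can be illustrated on a model system. The sketch below is an assumption-laden toy, not the procedure of Refs. [46, 47]: it uses a hypothetical two-mode potential $V = \frac{1}{2}w_1 q_1^2 + \frac{1}{2}w_2 q_2^2 + a\,q_1^2 q_2 + b\,q_1^2 q_2^2$ whose Hessian is known analytically, so that the numerically differentiated force constants can be compared with the exact values $\Phi_{112} = 2a$ and $\Phi_{1122} = 4b$.

```python
def hessian(q1, q2, w1=1.0, w2=2.0, a=0.05, b=0.01):
    """Analytic Hessian of the hypothetical model potential
    V = w1*q1**2/2 + w2*q2**2/2 + a*q1**2*q2 + b*q1**2*q2**2."""
    h11 = w1 + 2.0 * a * q2 + 2.0 * b * q2 * q2
    h12 = 2.0 * a * q1 + 4.0 * b * q1 * q2
    h22 = w2 + 2.0 * b * q1 * q1
    return [[h11, h12], [h12, h22]]

def displaced(mode, step):
    """Hessian at the geometry displaced by `step` along one normal mode."""
    q = [0.0, 0.0]
    q[mode] = step
    return hessian(q[0], q[1])

def cubic(i, j, k, dq=0.01):
    """Eq. (15): average of three central differences of Hessian elements."""
    def central(row, col, mode):
        plus = displaced(mode, dq)[row][col]
        minus = displaced(mode, -dq)[row][col]
        return (plus - minus) / (2.0 * dq)
    return (central(j, k, i) + central(k, i, j) + central(i, j, k)) / 3.0

def quartic_iikk(i, k, dq=0.01):
    """Eq. (17): average of two second differences of diagonal Hessian elements."""
    def second(row, mode):
        plus = displaced(mode, dq)[row][row]
        minus = displaced(mode, -dq)[row][row]
        centre = hessian(0.0, 0.0)[row][row]
        return (plus + minus - 2.0 * centre) / (dq * dq)
    return 0.5 * (second(i, k) + second(k, i))

print(f"Phi_112  = {cubic(0, 0, 1):.6f}")
print(f"Phi_1122 = {quartic_iikk(0, 1):.6f}")
```

In a real calculation the function `hessian` would be replaced by analytical Hessians from an electronic structure program, evaluated at the displaced geometries.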
2.3. Density-Fitting Local Perturbation Methods
Highly accurate quantum chemical techniques suffer from high-degree polynomial scaling. For example, CCSD scales as $O(N^6)$ and MP2 scales as $O(N^5)$, where N is some measure of system size. Techniques which reduce this high-degree polynomial scaling are therefore very important for describing larger molecular systems. In the last decade two important methods were developed (density fitting and the local perturbational or coupled cluster approximation) which together can reduce the high-degree scaling close to a linear dependence. In this way, medium-sized molecular systems (60 – 80 atoms) can be treated at the triple-zeta basis set quality level.

Density Fitting
In HF theory, calculations of the electron repulsion integrals (ERI)

$$\langle pr|r_{12}^{-1}|qs\rangle = (pq|rs) = \int d\vec{r}_1\int d\vec{r}_2\, \frac{\psi_p^{*}(1)\psi_q(1)\,\psi_r^{*}(2)\psi_s(2)}{r_{12}} \tag{19}$$

are the most expensive computational procedures. The computational effort for ERI evaluation and transformation can be reduced by 1–2 orders of magnitude using density fitting (DF) methods [48 – 51]. In this approach the one-electron charge densities in the ERIs, which are binary products of orbitals, are approximated by linear expansions in an auxiliary basis set

$$|pq) \approx |\widetilde{pq}) = D_A^{pq}\,|A) \tag{20}$$

This leads to a decomposition of the 4-index ERIs in terms of 2- and 3-index ERIs, and the $O(N^4)$ dependence of the computational cost is reduced to $O(N^3)$

$$(pq|rs) \approx (\widetilde{pq}|\widetilde{rs}) = D_A^{pq}\,J_{AB}\,D_B^{rs}, \tag{21}$$

where

$$J_{AB} = \int d\vec{r}_1\int d\vec{r}_2\, \frac{A(1)B(2)}{r_{12}} \tag{22}$$
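The reduction from $O(N^4)$ to $O(N^3)$ can be made concrete by counting the quantities involved. The sketch below is purely illustrative: it ignores permutational symmetry, and the ratio of three auxiliary functions per orbital basis function is an assumption chosen for the example, not a value taken from this chapter.

```python
def eri_count(n_basis):
    """Number of 4-index integrals (pq|rs), ignoring permutational symmetry."""
    return n_basis ** 4

def df_count(n_basis, n_aux):
    """Number of 3-index coefficients D_A^{pq} plus 2-index integrals J_AB."""
    return n_basis ** 2 * n_aux + n_aux ** 2

AUX_PER_BASIS = 3  # assumed size ratio of auxiliary to orbital basis

for n in (100, 200, 400):
    full = eri_count(n)
    fitted = df_count(n, AUX_PER_BASIS * n)
    print(f"N = {n:4d}: 4-index {full:.2e}, fitted {fitted:.2e}, ratio {full / fitted:.0f}")
```

Under these assumptions, doubling the basis multiplies the 4-index count by 16 but the fitted count by roughly 8, so the advantage of density fitting grows with system size.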
Local Approximation
In the LMP2 method the occupied space is spanned by localized molecular orbitals (LMOs), which can be obtained from the canonical orbitals by standard localization procedures as proposed by Boys [52] or Pipek and Mezey [53]. The virtual space is spanned by a basis of non-orthogonal projected atomic orbitals (PAOs),

$$|1\rangle = T_{ab}^{ij}\,\Psi_{ij}^{ab} \tag{23}$$

which are obtained from the AO basis functions by projecting out the occupied orbital space

$$|1\rangle = \sum_{i,j\in P}\ \sum_{ab\in[i,j]} T_{ab}^{ij}\,\Psi_{ij}^{ab} \tag{24}$$

In the following, PAOs will be labeled a, b. Since these functions are inherently local, one can introduce two approximations. First, excitations from a pair of occupied LMOs can be restricted to subsets of PAOs that are spatially close to the two LMOs. The number of functions $N_{ij}$ in each of these subsets (pair domains) is independent of the molecular size, and the number of excitations for each electron pair reduces from $N_{virt}^2$ to $N_{ij}^2$. Second, the integrals $(ai|bj)$ for distant orbitals i and j can be approximated by multipole expansions [54] or neglected. The remaining number of non-distant orbital pairs (ij), and therefore the total number of excitations, scales linearly with molecular size.
3. Results and Discussions 3.1. Formamide Dimers Formamide (FA) is the simplest molecule that contains a peptide linkage built by the carbonyl and amino groups; therefore, we can consider the formamide dimer (FA–FA) as the simplest model of the pairing of nucleic acids and the formamide–water (FA–WA) complex as a hydration of proteins, respectively. Geometry structures [3, 55–64] and vibrational spectra [3, 56, 60, 65–68] of the FA–FA and FA–WA dimers have been the subject of many studies using different ab initio [HF and MP2] methods, which give valuable information about the structure and dynamics of the H-bonds in molecular systems.
Anharmonic Effects in Normal Mode Vibrations
165
The standard HF, MP2, and CP-corrected HF/MP2 calculations were performed using the Gaussian 03 computer code [69]. The CHA/CE- and CHA/MP2-type calculations were done by generating the input data (integrals and RHF orbitals) with a slightly modified version of HONDO-8 [70]. In these calculations the CHA/SCF code [25–27] and the CHA/MP2 program of Mayer and Valiron [40, 42] were used. For the frequency calculations based on Wilson’s G-F method, the program written by Beu [71] was applied. We considered six different basis sets: 6-31G, 6-31G**, 6-31G**++, D95V, D95V**, and D95V++**. 6-31G to 6-31G**++ are standard Pople basis sets; D95V to D95V++** are Dunning/Huzinaga valence basis sets. The conventional supermolecule geometries were optimized at both the HF and MP2 levels, applying the analytical gradient method included in the Gaussian 03; the CHA- and CP-corrected geometries were calculated by using a numerical gradient method [inverse parabolic interpolation (IPI) [72]] in internal coordinates including only internal coordinates with intermolecular character (one bond, two angles, and three torsion angles). The reason for this choice is that the CPU time of MP2-CHA program is fairly big. To test the applicability of our numerical gradient method we performed several sample calculations using both the IPI method and the analytical gradient built into Gaussian 03. There is practically no difference between them. For conventional uncorrected cases we also performed similar calculations to check the values of the force constants and harmonic vibrational frequencies. The uncorrected HF and MP2 results for the force constants (in internal coordinates) and for the harmonic vibrational frequencies were obtained by using the standard routines of the Gaussian 03 program. 
As for the CHA- and CP-corrected calculations, at first the numerical second derivatives of the energies were calculated to obtain the CHA and CP force constants and then the NOMAD program [71] was applied to obtain the appropriate CHA and CP harmonic vibrational frequencies. As we are interested in the BSSE content in the molecular interaction energies, only those components of the force constant matrix were recalculated that correspond to intermolecular internal coordinates. The anharmonic frequencies were obtained using the standard full-CP method implemented in Gaussian 03.
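A minimal sketch of how force constants can be obtained from numerical second derivatives of the energy is given below. It uses central finite differences on a hypothetical two-coordinate quadratic surface; the function names, the step size and the model force constants are assumptions and do not reproduce the actual CHA or CP calculations.

```python
def numerical_hessian(energy, q0, step=1e-3):
    """Central-difference second derivatives of energy(q) at the point q0."""
    n = len(q0)

    def shifted(i, sign_i, j, sign_j):
        # Energy at q0 displaced by sign_i*step along i and sign_j*step along j
        q = list(q0)
        q[i] += sign_i * step
        q[j] += sign_j * step
        return energy(q)

    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d2 = (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                  - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4.0 * step ** 2)
            hess[i][j] = hess[j][i] = d2
    return hess


def model_energy(q):
    """Hypothetical quadratic surface in two intermolecular coordinates."""
    dr, da = q
    return 0.5 * 0.20 * dr ** 2 + 0.5 * 0.05 * da ** 2 + 0.01 * dr * da


hessian = numerical_hessian(model_energy, [0.0, 0.0])
for row in hessian:
    print(["%.4f" % value for value in row])
```

In a production calculation each call to the energy function would be a full electronic structure calculation, which is why the number of recalculated matrix elements is restricted to the intermolecular coordinates.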
The Geometry Structure
Table 1 shows the optimized geometry parameters of the FA–FA complex (Figure 1) obtained with the conventional (Uncorr.), CHA, and CP schemes at both the HF and MP2 levels. The FA–FA dimer is planar: the global minimum is a cyclic structure of C2h symmetry involving two equivalent N-H···O=C intermolecular hydrogen bonds. Two other planar minima have been identified [56, 62, 73], each held by a single N-H···O=C HB, which give the linear and zig-zag configurations. In this work we consider only the planar cyclic dimer configuration, for which the BSSE-corrected geometrical parameters are presented. With this restriction, the only variables left are the rHO bond length and two angles, αNHO and αHOC, which can be associated with the in-plane normal modes; the three torsion angles of intermolecular character, whose normal modes represent out-of-plane vibrations, are kept constant. The results show that only the rHO bond length has an important BSSE correction (0.06 Å for MP2-CHA/6-31++G**, 0.08 Å for MP2-CHA/D95V++**, and 0.06 Å for MP2-CP/6-31++G** and MP2-CP/D95V++**), which lengthens the bond. Furthermore, the change in the αNHO and αHOC angle values is
Attila Bende
insignificant, but their corresponding force constants also include BSSE effects. Unfortunately, the experimental value of the rHO bond length given in Ref. [63] is not precise enough to discriminate between the uncorrected and the BSSE-corrected bond lengths. Both are close to the experimental value of 1.9 Å (1.87 Å for MP2/D95V++** and 1.9 Å for MP2/6-31++G** in the uncorrected case; 1.95 Å for the CHA- and CP-corrected cases). However, we can compare our calculated rNO intermolecular distance with the experimental values obtained by Itoh and Shimanouchi [66]. The calculated lengths are 2.895 Å for the uncorrected case, 2.970 Å for the CHA-type, and 2.955 Å for the CP-type BSSE-corrected cases, while the X-ray data for the formamide crystal [66] give 2.935 Å for the rNO intermolecular distance. If we suppose that the intermolecular bond in the crystal phase is somewhat shorter than in the gas phase, the corrected values can be considered very close to experiment.

Table 1. Intermolecular coordinates of the FA–FA dimer computed at the HF and second-order Møller–Plesset perturbation theory (Uncorr., CHA, CP) levels, using the D95V, D95V**, D95V++**, 6-31G, 6-31G**, and 6-31++G** basis sets

Basis b           Method     rHO (Å)a          αHOC (Deg.)       αNHO (Deg.)
                             RHF     MP2       RHF     MP2       RHF     MP2
D95V (66)         Uncorr.    1.904   1.872     129.7   125.9     165.6   169.3
                  CHA        1.950   1.936     128.9   125.1     166.3   170.1
                  CP         1.943   1.960     129.4   125.7     165.8   169.5
D95V** (120)      Uncorr.    2.002   1.862     125.7   122.1     168.9   172.6
                  CHA        2.029   1.912     125.0   122.1     169.6   172.6
                  CP         2.019   1.924     124.9   121.8     169.7   172.8
D95V++** (150)    Uncorr.    2.021   1.874     125.2   120.9     169.4   173.7
                  CHA        2.039   1.915     125.2   122.5     169.4   172.1
                  CP         2.034   1.936     125.2   122.4     169.4   172.2
6-31G (66)        Uncorr.    1.919   1.912     126.8   122.0     168.0   172.4
                  CHA        1.934   1.923     126.8   123.0     168.0   171.3
                  CP         1.939   1.965     127.6   123.7     167.3   170.7
6-31G** (120)     Uncorr.    1.998   1.887     122.7   119.0     171.5   175.0
                  CHA        2.007   1.898     123.6   120.9     170.5   173.0
                  CP         2.025   1.947     123.6   120.4     170.6   173.6
6-31++G** (150)   Uncorr.    2.017   1.899     125.4   121.7     169.2   172.8
                  CHA        2.034   1.954     125.1   122.6     169.4   171.9
                  CP         2.038   1.954     125.2   122.5     169.3   172.1

a Experimental value ≈ 1.9 Å, taken from Ref. [63].
b The number of basis functions is given in parentheses.
Figure 1. The formamide dimer.
Table 2. Interaction energiesa (in kcal/mol) for different FA–FA dimer geometries computed at the HF and second-order Møller–Plesset perturbation theory (Uncorr., CHA, CP) levels, using the D95V, D95V**, D95V++**, 6-31G, 6-31G**, and 6-31++G** basis sets

Basis      D95V                  D95V**                D95V++**
Method     RHF       MP2         RHF       MP2         RHF       MP2
Uncorr.    -17.018   -18.370     -12.912   -16.921     -12.043   -15.903
CHA        -15.279   -14.511     -12.086   -13.971     -11.567   -13.140
CP         -15.185   -14.046     -11.886   -13.552     -11.547   -13.201

Basis      6-31G                 6-31G**               6-31++G**
Method     RHF       MP2         RHF       MP2         RHF       MP2
Uncorr.    -17.387   -18.799     -14.212   -18.330     -12.385   -15.365
CHA        -15.580   -14.496     -12.576   -13.851     -11.965   -13.228
CP         -14.956   -13.705     -12.032   -13.492     -11.837   -13.399

a Experimental value ≈ -13.967 kcal/mol, taken from Ref. [63].
In Table 2 we present the calculated intermolecular binding energies at the geometry optimized in the given basis, for each method (uncorrected, CHA, and CP) and level of theory (RHF and MP2). The experimental value, obtained with the Rydberg electron transfer technique between laser-excited atoms and the molecular system [63], is 606 meV, which corresponds to 13.967 kcal/mol. Our BSSE-corrected results (13.288 kcal/mol for MP2-CHA/6-31++G**, 13.399 kcal/mol for MP2-CP/6-31++G**, 13.140
kcal/mol for MP2-CHA/D95V++**, and 13.201 kcal/mol for MP2-CP/D95V++**) are very close to the experimental value, whereas the uncorrected MP2 results in the same basis sets differ from it by about 1.4 kcal/mol or more. Moreover, a reasonable binding energy is obtained even with the 6-31G and D95V bases, which contain no diffuse or polarization functions, when the BSSE correction is applied at the uncorrected geometry.
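The unit conversion and the comparison with experiment can be checked with a few lines of arithmetic. In the sketch below the conversion factor is the rounded value of 1 eV in kcal/mol, the MP2 energies are the entries listed in Table 2 for the two largest basis sets, and the function names are illustrative.

```python
EV_TO_KCAL_PER_MOL = 23.0605   # rounded conversion factor, 1 eV per molecule in kcal/mol
EXPERIMENT_KCAL = 13.967       # binding energy quoted in the text from Ref. [63]


def mev_to_kcal_per_mol(energy_mev):
    """Convert an energy in meV per molecule to kcal/mol."""
    return energy_mev * 1.0e-3 * EV_TO_KCAL_PER_MOL


def deviation_from_experiment(interaction_energy):
    """Calculated binding energy magnitude minus the quoted experimental one."""
    return abs(interaction_energy) - EXPERIMENT_KCAL


# MP2 interaction energies (kcal/mol) as listed in Table 2
mp2_energies = {
    "6-31++G**": {"Uncorr.": -15.365, "CHA": -13.228, "CP": -13.399},
    "D95V++**": {"Uncorr.": -15.903, "CHA": -13.140, "CP": -13.201},
}

print(f"606 meV = {mev_to_kcal_per_mol(606.0):.2f} kcal/mol")
for basis, by_scheme in mp2_energies.items():
    for scheme, energy in by_scheme.items():
        print(f"{basis:10s} {scheme:8s} {deviation_from_experiment(energy):+.3f}")
```

A positive deviation means overbinding relative to the quoted experimental value; with these inputs the uncorrected energies overbind, while the CHA and CP values fall below experiment by less than 1 kcal/mol.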
Harmonic and Anharmonic Frequencies
The FA–FA dimer has 30 vibrational normal modes, of which 24 (12 for each monomer) are characteristic of monomer-type vibrational motion, while six modes have a purely intermolecular character.

Table 3. The uncorrected and CP-corrected harmonic (ν) and diagonal anharmonic (x) frequencies of the FA–FA dimer computed at the MP2 level of theory, using the 6-31++G** basis set

Nr.   νdim              νCP               νmon      xdim            xCP             xmon      Assign.
Intramolecular
1     3770.0, 3769.9    3770.3, 3770.1    3814.1    -32.5, -32.9    -32.4, -32.8    -40.5     N-H a.
2     3465.8, 3424.8    3467.5, 3426.4    3660.9    -49.9, -58.9    -49.2, -57.8    -36.1     N-H s.
3     3103.9, 3101.6    3104.2, 3101.9    3083.9    -33.0, -33.1    -33.0, -33.1    -67.7     C-H
4     1795.5, 1773.9    1796.4, 1774.4    1794.9    -24.0, -22.3    -24.4, ???      -6.6      C=O
5     1677.0, 1666.6    1677.2, 1667.9    1654.2    -9.5, -3.9      -9.4, -4.0      -8.5      H-N-H
6     1444.0, 1444.8    1446.7, 1445.8    1444.0    -3.8, -3.9      -3.9, -3.9      -9.1      O=C-H
7     1367.7, 1353.1    1367.4, 1352.4    1295.9    -2.9, -2.6      -2.8, -2.6      -5.0      C-N
8     1112.1, 1106.0    1111.6, 1105.6    1075.2    -0.8, -0.7      -0.8, -0.7      -1.3      C-N-H
9     1060.2, 1046.2    1067.7, 1056.1    1041.0    0.5, 1.1        -1.1, -0.7      -1.3      OofP
10    824.3, 785.1      827.8, 789.7      628.1     -15.5, -6.5     -20.7, -8.3     -15.9     Torsion
11    622.8, 605.7      621.2, 604.8      565.6     -0.6, -0.2      -0.6, 0.3       1.2       O=C-N
12    413.9, 394.5      425.5, 404.4      276.1     17.8, 28.2      12.9, 24.8      -462.7    Torsion
Intermolecular
1     47.5              118.3                       451.2           47.6                      OofP
2     29.6              99.7                        2101.0          11.0                      OofP
3     9.2               49.0                        16800.0         1.5                       OofP
4     171.8             176.2                       -3.1            -2.3                      H…O
5     212.6             217.3                       -1.8            -1.2                      H…H
6     136.4             136.2                       -0.2            -0.1                      O…O
Usually the vibrational frequencies of molecular complexes are evaluated in the harmonic approximation, applying the Wilson G-F analysis to BSSE-uncorrected Hessians. On the other hand, the vibrational frequencies, in particular those with intermolecular character, contain considerable anharmonic effects, and it is therefore difficult to follow these two important corrections separately. Considering the monomer-type vibrations (Table 3), two different dimer frequencies correspond to each monomer vibration, and their values are usually shifted by the intermolecular interaction. Taking into account the full CP-corrected values in the dimer calculations, we found a further frequency shift, in this case due to the BSSE. Moreover, the frequency values show an important basis size effect at the MP2 level, so the dimer frequency shifts change with the basis set. The anharmonic frequency corrections show a more complex picture. Because of the large number of anharmonic constants, only the diagonal elements of the anharmonic frequency matrix are presented in Table 3. The most important effect in the anharmonic values is the influence of the adjacent molecule, which generates substantial shifts in the anharmonicity of different monomer normal modes within the dimer. Although this "cluster" effect is quite uniform, the basis size effects are much more complicated. For the bond stretching vibrations involving hydrogen (N–H, C–H) and the angle-bending vibrations, the changes in the anharmonic frequency are not so important, but the torsion angle and C=O stretching modes show very dissimilar results. In the full CP-corrected, BSSE-free anharmonic frequency calculations no major corrections are found for the monomer-type vibrations, which in practice means that the BSSE effects on them can generally be neglected for the FA–FA dimer.
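The harmonic step of the Wilson G-F analysis amounts to diagonalizing the product of the inverse kinetic energy matrix G and the force constant matrix F. The sketch below does this for two coupled bond stretches of a linear symmetric triatomic molecule; it is not the NOMAD program, and the CO2-like masses and force constants are illustrative assumptions rather than values from this chapter.

```python
import math

AMU_KG = 1.66053906660e-27    # kg per atomic mass unit
C_CM_PER_S = 2.99792458e10    # speed of light in cm/s
MDYN_PER_ANGSTROM = 100.0     # 1 mdyn/Å expressed in N/m


def gf_wavenumbers_2x2(G, F):
    """Harmonic wavenumbers (cm^-1) from the eigenvalues of G·F.

    G is given in 1/amu and F in mdyn/Å, both as 2x2 nested lists.
    """
    (g11, g12), (g21, g22) = G
    (f11, f12), (f21, f22) = F
    # Product matrix M = G·F
    m11 = g11 * f11 + g12 * f21
    m12 = g11 * f12 + g12 * f22
    m21 = g21 * f11 + g22 * f21
    m22 = g21 * f12 + g22 * f22
    trace = m11 + m22
    det = m11 * m22 - m12 * m21
    disc = math.sqrt(trace * trace - 4.0 * det)
    eigenvalues = [(trace - disc) / 2.0, (trace + disc) / 2.0]
    # Convert sqrt(eigenvalue) from sqrt(mdyn/(Å·amu)) to cm^-1
    scale = math.sqrt(MDYN_PER_ANGSTROM / AMU_KG) / (2.0 * math.pi * C_CM_PER_S)
    return [scale * math.sqrt(lam) for lam in eigenvalues]


# Illustrative input: two equivalent stretches sharing the central atom (angle 180 degrees)
m_end, m_center = 15.995, 12.000              # amu
G_example = [[1 / m_end + 1 / m_center, -1 / m_center],
             [-1 / m_center, 1 / m_end + 1 / m_center]]
F_example = [[16.0, 1.3], [1.3, 16.0]]        # mdyn/Å, assumed values
wavenumbers = gf_wavenumbers_2x2(G_example, F_example)
print([f"{w:.0f}" for w in wavenumbers])
```

For the dimers discussed here the same eigenvalue problem is solved in the full set of internal coordinates, with F taken from the uncorrected, CHA or CP second derivatives.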
Regarding the intermolecular normal modes, in addition to the "cluster" and basis size effects, the BSSE corrections become very important, especially for the out-of-plane normal modes (see MP2/6-31++G**). Considering the collective effect of basis size and BSSE corrections on the intermolecular vibrational frequencies, the major corrections arise in the harmonic approximation, from the quality of the basis set (polarization and diffuse functions) and from its BSSE, followed by similar corrections in the anharmonic approximation. The low-frequency out-of-plane intermolecular vibrations usually show very dissimilar and unrealistic anharmonic corrections, especially in the MP2/6-31++G** CP-corrected case. This may be related to several factors: i) in the CP method the PES is very flat and the intermolecular force constants are very small [74], so the numerical calculations could give significant errors; ii) a well-balanced basis set is very important, and we consider that 6-31++G** does not give adequate results. For example, the 6-31++G(2d,2p) basis set could be a more suitable choice, but the available computer capacity did not allow us to perform such full-CP anharmonic calculations.
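To see how a diagonal anharmonic constant enters a fundamental frequency, the standard second-order perturbative expression ν_i = ω_i + 2x_ii + ½Σ_{j≠i} x_ij can be evaluated directly. The sketch below assumes that the x values of Table 3 are the diagonal constants x_ii of such a treatment; because the off-diagonal constants are not tabulated, the numbers it prints are only the diagonal part of the correction. The function name is illustrative.

```python
def vpt2_fundamental(omega_i, x_ii, x_ij=()):
    """Fundamental wavenumber: omega_i + 2*x_ii + 0.5*sum of off-diagonal x_ij."""
    return omega_i + 2.0 * x_ii + 0.5 * sum(x_ij)


# Mode 2 of Table 3 (N-H symmetric stretch), first doublet component, in cm^-1
dimer_estimate = vpt2_fundamental(3465.8, -49.9)      # uncorrected dimer
monomer_estimate = vpt2_fundamental(3660.9, -36.1)    # isolated monomer

print(f"dimer, diagonal term only:   {dimer_estimate:.1f} cm-1")
print(f"monomer, diagonal term only: {monomer_estimate:.1f} cm-1")
```

The same expression shows why a very large x for a mode of only a few tens of cm-1 is unphysical: the correction would exceed the harmonic frequency itself.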
3.2. Urea Dimer
The urea dimer is to some extent similar to the formamide–formamide system. The difference between the two systems is the existence of a second cyclic system in the urea dimer. In the urea dimer one of the HBs is the weak C-N···H-N hydrogen bond. The planarity (or nonplanarity) of the urea dimer has been the focus of a number of studies. Masunov and Dannenberg [75, 76] considered different levels of theory (HF, MP2 and DFT, with and without BSSE corrections), and found the most stable conformation to be the non-planar structure (using the MP2 method with the D95++** basis set). However, they emphasize that including vibrational and thermal corrections in the calculation of the molecular structure might give an effectively planar structure. We consider the papers of Rousseau and Keuleers [77, 78] very important in elucidating the urea structure; they give detailed descriptions of the vibrational spectra of urea in both the gas and the crystal phase. Their concise conclusion was that the vibrational analyses of solid and gas-phase urea are not comparable, mostly because of the different planar or non-planar conformation of urea in the different states. The goal of this study is to give an accurate description of the intermolecular normal modes and to present the different intermolecular interaction effects that could influence the monomer-type vibrations, considering the cyclic urea dimer. Accordingly, several DFT functionals were tested by comparing them with the corresponding MP2 results. The BSSE was corrected using the counterpoise (CP) method [20, 21] as implemented in the Gaussian 03 package [69]. The uncorrected and BSSE-corrected energies, geometries, harmonic frequencies and their anharmonic corrections [79, 80] were calculated at the MP2 and DFT levels of theory using the D95V, D95V** and D95V++** basis sets [81]. The combined local and density fitting approximations for the MP2 and coupled-cluster methods (DF-LMP2 and DF-LCCSD(T)) were applied using the Molpro program package [82].
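For reference, the counterpoise scheme in its usual Boys–Bernardi form can be written as follows. The notation is an assumption introduced here, not taken from the chapter: the subscript names the fragment, the superscript in braces names the basis set in which its energy is evaluated, and the argument (AB) indicates that every energy is taken at the geometry the fragment has in the dimer.

```latex
\Delta E_{\mathrm{int}}^{\mathrm{CP}}
  = E_{AB}^{\{AB\}}(AB) - E_{A}^{\{AB\}}(AB) - E_{B}^{\{AB\}}(AB),
\qquad
\delta_{\mathrm{BSSE}}
  = \left[ E_{A}^{\{A\}}(AB) - E_{A}^{\{AB\}}(AB) \right]
  + \left[ E_{B}^{\{B\}}(AB) - E_{B}^{\{AB\}}(AB) \right]
```

Geometries, force constants and frequencies labelled CP in this section are obtained on the potential energy surface corrected in this way, so the correction affects not only the interaction energy but also its derivatives.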
The Geometry Structure
In Table 4 we list the interaction energies (in kcal/mol) and intermolecular HB distances (in Å) obtained for the cyclic urea dimer with the MP2 method and six different DFT XC functionals (both uncorrected and CP-corrected) using the D95V++** basis set. The last five rows of Table 4 show the interaction energy and intermolecular HB distance obtained with the DF-LMP2 method using the D95V++** and cc-pVQZ basis sets, as well as the interaction energy obtained with the DF-LCCSD(T) method. The cyclic urea dimer is bound by two equivalent C=O···H-N HBs, shown in Figure 2.
Table 4. The ε interaction energies (in kcal/mol) and R intermolecular distances (in Å) for the cyclic urea dimer structure, optimized at different levels of theory and considering the D95V++** and cc-pVQZ basis sets

                                 Cyclic
Method                CP corr.   ε         RO…H
MP2                   NoCPa      -16.53    1.86
                      CPb        -13.18    1.93
BLYP                  NoCP       -14.15    1.84
                      CP         -13.52    1.86
B3LYP                 NoCP       -15.04    1.84
                      CP         -14.43    1.85
PBE                   NoCP       -16.81    1.79
                      CP         -16.12    1.80
HCTH407               NoCP       -12.33    1.96
                      CP         -11.76    1.98
KMLYP                 NoCP       -20.74    1.75
                      CP         -20.08    1.75
BHLYPc                NoCP       -15.59    1.81
                      CP         -15.01    1.86
DF-LMP2               -          -13.26    1.93
Edisp(LMP2)           -          -2.21     -
DF-LMP2/vqzd          -          -14.83    1.86
Edisp(LMP2)/vqz       -          -3.56     -
DF-LCCSD(T)e/vqz      -          -14.29    -

a Without counterpoise correction.
b With counterpoise correction.
c BHandHLYP.
d cc-pVQZ.
e The LMP2 optimized geometry was used.
Figure 2. The urea cyclic dimer.
Table 5. The harmonic (ν) and anharmonic (a) frequencies (in cm-1) of the intramolecular normal modes (H–N: ν1 – ν4 and O=C: ν5, ν6 bond stretching) of the cyclic urea dimer obtained at the MP2, B3LYP and BHLYP levels of theory with (CP) and without (NCP) BSSE correction, using the D95V++** basis set

No   Meth.   νdim NCP          νdim CP           νmon     adim NCP          adim CP           amon     νexp a   νexp b
ν1   MP2     3782.5, 3782.4    3784.1, 3784.0    3770.1   3608.7, 3608.7    3610.5, 3610.4    3596.1   3450     3348
     B3LYP   3712.1, 3712.1    3712.6, 3712.6    3706.8   3546.4, 3546.4    3546.7, 3546.7    3544.5
     BHLYP   3848.3, 3848.3    3848.8, 3848.8    3841.3   3688.1, 3688.0    3688.6, 3688.6    3658.4
ν2   MP2     3752.3, 3751.7    3759.4, 3758.6    3770.1   3584.7, 3584.3    3592.6, 3592.2    3596.3   3444     3435
     B3LYP   3657.4, 3675.7    3675.9, 3675.3    3706.6   3512.3, 3511.7    3514.3, 3513.3    3543.9
     BHLYP   3813.7, 3813.4    3814.7, 3814.4    3841.1   3652.1, 3652.1    3653.8, 3653.8    3684.8
ν3   MP2     3649.7, 3649.7    3651.1, 3651.0    3641.5   3493.6, 3493.7    3495.3, 3495.2    3484.3   3349     3345
     B3LYP   3589.9, 3589.9    3590.2, 3590.1    3578.8   3441.9, 3441.8    3442.1, 3442.0    3438.9
     BHLYP   3723.5, 3723.5    3723.8, 3723.7    3720.1   3580.2, 3580.0    3580.6, 3580.3    3580.3
ν4   MP2     3457.8, 3419.6    3509.0, 3480.0    3638.9   3229.3, 3149.0    3388.8, 3370.4    3482.3   3331     3330
     B3LYP   3340.0, 3297.2    3348.9, 3307.3    3582.5   3144.0, 3008.5    3148.1, 2984.4    3435.3
     BHLYP   3510.8, 3478.8    3518.2, 3483.2    3715.1   3322.0, 3305.0    3307.5, 3325.0    3575.3
ν5   MP2     1798.3, 1777.9    1801.2, 1781.0    1853.2   1764.9, 1729.1    1765.6, 1740.9    1782.1   1683     1683
     B3LYP   1768.6, 1743.7    1769.4, 1744.3    1797.0   1722.6, 1769.3    1723.9, 1700.5    1751.5
     BHLYP   1855.9, 1825.3    1856.6, 1825.6    1887.4   1813.0, 1781.8    1800.9, 1782.6    1841.4
ν6   MP2     1693.2, 1684.7    1693.1, 1687.4    1823.2   1648.3, 1650.1    1647.3, 1646.6    1728.1   1625     1627
     B3LYP   1656.9, 1651.0    1659.1, 1653.5    1629.0   1619.9, 1602.4    1622.3, 1610.6    1589.4
     BHLYP   1722.7, 1721.6    1724.3, 1723.3    1696.2   1681.6, 1682.4    1683.7, 1685.4    1659.0

a At T = 20 ºC.
b At T = -196 ºC.
The interaction energies and HB distances show that none of the selected XC functionals reproduces the MP2 results in an obvious way. For the cyclic structure, the best agreement for the intermolecular interaction energies and intermolecular RO...H distances is given by the BHLYP and B3LYP XC functionals. The interaction energies are -15.04 kcal/mol for B3LYP, -15.59 kcal/mol for BHLYP and -16.53 kcal/mol for MP2, while the RO...H intermolecular distances are 1.84 Å (B3LYP), 1.85 Å (BHLYP) and 1.86 Å (MP2). Reasonable values are also obtained with the BLYP functional, whereas for KMLYP, PBE and HCTH407 the energies and geometry parameters are quite different from the MP2 values. Comparing the ε and R values obtained at the DF-LMP2 level of theory with the two basis sets (D95V++** versus cc-pVQZ), the interaction energy increases in magnitude by 1.58 and 2.39 kcal/mol, while the R distance decreases by 0.06 and 0.07 Å, for the cyclic and asymmetric dimers, respectively. Focusing on the dispersion part of the intermolecular interaction energy, a significant growth is found (1.35 kcal/mol for the cyclic dimer and 1.71 kcal/mol for the asymmetric case) when the larger cc-pVQZ basis set is used instead of D95V++**. This represents the major contribution to the increase of the total interaction energy. In order to consider correlation effects beyond those included in MP2, the interaction energies were also computed with the DF-LCCSD(T) method, using the same cc-pVQZ basis set at the DF-LMP2 optimized geometry.
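The size of the counterpoise effect for each method can be read off Table 4 by simple subtraction. The sketch below does this for the cyclic dimer; the values are transcribed from Table 4 with the rows assigned to the methods in the order in which the method labels are listed there, which is an assumption of this illustration, as are the function names.

```python
# Interaction energies (kcal/mol) of the cyclic urea dimer, D95V++** basis, from Table 4
table4_energy = {
    "MP2": {"NoCP": -16.53, "CP": -13.18},
    "BLYP": {"NoCP": -14.15, "CP": -13.52},
    "B3LYP": {"NoCP": -15.04, "CP": -14.43},
    "PBE": {"NoCP": -16.81, "CP": -16.12},
    "HCTH407": {"NoCP": -12.33, "CP": -11.76},
    "KMLYP": {"NoCP": -20.74, "CP": -20.08},
    "BHLYP": {"NoCP": -15.59, "CP": -15.01},
}


def cp_effect(method):
    """Change of the interaction energy when the counterpoise correction is applied."""
    return table4_energy[method]["CP"] - table4_energy[method]["NoCP"]


def deviation_from_mp2(method, scheme):
    """Difference to the MP2 interaction energy obtained with the same scheme."""
    return table4_energy[method][scheme] - table4_energy["MP2"][scheme]


for method in table4_energy:
    print(f"{method:8s} CP effect {cp_effect(method):+.2f}  "
          f"vs MP2 (NoCP) {deviation_from_mp2(method, 'NoCP'):+.2f}  "
          f"vs MP2 (CP) {deviation_from_mp2(method, 'CP'):+.2f}")
```

With these inputs the counterpoise effect is several times larger for MP2 than for any of the DFT functionals, so the ranking of the functionals against MP2 depends on whether uncorrected or CP-corrected values are compared.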
Harmonic and Anharmonic Frequencies
The cyclic urea dimer has 42 vibrational normal modes, of which 36 (18 for each monomer) are characteristic of monomer-type vibrational motion. The remaining 6 normal modes, with the lowest frequencies, have purely intermolecular character. In the dimer the 18 monomer frequencies are split into frequency pairs (doublets). The whole theoretical IR spectrum is shown in Figure 3. Compared to the individual monomer lines, the doublets show different frequency shifts, whose magnitudes depend very much on the symmetry of the dimer.
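The mode counts quoted for the dimers follow from the 3N - 6 rule for non-linear molecules. A short sketch of this bookkeeping is given below; the function names are illustrative, and the atom counts are those of formamide (HCONH2, 6 atoms) and urea (CO(NH2)2, 8 atoms).

```python
def vibrational_modes(n_atoms, linear=False):
    """Number of vibrational normal modes of a molecule with n_atoms atoms."""
    return 3 * n_atoms - (5 if linear else 6)


def dimer_mode_partition(n_atoms_a, n_atoms_b):
    """Return (total, monomer-type, intermolecular) mode counts of a dimer."""
    total = vibrational_modes(n_atoms_a + n_atoms_b)
    intramolecular = vibrational_modes(n_atoms_a) + vibrational_modes(n_atoms_b)
    return total, intramolecular, total - intramolecular


print("formamide dimer:", dimer_mode_partition(6, 6))
print("urea dimer:     ", dimer_mode_partition(8, 8))
```

The six intermolecular modes correspond to the three relative translations and three relative rotations that the two monomers lose on forming the complex.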
Figure 3. The theoretical IR spectra of the cyclic urea dimer obtained at the MP2-CP, DF-LMP2 and B3LYP-CP levels of theory, using the D95V++** basis set.
Table 5 collects the harmonic frequencies (doublets) and their anharmonic corrections for those monomer-type normal modes whose vibrational motion is substantially perturbed by the adjoining molecule. They were selected in advance by identifying the frequency that corresponds to each normal mode and by visualizing its vibrational character [83]. The ν1, ν2 and ν3 intramolecular normal modes are N-H bond stretching vibrations in which the H atoms do not take part in intermolecular HB formation. The ν4, ν5 and ν6 modes are one N-H and two C=O bond stretching vibrations located at the N-H···O=C intermolecular HBs. The doublet splittings of the ν1, ν2 and ν3 normal modes are almost negligible; that is, the frequency shifts induced by each monomer on the adjoining molecule have the same magnitude. Only the normal modes in which the HB vibration is perturbed (ν4, ν5 and ν6) show different frequency shifts and, implicitly, larger doublet splittings. As can be seen in Table 5, the dimer frequency shift can be attributed to several effects: anharmonic corrections, BSSE effects, and intermolecular effects. It should be mentioned that the BSSE is not a real physical effect and normally must be considered together with the intermolecular effects. However, in order to see how large a frequency shift the BSSE can induce, we treat it as a separate effect.
Table 6. The harmonic (ν) and anharmonic (a) frequencies (cm-1) of the intermolecular normal modes of the cyclic urea dimer obtained at the MP2, B3LYP and BHLYP levels of theory with (CP) and without (NCP) BSSE correction, using the D95V++** basis set

                Cyclic
No.    Meth.    νdim NCP   νdim CP   adim NCP   adim CP
νI     MP2      159.3      141.5     151.9      132.4
       B3LYP    157.8      150.6     155.2      150.1
       BHLYP    158.4      156.0     151.5      153.1
νII    MP2      154.3      135.2     144.7      123.7
       B3LYP    152.0      137.9     147.4      137.4
       BHLYP    154.7      150.7     143.9      141.3
νIII   MP2      126.8      121.9     121.6      117.6
       B3LYP    134.6      127.2     135.0      132.6
       BHLYP    133.4      133.4     127.0      132.6
νIV    MP2      94.4       89.9      80.5       75.3
       B3LYP    93.7       81.5      93.8       80.7
       BHLYP    93.7       94.1      78.7       80.7
νV     MP2      65.1       62.0      55.6       54.4
       B3LYP    67.9       50.5      66.9       53.0
       BHLYP    68.6       67.6      53.8       55.9
νVI    MP2      35.8       38.0      29.5       33.8
       B3LYP    45.4       46.8      44.1       45.7
       BHLYP    45.8       45.0      43.9       54.4
For the N-H bond stretching vibrations the anharmonic corrections (frequency red-shifts) are quite large (≈ 150 - 170 cm-1 for ν1, ν2 and ν3, and > 200 cm-1 for ν4). For the a1, a2 and a3 anharmonic corrections the frequency shift has the same magnitude for both components of the doublet. For the a4 correction the two components shift differently (e.g., at the MP2 level ν′4 - a′4 = 3457.8 - 3229.3 cm-1 = 228.5 cm-1 and ν″4 - a″4 = 3419.6 - 3149.0 cm-1 = 270.6 cm-1). For the ν5 and ν6 C=O stretching modes the anharmonic corrections are much smaller than in the N-H case, about 30-40 cm-1, and the behavior of their frequency splitting is similar to that of the ν4 mode. The correction scheme for ν4 and ν5 is as follows:
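\[
\begin{aligned}
\nu_4' &: \quad \nu_{\mathrm{mon}} = 3638.9\ \mathrm{cm^{-1}} \xrightarrow{\ \text{dimer}\ } \nu_{\mathrm{dim}} \xrightarrow{\ \text{anh.}\ } \nu_{\mathrm{anh}} \xrightarrow{\ \text{BSSE}\ } \nu_{\mathrm{BSSE}} = 3388.8\ \mathrm{cm^{-1}} \\
\nu_4'' &: \quad \nu_{\mathrm{mon}} = 3638.9\ \mathrm{cm^{-1}} \xrightarrow{\ \text{dimer}\ } \nu_{\mathrm{dim}} \xrightarrow{\ \text{anh.}\ } \nu_{\mathrm{anh}} \xrightarrow{\ \text{BSSE}\ } \nu_{\mathrm{BSSE}} = 3370.4\ \mathrm{cm^{-1}} \\
\nu_5' &: \quad \nu_{\mathrm{mon}} = 1853.2\ \mathrm{cm^{-1}} \xrightarrow{\ \text{dimer}\ } \nu_{\mathrm{dim}} \xrightarrow{\ \text{anh.}\ } \nu_{\mathrm{anh}} \xrightarrow{\ \text{BSSE}\ } \nu_{\mathrm{BSSE}} = 1765.6\ \mathrm{cm^{-1}} \\
\nu_5'' &: \quad \nu_{\mathrm{mon}} = 1853.2\ \mathrm{cm^{-1}} \xrightarrow{\ \text{dimer}\ } \nu_{\mathrm{dim}} \xrightarrow{\ \text{anh.}\ } \nu_{\mathrm{anh}} \xrightarrow{\ \text{BSSE}\ } \nu_{\mathrm{BSSE}} = 1740.9\ \mathrm{cm^{-1}}
\end{aligned}
\]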
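The individual steps of this scheme can be evaluated from the MP2 entries of Table 5. The sketch below assumes that the intermediate values are the uncorrected harmonic dimer frequency (νdim NCP) and the uncorrected anharmonic dimer frequency (adim NCP), and that the end point is the CP-corrected anharmonic frequency (adim CP); this reading of the scheme, the dictionary layout and the function names are assumptions of the illustration.

```python
# MP2/D95V++** values (cm^-1) taken from Table 5 for the two doublet components
mp2_table5 = {
    "nu4'": {"mon": 3638.9, "dim": 3457.8, "anh": 3229.3, "anh_cp": 3388.8},
    "nu4''": {"mon": 3638.9, "dim": 3419.6, "anh": 3149.0, "anh_cp": 3370.4},
    "nu5'": {"mon": 1853.2, "dim": 1798.3, "anh": 1764.9, "anh_cp": 1765.6},
    "nu5''": {"mon": 1853.2, "dim": 1777.9, "anh": 1729.1, "anh_cp": 1740.9},
}


def correction_steps(entry):
    """Return the (dimer, anharmonic, BSSE) frequency shifts in cm^-1."""
    return (round(entry["dim"] - entry["mon"], 1),
            round(entry["anh"] - entry["dim"], 1),
            round(entry["anh_cp"] - entry["anh"], 1))


def doublet_split(mode):
    """Splitting of the final CP-corrected anharmonic doublet of a mode."""
    return round(mp2_table5[mode + "'"]["anh_cp"] - mp2_table5[mode + "''"]["anh_cp"], 1)


for label, entry in mp2_table5.items():
    dimer, anharmonic, bsse = correction_steps(entry)
    print(f"{label:6s} dimer {dimer:+7.1f}  anh. {anharmonic:+7.1f}  BSSE {bsse:+7.1f}")
print("final splitting of nu5:", doublet_split("nu5"), "cm-1")
```

With these inputs the anharmonic steps of the two ν4 components reproduce the 228.5 and 270.6 cm-1 red-shifts quoted in the text, and the final ν5 doublet is separated by 24.7 cm-1.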
Considering the scheme of frequency corrections by dimer, anharmonic and BSSE effects presented above, one can observe that they contribute differently to the ν4 monomer frequency (3638.9 cm-1), which finally yields a split frequency pair (ν4' and ν4") of 3388.8 cm-1 and 3370.4 cm-1. A similar situation is found for the ν5 monomer frequency, where the two final frequency values are separated by 24.7 cm-1. Besides the monomer-type frequencies, the molecular association induces a group of six further normal mode vibrations, which can be called intermolecular normal modes. They are found in the far-infrared region (10–250 cm-1) of the molecular IR spectrum and describe the relative vibrations of two "rigid" urea monomers along the six degrees of freedom that derive from the intermolecular coordinates. The frequencies of these intermolecular normal modes, obtained at both the MP2 and DFT (B3LYP and BHLYP) levels of theory, are shown in Table 6. They have strictly intermolecular character and show only anharmonic and BSSE corrections. Scrutinizing the normal mode vibrations of the cyclic dimer, one can observe that for three frequencies the motion takes place in the supermolecular plane (νI – νIII), which we call "in-plane" vibrations, while the other three intermolecular normal modes (νIV – νVI) are "out-of-plane" vibrations. If we consider the anharmonic and BSSE corrections together, the "in-plane" vibrations are more affected than the "out-of-plane" normal modes. Yet, analyzing them separately for a single mode, the two corrections have almost the same magnitude and are thus equally relevant for the intermolecular normal mode vibrations.
At the same time, if we consider the off-diagonal vibrational couplings of these normal modes, the frequencies belonging to the "in-plane" or the "out-of-plane" group are strongly coupled within the group, but much less coupled between the groups. In the asymmetric urea dimer the separation into "in-plane" and "out-of-plane" groups of vibrations is not so obvious, but, as in the cyclic dimer, the BSSE and the anharmonic effects are equally relevant. For both dimer structures the collective effect of the BSSE and anharmonic corrections amounts to about 10 – 15% of the frequency values. The intra–intermolecular normal mode couplings give more detailed information about how strongly the intra- and intermolecular normal modes are coupled. Since the intra- and intermolecular normal modes have very different vibrational frequencies, a strong coupling should be obtained only in special cases. Analyzing the anharmonic coupling matrix, we found that in the cyclic dimer the νI – νIII intermolecular normal modes are coupled only with the ν4' and ν4" intramolecular vibrations (e.g., x(ν4" - νI) = +6.73 cm-1), while couplings between the ν5' and ν5" intramolecular modes and the νI – νIII intermolecular modes are almost absent. This can be explained by the fact that the coupled vibrations take place in the same molecular plane and along the same vibrational direction. In the asymmetric urea dimer, neither ν4' and ν4" nor ν5' and ν5" show anharmonic couplings with the intermolecular normal modes as significant as in the cyclic structure; the only relevant coupling is x(ν5" - νII) = -3.86 cm-1. If one compares the selected theoretical normal mode frequencies with the experimental data (Table 5), the most appropriate values are obtained for the CP-corrected anharmonic frequencies, for both the cyclic and the asymmetric structures. At the same time, the dimer approximation is not sufficient to reproduce these experimental data with the desired accuracy. Thus, extended theoretical calculations that also take solvent effects and larger cluster sizes into account would be necessary. In addition, other spectroscopic data, such as vibrational absorption intensities calculated via the atomic polar tensor [84] and compared with the measured vibrational intensities, would also be useful in order to explain the unusual behavior of urea in the gas or solid phase.
3.3. Guanine-Cytosine DNA Base Pair
The intermolecular interaction of the guanine-cytosine (GC) base pair can be considered as the contribution of three very important HBs: two C=O···H-N and one N···H-N bonds. Due to the biological diversity of DNA base pair conformations (dry-DNA, wet-DNA, double helix, super-double helix, etc.), their theoretical characterization is very varied. A given theoretical model should take into account the backbone and environment effects as well as the influence of the DNA-protein interactions. For this reason, choosing the correct theoretical method is essential in the investigation of DNA base pairs. Over the last five years, substantial advances in computing have given molecular modelers the opportunity to include accurate electron correlation effects in their calculations. Šponer et al. [85] and Podolyan et al. [86] used the well-known MP2 method with different basis sets (mostly including polarization functions) and gave a very detailed description of the interaction energies and geometries of the adenine-thymine and guanine-cytosine base pairs. At the same time, the normal mode analysis of molecular vibrations can give supplementary information about the efficiency of different methods. In this connection, several vibrational frequency calculations and experimental measurements were performed [9, 86, 87, 88, 89] in order to study different interaction effects in the normal mode vibrations of base pairs (mostly for the guanine-cytosine dimer). For instance, the intermolecular interactions can significantly influence the intramolecular normal mode vibrations, producing a red-shift or an improper blue-shift [90, 91] of the vibrational frequencies. Furthermore, the intermolecular normal mode vibrations depend very much on the applied method and basis set and, last but not least, on the BSSE effects [3, 74, 92, 93]. The Watson-Crick and Hoogsteen G-C geometries, harmonic frequencies and their anharmonic corrections [79, 80] were calculated at the DFT level of theory with the PBE exchange-correlation functional as implemented in the Gaussian 03 [69] program package, using the 6-31G basis set. The harmonic frequency values were scaled by the factor 0.986, corresponding to PBE/6-31G type calculations.
The Geometry Structure
Figures 4 and 5 show two different conformations of the G-C base pair (4: Watson-Crick [94] and 5: Hoogsteen [95]). Besides the well-known Watson-Crick base pair configuration, Hoogsteen base pairs have been known for more than 40 years. In the Hoogsteen base pair the N face of guanine is hydrogen bonded to the N side of cytosine through an H+ proton. Such interactions were postulated in U(A·U) triple helices [96] and were also found in chemically modified nucleic acids [97, 98]. Isolated Hoogsteen base pairs have been reported in some protein/DNA complexes [99] and occasionally in RNA [100]. The crystal structure is presented in [101].
Figure 4. The Watson-Crick conformation of Guanine-Cytosine base pair.
Figure 5. The Hoogsteen conformation of Guanine-Cytosine base pair.
The geometry of the DNA base pair has been the subject of many studies, among which the theoretical investigations of Guerra et al. [102] and van der Wijst [103] are the most detailed. They present a systematic comparison of different DFT exchange-correlation functionals and basis sets, and also compare the intermolecular interaction energies and H-bond distances with the experimental results. The best agreement was obtained for the BP86 and PW91 exchange-correlation functionals, while the widely used B3LYP functional consistently underestimates hydrogen-bond strengths and overestimates hydrogen-bond distances. Results nearly as good as those of BP86 and PW91 can also be obtained with the PBE functional. With the method and basis set mentioned above, the following intermolecular distances were obtained: a) Watson-Crick: d1(O···H-N) = 2.80 Å, d2(N···H-N) = 2.93 Å and d3(O···H-N) = 2.92 Å; b) Hoogsteen: d1(N···H+-N) = 2.66 Å, d2(O···H-N) = 3.08 Å.
Harmonic and Anharmonic Frequencies
The guanine-cytosine binary system in its Watson-Crick configuration has 29 atoms and 81 normal mode vibrations, of which 6 have intermolecular character. Figure 6 presents the IR absorption spectra for two characteristic spectral regions, 500 – 2000 cm-1 and 2750 – 3750 cm-1.
Figure 6. The IR absorption spectra for Watson-Crick configuration of guanine-cytosine DNA base pair at harmonic and anharmonic approximation.
The black vertical lines show the normal mode vibrational frequencies in the harmonic approximation, while the red lines are the frequencies obtained when the anharmonic corrections are also taken into account. The values show a considerable frequency shift due to the anharmonic corrections. These shifts are very pronounced in the 2750 – 3750 cm-1 spectral region, where the C-H and N-H covalent stretching vibrations are located.
Figure 7. The IR absorption spectra for the Hoogsteen configuration of the guanine-cytosine DNA base pair in the harmonic and anharmonic approximations.
Figure 8. The IR absorption spectra for the Watson-Crick and Hoogsteen configurations of the guanine-cytosine DNA base pair in the harmonic approximation.
The guanine-cytosine binary system in its Hoogsteen configuration has 30 atoms and 84 normal modes of vibration, of which 6 have intermolecular character. Figure 5 shows that the N···H+-N intermolecular bond is stabilized by the H+ proton. The H+-N bond is not a true covalent bond, and therefore its stretching vibration differs from the usual H-N stretching vibrations. In the Watson-Crick configuration there are three characteristic stretching vibrations: the first is a pure H4-N4 vibration (ν5 = 3292.9 cm-1) of the O6···H4-N4 intermolecular donor-acceptor complex, while the second and third are combined N1-H1 and N2-H2 symmetric (ν9 = 3085.8 cm-1) and asymmetric (ν10 = 3027.8 cm-1) stretching vibrations of the N1-H1···N3 and N2-H2···O2 intermolecular donor-acceptor complexes. In the Hoogsteen configuration there are only two characteristic vibrations that belong to the intermolecular interaction region:
Attila Bende
the first is an N-H stretching vibration (ν10 = 3014.4 cm-1) of the N-H···O donor-acceptor complex, while the second is also an N-H stretching vibration (ν11 = 2053.2 cm-1), but as a component of the unusual N···H+-N intermolecular donor-acceptor complex. When the anharmonic corrections are considered, behavior similar to that found for the formamide and urea systems is observed. There are large anharmonic shifts for all three stretching vibrations ν5, ν9 and ν10 of the Watson-Crick configuration: a5 = 263.1 cm-1, a9 = 294.8 cm-1 and a10 = 362.4 cm-1. In the Hoogsteen conformation the anharmonic shifts are a10 = 187.6 cm-1 and a11 = 958.0 cm-1. The a11 anharmonic shift is very large compared with the harmonic frequency itself. This can be explained by the unusual nature of the N···H+-N intermolecular donor-acceptor complex.

Two different types of intermolecular normal modes can be found in both the Watson-Crick and the Hoogsteen configurations. In the first type the vibrational motion occurs in the plane defined by the monomer molecules (three normal modes), while in the second type the intermolecular motions are out-of-plane vibrations. Since the characterization of the out-of-plane vibrations is quite difficult, they are not considered further. Accordingly, the three in-plane intermolecular normal mode frequencies are νI = 101.2 cm-1, νII = 125.5 cm-1 and νIII = 131.0 cm-1 for the Watson-Crick conformation and νI = 88.9 cm-1, νII = 109.4 cm-1 and νIII = 170.6 cm-1 for the Hoogsteen conformation. As in the urea case presented in section 3.2, there are also significant inter- and intramolecular normal mode couplings. In the Watson-Crick system the intramolecular ν9 and ν10 stretching vibrations are relatively strongly coupled (about 5-6 cm-1) with all three intermolecular normal modes νI, νII and νIII.
For Hoogsteen configuration these couplings are stronger: 17.0-21.0 cm-1.
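To see how large these corrections are relative to the harmonic values, the quoted numbers can be tabulated with a few lines of Python. This is only an illustrative sketch: it assumes that each quoted shift is the amount by which the anharmonic treatment lowers the harmonic frequency, and the dictionary and function names are hypothetical.

```python
# (harmonic frequency, anharmonic shift) in cm^-1, as quoted in the text.
WATSON_CRICK = {
    "nu5": (3292.9, 263.1),
    "nu9": (3085.8, 294.8),
    "nu10": (3027.8, 362.4),
}
HOOGSTEEN = {
    "nu10": (3014.4, 187.6),
    "nu11": (2053.2, 958.0),
}


def anharmonic_frequency(harmonic, shift):
    """Anharmonic frequency, assuming the quoted shift is a lowering (red shift)."""
    return harmonic - shift


def relative_shift(harmonic, shift):
    """Anharmonic shift as a fraction of the harmonic frequency."""
    return shift / harmonic


for label, modes in (("Watson-Crick", WATSON_CRICK), ("Hoogsteen", HOOGSTEEN)):
    for mode, (harmonic, shift) in modes.items():
        print(
            f"{label:12s} {mode:5s} "
            f"anharmonic = {anharmonic_frequency(harmonic, shift):7.1f} cm^-1, "
            f"relative shift = {100 * relative_shift(harmonic, shift):4.1f} %"
        )
```

Under this assumption the Watson-Crick shifts stay below roughly one eighth of the harmonic value, whereas the Hoogsteen ν11 shift approaches half of it, which is the sense in which the text calls a11 very large.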
3.4. Adenine-Thymine DNA Base Pair

The adenine-thymine binary system (Watson-Crick configuration) has 30 atoms and 84 normal modes of vibration, of which 6 have intermolecular character. Its geometry is shown in Figure 9. Since we are interested in the stretching vibrations (ν4 and ν11) of the intramolecular covalent bonds that involve H atoms, as well as in the intermolecular vibrations whose motion lies mostly in the molecular plane (ν78, ν80 and ν82), we present only these specific normal modes. In addition, one can identify another group of seven normal modes (ν41, ν42, ν54, ν60, ν61, ν64, and ν69) specific to purine and pyrimidine ring deformations, which can significantly disturb the H-bond vibrations. The frequencies of all these normal modes in the harmonic, BSSE-corrected harmonic, and anharmonic approximations are presented in Table 7. Under the BSSE correction, the most affected normal modes are the ν4 and ν11 N-H stretching vibrations, whose frequencies increase by 76.6 cm-1 and 429.5 cm-1, respectively. The νI and νII intermolecular normal modes behave similarly: although the BSSE effects do not produce large absolute frequency shifts, they represent significant corrections relative to the uncorrected frequency values. Regarding the anharmonic corrections, large frequency shifts are obtained for the N-H covalent-bond stretching vibrations (decreases of 281.1 cm-1 for ν4 and 160.4 cm-1 for ν11, see Table 7). For the ring deformation normal modes (ν41, ν42, ν54, ν60, ν61, ν64, and ν69), the anharmonic correction is more important than the BSSE effect, but even so the frequency shifts are less than 20 cm-1. As concluded for the urea dimers, the anharmonic and BSSE corrections can be considered, to a very good approximation, as additive effects. Accordingly, the fifth column of Table 7 gives the integral correction (νint), which combines the anharmonic and BSSE effects. The results show that for some modes the two shifts have opposite signs, while for others the anharmonic and BSSE corrections together increase the magnitude of the frequency shift.

Analysis of the anharmonic coupling matrix reveals strong vibrational coupling between the intramolecular ν11 mode and the intermolecular ν78 mode (labelled νI in Table 7; x11,I = 8.2 cm-1) as well as between the intramolecular ν11 mode and the intermolecular νII mode (x11,II = 10.4 cm-1). This strong coupling can be explained in the same way as for the cyclic urea dimer, namely by the fact that both normal modes lie in the same molecular plane and that the ν11 and νI modes share the same vibrational direction. In other cases there is no significant coupling between the intra- and intermolecular normal modes.
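The role of the coupling constants can be made explicit with the standard term formula of second-order vibrational perturbation theory. The following is only a sketch under the assumption that the quoted x11,I and x11,II are the usual off-diagonal anharmonicity constants x_ij of that formula; the definition and sign convention used in this chapter may differ, so only magnitudes are compared.

```latex
% Vibrational term values with harmonic wavenumbers \omega_i and
% anharmonicity constants x_{ij} (standard VPT2 form, assumed here):
\frac{E(\mathbf{n})}{hc}
  = \sum_i \omega_i \left(n_i + \tfrac12\right)
  + \sum_{i \le j} x_{ij} \left(n_i + \tfrac12\right)\left(n_j + \tfrac12\right)

% Fundamental transition of mode i (n_i: 0 \to 1, all other n_j = 0):
\nu_i = \omega_i + 2 x_{ii} + \tfrac12 \sum_{j \ne i} x_{ij}

% Magnitude of the part of the \nu_{11} fundamental that comes from its
% coupling to the two intermolecular modes quoted in the text:
\tfrac12 \left( |x_{11,\mathrm{I}}| + |x_{11,\mathrm{II}}| \right)
  = \tfrac12 \left( 8.2 + 10.4 \right)\,\mathrm{cm^{-1}}
  = 9.3\,\mathrm{cm^{-1}}
```

In this picture each off-diagonal constant enters a fundamental with a factor of one half, so the two quoted couplings together account for about 9 cm-1 of the ν11 fundamental.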
Table 7. The harmonic (ν) and anharmonic (a) frequencies (in cm-1) of some selected intramolecular and intermolecular normal modes in the adenine-thymine DNA base pair, obtained at the B3LYP level of theory using the D95V basis set. (νint is the predicted frequency value, considering together the BSSE and anharmonic corrections.)

Nr.      ν(dim, NCP)    ν(dim, CP)    a(dim, NCP)    νint
ν4       3329.3         3405.9        3048.2         3124.8
ν11      2546.7         2976.2        2386.3         2815.8
ν41      1035.2         1035.7        1018.5         1019.0
ν42      1031.4         1024.2        1014.2         1007.0
ν54      742.4          745.9         726.9          730.4
ν60      637.7          630.3         632.8          625.4
ν61      607.4          611.1         601.2          604.9
ν64      550.2          550.4         542.0          542.2
ν69      397.2          404.1         388.8          395.7
νI       119.9          108.2         115.4          103.7
νII      114.6          101.6         110.3          97.3
νIII     66.8           60.3          58.3           51.8
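The additivity assumption behind the last column of Table 7 can be checked numerically. The Python sketch below assumes that NCP and CP denote the uncorrected and the BSSE-corrected (counterpoise) values, and that νint is obtained by adding the BSSE shift and the anharmonic shift to the uncorrected harmonic frequency; the variable and function names are hypothetical.

```python
# Table 7 values in cm^-1:
# (harmonic NCP, harmonic CP, anharmonic NCP, reported nu_int)
TABLE_7 = {
    "nu4": (3329.3, 3405.9, 3048.2, 3124.8),
    "nu11": (2546.7, 2976.2, 2386.3, 2815.8),
    "nu41": (1035.2, 1035.7, 1018.5, 1019.0),
    "nu42": (1031.4, 1024.2, 1014.2, 1007.0),
    "nu54": (742.4, 745.9, 726.9, 730.4),
    "nu60": (637.7, 630.3, 632.8, 625.4),
    "nu61": (607.4, 611.1, 601.2, 604.9),
    "nu64": (550.2, 550.4, 542.0, 542.2),
    "nu69": (397.2, 404.1, 388.8, 395.7),
    "nuI": (119.9, 108.2, 115.4, 103.7),
    "nuII": (114.6, 101.6, 110.3, 97.3),
    "nuIII": (66.8, 60.3, 58.3, 51.8),
}


def integral_frequency(nu_ncp, nu_cp, a_ncp):
    """Harmonic frequency plus additive BSSE and anharmonic shifts."""
    bsse_shift = nu_cp - nu_ncp        # effect of the BSSE correction
    anharmonic_shift = a_ncp - nu_ncp  # effect of anharmonicity
    return nu_ncp + bsse_shift + anharmonic_shift


def additivity_residuals(table):
    """Difference between the recomputed and the reported nu_int per mode."""
    return {
        mode: integral_frequency(nu_ncp, nu_cp, a_ncp) - reported
        for mode, (nu_ncp, nu_cp, a_ncp, reported) in table.items()
    }


for mode, residual in additivity_residuals(TABLE_7).items():
    print(f"{mode:6s} residual = {residual:+.2f} cm^-1")
```

If the assumption holds, every residual vanishes to within the rounding of the tabulated values, and the signs of the two shifts show directly for which modes the corrections reinforce or partly cancel each other.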
Krishnan and Kühn [104] showed that comparing the experimentally observed IR spectra of the adenine-thymine base pair with calculated frequencies is not a simple task, first of all because the Watson-Crick configuration of the A-T base pair is not the most stable isomer [105]. By means of a detailed theoretical anharmonic frequency analysis they were able to assign the IR-UV double resonance spectra [106] to a particular isomer which is not the Watson-Crick structure. A direct experimental assignment of the N-H stretching vibrations in A-T oligomers in the condensed phase is very difficult, because the N-H stretching spectral region overlaps with the O-H stretching region of water. Reducing the water content of the A-T oligomers does not solve the problem, because they do not adopt a well-defined structure at extremely low water concentration. To identify and characterize the N-H stretching vibrations of A-T base pair oligomers in DNA, Heyne et al. [107] used the two-color IR pump-probe technique, which overcomes this experimental problem. They found that the adenine ν(NH2) mode absorbs at 3215 cm-1 and has pronounced anharmonic couplings to the ν(C=O) mode of thymine and the δ(NH2) mode of adenine.
Figure 9. The adenine-thymine DNA base pair.
4. Conclusion

In this chapter a detailed investigation, including geometry, harmonic and anharmonic frequency calculations, of the cyclic conformations of the formamide and urea dimers, as well as of the guanine-cytosine and adenine-thymine DNA base pairs, was presented. From the results some general conclusions can be drawn.

First, an accurate description of molecular geometries requires ab initio methods that correctly describe the different intermolecular interaction effects (electrostatic, polarization, induction, dispersion and three- or four-body terms). Including all these effects requires high-level electron correlation methods, such as perturbation or coupled-cluster methods, together with basis sets that contain successively larger shells of polarization (correlating) functions (d, f, g, etc.). Such calculations need a huge amount of computer capacity and are therefore limited to small molecular systems (up to 15-20 atoms). The recently introduced local and density-fitting approximations are very promising tools: they greatly reduce the computational effort, so that the molecular size limit can be extended. These methods are also free from the mathematical artifact called basis set superposition error (BSSE).

For an accurate description of infrared absorption spectra the harmonic approximation is not good enough, and anharmonic corrections should also be included. When these effects are included, some normal modes, such as the X-H (X = C or N), C=O and N-H+ stretching vibrations, need more detailed investigation. For the N-H and C=O covalent-bond stretching vibrations the anharmonic frequency correction is significant. In the Hoogsteen conformation of the guanine-cytosine base pair the anharmonic correction is comparable with the harmonic frequency value itself. For the same N-H and C=O covalent-bond stretching vibrations the BSSE corrections are also significant, but only when the normal modes are located in the intermolecular region. For normal modes located in the intermolecular region, the influence of the adjoining molecule (dimer effect) on the vibrational frequency is comparable in magnitude with the BSSE and anharmonic corrections. Analysis of the magnitudes of the anharmonic and BSSE corrections showed that their contributions to the harmonic frequency shift can be considered, to a very good approximation, as additive. The anharmonic coupling between intra- and intermolecular normal modes is significant only when the vibrational motions occur in the same molecular plane (the plane defined by those atoms which move during the vibration) and along the same vibrational direction.

Using classical DFT functionals with double-zeta quality basis sets, such as 6-31G or D95V, one can obtain a good qualitative description of the anharmonic corrections, but further investigations using recently developed DFT or perturbation methods and triple-zeta quality basis sets are needed to obtain a good quantitative description as well. The strong coupling between different intra- and intermolecular normal modes can be considered an important and biologically relevant effect: through it, vibrational relaxation processes can take place very easily and strongly excited localized vibrations can dissipate as thermal vibrations.
References

[1] Rose G.; Fleming P.; Banavar J.; Maritan A. A backbone-based theory of protein folding, PNAS, 2006, 103, 16623-16633.
[2] Deechongkit S.; Nguyen H.; Dawson P. E.; Gruebele M.; Kelly J. W. Context-dependent contributions of backbone hydrogen bonding to β-sheet folding energetics, Nature, 2004, 430, 101-105.
[3] Bende A.; Suhai S. BSSE-Corrected Geometry and Harmonic and Anharmonic Vibrational Frequencies of Formamide–Water and Formamide–Formamide Dimers, Int. J. Quantum Chem., 2005, 103(6), 841-853.
[4] Gerber R. B.; Chaban G. M.; Brauer B.; Miller Y., in Theory and Applications of Computational Chemistry: The First 40 Years, Elsevier, Amsterdam, The Netherlands, 2005, Chap. 9, pp. 165-193.
[5] Pratt D. W. High Resolution Spectroscopy in the Gas Phase: Even Large Molecules Have Well-Defined Shapes, Annu. Rev. Phys. Chem., 1998, 49, 481-530.
[6] Belhayara K.; Chamma D.; Velcescu A. C.; Rousseau O. H. On the similarity of the IR lineshapes of weak H-bonds in the gas and liquid phases: Quantum combined effects of strong anharmonic coupling, multiple Fermi resonances, weak dampings and rotational structure, J. Mol. Struct., 2007, 833, 65-73.
[7] Maréchal Y. IR spectra of carboxylic acids in the gas phase: A quantitative reinvestigation, J. Chem. Phys., 1987, 87, 6344-6353.
[8] Chaban G. M.; Jung J. O.; Gerber R. B. Anharmonic Vibrational Spectroscopy of Glycine: Testing of ab Initio and Empirical Potentials, J. Phys. Chem. A, 2000, 104, 10035-10044.
[9] Brauer B.; Gerber R. B.; Kabelac M.; Hobza P.; Bakker J. M.; Riziq A. G. A.; de Vries M. S. Vibrational Spectroscopy of the G···C Base Pair: Experiment, Harmonic and Anharmonic Calculations, and the Nature of the Anharmonic Couplings, J. Phys. Chem. A, 2005, 109, 6974-6984.
[10] Carbonniere P.; Barone V. Performances of different density functionals in the computation of vibrational spectra beyond the harmonic approximation, Chem. Phys. Lett., 2004, 399, 226-229.
[11] Pele L.; Gerber R. B. On the number of significant mode-mode anharmonic couplings in vibrational calculations: Correlation-corrected vibrational self-consistent field treatment of di-, tri-, and tetrapeptides, J. Chem. Phys., 2008, 128, 165105.
[12] Roitberg A.; Gerber R. B.; Elber R.; Ratner M. A. Anharmonic wave functions of proteins: quantum self-consistent field calculations of BPTI, Science, 1995, 268, 1319-1322.
[13] Bowman J. M. The self-consistent-field approach to polyatomic vibrations, Acc. Chem. Res., 1986, 19, 202-208.
[14] Rekik N.; Oujia B.; Wójcik M. J. Theoretical infrared spectral density of H-bonds in liquid and gas phases: Anharmonicities and dampings effects, Chem. Phys., 2008, 352, 65-76.
[15] Sinha P.; Boesch S. E.; Gu C. M.; Wheeler R. A.; Wilson A. K. Harmonic Vibrational Frequencies: Scaling Factors for HF, B3LYP, and MP2 Methods in Combination with Correlation Consistent Basis Sets, J. Phys. Chem. A, 2004, 108, 9213-9217.
[16] Halls M. D.; Velkovski J.; Schlegel H. B. Harmonic frequency scaling factors for Hartree-Fock, S-VWN, B-LYP, B3-LYP, B3-PW91 and MP2 with the Sadlej pVTZ electric property basis set, Theor. Chem. Acc., 2001, 105, 413-421.
[17] Wang N. X.; Venkatesh K.; Wilson A. K. Behavior of Density Functionals with Respect to Basis Set. 3. Basis Set Superposition Error, J. Phys. Chem. A, 2006, 110, 779-784.
[18] Tuma C.; Boese A. D.; Handy N. C. Predicting the binding energies of H-bonded complexes: A comparative DFT study, Phys. Chem. Chem. Phys., 1999, 1, 3939-3948.
[19] Haynes P. D.; Skylaris C.-K.; Mostofi A. A.; Payne M. C. Elimination of basis set superposition error in linear-scaling density-functional calculations with local orbitals optimised in situ, Chem. Phys. Lett., 2006, 422, 345-349.
[20] Jansen H. B.; Ros P. Non-empirical molecular orbital calculations on the protonation of carbon monoxide, Chem. Phys. Lett., 1969, 3, 140-143.
[21] Boys S. F.; Bernardi F. The calculation of small molecular interactions by the differences of separate total energies. Some procedures with reduced errors, Mol. Phys., 1970, 19, 553-566.
[22] Mayer I. Towards a “Chemical” Hamiltonian, Int. J. Quantum Chem., 1983, 23, 341-363.
[23] Mayer I. The chemical Hamiltonian approach for treating the BSSE problem of intermolecular interactions, Int. J. Quantum Chem., 1998, 70, 41-63.
[24] Mayer I.; Surján P. R. Monomer geometry relaxation and the basis set superposition error, Chem. Phys. Lett., 1992, 191, 497-499.
[25] Mayer I.; Vibók Á. SCF theory of intermolecular interactions without basis set superposition error, Chem. Phys. Lett., 1987, 136, 115-121.
[26] Mayer I.; Vibók Á. Intermolecular SCF method without BSSE: the closed-shell case, Chem. Phys. Lett., 1987, 140, 558-564.
[27] Vibók Á.; Mayer I. Intermolecular SCF theory without BSSE: The equations and some applications for small systems, J. Mol. Struct. (Theochem), 1988, 170, 9-17.
[28] Mayer I.; Vibók Á.; Halász G. J.; Valiron P. A BSSE-free SCF algorithm for intermolecular interactions. III. Generalization for three-body systems and for using bond functions, Int. J. Quantum Chem., 1996, 57, 1049-1055.
[29] Halász G. J.; Vibók Á.; Valiron P.; Mayer I. BSSE-Free SCF Algorithm for Treating Several Weakly Interacting Systems, J. Phys. Chem., 1996, 100, 6332-6335.
[30] Halász G. J.; Vibók Á.; Suhai S. A BSSE-free SCF algorithm for intermolecular interactions. IV. Generalization for open-shell systems, Int. J. Quantum Chem., 1998, 68, 151-158.
[31] Mayer I.; Turi L. An analytical investigation into the BSSE problem, J. Mol. Struct. (Theochem), 1991, 227, 43-65.
[32] Mayer I.; Surján P. R. Improved intermolecular SCF theory and the BSSE problem, Int. J. Quantum Chem., 1989, 36, 225-240.
[33] Mayer I.; Surján P. R.; Vibók Á. BSSE-free SCF methods for intermolecular interactions, Int. J. Quantum Chem., Quant. Chem. Sym., 1989, 23, 281-290.
[34] Mayer I.; Vibók Á. A BSSE-free SCF algorithm for intermolecular interactions, Int. J. Quantum Chem., 1991, 40, 139-148.
[35] Vibók Á.; Mayer I. A BSSE-free SCF algorithm for intermolecular interactions. II. Sample calculations on hydrogen-bonded complexes, Int. J. Quantum Chem., 1992, 43, 801-811.
[36] Hamza A.; Vibók Á.; Halász G. J.; Mayer I. BSSE-free SCF theories: a comment, J. Mol. Struct. (Theochem), 2000, 501-502, 427-434.
[37] Halász G. J.; Vibók Á.; Suhai S.; Mayer I. Toward a BSSE-free description of strongly interacting systems, Int. J. Quantum Chem., 2002, 89, 190-197.
[38] Mayer I.; Vibók Á. BSSE-free second-order intermolecular perturbation theory, Mol. Phys., 1997, 92, 503-510.
[39] Vibók Á.; Halász G. J.; Mayer I. BSSE-free second order intermolecular perturbation theory II. Sample calculations on hydrogen-bonded complexes, Mol. Phys., 1998, 93, 873-877.
[40] Mayer I.; Valiron P. Second order Møller–Plesset perturbation theory without basis set superposition error, J. Chem. Phys., 1998, 109, 3360-3373.
[41] Vibók Á.; Halász G. J.; Mayer I., in Electron Correlation and Material Properties; Gonis A.; Kioussis N.; Ciftan M., Eds.; Kluwer Academic/Plenum: New York, 2004, pp. 263-283.
[42] Mayer I.; Valiron P., Program CHA-MP2, Budapest, 1998.
[43] Valiron P.; Vibók Á.; Mayer I. Comparison of a posteriori and a priori BSSE correction schemes for SCF intermolecular energies, J. Comp. Chem., 1993, 14, 401-409.
[44] Halász G. J.; Vibók Á.; Mayer I. Comparison of basis set superposition error corrected perturbation theories for calculating intermolecular interaction energies, J. Comp. Chem., 1999, 20, 274-283.
[45] Mayer I. On the Hylleraas functional for a non-Hermitian unperturbed Hamiltonian, Mol. Phys., 1996, 89, 515-519.
[46] Schneider W.; Thiel W. Anharmonic force fields from analytic second derivatives: Method and application to methyl bromide, Chem. Phys. Lett., 1989, 157, 367-373.
[47] Barone V. Anharmonic vibrational properties by a fully automated second-order perturbative approach, J. Chem. Phys., 2005, 122, 014108.
[48] Weigend F.; Häser M.; Patzelt H.; Ahlrichs R. RI-MP2: optimized auxiliary basis sets and demonstration of efficiency, Chem. Phys. Lett., 1998, 294, 143-152.
[49] Dunlap B. I. Robust and variational fitting, Phys. Chem. Chem. Phys., 2000, 2, 2113-2116.
[50] Weigend F. A fully direct RI-HF algorithm: Implementation, optimised auxiliary basis sets, demonstration of accuracy and efficiency, Phys. Chem. Chem. Phys., 2002, 4, 4285-4291.
[51] Weigend F.; Köhn A.; Hättig C. Efficient use of the correlation consistent basis sets in resolution of the identity MP2 calculations, J. Chem. Phys., 2002, 116, 3175-3183.
[52] Boys S. F., in Quantum Theory of Atoms, Molecules, and the Solid State, edited by P. O. Löwdin, Academic, New York, 1966, p. 253.
[53] Pipek J.; Mezey P. G. A fast intrinsic localization procedure applicable for ab initio and semiempirical linear combination of atomic orbital wave functions, J. Chem. Phys., 1989, 90, 4916-4926.
[54] Hetzer G.; Pulay P.; Werner H.-J. Multipole approximation of distant pair energies in local MP2 calculations, Chem. Phys. Lett., 1998, 290, 143-149.
[55] Cabaleiro-Lago E. M.; Ríos M. A. Development of an intermolecular potential function for interactions in formamide clusters based on ab initio calculations, J. Chem. Phys., 1999, 110, 6782-6791.
[56] Cabaleiro-Lago E. M.; Otero J. R. Ab Initio and density functional theory study of the interaction in formamide and thioformamide dimers and trimers, J. Chem. Phys., 2002, 117, 1621-1632.
[57] Martell J. M.; Yu H.; Goddard J. D. Molecular decompositions of acetaldehyde and formamide: theoretical studies using Hartree-Fock, Møller-Plesset and density functional theories, Mol. Phys., 1997, 92, 497-502.
[58] Lii J.-H.; Allinger N. L. Directional hydrogen bonding in the MM3 force field: II, J. Comp. Chem., 1998, 19, 1001-1016.
[59] Hobza P.; Šponer J. MP2 and CCSD(T) calculations on H-bonded and stacked formamide···formamide and formamidine···formamidine dimers, J. Mol. Struct. (Theochem), 1996, 388, 115-120.
[60] Florián J.; Leszczynski J.; Johnson B. G. On the intermolecular vibrational modes of the guanine···cytosine, adenine···thymine and formamide···formamide H-bonded dimers, J. Mol. Struct., 1995, 349, 421-426.
[61] Sobolewski A. L. Ab initio study of the potential energy functions relevant for hydrogen transfer in formamide, its dimer and its complex with water, J. Photochem. Photobio. A, 1995, 89, 89-97.
[62] Vargas R.; Garza J.; Friesner R. A.; Stern H.; Hay B. P.; Dixon D. A., J. Phys. Chem. A, 2001, 105, 4963.
[63] Defrançois C.; Périquet V.; Carles S.; Schermann J. P.; Adamowicz L. Neutral and negatively-charged formamide, N-methylformamide and dimethylformamide clusters, Chem. Phys., 1998, 239, 475-483.
[64] Sathyan N.; Santhanam V.; Sobhanadri J. Ab initio calculations on some binary systems involving hydrogen bonds, J. Mol. Struct. (Theochem), 1995, 333, 179-189.
[65] Wójcik M. J.; Hirakawa A. Y.; Tsuboi M.; Kato S.; Morokuma K. Ab initio MO calculation of force constants and dipole derivatives for the formamide dimer. An estimation of hydrogen-bond force constants, Chem. Phys. Lett., 1983, 100, 523-528.
[66] Itoh K.; Shimanouchi T. Vibrational spectra of crystalline formamide, J. Mol. Spectrosc., 1972, 42, 86-99.
[67] Lovas F. J.; Suenram R. D.; Fraser G. T. The microwave spectrum of formamide–water and formamide–methanol complexes, J. Chem. Phys., 1988, 88, 722-729.
[68] Engdahl A.; Nelander B. Complex formation between water and formamide, J. Chem. Phys., 1993, 99, 4894-4907.
[69] Gaussian 03; Revision C.02; Frisch M. J.; Trucks G. W.; Schlegel H. B.; Scuseria G. E.; Robb M. A.; Cheeseman J. R.; Montgomery J. A. Jr.; Vreven T.; Kudin K. N.; Burant J. C.; Millam J. M.; Iyengar S. S.; Tomasi J.; Barone V.; Mennucci B.; Cossi M.; Scalmani G.; Rega N.; Petersson G. A.; Nakatsuji H.; Hada M.; Ehara M.; Toyota K.; Fukuda R.; Hasegawa J.; Ishida M.; Nakajima T.; Honda Y.; Kitao O.; Nakai H.; Klene M.; Li X.; Knox J. E.; Hratchian H. P.; Cross J. B.; Adamo C.; Jaramillo J.; Gomperts R.; Stratmann R. E.; Yazyev O.; Austin A. J.; Cammi R.; Pomelli C.; Ochterski J. W.; Ayala P. Y.; Morokuma K.; Voth G. A.; Salvador P.; Dannenberg J. J.; Zakrzewski V. G.; Dapprich S.; Daniels A. D.; Strain M. C.; Farkas O.; Malick D. K.; Rabuck A. D.; Raghavachari K.; Foresman J. B.; Ortiz J. V.; Cui Q.; Baboul A. G.; Clifford S.; Cioslowski J.; Stefanov B. B.; Liu G.; Liashenko A.; Piskorz P.; Komaromi I.; Martin R. L.; Fox D. J.; Keith T.; Al-Laham M. A.; Peng C. Y.; Nanayakkara A.; Challacombe M.; Gill P. M. W.; Johnson B.; Chen W.; Wong M. W.; Gonzalez C.; Pople J. A.; Gaussian, Inc., Wallingford CT, 2004.
[70] Dupuis M.; Farazdel A., HONDO-8, from MOTECC-91; IBM Corporation Center for Scientific & Engineering Computations: Kingston, NY, 1991.
[71] Beu T. A., Program NOMAD, Cluj-Napoca, Romania, 1997.
[72] Press W. H.; Flannery B. P.; Teukolsky S. A.; Vetterling W. T., Numerical Recipes, Cambridge University Press: Cambridge, U.K., 1986.
[73] Bende A.; Vibók Á.; Halász G. J.; Suhai S. BSSE-Free Description of the Formamide Dimers, Int. J. Quantum Chem., 2001, 84, 617-622.
[74] Bende A.; Vibók Á.; Halász G. J.; Suhai S. Ab Initio Study of the Ammonia–Ammonia Dimer: BSSE-Free Structures and Intermolecular Harmonic Vibrational Frequencies, Int. J. Quantum Chem., 2004, 99, 585-593.
[75] Masunov A.; Dannenberg J. J. Theoretical Study of Urea. I. Monomers and Dimers, J. Phys. Chem. A, 1999, 103, 178-184.
[76] Masunov A.; Dannenberg J. J. Theoretical Study of Urea and Thiourea. 2. Chains and Ribbons, J. Phys. Chem. B, 2000, 104, 806-810.
[77] Rousseau B.; Van Alsenoy C.; Keuleers R.; Desseyn H. O. Solids Modeled by Ab Initio Crystal Field Methods. Part 17. Study of the Structure and Vibrational Spectrum of Urea in the Gas Phase and in Its P21m Crystal Phase, J. Phys. Chem. A, 1998, 102, 6540-6548.
[78] Keuleers R.; Desseyn H. O.; Rousseau B.; Van Alsenoy C. Vibrational Analysis of Urea, J. Phys. Chem. A, 1999, 103, 4621-4630.
[79] Clabo Jr. D. A.; Allen W. D.; Remington R. B.; Yamaguchi Y.; Schaefer III H. F. A systematic study of molecular vibrational anharmonicity and vibration-rotation interaction by self-consistent-field higher-derivative methods. Asymmetric top molecules, Chem. Phys., 1988, 123, 187-239.
[80] Barone V. Anharmonic vibrational properties by a fully automated second-order perturbative approach, J. Chem. Phys., 2005, 122, 014108.
[81] Dunning Jr. T. H.; Hay P. J., in Modern Theoretical Chemistry, Ed. Schaefer III H. F., Plenum, New York, 1976, Vol. 3, 1-28.
[82] MOLPRO, version 2008.1, a package of ab initio programs, Werner H.-J.; Knowles P. J.; Lindh R.; Manby F. R.; Schütz M.; Celani P.; Korona T.; Mitrushenkov A.; Rauhut G.; Adler T. B.; Amos R. D.; Bernhardsson A.; Berning A.; Cooper D. L.; Deegan M. J. O.; Dobbyn A. J.; Eckert F.; Goll E.; Hampel C.; Hetzer G.; Hrenar T.; Knizia G.; Köppl C.; Liu Y.; Lloyd A. W.; Mata R. A.; May A. J.; McNicholas S. J.; Meyer W.; Mura M. E.; Nicklass A.; Palmieri P.; Pflüger K.; Pitzer R.; Reiher M.; Schumann U.; Stoll H.; Stone A. J.; Tarroni R.; Thorsteinsson T.; Wang M.; Wolf A., see http://www.molpro.net.
[83] Molden 4.7, Schaftenaar G.; Noordik J. H. Molden: a pre- and post-processing program for molecular and electronic structures, J. Comput.-Aided Mol. Design, 2000, 14, 123-134.
[84] Jalkanen K. J.; Stephens P. J. Ab initio calculation of force fields and vibrational spectra: 2-oxetanone, J. Phys. Chem., 1991, 95, 5446-5454.
[85] Šponer J.; Jurečka P.; Hobza P. Accurate Interaction Energies of Hydrogen-Bonded Nucleic Acid Base Pairs, J. Am. Chem. Soc., 2004, 126, 10142-10151.
[86] Podolyan Y.; Nowak M. J.; Lapinski L.; Leszczynski J. Probing ab initio MP2 approach towards the prediction of vibrational infrared spectra of DNA base pairs, J. Mol. Struct., 2005, 744-747, 19-34.
[87] Bakker J. M.; Compagnon I.; Meijer G.; von Helden G.; Kabeláč M.; Hobza P.; de Vries M. S. The mid-IR absorption spectrum of gas-phase clusters of the nucleobases guanine and cytosine, Phys. Chem. Chem. Phys., 2004, 6, 2810-2815.
[88] Gorb L.; Podolyan Y.; Dziekonski P.; Sokalski W. A.; Leszczynski J. Double-Proton Transfer in Adenine-Thymine and Guanine-Cytosine Base Pairs. A Post-Hartree-Fock ab Initio Study, J. Am. Chem. Soc., 2004, 126, 10119-10129.
[89] Müller A.; Talbot F.; Leutwyler S. Hydrogen Bond Vibrations of 2-Aminopyridine·2-Pyridone, a Watson−Crick Analogue of Adenine·Uracil, J. Am. Chem. Soc., 2002, 124, 14486-14494.
[90] Hobza P.; Havlas Z. Blue-Shifting Hydrogen Bonds, Chem. Rev., 2000, 100, 4253-4264.
[91] Hobza P.; Špirko V. Why is the N1–H stretch vibration frequency of guanine shifted upon dimerization to the red and the amino N–H stretch vibration frequency to the blue?, Phys. Chem. Chem. Phys., 2003, 5, 1290-1294.
[92] Wang N. X.; Venkatesh K.; Wilson A. K. Behavior of Density Functionals with Respect to Basis Set. 3. Basis Set Superposition Error, J. Phys. Chem. A, 2006, 110, 779-784.
[93] Simon S.; Bertran J.; Sodupe M. Effect of Counterpoise Correction on the Geometries and Vibrational Frequencies of Hydrogen Bonded Systems, J. Phys. Chem. A, 2001, 105, 4359-4364.
[94] Watson J. D.; Crick F. H. Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid, Nature, 1953, 171 (4356), 737-738.
[95] Hoogsteen K. The structure of crystals containing a hydrogen-bonded complex of 1-methylthymine and 9-methyladenine, Acta Crystallogr., 1959, 12, 822-823.
[96] Felsenfeld G.; Davies D. R.; Rich A. Formation Of A Three-Stranded Polynucleotide Molecule, J. Am. Chem. Soc., 1957, 79, 2023-2024.
[97] Hakoshima T.; Fukui T.; Ikehara M.; Tomita K.-I. Molecular structure of a double helix that has non-Watson-Crick type base pairing formed by 2-substituted poly(A) and poly(U), PNAS, 1981, 78, 7309-7313.
[98] Isaksson J.; Zamaratski E.; Maltseva T. V.; Agback P.; Kumar A.; Chattopadhyaya J. The First Example of a Hoogsteen Basepaired DNA Duplex in Dynamic Equilibrium with a Watson-Crick Basepaired Duplex – A Structural (NMR), Kinetic and Thermodynamic Study, J. Biomol. Struct. Dyn., 2001, 18, 783-806.
[99] Patikoglou G. A.; Kim J. L.; Sun L.; Yang S.-H.; Kodadek T.; Burley S. K. TATA element recognition by the TATA box-binding protein has been conserved throughout evolution, Genes Dev., 1999, 13, 3217-3230.
[100] Leontis N. B.; Westhof E. Conserved geometrical base-pairing patterns in RNA, Quart. Rev. Biophys., 1998, 31, 399-445.
[101] Abrescia N. G. A.; Thompson A.; Huynh-Dinh T.; Subirana J. A. Crystal structure of an antiparallel DNA fragment with Hoogsteen base pairing, PNAS, 2002, 99, 2806-2811.
[102] Guerra C. F.; Bickelhaupt F. M.; Snijders J. G.; Baerends E. J. Hydrogen Bonding in DNA Base Pairs: Reconciliation of Theory and Experiment, J. Am. Chem. Soc., 2000, 122, 4117-4128.
[103] van der Wijst T.; Guerra C. F.; Swart M.; Bickelhaupt F. M. Performance of various density functionals for the hydrogen bonds in DNA base pairs, Chem. Phys. Lett., 2006, 426, 415-421.
[104] Krishnan G. M.; Kühn O. Identifying adenine–thymine base pairing by anharmonic analysis of the hydrogen-bonded NH stretching vibrations, Chem. Phys. Lett., 2007, 435, 132-135.
[105] Kabeláč M.; Hobza P. Potential Energy and Free Energy Surfaces of All Ten Canonical and Methylated Nucleic Acid Base Pairs: Molecular Dynamics and Quantum Chemical ab Initio Studies, J. Phys. Chem. B, 2001, 105, 5804-5817.
[106] Plützer C.; Hünig I.; Kleinermanns K.; Nir E.; de Vries M. S. Pairing of Isolated Nucleobases: Double Resonance Laser Spectroscopy of Adenine-Thymine, ChemPhysChem, 2003, 4, 838-842.
[107] Heyne K.; Krishnan G. M.; Kühn O. Revealing Anharmonic Couplings and Energy Relaxation in DNA Oligomers by Ultrafast Infrared Spectroscopy, J. Phys. Chem. B, 2008, 112, 7909-7915.
In: Electrostatics: Theory and Applications Editor: Camille L. Bertrand, pp. 191-215
ISBN 978-1-61668-549-2 © 2010 Nova Science Publishers, Inc.
Chapter 9
EMERGENT PROPERTIES IN BOHMIAN CHEMISTRY
Jan C.A. Boeyens
Unit for Advanced Study, University of Pretoria, South Africa
Abstract

Bohmian mechanics developed from the hydrodynamic interpretation of quantum events. By this interpretation all dynamic variables retain their classical meaning in quantum systems. It is of special significance in chemistry as a discipline which is traditionally based on the point electrons of quantum field theory. It could be more informative to assume a non-dispersive electronic spinor, or wave packet, with divergent and convergent spherical wave components, and with many properties resembling those of a point particle. Complex chemical matter is endowed with three attributes: cohesion, conformation and affinity, which can be reduced to the three fundamental electronic properties of charge, angular momentum and quantum potential, known from the wave structure. The chemical effects of these respective scalar, vector and temporal principles all manifest as extremum phenomena. The optimal distribution of electronic charge in space appears as Pauli’s exclusion principle. Minimization of orbital angular momentum becomes the generator of molecular conformation. Equalization of electronegativity, the quantum potential of the valence state, dictates chemical affinity. The chemical environment is said to generate three emergent properties: the exclusion principle, molecular structure and the second law of thermodynamics. These concepts cannot be predicted from first fundamental principles. Only by recognition of the emergent properties of chemistry is it possible to simulate chemical behaviour. The exclusion principle controls all forms of chemical cohesion, atomic structure and periodicity; molecular structure underpins vector properties such as conformational rigidity, optical activity, photochemistry and other stereochemical phenomena; while transport properties and chemical reactivity depend on the second law. Simulation of these chemical concepts by constructionist procedures, starting from basic physics, is impossible. The ultimate reason is that complex chemical properties are not represented by quantum-mechanical operators in the same sense as energy and momentum. The Bohmian interpretation, which enables the introduction of simplifying emergent parameters, in analogy with classical procedures, allows the calculation of molecular properties by generalized Heitler-London methods, point-charge simulation and molecular mechanics.
1. Introduction
The editorial challenge to address the quantum frontiers of atoms and molecules in chemistry cuts far deeper than a survey of current activity in quantum chemistry, which for all practical purposes means ab-initio computational chemistry. There is no quantum theory of chemistry: Quantum mechanics originated as a theory to understand radiation and its interaction with sub-atomic matter. It gave birth to the modern science of spectroscopy, in which form it stimulated the development of sophisticated observational techniques that revolutionized physics, chemistry and biology. However, the early promise of a quantum theory of matter in general has not come to fruition. A theory that fails to elucidate the nature of electrons, atoms and molecules can never lead to an understanding of chemistry. It may enable computations of unrivalled complexity, but without a conceptual framework the results have no meaning. As a theory of spectroscopy, quantum mechanics serves to relate measured quantities, such as frequency and wavelength, to the dynamic concepts of energy and angular momentum, by means of differential operators, e.g.:
$$-i\hbar\frac{\partial}{\partial t} \to E, \qquad i\hbar\frac{\partial}{\partial\varphi} \to L.$$
Although chemically more useful information on molecular structure and electronegativity is also embedded in the state functions, there are no known operators whereby to extract this information directly. The current computational alternative fails for the same reason. Minimization of the scalar quantity, energy, can never generate three-dimensional structure, which is a vector concept. The failure to rationalize chemical behaviour is not a failure of quantum theory, but rather a failure of the traditional interpretation of the theory. The ruling interpretation of quantum theory, known as the Copenhagen interpretation, incorporates a number of features totally at variance with chemistry. It defines quantum objects, including photons, electrons, atoms and molecules, as structureless point particles without extension. It reduces a continuously varying density, such as the electronic charge distribution in atoms, molecules and crystals, to a probabilistic function. It offers no rationale for the occurrence of stationary states, apart from a postulate. Non-local effects are forbidden and the doctrine teaches more about measurement problems and quantum uncertainty than about chemical interaction. The unfortunate reality is that Schrödinger’s description of elementary matter as wave structures, which could not be reconciled with the Copenhagen orthodoxy, was ruled unacceptable and re-interpreted in terms of the quantum jumps and probability density of the particle model. This compromise resulted in the awkward concept of wave-particle duality, familiar to all modern chemists, but understood by none.
The challenge that we are facing here is to retrace our steps to the point where the time-honoured concepts of classical chemistry merge in a natural way with the ideas of wave mechanics and start rebuilding a theory that “stimulate(s) the mutual understanding of the various branches of chemistry and its neighbouring sciences”, realizing that “the main stumbling block for the development of a theory of large and complex molecular systems is not computational but conceptual” [1].
I propose to start with an outline of the essential concepts of cohesion, conformation and affinity; show how these relate to the more fundamental concepts of space, matter and number; and the derived concepts of interaction, waves and periodicity. Next we examine the Schrödinger formalism, the Bohmian interpretation and the application of wave mechanics, supported by higher-level emergent properties. The photoelectric effect, probably the most effective argument to have established the particle nature of photons, is shown to be explained more convincingly by the transactional-wave model of electromagnetic interaction.
2. The Fundamental Concepts
Any object or event observed in Nature can always be considered as the product of more primitive events or the interaction between more primitive entities. It is quite natural, in this reductionist spirit, to ascribe the actions and features of a living organism to some lower-level activity of biological cells, which in turn is driven by intracellular chemical interactions. The molecular building blocks of biological cells are assumed to consist of atoms, held together by electromagnetic forces, while the sub-atomic nucleons interact with strong interaction, and so on, ad infinitum. Not really. The reduction has to stop somewhere. As progressively smaller entities are implicated at the more primitive levels, it is reasonable to conclude that the cascade ends in a void, traditionally assumed to be the same as space, or the vacuum. At this point it is easy to get distracted by the interminable philosophical dispute about the possible existence or non-existence of a void. As a practical alternative I prefer to define space in geometrical terms and the vacuum as a physical entity.
2.1. Space
The simplest way of looking at space is as a coordinate system that serves to describe the relative positions of observable objects. The intuitively most obvious is a cartesian system on three orthogonal axes. However, this may not necessarily be the most convenient coordinate system. Although it works well in a laboratory environment, it is well known to be inappropriate in geographical context, which is simplified by using spherical trigonometry. By the same argument cartesian coordinates may not be the most convenient to map the relative positions of astronomical objects. To first approximation the planets and most of their moons have been assumed to move in a single plane, called the ecliptic, around the sun. Individual elliptic orbits of planets and moons have a simple description in terms of Kepler’s laws in a plane, but most of these planes are known to make non-zero angles with the ecliptic. Moving into interstellar, or even intergalactic, space the situation becomes more complicated when dealing with cosmological distances, times and velocities, which demands relativistic rather than Galilean kinematics. Special relativity is conveniently formulated in four-dimensional Minkowski space and general relativity requires non-Euclidean geometry, i.e. curved space. In general relativity the geometry of space is described by a curvature tensor, which is linearly related to a stress tensor that describes the distribution of matter in the cosmos.
This reciprocal relationship shows that an empty universe has zero curvature and that curved space generates matter. The mechanism whereby curvature generates matter is visualized in the process of covering a curved surface with an inflexible sheet. The higher the curvature, the poorer the fit and the more the wrinkles in the cover that cannot be smoothed away. Such wrinkles in space are interpreted as matter and energy, the content of the stress tensor.
2.1.1. Number
Figure 1. Spiral structure of a fossilized nautilus shell.
Not only the topology of space-time, but also the physical content of the universe, resembles the natural number system in remarkable detail [2]. This explains the unreasonable effectiveness of mathematics as a scientific tool and the success of number theory to predict natural phenomena as a manifestation of cosmic symmetry [2, 3]. The physical world, as an image of the natural numbers, can never be known in more detail than the number system. Concepts such as infinity and singularity, poorly understood mathematically, therefore make no physical sense. On the other hand, the concept of high-dimensional space, readily manipulated mathematically although physically difficult to visualize, has a legitimate place in scientific discourse. To deal with the ubiquitous, but bothersome, infinities of physics, the infinity concept of projective geometry can be used to define both the number system and the physical cosmos as closed. Realizing that any closed system is periodic, by definition, a wave structure of the vacuum and periodicity of matter are inferred. The implied cosmic symmetry is referred to as self-similarity. The chambered structure of a nautilus shell, shown in Figure 1, is one of the best-known examples of this symmetry type. All the chambers have the same shape and only differ in size, which increases regularly along a golden logarithmic spiral. The same pattern occurs in the arrangement of growing seed buds in a sunflower head and in the image of a spiral galaxy. Recognition of the same growth features [2, 4] in atomic nuclei, atoms, covalent molecules and the solar system, reveals self-similarity on a much wider scale. The number theory of self-similarity shows that all of these structures are based on the Fibonacci number sequence, which converges to the golden ratio.
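The convergence of the Fibonacci sequence to the golden ratio, invoked above, can be illustrated numerically. The following is a minimal sketch, not taken from the chapter; the function name `fibonacci_ratios` and the choice of 30 terms are illustrative assumptions.

```python
import math

def fibonacci_ratios(n_terms):
    """Return the ratios F(k+1)/F(k) of consecutive Fibonacci numbers.

    The sequence is started from F(1) = F(2) = 1 (an assumed convention).
    """
    a, b = 1, 1
    ratios = []
    for _ in range(n_terms):
        ratios.append(b / a)
        a, b = b, a + b
    return ratios

# Golden ratio, the positive root of x^2 = x + 1.
GOLDEN_RATIO = (1 + math.sqrt(5)) / 2

ratios = fibonacci_ratios(30)
for index in (0, 4, 9, 29):
    # Print the term number, the ratio and its distance from the golden ratio.
    print(index + 1, ratios[index], abs(ratios[index] - GOLDEN_RATIO))
```

The printed differences shrink rapidly with the term number, which is the sense in which the ratio of successive Fibonacci numbers approaches the golden ratio.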
2.2. Vacuum
The fabric of space is a matter of conjecture. However, if tangible matter occurs on curving flat space, flat space is not void. A useful analogy is to picture the vacuum as a regular undulating expanse filled by waves of constant wavelength. When curved, interference of the primary waves produces persistent wave packets, earlier identified as wrinkles. In local space such a wave packet is conveniently described as the superposition of three-dimensional spherical waves, converging to and diverging from a centre of mass. These waves are the retarded and advanced solutions of the general wave equation, which implies motion, either forward or backward in time.
2.2.1. Wave Packets
A typical wave packet generated by such a superposition of waves is shown in Figure 2. The tangent curve follows the amplitude of the $1/r$ Coulomb potential, which reflects the actual charge density, except when $r \to 0$. The secondary waves propagate with the group velocity $v_g$ of the system and the primary waves have phase velocity $v_\phi$, such that $v_g v_\phi = c^2$, where $c = 1/\sqrt{\epsilon_0\mu_0}$ is the velocity of light in the vacuum. Such a wave packet [4]:
$$\Phi = Ae^{i\omega t}\,\frac{\sin kr}{kr} \quad\text{or}\quad \Phi = Ae^{i\omega t}\,\frac{\cos kr}{kr} \tag{1}$$
has been shown [5] to describe elementary waves, equivalent to the postulated elementary distortions of space, i.e. the elementary particles of atomic physics. Interpreted as an electron, the distance between nodal points represents $\lambda_{dB} = h/m_e v_g$, the de Broglie wavelength of a free electron and $\lambda_C = 2\pi/k = h/m_e c$, the Compton wavelength. The amplitude of the standing wave is proportional to the electronic charge. $\Phi_0 = A$, in eqn. (1), represents a wave packet with charge proportional to 0 or $\pm A$. Electrons and protons, despite their difference in mass, have charges of $\pm e$. The neutron is neutral. The field intensity $\Phi\Phi^* = A^2(\sin kr/kr)^2 = C/r^2$ defines the force between charges, in line with Coulomb’s law, except when $r \to 0$. The breakdown of Coulomb’s law, which occurs naturally for charged wave structures, is equivalent to the special renormalization postulate in quantum field theory.
2.2.2. Electron and Atom
A particle image, as shown in Figure 3, is obtained by rotating the diagram of Figure 2 about two axes perpendicular to $x$. The charge density can either contract or expand, depending on environmental pressure. The minimum radius that can be reached on compression depends on the rest mass of the object. For an electron $r_0 = e^2/m_0c^2$.
Figure 2. One-dimensional section through a spherical wave packet with components converging on and diverging from $x_0$.
Figure 3. Wave structure of a free electron with de Broglie wavelength $\lambda_{dB} = h/m_e v_g$.
Given the inferred flexible structure of an electron as a continuous indivisible charge, the self-similarity of atoms and the solar system is not obvious. The pioneering work on the planetary model of atomic structure was firmly based on Kepler’s model of elliptic orbits. Two crucial parameters that define a Kepler ellipse are the semimajor axis and the eccentricity, which characterize the size and shape of an orbit. By Newton’s laws these respective parameters are related to the energy and angular momentum of the orbital motion. In retrospect Kepler’s laws are seen to embody the general conservation principles for energy and angular momentum in celestial mechanics. The efforts of Bohr and Sommerfeld to explain electronic motion in atoms by the same model were spectacularly successful, despite a few subtle, but fatal, defects. Whereas Kepler’s model is valid in a gravitational field, it needs modification in an electromagnetic field as an accelerated charge radiates energy and an accelerated point charge therefore cannot maintain a stable orbit. Conservation
of angular momentum in a central electrostatic field should rather be interpreted as conservation of the spherical shape of a continuous charge. Polar deformation under an external influence is described by the three-dimensional surface harmonics, or eigenfunctions, of the circulation Laplacian, with discrete eigenvalues:
$$L^2\,Y_l^{m_l} = l(l+1)k^2\,Y_l^{m_l}, \qquad L_z\,Y_l^{m_l} = m_l k\,Y_l^{m_l}.$$
This classical result acquires quantum-mechanical meaning by equating the arbitrary constant $k$ with $\hbar$, the elementary unit of angular momentum [6].
Figure 4. Phase-locked cavity with perfectly reflecting walls, filled with radiation in the form of standing waves.
The difference between atomic and planetary systems goes a long way towards understanding of cosmic self-similarity. Although electrons in an atom are spread in three dimensions and planets orbit the sun in an approximately two-dimensional plane, both arrangements depend parametrically on the golden ratio. In the same way sunflower seeds of varying size are closely packed in a plane, compared to the three-dimensional stacking of nucleons; both styles conditioned by the golden ratio. The only common factor in all cases is the general curvature of space. Evidently, the curvature of cosmic space must be a function of the golden ratio, from the sub-atomic to supergalactic scales.
2.2.3. Mass
Not only the charge, but also the characteristic mass and spin of sub-atomic species are accounted for by their wave structure. Jennison and Drinkwater [7] demonstrated that microwave radiation trapped in a phase-locked cavity, as in Figure 4, generates an interaction pattern which is mathematically equivalent to a system with inertial mass. Disturbing the equilibrium by a pulse that moves a cavity wall at velocity δv for a period δt, which is matched to the wave propagation across the cavity, modifies the internal pressure by Doppler shifting of the waves and sets the entire system into motion with velocity 2δv. The radiation pressure on the walls is balanced by an electromagnetic field, which keeps the system in static equilibrium. In the real vacuum the analogue of the phase-locked cavity is a standing wave, filled with radiation of Compton wavelength, internal energy E, in equilibrium with the external radiation (wave) field. Simple calculation [7] shows that
the inertial mass of the wave packet obeys Newton’s law, $F = ma$, on identifying $E/c^2$ with the rest mass.
2.2.4. Spin
The standing-wave description of an electron defines it as an integral part of the vacuum, not obviously free to move without impediment. Linear motion of an electron must then clearly lead to continuous drag and deformation of both electron and its immediate environment, culminating in rupture of the vacuum and creation of a turbulent state. An electron that rotates in the vacuum, although more symmetrical, winds up the connecting medium until it shears and develops a discontinuity along a cylindrical surface. The only motion that occurs without distortion of the spherical wave packet or mechanical entanglement of the environment is rotation around a point. Unlike axial rotation this mode is more like a continuous wobble that returns to the original situation after two complete revolutions. The three dimensions of space participate equally in the motion without the transfer of rotational energy from the spinning object to the connecting medium. The strain that builds up during the first part of the rotational cycle relaxes during the second part. Apart from half-frequency cyclic disturbance in the connecting medium, the electron is free to move through the vacuum without permanent entanglement. An object, which performs this type of spherical rotation, is described mathematically by a spinor, or a quantity that reverses sign on rotation through an odd multiple of 2π radians.
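The sign reversal of a spinor after one full turn, and its recovery after two, can be checked with a few lines of code. The sketch below assumes the standard textbook SU(2) representation of a rotation about the z-axis, $\exp(-i\theta\sigma_z/2)$; this representation and the helper name `rotate_spinor` are not taken from the chapter and serve only as an illustration.

```python
import cmath
import math

def rotate_spinor(spinor, theta):
    """Rotate a two-component spinor by the angle theta about the z-axis.

    Assumes the standard SU(2) rotation exp(-i*theta*sigma_z/2), which is
    diagonal: the upper component acquires the phase exp(-i*theta/2) and
    the lower component the phase exp(+i*theta/2).
    """
    upper, lower = spinor
    return (cmath.exp(-1j * theta / 2) * upper,
            cmath.exp(1j * theta / 2) * lower)

# An arbitrary illustrative spinor.
spinor = (1 + 0j, 0.5 + 0.5j)

after_one_turn = rotate_spinor(spinor, 2 * math.pi)   # rotation through 2*pi
after_two_turns = rotate_spinor(spinor, 4 * math.pi)  # rotation through 4*pi

print(after_one_turn)   # numerically the negative of the original spinor
print(after_two_turns)  # numerically the original spinor
```

Both components change sign after a rotation through $2\pi$ and return to their original values only after $4\pi$, in line with the two complete revolutions described in the text.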
3. Quantum Theory
Quantum theory started with the discovery of line spectra and Balmer’s observation that the spectral lines of atomic hydrogen obey a digital formula, later generalized to:
$$\nu = Rc\left(\frac{1}{n_1^2} - \frac{1}{n_2^2}\right), \qquad n_2 > n_1 = 1, 2, 3 \ldots \tag{2}$$
The first sensible explanation of the formula was proposed in 1904 by Nagaoka who used the planet Saturn with its system of rings as the basis of an atomic model, with electrons at energy levels (rings) in simple numerical order, orbiting a heavy positively charged nucleus. Experimental confirmation of such an arrangement was found by Rutherford in 1910 and a dynamic model, based on Planck’s quantum condition, $E = h\nu$, was proposed in 1914 by Bohr. Where Nagaoka argued that electrodynamically stable orbits required a standing electron wave of length $\lambda = 2\pi r/n$ at an average distance $r$ from the nucleus, Bohr postulated quantum stability for an orbiting electron with angular momentum $p = nh/2\pi \equiv n\hbar$. With electrostatic and mechanical forces in balance (using electrostatic units, $4\pi\epsilon_0 = 1$):
$$\frac{e^2}{r^2} = \frac{p^2}{mr^3}, \qquad E = T + V = \frac{e^2}{2r} - \frac{e^2}{r} = -\frac{e^2}{2r},$$
the postulate leads to the Balmer formula
$$h\nu = \Delta E = \frac{2\pi^2 me^4}{h^2}\left(\frac{1}{n_1^2} - \frac{1}{n_2^2}\right), \tag{3}$$
$$E_n = -\frac{2\pi^2 me^4}{n^2h^2}, \qquad r_n = \frac{n^2h^2}{4\pi^2 me^2}. \tag{4}$$
The rest is history. When de Broglie rediscovered the Nagaoka condition in 1924 by postulating that all matter has an associated wavelength of $\lambda = h/mv$, only the mathematical framework for defining a general wave formalism of electronic behaviour was lacking.
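The Bohr expressions for the orbit radius and level energy can be evaluated numerically. The sketch below works in Gaussian (cgs) units to match the convention $4\pi\epsilon_0 = 1$ used above; the constants are CODATA values, and the function names, the sign convention (bound levels negative, consistent with $E = -e^2/2r$) and the neglect of the finite nuclear mass are assumptions made for illustration.

```python
import math

# CODATA values converted to Gaussian (cgs) units, matching 4*pi*epsilon_0 = 1.
# The nucleus is treated as infinitely heavy (an assumption of this sketch).
PLANCK_H = 6.62607015e-27          # erg s
ELECTRON_MASS = 9.1093837015e-28   # g
ELECTRON_CHARGE = 4.80320471e-10   # esu
LIGHT_SPEED = 2.99792458e10        # cm/s
ERG_PER_EV = 1.602176634e-12       # erg per electron volt

def bohr_radius(n):
    """Orbit radius r_n = n^2 h^2 / (4 pi^2 m e^2), in cm."""
    return n**2 * PLANCK_H**2 / (
        4 * math.pi**2 * ELECTRON_MASS * ELECTRON_CHARGE**2)

def bohr_energy(n):
    """Level energy E_n = -2 pi^2 m e^4 / (n^2 h^2), in erg."""
    return -2 * math.pi**2 * ELECTRON_MASS * ELECTRON_CHARGE**4 / (
        n**2 * PLANCK_H**2)

def transition_wavelength(n1, n2):
    """Wavelength (cm) of the line n2 -> n1, from h*nu = E(n2) - E(n1)."""
    frequency = (bohr_energy(n2) - bohr_energy(n1)) / PLANCK_H
    return LIGHT_SPEED / frequency

radius_cm = bohr_radius(1)
ground_state_ev = bohr_energy(1) / ERG_PER_EV
h_alpha_nm = transition_wavelength(2, 3) * 1e7  # convert cm to nm

print(radius_cm, ground_state_ev, h_alpha_nm)
```

Under these assumptions the first orbit comes out near half an angstrom, the ground level near 13.6 eV below the ionization limit, and the first Balmer line in the red part of the visible spectrum.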
3.1. Wave Mechanics
The mechanical behaviour of a Newtonian particle is described correctly by three quantities (energy, momentum and angular momentum), which describe the motion as a function of either time, displacement or rotation. In wave formalism each of these parameters is specified as a periodic function [8] with respect to time ($\tau$), translation ($\lambda$) or rotation ($\varphi$):
$$\omega = 2\pi/\tau,\ E = \hbar\omega; \qquad k = 2\pi/\lambda,\ p = \hbar k; \qquad m_l = 2\pi/\varphi,\ L_z = \hbar m_l.$$
The carrier of the electromagnetic field is described by the differential wave equation:
$$\nabla^2\Psi = \mu\epsilon\frac{\partial^2\Psi}{\partial t^2} = \frac{1}{c^2}\frac{\partial^2\Psi}{\partial t^2}. \tag{5}$$
To remain consistent with the previous relationships the dynamic variables need to be specified as differential operators:
$$E \to -\hbar i\frac{\partial}{\partial t}, \qquad p \to -\hbar i\nabla, \qquad L_z \to \hbar i\frac{\partial}{\partial\varphi},$$
which can be checked by direct substitution. To allow for the first-order temporal dependence of the energy, the equation for matter waves is restricted to processes which only depend on time through a factor $\exp(2\pi i\nu t)$, leading to the final form:
$$\left(-\frac{\hbar^2}{2m}\nabla^2 + V\right)\Psi = \pm\hbar i\frac{\partial\Psi}{\partial t} \tag{6}$$
which formally resembles the classical Hamiltonian definition of total energy, as
$$H = T + V = \frac{p^2}{2m} + V = E. \tag{7}$$
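By defining a density function $\rho = \Psi\Psi^*$ and a current density
$$\mathbf{j} = \frac{\hbar}{2mi}\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right) \tag{8}$$
there follows a continuity equation as in classical hydrodynamics
$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j} = 0. \tag{9}$$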
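The step from the wave equation to the continuity equation can be written out explicitly. The following worked derivation is a sketch under stated assumptions: the matter-wave equation is taken with the positive sign on the right-hand side, in the form $i\hbar\,\partial\Psi/\partial t = (-\hbar^2/2m)\nabla^2\Psi + V\Psi$, and the potential $V$ is assumed real.

```latex
\begin{align*}
\frac{\partial\rho}{\partial t}
  &= \Psi^*\frac{\partial\Psi}{\partial t} + \Psi\frac{\partial\Psi^*}{\partial t} \\
  &= \frac{1}{i\hbar}\left[\Psi^*\left(-\frac{\hbar^2}{2m}\nabla^2\Psi + V\Psi\right)
     - \Psi\left(-\frac{\hbar^2}{2m}\nabla^2\Psi^* + V\Psi^*\right)\right] \\
  &= -\frac{\hbar}{2mi}\left(\Psi^*\nabla^2\Psi - \Psi\nabla^2\Psi^*\right) \\
  &= -\nabla\cdot\left[\frac{\hbar}{2mi}\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right)\right]
   = -\operatorname{div}\mathbf{j}.
\end{align*}
```

The potential terms cancel because $V$ is real, and the last step uses $\nabla\cdot(\Psi^*\nabla\Psi) = \nabla\Psi^*\cdot\nabla\Psi + \Psi^*\nabla^2\Psi$ together with its complex conjugate.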
A general expression for a one-electron wave function over all available states
$$\Psi = \sum_k c_k\psi_k e^{2\pi i\nu_k t} \tag{10}$$
may be used to calculate the current density over two states $k$ and $l$:
$$\mathbf{j} = \frac{\hbar e}{mi}\sum_{k,l} c_k c_l\left(\psi_l\nabla\psi_k - \psi_k\nabla\psi_l\right)e^{2\pi i(\nu_k - \nu_l)t}. \tag{11}$$
If only a single eigenvibration is excited, the current disappears and the distribution of electron density remains constant. Otherwise an electron flows from one state to another in an exchange that involves a photon to keep the energy in balance. This flow of electricity can hardly be described as a quantum jump. More realistically the vibrations of the two affected states (emitter and acceptor) are seen to interact and generate a beat (wave packet) that moves to the state of lower energy. The virtual photon that links two equilibrium states turns into a real photon that carries the excess energy, either into or away from the system. In chemical applications Schrödinger’s equation is best known in its amplitude form, which is obtained by substituting $\Psi = \psi\exp(2\pi i\nu t)$, followed by elimination of the time parameter to give:
$$\nabla^2\psi + \frac{2m}{\hbar^2}(E - V)\psi = 0. \tag{12}$$
In spherical polar coordinates this equation, for the hydrogen problem, separates into independent radial and angular equations:
$$\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \left[\frac{2m}{\hbar^2}\bigl(E - V(r)\bigr) - \frac{l(l+1)}{r^2}\right]R = 0, \tag{13}$$
$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial Y}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2Y}{\partial\varphi^2} + l(l+1)Y = 0, \tag{14}$$
with separation constant $\lambda = l(l+1)$, integer $l$. The angular part is further separable into:
$$\frac{d^2\Phi}{d\varphi^2} + m_l^2\Phi = 0 \qquad (m_l = -l \ldots l), \tag{15}$$
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \left[l(l+1) - \frac{m_l^2}{\sin^2\theta}\right]\Theta = 0. \tag{16}$$
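The radial equation (13) can be checked numerically against known hydrogen solutions. The sketch below is an illustration under stated assumptions: atomic units ($\hbar = m = e = 1$), the attractive Coulomb potential $V(r) = -1/r$, unnormalized radial functions, and central finite differences for the derivatives. The function names are hypothetical.

```python
import math

def radial_residual(R, r, energy, l, step=1e-3):
    """Evaluate the left-hand side of the radial equation (13) at radius r.

    Assumes atomic units (hbar = m = e = 1) and V(r) = -1/r, so that the
    bracket in (13) becomes 2*(E + 1/r) - l*(l+1)/r^2. Derivatives are
    approximated by central finite differences with the given step.
    """
    first = (R(r + step) - R(r - step)) / (2 * step)
    second = (R(r + step) - 2 * R(r) + R(r - step)) / step**2
    potential = -1.0 / r
    bracket = 2.0 * (energy - potential) - l * (l + 1) / r**2
    return second + (2.0 / r) * first + bracket * R(r)

def radial_1s(r):
    """Unnormalized ground-state function exp(-r), with E = -1/2 and l = 0."""
    return math.exp(-r)

def radial_2p(r):
    """Unnormalized 2p function r*exp(-r/2), with E = -1/8 and l = 1."""
    return r * math.exp(-r / 2)

radii = [0.5, 1.0, 2.0, 5.0]
residuals_1s = [radial_residual(radial_1s, r, -0.5, 0) for r in radii]
residuals_2p = [radial_residual(radial_2p, r, -0.125, 1) for r in radii]
# Deliberately wrong energy: the residual no longer vanishes.
residuals_wrong = [radial_residual(radial_1s, r, -0.4, 0) for r in radii]

print(residuals_1s, residuals_2p, residuals_wrong)
```

For the correct eigenvalues the residual is zero to within the discretization error, whereas a trial energy that is not an eigenvalue leaves a clearly non-zero remainder, which is how the quantized levels emerge from the differential equation.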
An electron associated with a stationary proton ($V = -e^2/r$) defines the only problem of some chemical significance for which the radial equation has been solved. Since the proton is here regarded as a point particle, the system does not represent a wave-mechanical model of a hydrogen atom, despite contrary claims in all chemistry texts. Like the Bohr model, it defines a set of quantized energy levels to match most spectroscopic measurements, apart from the Lamb shift, fairly well. The angular equations are valid for central-field problems and produce quantized values of the orbital angular momentum. These eigenvalues should not be confused with the angular momenta of an orbiting particle. They are, more appropriately, considered as symmetry
parameters, such that $m_l = 0$ defines a spherically symmetrical charge distribution. For given $l$ there is always an odd number of $2l + 1$ sub-levels with different quantum numbers $m_l$, which, for many-electron systems, can be chosen in such a way that $\sum_l m_l = 0$, in all cases. The choice reflects the electrostatic property of the charge distribution to assume spherical symmetry. A hydrogen atom, by this model, has $m_l \neq 0$ only for excited states, which spontaneously relax to the $m_l = 0$ spherically symmetrical ground state.
3.1.1. Electron Spin
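Schrödinger’s equation appears incomplete in the sense of lacking an operator for spin, only because its eigenfunction solutions are traditionally considered complex variables. The wave function, interpreted as a column vector, operated on by square matrices, such that
$$\begin{pmatrix} e^{i(\omega t - kx)} & 0 \\ 0 & e^{-i(\omega t - kx)} \end{pmatrix}\begin{pmatrix}\phi_1 \\ \phi_2\end{pmatrix} = \begin{pmatrix}\phi_1 e^{i(\omega t - kx)} \\ \phi_2 e^{-i(\omega t - kx)}\end{pmatrix},$$
abbreviated to $\begin{pmatrix}\phi_1 e^{+} \\ \phi_2 e^{-}\end{pmatrix}$, represents a spinor that moves in the $x$-direction. By forming the derivatives:
$$\frac{\partial\phi}{\partial t} = i\omega\begin{pmatrix}\phi_1 e^{+} \\ \phi_2 e^{-}\end{pmatrix}, \qquad \frac{\partial^2\phi}{\partial x^2} = k^2\begin{pmatrix}\phi_1 e^{+} \\ \phi_2 e^{-}\end{pmatrix},$$
it follows that (in three dimensions):
$$-i\frac{\partial\phi}{\partial t} = \frac{\omega}{k^2}\nabla^2\phi.$$
This is Schrödinger’s equation, providing $(\hbar k)^2 = 2m\hbar\omega$, i.e.
$$-i\frac{\partial\phi}{\partial t} = \frac{\hbar}{2m}\nabla^2\phi, \qquad \text{as in (6): } V = 0,$$
which shows $\hbar\omega = E = p^2/2m$, $k = 2\pi/\lambda$, $p = h/\lambda$. This result is interpreted [4] to show that a region of the continuum, which rotates in spherical mode, interacts with its environment by generating a wave-like disturbance at half the angular frequency of the core. The angular momentum on the surface of a unit sphere is $L = m\omega$. At $\lambda = 2\pi$, $k = 1$, the spin angular momentum follows as $L = \hbar/2$, with intrinsic magnetic moment $\mu = \hbar e/2mc$.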
3.2. Bohmian Mechanics
The connection between wave mechanics and hydrodynamics, expressed by equations (8) and (9), was developed in more detail by Madelung, writing the time dependence of Ψ
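as an action function, $\Psi = \psi e^{2\pi i\nu t} \to Re^{iS/\hbar}$, which separates (6) into a coupled pair that resembles the field equations of hydrodynamics:
$$\frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} - \frac{\hbar^2\nabla^2R}{2mR} + V = 0, \tag{17}$$
$$\frac{\partial R^2}{\partial t} + \nabla\cdot\left(\frac{R^2\nabla S}{m}\right) = 0, \tag{18}$$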
which describe the irrotational flow of a compressible fluid, assuming $R^2$ to represent the density $\rho(x)$ of a continuous fluid with stream velocity $v = \nabla S/m$. It was shown that both density and flux vary periodically with the same periodicity as $\nu_{ik} = (E_i - E_k)/h$, that results from superposition of states $i$ and $k$. This means that radiation is not due to quantum jumps, but rather happens by slow transition in a non-stationary state. An attractive feature of the hydrodynamic model is that it obviates the statistical interpretation of quantum theory, by eliminating the need of a point particle. It is worth noting that the assumption of a point electron derives from the observation that it responds as a unit to an electromagnetic signal, which must therefore propagate instantaneously through the interior of the electron, at variance with the theory of relativity. However, by now it is known from experiment that non-local (instantaneous) response is possible in quantum systems and the initial reservation against Madelung’s proposal and Lorentz’s definition of an electron as a flexible sphere should fall away. On reinterpretation it was pointed out by David Bohm that equation (17) differed from the classical Hamilton-Jacobi equation only in the term
$$V_q = -\frac{\hbar^2\nabla^2R}{2mR}. \tag{19}$$
The quantity $V_q$, called the quantum potential, vanishes for classical systems as $h/m \to 0$. A gradual transition from classical to quantum behaviour is inferred to occur for systems of low mass, such as sub-atomic species. All dynamic properties of classical systems should therefore be defined equally well for quantum systems, although the relevant parameters are hidden [10].
3.2.1. Quantum Potential
As for the classical potential, the gradient of quantum potential energy defines a quantum force. A quantum object therefore has an equation of motion, $m\ddot{x} = -\nabla V - \nabla V_q$. For an object in uniform motion (constant potential) the quantum force must vanish, which requires $V_q = 0$ or a constant, $-k$ say. $V_q = 0$ defines a classical particle; alternatively¹ $-(V + V_q) = T$, the kinetic energy of the system. Hence $\hbar^2\nabla^2R/2mR = -E$, which rearranges into
$$\nabla^2R + \frac{2mE}{\hbar^2}R = 0,$$
Schrödinger’s equation for a free particle.
¹ It is a common misconception that $V_q = T$ for a free electron (compare [11]). Stationary states do not occur for $V_q = 0$, but when $V_q = -V$.
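The quantum potential of a free standing wave can be evaluated directly from definition (19). The following minimal sketch assumes one dimension, units with $\hbar = m = 1$, the amplitude $R(x) = \sin kx$ (a solution of the free-particle equation above) and a finite-difference second derivative; the function names and the value of $k$ are illustrative.

```python
import math

def quantum_potential(R, x, hbar=1.0, mass=1.0, step=1e-3):
    """One-dimensional quantum potential V_q = -hbar^2 R'' / (2 m R).

    The second derivative is approximated by a central finite difference;
    x must not coincide with a node of R.
    """
    second = (R(x + step) - 2 * R(x) + R(x - step)) / step**2
    return -hbar**2 * second / (2 * mass * R(x))

WAVE_NUMBER = 1.7  # illustrative value of k

def amplitude(x):
    """Standing-wave amplitude R(x) = sin(k x)."""
    return math.sin(WAVE_NUMBER * x)

# For this amplitude R'' = -k^2 R, so V_q should equal hbar^2 k^2 / 2m everywhere.
expected = WAVE_NUMBER**2 / 2

points = [0.3, 0.6, 0.9, 1.4]  # sample positions away from the nodes
values = [quantum_potential(amplitude, x) for x in points]

print(expected, values)
```

The computed quantum potential is the same at every sample point, so its gradient, the quantum force, vanishes, as required for uniform motion.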
The quantum potential concept is vitally important for understanding the structure of an electron and of quantum systems in general. The fact that the amplitude function ($R$) appears in both the numerator and denominator of $V_q$ implies that the effect of the wave field does not necessarily decay with distance and that remote features of the environment can affect the behaviour of a quantum object. The quantum potential for a many-body system:
$$V_q = -\sum_{i=1}^{n}\frac{\hbar^2}{2m_i}\frac{\nabla_i^2R}{R}$$
depends on the quantum state of the entire system. The potential energy between a pair of entities, $V_q(x_i, x_j)$, is not uniquely defined by the coordinates, but depends on the wave function of the entire system, $\Psi$. This condition defines a holistic system in that the whole is more than a sum of the parts. The instantaneous motion of one part depends on the coordinates of all other parts at the same time. That defines a non-local interaction of the type assumed to exist within an indivisible electron, and now inferred to occur in all quantum systems, including molecules. If the system is distorted locally, the entire system responds instantaneously. As the quantum potential is not a function of distance, the behaviour of a composite system depends non-locally on the configuration of all constituents, no matter how far apart. In a chemical context the properties, structure and rearrangement of molecules must depend intimately on the quantum potential. It is necessary to give up the notion that molecular rearrangement involves the breaking and making of bonds and rather consider it as a modification of the intramolecular electronic wave interference pattern. However, all systems are not correlated equally well. Whenever a wave function can be written as a product $\Psi(r_1, r_2, t) = \Phi_A(r_1, t)\Phi_B(r_2, t)$ the quantum potential becomes the sum of two terms: $V_q(r_1, r_2, t) = V_q^A(r_1, t) + V_q^B(r_2, t)$. The two sub-systems evidently behave largely independently. That is a good description of a molecular crystal, or liquid, with relatively weak interaction between molecular units. Systems like these are better described as partially holistic. The contentious issue of quantum-particle trajectories is put into perspective by the Bohmian model. One interpretation is that the quantum electron has an unspecified diffuse structure, which contracts into a classical point-like object when confined under external influences.
The observed trajectory, as in a cloud chamber, may be considered to follow the centre of gravity. In a two-slit experiment an electron wave passes through both slits to recombine, with interference, but without rupture. The interference pattern disappears on closure of one slit or when the slits are too far apart, compared to the de Broglie wavelength. The electron then behaves exactly like a classical particle when forced through a single slit² [12].
² The de Broglie-Bohm formulation of particle plus pilot wave is considered an unnecessary complication by this author. Instead, $\Psi$ may be thought of as a state of vibration of empty space.
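The additivity of the quantum potential for a product wave function, stated earlier in this section, can be made explicit in a few lines. The worked step below is a sketch that assumes the standard many-body form of the quantum potential for two bodies of masses $m_1$ and $m_2$, and writes the amplitudes of $\Phi_A$ and $\Phi_B$ as $R_A$ and $R_B$ (symbols introduced here for illustration).

```latex
\begin{align*}
R(r_1, r_2, t) &= R_A(r_1, t)\,R_B(r_2, t), \\
V_q &= -\frac{\hbar^2}{2m_1}\frac{\nabla_1^2 R}{R} - \frac{\hbar^2}{2m_2}\frac{\nabla_2^2 R}{R} \\
    &= -\frac{\hbar^2}{2m_1}\frac{R_B\,\nabla_1^2 R_A}{R_A R_B}
       - \frac{\hbar^2}{2m_2}\frac{R_A\,\nabla_2^2 R_B}{R_A R_B} \\
    &= V_q^A(r_1, t) + V_q^B(r_2, t).
\end{align*}
```

Each term depends on the coordinates of one sub-system only, because the amplitude of the other factor cancels between numerator and denominator.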
3.2.2. Stationary States
Writing the wave function in two equivalent forms: $\Psi(x, t) = \Psi_0 e^{-iEt/\hbar}$, $\Psi(x, t) = R(x, t)e^{iS(x,t)/\hbar}$, and noting that $R(x, 0) = R_0(x)$; $S(x, 0) = S_0(x)$; $\Psi_0 = R_0e^{iS_0/\hbar}$, it follows that:
$$S(x, t) = S_0(x) - Et, \qquad R(x, t) = R_0. \tag{20}$$
The unexpected conclusion is that a real wave function, $\Psi_0 = \psi$, implies $S_0(x) = 0$ and hence the momentum $\nabla S = p = 0$ and $E = V + V_q$. Those states with $m_l = 0$ all have real wave functions, which therefore means that such electrons have zero kinetic energy and are therefore at rest. The classical (electrostatic) and quantum forces on electrons in such stationary states are therefore balanced and so stabilize the position of the electron with respect to the nucleus. For the hydrogen atom in the ground state, $R(r) = Ne^{-r/a_0}$ and hence,
$$\frac{d^2R}{dr^2} = \frac{N}{a_0^2}e^{-r/a_0},$$
such that, from (19), $V_q = \hbar^2/2ma_0^2$. In general
$$V_q = \frac{\hbar^2}{2mr^2}, \tag{21}$$
and the quantum force on the electron:
$$F_q = \frac{\partial V_q}{\partial r} = -\frac{\hbar^2}{mr^3}$$
whereas the electrostatic force $F = e^2/r^2$. These forces are in balance when
$$\frac{\hbar^2}{mr^3} = \frac{e^2}{r^2}; \qquad r = \frac{\hbar^2}{me^2} = a_0,$$
the Bohr radius. This means that $V_q = -V$ at $r = a_0/2$, halfway between proton and electron.
3.2.3. Orbital Angular Momentum
Orbital angular momentum is perhaps the most awkward concept to visualize as the property of a quantum-mechanical point electron, but is readily understood in hydrodynamic analogy. Like tidal motion, atomic orbital motion in a continuous spherical charge cloud consists of the propagation of a wave disturbance, without matter circulation, as first proposed by Nagaoka and described by the quantized spherical surface harmonics, $Y_l^{m_l} = NP_l^{m_l}e^{im_l\varphi}$, in terms of Legendre polynomials, $P$.
In Bohmian formalism angular momentum is described by rotation of the phase function:
$$S(x, t) = S_0(x) - Et = m_l\hbar\varphi - Et.$$
The wavefronts $S = \text{constant}$ are planes parallel to and rotating about the z-axis, with angular velocity $\partial\varphi/\partial t = E/m_l\hbar$. Single-valuedness of $\Psi = R\exp(iS/\hbar)$ requires that $\Psi(S) = \Psi(S + 2\pi n\hbar) = \Psi(S + nh)$. This is interpreted to mean that $n = |m_l|$ wave crests occur during each cycle. Positive and negative values of $m_l$ represent anticlockwise and clockwise rotations respectively. This interpretation of orbital angular momentum has a formal resemblance to the semiclassical model of Bohr and Sommerfeld, but there is no physical rotation of charge. Two electrons with magnetic quantum numbers of $\pm m_l$ have wave structures that rotate, in phase, in opposite directions, with resultant distortion of zero. Quenching of orbital angular momentum during chemical interaction between neighbouring atoms happens by the same principle. The wave pattern in the case where $l \neq 0$ and $m_l = 0$ is to be interpreted as the three-dimensional analogue of the circular modes of a vibrating drumhead. There is no axial component to the disturbance. The wave motion is more like spherical vibration, compared to spherical rotation that causes electron spin and which can be oriented in the polar direction of a magnetic field.
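The single-valuedness condition and the count of wave crests per circuit can be illustrated with the angular phase factor $e^{im_l\varphi}$ alone. The sketch below is a minimal illustration; the helper names, the sampling density and the test angle are assumptions, and the crests are counted as maxima of the real part of the phase factor at fixed time.

```python
import cmath
import math

def phase_mismatch(m_l, phi=0.7):
    """Return |exp(i m (phi + 2 pi)) - exp(i m phi)|.

    The mismatch vanishes only if the phase factor is single-valued,
    that is, if it returns to itself after one circuit of the z-axis.
    """
    after = cmath.exp(1j * m_l * (phi + 2 * math.pi))
    before = cmath.exp(1j * m_l * phi)
    return abs(after - before)

def wave_crests(m_l, samples=3600):
    """Count the maxima of Re exp(i m phi) = cos(m phi) over one circuit."""
    values = [math.cos(m_l * 2 * math.pi * j / samples) for j in range(samples)]
    crests = 0
    for j in range(samples):
        previous = values[j - 1]             # wraps around at j = 0
        following = values[(j + 1) % samples]
        if values[j] > previous and values[j] >= following:
            crests += 1
    return crests

print(phase_mismatch(2), phase_mismatch(0.5))
print(wave_crests(3), wave_crests(-2))
```

Integer values of $m_l$ close the phase on itself after one circuit and give $|m_l|$ crests, whereas a non-integer value such as one half leaves the phase factor with the opposite sign after a full turn.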
4.
Chemical Change
In the same sense that biological activity is more than chemical change, chemical effects depend on a number of emergent properties unknown to physics. The concepts of chemical affinity, cohesion and structure were discovered experimentally and not anticipated from first principles. Although chemical events can therefore not be inferred from the laws of physics, the Bohmian interpretation of quantum mechanics provides an attractive framework for their understanding. The fundamental reason for this emergence is the chemical environment. The interaction between chemical species, partially characterized in isolation, is as hard to predict as the behaviour of an individual in a crowd. Not being acquainted with the concepts molecule, phase transition and free energy, there is no possibility of deriving the laws of chemical affinity, reactivity and composition from the quantum numbers that quantify the energy and angular momentum of electrons in isolated atoms. The problem is approached here by examining the possible modes of interaction between charges and the response of atoms to close confinement.
4.1.
Interaction Theory
Interaction at a distance is interpreted in modern theories as a field phenomenon. The electromagnetic field, described by Maxwell’s equations as waves, propagates through the vacuum with a constant velocity that depends on the permittivity and permeability of free space, $c = 1/\sqrt{\mu_0 \epsilon_0}$. The wave equation (4) has solutions $\Psi(t)$ and $\Psi(-t)$, known as retarded and advanced waves, respectively. The transmission of electromagnetic energy
between an emitter and a distant receptor is assumed to be negotiated by a pair of retarded and advanced waves. As a spherical wave signal from the emitter reaches an acceptor, it responds with an advanced return signal that reaches the emitter at the exact moment of first emission, to establish a standing wave, known as a photon. Further interaction depends on the potential energy difference between emitter and receptor. Transfer of excess energy occurs by relaxation of the standing wave, which is experimentally observed as photon emission. Alternatively the standing wave, known as a virtual photon, that exists between interacting sites, becomes balanced against external factors, at a distance that defines the electrostatic force of interaction between the charges as:
$$F = \frac{q_1 q_2}{4\pi\epsilon_0 r^2}.$$
All chemical interactions are of this type [4]. In Bohmian formalism the theory predicts the stability of atomic matter as a function of the fine-structure constant. Sommerfeld [13] (p. 107) introduced the fine-structure constant as $\alpha = v_1/c = e^2/4\pi\epsilon_0 c\hbar$ ($= 2\pi e^2/ch$, in esu), where $v_1$ is the velocity of an electron in the first Bohr orbit. More generally, the parameter $\alpha' = v/c$ for a freely moving electron with de Broglie wavelength $\lambda_{dB} = h/mv$ and Compton wavelength $\lambda_C = h/mc$ is defined, more appropriately, as $\alpha' = \lambda_C/\lambda_{dB}$. An electron in a hydrogenic stationary state has $n\lambda_{dB} = 2\pi n^2 a_0$, hence:
$$\alpha_n = \frac{e^2}{n\hbar c}.$$
In the Bohmian interpretation an atomic stationary state occurs when the potential energy of the electron, at rest, is balanced by the quantum potential. The relativistic mass of an electron at the position of the nucleus, with respect to the rest mass $m_o$ in the $ns$ state, would be
$$m = \frac{m_o}{\sqrt{1 - v^2/c^2}} = \frac{m_o}{\sqrt{1 - \alpha^2}},$$
i.e.,
$$\alpha^2 = \frac{m^2 - m_o^2}{m^2} = \frac{4\pi^2 e^4}{n^2 h^2 c^2} = \frac{E_n}{mc^2}.$$
Hence
$$E_n = \Delta m' c^2,$$
where $\Delta m' = m - m_o^2/m \simeq m - m_o$.
This is interpreted here to show that an electron in a stationary state has its mass reduced, with respect to the nucleus, by an amount $\Delta m'$, which reappears as the binding energy $-E_n$. The same argument explains nuclear binding energy as a mass defect. Transition of an electron with $n > 1$ to a lower unoccupied energy level by emission of a photon with energy $h\nu$ and spin $\hbar$, is anticipated. However, in the 1s state with quantum number $l = 0$, there is no orbital angular momentum to transfer in promoting photon emission and the ground state remains stable. The calculation does not imply different velocities for the electron at different energy levels – only a quantized change in de Broglie wavelength. The mass-energy difference amounts to exchange of a (virtual) photon in the form of a standing wave between the charge centres.
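The mass-defect relation above can be evaluated numerically. The following is a minimal sketch, assuming CODATA 2018 values for the fine-structure constant and the electron rest energy (neither is quoted in the text); the function names are illustrative only. It computes $\alpha_n = \alpha/n$ and the quantity $(m - m_o^2/m)c^2$ exactly as the expressions are written.

```python
import math

# Assumed constants (CODATA 2018); not taken from the chapter itself.
ALPHA = 7.2973525693e-3   # fine-structure constant
ME_C2_EV = 510998.95      # electron rest energy m_o c^2, in eV


def alpha_n(n: int) -> float:
    """alpha_n = e^2 / (n hbar c), i.e. alpha divided by n."""
    return ALPHA / n


def binding_energy_ev(n: int) -> float:
    """Evaluate E_n = dm' c^2 with dm' = m - m_o^2/m and m = m_o/sqrt(1 - alpha_n^2)."""
    a = alpha_n(n)
    m_c2 = ME_C2_EV / math.sqrt(1.0 - a * a)   # relativistic m c^2
    return m_c2 - ME_C2_EV ** 2 / m_c2         # (m - m_o^2/m) c^2


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, alpha_n(n), binding_energy_ev(n))
```

By construction the result equals $\alpha_n^2 mc^2$; for $n = 1$ it comes to roughly 27.2 eV.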
With the classical radius of the electron defined as $r_0 = e^2/mc^2$ it is noted that
$$\frac{r_0}{a_0} = \frac{me^4}{m\hbar^2c^2} = \left(\frac{e^2}{\hbar c}\right)^2 = \alpha^2,$$
where $a_0$ is the Bohr radius. This result follows from the two relationships:
$$\alpha\lambda_C = \frac{2\pi e^2}{mc^2} = 2\pi r_0, \qquad \frac{\lambda_C}{\alpha} = \frac{2\pi\hbar^2}{me^2} = 2\pi a_0 = \lambda_{dB}.$$
Now define $\lambda_Z = 2\pi r_0$. Whereas the wavelength $\lambda_{dB} = \lambda_C/\alpha$ represents a wavepacket with group velocity $v_g < c$, the phase velocity $v_\phi > c$ is associated with the Zitterbewegung of wavelength $\lambda_Z = \alpha \cdot \lambda_C$; $v_g v_\phi = c^2$ [15]. This argument relates to two problematic parameters: $\alpha$ and the classical electron radius $r_0$ which still awaits quantum-mechanical definition. The fine-structure constant appears firmly associated with the wave nature of an electron, seen as a standing wave that results from the superposition of diverging and converging spherical components. The internal wave structure of the electron is observed as high-frequency Zitterbewegung while the macroscopic effects in an electromagnetic field are fixed by the spread of the wavepacket, conveniently defined as a de Broglie wavelength. Trapped in the field of a proton the de Broglie wavelength is quantized to avoid self-destruction, such that
$$\frac{\lambda_C}{\lambda_{dB}} = \alpha_n = \frac{e^2}{n\hbar c}.$$
For an effective charge separation of $r_n$, the ratio $\alpha_n$ may be considered the ratio of two energies:
$$\frac{e^2}{n\hbar c} = \frac{e^2}{r_n} \cdot \frac{1}{h\nu},$$
an electrostatic and a quantum-mechanical factor. The constant $c = \lambda/\tau = \lambda\nu$ describes the virtual photon that occurs as a standing wave ($n\lambda = 2\pi r$) between the charge centres. The balance between the classical coulombic attraction and the quantum-mechanical repulsion (the quantum potential) defines the fine-structure constant with a value, fixed by the de Broglie wavelength of the virtual photon. In a strong field the size of an electronic wavepacket may be compressed below the Compton radius to an absolute minimum of $\lambda_Z$, which describes the minimum size to which an electron may be compressed, measuring $r_0 = \lambda_Z/2\pi$, for an electron defined as an electric charge $-e$ distributed over a sphere of radius $r_0$. The potential energy $E = e^2/r_0$ corresponds to $r_0 = e^2/m_o c^2$, as measured classically.
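The length-scale relations in this subsection ($r_0/a_0 = \alpha^2$, $\alpha\lambda_C = 2\pi r_0$ and $\lambda_C/\alpha = 2\pi a_0$) can be checked against tabulated constants. The sketch below assumes CODATA 2018 values in SI units, which are not given in the text; the helper name `rel_diff` is illustrative.

```python
import math

# Assumed CODATA 2018 values (SI); standard constants, not from the chapter.
ALPHA = 7.2973525693e-3        # fine-structure constant
A0 = 5.29177210903e-11         # Bohr radius, m
R0 = 2.8179403262e-15          # classical electron radius, m
LAMBDA_C = 2.42631023867e-12   # Compton wavelength h/(m c), m

lambda_z = ALPHA * LAMBDA_C    # Zitterbewegung wavelength, alpha * lambda_C
lambda_db = LAMBDA_C / ALPHA   # ground-state de Broglie wavelength, lambda_C / alpha


def rel_diff(a: float, b: float) -> float:
    """Relative difference between two positive numbers."""
    return abs(a - b) / abs(b)


if __name__ == "__main__":
    print("r0/a0       :", R0 / A0, " alpha^2:", ALPHA ** 2)
    print("alpha*lam_C :", lambda_z, " 2*pi*r0:", 2 * math.pi * R0)
    print("lam_C/alpha :", lambda_db, " 2*pi*a0:", 2 * math.pi * A0)
```

The three pairs agree to the precision of the tabulated constants, which is what the identities require.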
4.2.
Environmental Effects
Schrödinger’s solution for the hydrogen electron serves as the starting point for the qualitative discussion of all chemical effects in quantum formalism. It is routinely forgotten that the simple hydrogen solution ignores all interactions that the electron would experience in
a chemical environment. Even the use of hydrogen energy levels to rationalize the structure of the periodic table is of limited value. A useful approach to simulate environmental effects was pioneered by Sommerfeld on solving Schrödinger’s equation under modified boundary conditions. Non-zero environmental pressure was introduced by assuming that $\psi(r) \to 0$ as $r$ approaches some finite value $r_c$, rather than infinity. All energy levels move to higher values with decreasing $r_c$, until the ground level reaches the ionization limit at $r_c = r_0$, the ionization radius. It is noted that on reaching the ionization limit by uniform compression the electron that becomes decoupled from the nucleus finds itself confined to a spherical cavity at zero potential and kinetic energy. However, the non-zero energy of a free electron in a hollow sphere must therefore be interpreted as quantum potential energy. The Helmholtz equation for such an electron:
$$(\nabla^2 + k^2)\psi = 0, \qquad k = \sqrt{2mE/\hbar^2},$$
has the radial solutions $R = \sqrt{2kr/\pi} \cdot j_l(kr)$. At the first zero of the spherical Bessel function $j_0 = \sin(kr)/(kr)$, $kr_0 = \pi$, and hence (compare 21)
$$E_0 = \frac{h^2}{8mr_0^2} = V_q. \qquad (22)$$
The Fourier transform of $j_0$ is the box function
$$f(r) = \begin{cases} \sqrt{2\pi}/2r_0 & \text{if } |r| < r_0, \\ 0 & \text{if } |r| > r_0. \end{cases} \qquad (23)$$
It follows that the decoupled (valence) electron of the hydrogen atom, compressed to r0 is uniformly spread across the ionization sphere.
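The step from the boundary condition to equation (22) can be reproduced numerically. This is a minimal sketch, assuming SI values for the constants and a purely hypothetical radius of 1 Å (not a value from the text): the first zero of $\sin(x)/x$ is located by bisection, and $\hbar^2k^2/2m$ with $k = \pi/r_0$ is compared with the closed form $h^2/8mr_0^2$.

```python
import math

# Assumed SI constants (2019 exact values and CODATA 2018); not from the chapter.
H = 6.62607015e-34          # Planck constant, J s
HBAR = H / (2.0 * math.pi)
M_E = 9.1093837015e-31      # electron mass, kg
EV = 1.602176634e-19        # joule per electronvolt


def j0(x: float) -> float:
    """Zeroth-order spherical Bessel function sin(x)/x, for x != 0."""
    return math.sin(x) / x


def first_zero(f, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Bisection; assumes f changes sign exactly once on [lo, hi]."""
    sign_lo = f(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def confinement_energy_ev(r0_m: float) -> float:
    """E = hbar^2 k^2 / 2m with k fixed by the first zero of j0 at r = r0."""
    k = first_zero(j0, 1.0, 4.0) / r0_m
    return HBAR ** 2 * k ** 2 / (2.0 * M_E) / EV


def e0_closed_form_ev(r0_m: float) -> float:
    """Equation (22): E0 = h^2 / (8 m r0^2), converted to eV."""
    return H ** 2 / (8.0 * M_E * r0_m ** 2) / EV


if __name__ == "__main__":
    r0 = 1.0e-10   # hypothetical ionization radius of 1 angstrom
    print(first_zero(j0, 1.0, 4.0), confinement_energy_ev(r0), e0_closed_form_ev(r0))
```

Bisection is used only to keep the sketch dependency-free; any root finder would serve.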
4.3.
Emergent Properties
Chemical theory requires insight into more than atomic stability. One-electron quantum theory provides no guidance beyond the hydrogen atomic ground state and the structure of many-electron atoms must be inferred from the empirically known periodic table of the elements. However, the superficial correspondence between the calculated quantum states of hydrogen and the observed elemental periods strongly suggests a functional relationship between the two sets. To better appreciate the relationship it is noted that both sets can be generated by convergent sequences of Fibonacci or Lucas fractions as shown below.
4.3.1.
Periodicity
A modular pair of rational fractions $h_1/k_1$, $h_2/k_2$ has the property:
$$\begin{vmatrix} h_1 & h_2 \\ k_1 & k_2 \end{vmatrix} = \pm 1.$$
Such a pair is geometrically represented by two Ford circles with radii and y-coordinates of $1/2k^2$ at x-coordinates of $h/k$. A series of rational fractions with all neighbouring terms in unimodular relationship is represented by a set of tangent Ford circles [4]. Examples of such modular series are the Farey sequences, $F_n$, and the converging Fibonacci and Lucas fractions on the segment $(\frac{1}{2}, \frac{3}{5}, \frac{2}{3})$ of $F_5$:
Despite a number of uncertain half-lives, a reasonable estimate of 264 divides the stable (non-radioactive) nuclides into 11 periods of 24. Plotting the ratio of protons:neutrons ($Z/N$) for all isotopes as a function of atomic number, the hem lines that separate the periods of 24 intersect a reference line, at $Z/N = \tau$, in Z-coordinates which correspond to well-known ordinal numbers that define the periodic table of the elements [2, 3]. Remarkably, the same hem lines intersect a reference line at $Z/N = 0.58$ in atomic numbers that correspond to the closure of the calculated wave-mechanical energy levels for hydrogen. Noting that the radii of unimodular Ford circles are inversely proportional to the number ($2k^2$) of atoms in elemental periods, we look for converging circles that match the two forms of periodicity. The primary circle at $x = 0$ or 1, $r_F = \frac{1}{2}$, is flanked by two tangent circles at $x = 0.5$ and $1.5 \equiv -0.5$, $r_F = \frac{1}{8}$; further converging pairs are at $x = \pm\frac{2}{3}$, $r_F = \frac{1}{18}$ and $x = \pm\frac{3}{4}$, $r_F = \frac{1}{32}$. This arrangement mimics the periodic table:
What is probably the aesthetically most pleasing form of the periodic table is obtained by rearrangement in circular array, as shown in Figure 5 for the hundred naturally occurring cosmic elements. Interpreted in terms of electronic distribution it implies twelve 8-fold and three 2-fold energy levels, with all closed-shell elements grouped together. These are not hydrogen-like energy levels, but they agree with the valence levels, calculated for compressed atoms in Hartree-Fock-Slater approximation [14]. The hypothetical arrangement based on the hydrogen solution is recognized in the nested set of Ford circles at $x = 1$, predicting consecutive periods of $2n^2$, $n = 1, 2, \ldots$, arranged as follows:
If $n$ is interpreted as Schrödinger’s principal quantum number, periods of the correct length ($2n^2$) are predicted. Each of the periods consists of $n$ subshells for subsidiary quantum numbers $0 \le l < n$. The number of elements per subset equals $2(2l + 1)$, $-l \le m_l \le l$. This result provides the basis of Pauli’s exclusion postulate, which defines an emergent property, not of quantum-mechanical origin. The hypothetical and observed versions of the periodic table are in agreement for elements 1 to 18. The superficial agreement (e.g. for elements 28, 46 and 78; and 29–36) beyond that point is purely accidental. We conclude that the wave-mechanical hydrogen model fails to account for elemental periodicity mainly because it ignores all interactions apart from the central-field unitary electrostatic attraction. The common thesis of chemistry textbooks that Schrödinger’s equation, with due allowance for interelectronic effects, accounts for the periodic table, fails on two important counts. It predicts transition series of ten elements, compared to the observed eight. The guiding principle, known as the Aufbau procedure, is valid only for the alkaline-s and p blocks. Less than two-thirds of the nominal transition elements obey an Aufbau rule. The correct periodic system occurs in an environment that requires the convergence of stable nuclear composition, $Z/N$, to the golden ratio, $\tau$, and subject to an emergent exclusion principle. Further new properties are expected to emerge in the analysis of chemical affinity, cohesion and conformation, at a higher hierarchical level.
Figure 5. The Periodic Table of the elements in circular form.
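The number-theoretic statements of this subsection can be checked with exact rational arithmetic. The sketch below is illustrative only (all function names are hypothetical): it builds the Farey sequence $F_5$, tests neighbouring fractions for the unimodular property and for tangency of their Ford circles, and lists the period lengths $2k^2$ together with the subshell counts $2(2l + 1)$.

```python
from fractions import Fraction


def farey(n: int) -> list:
    """Farey sequence F_n on [0, 1] as a sorted list of reduced fractions."""
    return sorted({Fraction(h, k) for k in range(1, n + 1) for h in range(k + 1)})


def is_unimodular(a: Fraction, b: Fraction) -> bool:
    """|h1*k2 - h2*k1| == 1 for a = h1/k1 and b = h2/k2."""
    return abs(a.numerator * b.denominator - b.numerator * a.denominator) == 1


def ford_radius(q: Fraction) -> Fraction:
    """Radius (and y-coordinate of the centre) of the Ford circle at h/k: 1/(2 k^2)."""
    return Fraction(1, 2 * q.denominator ** 2)


def are_tangent(a: Fraction, b: Fraction) -> bool:
    """Circles are tangent when the squared centre distance equals (ra + rb)^2."""
    ra, rb = ford_radius(a), ford_radius(b)
    return (a - b) ** 2 + (ra - rb) ** 2 == (ra + rb) ** 2


f5 = farey(5)
segment = [Fraction(1, 2), Fraction(3, 5), Fraction(2, 3)]

# Inverse Ford radii 2k^2 for k = 1..4, and subshell sizes 2(2l + 1) for 0 <= l < n.
periods = [2 * k * k for k in (1, 2, 3, 4)]
subshells = {n: [2 * (2 * l + 1) for l in range(n)] for n in (1, 2, 3, 4)}

if __name__ == "__main__":
    print(f5)
    print(all(is_unimodular(a, b) and are_tangent(a, b) for a, b in zip(f5, f5[1:])))
    print(periods, subshells)
```

Using `Fraction` keeps every comparison exact, so the tangency test needs no floating-point tolerance.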
4.3.2.
Electronegativity
Chemical affinity is the intuitive qualitative concept that guided experimental chemistry for centuries. The first quantitative measure of affinity was discovered by Lothar Meyer as the atomic volume of an element – his basis of periodicity. It served to differentiate between electropositive and electronegative elements, with a natural affinity between them. The concept was generalized by Pauling, Mulliken and others, by placing all elements on a single empirical electronegativity scale. By demonstrating the equivalence of electronegativity and the quantum potential of the valence state [16] it was finally recognized as an emergent atomic property, readily reduced to fundamental quantum theory. Like hydrogen, an atom is said to be in its valence state when ionized by environmental pressure. The energy of the electron, decoupled from the nucleus but confined to the ionization sphere, is given by (22). Characteristic ionization radii, $r_0$, are obtained by numerical Hartree-Fock-Slater calculation [14] with boundary conditions modified as for H. Redefined on this basis, electronegativity, $\chi$, is calculated as
$$\chi^2 = \frac{h^2}{8mr_0^2},$$
expressed in eV, such that $\chi$ relates to Pauling electronegativities on a linear scale and $\chi^2$ to the Mulliken scale. A plot of $\chi = \sqrt{E_0}$ reveals the same periodicity as Figure 5 and as Lothar Meyer atomic volumes.
Figure 6. Electronegativity as quantum potential of the valence state
The uniform electron density of the valence state follows from (23):
$$\rho = \psi^2(r), \qquad \psi(r) = (\phi/V_0)^{1/2} \exp\{-(r/r_0)^p\}, \qquad p \gg 1. \qquad (24)$$
The scale factor, $\phi$, which compensates for an inaccessible core, is proportional to $r_0$ and varies inversely with the number of nodes as defined by an effective principal quantum number $n$. Hence, $\phi = cr_0/n$. The wave function
$$\psi(r) = \sqrt{\frac{3c}{4\pi n}}\, \frac{1}{r_0} \exp\{-(r/r_0)^p\} \qquad (25)$$
describes the interaction of an atom with its chemical environment.
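As a worked illustration of the electronegativity definition, the following sketch evaluates $\chi = \sqrt{E_0}$ with $E_0 = h^2/8mr_0^2$ in eV. The constants are assumed SI values and the radii are hypothetical placeholders, not the Hartree-Fock-Slater ionization radii of reference [14].

```python
import math

# Assumed SI constants; not quoted in the chapter.
H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joule per electronvolt
ANGSTROM = 1.0e-10       # metre per angstrom


def valence_energy_ev(r0_angstrom: float) -> float:
    """E_0 = h^2 / (8 m r0^2) in eV, for an ionization radius given in angstrom."""
    r0 = r0_angstrom * ANGSTROM
    return H ** 2 / (8.0 * M_E * r0 ** 2) / EV


def electronegativity(r0_angstrom: float) -> float:
    """chi = sqrt(E_0), with E_0 expressed in eV."""
    return math.sqrt(valence_energy_ev(r0_angstrom))


if __name__ == "__main__":
    for r0 in (0.5, 1.0, 2.0):   # hypothetical radii, for illustration only
        print(r0, valence_energy_ev(r0), electronegativity(r0))
```

Because $\chi$ scales as $1/r_0$, doubling the ionization radius halves $\chi$ and quarters $\chi^2$.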
4.3.3.
Covalence
Chemical cohesion has for many years been the main topic of theoretical chemistry, conducted as an exercise in computational quantum physics, described by one practitioner [17] as if repeatedly "...validating Schrödinger’s equation!". There is a curious conviction that the Born-Oppenheimer scheme enables molecular structure to be computed ab initio. An initially assumed structure is treated only as a device to kickstart the calculation. Once the electronic density has been obtained, the nuclear framework is computed theoretically, without assumption. It always comes out miraculously close to the assumed structure. To the uninitiated the procedure appears to be circular and unlikely to produce anything of physical significance beyond the assumed molecular structure. An obvious alternative is to model the electron exchange that constitutes atomic pairwise interactions, known as covalent bonds, before assembly into a three-dimensional structure is attempted. The computational details for this procedure, which requires atomic wave functions, have been documented as the well-known Heitler-London method. Appropriate wave functions (25) are obtained from empirically adjusted ionization radii that compensate for steric factors [4]. H–L calculations predict both dissociation energy and equilibrium interatomic distance for any first-order covalent interaction. High-order interaction results from the valence-level screening of the internuclear repulsion.
Figure 7. Covalent binding energy curve for homonuclear diatomics in dimensionless units.
The same set of characteristic atomic radii ($r$) can be used to model covalent electron exchange by point-charge simulation, as a function of interatomic separation ($d$) only. It is found that with the ratio $d/r$ and binding energy $E$, expressed in dimensionless units, all homonuclear diatomic interactions are described by a single interaction curve, shown in Figure 7. The curve turns where $d/r = \tau$ and $E = -2\tau$, at the point where exactly two electron waves are concentrated in the interatomic region. The condition is seen to reflect the exclusion principle for fermions. Its relationship to the golden ratio defines the origin of the exclusion principle as the curvature of space-time. Without this emergent property there is no understanding of covalent interaction.
4.3.4.
Molecular Shape
The inability to derive molecular structures from fundamental quantum theory identifies molecular shape as another emergent property. Although it cannot be inferred from basic theory it is readily reduced to the conservation of orbital angular momentum. Conventional computational schemes, designed to minimize energy, with total neglect of orbital angular momentum, must, by definition, converge to a spherically symmetrical arrangement. To prevent this from happening a potential field of lower symmetry is imposed by assuming a fixed nuclear framework. Instead of imposing an experimentally determined structure, conservation of orbital angular momentum provides a theoretically more satisfying algorithm to generate such a structure from first principles. Polarization of mutually approaching reactants resolves local angular-momentum vectors, just like an applied magnetic field. During the formation of a molecule, the alignment of reactants that minimizes angular momentum in the local polar direction is favoured. In many reaction systems there is sufficient symmetry for the orbital angular momentum in the polar direction to become quenched completely. Where the quenching in low-symmetry (chiral) systems is incomplete, the residual angular momentum will couple to the magnetic field of polarized light, causing optical activity. Should quenching be possible only for a specific angular alignment of neighbouring fragments, a rigid system, which resists torsional deformation, is obtained. So-called double bonds and aromatic systems are common examples. The empirical stereochemical rules, pioneered by Kekulé, van’t Hoff and others, are consistent with the principles outlined here and these have been generalized into empirical computational schemes, collectively known as molecular mechanics. There is no more fundamental procedure to predict molecular structure.
References
[1] H. Primas, Chemistry, Quantum Mechanics and Reductionism, 2nd ed., Springer-Verlag, Berlin, 1983.
[2] J. C. A. Boeyens and D. C. Levendis, Number Theory and the Periodicity of Matter, Springer.com, 2008.
[3] J. C. A. Boeyens, Periodicity of the stable isotopes, J. Radioanal. Nucl. Chem., 2003 (257) 33–41.
[4] J. C. A. Boeyens, Chemistry from First Principles, Springer.com, 2008.
[5] M. Wolff, Beyond the Point Particle – A Wave Structure for the Electron, Galilean Electrodynamics, 1995 (6) 83–91.
[6] J. C. A. Boeyens, Angular Momentum in Chemistry, Z. Naturforsch., 2007 (62b) 373–385.
[7] R. C. Jennison and A. J. Drinkwater, An approach to the understanding of inertia from the physics of the experimental method, J. Phys. A, 1977 (10) 167–179.
[8] J. C. A. Boeyens, Quantum theory of molecular conformation, C.R. Chimie, 8 (2005) 1527–1534.
[9] E. Schrödinger, Collected Papers on Wave Mechanics, (Translated from the second German edition), 2nd ed., Chelsea, NY, 1978.
[10] D. Bohm, A Suggested Interpretation of the Quantum Theory in Terms of "Hidden" Variables, Phys. Rev., 1952 (85) 166–179, 180–193.
[11] D. W. Belousek, Einstein’s 1927 Unpublished Hidden-Variable Theory: Its Background, Context and Significance, Stud. Hist. Phil. Mod. Phys., 1996 (27) 437–461.
[12] P. R. Holland, The Quantum Theory of Motion, University Press, Cambridge, 1993.
[13] A. Sommerfeld, Atombau und Spektrallinien, 4th ed., Vieweg, Braunschweig, 1924.
[14] J. C. A. Boeyens, Ionization radii of compressed atoms, J. Chem. Soc. Faraday Trans., 1994 (90) 3377–3381.
[15] J. C. A. Boeyens, New Theories for Chemistry, Elsevier, Amsterdam, 2005.
[16] J. C. A. Boeyens, The Periodic Electronegativity Table, Z. Naturforsch., 2008 (63b) 199–209.
[17] N. C. Handy, in: R. Broer, P.T.C. Aerts and P.S. Bagus, New Challenges in Computational Chemistry, University of Groningen, 1994.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 217-250
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 10
THE ALGEBRAIC CHEMISTRY OF MOLECULES AND REACTIONS
Cynthia Kolb Whitney*
Galilean Electrodynamics, 141 Rhinecliff Street, Arlington, MA 02476-7331, USA
Abstract
A new line of research generally characterized as ‘Algebraic Chemistry’ is here applied to the problem of modeling energies involved in molecule formation and in chemical reactions. The approach is based on algebraic scaling laws that allow one to estimate energies of interest by evaluating simple algebraic expressions, without recourse to computer calculations based on detailed quantum mechanical formulations and phase-space integrations, such as found in traditional Quantum Chemistry. The simplicity of the algebraic approach means that it can address molecules and reactions involving more atoms than the ones that are presently convenient for traditional Quantum Chemistry. In fact, there is no complexity-related limit on the atom count amenable to molecule or reaction analysis with the algebraic method. Algebraic Chemistry is a simple tool always suitable for hand calculations. In the cases of many real molecules and reactions, data is available to test the algebraic approach, and thus build some confidence about it. This is important because the scaling laws used come from new, and not yet broadly known, theoretical extensions to the traditional quantum mechanics of atoms, and even to special relativity theory and classical electromagnetic theory. These extensions of traditional theory are briefly summarized in Appendices and detailed further in the References.
Keywords: Algebraic Chemistry, Quantum Chemistry, heat of molecule formation, heat of chemical reaction.
*
E-mail address: [email protected]
1. Introduction
The present work develops a technique for quantitative analysis of molecules and reactions of arbitrary complexity. The approach is called ‘Algebraic Chemistry’, because it is based on simple multiplicative scaling laws. The present work extends the development of Algebraic Chemistry [1-4] from the discussion of individual elements to the discussion of complete molecules and reactions. The objective is to make Algebraic Chemistry more ready for full adoption into Chemistry, and less likely to remain a disruptive stepchild within Physics. Much of the earlier work in the development of Algebraic Chemistry concerned ionization potentials of atoms. It showed that all the information necessary to specify ionization potentials of arbitrary order for all the elements is embodied in first-order ionization potentials for all the elements, and that, in fact, all the information necessary to specify the first-order ionization potentials for all the elements is embodied in the first-order ionization potential of Hydrogen, and that, in fact, this number is predictable from theory. Here the idea of scaling laws is used to extend the known information about ionization potentials of neutral atoms to estimate ionization potentials for ‘already-ionized’ atoms. This new information is not to be confused with ‘higher-order’ ionization potentials. The ‘higher-order’ ionization potentials of neutral atoms, apparently [1], describe multiple ionizations that occur simultaneously, whereas the ionization potentials of ‘already-ionized’ atoms describe single ionization events that occur sequentially. The ‘one-at-a-time’ physical process is gentler than the ‘all-at-once’ physical process, and is more typical of events that actually occur throughout most of normal Chemistry.
Like the previously known information about the higher-order ionization potentials of neutral atoms, the new information about ionization potentials of already-ionized atoms is entirely derivable from first-order ionization potentials of all the elements. Section 2 shows exactly what the scaling laws have to be, given the algebraic model under consideration. The new information is then used in subsequent Sections to estimate energies involved in molecule formation and chemical reactions.
2. Scaling Laws
The basis for having scaling laws at all lies in imagining atoms to consist of a nucleus and an electron ‘cluster’ sufficiently well defined that the atom as a whole is similar to a two-body system. Being imagined as similar to a two-body system, all atoms are then similar to Hydrogen, and scaling laws based on Hydrogen follow. This idea is developed in [1], and then applied and extended in [2-4], and is summarized for the present applications in Appendix 3. The key results are as follows: The magnitude of the potential energy for the one electron in the Hydrogen atom is:
$$\frac{e^2}{r_e + r_p} = \frac{3c^2 m_e^2}{2^5 m_p} \qquad (1)$$
where $e$ is electron charge, $r$ is orbit radius, $c$ is light speed, $m$ is mass, and subscripts $e$ and $p$ distinguish electron from proton. Suppose we want to model the system potential energy, not for Hydrogen per se, but for its isotopes, Deuterium and/or Tritium. The proton $m_p$ and $r_p$ then need to be replaced with a more generic nuclear mass $M$ and its orbit radius $r_M$. The magnitude of potential energy for this more massive atom is:
$$\frac{e^2}{r_e + r_M} = \frac{3c^2 m_e^2}{2^5 M}. \qquad (2)$$
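To give a feel for the magnitudes in expressions (1) and (2), the sketch below evaluates the right-hand side for Hydrogen, Deuterium and Tritium. It rests on two assumptions that should be checked against the chapter's Appendix 3: the right-hand side is read as $3c^2m_e^2/(2^5 M)$, and the constants are CODATA 2018 values (electron rest energy and nucleus-to-electron mass ratios), none of which appear in the text.

```python
# Assumed CODATA 2018 values; not quoted in the chapter.
ME_C2_EV = 510998.95   # electron rest energy m_e c^2, in eV
MASS_RATIO = {         # nuclear mass divided by electron mass
    "H": 1836.15267343,
    "D": 3670.48296788,
    "T": 5496.92153573,
}


def potential_energy_ev(mass_ratio: float) -> float:
    """3 c^2 m_e^2 / (2^5 M), rewritten as (3/32) * m_e c^2 * (m_e / M), in eV."""
    return 3.0 * ME_C2_EV / (32.0 * mass_ratio)


if __name__ == "__main__":
    for isotope, ratio in MASS_RATIO.items():
        print(isotope, potential_energy_ev(ratio))
```

Under this reading the Hydrogen value comes to about 26.1 eV, and the heavier isotopes scale down as $1/M$.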
Next, suppose we want to deal with a neutral atom with nuclear charge number $Z$, as well as the generic nuclear mass $M$. Then we have $Z$ electrons as well. For the more charged system, the magnitude of the potential energy becomes:
$$\frac{Z^2 e^2}{r_e + r_M} = Z^2\, \frac{3c^2 m_e^2}{2^5 M}. \qquad (3)$$
This scaled-up expression represents the magnitude of the total potential energy of the system involving $Z$ electrons. What is then comparable to the ionization potential for removing a single electron is:
$$\frac{Z e^2}{r_e + r_M} = Z \times \frac{3c^2 m_e^2}{2^5 M} \equiv (Z/M) \times \left(\frac{3c^2 m_e^2}{2^5}\right) \qquad (4)$$
Thus we see the $Z/M$ scaling that is predicted for measured ionization potentials. This is cancelled out by $M/Z$ scaling to produce the $IP$’s collected in Appendix 1 and used in following Sections. More generally, if the atom is in an ionized state, we have a distinct electron count $Z_e$ and proton count $Z_p$. For the baseline nuclear-orbit part, we have for the total system:
$$\frac{Z_p Z_e e^2}{r_e + r_M} = \left(\frac{Z_p Z_e}{M}\right) \times \left(\frac{3c^2 m_e^2}{2^5}\right) \qquad (5)$$
What is then generally comparable to the nuclear-orbit part of the ionization potential for removing a single electron? Appendix 3 shows that it is as if all factors of $e$ changed to $\sqrt{Z_p Z_e}\, e$. What is then comparable to the ionization potential for removing a single electron is:
$$\frac{\sqrt{Z_p Z_e}\, e^2}{r_e + r_M} = \left(\frac{\sqrt{Z_p Z_e}}{M}\right) \times \left(\frac{3c^2 m_e^2}{2^5}\right) \qquad (6)$$
In the present work, all scaling laws are based on first-order ionization potentials of neutral elements. Removing the $Z/M$ scaling by multiplying raw ionization data by the inverse, $M/Z$, produces $IP$’s that are all comparable to the $IP$ of Hydrogen. In this paper, the symbol $IP$ always means ‘$M/Z$-scaled ionization potential to compare with that of Hydrogen’. Let the $M/Z$-scaled first-order ionization potential for Hydrogen be represented by $IP_{1,1}$. The first subscript 1 means ‘first ionization potential’, and the second subscript 1 means ‘first element’; i.e. Hydrogen. The $M/Z$-scaled first-order $IP$ for the element $Z$ is then represented by $IP_{1,Z}$. In the earlier work [1], the problem addressed was modeling $IP$’s of all orders higher than unity for all elements - the ‘all-at-once’ problem. The results showed that we could enlist information from one element to develop information about an ion of another element. That idea is exploited here to infer the first-order $IP$’s of already-ionized atoms – the ‘one-at-a-time’ problem. The $IP_{1,Z}$ for removing an electron separates into a baseline, nuclear-orbit part, $IP_{1,1}$, and a deviation, electron-cluster part, $\Delta IP_{1,Z} = IP_{1,Z} - IP_{1,1}$. The baseline part is independent of $Z$, but the deviation part is a complicated function of $Z$. The present paper takes the deviation part as input data. But there exists a basis for future deeper analysis and modeling of the deviation part. The development of an Expanded SRT, detailed in [1] or [2], allows for superluminal speeds, which in turn allows for same-charge systems. Rings of multiple charges rotating at superluminal speeds are analyzed in [3], and such rings stacked like little magnets are used to model the electron populations in atoms. The deviation $\Delta IP_{1,Z}$ can be positive or negative, depending on where the element is in the Periodic Table. The deviation generally tends to zero at mid period, between noble gases.
It is maximally positive at a noble gas, and maximally negative just after a noble gas. For example, [see Appendix 1] the deviation term for Helium is
$$\Delta IP_{1,2} = IP_{1,2} - IP_{1,1} = 49.875 - 14.250 = 35.625,$$
whereas the deviation term for Lithium is
$$\Delta IP_{1,3} = IP_{1,3} - IP_{1,1} = 12.469 - 14.250 = -1.781.$$
Let symbols like $IP^{+}_{1,Z}$ and $IP^{-}_{1,Z}$ represent first-order $IP$’s for already-ionized atoms. The superscript + means positively charged due to previous electron removal, and multiple +’s would mean multiple electrons removed. Superscript − means negatively charged due to previous electron addition, and multiple −’s would mean multiple electrons added. The $\sqrt{Z_p Z_e}/M$ scaling maps the $IP_{1,1}$ baseline part of the $IP_{1,Z}$ for the neutral element into the baseline part of the $IP^{+}_{1,Z}$ or $IP^{-}_{1,Z}$ or whatever. The increment part is different; it depends only on the number of electrons involved in the charge cluster, so the scaling that applies to it is just $Z_e/M$, and what it applies to is the deviation term for the element whose $Z$ matches the $Z_e$ needed. The $IP_{1,1}$ and the $\Delta IP_{1,Z}$ are the basic data used below to describe first-order $IP$’s like $IP^{+}_{1,Z}$ and $IP^{-}_{1,Z}$ for already-ionized atoms.
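The two scaling rules just described (baseline part scaled by $\sqrt{Z_p Z_e}/M$, deviation part scaled by $Z_e/M$) can be combined in a few lines. The sketch below is an assumption-laden illustration: it uses only the three $IP$ values quoted in this section, the function names are hypothetical, and the nuclear mass $M$ is left as a caller-supplied parameter because the chapter's numerical convention for $M$ is not stated in this excerpt.

```python
import math

IP_H = 14.250                         # IP_{1,1}, as quoted in the text
IP_SCALED = {2: 49.875, 3: 12.469}    # IP_{1,Z} quoted in the text for He and Li


def deviation(z: int) -> float:
    """Delta IP_{1,Z} = IP_{1,Z} - IP_{1,1} (only Z = 2 and Z = 3 are available here)."""
    return IP_SCALED[z] - IP_H


def baseline_factor(z_p: int, z_e: int) -> float:
    """Geometric-mean charge factor sqrt(Z_p * Z_e); it equals Z for a neutral atom."""
    return math.sqrt(z_p * z_e)


def ion_work(z_p: int, z_e: int, mass: float) -> float:
    """(IP_{1,1} * sqrt(Z_p Z_e) + Delta IP_{1,Z_e} * Z_e) / M for an ion holding z_e electrons."""
    return (IP_H * baseline_factor(z_p, z_e) + deviation(z_e) * z_e) / mass


if __name__ == "__main__":
    print(deviation(2), deviation(3))
    # Hypothetical call: six protons, three electrons, M set to 1.0 as a placeholder.
    print(ion_work(6, 3, 1.0))
```

Passing `mass=1.0` returns the numerator alone, which is convenient until the intended value of $M$ is fixed.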
3. Example Atoms
In Physics, it is traditional to begin any discussion about atoms with the simplest possible atom, Hydrogen ${}_1\mathrm{H}$. I am going to depart from this tradition: I am starting this discussion with Carbon ${}_6\mathrm{C}$. Hydrogen is just too simple; with Hydrogen too many important parts of the ionization problem disappear as invisible zeros. Carbon has no such degeneracies, and yet it is still a lot like Hydrogen; after Hydrogen, it is the next ‘keystone’ element [see Appendix 4]. Like all keystone elements, it is just as willing to give as to take electrons, until the number reaches a ‘noble-gas’ number: 2 or 10 for Carbon (as compared to 0 or 2 for Hydrogen). For Carbon ${}_6\mathrm{C}$, $IP_{1,6} = IP_{1,1} + \Delta IP_{1,6}$ times $Z/M = 6/M_6$ represents the work that must be supplied to take one electron off the neutral Carbon atom. That exercise produces a positively charged ion ${}_6\mathrm{C}^{+}$. The transition also returns some heat, because after the ionization, the electron cluster is different: 5 electrons instead of 6, and all resting at a different energy. I call it ‘heat’, not ‘work’, because it is uncontrollable; Nature simply ‘does’ this. The heat returned is evidently $(\Delta IP_{1,6} \times 6 - \Delta IP_{1,5} \times 5)/M_6$. Observe how the ‘5’ for ${}_5\mathrm{B}$ information enters here.
+ + The ionization potential of the singly ionized 6 C , IP1,6 , times 6 / M 6 represents the work that must be supplied to remove another electron from the already singly ionized
(
Carbon atom. It is IP1,1 ×
)M
6 × 5 + ΔIP1,5 × 5
6 . Observe how the
6 × 5 due to the
+ previous ionization enters here. It means that IP1,6 is certainly not the same thing as the so-
called ‘second ionization potential’ of the neutral Carbon atom, IP2,6 . As before, heat is also returned as the electron cluster readjusts. This time, the amount of heat returned is
(ΔIP1,5 × 5 − ΔIP1,4 × 4) M6 . Observe how the 4 for 4 Be information enters here.
Proceeding to the next step, the work for removing another electron from the now doubly ionized ${}_6\mathrm{C}^{++}$, $IP^{++}_{1,6}$ times $6/M_6$, is $(IP_{1,1} \times \sqrt{6 \times 4} + \Delta IP_{1,4} \times 4)/M_6$. The heat returned this time is $(\Delta IP_{1,4} \times 4 - \Delta IP_{1,3} \times 3)/M_6$. The now triply ionized ${}_6\mathrm{C}^{+++}$ takes work $IP^{+++}_{1,6} \times 6/M_6 = (IP_{1,1} \times \sqrt{6 \times 3} + \Delta IP_{1,3} \times 3)/M_6$ to remove another electron. The $M/Z$-scaled heat returned is $(\Delta IP_{1,3} \times 3 - \Delta IP_{1,2} \times 2)/M_6$. This turns out to be a negative number. This means that getting from ${}_6\mathrm{C}^{+++}$ to ${}_6\mathrm{C}^{++++}$ cools the local environment. And the now quadruply ionized ${}_6\mathrm{C}^{++++}$ would take work $IP^{++++}_{1,6} \times 6/M_6 = (IP_{1,1} \times \sqrt{6 \times 2} + \Delta IP_{1,2} \times 2)/M_6$ to remove another electron. Because $\Delta IP_{1,2}$ is a large number, this last ionization is not very likely to happen. That is, like the ${}_2\mathrm{He}$ atom, the ${}_6\mathrm{C}^{++++}$ ion is quite stable.

With all this information, one can construct an energy tally for all the electron removal scenarios that Carbon invites: ${}_6\mathrm{C} \to {}_6\mathrm{C}^{+}$, ${}_6\mathrm{C} \to {}_6\mathrm{C}^{++}$, ${}_6\mathrm{C} \to {}_6\mathrm{C}^{+++}$, and ${}_6\mathrm{C} \to {}_6\mathrm{C}^{++++}$:

${}_6\mathrm{C} \to {}_6\mathrm{C}^{+}$ takes $IP_{1,1} \times 6/M_6$ and $(\Delta IP_{1,6} \times 6 - \Delta IP_{1,5} \times 5)/M_6$;

${}_6\mathrm{C} \to {}_6\mathrm{C}^{++}$ takes $IP_{1,1}\,(6 + \sqrt{6 \times 5})/M_6$ and $(\Delta IP_{1,6} \times 6 - \Delta IP_{1,4} \times 4)/M_6$;

${}_6\mathrm{C} \to {}_6\mathrm{C}^{+++}$ takes $IP_{1,1}\,(6 + \sqrt{6 \times 5} + \sqrt{6 \times 4})/M_6$ and $(\Delta IP_{1,6} \times 6 - \Delta IP_{1,3} \times 3)/M_6$;

${}_6\mathrm{C} \to {}_6\mathrm{C}^{++++}$ takes $IP_{1,1}\,(6 + \sqrt{6 \times 5} + \sqrt{6 \times 4} + \sqrt{6 \times 3})/M_6$ and $(\Delta IP_{1,6} \times 6 - \Delta IP_{1,2} \times 2)/M_6$.

Carbon also allows the addition of electrons, producing negatively charged ions. To quantify these, let us use the pattern for electron removal, which means we start with ${}_6\mathrm{C}^{----}$. The
work to remove an electron must be $IP^{----}_{1,6} \times 6/M_6 = (IP_{1,1} \times \sqrt{6 \times 10} + \Delta IP_{1,10} \times 10)/M_6$. Like $\Delta IP_{1,2}$, $\Delta IP_{1,10}$ is a large number, meaning that, like ${}_{10}\mathrm{Ne}$, ${}_6\mathrm{C}^{----}$ is quite stable. The heat returned in this de-ionization is $(\Delta IP_{1,10} \times 10 - \Delta IP_{1,9} \times 9)/M_6$.

The next ion down, ${}_6\mathrm{C}^{---}$, takes work $IP^{---}_{1,6} \times 6/M_6 = (IP_{1,1} \times \sqrt{6 \times 9} + \Delta IP_{1,9} \times 9)/M_6$ to remove the next electron. The heat returned in this de-ionization is $(\Delta IP_{1,9} \times 9 - \Delta IP_{1,8} \times 8)/M_6$. Then the next ion down, ${}_6\mathrm{C}^{--}$, takes work $IP^{--}_{1,6} \times 6/M_6 = (IP_{1,1} \times \sqrt{6 \times 8} + \Delta IP_{1,8} \times 8)/M_6$ for another electron removal, and this de-ionization returns heat $(\Delta IP_{1,8} \times 8 - \Delta IP_{1,7} \times 7)/M_6$. And the next ion, ${}_6\mathrm{C}^{-}$, takes work $IP^{-}_{1,6} \times 6/M_6 = (IP_{1,1} \times \sqrt{6 \times 7} + \Delta IP_{1,7} \times 7)/M_6$ for the last electron removal, and this final de-ionization then returns heat $(\Delta IP_{1,7} \times 7 - \Delta IP_{1,6} \times 6)/M_6$.
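The process of adding electrons to an atom is just opposite to the process of removing electrons. So the energy tally for all the electron addition scenarios is:

${}_6\mathrm{C} \to {}_6\mathrm{C}^{-}$ takes $-IP_{1,1} \times \sqrt{6 \times 7}/M_6$ and $-(\Delta IP_{1,7} \times 7 - \Delta IP_{1,6} \times 6)/M_6$;

${}_6\mathrm{C} \to {}_6\mathrm{C}^{--}$ takes $-IP_{1,1}\,(\sqrt{6 \times 8} + \sqrt{6 \times 7})/M_6$ and $-(\Delta IP_{1,8} \times 8 - \Delta IP_{1,6} \times 6)/M_6$;

${}_6\mathrm{C} \to {}_6\mathrm{C}^{---}$ takes $-IP_{1,1}\,(\sqrt{6 \times 9} + \sqrt{6 \times 8} + \sqrt{6 \times 7})/M_6$ and $-(\Delta IP_{1,9} \times 9 - \Delta IP_{1,6} \times 6)/M_6$;

${}_6\mathrm{C} \to {}_6\mathrm{C}^{----}$ takes $-IP_{1,1}\,(\sqrt{6 \times 10} + \sqrt{6 \times 9} + \sqrt{6 \times 8} + \sqrt{6 \times 7})/M_6$ and $-(\Delta IP_{1,10} \times 10 - \Delta IP_{1,6} \times 6)/M_6$.

With Carbon now well in tow, we are ready to look back to Hydrogen. The transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{+}$ takes $IP_{1,1}/M_1$ and $(\Delta IP_{1,1} - \Delta IP_{1,0})/M_1$, but the latter two IP data items are zero, so the pattern being followed isn't well revealed by them. The transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{-}$ takes $-IP_{1,1} \times \sqrt{1 \times 2}/M_1$ and $(-\Delta IP_{1,2} \times 2 + \Delta IP_{1,1} \times 1)/M_1$, but $\Delta IP_{1,1}$ is zero, and so doesn't fully reveal the pattern. That is why it was better to violate tradition and start with Carbon.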
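The removal and addition tallies above follow one pattern, which can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `removal` and `addition` and the dictionaries are assumptions of the sketch, and the numerical inputs are limited to the $IP_{1,1}$, $\Delta IP_{1,Z}$ and $M_Z$ values quoted in the worked examples of this chapter, so a transition that needs an unquoted $\Delta IP_{1,Z}$ cannot be evaluated with it.

```python
import math

IP_11 = 14.250  # eV; baseline IP_{1,1} as quoted in the text

# Deviation terms Delta IP_{1,Z} in eV. Only values quoted in this chapter are
# listed; the Z = 0 and Z = 1 entries are the zeros mentioned for Hydrogen.
DELTA_IP = {0: 0.0, 1: 0.0, 2: 35.625, 4: 9.077, 6: 7.320,
            8: 13.031, 9: 20.254, 10: 29.391}

# Atomic masses M_Z as used in the worked examples.
MASS = {1: 1.008, 6: 12.011, 8: 15.999}


def removal(z, n):
    """(work, heat) in eV for removing n electrons from neutral element z."""
    # k = 0 gives sqrt(z * z) = z, the first term of the tally.
    work = IP_11 * sum(math.sqrt(z * (z - k)) for k in range(n)) / MASS[z]
    heat = (DELTA_IP[z] * z - DELTA_IP[z - n] * (z - n)) / MASS[z]
    return work, heat


def addition(z, n):
    """(work, heat) in eV for adding n electrons to neutral element z."""
    work = -IP_11 * sum(math.sqrt(z * (z + k)) for k in range(1, n + 1)) / MASS[z]
    heat = -(DELTA_IP[z + n] * (z + n) - DELTA_IP[z] * z) / MASS[z]
    return work, heat


if __name__ == "__main__":
    cases = [("C -> C++++", removal(6, 4)), ("C -> C----", addition(6, 4)),
             ("O -> O++", removal(8, 2)), ("O -> O--", addition(8, 2)),
             ("H -> H+", removal(1, 1)), ("H -> H-", addition(1, 1))]
    for label, (work, heat) in cases:
        print(f"{label}: work {work:.3f}, heat {heat:.3f}, total {work + heat:.3f} eV")
```

The totals agree with the values used in the molecule analyses to within rounding of the square roots in the last digit.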
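Since hydrocarbons are so important in the discipline of Chemistry, it will also be useful to have at hand the corresponding results for Oxygen. Being so close to ${}_{10}\mathrm{Ne}$, ${}_8\mathrm{O}$ is probably characterized thoroughly enough by just four transitions: ${}_8\mathrm{O} \to {}_8\mathrm{O}^{+}$, ${}_8\mathrm{O} \to {}_8\mathrm{O}^{++}$, ${}_8\mathrm{O} \to {}_8\mathrm{O}^{-}$, and ${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$.

${}_8\mathrm{O} \to {}_8\mathrm{O}^{+}$ takes $IP_{1,1} \times 8/M_8$ and $(\Delta IP_{1,8} \times 8 - \Delta IP_{1,7} \times 7)/M_8$;

${}_8\mathrm{O} \to {}_8\mathrm{O}^{++}$ takes $IP_{1,1}\,(8 + \sqrt{8 \times 7})/M_8$ and $(\Delta IP_{1,8} \times 8 - \Delta IP_{1,6} \times 6)/M_8$;

${}_8\mathrm{O} \to {}_8\mathrm{O}^{-}$ takes $-IP_{1,1} \times \sqrt{8 \times 9}/M_8$ and $-(\Delta IP_{1,9} \times 9 - \Delta IP_{1,8} \times 8)/M_8$;

${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$ takes $-IP_{1,1}\,(\sqrt{8 \times 10} + \sqrt{8 \times 9})/M_8$ and $-(\Delta IP_{1,10} \times 10 - \Delta IP_{1,8} \times 8)/M_8$.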
4. Application to Analysis of Molecules

The basic idea exploited in this work is that a molecule consists of ionized atoms, some positively ionized and some negatively ionized, exerting Coulomb attraction for each other. It is therefore possible to learn something about a molecule by determining what its constituent ions are, and modeling the energy requirements for creating those ions from the neutral atoms involved. The first part of that question was addressed in Ref. [1], which gave the following two mirror-image propositions:

Proposition 1: Molecules that are relatively stable have total electron counts such that every atom present can be assigned an electron count equal to that of a noble gas, or else zero.

Proposition 2: Molecules that are highly reactive have total electron counts such that not every atom present can be assigned an electron count equal to that of a noble gas, or else zero.

Proposition 1 is very often fulfilled, and was illustrated by molecules from small ($\mathrm{NH_3}$, $\mathrm{NaOH}$) to larger ($\mathrm{CH_3CO_2C_{10}H_{17}}$, $\mathrm{(CH_3CO_2)_2Pb \cdot 3H_2O}$, $\mathrm{(C_{17}H_{35}CO_2)_2Ca}$). Proposition 2 was illustrated with some common atmospheric gases ($\mathrm{O_2}$, $\mathrm{O_3}$, or $\mathrm{NO}$). So the Propositions are probably true, but we really need to know much more. We need not just a binary division into stable / reactive; we need numerical rankings within those categories. That would mean modeling the energy requirements for creating the ions involved. Furthermore, it is often true that a 'stable' molecule admits more than one possible set of noble-gas electron assignments, or that a 'reactive' molecule admits more than one possible set of not-quite noble-gas electron assignments. When multiple assignments are possible, which one is the one Nature picks? Again, we need numerical rankings.
The present work is aimed at developing such quantitative rankings, based on the algebraic modeling concepts developed in the previous Section. There are several preliminary remarks to be made:

1) In all of the following examples, the IP data used come from the algebraic model developed in [1] and summarized in Appendix 1 and Appendix 2. Occasional IP data points differ detectably from $M/Z$-corrected raw data. So be it. I wish to test the concept of algebraic modeling overall, and if errors that are manifest in molecule modeling actually arise from errors in atom modeling, they are nevertheless errors, and I want them to be seen;

2) The calculations reported carry several more digits than can be justified as 'significant'. This level of numerical detail is provided only to make the calculations easier to follow and to reproduce, and not to imply an amazing level of precision;

3) Comparison to reported data is generally desired, but not always possible, for reasons discussed below.

One problem is that the energies calculated here are just for the formation of the ions in a single molecule. Some of this energy will be consumed in actually forming the molecule; i.e. giving the ions enough energy to stay at some stand-off distance from each other. Some more energy will be consumed in getting to the 'state of matter' for which data are reported. That means many molecules, in bulk matter (a mole), endowed with collective attributes. Molar heats of formation are generally reported for 'standard conditions', i.e. temperature (typically 25 °C), pressure (for gases, one atmosphere), dilution (for solutions, infinite dilution), etc. A single molecule can hardly be said to possess such properties. Indeed, not even the 'state' of matter (solid, liquid, or gas) seems meaningful for a single molecule.
So it is clear that the energy calculated here for forming the ions in a single molecule in isolation cannot be said to correspond directly to reported data on the heat of formation for a molecule as a constituent of bulk matter. The energies calculated here are generally less than reported heats of formation. Why less? Because, for molecules that actually form spontaneously, thus releasing heat, the reported heats of formation are by convention negative. That means the energies for forming the ions in molecules should also come out negative, and indeed more negative than the heats of formation reported for bulk matter. "Less is more," the saying goes!

Another problem, all too pervasive throughout Chemistry, is the need for conversion among many different systems of units. That is why the edition of Lange's Handbook that provided most of the data in this paper has some 42 pages devoted to specifying conversion factors. And even at that generous page count, it does not have the particular conversion factor needed here. Heats of formation are quoted there in 'kilogram-calories per mole' (here abbreviated as 'Kg-cal's'), whereas ionization potentials are quoted in 'electron volts per atom' (here abbreviated as 'eV's'). Our energy-tally results are reported in 'electron volts per molecule', and with 'molecule' being just a generalization on 'atom', those units too are abbreviated as 'eV's'. So for almost everything in this paper, we need the conversion from Kg-cal's to eV's. The meaning of 'kilogram-calories' seems somewhat ambiguous, but the relevant conversion information provided appears to be: kilogram-calories to joules: 4186; joules to ergs: $10^{7}$; eV's to ergs: $1.602 \times 10^{-12}$; gm-mole to molecules: $6.0228 \times 10^{23}$. The needed conversion factor is then probably

$$\frac{\text{kilogram-calories to joules} \times \text{joules to ergs}}{\text{eV's to ergs} \times \text{gm-mole to molecules}} = \frac{4186 \times 10^{7}}{1.602 \times 10^{-12} \times 6.0228 \times 10^{23}} \approx 0.043$$

This interpretation can be checked by successful use with enough example molecules. Many examples will be presented in the following Sections.
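The arithmetic behind the 0.043 factor can be reproduced with a short Python sketch. The constant and function names are assumptions of this sketch; the numerical values are exactly the four conversion figures listed above, including the older Avogadro value quoted there.

```python
# Conversion figures as listed in the text (not modern CODATA values).
KGCAL_TO_JOULES = 4186.0
JOULES_TO_ERGS = 1.0e7
EV_TO_ERGS = 1.602e-12
MOLECULES_PER_GM_MOLE = 6.0228e23


def kgcal_per_mole_to_ev_per_molecule():
    """Factor that turns Kg-cal per mole into eV per molecule."""
    ergs_per_kgcal = KGCAL_TO_JOULES * JOULES_TO_ERGS
    return ergs_per_kgcal / (EV_TO_ERGS * MOLECULES_PER_GM_MOLE)


FACTOR = kgcal_per_mole_to_ev_per_molecule()
ROUNDED_FACTOR = round(FACTOR, 3)  # the three-decimal value used in the text

if __name__ == "__main__":
    print(f"unrounded factor: {FACTOR:.6f}")
    print(f"rounded factor:   {ROUNDED_FACTOR}")
    # Example: a reported heat of formation of -94.05 Kg-cal's per mole.
    print(f"-94.05 Kg-cal's -> {-94.05 * ROUNDED_FACTOR:.3f} eV's")
```

Note that the examples in the following Sections multiply by the rounded 0.043, not by the unrounded quotient.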
5. Analyses of Some Small Molecules

A Molecule with Two Atoms

$\mathrm{H_2}$. Reported heat of formation: 0 Kg-cal's, times conversion factor 0.043 yields 0 eV's out (meaning no energy is released in forming this molecule). Probable electron assignments: one ${}_1\mathrm{H}$ atom, 0 electrons; the other ${}_1\mathrm{H}$ atom, 2 electrons.

Relevant model data about ${}_1\mathrm{H}$:

The transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{+}$ takes work $IP_{1,1}/M_1 = 14.250/1.008 = 14.137$ eV's.

The transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{-}$ takes negative work for the extra electron falling into nuclear orbit: $-IP_{1,1} \times \sqrt{1 \times 2}/M_1 = -14.25 \times 1.414/1.008 = -19.990$ eV's, and takes negative heat for the formation of an electron cluster: $-\Delta IP_{1,2} \times 2/M_1$, where $\Delta IP_{1,2} = IP_{1,2} - IP_{1,1} = 49.875 - 14.250 = 35.625$, so that $-\Delta IP_{1,2} \times 2/M_1 = -35.625 \times 2/1.008 = -70.685$ eV's. The sum of energies taken is $-19.990 - 70.685 = -90.675$ eV's. Observe that this 2-electron negative state, ${}_1\mathrm{H}^{-}$, is very much favored over the neutral state, ${}_1\mathrm{H}$, or the zero-electron positive state, ${}_1\mathrm{H}^{+}$.

Thus the formation of the ions in the $\mathrm{H_2}$ molecule takes $14.137 - 90.675 = -76.538$ eV's, the minus meaning that heat is released in forming this molecule. So there is a lot of energy available, which can allow $\mathrm{H_2}$ molecules to form, and $\mathrm{H_2}$ bulk matter to vaporize; i.e., become a gas.
Another Molecule with Two Atoms

$\mathrm{O_2}$. Reported heat of formation: 0 Kg-cal's, times conversion factor 0.043 is 0 eV's. Probable electron assignments: ${}_8\mathrm{O}$, 6; ${}_8\mathrm{O}$, 10. Note: it is not possible for both ${}_8\mathrm{O}$ atoms to get a 'Noble gas' electron count.

Relevant model data about ${}_8\mathrm{O}$:

The transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{++}$ takes $IP_{1,1}\,(8 + \sqrt{8 \times 7})/M_8$, or $14.250\,(8 + \sqrt{8 \times 7})/15.999 = 14.250\,(8 + 7.483)/15.999 = 13.791$ eV's, and $(\Delta IP_{1,8} \times 8 - \Delta IP_{1,6} \times 6)/M_8$, or $(13.031 \times 8 - 7.320 \times 6)/15.999 = (104.248 - 43.920)/15.999 = 3.771$ eV's. So the transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{++}$ takes altogether $13.791 + 3.771 = 17.562$ eV's.

The transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$ takes $-IP_{1,1}\,(\sqrt{8 \times 10} + \sqrt{8 \times 9})/M_8$, or $-14.250\,(8.944 + 8.485)/15.999 = -15.524$ eV's, and $-(\Delta IP_{1,10} \times 10 - \Delta IP_{1,8} \times 8)/M_8$, or $-(29.391 \times 10 - 13.031 \times 8)/15.999 = -(293.910 - 104.248)/15.999 = -11.855$ eV's. So the transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$ takes altogether $-15.524 - 11.855 = -27.379$ eV's.

Interpretation concerning $\mathrm{O_2}$: Transforming the neutral ${}_8\mathrm{O}$ atom to the positive ${}_8\mathrm{O}^{++}$ ion takes 17.562 eV's, and transforming ${}_8\mathrm{O}$ to the negative ${}_8\mathrm{O}^{--}$ ion takes $-27.379$ eV's, so forming the ions in the $\mathrm{O_2}$ molecule takes $17.562 - 27.379 = -9.817$ eV's. This molecule readily forms, and there is excess energy available to vaporize the bulk matter formed.
A Third Molecule with Two Atoms

$\mathrm{CO}$. Reported heat of formation: $-26.42$ Kg-cal's as a gas, multiplied by conversion factor 0.043 yields $-1.136$ eV's. Probable electron assignments: ${}_6\mathrm{C}$, 4; ${}_8\mathrm{O}$, 10. Note: it is not possible for both atoms to get a 'Noble gas' electron count.

Relevant model data about ${}_6\mathrm{C}$:

The transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{++}$ takes $IP_{1,1}\,(6 + \sqrt{6 \times 5})/M_6$, or $14.250\,(6 + 5.477)/12.011 = -13.617$ eV's, and $(\Delta IP_{1,6} \times 6 - \Delta IP_{1,4} \times 4)/M_6$, or $(7.320 \times 6 - 9.077 \times 4)/12.011 = (43.920 - 36.308)/12.011 = 0.6338$ eV's.

Interpretation concerning ${}_6\mathrm{C}$: Transforming a neutral ${}_6\mathrm{C}$ atom to a positive ${}_6\mathrm{C}^{++}$ ion takes $-13.617 + 0.6338 = -12.983$ eV's.

Relevant model data about ${}_8\mathrm{O}$ (part of the information given for $\mathrm{O_2}$): The transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$ takes $-27.379$ eV's.

Interpretation concerning $\mathrm{CO}$: Forming the ions in the $\mathrm{CO}$ molecule takes $-12.983 - 27.379 = -40.362$ eV's. This molecule readily forms, and as bulk matter it readily becomes a gas. Indeed, it looks even more favorable than $\mathrm{CO_2}$, analyzed next. This may explain why $\mathrm{CO}$ is a frequent, though unwelcome, product of combustion.
A Molecule with Three Atoms

$\mathrm{CO_2}$. Reported heat of formation: $-94.05$ Kg-cal's as gas, or $-98.69$ Kg-cal's as aqueous solution (either way meaning energy is released in forming this molecule), multiplied by the conversion factor 0.043 yields $-4.044$ eV's for gas or $-4.244$ eV's for solution. Probable electron assignments: ${}_6\mathrm{C}$, 2; ${}_8\mathrm{O}$'s, 10 each.

Relevant model data about ${}_6\mathrm{C}$:

The transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{++++}$ takes $IP_{1,1}\,(6 + \sqrt{6 \times 5} + \sqrt{6 \times 4} + \sqrt{6 \times 3})/M_6$, or $14.250\,(6 + 5.477 + 4.899 + 4.243)/12.011 = 24.463$ eV's, and $(\Delta IP_{1,6} \times 6 - \Delta IP_{1,2} \times 2)/M_6$, or $(7.320 \times 6 - 35.625 \times 2)/12.011 = (43.920 - 71.250)/12.011 = -2.275$ eV's. So the transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{++++}$ takes altogether $24.463 - 2.275 = 22.188$ eV's.

Relevant model data about ${}_8\mathrm{O}$ (same as for $\mathrm{CO}$ above): The transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$ takes $-27.379$ eV's.

Interpretation concerning $\mathrm{CO_2}$: Forming the ions in the $\mathrm{CO_2}$ molecule takes $22.188 - 2 \times 27.379 = -32.570$ eV's; that is, this molecule easily forms, and there is plenty of energy to make the bulk matter into a gas. Observe that just looking at eV's for ion formation, without the complications embedded in Kg-cal's for getting to the gaseous state at prescribed conditions, shows $\mathrm{CO_2}$ at $-32.570$ eV's to be less favorable than $\mathrm{CO}$ at $-40.362$ eV's. The $-26.42$ Kg-cal's for $\mathrm{CO}$ as a gas, vs. $-94.05$ Kg-cal's or $-98.69$ Kg-cal's for $\mathrm{CO_2}$ as gas or aqueous solution, obscures this situation.
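The molecule-level bookkeeping used so far is a weighted sum of per-ion energies. A minimal Python sketch of that bookkeeping follows; the dictionary layout and the function name `ion_formation_energy` are assumptions of this sketch, and the per-ion values are the totals quoted above for $\mathrm{H_2}$, $\mathrm{O_2}$ and $\mathrm{CO_2}$.

```python
# Total energy (work plus heat, in eV) to form each ion from the neutral atom,
# as quoted in the analyses of H2, O2 and CO2.
ION_ENERGY = {
    "H+": 14.137,
    "H-": -90.675,
    "O++": 17.562,
    "O--": -27.379,
    "C++++": 22.188,
}

# Probable ion content of each molecule, following the electron assignments.
MOLECULES = {
    "H2": {"H+": 1, "H-": 1},
    "O2": {"O++": 1, "O--": 1},
    "CO2": {"C++++": 1, "O--": 2},
}


def ion_formation_energy(ions):
    """Sum of per-ion formation energies (eV) for a {ion: count} mapping."""
    return sum(ION_ENERGY[name] * count for name, count in ions.items())


if __name__ == "__main__":
    for name, ions in MOLECULES.items():
        print(f"{name}: {ion_formation_energy(ions):.3f} eV's")
```

Other molecules can be added by listing their assumed ion content, provided the corresponding per-ion energies have been worked out.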
Another Molecule with Three Atoms

$\mathrm{H_2O}$. Reported heat of formation $-57.80$ Kg-cal's as gas, $-68.32$ Kg-cal's for liquid, multiplied by the conversion factor 0.043 yields $-2.485$ eV's as gas, $-2.938$ eV's for liquid. Probable electron assignments: ${}_8\mathrm{O}$, 10; ${}_1\mathrm{H}$, both zero.

Relevant model data about ${}_1\mathrm{H}$ (part of the information given for $\mathrm{H_2}$ above): The transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{+}$ takes 14.137 eV's.

Relevant model data about ${}_8\mathrm{O}$ (same as for $\mathrm{CO_2}$ above): The transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$ takes $-27.379$ eV's.

Interpretation concerning $\mathrm{H_2O}$: The formation of the ions in the $\mathrm{H_2O}$ molecule takes $2 \times 14.137 - 27.379 = 0.895$ eV's. This is very near zero, but it is positive, and so seems puzzling: it suggests that water takes some net energy to form, rather than yielding some energy. But one more phenomenon can occur with water. The two $\mathrm{H}^{+}$ ions are really naked protons, extremely tiny, and so able to form a positive binary charge cluster similar to the negative binary charge cluster that two electrons form in an $\mathrm{H}^{-}$ ion. If the two naked protons indeed do that, the process yields some energy. That energy would be related to the $-70.685$ eV's that the two electrons in the $\mathrm{H}^{-}$ ion take [see $\mathrm{H_2}$]. We do not at this time have a scaling law to express the relation between clusters of electrons and clusters of protons. What we do have is the comparable data for heavy water (made with deuterons): $-59.56$ Kg-cal's for gas, $-70.41$ Kg-cal's for liquid. These results are not so different from those for regular water. So the yet-to-be-articulated scaling from electrons to protons, and to deuterons, does not strongly involve the mass of protons vs. deuterons. So perhaps it does not strongly involve the mass of either one vs. the mass of electrons. If that is so, the resulting heat could be very similar to, possibly even equal to, the $-70.685$ eV's for the two electrons in the $\mathrm{H}^{-}$ ion. Taking this value as an estimate, the proton clustering would easily make the energy tally for making the ions in the water molecule appropriately negative, at $0.895 - 70.685 = -69.790$ eV's. This energy yield would allow water to both form its molecule and then melt into a liquid. Note too that proton clustering would also make a water molecule polarized, which indeed it definitely is.

As a result of this polarization, a lot of other molecules dissolve in water, meaning they dissociate into positive and negative ions. This dissolution behavior includes even the pure water itself: at any given moment in time, about one out of $10^{7}$ of the molecules in a sample of pure water is dissociated into $\mathrm{H}^{+}$ and $\mathrm{OH}^{-}$ ions. Hence we have the phenomenon of 'pH' with 'neutral' set at 7.
A Molecule with Four Atoms

$\mathrm{Co_3C}$. (Like $\mathrm{H_2}$, the $\mathrm{Co_3C}$ molecule involves only atoms of 'keystone' elements [see Appendix 4]. Unlike the other molecules treated so far, $\mathrm{Co_3C}$ is a crystalline solid at standard conditions, so no part of the heat released in forming its constituent ions is used in getting to liquid or gas state.) Reported heat of formation $+9.5$ Kg-cal's (the '+' meaning heat is consumed in making this molecule) times conversion factor 0.043 yields $+0.4085$ eV's. Probable electron assignments: ${}_{27}\mathrm{Co}$, one 25, two 30's; ${}_6\mathrm{C}$, 2.

Relevant model data about ${}_{27}\mathrm{Co}$:

The transition ${}_{27}\mathrm{Co} \to {}_{27}\mathrm{Co}^{++}$ takes $IP_{1,1}\,(27 + \sqrt{27 \times 26})/M_{27}$, or $14.250\,(27 + 26.495)/58.933 = 12.935$ eV's, and $(\Delta IP_{1,27} \times 27 - \Delta IP_{1,25} \times 25)/M_{27}$, or $(1.980 \times 27 - 1.289 \times 25)/58.933 = (53.460 - 32.225)/58.933 = 0.360$ eV's. So the transition ${}_{27}\mathrm{Co} \to {}_{27}\mathrm{Co}^{++}$ takes altogether $12.935 + 0.360 = 13.295$ eV's.

The transition ${}_{27}\mathrm{Co} \to {}_{27}\mathrm{Co}^{---}$ takes $-IP_{1,1}\,(\sqrt{30 \times 27} + \sqrt{29 \times 27} + \sqrt{28 \times 27})/M_{27}$, or $-14.25\,(28.460 + 27.982 + 27.495)/58.933 = -20.296$ eV's, and $-(\Delta IP_{1,30} \times 30 - \Delta IP_{1,27} \times 27)/M_{27}$, or $-(4.242 \times 30 - 1.980 \times 27)/58.933 = -(127.260 - 53.460)/58.933 = -1.252$ eV's. So the transition ${}_{27}\mathrm{Co} \to {}_{27}\mathrm{Co}^{---}$ takes altogether $-20.296 - 1.252 = -21.548$ eV's.

Interpretation concerning ${}_{27}\mathrm{Co}$: Transforming 3 neutral ${}_{27}\mathrm{Co}$ atoms into one ${}_{27}\mathrm{Co}^{++}$ ion and two ${}_{27}\mathrm{Co}^{---}$ ions takes $13.295 + 2 \times (-21.548) = -29.801$ eV's.

Relevant model data about ${}_6\mathrm{C}$ (same as in $\mathrm{CO_2}$): The transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{++++}$ takes altogether 22.188 eV's.

Interpretation concerning $\mathrm{Co_3C}$: Making the ions in the $\mathrm{Co_3C}$ molecule takes $-29.801 + 22.188 = -7.613$ eV's. Evidently, all this energy, and a slight bit more, is consumed in forming the molecule and its bulk-matter crystal structure. It is encouraging that the calculation here produces a result that is reasonably interpretable. It is cautioning that the calculation involves some rather small differences between rather large numbers. This circumstance may be even worse for some molecules. If so, related numerical problems would probably occur with traditional Quantum Chemistry as well. Nature is a challenge for us all!
A Molecule with Five Atoms

$\mathrm{CH_4}$ (Methane; another 'keystone-only' molecule, like $\mathrm{H_2}$ and $\mathrm{Co_3C}$). Reported heat of formation: $-17.89$ Kg-cal's, times conversion factor 0.043 yields $-0.769$ eV's. Probable electron assignments: ${}_6\mathrm{C}$, 10; ${}_1\mathrm{H}$, all zero.

Relevant model data concerning ${}_6\mathrm{C}$: The transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{----}$ takes $-IP_{1,1}\,(\sqrt{6 \times 10} + \sqrt{6 \times 9} + \sqrt{6 \times 8} + \sqrt{6 \times 7})/M_6$, or $-14.250\,(7.746 + 7.348 + 6.928 + 6.481)/12.011 = -14.250 \times 28.503/12.011 = -33.816$ eV's, and $-(\Delta IP_{1,10} \times 10 - \Delta IP_{1,6} \times 6)/M_6$, or $-(29.391 \times 10 - 7.320 \times 6)/12.011 = -(293.910 - 43.920)/12.011 = -20.813$ eV's.

Interpretation concerning ${}_6\mathrm{C}$: The transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{----}$ takes $-33.816 - 20.813 = -54.629$ eV's.

Relevant model data concerning ${}_1\mathrm{H}$ (from Sect. 5): The transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{+}$ takes 14.137 eV's.

Interpretation concerning ${}_1\mathrm{H}$: Turning 4 neutral ${}_1\mathrm{H}$ atoms into 4 positive ${}_1\mathrm{H}^{+}$ ions takes $4 \times 14.137 = 56.548$ eV's. These four ions are really four naked protons. Recall the case of two naked protons in the $\mathrm{H_2O}$: they apparently formed a positive binary charge cluster. So consider that the four naked protons in $\mathrm{CH_4}$ could form two binary clusters, or one four-fold cluster. The two-binaries configuration seems favored on energetic grounds, and it seems to be confirmed on polarization grounds. Observe that the four-fold configuration would produce a polarized molecule, maybe four times as strongly polarized as $\mathrm{H_2O}$, whereas the two-binary configuration allows the two binaries to seek opposite sides of the central $\mathrm{C}^{----}$ ion, and so produce no net polarization of the $\mathrm{CH_4}$ molecule. And note that $\mathrm{CH_4}$ is indeed not a polarized molecule.

The energy taken to make two binary proton clusters is related to, and probably very similar to, twice the $-70.685$ eV's estimated for making the one binary proton cluster in $\mathrm{H_2O}$; i.e. $-141.320$ eV's.

Interpretation concerning $\mathrm{CH_4}$: Forming the ions in the $\mathrm{CH_4}$ molecule takes something like $-54.629 + 56.548 - 141.320 = -139.401$ eV's. This molecule readily forms, and as bulk matter becomes a gas.
A Molecule with Six Atoms

$\mathrm{CH_4O}$ (Methyl alcohol). Reported heat of formation $-48.10$ Kg-cal's for gas, times conversion factor 0.043 yields $-2.0683$ eV's. Probable electron assignments: ${}_6\mathrm{C}$, 2; ${}_1\mathrm{H}$, one at 0, three at 2; ${}_8\mathrm{O}$, 10.

Relevant model data about ${}_6\mathrm{C}$ (same as in $\mathrm{CO_2}$): The transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{++++}$ takes altogether 22.188 eV's.

Relevant model data about ${}_1\mathrm{H}$ (same as in $\mathrm{H_2}$): The transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{+}$ takes 14.137 eV's and the transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{-}$ takes $-90.675$ eV's.

Relevant model data about ${}_8\mathrm{O}$ (part of the information given for $\mathrm{O_2}$): The transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$ takes $-27.379$ eV's.

Interpretation concerning $\mathrm{CH_4O}$: Forming the ions in the $\mathrm{CH_4O}$ molecule takes $22.188 + 14.137 + 3 \times (-90.675) - 27.379 = -263.079$ eV's. This molecule readily forms, and as bulk matter becomes a gas.
6. Analysis of a Much Larger Molecule

With Algebraic Chemistry, there is no real impediment to analyzing large molecules. This can be demonstrated here with some suitably larger molecule. A socially significant one is sucrose, $\mathrm{C_{12}H_{22}O_{11}}$; we do eat a lot of that one! It has 45 atoms and 182 electrons. That qualifies it as 'much larger', I do believe. The probable electron assignments are: $\mathrm{C}$'s, 6 at 2, 6 at 10; $\mathrm{H}$'s, all 0's; $\mathrm{O}$'s, all 10's. The relevant model data all comes from Sect. 5:

Concerning ${}_6\mathrm{C}$: From $\mathrm{CO_2}$, the transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{++++}$ takes 22.188 eV's; from $\mathrm{CH_4}$, the transition ${}_6\mathrm{C} \to {}_6\mathrm{C}^{----}$ takes $-54.629$ eV's.

Concerning ${}_1\mathrm{H}$: From $\mathrm{H_2}$, the transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{+}$ takes 14.137 eV's.

Concerning ${}_8\mathrm{O}$: From $\mathrm{O_2}$, the transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{--}$ takes $-27.379$ eV's.

Interpretation concerning ${}_6\mathrm{C}$: Transforming 6 neutral ${}_6\mathrm{C}$ atoms into 6 positive ${}_6\mathrm{C}^{++++}$ ions takes $6 \times 22.188 = 133.128$ eV's, and transforming 6 neutral ${}_6\mathrm{C}$ atoms to 6 negative ${}_6\mathrm{C}^{----}$ ions takes $6 \times (-54.629) = -327.774$ eV's.

Interpretation concerning ${}_1\mathrm{H}$: Transforming 22 neutral ${}_1\mathrm{H}$ atoms into 22 positive ${}_1\mathrm{H}^{+}$ ions takes $22 \times 14.137 = 311.014$ eV's. [Do these then form binary clusters? Probably not; where would they go?]

Interpretation concerning ${}_8\mathrm{O}$: Transforming 11 neutral ${}_8\mathrm{O}$ atoms into 11 negative ${}_8\mathrm{O}^{--}$ ions takes $11 \times (-27.379) = -301.169$ eV's.

Interpretation concerning $\mathrm{C_{12}H_{22}O_{11}}$: Forming the ions in the $\mathrm{C_{12}H_{22}O_{11}}$ molecule takes $133.128 - 327.774 + 311.014 - 301.169 = -184.801$ eV's.

I wanted to compare the total heat of ion formation modeled here and the heat of bulk matter formation reported in Lange's Handbook of Chemistry. However, the latter information was not available in the rather old edition that had provided all the other heat data used in this paper. The modern-day web revealed the reason for the omission: the presumed direct synthesis reaction for sucrose,

$$12\,\mathrm{C(s)} + 11\,\mathrm{H_2(g)} + 5\tfrac{1}{2}\,\mathrm{O_2(g)} \to \mathrm{C_{12}H_{22}O_{11}(s)} + \text{heat},$$

was never accomplished in the lab. Note that all species on the left side were reported to have zero heat of formation, and that sucrose did form naturally, and would have been expected to have a negative heat of formation. The assumed condition for the reaction to work would likely have been that the heats of formation be less negative on the left than on the right, so the reaction would have looked feasible. So its actual non-feasibility would have been a big surprise.

Algebraic Chemistry reveals the reason for the failure. The prerequisite for the reaction to work is really about the energy taken to form ions. From Sect. 5, forming all the ions in all the species on the left side of the sucrose synthesis reaction takes $12 \times 0 + 11 \times (-76.538) + 5.5 \times (-9.817) = -841.918 - 53.994 = -895.912$ eV's.
This is more negative than the estimated $-184.801$ eV's to form the ions in the sucrose molecule on the right. This situation indicates that the presumed direct synthesis reaction for sucrose does not go to the right, as it is depicted.

The web also provided a sucrose heat of formation instead inferred from its combustion reaction,

$$\mathrm{C_{12}H_{22}O_{11}(s)} + 12\,\mathrm{O_2(g)} \to 12\,\mathrm{CO_2(g)} + 11\,\mathrm{H_2O} + \text{heat}$$

Since this reaction definitely works, Algebraic Chemistry should be able to confirm that fact. Forming the ions on the left side of the reaction is estimated to take $-184.801 + 12 \times (-9.817) = -302.605$ eV's, whereas forming the ions on the right side is estimated to take $12 \times (-32.570) + 11 \times (-70.685) = -390.840 - 777.535 = -1168.375$ eV's, which is indeed more negative. This says the sucrose combustion reaction does indeed run to the right as depicted.

But the reported heat of formation inferred from the sucrose combustion reaction is not reported in the Kg-cal/mole units that the heats of formation for the other molecules quoted earlier were; it is instead reported in Kjoules/mole (here abbreviated Kj's). So the needed conversion factor changes, from

$$\frac{\text{kilogram-calories to joules} \times \text{joules to ergs}}{\text{eV's to ergs} \times \text{gm-mole to molecules}} = \frac{4186 \times 10^{7}}{1.602 \times 10^{-12} \times 6.0228 \times 10^{23}} \approx 0.043$$

to

$$\frac{\text{Kjoules to joules} \times \text{joules to ergs}}{\text{eV's to ergs} \times \text{gm-mole to molecules}} = \frac{10^{3} \times 10^{7}}{1.602 \times 10^{-12} \times 6.0228 \times 10^{23}} \approx 0.0104$$

The reported $-2226.1$ Kj's is $-23.25$ eV's. This is much less in magnitude than the $-184.801$ eV's here calculated to form the ions in sucrose. This means that a very large portion of the energy generated making the ions in sucrose is then tied up in making the molecule and its crystal structure. Being so energy-packed, it is no wonder that sucrose crystals dissolve so easily in water!
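The sucrose tally and the two reaction comparisons in this Section can be retraced with a short Python sketch. The names used (`SPECIES`, `side_energy`, `runs_to_right`) are assumptions of this sketch. The per-ion and per-molecule energies are the values quoted in Sect. 5, and the $\mathrm{H_2O}$ entry is the $-70.685$ eV's figure that the combustion tally above uses. The criterion coded here is the one applied in the text: a reaction is taken to run to the right when the ion-formation energy on the right is more negative than on the left.

```python
# Per-ion formation energies (eV) quoted in Sect. 5.
C_PLUS4, C_MINUS4 = 22.188, -54.629
H_PLUS, O_MINUS2 = 14.137, -27.379

# Sucrose, C12H22O11: six C++++, six C----, 22 H+, 11 O--.
SUCROSE = 6 * C_PLUS4 + 6 * C_MINUS4 + 22 * H_PLUS + 11 * O_MINUS2

# Ion-formation energies (eV) per molecule, as used in the reaction tallies.
SPECIES = {"C": 0.0, "H2": -76.538, "O2": -9.817, "CO2": -32.570,
           "H2O": -70.685, "sucrose": SUCROSE}


def side_energy(side):
    """Total ion-formation energy (eV) of one side, given {species: count}."""
    return sum(SPECIES[name] * count for name, count in side.items())


def runs_to_right(left, right):
    """True when the right side is more negative than the left side."""
    return side_energy(right) < side_energy(left)


SYNTHESIS = ({"C": 12, "H2": 11, "O2": 5.5}, {"sucrose": 1})
COMBUSTION = ({"sucrose": 1, "O2": 12}, {"CO2": 12, "H2O": 11})

if __name__ == "__main__":
    print(f"sucrose ions: {SUCROSE:.3f} eV's")
    for label, (left, right) in [("synthesis", SYNTHESIS),
                                 ("combustion", COMBUSTION)]:
        print(f"{label}: left {side_energy(left):.3f}, "
              f"right {side_energy(right):.3f}, "
              f"runs to right: {runs_to_right(left, right)}")
```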
7. Analyses of Current Benchmark Reactions

The history of the sucrose problem highlights the importance of studying not only molecules, but also reactions. There was recently a status report on efforts in Quantum Chemistry in the form of a collection of papers collectively titled "Challenges in Theoretical Chemistry" [Science Magazine, October 2008]. Among the papers included was one entitled "Quantum Dynamics of Chemical Reactions", by D.C. Clary [5].

A Reaction Involving Four Atoms

In [5] Clary set as a benchmark a reaction involving four atoms: $\mathrm{OH} + \mathrm{H_2} \to \mathrm{H_2O} + \mathrm{H}$. He cited extensive calculations on this reaction using the 'wave packet' method [6,7]. We can also analyze this same reaction from the viewpoint of Algebraic Chemistry, as follows:

Analysis of the Left Side of the Reaction, $\mathrm{OH} + \mathrm{H_2}$:

Analysis of $\mathrm{OH}$: Probable electron assignments: ${}_8\mathrm{O}$, 9; ${}_1\mathrm{H}$, zero. From Sect. 2, the transition ${}_8\mathrm{O} \to {}_8\mathrm{O}^{-}$ takes $-IP_{1,1} \times \sqrt{8 \times 9}/M_8$, or $-14.250 \times 8.485/15.999 = -7.558$ eV's, and $-(\Delta IP_{1,9} \times 9 - \Delta IP_{1,8} \times 8)/M_8$, or $-(20.254 \times 9 - 13.031 \times 8)/15.999 = -(182.286 - 104.248)/15.999 = -4.878$ eV's. From Sect. 5, $\mathrm{H_2}$, the transition ${}_1\mathrm{H} \to {}_1\mathrm{H}^{+}$ takes 14.137 eV's. So forming the ions in the $\mathrm{OH}$ radical takes $-7.558 - 4.878 + 14.137 = 1.701$ eV's.

Analysis of $\mathrm{H_2}$: From Sect. 5, the formation of the ions in the $\mathrm{H_2}$ molecule takes $-76.538$ eV's.

Forming all of the ions involved in the left side of the reaction, $\mathrm{OH} + \mathrm{H_2}$, takes $1.701 - 76.538 = -74.837$ eV's.

Analysis of the Right Side of the Reaction, $\mathrm{H_2O} + \mathrm{H}$:

Analysis of $\mathrm{H_2O}$: From Sect. 5, the formation of the ions in the $\mathrm{H_2O}$ molecule is estimated to take $-69.790$ eV's.

Analysis of $\mathrm{H}$: None required.

So formation of the ions involved in the $\mathrm{H_2O} + \mathrm{H}$ right side of the reaction is estimated to take $-69.790$ eV's. This is less negative than the $-74.837$ eV's on the left side. So this reaction does not run to the right, as it is depicted, and so is not a realistic target for analysis by Quantum Chemistry.

But note: The $\mathrm{OH}$ on the left side of the reaction, $\mathrm{OH} + \mathrm{H_2}$, is not usually seen that way, as a neutral species; it is usually seen as a negative ion, $\mathrm{OH}^{-}$. So perhaps we should look at some variations on the stated reaction $\mathrm{OH} + \mathrm{H_2} \to \mathrm{H_2O} + \mathrm{H}$. For example, consider $\mathrm{OH}^{-} + \mathrm{H_2} \to \mathrm{H_2O} + \mathrm{H} + e^{-}$.
236
Cynthia Kolb Whitney
Analysis of the New Left Side of the Reaction, OH − + H 2 : − Analysis of OH : Probable electron assignments: 8 O , 10; 1 H , zero. From Sect. 5, O 2 , the transition 8 O → 8 O−− takes −27.379 eV’s. From Sect. 5, H 2 , the transition + − takes 14.137 eV’s. So forming the ions in the OH radical takes 1H → 1H −27.379 + 14.137 = −13.242 eV’s.
Analysis of H 2 : From Sect. 5, the formation of the ions in the H 2 molecule takes −76.538 eV’s. − So formation of the ions involved in the left side of the reaction, OH + H 2 , takes
−13.242 − 76.538 = −89.780 eV’s. This is even worse than the −74.837 eV’s for the original left side, OH + H 2 .
But note: the H on the right side of either OH + H 2 → H 2O + H or the alternative OH − + H 2 → H 2O + H + e− is not usually seen like that, as a neutral atom; it is more + − often seen as H or H . So perhaps we should also look at two further variant reactions: OH − + H 2 → H 2O + H + + 2e− and OH − + H 2 → H 2O + H − .
Analysis of the Common Left Side of these two Teactions, OH − + H 2 From above, formation of the ions involved takes −89.780 eV’s.
Analysis of the Two Right Sides of the Two Reactions, H₂O + H⁺ + 2e⁻ and H₂O + H⁻:
Analysis of H₂O: From Sect. 5, the formation of the ions in the H₂O molecule is estimated to take −69.790 eV’s.
Analysis of H⁺ and H⁻: From Sect. 5, H₂, the transition ₁H → ₁H⁺ takes 14.137 eV’s, and the transition ₁H → ₁H⁻ takes −90.675 eV’s.
Analysis of 2e⁻: None required.
So in the first case, the right side of the reaction, H₂O + H⁺ + 2e⁻, is estimated to take −69.790 + 14.137 = −55.653 eV’s, whereas in the second case, the right side of the reaction, H₂O + H⁻, is estimated to take −69.790 − 90.675 = −160.465 eV’s. The first case still does not run to the right as depicted, but the second case does, and very strongly so. So only this one last variant form of the reaction involving these four atoms appears to be a realistic target for analysis by Quantum Chemistry.
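The bookkeeping for the four-atom variants can be collected in a few lines. The following is only a sketch: the energies are the ion-formation values quoted in the text, while the names `runs_right` and `variants` are illustrative choices, not part of the original analysis.

```python
# Ion-formation energies in eV, as quoted in the text (no new data).
H2O = -69.790
H2 = -76.538
OH_MINUS = -27.379 + 14.137      # O -> O-- plus H -> H+, i.e. -13.242
H_PLUS = 14.137
H_MINUS = -90.675
ORIGINAL_LEFT = -74.837          # OH + H2 with neutral OH, as quoted

def runs_right(left, right):
    """Criterion used in the text: the right side must be more negative."""
    return right < left

# Each entry maps a reaction to (left-side energy, right-side energy).
variants = {
    "OH + H2 -> H2O + H": (ORIGINAL_LEFT, H2O),
    "OH- + H2 -> H2O + H + e-": (OH_MINUS + H2, H2O),
    "OH- + H2 -> H2O + H+ + 2e-": (OH_MINUS + H2, H2O + H_PLUS),
    "OH- + H2 -> H2O + H-": (OH_MINUS + H2, H2O + H_MINUS),
}

for name, (left, right) in variants.items():
    verdict = "runs right" if runs_right(left, right) else "does not run right"
    print(f"{name}: left {left:.3f} eV, right {right:.3f} eV, {verdict}")
```

Only the last variant satisfies the criterion, in agreement with the conclusion drawn above.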
A Reaction Involving Six Atoms
In [5], Clary also cited a six-atom reaction, printed as H + CH₄ → CH₃ + H, but probably really meaning H + CH₄ → CH₃ + H₂, for which the ‘wave-packet’ calculations require significant approximations [8], but for which some recent progress has been achieved by exploiting permutations of identical atoms [9]. We can also look at this reaction from the viewpoint of Algebraic Chemistry.
Analysis of the Left Side of the Reaction, H + CH₄:
Analysis of H: None required.
Analysis of CH₄: From Sect. 5, forming the ions in the CH₄ molecule takes something like −139.401 eV’s.
So forming the ions involved in the left side of the reaction, H + CH₄, takes something like −139.401 eV’s.
Analysis of the Right Side of the Reaction, CH₃ + H₂:
Analysis of CH₃: Probable electron assignments: ₆C, 9; ₁H, all zero.
Relevant model data on ₆C: From Sect. 2, the transition ₆C → ₆C⁻⁻⁻ takes

$$ -IP_{1,1}\left(\sqrt{6\times 9}+\sqrt{6\times 8}+\sqrt{6\times 7}\right)/M_6 , $$

or

$$ -14.250\,(7.348+6.928+6.481)/12.011 = -24.626 \ \text{eV’s}, $$

and

$$ -\left(\Delta IP_{1,9}\times 9-\Delta IP_{1,6}\times 6\right)/M_6 , $$

or

$$ -(20.254\times 9-7.320\times 6)/12.011 = -(182.286-43.92)/12.011 = -11.520 \ \text{eV’s}. $$

Relevant model data on ₁H: From Sect. 5, H₂, the transition H → H⁺ takes 14.137 eV’s, so creating 3 H⁺ ions takes 3 × 14.137 = 42.411 eV’s.
Interpretation concerning CH₃: Forming the ions in the CH₃ molecule takes
−24.626 − 11.520 + 42.411 = 6.265 eV’s. Observe that this is significantly positive, meaning this molecule is not easy to make.
Analysis of H₂: From Sect. 5, the formation of the ions in the H₂ molecule takes 14.137 − 90.675 = −76.538 eV’s.
So forming the ions involved in the right side of the reaction, CH₃ + H₂, takes 6.265 − 76.538 = −70.273 eV’s. This is less negative than the −139.401 eV’s for forming the ions on the left side of the reaction, H + CH₄, so the reaction H + CH₄ → CH₃ + H₂ does not run to the right as it is depicted.
But again, some aspects of the reaction H + CH₄ → CH₃ + H₂ are not realistic: the
H seen on the left is usually seen as H⁺ or H⁻, and the CH₃ seen on the right is usually seen as (CH₃)⁻. Since making a reaction that can run to the right requires less negative on the left and/or more negative on the right, the best candidate reaction appears to be: H⁺ + CH₄ + 2e⁻ → (CH₃)⁻ + H₂.
Change on the left side: From Sect. 5, H₂, the transition H → H⁺ takes 14.137 eV’s. So forming all the ions on the left side of the reaction now takes 14.137 − 139.401 = −125.264 eV’s.
Change on the right side: Analysis of (CH₃)⁻: Probable electron assignments: ₆C, 10; ₁H, all zero. Relevant model data on ₆C: From Sect. 5, CH₄, the transition ₆C → ₆C⁻⁻⁻⁻ takes −54.629 eV’s. So forming all the ions on the right side of the reaction now takes −54.629 + 42.629 − 76.538 = −88.538 eV’s. This is not enough change to make a reaction that runs to the right as depicted. So this family of reactions appears not very promising for study with the techniques of Quantum Chemistry.
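The two carbon terms used for CH₃ can be recomputed directly from the Table 1 entries. The sketch below assumes that the worked example generalizes as written, with one square-root term per electron count reached and a two-term ΔIP difference; the function names are illustrative only. Because it uses unrounded square roots, its results differ from the quoted figures in the third decimal place.

```python
from math import sqrt

IP_HYDROGEN = 14.250               # model IP of hydrogen, IP_{1,1} (Table 1)
MASS_C = 12.011                    # carbon mass M_6 (Table 1)
Z_C = 6                            # carbon nuclear charge
DELTA_IP = {6: 7.320, 9: 20.254}   # model ΔIP at Z = 6 and Z = 9 (Table 1)
H_TO_H_PLUS = 14.137               # H -> H+, quoted from Sect. 5

def baseline_term(z_nuc, n_start, n_end, mass):
    """-IP_H times the sum of sqrt(z_nuc * n) over the electron counts
    reached (n_start+1 .. n_end), divided by the nuclear mass."""
    counts = range(n_start + 1, n_end + 1)
    return -IP_HYDROGEN * sum(sqrt(z_nuc * n) for n in counts) / mass

def increment_term(n_start, n_end, mass):
    """Two-term ΔIP contribution: end state minus beginning state."""
    return -(DELTA_IP[n_end] * n_end - DELTA_IP[n_start] * n_start) / mass

base = baseline_term(Z_C, 6, 9, MASS_C)    # close to the text's -24.626
incr = increment_term(6, 9, MASS_C)        # close to the text's -11.520
ch3 = base + incr + 3 * H_TO_H_PLUS        # close to the text's 6.265

print(f"baseline {base:.3f} eV, increment {incr:.3f} eV, CH3 total {ch3:.3f} eV")
```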
8. Results and Discussion
At the present time, many of our most talented people, armed with our most powerful computing capabilities, are committed to applying traditional Quantum Mechanics to the problems of interest for Quantum Chemistry. In this book, you are likely to see many current accomplishments reported, along with future agendas laid out. So Quantum Chemistry is today a work in progress. The exercises documented in the present paper demonstrate that while the work in Quantum Chemistry continues to develop, we can at the same time accomplish some preliminary analyses by applying the techniques of Algebraic Chemistry. Such exercises can be useful, for example, in planning computational and experimental investigations. We can
calculate the amount of heat that the formation of an atomic ion in a molecule will consume (negative heat for natural formation). We can predict if a hypothetical chemical reaction may fail to transpire naturally (ion-formation heat more negative on the left side of the reaction than on the right side). Extensions of the present work are also possible. One that is important, although beyond the scope of the present paper, is the calculation of energies for lots more ions, in lots more molecules, involved in lots more reactions. Another extension is less obvious, but just as necessary: the calculation of energies, not just for formation of the ions in molecules, but also for the settlement of ions into molecules, and the organization of molecules into the particular states of matter for which their data are quoted: solid state, liquid state, or gas state. And sometimes data might be quoted for even more specifically described states, such as: ‘solid crystal’, ‘amorphous solid’, ‘super-conducting’, ‘polymer’, ‘plastic’,… ‘pure liquid’, ‘aqueous solution’, ‘super fluid’,…, ‘molecular beam’, ‘plasma’,…etc. So there is a lot of scope for future development in the line of work reported here.
Appendix 1. Essential Data about First-Order Ionization Potentials
The following Tables capture the input data that one needs to conduct analyses of the type introduced in this paper.

Table 1. Periods 1, 2 and 3
Element  Charge  Mass     Ionization  IP = Ionization   Model    Model
         Z       M        Potential   Potential × M/Z   IP       ΔIP
H        1       1.008    13.610      13.718            14.250   0
He       2       4.003    24.606      49.244            49.875   35.625
Li       3       6.941    5.394       12.480            12.469   −1.781
Be       4       9.012    9.326       21.011            23.327   9.077
B        5       10.811   8.309       17.966            17.055   2.805
C        6       12.011   11.266      22.551            21.570   7.320
N        7       14.007   14.544      29.101            27.281   13.031
O        8       15.999   13.631      27.260            27.281   13.031
F        9       18.998   17.438      36.810            34.504   20.254
Ne       10      20.180   21.587      43.562            43.641   29.391
Na       11      22.990   5.145       10.753            10.910   −3.340
Mg       12      24.305   7.656       15.506            16.565   2.315
Al       13      26.982   5.996       12.444            14.923   0.673
Si       14      28.086   8.154       16.357            18.874   4.624
P        15      30.974   10.498      21.677            23.871   9.621
S        16      32.066   10.373      20.790            23.871   9.621
Cl       17      35.453   12.977      27.063            30.192   15.942
Ar       18      39.948   15.778      35.017            38.186   23.936
The data are separated into blocks corresponding to the periods in the Periodic Table. A few useful comments are interleaved with the blocks to complete the display. The IP Model to which the Tables refer is detailed in Appendix 2. Observe the M/Z scaling that is introduced to convert the raw ionization-potential data into the IP data to be modeled mathematically. The reason for the M/Z scaling emerges from the physical model in Appendix 3. Basically, the physical model shows that the measurable ionization potentials of elements do, to first approximation, scale with Z/M. So raw ionization-potential data is very element specific, and in fact it is very isotope specific. To create information that is not so element/isotope specific, we remove that Z/M scaling by applying its inverse M/Z. Observe finally that these Tables refer only to neutral atoms. The scaling to similar information for ions is worked out in Appendix 3.

Table 2. Period 4
Element  Charge  Mass     Ionization  IP = Ionization   Model    Model
         Z       M        Potential   Potential × M/Z   IP       ΔIP
K        19      39.098   4.346       8.944             9.546    −4.704
Ca       20      40.078   6.120       12.265            13.057   −1.193
Sc       21      44.956   6.546       14.013            13.057   −1.193
Ti       22      47.867   6.826       14.851            13.638   −0.612
V        23      50.942   6.743       14.934            14.244   −0.006
Cr       24      51.996   6.774       14.676            14.877   0.627
Mn       25      54.938   7.438       16.345            15.539   1.289
Fe       26      55.845   7.873       16.911            15.539   1.289
Co       27      58.933   7.863       17.163            16.229   1.980
Ni       28      58.693   7.645       16.026            16.951   2.701
Cu       29      63.546   7.728       16.934            17.705   3.455
Zn       30      65.390   9.398       20.485            18.492   4.242
Ga       31      69.723   6.006       13.509            14.494   0.244
Ge       32      72.610   7.905       17.936            17.860   3.610
As       33      74.922   9.824       22.303            22.007   7.757
Se       34      78.960   9.761       22.669            22.007   7.757
Br       35      79.904   11.826      26.998            27.116   12.866
Kr       36      83.800   14.015      32.623            33.412   19.162
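As a quick illustration of how the table columns relate, the sketch below recomputes the M/Z-scaled IP and the model ΔIP for a few Table 1 rows. The function names are illustrative; the recomputed scaled values agree with the tabulated ones to within about 0.01 eV, presumably because the tabulated inputs are rounded.

```python
# A few rows copied from Table 1:
# (Z, M, raw IP in eV, tabulated IP x M/Z, model IP).
TABLE_1_ROWS = {
    "H":  (1, 1.008, 13.610, 13.718, 14.250),
    "He": (2, 4.003, 24.606, 49.244, 49.875),
    "Li": (3, 6.941, 5.394, 12.480, 12.469),
    "C":  (6, 12.011, 11.266, 22.551, 21.570),
    "O":  (8, 15.999, 13.631, 27.260, 27.281),
}
MODEL_IP_HYDROGEN = 14.250  # baseline IP_{1,1}

def scaled_ip(z, mass, raw_ip):
    """Remove the Z/M dependence of a raw ionization potential."""
    return raw_ip * mass / z

def delta_ip(model_ip):
    """Increment of a model IP over the hydrogen baseline."""
    return model_ip - MODEL_IP_HYDROGEN

for name, (z, mass, raw, tabulated, model) in TABLE_1_ROWS.items():
    print(f"{name:2s} scaled={scaled_ip(z, mass, raw):7.3f} "
          f"(table {tabulated:7.3f})  dIP={delta_ip(model):7.3f}")
```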
One key to the task of modeling ions is to separate the IP’s into two parts: one being a baseline amount corresponding to the Hydrogen IP, $IP_{1,1} = 14.250$ eV’s, and the other being the increment from the Hydrogen IP; i.e., $\Delta IP_{1,Z} = IP_{1,Z} - IP_{1,1} = IP_{1,Z} - 14.250$ eV’s. For neutral atoms, the two parts both scale with Z. For ions, the Z-scaling generalizes differently for the two parts. The baseline 14.250 eV’s scales with $\sqrt{Z_e Z_M}$, where $Z_e$ is the electron count and $Z_M$ is the nuclear charge (Appendix 3). The ΔIP just scales with $Z_e$.
The way all these data are used to work out information for ions is to sum up the information for the desired number of single-electron removals. In the case of the baseline 14.250 eV contributions, that results in a sum of various square roots. In the case of the ΔIP contributions, a lot of cancellations occur, leaving just two terms, one from the beginning state and one from the end state.
Observe that the Periods differ in length, and the remaining ones will be much longer than these first three. The periods also differ in dips from the end of one period to the start of the next period. The dips go: 1/4, 1/4, 1/4, and then 2/7 (the inverse of the 7/2) thereafter. That is, Period 2 sits at a factor 7/2 × 1/4 = 7/8 relative to Period 1, Period 3 at 7/8 relative to Period 2, and Period 4 at 7/8 relative to Period 3. All the rest of the periods will start, and end, just where Period 4 did.
The result of the dips is that the elements mid period generally have IP’s that are very similar to that of Hydrogen. There are other functional similarities as well, so that the whole list of elements that are exactly mid period is called out and featured by the terminology ‘keystone elements’ (Appendix 4).

Table 3. Period 5
Element  Charge  Mass     Ionization  IP = Ionization   Model    Model
         Z       M        Potential   Potential × M/Z   IP       ΔIP
Rb       37      87.620   5.695       9.657             9.546    −4.704
Sr       38      88.906   6.390       13.132            13.057   −1.193
Y        39      91.224   6.846       14.567            13.057   −1.193
Zr       40      92.906   6.888       15.614            13.638   −0.612
Nb       41      92.906   6.888       15.608            14.244   −0.006
Mo       42      95.940   7.106       16.232            14.877   0.627
Tc       43      98.000   7.282       16.597            15.539   1.289
Ru       44      101.070  7.376       16.942            15.539   1.289
Rh       45      102.906  7.469       17.080            16.230   1.980
Pd       46      106.420  8.351       19.319            16.951   2.701
Ag       47      107.868  7.583       17.403            17.705   3.455
Cd       48      112.411  9.004       21.087            18.492   4.242
In       49      114.818  5.788       13.563            14.494   0.244
Sn       50      118.710  7.355       17.462            17.860   3.610
Sb       51      121.760  8.651       20.655            22.007   7.757
Te       52      127.600  9.015       22.120            22.007   7.757
I        53      126.904  10.456      25.037            27.116   12.866
Xe       54      131.290  12.137      29.508            33.412   19.162
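The rise-and-dip rule stated before Table 3 can be checked against the first and last Model IP entries of each period. The sketch below assumes only what the text states: a rise of 7/2 within every period, dips of 1/4 for the first three period changes and 2/7 thereafter, and a starting value of 14.250 eV for Hydrogen. The function name is illustrative.

```python
RISE = 7 / 2
# Dips between Periods 1-2, 2-3, 3-4, then 4-5, 5-6, 6-7.
DIPS = [1 / 4, 1 / 4, 1 / 4, 2 / 7, 2 / 7, 2 / 7]

def period_endpoints(first_start=14.250):
    """Return (start, end) model IP values for Periods 1 through 7."""
    endpoints = []
    start = first_start
    for dip in DIPS + [None]:
        end = start * RISE
        endpoints.append((start, end))
        if dip is not None:
            start = end * dip
    return endpoints

for period, (start, end) in enumerate(period_endpoints(), start=1):
    print(f"Period {period}: starts at {start:.3f}, ends at {end:.3f}")
```

The starts reproduce the Model IP entries for H, Li, Na and K (14.250, 12.469, 10.910, 9.546), and the ends reproduce those for He, Ne, Ar and Kr (49.875, 43.641, 38.186, 33.412); from Period 4 onward the endpoints repeat.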
Table 4. Period 6
Element  Charge  Mass     Ionization  IP = Ionization   Model    Model
         Z       M        Potential   Potential × M/Z   IP       ΔIP
Cs       55      132.905  3.900       9.425             9.546    −4.704
Ba       56      137.327  5.218       12.796            13.057   −1.192
La       57      138.906  5.581       13.600            12.393   −1.857
Ce       58      140.116  5.477       13.232            12.583   −1.667
Pr       59      140.908  5.425       12.957            12.776   −1.474
Nd       60      144.240  5.498       13.217            12.972   −1.278
Pm       61      145.000  5.550       13.192            13.171   −1.079
Sm       62      150.360  5.633       13.660            13.374   −0.876
Eu       63      151.964  5.674       13.687            13.579   −0.671
Gd       64      157.250  6.141       15.089            13.579   −0.671
Tb       65      158.925  5.851       14.305            13.787   −0.463
Dy       66      162.500  5.934       14.609            13.999   −0.251
Ho       67      164.930  6.027       14.836            14.213   −0.037
Er       68      167.260  6.110       15.029            14.431   0.181
Tm       69      168.934  6.183       15.137            14.653   0.403
Yb       70      170.040  6.255       15.463            14.878   0.627
Lu       71      174.967  5.436       13.395            17.860   3.610
Hf       72      178.490  7.054       17.487            18.755   4.505
Ta       73      180.948  7.894       19.568            19.696   5.446
W        74      183.840  7.988       19.844            20.684   6.434
Re       75      186.207  7.884       19.574            21.721   7.471
Os       76      190.230  8.714       21.811            21.721   7.471
Ir       77      192.217  9.129       22.788            22.811   8.560
Pt       78      195.076  9.025       22.571            23.955   9.705
Au       79      196.967  9.232       23.019            25.156   10.906
Hg       80      200.530  10.446      26.184            26.418   12.168
Tl       81      204.383  6.110       15.417            16.515   2.265
Pb       82      207.200  7.427       18.768            19.696   5.446
Bi       83      208.980  7.293       18.361            23.490   9.240
Po       84      209.000  8.423       20.958            23.490   9.240
At       85      210.000                                28.015   13.765
Rn       86      222.000  10.757      27.769            33.412   19.164
There are many tantalizing facts modeled but not yet understood:
• Observe that the IP Model rise on all periods is 7/2. The meaning of the universal factor 7/2 is not yet known; it is at present just a fact of Nature.
• Observe that, except for Period 1, the ΔIP always starts negative at the beginning of a period, and ends positive at the end of the period. And the positive increments are larger than the negative ones. The reason is that all this data falls into a simple pattern when plotted on a log scale, as shown in Appendix 2. On the log scale, the increments up and down are similar in magnitude.
• Observe too that within each period, there are sub periods. These correspond to runs of nominal single-electron quantum states being filled. On the log scale, the sub periods form straight-line segments. That means they follow power laws. And the slopes are simple functions of the quantum numbers of the nominal single-electron states being filled. The physical meaning of these functions is not yet known.

Table 5. Period 7
Element  Charge  Mass     Ionization  IP = Ionization   Model    Model
         Z       M        Potential   Potential × M/Z   IP       ΔIP
Fr       87      223.000                                9.546    −4.704
Ra       88      226.000  5.280       13.560            13.057   −1.193
Ac       89      227.000  6.950       17.727            12.393   −1.857
Th       90      232.038  6.089       15.699            12.583   −1.667
Pa       91      231.036  5.892       14.959            12.776   −1.474
U        92      238.029  6.203       16.050            12.972   −1.277
Np       93      237.000  6.276       15.994            13.171   −1.079
Pu       94      244.000  6.068       15.752            13.374   −0.876
Am       95      243.000  5.996       15.337            13.579   −0.671
Cm       96      247.000  6.027       15.507            13.579   −0.671
Bk       97      247.000  6.234       15.875            13.787   −0.463
Cf       98      251.000  6.307       16.154            13.999   −0.251
Es       99      252.000  6.421       16.345            14.213   −0.037
Fm       100     257.000  6.504       16.716            14.431   0.181
Md       101     258.000  6.587       16.827            14.653   0.403
No       102     259.000  6.660       16.911            14.877   0.627
Lr       103                                            17.859   3.610
Rf       104                                            18.755   4.505
Db       105                                            19.696   5.446
Sg       106                                            20.684   6.434
Bh       107                                            21.721   7.471
Hs       108                                            21.721   7.471
Mt       109                                            22.811   8.561
Uun      110                                            23.955   9.705
Uuu      111                                            25.156   10.906
Uub      112                                            26.418   12.168
???      113                                            16.515   2.265
???      114                                            17.019   2.769
???      115                                            23.490   9.240
???      116                                            23.490   9.240
???      117                                            28.015   13.765
???      118                                            33.412   19.162
There are two more periods remaining, and their Tables have some blank spaces where this author did not find raw data available. Predictions, however, are no problem to provide, and are listed in anticipation that the real data will eventually emerge for comparison.
Appendix 2. The Algebraic Model for Ionization Potentials

[Figure A2.1: plot with a logarithmic vertical axis running from 1 to 10000 and a horizontal axis running from 0 to 120.]
Figure A2.1. Ionization potentials, scaled by M/Z and modeled algebraically.
Figure A2.1 depicts the behavior of IP’s for all elements (nuclear charge Z = 1 to Z = 120 shown). Element Z actually allows Z ionization potentials, but for larger Z, many IP’s are not so easy to measure. Readily available data go only to seventh order, so that is how many orders are shown here.
The points on Figure A2.1 are measured IP’s in eV, scaled for comparison with each other as indicated by the new theory summarized in the next Appendix. The scale factor M/Z, where M is nuclear mass number and Z is nuclear charge, is in no way indicated by traditional QM.
The lines on Figure A2.1 represent the algebraic model for IP’s, rendered in its current best state of development. The model is capable of producing plausible estimates for all M/Z-scaled IP’s for all ionization orders IO, even beyond those measured, and all Z’s, even beyond those known to exist.
The model-development approach is an example of ‘data mining’. Figure A2.1 has fewer than 400 out of approximately 5000 desired data points. But that is enough data points to reveal a pattern. The work involved is a good example of continuing positive feedback between theory and experiment. Theory shows what to look for; experiment shows what to try to understand.
The first step in model development was fundamentally observational: for IO = 1, with M/Z scaling, there are consistent rises on periods, and consistent mid-period similarity to Hydrogen (Z = 1). For IO > 1, there is consistent scaling with IO. There are several ways that the scaling can be described, and the simplest way found so far is summarized as follows:
1) First-order IP’s contain ALL the information necessary to predict ALL higher-order IP’s via scaling;
2) Every ionization potential IP of any order IO can be expressed as a function of at most two first-order IP’s;
3) For a given ionization order IO > 1, the ionization potentials for all elements start at element Z = IO, and follow a pattern similar to the IP’s for IO = 1, except for a shift to the right and a moderation of excursions.
Details are given in [1].
Figure A2.2. First-order IP’s: map of main highways through the periods.
Only the first-order IP s are needed in the present work. They follow a definite pattern, detailed below. For every period, the rise is 7 / 2 , and the drops from one period to the next start at 7 / 8 and go to unity. Figure A2.2 shows the data used and the pattern inferred concerning periods. Within each period, there are internal rises keyed to the nominal quantum numbers of single-electron states being successively filled. Eqs. (A2-1a-1b) and Figure A2-3 give this level of detail.
$$ \text{incremental rise} = \text{total rise} \times \text{fraction} \tag{A2-1a} $$

$$ \text{fraction} = \left[(2l+1)/N^2\right]\left[(N-l)/l\right] \tag{A2-1b} $$

N    l  fraction    l  fraction    l  fraction    l  fraction
1    0  1
2    0  1/2         1  3/4
2    0  1/3         1  3/4
3    0  1/4         2  5/18        1  2/3
3    0  1/4         2  5/18        1  2/3
4    0  1/4         3  7/48        2  5/16        1  9/16
4    0  1/4         3  7/48        2  5/16        1  9/16

Figure A2.3. First-order IP’s: map of local roads through the periods.
The pattern described above has been inferred from the data. Let us be quite clear: knowledge about the pattern is presently wholly ‘ontic’ (about what the observable facts are); we also need knowledge that is ‘epistemic’ (about why the facts are that way). We definitely do not have this yet.
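The fraction rule of Eq. (A2-1b) can be checked mechanically against the tabulated entries for l ≥ 1. The sketch below uses exact rational arithmetic; the function name is an illustrative choice, and the l = 0 entries are deliberately excluded because the formula as printed divides by l, so those values are taken here as tabulated rather than computed.

```python
from fractions import Fraction

def sub_period_fraction(n, l):
    """Eq. (A2-1b): [(2l + 1) / N^2] * [(N - l) / l], valid for l >= 1."""
    if l < 1:
        raise ValueError("Eq. (A2-1b) divides by l; l = 0 entries are tabulated separately")
    return Fraction(2 * l + 1, n * n) * Fraction(n - l, l)

# Entries for l >= 1 as they appear with Figure A2.3, keyed by (N, l).
TABULATED = {
    (2, 1): Fraction(3, 4),
    (3, 2): Fraction(5, 18),
    (3, 1): Fraction(2, 3),
    (4, 3): Fraction(7, 48),
    (4, 2): Fraction(5, 16),
    (4, 1): Fraction(9, 16),
}

for (n, l), expected in TABULATED.items():
    computed = sub_period_fraction(n, l)
    print(f"N={n}, l={l}: computed {computed}, tabulated {expected}")
```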
Appendix 3. Hydrogen, the Basis for all Scaling Laws
The work reported in this paper derives from a sequence of earlier works [1-4] that are reviewed very briefly in this Appendix. The story really goes all the way back to Maxwell [4]. The differential equations that he bequeathed us admit a large diversity of particular solutions. In any application, the trick lies in choosing the right particular solution to fit the problem. An imperfect choice was made around the turn of the twentieth century. The problem was to describe an electromagnetic signal as the basis for developing special relativity theory (SRT). Einstein imagined a signal pulse, presumably of finite energy, that would propagate like a wave, infinite in extent and infinite in energy, forever undistorted at speed $c = 1/\sqrt{\varepsilon_0 \mu_0}$, where ε₀ and μ₀ are the electric permittivity and magnetic permeability of free space.
Actually, that scenario isn’t possible. We were given a warning in the well-known phenomenon of diffraction: when limited in directions transverse to the nominal propagation direction, electromagnetic waves always develop a spread in propagation directions. We should have wondered what would happen if the electromagnetic wave were also confined in the longitudinal direction, as it would have to be, to make a wave packet of finite energy. Would some kind of spread result? Indeed, wouldn’t there have to be some kind of longitudinal spread in order for the concept of ‘wavelength’ ever to become applicable? Well yes, it turns out that, following emission from a source, a pulse expands in the longitudinal direction, and preceding absorption into a receiver, the spread-out energy distribution contracts to a pulse again.
Redeveloping SRT to use the more realistic model for an electromagnetic signal leads to a new theory that is expanded with respect to the original one. I call it ‘Expanded SRT’. It includes all the symbols and formulae that the original SRT contains. But it also includes an additional symbol, and additional formulae, and more precise interpretations of the old ones. The additional symbol is V for old-fashioned Galilean speed, distinguished from Einsteinian speed v , which is limited to light speed c . Galilean speed is not limited. With SRT thus extended, it is appropriate to review decisions that were based on the earlier ideas. One of these concerned the applicability of Maxwell’s electromagnetic theory (EMT) to problems at the small scale of atoms. Traditional QM was developed on the presumption that Maxwell theory could not apply. The reasoning was that an electron orbiting around a nucleus should produce radiation, which would rob the orbit of energy, causing collapse of the atomic system. But the Expanded SRT shows that the system also has a second physical process going on; namely, internal torquing. This one provides an energy gain mechanism. So it is possible to have the atomic system persist with a balance between the two physical processes. The new internal torquing mechanism is characterized by an energy gain rate
$$ P_T = \frac{e^4/m_p}{c\,(r_e+r_p)^3} . \tag{A3-1} $$

The more familiar radiation mechanism is characterized by an energy loss rate

$$ P_R = \frac{2^5 e^6/m_e^2}{3c^3\,(r_e+r_p)^4} . \tag{A3-2} $$

This is enhanced by a factor of $2^4$ over the totally familiar Larmor dipole formula. The enhancement is caused by Thomas rotation of the atomic system, which is in turn caused by non-central forces, which were absent from Newtonian physics, and were not originally noticed in Maxwellian physics.
The balance between the two mechanisms occurs when $P_T = P_R$. At the balance point for the Hydrogen system, we have Eq. (1) in the main text; i.e.,

$$ \frac{e^2}{r_e+r_p} = \frac{3c^2 m_e^2}{2^5 m_p} \tag{A3-3} $$

for the magnitude of the potential energy of the one electron in the Hydrogen atom. Eq. (A3-3) provides the basis from which to build. The extension to Deuterium and/or Tritium requires that the proton mass $m_p$ be replaced with a more generic nuclear mass $M$ and that $r_p$ be replaced by $r_M$. Then we have Eq. (2) in the main text; i.e.,

$$ \frac{e^2}{r_e+r_M} = \frac{3c^2 m_e^2}{2^5 M} \tag{A3-4} $$
for the magnitude of the potential energy of this more massive system.
The extension for a neutral atom with nuclear charge number Z involves Z electrons as well. To develop an opinion on this question, we must return to Eqs. (A3-1) and (A3-2). All the factors of $e^2$ change to $Z^2 e^2$, and the factor of $m_e^2$ changes to $Z^2 m_e^2$. $P_T = P_R$ becomes $Z^4 P_T = (Z^6/Z^2) P_R = Z^4 P_R$. So nothing happens to the equality (A3-4). But for the more charged system, the magnitude of the potential energy becomes Eq. (3) in the main text; i.e.,

$$ \frac{Z^2 e^2}{r_e+r_M} = Z^2\,\frac{3c^2 m_e^2}{2^5 M} . \tag{A3-5} $$

This scaled-up expression represents the magnitude of the total potential energy of the system involving Z protons and Z electrons. What is then comparable to the ionization potential for removing a single electron is Eq. (4) in the main text; i.e.,

$$ \frac{Z e^2}{r_e+r_M} = Z \times \frac{3c^2 m_e^2}{2^5 M} \equiv \left(\frac{Z}{M}\right) \times \left(\frac{3c^2 m_e^2}{2^5}\right) . \tag{A3-6} $$

Thus we see the Z/M scaling that is predicted for raw-data ionization potentials, and so is cancelled out by M/Z scaling to produce the IP’s documented in Appendix 1 and used in Appendix 2.
More generally, if the atom is in an ionized state, we have a distinct electron count $Z_e$ and proton count $Z_p$. For the baseline nuclear-orbit part, we have for the total system Eq. (5) in the main text; i.e.,

$$ \frac{Z_p Z_e e^2}{r_e+r_M} = \left(\frac{Z_p Z_e}{M}\right) \times \left(\frac{3c^2 m_e^2}{2^5}\right) . \tag{A3-7} $$

What is then generally comparable to the nuclear-orbit part of the ionization potential for removing a single electron? To develop an opinion on this question, we must return again to Eqs. (A3-1) and (A3-2). Clearly, all of the factors of $e^2$ change to $Z_p Z_e e^2$. It is as if all factors of $e$ changed to $\sqrt{Z_p Z_e}\,e$. Removal of one electron is then like removal of one $\sqrt{Z_p Z_e}\,e$ charge. What is comparable to the ionization potential for removing a single electron is then Eq. (6) in the main text; i.e.,

$$ \frac{\sqrt{Z_p Z_e}\,e^2}{r_e+r_M} = \left(\frac{\sqrt{Z_p Z_e}}{M}\right) \times \left(\frac{3c^2 m_e^2}{2^5}\right) . \tag{A3-8} $$
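The algebra connecting Eqs. (A3-1) and (A3-2) to the balance condition (A3-3) can be checked numerically. The sketch below uses arbitrary positive test values rather than physical constants, since only the algebraic identity is being tested; the function names are illustrative, and the fraction forms of $P_T$ and $P_R$ are read from the printed equations.

```python
from math import isclose

def torque_gain(e, m_nuc, c, r):
    """Energy gain rate, Eq. (A3-1): (e^4 / m_nuc) / (c r^3)."""
    return (e**4 / m_nuc) / (c * r**3)

def radiation_loss(e, m_e, c, r):
    """Energy loss rate, Eq. (A3-2): (2^5 e^6 / m_e^2) / (3 c^3 r^4)."""
    return (2**5 * e**6 / m_e**2) / (3 * c**3 * r**4)

def balance_separation(e, m_e, m_nuc, c):
    """Separation r = r_e + r_p solving e^2 / r = 3 c^2 m_e^2 / (2^5 m_nuc)."""
    return 2**5 * m_nuc * e**2 / (3 * c**2 * m_e**2)

# Arbitrary test values, NOT physical constants.
e, m_e, m_p, c = 1.3, 0.7, 11.0, 5.0
r = balance_separation(e, m_e, m_p, c)
print(torque_gain(e, m_p, c, r), radiation_loss(e, m_e, c, r))

# Scaling e -> Z e and m_e -> Z m_e leaves the balance separation unchanged,
# as stated for the neutral atom of nuclear charge Z.
Z = 6
print(r, balance_separation(Z * e, Z * m_e, m_p, c))
```

In both printed pairs the two numbers coincide, the first expressing $P_T = P_R$ at the balance point and the second expressing that the equality is unaffected by the Z scaling.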
Appendix 4. The Periodic Arch Figure A4.1 shows a new and convenient presentation of the Periodic Table from [1]. It is called the ‘Periodic Arch’ (PA).
Figure A4.1. The Periodic Arch (PA).
The ‘keystone’ elements referred to in the main text are the ones that fall at the keystone positions of the successive layers of the PA. They are important as facilitators of molecule formation because they can give or take electrons in equal numbers. The terminology ‘keystone element’ is appropriate for many other reasons as well. Hydrogen 1 H is certainly the ‘keystone’ for all of present-day physical analysis of atoms. Carbon 6 C is certainly the ‘keystone’ for all of organic chemistry and biological life. Silicon 14 Si is certainly the ‘keystone’ for present-day technological life. Cobalt 27 Co is not so famous, but it lies between Iron 26 Fe and Nickel 28 Ni , and is functionally better than either of them: harder, more corrosion resistant, and more heat resistant. And, like Iron, it is much strengthened by the addition of a trace of Carbon – that other keystone element. We humans are currently more than three millennia into our ‘Iron Age’, which, with the help of Carbon and other trace additives, has morphed into our ‘Steel Age’. Had Cobalt been more plentiful on this planet, this might have been our ‘Cobalt/Steel Age’. Rhodium 45 Rh is also not so famous, mainly
because it is not so plentiful, but it is good for plating and alloying. The remaining keystone elements, Ytterbium ₇₀Yb and Nobelium ₁₀₂No, have yet to be much exploited.
Acknowledgments A number of individuals have played important roles in fostering the development of this whole line of research. I am especially grateful to Michael C. Duffy, Ruggero M. Santilli, and Mihai V. Putz.
References and Notes
[1] Whitney, C.K. Closing in on Chemical Bonds by Opening up Relativity Theory. Int. J. Mol. Sci. 2008, 9, 272-298.
[2] Whitney, C.K. Single-Electron State Filling Order Across the Elements. Int. J. Chem. Model. 2008, 1, 105-135.
[3] Whitney, C.K. Visualizing Electron Populations in Atoms. Int. J. Chem. Model. 2008, 1, 245-297.
[4] Whitney, C.K. Recent Progress in Algebraic Chemistry. Computational Chemistry: New Research, Frank Columbus, Ed., Nova Science Publishers, in press.
[5] Clary, D.C. Quantum Dynamics of Chemical Reactions. Science 2008, 321, 789-791.
[6] Zhang, D.H.; Zhang, J.Z.H. J. Chem. Phys. 1994, 101, 1146.
[7] Zhang, D.H. J. Chem. Phys. 2006, 125, 133102.
[8] Yang, M.H.; Zhang, D.H.; Lee, S.Y. J. Chem. Phys. 2002, 117, 9539.
[9] Zhang, X.; Braams, B.; Bowman, J.M. J. Chem. Phys. 2006, 124, 021104.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 251-275
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 11
QUANTUM AND ELECTRODYNAMIC VERSATILITY OF ELECTRONEGATIVITY AND CHEMICAL HARDNESS
Mihai V. Putz*
Laboratory of Computational and Structural Physical Chemistry, Chemistry Department, West University of Timişoara, Str. Pestalozzi No.16, Timisoara, RO-300115, Romania
Abstract
Aiming to affirm the specific physical-chemical quantities of electronegativity and hardness as the major electronic indicators of structure and reactivity, their systematic density functional formulations, as well as their quantum and electrodynamic field formulations, are presented; these may serve for analytical studies of chemical bonding, reactivity and aromaticity, up to the modeling of the biological activity of atoms in molecules and in nanostructures.
* E-mail addresses: [email protected], [email protected]. Tel: +40-256-592633; Fax. +40-256-592620

1. Introduction
In recent years, first-rate scientific research has mainly focused on synergistic approaches to the structure and properties of natural complex systems at the quantum chemical level [1-7]. Yet, while pure physics still struggles with the great unification paradigm of the fundamental forces in nature, having been subject to continuous reform in recent decades, a similar attitude is now emerging in chemistry, at the quantum level of representation, in relation to the existing natural chemical bonds: the ionic, covalent, metallic, hydrogenic and van der Waals (as driven, induction and diffusion) ones. Because the types of chemical bonds coexist in various degrees and combinations throughout the organization of matter, only a unitary quantum treatment, based on first physical-chemical principles, can provide an estimate of the structure-property correlations across the complex natural nano-systems: metals, clusters, fullerenes, liquid crystals, polymers, ceramics, biomaterials, metalloenzymes.
As a synergistic field of research, the nanosystems have received many connotations. When the spatiality of the chemical bond is studied, e.g., when atomic systems condense into a smaller volume than that of the isolated components, the resulting composite nanosystems display exceptional coherence properties, subsequently used in processing, storing and communicating quantum information [8-10]. On the other hand, when dealing with the chemical concentration of elements, it has already been proven that the nano-molar range better reflects the complex bio-organic and bio-inorganic combinations, especially when focusing on the dose zones of response for an essential element with a role in the selection or inhibition of a certain biological function, with effects on the growth and reproduction of cells and living organisms [11].
Accordingly, a unitary picture that links and flexibly adapts the quantum mechanical formalisms to the chemical bonding level has been intensively sought [12]. In such studies, it was recently established that the adequate main variable for the quantum treatment of polyatomic combinations, for a system with N electrons, is the electronic density ρ(r) rather than the by now historical wave function ψ(r₁,...,r_N). This is because, in contrast to the wave function, the electronic density is an experimentally detectable quantity, is defined in the real three-dimensional space rather than in an abstract 3N-dimensional Hilbert space, and is directly related to the total number of electrons in the system concerned through the functional relation [13-20]:
$$ \int \rho(\mathbf{r})\, d\mathbf{r} = N \tag{1} $$
Therefore, the electronic density receives the central role within the newest quantum paradigm of matter, the Density Functional Theory (having Walter Kohn as its father, Nobel laureate in Chemistry for this theory in 1998) [15].
On the other hand, studies of the reactivity indices are essential for indicating the propensity of a multielectronic system to participate in a chemical reaction. At the molecular level, these indices are defined in order to quantitatively measure the chemical reactivity, while at the biomolecular level they are associated with the biological activity. Thus, as the reactivity indices are informationally placed at the interface between the stability of electronic systems and their tendency to transform and combine, they are mathematically introduced as integral functions of the electronic density, yielding the so-called electronic density functionals as an efficient tool for the global prediction of the electronic properties of the investigated nanosystems [21].
Although the Density Functional Theory offers concepts with an exact formal character, e.g., the density functionals, the reactivity indices or the localization functions, the computational effort needed to evaluate the electronic densities of polyatomic systems is often immense and is susceptible to errors from the numerical recipes used. Also, the chemical intuition is often entirely hidden within the routines and the basis sets chosen for the implementations [12]. Faced with such programmatic problems, a closed-form solution is sought, with a proper phenomenological impact on the qualitative-quantitative prediction of the chemical reactivity and the biological activity when characterizing the complex natural nanosystems.
Quantum and Electrodynamic Versatility of Electronegativity…
However, a revisited point of view has recently emerged: one should recall the basic atomic structure and treat only the valence shell as the main ingredient of chemical interactions. An intermediate concept and quantity between the single- and poly-atomic systems was then sought, such that an iterative construction that adequately describes the hierarchy of matter organization becomes possible and reflects the specific structure and interaction. In this context, it was established that the complete description of a polyatomic system, both at equilibrium and in interaction, may be realized through a minimal set of quantum observables containing an electronic density functional derived from the energy functional expansion [22,23]:
$$E[N+\Delta N] = E_0[N] + \left(\frac{\partial E}{\partial N}\right)_V \Delta N + \frac{1}{2}\left(\frac{\partial^2 E}{\partial N^2}\right)_V (\Delta N)^2 + \dots \tag{2}$$
When restricted to the second order, as in the most common perturbative approach, eq. (2) provides the relationship between the energy variation and the charge variation
$$\Delta E = -\chi\,\Delta N + \eta\,(\Delta N)^2 \tag{3}$$
by means of the chemical electronegativity [24-28]
$$\chi = -\left(\frac{\partial E}{\partial N}\right)_V \tag{4}$$
and chemical hardness [29-37]
$$\eta = \frac{1}{2}\left(\frac{\partial^2 E}{\partial N^2}\right)_V = -\frac{1}{2}\left(\frac{\partial \chi}{\partial N}\right)_V \tag{5}$$
indices, in the field of the externally applied potential V. This simple energy-charge correlation appears to furnish the basics of the chemical-physical phenomenology both in isolated and reactive states [38-44], as well as in bonding modeling [45-52], up to the most recent reactive biological activity (ReBiAc) principles [53]. In this context, the present work reviews a few of the most intriguing forms of electronegativity and chemical hardness, emphasizing the versatility of these reactivity indices in reflecting various energy density functionals and the second quantized or electrodynamical fields.
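The definitions (4) and (5) can be illustrated with central finite differences on a model energy curve. The sketch below assumes a purely quadratic E(N), i.e. eq. (2) truncated at second order; the reference energy, χ, η and N0 are hypothetical numbers, not data from the chapter.

```python
def model_energy(n, n0=6.0, e0=-100.0, chi=6.0, eta=5.0):
    """Quadratic model E(N) = E0 - chi*dN + eta*dN^2 around N0 (hypothetical values)."""
    dn = n - n0
    return e0 - chi * dn + eta * dn * dn

def electronegativity_fd(energy, n0, step=1.0):
    """Central-difference estimate of chi = -(dE/dN), eq. (4)."""
    return -(energy(n0 + step) - energy(n0 - step)) / (2.0 * step)

def hardness_fd(energy, n0, step=1.0):
    """Central-difference estimate of eta = (1/2) d2E/dN2, eq. (5)."""
    return (energy(n0 + step) - 2.0 * energy(n0) + energy(n0 - step)) / (2.0 * step ** 2)

chi_est = electronegativity_fd(model_energy, 6.0)
eta_est = hardness_fd(model_energy, 6.0)
print(chi_est, eta_est)
```

For a strictly quadratic curve the unit-step differences reproduce the input χ and η exactly; for a real E(N) they are only approximations.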
2. Density Functionals of Electronegativity and Hardness
2.1. Absolute Electronegativity and Hardness
To elaborate the implications of eqs. (3)-(5), consider modeling the energy consumed in forming the AB bond through the gauge equilibrium reactions [54,55]:
Mihai V. Putz
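$$A^- + B^+ \leftrightarrow AB \leftrightarrow A^+ + B^- \tag{6}$$
for which one may equivalently write
$$\begin{aligned}
\Delta E &= \left(E^A_{N_0-1} + E^B_{N_0+1}\right) - \left(E^A_{N_0+1} + E^B_{N_0-1}\right) \\
&= \left(E^A_{N_0-1} - E^A_{N_0}\right) + \left(E^A_{N_0} - E^A_{N_0+1}\right) - \left(E^B_{N_0-1} - E^B_{N_0}\right) - \left(E^B_{N_0} - E^B_{N_0+1}\right) \\
&= IP_A + EA_A - \left(IP_B + EA_B\right) = 2\left(\chi_A - \chi_B\right)
\end{aligned} \tag{7}$$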
thus establishing two important facts:
• The so-called Mulliken electronegativity, considered in (7) in terms of the ionization potential (IP) and electron affinity (EA) [56],
$$\chi_M \equiv \frac{IP + EA}{2} \tag{8}$$
may be viewed as the (chemical) finite difference approximation of the differential electronegativity (4),
$$\chi_M \equiv \frac{IP + EA}{2} = \frac{\left(E_{N_0-1} - E_{N_0}\right) + \left(E_{N_0} - E_{N_0+1}\right)}{2} \cong -\left(\frac{\partial E_N}{\partial N}\right)_V \equiv \chi, \tag{9}$$
while also furnishing the more general form of the absolute electronegativity by means of the integral [57,58]:
$$\chi_A = -\frac{1}{2}\int_{N_0-1}^{N_0+1} dE_N \tag{10}$$
• Eq. (7) prescribes that, in order to properly describe the reactive propensity of chemical systems, the change in electronegativity dχ also has to be considered, eventually averaged over the interval (N0-1, N0+1)
$$\frac{1}{2} = \frac{1}{\int_{N_0-1}^{N_0+1} dN} \tag{11}$$
in accordance with the introductory definition (5), leading to the so-called absolute hardness:
$$\eta_A = -\frac{1}{2}\int_{N_0-1}^{N_0+1} d\chi \tag{12}$$
which unfortunately cannot give particular information as long as χ(N) remains unknown. However, it is worth noting that taking eq. (11) as the factor in eq. (12) accounts for the average of the acidic (electron accepting, N0 ≤ N ≤ N0+1) and basic (electron donating, N0–1 ≤ N ≤ N0) behaviors, making it an inherent part of the hardness definition, although equally valid arguments for skipping it may be considered; in the end it is only a scaling factor, to be chosen as suits the problem at hand (and somewhat depending on the taste of the investigator) [59-61]. Now, unlike the differential expressions (4) and (5), the absolute electronegativity and hardness take the symmetrical forms (10) and (12) in terms of their “potential” (i.e., their cause): the energy (total for the ground state, valence for excited states) and the (differential) electronegativity, respectively. Moreover, eqs. (10) and (12) strongly advocate the total differentials of energy and electronegativity as furnishing the main thermodynamical (quantum) equations for a systematic development of chemical reactivity in terms of changing charge and applied potential, N and V(r), respectively. This symmetric electronegativity-hardness formulation also produces an elegant unification of the χ-theories and the hard-and-soft-acids-and-bases (HSAB) rules beyond the ionization potentials and electron affinities [43,44].
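The bookkeeping of eqs. (7)-(9) can be followed on two model species. The sketch below assumes that each species obeys the quadratic energy law of eq. (3); all energies and indices are hypothetical numbers used only to trace the algebra.

```python
def species_energies(e0, chi, eta):
    """E(N0-1), E(N0), E(N0+1) from the quadratic model of eq. (3); keys are dN."""
    return {-1: e0 + chi + eta, 0: e0, +1: e0 - chi + eta}

def mulliken(energies):
    """Return (chi_M, IP, EA) following eqs. (8)-(9)."""
    ip = energies[-1] - energies[0]   # ionization potential
    ea = energies[0] - energies[+1]   # electron affinity
    return 0.5 * (ip + ea), ip, ea

# Hypothetical species A and B
a = species_energies(-50.0, chi=7.0, eta=4.0)
b = species_energies(-30.0, chi=3.0, eta=2.0)

delta_e = (a[-1] + b[+1]) - (a[+1] + b[-1])   # first line of eq. (7)
chi_a, ip_a, ea_a = mulliken(a)
chi_b, ip_b, ea_b = mulliken(b)
print(delta_e, 2.0 * (chi_a - chi_b))         # both sides of eq. (7)
```

Within this model the Mulliken values coincide with the input χ of each species, so the energy balance of eq. (7) reduces to twice the electronegativity difference.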
2.2. Systematic Electronegativity and Hardness
Given the established dependence of the absolute electronegativity and hardness on energy and differential electronegativity through the forms (10) and (12), for an N-electronic system under the influence of the external potential V(r), their respective functionals E = E[N,V(r)] and χ = χ[N,V(r)] may next be employed either as total differential equations or as perturbative expansions in order to systematically obtain the electronegativity and chemical hardness density functionals. For the total differential equations of energy and electronegativity one has the working forms:
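$$dE = \left(\frac{\partial E}{\partial N}\right)_V dN + \int \left(\frac{\delta E}{\delta V(r)}\right)_N \delta V(r)\,dr = -\chi\,dN + \int \rho(r)\,\delta V(r)\,dr, \tag{13}$$
$$d\chi = \left(\frac{\partial \chi}{\partial N}\right)_V dN + \int \left(\frac{\delta \chi}{\delta V(r)}\right)_N \delta V(r)\,dr = -2\eta\,dN - \int f(r)\,\delta V(r)\,dr \tag{14}$$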
with the last relation introducing the Fukui index [45-50]
$$-\left(\frac{\delta \chi}{\delta V(r)}\right)_N = \left(\frac{\partial \rho(r)}{\partial N}\right)_V = f(r) \tag{15}$$
based on the Cauchy (cross-derivative) property of the exact total differential of eq. (13). The Fukui index locally reveals the reactive nature of the system, being a descriptor that measures how sensitive a system’s electronegativity is to an external perturbation at a particular point in space. These equations may now be employed to furnish various systematic realizations of electronegativity and hardness, in either chemical or absolute form. For instance, referring to the energy equation (13), and considering Δ as the change in the number of electrons restricted to the valence shell, it can be appropriately integrated:
$$E(N_v \pm \Delta) - E(N_v) = -\int_{N_v}^{N_v \pm \Delta} \chi\,dN + \int \rho_{N_v}(r)\left\{\int \delta[V(r)]\right\} dr = -\int_{N_v}^{N_v \pm \Delta} \chi\,dN + \int \rho_{N_v}(r)\,V_{N_v}(r)\,dr \tag{16}$$
to give the electron affinity and ionization potential working expressions
$$-EA = -\int_{N_v}^{N_v+1} \chi\,dN + C_A, \tag{17}$$
$$IP = -\int_{N_v}^{N_v-1} \chi\,dN + C_A \tag{18}$$
assuming the variation of electronic charge to be unity ( Δ = 1 ) around the number of concerned valence electrons, Nv; note the appearance of the so called chemical action [18, 62-64]
$$C_A = \int \rho_{N_v}(r)\,V_{N_v}(r)\,dr \tag{19}$$
with a major role in chemical bonding and reactivity, see [52] and Chapter 1 of this edited monograph. Further on, the ionization potential and electron affinity of eqs. (17) and (18) may be combined so that, when inserted into the finite difference approximations of the derivative forms of electronegativity and hardness of eqs. (4) and (5), they produce the chemical forms
$$\chi_C = \frac{IP + EA}{2} = \frac{1}{2}\int_{N_v-1}^{N_v+1} \chi\,dN, \tag{20}$$
$$\eta_C \equiv \frac{IP - EA}{2} = \chi_C - \int_{N_v}^{N_v+1} \chi\,dN + C_A \tag{21}$$
which may be regarded as working forms of the absolute definitions (10) and (12), respectively.
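How eqs. (17)-(21) fit together can be traced numerically. The sketch below assumes a hypothetical linear kernel χ(N) and a hypothetical value of the chemical action C_A (neither comes from the chapter), evaluates IP and EA by Simpson integration, and then forms the chemical electronegativity and hardness.

```python
def simpson(f, lo, hi, n=200):
    """Composite Simpson rule from lo to hi (n even); also valid when hi < lo."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

N_V = 4.0    # hypothetical number of valence electrons
C_A = 0.8    # hypothetical chemical action, eq. (19)

def chi_kernel(n):
    """Hypothetical linear kernel electronegativity chi(N)."""
    return 7.0 - 1.2 * (n - N_V)

ea = simpson(chi_kernel, N_V, N_V + 1.0) - C_A    # eq. (17): -EA = -integral + C_A
ip = -simpson(chi_kernel, N_V, N_V - 1.0) + C_A   # eq. (18)

chi_c = 0.5 * (ip + ea)                           # left-hand form of eq. (20)
eta_c = 0.5 * (ip - ea)                           # left-hand form of eq. (21)
chi_c_integral = 0.5 * simpson(chi_kernel, N_V - 1.0, N_V + 1.0)
eta_c_integral = chi_c - simpson(chi_kernel, N_V, N_V + 1.0) + C_A
print(ip, ea, chi_c, eta_c)
```

The chemical action cancels in χ_C but survives in η_C, which is why eq. (21) carries the explicit C_A term.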
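Table 1. Different electronegativity (left column) and hardness (right column) in the absolute (first two rows) and chemical (last two rows) formulations relating the local and global softness contributions [18]

| χ | η |
|---|---|
| $\chi = -\int_0^{N_v} \frac{1}{S}\,dN - \frac{1}{S}\int s(x)V(x)\,dx$ | $\eta_\chi = -\frac{1}{2}\left(\frac{\partial \chi}{\partial N}\right)_V$ |
| | $\eta_S = \frac{1}{2S}$ |
| $\chi_C = \frac{1}{2}\int_{N_v-1}^{N_v+1} \chi\,dN$ | $\eta_C^{\chi} = -\frac{1}{2}\left(\frac{\partial \chi_C}{\partial N}\right)_V$ |
| | $\eta_C^{C_A} = \chi_C - \int_{N_v}^{N_v+1} \chi\,dN + C_A$ |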
This step, however, succeeds in expressing both the chemical electronegativity and hardness as uniformly depending on a single “kernel” electronegativity, which can in turn be expressed by employing the electronegativity total differential equation (14). To this end, one makes use of the so-called local softness definition [33,36]
$$s(r) = -\left(\frac{\partial \rho(r)}{\partial \chi}\right)_{V(r)} \tag{22}$$
which, through integration to the global softness
$$S = \int s(r)\,dr = -\left(\frac{\partial N}{\partial \chi}\right)_{V(r)} = \frac{1}{2\eta} \tag{23}$$
allows eq. (14) to be rewritten as
$$d\chi = -\frac{1}{S}\,dN - \int \frac{s(r)}{S}\,dV(r)\,dr \tag{24}$$
followed by the formal integration:
$$\chi = -\int_0^{N_v} \frac{1}{S}\,dN - \int \frac{s(r)}{S}\left(\int_0^{V(r)} dV\right) dr = -\int_0^{N_v} \frac{1}{S}\,dN - \frac{1}{S}\int s(r)\,V(r)\,dr \tag{25}$$
From the practical implementation point of view, one can identify the form (25) as the realization of the absolute electronegativity (10), to be inserted in the chemical forms of electronegativity and hardness as prescribed by eqs. (20) and (21), with the analytical advantage that the dependence has now been moved onto the knowledge of the local softness rather than of the total energy. To summarize, all working absolute and chemical levels of electronegativity and hardness are collected in Table 1. The final step in this algorithm regards the way in which the softness influence is considered. For that, one considers the complete softness hierarchy from the global to the local to the kernel contributions
$$\frac{1}{2\eta} \equiv S = \int s(r)\,dr = \iint s(r,r')\,dr\,dr' \tag{26}$$
and assesses the local and non-local effects through the specific density-related terms; as such, under three quantum mechanical constraints, namely the translational invariance condition, the Hellmann-Feynman theorem, and the normalization of the linear response function, the following approximate formula was derived [16]:
$$s(r,r') = L(r)\,\delta(r-r') + \rho(r)\,\rho(r') \tag{27}$$
with the local response function
$$L(r) = \frac{\nabla\rho(r)\cdot[-\nabla V(r)]}{[-\nabla V(r)]^2} \tag{28}$$
δ(r − r') being the Dirac delta function. Therefore, by performing the successive integrations of (27), one gets for the local and global softness the respective results:
$$s(r) = L(r) + N\rho(r), \tag{29}$$
$$S = a + N^2 \tag{30}$$
where the short-hand notation
$$a \equiv \int L(r)\,dr \tag{31}$$
was considered. Now, supplementing the softness-related reactivity index (31) with the associated one that appears as the last term in the integration (25),
$$b \equiv \int L(r)\,V(r)\,dr \tag{32}$$
the various explicit density functionals of absolute and chemical electronegativity and hardness are obtained, within various approximation levels of charge and applied potential order, as systematized in Table 2 [54].
Table 2. The absolute electronegativity and hardness density functionals as results from electronegativity and total energy expansions within different combinations between the first and second order of charge and first order of external potential variations, respectively. The notations a, b, and CA correspond to the integrals of eqs. (31), (32), and (19), respectively [54]

Columns: Sources (dχ, dE); Electronegativity, Absolute (Softness): $\chi = \int_0^{N_v} d\chi$; Electronegativity, Absolute (Energy): $\chi = -\frac{1}{2}\int_{N_v-1}^{N_v+1} dE$; Hardness, Absolute: $\eta_A = -\frac{1}{2}\int_{N_v-1}^{N_v+1} d\chi$.

Sources: $d\chi = -\frac{1}{S}\,dN$; $dE = -\chi\,dN$
$$\chi\ \text{(Softness)} = -\frac{1}{\sqrt{a}}\arctan\left(\frac{N_v}{\sqrt{a}}\right)$$
$$\chi\ \text{(Energy)} = \frac{1}{2}\left\{\frac{1}{\sqrt{a}}\left[(N_v-1)\arctan\left(\frac{N_v-1}{\sqrt{a}}\right) - (N_v+1)\arctan\left(\frac{N_v+1}{\sqrt{a}}\right)\right] + \frac{1}{2}\ln\left[\frac{a+(N_v+1)^2}{a+(N_v-1)^2}\right]\right\}$$
$$\eta_A = \frac{1}{2\sqrt{a}}\left[\arctan\left(\frac{N_v+1}{\sqrt{a}}\right) - \arctan\left(\frac{N_v-1}{\sqrt{a}}\right)\right]$$

Sources: $d\chi = -\frac{1}{S}\,dN - \left[\frac{\partial}{\partial N}\left(\frac{1}{S}\right)\right] dN\,dN = -\frac{dN}{a+N^2} + \frac{2N\,dN\,dN}{(a+N^2)^2}$; $dE = -\chi\,dN - \frac{1}{2}\left(\frac{\partial\chi}{\partial N}\right)_V dN\,dN$
$$\chi\ \text{(Softness)} = -\frac{1}{\sqrt{a}}\arctan\left(\frac{N_v}{\sqrt{a}}\right) + \frac{N_v - \sqrt{a}\arctan\left(N_v/\sqrt{a}\right)}{2a}$$
$$\chi\ \text{(Energy)} = \frac{3}{16a}\left\{4N_v + 3a\ln\left[\frac{a+(N_v+1)^2}{a+(N_v-1)^2}\right] + 6\sqrt{a}\left[(N_v-1)\arctan\left(\frac{N_v-1}{\sqrt{a}}\right) - (N_v+1)\arctan\left(\frac{N_v+1}{\sqrt{a}}\right)\right]\right\}$$
$$\eta_A = \frac{1}{4\sqrt{a}}\left\{\arctan\left(\frac{N_v-2}{\sqrt{a}}\right) + \arctan\left(\frac{N_v+2}{\sqrt{a}}\right) - 2\left[\arctan\left(\frac{N_v-1}{\sqrt{a}}\right) + \arctan\left(\frac{N_v}{\sqrt{a}}\right) - \arctan\left(\frac{N_v+1}{\sqrt{a}}\right)\right]\right\}$$

Table 2. Continued

Sources: $d\chi = -\frac{1}{S}\,dN - \int \frac{s(r)}{S}\,\delta V(r)\,dr$; $dE = -\chi\,dN + \int \rho(r)\,\delta V(r)\,dr$
$$\chi\ \text{(Softness)} = -\frac{1}{\sqrt{a}}\arctan\left(\frac{N_v}{\sqrt{a}}\right) - \frac{b + N_v C_A}{a + N_v^2}$$
$$\chi\ \text{(Energy)} = \frac{1}{2}\left\{\frac{1}{\sqrt{a}}\left[(b+N_v-1)\arctan\left(\frac{N_v-1}{\sqrt{a}}\right) - (b+N_v+1)\arctan\left(\frac{N_v+1}{\sqrt{a}}\right)\right] + \frac{C_A-1}{2}\ln\left[\frac{a+(N_v-1)^2}{a+(N_v+1)^2}\right]\right\}$$
$$\eta_A = \frac{1}{2}\left\{\frac{2\left[C_A\left(1+a-N_v^2\right) - 2bN_v\right]}{(1+a)^2 + 2(a-1)N_v^2 + N_v^4} + \frac{1}{\sqrt{a}}\left[\arctan\left(\frac{N_v+1}{\sqrt{a}}\right) - \arctan\left(\frac{N_v-1}{\sqrt{a}}\right)\right]\right\}$$

Sources: $d\chi = -\frac{1}{S}\,dN - \left[\frac{\partial}{\partial N}\left(\frac{1}{S}\right)\right] dN\,dN - \int \frac{s(r)}{S}\,\delta V(r)\,dr$; $dE = -\chi\,dN - \frac{1}{2}\left(\frac{\partial\chi}{\partial N}\right)_V dN\,dN + \int \rho(r)\,\delta V(r)\,dr$
$$\chi\ \text{(Softness)} = -\frac{1}{\sqrt{a}}\arctan\left(\frac{N_v}{\sqrt{a}}\right) + \frac{N_v - \sqrt{a}\arctan\left(N_v/\sqrt{a}\right)}{2a} - \frac{b + N_v C_A}{a + N_v^2}$$
$$\chi\ \text{(Energy)} = \frac{3}{16a}\left\{4N_v + 2\sqrt{a}\left[(2b+3N_v-3)\arctan\left(\frac{N_v-1}{\sqrt{a}}\right) - (2b+3N_v+3)\arctan\left(\frac{N_v+1}{\sqrt{a}}\right)\right] + a(2C_A-3)\ln\left[\frac{a+(N_v-1)^2}{a+(N_v+1)^2}\right]\right\}$$
$$\eta_A = \frac{1}{4}\left\{\frac{4\left[C_A\left(1+a-N_v^2\right) - 2bN_v\right]}{(1+a)^2 + 2(a-1)N_v^2 + N_v^4} + \frac{1}{\sqrt{a}}\left[\arctan\left(\frac{N_v-2}{\sqrt{a}}\right) + \arctan\left(\frac{N_v+2}{\sqrt{a}}\right)\right] - \frac{2}{\sqrt{a}}\left[\arctan\left(\frac{N_v-1}{\sqrt{a}}\right) + \arctan\left(\frac{N_v}{\sqrt{a}}\right) - \arctan\left(\frac{N_v+1}{\sqrt{a}}\right)\right]\right\}$$
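The softness-based functionals can be evaluated directly once a, b and C_A are known. The sketch below takes eq. (25) with S = a + N² from eq. (30) and the integral ∫s(r)V(r)dr = b + N·C_A implied by eqs. (29), (32) and (19), and compares the hardness of eq. (12) with the closed form listed in Table 2 for the source that is first order in charge and potential. The numerical values of N_v, a, b and C_A are hypothetical.

```python
import math

def chi_softness(n_v, a, b=0.0, c_a=0.0):
    """chi = -(1/sqrt(a))*arctan(Nv/sqrt(a)) - (b + Nv*C_A)/(a + Nv^2), from eq. (25)."""
    root = math.sqrt(a)
    return -math.atan(n_v / root) / root - (b + n_v * c_a) / (a + n_v * n_v)

def eta_absolute(n_v, a, b=0.0, c_a=0.0):
    """Eq. (12) around Nv: -(1/2)*[chi(Nv+1) - chi(Nv-1)]."""
    return -0.5 * (chi_softness(n_v + 1.0, a, b, c_a) - chi_softness(n_v - 1.0, a, b, c_a))

def eta_closed_form(n_v, a, b, c_a):
    """Closed-form absolute hardness for the same source, as listed in Table 2."""
    root = math.sqrt(a)
    denom = (1.0 + a) ** 2 + 2.0 * (a - 1.0) * n_v ** 2 + n_v ** 4
    rational = 2.0 * (c_a * (1.0 + a - n_v ** 2) - 2.0 * b * n_v) / denom
    arcs = (math.atan((n_v + 1.0) / root) - math.atan((n_v - 1.0) / root)) / root
    return 0.5 * (rational + arcs)

# Hypothetical inputs for illustration only
N_V, A, B, C_A = 4.0, 2.5, -1.2, -3.0
eta_direct = eta_absolute(N_V, A, B, C_A)
eta_table = eta_closed_form(N_V, A, B, C_A)
print(eta_direct, eta_table)
```

Setting `b` and `c_a` to zero reduces both routines to the purely charge-dependent source of the table.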
Analyzing Table 2, one notes that the chemical electronegativity of eq. (20) is obtained as a special case of the absolute one of eq. (10) when the total or valence energy is restricted to the first order variation in charge, i.e. the fourth row of Table 2, while the total differential equations of electronegativity and energy correspond to the sixth row of Table 2, providing reasonably complex electronegativity and hardness density functionals to be further used in modeling chemical bonding and reactivity [55]. Yet, with such a variety of quantum formulations for electronegativity and hardness, depending on the chemical reactivity framework, the fundamental question arises whether the basic definitions (4) and (5), which opened all the above analytical phenomenology, are of intrinsic quantum nature and thus implicitly subsist in any other related formulation. This matter is addressed next.
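3. Electronegativity and Chemical Hardness by Second Quantization
3.1. Affinity and Ionization Fields by Second Quantization
Starting from the bi-dimensional unitary operator on the Fock electronic space F{|0⟩, |1⟩}
$$\hat{1} = |0\rangle\langle 0| + |1\rangle\langle 1| = \hat{a}\hat{a}^+ + \hat{a}^+\hat{a} = \{\hat{a},\hat{a}^+\} \tag{33}$$
one may easily introduce the annihilation and creation operators respectively as:
$$\hat{a} = |0\rangle\langle 1|, \tag{34a}$$
$$\hat{a}^+ = |1\rangle\langle 0| \tag{34b}$$
noting their fundamental actions
$$\hat{a}^+|0\rangle = |1\rangle\langle 0|0\rangle = |1\rangle, \tag{35a}$$
$$\hat{a}|1\rangle = |0\rangle\langle 1|1\rangle = |0\rangle \tag{35b}$$
throughout fulfilling the fundamental ortho-normal rules
$$\langle 0|1\rangle = \langle 1|0\rangle = 0, \tag{36a}$$
$$\langle 0|0\rangle = \langle 1|1\rangle = 1 \tag{36b}$$
for the vacuum and single electronic states, respectively.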
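The operator algebra of eqs. (33)-(36) can be checked with 2×2 matrices. The sketch below assumes the column-vector representation |0⟩ = (1, 0), |1⟩ = (0, 1); this choice of basis and the helper names are illustrative assumptions, not part of the chapter.

```python
VAC = [1, 0]   # |0>, vacuum state (assumed representation)
ONE = [0, 1]   # |1>, singly occupied state

def outer(ket, bra):
    """Matrix |ket><bra| for real two-component vectors."""
    return [[k * b for b in bra] for k in ket]

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(x, y):
    """Sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

def act(m, v):
    """Action of a 2x2 matrix on a two-component vector."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]

a = outer(VAC, ONE)        # annihilation operator, eq. (34a)
a_dag = outer(ONE, VAC)    # creation operator, eq. (34b)

anticommutator = matadd(matmul(a, a_dag), matmul(a_dag, a))   # eq. (33)
created = act(a_dag, VAC)                                     # eq. (35a)
annihilated = act(a, ONE)                                     # eq. (35b)
print(anticommutator, created, annihilated)
```

In this representation the anticommutator is the identity matrix, and the two actions map the vacuum onto the occupied state and back.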
Turning now to the treatment of electronegativity and hardness by means of second quantization, one considers the “valence state” reality as characterized by the unperturbed stationary state |ψ0⟩ with associated eigen-energy E0:
$$\hat{H}|\psi_0\rangle = E_0|\psi_0\rangle \tag{37}$$
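The normalization constraint for the valence wave-function then allows the decomposition of the unitary operator over the vacuum and uni-particle occupancies as:
$$\begin{aligned}
1 &= \langle\psi_0|\psi_0\rangle = \langle\psi_0|\hat{1}|\psi_0\rangle = \langle\psi_0|\left(\hat{a}\hat{a}^+ + \hat{a}^+\hat{a}\right)|\psi_0\rangle = \langle\psi_0|\hat{a}\hat{a}^+|\psi_0\rangle + \langle\psi_0|\hat{a}^+\hat{a}|\psi_0\rangle \\
&= |\langle 0|\psi_0\rangle|^2 + |\langle 1|\psi_0\rangle|^2 = (1-\rho_0) + \rho_0, \quad \rho_0 \in [0,1]
\end{aligned} \tag{38}$$
and further identifies the wave function projections:
$$\langle 0|\psi_0\rangle = \langle\psi_0|0\rangle = \sqrt{1-\rho_0}, \tag{39a}$$
$$\langle 1|\psi_0\rangle = \langle\psi_0|1\rangle = \sqrt{\rho_0} \tag{39b}$$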
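Under these conditions, the valence wave-function may be modified such that the affinity and ionization chemical states are written as corrections or perturbations added to the occupancy or vacuum quantum states, respectively:
$$|\psi_\lambda^A\rangle = \left(1 + \lambda\hat{a}^+\hat{a}\right)|\psi_0\rangle = |\psi_0\rangle + \lambda|1\rangle\langle 0|0\rangle\langle 1|\psi_0\rangle = |\psi_0\rangle + \lambda\sqrt{\rho_0}\,|1\rangle, \tag{40a}$$
$$|\psi_\lambda^I\rangle = \left(1 + \lambda\hat{a}\hat{a}^+\right)|\psi_0\rangle = |\psi_0\rangle + \lambda|0\rangle\langle 1|1\rangle\langle 0|\psi_0\rangle = |\psi_0\rangle + \lambda\sqrt{1-\rho_0}\,|0\rangle \tag{40b}$$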
These quantum chemical affinity and ionization electronic states are to be used to closely characterize the electronegativity and hardness fields in a way to resolve the question of their observable character.
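3.2. Observability of Electronegativity and Hardness
To evaluate the “response” of electronegativity and hardness to the affinity and ionization perturbations of eqs. (40), one employs their basic definitions (4) and (5) so as to include the perturbative effect [65,66]:
$$\chi_\lambda = -\frac{\delta\langle E_\lambda\rangle}{\delta\rho_\lambda} = -\frac{\partial\langle E_\lambda\rangle}{\partial\lambda}\frac{\partial\lambda}{\partial\rho_\lambda}, \tag{41a}$$
$$\eta_\lambda = \frac{1}{2}\frac{\partial^2\langle E_\lambda\rangle}{\partial\rho_\lambda^2} = \frac{1}{2}\left\{\left[\frac{\partial}{\partial\lambda}\left(\frac{\partial\langle E_\lambda\rangle}{\partial\lambda}\right)\right]\frac{\partial\lambda}{\partial\rho_\lambda} + \frac{\partial\langle E_\lambda\rangle}{\partial\lambda}\left[\frac{\partial}{\partial\lambda}\left(\frac{\partial\lambda}{\partial\rho_\lambda}\right)\right]\right\}\frac{\partial\lambda}{\partial\rho_\lambda} \tag{41b}$$
by iteratively employing the chain-derivation rule
$$\frac{\partial\,\bullet}{\partial\rho_\lambda} = \frac{\partial\,\bullet}{\partial\lambda}\cdot\frac{\partial\lambda}{\partial\rho_\lambda} \tag{42}$$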
From expressions (41) it appears that the main components of electronegativity and hardness are the density and energy derivatives with respect to the perturbation factor λ. Both will therefore be unfolded with respect to the affinity and ionization chemical states (40) in a special way, so as to record their reciprocal transition (no matter in which temporal order), while normalized by their scalar product for the limiting occupancy 0 << ρ0 ≤ 1 (since the two states are related and not necessarily orthogonal), written respectively as [65]:
$$\rho^{I\leftrightarrow A}_{\lambda\in\Re} = \left.\frac{\langle\psi_\lambda^I|\hat{a}^+\hat{a}|\psi_\lambda^A\rangle}{\langle\psi_\lambda^I|\psi_\lambda^A\rangle}\right|_{0\ll\rho_0\le 1}, \tag{43}$$
$$\langle E^{I\leftrightarrow A}_{\lambda\in\Re}\rangle = \left.\frac{\langle\psi_\lambda^I|\hat{H}|\psi_\lambda^A\rangle}{\langle\psi_\lambda^I|\psi_\lambda^A\rangle}\right|_{0\ll\rho_0\le 1} \tag{44}$$
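Starting with the computation of the perturbed occupancy, we get successively:
$$\begin{aligned}
\rho_\lambda &= \left.\frac{\langle\psi_0|\left(1+\lambda\hat{a}\hat{a}^+\right)\hat{a}^+\hat{a}\left(1+\lambda\hat{a}^+\hat{a}\right)|\psi_0\rangle}{\langle\psi_0|\left(1+\lambda\hat{a}\hat{a}^+\right)\left(1+\lambda\hat{a}^+\hat{a}\right)|\psi_0\rangle}\right|_{0\ll\rho_0\le 1} \\
&= \left.\frac{\langle\psi_0|\left(\hat{a}^+\hat{a} + \lambda\hat{a}^+\hat{a}\hat{a}^+\hat{a} + \lambda\hat{a}\hat{a}^+\hat{a}^+\hat{a} + \lambda^2\hat{a}\hat{a}^+\hat{a}^+\hat{a}\hat{a}^+\hat{a}\right)|\psi_0\rangle}{\langle\psi_0|\left(1 + \lambda\hat{a}^+\hat{a} + \lambda\hat{a}\hat{a}^+ + \lambda^2\hat{a}\hat{a}^+\hat{a}^+\hat{a}\right)|\psi_0\rangle}\right|_{0\ll\rho_0\le 1} \\
&= \rho_0\,\frac{1+\lambda}{1+\lambda\rho_0}
\end{aligned} \tag{45}$$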
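following from the chemical field evaluations:
$$\langle\psi_0|\hat{a}\hat{a}^+\hat{a}^+\hat{a}|\psi_0\rangle = \langle\psi_0|0\rangle\langle 1|1\rangle\langle 0|1\rangle\langle 0|0\rangle\langle 1|\psi_0\rangle = 0 \tag{46a}$$
$$\langle\psi_0|\hat{a}^+\hat{a}\hat{a}^+\hat{a}|\psi_0\rangle = \langle\psi_0|1\rangle\langle 0|0\rangle\langle 1|1\rangle\langle 0|0\rangle\langle 1|\psi_0\rangle = \rho_0 \tag{46b}$$
$$\langle\psi_0|\hat{a}\hat{a}^+\hat{a}^+\hat{a}\hat{a}^+\hat{a}|\psi_0\rangle = \langle\psi_0|0\rangle\langle 1|1\rangle\langle 0|1\rangle\langle 0|0\rangle\langle 1|1\rangle\langle 0|0\rangle\langle 1|\psi_0\rangle = 0 \tag{46c}$$
based on the above rules (34), (36), and (39). From expression (45) the perturbation factor is first obtained as
$$\lambda = \frac{\rho_\lambda - \rho_0}{\rho_0\left(1-\rho_\lambda\right)} \tag{47}$$
which provides the density-involved derivatives appearing in eqs. (41):
$$\frac{\partial\lambda}{\partial\rho_\lambda} = \frac{\left(1+\lambda\rho_0\right)^2}{\rho_0\left(1-\rho_0\right)}, \tag{48a}$$
$$\frac{\partial}{\partial\lambda}\left(\frac{\partial\lambda}{\partial\rho_\lambda}\right) = 2\,\frac{1+\lambda\rho_0}{1-\rho_0} \tag{48b}$$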
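Then, turning to the energy calculation based on the ansatz (44), the chemical affinity and ionization fields produce the results:
$$\begin{aligned}
\langle E_\lambda\rangle &= \left.\frac{\langle\psi_0|\left(1+\lambda\hat{a}\hat{a}^+\right)\hat{H}\left(1+\lambda\hat{a}^+\hat{a}\right)|\psi_0\rangle}{\langle\psi_0|\left(1+\lambda\hat{a}\hat{a}^+\right)\left(1+\lambda\hat{a}^+\hat{a}\right)|\psi_0\rangle}\right|_{0\ll\rho_0\le 1} \\
&= \frac{\langle\psi_0|\hat{H}|\psi_0\rangle + \lambda\langle\psi_0|\hat{H}\hat{a}^+\hat{a}|\psi_0\rangle + \lambda\langle\psi_0|\hat{a}\hat{a}^+\hat{H}|\psi_0\rangle + \lambda^2\langle\psi_0|\hat{a}\hat{a}^+\hat{H}\hat{a}^+\hat{a}|\psi_0\rangle}{1+\lambda\rho_0} \\
&= E_0\,\frac{1+\lambda}{1+\lambda\rho_0}
\end{aligned} \tag{49}$$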
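since the eigen-equation (37) of the unperturbed valence state is employed by means of the creation-annihilation quantum rules:
$$\langle\psi_0|\hat{H}\hat{a}^+\hat{a}|\psi_0\rangle = \langle\psi_0|\hat{H}|1\rangle\langle 0|0\rangle\langle 1|\psi_0\rangle = E_0\langle\psi_0|1\rangle\langle 1|\psi_0\rangle = E_0\rho_0, \tag{50a}$$
$$\langle\psi_0|\hat{a}\hat{a}^+\hat{H}|\psi_0\rangle = \langle\psi_0|0\rangle\langle 1|1\rangle\langle 0|\hat{H}|\psi_0\rangle = \langle\psi_0|0\rangle\langle 0|\psi_0\rangle E_0 = E_0\left(1-\rho_0\right), \tag{50b}$$
while the term
$$\langle\psi_0|\hat{a}\hat{a}^+\hat{H}\hat{a}^+\hat{a}|\psi_0\rangle = \langle\psi_0|0\rangle\langle 1|1\rangle\langle 0|\hat{H}|1\rangle\langle 0|0\rangle\langle 1|\psi_0\rangle \tag{51}$$
vanishes within the usual second quantization Hamiltonian expansion over its one- and two-particle terms, with the corresponding integrals h_pq and g_pq,ts over the p, q, t, and s orbitals, respectively:
$$\hat{H} = \sum_{pq} h_{pq}\,\hat{a}_p^+\hat{a}_q + \frac{1}{2}\sum_{pqts} g_{pq,ts}\,\hat{a}_p^+\hat{a}_t^+\hat{a}_q\hat{a}_s \tag{52}$$
noticing that
$$\langle 0|\hat{H}|1\rangle \sim \langle 0|\hat{a}_p^+\dots|1\rangle = \langle 0|1\rangle\langle 0|\dots|1\rangle = 0 \tag{53}$$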
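With the help of eq. (49) it is immediate to obtain the first and second energy derivatives involved in (41), respectively as:
$$\frac{\partial\langle E_\lambda\rangle}{\partial\lambda} = E_0\,\frac{1-\rho_0}{\left(1+\lambda\rho_0\right)^2}, \tag{54a}$$
$$\frac{\partial}{\partial\lambda}\left(\frac{\partial\langle E_\lambda\rangle}{\partial\lambda}\right) = -2E_0\rho_0\,\frac{1-\rho_0}{\left(1+\lambda\rho_0\right)^3} \tag{54b}$$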
Finally, through combining expressions (48) and (54) into (41) the second quantization based electronegativity and hardness are shaped as [65,66]:
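$$\chi_\lambda = -\frac{E_0}{\rho_0} = -\mu_0 = \begin{cases} \infty, & \rho_0 \to 0\ (E_0 < 0) \\ -E_0 = -\langle\psi_0|\hat{H}|\psi_0\rangle, & \rho_0 \to 1 \end{cases} \tag{55a}$$
$$\eta_\lambda = 0\cdot E_0\,\frac{1+\lambda\rho_0}{\rho_0\left(1-\rho_0\right)} = \begin{cases} 0, & \rho_0 \in (0,1) \\ 0\cdot\infty = ?, & \rho_0 \to 0 \\ 0\cdot\infty = ?, & \rho_0 \to 1 \end{cases} \tag{55b}$$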
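The algebraic chain from eqs. (45) and (49) to eqs. (55) can be verified with exact rational arithmetic. The sketch below takes eqs. (45), (48) and (54) as given and only checks how they combine in eqs. (41); the sample values of λ, ρ0 and E0 are hypothetical.

```python
from fractions import Fraction as F

def rho_lambda(lam, rho0):
    """Perturbed occupancy, eq. (45)."""
    return rho0 * (1 + lam) / (1 + lam * rho0)

def lambda_from_rho(rho_l, rho0):
    """Perturbation factor recovered from the occupancy, eq. (47)."""
    return (rho_l - rho0) / (rho0 * (1 - rho_l))

def dlam_drho(lam, rho0):
    """Eq. (48a)."""
    return (1 + lam * rho0) ** 2 / (rho0 * (1 - rho0))

def d_dlam_of_dlam_drho(lam, rho0):
    """Eq. (48b)."""
    return 2 * (1 + lam * rho0) / (1 - rho0)

def de_dlam(lam, rho0, e0):
    """Eq. (54a)."""
    return e0 * (1 - rho0) / (1 + lam * rho0) ** 2

def d2e_dlam2(lam, rho0, e0):
    """Eq. (54b)."""
    return -2 * e0 * rho0 * (1 - rho0) / (1 + lam * rho0) ** 3

def chi(lam, rho0, e0):
    """Eq. (41a): minus the energy derivative times the chain factor."""
    return -de_dlam(lam, rho0, e0) * dlam_drho(lam, rho0)

def eta(lam, rho0, e0):
    """Eq. (41b): half of the second derivative built by the chain rule (42)."""
    inner = (d2e_dlam2(lam, rho0, e0) * dlam_drho(lam, rho0)
             + de_dlam(lam, rho0, e0) * d_dlam_of_dlam_drho(lam, rho0))
    return F(1, 2) * inner * dlam_drho(lam, rho0)

# Hypothetical sample point with fractional occupancy
lam, rho0, e0 = F(3, 7), F(2, 5), F(-9, 2)
chi_value = chi(lam, rho0, e0)
eta_value = eta(lam, rho0, e0)
lam_back = lambda_from_rho(rho_lambda(lam, rho0), rho0)
print(chi_value, eta_value, lam_back)
```

At this sample point χ equals −E0/ρ0 independently of λ, while the two contributions to η cancel exactly, in line with the middle branch of eq. (55b).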
Once again it is clear that electronegativity behaves quite differently from its hardness companion in eq. (3). In other words, while electronegativity plainly behaves as a quantum observable, recovering the finite chemical potential definition in (55a), the hardness in (55b) remains a “hidden variable” for the vacuum and single occupancy, while being absent from observation for fractionally occupied quantum chemical states. This last remark, however, settles the question of fractional occupancy in chemical reactivity phenomena, which are largely dominated and controlled by the hardness principles [13,17,18]. On the other side, the electronegativity proves its correctness by manifesting as the power with which a system attracts electrons to fill its vacuum states, the upper branch in (55a). Overall, the second quantization treatment of electronegativity and hardness sheds light on the versatility with which the reactivity indices drive the chemical states in bonding and reactivity. The second quantization results suggest a reactivity/bonding scenario in which the electronegativity acts as the first influence, through the equalization of the chemical potentials involved, followed by a second, hardness stage that refines (in energetic order) the electronic chemical states between vacuum and full occupancy, with no fractional states allowed in bonding. This is an important result that overcomes the conceptual difficulties eventually raised by the quantum chemical computations of the fractional electronic populations at equilibrium [12,13,26,39].
4. Electronegativity and Hardness Electrodynamical Counterpart
Given the major role electronegativity plays in isolated or reactive electronic systems, there is just one step to the correspondence between the electronegativity and electrodynamics equations, since both originate in electronic sources or movements. As such, if one rewrites eq. (15) so as to mirror the charge conservation (continuity) law of electrodynamics [67]
$$\left(\frac{\partial\rho(r)}{\partial N}\right)_V + \left(\frac{\delta\chi}{\delta V(r)}\right)_N = 0, \tag{56a}$$
$$\frac{\partial\rho}{\partial t} + \vec{\nabla}\cdot\vec{j} = 0 \tag{56b}$$
one gets the following physico-chemical correspondences:
$$\frac{\partial}{\partial t} \leftrightarrow \frac{\partial}{\partial N}, \tag{57a}$$
$$\vec{\nabla}\cdot \leftrightarrow \frac{\delta}{\delta V(r)}, \tag{57b}$$
$$\vec{j} \leftrightarrow \chi \tag{58}$$
thus providing the space-time chemical counterpart derivatives as well as identifying electronegativity with the electronic charge current field, a non-trivial yet not unrealistic physical picture of reactivity. This result may be further exploited by recalling the other definition of the charge current in terms of the electrodynamical magnetic field intensity, namely
$$\vec{j} = \alpha\vec{\nabla}\times\vec{H}, \quad \alpha \in \Re \tag{59}$$
which, being equated with the electronegativity (4) as prescribed by (58),
$$-\left(\frac{\partial E}{\partial N}\right)_V = \alpha\vec{\nabla}\times\vec{H} \tag{60}$$
adds more electromagnetic field-chemical reactivity correspondences:
$$\vec{\nabla}\times \leftrightarrow \frac{\partial}{\partial N}, \tag{57c}$$
$$\vec{H} \leftrightarrow E[N], \tag{61a}$$
$$\alpha \leftrightarrow -1 \tag{62a}$$
Next, the electrodynamical contact with hardness may use the fact that the latter is zero when fractionally occupied states are involved, according to the upper branch of eq. (55b), while having to fulfill the derivative relationship with electronegativity as conditioned by eqs. (4) and (5); therefore the magnetic induction may be assigned to electronegativity and hardness such that
$$\vec{B} \leftrightarrow -\frac{1}{2}\chi, \tag{61b}$$
$$\frac{\partial\vec{B}}{\partial t} \leftrightarrow \eta \tag{63}$$
thus combined into the established relationship:
$$\eta \rightarrow \frac{\partial}{\partial t}\vec{B} \rightarrow \frac{\partial}{\partial N}\left(-\frac{1}{2}\chi\right) = -\frac{1}{2}\frac{\partial\chi}{\partial N} \tag{64}$$
Now, we combine these main ingredients of the chemical electrodynamical picture to complete the physical field-chemical reactivity correspondences. In this way, the specialization of the Biot-Savart law
$$\alpha\vec{\nabla}\times\vec{B} = \mu_0\vec{j} \tag{65a}$$
in the light of the above one-to-one substitutions reduces to the equation
$$\frac{1}{2}\frac{\partial\chi}{\partial N} = \mu_0\chi \tag{65b}$$
from where the magnetic susceptibility μ0 is eventually associated as follows:
$$\mu_0 \leftrightarrow \frac{1}{2N} \tag{62b}$$
thus taking not a fixed parametric value but rather a variable one, in inverse relation with the number of electrons in a given system. Along the same line, the Faraday-Lenz law
$$\alpha\vec{\nabla}\times\vec{j} = -\sigma\frac{\partial\vec{B}}{\partial t} \tag{66a}$$
rewrites as
$$-1\cdot\frac{\partial\chi}{\partial N} = -\sigma\eta \tag{66b}$$
from where the electric conductance takes the fixed value
$$\sigma = -2 \tag{62c}$$
Yet, the last determination has two important consequences. One is derived by combining it with the electric current-field intensity formula:
$$\vec{j} = \sigma\vec{E} \tag{67}$$
leaving with another connection of electronegativity with an electrodynamical field component, here the electric field intensity:
$$\vec{E} \leftrightarrow -\frac{1}{2}\chi \tag{61c}$$
noting, remarkably, a unification with the magnetic field induction (61b). The second consequence is provided by reshaping the Gauss law
$$\varepsilon_0\vec{\nabla}\cdot\vec{j} = \sigma\rho \tag{68a}$$
within the above chemical reactivity quantities and eq. (56a)
$$-2\rho = \varepsilon_0\frac{\delta\chi}{\delta V} = -\varepsilon_0\frac{\partial\rho}{\partial N} \tag{68b}$$
to yield for the electric susceptibility the correspondence form
$$\varepsilon_0 \leftrightarrow 2N \tag{62d}$$
recording the inverse of the value provided by the magnetic companion of eq. (62b), with an evident physical meaning. Finally, it is worth introducing the electrodynamical field equivalent of the chemical action of eq. (19) and exploring its implications. Looking at the structure of the chemical action as the convolution of the electronic density with the applied potential, its identification with the polarization field appears natural [67]
$$C_A \leftrightarrow \vec{P} \tag{69}$$
With this addition to the list of electrodynamical field-chemical reactivity correspondences, the polarization law provides the electric field induction transformation
$$\vec{D} = \frac{\varepsilon_0}{\sigma}\vec{j} + \vec{P} \leftrightarrow -N\chi + C_A \tag{70}$$
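Yet, besides the above correspondences, the present formalism allows the determination of a new chemical (dynamical) reactivity equation, by reconsidering the previous Biot-Savart law (65a) in its extended form with polarization
$$\alpha\vec{\nabla}\times\vec{H} = \vec{j} + \frac{\partial\vec{D}}{\partial t} \tag{71a}$$
which, due to the present chemical reactivity counterparts, successively becomes:
$$\begin{aligned}
& -1\cdot\frac{\partial}{\partial N}E[N] = \chi + \frac{\partial}{\partial N}\left(-N\chi + C_A\right) \\
\Leftrightarrow\ & -\frac{\partial E_N}{\partial N} = -N\frac{\partial\chi}{\partial N} + \frac{\partial C_A}{\partial N} \\
\Leftrightarrow\ & \chi = 2N\eta + \frac{\partial C_A}{\partial N}
\end{aligned} \tag{71b}$$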
which may be considered a new achievement in the panoply of the theory of chemical reactivity indices; in fact eq. (71b) establishes the intimate relationship between the chemical action, electronegativity and hardness involving only the electronic number. In other terms, it may be considered a generalization of the reciprocal electronegativity-hardness relationship through the contribution of the variation of the chemical action with the electronic charge exchanged with the environment or within bonding. Relations like this are most useful when derived from established electromagnetic field transformations, since they include the field-to-wave-to-quantum information in an undulatory, non-separated (entangled) manner. This may ultimately contribute to a consistent quantum-relativistic formulation of the chemical field and reactivity (see also Chapter 1 of this volume), besides the precious connection Chemistry will achieve with information theory and the quantum theory of hidden variables. Overall, it is worth observing that only one quantity remains untouched when passing from the electromagnetic field side to the chemical reactivity side, namely the electronic density; it therefore constitutes the invariant of the present approach, which guarantees that observability is preserved in both interconnected pictures of the physical-chemical reality.
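The intermediate step behind eq. (71b) can be written out explicitly. The worked lines below only apply the product rule to the correspondence (70) and then use the definitions (4) and (5); no assumption beyond the correspondences already listed is introduced.

```latex
\begin{aligned}
\chi + \frac{\partial}{\partial N}\left(-N\chi + C_A\right)
  &= \chi - \chi - N\frac{\partial\chi}{\partial N} + \frac{\partial C_A}{\partial N}
   = -N\frac{\partial\chi}{\partial N} + \frac{\partial C_A}{\partial N}, \\
-\frac{\partial E}{\partial N} = \chi
  \quad\text{and}\quad
  \eta = -\frac{1}{2}\frac{\partial\chi}{\partial N}
  \;&\Rightarrow\;
  \chi = 2N\eta + \frac{\partial C_A}{\partial N}.
\end{aligned}
```

When the chemical action does not vary with N, the relation reduces to χ = 2Nη.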
5. Conclusion
Just as classical quantum chemistry proposed a series of principles and rules for describing atomic and molecular samples and reaction mechanisms [68], modern quantum physical chemistry seeks to characterize in a unitary way the quantum nature of chemical and biochemical bonding and transformation on the basis of the electronic density [69]. The electronic density of atomic and polyatomic forms of matter is obtained through an entire arsenal of quantitative techniques, such as the computational methods of the self-consistent field, of pseudopotentials, of matrices and their combinations, and of graph theory [70,71]; the resulting electronic densities can then be properly integrated or differentiated to provide density functionals, e.g., the total energy, the bond energy, the promotion energy, the solvation energy, reactivity indices, etc. [72], as well as the localization functions or the electronic basins of stability [73,74]. However, conceptual chemistry evolves by developing specific objects that express the reality of chemical reactivity, eventually at the valence level by means of the frontier electron movements. In this context, electronegativity stands as both the benchmark and the forefront of modern conceptual quantum chemistry, since it may be related and correlated in principle with the behavior of any many-electron system in isolated and reactive environments. Moreover, taken jointly with its companion, the hardness, which arises as the second order controlling factor of the total or valence energy expansion, it constitutes one of the most powerful conceptual pairs in Chemistry, with whose help chemical bonding, reactivity, or even biological activity may be modeled in an elegant yet efficient analytical framework [75-77].
As such, the present review unfolds the most intriguing aspects of electronegativity and hardness, starting from their density functional forms and clarifying their absolute and chemical systematic realizations. Their observability was approached by means of the second quantization formalism, according to which electronegativity is indeed revealed as minus the chemical potential, or even as the negative of the eigen-energy for fully occupied states, thus affirming its full observability, while the hardness preserves its “quantum hidden” character in that circumstance as well as for the vacuum state, with proven non-observability for fractional occupancy. Finally, the common electronic origin of reactivity and electrodynamic fields allows one to advance one-to-one correspondences of electronegativity and hardness with the main undulatory fields and with their electronic characteristics in matter. This leaves a sort of electric-magnetic unification of fields through their correspondence with electronegativity, while furnishing the proper framework in which new or generalized chemical reactivity equations are provided, with the fruitful perspective of characterizing the chemical transformation of matter with consistent quantum-relativistic information content. All these conceptual aspects and analytical formulations aim at the unification of chemical bonding and reactivity by means of the electronegativity and hardness indices, within the most celebrated and not yet completed theory of Chemistry.
References [1]
Bader, R.F.W. The zero-flux surface and the topological and quantum definitions of an atom in a molecule. Theor. Chem. Acc. 2001, 105, 276–283. [2] Ayers, P.W.; Parr, R.G. Variational principles for describing chemical reactions: The Fukui function and chemical hardness revisited. J. Am. Chem. Soc. 2000, 122, 20102018. [3] Deeth, R.J.; Bugg, T.D.H A density functional investigation of the extradiol cleavage mechanism in non-heme iron catechol dioxygenases. J. Biol. Inorg. Chem. 2003, 8, 409-418. [4] Silvi, B. The spin-pair compositions as local indicator of the nature of the bonding. J. Phys. Chem. A 2003, 107, 3081-3085. [5] Santos, J.C.; Tiznado, W.; Contreras, R.; Fuenteabla, P. Sigma-pi separation of the electron localization function and aromaticity. J. Chem. Phys. 2004, 120, 1670-1673. [6] Schmidera, H.L.; Becke, A.D. Two functions of the density matrix and their relation to the chemical bond. J. Chem. Phys. 2002, 116, 3184-3193. [7] Putz, M.V., Markovian approach of the electron localization functions. Int. J. Quantum Chem. 2005, 105, 1-11. [8] Brinkmann, G.; Fowler, P.W.; Justus, C. A catalogue of isomerization transformations of fullerene polyhedra. J. Chem. Inf. Comput. Sci. 2003, 43, 917-927. [9] Liu, J.; Shao, M.; Chen, X.; Yu, W.; Liu, X.; Qian, Y. Large-scale synthesis of carbon nanotubes by an ethanol thermal reduction process. J. Am. Chem. Soc. 2003, 125, 80888089. [10] Kobayashi, S.-I.; Mori, S.; Iida, S.; Ando, H.; Takenobu, T.; Taguchi, Y.; Fujiwara, A.; Taninaka, A.; Shinohara, H.; Iwasa, Y. Conductivity and field effect transistor of La2@C80 metallofullerene, J. Am. Chem. Soc. 2003, 125, 8116-8117. [11] Sato, K.; Hosokawa, K.; Maeda, M. Rapid aggregation of gold nanoparticles induced by non cross-linking DNA hybridization. J. Am. Chem. Soc. 2003, 125, 8102-8103. [12] Bader, R.F.W. Quantum Mechanics, or Orbitals? Int. J. Quantum Chem. 2003, 94, 173–177.
[13] Parr, R.G.; Yang, W. Density Functional Theory of Atoms and Molecules, Oxford University Press, New York, 1989.
[14] Dreizler, R.M.; Gross, E.K.U. Density Functional Theory, Springer Verlag, Heidelberg, 1990.
[15] Kohn, W.; Becke, A.D.; Parr, R.G. Density functional theory of electronic structure. J. Phys. Chem. 1996, 100, 12974–12980.
[16] Garza, J.; Robles, J. Density functional theory softness kernel. Phys. Rev. A 1993, 47, 2680–2685.
[17] Putz, M.V. Contributions within Density Functional Theory with Applications in Chemical Reactivity Theory and Electronegativity, Dissertation.com, Parkland, Florida, 2003.
[18] Putz, M.V. Absolute and Chemical Electronegativity and Hardness, Nova Science Publishers, New York, 2008.
[19] Putz, M.V. (Ed.) Advances in Quantum Chemical Bonding Structures, Transworld Research Network, Kerala, 2008.
[20] Putz, M.V. Density functionals of chemical bonding. Int. J. Mol. Sci. 2008, 9, 1050–1095.
[21] Ayers, P.W.; Parr, R.G. Variational principles for describing chemical reactions. Reactivity indices based on the external potential. J. Am. Chem. Soc. 2001, 123, 2007–2017.
[22] Iczkowski, R.P.; Margrave, J.L. Electronegativity. J. Am. Chem. Soc. 1961, 83, 3547–3551.
[23] Klopman, G. Electronegativity. J. Chem. Phys. 1965, 43, S124–S129.
[24] Hinze, J.; Jaffe, H.H. Electronegativity. I. Orbital electronegativity of neutral atoms. J. Am. Chem. Soc. 1962, 84, 540–546.
[25] Hinze, J.; Jaffe, H.H. Electronegativity. IV. Orbital electronegativities of neutral atoms of the period three A and four A and of positive ions of periods one and two. J. Phys. Chem. 1963, 67, 1501–1506.
[26] Bergmann, D.; Hinze, J. Electronegativity and charge distribution. Structure and Bonding 1987, 66, 145–190.
[27] Parr, R.G.; Donnelly, R.A.; Levy, M.; Palke, W.E. Electronegativity: the density functional viewpoint. J. Chem. Phys. 1978, 68, 3801–3807.
[28] Bartolotti, L.J.; Gadre, S.R.; Parr, R.G. Electronegativities of the elements from the simple Xα theory. J. Am. Chem. Soc. 1980, 102, 2945–2948.
[29] Pearson, R.G. Chemical Hardness, Wiley-VCH, Weinheim, 1997.
[30] Gázquez, J.L.; Ortiz, E. Electronegativities and hardness of open shell atoms. J. Chem. Phys. 1984, 81, 2741–2748.
[31] Berkowitz, M.; Ghosh, S.K.; Parr, R.G. On the concept of local hardness in chemistry. J. Am. Chem. Soc. 1985, 107, 6811–6814.
[32] Chattaraj, P.; Parr, R.G. Density functional theory of chemical hardness. Structure and Bonding 1993, 80, 11–25.
[33] Berkowitz, M.; Parr, R.G. Molecular hardness and softness, local hardness and softness, hardness and softness kernels, and relations among these quantities. J. Chem. Phys. 1988, 88, 2554–2557.
[34] Parr, R.G.; Gázquez, J.L. Hardness functional. J. Phys. Chem. 1993, 97, 3939–3940.
[35] Baekelandt, B.G.; Cedillo, A.; Parr, R.G. Reactivity indices and fluctuations formulas in density functional theory: isomorphic ensembles and a new measure of local hardness. J. Chem. Phys. 1995, 103, 8548–8556.
[36] De Proft, F.; Liu, S.; Parr, R.G. Chemical potential, hardness and softness kernel and local hardness in the isomorphic ensemble of density functional theory. J. Chem. Phys. 1997, 107, 3000–3006.
[37] Kolandaivel, P.; Mahalingam, T.; Sugandhi, K. Polarizability and chemical hardness: a combined study of wave function and density functional theory approach. Int. J. Quantum Chem. 2002, 86, 368–375.
[38] Ray, N.K.; Samuels, L.; Parr, R.G. Studies of electronegativity equalization. J. Chem. Phys. 1979, 70, 3680–3684.
[39] Mortier, W.J.; van Genechten, K.; Gasteiger, J. Electronegativity equalization: application and parameterization. J. Am. Chem. Soc. 1985, 107, 829–835.
[40] Chandrakumar, K.R.S.; Pal, S. A systematic study on the reactivity of Lewis acid-base complexes through the local hard-soft acid-base principle. J. Phys. Chem. A 2002, 106, 11775–11781.
[41] Chattaraj, P.K.; Maiti, B. HSAB principle applied to the time evolution of chemical reactions. J. Am. Chem. Soc. 2003, 125, 2705–2710.
[42] Chattaraj, P.K.; Liu, G.H.; Parr, R.G. The maximum hardness principle in the Gyftopoulos-Hatsopoulos three-level model for an atomic or molecular species and its positive and negative ions. Chem. Phys. Lett. 1995, 237, 171–176.
[43] Putz, M.V.; Russo, N.; Sicilia, E. On the application of the HSAB principle through the use of improved computational schemes for chemical hardness evaluation. J. Comput. Chem. 2004, 25, 994–1003.
[44] Putz, M.V. Maximum hardness index of quantum acid-base bonding. MATCH Commun. Math. Comput. Chem. 2008, 60, 845–868.
[45] Parr, R.G.; Yang, W. Density functional approach to the frontier electron theory of chemical reactivity. J. Am. Chem. Soc. 1984, 106, 4049–4050.
[46] Yang, W.; Parr, R.G.; Pucci, R. Electron density, Kohn-Sham frontier orbitals, and Fukui functions. J. Chem. Phys. 1984, 81, 2862–2863.
[47] Berkowitz, M. Density functional approach to frontier controlled reactions. J. Am. Chem. Soc. 1987, 109, 4823–4825.
[48] Senthilkumar, K.; Ramaswamy, M.; Kolandaivel, P. Studies of chemical hardness and Fukui function using the exact solution of the density functional theory. Int. J. Quantum Chem. 2001, 81, 4–10.
[49] Senet, P. Nonlinear electronic responses, Fukui functions and hardnesses as functionals of the ground-state electronic density. J. Chem. Phys. 1996, 105, 6471–6489.
[50] Gázquez, J.L.; Vela, A.; Galvan, M. Fukui function, electronegativity and hardness in the Kohn-Sham Theory. Structure and Bonding 1987, 66, 79–98.
[51] Putz, M.V. The chemical bond: spontaneous symmetry-breaking approach. Symmetry: Culture and Science 2009/2010, in press.
[52] Putz, M.V. Chemical action and chemical bonding. J. Mol. Struct. (THEOCHEM) 2009, 900, 64–70.
[53] Selegean, M.; Putz, M.V.; Rugea, T. Effect of the polysaccharide extract from the edible mushroom Pleurotus Ostreatus against infectious bursal disease virus. Int. J. Mol. Sci. 2009, 10, 3616–3634.
[54] Putz, M.V. Systematic formulation for electronegativity and hardness and their atomic scales within density functional softness theory. Int. J. Quantum Chem. 2006, 106, 361–386.
[55] Putz, M.V. Can Quantum-Mechanical Description of Chemical Bond Be Considered Complete?, in Quantum Chemistry Research Trends, Mikas P. Kaisas (Ed.), Nova Science Publishers Inc., New York, 2007, pp. 3–5.
[56] Putz, M.V.; Russo, N.; Sicilia, E. About the Mulliken electronegativity in DFT. Theor. Chem. Acc. 2005, 114, 38–45.
[57] Komorowski, L. Empirical evaluation of chemical hardness. Chem. Phys. Lett. 1987, 134, 536–540.
[58] Komorowski, L. Electronegativity and hardness in chemical approximation. Chem. Phys. 1987, 55, 114–130.
[59] Bartolotti, L.J. Absolute electronegativities as determined from Kohn-Sham theory. Structure and Bonding 1987, 66, 27–40.
[60] Pearson, R.G. Absolute electronegativity and absolute hardness of Lewis acids and bases. J. Am. Chem. Soc. 1985, 107, 6801–6806.
[61] Parr, R.G.; Pearson, R.G. Absolute hardness: companion parameter to absolute electronegativity. J. Am. Chem. Soc. 1983, 105, 7512–7516.
[62] Putz, M.V.; Chiriac, A.; Mracec, M. Foundations for a theory of the chemical field. III. The integrated electronegativity. Rev. Roum. Chimie 2002, 47, 201–206.
[63] Putz, M.V.; Russo, N.; Sicilia, E. Atomic radii scale and related size properties from density functional electronegativity formulation. J. Phys. Chem. A 2003, 107, 5461–5465.
[64] Putz, M.V. Semiclassical electronegativity and chemical hardness. J. Theor. Comput. Chem. 2007, 6, 33–47.
[65] Putz, M.V. Electronegativity: quantum observable. Int. J. Quantum Chem. 2009, 109, 733–738.
[66] Putz, M.V. Chemical hardness: quantum observable? Studia Universitatis Babeş-Bolyai - Seria Chimia 2010, accepted.
[67] Putz, M.V. Chemical reactivity and electromagnetic field. Int. J. Chem. Model. 2009, 1, in press.
[68] Bredow, T.; Jug, K. Theory and range of modern semiempirical molecular orbital methods. Theor. Chem. Acc. 2005, 113, 1–14.
[69] Burresi, E.; Sironi, M. Determination of extremely localized molecular orbitals in the framework of density functional theory. Theor. Chem. Acc. 2004, 112, 247–253.
[70] Vishveshwara, S.; Brinda, K.V.; Kannan, N. Protein structure: insights from graph theory. J. Theor. Comput. Chem. 2002, 1, 187–211.
[71] Fujita, S. Graphs to chemical structures 1. Sphericity indices of cycles for stereochemical extension of Polya’s theorem. Theor. Chem. Acc. 2005, 113, 73–79.
[72] Tomasi, J. Thirty years of continuum solvation chemistry: a review, and prospects for the near future. Theor. Chem. Acc. 2004, 112, 184–203.
[73] Berski, S.; Andres, J.; Silvi, B.; Domingo, L.R. The joint use of catastrophe theory and electron localization function to characterize molecular mechanisms. A density functional study of the Diels-Alder reaction between ethylene and 1,3-butadiene. J. Phys. Chem. A 2003, 107, 6014–6024.
[74] Kohout, M.; Pernal, K.; Wagner, F.R.; Grin, Y. Electron localizability indicator for correlated wavefunctions. I. Parallel-spin pairs. Theor. Chem. Acc. 2004, 112, 453–459.
[75] Tarko, L.; Putz, M.V. On electronegativity and chemical hardness relationships with aromaticity. J. Math. Chem. 2010, 47, 487–495.
[76] Putz, M.V. On absolute aromaticity within electronegativity and chemical hardness reactivity pictures. MATCH Commun. Math. Comput. Chem. 2010, 64, 391–418.
[77] Putz, M.V. Compactness aromaticity of atoms in molecules. Int. J. Mol. Sci. 2010, 11, 1269–1310.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 277-323
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 12
PHYSICS AND CHEMISTRY OF CARBON IN THE LIGHT OF SHELL-NODAL ATOMIC MODEL
G.P. Shpenkov
FIRMUS S.A.R.L., 33 bd Princesse Charlotte MC, 98000 MONACO
Abstract
This is the first review devoted to the structure of the carbon atom and all its isotopes in view of the shell-nodal atomic model; a derivation of the intra-atomic binding energy of its nucleons is also presented. It is shown that the main role in the formation of molecular, stable-isotope, and crystal structures (their geometry) belongs to nodal nucleons, and not to electrons, as is commonly assumed. Electrons define only the strength of interatomic bonds.
1. Introduction
According to dialectical physics [1], with its axiom on the wave nature of all physical phenomena in the Universe, the internal (spatial) structure of individual atoms [2], the symmetry of crystals [3], and the nature of Mendeleev’s periodic law [4] are described by a classical (not Schrödinger’s) wave equation

$$\Delta\hat{\Psi} - \frac{1}{c^{2}}\frac{\partial^{2}\hat{\Psi}}{\partial t^{2}} = 0. \qquad (1)$$

The $\hat{\Psi}$-function represents in this case the density of potential-kinetic phase probability of occurrence of events in wave spaces. All “elementary” particles, including electrons, protons, and neutrons, are regarded in dialectical physics as pulsing microobjects [5]. Interactions between them and with an
ambient space, or, more correctly, the exchange of matter-space and rest-motion (matter-space-time for brevity), are realized at the fundamental frequency

$$\omega_e = \frac{e}{m_e}, \qquad (2)$$

inherent in microobjects at the atomic and subatomic levels, where $e$ (3) is an elementary quantum of the rate of mass exchange (called in modern physics the electron charge), and $m_e$ is the electron mass. The fundamental frequency shows its worth everywhere. In particular, it defines the quantum of specific resistance of atomic spaces that is found in the integer and fractional quantum Hall effect [7], etc. The fundamental frequency $\omega_e$ is connected with the fundamental wave radius

$$\bar{\lambda}_e = \frac{c}{\omega_e}, \qquad (4)$$

which defines average atomic diameters and, hence, average internodal distances (lattice parameters) in ordered material structures (crystals); here $c$ is the basis speed of exchange of matter-space-time at the atomic and subatomic levels (the speed of light is equal to this speed). The wave exchange of matter-space-time is in the nature of all physical phenomena. Accordingly, the probability of possible states has a wave character and reflects the states of rest and motion. The possibility of rest and motion gives birth to the potential-kinetic field of reality, where rest (potential field) and motion (kinetic field) are inseparably linked in a single potential-kinetic field. The mathematical image (measure) of the wave of possibility is the wave of probability; the latter was called [9] the phase probability. Thus the density of phase probability is the complex function

$$\hat{\Psi} = \Psi_p + i\Psi_k. \qquad (5)$$

The density $\hat{\Psi}$ satisfies the wave equation (1), which in this case is called the wave probabilistic equation.
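To give a feeling for the magnitudes involved in Eqs. (2) and (4), the following minimal sketch evaluates the fundamental frequency as the ratio of the exchange quantum to the electron mass, and the wave radius as the speed of light divided by that frequency. The numerical inputs are not given in this section and are assumptions: standard CGS values for the electron charge and mass, and the hypothetical conversion e = sqrt(4π) × (CGS charge) for expressing the charge as a rate of mass exchange in g/s.

```python
import math

# Assumed inputs (NOT stated in this section): CGS electron charge and mass,
# and the conversion e = sqrt(4*pi) * e_CGSE used to express the charge
# as a "rate of mass exchange" in g/s.
E_CGSE = 4.80320e-10   # conventional electron charge in CGS units (assumed)
M_E = 9.10938e-28      # electron mass, g
C = 2.99792458e10      # speed of light, cm/s


def exchange_charge(e_cgse: float = E_CGSE) -> float:
    """Quantum of the rate of mass exchange in g/s (assumed conversion)."""
    return math.sqrt(4.0 * math.pi) * e_cgse


def fundamental_frequency(e: float, m_e: float = M_E) -> float:
    """Fundamental frequency as the ratio e / m_e, in 1/s (cf. Eq. (2))."""
    return e / m_e


def wave_radius(omega_e: float, c: float = C) -> float:
    """Fundamental wave radius c / omega_e, in cm (cf. Eq. (4))."""
    return c / omega_e


e = exchange_charge()
omega_e = fundamental_frequency(e)
radius = wave_radius(omega_e)
print(f"e = {e:.4e} g/s, omega_e = {omega_e:.4e} 1/s, radius = {radius:.4e} cm")
```

Under these assumptions the wave radius comes out at the scale of 10^-8 cm, i.e. of the order of an ångström, which is the scale of the average atomic diameters and lattice parameters mentioned in the text.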
Interference of the waves of exchange forms standing waves in bound domains of space [8]. Potential nodes of these waves are natural places of equilibrium disposition of the constituents of atoms and molecules, the nucleons. The equilibrium disposition of nucleons in the nodes inside individual atoms correlates with the disposition of atoms in molecules and crystals. The wave equation (1) admits particular solutions in the form

$$\hat{\Psi} = \hat{\psi}(\rho, \theta, \varphi)\, e^{\pm i\omega t}, \qquad (6)$$

where $\hat{\psi}$ is the particular solution of the Helmholtz equation

$$\Delta\hat{\psi} + k^{2}\hat{\psi} = 0, \qquad (7)$$

and $k = \omega/c$ is the wave number. The wave equation (1) describes both spherical and cylindrical wave fields of matter-space-time. The general form of the solutions of Eq. (1) for a spherical (longitudinal, central) component of $\hat{\Psi}$, in spherical polar coordinates, is

$$\hat{\Psi} = \hat{R}_l(\rho)\,\Theta_{l,m}(\theta)\,\hat{\Phi}_m(\varphi)\, e^{\pm i\omega t}, \qquad (8)$$

where

$$\hat{\psi} = \hat{R}_l(\rho)\,\Theta_{l,m}(\theta)\,\hat{\Phi}_m(\varphi) \qquad (8a)$$

is the spatial factor of the wave function (6). The radial component $\hat{R}_l(\rho)$ of the spatial factor describes the density of potential-kinetic probability of radial displacements, the polar component $\Theta_{l,m}(\theta)$ describes the polar displacements, and the azimuthal component $\hat{\Phi}_m(\varphi)$ describes the azimuthal displacements.

At integer values of the wave number m, an elementary solution of the wave equation (7) has the standard form (9). The two terms in (9) are the potential and kinetic spatial constituents of the $\hat{\psi}$-function. They reflect polar opposite features of the function: its potential and kinetic character, respectively. The half-integer solutions of (7) have the form (10), with the factors entering (10) defined by (11) and (12). From (12) it follows that the polar extremes of half-integer solutions are in the equatorial plane. All spatial components are determined to within a constant factor A, imposed by boundary conditions, which has no influence on the distribution of the nodes on radial spheres. The superposition of even and odd solutions defines the even-odd solutions. Odd solutions describe the nodes lying in the equatorial plane of atomic space. In this plane there are also solutions in the form of rings in space (shown below).

The solution (9) describes the kinematic structure of standing waves of wave physical space; namely, it yields the spatial geometry of disposition of specific points (nodes and antinodes) in which the $\hat{\psi}$-function takes zero and extremal values. The disposition of principal potential polar-azimuth nodes (obtained in spherical polar coordinates) reflects the discrete geometry of probabilistic radial polar-azimuth wave spherical shells. They are determined by the elementary solutions

$$\psi_p = C_\psi R_l(\rho)\,\Theta_{l,m}(\theta)\cos(m\varphi + \alpha), \qquad (13)$$

where $C_\psi$ is the constant factor and $\alpha$ is an initial phase of the azimuth state.
The conjugated kinetic nodes are determined by the function

$$\psi_k = C_\psi R_l(\rho)\,\Theta_{l,m}(\theta)\sin(m\varphi + \alpha). \qquad (14)$$

If potential nodes are the points of rest, the kinetic nodes are the points of maximal motion (oscillation). The relative radius $\rho$ of characteristic shells (the shells with the zero or extremal value of the radial functions $R_l(\rho)$) is defined by the roots of Bessel functions, taken separately for the potential and kinetic components; each root is labelled by the order of the Bessel function and by q, the number of the zero or extremum. Since $\rho = kr$, the radii of the shells (spheres) of zeros and of extremes follow from the corresponding roots as $r = \rho/k$.
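The statement that characteristic shells sit at roots of Bessel functions can be made concrete with a small numerical sketch. It assumes (this identification is ours, for illustration) that the relevant functions are the spherical Bessel functions of the first kind j_l(ρ), whose zeros coincide with those of the half-integer-order Bessel functions J_{l+1/2}(ρ). Only zeros are located here, by a sign-change scan followed by bisection; extrema are not treated.

```python
import math


def j(l: int, rho: float) -> float:
    """Spherical Bessel function of the first kind for l = 0, 1, 2."""
    s, c = math.sin(rho), math.cos(rho)
    if l == 0:
        return s / rho
    if l == 1:
        return s / rho**2 - c / rho
    if l == 2:
        return (3.0 / rho**2 - 1.0) * s / rho - 3.0 * c / rho**2
    raise ValueError("only l = 0, 1, 2 are implemented")


def first_zeros(l: int, count: int, step: float = 0.01) -> list:
    """Scan for sign changes of j_l and refine each root by bisection."""
    zeros = []
    a = step
    while len(zeros) < count:
        b = a + step
        if j(l, a) * j(l, b) < 0.0:
            lo, hi = a, b
            for _ in range(60):  # 60 halvings reach double precision
                mid = 0.5 * (lo + hi)
                if j(l, lo) * j(l, mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            zeros.append(0.5 * (lo + hi))
        a = b
    return zeros


# Relative radii (in units of 1/k) of the first two zero shells for each l.
zeros_by_l = {l: first_zeros(l, 2) for l in (0, 1, 2)}
for l, zs in zeros_by_l.items():
    print(l, [round(z, 6) for z in zs])
```

Dividing each relative radius by the wave number k would give the absolute shell radius, under the assumption ρ = kr.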
A great set of characteristic shells corresponds to each number l of the radial function. Zero values of the wave spherical field of probability define the radial shells of zero probability of radial displacements (oscillations); they are the shells of stationary states. According to the obtained solutions (9) [1-4], the characteristic shells of the carbon atom are defined by the quantum numbers l = 0, 1, 2 and the associated values of m. The radial functions of the even solutions (9), defining the characteristic shells of the carbon atom, are presented in Table 1 (through their relative values $R_l/A$) and in Figure 1.

Table 1. The radial functions of even solutions for the carbon atom

l    $R_l(\rho)/A$
0    $(\sin\rho \pm i(-\cos\rho))/\rho$
1    $((\rho^{-1}\sin\rho - \cos\rho) \pm i(-\rho^{-1}\cos\rho - \sin\rho))\,\rho^{-1}$
2    $[((3\rho^{-2} - 1)\sin\rho - 3\rho^{-1}\cos\rho) \pm i((1 - 3\rho^{-2})\cos\rho - 3\rho^{-1}\sin\rho)]\,\rho^{-1}$

Elementary solutions for the polar functions $\Theta_{l,m}(\theta)$ have the form

$$\Theta_{l,m}(\theta) = C_{l,m} P_{l,m}(\cos\theta), \qquad (15)$$

where $C_{l,m}$ is the coefficient depending on normalizing conditions, and $P_{l,m}(\cos\theta)$ are the associated Legendre functions. Elementary linearly independent solutions for the azimuthal functions have the form

$$\hat{\Phi}_m(\varphi) = C_m e^{\pm i(m\varphi + \alpha)}, \qquad (16)$$

where $\alpha$ is an initial phase of the azimuth state. The first component of the function (16) is potential, the second one is kinetic:

$$\Phi_p(\varphi) = C_m\cos(m\varphi + \alpha), \qquad (17)$$

$$\Phi_k(\varphi) = C_m\sin(m\varphi + \alpha). \qquad (18)$$

Both solutions, (15) and (16), define the potential-kinetic polar-azimuth function

$$\hat{Y}_{l,m}(\theta, \varphi) = \Theta_{l,m}(\theta)\,\hat{\Phi}_m(\varphi). \qquad (19)$$
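As a consistency check on the radial functions of Table 1, the sketch below takes the table entries to be the spherical Bessel pairs j_l(ρ) ± i y_l(ρ) (an identification made here for illustration) and verifies numerically that the l = 2 entry follows from the l = 0 and l = 1 entries through the standard three-term recurrence f_2 = (3/ρ) f_1 − f_0. It also checks that the l = 0 entry with the upper sign equals −i·exp(iρ)/ρ, i.e. a spherical wave.

```python
import cmath
import math


def radial(l: int, rho: float, sign: int = +1) -> complex:
    """Relative radial function R_l/A for l = 0, 1, 2 (real +/- i*imaginary)."""
    s, c = math.sin(rho), math.cos(rho)
    if l == 0:
        re, im = s, -c
    elif l == 1:
        re, im = s / rho - c, -c / rho - s
    elif l == 2:
        re = (3.0 / rho**2 - 1.0) * s - 3.0 * c / rho
        im = (1.0 - 3.0 / rho**2) * c - 3.0 * s / rho
    else:
        raise ValueError("only l = 0, 1, 2 are tabulated")
    return complex(re, sign * im) / rho


def recurrence_residual(rho: float, sign: int = +1) -> float:
    """|R_2 - ((3/rho) R_1 - R_0)|; zero if the entries obey the recurrence."""
    expected = (3.0 / rho) * radial(1, rho, sign) - radial(0, rho, sign)
    return abs(radial(2, rho, sign) - expected)


def spherical_wave_residual(rho: float) -> float:
    """|R_0 - (-i) exp(i rho) / rho| for the upper sign."""
    return abs(radial(0, rho, +1) - (-1j) * cmath.exp(1j * rho) / rho)


samples = [0.5, 1.0, 2.5, 4.0, 7.3]
max_recurrence = max(recurrence_residual(r, s) for r in samples for s in (+1, -1))
max_wave = max(spherical_wave_residual(r) for r in samples)
print(max_recurrence, max_wave)
```

Both residuals vanish to rounding error, which supports reading the real parts of the table as the potential (j_l-type) components and the imaginary parts as the kinetic (y_l-type) components.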
Figure 1. Plots of the radial spherical functions of the carbon atom: potential (a) and kinetic (b).

If the normalizing factor of the above functions is assumed to be equal to unity, these functions are called the reduced functions. The reduced polar-azimuth potential functions for the carbon atom are presented in Table 2; their graphs are drawn in Figure 2.

Table 2. The reduced polar-azimuth potential functions of the carbon atom (columns: l, m, and the function).
Figure 2. Graphs of the polar-azimuthal functions of the carbon atom.

The contour plots of potential components of the density of probability (13) for the carbon atom are presented in Figure 3. The pictures presented here are planar sections of (13), i.e., sections of the product of the three functions: radial, polar, and azimuthal.
It should be noted that the pictures shown in Figure 3 are, in accordance with the solutions, interference images of modes of standing waves in a three-dimensional spherical space. A characteristic feature of the shell at l = 2, m = 0 is the existence of a toroidal vortex-ring. Obviously, it plays an important role in physical and chemical processes at the atomic and molecular levels. We assume that it is responsible for a series of unique properties of graphene, a layer of graphite one atom thick, discovered in recent years. Therefore, we show this ring once more in Figure 4 in two projections: for a section along the z-axis (in a plane perpendicular to the plane (x, y), as in Figure 3), and additionally for the section z = 0 in the plane (x, y).
Figure 3. Contour plots (an interference image) of the planar sections of the potential density of probability for the space of the carbon atom.
The next peculiarity of the solution is the existence of nodes along the z-axis, which we call polar nodes. The 2n and 2s (north and south) polar nodes, corresponding to l = 2, are indicated in Figure 4. Polar components of the space density of probability define characteristic parallels of extremes and zeroes on radial spheres (shells). Azimuth components define characteristic meridians of extremes and zeroes. Potential and kinetic polar-azimuth probabilities together select the distinctive coordinates of the disposition of extremes (antinodes) and zeroes (nodes) on the radial shells.
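The way a polar component singles out parallels and poles can be illustrated for the shell l = 2, m = 0. The sketch below assumes that the reduced polar function for this shell is the Legendre polynomial P_2(cos θ) = (3cos²θ − 1)/2 (our assumption, in line with the Legendre form of the polar solutions) and searches a grid for the angles where its magnitude is locally maximal.

```python
import math


def theta_20(theta: float) -> float:
    """Assumed reduced polar function for l = 2, m = 0: P_2(cos theta)."""
    x = math.cos(theta)
    return 0.5 * (3.0 * x * x - 1.0)


def polar_extrema(n: int = 1800) -> list:
    """Return (angle in degrees, value) where |P_2| is locally maximal on [0, pi]."""
    grid = [math.pi * i / n for i in range(n + 1)]
    vals = [abs(theta_20(t)) for t in grid]
    found = []
    for i, v in enumerate(vals):
        left = vals[i - 1] if i > 0 else -1.0    # treat the ends as boundaries
        right = vals[i + 1] if i < n else -1.0
        if v > left and v > right:
            found.append((round(math.degrees(grid[i]), 1), theta_20(grid[i])))
    return found


extrema = polar_extrema()
for angle, value in extrema:
    print(f"theta = {angle:6.1f} deg, P_2 = {value:+.3f}")
```

Under this assumption the extremes fall at the two poles (θ = 0° and 180°) and on the equator (θ = 90°). Because m = 0 carries no azimuthal dependence, the equatorial extreme is the same at every φ, which is one way to read the ring in the plane (x, y), while the two polar extremes match the positions of the north and south polar nodes on the z-axis.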
Figure 4. The solution (13) of the wave equation (7) for the spherical shell of the carbon atom with the wave (quantum) numbers l = 2, m = 0: (a) for a section along the z-axis (in a plane perpendicular to the plane (x, y)), (b) for the section z = 0 in the plane (x, y); 2n and 2s are, respectively, the north and south polar nodes belonging to the shell.
The solutions of the probabilistic wave equation (1), presented in a form indicating the relative spatial disposition of potential extremes-nodes (discrete elements of the shell nucleon structure of the carbon atom), are drawn schematically in Figure 5. Numbers 1, 2, 3, …, 6 are the ordinal numbers of the principal polar-azimuth nodes, coinciding with the atomic numbers Z of the elements of the periodic table. Every principal node with the ordinal number Z bounds, to a definite extent, all previous shells with their nodes. Having a specific spatial structure, every such locally bounded object is distinguished from all others by specific unrepeatable properties. A totality of discrete units (nodes) of the wave probabilistic field in the object is considered as an element (“atom”) of the field. It turns out that the distribution of nodes in a standing-wave spherical space, described by the wave equation (1), also defines the distribution of dense matter (“elementary” particles) if material spaces are considered. The particles are in nodal points of the space (presented, in particular, in Figure 5 for the space of the carbon atom) because they tend toward an equilibrium state. Nodes for them are something like potential wells. It will be recalled in this connection
that the wave equation for atomic spaces is the equation of microsystems. It does not describe the motion of isolated microobjects, but it describes the wave processes at the definite level of space on the whole, determining the space structures as unified systems.
Figure 5. A schematic drawing of the nodes and a toroidal vortex-ring in the carbon atom: 0, 1N, 1S, 2N, 2S are the ordinal numbers of the polar potential-kinetic nodes (located along the z-axis, m = 0); 1, 2, …, 6 are the ordinal numbers of the principal polar-azimuthal potential nodes. The nodes 1 and 2 belong to the internal spherical shell, l = 1; the nodes 3, 4, 5, and 6 are located on the external spherical shell, l = 2.
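The count of principal nodes in Figure 5 (two on the shell l = 1, four on the shell l = 2) can be reproduced with a short sketch. It assumes (our reading, not stated explicitly in the caption) that the principal potential nodes lie in the equatorial plane at the azimuthal extremes of cos(mφ + α), with m = l for each shell, so that the polar factor sin^l θ peaks at θ = 90°. The shell assignment (1, 1) and (2, 2) is therefore hypothetical.

```python
import math


def azimuthal_nodes(m: int, alpha: float = 0.0) -> list:
    """Angles phi in degrees, in [0, 360), where |cos(m*phi + alpha)| = 1."""
    angles = (math.degrees((k * math.pi - alpha) / m) % 360.0 for k in range(2 * m))
    return sorted(angles)


# Assumed (l, m) pairs for the two polar-azimuth shells of carbon.
carbon_shells = [(1, 1), (2, 2)]
nodes = {lm: azimuthal_nodes(lm[1]) for lm in carbon_shells}
total = sum(len(v) for v in nodes.values())

for (l, m), phis in nodes.items():
    print(f"l = {l}, m = {m}: {len(phis)} nodes at phi = {[round(p, 1) for p in phis]}")
print("total principal nodes:", total)
```

With zero initial phase the two inner nodes sit opposite each other and the four outer nodes sit at right angles, giving six principal nodes in one plane, equal to the atomic number of carbon.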
Analyzing the structure of crystals at the end of the 18th century, R.J. Haüy (1743-1822) [11] came to the conclusion that it is necessary to consider atoms as elementary molecules whose internal structure determines the crystal structure of solids. As the masses of atoms are multiples of the hydrogen atom mass, following Haüy’s ideas makes it reasonable to suppose that any atom, like Haüy’s elementary molecule, is an H-atom molecule. Indeed, as was shown by a comprehensive analysis of direct and indirect consequences of the solutions of the wave equation (7) [3], the nodes of intra-atomic space are filled by H-atoms, by which we mean protons, neutrons, and hydrogen atoms; therefore, atoms of the shell-nodal atomic model are regarded as quasispherical H-atom molecules. The simplest hydrogen (protium) is an elementary brick of the Universe at the atomic level (the basis particle of the atomic level). Protium also has a complicated internal structure, which is defined, following dialectical physics, by the solutions of the same wave equation (1). However, it consists of “elementary” particles of the deepest level of the Universe, the subatomic level [16-18]. Thus, Figure 5 is a schematic image of the disposition of nodes (and a toroidal vortex-ring) in the wave spherical shells of the space of the carbon atom regarded as an elementary molecule of H-atoms. Principal azimuth nodes of the wave space of atoms are marked by ordinal numbers. These numbers coincide with the ordinal numbers of the elements of Mendeleev’s periodic table. A comprehensive analysis showed unambiguously that the number of H-atoms that can be localized in one node of an atom is equal to or less than two.
All polar nodes (corresponding to m = 0) are potential-kinetic nodes. Being potential and kinetic at the same time, they are nodes of rest and motion simultaneously. Therefore, particles of matter that for some reason settle there will be in a disturbed, nonequilibrium state. Thus, the $\hat{\psi}$-function describes, following the shell-nodal atomic model, the density of probability of concentration of matter in nodal points of limited domains of the wave spherical space of standing waves. Every atom represents a system of spherical shells with discrete points-nodes of wave space filled by H-atoms. It means that atoms do not have a customary nucleus, as is assumed in modern physics based on the quantum mechanical (QM) Rutherford-Bohr atomic (nuclear) model [10, 11]. Being essentially different from QM and, respectively, from the Rutherford-Bohr model, the new molecule-like (shell-nodal) atomic model is in good agreement with all the well-known experiments (including Rutherford’s experiment on the scattering of α- and β-particles in matter [12]) that we have had time to consider since 1996, the year of the first publication on dialectical physics. Moreover, this model, along with the dynamic model of elementary particles [5], reveals the nature of a series of phenomena misunderstood until now (see, for example, [1, 6, 13, 14]), including the so-called “forbidden” symmetries found recently in natural minerals [15]. The relative mass of atoms is defined by the total number A of H-atoms located in the nodes of the spherical shells of a concrete atom:

$$A = \sum_{k}\eta_k N_k + \sum_{i}\left(\eta_i N_i + \eta'_i N'_i\right), \qquad (20)$$

where k and i number the polar (m = 0) and polar-azimuthal shells, respectively; $N_k$ is the number of polar nodes of the k-th polar shell; $N_i$ and $N'_i$ are the numbers of principal and collateral polar-azimuth nodes, respectively, of the i-th polar-azimuthal shell; and $\eta_k$, $\eta_i$, and $\eta'_i$ are the numbers of multiplicity, i.e., of filling of the nodes ($\eta \le 2$). (Collateral polar-azimuth nodes occur in atoms that have shells corresponding to quantum numbers l ≥ 3. The first atom in this series, with l = 3, is silicon, the atomic number Z = 14.)

Thus, generally, following the physical shell-nodal atomic model, atoms represent bounded microsystems of characteristic spherical shells with nodal points, where the wave function has zero and extremal values, expressing the discrete structure of these shells. The number Z of potential (or, in equal degree, kinetic) polar-azimuth extremal points (nodes) indicates the ordinal (atomic) number of the concrete atom. The principal constituents of atoms are H-atoms located in the potential nodes. Carbon and its compounds, especially with hydrogen and oxygen, are the most studied substances, and they form a great variety of different structures. Therefore, it is appropriate to show the advantages of the aforementioned physical shell-nodal atomic model, already essentially developed (see References), by considering as an example the structure of the carbon and oxygen atoms, their isotopes, and some typical compounds; this is the goal of this review.
2. An Internal Structure of Carbon and Oxygen Atoms and their Isotopes
Thus, following the shell-nodal atomic model, atoms have an internal shell-nodal structure defined by the simplest solutions of the wave equation (1). The nodal structure of the carbon atom and its polar-azimuth functions are shown in Figure 6.

Figure 6. (a) Plots of potential polar-azimuth functions (l = 0, 1, 2), (b) symbolic designation of polar nodes on radial shells, and (c) the symbolic designation of the carbon atom.

The carbon atom has two polar-azimuth spherical shells (l = 1, 2) with six potential polar-azimuth nodes, each filled by two H-atoms. They all lie in one plane: two potential nodes are in the inner spherical shell (l = 1), and four potential nodes are in the outer spherical shell (l = 2). In the center of this structure (at the origin of spherical polar coordinates), there is an empty node (l = 0, m = 0). The remaining four empty nodes, the potential-kinetic polar ones (l = 1, 2; m = 0), are located along the z-axis. A toroidal vortex-ring (at l = 2, m = 0) is in the plane (x, y) with the axis of symmetry z. Kinetic nodes, the nodes of maximal motion (they are not shown and will not be considered here), are in a plane perpendicular to the plane of the disposition of potential nodes (compare (13) and (14)). The main mass of hydrogen in Nature at all its levels is in coherent states, in particular, in the form of coupled atoms, hydrogen molecules H2. Coupled H-atoms in polar-azimuth nodes define an equilibrium state of atomic (polar-azimuthal spherical) shells. The condition of coupling is inherent not only in H-atoms in the nodes of individual atoms, but also in atoms linked in molecules and crystals; we shall touch on this subject further.
Individual properties characteristic of an element of ordinal number Z are retained by its isotopes. The shell-nodal atomic model reveals this peculiarity by showing the structure of all possible isotopes. Following the shell-nodal atomic model, the structure of isotopes depends on the filling of empty polar potential-kinetic nodes (located along the z-axis, at m = 0) by H-atoms and on the multiplicity (η = 1 or η = 2) of filling the principal potential nodes in the external spherical polar-azimuth shell. Of course, structures with incompletely filled principal potential nodes cannot be in equilibrium, and as metastable states they are characterized by a definite lifetime. Thus, the geometry of external polar-azimuth shells and, hence, the specific strong intra-atomic nodal bindings in isotopes remain unchanged only during the lifetime of the atom, which in this case is unstable.
Figure 7. A schematic image of the filling of nodes in the lightest ($^{8}$C), heaviest ($^{22}$C), stable ($^{12}$C and $^{13}$C), and unstable (radiogenic, $^{14}$C) isotopes of the carbon atom.
The nodal shell structure of carbon presented in Figure 6 is not changed as long as each of the four polar-azimuth nodes of the external shell contains at least one H-atom. As was mentioned above, such a state with unpaired single H-atoms in the nodes will not be in equilibrium and, hence, is temporary and short-lived. Hence, following the shell-nodal structure depicted in Figures 5 and 6, the only possible lightest unstable isotope of carbon is $^{8}$C, of mass number 8 (the total number of H-atoms). The order of filling the nodes in this isotope is shown in Figure 7. It should be noted that the above-described filling of only the external nodes by one H-atom each, in the lightest unstable isotope of carbon, is inherent as well in the lightest isotopes of most atoms of the periodic table having external integer shells. Carbon is made up of isotopes with masses 12, 13, and 14. The isotope $^{13}$C can be obtained by filling the central vacant node in the carbon atom $^{12}$C with one H-atom, as shown in Figure 7. This isotope is stable. Apparently, a specific configuration of
internal fields and internodal bindings in provides a stable state of a single H-atom in the central polar-azimuthal potential-kinetic node. with two H-atoms, or filling of two vacant polar nodes Filling of the central node in in its first shell (l = 1, m = 0) by one H-atom per node, leads to two possible forms of the carbon isotope
(see Figure 7), which both are unstable, radiogenic. The half-life of
years. This is a long-lived isotope, apparently, because of a specific symmetry of is 5730 the filled nodes and, hence, due to a specific symmetrical structure of its resulting binding fields. The maximal mass number of carbon cannot exceed 22 in any way. This fact follows from the solutions presented above. Actually, the total number of the nodes in the carbon atom (excluding polar-azimuthal kinetic nodes which are not shown here): axial potentialkinetic polar and principal potential polar-azimuth, is 11. Multiplicity of filling of the nodes, does not exceed 2. Accordingly, the maximal number of H-atoms in the nodes of the . In fact, the carbon is the heaviest carbon atom cannot exceed 22: 11 nodes artificially produced short-lived isotope of carbon [19]. It is obtained forcibly by filling all vacant polar nodes, as is shown in Figure 7, by the neutron exposure on accelerators with paired H-atoms. The matrix of polar-azimuth discrete structure of the carbon atom, a matrix of the nodes (excluding kinetic nodes), , has the form
(21)
Matrix elements show the number of nodes in the shells corresponding to definite values of l (the first row, l = 0; the second row, l = 1; and the third row, l = 2) and m (the first column, m = 0; the second column, m = 1; and the third column, m = 2). The matrices of filling of the nodes by H-atoms in the stable, lightest, and heaviest isotopes of carbon (shown in Figure 7) have the following forms:
(22)
The shell-nodal structure of the possible isotopes of the carbon atom (from lightest to heaviest), originating from the solutions of the wave equation (1), is shown in Figure 8.
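The counting argument used above for the carbon isotopes can be written out as a short sketch. It takes from the text only the figures 11 nodes and a maximal filling of 2 H-atoms per node; the ordering of the nodes in the lists and the names `max_mass_number`, `mass_number` and `ISOTOPES` are assumptions of this illustration, and the fillings are one reading of the verbal description, not a transcription of the matrices (22).

```python
# Figures taken from the text: carbon has 11 fillable nodes (kinetic nodes
# excluded) and every node holds at most 2 H-atoms.
CARBON_NODES = 11
MAX_MULTIPLICITY = 2


def max_mass_number(n_nodes, multiplicity=MAX_MULTIPLICITY):
    """Largest possible number of H-atoms (mass number) for n_nodes nodes."""
    return n_nodes * multiplicity


def mass_number(filling):
    """Mass number as the total number of H-atoms summed over all nodes."""
    if any(k < 0 or k > MAX_MULTIPLICITY for k in filling):
        raise ValueError("each node holds 0, 1 or 2 H-atoms")
    return sum(filling)


# Hypothetical node ordering (an assumption of this sketch):
# [central polar node, 2 polar nodes at l = 1, 2 polar nodes at l = 2,
#  2 internal polar-azimuth nodes, 4 external polar-azimuth nodes]
ISOTOPES = {
    "lightest": [0, 0, 0, 0, 0, 2, 2, 1, 1, 1, 1],
    "stable": [0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2],
    "one H-atom in the central node": [1, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2],
    "two H-atoms in the central node": [2, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2],
    "one H-atom in each polar node at l = 1": [0, 1, 1, 0, 0, 2, 2, 2, 2, 2, 2],
    "heaviest": [2] * CARBON_NODES,
}

if __name__ == "__main__":
    for name, filling in ISOTOPES.items():
        print(f"{name}: mass number {mass_number(filling)}")
    print("upper bound:", max_mass_number(CARBON_NODES))
```

The two fillings that give mass number 14 correspond to the two forms of that isotope mentioned in the text.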
Figure 8. The nodal structure of possible carbon isotopes (solutions of the wave equation (1)).
The structure of the carbon atom, being spherical (as is, generally, any atom), looks “plane”, because all potential polar-azimuth nodes with H-atoms inside them are situated in the same plane. Accordingly, owing to the specific geometry of the disposition of the nodes, carbon atoms can form the plane hexagonal structure pertinent to graphite. The symbolic designation of carbon in Figure 6 reflects such a plane geometry of arrangement of all six potential polar-azimuth nodes and shows the shortest directions of exchange (interaction) between them. Thus, each carbon atom is associated in the shell-nodal atomic model with one of the exopentagonal bonds of the truncated icosahedron.
Figure 9. Plots of the potential polar-azimuth functions (l = 0, 1, 2, and s = 2) and their nodal points on the radial shells of the oxygen atom, for two initial phases, (a) and (b), of the azimuthal state of the external half-integer shell at s = 2; (c) the symbolic designation of the oxygen atom.
The spatial shell-nodal structure of the oxygen atom and its polar-azimuth functions, originating from the shell-nodal atomic model, are drawn in Figure 9. We see that, in comparison with the carbon atom (Figure 6), the oxygen atom has a half-integer external shell with two potential polar-azimuth nodes lying in the equatorial plane (see Eq. (12)); the rest of the shells (l = 0, 1, 2) are the same as in the carbon atom. Six potential polar-azimuth nodes, each completed by two H-atoms, lie in one plane: two potential nodes are in the shell at l = 1 and four potential nodes are in the shell at l = 2. Two external potential polar-azimuth equatorial nodes (at s = 2), the 7th and 8th, can be either in the same plane as the internal nodes or in the plane perpendicular to
the disposition of the rest of the nodes, as is shown in Figure 9b. Such positions of the nodes are determined by an initial phase of the azimuthal state that is admitted by the solutions of (1). All kinetic polar-azimuth nodes (not shown in Figure 9), the nodes of intensive motion (therefore they are empty), are in the plane perpendicular to the plane of disposition of the potential nodes, as in the carbon atom. The structure of all possible isotopes of the oxygen atom, as of every atom in the shell-nodal atomic model, is uniquely defined by the multiplicity of filling of both the external potential polar-azimuthal nodes and the polar potential-kinetic nodes. The relative masses of the isotopes calculated on this basis coincide with the experimental data; they are presented in references [2, 4]. For example, from the aforementioned solutions it follows that the maximal mass number of oxygen is 26: the total number of all its nodes (excluding, of course, kinetic ones) is 13 and the maximal multiplicity of their filling is 2; hence, 13 × 2 = 26. Actually, the oxygen isotope with mass number 26 is the heaviest artificially produced short-lived isotope of oxygen [19]. It is obtained forcibly by filling all 5 vacant polar nodes of the oxygen atom with paired H-atoms, as in the case of producing the heaviest carbon isotope considered above. Because all nodes inside the oxygen atom are situated in the same plane (we mean the case presented in Figure 5.3a), the spherical structure of the oxygen atom looks “plane”. Therefore, because of the specific geometry of the disposition of its nodes, oxygen atoms can form the plane hexagonal structure pertinent, in particular, to snow crystals [20]. The symbolic designation of oxygen presented in Figure 9c reflects such a plane geometry of the arrangement of all eight potential polar-azimuth nodes and shows the shortest directions of exchange (interaction) between them. A matrix of the polar-azimuth discrete structure of the oxygen atom, or a matrix of the nodes (excluding kinetic nodes), has the form
(23)
The matrices of filling of the nodes by H-atoms in the stable, lightest, and heaviest isotopes of the oxygen atom have the following forms:
(24)
It is pertinent to note that, in contrast to the particular solutions of Eq. (1), which uncover the shell-nodal structure of matter (atoms), Schrödinger’s quantum mechanical solutions led to the notion of the “electron structure” (“electron configuration”) of atoms, which, as it turned out, is a conceptually unfounded notion [10, 11, 21]. Moreover, Schrödinger’s solutions do not give any information about the nature and structure of atomic isotopes. This is a very important fact which, along with many other results already obtained in the framework of dialectical physics, proves the credibility of the shell-nodal atomic model. Estimates of binding energies in shell-nodal atomic structures and their compounds, presented in the last sections of this paper, also confirm this.
3. Graphite and Fullerenes
Graphite is a modification of carbon crystallized in a laminated hexagonal structure. A unit cell of graphite consists of three layers. The atoms of carbon in the top and bottom layers are at the same lateral positions. The middle layer is shifted relative to the top and bottom layers. We will follow the shell-nodal pattern of the carbon atom expressed graphically by its symbol (Figure 6c). We do not know what kind of carbon is responsible for the formation of graphite: atomic, C, or molecular, C2. Therefore, the same designation will be applied to the carbon molecule C2 as to the carbon atom C. There are suppositions that the C2 radical is responsible for the formation of graphite [22]. It was also found experimentally that the carbon dimer C2 is in fact the major observable product of C60 fragmentation. Being a very effective growth species, C2 can rapidly incorporate into the diamond lattice, leading to high film growth rates [23]. Our calculations of lattice parameters, the results of which are presented below in Figure 10, give an answer to the question: what are the elementary “bricks” in graphite, C or C2? Following the shell-nodal atomic model and assuming that the lattice constants of crystals accepted in modern physics are precise and congruent with reality, we should accept that the elementary “building block” in carbon crystals, including graphite, is the diatomic molecule of carbon, C2. Let us consider this question. Under the accepted designations of C and C2, hexagonal layers of graphite have the structure shown in Figure 10. In this figure, the lattice constants of graphite indicated in brackets correspond to imaginary lattice parameters obtained if one assumes that the crystal lattice is formed from single carbon atoms, C. When we consider the C2-based structure of graphite, we have six-multiple overlapped atomic nodes (except for boundary nodes) with 12 overlapped H-atoms per node, belonging to three linked carbon dimers.
Nevertheless, let us assume that the structure presented in Figure 10 is formed from single carbon atoms, C. Then three-multiple overlapping of every vertex (node) of bound carbon atoms takes place (one internal node with two external nodes), except for boundary atoms, where two-multiple overlapping of external nodes occurs (Figure 11). This means that every nodal point (vertex) depicted in Figure 10, except the boundary ones, belongs to three individual carbon atoms. Every vertex contains 6 overlapped H-atoms, because we need to take into account the coupled H-atoms filling each of the six nodes of the carbon atom.
Figure 10. (a) An elementary cell of graphite; (b) the structure of bindings in a layer of graphite (graphene). The overlapping of external atomic nucleon shells of carbon dimers is realized in an orderly way along the shortest bindings between internal and external nodes of different carbon dimers.
Figure 11. Two- and three-multiple overlapping of the nodes characteristic of linked carbon dimers C2 in graphene, a one-atom-thick crystal layer of graphite. The two types of two-multiple overlapping are realized on the edges of the carbon crystal.
Thus, taking into account the above assumption of the C-based structure of graphite, we should distinguish between a vertex that in the shell-nodal atomic model represents three-multiple overlapped atomic nodes (with 6 overlapped H-atoms) belonging to three different carbon atoms, and a vertex (node) that in the standard QM (nuclear) atomic model represents one carbon atom (holding 12 nucleons in its nucleus). Figure 10b and Figure 11 show a one-atomic layer of graphite. The latter, called graphene, was obtained quite recently and is intensively studied at present [24]. Unsaturated two-multiple overlapped external nodes of C2 dimers on different edges, especially zigzag-like edges, must have unusual physical and chemical properties, as chemical radicals do. Actually, the zigzag edge of a graphene nanoribbon possesses a unique electronic state and chemical reactivity. At the average density of graphite and three-multiple nodal overlapping of single carbon atoms, the lattice constants of graphite, and the average length of the bindings corresponding to them, would have to take the values shown in Figure 10 in brackets. However, these values do not coincide with the accepted data for graphite (lattice constants and bond lengths) shown in Figure 10 without brackets. The latter data completely coincide with the calculated data only for the case when the lattice is formed from C2 units. Thus, complete coincidence of the calculated parameters with the table values can be achieved if we take into account the preliminary coupling of individual carbon atoms at the atomic level, with the formation of two-atomic molecules C2 (like H2, N2, O2, etc.), and the subsequent formation of crystals from them as elementary units. In this case we have six-multiple overlapping of every node (with 12 overlapped H-atoms) and arrive at the coincidence of the aforementioned parameters with the table values. Coupling is a natural property of matter at all levels. Remember that the coupling of H-atoms, the constituents of atoms, occurs at the subatomic level as well, in intra-atomic nodes. A scheme of the overlapping of the nodes of two carbon atoms, resulting in the formation of the C2 molecule, is shown in Figure 12. Two-multiple overlapping of atomic nodes leads to nodes containing four H-atoms each. The helium atom, which belongs to the balanced atomic structures (along with neon and argon), has just the same number of H-atoms. Apparently, this fact provides the more stable thermodynamic state of C2 with respect to the state of an individual carbon atom. The following is noteworthy. The lattice constants of graphite shown in brackets in Figure 10 can prove to be verisimilar because of the following highly plausible situation. The existing X-ray and neutron scattering analysis (along with other methods) of crystals and molecules is not a simple matter, in spite of the use of powerful modern computers and perfect techniques. The gauging of diffraction images takes into account the accepted atomic model and the density of the substance known from experiment. A solid-state theory based on the QM atomic model identifies every node of a crystal lattice (or of a molecule) with one atom, which is regarded as the only center of scattering of incident particles or waves.
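The link between density, mass per lattice vertex and lattice constants that underlies the comparison above can be illustrated numerically. The sketch below is not the author's calculation: the lattice constants a = 2.461 Å and c = 6.708 Å and the count of four atoms per conventional hexagonal cell are standard handbook values inserted here as assumptions, and the isotropic rescaling in `linear_scale_for_same_density` is an assumption of this illustration, so its output need not reproduce the bracketed values of Figure 10.

```python
import math

AMU_G = 1.66053906660e-24  # grams per atomic mass unit
ANGSTROM_CM = 1.0e-8       # centimetres per angstrom


def hex_cell_density(a, c, units_per_cell, mass_per_unit_u):
    """Density (g/cm^3) of a hexagonal cell with edge a and height c in angstrom."""
    volume_cm3 = (math.sqrt(3) / 2) * a**2 * c * ANGSTROM_CM**3
    return units_per_cell * mass_per_unit_u * AMU_G / volume_cm3


def linear_scale_for_same_density(mass_ratio):
    """Isotropic length factor that keeps the density fixed when the mass
    attributed to every lattice vertex is multiplied by mass_ratio."""
    return mass_ratio ** (1.0 / 3.0)


# Handbook values for graphite (assumptions of this sketch, not source data).
A_GRAPHITE = 2.461
C_GRAPHITE = 6.708

# Conventional reading: one vertex = one carbon atom of about 12 u.
rho = hex_cell_density(A_GRAPHITE, C_GRAPHITE, units_per_cell=4,
                       mass_per_unit_u=12.011)

# C-based reading in the text: 6 H-atoms per vertex instead of 12, i.e.
# half the mass per vertex, so all lengths must shrink to keep the density.
scale = linear_scale_for_same_density(6 / 12)

if __name__ == "__main__":
    print(f"density from accepted constants: {rho:.3f} g/cm^3")
    print(f"length factor for half the mass per vertex: {scale:.4f}")
```

Under these assumptions the accepted constants reproduce a density close to the handbook value for graphite, and halving the mass per vertex at fixed density shortens every length by the cube root of 1/2.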
Figure 12. Formation of the C2 molecule: overlapping (“confluence”) of all approaching nodes (and toroidal rings, not shown here) of two carbon atoms into a unit whole. (The symbolic image of C2 does not differ from the symbolic designation of a single C atom drawn in Figure 6.)
Actually, it is assumed that every peak of electron density, in X-ray crystallography at the determination of atomic positions, corresponds to the definite position of a node with one (whole) atom [25]. A precise calculation of atomic positions and lengths of interatomic bonds uses an iterative method. It is based on the comparison and fitting of the measured and calculated (proceeding from the accepted atomic model) intensities of the reflected beam until an adequate correspondence of the two sets of values is achieved. Possibly, if X-ray analysis were based on the shell-nodal (i.e., multi-center or molecule-like) atomic model, the gauging could be different, depending on the multiplicity of overlapping of atomic nodes belonging to various atoms. It means that the three-multiple overlapping of nodes (vertices) of three carbon atoms, just as presented in Figure 10, can be real and, hence, deserves a special examination. At present, assuming that the lattice constants of crystals accepted in physics are precise and congruent with reality, we must accept that the elementary “building block” of graphite is the C2 diatomic molecule. The pictures presented show that interatomic (intermolecular) bindings in graphene are realized along the bonds of external nodes of external shells and nearby internal nodes of internal shells. Every hexagonal circle with 4 double bindings is surrounded by 6 similar hexagonal circles. We see also that single bindings between the first and second nodes of internal shells (corresponding to l = 1; Figure 6b) nowhere overlap. The next characteristic feature of graphene is its crystallographic anisotropy. The z-axes of all elementary carbon formations (C2) of a one-atomic/molecular-thick layer form straight parallel continuous chains of polar nodes along the direction perpendicular to the bonds between the 1st and 2nd internal nodes (Figures 6, 11, and 12).
Figure 13. (a) An unfolded structure of buckminsterfullerene C60; (b) a fragment showing the overlapping of the nodes belonging to 5 spherical molecules C2, resulting in three-multiple overlapping of their nodes in two vertices.
In order to verify the above described peculiarities of nodal bindings in carbon structures, found in graphite, it is interesting to consider the structure of other carbon compounds. Let us turn now to the structure of fullerenes, considering them in the light of the shell-nodal atomic model as well. Fullerenes are regarded as a molecular form of pure carbon (a kind of microcluster) representing a highly symmetrical structure that is hollow inside. They are formed by regular polygons of atomic bindings, which are strained because of their bending at the formation of the cage-like structure of carbon atoms characteristic of fullerenes. The best known among fullerenes is the C60 molecule (called buckminsterfullerene), which is well detected by mass spectrography. Complicated structural analysis has led to the conclusion that this molecule has 60 vertices, and that their bindings form 20 hexagons and 12 pentagons. Such a structure resembles a football pattern (see Figure 13). If one follows the shell-nodal atomic model, then one should accept that the elementary “building bricks” of the C60 molecule must be carbon dimers C2, but not carbon atoms C. The spherical closed pentagonal/hexagonal monomolecular shell has the rotational symmetry of order 5, forbidden in the crystallographic space of plane symmetry groups, and the highest possible icosahedral point-group symmetry [26]. The conclusion about the structure of buckminsterfullerene accepted in physics rests on the concept of the quantum mechanical (mono-center) atomic model, according to which one node (vertex) of a crystal lattice corresponds to one atom (containing 12 nucleons in its nucleus). Let us suppose that the aforementioned spherical structure with 20 hexagons and 12 pentagons is realized on the basis of the shell-nodal atomic model with the use of 30 carbon atoms, but not carbon molecules C2, in the capacity of elementary “bricks” (they are designated by the same symbol, see Figures 6 and 12).
In this case, the above structure of 60 vertices, depicted in Figure 13 in the unfolded form, will be characterized by three-multiple overlapping of all vertices (nodes). We will arrive at the hypothetical C30 molecule, because only 30 carbon atoms are needed for the formation of 60 such vertices. Obviously, this case contradicts the mass-spectrographic data. Only six-multiple overlapping (at each of the 60 vertices-nodes) leads to the C60 molecule with the above shell-nodal atomic structure of carbon. This is realized on the basis of coupled carbon atoms, participating as elementary structural units in the formation of C60, just as takes place at the formation of all crystal structures on the basis of carbon, including graphite considered above. Thus, the symbol in Figure 13 represents two coupled carbon atoms, the carbon dimer C2. Only then do we have the right to state that the above structure belongs to the C60 molecule. The lengths of the “single” and “double” bindings in such a C60 molecule are the same as in graphite formed, in the same case, from coupled carbon atoms. If one supposes that the case of non-coupled atoms with three-multiple overlapping of their nodes is real, then we should recognize that the molecule C60 called buckminsterfullerene actually has 120 vertices. This is very likely because of possible errors in the extremely complicated procedure of deciphering the intricate diffraction images of the C60 molecule (as, generally, of all fullerenes), and mainly because of the aforementioned gauging, accomplished with regard to the nuclear (mono-center) structure of atoms. Thus, at the current stage of the
development of atomic physics, we cannot exclude completely that the structure presented in Figure 13 with 60 vertices is an image of the C30 molecule, but not C60. The overlapping of atomic shells at the formation of carbon crystal structures occurs along the directions of bindings between the nodes of two shells, the external and the nearest internal, for both graphite and fullerenes. Internal bindings of internal shells of carbon dimers (between the nodes 1 and 2, corresponding to l = 1; see Figure 6b) overlap only once, at the coupling of carbon atoms and the formation of dimers (see Figure 12).
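The counting used in this section (60 vertices, 20 hexagons and 12 pentagons, and the C30, C60 and 120-vertex alternatives) can be checked with a few lines. The sketch assumes, following the text, six potential polar-azimuth nodes per carbon atom; the function names are introduced here for illustration only, and the cage count uses the ordinary polyhedron relations for a closed cage in which three faces meet at every vertex.

```python
NODES_PER_ATOM = 6  # potential polar-azimuth nodes of carbon, as in the text


def atoms_for_vertices(vertices, overlap):
    """Carbon atoms needed when every lattice vertex is a stack of
    `overlap` atomic nodes belonging to different atoms."""
    total_nodes = vertices * overlap
    if total_nodes % NODES_PER_ATOM:
        raise ValueError("node count is not a whole number of atoms")
    return total_nodes // NODES_PER_ATOM


def vertices_for_atoms(atoms, overlap):
    """Lattice vertices formed by `atoms` carbon atoms at a given overlap."""
    total_nodes = atoms * NODES_PER_ATOM
    if total_nodes % overlap:
        raise ValueError("node count is not a whole number of vertices")
    return total_nodes // overlap


def cage_counts(pentagons, hexagons):
    """Vertices, edges and faces of a closed cage built from pentagons and
    hexagons in which three faces meet at every vertex."""
    corners = 5 * pentagons + 6 * hexagons
    return corners // 3, corners // 2, pentagons + hexagons


if __name__ == "__main__":
    v, e, f = cage_counts(pentagons=12, hexagons=20)
    print("vertices, edges, faces:", v, e, f, "Euler characteristic:", v - e + f)
    print("atoms at three-multiple overlapping:", atoms_for_vertices(v, 3))
    print("atoms at six-multiple overlapping:", atoms_for_vertices(v, 6))
    print("vertices for 60 atoms at three-multiple overlapping:",
          vertices_for_atoms(60, 3))
```

Three-multiple overlapping of single atoms gives 30 atoms for 60 vertices, six-multiple overlapping (three dimers per vertex) gives 60 atoms, and 60 non-coupled atoms at three-multiple overlapping would require 120 vertices, which are the three cases discussed above.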
4. The Shell-Nodal Structure of Diamond
Diamond is a modification of carbon crystallized in a face-centered cubic lattice. The coordination number of diamond is 4. Therefore the structure of diamond is looser (less densely packed) in comparison with the cubic structures characteristic of metals. Diamond can be artificially obtained from graphite under high temperature and pressure. The disposition of all polar-azimuth nodes in one plane (inherent in the carbon atom under normal conditions and kept in graphite) is not kept under the aforementioned extreme conditions. The one-planar disposition of all atomic nodes is broken. The internal plane with the 1st and 2nd polar-azimuth nodes of an internal shell (Figure 14a) is turned about the z-axis by a phase angle with respect to the plane of disposition of the remaining four external polar-azimuth nodes belonging to the outer shell (Figure 14b). This is admitted by solutions (9) of the wave equation (7), which contain the phase angle.
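The remark above that the diamond structure, with coordination number 4, is less compact than the cubic structures typical of metals can be quantified with the usual hard-sphere packing fraction. This is a standard crystallographic estimate, not a result of the shell-nodal model; the function name and the choice of touching spheres along nearest-neighbour directions are assumptions of this sketch.

```python
import math


def packing_fraction(atoms_per_cell, radius_over_a):
    """Fraction of a cubic cell (edge a) filled by equal touching spheres.

    atoms_per_cell -- spheres per conventional cubic cell
    radius_over_a  -- sphere radius divided by the cell edge a
    """
    return atoms_per_cell * (4.0 / 3.0) * math.pi * radius_over_a**3


# Nearest-neighbour distances in units of a: sqrt(3)/4 (diamond),
# sqrt(2)/2 (fcc), sqrt(3)/2 (bcc); the sphere radius is half of each.
STRUCTURES = {
    "diamond (coordination 4)": packing_fraction(8, math.sqrt(3) / 8),
    "bcc (coordination 8)": packing_fraction(2, math.sqrt(3) / 4),
    "fcc (coordination 12)": packing_fraction(4, math.sqrt(2) / 4),
}

if __name__ == "__main__":
    for name, value in STRUCTURES.items():
        print(f"{name}: {value:.3f}")
```

In this idealization diamond fills roughly a third of the cell volume, while the bcc and fcc lattices of metals fill about two thirds and three quarters respectively.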
Figure 14. (a) A plane structure of the carbon atom (and of the carbon molecule C2 as well); (b) a shifted (turned) position of an internal shell (with nodes 1 and 2) around the z-axis by the phase angle; (c) the bindings (marked by broken lines) between the shifted internal nodes 1 and 2, belonging to different carbon dimers, resulting in the face-centered cubic structure of diamond.
Figure 15. (a) A hexagonal-wavy structure of crystallographic planes in diamond; (b) a hexagonal-wavy circle built from 3 carbon units (dimers, C2); (c) the bindings (marked by dashed lines) between hexagonal-wavy layers, resulting in the face-centered cubic structure; (d) an elementary cell of diamond.
As a result, the plane hexagonal structure of graphite layers is transformed into a wavy hexagonal structure, which enables bindings between the layers in the bulk face-centered cubic structure. The direct interaction between the 1st and 2nd nodes (drawn by broken lines in Figure 14c) of different coupled carbon atoms, belonging to nearby layers, is realized in this case. The bindings between the deformed (wavy) neighboring layers of the hexagonal structure, leading to the formation of the diamond structure, are shown in Figure 15. We see that hexagonal circles are inherent in both graphite and diamond (Figure 15a). However, whereas they are planar in graphite, they are wavy in diamond (Figure 15b,c). Thus, the specific hexagonal structure originating from the specific geometry of the disposition of nodes in the carbon atom, pronounced in graphite, is also kept in diamond in its crystallographic planes (Figure 15a). The aforementioned similarity relates to the multiplicity of overlapping of nodes as well. Peculiarities of the overlapping of the nodes in graphite were considered in the preceding section. Similarly, the parameters of diamond presented in brackets in Figure 15 correspond to a hypothetical case of non-coupled single atoms, participating supposedly as elementary units in the formation of a diamond lattice. Three-multiple overlapping of atomic nodes of different carbon atoms takes place in this case. The parameters indicated in brackets do not agree with the data accepted in physics. Obviously, if structural analysis were based on the shell-nodal (i.e., multi-center or molecule-like) atomic model, the gauging would be different, depending on what is accepted in the capacity of an elementary “building block” of crystal lattices: a molecule-like atom or a carbon dimer, C or C2. Assuming, as in the case of graphite, that the lattice constants of crystals accepted in physics are precise and congruent with reality, we must accept that the elementary “building blocks” in diamond (as, generally, in all carbon crystals) are diatomic molecules of carbon, C2.
5. The Formation of Bindings in Hydrocarbon Compounds
A carbon frame is the basis for hydrocarbon molecules. An external shell of the carbon atom (l = 2; see Figure 6) with four potential polar-azimuthal nodes is entirely completed by coupled H-atoms. The next subshell of the same shell, l = 2, with four empty nodes belongs to the neon atom, Ne, with atomic number Z = 10. Neon has all subshells of the shell l = 2 completely filled. Intermediate solutions (atomic numbers Z = 7, 8, and 9, respectively) relate to the atoms N, O, and F. They have half-integer external shells with polar-azimuthal nodes lying in the equatorial plane. Empty nodes of the next external shell, which is outside the completed shell of the carbon atom, are in an equatorial plane, as shown in Figure 16a,b (designated by dotted lines). The empty shell outside the carbon atom is simultaneously a vacant shell for the carbon environment. This shell plays a role in the formation of molecules. The four empty nodes of the shell can absorb H-atoms from the outside. In this way, the chemical level of bonds is realized and hydrocarbon molecules are formed as a result. Accordingly, this shell is called the improper shell of carbon. Thus, when the improper shell is drawn into a process of interchange (interaction) with hydrogen, hydrocarbon molecules are formed. The resulting structure of the improper shell repeats the discrete nodal structure (geometry) of the external shell of the corresponding atom to which this shell is proper. In the case of chemical adsorption of four H-atoms by the nodes of the improper shell of carbon, the methane molecule CH4 is formed (see Figure 16c). The conditional designations of the structures under consideration are drawn on the right side of the figures. A structural analog of the methane molecule is the neon atom (Z = 10). The latter has the same geometry of the disposition of nodes as the methane molecule CH4 drawn in Figure 16c. But the external equatorial nodes of the neon atom are fully completed by coupled H-atoms and are strongly bound with the rest of the nodes of neon, in comparison with the chemical-level bonds of the nodes of the improper shell of the carbon atom, which are filled with single H-atoms at the formation of methane.
The next possible nodal structure of CH4, when two nodes of the improper shell absorb coupled H-atoms, H2, is shown in Figure 16d. Moreover, the radial solutions of the wave equation (1) give a series of radial shells (see Figures 3 and 4) slowly damped in amplitude (in the radial direction), with alternating zero amplitude values determined by a series of zeros of Bessel functions [28]. These shells are proper shells of the atom, but of the second, third, etc. order. The nearest (second order) proper polar-azimuthal shell of carbon (l = 2), following the first order completely filled external shell, with four empty nodes of the same polar-azimuth angular orientation, is shown in Figure 16a,b by dotted lines. The nodes of the second order proper shell of carbon are in a plane perpendicular to the aforementioned nodes lying in an equatorial plane of the improper shell.
A case of the participation of the second order proper radial shell in the formation of molecular bonds, resulting in a plane structure of the disposition of all potential constituents in the methane molecule, is shown in Figure 17.
Figure 16. The carbon structure (a, b) with two nearest proper and improper spherical shells and their potential polar-azimuthal nodes designated by dotted lines; (c, d) two of the possible ways of the formation of the methane molecule CH4. (Polar nodes are not shown here).
Figure 17. (a) An internal structure of the carbon atom, and (b) a possible polar-azimuthal structure of the methane molecule CH4; all chemically adsorbed individual H-atoms are in one plane with the proper potential nodes of carbon, which are completely filled with coupled H-atoms.
The possible structure of some hydrocarbon compounds is demonstrated in Figure 18. At the formation of a great number of carbon-based molecules with C – C bonds, two- and three-multiple overlapping of polar-azimuthal nodes belonging to different interacting carbon atoms (or dimers) takes place. The overlapping of polar-azimuth nodes is realized along the external and internal bonds of the external and internal atomic shells, respectively, belonging to contacting carbon atoms, as shown in Figure 18. Two-multiple overlapping of the nodes of single H-atoms, characteristic of the homologous series of saturated hydrocarbons with the general formula CnH2(n+1), is shown in the first row in Figure 18. Three-multiple overlapping of the nodes of single H-atoms, similar to that of coupled H-atoms in graphite, fullerenes, diamond (shown in the previous sections 3 and 4, see Figs. 10, 11, 13, 15) and other carbon crystals, is also realized in cyclic hydrocarbon molecules (the second and third rows in Figure 18).
Figure 18. The structure of bindings in some typical hydrocarbon compounds based on the shell-nodal atomic model.
An overlapping occurs with both carbon atoms and carbon dimers (see Figure 19). In gaseous carbon compounds (as for example, in ethane C2H6, shown in Figure 18), single
carbon atoms are overlapped as a rule. Dense carbon compounds, for example benzene C6H6 (Figure 19c), are formed from carbon dimers C2. And such compounds as, for example, cyclohexene C6H10 and cyclooctadiene C8H12 (see Figure 18) are composed of both single carbon atoms, C, and carbon dimers, C2; their positions are indicated in Figure 18. From Figure 18 it follows that among the cyclic hydrocarbons CnH2n (cycloalkanes), the most stable is cyclohexane C6H12. The equilibrium geometry of atomic bindings in all six bound carbon atoms is not deformed; hence cyclohexane is not a strained compound. The most deformed intra-atomic bindings in carbon (with respect to the equilibrium structure originating from the solutions of the wave equation (1)) are observed in cyclopropane C3H6 and cyclobutane C4H8. This is why these compounds are highly strained. They have the maximal excess enthalpy (heat of formation) among cycloalkanes: 37.674 and 26.377 kJ/mol, respectively; for comparison, cyclohexane has 0 kJ/mol. The direction of C – C bonds in hydrocarbon compounds and the character of overlapping of their potential polar-azimuthal nodes are demonstrated for the case of single carbon atoms in Figure 19a (two-multiple overlapping), and for carbon dimers in Figure 19b,c.
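The two general formulas used in this section, CnH2(n+1) for saturated hydrocarbons and CnH2n for cycloalkanes, together with the excess enthalpies quoted above, can be put into a small sketch. The function names and the dictionary `EXCESS_ENTHALPY_KJ_MOL` are introduced here for illustration; the three numerical values are those given in the text and nothing beyond them is assumed.

```python
def formula(carbons, hydrogens):
    """Plain-text molecular formula such as 'C2H6' or 'CH4'."""
    c_part = "C" if carbons == 1 else f"C{carbons}"
    return f"{c_part}H{hydrogens}"


def alkane(n):
    """Saturated hydrocarbon CnH2(n+1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return formula(n, 2 * (n + 1))


def cycloalkane(n):
    """Cyclic hydrocarbon CnH2n; a ring needs at least three carbon atoms."""
    if n < 3:
        raise ValueError("a ring needs n >= 3")
    return formula(n, 2 * n)


# Excess enthalpies quoted in the text, in kJ/mol, keyed by ring size n.
EXCESS_ENTHALPY_KJ_MOL = {3: 37.674, 4: 26.377, 6: 0.0}


def least_strained(enthalpies):
    """Ring size with the smallest excess enthalpy among those listed."""
    return min(enthalpies, key=enthalpies.get)


if __name__ == "__main__":
    print([alkane(n) for n in (1, 2, 3)])
    for n, value in sorted(EXCESS_ENTHALPY_KJ_MOL.items()):
        print(cycloalkane(n), value, "kJ/mol")
    print("least strained:", cycloalkane(least_strained(EXCESS_ENTHALPY_KJ_MOL)))
```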
Figure 19. A schematic view showing how C – C bonds in hydrocarbon compounds are formed, and the character of overlapping (two- and three-multiple) of polar-azimuthal nodes for the case of single carbon atoms (a) and their dimers (b, c).
From the figures presented it follows that, at the formation of compounds, the internal nodes, 1 and 2, of connecting atoms never overlap with each other. In all cases these nodes overlap only with the nodes belonging to the external shells of a nearby attached atom, as shown, for example, in Figure 19a, where an internal node 2 (or 1) of one carbon atom overlaps with the external node 3 (or 6) of a nearby attaching atom. Three-multiple overlapping is realized with carbon dimers, C2, at the formation of carbon crystals. In this case (see Figure 19b), one internal node of one dimer overlaps with two external nodes belonging to two nearby dimers. For example, an internal node 2 (of the 1st dimer) overlaps with the node 4 (of the 2nd dimer) and the node 6 (of the 3rd dimer).
Two-multiple overlapping of dimers, resulting in a closed hexagonal ring with six vertices-nodes, is a characteristic feature of the carbon frame of the benzene molecule (Figure 19c).
Figure 20. A schematic view of self-binding (assembling) of two-dimensional carbon compounds.
A schematic view of self-binding (assembling) of two-dimensional carbon compounds on the basis of carbon dimers, C2, such as graphene and benzene, is presented in Figure 20. Nature uses the method of self-assembly and self-organization, which has resulted in the complexity of
nature. The structure of carbon crystals is strongly deterministic. For example, a characteristic feature of graphene is the fact that all its constituent carbon dimers, C2, are arranged in such a way that they form straight continuous chains of potential-kinetic polar nodes along a crystallographic direction coincident with the joined z-axes of linked dimers. It means that graphene is crystallographically and, hence, physically anisotropic. Apparently, owing to such a structure, presented in Figure 20, graphene possesses the unusual physical and chemical properties recently found experimentally [24]. The carbon frame of the benzene molecule, C6H6, has the form of a flat hexagonal ring, which is closed without strain. Its formation from carbon dimers C2 and the resulting structure are demonstrated in Figure 20. An ideal conjunction of all bonds conditions the high stability of benzene rings. It is well known that the benzene molecule behaves as a closed superconductor. Apparently, this feature occurs due to the junction around its center of three chains of empty polar potential-kinetic nodes. Thus, the unique physical properties of the benzene molecule are determined by the specific nodal structure shown schematically in Figure 20. Thus, the structural features of carbon compounds, which we have already considered, are naturally and logically explained in the framework of the shell-nodal atomic model, originating from the solutions of the wave equation (1).
6. Shell-Nodal Structure of Oxygen Compounds
Let us now consider the formation of oxygen compounds, including some of its compounds with carbon. The shell-nodal structure of the oxygen atom is presented in Figure 9. The external shell of oxygen is half-integer, with two polar-azimuthal nodes in the equatorial plane (according to the solutions (10) at s = 2) filled with coupled H-atoms. An external quarter-integer shell at s = 1 belongs to the nitrogen atom, 7N; the external fractional shell at s = 3 belongs to the fluorine atom, 9F. The entirely completed external integer shell at l = 2 belongs to the neon atom, 10Ne. The latter contains four external potential polar-azimuthal nodes (numbered 7, 8, 9, and 10 in Figure 21). Thus, the half-integer external shell of oxygen is intermediate between the entirely completed external integer shells of carbon (l = 2, Figure 6) and neon (l = 2, Figure 21). The neon atom, whose proper external shell in the equatorial plane contains four completed equatorial nodes (at l = 2), belongs to the balanced atomic formations. The external shell with the parameters indicated in brackets is at the same time the resulting balanced shell for molecular compounds formed with the atoms N, O, and F (atomic numbers Z = 7, 8, and 9, respectively), each of which has a partially filled (fractional) external shell. Accordingly, this shell, proper for neon, is regarded as improper for the nitrogen, oxygen, and fluorine atoms. Correspondingly, the empty improper nodes of the external shell of these atoms are active centers of adsorption of H-atoms from the environment. They provide the chemical level of interatomic nodal bonds through the full filling of the vacant nodes of the improper shell, without loss of the individuality of each of the interacting atoms.
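The bookkeeping of proper and improper nodes described above can be sketched in a few lines of Python. This is only an illustrative count, assuming (as the text states) that the completed neon shell holds four external nodes and that s is the number of proper nodes of the fractional shell. The function and variable names are hypothetical, and writing neon as s = 4 is a shorthand used here for the completed shell, not the author's notation.

```python
# Illustrative node bookkeeping; all names are hypothetical helpers.
NEON_EXTERNAL_NODES = 4  # nodes 7-10 of the completed neon shell (from the text)


def improper_nodes(s: int) -> int:
    """Return the number of empty (improper) nodes when s proper nodes are filled."""
    if not 1 <= s <= NEON_EXTERNAL_NODES:
        raise ValueError("s must lie between 1 and 4")
    return NEON_EXTERNAL_NODES - s


# s values for N, O, F are taken from the text; Ne = 4 is an assumed shorthand.
proper = {"N": 1, "O": 2, "F": 3, "Ne": 4}
vacant = {atom: improper_nodes(s) for atom, s in proper.items()}
print(vacant)
```

Under these assumptions nitrogen is left with three vacant nodes and oxygen with two, which agrees with the counts quoted in the text for these atoms.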
Physics and Chemistry of Carbon in the Light of Shell-Nodal Atomic Model
Figure 21. Plots of polar-azimuth functions and nodal points on radial shells of the neon atom.
Figure 22. A conditional image of the formation of the water molecule H2O, and the density of probability (contour plots) of the localization of substance (H-atoms) in the external shell for the planes x = 0, y = 0, and z = 0; the smaller dashed arrows in the pictures indicate the main directions of external internodal bindings inherent in the water molecule.
It should be recalled that when the improper shell of an atom is drawn into the process of exchange (interaction) at the chemical level of bindings, and the improper nodes are filled through adsorption of single or coupled H-atoms, a new atom does not form. Only a molecule is formed, with a shell-nodal structure that repeats the discrete geometry of the atom with the balanced external shell (neon in our case). In particular, the water molecule H2O can be formed as a result of the adsorption of two individual H-atoms by the two improper (empty) nodes, 9 and 10, of the improper shell of oxygen, as shown in Figure 22. In the framework of the shell-nodal atomic model, the water molecule can be regarded as a structural analogue of a short-lived isotope of neon (1672 ms half-life [23]). Indeed, both the isotope and the water molecule have the same atomic mass, 18, the same geometry of disposition of the nodes, and the same multiplicity of filling of all nodes. However, they do not, of course, have the same strength of bindings in their proper and improper external shells, respectively.
The shell-nodal structure of an individual water molecule is not entirely completed: not all of its nodes hold paired H-atoms. Therefore this structure is not in complete equilibrium. Such a structure will continuously tend to form bindings with hydrogen and with other water molecules by joining their half-completed nodes until the single H-atoms in them are coupled. If we look at the water molecule along the x-axis, we find a hexagonal arrangement of its nodes with six radial directions of exchange (interaction) in the plane x = 0, marked by the smaller (dashed) arrows in Figure 22. Two other directions of exchange, connecting the two improper nodes of oxygen, 9 and 10, lie in the plane y = 0, along the x-axis. The specific nodal structure of the H2O molecule enables the formation of a great variety of possible chemical (and, hence, relatively weak) internodal bindings between different H2O molecules; hence, it provides a great variety of resulting intermolecular structures in the liquid state (water), which is clearly observed experimentally in the frozen state (in ice crystals).
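The mass-number comparison between the water molecule and the neon isotope of mass 18 made above can be checked by simple counting. The sketch below assumes the most abundant isotopes of hydrogen and oxygen (one proton for hydrogen; eight protons and eight neutrons for oxygen) and a neon nuclide with ten protons and eight neutrons; these standard nuclide compositions and the helper names are assumptions of the example, not data from the source.

```python
from collections import namedtuple

# Proton/neutron composition of a nuclide (illustrative container).
Nuclide = namedtuple("Nuclide", ["protons", "neutrons"])

H1 = Nuclide(1, 0)     # hydrogen, most abundant isotope (assumed)
O16 = Nuclide(8, 8)    # oxygen, most abundant isotope (assumed)
NE18 = Nuclide(10, 8)  # neon isotope of mass number 18


def combine(parts):
    """Sum the protons and neutrons of all constituent nuclides."""
    return Nuclide(sum(p.protons for p in parts), sum(p.neutrons for p in parts))


water = combine([H1, H1, O16])
print("H2O :", water, "total =", sum(water))
print("Ne  :", NE18, "total =", sum(NE18))
```

With these assumptions both totals equal 18, and the proton and neutron counts coincide separately, which is the sense in which the two formations can share the same number of H-atoms.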
In particular, the nodal structure of oxygen apparently determines the numerous symmetric-asymmetric hexagonal forms of snowflakes, the short-range order of liquid water and the long-range crystalline order of ice, the dynamic and thermodynamic anomalies of water, etc. Water is the most abundant compound on the Earth and a major constituent of all living organisms. It remains the most enigmatic liquid on the Earth, apparently because of the specific nodal structure of the oxygen atom and its bonds with hydrogen described above.
The radial solutions of the wave equation (1) give a series of radial shells whose amplitude is slowly damped in the radial direction (shown, for example, for the carbon atom in Figures 3 and 4), with alternating zero values determined by a series of roots of Bessel functions (see Eq. 9 and Table 1). The half-integer solutions are determined by the solutions (11) and (12), where s = 1, 2, 3, … is the number of proper potential polar-azimuthal nodes entering the external fractional shell of the corresponding atom. The distribution of the density of probability (in different plane sections of the water molecule) for the external shell is shown in Figure 22.
The formation of oxygen O2 molecules (just like carbon ones) can be realized in different ways, as shown, for example, in Figure 23. A possible nodal structure of the oxygen molecule O2, in which the two-multiple overlapping of completed external proper nodes belonging to two individual oxygen atoms takes place, is shown in Figure 23a. Two improper nodes of oxygen are depicted by dashed circles. In the resulting molecule, all nodes (except the improper ones) lie in one plane. The molecule obtained has four empty improper nodes.
Figure 23. Two possible ways of the formation of the oxygen molecule O2.
Figure 24. (a) One more possible way of the formation of oxygen O2, and the possible nodal structure of carbon oxide CO, carbon dioxide CO2, and the ozone molecule O3; (b) the symbolic designations of the compounds distinguished by the two-multiple overlapping of proper nodes of the constituent atoms.
Another possible structure of the oxygen molecule O2, shown in Figure 23b,c, is formed with the participation of improper nodes. The overlapping of completed proper and empty improper external nodes of the interacting atoms leads in this case to a bulk structure of the resulting oxygen molecule. The nodes of the constituent oxygen atoms in the resulting molecule lie in mutually perpendicular planes. The molecule obtained has three empty improper nodes. The resulting structure is therefore not fully completed, just like the structure presented in Figure 23a, which accounts for its chemical activity in forming bindings with other atoms and molecules. Nor can we exclude a third way (not shown in Figure 23), analogous to the formation of the C2 molecules shown in Figure 12, in which all nodes of two approaching oxygen atoms mutually overlap in pairs.
One more possible way of forming the oxygen molecule O2, realized by the two-multiple overlapping of completed proper nodes, is shown in Figure 24a. The formation in this way of a triatomic molecule of oxygen, ozone (or trioxide), O3, is shown in this figure as well. A similar way is apparently characteristic of the formation of the chemical compounds of oxygen with carbon, CO and CO2, as also shown in this figure. Analogous interatomic bindings (just like those considered above for oxygen and carbon compounds) take place for nitrogen monoxide (or nitric oxide) NO and nitrous oxide (or nitrogen hemioxide) N2O. The possible structures of hemioxide N2O are shown in Figure 25a,b.
Figure 25. Two possible ways of the formation of hemioxide N2O: (a) an intermediate (unfolded) image of one such way; (b) another of the possible nodal bindings in the hypothetical structure of N2O. The equatorial densities of probability (contour plots) are drawn for the external shells of the separate atoms (the upper row, left and right); for the shell at l = 2 (the section for z = 0); and for the external shell of the resulting formation (l = 3).
The beginning stage of one of the possible ways of forming N2O is shown in Figure 25a. The nitrogen atom has one proper node in its external fractional shell, filled with coupled H-atoms, and three empty improper nodes. Two nitrogen atoms can each be attracted by their external proper nodes to the improper nodes of the oxygen atom (located along the x-axis) and stay there, as shown in the figure. There are other ways of forming N2O. For example, the overlapping of the two external proper nodes of oxygen with improper nodes of two nitrogen atoms, respectively, at the beginning stage of the formation of the molecule is also possible. The two associated nitrogen atoms (left and right) have rotational degrees of freedom (around the joined nodes). By turning about the overlapped nodes, they can change their position and form bonds between themselves and oxygen, up to the overlapping of either single nodes or pairs of nodes belonging to their internal shell (at l = 2), as shown, for example, in Figure 25b.
The next, very instructive example of the formation of oxygen compounds in view of their shell-nodal structure is aluminum oxide, Al2O3. Its structure, admissible by the solutions of the wave equation (1), is presented in Figure 26 in unfolded and closed forms. The atomic number of aluminum is Z = 13. This means that it has a fractional external shell at l = 3 next to the inner shell at l = 3, m = 0. This shell contains 3 proper nodes, filled with paired H-atoms, lying in the equatorial plane. The improper vacant nodes of each of the three conjugated oxygen atoms (designated by hollow circles in the figure) are joined with the external proper nodes of the two aluminum atoms, as indicated by arrows in Figure 26d. In this way, a perfectly compact arrangement is achieved of all the overlapped nodes of the two aluminum atoms (situated above and below, opposite each other, Figure 26e) and the three oxygen atoms seated symmetrically between them.
A stable neutral formation is achieved in this case, as if five individual neon spaces were tightly embedded and bound together, resembling a perfectly fitted cogged joint.
The formation of the nodal structure of carbon and oxygen compounds presented above allows us to note the following. The characteristic feature of carbon in all its innumerable compounds is the two- and three-multiple overlapping of the joined nodes of constituent atoms or dimers. The three-multiple overlapping takes place in most cyclic carbon compounds (Figure 18) and in all crystallographic forms of carbon, for example graphite (Figure 10), diamond (Figure 15), graphene (Figures 11 and 20), and fullerenes (Figure 13). Recall that three- (or two-) multiple overlapping of completed nodes means (see, for example, Figure 13b and Figure 19) that an overlapped nodal point (a joined node) belongs to three (or two) individual atoms, C, or to three (or two) carbon dimers, C2. Taking into account the coupling of H-atoms in the nodes of the carbon atom, every such joined node contains from 6 (or 4) H-atoms (when single atoms are overlapped) to 12 (or 8) H-atoms (when carbon dimers are overlapped). In contrast to carbon compounds, the formation of oxygen compounds is mostly characterized by only two-multiple overlapping of the completed nodes of individual atoms.
According to the shell-nodal atomic model, the overlapping of completed nodes occurs mainly in such a way that the resulting internodal-interatomic (chemical) bindings are realized along the principal intra-atomic internodal (strong) bonds (Figure 18) existing between external nodes (belonging to external shells) and the adjacent conjugated internal nodes (belonging to internal shells) of the interacting atoms. The examples considered of the formation of compounds based on the carbon and oxygen atoms confirm this peculiarity.
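The H-atom counts per joined node quoted above (6 or 4 for single atoms, 12 or 8 for dimers) follow from a simple product, which can be sketched as follows. The value of two H-atoms per completed node of a single atom comes from the text (coupled H-atoms); the value of four per node of a dimer is an assumption inferred here from the quoted totals, and the function name is hypothetical.

```python
# Illustrative arithmetic for joined nodes; names and the dimer value are assumptions.
H_PER_NODE_SINGLE_ATOM = 2  # one coupled pair of H-atoms per completed node (from the text)
H_PER_NODE_DIMER = 4        # inferred from the totals 12 (or 8) quoted in the text


def h_atoms_in_joined_node(multiplicity: int, h_per_node: int) -> int:
    """Number of H-atoms in a node shared by `multiplicity` overlapping units."""
    return multiplicity * h_per_node


for units, h_per_node in (("single atoms", H_PER_NODE_SINGLE_ATOM),
                          ("dimers", H_PER_NODE_DIMER)):
    three_fold = h_atoms_in_joined_node(3, h_per_node)
    two_fold = h_atoms_in_joined_node(2, h_per_node)
    print(f"{units}: three-multiple = {three_fold}, two-multiple = {two_fold}")
```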
Figure 26. Two images (a and b) of the nodal structure of the atoms O, Ne, and Al and their conditional designations (c) for different projections; the unfolded (d) and closed (e) conditional images of the resulting structure of the Al2O3 compound.
The next peculiarity, characteristic of all compounds, is that the single bindings between the 1st and 2nd internal nodes of the internal shell (corresponding to l = 1) of the constituent atoms (or dimers) never overlap (see Figures 9-11, 13, 15, 18-20, and 24-26). Thus, on the basis of the shell-nodal atomic model, which originates from solutions of the wave equation (1), all the structural features of the chemical compounds considered above are naturally and logically revealed. It should also be noted that, on the basis of the dynamic model of elementary particles and the aforementioned atomic model, physical phenomena already known are also well described, revealing previously unknown regularities and features (see References).
7. Intra-Atomic Bindings and the Binding Energy of the Carbon Atom
Unlike the conventional quantum mechanical (mono-center) nuclear model, the shell-nodal atomic model is molecule-like, i.e., multi-center. Its atom consists of H-atoms (by which we mean the proton, the neutron, and the hydrogen atom), which are not all located at one central point, the nucleus. This is the principal difference from the common nuclear (mono-center) atomic model. The main role in the new atomic theory, based on the shell-nodal atomic model, belongs to the nodal H-atoms, the constituents of atoms, and not to electrons, which are particles of the second order of magnitude (in mass) with respect to H-atoms. As was shown above, interatomic chemical bindings of different atoms are realized along the characteristic directions defined by the specific geometry of disposition of the nodes with coupled H-atoms within them (Figure 27). Exchange interactions of nodal H-atoms are responsible for the formation of both intra-atomic and interatomic bonds, and for the origination of different atoms and molecules, solids and liquids. Electrons define only the strength, but not the directions, of chemical bindings between nodes belonging to the outer shells of different interacting atoms. Let us show this from another side, on the basis of general estimates of exchange interactions related to both intra- and interatomic binding energies.
Figure 27. (a) A schematic view of internodal nucleon (“nuclear”) bonds in the carbon atom; and (b) characteristic internodal distances (between their centers) determined by the roots of Bessel functions.
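Since the characteristic distances are said to be fixed by roots of Bessel functions, a short numerical sketch may help to show what such roots look like. The example below finds the first zeros of the spherical Bessel functions j0, j1, and j2 (which correspond to Bessel functions of half-integer order) by bisection, using their standard closed forms. Which particular functions, orders, and roots enter the author's Eq. (9) and Table 1 is not stated in this section, so this choice, the scan range, and all function names are assumptions made for illustration only.

```python
import math


def spherical_j(l: int, x: float) -> float:
    """Closed-form spherical Bessel functions of the first kind for l = 0, 1, 2."""
    s, c = math.sin(x), math.cos(x)
    if l == 0:
        return s / x
    if l == 1:
        return s / x**2 - c / x
    if l == 2:
        return (3.0 / x**3 - 1.0 / x) * s - 3.0 * c / x**2
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")


def bisect(f, a: float, b: float, tol: float = 1e-12) -> float:
    """Locate a root of f in [a, b], assuming f changes sign on the interval."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if fa * fm <= 0.0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)


def bessel_zeros(l: int, x_max: float = 10.0, step: float = 0.01):
    """Scan (0.5, x_max) for sign changes of j_l and refine each one by bisection."""
    roots, a = [], 0.5
    while a < x_max:
        b = a + step
        if spherical_j(l, a) * spherical_j(l, b) < 0.0:
            roots.append(bisect(lambda x: spherical_j(l, x), a, b))
        a = b
    return roots


for order in (0, 1, 2):
    print(order, [round(z, 6) for z in bessel_zeros(order)])
```

The first roots come out near 3.1416 for j0, 4.4934 for j1, and 5.7635 for j2. In the text these dimensionless roots are multiplied by a length scale to give distances; that scaling is left out here because the relevant formulas are not reproduced in this section.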
As was mentioned above, the new atomic theory also rests on the dynamic model of elementary particles. In accordance with this model, elementary particles, being pulsing microobjects, have no rest mass. The mass of the particles, as dynamic microobjects, has an associated character [5] and is defined by the formula
(25)
whose terms are the radius of the pulsing spherical wave shell of a particle, the absolute unit density, and the wave number corresponding to the fundamental frequency of the atomic and subatomic levels (see (2)). The latter is the fundamental frequency of the field of exchange at these levels (the frequency of the “electrostatic” field); here c is the base speed of exchange of matter-space-time at the subatomic level, equal to the speed of light. We will assume that the external and internal spaces of H-atoms are delimited by the Bohr radius, so that the mass of H-atoms can be calculated from Eq. (25).
The rate of mass exchange (or, in other words, the exchange charge of the H-atom [5]) responsible for internodal bindings between the atomic constituents, H-atoms, is equal to
(26)
A rate of mass exchange of this value determines the high stability of individual atoms. Indeed, the energy of interchange (interaction) of two separate H-atoms (situated in two conjugated nodes of the same atom) separated by a distance equal to the length of double bindings in graphite (see Figure 10) is equal to
(27)
This value correlates with the experimental data for the binding energy of a neutron in a carbon nucleus and with the threshold energy of (γ, n) reactions [30]. If we take the length quoted from [31], corresponding to the isolated double binding in C = C = C and CH2 = C = O structures, the internodal energy of interaction of the constituent H-atoms obtained from Eq. (27) will practically coincide with the above threshold energy of (γ, n) reactions. Accepting the length indicated in brackets in Figure 10, we arrive at an energy that is close to the threshold energy of (n, 2n) reactions in the corresponding isotope [30] (p. 887), etc.
The energy of interchange (interaction) of two separate H-atoms situated in two conjugate nodes of the same atom [3] can be found in the same way for the distance r equal to the length of a single binding between the internal nodes 1 and 2 (see Figure 10b or 19a). This distance is also equal to an averaged characteristic length of different bindings with the participation of oxygen (S – O, C – O, N – O, B – O, etc. [31]). The energy obtained correlates with the experimental value for the binding energy of a neutron in an oxygen nucleus and with the threshold energy of (γ, n) reactions in that nucleus [30].
Calculated from the formula on the mass difference, the binding energy of the carbon atom mainly depends on the energy of its nucleus, consisting of 12 nucleons. Let us show how this energy can be derived in the framework of the shell-nodal atomic model, which has no conventional nucleus of the kind ascribed to atoms by modern physics. The derivation of the binding energy presented below should be regarded as an estimate, because it was carried out on the basis of some suppositions regarded as preliminary axioms [18].
On the basis of the solutions of Eq. (1), we must take into account only those internodal bonds in the carbon atom that are distinguished by the shortest distances between the wave shells of internodal nucleons. The angular directions of such bonds are conditioned by the space geometry of the polar-azimuthal functions (see Figures 2 and 6). The symbolic designation of carbon (introduced in Figure 6c) reflects the plane geometry of the arrangement of all six principal potential polar-azimuthal nodes and shows the shortest directions of exchange (interaction) between them. It is along these directions, shown in Figure 27a (except along the bonds between the internal nodes, 1 and 2), that the chemical bonds between nucleon nodes of different atoms are realized in the formation of carbon molecules and crystals [29].
The regularities of wave processes are described by the Bessel functions, which determine the strictly definite structure of material spaces at all levels. The five internodal bonds (1 – 2, 1 – 3, 1 – 5, 2 – 4, 2 – 6, see Figure 27) responsible for the binding energy in the carbon atom have the same length r1, which is defined by the root of Bessel functions [28] (as in the case of the helium atom):
(28)
where r0 is the Bohr radius; see also (4). The other characteristic internodal distances in the carbon atom, r2, r3, and r4, shown in Figure 27b, are not arbitrary either. They are defined by the following roots, respectively:
(29)
Hence,
(30)
The binding energy in an atom is attributed to three causes and consists of:
(1) the binding energy of paired nucleons in nodes, i.e., in essence, the energy of deuterons;
(2) the binding energy of nucleon nodes with the atomic shells to which these nodes belong; and
(3) the energy of internodal exchange (interaction) of nucleons.
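The "formula on the mass difference" referred to above for the carbon atom, and used in the text for the deuteron value of 2.224 MeV per node, is the usual mass-defect relation: the binding energy equals the difference between the summed masses of the free constituents and the mass of the bound system, multiplied by c². The sketch below evaluates it under stated assumptions: the masses and the conversion factor are standard reference values rounded to nine decimals, not numbers taken from the source, and the function name is hypothetical.

```python
# Mass-defect estimate of binding energies (illustrative sketch).
# All constants are standard reference values, not taken from the source text.
U_TO_MEV = 931.494           # energy equivalent of one atomic mass unit, MeV
M_HYDROGEN_ATOM = 1.007825032  # mass of the 1H atom, u
M_NEUTRON = 1.008664916        # mass of the neutron, u
M_PROTON = 1.007276467         # mass of the proton, u
M_DEUTERON = 2.013553213       # mass of the deuteron (nucleus), u
M_CARBON12_ATOM = 12.0         # mass of the 12C atom, u (by definition of the unit)


def mass_defect_energy(constituent_masses, bound_mass):
    """Binding energy in MeV from the difference between free and bound masses."""
    return (sum(constituent_masses) - bound_mass) * U_TO_MEV


# Carbon-12: six hydrogen atoms plus six neutrons versus the bound atom.
e_carbon = mass_defect_energy([M_HYDROGEN_ATOM] * 6 + [M_NEUTRON] * 6, M_CARBON12_ATOM)

# Deuteron: one proton plus one neutron versus the bound nucleus.
e_deuteron = mass_defect_energy([M_PROTON, M_NEUTRON], M_DEUTERON)

print(f"12C binding energy      ~ {e_carbon:.2f} MeV")
print(f"deuteron binding energy ~ {e_deuteron:.3f} MeV")
```

With these inputs the deuteron value comes out close to the 2.224 MeV per node adopted in the text, and the carbon value is about 92 MeV. Using atomic rather than nuclear masses for carbon is a common convention in which the electron masses cancel to good accuracy.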
Thus, the first constituent must take into account the energy of coupling of two nucleons in a node. We will not describe its derivation in the framework of the shell-nodal atomic model here, since it lies outside the present discussion; it is considered in detail in [18]. We will use a value equal to the deuteron’s binding energy, 2.224 MeV per node, obtained from the formula on the mass defect:
(31)
We are entitled to take this value on the assumption that, according to the shell-nodal atomic model, the coupled protons and neutrons in the nodes are in the form of the deuteron D.
The second constituent of the binding energy is defined from the following conditions. In a spherical atomic field, the radial amplitudes of oscillations of H-units in the nodes of the n-th atomic shell are determined by the expression
(32)
which originates from solutions of Eq. (1) for the radial function [1], where
(33)
is expressed through Bessel functions. The constant A is equal to
(34)
Then, assuming that the mass is equal to the atomic mass unit, at the level of the fundamental frequency ωe, the energy of oscillations takes the form:
(35)
where the remaining quantity denotes the roots of Bessel functions. One of these roots defines the equilibrium distance between two potential polar-azimuthal nodes of the internal atomic shell of carbon. Hence, according to (35), for the 1st and 2nd nodes (Figure 27), situated in the internal atomic shell, the binding energy of the nucleon node with the atomic shell is
(36)
Thus, the second constituent of the binding energy (36) takes into account the bond of a node with the atomic shell in which this node is located. A transition from one n-shell to another is defined by the following transition energy:
(37)
Transitions of nucleons from the internal shell to the external shell, where four nodes are located, are defined by the above formula. Substituting the numbers of the two shells gives the corresponding transition energy. The binding energy for each of the four nodes of the external shell is
(38)
According to the dynamic model (DM) and the law of universal exchange, the energy of exchange (interaction) of particles at the atomic and subatomic levels is defined by the formula
(39)
where the two charges entering the formula are the exchange charges of the interacting particles (pulsing microobjects). The third constituent of the binding energy of the carbon atom, the energy of internodal exchange, is determined by formula (39). According to the latter, the elementary binding energy caused by the exchange interaction between two nodes a distance r1 apart is
(40)
The exchange energy (40) of the quantum of nucleon exchange of the 1st node (Figure 27a) is expended on three equal bonds, with the 2nd, 3rd, and 5th nodes; that of the 2nd node is expended on the bonds with the 1st, 4th, and 6th nodes. Hence the binding energy per node (for the 1st and 2nd nodes) is
(41)
Each of the 3rd, 4th, 5th, and 6th nodes is connected with only one node (the 1st or the 2nd). Hence, the binding energy per node (for nodes 3 to 6) is
(42)
Thus, we have the following internodal binding energies between the nodes numbered (1-2):
(43)
and (3-1), (5-1), (4-2), (6-2):
(44)
Thus, the total energy of internodal exchanges is
(45)
The resulting sum of all constituents of the binding energy of the carbon atom, (31), (36), (38), and (45), calculated for the exchange charge of the proton, qp, is
(46)
Calculations can likewise be made for the exchange charge of a neutron. Subtracting the energy of the four valence electrons of the carbon atom from the value (46), we arrive at the energy of the carbon ion:
(47)
We see that the binding energy of the carbon ion, obtained here on the basis of the shell-nodal atomic model and the dynamic model of elementary particles, is in good agreement with the binding energy of the nucleus of the stable carbon isotope calculated from the formula on the mass difference.
In conclusion, it should be noted that the shell-nodal atomic model makes it possible to understand the physics of atomic reactions caused by the inelastic interaction of high-energy particles with substance. Indeed, the main “structural” units of the shells are potential nodes with coupled H-atoms (and, of course, empty kinetic nodes, not considered here), which are in themselves deuterons. The d-, p-, and n-radiation occur when nodal H-atoms are torn off from their nodes. In addition, outgoing important elementary constructions of shells, such as structural “edges” of two nodes completed with two pairs of H-atoms, form radiation. Under powerful impacts, a “splitting off” of the external shells of heavy atoms takes place, resulting in the formation of lighter elements. Consequently, such “elementary” splinters as p, n, and t also appear.
8. Interatomic (Molecular Level) Binding Energies
In accordance with the shell-nodal atomic model, we consider atoms as quasi-spherical multiplicative molecules of H-atoms. The word “multiplicative” means that the particles (H-atoms) constituting these elementary molecules are characterized by strong intra-atomic internodal bonds (analogous in value to the common “nuclear” ones). Accordingly, we call them multiplicative bindings. Ordinary molecules with relatively weak (chemical) bonds (analogous to cohesive or adhesive bonds) we call additive molecules. They are related to the electron level of bindings. For example, if deuterium D (an isotope of the hydrogen atom containing two H-atoms) is a multiplicative molecule, then the hydrogen molecule H2 is an additive molecule. Accordingly, in the latter case we deal with additive bindings. Let us estimate the electron level of bindings, the level of additive bindings. The energy of electron binding is equal to
(48)
where
(49)
is the minimal quantum of the rate of mass exchange, the electron exchange charge;
(50)
is the absolute unit density; and the remaining quantity is the fundamental wave radius (see (4)).
The energy obtained predetermines the electron work function of solids. For instance, the electron work functions of mono- and polycrystals of Al, B, Bi, W, Fe, Co, and Cu lie within a corresponding range [32, 33]. The energy (48) practically coincides with the dissociation energy of the molecules H2, HD, and HT and is close to the dissociation energy of the molecules O2 and OH (4.4) [30] (p. 425), etc. The energy of electron binding (48) also correlates with the break energy of bindings in molecules and radicals. The binding energy (of the electron level) per mole of substance defines the characteristic break energy of chemical bonds
(51)
where NA is the Avogadro number. A definite energy is spent upon tearing an H-atom off a node of the improper shell. In accordance with the experimental data [34], this energy for CH4 and for C2H4 is consistent with the obtained value (51).
Obviously, when two bonds are broken simultaneously, the break energy must be approximately twice as large. Indeed, this is the case for the breakdown of the O2 molecule, which has two similar bonds. The additive bindings (of the electron level) manifest themselves in the molar heat capacity of molecules and in other phenomena considered in detail in [1].
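The passage from a binding energy per bond to a break energy per mole, as in relation (51), amounts to multiplying by the Avogadro number. The sketch below shows this conversion for an energy given in electronvolts; the sample input of 4.4 is the one figure that survives in the text (quoted for OH) and is assumed here to be in eV, the constants are the exact SI values, and the function name is hypothetical.

```python
# Convert an energy per bond (eV) into an energy per mole (kJ/mol).
# Constants are exact SI values; the sample energy is an assumed illustration.
ELEMENTARY_CHARGE = 1.602176634e-19  # joules per electronvolt
AVOGADRO_NUMBER = 6.02214076e23      # particles per mole


def ev_per_bond_to_kj_per_mol(energy_ev: float) -> float:
    """Energy per bond in eV multiplied by N_A, expressed in kJ/mol."""
    return energy_ev * ELEMENTARY_CHARGE * AVOGADRO_NUMBER / 1000.0


single_bond = ev_per_bond_to_kj_per_mol(4.4)  # assumed sample value, eV
two_bonds = 2.0 * single_bond                 # two similar bonds broken at once

print(f"one bond : {single_bond:.1f} kJ/mol")
print(f"two bonds: {two_bonds:.1f} kJ/mol")
```

One electronvolt per bond corresponds to roughly 96.5 kJ/mol, so a bond energy of a few electronvolts maps onto the several hundred kJ/mol typical of tabulated bond-breaking energies.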
9. Conclusion
We should not only “calculate all the results” [37], but also reveal and understand the “genetic code” of structural variety in Nature, that is, comprehend Nature, where the carbon and oxygen considered above, and their constituent hydrogen atoms, dominate. Therefore, the creation of physical atomic models is inevitable. Conceivable physical atomic models must be non-contradictory, clearly comprehensible, and in good agreement with common sense, logic, and the experimental data. The shell-nodal atomic model described here meets these requirements.
A new atomic theory based on the shell-nodal atomic model and the dynamic model of elementary particles accounts for all the physical phenomena related to atomic structure that we have so far had time to consider [1]. First of all, it predicts theoretically the number of all possible isotopes, including the ultimate ones, the lightest and the heaviest (shown in this report for carbon), revealing their internal structure. The latter has been a large gap in our knowledge, which has existed until now because of the invalidity of the quantum mechanical (QM) atomic theory currently dominant in modern physics. The shell-nodal atomic model, which is taking its first steps, has been considered here in general outline using the very important examples of the structure of the carbon and oxygen atoms.
The shell-nodal (multi-center) structure of atoms resembles the nodal structure of spherical resonant cavities having internal oscillating electric and magnetic mode fields. They are described by Bessel functions. Such resonant cavities can exist in free space even without a physical material guiding the wave [36]. All types of elementary crystal formations represent, in essence, the elementary nodal structures of standing waves in a limited three-dimensional physical wave space. All atoms, including the carbon and oxygen atoms described above (Z = 6 and Z = 8), like their basic constituents, H-atoms (Z = 1), have an internal structure defined by solutions of the ordinary wave equation (1), resembling R.J. Haüy’s elementary molecules [35].
The physical shell-nodal (molecule-like) atomic model enables a simpler and more convincing elucidation of the formation of chemical compounds than the abstract and intricate explanation based on the QM atomic model. As it turns out, the main role in the formation of molecular and crystal structures belongs to nodal H-atoms, the constituents of atoms. Electrons define only the strength, but not the directions, of chemical (additive) bindings.
The crystal structure of graphite, diamond, and buckminsterfullerene is characterized either by three-multiple overlapping of nodes belonging to 3 carbon atoms or by six-multiple overlapping of nodes belonging to 6 carbon atoms (3 carbon dimers, C2). We cannot state unambiguously at present which of the two possible overlaps, three- or six-multiple, is closer to reality. In the light of the discovery of the shell-nodal atomic structure, both cases are now equiprobable. An additional investigation of the conformity of the commonly used gauging (accepted in structural analysis with due account of the nuclear atomic model) to the shell-nodal atomic model has become topical; in the future it must resolve the above duality. It is necessary for a precise verification of the lattice parameters and binding lengths currently accepted in modern physics, in view of the new concept of intra- and interatomic structure.
The new theory operates with exchange charges. This makes it possible to reveal the nature of nuclear and chemical bindings from one theoretical concept, the universal law of exchange (related to gravitational, electromagnetic, and strong interactions [1]). The rate of mass exchange of H-atoms, or the exchange charge of H-atoms, defines the nature of multiplicative intra-atomic bindings (called strong or nuclear in modern physics). The electron exchange charge, the minimal quantum of the rate of mass exchange, is responsible for additive (interatomic, or chemical) bindings. Estimates of the energy of interactions between the eigennodes of an atom having the shell-nodal spherical structure, and between nodes of different interacting atoms, are entirely consistent with the experimental data.
References
[1] L. Kreidik and G. Shpenkov, Atomic Structure of Matter-Space, Geo. S., Bydgoszcz, 2001, 584 p.
[2] G. Shpenkov, Shell-Nodal Atomic Model, Hadronic Journal Supplement, Vol. 17, No. 4, 507-566, (2002).
[3] L. Kreidik and G. Shpenkov, The Wave Equation Reveals Atomic Structure, Periodicity and Symmetry, Kemija u Industriji, 51, 9, 375-384, (2002).
[4] G. Shpenkov, An Elucidation of the Nature of the Periodic Law, Chapter 7 in a book "The Mathematics of the Periodic Table", edited by Rouvray D. H. and King R. B., Nova Science Publishers, NY, 119–160, 2006.
[5] L. Kreidik and G. Shpenkov, Dynamic Model of Elementary Particles and the Nature of Mass and ‘Electric’ Charge, "Revista Ciencias Exatas e Naturais", Vol. 3, No 2, 157-170, (2001); www.unicentro.br/pesquisa/editora/revistas/exatas/v3n2/trc510final.pdf
[6] G. P. Shpenkov, Theoretical Basis and Proofs of the Existence of Atom Background Radiation, Infinite Energy, Vol. 12, Issue 68, 22-33, (2006).
[7] G. P. Shpenkov, The Dependence of Hall Conductance Quanta on the Fundamental Frequency of the Atomic Level; http://shpenkov.janmax.com/Hall.pdf
[8] G. P. Shpenkov, The Nodal Structure of Standing Spherical Waves and the Periodic Law: What Do They Have in Common? Physics Essays, Vol. 18, No 2, (2005).
[9] G. Shpenkov and L. Kreidik, Discrete Configuration of Probability of Occurrence of Events in Wave Spaces, Apeiron, Vol. 9, No. 4, 91-102, (2002); http://redshift.vif.com/JournalFiles/V09NO4PDF/V09N4shp.PDF
[10] L. Kreidik and G. Shpenkov, Important Results of Analyzing Foundations of Quantum Mechanics, Galilean Electrodynamics & QED-East, Special Issues 2, 13, 23-30, (2002); http://shpenkov.janmax.com/QM-Analysis.pdf
[11] G. P. Shpenkov and L. G. Kreidik, Schrodinger's Errors of Principle, Galilean Electrodynamics, Vol. 16, No. 3, 51-56, (2005).
[12] G. P. Shpenkov, The Scattering of Particles and Waves on Nucleon Nodes of the Atom, International Journal of Chemical Modelling, Vol. 2, No. 1, (2008).
[13] G. P. Shpenkov, Microwave Background Radiation of Hydrogen Atoms, Revista de Ciencias Exatas e Naturais, Vol. 4, No. 1, 9-18, (2002); www.unicentro.br/pesquisa/editora/revistas/exatas/v4n1/Microwave.pdf
[14] G. P. Shpenkov, Derivation of the Lamb Shift with Due Account of Wave Features for the Proton-Electron Interaction, Revista de Ciencias Exatas e Naturais, Vol. 6, No. 2, 171-185, (2004); http://shpenkov.janmax.com/derivation.pdf
[15] D. Shechtman, et al., Metallic Phase with Long-Range Orientation Order and no Translation Symmetry, Phys. Rev. Lett., 53, No. 20, 1984, pp. 1951-53; P. J. Steinhardt, New Perspectives on Forbidden Symmetries, Quasicrystals, and Penrose Tilings, Proc. National Academy of Sciences of the USA, Vol. 93 (25), No. 10, pp. 14267–14270, (1996), http://www.pnas.org/content/93/25/14267.full.pdf; P.J. Steinhardt and H.C. Jeong, A simpler approach to Penrose tiling with implications for quasicrystal formation, Nature, (London), Vol. 382, pages 433-435 (1996).
[16] G. P. Shpenkov, Particles of the Subelectronic Level of the Universe, Hadronic Journal Supplement, Vol. 19, No. 4, 533-548, (2004).
Physics and Chemistry of Carbon in the Light of Shell-Nodal Atomic Model
323
[17] G. P. Shpenkov, On the Nature of the Ether-Drift, Magnetic Strength, and Dark Matter, Phys. Essays, 20, 46 (2007). [18] G. P. Shpenkov, The Binding Energy of Helium, Carbon, Deuterium, and Tritium in View of Shell-Nodal Atomic Model and Dynamic Model of Elementary Particles (2007), http://shpenkov.janmax.com/stronginteraction.pdf [19] G. Audi and A.H. Wapstra, The 1995 Update to the Atomic Mass Evaluation, Nuclear Physics A595, Vol. 4, p. 409-480, December 25, 1995; R.R. Kinsey, et al., The NUDAT/PCNUDAT Program for Nuclear Data. Data extracted from NUDAT database (Jan. 14/1999). [20] W.A. Bentley and W.J. Humphreys, Snow Crystals, McGraw-Hill, New York and London, 1931. [21] G. P. Shpenkov, Conceptual Unfoundedness of Hybridization and the Nature of the Spherical Harmonics, Hadronic Journal, Vol. 29. No. 4, p. 455, (2006). [22] H. C. Shik, et al., Diamond and Related Materials 2, Elsevier Science Publishers B. V., Amsterdam, (1993), 531. [23] D. M. Gruen, at al, Turning Soot Into Diamonds With Microwaves, Proceedings of the 29th Microwave Power Symposium, Chicago, Illinois, July 25-27, 1994. [24] A.K. Geim and K. S. Novoselov, The Rise of Graphene, Nature Naterials, 6, 183-191 (2007). [25] Russell S. Drago, Physical Methods in Chemistry, W.B. Saunders Company, 1977. [26] R.E. Smalley, The Third Form of Carbon, Naval Research Reviews, December (1991), p. 3-14. [27] C.S. Yannoni, P.P. Bernier, D.S. Bethune, G. Meijer, J.R. Salem, NMR Determination of the Bond Length in C60, 113. Journal of the American Chemical Society, (1991), p. 3190-3192. [28] F.W.J. Olver, ed., Royal Society Mathematical Tables, Vol. 7, Bessel Functions, part. III, Zeros and Associated Values, Cambridge, 1960. [29] G. P. Shpenkov, The Role of Electrons in Chemical Bonds Formations (In the Light of Shell-Nodal Atomic Model), Molecular Physics Reports, 41, 89-103, (2005). [30] Tables of Physical Quantities, Reference Book (in Russian), edited by I. K. Kikoin, M., Atomizdat, 1976, pp. 891-892. 
[31] A.J. Gordon and R.A. Ford, The Chemist’s Companion: A Handbook of Practical Data, Techniques and References, A Wiley-Interscience Publication, 1972. [32] A.P. Babichev, et al., Physical Quantities, Reference Book, Atomenergoizdat, Moscow, 1991, Table 23.1, p. 568, in Russian; [33] H.B. Michelson, The Work Function of the Elements and Its Periodicity, Journal of Applied. Physics, Vol. 48, No 11, pp. 4729-4733, (1977). [34] V.I. Vedeneev, et al., The Chemical Bond Brake Energy (in Russian), Moscow, 1962; A.P. Babichev, et al., Physical Quantities, Reference Book, Atomenergoizdat, Moscow, 1991. [35] R.J. Haüy, Essai d’une Theorie sur la Structure des Crystaux, Paris, 1784; Traite Elementaire de Physique, Paris, Imp. Delance et Lesueur, an XII, 1803. [36] R.F. Harrington, Time-Harmonic Electromagnetic Fields, McGraw-Hill, 1961. [37] P.A.M. Dirac, Classical Theory of Radiating Electrons, Proc. Roy. Soc., 1939, V. 168, p. 148.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 325-341
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 13
MOLECULAR MODELING OF THE PEANUT LECTIN-CARBOHYDRATE INTERACTION BY MEANS OF THE HYBRID QM/MM METHOD
Alexei N. Pankratov1,a, Nikolay A. Bychkov1 and Olga M. Tsivileva1,b
1 Division of Analytical Chemistry and Chemical Ecology, Institute of Chemistry, N. G. Chernyshevskii Saratov State University, 83 Astrakhanskaya Street, Saratov 410012, Russia
2 Laboratory of Microbiology, Institute of Biochemistry and Physiology of Plants and Microorganisms, Russian Academy of Sciences, 13 Entuziastov Ave., Saratov 410049, Russia
a E-mail address: [email protected], [email protected]. Web: sgu.ru/node/44087, sgu.ru/faculties/chemical/pankratov
b E-mail address: [email protected]

Abstract
The literature on the molecular modeling of protein-carbohydrate systems and protein-ligand interaction was searched and analyzed. The hybrid QM/MM method was found to be the most appropriate for modeling the interaction of proteins with different ligands. Within the framework of this method, the force fields AMBER and CHARMM were used until 2000 and during the period 2000-2005, respectively, while the most recent works treat the OPLSAA force field as the preferable one and use UFF to a lesser extent. Using the Brookhaven Protein Data Bank, an adequate lectin model was selected for modeling the interaction in lectin-carbohydrate systems. By means of the QM/MM method, the interaction of peanut lectin with seven carbohydrates, whose molecules incorporate D-galactose and D-glucose units, was modeled. The amino acid fragments of the carbohydrate binding site, along with the carbohydrate molecule, served as the quantum chemical subsystem. The energy of formation of the lectin-carbohydrate complex (conjugate) from its constituent parts was proposed as a criterion of carbohydrate specificity. A theoretical substantiation was presented for the dominant specificity of peanut lectin to β-anomeric digalactoses, as well as for its higher specificity to galactose derivatives than to glucose derivatives. An AIM analysis of the interaction between the hydroxyl groups of the carbohydrate and the polar (amide, carboxylic, hydroxyl) groups of the protein was performed. This interaction was shown to satisfy the criterion of hydrogen bonding. The analysis of the topological properties of the electron density pointed to the dominant role of aspartic and glutamic acid residues in protein-carbohydrate binding. The results of the molecular modeling correlate with the experimental data on carbohydrate specificity, as well as on the contribution made to this specificity by the nature of the carbohydrate anomers and by the L-glutamic acid residues. This confirms the predictive power of the approach as a whole, of the energy of formation of the protein-carbohydrate conjugate as a criterion of carbohydrate specificity, and of the electron density at the bond critical point and its Laplacian as a means of establishing the amino acid residues essential for the manifestation of lectin activity.
1. Introduction
At present, chemistry has made great strides in extending the chemistry of covalent and ionic compounds into supramolecular chemistry, which deals with complexes and associates in which the mutual disposition of the components is provided by non-covalent bonds [1]. Within the framework of supramolecular chemistry, there is a field of science concerned with biospecific interaction. Biospecific interaction (enzyme - substrate, antigen - antibody, protein - carbohydrate, hormone - receptor, etc.) constitutes the basis of current methods and approaches in analytical chemistry (immunochemical methods of analysis, affinity chromatography and others), in which specificity is realized as the highest degree of selectivity. Affinity (biospecific) labeling is a powerful tool for protein research. Biospecific polymeric adsorbents are used in medicine, biochemistry, and the food industry.

Supramolecular chemistry gains special importance when applied to living objects. It involves such essential concepts as recognition, self-organization and self-assembly. All these phenomena occur in living organisms at the level investigated by molecular biology. Basic biochemical knowledge assigns nucleic acids and proteins the decisive role in information flow in biosystems. Connected by the genetic code, the transcribed portions of the genome govern the expression of a complex set of messages on the level of polypeptides [2]. However, when one aspires to understand intra- and intercellular recognition processes comprehensively, the two biochemical dimensions established by nucleic acids and proteins are not sufficient to explain all molecular events satisfactorily (e.g., in cell adhesion). To bridge this gap, consideration of further code systems is essential. In contrast to nucleic acids and proteins, branching of chains is a common feature of the glycan part of cellular glycoconjugates (glycoproteins, glycolipids).

In defining the role of glycans as hardware in information storage and transfer, it should be noted that oligosaccharides surpass peptides by more than seven orders of magnitude in the theoretical ability to build isomers [3]. Thus, the capacity for information storage has been extended to the third biochemical dimension, established by the so-called carbohydrate code. With the introduction of the concept of the sugar code on the level of sequence and conformation, carbohydrates gain their place as ideal candidates for generating compact units with explicit informational properties [2]. The message of the coding units of the sugar code, in the interplay with sugar receptors, triggers post-binding signaling and the intended biological response. Information stored as sequence and shape has to be grasped. Translating and transmitting it into the intended responses is the task of decoding devices, which should specifically recognize the coding units established by glycans. Therefore, in addition to their physicochemical roles in controlling folding, oligomerization and access of proteolytic enzymes [4-8], oligosaccharides in glycan chains can be likened to the postal code in an address, conveying distinct messages read by suitable receptors [2].

This bioinformation potential of carbohydrates is realized through biospecific carbohydrate-protein interaction. The carbohydrate-binding proteins (the sugar receptors just mentioned) are classified into enzymes responsible for assembling, modifying and degrading sugar structures, immunoglobulins homing in on carbohydrates as antigens, and lectins. The latter class encompasses all carbohydrate-binding proteins that are neither antibodies nor enzymes coupling ligand recognition with catalytic activity to process the target [9, 10]. Lectins are proteins capable of specific recognition of and reversible binding to the carbohydrate moieties of complex carbohydrates without altering the covalent structure of any of the recognized glycosyl ligands [11].

Carbohydrate-protein complexes are formed in the initial steps of a large number of physiological and pathological processes, ranging from cell-pathogen interaction to cell-cell recognition, tumor metastasis, etc. Interference with these recognition events could be used to modulate or alter signal transmission, or to prevent the onset of diseases.
Molecular recognition by specific targets is at the heart of the drug discovery process. The synthesis of functional sugar mimics capable of antagonizing oligosaccharides at the protein receptor level has attracted a great deal of attention as a way to develop drugs with good stability and synthetic availability [12, 13]. The role of lectins and lectin-like receptors in the immune system is a topic of current interest [13]. Immunity-related reactions are known to be controlled in their early stages by the interaction of proteins with oligosaccharides, often in the form of glycoconjugates [14-17]. This interaction is realized mainly via lectins or lectin-like substances capable of recognizing specific carbohydrate determinants in cellular structures. The self-assembly of organelles is accompanied by this interaction, too [18]. Therefore, the protein-carbohydrate interaction is extremely important in living systems.

Typical contributions to this kind of ligand binding originate from hydrogen bonding, electrostatic effects, hydrophobic interaction and medium effects, and depend on molecular conformations, the pH of the medium, the state of the reactants, i.e. protein and carbohydrate (protolytic, tautomeric equilibria, etc.), the presence of metal ions, and other factors [19]. The mechanism of the majority of these reactions is still not understood, which presents a great scientific and practical challenge. Investigations in this field involve both experimental and computer-assisted approaches. For the directed design of appropriate conjugates, the methods of molecular modeling (molecular dynamics, molecular mechanics, quantum chemistry) are promising.
Because of the large size of protein systems, it is reasonable to apply the QM/MM method, which combines the strengths of quantum chemistry and molecular mechanics. The goal of the present work is molecular modeling of the lectin interaction with carbohydrates by means of the hybrid QM/MM method. In the course of the investigation, the following tasks were formulated:
1. To search and analyze the literature on simulation methods for the protein-carbohydrate interaction. To choose the most appropriate variant of the QM/MM method for modeling the lectin-carbohydrate interaction, namely the force field and the theory level for the quantum chemical computations.
2. To select a model protein in compliance with the following criteria: lectin nature, known spatial structure, determined saccharide specificity, absence of lipid and carbohydrate moieties in the molecule.
3. To analyze the three-dimensional structure of the model protein. To identify the binding site(s), as well as the peptide regions critical for the binding site. To outline the quantum chemical and molecular mechanical subsystems.
4. To perform computations of the lectin complexes with different carbohydrates and to compare the energies of conjugate formation with the lectin affinity to saccharides. To draw a conclusion on the predictive power of the calculation method used.
5. To carry out the AIM analysis. To identify the amino acid residues most essential for the lectin-carbohydrate bonding. To discuss the results obtained in the light of the published literature data.
The present work follows two lines of scientific inquiry:
• The establishment of quantitative structure - property relationships in the series of inorganic, organic, organoelement and coordination compounds; the establishment of the relationship between reactivity and the electronic structure of molecules and nanoclusters in the ground and excited states on the basis of a refined understanding of electronic effects, electronegativities of atomic groups and the hydrogen bond, and of a generalization of the views on reaction mechanisms (including oxidation and reduction, nitrosation, nitration, azo coupling, halogenation, alkoxylation, condensations, other electrophilic, nucleophilic and radical processes, complexation, ligand exchange, molecular and ionic association, dissociation, tautomerism and dual reactivity, isomerization, proton, hydrogen atom and “hydride ion” transfer) and on the regioselectivity of homolytic (oxidative and reductive) coupling reactions for substances of different classes; the study of the influence of the medium on the course of chemical processes; the development of the theory of the action of analytical reagents; molecular modeling of biospecific protein-ligand interaction and the development of the physical chemistry of morphogenic proteins of higher fungi; the systematization and generalization of information about Web resources on natural sciences and ecology (Professor Dr. Sc. Alexei N. Pankratov);
• Research into the physiology and biochemistry of edible cultivated mushrooms, the functions and biological activity of glycopolymers and carbohydrate-binding proteins of xylotrophic basidiomycetes, the role of carbohydrate-binding glycoproteins in the processes of fungal vital activity, and the biosynthesis and characterization of lipophilic compounds of mycelial fungi (Leading Researcher Dr. Sc. Olga M. Tsivileva).
2. Methodology of Theoretical Study
The initial geometry of the protein was taken from the Brookhaven Protein Data Bank (RCSB PDB) [20]. The molecular modeling was performed by means of the hybrid QM/MM method [21] (quantum chemical method PM3 [22-24], force field OPLSAA [25]) using the FireFly program package in the version modified by Dr. James W. Kress [26]. The initial PDB file was converted with the programs TINKER V 4.2 [27] and Force Field Explorer [28], and the polypeptide chain was viewed and analyzed with the Protein Explorer V 2.80 program [29]. The AIM analysis was carried out with the AIMALL program [30], using a wave function computed by the Hartree - Fock method with the 3-21G basis set [31, 32] in the PC GAMESS V 7.1.F package [33] for the geometry found by the QM/MM method.
3. Results and Discussion
3.1. Choice of the Force Field for Molecular Mechanics Computations
Based on an analysis of works published from 1995 to the present (not cited here for lack of space), we have summarized the data on the force fields used in the QM/MM method (Figure 1).
Figure 1. Frequency of force fields use: 1 - CHARMM, 2 - AMBER, 3 - OPLSAA, 4 - UFF, 5 – others.
Within the framework of the QM/MM method, the force fields AMBER [34] and CHARMM [35, 36] were used until 2000 and during the period 2000-2005, respectively, while the most recent works treat the OPLSAA force field [25] as the preferable one and use UFF [37] to a lesser extent. More than half of the papers whose authors use the QM/MM method involve quantum chemical methods of the DFT group [21].
3.2. Choice of Subject under Study and Initial Approximation Preparation
We searched for lectins. From all the lectins with a characterized three-dimensional structure (59 proteins) in the Brookhaven Protein Data Bank (RCSB PDB) [20], we selected the peanut protein with the PDB code 2PEL (Figure 2). This lectin was treated as a model since it has been adequately explored experimentally and displays specificity to galactose [38-40]. As far as we know, there are no studies of this lectin by quantum chemical methods.
Figure 2. Peanut lectin macromolecule complexed with 1,4-β-D-digalactose (one fourth of the elementary cell of the crystal).
In addition, structures of this lectin’s complexes with 1,3-α-D-digalactose, 1,3-β-D-digalactose, and D-galactoso-1,6-β-(N-acetyl-D-galactosamine) are available [39]. For convenience, in Figure 2 the color of the polypeptide chain changes gradually from red to blue (red for the C-end, blue for the N-end). The polypeptide chain is visualized by a tubular model, and the carbohydrate molecule and the metal ions by a ball model. In the lower-left part of the figure, the group of red balls denotes the 1,4-β-D-digalactose molecule. In the lower part, the large red balls denote the calcium ion (on the left) and the manganese(II) ion (on the right). Hydrogen atoms are not shown.
The unit cell of the protein crystal structure incorporates 4 polypeptide chains identical in both composition and conformation, each of which involves calcium and manganese(II) ions. The polypeptide chain consists of 3446 atoms (232 amino acid residues). Within it, two sections that include the binding site were chosen. The first section comprises amino acid residues 79 to 134 (775 atoms), the second residues 210 to 215 (84 atoms) (Figure 3). The end fragments of the boundary amino acid sequences were terminated (by adding hydrogen at the N-end and oxygen at the C-end; the atom types were changed in compliance with the force field). The initial PDB file was converted by means of the TINKER V 4.2 and Force Field Explorer programs, and the polypeptide chain was viewed and analyzed using the Protein Explorer V 2.80 program.
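The selection of the two sections from the PDB file can be illustrated by a minimal sketch. It is not the software used in this work: the function names, the choice of chain identifier "A" and the helper that builds demonstration records are assumptions made for illustration only. The sketch relies solely on the fixed-column layout of PDB ATOM records (chain identifier in column 22, residue number in columns 23-26).

```python
# Illustrative sketch only (not the authors' conversion software).
# Residue ranges are those named in the text: 79-134 and 210-215.
SECTIONS = [(79, 134), (210, 215)]


def atom_line(serial, name, resname, chain, resseq):
    """Hypothetical helper: build the leading columns of a PDB ATOM record."""
    return f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}"


def in_sections(resseq, sections=SECTIONS):
    """True if the residue number falls into one of the chosen sections."""
    return any(lo <= resseq <= hi for lo, hi in sections)


def select_atoms(pdb_lines, chain_id="A", sections=SECTIONS):
    """Keep ATOM records of one chain whose residues lie in the sections.

    chain_id="A" is an assumption; any one of the four identical chains
    could be taken.
    """
    selected = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, etc.
        if line[21] != chain_id:
            continue  # column 22 holds the chain identifier
        resseq = int(line[22:26])  # columns 23-26 hold the residue number
        if in_sections(resseq, sections):
            selected.append(line)
    return selected


if __name__ == "__main__":
    demo = [atom_line(1, "CA", "ASP", "A", 78),
            atom_line(2, "CA", "ASP", "A", 80),
            atom_line(3, "CA", "SER", "A", 211)]
    for record in select_atoms(demo):
        print(record)
```

With a real file, the lines read from the 2PEL entry would be passed to `select_atoms`; the terminating atoms and force-field atom types described above would still have to be added separately.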
Figure 3. Simulated part of peanut lectin macromolecule complexed with 1,4-β-D-digalactose. Brown balls represent carbon atoms, blue - nitrogen atoms, red - oxygen atoms, gray - hydrogen atoms.
In Figure 3 the carbohydrate molecule is located on the right; therefore, in contrast to Figure 2, the large violet balls denote the manganese(II) ion on the left and the calcium ion on the right. The further conversion into a format compatible with PC GAMESS/FireFly was performed by means of special software written by one of us (Nikolay A. Bychkov). The description of the metal ions was added separately, as a supplement to the basic program. Carbohydrate molecules were converted in the same manner as the polypeptide chains.
3.3. Separating Out the Quantum Chemical Subsystem
As a result of the polypeptide chain analysis carried out with the Protein Explorer V 2.80 program, we decided to construct the quantum chemical component (Figure 4) from amino acid residues 80-83 (aspartic acid, proline, alanine, aspartic acid), 125-129 (tyrosine, serine, asparagine, serine, glutamic acid), and 211 (serine). The carbohydrates were treated quantum chemically in all cases.
Figure 4. Quantum chemical subsystem of peanut lectin macromolecule complexed with 1,4-β-D-digalactose. The atom numbering is given. Brown balls represent carbon atoms, blue - nitrogen atoms, red - oxygen atoms, gray - hydrogen atoms.
The metal ions were treated within molecular mechanics. Tentative computations and an AIM analysis were performed. This analysis showed that residue 83 (aspartic acid) does not form hydrogen bonds with the carbohydrate; we therefore transferred this residue into the molecular mechanics part.
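The resulting division of the modeled residues into the two subsystems can be written down as a short bookkeeping sketch. The residue numbers and names are those given above; the function and variable names are hypothetical, and the sketch only tracks which protein residues belong to which subsystem (the carbohydrate and the metal ions are not included).

```python
# Illustrative bookkeeping only; names are hypothetical.
MODELED_SECTIONS = [(79, 134), (210, 215)]

# Residues initially assigned to the quantum chemical (QM) component.
QM_RESIDUES = {
    80: "Asp", 81: "Pro", 82: "Ala", 83: "Asp",
    125: "Tyr", 126: "Ser", 127: "Asn", 128: "Ser", 129: "Glu",
    211: "Ser",
}


def partition(sections, qm_residues, moved_to_mm=()):
    """Return sorted residue numbers of the QM and MM subsystems."""
    modeled = {n for lo, hi in sections for n in range(lo, hi + 1)}
    qm = set(qm_residues) - set(moved_to_mm)
    if not qm <= modeled:
        raise ValueError("QM residues must lie inside the modeled sections")
    mm = modeled - qm  # everything else is treated by molecular mechanics
    return sorted(qm), sorted(mm)


if __name__ == "__main__":
    # Residue 83 forms no hydrogen bonds with the carbohydrate,
    # so it is moved to the molecular mechanics part.
    qm, mm = partition(MODELED_SECTIONS, QM_RESIDUES, moved_to_mm=[83])
    print("QM residues:", qm)
    print("number of MM residues:", len(mm))
```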
3.4. Computations by Means of the QM/MM Method
It is known [39] that the lectin under study displays specificity to D-galactose. This specificity is usually examined using disaccharide derivatives as examples. A few works [39, 40] report the preferential specificity of peanut lectin to the β-anomers of such compounds.
We attempted to substantiate theoretically the carbohydrate specificity of this lectin. This required a large body of computations at different theory levels. A single-point QM/MM computation of the system on a personal computer with an AMD Athlon XP+ 2666 processor and 1 GB of RAM takes on the order of several minutes when the quantum chemical part is treated at the semiempirical PM3 level, on the order of an hour and a half at the HF/3-21G level, and on the order of 6 hours at the HF/6-31G(d,p) level. The geometry optimization at the semiempirical level takes about 750 iterations and several days. Therefore, geometry optimization at a high level of theory is not feasible with the given hardware. Since the quantum chemical part carries a charge of –2, the DFT method is hardly applicable in the given case because of poor convergence, as was verified for B3LYP with the 3-21G and 6-31G(d,p) basis sets.

It was decided to apply the semiempirical PM3 method and to describe the molecular mechanics part by the OPLSAA force field. The geometry optimization was carried out by the standard gradient technique. In the FireFly software package, the molecular mechanics block is only partially compatible with the quantum chemical block when semiempirical methods of quantum chemistry are involved. For this reason, the geometry of the molecular mechanics subsystem was not optimized in our study, i.e. it was fixed as extracted from the Brookhaven Protein Data Bank (RCSB PDB) [20]. Nevertheless, the QM/MM method gave results (see below) that agree with the experimental data. Moreover, fixing the coordinates of the atoms of the molecular mechanics subsystem considerably reduces the computational resources and the duration of the computations. Should the need arise and sufficient computational resources become available, the above results could be refined at a higher theory level.

Table 1. Carbohydrate part of the lectin-carbohydrate systems
Lectin-carbohydrate system*    Carbohydrate part**
2DV9                           D-Gal-β-1,3-D-Gal
2DVD                           D-Gal-α-1,3-D-Gal
2DVB                           D-Gal-β-1,6-D-Gal
2PEL                           D-Gal-β-1,4-D-Gal
2DV9+                          β-D-Gal
2DVD+                          α-D-Gal
2PEL+                          D-Glc-β-1,4-D-Gal
* The sign “+” means that the carbohydrate part of the system is modified relative to the analog referred to in the literature
** Carbohydrate residue abbreviations: Gal - galactose, Glc - glucose
Table 1 refers to the carbohydrate parts of the peanut lectin conjugates by the abbreviations conventionally adopted in the international scientific literature and in the Brookhaven Protein Data Bank (RCSB PDB). Computations of all the lectin-carbohydrate conjugates were performed. On the basis of the optimized geometry of each complex, the initial approximation was chosen for computing the isolated molecular systems of the protein and the carbohydrate. Therefore, every estimate of the formation energy of a given complex involves its own value of the energy of the protein molecule, which adopts a somewhat different conformation in each case. Consequently, each value obtained required a substantial amount of computation.

According to the X-ray data [39] used as the initial approximation, the secondary structure of peanut lectin is predominantly of the β-type (Figure 3). The fixed geometry of the molecular mechanics subsystem conserves this protein structure, which supports the correctness of the computations. The results of the computations by the QM/MM method are shown in Table 2.

Table 2. Values of the total energy (a.u.) of the molecules of peanut lectin, carbohydrates and their complexes
System    Protein         Carbohydrate    Conjugate       Energy of complex formation, kcal/mol
2DV9      –444.5731264    –183.2695074    –627.8721643    –18.5
2DVD      –444.5800379    –183.4243216    –628.0294127    –15.7
2DVB      –444.5725665    –183.2717943    –627.8778614    –21.0
2PEL      –444.5470548    –183.3017077    –627.8765784    –17.5
2DV9+     –444.5731264    –97.5640462     –542.1622761    –15.7
2DVD+     –444.5800379    –97.6897403     –542.2972807    –17.2
2PEL+     –444.5470548    –183.0882028    –627.8765784    –12.9
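The last column of Table 2 can be reproduced from the three total energies, as the following minimal sketch shows. It assumes that the energy of complex formation is the difference E(conjugate) - E(protein) - E(carbohydrate) and that 1 a.u. = 627.5095 kcal/mol; neither is stated explicitly in the text, but both are consistent with the tabulated numbers to within 0.1 kcal/mol. The 2PEL+ row is omitted because its tabulated conjugate energy coincides with that of 2PEL and does not reproduce the listed value of –12.9 kcal/mol. The function and variable names are hypothetical.

```python
# Illustrative sketch; the formula and conversion factor are assumptions
# that happen to reproduce the tabulated energies of complex formation.
HARTREE_TO_KCAL_PER_MOL = 627.5095

# (protein, carbohydrate, conjugate) total energies in a.u., from Table 2.
TABLE2 = {
    "2DV9":  (-444.5731264, -183.2695074, -627.8721643),
    "2DVD":  (-444.5800379, -183.4243216, -628.0294127),
    "2DVB":  (-444.5725665, -183.2717943, -627.8778614),
    "2PEL":  (-444.5470548, -183.3017077, -627.8765784),
    "2DV9+": (-444.5731264, -97.5640462, -542.1622761),
    "2DVD+": (-444.5800379, -97.6897403, -542.2972807),
}


def formation_energy_kcal(e_protein, e_carbohydrate, e_conjugate):
    """Energy of complex formation from its parts, in kcal/mol."""
    delta_au = e_conjugate - e_protein - e_carbohydrate
    return delta_au * HARTREE_TO_KCAL_PER_MOL


if __name__ == "__main__":
    for system, energies in TABLE2.items():
        print(f"{system:6s} {formation_energy_kcal(*energies):7.2f} kcal/mol")
```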
When analyzing the data obtained, one should remember that the computations were carried out with no allowance for the solvent; therefore, the experimental value of the conjugation energy is more positive by the energy of dehydration of the binding site and the carbohydrate. However, over the series of carbohydrates this quantity changes only slightly, so comparative studies can be performed without regard to the medium effect. If the complexation energy serves as the criterion of carbohydrate specificity, the data obtained display the lectin’s preferential specificity to 1,6-β-D-digalactose. The lowest specificity is observed toward 1,4-β-D-glucoso-D-galactose. The difference between the complexation energies with these two carbohydrates is about 8 kcal/mol. This value does not exceed the computation error for heats of formation of C-, H-, N-, O-containing substances [22, 23, 41, 42]. However, whatever the relation between this quantity and the error of the quantum chemical method [24], which may be affected by the presence of calcium and manganese atoms in the protein subunit, the fact that one of the reactants (lectin) is the same and the other reactants (carbohydrates) are structurally similar, and that the structures of the complexes are of the same type and relatively close to each other, makes it possible, in accordance with the relativization principle [43], to compare the energy characteristics of the lectin-carbohydrate adducts without taking into consideration the basis set superposition error (BSSE) [44-46]. Moreover, the necessity of the BSSE correction is controversial [47, 48]. The authors of [48] performed a theoretical and numerical analysis of the different “counterpoise correction” (CP) schemes potentially applicable to correct for the BSSE in the neighborhood of transition structures of chemical reactions. The analysis showed that none of them is satisfactory: all CP versions either result in discontinuous potential surfaces or yield different energies for the same species in various reactions.
By the criterion of complexation energy, the series of carbohydrate specificity is the following:
β-1,6-D-digalactose > β-1,3-D-digalactose > β-1,4-D-digalactose > α-D-galactose > β-D-galactose = α-1,3-D-digalactose >> β-1,4-D-glucoso-D-galactose.
It has been established experimentally [49] that the lectin shows higher specificity toward α-methylgalactopyranoside than toward its β-form (the inhibiting ability relative to that of lactose being 2.5 and 1.25, respectively); as for disaccharides, the affinity is better toward β-anomers, and the lectin is most specific to the β-1,3 disaccharide derivatives. Consequently, our work provides a theoretical substantiation for the peanut lectin’s highest specificity to β-anomeric digalactoses, as well as for its higher specificity toward galactose derivatives than toward glucose derivatives.
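The specificity series follows directly from ordering the systems by their energies of complex formation. A minimal sketch is given below; the energies are the rounded values from Table 2, the system codes are those of Table 1, and the function name is hypothetical. Systems with equal energies are grouped, which corresponds to the "=" sign in the series.

```python
# Illustrative sketch; energies of complex formation (kcal/mol) from Table 2.
COMPLEXATION_ENERGY = {
    "2DV9": -18.5,   # D-Gal-beta-1,3-D-Gal
    "2DVD": -15.7,   # D-Gal-alpha-1,3-D-Gal
    "2DVB": -21.0,   # D-Gal-beta-1,6-D-Gal
    "2PEL": -17.5,   # D-Gal-beta-1,4-D-Gal
    "2DV9+": -15.7,  # beta-D-Gal
    "2DVD+": -17.2,  # alpha-D-Gal
    "2PEL+": -12.9,  # D-Glc-beta-1,4-D-Gal
}


def specificity_series(energies):
    """Group systems by energy and order them, most negative first."""
    groups = {}
    for system, energy in energies.items():
        groups.setdefault(energy, []).append(system)
    return [(energy, sorted(systems))
            for energy, systems in sorted(groups.items())]


if __name__ == "__main__":
    for energy, systems in specificity_series(COMPLEXATION_ENERGY):
        print(f"{energy:6.1f}  {' = '.join(systems)}")
```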
3.5. AIM Analysis
Richard F.W. Bader developed the current quantum theory of the electronic structure of molecules based on the analysis of the topological properties of the electron density. This theory is called the Quantum Theory of Atoms in Molecules (AIM) [50-54]. According to Bader’s theory, a molecule can be divided into fragments (atoms) by zero-flux surfaces, which obey the equation
∇ρ(r)·n = 0,
where ρ(r) is the electron density at the point r, and n is the unit vector normal to the surface. Points for which ∇ρ(r) = 0 are called critical.

A positive value of the electron density at the Bond Critical Point (BCP) (ρb) together with a negative value of the Laplacian of the electron density at the BCP (∇²ρb) testifies to the concentration of electron charge within the internuclear region and to its depletion toward the nuclei, i.e. to the occurrence of a strong chemical bond. Conversely, a low ρb value together with a positive ∇²ρb value is evidence of the concentration of electron charge on the nuclei and its depletion in the internuclear space, i.e. it testifies to the absence of a covalent bond.

Within the framework of the topological theory of electron distribution, the BCP (3, –1) is the point of the electron density gradient field for a given nuclear configuration q at which ∇ρ(r, q) = 0 and the density has a local maximum in two directions and a local minimum in the third direction, i.e. a saddle point in three dimensions. In the designation (3, –1), the first number is the rank of the critical point (the number of non-zero eigenvalues of the matrix of second derivatives), and the second is the signature (the algebraic sum of the signs of the eigenvalues). A BCP exists between every pair of adjacent bound atoms, its position reflecting the bond polarity. The BCP of an A-B bond is displaced toward A, and thus reserves the greater part of the electron density for the B atom, provided that B is more electronegative than A.

The formation of a hydrogen bond is accompanied by the appearance of a BCP between the hydrogen atom and the proton-acceptor atom, which are linked by a bond path. This BCP possesses the characteristic properties of a closed-shell interaction: the value of the electron density at the BCP (ρb) is relatively moderate, and the Laplacian of the electron density (∇²ρb) is positive, which points to the depletion of electron density from the interatomic region toward the interacting nuclei ([55-64], etc.).

We carried out the AIM analysis of the protein, the carbohydrate and the conjugate for the system of 1,3-α-D-digalactose with the lectin. For the pairs of atoms, one belonging to the lectin and the other to α-1,3-D-digalactose, between which hydrogen bonding could be presumed, we analyzed, at the HF/3-21G theory level, the electron density (ρb) at the BCP and the Laplacian of the electron density at the BCP (∇²ρb) (Table 3).

Table 3. The ρb and ∇²ρb values (a.u.) at the bond critical points
Numbers of atoms*    ρb        ∇²ρb
O44…H152             0.0435    0.156
O128…H95             0.0222    0.092
H119…O130            0.0154    0.084
O9…H154              0.0492    0.138
H68…O139             0.0057    0.032
* According to Figure 4
In Table 3, O44 is an oxygen atom of the COOH group of the L-aspartic acid residue; H95 is a hydrogen atom of the CONH2 group of the L-asparagine residue; H119 is a hydrogen atom of the OH group of the L-serine residue; O9 is an oxygen atom of the COOH group of the L-glutamic acid residue; H68 is a hydrogen atom of the OH group of the L-tyrosine residue; H152 and H154 are hydrogen atoms of OH groups of the D-galactose residues; O128, O130 and O159 are oxygen atoms of OH groups of the D-galactose residues. The greater the electron density at the BCP, the stronger the hydrogen bond. Consequently, one can conclude that five hydrogen bonds are formed, two of which (O44…H152 and O9…H154) stand out as stronger than the others. These strongest hydrogen bonds form between the oxygen atoms of the carboxylic groups of the aspartic and glutamic acid residues and the hydrogen atoms of hydroxyl groups of the galactose residues. Weaker hydrogen bonds occur between the hydrogen atom of the amide group of asparagine and the oxygen atom of an OH group of a galactose fragment, and between the hydrogen atom of the hydroxyl group of serine and the oxygen atom of an OH group of a galactose residue. Thus, the results of the AIM analysis show that the aforementioned hydrogen atom - heteroatom interactions (between the carbohydrate hydroxyl groups and the polar (amide, carboxylic, hydroxyl) groups of the protein) satisfy the criteria of hydrogen bonding.
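The ranking argument above (a larger ρb is read as a stronger hydrogen bond, and a positive ∇²ρb as a closed-shell interaction) can be reproduced with a few lines of Python. This is only an illustrative sketch: the numbers are copied from Table 3, while the dictionary layout and the function name `rank_by_density` are assumptions introduced here and are not part of the authors' workflow.

```python
# Values (a.u.) copied from Table 3: label -> (rho_b, laplacian of rho_b).
# The data structure and names below are illustrative assumptions.
bcp_data = {
    "O44...H152": (0.0435, 0.156),
    "O128...H95": (0.0222, 0.092),
    "H119...O130": (0.0154, 0.084),
    "O9...H154": (0.0492, 0.138),
    "H68...O139": (0.0057, 0.032),
}


def rank_by_density(data):
    """Return contact labels sorted from the highest to the lowest rho_b."""
    return sorted(data, key=lambda label: data[label][0], reverse=True)


ranking = rank_by_density(bcp_data)

# A positive Laplacian at every BCP is the closed-shell signature
# expected for hydrogen bonds in the discussion above.
all_closed_shell = all(lap > 0 for _, lap in bcp_data.values())

for label in ranking:
    rho, lap = bcp_data[label]
    print(f"{label:12s} rho_b = {rho:.4f}  laplacian = {lap:.3f}")
```

Sorting by ρb places the two carboxylate contacts first, in line with the conclusion drawn in the text; the sketch says nothing about absolute bond energies.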
The authors of [65] compared the carbohydrate specificity of the natural peanut lectin with that of its artificial mutants, in which the L-glutamic acid residue (amino acid residue 129) was replaced by other amino acid residues. Such a substitution was found to cause a substantial increase in the inhibiting concentrations of the carbohydrates, i.e. worse specificity (Table 4).

Table 4. Inhibiting concentrations (mmol/l) of various carbohydrates for the peanut lectin and its artificial mutant

Carbohydrate                                     Lectin    Mutant
D-Galactose                                      3.84      6.65
D-Galactosamine                                  2.80      9.26
N-Acetyl-D-galactosamine                         –         6.94
D-Galactoso-β-1,4-D-glucose                      1.37      2.45
D-Galactoso-β-1,3-N-acetyl-D-galactosamine       0.10      0.35
The above result agrees with the data of our AIM analysis. In general, the results of the molecular modeling we have performed confirm the predictive power of the QM/MM method as a whole, of the energy of formation of the protein-carbohydrate conjugate as a criterion of carbohydrate specificity, and of the electron density at the bond critical point and its Laplacian for identifying the amino acid residues essential for lectin activity.
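The trend in Table 4 can be quantified as the ratio of the mutant's inhibiting concentration to that of the natural lectin. The sketch below uses only the values printed in Table 4; the ASCII carbohydrate labels, the use of `None` for the entry shown as "–", and the function name `fold_increase` are assumptions made for illustration.

```python
# Inhibiting concentrations (mmol/l) from Table 4: name -> (lectin, mutant).
# None marks the entry printed as a dash for the natural lectin.
table4 = {
    "D-Galactose": (3.84, 6.65),
    "D-Galactosamine": (2.80, 9.26),
    "N-Acetyl-D-galactosamine": (None, 6.94),
    "D-Galactoso-beta-1,4-D-glucose": (1.37, 2.45),
    "D-Galactoso-beta-1,3-N-acetyl-D-galactosamine": (0.10, 0.35),
}


def fold_increase(data):
    """Return mutant/lectin concentration ratios, skipping missing entries."""
    return {
        name: mutant / lectin
        for name, (lectin, mutant) in data.items()
        if lectin is not None
    }


ratios = fold_increase(table4)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}-fold higher inhibiting concentration")
```

Every ratio that can be formed exceeds one, which is the numerical content of the statement that the substitution at residue 129 raises the inhibiting concentrations.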
Acknowledgments
The authors would like to thank Dr. James W. Kress (President of The KressWorks Foundation, Chairman and Chief Executive Officer of KressWorks) for valuable advice and discussion, and Dr. Donna Dennis (Nova Science Publishers, Inc., Editorial Production Manager) for careful handling of the manuscript.
References
[1] J.-M. Lehn, Supramolecular Chemistry. Concepts and Perspectives; VCH Verlagsgesellschaft mbH, 1995. Russian Edition: Nauka, Novosibirsk, 1998, 334 pp.
[2] H.-J. Gabius, “Biological Information Transfer Beyond the Genetic Code: The Sugar Code”, Naturwissenschaften, vol. 87, no. 3, pp. 108-121, 2000.
[3] R.A. Laine, “The Information-Storing Potential of the Sugar Code”, In: Glycosciences: Status and Perspectives, Editors: H.-J. Gabius and S. Gabius, Chapman & Hall, London, Weinheim, 1997, pp. 1-14.
[4] K. Drickamer and M.E. Taylor, “Evolving Views of Protein Glycosylation”, Trends in Biochemical Sciences, vol. 23, no. 9, pp. 321-324, 1998.
[5] P. Gagneux and A. Varki, “Evolutionary Considerations in Relating Oligosaccharide Diversity to Biological Function”, Glycobiology, vol. 9, no. 8, pp. 747-755, 1999.
[6] G. Reuter and H.-J. Gabius, “Eukaryotic Glycosylation - Whim of Nature or Multipurpose Tool?”, Cellular and Molecular Life Sciences, vol. 55, no. 3, pp. 368-422, 1999.
[7] N. Sharon and H. Lis, “Glycoproteins: Structure and Function”, In: Glycosciences: Status and Perspectives, Editors: H.-J. Gabius and S. Gabius, Chapman & Hall, London, Weinheim, 1997, pp. 133-162.
[8] A. Varki, “‘Unusual’ Modifications and Variations of Vertebrate Oligosaccharides: Are We Missing the Flowers for the Trees?”, Glycobiology, vol. 6, no. 7, pp. 707-710, 1996.
[9] S.H. Barondes, “Bifunctional Properties of Lectins: Lectins Redefined”, Trends in Biochemical Sciences, vol. 13, no. 12, pp. 480-482, 1988.
[10] H.-J. Gabius, “Non-Carbohydrate Binding Partners/Domains of Animal Lectins”, International Journal of Biochemistry, vol. 26, no. 4, pp. 469-477, 1994.
[11] J. Kocourek and V. Horejsi, “Defining a Lectin”, Nature, vol. 290, no. 5803, p. 188, 1981.
[12] A. Bernardi, D. Arosio, L. Manzoni, D. Monti, H. Posteri, D. Potenza, S. Mari, and J. Jiménez-Barbero, “Mimics of Ganglioside GM1 As Cholera Toxin Ligands: Replacement of the GalNAc Residue”, Organic & Biomolecular Chemistry, vol. 1, no. 5, pp. 785-792, 2003.
[13] S. Mari, D. Serrano-Gómez, F.J. Cañada, A.L. Corbí, and J. Jiménez-Barbero, “1D Saturation Transfer Difference NMR Experiments on Living Cells: The DC-SIGN/Oligomannose Interaction”, Angewandte Chemie International Edition, vol. 44, no. 2, pp. 296-298, 2005.
[14] A. Bernardi, D. Arosio, D. Potenza, I. Sánchez-Medina, S. Mari, F.J. Cañada, and J. Jiménez-Barbero, “Intramolecular Carbohydrate-Aromatic Interactions and Intermolecular van der Waals Interactions Enhance the Molecular Recognition Ability of GM1 Glycomimetics for Cholera Toxin”, Chemistry - A European Journal, vol. 10, no. 18, pp. 4395-4406, 2004.
[15] P.E. Ignatov, Immunity and Infection (in Russian), Time (in Russian: Vremya), Moscow, 2002, 352 pp.
[16] I.M. Roitt, Essential Immunology, Blackwell Scientific Publications, Oxford, London, Edinburgh, Boston, Palo Alto, Melbourne, 1988. Russian Edition: Bases of Immunology (in Russian), Mir, Moscow, 1991, 328 pp.
[17] Fundamental Immunology, Editor: W.E. Paul, Raven Press Books, Ltd., New York, 1984. Russian Edition: Immunology (in Russian), Mir, Moscow, 1987, vol. 1, 476 pp.
[18] S. Gendreau, Modulation of Protein Functions by Homo- and Heterophilic Protein Interactions As Studied with P2X Receptors and Glutamate Transporters, Dissertation zur Erlangung des Doktorgrades der Naturwissenschaften; Johann Wolfgang Goethe-Universität in Frankfurt am Main, Frankfurt am Main, 2004, 162 pp.
[19] R.R. Ueta and F.B. Diniz, “Adsorption of Concanavalin A and Lentil Lectin on Platinum Electrodes Followed by Electrochemical Impedance Spectroscopy: Effect of Protein State”, Colloids and Surfaces B: Biointerfaces, vol. 61, no. 2, pp. 244-249, 2008.
[20] RCSB PDB (Protein Data Bank), www.rcsb.org/pdb/home/home.do.
[21] F. Jensen, Introduction to Computational Chemistry; John Wiley & Sons Ltd., 2007, 599 pp.
[22] J.J.P. Stewart, “Optimization of Parameters for Semiempirical Methods. I. Method”, Journal of Computational Chemistry, vol. 10, no. 2, pp. 209-220, 1989.
[23] J.J.P. Stewart, “Optimization of Parameters for Semiempirical Methods. II. Applications”, Journal of Computational Chemistry, vol. 10, no. 2, pp. 221-264, 1989.
[24] J.J.P. Stewart, “Optimization of Parameters for Semiempirical Methods. III. Extension of PM3 to Be, Mg, Zn, Ga, Ge, As, Se, Cd, In, Sn, Sb, Te, Hg, Pb, and Bi”, Journal of Computational Chemistry, vol. 12, no. 3, pp. 320-341, 1991.
[25] W.L. Jorgensen, D.S. Maxwell, and J. Tirado-Rives, “Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids”, Journal of the American Chemical Society, vol. 118, no. 45, pp. 11225-11236, 1996.
[26] J. Kress, Kress Works. QC Tools. FireFly, kressworks.org/QC_tools.html; www.kressworks.org/QC_tools.html.
[27] TINKER - Software Tools for Molecular Design. Current Major Version: TINKER 4.2. Major Release Date: June 2004. Last Minor Revision: September 8, 2004, dasher.wustl.edu/tinker.
[28] Force Field Explorer, dasher.wustl.edu/ffe.
[29] Protein Explorer Advisories (Updated October 30, 2008), umass.edu/microbio/chime/pe_beta/pe/protexpl/frntdoor.htm; www.umass.edu/microbio/chime/pe_beta/pe/protexpl/frntdoor.htm.
[30] T.A. Keith (2009), AIMAll (Version 09.04.23), aim.tkgristmill.com.
[31] J.S. Binkley, J.A. Pople, and W.J. Hehre, “Self-Consistent Molecular-Orbital Methods. 21. Small Split-Valence Basis Sets for First-Row Elements”, Journal of the American Chemical Society, vol. 102, no. 3, pp. 939-947, 1980.
[32] M.S. Gordon, J.S. Binkley, J.A. Pople, W.J. Pietro, and W.J. Hehre, “Self-Consistent Molecular-Orbital Methods. 22. Small Split-Valence Basis Sets for Second-Row Elements”, Journal of the American Chemical Society, vol. 104, no. 10, pp. 2797-2803, 1982.
[33] A.A. Granovsky, PC GAMESS/Firefly version 7.1.F, classic.chem.msu.su/gran/gamess/index.html.
[34] W.D. Cornell, P. Cieplak, C.I. Bayly, I.R. Gould, K.M. Merz, Jr., D.M. Ferguson, D.C. Spellmeyer, T. Fox, J.W. Caldwell, and P.A. Kollman, “A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules”, Journal of the American Chemical Society, vol. 117, no. 19, pp. 5179-5197, 1995.
[35] A.D. MacKerell, Jr., J. Wiórkiewicz-Kuczera, and M. Karplus, “An All-Atom Empirical Energy Function for the Simulation of Nucleic Acids”, Journal of the American Chemical Society, vol. 117, no. 48, pp. 11946-11975, 1995.
[36] A.D. MacKerell, Jr., D. Bashford, M. Bellott, R.L. Dunbrack, Jr., J.D. Evanseck, M.J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F.T.K. Lau, C. Mattos, S. Michnick, T. Ngo, D.T. Nguyen, B. Prodhom, W.E. Reiher III, B. Roux, M. Schlenkrich, J.C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiórkiewicz-Kuczera, D. Yin, and M. Karplus, “All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins”, The Journal of Physical Chemistry B, vol. 102, no. 18, pp. 3586-3616, 1998.
[37] A.K. Rappe, C.J. Casewit, K.S. Colwell, W.A. Goddard III, and W.M. Skiff, “UFF, A Full Periodic Table Force Field for Molecular Mechanics and Molecular Dynamics Simulations”, Journal of the American Chemical Society, vol. 114, no. 25, pp. 10024-10035, 1992.
[38] V. Sharma, M. Vijayan, and A. Surolia, “Imparting Exquisite Specificity to Peanut Agglutinin for the Tumor-Associated Thomsen-Friedenreich Antigen by Redesign of Its Combining Site”, The Journal of Biological Chemistry, vol. 271, no. 35, pp. 21209-21213, 1996.
[39] S.K. Natchiar, O. Srinivas, N. Mitra, A. Surolia, N. Jayaraman, and M. Vijayan, “Structural Studies on Peanut Lectin Complexed with Disaccharides Involving Different Linkages: Further Insights into the Structure and Interactions of the Lectins”, Acta Crystallographica Section D: Biological Crystallography, vol. 62, part 11, pp. 1413-1421, 2006.
[40] R. Banerjee, K. Das, R. Ravishankar, K. Suguna, A. Surolia, and M. Vijayan, “Conformation, Protein-Carbohydrate Interactions and a Novel Subunit Association in the Refined Structure of Peanut Lectin-Lactose Complex”, Journal of Molecular Biology, vol. 259, no. 2, pp. 281-296, 1996.
[41] A.N. Pankratov, “A Quantum Chemical Evaluation of Thermodynamic and Molecular Properties of Acyclic and Aromatic Compounds” (in Russian), Zhurnal Strukturnoi Khimii, vol. 41, no. 4, pp. 696-700, 2000.
[42] A.N. Pankratov, “Thermodynamic and Molecular Properties of Cyclic Nonaromatic Hydrocarbons and Unsaturated Heterocyclic Compounds: A Quantum Chemical Evaluation” (in Russian), Izvestiya Vysshikh Uchebnykh Zavedenii. Khimiya i Khimicheskaya Tekhnologiya, vol. 52, no. 2, pp. 29-35, 2009.
[43] A.K. Charykov, Mathematical Treatment of the Results of Chemical Analysis. Methods of Revealing and Evaluation of Errors (in Russian), Khimiya. Leningrad Branch, Leningrad, 1984, 168 pp.
[44] I.H. Williams, G.M. Maggiora, and R.L. Schowen, “Theoretical Models for Mechanism and Catalysis in Carbonyl Addition”, Journal of the American Chemical Society, vol. 102, no. 27, pp. 7831-7839, 1980.
[45] S.F. Boys and F. Bernardi, “The Calculation of Small Molecular Interactions by the Differences of Separate Total Energies. Some Procedures with Reduced Errors”, Molecular Physics, vol. 19, no. 4, pp. 553-566, 1970.
[46] S.S. Xantheas, “On the Importance of the Fragment Relaxation Energy Terms in the Estimation of the Basis Set Superposition Error Correction to the Intermolecular Interaction Energy”, The Journal of Chemical Physics, vol. 104, no. 21, pp. 8821-8824, 1996.
[47] H.B. Schlegel, “Ab Initio Methods in Quantum Chemistry. Part I”, In: Advances in Chemical Physics, Editor: K.P. Lawley, Wiley, New York, 1987, vol. 67, pp. 249-286.
[48] G. Lendvay and I. Mayer, “Some Difficulties in Computing BSSE-Corrected Potential Surfaces of Chemical Reactions”, Chemical Physics Letters, vol. 297, no. 5-6, pp. 365-373, 1998.
[49] G. Bhanuprakash Reddy, V.R. Srinivas, N. Ahmad, and A. Surolia, “Molten Globule-Like State of Peanut Lectin Monomer Retains Its Carbohydrate Specificity. Implications in Protein Folding and Legume Lectin Oligomerization”, The Journal of Biological Chemistry, vol. 274, no. 8, pp. 4500-4503, 1999.
[50] R.F.W. Bader, Atoms in Molecules: A Quantum Theory (The International Series of Monographs on Chemistry. No. 22), Oxford University Press, New York, 1994, 458 pp.
[51] P.L.A. Popelier, Atoms in Molecules: An Introduction, Prentice Hall, London, 2000, 184 pp.
[52] R.F.W. Bader, “A Quantum Theory of Molecular Structure and Its Applications”, Chemical Reviews, vol. 91, no. 5, pp. 893-928, 1991.
[53] J.R. Mohallem, “Molecular Structure and Bader’s Theory”, Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta), vol. 107, no. 6, pp. 372-374, 2002.
[54] R.F.W. Bader, “Everyman’s Derivation of the Theory of Atoms in Molecules”, The Journal of Physical Chemistry A, vol. 111, no. 32, pp. 7966-7972, 2007.
[55] M.T. Carroll, Cheng Chang, and R.F.W. Bader, “Prediction of the Structures of Hydrogen-Bonded Complexes Using the Laplacian of the Charge Density”, Molecular Physics, vol. 63, no. 3, pp. 387-405, 1988.
[56] M.T. Carroll and R.F.W. Bader, “An Analysis of the Hydrogen Bond in BASE-HF Complexes Using the Theory of Atoms in Molecules”, Molecular Physics, vol. 65, no. 3, pp. 695-722, 1988.
[57] P.L.A. Popelier and R.F.W. Bader, “The Existence of an Intramolecular C…H…O Hydrogen Bond in Creatine and Carbamoyl Sarcosine”, Chemical Physics Letters, vol. 189, no. 6, pp. 542-548, 1992.
[58] U. Koch and P.L.A. Popelier, “Characterization of C-H-O Hydrogen Bonds on the Basis of the Charge Density”, The Journal of Physical Chemistry, vol. 99, no. 24, pp. 9747-9754, 1995.
[59] A.E. Shchavlev, A.N. Pankratov, V.B. Borodulin, and O.A. Chaplygina, “DFT Study of the Monomers and Dimers of 2-Pyrrolidone: Equilibrium Structures, Vibrational, Orbital, Topological, and NBO Analysis of Hydrogen-Bonded Interactions”, The Journal of Physical Chemistry A, vol. 109, no. 48, pp. 10982-10996, 2005.
[60] R. Parthasarathi, V. Subramanian, and N. Sathyamurthy, “Hydrogen Bonding Without Borders: An Atoms-in-Molecules Perspective”, The Journal of Physical Chemistry A, vol. 110, no. 10, pp. 3349-3351, 2006.
[61] A.E. Shchavlev, A.N. Pankratov, and V. Enchev, “Intramolecular Hydrogen-Bonding Interactions in 2-Nitrosophenol and Nitrosonaphthols: Ab Initio, Density Functional, and Nuclear Magnetic Resonance Theoretical Study”, The Journal of Physical Chemistry A, vol. 111, no. 30, pp. 7112-7123, 2007.
[62] R. Parthasarathi, S. Sundar Raman, V. Subramanian, and T. Ramasami, “Bader’s Electron Density Analysis of Hydrogen Bonding in Secondary Structural Elements of Protein”, The Journal of Physical Chemistry A, vol. 111, no. 30, pp. 7141-7148, 2007.
[63] R. Parthasarathi, V. Subramanian, and N. Sathyamurthy, “Hydrogen Bonding in Protonated Water Clusters: An Atoms-in-Molecules Perspective”, The Journal of Physical Chemistry A, vol. 111, no. 51, pp. 13287-13290, 2007.
[64] A.N. Pankratov, “Electronic Structure and Reactivity of Inorganic, Organic, Organoelement and Coordination Compounds: An Experience in the Area of Applied Quantum Chemistry”, In: Quantum Chemistry Research Trends, Editor: Mikas P. Kaisas, Nova Science Publishers, Inc., New York, 2007, pp. 57-125.
[65] V. Sharma, M. Vijayan, P. Adhikari, and A. Surolia, “Molecular Basis of Recognition by Gal/GalNAc Specific Legume Lectins: Influence of Glu 129 on the Specificity of Peanut Agglutinin (PNA) Towards C2-Substituents of Galactose”, Glycobiology, vol. 8, no. 10, pp. 1007-1012, 1998.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 343-366
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 14
ELECTRON DENSITY DISTRIBUTIONS OF HETEROCYCLES: A SHORTCOMING OF THE RESONANCE MODEL
Ricardo A. Mosquera*, Marcos Mandado, Laura Estévez and Nicolás Otero
Departamento de Química Física, Universidade de Vigo, 36310-Vigo, Galicia (Spain)
Abstract
Electron density distributions obtained for several heterocycles (indoles, 1,3-azoles, and anthocyanidins) at diverse computational levels were analyzed with the Quantum Theory of Atoms in Molecules (QTAIM). The results computed for some of these compounds, or for their protonated species, indicate that qualitative partial atomic charges obtained by using the resonance model do not provide a reliable description for those electronic distributions. Overall, total QTAIM atomic charges are more related to electrostatic interactions than to concerted movements of electron pairs.
Introduction
The resonance model (RM) is still one of the tools most often employed for explaining the mechanisms of chemical processes or predicting their products. The study and application of this model take up a significant amount of chemistry students' time. Nevertheless, diverse evidence found by several research groups points to its inadequacy for describing the evolution of the electron density in various simple chemical processes. RM cannot even explain some experimental facts, such as the evolution of pKa values along certain series of organic compounds. Conformational equilibria, protonations and hydride additions are examples of simple processes where the resonance model leads to explanations that contradict those obtained using modern quantum chemical methods of electron density analysis. Among
them, the quantum theory of atoms in molecules (QTAIM) [1-3] can be considered a very reliable one, as it is based exclusively on the basic principles of physics without introducing any further hypothesis. Several QTAIM studies have contradicted well-known and generally accepted conclusions of the RM. To the best of our knowledge, Wiberg and Laidig's work on the origin of ester and amide resonance [4] was the first to report a serious difference between QTAIM results and RM explanations. This work shows that in diverse R-CO-XR' compounds, comprising formamide (XR'=NH2), formic acid (XR'=OH), methyl formate (XR'=OCH3), etc., the atomic electron population of the nitrogen/oxygen atom, N(X), is smaller in the transition states for the C-X rotation than in the corresponding planar conformers. In contrast, N(O) remains nearly constant along this process. According to the RM, the non-planar geometry of the transition states for C-X rotation breaks the electron delocalization along the O=C-X unit (usually represented by the resonance forms O=C-X ↔ O⁻-C=X⁺) present in the planar conformer. Clearly, RM predicts that, at the transition states and relative to the conformers, N(X) should be larger, as the O⁻-C=X⁺ form should have a negligible weight, and N(O) should be smaller. The QTAIM results, obtained initially at the HF level, were confirmed later at the MP2 level [5]. Slightly later, Slee and MacDougall observed that the comparison of QTAIM atomic electron populations, N(Ω), of allyl ions and the corresponding neutral compounds is not in line with the evolution of the electron density expected from the resonance model [6]. In this context, it should be mentioned that the publication of the first of Wiberg's papers on “ester/amide resonance” led to a long argument about the reliability of QTAIM atomic populations [7], which was ended by clear demonstrations that the basic postulates raised against QTAIM charges were unreliable [8,9].
Moreover, the same kind of contradiction between RM predictions and QTAIM relative charges observed for simple amides and esters was also found for thioformamide [10]. In the same vein, Glaser and Chao found that the electron density distributions of diazonium ions are inconsistent with the commonly used Lewis structure R-N+≡N and would be better represented by a combination of two unconnected structures: R+···N≡N and R+···N≡N+ [11,12]. Also, the acidity sequence followed by dimethyl sulfide, sulfoxide and sulfone cannot be explained by the RM, which reverses the order. In contrast, QTAIM atomic populations explain the real sequence and provide no evidence for delocalization of the charge from the anionic carbon into the rest of the anion [13]. In the second half of the 1990s our group started a systematic study of the protonation of oxygenated compounds, employing QTAIM as the basic tool for analyzing the evolution of the molecular electron density (computed at diverse computational levels: HF, B3LYP, MP2 and sometimes QCISD) along the protonation. This study comprised carbonyl compounds [14,15], linear ethers [16,17] and cyclic ethers [18,19]. The general conclusion was that the positive charge is mainly concentrated on the proton, while the oxygen formally attached to it does not reduce its electron population, contrary to what is postulated by the classic protonation scheme shown in Figure 1. On the contrary, N(O) increases upon protonation, gaining electron density from the remaining hydrogen atoms in the molecule, as had been previously proposed by Stutchbury and Cooper [20].
Figure 1. Classic mechanism of protonation for carbonyl compounds.
Figure 2. Evolution of atomic electron population, ΔN(Ω), upon protonation of propanone. All values in au multiplied by 10³.
Later on, our work was extended to other systems of practical interest, such as uracil and cytosine [21-23], and to compounds without oxygen, like nitriles [24]. RM was only able to predict the stability sequence of the protonated forms and to explain the changes exhibited by most of the bond properties upon protonation. Moreover, both the O- and N-protonated forms of uracil and cytosine are found to be better described by RO-H+ and RN-H+ forms than by the classical RO+-H and RN+-H structures. Again, according to the QTAIM analysis the electron charge gained by the proton is mainly provided by the other hydrogens of the molecule. The study of several model systems, like vinylketone, methyl formate and N-methyl formamide [22], led us to explain the previously reported stability sequence of the protonated forms of uracil [25] and cytosine, as well as the evolution of the atomic electron populations. Thus, we developed an alternative model, based not on the resonance concept but mainly on electrostatic interactions [22,23], which we think can be applied to any protonation. This model is based upon the following points: i) the donation of electron population is easier when the atomic number is smaller; ii) the shorter the distance to the proton, the easier the electron donation; iii) the donation of electron population between bonded atoms follows the direction of the bond. The orientation of the bond with regard to the proton can make the electron transfer easier (if the electron density approaches the proton) or more difficult (if the electron density moves away from the proton) (see, respectively, the hydrogens labeled “a” and “b” in Figure 2); and iv) π transfers are generally easier than σ ones, when both are possible.
At this point, we should highlight that other modern methods of electron density analysis, like the Hirshfeld scheme [26], which was recently implemented for several computational levels [27] and employed to analyze several simple oxygenated compounds [28], provide different absolute values for the evolution of atomic electron populations, ΔN(Ω), but the same qualitative description, contradicting RM expectations. As protonation can be considered a model for electrophilic attack, we have also studied how activating and deactivating substituents modify the evolution of the electron density in this process. The QTAIM analysis carried out for the protonation of a set of aniline derivatives indicates that most of the electron density gained by the proton is provided neither by the nitrogen atom nor by activating substituents like OH [29]. In a similar way, the acidity of phenol derivatives can be rationalized on the basis of atomic QTAIM properties, but not on that of RM predictions [30]. On the other hand, the evolution of the molecular electron density upon hydride addition (a simple model for nucleophilic attack), computed with either the QTAIM or the Hirshfeld method, has been shown to display general trends that are also not in line with the RM predictions summarized in the scheme shown in Figure 3 [31,32]. Thus, we observe that most of the electron density provided by the hydride is not taken by the oxygen. In fact, ΔN(O) never reaches 0.2 au, whereas for the carbon attached to the hydride ΔN(C) always exceeds 0.4 au, and ΣΔN(H) ranges from 0.44 au to 0.53 au in the compounds studied so far. When the study is repeated using other anionic nucleophiles (CN⁻, OH⁻), the results do not change substantially.
Figure 3. Classic mechanism for hydride addition to carbonyl compounds.
Among the discrepancies observed between RM predictions and relative atomic charges, we highlight the specific behavior of heteroatoms, X, which reduce the extent of electron reorganization with regard to that displayed when they are replaced by carbons [22]. In fact, C-X bonds were found to act as barriers to σ-electron reorganization, precluding (or substantially reducing) the transfer of σ electron density through them [22,33]. We consider it of interest to examine whether the discrepancies previously described for pyrimidine and purine bases [21-23] affect heterocycles in general. In particular, this chapter reports a detailed comparison between the RM predictions and the conclusions derived from the QTAIM analysis of the protonation (and in some cases other processes) of diverse heterocycles: indoles, 1,3-azoles, and anthocyanidins.
Short Overview of QTAIM
QTAIM belongs to the topological methods for analysing the electron density function, ρ(r). As in any topological analysis, the localization of singular points
plays a basic role. In this case the singular points lie in the real space spanned by the three coordinates representing the position of any electron, r, and can be classified as follows: i) local maxima, also called electron density attractors, whose coordinates correspond very approximately to those of the nuclei in the molecule; ii) along every bond there is a saddle point with two negative eigenvalues of the Hessian matrix of ρ(r); these points are called bond critical points (BCPs); iii) inside each ring there is another kind of saddle point, whose Hessian matrix has two positive eigenvalues, called a ring critical point (RCP); iv) finally, in polycycles with cage structure there are local minima, one per cage, named cage critical points (CCPs). The topological analysis also looks at the gradient paths of the electron density. They form a vector field in which every group of field lines ends at a different nucleus. These groups of lines are delimited by surfaces given by what is known as the zero-flux condition (1), which is rigorously derived [34] from Schwinger's principle of stationary action [35]. These surfaces intersect with a certain vanishing limit of ρ(r) (e.g. 10⁻⁵ au), defining the atomic basins, which are disjoint regions of space. In this context, an atom Ω can be defined as the union of a nucleus and its electron basin. Integration of the appropriate density function over the atomic basin provides the atomic properties, such as the atomic electron population (2), the atomic electron kinetic energy (3) or the atomic volume (4).
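$$\nabla\rho(\vec{r})\cdot\vec{n}(\vec{r}) = 0 \qquad (1)$$

$$N(\Omega) = \int_{\Omega}\rho(\vec{r})\,d\vec{r} \qquad (2)$$

$$K(\Omega) = \frac{1-\gamma}{4}\int_{\Omega}\left\{2\left[\nabla\nabla'\rho(\vec{r},\vec{r}\,')\right]_{\vec{r}=\vec{r}\,'} - \nabla^{2}\rho(\vec{r})\right\}d\vec{r} \qquad (3)$$

$$v(\Omega) = \int_{\Omega}d\vec{r} \qquad (4)$$

$$L(\vec{r}) = -\frac{1}{4}\nabla^{2}\rho(\vec{r}) \qquad (5)$$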
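The classification of critical points by the signs of the Hessian eigenvalues, described above, can be sketched in a few lines of Python. The (rank, signature) labels used here are the usual QTAIM notation; the function name `classify_critical_point`, the tolerance `tol` and the sample eigenvalues are assumptions introduced purely for illustration and do not come from the programs used in this chapter.

```python
def classify_critical_point(eigenvalues, tol=1e-8):
    """Classify a critical point of rho from its three Hessian eigenvalues.

    rank      = number of non-zero eigenvalues
    signature = (number of positive) - (number of negative) eigenvalues
    """
    nonzero = [v for v in eigenvalues if abs(v) > tol]
    rank = len(nonzero)
    signature = sum(1 if v > 0 else -1 for v in nonzero)
    names = {
        (3, -3): "attractor (local maximum, nuclear position)",
        (3, -1): "bond critical point (BCP)",
        (3, 1): "ring critical point (RCP)",
        (3, 3): "cage critical point (CCP)",
    }
    label = names.get((rank, signature), "degenerate or unclassified")
    return rank, signature, label


# Hypothetical eigenvalue triplets (a.u.), chosen only to exercise each case.
for sample in ([-0.9, -0.8, -0.7], [-0.3, -0.2, 0.5],
               [-0.1, 0.2, 0.3], [0.1, 0.2, 0.3]):
    print(sample, classify_critical_point(sample))
```

A point with two negative and one positive eigenvalue is returned as a BCP, and one with two positive eigenvalues as an RCP, matching the description in the text.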
Turning back to critical points, the gradient paths that leave every BCP along the eigenvector associated with its positive eigenvalue give rise to the atomic interaction lines or bond paths. According to Bader's original formulation of QTAIM, bond paths are the physical representation of chemical bonds [1]. Nevertheless, the interpretation of bond paths in certain systems (biphenyl, inclusion complexes of He in adamantane, etc.) has raised significant controversy [36-42]. Other points requiring careful attention when performing a QTAIM analysis are: i) the appearance of non-nuclear attractors, usually found in systems with very weak bonds [43-48]; ii) the calculation of atomic energies by correcting kinetic energies using the molecular virial ratio, γ, may lead to artifacts when the latter quantity experiences
important variations [49]. This problem can be reduced by employing ρ(r) distributions that closely satisfy the virial theorem [50], which can be achieved using self-consistent virial scaling (SCVS) optimizations [51]; iii) finally, the quality of the atomic integrations should be tested by checking that the sums of the atomic energies and of the atomic electron populations recover, within negligible differences, the total electronic energy and the number of electrons in the molecule, respectively. Moreover, each atomic integration should yield an absolute integrated value of the L(r) function (5) below about 10⁻³ au. In fact, L(Ω) and N(Ω) display linear correlations within a certain range of L(Ω) values [52].
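Why the integrated L function serves as an accuracy test can be illustrated with a case that has an analytic answer. The sketch below is not taken from the chapter: it assumes the ground-state hydrogen atom density, ρ(r) = exp(-2r)/π in atomic units, and treats all of space as a single "basin". The population integral (2) should then give one electron, and the integral of L(r) from (5) should vanish, just as it does for a true atomic basin bounded by a zero-flux surface (1), where the divergence theorem turns the volume integral of ∇²ρ into a surface integral of ∇ρ·n. The function names and the midpoint-rule parameters are illustrative choices.

```python
import math


def rho_1s(r):
    """Hydrogen-atom 1s electron density in atomic units."""
    return math.exp(-2.0 * r) / math.pi


def laplacian_rho_1s(r):
    """Analytic Laplacian of rho_1s: rho'' + (2/r) rho' = (4 - 4/r) rho."""
    return (4.0 - 4.0 / r) * rho_1s(r)


def radial_integral(f, r_max=30.0, n=200_000):
    """Integrate f(r) * 4*pi*r^2 over 0..r_max with the midpoint rule."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += f(r) * 4.0 * math.pi * r * r
    return total * h


# Electron population, equation (2), over the whole space.
population = radial_integral(rho_1s)

# Integrated L function, equation (5); analytically zero.
integrated_L = radial_integral(lambda r: -0.25 * laplacian_rho_1s(r))

print(f"N = {population:.8f}")
print(f"L = {integrated_L:.2e}")
```

In a real QTAIM calculation the basin boundaries are found numerically, so the integrated L value departs from zero by an amount that reflects the integration error; this is the quantity monitored by the threshold mentioned above.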
Computational Details
Electron densities were computed in all cases for completely optimized geometries at the same computational level with the Gaussian03 program [53]. The topological QTAIM analysis of ρ(r) was performed with the AIMPAC program [54]. When necessary, two-centre delocalization indices (DIs), δ(Ω,Ω’) [55-57], and QTAIM-based 5- and 6-centre DIs, Δ5 and Δ6 respectively [58,59], were also calculated using a program developed in our group [60]. In all cases DFT delocalization indices are computed as approximations using HF expressions, as the Fermi hole density cannot be defined strictly within the DFT framework. The accuracy achieved in the calculation of the QTAIM atomic properties was checked as usual. Thus, the sums of the N(Ω) and atomic energy, E(Ω), values for each molecule never differ from the total electron populations and electronic molecular energies by more than 5·10⁻³ a.u. and 4.0 kJ mol⁻¹, respectively. No atom was integrated with |L(Ω)| > 2·10⁻³ a.u.
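The accuracy criteria quoted above can be expressed as a small checking routine. The following Python sketch is an assumption-laden illustration, not the procedure actually used with AIMPAC: the function name `check_integration`, its arguments and the sample numbers for a water-like molecule are invented here, and only the two thresholds (5·10⁻³ a.u. for the population sum and 2·10⁻³ a.u. for |L(Ω)|) are taken from the text.

```python
def check_integration(populations, l_values, n_electrons,
                      pop_tol=5e-3, l_tol=2e-3):
    """Check QTAIM atomic integrations against the thresholds in the text.

    populations : dict atom -> N(Omega)
    l_values    : dict atom -> L(Omega)
    n_electrons : total number of electrons in the molecule
    Returns (passed, population_error, atoms_with_large_L).
    """
    population_error = abs(sum(populations.values()) - n_electrons)
    bad_atoms = [atom for atom, l in l_values.items() if abs(l) > l_tol]
    passed = population_error <= pop_tol and not bad_atoms
    return passed, population_error, bad_atoms


# Hypothetical numbers for a 10-electron, water-like molecule (not real data).
populations = {"O": 9.150, "H1": 0.425, "H2": 0.426}
l_values = {"O": 3.0e-4, "H1": 1.0e-5, "H2": 2.5e-3}

passed, error, bad_atoms = check_integration(populations, l_values, 10)
print(passed, f"{error:.4f}", bad_atoms)
```

In this invented example the population sum is within tolerance, but one atom exceeds the |L(Ω)| limit and would have to be reintegrated. An analogous comparison could be added for the sum of the atomic energies against the 4.0 kJ mol⁻¹ limit.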
Indoles
Indole has been studied intensively in the chemical literature because of both its structural and practical importance. Details of its molecular structure can be found elsewhere (see for instance [61,62] and references therein). Experimental evidence [63-66] and computational results [61,67,68] reveal that C3 is the preferred protonation site both in the gas phase and in solution, in contrast with pyrrole, which prefers C2 [69,70]. Our study comprises a detailed analysis of all possible protonations of parent indole [71] and of how the proton affinity (PA) of every site is affected by including one activating or deactivating substituent (-CH3, -F, -NH2, -NO2) at different positions [72]. In the first case, B3LYP/6-311++G(2d,2p) 6d and MP2/6-31++G(d,p) electron densities were considered for indole and all its protonated forms, which are named by the IUPAC number of the position where the proton is attached. The geometries obtained at the B3LYP and MP2 levels are very similar in all cases, as are most of the relative atomic properties for the protonation process. The largest change in bond lengths is 0.013 au, whereas differences in bond angles never exceed 0.3°. Thus, in the second study we restricted ourselves to B3LYP calculations. Also, for the sake of simplicity, we detail only the results obtained for the parent molecule and merely summarize the main conclusions obtained for the derivatives. The protonations of parent indole at the bridge carbons (C8 and C9) were not studied, as the corresponding PAs were reported to be much smaller than the remaining ones [61]. PAs shown in
Table 1 (named by the position where the proton is attached) were computed by transforming the energy difference into an enthalpy, including the corresponding unscaled B3LYP ZPVE, BSSE (which was found to be negligible) and the thermal correction at 298.15 K.
QTAIM N(Ω) values of indole computed from the B3LYP ρ(r) show that, with the exception of N1 and its neighbors, the atoms are nearly neutral. N1 has a strong negative charge (-1.067 au), while C2, C8, and H10 are positively charged (+0.329, +0.340, and +0.390 au, respectively). The main differences with the MP2/6-31++G(d,p) values occur in the atoms displaying strong charges, which MP2 tends to increase. Thus, N1 is even more negative (-1.333 au), and C2, C8, and H10 more positive (+0.428, +0.422, and +0.452 au, respectively). The symmetry of indole allows partitioning ρ(r) into σ and π components. We notice that even the carbon atoms that lose total electron population have nearly 1 au of π population. Therefore, when we exclude N1, the atomic charges are mainly due to σ displacements. N1 displays a strong negative σ charge (-1.350 and -1.588 au at the B3LYP and MP2 levels, respectively) and a slight positive π charge (+0.283 and +0.254 au with B3LYP and MP2, respectively). If we only consider the π population, we could say that it agrees with the results expected from the application of the RM. Thus, most of the eleven isovalent resonance structures that can be written for indole place a positive charge on the nitrogen atom [71]. Also, according to nine of them, all carbon atoms should have negative π charges. However, a negative π charge is only displayed by C3, C7, and C9, while the rest of the carbons bear positive π charges.
Table 1. PAs (in kJ mol⁻¹), relative Δ5 and Δ6 indices for the protonated species (in au), and Qzz(Ω) and [∇²ρ(r)]SCC-Ω values (both in au) for the protonation sites in neutral indole
Site   PA(B3LYP)   PA(MP2)a   10²·Δ6b   10²·Δ5b   Qzz(Ω)   [∇²ρ(r)]SCC-Ω
1        832.3      827.5       2.11     -0.15     -2.53       -1.414
2        870.7      842.6       0.60      0.10     -3.40       -0.158
3        891.6      884.2       2.03      0.31     -3.74       -0.136
4        858.3      835.4       0.14      1.56     -3.30       -0.143
5        858.3      835.3       0.12      1.15     -3.46       -0.148
6        861.4      833.5       0.13      1.10     -3.38       -0.143
7        846.7      826.9       0.11      1.66     -3.47       -0.161
a Thermal and ZPVE corrections taken from the B3LYP/6-311++G(2d,2p) 6d frequency calculation.
b B3LYP/6-311++G(2d,2p) values. The Δ6 and Δ5 indices for indole are 1.63·10⁻² and 1.34·10⁻² au, respectively.
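The site ordering contained in Table 1 can be inspected with a few lines of Python. The sketch below is only illustrative: the dictionary and function names are hypothetical, the PAs are taken from the B3LYP column, and the code merely locates the sites with the lowest and highest value of each quantity.

```python
# Table 1 values for neutral indole, keyed by protonation site (1 = N1, 2-7 = C2-C7).
pa_b3lyp = {1: 832.3, 2: 870.7, 3: 891.6, 4: 858.3, 5: 858.3, 6: 861.4, 7: 846.7}
qzz = {1: -2.53, 2: -3.40, 3: -3.74, 4: -3.30, 5: -3.46, 6: -3.38, 7: -3.47}
lap_scc = {1: -1.414, 2: -0.158, 3: -0.136, 4: -0.143, 5: -0.148, 6: -0.143, 7: -0.161}


def extremes(values):
    """Return the sites holding the lowest and the highest value, in that order."""
    lowest = min(values, key=values.get)
    highest = max(values, key=values.get)
    return lowest, highest


for label, data in (("PA", pa_b3lyp), ("Qzz", qzz), ("Laplacian at SCC", lap_scc)):
    lowest, highest = extremes(data)
    print(f"{label}: lowest at site {lowest}, highest at site {highest}")
```

Comparing the three printed lines shows at a glance whether the extreme values of a density-based index fall on the same sites as the extreme PAs.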
If the electron delocalization is described correctly by the RM, we should find significant DIs between nonbonded pairs of atoms that bear charges in one of those resonance structures (nitrogen and all the carbon atoms), and the corresponding values could be employed to weight them. The highest index found is δ(N1,C3) (0.166 au). Other important nitrogen delocalizations are found with C9 and C7 (0.119 and 0.086 au, respectively). The rest of the δ(N1,C) values are of the same order as, or even smaller than, the δ(N1,H) ones. In contrast, there are some important DIs between carbon atoms, like δ(C4,C7) (0.100 au), which correspond to
the delocalization of the carbons in the six-membered ring, or δ(C2,C8) and δ(C2,C9), which indicate delocalization between rings. Most of δ(N1,C3) corresponds to π delocalization, while for δ(N1,C9) the π delocalization only represents half of the total, meaning that the electron lone pair of the nitrogen delocalizes better in the direction of C3. The three pairs of atoms arranged in para in the benzene ring (C4-C7, C5-C8, and C6-C9) also display important π delocalizations (larger than 0.071 au). The highest π delocalization between rings occurs in the pair C2-C4 (0.035 au), while the bridge carbons have noticeable delocalizations with atoms from both rings.
PAs calculated for 1-7 agree with the experimental fact that C3 is the preferred protonation site and N-protonation is the least suitable. The values obtained with both computational levels (Table 1) are also in line with the gas phase experimental one (933.4 kJ mol⁻¹ [73]). According to Bader's criteria [74], protonation should be favored where ρ(r) is less retained in the ring. This is measured by the atomic electric quadrupole tensors, Q(Ω), and especially by Qzz(Ω), the eigenvalue of this tensor associated with the eigenvector perpendicular to the ring, eπ. The more negative Qzz(Ω), the more concentrated the charge is along the eπ axis, and the easier the protonation will be. Bader also related reactivity with ∇²ρ(r), in the sense that more negative values indicate more reactive points for electrophilic attack. So, another index to study is ∇²ρ(r) at the secondary concentrations of charge (SCC) close to the atom (SCC-Ω), [∇²ρ(r)]SCC-Ω. An SCC corresponds to a (3,-1) critical point of ∇²ρ(r) that also presents ∇²ρ(r) < 0. The most interesting of them are those close to (3,-1) points with ∇²ρ(r) > 0, because they ensure a favorable route for the approach of the proton.
Calculated Qzz(Ω) values of indole (Table 1) show a qualitative correspondence with the proton affinity. Thus, the most negative value corresponds to C3, the most favored protonation site, while the least favored one, N1, displays the least negative Qzz(Ω). The rest of the carbons that can receive a proton have similar values of Qzz(Ω). However, no linear correlation was found with PAs. In contrast, [∇²ρ(r)]SCC-Ω values are not in line with PAs (Table 1). For example, the most negative [∇²ρ(r)]SCC-Ω corresponds to N1, the atom with the lowest PA, and the atom that displays the least negative value is C3, the preferred protonation site.
π n-DIs were shown to be a useful tool to investigate the local aromaticity in monocyclic and polycyclic compounds [58,75]. Since the n-DIs measure the extension of the electron delocalization to n atoms, when an n-DI is computed for all the atoms of a ring, the larger the n-DI, the more aromatic the ring. However, because n-DIs are defined as a product of n overlap atomic integrals, comparing the aromaticity of rings with a different number of centres requires a scaling factor that weights the π n-DIs, increasing Δ6 values. In this case, even unscaled Δ6 indices for the benzenoid ring are larger than the Δ5 values of the pyrrole ring (Table 1), indicating that the benzene ring is significantly more aromatic than the pyrrole one. This fact also agrees with the larger PAs of the pyrrole ring.
Variations of atomic electron populations due to protonation, ΔN(Ω), computed at the B3LYP level (Table 2) are very similar to those obtained at the MP2 level. The differences obtained for ΔN(Ω) with both levels never exceed 0.06 au. In general, there is also a good
linear correlation between ΔN(Ω) and ΔE(Ω) values [71]. Nevertheless, the remarkable stabilization of N1 in protonation 3 makes C3 the preferred protonation site (Table 2). This fact can be explained by considering that the small variation shown by ΔN(N1) arises from a large reduction of the π population (-0.754 au) compensated by a slightly larger increase of the σ population (+0.767 au). As the σ electron population is on average closer to the nucleus, the result is a significant stabilization.
Table 2. Most significant variations of atomic and group B3LYP energies (in kJ mol⁻¹) and electron populations (in au multiplied by 10³) due to protonation
Protonation   ΔE(C6H4)   ΔE(N1)   ΔN(N1)   E(H+)   N(H+)   ΔE(C4NH3)
1                  4       332     -160    -1186     562        37
2                483        18      -47    -1529     894       349
3                124      -238       13    -1520     896       323
4                370       -94       -2    -1534     905       248
5                573       -88      -12    -1521     899       266
6                340       -78       -9    -1530     905       258
7                595       -97       -8    -1528     901       256
A very noticeable difference between N-protonation and C-protonations is the electron population gained by the proton (Table 2). Thus, the proton on the nitrogen atom has the smallest atomic population, whereas those on the carbons are always around 0.33 au larger. The proton on N receives 83% of its electron population from the rest of the hydrogens according to the B3LYP electron densities (93% from the MP2 results). In contrast, the proton on the carbons receives only between 55 and 59% of its electron population from the rest of the hydrogens. Moreover, most of the hydrogens lose more electron population than the carbon atoms. Also, in protonations 4-7, the largest contributor to the electron population gained by the proton is the neighboring hydrogen. Thus, as previously found in other cations and anions [20,76], hydrogens allow a dispersal of the charge excess. Contrary to what could be expected, the most important ΔN(Ω) values are not shown by the α-carbon, but by C2 or C8, depending on the specific protonation site. Thus, protonations 3, 4, 6 decrease N(C2) and protonations 2, 5, 7 decrease N(C8) [71]. The role played by these carbons can be explained considering the resonance structures of protonated indole. Two points have to be remarked about them: i) the most stable protonation (3) is the only one that allows two resonance structures that do not alter the electronic structure of the benzene ring, in line with the high Δ6 value already commented; ii) the resonance structures favored by ΔN(Ω) values keep a positive formal charge connected to a formally neutral nitrogen (C+-N), in contrast with the traditionally used N+-C structures. Thus, the two atoms that lose most electron density in C-protonations are the neighbors of the nitrogen, while the global ΔN(N1) is almost negligible. Nevertheless, N1 experiences a strong σ/π movement, losing π electron density and gaining a similar amount of σ density. The largest movements occur for protonation 3 and the smallest for 4 and 7. Thus, except for protonation 7, ΔNπ(N1) is higher than ΔNπ(Ω) for atoms C2 or C8 in the corresponding protonations, which agrees better with traditional resonance structures. Finally, protonation at C2, the preferred protonation site for pyrrole, significantly reduces the aromaticity of the benzenoid ring, destabilizing the protonated species with regard to that formed by C3-protonation.
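Two of the observations drawn from Table 2 can be reproduced numerically. The snippet below is a minimal sketch with hypothetical variable names; it converts the N(H+) entries of Table 2 to au, compares every C-protonation with the N-protonation, and locates the protonation that stabilizes N1 the most.

```python
# N(H+) from Table 2 in au (table entries divided by 10^3), keyed by protonation site.
n_proton = {1: 0.562, 2: 0.894, 3: 0.896, 4: 0.905, 5: 0.899, 6: 0.905, 7: 0.901}
# Delta E(N1) from Table 2, in kJ/mol.
delta_e_n1 = {1: 332, 2: 18, 3: -238, 4: -94, 5: -88, 6: -78, 7: -97}

# Extra electron population of the proton in each C-protonation relative to N-protonation.
extra_population = {
    site: round(population - n_proton[1], 3)
    for site, population in n_proton.items()
    if site != 1
}

# Protonation that stabilizes N1 the most (most negative Delta E(N1)).
most_stabilized = min(delta_e_n1, key=delta_e_n1.get)

print(extra_population)
print(most_stabilized)
```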
The activating/deactivating effects of -CH3, -F, -NH2, and -NO2 in indole were quantitatively measured as ΔPA values (Table 3) and, in general, agree with RM predictions. The insertion of a methyl group at one of the carbons of the pyrrole or phenyl rings clearly activates the electrophilic attack at all positions with the exception of the ipso one. The specificity of ipso positions may be due either to steric hindrance or to electronic effects and will be discussed below. As a general rule, it can be stated that methylation of a certain ring mainly activates the carbons of that ring, indicating the prevalence of inductive effects for this substitution. However, some exceptions to this rule can be observed in Table 3. Thus, the most activated position of 6-methylindole is C2, which is quite far from C6. Also, the most activated positions of 4-methylindole and 7-methylindole are, respectively, C7 and C4. All this evidence indicates that the activating ability of -CH3 cannot be solely explained by arguments based on inductive effects.
The insertion of fluorine into the pyrrole or phenyl rings clearly disfavours protonation at all positions, with the exception of C2 and C7 for, respectively, 3-fluoroindole and 4-fluoroindole. The large inductive withdrawing (-I) character of fluorine is clearly reflected in the ipso positions, which display the most negative ΔPAs for all the substitutions. N1 is an exception to this rule that can be explained by its high electronegativity. Also, the behaviour of fluorine derivatives cannot be uniquely explained by the -I character. Thus, ΔPAs for protonations at ortho and para sites are significantly less negative than the rest. This can be explained by accepting that fluorine has a small mesomeric electron-releasing character (+R).
The amino group is an example of -I and +R character. Thus, ipso positions are significantly deactivated, showing negative ΔPAs, whereas the remaining positions are activated (large positive ΔPAs). It must be noticed that the +R character of -NH2 is quite important, as can be derived from the highest ΔPA value of each substitution, which always exceeds 81.0 kJ mol⁻¹. The position with the largest mesomeric effect is located in the substituted ring. Thus, substitutions at N1, C2 and C3 mainly activate the pyrrole ring and substitutions at C4, C5, C6 and C7 mainly activate the phenyl ring.
Finally, the nitro group is an example of inductive and mesomeric electron-withdrawing character (-I and -R). The former is reflected in the ΔPA values at ipso positions, which are the most deactivated positions. The only exception is observed for 2-nitroindole, where the most deactivated position is C3 and not C2. We think that this is due to the competitive inductive effects of the two attached nitrogens. Once again, the position with the largest mesomeric effect is located in the substituted ring, as happened for -NH2.
According to traditional interpretations, inductive effects and mesomeric effects are mainly related to variations of σ and π electron populations, respectively [77]. However, no simple quantitative relationship has been found between ΔN(Ω) values [72] and PAs, indicating again that RM considerations are not able to predict the evolution of the electron density. It should be noticed that the summation of ΔNσ(Ω) values for the heterocycle, ΔNσ[R], is negative for methylindoles (up to -0.017 au). Thus, contrary to what is generally accepted, the QTAIM results point to an inductive electron-withdrawing character for -CH3 instead of an electron-releasing one. These results probably stem from the slightly negative QTAIM charge of the hydrogens in methyl groups. The loss of σ electrons by the rings is partially recovered via π donation from -CH3. The ΔNσ[R] values of the remaining groups confirm the expectations: -F and -NO2 are highly electron-withdrawing groups.
Table 3. B3LYP PAs for every site (1-7) [in kJ mol⁻¹ with regard to those of indole (Table 1)] for the 28 indole derivatives here studied. Rows: protonation site; columns: position of the substituent R.
R = CH3
Site       1      2      3      4      5      6      7
1       17.5   16.0   21.7    7.1   11.0   10.4    8.3
2       12.6   -2.7   25.8   13.3   14.5   23.1    9.8
3       27.9   29.1   -2.4    7.6   11.8    9.8    7.5
4       12.4   20.5   10.4   -7.6   21.2    7.1   21.3
5       13.7   15.7   12.6   21.2   -3.5   21.0   11.3
6       13.9   23.3    9.9   10.1   20.1   -4.3   18.8
7       15.4   13.6   10.9   24.9    8.4   20.8   -6.2
R = F
Site       1      2      3      4      5      6      7
1       -9.1  -29.1   -8.3  -20.1  -14.9  -17.2  -18.8
2      -26.2  -41.0    4.7  -18.6  -18.8   -3.8  -25.1
3      -16.7   -6.3  -44.5  -17.8  -13.1  -17.3  -19.8
4      -29.7   -8.2  -16.2  -60.0   -4.3  -26.3   -2.5
5      -21.9  -17.5  -18.4   -5.0  -63.9  -10.1  -23.5
6      -30.6   -4.2  -20.2  -23.6   -8.8  -60.7   -7.3
7      -17.9  -20.0  -18.6    1.2  -25.9   -7.2  -58.4
R = NH2
Site       2      3      4      5      6      7
1       18.4   52.0   16.9   30.5   27.6   14.2
2      -20.4  101.4   37.2   31.0   76.1   11.1
3       88.6  -16.7   15.4   37.6   23.3   13.6
4       60.1   22.8  -48.4   85.6   11.7   82.3
5       25.5   26.9   83.1  -41.7   64.5   12.4
6       68.6   19.9   11.4   63.3  -43.9   69.7
7       20.4   22.5   99.5   16.0   81.5  -24.2
R = NO2
Site       2      3      4      5      6      7
1      -62.1  -84.2  -53.2  -58.4  -55.9  -48.2
2      -80.9  -80.3  -58.3  -61.3  -69.0  -57.0
3      -86.1  -90.1  -45.3  -56.8  -56.0  -48.4
4      -70.9  -60.2  -78.7  -66.8  -62.3  -78.5
5      -63.8  -64.1  -70.9  -90.3  -66.0  -65.4
6      -76.1  -66.4  -59.6  -62.1  -85.4  -72.0
7      -60.0  -62.7  -77.6  -65.6  -67.7  -89.8
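The methyl block of Table 3 can be scanned programmatically to locate, for each methylindole, the site whose PA increases the most and the value at the ipso position. The following Python lines are a minimal sketch; the data layout (rows keyed by protonation site, list entries ordered by substituent position 1-7) and all names are illustrative assumptions.

```python
# Relative PAs (kJ/mol) of methylindoles from Table 3.
# Keys: protonation site; list entries: methyl group at positions 1..7.
methyl_rows = {
    1: [17.5, 16.0, 21.7, 7.1, 11.0, 10.4, 8.3],
    2: [12.6, -2.7, 25.8, 13.3, 14.5, 23.1, 9.8],
    3: [27.9, 29.1, -2.4, 7.6, 11.8, 9.8, 7.5],
    4: [12.4, 20.5, 10.4, -7.6, 21.2, 7.1, 21.3],
    5: [13.7, 15.7, 12.6, 21.2, -3.5, 21.0, 11.3],
    6: [13.9, 23.3, 9.9, 10.1, 20.1, -4.3, 18.8],
    7: [15.4, 13.6, 10.9, 24.9, 8.4, 20.8, -6.2],
}
positions = range(1, 8)


def most_activated_site(rows, position):
    """Protonation site with the largest relative PA for a given substituent position."""
    return max(rows, key=lambda site: rows[site][position - 1])


most_activated = {p: most_activated_site(methyl_rows, p) for p in positions}
ipso = {p: methyl_rows[p][p - 1] for p in positions}

print(most_activated)
print(ipso)
```

The same two helpers can be applied unchanged to the -F, -NH2 and -NO2 blocks once their columns are stored in the same layout.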
1,3-Azoles
Azoles are defined as a class of five-membered heterocycles containing at least one nitrogen atom (N3) and another heteroatom. We have applied QTAIM to analyze the electron density of imidazole, N-methylimidazole, oxazole and thiazole and how it evolves along some of their main chemical processes: electrophilic and nucleophilic aromatic substitution, modelled, respectively, by protonation and hydride addition. Deprotonation of neutral molecules to produce the corresponding anion is also among the most characteristic reactions experienced by these compounds. This process is especially significant in the chemistry of imidazoles and thiazoles, which have been extensively employed to give NCN and NCS free carbenes [78]. All of these processes have been studied in the gas phase and in aqueous solution,
represented, respectively, by optimizations of the isolated species and with the Polarized Continuum Model (PCM) [79]. Imidazole proton affinities, PAs, have been extensively computed at different theoretical levels [80-86], while fewer data are available for the deprotonation energies of this compound [80]. The computational studies carried out for oxazole and thiazole are scarce and concentrate on PAs [80,81]. The study comprises the four neutral species shown in Figure 4 and all their corresponding protonated (except that on N1 in 1-methylimidazole) and deprotonated species, as well as those obtained upon hydride addition on C atoms. In what follows, n+Hm and n-Hm refer to those species obtained, respectively, upon proton or hydride addition to the neutral molecule n at atom m (Figure 4). Conjugate bases obtained upon deprotonation are denoted as Bn-m.
n    X
1    N-H
2    N-CH3
3    O
4    S
Figure 4. Atom numbering and nomenclature of 1,3-azoles here studied.
The electron densities were obtained in all cases for completely optimized geometries at the B3LYP/6-311++G(2d,2p) 6d level and confirmed as true minima by vibrational analysis carried out at the same level in the gas phase. ZPVE and enthalpy thermal corrections (ETC) at 298.15 K were considered for computing gas phase PAs, deprotonation and hydride addition enthalpies. The corresponding quantities for aqueous solution were computed by correcting PCM optimized energies with the ZPVE and ETC corrections obtained for the gas phase [87]. 5-centre DIs [58] and FLUπ indexes [88] in planar molecules were also calculated.
Both Δ5 and FLUπ values (Table 4) indicate that oxazole, which is considered the least aromatic compound of the series, displays the lowest electron delocalization (minimum Δ5 and maximum FLUπ) both in the gas phase and in aqueous solution. These indices also indicate that the electron delocalization of thiazole is significantly larger than that of oxazole, but only slightly lower than that of imidazoles. Both indices agree that electron delocalization increases in 1-methylimidazole with regard to imidazole. Nevertheless, Δ5 and FLUπ values obtained from PCM computed electron densities indicate different trends. Thus, Δ5 indicates that the electron delocalization of the imidazole ring decreases in aqueous solution upon N1-methylation, whereas the opposite trend is indicated by FLUπ values. In any case, Δ5 differences between imidazole and its N-methyl derivative do not exceed 8·10⁻⁴ au in PCM and 6·10⁻⁴ au in the gas phase. Comparing electron delocalization indices computed for the same molecule with PCM and gas-phase electron densities, we also observe that Δ5 and FLUπ values display a common
trend (electron delocalization increases in aqueous solution) except in oxazole, where Δ5 values again differ by a very small amount (3·10⁻⁴ au).
Table 4. Δ5 delocalization indices (in au and multiplied by 10²) and dimensionless FLUπ values for neutral species
n    10²·Δ5 (gas)   10²·Δ5 (PCM)   10²·FLUπ (gas)   10²·FLUπ (PCM)
1       3.155          3.450            7.154            5.702
2       3.210          3.373            5.658            5.181
3       2.167          2.202           23.719           24.097
4       3.036          3.151           12.552           11.945
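The comparison between the two delocalization measures in Table 4 can be followed with a short script. The sketch below is illustrative only (names are hypothetical). It relies on the convention used in this section that a larger Δ5 and a smaller FLUπ both mean more delocalization, and it lists the molecules for which the two indices disagree about the effect of the solvent.

```python
# Table 4 entries (already multiplied by 10^2) stored as (gas, PCM) pairs.
delta5 = {
    "imidazole": (3.155, 3.450),
    "1-methylimidazole": (3.210, 3.373),
    "oxazole": (2.167, 2.202),
    "thiazole": (3.036, 3.151),
}
flu_pi = {
    "imidazole": (7.154, 5.702),
    "1-methylimidazole": (5.658, 5.181),
    "oxazole": (23.719, 24.097),
    "thiazole": (12.552, 11.945),
}


def solvent_trend(name):
    """True flags mean 'more delocalized in PCM than in the gas phase'."""
    d5_gas, d5_pcm = delta5[name]
    flu_gas, flu_pcm = flu_pi[name]
    return {"delta5": d5_pcm > d5_gas, "flu_pi": flu_pcm < flu_gas}


# Molecules for which the two indices point in opposite directions.
disagreeing = [name for name in delta5 if len(set(solvent_trend(name).values())) > 1]

# Effect of N1-methylation on Delta5 (in au) in each medium.
methylation_gas = (delta5["1-methylimidazole"][0] - delta5["imidazole"][0]) * 1e-2
methylation_pcm = (delta5["1-methylimidazole"][1] - delta5["imidazole"][1]) * 1e-2

print(disagreeing, methylation_gas, methylation_pcm)
```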
Both gas phase and PCM calculations indicate that N3 is the most favored protonation site for the azoles here studied (Table 5), as expected according to experimental results [81,82], whereas protonation at the other heteroatom (X1) gives rise to the least stable protonated species, contrary to what had been assumed for oxazole and thiazole in a previous paper [80]. Concerning C-protonations, we notice that imidazole and N-methylimidazole follow the sequence PA(C5) > PA(C4) > PA(C2) in both phases, which diverges from that obtained in the gas phase for imidazole using MP2/6-311G(2d,p) calculations [83]. This sequence changes for thiazole, where PA(C2) exceeds that of C4 by less than 20 kJ mol⁻¹. Finally, oxazole follows the same sequence as imidazole in the gas phase and that of thiazole in aqueous solution.
Table 5. Computed and experimental PAs (in kJ mol⁻¹) for the diverse protonation sites
        PA(gas)                                             PA(PCM)
n       X1      C2      N3      C4      C5      Exp.a      X1       C2       N3       C4       C5
1     698.0   769.9   904.1   772.0   786.5    942.8     959.9   1005.1   1138.5   1013.2   1019.8
2       -     801.7   925.6   805.6   814.2    959.6       -     1007.7   1137.0   1013.9   1020.5
3     624.1   706.4   836.7   702.1   729.5    876.4     924.0    944.3   1393.6    946.2    970.1
4     699.8   728.7   863.1   720.0   744.8    904.0     829.9    961.7   1471.4    956.9    978.8
a Experimental data for the preferred protonation site taken from reference 73.
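The agreement between the computed gas-phase PAs at N3 and the experimental values of Table 5 can be quantified with a squared Pearson coefficient. The snippet below is a minimal sketch; the helper function and the variable names are hypothetical, and the four data pairs are ordered as n = 1-4.

```python
# Gas-phase PAs (kJ/mol) from Table 5 for n = 1-4.
computed_n3 = [904.1, 925.6, 836.7, 863.1]
experimental = [942.8, 959.6, 876.4, 904.0]


def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


r2 = r_squared(computed_n3, experimental)
# Signed deviations computed minus experimental, one per molecule.
deviations = [round(c - e, 1) for c, e in zip(computed_n3, experimental)]

print(r2, deviations)
```

With only four points the coefficient mainly confirms that both series are ordered in the same way; the signed deviations show whether the computed values are shifted systematically.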
We notice that gas phase experimental and computed PAs follow the same relative sequence for this series (r² = 0.997). Nevertheless, the values computed for the aqueous solvated species with PCM indicate a different sequence, with thiazole displaying the largest PA followed by oxazole, whereas 1-methylimidazole and imidazole display the lowest PAs (Table 5). If we exclude the gas phase PA computed for thiazole at S1, the only case where protonation breaks the ring, we observe that the same sequence of PAs at the most stable site is followed by the remaining sites. This observation cannot be extended to aqueous solution (Table 5). The protonation of 1,3-azoles has no consequences on the planarity of the ring, with the exception of oxazole protonated at O1, which is non-planar in the gas phase, giving rise to the lowest PA in the series. The non-planar O1-protonated oxazole is 1.5 kJ mol⁻¹ more stable than the corresponding planar structure, which displays one imaginary frequency, even though its electron delocalization diminishes from 0.0040 au (planar form) to 0.0024 au (non-planar conformer).
We also notice that O1 protonation of oxazole in PCM produces a planar species, which is only 0.1 kJ mol⁻¹ more stable than the corresponding non-planar singular point. Both Δ5 and FLUπ values indicate that, in the gas phase, the electron delocalization is reduced significantly by most of the protonations except that on N3, where we observe a slight increase of electron delocalization upon protonation with the exception of 1-methylimidazole. There is a good linear correlation between Δ5 and FLUπ in the gas phase (r² = 0.86) that worsens significantly (r² = 0.51) for PCM calculations. In contrast, no significant correlation is observed between PAs and the variation of delocalization indices. Looking at N(Ω) values and their evolution upon protonation (data not shown) we observe: i) the largest positive charge of the most stable protonation of each molecule (and for all protonations at heteroatoms excluding the ring-breaking protonation at S1) is displayed by the proton, which, as in previous studies [18,19,21-23,71,72], contradicts the Lewis structure usually employed to represent protonated heterocycles; ii) in contrast, the proton gains a substantial amount of electron population (between 0.85 and 0.875 au) at C sites; iii) consequently, the most stable protonations of 1,3-azoles are accompanied by smaller electron reorganizations in the molecule; iv) in all the protonations at N3, the electron density lost by the hydrogens of the molecule represents more than 50% of that gained by the proton. This percentage is reduced significantly in thiazole, which can be related to the population provided by S1 (0.300 au) because of its large polarizability; v) in general, C-protonations take place with hydrogen electron donations that represent less than 50%; vi) excluding protonations at X1 (the least favored ones) and C5-protonation of thiazole, the electron population of the protonation site, Y, is enlarged. 
In most cases this atom and the proton are the only atoms displaying positive ΔN(Ω) values. ΔN(C6) values are also positive for all the protonations of 1-methylimidazole. All of this can be explained because the deformation experienced by the electron density distribution is continuous and displays its highest intensities along the bonds, resembling a flow of electron density through the set of chemical bonds [23]. Thus, ΔN(Y) is positive because atom Y is at a junction which receives electron density through three bonds and sends it to the proton through only one. ΔN(C6) is also positive for all of the protonations at the ring of 1-methylimidazole because the electron density gained by this atom from its three attached hydrogens exceeds that lost by its own basin through the C6-N1 bond.
We have observed that PCM calculations reproduce the experimental preferred deprotonation site for these compounds (C2). Nevertheless, gas phase calculations predict that deprotonation at C5 is favored over that at C2 for 1-methylimidazole by 8.3 kJ mol⁻¹ (this relative energy becomes -1.4 kJ mol⁻¹ with PCM optimizations). Δ5 values show that the ring delocalization is always smaller in the anions than in the neutral molecule. The decrease observed for the deprotonation at C5 (-1.8·10⁻³ au with gas phase values) is significantly smaller than those for the C2 (-4.4·10⁻³ au) and C4 (-10.5·10⁻³ au) carbenes. This sequence of delocalization reductions remains unchanged with PCM-optimized electron densities, although the values become smaller (-0.9·10⁻³, -2.7·10⁻³, and -4.5·10⁻³ au, respectively). As the relative sequence of Δ5 values is the same in PCM and gas-phase results, we infer that electron delocalization cannot be employed as the only element to explain the deprotonation preference in these compounds. Gas phase results indicate that the basicity of the heteroatom at position 3 is always larger than that of the heteroatom at position 1. The difference between both follows the sequence: oxazole > imidazole > thiazole.
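The basicity gap between the two heteroatoms can be read directly from the gas-phase columns of Table 5. The lines below are a minimal sketch with hypothetical names; 1-methylimidazole is left out because its N1 protonation was not studied.

```python
# Gas-phase PAs (kJ/mol) at the two heteroatoms, taken from Table 5.
pa_gas = {
    "imidazole": {"X1": 698.0, "N3": 904.1},
    "oxazole": {"X1": 624.1, "N3": 836.7},
    "thiazole": {"X1": 699.8, "N3": 863.1},
}

# PA(N3) - PA(X1) for each molecule, rounded to the precision of the table.
gap = {name: round(values["N3"] - values["X1"], 1) for name, values in pa_gas.items()}

# Molecules ordered from the largest to the smallest gap.
ordering = sorted(gap, key=gap.get, reverse=True)

print(gap)
print(" > ".join(ordering))
```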
Anthocyanidins
Anthocyanins are a class of flavonoids that are responsible for the blue to red colors in higher plants [89,90]. As no harmful effects have been established for them, anthocyanins are useful as natural food colorants [91]. Also, anthocyanins have been reported to possess antioxidant properties in vitro [92-94] and are considered important nutraceuticals whose abundance in the human diet has been related to a lower incidence of diverse degenerative diseases [95,96]. The flavylium cation, F, constitutes the basic skeleton of anthocyanidins in acidic media, and the pelargonidin cation, P, is the simplest natural anthocyanidin, with hydroxyls attached to positions 3, 5, 7 and 4'. In both molecules the AC bicycle and the B ring are coplanar [97,98]. Several resonance structures, which localize the positive charge on different atoms, can be drawn. According to the RM, this set of structures indicates that the positive charge of these cations is shared among diverse atoms (O1, C2, C4, C5, C7, C9, C2', C4', and C6'). Nevertheless, only some of these forms are usually employed to represent these cations, form I being the most widespread in the literature. On the basis of a Mulliken bond order analysis of AM1 optimized structures, Pereira et al. have proposed that the canonical forms represented by III in Figure 5 are predominant in the flavylium cation [98], with smaller contributions from I and IV, whereas the participation of V and other forms leaving the positive charge out of ring C, like VI, is negligible. The situation is modified in hydroxylated compounds, like P, where the relative weight of forms leaving a positive charge on the hydroxylated carbon increases [98]. Thus, form VI becomes especially significant for 4'-OH derivatives, like pelargonidin [98,99]. In contrast, a recent paper by Woodford [100] interprets the long C2-C1' bond distance obtained for the cations of a series of eleven anthocyanidins as a lack of resonance conjugation between the AC and B systems. Also, bond lengths within the C ring are interpreted in that paper as favoring form III.
Figure 5. Some of the common Lewis structures (I-VI) employed to represent flavylium cations and atom numbering (rings A, C and B).
The summations of N(Ω) values for the AC bicycle and the B ring in the flavylium cation indicate positive charges of 0.613 au in the former and of 0.387 au in the latter. This is in line neither with the suggestion of Pereira et al. [98,99] nor with the interpretation of Woodford [100], and points to an important participation of the B ring in the distribution of the positive charge. Moreover, respectively 66.4 % and 60.3 % of both positive charges are due to the electron population of the hydrogens. Individual values of N(Ω) should be interpreted carefully. In fact, the most positive charges are displayed by C2 and C9, which could be misinterpreted as indicative of the predominance of IV and III. Looking at the value of N(O1), we realise that this fact is mainly due to the electronegativity of oxygen, as the total electron population of the C2-O1-C9 unit is 20.071 au. That is, the COC region of the AC bicycle displays a slight negative charge, in contrast to what is indicated by the canonical forms that are widely used or proposed as predominant by the theoretical studies hitherto carried out (I-IV). As hydrogens, being weak electron density attractors, usually play an electron density sink or supplier role [20], we have used the electron densities of every CH group to analyze the relative predominance of the canonical forms shown in Figure 5. The most positive CH group corresponds to C8 (0.124 au), which according to the RM should not bear a positive charge. Moreover, the third highest positive charge is displayed by the CH group at C3 (0.115 au). The positive charge on ring B is distributed nearly uniformly among the CH groups, if we exclude C6', which displays the lowest value (0.041 au) even though it is one of the positions that should display larger positive charges (C2', C4' and C6') according to the RM. We conclude that the RM is inadequate to describe the global electron density distribution of the flavylium cation.
How does this picture evolve upon polyhydroxylation? All the carbons attached to hydroxyl groups display lower N(Ω) values in P than in F [101]. Nevertheless, this cannot be interpreted as a reinforcement of the role played by the resonance forms leaving a positive charge on one of those carbons outside the C ring. In fact, the global charges on the C-O-H fragments are less positive in P than the corresponding C-H charges in F [101]. Overall, hydroxylation increases the electron density of these regions by 0.050 au. Part of this electron density is taken from the C2-O1-C9 group, which is 0.021 au less negative in P than in F. The most affected atom is H6', which loses 0.060 au in P with regard to F and becomes involved in an intramolecular H-bond with O3. In contrast, N(Ω) values in some unsubstituted C-H groups remain practically unchanged from F to P (those at C4, C6 and C2'). Polyhydroxylation also alters the distribution of the positive charge between the AC and B systems. Thus, with a larger number of electronegative substituents on AC, the positive charge of ring B increases (0.448 au) and that of AC decreases (0.551 au). Comparing with the respective values in F, we infer an electron density transfer of 0.062 au from B to AC. Overall, once more, the resonance model cannot be taken as an indicator for global electron density distributions.
As resonance forms are obtained by concerted movements of π-electron pairs, they may represent exclusively the π-electron distribution. Therefore, it could be expected that atoms bearing a positive charge in some of the resonance forms shown in Figure 5 should present the lowest Nπ(Ω) values. This is true for some atoms of F, like C2, C4, C7, C4' and C2'. Nevertheless, Nπ(C6) is smaller than Nπ(C5), although no resonance form displays a positive charge on C6 and one can be written leaving the positive charge on C5. Also, Nπ(Ω) values for C3' and C5' are smaller than Nπ(C9). In contrast, the fact that Nπ(C4) > Nπ(C9) agrees with
the RM, where four equivalent resonance forms V leave a positive charge on C4, while only two equivalent forms of IV leave it on C9. Grouping the formal π-charges that can be obtained from the Nπ(Ω) values, we observe that almost half of the positive π-charge is placed on the C2-O1-C9 fragment (+0.435 au), more intensely on C2 and O1, indicating that resonance forms I, II and III are those that provide a better description of the π-electron distribution. The slightly negative charge displayed globally by this region results from a balance of π and σ charges with opposite signs. The rest of the AC system contributes +0.338 au, and the smallest part of the π-charge is placed on the B ring (+0.226 au). When these values are compared with the corresponding global charges, the conclusion is that both regions display positive σ and π charges. The σ and π charges are approximately equal for the rest of AC, whereas the π charge is more positive than the σ one in ring B. According to the RM, polyhydroxylation of F to yield P should reinforce the weight of VI-like resonance forms. Looking at the Nπ(Ω) values, we observe that C4' displays the most positive π atomic charge in P, highlighting the significance of a positive charge at this position of the cation. This atom is followed by C7, and both of them exceed the positive π-charge at C2. Thus, forms VI provide the best description of the π-electron distribution of P, in contrast with the formulas commonly drawn for this cation in the literature (I-III), which are those favored by the RM according to the number of equivalent structures. Taking into account the π-electron donations of the OH groups received by each ring system (0.108 and 0.243 au for B and AC, respectively) and the variations in the summation of the corresponding Nπ(Ω) values (0.046 and 0.305 au, respectively), we conclude that polyhydroxylation takes place with an electron transfer of 0.062 au from B to AC.
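The π-electron balance used in this argument can be checked with a short calculation. The sketch below assumes that the net gain of a ring system from the other ring equals the change in its summed Nπ(Ω) minus the π-donation it receives from its own OH groups; the function name is hypothetical, and the inputs are the values quoted above.

```python
def net_inter_ring_gain(oh_donation, population_change):
    """pi-electron density (au) gained from the other ring system.

    oh_donation: pi-density donated to this ring system by its OH groups.
    population_change: change in the summed N_pi of this ring system (F -> P).
    A negative result means the ring system loses density to the other one.
    """
    return population_change - oh_donation


gain_b = net_inter_ring_gain(oh_donation=0.108, population_change=0.046)
gain_ac = net_inter_ring_gain(oh_donation=0.243, population_change=0.305)

print(f"B ring:    {gain_b:+.3f} au")
print(f"AC system: {gain_ac:+.3f} au")
```

The two results are equal in magnitude and opposite in sign, as expected if the density lost by B is the density gained by AC.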
The 0.062 au π-electron transfer from B to AC coincides with the global change of electron density indicated above, showing that all the electron transfer accompanying polyhydroxylation has π character. When the whole set of main natural anthocyanidins, pelargonidin (Pg), cyanidin (Cy), delphinidin (Dp), peonidin (Pn), petunidin (Pt), and malvidin (Mv), is considered with all their acid/base forms (cation, neutral form and anion), we conclude that in the ionic species the charge is distributed over the whole molecule. Thus, summing the atomic charges of the atoms of the B ring and those of the AC system shows that, for all the anthocyanidins, the positive charge of the cationic forms is spread throughout the whole molecule, with over 44% of the charge in the B ring. We also notice that more than 30% of the total positive charge is at the hydrogens. As stated above for the flavylium and pelargonidin cations, it is not possible to localize the positive charge on a specific atom or set of atoms. Thus, the commonly drawn Lewis structures for these cations, in which the positive charge is usually located at O1 or C2, are not supported by QTAIM results. O1, far from bearing a positive charge, displays a significant negative one, and even summing the atomic populations of C2, O1 and C9 (hereafter referred to as the α group) we find a negative formal charge. Moreover, the electron population of the α group in the cations is only slightly depleted with regard to the neutral forms: by between 0.063 and 0.075 au in the gas phase along the series of anthocyanidins, and by between 0.065 and 0.088 au with PCM. In contrast, the largest increases of positive charge upon protonation of the neutral forms are obtained for the C-O group that receives the proton and for the whole B ring (even when the protonation site is located on the AC system, Δq(B) exceeds 0.240 au in the gas phase or 0.100 au with PCM).
For the anions, QTAIM analysis indicates the negative charge is mainly spread among three regions of the molecule: the two C-O units where hydrogens have been removed
(averages are: -0.277 au for C5-O5, -0.275 au for C4’-O4’ and -0.317 au for C3-O3) and the α area, whose charge varies from -0.146 to -0.226 au. Relative charges, with regard to the corresponding neutral forms, indicate that more than 40% (more than 50% with PCM) of the charge involved in the deprotonation process is taken up by the deprotonated C-O units. That is, these groups act as electron density sinks. Also, the rest of the hydrogens receive a significant amount of electron density (from 0.079 to 0.200 au), although most of them keep positive atomic charges. Finally, the electron density gained by the α region represents less than 5% in all cases. Overall, as previously reported by Slee and MacDougall for allyl ions [6], QTAIM net charges are not in accord with RM expectations and/or predictions based on simple π orbital models. QTAIM analysis can also be employed to gain insight into the enolate vs. quinonoidal character of neutral species. The quinonoidal Lewis structure containing a C=O double bond that is usually drawn for the neutral form of anthocyanidins is mainly supported by comparing bond lengths between the most stable neutral and cationic rotamers. Nevertheless, several resonance structures displaying positive and negative sites (enolate-like structures) can also be drawn for neutral anthocyanidins. Taking into account the previously observed shortcomings of resonance structures for qualitative descriptions of electron distributions, we think that the electron distribution of neutral anthocyanidins should be analyzed in some detail before accepting that they bear a quinonoidal structure. As a general rule, we observe that electroneutrality is obtained by compensation between substantially negatively charged areas (the α group and the deprotonated CO) and the rest of the molecule, where all the atoms (with some exceptions) bear moderate positive charges.
It should be remarked that the charge of the deprotonated CO unit (COdep) is, in all molecules and tautomers (named by the site from which the proton has been abstracted in the cation), more negative than -0.206 au. Thus, looking at the most stable tautomers, we observe that the C4’-O4’ bond displays a negative charge (-0.236 au) in the 4’ tautomer of Cy, whereas the corresponding COH group in the cation is slightly positive (0.036 au), as was also obtained for the 4’ tautomer of Dp (-0.277 au and 0.086 au, respectively) [102]. The same pattern is observed in the 5 tautomer of the remaining anthocyanidins. Thus, we find that the C5-O5 bond goes from a slightly positive charge in the cations (0.063 to 0.068 au) to a negative charge (-0.206 to -0.210 au) in the neutral anthocyanidin. In general, q(COdep) is more negative than the charges found for a carbonyl group in aliphatic systems [52], even though there are fewer surrounding hydrogens (the atoms that act as electron donors) in anthocyanidins than in aliphatic aldehydes and ketones. Although this could point to a significant weight of the enolate forms, we do not observe the accompanying pattern predicted by the RM: positive charges at C2, C4, C5, C7, C9, C2’, C4’ and C6’, and neutral charges at C3, C6, C8, C10, C1’, C3’ and C5’. Clearly, some QTAIM atomic charges are not in line with this rule, e.g. q(C1’) > q(C2’), q(C10) > q(C4), etc. Overall, QTAIM results indicate that, in spite of their geometry, neutral forms display a certain enolate character, bearing a negative charge around the deprotonated oxygen that is counterbalanced in a different fashion from that expected from the typical enolate-like resonance structures.
Acknowledgments
We are indebted to the Spanish MICINN for financial support through project CTQ2006-15500/BQU and to the Centro de Supercomputación de Galicia (CESGA) for access to its computational facilities. M.M. thanks the Xunta de Galicia for financial support as a researcher of the "Isidro Parga Pondal" program. N.O. thanks the University of Vigo for a predoctoral fellowship.
References
[1] Bader, R.F.W. A quantum theory of molecular structure and its applications. Chem. Rev. 1991, 91, 893-928.
[2] Bader, R.F.W. Atoms in molecules: a quantum theory, Oxford University Press: Oxford, UK, 1990.
[3] Popelier, P.L.A. Atoms in molecules. An introduction, Prentice Hall: Harlow, UK, 2000.
[4] Wiberg, K.B.; Laidig, K.E. Barriers to rotation adjacent to double bonds. 3. The C-O barrier in formic acid, methyl formate, acetic acid, and methyl acetate. The origin of ester and amide “resonance”. J. Am. Chem. Soc. 1987, 109, 5935-5943.
[5] Wiberg, K.B.; Breneman, C.M. Resonance interactions in acyclic systems. 3. Formamide internal rotation revisited. Charge and energy redistribution along the C-N bond rotational pathway. J. Am. Chem. Soc. 1992, 114, 831-840.
[6] Slee, T.S.; MacDougall, P.J. The correspondence between Hückel theory and ab initio atomic charges in allyl ions. Can. J. Chem. 1988, 66, 2961-2962.
[7] Perrin, C.L. Atomic size dependence of Bader electron populations: significance for questions of resonance stabilization. J. Am. Chem. Soc. 1991, 113, 2865-2868.
[8] Laidig, K.E. Use of nuclear potential to investigate the atomic size dependency of populations defined within the theory of atoms in molecules. J. Am. Chem. Soc. 1992, 114, 7912-7912.
[9] Gatti, C.; Fantucci, P. Are Bader electron populations atomic size dependent? J. Phys. Chem. 1993, 97, 11677-11680.
[10] Laidig, K.E.; Cameron, L.M. Barrier to rotation in thioformamide: implications for amide resonance. J. Am. Chem. Soc. 1996, 118, 1737-1742.
[11] Glaser, R. Diazonium ions. A theoretical study of pathways to automerization, thermodynamic stabilities, and topological electron density analysis of the bonding. J. Phys. Chem. 1989, 93, 7993-8003.
[12] Glaser, R.; Choy, G.S.-C. Importance of the anisotropy of atoms in molecules for the representation of electron density distributions with Lewis structures. A case study of aliphatic diazonium ions. J. Am. Chem. Soc. 1993, 115, 2340-2347.
[13] Speers, P.; Laidig, K.E.; Streitwieser, A. Origins of the acidity trends in dimethyl sulfide, dimethyl sulfoxide, and dimethyl sulfone. J. Am. Chem. Soc. 1994, 116, 9257-9261.
[14] Graña, A.M.; Mosquera, R.A. Effect of protonation on the atomic and bond properties of the carbonyl group in aldehydes and ketones. Chem. Phys. 1999, 243, 17-26.
[15] Mosquera, R.A.; Graña, A.M. Properties and transferability in molecules containing carbonyl groups. Rec. Res. Devel. Chem. Phys. 2001, 2, 23-36.
[16] Vila, A.; Mosquera, R.A. Topological analysis of fluorinated dimethyl ethers and their protonated forms. J. Phys. Chem. A 2000, 104, 12006-12013.
[17] Vila, A.; Mosquera, R.A. A comparative AIM study of alkyl monoethers and their protonated forms. Chem. Phys. Lett. 2000, 332, 474-480.
[18] Vila, A.; Mosquera, R.A. AIM study on the protonation of methyl oxiranes. Chem. Phys. Lett. 2003, 371, 540-547.
[19] Vila, A.; Mosquera, R.A. Electron density analysis of small ring ethers. Tetrahedron 2001, 57, 9415-9422.
[20] Stutchbury, N. C. J.; Cooper, D. L. Charge partitioning by zero flux surfaces: the acidities and basicities of simple aliphatic alcohols and amines. J. Chem. Phys. 1983, 79, 4967-4972.
[21] González Moa, M.J.; Mosquera, R.A. Applicability of resonance forms in pyrimidinic bases. An AIM study. J. Phys. Chem. A 2003, 107, 5361-5367.
[22] González Moa, M.J.; Mosquera, R.A. On the applicability of resonance forms in pyrimidinic bases. II. QTAIM interpretation of the sequence of protonation affinities. J. Phys. Chem. A 2005, 109, 3682-3686.
[23] González Moa, M.J.; Mandado, M.; Mosquera, R.A. Explaining the sequence of protonation affinities of cytosine with QTAIM. Chem. Phys. Lett. 2006, 428, 255-261.
[24] López, J.L.; Graña, A.M.; Mosquera, R.A. Electron density analysis on the protonation of nitriles. J. Phys. Chem. A 2009, 113, 2652-2657.
[25] Chandra, A.K.; Nguyen, M.T.; Uchimaru, T.; Zeegers-Huyskens, T. Protonation and deprotonation enthalpies of guanine and adenine and implications for the structure and energy of their complexes with water: comparison with uracil, thymine, and cytosine. J. Phys. Chem. A 1999, 103, 8853-8860.
[26] Hirshfeld, F.L. Bonded-atom fragments for describing molecular charge densities. Theor. Chim. Acta 1977, 44, 129-138.
[27] De Proft, F.; Van Alsenoy, C.; Peeters, A.; Langenaeker, W.; Geerlings, P. Atomic charges, dipole moments, and Fukui functions using the Hirshfeld partitioning of the electron density. J. Comput. Chem. 2002, 23, 1198-1209.
[28] Mandado, M.; Van Alsenoy, C.; Mosquera, R.A. Comparison of the AIM and Hirshfeld totals, σ, and π charge distributions: A study of protonation and hydride addition processes. J. Phys. Chem. A 2004, 108, 7050-7055.
[29] Graña, A.M.; Hermida-Ramón, J.M.; Mosquera, R.A. QTAIM interpretation of the basicity of substituted anilines. Chem. Phys. Lett. 2005, 412, 106-109.
[30] Mandado, M.; Mosquera, R.A.; Graña, A.M. AIM interpretation of the acidity of phenol derivatives. Chem. Phys. Lett. 2004, 386, 454-459.
[31] Mandado, M.; Van Alsenoy, C.; Mosquera, R.A. Electron charge redistribution upon hydride addition to carbonylic compounds. Chem. Phys. Lett. 2005, 405, 10-17.
[32] Mosquera, R.A.; Mandado, M.; Graña, A.M. Electron density reorganization in simple nucleophilic attacks, to be submitted.
[33] Mandado, M.; Van Alsenoy, C.; Mosquera, R.A. Joint QTAIM and Hirshfeld study of the σ and π charge distribution and electron delocalization in carbonyl compounds: A comparative study with the resonance model. J. Phys. Chem. A 2005, 109, 8624-8631.
[34] Bader, R. F. W. From Schrödinger to atoms in molecules. Pure Appl. Chem. 1988, 60, 145-155.
[35] Schwinger, J. The theory of quantized fields I. Phys. Rev. 1951, 82, 914-927.
[36] Poater, J.; Solà, M.; Bickelhaupt, F. M. Hydrogen-hydrogen bonding in planar biphenyl, predicted by atoms-in-molecules theory, does not exist. Chem. Eur. J. 2006, 12, 2889-2895.
[37] Bader, R. F. W. Pauli repulsions exist only in the eye of the beholder. Chem. Eur. J. 2006, 12, 2896-2901.
[38] Haaland, A.; Shorokhov, D. J.; Tverdova, N. V. Topological analysis of electron densities: is the presence of an atomic interaction line in an equilibrium geometry a sufficient condition for the existence of a chemical bond? Chem. Eur. J. 2004, 10, 4416-4421; Chem. Eur. J. 2004, 10, 6210 (Corrigendum).
[39] Bader, R. F. W.; De-Cai, F. Properties of atoms in molecules: caged atoms and the Ehrenfest force. J. Chem. Theory Comput. 2005, 1, 403-414.
[40] Cioslowski, J.; Mixon, S. T. Can. J. Chem. 1992, 70, 1263-1270.
[41] Cioslowski, J.; Mixon, S. T. Topological properties of electron density in search of steric interactions in molecules: Electronic structure calculations on ortho-substituted biphenyls. J. Am. Chem. Soc. 1992, 114, 4382-4387.
[42] Bader, R. F. W. A bond path: A universal indicator of bonded interactions. J. Phys. Chem. A 1998, 102, 7314-7323.
[43] Cao, W.L.; Gatti, C.; MacDougall, P.J.; Bader, R.F.W. On the presence of non-nuclear attractors in the charge distributions of Li and Na clusters. Chem. Phys. Lett. 1987, 141, 380-385.
[44] Gatti, C.; Fantucci, P.; Pacchioni, G. Charge density topological study of bonding in lithium clusters Part I: Planar Lin clusters (n=4, 5, 6). Theor. Chim. Acta 1987, 72, 433-458.
[45] Cioslowski, J. Nonnuclear attractors in the Li2 molecule. J. Phys. Chem. 1990, 94, 5496-5498.
[46] Edgecombe, K.E.; Esquivel, R.O.; Smith, Jr, V.H. Pseudoatoms of the electron density. J. Chem. Phys. 1992, 97, 2593-2599.
[47] Alcoba, D.R.; Lain, L.; Torre, A.; Bochicchio, R.C. Treatments of non-nuclear attractors within the theory of atoms in molecules. Chem. Phys. Lett. 2005, 407, 379-383.
[48] Alcoba, D.R.; Lain, L.; Torre, A.; Bochicchio, R.C. Treatment of non-nuclear attractors within the theory of atoms in molecules II: Energy decompositions. Chem. Phys. Lett. 2006, 426, 426-430.
[49] Mandado, M.; Vila, A.; Graña, A. M.; Mosquera, R. A.; Cioslowski, J. Transferability of energies of atoms in organic molecules. Chem. Phys. Lett. 2003, 371, 739-743.
[50] Cortés-Guzmán, F.; Bader, R. F. W. Transferability of group energies and satisfaction of the virial theorem. Chem. Phys. Lett. 2003, 379, 183-192.
[51] Lehd, M.; Jensen, F. A general procedure for obtaining wave functions obeying the virial theorem. J. Comput. Chem. 1991, 12, 1089-1096.
[52] Graña, A.M.; Mosquera, R.A. The transferability of the carbonyl group in aldehydes and ketones. J. Chem. Phys. 1999, 110, 6606-6616.
[53] Frisch, M.J. et al. Gaussian03, Revision C.02, Gaussian, Inc., Wallingford, CT, 2004.
[54] Bader, R.F.W., AIMPAC: A suite of programs for the theory of atoms in molecules, McMaster University, Hamilton, Ontario, Canada, 1994.
[55] Bader, R.F.W.; Streitwieser, A.; Neuhaus, A.; Laidig, K.E.; Speers, P. Electron delocalization and the Fermi hole. J. Am. Chem. Soc. 1996, 118, 4959-4965.
[56] Fradera, X.; Austen, M.A.; Bader, R.F.W. The Lewis model and beyond. J. Phys. Chem. A 1999, 103, 304-314.
[57] Bader, R.F.W.; Matta, C.F. Bonding to titanium. Inorg. Chem. 2001, 40, 5603-5611.
[58] Mandado, M.; González Moa, M.J.; Mosquera, R.A. QTAIM n-center delocalization indices as descriptors of aromaticity in mono and poly heterocycles. J. Comput. Chem. 2007, 28, 127-136.
[59] Bultinck, P.; Carbó-Dorca, R.; Ponec, R. Advances in Computational Methods in Science and Engineering; Lecture Series on Computer and Computational Science, vol 4, 2005, pp. 1236-1239.
[60] Mandado, M. NDELOC: A program to compute the total and π n-centre delocalization and n-order localization indices. University of Vigo, 2006.
[61] Somers, K.R.F.; Kryachko, E.S.; Ceulemans, A. Theoretical study of indole: Protonation, indolyl radical, tautomers of indole, and its interaction with water. Chem. Phys. 2004, 301, 61-79.
[62] Van Mourik, T.V. Comment on "theoretical study of indole: protonation, indolyl radical, tautomers of indole, and its interaction with water" [Chem. Phys. 301 (2004) 61-79]. Chem. Phys. 2004, 304, 317-319.
[63] Hoyuelos, F.J.; García, B.; Ibeas, S.; Muñoz, M.S.; Navarro, A.M.; Peñacoba, I.; Leal, J.M. Protonation sites of indoles and benzoylindoles. Eur. J. Org. Chem. 2005, 1161-1171.
[64] Hinman, R.L.; Whipple, E.B. The protonation of indoles: position of protonation. J. Am. Chem. Soc. 1962, 84, 2534-2539.
[65] Hinman, R.L.; Lang, J. The protonation of indoles. Basicity studies. The dependence of acidity functions on indicator structure. J. Am. Chem. Soc. 1964, 86, 3796-3806.
[66] Chen, H.J.; Hakka, L.E.; Hinman, R.L.; Kresge, A.J.; Whipple, E.B. Basic strength of carbazole. Estimate of the nitrogen basicity of pyrrole and indole. J. Am. Chem. Soc. 1971, 93, 5102-5107.
[67] Catalán, J.; Pérez, P.; Yáñez, M. A theoretical study of the protonation of methylindole derivatives. Tetrahedron 1982, 38, 3693-3699.
[68] Catalán, J.; Yáñez, M. α vs. β protonation of pyrrole and indole. J. Am. Chem. Soc. 1984, 106, 421-422.
[69] Jones, R.A.; Bean, G.P. The chemistry of pyrroles, Academic Press: London, UK, 1977.
[70] Gassner, R.; Krumbholz, E.; Steuber, F.W. Stabile protonierte pyrrole. Liebigs Ann. Chem. 1981, 789-791.
[71] Otero, N.; González Moa, M.J.; Mandado, M.; Mosquera, R.A. QTAIM study of the protonation of indole. Chem. Phys. Lett. 2006, 428, 249-254.
[72] Otero, N.; Mandado, M.; Mosquera, R.A. Nucleophilicity of indole derivatives: Activating and deactivating effects based on proton affinities and electron density properties. J. Phys. Chem. A 2007, 111, 5557-5562.
[73] Hunter, E.P.; Lias, S.G. Evaluated gas phase basicities and proton affinities of molecules: an update. J. Phys. Chem. Ref. Data 1998, 27, 413-656.
[74] Bader, R.F.W.; Chang, C. Properties of atoms in molecules: Electrophilic aromatic substitution. J. Phys. Chem. 1989, 93, 2946-2956.
[75] Mandado, M.; González Moa, M.J.; Mosquera, R.A. Chemical graph theory and n-center electron delocalization indices: A study on polycyclic aromatic hydrocarbons. J. Comput. Chem. 2007, 28, 1625-1633.
[76] Bader, R.F.W. Properties of atoms and bonds in carbocations. Can. J. Chem. 1986, 64, 1036-1045.
[77] Dewar, M.J.S. The Electronic Theory of Organic Chemistry, Oxford University Press: London, UK, 1949.
[78] Arduengo, A. J.; Goerlich, J. R.; Marshall, W. J. A stable thiazol-2-ylidene and its dimer. Liebigs Annalen 1997, 365-374.
[79] Tomasi, J.; Mennucci, B.; Cammi, R. Quantum mechanical continuum solvation models. Chem. Rev. 2005, 105, 2999-3093.
[80] Kabir, S.; Sapse, A.M. An ab initio study of the proton affinities of some heteroatomic rings: imidazole, oxazole, and thiazole. J. Comput. Chem. 1991, 12, 1142-1146.
[81] Meot-Ner, M.; Liebman, J.F.; Del Bene, J.E. Proton affinities of azoles: experimental and theoretical studies. J. Org. Chem. 1986, 51, 1105-1110.
[82] Nguyen, V. Q.; Tureček, F. Protonation sites in gaseous pyrrole and imidazole: a neutralization-reionization and ab initio study. J. Mass Spectrom. 1996, 31, 1173-1184.
[83] Bu, L.; Cukier, R. I. Effects of donors and acceptors on the energetics and mechanism of proton, hydrogen, and hydride release from imidazole. J. Phys. Chem. B 2004, 108, 10089-10100.
[84] Blanco, F.; Alkorta, I.; Zborowski, K.; Elguero, J. Substitution effects in N-pyrazole and N-imidazole derivatives along the periodic table. Struct. Chem. 2007, 18, 965-975.
[85] Rao, J. S.; Sastry, G. N. Proton affinity of five-membered heterocyclic amines: assessment of computational procedures. Int. J. Quantum Chem. 2006, 106, 1217-1224.
[86] Da Silva, G.; Moore, E. E.; Bozzelli, J. W. Quantum chemical study of the structure and thermochemistry of the five-membered nitrogen-containing heterocycles and their anions and radicals. J. Phys. Chem. A 2006, 110, 13979-13988.
[87] Rezabal, E.; Mercero, J. M.; Lopez, X.; Ugalde, J. M. A theoretical study of the principles regulating the specificity for Al(III) against Mg(II) in protein cavities. J. Inorg. Biochem. 2007, 101, 1192-1200.
[88] Matito, E.; Duran, M.; Solà, M. The aromatic fluctuation index (FLU): a new aromaticity index based on electron delocalization. J. Chem. Phys. 2005, 122, 14109-1,-8.
[89] Harborne, J.B.; Grayer, R.J. in: J.B. Harborne (Ed.), The Flavonoids: Advances in Research since 1980, Chapman and Hall: London, UK, 1988.
[90] Koes, R.E.; Quattrocchio, F.; Mol, J.N.M. The flavonoid biosynthetic pathway in plants: function and evolution. BioEssays 1994, 16, 123-132.
[91] Guedes, M.C., Ph.D. Thesis, Faculdade de Engenharia de Alimentos da Universidade de Campinas, Campinas SP, Brazil, 1993.
[92] Wang, H.; Cao, G.; Prior, R.L. Oxygen radical absorbing capacity of anthocyanins. J. Agric. Food Chem. 1997, 45, 304-309.
[93] Pool-Zobel, B.L.; Bub, A.; Schroder, N.; Rechkemmer, G. Anthocyanins are potent antioxidants in model systems but do not reduce endogenous oxidative DNA damage in human colon cells. Eur. J. Nutr. 1999, 38, 227-234.
[94] Tsuda, T.; Shiga, K.; Kawakishi, S.; Osawa, T. Inhibition of lipid peroxidation and the active oxygen radical scavenging effect of anthocyanin pigments isolated from Phaseolus vulgaris L. Biochem. Pharmacol. 1996, 52, 1033-1039.
[95] Haslam, E. Practical Polyphenolics, Cambridge University Press: Cambridge, UK, 1998.
[96] Garrote, G.; Cruz, J.M.; Moure, A.; Domínguez, H.; Parajó, J.C. Antioxidant activity of byproducts from the hydrolytic processing of selected lignocellulosic materials. Trends Food Sci. Tech. 2004, 15, 191-200.
[97] Estévez, L.; Mosquera, R.A. A density functional theory study on pelargonidin. J. Phys. Chem. A 2007, 111, 11100-11109.
[98] Pereira, G.K.; Donate, P.M.; Galembeck, S.E. Electronic structure of hydroxylated derivatives of the flavylium cation. J. Mol. Struct. THEOCHEM 1996, 363, 87-96.
[99] Pereira, G.K.; Donate, P.M.; Galembeck, S.E. Effects of substitution for hydroxyl in the B-ring of the flavylium cation. J. Mol. Struct. THEOCHEM 1996, 392, 169-179.
[100] Woodford, J.N. A DFT investigation of anthocyanidins. Chem. Phys. Lett. 2005, 410, 182-187.
[101] Estévez, L.; Mosquera, R.A. Where is the positive charge of flavylium cations? Chem. Phys. Lett. 2008, 451, 121-126.
[102] Estévez, L.; Mosquera, R. A. Molecular structure and antioxidant properties of delphinidin. J. Phys. Chem. A 2008, 112, 10614-10623.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 367-385
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 15
ELECTROMERISM IN SMALL MOLECULE ACTIVATION BY METAL CENTERS OF BIOLOGICAL RELEVANCE
Radu Silaghi-Dumitrescu*
Department of Chemistry and Chemical Engineering, “Babes-Bolyai” University, 11 Arany Janos str, Cluj-Napoca RO-400028, Romania
Abstract
Activation of small molecules, such as dioxygen, peroxide, or nitric oxide, via mechanisms involving ligation to metal centers of biological relevance is discussed. Emphasis is placed on the means currently available for describing the problematic electronic structures of such complexes in cases where electromerism phenomena are possible.
Introduction
Dioxygen binding to globins is central to human life and, as such, has been among the most studied biochemical processes [1,2]. Figure 1 illustrates this reaction: the ferrous histidine-ligated heme in globins (“deoxy”) binds dioxygen reversibly to yield a ferrous-dioxygen (“oxy”) adduct. Other reactions that can start from the two states shown in Figure 1 will also be discussed in subsequent sections; of main interest to us will be the “ferric-peroxo” and the high-valent [Fe(V) and Fe(IV)] forms [3,4]. All of these species feature electromerism phenomena. This chapter discusses the means currently available for unveiling the details of these complicated electronic structures, with emphasis on the possibility of assigning them to one electromer or to a defined mixture of electromers.
* E-mail address: [email protected]
[Figure 1 scheme. Species shown: Fe(II); Fe(II)-O2 <-> Fe(III)-O2− ("ferrous-dioxygen"); Fe(II)-O2− <-> Fe(III)-O2 2− ("ferric-peroxo"); Fe(III)-O2H− ("ferric-hydroperoxo"); "Compound I" (formally Fe(V)); "Compound II" (formally Fe(IV)); Fe(III); Fe(III)-OH. Steps are labeled with the species gained or lost (O2, e−, H+, H2O, "R"/"RO").]
Figure 1. Physiologically-relevant reactions of active site hemes with O2 and/or H2O2. “R” denotes an organic substrate molecule. “X” may be a protein-derived cysteinate, tyrosinate or histidine ligand.
Ferrous-Dioxygen Species
The nature of the ferrous-dioxygen adduct has elicited interest because of its two main electromers, Fe(II)-O2 and Fe(III)-O2-. Two key pieces of information have been used to decide which of these two electromers predominates in the globin version of this adduct. First, the stretching frequency of the O-O bond in oxy hemoglobin, at ~1100 cm-1, is far lower than the 1500 cm-1 value seen for molecular oxygen and is consistent with a superoxide moiety. Second, Mössbauer spectra are consistent with a ferric site as opposed to a ferrous one; hence, the ferric-superoxo electromer is generally favored [1]. However, it is still debated whether a pure electromer is observed experimentally, and to what extent this structure is modulated by changes in the first and second coordination spheres. Indeed, the oxy adduct is seen in many other systems, ranging from cytochromes P450 [5] to non-heme iron proteins [6,7], and their reactivities, structures and spectroscopic features suggest different degrees of mixing between the possible electromers. In many of these complexes, the oxy adduct is short-lived and its electronic structure is key to critical reaction mechanisms. From these points of view, computational investigations can add significantly to our understanding of dioxygen binding and activation. With two main electromers suspected to contribute to the electronic structure of a “ferrous-dioxygen” species and at least another one [Fe(IV)-peroxo] considered, it appears that only multiconfigurational methods may address such structures computationally. Such methods are not yet amenable to geometry optimization procedures, so that either a high-resolution experimental structure or a density functional theory (DFT)-optimized geometry has to be used for the analysis of the electronic structure.
Thus, in the most recent approach, using a geometry computed at the B3LYP level, CASPT2 (multiconfigurational second-order perturbation theory) computes a contribution of 70% from the ferrous-dioxygen electromer, with most of the remaining 30% due to the ferric-superoxo electromer [8]. However, this procedure is still too demanding to be applied routinely to any given system of interest. On the other hand, closer examination of the DFT geometry, which was taken as the basis for the
CASPT2 computation, reveals some interesting conclusions. Figure 2 thus shows a comparison of oxygen-oxygen bond lengths and oxygen-localized spin densities computed with two representative functionals for a few representative models; the HisH, SCH3- and PhO- ligands mimic the globin and heme oxygenase (histidine-ligated), cytochrome P450 (cysteinate-ligated) and catalase (tyrosinate-ligated) active sites, respectively.[9] Since the dioxygenic ligands in these three species are expected to be described as either molecular oxygen (bound to ferrous iron) or superoxide (bound to ferric iron), or something in between, the computed parameters for the OO ligands are compared to those found, at the same level of theory, for free O2 and HO2.[9]
[Figure 2 graphic. Top panel: structures of the HisH/heme, SCH3-/heme and PhO-/heme models. Bottom panel: % superoxide character (0% to 100%) for each model, derived from bond lengths and from spin densities with UBP86 and UB3LYP; the 0% reference is S=1 O2 and the 100% reference is S=1/2 OOH.]
Figure 2. Top panel: Structures and nomenclature for “ferrous-dioxygen” models employed in the present study. In addition to the “heme” models, we examine the “cd1” model based on the known ability of cytochrome cd1 nitrite reductase to bind and reduce molecular oxygen. Bottom panel: Percentages of superoxide and nitroxyl character derived from bond lengths and spin densities for “ferrous-dioxygen” models with two different functionals, BP86 and B3LYP. Percentages are defined with the general formula 100x[P(complex)-P(Oxidized Reference)]/[P(Reduced Reference)-P(Oxidized Reference)], where the property P is either the bond length or the spin density on the O2 ligand; Oxidized Reference is S=1 O2; Reduced Reference is S=1/2 OOH.[9].
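The percentage defined in the caption is a linear interpolation between the two reference species and can be sketched in a few lines. The function name and the bond lengths used in the example are placeholders chosen for illustration only; they are not values reported in this chapter.

```python
def percent_reduced_character(p_complex, p_oxidized_ref, p_reduced_ref):
    """100 x [P(complex) - P(Oxidized Reference)] / [P(Reduced Reference) - P(Oxidized Reference)].

    P is either the O-O bond length or the spin density on the O2 ligand;
    the oxidized reference is S=1 O2 and the reduced reference is S=1/2 OOH.
    """
    span = p_reduced_ref - p_oxidized_ref
    if span == 0:
        raise ValueError("the two reference values must differ")
    return 100.0 * (p_complex - p_oxidized_ref) / span


# Hypothetical O-O bond lengths in angstrom (illustrative placeholders only).
example = percent_reduced_character(p_complex=1.29,
                                    p_oxidized_ref=1.21,
                                    p_reduced_ref=1.33)
print(f"{example:.1f}% superoxide character")
```

By construction, a complex whose property equals that of free O2 scores 0% and one that equals the OOH reference scores 100%; values outside this range are possible when the complex lies beyond either reference.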
It is important to note that of the two parameters investigated, O-O bond lengths and spin densities, the former is the only one that is found to be relatively independent of the functional used; the net spin density varies drastically with the functional, such that consideration of this parameter alone can lead to dramatically wrong conclusions. A strong dependence of the spin density, but not of the geometry, on the dielectric constant of the medium was also shown.[9] The DFT-computed bond lengths in fact appeared to yield an indirect description of the electronic structure (% superoxide electromer) in reasonable agreement with previous CASPT2 calculations, at a far smaller computational cost. This agreement is surprising, since the data shown in Figure 2 come from ground-state calculations, whereas CASPT2 directly takes excited states into account; one might have expected a ‘ground state’ calculation to yield either a pure ferrous-dioxygen or a pure ferric-superoxo description. A similar paradox is encountered with semiempirical calculations on the same models, but not with Hartree-Fock.[10] A possible explanation is that parameterization, more or less directly common to DFT and semiempirical methods, has indirectly introduced excited-state information into the apparently ground-state results yielded by density functionals and certain semiempirical methods, so that further application of more expensive multiconfigurational methods to these ‘ground-state’ geometries only rediscovers information already introduced by parameterization. The validity of the DFT-derived geometrical criterion for the assignment of electronic structures in problematic electromerism cases will be further exemplified in the following sections of this chapter.
Ferric-Peroxo Species
The one-electron-reduced versions of the ‘ferrous-dioxygen’ adducts discussed in the previous section (cf. Figure 1) are much shorter-lived, and for certain systems impossible to detect: they rapidly (sometimes even at liquid helium temperature) protonate to produce a ferric-hydroperoxo species.[3,11-15] Partly due to this fact, of the two possible electromers, ferric-peroxo and ferrous-superoxo, the former is often preferred as terminology, and is used as such, only formally, in the tables below. Table 1 lists computed O-O bond lengths for the one-electron-reduced versions of the imidazole and methylthiolate models shown in Figure 2.[3] It is immediately apparent, based on this criterion alone, that a predominantly ‘ferrous-superoxo’ description applies here. Indeed, Table 1 shows that the O-O bond lengths in the ‘ferric-peroxo’ models are all shorter even than those computed for free superoxide, so that no significant contribution from the ferric-peroxo electromer can be envisioned. By contrast, for the ferric-hydroperoxo models (cf. Figure 1), Table 1 shows O-O bond lengths similar to those in free hydrogen peroxide. The geometrical conclusions are supported by DFT-derived partial atomic charges and spin densities, all pointing to a distinct superoxide character in the ‘ferric-peroxo’ models (Table 2). However, unlike the ‘ferrous-dioxygen’ case, where extensive experimental data were available for verification of hypotheses formulated on the basis of computations, data on ‘ferric-peroxo’ adducts are mostly limited to EPR, ENDOR and UV-vis spectroscopy.
In the case of the heme adducts illustrated in Tables 1 and 2, EPR spectra indeed are suggestive of spin density being located mainly away from the metal, and ENDOR parameters for the oxygen atoms can be interpreted to be consistent with a superoxo description.[3] By contrast, in non-heme ferric-peroxo adducts (not shown in Tables here), where the O-O bond is computed to be distinctly more elongated than in the
Electromerism in Small Molecule Activation by Metal Centers…
heme cases, UV-vis spectra and possibly also EPR and Mössbauer parameters suggest clear contribution from the ferric-peroxo electromer [16-19].

Table 1. Calculated bond lengths (Å) for one-electron reduced versions of the models shown in Figure 2. Hydrogen bond effects were accounted for by inclusion of one or two water molecules (as indicated) interacting with the dioxygenic ligand.[3]

Formal description          X            Fe-O    O-O
S=1/2 OOH                   -            -       1.35
S=1/2 OO-                   -            -       1.38
H2O2                        -            -       1.48
HO2-                        -            -       1.57
Ferric-peroxo               SCH3-        1.93    1.32
Ferric-peroxo               imidazole    1.90    1.31
Ferric-peroxo, 1 H-bond     SCH3-        1.92    1.34
Ferric-peroxo, 1 H-bond     imidazole    1.88    1.33
Ferric-peroxo, 2 H-bonds    SCH3-        1.94    1.35
Ferric-hydroperoxo          imidazole    1.80    1.46
Ferric-hydroperoxo          SCH3-        1.89    1.46
Ferric-hydroperoxo          SCH3-        2.04    1.40
Table 2. Calculated Mulliken partial atomic charges and spin densities (the latter shown in parentheses), for Figure 2 models [3]

Formal description          X            Fe             O1(a)           O2(b)
S=1/2 OOH                   -            -              -0.18 (0.29)    -0.17 (0.72)
S=1/2 OO-                   -            -              -0.50 (0.50)    -0.50 (0.50)
Ferric-peroxo               SCH3-        1.12 (0.80)    -0.25 (0.41)    -0.23 (0.59)
Ferric-peroxo               imidazole    1.27 (0.03)    -0.26 (0.38)    -0.23 (0.40)
Ferric-peroxo, 1 H-bond     SCH3-        1.09 (0.39)    -0.28 (0.38)    -0.31 (0.38)
Ferric-peroxo, 1 H-bond     imidazole    1.26 (0.17)    -0.26 (0.40)    -0.26 (0.43)
Ferric-peroxo, 2 H-bonds    SCH3-        1.05 (0.57)    -0.34 (0.33)    -0.39 (0.34)
Ferric-hydroperoxo          imidazole    1.32 (0.76)    -0.34 (0.24)    -0.28 (0.07)
Ferric-hydroperoxo          SCH3-        1.09 (0.80)    -0.36 (0.22)    -0.34 (0.06)
Ferric-hydroperoxo          SCH3-        1.09 (0.64)    -0.31 (0.14)    -0.36 (0.35)
Ferric-hydroxo              SCH3-        1.05 (0.95)    -0.34 (0.13)    -

(a) Iron-bound oxygen atom. (b) Non iron-bound oxygen atom. (c) Solvated model, assuming ε=4.335.
Fe(IV)-Oxo and Fe(III)-Oxo Species
Fe(IV)-oxo species of the type illustrated in Figure 1 present an electromerism case different from the previous two, in that no internal geometrical parameter of the ligand (e.g., an O-O bond) is available as a criterion for assigning electromers. Therefore, analysis of molecular orbitals, partial atomic charges and spin densities remains the only approach available, bearing in mind the caveat provided by Figure 2 in this respect. Scheme 1 illustrates the molecular orbital diagram for formal binding of an oxo ligand to an Fe(IV) center in an octahedral environment. Key to the reactivity of this system are the degenerate π* orbitals, each singly occupied and reminiscent in this respect of the dioxygen molecule [20].
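Scheme 1. [Molecular orbital diagram. Labels in the drawing: Fe4+ d orbitals (t2g, eg; dxy, dx2-y2), ligand X orbitals (px, py, pz), and the resulting π, π* and σ* levels.]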
Figure 3. Iron-oxygen π* orbitals for an octahedral ferryl model, with four equatorial amine ligands and an axial acetonitrile, illustrating a high degree of covalence [21].
Also reminiscent of the dioxygen molecule is the significant degree of covalence of these orbitals, in contrast with the ionic character seen with cognates of the oxo ligand such as hydroxide or water; Figure 3 illustrates this aspect with DFT-derived plots of the two orbitals [21]. Within such a framework it is virtually impossible to assign electrons to either of the two centers, and so the canonical Fe(IV)-oxo electromer is generally accepted. Indeed, Mössbauer spectra show the iron in such complexes to be highly oxidized [4,22,23]; even so, recent experimental data showing an unexpectedly strong effect of covalence on iron Mössbauer parameters indicate that improper consideration of the covalence factor may lead to assignments of iron oxidation states that are wrong by one or even two units [24,25]. In sharp contrast with the DFT results of Figure 3, Hartree-Fock and MP2 analyses of the same system (Figure 4) reveal an entirely different picture: the π* orbitals computed at non-DFT levels show a clear ‘hole’ in the oxygen valence shell, so that the oxygen ligand is now described as oxyl (the deprotonated version of a hydroxyl radical) while the iron is Fe(III). The question remains whether the DFT or the Hartree-Fock results describe these systems more realistically; notably, the net electron densities obtained at the two levels of theory are essentially the same: one full electron on iron from Hartree-Fock, and two halves of an electron on iron from DFT.[21,26]
Figure 4. Iron-oxygen π* orbitals at HF/6-31G** level, for the same model as in Figure 3 [26].
While the iron-oxygen covalence appears to hamper clear assignment of oxidation states to the two partners in [FeO]2+, related systems such as [FeS]2+, [FeN]+ and [FeN]2+ (all formally Fe(IV) or Fe(V)) do feature counterparts of the oxo ligand with different electronegativity and hence different tendencies to engage in covalent interactions.[27] Indeed, Table 3 shows that in an octahedral environment where the other five ligands are, for simplification, water molecules, the various relatives of the [FeO]2+ system do exhibit drastically different behavior in terms of localization of the π* electrons: from the 45% seen on oxygen in the “Fe(IV)-oxo” to only ~10% seen with a nitride ligand, and with an unexpected antiferromagnetic coupling in the Fe(V)-nitrido cognate. Table 4 further reveals clear ‘holes’ in the valence shells of the formally sulfide and nitride ligands (orbitals with occupancies at 0.37), so that the corresponding systems can be interpreted, even at DFT level,
to be essentially Fe(III) and not Fe(IV), with the extra oxidizing equivalents lying on the sulfide/nitride ligand. Additionally, stepwise elongation of the iron-oxo/sulfido/nitrido bonds leads to an increase in Fe(III) character,[27] further arguing against a predominantly Fe(IV) description at equilibrium.

Table 3. Optimized geometries and relevant spin densities for [Fe(H2O)5O]2+, [Fe(H2O)5S]2+, [Fe(H2O)5N]+ and [Fe(H2O)5N]2+ systems. All data are from UBP86/6-31G** geometry optimizations [27]

System            Fe-X(a) (Å)    Spin on Fe    Spin on X
[Fe(H2O)5O]2+     1.64           1.12          0.92
[Fe(H2O)5S]2+     2.06           0.89          1.13
[Fe(H2O)5N]+      1.58           1.73          0.17
[Fe(H2O)5N]2+     1.50           1.22          -0.37

(a) X=O, S or N, respectively. (b) Average over 5 bonds.
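The percentages quoted in the text for Table 3 can be checked with a short calculation. The sketch below assumes, as one plausible reading, that the share of the π* spin on the ligand X is the X spin density divided by the sum of the Fe and X spin densities; the chapter does not state the formula explicitly, and the function name is hypothetical.

```python
# (spin on Fe, spin on X), transcribed from Table 3
SPIN_DENSITIES = {
    "[Fe(H2O)5O]2+": (1.12, 0.92),
    "[Fe(H2O)5S]2+": (0.89, 1.13),
    "[Fe(H2O)5N]+": (1.73, 0.17),
    "[Fe(H2O)5N]2+": (1.22, -0.37),
}


def ligand_share(fe_spin, x_spin):
    """Percent of the combined Fe + X spin density located on X.

    Returns None when the two spins have opposite signs (antiferromagnetic
    coupling) or cancel, since a simple share is then not meaningful.
    """
    total = fe_spin + x_spin
    if fe_spin * x_spin < 0 or total == 0:
        return None
    return 100.0 * x_spin / total


for name, (fe, x) in SPIN_DENSITIES.items():
    share = ligand_share(fe, x)
    if share is None:
        print(f"{name}: opposite-sign spins on Fe and X")
    else:
        print(f"{name}: {share:.0f}% of the Fe + X spin on X")
```

Under this assumption the oxo system gives about 45% and the first nitride system about 9%, in line with the figures quoted above, while the Fe(V)-nitrido cognate is flagged because of the negative spin density on nitrogen.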
Table 4. Relevant atomic orbital occupancies for [Fe(H2O)5O]2+, [Fe(H2O)5S]2+, [Fe(H2O)5N]+ and [Fe(H2O)5N]2+ systems (α/β values listed in each cell, in this order). ‘X’ is defined as in Table 3 [27]

Orbital      [Fe(H2O)5O]2+    [Fe(H2O)5S]2+    [Fe(H2O)5N]+    [Fe(H2O)5N]2+
Fe dxz       1.00/0.48        0.99/0.60        0.99/0.46       0.62/0.46
Fe dyz       0.99/0.50        1.00/0.56        0.63/0.46       0.63/0.48
Fe dxy       1.00/0.00        1.00/0.00        1.00/0.99       1.00/1.00
Fe dx2-y2    0.29/0.24        0.24/0.21        0.97/0.10       0.99/0.18
Fe dz2       0.45/0.40        0.50/0.43        0.57/0.44       0.58/0.49
X px         0.98/0.52        0.97/0.37        0.92/0.50       0.38/0.53
X py         0.97/0.48        0.97/0.41        0.37/0.53       0.39/0.53
X pz         0.66/0.68        0.54/0.58        0.51/0.61       0.52/0.59
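The ‘holes’ mentioned in the text can be located by scanning the ligand p-orbital occupancies of Table 4 for unusually low values. The sketch below is only illustrative: the 0.40 cutoff is an assumption chosen for this example (the chapter quotes occupancies of 0.37 but gives no threshold), and the function name is hypothetical.

```python
# (alpha, beta) occupancies of the ligand X p orbitals, transcribed from Table 4
LIGAND_P = {
    "[Fe(H2O)5O]2+": {"px": (0.98, 0.52), "py": (0.97, 0.48), "pz": (0.66, 0.68)},
    "[Fe(H2O)5S]2+": {"px": (0.97, 0.37), "py": (0.97, 0.41), "pz": (0.54, 0.58)},
    "[Fe(H2O)5N]+": {"px": (0.92, 0.50), "py": (0.37, 0.53), "pz": (0.51, 0.61)},
    "[Fe(H2O)5N]2+": {"px": (0.38, 0.53), "py": (0.39, 0.53), "pz": (0.52, 0.59)},
}

HOLE_THRESHOLD = 0.40  # assumed cutoff, not a value from the source


def find_holes(table, threshold=HOLE_THRESHOLD):
    """List (orbital, spin, occupancy) entries below the threshold per system."""
    holes = {}
    for system, orbitals in table.items():
        found = []
        for orbital, (alpha, beta) in orbitals.items():
            if alpha < threshold:
                found.append((orbital, "alpha", alpha))
            if beta < threshold:
                found.append((orbital, "beta", beta))
        holes[system] = found
    return holes


for system, found in find_holes(LIGAND_P).items():
    print(system, found if found else "no ligand p occupancy below cutoff")
```

With this cutoff the oxo system shows no entry, whereas the sulfide and both nitride systems each show at least one, which matches the qualitative distinction drawn in the text; a different cutoff would shift the borderline cases.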
An indirect test of the electronic structure is reactivity. “Fe(IV)-oxo” systems are known to transfer an oxygen atom to organic substrates, with the reaction greatly facilitated by the presence of a porphyrin ligand at the iron, carrying an extra oxidizing equivalent in the form of a cation radical (the so-called “Compound I” state). Clearly, in order to accomplish this transfer of an oxygen atom, the iron “oxo” ligand must at some point lose two electrons, which implies an Fe(II)-‘oxygen atom’ electromer, albeit a transient one. The other type of reactivity of Compound I and related species, hydrogen atom abstraction by the ‘oxo’ ligand, is more consistent with the ferric-oxyl electromer (formally related to a hydroxyl radical) than with the Fe(IV)-oxo (formally featuring a closed shell at the oxygen). On the other hand, paradoxically, protonation of the iron-bound oxygen atom, which, as expected and as shown in the next section, removes almost all of the spin density from the oxygen, is computed to leave the hydrogen-abstracting ability essentially unaltered.[3] Another way to assess the electronic structure of the Fe(IV)-oxo systems would be to look at their one-electron-reduced counterparts, formally ferric-oxo. One such system, for which a simplified model is shown in Scheme 2, is known experimentally. Figure 5 illustrates that, in sharp contrast with the one-electron oxidized counterpart seen in Figure 3, the π*
orbitals are now largely localized on the oxygen, which is cleanly described as “oxo”. Consistent with this, the oxygen atom transfer reactivity of this species is much lower than that typically seen with “Fe(IV)-oxo” systems.[28,29] Surprisingly, however, Figure 6 reveals yet another facet of electromerism in this system: the spin densities on heme ferric-oxo systems are indicative of one unpaired electron on the porphyrin ring, with the iron-oxygen moiety described, ironically, again as “Fe(IV)-oxo”.[30]
Scheme 2. [Structural drawing of the simplified ferric-oxo model; atoms labeled in the drawing: Fe, N, O, H.]
Figure 5. Molecular orbitals illustrating the Fe(III)-oxo interaction for the model shown in Scheme 2 [29].
Figure 6. Spin densities on imidazole-ligated S=1/2 and S=5/2 ferric-oxo models (white: positive; black: negative) [30].
Non-Oxo Fe(IV) Species
The intricacies of the iron-oxygen covalence in the “Fe(IV)-oxo” system also prompt one to examine related systems where the metal-ligand double bond (and hence the problem of the two degenerate π* orbitals) no longer exists. One class of such systems is illustrated in Figure 7. An added phenomenon of isomerization, known experimentally, in which the alkyl/aryl ligand R migrates from the iron to the macrocycle, formally converts Fe(IV) to Fe(II) in these systems without participation of an external redox partner.[31,32] Table 5 then shows that for the Fe(IV) isomer with imidazole and methyl as axial ligands, there are clearly four orbitals with occupancies close to 1, as expected of a formally Fe(IV) system.
Figure 7. Formally Fe(IV) models in heme organometallic complexes [32]. [Structural drawings; labels in the drawings: Fe(IV) and Fe(II) isomers, imidazole (NH) and thiolate (S) axial ligands, CH3, charge 1+.]
Table 5. Occupancies of iron d orbitals for models shown in Figure 7, from Mulliken analyses. Axial ligands and formal oxidation states are labeled according to Figure 7 [32]

Orbital     Spin    Fe(IV), imidazole    Fe(II), imidazole    Fe(IV), thiolate    Fe(II), thiolate
dz2         α       0.70                 0.96                 0.63                0.59
            β       0.48                 0.18                 0.45                0.38
dxz         α       0.98                 0.97                 0.95                0.96
            β       0.24                 0.47                 0.41                0.68
dyz         α       0.97                 0.96                 0.97                0.94
            β       0.28                 0.33                 0.39                0.53
dxy         α       0.49                 0.43                 0.54                0.63
            β       0.40                 0.34                 0.49                0.61
dx2-y2      α       0.94                 0.94                 0.83                0.69
            β       0.92                 0.92                 0.80                0.66
However, the iron α dz2, engaged in σ interaction with the methyl ligand, has an occupancy of 0.70, suggestive of a fifth iron d electron and hence an electronic description featuring Fe(III) engaged in covalent σ interaction with a methyl radical. A change of axial
ligand, from imidazole to thiolate, does not affect this description drastically. On the other hand, the formally Fe(II) isomers (with the methyl radical migrated from iron to the nitrogen) do differ significantly in electronic structure: the thiolate-ligated model features five d orbitals clearly occupied (hence, ferric), while the imidazole-ligated cognate, with 6 d electrons, is clearly ferrous [32].
Porphyrin Radical-Type Structures
The formally Fe(V) species shown as Compound I in Figure 1 is best described as a [FeO]2+ moiety bound to a porphyrin cation radical; this is fully supported by experimental data.[4] On the other hand, the porphyrin may engage in electromerism in a number of other iron oxidation states. Thus, certain ferric porphyrins can isomerize to an Fe(IV) + anion radical state, both of which are well characterized experimentally and computationally. More recently, attempts to produce “super-reduced” states at the iron (Fe(I), Fe(0)) in hemes and related complexes have led to the detection of species whose electronic structure assignment is still under debate.[33-36] Figure 8 shows one set of models relevant to the experimental data available; consideration of CO and CO2 ligands is justified by the reactions observed, where CO2 appears to be reduced to carbon monoxide. Perhaps not surprisingly, almost none of these models are found to be described as clean Fe(0) systems.
Figure 8. Formally Fe(0) models examined computationally [33]. [Structural drawings of models numbered 1 to 7; labels in the drawings: Fe, N, C, O, (H).]
Figure 9. Left: metal-localized frontier orbitals in S=0 Model 1. Right: metal-localized frontier orbitals in S=0 Model 4 [33].
By way of example, Figure 9 shows molecular orbitals computed for the pentacoordinated model: there are clearly four electrons missing from the iron d orbitals, and the system is perfectly described as Fe(II) coupled to a two-electron reduced porphyrin. By contrast, Figure 9 also shows that the molecular orbitals computed for the model featuring an iron-carbon bond between the heme and CO2, predicted by theory but yet to be confirmed by experiment, indicate only two electrons to be missing from the iron d-shell, and hence a true ‘super-reduced’ state at the iron; orbital occupancies confirm this interpretation.
Metal-Nitric Oxide Complexes
Metal-nitric oxide complexes are important in biochemical pathways concerning the nitrogen cycle (catalyzed entirely by metalloenzymes) as well as in nitrosative stress processes.[37-41] A recent important example has been the hemoglobin-nitric oxide interaction, thought to be important from a medical point of view.[42] With redox metals such
as iron or copper, M-NO adducts typically feature a second electromer, where an electron is donated either from, or towards, the NO ligand, converting it into NO+ or NO-.[43] In certain respects, this situation is reminiscent of the behavior of the dioxygen ligand when bound to these metals and, as such, nitric oxide has often been used as a ‘probe’ for examining enzyme active sites, taking advantage of the unpaired electron of NO, which makes the resulting metal adducts more spectroscopically amenable than their metal-dioxygen cognates.[6,44,45] Figure 10 thus shows a few relevant formally Fe(II)-NO models, where a second electromer, Fe(III)-NO-, is often invoked.[9] The protocols applied are similar to those seen in Figure 2, and it is again obvious that DFT-derived spin densities are not always reliable, whereas assignments of electronic structure based on the internal bond length of the NO ligand are considerably less dependent on the computational model used and hence much more trustworthy.[9] Figure 11 illustrates one aspect of the biological importance of copper-NO adducts: the catalytic cycle of copper-containing nitrite reductases (CuNIR), where interaction of a reduced Cu(I) center with nitrite leads to formation of a formally Cu(II)-NO adduct, which indeed has a second electromer, depicted in the figure, Cu(I)-NO+.[38] The principle illustrated in Figure 11 is in fact applied by all nitrite-reducing enzymes, including those employing iron: a reduced metal interacts with nitrite to yield a metal-NO adduct.[38,40] One salient feature of the CuNIR catalytic cycle is the proposed bidentate binding of NO to Cu(II).
While this species has never been observed, its more stable and more spectroscopically amenable cognate, Cu(I)-NO, has been characterized both spectroscopically and structurally, and it did show the previously unobserved bidentate mode of ligation of NO to copper.[46] EPR spectra of the Cu(I)-NO adduct showed a typical Cu(II) signal, suggestive of a significant contribution from the Cu(II)-NO- electromer.[46] Both the structural and the electronic structure aspects warranted a detailed computational analysis of the Cu(I)-NO and Cu(II)-NO adducts. With a simple model of the CuNIR active site (with the three protein-derived copper ligands, cf. Figure 11, modeled as imidazoles), there was no evidence for bidentate ligation of NO to the metal in either of the two oxidation states. However, with a larger model of the active site, which accounts for steric constraints (Figure 12, listed as “+protein” in the tables), it was found that bidentate binding of NO to Cu(I)-NIR is indeed possible, in good agreement with experiment [38]. Table 6 lists relevant geometrical parameters for the Cu-NO models. The larger Cu(I)-NO model shown in Figure 12 features an N-O bond length only 0.02 Å shorter than in free NO- at the same level of theory, and 0.06 Å longer than in free NO. This already indicates, as previously discussed, a significant Cu(II)-NO- character, in good agreement with EPR spectra; by contrast, Table 7 shows very little spin on the copper in the same model, again in line with the pitfalls previously seen in Figures 2 and 10. On the other hand, the Cu(II)-NO model, which is directly relevant to the CuNIR catalytic mechanism, shows an N-O bond length 0.03 Å shorter than in free NO and 0.05 Å longer than in free NO+, suggesting a significant contribution from the Cu(I)-NO+ electromer.
Here, too, if bidentate binding of NO is enforced, the nitrogen-oxygen bond elongates, in fact reaching the exact value seen for free NO at the same level of theory; this observation is important since this species is known to decay via liberation of NO, not of NO+ [38].
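The bond-length argument above can be made concrete with a small interpolation. The sketch below is illustrative only: the three reference N-O distances are inferred here from the offsets quoted in the text (they are not tabulated in the chapter), and the linear charge scale and function name are assumptions introduced for the example.

```python
# Reference N-O bond lengths (angstrom), inferred from the offsets quoted in
# the text: these are assumptions, not tabulated values.
NO_PLUS = 1.08     # 0.05 A below the Cu(II)-NO value of 1.13 A
NO_NEUTRAL = 1.16  # 0.06 A below the Cu(I)-NO +protein value of 1.22 A
NO_MINUS = 1.24    # 0.02 A above the Cu(I)-NO +protein value of 1.22 A


def interpolated_no_charge(d_no):
    """Linear estimate of the NO ligand charge: +1 at NO+, 0 at NO, -1 at NO-."""
    if d_no <= NO_NEUTRAL:
        return (NO_NEUTRAL - d_no) / (NO_NEUTRAL - NO_PLUS)
    return -(d_no - NO_NEUTRAL) / (NO_MINUS - NO_NEUTRAL)


# N-O bond lengths (angstrom) for the models discussed in the text
MODELS = {
    "Cu(I)-NO +protein": 1.22,
    "Cu(II)-NO": 1.13,
    "Cu(II)-NO side-on": 1.16,
}

for name, d_no in MODELS.items():
    charge = interpolated_no_charge(d_no)
    print(f"{name}: d(N-O) = {d_no:.2f} A, interpolated NO charge = {charge:+.2f}")
```

On this scale the larger Cu(I)-NO model sits well towards NO-, the end-on Cu(II)-NO model sits partway towards NO+, and the side-on Cu(II)-NO model coincides with free NO, mirroring the qualitative conclusions of the paragraph.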
[Figure 10 graphic. Structures: [Fe(PCD)NO]0, [Fe(H2O)5(NO)]+2, [Fe(SOR)(NO)]+1 and [Fe(RDO)(NO)]+1. Plot: % nitroxyl character (0% to 100%) for each model, between the S=1/2 NO and S=1 HNO references; series: UBP86 bonds, UB3LYP bonds, UBP86 spin, UB3LYP spin.]
Figure 10. Top panel: S=3/2 formally Fe(II)-NO models: PCD = model of the active site of protocatechuate dioxygenase, featuring two tyrosine and two histidine ligands; SOR = model of the active site of superoxide reductase, featuring four histidine ligands and one cysteine ligand; RDO = model of the active site of naphthalene dioxygenase (a Rieske dioxygenase), featuring two histidine ligands and one aspartate ligand. Percentages of nitroxyl character are derived from bond lengths and spin densities for “ferrous-NO” models with two different functionals, BP86 and B3LYP. Percentages are defined with the general formula 100x[P(complex)-P(Oxidized Reference)]/[P(Reduced Reference)-P(Oxidized Reference)], where the property P is either the bond length or the spin density on the NO ligand; Oxidized Reference is S=1/2 NO; Reduced Reference is S=1 HNO [9].
Figure 11. Proposed mechanism for Cu-NIR [46]. [Reaction cycle drawing; labels in the drawing: Cu(II)-OH2, + (H)NO2, - OH2, + e, Cu(I), + H+, + OH2, - NO.]
Table 6. Calculated energies (a.u.), and relevant distances (Å) for the CuNIR-NO models [38]

Model                    Cu-O    N-O
exp-NO(a)                1.95    1.46
Cu(I)-NO                 2.84    1.19
Cu(I)-NO +protein(b)     2.11    1.22
Cu(II)-NO                2.89    1.13
Cu(II)-NO side-on(c)     1.95    1.16

(a) Crystal structure, pdb code 1SNR. (b) Model shown in Figure 12. (c) The Cu-O bond was frozen at the distance seen in the crystal structure, cf. entry 1 in the table.
Table 7. Calculated partial atomic charges and spin densities for models shown in Table 6. Spin densities are shown in parentheses [38]

Model                   Cu              N               O1              L
Cu(I)-NO                0.39 (0.08)     0.02 (0.57)     -0.08 (0.33)    -0.06 (0.90)
Cu(I)-NO +protein       0.40 (-0.03)    -0.01 (0.73)    -0.12 (0.31)    -0.06 (0.90)
Cu(II)-NO               0.45            0.22            0.15            0.37
Cu(II)-NO separated     0.48            0.21            0.20            0.42
Cu(II)-NO side-on       0.44            0.22            0.07            0.29
Cu(I)-OH2               0.37            -               -0.53           -0.09
Cu(II)-OH2              0.59 (0.52)     -               0.53 (0.01)     -0.17 (0.01)
Figure 12. Large version of the Cu(I)-NO model, including selected side-chain and water (WAT) atoms found near the NO ligand (~3-5 Å). Protons are not shown, for simplicity. Heavy-atom coordinates were taken from the crystal structure of the NIR-NO adduct, pdb code 1SJM [38].
Conclusion
A wide range of methods is available for exploring electromerism in complexes of metals with small ligands, such as seen in dioxygen or nitric oxide activation. Spectroscopic methods that probe the metal (EPR, Mössbauer) or the ligand (resonance Raman, ENDOR) are in continuous development and need a careful choice of reference molecules. Much the same is true of computational methods; in this latter respect, we have exemplified here how multiconfigurational methods can be complemented by ground-state methods, where careful analysis of geometries, molecular orbital plots, or orbital occupancies can again be very reliable, provided proper reference systems are selected.
References
[1] D. M. Kurtz, Jr., Oxygen-carrying proteins: three solutions to a common problem. Essays in Biochemistry, 1999, 55-80.
[2] E. Antonini, M. Brunori, Hemoglobin and Myoglobin in their Reaction with Ligands; North-Holland, Amsterdam, 1971.
[3] R. Silaghi-Dumitrescu, C. E. Cooper, Transient species involved in catalytic dioxygen/peroxide activation by hemoproteins: possible involvement of protonated Compound I species. Dalton Trans., 2005, 3477-3482.
[4] R. Silaghi-Dumitrescu, The nature of the high-valent complexes in the catalytic cycles of hemoproteins. J. Biol. Inorg. Chem., 2004, 9, 471-476.
[5] I. Schlichting, J. Berendzen, K. Chu, R. M. Sweet, D. Ringe, G. A. Petsko, S. G. Sligar, The catalytic pathway of cytochrome P450cam at atomic resolution. Science, 2000, 287, 1615-22.
[6] A. Decker, M. D. Clay, E. I. Solomon, Spectroscopy and electronic structures of mono- and binuclear high-valent non-heme iron-oxo systems. J. Inorg. Biochem., 2006, 100, 697-706.
[7] M. Costas, M. P. Mehn, M. P. Jensen, L. J. Que, Dioxygen Activation at Mononuclear Nonheme Iron Active Sites: Enzymes, Models, and Intermediates. Chem. Rev., 2004, 2, 939-86.
[8] K. P. Jensen, B. O. Roos, U. Ryde, O2-binding to heme: electronic structure and spectrum of oxyheme, studied by multiconfigurational methods. J. Inorg. Biochem., 2005, 99, 45-54, Erratum on page 978.
[9] R. Silaghi-Dumitrescu, I. Silaghi-Dumitrescu, DFT and the electromerism in complexes of iron with diatomic ligands. J. Inorg. Biochem., 2006, 100, 161-6.
[10] R. Silaghi-Dumitrescu, On the performance of the PM3 semiempirical method with heme complexes relevant to dioxygen and peroxide activation. Rev. Chim., 2004, 55, 304-307.
[11] B. M. Hoffman, ENDOR of Metalloenzymes. Acc. Chem. Res., 2003, 36, 522-529.
[12] R. Davydov, J. D. Satterlee, H. Fujii, A. Sauer-Masarwa, D. H. Busch, B. M. Hoffman, A Superoxo-Ferrous State in a Reduced Oxy-Ferrous Hemoprotein and Model Compounds. J. Am. Chem. Soc., 2003, 125, 16340-16346.
[13] R. Davydov, V. Kofman, H. Fuji, T. Yoshida, M. Ikeda-saito, B. M. Hoffman, Catalytic cycle of heme oxygenase through EPR and ENDOR of cryoreduced oxy-heme oxygenase and its Asp140 mutants. J. Am. Chem. Soc., 2002, 124, 1798-1808.
[14] D. M. Kurtz, Jr., Microbial detoxification of superoxide: the non-heme iron reductive paradigm for combating oxidative stress. Acc. Chem. Res., 2004, 37, 902-8.
[15] J. Girerd, F. Banse, A. Simaan, Characterization of Properties of Non-Heme Iron Peroxo Complexes. Structure and Bonding, 2000, 97, 145-176.
[16] D. M. Kurtz, W. N. Lanzilotta, R. Silaghi-Dumitrescu, How microbes detoxify superoxide, hydrogen peroxide, and nitric oxide: The non-heme iron reductive paradigm. Abstracts of Papers, 227th ACS National Meeting, Anaheim, CA, United States, March 28-April 1, 2004, 2004, INOR-418.
[17] V. Niviere, M. Fontecave, Discovery of superoxide reductase: an historical perspective. J. Biol. Inorg. Chem., 2004, 9, 119-23.
[18] V. Niviere, M. Asso, C. O. Weill, M. Lombard, B. Guigliarelli, V. Favaudon, C. Houee-Levin, Superoxide reductase from Desulfoarculus baarsii: identification of protonation steps in the enzymatic mechanism. Biochemistry, 2004, 43, 808-18.
[19] R. Silaghi-Dumitrescu, I. Silaghi-Dumitrescu, E. D. Coulter, D. M. Kurtz, Jr., Computational study of the non-heme iron active site in superoxide reductase and its reaction with superoxide. Inorg. Chem., 2003, 42, 446-456.
[20] S. Shaik, D. Kumar, S. P. de Visser, A. Altun, W. Thiel, Theoretical perspective on the structure and mechanism of cytochrome P450 enzymes. Chem. Rev., 2005, 105, 2279-328.
[21] R. Silaghi-Dumitrescu, Bonding in Biologically-relevant high-valent iron centers. Int. J. Chem. Model., 2009, 2, 1-17.
[22] K. L. Stone, L. M. Hoffart, R. K. Behan, C. Krebs, M. T. Green, Evidence for two ferryl species in chloroperoxidase compound II. J. Am. Chem. Soc., 2006, 128, 6147-53.
[23] R. K. Behan, M. T. Green, On the status of ferryl protonation. J. Inorg. Biochem., 2006, 100, 448-459.
[24] R. Silaghi-Dumitrescu, D. M. Kurtz, Jr., Tuning the electronic structure of iron-nitrosyl complexes. Chemtracts Inorg. Chem., 2003, 16, 468-473.
[25] M. Li, D. Bonnet, E. Bill, F. Neese, T. Weyhermuller, N. Blum, D. Sellman, K. Wieghardt, Tuning the electronic structure of octahedral iron-nitrosyl complexes {FeL(X)} (L=1-alkyl-4,7-bis(4-tert-butyl-2-mercaptobenzyl)-1,4,7-triazacyclononane, X=Cl, CH3O, CN, CO): The S=1/2 <=> S=3/2 spin equilibrium of {FeLPr(NO)}. Inorg. Chem., 2002, 41, 3444-3456.
[26] R. Silaghi-Dumitrescu, "High-valent" ferryl-oxo complexes: how "high" are they really? Studia Univ. Babes-Bolyai, Chemia, 2005, 50, 17-21.
[27] R. Silaghi-Dumitrescu, Electronic structures of Fe(IV) and Fe(V) systems with oxo, sulphido and nitrido ligands in octahedral environments. Rev. Chim., 2007, 58, 461-464.
[28] C. E. MacBeth, A. P. Golombek, V. G. Young, Jr., C. Yang, K. Kuczera, M. P. Hendrich, A. S. Borovik, O2 activation by nonheme iron complexes: A monomeric Fe(III)-Oxo complex derived from O2. Science, 2000, 289, 938-941.
[29] R. Silaghi-Dumitrescu, Bonding in ferric-oxo complexes. Studia Univ. Babes-Bolyai, Chemia, 2004, 49, 235-240.
[30] R. Silaghi-Dumitrescu, The ferric-oxo moiety in porphyrin complexes: a ferryl in disguise? Macroheterocycles, 2008, 1, 79-81.
[31] R. Guilard, K. M. Kadish, Organometallic chemistry of metalloporphyrins. Chem. Rev., 1988, 88, 1121-1146.
[32] R. Silaghi-Dumitrescu, Fe(IV)-Fe(II) electromerism in hemoprotein complexes: implications for ferryl chemistry. Proc. Rom. Acad. Series B, 2006, 2-3, 95-101.
[33] Z. Kis, R. Silaghi-Dumitrescu, The electronic structure of biologically-relevant Fe(0) systems. Int. J. Quant. Chem., 2009, in press.
[34] S. V. Makarov, D. S. Salnikov, T. E. Pogorelova, Z. Kis, R. Silaghi-Dumitrescu, A new route to carbon monoxide adducts of heme proteins. J. Porph. Phthalocyan., 2008, 12, 1096-1099.
[35] E. V. Kudrik, S. V. Makarov, A. Zahl, R. van Eldik, Kinetics and mechanism of the iron phthalocyanine catalyzed reduction of nitrite by dithionite and sulfoxylate in aqueous solution. Inorg. Chem., 2005, 44, 6470-5.
[36] J. Grodkowski, T. Dhanasekaran, P. Neta, P. Hambright, B. S. Brunschwig, K. Shinozaki, E. Fujita, Reduction of Cobalt and Iron Phthalocyanines and the Role of the Reduced Species in Catalyzed Photoreduction of CO2. J. Phys. Chem. A, 2000, 104, 11332-11339.
[37] W. G. Zumft, Cell biology and molecular basis of denitrification. Microbiol. Mol. Biol. Rev., 1997, 61, 533-616.
[38] R. Silaghi-Dumitrescu, Copper-containing nitrite reductase: a DFT study of nitrite and nitric oxide adducts. J. Inorg. Biochem., 2006, 100, 396-402.
[39] R. Silaghi-Dumitrescu, Nitrite linkage isomerism in cytochrome cd1 nitrite reductase. Inorg. Chem., 2004, 43, 3715-3718.
[40] R. Silaghi-Dumitrescu, Nitric Oxide Reduction by Heme-Thiolate Enzymes (P450nor): A Reevaluation of the Mechanism. Eur. J. Inorg. Chem., 2003, 1048-1052.
[41] R. Silaghi-Dumitrescu, D. M. Kurtz, Jr., L. G. Ljungdahl, W. N. Lanzilotta, X-ray Crystal Structures of Moorella thermoacetica FprA. Novel Diiron Site Structure and Mechanistic Insights into a Scavenging Nitric Oxide Reductase. Biochemistry, 2005, 44, 6492-501.
[42] M. T. Gladwin, D. B. Kim-Shapiro, The functional nitrite reductase activity of the heme-globins. Blood, 2008.
[43] J. H. Enemark, R. D. Feltham, Coord. Chem. Rev., 1974, 13, 339.
[44] E. I. Solomon, A. Decker, N. Lehnert, Non-heme iron enzymes: contrasts to heme catalysis. Proc. Natl. Acad. Sci. U S A, 2003, 100, 3589-94.
[45] E. I. Solomon, T. C. Brunold, M. I. Davis, J. N. Kemsley, S. K. Lee, N. Lehnert, N. Neese, A. J. Skulan, Y. S. Yang, Z. Zhou, Geometric and electronic structure/function correlations in non-heme iron enzymes. Chem. Rev., 2000, 100, 235-350.
[46] E. I. Tocheva, F. I. Rosell, A. G. Mauk, M. E. Murphy, Side-on copper-nitrosyl coordination by nitrite reductase. Science, 2004, 304, 867-70.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 387-424
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 16
STRUCTURAL MODELLING OF NANO-CARBONS AND COMPOSITES
Mihai Popescu* and Florinel Sava
National Institute R&D of Materials Physics, Magurele-Ilfov, RO-077125, Romania
Abstract
The state of the art in the field of nano-carbon object structures is presented. Simulations of the structural configurations by a Monte-Carlo computing procedure are reported. New nano-carbon objects were predicted on the basis of structural modelling data. A complex structural configuration based on nano-carbons and a chalcogenide nano-object (arsenic sulphide nanotube) has been modelled, and its crystallo-chemistry was analyzed in order to demonstrate the stability and importance of composite nano-objects for nano-technological purposes.
1. Introduction
The structure of carbon nano-configurations is still poorly known and understood. Nano-carbon species have attracted the attention of researchers working on nano-devices, owing to their possible use in nano-conductors (by filling carbon nanotubes with metallic atoms), in the possible development of new superconductors (by filling fullerene molecules with alkali metals) [1], and in the utilization of the luminescence properties of doped fullerenes [2]. Recently, nano-objects were discovered in other systems, e.g. molybdenum sulphides [3] and arsenic chalcogenides [4]. In general, low-coordination covalent materials are able to develop fibers, rings, closed tubes, ball-like configurations, or even complex configurations at the nano-scale. The problem of the formation and growth of nano-configurations in different materials is still in its infancy.
The presently available simulation techniques (semi-empirical, ab-initio and others) are able to provide a quantitative understanding of the formation and peculiarities of the structures as a function of the structural arrangement of the atoms. A deep insight, at the atomic scale, into various nano-objects will make it possible to understand the essential physics of nanomaterials and will open the way towards various applications.
2. Structural Modelling Procedures
Modelling here means building rational structural models by calculation: one searches for the structure of minimum free energy on the basis of interatomic interactions chosen according to crystallo-chemical principles. Three main methods are well known:
- Molecular dynamics (MD)
- Monte-Carlo Metropolis (MCM)
- Reverse Monte-Carlo method (RMC).
Molecular dynamics is a simulation method that predicts the time evolution of a system of interacting particles and estimates its relevant physical properties. The positions and velocities of the atoms (particles), as well as the forces acting upon them, can thus be known exactly at every moment. The particle trajectories are calculated by solving the equations of motion for equilibrium and non-equilibrium situations. The macroscopic properties of the system (pressure, caloric energy parameters) are then calculated as a function of time by means of statistical mechanics [5, 6]. The method permits the simulation of time-dependent phenomena: transport phenomena, growth processes, etc. It consists of several steps:
- choosing an initial set of parameters (initial positions and velocities of all the particles of the system);
- choosing the interaction potentials that govern the system and allow the calculation of the forces acting between the particles;
- finding the time evolution of the system by solving the classical Newtonian equations for all the particles.
The equations are expressed as:
$$\vec{F}_i(t) = m_i \frac{d^2 \vec{r}_i}{dt^2} \tag{1}$$
where F_i is the force that acts on particle i at time t, equal to the negative gradient of the interaction potential U, m_i is the atomic mass and r_i is the position of the particle. The interaction potentials determine the force field in the system. This force field can be obtained by quantum methods (e.g. the ab-initio method, using the Schrödinger equation), by empirical methods (Lennard-Jones [7], Morse [8], Born-Mayer [9]) or by quantum-empirical methods (embedded atom model [10], glue model [11], bond-order potential [12]). The criteria for selecting the force field are accuracy, transferability and calculation speed. A
typical interaction potential could consist of a number of bonding interaction terms (potentials for bond stretching, bond bending and bond torsion) and non-bonding interactions (van der Waals, electrostatic). The method has been improved by using density functional theory (DFT), which leads to more precise geometries and energies. The MD method that uses DFT (first principles) is successful in the study of dynamical processes but requires huge amounts of computing resources.
The Monte-Carlo-Metropolis method has the following characteristics:
- The problem is treated within an analogous probabilistic (statistical) model.
- The probabilistic model is solved by a numerical stochastic experiment. In a stochastic process the time evolution is not unique, as it is when differential equations are used; instead, the evolution carries some uncertainty, described by a probability distribution. This means that, even when the initial conditions are known, there are several ways in which the process can continue, some of them more probable than others.
- The data are analyzed by statistical methods.
The simplest procedure is the static Monte-Carlo one. In this method the atom trajectories are generated by random shifts of the atoms in space, which lead, step by step, to the minimization of the free energy of the system. This method does not permit the investigation of processes on the normal time scale. Temperature is not a variable parameter and the system is therefore considered at T = 0 K. To accelerate the search for the minimum-energy configuration it is better to start from a well-chosen set of initial coordinates.
The reverse Monte-Carlo method proceeds with the following steps:
- An initial configuration of points (atoms) is chosen (e.g. a set of N points within a cube of side L).
It is possible to use a random three-dimensional configuration, a special network or a set of coordinates from previous simulations.
- Periodic boundary conditions are applied (e.g. the cube is surrounded by its images) and the pair distribution function g_s(r) (PDF, also called radial distribution function, RDF) is computed.
- A new configuration is generated by randomly moving one or more simulated atoms. A new PDF (RDF), g_s'(r), is calculated.
- The two PDFs, the old and the new one, are compared with the experimental g_exp(r) for the system under investigation by using the Pearson test (χ² criterion), i.e. the least-squares estimate of the difference between the two functions:
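$$\chi^2 = \sum_{i=1}^{n_r} \frac{\left[g_{\mathrm{exp}}(r_i) - g_s(r_i)\right]^2}{\sigma_E^2}\,; \qquad \chi'^2 = \sum_{i=1}^{n_r} \frac{\left[g_{\mathrm{exp}}(r_i) - g_s'(r_i)\right]^2}{\sigma_E^2} \tag{2}$$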
where n_r is the number of points r_i and σ_E is the experimental error.
- If χ'² < χ², the new configuration is accepted. If χ'² > χ², the new configuration is accepted with a probability that follows a normal distribution of width σ.
- If the new configuration is accepted, it is taken as the initial configuration for the next iterative step.
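The reverse Monte-Carlo steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names (`pair_distribution`, `chi2`, `rmc`), the histogram estimate of g(r), the single-atom move, and the acceptance probability exp(-(χ'² - χ²)/2) for a worsening move are choices made here (the last one is the form commonly used in RMC codes and is only one reading of the "normal distribution" rule stated in the text).

```python
import math
import random


def pair_distribution(points, box, r_max, n_bins):
    """Histogram estimate of g(r) in a periodic cubic box (minimum-image convention).

    Assumes r_max <= box / 2 so that every pair is counted once.
    """
    n = len(points)
    dr = r_max / n_bins
    hist = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for a in range(3):
                d = points[i][a] - points[j][a]
                d -= box * round(d / box)  # nearest periodic image
                d2 += d * d
            r = math.sqrt(d2)
            if r < r_max:
                hist[min(int(r / dr), n_bins - 1)] += 2  # pair seen from both atoms
    density = n / box ** 3
    g = []
    for k in range(n_bins):
        shell = 4.0 / 3.0 * math.pi * (((k + 1) * dr) ** 3 - (k * dr) ** 3)
        g.append(hist[k] / (n * density * shell))
    return g


def chi2(g_sim, g_exp, sigma_e):
    """Equation (2): summed squared deviation scaled by the experimental error."""
    return sum((ge - gs) ** 2 for gs, ge in zip(g_sim, g_exp)) / sigma_e ** 2


def rmc(points, g_exp, box, r_max, sigma_e, max_shift, n_steps, rng=None):
    """Run n_steps single-atom RMC moves in place; return the chi^2 history."""
    rng = rng or random.Random()
    n_bins = len(g_exp)
    current = chi2(pair_distribution(points, box, r_max, n_bins), g_exp, sigma_e)
    history = [current]
    for _ in range(n_steps):
        i = rng.randrange(len(points))
        old = points[i]
        # Random shift of one atom, wrapped back into the periodic box.
        points[i] = tuple((c + rng.uniform(-max_shift, max_shift)) % box for c in old)
        trial = chi2(pair_distribution(points, box, r_max, n_bins), g_exp, sigma_e)
        # Assumed acceptance rule: always accept an improvement,
        # otherwise accept with probability exp(-(chi'^2 - chi^2) / 2).
        if trial < current or rng.random() < math.exp(-(trial - current) / 2.0):
            current = trial
        else:
            points[i] = old  # reject: restore the previous position
        history.append(current)
    return history
```

Recomputing the full g(r) after every move keeps the sketch short; a production code would update only the pairs that involve the moved atom.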
The procedure is repeated until χ² decreases to an equilibrium value and oscillates around it, as the energy does in the conventional Monte-Carlo method.
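Returning to the molecular dynamics steps listed at the start of this section (initial positions and velocities, an interaction potential, integration of equation (1)), a minimal Python sketch may help fix ideas. It is an illustration under assumptions made here, not the procedure of any cited work: the Lennard-Jones pair potential in reduced units, the velocity-Verlet integrator, and the names `lj_forces` and `velocity_verlet` are all choices of this example.

```python
def lj_forces(pos, eps=1.0, sigma=1.0):
    """Forces F_i = -grad_i U and potential energy U for Lennard-Jones pairs."""
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    energy = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = [pos[i][a] - pos[j][a] for a in range(3)]
            r2 = sum(c * c for c in d)
            sr6 = (sigma * sigma / r2) ** 3
            energy += 4.0 * eps * (sr6 * sr6 - sr6)
            # (-dU/dr) / r, so that the force on atom i is coeff * d
            coeff = 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r2
            for a in range(3):
                forces[i][a] += coeff * d[a]
                forces[j][a] -= coeff * d[a]
    return forces, energy


def velocity_verlet(pos, vel, mass, dt, n_steps):
    """Integrate m_i d^2 r_i / dt^2 = F_i in place; return total energy per step."""
    forces, pot = lj_forces(pos)
    energies = []
    for _ in range(n_steps):
        for i in range(len(pos)):
            for a in range(3):
                vel[i][a] += 0.5 * dt * forces[i][a] / mass[i]  # half kick
                pos[i][a] += dt * vel[i][a]                     # drift
        forces, pot = lj_forces(pos)
        for i in range(len(pos)):
            for a in range(3):
                vel[i][a] += 0.5 * dt * forces[i][a] / mass[i]  # second half kick
        kin = 0.5 * sum(m * sum(c * c for c in v) for m, v in zip(mass, vel))
        energies.append(kin + pot)
    return energies
```

For an isolated system the returned total energy should stay nearly constant, which is a convenient check on the time step.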
3. Graphenes
3.1. Geometrical and Electronic Structure of the Graphene Nano-Object
Graphene is the constituent unit of the graphite crystal. It is formed by a plane of covalently bonded carbon atoms situated at the corners of regular hexagons of side 1.415 Å. In graphite, the graphene sheets are held together by van der Waals forces at a distance of 3.354 Å. Fig. 1a shows the packing of the graphenes with the sequence ABABAB… in a hexagonal lattice.
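The geometry just described can be turned into coordinates with a short Python sketch. The bond length and interlayer distance are the values quoted above; everything else (the function names, the choice of lattice vectors and two-atom basis, the neighbour tolerance, the atomic mass 12.011 u and the four-atom ABAB cell used for the density) is an assumption of this illustration.

```python
import math

BOND = 1.415        # C-C distance within a sheet (Å), from the text
INTERLAYER = 3.354  # distance between sheets in graphite (Å), from the text


def graphene_sheet(n1, n2, bond=BOND):
    """(x, y) coordinates of an n1 x n2 patch of the honeycomb lattice."""
    a = bond * math.sqrt(3.0)                      # in-plane lattice constant
    a1 = (a, 0.0)
    a2 = (0.5 * a, 0.5 * math.sqrt(3.0) * a)
    basis = ((0.0, 0.0), ((a1[0] + a2[0]) / 3.0, (a1[1] + a2[1]) / 3.0))
    atoms = []
    for i in range(n1):
        for j in range(n2):
            ox = i * a1[0] + j * a2[0]
            oy = i * a1[1] + j * a2[1]
            for bx, by in basis:
                atoms.append((ox + bx, oy + by))
    return atoms


def neighbour_counts(atoms, bond=BOND, tol=0.05):
    """Number of atoms found at the bond distance (within tol) from each atom."""
    counts = []
    for i, (xi, yi) in enumerate(atoms):
        c = 0
        for j, (xj, yj) in enumerate(atoms):
            if i != j and abs(math.hypot(xi - xj, yi - yj) - bond) < tol:
                c += 1
        counts.append(c)
    return counts


def graphite_density(bond=BOND, interlayer=INTERLAYER):
    """Density in g/cm^3 of ABAB graphite: 4 atoms per hexagonal cell of height 2 x interlayer."""
    a = bond * math.sqrt(3.0)
    cell_volume = (math.sqrt(3.0) / 2.0) * a * a * (2.0 * interlayer)  # Å^3
    mass = 4 * 12.011 * 1.66054e-24                                    # g
    return mass / (cell_volume * 1e-24)
```

Interior atoms of the patch have three neighbours at the bond distance, while edge atoms have fewer; the density estimate gives a quick consistency check of the two quoted distances.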
Figure 1. a. The hexagonal lattice of the graphite single crystal [from R. W. G. Wyckoff, Crystal Structures, (Interscience) New York 1964, Vol. 1]; b. Image obtained by scanning tunneling microscopy (STM). The trigonal lattice of pyrolytic graphite with very good orientation is observed. Only the positions marked B (filled dots) in part a of the figure appear in the picture [from M. S. Dresselhaus and M. Endo, Carbon Nanotubes - Synthesis, Structure, Properties, and Applications, M. S. Dresselhaus, G. Dresselhaus, P. Avouris (Eds.), (Springer-Verlag Berlin Heidelberg 2001, Springer Series: Topics in Applied Physics, Vol. 80), p.15].
Although the fullerenes were discovered in 1985 and carbon nanotubes in 1991, the prototype of these nano-objects, graphene (from the Greek graphein = to write), was obtained as a distinct object for physico-chemical investigations only in 2004. At that time, Novoselov et al. [13-15] succeeded in transferring a single graphite sheet from the c face of the crystal to an appropriate substrate, in order to measure the optical and electrical properties of the carbon sheet. The schematic structure of graphene is given in Fig. 2. This atomic structure forms because, during graphene formation, every atom changes its electronic configuration from the ground state, with two electrons in the 2s atomic orbital and one electron in each of the 2p_x and 2p_y orbitals, to the sp² hybrid state, with three electrons in the sp² hybrid orbitals (situated in the same plane, the angle between their axes being
120°). The fourth electron occupies the unhybridized p_z orbital, which has one lobe above and another below the plane of the hybridized orbitals. Thus, every carbon atom is covalently bonded to three other carbon atoms by three σ molecular orbitals, each formed by the overlap of two sp² hybrid orbitals, and by a (partially filled) π molecular orbital, formed by the overlap of the unhybridized p_z atomic orbitals. In fact, there is only one π molecular orbital, extended over the whole graphene sheet, with the maximum amplitude of the electronic wave above and below the plane of the σ bonds. The π electrons are delocalized. The σ bonds are strong and lead to optical phonon frequencies much higher than those observed in diamond.
Figure 2. a) Illustration of the valence orbitals of the carbon atoms in graphene: three sp² hybridized orbitals in the plane of the graphene and the unhybridized p_z orbital perpendicular to the layer. These orbitals bind the carbon atoms strongly in a hexagonal lattice and are responsible for the high binding energy and for the elastic properties of graphene. b) The width of the forbidden gap between the σ bonding band (valence band) and the σ antibonding band (conduction band) is ~12 eV, while the π bonding and antibonding states are situated in the neighbourhood of the Fermi level (EF). Consequently, the σ bonds are frequently neglected when the electronic properties of graphene around the Fermi level are predicted. c) Dirac cones located at the six corners of the 2D Brillouin zone [from J.-C. Charlier, P. C. Eklund, J. Zhu, A. C. Ferrari, Advanced Topics in the Synthesis, Structure, Properties and Applications, A. Jorio, G. Dresselhaus, M. S. Dresselhaus (Eds.), Springer-Verlag Berlin Heidelberg 2008, Springer Series: Topics in Applied Physics, Vol. 111, p. 675].
From the theoretical point of view, graphene had been investigated well before 2004 [16-18]. Some scientists considered that graphene cannot exist in the free state and that its analysis was of purely academic interest [19]. All other curved structures (fullerenes, nanotubes) were viewed as unstable. The hypothesis of the thermodynamic instability, and therefore of the physical non-existence, of 2D crystals appeared as early as the mid-1930s [20,21]. Thermal fluctuations would lead to atomic displacements comparable to the interatomic distances at any finite temperature [22,23]. Many experimental observations supported this hypothesis. For example, the melting temperature of a thin film decreases rapidly with decreasing thickness, and the film becomes unstable and separates into islands or decomposes at a thickness of several atomic layers [24,25]. That is why monoatomic layers were physically prepared only by epitaxial growth on the surface of single crystals with lattices approaching that of the monoatomic layer. It was widely believed that without a 3D base 2D materials cannot
exist. After 2004 not only isolated graphenes but also boron nitride layers were obtained [26]. Good 2D crystals were obtained in the following configurations:
- on the surface of a non-crystalline substrate
- as a suspended membrane
- as a suspension in liquids
To reconcile the theory of the instability of graphene with experiment, the following explanations were proposed:
- The interatomic bonds are strong enough to prevent the generation of dislocations or other defects at room temperature.
- The graphene is quenched in a metastable state because it is extracted from a 3D material.
- The graphene is slightly rippled in three dimensions (Fig. 3), the out-of-plane deformations being of the order of nanometers [27]. Such undulations increase the elastic energy of the lattice but compensate the thermal vibrations (very large in 2D), and above a given temperature they can minimize the total free energy [28].
Figure 3. Illustration of the waved structure of graphene, observed experimentally by transmission electron microscopy [27].
Investigations have been carried out on the variation of the graphene properties as a function of the number of layers in a graphite stack. It was established that the electronic structure evolves very rapidly with the number of layers, and reaches the state
corresponding to a 3D crystal for a stack of ten graphenes. Nevertheless, only graphene and, to a good approximation, the double layer exhibit a simple electronic spectrum. Both are zero-gap semiconductors, having only one type of electron and one type of hole. If the number of layers rises to three or more, the electronic spectrum becomes more and more complicated. Several types of charge carriers appear and the conduction and valence bands gradually overlap. This behaviour makes it possible to distinguish between graphene, bilayer and multilayer (from 3 to 10 layers) as three types of 2D crystals. Graphenes of high quality show an ambipolar field effect (Fig. 4).
Figure 4. Ambipolar field effect in monolayer graphene. The low-energy E(k) spectra (cones) are represented, and the change of the Fermi level, EF, with the gate voltage, Vg, is indicated. Positive (negative) gate voltages induce electrons (holes) with concentration n = αVg, where the coefficient α ≈ 7.2×10^10 cm^-2 V^-1 for field-effect devices with a 300 nm thick SiO2 film as dielectric. The rapid decrease of the resistivity, ρ, with increasing number of charge carriers indicates their high mobility (a value of μ ≈ 5,000 cm^2 V^-1 s^-1 that does not change significantly with temperature up to 300 K) [15].
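The numbers in the caption of Fig. 4 lend themselves to a small worked example. The sketch below uses the quoted α and μ; the Drude-type relation ρ = 1/(n e μ) and the function names are assumptions of this illustration, and the estimate is only meaningful away from Vg = 0, where the measured resistivity of real samples stays finite.

```python
E_CHARGE = 1.602176634e-19  # elementary charge (C)
ALPHA = 7.2e10              # induced carriers per volt (cm^-2 V^-1), from the caption
MOBILITY = 5000.0           # carrier mobility (cm^2 V^-1 s^-1), from the caption


def carrier_density(v_gate):
    """Induced carrier concentration n = alpha * |Vg| in cm^-2 (electrons or holes)."""
    return ALPHA * abs(v_gate)


def sheet_resistivity(v_gate):
    """Assumed Drude estimate rho = 1 / (n e mu), in ohms, valid away from Vg = 0."""
    n = carrier_density(v_gate)
    if n == 0.0:
        raise ValueError("the simple estimate diverges at zero gate voltage")
    return 1.0 / (n * E_CHARGE * MOBILITY)


if __name__ == "__main__":
    for vg in (10.0, 50.0, 100.0):
        print(vg, carrier_density(vg), sheet_resistivity(vg))
```

Because n depends only on |Vg|, the estimate is symmetric for electrons and holes, which mirrors the ambipolar behaviour described in the caption.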
The quantum Hall effect has also been observed in graphene, even at room temperature [34]. Graphene has attracted the attention of researchers because of the peculiar nature of its charge carriers. These behave like relativistic particles and are therefore described more simply by the Dirac equation than by the Schrödinger equation. The interaction of the electrons with the periodic potential of graphene produces new quasi-particles which, at low energies, are described accurately by a (2+1)-dimensional Dirac equation. These quasi-particles, called massless Dirac fermions, can be regarded as electrons that have lost their rest mass, m0, or as neutrinos that have acquired an electrical charge e.
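As a brief illustration of why the carriers are called massless, the low-energy spectrum near a corner of the Brillouin zone can be contrasted with that of an ordinary free electron. The linear form is the standard textbook result; the Fermi velocity v_F (roughly 10^6 m/s in graphene) and the wave vector k, measured from the apex of the cone, are not defined in the text above and are introduced here as assumptions.

```latex
E_{\pm}(\mathbf{k}) \simeq \pm\,\hbar\, v_F\, |\mathbf{k}| \quad \text{(graphene, near a cone apex)},
\qquad
E(\mathbf{k}) = \frac{\hbar^{2} |\mathbf{k}|^{2}}{2 m_0} \quad \text{(free electron of mass } m_0\text{)}
```

The energy grows linearly with |k|, as for a relativistic particle of zero rest mass with v_F playing the role of the speed of light, whereas the free-electron spectrum is quadratic.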
3.2. Methods of Graphene Preparation
Over time, several methods for graphene preparation have been used:
- chemical exfoliation: atoms or molecules are intercalated between the graphenes [29];
- epitaxial growth by chemical vapour deposition of hydrocarbons on metallic supports;
- thermal decomposition of SiC;
- micromechanical cleavage of the graphite crystal.
The method of micromechanical cleavage allowed graphene to be separated and observed for the first time (Fig. 5) [26].
Figure 5. a. Graphene visualized by atomic force microscopy [26]; b. Graphene suspended on a metallic grid of micrometer size (TEM image) [27]; c. Image obtained by electron microscopy; the zig-zag and armchair (chair-like) edges are observed [15].
After successive cleavages with adhesive tape, the layer is pressed against a substrate and the remaining graphene fragments are searched for. The secret of the method lies in the use of an optical microscope and of an appropriate thickness of the SiO2 film on the Si wafer. Under these conditions the contrast is optimum and the graphene can be observed, with great effort and attention. A change of the SiO2 thickness by only about 5 % (e.g. 315 nm instead of 300 nm) is enough to make the graphene layer invisible. Moreover, the graphite used for cleavage must be carefully selected (large crystallites are needed), the cleavage must be fresh, and the SiO2 must have a very clean surface. Graphene gives a very strong and peculiar signature in Raman microscopy measurements [30, 31], which makes the Raman technique useful for rapid analysis of the thickness of the graphenes identified by optical microscopy.
3.3. Applications
In spite of optimistic opinions regarding the future of graphene in electronics, the graphene microprocessor still seems to be a dream. A series of present-day applications can nevertheless be listed:
- use of graphenes in composite materials;
- use of graphene powder in electrical batteries (the powder is cheaper than the carbon nanotube powder now used by NEC Company, Japan);
- source of electrons for TV screens;
- sensors;
- hydrogen storage.
3.4. Graphene Nanoribbon
Many scientific papers report investigations of nano-carbon structures based on graphene ribbons. The ribbon is considered quasi-one-dimensional (1D) because of its small width of several nanometers. Depending on the shape of their edges, the ribbons are assigned to three different configurations (Fig. 6).
Figure 6. The graphene edge in graphene ribbons. a) Zig-zag edge (Z); b) Armchair (chair-like) edge (A); c) Chiral edge (C).
The shape of the edges and the width of the ribbon determine the width of the forbidden gap. Modelling has shown that the armchair (chair-like) edge ribbon can exhibit a metallic or a semiconducting character, depending on its width, while the zig-zag edge ribbon is metallic.
3.5. Topological Defects in Graphene
Meyer et al. [35] from Berkeley Laboratories investigated a graphene monolayer by high-resolution microscopy (1 Å) at a low acceleration voltage (80 kV), in order to prevent destruction of the graphene during the measurements. The graphene was obtained by mechanical cleavage using the procedure presented in Section 3.2. Every atom in the field of view was detected. A highly ordered crystalline lattice with very few defects is observed (Fig. 7).
Figure 7 [35]. a) Direct image of a graphene monolayer; b) Transition from the monolayer (upper part) to the bi-layer (bottom part); c) The same image as in b, with the graphite AB packing shown in addition.
Fig. 8 presents the metastable defects in the three-layer graphite lattice.
Figure 8 [35]. Metastable defects detected by electron microscopy (HREM) (the scale bar is 2 Å): a. unperturbed lattice before the appearance of the defect; b. Stone-Wales defect; c. the same image as in b, with the atomic configuration superposed (for the sake of clarity); d. relaxation to the unperturbed lattice (after 4 s); e.-g. elimination of a vacancy: e. initial image; f. image with superposed atomic configuration and indication of the pentagon; g. lattice recovered after 4 s; h., i. image and atomic configuration of the defect consisting of 4 heptagons and 4 pentagons; two adjacent pentagons are observed; j., k. atomic configuration of the defect consisting of three heptagons and three pentagons. This defect was eliminated after 4 s.
Figure 9. Nano-engineering of graphene defects: a. bubble; b. crest; c. meta-crystal; d. band [from M. T. Lusk, L. D. Carr, Nanoengineering Defect Structures on Graphene, Physical Review Letters, 100, 175503 (2008)].
Graphene-type membranes are promising as supports for materials to be studied in the transmission electron microscope (TEM), because graphene provides a crystalline, very transparent base whose structure is accurately known. There is continuing interest in modelling the defects in graphenes with the aim of finding new properties in the nanostructures. Irregular configurations of defected graphenes have been clearly established (Fig. 9).
3.6. Graphene-Based Structures
3.6.1. Ideal Graphene
Graphenes and nano-objects based on the graphene configuration have been modelled in the frame of the valence force field model [36]. The bond stretching and bond bending force constants recently reported for fullerenes, k_r = 6.05 × 10^-4 dyn/Å^3 and k_B = 7 × 10^-5 dyn·Å, have been used [37].
Figure 10. The modelling of a graphene sheet (150 carbon atoms). The total free energy per bond and bonding angle = 3.142×10^-8 meV; s = 2sinθ/λ.
The free energy of every carbon configuration was calculated by the Monte-Carlo Metropolis method, with special computer programs developed in the National Institute of Materials Physics and run on powerful PCs. For every configuration 60 million iterations were performed, starting with a working step of 0.02 nm and gradually reducing the step in order to refine the structure [38-40]. We first used a hand-built model of graphene (150 carbon atoms). The atomic coordinates were carefully measured and the table of atom interactions was built. The data were entered into the computer and a special program was run in order to find the structure of minimum free energy (minimum bond stretching and bond bending distortion energy). For graphene this calculation is trivial, because the ideal structure must give zero free energy and exact bond length and bond angle values. Figure 10 shows the results after 60 million iterative steps. The total free energy was found to be 3.142 × 10^-8 meV. Fig. 10a shows the interatomic distance distribution in the model. Fig. 10b presents the X-ray diffraction pattern calculated for this structure. The elongated and asymmetric shape of the diffraction peaks is characteristic of one-dimensional structures. Fig. 10c illustrates the distribution of the angles between bonds. The bonding angles are very close to the ideal 120°. The calculation is important because it reveals the limits of the iteration method in reproducing the exact structure of a given configuration of atoms in the frame of the Monte-Carlo Metropolis method.
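The relaxation procedure described above (random atom shifts, a working step that starts at 0.02 nm and is gradually reduced, a distortion energy built from bond stretching and bond bending) can be sketched as follows. This is a toy version under assumptions made here, not the program used by the authors: the harmonic form of the energy, the arbitrary force constants (not the values of [37]), the T = 0 acceptance rule, the halving schedule for the step, the single hexagonal ring used as a model, and all function names are illustrative.

```python
import math
import random


def bond_angle(center, p, q):
    """Angle (radians) at `center` between the bonds to p and q."""
    u = [a - c for a, c in zip(p, center)]
    v = [a - c for a, c in zip(q, center)]
    cosang = sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    return math.acos(max(-1.0, min(1.0, cosang)))


def distortion_energy(atoms, bonds, angles, r0, theta0, k_r, k_theta):
    """Assumed harmonic bond-stretching plus bond-bending energy (arbitrary units)."""
    e = sum(k_r * (math.dist(atoms[i], atoms[j]) - r0) ** 2 for i, j in bonds)
    e += sum(k_theta * (bond_angle(atoms[c], atoms[i], atoms[j]) - theta0) ** 2
             for i, c, j in angles)
    return e


def relax(atoms, bonds, angles, r0, theta0, k_r=100.0, k_theta=1.0,
          step=0.02, n_iter=6000, shrink_every=1000, seed=0):
    """Static (T = 0) Monte-Carlo: keep a random shift only if the energy drops."""
    rng = random.Random(seed)
    atoms = [list(a) for a in atoms]
    energy = distortion_energy(atoms, bonds, angles, r0, theta0, k_r, k_theta)
    for it in range(1, n_iter + 1):
        i = rng.randrange(len(atoms))
        old = atoms[i][:]
        atoms[i] = [c + rng.uniform(-step, step) for c in old]
        trial = distortion_energy(atoms, bonds, angles, r0, theta0, k_r, k_theta)
        if trial < energy:
            energy = trial
        else:
            atoms[i] = old          # reject the move
        if it % shrink_every == 0:
            step *= 0.5             # gradually refine the working step
    return atoms, energy


def hexagon_ring(r0):
    """Coordinates (nm), bonds and (i, centre, j) angle triplets of one planar ring."""
    atoms = [(r0 * math.cos(k * math.pi / 3.0), r0 * math.sin(k * math.pi / 3.0), 0.0)
             for k in range(6)]
    bonds = [(k, (k + 1) % 6) for k in range(6)]
    angles = [((k - 1) % 6, k, (k + 1) % 6) for k in range(6)]
    return atoms, bonds, angles
```

Displacing one atom of the ideal ring and calling `relax` drives the energy back towards zero; the small residual that remains after a finite number of iterations is the kind of limit the paragraph above refers to.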
3.6.2. Nano-cones and Nanohorns
A carbon nanocone can be modelled by a cut-and-glue procedure that creates a disclination defect. More precisely, a sector of 60º is cut out and the remaining edges are glued together (Fig. 11). If deviation from the plane surface is permitted, one obtains a cone whose top angle is directly related to the disclination angle.
Figure 11. How to build a cone from a graphene sheet [41]: 1) the sector βαγ is cut out and αβ and αγ are joined to create a cone with one pentagon at the top (nΩ=1); 2) a second sector is cut out and αβ and αδ are glued, producing a cone with a smaller angle at the top (nΩ=2). The top ring made of 4 atoms is very unstable; real cones with the same top angle very probably contain two adjacent pentagons.
Due to the symmetry of graphene, only five types of cones can be created from a graphene sheet. The disclination angles are multiples of 60º (60º, 120º, 180º, 240º, 300º), corresponding to the presence of a given number of pentagons, nΩ, at the top (nΩ = 1, 2, 3, 4, 5, respectively), and the top angles are 113º, 85º, 60º, 39º, 19º, values observed experimentally [42]. For nΩ = 6 one gets a nanotube closed at one end, and for nΩ = 0 a flat disc. Cones with top angles of 30º, 50º and 70º have also been observed experimentally [43], but these nanocones are considered to be open at the top (Fig. 12). Another possibility is the creation of partial disclinations in the structure.
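The relation between the number of removed 60º sectors and the top angle can be checked with a few lines of Python. The sketch assumes an ideal, inextensible sheet: after nΩ sectors are removed, the rim of a disc of slant radius R has length 2πR(1 - nΩ/6), so the half top angle satisfies sin(θ/2) = 1 - nΩ/6. The function name is hypothetical, and the results agree with the rounded values quoted above to within a degree or two.

```python
import math


def cone_top_angle(n_pentagons):
    """Top angle in degrees of a cone left after removing n 60-degree sectors.

    Geometric estimate for an ideal sheet: sin(theta / 2) = 1 - n / 6.
    n = 0 gives the flat disc (180 degrees), n = 6 the tube limit (0 degrees).
    """
    if not 0 <= n_pentagons <= 6:
        raise ValueError("the number of removed sectors must lie between 0 and 6")
    return 2.0 * math.degrees(math.asin(1.0 - n_pentagons / 6.0))


if __name__ == "__main__":
    for n in range(7):
        print(n, round(cone_top_angle(n), 1))
```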
Figure 12. The structure of an open nanocone (abat-jour-like configuration).
Figure 13. Electron microscope images (SEM) showing a carbon disk and three carbon nanocones of one micrometer diameter [from http://www.complexphysics.org/Projects/Nanocarbon.html].
By using a laser ablation technique at room temperature and in the absence of catalysts, a new class of carbon materials, called nanohorns, was obtained. These nanohorns exhibit a unique top angle of ~20º [44] (Fig. 14). This top angle of ~20º corresponds to a disclination of 5π/3, which implies that all nanohorns contain exactly 5 pentagons at the top [45].
Figure 14. TEM images of nanotubes (a) and nanohorns (b), (c). The schematic of such nanohorns is also presented (c, bottom) [from Iijima Sumio, Yudasaka Masako, Nihey Fumiyuki, Carbon Nanotube Technology, NEC Technical Journal, 2(1), (2007) p.52].
The modelling of the nano-cone objects was carried out starting from hand-built models, followed by structural relaxation of the coordinates in the frame of the Monte-Carlo-Metropolis method. Three configurations were constructed: broad cupola, narrow cupola and small cup. After relaxation the nano-objects were characterized by the total free energy and the distribution of the bond distortion energies. The broad cupola has 80 atoms and is characterized by one five-fold ring at the top, the remaining rings being six-fold rings of carbon atoms (Fig. 15). After relaxation, the free energy per bond and angle is 0.322 meV. The top angle of the cone is 113.48°. The wall inclination is 60°. The narrow cupola has 73 carbon atoms and a free energy per bond and angle of 1.1 meV (Fig. 16). The top angle of the cone is 60°. The inclination angle of the walls is around 30°. The small cup configuration contains 35 carbon atoms and exhibits a distortion energy per bond and angle (free energy) of 4.79 meV (Fig. 17). As is easily observed, the small cup configuration has the most distorted valence bonds and is therefore the least stable. The broad cupola, which contains only one defect in the graphene (one five-fold ring), is less distorted and seems to be a very stable structure that could therefore be obtained in practice. As shown in Fig. 13, nano-cones have been observed experimentally. Interestingly, mostly narrow cupolae have been observed.
Figure 15. Broad cupola (one five fold ring at the top) - 80 carbon atoms a. Pair distance distribution b. X-ray diffraction pattern c. Bonding angle distribution. Free energy/(bond and angle) = 0.332 meV.
Figure 16. Narrow cupola (3 five fold rings at the top) - 73 carbon atoms a. Pair distance distribution b. X-ray diffraction pattern c. Bonding angle distribution. Free energy / (bond and angle) = 1.1 meV
Figure 17. Cup configuration (6 fivefold rings) - 35 carbon atoms. a. Pair distance distribution b. X-ray diffraction pattern c. Bonding angle distribution. Free energy / (bond and angle) = 4.79 meV
Figure 17 bis. e1) One heptagon in graphene (175 atoms); e2) side view. Free energy / (bond and angle) = 0.0152 meV.
A quite different type of configuration is induced by heptagons created in a nearly perfect graphene. Fig. 17-bis shows the butterfly configuration of a defected graphene with a heptagon of carbon atoms at its centre.
3.7. Fullerene and Fullerene Based Nano-Objects
3.7.1. How the Fullerenes Are Formed
The exotic geometries of carbon originate from the topological defects of graphene. The following structures have been modelled: fullerenes, nanotubes, nanocones, nanohorns, multilayer graphites, etc. Fullerenes are ball-like configurations of pure carbon, discovered in 1985 by Kroto, Smalley and Curl [46], who received the 1996 Nobel Prize in Chemistry. It should be noted that theoretical studies of the C60 molecule had been carried out as early as 1970, but the laureates were unaware of them. The name fullerene comes from the American architect Richard Buckminster Fuller, who created the geodesic dome, whose structure is similar to that of the C60 configuration (Fig. 18).
Figure 18. Geodesic dome created by Richard Buckminster Fuller. [from http://commons.wikimedia.org/ wiki/File:Biosphere_in_Montreal.jpg].
Fullerenes have been obtained by self-assembling of the carbon atoms in a hot carbon plasma, created by irradiating a graphite disk with focused laser pulses (YAG:Nd, λ = 532 nm, pulse duration: 5 ns) having the pulse energy of ~30 mJ (Fig. 19).
Figure 19. Mass spectra of the distribution of the carbon clusters produced under various working conditions (a, b, c). In case a, the stable carbon clusters C60 and C70 are formed abundantly. This is evidence for the absence of free (dangling) bonds in these clusters. The clusters must be in a closed configuration with every atom three-fold coordinated [46].
Because under specific working conditions (Fig. 19a) mainly C60 and C70 clusters were formed, the Nobel Prize laureates proposed for the C60 cluster a geometrical configuration based on a truncated icosahedron (Fig. 20). The typical molecular fullerene, C60, is composed of 60 carbon atoms that form an Archimedean polyhedron with 12 pentagons and 20 hexagons. Every pentagon is surrounded by hexagons, and every hexagon is surrounded by three pentagons and three hexagons in alternation.
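The counts of 12 pentagons and 20 hexagons follow from Euler's polyhedron formula once every atom is required to be three-fold coordinated. The short derivation below is a standard argument added for illustration; the symbols V (atoms), E (bonds), F (rings), P (pentagons) and H (hexagons) are introduced here and do not appear in the original text.

```latex
\begin{aligned}
V &= \frac{5P + 6H}{3}, \qquad E = \frac{5P + 6H}{2}, \qquad F = P + H,\\[4pt]
V - E + F &= 2 \;\Longrightarrow\; -\frac{5P + 6H}{6} + P + H = 2 \;\Longrightarrow\; P = 12,\\[4pt]
V &= 60 \;\Longrightarrow\; 5 \cdot 12 + 6H = 180 \;\Longrightarrow\; H = 20 .
\end{aligned}
```

The number of pentagons is fixed at 12 for any closed cage built only from pentagons and hexagons, while the number of hexagons grows with the number of atoms as H = V/2 - 10.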
Figure 20. The structure of a truncated icosahedron, proposed for the C60 molecule [from R. F. Curl and R. E. Smalley, Probing C60, Science 242, 1017 (1988)].
Another route to fullerene synthesis is the self-assembly of evaporated carbon clusters in a very hot electric arc discharge (2800 ºC) between two graphite electrodes, in a vessel filled with helium at a pressure of 0.4 bar. As is well known, a Euclidean plane can be covered by a hexagonal tri-coordinated lattice with Nc = 3 and Nl = 6. Three hexagons that meet at a vertex contribute three angles, 3 × 120º = 360º. If one hexagon is replaced by a pentagon, either the pentagon is deformed or, if the pentagon remains perfect, the whole structure is strongly deformed and leaves the initial plane. Every pentagon creates a deficit of 12º at each of its corners. Heptagons introduced into the structure compensate the deficit. In the hot plasma that triggers the evaporation of the graphite there is a large variety of carbon clusters: C2, C3, C4, ... C6 (benzene ring), C10 (two adjacent rings as in naphthalene) and even C12 clusters consisting of 3 rings: two hexagons and one pentagon (Fig. 21).
Figure 21. The C12 molecule consisting of two hexagons and one pentagon [from R. Kerner, Chap. 3 “The Role of Topology in Growth and Agglomeration” in Topology in Condensed Matter, M.I. Monastyrsky (Ed.), (Springer-Verlag Berlin Heidelberg 2006, Springer Series in Solid-State Sciences vol. 150), p.77].
The C12 molecule is large enough to become a nucleation centre for the fullerene. How does the growth proceed? We must take into account that in the C60 fullerene all the corners are
equivalent. Each is formed by two hexagons and one pentagon. Two pentagons must not have a common edge or a common corner, nor may three hexagons have a common corner. These conditions must be satisfied when a new ring is added to the C12 molecule (by addition of C2 or C3 molecules) or when a larger molecule, e.g. C9, is added. What is built by the self-assembly of different clusters is only a waved surface with dangling bonds. Nevertheless, these additive processes are statistical and, therefore, the preference for fullerene configurations cannot be explained by them alone. That is why we must admit a significant difference between the energies of the different added polygons. The C60 fullerene can be regarded as the self-assembly of 12 pentagons, the hexagons being a consequence of this assembly. It is therefore very important to admit a high probability for the formation of pentagons. In 1990 Krätschmer, Huffman et al. [47] discovered that the C60 fullerene, produced in large amounts, can crystallize (Fig. 22).
Figure 22. a). The elementary cell of the cubic crystal (f. c. c.) of fullerite (C60) [47]; b). Fullerite crystals [from http://en.wikipedia.org/wiki/Fullerene].
3.7.2. Multilayer (Onion-Like) Fullerenes
Nanocarbon structures of multilayer fullerenes, built from concentric fullerenes (Fig. 23a, b), have been prepared and investigated. The diameters of these concentric fullerenes must stand in a strict relation to one another, so that the van der Waals interaction energy between two layers is a minimum. For that, the interlayer distance must not be less than the distance between the atomic planes in graphite: 3.354 Å. Let us suppose that, in order to obtain larger and larger fullerenes, we preserve the number of pentagons (12) and increase the number of hexagons (according to the Euler formula). The 20 triangular faces, each made of a given number of hexagons with three pentagons at the corners (Fig. 20), become more and more planar, and the fullerenes exhibit a more and more distinct icosahedral structure (Fig. 21). Every triangular face can be parameterized, using the notation of Coxeter [48], by two integers (p,q), which represent the relative positions of the pentagons with respect to one another in a hexagonal system of coordinates (Fig. 24). Fig. 25 shows the family of icosahedral fullerenes having the same number of pentagons (12) and a variable number of hexagons.
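The reason why the number of pentagons stays at 12 while the number of hexagons grows follows from the Euler formula mentioned above. A short derivation, given here as a sketch, assumes a closed, sphere-like cage in which every atom has three neighbours and every ring is a pentagon or a hexagon; the symbols V, E, F, n_5 and n_6 (atoms, bonds, rings, pentagons, hexagons) are introduced only for this illustration.

```latex
\begin{align}
  V - E + F &= 2, \qquad F = n_5 + n_6
      && \text{(Euler formula for a closed cage)} \\
  3V &= 2E
      && \text{(three bonds per atom, each bond shared by two atoms)} \\
  5n_5 + 6n_6 &= 2E
      && \text{(each bond borders two rings)} \\
  \Rightarrow\quad n_5 &= 12, \qquad n_6 = \frac{V}{2} - 10 .
\end{align}
```

For C60 this gives 20 hexagons, and for the smallest cage, C20, none at all.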
Figure 23. a. Structure of a fullerene with three layers; b. Electron microscope picture [see the reference from F. Banhart, Structural transformations in carbon nanoparticles induced by electron irradiation, Fizika Tverdogo Tela (in Russian), 44(3) (2002) p.388].
Figure 24. Possible triangular configurations which, in the notation of Coxeter, correspond to the parameters (p,q): a. (2,2); b. (3,0); c. (4,2) [from R. Kerner, Chap. 3 “The Role of Topology in Growth and Agglomeration” in Topology in Condensed Matter, M.I. Monastyrsky (Ed.), (Springer-Verlag Berlin Heidelberg 2006, Springer Series in Solid-State Sciences vol. 150), p.80].
Figure 25. The family of icosahedral fullerenes C_{20(p²+pq+q²)} (p≥1, q≥0; here p=q), with the same number of pentagons (12) and a variable number of hexagons [60]. a. C240: 20×(2²+2×2+2²); b. C540: 20×(3²+3×3+3²); c. C60: 20×(1²+1×1+1²); d. C960: 20×(4²+4×4+4²); e. C1500: 20×(5²+5×5+5²).
Figure 26. The fullerene family of quasi-spherical balls C_{60n²} (n=3-5), which contain a variable number of pentagons and heptagons [60]. a. C540 (60×3²); b. C960 (60×4²); c. C1500 (60×5²).
The number of carbon atoms in the icosahedral molecule built from triangles of type (p,q) is:
$$N = 20\,(p^{2} + pq + q^{2})$$
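As a quick check of this counting rule, the sketch below evaluates N for a few (p,q) pairs and derives the number of hexagons from the relation h = N/2 - 10, which holds for a closed cage with 12 pentagons. The function names are hypothetical and chosen only for this illustration.

```python
def icosahedral_atoms(p: int, q: int) -> int:
    """Number of carbon atoms N = 20 (p^2 + p q + q^2) for Coxeter parameters (p, q)."""
    if p < 1 or q < 0:
        raise ValueError("expected p >= 1 and q >= 0")
    return 20 * (p * p + p * q + q * q)


def hexagon_count(n_atoms: int) -> int:
    """Hexagons in a closed cage with 12 pentagons and n_atoms tri-coordinated atoms."""
    return n_atoms // 2 - 10


# The p = q family of Figure 25: C60, C240, C540, C960, C1500.
for p in range(1, 6):
    n = icosahedral_atoms(p, p)
    print(f"(p,q)=({p},{p}): C{n}, 12 pentagons, {hexagon_count(n)} hexagons")
```

The same function covers the q = 0 family, for example (1,0) gives C20 and (3,0) gives C180.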
Fig. 26 presents three quasi-spherical fullerenes built from a variable number of pentagons, hexagons and heptagons.
3.7.3. Fullerene Modeling
We have modelled two types of fullerenes: the C60 fullerene ball, which was the first fullerene discovered and investigated, and the small fullerene ball C36, in order to compare the structures and investigate how the free energy depends on size. Figure 27 shows the C60 configuration and its structural characterization: pair distance distribution, X-ray diffraction pattern and bonding angle distribution. After full relaxation, the free energy (per bonding distance and bonding angle) of the C60 fullerene ball is 4.33 meV, a value close to that of the cup modeled in the previous section. The smaller fullerene ball (Figure 28) exhibits a much higher free energy (7.24 meV), which indicates the low stability of such a nano-object. Thus, such fullerenes are not expected to be found or prepared easily.
Figure 27. Fullerene (12 five fold rings, 60 carbon atoms). a. Pair distance distribution b. X-ray diffraction pattern c. Bonding angle distribution. Free energy / (bond and angle) = 4.33 meV.
Figure 28. Small fullerene ball (12 five fold rings) - 36 carbon atoms a. Pair distance distribution b. X-ray diffraction pattern c. Bonding angle distribution. Free energy / (bond and angle) = 7.24 meV.
3.8. Carbon Nanotubes
3.8.1. Atomic and Electronic Structure
In 1991, Iijima, from the NEC Laboratory, Japan, reported the first observation of multiwall carbon nanotubes in the soot obtained by arc discharge between two graphite electrodes [49]. Two years later Iijima discovered single wall carbon nanotubes [50]. The electrical and mechanical properties of carbon nanotubes have attracted the attention of scientists, and the understanding of these properties was the engine that drove the applications and the rapid development of the field of nano-carbons. A single wall carbon nanotube (NTC) can be considered as a graphene sheet rolled up and closed as a cylinder, without any defect or free bond. All the polygons of this tube are hexagons [51] (Fig. 29). A necessary condition for obtaining a cylinder without defects is that, on rolling, one node (n1,n2) of the graphene lattice coincides with the origin (0,0). Under this condition, if a1 and a2 are the lattice vectors of graphene, the circumference of the carbon nanotube is equal to the length of the vector (n1a1+n2a2), while the chiral angle of the carbon nanotube, θ, is defined as the angle between the vectors (n1a1+n2a2) and a1.
Figure 29. The structure of the carbon nanotube. a) Graphene with the lattice vectors a1 and a2. The Miller indices are indicated for several lattice sites: (0,0), (1,0), (3,0), (5,0), (1,1), (2,2), (3,3). θ is the chiral angle of the (3,1) carbon nanotube. The dotted line is the circumference of a possible armchair (chair-like) nanotube, and the dashed line is the circumference of a zig-zag nanotube [51]; b) the armchair nanotube (5,5) [51]; c) the zig-zag nanotube (9,0) [51]; d) axial view of the armchair nanotube (10,10); e) axial view of the zig-zag nanotube (16,0); f) axial view of the chiral nanotube (15,5).
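The chiral angle of a chiral nanotube is given by:
$$\theta = \arctan\!\left(\frac{\sqrt{3}\,n_2}{2n_1 + n_2}\right)$$
and the corresponding diameter is given by:
$$d = \frac{a}{\pi}\sqrt{n_1^{2} + n_1 n_2 + n_2^{2}}$$
In these two relations the parameter a is the lattice constant of graphene, i.e. the length of the lattice vectors a1 and a2.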
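For readers who want numbers, the sketch below evaluates the chiral angle and the diameter from the chiral indices. It assumes the standard relations tan θ = √3·n2/(2n1+n2) and d = (a/π)·√(n1²+n1n2+n2²), which follow from lattice vectors of equal length a at 60º to each other, and it assumes a graphene lattice constant a = 0.246 nm. The function names and the constant name are hypothetical.

```python
import math

A_GRAPHENE_NM = 0.246  # assumed graphene lattice constant, in nm


def chiral_angle_deg(n1: int, n2: int) -> float:
    """Angle, in degrees, between n1*a1 + n2*a2 and a1 (a1, a2 at 60 degrees)."""
    return math.degrees(math.atan2(math.sqrt(3) * n2, 2 * n1 + n2))


def diameter_nm(n1: int, n2: int, a: float = A_GRAPHENE_NM) -> float:
    """Tube diameter: circumference |n1*a1 + n2*a2| divided by pi."""
    return a * math.sqrt(n1 * n1 + n1 * n2 + n2 * n2) / math.pi


for indices in [(9, 0), (5, 5), (10, 10), (15, 5)]:
    theta = chiral_angle_deg(*indices)
    d = diameter_nm(*indices)
    print(f"{indices}: theta = {theta:.2f} deg, d = {d:.3f} nm")
```

Zig-zag tubes (n,0) sit at θ = 0º and armchair tubes (n,n) at θ = 30º, the two ends of the allowed range.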
The diameter and the chirality of a carbon nanotube, and consequently its atomic geometry, are fully determined by the two Miller indices (n1,n2), which are called the chiral indices of the nanotube. Due to the symmetry of the graphene lattice, the chiral indices can be restricted to the range n1≥n2≥0, with n1>0, while the chiral angle θ takes values in the range 0º to 30º. Theoretical calculations [31-34] have shown that the electronic properties of carbon nanotubes are very sensitive to their geometrical structure. Although graphene is a zero-gap semiconductor, theory predicts that NTC can be either metallic or semiconducting, with different widths of the forbidden gap, depending on the diameter and chiral indices of the tubes. Due to the intimate correlation between the geometrical and the electronic structure, new special properties appear, especially at the junction of nanotubes with different diameters (Fig. 30).
Figure 30. Carbon nanotube illustrating a metal-semiconductor junction [from http://www.nanotechnow.com/nanotube-buckyball-sites.htm].
The general rules for the character of the electrical conduction are:
- armchair (chair-like) nanotubes (n,n) (n1=n2=n) are metals;
- nanotubes with n1-n2=3j (j>0 is an integer), including the zig-zag tubes (3j,0), are very narrow gap semiconductors (~10 meV); the gap originates from curvature effects;
- nanotubes with n1-n2≠3j are semiconductors with a wide forbidden gap.
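These three rules translate directly into a small classifier. The sketch below is only an illustration of the rules as stated; the function name and the returned labels are hypothetical.

```python
def conduction_type(n1: int, n2: int) -> str:
    """Classify a nanotube with chiral indices (n1, n2), n1 >= n2 >= 0 and n1 > 0."""
    if n1 <= 0 or n2 < 0 or n2 > n1:
        raise ValueError("expected n1 >= n2 >= 0 and n1 > 0")
    if n1 == n2:
        return "metallic"                    # armchair (n, n)
    if (n1 - n2) % 3 == 0:
        return "narrow-gap semiconductor"    # n1 - n2 = 3j, j > 0
    return "wide-gap semiconductor"          # n1 - n2 not a multiple of 3


for indices in [(5, 5), (9, 0), (10, 0), (15, 5), (16, 0)]:
    print(indices, conduction_type(*indices))
```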
Figure 31. a. Carbon nanotube with four walls; b. HRTEM images of a multiwall NTC prepared by arc discharge, with well graphitized walls and bamboo structure [from http://www.nanotechnow.com/nanotube-buckyball-sites.htm].
NTC (n,n) are metallic independently of their curvature, due to their symmetry. The width of the forbidden gap for the other two types of nanotubes decreases as the tube diameter increases. In the wide-gap tubes the gap varies as 1/d (d is the tube diameter), while in the very narrow gap tubes it varies as 1/d². Experimental measurements cannot distinguish between metallic and quasi-metallic nanotubes because of contact resistances and thermal effects. Figure 31 shows the structure of a four-wall nanotube. Computer modelling results for the nanotube-nanotube connection are reported in [40]. These connections are important for applications in electronic circuits, because the nanotubes can play the role of conductors in nanotechnological devices.
3.8.2. Properties of the Carbon Nanotubes
The following outstanding properties must be mentioned:
- High mechanical stiffness: Young modulus higher than 1.2 TPa, six times the value for steel.
- High thermal conductivity: 6×10³ W/(m·K) for isolated tubes [56].
- High elasticity and no plasticity, even for large deformations [37].
- The metallic NTC are one-dimensional quantum conductors, in which the electrons are transported ballistically. The heat is dissipated only at the contacts.
- The length/diameter ratio can be huge: 10⁵ or higher. Electron emission can be induced from the end of long metallic NTC by applying moderate electric fields.
- Depending on the chiral indices, metallic NTC can become semiconducting when they are stretched or twisted.
- Atoms and molecules can be encapsulated within NTC (Fig. 32).
- They can be doped with p-type or n-type elements.
Figure 32. TEM image of a multilayer NTC uniformly filled with lead oxide. The filling was achieved by capillarity [58].
The Haeckelites. As is well known, a plane can be covered by pentagons, hexagons and heptagons. It is therefore possible to imagine a graphene-like sheet built from such polygons, in which the numbers of pentagons and heptagons must be equal so that the negative curvature of the heptagons compensates the positive curvature of the pentagons [59,60] (Fig. 33). These lattices have been called haeckelites in honor of Ernst Haeckel, who produced very beautiful drawings in which all these types of rings can be seen [61]. The haeckelite nanotubes are conductors independently of their diameter and chirality. They are very rigid, with a Young modulus of ~1.0 TPa.
Figure 33. Haeckelites [60]: a). Rectangular lattice (R); b). Hexagonal lattice (H); c). Oblique lattice (O); d)-f). Nanotubes corresponding to the lattices on the left; g). Wavy oblique haeckelite nanotube.
3.8.3. Modelling the Combination Fullerene Nanotube (nanoBuds) and Nanotube-Nanotube
In nanotechnology, the NTC-fullerene combination represents a new material that combines the properties of NTC and fullerenes. The mechanical properties and the electrical conductivity are similar to those of NTC while, due to the high reactivity of the attached fullerene molecules, the hybrid material can be functionalized by the well-known chemistry of fullerenes. Moreover, the fullerene molecules can be used as anchors that prevent the nanotubes from slipping in composite materials, thus improving the mechanical properties of the composites. In this new type of material the fullerenes are bonded covalently to the external face of the nanotube wall (Fig. 34). Due to the high curvature of the fullerene surface, this nano-object acts as an electron emitter on the carbon nanotube conductor. Thus a thin film of such hybrid material exhibits a very low threshold field for electron emission (0.65 V/μm), as compared to the value of 2 V/μm for unfunctionalized single wall NTC. These materials are therefore ideal candidates for electron sources in TV screens. We have modeled several special configurations, including the attachment of a fullerene molecule to the wall of a nanotube and the coupling of two nanotubes linked perpendicularly. We tried to demonstrate that such configurations are possible from the crystallo-chemical point of view [38].
Figure 34. Various nanotube-fullerene structures [from http://en.wikipedia.org/wiki/Carbon_nanobud].
Figure 35. Carbon Nanotube (14 five fold rings + 2 seven fold rings) - closed ends (154 carbon atoms) a. Pair distance distribution b. X-ray diffraction pattern; c. Bonding angle distribution. Free energy / (bond and angle) = 2.41 meV
Figure 36. Nanotube-Fullerene (214 carbon atoms) Free energy/(bond and angle) = 2.99 meV.
The results allow us to conclude that the change in free energy upon attachment of the fullerene to the nanotube, and upon coupling of two (open) nanotubes, is small enough to guarantee high stability of the complex configurations. At the same time the modeling suggests a new mechanism for the formation of fullerenes: nucleation at the nanotube wall. This is achieved with the help of pentagonal rings accidentally introduced during growth. Another result is the possibility of growing new nanotubes directly on the external part of a nanotube wall, by protruding the wall with the help of hexagonal rings of atoms (Fig. 35). Last but not least, we must remark a special aspect of the growth of nanotubes perpendicular to one another: the nanotube on which another nanotube is nucleated at a right angle becomes curved, elliptical in cross-section and twisted (Fig. 36). Furthermore, we have investigated the dihedral angle distribution in the three models of nanocarbon species (Fig. 37). The distribution of the dihedral angles in fullerene (C60) is: 180 angles of 0°, 60 angles of 138° and 120 angles of 142°. For the nanotube the dihedral angles show a larger dispersion due to the closed ends: 126 angles of 0° and 224 angles of 26°. New angles appear at 4° (28 angles), 6° (56 angles) and 10° (28 angles). In the fullerene-nanotubule complex the dihedral angles are distributed over a wider range, with maxima at 0°, 2°, 6°, 10°, 26°, 128°, 138° and 180°. This means that the complex configuration is subjected to inhomogeneous stresses. The identification of these stresses is important. Preliminary determinations have shown that the fullerene-nanotubule junction is the most distorted region, where the stresses are the largest.
Figure 37.A. Carbon nanotube-nanotube interconnection (246 carbon atoms). Free energy/(bond and angle) = 0.682 meV.
Figure 37.B. Dihedral angle distribution in the three relaxed models of nano-carbon configurations.
Figure 37.C. Void distribution in the three relaxed models of nano-carbon configurations.
Finally, the void structure of the carbon configurations has been studied. In the normal fullerene (C60) the unique void radius is 0.2925 nm. The nanotubule contains a void of diameter 0.2275 nm, while the fullerene-nanotubule complex contains a distribution of voids between 0.2125 and 0.2275 nm. The conclusions of the modelling studies are:
a) the nanometric structure of the fullerene-nanotube combination is physically realistic;
b) the bonding distortion at the fullerene-nanotube junction is low enough to permit the coupling of the fullerene molecule to the nanotubule;
c) the structural voids in the complex are large enough to permit the introduction of metallic atoms (Fe, Co, Ni, with radii ~0.123 nm), which form compact rows or small clusters and thus give rise to metallic conduction; this could be used in integrated systems based on nano-devices and in future optoelectronics;
d) modelling of the carbon nanostructures is a simple and cheap method that can easily be extended to other systems, e.g. silicon, in order to demonstrate the feasibility of nano-wires for applications in nanoelectronics and spintronics.
3.9. Exotic Nano-Configurations
We can imagine many nanometric structures based on graphitic carbon. Here we present several examples: toroidal structure (Fig. 38); zeolite-like structure (Fig. 39) [60,67]; gyroidal structure (Fig. 40); helicoidal nanotube (Fig. 41); fullerene with holes, without pentagons (Fig. 42); nanotube with holes, without pentagons (Fig. 43). None of the carbon nano-objects presented above has yet been produced experimentally, but the crystallochemical parameters, especially the bond distortion, seem so reasonable that the search for new carbon objects will surely be rewarded.
Figure 38. Toroidal nanocarbon structures [from A. Lőrinczi, M. Popescu, F. Sava, A. Anghel, Modelling of the complex carbon structure: fullerene – nanotubule, J. Optoelectron. Adv. Mater., 6(1), 349 (2004)]. a) Nanotorus formed only by hexagons. A ring built of iron atoms can be observed; b) Twisted nanotorus; c) Nanotorus formed by hexagons, pentagons and heptagons. The surface is wavier than in a); d) Nanotorus with small diameter; e) Fullerene C60 for comparison.
Figure 39. a) Four cubic cells decorated by graphenes built of hexagons and octagons (192 atoms per cell); b) Triply periodic primitive surface [from A. Lőrinczi, M. Popescu, F. Sava, A. Anghel, Modelling of the complex carbon structure: fullerene – nanotubule, J. Optoelectron. Adv. Mater., 6(1), 349 (2004)].
Figure 40. Gyroidal triply periodic primitive surfaces [from M. T. Lusk and N. Hamm, Ab initio study of toroidal carbon nanotubes with encapsulated atomic metal loops, PRB 76, 125422 (2007)].
Figure 41. Helicoidal nanotube (with pentagons, heptagons and hexagons) [60].
Figure 42. Structures of pierced fullerenes (with heptagons and hexagons, but without pentagons) [60].
Figure 43. Pierced nanotube (with heptagons and hexagons but without pentagons) [60].
3.10. Complex Nano-Configurations: Carbon-Chalcogenides
We have developed a more complex nanostructure configuration based on a carbon nanotube network and an arsenic sulphide network of atoms, which is known to form a nanotube structure [4].
Figure 44. Carbon nanotube with zig-zag configuration (15,0), having at the coupling end 2 pentagonal rings of atoms, linked to a zig-zag arsenic sulphide nanotube. A fullerene molecule is attached at the other end of the arsenic sulphide tube. For the attachment, one 5-fold ring of atoms was eliminated [60].
Figure 45. The distortion energy distribution in the complex nanostructure based on the combination of a carbon nanotube, As2S3 chalcogenide tube and a fullerene C60 molecule.
We have built an As2S3 nanotube by computer (HyperChem software). This nanotube was capped at one end by a carbon nanotube and at the other end by a fullerene molecule. Fig. 44 shows the final configuration after energy relaxation. Fig. 45 illustrates the variation of the free energy of the configuration along the nanotube axis. In the connection region between the carbon nanotube and the arsenic sulphide tube the distortion energy is very high. In the nanotube-fullerene connecting region the variation of the distortion energy is low.
3.11. Prospective of the Wealth of Nano-Objects as Revealed by Atomic-Scale Modelling
If the free energies of the models of nano-objects are compared, one can draw important conclusions on the stability and probability of formation of a huge number of carbon configurations, and even of complex carbon-chalcogenide configurations.
Figure 46. The free energy per bond and angle for various nano-objects based on carbon.
Fig. 46 compares the different nano-objects based on graphene configurations in terms of the average free energy per bond and angle. Objects such as nanocupolae and nanotubes with closed ends exhibit lower distortion energy and, therefore, higher stability. Modeling studies can thus speak in favour of one configuration or another to be reproduced experimentally. It is also a challenge for chemists and physicists to prepare new materials at the nanoscale based on low-coordinated elements such as P, As, S, Se, Te, etc. The new materials could be a reservoir of new properties and applications in nanotechnology.
Acknowledgment The support of the CNCSIS in the frame of the "Exploratory research project" ID 1356 is acknowledged with thanks.
References
[1] Tanigaki K., Ebbesen T.W., Saito S., Mizuki J., Tsai J.S., Kubo Y., Kuroshima S., Nature 352, 222 (1991).
[2] Lozovanu P., Lasser G., Stamati G., Caraman M., J. Optoelectron. Adv. Mater. 4(1), 151 (2002).
[3] Tenne R., Chem. Eur. J. 8, 5296 (2002); Angew. Chem. Int. Ed. 42, 5124 (2003).
[4] Lee J.H., Kim M.G., Yoo B., Myung N.V., Maen J., Lee T., Dohnalkova A.C., Fredrickson J.K., Sadowski M.J., Hur H.G., Proc. Nat. Acad. Sci. USA 104(51), 20410 (2007).
[5] Zheng Q.D., Yu A.B., Lu G.Q., Progr. Polymer Sci. 33, 191 (2008).
[6] Karakasidis T.E., Charitidis C.A., Mater. Sci. Eng. C27, 1082 (2007).
[7] Lennard-Jones J.E., Proc. Phys. Soc. 43, 461 (1931).
[8] Morse P.M., Phys. Rev. 34, 57 (1929).
[9] Abrahamson A.A., Phys. Rev. 178, 76 (1969).
[10] Murray S.D., Baskes M., Phys. Rev. 29(12), 6443 (1984).
[11] Godwal B.K., Rao R.S., Chidambaram R., J. Non-Cryst. Solids 334-335, 117 (2004).
[12] Pauling L., The Nature of the Chemical Bond, 3rd ed., Cornell University Press, Ithaca NY, 1960.
[13] Jorio A., Dresselhaus G., Dresselhaus M.S. (Eds.), Carbon Nanotubes, Advanced Topics in the Synthesis, Structure, Properties and Applications, Springer-Verlag, Berlin-Heidelberg, 2008.
[14] Novoselov K.S., Geim A.K., Morozov S.V., Jiang D., Dubonos S.V., Grigorieva I.V., Firsov A.A., Science 306, 666 (2004).
[15] Geim A.K., Novoselov K.S., Nature Materials 6(3), 183 (2007).
[16] Wallace P.R., Phys. Rev. 71, 622 (1947).
[17] McClure J.W., Phys. Rev. 104, 666 (1956).
[18] Slonczewski J.C., Weiss P.R., Phys. Rev. 109, 272 (1958).
[19] Fradkin E., Phys. Rev. B 33, 3263 (1986).
[20] Peierls R.E., Ann. I. H. Poincare 5, 177 (1935).
[21] Landau L.D., Phys. Z. Sowjetunion 11, 26 (1937).
[22] Landau L.D., Lifshitz E.M., Statistical Physics, Part I, Pergamon, Oxford, 1980.
[23] Mermin N.D., Phys. Rev. 176, 250 (1968).
[24] Venables J.A., Spiller G.D.T., Hanbucken M., Rep. Prog. Phys. 47, 399 (1984).
[25] Evans J.W., Thiel P.A., Bartelt M.C., Sur. Sci. Rep. 61, 1 (2006).
[26] Novoselov K.S., Jiang D., Schedin F., Booth T.J., Khotkevich V.V., Morozov S.V., Geim A.K., Proc. Natl Acad. Sci. USA 102, 10451 (2005).
[27] Meyer J.C., Geim A.K., Katsnelson M.I., Novoselov K.S., Booth T.J., Roth S., Nature 446, 60 (2007).
[28] Nelson D.R., Piran T., Weinberg S., Statistical Mechanics of Membranes and Surfaces, World Scientific, Singapore, 2004.
[29] Dresselhaus M.S., Dresselhaus G., Adv. Phys. 51, 1 (2002).
[30] Ferrari A.C., Meyer J.C., Scardaci V., Casiraghi C., Lazzeri M., Mauri F., Piscanec S., Jiang D., Novoselov K.S., Roth S., Geim A.K., Phys. Rev. Lett. 97, 187401 (2006).
[31] Gupta A., Chen G., Joshi P., Tadigadapa S., Eklund P.C., Nano Lett. 6, 2667 (2006).
[32] Divigalpitiya W.M.R., Frindt R.F., Morrison S.R., Science 246, 369 (1989).
[33] Klein A., Tiefenbacher S., Eyert V., Pettenkofer C., Jaegermann W., Phys. Rev. B 64, 205416 (2001).
[34] Novoselov K.S., Jiang Z., Zhang Y., Morozov S.V., Stormer H.L., Zeitler U., Maan J.C., Boebinger G.S., Kim P., Geim A.K., Science 315(5817), 1379 (2007).
[35] Meyer J.C., Kisielowski C., Erni R., Rossell M.D., Crommie M.F., Zettl A., Nano Lett. 8(11), 3582 (2008).
[36] Keating P.N., Phys. Rev. 145, 637 (1966).
[37] Jishi R.A., Mirie R.M., Dresselhaus M.S., Phys. Rev. B 45(23), 13685 (1992).
[38] Lorinczi A., Popescu M., Sava F., Anghel A., J. Optoelectron. Adv. Mater. 6(1), 349 (2004).
[39] Sava F., Popescu M., Proc. Rom. Acad. - Series A, 10(1), in press (2009).
[40] Popescu M., Sava F., Lorinczi A., Stegarescu M., Digest J. Nanomater. Biostruct. 1(1), 21 (2006).
[41] Lammert P.E., Crespi V.H., Phys. Rev. Lett. 85, 5190 (2000).
[42] Krishnan A., Dujardin E., Treacy M.M.J., Hugdahl J., Lynum S., Ebbesen T.W., Nature 388, 451 (1997).
[43] Terrones H., Terrones M., Carbon 36, 725 (1998); Kiselev N.A., Sloan J., Zakharov D.N., Kukovitskii E.F., Hutchison J.L., Hammer J., Kotosonov A.S., Carbon 36, 1149 (1998).
[44] Iijima S., Yudasaka M., Yamada R., Bandow S., Suenaga K., Kokai F., Takahashi K., Chem. Phys. Lett. 309, 165 (1999).
[45] Berber S., Kwon Y.-K., Tománek D., Phys. Rev. B 62(4), 2291 (2000).
[46] Kroto H.W., Heath J.R., O’Brien S.C., Curl R.F., Smalley R.E., Nature 318, 162 (1985).
[47] Krätschmer W., Lamb L.D., Fostiropoulos K., Huffman D.R., Nature 347, 354 (1990).
[48] Coxeter H.S.M., Regular Polytopes, Methuen and Co., London, 1948.
[49] Iijima S., Nature 354, 56 (1991).
[50] Iijima S., Ichihashi T., Nature 363, 603 (1993).
[51] Maiti A., Microelectron. J. 39, 208 (2008).
[52] Chico L., Crespi V.H., Benedict L.X., Louie S.G., Cohen M.L., Phys. Rev. Lett. 76, 971 (1996).
[53] Saito R., Fujita M., Dresselhaus G., Dresselhaus M.S., Appl. Phys. Lett. 60, 2204 (1992).
[54] Mintmire J.W., Dunlap B.I., White C.T., Phys. Rev. Lett. 68, 631 (1992).
[55] Blase X., Benedict L.X., Shirley E.L., Louie S.G., Phys. Rev. Lett. 72, 1878 (1994).
[56] Berber S., Kwon Y.K., Tomanek D., Phys. Rev. Lett. 84, 4613 (2000).
[57] Bernholc J., Brenner D., Nardelli M.B., Meunier V., Roland C., Ann. Rev. of Mater. Res. 32, 347 (2002).
[58] Tsang S.C., Chen Y.K., Harris P.J.F., Green M.L.H., Nature 372, 159 (1994).
[59] Terrones H., Terrones M., Hernández E., Grobert N., Charlier J.-C., Ajayan P.M., Phys. Rev. Lett. 84, 1716 (2000).
[60] Terrones H., Terrones M., Morán-López J.L., Current Science 81(8), 1011 (2001).
[61] Report on the scientific results of the voyage of the H.M.S. Challenger during the years 1873–1876, Her Majesty’s Stationery Office, vol. 18, London 1887.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 425-478
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 17
NANOSTRUCTURE DESIGN—BETWEEN SCIENCE AND ART Mircea V. Diudea* Faculty of Chemistry and Chemical Engineering “Babes-Bolyai” University A. Janos Street No. 11, 400028 Cluj-Napoca, ROMANIA
1. Covering/Tiling
1.1. Introduction
Covering a surface by various polygonal or curved regions is an ancient human activity. It occurred in house building, particularly in the decoration of floors, windows and ceilings. Three regular Platonic tessellations (i.e., coverings by a single type of face and a single vertex degree) were well known: (4,4), (6,3) and (3,6), written here as Schläfli symbols [Schläfli, 1901]. The Greek and Roman mosaics were highly appreciated in this respect (Figure 1). It is well established that the covering/tessellation of fullerenes (and of nanostructures in general) dictates the stability and reactivity of these molecules. Coverings and their modifications enable an understanding of the chemical reactions (and their regioselectivity) occurring in nanostructures, particularly in carbon allotropes. In modeling molecules, particularly nanostructures, scientists often use the embedding of a polygonal lattice in a given 3D surface S. Such a “combinatorial” (discretized) surface is called a map (Pisanski and Randić, 2000). Analytical formulas for generating a smooth surface can be found in Mathematical recipes, available on the Internet. The coordinates of the lattice points are obtained by partitioning S, either by dedicated algorithms or by simply drawing vertices and edges on the display, with the aid of some builders to switch from 2D to 3D. Another way uses templates, or unit blocks with a prescribed spatial arrangement. This last technique is also used (or happens naturally) in self-assembly reactions, whether in experiments or in vivo.
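That exactly three regular tessellations of the plane exist can be verified by brute force. The sketch below assumes the Schläfli symbol is read as (face size p, vertex degree q), as in the polyhex (6,3), and uses the condition that q regular p-gons fill 360º around a vertex, which is equivalent to 1/p + 1/q = 1/2. The function name and the search bound are hypothetical.

```python
from fractions import Fraction


def regular_plane_tessellations(max_size: int = 12):
    """Return all (p, q) with 3 <= p, q <= max_size and 1/p + 1/q = 1/2."""
    half = Fraction(1, 2)
    return [
        (p, q)
        for p in range(3, max_size + 1)
        for q in range(3, max_size + 1)
        if Fraction(1, p) + Fraction(1, q) == half
    ]


# Triangles (3,6), squares (4,4) and hexagons (6,3) are the only solutions.
print(regular_plane_tessellations())
```

Exact fractions are used instead of floating point so that the equality test is reliable.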
*
E-mail addresses: [email protected], [email protected]
Crystallography is another domain of great importance, which searches for infinite polyhedral tilings appearing in the crystalline state of matter. Covering is nowadays a mathematically founded science. It makes use of Graph and Set Theory and is often inspired by, or itself inspires, the Arts and Architecture (see the recent books of Diudea, Ed., 2005; and Diudea and Nagy, 2007). In this respect, research groups in the U.S.A. (B. Grunbaum, M. Terrones, M. O’Keeffe, J. D. Klein, G. Heart), Australia (S. T. Hyde, S. Ramsden), Brazil (J.-G. Eon), Italy (D. M. Proserpio, L. Carlucci), the UK (P. W. Fowler, J. Klinowski), Germany (P. E. John), Slovenia (T. Pisanski), Hungary (I. Laszlo), Russia (V. A. Blatov) and Romania (M. V. Diudea) have developed methods for the generation and classification of coverings/fillings of 2D/3D space, with applications in Mathematics, Chemistry and Arts. There exist databases with beautiful galleries (HSAKA, EPINET) of mathematical coverings (RCSR Center of Reticular Structure Resources, Epinet, ESMC-SG European Society of Mathematical Chemistry-Structure Gallery), including online software for the generation and classification of known and novel molecular or crystalline architectures.
Figure 1. Ancient and modern mosaics in Milan, Italy.
It was our goal to develop and present some mathematical methods enabling discrete or periodic coverings/tilings. Although some simple tessellations can be drawn by hand, the more elaborate ones need the aid of a computer. In this respect, the TOPO GROUP Cluj has developed some software programs, based on either well-known or original map operations, which will be presented in Section 1.2. Section 2 describes in detail how cutting procedures change the square tiling (4,4) into the polyhex (6,3), [(4,8)3] or even [(5,7)3] patterns that appear in modeling tubular nanostructures. Section 3 then presents a transformation/isomerization known as the Stone-Wales edge rotation. Section 4 introduces map operations, from simple to composite and generalized ones. The extension of map operations to 3D nets is presented in Section 5.
1.2. Cluj Covering Software
Nanostructure modeling necessarily involves the embedding of a polygonal lattice in a given 3D surface S. Analytical formulas for generating a smooth surface can be found in Mathematical recipes, available on the Internet. The coordinates of the lattice points are obtained by partitioning S, either by dedicated algorithms or by simply drawing vertices and edges on the display, with the aid of some builders to switch from 2D to 3D. Another way uses templates, e.g., unit blocks with a prescribed spatial arrangement. TOPO GROUP CLUJ has developed five main software programs dedicated to polyhedral tessellation and embedment in surfaces of various genera, either as finite or infinite lattices: TORUS (2003), CageVersatile_CVNET (2005), JSCHEM (2005), OMEGA counter (2006) and NANO-Studio (2009).
1.2.1. TORUS
The program called TORUS (2003) is written in Delphi and works as a generator of coordinates for tori of various coverings. It generates tori and tubes, tessellated according to the selected type of cycle (C6 for (6,3), C8 for [(4,8)3] and C7 for [(5,7)3]). The dimensions of the objects are chosen by the user: "c" is the number of atoms in the cross-section of the torus (or of the tube, if "tube" is selected) and "n" is the number of cross-sections along the large hollow of the torus. Twisted tori/tubes can be obtained by using the "twist" window and the number of twisted faces (an even integer). The TORUSN version also generates “phenylenic” C4C6C4C6 objects. H/V means Zigzag/Armchair objects. The TORUSP version allows calculation of the Wiener number (the sum of all distances in the graph) and of the Hosoya polynomial. The HyperChem program is needed to view the HIN files generated by TORUS.
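The role of the two dimensions c and n can be illustrated by embedding the simplest covering, the square (4,4) lattice, on a torus. The sketch below is not the algorithm of the TORUS program, whose internals are not described here; it only applies the usual parametric equations of a torus, and the radii R and r, like the function names, are assumptions made for the illustration.

```python
import math


def torus_vertices(c: int, n: int, R: float = 3.0, r: float = 1.0):
    """Coordinates of c*n vertices: n cross-sections with c atoms each."""
    points = []
    for i in range(n):                      # cross-sections around the large ring
        phi = 2.0 * math.pi * i / n
        for j in range(c):                  # atoms within one cross-section
            theta = 2.0 * math.pi * j / c
            ring = R + r * math.cos(theta)  # distance from the torus axis
            points.append((ring * math.cos(phi),
                           ring * math.sin(phi),
                           r * math.sin(theta)))
    return points


def torus_edges(c: int, n: int):
    """Edges of the (4,4) covering; vertex (i, j) has index i*c + j."""
    edges = []
    for i in range(n):
        for j in range(c):
            v = i * c + j
            edges.append((v, i * c + (j + 1) % c))        # within the cross-section
            edges.append((v, ((i + 1) % n) * c + j))      # to the next cross-section
    return edges


verts = torus_vertices(c=8, n=20)
edges = torus_edges(c=8, n=20)
faces = 8 * 20                              # one square face per vertex
print(len(verts), len(edges), len(verts) - len(edges) + faces)
```

The last printed number is the Euler characteristic V - E + F, which vanishes for a torus, in contrast to the value 2 of sphere-like cages such as the fullerenes.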
1.2.2. CageVersatile, CVNET
The CageVersatile (Net) CVNET (2005) software package, written in C, generates the coordinates of closed or open lattices covering nanostructures. It is built on map operations, with the theoretical support described by Diudea (2004), Diudea and John (2003), Diudea et al. (2003, 2006b) and Stefu et al. (2005). The program works at four levels of complexity:
(1) Simple map operations: medial Med, dual Du, polygonal mapping Pk (k=3,4,5), truncation Tr (of all or only selected atoms), rotate RO (the Stone-Wales, 1986, operation) and split (bisect) edge SP.
(2) Complex map operations: Leapfrog Le1,1, Chamfering/Quadrupling Q2,0, Capra/Septupling Ca2,1 (or S1), Septupling 2 and Flower disjunction.
(3) Generalized map operations: Leapfrog Le2,2 (and Le2,2C), Quadrupling Q3,0, Quadrupling Q4,0, Capra Ca3,2 (and Ca3,2C) and Flower disjunction.
(4) Net operations, activated by the button “take all cycles”, which allows operations to be performed on nets by considering all the hard rings in a structure. The operations available with this button are: dual, medial, stellation, truncation, leapfrog, quadrupling and capra.
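The effect of the simplest map operations on the size of a map can be followed by counting vertices, edges and faces. The sketch below is only a counting illustration, not the CVNET implementation: it uses the standard relations for the dual, medial and truncation of a map with (v, e, f) vertices, edges and faces, and it takes the leapfrog as the truncation of the dual. The function names are hypothetical.

```python
def dual(v: int, e: int, f: int):
    """Dual: faces become vertices and vertices become faces."""
    return (f, e, v)


def medial(v: int, e: int, f: int):
    """Medial: one new vertex per edge of the parent map."""
    return (e, 2 * e, v + f)


def truncation(v: int, e: int, f: int):
    """Truncation: every vertex is cut off and replaced by a new face."""
    return (2 * e, 3 * e, v + f)


def leapfrog(v: int, e: int, f: int):
    """Leapfrog, taken here as the truncation of the dual."""
    return truncation(*dual(v, e, f))


dodecahedron = (20, 30, 12)     # the C20 cage
cube = (8, 12, 6)
print(leapfrog(*dodecahedron))  # counts of the C60 cage
print(medial(*cube))            # counts of the cuboctahedron
print(truncation(*cube))        # counts of the truncated cube
```

For a trivalent map 2e = 3v, so the leapfrog triples the number of atoms, which is how C20 leads to C60.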
1.2.3. JSChem
The JSChem software package (2005), written in JavaScript, works as an assistant program (it generates script input and/or collects the output files of other programs) in the fields of molecular topology, crystallography and quantum chemistry. The program was developed in four directions: (1) HyperChem: Script general (provides a script list for HyperChem); Homo-Lumo Gap (extracts information from the semiempirical log files); All Aromatic (makes all C atoms of "C A" type and puts the barycentre of the molecule); MolColor (puts colors by changing C with heteroatoms); Hin2Pdb (converts a .hin file into .ent = .pdb format, preserving connectivity); (2) NanoUtils: POAV1 (calculates the strain of sp2 C, in kcal/mol per atom, for all or only for marked atoms in a molecule); HOMA (computes aromaticity from the actual (optimised or crystallographic) geometry, both as global and local data); Stone-Wales (rotation of marked edges); Kekule-Clar (depicts double bonds at a selected threshold, and disjoint hexagons when a structure shows a perfect Clar PC structure) and Schlegel (draws plane projections of polyhedra/fullerenes, according to the marked face); (3) Topology: Homeomorf (puts k points, either C or heteroatom, on the marked edges); Medial (operates Medial on a trivalent map); Truncation (operates Truncation on a trivalent map); Dual (operates Dual on a trivalent map); MolConnect (connects points by "joining" and faces by "identification" in assembling repeat units in molecular architectures); and (4) NanoGen: TubeGen (generates nanotubes or series of nanotubes, cf. the (n,m) nomenclature). Input and output of this program are also in the HIN format of the HyperChem program.
1.2.4. Omega Counter
The program OMEGA counter (2006) is written in Visual J#. It is used for generating the Omega polynomial (of opposite edge strips). The program works with .hin files of HyperChem; the HyperChem program can be downloaded as an evaluation version. The input files are stored in a subfolder named "hin" of the Omega program main folder; all the HyperChem files in the "hin" folder will be input files. To use this program one sets: the maximum cycle length; the maximum vertex degree; and the structure type (Planar structures, Fullerenes, Net, Torus with 5-7 covering, Torus with 6 covering, Torus with 4-8 covering). The program was developed on the basis of our original Omega polynomial, presented in Diudea (2006, 2009), Diudea et al. (2006a, 2009), Vizitiu et al. (2006).
1.2.5. NANO-Studio
This program (2009) is written in C# and can be seen as a new version of JSChem. In addition to the functions of the old version, it provides a complete topological characterization of a graph (the number of vertices, edges and faces, and their equivalence classes), aromaticity index calculation and Omega polynomial calculation on cages and nets. Other software programs exist, e.g., SYSTRE, TOPOS, EPINET, or that used by G. Hart, which readers can use for performing various tessellations and tilings.
2. Cutting Procedures on (4,4) Tessellation
Covering the cylinder or the torus by hexagons is most often achieved by the graphite zone-folding procedure, as presented in Bovin et al. (2001), Ceulemans et al. (1998), Kirby et al. (1993), Marusic and Pisanski (2000). The method defines an equivalent planar parallelogram on the graphite sheet and identifies a pair of opposite sides to form a tube. Finally, the two ends of the tube are glued together to form a torus. A second procedure uses the so-called topological coordinates, extracted from the adjacency matrix eigenvectors, see Graovac et al. (2000), Laszlo and Rassat (2001, 2003), Pisanski and Shawe-Taylor (2000).
Construction of polyhex (6,3) nanotubes and tori from square-tiled (4,4) objects, as developed by Diudea (2002a, 2002b, 2005) and Diudea and Nagy (2007) at TOPO GROUP Cluj, is a third main route. S. Iijima (1991) first reported graphitic carbon nanotubes (see also Endo et al., 1996). Next, a clear distinction between single-walled nanotubes (SWNT) and multi-walled ones was made. Depending on their diameter and chirality, nanotubes can be metallic or semiconducting. Nowadays, among the newly discovered nanocarbon allotropes, nanotubes have found the most important applications, in nanoelectronics and materials science. "Circle crops" structures were first observed by Liu et al. (1997) and then by other groups, see Ahlskog et al. (1999), Martel et al. (1999a, 1999b). Martel et al. (1999b) argued that the observed rings were coils rather than perfect tori, but these structures have continued to attract a multitude of theoretical studies dealing with the construction and the mathematical and physical properties of graphitic tori, see Babic et al. (2001), Johnson et al. (1994), Kirby (1993), Lin and Chu (1998), Meunier (1998). We implemented the cutting and graphite zone-folding procedures in our original programs TORUS and JSChem.
2.1. Square (4,4) Lattice
The embedding of the (4,4) net is made by circulating a c-fold cycle, circumscribed to the toroidal tube cross-section of radius r, along the cylinder or around the large hollow of the torus, of radius R > r (Figure 2). The n equally spaced images of the c-fold cycle are then joined with edges, point by point, to form a polyhedral cylinder or torus tiled by a tetragonal pattern. In the case of the torus, the position of each of the n images of the "circulant" around the central hollow is characterized by the angle θ, while the angle ϕ locates the c points across the tube. In all, c×n points are generated.
Figure 2. Embedding of the (4,4) net in the torus. Panels: toroidal parameters; the (4,4) covering.
The parameters R and r are not directly involved in the topological characterization of the lattice embedded in the torus (see our books: Diudea, Ed., 2005; and Diudea and Nagy, 2007). The formulas for drawing the smooth toroidal surface were:
$$
P(x, y, z):\quad x = \cos\theta\,(R + r\cos\varphi),\quad y = \sin\theta\,(R + r\cos\varphi),\quad z = r\sin\varphi
$$

$$
\theta_i = \frac{2\pi i}{n},\; i = 0,\dots,n-1; \qquad \varphi_j = \frac{2\pi j}{c},\; j = 0,\dots,c-1 \tag{1}
$$
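As an illustration of Eq. (1), the following minimal Python sketch generates the c×n points of the (4,4) net on a torus. The function name `torus_points` and the default radii R = 3 and r = 1 are assumptions made here for the example; they are not taken from the TORUS program.

```python
import math

def torus_points(c, n, R=3.0, r=1.0):
    """Return the c*n points of the (4,4) net on a torus, following Eq. (1).

    R and r are illustrative radii (assumed values, with R > r).
    """
    points = []
    for i in range(n):                      # n images of the c-fold cycle
        theta = 2.0 * math.pi * i / n
        for j in range(c):                  # c points across the tube
            phi = 2.0 * math.pi * j / c
            x = math.cos(theta) * (R + r * math.cos(phi))
            y = math.sin(theta) * (R + r * math.cos(phi))
            z = r * math.sin(phi)
            points.append((x, y, z))
    return points

pts = torus_points(c=8, n=24)
print(len(pts))   # 192 = c * n
print(pts[0])     # (4.0, 0.0, 0.0), the point at theta = phi = 0
```

Since R and r do not enter the topological characterization, any pair with R > r gives the same net; only the drawing changes.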
For the cylinder, the procedure is similar. The squares can be changed to hexagons (or other tiling patterns, suitable from chemical point of view) by appropriate edge-cutting, as described in Diudea (2002a-c), Diudea and Kirby (2001), Diudea et al. (2003), or by performing some map operations (see below).
2.2. Polyhex (6,3) Covering
To obtain the (6,3) tessellation, each second edge of the (4,4) net has to be cut off. Two isomeric embeddings can be defined in the torus T (Figure 3):
(i) The H-embedding (Figure 3-a), when the cut edges lie horizontally (i.e., perpendicular to the Z axis of the torus). It is also called "zigzag" Z, after the shape of the tube cross-section.
(ii) The V-embedding (Figure 3-b), when the cut edges lie vertically (i.e., parallel to the Z axis of the torus). It is also called "armchair" A, after the aspect of the tube cross-section.
Figure 3. The (6,3) covering by H/Z (a) and V/A (b) cutting of the (4,4) net. To obtain a torus, the two pairs of opposite edges have to be identified. Panels: (a) T(6,3)H/Z-embedding; (b) T(6,3)V/A-embedding.
Twisted, chiral, (4,4) tori can be generated by the following two procedures: (1) twisting the horizontal layer connections (Figure 5 -a and Figure 6) and (2) twisting the vertical layer (offset) connections (Figure 5 -b).
Accordingly, two topologically distinct tori are obtained by the TORUS program: T(6,3)H/Z[c,n] and T(6,3)V/A[c,n]; they correspond to two different classes of aromatic chemical compounds, phenacenes and acenes, respectively. The name of such tori is a string of characters including the tiling, the type of embedding and the tube dimensions [c,n]. Note that each hexagon consumes exactly two squares of the (4,4) net. By construction, the number of hexagons in the (6,3)H/Z isomer is half the number of squares on dimension c of the (4,4) torus. The same is true for the (6,3)V/A isomer, but on dimension n. Thus, T(6,3)H[2c,n] has the same number of hexagons as its embedding isomer T(6,3)V[c,2n]. We focused here on the procedure for the toroidal embedding because of its higher complexity. The procedure for the cylindrical embedding is straightforward; moreover, the corresponding nanotubes can be generated by simply cutting the toroidal tube across.
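The cutting step can be sketched in a few lines of Python. The sketch below builds the edges of a (4,4) torus and removes every second edge between neighbouring cross-sections, keeping the c-fold cycles intact. The specific rule used here (keep the edge starting at point j of cross-section i when i + j is even) is an assumption chosen for illustration; the text does not state which rule the TORUS program applies. The function names are likewise hypothetical.

```python
def square_torus_edges(c, n):
    """Edges of the (4,4) torus; vertex (i, j) is point j of cross-section i."""
    ring = [((i, j), (i, (j + 1) % c)) for i in range(n) for j in range(c)]
    link = [((i, j), ((i + 1) % n, j)) for i in range(n) for j in range(c)]
    return ring, link

def polyhex_cut(c, n):
    """Cut every second inter-section edge (assumed rule: keep it when i + j is even)."""
    if c % 2 or n % 2:
        raise ValueError("c and n must be even for this cutting rule")
    ring, link = square_torus_edges(c, n)
    kept = [edge for edge in link if (edge[0][0] + edge[0][1]) % 2 == 0]
    return ring + kept

def degrees(edges):
    """Vertex degrees of an edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

c, n = 8, 24
edges = polyhex_cut(c, n)
v, e = c * n, len(edges)
f = e - v                       # from v - e + f = 0 on the torus
print(set(degrees(edges).values()))   # {3}: every vertex is trivalent
print(v, e, f)                        # 192 288 96
```

With this rule the face count is cn/2, i.e. c/2 hexagons per pair of neighbouring cross-sections, in agreement with the statement that each hexagon consumes two squares.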
Figure 4. An H-twisted (a) and a V-twisted (offset - b) (4,4) net. Panels: (a) T(4,4)H2[8,24]; (b) T(4,4)V2[8,24].
Figure 5. Twisted (4,4) pattern (a) and its (6,3) derivative (b). Panels: (a) (4,4) net; (b) (6,3) net.
Figure 6. The (6,3) covering in the cylindrical (a) and toroidal (b) embedding, respectively. Panels: (a) Tu(6,3)V/A[12,12], v = 144; (b) T(6,3)H/Z[12,50], v = 600.
After optimization by a Molecular Mechanics procedure, the generated polyhex tubes and tori look like those in Figure 6. The objects in these examples represent non-chiral structures.
2.2.1. Topology of Polyhex Tori
Each of the above twistings superimposes on the two basic cuttings, thus resulting in four classes (Diudea Ed. 2005) of twisted tori: (i) H-twist, H-cut, HHt[c,n]; (ii) H-twist, V-cut, HVt[c,n]; (iii) V-twist, H-cut, VHt[c,n]; and (iv) V-twist, V-cut, VVt[c,n]. The type of cutting dictates the type of embedding and, ultimately, the shape of the objects. Conversely, the type of twisting is involved in the π-electron structure of polyhex tori. Figure 7 illustrates some (non-optimized) twisted polyhex tori. The twist number t is just the deviation (in number of hexagons) of the chiral (i.e., rolling-up) vector from the zigzag line, in the graphite sheet representation (Hamada et al. 1992). Accordingly, a torus can be drawn as an equivalent planar parallelogram, involving two tubes: one tube is built on the rolling-up vector R (Figure 8), which, in terms of the primitive graphite lattice vectors, is written as:
$$
\mathbf{R} = k\,\mathbf{a}_1 + l\,\mathbf{a}_2 \tag{2}
$$

The second tube is formally defined on the translating vector T:

$$
\mathbf{T} = p\,\mathbf{a}_1 + q\,\mathbf{a}_2 \tag{3}
$$

Figure 7. The four classes of twisted polyhex tori (non-optimized geometry). Panels: T(6,3)HH2[8,24]; T(6,3)VH2[8,24] (offset); T(6,3)HV2[8,24]; T(6,3)VV2[8,24] (offset).
For a given torus, the first tube can be identified by cutting the object across the tube, while the second one results from cutting it around the large hollow. Thus, a four-integer parameter description (k,l,p,q) can be written. The coordinates of the THH4[14,6] torus depicted in Figure 8 are (Diudea, Ed., 2005): (5, -4, 3, 3). Note that this representation is not unique and is reducible to the three-parameter notation theorized by Kirby et al. (1993, 1998). The correspondence between our notation for tubes (TUXt[c,n]) and tori (TXt[c,n]) and the two-integer (k,l) and four-integer (k,l,p,q) notations is given in Tables 1 and 2, respectively.
Figure 8. Representation of the torus THH4[14,6] by an equivalent parallelogram defining two tubes: one defined on the rolling-up vector R (with integer coordinates (k,l)) and the other on the translating vector T (given by the pair (p,q)). The four-parameter specification of the depicted torus is (5, -4, 3, 3). Drawing labels: lattice vectors a1 and a2; R = (5,-4); T = (3,3).
Table 1. Correspondence between the TUXt[c,n] and (k,l) notation in nanotubes

| No. | Tube Xt | (c,t); (k,l) | Tube (k,l) |
|---|---|---|---|
| 1 | H | c = 2k; t = l = 0 | (c/2, 0); Z |
| 2 | HHt | c = k + 2l; t = k; l = (c − t)/2 | [t, (c − t)/2] |
| 3 | HVt | c = k + l; t = k; l = c − t | [t, (c − t)] |
| 4 | V | c = 2k; t = l = 0 | (c/2, c/2); A |

Table 2. Correspondence between the TXt[c,n] and (k,l,p,q) notation in tori

| No. | Torus Xt | Tube R | Tube T | Torus (k,l,p,q)* | v |
|---|---|---|---|---|---|
| 1 | H | H/Z | V/A | (c/2, −c/2, n/2, n/2) | 2(kq − lp) = cn |
| 2 | V | V/A | H/Z | (c/2, c/2, n/2, −n/2) | 2(lp − kq) = cn |
| 3 | HHt | HH (tw) | V/A | [(c − t)/2, −t, n/2, n/2] | 2(2kq − lp) = cn |
| 4 | HVt | HV (tw) | H/Z | [(c + t)/2, (c − t)/2, n/2, −n/2] | 2(lp − kq) = cn |
| 5 | VHt (offset) | H/Z | HV (tw) | [c/2, −c/2, (n + t)/2, (n − t)/2] | 2(kq − lp) = cn |
| 6 | VVt (offset) | V/A | HH (tw) | [c/2, c/2, (n − t)/2, −t] | 2(2lp − kq) = cn |

*The first pair (k,l) denotes the rolling-up vector R, while the last pair (p,q) specifies the translating vector T. The representation (m,-m) = (m,0) is an H/Z-tube, while (m,m) is a V/A-tube (see Figure 7).
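The entries of Table 2 can be checked numerically. The sketch below is a plain transcription of the (k,l,p,q) column and of the vertex-count column into Python; the function names `torus_klpq` and `vertex_count` are introduced here for illustration only. Exact rational arithmetic is used so that the halves in the table cause no rounding.

```python
from fractions import Fraction as F

def torus_klpq(kind, c, n, t=0):
    """(k, l, p, q) of a torus TXt[c,n], transcribed from Table 2."""
    c, n, t = F(c), F(n), F(t)
    table = {
        "H":   (c / 2, -c / 2, n / 2, n / 2),
        "V":   (c / 2, c / 2, n / 2, -n / 2),
        "HHt": ((c - t) / 2, -t, n / 2, n / 2),
        "HVt": ((c + t) / 2, (c - t) / 2, n / 2, -n / 2),
        "VHt": (c / 2, -c / 2, (n + t) / 2, (n - t) / 2),
        "VVt": (c / 2, c / 2, (n - t) / 2, -t),
    }
    return table[kind]

def vertex_count(kind, k, l, p, q):
    """Number of vertices v, transcribed from the last column of Table 2."""
    formula = {
        "H":   2 * (k * q - l * p),
        "V":   2 * (l * p - k * q),
        "HHt": 2 * (2 * k * q - l * p),
        "HVt": 2 * (l * p - k * q),
        "VHt": 2 * (k * q - l * p),
        "VVt": 2 * (2 * l * p - k * q),
    }
    return formula[kind]

# The torus THH4[14,6] of Figure 8
klpq = torus_klpq("HHt", c=14, n=6, t=4)
print(tuple(map(int, klpq)))               # (5, -4, 3, 3)
print(int(vertex_count("HHt", *klpq)))     # 84 = c * n
```

For every row, the vertex count reduces algebraically to cn, which is a convenient consistency check on a transcribed parameter set.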
Encoding the type of tessellation can be done, for example, by the α-spiral ring code, which was first proposed for coding and constructing spherical fullerenes (Manolopoulos et al. 1991; Brinkmann and Fowler 1998). We adapted the spiral ring code for tubular structures (Diudea, 2002c). In a periodic tubular net, the ring code brings information on size and sequence of faces (i.e., repeat units) and embedding of the actual net on the parent (4,4)[c,n] lattice. The ring code for the polyhex tori is given in Table 3.
Table 3. Ring code of polyhex tori

| No. | Torus | Ring code |
|---|---|---|
| 1 | H/Z[c,n] | $[6^{c/2}]^{n}$ |
| 2 | HH[c,n] | $[(6^{c/2})^{t}]^{n/t}$ |
| 3 | HV[c,n] | $[(6^{c})^{t}]^{n/2t}$ |
| 4 | VH[c,n] | $[(6^{n})^{t}]^{c/2t}$ |
| 5 | VV[c,n] | $[(6^{n/2})^{t}]^{c/t}$ |
| 6 | V/A[c,n] | $[6^{c}]^{n/2}$ |
The number t inside the brackets equals the helicity, while the number outside the brackets gives the steps of a helix. Note that the helicity could be less than t, if an integer number of steps appears. A different topological description of polyhex tori is possible by means of the Omega polynomial (Diudea 2006, 2009; Diudea et al. 2006a, 2009; Vizitiu et al. 2007).
2.2.2. π-Electronic Structure of Polyhex Tori
In spectral theory, at the simple π-only Hückel (1931) level of theory, the energy of the ith molecular orbital, Ei = α + xiβ, is evaluated by calculating the roots xi of the characteristic polynomial Ch(G,x), i.e., the eigenvalues of the adjacency matrix associated with the hydrogen-depleted molecular graph.

Table 4. Covering criteria for metallic character in polyhex tori

| No. | Torus | Metallic |
|---|---|---|
| | Non-twisted | |
| 1 | H/Z[c,n] | 0 mod (c,6) |
| 2 | V/A[c,n] | 0 mod (n,6) |
| | H-twisted | |
| 3 | HHt[c,n] | 0 mod (c,6) |
| 4 | HVt[c,n] | 0 mod (n,6); 0 mod (t,6) |
| | V-twisted | |
| 5 | VHt[c,n] | 0 mod (c,6); 0 mod (t,6) |
| 6 | VVt[c,n] | 0 mod (n,6) |
The π-electronic shells of neutral graphitic objects are classified (Fowler 1990, 1997), according to their eigenvalue spectra, as closed, when x_{v/2} > 0 ≥ x_{v/2+1}, or open, when the HOMO and LUMO molecular orbitals are degenerate, x_{v/2} = x_{v/2+1}. The metallic character involves the existence of a zero HOMO-LUMO gap (a particular case of the open shell) and the degeneracy of some non-bonding orbitals, NBOs (Yoshida et al. 1997), favoring the spin multiplicity, cf. the Hund rule. In polyhex tori, the metallic behavior is ensured by four NBOs, also present in the graphite sheet. The gap (in β units) is taken as the absolute value of the difference E_HOMO − E_LUMO. Table 4 gives the lattice [c,n] criteria (in terms of our notation) for a metallic shell in (6,3) tori of various types.
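The closed/open classification can be illustrated on a small case. The sketch below applies the two criteria quoted above to the Hückel spectra of simple rings, for which the eigenvalues are known in closed form, x_k = 2cos(2πk/N). Using rings instead of tori is a choice made here to keep the example free of any matrix diagonalisation library; the function names and the tolerance `tol` are assumptions of this example, and spectra fitting neither criterion are simply labelled "other".

```python
import math

def cycle_spectrum(n_atoms):
    """Hückel eigenvalues x_k = 2 cos(2 pi k / N) of an N-membered ring, sorted descending."""
    values = [2.0 * math.cos(2.0 * math.pi * k / n_atoms) for k in range(n_atoms)]
    return sorted(values, reverse=True)

def classify_shell(spectrum, tol=1e-9):
    """Classify the pi-shell of a neutral system with an even number v of atoms."""
    v = len(spectrum)
    homo = spectrum[v // 2 - 1]     # x_{v/2}
    lumo = spectrum[v // 2]         # x_{v/2+1}
    gap = abs(homo - lumo)          # HOMO-LUMO gap in beta units
    if gap < tol:
        kind = "open"               # degenerate HOMO and LUMO
    elif homo > tol and lumo <= tol:
        kind = "closed"             # x_{v/2} > 0 >= x_{v/2+1}
    else:
        kind = "other"
    return kind, gap

for size in (6, 4):
    kind, gap = classify_shell(cycle_spectrum(size))
    print(size, kind, round(gap, 6))
# 6 closed 2.0
# 4 open 0.0
```

The same two functions apply unchanged to the sorted adjacency spectrum of a torus, once it has been computed by any eigenvalue routine.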
2.2.3. Identical Polyhex Tori by the Cutting Procedure
Toroidal objects generated by the TORUS software, even when correctly named to reflect different embeddings, may represent one and the same graph. Let us consider normal tori (i.e., those with c < n). Their net dimensions can be written as: n = c + r; r = 0, 2, .., c − 2, c, ..; and t = 0, 2, .., c, .. . Investigating the spectra of the characteristic polynomial of families of polyhex tori led to the following Rules of Valencia (see the book of Diudea and Nagy, 2007):
(i) The maximum value of t to provide distinct topological objects, in a family of H-twisted polyhex tori, equals n/2.
(ii) The maximum value of t to provide distinct topological objects, in a family of V-twisted polyhex tori, equals [...]

Table 5. Identical graphs in families of twisted polyhex tori according to the Rules of Valencia

| No. | Case | Characters |
|---|---|---|
| 1 | General | n = c + r; r = 0, (2, .., c − 2), c, ..; t = 0, 2, .., c, .. |
| 2 | H-twist | t = 0, 2, .., n/2; distinct objects |
| 2a | H-twist; Case: r = c | n = 2c → t_max = c; all distinct objects |
| 2b | H-twist; Case: r = 0 | n = c → V-twist |
| 2c | H-twist; Case: 0 [...] | t = 0, 2, .., (r − 2); distinct objects. t = r + 0, 2, 4, .., (c − r)/2 ≡ c − 0, 2, 4, .., (c − r)/2; distinct pairs |
| 3 | [...] | t = 0, 2, .., c/2; distinct objects |
| 3a | [...] | t = 0 + 0, 2, 4, .., c/2 ≡ c − 0, 2, 4, .., c/2; distinct pairs |

When c/2 or n/2 are odd, then t_max = t − 1.
Since, by construction of the H-twisted polyhex tori, the maximum possible t-value is t_max = c, the immediate consequence of Rule (i) is that n must be at most twice c for all objects to be topologically distinct. Higher values of n will only repeat the already generated structures. When n/2 < c, some duplicate objects appear, as indicated in Table 5.
In the case of V-twisted polyhex tori, t does not depend on the ratio c/n; since t can take values up to n (by construction), with periodicity at k(c/2), k = 1, 2, .., Rule (ii) was formulated for t = c at most. Details are given in Table 5. In non-twisted tori, the HOMO-LUMO gap remains constant at a given c value (in the H/Z[c,n] series) or n value (in the V/A[c,n] series). The same is true for the twisted tori, according to the second capital letter in their name. Note the identity of the graphs H[c,n] = V[n,c], which differ only in their embedding. At c = n, H[c,c] = V[c,c] and HVt[c,c] = VHt[c,c].
2.3. Other Coverings by the Cutting Procedure
2.3.1. Bathroom Floor and Pentaheptite Tessellations
A ((4,8)3)S (bathroom floor) covering can be derived from the tetragonal (4,4) covering by deleting appropriate edges (Figure 9). The cut edges lie either horizontally or vertically, which results in two embedding isomers; however, the net is isotropic (see Diudea, Ed., 2005). The letter "S" comes from the "square" shape of the tetragon in the optimized objects (Figure 10).
Figure 9. Non-graphitic coverings in the toroidal embedding. Panels: ((4,8)3)S-pattern; ((5,7)3)SP-pattern.
According to the building procedure, the following relations account for the involved toroidal embedding isomers:
$$
((4,8)3)\mathrm{HS}[c,n] = ((4,8)3)\mathrm{VS}[c/2,\,2n] \tag{4}
$$

$$
((4,8)3)\mathrm{VS}[c,n] = ((4,8)3)\mathrm{HS}[2c,\,n/2] = ((4,8)3)\mathrm{VS}[n/2,\,2c] \tag{5}
$$
Construction of a ((5,7)3)SP pattern (“SP” comes from “spiral” ) needs, in our procedure, four vertex rows (i.e., a 0 mod(c,4) lattice). It is called the pentaheptite tessellation and shows a local (t5, t7) signature of (1,3) type. Several ((5,7)3) lattices with various local signatures have been proposed, see Deza et al. 2000, Diudea et al. 2003, Kirby 1994.
Figure 10. Non-graphitic optimized tori. Panels: ((4,8)3)SH[20,64], N = 1280; ((5,7)3)SPH[20,64], N = 1280.
Table 6. Ring Code in Pentaheptite Tori

| No. | Torus | Ring code |
|---|---|---|
| 1 | ((5,7)3)SPH | $[(5\,7)^{c/4}, (7\,5)^{c/4}]^{n/4}$ |
| 2 | ((5,7)3)SPV | $[(5\,7)^{c/2}, (7\,5)^{c/2}]^{n/8}$ |
Rings of size 5 and 7 were also used for curvature matching and strain relief in polyhex tori, see Babic et al. 2001, Iijima et al. 1992, Itoh and Ihara 1993, Itoh et al. 1993. Such coverings can also be drawn by the Stone-Wales transformation of a (6,3) net, see Deza et al. 2000, Diudea et al. 2003. The ring code in pentaheptite tori is given in Table 6.
2.3.2. Phenylenic and Naphthylenic Covering Phenylenic patterns: ((4,6,8)3)H/V and ((4,6,8)3)HX/VX can be derived from the tetragonal (4,4) net (Figure 11 – Diudea 2002b). Their local signature is: t4j(0, 2, 2); t6j(2, 0, 4); and t8j(2, 4, 2), j = 4, 6, 8. The resulting toroidal objects differ in orientation and number of the (4,6,8) repeat units. Figure 12 illustrates some phenylenic tori, as optimized objects.
Figure 11. Phenylenic covering in the toroidal embedding. Panels: ((4,6,8)3)H-net; ((4,6,8)3)HX-net.
Phenylenic chemical structures have been synthesized by the group of Vollhardt, see Berris et al. 1985, Vollhardt 1993, Vollhardt and Mohler 1966. Some of the phenylenic tori were predicted to have metallic character (Diudea 2002b).
Figure 12. Optimized phenylenic tori. Panels: ((4,6,8)3)H[12,96], N = 1152; ((4,6,8)3)HX[12,96], N = 1152.
Figure 13. Naphthylenic covering in toroidal embedding. Panels: ((4,6,8)3)HN-pattern; ((4,6,8)3)HNX-pattern.
In close analogy to the phenylenic net, we proposed (Diudea 2002b) the naphthylenic net, with the sequence: C6, C6, C4,..., C6, C6, C4. The HN/VN lattice has the local signature: ((4,6,8)3)HN, t4j(0, 2, 2); t6j(1, 3, 2); and t8j(2, 4, 0), while the net HNX/VNX has the signature: t4j(0, 4, 0); t6j(1, 3, 2); and t8j(0, 8, 0), with j = 4, 6, 8 (Figure 13). This last net can be obtained from the square (4,4) lattice by a double leapfrog operation. Figure 14 illustrates two optimized naphthylenic tori.
Figure 14. Optimized naphthylenic tori. Panels: ((4,6,8)3)HN[25,80], v = 2000; ((4,6,8)3)HNX[16,90], v = 1440.
Insulating behavior is the major trend of the π-electronic shell of naphthylenic toroidal nanostructures in the case of even-dimensional objects. On the contrary, odd-dimensional objects show an open-shell structure. The X-net embedded in the torus shows a properly closed shell (Diudea 2002c). The ring codes of phenylenic and naphthylenic tori are given in Table 7.
Table 7. Ring Code in Phenylenic Tori

| No. | Torus | Ring code |
|---|---|---|
| 1 | HPH | $[(4\,6\,8)^{c/3}]^{n/2}$ |
| 2 | VPH | $[(4\,8\,6)^{c/2}]^{n/3}$ |
| 3 | HPHX | $[(4\,8\,6)^{c/3}]^{n/2}$ |
| 4 | VPHX | $[(4\,6\,8)^{c/2}]^{n/3}$ |
| 5 | HNP | $[(4\,6\,6\,6\,8)^{c/5}]^{n/2}$ |
| 6 | VNP | $[(4\,8\,6)^{c/2}\,6^{c}]^{n/5}$ |
| 7 | HNPX & VNPX | $[(4\,6^{3})^{c/3}\,(8\,6)^{c/3}]^{n/4}$ |
3. Stone-Wales Isomerization in Nanostructures
A given tessellation, embedded in the sphere, torus or cylinder, can be modified such that the number of points (i.e., atoms) is preserved; the only change allowed, in the following, is in the connectivity. There is a well-known edge rotation procedure, proposed by Stone and Wales (1986). The bold edge (Figure 15) is shared by two cycles of size (s_m, s_n), which are reduced, after rotation, to (s_m − 1, s_n − 1). Correspondingly, the two cycles joined by this edge increase in size from (s_p, s_r) to (s_p + 1, s_r + 1) in the Stone-Wales (SW) isomerization (Diudea, 2003b). Often, the edge flipping runs as a cascade of SW transformations.
Figure 15. Stone-Wales edge rotation can change: the position (a) or the size (b) of polygons.
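The bookkeeping of ring sizes in a single SW rotation can be written down directly. The sketch below only tracks the four ring sizes around the rotated edge, as described above; it does not manipulate a molecular graph, and the function name `stone_wales` is introduced here for illustration.

```python
def stone_wales(shared, joined):
    """Ring sizes around an edge after one Stone-Wales rotation.

    shared: sizes (s_m, s_n) of the two rings sharing the edge; each loses one vertex.
    joined: sizes (s_p, s_r) of the two rings joined by the edge; each gains one vertex.
    """
    s_m, s_n = shared
    s_p, s_r = joined
    if min(s_m, s_n) <= 3:
        raise ValueError("a ring sharing the edge is too small to lose a vertex")
    return (s_m - 1, s_n - 1), (s_p + 1, s_r + 1)

# Four hexagons of a (6,3) net: the sizes of the polygons change
print(stone_wales((6, 6), (6, 6)))   # ((5, 5), (7, 7))

# Two hexagons sharing the edge, two pentagons at its ends:
# only the positions of the polygons change
print(stone_wales((6, 6), (5, 5)))   # ((5, 5), (6, 6))
```

In both cases the sum of the four ring sizes is unchanged, which reflects the conservation of the number of atoms and edges.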
This section presents several examples of SW rotations as possible isomerization routes in real experiments.
3.1. The Coalescence Process
Recently, hybrid structures consisting of fullerene molecules encapsulated in single-walled nanotubes, called nanopeapods, have been evidenced experimentally by high-resolution transmission electron microscopy (HRTEM). The symbol for such structures is inspired by that of the endohedral metal-doped fullerenes (Singh et al. 2004): for example, (C60)n@SWNT is the name of the C60-nanopeapod.
On exposure to an electron beam, fullerenes inside the nanotube coalesce into larger capsules, capped by C60 halves. The length of such capsules corresponds to three to four fullerene units, and the diameter is constrained by the outer nanotube dimensions (see Diudea and Nagy 2007, Fang et al. 1998).
Figure 16. From the [2+2] cycloadduct of C60, to peapods and double walled nanotube DWNT. Panels: (a) to (d).
Topological analysis suggests that the merging process of two fullerene molecules can be achieved by a sequence of SW bond rotations. The first step in the C60 coalescence is the formation of a [2+2] (sp3-joined) cycloadduct (Figure 16-a). An intermediate step is the formation of an all-sp2 peanut-shaped structure (Figure 16-b); subsequent SW bond flipping, in a circumferential order, transforms the heptagon-pentagon junction into a hexagonal (6,3) covering (Figure 16-c). Finally, the coalescence process leads to a double-walled nanotube, DWNT (Figure 16-d).
3.2. Isomerization of Peanut Tubulenes By cutting off the polar ring k of a spherical fullerene, a fullerene-like cap is obtained. Such a cap fits to a Z nanotube. Figure 17 illustrates two peanut-shaped fullerenes with different nanotubes (observe the complete description of tube covering) distancing the two caps.
Figure 17. Peanut-shaped kf-tubulenes. Panels: (a) $C_{168}(6\,6^{6}\,(5\,6)^{6}\,(6\,5)^{6}\,7^{6}\,6^{6}\,6^{6}\,7^{6})$ (D6d); (b) $C_{168}(6\,6^{6}\,(5\,6)^{6}\,(6\,5)^{6}\,7^{6}\,(5\,7)^{3}\,(7\,5)^{3}\,7^{6})$ (S6).

Figure 18. SW isomerization of the necks of peanut cages, from the polyhex $C_{72}(7^{6}\,6^{6}\,6^{6}\,7^{6})$ to the azulenic $C_{72}(7^{6}\,(5\,7)^{3}\,(7\,5)^{3}\,7^{6})$ covering, and the corresponding geodesic projections. Panels: $C_{72}(7^{6}\,6^{6}\,6^{6}\,7^{6})$; $C_{72}(7^{6}\,5\,6^{4}\,7\,7\,6^{4}\,5\,7^{6})$; $C_{72}(7^{6}\,5\,6^{2}\,7\,5\,7^{2}\,6^{2}\,5\,7\,5\,7^{6})$; $C_{72}(7^{6}\,(5\,7)^{3}\,(7\,5)^{3}\,7^{6})$.

The peanut in Figure 17-a, $C_{168}(6\,6^{6}\,(5\,6)^{6}\,(6\,5)^{6}\,7^{6}\,6^{6}\,6^{6}\,7^{6})$ (D6d), isomerizes to the peanut $C_{168}(6\,6^{6}\,(5\,6)^{6}\,(6\,5)^{6}\,7^{6}\,(5\,7)^{3}\,(7\,5)^{3}\,7^{6})$ (S6), Figure 17-b, with azulenic covering on the distancing tube, achieved by (three) Stone-Wales (SW) edge rotations (Diudea et al. 2005) (see the necks and their geodesic projections in Figure 18). The azulenic ((5,7)3) covering, well known for the aromatic conjugation in such systems (Diudea Ed. 2005, Diudea and Nagy 2007), appears (from semiempirical PM3 calculations) as a stabilizing factor, even competing with the polyhex covering. The cages in Figure 19 have an even shorter distancing tube (one (a) and zero (b) atom rows, respectively) between the two caps. The name of such tubulenes includes the code for the cap and the distancing zone up to the second cap, if the two caps are identical, or the full description, if they are different. The cage C108 has been observed experimentally as a peapod (Hernandez et al. 2003). At moderate temperature and pressure (300°C and 1 GPa), spherical fullerenes arrange in one-dimensional assemblies (Lebedkin et al. 2000). This suggested the existence of periodic fullerenes, like that presented in Figure 20. Recall that the phantasmagorical fullerenes designed by the groups of Fowler and Dress (see Dress and Brinkmann 1996) also have 260 points and f5 = f7 + 12; f7 = 60, as does Diudea's C260 ((5,7)3) cage (Diudea Ed. 2005).

Figure 19. Peanut-shaped kf-tubulenes with the shortest distancing tube. Panels: (a) $C_{120}(5\,6^{5}\,(5\,6)^{5}\,(6\,5)^{5}\,7^{5}\,7^{5} - Z[10,1])$; (b) $C_{108}(6\,(5\,6)^{3}\,(6\,6\,5)^{3}\,(6\,5\,6)^{3}\,7^{6} - Z[12,0])$.
Figure 20. Diudea's C260 ((5,7)3) cage: $C_{260}(k\,5^{k}\,(7^{k}\,5^{2k}\,7^{k})^{r}\,5^{k}\,k)$; k = 5; r = 6.
Table 8. Periodic ((5,7)3) cages: net counting. Formulas for k = 5; 7

Faces: $f_{5k} = 2k(r+1) + 2t_5$; $f_{7k} = 2kr + 2t_7$
Edges: $e_{5,5k} = 2k(r+1+t_5)$; $e_{5,7k} = 2k(3r+2+t_7)$; $e_{7,7k} = 2k(2r-1)$
Vertices: $v_{5,5,5k} = 2kt_5$; $v_{5,5,7k} = 2k(2r+1+t_7)$; $v_{5,7,7k} = 2k(r+1)$; $v_{7,7,7k} = 2k(r-1)$; $v_k = 4k(2r+1)$
* $t_s = 1$ if $s = k$, and zero otherwise.
For a periodic ((5,7)3) cage, the numbers of faces, edges, and vertices of various types can be counted as functions of the number of repeat units r and the polar ring size k (Table 8).
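A short Python sketch can evaluate the counting formulas of Table 8 and confirm that they are mutually consistent. The function name `periodic_57_cage` and the dictionary layout are choices made for this example; the formulas themselves are transcribed from the table.

```python
def periodic_57_cage(k, r):
    """Face, edge and vertex counts of a periodic ((5,7)3) cage (Table 8), for k = 5 or 7."""
    if k not in (5, 7):
        raise ValueError("the formulas are given for k = 5 or k = 7")
    t5 = 1 if k == 5 else 0          # t_s = 1 if s = k, zero otherwise
    t7 = 1 if k == 7 else 0
    faces = {5: 2 * k * (r + 1) + 2 * t5,
             7: 2 * k * r + 2 * t7}
    edges = {(5, 5): 2 * k * (r + 1 + t5),
             (5, 7): 2 * k * (3 * r + 2 + t7),
             (7, 7): 2 * k * (2 * r - 1)}
    vertices = {(5, 5, 5): 2 * k * t5,
                (5, 5, 7): 2 * k * (2 * r + 1 + t7),
                (5, 7, 7): 2 * k * (r + 1),
                (7, 7, 7): 2 * k * (r - 1)}
    v = 4 * k * (2 * r + 1)
    return v, faces, edges, vertices

v, faces, edges, vertices = periodic_57_cage(k=5, r=6)
e, f = sum(edges.values()), sum(faces.values())
print(v, faces[5], faces[7])   # 260 72 60
print(v - e + f)               # 2, the Euler characteristic of the sphere
```

For k = 5 and r = 6 the sketch reproduces the C260 cage, with f5 = f7 + 12 and f7 = 60; for r = 2 it gives the 100 vertices of the C100 tubercular cage.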
3.3. Isomerization of Tubercular Fullerenes
The SW isomerization of tubercular cages of general formula $C_{12k}(k\,5^{k}\,7^{k}\,5^{2k}\,7^{k}\,5^{k}\,k)$ (Figure 21-a) leads to the classical $C_{12k}(k\,6^{k}\,(5\,6)^{k}\,(6\,5)^{k}\,6^{k}\,k)$ fullerenes (Figure 21-b). Routes of SW isomerization of spherical fullerenes into each other are given in The Atlas of Fullerenes (Fowler and Manolopoulos 1995).
Figure 21. Isomerization of all ((5,7)3) cages leading to the classical C12k fullerenes. Panels: (a) $C_{60}(5\,5^{5}\,7^{5}\,5^{10}\,7^{5}\,5^{5}\,5)$ (Ci); (b) $C_{60}(5\,6^{5}\,(5\,6)^{5}\,(6\,5)^{5}\,6^{5}\,5)$ (Ih).
Figure 22. Ways of the Stone-Wales isomerization. Panels: (a) $C_{100}(k\,5^{k}\,(7^{k}\,5^{2k}\,7^{k})^{r}\,5^{k}\,k)$; k = 5; r = 2; (b) $C_{100}(5\,6^{5}\,(5\,6)^{5} - A[10,2])$; (c) $C_{100}(k\,5^{k}\,7^{k}\,5^{2k}\,8^{k}\,(5\,6)^{k}\,(5\,7)^{k}\,5^{k}\,k)$; k = 5; (d) $C_{100}(5\,5^{5}\,7^{5}\,(5\,6)^{5}\,(6\,6)^{5}\,(6\,5)^{5}\,7^{5}\,5^{5}\,5)$; (e) $C_{100}(k\,6^{k}\,(5\,6)^{k}\,(5\,7)^{k}\,7^{k}\,(5\,6)^{k}\,(5\,6)^{k}\,k)$; k = 5; (f) $C_{100}(5\,6^{5}\,(5\,6)^{5}\,(5\,8)^{5}\,(5\,6)^{5}\,(6\,5)^{5}\,6^{5}\,5)$.
Isomerization of the peanut $C_{100}(5\,5^{5}\,(7^{5}\,5^{10}\,7^{5})^{2}\,5^{5}\,5)$ tubercular cage (Figure 22-a) to the corresponding tubulene $C_{100}(5\,6^{5}\,(5\,6)^{5} - A[10,2])$ (Figure 22-b) could follow two ways, depending on where the SW rotation process starts: (i) at the "red" bonds, shared by two heptagons in the zone joining the two repeat units, or (ii) at the "blue" bonds in the cap zone. The stepwise process is given in Figure 22 (c to f). Quantum calculations showed that the blue route is energetically more favored (Diudea Ed. 2005).
3.4. Isomerization of the (6,3) Net
Let the (6,3) net be embedded in the cylinder, as Tu(6,3)H/Z[c,n] and Tu(6,3)V/A[c,n] in our notation, or as "zigzag" (c/2,0) and "armchair" (c/2,c/2) nanotubes in the (k,l) notation (Hamada et al. 1992) (Figure 23). Let H(i,j),(p,r) denote the edges lying parallel to the horizontally oriented tube generator, in the schematic lattice representation; the first subscript bracket encodes the relative location of the start-point of the rotating edges along the tube, while the second one encodes the location of the edges around the tube. Let V(i,j),(p,r) mark the edges lying perpendicular to the tube generator. The marked edges will be rotated in the following isomerizations, and the above symbols play the role of true rotational operators (Diudea 2003b, 2004). The hexagonal (6,3) covering is transformed into the "rhomboidal bathroom floor" tiling ((4,8)3)R by the operations (Diudea 2003b):

$$
H_{(1,3),(1,3)}((6,3)\mathrm{H/Z}) = ((4,8)3)\mathrm{R} = V_{(1,3),(1,3)}((6,3)\mathrm{V/A}) \tag{6}
$$

which is illustrated in Figure 23.
Figure 23. SW rotation of the marked bonds of the (6,3) covering (a) leads to the ((4,8)3)R pattern. Panels: (a) H(1,3),(1,3)((6,3)H/Z); (b) ((4,8)3)R.
Figure 24. A spiral path of SW edge rotation and its transform. Panels: V(1,5),(1,5)((6,3)H/Z); ((4,8)3)S.

The operation can be written as:

$$
V_{(1,5),(1,5)}((6,3)\mathrm{Z}) = ((4,8)3)\mathrm{S} \tag{7}
$$

Similarly, the operation:

$$
V_{(1,5),(1,5),1a}((6,3)\mathrm{Z}) = ((5,7)3)\mathrm{SP} \tag{8}
$$

leads to a spiral ((5,7)3)SP net (Diudea et al. 2003), Figure 25. Note the combination V/A&H/Z, for describing a spiral path, and the subscript 1a for a "leave one row/column out" way of getting an "alternating" spiral net.

Figure 25. A spiral path of SW edge rotation and its spiral net product. Panels: V(1,5),(1,5),1a((6,3)H/Z); ((5,7)3)SP.
The same operations can be done on the (6,3)A net, thus resulting in the corresponding pair of embedding isomers. Different pentaheptite ((5,7)3) lattices (Deza et al., 2000) can be obtained by the following operations:

$$
H_{(1,5),(1,5)}((6,3)\mathrm{Z}) = ((5,7)3)\mathrm{V} \tag{9}
$$

$$
V_{(1,5),(1,5)}((6,3)\mathrm{A}) = ((5,7)3)\mathrm{H} \tag{10}
$$

These coverings are illustrated in Figures 26 and 27.
3.5. Isomerizations on the ((4,8)3) Net
The ((4,8)3) covering, particularly ((4,8)3)R, transforms to either the (6,3)V/A or the (6,3)H/Z net by the operations:

$$
H_{(1,4),(1,4)}(((4,8)3)\mathrm{R}) = (6,3)\mathrm{V/A} \tag{11}
$$

$$
V_{(1,4),(1,4)}(((4,8)3)\mathrm{R}) = (6,3)\mathrm{H/Z} \tag{12}
$$

as a unique intermediate of the (6,3) net isomerization. Other transformations of this covering are:

$$
H_{(1,7),(1,7)}(((4,8)3)\mathrm{R}) = ((5,7)3)\mathrm{H} \tag{13}
$$

$$
V_{(1,7),(1,7)}(((4,8)3)\mathrm{R}) = ((5,7)3)\mathrm{V} \tag{14}
$$

the corresponding objects being illustrated in Figures 26 and 27.
Figure 26. SW isomerization of ((4,8)3)R to ((5,7)3)H lattice. Panels: H(1,3),(1,3)(((4,8)3)R); ((5,7)3)H.
Figure 27. SW isomerization of ((4,8)3)R to ((5,7)3)V lattice. Panels: V(1,7),(1,7)(((4,8)3)R); ((5,7)3)V.
Note that the pentaheptite H/V((5,7)3) lattice is encountered in the chemical net of ThMoB4. It is a 2-isohedral tiling (Grünbaum et al., 1985), i.e., it has only two face orbits, with the local signature (t5, t7) = (1, 3).
Other isomerizations are as follows:

$$
H_{(1,6),(1,5)}(V_{(1,7),(1,4)}(((4,8)3)\mathrm{R})) = ((5,6,7)3)\mathrm{HA} \tag{15}
$$

$$
V_{(1,5),(1,6)}(H_{(1,4),(1,7)}(((4,8)3)\mathrm{R})) = ((5,6,7)3)\mathrm{VA} \tag{16}
$$
Figure 28. The route to ((5,6,7)3)VA lattice. Panels: V(1,7),(1,4)(((4,8)3)R); H(1,6),(1,5)(V(1,7),(1,4)(((4,8)3)R)); ((5,6,7)3)VA; geodesic projection.
Figure 29. Possible molecular mechanisms for the SW isomerization. Panels: (a); (b).
This novel lattice has the local signature: t5j(0, 4, 1); t6j(2, 2, 2); and t7j(1, 4, 2), j = 5, 6, 7. We present here only the VA embedding (Figure 28), which is deducible from C60 by cutting off two parallel hexagonal rings. It was described as a capped tubulene elsewhere (Diudea et al. 2005).
Note that the ((4,8)3) lattices are deducible from the square (4,4) covering by some basic operations on maps (Diudea 2004). It is worth noting that the Stone-Wales rotation attracted the attention of chemists, who proposed isomerization models involving reactive species such as free radicals (Figure 29-a) or carbenes (Bettinger et al. 2003; Figure 29-b). By quantum chemical calculations, an activation energy for defect formation (in a (6,3) graphitic tube) in the range of 5-7 eV (about 115-160 kcal/mol), at a strain above 6%, was estimated. Segments of a tube with an altered helicity, due to a SW isomerization, could appear, consequently forming different metal/metal, semiconductor/semiconductor, and/or semiconductor/metal hetero-junctions, possibly leading to all-nanotube-based quantum dot structures (Orlikowski et al. 2000). Such defects have been examined experimentally by scanning tunneling microscopy, STM (Zhao et al. 2000). The above SW edge rotations have been performed with our original CageVersatile program.
4. Operations on Maps

Modifying a covering is a challenge and also a way to understand chemical reactions occurring in nanostructures, particularly in carbon allotropes (see Deza et al. 2000, Fowler and Pisanski 1994, Klein and Zhu 1997, de La Vaissière et al. 2001). Some geometrical-topological transformations, called operations on maps, are used to generate and/or modify the associated graphs of fullerenes (and of nanostructures in general). Our original software CageVersatile (CVNET) enables such operations.

A map M is a combinatorial representation of a (closed) surface (Pisanski and Randić, 2000). Let us denote in a map: v, the number of vertices; e, the number of edges; f, the number of faces; and d, the vertex degree. A subscript “0” will mark the corresponding parameters in the parent map. Some basic relations in a map date from the very beginning of graph theory (Euler, 1736):

$$\sum_d d\, v_d = 2e \tag{17}$$

$$\sum_s s\, f_s = 2e \tag{18}$$

where $v_d$ and $f_s$ are the number of vertices of degree d and the number of s-gonal faces, respectively. The two relations are joined in the famous Euler (1758) formula:

$$v - e + f = \chi(M) = 2(1 - g) \tag{19}$$
with χ being the Euler characteristic and g the genus (Harary, 1969) of a graph (i.e., the number of handles attached to the sphere to make it homeomorphic to the surface on which the given graph is embedded; g=0 for a planar graph and 1 for a toroidal graph). Positive/negative χ values indicate positive/negative curvature of a lattice. This formula is useful for checking the consistency of an assumed structure.
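As a consistency check of the kind mentioned above, relations (17) to (19) can be evaluated numerically. The following is a minimal sketch, assuming the standard vertex, edge and face counts of the five Platonic solids; the function names `is_regular_map` and `genus` are illustrative and not part of CageVersatile.

```python
# Platonic solids as (v, e, f, d, s): vertices, edges, faces,
# vertex degree and face size (standard textbook values).
PLATONIC = {
    "T": (4, 6, 4, 3, 3),
    "C": (8, 12, 6, 3, 4),
    "O": (6, 12, 8, 4, 3),
    "D": (20, 30, 12, 3, 5),
    "I": (12, 30, 20, 5, 3),
}


def is_regular_map(v, e, f, d, s):
    """Relations (17) and (18) for a regular map: d*v = 2e and s*f = 2e."""
    return d * v == 2 * e and s * f == 2 * e


def genus(v, e, f):
    """Solve v - e + f = 2(1 - g), relation (19), for the genus g."""
    chi = v - e + f
    if chi % 2:
        raise ValueError("odd Euler characteristic: formula (19) does not apply")
    return 1 - chi // 2


if __name__ == "__main__":
    for name, (v, e, f, d, s) in PLATONIC.items():
        print(name, is_regular_map(v, e, f, d, s), genus(v, e, f))
```

An assumed structure whose counts fail `is_regular_map`, or give an unexpected genus, cannot be a consistent map of the intended kind.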
4.1. Simple Operations on Maps

4.1.1. Dualization Du

Dualization of a map starts by locating a point in the center of each face. Next, two such points are joined if their corresponding faces share a common edge. The result is the (Poincaré) dual Du(M). The vertices of Du(M) represent the faces of M and vice versa (Pisanski and Randić 2000). The parent and transformed maps are related by the parameters:

$$Du(M):\quad v = f_0;\quad e = e_0;\quad f = v_0 \tag{20}$$

Figure 30. The five Platonic polyhedra: Tetrahedron T, Cube C, Octahedron O, Dodecahedron D, Icosahedron I.
Dual of the dual returns the original map: Du(Du(M)) = M. Tetrahedron is self dual while the other Platonic polyhedra form pairs: Du(Cube) = Octahedron; Du(Dodecahedron) = Icosahedron. Figure 30 illustrates the five Platonic solids and symbols hereafter used. A Petrie dual is also known.
4.1.2. Medial Med

To achieve the medial, put new vertices in the middle of the original edges (Pisanski and Randić 2000). Join two vertices if the corresponding edges span an angle (i.e., are consecutive within a rotation path around their common vertex in M). The medial is a 4-valent graph and Med(M) = Med(Du(M)), as illustrated in Figure 31a. The transformed map parameters are:

$$Med(M):\quad v = e_0;\quad e = 2e_0;\quad f = f_0 + v_0 \tag{21}$$

Figure 31. Medial and Truncated objects: (a) Med(C) = Cuboctahedron; (b) Tr(O) = Truncated Octahedron.
The medial operation rotates parent s-gonal faces by π/s. Points in the medial map represent original edges; this property can be used in the topological analysis of edges in the parent polyhedron. Similarly, the points in the dual map give information on the topology of parent faces.
4.1.3. Truncation Tr

Truncation is achieved by cutting off the neighborhood of each vertex by a plane close to the vertex, such that it intersects each edge incident to the vertex. Truncation is similar to the medial, the transformed map parameters being:

$$Tr(M):\quad v = 2e_0 = d_0 v_0;\quad e = 3e_0;\quad f = f_0 + v_0 \tag{22}$$

This was the main operation used by Archimedes in building his well-known 13 solids. Figure 31b illustrates a truncated object.
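The parameter relations (20) to (22) can be tried out on the two objects of Figure 31. This is a minimal sketch that tracks only the counts (v, e, f), not the actual graphs; the `MapParams` type and the function names are illustrative assumptions.

```python
from typing import NamedTuple


class MapParams(NamedTuple):
    """Vertex, edge and face counts of a map (counts only, no adjacency)."""
    v: int
    e: int
    f: int


def dual(m: MapParams) -> MapParams:
    # Relation (20): v = f0, e = e0, f = v0.
    return MapParams(m.f, m.e, m.v)


def medial(m: MapParams) -> MapParams:
    # Relation (21): v = e0, e = 2*e0, f = f0 + v0.
    return MapParams(m.e, 2 * m.e, m.f + m.v)


def truncation(m: MapParams) -> MapParams:
    # Relation (22): v = 2*e0, e = 3*e0, f = f0 + v0.
    return MapParams(2 * m.e, 3 * m.e, m.f + m.v)


CUBE = MapParams(8, 12, 6)
OCTAHEDRON = MapParams(6, 12, 8)

if __name__ == "__main__":
    print("Med(C) =", medial(CUBE))            # cuboctahedron counts
    print("Tr(O)  =", truncation(OCTAHEDRON))  # truncated octahedron counts
```

Because the medial depends only on e0 and on the sum f0 + v0, the identity Med(M) = Med(Du(M)) is visible directly at the level of the counts.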
4.1.4. Polygonal Pn Mapping (Diudea 2004)

To cover a map entirely by n-gons (n = 3, 4, 5), add a new vertex in the center of each face. Put n-3 points on each boundary edge. Connect the central point with one vertex on each edge (the end points included). Thus, the parent map is entirely covered by triangles (n = 3), quadrilaterals (n = 4) or pentagons (n = 5). The P3 operation is also called stellation or triangulation. The transformed map parameters are:

$$P_n(M):\quad v = v_0 + (n-3)e_0 + f_0;\quad e = n e_0;\quad f = s_0 f_0 \tag{23}$$

so that Euler's relation holds. Maps transformed by the above operations form dual pairs:

$$Du(P_3(M)) = Le(M);\quad Du(P_4(M)) = Med(Med(M));\quad Du(P_5(M)) = Sn(M)$$

Truncation was the main operation in the construction of the Archimedean objects, when applied to the Platonic solids (see de La Vaissière et al. 2001, Pisanski and Randić, 2000). Their duals are known as the Catalan objects. Note that all the net parameters refer to regular maps (i.e., having all vertices and faces of the same valence/size). Figure 32 illustrates the Pn operations; the symbols in brackets are identical to those used by de La Vaissière et al. 2001 for the Catalan objects (i.e., duals of the Archimedean solids). For other names of these operations see HART.

Figure 32. Polygonal Pn mapping of the Dodecahedron D: (a) P3(D); (b) P4(D) = (C10)a; (c) P5(D) = (C9)a.
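The statement that Euler's relation holds under the Pn mapping can be checked on the Dodecahedron of Figure 32. The sketch below evaluates relation (23) for a regular parent map; the function name `polygonal_mapping` and its argument order are assumptions made for illustration.

```python
def polygonal_mapping(n, v0, e0, f0, s0):
    """Counts (v, e, f) of Pn(M) from relation (23), for a regular parent map
    with v0 vertices, e0 edges and f0 faces of size s0."""
    if n not in (3, 4, 5):
        raise ValueError("Pn is defined here only for n = 3, 4, 5")
    v = v0 + (n - 3) * e0 + f0  # old vertices, edge points, face centers
    e = n * e0
    f = s0 * f0
    return v, e, f


if __name__ == "__main__":
    # Dodecahedron: v0 = 20, e0 = 30, f0 = 12, s0 = 5.
    for n in (3, 4, 5):
        v, e, f = polygonal_mapping(n, 20, 30, 12, 5)
        print(f"P{n}(D): v={v}, e={e}, f={f}, v-e+f={v - e + f}")
```

In all three cases v - e + f stays equal to 2, as expected for a map that remains on the sphere.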
4.1.5. Snub Sn

Snub is a composite operation (Diudea 2004) that can be written as:

$$Sn(M) = Du(P_5(M)) \tag{24}$$

Figure 33. Snub of the Dodecahedron: Sn(D) = (A9)a and C60 = (A12)a. a Symbols in the brackets are identical to those used by de La Vaissière et al. (2001) for the Archimedean solids. Note the insulated pentagons in C60.
Correspondingly, the dual of the snub is P5(M): Du(Sn(M)) = P5(M). Similar to the medial, Sn(M) = Sn(Du(M)). In case of M = T, the snub is just the icosahedron I. Of chemical interest is the easy transformation of the snub (a regular pentavalent graph) into the leapfrog transform (a regular trivalent graph, see below) by deleting the edges of the triangle joining any three parent faces (Figure 33, in black). The transformed map parameters are:

$$Sn(M):\quad v = s_0 f_0 = d_0 v_0;\quad e = 5e_0;\quad f = v_0 + 2e_0 + f_0 \tag{25}$$
The multiplication ratio is v/v0 = d0, the same as for Le(M), both of them involving the dualization.
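Relation (25) reproduces the two properties quoted above, Sn(T) = I and Sn(M) = Sn(Du(M)), at the level of the counts. A minimal sketch follows; it uses s0 f0 = d0 v0 = 2e0, which holds for a regular map by relations (17) and (18), and the function name `snub` is illustrative.

```python
def snub(v0, e0, f0):
    """Counts (v, e, f) of Sn(M) from relation (25).

    For a regular map s0*f0 = d0*v0 = 2*e0, so v is written as 2*e0.
    """
    return 2 * e0, 5 * e0, v0 + 2 * e0 + f0


if __name__ == "__main__":
    print("Sn(T) =", snub(4, 6, 4))     # icosahedron counts
    print("Sn(D) =", snub(20, 30, 12))  # snub dodecahedron counts
    print("Sn(C) =", snub(8, 12, 6), " Sn(O) =", snub(6, 12, 8))
```

The face count is symmetric in v0 and f0, which is why a map and its dual give the same snub parameters.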
4.2. Complex Operations on Maps

4.2.1. Leapfrog Le

Leapfrog (tripling) is a composite operation (Eberhard 1891, Fowler 1986, 1990, 1997; Fowler et al. 1998a&b, Fowler and Steer 1987, Diudea and John 2001, Diudea et al. 2003) that can be written as:

$$Le(M) = Du(P_3(M)) = Tr(Du(M)) \tag{26}$$
A sequence of stellation-dualization rotates the parent s-gonal faces by π/s. Leapfrog operation is illustrated, on a pentagonal face, in Figure 34. A bounding polygon, of size 2d0, is formed around each original vertex. In the most frequent cases of 4- and 3-valent maps, the bounding polygon is an octagon and a hexagon, respectively.
Figure 34. The Leapfrog Le operation on a pentagonal face (P3 followed by Du).

If the map is a d0-regular graph, the following theorem holds (see Diudea and John 2001, Diudea et al. 2003):

Theorem 4.1. The number of vertices in Le(M) is d0 times larger than in the original map M, irrespective of the tessellation type.

The proof follows from the observation that, for each vertex of M, d0 new vertices result in Le(M): $v/v_0 = d_0 v_0 / v_0 = d_0$. The complete transformed map parameters are:

$$Le(M):\quad v = s_0 f_0 = d_0 v_0;\quad e = 3e_0;\quad f = v_0 + f_0 \tag{27}$$
Note that in Le(M) the vertex degree is always 3, as a consequence of the involved triangulation P3. In other words, the dual of a triangulation is a cubic net (Pisanski and Randić 2000). It is also true that truncation always provides a trivalent net. A nice example of using Le operation is: Le(Dodecahedron) = Fullerene C60 (Figure 35). The leapfrog operation can be used to insulate the parent faces by surrounding bounding polygons.
Figure 35. Realization of the Le operation: Dodecahedron D (v = 20) and Fullerene C60 (v = 60).
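The two routes of relation (26) can be compared with the direct parameters (27) on all five Platonic solids. The sketch below works on counts only and is self-contained; the helper names are illustrative, and P3 is written with f = 2e0, which equals s0 f0 by relation (18).

```python
def dual(m):
    v, e, f = m
    return (f, e, v)                 # relation (20)


def p3(m):
    v, e, f = m
    return (v + f, 3 * e, 2 * e)     # relation (23) with n = 3; s0*f0 = 2*e0


def truncation(m):
    v, e, f = m
    return (2 * e, 3 * e, f + v)     # relation (22)


def leapfrog(m):
    v, e, f = m
    return (2 * e, 3 * e, v + f)     # relation (27); d0*v0 = 2*e0


PLATONIC = {"T": (4, 6, 4), "C": (8, 12, 6), "O": (6, 12, 8),
            "D": (20, 30, 12), "I": (12, 30, 20)}

if __name__ == "__main__":
    for name, m in PLATONIC.items():
        print(name, leapfrog(m), dual(p3(m)), truncation(dual(m)))
```

For the Dodecahedron all three expressions give (60, 90, 32), the counts of C60: 60 atoms, 90 bonds, and 12 pentagons plus 20 hexagons.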
A retro-leapfrog RLe operation (Vizitiu et al. 2006; Figure 36) can be written as:

$$RLe(M) = RP_3(Du(Le(M))) \tag{28}$$

To perform this operation, suppose the actual map is Le(M); make its dual and then cut off all the vertices with degree lower than the maximal one.

Figure 36. The Retro-Leapfrog RLe operation on a pentagonal face (Du followed by RP3).
4.2.2. Chamfering Q

Chamfering (quadrupling; Diudea and John 2001, Diudea et al. 2003, Vizitiu et al. 2006, Goldberg 1937) is another composite operation, written as the sequence:

$$Q(M) = RE(Tr_{P_3}(P_3(M))) \tag{29}$$

where RE denotes the (old) edge deletion (dashed lines in Figure 37) in the truncation TrP3 of each central vertex of the P3 mapping. The Q operation leaves unchanged the initial orientation of the polygonal faces.

Figure 37. The Quadrupling Q operation on a pentagonal face (P3 followed by TrP3).

Theorem 4.2. The vertex multiplication ratio in a Q transform is d0 + 1, irrespective of the parent map tessellation.

With the observation that, for each vertex of M, d0 new vertices appear in Q(M) and the old vertex is preserved, the proof is immediate (see Diudea and John 2001, Diudea et al. 2003): $v = d_0 v_0 + v_0$; $v/v_0 = d_0 + 1$. The complete transformed parameters are:

$$Q(M):\quad v = (d_0 + 1)v_0;\quad e = 4e_0;\quad f = f_0 + e_0 \tag{30}$$
Q operation involves two π/s rotations, so that the initial orientation of the polygonal faces is preserved. Note that, because of preserving the old vertices, Q(M) is, in general, nonregular; only in case of a 3-valent M, Q(M) is a 3-regular graph and vertex multiplication is 4 (from which the name quadrupling is derived). Also note that edge chamfering is equivalent to vertex truncation. Q insulates the parent faces always by hexagons. An example of this operation is: Q (Dodecahedron) = Fullerene C80 (Figure 38).
Figure 38. Realization of the Q operation: Dodecahedron D (v = 20) and Fullerene C80 (v = 80).
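Relation (30) and the remark on regularity can be illustrated with a short count. The sketch below is an illustration only: it assumes that every new vertex of Q(M) has degree 3 (an assumption consistent with the edge count through relation (17), but not stated explicitly in the text), and the function name `chamfer` is hypothetical.

```python
from collections import Counter


def chamfer(v0, e0, f0, d0):
    """Counts of Q(M) from relation (30) for a d0-regular parent map.

    Returns (v, e, f, degrees); degrees maps vertex degree -> vertex count,
    assuming the d0*v0 new vertices are all 3-valent.
    """
    v = (d0 + 1) * v0
    e = 4 * e0
    f = f0 + e0              # one new hexagon per parent edge
    degrees = Counter()
    degrees[d0] += v0        # preserved parent vertices
    degrees[3] += d0 * v0    # new vertices (assumed trivalent)
    return v, e, f, dict(degrees)


if __name__ == "__main__":
    print("Q(D):", chamfer(20, 30, 12, 3))  # counts of C80, 3-regular
    print("Q(O):", chamfer(6, 12, 8, 4))    # non-regular: degrees 4 and 3
```

For the Dodecahedron the 42 faces are the 12 parent pentagons plus 30 hexagons, one per parent edge.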
The retro-quadrupling RQ operation (see Vizitiu et al. 2006; Figure 39) is based on the sequence:

$$RQ(M) = E(RTr_{P_3}(P_3(M))) \tag{31}$$

and can be performed by adding new edges parallel to the boundary edges of the parent faces, followed by the deletion of these faces.

Figure 39. The Retro-Quadrupling RQ operation on a pentagonal face (steps E and RTrP3).
4.2.3. Septupling Operations on Maps

Two main operations on maps, leading to Platonic tessellations in open lattices, are known: the septupling S1 and S2 operations (Diudea 2003a, 2004, 2005; King and Diudea 2005, 2006). The S1 operation is a composite operation that can be written as a sequence of simple operations:

$$S_1(M) = Tr_{P_5}(P_5(M)) \tag{32}$$

with TrP5 meaning the truncation of the new, face-centered vertices introduced by the P5 pentagonal mapping, which involves an E2 operation (i.e., two new points put on each edge). The S1 operation was also called (Diudea 2003a) Capra Ca, the goat, after the Romanian name of the children's game known in English as leapfrog. Within this work, the two names are interchangeable. The nuclearity of the Goldberg (1937) polyhedra (related to the fullerenes) is given by the parameter:

$$m = a^2 + ab + b^2;\quad a \ge b;\quad a + b > 0 \tag{33}$$

which is the multiplication factor m = v/v0. In a 3-valent map: Le, (1,1), m = 3; Q, (2,0), m = 4; and Ca, (2,1), m = 7. The m factor has been used since ancient Egypt for calculating the volume of a truncated pyramid of height h and base sides a and b: V = mh/3.

Figure 40. Septupling S1 operation on a square face, up to the open structure: P5(M); S1(M); Op(S1(M)).
S1 insulates any face of M by its own hexagons, which are not shared with any old face. It is an intrinsically chiral operation (King and Diudea 2006); it rotates the parent edges by π/(3/2)s and was extensively illustrated in our books (Diudea Ed. 2005, Diudea and Nagy 2007). Since pentangulation of a face can be done either clockwise or counter-clockwise, it results in an enantiomeric pair of objects, S1S(M) and S1R(M), with the subscripts S and R given in terms of the sinister/rectus stereochemical isomerism.

Si can be followed by the opening operation Opk(Si(M)), where k represents the number of points added on the boundary of the parent faces, which become the open faces. The resulting open objects have all the polygons of the same (6+k) size. The above operation sequence enables the construction of negatively curved networks. Figure 40 gives the steps of the S1 realization on a square face in a trivalent lattice, up to the open structure.

Theorem 4.3. The vertex multiplication ratio in an S transformation is 2d0 + 1, irrespective of the original map tiling.

For the proof, observe that, for each old vertex, 2d0 new vertices (Figure 40) appear and the old vertex is preserved in the transformed map. Thus, $v = 2d_0 v_0 + v_0$ and $v/v_0 = 2d_0 + 1$.

The S2 operation (Diudea 2004, 2005; Figure 41) is a simpler one; it can be achieved by putting four vertices on each edge of the parent map M (E4 operation) and then joining these new vertices in the order (-1, +3):

$$S_2 = J_{(-1,+3)}(E_4(M)) \tag{34}$$

It insulates the double-sized parent faces by pentagons and the parent vertices by pentagon d0-multiples; the transformed objects are non-chiral. Chirality in S2 is brought by the Op operation Op2a, achieved by putting two points on alternating edges of the double-sized parent face boundary (Figure 41). Note that both septupling operations keep the parent vertices.

Figure 41. Septupling S2 operation on a square face, up to the open structure: E4(M); S2(M); Op2a(S2(M)).
The transformed map parameters are shown in the following relations:

$$S_1(M)\ \&\ S_2(M):\quad v = v_0(2d_0 + 1);\quad e = 7e_0;\quad f = f_0(s_0 + 1) = 2e_0 + f_0 \tag{35}$$

$$Op(S_1(M)):\quad v_{Op} = v_0(3d_0 + 1);\quad e_{Op} = 9e_0;\quad f_{Op} = f_0 s_0 \tag{36}$$

$$Op_{2a}(S_2(M)):\quad v_{Op} = v_0(4d_0 + 1);\quad e_{Op} = 11e_0;\quad f_{Op} = f_0 s_0 \tag{37}$$
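Relations (35) to (37) can be evaluated for a regular parent map to see how the opening changes the Euler characteristic. This is a minimal sketch under the stated relations; the function names are illustrative, and the open faces are simply not counted among the faces, as in (36) and (37).

```python
def septupling(v0, e0, f0, d0, s0):
    """Closed S1(M) or S2(M), relation (35)."""
    return v0 * (2 * d0 + 1), 7 * e0, f0 * (s0 + 1)


def open_s1(v0, e0, f0, d0, s0):
    """Op(S1(M)), relation (36)."""
    return v0 * (3 * d0 + 1), 9 * e0, f0 * s0


def open_s2(v0, e0, f0, d0, s0):
    """Op2a(S2(M)), relation (37)."""
    return v0 * (4 * d0 + 1), 11 * e0, f0 * s0


def chi(v, e, f):
    return v - e + f


if __name__ == "__main__":
    D = (20, 30, 12, 3, 5)  # Dodecahedron: v0, e0, f0, d0, s0
    for label, op in (("S(D)", septupling), ("Op(S1(D))", open_s1),
                      ("Op2a(S2(D))", open_s2)):
        v, e, f = op(*D)
        print(label, (v, e, f), "chi =", chi(v, e, f))
```

For a spherical parent the closed transform keeps v - e + f = 2, while both open transforms give 2 - f0, that is, -10 for the Dodecahedron, in line with the negatively curved networks mentioned in the text.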
where d and s are the vertex degree and face size, respectively; the subscript zero refers to the original map M. Iterating the operation n times (on maps with any vertex degree, $d_0 \ge 3$) leads to the following lattice parameters, valid for both S1 and S2 (Diudea and Nagy 2007):

$$v_n = v_0 q_n \tag{38}$$

$$e_n = e_0 m^n;\quad f_n = f_0(s_0 p_n + 1);\quad q_n = 2d_0 p_n + 1;\quad n \ge 2 \tag{39}$$

$$p_n = \sum_{i=0}^{n-1} m^i = \frac{m^n - 1}{m - 1} = \underbrace{m(m\cdots(m+1)\cdots+1)}_{n-2} + 1 \tag{40}$$

The parameter m is that defined in relation (33). From (39), it is obvious that:

$$p_n = (q_n - 1)/2d_0 \tag{41}$$

For trivalent maps (i.e., those with d0 = 3), the above parameters become:

$$q_n = m^n \tag{42}$$

$$p_n = (m^n - 1)/6 \tag{43}$$

$$v_n = v_0 m^n;\quad e_n = e_0 m^n;\quad f_n = f_0\big(s_0(m^n - 1)/6 + 1\big) \tag{44}$$

For S1 and S2 the transformed lattice parameters will be:

$$v_n = 7^n v_0;\quad e_n = 7^n e_0;\quad f_n = f_0\big(s_0(7^n - 1)/6 + 1\big) \tag{45}$$
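The iterated parameters (38) to (40) are easy to tabulate. The sketch below is a hedged illustration with m = 7 as the default septupling factor; the function name `iterated_septupling` is an assumption, and the formulas are applied from n = 1 onward, where they reduce to relation (35).

```python
def iterated_septupling(n, v0, e0, f0, d0, s0, m=7):
    """Counts (v_n, e_n, f_n) after n septupling steps, relations (38)-(40)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    p_n = sum(m ** i for i in range(n))   # relation (40)
    q_n = 2 * d0 * p_n + 1                # relation (39)
    return v0 * q_n, e0 * m ** n, f0 * (s0 * p_n + 1)


if __name__ == "__main__":
    # Dodecahedron (trivalent): q_n reduces to 7**n, relation (42).
    for n in (1, 2, 3):
        print("D, n =", n, iterated_septupling(n, 20, 30, 12, 3, 5))
    # Octahedron (d0 = 4): q_n is no longer a power of 7.
    print("O, n = 2", iterated_septupling(2, 6, 12, 8, 4, 3))
```

In each case printed here the three counts still satisfy v - e + f = 2, which gives a quick sanity check on the closed transforms.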
In case of a cage opening after the nth iteration, the lattice parameters are as follows:

$$\begin{aligned}
v_{n,Op(S_1)} &= v_0 q_n + f_0 s_0 = v_0(d_0 + q_n)\\
e_{n,Op(S_1)} &= e_0 m^n + f_0 s_0 = e_0(m^n + 2)\\
f_{n,Op(S_1)} &= f_0(s_0 p_n + 1) - f_0 = f_0 s_0 p_n
\end{aligned} \tag{46}$$

$$\begin{aligned}
v_{n,Op_{2a}(S_2)} &= v_0 q_n + 2f_0 s_0 = v_0(2d_0 + q_n)\\
e_{n,Op_{2a}(S_2)} &= e_0 m^n + 2f_0 s_0 = e_0(m^n + 4)\\
f_{n,Op_{2a}(S_2)} &= f_0(s_0 p_n + 1) - f_0 = f_0 s_0 p_n
\end{aligned} \tag{47}$$

Figure 42. Sequence of S1 pro-chiral operations on the Octahedron (top views): the S1S,S1S transform, S1S(S1S(O)), is still twisted, while the S1R,S1S one, S1R(S1S(O)), is no longer chiral.
As mentioned above, S1 rotates the parent bonds, so that it provides chiral transforms. The iterative application of S1 may lead to either chiral/twisted or non-chiral/non-twisted transforms: for example, the sequence S1S(S1S(M)) results in a twisted structure, while S1R(S1S(M)) provides a non-twisted object (Figure 42). In contrast, S2 applied to closed cages leads to non-twisted objects. Its iterative application reveals the fractal fashion of the covering (Figure 43). The fractal characteristic (El-Basil 1996, Klein et al. 1993, Diudea 2005) can be seen even in the algebraic form of the pn parameter, eq. (40). The only classical fullerene constructible by S2 is C28 (from the Tetrahedron).
Figure 43. Iterative S2 operation on the Dodecahedron: (a) S2(D); I; v = 140 (two-fold axis); (b) (S2)3(D); I; v = 6860 (five-fold axis). Observe the fractal covering in case of the 3-time repetition (b).

Figure 44. The Retro-Capra RCa operation on a pentagonal face (RTrP5 followed by RE2).

Figure 45. Realization of the Retro-Capra RCa operation: C140; RTrP5(D), v = 80.
The retro-capra RCa operation (Figure 44, Vizitiu et al. 2006) can be written as the sequence:

$$RCa(M) = RE_2(RTr_{P_5}(M)) \tag{48}$$

To achieve this operation, delete the smallest faces of the actual map and continue with RE2. A 3D realization of RCa is illustrated in Figure 45.
4.3. Generalized Operations on Maps

Recently, Peter E. John (see Diudea et al. 2006b, Stefu et al. 2005) has proposed a generalization of operations on maps, inspired by the work of Goldberg (1937) and his representation of polyhedra in the (a,b) “inclined coordinates” (60° between axes). The multiplication factor m for trivalent maps is given by eq. (33). A similar procedure was used by Coxeter (1973), who built up icosahedral polyhedra/fullerenes as dual master triangular patches, represented by pairs of integers.
Figure 46. Examples of the generalized (a, a) and (a, 0) operations: (2,2) and (4,0), drawn on the inclined x, y axes.

Figure 47. Examples of the generalized (a, b) operation: (a) a = b + 1, (3,2), m = 19; (b) the “cut” C(a,b) (central face and first connected atoms), C(3,2), m = 13, corresponding to the m(3,1) = 13 factor.
Figures 46 and 47 illustrate the method on the hexagonal face. The points of the “master” hexagon must lie either in the center of a lattice hexagon or on a lattice vertex, so that in the center of the parent hexagon there must be a new hexagon. The edge length of the parent hexagon is counted by the primitive lattice vectors (x,y). For the (3,2) Cut operation (Figure 47b), the central face and first connected atoms were cut off. Some of the generalized composite operations, corresponding to non-prime m, can be expressed as sequences of operations, as shown in Table 9. It is obvious that the operations (a,a), also denoted Lea,a, and (a,0), or Qa,0, provide achiral transforms (e.g., fullerenes of the full Ih point group symmetry), while the operations (a,b), a ≠ b (also denoted Caa,b), result in chiral transformed maps (e.g., fullerenes of the rotational I point group symmetry; King and Diudea 2006). The (a,0) operations produce non-rotated maps. The above generalized operations, as implemented in the software package CageVersatile, work on any face and any vertex-degree maps.

Table 9. Inclined coordinates (a, b), multiplication factor m = (a2 + ab + b2), number of atoms v and operation symbols (running on the Dodecahedron, C20)

| No. | (a, b) | m | v | Operation | Obs. |
|---|---|---|---|---|---|
| 1 | (1, 0) | 1 | 20 | I | Identity |
| 2 | (1, 1) | 3 | 60 | Le1,1 | Rotated by π/s; achiral |
| 3 | (2, 0) | 4 | 80 | Q2,0 | Non-rotated; achiral |
| 4 | (2, 1) | 7 | 140 | Ca2,1 | Rotated by π/(3/2)s; chiral |
| 5 | (2, 2) | 12 | 240 | Le1,1, Q2,0 | Rotated by π/s; achiral |
| 6 | (3, 0) | 9 | 180 | Le1,1, Le1,1 | Non-rotated; achiral |
| 7 | (3, 1) | 13 | 260 | - | Rotated; chiral |
| 8 | (3, 2) | 19 | 380 | - | Rotated; chiral |
| 8' | (3, 2)c* | 13 | 260 | - | Rotated; chiral |
| 9 | (3, 3) | 27 | 540 | Le1,1, Le1,1, Le1,1 | Rotated by π/s; achiral |
| 10 | (4, 0) | 16 | 320 | Q2,0, Q2,0 | Non-rotated; achiral |
| 11 | (4, 1) | 21 | 420 | Le1,1, Ca2,1 | Rotated; chiral** |
| 12 | (4, 2) | 28 | 560 | Q2,0, Ca2,1 | Rotated by π/2s; chiral |
| 13 | (4, 3) | 37 | 740 | - | Rotated; chiral |
| 14 | (4, 4) | 48 | 960 | Le1,1, Q2,0, Q2,0 | Rotated by π/s; achiral |
| 15 | (5, 0) | 25 | 500 | - | Non-rotated; achiral |
| 16 | (5, 1) | 31 | 620 | - | Rotated; chiral |
| 17 | (5, 2) | 39 | 780 | - | Rotated; chiral |
| 18 | (5, 3) | 49 | 980 | Ca2,1, Ca2,1 | Chiral/achiral** |
| 19 | (5, 4) | 61 | 1220 | - | Rotated; chiral |
| 20 | (5, 5) | 75 | 1500 | - | Rotated; achiral |

* c = cut; ** achiral, when the sequence CaR(CaS(M)) is used.
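The m and v columns of Table 9 follow directly from relation (33) with v0 = 20. The sketch below regenerates them for a ≤ 5; the chirality label uses only the rule stated in the text ((a,a) and (a,0) achiral, otherwise chiral), and the function names are illustrative.

```python
def goldberg_m(a, b):
    """Multiplication factor m = a^2 + ab + b^2, relation (33)."""
    if not (a >= b >= 0 and a + b > 0):
        raise ValueError("require a >= b >= 0 and a + b > 0")
    return a * a + a * b + b * b


def table9_rows(v0=20, a_max=5):
    """Rows ((a, b), m, v, kind) for all pairs with 1 <= a <= a_max."""
    rows = []
    for a in range(1, a_max + 1):
        for b in range(0, a + 1):
            m = goldberg_m(a, b)
            kind = "achiral" if b in (0, a) else "chiral"
            rows.append(((a, b), m, m * v0, kind))
    return rows


if __name__ == "__main__":
    for (a, b), m, v, kind in table9_rows():
        print(f"({a}, {b})  m={m:2d}  v={v:4d}  {kind}")
```

The factor is multiplicative along a sequence of operations, for example m(2,2) = 12 = m(1,1) · m(2,0), which is how the composite entries in the Operation column arise.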
In case of a trivalent regular map, relations (17) and (18) can be rewritten as:

$$3 v_0 = 2 e_0 = s_0 f_0 \tag{49}$$

Keeping in mind the multiplication factor m (see (33)), the number of vertices and edges, respectively, in the transformed map is:

$$v = m\, v_0 \tag{50}$$

$$3v = 3 m v_0 = 2e;\quad e = \tfrac{3}{2}\, m\, v_0 = \tfrac{3}{2}\cdot\tfrac{2}{3}\, m\, e_0 = m\, e_0 \tag{51}$$

The above operations introduce new hexagons, keeping the original faces. Thus, the total number of faces (of any size s) in the transformed map is:

$$f_s = f_6 + f_0 \tag{52}$$

Relation (49) becomes:

$$2e = \sum s\, f_s = 6 f_6 + s_0 f_0 \tag{53}$$

Substitution of e from (51) in (53) leads to:

$$f_6 = \frac{m - 1}{6}\, s_0 f_0 \tag{54}$$

$$f_s = \frac{m - 1}{6}\, s_0 f_0 + f_0 \tag{55}$$
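The number of new hexagons, (m - 1) s0 f0 / 6, and the total face count, obtained by adding f0, can be evaluated for the trivalent Platonic solids. This is a minimal sketch; the function names are illustrative, and the divisibility check reflects that s0 f0 = 3 v0 with v0 even for a trivalent map.

```python
def hexagon_count(m, s0, f0):
    """New hexagons f6 = (m - 1) * s0 * f0 / 6 for a trivalent regular parent."""
    numerator = (m - 1) * s0 * f0
    if numerator % 6:
        raise ValueError("s0*f0 is not a multiple of 6: parent map is not trivalent regular")
    return numerator // 6


def total_faces(m, s0, f0):
    """All faces of the transformed map: new hexagons plus the f0 parent faces."""
    return hexagon_count(m, s0, f0) + f0


if __name__ == "__main__":
    # Dodecahedron (s0 = 5, f0 = 12) under Le (m = 3), Q (m = 4) and Ca (m = 7).
    for name, m in (("Le", 3), ("Q", 4), ("Ca", 7)):
        print(name, hexagon_count(m, 5, 12), total_faces(m, 5, 12))
```

For m = 3 this gives the 20 hexagons and 32 faces of C60, and for m = 7 the 72 faces obtained earlier from f0(s0 + 1).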
In the case of n-iterative operations, equations (44) hold for all the presented operations running on a trivalent regular M0. The above relations are particularly true for the 3-valent Platonic solids: tetrahedron T, cube C and dodecahedron D. Figure 48 illustrates realizations of a chiral generalized operation.
Figure 48. The generalized Ca3,2 and Ca3,2C operations performed on the Dodecahedron: (a) Ca3,2(D); m = 19; v = 380, and its optimized structure; (b) Ca3,2C(D); m = 13; v = 260, and its optimized structure.
The operations described above have been proposed in view of rationalizing (see de La Vaissière et al. 2001, Pisanski and Randić, 2000) the transformations observed in nanostructures, in connection with structure-property relations or with their growth mechanisms. Sequences of map operations can be used to design a desired tessellation, particularly those showing disjoint hexagons (which, according to the Clar theory, could be more stable or more aromatic) or even disjoint circulene domains, which could predict valuable properties, e.g., super-magnetic ones. This is not the place to pursue this subject; the reader can consult our recent book (Diudea and Nagy 2007). All the operations on maps were performed with the CageVersatile software package, which works on any face, any vertex-degree maps and any type of surface.
5. All Ring Net Operations

A polygonal motif can be embedded in a given, locally planar surface by operations on maps (Section 4). Accordingly, we call a tessellation of a (locally 2D) map M a covering (Pisanski and Randić, 2000, Diudea, 2004). Other patterns are essentially 3D objects which fill the space within a network. The operations providing such units are the same as the above ones, but working on all rings of a 3D net. We name these operations on nets, while a tiling (Blatov et al. 2004, Delgado-Friedrichs and O’Keeffe 2005) is a filling of space by tiles sharing faces. The tiles are elementary polyhedra of a 3D net. A net N is a combinatorial representation of a space domain. The difference between the two types of operations originates in the difference between “face” and (strong) “ring”: in a (2D) map, any edge is shared by at most two rings (i.e., faces), while a (3D) net consists of edges that can be shared by more than two (possibly strong) rings. A strong ring is not the sum of other smaller rings. As mentioned above, the operations on maps (i.e., on faces) are now extended to all strong rings.

Our attention was focused on the following operations: dualization Du, medial Me, truncation Tr, leapfrog Le, chamfering Q and capra Ca. To specify a 3D operation, the suffix _all (meaning that all edges or rings are operated on) is added to the name of a map operation. At the top of the figures, the sequence of net operations leading to the given structure and the lattice data are given; in some cases, the ring counting polynomial R(x) is also given. Our original software programs CageVersatile-CVNET, JSChem, Omega counter, and Nano-Studio enable such operations and counting. The points/atoms of a net can be covered by an envelope (of a given tessellation), and this envelope is ultimately a surface that obeys the Euler topological rule.
After an introduction to the all-ring operation realization, an extension of the Euler theorem will be given and exemplified as well.
5.1. Operations on Multi-Shell Cages

Onion-like fullerenes have been observed experimentally, and such structures are believed to consist of “polyhedra-into-polyhedra”: shells with the same (or a different) number of atoms, connected (or not) to each other. Scientists are well known to be interested in modifying a chemical structure in order to modify its properties, in particular the tessellation of fullerene cages. Also known is the “cube-into-cube” 2C geometrical representation; the superscript number in front of the cage name accounts here for the number of shells.

Figure 49. Double-shell Cube and its medial Med transform: (a) 2C; v = 16; e = 32; d = 4; R(x) = 24x^4; (b) Med(2C); v = 24; e = 60; d = 5; R(x) = 16x^3 + 36x^4.
In multi-shell cages, our program CVNET enables both shell-by-shell operations (when only selected shells have to be transformed) and all-rings operations. In the following, the manner of operating will always be specified. Among the operations useful in this respect, we focused first on the medial Med operation. This is because “bisection” (the other name used for Med) is the most frequent “operation” in nature and because interpenetrating networks exist in the realm of crystals (Carlucci et al. 2003, Blatov et al. 2004).

Let’s start with “cage-into-cage” structures, in which the shells are identical (Figure 49a). As an example, the Med operation (Figure 49b) is performed strictly at the level of shells, with no involvement of the edges and rings that join the shells to each other. These operations can be used to modify either closed or open structures.

Let’s now perform the Med operation on all rings, the rings joining the shells included; the shells are again identical (Figure 50). The structures are quite complex and need some detailed explanation.

Figure 50. All-Med(M) transforms of the multi-shell dodecahedron (a) and icosahedron (b): (a) Med(2D)_all; v=80; e=240; R(x) = 160x^3 + 30x^4 + 24x^5; (b) Med(2I)_all; v=72; e=240; R(x) = 160x^3 + 30x^4 + 24x^5.
Let’s now perform the operations on the triple cube 3C and show the parent cage in the figures for a better understanding (Figures 51 to 55): on the left-hand side, the transform is given with the joint part (i.e., the substructure resulting from operating on the joint rings, in red), while on the right-hand side only the shell transforms (in yellow), which are disjoint, are shown.

Figure 51. All-Med transform of the multi-shell 3C: (a) 3C & Med(3C)_all; v = 24 & 52; (b) 3C & Med(3C); v = 24 & 36.
If the shell transforms are cut off, the joint part remains as a connected (or not) substructure; in case this remainder is connected, it can be viewed as a co-net CoN (see below). In case of Du (Figure 52), the remainder (Figure 53) is connected: it is just the medial of 2C (we named it CoN{Med(2C)}, with the detailed CoN in {}) and comes from the rings joining the shells.

Figure 52. All-Du(3C) transform of the multi-shell 3C: 3C & Du(3C)_all; v = 24 & 42; 3C & Du(3C); v = 24 & 18.

Figure 53. Intercalated nets of the Du_all transform of 3C; CoN is just Med(2C): 3C & Du(3C); CoN{Med(2C)}; v = 24 & 24; 3C & Du(3C) & CoN{Med(2C)}; v = 24 & 18 & 24.
Note that our Du_all operation is different from that used in dualizing crystal networks (Delgado-Friedrichs and O’Keeffe 2005): there, the operation is defined on tiles, not on rings, so that many symmetrical nets (e.g., the cubic pcu net) appear to be self-dual, a result in disagreement with the well-known dualization of convex polyhedra.

Figure 54. The leapfrog Le(3C)_all: 3C & Le(3C)_all; v = 24 & 168; 3C & Le(3C); v = 24 & 72.
The operation Le_all (Figure 54) is related to the Du_all operation, in the sense that the remainder part (Figure 55) is related to the Med transform of the parent. However, this relatedness is not completely apparent (some edges are needed to join the vertices of lower connectivity located near the middle of the parent edges, as in the case of Med), so we left {} empty.

Figure 55. Intercalated nets of the Le transform of 3C; CoN is derived from Med(2C): 3C & Le(3C)_all CoN{}; v = 24 & 96; 3C & Le(3C) & CoN{}; v = 24 & 72 & 24.
5.2. Operations on Centered Cages

A particular result is obtained when the all-ring operations are performed on “point-into-polyhedra” (denoted by the suffix P added to the cage name). In case of Med(MP)_all, this leads to structures in which the parent cage and its medial coexist (Figure 56).

Figure 56. Med(MP)_all transforms: TP (v=10; e=30); CP (v=20; e=60); OP (v=18; e=60); DP (v=50; e=150); IP (v=42; e=150).
Figure 57. Du(MP)_all transforms: TP, CP, OP, DP and IP. TP (v=10; e=30); Du(TP)=Med(TP); CP (v=18; e=60); Du(CP)=Med(OP); OP (v=20; e=60); Du(OP)=Med(CP); DP (v=42; e=150); Du(DP)=Med(IP); IP (v=50; e=150); Du(IP)=Med(DP).
In case of the dual Du(MP)_all, the dual and its medial coexist (Figure 57). In case of Le(MP), the transforms consist of Tr(M)&Tr(Du(M))&Tr(T) (Figure 58a), except for Le(TP), where there are only two interlaced Tr(T), connected by additional vertices (Figure 59). The Tr(MP) transforms consist of Tr(M)&M and some additional vertices (of the same degree as the corresponding vertex in M) at each junction point (in all, v(M) points) of the two cages (Figure 58b).

Figure 58. All-ring transforms of the IP cage by Le (a) and Tr (b): (a) Le(IP)_all; v=150 {Tr(M)=60; Tr(Du(M))=60; Tr(T)=30}; e=300; R=184; (b) Tr(IP)_all; v=84; e=192; R=142. Observe the C60 structure inside (a, red) and outside (b, yellow) of these cages.

Figure 59. All-ring transforms of TP cage by Le (a) and Tr (b); observe the C60 structure inside (a - red) and outside (b - yellow) of these cages. Panels: (a) Le(TP)_all; v=30; e=60; r=40; (b) Le(TP)_all; simplified view.
Figure 60. Medial of a double shell toroidal structure: (a) Med(2TOR(4,4)[5,25])_all; v=625; e=2000; R(G, x) = 1000x^3 + 750x^4; (b) Med(2TOR(4,4)[5,25])_all (slide); v=40; e=100; f3=40; f4=20; g = 1.

Figure 61. Med(Med(MP))_all transforms: MP = TP, CP, OP, DP and IP; core = Med(M).
TP (v=30; e=90); R(G, x) = 50x^3 + 44x^4; Tile_ext=O.
CP (v=60; e=276); R(G, x) = 84x^3 + 68x^4; Tile_ext=O.
OP (v=60; e=156); R(G, x) = 52x^3 + 60x^4; Tile_ext=CO.
DP (v=150; e=450); R(G, x) = 210x^3 + 210x^4 + 24x^5; Tile_ext=O.
IP (v=150; e=390); R(G, x) = 130x^3 + 120x^4 + 24x^5; Tile_ext=CO.
The all-ring procedure was also applied to toroidal 2TOR structures (Figure 60). Observe that the cross-sections of the all-medialized torus are also (medial) toroids, which join each other to form the corresponding supra-torus. By iterating the Med_all operation twice, complex structures are obtained (Figure 61). All these structures show a core which is the medial Med(M) of the parent Platonic solid M. Three of them, those derived from TP, CP and DP, look like interlaced structures, consisting of Octahedron units as external tiles. The others, those derived from OP and IP, consist of Cuboctahedron units as external tiles, assembled such that clearly delimited hollows appear, as in zeolites. We used these last two structures in the construction of crystal-like lattices. Observe that the numbers of vertices of the objects in Figure 61 equal the numbers of edges of the Med(MP)_all objects (Figure 56), of which they are the Med transforms.
5.3. Euler Extended Formula for Multi-Shell Polyhedra As mentioned above, the Euler (1758) formula relates the basic map parameters to the χ characteristic of the surface S and the genus g of a graph embedded in S. In multi-shell polyhedra, the map M (a 2D lattice) is changed by the net N (a 3D lattice) and faces are changed by (strong) rings. In a 3D lattice, an edge can share more than two rings, this fact generating serious problems in counting SSSR (smallest set of smallest rings).
Nanostructure Design—between Science and Art
467
Within this paper, the rings are given in terms of the ring counting polynomial. The paradigm of the present approach is that a multi-shell polyhedral structure can be expressed as the union of its composing tiles. Tiles are elementary polyhedra of a space domain which form a tiling by sharing faces (face-to-face). A net N is carried by a tiling, which uniquely determines the net; the converse is not true, since the decomposition of a net into tiles is not unique. Moreover, there are nets formed by catenated rings, for which no tiling can be found. A tiling that consists of the smallest possible tiles, preserves the symmetry of the net and has only strong rings as faces is called a natural tiling (Delgado-Friedrichs and O’Keeffe, 2005). For a tiling with t tiles, f faces, e edges and v vertices per repeat unit, Coxeter (1973) gave the formula:

v − e + f − t = 0    (56)
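As a quick plausibility check of relation (56), the sketch below counts vertices, edges, faces and tiles in a block of the primitive cubic tiling with wrap-around (periodic) boundaries, so that every count is an exact multiple of the per-repeat-unit values. The function name `cubic_tiling_counts` and the block size are illustrative assumptions, not part of the original text.

```python
from itertools import product

def cubic_tiling_counts(n):
    """Count (v, e, f, t) for an n x n x n periodic block of the cubic tiling.

    Hypothetical helper for illustration; n >= 3 is required so that
    wrap-around does not merge distinct edges, faces or tiles.
    """
    assert n >= 3
    axes = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

    def shift(p, d):
        # Translate lattice point p by d, wrapping around modulo n.
        return tuple((a + b) % n for a, b in zip(p, d))

    vertices, edges, faces, tiles = set(), set(), set(), set()
    for p in product(range(n), repeat=3):
        vertices.add(p)
        for d in axes:
            edges.add(frozenset((p, shift(p, d))))
        for i in range(3):
            for j in range(i + 1, 3):
                a, b = axes[i], axes[j]
                faces.add(frozenset((p, shift(p, a), shift(p, b),
                                     shift(shift(p, a), b))))
        tiles.add(frozenset(shift(p, d) for d in product((0, 1), repeat=3)))
    return len(vertices), len(edges), len(faces), len(tiles)

v, e, f, t = cubic_tiling_counts(3)
print((v, e, f, t))        # (27, 81, 81, 27)
print(v - e + f - t)       # 0
```

Dividing by the 27 repeat units gives v = 1, e = 3, f = 3, t = 1 per unit, which satisfies v − e + f − t = 0.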
Table 10. Euler-Extended Formula in Multi-shell Polyhedra

No. | Object: Ring Polynomial | d (v) | Formula: v + r − e − t(s − 1) = 2(1 − g) | Meaning of t
Platonics
1 | 2C: 24x^4 | 4 | 16+24-32-6=2 | f(C)
2 | 3C: 42x^4 | 5 (8) 4 (16) | 24+42-52-6×2=2 | f(C)
3 | 2D: 30x^4 + 24x^5 | 4 | 40+54-80-12=2 | f(D)
4 | 3D: 60x^4 + 36x^5 | 5 (20) 4 (40) | 60+96-130-12×2=2 | f(D)
5 | 2I: 40x^3 + 30x^4 | 6 | 24+70-72-20=2 | f(I)
6 | 3I: 60x^3 + 60x^4 | 7 (12) 6 (24) | 36+120-114-20×2=2 | f(I)
7 | 2O: 16x^3 + 12x^4 | 5 | 12+28-30-8=2 | f(O)
8 | 3O: 24x^3 + 24x^4 | 6 (6) 5 (12) | 18+48-48-8×2=2 | f(O)
9 | 2T: 8x^3 + 6x^4 | 4 | 8+14-16-4=2 | f(T)
10 | 3T: 12x^3 + 12x^4 | 5 (4) 4 (8) | 12+24-26-4×2=2 | f(T)
In the light of the above paradigm, and given a natural tiling, the faces f can be replaced by rings r. The Euler formula extended to multi-shell polyhedra, as proposed by Diudea and Nagy (2008), reads:

v − e + r − t(s − 1) = 2(1 − g)    (57)
where s is the number of shells; for s = 1, the classical Euler formula is recovered (after identifying r with f). We limit ourselves here to the calculation of the genus g by means of the newly introduced parameters r and t; the results are presented in Tables 10 to 14. The difficulty with relation (57) is finding the parameter t that corresponds to the natural tiling. In the case of identically transformed shells, t equals the number of faces of the parent map (Tables 10 and 11). When shells are differently tessellated, t accounts for the common features (Table 12). When all rings are “operated”, the vertex number of the parent map is added (Table 13). In the case of Med(3C)_all (Table 13), the tiles per shell are illustrated in Figure 62.

Table 11. Euler-Extended Formula in Multi-shell Closed/Open Polyhedra

No. | Object: Ring Polynomial | d | Formula: v + r − e − t(s − 1) = 2(1 − g) | Meaning of t
Archimedeans
1 | Med(2C): 16x^3 + 36x^4 | 5 | 24+52-60-14=2 | f(Med(C))
2 | Med(3C): 24x^3 + 66x^4 | 5 | 36+90-96-14×2=2 | f(Med(C))
3 | Ca(2C): 96x^4 + 48x^6 | 4 | 112+144-224-30=2 | f(Ca(C))
4 | Q(2C): 60x^4 + 24x^6 | 4 | 64+84-128-18=2 | f(Q(C))
Archimedeans-Open
5 | Op(Ca(2C)): 108x^4 + 48x^7 | 4 (112) 3 (48) | 160+156-296-24=2(1-3) | f(Op(Ca(C)))
6 | Op(Q(2C)): 72x^4 + 24x^8 | 4 (64) 3 (48) | 112+96-200-12=2(1-3) | f(Op(Q(C)))
Table 12. Euler-Extended Formula in Multi-Shell Polyhedra Derived from Centered Cages

No. | Object: Ring Polynomial | d | Formula: v + r − e − t = 2(1 − g) | Meaning of t
Med(MP)-Platonics
1 | CP: 44x^3 + 12x^4 | 6 | 20+56-60-14=2 | f(Med(C))
2 | DP: 110x^3 + 24x^5 | 6 | 50+134-150-32=2 | f(Med(D))
3 | IP: 110x^3 + 20x^4 + 12x^5 | 10 (12) 6 (30) | 42+142-150-32=2 | f(Med(I))
4 | OP: 52x^3 + 6x^4 | 8 (6) 6 (12) | 18+58-60-14=2 | f(Med(O))
5 | TP: 30x^3 | 6 | 10+30-30-8=2 | f(Med(T))

Figure 62. Med(3C)_all: (a) square face tile {Med(C)}; (b) trigon face tile {two tetrahedra incident in a vertex}; (c) the fitting of the two tiles (per one shell).
Table 13. Euler-Extended Formula in Multi-Shell Archimedean Polyhedra: All-rings Operated

No. | Object: Ring Polynomial | d | Formula: v + r − e − t(s − 1) = 2(1 − g) | Meaning of t
Archimedeans-all
1 | Med(2C)_all: 64x^3 + 24x^4 | 6 | 32+88-96-22=2 | f(Med(C)) + v(C)
2 | Med(3C)_all: 120x^3 + 42x^4 | 8 (12) 6 (40) | 52+162-168-22×2=2 | f(Med(C)) + v(C)
3 | Med(2D)_all: 160x^3 + 30x^4 + 24x^5 | 6 | 80+214-240-52=2 | f(Med(D)) + v(D)
4 | Med(3D)_all: 300x^3 + 60x^4 + 36x^5 | 8 (30) 6 (100) | 130+396-420-52×2=2 | f(Med(D)) + v(D)
5 | Med(2I)_all: 160x^3 + 30x^4 + 24x^5 | 10 (12) 6 (60) | 72+214-240-44=2 | f(Med(I)) + v(I)
6 | Med(2O)_all: 64x^3 + 24x^4 | 8 (6) 6 (24) | 30+88-96-20=2 | f(Med(O)) + v(O)
7 | Med(3O)_all: 120x^3 + 42x^4 | 8 (24) 6 (24) | 48+162-168-20×2=2 | f(Med(O)) + v(O)
8 | Med(2T)_all: 40x^3 + 6x^4 | 6 | 16+46-48-12=2 | f(Med(T)) + v(T)
9 | Med(3T)_all: 60x^3 + 24x^4 | 8 (6) 6 (20) | 26+84-84-12×2=2 | f(Med(T)) + v(T)
Table 14. Euler-Extended Formula in Multi-Shell Toroidal Polyhedra

No. | Object: Ring Polynomial | d | Formula: v + r − e − t = 2(1 − g) | Meaning of t
TORI
1 | 2TOR(4,4)[7,7]: 49x^4 | 5 | 98+196-245-49=2(1-1) | f(TOR(4,4)[7,7])
2 | Med(2TOR(4,4)[7,7])_all: 392x^3 + 294x^4 | 8 (49) 6 (196) | 245+686-784-147=2(1-1) | f(Med(TOR(4,4)[7,7])) + v(TOR(4,4)[7,7]) = 98+49
3 | Med(2TOR(4,4)[5,25])_all: 1000x^3 + 750x^4 | 8 (125) 6 (500) | 625+1750-2000-375=2(1-1) | f(Med(TOR(4,4)[5,25])) + v(TOR(4,4)[5,25]) = 250+125
4 | Spongy-Dodecahedron Med(Med(IP)): 130x^3 + 120x^4 + 24x^5 | 6 (90) 4 (60) | 150+274-390-44=2(1-6) | f(Med(I)) + v(I)
5 | Spongy-Cube Med(Med(OP)): 52x^3 + 60x^4 | 6 (36) 4 (24) | 60+112-156-20=2(1-3) | f(Med(O)) + v(O)
The structures whose parameters are given in Table 14 are double-shell nets, both single tori and multi-tori (i.e., spongy structures). The spongy structures are of particular interest because of their hollows/channels. If they can be synthesized from appropriate MOFs, a possible catalytic activity can be predicted (Blatov et al., 2004).
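The genus calculation summarized in Tables 10 to 14 can be reproduced with a few lines of code. The sketch below simply solves relation (57) for g; the helper name `genus` and the selection of rows are illustrative choices, and the numerical parameters are copied from the tables (with s = 2 wherever a table lists t without the factor s − 1).

```python
from fractions import Fraction

def genus(v, e, r, t, s=2):
    """Solve v - e + r - t(s - 1) = 2(1 - g) for the genus g."""
    chi = v - e + r - t * (s - 1)
    return 1 - Fraction(chi, 2)

# (v, e, r, t, s) taken from Tables 10, 11 and 14.
rows = {
    "2C":             (16, 32, 24, 6, 2),
    "3C":             (24, 52, 42, 6, 3),
    "Op(Q(2C))":      (112, 200, 96, 12, 2),
    "2TOR(4,4)[7,7]": (98, 245, 196, 49, 2),
    "Med(Med(IP))":   (150, 390, 274, 44, 2),
}

for name, params in rows.items():
    print(name, genus(*params))
# 2C 0
# 3C 0
# Op(Q(2C)) 3
# 2TOR(4,4)[7,7] 1
# Med(Med(IP)) 6
```

A non-integer result would signal an inconsistent set of parameters, which is why exact fractions are used rather than floating-point division.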
5.4. Operations in Crystal-Like Lattices

A tiling is a filling of space by tiles sharing faces; see Baburin et al. 2005, Blatov et al. 2004, Delgado-Friedrichs and O’Keeffe 2005. The characterization of a (3D) net carried by a tiling is not a trivial task, first because of the increased dimensionality of the objects. Next, none of the indices developed so far (e.g., the Schläfli and Wells numbers or the point group symmetry) is unique, so each needs supplementary characterization. Moreover, the adjacency matrix is hard to manipulate in the case of (infinite) networks, and isomorphism checking is computationally demanding in any case (no polynomial-time algorithm is known for general graphs). Even the sequence of net operations that relates such structures is not unique.

When a polyhedral motif is repeated by translation along the coordinate axes, the resulting covering is called periodic. Most often, the pattern is embedded in a given surface, which is locally planar. Such patterns are at most 2-periodic (i.e., doubly periodic), even though the covered objects/cages form 3D nets. They can be generated by operations on maps (Section 4). Other patterns are essentially 3D objects and can form either 2- or 3-periodic (i.e., triply periodic) nets. The operations providing such units are the operations on nets.
5.4.1. 3-Periodic Tiling

This class can be constructed by: (a) operations on nets and (b) identifying faces of finite cages generated by map operations.
Figure 63. The cubic network and some of its “all-ring” operation transforms: (a) Du_all; (b) Tr_all; (c) Med_all; (d) Le_all.
An “all-ring operation” (Diudea and Nagy 2007) is an operation performed on all strong rings in a network. Among the operations developed by us, dual Du, medial Med, truncation Tr and leapfrog Le are the most important; they are illustrated for the cubic net (as the start net) in Figure 63.
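To make the idea of a ring-driven operation concrete, here is a minimal sketch of a medial-type transform that takes a list of rings (vertex cycles) as input: every parent edge becomes a new vertex, and two new vertices are joined when their parent edges are consecutive along one of the rings. The function name, the vertex labelling of the cube and the way the double-shell cube is assembled are assumptions made for this illustration; it is not the authors' software.

```python
def medial_from_rings(rings):
    """Return (new_vertices, new_edges) of a ring-driven medial transform."""
    new_vertices, new_edges = set(), set()
    for ring in rings:
        n = len(ring)
        # Parent edges met while walking once around the ring.
        ring_edges = [frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)]
        new_vertices.update(ring_edges)
        for i in range(n):
            new_edges.add(frozenset((ring_edges[i], ring_edges[(i + 1) % n])))
    return new_vertices, new_edges

# Cube C: vertex index = x + 2y + 4z, faces given as 4-cycles.
cube_faces = [[0, 1, 3, 2], [4, 5, 7, 6], [0, 1, 5, 4],
              [2, 3, 7, 6], [0, 2, 6, 4], [1, 3, 7, 5]]
mv, me = medial_from_rings(cube_faces)
print(len(mv), len(me))    # 12 24  (the cuboctahedron Med(C))

# Double-shell cube: a second copy (labels shifted by 8) joined to the first
# by one square ring per cube edge, giving 6 + 6 + 12 = 24 rings.
cube_edges = {tuple(sorted((f[i], f[(i + 1) % 4])))
              for f in cube_faces for i in range(4)}
inner_faces = [[x + 8 for x in f] for f in cube_faces]
link_rings = [[u, w, w + 8, u + 8] for u, w in cube_edges]
mv2, me2 = medial_from_rings(cube_faces + inner_faces + link_rings)
print(len(mv2), len(me2))  # 32 96
```

The second result agrees with the values v = 32 and e = 96 entering the formula for Med(2C)_all in Table 13.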
Figure 64. The cubic network and its Med_all (3-periodic) transform with the co-net {Du(C)_all}, in orthoscopic view (c) and in 3D (d).
(a) Med(C); [111]; t=1; v=12; e=24; r(G,x) = 8x^3 + 6x^4
(b) Med(C); [444]; t=64; v=300
(c) Med(C)_all; [222]; t=8; v=54; e=144; r(G,x) = 64x^3 + 45x^4; & CoN{Du(C)_all}; Ortho view
(d) Med(C)_all; [222] & CoN{Du(C)_all}
Figure 65. The cubic network and its Le_all (3-periodic) transform with the co-net {Le(C)_all}, in orthoscopic view (c) and in 3D (d).
(a) Le(C); [111]; t=1; v=24; e=36; r(G,x) = 6x^4 + 8x^6
(b) Le(C); {4}; [444]; t=64; v=960
(c) Le(C)_all; {4}; [222]; t=8; v=144; e=240; r(G,x) = 42x^4 + 64x^6; & CoN{Le(C)_all}; Ortho view
(d) Le(C)_all; self-CoN
As in the case of fullerene tessellation, where a covering pattern, particularly a circulene flower, has its own co-flower (there are at least two patterns in any covering), a net N has its own co-net CoN (written in braces {} in Figures 64 and 65).
5.4.2. 2-Periodic Tiling

This class includes structures built up by identifying faces of finite cages generated by map operations. For example, Q is not suitable for “all-ring” operations because it preserves the old (parent) points. Instead, the cage unit Q(C) derived from the Cube can be assembled in various ways, by identifying either the r4 rings, or the r6 rings, or both r4 and r6. Figure 66 illustrates the Q(C);{4,6} (2-periodic) network, which is translationally periodic in two directions of space. For a better characterization, particularly of nanostructures, we proposed the Omega polynomial (Diudea 2006, Diudea 2009, Diudea et al. 2006a, 2009; Vizitiu et al. 2007), which counts the opposite edge strips (ops) in the associated graphs.
Figure 66. The net Q(C);{4,6}; 2-periodic, by Q(M) operation.
(a) Q(C); {4,6}; [111]; t=1; v=32; e = 4 × 6 + 3 × 8 = 48; r(G,x) = 6x^4 + 12x^6
(b) Q(C); {4,6}; [444]; t=64; v=1280
(c) Q(C); {4,6}; [222]; t=8; v=192; (aOb)-view; r(G,x) = 52x^4 + 92x^6; r=144
(d) Q(C); {4,6}; [222]; t=8; v=192; (aOc & bOc)-view; Ω(G,x) = 8x^10 + 4x^14 + 4x^20 + 2x^24 + 2x^28; e=320
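The counting behind the Omega polynomial can be sketched as follows, under the simplifying assumption that the structure is given as a list of even-sized faces and that two edges are "opposite" when they face each other across one of those faces; strips are then the classes obtained by chaining this relation. The function names are hypothetical. The last lines use the fact that the first derivative of Ω at x = 1 equals the number of edges, applied to the polynomial quoted in Figure 66.

```python
from collections import Counter

def omega_polynomial(faces):
    """Return {strip length: number of strips} for a map with even faces."""
    parent = {}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for face in faces:
        n = len(face)
        if n % 2:
            raise ValueError("opposite edges need even-sized faces here")
        edges = [frozenset((face[i], face[(i + 1) % n])) for i in range(n)]
        for edge in edges:
            parent.setdefault(edge, edge)
        for i in range(n // 2):
            # Merge each edge with the one opposite to it in this face.
            ra, rb = find(edges[i]), find(edges[i + n // 2])
            if ra != rb:
                parent[ra] = rb

    strip_sizes = Counter(find(edge) for edge in list(parent))
    return dict(Counter(strip_sizes.values()))

def edge_count(poly):
    """First derivative of the polynomial at x = 1."""
    return sum(length * count for length, count in poly.items())

cube_faces = [[0, 1, 3, 2], [4, 5, 7, 6], [0, 1, 5, 4],
              [2, 3, 7, 6], [0, 2, 6, 4], [1, 3, 7, 5]]
cube_omega = omega_polynomial(cube_faces)
print(cube_omega, edge_count(cube_omega))   # {4: 3} 12

fig66_omega = {10: 8, 14: 4, 20: 4, 24: 2, 28: 2}
print(edge_count(fig66_omega))              # 320
```

For the cube the three strips are the three families of parallel edges, so Ω(C, x) = 3x^4; for the Figure 66 net the derivative reproduces the quoted e = 320.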
Conclusion

Covering a surface or filling a space domain with various polygonal/polyhedral (repeat) units was the aim of this paper, which tried to meet the requirements of two well-established sciences: nano-science and crystallography, respectively. Although the energetic characterization of the discussed structures was omitted here (the reader can consult the quoted references), simple ways to design such structures, real or hypothetical, were proposed. Most of the discussed methods originate in the work of the TOPO GROUP Cluj, Romania, or are those most studied by our group, in studies assisted by original software, as mentioned in the first section. These programs enable one to discretize the well-known smooth surfaces or space domains in order to generate or transform a given structure with a well-defined covering/tiling. In addition to some classical map operations, performed on the polygonal faces of a covering, the generalized operations on maps and the net operations represent valuable tools in the design of nanostructures. They provide input for more elaborate energy calculations devoted to structure stability and reactivity, or to the dynamics of their interaction/distribution with various media, from the inorganic realm up to the most complex biological systems.
References

[1] M. Ahlskog, E. Seynaeve, R. J. M. Vullers, C. Van Haesendonck, A. Fonseca, K. Hernadi, and J. B. Nagy, “Ring formations from catalytically synthesized carbon nanotubes”, Chemical Physics Letters, vol. 300, pp. 202-206, 1999.
[2] D. Babić, D. J. Klein, and T. G. Schmalz, “Curvature matching and strain relief in bucky-tori: usage of sp3-hybridization and nonhexagonal rings”, Journal of Molecular Graphics and Modelling, vol. 19, pp. 222-231, 2001.
[3] I. A. Baburin, V. A. Blatov, L. Carlucci, G. Ciani and D. M. Proserpio, “Interpenetrating metal-organic and inorganic 3D networks: a computer-aided systematic investigation. Part II [1]. Analysis of the Inorganic Crystal Structure Database (ICSD)”, Journal of Solid State Chemistry, vol. 178, pp. 2452-2474, 2005.
[4] E. Barborini, P. Piseri, P. Milani, G. Benedek, C. Ducati, and J. Robertson, “Negatively curved spongy carbon”, Applied Physics Letters, vol. 81, pp. 3359-3361, 2002.
[5] G. Benedek, H. Vahedi-Tafreshi, E. Barborini, P. Piseri, P. Milani, C. Ducati, and J. Robertson, “The structure of negatively curved spongy carbon”, Diamond and Related Materials, vol. 12, pp. 768-773, 2003.
[6] C. Berris, G. H. Hovakeemian, Y.-H. Lai, H. Mestdagh, and K. P. C. Vollhardt, “A new approach to the construction of biphenylenes by the cobalt-catalyzed cocyclization of o-diethynylbenzenes with alkynes. Application to an iterative approach to [3]phenylene, the first member of a novel class of benzocyclobutadienoid hydrocarbons”, Journal of the American Chemical Society, vol. 107, pp. 5670-5687, 1985.
[7] H. F. Bettinger, B. I. Yakobson, and G. E. Scuseria, “Scratching the surface of buckminsterfullerene: The barriers for Stone-Wales transformation through symmetric and asymmetric transition states”, Journal of the American Chemical Society, vol. 125, pp. 5572-5580, 2003.
[8] V. A. Blatov, L. Carlucci, G. Ciani and D. M. Proserpio, “Interpenetrating metal-organic and inorganic 3D networks: a computer-aided systematic investigation. Part I. Analysis of the Cambridge structural database”, Crystal Engineering Communications, vol. 6, pp. 378-395, 2004.
[9] S. A. Bovin, L. F. Chibotaru, and A. Ceulemans, “The quantum structure of carbon tori”, Journal of Molecular Catalysis, vol. 166, pp. 47-52, 2001.
[10] G. Brinkmann and P. W. Fowler, “Spiral coding of leapfrog polyhedra”, Journal of Chemical Information and Computer Science, vol. 38, pp. 463-468, 1998.
[11] CageVersatile_CVNET, M. Stefu and M. V. Diudea, Babes-Bolyai University, Cluj, 2005.
[12] L. Carlucci, G. Ciani and D. M. Proserpio, “Borromean links and other nonconventional links in polycatenated coordination polymers: re-examination of some puzzling networks”, Crystal Engineering Communications, vol. 5, pp. 269-279, 2003.
[13] A. Ceulemans, L. F. Chibotaru, and P. W. Fowler, “Molecular Anapole Moments”, Physical Review Letters, vol. 80, pp. 1861-1864, 1998.
[14] H. S. M. Coxeter, Regular polytopes, Methuen and Co., 1948; 3rd Ed., Dover Pubs, Dover, 1973.
[15] V. H. Crespi, L. X. Benedict, M. L. Cohen, and S. G. Louie, “Prediction of a pure-carbon planar covalent metal”, Physical Review B, vol. 53, pp. 13303-13305, 1996.
[16] O. Delgado-Friedrichs and M. O’Keeffe, “Crystal nets as graphs: Terminology and definitions”, Journal of Solid State Chemistry, vol. 178, pp. 2480-2485, 2005.
[17] M. Deza, P. W. Fowler, M. Shtorgin, and K. Vietze, “Crystal nets as graphs: Terminology and definitions”, Journal of Chemical Information and Computer Science, vol. 40, pp. 1325-1332, 2000.
[18] M. V. Diudea, “Toroidal Graphenes from 4-Valent Tori”, Bulletin of the Chemical Society of Japan, vol. 75, pp. 487-492, 2002a.
[19] M. V. Diudea, “Phenylenic and naphthylenic tori”, Fullerenes, Nanotubes, Carbon Nanostructures, vol. 10, pp. 273-292, 2002b.
[20] M. V. Diudea, “Topology of naphthylenic tori”, Physical Chemistry Chemical Physics, vol. 4, pp. 4740-4746, 2002c.
[21] M. V. Diudea, “Capra-a leapfrog related operation on maps”, Studia Universitatis “Babes-Bolyai”, vol. 48, no. 2, pp. 3-16, 2003a.
[22] M. V. Diudea, “Nanotube covering modification”, Studia Universitatis “Babes-Bolyai”, vol. 48, no. 2, pp. 17-26, 2003b.
[23] M. V. Diudea, “Covering Forms in Nanostructures”, Forma (Tokyo), vol. 19, no. 3, pp. 131-163, 2004.
[24] M. V. Diudea, “Nanoporous Carbon Allotropes by Septupling Map Operations”, Journal of Chemical Information and Computer Modeling, vol. 45, pp. 1002-1009, 2005.
[25] M. V. Diudea, Ed., Nanostructures, Novel Architecture, Nova, N. Y., 2005.
[26] M. V. Diudea, “Omega Polynomial”, Carpathian Journal of Mathematics, vol. 22, pp. 43-47, 2006.
[27] M. V. Diudea, “Omega Polynomial in Twisted/Chiral Polyhex Tori”, Journal of Mathematical Chemistry, vol. 45, pp. 309-315, 2009.
[28] M. V. Diudea, S. Cigher, A. E. Vizitiu, O. Ursu and P. E. John, “Omega polynomial in tubular nanostructures”, Croatica Chemica Acta, vol. 79, pp. 445-448, 2006a.
[29] M. V. Diudea, M. Ştefu, P. E. John, and A. Graovac, “Generalized operations on maps”, Croatica Chemica Acta, vol. 79, pp. 355-362, 2006b.
[30] M. V. Diudea, S. Cigher, A. E. Vizitiu, M. S. Florescu and P. E. John, “Omega polynomial and its use in nanostructures description”, Journal of Mathematical Chemistry, vol. 45, pp. 316-329, 2009.
[31] M. V. Diudea and P. E. John, “Covering polyhedral tori”, MATCH, Communications in Mathematical and Computational Chemistry, vol. 44, pp. 103-116, 2001.
[32] M. V. Diudea, P. E. John, A. Graovac, M. Primorac, and T. Pisanski, “Leapfrog and Related Operations on Toroidal Fullerenes”, Croatica Chemica Acta, vol. 76, pp. 153-159, 2003.
[33] M. V. Diudea and E. C. Kirby, “The energetic stability of tori and single-wall tubes”, Fullerene Science and Technology, vol. 9, pp. 445-465, 2001.
[34] M. V. Diudea and Cs. L. Nagy, Periodic Nanostructures, Springer, 2007.
[35] M. V. Diudea, Cs. L. Nagy, I. Silaghi-Dumitrescu, A. Graovac, D. Janežič, and D. Vikic-Topić, “Periodic cages”, Journal of Chemical Information and Computer Modeling, vol. 45, pp. 293-299, 2005.
[36] M. V. Diudea and C. L. Nagy, “Euler Formula in Multi-Shell Polyhedra”, MATCH Communications in Mathematical and in Computer Chemistry, vol. 60, no. 3, pp. 835-844, 2008.
[37] M. V. Diudea, B. Parv, and E. C. Kirby, “Azulenic Tori”, MATCH Commun. Math. Comput. Chem., vol. 47, pp. 53-70, 2003.
[38] A. Dress and G. Brinkmann, “Phantasmagorical fulleroids”, MATCH, Communications in Mathematical and Computational Chemistry, vol. 33, pp. 87-100, 1996.
[39] V. Eberhard, Zur Morphologie der Polyeder, Teubner, Leipzig, 1891.
[40] S. El-Basil, “Combinatorial Self-Similarity”, Croatica Chemica Acta, vol. 69, pp. 1117-1148, 1996.
[41] M. Endo, S. Iijima, and M. S. Dresselhaus, Carbon Nanotubes, Pergamon, 1996.
[42] EPINET, http://epinet.anu.edu.au/about.
[43] L. Euler, “Solutio Problematis ad Geometriam Situs Pertinentis”, Commentarii Academiae Scientiarum Imperialis Petropolitanae, vol. 8, pp. 128-140, 1736.
[44] L. Euler, “Elementa doctrinae solidorum”, Novi Commentarii Academiae Scientiarum Imperialis Petropolitanae, vol. 4, pp. 109-160, 1758.
[45] S. L. Fang, A. M. Rao, P. C. Eklund, P. Nikolaev, A. G. Rinzler and R. E. Smalley, “Raman scattering study of coalesced single walled carbon nanotubes”, Journal of Materials Research, vol. 13, pp. 2405-2411, 1998.
[46] P. W. Fowler, “How unusual is C60? Magic numbers for carbon clusters”, Chemical Physics Letters, vol. 131, pp. 444-450, 1986.
[47] P. W. Fowler, “Carbon cylinders: a class of closed-shell clusters”, Journal of Chemical Society, Faraday Transactions, vol. 86, pp. 2073-2077, 1990.
[48] P. W. Fowler, “Fullerene graphs with more negative than positive eigenvalues: The exceptions that prove the rule of electron deficiency?”, Journal of Chemical Society, Faraday Transactions, vol. 93, pp. 1-3, 1997.
[49] P. W. Fowler and D. E. Manolopoulos, An atlas of fullerenes, Oxford University Press, Oxford, U.K., 1995.
[50] P. W. Fowler and T. Pisanski, “Leapfrog transformations and polyhedra of Clar type”, Journal of Chemical Society, Faraday Transactions, vol. 90, pp. 2865-2871, 1994.
[51] P. W. Fowler, T. Pisanski, A. Graovac, and J. Žerovnik, “A classification of centrally-symmetric and cyclic 12-vertex triangulations of S2”, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 51, pp. 175-187, 2000.
[52] P. W. Fowler and K. M. Rogers, “Eigenvalue relations for decorated trivalent polyhedra. Connections between the fullerenes and their fulleren-yne and spheriphane relatives”, Journal of Chemical Society, Faraday Transactions, vol. 94, pp. 1019-1027, 1998a.
[53] P. W. Fowler and K. M. Rogers, “Eigenvalue spectra of leapfrog polyhedra”, Journal of Chemical Society, Faraday Transactions, vol. 94, pp. 2509-2514, 1998b.
[54] P. W. Fowler and J. I. Steer, “The leapfrog principle: a rule for electron counts of carbon clusters”, Journal of Chemical Society, Chemical Communications, pp. 1403-1405, 1987.
[55] M. Goldberg, “A class of multi-symmetric polyhedra”, Tôhoku Mathematical Journal, vol. 43, pp. 104-108, 1937.
[56] A. Graovac, D. Plavšić, M. Kaufman, T. Pisanski, and E. C. Kirby, “Condensed Phase Dynamics, Structure, and Thermodynamics: Spectroscopy, Reactions, and Relaxation Application of the adjacency matrix eige”, Journal of Chemical Physics, vol. 113, pp. 1925-1931, 2000.
[57] B. Grünbaum, H. D. Löckenhoff, G. C. Shephard, and A. Temesvari, “The enumeration of normal 2-homeohedral tilings”, Geometriae Dedicata, vol. 19, pp. 109-174, 1985.
[58] N. Hamada, S. Sawada, and A. Oshiyama, “New one-dimensional conductors: Graphitic microtubules”, Physical Review Letters, vol. 68, pp. 1579, 1992.
[59] F. Harary, Graph Theory, Addison-Wesley, Reading, MA, 1969.
[60] G. Hart, http://www.georgehart.com/sculpture
[61] HSAKA, www2u.biglobe.ne.jp/~hsaka.
[62] E. Hernández, V. Meunier, B. W. Smith, R. Rurali, H. Terrones, M. Buongiorno Nardelli, M. Terrones, D. E. Luzzi, and J.-C. Charlier, “Fullerene Coalescence in Nanopeapods: A Path to Novel Tubular Carbon”, Nano Letters, vol. 3, pp. 1037-1042, 2003.
[63] E. Hückel, “Quantentheoretische Beiträge zum Benzolproblem”, Zeitschrift für Physik, vol. 70, pp. 204-286, 1931.
[64] S. Ihara, S. Itoh and J-i. Kitakami, “Toroidal forms of graphitic carbon”, Physical Review B, vol. 47, pp. 12908-12911, 1993.
[65] S. Iijima, “Helical microtubules of graphitic carbon”, Nature, vol. 354, pp. 56-58, 1991.
[66] S. Iijima, T. Ichihashi, and Y. Ando, “Pentagons, Heptagons and Negative Curvature in Graphite Microtubule Growth”, Nature, vol. 356, pp. 776-778, 1992.
[67] S. Itoh and S. Ihara, “Toroidal forms of graphitic carbon. II. Elongated tori”, Physical Review B, vol. 48, pp. 8323-8328, 1993.
[68] S. Itoh, S. Ihara, and J-i. Kitakami, “Toroidal forms of carbon C360”, Physical Review B, vol. 47, pp. 1703-1704, 1993.
[69] J. K. Johnson, B. N. Davidson, M. R. Pederson, and J. Q. Broughton, “Energetics and structure of toroidal forms of carbon”, Physical Review B, vol. 50, pp. 17575-17582, 1994.
[70] JSChem, Cs. L. Nagy, M. V. Diudea, Babes-Bolyai University, Cluj, 2005.
[71] R. B. King and M. V. Diudea, “From the cube to the Dyck and Klein tessellations: Implications for the structures of zeolite-like carbon and boron nitride allotropes”, Journal of Mathematical Chemistry, vol. 38, no. 4, pp. 425-435, 2005.
[72] R. B. King and M. V. Diudea, “The chirality of icosahedral fullerenes: a comparison of the tripling (leapfrog), quadrupling (chamfering), and septupling (capra) transformations”, Journal of Mathematical Chemistry, vol. 39, pp. 597-604, 2006.
[73] E. C. Kirby, “Cylindrical and toroidal polyhex structures”, Croatica Chemica Acta, vol. 66, pp. 13-26, 1993.
[74] E. C. Kirby, “On toroidal azulenoids and other shapes of fullerene cage”, Fullerene Science and Technology, vol. 2, pp. 395-404, 1994.
[75] E. C. Kirby, “Fully Arenoid Toroidal Fullerenes, Both Benzenoid and Non-Benzenoid”, MATCH, Communications in Mathematical and Computational Chemistry, vol. 33, pp. 147-156, 1996.
[76] E. C. Kirby, R. B. Mallion, and P. Pollak, “Toroidal polyhexes”, Journal of Chemical Society, Faraday Transactions, vol. 89, pp. 1945-1953, 1993.
[77] E. C. Kirby and P. Pollak, “How to enumerate the connectional isomers of a toroidal polyhex fullerene”, Journal of Chemical Information and Computer Science, vol. 38, pp. 66-70, 1998.
[78] J. Klein, T. P. Živković, and A. T. Balaban, “The fractal family of coro(n)enes”, MATCH, Communications in Mathematical and Computational Chemistry, vol. 29, pp. 107-130, 1993.
[79] J. Klein and H. Zhu, “All-conjugated carbon species”, in: From Chemical Topology to Three-Dimensional Geometry, (Ed. A. T. Balaban), Plenum Press, New York, 1997, pp. 297-341.
[80] I. Laszlo and A. Rassat, “The Geometric Structure of Deformed Nanotubes and the Topological Coordinates”, Journal of Chemical Information and Computer Science, vol. 43, pp. 519-524, 2003.
[81] I. Laszlo, A. Rassat, P. W. Fowler, and A. Graovac, “Topological coordinates for toroidal structures”, Chemical Physics Letters, vol. 342, pp. 369-374, 2001.
[82] B. de La Vaissière, P. W. Fowler, and M. Deza, “Codes in Archimedean and Catalan polyhedra”, Journal of Chemical Information and Computer Science, vol. 41, pp. 376-386, 2001.
[83] S. Lebedkin, W. E. Hull, A. Soldatov, B. Renker and M. M. Kappes, “Structure and Properties of the Fullerene Dimer C140 Produced by Pressure Treatment of C70”, Journal of Physical Chemistry B, vol. 104, pp. 4101-4110, 2000.
[84] T. Lenosky, X. Gonze, M. Teter, and V. Elser, “Energetics of negatively curved graphitic carbon”, Nature, vol. 355, pp. 333-335, 1992.
[85] M. F. Lin and D. S. Chuu, “Persistent currents in toroidal carbon nanotubes”, Physical Review B, vol. 57, pp. 6731-6737, 1998.
[86] J. Liu, H. Dai, J. H. Hafner, D. T. Colbert, R. E. Smalley, S. J. Tans, and C. Dekker, “Fullerene Crop Circles”, Nature, vol. 385, pp. 780-781, 1997.
[87] L. Mackay and H. Terrones, “Diamond from graphite”, Nature, vol. 352, pp. 762, 1991.
[88] E. Manolopoulos, J. C. May and S. E. Down, “Theoretical studies of the fullerenes: C34 to C70”, Chemical Physics Letters, vol. 181, pp. 105-111, 1991.
[89] R. Martel, H. R. Shea, and Ph. Avouris, “Rings of single-walled carbon nanotubes”, Nature, vol. 398, pp. 299-299, 1999a.
[90] R. Martel, H. R. Shea, and Ph. Avouris, “Ring formation in single-wall carbon nanotubes”, Journal of Physical Chemistry B, vol. 103, pp. 7551-7556, 1999b.
[91] D. Marušić and T. Pisanski, “Symmetries of hexagonal molecular graphs on the torus”, Croatica Chemica Acta, vol. 73, pp. 969-981, 2000.
[92] V. Meunier, Ph. Lambin, and A. A. Lucas, “Atomic and Electronic Structures of Large and Small Carbon Tori”, Physical Review B, vol. 57, pp. 14886-14890, 1998.
[93] NANO-Studio, Cs. L. Nagy, M. V. Diudea, Babes-Bolyai University, Cluj, 2009.
[94] Omega Counter, S. Cigher and M. V. Diudea, Babes-Bolyai University, Cluj, 2006.
[95] D. Orlikowski, M. B. Nardelli, J. Bernholc, and Ch. Ronald, “Theoretical STM signatures and transport properties of native defects in carbon nanotubes”, Physical Review B, vol. 61, pp. 14194-14203, 2000.
[96] T. Pisanski and M. Randić, “Bridges between geometry and graph theory”, Geometry at Work, MAA Notes vol. 53, pp. 174-194, 2000.
[97] T. Pisanski and J. Shawe-Taylor, “Characterising graph drawing with eigenvectors”, Journal of Chemical Information and Computer Science, vol. 40, pp. 567-571, 2000.
[98] L. Schläfli, Theorie der Vielfachen Kontinuität, 1901.
[99] K. Singh, V. Kumar, and Y. Kawazoe, “Metal encapsulated nanotubes of silicon and germanium”, Journal of Materials Chemistry, vol. 14, pp. 555-563, 2004.
[100] SYSTRE, http://gavrog.sourceforge.net
[101] J. Stone and D. J. Wales, “Theoretical-Studies of Icosahedral C60 and Some Related Species”, Chemical Physics Letters, vol. 128, pp. 501-503, 1986.
[102] M. Stefu, M. V. Diudea and P. E. John, “Composite operations on maps”, Studia Universitatis “Babes-Bolyai”, vol. 50, no. 2, pp. 165-174, 2005.
[103] H. Terrones and A. L. Mackay, “From C60 to negatively curved graphite”, Progress in Crystal Growth and Characterization, vol. 34, pp. 25-36, 1997.
[104] TOPOS, http://www.topos.ssu.samara.ru.
[105] TORUS, M. V. Diudea, B. Parv and O. Ursu, “Babes-Bolyai” University, Cluj, 2003.
[106] A. E. Vizitiu, M. V. Diudea, S. Nikolić and D. Janežić, “Retro-leapfrog and related retro map operations”, Journal of Chemical Information and Modeling, vol. 46, pp. 2574-2578, 2006.
[107] A. E. Vizitiu, S. Cigher, M. V. Diudea, M. S. Florescu, “Omega polynomial in ((4,8)3) tubular nanostructures”, MATCH Communications in Mathematical and Computational Chemistry, vol. 57, no. 2, pp. 457-462, 2007.
[108] K. P. C. Vollhardt, “The phenylenes”, Pure and Applied Chemistry, vol. 65, pp. 153-156, 1993.
[109] C. Vollhardt and D. L. Mohler, in: B. Halton (Ed.), “The phenylenes: Synthesis, properties, and reactivity”, Advances in Strain in Organic Chemistry, JAI Press, London, pp. 121-160, 1966.
[110] M. Yoshida, M. Fujita, P. W. Fowler, and E. C. Kirby, “Non-bonding orbitals in graphite, carbon tubules, toroids and fullerenes”, Journal of Chemical Society, Faraday Transactions, vol. 93, pp. 1037-1043, 1997.
[111] Q. Zhao, M. B. Nardelli, and J. Bernholc, “Ultimate strength of carbon nanotubes: A theoretical study”, Physical Review B, vol. 65, pp. 144105-1 - 144105-6, 2000.
In: Electrostatics: Theory and Applications
Editor: Camille L. Bertrand, pp. 479-497
ISBN 978-1-61668-549-2
© 2010 Nova Science Publishers, Inc.

Chapter 18

QUANTIFYING STRUCTURAL COMPLEXITY OF GRAPHS: INFORMATION MEASURES IN MATHEMATICAL CHEMISTRY

Matthias Dehmer1, Frank Emmert-Streib2, Yury Robertovich Tsoy3 and Kurt Varmuza4
1 Institute for Bioinformatics and Translational Research, UMIT, Eduard Wallnoefer Zentrum 1, A-6060 Hall in Tyrol, Austria
2 Computational Biology and Machine Learning, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen’s University Belfast, 97 Lisburn Road, Belfast, BT9 7BL, UK
3 Tomsk Polytechnic University, Lenin Avenue 30, 634050 Tomsk, Russia
4 Laboratory for Chemometrics, Vienna University of Technology, Institute of Chemical Engineering, Getreidemarkt 9/166, A-1060 Vienna, Austria
Abstract

In this chapter, we give a conceptual view of information measures for graphs which can be used to quantify their structural complexity. We focus on treating such measures in the context of mathematical chemistry, but we note that they are also applicable to arbitrary complex networks. Besides reviewing the best-known information indices used in chemical graph theory, we propose an information functional that is based on degree-degree associations in a graph. This leads us to a parametric graph entropy measure for quantifying the structural information content of a graph. A brief numerical example shows how the measure can be calculated explicitly.
1. Introduction

Statistical and information-theoretic methods to characterize networks are currently of considerable interest, see [44, 47]. For example, this relates to the development of statistical correlation measures and of information measures such as entropy, conditional entropy, and mutual information for structurally analyzing networks [3-7]. It is important to note that the existing classical approaches for quantifying the structural complexity of chemical graphs are mostly based on applying Shannon’s entropy formula [45] to a finite probability distribution induced by a certain equivalence criterion [8-15]. As the main contribution of this chapter, we define a novel information-theoretic functional to quantify the structural information of undirected and connected networks. By using a recently proposed method [13] to determine the topological entropy of graphs, we finally obtain a parametric family of graph entropy measures. Further, we give a review of existing information measures for characterizing chemical structures represented by graphs. We want to emphasize that our resulting graph entropy measures can be applied to chemical graphs [3, 49] as well as to general complex networks because their time complexity is polynomial. This can be proven similarly as in [13].

This chapter is organized as follows: Section 2 gives a short overview of the usage of general topological descriptors in mathematical chemistry. Section 2 also presents some approaches from chemometrics to evaluate the topological descriptors statistically. In Section 3 we start the conceptual part of the chapter by stating some mathematical preliminaries. The review of existing information indices often used in mathematical chemistry is given in Section 4. By using the method outlined in Section 3.1, we define a novel information functional based on degree-degree associations in Section 5. As a result, we obtain a parametric entropy measure for quantifying the structural complexity of graphs. A numerical example is given in Section 5.1. In Section 6, the chapter finishes with a short summary and conclusion.
2.
Topological Descriptors and Chemometrics
The development and efficient use of formal representations of chemical structures is a prominent task in chemistry. A typical molecule in organic chemistry (chemistry with carbon-containing molecules) consists of atoms connected by chemical bonds. Most common elements are carbon (C), hydrogen (H), nitrogen (N) and oxygen (O), but many others may be present in an organic molecule. The most common bond types are single bond, double bond, triple bond, and aromatic bond. A molecule is of course a 3-dimensional structure, but a representation of only the connectivities (atoms, bonds) is often an efficient approach for describing a molecule [29]. Graph theory is a powerful mathematical tool for a simple representation of molecular structures. In general, weighted graphs are used with (weighted) vertices for the atoms and edges for the bonds. Double and triple bonds between atoms are described by the same number of edges between the vertices; aromatic bonds can be replaced by alternating single and double bonds in an aromatic ring. Hydrogen atoms are often not considered (H-depleted structures) in this representation. In numerous applications of graph theory for chemical structures only skeletons are considered, that means all vertices (atoms) are considered to be equal and all edges (bonds) are considered to be equal (see Figure (1)). A topological descriptor is usually a graph invariant characterizing a certain feature of the graph and thus of the chemical structure [48]. During the last decades chemists have defined some hundreds of topological descriptors, some of them are rather abstract (and
Quantifying Structural Complexity of Graphs
Figure 1. Various representations of the molecule acetic acid: two forms of brutto formulae, connectivities including H-atoms, H-depleted structure, and skeleton (a 3-dimensional molecule is not shown).
less accepted by some parts of the chemistry community), others are based on chemical considerations. A topological index is independent of the vertex numbering and of course independent of any 2D-representation of the graph. A topological descriptor is often a single number (called a molecular topological index) or a sequence of numbers. Usually, topological indices are calculated from H-depleted molecular graphs; some consider the different atoms and bonds, others do not and only consider the skeleton. Topological indices characterize structural features, such as branching, symmetry, shape or size. A number of topological indices are based on topological distances between atoms (corresponding to the number of bonds). Further concepts are based on counting the number of vertices and edges or on determining subgraph isomorphisms. Global topological indices describe the entire chemical structure, local topological indices refer to the atoms. A special type of topological descriptors are topological information indices [3, 12]. Starting from a molecular graph, a probability distribution can be derived by applying certain equivalence criteria to group graph elements (e.g., vertices) into equivalence classes. This procedure results in concrete information measures that represent the entropy of the underlying graph topology (also see Section (4.)). Such measures are also interpreted as the structural information content of a graph. In general, molecular descriptors have been defined as “the final result of a logical and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiments” [48]. Besides topological descriptors a great number of other descriptors have been suggested to characterize various features of molecular structures. Not all descriptors are “useful” because many descriptors are highly correlated or even identical.
A commercial software package [19] calculates more than 2000 descriptors from molecular structures if 3D-atom coordinates and all hydrogen atoms are given. Depending on the structural data used, we distinguish 0D-descriptors (e.g., number of atoms), 1D-descriptors (e.g., quantities of electrical charges in a molecule), 2D-descriptors (e.g., topological indices including information indices), and 3D-descriptors (e.g., sum of geometrical distances between selected atoms). Further, characterizing chemical structures by a set of numbers (molecular descriptors) is a vehicle for the treatment of essential problems in chemistry, such as the construction of chemical structure databases, searches for identical or similar chemical structures, and especially for generating mathematical models describing relationships between chemical structures and
M. Dehmer, F. Emmert-Streib, Y.R. Tsoy and K. Varmuza
physical/chemical/biological properties of chemical compounds. Besides characterizing chemical graphs by using topological indices, the comparison of chemical structures - based on molecular descriptors - is an important task in chemoinformatics. The similarity of chemical structures is often expressed by the similarity of vectors consisting of binary components [54]. Each vector component denotes the presence (value 1) or absence (value 0) of a given substructure, and is called a binary substructure descriptor; a substructure may be, e.g., a benzene ring, a methylester group -CH2-COOCH3, or simply a given number of nitrogen atoms. Appropriate sets of substructures - covering a wide area of chemical structures - have been defined and software is available for a fast computation of, say, 1000 such descriptors for some 10,000 chemical structures [53]. An appropriate and widely used measure for the similarity of such binary vectors is the Tanimoto index, also called Jaccard similarity coefficient [51, 54]. Let $x_A$ and $x_B$ be binary vectors with $m$ components for two chemical structures A and B, respectively; the Tanimoto index $t$ is given by
$$ t = \frac{\sum_{j} \mathrm{AND}(x_{Aj}, x_{Bj})}{\sum_{j} \mathrm{OR}(x_{Aj}, x_{Bj})}, \quad j = 1, 2, \ldots, m, \tag{1} $$
where $\sum \mathrm{AND}()$ stands for the number of binary descriptors with a “1” in both vectors, and $\sum \mathrm{OR}()$ denotes the number of binary descriptors with a “1” in at least one of the vectors. The Tanimoto index reaches the maximum value 1 if all descriptors are pairwise equal; in this case structures A and B are considered to be very similar or are even identical. The distribution of $t$ for randomly selected pairs of structures has been suggested as a measure of the diversity of a chemical structure database [16]. Quantitative relationships between chemical structure data and properties or activities of chemical compounds (QSPR or QSAR) are of great interest in chemoinformatics [25] and drug design [33, 55].
In this context chemometrics plays an important role because multivariate data analysis methods are applied for the development of QSPR/QSAR models [17]. Main parts of chemometrics are devoted to analytical chemistry [51] but the aim of this discipline has been more generally defined as “providing maximum chemical information from chemistry-relevant data” [30]. The prominent mathematical and statistical tools applied in chemometrics belong to multivariate data analysis. A successful strategy for the development of QSPR/QSAR models is to characterize chemical structures by a set of molecular descriptors $(x_1, \ldots, x_m)$ collected in a vector $x$, and to create an empirical regression model $\hat{y} = b_0 + x^{T} b$ with $\hat{y}$ for a predicted property $y$, $b$ the vector of regression coefficients, and $b_0$ the intercept. Such models are data-driven (empirical) because a set of $n$ (typically 30 to 300) chemical structures with known properties $y$ is used to develop and to test the model. Multiple regression methods, widely used in chemometrics, are applied, e.g., PLS regression (partial least-squares regression), because this method allows an optimization of the complexity of the model (avoids overfitting), accepts data with more variables ($m$) than objects ($n$), and is insensitive to highly correlated variables [17]. The descriptors best suited for a particular QSPR or QSAR model are usually not known in advance; thus one may start with some hundred potentially relevant descriptors and then apply variable selection methods - often a genetic algorithm [36]. A careful estimation of the prediction performance for new cases is essential; appropriate strategies are repeated double cross validation [24] or bootstrap methods [20].
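Returning to the Tanimoto index of Equation (1), the following minimal Python sketch shows how it may be evaluated for two binary substructure descriptor vectors. The function name `tanimoto` and the two example vectors are hypothetical and serve only as an illustration; they are not taken from the cited software.

```python
def tanimoto(x_a, x_b):
    """Tanimoto (Jaccard) index of two equal-length binary vectors, Equation (1)."""
    if len(x_a) != len(x_b):
        raise ValueError("vectors must have the same number of components")
    # Number of descriptors with a 1 in both vectors (sum of AND).
    both = sum(1 for a, b in zip(x_a, x_b) if a == 1 and b == 1)
    # Number of descriptors with a 1 in at least one vector (sum of OR).
    either = sum(1 for a, b in zip(x_a, x_b) if a == 1 or b == 1)
    if either == 0:
        raise ValueError("Tanimoto index is undefined if no descriptor is set")
    return both / either


# Hypothetical descriptor vectors for two structures A and B.
x_a = [1, 1, 0, 1, 0, 0]
x_b = [1, 0, 0, 1, 1, 0]
t = tanimoto(x_a, x_b)  # 2 shared descriptors, 4 present in at least one vector
```

For the example vectors, two descriptors are set in both structures and four are set in at least one, so the sketch yields t = 0.5.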
The relationships between chemical structure data and properties of chemical compounds are very complex and no general theory exists that could be applied to new cases. Therefore, the development of new molecular descriptors is of continuing interest, although a large number of descriptors have already been defined. Chemical structures (even if rather simply represented by graphs) exhibit a great diversity requiring a great variety of molecular descriptors. Information indices characterize the inner symmetry of graphs (molecular skeletons) which is an important feature for some properties and activities of compounds. Topological descriptors have often not been evaluated by using large chemical databases consisting of real chemical structures; instead, they have often been evaluated only on synthetic graphs, e.g., generated isomers. Recently, we showed for an information-based topological descriptor that the result is considerably different for real chemical structures (from a spectroscopic database) and for generated isomers [12].
3.
Mathematical Preliminaries
Before starting with the main definition, we briefly state some known mathematical definitions, see [8,16]. We call $G = (V, E)$, $|V| < \infty$, a finite undirected graph if $E \subseteq \binom{V}{2}$. $G$ is called connected if for arbitrary vertices $v_i$ and $v_j$ there exists an undirected path from $v_i$ to $v_j$. Otherwise, we call $G$ unconnected. $\mathcal{G}_{UC}$ denotes the set of finite, undirected and connected graphs. The degree of a vertex $v \in V$ is denoted by $\delta(v)$ and equals the number of edges $e \in E$ which are incident with $v$. We call the set $\mathcal{G}_R^k$ the class of $k$-regular graphs. It holds $G \in \mathcal{G}_R^k$ iff $\delta(v) = k$ $\forall v \in V$. Starting from $G = (V, E) \in \mathcal{G}_{UC}$, $\sigma(v) = \max_{u \in V} d(u, v)$ is called the eccentricity of $v \in V$, where $d(u, v)$ denotes the shortest distance between $u$ and $v$. $d(u, v)$ is an integer metric. $\rho(G) = \max_{v \in V} \sigma(v)$ is called the diameter of $G$. Further, we define for $G = (V, E) \in \mathcal{G}_{UC}$ the following vertex sets.
$$ S_j(v_i, G) := \{v \in V \mid d(v_i, v) = j,\ j \geq 1\}, \tag{2} $$
is called the $j$-sphere of $v_i$ regarding $G$. For introducing Shannon's entropy [11, 45], let $X$ be a discrete random variable with alphabet $A$ and $p(x_i) = P(X = x_i)$ be the probability mass function of $X$. Then, the entropy of $X$ is defined by
$$ H(X) := -\sum_{x_i \in A} p(x_i) \log(p(x_i)). \tag{3} $$
3.1.
Graph Entropies Based on Information Functionals
In order to introduce our novel information functional as well as the corresponding family of graph entropy measures, we briefly repeat the basic approach for determining the entropy of graphs, see [13]. Let $G \in \mathcal{G}_{UC}$ and let $S$ be a certain set, e.g., a set of vertices or paths etc. We call the mapping $f : S \longrightarrow \mathbb{R}_{+}$ an information functional of $G$. We always assume that $f$ is monotonic. Now, we start with $G \in \mathcal{G}_{UC}$ and define for $v_i \in V$ the quantities
$$ P_f(v_i) := \frac{f(v_i)}{\sum_{j=1}^{|V|} f(v_j)}, \tag{4} $$
where $f$ represents an arbitrary information functional. Obviously, the quantities $P_f(v_i)$ can be interpreted as vertex probabilities because
$$ P_f(v_1) + P_f(v_2) + \cdots + P_f(v_{|V|}) = 1, \tag{5} $$
holds. The corresponding probability distribution is denoted by
$$ P_G^{f}(V) := \left(P_f^{G}(v_1), P_f^{G}(v_2), \ldots, P_f^{G}(v_{|V|})\right). \tag{6} $$
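As a minimal sketch of Equations (4) to (6), the Python fragment below normalizes the values of an arbitrary positive information functional into vertex probabilities and evaluates Shannon's entropy of Equation (3) for the resulting distribution. The function names and the example values of f are hypothetical, and the use of the logarithm to the base 2 is an assumption of this sketch.

```python
import math


def vertex_probabilities(f_values):
    """Normalize positive functional values f(v_i) to probabilities, Equation (4)."""
    total = sum(f_values)
    return [f / total for f in f_values]


def shannon_entropy(probabilities):
    """Shannon entropy of Equation (3) in bits; zero probabilities are skipped."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical functional values for a graph with three vertices.
f_values = [1.0, 2.0, 1.0]
probs = vertex_probabilities(f_values)   # sums to 1, as stated in Equation (5)
entropy_bits = shannon_entropy(probs)
```

For the example values the probabilities are (0.25, 0.5, 0.25) and the entropy is 1.5 bits.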
Starting from Equation (4), the entropy of the underlying graph topology of $G$ has been defined as [13]:
$$ I_f(G) := -\sum_{i=1}^{|V|} P_f(v_i) \log\left(P_f(v_i)\right), \tag{7} $$
$$ \phantom{I_f(G) :}= -\sum_{i=1}^{|V|} \frac{f(v_i)}{\sum_{j=1}^{|V|} f(v_j)} \log\left(\frac{f(v_i)}{\sum_{j=1}^{|V|} f(v_j)}\right). \tag{8} $$
4.
Review of Existing Information Indices
In this section, we give a review of the best-known information indices which have been used for quantifying structural information of chemical structures [9,12]. In QSAR and QSPR [18], partition-based information measures have been used for characterizing molecular graphs structurally by using a graph invariant $X$ and an equivalence criterion $\alpha$. The main step to construct these measures is as follows: The application of the equivalence criterion produces a partitioning of the elements of $X$ (e.g., of the vertex set $V$) into $k$ subsets whose cardinalities are denoted by $|X_i|$. Starting from such a partitioning, the structural information content of a chemical graph $G$ can be defined by [3]
$$ I(G, \alpha) = |X| \log(|X|) - \sum_{i=1}^{k} |X_i| \log(|X_i|), \tag{9} $$
$$ \bar{I}(G, \alpha) = -\sum_{i=1}^{k} P_i \log(P_i) = -\sum_{i=1}^{k} \frac{|X_i|}{|X|} \log\left(\frac{|X_i|}{|X|}\right). \tag{10} $$
Equation (9) and Equation (10) are graph entropy measures for quantifying structural information of G. Equation (9) and Equation (10) represent the total and the mean information content of G, respectively [3]. As a technical note, we will always take the logarithms to the base 2 because we express the structural information contents in bits. In the following, we give a brief review on further classical information indices and on such information measures, which are based on graph distances [46].
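The total and mean information content of Equations (9) and (10) depend only on the cardinalities of the equivalence classes. The following Python sketch evaluates both with logarithms to the base 2; the function names and the example partition are hypothetical and serve only as an illustration.

```python
import math


def total_information_content(cardinalities):
    """Equation (9): |X| log2|X| - sum_i |X_i| log2|X_i|, in bits."""
    n = sum(cardinalities)
    return n * math.log2(n) - sum(c * math.log2(c) for c in cardinalities)


def mean_information_content(cardinalities):
    """Equation (10): -sum_i (|X_i|/|X|) log2(|X_i|/|X|), in bits."""
    n = sum(cardinalities)
    return -sum((c / n) * math.log2(c / n) for c in cardinalities)


# Hypothetical partition of four graph elements into classes of size 2, 1 and 1.
classes = [2, 1, 1]
total_bits = total_information_content(classes)
mean_bits = mean_information_content(classes)
```

For this partition the total information content is 6 bits and the mean information content is 1.5 bits, so that the total value equals |X| times the mean value. If all elements fall into a single class, both measures vanish.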
4.1.
Classical Information Indices
For deriving information indices for graphs, one of the first graph invariants studied was the number of graph vertices regarding vertex degree and extended vertex degree. This
study led to the methods of Rashevsky [43] and Trucco [50]. As a result, the so-called orbital information indices [48]
$$ I_{ORB}(G) = |V| \log(|V|) - \sum_{i=1}^{k} |N_i| \log(|N_i|), \tag{11} $$
and
$$ \bar{I}_{ORB}(G) = -\sum_{i=1}^{k} \frac{|N_i|}{|V|} \log\left(\frac{|N_i|}{|V|}\right), \tag{12} $$
have been developed [43]. $|N_i|$ stands for the number of topologically equivalent vertices in the $i$-th vertex orbit of $G$ and $k$ is the number of different orbits. In general, vertices are considered as topologically equivalent if they belong to the same orbit of a graph $G$. Similarly, Trucco [50] applied the same approach to the edge automorphism group and obtained
$$ I_{ORB}^{E}(G) = |E| \log(|E|) - \sum_{i=1}^{k} |N_i^{E}| \log(|N_i^{E}|), \tag{13} $$
and
$$ \bar{I}_{ORB}^{E}(G) = -\sum_{i=1}^{k} \frac{|N_i^{E}|}{|E|} \log\left(\frac{|N_i^{E}|}{|E|}\right), \tag{14} $$
where $|N_i^{E}|$ stands for the number of edges belonging to the $i$-th edge orbit [3, 48] of $G$. After this, Mowshowitz [37-40] was the first to present a mathematically rigorous approach for determining the structural information content of a graph by further developing the method of Rashevsky [43]. For example, he expressed the relative complexity of graphs based on the concept of determining their structural information contents. Further, he explored graph operations like complement, sum, join etc. and investigated the change of the corresponding information index. This examination was of considerable interest for the information-based modeling of chemical reactions [3]. Moreover, Mowshowitz [39] defined the chromatic information content (based on graph colorings) and examined this measure for different graph classes. As an extension of Rashevsky's measure (Equation (11)), Bertz [2] used the number of two-edge subgraphs as a graph invariant to define a measure of molecular complexity. Starting from Equation (11) and then adding the term $|V| \log(|V|)$, he obtained the complexity index
$$ C(G) = 2|V| \log(|V|) - \sum_{i=1}^{k} |N_i| \log(|N_i|). \tag{15} $$
The reason for adding this term was that Equation (11) does not properly reflect the number of the invariants used [2] because it holds $I_{ORB} = 0$ when all invariants are equal, regardless of the order of the graph [2]. Another information index which takes advantage of the well-known Hosoya index $Z$ [28] was developed in [9]. To construct this index, we start with the characteristic polynomial of a graph $G$ that can be calculated from its adjacency matrix. In case of acyclic
molecules, the Hosoya index was defined as the sum of the moduli of the corresponding polynomial coefficients $p(G, k)$. Then, by defining the probability value $\frac{p(G,k)}{Z}$, Bonchev et al. [9] expressed the information content for polynomial coefficients of a given graph $G$ as
$$ I_{pc}(G) = Z \log(Z) - \sum_{k=0}^{[N/2]} p(G, k) \log(p(G, k)), \tag{16} $$
and
$$ \bar{I}_{pc}(G) = -\sum_{k=0}^{[N/2]} \frac{p(G, k)}{Z} \log\left(\frac{p(G, k)}{Z}\right). \tag{17} $$
$[N/2]$ denotes the greatest integer that does not exceed $N/2$. Numerical studies to compare $Z$ and $I_{pc}(G)$, $\bar{I}_{pc}(G)$ were also performed in [9]. The last information index [7] we want to introduce in this section,
$$ I(G, OX) = OX \log(OX) - \sum_{k=0}^{|E|} {}^{k}X \log\left({}^{k}X\right), \tag{18} $$
relies on the overall value $OX$,
$$ OX(G) = \sum_{k=0}^{|E|} {}^{k}X; \qquad \{OX(G)\} = \{{}^{0}X, {}^{1}X, \ldots, {}^{|E|}X\}, \tag{19} $$
of a certain graph invariant $X$, obtained by summing up its values in all subgraphs [7]. These values are partitioned into terms of increasing order (increasing number of subgraph edges $k$) [7]. In the simplest case, it holds that $OX$ is equal to the subgraph count ($OX = SC$) [4, 7]. Based on this method [7], several overall information indices have been obtained, such as overall connectivity (the sum of total adjacency of all subgraphs) [5], overall Wiener index $W$ (the sum of total distances of all subgraphs) [6], overall Zagreb indices [10], and the overall Hosoya index [7]. More classical information indices can also be found in [3, 48].
4.2.
Information Indices Based on Distances
The first information indices for graphs which are not based on algebraic principles to introduce vertex partitions were developed by Bonchev et al. [9]. Starting from an inferred distance matrix $D$ of a graph under consideration, Bonchev et al. [9] introduced the information indices
$$ I_D(G) = |V|^2 \log(|V|^2) - |V| \log(|V|) - \sum_{i=1}^{\rho(G)} 2k_i \log(2k_i), \tag{20} $$
and
$$ \bar{I}_D(G) = -\frac{1}{|V|} \log\left(\frac{1}{|V|}\right) - \sum_{i=1}^{\rho(G)} \frac{2k_i}{|V|^2} \log\left(\frac{2k_i}{|V|^2}\right), \tag{21} $$
by assuming that the distance value $i$ appears $2k_i$ times in the distance matrix $D$. Similarly, Bonchev et al. [9] also introduced the information indices
$$ I_D^{W}(G) = W \log(W) - \sum_{i=1}^{\rho(G)} i k_i \log(i), \tag{22} $$
and
$$ \bar{I}_D^{W}(G) = -\sum_{i=1}^{\rho(G)} \frac{i k_i}{W} \log\left(\frac{i}{W}\right), \tag{23} $$
which are based on the Wiener index [8], i.e.,
$$ W = \sum_{i=1}^{\rho(G)} i k_i. \tag{24} $$
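To make the role of the counts k_i concrete, the following Python sketch determines them by breadth-first search and then evaluates Equations (22) to (24) with logarithms to the base 2. The adjacency-list representation, the function names and the example graph (a path with four vertices) are assumptions made for this illustration; the graph is assumed to be connected.

```python
import math
from collections import deque


def distance_counts(adjacency):
    """Return {i: k_i}, the number of unordered vertex pairs at distance i."""
    counts = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for d in dist.values():
            if d > 0:
                counts[d] = counts.get(d, 0) + 1
    # Every pair has been counted twice, once from each end point.
    return {d: c // 2 for d, c in counts.items()}


def wiener_information(k):
    """Wiener index (24) with total (22) and mean (23) information, in bits."""
    w = sum(i * ki for i, ki in k.items())
    total = w * math.log2(w) - sum(i * ki * math.log2(i) for i, ki in k.items())
    mean = -sum(ki * (i / w) * math.log2(i / w) for i, ki in k.items())
    return w, total, mean


# Path graph with four vertices (hypothetical example).
path4 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
k = distance_counts(path4)
w, total_bits, mean_bits = wiener_information(k)
```

For the path with four vertices, three pairs have distance 1, two pairs have distance 2 and one pair has distance 3, which gives W = 10. As for Equations (9) and (10), the total value equals W times the mean value.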
As an important result, it turned out that these indices possess a high discrimination power [9, 32] to measure the structural information content of chemical graphs. Additionally, by using these information measures, Bonchev et al. [9] proved some information inequalities for special graphs, i.e., chain graphs, simple trees, and star graphs. Generally, information inequalities can be very interesting for describing relations between the information indices under consideration, e.g., to study the influence of a special information functional on the resulting graph entropies. By considering certain information functionals for graphs and the resulting parameterized information measures, such a study has been recently performed in [13]. In [31], Konstantinova et al. introduced further information indices which are based on graph distances. By defining the quantity
$$ d(v_i) = \sum_{j=1}^{|V|} d(v_i, v_j), \tag{25} $$
the entropy measure
$$ I_D(v_i) = -\sum_{j=1}^{|V|} \frac{d(v_i, v_j)}{d(v_i)} \log\left(\frac{d(v_i, v_j)}{d(v_i)}\right), \tag{26} $$
was obtained. In contrast to the previously shown distance-based information indices for characterizing a graph $G$, this measure represents the information distance of the vertex $v_i \in V$. Then, the information distance of a graph $G$ was defined as [31]
$$ I_D(G) = \sum_{i=1}^{|V|} I_D(v_i). \tag{27} $$
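Equations (25) to (27) can be sketched in Python as follows. The helper names and the example graph (a star with three leaves) are hypothetical; logarithms are taken to the base 2, the graph is assumed to be connected, and the term with d(v_i, v_i) = 0 is omitted because it does not contribute to the sum.

```python
import math
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest distances from source to all other vertices of a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    del dist[source]  # the zero distance to the vertex itself is not needed
    return dist


def vertex_information_distance(adjacency, v):
    """Equation (26) for the vertex v, in bits."""
    dist = bfs_distances(adjacency, v)
    d_v = sum(dist.values())  # Equation (25)
    return -sum((d / d_v) * math.log2(d / d_v) for d in dist.values())


def graph_information_distance(adjacency):
    """Equation (27): sum of the vertex values over all vertices."""
    return sum(vertex_information_distance(adjacency, v) for v in adjacency)


# Star graph with center 0 and three leaves (hypothetical example).
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
center_bits = vertex_information_distance(star, 0)
graph_bits = graph_information_distance(star)
```

For the center of the star all three distances equal 1, so the vertex value is log2(3) bits; the three leaves share a common value because of the symmetry of the graph.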
Also, by using the matrix $S = (|S_j(v_i, G)|)_{ij}$, $i = 1, \ldots, |V|$, $j = 1, \ldots, \rho(G)$ of $j$-sphere [14, 26] cardinalities of a graph $G$, the same kind of information indices as just mentioned were derived [31]:
$$ I_S(v_i) = -\sum_{j=0}^{\sigma(v_i)} \frac{|S_j(v_i, G)|}{|V|} \log\left(\frac{|S_j(v_i, G)|}{|V|}\right), \tag{28} $$
and
$$ I_D(G) = \sum_{i=1}^{|V|} I_S(v_i). \tag{29} $$
Finally, the following similarly inferred measures [31]
$$ I_D(G) = -\sum_{i=0}^{|V|} \sum_{j=0}^{|V|} \frac{d(v_i, v_j)}{2W} \log\left(\frac{d(v_i, v_j)}{2W}\right), \tag{30} $$
and
$$ I_D^{i}(G) = -\sum_{i=0}^{|V|} \frac{d(v_i)}{2W} \log\left(\frac{d(v_i)}{2W}\right), \tag{31} $$
also represent information indices to calculate the structural information content of $G$. In order to finalize our review on distance-based information indices, we present an entropy-based molecular descriptor developed by Balaban et al. [1]. Here, the main purpose was to define novel information indices whose degree of degeneracy is comparably low. Generally, a topological descriptor is called degenerate if the index possesses the same value for more than one structure. By starting from the entities
$$ d(v_i) = \sum_{j=1}^{|V|} d(v_i, v_j) = \sum_{j=1}^{\sigma(v_i)} j g_j, \tag{32} $$
Balaban et al. first obtained the mean information on the magnitude of distances for each vertex $v_i$
$$ u(v_i) = -\sum_{j=1}^{\sigma(v_i)} \frac{j g_j}{d(v_i)} \log\left(\frac{j}{d(v_i)}\right). \tag{33} $$
The quantity $g_j$ indicates how many vertices have distance $j$ from $v_i$. Obviously, this notation simply expresses the cardinality of the $j$-sphere of a vertex $v_i$, see Equation (2). Based on Equation (33), the local information on the magnitude of distances was defined by
$$ w(v_i) = d(v_i) \log(d(v_i)) - u(v_i). \tag{34} $$
Now, by applying the well-known formula developed by Randić [42], a main result of [1] were the information indices
$$ U(G) = \frac{|E|}{\mu + 1} \sum_{(v_i, v_j) \in E} \left[u(v_i) u(v_j)\right]^{-\frac{1}{2}}, \tag{35} $$
and
$$ W(G) = \frac{|E|}{\mu + 1} \sum_{(v_i, v_j) \in E} \left[w(v_i) w(v_j)\right]^{-\frac{1}{2}}, \tag{36} $$
where µ denotes the cyclomatic number of G, see [1]. Similarly to Equation (26) and Equation (28), these measures result from applying an information-theoretic approach locally (i.e., with respect to the vertices) [1]. Further, it turned out that the indices are less degenerate than most of the well-known topological indices of chemical graph theory. Further information indices which are based on graph distances can be found, e.g., in [3, 32, 48].
5.
A Novel Information Functional: Degree-Degree Associations
In this section, we define a novel entropy measure for finite, undirected and connected graphs which is based on a special information functional. Generally, the idea to measure the entropy of graphs by using vertex probabilities depending on an information functional has been introduced in [13]. This procedure avoids the problem of determining vertex partitions for obtaining a finite probability distribution. For constructing the mentioned information functional, we use degree-degree associations of underlying shortest paths of the graph in question. We start with the following definitions.
Definition 5.1. Let $G \in \mathcal{G}_{UC}$. We set $S_j(v_i, G) := \{v_{a_j}, v_{b_j}, \ldots, v_{z_j}\}$, $1 \leq j \leq \rho(G)$, $1 \leq i \leq |V|$. For $v_i \in V$, we define the sets of shortest paths
\begin{align}
P_1^{j}(v_i) &:= (v_i, v_{a_1}^{j}, v_{a_2}^{j}, \ldots, v_{a_j}^{j}), \tag{37}\\
P_2^{j}(v_i) &:= (v_i, v_{b_1}^{j}, v_{b_2}^{j}, \ldots, v_{b_j}^{j}), \tag{38}\\
&\;\;\vdots \notag\\
P_{k_j}^{j}(v_i) &:= (v_i, v_{z_1}^{j}, v_{z_2}^{j}, \ldots, v_{z_j}^{j}). \tag{39}
\end{align}
Definition 5.2. Let $G \in \mathcal{G}_{UC}$. We define the following degree sequences:
\begin{align}
s_1^{j}(v_i) &:= (\delta(v_i), \delta(v_{a_1}^{j}), \delta(v_{a_2}^{j}), \ldots, \delta(v_{a_j}^{j})), \tag{40}\\
s_2^{j}(v_i) &:= (\delta(v_i), \delta(v_{b_1}^{j}), \delta(v_{b_2}^{j}), \ldots, \delta(v_{b_j}^{j})), \tag{41}\\
&\;\;\vdots \notag\\
s_{k_j}^{j}(v_i) &:= (\delta(v_i), \delta(v_{z_1}^{j}), \delta(v_{z_2}^{j}), \ldots, \delta(v_{z_j}^{j})). \tag{42}
\end{align}
We call these strings the property strings starting from the vertex vi to all other vertices in G. We want to emphasize that these property strings capture structural information of G. Now, starting from Definition (5.3) that uses the degree-degree associations of the set of shortest paths for a certain vertex vi , we define the following quantities.
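Figure 2. An undirected and connected graph.
Definition 5.3. Let $G \in \mathcal{G}_{UC}$. For $v_i \in V$, we define
\begin{align}
\Delta^{G}(v_i, 1) &= |\delta(v_i) - \delta(v_{a_1}^{1})| + \cdots + |\delta(v_i) - \delta(v_{z_1}^{1})|, \tag{43}\\
\Delta^{G}(v_i, 2) &= |\delta(v_i) - \delta(v_{a_1}^{2})| + \cdots + |\delta(v_i) - \delta(v_{z_1}^{2})| \tag{44}\\
&\quad + \cdots + |\delta(v_{z_1}^{2}) - \delta(v_{z_2}^{2})|, \tag{45}\\
&\;\;\vdots \notag\\
\Delta^{G}(v_i, \rho(G)) &= |\delta(v_i) - \delta(v_{a_1}^{\rho(G)})| + \cdots + |\delta(v_{a_{\rho(G)-1}}^{\rho(G)}) - \delta(v_{a_{\rho(G)}}^{\rho(G)})| \tag{46}\\
&\quad + \cdots + |\delta(v_i) - \delta(v_{z_1}^{\rho(G)})| + \cdots + |\delta(v_{z_{\rho(G)-1}}^{\rho(G)}) - \delta(v_{z_{\rho(G)}}^{\rho(G)})|. \tag{47}
\end{align}
In Definition (5.3) the differences $|\delta(x) - \delta(y)|$ are called degree-degree associations. That is, $\Delta^{G}(v_i, j)$ sums the degree-degree associations of consecutive vertices along all shortest paths of length $j$ that start at $v_i$. By taking Definition (5.3) into account, we are now able to define a parameterized information functional for measuring the entropy of $G$ as follows.
Definition 5.4. Let $G \in \mathcal{G}_{UC}$. We define the information functional $f^{\Delta}(v_i)$ as
$$ f^{\Delta}(v_i) := \alpha^{c_1 \Delta^{G}(v_i, 1) + c_2 \Delta^{G}(v_i, 2) + \cdots + c_{\rho(G)} \Delta^{G}(v_i, \rho(G))}, \quad c_k > 0,\ 1 \leq k \leq \rho(G),\ \alpha > 0. \tag{48} $$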
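We want to remark that in practical applications the variable $\alpha$ can be set to $\alpha = e$.
Definition 5.5. Let $G \in \mathcal{G}_{UC}$ and let
$$ P_G^{f^{\Delta}}(V) := \left(P_{f^{\Delta}}^{G}(v_1), P_{f^{\Delta}}^{G}(v_2), \ldots, P_{f^{\Delta}}^{G}(v_{|V|})\right), \tag{49} $$
be the probability distribution derived by incorporating $f^{\Delta}$. We define the entropy of $G$ as
$$ I_{f^{\Delta}}(G) := -\sum_{i=1}^{|V|} \frac{f^{\Delta}(v_i)}{\sum_{j=1}^{|V|} f^{\Delta}(v_j)} \log\left(\frac{f^{\Delta}(v_i)}{\sum_{j=1}^{|V|} f^{\Delta}(v_j)}\right). \tag{50} $$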
As a result, we obtained a novel family of graph entropy measures. We notice that by varying the parameters $c_i$ and $\alpha$, one can weight structural properties, e.g., hubs, for determining the entropy of $G$. In order to get a better understanding of the meaning of the final graph entropy measure, we state the following assertions.
Theorem 5.1. Let $G \in \mathcal{G}_R^k$. The probability distribution
$$ P_G^{f^{\Delta}}(V) := \left(P_{f^{\Delta}}^{G}(v_1), P_{f^{\Delta}}^{G}(v_2), \ldots, P_{f^{\Delta}}^{G}(v_{|V|})\right), \tag{51} $$
maximizes the entropy $I_{f^{\Delta}}(G)$.
Proof. We start with a given $k$-regular graph. According to the definition (see Section (3.)), it holds $G \in \mathcal{G}_R^k$ iff $\delta(v) = k$ $\forall v \in V$. From this, we obtain
\begin{align}
s_1^{1}(v_i) &= (k, k), \tag{52}\\
s_2^{1}(v_i) &= (k, k), \tag{53}\\
&\;\;\vdots \notag\\
s_{k_1}^{1}(v_i) &= (k, k), \tag{54}\\
&\;\;\vdots \notag\\
s_1^{j}(v_i) &= (k, k, \ldots, k), \tag{55}\\
s_2^{j}(v_i) &= (k, k, \ldots, k), \tag{56}\\
&\;\;\vdots \notag\\
s_{k_j}^{j}(v_i) &= (k, k, \ldots, k), \tag{57}
\end{align}
where $1 \leq j \leq \rho(G)$. We note that $s_{\mu}^{j}(v_i) = (k, k, \ldots, k)$ has $j + 1$ entries. But this results in $\Delta^{G}(v_i, 1) = 0, \ldots, \Delta^{G}(v_i, j) = 0$. Further, this implies $f^{\Delta}(v_i) = \alpha^{0} = 1$ and, finally,
\begin{align}
P_G^{f^{\Delta}}(V) &:= \left(P_{f^{\Delta}}^{G}(v_1), P_{f^{\Delta}}^{G}(v_2), \ldots, P_{f^{\Delta}}^{G}(v_{|V|})\right), \tag{58}\\
&= \left(\frac{1}{|V|}, \frac{1}{|V|}, \ldots, \frac{1}{|V|}\right). \tag{59}
\end{align}
Since Shannon's entropy attains its maximum for the uniform distribution, the assertion follows.
This completes the proof.
Corollary 5.2. Let $K_{|V|,|V|}$ be the complete graph with $|V|$ vertices. The probability distribution
$$ P_{K_{|V|,|V|}}^{f^{\Delta}}(V) := \left(P_{f^{\Delta}}^{K_{|V|,|V|}}(v_1), P_{f^{\Delta}}^{K_{|V|,|V|}}(v_2), \ldots, P_{f^{\Delta}}^{K_{|V|,|V|}}(v_{|V|})\right), \tag{60} $$
maximizes the entropy $I_{f^{\Delta}}(K_{|V|,|V|})$.
The interpretation of Theorem (5.1) leads to the following observation: The given probability distribution maximizes the entropy of a graph $G$ if all vertices are topologically equivalent. From this, we conclude that the entropy of $G$ decreases with an increasing diversity regarding the neighborhood of the vertices. We want to emphasize that this result clearly depends on the considered entropy measure. For example, the application of the classical graph entropy measure $I_{ORB}$ [43] results in $I_{ORB}(K_{|V|,|V|}) = 0$.
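As a short worked illustration of this contrast, the maximum value can be written out explicitly. The concrete case of |V| = 4 vertices and the logarithm to the base 2 are assumptions chosen only for this example.

```latex
% Uniform distribution of Equation (59) inserted into Equation (50):
I_{f^{\Delta}}(G) = -\sum_{i=1}^{|V|} \frac{1}{|V|} \log\left(\frac{1}{|V|}\right) = \log(|V|).

% Complete graph with |V| = 4 vertices, logarithm to the base 2:
I_{f^{\Delta}}(K_{4,4}) = \log_2(4) = 2 \text{ bits}, \qquad
I_{ORB}(K_{4,4}) = 4 \log_2(4) - 4 \log_2(4) = 0 \text{ bits}.
```

The second line uses Equation (11) with a single vertex orbit that contains all four vertices.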
5.1.
Numerical Example
In order to illustrate the above given definitions, we consider Figure (2). In the following, we calculate the just defined quantities only for the vertex $v_1$, as an example, because the calculation for the remaining vertices can be done analogously. We obtain:
\begin{align}
P_1^{1}(v_1) &= (v_1, v_2), \tag{61}\\
P_2^{1}(v_1) &= (v_1, v_6), \tag{62}\\
P_1^{2}(v_1) &= (v_1, v_2, v_3), \tag{63}\\
P_2^{2}(v_1) &= (v_1, v_2, v_5), \tag{64}\\
P_3^{2}(v_1) &= (v_1, v_6, v_5), \tag{65}\\
P_4^{2}(v_1) &= (v_1, v_6, v_7), \tag{66}\\
P_1^{3}(v_1) &= (v_1, v_2, v_3, v_4), \tag{67}\\
P_2^{3}(v_1) &= (v_1, v_2, v_5, v_4), \tag{68}\\
P_3^{3}(v_1) &= (v_1, v_6, v_5, v_4), \tag{69}
\end{align}
\begin{align}
s_1^{1}(v_1) &= (2, 3), \tag{70}\\
s_2^{1}(v_1) &= (2, 3), \tag{71}\\
s_1^{2}(v_1) &= (2, 3, 2), \tag{72}\\
s_2^{2}(v_1) &= (2, 3, 3), \tag{73}\\
s_3^{2}(v_1) &= (2, 3, 3), \tag{74}\\
s_4^{2}(v_1) &= (2, 3, 1), \tag{75}\\
s_1^{3}(v_1) &= (2, 3, 2, 2), \tag{76}\\
s_2^{3}(v_1) &= (2, 3, 3, 2), \tag{77}\\
s_3^{3}(v_1) &= (2, 3, 3, 2), \tag{78}
\end{align}
and
\begin{align}
\Delta^{G}(v_1, 1) &= 1 + 1 = 2, \tag{79}\\
\Delta^{G}(v_1, 2) &= 1 + 1 + 1 + 0 + 1 + 0 + 1 + 2 = 7, \tag{80}\\
\Delta^{G}(v_1, 3) &= 1 + 1 + 0 + 1 + 0 + 1 + 1 + 0 + 1 = 6, \tag{81}\\
\Delta^{G}(v_1, 4) &= 0. \tag{82}
\end{align}
Hence, we get
$$ f^{\Delta}(v_1) := \alpha^{2c_1 + 7c_2 + 6c_3}. \tag{83} $$
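The calculation for $v_1$ can be retraced with the following Python sketch. The edge list is an assumption: it has been reconstructed from the shortest paths in Equations (61) to (69) and the degree sequences in Equations (70) to (78), because the drawing of Figure (2) is not reproduced in the text. The function names are hypothetical as well. The sketch enumerates all shortest paths that start at a vertex and sums the degree-degree associations of consecutive vertices along each path.

```python
from collections import deque

# Assumed edge list of the graph in Figure (2), reconstructed from Eqs. (61)-(78).
EDGES = [(1, 2), (1, 6), (2, 3), (2, 5), (3, 4), (4, 5), (5, 6), (6, 7)]


def build_adjacency(edges):
    """Adjacency lists of an undirected graph given as a list of edges."""
    adjacency = {}
    for u, w in edges:
        adjacency.setdefault(u, []).append(w)
        adjacency.setdefault(w, []).append(u)
    return adjacency


def shortest_paths_from(adjacency, source):
    """All shortest paths that start at source, as tuples of vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search for the distances
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    paths, stack = [], [(source,)]
    while stack:
        path = stack.pop()
        if len(path) > 1:
            paths.append(path)
        for w in adjacency[path[-1]]:
            if dist[w] == dist[path[-1]] + 1:  # w lies one step further away
                stack.append(path + (w,))
    return paths


def delta(adjacency, source):
    """Return {j: Delta^G(source, j)}, summed over all shortest paths of length j."""
    degree = {v: len(neighbors) for v, neighbors in adjacency.items()}
    result = {}
    for path in shortest_paths_from(adjacency, source):
        j = len(path) - 1
        association = sum(abs(degree[a] - degree[b]) for a, b in zip(path, path[1:]))
        result[j] = result.get(j, 0) + association
    return result


def f_delta(adjacency, source, c, alpha):
    """Information functional of Definition 5.4 with weights c = {j: c_j}."""
    return alpha ** sum(c[j] * value for j, value in delta(adjacency, source).items())


adjacency = build_adjacency(EDGES)
delta_v1 = delta(adjacency, 1)
```

For $v_1$ the sketch returns the values 2, 7 and 6 for the path lengths 1, 2 and 3, in agreement with Equations (79) to (81). The path length 4 does not appear in the result because no vertex has distance 4 from $v_1$, which corresponds to the value 0 in Equation (82).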
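If we perform these steps for every vertex in $G$, we obtain the special entropy measure
\begin{align}
I_{f^{\Delta}}(G) = &- \frac{\alpha^{2c_1+7c_2+6c_3}}{D_G} \log\left(\frac{\alpha^{2c_1+7c_2+6c_3}}{D_G}\right) - \frac{\alpha^{2c_1+4c_2+6c_3}}{D_G} \log\left(\frac{\alpha^{2c_1+4c_2+6c_3}}{D_G}\right) \notag\\
&- \frac{\alpha^{c_1+4c_2+5c_3+11c_4}}{D_G} \log\left(\frac{\alpha^{c_1+4c_2+5c_3+11c_4}}{D_G}\right) - \frac{\alpha^{c_1+3c_2+9c_3}}{D_G} \log\left(\frac{\alpha^{c_1+3c_2+9c_3}}{D_G}\right) \notag\\
&- \frac{\alpha^{c_1+6c_2}}{D_G} \log\left(\frac{\alpha^{c_1+6c_2}}{D_G}\right) - \frac{\alpha^{3c_1+3c_2+5c_3}}{D_G} \log\left(\frac{\alpha^{3c_1+3c_2+5c_3}}{D_G}\right) \notag\\
&- \frac{\alpha^{2c_1+5c_2+6c_3+10c_4}}{D_G} \log\left(\frac{\alpha^{2c_1+5c_2+6c_3+10c_4}}{D_G}\right), \tag{84}
\end{align}
where
$$ D_G := \alpha^{2c_1+7c_2+6c_3} + \alpha^{2c_1+4c_2+6c_3} + \alpha^{c_1+4c_2+5c_3+11c_4} + \alpha^{c_1+3c_2+9c_3} + \alpha^{c_1+6c_2} + \alpha^{3c_1+3c_2+5c_3} + \alpha^{2c_1+5c_2+6c_3+10c_4}. \tag{85} $$
6.
Summary and Conclusion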
In this chapter, we first gave a short overview of molecular descriptors which are often used in mathematical chemistry and of related approaches in chemometrics. Then, we reviewed the best-known information indices for characterizing chemical structures by calculating their structural information content. Here, we used information indices (measures) to quantify the structural complexity of graphs. However, the main contribution of the chapter was to define a novel information functional that is based on degree-degree associations. This led us to a special information measure for graphs (graph entropy) and we proved that maximum entropy is obtained for regular graphs and, in particular, for a fully connected network. Evaluating the measure by using real chemical structures and exploring its mathematical properties in depth will be a part of our future research. As a final remark, we note that the merging of statistics, information theory and graph theory bears a considerable potential [22, 41] that is largely unexplored so far.
Acknowledgements We thank Danail Bonchev and Abbe Mowshowitz for fruitful discussions. This work was supported by the COMET Center ONCOTYROL and funded by the Federal Ministry for Transport Innovation and Technology (BMVIT) and the Federal Ministry of Economics and Labour/the Federal Ministry of Economy, Family and Youth (BMWA/BMWFJ), the Tiroler Zukunftsstiftung (TZS) and the State of Styria represented by the Styrian Business Promotion Agency (SFG) [and supported by the University for Health Sciences, Medical Informatics and Technology and BIOCRATES Life Sciences AG].
References
[1] A. T. Balaban and T. S. Balaban. New vertex invariants and topological indices of chemical graphs based on information on distances. J. Math. Chem., 8:383–397, 1991.
[2] S. H. Bertz. The first general index of molecular complexity. Journal of the American Chemical Society, 103:3241–3243, 1981.
[3] D. Bonchev. Information Theoretic Indices for Characterization of Chemical Structures. Research Studies Press, Chichester, 1983.
[4] D. Bonchev. Kolmogorov's information, Shannon's entropy, and topological complexity of molecules. Bulg. Chem. Commun., 28:567–582, 1995.
[5] D. Bonchev. Overall connectivities and topological complexities: A new powerful tool for QSPR/QSAR. J. Chem. Inf. Comput. Sci., 40(4):934–941, 2000.
[6] D. Bonchev. The overall Wiener index - a new tool for characterization of molecular topology. J. Chem. Inf. Comput. Sci., 41:582–592, 2001.
[7] D. Bonchev. The overall topological complexity indices. In T. Simos and G. Maroulis, editors, Advances in Computational Methods in Science and Engineering, volume 4B, pages 1554–1557. VSP Publications, 2005.
[8] D. Bonchev and D. H. Rouvray. Chemical Graph Theory. Introduction and Fundamentals. Abacus Press, New York, NY, USA, 1991.
[9] D. Bonchev and N. Trinajstić. Information theory, distance matrix and molecular branching. Journal of Chemical Physics, 67:4517–4533, 1977.
[10] D. Bonchev and N. Trinajstić. Overall molecular descriptors. 3. Overall Zagreb indices. SAR QSAR Environ. Res., 12:213–236, 2001.
[11] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley Series in Telecommunications and Signal Processing. Wiley & Sons, 2006.
[12] M. Dehmer, K. Varmuza, S. Borgert, and F. Emmert-Streib. On entropy-based molecular descriptors: Statistical analysis of real and synthetic chemical structures. Journal of Chemical Information and Modeling, 49:1655–1663, 2009.
[13] M. Dehmer. Information processing in complex networks: Graph entropy and information functionals. Applied Mathematics and Computation, 201:82–94, 2008.
[14] M. Dehmer and F. Emmert-Streib. Structural information content of networks: Graph entropy based on local vertex functionals. Computational Biology and Chemistry, 32:131–138, 2008.
[15] M. Dehmer and A. Mowshowitz. A natural history of graph entropy. Submitted for publication, 2009.
[16] W. Demuth, M. Karlovits, and K. Varmuza. Spectral similarity versus structural similarity: Mass spectrometry. Anal. Chim. Acta, 516:75–85, 2004.
[17] K. Varmuza and P. Filzmoser. Introduction to Multivariate Statistical Analysis in Chemometrics. Taylor & Francis, CRC Press. Amsterdam, The Netherlands.
[18] J. Devillers and A. T. Balaban. Topological Indices and Related Descriptors in QSAR and QSPR. Gordon and Breach Science Publishers, 1999. Amsterdam, The Netherlands.
[19] Dragon, software for calculation of molecular descriptors. R. Todeschini, V. Consonni, A. Mauri, and M. Pavan. www.talete.mi.it, Talete srl, Milano, Italy, 2004.
[20] B. Efron and R. J. Tibshirani. An introduction to the bootstrap. Chapman & Hall: London, United Kingdom, 1993.
[21] F. Emmert-Streib and M. Dehmer. Information theoretic measures of UHG graphs with low computational complexity. Applied Mathematics and Computation, 190:1783–1794, 2007.
[22] F. Emmert-Streib and M. Dehmer. Information Theory and Statistical Learning. Springer, 2008.
[23] F. Emmert-Streib and M. Dehmer. Information theoretic measures of UHG graphs with low computational complexity. Applied Mathematics and Computation, 190:1783–1794, 2007.
[24] P. Filzmoser, B. Liebmann, and K. Varmuza. Repeated double cross validation. J. Chemometr., 23:160–171, 2009.
[25] J. Gasteiger and T. Engel. Chemoinformatics - A textbook. Wiley-VCH: Weinheim, Germany, 2003.
[26] R. Halin. Graphentheorie. Akademie Verlag, 1989. Berlin, Germany.
[27] F. Harary. Graph Theory. Addison Wesley Publishing Company, 1969. Reading, MA, USA.
[28] H. Hosoya. Topological index. A newly proposed quantity characterizing the topological nature of structural isomers of saturated hydrocarbons. Bull. Chem. Soc. Jpn., 44:2332–2339, 1971.
[29] L. B. Kier and L. H. Hall. Molecular connectivity in structure-activity analysis. Wiley: New York, NY, USA, 1986.
[30] B. R. Kowalski. Chemometrics: Views and propositions. J. Chem. Inf. Comput. Sci., 5:201–203, 1975.
[31] E. V. Konstantinova and A. A. Paleev. Sensitivity of topological indices of polycyclic graphs. Vychisl. Sistemy, 136:38–48, 1990. In Russian.
[32] E. V. Konstantinova, V. A. Skorobogatov, and M. V. Vidyuk. Applications of information theory in chemical graph theory. Indian Journal of Chemistry, 42:1227–1240, 2002. [33] H. Kubinyi. Information theory and statistics. Wiley-VCH: Weinheim, Germany, 1993. [34] S. Kullback. QSAR: Hansch analysis and related approaches. John Wiley & Sons, 1959. [35] S. Kullback and R. A. Leibler. On information and sufficiency. Annals of Mathematical Statistics, 22(1):79–86, 1951. [36] R. Leardi and L. Nørgaard. Sequential application of backward interval partial least squares and genetic algorithms for the selection of relevant spectral regions. J. Chemometr., 18:486–497, 2004. [37] A. Mowshowitz. Entropy and the complexity of graphs II: The information content of digraphs and infinite graphs. Bull. Math. Biophys., 30:225–240, 1968. [38] A. Mowshowitz. Entropy and the complexity of graphs III: Graphs with prescribed information content. Bull. Math. Biophys., 30:387–414, 1968. [39] A. Mowshowitz. Entropy and the complexity of graphs IV: Entropy measures and graphical structure. Bull. Math. Biophys., 30:533–546, 1968. [40] A. Mowshowitz. Entropy and the complexity of the graphs I: An index of the relative complexity of a graph. Bull. Math. Biophys., 30:175–204, 1968. [41] J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan-Kaufmann, 1988. [42] M. Randi´c. On characterization of molecular branching. J. Amer. Chem. Soc., 97:6609–6615, 1975. [43] N. Rashevsky. Life, information theory, and topology. Bull. Math. Biophys., 17:229– 235, 1955. [44] F. Schweitzer, W. Ebeling, H. Ros´e, and O. Weiss. Network optimization using evolutionary strategies. In PPSN IV: Proceedings of the 4th International Conference on Parallel Problem Solving from Nature, pages 940–949, London, UK, 1996. SpringerVerlag. [45] C. E. Shannon and W. Weaver. The Mathematical Theory of Communication. University of Illinois Press, 1997. Urbana, IL, USA. [46] V. A. Skorobogatov and A. A. Dobrynin. 
Metrical analysis of graphs. MATCH, 23:105–155, 1988. [47] R. V. Sol´e and S. Valverde. Information theory of complex networks: On evolution and architectural constraints. In Lecture Notes in Physics, 650: 189–207, 2004.
[48] R. Todeschini, V. Consonni, and R. Mannhold. Handbook of Molecular Descriptors. Wiley-VCH, 2002. Weinheim, Germany. [49] N. Trinajsti´c. Chemical Graph Theory. CRC Press, 1992. Boca Raton, FL, USA. [50] E. Trucco. A note on the information content of graphs. Bulletin of Mathematical Biology, 18(2):129–135, 1956. [51] B. G. M. Vandeginste and D. L. Massart and L. C. M. Buydens and S. De Jong and J. Smeyers-Verbeke. Handbook of chemometrics and qualimetrics: Part B. Elsevier: Amsterdam, The Netherlands, 1998. [52] K. Varmuza and P. Filzmoser. Introduction to multivariate statistical analysis in chemometrics. CRC Press: Boca Raton, FL, USA, 2009. [53] K. Varmuza and W. Demuth and M. Karlovits and H. Scsibrany. Binary substructure descriptors for organic compounds. Croat. Chem. Acta, 78:141–149, 2005. [54] P. Willet. Similarity and clustering in chemical information systems. Research Studies Press: Letchworth, United Kingdom, 1987. [55] J. Zupan and J. Gasteiger. Neural networks in chemistry and drug design. Wiley-VCH: Weinheim, Germany, 1999.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 499-520
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 19
TOPOLOGICAL INDICES OF NANOSTRUCTURES Ali Reza Ashrafi* Department of Mathematics, Faculty of Science, University of Kashan, Kashan 87317-51167, I. R. Iran
Abstract Let $\mathcal{G}$ be the class of finite graphs. A topological index is a function Top from $\mathcal{G}$ into the real numbers with the property that Top(G) = Top(H) whenever G and H are isomorphic. Obviously, the number of vertices and the number of edges are topological indices. The distance d(u,v) := dG(u,v) between two vertices u and v is the length of a shortest (u,v)-path. The Wiener index is the first reported distance-based topological index and is defined as half the sum of the distances between all ordered pairs of vertices in a molecular graph. In the last decade or so, various topological indices have been introduced, many authors have devoted their studies to this subject, and many results have been published. In this chapter we restrict ourselves to the modeling of nanostructures by chemical indices.
Keywords: Molecular Graph, Nanostructure, Topological index.
1. Introduction
A nanostructure is an object of intermediate size between molecular and microscopic structures. It is a product derived through engineering at the molecular scale. The most important of these new materials are carbon nanotubes [1-3]. They have remarkable electronic properties and many other unique characteristics. For these reasons it is of interest to study the mathematical properties of these materials. Mathematical chemistry is a branch of theoretical chemistry concerned with the discussion and prediction of molecular structure using mathematical methods, without necessarily referring to quantum mechanics [4-6]. Chemical graph theory is a branch of mathematical chemistry which applies graph theory to the mathematical modeling of chemical phenomena [7]. This theory has had an important effect on the development of the chemical sciences. The pioneers of chemical graph theory are Alexandru Balaban, Ivan Gutman, Haruo Hosoya, Milan Randić and Nenad Trinajstić. Nowadays, hundreds of researchers work in this area, producing thousands of articles annually.
A molecular graph is a simple graph whose vertices correspond to the atoms and whose edges correspond to the bonds. Note that hydrogen atoms are often omitted. In IUPAC terminology, a topological index is a numerical value associated with chemical constitution for the correlation of chemical structure with various physical properties, chemical reactivity or biological activity [8-14]. Diudea was the first scientist to study the mathematical properties of nanostructures [15-21]. In some research papers he computed some topological indices of nanotubes and nanotori. Following the leading works of Diudea, the present author (ARA) and his co-authors continued this line of work and computed some topological indices of nano-materials [22-44]. In recent years, several papers on computing the Wiener, PI, Szeged, Schultz and Balaban indices of nanostructures have been published, and we encourage interested readers to consult these papers for background material as well as basic computational techniques [45-59].
* E-mail address: [email protected]
2. Definitions
In this section we give some definitions which will be used throughout. A graph is a collection of points and lines connecting a subset of them. The points and lines of a graph are also called the vertices and edges of the graph, respectively. If e is an edge of G connecting the vertices u and v, then we write e = uv and say "u and v are adjacent". A path P in a graph G is a sequence v1, v2, …, vr of vertices such that vi and vi+1 are adjacent, 1 ≤ i ≤ r-1. A path graph is a graph consisting of a single path. A connected graph is a graph in which there exists a path between every pair of vertices. The distance d(u,v) = dG(u,v) between two vertices u and v is the length of a shortest (u,v)-path in G.
Let G be a graph. The vertex and edge sets of G are denoted by V(G) and E(G), respectively. A graph H is called a subgraph of G if V(H) ⊆ V(G) and E(H) ⊆ E(G). A cycle graph Cn of order n is a graph with V(Cn) = {v1, v2, …, vn} and E(Cn) = {v1v2, v2v3, …, vn-1vn, vnv1}. An acyclic graph is a graph without a subgraph isomorphic to a cycle graph; a connected acyclic graph is a tree. Two graphs which contain the same number of vertices connected in the same way are said to be isomorphic. Formally, two graphs G and H are said to be isomorphic if there is a one-to-one and onto function f: V(G) → V(H) such that uv ∈ E(G) if and only if f(u)f(v) ∈ E(H).
Let $\mathcal{G}$ be the class of finite graphs. A topological index is a function Top from $\mathcal{G}$ into the real numbers with the property that Top(G) = Top(H) whenever G and H are isomorphic. Obviously, the number of vertices and the number of edges are topological indices. The Wiener index is the first reported distance-based topological index and is defined as half the sum of the distances between all ordered pairs of vertices in a molecular graph. In the last decade or so, various topological indices have been introduced.
Let G be a connected graph and e = uv be an edge of G. The number of vertices of G whose distance to the vertex u is smaller than the distance to the vertex v is denoted by nu(e). Analogously, nv(e) is the number of vertices of G whose distance to the vertex v is smaller than the distance to the vertex u. Suppose x is a vertex of the graph G. The distance between e and x is d(x,e) = min{d(u,x), d(v,x)}. Then mu(e) is the number of edges of G whose distance to the vertex u is smaller than the distance to the vertex v. Analogously, mv(e) is the number of edges of G whose distance to the vertex v is smaller than the distance to the vertex u. Note that edges equidistant from u and v are not counted.
The vertex Szeged index is another topological index, introduced by Ivan Gutman [12]. It is defined as the sum of the products nu(e)nv(e) over all edges of a graph G. The edge Szeged index of G is a recently proposed topological index defined as the sum of the products mu(e)mv(e) over all edges of G. The vertex and edge Szeged indices of the graph G are denoted by Szv(G) and Sze(G), respectively. Therefore,
\[ Sz_v(G) = \sum_{e=uv \in E(G)} n_u(e)\, n_v(e) \quad \text{and} \quad Sz_e(G) = \sum_{e=uv \in E(G)} m_u(e)\, m_v(e). \]
Motivated by the success of the vertex Szeged index, Padmakar Khadikar [10,11] proposed a seemingly similar molecular structure descriptor, which in what follows we call the edge-PI index. In analogy with the definition of the vertex Szeged index, the edge-PI index is defined as
\[ PI_e(G) = \sum_{e=uv \in E(G)} [m_u(e) + m_v(e)]. \]
Quite recently the vertex version of the PI index was also considered [14]. It is defined as
\[ PI_v(G) = \sum_{e=uv \in E(G)} [n_u(e) + n_v(e)] \]
and named the vertex-PI index of G. There is an evident symmetry between the vertex and edge versions of the PI index, as well as between the vertex and edge versions of the Szeged index. Because of this symmetry, it is reasonable to examine some important classes of nanostructures under these topological indices.
3. Edge and Vertex PI Indices of Some Nanostructures
In this section the edge-PI indices of the molecular graphs of some nanostructures are computed. Suppose G is a graph with edge set E = E(G) and f = uv ∈ E. Define $N(f) = |E| - (m_u(f) + m_v(f))$. Then
\[ PI(G) = |E|^2 - \sum_{f \in E} N(f). \qquad (1) \]
We say that e = xy is parallel to f = uv if d(x,f) = d(y,f). In this case, we write e || f. Therefore, for computing the PI index of G it is enough to calculate N(f), the number of edges parallel to f, for every f ∈ E. If T is an acyclic graph containing n vertices then PI(T) = (n-1)(n-2). In particular, PI = 0 for acyclic graphs when n = 1 and 2. Moreover, if Kn denotes a complete graph with n vertices then PI(Kn) = n(n-1)(n-2). We now investigate an important property of the edge PI index.
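The definition of the edge-PI index and Equation (1) can be checked on small graphs. The following is a minimal sketch, not the GAP program used in this chapter: the function names `bfs_distances` and `edge_pi` are illustrative, graphs are assumed to be connected and given as edge lists, and N(f) is taken as the number of edges equidistant from the two ends of f, as in the definition N(f) = |E| − (mu(f) + mv(f)).

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def edge_pi(edges):
    """Edge-PI index of a connected graph given as a list of edges (u, v)."""
    edges = list(edges)
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {v: bfs_distances(adj, v) for v in adj}
    m = len(edges)
    by_definition = 0   # sum of m_u(f) + m_v(f)
    equidistant = 0     # sum of N(f)
    for u, v in edges:
        m_u = m_v = n_f = 0
        for x, y in edges:
            d_u = min(dist[u][x], dist[u][y])  # distance from edge xy to u
            d_v = min(dist[v][x], dist[v][y])  # distance from edge xy to v
            if d_u < d_v:
                m_u += 1
            elif d_v < d_u:
                m_v += 1
            else:
                n_f += 1
        by_definition += m_u + m_v
        equidistant += n_f
    # Equation (1): PI(G) = |E|^2 - sum of N(f)
    assert by_definition == m * m - equidistant
    return by_definition


path5 = [(0, 1), (1, 2), (2, 3), (3, 4)]                # tree with n = 5
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
c5 = [(i, (i + 1) % 5) for i in range(5)]
print(edge_pi(path5), edge_pi(k4), edge_pi(c5))
```

For the path on five vertices the value agrees with (n-1)(n-2), for K4 with n(n-1)(n-2), and for the odd cycle C5 with m(m-1).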
Result 1 Let G be a connected graph with exactly m edges. Then PI(G) ≤ m(m−1), with equality if and only if G is a cycle of odd length or an acyclic graph.
Proof Suppose e = uv is an edge of G. It is clear that mu(e) + mv(e) + N(e) = m and so PI(G) = m² − $\sum_{e \in E} N(e)$. But N(e) ≥ 1, since e itself is equidistant from u and v, and hence $\sum_{e \in E} N(e) \ge \sum_{e \in E} 1 = m$. Therefore, PI(G) = m² − $\sum_{e \in E} N(e)$ ≤ m² − m = m(m−1). We now assume that PI(G) = m(m−1). Since every acyclic graph attains this value, it is enough to consider non-acyclic graphs. Thus G has a cycle C of minimum length k, k ≥ 3. If there exists an edge e for which N(e) > 1 then $\sum_{e \in E} N(e) > m$ and so PI(G) < m(m−1), which is a contradiction. Hence for every edge e, N(e) = 1. Suppose C = x1x2, x2x3, …, xk-1xk, xkx1. We now consider two cases according to whether k is even or odd.
Case 1. k is even. Suppose f = x1x2 is an edge of C. Consider the edge g = $x_{k/2+1}x_{k/2+2}$. Since C has minimum length, d(g,x1) = d(g,x2) = k/2 − 1. Thus, g is equidistant from both ends of the edge f and hence N(f) ≥ 2, a contradiction.
Case 2. k is odd. Suppose f = x1x2 and v = $x_{2+(k-1)/2}$. If deg(v) > 2 then we can choose an edge g = uv in which u is distinct from the xi's. Hence d(g,x1) = d(g,x2) = (k−1)/2, so N(f) ≥ 2, which is impossible. Hence deg(v) = 2. Applying the same argument to the other edges of C, we see that deg(xi) = 2 for every i, 1 ≤ i ≤ k. But G is connected, so G = C, as desired. Conversely, if G is a tree then, as noted above, PI(G) = m(m−1), in which m = |E(G)|. Also, for every cycle G of odd length k = m we have PI(G) = m(m−1), which completes the proof.
We now consider the molecular graph T(h) constructed from a phenylene with h six-membered rings in which each four-membered ring in the phenylene is replaced by a linear array consisting of k, k = 4, 7, 10, ..., four-membered rings, Figure 1.
Figure 1. The graph of linear phenylenes.
Suppose e is an arbitrary edge of the graph of Figure 1. Then PI(T) = |E|² − $\sum_{e \in E} N(e)$. But |E(T)| = 6h + (h−1)(3k−1) and so PI(T) = 36h² + (h−1)²(3k−1)² + 12h(h−1)(3k−1) − $\sum_{e \in E} N(e)$. Therefore, for computing the PI index of T, it is enough to calculate N(e) for every e ∈ E. To calculate N(e), we consider three cases: e is vertical, horizontal or oblique. If e is horizontal or oblique then N(e) = 2, and for vertical edges we have N(e) = hk + h − k + 1. Since there are 6h − 2 + 2(h−1)(k−1) horizontal and oblique edges and hk + h − k + 1 vertical edges, PI(T) = 36h² + (h−1)²(3k−1)² + 12h(h−1)(3k−1) − 2[6h − 2 + 2(h−1)(k−1)] − (hk + h − k + 1)². If we put k = 0 in the last formula, then we obtain the PI index of polyacenes, which was computed before by Khadikar, Karmarkar and Varma [60]. Therefore, we have proved that:
Result 2 Let T be the chemical graph of the linear phenylene with h six-membered rings in which each four-membered ring in the phenylene is replaced by a linear array consisting of k, k = 4, 7, 10, ..., four-membered rings. Then PI(T) = 36h² + (h−1)²(3k−1)² + 12h(h−1)(3k−1) − 2[6h − 2 + 2(h−1)(k−1)] − (hk + h − k + 1)². In particular, the PI index of polyacenes with h hexagons is 24h².
Vukičević and Trinajstić [61] obtained formulae for the Wiener indices of a class of pericondensed benzenoid graphs consisting of one, two and three rows of hexagons of various lengths. They introduced the graph G(m,n) to be the pericondensed benzenoid graph given by Figure 2, in which m and n are positive integers. Here we continue this study to calculate the PI index of these chemical graphs. Without loss of generality, we can assume that n ≥ m.
Figure 2. A pericondensed benzenoid graph consisting of two rows of n and m hexagons, m ≤ n.
It is easy to see that G(m,n) has exactly 5n + 3m + 2 edges. Suppose A and B are the sets of all vertical and oblique edges, respectively. To compute $\sum_{e \in A} N(e)$, we notice that there are two rows of vertical edges. In the first row, N(e) = n+1 and in the second N(e) = m+1. Thus $\sum_{e \in A} N(e)$ = (m+1)² + (n+1)². To compute $\sum_{e \in B} N(e)$, we define:
\[
B_k = \begin{cases}
\{E_k,\ E_{2n+k+1},\ E_{4n+k+3}\} & k \text{ is even and } 4 \le k \le 2m+2, \\
\{E_k,\ E_{2n+k+1},\ E_{4n+k+1}\} & k \text{ is odd and } 1 \le k \le 2m-1, \\
\{E_k,\ E_{2n+k+1}\} & k = 2 \text{ or } (k \text{ is even and } 2m+4 \le k \le 2n), \\
\{E_k,\ E_{2n+k+1}\} & k \text{ is odd and } 2m+1 \le k \le 2n-1.
\end{cases}
\]
It is easy to see that the collection P = {B1, B2, …, B2n} is a partition of B, and if we define X as the union of the sets Bi with |Bi| = 3 and Y as the union of the sets Bi with |Bi| = 2, 1 ≤ i ≤ 2n, then $\sum_{e \in B} N(e) = \sum_{e \in X} N(e) + \sum_{e \in Y} N(e) = \sum_{e \in X} 3 + \sum_{e \in Y} 2 = 3|X| + 2|Y|$. Since |X| = 6m and |Y| = 4(n − m), $\sum_{e \in B} N(e)$ = 3|X| + 2|Y| = 10m + 8n. Therefore, PI(G(m,n)) = (5n + 3m + 2)² − (m+1)² − (n+1)² − (10m + 8n) = 8m² + 24n² + 30mn + 10n + 2. So we have proved the following result:
Result 3 If n ≥ m then PI(G(m,n)) = 8m² + 24n² + 30mn + 10n + 2.
We now present a computational method for computing the PI index of fullerenes. Fullerenes were discovered in 1985 [62,63]. They are carbon-cage molecules in which a large number of carbon atoms are bonded in a nearly spherically symmetric configuration. Let p, h, n and m be the numbers of pentagons, hexagons, carbon atoms and bonds between them in a given fullerene F. Since each atom lies in exactly 3 faces and each edge lies in 2 faces, the number of atoms is n = (5p+6h)/3, the number of edges is m = (5p+6h)/2 = (3/2)n and the number of faces is f = p + h. By Euler's formula n − m + f = 2, one can deduce that (5p+6h)/3 − (5p+6h)/2 + p + h = 2, and therefore p = 12, n = 2h + 20 and m = 3h + 30. This implies that such a molecule, made up entirely of n carbon atoms, has 12 pentagonal and (n/2 − 10) hexagonal faces, where n ≠ 22 is an even natural number greater than or equal to 20. Using Table 1, the PI indices of the fullerene graphs of C24, C36, C48, C60, C72, C84, C96, C108, C120 and C132 are computed as PI(C24) = 996, PI(C36) = 2280, PI(C48) = 4188, PI(C60) = 6672, PI(C72) = 9816, PI(C84) = 13512, PI(C96) = 17856, PI(C108) = 22848, PI(C120) = 28488 and PI(C132) = 34776.
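The Euler-formula step in the fullerene paragraph above can be written out explicitly. This is only the arithmetic behind the stated values, using the same symbols p, h, n, m and f:

\[
n - m + f = \frac{5p+6h}{3} - \frac{5p+6h}{2} + p + h
= \frac{2(5p+6h) - 3(5p+6h) + 6p + 6h}{6} = \frac{p}{6} = 2
\;\Longrightarrow\; p = 12,
\]
\[
n = \frac{60 + 6h}{3} = 2h + 20, \qquad m = \frac{60 + 6h}{2} = 3h + 30, \qquad h = \frac{n}{2} - 10 .
\]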
Result 4 For n ≥ 12, PI(C12n) = 324n² – 516n + 1248.
Proof From Figure 3, one can see that there are six types of edges in the fullerene graph C12n. These are the first and second types of vertical edges (I, II), the first and second types of oblique edges (III, IV) and the edges of the central and outer hexagons (V, VI). In Table 2 the values of N(e) for these types of edges are given. By Table 2 and Equation (1), given in the first paragraph of this section, the proof is complete.
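As a quick consistency check, the closed form of Result 4 can be compared with the PI values listed above for C24 to C132. The sketch below assumes only those listed values, the fact that C12n has 18n edges, and the bound of Result 1; the names `tabulated` and `pi_closed_form` are illustrative.

```python
# PI values quoted in the text for the small fullerenes C24, ..., C132
tabulated = {
    24: 996, 36: 2280, 48: 4188, 60: 6672, 72: 9816,
    84: 13512, 96: 17856, 108: 22848, 120: 28488, 132: 34776,
}


def pi_closed_form(n):
    """Closed form of Result 4 for C_{12n}: |E|^2 = (18n)^2 = 324 n^2."""
    return 324 * n * n - 516 * n + 1248


# Atom counts for which the closed form reproduces the tabulated value
agreeing = [atoms for atoms, value in tabulated.items()
            if pi_closed_form(atoms // 12) == value]

# Result 1 bound: PI(G) <= m(m - 1), with m = 3 * atoms / 2 edges
within_bound = all(value <= (3 * atoms // 2) * (3 * atoms // 2 - 1)
                   for atoms, value in tabulated.items())

print(agreeing, within_bound)
```

Result 4 is stated only for n ≥ 12; on the listed data the closed form already coincides with the tabulated values from C72 onwards, while C24 to C60 differ from it.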
Our calculations of topological indices of graphs were carried out with the GAP system. The method described in the last result is quite general and can be extended to solve several problems in computational chemistry.
Table 1. The Values of mu(e) and mv(e) for some Exceptional Cases of C12n
Edges Type V and VI Edges Type I Edges Type IV Edges Type II Edges
C24
C36
C48
C60
C72
# edges
15,15,6
20,20,14
25,25,22
27,27,36
29,29,50
12
12,15,9
20,23,11 21,19,14
35,25,12 20,33,19
48,29,13 20,48,22
31,63,14 20,66,22
12 12(n-3)
-
-
30,30,12
35,42,13 42,35,13
39,55,14 47,47,14 55,39,14
6(n-2)
24,24,6
24, 42,6 42,24,6
60,24,26 42,42,6 60,24,26
24,78,6 42,60,6 60,42,6 78,24,6
12
C84
C96
C108
C120
C132
# edges
29,29,68
29,29,86
29,29,104
29,29,122
29,29,140
12
79,31,16
31,97,16 102,20,2 2
115,31,16
133,31,16
151,31,16
12
120,20,22
138,20,22
156,20,22
12(n-3)
41,122,17 53,109,18 95,66,19 81,81,18 66,95,19 109,53,18 122,41,17
41,140,15 127,53,16 113,66,17 81,97,18 97,81,18 66,113,17 53,127,16 140,41,15 24,168,4 42,150,4 60,132,4 78,114,4 96,96,4 114,78,4 132,60,4 150,42,4 168,24,4
Type III Edges
Edges Type V and VI Edges Type I Edges Type IV Edges
Type II Edges
Type III Edges
84,20,22
70,41,15 51,60,15 60,51,15 41,70,15
86,41,17 53,75,16 64,64,16 75,53,16 41,86,17
41,104,17 91,53,18 66,79,17 79,66,17 53,91,18 41,104,17
24,96,6 42,78,6 60,60,6 78,42,6 96,24,6
24,114,6 42,96,6 60,78,6 78,60,6 96,42,6 114,24,6
24,132,6 42,114,6 60,96,6 78,78,6 96,60,6 114,42,6 132,24,6
24,150,6 42,132,6 60,114,6 78,96,6 96,78,6 114,60,6 132,42,6 150,24,6
6(n-2)
12
Table 2. The Values of N(e) for Six Types of Distinguishable Edges

Edges            N(e)                                                  No
Type I           22                                                    6
Type II          6                                                     6(n−2)
Type III         16                                                    24
Type IV          17, 18, 19, 20, 22 ((n−1) times), 20, 19, 18, 17      12
Type V and VI    18n−58                                                12
This software was constructed by GAP’s team in Aachen [64]. Here, GAP stands for Groups, Algorithms and Programming. The name was chosen to reflect the aim of the system, which is a group theoretical software for solving computational problems in group theory. Recently, after including GRAPE into GAP it is possible to apply this computer algebra system to solve problems in graph theory.
Figure 3. The Fullerene Graph C12n. The labelled edges are a typical edge of the central hexagon, a typical edge of the outer hexagon, a first type and a second type vertical edge, and a first type and a second type oblique edge.
Recent years have seen a rapid spread of interest in the understanding, design and even implementation of graph theoretical algorithms. These are gradually becoming accepted both as standard tools for a working group theoretician, like certain methods of proof, and as worthwhile objects of study, like connections between notions expressed in theorems. GAP is a free and extensible software package for computation in discrete abstract algebra. The term extensible means that you can write your own programs in the GAP language, and use them in just the same way as the programs which form part of the system (the "library"). More information on the motivation and development of GAP to date can be found on the GAP web page at http://www.gap-system.org. We apply this software to compute the edge-PI, vertex-PI, edge-Szeged and vertex-Szeged indices of molecular graphs. To do this, we first draw the molecule with HyperChem [65]. We then compute the distance matrix of its molecular graph with TopoCluj [66], a software package constructed by Diudea and his team in Cluj. Finally, we prepare a GAP program for computing the vertex PI and Szeged indices of the molecular graph under consideration. Our programs are available from the author upon request.
Figure 4. The Fullerene Molecule F24n+12 Containing 24n + 12 Carbon Atoms, with the edge types e1, …, e10 labelled.
We now compute the vertex PI index of a new type of fullerenes with 24n + 12 carbon atoms. At first, we present a simple formula for computing the vertex PI index of molecular graphs. Suppose G is a molecular graph, E = E(G) and V = V(G). For an edge e = uv define $N(e) = |V| - (n_u(e) + n_v(e))$. Then
\[ PI_v(G) = \sum_{e=uv} [\,|V| - N(e)\,] = |V|\,|E| - \sum_{e=uv} N(e). \]
By this equation, one can compute the vertex PI index of every fullerene graph F by computing the number of vertices co-distant from the ends of a given edge e of F. It is possible to prepare a GAP program for computing the vertex PI index of graphs. To do this, we assume that G is an n atom molecular graph with adjacency matrix A and distance matrix D. The distance matrix D = [dij] of G is another n × n matrix in which dij is the length of a minimum path connecting vertices i and j, i ≠ j, and zero otherwise. To compute the vertex PI index of G, we first draw it with HyperChem and then apply TopoCluj to compute the adjacency and distance matrices of G. We then load A and D into our GAP program to compute the vertex PI index of G. From Figure 4, one can see that there are ten types of edges in the fullerene graph F = F24n+12. In the following table, the value of N(e) is computed for each case.
Table 3. The Number of Co-Distant Vertices of F for Edges e1, …, e10.

Edge Type    N(e)    The Number of Similar Edges
e1           0       6(5n−22)
e2           2       12
e3           4       12
e4           6       24
e5           12      24
e6           14      24
e7           16      6(n−3)
e8           24      12
e9           56      12
e10          76      24
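The computation from the matrices A and D described above can be sketched as follows. This is a Python illustration rather than the author's GAP program; `distance_matrix` merely stands in for the TopoCluj output, and `cycle_adjacency` and `vertex_pi` are hypothetical helper names.

```python
from collections import deque


def cycle_adjacency(n):
    """Adjacency matrix of the cycle C_n, used here only as test input."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        A[i][(i + 1) % n] = A[(i + 1) % n][i] = 1
    return A


def distance_matrix(A):
    """Distance matrix of a connected graph from its adjacency matrix (BFS)."""
    n = len(A)
    D = [[0] * n for _ in range(n)]
    for s in range(n):
        seen = {s: 0}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in range(n):
                if A[x][y] and y not in seen:
                    seen[y] = seen[x] + 1
                    queue.append(y)
        for t, d in seen.items():
            D[s][t] = d
    return D


def vertex_pi(A, D):
    """Vertex PI index: sum over edges uv of n_u(e) + n_v(e)."""
    n = len(A)
    total = num_edges = codistant = 0
    for u in range(n):
        for v in range(u + 1, n):
            if A[u][v]:
                num_edges += 1
                n_u = sum(1 for w in range(n) if D[u][w] < D[v][w])
                n_v = sum(1 for w in range(n) if D[v][w] < D[u][w])
                total += n_u + n_v
                codistant += n - n_u - n_v   # N(e): vertices equidistant from u and v
    # PI_v(G) = |V||E| - sum of N(e)
    assert total == n * num_edges - codistant
    return total


A5 = cycle_adjacency(5)
print(vertex_pi(A5, distance_matrix(A5)))
```

For a bipartite graph such as C4 or C6 no vertex is co-distant from the ends of an edge, so the index reduces to |V||E|.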
By this table and Figure 4, one can prove the following result:
Result 5 If F denotes the fullerene molecule of Figure 4, then the vertex PI index of F is computed as follows: PIv(F) = 864n² + 2832n – 3144.
4. Edge and Vertex Szeged Indices of Some Nanostructures
In this section we consider the problem of computing the edge and vertex Szeged indices of molecular graphs. We begin with nanotubes and nanotori covered by C4. We first introduce some concepts. Suppose G and H are graphs. The Cartesian product G × H of graphs G and H has the vertex set V(G × H) = V(G) × V(H), and (a,x)(b,y) is an edge of G × H if a = b and xy ∈ E(H), or ab ∈ E(G) and x = y. It is well known that the Cartesian product is commutative and associative, |V(G × H)| = |V(G)||V(H)| and |E(G × H)| = |E(G)||V(H)| + |V(G)||E(H)|. On the other hand, if (a,x) and (b,y) are vertices of G × H then $d_{G \times H}((a,x),(b,y)) = d_G(a,b) + d_H(x,y)$, see [67] for details. Consider the path Pn with n vertices and the cycle Cm with m vertices. Then R = Pn × Cm and S = Cn × Cm are nanotubes and nanotori with mn vertices covered by C4. On the other hand, |E(R)| = 2mn – m, |E(S)| = 2mn, Sz(Pn) = n(n² – 1)/6 and
\[
Sz(C_m) = \begin{cases}
\dfrac{m^3}{4} & m \text{ is even}, \\[2mm]
\dfrac{m(m-1)^2}{4} & m \text{ is odd}.
\end{cases}
\]
In [68], the authors studied the Wiener and Szeged indices of the Cartesian product of two graphs and proved that Sz(G × H) = |V(G)|³Sz(H) + |V(H)|³Sz(G). Then by these formulae and the above calculations, one can see that
Result 6 The vertex Szeged indices of nanotubes and nanotori covered by C4 are computed as follows:
\[
Sz(R) = \begin{cases}
\dfrac{nm^3(5n^2 - 2)}{12} & m \text{ is even}, \\[2mm]
\dfrac{nm(5n^2m^2 - 6n^2m + 3n^2 - 2m^2)}{12} & m \text{ is odd},
\end{cases}
\]
\[
Sz(S) = \begin{cases}
\dfrac{n^3m^3}{2} & n \text{ and } m \text{ are even}, \\[2mm]
\dfrac{2n^3m^3 - 2m^3n^2 - 2m^2n^3 + m^3n + mn^3}{4} & n \text{ and } m \text{ are odd}, \\[2mm]
\dfrac{2n^3m^3 - 2n^3m^2 + n^3m}{4} & n \text{ is even and } m \text{ is odd}, \\[2mm]
\dfrac{2n^3m^3 - 2n^2m^3 + nm^3}{4} & m \text{ is even and } n \text{ is odd}.
\end{cases}
\]
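The closed forms of Result 6 can be compared with a direct computation on small tubes and tori. The sketch below is a brute-force check under stated assumptions: graphs are stored as adjacency dictionaries, the helper names are illustrative, and the formulas `sz_tube` and `sz_torus` are simply Result 6 written with integer arithmetic.

```python
from collections import deque


def path_graph(n):
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def cycle_graph(m):
    return {i: {(i - 1) % m, (i + 1) % m} for i in range(m)}


def cartesian_product(G, H):
    """(a,x)(b,y) is an edge if a = b and xy in E(H), or ab in E(G) and x = y."""
    return {(a, x): {(b, x) for b in G[a]} | {(a, y) for y in H[x]}
            for a in G for x in H}


def bfs(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def vertex_szeged(adj):
    """Sum of n_u(e) * n_v(e) over all edges e = uv."""
    dist = {v: bfs(adj, v) for v in adj}
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each edge once
                n_u = sum(1 for w in adj if dist[u][w] < dist[v][w])
                n_v = sum(1 for w in adj if dist[v][w] < dist[u][w])
                total += n_u * n_v
    return total


def sz_tube(n, m):
    """Result 6 for R = P_n x C_m."""
    if m % 2 == 0:
        return n * m**3 * (5 * n * n - 2) // 12
    return n * m * (5 * n*n*m*m - 6 * n*n*m + 3 * n*n - 2 * m*m) // 12


def sz_torus(n, m):
    """Result 6 for S = C_n x C_m."""
    if n % 2 == 0 and m % 2 == 0:
        return n**3 * m**3 // 2
    if n % 2 == 1 and m % 2 == 1:
        return (2*n**3*m**3 - 2*m**3*n**2 - 2*m**2*n**3 + m**3*n + m*n**3) // 4
    if n % 2 == 0:
        return (2*n**3*m**3 - 2*n**3*m**2 + n**3*m) // 4
    return (2*n**3*m**3 - 2*n**2*m**3 + n*m**3) // 4


tube = cartesian_product(path_graph(3), cycle_graph(4))
torus = cartesian_product(cycle_graph(4), cycle_graph(3))
print(vertex_szeged(tube), sz_tube(3, 4))
print(vertex_szeged(torus), sz_torus(4, 3))
```

Brute force is quadratic in the number of vertices per edge, so this is meant for small n and m only.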
We now present a powerful method for computing the vertex and edge Szeged indices of molecular graphs. To explain our method, we consider the molecular graph of a water-soluble polyaryl ether dendrimer G[n], Figure 5. Let G be a graph. A subgraph H of G is called convex if for each pair of vertices x, y ∈ V(H) there exists no shortest path in G from x to y which involves a vertex w ∈ V(G) − V(H). Define H[n] to be the graph constructed from G[n] by deleting almost one half of its vertices and edges, Figure 6. In [69], the authors proved that if $\{F_i\}_{1 \le i \le k}$ is a partition of E(G) such that for each i, 1 ≤ i ≤ k, G − Fi is a two-component graph with convex components, then
\[ W(G) = \sum_{1 \le i \le k} |V(G_{F_i}(1))| \cdot |V(G_{F_i}(2))|, \]
where $G_{F_i}(1)$ and $G_{F_i}(2)$ are the two components of G − Fi.
Figure 5. The Molecular Graph of G[4].
If we omit an edge outside the hexagons of G[n], then the components of the new graph are clearly convex. On the other hand, the components of the graphs obtained from G[n] by deleting two non-adjacent parallel edges of a hexagon are also convex. These subsets constitute a partition $\{F_i\}_{1 \le i \le k}$ of E(G[n]), and G[n] – Fi has the properties required by the result mentioned above. Define gn = |V(G[n])| and hn = |V(H[n])|. Then $h_n = 5 \times 2^{n+2} - 10$ and $g_n = 2h_n + 16 = 5 \times 2^{n+3} - 4$. Suppose a* = a(gn – a). Then we have:
Result 7 The Wiener and vertex-Szeged indices of G[n] are computed as follows:
\[ W(G[n]) = 502 - 3440 \cdot 4^n + 8000\, n\, 4^n + 2800\, n\, 2^n + 8548 \cdot 2^n, \]
\[ Sz(G[n]) = 726 + 11200\, n\, 4^n + 3920\, n\, 2^n - 3600 \cdot 4^n + 11592 \cdot 2^n. \]
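As a numerical cross-check of Result 7, the sketch below evaluates the edge-cut sums used in its proof and compares them with the closed forms. It assumes the vertex counts h_k = 5·2^(k+2) − 10 and g_n = 5·2^(n+3) − 4 and the notation a* = a(g_n − a); the function names are illustrative. A two-edge hexagon cut is counted once for W and twice for Sz (parameter `hexagon_weight`), and the coefficient of n·2^n in W is taken as 2800, the value for which the closed form agrees with the sums.

```python
def h(k):
    """Number of vertices of H[k]."""
    return 5 * 2 ** (k + 2) - 10


def g(n):
    """Number of vertices of G[n]; g(n) = 2 h(n) + 16."""
    return 5 * 2 ** (n + 3) - 4


def cut_sum(n, hexagon_weight):
    """2 * (sum over H[n]) + (sum over the core S) of the cut products a*."""
    gn = g(n)
    w = hexagon_weight

    def star(a):
        return a * (gn - a)

    # hexagon cuts inside one half H[n]
    half = sum(2 ** i * w * (star(h(n - i) - 3) + 2 * star(h(n - i - 1) + 5))
               for i in range(n))
    half += 2 ** n * 3 * w * star(gn - 7)
    # single-edge cuts inside one half H[n]
    half += sum(2 ** i * (star(h(n - i) + 2) + star(h(n - i) + 1) + star(h(n - i)))
                for i in range(1, n + 1))
    half += 2 ** n * (2 * star(gn - 1) + star(gn - 2) + star(gn - 4))
    # cuts inside the core S
    core = 2 * (star(h(n)) + star(h(n) + 1) + star(h(n) + 2)
                + 3 * w * star(h(n) + 5)) + star(gn // 2)
    return 2 * half + core


def wiener_closed(n):
    return 502 - 3440 * 4**n + 8000 * n * 4**n + 2800 * n * 2**n + 8548 * 2**n


def szeged_closed(n):
    return 726 + 11200 * n * 4**n + 3920 * n * 2**n - 3600 * 4**n + 11592 * 2**n


for n in range(4):
    print(n, cut_sum(n, 1) == wiener_closed(n), cut_sum(n, 2) == szeged_closed(n))
```

For n = 0 the dendrimer has 36 vertices and both routes give W = 5610 and Sz = 8718.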
Proof Consider a hexagon C6 in H[n] ⊆ G[n]. From Figures 6–8, one can see that G[n] – {e1,e4} has exactly two components, both of them convex, and one of the components has $h_\alpha - 3$ vertices, 0 ≤ α ≤ n. We notice that the number of such hexagons is $2^{n-\alpha}$. Similarly, G[n] – {e2,e5} and G[n] – {e3,e6} also have two components, both of them convex, and one of their components has $h_{\alpha-1} + 5$ vertices, 0 ≤ α ≤ n. Suppose e is an edge outside the cycles of G[n]. Then G[n] – e has exactly two convex components. One of these components has $h_\alpha$, $h_\alpha + 1$ or $h_\alpha + 2$ vertices and the number of such edges is $2^{n-\alpha}$, 1 ≤ α ≤ n. For α = 0, one can see that there are $2^n$ hexagons and G[n] – {e1,e4}, G[n] – {e2,e5} and G[n] – {e3,e6} have exactly two components, both of them convex, one of which has 7 vertices. There is a similar argument for the other edges of G[n], and so there is a partition $\{F_i\}_{1 \le i \le r}$ in which G – Fi has two convex components. Therefore,
\[
\sum_{e \in H[n]} m(e) = \sum_{i=0}^{n-1} 2^i \left[ (h_{n-i} - 3)^* + 2(h_{n-i-1} + 5)^* \right] + 2^n \cdot 3 \cdot (g_n - 7)^*
+ \sum_{i=1}^{n} 2^i \left[ (h_{n-i} + 2)^* + (h_{n-i} + 1)^* + (h_{n-i})^* \right]
+ 2^n \left[ 2(g_n - 1)^* + (g_n - 2)^* + (g_n - 4)^* \right].
\]
Figure 6. The Molecular Graph of H[4].
Figure 7. The Position of Edges in a Hexagon.
Figure 8. The Core of G[n].
On the other hand, if S is the core of G[n], then we have:
\[
\sum_{e \in S} m(e) = 2\left[ h_n^* + (h_n + 1)^* + (h_n + 2)^* + 3(h_n + 5)^* \right] + \left( \frac{g_n}{2} \right)^*.
\]
Therefore, $W(G[n]) = 2\sum_{e \in H[n]} m(e) + \sum_{e \in S} m(e) = 502 - 3440 \cdot 4^n + 8000\, n\, 4^n + 2800\, n\, 2^n + 8548 \cdot 2^n$. This completes the first part of the result. To prove the second part, we choose the set F = {uv}, where uv is an edge outside the hexagons of G[n]. By the definition of the Szeged index and the partition of edges described above, $n_u(e)n_v(e) = |V(G_F(1))| \cdot |V(G_F(2))|$. Similarly, if F = {uv, ab} then $n_u(e)n_v(e) = n_a(e)n_b(e) = |V(G_F(1))| \cdot |V(G_F(2))|$. So, by a similar argument as above,
\[
\sum_{e=uv \in E(H[n])} n_u(e)n_v(e) = 2\sum_{i=0}^{n-1} 2^i \left[ (h_{n-i} - 3)^* + 2(h_{n-i-1} + 5)^* \right] + 2^{n+1} \cdot 3 \cdot (g_n - 7)^*
+ \sum_{i=1}^{n} 2^i \left[ (h_{n-i} + 2)^* + (h_{n-i} + 1)^* + (h_{n-i})^* \right]
+ 2^n \left[ 2(g_n - 1)^* + (g_n - 2)^* + (g_n - 4)^* \right],
\]
\[
\sum_{e=uv \in E(S)} n_u(e)n_v(e) = 2\left[ h_n^* + (h_n + 1)^* + (h_n + 2)^* + 3 \cdot 2\,(h_n + 5)^* \right] + \left( \frac{g_n}{2} \right)^*.
\]
Therefore, $Sz(G[n]) = 2\sum_{e=uv \in E(H[n])} n_u(e)n_v(e) + \sum_{e=uv \in E(S)} n_u(e)n_v(e) = 726 + 11200\, n\, 4^n + 3920\, n\, 2^n - 3600 \cdot 4^n + 11592 \cdot 2^n$, which completes our argument.
We now compute the Szeged index of the zig-zag polyhex nanotube T = TUHC6[p,q], Figure 9(a). It is clear that T has exactly 2pq vertices and p(3q − 1) edges. Suppose A and B are the sets of all vertical and oblique edges of T, respectively. Then $Sz(T) = \sum_{e=ij \in A} n_i(e)n_j(e) + \sum_{e=ij \in B} n_i(e)n_j(e)$. We write $Sz_1(T) = \sum_{e=ij \in A} n_i(e)n_j(e)$ and $Sz_2(T) = \sum_{e=ij \in B} n_i(e)n_j(e)$. To compute Sz(T), we first compute Sz1(T). Suppose e = uv is an arbitrary vertical edge in the ith row, Figure 9(b). One can see that T has exactly p vertical edges in each row and that there are 2pi vertices above the ith row, which are all closer to v than to u. Therefore,
\[ Sz_1(T) = \sum_{e=ij \in A} n_i(e)n_j(e) = \sum_{i=1}^{q-1} p\,(2pi)(2pq - 2pi) = \frac{2}{3}\, p^3 q (q^2 - 1). \]
Figure 9. (a) A zig-zag polyhex nanotube, (b) 2-Dimensional Lattice of T.
Figure 10. Oblique and Vertical Lines of T with p < q.
To calculate Sz2(T), we consider five separate cases: q ≤ p, q = p+1, p+1 < q < 2p, q = 2p and q > 2p. We explain our method for computing the summations which are needed for calculating the Szeged index of T. Without loss of generality, suppose e = x33x34. To calculate the number of vertices closer to x34, we first draw two copies of the 2-dimensional lattice of T and then cut e by an oblique line and pass a vertical line through x3(p+4). Then the vertices closer to x34 lie in the triangular region between those two lines. In general, to calculate ni(xi(i+1)), we compute the number of vertices in the triangular or trapezoidal region, Figure 10, bounded by the vertical line passing through xi(p+i+2) and the oblique line passing through eii = xiixi(i+1). Suppose q ≤ p and eis = xisxi(s+1), eir = xirxi(r+1) are two arbitrary oblique edges of T, Figure 10. Since T is bipartite, ni(xir) + ni(xi(r+1)) = ni(xis) + ni(xi(s+1)) = 2pq. So, it is sufficient to consider one oblique edge in each zig-zag. We have ni(xii) = 1 + 2 + … + q + q(p − q + i − 1) = Tq + q(p − q + i − 1), where Tq = 1 + 2 + … + q is the qth triangular number, and ni(xi(i+1)) = 2pq − (Tq + q(p − q + i − 1)) = Tq + q(p − i). Therefore,
\[ Sz_2(T) = \sum_{e=ij \in B} n_i(e)n_j(e) = 2p \sum_{i=1}^{q} \left[ T_q + q(p - q + i - 1) \right]\left[ T_q + q(p - i) \right] = 2p^3q^3 + \frac{1}{6}\,pq^3 - \frac{1}{6}\,pq^5 . \]
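For q ≤ p, adding Sz1(T) and Sz2(T) gives Sz(T) = (8/3)p³q³ − (2/3)p³q + (1/6)pq³ − (1/6)pq⁵. The sketch below compares this sum with a brute-force computation on a few very small tubes. The construction of TUHC6[p,q] used here (q zig-zag rings of 2p vertices, with p vertical edges between consecutive rings) is an assumption chosen to match the stated vertex and edge counts 2pq and p(3q − 1); the function names are illustrative.

```python
from collections import deque


def zigzag_nanotube(p, q):
    """Adjacency dict of a zig-zag polyhex nanotube: q rings of 2p vertices."""
    adj = {(r, c): set() for r in range(q) for c in range(2 * p)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for r in range(q):
        for c in range(2 * p):
            link((r, c), (r, (c + 1) % (2 * p)))   # oblique edge within a ring
            if r + 1 < q and (r + c) % 2 == 0:
                link((r, c), (r + 1, c))           # vertical edge to the next ring
    return adj


def bfs(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def vertex_szeged(adj):
    dist = {v: bfs(adj, v) for v in adj}
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each edge once
                n_u = sum(1 for w in adj if dist[u][w] < dist[v][w])
                n_v = sum(1 for w in adj if dist[v][w] < dist[u][w])
                total += n_u * n_v
    return total


def sz_closed_form(p, q):
    """Sz1 + Sz2 for q <= p, written over the common denominator 6."""
    return (16 * p**3 * q**3 - 4 * p**3 * q + p * q**3 - p * q**5) // 6


for p, q in [(2, 1), (3, 1), (2, 2), (3, 2)]:
    tube = zigzag_nanotube(p, q)
    edges = sum(len(nbrs) for nbrs in tube.values()) // 2
    print(p, q, len(tube), edges, vertex_szeged(tube), sz_closed_form(p, q))
```

For q = 1 the tube degenerates to the cycle with 2p vertices, whose Szeged index 2p³ is recovered by the formula.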
Using a similar argument as above we can compute the Szeged index of T in other cases and we have:
Result 8 With above notations, we have: 8 3 3 2 3 1 3 1 5 3 p q 3 p q 6 pq 6 pq 5 p 6 43 p5 35 p 4 5 p3 1 p 2 2 6 6 6 3 2 2 2 2 p3q3 p3q pq p 2 1 pq5 4 p 2 q 4 3 3 15 15 5 3 Sz (T ) 5 8 1 4 1 pq3 p 2 q 2 p 4 p5q p 6 3 3 3 3 5 89 5 1 p6 p4 p2 3 15 5 3 3 4 3 2 2 4 5 13 6 4 2 p q p q p p q p p 3 15 3 15
q p q p 1
. p 1 q 2 p q 2p q 2p
Figure 11. The One-Heptagonal Carbon Nanocone CNC7[4].
We end this section by computing edge Szeged index of one-pentagonal carbon nanocones. One pentagonal carbon nanocones originally discovered by Ge and Sattler in 1994, [70]. These are constructed from a graphene sheet by removing a 60° wedge and joining the edges produces a cone with a single pentagonal defect at the apex. The inclusion of the heptagons in the hexagonal lattice leads to the appearance of negative curvature, Figure 11. The single sevenfold in the plain graphene lattice was theoretically studied in [71], but
Topological Indices of Nanostructures
515
this situation, unfortunately, has not been observed in the experiment yet. The heptagons were observed in the nanotubes [72] and in the work [73] the magnetic properties of negatively curved structures were calculated. To simplify our argument, we assume that S = CNC7[n]. Two edges e = xy and f = uv of a graph G are said to be co-distant if and only if d(x,u) = d(y,v), d(x,v) = d(y,u) and |d(x,u) – d(x,v)| = 1. The number of edges parallel to a given edge e = uv is denoted by N(e). Suppose
$\beta_n$ denotes the number of edges in the boundary of $S = CNC_7[n]$. Then $\beta_n = \beta_{n-1} + 14$ and $\beta_1 = 21$, so $\beta_n = 7 + 14n$. On the other hand, there are $7n$ edges that connect the boundary of $CNC_7[n]$ to the boundary of $CNC_7[n-1]$. Thus

$$|E(CNC_7[n])| = \beta_n + 7n + |E(CNC_7[n-1])| = 7 + 21n + |E(CNC_7[n-1])|.$$

Define $x_n = |E(CNC_7[n])|$ to obtain the recurrence relation $x_n = 7 + 21n + x_{n-1}$. This is a linear recurrence equation with initial condition $x_1 = 35$. Solving it gives $x_n = |E(S)| = (7/2)(3n^2 + 5n + 2)$.

There are two types of edges in the molecular graph of S. The first type consists of the edges that are parallel to the edges of the central heptagon. Suppose $\varepsilon_i$, $1 \le i \le 7$, are the edges of the central heptagon and $\varepsilon_i'$ are their equidistant edges in the boundary of S, respectively. For every $\varepsilon_i$, there are $n+1$ edges co-distant to $\varepsilon_i$. The second type consists of the $2n$ edges located between every $\varepsilon_i$ and $\varepsilon_{i+1}$. We denote these $2n$ edges in a fixed region by $e_1 = x_1x_2$, $e_2 = x_2x_3$, …, $e_{2n} = x_{2n}x_{2n+1}$. They have the property that $N(e_1) = N(e_{2n})$, $N(e_2) = N(e_{2n-1})$, …, $N(e_n) = N(e_{n+1})$, and so it is enough to compute $N(e_{2j-1})$ for $1 \le j \le n$.

From the molecular graph of $S = CNC_7[n]$, one can see that there are $n+1$ edges which are co-distant to $\varepsilon_i$. Besides that, there are $(3/2)(n^2 + n)$ edges in the triangular region of S which are parallel to $\varepsilon_i$. So $N(\varepsilon_i) = (1/2)(3n^2 + 5n + 2)$ and therefore

$$m_{a_i}(\varepsilon_i) = m_{b_i}(\varepsilon_i) = \tfrac{1}{2}\bigl(|E(S)| - N(\varepsilon_i)\bigr) = \tfrac{3}{2}(3n^2 + 5n + 2).$$

By the orthogonal cut method of John, Khadikar and Singh, one can see that all edges parallel to $e_{2j-1}$ are co-distant edges of $e_{2j-1}$ for $1 \le j \le n$. It now follows from the molecular graph of $CNC_7[n]$ that there are $n+j+1$ edges parallel to $e_{2j-1}$, and so $N(e_{2j-1}) = n + j + 1$, $1 \le j \le n$. Now we are ready to state the main result of this section.
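The counting argument above can be checked numerically. The following sketch iterates the edge-count recurrence (step 7 + 21n, initial value 35) and compares it with the closed form (7/2)(3n² + 5n + 2); it also recomputes the number of edges parallel to a heptagon edge and the resulting half-difference. The function names are illustrative and not taken from the original text.

```python
def edge_count_by_recurrence(n):
    """Iterate x_k = 7 + 21 k + x_{k-1} starting from x_1 = 35."""
    x = 35
    for k in range(2, n + 1):
        x += 7 + 21 * k
    return x


def edge_count_closed_form(n):
    """Closed form (7/2)(3n^2 + 5n + 2); the bracket is always even."""
    return 7 * (3 * n * n + 5 * n + 2) // 2


def parallel_to_heptagon_edge(n):
    """(n + 1) co-distant edges plus (3/2)(n^2 + n) edges in the triangular region."""
    return (n + 1) + 3 * (n * n + n) // 2


def closer_to_one_end(n):
    """Half of the edges that are not parallel to the heptagon edge."""
    return (edge_count_closed_form(n) - parallel_to_heptagon_edge(n)) // 2


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, edge_count_by_recurrence(n), edge_count_closed_form(n),
              parallel_to_heptagon_edge(n), closer_to_one_end(n))
```

Integer arithmetic is sufficient here because 3n² + 5n + 2 = (3n + 2)(n + 1) and n² + n are even for every n.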
Result 9 The edge Szeged index of one-heptagonal carbon nanocones is as follows:

$$Sz_e(S) = \frac{2835}{16}n^6 + \frac{63091}{80}n^5 + \frac{71827}{48}n^4 + \frac{76111}{48}n^3 + \frac{12173}{12}n^2 + \frac{22603}{60}n + 63.$$
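Proof By the above calculations, the values of $m_{a_i}(\varepsilon_i)$ and $m_{b_i}(\varepsilon_i)$, $1 \le i \le 7$, were computed. Now we compute $m_{x_{2j-1}}(e_{2j-1})$ and $m_{x_{2j}}(e_{2j-1})$, $1 \le j \le n$. By Figure 11 and the calculations given above,

$$m_{x_{2j+1}}(e_{2j+1}) = m_{x_{2j-1}}(e_{2j-1}) + (n + j + 1) + (2n + 2j + 2).$$

By solving this recurrence relation, we have $m_{x_{2j-1}}(e_{2j-1}) = (3/2)j^2 + (3n + 3/2)j - (n + 1)$. By substituting this value in $m_{x_{2j-1}}(e_{2j-1}) + m_{x_{2j}}(e_{2j-1}) + (n + j + 1) = |E(S)|$, we conclude that $m_{x_{2j}}(e_{2j-1}) = (7/2)(3n^2 + 5n + 2) - (3/2)j^2 - (3n + 5/2)j$. Therefore,

$$\begin{aligned}
Sz_e(S) &= \sum_{i=1}^{7} m_{a_i}(\varepsilon_i)\, m_{b_i}(\varepsilon_i)\,(n+1) + 7\sum_{j=1}^{n} m_{x_{2j-1}}(e_{2j-1})\, m_{x_{2j}}(e_{2j-1})\,(n+j+1) \\
&= 7\Bigl(\tfrac{3}{2}(3n^2+5n+2)\Bigr)^{2}(n+1) + 7\sum_{j=1}^{n}\Bigl(\tfrac{3}{2}j^2 + \bigl(3n+\tfrac{3}{2}\bigr)j - (n+1)\Bigr)\Bigl(|E(S)| - \tfrac{3}{2}j^2 - \bigl(3n+\tfrac{5}{2}\bigr)j\Bigr)(n+j+1) \\
&= \frac{2835}{16}n^6 + \frac{63091}{80}n^5 + \frac{71827}{48}n^4 + \frac{76111}{48}n^3 + \frac{12173}{12}n^2 + \frac{22603}{60}n + 63,
\end{aligned}$$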
which completes the proof. Among the three methods presented here, we believe that the computational method, which combines the three software packages HyperChem, TopoCluj and GAP, is very efficient and can be used for most molecular graphs arising from nanostructures.
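The polynomial of Result 9 can be compared with the class-by-class summation used in the proof. The sketch below is only a numerical check under the reading of the proof adopted here: seven heptagon-edge classes weighted by (n + 1), and seven regions whose odd-indexed edges are weighted by (n + j + 1). All function names are illustrative.

```python
from fractions import Fraction


def edge_count(n):
    """|E(S)| = (7/2)(3n^2 + 5n + 2)."""
    return Fraction(7, 2) * (3 * n * n + 5 * n + 2)


def sz_e_by_summation(n):
    """Add up the contributions of the two edge types described in the proof."""
    m_heptagon = Fraction(3, 2) * (3 * n * n + 5 * n + 2)
    total = 7 * m_heptagon**2 * (n + 1)
    for j in range(1, n + 1):
        m_odd = Fraction(3, 2) * j * j + (3 * n + Fraction(3, 2)) * j - (n + 1)
        m_even = edge_count(n) - Fraction(3, 2) * j * j - (3 * n + Fraction(5, 2)) * j
        total += 7 * m_odd * m_even * (n + j + 1)
    return total


# Coefficients of n^6, n^5, ..., n^0 as printed in Result 9.
COEFFICIENTS = [
    Fraction(2835, 16), Fraction(63091, 80), Fraction(71827, 48),
    Fraction(76111, 48), Fraction(12173, 12), Fraction(22603, 60),
    Fraction(63),
]


def sz_e_closed_form(n):
    """Evaluate the degree-6 polynomial with Horner's scheme."""
    value = Fraction(0)
    for coefficient in COEFFICIENTS:
        value = value * n + coefficient
    return value


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, sz_e_by_summation(n), sz_e_closed_form(n))
```

Both functions return exact rationals, so any disagreement would indicate a transcription error in the coefficients rather than floating-point noise.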
References
[1] S. Iijima, Helical microtubules of graphitic carbon, Nature, 354, 56-58 (1991).
[2] Y. M. Yang and W. Y. Qiu, Molecular Design and Mathematical Analysis of Carbon Nanotube Links, MATCH Commun. Math. Comput. Chem., 58, 635-646 (2007).
[3] A. T. Balaban, Carbon and its nets. Symmetry 2: unifying human understanding, Part 1., Comput. Math. Appl., 17, 397-416 (1989).
[4] N. Trinajstić and I. Gutman, Mathematical Chemistry, Croat. Chem. Acta, 75, 329-356 (2002).
[5] I. Gutman and O. E. Polansky, Mathematical Concepts in Organic Chemistry, Springer-Verlag, Berlin (1986).
[6] S. J. Cyvin and I. Gutman, Kekulé Structures in Benzenoid Hydrocarbons, Lecture Notes in Chemistry, Vol 46, Springer-Verlag, Berlin (1988).
[7] N. Trinajstić, Chemical Graph Theory, CRC Press, Boca Raton, FL (1992).
[8] H. Wiener, Structural determination of the paraffin boiling points, J. Am. Chem. Soc., 69, 17-20 (1947).
[9] H. Hosoya, On some counting polynomial in chemistry, Disc. Appl. Math., 19, 239-257 (1988).
[10] P. V. Khadikar, S. Karmarkar and V. K. Agrawal, A Novel PI Index and its Applications to QSPR/QSAR Studies, J. Chem. Inf. Comput. Sci., 41, 934-949 (2001).
[11] P. V. Khadikar, P. P. Kale, N. V. Deshpande, S. Karmarkar and V. K. Agrawal, Novel PI indices of hexagonal chains, J. Math. Chem., 29, 143-150 (2001).
[12] I. Gutman, A formula for the Wiener number of trees and its extension to graphs containing cycles, Graph Theory Notes New York, 27, 9-15 (1994).
[13] I. Gutman and A. R. Ashrafi, The edge version of the Szeged index, Croat. Chem. Acta, 81, 263-266 (2008).
[14] M. H. Khalifeh, H. Yousefi-Azari and A. R. Ashrafi, Vertex and edge PI indices of Cartesian product graphs, Discrete Appl. Math., 156, 1780-1789 (2008).
[15] M. V. Diudea and A. Graovac, Generation and Graph-Theoretical Properties of C4-Tori, MATCH Commun. Math. Comput. Chem., 44, 93-102 (2001).
[16] M. V. Diudea and P. E. John, Covering polyhedral tori, MATCH Commun. Math. Comput. Chem., 44, 103-116 (2001).
[17] M. V. Diudea, I. Silaghi-Dumitrescu and B. Parv, Toranes Versus Torenes, MATCH Commun. Math. Comput. Chem., 44, 117-133 (2001).
[18] M. V. Diudea, Graphenes from 4-valent tori, Bull. Chem. Soc. Japan, 75, 487-492 (2002).
[19] M. V. Diudea, Hosoya Polynomial in Tori, MATCH Commun. Math. Comput. Chem., 45, 109-122 (2002).
[20] M. V. Diudea, B. Parv and E. C. Kirby, Azulenic tori, MATCH Commun. Math. Comput. Chem., 47, 53-70 (2003).
[21] M. V. Diudea, M. Stefu, B. Pârv and P. E. John, Wiener Index of Armchair Polyhex Nanotubes, Croat. Chem. Acta, 77, 111-115 (2004).
[22] A. R. Ashrafi and A. Loghman, PI index of armchair polyhex nanotubes, Ars Combinatoria, 80, 193-199 (2006).
[23] A. R. Ashrafi and A. Loghman, Padmakar-Ivan index of TUC4C8(S) nanotubes, J. Comput. Theor. Nanosci., 3, 378-381 (2006).
[24] A. R. Ashrafi and F. Rezaei, PI index of polyhex nanotori, MATCH Commun. Math. Comput. Chem., 57, 243-250 (2007).
[25] A. R. Ashrafi and H. Saati, PI and Szeged indices of a VC5C7[4p,8] nanotube, Int. J. Nanoscience, 6, 77-83 (2007).
[26] A. R. Ashrafi and S. Yousefi, Computing the Wiener index of a TUC4C8(S) nanotorus, MATCH Commun. Math. Comput. Chem., 57, 403-410 (2007).
[27] A. Iranmanesh and A. R. Ashrafi, Balaban index of an armchair polyhex, TUC4C8(R) and TUC4C8(S) nanotorus, J. Comput. Theor. Nanosci., 4, 514-517 (2007).
[28] H. Yousefi, A. Bahrami, J. Yazdani and A. R. Ashrafi, PI index of V-phenylenic nanotubes and nanotori, J. Comput. Theor. Nanosci., 4, 704-705 (2007).
[29] A. R. Ashrafi and S. Yousefi, A new algorithm for computing distance matrix and Wiener index of zig-zag polyhex nanotubes, Nanoscale Res. Lett., 2, 202-206 (2007).
[30] H. Yousefi-Azari, A. Bahrami and A. R. Ashrafi, Computing PI index of HAC5C6C7 nanotubes and nanotori, J. Comput. Theoret. Nanosci., 5, 129-130 (2008).
[31] B. Manoochehrian, H. Yousefi-Azari and A. R. Ashrafi, Szeged index of a zig-zag polyhex nanotube, Ars Combinatoria, 86, 371-379 (2008).
[32] A. R. Ashrafi and H. Saati, Relationship between PI and Szeged indices of a triangulane and its associated dendrimer, J. Comput. Theoret. Nanosci., 5, 681-684 (2008).
[33] H. Yousefi-Azari, B. Manoochehrian and A. R. Ashrafi, PI index of product graphs, Applied Math. Letters, 21, 624-627 (2008).
[34] A. R. Ashrafi and M. Mirzargar, Topological study of an infinite class of nanostar dendrimer, Int. J. Chem. Mod., 1, 157-162 (2008).
[35] S. Yousefi, H. Yousefi-Azari, M. H. Khalifeh and A. R. Ashrafi, Computing distance matrix and related topological indices of an achiral polyhex nanotube, Int. J. Chem. Mod., 1, 149-159 (2008).
[36] A. R. Ashrafi and M. Mirzargar, Topological study of an infinite class of nanostar dendrimer, Int. J. Chem. Mod., 1, 157-162 (2008).
[37] S. Yousefi, H. Yousefi-Azari, M. H. Khalifeh and A. R. Ashrafi, Computing distance matrix and related topological indices of an achiral polyhex nanotube, Int. J. Chem. Mod., 1, 149-156 (2008).
[38] M. H. Khalifeh, H. Yousefi-Azari and A. R. Ashrafi, A matrix method for computing Szeged and vertex PI indices of join and composition of graphs, Linear Alg. Appl., 429, 2702-2709 (2008).
[39] H. Yousefi-Azari and A. R. Ashrafi, Padmakar-Ivan index of q-multi-walled carbon nanotubes and nanotori, J. Comput. Theoret. Nanosci., 5, 2280-2283 (2008).
[40] M. H. Khalifeh, H. Yousefi-Azari and A. R. Ashrafi, The first and second Zagreb indices of some graph operations, Discrete Appl. Math., 157, 804-811 (2009).
[41] A. R. Ashrafi and M. Mirzargar, PI, Szeged and edge Szeged indices of nanostar dendrimers, Util. Math., 77, 249-255 (2008).
[42] A. R. Ashrafi, M. Ghorbani and M. Jalali, The vertex PI and Szeged indices of an infinite family of fullerenes, J. Theor. Comput. Chem., 7, 221-231 (2008).
[43] A. R. Ashrafi, M. Ghorbani and M. Jalali, Study of IPR fullerenes by counting polynomials, J. Theor. Comput. Chem., 8, 451-457 (2009).
[44] A. R. Ashrafi, M. Jalali, M. Ghorbani and M. V. Diudea, Computing PI and omega polynomials of an infinite family of fullerenes, MATCH Commun. Math. Comput. Chem., 60, 905-916 (2008).
[45] H. Deng, Wiener index of tori Tp,q[C4,C8] covered by C4 and C8, MATCH Commun. Math. Comput. Chem., 56, 357-374 (2006).
[46] L. Xu and H. Deng, The Schultz molecular topological index of C4C8 nanotubes, MATCH Commun. Math. Comput. Chem., 59, 421-428 (2008).
[47] S. Chen, Q. Jang and Y. Hou, The Wiener and Schultz index of nanotubes covered by C4, MATCH Commun. Math. Comput. Chem., 59, 429-435 (2008).
[48] H. Deng, The Schultz Molecular Topological Index of Polyhex Nanotubes, MATCH Commun. Math. Comput. Chem., 57, 677-684 (2007).
[49] M. Eliasi and B. Taeri, Szeged and Balaban Indices of Zigzag Polyhex Nanotubes, MATCH Commun. Math. Comput. Chem., 56, 383-402 (2006).
[50] M. Eliasi and B. Taeri, Szeged Index of Armchair Polyhex Nanotubes, MATCH Commun. Math. Comput. Chem., 59, 437-450 (2008).
[51] A. Heydari and B. Taeri, Szeged index of TUC4C8(R) nanotubes, MATCH Commun. Math. Comput. Chem., 57, 463-477 (2007).
[52] A. Heydari and B. Taeri, Wiener and Schultz indices of TUC4C8(S) nanotubes, MATCH Commun. Math. Comput. Chem., 57, 665-676 (2007).
[53] A. Heydari and B. Taeri, Hyper Wiener index of TUC4C8(R) nanotubes, J. Comput. Theor. Nanosci., 5, 2275–2279 (2008).
[54] A. Heydari and B. Taeri, Wiener and Schultz indices of TUC4C8(R) nanotubes, J. Comput. Theor. Nanosci., 4, 158–167 (2007).
[55] A. Iranmanesh, B. Soleimani and A. Ahmadi, Szeged index of TUC4C8(R) nanotube, J. Comput. Theor. Nanosci., 4, 147–151 (2007).
[56] A. Iranmanesh and Y. Alizadeh, Computing some topological indices by GAP program, MATCH Commun. Math. Comput. Chem., 60, 883-896 (2008).
[57] A. Iranmanesh and B. Soleimani, PI index of TUC4C8(R) nanotubes, MATCH Commun. Math. Comput. Chem., 57, 251-262 (2007).
[58] A. Iranmanesh and A. Adamzadeh, On Diameter of Zig-Zag Polyhex Nanotubes, J. Comput. Theor. Nanosci., 5, 1428–1430 (2008).
[59] A. Iranmanesh and O. Khormali, Padmakar-Ivan (PI) index of HAC5C7[r,p] nanotubes, J. Comput. Theor. Nanosci., 5, 131–139 (2008).
[60] P. V. Khadikar, S. Karmarkar and R. G. Varma, The estimation of PI index of polyacenes, Acta Chim. Slov., 49, 755-771 (2002).
[61] D. Vukičević and N. Trinajstić, Wiener indices of benzenoid graphs, Bulletin of the Chemists and Technologists of Macedonia, 23(2), 113–129 (2004).
[62] H. W. Kroto, J. R. Heath, S. C. O’Brien, R. F. Curl and R. E. Smalley, C60: Buckminsterfullerene, Nature, 318, 162-163 (1985).
[63] P. W. Fowler and D. E. Manolopoulos, An Atlas of Fullerenes, Oxford Univ. Press, Oxford (1995).
[64] The GAP Team, GAP, Groups, Algorithms and Programming, Lehrstuhl D für Mathematik, RWTH, Aachen (1992).
[65] HyperChem package Release 7.5 for Windows, Hypercube Inc., 1115 NW 4th Street, Gainesville, Florida 32601, USA (2002).
[66] M. V. Diudea, O. Ursu and Cs. L. Nagy, TOPOCLUJ, Babes-Bolyai University, Cluj (2002).
[67] W. Imrich and S. Klavzar, Product graphs: structure and recognition, John Wiley and Sons, New York, USA (2000).
[68] S. Klavzar, A. Rajapakse and I. Gutman, The Szeged and the Wiener index of graphs, Appl. Math. Lett., 9, 45-49 (1996).
[69] M. H. Khalifeh, H. Yousefi-Azari and A. R. Ashrafi, Another aspect of graph invariants depends on path metric, submitted.
[70] M. Ge and K. Sattler, Observation of fullerene cones, Chem. Phys. Lett., 220, 192-196 (1994).
[71] D. R. Nelson and L. Peliti, Fluctuations in membranes with crystalline and hexatic order, J. Phys. (Paris), 48, 1085-1092 (1987).
[72] D. N. Weldon, W. J. Blau and H. Zandlbergen, A high resolution electron microscopy investigation of curvature in carbon nanotubes, Chem. Phys. Lett., 241, 365-372 (1995).
[73] N. Park, M. Yoon, S. Berber, J. Ihm, E. Osawa and D. Tománek, Magnetism in all-carbon nanostructures with negative Gaussian curvature, Phys. Rev. Lett., 91, 237204-2 (2003).
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 521-538
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 20
ON UNIFORM REPRESENTATION OF PROTEINS BY DISTANCE MATRIX
M. Randić (a,*), M. Vračko (a), M. Novič (a) and D. Plavšić (b)
(a) Laboratory for Chemometrics, National Institute of Chemistry, Hajdrihova 19, 61115 Ljubljana, Slovenia
(b) Center for NMR, Institute Rudjer Bošković, Zagreb, Croatia
Abstract
We outline an approach that associates with each protein a 20×20 distance matrix, regardless of protein size. In this way, for the first time, all proteins are represented uniformly. The distance matrix is constructed from a lattice representation of proteins in which each amino acid is assigned a pair of integer Cartesian coordinates, by averaging the coordinates of all amino acids of the same kind. The new uniform representation of proteins is illustrated on ND6 proteins of chimpanzee. The possibility of using this novel distance matrix for a uniform nomenclature of proteins is also discussed.
Introduction
During the past 30 years Chemical Graph Theory has made impressive advancements in mathematical representations and characterizations of molecules, which have led to a better understanding of Molecular Structure-Molecular Property relationships and opened novel approaches to the Quantitative Structure-Activity Relationships (QSAR) [1]. These developments resulted in novel structure-property and structure-activity insights, such as regularities in physico-chemical properties of isomers with the count of paths of length two and length three [2-7]; characterization of molecular shape by quotients of the count of paths and walks [8]; stability of the regression equations with orthogonalized molecular descriptors [9-12]; characterization of pharmacophores [13-15]; characterization of aromaticity and local aromaticity [16-23], to mention only a few achievements. During the last ten years some of the tools of Discrete Mathematics that had hitherto been confined to molecules have been generalized and applied to DNA [24-31], the secondary structure of RNA [32, 33], proteins [34-40], proteomics maps [41-53] and the proteome [54, 55]. In this contribution we will briefly review major directions in the Graph Theoretical Approach to Bioinformatics and will outline a novel uniform representation of proteins.
Graphical Representation of DNA
One of the central topics of molecular biology is the comparative study of biopolymers, in particular DNA, RNA and proteins. A straightforward, though computationally very intensive, approach is to consider raw bio-sequences and develop suitable computer software that allows pair-wise and multiple alignments of such sequences [56-58]. In 1983 Hamori and Ruskin had an ingenious idea and reported an alternative approach [59-61], which initially gave birth to the graphical approach to DNA and later led to graphical representations of RNA and proteins. In their approach, they represented a DNA sequence in 3D by assigning to the four nucleotide bases four directions in the (x, y) plane and advancing one step in the z-direction for each successive base. Between mid-1990 and mid-2005 several alternative graphical representations of DNA were proposed, of which we will illustrate a few.
2D Lattice Representation of DNA
In Figure 1 we have illustrated a lattice representation of the first exon of the β-globin gene of human, lemur and opossum, following the approach of Nandy [62], who assigned the ±x directions to G, A and the ±y directions to C, T. Observe that the (x, y) coordinates of all nucleotides are integers; we therefore refer to such systems as lattice representations.
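A lattice walk of this kind takes only a few lines of code. The sketch below assumes the sign convention G = +x, A = –x, C = +y, T = –y, which is one reading of the "±x to G, A and ±y to C, T" assignment above; the function name and the short example sequence are illustrative only.

```python
# Assumed unit steps: G and A move along x, C and T move along y.
STEPS = {"G": (1, 0), "A": (-1, 0), "C": (0, 1), "T": (0, -1)}


def lattice_walk(sequence):
    """Return the list of integer lattice points visited by a DNA sequence."""
    x, y = 0, 0
    path = [(x, y)]
    for base in sequence.upper():
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Hypothetical six-base fragment, used only to show the output format.
    for point in lattice_walk("ATGGTG"):
        print(point)
```

Plotting consecutive points of the returned path as a polyline gives a curve of the type shown in Figure 1.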
Figure 1. Graphical representation of the coding sequences of the first exon of β-globin gene of human, lemur and opossum (from top to bottom).
While there is some loss of information accompanying such 2D representations of DNA, a great advantage is the visual insight that they offer. Just by glancing at Figure 1 one can immediately see that the first exon of the β-globin gene of human and lemur are fairly similar, while the first exon of the β-globin gene of opossum is visibly different.
Highly Condensed Representation of DNA
In Figure 2 we illustrate a 2D representation of the first exon of the β-globin gene of human based on the “magic square” of Jeffrey [63, 64].
Figure 2. The first exon of human β-globin gene as represented by the modified Chaos Game approach of Jeffrey.
Jeffrey adopted the Chaos Game representation of random sequences, introduced shortly before in mathematics by Barnsley [65], for the graphical representation of DNA. Jeffrey specified the polygon of the Chaos Game to be a square, to the four corners of which he assigned the four nucleotide bases. He then started at the center of the square and for the first base placed a spot half-way between the center and the corner having the label of the first base. Then he continued by moving half-way towards the corner having the label of the second base, and so on. The first exon of the β-globin gene of human is thus represented as a set of 92 spots within the interior of a square [66]. The “magic square” method, as we colloquially refer to the approach of Jeffrey, was designed for very lengthy DNA sequences, having between 10,000 and 100,000 bases, in analogy with the very lengthy random sequences of Barnsley, which in this way produce fractal-like patterns [65]. We should add that in several publications we have explored with our collaborators the use of the “magic square” method on short and very short DNA sequences, including codons, and have in this way arrived at the “Table of Codons” [35, 36], which offered a route to the graphical representation of proteins.
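The half-way rule described above can be sketched as follows. The assignment of bases to corners (A, C, G, T at (0,0), (0,1), (1,1), (1,0) of the unit square) is an assumption made for illustration, since the text does not fix it; the function name is likewise illustrative.

```python
# Assumed corner assignment on the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game(sequence):
    """Return one spot per base, each half-way from the previous spot to the base's corner."""
    x, y = 0.5, 0.5  # start at the center of the square
    spots = []
    for base in sequence.upper():
        corner_x, corner_y = CORNERS[base]
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        spots.append((x, y))
    return spots


if __name__ == "__main__":
    for spot in chaos_game("AGCT"):
        print(spot)
```

A sequence of 92 bases yields 92 spots, all strictly inside the square, which is the kind of picture shown in Figure 2.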
Spectral Representation of DNA
Finally we would like to briefly outline one of the spectral representations of DNA. Simple spectral representations of DNA are obtained by assigning to the four bases A, C, G, T numerical values, such as 1, 2, 3, and 4, respectively [67-70]. In Figure 3 we have illustrated the four-line spectral representation of the first exon of the β-globin gene of human.
Figure 3. Spectrum-like graphical representation of the coding sequence of the first exon of human β-globin gene.
Figure 4. Graphical illustration of the degree of alignment of the ND6 proteins of human and gorilla.
Such graphical representations still offer limited visual insight, but their great advantage is that they allow simple numerical manipulation of sequences, such as plotting the difference of two sequences, which indicates visually the degree of alignment as illustrated in Figure 4, if necessary by considering differences of shifted sequences [71, 72]. In summary, let us emphasize the difference between computer-based studies of DNA and graphical studies of DNA: the former apply only to comparative studies of two or more sequences, whereas graphical approaches use graphical and sequence invariants for the characterization of DNA. This is analogous to the use of mathematical invariants, such as the so-called topological indices [73-89], as molecular descriptors. Such invariants allow one to characterize a single DNA sequence and thus make it possible to compile a catalog of DNA, something that is outside the scope of computer-based approaches.
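The numerical manipulation mentioned above can be illustrated with a short sketch. It uses the values 1, 2, 3, 4 for A, C, G, T quoted in the text; the function names, the optional shift parameter and the example fragments are illustrative assumptions.

```python
LEVEL = {"A": 1, "C": 2, "G": 3, "T": 4}


def spectrum(sequence):
    """Map each base to its numerical level."""
    return [LEVEL[base] for base in sequence.upper()]


def difference(first, second, shift=0):
    """Position-wise difference of two spectra; zeros mark aligned positions.

    The second sequence is optionally shifted by dropping its first `shift` bases.
    """
    levels_first = spectrum(first)
    levels_second = spectrum(second)[shift:]
    return [a - b for a, b in zip(levels_first, levels_second)]


if __name__ == "__main__":
    print(difference("ATGGTG", "ATGCTG"))
    print(difference("TGG", "ATGG", shift=1))
```

Runs of zeros in the output correspond to stretches where the two sequences coincide, which is what a difference plot such as Figure 4 makes visible.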
Graphical Representation of the Secondary Structure of RNA
It is of interest in comparative studies of single-strand RNA to incorporate the RNA secondary structure in the analysis. Until very recently it was customary to differentiate the “free” four bases A, C, G and U from the same four bases when hydrogen-bonded in the pairings A-U and C-G, by using eight symbols: A, C, G, U and A', C', G', U'. However, recently Liao et al. [90] have shown that there is some loss of information inherent to such approaches, because two different secondary structures can have the same eight-symbol representation. It has been shown that when one differentiates the pairings A-U and C-G from the pairings U-A and G-C, which leads to a twelve-symbol representation: A, C, G, U, A', C', G', U', A", C", G", U", one obtains unique representations of secondary RNA structures [33].
Graphical Representation of Proteins
Graphical representations of proteins are more recent. The delay of about 25 years between the appearance of the first graphical representation of DNA and the first graphical representation of proteins is undoubtedly due to the combinatorial complexity associated with the 20! possible orderings of the twenty natural amino acids, in contrast to the 4! orderings of the four nucleotide bases. Readers should consult the recent review on graphical representation of proteins for a description of various graphical models of proteins [91]. It suffices here to mention only that there are 2D spectral representations of proteins [92, 93], highly compact 2D representations of proteins [36, 93], and star-like graph-based graphical representations [94]. Conspicuously missing until very recently has been a lattice representation of proteins, the representation with the most valuable visual insight! Because the uniform graphical representation of proteins, which is the topic of the present contribution, is based on a lattice representation of proteins [95], we will illustrate here the construction of a lattice representation of a protein. For this purpose we have selected the ND6 protein of human, having 174 amino acids:
MMYALFLLSVGLVMGFVGFSSKPSPIYGGLVLIVSGVVGCVIILNFGGGYMGLMVFLIYLGGMMVVFGYTTAMAIEEYPEAWGSGVEVLVSVLVGLAMEVGFVLWVKEYDGVVVVVNFNSVGSWMIYEGEGSGFIREDPIGAGALYDYGRWLVVVTGWPLFVGVYIVIEIARGN
In Figure 5 we show a 10×10 part of the Cartesian grid on the periphery of which are the locations of twenty amino acids arranged anticlockwise in decreasing order of their abundance in proteins starting from leucine in the right lower corner. Thus (x, y) coordinates of leucine are (5, –5), of serine are (5, –3), and so on.
Figure 5. The locations of 20 amino acids in the Cartesian coordinate system arranged anticlockwise starting from leucine in decreasing order of their abundance in proteins.
In Table 1, using these coordinates, we illustrate for the first dozen amino acids the construction of the lattice representation of the ND6 protein of human, which is fully illustrated in Figure 6, together with the lattice representations of the ND6 proteins of chimpanzee and opossum. The starting amino acid is methionine (–3, –5). Because the next amino acid is again methionine, one adds the coordinates of the second amino acid (–3, –5) to the existing coordinates, arriving at (–6, –10). The third amino acid, tyrosine (Y), with coordinates (–5, –5), leads to (–11, –15), etc. A glance at the three proteins in Figure 6 immediately shows considerable similarity between the ND6 proteins of human and chimpanzee, while the ND6 protein of opossum shows a quite different pattern. Thus, just as was the case with the 2D graphical representation of DNA by Nandy for the first exon of the β-globin gene, here we have arrived at a lattice representation of proteins with similar visual qualities.
Figure 6. Lattice representations of the ND6 proteins of human, chimpanzee and opossum.

Table 1. The (x, y) coordinates and cumulative coordinates for the first dozen amino acids of the lattice representation of human ND6 protein

Amino acid     x     y   cumulative x   cumulative y
origin                        0              0
M             –3    –5       –3             –5
M             –3    –5       –6            –10
Y             –5    –5      –11            –15
A              5    –1       –6            –16
L              5    –5       –1            –21
F             –5    –1       –6            –22
L              5    –5       –1            –27
L              5    –5        4            –32
S              5    –3        9            –35
V              5     3       14            –32
G              5     1       19            –31
L              5    –5       24            –36
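The cumulative coordinates of Table 1 can be reproduced with a few lines of code. The sketch below includes only the eight amino acid positions that can be read off Table 1 and the accompanying text; the remaining twelve positions of Figure 5 are not reproduced here, so the dictionary is deliberately incomplete. The function name is illustrative.

```python
# Peripheral grid positions taken from Table 1 and the text (partial list).
COORDINATES = {
    "L": (5, -5), "S": (5, -3), "A": (5, -1), "G": (5, 1),
    "V": (5, 3), "F": (-5, -1), "Y": (-5, -5), "M": (-3, -5),
}


def cumulative_path(sequence, coordinates=COORDINATES):
    """Return the origin followed by the running sum of amino acid coordinates."""
    x, y = 0, 0
    path = [(x, y)]
    for residue in sequence:
        dx, dy = coordinates[residue]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


if __name__ == "__main__":
    first_dozen = "MMYALFLLSVGL"  # first twelve residues of human ND6
    for residue, point in zip("-" + first_dozen, cumulative_path(first_dozen)):
        print(residue, point)
```

With the full set of twenty positions from Figure 5, the same function would generate the complete lattice curve of a protein.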
Protein Distance Matrix
The problem that we want to consider is a mathematical characterization of the lattice representation of proteins illustrated in Figure 6. One possibility, which has already been outlined in the literature, is to use reduced matrices, which are based on graphical representations of reduced lattice graphs. Here we will introduce distance matrices for lattice representations of proteins in the following way: we group the occurrences of each of the twenty amino acids present in a protein separately and find the corresponding average (x, y) coordinates. For example, methionine appears ten times in human ND6 protein, at positions 1, 2, 14, 51, 54, 63, 64, 73, 98, and 125, the (x, y) coordinates of which are: (–3, –5), (–6, –10), (26, –38), (91, –29), (98, –38), (109, –49), (106, –54), (119, –49), (208, –42), and (267, –23), respectively. These yield the average methionine coordinates (101.5, –33.7). In Table 2 we have listed the average (x, y) coordinates for the 18 amino acids present in human ND6 protein. They are depicted in the upper part of Figure 7, where we have added as a 19th point the center of all amino acids, which is at (185.7348, –16.9803). Observe that the representation of ND6 protein illustrated in Figure 7 is geometrical rather than graphical, because all points have fixed (x, y) coordinates. Because of this we can immediately construct the 18×18 distance matrix for the protein considered. In the general case one will arrive in this way at a 20×20 distance matrix, but ND6 is missing glutamine and histidine. If one connects all points representing the 18 amino acids with the central point at (185.7348, –16.9803), as illustrated in the lower part of Figure 7, one obtains a star graph, but one of fixed geometry and fixed orientation (within the Cartesian coordinate system).
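The averaging step can be written down directly. The sketch below uses the ten methionine lattice points quoted in the text; the helper names are illustrative, and `average_by_residue` assumes that the path passed in has one point per residue (that is, the origin has been dropped).

```python
from collections import defaultdict


def average_point(points):
    """Arithmetic mean of a list of (x, y) points."""
    count = len(points)
    return (sum(x for x, _ in points) / count, sum(y for _, y in points) / count)


def average_by_residue(sequence, path):
    """Group lattice points by amino acid and average each group."""
    groups = defaultdict(list)
    for residue, point in zip(sequence, path):
        groups[residue].append(point)
    return {residue: average_point(points) for residue, points in groups.items()}


# Lattice points of the ten methionines of human ND6, as listed in the text.
METHIONINE_POINTS = [
    (-3, -5), (-6, -10), (26, -38), (91, -29), (98, -38),
    (109, -49), (106, -54), (119, -49), (208, -42), (267, -23),
]

if __name__ == "__main__":
    print(average_point(METHIONINE_POINTS))
```

Applying `average_by_residue` to a whole protein and its lattice path gives at most twenty points, one per amino acid type present.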
Table 2. The average Cartesian coordinates for amino acids of human ND6 protein

Amino acid        x         y
A              197.71     –8.57
R              309.00     16.00
N              232.25     –1.25
D              268.33     –4.33
C               86.00    –30.00
E              269.83    –11.17
G              166.00    –19.79
I              192.67    –10.50
L              131.94    –26.76
K              153.50    –32.50
M              101.50    –33.70
F              141.70    –25.50
P              158.20    –16.20
S              132.50    –29.70
T              181.67    –25.67
W              278.50    –11.00
Y              157.10    –24.90
V              184.82    –10.11
Figure 7. Amino acid map of human ND6 protein and corresponding embedded star graph of fixed geometry. Observe use of different scale for the x and y coordinates.
Table 3. New (20×20 symmetrical) distance matrix of human ND6 protein. Amino acids are ordered alphabetically based on their three-letter codes; only the lower triangle is shown, and Ø marks amino acids absent from the protein

           A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
Ala (A)    0
Arg (R)  113   0
Asn (N)   35  78   0
Asp (D)   70  45  36   0
Cys (C)  113 227 149 184   0
Gln (Q)    Ø   Ø   Ø   Ø   Ø   0
Glu (E)   72  47  38   6 184   Ø   0
Gly (G)   33 147  68 103  80   Ø 104   0
His (H)    Ø   Ø   Ø   Ø   Ø   Ø   Ø   Ø   0
Ile (I)    5 119  40  75 108   Ø  77  28   Ø   0
Leu (L)   68 182 103 138  46   Ø 138  34   Ø  62   0
Lys (K)   50 162  84 118  67   Ø 169  17   Ø  44  22   0
Met (M)   23 213 134 169  15   Ø 169  65   Ø  94  31  52   0
Phe (F)   44 172  93 128  55   Ø 128  24   Ø  53   9  13  41   0
Pro (P)   50 154  75 110  73   Ø 111   8   Ø  34  28  16  59  18   0
Ser (S)   68 182 103 138  46   Ø 138  34   Ø  63   2  21  31  10  29   0
Thr (T)   23 133  56  89  95   Ø  89  16   Ø  18  49  28  80  39  25  49   0
Trp (W)   80  40  47  12 193   Ø   8 112   Ø  85 147 126 178 137 120 147  97   0
Tyr (Y)   43 157  78 113  71   Ø 113  10   Ø  38  25   8  56  15   8  25  24 122   0
Val (V)   12 126  48  83 100   Ø  85  21   Ø   7  55  38  86  45  27  55  15  93  31   0
In Table 3 we have listed the 20×20 distance matrix for human ND6 protein, in which the rows and columns belonging to glutamine and histidine are shown as “empty”. The significance of the novel protein distance matrix is that regardless of the size of protein, this approach results in uniform-size matrices for all proteins. This is the first time that such uniform matrices have been constructed for proteins – a novelty which is bound to have useful consequences not only in comparative studies of proteins but also may be of interest for auxiliary protein nomenclature as illustrated in the following section.
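A fixed-size matrix of this kind can be built from the average coordinates of Table 2. The sketch below is a minimal illustration: it uses Euclidean distances, keeps the alphabetical three-letter-code ordering, and marks pairs involving an absent amino acid with None (the Ø of Table 3); the function name is illustrative. Because Table 2 is rounded to two decimals, entries computed this way can differ in places from the printed values of Table 3.

```python
import math

# One-letter codes in the alphabetical order of the three-letter codes.
ORDER = "ARNDCQEGHILKMFPSTWYV"

# Average coordinates from Table 2 (human ND6; Q and H are absent).
AVERAGE = {
    "A": (197.71, -8.57), "R": (309.00, 16.00), "N": (232.25, -1.25),
    "D": (268.33, -4.33), "C": (86.00, -30.00), "E": (269.83, -11.17),
    "G": (166.00, -19.79), "I": (192.67, -10.50), "L": (131.94, -26.76),
    "K": (153.50, -32.50), "M": (101.50, -33.70), "F": (141.70, -25.50),
    "P": (158.20, -16.20), "S": (132.50, -29.70), "T": (181.67, -25.67),
    "W": (278.50, -11.00), "Y": (157.10, -24.90), "V": (184.82, -10.11),
}


def distance_matrix(average, order=ORDER):
    """Return a len(order) x len(order) matrix of Euclidean distances."""
    matrix = []
    for first in order:
        row = []
        for second in order:
            if first == second:
                row.append(0.0)
            elif first in average and second in average:
                row.append(math.dist(average[first], average[second]))
            else:
                row.append(None)  # amino acid absent from the protein
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    matrix = distance_matrix(AVERAGE)
    print(int(matrix[0][1]), int(matrix[0][2]))  # Ala-Arg and Ala-Asn, truncated
```

The matrix always has 20 rows and 20 columns, whatever the length of the protein, which is the point of the construction.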
Uniform Auxiliary Nomenclature for Proteins
The condensed graphical representation of ND6 protein in Figure 7 offers a novel approach to protein nomenclature. In Table 4 we have listed the average coordinates for amino acids of human ND6 protein relative to the center of the lattice representation (x0 = 185.7348, y0 = –16.9803), together with the corresponding polar coordinates (R, φ) with respect to this center, which is the average over all amino acids. The relative coordinates determine the quadrant in which each amino acid lies with respect to the center (x0, y0), which allows one to determine the polar angle.

Table 4. The average coordinates for amino acids of human ND6 protein relative to the coordinates of the center of the lattice representation (x0 = 185.7348, y0 = –16.9803) and the corresponding polar coordinates with respect to the overall center for average for all amino acids

Amino acid      x–x0        y–y0         R           φ
A             11.9795      8.40887    14.6362     0.612026
R            123.2652     32.9803    127.6010     0.261432
N             46.5152     15.7303     49.1030     0.326102
D             82.5985     12.647      83.5611     0.151934
C            –99.7348    –13.0197    100.5810     0.129809
E             84.0985      5.8136     84.2992     0.069019
G            –19.7348     –2.8054     19.9332     0.141209
I              6.9319      6.4803      9.48923    0.751740
L            –53.7936     –9.7844     54.6762     0.179921
K            –32.2348    –15.5197     35.7763     0.448704
M            –84.2348    –16.7197     85.8781     0.195942
F            –44.0348     –8.5197     44.8514     0.191115
P            –27.5348      0.7803     27.5459    –0.028330
S            –53.2348    –12.7197     54.7333     0.234539
T             –4.0681     –8.6864      9.59182    1.132804
W             92.7652      5.9803     92.9578     0.064378
Y            –28.6348     –7.9197     29.7098     0.269831
V             –0.9134      6.8732      6.93363   –1.438680
Thus A, R, N, D, E, I, W are in the first quadrant; P, V are in the second quadrant; and C, G, L, K, M, F, S, T, Y are in the third quadrant, which allows the amino acids in each quadrant to be ordered in increasing magnitude of the polar angle φ. As a result the 18 amino acids of human ND6 protein are ordered as:
WEDRNAIPVCGLFMSYKT
By adding at the end the nonexistent amino acids (alphabetically):
WEDRNAIPVCGLFMSYKTHQ
we obtain a particular ordering for this protein. Different proteins will show different orderings of amino acids. One does not expect to find the same ordering often, given that there are 20! possible arrangements of 20 letters in a sequence; nevertheless, in view of the large number of different proteins, one should expect some different proteins to have the same ordering. To distinguish such cases we can also associate with each protein a 20-component vector that lists the radial distances of the average amino acid centers from the center of the lattice representation of the protein. For the above case, if we truncate the radial magnitudes to their integer parts, we obtain:
(92, 84, 83, 127, 49, 14, 9, 27, 6, 100, 19, 54, 44, 85, 54, 29, 35, 9, 0, 0).
It is very unlikely that two proteins will have both the alphabetic and the numerical sequence identical, and if there is overlap in the numerical sequences, one can always include the decimal part of the radial distances to discriminate such cases. We therefore propose as an auxiliary notation for proteins the use of the alphabetic and the numerical sequences as described above for human ND6. The proposal has an important advantage: any person in any laboratory can easily construct the “binary” notation for proteins, and if such “binary” labels are used by the rest of the protein community, this will facilitate searches of protein data.
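The two-part label can be generated mechanically from the relative coordinates of Table 4. The sketch below is one possible reading of the procedure: amino acids are sorted by quadrant and then by the magnitude of φ = arctan((y–y0)/(x–x0)), absent amino acids are appended in alphabetical order of their one-letter codes, and the radial distances are truncated to integers. The function names are illustrative, and the code assumes that no point lies exactly above or below the center (x–x0 = 0).

```python
import math

ALL_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Relative coordinates (x - x0, y - y0) from Table 4 for human ND6.
RELATIVE = {
    "A": (11.9795, 8.40887), "R": (123.2652, 32.9803), "N": (46.5152, 15.7303),
    "D": (82.5985, 12.647), "C": (-99.7348, -13.0197), "E": (84.0985, 5.8136),
    "G": (-19.7348, -2.8054), "I": (6.9319, 6.4803), "L": (-53.7936, -9.7844),
    "K": (-32.2348, -15.5197), "M": (-84.2348, -16.7197), "F": (-44.0348, -8.5197),
    "P": (-27.5348, 0.7803), "S": (-53.2348, -12.7197), "T": (-4.0681, -8.6864),
    "W": (92.7652, 5.9803), "Y": (-28.6348, -7.9197), "V": (-0.9134, 6.8732),
}


def quadrant(dx, dy):
    """Quadrant number (1 to 4) of a point relative to the center."""
    if dx >= 0:
        return 1 if dy >= 0 else 4
    return 2 if dy >= 0 else 3


def protein_label(relative):
    """Return the letter ordering and the truncated radial distances."""
    def sort_key(residue):
        dx, dy = relative[residue]
        return (quadrant(dx, dy), abs(math.atan(dy / dx)))

    present = sorted(relative, key=sort_key)
    missing = sorted(set(ALL_AMINO_ACIDS) - set(relative))
    letters = "".join(present + missing)
    radii = [int(math.hypot(*relative[residue])) for residue in present]
    return letters, radii + [0] * len(missing)


if __name__ == "__main__":
    letters, radii = protein_label(RELATIVE)
    print(letters)
    print(radii)
```

For the human ND6 data this reproduces the letter string and the 20-component vector given above.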
Acknowledgments
This research was supported in part by the ARRS grant L1-7230 and grant P1-017 from the Ministry of Higher Education, Science and Technology of the Republic of Slovenia. We thank Professor A. T. Balaban (Texas A&M University at Galveston, TX) for numerous helpful comments that led to the improvement of the manuscript. MR thanks the Laboratory for Chemometrics, the National Institute of Chemistry, Ljubljana, for its warm hospitality.
References
[1] C. Hansch, D. Hoekman, and H. Gao, “Comparative QSAR: Toward a deeper understanding of chemicobiological interactions,” Chemical Reviews, vol. 96, pp. 1045-1075, 1996.
[2] M. Randić and C. L. Wilkins, “On a graph theoretical basis for ordering of structures,” Chemical Physics Letters, vol. 63, pp. 332-336, 1979.
[3] M. Randić and C. L. Wilkins, “Graph-theoretical basis for ordering of structures as a basis for systematic searches for regularities in molecular data,” Journal of Physical Chemistry, vol. 83, pp. 1525-1540, 1979.
[4] M. Randić, “Chemical structure-what is ‘she’?,” Journal of Chemical Education, vol. 69, pp. 713-718, 1992.
[5] M. Randić and C. L. Wilkins, “Graph-theoretical analysis of molecular properties. Isomeric variations in nonanes,” International Journal of Quantum Chemistry, vol. 18, pp. 1005-1027, 1980.
[6] M. Randić and N. Trinajstić, “On isomeric variations in decanes,” MATCH – Communication in Mathematical and Computer Chemistry, vol. 13, pp. 271-290, 1982.
[7] M. Randić, “Chemical shift sums,” Journal of Magnetic Resonance, vol. 39, pp. 431-436, 1980.
[8] M. Randić, “Novel shape descriptors for molecular graphs,” Journal of Chemical Information and Computer Science, vol. 41, pp. 607-613, 2001.
[9] M. Randić, “Resolution of ambiguities in structure-property studies by use of orthogonal descriptors,” Journal of Chemical Information and Computer Science, vol. 31, pp. 311-320, 1991.
[10] M. Randić, “Orthogonal molecular descriptors,” New Journal of Chemistry, vol. 15, pp. 517-525, 1991.
[11] M. Randić, “Correlation of enthalpy of octanes with orthogonal descriptors,” Journal of Molecular Structure, vol. 233, pp. 45-59, 1991 (Theochem, vol. 79, pp. 45-59, 1991).
[12] M. Randić, “Fitting of nonlinear regressions by orthogonalized power series,” Journal of Computer Chemistry, vol. 14, pp. 263-370, 1993.
[13] M. Randić, “Design of Molecules with desired properties. A molecular similarity approach to property optimization,” pp. 77-145 in: Concepts and Applications of Molecular Similarity, M. A. Johnson and G. Maggiora, eds., John Wiley & Sons, New York, USA, 1990.
[14] M. Randić, B. Jerman-Blažič, D. H. Rouvray, P. G. Seybold and S. C. Grossman, “The search for active substructures in structure-activity studies,” International Journal of Quantum Chemistry: Quantum Biology Symposium, vol. 14, pp. 245-260, 1987.
[15] M. Randić, “On characterization of pharmacophore,” Acta Chimica Slovenica, vol. 47, pp. 143-151, 2000.
[16] M. Randić, “Aromaticity of Polycyclic Conjugated Hydrocarbons,” Chemical Reviews, vol. 103, pp. 3449-3605, 2003.
[17] M. Randić, “Aromaticity and conjugation,” Journal of the American Chemical Society, vol. 99, pp. 444-450, 1977.
[18] M. Randić, “A graph theoretical approach to conjugation and resonance energies of hydrocarbons,” Tetrahedron, vol. 33, pp. 1905-1920, 1977.
[19] M. Randić, “Conjugated circuits and resonance energies of benzenoid hydrocarbons,” Chemical Physics Letters, vol. 38, pp. 68-70, 1976. [20] I. Gutman and M. Randić, “A correlation between Kekulé valence structures and conjugated circuits,” Chemical Physics, vol. 41, pp. 265-270, 1979. [21] M. Randić, D. Plavšić and N. Trinajstić, “Characterization of local benzenoid features: Polycyclic conjugated hydrocarbons,” Gazzetta Chimica Italiana, vol. 118, pp. 441446, 1988. [22] M. Randić, V. Solomon, S. C. Grossman, D. J. Klein, and N. Trinajstić, “Resonance energies of large conjugated hydrocarbons by a statistical method,” International Journal of Quantum Chemistry, vol. 322 pp. 35-59, 1980. [23] M. Randić, S. Nikolić and N. Trinajstić, “On the benzenoid character of polycyclic conjugated hydrocarbons,” Gazzetta Chimica Italiana, vol. 117, pp. 69-73, 1987. [24] M. Randić, M. Vračko, A. Nandy, and S. C. Basak, “On 3-D graphical representation of DNA primary sequence and their numerical characterization” Journal of Chemical Information and Computer Science, vol. 40, pp. 1235-1244, 2000. [25] X. Guo, M. Randić, and S. C. Basak, “A novel 2-D graphical representation of DNA sequences of low degeneracy,” Chemical Physics Letters, vol. 350, pp. 106-112, 2001. [26] M. Randić and A. T. Balaban, “On a four-dimensional representation of DNA primary sequences,” Journal of Chemical Information and Computer Science, vol. 43, pp. 532539, 2003. [27] M. Randić, “Graphical representation of DNA as a 2-D map,” Chemical Physics Letters, vol. 386, pp. 468-471, 2004. [28] M. Randić and J. Zupan, “Highly compact 2D graphical representation of DNA sequences,” SAR & QSAR in Environmental Research, vol. 15, pp. 191-205, 2004. [29] M. Randić, M. Novič, D. Vikić-Topić, and D. Plavšić, “Novel numerical and graphical representation of DNA sequences and proteins,” SAR & QSAR in Environmental Research, vol. 17, pp. 583-595, 2006. [30] M. Randić, N. Lerš, D. Plavšić, S.C. Basak, and A.T. 
Balaban, “Four-color map representation of DNA or RNA sequences and their numerical characterization,” Chemical Physics Letters, vol. 407, pp. 205-208, 2005. [31] M. Randić, J. Zupan, and T. Pisanski, “On representation of DNA by line distance matrix,” Journal of Mathematical Chemistry, vol. 43, pp. 674-692, 2008. [32] M. Randić, M. Vračko, M. Novič, and D. Plavšić, Spectrum-like graphical representation of RNA secondary structure, International Journal of Quantum Chemistry, vol. 109, pp. 2982-2995, 2009. [33] M. Randić and D. Plavšić, Novel spectral representation of RNA secondary structure without loss of information, Chemical Physics Letters, vol. 476, pp. 277-280, 2009. [34] M. Randić, “2-D graphical representation of proteins based on virtual genetic code” SAR & QSAR in Environmental Research, vol. 15, pp. 147-157, 2004. [35] M. Randić, J. Zupan, and A.T. Balaban, “Unique graphical representation of protein sequences based on nucleotide triplet codons,” Chemical Physics Letters, vol. 397, pp. 247-252, 2004. [36] M. Randić, A.T. Balaban, M. Novič, A. Založnik, and T. Pisanski, “A novel graphical representation of proteins,” Periodicum Biologorum,” vol. 107, pp. 403-414, 2005. [37] M. Randić, D. Butina, and J. Zupan, “Novel 2-D graphical representation of proteins,” Chemical Physics Letters, vol. 419, pp. 528-532, 2006.
[38] M. Randić, “2-D graphical representation of proteins based on physico-chemical properties of amino acids,” Chemical Physics Letters, vol. 444, pp. 176-180, 2007. [39] M. Randić and M. Novič, “Representation of proteins as walks in 20-D space,” SAR & QSAR in Environmental Research, vol. 19, pp. 317-337, 2008. [40] M. Randić, M. Novič, M. and Vračko, “On novel representation of proteins based on amino acid adjacency matrix,” SAR & QSAR in Environmental Research, vol. 19, pp. 339-349, 2008. [41] M. Randić, “Quantitative characterization of proteomics maps by matrix invariants,” in: P.M. Conn (Ed.), pp. 429-450 in Handbook of Proteomics Methods, Humana Press, INC. Totowa, NY, USA, 2003. [42] M. Randić, F. A. Witzmann, V. Kodali and S. C. Basak, “Dependence of a characterization of proteomics maps on the number of protein spots considered,” Journal of Chemical Information and Computer Science, vol. 46, pp. 116-122, 2006. [43] M. Randić, N. Novič and M. Vračko, “Novel characterization of proteomics maps by sequential neighborhood of protein spots,” Journal of Chemical Information and Computer Science, vol. 45, pp. 1205-1213, 2005. [44] M. Randić, N. Lerš, D. Vukučević, D. Plavšić, B. D. Gute, and S. C. Basak, “Canonical Labeling of Proteome Maps.” Journal of Proteome Research, vol. 4, pp. 1347-1352, 2005. [45] Ž. Bajzer, M. Randić, D. Plavšić and S. C. Basak, “Novel map descriptors for characterization of toxic effects in proteomics maps,” Journal of Molecular Graphics & Modelling, vol. 22, pp. 1-9, 2003. [46] D. Bonchev and M. Randić, “Shannon’s entropy of proteomic 2D-gel maps,” Chemical Physics Letters, vol. 372, pp. 548-552, 2003. [47] M. Randić, J. Zupan, M. Novič, B. D. Gute and S. C. Basak, “Novel matrix invariants for characterization of changes of proteomics maps,” SAR & QSAR in Environmental Research, vol. 13, pp. 689-703, 2002. [48] M. Randić, “A graph theoretical characterization of proteomics maps,” International Journal of Quantum Chemistry, vol. 
90, pp. 848-858, 2002. [49] M. Randić and S. C. Basak, “A comparative study of proteomics maps using graph theoretical biodescriptors,” Journal of Chemical Information and Computer Science, vol. 42, pp. 983-992, 2002. [50] M. Randić, M. Novič and M. Vračko, On characterization of dose variations of 2-D proteomics maps by matrix invariants, Journal of Proteome Research, vol. 114, pp. 217-226, 2002. [51] M. Randić, F. Witzmann, M. Vračko and S. C. Basak, On characterization of proteomics maps and chemically induced changes in proteomes using matrix invariants: application to peroxisome proliferators, Medicinal Chemistry Research, vol. 10, pp. 456-479, 2001. [52] M. Randić, On graphical and numerical characterization of proteomics maps, Journal of Chemical Information and Computer Science, vol. 41, pp. 1330-1338, 2001. [53] M. Randić, J. Zupan and M. Novič, On 3-D graphical representation on proteomics maps and their numerical characterization, Journal of Chemical Information and Computer Science, vol. 41, pp. 1339-1344, 2001. [54] M. Randić and E. Estrada, “Order from chaos: Observing hormesis at the proteome level.” Journal of Proteome Research, vol. 4, pp. 2133-2136, 2005.
[55] M. Randić, Quantitative characterization of proteome: Dependence on the number of proteins considered, Journal of Proteome Research, vol. 5, pp. 1575-1579, (2006) [56] S. Needleman and C. D. Wunsch. “A general method applicable to search for similarities in amino acid sequence of 2 proteins,” Journal of Molecular Biology, vol. 48, pp. 443-453,1970 [57] T. F. Smith and M. S. Waterman, Identification of common molecular subsequences, Journal of Molecular Biology, vol. 147, pp. 195-197, 1981. [58] W. R. Pearson and D. J. Lipman, “Improved tools for biological sequence comparison,” Proceedings of the National Academy of Sciences, vol. 85, pp. 2444-2448, 1988. [59] E, Hamori and J. Ruskin, “H curves, a novel method of representation of nucleotide series especially suited for long DNA sequences,” Journal of Biological Chemistry, vol. 285, pp. 1318-1327, 1983. [60] E. Hamori, Novel DNA sequence representations, Nature, vol. 314, pp. 585-586, 1985. [61] E. Hamori, “Graphic representation of long DNA-sequences by the method of h-curves - current results and future aspects,” BioTechniques, vol. 7, pp. 710-720, 1989. [62] A. Nandy, “A new graphical representation and analysis of DNA sequence structure: I. Methodology and application to globin genes,” Current Science, vol. 66, pp. 309-314, 1994. [63] H. J. Jeffrey, “Chaos game representation of gene structure,” Nucleic Acid Research, vol. 18, pp. 2163-2170, 1990. [64] H. J. Jeffrey, “Chaos game visualization of sequences.” Computers & Graphics. An International Journal of Systems & Applications in Computer Graphics, vol. 16, pp. 2533, 1992. [65] M. F. Barnsley and H. Rising, H. Fractals Everywhere, 2nd ed., Academic Press, Boston, MA, USA, 1993. [66] M. Randić, “Another look at the chaos-game representation of DNA,” Chemical Physics Letters, vol. 456, pp. 84-88, 2008. [67] M. Randić, M. Vračko, N. Lerš, and D. 
Plavšić, “Novel 2-D graphical representation of DNA sequences and their numerical characterization,” Chemical Physics Letters, vol. 368, pp. 1-6, 2003. [68] M. Randić, M. Vračko, N. Lerš, and D. Plavšić, “Analysis of similarity/dissimilarity of DNA sequences based on novel 2-D graphical representation,” Chemical Physics Letters, vol. 371, pp. 202-207, 2003. [69] J. Zupan and M. Randić, “Algorithm for coding DNA sequences into "spectrum-like" and "zigzag" representations,” Journal of Chemical Information and Computer Science, vol. 45, pp. 309-313, 2005. [70] M. Randić, “Spectrum-like graphical representation of DNA based on codons,” Acta Chimica Slovenica, vol. 53, pp. 477-485, 2006. [71] M. Randić, J. Zupan, D. Vikić-Topić, and D. Plavšić, “A novel unexpected use of a graphical representation of DNA: graphical alignment of DNA sequence,” Chemical Physics Letters, vol. 431, pp. 375-579, 2006. [72] M. Randić, “On geometry-based approach to protein sequence alignment,” Journal of Mathematical Chemistry, vol. 43, pp. 756-772, 2008. [73] J. R. Platt, “Influence of neighbor bonds on additive bond properties in paraffins,” Journal of Chemical Physics, vol. 15, pp. 419-420, 1947.
[74] H. Wiener, “Structural determination of paraffin boiling points.” Journal of the American Chemical Society, vol. 69, pp. 17-20, 1947. [75] M. Randić, “Characterization of molecular branching,” Journal of the American Chemical Society, vol. 97, pp. 6609-6615, 1975. [76] A. T. Balaban, “Highly discriminating distance-based topological index.” Chemical Physics Letters, vol. 89, pp. 399-404, 1982. [77] M. Randić, “Novel graph theoretical approach to heteroatoms in QSAR,” Chemometrics & Intelligent. Laboratory Systems, vol. 10, pp. 213-227, 1991. [78] M. Randić, D. Plavšić, and M. Razinger, Double invariants, MATCH – Communication in Mathematical and Computer Chemistry, vol. 35, pp. 243-259, 1997. [79] A. R. Katrizky, V. S. Lobanov, and M. Karelson. CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis), University of Florida, Gainesville. FL. 1995. [80] From Chemical Topology to Three-Dimensional Geometry, A. T. Balaban, ed., Plenum Press, New York, NY, USA, 1997. [81] M. Randić, “Topological Indices,” pp 3018-3032.in: Encyclopedia of Computational Chemistry, P. v. R. Schleyer, N. L. Allinger, T. Clark, J. Gasteiger, P. A. Kollman, H. F. Schaefer III, P. R. Schreiner (Eds.), Wiley, Chichester, UK, 1998. [82] J. Devillers and A.T. Balaban, (Eds.), Topological Indices and Related Descriptors in QSAR and QSPR, Gordon and Breach, Amsterdam, The Netherlands, 1999. [83] R. Todeschini and V. Consoni, Handbook of molecular descriptors, methods and principles, in Medicinal Chemistry, vol. 11, R. Mannhold, H. Kubinyi, H. Timmerman (eds), Wiley-VCH, Weinheim, Germany, 2000. [84] M. V. Diudea, (Ed.), QSPR/QSAR Studies by Molecular Descriptors, Nova Science Publishers, Huntington, UK, 2001. [85] M. Randić, “The connectivity index 25 years after,” Journal of Molecular Graphics and Modelling, vol. 20, pp. 19-35, 2001. [86] M. Randić and J. 
Zupan, “On the interpretation of the well-known topological indices,” Journal of Chemical Information and Computer Science, vol. 41, pp. 550-560, 2001. [87] D. H. Rouvray, R. B. King, (Eds.), Topology in Chemistry: Discrete Mathematics of Molecules, Horwood, Chichester, UK, 2002. [88] M. Randić and J. Zupan, “On the structural interpretation of topological indices.” Chapter 9, pp. 249-291 in: Topology in Chemistry: Discrete Mathematics of Molecules, D. H. Rouvray and R. B. King, Eds., Horwood Publ. Ltd.: Chichester, UK, 2002. [89] M. V. Diudea, M. S. Florescu, P. V. Khadikar, Molecular Topology and Its Applications, EfiCon, Bucharest, Romania, 2006. [90] B. Liao, W. Chen, X. Sun and W. Zhu, “A binary coding method of RNA secondary structure and its application,” Journal of Computational Chemistry, (in press) [91] M. Randić, J. Zupan, D. Vikić-Topić, D. Plavšić, and A. T. Balaban, “On graphical representation of proteins,” Chemical Reviews (submitted). [92] M. Randić, M. Vračko, M. Novič and D. Plavšić, “Spectral representation of reduced protein models,” SAR & QSAR in Environmental Research, 2009. (in press). [93] M. Randić, M. Vračko and D. Plavšić, “Novel compact graphical representation of proteins,” International Journal of Chemical Modeling (in press).
[94] M. Randić, J. Zupan and D. Vikić-Topić, “On representation of proteins by star-like graphs,” Journal of Molecular Graphics & Modelling, vol. 26, pp. 493-512, 2003. [95] M. Randić, M. Vračko, M. Novič and D. Plavšić, Simple 2D graphical representation of DNA, work in progress.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 539-587
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 21
TIMISOARA SPECTRAL – STRUCTURE ACTIVITY RELATIONSHIP (SPECTRAL-SAR) ALGORITHM: FROM STATISTICAL AND ALGEBRAIC FUNDAMENTALS TO QUANTUM CONSEQUENCES
Mihai V. Putz1,* and Ana-Maria Putz2
1 Laboratory of Computational and Structural Physical Chemistry, Chemistry Department, West University of Timişoara, Str. Pestalozzi No.16, Timisoara, RO-300115, Romania
2 Laboratory of Inorganic Chemistry, Timişoara Institute of Chemistry of Romanian Academy, Av. Mihai Viteazul, No.24, Timişoara RO-300223, Romania
Abstract
Given the present-day interest in correlating chemical structure with biological activity, quantitative structure-activity relationships (QSARs) are reviewed in both their fundamental statistical and their advanced algebraic frameworks. The latter allows the so-called Spectral-SAR reformulation of classical multilinear regression in terms of data vectors and orthogonality conditions, and is suited to building paths and maps of interconversion between endpoints (computed activities). The result is a novel picture of regression analysis that aims to approach the quantum interpretation of data and of ligand-receptor interaction through systematic orthogonalization and scalar (dot) products of the molecular descriptors (of chemicals or toxicants), both among themselves and with the observed (recorded, measured) activities. The resulting Spectral-SAR, or Quantum-SAR, treats the data as whole vectors, to be associated in principle with eigenstates in the quantum Hilbert space. It thus opens the way to assigning a kind of wave function or wave packet to a congeneric series of active molecules rather than to a single molecule, as was customary; the specific interaction may then eventually be modeled by a quantum, rather than merely quantitative, correlation picture between structure (intrinsic) and metabolic (extrinsic) properties.
* E-mail address: [email protected], [email protected]; Web: www.mvputz.iqstorm.ro. Tel: +40-256-592633; Fax: +40-256-592620. (Corresponding author)
1. Introduction
In recent years, scientific research worldwide has focused on so-called green chemistry, that is, on efforts to reduce or eliminate the use or production of dangerous (potentially toxic) substances in the synthesis, mainstream use and application of chemical compounds through pre-industrial or computational design [1]. Accordingly, new organizations and regulations have been established on all meridians to validate the compounds that enter the environment or everyday and medical life: the first taxonomical groups emerged in the United States under the Environmental Protection Agency [2], followed by the European agency Umweltbundesamt (1997) and by Environment Canada (1999). At the level of the European Union, the Strategy on Management of Substances (SOMS, 2001) [3] program was the first step towards the Registration, Evaluation, Authorization and Restriction of Chemicals (REACH) norms, advanced by the European Commission on 23 October 2003. Through Regulation (EC) No. 1907/2006, REACH establishes that, starting from 2009, any substance with carcinogenic or mutagenic potential may enter the life cycle through the market only with the authorization of the European Chemicals Agency (ECHA) in Helsinki [4,5]. Romania, which from the legislative point of view already has the governmental directive OG no. 200/2000, approved by Law no. 451/2001, has been a member of the Rotterdam Convention since 10 September 2003, and thus takes part in the so-called Prior Informed Consent (PIC) procedure, which requires prior consent regarding the risk or degree of toxicity of a specific chemical that will circulate in or be imported into the country. Moreover, since Romania became an official member of the European Union (1 January 2007), all chemicals on Romanian territory have to comply with the REACH norms.
In this context, fundamental research is in turn driven by EU legislation through the guidelines of the Organisation for Economic Co-operation and Development (OECD), which already credits the quantitative structure-activity relationship (QSAR) methodology as the only reliable source of computational design for the tested compounds with bio-, eco- and pharmacological impact [6,7]. Used in chemistry during the second half of the 20th century as an extended statistical analysis [8-15], the QSAR method has attained a special status in recent years, being officially certified by the European Union as the main computational tool (within the so-called “in silico” approach) for the regulatory assessment of chemicals by means of non-testing methods [2-7,16-18]. However, while QSAR primarily uses multiple regression analysis [8-15], alternative approaches such as neural networks (NN) or genetic algorithms (GA) have been advanced to generalize the QSAR performance in classifying the variables used, in the sense of the principal component analysis (PCA) and partial least squares (PLS) methodologies. Still, the claimed advantage of NN over QSAR techniques is limited by the fact that their underlying physical-mathematical philosophies differ: highly non-linear pictures are being compared with basically multi-linear ones [19-26]. In fact, the chemical-physical advantage of QSAR lies in its multi-linear correlation, which resembles the superposition principle of quantum mechanics and allows a meaningful interpretation of how the structural (inherently quantum) causes associated with the latent or unobserved variables (sometimes called common factors) lead to the observed effects (activity), usually measured in terms of the 50%-effect concentration (EC50) and associated with various types of bioaccumulation and toxicity [27]. Nevertheless, many efforts have focused on applying QSAR methods to non-linear features, from which the “expert systems” emerged as formalized computer-based environments involving knowledge-based, rule-based or hybrid automata able to provide rational predictions about the properties or biological activity of chemicals or of their fragments. The outcome is a set of QSAR-based databases: the model database (QMDB), which inventories the robust summaries of QSARs that can be retrieved by the envisaged endpoint or chemical; the prediction database (QPDB), which stores the further predictions made with data from the QMDB; and, building on both, the chemical category database (CCD) documentation [28-34]. Therefore, a sound conceptual-computational analysis of a compound or of a series of compounds, with a view to assigning its degree of toxicity, naturally involves two levels: one addresses the atomic-molecular structure together with the related quantum properties, while the other envisages the correlations of these properties (e.g. hydrophobicity, polarizability, steric effects) with the observed biological, ecological or pharmacological activities. The final result is the molecular mechanism of the reactions involved in the studied chemical-biological interaction or, in other words, the quantum chemical strength of the binding established between the ligand (the effector or the chemical) and the receptor (in the target site or organism).
Still, both the structure and the quantum chemical binding aspects require advanced study, first separately and then in combination, both at the intrinsic structural level and in correlating the interaction. This need stems from the versatility of the atomic and molecular world in generating surprising structures and interactions precisely because of the quantum character involved (i.e. wave-like, thus allowing tunneling even through energetically inaccessible potential barriers) when new compounds form that apparently cannot be explained or controlled by macroscopic procedures. Yet whatever the computational procedure adopted, whether of Hansch type [35-43], 3D [44-54], decisional [26,55-66] or orthogonal [67-80], the molecular interaction mechanism was delivered as a result of QSAR analysis only recently, by the so-called Spectral-SAR. This approach proposes a purely algebraic rethinking of traditional statistical QSAR which, through the new concepts it introduces (e.g. the orthogonal space of variables, the vectorial length of the biological activity, or the algebraic correlation factor as a measure of the intensity of the chemical-biological interaction), allows an optimized chart of the molecular action pathways to be built. The chart is grounded on the minimum spectral path principle, δ[A, B] = 0, with A and B the endpoints, within a generalized space of the action norms and correlation factors [81-90]. The present review presents the Spectral-SAR method, developed at Timisoara (Romania), as a natural continuation and generalization of the classical (statistical) quantitative structure-activity relationship (QSAR) towards the quantum assessment of the specific ligand-receptor cellular interactions, paths and maps.
2. Statistic QSAR
2.1. Scalar (Dot) Product Basics
Vectorial modeling of data is often useful for mathematical elegance, and it also gives deep insight into the present Spectral-SAR methodology. It may be associated with a generalized classical-to-quantum description of variables in Hilbert space, which helps to emphasize many properties, especially those related to orthogonality, i.e. the independence of descriptors. In this way the most efficient quantum description of a dynamical system, projected on the associated minimum set of commuting (independent) operators, ensures maximum predictability in computation and viability in conceptual modeling. Skipping the formal mathematical details while capturing the essence of the computations, the main features of the scalar (dot) product, the main operation on a Hilbert (vector) space, are briefly reviewed in what follows. Given two vectors
$$ |u\rangle = |u_1, u_2, \ldots, u_n\rangle, \quad |v\rangle = |v_1, v_2, \ldots, v_n\rangle \tag{1} $$
their scalar (or dot) product is written as:
$$ \langle u | v \rangle \overset{\mathrm{def}}{=} \sum_{i=1}^{n} u_i v_i = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n \tag{2} $$
Since the scalar product of a vector with itself reads:
$$ \langle u | u \rangle = \sum_{i=1}^{n} u_i^2 \tag{3} $$
one may introduce the so-called norm (or length) of the vector by:
$$ \| u \| = \sqrt{\langle u | u \rangle} = \sqrt{\sum_{i=1}^{n} u_i^2} \tag{4} $$
The length property of the vectorial norm may be easily visualized by computing the modulus of an arbitrary 3D vector $|r\rangle = |u_1, u_2, u_3\rangle$:
$$ |\vec{r}| = \sqrt{u_1^2 + u_2^2 + u_3^2} = \sqrt{\langle u_1, u_2, u_3 | u_1, u_2, u_3 \rangle} = \sqrt{\langle r | r \rangle} = \| r \| \tag{5} $$
Consequently, the distance between two vectors is written in terms of the norm of their difference:
$$ d(|u\rangle, |v\rangle) = \big\| |u\rangle - |v\rangle \big\| = \| u - v \| = \sqrt{\langle u - v | u - v \rangle} = \sqrt{\sum_{i=1}^{n} (u_i - v_i)^2} \tag{6} $$
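From the last relation (and also from the fact that the scalar product is positive definite, see above), the distributivity and commutativity properties of the scalar product may be used, for any $t \in \mathbb{R}$, to obtain the equivalent expressions
$$ \langle u - t v | u - t v \rangle \ge 0,\ t \in \mathbb{R} \;\Leftrightarrow\; \big(\langle u| - t \langle v|\big)\big(|u\rangle - t |v\rangle\big) \ge 0 \;\Leftrightarrow\; \langle v | v \rangle t^2 - 2 \langle u | v \rangle t + \langle u | u \rangle \ge 0 \tag{7} $$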
The last inequality says that the quadratic expression in t on the right has either no real root or a single (double) root, a condition fulfilled when its discriminant is less than or equal to zero; this leads to the famous Cauchy-Schwarz inequality:
$$ \langle u | v \rangle^2 \le \langle u | u \rangle \langle v | v \rangle \tag{8} $$
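rewritten as:
$$ \left| \sum_{i=1}^{n} u_i v_i \right| \le \sqrt{\sum_{i=1}^{n} u_i^2}\, \sqrt{\sum_{i=1}^{n} v_i^2} \tag{9} $$
or as
$$ |\langle u | v \rangle| \le \| u \| \cdot \| v \| \tag{10} $$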
The Cauchy-Schwarz inequality is widely used in probability theory, in variance theory and for correlation factors, as will be illustrated in what follows.
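As a quick numerical illustration of eqs. (2), (4), (6) and (8), the following Python sketch implements the scalar product, norm and distance for plain lists of numbers and checks the Cauchy-Schwarz inequality on one pair of vectors. The function names and the sample vectors are illustrative assumptions, not part of the original formalism.

```python
import math


def dot(u, v):
    """Scalar product of eq. (2): sum of component-wise products."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Norm (length) of eq. (4): square root of the self scalar product."""
    return math.sqrt(dot(u, u))


def distance(u, v):
    """Distance of eq. (6): norm of the difference vector."""
    return norm([ui - vi for ui, vi in zip(u, v)])


# Arbitrary sample vectors, chosen only for illustration.
u = [1.0, 2.0, 2.0]
v = [2.0, 1.0, -1.0]

print(dot(u, v), norm(u), distance(u, v))

# Cauchy-Schwarz, eq. (8): the squared scalar product never exceeds the
# product of the two self scalar products.
cauchy_schwarz_holds = dot(u, u) * dot(v, v) >= dot(u, v) ** 2
```

The same three helpers are all that is needed later for the scalar-product form of the correlation coefficient, since that form only involves scalar products of centered data vectors.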
2.2. Basic Statistical Indices
Consider a “Universe” covered by a set of causes and effects, with both individual and coupled probabilities, as given in Table 1. The ergodic statistical (or normalization) conditions for their discrete realizations are expressed, respectively, as:
$$ \sum_{i=1}^{M} p_i = 1, \tag{11a} $$
$$ \sum_{j=1}^{N} p_j = 1, \tag{11b} $$
$$ \sum_{i=1}^{M} \sum_{j=1}^{N} p_{ij} = 1 \tag{11c} $$
Table 1. Schematic representation of the “Universe” by the probability table (and values) with which a certain cause xk produces a certain effect yk

X   |     | x1  | ... | xk  | ... | xM
Y   |     | p1  | ... | pk  | ... | pM
y1  | p1  | p11 | ... | p1k | ... | p1M
y2  | p2  | p21 | ... | p2k | ... | p2M
... | ... | ... | ... | ... | ... | ...
yk  | pk  | pk1 | ... | pkk | ... | pkM
... | ... | ... | ... | ... | ... | ...
yN  | pN  | pN1 | ... | pNk | ... | pNM
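The normalization conditions (11a)-(11c) can be checked numerically on a small probability table. The sketch below is a minimal example with made-up numbers; it assumes, as an interpretation not stated explicitly in the text, that the individual probabilities are the marginal sums of the coupled ones, and it stores the table as `joint[i][j]` with i running over causes and j over effects, following the index order of eq. (11c).

```python
import math

# Hypothetical coupled probabilities p_ij: row i = cause x_i, column j = effect y_j.
joint = [
    [0.10, 0.30, 0.05],
    [0.20, 0.15, 0.20],
]


def marginals(table):
    """Return (p_i over causes, p_j over effects) as row and column sums."""
    p_causes = [sum(row) for row in table]
    p_effects = [sum(column) for column in zip(*table)]
    return p_causes, p_effects


def is_normalized(probabilities, tol=1e-12):
    """True when the probabilities sum to one within a small tolerance."""
    return math.isclose(sum(probabilities), 1.0, abs_tol=tol)


p_causes, p_effects = marginals(joint)
all_cells = [p for row in joint for p in row]

# Conditions (11a), (11b) and (11c), in that order.
checks = (is_normalized(p_causes), is_normalized(p_effects), is_normalized(all_cells))
print(p_causes, p_effects, checks)
```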
Yet, in the integral representation of the probability field extended over the “Universe” of actions, condition (11c) is rewritten in terms of the probability density function $f(x, y)$ as
$$ P(x, y)_{\mathrm{UNIVERSE}} = \int_x \int_y f(x, y)\, dx\, dy = 1 \tag{12} $$
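while the average of a given observable over a given domain of reality “D” is introduced in the same manner as in the quantum mechanical measurement postulate [91]:
$$ \langle \hat{A} \rangle_D = \int \psi^{*} \hat{A} \psi\, d\tau = \int \hat{A}\, \psi^{*} \psi\, d\tau = \int \hat{A}\, |\psi|^2\, d\tau = \int_x \int_y \hat{A}\, f(x, y)\, dx\, dy \tag{13} $$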
where one easily recognizes the quantity $|\psi|^2$ as the probability density.
Under these conditions the average of the x-values is written as
$$ \langle x \rangle = \iint_D x\, f(x, y)\, dx\, dy \tag{14} $$
x1 − x , x 2 − x ,… and, even more, their squared (positive) counterparts
(15)
Timisoara Spectral – Structure Activity Relationship (Spectral-SAR) Algorithm
(x
1
− x
) , (x 2
− x
2
) ,K 2
545 (16)
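for defining the x-dispersion (or x-variance) statements
$$ D_x = \begin{cases} \langle (x - \langle x \rangle)^2 \rangle = \displaystyle\iint (x - \langle x \rangle)^2 f(x, y)\, dx\, dy \\[2mm] \langle (x - \langle x \rangle)^2 \rangle = \displaystyle\sum_i p_i \Big( x_i - \sum_k x_k p_k \Big)^2 \\[2mm] \dfrac{1}{n} \displaystyle\sum_i \Big( x_i - \dfrac{1}{n} \sum_k x_k \Big)^2 \end{cases} \tag{17} $$
expressed in integral, discrete-probability or uniform-probability form, respectively, the last one corresponding to
$$ \begin{cases} p_{ij} = p_i = p_j = \dfrac{1}{n} \\[2mm] N = M = n \end{cases} \tag{18} $$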
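For an alternative, more practical definition of the variance, one may use the properties of the “quantum” average to obtain successively the forms
$$ D_x = \langle (x - \langle x \rangle)^2 \rangle = \langle (x - \langle x \rangle)(x - \langle x \rangle) \rangle = \langle x^2 - 2 x \langle x \rangle + \langle x \rangle^2 \rangle = \langle x^2 \rangle - 2 \langle x \rangle \langle x \rangle + \langle x \rangle^2 = \langle x^2 \rangle - 2 \langle x \rangle^2 + \langle x \rangle^2 = \langle x^2 \rangle - \langle x \rangle^2 \tag{19} $$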
thus providing the celebrated dispersion form (used in the Heisenberg indeterminacy principle), from which its meaning also follows: it measures the error made in attributing the average (14) to the x-set of values in Table 1 [92]. Eq. (19) may be further adapted to the integral, probability and uniform variants, as before:
$$ D_x = \begin{cases} \langle x^2 \rangle - \langle x \rangle^2 = \displaystyle\iint x^2 f(x, y)\, dx\, dy - \Big( \iint x f(x, y)\, dx\, dy \Big)^2 \\[2mm] \langle x^2 \rangle - \langle x \rangle^2 = \displaystyle\sum_i x_i^2 p_i - \Big( \sum_i x_i p_i \Big)^2 \\[2mm] = \dfrac{1}{n} \displaystyle\sum_i x_i^2 - \dfrac{1}{n^2} \Big( \sum_i x_i \Big)^2 \end{cases} \tag{20} $$
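The equivalence between the centered form of the dispersion, eq. (17), and the practical form of eqs. (19)-(20) can be verified on discrete data. The following sketch is a minimal Python illustration; the function names and the sample values are assumptions chosen so that the arithmetic is exact in floating point.

```python
def mean(values, probs):
    """Discrete average: sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))


def variance_centered(values, probs):
    """Dispersion as the average squared departure from the mean, eq. (17)."""
    m = mean(values, probs)
    return sum(p * (x - m) ** 2 for x, p in zip(values, probs))


def variance_shortcut(values, probs):
    """Dispersion as the mean of squares minus the squared mean, eq. (20)."""
    squares = [x * x for x in values]
    return mean(squares, probs) - mean(values, probs) ** 2


# Non-uniform probabilities (illustrative values).
xs = [1.0, 2.0, 4.0]
ps = [0.5, 0.25, 0.25]
d_centered = variance_centered(xs, ps)
d_shortcut = variance_shortcut(xs, ps)

# Uniform probabilities p_i = 1/n, the variant of eq. (18).
xs_uniform = [1.0, 2.0, 3.0, 6.0]
ps_uniform = [1.0 / len(xs_uniform)] * len(xs_uniform)
d_uniform_centered = variance_centered(xs_uniform, ps_uniform)
d_uniform_shortcut = variance_shortcut(xs_uniform, ps_uniform)

print(d_centered, d_shortcut, d_uniform_centered, d_uniform_shortcut)
```

For general data the two forms agree only up to rounding error, and the shortcut form can lose precision when the mean is large compared with the spread, so a tolerance-based comparison is the safer check outside of toy examples.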
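Closely related to the dispersion index is the so-called covariance index, which generalizes the variance to two different quantities, here viewed as x-causes and y-effects; it takes one of the forms
$$ C_{xy} = \langle (x - \langle x \rangle)(y - \langle y \rangle) \rangle = \langle xy - x \langle y \rangle - \langle x \rangle y + \langle x \rangle \langle y \rangle \rangle = \langle xy \rangle - \langle x \rangle \langle y \rangle - \langle x \rangle \langle y \rangle + \langle x \rangle \langle y \rangle = \langle xy \rangle - 2 \langle x \rangle \langle y \rangle + \langle x \rangle \langle y \rangle = \langle xy \rangle - \langle x \rangle \langle y \rangle \tag{21} $$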
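being immediately transcribed, in the same hierarchy of integral, discrete-probability and uniform-probability forms as above, as
$$ C_{xy} = \begin{cases} \langle xy \rangle - \langle x \rangle \langle y \rangle = \displaystyle\iint x y f(x, y)\, dx\, dy - \Big( \iint x f(x, y)\, dx\, dy \Big) \Big( \iint y f(x, y)\, dx\, dy \Big) \\[2mm] \langle xy \rangle - \langle x \rangle \langle y \rangle = \displaystyle\sum_{i,j} x_i y_j p_{ij} - \Big( \sum_i x_i p_i \Big) \Big( \sum_j y_j p_j \Big) \\[2mm] = \dfrac{1}{n} \displaystyle\sum_i x_i y_i - \dfrac{1}{n^2} \Big( \sum_i x_i \Big) \Big( \sum_i y_i \Big) \end{cases} \tag{22} $$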
Note that the meaning of the covariance is best understood by considering the case of its cancellation,
$$ C_{xy} = 0 \;\Leftrightarrow\; \langle xy \rangle = \langle x \rangle \langle y \rangle \;\Leftarrow\; f(x, y) = f(x) f(y), \tag{23} $$
the case in which the two-dimensional probability density factorizes into two one-dimensional ones; the covariance should therefore account for the “non-separability” of the realization probabilities of the x-causes and y-effects, this being another point where a statistical quantity is reflected by a quantum reality. It is worth remarking here that one might say this is natural, since quantum theory is often interpreted in terms of probability and, in general, statistically; this is only partially true, given the subtle difference that still exists between quantum mechanics and quantum statistics, which are to some degree equivalent but manifestly distinct as regards time and temperature dependence, respectively [93]. Next, one works with the square roots of the dispersions, i.e. one defines the standard deviations (σ)
$$ \sigma_x = \sqrt{D_x}, \quad \sigma_y = \sqrt{D_y} \tag{24} $$
one can combine the covariance and the dispersions into the ratio called the (Pearson) correlation coefficient, written simply as:
$$
r_{xy} = \frac{C_{xy}}{\sigma_x \sigma_y}
= \frac{\langle (x - \bar{x})(y - \bar{y}) \rangle}{\sqrt{\langle (x - \bar{x})^2 \rangle \langle (y - \bar{y})^2 \rangle}}
= \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}
\tag{25a}
$$
or in equivalent scalar product fashion:
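$$
r_{xy} = \frac{\langle x - \bar{x} \,|\, y - \bar{y} \rangle}{\sqrt{\langle x - \bar{x} \,|\, x - \bar{x} \rangle \, \langle y - \bar{y} \,|\, y - \bar{y} \rangle}}
\tag{25b}
$$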
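The last form offers an elegant way to show its probabilistic character by applying the Cauchy-Schwarz inequality (8) to the vectors (states) $|x - \bar{x}\rangle$, $|y - \bar{y}\rangle$, that is
$$
\left| \langle x - \bar{x} \,|\, y - \bar{y} \rangle \right| \le \sqrt{\langle x - \bar{x} \,|\, x - \bar{x} \rangle} \, \sqrt{\langle y - \bar{y} \,|\, y - \bar{y} \rangle}
\tag{26}
$$
thus proving that eq. (25b) is sub-unitary
$$
\left| r_{xy} \right| \le 1
\tag{27}
$$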
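The bound (27) can be checked numerically with a small sketch of the scalar-product form (25b). The helper names (`centered`, `dot`, `pearson`) and the two data sets are hypothetical, introduced here only for illustration.

```python
import math


def centered(values):
    """Return the components of the state |v - mean(v)>."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    """Scalar product <u|v> of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def pearson(xs, ys):
    """Pearson coefficient in the scalar-product form of eq. (25b)."""
    cx, cy = centered(xs), centered(ys)
    return dot(cx, cy) / math.sqrt(dot(cx, cx) * dot(cy, cy))


# Hypothetical data sets, used only to illustrate the bound of eq. (27).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_linear = [3.0, 5.0, 7.0, 9.0, 11.0]     # exactly y = 2x + 1
y_scattered = [2.0, 1.0, 4.0, 3.0, 6.0]   # no exact linear law

r_linear = pearson(x, y_linear)           # reaches the upper bound of eq. (27)
r_scattered = pearson(x, y_scattered)     # stays strictly below unity
```

An exactly linear effect saturates the Cauchy-Schwarz inequality, while any scatter around the line lowers the coefficient below unity.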
If necessary, the discrete-probability version of the Pearson correlation coefficient (25)
$$
r_{xy} = \frac{\sum_{i,j} x_i y_j p_{ij} - \left( \sum_i x_i p_i \right)\left( \sum_j y_j p_j \right)}{\sqrt{\sum_i x_i^2 p_i - \left( \sum_i x_i p_i \right)^2} \, \sqrt{\sum_j y_j^2 p_j - \left( \sum_j y_j p_j \right)^2}}
\tag{28a}
$$
or its uniform-probability (18) variant
$$
r_{xy} = \frac{n \sum_i x_i y_i - \left( \sum_i x_i \right)\left( \sum_i y_i \right)}{\sqrt{n \sum_i x_i^2 - \left( \sum_i x_i \right)^2} \, \sqrt{n \sum_i y_i^2 - \left( \sum_i y_i \right)^2}}
\tag{28b}
$$
may be considered, with the same interpretation: it shows how much of the combined cause-effect probability may be represented by combining the separate probabilities of causes and effects. If this relation is an identity, the causes and effects are distinct realities and may be treated as such; otherwise, for correlation below unity, the causes appear mixed with the effects already at the stage of causes, the effects being less observable as a distinct (measurable) reality. This heuristic (yet meaningful) interpretation may also be treated “geometrically” by recalling that the classical scalar product between two vectors
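$$
\vec{x} \cdot \vec{y} = |\vec{x}| \, |\vec{y}| \cos(\vec{x}, \vec{y})
\tag{29}
$$
furnishes the angle between them through its cosine
$$
\cos(\vec{x}, \vec{y}) = \frac{\vec{x} \cdot \vec{y}}{|\vec{x}| \, |\vec{y}|}
\tag{30a}
$$
which is easily generalized to the present vectorial representation of the cause and effect data recordings of Table 1:
$$
\cos(x, y) = \frac{\langle x | y \rangle}{\| x \| \, \| y \|}
= \frac{\langle x | y \rangle}{\sqrt{\langle x | x \rangle \langle y | y \rangle}}
= \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2 \sum_i y_i^2}}
\tag{30b}
$$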
Given the sub-unitary value of the cosine, expression (30b) may be treated as another definition of the Pearson correlation (25b), one that involves just the original data (vectorial) sets or, equivalently, holds when their averages vanish, $\bar{x} = 0$, $\bar{y} = 0$. From this point of view it is clear that the Pearson definition (25b) is more general, since it involves averages of whatever value, while eq. (30b) fixes the angle between the cause and effect states: as the cosine approaches unity the two states are better correlated, in the sense of inferring one from the other and not of interfering one with the other. This is a subtle message, which we stress by synthesizing two conclusions:
• The orthogonality between two correlating sets of data is essential in establishing the qualitative degree of correlation, and it does not depend on the averages of the data sets but only on their vectorial lengths and scalar product, through the angle cosine given by eq. (30b);
• Instead, the quantitative degree of correlation is established by invoking the averages of the data sets concerned, by modifying/generalizing eq. (30) towards the Pearson coefficient (25b).
These are the fundamental ideas underlying the motivation and the “philosophy” of quantitative activity (for effects)-structure (for causes) relationships, to be unfolded step by step in what follows.
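The difference between the two conclusions above can be seen on a toy example. The sketch below compares the raw-data cosine of eq. (30b) with the Pearson coefficient of eq. (25b); the function names and the three-point data sets are assumptions made only for this illustration.

```python
import math


def dot(u, v):
    """Scalar product <u|v> of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(xs, ys):
    """Cosine of the angle between the raw data vectors, eq. (30b)."""
    return dot(xs, ys) / math.sqrt(dot(xs, xs) * dot(ys, ys))


def pearson(xs, ys):
    """Pearson coefficient, eq. (25b): the same cosine taken for centered vectors."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return cosine([x - mean_x for x in xs], [y - mean_y for y in ys])


# Hypothetical data with non-vanishing averages.
x = [1.0, 2.0, 3.0]
y = [3.0, 2.0, 1.0]

cos_raw = cosine(x, y)   # 10/14: the raw vectors point in broadly similar directions
r_xy = pearson(x, y)     # -1: once averages are removed, y decreases exactly as x grows
```

Only when the averages vanish do the two quantities coincide, which is the sense in which (25b) generalizes (30b).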
2.3. Linear Correlation

Going on with the correlation analysis, let us explore the linear correlation for the one-cause, one-effect relationship and see the role the correlation factor plays in it. For that we consider the one-to-one data sets for the x-cause and y-effect, within the uniform-probability realization of eq. (18), here summarized as the data rows:

X    x1    x2    ...    xn
Y    y1    y2    ...    yn
Basically, the regression problem consists in finding the best modeling of the observed effects by the computed ones
$$
y_i^{obs} = \underbrace{a x_i + b}_{y_i^{comp}} + e_i = y_i^{comp} + e_i
\tag{31}
$$
through minimizing the errors of such an approximation, that is:
$$
\begin{cases}
e_i^2 = \left[ y_i - (a x_i + b) \right]^2 \\
\sum_i e_i^2 \to \min
\end{cases}
\tag{32}
$$
Analytically, if the function to be minimized is introduced as the sum of squared errors
$$
f(a, b) = \sum_i e_i^2 = \sum_i \left( y_i - a x_i - b \right)^2 \to \min
\tag{33}
$$
then the optimization is carried out with respect to the linear parameters, i.e. the free term and the slope of the regression, providing the system
$$
\begin{cases}
\dfrac{\partial f(a, b)}{\partial a} = 0 \\[6pt]
\dfrac{\partial f(a, b)}{\partial b} = 0
\end{cases}
\tag{34a}
$$
equivalently unfolded as
$$
\begin{cases}
2 \sum_i \left( y_i - a x_i - b \right)(-x_i) = 0 \\
2 \sum_i \left( y_i - a x_i - b \right)(-1) = 0
\end{cases}
\tag{34b}
$$
or even as:
$$
\begin{cases}
\sum_i y_i x_i = a \sum_i x_i^2 + b \sum_i x_i & \big| \cdot n \\
\sum_i y_i = a \sum_i x_i + b n & \big| \cdot \left( -\sum_i x_i \right)
\end{cases}
\tag{34c}
$$
solved for the solutions:
$$
a = \frac{n \sum_i x_i y_i - \left( \sum_i y_i \right)\left( \sum_i x_i \right)}{n \sum_i x_i^2 - \left( \sum_i x_i \right)^2},
\tag{35}
$$
$$
b = \frac{\sum_i y_i \sum_i x_i^2 - \sum_i x_i \sum_i x_i y_i}{n \sum_i x_i^2 - \left( \sum_i x_i \right)^2}
\tag{36}
$$
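The closed forms (35) and (36) translate directly into code. The sketch below is one possible transcription; the function name `linear_fit` and the five data points are hypothetical. It also evaluates the residuals so that the two conditions of system (34b) can be inspected.

```python
def linear_fit(xs, ys):
    """Slope a and free term b from eqs. (35) and (36)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values coincide; the slope is undefined")
    a = (n * sum_xy - sum_y * sum_x) / denominator        # eq. (35)
    b = (sum_y * sum_xx - sum_x * sum_xy) / denominator   # eq. (36)
    return a, b


# Hypothetical cause/effect recordings.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.5, 5.5, 8.5, 9.5]

a, b = linear_fit(x, y)
residuals = [yi - a * xi - b for xi, yi in zip(x, y)]

# Both conditions of system (34b) should vanish at the optimum.
gradient_a = sum(e * xi for e, xi in zip(residuals, x))
gradient_b = sum(residuals)
```

For these data the sums are n = 5, Σx = 15, Σy = 30, Σx² = 55 and Σxy = 109, so a = 95/50 = 1.9 and b = 15/50 = 0.3.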
It is now worth observing that by multiplying the second equation of the system (34c) by the factor 1/n one gets the meaningful expression
$$
\frac{1}{n} \sum_i y_i = a \frac{1}{n} \sum_i x_i + b
\tag{37}
$$
telling that the linear relationship is in fact precisely fulfilled by the data set averages of cause and effect, respectively, i.e.
$$
\bar{y} = a \bar{x} + b
\tag{38}
$$
However, looking at the slope expression (35) and comparing it with the Pearson coefficient (28b), one easily recognizes that they are different statistical quantities, although linked by the x- and y-standard deviations (24), with dispersions of type (20), namely as
$$
a = r_{xy} \frac{\sigma_y}{\sigma_x} = \frac{C_{xy}}{\sigma_x \sigma_y} \frac{\sigma_y}{\sigma_x} = \frac{C_{xy}}{D_x}
\tag{39}
$$
where the dependence of the slope (35) on the x-y covariance (22) and on the x-dispersion (20) was also emphasized. With these, it is clear that the monovariate linear correlation carries the correlation factor as direct information in its slope; indeed, if the x- and y-standard deviations are considered approximately the same
$$
\sigma_x = \sigma_y
\tag{40}
$$
which happens in the ideal case when both the x- and y-data sets are described by the same normal distribution, the identity results:
$$
a = r_{xy}
\tag{41}
$$
However, since, in general, we have the case
$$
\frac{\sigma_y}{\sigma_x} \neq 1
\tag{42}
$$
it is clear that this ratio “modulates” the correlation slope a of eq. (35) to provide the correct, sub-unitary, correlation factor; this explains why, even in practical cases with slope higher than unity (a > 1), the correlation factor still records sub-unitary values. Returning to the general linear regression, we can now consider the slope-Pearson correlation coefficient relation of eq. (39) as driving the instantaneous equation
$$
y = r_{xy} \frac{\sigma_y}{\sigma_x} x + b
\tag{43a}
$$
along with its averaged form, in accordance with eq. (38),
$$
\bar{y} = r_{xy} \frac{\sigma_y}{\sigma_x} \bar{x} + b
\tag{43b}
$$
as well as their difference
$$
y - \bar{y} = r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x})
\tag{44a}
$$
from which the computed (predicted) instantaneous effects are written directly from the averaged observed ones, corrected in a perturbative sense by the departure of the corresponding instantaneous cause from its average, modulated by the Pearson coefficient, which is sub-unitary as proved earlier by eq. (27), and by the ratio of effect-to-cause standard deviations
$$
y^{comp} = \bar{y} + r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x})
\tag{44b}
$$
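A short numerical sketch may help to see how the ratio of standard deviations modulates the slope in eq. (39) and how the perturbative form (44b) reproduces the regression line. The function names and data are hypothetical, and uniform probability is assumed throughout.

```python
import math


def mean(values):
    return sum(values) / len(values)


def regression_summary(xs, ys):
    """Return (a, b, r_xy, sigma_x, sigma_y) under uniform probability."""
    mean_x, mean_y = mean(xs), mean(ys)
    d_x = mean([x * x for x in xs]) - mean_x ** 2                  # eq. (20)
    d_y = mean([y * y for y in ys]) - mean_y ** 2
    c_xy = mean([x * y for x, y in zip(xs, ys)]) - mean_x * mean_y  # eq. (22)
    sigma_x, sigma_y = math.sqrt(d_x), math.sqrt(d_y)               # eq. (24)
    r_xy = c_xy / (sigma_x * sigma_y)                               # eq. (25a)
    a = c_xy / d_x                                                  # eq. (39)
    b = mean_y - a * mean_x                                         # eq. (38)
    return a, b, r_xy, sigma_x, sigma_y


def predict(x_value, xs, ys):
    """Computed effect in the perturbative form of eq. (44b)."""
    _, _, r_xy, sigma_x, sigma_y = regression_summary(xs, ys)
    return mean(ys) + r_xy * sigma_y / sigma_x * (x_value - mean(xs))


# Hypothetical cause/effect recordings with slope above unity.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.5, 5.5, 8.5, 9.5]

a, b, r_xy, sigma_x, sigma_y = regression_summary(x, y)
y_at_4 = predict(4.0, x, y)
```

Here the slope exceeds unity while the correlation factor stays below it, because the effects are more dispersed than the causes.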
Finally, it is worth giving a practical rule that aims to assure, as much as possible, the premises of a good or relevant correlation in the sense of increasing the Pearson correlation factor; it may look like
$$
r_{xy} \sqrt{n - 1} \ge 3
\tag{45a}
$$
which offers a quite reasonable framework depending on the dimension of the data sets accounted as causes and recorded as effects. As such, it is clear that even for data sets containing ten points the condition (45a) is still not satisfied,
$$
n = 10 \Rightarrow \sqrt{n - 1} = 3, \quad \forall\, r_{xy}: \; r_{xy} \cdot 3 < 3
\tag{46a}
$$
while only from a seventeen-dimensional vectorial state of instantaneous cause-effect points onward may the regression analysis become reasonable:
$$
n = 17 \Rightarrow \sqrt{n - 1} = 4, \quad \exists\, r_{xy}: \; r_{xy} \sqrt{n - 1} \ge 3
\tag{46b}
$$
Nevertheless, from (45a) an even more relaxed condition may be inferred by squaring it:
$$
r_{xy}^2 (n - 1) \ge 9
\tag{45b}
$$
which may be satisfied even for data sets with cardinality in the range ≥ 10. It is worth observing that the more restricted the data set of causes, the closer to unity the correlation factor has to be for goodness of fit. Again, the present discussion has two subtle consequences, namely:
• The cause-effect (linear) regression is meaningful when the number of points included in the analysis is significant, and in any case larger than ten;
• The correlation analysis is still relevant, even for a lower square of the Pearson coefficient, as long as the number of included cause-effect points is high enough for condition (45b) to be fulfilled; this consequence prevents the ab initio exclusion of correlation models whose correlation coefficient does not lie in the vicinity of unity, provided a considerably large data set was assumed.
Further insight into the correlation coefficient and its alternative practical definition is presented next.
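The practical rule (45a) is easy to tabulate. The sketch below is illustrative only: the function names are hypothetical, and the absolute value of the coefficient is used so that negative correlations are treated alike, which is an assumption not stated in eq. (45a).

```python
import math


def satisfies_rule(r_xy, n):
    """Practical rule of eq. (45a), applied to |r_xy| (assumption of this sketch)."""
    return abs(r_xy) * math.sqrt(n - 1) >= 3.0


def minimal_correlation(n):
    """Smallest |r_xy| compatible with eq. (45a) for n data points."""
    return 3.0 / math.sqrt(n - 1)


r_min_10 = minimal_correlation(10)   # 3/sqrt(9) = 1: nothing below unity passes, eq. (46a)
r_min_17 = minimal_correlation(17)   # 3/sqrt(16) = 0.75, eq. (46b)

passes_10 = satisfies_rule(0.99, 10)   # fails: 0.99 * 3 < 3
passes_17 = satisfies_rule(0.80, 17)   # holds: 0.80 * 4 >= 3
```

The threshold on the correlation factor thus decreases as the number of cause-effect points grows, in line with the second consequence above.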
2.4. Correlation by Normal Distribution Function

After introducing the main statistical indices and concepts, it is worth generalizing them with the aid of the distribution function, by defining the so-called (statistical) moment of K-th order for the x-variable (cause):
$$
\mu_K = \langle (x - \bar{x})^K \rangle = \int (x - \bar{x})^K f(x)\,dx
\tag{47}
$$
Consequently, the first three moments are easily recognized as:
• the normalization condition through the zeroth-order moment:
$$
\mu_0 = \langle (x - \bar{x})^0 \rangle = \langle 1 \rangle = 1 = \int f(x)\,dx
\tag{48}
$$
• the cancellation of the first moment:
$$
\mu_1 = \langle (x - \bar{x})^1 \rangle = \int x f(x)\,dx - \bar{x} \int f(x)\,dx = \bar{x} - \bar{x} = 0
\tag{49}
$$
• the (squared) standard deviation through the second moment:
$$
\mu_2 = \langle (x - \bar{x})^2 \rangle = \sigma_x^2 = \int (x - \bar{x})^2 f(x)\,dx
\tag{50}
$$
The remaining problem is the identification of the distribution function; it can nevertheless be chosen as the normal distribution, with the form
$$
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\left[ -\frac{(x - \bar{x})^2}{2\sigma_x^2} \right], \quad x \in \mathbb{R}
\tag{51}
$$
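for which the first three moments of eqs. (48)-(50) are verified with the help of the Appendix, respectively as:
$$
\mu_0 = \frac{1}{\sqrt{2\pi}\,\sigma_x} \int_{-\infty}^{+\infty} \exp\left[ -\frac{(x - \bar{x})^2}{2\sigma_x^2} \right] d(x - \bar{x})
= \frac{1}{\sqrt{2\pi}\,\sigma_x} \sqrt{\frac{\pi}{1 / (2\sigma_x^2)}} = 1,
\tag{52a}
$$
$$
\mu_1 = \frac{1}{\sqrt{2\pi}\,\sigma_x} \int_{-\infty}^{+\infty} (x - \bar{x}) \exp\left[ -\frac{(x - \bar{x})^2}{2\sigma_x^2} \right] d(x - \bar{x}) = 0,
\tag{52b}
$$
$$
\mu_2 = \frac{1}{\sqrt{2\pi}\,\sigma_x} \int_{-\infty}^{+\infty} (x - \bar{x})^2 \exp\left[ -\frac{(x - \bar{x})^2}{2\sigma_x^2} \right] d(x - \bar{x})
= \frac{1}{\sqrt{2\pi}\,\sigma_x} \cdot \frac{1}{2 \cdot \frac{1}{2\sigma_x^2}} \cdot \sqrt{2\pi\sigma_x^2} = \sigma_x^2
\tag{52c}
$$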
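The three results (52a)-(52c) can also be checked by direct numerical integration. The sketch below uses a simple midpoint rule on a truncated range; the function names, the truncation at ten standard deviations, and the parameter values are assumptions of this illustration rather than part of the original derivation.

```python
import math


def normal_pdf(x, mean, sigma):
    """Normal distribution of eq. (51)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def central_moment(order, mean, sigma, half_width=10.0, steps=20000):
    """Midpoint-rule estimate of the K-th central moment, eq. (47).

    The infinite range is truncated to mean +/- half_width * sigma,
    where the neglected Gaussian tails are negligible.
    """
    lower = mean - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        total += (x - mean) ** order * normal_pdf(x, mean, sigma) * h
    return total


mean_x, sigma_x = 1.5, 0.7   # hypothetical parameters
mu0 = central_moment(0, mean_x, sigma_x)   # normalization, eq. (52a)
mu1 = central_moment(1, mean_x, sigma_x)   # vanishing first moment, eq. (52b)
mu2 = central_moment(2, mean_x, sigma_x)   # sigma_x**2, eq. (52c)
```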
Next, having checked the reliability of the normal function for one variable, the generalized bi-dimensional form may be proposed by considering the multiplication rule for independent probabilities (say for x-causes and y-effects), tuned by the degree of reciprocal correlation through the presence of the Pearson coefficient, with the working form:
$$
f(x, y) = \frac{1}{2\pi \sigma_x \sigma_y \sqrt{1 - r_{xy}^2}} \exp\left\{ -\frac{1}{2\left( 1 - r_{xy}^2 \right)} \left[ \frac{(x - \bar{x})^2}{\sigma_x^2} + \frac{(y - \bar{y})^2}{\sigma_y^2} - 2 r_{xy} \frac{(x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y} \right] \right\}
\tag{53}
$$
Note that when the x- and y-distributions are really independent, i.e. with $r_{xy} = 0$, it reduces to the factorization of the distribution functions of the associated probability fields
$$
f_{r_{xy}=0}(x, y) = \left( \frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\left[ -\frac{(x - \bar{x})^2}{2\sigma_x^2} \right] \right) \left( \frac{1}{\sqrt{2\pi}\,\sigma_y} \exp\left[ -\frac{(y - \bar{y})^2}{2\sigma_y^2} \right] \right) = f(x) f(y)
\tag{54}
$$
in the same manner as the covariance behavior, previously quoted in the note (23), with the same meaning: when the causes and effects are not at all correlated they may be true simultaneously, thus abolishing any ordering hierarchy between them. Nevertheless, to better emphasize the cause role of the x-variable, the conditioned distribution function may be considered as the ratio of the bi-variate normal probability (53) reduced (normalized) by that corresponding to the cause probability (51); it better models the degree to which the effect probability arises when the manifestation of the cause is certain (and precedes it). The effect probability function conditioned by the cause appearance is thus successively written as:
$$
\begin{aligned}
g(y|x) &= \frac{f(x, y)}{f(x)} \\
&= \frac{1}{\sqrt{2\pi}\,\sigma_y \sqrt{1 - r_{xy}^2}} \exp\left\{ -\frac{1}{2(1 - r_{xy}^2)} \left[ \frac{(x - \bar{x})^2}{\sigma_x^2} + \frac{(y - \bar{y})^2}{\sigma_y^2} - 2 r_{xy} \frac{(x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y} \right] + \frac{(x - \bar{x})^2}{2\sigma_x^2} \right\} \\
&= \frac{1}{\sqrt{2\pi}\,\sigma_y \sqrt{1 - r_{xy}^2}} \exp\left\{ -\frac{1}{2(1 - r_{xy}^2)} \left[ \frac{(x - \bar{x})^2}{\sigma_x^2} + \frac{(y - \bar{y})^2}{\sigma_y^2} - 2 r_{xy} \frac{(x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y} - \frac{(1 - r_{xy}^2)(x - \bar{x})^2}{\sigma_x^2} \right] \right\} \\
&= \frac{1}{\sqrt{2\pi}\,\sigma_y \sqrt{1 - r_{xy}^2}} \exp\left\{ -\frac{1}{2(1 - r_{xy}^2)} \left[ r_{xy}^2 \frac{(x - \bar{x})^2}{\sigma_x^2} + \frac{(y - \bar{y})^2}{\sigma_y^2} - 2 r_{xy} \frac{(x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y} \right] \right\} \\
&= \frac{1}{\sqrt{2\pi}\,\sigma_y \sqrt{1 - r_{xy}^2}} \exp\left\{ -\frac{1}{2(1 - r_{xy}^2)} \left[ \frac{y - \bar{y}}{\sigma_y} - r_{xy} \frac{x - \bar{x}}{\sigma_x} \right]^2 \right\} \\
&= \frac{1}{\sqrt{2\pi}\,\sigma_y \sqrt{1 - r_{xy}^2}} \exp\left\{ -\frac{1}{2(1 - r_{xy}^2)\sigma_y^2} \left[ y - \bar{y} - r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x}) \right]^2 \right\}
\end{aligned}
\tag{55}
$$
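The chain of identities in eq. (55) can be verified pointwise: the ratio f(x, y)/f(x) should coincide with a one-dimensional normal law in y whose mean is shifted by the cause and whose width is reduced by the correlation. The sketch below does this check; all function names, parameter values, and test points are hypothetical.

```python
import math


def normal_pdf(x, mean, sigma):
    """One-dimensional normal distribution, eq. (51)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def bivariate_pdf(x, y, mx, my, sx, sy, r):
    """Bi-dimensional normal distribution, eq. (53)."""
    zx, zy = (x - mx) / sx, (y - my) / sy
    norm = 2.0 * math.pi * sx * sy * math.sqrt(1.0 - r * r)
    return math.exp(-(zx * zx + zy * zy - 2.0 * r * zx * zy) / (2.0 * (1.0 - r * r))) / norm


def conditional_by_ratio(x, y, mx, my, sx, sy, r):
    """First line of eq. (55): g(y|x) = f(x, y) / f(x)."""
    return bivariate_pdf(x, y, mx, my, sx, sy, r) / normal_pdf(x, mx, sx)


def conditional_closed_form(x, y, mx, my, sx, sy, r):
    """Last line of eq. (55): a normal law in y with shifted mean and reduced width."""
    shifted_mean = my + r * sy / sx * (x - mx)
    reduced_sigma = sy * math.sqrt(1.0 - r * r)
    return normal_pdf(y, shifted_mean, reduced_sigma)


# Hypothetical parameters and evaluation points.
params = dict(mx=1.0, my=-2.0, sx=0.5, sy=2.0, r=0.6)
points = [(0.8, -1.0), (1.4, -3.5), (1.0, -2.0)]
max_gap = max(
    abs(conditional_by_ratio(x, y, **params) - conditional_closed_form(x, y, **params))
    for x, y in points
)
```

The two expressions should agree to rounding error at every point, for any sub-unitary correlation coefficient.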
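It may, for instance, be used to check the instantaneous computed/predicted effect, providing successively the expressions
$$
\begin{aligned}
y^{comp} &= \langle y \rangle_{g(y|x)} = \int y\, g(y|x)\,dy \\
&= \underbrace{\int_{-\infty}^{+\infty} \left[ y - \bar{y} - r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x}) \right] g(y|x)\, d\!\left( y - \bar{y} - r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x}) \right)}_{=0} \\
&\quad + \left[ \bar{y} + r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x}) \right] \underbrace{\int_{-\infty}^{+\infty} g(y|x)\, d\!\left( y - \bar{y} - r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x}) \right)}_{=1} \\
&= \bar{y} + r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x})
\end{aligned}
\tag{56}
$$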
until recovering the perturbative form (44b) proved earlier within the linear regression context. However, being convinced by eq. (56) of the usefulness of the conditioned probability function (55), one may use it for computing an important statistical quantity, the minimum of the sum of squared errors, or the sum of residues SR in observing the effects from a set of causes, which is practically equivalent to the variational calculation of eq. (34). Indeed, through the following successive identities
$$
\begin{aligned}
SR &= \min \sum_i \left( y_i^{obs} - y_i^{comp} \right)^2
= \left\langle \left( y_i^{obs} - y_i^{comp} \right)^2 \right\rangle_{g(y|x)}
= \left\langle \left[ y - \bar{y} - r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x}) \right]^2 \right\rangle_{g(y|x)} \\
&= \int \left[ y - \bar{y} - r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x}) \right]^2 g(y|x)\, d\!\left( y - \bar{y} - r_{xy} \frac{\sigma_y}{\sigma_x} (x - \bar{x}) \right) \\
&= \left( 1 - r_{xy}^2 \right) \sigma_y^2
\end{aligned}
\tag{57}
$$
one gets the equation
$$
\frac{SR}{\sigma_y^2} = 1 - r_{xy}^2
\tag{58}
$$
leaving with the so-called standard or statistical correlation factor
$$
R = \sqrt{1 - \frac{SR}{\sigma_y^2}} = \sqrt{1 - \frac{\sum_i \left( y_i^{obs} - y_i^{comp} \right)^2}{\sum_i \left( y_i^{obs} - \bar{y} \right)^2}}
\tag{59}
$$
The result of eq. (59), although formally equivalent to the Pearson correlation factor (28b), adds a very important feature: it describes the cause-effect correlation only through the observed and computed effects, thereby hiding the causes in the instantaneous computed/predicted effects obtained from the effects-causes regression equation. Such a formulation is of prime importance and use in evaluating correlation factors when multi-regression analysis is employed, given the presence there of many-cause probabilities and correlations, a problem that is avoided when the correlation factor is based only on computed and observed effects, as formula (59) displays. Nevertheless, variants of it may be formulated, for instance the corrected correlation factor that accounts for the dimensions of the cause and effect vectors (states), i.e. the cardinals M and N of Table 1; this is relevant only for refining applicative discussions, however, while here we restrict ourselves to presenting and commenting on the fundamental statistical regressions. Along this line, the multi-linear correlation is analytically exposed next, to complete the statistical presentation of the cause-effect correlation paradigm for structure-activity modeling.
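The formal equivalence between eq. (59) and eq. (28b) can be illustrated on a one-cause example. In the sketch below the function names and the data are hypothetical; the statistical factor is computed from observed and computed effects only, and then compared with the Pearson coefficient built from causes and effects.

```python
import math


def linear_fit(xs, ys):
    """Slope and free term from eqs. (35) and (36)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sum_xx - sum_x ** 2
    return (n * sum_xy - sum_x * sum_y) / denominator, (sum_y * sum_xx - sum_x * sum_xy) / denominator


def pearson(xs, ys):
    """Pearson coefficient in the uniform-probability form of eq. (28b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    numerator = n * sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y
    dx = n * sum(x * x for x in xs) - sum_x ** 2
    dy = n * sum(y * y for y in ys) - sum_y ** 2
    return numerator / math.sqrt(dx * dy)


def statistical_r(ys_obs, ys_comp):
    """Statistical correlation factor of eq. (59), using effects only."""
    mean_obs = sum(ys_obs) / len(ys_obs)
    residual = sum((yo - yc) ** 2 for yo, yc in zip(ys_obs, ys_comp))
    total = sum((yo - mean_obs) ** 2 for yo in ys_obs)
    return math.sqrt(1.0 - residual / total)


# Hypothetical cause/effect recordings.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_obs = [2.0, 4.5, 5.5, 8.5, 9.5]

a, b = linear_fit(x, y_obs)
y_comp = [a * xi + b for xi in x]
big_r = statistical_r(y_obs, y_comp)
r_xy = pearson(x, y_obs)
```

Because `statistical_r` never touches the causes, the same function can be reused unchanged when the computed effects come from a multi-linear model.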
2.5. Multilinear Correlation

The many-variable correlation problem may be summarized as finding the b parameters of the instantaneous equation
$$
Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_M X_M
\tag{60}
$$
when knowing the set of independent (x's) and observed dependent (y) variables of Table 2.

Table 2. The realization of Table 1 within the uniform probability for the evaluated (selected) causes of the Xk columns and the observed effects of the Y column, respectively

Y     X0    X1     …    Xk     …    XM
y1    1     x11    …    x1k    …    x1M
y2    1     x21    …    x2k    …    x2M
…     …     …      …    …      …    …
yk    1     xk1    …    xkk    …    xkM
…     …     …      …    …      …    …
yN    1     xN1    …    xNk    …    xNM
While eq. (60) is recognized as being associated with the computed instantaneous effect (activity), when the observed counterparts are considered the corresponding errors appear, so that the system (61) is fulfilled.
$$
\begin{cases}
y_1^{obs} = b_0 + b_1 x_{11} + b_2 x_{12} + \dots + b_M x_{1M} + e_1 \\
y_2^{obs} = b_0 + b_1 x_{21} + b_2 x_{22} + \dots + b_M x_{2M} + e_2 \\
\dots \\
y_N^{obs} = b_0 + b_1 x_{N1} + b_2 x_{N2} + \dots + b_M x_{NM} + e_N
\end{cases}
\tag{61}
$$
The minimization of the sum of squared errors from (61) with respect to each of the searched parameters looks like
$$
\begin{cases}
\dfrac{\partial}{\partial b_0} \left[ \sum_{i=1}^{N} e_i^2 \right] = 0 \\[6pt]
\dfrac{\partial}{\partial b_1} \left[ \sum_{i=1}^{N} e_i^2 \right] = 0 \\
\dots \\
\dfrac{\partial}{\partial b_M} \left[ \sum_{i=1}^{N} e_i^2 \right] = 0
\end{cases}
\tag{62a}
$$
as a generalization of the linear variational procedure of system (34a); it unfolds analytically first as
$$
\begin{cases}
-2 \sum_{i=1}^{N} \left[ y_i - (b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_M x_{iM}) \right] \cdot 1 = 0 \\
-2 \sum_{i=1}^{N} \left[ y_i - (b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_M x_{iM}) \right] \cdot x_{i1} = 0 \\
\dots \\
-2 \sum_{i=1}^{N} \left[ y_i - (b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_M x_{iM}) \right] \cdot x_{iM} = 0
\end{cases}
\tag{62b}
$$
which can then be rearranged in the form
$$
\begin{cases}
\sum_{i=1}^{N} y_i = b_0 N + b_1 \sum_{i=1}^{N} x_{i1} + b_2 \sum_{i=1}^{N} x_{i2} + \dots + b_M \sum_{i=1}^{N} x_{iM} \\
\sum_{i=1}^{N} y_i x_{i1} = b_0 \sum_{i=1}^{N} x_{i1} + b_1 \sum_{i=1}^{N} x_{i1}^2 + \dots + b_M \sum_{i=1}^{N} x_{iM} x_{i1} \\
\dots \\
\sum_{i=1}^{N} y_i x_{iM} = b_0 \sum_{i=1}^{N} x_{iM} + b_1 \sum_{i=1}^{N} x_{i1} x_{iM} + \dots + b_M \sum_{i=1}^{N} x_{iM}^2
\end{cases}
\tag{62c}
$$
Yet, since the last system has to be solved for the b coefficients, a general (formal) solution may be furnished by recognizing it as the formal matrix equation
$$
[X]^T [Y] = [X]^T [X] [B]
\tag{63}
$$
with the notations:
$$
[Y] = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}, \quad
[X] = \begin{pmatrix}
1 & x_{11} & x_{12} & \dots & x_{1M} \\
1 & x_{21} & x_{22} & \dots & x_{2M} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{N1} & x_{N2} & \dots & x_{NM}
\end{pmatrix}, \quad
[B] = \begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_M \end{pmatrix}, \quad
[E] = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_N \end{pmatrix}
\tag{64}
$$
Equation (63) can nevertheless be obtained directly by reconsidering the system (61), rewritten with the notations (64) in the matrix form
$$
[Y] = [X][B] + [E]
\tag{65}
$$
upon which the optimization condition (62a) now formally becomes:
$$
\frac{\partial}{\partial [B]} \left( [E]^T [E] \right) = 0
\tag{66}
$$
and where $[X]^T$ stands for the transpose of the $[X]$ matrix. In any case, the solution of eq. (63) is immediately abstracted as
$$
[B] = \left( [X]^T [X] \right)^{-1} [X]^T [Y]
\tag{67}
$$
in which the factor $([X]^T[X])^{-1}[X]^T$ is often called the Moore-Penrose (pseudo-inverse) matrix. Yet, although elegantly obtained, it involves the matrix inversion operation, which may be quite cumbersome for higher dimensions of the observed effects through the selected causes; it may also suffer from indeterminacy in cases in which the matrix inverse does not exist or presents singularities. However, it allows for computer routines and is implemented in the majority of statistical packages. Since its structure is somewhat hidden, the solution (67) should be checked for the linear-regression case against the known analytical solution given by eqs. (35) and (36). For this special case the system (61) reduces to the simple one
$$
\begin{cases}
y_1^{obs} = b + a x_1 + e_1 \\
y_2^{obs} = b + a x_2 + e_2 \\
\dots \\
y_N^{obs} = b + a x_N + e_N
\end{cases}
\tag{68}
$$
whereas the matrices involved in the general solution (67) are now shaped as
$$
[Y] = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}, \quad
[X] = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{pmatrix}, \quad
[B] = \begin{pmatrix} b = b_0 \\ a = b_1 \end{pmatrix}
\tag{69}
$$
Therefore, we first construct the matrix to be inverted, namely
$$
[A] = [X]^T [X] =
\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \end{pmatrix}
\begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{pmatrix}
=
\begin{pmatrix} N & \sum_i^N x_i \\ \sum_i^N x_i & \sum_i^N x_i^2 \end{pmatrix}
\tag{70}
$$
whose determinant is immediately yielded
$$
\det[A] = N \sum_i^N x_i^2 - \left( \sum_i^N x_i \right)^2
\tag{71}
$$
while also providing the cofactors (signed minor determinants)
$$
\tilde{A}_{11} = (-1)^{1+1} \sum_i^N x_i^2 = \sum_i^N x_i^2,
\tag{72a}
$$
$$
\tilde{A}_{12} = (-1)^{1+2} \sum_i^N x_i = -\sum_i^N x_i,
\tag{72b}
$$
$$
\tilde{A}_{21} = (-1)^{2+1} \sum_i^N x_i = -\sum_i^N x_i,
\tag{72c}
$$
$$
\tilde{A}_{22} = (-1)^{2+2} N = N
\tag{72d}
$$
entering the matrix
$$
[A]^* = \begin{pmatrix} \tilde{A}_{11} & \tilde{A}_{12} \\ \tilde{A}_{21} & \tilde{A}_{22} \end{pmatrix}
= \begin{pmatrix} \sum_i^N x_i^2 & -\sum_i^N x_i \\ -\sum_i^N x_i & N \end{pmatrix}
\tag{73}
$$
All in all, the inverse of the matrix (70) is obtained as
$$
[A]^{-1} = \frac{[A]^*}{\det[A]} =
\begin{pmatrix}
\dfrac{\sum_i^N x_i^2}{N \sum_i^N x_i^2 - \left( \sum_i^N x_i \right)^2} & \dfrac{-\sum_i^N x_i}{N \sum_i^N x_i^2 - \left( \sum_i^N x_i \right)^2} \\[10pt]
\dfrac{-\sum_i^N x_i}{N \sum_i^N x_i^2 - \left( \sum_i^N x_i \right)^2} & \dfrac{N}{N \sum_i^N x_i^2 - \left( \sum_i^N x_i \right)^2}
\end{pmatrix}
\tag{74}
$$
which together with the other matrix product
$$
[X]^T [Y] =
\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}
=
\begin{pmatrix} \sum_i^N y_i \\ \sum_i^N x_i y_i \end{pmatrix}
\tag{75}
$$
constructs the Moore-Penrose solution for the mono-linear regression
$$
[B] = [A]^{-1} \left( [X]^T [Y] \right) =
\begin{pmatrix}
\dfrac{\sum_i^N x_i^2 \sum_i^N y_i - \sum_i^N x_i \sum_i^N x_i y_i}{N \sum_i^N x_i^2 - \left( \sum_i^N x_i \right)^2} \\[10pt]
\dfrac{-\sum_i^N x_i \sum_i^N y_i + N \sum_i^N x_i y_i}{N \sum_i^N x_i^2 - \left( \sum_i^N x_i \right)^2}
\end{pmatrix}
= \begin{pmatrix} b_0 \\ b_1 \end{pmatrix}
= \begin{pmatrix} b \\ a \end{pmatrix}
\tag{76}
$$
recovering by its components the solutions (35) and (36). It is however clear from this presentation that as the linear regression is extended to many more (Xi) causes, the complexity and difficulty of expressing the solution factors analytically increase; moreover, such an algorithm involves no reference to, or control of, the orthogonality or independence among the (Xi) causes, or hides them too deeply to produce a meaningful quantum interpretation of the ligand-receptor specific interaction. With these, the statistical fundamentals for treating and understanding the quantitative (regression) structure (cause)-activity (effect) relationships are exposed; they contain the “germs” of an alternative, and in some respects generalized, algebraic treatment of correlation, which assesses an analytical pattern towards quantization for the ligand-receptor specific interaction, as will be presented in the sequel.
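For more than one cause the normal equations (63) are usually solved numerically. The sketch below builds $[X]^T[X]$ and $[X]^T[Y]$ explicitly and solves the system by Gauss-Jordan elimination with pivoting instead of forming the inverse of eq. (67); this choice, the function names, and the data are assumptions of the illustration, not the procedure of any particular statistical package.

```python
def normal_equations(rows, ys):
    """Build [X]^T[X] and [X]^T[Y] of eq. (63); the unity column X0 is prepended."""
    design = [[1.0] + list(row) for row in rows]
    m = len(design[0])
    xtx = [[sum(d[p] * d[q] for d in design) for q in range(m)] for p in range(m)]
    xty = [sum(d[p] * y for d, y in zip(design, ys)) for p in range(m)]
    return xtx, xty


def solve(matrix, rhs, tolerance=1e-12):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tolerance:
            raise ValueError("singular [X]^T[X]: the causes are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit(rows, ys):
    """Coefficients (b0, b1, ..., bM) solving eq. (63)."""
    return solve(*normal_equations(rows, ys))


# Hypothetical two-cause data, with effects built as y = 1 + 2*x1 - 3*x2.
causes = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
effects = [-3.0, 2.0, -5.0, 0.0, -7.0]
b_multi = fit(causes, effects)

# One-cause check against the closed forms (35) and (36).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.5, 5.5, 8.5, 9.5]
b_single = fit([[xi] for xi in x], y)   # expected (b, a) = (0.3, 1.9)
```

The `ValueError` branch corresponds to the indeterminacy mentioned for eq. (67): when one cause is a linear combination of the others the matrix $[X]^T[X]$ cannot be inverted.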
3. Algebraic QSAR

3.1. Multivariate Spectral Regression on Hilbert Space

The key concept in the SAR discussion regards the independence of the considered structural parameters in Table 3. As a consequence, we may further employ this feature to quantify the basic SAR through an orthogonal space. The idea is to transform the columns of structural data of Table 3 into an abstract orthogonal space, where necessarily all predictor variables are independent, solve the SAR problem there, and then refer the result back to the initial data by means of a coordinate transformation.

Table 3. The vectorial descriptors in a Spectral-SAR analysis

Activity           Structural predictor variables
|Y_OBS(ERVED)⟩     |X0⟩    |X1⟩    …    |Xk⟩    …    |XM⟩
y1-OBS             1       x11     …    x1k     …    x1M
y2-OBS             1       x21     …    x2k     …    x2M
⋮                  ⋮       ⋮       ⋮    ⋮       ⋮    ⋮
yN-OBS             1       xN1     …    xNk     …    xNM
Since QSAR models aim at correlations between the molecular structures concerned and the measured (or otherwise evaluated) activity, it appears natural for the structure part of the problem to be accommodated within quantum theory and its formalisms. In fact, there are a few quantum characters that we use within the present approach [94]:
• Any molecular structural state (dynamical, since it undergoes interaction with the organism) may be represented by a |ket⟩ state vector, in an abstract space of allowed states within Hilbert space, following the ⟨bra|ket⟩ Dirac formalism [95]; such states are to be represented here by any reliable molecular index or, in particular in our study, by hydrophobicity |LogP⟩, polarizability |POL⟩, and total optimized energy |E_tot⟩, just to name the so-called Hansch parameters, usually employed for accounting for the diffusion, electrostatic, and steric effects, respectively, of molecules acting within organisms' cells.
• The (quantum) superposition principle, which assures that sum combinations of molecular states map onto another resulting molecular state, here interpreted as bio-, eco- or toxico-logical activity, e.g. |Y⟩ = |Y0⟩ + C_LogP |LogP⟩ + C_POL |POL⟩ + ..., with |Y0⟩ meaning the free or unperturbed activity (when all other influences are absent).
• The orthogonalization feature of quantum states, a crucial condition for the superimposed molecular states to generate another molecular state (here quantified as molecular-linking organism activity); analytically, the orthogonalization condition is represented by the ⟨bra|ket⟩ scalar product of two envisaged states (molecular indices): if its value is evaluated to be zero, ⟨bra|ket⟩ = 0, then the states are said to be orthogonal and the molecular descriptors independent, therefore suitable to be added as states in the resulting activity state and as molecular indices in the activity correlation.
Further details on the scalar product and related properties are given in Appendix A1, while in what follows the Spectral-SAR correlation method is summarized. The analytical procedure is unfolded in three fundamental steps.
Step 1

Given a set of N molecules studied against the biological activity they produce, by means of their M structural indicators, all input information (the states) may be vectorially expressed by the columns of Table 3 and correlated by the equation
$$
\begin{aligned}
|Y_{OBS(ERVED)}\rangle &= b_0 |X_0\rangle + b_1 |X_1\rangle + \dots + b_k |X_k\rangle + \dots + b_M |X_M\rangle + |\text{prediction error}\rangle \\
&= |Y_{PRED(ICTED)}\rangle + |\text{prediction error}\rangle
\end{aligned}
\tag{77}
$$
with the unity vector $|X_0\rangle = |1\ 1\ \dots\ 1_N\rangle$ added to account for the free term.
In order for equation (77) to represent a reliable model of the given activities, the assumed molecular states (indices) should constitute an orthogonal set, this constraint having a quantum mechanical foundation, as described above. However, unlike other important studies addressing this problem [67-80], the Spectral-SAR employed here assumes the prediction error vector in eq. (77) to be orthogonal to all the others from the beginning, since it cannot be considered input data like the others,
$$
\langle Y_{PRED} \,|\, \text{prediction error} \rangle = 0
\tag{78}
$$
being not known a priori, before any correlation is made. Moreover, from eqs. (77) and (78) it follows that the prediction error vector has to be orthogonal to all the other descriptor states of the predicted activity,
$$
\langle X_{i=\overline{0,M}} \,|\, \text{prediction error} \rangle = 0
\tag{79}
$$
for consistency of the present vectorial approach (quantum formalized by means of |ket⟩ states). In other terms, conditions (78) and (79) confirm the form (77) in the sense that the prediction error vector and the predicted activity $|Y_{PRED}\rangle$ (with all its sub-intended states $|X_{i=\overline{0,M}}\rangle$) belong to disjoint (thus orthogonal) Hilbert spaces; or, even more, one can say that the Hilbert space of the observed activity $|Y_{OBS}\rangle$ may be decomposed into predicted and error independent Hilbert sub-spaces of states. Therefore, within the Timisoara Spectral-SAR procedure the very first orthogonalization step is the orthogonalization of the prediction error vector to the predicted activity and its predictor states, while the remaining orthogonalization algorithm does not search for optimizing the minimization of errors, but for producing the ideal correlation between $|Y_{PRED}\rangle$ and the given descriptors $|X_{i=\overline{0,M}}\rangle$.
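Step 2

Next, the Gram-Schmidt orthogonalization algorithm is applied by constructing the orthogonal set of descriptors by means of the established iteration [96-98]:
$$
|\Omega_0\rangle = |X_0\rangle,
\tag{80a}
$$
$$
|\Omega_k\rangle = |X_k\rangle - \sum_{i=0}^{k-1} r_i^k |\Omega_i\rangle, \quad
r_i^k = \frac{\langle X_k | \Omega_i \rangle}{\langle \Omega_i | \Omega_i \rangle}, \quad k = \overline{1, M}
\tag{80b}
$$
providing the orthogonal correlation:
$$
|Y_{PRED}\rangle = \omega_0 |\Omega_0\rangle + \omega_1 |\Omega_1\rangle + \dots + \omega_k |\Omega_k\rangle + \dots + \omega_M |\Omega_M\rangle, \quad
\omega_k = \frac{\langle \Omega_k | Y \rangle}{\langle \Omega_k | \Omega_k \rangle}, \quad k = \overline{0, M}
\tag{81}
$$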
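The iteration of eqs. (80a)-(80b) and the spectral coefficients of eq. (81) can be sketched as follows. The function names, the dictionary layout used for the coefficients $r_i^k$, and the sample descriptors are assumptions made for this illustration only.

```python
def dot(u, v):
    """Scalar product <u|v> of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(columns):
    """Orthogonalize the descriptor columns |X_k> following eqs. (80a)-(80b).

    Returns the orthogonal vectors |Omega_k> and a dict r[(i, k)] = r_i^k.
    """
    omegas, r = [], {}
    for k, x_k in enumerate(columns):
        omega_k = list(x_k)
        for i, omega_i in enumerate(omegas):
            r[(i, k)] = dot(x_k, omega_i) / dot(omega_i, omega_i)
            omega_k = [a - r[(i, k)] * b for a, b in zip(omega_k, omega_i)]
        omegas.append(omega_k)
    return omegas, r


def spectral_coefficients(omegas, y):
    """Expansion coefficients omega_k of eq. (81)."""
    return [dot(o, y) / dot(o, o) for o in omegas]


# Hypothetical data: unity column X0 and two descriptors X1, X2.
x0 = [1.0] * 5
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [-3.0, 2.0, -5.0, 0.0, -7.0]   # built as 1 + 2*x1 - 3*x2

omegas, r = gram_schmidt([x0, x1, x2])
w = spectral_coefficients(omegas, y)
y_pred = [sum(w[k] * omegas[k][n] for k in range(3)) for n in range(5)]
```

Since the activity in this example lies exactly in the span of the descriptors, the orthogonal expansion reproduces it without any prediction error; with real data the difference between `y` and `y_pred` would be the error vector, orthogonal to every $|\Omega_k\rangle$.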
Step 3

Remarkably, while the studies dedicated to the orthogonal problem usually stop at this stage, the Spectral-SAR uses it to provide the solution for the originally searched correlation, eq. (77), with the error vector orthogonal to the predicted activity and to all its predictor states of Table 3. This can be adequately achieved by rearranging eqs. (80) and (81) so that the system of all descriptors of Table 3 is written in terms of the orthogonal descriptors:
$$
\begin{cases}
|Y_{PRED}\rangle = \omega_0 |\Omega_0\rangle + \omega_1 |\Omega_1\rangle + \dots + \omega_k |\Omega_k\rangle + \dots + \omega_M |\Omega_M\rangle \\
|X_0\rangle = 1 \cdot |\Omega_0\rangle + 0 \cdot |\Omega_1\rangle + \dots + 0 \cdot |\Omega_k\rangle + \dots + 0 \cdot |\Omega_M\rangle \\
|X_1\rangle = r_0^1 |\Omega_0\rangle + 1 \cdot |\Omega_1\rangle + \dots + 0 \cdot |\Omega_k\rangle + \dots + 0 \cdot |\Omega_M\rangle \\
\dots \\
|X_k\rangle = r_0^k |\Omega_0\rangle + r_1^k |\Omega_1\rangle + \dots + 1 \cdot |\Omega_k\rangle + \dots + 0 \cdot |\Omega_M\rangle \\
\dots \\
|X_M\rangle = r_0^M |\Omega_0\rangle + r_1^M |\Omega_1\rangle + \dots + r_k^M |\Omega_k\rangle + \dots + 1 \cdot |\Omega_M\rangle
\end{cases}
\tag{82}
$$
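The system (82) has a non-trivial (orthogonal) solution if and only if the associated extended determinant vanishes; this condition introduces the Spectral-SAR determinant and its equation [81-83]:
$$
\begin{vmatrix}
|Y_{PRED}\rangle & \omega_0 & \omega_1 & \cdots & \omega_k & \cdots & \omega_M \\
|X_0\rangle & 1 & 0 & \cdots & 0 & \cdots & 0 \\
|X_1\rangle & r_0^1 & 1 & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
|X_k\rangle & r_0^k & r_1^k & \cdots & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
|X_M\rangle & r_0^M & r_1^M & \cdots & r_k^M & \cdots & 1
\end{vmatrix} = 0
\tag{83}
$$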
If the determinant of eq. (83) is expanded on it first column, and the result rearranged so that to have YPRED
on left side and the rest of states/indicators on the right side the searched
QSAR solution of the initial problem of eq. (77) is obtained as Spectral-SAR vectorial expansion (from where the “spectral” name is justified as well) with the error vector already absorbed in the orthogonalization procedure. In fact Spectral-SAR procedure uses the double conversion passages: one forward from the given problem of eq. (77) to the orthogonal one of eq. (81) in which the error vector is orthogonally “dissolved”; and the reverse one, back from the orthogonal to the real descriptors throughout the system (82), leaving with the determinant (83) to be expanded as the QSAR solution. The result is that now QSAR/Spectral-SAR equation is delivered directly by the determinant (83) and not through matrices products as in statistical Pearson approach, see Section 2.5, while furnishing directly the Spectral-SAR correlation equation and not only the parameters of multi-variate correlation [8-15]. Moreover, the Spectral-SAR algorithm is invariant also to the order of descriptors chosen in orthogonalization procedure, providing equivalent determinants just with rearranged lines, a matter that was not previously achieved by other orthogonalization techniques [67-80]. Remarkably, apart from being conceptually new through considering the spectral (orthogonal) expansion of the input data space (of both activity and descriptors) throughout
Timisoara Spectral – Structure Activity Relationship (Spectral-SAR) Algorithm
565
the system (82), the present method also has the computational advantage of being simpler than the classical “standard” statistical way of treating SAR problem previously exposed. That because, one has nothing to do with computations of matrix of the coefficients (64) or (67), this being a quite involving and time consuming procedure for higher dimensional systems. Instead, one can write directly the Spectral-SAR solution (equation) as the expansion of a (M+2)-dimensional determinant of eq. (83) whose components are the activity and structural vectors involving the Gram-Schmidt and the spectral decomposition coefficients,
ri k and ω k , respectively. However, although different from the mathematical procedure, both standard- and spectral-SAR give similar results due to the theorem that states that [96]: if the matrix X, as that from (64), with dimension N×(M+1), N>M+1, has linear independent columns, i.e. they are orthogonal as in the spectral approach, then there exists an unique matrix [Q] of dimension N×(M+1) with orthogonal columns and a triangular matrix [R] of dimension (M+1)×(M+1) with the elements of the principal diagonal equal with 1, as identified in the first small determinant in eq. (83), so that the matrix [X] can be factorized as
$$[X] = [Q][R] \qquad (84)$$
Combining equation (84) with the optimal equation (63), straightforward algebra shows that the vector [B] of estimates takes the form
$$[B] = \left([Q]^T[Q]\right)^{-1}[Q]^T[Y] \qquad (85)$$
in close agreement with the previous normal form, see equation (67). However, comparing the matrices $[X]^T[X]$ and $[Q]^T[Q]$ of equations (67) and (85), respectively, it is clear that the latter is diagonal and therefore much easier to handle (i.e. to invert) when searching for the vector [B] of SAR coefficients. It is nevertheless worth checking the equivalence of the present spectral method with the standard statistical one by specializing the general problem (77) to the linear case
$$|Y_{PRED}\rangle = b_0|X_0\rangle + b_1|X_1\rangle \qquad (86)$$
and verifying that the spectral equation (83) then unfolds into the parameters of linear regression given by eqs. (35) and (36). In this case we deal with the particular equation
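$$0 = \begin{vmatrix} |Y_{PRED}\rangle & \omega_0 & \omega_1 \\ |X_0\rangle & 1 & 0 \\ |X_1\rangle & r_0^1 & 1 \end{vmatrix} = |Y_{PRED}\rangle \begin{vmatrix} 1 & 0 \\ r_0^1 & 1 \end{vmatrix} - |X_0\rangle \begin{vmatrix} \omega_0 & \omega_1 \\ r_0^1 & 1 \end{vmatrix} + |X_1\rangle \begin{vmatrix} \omega_0 & \omega_1 \\ 1 & 0 \end{vmatrix} \qquad (87)$$
which is immediately rearranged as
$$|Y_{PRED}\rangle = \underbrace{\left(\omega_0 - r_0^1\omega_1\right)}_{b}|X_0\rangle + \underbrace{\omega_1}_{a}|X_1\rangle \qquad (88)$$
so that the present coefficients are identified with the previous linear ones of eqs. (35) and (36):
$$a = \omega_1, \quad b = \omega_0 - r_0^1\omega_1 \qquad (89)$$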
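The expansion of the determinant (87) along its first column can be checked numerically. The following is a minimal sketch in plain Python, not part of the original algorithm description: the function names `det2` and `linear_spectral_coefficients` are hypothetical, and the three kets are treated purely symbolically, only their scalar cofactors being computed.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def linear_spectral_coefficients(w0, w1, r01):
    """Expand the 3x3 determinant of eq. (87) along its first (vector) column.

    Rows of the determinant: (|Y_PRED>, w0, w1), (|X0>, 1, 0), (|X1>, r01, 1).
    Returns (b, a) such that |Y_PRED> = b|X0> + a|X1>, as in eq. (88).
    """
    scalar_rows = [[w0, w1], [1.0, 0.0], [r01, 1.0]]
    # Minor of each first-column entry: drop its row, keep the other two.
    minor_y = det2([scalar_rows[1], scalar_rows[2]])
    minor_x0 = det2([scalar_rows[0], scalar_rows[2]])
    minor_x1 = det2([scalar_rows[0], scalar_rows[1]])
    # 0 = minor_y|Y> - minor_x0|X0> + minor_x1|X1>, solved for |Y>.
    b = minor_x0 / minor_y
    a = -minor_x1 / minor_y
    return b, a


# Illustrative (made-up) spectral coefficients.
b, a = linear_spectral_coefficients(w0=2.0, w1=3.0, r01=0.5)
print(a, b)
```

With these illustrative inputs the returned pair reproduces eq. (89), i.e. a equals the first-order spectral coefficient and b equals the zeroth-order one corrected by the Gram-Schmidt coefficient.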
To evaluate the expressions of (89) within the Spectral-SAR algorithm, it is instructive to extract from Table 3 only the variables relevant here, with their individual entries conveniently arranged as columns:
$$\begin{array}{ccc} |Y_{PRED}\rangle & |X_0\rangle & |X_1\rangle \\ \hline y_1 & 1 & x_1 \\ y_2 & 1 & x_2 \\ \vdots & \vdots & \vdots \\ y_N & 1 & x_N \end{array}$$
Other working tools are the zero-th and the first orthogonal vectors, accordingly considered and computed respectively as
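$$|\Omega_0\rangle = |1 \; 1 \; \ldots \; 1_N\rangle, \qquad (90a)$$
$$|\Omega_1\rangle = |X_1\rangle - r_0^1|\Omega_0\rangle = |x_1 \; x_2 \; \ldots \; x_N\rangle - \frac{1}{N}\sum_{i=1}^{N} x_i \, |1 \; 1 \; \ldots \; 1\rangle = \left| x_1 - \frac{1}{N}\sum_{i=1}^{N} x_i \;\; \ldots \;\; x_N - \frac{1}{N}\sum_{i=1}^{N} x_i \right\rangle \qquad (90b)$$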
with the help of coefficient
$$r_0^1 = \frac{\langle X_1|\Omega_0\rangle}{\langle \Omega_0|\Omega_0\rangle} = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad (91)$$
specialized from the general definition (80b). In the same manner, the other spectral coefficients of the general orthogonal recipe (81) are now computed for the linear regression, the zeroth-order contribution being
$$\omega_0 = \frac{\langle \Omega_0|Y\rangle}{\langle \Omega_0|\Omega_0\rangle} = \frac{1}{N}\sum_{i} y_i \qquad (92)$$
while the first-order one recovers precisely the previous linear slope of eq. (35):
$$\omega_1 = \frac{\langle \Omega_1|Y\rangle}{\langle \Omega_1|\Omega_1\rangle} = \frac{\left\langle x_1 - N^{-1}\sum_j x_j \;\ldots\; x_N - N^{-1}\sum_j x_j \,\middle|\, y_1 \;\ldots\; y_N \right\rangle}{\sum_i \left(x_i - N^{-1}\sum_j x_j\right)^2} = \frac{\sum_i y_i\left(x_i - N^{-1}\sum_j x_j\right)}{\sum_i \left(x_i - N^{-1}\sum_j x_j\right)^2}$$
$$= \frac{\sum_i y_i x_i - N^{-1}\left(\sum_i y_i\right)\left(\sum_i x_i\right)}{\sum_i \left[x_i^2 + N^{-2}\left(\sum_j x_j\right)^2 - 2N^{-1}x_i\sum_j x_j\right]} = \frac{N\sum_i y_i x_i - \left(\sum_i y_i\right)\left(\sum_i x_i\right)}{N\sum_i x_i^2 - \left(\sum_i x_i\right)^2} = a \qquad (93)$$
as prescribed by the correspondence of (89). Additionally, its companion, the free-term coefficient of relationship (88), may now be directly evaluated as
$$b = \omega_0 - r_0^1\omega_1 = \frac{1}{N}\sum_i y_i - \frac{1}{N}\left(\sum_i x_i\right)\frac{N\sum_i y_i x_i - \left(\sum_i y_i\right)\left(\sum_i x_i\right)}{N\sum_i x_i^2 - \left(\sum_i x_i\right)^2} = \frac{\left(\sum_i y_i\right)\left(\sum_i x_i^2\right) - \left(\sum_i x_i\right)\left(\sum_i y_i x_i\right)}{N\sum_i x_i^2 - \left(\sum_i x_i\right)^2} \qquad (94)$$
thus also regaining the linear free term previously computed as eq. (36) by means of the variational statistical procedure (optimization of the sum of squared errors). It is therefore clear that the Timisoara Spectral-SAR algebraic methodology not only recovers the standard statistical QSAR routine in full detail but also generalizes it analytically, towards a better assessment of mechanistic ordering and influences in practical eco- and bio-logical applications.
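The equivalence derived in eqs. (90)-(94) can be illustrated numerically. The sketch below, in plain Python, is an illustration under stated assumptions rather than the authors' implementation: the function names and the four data points are hypothetical. It computes the Gram-Schmidt coefficient and the two spectral coefficients for one descriptor and compares the resulting slope and intercept with the closed-form least-squares expressions.

```python
def dot(u, v):
    """Scalar product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def spectral_linear_fit(x, y):
    """Slope a and free term b via the spectral route of eqs. (90)-(92), (89)."""
    n = len(x)
    omega0_vec = [1.0] * n                                   # eq. (90a)
    r01 = dot(x, omega0_vec) / dot(omega0_vec, omega0_vec)   # eq. (91)
    omega1_vec = [xi - r01 for xi in x]                      # eq. (90b)
    w0 = dot(omega0_vec, y) / dot(omega0_vec, omega0_vec)    # eq. (92)
    w1 = dot(omega1_vec, y) / dot(omega1_vec, omega1_vec)    # first line of eq. (93)
    return w1, w0 - r01 * w1                                 # eq. (89)


def least_squares_fit(x, y):
    """Slope and free term from the closed forms ending eqs. (93) and (94)."""
    n = len(x)
    sx, sy, sxy, sxx = sum(x), sum(y), dot(x, y), dot(x, x)
    denom = n * sxx - sx ** 2
    return (n * sxy - sx * sy) / denom, (sy * sxx - sx * sxy) / denom


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
print(spectral_linear_fit(x, y))
print(least_squares_fit(x, y))
```

Both routes return the same pair of coefficients up to floating-point rounding, which is the content of eqs. (93) and (94).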
3.2. Algebraic Correlation Factor
Let us next explore whether the present spectral regression allows another correlation index to be defined, beyond the standard statistical one given by eq. (59) [94]. One starts with the simple connection between the observed, predicted and error vectors of eq. (77), specialized however to their individual entries:
$$Y_{i-OBS} = Y_{i-PRED} + pe_i \qquad (95)$$
where “pe” stands for “prediction error”. Then, by squaring relation (95),
$$Y_{i-OBS}^2 = Y_{i-PRED}^2 + pe_i^2 + 2Y_{i-PRED}\cdot pe_i \qquad (96)$$
and summing over all N working molecules (of Table 3),
$$\sum_{i=1}^{N} Y_{i-OBS}^2 = \sum_{i=1}^{N} Y_{i-PRED}^2 + \sum_{i=1}^{N} pe_i^2 + 2\sum_{i=1}^{N} Y_{i-PRED}\cdot pe_i \qquad (97)$$
the last relation simplifies to:
$$\sum_{i=1}^{N} Y_{i-OBS}^2 = \sum_{i=1}^{N} Y_{i-PRED}^2 + \sum_{i=1}^{N} pe_i^2 \qquad (98)$$
based on applying the scalar product definition (2) and the prediction error orthogonality condition (78) to the last term of (97), i.e.
$$\sum_{i=1}^{N} Y_{i-PRED}\cdot pe_i = \langle Y_{PRED}|pe\rangle = 0 \qquad (99)$$
Now, substituting the prediction error values of (95) into the remaining expression (98), one first gets:
$$\sum_{i=1}^{N} Y_{i-OBS}^2 = \sum_{i=1}^{N} Y_{i-PRED}^2 + \sum_{i=1}^{N} \left(Y_{i-OBS} - Y_{i-PRED}\right)^2 \qquad (100)$$
or the equivalent identity
$$\sum_{i=1}^{N} Y_{i-PRED}^2 = \sum_{i=1}^{N} Y_{i-OBS}\cdot Y_{i-PRED} \qquad (101)$$
which can be further rewritten, recalling the norm and scalar product definitions of eqs. (2)-(4), as:
$$\left\| |Y_{PRED}\rangle \right\|^2 = \langle Y_{OBS}|Y_{PRED}\rangle \qquad (102)$$
Finally, the Cauchy-Schwarz form (10) is applied to the right-hand side of (102), noting that the observed and predicted activities are of the same nature for a given molecule (i.e. either both positive or both negative), so that their scalar product is positive; with these, the relation (102) immediately becomes the inequality:
$$\left\| |Y_{PRED}\rangle \right\|^2 \leq \left\| |Y_{OBS}\rangle \right\| \cdot \left\| |Y_{PRED}\rangle \right\| \qquad (103)$$
leaving the hierarchy of the predicted and observed norms
$$\left\| |Y_{PRED}\rangle \right\| \leq \left\| |Y_{OBS}\rangle \right\| \qquad (104)$$
that guarantees a consistent probability definition while introducing the algebraic correlation factor in the form:
$$R_A \equiv r_{ALGEBRAIC} = \frac{\left\| |Y_{PRED}\rangle \right\|}{\left\| |Y_{OBS}\rangle \right\|} \leq 1 \qquad (105)$$
It remains to compare this new correlation factor, written algebraically as the ratio of the predicted to the observed norms of the investigated molecular activity (or of their effects), with the customary statistical counterpart given by eq. (59); this issue is addressed in what follows.
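The chain (99)-(105) can be verified on a small example. The following plain-Python sketch is only illustrative: the data and the helper names `fit_line` and `algebraic_correlation` are hypothetical, and the predicted activities are assumed to come from an ordinary least-squares line with a free term, for which the orthogonality condition (99) holds.

```python
import math


def dot(u, v):
    """Scalar product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fit_line(x, y):
    """Least-squares slope and free term (closed forms of eqs. (93)-(94))."""
    n = len(x)
    denom = n * dot(x, x) - sum(x) ** 2
    slope = (n * dot(x, y) - sum(x) * sum(y)) / denom
    intercept = (sum(y) * dot(x, x) - sum(x) * dot(x, y)) / denom
    return slope, intercept


def algebraic_correlation(y_obs, y_pred):
    """Ratio of predicted to observed norms, eq. (105)."""
    return math.sqrt(dot(y_pred, y_pred) / dot(y_obs, y_obs))


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y_obs = [2.1, 3.9, 6.2, 7.8]
slope, intercept = fit_line(x, y_obs)
y_pred = [intercept + slope * xi for xi in x]
prediction_error = [o - p for o, p in zip(y_obs, y_pred)]

print(dot(y_pred, prediction_error))             # eq. (99): vanishes up to rounding
print(dot(y_pred, y_pred), dot(y_obs, y_pred))   # eq. (102): the two agree
print(algebraic_correlation(y_obs, y_pred))      # eq. (105): not larger than 1
```

The first printed value is zero within rounding error, the next two coincide, and the last one stays below unity, as required by eqs. (99), (102) and (105), respectively.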
3.3. Algebraic vs. Statistic Correlations
Timisoara Theorem (on the algebraic Spectral-SAR correlation): for any QSAR analysis, once the measured/observed and the computed/predicted activity data are considered as the vectors $|Y_{OBS}\rangle$ and $|Y_{PRED}\rangle$, with the associated norms given through the scalar products of eqs. (2)-(4), the algebraic norm ordering (104), valid in defining the algebraic correlation factor (105), also sets the hierarchy between the correlation factors, in the sense that the algebraic one always equals or exceeds the standard statistical correlation factor (59):
$$r_{S-SAR}^{ALGEBRAIC} \geq r_{QSAR}^{STATISTIC} \qquad (106)$$
Proof: by straightforward algebraic translation, the condition (106) is first rewritten as:
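$$\sqrt{\frac{\langle Y_{PRED}|Y_{PRED}\rangle}{\langle Y_{OBS}|Y_{OBS}\rangle}} \geq \sqrt{1 - \frac{\langle Y_{OBS} - Y_{PRED}|Y_{OBS} - Y_{PRED}\rangle}{\langle Y_{OBS} - \bar{Y}_{OBS}|Y_{OBS} - \bar{Y}_{OBS}\rangle}} \qquad (107)$$
where we have introduced the averaged observed activity
$$\bar{Y}_{OBS} = \frac{1}{N}\sum_{i=1}^{N} y_{i-OBS} \qquad (108)$$
and its associated N-dimensional vector (state in Hilbert space):
$$|\bar{Y}_{OBS}\rangle = \left(\frac{1}{N}\sum_{i=1}^{N} y_{i-OBS}\right)|1 \; 1 \; \ldots \; 1_N\rangle \qquad (109)$$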
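Note that the inequality (107) becomes an equality in the case of perfect identity between the observed and predicted activity values, i.e. perfect correlation, in which case the second term on the right-hand side vanishes while the left-hand side becomes unity. For all other (non-perfect) correlations the strict inequality holds; this case is considered next, in the equivalent form
$$\langle Y_{PRED}|Y_{PRED}\rangle \langle Y_{OBS} - \bar{Y}_{OBS}|Y_{OBS} - \bar{Y}_{OBS}\rangle > \langle Y_{OBS}|Y_{OBS}\rangle \left[\langle Y_{OBS} - \bar{Y}_{OBS}|Y_{OBS} - \bar{Y}_{OBS}\rangle - \langle Y_{OBS} - Y_{PRED}|Y_{OBS} - Y_{PRED}\rangle\right] \qquad (110a)$$
which may be further rearranged as
$$\left[\langle Y_{PRED}|Y_{PRED}\rangle - \langle Y_{OBS}|Y_{OBS}\rangle\right]\left[\langle Y_{OBS}|Y_{OBS}\rangle - 2\langle Y_{OBS}|\bar{Y}_{OBS}\rangle + \langle \bar{Y}_{OBS}|\bar{Y}_{OBS}\rangle\right] + \langle Y_{OBS}|Y_{OBS}\rangle\left[\langle Y_{OBS}|Y_{OBS}\rangle - 2\langle Y_{OBS}|Y_{PRED}\rangle + \langle Y_{PRED}|Y_{PRED}\rangle\right] > 0 \qquad (110b)$$
At this point, after obvious simplifications and factorization, one can readily recognize and employ both the identity (102) and the ordering (104), specific to the algebraic correlation,
$$2\langle Y_{PRED}|Y_{PRED}\rangle \langle Y_{OBS}|Y_{OBS}\rangle - 2\langle Y_{OBS}|Y_{OBS}\rangle \underbrace{\langle Y_{OBS}|Y_{PRED}\rangle}_{\langle Y_{PRED}|Y_{PRED}\rangle} + \underbrace{\left[\langle Y_{OBS}|Y_{OBS}\rangle - \langle Y_{PRED}|Y_{PRED}\rangle\right]}_{\geq 0}\left[2\langle Y_{OBS}|\bar{Y}_{OBS}\rangle - \langle \bar{Y}_{OBS}|\bar{Y}_{OBS}\rangle\right] > 0 \qquad (110c)$$
so that the simplified expression is obtained
$$2\langle Y_{OBS}|\bar{Y}_{OBS}\rangle > \langle \bar{Y}_{OBS}|\bar{Y}_{OBS}\rangle \qquad (111a)$$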
that is finally made explicit, with the aid of the average-activity vector (109), through the unfolded scalar products
$$2\sum_{i=1}^{N}\left(y_{i-OBS}\,\frac{1}{N}\sum_{j=1}^{N} y_{j-OBS}\right) > \sum_{i=1}^{N}\left(\frac{1}{N}\sum_{j=1}^{N} y_{j-OBS}\right)\left(\frac{1}{N}\sum_{j=1}^{N} y_{j-OBS}\right) \qquad (111b)$$
leaving the equivalent strict inequality
$$\frac{2}{N}\left(\sum_{i=1}^{N} y_{i-OBS}\right)^2 > \frac{1}{N}\left(\sum_{i=1}^{N} y_{i-OBS}\right)^2 \qquad (111c)$$
fully satisfied by the natural ordering 2 > 1. Therefore, both the (qualitative) simplicity and the (quantitative) superiority of the algebraic correlation factor have been proved. Many applications confirm these statements on dedicated molecular-biological or molecular-ecotoxicological cases. In what follows, one modern bi-component molecular system, that of ionic liquids (IL) and their toxicological action, is discussed in its paradigmatic form.
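The ordering (106) can be observed on a toy data set. The sketch below is hedged in two respects: it assumes that the statistical factor of eq. (59) has the form appearing on the right-hand side of (107), and it uses made-up observed activities together with the predictions of their least-squares line (hard-coded here for brevity). The function names are hypothetical.

```python
import math


def statistical_correlation(y_obs, y_pred):
    """Right-hand side of (107): sqrt(1 - residual sum / total sum of squares).

    Assumes the model explains at least the mean, so the radicand is >= 0.
    """
    mean_obs = sum(y_obs) / len(y_obs)
    residual_ss = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    total_ss = sum((o - mean_obs) ** 2 for o in y_obs)
    return math.sqrt(1.0 - residual_ss / total_ss)


def algebraic_correlation(y_obs, y_pred):
    """Left-hand side of (107): ratio of predicted to observed norms."""
    return math.sqrt(sum(p * p for p in y_pred) / sum(o * o for o in y_obs))


# Made-up observed activities and the predictions of their least-squares line.
y_obs = [2.1, 3.9, 6.2, 7.8]
y_pred = [2.09, 4.03, 5.97, 7.91]

r_stat = statistical_correlation(y_obs, y_pred)
r_alg = algebraic_correlation(y_obs, y_pred)
print(r_stat, r_alg)
```

For this example the algebraic factor lies between the statistical one and unity, in line with (105) and (106).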
3.4. Spectral-SAR for Ionic Liquids
Since their emergence a decade ago, ionic liquids have had a constantly growing influence on organic, bio- and green chemistry, due to the unique physico-chemical properties manifested by their typical salt structure: a (generally) heterocyclic nitrogen-containing organic cation and an inorganic or organic anion [99], with melting points below 100 ºC and practically no vapor pressure [100]. The latter property allows them to replace conventional volatile organic compounds (VOCs) as far as atmospheric emissions are concerned, though they do present the serious drawback that a small amount of IL could enter the environment through groundwater [101]. This risk makes it necessary to perform further eco-toxicological studies of IL on various species, in order to improve the "design rules" for synthesizing IL with minimal toxicity to organisms integrated in the environment. Ionic liquids display variable stability towards moisture and variable solubility in water and in polar and nonpolar organic solvents [102]. Various values of ionic liquid hydrophobicity and polarity may be tailored [101] with the help of nucleoside chemistry [103] according to the main principles of green chemistry [104, 105]: new chemicals must be designed to preserve effectiveness of function while reducing toxicity, and must not persist in the environment at the end of their usage, but break down into inoffensive degradation products. In this respect, the costs of all approaches to sustainable product design can be reduced using SAR and QSAR methods [84, 85, 89]. It has already been proved that the antimicrobial activity of quaternary ammonium chlorides is lipophilicity-dependent [106]. While the 1-octanol-water partition coefficient can be seen only as a first approximation of compound lipophilicity, of bioaccumulation and toxicity in fish, and of sorption to soil and sediments, lipophilicity is assumed to be the main factor of antimicrobial activity [107].
Nevertheless, aiming at a deeper understanding of the specific mechanistic description of IL eco-toxicity, it is worth considering that the ionic liquid properties are more comprehensively quantified through lipophilicity, polarizability and total energy, taken as a unitary complex of factors, in developing appropriate structure-activity relationship (SAR) studies. However, the main problem in setting up viable QSAR studies to predict ionic liquid toxicities concerns the anionic-cationic interaction superimposed on the anionic and cationic subsystems that make up the ionic liquid. There are basically two complementary ways of attaining this goal. One may search for special rules for assessing the anionic-cationic structural indices separately from the individual anionic and cationic ones, and then generate the QSAR models. Yet, because the cationic and anionic effects on ionic liquid toxicity are at the moment studied only separately, the appropriate strategy would be first to derive the anionic and cationic QSARs and only then to move on to a QSAR of the ionic liquid viewed as an anionic-cationic interaction. As recently communicated [89], when the ionic liquid activity is evaluated, two different additive models of the anionic-cationic interaction can be examined. The first one, named the |1+> model, is based on the vectorial summation of the anionic and cationic biological effects |Y>, i.e. on the superposition of the anionic (subscripted with A) and cationic (subscripted with C) activities [84]:
$$|Y_{AC}\rangle^{|1+\rangle} = |Y_C\rangle + |Y_A\rangle \qquad (112)$$
The second S-SAR model, named |0+>, is employed when the additive stage is considered at the level of the examined Hansch factors $|X\rangle = |LogP\rangle, |POL\rangle, |E_{TOT}\rangle$, which are first combined to produce the anionic-cationic (subscripted with AC) indices that are then used to produce the spectral mechanistic map of the interaction concerned [85]:
$$|Y_{AC}\rangle^{|0+\rangle} = \hat{O}_{S-SAR}|0+\rangle = \hat{O}_{S-SAR}\, f\left(\{|X_A\rangle\},\{|X_C\rangle\}\right) \qquad (113)$$
with the particular specifications of the spectral vectors:
$$f(LogP_A, LogP_C) \equiv LogP_{AC} = \log\left(e^{LogP_A} + e^{LogP_C}\right) \in \{|X_{1AC}\rangle\}, \qquad (114a)$$
$$f(POL_A, POL_C) \equiv POL_{AC} = \left(POL_A^{1/3} + POL_C^{1/3}\right)^3 \in \{|X_{2AC}\rangle\} \;\; [Å^3], \qquad (114b)$$
$$f(E_A, E_C) \equiv E_{AC} = E_A + E_C - 627.71\,\frac{q_A q_C}{POL_{AC}^{1/3}} \in \{|X_{3AC}\rangle\} \;\; [kcal/mol] \qquad (114c)$$
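The combination rules (114a)-(114c) translate directly into code. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the logarithm in (114a) is taken as the natural one (the source does not state the base), polarizabilities are assumed positive and expressed in Å³, energies in kcal/mol, and the charges as multiples of the elementary charge.

```python
import math

COULOMB_PREFACTOR = 627.71  # numerical prefactor quoted in eq. (114c)


def logp_ac(logp_a, logp_c):
    """Anionic-cationic lipophilicity, eq. (114a); natural log is an assumption."""
    return math.log(math.exp(logp_a) + math.exp(logp_c))


def pol_ac(pol_a, pol_c):
    """Anionic-cationic polarizability, eq. (114b); inputs must be positive."""
    return (pol_a ** (1.0 / 3.0) + pol_c ** (1.0 / 3.0)) ** 3


def e_ac(e_a, e_c, q_a, q_c, pol_a, pol_c):
    """Anionic-cationic total energy, eq. (114c)."""
    radius_like = pol_ac(pol_a, pol_c) ** (1.0 / 3.0)
    return e_a + e_c - COULOMB_PREFACTOR * q_a * q_c / radius_like


# Made-up descriptor values, for illustration only.
print(logp_ac(0.0, 0.0))
print(pol_ac(1.0, 8.0))
print(e_ac(-100.0, -200.0, -1.0, 1.0, 1.0, 8.0))
```

Note that for oppositely charged ions the last term of (114c) is positive with the sign convention written in the source, so the combined energy lies above the plain sum of the ionic energies.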
The open issue is whether the |0+> and |1+> states yield the same results, or in which aspects they might differ, for the IL ecotoxicity on certain species. Nevertheless, a practical criterion for deciding between the activity and the structure additivity models, i.e. between eqs. (112) and (114), respectively, may be set with respect to the so-called ionic liquid internal angle between the anionic and cationic activity vectors, with components $y_{iA}$, $y_{iC}$, $i = \overline{1,N}$, abstracted from the general definition (30b), following the prescription [89]:
$$\cos\theta_{AC} = \frac{\sum_{i=1}^{N} y_{iC}\, y_{iA}}{\sqrt{\sum_{i=1}^{N} y_{iC}^2}\,\sqrt{\sum_{i=1}^{N} y_{iA}^2}} \;\; \begin{cases} \geq 0.707107 & \ldots \; |0+\rangle \; MODEL \\ < 0.707107 & \ldots \; |1+\rangle \; MODEL \end{cases} \qquad (115)$$
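The decision rule (115) amounts to comparing the cosine of the angle between the anionic and cationic activity vectors with the rounded value of cos 45°. A minimal plain-Python sketch follows; the function names and the sample activity vectors are hypothetical.

```python
import math

THRESHOLD = 0.707107  # value quoted in eq. (115), i.e. cos(45 degrees) rounded


def cos_theta_ac(y_anion, y_cation):
    """Cosine of the ionic liquid internal angle, eq. (115)."""
    scalar = sum(a * c for a, c in zip(y_anion, y_cation))
    norm_a = math.sqrt(sum(a * a for a in y_anion))
    norm_c = math.sqrt(sum(c * c for c in y_cation))
    return scalar / (norm_a * norm_c)


def additive_model(y_anion, y_cation):
    """Return the model label prescribed by eq. (115)."""
    return "|0+>" if cos_theta_ac(y_anion, y_cation) >= THRESHOLD else "|1+>"


# Made-up activity vectors, for illustration only.
print(additive_model([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # nearly parallel vectors
print(additive_model([1.0, 0.0], [0.0, 1.0]))            # orthogonal vectors
```

Nearly parallel anionic and cationic activities thus select the structure-additive |0+> model, while weakly aligned ones select the activity-additive |1+> model.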
The presented S-SAR-IL models have already been illustrated by studying the ecotoxicity recorded on the aquatic species Vibrio fischeri and Daphnia magna and on the electric eel for a set of tested ionic liquids, chosen so as to contain a wide variety of heads, side chains and anions. In this way, the present methodology may be extended over a wide range of organisms, towards designing specific eco-toxicological ionic liquid batteries [87,108].
4. Spectral-SAR Paths and Quantum-SAR Maps
Having presented in depth the way in which structure-activity correlations may be obtained from N recorded activities viewed as effects of M structural causes, it remains to explore the combinatorics of the models (endpoints) obtained by considering different sets of predictor variables in Table 3. This is nothing other than the QSAR counterpart of what is known in quantum theory as the complete set of commutative operators (CoSCOpe), since in both cases one looks for the minimum (yet complete) set, of operators in quantum theory and of structural variables in QSAR, whose members are independent of, or orthogonal to, each other. Therefore, the discussion and analysis of the various ways in which a QSAR may be built from different structural indices implicitly or explicitly target the quantum description of the correlation space; here we try to show the first step in exploiting this possibility [109]. Given a set of N molecules, one can choose to correlate their observed activities $A_i$, $i = \overline{1,N}$, with M selected structural indicators in as many combinations as:
$$C = \sum_{k=1}^{M} C_M^k, \quad C_M^k = \frac{M!}{k!(M-k)!} \qquad (116a)$$
linked by different endpoint paths, as many as:
$$K = \prod_{k=1}^{M} C_M^k \qquad (116b)$$
counting the paths built from connected distinct models with orders (dimensions of correlation) from k=1 to k=M.
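The counts (116a) and (116b) are easily tabulated. The sketch below uses only the Python standard library (`math.comb` and `math.prod`, available from Python 3.8); the function names are hypothetical and the three descriptor labels merely echo the Hansch factors used in Section 3.4.

```python
from itertools import combinations
from math import comb, prod


def count_models(m):
    """Number C of descriptor combinations (endpoints), eq. (116a)."""
    return sum(comb(m, k) for k in range(1, m + 1))


def count_paths(m):
    """Number K of endpoint paths taking one model per order k, eq. (116b)."""
    return prod(comb(m, k) for k in range(1, m + 1))


def enumerate_models(descriptors):
    """List all non-empty descriptor subsets, ordered by correlation order k."""
    return [
        subset
        for k in range(1, len(descriptors) + 1)
        for subset in combinations(descriptors, k)
    ]


models = enumerate_models(["LogP", "POL", "ETOT"])
print(count_models(3), count_paths(3), len(models))
```

For M = 3 descriptors this gives 3 + 3 + 1 endpoints and 3 × 3 × 1 paths.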
Basically, for each of the C combinations a correlation (endpoint) QSAR equation is determined, say $|Y_l\rangle_{l=\overline{1,C}} = \{y_{il}\}_{i=\overline{1,N}}$, containing all computed activities for all considered N molecules within the l-th selected correlation. Note that the Spectral-SAR version of QSAR analysis computes these activities in a completely non-statistical way, i.e. by assuming vectors for both the observed (activities) and the unobserved (latent variables) quantities, while furnishing their correlation through the specific Spectral-SAR determinant, see eq. (83), obtained from the transformation matrix between the orthogonal (desirable) and oblique (input) correlations. Yet, besides producing essentially the same results as the statistical least-squares fit of residues, the Spectral-SAR method introduces new concepts, reviewed here within three families as follows [109]:
Family 1
The Spectral-SAR concepts:
• The endpoint (computed) spectral norm
$$\left\| |Y_l\rangle \right\| = \sqrt{\langle Y_l|Y_l\rangle} = \sqrt{\sum_{i=1}^{N} y_{il}^2}, \quad l = \overline{1,C} \qquad (117a)$$
allowing the unique assignment of a number to a specific type of correlation, i.e. a sort of summarized quantification of the models;
• The algebraic correlation factor of eq. (105), here rewritten as
$$R_{ALG,l} = \frac{\left\| |Y_l\rangle \right\|}{\left\| |A\rangle \right\|} = \sqrt{\frac{\sum_{i=1}^{N} y_{il}^2}{\sum_{i=1}^{N} A_i^2}}, \quad l = \overline{1,C} \qquad (117b)$$
viewed as the ratio of the spectral norm of the predicted activity to that of the measured one. It measures the overall (summed-up) potency of the computed activities with respect to the observed ones, rather than the local (individual) molecular distribution of activities around the mean, as the statistical factor does. It is thus a specific measure of the molecular selection under study, always with a value superior to that yielded by the statistical approach, while preserving the same hierarchy in a shrunk (less dispersive) manner, and is therefore better suited for intra-training-set molecular analysis.
Family 2
The QSAR map of endpoints [109]:
• The spectral path, with the distance defined in the Euclidean sense as:
$$[l, l'] = \sqrt{\left(\left\| |Y_l\rangle \right\| - \left\| |Y_{l'}\rangle \right\|\right)^2 + \left(R_l - R_{l'}\right)^2}, \quad \forall (l, l') = \overline{1,C} \qquad (118)$$
which allows complex information to be defined as path distances in the norm-correlation space, with the norms computed from eq. (117a) and the correlation factor free to be taken either statistically (local) or algebraically (global), eqs. (59) and (117b), respectively; note that, as far as the computed activity $Y_l$ corresponds to the measured activity $A_l$ defined as the logarithm of the inverse of the 50%-effect concentration (EC50), see below, both the moduli of the $|Y_l\rangle$ vectors and the R values have no units, which assures the consistency of eq. (118).
• The least spectral path principle, formally shaped as:
$$\delta[l_1, \ldots, l_k, \ldots, l_M] = 0; \quad l_1, \ldots, l_k, \ldots, l_M: \; ENDPOINTS \qquad (119)$$
which provides a practical tool for deciding the dominant $\{\alpha, \ldots\}$ hierarchies along the paths constructed by linking all possible k-models (i.e. models with k correlation factors) from the combinations of (116a), each selected once on a formed path, thus generating the so-called “ergodic paths containing M endpoints within the K-paths assembly” of (116b). The principle (119) is implemented recursively, by selecting the least distance computed through systematic application of eq. (118) on the ergodic paths; if, for instance, two paths are equal, the one whose first two models have the shorter norm difference is selected, in accordance with the natural least action; the procedure is repeated until all C models are connected on shortest paths. It has already been conjectured that only the first M shortest paths (called $\alpha_1, \ldots, \alpha_M$) need to be considered for a comprehensive (and self-consistent) mechanistic analysis [34-40].
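The selection of a least spectral path can be sketched as a brute-force search over the K ergodic paths. The code below is a simplified illustration, not the authors' recursive procedure: it returns only the single shortest path, assumes at least two correlation orders, and the endpoint labels, norms and correlation factors are invented for the example. The tie-break follows the rule stated above (shorter norm difference between the first two models).

```python
import math
from itertools import product


def distance(p, q):
    """Eq. (118): Euclidean distance between endpoints given as (norm, R) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def path_length(path, endpoints):
    """Sum of the distances between consecutive endpoints of a path."""
    return sum(distance(endpoints[a], endpoints[b]) for a, b in zip(path, path[1:]))


def least_path(models_by_order, endpoints):
    """Shortest ergodic path: one model per order k = 1..M (brute force)."""
    def key(path):
        first_norm_gap = abs(endpoints[path[0]][0] - endpoints[path[1]][0])
        return (path_length(path, endpoints), first_norm_gap)
    return min(product(*models_by_order), key=key)


# Hypothetical endpoints: label -> (spectral norm, correlation factor).
endpoints = {
    "Ia": (1.0, 0.50), "Ib": (2.0, 0.60),
    "IIa": (1.1, 0.70), "IIb": (3.0, 0.90),
    "III": (1.2, 0.95),
}
models_by_order = [["Ia", "Ib"], ["IIa", "IIb"], ["III"]]
print(least_path(models_by_order, endpoints))
```

Repeating the search after removing the models already connected would be one way to approach the recursive construction of the paths α1, ..., αM, which is left out of this sketch.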
Family 3
The Quantum-SAR indices and analysis [109]:
• The inter-endpoint norm difference (IEND)
$$\Delta Y_{ll'} = \left\| |Y_{l'}\rangle \right\| - \left\| |Y_l\rangle \right\|, \quad (l, l') \in \{\alpha_1, \ldots, \alpha_M\} \qquad (120)$$
which accounts for the norm differences of the models lying on the M shortest spectral paths linking M of the C models of eq. (116a);
• The inter-endpoint molecular activity difference (IEMAD)
$$\Delta A_{ij}^{ll'} = A_j^{l'} - A_i^{l} = \ln\frac{1}{(EC_{50})_j^{l'}} - \ln\frac{1}{(EC_{50})_i^{l}} = \ln\frac{(EC_{50})_i^{l}}{(EC_{50})_j^{l'}} \qquad (121)$$
which is taken as the activity difference between the fittest molecules (i, j), in the sense of minimum residues, for the models (l, l') belonging to the shortest paths $\alpha_1, \ldots, \alpha_M$, for which the inter-endpoint norm difference is given by eq. (120).
In this way, we can interpret the two fittest molecules (i, j) as reciprocally activated by the models (l, l') through the spectral path to which they belong; in analytical terms, the difference between the quantities of eqs. (120) and (121) provides the “jump” or transition activity that turns the effect of molecule i into that of molecule j across the least spectral (here revealed as metabolization) path connecting the models l and l':
$$\ln\frac{1}{q_{ij}^{ll'}} \equiv \Delta Y_{ll'} - \Delta A_{ij}^{ll'} \qquad (122)$$
Note that if we rearrange eq. (122) in terms of the 50%-effect concentrations of eq. (121), we get the wave-like form of the inter-molecular transformation of the molecular EC50:
$$(EC_{50})_i^{l} = (EC_{50})_j^{l'}\, q_{ij}^{ll'} \exp\left(i\Delta Y_{ll'}\right) \qquad (123)$$
provided that the analytic continuation in the complex plane of the IEND of eq. (120) is assumed, i.e. $\Delta Y_{ll'} \rightarrow i\Delta Y_{ll'}$, outside the factor $q_{ij}^{ll'}$. Remark that, although the differences in eqs. (120) and (121) were considered mathematically along the “arrow” i-to-j, the “quantum transformation” of eq. (123) suggests that the bio-chemical-physical equivalence (metabolization) of the concentration effects evolves from j to i, revealing a typical quantum behavior, with the factor $q_{ij}^{ll'}$ playing the propagator role, as the quantum kernels do in the path integral formulation of quantum mechanics [48]. In this way, we may assert that eq. (123) stands as the present “quantum”-SAR equation because:
• it involves the wave-type expression of the molecular effect of concentration, however for specially selected molecules (the fittest out of the C models) and for specially selected paths (the least ones of the M-ergodic assembly), M and C being related by eq. (116a);
• it provides the specific transition or transformation of the effect of a certain molecule into the effect of another special molecule out of the N trained molecules, paralleling the phenomenology of consecrated quantum transitions;
• it has the amplitude of transformation driven by the so-called quantum-SAR factor of exponential form
$$q_{ij}^{ll'} = \exp\left(\Delta A_{ij}^{ll'} - \Delta Y_{ll'}\right) \qquad (124)$$
defining the specific quantum-SAR wave;
• it allows the identity
$$(EC_{50})_i^{l} = (EC_{50})_i^{l} \qquad (125)$$
when the reverse effect is considered
$$(EC_{50})_j^{l'} = (EC_{50})_i^{l}\, \frac{1}{q_{ij}^{ll'}} \exp\left(-i\Delta Y_{ll'}\right) \qquad (126)$$
and substituted in the direct one (123), just as absorption and emission stand as reciprocal quantum effects;
• it has a “phase” with unity norm, in the same manner as ordinary quantum wave functions, allowing the inter-molecular “real” quantum-SAR transformation
$$(EC_{50})_i^{l} = q_{ij}^{ll'} \cdot (EC_{50})_j^{l'} \qquad (127)$$
exclusively regulated by the quantum-SAR factor of eq. (124), in the same fashion as quantum tunneling is characterized by the transmission coefficient;
• when multiple transformations take place across paths with multiple linked models, say (l, l', l''), the inter-molecular transformation i→j→t is characterized by the overall quantum-SAR factor (124) written as the product of the intermediary ones
$$q_{it}^{ll''} = q_{ij}^{ll'} \cdot q_{jt}^{l'l''} \qquad (128)$$
due to the two equivalent ways in which the $(EC_{50})_i^{l}$ effect may be described, either directly from t or intermediated by the j molecular effect transformation, respectively:
$$(EC_{50})_i^{l} = q_{it}^{ll''} \cdot (EC_{50})_t^{l''} = q_{ij}^{ll'} \cdot (EC_{50})_j^{l'} = q_{ij}^{ll'} \cdot \left(q_{jt}^{l'l''} \cdot (EC_{50})_t^{l''}\right) \qquad (129)$$
in the same way as the quantum propagators behave along quantum paths [48]; certainly, such a contraction scheme may be generalized to least paths connecting the M contained k-endpoints, giving an overall quantum-SAR (metabolization power) factor as:
$$q_{i_1 i_M}^{l_1 l_M} = \prod_{w=2}^{M} q_{i_{w-1} i_w}^{l_{w-1} l_w} \qquad (130)$$
• equation (123) supports the self-transformation as well, with the driving quantum-SAR factor given by:
$$q_{i\,j=i}^{ll'} = \exp\left(-\Delta Y_{ll'}\right) \qquad (131)$$
during its evolution along the least paths, when the same molecule (i=j) is metabolized by activating certain structural features (l≠l') through specific indicators (variables) in correlation (bindings with the receptor site); this case resembles the stationary quantum case, according to which molecular structures, even when isolated (or in free motion), undergo dynamical wave-corpuscular or fluctuating transformations along their quantum paths.
With the present Quantum-SAR methodology one can appropriately identify the molecular pairs that drive certain bio-/eco-activities against a given receptor by means of selected descriptors, in a formal “wave” or “quantum” mechanistic way. The ultimate goal will be the computation of the quantum-SAR factors along the least paths of action, which give the quantum-map information on the conversion power of the fittest molecules in their specific bindings [109]. This line of research is to be further applied and refined in the near future.
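The definitions (120), (121) and (124) and the composition rule (128) can be checked numerically. The plain-Python sketch below is illustrative only: the function names, the EC50 values and the endpoint norms are invented, and the activity is taken as the natural logarithm of the inverse EC50, as written in eq. (121).

```python
import math


def activity(ec50):
    """Activity as the logarithm of the inverse 50%-effect concentration."""
    return math.log(1.0 / ec50)


def q_factor(ec50_i, ec50_j, norm_l, norm_l_prime):
    """Quantum-SAR factor of eq. (124) for molecules (i, j) and models (l, l')."""
    delta_a = activity(ec50_j) - activity(ec50_i)   # IEMAD, eq. (121)
    delta_y = norm_l_prime - norm_l                 # IEND, eq. (120)
    return math.exp(delta_a - delta_y)


# Hypothetical fittest molecules i, j, t on the linked models l, l', l''.
ec50_i, ec50_j, ec50_t = 2.0, 0.5, 4.0
norm_l, norm_lp, norm_lpp = 10.0, 10.5, 11.2

q_ij = q_factor(ec50_i, ec50_j, norm_l, norm_lp)
q_jt = q_factor(ec50_j, ec50_t, norm_lp, norm_lpp)
q_it = q_factor(ec50_i, ec50_t, norm_l, norm_lpp)
q_self = q_factor(ec50_i, ec50_i, norm_l, norm_lp)

print(q_it, q_ij * q_jt)               # composition rule, eq. (128)
print(q_self, math.exp(-(norm_lp - norm_l)))   # self-transformation, eq. (131)
```

The product rule holds because both the activity differences and the norm differences add along the chain i→j→t, so their exponentials multiply.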
5. Conclusion
Paradoxically, the main problem of QSAR resides not in performing the correlation itself but in selecting the variables for it; the mathematical counterpart of this problem is known as “factor indeterminacy” [110-114], which affirms that the same degree of correlation may in principle be reached with an infinity of latent variable combinations. Fortunately, in chemical physics there is a limited (although large enough) number of indicators with a clear-cut meaning in molecular structure that allow reactivity and binding to be rationalized [115,116]. Even so, the “official” trend in employing QSAR methods, although undoubtedly useful, is to classify, over-classify and validate through prediction (on external or molecular test sets). A gap between the computed molecular orderings and the associated mechanistic role in bio-/eco-activity assessment will remain for as long as the QSAR strategy is not turned into a versatile tool for identifying the inter-molecular role in receptor binding sites from recorded activities by means of structurally selected common variables; that is, for using QSAR information for internal mechanistic predictions among the training molecules, to see their interrelation with respect to the whole class of observed activities employed for a specific correlation. Such an approach will also be helpful for checking the chemical domain spanned by the training molecules, a feature of paramount importance for further external tests as well.
Modern in silico (computational) chemical analysis of the bio-activity and availability of analogue substances, potentially beneficial or detrimental for specific interactions in organs and organisms, faces a paradoxical dichotomy. On one side, when searching for the best correlation useful for predicting a specific molecular bio- or eco-activity, QSAR models involving many un-interpretable latent variables may be produced, while the question of correlation factor indeterminacy always remains (i.e., the assumed descriptors can at any time be replaced with others producing at least the same correlation performance). On the other side, when the analysis is restricted to searching for molecular design and mechanisms by performing SARs with special structural indicators for a given class of relevant molecules, the price paid is a limited use of the generated models for further prediction.
The present review aims at filling this gap by deepening the modeling of inter-molecular activity through extending the main concepts of the recently developed Spectral-SAR [81-90,94,109], developing the fully algebraic version of the traditional statistically optimized QSAR picture, and targeting the quantification of the competition between molecular inter-activity and inter-endpoint records. As such, the present review was mainly oriented toward presenting and developing the second (Q)SAR facet by rationalizing the recently introduced notion of spectral-path-linking-endpoints and the associated least action principle for spectral path quantification, in terms of the best fitted molecules, along the computed models, by means of the introduced q(uantum)-SAR factor within the generally called Quantum-SAR (QuaSAR) methodology.
On the other hand, so-called green chemistry stands as a priority field of research, approached by the research programs of both the United States and the European Commission. Its goal is the characterization, prediction and control of the chemical structures acting as toxicants on organisms and the environment. The main reason for such research links economic, ecological and public health issues in a general paradigm: method → data → information → knowledge → use.
Within this epistemological chain, the method relates to the procedure involved in obtaining the experimental data and is regulated by chemical-physical and biological scientific laws; the data represent the chemicals and their toxic or carcinogenic values; information refers to the elaboration of models from the recorded data; knowledge means the prediction or the final model of the molecular action mechanisms; and use is defined by the legal boundaries for the toxic values or classes of chemicals admitted. In this context, the present Spectral-to-Quantum SAR project proposes an advanced study based on the epistemological bulk data-information-knowledge of the chemicals used in green chemistry, in order to assess: a specific model of quantum characterization of the concerned active substances at the bio-, eco- and pharmaco-logic levels, through a unitary formulation of the atomic-molecular indices for the effector-receptor binding degree potential of logistic type (including the temporal dependency); and a computationally consistent model aiming to minimize the residual recorded activities in experiments studying enzymic, ionic liquid, antagonist and allosteric inhibition interactions. The methodology allows patterning both the control and the design of new compounds for synthesis, in this way eventually covering also the method and use segments of the economic and social life of the XXI century.
Acknowledgments
The authors are truly indebted to Prof. Dr. Eduardo A. Castro from the National University of La Plata (UNLA) and the La Plata Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), Argentina, for the fruitful ideas exchanged on statistical and algebraic correlation analysis during his visit in the summer of 2009 at the Chemistry Department of West University of Timisoara, as well as to Dr. Francisco M. Fernández from UNLA-INIFTA for his useful follow-up comments on the orthogonality statements of the Spectral-SAR algorithm.
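Appendix: Common Poisson Integrals
• The 0th order Poisson integral:
$$I_0(a) = \int_{-\infty}^{+\infty} e^{-ax^2}dx = \sqrt{\frac{\pi}{a}}$$
Proof:
$$I_0^2(a) = \left(\int_{-\infty}^{+\infty} e^{-ax^2}dx\right)\left(\int_{-\infty}^{+\infty} e^{-ay^2}dy\right) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-a(x^2+y^2)}dx\,dy = \int_0^{\infty}\int_0^{2\pi} e^{-ar^2}r\,dr\,d\varphi$$
$$= \left(\int_0^{\infty} e^{-ar^2}r\,dr\right)\left(\int_0^{2\pi} d\varphi\right) = -\frac{2\pi}{2a}\int_0^{\infty} d\left(e^{-ar^2}\right) = -\frac{\pi}{a}\left.e^{-ar^2}\right|_0^{\infty} = \frac{\pi}{a}$$
• The 1st order Poisson integral:
$$I_1(a) = \int_{-\infty}^{+\infty} x e^{-ax^2}dx = 0$$
Proof:
$$I_1(a) = \int_{-\infty}^{+\infty} x e^{-ax^2}dx = -\frac{1}{2a}\int_{-\infty}^{+\infty} d\left(e^{-ax^2}\right) = -\frac{1}{2a}\left.e^{-ax^2}\right|_{-\infty}^{+\infty} = 0$$
• The 2nd order Poisson integral:
$$I_2(a) = \int_{-\infty}^{+\infty} x^2 e^{-ax^2}dx = \frac{1}{2a}\sqrt{\frac{\pi}{a}}$$
Proof:
$$I_2(a) = \int_{-\infty}^{+\infty} x\left(x e^{-ax^2}\right)dx = -\frac{1}{2a}\int_{-\infty}^{+\infty} x\frac{d}{dx}\left(e^{-ax^2}\right)dx = -\frac{1}{2a}\left[\int_{-\infty}^{+\infty}\frac{d}{dx}\left(x e^{-ax^2}\right)dx - \int_{-\infty}^{+\infty} e^{-ax^2}dx\right]$$
$$= -\frac{1}{2a}\int_{-\infty}^{+\infty} d\left(x e^{-ax^2}\right) + \frac{1}{2a}\sqrt{\frac{\pi}{a}} = -\frac{1}{2a}\underbrace{\left.\left(x e^{-ax^2}\right)\right|_{-\infty}^{+\infty}}_{(l'Hospital)\,\rightarrow\, 0} + \frac{1}{2a}\sqrt{\frac{\pi}{a}} = \frac{1}{2a}\sqrt{\frac{\pi}{a}}.$$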
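The three closed forms can be cross-checked by direct numerical quadrature. The following plain-Python sketch is only a sanity check under stated assumptions: the function names are hypothetical, a composite trapezoidal rule is used, and the infinite range is truncated at a finite cutoff beyond which the Gaussian integrand is negligible for the chosen value of a.

```python
import math


def integrate(f, lower, upper, steps=20000):
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for k in range(1, steps):
        total += f(lower + k * h)
    return total * h


def poisson_integral(order, a, cutoff=12.0):
    """Numerical estimate of the integral of x**order * exp(-a x^2) over the real line."""
    return integrate(lambda x: x ** order * math.exp(-a * x * x), -cutoff, cutoff)


a = 2.0  # illustrative value of the Gaussian parameter
print(poisson_integral(0, a), math.sqrt(math.pi / a))
print(poisson_integral(1, a), 0.0)
print(poisson_integral(2, a), math.sqrt(math.pi / a) / (2.0 * a))
```

Each printed pair agrees to within the quadrature and rounding error, supporting the three results of the Appendix for this value of a.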
References [1] [2]
Benfenati E., Predicting toxicity through computers: a changing world, Chem. Central J. 2007, 1:32, DOI: 10.1186/1752-153X-1-32 EPA US EPA AQUIRE (AQUatic toxicity Information REtrieval). U.S. Environmental Protection Agency.2002. ECOTOX User Guide: ECOTOXicology Database System.
Timisoara Spectral – Structure Activity Relationship (Spectral-SAR) Algorithm
[3]
[4]
[5]
[6]
[7]
[8] [9] [10] [11] [12] [13] [14] [15] [16]
581
Version 3.0 [http://www.epa.gov/ecotox/]. ECOTOX User Guide: ECOTOXicology Database System, 2002. SOMS (Strategy on Management of Substances) Ministry of Housing Spatial Planning and Environment; The Hague. http://www2.minvrom.nl/ Docs/internationaal/ soms_engels.pdf., 2001. European Commission. Regulation (EC) No. 1907/2006 of the European Parliament and of the Council of 18 Dec. 2006 concerning the registration, evaluation, authorisation and restriction of chemicals (REACH), establishing a European Chemicals Agency, amending directive 1999/45/EC and repealing Council Regulation (EC) No. 1488/94 as well as Council Directive 76/769/EEC and commission directives 91/155/EEC, 93/67/EEC, 93/105/EC and 2000/21/EC. Off. J. Eur. Union, L 396/1 of 30.12.2006; Office for Official Publication of the European Communities (OPOCE): Luxembourg, 2006. European Commission. Directive 2006/121/EC of the European Parliament and of the Council of 18 Dec. 2006 amending Council Directive 67/548/EEC on the approximation of laws, regulations and administrative provisions relating to the classification, packaging and labelling of dangerous substances in order to adapt it to Regulation (EC) No. 1907/2006 concerning the registration, evaluation, authorisation and restriction of chemicals (REACH) and establishing a european chemicals agency. Off. J. Eur. Union, L 396/850 of 30.12.2006; Office for Official Publication of the European Communities (OPOCE): Luxembourg, 2006. OECD, Report on the regulatory uses and applications in OECD member countries of (quantitative) structure-activity relationship [(Q)SAR] models in the assessment of new and existing chemicals. Organization of Economic Cooperation and Development: Paris, France, 2006; Available online: http://www.oecd.org/, accessed January 2009. OECD, Guidance document on the validation of (quantitative) structure-activity relationship [(Q)SAR] models. OECD series on testing and assessment No. 69. ENV/JM/MONO (2007) 2. 
Organization for Economic Cooperation and Development: Paris, France, 2007; Available online: http://www.oecd.org/, accessed January 2009. Anderson, T.W. An Introduction to Multivariate Statistical Methods; Wiley: New York, USA, 1958. Draper, N.R.; Smith, H. Applied Regression Analysis, Wiley: New York, USA, 1966. Shorter, J. Correlation Analysis in Organic Chemistry: An Introduction to Linear Free Energy Relationships; Oxford Univ. Press: London, UK, 1973. Box, G.E.P.; Hunter, W.G.; Hunter, J.S. Statistics for Experimenters; John-Wiley: New York, USA, 1978. Green, J.R.; Margerison, D. Statistical Treatment of Experimental Data; Elsevier: New York, USA, 1978. Topliss, J. Quantitative Structure-Activity Relationships of Drugs; Academic Press: New York, USA, 1983. Seyfel, J.K. QSAR and Strategies in the Design of Bioactive Compounds; VCH Weinheim: New York, USA, 1985. Chatterjee, S.; Hadi, A.S.; Price, B. Regression Analysis by Examples, 3rd Ed.; JohnWiley: New-York, USA, 2000. Worth, A.P.; Bassan, A.; Gallegos Saliner, A.; Netzeva, T.I.; Patlewicz, G.; Pavan, M.; Tsakovska, I.; Vracko, M. The characterization of quantitative structure-activity
582
[17]
[18]
[19]
[20] [21] [22] [23]
[24]
[25]
[26]
[27]
[28] [29] [30] [31] [32]
Mihai V. Putz and Ana-Maria Putz relationships: Preliminary guidance. European Commission - Joint Research Centre: Ispra, Italy, 2005; Available online: http://ecb.jrc.it/qsar/publications/, accessed January 2009. Worth, A.P.; Bassan, A.; Fabjan, E.; Gallegos Saliner, A.; Netzeva, T.I.; Patlewicz, G.; Pavan, M.; Tsakovska, I. The characterization of quantitative structure-activity relationships: Preliminary guidance. European Commission - Joint Research Centre: Ispra, Italy, 2005; Available online: http://ecb.jrc.it/qsar/publications/, accessed January 2009. Benigni, R.; Bossa, C.; Netzeva, T.I.; Worth, A.P. Collection and evaluation of [(Q)SAR] models for mutagenicity and carcinogenicity. European Commission - Joint Research Centre: Ispra, Italy, 2007; Available online: http://ecb.jrc.it/qsar/publications/, accessed January 2009. So, S.S.; Karpuls, M. Evolutionary optimisation in quantitative structure-activity relationship: An application of genetic neural network. J. Med. Chem. 1996, 39, 15211530. Kubinyi, H. Evolutionary variable selection in regression and PLS analysis. J. Chemometr. 1996, 10, 119-133. Teko, I.V.; Alessandro, V.A.E.P.; Livingston, D.J. Neutral network studies. 2. Variable selection. J. Chem. Inf. Comput. Sci. 1996, 36, 794-803. Kubinyi, H. Variable selection in QSAR studies. 1. An evolutionary algorithm. Quant. Struct.-Act. Relat. 1994, 13, 285-294. Haegawa, K.; Kimura, T.; Fanatsu, K. GA strategy for variable selection in QSAR Studies: Enhancement of comparative molecular binding energy analysis by GA-based PLS method. Quant. Struct.-Act. Relat. 1999, 18, 262-272. Zheng, W.; Tropsha, A. Novel variable selection quantitative structure-property relationship approach based on the k-nearest neighbour principle. J. Chem. Inf. Comput. Sci. 2000, 40, 185-194. Lučić, B.; Trinajstić, N. Multivariate regression outperforms several robust architectures of neural networks in QSAR modelling. J. Chem. Inf. Comput. Sci. 1999, 39, 121-132. 
Duchowicz, P.R.; Castro, E.A. The Order Theory in QSPR-QSAR Studies; Mathematical Chemistry Monographs, University of Kragujevac: Kragujevac, Serbia, 2008. Zhao, V.H.; Cronin, M.T.D.; Dearden, J.C. Quantitative structure-activity relationships of chemicals acting by non-polar narcosis - theoretical considerations. Quant. Struct.Act. Relat. 1998, 17, 131-138. Pavan, M.; Netzeva, T.; Worth, A.P. Review of literature based quantitative structureactivity relationship models for bioconcentration. QSAR Comb. Sci. 2008, 27, 21-31. Pavan, M.; Worth, A.P. Review of estimation models for biodegradation. QSAR Comb. Sci. 2008, 27, 32-40. Tsakovska, I.; Lessigiarska, I.; Netzeva, T.; Worth, A.P. A mini review of mammalian toxicity (Q)SAR models. QSAR Comb. Sci. 2008, 27, 41-48. Gallegos Saliner, A.; Patlewicz, G.; Worth, A.P. A review of (Q)SAR models for skin and eye irritation and corrosion. QSAR Comb. Sci. 2008, 27, 49-59. Patlewicz, G.; Aptula, A.; Roberts, D.W. Uriarte, E. A mini-review of available skin sensitization (Q)SARs/Expert systems. QSAR Comb. Sci. 2008, 27, 60-76.
Timisoara Spectral – Structure Activity Relationship (Spectral-SAR) Algorithm
583
[33] Netzeva, T.; Pavan, M.; Worth, A.P. Review of (quantitative) structure-activity relationship for acute aquatic toxicity. QSAR Comb. Sci. 2008, 27, 77-90. [34] Cronin, M.T.D.; Worth, A.P. (Q)SARs for predicting effects relating to reproductive toxicity. QSAR Comb. Sci. 2008, 27, 91-100. [35] Ogihara, N. Drawing out drugs. Mod. Drug Discovery 2003, 6 (9), 28-32. [36] Hansch, C.; Hoekman, D.; Gao, H. Comparative QSAR: toward a deeper understanding of chemicobiological interactions. Chem. Rev. 1996, 96, 1045-1075. [37] Kubinyi, H. Der Schlüssel zum Schloß I. Grundlagen der Arzneimittelwirkung. Pharmazie in unserer Zeit 1994, 23 Jahrg. Nr.3, 158-168. [38] Liwo, A.; Tarnowska, M.; Grzonka, Z. Tempczyk, A. Modified Free-Wilson method for the analysis of biological activity data. Computers Chem. 1992, 16, 1-9. [39] Schmidli, H. Multivariate prediction for QSAR. Chemometrics and Intelligent Laboratory Systems 1997, 37, 125-134. [40] Lhuguenot, J.-C. Relation quantitative structure-activité (QSAR): une méthode mal reconnue car trop souvent mal utilisée. Ann. Fals. Exp. Chim. 1995, 88, 293-310. [41] Crippen, G. M.; Bradley, M. P.; Richardson, W. W. Why are binding-site models more complicated than molecules? Perspectives in Drug Discovery and Design 1993, 1, 321328. [42] Kier, L.B.; Hall, L.H. Molecular Connectivity in Structure-Activity Analysis. Research Studies Press, Letchworth, 1986. [43] Balaban, A.T.; Motoc, I.; Bonchev, D.; Mekenyan, O. Topological indices for structureactivity correlations. Top. Curr. Chem. 1983, 114, 21-55. [44] Navia, M. A.; Peattie, D. A. Structure-based drug design: applications in immunopharmacology and immunosuppression. Immunology Today 1993, 14, 296-301. [45] Perkins, T. D. J.; Dean, P. M. An exploration of a novel strategy for superposing several flexible molecules. J. Comput.-Aided Mol. Design 1993, 7, 155-172. [46] Lemmen, C.; Lengauer, T. Time-efficient flexible superposition of medium-sized molecules. J. Comput.-Aided Mol. 
Design 1997, 11, 357-368. [47] Balaban, A. T.; Chiriac, A.; Motoc, I; Simon, Z. Steric Fit in QSAR; Springer, Berlin (Lecture Notes in Chemistry Series), 1980. [48] Simon, Z; Chiriac, A.; Holban, S.; Ciubotariu, D.; Mihalas, G. I. Minimum Steric Difference. The MTD Method for QSAR Studies; Res. Studies Press (Wiley), Letchworth, 1984. [49] Duda-Seiman C., Duda-Seiman D., Dragoş D., Medeleanu M., Careja V., Putz M.V., Lacrămă A.-M., Chiriac A., Nuţiu R., Ciubotariu D. Design of anti-HIV ligands by means of minimal topological difference (MTD) Method, Int. J. Mol. Sci. 2006, 7, 537555. [50] Cramer, R.D.III; Patterson, D.E.; Bunce, J.D. Comparative molecular field analysis (CoMFA). 1. Effect shape on binding of steroids to carrier proteins. J. Am. Chem. Soc. 1988, 110, 5959-5967. [51] Cramer, R.D.III; DePriest, S.A.; Patterson, D.E.; Hecht, P. The developing practice of comparative molecular field analysis. In 3D QSAR in Drug Design. Theory, Methods and Applications (ed. H. Kubinyi), Escom, Leiden, 1993, pp. 443-485. [52] Sun, J.; Chen, H.F.; Xia, H.R.; Yao, J.H.; Fan, B.T. Comparative study of factor Xa inhibitors using molecular docking/SVM/HQSAR/3D-QSAR methods. QSAR Comb. Sci. 2006, 25, 25-45.
584
Mihai V. Putz and Ana-Maria Putz
[53] Randić, M.; Jerman-Blazic, B.; Trinajstić, N. Development of 3-dimensional molecular descriptors. Comput. Chem. 1990, 14, 237-246. [54] Randić, M.; Razinger, M. Molecular topographic indices. J. Chem. Inf. Comput. Sci. 1995, 35, 140-147. [55] Manallack, D. T.; Livingstone, D. J. Artificial neural networks: application and chance effects for QSAR data analysis. Med. Chem. Res. 1992, 2, 181-190. [56] Manallack, D. T.; Livingstone, D. J Limitations of functional-link nets as applied to QSAR data analysis. Quant. Struct-Act. Relat. 1994, 13, 18-21. [57] Marchant, C. A.; Combes, R. D. Artificial intelligence: the use of computer methods in the prediction of metabolism and toxicity, in Bioactive Compound Design: Possibilities for Industrial Use, M. G. Ford, R. Greenwood (eds.), G. T. Brooks and R. Franke BIOS Scientific Publishers Limited, 1996. [58] Moriguchi, I.; Hirono, S.; Matsushita, Y.; Liu, Q.; Nakagome, I. Fuzzy adaptive least squares applied to structure-activity and structure-toxicity correlations. Chem. Pharm. Bull. 1992, 40, 930-934. [59] Moriguchi, I.; Hirono, S. Fuzzy adaptive least squares and its use in quantitative structure-activity relationships, in QSAR and Drug Design – New Developments and Applications, T. Fujita (ed.), Elsevier Science B. V., 1995. [60] Vapnik, V.N. Statistical Learning Theory, John Wiley & Sons, New York, 1998. [61] Vapnik, V.N. Estimation of Dependencies Based on Empirical Data, Springer-Verlag, Berlin, 1982. [62] Schölkpof, B.; Burges, C.J.C.; Smola, A.J. (eds.) Advances in Kernel Methods. Support Vector Learning. MIT Press, Cambridge, MA, 1999. [63] Schölkpof, B.; Smola, A.J. Learning with Kernels. MIT Press, Cambridge, MA, 2002. [64] Mangasarian, O.L.; Musicant, D.R. Succesive overrelaxation for support vector machines. IEEE Trans. Neural Networks 1999, 10, 1032-1036. [65] Mattera, D.; Palmieri, F.; Haykin, S. Simple and robust methods for support vector expansions. IEEE Trans. Neural Networks 1999, 10, 1038-1047. 
[66] Luan, F.; Ma, W.P.; Zhang, X.Y. ; Zhang, H.X. ; Liu, M.C. ; Hu, Z.D. ; Fan, B.T. QSAR study of polychlorinated dibenzodioxins, dibenzofurans, and biphenyls using the Heuristic method and support vector machine. QSAR Comb. Sci. 25, 25, 46-55. [67] Sutter, J. M.; Kalivas, J. H.; Lang, P. K. Which principal components to utilize for principal component regression. J. Chemometrics 1992, 6, 217-225. [68] Nendza, M.; Wenzel, A. Statistical approach to chemicals classification. Environ. Toxicol. Chem. 1993, Supplement, 1459-1470. [69] Cash, G. G.; Breen, J. J. Principal component analysis and spatial correlation: environmental analytical software tools. Chemosphere 1992, 24, 1607-1623. [70] Hemmateenejad, B.; Miri, R.; Jafarpour, M.; Tabarzad, M.; Foroumadi, A. Multiple linear regression and principal component analysis-based prediction of the antituberculosis activity of some 2-aryl-1,3,4-thiadiazole derivatives. QSAR Comb. Sci. 2006, 25, 56-66. [71] Randić, M. Resolution of ambiguities in structure-property studies by use of orthogonal descriptors. J. Chem. Inf. Comput. Sci.1991, 31, 311-320. [72] Randić, M. Orthogonal Molecular Descriptors. New J. Chem. 1991, 15, 517-525.
Timisoara Spectral – Structure Activity Relationship (Spectral-SAR) Algorithm
585
[73] Amić, D.; Davidović-Amić, D.; Trinajstić, N. Calculation of retention times of anthocyanins with orthogonalized topological indices. J. Chem. Inf. Comput. Sci. 1995, 35, 136-139. [74] Lučić, B.; Nikolić, S.; Trinajstić, N.; Juretić, D. The structure-property models can be improved using the orthogonalized descriptors. J. Chem. Inf. Comput. Sci. 1995, 35, 532-538. [75] Lučić, B.; Nikolić, S.; Trinajstić, N.; Jurić, A.; Mihalić, Z. A Structure-property study of the solubility of aliphatic alcohols in water. Croatica Chem. Acta 1995, 68, 417-434. [76] Lučić, B.; Nikolić, S.; Trinajstić, N.; Juretić, D.; Jurić, A. A Novel QSPR approach to physicochemical properties of the α-amino acids. Croatica Chem. Acta 1995, 68, 435450. [77] Šoškić, M.; Plavšić, D.; Trinajstić, N. Link between orthogonal and standard multiple linear regression models. J. Chem. Inf. Comput. Sci. 1996, 36, 829-832. [78] Klein, D.J.; Randić, M.; Babić, D.; Lučić, B.; Nikolić, S.; Trinajstić, N. Hierarchical orthogonalization of descriptors. Int. J. Quantum Chem. 1997, 63, 215-222. [79] Ivanciuc, O.; Taraviras, S.L.; Cabrol-Bass, D. Quasi-orthogonal basis sets of molecular graph descriptors as chemical diversity measure. J. Chem. Inf. Comput. Sci. 2000, 40, 126-134. [80] Fernandez, F. M.; Duchowicz, P. R.; Castro E. A. About orthogonal descriptors in QSPR/QSAR theories, MATCH Commun. Math. Comput. Chem. 2004, 51, 39-57. [81] Putz, M.V. A spectral approach of the molecular structure – biological activity relationship part I. The general algorithm. Ann. West Univ. Timişoara Ser. Chem. 2006, 15, 159-166. [82] Putz, M.V.; Lacrămă, A.-M. A spectral approach of the molecular structure – biological activity relationship part II. The enzymatic activity. Ann.West Univ. Timişoara Ser. Chem. 2006, 15, 167-176. [83] Putz, M.V.; Lacrămă, A.-M. Introducing spectral structure activity relationship (SSAR) analysis. Application to ecotoxicology. Int. J. Mol. Sci. 2007, 8, 363-391. 
[84] Lacrămă, A.-M.; Putz, M.V.; Ostafe, V. A Spectral-SAR model for the anionic-cationic interaction in ionic liquids: Application to Vibrio fischeri ecotoxicity. Int. J. Mol. Sci. 2007, 8, 842-863. [85] Putz, M.V.; Lacrămă, A.-M.; Ostafe V. Spectral-SAR ecotoxicology of ionic liquids. The Daphnia magna case. Res. Lett. Ecol. 2007, Article ID12813/5 pages, DOI: 10.1155/2007/12813. [86] Putz, M.V.; Duda-Seiman, C.; Duda-Seiman, D.M.; Putz A.-M. Turning SPECTRALSAR into 3D-QSAR analysis. application on H+K+-ATPase inhibitory activity, Int. J. Chem. Model. 2008, 1, 45-62. [87] Lacrămă, A.-M.; Putz, M.V.; Ostafe, V. Designing a spectral structure-activity ecotoxico-logistical battery, in Advances in Quantum Chemical Bonding Structures, Putz M.V., Ed.; Transworld Research Network: Kerala, India, 2008; Chapter 16, pp. 389-419. [88] Putz, M.V.; Putz (Lacrămă) A.-M. Spectral-SAR: Old wine in new bottle. Studia Universitatis Babeş-Bolyai Chemia, 2008, 53, 73-81. [89] Putz, M.V.; Putz, A.-M.; Ostafe, V.; Chiriac A. Application of spectral-structure activity relationship (S-SAR) method to ecotoxicology of some ionic liquids at the molecular level using acethylcolinesterase. Int. J. Chem. Model. 2009, 2, 85-96.
586
Mihai V. Putz and Ana-Maria Putz
[90] Putz, M.V.; Putz, A.M.; Lazea, M.; Chiriac, A. Spectral vs. statistic approach of structure-activity relationship. Application on ecotoxicity of aliphatic amines. J. Theor. Comput. Chem. 2010, 8 in press. [91] Daudel R.; Leroy G.; Peeters D.; Sana M. Quantum Chemistry, John Wiley & Sons, New York, 1983. [92] Messiah, A. Quantum Mechanics, Vols. I and II, North-Holland: Amsterdam, Holland, 1961. [93] Weiss, U. Quantum Dissipative Systems, World Scientific, Singapore, 1993. [94] Chicu, S.A.; Putz, M.V. Köln-Timişoara molecular activity combined models toward interspecies toxicity assessment. Int. J. Mol. Sci. 2009, 10, accepted. [95] Dirac, P.A.M. The Principles of Quantum Mechanics, Oxford University Press: Oxford, UK, 1944. [96] Fadeeva V. N. Computational Methods of Linear Algebra, Dover Publications, New York, 1959. [97] Steen, L.A. Highlights in the history of spectral theory. Amer. Math. Monthly 1973, 80, 359-381. [98] Siegmund-Schultze, R. Der Beweis des Hilbert-Schmidt Theorem. Arch. Hist. Ex. Sc. 1986, 36, 251-270. [99] Pernak, J.; Chwala, P. Synthesis and anti-microbial activities of choline-like quaternary ammonium chlorides. Eur. J. Med. Chem. 2003, 38, 1035-1042. [100] Bernot, R.J.; Brueseke, M.A.; Evans-White, M.A.; Lamberti, G.A. Acute and chronic toxicity of imidazolium-based ionic liquids on Daphnia Magna. Environ. Toxicol. Chem. 2005, 24, 87-92. [101] Sheldon, R.A. Green solvents for sustainable organic synthesis: state of the art. Green Chem. 2005, 7, 267-278. [102] Docherty, K.M.; Kulpa, C.F.Jr. Toxicity and antimicrobial activity of imidazolium and pyridinium ionic liquids. Green Chem. 2005, 7, 185-189. [103] Freemantle, M. New frontiers for ionic liquids. Chem. Eng. News 2007, 1, 23-26. [104] Anastas, P.T.; Warner, J.C. Green Chemistry Theory and Practice, 1998, Oxford University Press, New York. 
[105] Jastorff, B.; Molter, K.; Behrend, P.; Bottin-Weber, U.; Filser, J.; Heimers, A.; Ondurschka, B.; Ranke, J.; Scaefer, M.; Schroder, H.; Stark, A.; Stepnowski, P.; Stock, F.; Stormann, R.; Stolte, S.; Welz-Biermann, U.; Ziegert, S.; Thoming, J. Progress in evaluation of risk potential of ionic liquids—basis for an eco-design of sustainable products. Green Chem. 2005, 7, 362-372. [106] Jastorff, B.; Stormann, R.; Ranke, J.; Molter, K.; Stock, F.; Oberheitmann, B.; Hoffmann, W.; Hoffmann, J.; Nuchter, M.; Ondruschka, B.; Filser, J. How hazardous are ionic liquids? Structure – activity relationship and biologic testing as important elements for sustainability evaluation. Green Chem. 2003, 5, 136-142. [107] Pernak, J.; Sobaszkiewicz, K.; Mirska, I. Antimicrobial activities of ionic liquids. Green Chem. 2003, 5, 52-56. [108] Lacrămă, A.M. Ecotoxicological Batteries with Organisms from Different Species (in Romanian), PhD dissertation, West University of Timişoara, Romania, 2007. [109] Putz, M.V.; Putz, A.M.; Lazea, M.; Ienciu, L.; Chiriac A. Quantum-SAR extension of the Spectral-SAR algorithm. Application to polyphenolic anticancer bioactivity. Int. J. Mol. Sci. 2009, 10, 1193-1214.
Timisoara Spectral – Structure Activity Relationship (Spectral-SAR) Algorithm
587
[110] Steiger, J.H.; Schonemann, P.H. A history of factor indeterminacy. In Theory Construction and Data Analysis in the Behavioural Science, Shye, S., Ed.; Jossey-Bass Publishers: San Francisco, CA, USA, 1978. [111] Spearman, C. The Abilities of Man; MacMillan: London, UK, 1927. [112] Wilson, E.B. Review of the abilities of man, their nature and measurement, by Spearman, C. Science 1928, 67, 244-248. [113] Wilson, E.B.; Hilferty, M.M. The distribution of chi-square. Proc. Nat. Acad. Sci. USA 1931, 17, 684. [114] Wilson, E.B.; Worcester, J. A note on factor analysis. Psychometrika 1939, 4, 133-148. [115] Topliss, J.G.; Costello, R.J. Chance correlation in structure-activity studies using multiple regression analysis. J. Med. Chem. 1972, 15, 1066-1068. [116] Topliss, J.G.; Edwards, R.P. Chance factors in studies of quantitative structure-activity relationships. J. Med. Chem. 1979, 22, 1238-1244.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 589-605
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 22
ON PLOTS IN QSAR/QSPR METHODOLOGIES
Emili Besalú (a,*), Jesus Vicente de Julián Ortiz (b) and Lionello Pogliani (c)
(a) Institute of Computational Chemistry, Universitat de Girona, Facultat de Ciències, Avda. Montilivi s/n, 17071 Girona, Spain
(b) Instituto de Tecnología Química, CSIC-Universidad Politécnica de Valencia, Av. de los Naranjos s/n, 46022 València, Spain
(c) Dipartimento di Chimica, Università della Calabria, 87030 Rende (CS), Italy
Abstract
Many of the numerical and algorithmic procedures used in the QSAR/QSPR field serve to rank virtual molecules or to predict their activity values. It is also well known that an image can convey more information than a list of numbers. For this reason, several kinds of graphical representations are used in many publications to illustrate the results obtained and their performance. Here some of these graphical representations are revisited. It is also shown how the heuristic manipulation or interpretation of one of them can lead to erroneous conceptions: when the popular ordinary multiple linear regression technique is used, the appearance of the scatter plot of fitted versus experimental values is affected by regression towards the mean effects. As a consequence, the ‘reverse’ presentation, i.e., the plot of experimental versus fitted values, is not equivalent to the former one. The properties underlying these graphs show that the point cloud is not symmetrically distributed along the so-called “ideal” or “desired” line, that is, the bisector of the first and third quadrants. The deviation from the ideal line is fixed, and it is related to the coefficient of determination. Regarding classifiers, a distinction between the concepts of difficulty and utility is presented. Some classifier-related graphs are shown, such as receiver operating characteristic (ROC) curves and pharmacological-activity distribution diagrams (PDD). Special emphasis is placed on the former. ROC curves, despite not being widely known in the QSAR field, are presented here in order to promote them as a tool to qualify and compare the performance of classifiers.
(a) E-mail address: [email protected] (corresponding author).
(b) E-mail address: [email protected]
(c) E-mail address: [email protected]
Emili Besalú, Jesus Vicente de Julián Ortiz and Lionello Pogliani
Introduction
In many fields of chemistry, quantitative structure-activity and structure-property relationship (QSAR/QSPR) techniques are employed. A plethora of algorithmic procedures can be applied to predict molecular properties, whether physical or biological. These numerical and algorithmic procedures are sometimes used to rank virtual molecules, or to fit or predict activity values. In order to illustrate the results obtained in the molecular modeling field, several kinds of graphical representations are presented in publications. Regression towards the mean effects are discussed here for situations in which the multilinear regression (MLR) technique is used for model building. The concept is related to the appearance of some scatter plots (experimental vs. fitted, and fitted vs. experimental values). These graphs show that the point cloud is not symmetrically distributed along the so-called “ideal” or “desired” y = x line. An extension of these regression effects is also briefly discussed in the context of property predictions obtained via the leave-one-out cross-validation technique. Other kinds of graphs are also introduced, namely the pharmacological distribution diagrams (PDD) and ROC curves.
1. Plots of Fitted and Experimental Values in MLR Studies
In many scientific and chemistry-related fields it is very common to represent calculated and observed data in a two-dimensional plot. If the fitted values are obtained by ordinary linear or MLR procedures, the two graphical choices, fitted vs. observed and observed vs. fitted biplots, are not equivalent. The slopes of the regression lines in the two plots have distinct properties: in the former representation the regression line has a slope always equal to r², whereas in the latter the regression line coincides exactly with the bisector of the first and third quadrants of the plane. This situation always holds and can be mathematically proven [1-3]. Here it is illustrated with a simple numerical example. Let us consider a set of 5000 items (molecules). We choose this number of items in order to obtain representative graphs with an evident visual message. The properties described here apply to any plausible number of items, owing to the general theorem that supports our results [2]; if only a few items are considered, however, the respective graphs may not exhibit the properties in a visually evident manner. We assume that our molecules bear some property of interest in the field of QSAR, for instance an activity that can be numerically represented and modeled by an MLR equation. We have constructed such an artificial toy set and generated a fitting MLR model involving three parameters. Once the list of fitted values was obtained, the well-known graphs representing fitted vs. observed values (a) and experimental vs. fitted values (b) were constructed; they are displayed in Figure 1. Note that in both representations the point clouds are not distributed symmetrically around the bisector of the first and third quadrants (solid diagonal line), as many people would presume.
The distortion is related to the “regression towards the mean” effect anticipated by Galton [4]. It can be mathematically demonstrated that, for the MLR case, the simple (least-squares) linear regression line through the points displayed in Figure 1a has a slope exactly equal to the coefficient of determination, r², of the underlying MLR model.
[Figure 1: two scatter plots, both axes running from -40 to 120. Panel a): Fitted value by MLR (ordinate) against Experimental value (abscissa). Panel b): Experimental value (ordinate) against Fitted value by MLR (abscissa).]
Figure 1. Representation of fitted vs experimental values (a) and experimental vs fitted ones (b) for an artificial set of items. The data have been adjusted by means of an ordinary MLR model involving three descriptors slightly correlated with the dependent variable (experimental value).
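The slope properties of the two plots can be reproduced with a few lines of code. What follows is a minimal sketch under stated assumptions: it uses a synthetic data set with a single descriptor (a special case of MLR) rather than the three-descriptor toy set of the chapter, and the helper `fit_line`, the variable names and the noise levels are all hypothetical.

```python
import random


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y ~ slope*x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic "molecules": one descriptor plus noise (illustrative values only).
random.seed(0)
n_items = 5000
descriptor = [random.gauss(0.0, 1.0) for _ in range(n_items)]
experimental = [50.0 + 20.0 * d + random.gauss(0.0, 15.0) for d in descriptor]

# Fit the model and compute the fitted values.
b1, b0 = fit_line(descriptor, experimental)
fitted = [b0 + b1 * d for d in descriptor]

# Coefficient of determination of the model.
y_mean = sum(experimental) / n_items
ss_tot = sum((y - y_mean) ** 2 for y in experimental)
ss_res = sum((y - f) ** 2 for y, f in zip(experimental, fitted))
r2 = 1.0 - ss_res / ss_tot

# Regression line drawn through each of the two scatter plots.
slope_a, intercept_a = fit_line(experimental, fitted)  # fitted vs. experimental
slope_b, intercept_b = fit_line(fitted, experimental)  # experimental vs. fitted

print("r2                      :", r2)
print("fitted vs. experimental :", slope_a, intercept_a, "(compare r2 and (1 - r2)*mean)")
print("experimental vs. fitted :", slope_b, intercept_b, "(compare 1 and 0)")
```

In this sketch the first line has slope r² and intercept (1 − r²) times the mean experimental value, while the second coincides with y = x up to rounding, which is the asymmetry discussed in the text.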
In our example, the coefficient of determination is equal to 0.62, and the corresponding line is drawn dashed in Figure 1a. The intercept of this line is (1 − r²)·ymean, which in our case is 19.0. On the other hand, for the “reversed” graph in Figure 1b, the fitted line is always the y = x line, coinciding with the bisector of the first and third quadrants. Most people will anticipate that the better the fitting ability of the MLR model (i.e., the greater the value of r²), the closer the point clouds of Figure 1 will lie to the bisector line. That is true and obvious, but most people will also erroneously anticipate that the cloud is always symmetrically distributed around the bisector. That is false in general, as the present example shows. Asymptotically, as the determination coefficient r² approaches 1, the cloud becomes narrower and tends to collapse onto the bisector line. In the fitted versus experimental representation, for r² values less than 1 the point cloud spreads out but is always “clockwise rotated” with respect to the bisector line. Conversely, if the experimental vs. fitted values are represented, the cloud is anticlockwise rotated with respect to the bisector line. The aforementioned rotation effects are magnified if the represented points are not the fitted ones but those obtained by a standard leave-one-out (LOO) cross-validation procedure. This is also explained and mathematically demonstrated in reference [4]. MLR-LOO and leave-many-out predictions present systematic deviations which magnify the regression towards the mean effects. This is easily shown by recalling that an MLR-LOO result ($y_i'$) is obtained from the following equation [5,6]:
$$ y_i' = \frac{h_{ii}\,y_i - \hat{y}_i}{h_{ii} - 1}, \qquad i = 1, 2, \ldots, n. \tag{1} $$
where $y_i$ are the experimental values, $\hat{y}_i$ are the values fitted by the overall MLR model, and each $h_{ii}$ term is a diagonal element of the hat matrix. From (1) it can be seen that the numerical differences between experimental, fitted and cross-validated values are related:
$$ \hat{y}_i - y_i = (1 - h_{ii})\,(y_i' - y_i) \tag{2} $$
and it is also known that the $h_{ii}$ terms are bounded [7] between the values 1 and m/n (number of descriptors divided by the number of objects). From these conditions, and because n > m, it is easy to see that the differences $\hat{y}_i - y_i$ and $y_i' - y_i$ appearing in (2) bear the same sign and that the second difference is magnified with respect to the first one. This shows that a cross-validated value ($y_i'$) differs from the experimental one more than the fitted one does. Globally, this effect magnifies the rotation of the point cloud in graphs similar to those of Figure 1 when cross-validated property values are plotted against the experimental ones.
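Equations (1) and (2) can be illustrated numerically. The sketch below is an assumption-laden example rather than the procedure used by the authors: it takes a straight-line model with one descriptor, for which the diagonal of the hat matrix has the standard closed form 1/n + (x_i − x_mean)²/Sxx, and it compares the prediction of equation (1) with an explicit refit that leaves each item out. All names and data values are hypothetical.

```python
import random


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y ~ slope*x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Small synthetic data set (illustrative values only).
random.seed(1)
n = 30
x = [random.uniform(0.0, 10.0) for _ in range(n)]
y = [3.0 + 2.0 * xi + random.gauss(0.0, 2.0) for xi in x]

# Overall fit and fitted values.
slope, intercept = fit_line(x, y)
y_fit = [intercept + slope * xi for xi in x]

# Diagonal of the hat matrix for a straight-line model with intercept.
x_mean = sum(x) / n
sxx = sum((xi - x_mean) ** 2 for xi in x)
h = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]

# Equation (1): leave-one-out prediction obtained from the overall fit.
y_loo = [(hi * yi - fi) / (hi - 1.0) for hi, yi, fi in zip(h, y, y_fit)]

# Brute-force check: refit without item i and predict it.
y_refit = []
for i in range(n):
    s, c = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    y_refit.append(c + s * x[i])

max_gap = max(abs(p - q) for p, q in zip(y_loo, y_refit))

# Equation (2) and its consequences: same sign, magnified deviation.
eq2_gap = max(abs((f - yi) - (1.0 - hi) * (l - yi))
              for f, yi, hi, l in zip(y_fit, y, h, y_loo))
same_sign = all((l - yi) * (f - yi) >= 0.0 for l, yi, f in zip(y_loo, y, y_fit))
magnified = all(abs(l - yi) >= abs(f - yi) for l, yi, f in zip(y_loo, y, y_fit))

print("largest difference between eq. (1) and explicit refit:", max_gap)
print("largest violation of eq. (2):", eq2_gap)
print("same sign:", same_sign, "| LOO deviation never smaller:", magnified)
```

In this sketch every leverage lies strictly between 0 and 1, so each factor (1 − h_ii) in equation (2) is a positive number below 1, which is what produces the magnification.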
2. An Alternative: Orthogonal Regression
An alternative method for obtaining a “symmetric” graph (point cloud around a fitting line) is orthogonal regression [1,3,8]. If orthogonal least squares (OLS) is considered, the sum of squared point-to-line distances is minimized, whereas in the standard linear regression method one minimizes the sum of squared “vertical” distances, i.e., the differences between the fitted and the experimental property values (the distance parallel to the x-axis is not considered at all). The OLS method provides a unique biplot regression line, that is, the distinction between abscissa and ordinate is irrelevant. The point cloud is symmetrically distributed around the OLS line (see Figure 2).
[Figure 2: two scatter plots, both axes running from -40 to 120. Panel a): Fitted value by MLR (ordinate) against Experimental value (abscissa). Panel b): Experimental value (ordinate) against Fitted value by MLR (abscissa).]
Figure 2. Representation of the orthogonal regression line (dashed line) for the points of the presented example. This line corresponds to the first principal component of the represented bidimensional data and is the same entity in both representations, fitted vs experimental (a) or experimental vs fitted (b) values.
The method requires both series of data, {xi, yi}, to be conceptually equivalent and expressed in the same units, if any. In fact, the line fitted by OLS coincides with the first eigenvector of the point cloud [8,9]. In Figure 2a this OLS line is represented (dashed line) for our example; its equation is Fitted = 0.738·Experimental + 13.1. Owing to the nature of this equation and the symmetric role of the x and y variables, the related formula Experimental = 1.36·Fitted − 17.7 (dashed line in Figure 2b) can be obtained from the former one by isolating the required variable. Of course, this manipulation is not possible when dealing with standard regression lines. As stated above, for the particular case of orthogonal regression one can properly say that the point cloud is symmetrically distributed around the fitted line, but always “clockwise rotated” with respect to the bisector owing to the inherent “regression towards the mean” effect attached to the MLR method. As the MLR determination coefficient tends to unity, the orthogonal regression line also tends to collapse onto the bisector.
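The symmetric role of the two variables in orthogonal regression can be checked with a short sketch. It assumes synthetic "experimental" and "fitted" lists (not the data of Figure 2) and uses the closed-form slope of the first principal axis of the 2×2 scatter matrix; the function name `orthogonal_line` and all numerical values are hypothetical.

```python
import math
import random


def orthogonal_line(x, y):
    """Return (slope, intercept) of the line minimizing squared perpendicular distances.

    The slope is that of the first principal axis of the 2x2 scatter matrix.
    Assumes the two variables are correlated (sxy != 0).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx


# Synthetic paired values standing in for experimental and fitted data.
random.seed(2)
experimental = [random.gauss(50.0, 25.0) for _ in range(2000)]
fitted = [0.6 * e + 20.0 + random.gauss(0.0, 8.0) for e in experimental]

# Fitted = slope_fe * Experimental + icpt_fe
slope_fe, icpt_fe = orthogonal_line(experimental, fitted)
# Experimental = slope_ef * Fitted + icpt_ef
slope_ef, icpt_ef = orthogonal_line(fitted, experimental)

print("Fitted vs. Experimental :", slope_fe, icpt_fe)
print("Experimental vs. Fitted :", slope_ef, icpt_ef)
# Isolating Experimental in the first line should reproduce the second line.
print("From inversion          :", 1.0 / slope_fe, -icpt_fe / slope_fe)
```

The two fits describe the same geometric line: the second slope is the reciprocal of the first and the second intercept equals minus the first intercept divided by the first slope, which is the algebraic manipulation described in the text.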
3. A Distinction between Utility and Difficulty in Ranking

The authors have found a couple of graphs useful for evaluating the utility and the difficulty of the results obtained by ranking methods. For illustrative purposes, we consider here the ranking of a set of 99239 molecules taken from the Cambridge Structural Database v5.24 (Nov. 2002) of the Cambridge Crystallographic Data Centre. In this set, molecules were labeled as drugs or non-drugs. The molecules of interest are the 674 drugs (0.68%). We will not focus on the nature of the ranking method used but on the final result obtained. As expected from the application of an effective QSAR model, the 674 compounds of interest are not uniformly distributed along the final sorted list of compounds: as desired, the density of active compounds is greater at the beginning of the ranked series.
3.1. Some Probability Considerations

The rankings and classification results obtained have been studied from a probabilistic point of view. To estimate the quality of the predictions, a statistical significance test was designed [10,11]. This subsection presents the main ideas of the basic formulation. Consider a series of m molecules (the whole database set, m = 99239), n (= 674) of which bear a property of interest (being a drug, in our case). We randomly select s of the m molecules and ask for the probability that exactly r of the selected molecules are of interest. If there are no a priori preferences and no further information is available, this probability is equal to
$$P(r,n;s,m)=\frac{\binom{n}{r}\binom{m-n}{s-r}}{\binom{m}{s}},\qquad r\le s\le m;\ r\le n\le m. \tag{3}$$
In eq. (3) the numerator is the number of possible ways to take r compounds of interest from the subset of n, multiplied by the number of ways to select s-r uninteresting compounds from the remaining m-n structures of the database. This product gives the total number of ways to select r, and only r, molecules of interest from the whole database. The denominator counts the total number of ways to construct sets of s structures from the database. Within this framework, significance levels (p values) are obtained from cumulative probabilities: given fixed values of m, n and s, the significance level p (right tail) is the probability of selecting r or more structures of interest. In other words, this datum is the probability that a random selection of s structures embraces r or more active compounds:
$$p(r,n;s,m)=\sum_{i=r}^{\min(n,s)}P(i,n;s,m)=1-\sum_{i=\max(0,\,s+n-m)}^{r-1}P(i,n;s,m). \tag{4}$$
In eq. (4), the first summation is the direct addition of the probabilities of collecting r, r+1, r+2, ... active compounds. The maximal number of active compounds is n (all the actives) unless the size of the selected set, s, is smaller than this number; hence the upper limit of the first summation. The last equality in eq. (4) obtains the same significance probability by subtracting from unity all the unfavorable cases, i.e., those in which fewer than r active compounds are collected. The lower limit of the second summation arises because the minimal number of structures of interest that can be chosen is max(0, s+n-m): usually the minimal number of active compounds in the selected subset is zero, but if the sample size, s, is greater than the number of inactive molecules, m-n, the minimum becomes the difference s-(m-n). This formulation can be found in several places in the literature [11,12] and was first presented by the present authors and coworkers [10].
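Eqs. (3) and (4) can be evaluated exactly with integer arithmetic. The following is a minimal sketch using only the Python standard library; the function names are illustrative and this is not the implementation used by the authors. Exact binomial coefficients avoid the underflow that floating-point probabilities would suffer for p-values as small as those discussed in this section.

```python
from fractions import Fraction
from math import comb, log10

def selection_probability(r, n, s, m):
    """Eq. (3): probability that exactly r of the s selected molecules are of interest."""
    if r < 0 or r > min(n, s) or s - r > m - n:
        return Fraction(0)
    return Fraction(comb(n, r) * comb(m - n, s - r), comb(m, s))

def tail_count(r, n, s, m):
    """Number of selections of size s containing r or more molecules of interest."""
    lowest = max(r, 0, s + n - m)   # smallest feasible number of actives
    highest = min(n, s)             # largest feasible number of actives
    return sum(comb(n, i) * comb(m - n, s - i) for i in range(lowest, highest + 1))

def p_value(r, n, s, m):
    """Eq. (4): right-tail significance level as an exact fraction."""
    return Fraction(tail_count(r, n, s, m), comb(m, s))

def log10_p_value(r, n, s, m):
    """Decimal logarithm of eq. (4); raises ValueError if the tail is empty (p = 0)."""
    # math.log10 accepts arbitrarily large Python integers
    return log10(tail_count(r, n, s, m)) - log10(comb(m, s))

# Database of the text: m = 99239 molecules, n = 674 drugs
print(log10_p_value(1, 674, 1, 99239))
print(log10_p_value(41, 674, 100, 99239))
```

For r = 1 and s = 1 the p-value reduces to n/m, so the first call returns log10(674/99239); the second call corresponds to the 41 drugs found among the first 100 ranked molecules.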
3.2. Enrichment Factors

References [13] and [14] provide a natural and intuitive definition of the enrichment factor (e): it is the actual ratio of molecules of interest found in a selected subset divided by the overall ratio of target molecules in the whole database. In the notation of eqs. (3) and (4), this reads
$$e=\frac{r/s}{n/m}=\frac{r}{\frac{n}{m}\,s} \tag{5}$$
The last equality explicitly shows an equivalent and practical definition: the enrichment is the quotient between the number of active compounds found in the subset of size s and the number of active structures expected in that subset under a uniform distribution.
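As a quick numerical illustration of eq. (5), the following minimal sketch computes the enrichment factor; the function name `enrichment_factor` is illustrative, and the argument values are those of the database discussed in this section (n = 674 drugs among m = 99239 molecules).

```python
def enrichment_factor(r, s, n, m):
    """Eq. (5): ratio of actives in the selection (r/s) over the global ratio (n/m)."""
    return (r / s) / (n / m)

# 41 drugs among the first 100 ranked molecules
print(enrichment_factor(41, 100, 674, 99239))
# 337 drugs among the first 10569 ranked molecules
print(enrichment_factor(337, 10569, 674, 99239))
```

Rounded to one decimal, these two calls give 60.4 and 4.7, respectively.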
3.3. The Result Obtained

As already seen, only 0.68% of the compounds in the analyzed database were drugs. After data processing and ranking, this percentage rose to 41% for the first 100 molecules. This corresponds to a 60-fold improvement, which can be measured equivalently by the quotient of percentages (see Table 1, column 3) or by the enrichment factor, eq. (5). Furthermore, 337 drugs (50% of the total number of drugs) appeared in the first 10.7% (position 10569) of the sorted database (enrichment of 4.7). By contrast, there were only 17 drugs in the last ranked decile. The classifier performance can be checked in Table 1, which lists some ranked positions where drug molecules are found in the sorted list. For every position, the cumulative percentage of drug molecules in the selection is shown; the quotient of this proportion and the global one, 0.68%, gives the enrichment factor defined in eq. (5). The last column of Table 1 gives the logarithm of the corresponding significance p-values, calculated according to eq. (4).
[Figure 3 axes: abscissa, number of first ranked compounds (logarithmic scale); left ordinate, enrichment; right ordinate, p-values (log).]
Figure 3. Enrichment (leftmost vertical scale, thick line in graph) and p-significance values (right scale, thin line) found along the ranked list of molecules. Note the logarithmic scaling of the abscissa axis.
Figure 3 shows the enrichment factors (leftmost vertical scale, thick line in the graph) obtained when selecting the number of first ranked molecules indicated on the logarithmic abscissa. The probabilistic formulation presented above tells us how far these results are from being attributable to randomness: Figure 3 also displays the logarithm of the significance p-values (scale on the right, thin line) versus the number of first ranked compounds. Note that very small p values are achieved. Although the figures for the first 100 ranked compounds seem impressive ($p\approx10^{-61}$, see Table 1, together with a quite high enrichment), the graph reveals at once that an even more difficult (highly improbable) result is obtained when the first 3000 compounds are selected (significance p-value of the order of $10^{-160}$), even though the corresponding enrichment is much lower. In this last case the enrichment is lower because of the great difficulty inherent in correctly classifying all the first 3000 compounds out of the whole database.
Table 1. Ranked positions assigned to some drug molecules by the classifier. Enrichment factors and the corresponding significance p-values (eq. 4) are given. The symbols used in headers correspond to the ones found in the text

| Drug molecule cardinal (r) | Ranked position in full database (s) | Percentage of drugs in selection (100r/s) | Enrichment factor (e) | p-value (logarithm) |
|---|---|---|---|---|
| 1 | 1 | 100.0 | 147.2 | -2.17 |
| 2 | 7 | 28.6 | 42.1 | -3.02 |
| 3 | 8 | 37.5 | 55.2 | -4.77 |
| 4 | 9 | 44.4 | 65.4 | -6.59 |
| 5 | 13 | 38.5 | 56.6 | -7.76 |
| 6 | 16 | 37.5 | 55.2 | -9.14 |
| 7 | 17 | 41.2 | 60.6 | -10.93 |
| 8 | 18 | 44.4 | 65.4 | -12.75 |
| 9 | 19 | 47.4 | 69.7 | -14.60 |
| 10 | 20 | 50.0 | 73.6 | -16.47 |
| 20 | 42 | 47.6 | 70.1 | -31.83 |
| 30 | 58 | 51.7 | 76.2 | -48.94 |
| 40 | 90 | 44.4 | 65.4 | -61.59 |
| 41 | 100 | 41.0 | 60.4 | -61.28 |
| 50 | 133 | 37.6 | 55.4 | -72.34 |
| 60 | 180 | 33.3 | 49.1 | -83.01 |
| 70 | 284 | 24.6 | 36.3 | -86.31 |
| 80 | 340 | 23.5 | 34.6 | -96.95 |
| 90 | 419 | 21.5 | 31.6 | -105.30 |
| 100 | 520 | 19.2 | 28.3 | -111.99 |
| 133 | 1000 | 13.3 | 19.6 | -127.60 |
| 200 | 2402 | 8.3 | 12.3 | -154.93 |
| 217 | 3000 | 7.2 | 10.7 | -156.32 |
| 256 | 5000 | 5.1 | 7.5 | -151.00 |
| 300 | 7820 | 3.8 | 5.6 | -146.57 |
| 326 | 10000 | 3.3 | 4.8 | -141.32 |
| 337 | 10569 | 3.2 | 4.7 | -144.46 |
| 400 | 16969 | 2.4 | 3.5 | -134.08 |
| 500 | 32519 | 1.5 | 2.3 | -107.25 |
| 564 | 50000 | 1.1 | 1.7 | -72.88 |
| 600 | 56961 | 1.1 | 1.6 | -72.42 |
| 674 | 98914 | 0.7 | 1.0 | -0.96 |
3.4. Enrichment Factors and Probability Mean: Utility and Difficulty

The enrichment factors depicted in Figure 3 reveal the utility of the ranking obtained. That is, a molecular engineer will know from these data whether the first set of ranked compounds is sufficiently enriched to be transferred to a subsequent design or treatment stage. The literature usually states that enrichment factors of 10-fold or more are desirable. This seems to be a good general rule of thumb, but it has to be combined with another datum: the number of structures that must be collected in order to reach this enrichment factor. A 10-fold enrichment factor for the first 10 molecules is not the same as one for the first 1000 or 10000 ranked ones. This last aspect is related to probability: in Figure 3 the p-values tell us how difficult it is to obtain a particular result.

All in all, a molecular designer wishes to combine both aspects: practical utility, due to a notable enrichment factor, and a high degree of difficulty in reaching this classification performance. In other words, high enrichments are desirable, but it is even more desirable to reproduce them in library subsets that are as large as possible. Conversely, it can be useless to consider only the first few compounds of a ranked list because, despite a possibly high and impressive enrichment factor (apparent utility), the corresponding difficulty (significance, p-value) may be easy to reach.

When a fixed database is ranked using distinct methodologies and the goal is to compare the classification power of the methods, both parameters (enrichment and p values) are useful and will rank the methods in the same way. Things are different if comparisons must be made among distinct libraries. In general, enrichment factors are then not directly comparable, and the ultimate value for ranking the methods is the difficulty of the results achieved, that is, the significance p value. Thus, p-values provide a general, objective numerical quantification that can be used to compare ranking methods (applied to the same or to distinct libraries) and ranking results (for a fixed library using distinct methods).
In fact, enrichment factors do not provide much information by themselves unless the values of r, s, n and m are also provided (incidentally, these values also make it possible to compute the significance p-values). If an index of classifier quality is to be related to enrichment, it is better to report the fraction of the maximum enrichment that can be achieved in a given case study. The enrichment can be evaluated relative to the maximum attainable one:
$$e_{\max}=\frac{\min(s,n)/s}{n/m} \tag{6}$$
Then, the fraction of the maximal attainable enrichment is simply the quotient

$$e_f=\frac{e}{e_{\max}}=\frac{r}{\min(s,n)} \tag{7}$$
Note that in this last equation the size of the database (m) does not appear.
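Eqs. (6) and (7) can be sketched in the same way. The function names below are illustrative, and the numerical values are again those of the database discussed above (n = 674, m = 99239); the last lines check that the quotient e/emax indeed reduces to r/min(s, n).

```python
def max_enrichment(s, n, m):
    """Eq. (6): enrichment reached if the selection held as many actives as possible."""
    return (min(s, n) / s) / (n / m)

def enrichment_fraction(r, s, n):
    """Eq. (7): fraction of the maximal enrichment; the database size m cancels out."""
    return r / min(s, n)

r, s, n, m = 41, 100, 674, 99239
e = (r / s) / (n / m)                      # eq. (5)
print(max_enrichment(s, n, m))             # equals m/n while s <= n
print(e / max_enrichment(s, n, m))         # same value as the next line
print(enrichment_fraction(r, s, n))
```

When s exceeds n, the maximal enrichment falls to m/s, which is one reason why enrichment values obtained for selections of different sizes are hard to compare directly.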
4. ROC Curves

Receiver Operating Characteristic (ROC) curves [15] were developed in the 1940s and 1950s in the context of radar signal research, in order to deal with noise. The research was motivated by the Japanese attack on Pearl Harbor: the question was why the US radar receiver operators had missed the enemy signal. Later, in the 1960s, ROC curves were used in psychophysics, then in medicine and, more recently, for the evaluation of machine learning results. It is still not common to see ROC curve analyses in the QSAR or molecular design fields, but we believe that ROC curves constitute a good tool for evaluating dichotomous classifiers.

Table 2. Confusion table obtained once the result of a classifier is known
| Real situation | Classification method result: Positive | Classification method result: Negative |
|---|---|---|
| Positive | TPF | FNF |
| Negative | FPF | TNF |
A ROC curve is a graphical representation of the global efficiency of a classifier. It displays the frequencies of true positives and negatives, and of false positives and negatives, along the whole series of ranked compounds. Table 2 shows a typical confusion table (also called a confusion matrix) suitable for decision-making protocols. The nomenclature can be adapted to our example of ranking a set of molecules: being a drug compound is a positive result, whereas being a non-drug molecule is called here a negative result. The boxes list the relative frequencies with respect to the whole populations of real positive and real negative cases. These frequencies are the true positives fraction (TPF), the true negatives fraction (TNF), the false positives fraction (FPF) and the false negatives fraction (FNF). From these definitions, the condition FNF+TPF=1 holds; that is, every real positive case (drug) has been classified either correctly or incorrectly. It also holds that FPF+TNF=1; that is, the classification method separates the real negative cases (non-drugs) into two mutually exclusive parts, true negatives and false positives.

[Figure 4 labels: threshold value; TPF, FNF, FPF, TNF.]
Figure 4. The four fractions of a confusion table. The upper distribution applies to the positive cases and the lower one to the negative ones. The classifier, once given a threshold value, defines the fractions. The more the distributions overlap, the worse the classifier’s performance and the larger the FNF and FPF values.
In this context, the sensitivity is defined as a parameter that shows how good the classification method is at detecting positive cases. It corresponds to the true positive fraction (TPF), i.e., the probability that the classification method gives a positive result given that the molecule is a real positive case: the conditional probability P(C+|+). Its counterpart is the specificity, which measures the ability of the classifier to pick out real negative cases, the TNF; it is the same as the conditional probability P(C-|-). Additionally, the FNF is the probability that the method classifies a case as negative when the molecule is a real positive one, P(C-|+). Finally, the fraction FPF corresponds to the conditional probability P(C+|-).

All the above conditional probabilities, i.e., the TPF, FNF, FPF and TNF terms, can be displayed graphically as areas under two density distribution functions (usually one Gaussian curve is drawn for the real positive cases and another for the real negative ones) delimited by a critical decision point (threshold value) defined by the classification method or by the user. This conceptual display is depicted in Figure 4. It has to be understood that the classification method defines the horizontal scale in Figure 4 and also the distributions of the real positive and real negative cases (ROC curves are non-parametric and insensitive to the particular probability distributions involved). A good classification method is one able to generate two non-overlapping distributions; otherwise, the FNF and FPF terms grow dramatically. Note that if the distributions are kept fixed and the threshold value (vertical line) is moved to the left (to the right), the proportion of false negatives decreases (increases), but at the expense of a corresponding increase (decrease) in the false positives fraction. This mutual dependence vanishes if the two distributions do not overlap and the threshold value is properly set.
[Figure 5 axes: abscissa, 1 - Specificity; ordinate, Sensitivity; both from 0 to 1.]
Figure 5. ROC curve for the example of the text: classification of molecules as drugs or non-drugs.
Once the molecular series is ranked, the distributions of the real positive and real negative cases are inherently generated. To obtain the ROC curve, the threshold value is then moved along the whole ranked series. Each threshold value defines the values of TPF, FNF, FPF and TNF, and hence the sensitivity and the specificity, and yields one point of the ROC curve. A ROC curve is the representation of all the (1-specificity, sensitivity) points, which is equivalent to plotting the (1-TNF, TPF) = (FPF, TPF) values. For our example, the corresponding ROC curve is the one in Figure 5. The diagonal depicted in Figure 5 is often shown; it corresponds to a random or neutral classifier. If the curve (or a part of it) goes below this diagonal, the classifier must have its criterion reversed (its polarity changed) in order to obtain a proper dichotomization of the data.
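The construction of the ROC curve by sweeping the threshold along the ranked series can be sketched as follows. This is a minimal illustration, assuming a ranking without ties given as a sequence of booleans (True for a real positive); the function name `roc_points` and the toy ranking are illustrative.

```python
def roc_points(ranked_labels):
    """Return the list of (FPF, TPF) points obtained by moving the threshold
    from the top of the ranked list to its end, one molecule at a time.

    Assumes at least one real positive and one real negative in the list.
    """
    positives = sum(ranked_labels)
    negatives = len(ranked_labels) - positives
    points = [(0.0, 0.0)]            # threshold before the first molecule
    true_pos = false_pos = 0
    for is_positive in ranked_labels:
        if is_positive:
            true_pos += 1            # the curve moves up
        else:
            false_pos += 1           # the curve moves to the right
        points.append((false_pos / negatives, true_pos / positives))
    return points

# Toy ranking (hypothetical): three real positives and three real negatives
for point in roc_points([True, True, False, True, False, False]):
    print(point)
```

Every real positive encountered moves the curve one step up and every real negative one step to the right, so a ranking that places all the positives first produces a curve that follows the sensitivity axis and then the upper edge of the box.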
[Figure 6 axes: abscissa, 1 - Specificity; ordinate, Sensitivity; the segments labeled TPF, FNF, FPF and TNF are marked at one point of the ROC curve.]
Figure 6. The four fractions (TPF, FNF, FPF, TNF) attached to a specific point of a ROC curve.
In many places in the literature, the proportion of hits (drugs, in our example) retrieved along the ranked series is plotted against the proportion of the sorted database. In general, this representation is not the same as a ROC curve, but if the database is large compared with the number of hits, both graphical representations become almost the same. As said above, a ROC curve shows the overall performance of the classifier, displaying the sensitivity and the specificity obtained for each threshold value. Figure 6 shows how all four fractions (TPF, FNF, FPF, TNF) can be read from the graph at every point of the ROC curve.

One parameter extensively used to quantify the overall performance of a classifier is the area under the curve (AUC), displayed as the shaded zone in Figures 5 and 6. A random classifier has AUC = 0.5, and the further the AUC rises above this value, the better the classifier performs, at least from a global perspective. Roughly, the following intervals are assumed for the quality of a particular result: an AUC between 0.50 and 0.75 is fair, between 0.75 and 0.90 is good, between 0.90 and 0.97 is very good, and above 0.97 is excellent. For the example explored here, the AUC is equal to 0.79. An ideal classifier has an AUC equal to 1 (the ROC curve collapses onto the sensitivity axis and the upper segment of the box represented in Figure 5 or 6). The AUC is a universal parameter useful for comparing distinct classifiers. It can be interpreted as the mean sensitivity over all specificity values, and it is directly related to the Wilcoxon sum-of-ranks statistic [16,17]. The AUC can also be interpreted [17,18] as the probability of correctly ranking a pair of molecules, one being a drug and the other a non-drug.
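The two readings of the AUC mentioned above, area under the stepwise ROC curve and probability of correctly ordering a (drug, non-drug) pair, can be compared on a small example. This is a minimal sketch assuming a ranking without ties; the function names and the toy ranking are illustrative.

```python
def auc_from_ranking(ranked_labels):
    """Area under the stepwise ROC curve of a ranked list of booleans (True = real positive)."""
    positives = sum(ranked_labels)
    negatives = len(ranked_labels) - positives
    true_pos = 0
    area = 0
    for is_positive in ranked_labels:
        if is_positive:
            true_pos += 1
        else:
            # each real negative adds a strip of width 1/negatives
            # and height true_pos/positives
            area += true_pos
    return area / (positives * negatives)

def pair_probability(ranked_labels):
    """Fraction of (positive, negative) pairs in which the positive is ranked first."""
    pos_idx = [k for k, lab in enumerate(ranked_labels) if lab]
    neg_idx = [k for k, lab in enumerate(ranked_labels) if not lab]
    wins = sum(1 for p in pos_idx for q in neg_idx if p < q)
    return wins / (len(pos_idx) * len(neg_idx))

toy = [True, True, False, True, False, False]
print(auc_from_ranking(toy), pair_probability(toy))
```

Both functions count the same quantity, the number of real negatives ranked below each real positive, so they return the same value (8/9 for this toy ranking). The pairwise version is quadratic in the size of the list and is shown only to make the probabilistic interpretation explicit.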
5. Pharmacological-Activity Distribution Diagrams

Pharmacological-activity distribution diagrams (PDDs) are useful tools for selecting SAR equations for molecular design [19]. They are histogram-like plots in which one or several groups of chemical structures, preferably from a test set, are distributed into intervals of the value of the predicted property. They were initially used to visualize how a group of active compounds was distributed with respect to a group of inactive ones by means of a discriminant function. Gálvez et al. [19] noticed that QSAR equations could be used as discriminant functions with the aid of PDDs, which is why two groups (active and inactive compounds) are usually plotted in PDDs. These QSARs, which can also model pharmacokinetic properties, are called limiting properties.

Simply plotting the number of compounds on the ordinate has the drawback that the two groups must have approximately the same cardinality to give a useful representation. Furthermore, it is advisable to include in the plotted function some kind of penalty that accounts for the overlap of the two groups. Thus, the function introduced was the quotient between the fraction of molecules of one group that fall in the considered interval and the same fraction for the opposite group plus one (to avoid division by zero). This gives an idea of the probability of belonging to a group and was termed expectancy. Thus, for each arbitrary interval of any function, the expectancy of activity can be defined as:
$$E_a=\frac{a}{i+1} \tag{8}$$
where a is the quotient between the number of active compounds in the interval and the total number of active compounds; in the same way, i is the corresponding ratio for the inactive compounds. The expectancy of inactivity is defined analogously as
$$E_i=\frac{i}{a+1} \tag{9}$$
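The expectancies of eqs. (8) and (9) are straightforward to tabulate for a set of intervals. The following minimal sketch assumes that a and i are the fractions defined in the text and that the intervals are half-open, [low, high); the function name, the interval edges and the toy values are illustrative and are not taken from the figures of this section.

```python
def expectancies(active_values, inactive_values, edges):
    """Return a list of (low, high, Ea, Ei) for the intervals [low, high) defined by edges.

    active_values, inactive_values: predicted property values of each group.
    edges: increasing sequence of interval limits.
    """
    rows = []
    for low, high in zip(edges[:-1], edges[1:]):
        # a and i: fraction of each group falling in the interval
        a = sum(low <= v < high for v in active_values) / len(active_values)
        i = sum(low <= v < high for v in inactive_values) / len(inactive_values)
        rows.append((low, high, a / (i + 1), i / (a + 1)))   # eqs. (8) and (9)
    return rows

# Toy predicted values (hypothetical)
active = [1, 2, 2, 8]
inactive = [7, 8, 9, 9]
for row in expectancies(active, inactive, [0, 5, 10]):
    print(row)
```

In the first interval only active compounds are found, so Ea equals the fraction of actives and Ei vanishes; in the second interval the presence of all the inactive compounds halves the expectancy of activity, which is the overlap penalty described above.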
For a given equation, it is easy to see the zones in which the overlap between Ea and Ei is minimal, and thus to determine whether the equation studied can be useful for selection and molecular design. This also allows one to determine the intervals of the limiting property in which there is a good expectancy of finding new active drugs, that is, where the probability of finding new active compounds is optimal relative to the chance of obtaining false positives. When the groups of molecules are structurally heterogeneous, PDDs generally adopt skewed Gaussian shapes or present several maxima.

Let us see several examples. Figures 7 to 9 show the PDDs obtained with three QSAR equations, and Figure 10 the PDD of a discriminant function [20]. In these cases, the correlation variables are connectivity indices (see reference [20] for their definitions), and the equations are applied to test sets. The active group is made up of compounds that show anti-herpes simplex activity (in white in the figures). The inactive group contains drugs that exhibit pharmacological activities other than anti-herpes activity (in black in the figures).
[Figure 7 legend: Active, Non-active.]
Figure 7. PDD for $\mathrm{IC}_{50} = -17.36\,{}^{4}\chi^{v}_{p} + 41.39\,{}^{4}\chi_{pc} + 21.71$. Abscissa: IC50 / µM. Ordinates: expectancy of activity in white, expectancy of inactivity in black.
Figure 7 displays the PDD for the 50% inhibitory concentration (IC50) against herpes simplex virus type 1, a property determined in vitro. The maximum expectancy for the active group lies between -5 and 0 µM and for the inactive group between 15 and 20 µM. As said, these values are predictions for test sets, and even if the predicted values are not very accurate, the PDD reveals that this limiting property can be used to discriminate between anti-herpes and inactive compounds. The interval to be chosen depends on the purpose and must be determined by agreement. For example, to minimize false positives, the ideal interval would be from -10 to 0.
Figure 8. PDD for $\log(\mathrm{ID}_{50}) = -1.42\,{}^{0}\chi + 4.81\,{}^{0}\chi^{v} - 11.41\,{}^{3}\chi^{v}_{p} + 1.32\,{}^{3}\chi^{v}_{c} + 4.17\,{}^{4}\chi_{pc} - 8.42$. Abscissa: log(ID50), with ID50 expressed in µM. Ordinates: expectancy of activity in white, expectancy of inactivity in black.
The PDD shown in Figure 8 corresponds to another microbiological property: logarithm of the inhibitory dose-50 (ID50), determined in vivo. In this case, the limiting property is not as discriminant as in the former case, although the absolute maxima of each distribution are not coincident.
Figure 9 represents the PDD for a pharmacokinetic property, the percentage of unchanged drug found in urine, in logarithmic form. The maxima for active and inactive compounds are clearly different. This example illustrates how a property unrelated to the activity can be limiting if modelled for a group of active compounds.
Figure 9. PDD for $\log(\mathrm{UDU}) = -4.67\,{}^{1}\chi^{v} + 8.70\,{}^{2}\chi - 3.64\,{}^{3}\chi_{p} + 3.15\,{}^{3}\chi^{v}_{p} - 8.05\,{}^{3}\chi_{c} - 9.23$. Abscissa: logarithm of the percentage of unchanged drug in urine. Ordinates: expectancy of activity in white, expectancy of inactivity in black.
3 0 v Figure 10. PDD for D = - 1.17 χ + 2.11 χp + 2.79. Abscissa: classification function obtained by linear discriminant analysis. Ordinates: expectancy of activity in white, expectancy of inactivity in black.
Finally, Figure 10 shows the PDD for a linear discriminant function obtained for anti-herpes activity. It shows the typical behaviour of a discriminant function, with two partially overlapping Gaussian curves. It is noteworthy that PDDs can reflect the pharmacological activity profile of a group of molecules independently of the nature of the limiting property used, which can be a QSAR of a pharmacological property, a discriminant function or a QSPR of a pharmacokinetic property. PDDs are valuable tools in the validation of limiting properties, and consequently in the search for new drugs, and they give a clear picture of the quality of these properties.
Acknowledgments

E. B. acknowledges the financial support of grant number CTQ2009-09370 of the Spanish Ministry of Science and Innovation. J. V. de J. O. offers thanks for a grant from the I3P program of the Spanish ‘Consejo Superior de Investigaciones Científicas (CSIC)’ and financial support from the project MAT2007-64682 (Ministerio de Educación y Ciencia), Spain.
References

[1] Besalú, E.; de Julián-Ortiz, J. V.; Pogliani, L. “Some Plots Are not that Equivalent” MATCH Commun. Math. Comput. Chem. 2006, 55, 281-286.
[2] Besalú, E.; de Julián-Ortiz, J. V.; Iglesias, M.; Pogliani, L. “An Overlooked Property of Plot Methods” J. Math. Chem. 2006, 39, 475-484.
[3] Besalú, E.; de Julián-Ortiz, J. V.; Pogliani, L. “Trends and Plot Methods in MLR Studies” J. Chem. Inf. Model. 2007, 47, 751-760.
[4] Galton, F. “Regression towards Mediocrity in Hereditary Stature” J. Anthrop. Inst. 1886, 15, 246-263.
[5] Besalú, E. “Fast Computation of Cross-Validated Properties in Full Linear Leave-Many-Out Procedures” J. Math. Chem. 2001, 29, 191-204.
[6] Weisberg, S. Applied Linear Regression; John Wiley and Sons: New York, 1985.
[7] Hawkins, D. M. “The Problem of Overfitting” J. Chem. Inf. Comput. Sci. 2004, 44, 1-12.
[8] Adcock, R. J. “A Problem in Least Squares” Analyst 1878, 5, 53-54.
[9] de Julián-Ortiz, J. V.; Pogliani, L.; Besalú, E. “Two-variable linear regression: Modeling with Orthogonal Least Squares” J. Chem. Educ. (submitted).
[10] Besalú, E.; Ponec, R.; de Julián-Ortiz, J. V. “Virtual Generation of Agents Against Mycobacterium tuberculosis. A QSAR study” Mol. Divers. 2003, 6, 107-120.
[11] Barroso, J. M.; Besalú, E. “Design of experiments applied to QSAR: ranking a set of compounds and establishing a statistical significance test” Theochem 2005, 727(1-3), 89-96.
[12] Yan, S. F.; Asatryan, H.; Li, J.; Zhou, J. “Novel Statistical Approach for Primary High-Throughput Screening Hit Selection” J. Chem. Inf. Comput. Sci. 2005, 45, 1784-1790.
[13] Pearlman, D. A.; Charifson, P. S. “Improved scoring of ligand-protein interactions using OWFEG free energy grids” J. Med. Chem. 2001, 44, 502-511.
[14] Halgren, T. A.; Murphy, R. B.; Friesner, R. A.; Beard, H. S.; Frye, L. L.; Pollard, W. T.; Banks, J. L. “Glide: A New Approach for Rapid, Accurate Docking and Scoring. 2. Enrichment Factors in Database Screening” J. Med. Chem. 2004, 47, 1750-1759.
[15] Egan, J. P. Signal Detection Theory and ROC Analysis; Academic Press: New York, 1975.
[16] Bamber, D. C. “The area above the ordinal dominance graph and the area below the receiver operating characteristic graph” J. Math. Psychol. 1975, 12, 387-415.
[17] Hanley, J. A.; McNeil, B. J. “The meaning and use of the area under a receiver operating characteristic (ROC) curve” Radiology 1982, 143, 29-36.
[18] Hanley, J. A.; McNeil, B. J. “A method of comparing the areas under receiver operating characteristic curves derived from the same cases” Radiology 1983, 148, 839-843.
[19] Gálvez, J.; García-Domenech, R.; Gregorio-Alapont, C.; de Julián-Ortiz, J. V.; Popa, L. “Pharmacological distribution diagrams: a tool for de novo drug design” J. Mol. Graph. Model. 1996, 14, 272-276.
[20] de Julián-Ortiz, J. V.; Gálvez, J.; Muñoz-Collado, C.; García-Domenech, R.; Jimeno-Cardona, C. “Virtual combinatorial syntheses and computational screening of new potential anti-herpes compounds” J. Med. Chem. 1999, 42, 3308-3314.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 607-628
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 23
APPLICATION OF THE FUZZY LOGIC THEORY TO QSPR-QSAR STUDIES

Pablo R. Duchowicz^a and Eduardo A. Castro^b
Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas INIFTA (UNLP, CCT La Plata-CONICET), Diag. 113 y 64, C.C. 16, Suc.4, (1900) La Plata, Argentina
Abstract

The Fuzzy Logic Theory has been regarded as a brilliant, potent and revolutionary computer technology. It has received broad attention during the last decades and is widely employed in Physics, Mathematics and Chemistry, particularly for the classification and systematization of information, with applications in Theoretical Computer Science and Artificial Intelligence. The main reason for this is that Fuzzy Logic is a system of concepts, principles and methods for approximate ways of reasoning that are expressed in natural language. This chapter reviews the application of Fuzzy Logic Theory to the field of Quantitative Structure Property-Activity Relationships Theory, describing the studies developed by different experts in this fascinating field.
1. Introduction

The Fuzzy Logic Theory (FLT) rests on a firm basis that enables it to describe complex disciplines, and it does so admirably by providing a different kind of mathematics. Fuzzy concepts are found in Law, Medicine, Economics, Linguistics, System Theory, Philosophy and Psychology. This sort of fuzzy Mathematics describes quantities that cannot be investigated by means of probability distributions, and this kind of reasoning parallels real-life thought patterns much better than crisp reasoning does, as it is a theory of common sense.
a E-mail address: [email protected] / [email protected]. Corresponding author: Tel.: (+54) (221) 425-7430 / (+54) (221) 425-7291. Fax: (+54)(221) 4254642.
b E-mail address: [email protected] / [email protected]
According to Lotfi A. Zadeh [1], the founder of Fuzzy Logic, who developed the theory in the United States in 1964, it is a versatile theory, as it is possible to take anything in any field and fuzzify it. Much of the logic behind human reasoning is not the traditional Boolean two-valued or even multivalued logic, but a logic with fuzzy truths, fuzzy connectives, and fuzzy rules of inference. Fuzzy Logic Theory is not a logic that is fuzzy but a logic that deals with fuzzy quantities [2]. In the realm of FLT, fuzzy sets are mathematical objects able to model the vagueness present in natural language when we describe real phenomena that do not have sharply defined boundaries. Fuzzy sets are therefore sets that calibrate vagueness, and FLT has been suggested as a fundamental tool for carrying out approximate reasoning processes and for automation when knowledge is uncertain, incomplete, imprecise, or vague [3].

In ordinary Mathematics, we are used to dealing with well-defined problems that admit a forced “yes / no” answer, for instance membership of certain subsets of a given set of objects, such as the subset of “even integers” in the set of integers. By contrast, when we speak of the subset of “structurally similar compounds” in a set containing various chemical families of compounds, it may be difficult or impossible to decide whether a compound is in that subset or not. The main drawback of applying the ordinary approach in this case is that information may be lost in the process, because it is not being correctly expressed. Since Zadeh founded FLT, the technique has evolved to become more general and applicable to different problems of both chemical and biological interest. Several thinkers have recognized the ubiquity of fuzziness in the past: the physicist Albert Einstein, the quantum physicist Louis de Broglie, and the philosophers W. V. Quine and Ludwig Wittgenstein.
Because fuzzy concepts need to be properly addressed in any mathematical problem at hand, the FLT approach has a broad range of applications in several active fields of System Theory, whose great goal is a skeleton key to the working of systems. Expert Systems (ES) are software that makes decisions as human experts such as doctors or scientists do. In ES the rules expressing knowledge and facts are linguistic in nature, and the uncertainty involved can take on various facets, such as probability, possibility, belief functions, or fuzzy measures. Fuzzy-based ES would therefore best reflect the way humans think. Complex systems are systems with complicated and unpredictable behavior that defies human comprehension, such as living organisms [4]. Control Engineering of complex systems, where mathematical models are difficult to specify, and Pattern Recognition, where classes of objects are more fuzzy than crisp and the variability across objects needs to be modeled, would represent real challenges for theories other than FLT [2]. In his 1965 publication [1], Zadeh noted that fuzziness plainly differs from probability, although both describe uncertainty numerically and resemble each other in that both deal with degrees, one of truth and the other of likelihood. Probability treats yes/no occurrences, requires ignorance, and is inherently statistical. In contrast, fuzziness deals with degrees, does not require ignorance, and is completely nonstatistical. Consider an example similar to that provided in ref. [2], but now applied to the chemical context: the gas phase recombination reaction of two unknown atomic species A and B. The probabilistic question “Is atom A more electronegative than atom B?” leads to the answer 0.5 (ignorance): A may be more electronegative than B (value 1) or may not (value 0).
Suppose now that we know that the electronegativity of atom A is half that of B (additional information). The question now becomes “To what extent is atom A more electronegative than atom B?”,
Application of the Fuzzy Logic Theory to QSPR-QSAR Studies
which is a fuzzy question. The answer is still the same in the fuzzy case (0.5) but becomes a certainty (0) from the probabilistic point of view, as we now know that A is less electronegative than B. Unlike fuzziness, probability dissipates with increasing information: the more we know about the problem, the less uncertain it is. Probability vanishes because it requires ignorance. Fuzziness, on the contrary, can coexist with total information on the problem at hand. Fuzzy probabilities are a kind of fuzzy number, that is to say, an approximate number like “more or less” 30 percent or “around” 30 percent. Classic probability deals with such estimates by setting a crisp margin of error, such as plus or minus 5 percent, but fuzzy probability blurs this range. It is also possible to employ “fairly low” instead of 30 percent. The main concern of FLT is to represent, to manipulate, and to draw inferences from imprecise statements. In the chemical context, several known properties of interest involve a fuzzy definition, as is the case for acidity, structural similarity, aromaticity, reactivity, Hartree-Fock molecular orbitals, molecular shape, molecular symmetry, and others. In all these examples, the quantities are considered fuzzy in the sense that they cannot be sharply defined, and thus involve a “commonly established” criterion for dealing with such vague terms. For instance, not all aromatic compounds are perfectly aromatic; they are aromatic to some extent. An ideal aromatic compound is aromatic to an extent of 1.0. The Fuzzy Logic Theory (FLT) is suitable for dealing with many real-world problems characterized by complexity, uncertainty, and a lack of knowledge of the governing physical laws. Fuzzy Logic provides a conceptual framework for dealing with the problem of knowledge representation in an environment of uncertainty and imprecision.
The most important application of Fuzzy Set Theory (FST) is fuzzy rule-based modeling, where the relationships among system variables are modeled using linguistically interpretable rules. This virtue of FLT makes it possible to encode expert knowledge in a direct and easy way. Another advantage of the fuzzy approach over traditional ones is that a fuzzy system does not require a detailed mathematical description of the system being modeled. Among the capabilities of FST is the numerical expression of the imprecision that stems from grouping elements into classes that do not have sharply defined boundaries. Fuzzy set theory has been found to be a powerful mathematical tool in Artificial Intelligence, especially in the areas of knowledge representation and the design of Fuzzy Expert Systems, Artificial Neural Networks (ANN), Qualitative Reasoning, and Pattern Recognition [5]. For instance, Artificial Neural Networks rose to prominence in the mid-1980s, when Artificial Intelligence efforts started to stall. These are devices that learn and that are crudely based on the brain. If ANN are linked to fuzzy systems, they become more powerful and reliable. FST has also been widely studied for developing intelligent fuzzy logic control and optimization systems in Engineering. The present chapter reviews in some detail several Quantitative Structure Property-Activity Relationships (QSPR-QSAR) studies, developed by different experts, that apply FLT, and also includes some illustrative examples to give non-specialist readers insight into the methodology employed. It is not our purpose to give a dense mathematical treatment of fuzzy concepts, but rather to provide a clear and transparent account of the application of FLT in order to gain better insight into its modeling capabilities.
2. Fuzzy Sets and Rules
As has been known for decades, a set is defined as a collection of definite, distinguishable objects, in good agreement with our intuition [6]. Cantor’s classical Set Theory imposes a strict membership of an object in a set, that is to say, objects either belong or do not belong to the set [1,7], and none straddle the line. One simply dictates a clear breakpoint. FST has been developed to depart from this two-valued logic scheme. A fuzzy set (fs) A is defined by its objects x and their respective degrees of membership (dm) in the set. The use of dm gives a mathematical definition of fuzzy sets that continuously increases the number of objects encountered in human reasoning that can be subjected to scientific investigation. The dm of an object in the fs can range over the closed real interval [0, 1], and is given by a defined fuzzy membership function (fmf). This fmf maps each x to a given dm value. The space of objects X, which includes all generic elements x and satisfies A ⊆ X, is called the Universe.
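As an illustration of how an fmf maps an object x to a dm in [0, 1], the following minimal Python sketch implements three common shapes (triangular, trapezoidal, and Gaussian). The function names, the parameter names, and the numerical values of the hypothetical descriptor are assumptions made only for this example; they are not taken from the cited works.

```python
import math


def triangular(x, a, b, c):
    """Triangular fmf: 0 outside (a, c), rising linearly to 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal fmf: 1 on the plateau [b, c], linear shoulders on (a, b) and (c, d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def gaussian(x, c, sigma):
    """Gaussian fmf centred at c with width sigma."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


# Hypothetical descriptor value belonging to two fuzzy sets at the same time.
x = 2.5
low = triangular(x, 0.0, 2.0, 4.0)      # dm of x in the set "low"
medium = triangular(x, 2.0, 4.0, 6.0)   # dm of x in the set "medium"
print(low, medium)
```

The same object receives a different dm in each set, which is the behavior that distinguishes a fuzzy set from a crisp one.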
Figure 1. Examples of three fmf and their associated parameters [9].
In contrast to classical sets, an object can belong to multiple sets simultaneously by having a different dm in each set. Several kinds of fmf can be used, such as triangular, trapezoidal, Gaussian, bell, etc. [8]. During the design of fuzzy systems, a large number of fmf leads to a great number of fuzzy rules, which may cause overfitting, so that the system loses predictive capability and the computation time increases. In addition, wrong membership functions can lead to poor performance and possible instability. The fmf can be optimized through learning methods [9]. Fuzzy sets include crisp sets, as a crisp set is a fuzzy one with membership values of only 1 and 0. If an object is in a crisp set, it must have a value of 1; if it is in a fuzzy set, it can have any value except 0. It is possible to fuzzify fuzziness, as many aspects of classic Fuzzy Logic Theory turn out to be crisp: although a set may be fuzzy, a membership value of 0.7 is crisp. Second-order fuzzy sets or ultra-fuzzy sets therefore involve values like “about 0.7”, so that each membership is itself a fuzzy set [2]. In general, a fuzzy conditional rule is made up of a premise and a conclusion [7]:
IF premise THEN conclusion    (1)

The premise involves a number of fuzzy predicates $P_i$ of the type $P_i = (X_i \ \mathrm{IS}\ A_i)$, each of which may appear negated or combined with the others by operators such as AND or OR. In order to apply an inference method to assess the conclusion, it is first necessary to assess the dm of the premise. This is done by evaluating the dm of each $P_i$ and performing the respective fuzzy operations on them. For instance, dm(predicate) is calculated by assessing the dm of $X_i$ in the fuzzy set $A_i$. Two cases appear: (a) if $X_i$ is a fs, its dm is obtained by intersecting both sets and choosing the maximum value of dm; (b) if $X_i$ is a crisp value, its dm in $A_i$ is the value that the fmf of $A_i$ assumes for $X_i$. The conclusion results from the assessment of all the rules concerning the same output variable. In general, once the fuzzy output set has been obtained, it has to be defuzzified in order to transform the fuzzy information into numerical information. There exist various defuzzification methods, many of which simplify the calculation of the output value by using a single operation that aggregates the rules and defuzzifies, without the need to calculate the fuzzy output set [7,10].
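The evaluation of a rule base for crisp inputs, case (b) above, can be sketched as follows. This is only an illustrative sketch under stated assumptions: AND, OR, and negation are taken as min, max, and complement, and the rules are aggregated and defuzzified in a single weighted-average step with crisp rule conclusions, which is one of several possible choices. The fuzzy sets, the two rules, and all numerical values are hypothetical.

```python
def tri(x, a, b, c):
    """Triangular fmf with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Fuzzy connectives (assumed here: min for AND, max for OR, complement for NOT).
def fuzzy_and(*dms):
    return min(dms)


def fuzzy_or(*dms):
    return max(dms)


def fuzzy_not(dm):
    return 1.0 - dm


# Hypothetical fuzzy sets shared by both input variables.
def low(x):
    return tri(x, -2.0, 0.0, 4.0)


def high(x):
    return tri(x, 0.0, 4.0, 6.0)


# Each rule is a pair (premise function, crisp conclusion value).
rules = [
    (lambda v: fuzzy_and(low(v["x1"]), high(v["x2"])), 1.0),  # IF x1 IS low AND x2 IS high THEN y = 1
    (lambda v: fuzzy_or(high(v["x1"]), low(v["x2"])), 0.0),   # IF x1 IS high OR x2 IS low THEN y = 0
]


def infer(rules, inputs):
    """Aggregate all rules and defuzzify in one step (weighted average of conclusions)."""
    num = den = 0.0
    for premise, conclusion in rules:
        dm = premise(inputs)  # dm of the premise for these crisp inputs
        num += dm * conclusion
        den += dm
    if den == 0.0:
        raise ValueError("no rule fires for these inputs")
    return num / den


y = infer(rules, {"x1": 1.0, "x2": 3.0})
print(y)
```

With these values the first premise has a dm of 0.75 and the second a dm of 0.25, so the defuzzified output lies three quarters of the way towards the conclusion of the first rule.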
3. Medicinal Chemistry and FLT
A 1998 study [7] applies a new fuzzy learning technique called FuGeNeSys, which enables both the prediction of pharmacological activity and the development of new active compounds. This method allows a linguistic representation of the knowledge base, making it easy for a human to understand and interpret; the analysis of the rules may help in gaining a theoretical understanding of the parameters and parameter ranges that affect the activity of compounds. It also leads to great savings in money and time during the synthesis and testing of new compounds.
The automatic tool for learning fuzzy rules implemented in FuGeNeSys [10,11] is an ES that uses Genetic Algorithm (GA) and ANN approaches and allows supervised approximation of multi-input/multi-output systems; it is capable of learning and, at the same time, selecting the most important structural descriptors. It generates a small number of rules, yielding extremely compact knowledge bases that can be studied with accuracy. The study of these rules offers great potential for the prediction of pharmacological activity through a highly simplified analysis. The method stops only when it gives satisfactory results, providing a set of physically meaningful rules that are then analyzed, so that one has only to validate the model to demonstrate its predictive performance on new data. This fuzzy approach to QSAR achieves 100% correct recognition of the pharmacological activity of the compounds used in both the testing and learning phases, which are classified into active and inactive classes. The two datasets analyzed involve compounds inhibiting the Reverse Transcriptase enzyme of Human Immunodeficiency Virus (HIV) type 1 ($ED_{50}$ values for 44 compounds in the training set and 3 compounds in the test set) and the antirhinovirus activity of 9-benzyl purines against Rhinovirus serotype 1B ($IC_{50}$ values for 46 compounds in the training set and 6 compounds in the test set). The results are better than those found in the literature and offer the great advantage of linguistic representation.
4. Fuzzy Clustering Methods
The main objective of database mining (DBM) methods is to provide new efficient tools for designing and classifying large biochemical libraries [12,13], an important issue in Medicinal Chemistry during Combinatorial Chemistry and High-Throughput Screening (HTS) research programs, where new leads are selected on the basis of an analysis of the Molecular Diversity of compounds [14]. In QSPR-QSAR studies that involve complex datasets, an effective modeling approach can be to partition or “cluster” the available data into subsets of similar (common) data and then approximate each subset by a simple model, thus diminishing the complexity of the model [15]. Different pattern recognition approaches have been used in past years for establishing suitable classification models, and they offer different possibilities and objectives. Principal Component Analysis (PCA) [16] is only a projective technique, while Discriminant Analysis (DA) [17] is truly discriminating, as it is capable of finding linear relations in the molecular descriptor hyperspace that separate the different categories present in the data set. Both are valuable techniques whenever clusters or classes of compounds can be visually delineated, in other words, when they are grouped in well-separated regions. For more complex distributions their classification power diminishes, and it is not possible to gain knowledge about the structure of the database. Cluster Analysis (CA) [18] provides a first approximation to solving the problem: instead of inspecting all compounds in the database, it is enough to select some typical compounds representing each cluster to get a deeper knowledge of the distribution of compounds in the space of the descriptors involved.
Two main problems appear in CA based methods: a) the number of clusters and the initial positions of the cluster centers can influence the final result, and b) a compound lying between two clusters is included only in one of them.
Among methods based on Artificial Neural Networks, the Kohonen Self-Organizing Maps (SOM) [19] overcome the previous limitations by introducing non-linearity, so as to project the descriptor hyperspace onto a two-dimensional map while preserving the original topology: points located near each other in the original space remain neighbors in the SOM. The drawback of this method is that it is an unsupervised projective procedure like PCA [20]. The Back Propagation Neural Network (BPNN) [21] is a supervised method capable of discriminating any non-linearly separable class, relating continuous input and output spaces with an arbitrary degree of accuracy, and it has proven to be very efficient in modeling complex data set relationships [22,23]. However, the use of complex non-linear modeling functions usually impedes a better understanding of the biological mechanisms that lead to the observed activities of compounds. Pattern recognition strategies, which are related to the application of human sense, could be transferred to an algorithmic process applicable in the field of molecular recognition. The FLT provides interesting alternatives within the context of imprecise categories, as fuzzy classification represents the boundaries between neighboring classes as something continuous, assigning to compounds a degree of membership in each class. Fuzzy clustering methods allow objects to belong to multiple clusters at one time with different dm. The range of usefulness and complexity of the different fuzzy clustering methodologies is quite vast [24]. Fuzzy clustering methods include partitioning [25], subtractive clustering [26], fuzzy c-means, Gustafson-Kessel, fuzzy maximum likelihood estimate clustering, fuzzy c-varieties, fuzzy c-elliptotypes, fuzzy c-regression models, and possibilistic clustering [24].
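To make concrete how a compound lying between two clusters can belong to both of them, the following minimal sketch implements the textbook fuzzy c-means update (one of the methods listed above) in plain Python. The fuzzifier m = 2, the fixed number of iterations, the one-dimensional toy data, and the initial centers are assumptions chosen for illustration only; the studies discussed in this chapter use their own algorithms and settings.

```python
import math


def memberships(points, centers, m=2.0):
    """Return u[j][i], the dm of point j in cluster i (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        if 0.0 in d:
            # The point coincides with one or more centers: full membership there.
            row = [1.0 if di == 0.0 else 0.0 for di in d]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((di / dk) ** exponent for dk in d) for di in d]
        u.append(row)
    return u


def update_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append(tuple(sum(wj * p[k] for wj, p in zip(w, points)) / total
                             for k in range(dim)))
    return centers


def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)


# Toy one-dimensional "descriptor" values; the third point lies midway between two groups.
points = [(0.0,), (0.2,), (1.0,), (1.8,), (2.0,)]
centers, u = fuzzy_c_means(points, [(0.0,), (2.0,)])
print(u[2])  # dm of the intermediate point in each of the two clusters
```

In a crisp clustering the intermediate point would be forced into a single cluster, whereas here it is shared between the two clusters with approximately equal dm.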
A previous work [20] proposes an improvement based on FLT over SOM and Bayesian ANN [22] based methods for analyzing a set of Central Nervous System (CNS)-active molecules, considered good candidates for the treatment of diffuse neurological pathologies [27]. CNS-active compounds are classified according to the different CNS receptors on which they could act. A hybrid system constituted by SOM and Fuzzy Clustering (FC) [28,29] is applied to 389 active molecules acting on eight types of receptors. The predictive QSAR model, which considers 259 compounds, correctly predicts the experimental activity of 130 compounds with a ratio of 81%. As a continuation of that research, ref. [30] develops a new and more general DBM method called Adaptive Fuzzy Partition (AFP), which is applied to an enlarged data set of 581 CNS compounds, mainly consisting of selective agonists or antagonists acting on the same eight receptor types. The prediction ability of AFP is evaluated with a training set of 377 CNS-active molecules, which are subdivided into eight receptor classes or subspaces. The CNS data set is first distributed within a hyperspace given by 166 molecular descriptors, including constitutional, topological, physicochemical, and electronic parameters, computed with the ChemInter [31] and SciQSAR2D [32] programs. The structural feature selection is performed by a procedure based on Genetic Algorithms [33], leading to the best 11 descriptors of the pool. The AFP is based on the Fuzzy Partition algorithm [34,35], which generates fuzzy rules from numerical data in two main steps: (a) partitioning a working space into fuzzy subspaces ($S_k$); and (b) defining a fuzzy rule for each of these subspaces. For instance, the rule for $S_k$ in a d-dimensional hyperspace defined by d descriptors is of the following type:
IF $d_{1A}$ is associated with $\mathrm{fmf}_{1k}(d_{1A})$ AND $d_{2A}$ is associated with $\mathrm{fmf}_{2k}(d_{2A})$ … AND $d_{NA}$ is associated with $\mathrm{fmf}_{Nk}(d_{NA})$ THEN the score of the activity for A is $P_k$    (2)

where $d_{iA}$ is the value of the ith descriptor for molecule A, $\mathrm{fmf}_{ik}$ is the membership function related to descriptor i for subspace $S_k$, and $P_k$ is the experimental activity in the fuzzy subspace. In the fuzzy rule given by Eq. (2), the AND is generally represented by the Min operator [36], and the membership function can be defined by triangular, trapezoidal or Gaussian shapes [35,37,38]. After that, AFP builds a model by establishing relationships (rules) between the best 11 selected descriptors and the CNS activities, performing a dynamical division of the descriptor hyperspace into a set of fuzzily partitioned subspaces. In this case, the membership functions used are trapezoidal shapes based on the boundaries of the created subspaces. The degree of membership of activity P for molecule A (P(A)) is defined as follows:

\[
P(A) = \frac{\sum_{k=1}^{n_S} \left( \min_{i} \mathrm{fmf}_{ik}(d_{iA}) \right) P_k}{\sum_{k=1}^{n_S} \left( \min_{i} \mathrm{fmf}_{ik}(d_{iA}) \right)} \tag{3}
\]
where $n_S$ represents the total number of subspaces and $P_k$ is the experimental activity in subspace $S_k$. For example, the following parameters are used to process the data: maximal number of rules for each chemical activity = 35, and minimal number of compounds for a given rule = 4. Furthermore, the robustness of the technique is confirmed by predicting an external test set of 102 compounds never used to define the AFP models. Validation ratios of about 80% are obtained in the prediction of the experimental CNS activities. Finally, a comparison between the results obtained by AFP and by other classic techniques, such as Learning Vector Quantization (LVQ) [39] and BPNN, shows that AFP appreciably improves the prediction power of the proposed QSAR models. A main advantage of AFP is that, regardless of its complexity, the test phase takes only a few minutes to screen several thousand molecules, as claimed by the authors of this work. In addition, subspaces are described by simple linguistic rules, and for each compound a degree of membership (ranging from 0.0 to 1.0) in the different CNS biological properties is calculated. The AFP achieves three improvements over the cited SOM/FC hybrid method [30]: (a) the classification of the eight CNS activities is directly performed in the original descriptor hyperspace, avoiding the loss of information due to the projection onto a 2D map; (b) each CNS activity is represented by its own set of relevant molecular descriptors; (c) each compound can be directly related to a unique biological property and not to a cluster of activities, the number of which depends on the compound distribution in the hyperspace and on the parameters used in the FC classification.
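The score of Eq. (3) can be sketched in a few lines of Python: for each subspace the Min operator is applied over the descriptor memberships, and the resulting firing degrees weight the subspace activities. The trapezoid parameters, the two subspaces, and the descriptor values below are hypothetical, and the choice of raising an error when no rule fires is a convention assumed only for this sketch.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal fmf: 1 on [b, c], linear shoulders on (a, b) and (c, d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def afp_score(descriptors, subspaces):
    """Eq. (3): Min-weighted average of the subspace activities P_k.

    subspaces is a list of (fmf_params, P_k) pairs, where fmf_params holds one
    (a, b, c, d) trapezoid per descriptor.
    """
    num = den = 0.0
    for fmf_params, p_k in subspaces:
        # AND of Eq. (2) represented by the Min operator over all descriptors.
        firing = min(trapezoid(x, *params) for x, params in zip(descriptors, fmf_params))
        num += firing * p_k
        den += firing
    if den == 0.0:
        raise ValueError("the molecule lies outside every fuzzy subspace")
    return num / den


# Hypothetical molecule described by two descriptors and two fuzzy subspaces.
molecule = (1.0, 2.5)
subspaces = [
    ([(0.0, 0.5, 1.5, 2.0), (1.0, 2.0, 3.0, 4.0)], 1.0),  # subspace S_1 with P_1 = 1
    ([(0.5, 1.5, 3.0, 4.0), (2.0, 3.0, 4.0, 5.0)], 0.0),  # subspace S_2 with P_2 = 0
]
score = afp_score(molecule, subspaces)
print(score)
```

Here the molecule fires the first rule with degree 1.0 and the second with degree 0.5, giving a score of 2/3 for the activity associated with the first subspace.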
5. Fuzzy ARTMAP Neural Networks
Non-linear QSPR models are established for the prediction of different physicochemical properties by using the fuzzy ARTMAP neural network cognitive classifier [40,41]. The fuzzy ARTMAP neural network has been demonstrated to achieve greater accuracy (in terms of lower average absolute errors) for estimating boiling temperatures [42], critical properties [43], octanol/water partition coefficients [44], aqueous solubilities [45], and toxicities [46], when compared with the BPNN approach as well as with other regression-based statistical correlations reported in the literature. Fuzzy logic based ARTMAP neural networks have several advantages: they are able to carry out fast but stable online recognition, learning, hypothesis testing, and prediction of rare events, avoiding the plasticity-stability dilemma of many popular autonomous learning systems such as BPNN, where long training is needed and results in huge, unstable networks. The basic learning mechanism of the fuzzy ARTMAP neural system consists of creating new categories (equivalent to hidden units in back-propagation) when dissimilar molecular descriptors and different values of the physical property are encountered. The fuzzy ART architecture was designed by Carpenter et al. [47] as a classifier for multidimensional data clustering according to a set of features.
Figure 2. Block diagram of the fuzzy ARTMAP neural network architecture [43].
In brief, the ARTMAP network consists of two fuzzy ART modules, artA and artB, that are linked together via an inter-ART module (Figure 2). Module artA learns how to categorize the input patterns (molecular descriptors) presented to layer $F_0^a$ with a vigilance parameter $\rho_a$, while artB develops categories of the experimental property presented to layer $F_0^b$ with a vigilance parameter $\rho_b$. Both vigilance parameters calibrate how well an input pattern must match the learned prototype, or cluster of input features that the category deems relevant, for the category to be accepted. The two modules are linked by the map field module of Figure 2, an associative learning network that forms an internal controller designed to create the minimal number of artA recognition categories, or hidden units, needed to meet the accuracy criteria, by following the match tracking rule. When the molecular descriptors are presented, the artA module attempts to predict, through the map field, the category to which the current target belongs. The predictions of this method are of an “if-then” nature, e.g., IF the molecular structure has features close enough to a particular prototype, THEN it predicts the desired outcome. Many such rules coexist without mutual interference because of the competitive interactions whereby each hypothesis is compressed.
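The role of the vigilance parameter can be illustrated with a minimal sketch of a single fuzzy ART module. The formulas used (complement coding, the fuzzy AND taken as the component-wise minimum, the choice function, and the match criterion) follow the usual textbook description of fuzzy ART and are not given in this chapter, so they should be read as assumptions; the map field and the match tracking rule of the full ARTMAP network are deliberately left out. The class name, parameter names, and input patterns are hypothetical.

```python
def complement_code(a):
    """Complement coding: a pattern a in [0, 1]^n becomes (a, 1 - a)."""
    return list(a) + [1.0 - x for x in a]


def fuzzy_and(u, v):
    """Fuzzy AND of two vectors, taken as the component-wise minimum."""
    return [min(x, y) for x, y in zip(u, v)]


class FuzzyART:
    """Single fuzzy ART module (simplified sketch, no map field)."""

    def __init__(self, rho, alpha=0.001, beta=1.0):
        self.rho = rho        # vigilance parameter
        self.alpha = alpha    # choice parameter
        self.beta = beta      # learning rate (1.0 corresponds to fast learning)
        self.weights = []     # one prototype vector per category

    def present(self, a):
        """Present a pattern scaled to [0, 1]; return the index of the accepting category."""
        coded = complement_code(a)

        def choice(j):
            w = self.weights[j]
            return sum(fuzzy_and(coded, w)) / (self.alpha + sum(w))

        # Search the existing categories in order of decreasing choice value.
        for j in sorted(range(len(self.weights)), key=choice, reverse=True):
            w = self.weights[j]
            overlap = fuzzy_and(coded, w)
            if sum(overlap) / sum(coded) >= self.rho:  # vigilance (match) test
                self.weights[j] = [self.beta * o + (1.0 - self.beta) * x
                                   for o, x in zip(overlap, w)]
                return j
        # No prototype matches well enough: create a new category.
        self.weights.append(coded)
        return len(self.weights) - 1


patterns = [[0.1, 0.2], [0.15, 0.25], [0.9, 0.8]]
loose = FuzzyART(rho=0.8)
strict = FuzzyART(rho=0.99)
loose_categories = [loose.present(p) for p in patterns]
strict_categories = [strict.present(p) for p in patterns]
print(loose_categories, strict_categories)
```

A higher vigilance demands a closer match to the stored prototype, so the same three patterns produce more categories, which is the mechanism by which new hidden units are created when dissimilar descriptors are encountered.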
6. Ontogenic Neuro-Fuzzy Algorithm: FCID3
In a 1997 study [48], structure-activity relationships are developed for correlating the observed mutagenic behavior of 62 aminoazo derivatives and 12 of their reductive cleavage products with molecular descriptors calculated by semiempirical quantum-chemical methods. A model consisting of 8 descriptors, obtained with the CODESSA software and its Best Multilinear Regression (BMLR) method [49], is shown to account for more than 70% of the variation in the relative mutagenic activity of these compounds. The non-linear approach adopted in this analysis integrates fuzzy logic with ANN in the FCID3 hybrid algorithm [50]: ANN supply the computational power necessary to process large quantities of data rapidly, while fuzzy logic provides a high-level reasoning capability that guides the overall construction of the network topology. The algorithm generates a feedforward network architecture for the data set and, after generating fuzzy Kosko entropies [51] at each node of the network, switches to fuzzy decision making based on those entropies. FCID3 is a fuzzification of the ontogenic CID3 algorithm [52], which generates a neural network architecture by minimizing Shannon’s entropy function through the addition of new nodes arranged in layers. The initial network architecture is generated in the same way for both CID3 and FCID3, although FCID3 subsequently defines dm for the fs associated with each of the hidden layer nodes where the entropy is first reduced to zero. Under this condition of zero entropy all the training examples are correctly recognized. Once the fs are defined, FCID3 switches entirely to operations on fs. This results in a simpler architecture than that of CID3 for correctly classifying data, with a drastic reduction in the number of connections and nodes. Nodes and hidden layers are added as needed until the learning task is accomplished.
In this study of mutagenicity, the architecture is restricted to a single hidden layer, and the approach can account for about 95% of this variation using 8 descriptors. Furthermore, the predictive power of this network, as assessed by the Cross-Validation technique [53], is exceptionally good, $R_{CV}^{2} = 0.94$. The FCID3 has also been applied to modeling the mutagenicity of aromatic and heteroaromatic amines and related compounds [54].
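For readers unfamiliar with the fuzzy entropy mentioned above, the following sketch computes it in the form usually attributed to Kosko, namely the ratio between the overlap and the underlap of a fuzzy set with its own complement. The chapter does not state the formula, so this definition is an assumption, and the way FCID3 evaluates it at each network node is not reproduced here. The membership values are hypothetical.

```python
def kosko_entropy(memberships):
    """Fuzzy entropy as overlap / underlap of a fuzzy set A and its complement.

    overlap  = sum of min(a, 1 - a)   (counting measure of A AND not-A)
    underlap = sum of max(a, 1 - a)   (counting measure of A OR not-A)
    The result is 0 for a crisp set and 1 when every dm equals 0.5.
    """
    if not memberships:
        raise ValueError("the fuzzy set must contain at least one element")
    overlap = sum(min(a, 1.0 - a) for a in memberships)
    underlap = sum(max(a, 1.0 - a) for a in memberships)
    return overlap / underlap


crisp = kosko_entropy([0.0, 1.0, 1.0, 0.0])   # every example fully in or out
vague = kosko_entropy([0.5, 0.5])             # maximally fuzzy set
partial = kosko_entropy([0.2, 0.9])
print(crisp, vague, partial)
```

Under this definition, zero entropy corresponds to memberships that are all 0 or 1, which is consistent with the condition described above in which all training examples are correctly recognized.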
7. Prediction of Mixture Toxicities Based on a Fuzzy Set Method
A new methodology is proposed that uses molecular descriptors and Fuzzy Set Theory to characterize the degree of similarity and dissimilarity of mixture constituents, so that it is able to predict the mixture toxicity regardless of whether the mixture constituents possess similar, dissimilar, or mixed similar and dissimilar mechanisms of action [55,56]. It avoids the two-valued logic criterion that considers only mixtures comprising constituents with either completely similar or completely dissimilar mechanisms, which have previously been described by the concentration addition and independent action models, respectively. The INFCIM (INtegrated Fuzzy Concentration addition Independent action Model) enables an objective quantitative assessment of similarity and dissimilarity that relies less on mechanisms of action. The INFCIM is applied in two case studies using toxicity data of four mixtures, and its performance is compared against those of both the concentration addition and the independent action models. It is demonstrated that INFCIM performs comparably to or better than the best performing existing model in the original studies for all the mixtures tested [55]. The framework of the INFCIM applied to the analysis of toxicities is based on the following steps:
(a) for a mixture of N components, derive the concentration response curves (CRC) for all the pure components and for the mixture at a given composition;
(b) calculate descriptors for each compound; Dragon descriptors [57] are employed here;
(c) obtain intermolecular distances (x) using the descriptors; this is done here by calculating Euclidean distances between all pairs of molecules;
(d) use fmf to calculate the binary similarity and dissimilarity between mixture components. The proper choice of fmf shape can improve the performance of the INFCIM model, and the work employs the Gaussian fmf of Eq. (4) for describing the degree of similarity between mixture constituents:

\[
dm = \exp\left( -\frac{(x - c)^2}{2\sigma^2} \right) \tag{4}
\]

with dm being the degree of membership, x the binary molecular distance, c the mean, and σ the standard deviation, which is the adjustable parameter; c is set to zero so that a binary distance of zero corresponds to a similarity of 1;
(e) calculate the overall mixture similarity and dissimilarity measures based on the similarity/dissimilarity values of the pure molecular pairs. The Z-fmf of Eq. (5) is utilized for describing the degree of dissimilarity between constituents:

\[
dm = 1 - \begin{cases}
1 & \text{if } x < x_1 \\[4pt]
1 - 2\left( \dfrac{x - x_1}{x_1 - x_0} \right)^2 & \text{if } x_1 \le x < \dfrac{x_1 + x_0}{2} \\[8pt]
2\left( \dfrac{x_0 - x}{x_1 - x_0} \right)^2 & \text{if } \dfrac{x_1 + x_0}{2} \le x < x_0 \\[8pt]
0 & \text{if } x \ge x_0
\end{cases} \tag{5}
\]

where $x_1$ and $x_0$ are the start and end points of the sloping part of the Z-fmf and are the adjustable parameters; $x_1$ is set to zero so that a binary distance of zero corresponds to a dissimilarity of 0;
(f) the CRC for a mixture at a given composition is used to optimize the selection of the fuzzy membership functions and their associated parameters;
(g) the optimized fuzzy membership functions and parameter values can be used for future prediction of the toxicity of mixtures of the same constituents but with different compositions, through the INFCIM model:

\[
EC_{x,mix} = w_A \cdot (CA) + w_B \cdot (IA) \tag{6}
\]

In this equation, $EC_{x,mix}$ is the mixture toxicity, and the coefficients $w_A$ and $w_B$ are weightings for the concentration addition (CA) and independent action (IA) contributions. These weightings are calculated using descriptors and fmf.
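The building blocks of steps (c), (d), and (g) can be sketched as follows: the Euclidean distance between two descriptor vectors, the Gaussian similarity of Eq. (4), the dissimilarity of Eq. (5), and the weighted combination of Eq. (6). How the weightings $w_A$ and $w_B$ are derived from the pairwise values is not detailed in this chapter, so in this sketch they are simply passed in as given numbers; the descriptor vectors, the parameters σ and $x_0$, and the CA and IA predictions are all hypothetical.

```python
import math


def gaussian_similarity(x, sigma, c=0.0):
    """Eq. (4): similarity of two constituents separated by the distance x."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def z_dissimilarity(x, x0, x1=0.0):
    """Eq. (5): one minus a Z-shaped fmf with slope start x1 and slope end x0."""
    mid = 0.5 * (x1 + x0)
    if x < x1:
        z = 1.0
    elif x < mid:
        z = 1.0 - 2.0 * ((x - x1) / (x1 - x0)) ** 2
    elif x < x0:
        z = 2.0 * ((x0 - x) / (x1 - x0)) ** 2
    else:
        z = 0.0
    return 1.0 - z


def infcim(ca, ia, w_a, w_b):
    """Eq. (6): weighted combination of the CA and IA predictions."""
    return w_a * ca + w_b * ia


# Step (c): Euclidean distance between two hypothetical descriptor vectors.
compound_1 = (0.0, 0.0)
compound_2 = (1.2, 1.6)
x = math.dist(compound_1, compound_2)

# Step (d): binary similarity and dissimilarity with hypothetical parameters.
similarity = gaussian_similarity(x, sigma=2.0)
dissimilarity = z_dissimilarity(x, x0=4.0)

# Step (g): hypothetical CA and IA predictions combined with given weightings.
ec_mix = infcim(ca=10.0, ia=20.0, w_a=0.75, w_b=0.25)
print(x, similarity, dissimilarity, ec_mix)
```

With c and $x_1$ set to zero, a distance of zero gives a similarity of 1 and a dissimilarity of 0, as required in steps (d) and (e).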
8. Robust Fuzzy Mappings in QSAR
An important issue in QSAR modeling is robustness: a model should not undergo overtraining, and its performance should be as insensitive as possible to the modeling errors associated with the chosen set of molecular descriptors and with the linear or non-linear functional form of the model. Although various fuzzy techniques have been developed using approaches such as ANN, GA, clustering techniques, and Kalman filtering [24,58-60], most of the recursive fuzzy identification methods use gradient-descent based algorithms (such as back-propagation) for calculating non-linear fuzzy model parameters. However, in the presence of data uncertainties and modeling errors, gradient-descent based techniques are not suitable because of their non-robust nature, which leads to errors in the identification of model parameters. As QSAR analyses usually involve complexities and uncertainties due to the lack of complete knowledge of the underlying physical laws, a recent study [61] presents a new method of clustering-based fuzzy mappings (rules), establishing robust input-output mappings based on fuzzy “if-then” rules. The identification of these mappings is a first-principle based approach that minimizes the sensitivity (non-robustness) of the identification method towards
modeling errors. The sensitivity of an identification method can be assessed by measuring the gain in energy from the modeling errors to the identification errors [62]:

\[
\max \frac{\text{energy of identification errors}}{\text{energy of modeling errors}} \rightarrow \min \tag{7}
\]
The maximum value of the energy gain (which is to be minimized) is calculated over all possible finite disturbances without making any statistical assumptions about the nature of the signals. This is called the “energy-gain bounding approach” to model identification. An identification method of the kind given by Eq. (7) guarantees that small modeling errors cannot lead to large identification errors. QSAR model development using Bayesian regularized neural networks is taken as the reference method against which the performance of the proposed robust QSAR fuzzy models is compared. For this task, three data sets are considered: carboquinones, benzodiazepines, and the rate constants for hydroxyl radical tropospheric degradation of 460 heterogeneous organic compounds. The method based on fuzzy mappings outperforms the ANN results because the energy-gain bounding approach mathematically takes the issue of modeling errors into account in a sensible manner, without making any assumption about the nature of the signals. Apart from the experimental data, the better performance of the proposed approach in the presence of disturbances, compared with the Bayesian regularized ANN, is also demonstrated numerically in terms of the values taken by a properly defined generalization error.
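The quantity appearing in Eq. (7) can be illustrated numerically. In the sketch below the energy of an error sequence is taken as its sum of squares, which is an assumption borrowed from common usage rather than a definition given in this chapter, and the maximum is taken over a small hand-made list of disturbance scenarios instead of over all finite disturbances. The sketch only evaluates the ratio; it does not reproduce the identification algorithm of ref. [61]. All numbers are hypothetical.

```python
def energy(signal):
    """Energy of an error sequence, taken here as the sum of squared values."""
    return sum(e * e for e in signal)


def energy_gain(identification_errors, modeling_errors):
    """Ratio of Eq. (7) for one disturbance scenario."""
    denominator = energy(modeling_errors)
    if denominator == 0.0:
        raise ValueError("the modeling errors must have non-zero energy")
    return energy(identification_errors) / denominator


def worst_case_gain(scenarios):
    """Largest energy gain over a finite list of (identification, modeling) error pairs."""
    return max(energy_gain(ident, model) for ident, model in scenarios)


# Two hypothetical disturbance scenarios.
scenarios = [
    ([0.1, -0.2], [0.5, 0.5]),
    ([0.3, 0.4], [1.0, 0.0]),
]
gain = worst_case_gain(scenarios)
print(gain)
```

A robust identification method in the sense of Eq. (7) is one for which this worst-case ratio stays small, so that small modeling errors cannot be amplified into large identification errors.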
9. ANFIS: Adaptive Neuro-Fuzzy Inference System
A recent study employs for the first time the Adaptive Neuro-Fuzzy Inference System (ANFIS) [9] in QSAR, modeling a data set of 68 pyrimidine derivatives acting as DHFR inhibitors, described first by Hansch et al. [63,64] and later by So and Richards [65]. This ANFIS system combines the GA technique for feature selection with FLT and ANN, resulting in an improved tool for determining the behavior of imprecisely defined complex systems by inducing rules from observations. A fuzzy inference system (FIS) [1], which is a knowledge representation where each fuzzy rule describes a local behavior of the system, can be viewed as a feed-forward network structure, and thus it is possible to apply the same back-propagation principle used in ANN. Therefore, the juncture of FLT and neurocomputing leads to the formulation of neuro-fuzzy systems. The ANFIS is a multilayer procedure that employs hybrid learning rules to train a Sugeno-style FIS [66] with linear rule outputs, a very efficient and transparent FIS; the system has a total of five layers. Figure 3 compares the topology of both techniques for a simple example of two inputs and two rules. The input and output adaptive nodes represent the descriptors and the calculated activity, respectively, and in the hidden layers there are nodes functioning as fmf and rules. Nodes of the same layer have similar functions. The ANFIS maps inputs through input fmf and their associated parameters, and then through output fmf and their associated parameters to outputs; these parameters can be used to interpret the input/output map. This method
eliminates the disadvantage of a normal feed-forward multilayer network, which is difficult for an observer to understand and modify. The ANFIS is trained using a hybrid algorithm consisting of back-propagation for learning the premise parameters (parameters of layer 1 in Figure 3b) and least-squares estimation for learning the consequent parameters (parameters of layer 4 in Figure 3b). The overall output is expressed as a linear combination of the consequent parameters. Here, the optimum number and shape of the fmf are obtained through grid partition [59,67] and the subtractive clustering algorithm [26], techniques that allow a proper partition of the descriptor input space in order to decrease the number of fuzzy rules and increase the speed of the training and testing phases.
Figure 3. (a) A two-input first-order Sugeno fuzzy model with two rules. (b) equivalent ANFIS architecture [9].
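The forward pass through the five layers of a two-input, two-rule first-order Sugeno model can be sketched as follows. Gaussian membership functions, the product as the rule conjunction, and all parameter values are assumptions made for this illustration; the function names are hypothetical and the code is not taken from [9].

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian fuzzy membership function (an assumed shape for this sketch)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_two_rule(x, y, premise, consequent):
    """Forward pass of a two-input first-order Sugeno model.

    premise[k]    = ((center_A, sigma_A), (center_B, sigma_B)) for rule k
    consequent[k] = (p, q, r), so that the rule output is f_k = p*x + q*y + r
    """
    # Layers 1-2: membership degrees and rule firing strengths (product).
    weights = [gaussian_mf(x, *a) * gaussian_mf(y, *b) for a, b in premise]
    # Layer 3: normalised firing strengths.
    total = sum(weights)
    normalised = [w / total for w in weights]
    # Layer 4: linear rule outputs. Layer 5: weighted sum of the rule outputs.
    outputs = [p * x + q * y + r for p, q, r in consequent]
    return sum(w * f for w, f in zip(normalised, outputs))


# Hypothetical premise (layer 1) and consequent (layer 4) parameters.
premise = [((0.0, 1.0), (0.0, 1.0)), ((1.0, 1.0), (1.0, 1.0))]
consequent = [(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)]
prediction = sugeno_two_rule(0.5, 0.5, premise, consequent)
print(prediction)
```

For fixed premise parameters the output is linear in the consequent parameters (p, q, r), which is why the hybrid algorithm can estimate them by least squares while back-propagation adjusts the membership function parameters.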
A training set of 48 pyrimidine derivatives, a validation set of 10 compounds, and an external test set of 10 compounds are employed in this study; they are selected by means of D-Optimal Design and Kohonen Self-Organizing Map approaches [68]. The predictive abilities of the resulting models are compared with those of classical multivariate regression methods such as linear and nonlinear (quadratic) partial least squares regression (PLS). The ANFIS method outperforms both the PLS models and the previously published results in predictive capability. Among the strengths of the method are fast and accurate learning, efficient handling of imprecision and nonlinearity, good generalization capabilities, excellent explanation facilities in the form of semantically meaningful fuzzy rules, and the ability to combine data with existing expert knowledge. The ANFIS technique has also been successfully applied, among other studies, in the modeling of skin permeability coefficients of drugs [15], the prediction of NMDA (N-methyl-D-aspartate) receptor binding activities of phencyclidine (PCP)
derivatives [69], liquid chromatography-mass spectrometry (LC-MS) retention time of benzodiazepines [70], prediction of respiratory motion [71], study of 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine (HEPT) derivatives acting as Non-Nucleoside Reverse Transcriptase Inhibitors (NNRTIs) [72], or in the study of serotonin (5-HT7) receptor inhibitors [73].
10. Fuzzy Regression in QSAR
The use of FLT-based approaches to characterize imprecision has recently been recognized in risk assessment and environmental policy applications [74]. A 2004 QSPR study [75] employs the Fuzzy Least Squares Regression (FLSR) technique [76,77] to develop a relationship for the logarithm of the soil-water partitioning coefficient normalized to organic carbon ($\log_{10} K_{oc}$) of persistent organic pollutants (POPs). The molecular structures are represented by the octanol-water partition coefficient ($\log_{10} K_{ow}$) and three molecular connectivity indices.
Figure 4. A triangular fuzzy number [75].
The FLSR differs from Least Squares Regression (LSR) and is used to characterize the imprecision arising from limited data and/or incomplete model descriptions. In this study, it is assumed that statistically based QSPR do not fully account for all the sorbate-sorbent interactions in the partitioning of POPs, for several reasons: (a) the model developed does not capture all the mechanisms of action; (b) the functional form of the model may be inadequate [78]; (c) experimental artifacts lead to vagueness in the data during the measurement of aqueous concentrations of the highly hydrophobic POPs. This last factor cannot be explained by statistical randomness and, together with the previous ones, causes these relationships to have inherent fuzziness associated with them. Unlike LSR, the FLSR does not make specific assumptions regarding the distribution of the residuals, and it is known to work well in situations where data may not be very accurate or may exhibit large variability. It has been proposed for use with either fuzzy or nonfuzzy (crisp) inputs and outputs, but the regression coefficients are treated as symmetrical fuzzy numbers. The fuzzy number represented in Figure 4 is a fs whose fmf is convex in shape (having an increasing and a decreasing part) and normal (dm in the range 0-1). Thus, the fuzzy coefficients
define the most likely values for the input along with their range of variation. The most likely values have dm = 1, while the maximum and minimum values have dm = 0.
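A symmetrical triangular fuzzy number of the kind described above can be written down in a few lines. The sketch assumes the usual parameterization by a midpoint (most likely value, dm = 1) and a half-width (distance to the points where dm falls to 0); the function name and the numerical values are hypothetical.

```python
def triangular_dm(x, midpoint, halfwidth):
    """Degree of membership of x in a symmetrical triangular fuzzy number."""
    if halfwidth <= 0.0:
        raise ValueError("halfwidth must be positive")
    return max(0.0, 1.0 - abs(x - midpoint) / halfwidth)


# Hypothetical fuzzy coefficient with midpoint 2.0 and half-width 0.5.
for value in (1.5, 1.75, 2.0, 2.25, 2.5, 3.0):
    print(value, triangular_dm(value, 2.0, 0.5))
```

The membership function rises linearly from the minimum value to the midpoint and falls linearly to the maximum value, which gives the convex and normal shape mentioned in the text.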
Figure 5. Fuzzy and statistical regression between $\log_{10} K_{oc}$ and $\log_{10} K_{ow}$ [75].
In this study, the Fuzzy Least Squares with Minimum Fuzziness Criterion (FLSMFC) [79] is employed. It consists of a two-step process. First, LSR is used to obtain a fit between the input and output values, and the resulting regression coefficients are used as the midpoints of the fuzzy regression coefficients. Second, the half-widths of the fuzzy coefficients are obtained using the minimum vagueness criterion proposed by Tanaka et al. [76]. The relatively small size of the data set, consisting of 18 $\log_{10} K_{oc}$ values of POPs with experimental errors in their measurement, suggests that FLSR is a suitable technique for modeling purposes. A comparison in Figure 5 between the statistical relationship, shown with a 95% confidence interval, and the fuzzy relationship between the persistence of POPs and $\log_{10} K_{ow}$ indicates that the fuzzy regression model envelops all the scatter in the data and provides a tighter and more reliable fit around the midpoint values given by the LSR estimates.
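The two-step idea can be sketched for a single descriptor. The sketch makes a strong simplifying assumption that is not part of [75,79]: only the intercept is treated as a fuzzy number, so the minimum-vagueness step reduces to finding the smallest half-width whose band covers every data point at membership level h. The full criterion of Tanaka et al. [76] determines the half-widths of all coefficients by linear programming. The data values and function names are hypothetical.

```python
def least_squares_line(xs, ys):
    """Step 1: ordinary least squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fuzzy_intercept_halfwidth(xs, ys, intercept, slope, h=0.0):
    """Step 2 (simplified): smallest intercept half-width covering all points.

    Every observation must lie inside the h-level band of the fuzzy output,
    i.e. |residual| <= (1 - h) * halfwidth.
    """
    if not 0.0 <= h < 1.0:
        raise ValueError("h must satisfy 0 <= h < 1")
    max_residual = max(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
    return max_residual / (1.0 - h)


# Hypothetical data, not taken from the study.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.8]
intercept, slope = least_squares_line(xs, ys)
halfwidth = fuzzy_intercept_halfwidth(xs, ys, intercept, slope)
print(intercept, slope, halfwidth)
```

With h = 0 the band intercept ± halfwidth around the LSR line just touches the point with the largest residual, which mirrors the statement that the fuzzy model envelops all the scatter in the data.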
11. Conclusion
The Fuzzy Logic Theory has been successfully used in past years for modeling, control systems, pattern recognition, and image processing. In this work we have reviewed various applications of the Fuzzy Logic Theory to the field of QSPR-QSAR. Among the major difficulties commonly found during the development of these applications is that robust methods for the automatic construction of fuzzy models remain relatively unknown, as Fuzzy Logic can be considered an emerging theory of the last decades. Linear and nonlinear fuzzy predictive models have been established and have been shown to improve QSPR-QSAR predictions in a very simple manner. This is due to the use of fuzzy membership functions and of rules with linguistic labels for the encoding of
expert knowledge in a direct and easy fashion. The main difference between fuzzy approaches and classical ones is that, instead of assuming an analytical model function for performing predictions, which severely oversimplifies the problem at hand, natural rules are developed from the data rather than imposed on the modeled system. Among the objectives of QSPR-QSAR Theory for establishing good models are that the predictions correlate closely with the experimental data, that the model involves few molecular descriptors, that it enhances the understanding of the phenomenon, and that it is easy to use. We believe these objectives can be suitably accomplished by means of strategies derived from the Fuzzy Logic Theory. During the forthcoming years, this realistic and promising tool will undoubtedly see an increased number of applications, as Probability Theory alone is not capable of capturing uncertainty in all of its manifestations, particularly when it arises from the vagueness of natural language. We hope to have contributed in this respect with the present chapter.
Acknowledgements
The authors thank the Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) and the Universidad Nacional de La Plata for financial support.
List of Abbreviations
AFP – Adaptive Fuzzy Partition
ANFIS – Adaptive Neuro-Fuzzy Inference System
ANN – Artificial Neural Networks
BPNN – Back Propagation Neural Networks
CA – Cluster Analysis
d – number of descriptors or numerical attributes
DBM – Database Mining Method
$d_{iA}$ – value of the ith descriptor for molecule A
dm – degree of membership
ES – Expert System
FIS – Fuzzy Inference System
FLSR – Fuzzy Least Squares Regression
FLT – Fuzzy Logic Theory
fmf – fuzzy membership function
fs – fuzzy set
FST – Fuzzy Set Theory
GA – Genetic Algorithms
INFCIM – INtegrated Fuzzy Concentration addition - Independent action Model
$\log_{10} K_{ow}$ – octanol-water partition coefficient
LSR – Least Squares Regression
PLS – Partial Least Squares
QSPR-QSAR – Quantitative Structure Property-Activity Relationships
$S_k$ – fuzzy subspace
SOM – Kohonen Self-Organizing Maps
References
[1] L. Zadeh, "Fuzzy Sets", Information and Control, vol. 8, pp. 338-353, 1965.
[2] D. McNeill and P. Freiberger, Fuzzy Logic: The Revolutionary Computer Technology that is Changing our World, First Edition, Simon & Schuster, New York-London, 1994.
[3] H. T. Nguyen and E. A. Walker, A First Course in Fuzzy Logic, Third Edition, Chapman & Hall/CRC, New York, 2006.
[4] W. Weaver, "Science and Complexity", American Scientist, vol. 36, pp. 536-544, 1948.
[5] G. P. Fogel, "Computational intelligence approaches for pattern discovery in biological systems", Briefings in Bioinformatics, vol. 9, pp. 307-316, 2008.
[6] G. J. Klir, U. H. St. Clair and B. Yuan, Fuzzy Set Theory. Foundations and Applications, Prentice Hall PTR, New York, 1997.
[7] M. Russo, N. A. Santagati and E. Lo Pinto, "Medicinal Chemistry and Fuzzy Logic", Journal of Information Sciences, vol. 105, pp. 299-314, 1998.
[8] B. Kosko, Fuzzy Engineering, Prentice Hall, Englewood Cliffs, NJ, 1996.
[9] Y. L. Loukas, "Adaptive Neuro-Fuzzy Inference System: An Instant and Architecture-Free Predictor for Improved QSAR Studies", Journal of Medicinal Chemistry, vol. 44, pp. 2772-2783, 2001.
[10] M. Russo, "A genetic approach to fuzzy learning", International Symposium on Neuro-Fuzzy Systems, EPFL, Lausanne, 1996.
[11] M. Russo, "FuGeNeSys: comparisons with previous works", Italian Workshop on Fuzzy Logic II, Bari, Italy, 1997.
[12] D. M. Bayada, H. Hamersma and V. J. van Geerestein, "Molecular diversity and representativity in chemical databases", Journal of Chemical Information and Modeling, vol. 39, pp. 1-10, 1999.
[13] E. M. Gordon and J. F. Kerwin (Eds.), Combinatorial Chemistry and Molecular Diversity in Drug Discovery, Wiley, New York, 1998.
[14] M. C. Pirrung, Molecular Diversity and Combinatorial Chemistry: Principles and Applications, Elsevier Science, London, 2004.
[15] A. K. Pannier, R. M. Brand and D. D. Jones, "Fuzzy Modeling of Skin Permeability Coefficients", Pharmaceutical Research, vol. 20, pp. 143-148, 2003.
[16] G. J. Niemi, Multivariate analysis and QSAR: applications of principal component analysis, in: W. Karcher and J. Devillers (Eds.), Practical Applications of Quantitative Structure-Activity Relationships (QSAR) in Environmental Chemistry and Toxicology, Kluwer Academic Publishing, Dordrecht, pp. 153-169, 1990.
[17] C. J. Hubert, Applied Discriminant Analysis, Wiley-Interscience, New York, 1994.
[18] L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, Wiley-Interscience, New York, 1990.
[19] T. Kohonen, Self-Organizing Maps, Springer-Verlag, Berlin, 2001.
[20] M. Pintore, O. Taboureau, F. Ros and J. R. Chretien, "Database mining applied to central nervous systems (CNS) activity", European Journal of Medicinal Chemistry, vol. 36, pp. 349-359, 2001.
[21] R. Hecht-Nielsen, Theory of the backpropagation neural network, Proceedings of the International Joint Conference on Neural Networks, Washington, D. C., pp. 593-605, 1989.
[22] J. Devillers, Neural Networks in QSAR and Drug Design, Academic Press, New York, 1996.
[23] J. Zupan and J. Gasteiger, Neural Networks for Chemists: An Introduction, VCH, Weinheim, 1993.
[24] R. Babuska, Fuzzy modeling for control, Kluwer Academic Publishers, Boston, 1998.
[25] J. S. R. Jang, "ANFIS: adaptive-network-based fuzzy inference system", IEEE Transactions on Systems Man and Cybernetics, vol. 23, pp. 665-684, 1993.
[26] S. Chiu, "Fuzzy Model Identification Based on Cluster Estimation", Journal of Intelligent and Fuzzy Systems, vol. 2, pp. 267-278, 1994.
[27] T. E. Lane, M. Carson, C. Bergmann and T. Wyss-Corray (Eds.), Central Nervous System Diseases and Inflammation, Springer-Verlag, Berlin, 2008.
[28] F. Ros, K. Audouze, M. Pintore and J. R. Chretien, "Hybrid system for virtual screening: interest of fuzzy clustering applied to olfaction", SAR and QSAR in Environmental Research, vol. 11, pp. 281-300, 2000.
[29] F. Ros, M. Pintore and J. R. Chretien, "Molecular descriptor selection combining genetic algorithms and fuzzy logic: application to database mining procedures", Chemometrics and Intelligent Laboratory Systems, vol. 63, pp. 15-26, 2002.
[30] F. Ros, O. Taboureau, M. Pintore and J. R. Chretien, "Development of predictive models by adaptive fuzzy partitioning. Application to compounds active on the central nervous system", Chemometrics and Intelligent Laboratory Systems, vol. 67, pp. 29-50, 2003.
[31] ChemInter©1.0, ChemInter.
[32] SciQSAR 2D®, SciVision, Burlington, USA, 1999.
[33] M. Affenzeller, S. Wagner, S. Winkler and A. Beham, Genetic Algorithms and Genetic Programming: Modern Concepts and Practical Applications, CRC Press, London, 2009.
[34] Y. Lin and G. J. Cunningham, "Building a fuzzy system from input-output data", Journal of Intelligent and Fuzzy Systems, vol. 2, pp. 243-250, 1994.
[35] M. Sugeno and T. Yasakawa, "A fuzzy-logic-based approach to qualitative modeling", IEEE Transactions on Fuzzy Systems, vol. 1, pp. 7-31, 1993.
[36] D. Dubois and H. Prade, An introduction to possibilistics and fuzzy logic, in: G. Shafer and J. Pearl (Eds.), Readings in uncertain reasoning, Morgan Kaufman, San Francisco, pp. 742-761, 1990.
[37] B. Fritzke, "Fast learning with incremental radial basis function networks", Neural Processing Letters, vol. 1, pp. 2-5, 1994.
[38] M. Ichino, "A nonparametric multiclass pattern classifier", IEEE Transactions on Systems Man and Cybernetics, vol. 6, pp. 345-352, 1979.
[39] N. P. I. Plusn, NeuralWorks Professional II Plusn, NeuralWare, Pittsburgh, 1995.
[40] B. Bartfai, "On the Match Tracking Anomaly of ARTMAP Neural Network", Neural Networks, vol. 9, pp. 295-308, 1996.
[41] B. Bartfai, "An ART-based Modular Architecture for Learning Hierarchical Clustering", Neurocomputing, vol. 13, pp. 31-45, 1996.
[42] G. Espinosa, D. Yaffe, Y. Cohen, A. Arenas and F. Giralt, "Neural Network Based Quantitative Structural Property Relations (QSPRs) for Predicting Boiling Points of Aliphatic Hydrocarbons", Journal of Chemical Information and Modeling, vol. 40, pp. 859-879, 2000.
[43] G. Espinosa, D. Yaffe, Y. Cohen, A. Arenas and F. Giralt, "A fuzzy ARTMAP based Quantitative Structure-Property Relationships (QSPRs) for Predicting Physical Properties of Organics", Industrial and Engineering Chemistry Research, vol. 40, pp. 2757-2766, 2001.
[44] D. Yaffe, Y. Cohen, G. Espinosa, A. Arenas and F. Giralt, "Fuzzy ARTMAP and Back-Propagation Neural Networks Based Quantitative Structure-Property Relationships (QSPRs) for Octanol-Water Partition Coefficient of Organic Compounds", Journal of Chemical Information and Modeling, vol. 42, pp. 162-183, 2002.
[45] D. Yaffe, G. Espinosa, Y. Cohen, A. Arenas and F. Giralt, "A fuzzy ARTMAP based Quantitative Structure-Property Relationships (QSPRs) for predicting Aqueous Solubility of Organic Compounds", Journal of Chemical Information and Modeling, vol. 41, pp. 1177-1207, 2001.
[46] G. Espinosa, A. Arenas and F. Giralt, "An Integrated SOM-Fuzzy ARTMAP Neural System for the Evaluation of Toxicity", Journal of Chemical Information and Modeling, vol. 42, pp. 343-359, 2002.
[47] G. A. Carpenter, S. Grossberg, N. Marcuzon and D. B. Rosen, "Fuzzy ART: Fast Stable Learning and Categorization of Analog Patterns by an Adaptive Resonance System", Neural Networks, vol. 4, pp. 759-771, 1991.
[48] L. M. Sztandera, A. Garg, S. Hayik, K. L. Bhat and C. W. Bock, "Mutagenicity of aminoazo dyes and their reductive-cleavage metabolites: a QSAR/QPAR investigation", Dyes and Pigments, vol. 59, pp. 117-133, 2003.
[49] CODESSA™, v2.0, Semichem, 7204 Mullen, Shawnee, KS 66216, USA.
[50] K. J. Cios and L. M. Sztandera, "Ontogenic neuro-fuzzy algorithm: F-CID3", Neurocomputing, vol. 14, pp. 383-402, 1997.
[51] B. Kosko, "Fuzzy entropy and conditioning", Information Sciences, vol. 40, pp. 165-174, 1986.
[52] K. J. Cios and N. A. Liu, "A machine learning method for generation of a neural network architecture: a continuous ID3 algorithm", IEEE Transactions on Neural Networks, vol. 3, pp. 280-291, 1992.
[53] D. M. Hawkins, S. C. Basak and D. Mills, "Assessing Model Fit by Cross Validation", Journal of Chemical Information and Modeling, vol. 43, pp. 579-586, 2003.
[54] K. L. Bhat, S. Hayik, L. Sztandera and C. W. Bock, "Mutagenicity of Aromatic and Heteroaromatic Amines and Related Compounds: A QSAR Investigation", QSAR and Combinatorial Science, vol. 24, pp. 831-843, 2005.
[55] M. Mwense, X. Z. Wang, F. V. Buontempo, N. Horan, A. Young and D. Osborn, "Prediction of Noninteractive Mixture Toxicity of Organic Compounds Based on a Fuzzy Set Method", Journal of Chemical Information and Modeling, vol. 44, pp. 1763-1773, 2004.
[56] M. Mwense, X. Z. Wang, F. V. Buontempo, N. Horan, A. Young and D. Osborn, "QSAR approach for mixture toxicity prediction using independent latent descriptors and fuzzy membership functions", SAR and QSAR in Environmental Research, vol. 17, pp. 53-73, 2006.
[57] Dragon, Milano Chemometrics and QSAR Research Group, http://michem.disat.unimib.it/chm.
[58] A. Gonzalez and R. Pérez, "Completeness and consistency conditions for learning fuzzy rules", Fuzzy Sets and Systems, vol. 96, pp. 37-51, 1998.
[59] J. S. R. Jang, C. T. Sun and E. Mizutani, Neuro-Fuzzy and Soft Computing; a Computational Approach to Learning and Machine Intelligence, Prentice-Hall, Upper Saddle River, 1997.
[60] J. S. R. Jang, "ANFIS: Adaptive-network-based fuzzy inference systems", IEEE Transactions on Systems Man and Cybernetics, vol. 23, pp. 665-685, 1993.
[61] M. Kumar, K. Thurow, N. Stoll and R. Stoll, "Robust fuzzy mappings for QSAR studies", European Journal of Medicinal Chemistry, vol. 42, pp. 675-685, 2007.
[62] M. Kumar, N. Stoll and R. Stoll, "An energy-gain bounding approach to robust fuzzy identification", Automatica, vol. 42, pp. 711-721, 2006.
[63] C. Hansch, R. Li, J. M. Blaney and R. Langridge, "Comparison of the inhibition of Escherichia coli and Lactobacillus casei dihydrofolate reductase by 2,4-diamino-5-(substituted-benzyl)pyrimidines: quantitative structure-activity relationships, x-ray crystallography, and computer graphics in structure-activity analysis", Journal of Medicinal Chemistry, vol. 25, pp. 777-784, 1982.
[64] C. D. Selassie, R.-L. Li, M. Poe and C. Hansch, "Optimization of hydrophobic and hydrophilic substituent interactions of 2,4-diamino-5-(substituted-benzyl)pyrimidines with dihydrofolate reductase", Journal of Medicinal Chemistry, vol. 34, pp. 46-54, 1991.
[65] S. S. So and W. G. Richards, "Application of neural networks: quantitative structure-activity relationships of the derivatives of 2,4-diamino-5-(substituted-benzyl)pyrimidines as DHFR inhibitors", Journal of Medicinal Chemistry, vol. 35, pp. 3201-3207, 1992.
[66] M. Sugeno and G. T. Kang, "Structure identification of fuzzy model", Fuzzy Sets and Systems, vol. 28, pp. 15-33, 1988.
[67] M. R. Berthold and K. P. Huber, "Constructing fuzzy graphs from examples", Intelligent Data Analysis, vol. 3, pp. 37-53, 1999.
[68] W. Wu, B. Walczak, D. L. Massart, S. Heuerding, F. Erni, I. R. Last and K. A. Prebble, "Artificial neural networks in classification of NIR spectral data: Design of the training set", Chemometrics and Intelligent Laboratory Systems, vol. 33, pp. 35-46, 1996.
[69] E. Buyukbingol, A. Sisman, M. Akyildiz, F. N. Alparslan and A. Adejare, "Adaptive neuro-fuzzy inference system (ANFIS): A new approach to predictive modeling in QSAR applications: A study of neuro-fuzzy modeling of PCP-based NMDA receptor antagonists", Bioorganic and Medicinal Chemistry, vol. 15, pp. 4265-4282, 2007.
[70] M. Jalali-Heravi, A. Kyani, S. Afsari-Mamaghani and A. Ghadiri-Bidhendi, "Quantitative Structure-Retention Relationship Study of Benzodiazepines Using Adaptive Neuro Fuzzy Inference System as Feature Selection Method", QSAR and Combinatorial Science, vol. 27, pp. 407-416, 2008.
[71] M. Kakar, H. Nyström, L. R. Aarup, T. J. Nottrup and D. R. Olsen, "Respiratory motion prediction by using the adaptive neuro fuzzy inference system (ANFIS)", Physics in Medicine and Biology, vol. 50, pp. 4721-4728, 2005.
[72] M. Jalali-Heravi and A. Kyani, "Comparison of Shuffling-Adaptive Neuro Fuzzy Inference System (Shuffling-ANFIS) with Conventional ANFIS as Feature Selection Methods for Nonlinear Systems", QSAR and Combinatorial Science, vol. 26, pp. 1046-1059, 2007.
[73] M. Jalali-Heravi and M. Asadollahi-Babolia, "Quantitative structure-activity relationship study of serotonin (5-HT7) receptor inhibitors using modified ant colony algorithm and adaptive neuro-fuzzy interference system (ANFIS)", European Journal of Medicinal Chemistry, vol. 44, pp. 1463-1470, 2009.
[74] C. Freissinet, M. Vauclin and M. Erlich, "Comparison of first-order analysis and fuzzy set approach for the evaluation of imprecision in a pesticide groundwater pollution screening model", Journal of Contaminant Hydrology, vol. 37, pp. 21-43, 1999.
[75] V. Uddameri and M. Kuchanur, "Fuzzy QSARs for predicting log Koc of persistent organic pollutants", Chemosphere, vol. 54, pp. 771-776, 2004.
[76] H. Tanaka, S. Uejima and K. Asai, "Linear regression analysis with fuzzy model", IEEE Transactions on Systems Man and Cybernetics, vol. 12, pp. 903-906, 1982.
[77] D. Dubois and H. Prade, Possibility Theory, Plenum Press, New York, 1988.
[78] A. Bardossy, I. Bogardi and L. Duckstein, "Fuzzy regression in hydrology", Water Resources Research, vol. 26, pp. 1497-1508, 1990.
[79] D. Savic and W. Pedrycz, "Evaluation of fuzzy regression models", Fuzzy Sets and Systems, vol. 39, pp. 51-63, 1991.
In: Quantum Frontiers of Atoms and Molecules Editor: Mihai V. Putz, pp. 629-668
ISBN: 978-1-61668-158-6 © 2010 Nova Science Publishers, Inc.
Chapter 24
MODELING THE TOXICITY OF ALCOHOLS. TOPOLOGICAL INDICES VERSUS VAN DER WAALS MOLECULAR DESCRIPTORS
Dan Ciubotariu(1,*), Vicentiu Vlaia(1), Ciprian Ciubotariu(2), Tudor Olariu(1) and Mihai Medeleanu(3)
(1) Department of Organic Chemistry, Faculty of Pharmacy, "Victor Babes" University of Medicine and Pharmacy, P-ta Eftimie Murgu No. 2, 300041 Timisoara, Romania
(2) Department of Computer Sciences, University "Politehnica", Timisoara, Romania
(3) Department of Organic Chemistry, University "Politehnica" Timisoara, Romania
Abstract
In this chapter we present three molecular size ($C_{iD}$, i=1,2,3) and three molecular shape ($\theta_{iD}$, i=1,2,3) descriptors developed on the basis of the molecular vdW space, assumed to be isotropic, homogeneous, and compressible to some extent, and sixteen generalized topological descriptors based on the reciprocal distance matrix, the GTRDIs (${}^{k}\delta_{\lambda}$). Thus, assuming that a given molecule can be characterized by its vdW surface area and volume, we develop the compressibility measures of the molecular vdW space, $C_{iD}$, and we extend the ovality concept $\Theta$ to the ovality molecular descriptors $\theta_{iD}$; the subscript i refers to the dimensionality of the vdW space. The GTRDIs are built starting from the idea that each vertex i of a chemical graph supports a topological distance strain (TDS) of order k, k=1,2,3,…, from all other vertices of the molecular graph. Consequently, the GTRDIs may be considered a form of internal topological strain of chemical graphs. The GTRDIs and the vdW measures of molecular compressibility and ovality, together with the intrinsic density ID (defined by the ratio $ID = MW/V_W$, where MW is the molecular weight), have been tested as molecular descriptors for correlating the toxicity of aliphatic alcohols on simple organisms like larvae and tadpoles and on Tetrahymena pyriformis and Pimephales promelas. The QSAR results obtained show that these vdW indices offer an appropriate description of the chemical structure of aliphatic alcohols for the modeled properties.
* E-mail address: [email protected]
1. Introduction
Our society is confronting challenges such as the continuous degradation of the environment due to aggressive agricultural pest control processes and to the various chemicals produced by the chemical, pharmaceutical and other industries. Thus, there is an imperative need for predictive models for health hazard purposes, that is, to design new chemicals with improved properties and diminished side-effects, and to assess the safety of chemical compounds. In addition, consideration of the risk of chemicals released to the environment and the development of environmentally benign synthetic methods are strongly required [1]. Among the most prevalent organic chemicals in the world, as defined by the high production volume chemical list, are a variety of aliphatic alcohols, acids, esters, saturated and unsaturated alkanes, and halogenated alkanes [2]. There is an increased emphasis on predicting the hazardous effects of these chemicals from molecular structure [3]. The ability to use a quantitative structure-activity relationship (QSAR) to estimate accurately the relative toxicity of chemical compounds would be of value to industry and regulatory bodies alike. Strictly speaking, the term QSAR refers to the mathematical relationship between the biological activities of a set of chemical compounds and their structural parameters, called molecular descriptors (MDs). More generally, quantitative structure-property relationships (QSPRs) correlate chemical structure with a wide variety of physical, chemical, biological (including biomedical, toxicological, ecotoxicological), and technological (glass transition temperatures of polymers, critical micelle concentrations of surfactants, rubber vulcanization rates) properties.
It is widely recognized that QSPR equations, whether they are derived in a purely empirical fashion from an arbitrary set of molecular descriptors or from a preselected set of descriptors chosen on theoretical grounds for a connection with a particular property, can give considerable insight into the manner in which chemical structure controls the physical and biological properties of compounds [4]. The properties of a molecule, including its toxicological effect, are a consequence of a complicated interplay of its topology (atomic connectivity), metric characteristics (bond lengths, valence and torsional angles) and the dynamics of electrons and nuclei. Finding out how various molecular features (quantified by topological, geometrical, and quantum-mechanical molecular descriptors, respectively) depend on molecular structure is one of the central fields of research in chemistry and, in particular, the main subject of QSTR (quantitative structure-toxicological properties relationship). Developing such a QSTR requires three components: a data set that provides a uniform and relative measure of toxicity for a group of chemicals; molecular structure/property data (i.e., MDs) for each chemical compound within the group; and a statistical method to develop a linear (usually) or nonlinear relationship between toxicity and structure [5]. The compounds used in developing a QSTR (the training set) should preferably all act by the same mechanism. Otherwise, the QSTR will be less accurate and there will be compounds that are not modeled well (outliers). Therefore, QSTR analysis is commonly carried out on congeneric series of compounds. The experimental data should be as accurate and precise as possible and should have been determined with the same protocol and, if possible, in the same laboratory. For best results, they must represent the molar concentration (or dose) that produces a given observed effect (e.g., the dose required to kill 50% of the organisms, LD50,
etc.). The compounds should cover as wide a range as possible of the chemical (MD) space and of the end point values. For details about the design of series of compounds, see ref. [6]. At present there are many MDs available (hydrophobic, electronic, steric, geometric, quantum chemical, topological), and there are a number of commercially available software packages (IRS [7], Dragon [8], CODESSA [9], etc.) that will each generate many structural descriptors. There are two ways to select [10] the relevant MDs for the series under QSTR study. The first is to choose only those descriptors that are relevant to a presumed mechanism of action. In this case the number of MDs is usually small, but the method has the disadvantage that, if the chosen descriptors are irrelevant, a good QSTR will not be obtained. The second approach consists in generating a large number of descriptors by means of a software package; the MDs that form the best model of the toxic activity are then selected by appropriate statistical methods. QSTR models can be generated using a wide variety of mathematical models, ranging from linear methods (e.g., linear correlation and regression and linear discriminant analysis) to nonlinear methods (e.g., random forests and neural networks). Multiple linear regression (MLR) remains the most widely used statistical method in QSTR. This method has the advantage that it is simple to use and that the MDs that best model the toxicological activity can be seen and easily understood. Its disadvantages are that it works best with congeneric series of compounds and that it can suffer from a high risk of chance correlation, especially when a large pool of descriptors is used. To minimize this risk, the ratio between the number of compounds (data points) and the number of MDs has to be at least five [11].
Once a QSTR model has been built for a series of chemicals (the training set), it must be validated. In all cases the predictive ability of the model is then tested with a set of molecules (the prediction set or validation set) that were not used during the model building process [12, 13]. In conclusion, a QSTR has two main uses. Its foremost use is predictive, to estimate the toxicity of a compound not used to develop the QSTR. Second, the molecular descriptors used in QSTR analysis should be related to the process by which the toxicological activity is manifested. Thus, the MDs could offer some insight into the mechanism of action. However, it should always be remembered that the existence of a correlation between structure and activity is not proof of causality. Among the different approaches to developing MDs for QSTR analysis, we present in this chapter our own structural approaches based on molecular topology and on the molecular van der Waals (vdW) space. Molecular topology is conventionally represented by a molecular graph, which is essentially a non-numerical mathematical object. To quantify the structural information, a graph G is transformed into a more convenient mathematical representation (matrix, polynomial, etc.) and then, using an algorithm, the structural information contained in G is converted into a graph invariant. These invariants are usually called topological indices. Among the topological invariants, we used the reciprocal distance (RD) matrix to generate a set of generalized topological reciprocal distance indices (GTRDIs), ${}^{k}\delta_{\lambda}$ (λ = 1,2,3,…; k = 1,2,3,…), which seem to be a good mathematical representation of chemical structure in numerical form, with a priori physicochemical meaning.
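The transformation of a molecular graph into a reciprocal distance matrix can be sketched as follows. The sketch assumes a connected, hydrogen-suppressed graph given as an adjacency list, with topological distance counted as the number of bonds on the shortest path. The function names are hypothetical, and the row sums printed at the end are only an illustration of a graph invariant derived from the RD matrix; they are not the definition of the GTRDIs, which is not reproduced here.

```python
from collections import deque


def topological_distances(adjacency):
    """All-pairs shortest path lengths (in bonds) by breadth-first search."""
    n = len(adjacency)
    distances = []
    for start in range(n):
        row = [None] * n
        row[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if row[v] is None:
                    row[v] = row[u] + 1
                    queue.append(v)
        distances.append(row)
    return distances


def reciprocal_distance_matrix(adjacency):
    """RD matrix: entry (i, j) is 1/d_ij for i != j and 0 on the diagonal.

    Assumes a connected graph, so that every distance is defined.
    """
    distances = topological_distances(adjacency)
    return [
        [0.0 if i == j else 1.0 / d for j, d in enumerate(row)]
        for i, row in enumerate(distances)
    ]


# Hydrogen-suppressed graph of 1-propanol, C-C-C-O, as a path of four vertices.
propanol = [[1], [0, 2], [1, 3], [2]]
rd = reciprocal_distance_matrix(propanol)
row_sums = [sum(row) for row in rd]
print(row_sums)
```

Because the entries decrease with distance, each row is dominated by the nearest neighbours of the corresponding vertex, so central vertices obtain larger row sums than terminal ones.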
Dan Ciubotariu, Vicentiu Vlaia, Ciprian Ciubotariu et al.
The space occupied by a molecule exhibiting a toxicological action can be described in the “hard spheres” approximation: each atom of the molecule is represented by an isotropic sphere centered at the equilibrium position of the corresponding atom and having a radius equal to its vdW radius. Consequently, one can define a molecular vdW envelope as the external surface resulting from the intersection of all vdW spheres. This envelope encloses a 3D space of volume VW and surface SW. VW and SW were computed by means of the Monte Carlo method. They are used to calculate three vdW compressibility descriptors (CiD, i=1,2,3) and three vdW ovality descriptors (ΘiD, i=1,2,3), which have a clear physical meaning. These MDs are used to model the QSTR of the toxicological effects of aliphatic alcohols on simple organisms like larvae and tadpoles and on Tetrahymena pyriformis and Pimephales promelas. The QSTR results obtained show that these two types of MDs are inter-related and offer an appropriate description of the chemical structure of aliphatic alcohols.
2. Statistical Methods
No general theory of the quantitative relationship between molecular structure and toxicological properties of organic chemical compounds (QSTR) can reasonably be regarded as satisfactory unless it provides a sound basis for predicting and interpreting linear relationships among molecular quantities. A satisfactory theoretical model for linear correlations in toxicology should allow reliable predictions to be made as easily as possible concerning both the circumstances in which correlations should occur (e.g., between which toxicological actions and for which compounds) and the magnitudes of the regression coefficients. An important factor to be considered in the development of a model is the degree of parameterization required. An under-parameterized model will fail to predict the existence of certain significant, experimentally detectable features in the pattern of toxic behavior. An over-parameterized model will suggest that a more complex pattern of toxic behavior exists; moreover, the more parameters a model involves, the harder it is to apply that model in making predictions. Statistical analyses of experimental data are a valuable guide when searching for the optimum degree of parameterization [14]. Modern approaches to the QSTR analysis of organic molecules such as drugs, insecticides, herbicides, fungicides, etc., are based on the quantification of toxicity as a function of molecular structure [15]. These are generally carried out by means of the MLR method using a correlation equation of the type [16]:
$$ y_n = \beta_1 x_{n1} + \beta_2 x_{n2} + \dots + \beta_p x_{np} + z_n = (x_{n1}, \dots, x_{np}) \cdot \beta + z_n \tag{1} $$
where yn are toxicological activities and the matrix X=(xn1,…,xnp) contains the predictor variables, i.e., the structural parameters (MDs) of the compounds from studied series. Linear regression provides estimates and other inferential results for the (statistical) parameters β=(β1,β2,…,βp)T in the model (1). In this model, the random variable yn, which represents the response for case n, n=1,2,…,N, has a deterministic part and a stochastic part. The deterministic part, (xn1,…,xnp) ·β, depends upon the parameters β and upon the predictor or regressor variables xnp, p=1,2,…,P. The stochastic part, represented by the random variable
Modeling the Toxicity of Alcohols
zn is a disturbance that perturbs the response for that case. The superscript T denotes the transpose of a matrix. The model for N cases can be written

$$ Y = X\beta + Z \tag{2} $$
where Y is the vector of random variables representing the experimental data we may get, X is the N×P matrix of regressor variables, i.e., the molecular structural parameters and/or the physical and chemical properties (especially for QSAR studies), and Z is the vector of random variables representing the disturbances; one assumes that Z is normally distributed. The maximum likelihood estimate $\hat{\beta}$ is the value of β which minimizes S(β):

$$ S(\beta) = \lVert Y - X\beta \rVert^{2} = \sum_{n=1}^{N} \left[ y_n - \left( \sum_{p=1}^{P} x_{np}\beta_p \right) \right]^{2} \tag{3} $$
This βˆ is called the least squares estimate and can be written:
$$ \hat{\beta} = (X^{T}X)^{-1}X^{T}Y $$
Least squares estimates can also be derived using sampling theory or by using a Bayesian approach [17]. All three of these methods of inference, the likelihood approach, the sampling theory approach, and the Bayesian approach, produce the same point estimates for βˆ and, also, similar regions of “reasonable” parameter values. In using the least squares estimates one assumes [18]:
(1) The expectation function Xβ is correct.
(2) The response is expectation function plus disturbance (relation 2).
(3) The disturbance is independent of the expectation function.
(4) Each disturbance is of zero mean and has a normal distribution.
(5) The disturbances are independently distributed and have equal variances.
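The least squares estimate above can be illustrated with a minimal sketch. It assumes a small synthetic data set (the descriptor values and coefficients below are hypothetical, not taken from this chapter) and solves the normal equations (X^T X) β = X^T Y by Gaussian elimination; the function names and the crude singularity threshold are illustrative choices only.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:  # crude, assumed threshold
            raise ValueError("X^T X is singular: descriptors are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def least_squares(X, y):
    """Least squares estimate obtained from the normal equations."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yn for row, yn in zip(X, y)) for a in range(p)]
    return solve_linear(XtX, Xty)


def residual_sum_of_squares(X, y, beta):
    """S(beta) of relation (3)."""
    return sum((yn - sum(x * b for x, b in zip(row, beta))) ** 2
               for row, yn in zip(X, y))


# Hypothetical data: N = 6 compounds, a constant column plus one descriptor.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 1.0], [1.0, 3.0], [1.0, 2.5], [1.0, 4.0]]
beta_true = [0.8, -0.3]
y = [sum(x * b for x, b in zip(row, beta_true)) for row in X]  # noise-free response
beta_hat = least_squares(X, y)
```

In practice a library routine such as `numpy.linalg.lstsq` would normally be preferred over hand-written elimination, since it avoids forming X^T X explicitly.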
The condition for the existence of the least squares solution is that the inverse of the matrix product $X^{T}X$ exists, i.e., that $X^{T}X$ is non-singular. This, in turn, requires that X has full column rank. In this case no column of X may be written as a linear combination of other columns of X, the system of equations is of full rank (n>p), and there is a unique solution. If the number of descriptors, p, is close to or greater than the number of data points (chemical compounds), n≈p or n<p, then $(X^{T}X)^{-1}$ is not defined; the predictor variables (MDs) are linearly dependent. The problem of collinearity can be solved by suppressing descriptors or by using other multivariate
methods, e.g. the PLS (partial least squares, or projection to latent structures) method [19]. For a detailed presentation of MLR models one may consult refs. [16] and [17]. Once the QSTR model is built, its goodness of fit is evaluated by means of the following statistics: the correlation coefficient (r) and the coefficient of determination (r2) adjusted for the degrees of freedom ($r^{2}_{adj}$), which is also called the explained variance (EV). The uncertainty in the model is given by the standard error (s), and the reliability of the model is expressed by the F (Fisher) and t (Student) statistics. The t-test was used to determine the 95% confidence limits of the QSTR model. Statistical fit should not be confused with the ability of a model to make predictions. Therefore, one uses the leave-one-out (LOO) and the leave-n-out (L-n-O) cross-validation methods to estimate the predictive ability of the obtained QSTR model, using the cross-validation coefficient (also called coefficient of prediction), q2, and the squared correlation coefficient between predicted and experimental activities, $r^{2}_{PE}$.
In the LOO procedure one compound is removed from the training set, the QSTR is reconstructed using the remaining compounds, and the toxicological activity of the deleted compound is then predicted with the new QSTR model. The deleted compound is then reinstated and the procedure is repeated until each compound in turn has been left out. A cross-validated q2 value is obtained that is a guide to the predictive ability of the QSTR. A q2 value of >0.5 is acceptable [19]. However, the LOO technique has come in for criticism [20, 21]. A better procedure, if one has sufficient data, is to leave an appreciable proportion (20-50%) of compounds out of the training set and to use them as an external test set. This L-n-O procedure may be viewed as an external method for validation: the chemical structures selected for inclusion in the validation set are not used in building the model. Finally, whether or not the developed QSTR is a chance correlation can be checked by scrambling the toxicological response values (Y-scrambling) [8] and trying to build a model using the scrambled data. This procedure is then repeated, say, 100 times and the r2 values are checked against that for the real QSTR: if only one of the r2 values from the scrambled data is as high as that from the real QSTR, then there is a 1% risk that the real QSTR is a chance correlation [22]. Nonlinear relationships are accommodated by transforming the data and/or descriptors using mathematical functions such as square, square root, logarithm, and inverse. Less frequently, combinations of descriptors such as products of two descriptors are made to account for nonlinear cross-dependencies [23].
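The LOO and Y-scrambling procedures can be sketched for the simplest case of one descriptor and an intercept. The sketch assumes the commonly used definition q2 = 1 - PRESS/SS_tot, which is not spelled out in the text, and uses hypothetical data; all function names are illustrative.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (single descriptor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def r_squared(x, y):
    """Coefficient of determination of the fitted line."""
    a, b = fit_line(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def loo_q_squared(x, y):
    """Leave-one-out q2, assuming q2 = 1 - PRESS / SS_tot."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without compound i, then predict the left-out compound.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_tot


def y_scrambling(x, y, n_rounds=100, seed=0):
    """Fraction of scrambled models whose r2 reaches that of the real model."""
    rng = random.Random(seed)
    r2_real = r_squared(x, y)
    hits = 0
    for _ in range(n_rounds):
        y_perm = y[:]
        rng.shuffle(y_perm)
        if r_squared(x, y_perm) >= r2_real:
            hits += 1
    return hits / n_rounds


# Hypothetical descriptor values and activities for 8 compounds.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.8, 17.1]
r2 = r_squared(x, y)
q2 = loo_q_squared(x, y)
chance_fraction = y_scrambling(x, y)
```

For an ordinary least squares model q2 cannot exceed r2, so a large gap between the two is a warning sign of overfitting.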
3. Van Der Waals Molecular Descriptors
3.1. Introduction
The recent interest in the van der Waals (vdW) volume and surface affords an adequate reason for introducing new vdW molecular descriptors in the treatment of steric effects in the analysis of chemical reactivity or in QSAR studies [18]. The molecular volume and the surface area were used as molecular structural parameters in QSAR studies of Hansch type [24-26]. They were also used as a starting point for deriving other QSAR parameters, e.g.
lipophilicity/hydrophilicity [27], surface tension parameters [28], Weighted Holistic Invariant Molecular (WHIM) descriptors [29] and so on. Both numerical and analytical algorithms were devised for the calculation of the volume and surface area of molecules. The molecular volume is a measure of the space around atomic nuclei filled by electrons [30, 31] and is defined geometrically as the combined volume of overlapping spheres centered on the nuclei, similar in shape to a space-filling molecular model. The van der Waals radii are used for the radii of the atomic spheres. The molecular surface area is the area of the surface, which wraps the molecular volume. Exact calculation of the molecular volume and surface area is, however, a formidable task due to multiple overlap of spheres of different radii. The first methods for calculating the molecular volume and surface area used molecular fragments [30]. The additive technique is particularly suited for QSAR because many of its parameters are obtained from molecular fragments (e.g. octanol/water partition coefficients). Recently, Govers and de Voogt proposed a quantum definition of the molecular volume in terms of molecular fragments via electron indices [32]. Because the volume of overlapping spheres (solvent excluded volume) is a quantity of interest in solution chemistry, attempts were made decades ago to find an analytical expression for it. Thus, Rowlinson performed the first analytical calculation of the volume occupied by three and more overlapping spheres [33] and used it to give an analytical expression for the triplet radial distribution function g(3) in terms of the intersecting volume of three simultaneously overlapping spheres [34]. But Connolly was the first to obtain a general solution for the analytical computation of the volume and surface area for an ensemble of overlapping spheres with unequal radii [35, 36]. 
At about the same time, Richmond proposed another method to solve the same problem [37], this time by analytical integration of the surface area obtained from the Gauss-Bonnet theorem [38]. Finally, Gibson and Scheraga obtained an analytical expression of the volume in terms of the inclusion-exclusion principle [39]. Gavezzotti [40], Meyer [41] and Ciubotariu et al. [42, 43] developed, among others, numerical algorithms for calculating the volume and surface area of overlapping spheres, which use either three-dimensional grids or Monte Carlo integration techniques. The grid methods are quite expensive computationally, as a very large number of grid points are necessary to obtain acceptable accuracy. Optimized algorithms (based on the Lebesgue integral) were also proposed, which eliminate a large part of the grid points by focusing the grid near the molecular surface. The method based on Monte Carlo integration is similar in performance to the grid method. In addition, its results depend on the quality of the random number generator. A relatively recent review by Mezey [44] gives an account of the methods used to calculate molecular volume and surface area and their use in defining molecular shape and similarity. In recent years, we have developed some new models for the quantitative treatment of steric effects, on the basis of the standard and optimized geometry of molecules and their vdW space, supposed to be homogeneous and isotropic [18, 24-26, 42, 43]. For this purpose we investigated the way in which the vdW volume and the vdW surface of substituents, as well as different directions in the van der Waals space of a molecule, are responsible for the steric effects manifested during chemical reactions. In this section we present two types of vdW molecular descriptors developed to measure the size (compressibility MDs, CiD, i=1,2,3) and the shape (ovality MDs, ΘiD, i=1,2,3) of molecules.
They describe the steric aspects of a molecule, as they are manifested by its vdW space on one dimension (C1D and Θ1D), two
dimensions (C2D and Θ2D), and three dimensions (C3D and Θ3D). The purpose is to model quantitatively their interaction with biological targets responsible for various biological responses, including the toxicological activity of aliphatic alcohols. These MDs are developed on the basis of the vdW molecular volume, VW, and surface, SW. VW and SW were calculated with the aid of original algorithms [24, 26] developed on the basis of Monte Carlo methods [45] and implemented in the IRS computer package [7, 47]. The need to define such numerical molecular descriptors, in order to quantify properly both the shape and the size of substituents and molecules, follows from theoretical considerations regarding the nature of steric effects and from the practical conditions imposed on such parameters. The fundamentals and the derivation of the CiD and ΘiD (i=1,2,3) MDs and their application in QSTR modeling of the toxic activity of aliphatic alcohols are presented in this section.
3.2. Van Der Waals Radii
The distance at which the attractive forces between the unbound atoms of a molecule, or between atoms of different molecules, are in equilibrium with the repulsive forces is known as the van der Waals distance. One may define the van der Waals radius ($r^{w}$) as half of the vdW distance. The van der Waals radii have long been considered a measure of atomic size [48]. The space occupied by molecules can be suitably described in the approximation of “hard spheres”: each atom i of the molecule M is represented by an isotropic sphere centered at the equilibrium position (Xi, Yi, Zi) of that atom and having a radius equal to its van der Waals radius, $r_i^{w}$. There are several methods to compute the vdW radii [49]. The best extant values of $r^{w}$ for atoms are those of Bondi [30], listed in Table 1.

Table 1. Bondi vdW radii

Atom    rw (Å)
H       1.20
C       1.70
O       1.52
N       1.55
F       1.47
Cl      1.75
Br      1.85
I       1.98
S       1.80
P       1.80
These values were obtained from a careful comparison of various types of physical properties and are quite reliable. Unfortunately, values for groups are limited to the methyl group and to the half thickness of the benzene ring.
3.3. Molecular Van Der Waals Volume
The molecular van der Waals envelope, Γ, can be defined in the “hard-spheres” approximation as the outer surface resulting from the intersection of all vdW spheres corresponding to the atoms of molecule M. The points (x,y,z) inside the envelope satisfy at least one of the following inequalities:

$$ (X_i - x)^{2} + (Y_i - y)^{2} + (Z_i - z)^{2} \le (r_i^{w})^{2}, \quad i = 1, \dots, m \tag{4} $$
where m represents the number of atoms in the molecule M, and (Xi,Yi,Zi) are the coordinates of these atoms. Consequently, the total volume embedded by the envelope is the molecular vdW volume ($V_M^{w}$) of the molecule M. To compute the vdW volume, $V_M^{w}$, the molecule is inserted into a bounding parallelepiped of volume Vp. Random points (x,y,z) are generated inside the parallelepiped, which includes the domain M. If nt is the total number of generated points and ns the number of points that satisfy at least one of the inequalities in (4), then the van der Waals volume is [18,24,42,43]:

$$ V_M^{w} = \frac{n_s}{n_t} \cdot V_p \tag{5} $$
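The hit-or-miss estimate of relation (5) can be sketched as follows. This is a minimal illustration, not the IRS implementation: the axis-aligned bounding box, the fixed seed, and the two-atom test geometry (a C and an O sphere 1.43 Å apart) are assumptions made for the example.

```python
import math
import random


def mc_vdw_volume(atoms, n_points=50_000, seed=0):
    """Monte Carlo estimate of the volume enclosed by overlapping spheres.

    atoms: list of (X, Y, Z, r) tuples, coordinates and radii in angstroms.
    """
    rng = random.Random(seed)
    # Bounding parallelepiped (here an axis-aligned box) around all spheres.
    lo = [min(a[k] - a[3] for a in atoms) for k in range(3)]
    hi = [max(a[k] + a[3] for a in atoms) for k in range(3)]
    v_box = math.prod(h - l for h, l in zip(hi, lo))
    n_inside = 0
    for _ in range(n_points):
        x, y, z = (rng.uniform(l, h) for l, h in zip(lo, hi))
        # A point counts if it satisfies at least one inequality of relation (4).
        if any((ax - x) ** 2 + (ay - y) ** 2 + (az - z) ** 2 <= r * r
               for ax, ay, az, r in atoms):
            n_inside += 1
    return n_inside / n_points * v_box  # relation (5)


# Single carbon sphere: the exact volume is 4/3 * pi * r^3.
v_single = mc_vdw_volume([(0.0, 0.0, 0.0, 1.70)])
v_exact = 4.0 / 3.0 * math.pi * 1.70 ** 3

# Hypothetical C-O pair: overlapping spheres give less than the sum of the two.
v_pair = mc_vdw_volume([(0.0, 0.0, 0.0, 1.70), (1.43, 0.0, 0.0, 1.52)])
v_sum = v_exact + 4.0 / 3.0 * math.pi * 1.52 ** 3
```

A tighter bounding box raises the fraction of hits and therefore lowers the statistical error for a given number of points.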
The accuracy ε of the estimate (5) for a given maximum probability is inversely proportional to the square root of the number of trials [45]:

$$ \varepsilon = \frac{1}{2\sqrt{\delta \cdot N}} \tag{6} $$

This circumstance causes the relatively slow convergence of the Monte Carlo methods. For example, in order to reduce the error of the result 10-fold, the number of trials must be increased 100-fold. If the accuracy of the estimate ε and the guarantee probability 1-δ are given, then from formula (6) one derives the necessary number of trials:

$$ N = \frac{1}{4\varepsilon^{2}\delta} \tag{7} $$
Taking into consideration the precision and the accuracy of chemical and biological experiments, for ε=0.05 and δ=0.01 the number of necessary points is N=10,000. With the performance of present-day computers, this makes the Monte Carlo method easy to apply. In order to increase the accuracy of the method, the calculation must be repeated at least 10 to 20 times for each volume. The final result, i.e., the mean value of the computed volume of each alcohol from the series under study, is validated by statistical methods [47]. The vdW volumes of the 35 alcohols are collected in Table 2.
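The arithmetic behind relations (6) and (7) can be checked with a short sketch; the function names are illustrative.

```python
import math


def mc_accuracy(n_trials, delta):
    """Accuracy epsilon of relation (6) for N trials and risk delta."""
    return 1.0 / (2.0 * math.sqrt(delta * n_trials))


def required_trials(eps, delta):
    """Number of trials N of relation (7) for accuracy eps and risk delta."""
    return 1.0 / (4.0 * eps ** 2 * delta)


n_needed = required_trials(0.05, 0.01)   # about 10,000 points
eps_back = mc_accuracy(10_000, 0.01)     # recovers epsilon of about 0.05
# A 100-fold increase in the number of trials reduces the error 10-fold.
ratio = mc_accuracy(1_000_000, 0.01) / mc_accuracy(10_000, 0.01)
```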
3.4. Molecular Van Der Waals Surface
The molecular van der Waals envelope, Γ, defined by relation (4), is a surface. We also used the Monte Carlo integration method to compute the area of the molecular vdW surface of the studied aliphatic alcohols [18, 24].
Table 2. Values of vdW molecular volumes (VW) and surface areas (SW) of aliphatic alcohols

No.  Name of alcohol                   VW (Å3)   SW (Å2)   ID*     log KOW#
1    1-tridecanol                      239.890   322.559   0.836   5.58
2    1-dodecanol                       222.535   300.620   0.838   5.13
3    1-undecanol                       205.877   278.841   0.837   4.53
4    1-decanol                         188.936   256.950   0.838   4.57
5    1-nonanol                         172.136   235.136   0.839   3.77
6    4-decanol                         188.490   253.550   0.840   3.78
7    2-nonanol                         172.282   233.761   0.837   3.77
8    1-octanol                         155.162   213.176   0.840   3.00
9    3,7-dimethyl-3-octanol            187.985   244.976   0.841   3.52
10   2-ethyl-1-hexanol                 154.983   206.057   0.841   2.81
11   1-heptanol                        138.127   191.197   0.839   2.72
12   3-octanol                         154.924   209.923   0.840   2.72
13   2-octanol                         155.002   211.027   0.842   2.90
14   2-propyl-1-pentanol               154.953   206.302   0.843   2.81
15   3-ethyl-2,2-dimethyl-3-pentanol   169.338   207.862   0.851   2.86
16   1-hexanol                         121.686   169.464   0.843   2.03
17   4-methyl-1-pentanol               121.506   166.775   0.841   1.75
18   2,4-dimethyl-3-pentanol           137.232   178.716   0.847   1.93
19   3,3-dimethyl-1-butanol            120.937   162.967   0.848   1.62
20   2,2-dimethyl-1-propanol           104.097   143.337   0.846   1.31
21   2-methyl-1-butanol                102.994   141.195   0.855   1.22
22   3-methyl-2-butanol                104.075   142.242   0.846   1.28
23   1-pentanol                        104.533   147.495   0.844   1.56
24   3-methyl-1-butanol                104.375   145.178   0.843   1.16
25   2-pentanol                        104.643   145.351   0.845   1.19
26   tert-amyl alcohol                 102.532   137.741   0.858   0.89
27   3-pentanol                        104.550   144.562   0.844   1.21
28   2-methyl-1-propanol               87.537    124.510   0.847   0.76
29   1-butanol                         87.761    125.736   0.843   0.88
30   2-butanol                         87.631    124.709   0.848   0.61
31   1-propanol                        70.947    103.855   0.847   0.25
32   2-methyl-2-propanol               87.465    124.252   0.846   0.35
33   2-propanol                        70.559    103.470   0.849   0.05
34   ethanol                           53.917    82.003    0.854   -0.31
35   methanol                          37.049    59.917    0.865   -0.77

* ID is the intrinsic density [24], defined as ID=μ/VW, where μ is the molecular weight.
# log KOW is the octanol-water partition coefficient; the data are from ref. [5].
The Monte Carlo algorithm [24,46] implies the generation of a random uniform grid on each sphere of the molecule, followed by the counting of the points generated on the surface (nt) and of those (ne) that do not satisfy the inequalities in (4) for any other sphere, i.e., that are not buried inside a neighboring atom. For every “hard sphere” i, one computes the outer part of the sphere’s surface, $S_i^{w}$:

$$ S_i^{w} = \frac{(n_e)_i}{n_t} \cdot 4\pi (r_i^{w})^{2} \tag{8} $$

The final surface is computed as the sum of the exterior surfaces of the spheres, $S_i^{w}$:

$$ S^{w} = \sum_{i=1}^{m} S_i^{w} \tag{9} $$
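Relations (8) and (9) can be sketched in the same hit-or-miss style. This is an illustrative reading, not the authors' implementation: points are drawn uniformly on each sphere, and a point is taken as exposed when it lies strictly inside no other sphere. The test geometry (two unit spheres one unit apart) is hypothetical and chosen because its exact exposed area, 6π, is known.

```python
import math
import random


def random_point_on_sphere(center, radius, rng):
    """Uniform random point on a sphere (uniform z and azimuth)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (center[0] + radius * s * math.cos(phi),
            center[1] + radius * s * math.sin(phi),
            center[2] + radius * z)


def mc_vdw_surface(atoms, n_points=20_000, seed=0):
    """Monte Carlo estimate of the exposed area of overlapping spheres.

    atoms: list of (X, Y, Z, r) tuples, coordinates and radii in angstroms.
    """
    rng = random.Random(seed)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        n_exposed = 0
        for _ in range(n_points):
            px, py, pz = random_point_on_sphere((xi, yi, zi), ri, rng)
            buried = any(
                (xj - px) ** 2 + (yj - py) ** 2 + (zj - pz) ** 2 < rj * rj
                for j, (xj, yj, zj, rj) in enumerate(atoms) if j != i
            )
            if not buried:
                n_exposed += 1
        total += n_exposed / n_points * 4.0 * math.pi * ri * ri  # relation (8)
    return total  # relation (9)


s_single = mc_vdw_surface([(0.0, 0.0, 0.0, 1.0)])
s_pair = mc_vdw_surface([(0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 1.0)])
```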
As we have seen, the vdW radius is a successful concept for the computation of molecular size and shape descriptors, even if in a quantum chemical description the electron cloud has no well-defined boundary surface. In the “hard-spheres” theory, each atom of the molecule is represented as an impenetrable sphere, i, centered at the equilibrium position of the atomic nucleus and having a radius equal to the vdW radius of the corresponding atom, $r_i^{w}$ (see Table 1 in Section 3.2). The exterior surface of all atomic spheres defines the vdW surface of area SW, which delimits the vdW volume of the molecule. The corresponding values of the surface area of the aliphatic alcohols are also presented in Table 2. These VW and SW values were used for the computation of the compressibility and ovality MDs.
3.5. Molecular Van Der Waals Compressibility Descriptors
Assuming that a molecule can be characterized by two spheres, corresponding to the vdW volume, $V^{w}$, and to the vdW surface, $S^{w}$, respectively, we developed three compressibility measures of molecular vdW space, CiD, i=1,2,3. This hypothesis is based on the known conformational flexibility of the molecules and on the fact that the molecules are relatively compressible [24, 25]. Therefore, one may suppose that a molecule can be compressed from the greatest sphere (SG), whose surface area equals the vdW surface area of Γ, $S^{w}$, to the smallest sphere (SS), whose volume equals the vdW volume embedded by Γ, $V^{w}$ [50]. The vdW radius, $r_S^{w}$, and the vdW volume, $V_S^{w}$, of the molecular SG sphere are calculated as follows:

$$ r_S^{w} = [S^{w}/4\pi]^{1/2} \tag{10} $$

$$ V_S^{w} = 4\pi (r_S^{w})^{3}/3 \tag{11} $$
The molecular SG sphere can be compressed to a molecular SS sphere, which has a volume equal to the molecular van der Waals volume, $V^{w}$. The vdW radius, $r_V^{w}$, and the vdW surface area, $S_V^{w}$, of this molecular SS sphere are calculated with the following relations:

$$ r_V^{w} = [3V^{w}/4\pi]^{1/3} \tag{12} $$

$$ S_V^{w} = 4\pi (r_V^{w})^{2} \tag{13} $$
In this way, the molecular SG and SS spheres are characterized by the following two triplets:

$$ \{S_G\}: (r_S^{w}, S^{w}, V_S^{w}) \tag{14} $$

$$ \{S_S\}: (r_V^{w}, S_V^{w}, V^{w}) \tag{15} $$

The molecular vdW compressibility measures, CiD, can be easily defined from the triplets (14) and (15), as the difference between the corresponding values of vdW radius, surface area, and volume of the SG sphere and the SS sphere, as follows:

$$ C_{1D} = r_S^{w} - r_V^{w} \tag{16} $$

$$ C_{2D} = S^{w} - S_V^{w} \tag{17} $$

$$ C_{3D} = V_S^{w} - V^{w} \tag{18} $$
Figure 1. The physical model of the molecular compressibility measures CiD, i=1,2,3.
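Relations (10)-(13) and (16)-(18) translate directly into a few lines of code. The sketch below uses illustrative input values (a surface of 60.0 Å2 and a volume of 37.0 Å3, chosen for the example rather than taken from the tables); the function name is hypothetical.

```python
import math


def compressibility_descriptors(s_w, v_w):
    """Return (C1D, C2D, C3D) from the vdW surface area s_w and volume v_w."""
    r_s = math.sqrt(s_w / (4.0 * math.pi))                 # relation (10)
    v_s = 4.0 * math.pi * r_s ** 3 / 3.0                   # relation (11)
    r_v = (3.0 * v_w / (4.0 * math.pi)) ** (1.0 / 3.0)     # relation (12)
    s_v = 4.0 * math.pi * r_v ** 2                         # relation (13)
    return r_s - r_v, s_w - s_v, v_s - v_w                 # relations (16)-(18)


# Illustrative molecule-sized values (in angstrom^2 and angstrom^3).
c1d, c2d, c3d = compressibility_descriptors(60.0, 37.0)

# For a perfect sphere the two model spheres coincide and all three vanish.
r = 2.0
sphere = compressibility_descriptors(4.0 * math.pi * r ** 2,
                                     4.0 / 3.0 * math.pi * r ** 3)
```

Because a sphere has the smallest surface for a given volume, the three descriptors are non-negative for any physically consistent pair of surface and volume.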
The molecular descriptors considered in this QSTR analysis were those for molecular compressibility, evaluated by relations (16), (17), and (18), corresponding to the mono-, bi-, and tri-dimensional compressibility measures C1D, C2D, and C3D, respectively, and the alcohol hydrophobicity, logP. The vdW volume ($V^{w}$) and the area of the vdW surface ($S^{w}$) were calculated with an in-house program, IRS (Investigation of Receptor Space), developed on the basis of the Monte Carlo method [7]. The geometry of the molecules was optimized with the MM+ and AM1 methods from the HyperChem software package [51]. The values of C1D, C2D, and C3D for the series of 35 alcohols are presented in Table 3. For their physical meaning see Figure 1. The compressibility descriptors CiD have a clear physical meaning. Thus, supposing an isotropic and homogeneous molecular vdW space, C1D measures the amount (in Å) by which a molecule may be compressed into a hydrophobic pocket, in the solvation process, or in various biological environments. The packing capacity of a molecule is estimated by means of C2D, in terms of the molecular vdW surface (in Å2), and C3D, in terms of the molecular vdW volume (in Å3). These facts can be readily seen in Figure 1.
3.6. Molecular Van Der Waals Ovality Descriptors
Central to the creation of a QSTR is the choice of structural descriptors [52]. The purpose of a molecular descriptor in a QSTR application is to provide a measure of a particular feature of the structures of the compounds being studied. The goal is simply to measure the feature in question as accurately and unambiguously as possible. At present, many structural descriptors are available, ranging from simple whole-molecule properties to quantum mechanical indices [53]. Taking into account the fact that for a given volume the spherical shape presents the minimum surface, the ovality index, O, was introduced [54] as a measure of the deviation of a molecule from the spherical shape. It is calculated as the ratio between the actual molecular surface area, Sw, and the minimum surface area, SV, corresponding to the actual van der Waals (vdW) volume, Vw, of that molecule [53, 55]:

$$ O = \frac{S^{w}}{S_V} = \frac{S^{w}}{4\pi (r^{w})^{2}} = \frac{S^{w}}{4\pi \left( \dfrac{3 V^{w}}{4\pi} \right)^{2/3}} \tag{19} $$

In relation (19) rw represents the vdW radius of the given molecule, calculated from its actual vdW volume. The ovality index is equal to 1 for spherical top molecules and increases with increasing linearity of the molecule. In fact, the reciprocal of the ovality index, $\Psi = O^{-1}$, was introduced [56] earlier, in 1935, as the sphericity index, to measure how spherical (or round) an object is. The sphericity, Ψ, is the ratio of the surface area of a sphere (with the same volume as the given object) to the surface area of the object.
Table 3. Values of compressibility (CiD, i=1,2,3) and ovality (θiD, i=1,2,3) molecular descriptors for the alcohols of Table 2

No.  C1D (Å)  C2D (Å2)  C3D (Å3)  θ1D     θ2D     θ3D
1    1.218    136.419   305.980   1.316   1.734   2.279
2    1.134    123.275   268.036   1.302   1.694   2.193
3    1.050    110.499   232.306   1.287   1.656   2.128
4    0.964    97.888    159.580   1.271   1.615   2.052
5    0.877    85.650    167.202   1.254   1.572   1.972
6    0.934    94.509    191.080   1.263   1.595   2.015
7    0.864    84.307    164.253   1.251   1.565   1.951
8    0.783    73.794    137.197   1.235   1.525   1.883
9    0.862    86.326    172.646   1.243   1.544   1.927
10   0.719    66.690    123.467   1.216   1.478   1.802
11   0.695    61.955    110.712   1.217   1.480   1.803
12   0.751    70.055    130.485   1.225   1.500   1.845
13   0.770    71.842    133.884   1.231   1.516   1.857
14   0.717    66.554    123.269   1.215   1.478   1.804
15   0.639    60.183    113.040   1.186   1.409   1.664
16   0.600    51.015    85.973    1.195   1.428   1.710
17   0.567    47.911    80.669    1.185   1.404   1.671
18   0.573    50.206    87.668    1.179   1.391   1.638
19   0.535    44.845    74.899    1.175   1.382   1.615
20   0.461    36.412    57.361    1.158   1.342   1.545
21   0.440    34.638    54.332    1.151   1.326   1.531
22   0.446    35.205    55.410    1.153   1.328   1.531
23   0.506    40.316    64.117    1.173   1.376   1.613
24   0.477    37.898    59.936    1.163   1.352   1.574
25   0.476    37.790    59.847    1.163   1.354   1.575
26   0.404    31.548    49.123    1.139   1.297   1.483
27   0.471    37.374    59.083    1.161   1.349   1.559
28   0.394    28.931    43.164    1.143   1.305   1.489
29   0.408    30.270    44.953    1.148   1.317   1.510
30   0.395    29.121    43.366    1.143   1.308   1.494
31   0.306    21.157    28.494    1.119   1.254   1.403
32   0.388    29.084    42.555    1.141   1.304   1.483
33   0.302    20.505    28.094    1.118   1.250   1.398
34   0.211    13.010    15.919    1.090   1.188   1.297
35   0.117    6.263     6.660     1.057   1.116   1.181
Here we present our extension [50] of the ovality molecular descriptor defined by relation (19) to three molecular vdW ovality measures, denoted by ΘiD, i=1,2,3. Thus, taking into account the characteristics of the greatest and the smallest molecular sphere, SG (relation 14) and SS (relation 15), respectively, evaluated with relations (10)-(13), the ovality descriptors have been defined as follows [50]:

$$ \theta_{1D} = \frac{r_S^{w}}{r_V^{w}} \tag{20} $$

$$ \theta_{2D} = \frac{S^{w}}{S_V^{w}} \tag{21} $$

$$ \theta_{3D} = \frac{V_S^{w}}{V^{w}} \tag{22} $$
One may observe that the relations (19) and (21) are the same. Consequently, the two-dimensional (2D) ovality molecular vdW descriptor, θ2D, is identical with the ovality index, O [54, 55]. The relations (20) and (22) extend the index O so that one can also measure the deviation of a molecule from the spherical shape in one (1D) and in three dimensions (3D) of the vdW space. In addition, for the series of alcohols from Tables 2 and 3 the range of θ3D values is wider than that of θ2D values. Consequently, one may expect that the discrimination capability between the molecular shapes of the congeners decreases as follows: θ3D > θ2D > θ1D. While the compressibility descriptors CiD (i=1,2,3) are dimensional measures of the molecular size (in Å, Å2, and Å3, respectively), the ovality descriptors are dimensionless measures of the molecular shape. The ability to discriminate among the molecules of aliphatic alcohols from Table 2 is higher for the compressibility than for the ovality molecular descriptors, as we can see in Table 3.
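The ovality descriptors of relations (20)-(22) can be sketched in the same way. Substituting relations (10)-(13) shows that θ2D = (θ1D)^2 and θ3D = (θ1D)^3, which is why the range of values widens from θ1D to θ3D. The input values below are illustrative, not taken from the tables, and the function name is hypothetical.

```python
import math


def ovality_descriptors(s_w, v_w):
    """Return (theta_1D, theta_2D, theta_3D) from vdW surface area and volume."""
    r_s = math.sqrt(s_w / (4.0 * math.pi))                 # relation (10)
    v_s = 4.0 * math.pi * r_s ** 3 / 3.0                   # relation (11)
    r_v = (3.0 * v_w / (4.0 * math.pi)) ** (1.0 / 3.0)     # relation (12)
    s_v = 4.0 * math.pi * r_v ** 2                         # relation (13)
    return r_s / r_v, s_w / s_v, v_s / v_w                 # relations (20)-(22)


# Illustrative molecule-sized values (in angstrom^2 and angstrom^3).
t1, t2, t3 = ovality_descriptors(60.0, 37.0)

# A perfect sphere has all three ovality measures equal to 1.
r = 2.0
sphere = ovality_descriptors(4.0 * math.pi * r ** 2,
                             4.0 / 3.0 * math.pi * r ** 3)
```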
4. Topological Descriptors from Reciprocal Distance Matrix
4.1. Introduction
Quantitative structure-property (QSPR) and structure-activity (QSAR) relationships are valuable tools now used in analyzing and predicting various physicochemical and biological properties of organic chemical compounds. QSARs are also used as scientifically credible tools (QSTRs) for predicting the acute toxicity of chemicals when few or no empirical data are available [57]. Many QSTR studies use graph theoretical indices that are based on the topological properties of a molecule viewed as a graph. The application of graph theory to this field implies the representation of molecules by selected molecular descriptors, often referred to as topological indices (TIs). TIs are numerical quantities based on various invariants or characteristics of the molecular graph such as the adjacency matrix, the distance matrix, or centric topological indices and TIs based on information theory [58]. Among various
invariants, more detailed topological information is provided by the distance matrix of chemical graphs, D, whose entries dij represent the topological distances between vertices i and j, that is, the number of edges (bonds) along the shortest path between the vertices (atoms) [59]. Therefore, many TIs used in QSTR studies have been developed on the basis of D. Based on their definition, it has been proposed that many TIs derived from D may encode two structural steric factors, namely the size and the shape of the molecule [58, 60]. Although TIs do not have a precise physical meaning, they are measures of topological shape, i.e. the degree of branching or cyclicity, and they correlate well with molecular volume or surface [61]. However, extensive studies on this topic do not yet exist. The success of the connectivity indices [62, 63] and of the Wiener index [64] as graph descriptors in QSAR/QSPR models stimulated research in the field of molecular descriptors based on graph distances, Wiener-like indices [65] and various molecular matrices [59]. The idea of using as local vertex invariants (LOVIs) the sum of the reciprocal values of the D matrix ($d_{ij}^{-1}$) was adopted [24] in the definition of three distance connectivity indices (DCIs), δλ, λ=1-3 [24, 66, 67]. These LOVIs are in fact the elements of the reciprocal distance (RD) matrix, $RD = \{d_{ij}^{-1}\}$ [68-73]. In this section we present several topological distance indices and the set of generalized topological reciprocal distance indices (GTRDIs), ${}^{k}\delta_{\lambda}$ (λ = 0,1,2,3,...; k = 1,2,3,4,...), developed earlier by extending the definition of the DCIs δλ. In the construction of these indices we used as LOVIs the k-th power of the row sum of the values of the RD matrix, $d_{ij}^{-k}$ [73]. In an attempt to offer a physical meaning for the GTRDIs in the frame of molecular graphs, we consider these LOVIs as measures of the local topological distance strain (LTDS). Thus, one may define the internal topological distance strain (ITDS) of order k as a part of the topological energy (TE) of a molecule structurally described by its molecular graph. The GTRDIs are thus measures of the internal topological distance strain of order k. Introducing the corresponding bond lengths as a metric on the topological molecular graph, we developed a new set of GTRDIs, kβλ (λ=0-3, k=1-4), which can discriminate between the (hetero)atoms. The subscript λ represents all possible paths of length λ.
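The construction of the RD matrix and of its row sums can be sketched for a small hydrogen-suppressed graph. The vertex labeling of 2-propanol used below is a hypothetical example, and only the first-order row sums are shown, since the full GTRDI formulas are not given in this section; the graph is assumed to be connected.

```python
from collections import deque


def distance_matrix(n_vertices, edges):
    """Topological distance matrix of a connected graph (BFS from each vertex)."""
    adj = [[] for _ in range(n_vertices)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    D = []
    for start in range(n_vertices):
        dist = [None] * n_vertices
        dist[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if dist[w] is None:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        D.append(dist)
    return D


def reciprocal_distance_matrix(D):
    """RD matrix: off-diagonal entries 1/d_ij, zeros on the diagonal."""
    return [[0.0 if i == j else 1.0 / d for j, d in enumerate(row)]
            for i, row in enumerate(D)]


# Hypothetical labeling of 2-propanol: 0 = C1, 1 = C2, 2 = C3, 3 = O (bonded to C2).
edges = [(0, 1), (1, 2), (1, 3)]
D = distance_matrix(4, edges)
RD = reciprocal_distance_matrix(D)
row_sums = [sum(row) for row in RD]  # one local vertex invariant per atom
```

The central carbon, which is adjacent to every other vertex, receives the largest row sum, in line with the reading of these invariants as a local measure of crowding.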
4.2. Brief Review of Several Topological Distance Indices
The distance matrix D(Γ) = {dij} of a graph Γ is an important graph invariant. Its entries dij, called distances, are equal to the number of edges connecting the vertices i and j on the shortest path between them. Thus all dij are integers, and dij = 1 for nearest neighbors; by definition, dii = 0. Therefore, the distance matrix D = D(Γ) of a labeled connected graph Γ is a real symmetric N×N matrix whose elements dij are defined as [74,75]:

$$ D = \{d_{ij}\}, \qquad d_{ij} = \begin{cases} l_{ij} & \text{if } i \ne j \\ 0 & \text{otherwise} \end{cases} \tag{23} $$
where lij is the topological length of the shortest path, i.e. the minimum number of edges between the vertices i and j in Γ. The length of the shortest path lij is also called [75] the distance between the vertices i and j in Γ, hence the name “distance matrix” for D. Many TDIs have been developed on the basis of D. We selected some of these for the present study, in which we analyze the relationship between TDIs and molecular vdW space. Among the TDIs that can be derived from D, the most widely investigated and applied is the Wiener number [76]. Besides the Wiener number [64, 77], we briefly present the following TDIs used in our analysis: the polarity number [64, 77, 78], the Platt index [77], and the Balaban J index [79, 80]. The values of these TIs for the aliphatic alcohols of Table 1 are collected in Table 4.
(A) Wiener Index
The Wiener index, W, [64, 77] was defined as the sum of the number of bonds separating all pairs of atoms in an acyclic molecule. It is easy to show that this index is equal to the half-sum of the off-diagonal elements of D [81]:
$$W = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} d_{ij}; \quad i \neq j \tag{24}$$
where N is the total number of vertices (atoms) in Γ.
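As an illustration of relations (23) and (24), the following minimal Python sketch builds the distance matrix of a hydrogen-suppressed graph by breadth-first search and then takes the half-sum of its off-diagonal entries. The function names (`distance_matrix`, `wiener_index`, `path_graph`) and the edge-list input format are assumptions made for this example; they are not taken from the programs cited in this chapter. The graph is assumed to be connected.

```python
from collections import deque


def distance_matrix(n, edges):
    """Topological distance matrix of a connected graph with vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = [[0] * n for _ in range(n)]
    for src in range(n):
        seen = {src}
        queue = deque([(src, 0)])
        while queue:
            v, d = queue.popleft()
            dist[src][v] = d  # number of edges on the shortest path src -> v
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append((w, d + 1))
    return dist


def wiener_index(dist):
    """Half-sum of the off-diagonal elements of the distance matrix, relation (24)."""
    n = len(dist)
    return sum(dist[i][j] for i in range(n) for j in range(n) if i != j) // 2


def path_graph(n):
    """Edge list of an unbranched chain of n vertices (illustrative helper)."""
    return [(i, i + 1) for i in range(n - 1)]


# Unbranched chains of 3 and 14 non-hydrogen atoms (ethanol and 1-tridecanol,
# with the oxygen atom treated as one more vertex of the chain).
print(wiener_index(distance_matrix(3, path_graph(3))))
print(wiener_index(distance_matrix(14, path_graph(14))))
```

For the 14-vertex chain the sketch gives W = 455 and for the 3-vertex chain W = 4, which are the values listed in Table 4 for 1-tridecanol and ethanol.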
(B) Polarity Number
Wiener has also introduced the so-called polarity number, P. P is the number of pairs of vertices separated by three edges, that is, half of the number of distances of length three:
$$P = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} (i, j); \quad \forall i, j \ \text{where}\ d_{ij} = 3 \tag{25}$$
In relation (25) each term (i, j) counts one ordered pair of vertices at distance three, and N represents the total number of vertices in Γ. The ½ factor before the sums in (25) compensates for the fact that each pair of vertices i and j separated by three edges in Γ is counted twice (once in each direction). W and P have been applied to correlations with boiling points, heats of formation and vaporization, and other physical properties of alkanes [64, 77, 78].
(C) Platt Index
The Platt (nearest-neighbor edges) index F is calculated by summing for each edge the number of its adjacent edges [78]:
$$F = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} d_{ij}; \quad \forall i, j \ \text{where}\ d_{ij} = 1 \tag{26}$$
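The verbal definitions of the polarity number and of the Platt index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `polarity_and_platt` and the edge-list input are hypothetical, the graph is assumed to be simple and connected, and the Platt index is implemented from the sentence "summing for each edge the number of its adjacent edges" rather than from any particular program.

```python
from collections import deque


def bfs_distances(adj, src):
    """Shortest-path lengths (in edges) from src to every reachable vertex."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def polarity_and_platt(n, edges):
    """Return (polarity number, Platt index) of a simple connected graph."""
    adj = [set() for _ in range(n)]
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    # Polarity number: unordered pairs of vertices separated by three edges.
    polarity = 0
    for i in range(n):
        d = bfs_distances(adj, i)
        polarity += sum(1 for j in range(i + 1, n) if d.get(j) == 3)
    # Platt index: an edge (a, b) is adjacent to (deg a - 1) + (deg b - 1) edges.
    platt = sum(len(adj[a]) + len(adj[b]) - 2 for a, b in edges)
    return polarity, platt


chain14 = [(i, i + 1) for i in range(13)]   # 1-tridecanol skeleton
star4 = [(0, 1), (0, 2), (0, 3), (0, 4)]    # 2-methyl-2-propanol skeleton
print(polarity_and_platt(14, chain14))
print(polarity_and_platt(5, star4))
```

For the 14-vertex chain the sketch returns 11 pairs at distance three and a Platt sum of 24. Table 4 lists the same two numbers for compound 1, with 24 in the P column and 11 in the F column.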
Table 4. Values of Wiener, Polarity, Platt, and Balaban TIs corresponding to aliphatic alcohols in Table 2
No.    W     P    F    J
1      455   24   11   2.785
2      364   22   10   2.758
3      286   20   9    2.727
4      220   18   8    2.691
5      165   16   7    2.648
6      202   20   9    2.973
7      158   18   7    2.773
8      120   14   6    2.595
9      180   26   10   3.376
10     104   16   8    3.092
11     84    12   5    2.530
12     110   16   7    2.877
13     114   16   6    2.747
14     102   16   8    3.175
15     110   28   15   4.328
16     56    10   4    2.448
17     52    12   4    2.678
18     65    18   8    3.464
19     46    16   4    3.155
20     28    14   3    3.169
21     29    12   4    2.994
22     35    8    3    2.339
23     32    10   3    2.627
24     32    10   3    2.627
25     31    10   4    2.754
26     18    8    2    2.540
27     20    6    2    2.191
28     18    8    2    2.540
29     10    4    1    1.975
30     16    12   0    3.024
31     9     6    0    2.324
32     4     2    0    1.633
33     1     0    0    1.000
(D) Balaban Index
Balaban [79, 80] has proposed a topological index which can be described as the average distance sum connectivity. The Balaban topological index J of a molecular graph Γ is defined as [79]:
$$J = \frac{m}{\mu + 1}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N} (d_i d_j)^{-1/2}; \quad \forall i, j \ \text{where}\ d_{ij} = 1 \tag{27}$$
where m is the number of edges in Γ, μ is the cyclomatic number, and the vertices i and j are adjacent. The distance sum $d_k$ for a vertex k in Γ represents the sum of all entries of the k-th row or column in the distance matrix, D [79]:
$$d_k = \sum_{i=1}^{N} d_{ki}; \quad k = 1, \ldots, N; \ k \neq i \tag{28}$$
The cyclomatic number μ = μ(Γ), i.e. the number of cycles in Γ, is given by [80]
$$\mu = m - N + 1 \tag{29}$$
where N is the number of vertices in Γ. Relation (29) is the well-known Euler equation connecting the number of vertices (N), edges (m) and cycles (μ) in a planar graph. Average distance sums were used in relation (27) instead of distance sums because distance sums increase approximately in parallel with m for the same type of branching. The cyclomatic number μ, defined in relation (29), was introduced in the definition of J because the presence of cycles markedly reduces the distance sums [58].
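Relations (27)-(29) can be checked numerically with the short sketch below. It is a minimal illustration, not the implementation used for Table 4: the function names `distance_sums` and `balaban_j` and the edge-list input are assumptions, and the graph is assumed to be connected.

```python
from collections import deque
from math import sqrt


def distance_sums(n, edges):
    """Row sums d_k of the distance matrix, relation (28)."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    sums = []
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        sums.append(sum(dist.values()))
    return sums


def balaban_j(n, edges):
    """Balaban index J, relation (27), summed over the edges of the graph."""
    m = len(edges)
    mu = m - n + 1                      # cyclomatic number, relation (29)
    d = distance_sums(n, edges)
    return m / (mu + 1) * sum(1.0 / sqrt(d[a] * d[b]) for a, b in edges)


print(round(balaban_j(3, [(0, 1), (1, 2)]), 3))          # ethanol skeleton
print(round(balaban_j(4, [(0, 1), (1, 2), (2, 3)]), 3))  # 1-propanol skeleton
```

With the oxygen atom treated as an ordinary vertex, the sketch gives 1.633 for the three-vertex chain and 1.975 for the four-vertex chain, the values listed in Table 4 for ethanol and 1-propanol.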
4.3. Generalized Topological Distance Indices
Another graph invariant is the reciprocal distance matrix $RD = \{d_{ij}^{-1}\}$, i, j = 1, ..., N, where N is the total number of graph vertices. This is a symmetric matrix whose elements are the reciprocals of the topological distances [24, 71, 82, 83]. The first TDIs proposed on the basis of RD were developed by a two-step process, as follows [24, 82, 83]. (i) The LOVI of each vertex in a molecular graph Γ, denoted below by μi, was defined from RD using the following relation [24, 82]:
$$\mu_i = \sum_{j=1}^{N} d_{ij}^{-1}; \quad i = 1, \ldots, N; \ i \neq j \tag{30}$$
In relation (30) dij is the topological distance between the vertices i and j, N represents the total number of vertices (i.e. non-hydrogen atoms) in Γ, and the summation is made over all possible paths, from dij = 1 to dij = max(dij). Thus, each vertex is well characterized; its LOVI contains global information on the topological structure of Γ, the topological interaction between vertices i and j decreasing as the distance dij increases. That is, for each vertex i, the quantity μi may be viewed as a measure of the influence of all other vertices in a given graph Γ on the vertex i. (ii) The LOVIs μi were condensed into a TDI, hδ, with the aid of the Randić-type formula [84], namely the generalized molecular connectivity [85], as follows [24, 82]:
$${}^{h}\delta = \sum_{\text{paths}} (\mu_i \mu_j \cdots \mu_h)^{-1/2} \tag{31}$$
These topological distances connectivity indices (TDCIs) [24, 82], also called topological distance measure connectivity indices (TDMCIs) [83], of order higher than three, have not been used in correlation due to the expected small contributions to the molecular properties. The TDCIs of order one (1δ), two (2δ) and three (3δ) have been calculated by the following relations [24, 82]:
$${}^{1}\delta = \sum_{j=1}^{N} \mu_j^{-1/2} \tag{32}$$
$${}^{2}\delta = \sum_{i,j} (\mu_i \mu_j)^{-1/2}; \quad i \neq j; \ \forall i, j \ \text{where}\ d_{ij} = 1 \tag{33}$$
$${}^{3}\delta = \sum_{i,j,k} (\mu_i \mu_j \mu_k)^{-1/2}; \quad i, j, k = 1, \ldots, N; \ i \neq j \neq k; \ \forall i, j, k \ \text{where}\ d_{ijk} = 2 \tag{34}$$
Monoparametric correlations with molecular properties such as boiling temperatures (at normal pressure), gas chromatographic retention indices, atomization enthalpies, and molar refractions for alkanes were performed. The reported results for 2δ are very good, the correlation coefficients r being in the range 0.983-0.991 [24, 82]. We extended the TDCIs by generalization of relation (30) as follows [73]:
$${}^{k}\mu_i = \sum_{j=1}^{N} d_{ij}^{-k}; \quad k = 1, 2, 3, 4, \ldots; \ i, j = 1, \ldots, N; \ i \neq j \tag{35}$$
Thus, we obtain a set of generalized topological distance indices based on the reciprocal distance matrix (GTRDIs), kδλ, where k is the same as in relation (35), which can be calculated with the following formulas:
$${}^{k}\delta_0 = \sum_{i=1}^{N} {}^{k}\mu_i; \quad k = 1, 2, 3, 4, \ldots \tag{36}$$
$${}^{k}\delta_1 = \sum_{i=1}^{N} ({}^{k}\mu_i)^{-1/2}; \quad k = 1, 2, 3, 4, \ldots \tag{37}$$
$${}^{k}\delta_2 = \sum_{i,j} ({}^{k}\mu_i \, {}^{k}\mu_j)^{-1/2}; \quad k = 1, 2, 3, 4, \ldots; \ i, j = 1, \ldots, N; \ i \neq j; \ \text{where}\ d_{ij} = 1 \tag{38}$$
$${}^{k}\delta_3 = \sum_{i,j,l} ({}^{k}\mu_i \, {}^{k}\mu_j \, {}^{k}\mu_l)^{-1/2}; \quad k = 1, 2, 3, 4, \ldots; \ i, j, l = 1, \ldots, N; \ i \neq j \neq l; \ \text{where}\ d_{ij} = d_{jl} = 1 \tag{39}$$
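Relations (35)-(39) can be made concrete with the following minimal Python sketch. It is not the IRS program cited in this chapter: the function name `gtrdi`, the edge-list input and the choice of breadth-first search are assumptions made for illustration. The sketch assumes a simple connected graph, counts each edge once in relation (38), and counts each path of two consecutive edges once in relation (39).

```python
from collections import deque
from itertools import combinations


def gtrdi(n, edges, k=1):
    """Return (delta_0, delta_1, delta_2, delta_3) of order k, relations (36)-(39)."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    # LOVIs of relation (35): row sums of the k-th power of reciprocal distances.
    mu = []
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        mu.append(sum(d ** -k for v, d in dist.items() if v != src))
    delta0 = sum(mu)                                            # relation (36)
    delta1 = sum(m ** -0.5 for m in mu)                         # relation (37)
    delta2 = sum((mu[a] * mu[b]) ** -0.5 for a, b in edges)     # relation (38)
    delta3 = sum((mu[i] * mu[j] * mu[l]) ** -0.5                # relation (39)
                 for j in range(n)
                 for i, l in combinations(adj[j], 2))
    return delta0, delta1, delta2, delta3


ethanol = [(0, 1), (1, 2)]  # C-C-O, with the oxygen treated as a carbon vertex
print([round(x, 3) for x in gtrdi(3, ethanol, k=1)])
print([round(x, 3) for x in gtrdi(3, ethanol, k=2)])
```

For the three-vertex chain the sketch gives (5.000, 2.340, 1.155, 0.471) for k = 1 and (4.500, 2.496, 1.265, 0.566) for k = 2, which agree with the entries of Table 5 for compound 34.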
One may easily observe that the TDMCIs in relations (32)-(34) are included in the GTRDIs of relations (36)-(39), and that there is a formal identity between λδ and 1δλ (λ = 1-3). The sixteen GTRDIs corresponding to k = 1-4 in relations (36)-(39) have been calculated with the IRS computer program [7, 47, 86] for 72 alkanes with N = 2-9 carbon atoms. GTRDIs have been used with good results for correlating the boiling points of these alkanes [73]. The values of GTRDIs for the series of 35 aliphatic alcohols in Table 1 are given below in Table 5. In the computation of GTRDIs the oxygen atom is treated as a carbon atom. Consequently, the kδλ values are quantitative measures of the topological structure of the equivalent graphs of alkane molecules.
Table 5. Values of generalized topological reciprocal distance indices (GTRDIs) corresponding to the alcohols in Table 2
No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
1δ0 63.044 56.684 50.477 44.437 38.579 45.932 39.357 32.921 48.471 34.600 27.486 34.052 33.671 34.767 45.500 22.300 22.967 30.333 24.167 19.000 18.167 18.667 17.400 18.000 18.000 19.000 18.167 13.333 12.833 13.333 8.667 14.000 9.000 5.000 2.000
2δ0 37.625 34.483 31.353 28.237 25.137 29.008 25.613 22.058 30.548 22.959 19.003 22.708 22.527 23.028 29.208 15.979 16.424 20.778 17.264 14.167 13.514 13.889 12.997 13.417 13.417 14.167 13.514 10.444 10.069 10.444 7.222 11.000 7.500 4.500 2.000
3δ0 30.439 28.041 25.643 23.246 20.851 23.593 21.098 18.458 24.382 18.863 16.068 18.772 18.704 18.885 22.830 13.681 13.922 16.968 14.390 11.972 11.578 11.796 11.301 11.535 11.535 11.972 11.578 9.148 8.929 9.148 6.574 9.500 6.750 4.250 2.000
4δ0 27.903 25.738 23.574 21.410 19.246 21.566 19.370 17.082 21.958 17.260 14.918 17.230 17.206 17.267 20.175 12.755 12.879 15.354 13.122 10.949 10.732 10.849 10.593 10.715 10.715 10.949 10.732 8.549 8.432 8.549 6.275 8.750 6.375 4.125 2.000
1δ1 6.653 6.280 5.903 5.523 5.138 5.444 5.090 4.750 5.305 4.652 4.358 4.681 4.701 4.644 4.764 3.960 3.909 4.169 3.820 3.421 3.494 3.450 3.558 3.506 3.506 3.421 3.494 3.101 3.151 3.101 2.742 3.030 2.699 2.340 2.000
2δ1 8.683 8.122 7.560 6.998 6.435 6.962 6.407 5.871 6.854 5.832 5.306 5.839 5.844 5.830 6.158 4.741 4.715 5.197 4.645 4.082 4.149 4.117 4.176 4.151 4.151 4.082 4.149 3.587 3.610 3.587 3.048 3.524 3.027 2.496 2.000
3δ1 9.679 9.033 8.388 7.742 7.097 7.790 7.134 6.451 7.840 6.510 5.806 6.498 6.489 6.512 7.189 5.160 5.198 5.913 5.212 4.565 4.561 4.586 4.515 4.553 4.553 4.565 4.561 3.909 3.870 3.909 3.227 3.911 3.261 2.593 2.000
4δ1 10.117 9.437 8.757 8.078 7.398 8.182 7.490 6.718 8.374 6.835 6.039 6.821 6.811 6.836 7.797 5.359 5.451 6.322 5.546 4.865 4.781 4.862 4.679 4.772 4.772 4.865 4.781 4.092 4.000 4.092 3.322 4.171 3.406 2.647 2.000
Table 5. Continued
No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
1δ2 2.863 2.726 2.586 2.444 2.299 2.334 2.220 2.150 2.137 2.005 1.997 2.042 2.067 1.995 1.797 1.840 1.746 1.700 1.604 1.430 1.559 1.475 1.678 1.577 1.577 1.430 1.559 1.401 1.509 1.401 1.334 1.265 1.225 1.155 1.000
2δ2 4.794 4.479 4.162 3.845 3.527 3.677 3.374 3.207 3.296 3.018 2.886 3.040 3.053 3.014 2.703 2.564 2.405 2.402 2.171 1.838 2.068 1.915 2.240 2.077 2.077 1.838 2.068 1.746 1.914 1.746 1.587 1.512 1.414 1.265 1.000
3δ2 5.916 5.499 5.083 4.666 4.249 4.511 4.078 3.831 4.091 3.683 3.414 3.673 3.661 3.685 3.438 2.996 2.825 2.923 2.572 2.146 2.416 2.236 2.578 2.405 2.405 2.146 2.416 1.983 2.160 1.983 1.743 1.706 1.549 1.333 1.000
4δ2 6.448 5.986 5.524 5.062 4.600 4.928 4.436 4.138 4.534 4.029 3.676 4.001 3.974 4.032 3.893 3.213 3.049 3.232 2.807 2.339 2.611 2.430 2.751 2.587 2.587 2.339 2.611 2.122 2.289 2.122 1.828 1.835 1.633 1.372 1.000
1δ3 1.210 1.160 1.109 1.057 1.002 1.085 1.099 0.946 1.285 0.938 0.886 0.995 1.047 0.925 1.207 0.823 0.940 1.080 1.119 1.108 0.838 0.987 0.755 0.888 0.888 1.108 0.838 0.843 0.679 0.843 0.591 1.200 0.866 0.471 0.000
2δ3 2.604 2.426 2.248 2.068 1.888 2.122 2.078 1.706 2.491 1.664 1.523 1.778 1.899 1.651 2.171 1.337 1.538 1.830 1.806 1.658 1.243 1.492 1.149 1.358 1.358 1.658 1.243 1.186 0.958 1.186 0.762 1.714 1.155 0.566 0.000
3δ3 3.575 3.306 3.036 2.767 2.498 2.894 2.794 2.228 3.491 2.211 1.957 2.356 2.525 2.211 3.095 1.687 1.987 2.473 2.400 2.140 1.546 1.908 1.416 1.718 1.718 2.140 1.546 1.449 1.144 1.449 0.873 2.182 1.386 0.629 0.000
4δ3 4.075 3.761 3.447 3.133 2.819 3.323 3.195 2.505 4.115 2.520 2.190 2.683 2.881 2.532 3.745 1.876 2.253 2.889 2.807 2.479 1.727 2.178 1.562 1.938 1.938 2.479 1.727 1.614 1.248 1.614 0.935 2.526 1.540 0.666 0.000
We supposed that relations (30) and (35) describe the topological interactions $^{k}\mu_i$ between all the vertices j and the vertex i; i, j = 1, ..., N; j ≠ i; N is the number of vertices (non-hydrogen atoms) in the molecular graph Γ. This interaction depends on the distance dij between the vertices i and j, that is, on the number of edges separating them. Paths of length one (dij = 1) represent bonds, and paths of length two (dij = 2) signify two consecutive bonds. Longer paths represent a string of consecutive bonds. It has been proposed that paths are to be viewed as “one of the elementary structural concepts which are well understood and need no explanation” [63]. One may consider that molecules contain an internal topological energy, due to the strains appearing among the vertices of the corresponding chemical graphs. This is an internal topological strain, which decreases when the distance between the vertices increases. Thus, the contribution of adjacent vertices to the internal topological strain (ITS) is the highest; it is equal to 1 for all values of k. It diminishes as the topological distance between two vertices grows. Thus, from the definition of the local topological distance strain (LTDS), see relations (30) and (35), the contribution of the vertices in a molecular graph Γ decreases as the distance between them increases and as the power k increases. The individual LTDS value of order k associated with a given vertex j, $^{k}\mu_j$, is a measure of the topological strain due to all other vertices i of the molecular graph, with the condition j ≠ i. The topological strain exerted by each node of Γ on a given node depends on the reciprocal distance between these nodes, raised to the power k. In this way, we have obtained a variable value for the local topological distance strain, depending on the value of k. In this study the value of k was limited to the integers k = 1, 2, 3, 4. The value of λ refers to the paths in the chemical graph; see relations (36)-(39). It corresponds to the generalized molecular connectivity [85].
5. QSTR Models of the Toxicity of Aliphatic Alcohols
5.1. Introduction
There is an increased emphasis on predicting the toxic effects of chemical compounds from molecular structure. The ability to use a QSTR analysis to estimate accurately the relative toxicity of chemicals would be of value to the chemical and pharmaceutical industries. A variety of toxicity data sets have been compiled for QSTR studies. Of these, the population growth inhibition of the freshwater ciliate Tetrahymena pyriformis is among the largest. These toxicity data have been derived for the express purpose of QSTR development and validation [5, 87]. Among the most widespread industrial organic chemicals in the world are the aliphatic alcohols, as indicated by the high production volume chemical list [88]. The alcohols exhibit their toxic effect by means of narcosis, which is a general term that describes non-covalent interactions between xenobiotics and cellular membranes. When accumulating, they disturb the function of cellular membranes, and above a certain concentration they may cause death [5, 87]. The toxicant molecule partitions into the lipid bilayer, and if a critical volume is reached death occurs. The toxicity of substances is governed by their properties, which in turn are determined by their chemical structure. Therefore, there are interrelationships between structure, properties, and toxicity [57]. The most important problem in QSTR analysis is to convert chemical structure into molecular descriptors (MDs) that have good predictive ability and are relevant to the physical, chemical or biological properties [89]. Unfortunately, there are many molecular descriptors that are not so easy to interpret. Such descriptors include the topological indices, of which the molecular connectivity descriptors are the most frequently used [58, 63, 84, 85]. In recent years our research has been concerned mainly with developing topological and van der Waals (vdW) descriptors and investigating their use in QSAR (Quantitative Structure-Activity Relationship) and QSTR studies [18, 50, 66-68, 73, 82, 83, 90]. In this chapter we also present some QSTR studies on a series of 33 aliphatic alcohols, which exhibit toxic effects on the ciliate protozoan Tetrahymena pyriformis. These studies have been performed using the compressibility (CiD) and ovality (θi) vdW molecular descriptors [91] and the topological descriptors GTRDIs [92] presented above.
5.2. Modeling of Alcohol Toxicity with Van Der Waals Molecular Descriptors
The data set was collected from the literature [5, 93]. Each alcohol was tested in three replicate assays on the ciliate Tetrahymena pyriformis. Each test replicate consisted of six to ten different concentrations. The reported toxic activity was the 50% growth inhibitory concentration (IGC50), expressed in millimolar units [5]. We used as experimental biological activity, A, the logarithm of the inverse of the concentration that produces 50% growth inhibition of T. pyriformis. The values of A = log(1/IGC50) for the series of 33 alcohols used in this QSTR study are presented in Table 6. The T. pyriformis toxicity data for various chemicals are available at the Tetratox database Web site [93]. The molecular descriptors used in the correlations are the experimental and calculated 1-octanol/water partition coefficient (log KOW) and the compressibility molecular descriptors C1D (in Å), C2D (in Å2), and C3D (in Å3). The linear QSTR models obtained by correlating toxicity (A) with the compressibility descriptors CiD, i = 1-3, and with logP are the following:
$$\hat{A} = -3.2635(\pm 0.0832) + 4.6399(\pm 0.1228) \cdot C_{1D} \tag{40}$$
$$n = 33;\ s = 0.185;\ r = 0.989;\ r^{2}_{adj} = 0.978;\ F = 1428;\ q^{2}_{LOO} = 0.976$$
$$\hat{A} = -2.5692(\pm 0.0698) + 0.0385(\pm 0.0011) \cdot C_{2D} \tag{41}$$
$$n = 33;\ s = 0.193;\ r = 0.988;\ r^{2}_{adj} = 0.976;\ F = 1301;\ q^{2}_{LOO} = 0.973$$
$$\hat{A} = -2.1282(\pm 0.0898) + 0.0168(\pm 0.0007) \cdot C_{3D} \tag{42}$$
$$n = 33;\ s = 0.290;\ r = 0.974;\ r^{2}_{adj} = 0.946;\ F = 562;\ q^{2}_{LOO} = 0.936$$
$$\hat{A} = -2.0298(\pm 0.0561) + 0.7699(\pm 0.0210) \cdot \log P \tag{43}$$
$$n = 33;\ s = 0.190;\ r = 0.989;\ r^{2}_{adj} = 0.977;\ F = 1351;\ q^{2}_{LOO} = 0.974$$
where $\hat{A}$ stands for the calculated value of the experimental inhibitory activity, n represents the number of data, s is the standard error, and r, $r^{2}_{adj}$, and $q^{2}_{LOO}$ are the correlation coefficient, the coefficient of determination adjusted for the degrees of freedom, and the cross-validation coefficient of the leave-one-out method, respectively. The statistical tests F and t are used at the 95% reliability degree. The Student (t-) test was used for calculating the confidence limits of the parameters of the linear models (40)-(43).
Table 6. Data used in QSTR analysis; the toxic activity of alcohols to Tetrahymena pyriformis is expressed by A=log(1/IGC50), where IGC50 is the concentration that produces a 50% inhibition of Tetrahymena pyriformis growth (in mM)#
No.  Name of alcohol                       A   Â(40)   Â(41)   Â(42)   Â(43)
1.   1-tridecanol                      2.450   2.388   2.683   3.012   2.266
2.   1-dodecanol                       2.161   1.998   2.177   2.375   1.920
3.   1-undecanol                       1.955   1.608   1.685   1.775   1.458
4.   1-decanol                         1.335   1.209   1.199   0.553   1.489
5.   1-nonanol                         0.855   0.806   0.728   0.681   0.873
6.   4-decanol                         0.850   1.070   1.069   1.082   0.880
7.   2-nonanol                         0.618   0.745   0.677   0.631   0.873
8.   1-octanol                         0.583   0.370   0.272   0.177   0.280
9.   3,7-dimethyl-3-octanol            0.340   0.736   0.754   0.772   0.680
10.  2-ethyl-1-hexanol                 0.167   0.073  -0.002  -0.054   0.134
11.  1-heptanol                        0.105  -0.039  -0.184  -0.268   0.064
12.  3-octanol                         0.031   0.221   0.128   0.064   0.064
13.  2-octanol                         0.001   0.309   0.197   0.121   0.203
14.  2-propyl-1-pentanol              -0.134   0.063  -0.007  -0.057   0.134
15.  3-ethyl-2,2-dimethyl-3-pentanol  -0.169  -0.299  -0.252  -0.229   0.172
16.  1-hexanol                        -0.379  -0.480  -0.605  -0.684  -0.467
17.  4-methyl-1-pentanol              -0.637  -0.633  -0.725  -0.773  -0.682
18.  2,4-dimethyl-3-pentanol          -0.705  -0.605  -0.636  -0.655  -0.544
19.  3,3-dimethyl-1-butanol           -0.737  -0.781  -0.843  -0.870  -0.783
20.  2,2-dimethyl-1-propanol          -0.870  -1.125  -1.167  -1.165  -1.021
21.  3-methyl-2-butanol               -0.996  -1.194  -1.214  -1.197  -1.044
22.  1-pentanol                       -1.030  -0.916  -1.017  -1.051  -0.829
23.  3-methyl-1-butanol               -1.036  -1.050  -1.110  -1.121  -1.137
24.  2-pentanol                       -1.160  -1.055  -1.114  -1.123  -1.114
25.  3-pentanol                       -1.244  -1.078  -1.130  -1.136  -1.098
26.  2-methyl-1-propanol              -1.372  -1.435  -1.455  -1.403  -1.445
27.  1-butanol                        -1.431  -1.370  -1.404  -1.373  -1.352
28.  2-butanol                        -1.542  -1.431  -1.448  -1.400  -1.560
29.  1-propanol                       -1.746  -1.844  -1.755  -1.650  -1.837
30.  2-methyl-2-propanol              -1.791  -1.463  -1.449  -1.413  -1.760
31.  2-propanol                       -1.882  -1.862  -1.780  -1.656  -1.991
32.  Ethanol                          -1.991  -2.284  -2.068  -1.861  -2.268
33.  Methanol                         -2.666  -2.721  -2.328  -2.016  -2.623
# The compounds were sorted by activity, in decreasing order; Â(n) are the calculated values of toxicity A with eqs. n = 40, 41, 42, 43, respectively.
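The statistics quoted under relations (40)-(43) can be illustrated with a short Python sketch for a one-descriptor linear model. The chapter does not list the formulas it used, so the ones below are the standard expressions for simple least-squares regression and leave-one-out cross-validation, adopted here as an assumption; the function names and the five toy data points are hypothetical and are not taken from Table 6.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares intercept a and slope b of y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def regression_statistics(x, y):
    """Fit statistics of a one-descriptor model, plus leave-one-out q2."""
    n = len(x)
    a, b = fit_line(x, y)
    my = sum(y) / n
    ss_tot = sum((yi - my) ** 2 for yi in y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / ss_tot
    stats = {
        "a": a,
        "b": b,
        "s": sqrt(ss_res / (n - 2)),                  # standard error of estimate
        "r": sqrt(max(r2, 0.0)) * (1 if b >= 0 else -1),
        "r2_adj": 1.0 - (1.0 - r2) * (n - 1) / (n - 2),
        "F": float("inf") if ss_res == 0 else (ss_tot - ss_res) / (ss_res / (n - 2)),
    }
    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        ai, bi = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (ai + bi * x[i])) ** 2
    stats["q2_loo"] = 1.0 - press / ss_tot
    return stats


# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
stats = regression_statistics(x, y)
print({name: round(value, 3) for name, value in stats.items()})
```

Because each leave-one-out residual is at least as large in magnitude as the corresponding fitted residual, the cross-validated coefficient obtained this way never exceeds the squared correlation coefficient of the fit, in line with the pattern seen in relations (40)-(43).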
The correlation results are very good, as expressed by the correlation coefficients; r varies from 0.989 for model (40), corresponding to the predictor variable C1D, to 0.974 for model (42), with C3D as predictor variable. The 95% confidence limits of the parameters of the linear models are about 5% of the parameter values for C1D (see the ± values in relation 40), slightly greater than 5% for C2D (relation 41), and over 8% for C3D (relation 42). The standard error is less than 4% of the range of experimental values, A, for the linear models (40) and (41), corresponding to the C1D and C2D descriptors, and about 6% for model (42), with C3D as the independent variable. The variance of the experimental data is explained better by the compressibility descriptors C1D and C2D (about 98% of the variance; see the values of $r^{2}_{adj}$ in equations (40) and (41)); C3D explains only 95% of the variance of the log(1/IGC50) values. The reliability of models (40) and (41) is very good (see the values of F, 1428 and 1301, respectively), as is their predictive ability (see the values of $q^{2}_{LOO}$, 0.976 and 0.973, respectively). Consequently, the predictive power (or goodness of prediction) of these models is very good, taking into account the commonly accepted value for a satisfactory QSTR model, $q^{2} > 0.500$. The statistical quality of the QSTR models using the C1D and logP descriptors is similar (see the values of s, r, and $r^{2}_{adj}$ in equations (40) and (43)); the F statistic is an indication that C1D works slightly better than logP for the series of alcohols in Table 6. Several QSTR models predicting chemical toxicity to T. pyriformis have been published [5, 87, 89, 94]. They were mainly based on the logarithm of the octanol-water partition coefficient (logP, also referred to as log KOW), as this hydrophobicity term reflects the ability of a molecule to enter cells through the lipid membranes and indicates both toxicant uptake and baseline toxicity. Nevertheless, the experimental determination of logP values can be a complex matter, and experimental values can differ greatly even for the same compound. Thus, several approaches have been developed for the theoretical calculation of logP [95], but in these calculations, too, differences of several orders of magnitude are not uncommon [96]. Therefore, the CiD compressibility measures of molecular vdW space can be used in QSTR studies as predictor variables, in place of log KOW. These structural descriptors are easy to calculate for any molecule, whatever its geometry. They describe the compression ability of the molecules of chemical compounds in their specific environments. To assess the predictive ability of the statistical QSTR model (40), we split the data set from Table 1 into a test set and a training set, using the rule [24, 97, 98] described below. To this end, the data were sorted by toxicity values, Ai, i = 1, ..., 33, in decreasing order (see Table 6), and the compounds were alternately assigned to the test (validation) set and the training (calibration) set, and vice versa. Thus, about 50% of the compounds were used for training and 50% for testing. That is, the QSTR model obtained for the subset composed of the odd-ranking compounds was used to calculate the toxic activities, $A_i^{p}$, of the even-ranking ("pair") subset, i = 2n, n = 1, ..., 16, and, reciprocally, the QSTR model developed for the even-ranking subset was used to estimate the toxic activities, $A_i^{p}$, of the compounds belonging to the odd-ranking subset, i = 2n-1, n = 1, ..., 17. Finally, the experimental toxicities, Ai, were correlated with the predicted toxicities, $A_i^{p}$, for all alcohols i = 1, ..., 33 in Table 6 (see eq. 44), and the corresponding statistics were estimated. The procedure described above is an LHO-type cross-validation method; it will be referred to below as the Leave odd-pair Out (Lo-p-O) cross-validation technique.
$$A^{p} = -0.0064 + 0.9801 \cdot A \tag{44}$$
$$n = 33;\ RMSE = 0.181;\ r^{2}_{PE} = 0.979;\ q^{2}_{Lo\text{-}p\text{-}O} = 0.979$$
The plot of the toxicity values predicted with the Lo-p-O cross-validation method, $A_i^{p}$, i = 1, ..., 33, against the experimental toxicity values, Ai, i = 1, ..., 33, is presented in Figure 2. One can see from this figure that the data points are very close to the correlation line. The slope of the line is about 1 (0.980), corresponding to an angle of 45 degrees, and the intercept is close to 0 (-0.0064).
Figure 2. The plot of predicted toxicity versus experimental toxicity, $A^{p}$ vs. A [A = log(1/IGC50)].
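The Lo-p-O procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the model is a one-descriptor least-squares line, and the coefficient of prediction is taken as 1 - PRESS/SS, a common definition that is assumed here because the chapter does not give the formula explicitly. The ten data points are synthetic and exactly linear, so that the outcome can be checked by hand.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares intercept a and slope b of y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def lo_p_o_predictions(x, y):
    """Predict every compound from the model fitted on the other rank parity."""
    order = sorted(range(len(y)), key=lambda i: y[i], reverse=True)
    odd, even = order[0::2], order[1::2]   # ranks 1, 3, 5, ... and 2, 4, 6, ...
    pred = [0.0] * len(y)
    for train, test in ((odd, even), (even, odd)):
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            pred[i] = a + b * x[i]
    return pred


def prediction_statistics(y, pred):
    """Return (q2, squared correlation r2_PE, RMSE) of predicted vs. experimental."""
    n = len(y)
    my, mp = sum(y) / n, sum(pred) / n
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_y = sum((yi - my) ** 2 for yi in y)
    ss_p = sum((pi - mp) ** 2 for pi in pred)
    s_yp = sum((yi - my) * (pi - mp) for yi, pi in zip(y, pred))
    q2 = 1.0 - press / ss_y            # assumed definition of the coefficient of prediction
    r2_pe = s_yp ** 2 / (ss_y * ss_p)
    return q2, r2_pe, sqrt(press / n)


# Synthetic, exactly linear data (illustration only).
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 for xi in x]
q2, r2_pe, rmse = prediction_statistics(y, lo_p_o_predictions(x, y))
print(round(q2, 6), round(r2_pe, 6), round(rmse, 6))
```

With exactly linear data both half-models recover the same line, so the sketch returns a coefficient of prediction and a squared correlation of 1 and an RMSE of 0. With real, scattered data the two coefficients can differ, because the first penalizes any systematic offset between predicted and experimental values while the second does not.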
The quality of the predictions was assessed by three statistical measures (see eq. 44). The coefficient of prediction, $q^{2}_{Lo\text{-}p\text{-}O}$, was used as a first statistical measure. The value of $q^{2}$ can range between 1 and −∞ [98]. The closer it is to unity, the better the predictive ability. The second statistical measure is the squared correlation coefficient between the predicted, $A^{p}$, and the experimental, A, toxic activities, denoted here by $r^{2}_{PE}$. The squared correlation coefficient $r^{2}_{PE}$ can vary between 0 and 1. The main difference from $q^{2}$ is that $r^{2}_{PE}$ measures the association between the variables $A^{p}$ and A, whereas $q^{2}$ requires the magnitudes of the predicted and experimental data to be the same. The underlying correlation coefficient $r_{PE}$ is defined within the interval [-1, 1], and it is calculated as the usual correlation coefficient, but with the fitted toxicity replaced by the predicted toxicity. The commonly accepted reference values for a satisfactory QSTR model are $r^{2}_{PE} > 0.800$ and $q^{2} > 0.500$. The root-mean-square error (RMSE) of the prediction is the third statistical measure used to assess the quality of eq. (44) and the predictive ability of the QSTR model (40). This cross-validation procedure, Lo-p-O, can be considered a pseudorandom division of the data set because the actual values of the activities, A, are scattered by measurement errors. The method has the advantage that the activity distributions of the corresponding training sets and test sets are very similar, and it should allow the ability of the model to interpolate to be assessed [98]. The mono-dimensional C1D compressibility descriptor models the toxicity of aliphatic alcohols to T. pyriformis very well. The goodness of fit and the predictive ability are described above by means of the statistics of eqs. (40) and (44), respectively. The obtained results suggest that C1D can be used in QSTRs for other series of toxic compounds and, also, that it can replace the hydrophobicity descriptor log KOW [91]. The results obtained by statistical analysis of the aliphatic alcohol toxicity by means of the ovality molecular descriptors Θ are summarized in eqs. (45), (46), and (47).
$$\hat{A} = -24.3317(\pm 0.8433) + 20.0853(\pm 0.7059) \cdot \Theta_{1D} \tag{45}$$
$$n = 33;\ s = 0.243;\ r = 0.981;\ r^{2}_{adj} = 0.962;\ F = 810;\ q^{2}_{LOO} = 0.955$$
$$\hat{A} = -12.4069(\pm 0.3856) + 8.4349(\pm 0.2688) \cdot \Theta_{2D} \tag{46}$$
$$n = 33;\ s = 0.221;\ r = 0.985;\ r^{2}_{adj} = 0.969;\ F = 985;\ q^{2}_{LOO} = 0.964$$
$$\hat{A} = -8.4197(\pm 0.2429) + 4.7074(\pm 0.1404) \cdot \Theta_{3D} \tag{47}$$
$$n = 33;\ s = 0.208;\ r = 0.987;\ r^{2}_{adj} = 0.972;\ F = 1124;\ q^{2}_{LOO} = 0.969$$
The correlation coefficients are r > 0.980 and the LOO cross-validation coefficients are $q^{2}_{LOO}$ > 0.950. The QSTR models (45)-(47) explain more than 96% of the variance of the experimental toxicity values, A = log(1/IGC50). They are reliable models for predicting the toxicity of other aliphatic alcohols. Our studies [50, 91] show that the vdW molecular descriptors presented in this chapter work well in the QSTR analysis of the toxicity of this series of aliphatic alcohols. These compressibility and ovality descriptors are valuable tools for predicting toxicity, at least within a series of congeneric molecules.
5.3. Alcohol Toxicity Modeling with the Topological Descriptors GTRDIs
Most industrial organic chemicals present acute toxicity, exhibited through a narcosis mode of toxic action. This is a non-receptor-mediated toxic effect that is most often quantified as individual lethality, population-based endpoints such as growth or reproduction, or a biochemical/physiological measurement. Thus, QSARs for acute toxicity typically deal with aquatic and ecological endpoints [5]. The emphasis in the present study was placed on the physical meaning of the generalized topological reciprocal distance indices (GTRDIs). Previous studies have analyzed the ability of these topological molecular descriptors to correlate with the boiling points of alkanes, and their informational content in terms of molecular van der Waals space [18, 24]. Therefore, the aim of this example study is also to assess the use of GTRDIs for the development of toxicological QSARs. The data set is chemically homogeneous, including only aliphatic alcohols. The values of the toxicological activities are collected in Table 7.
Table 7. Toxicity of alcohols against Arenicola Larvae (A1), frog tadpoles (A2), Barnacle Larvae (A3), Tetrahymena pyriformis (A4) and Pimephales promelas (A5)*
No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Alcohol Methanol Ethanol 1-Propanol 2-Propanol 1-Butanol 1-Pentanol 1-Hexanol 1-Heptanol 1-Octanol Isobutanol sec-Butanol tert-Butanol Isoamyl alcohol tert-Amyl alcohol 1-Nonanol 1-Decanol 1-Undecanol 1-Dodecanol
A1 -0.40 -0.01 0.47 0.41 1.06 1.64 3.00 -
A2 0.24 0.54 0.96 0.89 1.42 3.40 1.35 0.89 1.64 1.24 -
A3 -0.14 0.28 0.79 0.92 1.46 1.84 2.41 3.02 3.62 1.54 1.16 0.98 1.86 1.34 -
A4 -2.77 -2.41 -1.84 -1.99 -1.52 -1.12 -0.47 0.02 0.51 -1.47 -1.64 0.77 1.10 1.87 2.07
A5 -2.96 -2.51 -1.88 -2.22 -1.37 -0.73 0.02 0.53 0.98 -1.29 -1.69 1.40 1.82 2.22 2.27
* The toxicological data represent: A1 is the narcotic concentration, pC, against Arenicola Larvae [62]; the original data are from R. S. Lillie, J. Physiol. 1913, 31, 255; A2 is log (1/C), where C is the effective concentration that produces narcosis of frog tadpoles [62]; the original data are from E. Overton, Studies of Narcosis, Fischer, Jena, Germany, 1901; A3 is the narcotic concentration, pC, against Barnacle Larvae [62]; the original data are from D. J. Crisp; D. Marr, Proc. Int. Congr. Surface Activity, 2nd, 1975, p. 310; A4 is log (1/C), where C is the ciliate toxicity, in (mmol/L)-1 [99]; A5 is log (1/C), where C is the fish toxicity, in (mmol/L)-1 [99].
Narcotic chemicals such as alcohols act at the level of the cellular membrane, which is considered the theoretical site of action. The interactions are supposed to be non-covalent and reversible. Consequently, one expects that the measured toxicological effect may be modeled by the octanol/water partition coefficient, log Kow (see also Table 2 for the values of log Kow for the alcohols in this chapter). The values of this partition coefficient were computer-estimated CLOGP values (version 3.51) or were retrieved as measured values from the same program, and are taken from refs. [5] and [93]. However, the partition coefficient is a hydrophobicity measure of the whole molecule, based on an empirical physicochemical model. It is not a structural parameter. Using only the partition coefficient in QSAR studies of toxicity, it is not possible to point out molecular structural features or subclasses of structural influences on the toxicity [99].
Dan Ciubotariu, Vicentiu Vlaia, Ciprian Ciubotariu et al.
Table 7 lists the toxic activities of the alcohols used in correlations for testing the molecular topological descriptors presented in this section [92]. The toxicity data are collected from the literature [5, 62, 99]. They include the narcotic concentration (pC) against Arenicola larvae, frog tadpoles and barnacle larvae [62], and the relative toxicity for the static 48-h Tetrahymena pyriformis 50% population growth impairment and the flow-through 96-h Pimephales promelas 50% mortality endpoints [99]. The data set in Table 7 is chemically homogeneous, including only aliphatic alcohols [100]. We used the MLR method to analyze the usefulness of GTRDIs for the QSAR study of the narcotic effects of the alcohols. Previous studies have analyzed the ability of these topological molecular descriptors to correlate the boiling points of alkanes, and their informational content in terms of molecular van der Waals space [24, 73]. The results obtained are summarized in Tables 8 and 9. Table 8 contains only the correlation coefficients of the regressions with all data of each series of alcohols acting on a specific biological organism. The results of the correlation analysis [100] are very good for the majority of biological activities in Table 7 and for the topological molecular descriptors developed on the basis of the reciprocal distance matrix. The topological molecular descriptors kδλ, k=1,2,3,4, λ=0,1,2,3 (with the exception of kδ3, k=1,2,3,4) correlate very well with the narcotic effect generated by the 15 tested alcohols on Tetrahymena pyriformis and Pimephales promelas; the correlation coefficients are r ≥ 0.950.

Table 8. Correlation coefficients (r) of linear regressions between GTDIs and toxicological activities Ai, i=1,...,5 (see Table 7) and partition coefficient (log Kow)

GTDI    A1      A2      A3      A4      A5      log Kow
1δ0     0.993   0.949   0.973   0.994   0.979   0.986
2δ0     0.997   0.932   0.968   0.997   0.988   0.988
3δ0     0.997   0.935   0.970   0.997   0.989   0.989
4δ0     0.997   0.941   0.974   0.997   0.990   0.990
1δ1     0.999   0.969   0.987   0.998   0.992   0.995
2δ1     0.998   0.962   0.985   0.998   0.991   0.994
3δ1     0.998   0.947   0.978   0.997   0.989   0.991
4δ1     0.997   0.934   0.970   0.997   0.989   0.989
1δ2     0.997   0.993   0.985   0.997   0.995   0.997
2δ2     0.998   0.995   0.988   0.997   0.992   0.997
3δ2     0.998   0.993   0.989   0.997   0.992   0.997
4δ2     0.998   0.990   0.991   0.998   0.992   0.997
1δ3     0.748   0.471   0.532   0.822   0.818   0.627
2δ3     0.892   0.635   0.715   0.958   0.947   0.864
3δ3     0.917   0.650   0.729   0.971   0.959   0.883
4δ3     0.919   0.632   0.708   0.972   0.959   0.875
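The entries of Table 8 are linear (Pearson) correlation coefficients between one descriptor and one activity series. As a minimal sketch of that calculation, the Python fragment below computes r for a pair of value lists; the function name pearson_r and the numbers are illustrative assumptions, not values taken from Table 5 or Table 7.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    # Sums of cross-products and squares of the deviations from the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical descriptor values and activities (placeholders only,
# NOT data from the chapter).
descriptor = [2.0, 5.0, 8.7, 12.8, 17.3]
activity = [-2.8, -2.4, -1.9, -1.5, -1.1]

r = pearson_r(descriptor, activity)
print(f"r = {r:.3f}")
```

Repeating the call for every descriptor and every activity series would fill a table with the layout of Table 8.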
Modeling the Toxicity of Alcohols
The correlation coefficients obtained for all 5 series of alcohols with narcotic action are generally greater than 0.980 when the descriptors kδλ, with k=1,2,3,4 and λ=0,1,2, are used in deriving structure – toxicity fitting equations. Weaker results were obtained for the kδ3 indices. These are based on paths of length two from the molecular graph describing the topological structure of the alcohols under study. One can assume that this descriptor does not adequately describe the toxic effect of this series of alcohols. This fact is not surprising, since the contribution of the vertices situated at a greater distance is smaller. Furthermore, the path length considered in the graph for this index is greater than for the other GTDIs. These two topological structural elements act in the same direction, relieving the strain between the nodes under consideration. Globally, the effect leads to a diminution of the scale of kδλ values, which also reduces the discriminative capacity of these molecular topological descriptors. The values of the kδλ index in Table 5 show that this leveling effect increases together with the values of k and λ. As seen in Table 8, these indices are linearly related to the partition coefficient log Kow; however, the kδ3 indices fail, as in the previous case. We can suppose that the topologies of molecular graphs, as described by the successful indices obtained from the reciprocal distance matrix, contribute to this experimental physicochemical parameter. We report in this chapter a more complete analysis using only the 1δ0 molecular structural descriptor. Statistical results are given in Table 9. The best results are obtained by the QSTR analysis of the toxicity of aliphatic alcohols on Tetrahymena pyriformis (A4): the explained variance is greater than 98% and the QSTR model has a very good predictive ability, as measured by the cross-validation coefficient, q²LOO = 0.981. Obviously, the number of compounds in this series is small, but work is in progress on very large series of chemicals. The poorest results were obtained for the QSTR study of the narcosis of frog tadpoles (A2); this is the series from the first reported QSAR study, that of Overton [62].

Table 9. Statistical results for fitting equations with GTRDIs 1δ0 as structural variables*

Ai = a (± Δa) + b (± Δb) · 1δ0

Ai    a        Δa      b       Δb      s       EV      F         r²CV
A1    -0.511   0.094   0.111   0.006   0.150   0.983   349.075   0.920
A2    -0.014   0.174   0.094   0.011   0.287   0.889   72.952    0.766
A3    -0.299   0.139   0.117   0.008   0.242   0.943   215.809   0.936
A4    -2.708   0.082   0.090   0.003   0.184   0.986   990.383   0.981
A5    -2.761   0.167   0.102   0.006   0.375   0.957   308.520   0.938

* Δx (x=a, b) are the 95% confidence limits; s is the standard error of the estimate; EV represents the explained variance (r²adj); F is the Fisher test; r²CV is the cross-validation coefficient.
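The cross-validation coefficient reported in Table 9 can be illustrated as follows. The sketch fits Ai = a + b · 1δ0 by ordinary least squares and evaluates a leave-one-out coefficient as 1 - PRESS/SS; this is one common definition and is assumed here, since the exact formula and software used for the table are not stated. The function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b).

    The x values must not all be equal, otherwise the slope is undefined.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def q2_loo(x, y):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total (assumed form)."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least three compounds")
    my = sum(y) / n
    press = 0.0
    for i in range(n):
        # Refit the line without compound i, then predict compound i.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    ss_total = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and activities (NOT data from the chapter).
descriptor = [2.0, 5.0, 8.7, 12.8, 17.3, 22.1]
activity = [-2.7, -2.3, -1.8, -1.6, -1.0, -0.8]

a, b = fit_line(descriptor, activity)
print(f"a = {a:.3f}, b = {b:.3f}, q2_LOO = {q2_loo(descriptor, activity):.3f}")
```

Because each compound is predicted by a model that never saw it, q2 can only equal or fall below the fitted r2, which is why it is read as a measure of predictive ability.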
The topological indices developed on the basis of the reciprocal distance matrix of the molecular graphs are good molecular descriptors for the quantitative treatment of toxicological effects of alcohols. The best results are obtained for the toxicity of alcohols on the ciliate Tetrahymena pyriformis. The QSTR models explain more than 80% of the variance of the experimental data.
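The explained variance in Table 9 can be cross-checked against the correlation coefficients of the 1δ0 row in Table 8. The sketch below applies the usual adjusted r2 formula for a one-descriptor model; the numbers of compounds n are an assumption, inferred here by counting the entries listed for each series in Table 7, and the function name is illustrative.

```python
def adjusted_r2(r, n, p=1):
    """Adjusted r2 for a model with p descriptors fitted to n compounds."""
    return 1.0 - (1.0 - r * r) * (n - 1) / (n - p - 1)


# (r for 1δ0 from Table 8, ASSUMED n, EV from Table 9) for each series.
series = {
    "A1": (0.993, 7, 0.983),
    "A2": (0.949, 10, 0.889),
    "A3": (0.973, 14, 0.943),
    "A4": (0.994, 15, 0.986),
    "A5": (0.979, 15, 0.957),
}

for name, (r, n, ev) in series.items():
    print(f"{name}: recomputed {adjusted_r2(r, n):.3f}, tabulated {ev:.3f}")
```

Under these assumptions the recomputed values stay within roughly 0.002 of the EV column, a difference of about the size expected from r being rounded to three decimals.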
6. GTDIs versus vdW MDs
The statistical quality of the QSTRs obtained with the aid of the vdW molecular descriptors, CiD and θiD, and the topological descriptors, kδλ, is as good as for all linear models presented above. One can see above that the results obtained are similar for all compressibility, ovality and topological (GTRDI) descriptors. In this section we analyze the extent to which the molecular descriptors presented so far are linearly interrelated. In this way we can establish to what extent these MDs are orthogonal. If the MDs used in the MLR analysis of the experimental toxicity of various chemicals are orthogonal, the reliability of the QSTR models is high and they may be easily interpreted in terms of the structural factors quantified by the MDs used. On the other hand, nonorthogonal MDs may express the same type of structural information, and the correlations and predictions of QSTR models may be artificially improved. We have investigated the linear relationship between pairs of the molecular descriptors presented here, MDa and MDb, by means of the following linear relation
MDa = α + β · MDb    (48)
where MDa are the GTRDIs and the topological distance indices W (Wiener) [81], P (polarity) [77], F (Platt) [78], and J (Balaban) [80], and MDb are the compressibility and ovality (size and shape) MDs; the equations (48) are characterized only by the correlation coefficient, r, which measures the strength of the linear relationship (48). If r = 0, no linear relationship exists between MDa and MDb. If r = 1, there is a direct linear relationship, and if r = -1, there is an inverse linear relationship between MDa and MDb. A correlation coefficient r ≥ 0.900 was proposed as the criterion for intercorrelated pairs of molecular descriptors [73]. Strongly intercorrelated pairs of molecular descriptors are those with r ≥ 0.980. The results of the correlation analysis are displayed in Table 10. The vdW volume (VW) and surface (SW) and the compressibility and ovality descriptors are linearly related not only to the GTRDI descriptors kδλ (λ=0,1,2; k=1,2,3,4) but also to the Wiener (W) topological distance index. There is no correlation between J and the MDs related to the molecular vdW space of aliphatic alcohols. The intrinsic density of these molecules is not related to the topological indices in Table 10. The steric component of most topological indices is poorly explained by the characteristics of the vdW space. Weak correlations were also obtained for P, F and J. The results suggest the impossibility of testing the vector nature of steric effects, which is rather important for modeling biological interactions, by means of the topological distance indices. This is a possible explanation for the lesser utility of topological indices in QSAR studies.
Table 10. Coefficients of correlation (r) corresponding to the linear relation (48)

Columns: vdW MDs; rows: GTRDIs and the topological distance indices W, F, P, J.

GTRDI   SW      VW      C1D     C2D     C3D     Θ1D     Θ2D     Θ3D     ID
1δ0     0.985   0.993   0.967   0.971   0.962   0.946   0.949   0.950   -0.697
2δ0     0.985   0.995   0.967   0.964   0.950   0.951   0.952   0.951   -0.719
3δ0     0.990   0.998   0.974   0.970   0.955   0.961   0.962   0.960   -0.740
4δ0     0.994   0.999   0.980   0.975   0.960   0.968   0.969   0.967   -0.752
1δ1     0.999   0.998   0.994   0.989   0.974   0.984   0.986   0.986   -0.781
2δ1     0.999   0.999   0.990   0.986   0.972   0.979   0.980   0.980   -0.769
3δ1     0.995   0.999   0.983   0.978   0.963   0.971   0.972   0.971   -0.756
4δ1     0.991   0.998   0.977   0.971   0.956   0.964   0.965   0.963   -0.748
1δ2     0.985   0.972   0.994   0.989   0.972   0.989   0.991   0.994   -0.815
2δ2     0.991   0.980   0.995   0.994   0.980   0.986   0.990   0.992   -0.794
3δ2     0.995   0.988   0.996   0.995   0.980   0.985   0.988   0.990   -0.785
4δ2     0.996   0.992   0.995   0.994   0.979   0.984   0.987   0.989   -0.780
1δ3     0.707   0.727   0.680   0.646   0.615   0.700   0.690   0.673   -0.681
2δ3     0.907   0.921   0.885   0.868   0.847   0.884   0.881   0.872   -0.740
3δ3     0.912   0.929   0.886   0.874   0.856   0.880   0.878   0.870   -0.705
4δ3     0.897   0.917   0.867   0.856   0.839   0.860   0.857   0.849   -0.674
W       0.927   0.913   0.928   0.958   0.976   0.891   0.904   0.914   -0.634
F       0.869   0.900   0.828   0.819   0.804   0.812   0.809   0.801   -0.579
P       0.866   0.899   0.824   0.822   0.809   0.799   0.798   0.795   -0.521
J       0.491   0.542   0.432   0.392   0.359   0.445   0.430   0.409   -0.382
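The two thresholds used in this section (r ≥ 0.900 for intercorrelated pairs and r ≥ 0.980 for strongly intercorrelated pairs) can be applied mechanically to Table 10. The sketch below does this for a few entries of the SW column; taking the absolute value of r, so that inverse relationships are treated like direct ones, is an assumption of this sketch, and the function name is illustrative.

```python
def classify_pair(r, related=0.900, strong=0.980):
    """Label a descriptor pair by the size of its correlation coefficient."""
    size = abs(r)  # ASSUMPTION: the sign of r is ignored
    if size >= strong:
        return "strongly intercorrelated"
    if size >= related:
        return "intercorrelated"
    return "not intercorrelated"


# Correlation coefficients against SW, copied from Table 10.
r_vs_sw = {
    "1δ0": 0.985,
    "1δ3": 0.707,
    "2δ3": 0.907,
    "W": 0.927,
    "F": 0.869,
    "P": 0.866,
    "J": 0.491,
}

for name, r in r_vs_sw.items():
    print(f"{name:>3} vs SW: r = {r:.3f} -> {classify_pair(r)}")
```

With these thresholds only 1δ0 in this selection counts as strongly intercorrelated with SW, while F, P, J and 1δ3 fall below the 0.900 criterion.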
7. Conclusion
Molecular structure is one of the basic concepts of chemistry, since the properties and the chemical and biological behavior of molecules are determined by it. The quantification of various aspects of molecular structure is one of the formidable tasks of current research in toxicology. The purpose is to develop molecular descriptors (MDs) for quantitative structure – toxicity relationships (QSTRs) with good explicative and predictive capabilities. The potential utility of these MDs must be established on series of chemical compounds with well-determined toxic activity. From this point of view, saturated aliphatic alcohols are among the best studied classes of industrial organic chemicals. Therefore, we used various series of aliphatic alcohols, which are neutral narcosis-acting compounds, to validate the MDs. Three molecular size (CiD, i=1,2,3) and three molecular shape (θiD, i=1,2,3) descriptors, developed on the basis of a molecular vdW space assumed to be isotropic, homogeneous, and compressible to some extent, and sixteen generalized topological descriptors based on the
reciprocal distance matrix, GTRDIs (kδλ), were presented here. GTRDIs are non-empirical descriptors based on chemical topology. They are derived from the application of chemical graph theory. The advantage of topological properties is that they are a direct, simple description of molecular structure; the disadvantage is that chemical topologies have no direct mechanistic meaning in toxicology. By contrast, the molecular descriptors developed on the basis of the vdW molecular space may have a clear physical meaning when the analysis is performed on the toxicity of various series of chemicals, especially when series of congeners are used in QSTR studies. The vdW molecular descriptors CiD (i=1,2,3) measure the i-dimensional relation between the packed and the extended vdW size of a molecule within its specific environment, during physical and chemical interaction. Consequently, the vdW compressibility measures were used to model the biological activity of chemical compounds, including their toxicity. This was done with good results in the present QSTR analysis of the toxicity of aliphatic alcohols to Tetrahymena pyriformis. The structural molecular descriptors CiD can also be used in such studies of toxicity in place of logP. The aliphatic alcohols act on the cellular membranes, and the packing degree of the molecules, as measured by CiD, influences their capacity to accumulate, and thus their toxicity. The θiD (i=1,2,3) vdW descriptors are measures of molecular shape. They indicate the degree of deviation of a molecule from the round (globular) shape.
The development of the GTRDIs was based on the following hypothesis: each vertex i supports a topological distance strain (TDS) from all other vertices of the molecular graph. This TDS is a function of the distance between the vertices i and j, j=1,...,N, j≠i. Naturally, there is an inverse relationship between the topological strain and the topological distance. Consequently, we defined as a local vertex invariant (LOVI) the local topological distance strain (LTDS) of order k, kμi, as the sum of the reciprocal distances of all vertices (atoms) of the chemical graph to the given vertex, i. The kμi value represents the contribution of the vertex (atom) i to the total topological strain of a given molecule represented by its molecular graph Γ. Therefore, the sum of kμi over all atoms in Γ represents the total topological strain of order k, kδ0, that characterizes a molecule described by Γ. We consider that kδ0 may be viewed as an internal topological distance strain (ITDS) of order k, due to the reciprocal influences of the molecular graph vertices. ITDS is a topological distance contribution to the total topological energy of a molecule represented by Γ. These interactions decrease as the topological distances between vertices in Γ increase. In this way, ITDS offers an a priori physical meaning for the GTDIs. By introducing various metrics on the molecular topological graphs, one can differentiate the (hetero)atoms and thus obtain reliable molecular descriptors for QSPR and QSAR studies.
These molecular descriptors, CiD, θiD, and GTRDIs, were calculated for a set of 33 alcohols, all of which exhibit toxic action on Tetrahymena pyriformis and, in variable proportion, toxic action on Arenicola larvae, barnacle larvae, frog tadpoles, and Pimephales promelas. We conclude that the results obtained show that these indices are valuable molecular descriptors in modeling the toxic activity of aliphatic alcohols. In order to arrive at more definite conclusions, we deem it necessary to extend the calculations to quite
different molecular sets and to other biological activities and physicochemical properties. The relations between the vdW descriptors and the GTRDIs suggest the same physical meaning for these topological indices based on the reciprocal distance matrix. It has also been shown in this chapter that accurately describing the toxicity of aliphatic alcohols with a single correlation is quite feasible if the molecular descriptors have a clear physical meaning and are related to the physical and chemical interacting forces between molecules, particularly between biological receptors and the toxin molecules. The reason is that the mode of action is the same: a reversible accumulation of the alcohol within the cell membrane that results in distortion and disruption of function.
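The construction of the local and total topological distance strain summarized in this section can be sketched in a few lines of Python. The fragment assumes that "order k" means that each reciprocal distance is raised to the power k, that distances are counted in bonds on the hydrogen-depleted graph, and that heteroatoms are not weighted; these choices and all names are assumptions made for illustration and may differ from the exact definitions given earlier in the chapter.

```python
from collections import deque


def distance_matrix(adjacency):
    """Topological distances (numbers of bonds) by breadth-first search.

    adjacency[i] lists the neighbours of vertex i; the graph must be connected.
    """
    n = len(adjacency)
    dist = [[0] * n for _ in range(n)]
    for start in range(n):
        seen = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in seen:
                    seen[w] = seen[v] + 1
                    queue.append(w)
        for v, d in seen.items():
            dist[start][v] = d
    return dist


def local_strain(adjacency, k=1):
    """Local strain of each vertex: sum over j != i of 1 / d_ij**k (ASSUMED form)."""
    dist = distance_matrix(adjacency)
    n = len(adjacency)
    return [
        sum(1.0 / dist[i][j] ** k for j in range(n) if j != i)
        for i in range(n)
    ]


def total_strain(adjacency, k=1):
    """Total strain of the graph: sum of the local strains over all vertices."""
    return sum(local_strain(adjacency, k))


# Path graph 0-1-2-3, used as a stand-in for the C-C-C-O skeleton of
# 1-propanol (the oxygen is treated like a carbon in this sketch).
propanol_like = [[1], [0, 2], [1, 3], [2]]

print([round(m, 4) for m in local_strain(propanol_like)])
print(round(total_strain(propanol_like), 4))
```

In this form the terminal vertices carry a smaller local strain than the inner ones, and the total grows with chain length, which matches the qualitative behavior described for the ITDS.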
References
[1] Green, S.; Goldberg, A.; Zurlo, J. TestSmart – high production volume chemicals: An approach to implementing alternatives into regulatory toxicology. Toxicol. Sci. 2001, 63, 6-14.
[2] Barratt, M.D.; Castell, J.V.; Chamberlain, M.; Combes, R.D.; Dearden, J.C.; Fentem, J.H.; Gerner, I.; Giuliani, A.; Gray, T.J.B.; Livingstone, D.J.; Provan, W.M.; Rutten, F.A.J.J.L.; Verhaar, H.J.M.; Zbinden, P. The Integrated Use of Alternative Approaches for Predicting Toxic Hazard. The Report and Recommendations of ECVAM Workshop 8. ATLA 1995, 23, 410-429.
[3] McKinney, J.D.; Richard, A.; Waller, C.; Newman, M.C.; Gerberick, F. The practice of structure activity relationships (SAR) in toxicology. Toxicol. Sci. 2000, 56, 8-17.
[4] Katritzky, A.R.; Petrukhin, R.; Tatham, D.; Basak, S.; Benfenati, E.; Karelson, M.; Maran, U. Interpretation of Quantitative Structure – Property and – Activity Relationships. J. Chem. Inf. Comput. Sci. 2001, 41, 679-685.
[5] Schultz, T.W.; Cronin, M.T.D.; Netzeva, T.I.; Aptula, A.O. Structure – Toxicity Relationships for Aliphatic Chemicals Evaluated with Tetrahymena pyriformis. Chem. Res. Toxicol. 2002, 15, 1602-1609.
[6] Pleiss, M.A.; Unger, S.H. The Design of Test Series and the Significance of QSAR relationships, In Comprehensive Medicinal Chemistry; Ramsden, C.A.; Pergamon Press: Oxford, 1990; Vol. 4, pp 561-587.
[7] http://irs.cheepe.homedns.org/
[8] www.talete.mi.it/dragon.htm
[9] www.semichem.com
[10] Walker, J.D.; Jaworska, J.; Comber, M.H.I.; Schultz, T.W.; Dearden, J.C. Guidelines for developing and using quantitative structure – activity relationships. Environ. Toxicol. Chem. 2003, 22, 1653-1665.
[11] Topliss, J.G.; Edwards, R.P. Chance Factors in Studies of Quantitative Structure – Activity Relationships. J. Med. Chem. 1979, 22, 1238-1244.
[12] Wold, S. Validation of QSARs. Quant. Struct.-Act. Relat. 1991, 10, 191-193.
[13] Guha, R.; Jurs, P.C. Determining the Validity of a QSAR Model – A Classification Approach. J. Chem. Inf. Model. 2005, 45, 65-73.
[14] Godfrey, M. Theoretical Models for Interpolating Linear Correlations in Organic Chemistry, In Correlation Analysis in Chemistry. Recent Advances; Chapman, N.B.; Shorter, J.; Plenum Press: New York, 1978; pp. 86-91.
[15] Hansch, C.; Hoekman, D.; Leo, A.; Zhang, L.; Li, P. The expanding role of quantitative structure – activity relationships (QSAR) in toxicology. Toxicology Letters 1995, 79, 45-53.
[16] Draper, N.R.; Smith, H. Applied Regression Analysis; John Wiley & Sons, Inc.: New York, 1981; pp. 25-56.
[17] Vancea, R.; Holban, St.; Ciubotariu, D. Pattern Recognition. Applications; Edit. Academiei R.S. România: Bucureşti, 1989; pp. 111-139.
[18] Ciubotariu, D.; Gogonea, V.; Medeleanu, M. Van der Waals Molecular Descriptors. Minimal Steric Difference, In QSPR/QSAR Studies by Molecular Descriptors; Diudea, M.V.; NOVA Science Publishers, Inc.: Huntington, New York, 2000; pp. 281-362.
[19] Esbensen, K.H. Multivariate Data Analysis – In Practice; CAMO Process AS: Oslo, Norway, 2004.
[20] Perkins, R.; Fang, H.; Tong, W.; Welsh, W.J. Quantitative structure – activity relationship methods: perspective on drug discovery and toxicology. Environ. Toxicol. Chem. 2003, 22, 1666-1679.
[21] Konovalov, D.A.; Llewellyn, L.E.; Heyden, Y.V.; Coomans, D. Robust Cross-Validation of Linear Regression QSAR Models. J. Chem. Inf. Model. 2008, 48, 2081-2094.
[22] Dearden, J.C. Computers in Toxicology and Risk Assessment, In Computer Applications in Pharmaceutical Research and Development; Ekins, S.; John Wiley & Sons, Inc.: New York, 2006; pp. 469-494.
[23] Purvis III, G.D. Size-intensive descriptors. J. Comput. Aided Mol. Des. 2008, 22, 461-468.
[24] Ciubotariu, D. Structure Reactivity Relationships in the Class of Carbonic Acid Derivatives, PhD Thesis, Polytechnic Institute of Bucharest, Romania, 1987.
[25] Niculescu-Duvaz, I.; Ciubotariu, D.; Simon, Z.; Voiculetz, N. QSAR (SAR) Models and Their Use for Carcinogenic Potency Prediction, In Modeling of Cancer Genesis and Prevention; Voiculetz, N., Balaban, A.T., Niculescu-Duvaz, I., Simon, Z.; CRC Press: Boca Raton, 1991; pp. 157-214.
[26] Chiriac, A.; Ciubotariu, D.; Simon, Z. Quantitative Structure – Activity Relationship (QSAR). The MTD Method; Mirton: Timisoara, 1996.
[27] Heiden, W.; Moeckel, G.; Brickmann, K. A New Approach to Analysis and Display of Local Lipophilicity/Hydrophilicity Mapped on Molecular Surfaces. J. Comput.-Aided Mol. Design 1993, 7, 503-514.
[28] Hermann, R.B. Modeling Hydrophobic Solvation of Nonspherical Systems: Comparison of Use of Molecular Surface Area with Accessible Surface Area. J. Comput. Chem. 1997, 18, 115-125.
[29] Todeschini, R.; Gramatica, P. 3D-Modeling and Prediction by WHIM Descriptors. Part 5. Theory Development and Chemical Meaning of WHIM Descriptors. Quant. Struct.-Act. Relat. 1997, 16, 113-119.
[30] Bondi, A.J. van der Waals Volumes and Radii. J. Phys. Chem. 1964, 68, 441-451.
[31] Francl, M.M.; Hout, Jr., R.F.; Hehre, W.J. Representation of Electron Densities. 1. Sphere Fits to Total Electron Density Surfaces. J. Am. Chem. Soc. 1984, 106, 563-570.
[32] Govers, H.; de Voogt, Pim. Calculation of Molecular Volumes from Molecular Fragments via Valence Electron Indices. Quant. Struct.-Act. Relat. 1989, 8, 11-16.
[33] Rowlinson, J.S. The Triplet Distribution Function in a Fluid of Hard Spheres. Mol. Phys. 1963, 6, 517-524.
[34] Pavani, P.; Ranghino, G. A Method to Compute the Volume of a Molecule. Comput. Chem. 1982, 6, 133-135.
[35] Connolly, M.L. Solvent-Accessible Surfaces of Proteins and Nucleic Acids. Science 1983, 221, 709-713.
[36] Connolly, M.L. Computation of Molecular Volume. J. Am. Chem. Soc. 1985, 107, 1118-1124.
[37] Richmond, T.J. Solvent Accessible Surface Area and Excluded Volume in Proteins. Analytical Equations for Overlapping Spheres and Implications for the Hydrophobic Effect. J. Mol. Biol. 1984, 178, 63-89.
[38] do Carmo, M.P. Differential Geometry of Curves and Surfaces; Prentice-Hall: Englewood Cliffs, 1976; pp. 34-52.
[39] Gibson, K.D.; Scheraga, H.A. Exact Calculation of the Volume and Surface Area of Fused Hard-Sphere Molecules with Unequal Atomic Radii. Mol. Phys. 1987, 62, 1247-1265.
[40] Gavezzotti, A. The Calculation of Molecular Volumes and the Use of Volume Analysis in the Investigation of Structured Media and of Solid-State Organic Reactivity. J. Am. Chem. Soc. 1983, 105, 5220-5225.
[41] Meyer, A.Y. Molecular Mechanics and Molecular Shape. Part I. van der Waals Descriptors of Simple Molecules. J. Chem. Soc. Perkin Trans. II 1985, 1161-1169.
[42] Ciubotariu, D.; Holban, Şt.; Moţoc, I. Computation of Molecular van der Waals Volume by means of Monte Carlo Method, Preprint, Univ. Timişoara, Fac. St. Nat., Ser. Chim., 1975; 3, 1-8.
[43] Ciubotariu, D.; Gogonea, V.; Iorga, I.; Deretey, E.; Medeleanu, M.; Mureşan, S.; Bologa, C. New Shape Descriptors for Quantitative Treatment of Steric Effects. II. The Molecular van der Waals Volume: two Monte Carlo Algorithms. Chem Bull Tech (Timişoara) 1993, 38, 63-75.
[44] Mezey, P.G. Molecular Surfaces, In Reviews in Computational Chemistry; Lipkowitz, K.B.; Boyd, D.B.; VCH Publishers: New York, 1990; Vol 1, pp 265-294.
[45] Demidovich, B.P.; Maron, I.A. Computational Mathematics; Mir Publishers: Moscow, 1987; pp 649-674.
[46] Gogonea, V.; Ciubotariu, D.; Deretey, E.; Popescu, M.; Iorga, I.; Medeleanu, M. Surface Area of Organic Molecules: a New Method of Computation. Rev. Roum. Chim. 1991, 36, 465-471.
[47] Ciubotariu, C.; Medeleanu, M.; Ciubotariu, D. IRS – a Computer package for QSAR and QSPR studies. Chem. Bull. “Politehnica” Univ. Timisoara 2006, 51, 13-16.
[48] Pauling, L. The Nature of the Chemical Bond; Cornell University Press: Ithaca, 1960.
[49] Charton, M. The Upsilon Steric Parameters-Definition and Determination, In Steric Effects in Drug Design; Charton, M.; Moţoc, I.; Topics Current Chem No. 114; Springer: Berlin, 1983; pp 60-65.
[50] Ciubotariu, D.; Vlaia, V.; Olariu, T.; Ciubotariu, C.; Medeleanu, M.; Ursica, L.; Dragos, D. Molecular van der Waals Descriptors for Quantitative Treatment of Toxicological Effects. 12th Int. Workshop Quant. Struct.-Act. Relat. Environ. Toxicol. 2006, 8-12 May, Lyon, France, p. 90.
[51] HyperChem version 7.0, HyperCube Inc., Gainesville, Florida, U.S.A.
[52] Purvis III, G. D. Size-intensive descriptors. J. Comput. Aided Mol. Des. 2008, 22, 461-468.
[53] Todeschini, R.; Consonni, V. Handbook of molecular descriptors; Wiley-VCH: New York, 2000.
[54] Bodor, N.; Gabany, Z.; Wong, C. K. A New Method for the Estimation of Partition Coefficient. J. Am. Chem. Soc. 1989, 111, 3783-3786.
[55] Bodor, N.; Buchwald, P.; Huang, M.-J. Computer-Assisted Design of New Drugs Based on Retrometabolic Concepts. SAR & QSAR Environm. Res. 1998, 8, 41-92.
[56] Wadell, H. Volume, Shape and Roundness of Quartz Particles. Journal of Geology 1935, 43, 250-280.
[57] Schultz, T. W.; Cronin, M. T. D.; Netzeva, T. I. The present status of QSAR in toxicology. J. Molec. Struc. THEOCHEM 2003, 62, 23-38.
[58] Balaban, A.T.; Motoc, I.; Bonchev, D.; Mekenyan, O. Topological Indices for Structure-Activity Correlations, In Steric Effects in Drug Design; Charton, M.; Motoc, I.; Top. Curr. Chem. No. 114; Springer: Berlin, 1983; pp 21-56.
[59] Ivanciuc, O.; Ivanciuc, T. Matrices and Structural Descriptors Computed from Molecular Graph Distances, In Topological Indices and Related Descriptors in QSAR and QSPR; Devillers, J.; Balaban, A. T.; Gordon and Breach Sci. Publ.: Amsterdam, 1999; pp 221-277.
[60] Balaban, A. T. Can topological indices transmit information on properties but not on structures? J. Comput.-Aided Des. 2005, 19, 651-660.
[61] Balaban, A.T. A Personal View about Topological Indices for QSAR/QSPR, In QSPR/QSAR Studies by Molecular Descriptors; Diudea, M.V.; NOVA Science: Huntington, 2001; pp 1-30.
[62] Kier, L. B.; Hall, L. H. Molecular Connectivity in Chemistry and Drug Research; Academic Press: New York, 1976.
[63] Kier, L. B.; Hall, L. H. Molecular Connectivity in Structure-Activity Analysis; Research Studies Press: Letchworth, 1986.
[64] Wiener, H. Structural Determination of Paraffin Boiling Points. J. Am. Chem Soc. 1947, 69, 17-20.
[65] Ivanciuc, O.; Ivanciuc, T.; Diudea, M.V. Molecular Graph Matrices and Derived Structural Descriptors. SAR QSAR Environ. Res. 1997, 7, 63-87.
[66] Ciubotariu, D.; Medeleanu, M.; Gogonea, V. Quantitative Treatment of Organic Molecules II. A New Topological Index Based on Distance Matrix. Chem. Bull. “Politehnica” Univ. Timisoara 1996, 42, 19-24.
[67] Ciubotariu, D.; Medeleanu, M.; Gogonea, V. Quantitative Treatment of Organic Molecules I. Distance Connectivity Indices λd as Similarity Measure and Correlation Parameters for Alkanes. Chem. Bull. “Politehnica” Univ. Timisoara 1995, 40, 21-36.
[68] Balaban, A.T.; Ciubotariu, D.; Ivanciuc, O. Design of Topological Indices. Part 2. Distance Measure Connectivity Indices. MATCH Commun. Math. Comput. Chem. 1990, 25, 41-70.
[69] Balaban, T. S.; Filip, P. A.; Ivanciuc, O. Computer Generation of Acyclic Graphs Based on Local Vertex Invariants and Topological Indices. Derived Canonical Labelling and Coding of Trees and Alkanes. J. Math. Chem. 1992, 11, 79-105.
[70] Plavšić, D.; Nikolić, S.; Trinajstić, N.; Mihalić, Z. On the Harary Index for the Characterization of Chemical Graphs. J. Math. Chem. 1993, 12, 235-250.
[71] Ivanciuc, O.; Balaban, T. S.; Balaban, A. T. Design of Topological Indices. Part 4. Reciprocal Distance Matrix, Related Local Vertex Invariants and Topological Indices. J. Math. Chem. 1993, 12, 309-318.
[72] Diudea, M. V.; Ivanciuc, O.; Nikolić, S.; Trinajstić, N. Matrices of Reciprocal Distance, Polynomials and Derived Numbers. MATCH Commun. Math. Comput. Chem. 1997, 35, 41-64.
[73] Ciubotariu, D.; Medeleanu, M.; Vlaia, V.; Olariu, T.; Dragos, D.; Seiman, C. Molecular Van der Waals Space and Topological Indices from Distance Matrix. Molecules 2004, 9, 1053-1078.
[74] Mihalić, Z.; Nikolić, S.; Trinajstić, N. Comparative Study of Molecular Descriptors Derived from the Distance Matrix. J. Chem. Inf. Comput. Sci. 1992, 32, 28-37.
[75] Harary, F. Graph Theory, 2nd edition; Addison-Wesley: Reading, 1971.
[76] Rouvray, D.H. Predicting Chemistry from Topology. Sci. Am. 1986, 254, 36-43.
[77] Wiener, H. Correlation of Heats of Isomerization and Differences in Heats of Vaporization of Isomers among the Paraffinic Hydrocarbons. J. Am. Chem. Soc. 1947, 69, 2636-2638.
[78] Platt, J.R. Prediction of Isomeric Differences in Paraffin Properties. J. Phys. Chem. 1952, 56, 328-336.
[79] Balaban, A.T. Highly Discriminating Distance-Based Topological Index. Chem. Phys. Lett. 1982, 80, 399-404.
[80] Balaban, A.T. Topological Indices Based on Topological Distances in Molecular Graphs. Pure Appl. Chem. 1983, 55, 199-206.
[81] Hosoya, H. Topological Index. A Newly Proposed Quantity Characterizing the Topological Nature of Structural Isomers of Saturated Hydrocarbons. Bull. Chem. Soc. Jpn. 1971, 44, 2332-2339.
[82] Medeleanu, M.; Ciubotariu, D.; Ciubotariu, C. New Shape Descriptors for Quantitative Treatment of Steric Effects. III. A New Globularity Measure for QSPR/QSAR Studies. Chem. Bull. “Politehnica” Univ. Timisoara 2006, 51, 9-12.
[83] Balaban, A. T.; Ciubotariu, D.; Medeleanu, M. Topological indices and real number vertex invariants based on graph eigenvalues or eigenvectors. J. Chem. Inf. Comput. Sci. 1991, 31, 517-523.
[84] Randić, M. On the Characterization of Molecular Branching. J. Am. Chem. Soc. 1975, 97, 6609-6615.
[85] Kier, L.B.; Hall, L.H.; Murray, W.J.; Randić, M. Molecular Connectivity. Part 1. Relation to Nonspecific Local Anesthetics. J. Pharm. Sci. 1975, 64, 1971-1974.
[86] Medeleanu, M.; Ciubotariu, C.; Vlaia, V.; Olariu, T.; Ciubotariu, D. Generalized Topological Distance Indices based on Reciprocal Distance Matrix. Chem. Bull. “Politehnica” Univ. Timisoara 2004, 49(63), 1-2, 14-20.
[87] Schultz, T. W.; Cronin, M. T. D.; Walker, J. D.; Aptula, A. O. Quantitative structure – activity relationships (QSARs) in toxicology: a historical perspective. J. Mol. Struct. THEOCHEM 2003, 622, 1-22.
[88] Green, S. Toxicology and Regulatory Process; Informa HealthCare: New York, 2006.
[89] Cronin, M.T.D.; Schultz, T.W. Pitfalls in QSAR. J. Mol. Struct. THEOCHEM 2003, 622, 39-51.
[90] Chiriac, A.; Ciubotariu, D.; Funar-Timofei, S.; Kurunczi, L.; Mracec, M.; Mracec, M.; Szabadai, Z.; Seclaman, E.; Simon, Z. QSAR and 3D-QSAR in Timisoara. 1972-2005. Rev. Roum. Chim. 2006, 51, 79-99. [91] Vlaia, V.; Olariu, T.; Ciubotariu, C.; Medeleanu, M.; Ursica, L.; Ciubotariu, D. Molecular Descriptors for Quantitative Structure-Toxicity Relationships (QSTR). I. Molecular Van der Waals Compressibility Measures in Modeling of Aliphatic Alcohol Toxicity to Tetrahymena Pyriformis. Rev. Chim. 2009, 60, 605-610. [92] Vlaia, V.; Olariu, T.; Ciubotariu, C.; Medeleanu, M.; Dragos, D.; Ursica, L.; Ciubotariu, D. Quantitative Structure – Activity Relationship (QSAR). III. Generalized Topological Descriptors Based on Reciprocal Distance Matrix for Development of Toxicological QSARs. Farmacia 2007, LV (3), 284-296. [93] http://www.vet.utk.edu/TETRaTOX/ [94] Hawkins, D.M.; Basak, S.C.; Kraker, J.; Geiss, K.T.; Witzmann, F. Combining Chemodescriptors and Biodescriptors in Quantitative Structure−Activity Relationship Modeling. J. Chem. Inf. Model. 2006, 46 (1), 9-16. [95] Benfenati, E.; Gini, G.; Piclin, N.; Roncaglioni, A.; Vari, M. Predicting log P of pesticides using different software. Chemosphere 2003, 53, 1155-1164. [96] Kaiser, K. L. E. The use of neural networks in QSARs for acute aquatic toxicological endpoints. J. Mol. Struct. THEOCHEM 2003, 622, 85-95. [97] Ciubotariu, D.; Deretey, E.; Oprea, T. J.; Sulea, T.; Simon, Z.; Kurunczi, L.; Chiriac, A. Multiconformational Minimal Steric Difference. Structure – Acetylcholinesterase Hydrolysis Rates Relaions for Acetic Acid Esters. Quant. Struct.-Act. Relat., 1993, 12, 367-372. [98] Gedeck, P.; Rohde, B.; Bartels, C. QSAR – How Good Is It in Practice? Comparison of Descriptor Sets on an Unbiased Cross Section of Corporate Data Sets. J. Chem. Inf. Model. 2006, 46, 1924-1936. [99] Jaworska, J. S.; Schultz, T. W. 
Quantitative Relationships of Structure-Activity and Volume Fraction for Selected Nonpolar and Polar Narcotic Chemicals. SAR QSAR Environ. Res. 1993, 1, 3-19. [100] Ciubotariu, D.; Vlaia, V.; Olariu, T.; Ciubotariu, C.; Medeleanu, M.; Dragos, D.; Ursica, L. Generalized topological descriptors based on reciprocal distance matrix for development of toxicological QSARs, 12th Int. Workshop Quant. Struct.-Act. Relat. Environ. Toxicol. 2006, 8-12 May, Lyon (France), p. 89.
INDEX A absolute electronegativity, 254, 255, 257, 259 absolute hardness, 255, 274 achiral, 458, 459, 518 Adaptive Fuzzy Partition, 613, 623 ADF, 63, 65, 66, 84 Alchemy, x algebraic correlation, 569, 570, 571, 574, 579 all-ring, 461, 464, 466, 470, 472 Amino acid map, 529 amplification, 141, 142, 148, 154 anionic-cationic interaction, 572 annihilation, 161, 261 Anthocyanidins, 357 anti-bonding, 2, 5, 6, 7, 8, 9, 12, 13, 16, 17, 18 architecture, 404, 615, 616, 620, 626 Arrhenius, 93 Artificial Neural Networks, 609, 613, 623 ATMOL, 65, 84 Azoles, 353
B binding spinor, 9 Biot-Savart law, 267 bonding, x, xi, 1, 2, 3, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 66, 157, 158, 183, 186, 253, 266, 270, 271, 273, 326, 327, 328, 336, 361, 363, 389, 391, 397, 398, 408, 417 Born-Oppenheimer, 81, 93, 134, 213 Bose operators, 144, 149 Boys, 63, 64, 83, 85, 159, 164, 184, 186, 340
C CADPAC, 65, 84 cage, 298, 347, 441, 442, 443, 456, 462, 464, 465, 472, 477 capra, 427, 461, 477 carbon atom, 277, 281, 282, 283, 284, 285, 286, 288, 289, 290, 292, 293, 294, 295, 296, 297, 298, 299,
300, 301, 302, 303, 304, 308, 311, 313, 315, 317, 318, 321, 331, 332, 349, 351, 390, 391, 397, 398, 400, 401, 402, 403, 404, 407, 408, 409, 414, 415, 416, 504 Cauchy-Schwarz inequality, 543, 547 chamfering, 452, 461, 477 charge conservation, 266 chemical action, 1, 2, 6, 9, 256, 269, 270 chemical binding, x, 6, 7, 8, 14, 15, 16, 313, 321, 541 chemical binding functions, 6 chemical bond, x, xi, 1, 2, 5, 9, 12, 19, 84, 113, 251, 252, 256, 261, 270, 271, 272, 273, 315, 320, 335, 347, 356, 363, 480 chemical electronegativity, 253, 257, 258, 261 chemical graph, 479, 480, 482, 484, 487, 489, 494, 496, 500, 503, 629, 644, 651, 662 chemical hardness, 253, 255, 271, 272, 273, 274, 275 chemical reactions, 93, 94, 108, 109, 217, 218, 271, 272, 334, 425, 447, 485, 635 chemical reactivity, 18, 191, 252, 255, 261, 265, 267, 268, 269, 270, 271, 273, 500, 634 chemical structure, 92, 274, 437, 461, 480, 481, 482, 483, 484, 493, 500, 539, 579, 601, 629, 630, 631, 632, 634, 651 chemoinformatics, 482 chemometrics, 480, 482, 493, 497 chiral, 214, 409, 410, 411, 412, 430, 432, 454, 456, 458, 459, 460 Cluj, 426, 429, 473, 474, 476, 478, 507, 519 Cluster Analysis, 612, 623 coalescence, 440 coherent states, 146, 288 common factors, 541 complete set of commutative operators, 573 complexity, x, 192, 218, 305, 427, 431, 480, 482, 485, 494, 495, 496, 525, 561, 612, 613, 614 conditioned probability, 554, 555 Configuration Interaction (CI), 61, 62 correlation coefficient, 546, 551, 552, 634, 648, 652, 654, 655, 656, 658, 659, 660 Coulomb integral, 64, 69, 75, 77, 85, 86 Coulomb resolution, 63, 73, 74, 75, 77, 78, 80 Coulomb Sturmian, 72, 85
covariance index, 546 covering, 194, 425, 426, 427, 428, 429, 430, 431, 436, 437, 438, 440, 441, 443, 444, 445, 447, 456, 457, 461, 470, 472, 473, 474, 579 Coxeter, 406, 423, 458, 467, 474 creation operator, 261 critical electronegativity, 16 crystals, 141, 143, 189, 192, 234, 251, 277, 278, 279, 286, 288, 293, 294, 296, 297, 300, 301, 303, 304, 306, 308, 315, 391, 392, 393, 405, 462 cube, 299, 300, 389, 460, 462, 477 CVNET, 427, 447, 462, 474
D Daphnia magna, 573, 585 Database, 473, 580, 581, 593, 604, 623, 625 de Broglie, 5, 28, 39, 52, 54, 59, 195, 196, 199, 203, 206, 207, 608 Debye model, 94 degree of membership, 610, 613, 614, 617, 623 degree-degree association, 479, 480, 489, 490 delocalization, 13, 344, 348, 349, 350, 354, 355, 356, 362, 363, 364, 365 density functionals, 184, 189, 252, 253, 259, 261, 270, 370 DERIC, 65, 84 descriptor, 256, 482, 488, 501, 563, 613, 614, 623, 625, 641, 642, 643, 656, 659 design, 50, 327, 461, 473, 482, 497, 506, 540, 571, 578, 579, 583, 596, 598, 601, 605, 611, 630, 631 difference equations, 149 differential electronegativity, 255 Difficulty, 593, 596 Dirac, 1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 323, 391, 393, 561, 586 Dirac binding functions, 12, 17 Dirac equation, 2, 3, 5, 9, 10, 13, 17, 18, 393 Discriminant, 612, 624 dispersion, 173, 182, 415, 545, 546 distribution, 22, 62, 64, 69, 105, 106, 191, 192, 193, 200, 201, 210, 246, 272, 280, 285, 308, 335, 356, 358, 359, 360, 362, 389, 398, 400, 401, 402, 403, 408, 409, 414, 415, 416, 417, 421, 473, 480, 481, 482, 484, 489, 490, 491, 552, 553, 554, 574, 587, 589, 590, 594, 598, 599, 601, 602, 605, 612, 614, 621, 635 dodecahedron, 460, 462 Drug, 583, 584, 596, 624, 625, 665, 666 dual, 328, 427, 448, 449, 450, 451, 458, 465, 470
E ECG, 81, 82 edge, x, 13, 295, 395, 405, 426, 427, 428, 430, 436, 439, 441, 444, 447, 448, 449, 452, 453, 454, 458, 461, 466, 472, 485, 500, 501, 502, 504, 506, 507,
508, 509, 510, 511, 512, 513, 514, 515, 517, 518, 645 eigen-value, 5 Electric El, 573 electric field, 91, 98, 100, 114, 268, 269 electric susceptibility, 269 electrodynamics, 266 electromagnetic theory, 217, 247 electron, ix, x, 18, 48, 64, 65, 66, 69, 70, 73, 86, 91, 92, 95, 98, 105, 111, 112, 113, 142, 143, 144, 148, 149, 158, 159, 163, 164, 167, 176, 182, 195, 196, 198, 200, 202, 203, 204, 205, 206, 207, 208, 212, 213, 214, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 235, 236, 237, 238, 241, 247, 248, 255, 256, 270, 271, 273, 274, 278, 294, 297, 319, 320, 321, 326, 335, 336, 337, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 358, 359, 360, 361, 362, 363, 364, 365, 373, 375, 376, 379, 390, 391, 392, 393, 394, 396, 397, 406, 413, 439, 440, 475, 476, 520, 635, 639 electronegativity, 1, 6, 7, 8, 9, 13, 14, 15, 16, 17, 18, 191, 192, 212, 251, 253, 254, 255, 256, 257, 258, 259, 261, 262, 263, 265, 266, 267, 268, 270, 271, 272, 273, 274, 275, 352, 358, 373, 608 electronic affinity, 254, 256 endohedral, 439 endpoint, 541, 573, 574 endpoint paths, 573 energy gap, 92, 141 Enrichment, 594, 595, 596, 604 entropy, 480, 481, 483, 484, 487, 488, 489, 490, 491, 493, 494, 535, 616, 626 Esaki, 92, 108 Euler, 76, 405, 447, 449, 461, 466, 467, 475, 504, 647 exchange, 64, 68, 69, 70, 75, 76, 77, 78, 79, 83, 113, 142, 143, 144, 145, 146, 159, 178, 200, 206, 213, 278, 279, 292, 293, 307, 308, 313, 314, 315, 316, 317, 318, 319, 321, 328, 472 Exchange integral, 63 excitons, 141, 142, 143, 145, 149, 151, 153, 154, 155 Expert System, 608, 609, 623 expert systems, 541 external potential, 255, 259, 272, 292, 293, 306 Eyring, 73, 93, 109
F fractionary occupancy, 265, 271 factor indeterminacy, 578, 587 Faraday-Lenz law, 268 Fermi Golden rule, 94 Fermi operator, 143 Fock electronic space, 261 Fourier transform, 61, 63, 68, 71, 75, 76, 86, 145, 149, 208 Fourier transformation, 145, 149
FuGeNeSys, 611, 612, 624 Fukui index, 255 fullerene, 271, 387, 390, 403, 404, 405, 406, 407, 408, 409, 413, 415, 417, 418, 420, 421, 439, 440, 441, 456, 461, 472, 477, 504, 507, 508, 519 Fuzzy ARTMAP Neural Networks, 615 Fuzzy Clustering Method, 612 Fuzzy Least Squares Regression, 621, 623 Fuzzy Logic Theory, 607, 608, 609, 611, 613, 615, 617, 619, 621, 622, 623, 625, 627 Fuzzy set, 611
G Gamow, 92, 108 gauge equilibrium, 253 Gauss law, 268 GAUSSIAN, 62, 65, 84 Gaussian transform, 71 genus, 447, 466, 467 global softness, 257, 258 Goldberg, 452, 453, 457, 476, 663 Gram-Schmidt, 563, 565 Graph, 426, 447, 476, 480, 483, 494, 495, 497, 499, 506, 510, 511, 517, 521, 522, 605, 666, 667 graph entropy, 479, 480, 483, 484, 491, 493, 494 green chemistry, 571, 579 Gurney, 92, 108
H Hamiltonian, 2, 17, 63, 69, 95, 109, 113, 114, 143, 144, 145, 146, 148, 149, 159, 160, 161, 163, 184, 185, 199, 265 Hansch, 496, 533, 541, 562, 572, 583, 619, 627, 634, 664 hardness principles, 265 Hartree orbitals, 63 Hartree-Fock (HF), 62, 158 H-depleted, 480, 481 heat of molecule formation, 217 Heisenberg, 22, 23, 24, 545 hidden variable, 265, 270 Hilbert space, 2, 539, 542, 561, 563, 570 Hybrid integrals, 69, 70 hydrogen atom, 69, 77, 78, 91, 117, 200, 201, 204, 208, 286, 313, 319, 320, 328, 331, 332, 336, 344, 374, 481, 500 hydrogen molecule, 81, 85, 88, 288, 319 hydrophobicity, 541, 571, 641, 654, 656 Hylleraas, 63, 65, 67, 81, 82, 88, 161, 185 Hynes, 93, 109
I IBMOL, 64, 84
ICI, 81, 82 icosahedron, 404, 450, 462 Imidazole, 354 in silico, 540, 578 Independent action Model, 617, 623 Indoles, 348 information functional, 480, 483, 484, 487, 489, 490, 493, 494 information index, 485, 486 inter-endpoint molecular activity difference (IEMAD), 575 inter-endpoint norm difference (IEND), 575 ionic liquid internal angle, 572 ionic liquids, 571, 572, 573, 585, 586 ionization potential, 114, 218, 219, 220, 221, 225, 240, 244, 245, 248, 255, 256 ionization potentials, 218, 219, 220, 225, 240, 244, 245, 248, 255 isomerization, 271, 328, 376, 426, 439, 441, 442, 443, 445, 446, 447
J Jaccard similarity coefficient, 482 Josephson, 91, 92, 108
K Kim, 93, 109, 189, 422, 423 Kozhushner, 94, 109 K-paths, 575 Kramers, 93, 109
L Lattice Representation of DNA, 522 leapfrog, 427, 438, 450, 451, 453, 461, 464, 470, 474, 476, 477 lipophilicity, 571, 572 local response function, 258 local softness, 257, 258 localization, 6, 13, 16, 18, 164, 186, 252, 270, 271, 274, 307, 346, 364, 373 localization functions, 252, 270, 271 Lotfi A. Zadeh, 608 luminescence, 143, 387
M magnetic field, 205, 214, 266, 268 map, 112, 193, 245, 246, 425, 426, 427, 428, 430, 447, 448, 449, 450, 451, 452, 454, 455, 457, 459, 461, 466, 467, 468, 470, 472, 473, 478, 534, 535, 562, 574, 613, 614, 616, 619 Maxwell theory, 247
medial, 427, 448, 449, 450, 461, 462, 463, 464, 465, 466, 470 metabolization power, 577 methane molecule, 301, 302 molecules, ix, x, xi, 1, 19, 61, 62, 64, 65, 68, 70, 73, 74, 77, 78, 80, 81, 82, 83, 84, 85, 86, 87, 92, 94, 110, 111, 112, 113, 141, 148, 149, 157, 158, 159, 163, 174, 180, 182, 188, 192, 195, 203, 217, 218, 224, 225, 226, 229, 230, 231, 232, 234, 239, 251, 275, 279, 286, 288, 296, 297, 298, 301, 303, 308, 310, 311, 313, 315, 319, 320, 321, 325, 328, 331, 334, 335, 344, 353, 354, 357, 360, 361, 362, 363, 364, 367, 382, 387, 393, 405, 413, 425, 439, 440, 480, 486, 494, 504, 521, 562, 574, 575, 576, 578, 579, 583, 589, 590, 593, 594, 595, 596, 597, 598, 599, 600, 601, 603, 613, 614, 617, 631, 632, 635, 636, 639, 641, 643, 649, 651, 654, 656, 660, 661, 662, 663 Moore-Penrose, 558, 560 Morse oscillator, 91, 97, 99, 100, 102, 107 MPFUN, 82 Mulliken electronegativity, 254, 274 multi-shell, 462, 463, 466, 467
N nanotube, 387, 399, 409, 410, 411, 412, 413, 415, 417, 419, 420, 421, 440, 512, 513, 517, 518, 519 norm, 542, 569, 575, 577 normal distribution, 389, 551, 553, 633
O Omega, 428, 434, 461, 472, 474, 475, 478 operation, 44, 427, 438, 444, 448, 449, 450, 451, 452, 453, 454, 455, 457, 458, 459, 460, 461, 462, 463, 464, 466, 470, 472, 474, 542, 558, 611 operator, 3, 19, 61, 63, 66, 67, 68, 70, 74, 75, 77, 80, 87, 114, 162, 201, 261, 262, 443, 614 orthogonal descriptors, 533, 563, 585 orthogonality, 542, 548, 561, 579 oscillations, 102, 141, 143, 147, 281, 316 oxygen molecule, 309, 310
P pair-localization, 6 Pauli, 2, 3, 5, 144, 191, 210, 363 Pauli matrices, 5 Pauli operators, 144 peapod, 441 Pearson coefficient, 548, 551, 552, 554 Pearson correlation, 547, 548, 552, 556 periodic table, 208, 209, 210, 285, 286, 289, 365 perturbation factor, 263, 264 phonon, 141, 146
pi-bonding, 7, 8 PLS regression, 482 Poisson integral, 580 polarizability, 114, 356, 541, 561, 572 polarization, 111, 112, 116, 168, 169, 176, 182, 229, 231, 232, 269 POLYATOM, 64, 83 polyethylene chain, 141, 142, 143, 151 polyethylene foil, 141, 142, 154 polyhedron, 404, 449 potential energy surface, 80, 87, 93, 98, 113, 158, 160 Probability, 322, 593, 596, 608, 609, 623 probability density function, 544 propagator, 576 Protein Distance Matrix, 528 Protonation, 362, 364, 365
Q QSAR, 482, 484, 494, 495, 496, 521, 533, 534, 535, 537, 540, 541, 542, 561, 564, 567, 569, 571, 572, 573, 574, 578, 579, 581, 582, 583, 584, 589, 590, 591, 593, 595, 597, 598, 599, 601, 603, 604, 605, 612, 613, 614, 618, 619, 621, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 643, 644, 652, 657, 658, 659, 660, 662, 663, 664, 665, 666, 667, 668 QSPR/QSAR, 482, 494, 517, 537, 585, 664, 666, 667 QSPR-QSAR Theory, 623 QTAIM, 343, 344, 345, 346, 347, 348, 349, 352, 353, 359, 360, 362, 364 quadrupling, 427, 452, 477 quantum chemistry, 1, 74, 192, 270, 327, 328, 333, 427 quantum mechanics, 32, 42, 56, 86, 92, 192, 205, 217, 294, 499, 540, 546, 576 quantum phase space diagram, 100 quantum rules, 264 quantum-SAR factor, 576, 577, 578 quantum-SAR transformation, 577
R R12-wave function, 81 Randic M., 82 Ranking, 593 reactive biological activity, 253 Regression towards the mean effect, 590 ring, 180, 283, 286, 305, 306, 347, 350, 351, 352, 354, 355, 356, 357, 358, 359, 362, 375, 398, 400, 401, 404, 405, 418, 433, 437, 438, 440, 442, 461, 467, 480, 482, 502, 503, 636 Runge-Kutta method, 96
S scalar product, 263, 542, 543, 547, 548, 562, 568, 569 Schläfli, 425, 470, 478 Schrödinger, 1, 2, 5, 115, 277, 294, 362, 388, 393 second quantization, 149, 265, 266 self-transformation, 577 Sensitivity, 495, 599, 600 septupling, 453, 454, 477 sigma-bonding, 7, 8, 13 similarity, 28, 50, 86, 158, 170, 183, 245, 300, 482, 495, 526, 533, 536, 609, 617, 635 skeletons, 480, 483 SMILES, 63, 65, 71, 73, 84 snub, 450 solitons, 141, 142, 143, 148, 154 special relativity theory, 217, 246 Specificity, 340, 341, 599, 600 spectral mechanistic map, 572 spectral norm, 574 spectral path, 574, 575, 576, 579 spectral path principle, 575 Spectral Representation of DNA, 524 Spectral-SAR, 539, 541, 542, 543, 545, 547, 549, 551, 553, 555, 557, 559, 561, 562, 563, 564, 565, 566, 567, 569, 571, 573, 574, 575, 577, 579, 581, 583, 585, 587 Spectral-SAR determinant, 564, 574 Spin, 19, 198, 201, 375, 381 spin waves, 141 spongy, 469, 473 standard deviation, 546, 550, 551, 553, 617 steric effects, 541, 562, 634, 636, 660 Stone-Wales, 441 STOP, 63, 65, 74, 77, 78, 79, 80, 84 structural complexity, 479, 480, 493 structure-activity, 495, 521, 533, 539, 540, 541, 556, 572, 581, 582, 583, 584, 585, 586, 587, 616, 627, 628 sum of residues, 555 superposition principle, 540, 562 symmetry, 2, 77, 78, 79, 98, 99, 102, 114, 165, 173, 194, 200, 201, 214, 273, 277, 288, 290, 298, 349, 399, 411, 412, 458, 467, 470, 481, 483, 501, 609
T Tanimoto index, 482 tessellation, 427, 430, 433, 436, 439, 451, 452, 461, 472 tetrahedron, 460 Theorem, 35, 38, 39, 451, 452, 454, 491, 586 tiling, 322, 426, 430, 431, 443, 445, 454, 461, 467, 470, 473 Timisoara, xi, 1, 141, 155, 251, 539, 541, 543, 545, 547, 549, 551, 553, 555, 557, 559, 561, 563, 565,
567, 569, 571, 573, 575, 577, 579, 581, 583, 585, 586, 587, 629, 664, 665, 666, 667, 668 Timisoara Theorem, 569 topological descriptor, 480, 481, 483, 488, 652, 658, 659, 660, 661, 668 topological distance, 481, 629, 644, 647, 648, 651, 660, 662 topology, x, 194, 427, 449, 481, 484, 494, 496, 613, 616, 619, 630, 631, 662 torus, 427, 428, 429, 430, 431, 432, 433, 438, 439, 466, 478 toxicity, 540, 541, 571, 572, 580, 582, 583, 584, 586, 617, 618, 626, 629, 630, 631, 632, 643, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663 truncation, 427, 451, 452, 453, 461, 470 tubercular, 442, 443
U uniform probability, 545, 547, 549, 556 Utility, 593, 596
V vectorial length, 541, 548 vectorial space, 542 vertex, 294, 295, 425, 428, 436, 447, 448, 449, 450, 451, 452, 454, 455, 458, 465, 468, 481, 483, 484, 485, 486, 487, 488, 489, 492, 493, 494, 500, 501, 507, 508, 509, 518, 629, 646, 647, 650, 651, 662, 667 vertices, 297, 298, 299, 425, 426, 428, 442, 447, 448, 449, 451, 452, 453, 454, 459, 464, 465, 466, 467, 480, 481, 483, 484, 485, 488, 489, 491, 492, 499, 500, 501, 507, 508, 509, 511, 512, 513, 629, 644, 645, 646, 647, 650, 651, 659, 662 Vibrio fischeri, 573, 585
W water molecule, 229, 307, 308, 371, 373 wavelength, 52, 192, 195, 196, 197, 199, 203, 206, 207, 246 Wiener index, 494, 499, 500, 517, 518, 519, 644, 645