Reviews in Computational Chemistry
Volume 25

Edited by
Kenny B. Lipkowitz
Thomas R. Cundari

Editor Emeritus
Donald B. Boyd
Kenny B. Lipkowitz
Department of Chemistry
Howard University
525 College Street, N.W.
Washington, D.C., 20059, U.S.A.
[email protected]

Thomas R. Cundari
Department of Chemistry
University of North Texas
Box 305070, Denton, Texas 76203-5070, U.S.A.
[email protected]

Donald B. Boyd
Department of Chemistry and Chemical Biology
Indiana University-Purdue University at Indianapolis
402 North Blackford Street
Indianapolis, Indiana 46202-3274, U.S.A.
[email protected]
Copyright © 2007 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Wiley Bicentennial Logo: Richard J. Pacifico

Library of Congress Cataloging-in-Publication Data:

ISBN 978-0-470-17998-7
ISSN 1069-3599

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
Preface

The editors of the Reviews in Computational Chemistry series have long prided themselves on the diversity of topics covered within the pages of this series, which has now reached its twenty-fifth incarnation. Our search for both breadth and depth in subject coverage, while maintaining the strong pedagogical focus of this series, is motivated in large part by a desire to maintain relevance to our core audience of computational chemists, while striving to reach out to new audiences, including, dare we say it, experimentalists. However, the diversity of subject matter is also a reflection of the growth of the field of computational chemistry, which is no longer the sole domain of theorists nor of organic chemists, drug designers, or any other narrow scientific subgroup. It is hard to think of a chemical subdiscipline in which modern computer-based modeling has not become part of the research and innovation enterprise. Indeed, one sees an increased utilization of the tools of computational chemistry by researchers who may not even identify themselves as chemists, including, for example, mineralogists, physicists, molecular biologists, and engineers.

One of the true joys of being an editor of this series lies in the exposure it provides to all aspects of computational chemistry. When reading and editing some chapters, we sometimes ask, “Is this really computational chemistry?” and invariably, at the end of such a chapter, we find ourselves with an expanded view of what computational chemistry encompasses. As always, the desire to emphasize the diversity of this exciting discipline to non-computational chemists needs to be tempered with a dose of didactic reality; it does little good to expose a novice researcher to new vistas without providing the tools needed for traversing those new territories safely. In this volume, as in the others, we retain our pedagogical emphasis.
In Chapter 1, Professor Wolfgang Paul initiates a theme that is prevalent throughout the first half of this book: extrapolating from atomic-level phenomena to macroscopic chemical and physical properties. Professor Paul begins his chapter on the use of simulations to study the glass transition in polymer melts by describing exactly what a glass transition is from both an
experimental and a computational viewpoint. As with many modeling efforts that seek to understand and predict mechanical and physical properties from atomistic models (although non-atomistic models are also discussed in this chapter), one must treat phenomena that cover diverse scales appropriately, both in terms of time and in terms of size. Novice molecular modelers will particularly value his in-depth discussion of how to build models for glass transitions in polymer melts. Professor Paul introduces atomistic methods such as force field-based methods (which are likely to be more familiar to those researchers with experience in the modeling of small/medium-sized molecules) and coarse-grained approaches such as bead-spring and lattice models. Because the glass transition is intrinsically a response of the polymer to changes in thermal energy, Professor Paul provides ample treatment of molecular dynamics (MD) and Monte Carlo (MC) techniques for the sampling of thermodynamic and structural properties that are relevant to glass transitions. The review concludes with what amounts to a practicum on modeling glass transitions using the various methods discussed in the chapter, with application to 1,4-polybutadiene as a test case.

Dr. Nicholas Mosey and Professor Martin Müser provide a tutorial on the atomistic modeling of friction in Chapter 2. As with the chapter on glass transitions, the tutorial considers the extrapolation of atomic-level phenomena to macroscopic behavior, although the designation of “macroscopic” must be used carefully in this brave new world of nanomaterials and nanomachines. The chapter begins with a concise and readable discourse on the theoretical background pertinent to modeling friction, from mechanisms of friction to the dependence of friction on load and velocity.
Once the authors have brought the reader “up to speed,” computational aspects are presented that both experienced and novice modelers will need to come to grips with in order to implement an effective software-based solution to modeling friction. The chapter then continues with selected case studies that are designed to inspire those new to the field with what is possible in this branch of computational chemistry, but also to warn the reader of potential “sticking points” (pun intended) in the atomistic modeling of friction.

Continuing with the mini-theme of computational materials chemistry is Chapter 3 by Professor Thomas M. Truskett and coworkers. As in the previous chapters, the authors quickly frame the problem in terms of mapping atomic (chemical) to macroscopic (physical) properties. The authors then focus our attention on condensed media phenomena, specifically those in glasses and liquids. In this chapter, three properties receive attention: structural order, free volume, and entropy. Order, whether it is in a man-made material or found in nature, may be considered by many as something that is easy to spot but difficult to quantify; yet quantifying order is indeed what Professor Truskett and his coauthors describe. Different types of order are presented, as are various metrics used for their quantification, all the while maintaining theoretical rigor but not at the expense of readability. The authors follow this section of their
review with a discussion of calculating free volumes in condensed phases and the use of free volume to derive important thermodynamic properties. This chapter, which begins with the quantification of order, comes full circle (perhaps if only in a metaphysical sense) by concluding with entropy. As elsewhere in the chapter, the authors place a premium on the testing of physical phenomena with the models. Accordingly, they conclude the section on entropy modeling with a discussion of simulations designed to test the well-known Adam–Gibbs relationship, and even simulations that go beyond this fundamental relationship connecting thermodynamics with dynamics.

In Chapter 4, Dr. Laurence Fried provides a discourse on some issues involved in the modeling of energetic materials like high explosives (HEs). This chapter is an excellent example of how the computational chemist must not only rationalize and predict chemical phenomena, but must also relate atomic scale behavior of a given material to its bulk physical and mechanical properties. Energetic materials undergoing detonation are subject to high temperatures (several thousand kelvin) and high pressures (hundreds of kilobars); these are conditions extreme enough to cause water to abandon its “general chemistry” properties as a hydrogen-bonded liquid and morph into a superionic fluid. The conditions of extremely high temperature and pressure are just two issues that make the experimental study of detonation processes difficult and, thus, an area in which computational chemistry can make an important contribution to both the understanding of existing systems and the design of next-generation HE materials. Another difficulty confronting researchers in the field of high-energy material modeling is the extreme rapidity of the chemical reactions that underlie the detonation of HE materials.
However, as in many cases, the gods of computational chemistry give with one hand and take away with the other; the speed of many chemical reactions being studied fortunately allows us to take advantage of chemical equilibrium methods. Dr. Fried provides a concise yet thorough overview of the different modeling techniques that are typically found in materials science (use of hard sphere and related models, kinetic modeling, molecular mechanics, ab initio quantum mechanics, molecular dynamics), and a discourse on some of the limitations of such methods is included in this chapter. As is often the case in computational chemistry, progress requires both technological and methodological advances. Dr. Fried thus closes the chapter by describing some emerging strategies in the field of modeling highly energetic materials.

The next two chapters deal largely with metals, albeit from two different perspectives and in different areas of application. In Chapter 5, Professor Julio Alonso describes the modeling of transition metal clusters. He begins with a synopsis of the experimental research being done in the area of metal clusters along with the attendant computational methods, thus providing a solid foundation for new scientists in the field to build on. Emphasis is placed throughout the chapter on the prediction and understanding of magnetic properties of
transition metal clusters, as these properties have attracted great research attention within the nanomaterials community. Professor Alonso provides us with a “tool box” of computational chemistry options that can be used to describe the bonding in these clusters. In many respects, the properties of atomic clusters comprise a “gray area” between those of well-defined molecular systems and the properties of bulk metals. However, unlike the bulk, the chemical and physical properties of clusters do not display simple linear or monotonic behavior. Indeed, it is this dichotomy between bulk and cluster, along with the nonlinearity of chemical/physical properties with cluster size, that has attracted so much interest from both the scientific and the technology communities. The methodologies discussed in this chapter for the study of metal clusters include tight binding methods (akin to those used for modeling extended systems such as solids and surfaces) and several flavors of density functional theory. The focus of this chapter is not only on quantitative calculations but also on the elucidation of simple, qualitative bonding pictures for transition metal clusters. These bonding models not only bring to light the chemistry behind the numbers that can nowadays be computed in this rapidly emerging field of science but also provide a meaningful basis for future research.

In Chapter 6, Professor Laura Gagliardi leads us on a journey through the d- and f-blocks of the periodic table, where fascinating compounds of the heavy metals rhenium and uranium are commingled with the chemistry of lighter metals like chromium and copper. Several themes are pursued throughout the chapter. From the chemical perspective, metal–metal bonding receives considerable attention, not only in terms of the novel chemistry it may reveal but also in terms of the methods needed to handle these challenging chemical systems.
The author presents to the reader the complete active space self-consistent field (CASSCF) methodology, in which a linear combination of electronic configurations (as opposed to the single electron configuration methods familiar to most chemists) is used to describe a chemical compound. For chemical systems in which even a CASSCF description is insufficient for a quantitative (and in some cases even qualitative) understanding, the CASPT2 method is the method of choice for incorporating high-level electron correlation effects.

It is apparent from a perusal of the literature that a major theme in recent computational chemistry involves the modeling of systems that are increasing in size. Quantum chemists were at one time limited to the study of systems comprising two or three atoms, but now quantum modeling of chemical systems is being applied to thousands of atoms (or more). To accomplish this demanding task, novel approaches to solving the Schrödinger equation are essential. Professor Hua Guo covers a promising family of techniques in the area of solving large eigenproblems in the penultimate chapter of this book. This family of techniques seeks to avoid the problems of direct diagonalization of large (often sparse) matrices, which is a common computational bottleneck in the modeling of large chemical systems. Professor Guo starts with a discussion
of direct diagonalization strategies, which is then followed by recursive methods of diagonalization (the need for recursive strategies is discussed at length in this chapter for the novice and the seasoned professional modeler alike). Once the problem has been set up, approaches to its solution are discussed, including Lanczos recursion, Chebyshev recursion, and filter diagonalization. The chapter continues with a generous discourse on representative problems in the chemical sciences for which recursive methods of matrix diagonalization may be profitably exploited, including, as examples, spectroscopy, wave packets, and dynamics. The chapter concludes with a comparison of Lanczos and Chebyshev recursion methods.

The final chapter, by Professor Hugh Cartwright, covers artificial intelligence (AI) in modeling. Professor Cartwright’s chapter is a classic “How To” manual, starting with a simple definition of just what artificial intelligence is. After this introduction, the chapter covers four of the major subject areas of artificial intelligence: genetic algorithms (GAs)/evolutionary algorithms, neural networks, self-organizing maps, and expert systems. Professor Cartwright takes a plain language approach to introducing these research areas, long the purview of computational scientists, to the Reviews in Computational Chemistry audience. The chapter is organized in sections with titles ranging from the obviously practical like “Why does a genetic algorithm work?” and “What can we do with a neural network?” to some sections with titles that might seem a bit foreboding like “What can go wrong?” for those readers who are interested in seeking an entrée into this field of computing. Common-sense suggestions, recommendations, and advice are presented throughout.
Reviews in Computational Chemistry is highly rated and well received by the scientific community at large; credit for these accomplishments rests firmly on the shoulders of the authors whom we have contacted to provide the pedagogically driven reviews that have made this ongoing book series so popular. To those authors we are especially grateful.

We are also glad to note that our publisher now makes our most recent volumes available in an online form through Wiley InterScience. Please consult the Web (http://www.interscience.wiley.com/onlinebooks) or contact [email protected] for the latest information. For readers who appreciate the permanence and convenience of bound books, these will, of course, continue.

We thank the authors of this and previous volumes for their excellent chapters.

Kenny B. Lipkowitz
Washington

Thomas R. Cundari
Denton

February 2007
Contents

1. Determining the Glass Transition in Polymer Melts
   Wolfgang Paul

   Introduction
   Phenomenology of the Glass Transition
   Model Building
   Chemically Realistic Modeling
   Coarse-Grained Models
   Coarse-Grained Models of the Bead-Spring Type
   The Bond-Fluctuation Lattice Model
   Simulation Methods
   Monte Carlo Methods
   Molecular Dynamics Method
   Thermodynamic Properties
   Dynamics in Super-Cooled Polymer Melts
   Dynamics in the Bead-Spring Model
   Dynamics in 1,4-Polybutadiene
   Dynamic Heterogeneity
   Summary
   Acknowledgments
   References

2. Atomistic Modeling of Friction
   Nicholas J. Mosey and Martin H. Müser

   Introduction
   Theoretical Background
   Friction Mechanisms
   Load-Dependence of Friction
   Velocity-Dependence of Friction
   Role of Interfacial Symmetry
   Computational Aspects
   Surface Roughness
   Imposing Load and Shear
   Imposing Constant Temperature
   Bulk Systems
   Computational Models
   Selected Case Studies
   Instabilities, Hysteresis, and Energy Dissipation
   The Role of Atomic-Scale Roughness
   Superlubricity
   Self-Assembled Monolayers
   Tribochemistry
   Concluding Remarks
   Acknowledgments
   References

3. Computing Free Volume, Structural Order, and Entropy of Liquids and Glasses
   Jeetain Mittal, William P. Krekelberg, Jeffrey R. Errington, and Thomas M. Truskett

   Introduction
   Metrics for Structural Order
   Crystal-Independent Structural Order Metrics
   Structural Ordering Maps
   Free Volume
   Identifying Cavities and Computing Their Volumes
   Computing Free Volumes
   Computing Thermodynamics from Free Volumes
   Relating Dynamics to Free Volumes
   Entropy
   Testing the Adam–Gibbs Relationship
   An Alternative to Adam–Gibbs?
   Conclusions
   Acknowledgments
   References

4. The Reactivity of Energetic Materials at Extreme Conditions
   Laurence E. Fried

   Introduction
   Chemical Equilibrium
   Atomistic Modeling of Condensed-Phase Reactions
   First Principles Simulations of High Explosives
   Conclusions
   Acknowledgments
   References

5. Magnetic Properties of Atomic Clusters of the Transition Elements
   Julio A. Alonso

   Introduction
   Basic Concepts
   Experimental Studies of the Dependence of the Magnetic Moments with Cluster Size
   Simple Explanation of the Decay of the Magnetic Moments with Cluster Size
   Tight Binding Method
   Tight Binding Approximation for the d Electrons
   Introduction of s and p Electrons
   Formulation of the Tight Binding Method in the Notation of Second Quantization
   Spin-Density Functional Theory
   General Density Functional Theory
   Spin Polarization in Density Functional Theory
   Local Spin-Density Approximation (LSDA)
   Noncollinear Spin Density Functional Theory
   Measurement and Interpretation of the Magnetic Moments of Nickel Clusters
   Interpretation Using Tight Binding Calculations
   Influence of the s Electrons
   Density Functional Calculations for Small Nickel Clusters
   Orbital Polarization
   Clusters of Other 3d Elements
   Chromium and Iron Clusters
   Manganese Clusters
   Clusters of the 4d Elements
   Rhodium Clusters
   Ruthenium and Palladium Clusters
   Effect of Adsorbed Molecules
   Determination of Magnetic Moments by Combining Theory and Photodetachment Spectroscopy
   Summary and Prospects
   Appendix. Calculation of the Density of Electronic States within the Tight Binding Theory by the Method of Moments
   Acknowledgments
   References

6. Transition Metal- and Actinide-Containing Systems Studied with Multiconfigurational Quantum Chemical Methods
   Laura Gagliardi

   Introduction
   The Multiconfigurational Approach
   The Complete Active Space SCF Method
   Multiconfigurational Second-Order Perturbation Theory, CASPT2
   Treatment of Relativity
   Relativistic AO Basis Sets
   The Multiple Metal–Metal Bond in Re₂Cl₈²⁻ and Related Systems
   The Cr–Cr Multiple Bond
   Cu₂O₂ Theoretical Models
   Spectroscopy of Triatomic Molecules Containing One Uranium Atom
   Actinide Chemistry in Solution
   The Actinide–Actinide Chemical Bond
   Inorganic Chemistry of Diuranium
   Conclusions
   Acknowledgments
   References

7. Recursive Solutions to Large Eigenproblems in Molecular Spectroscopy and Reaction Dynamics
   Hua Guo

   Introduction
   Quantum Mechanics and Eigenproblems
   Discretization
   Direct Diagonalization
   Scaling Laws and Motivation for Recursive Diagonalization
   Recursion and the Krylov Subspace
   Lanczos Recursion
   Exact Arithmetic
   Finite-Precision Arithmetic
   Extensions of the Original Lanczos Algorithm
   Transition Amplitudes
   Expectation Values
   Chebyshev Recursion
   Chebyshev Operator and Cosine Propagator
   Spectral Method
   Filter-Diagonalization
   Filter-Diagonalization Based on Chebyshev Recursion
   Low-Storage Filter-Diagonalization
   Filter-Diagonalization Based on Lanczos Recursion
   Symmetry Adaptation
   Complex-Symmetric Problems
   Propagation of Wave Packets and Density Matrices
   Applications
   Bound States and Spectroscopy
   Reaction Dynamics
   Lanczos vs. Chebyshev
   Summary
   Acknowledgments
   References

8. Development and Uses of Artificial Intelligence in Chemistry
   Hugh Cartwright

   Introduction
   Evolutionary Algorithms
   Principles of Genetic Algorithms
   Genetic Algorithm Implementation
   Why Does the Genetic Algorithm Work?
   Where Is the Learning in the Genetic Algorithm?
   What Can the Genetic Algorithm Do?
   What Can Go Wrong with the Genetic Algorithm?
   Neural Networks
   Neural Network Principles
   Neural Network Implementation
   Why Does the Neural Network Work?
   What Can We Do with Neural Networks?
   What Can Go Wrong?
   Self-Organizing Maps
   Where Is The Learning?
   Some Applications of SOMs
   Expert Systems
   Conclusion
   References

Author Index

Subject Index
Contributors

Julio A. Alonso, Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, E-47011 Valladolid, Spain and Donostia International Physics Center (DIPC), 20018 San Sebastián, Spain (Electronic mail: [email protected])

Hugh Cartwright, Department of Chemistry, University of Oxford, Physical and Theoretical Chemistry Laboratory, South Parks Road, Oxford, United Kingdom OX1 3QZ (Electronic mail: [email protected])

Jeffrey R. Errington, Department of Chemical and Biological Engineering, State University of New York at Buffalo, Buffalo, NY 14260, U.S.A. (Electronic mail: [email protected])

Laurence E. Fried, Chemistry, Materials Science, and Life Sciences Directorate, Lawrence Livermore National Laboratory, L-282, 7000 East Avenue, Livermore, CA 94550, U.S.A. (Electronic mail: [email protected])

Laura Gagliardi, Department of Physical Chemistry, University of Geneva, 30 Quai Ernest Ansermet, CH-1211 Geneva 4, Switzerland (Electronic mail: [email protected])

Hua Guo, Department of Chemistry, University of New Mexico, Albuquerque, NM 87131, U.S.A. (Electronic mail: [email protected])

William P. Krekelberg, Department of Chemical Engineering, The University of Texas at Austin, Austin, TX 78712, U.S.A. (Electronic mail: [email protected])

Jeetain Mittal, Department of Chemical Engineering, The University of Texas at Austin, Austin, TX 78712, U.S.A. (Electronic mail: [email protected])

Nicholas J. Mosey, Department of Chemistry, University of Western Ontario, London, ON, N6A 5B7 Canada (Electronic mail: [email protected])

Martin H. Müser, Department of Applied Mathematics, University of Western Ontario, London, ON, N6A 5B7 Canada (Electronic mail: [email protected])

Wolfgang Paul, Institut für Physik, Johannes-Gutenberg-Universität, 55099 Mainz, Germany (Electronic mail: [email protected])

Thomas M. Truskett, Department of Chemical Engineering and Institute for Theoretical Chemistry, The University of Texas at Austin, Austin, TX 78712, U.S.A. (Electronic mail: [email protected])
Contributors to Previous Volumes

Volume 1 (1990)

David Feller and Ernest R. Davidson, Basis Sets for Ab Initio Molecular Orbital Calculations and Intermolecular Interactions.
James J. P. Stewart, Semiempirical Molecular Orbital Methods.
Clifford E. Dykstra, Joseph D. Augspurger, Bernard Kirtman, and David J. Malik, Properties of Molecules by Direct Calculation.
Ernest L. Plummer, The Application of Quantitative Design Strategies in Pesticide Design.
Peter C. Jurs, Chemometrics and Multivariate Analysis in Analytical Chemistry.
Yvonne C. Martin, Mark G. Bures, and Peter Willett, Searching Databases of Three-Dimensional Structures.
Paul G. Mezey, Molecular Surfaces.
Terry P. Lybrand, Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods.
Donald B. Boyd, Aspects of Molecular Modeling.
Donald B. Boyd, Successes of Computer-Assisted Molecular Design.
Ernest R. Davidson, Perspectives on Ab Initio Calculations.
Volume 2 (1991)

Andrew R. Leach, A Survey of Methods for Searching the Conformational Space of Small and Medium-Sized Molecules.
John M. Troyer and Fred E. Cohen, Simplified Models for Understanding and Predicting Protein Structure.
J. Phillip Bowen and Norman L. Allinger, Molecular Mechanics: The Art and Science of Parameterization.
Uri Dinur and Arnold T. Hagler, New Approaches to Empirical Force Fields.
Steve Scheiner, Calculating the Properties of Hydrogen Bonds by Ab Initio Methods.
Donald E. Williams, Net Atomic Charge and Multipole Models for the Ab Initio Molecular Electric Potential.
Peter Politzer and Jane S. Murray, Molecular Electrostatic Potentials and Chemical Reactivity.
Michael C. Zerner, Semiempirical Molecular Orbital Methods.
Lowell H. Hall and Lemont B. Kier, The Molecular Connectivity Chi Indexes and Kappa Shape Indexes in Structure-Property Modeling.
I. B. Bersuker and A. S. Dimoglo, The Electron-Topological Approach to the QSAR Problem.
Donald B. Boyd, The Computational Chemistry Literature.

Volume 3 (1992)

Tamar Schlick, Optimization Methods in Computational Chemistry.
Harold A. Scheraga, Predicting Three-Dimensional Structures of Oligopeptides.
Andrew E. Torda and Wilfred F. van Gunsteren, Molecular Modeling Using NMR Data.
David F. V. Lewis, Computer-Assisted Methods in the Evaluation of Chemical Toxicity.
Volume 4 (1993)

Jerzy Cioslowski, Ab Initio Calculations on Large Molecules: Methodology and Applications.
Michael L. McKee and Michael Page, Computing Reaction Pathways on Molecular Potential Energy Surfaces.
Robert M. Whitnell and Kent R. Wilson, Computational Molecular Dynamics of Chemical Reactions in Solution.
Roger L. DeKock, Jeffry D. Madura, Frank Rioux, and Joseph Casanova, Computational Chemistry in the Undergraduate Curriculum.

Volume 5 (1994)

John D. Bolcer and Robert B. Hermann, The Development of Computational Chemistry in the United States.
Rodney J. Bartlett and John F. Stanton, Applications of Post-Hartree–Fock Methods: A Tutorial.
Steven M. Bachrach, Population Analysis and Electron Densities from Quantum Mechanics.
Jeffry D. Madura, Malcolm E. Davis, Michael K. Gilson, Rebecca C. Wade, Brock A. Luty, and J. Andrew McCammon, Biological Applications of Electrostatic Calculations and Brownian Dynamics Simulations.
K. V. Damodaran and Kenneth M. Merz Jr., Computer Simulation of Lipid Systems.
Jeffrey M. Blaney and J. Scott Dixon, Distance Geometry in Molecular Modeling.
Lisa M. Balbes, S. Wayne Mascarella, and Donald B. Boyd, A Perspective of Modern Methods in Computer-Aided Drug Design.

Volume 6 (1995)

Christopher J. Cramer and Donald G. Truhlar, Continuum Solvation Models: Classical and Quantum Mechanical Implementations.
Clark R. Landis, Daniel M. Root, and Thomas Cleveland, Molecular Mechanics Force Fields for Modeling Inorganic and Organometallic Compounds.
Vassilios Galiatsatos, Computational Methods for Modeling Polymers: An Introduction.
Rick A. Kendall, Robert J. Harrison, Rik J. Littlefield, and Martyn F. Guest, High Performance Computing in Computational Chemistry: Methods and Machines.
Donald B. Boyd, Molecular Modeling Software in Use: Publication Trends.
Eiji Osawa and Kenny B. Lipkowitz, Appendix: Published Force Field Parameters.

Volume 7 (1996)

Geoffrey M. Downs and Peter Willett, Similarity Searching in Databases of Chemical Structures.
Andrew C. Good and Jonathan S. Mason, Three-Dimensional Structure Database Searches.
Jiali Gao, Methods and Applications of Combined Quantum Mechanical and Molecular Mechanical Potentials.
Libero J. Bartolotti and Ken Flurchick, An Introduction to Density Functional Theory.
Alain St-Amant, Density Functional Methods in Biomolecular Modeling.
Danya Yang and Arvi Rauk, The A Priori Calculation of Vibrational Circular Dichroism Intensities.
Donald B. Boyd, Appendix: Compendium of Software for Molecular Modeling.

Volume 8 (1996)

Zdenek Slanina, Shyi-Long Lee, and Chin-hui Yu, Computations in Treating Fullerenes and Carbon Aggregates.
Gernot Frenking, Iris Antes, Marlis Böhme, Stefan Dapprich, Andreas W. Ehlers, Volker Jonas, Arndt Neuhaus, Michael Otto, Ralf Stegmann, Achim Veldkamp, and Sergei F. Vyboishchikov, Pseudopotential Calculations of Transition Metal Compounds: Scope and Limitations.
Thomas R. Cundari, Michael T. Benson, M. Leigh Lutz, and Shaun O. Sommerer, Effective Core Potential Approaches to the Chemistry of the Heavier Elements.
Jan Almlöf and Odd Gropen, Relativistic Effects in Chemistry.
Donald B. Chesnut, The Ab Initio Computation of Nuclear Magnetic Resonance Chemical Shielding.

Volume 9 (1996)

James R. Damewood, Jr., Peptide Mimetic Design with the Aid of Computational Chemistry.
T. P. Straatsma, Free Energy by Molecular Simulation.
Robert J. Woods, The Application of Molecular Modeling Techniques to the Determination of Oligosaccharide Solution Conformations.
Ingrid Pettersson and Tommy Liljefors, Molecular Mechanics Calculated Conformational Energies of Organic Molecules: A Comparison of Force Fields.
Gustavo A. Arteca, Molecular Shape Descriptors.

Volume 10 (1997)

Richard Judson, Genetic Algorithms and Their Use in Chemistry.
Eric C. Martin, David C. Spellmeyer, Roger E. Critchlow Jr., and Jeffrey M. Blaney, Does Combinatorial Chemistry Obviate Computer-Aided Drug Design?
Robert Q. Topper, Visualizing Molecular Phase Space: Nonstatistical Effects in Reaction Dynamics.
Raima Larter and Kenneth Showalter, Computational Studies in Nonlinear Dynamics.
Stephen J. Smith and Brian T. Sutcliffe, The Development of Computational Chemistry in the United Kingdom.

Volume 11 (1997)

Mark A. Murcko, Recent Advances in Ligand Design Methods.
David E. Clark, Christopher W. Murray, and Jin Li, Current Issues in De Novo Molecular Design.
Tudor I. Oprea and Chris L. Waller, Theoretical and Practical Aspects of Three-Dimensional Quantitative Structure–Activity Relationships. Giovanni Greco, Ettore Novellino, and Yvonne Connolly Martin, Approaches to Three-Dimensional Quantitative Structure–Activity Relationships. Pierre-Alain Carrupt, Bernard Testa, and Patrick Gaillard, Computational Approaches to Lipophilicity: Methods and Applications. Ganesan Ravishanker, Pascal Auffinger, David R. Langley, Bhyravabhotla Jayaram, Matthew A. Young, and David L. Beveridge, Treatment of Counterions in Computer Simulations of DNA. Donald B. Boyd, Appendix: Compendium of Software and Internet Tools for Computational Chemistry.
Volume 12 (1998) Hagai Meirovitch, Calculation of the Free Energy and the Entropy of Macromolecular Systems by Computer Simulation. Ramzi Kutteh and T. P. Straatsma, Molecular Dynamics with General Holonomic Constraints and Application to Internal Coordinate Constraints. John C. Shelley and Daniel R. Bérard, Computer Simulation of Water Physisorption at Metal–Water Interfaces. Donald W. Brenner, Olga A. Shenderova, and Denis A. Areshkin, Quantum-Based Analytic Interatomic Forces and Materials Simulation. Henry A. Kurtz and Douglas S. Dudis, Quantum Mechanical Methods for Predicting Nonlinear Optical Properties. Chung F. Wong, Tom Thacher, and Herschel Rabitz, Sensitivity Analysis in Biomolecular Simulation. Paul Verwer and Frank J. J. Leusen, Computer Simulation to Predict Possible Crystal Polymorphs. Jean-Louis Rivail and Bernard Maigret, Computational Chemistry in France: A Historical Survey.
Volume 13 (1999) Thomas Bally and Weston Thatcher Borden, Calculations on Open-Shell Molecules: A Beginner’s Guide.
Neil R. Kestner and Jaime E. Combariza, Basis Set Superposition Errors: Theory and Practice. James B. Anderson, Quantum Monte Carlo: Atoms, Molecules, Clusters, Liquids, and Solids. Anders Wallqvist and Raymond D. Mountain, Molecular Models of Water: Derivation and Description. James M. Briggs and Jan Antosiewicz, Simulation of pH-dependent Properties of Proteins Using Mesoscopic Models. Harold E. Helson, Structure Diagram Generation.
Volume 14 (2000) Michelle Miller Francl and Lisa Emily Chirlian, The Pluses and Minuses of Mapping Atomic Charges to Electrostatic Potentials. T. Daniel Crawford and Henry F. Schaefer III, An Introduction to Coupled Cluster Theory for Computational Chemists. Bastiaan van de Graaf, Swie Lan Njo, and Konstantin S. Smirnov, Introduction to Zeolite Modeling. Sarah L. Price, Toward More Accurate Model Intermolecular Potentials For Organic Molecules. Christopher J. Mundy, Sundaram Balasubramanian, Ken Bagchi, Mark E. Tuckerman, Glenn J. Martyna, and Michael L. Klein, Nonequilibrium Molecular Dynamics. Donald B. Boyd and Kenny B. Lipkowitz, History of the Gordon Research Conferences on Computational Chemistry. Mehran Jalaie and Kenny B. Lipkowitz, Appendix: Published Force Field Parameters for Molecular Mechanics, Molecular Dynamics, and Monte Carlo Simulations.
Volume 15 (2000) F. Matthias Bickelhaupt and Evert Jan Baerends, Kohn-Sham Density Functional Theory: Predicting and Understanding Chemistry.
Michael A. Robb, Marco Garavelli, Massimo Olivucci, and Fernando Bernardi, A Computational Strategy for Organic Photochemistry. Larry A. Curtiss, Paul C. Redfern, and David J. Frurip, Theoretical Methods for Computing Enthalpies of Formation of Gaseous Compounds. Russell J. Boyd, The Development of Computational Chemistry in Canada.
Volume 16 (2000) Richard A. Lewis, Stephen D. Pickett, and David E. Clark, Computer-Aided Molecular Diversity Analysis and Combinatorial Library Design. Keith L. Peterson, Artificial Neural Networks and Their Use in Chemistry. Jörg-Rüdiger Hill, Clive M. Freeman, and Lalitha Subramanian, Use of Force Fields in Materials Modeling. M. Rami Reddy, Mark D. Erion, and Atul Agarwal, Free Energy Calculations: Use and Limitations in Predicting Ligand Binding Affinities.
Volume 17 (2001) Ingo Muegge and Matthias Rarey, Small Molecule Docking and Scoring. Lutz P. Ehrlich and Rebecca C. Wade, Protein-Protein Docking. Christel M. Marian, Spin-Orbit Coupling in Molecules. Lemont B. Kier, Chao-Kun Cheng, and Paul G. Seybold, Cellular Automata Models of Aqueous Solution Systems. Kenny B. Lipkowitz and Donald B. Boyd, Appendix: Books Published on the Topics of Computational Chemistry.
Volume 18 (2002) Geoff M. Downs and John M. Barnard, Clustering Methods and Their Uses in Computational Chemistry. Hans-Joachim Böhm and Martin Stahl, The Use of Scoring Functions in Drug Discovery Applications.
Steven W. Rick and Steven J. Stuart, Potentials and Algorithms for Incorporating Polarizability in Computer Simulations. Dmitry V. Matyushov and Gregory A. Voth, New Developments in the Theoretical Description of Charge-Transfer Reactions in Condensed Phases. George R. Famini and Leland Y. Wilson, Linear Free Energy Relationships Using Quantum Mechanical Descriptors. Sigrid D. Peyerimhoff, The Development of Computational Chemistry in Germany. Donald B. Boyd and Kenny B. Lipkowitz, Appendix: Examination of the Employment Environment for Computational Chemistry.
Volume 19 (2003) Robert Q. Topper, David L. Freeman, Denise Bergin and Keirnan R. LaMarche, Computational Techniques and Strategies for Monte Carlo Thermodynamic Calculations, with Applications to Nanoclusters. David E. Smith and Anthony D. J. Haymet, Computing Hydrophobicity. Lipeng Sun and William L. Hase, Born-Oppenheimer Direct Dynamics Classical Trajectory Simulations. Gene Lamm, The Poisson-Boltzmann Equation.
Volume 20 (2004) Sason Shaik and Philippe C. Hiberty, Valence Bond Theory: Its History, Fundamentals and Applications. A Primer. Nikita Matsunaga and Shiro Koseki, Modeling of Spin Forbidden Reactions. Stefan Grimme, Calculation of the Electronic Spectra of Large Molecules. Raymond Kapral, Simulating Chemical Waves and Patterns. Costel Sârbu and Horia Pop, Fuzzy Soft-Computing Methods and Their Applications in Chemistry. Sean Ekins and Peter Swaan, Development of Computational Models for Enzymes, Transporters, Channels and Receptors Relevant to ADME/Tox.
Volume 21 (2005) Roberto Dovesi, Bartolomeo Civalleri, Roberto Orlando, Carla Roetti and Victor R. Saunders, Ab Initio Quantum Simulation in Solid State Chemistry. Patrick Bultinck, Xavier Gironés and Ramon Carbó-Dorca, Molecular Quantum Similarity: Theory and Applications. Jean-Loup Faulon, Donald P. Visco, Jr. and Diana Roe, Enumerating Molecules. David J. Livingstone and David W. Salt, Variable Selection- Spoilt for Choice. Nathan A. Baker, Biomolecular Applications of Poisson-Boltzmann Methods. Baltazar Aguda, Georghe Craciun and Rengul Cetin-Atalay, Data Sources and Computational Approaches for Generating Models of Gene Regulatory Networks.
Volume 22 (2006) Patrice Koehl, Protein Structure Classification. Emilio Esposito, Dror Tobi and Jeffry Madura, Comparative Protein Modeling. Joan-Emma Shea, Miriam Friedel, and Andrij Baumketner, Simulations of Protein Folding. Marco Saraniti, Shela Aboud, and Robert Eisenberg, The Simulation of Ionic Charge Transport in Biological Ion Channels: An Introduction to Numerical Methods. C. Matthew Sundling, Nagamani Sukumar, Hongmei Zhang, Curt Breneman, and Mark Embrechts, Wavelets in Chemistry and Chemoinformatics.
Volume 23 (2007) Christian Ochsenfeld, Jörg Kussmann, and Daniel S. Lambrecht, Linear Scaling Methods in Quantum Chemistry. Spiridoula Matsika, Conical Intersections in Molecular Systems. Antonio Fernandez-Ramos, Benjamin A. Ellingson, Bruce C. Garrett, and Donald G. Truhlar, Variational Transition State Theory with Multidimensional Tunneling.
Roland Faller, Coarse-Grain Modelling of Polymers. Jeffrey W. Godden and Jürgen Bajorath, Analysis of Chemical Information Content Using Shannon Entropy. Ovidiu Ivanciuc, Applications of Support Vector Machines in Chemistry. Donald B. Boyd, How Computational Chemistry Became Important in the Pharmaceutical Industry.
Volume 24 (2007) Martin Schoen and Sabine H. L. Klapp, Nanoconfined Fluids. Soft Matter Between Two and Three Dimensions.
CHAPTER 1
Determining the Glass Transition in Polymer Melts
Wolfgang Paul
Institut für Physik, Johannes Gutenberg-Universität, Mainz, Germany
Reviews in Computational Chemistry, Volume 25, edited by Kenny B. Lipkowitz and Thomas R. Cundari. Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.

INTRODUCTION
In the last 15 years, computer simulation studies of the glass transition in polymer melts have contributed significantly to our understanding of this phenomenon, which is at the same time of fundamental scientific interest and of great technical importance for polymer materials, most of which are amorphous or at best semi-crystalline. This progress has been possible, on the one hand, because of improved models and simulation algorithms and, on the other hand, because of theoretical advances in the description of the structural glass transition in general.1 Much of this development has been mirrored in a series of conferences on relaxations in complex systems, the proceedings of which might serve as a good entry point into the literature on the glass transition in general.2,3

Instead of providing a detailed overview of all simulation work performed on the glass transition in polymer melts, this review has two goals. The first goal is to provide a novice to the field with the necessary background to understand the model building and choice of simulation technique for studies of the polymer glass transition. In particular, a novice modeler needs to be aware of the strengths and limitations of the different approaches used in the simulation of glass-forming polymers and to be able to judge the validity of the original literature. The second goal is to present a personal view of the contribution that computer simulations have made to our understanding of different aspects of the polymer glass transition, ranging from thermodynamic to dynamic properties. This part of the review will present our current understanding of glass transitions in polymeric melts based on simulation, experiment, and theory. We will illustrate this understanding based mainly on our own contributions to the field.

In the next section, a short summary of the phenomenology of the glass transition is presented. The following section on models then explains the various types of models employed in the simulation of polymer melts, and the ensuing section on simulation methods introduces the algorithms used for such simulations. We will then describe simulation results on concepts relating to the thermophysical properties of the polymer glass transition. Finally, the main section of this review will present an overview of simulations of the slowdown of relaxation processes in polymer melts upon approaching the glass transition, and in the conclusions, we summarize what has been learned about how to identify the glass transition in polymer melts.
PHENOMENOLOGY OF THE GLASS TRANSITION
The defining property of a structural glass transition is an increase of the structural relaxation time by more than 14 orders of magnitude without the development of any long-range ordered structure.1 Both the static structure and its relaxation behavior can be accessed by scattering experiments, and both can be calculated from simulations. The collective structure factor of a polymer melt, where one sums over all $M$ scattering centers in the system,

\[ S(q) = \frac{1}{M} \sum_{i,j=1}^{M} \left\langle \exp\left[ i \vec{q} \cdot (\vec{r}_i - \vec{r}_j) \right] \right\rangle \qquad [1] \]

resembles the structure factor of small molecule liquids (we have given here a simplified version of a neutron structure factor: all scattering lengths have been set to unity). In Figure 1, we show an example of a melt structure factor taken from a molecular dynamics simulation of a bead-spring model (which will be described later). The figure shows a first peak (going from left to right), the so-called amorphous halo, which is a measure of the mean interparticle distance in the liquid (polymer melt). Upon lowering the temperature to the glass transition, the amorphous halo shifts to larger momentum transfers as the mean interparticle distance shrinks through thermal contraction. The amorphous halo also increases in height, which indicates smaller fluctuations of the mean interparticle distance, but no new structural features are introduced by this cooling.
Figure 1 Melt structure factor S(q) versus momentum transfer q for three different temperatures (T = 0.46, 0.52, and 1.0, given in Lennard–Jones units) taken from a bead-spring model simulation. In the amorphous state (melt and glass), the only typical length scale is the next neighbor distance giving rise to the amorphous halo (first sharp diffraction peak) around q = 6.9 for this model.
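As a minimal sketch of how Eq. [1] might be evaluated for a single stored configuration, the following Python fragment uses the identity that the double sum over i and j equals the squared modulus of the density mode, divided by M. The function names, the restriction to wave vectors along x, and the uncorrelated test points are illustrative assumptions, not part of the original work; in practice the thermal average would be taken over many configurations and over all wave vectors of equal modulus.

```python
import numpy as np

def structure_factor(positions, q_vectors):
    """Collective structure factor of Eq. [1] for one configuration.

    positions : (M, 3) array of scattering-center coordinates
    q_vectors : (K, 3) array of momentum transfers
    Uses S(q) = |sum_j exp(i q.r_j)|^2 / M, which equals the double sum
    over i, j in Eq. [1] with all scattering lengths set to unity.
    """
    positions = np.asarray(positions, dtype=float)
    q_vectors = np.asarray(q_vectors, dtype=float)
    phases = q_vectors @ positions.T           # (K, M) values of q.r_j
    rho_q = np.exp(1j * phases).sum(axis=1)    # density modes rho(q)
    return (rho_q * rho_q.conj()).real / positions.shape[0]

def allowed_q_vectors(box_length, n_max):
    """Wave vectors along x that are compatible with a periodic cubic box."""
    n = np.arange(1, n_max + 1)
    q = np.zeros((n_max, 3))
    q[:, 0] = 2.0 * np.pi * n / box_length
    return q

# Illustrative input (assumption): uncorrelated points in a cubic box.
rng = np.random.default_rng(0)
box = 10.0
ideal_gas = rng.uniform(0.0, box, size=(2000, 3))
s_q = structure_factor(ideal_gas, allowed_q_vectors(box, 20))
```

For uncorrelated points the result fluctuates around unity; a melt configuration would instead show the amorphous halo discussed above.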
The thermal expansion, however, changes behavior at the glass transition, a phenomenon that was first analyzed in detail in a careful study by Kovacs.4 In the polymer melt, the thermal expansion coefficient is almost constant, and it is again almost constant in the glass but with a smaller value. At the glass transition, there is therefore a break in the dependence of density on temperature, which is the foremost thermophysical characteristic of the glass transition. The decay of the structural correlations measured by the static structure factor can be studied by dynamic scattering techniques. From the simulations, the decay of structural correlations is determined most directly by calculating the coherent intermediate scattering function, which differs from Eq. [1] by a time shift in one of the particle positions as defined in Eq. [2]:

\[ S(q,t) = \frac{1}{M} \sum_{i,j=1}^{M} \left\langle \exp\left[ i \vec{q} \cdot (\vec{r}_i(t) - \vec{r}_j(0)) \right] \right\rangle \qquad [2] \]

The Fourier transform of this quantity, the dynamic structure factor $S(q,\omega)$, is measured directly by experiment. The structural relaxation time, or α-relaxation time, of a liquid is generally defined as the time required for the coherent intermediate scattering function at the momentum transfer of the amorphous halo to decay to about 30%; i.e., $S(q_{\mathrm{ah}}, \tau_\alpha) = 0.3$. The temperature dependence of the α time scale exhibits a dramatic slowdown of the structural relaxation upon cooling. This temperature dependence qualitatively agrees with that of the melt viscosity. This macroscopic measure of the relaxation time in the melt serves to define the so-called viscosimetric glass transition $T_g$ as the temperature at which the viscosity is $10^{13}$ Poise. This value corresponds to a structural relaxation time of approximately 100 s.

Figure 2 Sketch of typical temperature dependencies of the viscosity η of glass-forming systems, plotted on a logarithmic scale against T/Tg for an Arrhenius law and for Vogel–Fulcher (VF) laws with T0 = 0.8 Tg and T0 = 0.9 Tg. The viscosimetric Tg of a material is defined by the viscosity reaching 10^13 Poise. Strong glass formers show an Arrhenius temperature dependence, whereas fragile glass formers follow reasonably well a Vogel–Fulcher (VF) law predicting a diverging viscosity at some temperature T0.

In Figure 2, we show three typical temperature dependencies of the viscosity in the form of an Angell plot.5 The upper curve is an Arrhenius law defining so-called "strong glass formers." The two other curves follow Vogel–Fulcher laws (Eq. [3]) observed for "fragile glass formers," a category to which most polymeric systems belong, displaying a diverging viscosity at some temperature $T_0 < T_g$. Around $T_g$, the relaxation time of fragile glass formers increases sharply. The definition of $T_g$ is thus based on the fact that at this temperature the system falls out of equilibrium on typical experimental time scales. As a result of this falling out of equilibrium, one also observes a smeared-out step in the temperature dependence of the heat capacity close to $T_g$, which defines the calorimetric $T_g$ (similar to the behavior of the thermal expansion coefficient). The calorimetric $T_g$ and the viscosimetric $T_g$ need not agree exactly. For crystallizable polymers, one can define a "configurational entropy" of the polymer melt by subtracting the entropy of the corresponding crystal from the entropy of the melt. A monotonic decrease is predicted in the configurational entropy to a value at $T_g$ that is about one third of the corresponding value of the configurational entropy at the melting temperature of the crystal.6 Extrapolating to lower temperatures, one finds the configurational entropy to vanish at the Kauzmann temperature $T_K$,7 which is typically 30–50 K lower than $T_g$.5 It is interesting to note that
$T_K$ is often close to the Vogel–Fulcher temperature $T_0$ discussed in connection with Figure 2, which is determined by fitting the Vogel–Fulcher relation5–8 to the temperature dependence of the structural relaxation time of the melt9 using Eq. [3]:

\[ \tau = \tau_\infty \exp\left[ \frac{E_{\mathrm{act}}}{k_B (T - T_0)} \right] \qquad [3] \]

where $\tau_\infty$ is a time characterizing microscopic relaxation processes at high temperatures and $E_{\mathrm{act}}$ is an effective activation energy.

Up to this point the phenomenological characterization of the glass transition is the same for a polymer melt and for a molecular liquid. In a polymer melt, however, one must also have knowledge of both the conformational structure and the relaxation behavior of a single chain to characterize the system completely, be it in the melt state or in the glassy state. Flexible linear macromolecules in the melt adopt a random coil-like configuration; i.e., their square radius of gyration is given by10–12 Eq. [4]:

\[ R_g^2 = \frac{C_\infty \ell^2 N}{6} = \frac{N b^2}{6} \qquad [4] \]
where $N$ ($N \gg 1$) is the degree of polymerization and $\ell$ is the length of a segment. The characteristic ratio $C_\infty$ describes short-range orientational correlations among subsequent monomer units along the backbone of the polymer chain, and $b = \sqrt{C_\infty}\,\ell$ is the statistical segment length of the chain. On intermediate length scales, the structure of a polymer coil is well described by the Debye function10 of Eq. [5]:

\[ S_p(q) = \frac{1}{N} \sum_{i,j=1}^{N} \left\langle \exp\left[ i \vec{q} \cdot (\vec{r}_i - \vec{r}_j) \right] \right\rangle = N f_D(q^2 R_g^2), \qquad f_D(x) = \frac{2}{x^2} \left[ \exp(-x) - 1 + x \right] \qquad [5] \]

where $qb \ll 1$ is assumed for the momentum transfer and we again set all scattering lengths to unity. In the dense melt, these coils interpenetrate each other. Thus, their diffusive motion is slow even at temperatures far above the glass transition. If the chain length $N$ is smaller than the "entanglement chain length" $N_e$, above which reptation-like behavior sets in,12–15 the relaxation time describing how long it takes a coil to renew its configuration is given by the Rouse time

\[ \tau_R = \frac{\zeta(T)\, N^2 C_\infty \ell^2}{3 \pi^2 k_B T} \qquad [6] \]

where $\zeta(T)$ is the friction coefficient experienced by the segments of the chain in their Brownian motion, $k_B$ is Boltzmann's constant, and $T$ is the temperature.
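The single-chain relations of Eqs. [4]–[6] are simple enough to evaluate directly. The sketch below is one possible way to do so; the function names and the numerical inputs are hypothetical and given in arbitrary reduced units. The only non-obvious step is the small-x series used for the Debye function, which avoids the cancellation in exp(-x) - 1 + x near x = 0.

```python
import numpy as np

def radius_of_gyration_sq(n_segments, c_inf, seg_length):
    """Eq. [4]: R_g^2 = C_inf * l^2 * N / 6."""
    return c_inf * seg_length**2 * n_segments / 6.0

def debye_function(x):
    """Eq. [5]: f_D(x) = 2 [exp(-x) - 1 + x] / x^2, with f_D(0) = 1."""
    x = np.asarray(x, dtype=float)
    small = x < 1e-4
    safe_x = np.where(small, 1.0, x)               # avoid division by zero
    full = 2.0 * (np.expm1(-safe_x) + safe_x) / safe_x**2
    series = 1.0 - x / 3.0 + x**2 / 12.0           # Taylor expansion near 0
    return np.where(small, series, full)

def single_chain_structure_factor(q, n_segments, c_inf, seg_length):
    """Eq. [5]: S_p(q) = N f_D(q^2 R_g^2)."""
    rg2 = radius_of_gyration_sq(n_segments, c_inf, seg_length)
    return n_segments * debye_function(np.asarray(q, dtype=float) ** 2 * rg2)

def rouse_time(zeta, n_segments, c_inf, seg_length, k_b_t):
    """Eq. [6]: tau_R = zeta N^2 C_inf l^2 / (3 pi^2 k_B T)."""
    return zeta * n_segments**2 * c_inf * seg_length**2 / (3.0 * np.pi**2 * k_b_t)

# Hypothetical chain parameters in reduced units (assumptions).
N, C_INF, SEG = 100, 1.5, 1.0
q_values = np.linspace(0.0, 1.0, 11)
s_p = single_chain_structure_factor(q_values, N, C_INF, SEG)
tau_r = rouse_time(zeta=20.0, n_segments=N, c_inf=C_INF, seg_length=SEG, k_b_t=1.0)
```

Doubling N in `rouse_time` at fixed friction quadruples the result, which is the N^2 scaling stated in Eq. [6].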
The Rouse model12 that yields Eq. [6] also shows that the self-diffusion constant of the chains scales inversely with chain length

\[ D_N = \frac{k_B T}{N \zeta(T)} \qquad [7] \]

whereas the melt viscosity is proportional to the chain length

\[ \eta = \frac{c\, \zeta(T)\, b^2 N}{36} \qquad [8] \]

with $c$ being the number of monomers per volume.14,15 Ample experimental evidence exists10–13 that Eqs. [4]–[8] capture the essential features of (nonentangled) polymer chains in a melt; however, recent simulations and experiments16,17 have shown that the relaxation of coils on length scales smaller than $R_g$ is only qualitatively described by the Rouse model. The glass transition manifests itself in the temperature dependence of the segmental friction coefficient $\zeta$. Within the Rouse model, this quantity captures the influence of the specific chemistry on the dynamics in the melt, whereas the statistical segment length $b$ captures its influence on the static properties. This distinction explains the two types of models used to study the properties of polymer melts (the glass transition being one of them). Coarse-grained models, like a bead-spring model in the continuum or lattice polymer models, can reproduce the chain length scaling of static and dynamic properties in polymer melts when they correctly capture the determining physics. That physics involves the excluded volume between all segments and the connectivity of the chains. Chemically realistic models are needed when one tries either to reproduce experimental data quantitatively or to describe polymer properties on length and time scales that are still influenced by the detailed chemistry.

A particular characteristic feature of dynamic processes in the vicinity of the glass transition is the ubiquity of the Kohlrausch–Williams–Watts (KWW) stretched exponential relaxation:1,7–9

\[ \phi(t) \propto \exp\left[ -(t/\tau)^{\beta} \right], \qquad 0 < \beta \le 1 \qquad [9] \]

Relaxation functions $\phi(t)$, which are observable via mechanical relaxation, dielectric relaxation, multidimensional nuclear magnetic resonance (NMR) spectroscopy, neutron-spin echo scattering, and so on, can be described in their long-time behavior by Eq. [9]. The exponent $\beta$ typically lies in the range $0.3 \le \beta < 1$ and depends on what is relaxing. Although the relaxation time $\tau$ depends strongly on temperature, $\beta$ is often approximately independent of temperature in some temperature interval. In this regime, $\phi(t)$ exhibits a scaling property called the "time-temperature superposition principle."7–9

Polymers are very good glass formers, with a few notable exceptions. For some polymers, such as atactic polypropylene or random copolymers like
cis-trans polybutadiene, no possible crystalline state is known, so in these cases it is not clear at all whether we can speak about a super-cooled liquid on approaching the glass transition. Even when there is an ordered ground state (crystalline or only liquid crystalline) for a specific polymer, we can easily understand that a kinetic hindrance for ordering exists. In order to crystallize, a chain must change its random-coil-like state in favor of one of its possible energetic ground state conformations. Because of the packing constraints in a dense melt, this must happen (presumably) in a synchronized fashion with the surrounding chains. Thus, it is understandable that polymers are hard to crystallize. Accordingly, whether no known ordered state exists or whether that state is only kinetically inaccessible, it is easy to observe and measure metastable thermal equilibrium properties of polymer melts (e.g., specific heat or entropy) from the high-temperature melt to the low-temperature glass.
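To make the phenomenological laws of this section concrete, the following sketch combines the Vogel–Fulcher relation of Eq. [3] with the KWW function of Eq. [9]. All parameter values are hypothetical and expressed in reduced units with k_B = 1, and the function names are illustrative. With a temperature-independent stretching exponent, the relaxation curves at different temperatures collapse when plotted against t/τ, which is the time-temperature superposition property mentioned above.

```python
import numpy as np

def vogel_fulcher_time(temperature, tau_inf, e_act, t0, k_b=1.0):
    """Eq. [3]: tau = tau_inf * exp[E_act / (k_B (T - T0))], for T > T0."""
    temperature = np.asarray(temperature, dtype=float)
    if np.any(temperature <= t0):
        raise ValueError("Vogel-Fulcher law diverges at T0; need T > T0")
    return tau_inf * np.exp(e_act / (k_b * (temperature - t0)))

def kww(t, tau, beta):
    """Eq. [9]: phi(t) = exp[-(t / tau)^beta], normalized to phi(0) = 1."""
    return np.exp(-(np.asarray(t, dtype=float) / tau) ** beta)

def time_to_decay(level, tau, beta):
    """Time at which the KWW function has decayed to `level` (0 < level < 1)."""
    return tau * (-np.log(level)) ** (1.0 / beta)

# Hypothetical parameters in reduced units (assumptions).
TAU_INF, E_ACT, T0, BETA = 1.0e-2, 1.0, 0.35, 0.6
temperatures = np.array([1.0, 0.6, 0.5, 0.45])
taus = vogel_fulcher_time(temperatures, TAU_INF, E_ACT, T0)

# Time-temperature superposition: evaluate each curve at the same t / tau.
scaled_time = np.logspace(-3, 2, 50)
curves = [kww(scaled_time * tau, tau, BETA) for tau in taus]

# Decay to 30%, the threshold used above to define the alpha-relaxation time.
tau_alpha = time_to_decay(0.3, taus, BETA)
```

The collapse here is exact by construction; in simulation data it holds only approximately and only in the temperature interval where the exponent is constant.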
MODEL BUILDING
Our aim is to better understand the glass transition phenomenon in polymer melts by using computer simulations. The discussion of the glass transition phenomenology in the previous section made it clear that we can distinguish several levels of specificity in our computational quest: (1) We can try to model generic features of the structural glass transition, i.e., those features that are independent of whether we are considering a polymer melt or, e.g., an organic liquid. (2) We can try to determine features of the structural glass transition that are specific to polymeric materials as compared with, e.g., silica glasses. (3) We can try to understand quantitatively the glass transition for a specific polymeric material. If the aim of our work falls into category (1) and partly (2), it is most efficient from the modeling perspective to use coarse-grained models that capture only generic polymeric properties like monomer connectivity and excluded volume. If the aim of our work falls into category (3) and partly (2), we will need to employ a chemically realistic model for which quantitative input on the local geometry and energetics, i.e., a well-calibrated force field, is required. Below we consider first chemically realistic models and then we describe two classes of coarse-grained models.
CHEMICALLY REALISTIC MODELING
If we are aiming for a quantitatively correct prediction of the behavior and properties of a specific polymer, we need to employ an optimized and carefully validated force field for this specific polymer. In the literature one often finds simulation work using force fields that do not fulfill these criteria but where instead the authors use "polymer-xy-like" models. Although these models fail to reproduce the properties of the polymer they claim to model
quantitatively, the qualitative conclusions drawn from these simulations are often valid, especially when they concern more general polymeric properties. However, even qualitative conclusions pertaining only to a specific polymer or to a class of similar polymers can be problematic when derived from simulations employing inaccurate or unvalidated potentials. Various forms of classical potentials (force fields) for polymers can be found in the literature18–23 and have been reviewed in this book series.24–28 We are concerned in this chapter with reproducing the static, thermodynamic, and dynamic (transport and relaxational) properties of non-reactive organic polymers, and for this reason, the potential must represent accurately the molecular geometry, nonbonded interactions, and conformational energetics of the macromolecules of interest. The classical force field represents the potential energy of a polymer chain, made of $N$ atoms with coordinates given by the set $\{\vec{r}\}$, as a sum of nonbonded interactions and contributions from all bond, valence bend, and dihedral interactions:

\[ V(\{\vec{r}\}) = V^{\mathrm{nb}}(\{\vec{r}\}) + V^{\mathrm{pol}}(\{\vec{r}\}) = V^{\mathrm{nb}}(\{\vec{r}\}) + \sum_{\mathrm{bonds}} V^{\mathrm{bond}}(r_{ij}) + \sum_{\mathrm{bends}} V^{\mathrm{bend}}(\theta_{ijk}) + \sum_{\mathrm{dihedrals}} V^{\mathrm{tors}}(\phi_{ijkl}) \qquad [10] \]

More complicated cross-terms between the different intramolecular degrees of freedom are also employed in some force fields, but we will not consider them in the following. The dihedral term may also include four-center improper torsion or out-of-plane bending interactions that occur at $sp^2$ hybridized centers.29 The nonbonded interactions commonly consist of a sum of two-body repulsion and dispersion energy terms between atoms that are often of the Lennard–Jones form in addition to the energy from the interactions between fixed partial atomic or ionic charges (Coulomb interaction):

\[ V^{\mathrm{nb}}(\{\vec{r}\}) = \sum_{i,j=1}^{M} \left\{ 4\epsilon \left[ \left( \frac{\sigma}{r_{ij}} \right)^{12} - \left( \frac{\sigma}{r_{ij}} \right)^{6} \right] + \frac{q_i q_j}{4 \pi \epsilon_0 r_{ij}} \right\} \qquad [11] \]
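A pairwise evaluation of the nonbonded energy of Eq. [11] might be sketched as follows. Several simplifications are assumptions of this illustration rather than features of any particular force field: a single ε and σ for all pairs, each distinct pair counted once, no cutoff, no periodic images, no intramolecular exclusions, and a prefactor `coulomb_constant` standing in for 1/(4πε0) in whatever unit system is used.

```python
import numpy as np

def lennard_jones(r, epsilon, sigma):
    """Repulsion-dispersion part of Eq. [11] for a pair at distance r."""
    sr6 = (sigma / np.asarray(r, dtype=float)) ** 6
    return 4.0 * epsilon * (sr6**2 - sr6)

def coulomb(r, q_i, q_j, coulomb_constant=1.0):
    """Charge-charge part of Eq. [11]; coulomb_constant = 1/(4 pi eps0)."""
    return coulomb_constant * q_i * q_j / np.asarray(r, dtype=float)

def nonbonded_energy(positions, charges, epsilon, sigma, coulomb_constant=1.0):
    """Sum of Lennard-Jones and Coulomb terms over distinct pairs i < j."""
    positions = np.asarray(positions, dtype=float)
    charges = np.asarray(charges, dtype=float)
    i, j = np.triu_indices(len(positions), k=1)      # each pair once
    r = np.linalg.norm(positions[i] - positions[j], axis=1)
    pair_energy = lennard_jones(r, epsilon, sigma) + coulomb(
        r, charges[i], charges[j], coulomb_constant
    )
    return float(np.sum(pair_energy))

# Hypothetical three-site example in reduced units (assumption).
sites = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
energy = nonbonded_energy(sites, charges=[1.0, -1.0, 0.0], epsilon=1.0, sigma=1.0)
```

In this form the Lennard–Jones term vanishes at r = σ and takes its minimum value of -ε at r = 2^(1/6) σ, which gives a quick consistency check on any implementation.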
The dispersion interactions are weak compared with the repulsion, but they are longer ranged, which results in an attractive well of depth $\epsilon$ at an interatomic separation of $\sigma_{\min} = 2^{1/6}\sigma$. The interatomic distance at which the net potential is zero ($r_{ij} = \sigma$ for the Lennard–Jones form) is often used to define the atomic diameter. In addition to the Lennard–Jones form, the exponential-6 form of the dispersion–repulsion interaction,

\[ V^{\mathrm{exp6}}(\{\vec{r}\}) = \frac{1}{2} \sum_{i,j=1}^{M} \left[ A_{ij} \exp\{ -B_{ij} r_{ij} \} - \frac{C_{ij}}{r_{ij}^{6}} \right] \qquad [12] \]
is often used in atomistic models. Nonbonded interactions are typically included in the force field calculations for all atoms of different molecules and for atoms of the same molecule separated by more than two bonds, or by more than three bonds when the nonbonded "1-4" interaction has been included in the parameterization of an effective torsional interaction.29 The Coulomb interaction is long-range, which necessitates use of special numerical methods for efficient simulation.30 When one tries to understand the glass transition in a chemically realistic model, these long-range Coulomb interactions add further numerical overhead, so the most extensive glass transition simulations of realistic models were done for apolar molecules. In atomistic force fields, the contributions from bonded interactions included in Eq. [10] are commonly parameterized as

\[ V^{\mathrm{bond}}(r_{ij}) = \frac{1}{2} k^{\mathrm{bond}}_{ij} \left( r_{ij} - r^{0}_{ij} \right)^2 \qquad [13] \]

\[ V^{\mathrm{bend}}(\theta_{ijk}) = \frac{1}{2} k^{\mathrm{bend}}_{ijk} \left( \theta_{ijk} - \theta^{0}_{ijk} \right)^2 \approx \frac{1}{2} k'^{\,\mathrm{bend}}_{ijk} \left( \cos\theta_{ijk} - \cos\theta^{0}_{ijk} \right)^2 \qquad [14] \]

\[ V^{\mathrm{tors}}(\phi_{ijkl}) = \frac{1}{2} \sum_{n} k^{\mathrm{tors}}_{ijkl}(n) \left[ 1 - \cos(n \phi_{ijkl}) \right] \quad \text{or} \quad V^{\mathrm{tors}}(\phi_{ijkl}) = \frac{1}{2} k^{\mathrm{oop}}_{ijkl}\, \phi_{ijkl}^{2} \qquad [15] \]

Here, $r^{0}_{ij}$ is an equilibrium bond length and $\theta^{0}_{ijk}$ is an equilibrium valence bend angle, whereas $k^{\mathrm{bond}}_{ij}$, $k^{\mathrm{bend}}_{ijk}$, $k^{\mathrm{tors}}_{ijkl}(n)$, and $k^{\mathrm{oop}}_{ijkl}$ are the bond, bend, torsion, and out-of-plane bending force constants, respectively. The indices indicate which (bonded) atoms are involved in the interaction. These geometric parameters and force constants, combined with the nonbonded parameters $q_i$, $\epsilon$, and $\sigma$, constitute the classical force field for a particular polymer. Although there are existing standard force fields in the literature like AMBER,18 OPLS-AA,20 COMPASS,21 CHARMM,22 and PCFF23 to name but a few (see also Refs. 24–28), one will typically find that they are only a qualitative or at best a semi-quantitative representation of a polymer one might want to study. The quantitative modeling of a given polymeric material has to start from high-level quantum chemistry calculations as the best source of molecular level information for force field validation and parameterization. Although such calculations are not yet possible on high polymers, they are feasible on small molecules that are representative of polymer repeat units and for oligomers. These calculations can provide the molecular geometries, partial charges, polarizabilities, and the conformational energy surface needed for accurate prediction of structural, thermodynamic, and dynamic properties of polymers. A general procedure for deriving quantum chemistry–based potentials can be found in the literature.29,31,32 The intermolecular dispersion interactions can also, to a certain extent, be determined from these calculations.
However, it has turned out that the most accurate way of fixing these parameters is through matching of simulated phase equilibria to those derived from experiment.33 As a final step, the potential, regardless of its source, should be validated through extensive comparison with available experimental data for structural, thermodynamic, and dynamic properties obtained from simulations of the material of interest, closely related materials, and model compounds used in the parameterization. The importance of potential function validation in simulation of real materials cannot be overemphasized. For the nonpolar, simple hydrocarbon chains that we will discuss later, we can employ a simple force field of the form

\[ V(\{\vec{r}\}) = \sum_{i} V(l_i) + \sum_{j} V(\theta_j) + \sum_{k} V(\phi_k) + \sum_{n,m} V_{\mathrm{nb}}(r_{nm}) \qquad [16] \]
where the sums run over all bonds, bends, torsions, and nonbonded interacting atoms, respectively. One often does not treat the hydrogen atoms explicitly in simulations of hydrocarbon chains but instead combines them with the carbon atoms to which they are bound to create ‘‘united atoms.’’ This approximation not only reduces the number of force centers for the calculation of the nonbonded interactions, but it also removes the highest frequency oscillations (C–H bond length and H–C–H and H–C–C bond angles) from the model. This approximation works well when one wants to study structure and relaxational properties in amorphous polymers without any specific local interactions (i.e., strong electrostatic interactions or hydrogen bonding). In the latter cases, internal degrees of freedom of the united atoms and a model for their interaction may be added, but no reliable way exists so far to determine the parameters entering such a description quantitatively for a given polymer, so one generally loses the ability to obtain quantitative predictions using such models. A final approximation often employed in large-scale polymer simulations is to neglect the C–C bond length oscillations and to perform the simulation with constrained bond lengths.34 The approximations discussed in the last paragraph are motivated by computational expediency. However, they reflect our understanding of the relevant physical processes that must be included in the computer simulation for us to obtain a quantitative reproduction of the structure and dynamics of a realistic polymer melt. Such approximations are imposed on us by the huge spread of relaxation times one has to cover in the simulation, which range from local relaxations to conformational changes of unentangled chains requiring substantial computational efforts when one is striving to perform simulations in thermodynamic equilibrium. 
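The bonded part of such a force field, i.e., the first three sums of Eq. [16] with the harmonic bond and bend terms of Eqs. [13] and [14] and the cosine series of Eq. [15], can be sketched for a single linear united-atom chain as below. The parameter values, the function names, and the choice to measure the dihedral angle so that the planar trans state corresponds to φ = 0 are assumptions of this illustration (many codes place trans at φ = π instead); the chain needs at least four sites.

```python
import numpy as np

def internal_coordinates(positions):
    """Bond lengths, bend angles, and dihedral angles of a linear chain."""
    positions = np.asarray(positions, dtype=float)
    b = positions[1:] - positions[:-1]                  # bond vectors
    lengths = np.linalg.norm(b, axis=1)
    # Bend angle at the shared atom: angle between -b[k] and b[k+1].
    cos_theta = -np.einsum("ij,ij->i", b[:-1], b[1:]) / (lengths[:-1] * lengths[1:])
    thetas = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    # Dihedral from the normals of consecutive bond planes.
    n1 = np.cross(b[:-2], b[1:-1])
    n2 = np.cross(b[1:-1], b[2:])
    m1 = np.cross(n1, b[1:-1] / lengths[1:-1, None])
    x = np.einsum("ij,ij->i", n1, n2)
    y = np.einsum("ij,ij->i", m1, n2)
    phis = np.arctan2(-y, -x)       # assumed convention: trans = 0, cis = +/- pi
    return lengths, thetas, phis

def bond_energy(r, k_bond, r0):
    """Eq. [13]: harmonic bond stretching."""
    return 0.5 * k_bond * (r - r0) ** 2

def bend_energy(theta, k_bend, theta0):
    """Eq. [14], first form: harmonic in the bend angle."""
    return 0.5 * k_bend * (theta - theta0) ** 2

def torsion_energy(phi, k_tors):
    """Eq. [15], first form; k_tors maps the multiplicity n to k(n)."""
    return 0.5 * sum(k * (1.0 - np.cos(n * phi)) for n, k in k_tors.items())

def bonded_energy(positions, k_bond, r0, k_bend, theta0, k_tors):
    """Sum of the bond, bend, and torsion terms of Eq. [16] for one chain."""
    lengths, thetas, phis = internal_coordinates(positions)
    return float(
        np.sum(bond_energy(lengths, k_bond, r0))
        + np.sum(bend_energy(thetas, k_bend, theta0))
        + np.sum(torsion_energy(phis, k_tors))
    )

# Hypothetical parameters and an all-trans zigzag chain (assumptions).
R0, THETA0 = 1.53, np.deg2rad(114.0)
K_TORS = {1: 1.6, 2: -0.9, 3: 3.2}
idx = np.arange(6)
zigzag = np.column_stack([
    idx * R0 * np.sin(THETA0 / 2.0),
    (idx % 2) * R0 * np.cos(THETA0 / 2.0),
    np.zeros(6),
])
e_trans = bonded_energy(zigzag, 300.0, R0, 60.0, THETA0, K_TORS)
```

For the all-trans zigzag built at the equilibrium bond length and bend angle, every term vanishes under this convention, so `e_trans` should be zero to rounding error.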
The simulation studies of dynamic processes are generally conducted using molecular dynamics (MD) methods. Equilibrating the starting configurations for these studies, however, can profit from the use of Monte Carlo (MC) techniques where moves generating global conformational rearrangements are included.35
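To make the structure of Eq. [16] concrete, the following minimal Python sketch evaluates the four sums for a linear united-atom chain. The functional forms (harmonic bonds and bends, a three-term cosine torsion with the trans state at 180°, and a 12-6 Lennard–Jones term acting between atoms separated by four or more bonds) and all parameter values are illustrative assumptions in arbitrary but consistent units; they are not the force field used in the studies discussed here.

```python
import math

# Hypothetical parameters, chosen only for illustration.
L0, K_BOND = 1.53, 300.0                      # equilibrium bond length, bond stiffness
THETA0, K_BEND = math.radians(112.0), 60.0    # equilibrium bend angle, bend stiffness
K_TORS = (1.4, 0.3, 3.1)                      # cosine-series torsion coefficients
EPS, SIGMA = 0.1, 4.0                         # Lennard-Jones well depth and diameter

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def bend_angle(r0, r1, r2):
    """Valence angle at r1 in radians."""
    a, b = sub(r0, r1), sub(r2, r1)
    c = dot(a, b) / (norm(a) * norm(b))
    return math.acos(max(-1.0, min(1.0, c)))

def torsion_angle(r0, r1, r2, r3):
    """Dihedral angle in radians; the planar trans state gives +/- pi."""
    b1, b2, b3 = sub(r1, r0), sub(r2, r1), sub(r3, r2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    return math.atan2(dot(cross(n1, n2), b2) / norm(b2), dot(n1, n2))

def total_energy(r):
    """Sum of bond, bend, torsion, and nonbonded terms as in Eq. [16]."""
    n, v = len(r), 0.0
    for i in range(n - 1):                    # sum over bonds
        v += 0.5 * K_BOND * (norm(sub(r[i + 1], r[i])) - L0) ** 2
    for j in range(n - 2):                    # sum over bends
        v += 0.5 * K_BEND * (bend_angle(r[j], r[j + 1], r[j + 2]) - THETA0) ** 2
    for k in range(n - 3):                    # sum over torsions
        phi = torsion_angle(r[k], r[k + 1], r[k + 2], r[k + 3])
        k1, k2, k3 = K_TORS
        v += 0.5 * (k1 * (1 + math.cos(phi)) + k2 * (1 - math.cos(2 * phi))
                    + k3 * (1 + math.cos(3 * phi)))
    for a in range(n):                        # sum over nonbonded pairs
        for b in range(a + 4, n):
            x6 = (SIGMA / norm(sub(r[b], r[a]))) ** 6
            v += 4.0 * EPS * (x6 * x6 - x6)
    return v

def all_trans_chain(n):
    """Planar zig-zag chain with equilibrium bond lengths and bend angles."""
    s, c = math.sin(THETA0 / 2), math.cos(THETA0 / 2)
    return [(i * L0 * s, (i % 2) * L0 * c, 0.0) for i in range(n)]
```

For a four-atom all-trans chain every bonded term sits at its minimum and no nonbonded pair exists, so `total_energy(all_trans_chain(4))` vanishes up to rounding; longer chains pick up the Lennard–Jones contributions.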
COARSE-GRAINED MODELS

Coarse-grained polymer models neglect the chemical detail of a specific polymer chain and include only excluded volume and topology (chain connectivity) as the properties determining universal behavior of polymers. They can be formulated for the continuum (off-lattice) as well as for a lattice. For all coarse-grained models, the repeat unit or monomer unit represents a section of a chemically realistic chain. MD techniques are employed to study dynamics with off-lattice models, whereas MC techniques are used for the lattice models and for efficient equilibration of the continuum models.36–42 A tutorial on coarse-grained modeling can be found in this book series.43
Coarse-Grained Models of the Bead-Spring Type

These models retain the form of the nonbonded interaction used in the chemically realistic modeling, i.e., they use either an interaction of the Lennard–Jones or of the exponential-6 type. The repulsive parts of these potentials generate the necessary local excluded volume, whereas the attractive long-range parts can be used to model varying solvent quality for dilute or semi-dilute solutions and to generate a reasonable equation-of-state behavior for polymeric melts. The inclusion of chain connectivity prevents polymer strands from crossing one another in the course of a computer simulation. In bead-spring polymer models, this typically means that one has to limit the maximal (or typical) extension of a spring connecting the beads that represent the monomers along the chain. This is most often done using the so-called finitely extensible, nonlinear elastic (FENE) type potentials44 of Eq. [17]

$$U_F(l) = -\frac{1}{2} k\, l_{\max}^2 \ln\left[1 - (l/l_{\max})^2\right], \qquad 0 \le l \le l_{\max} \qquad [17]$$
but also with harmonic spring length potentials with a length cut-off45 or very stiff force constants.46 Beyond this bond length potential, one may typically include a bending energy term to reduce local flexibility. Because the bending energy and geometry on this length scale do not derive from chemical hybridization, one typically takes the equilibrium bond angle to be 180°. Dihedral energy terms are generally not included in coarse-grained models. Instead, the chains are treated on mesoscopic scales as freely rotating.
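The FENE bond potential of Eq. [17] can be sketched in a few lines of Python. The default values `k = 30` and `l_max = 1.5` (in Lennard–Jones units) are assumptions of a kind often used for bead-spring melts, not parameters taken from this chapter; the harmonic function is included only to show the small-extension limit.

```python
import math

def fene_energy(l, k=30.0, l_max=1.5):
    """FENE bond energy of Eq. [17]; returns infinity at or beyond l_max."""
    if l < 0.0:
        raise ValueError("bond length must be non-negative")
    if l >= l_max:
        return math.inf
    return -0.5 * k * l_max ** 2 * math.log(1.0 - (l / l_max) ** 2)

def harmonic_energy(l, k=30.0):
    """Leading term of the FENE energy for l much smaller than l_max."""
    return 0.5 * k * l * l
```

For small extensions the two functions agree, whereas the FENE energy grows without bound as the bond length approaches `l_max`, which is what keeps bonds short enough that chains cannot cross.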
The Bond-Fluctuation Lattice Model

The large-scale structure of polymer chains in a good solvent is that of a self-avoiding random walk (SAW), but in melts it is that of a random walk (RW).11 The large-scale structure of these mathematical models, however, is independent of whether one studies them in the continuum or on a lattice, and because of this, MC simulations of lattice models for polymers have a long history. Later in this tutorial we will use results from simulations of the bond-fluctuation lattice model.47–50 This model represents the repeat units of the coarse-grained polymer not as single vertices on some space lattice but as unit cubes on a simple cubic lattice (see Figure 3 for the three-dimensional version of the model). The bonds connecting consecutive monomers are from the class [2,0,0],[2,1,0],[2,1,1],[3,0,0],[2,2,1],[3,1,0], where the square brackets indicate all vectors obtainable from the given vector by lattice symmetry operations. There are a total of 108 bonds with 5 different lengths and 87 different bond angles for this model. The model thus introduces some local conformational flexibility while retaining the computational efficiency of lattice models for implementing excluded volume interactions by enforcing a single occupation of each lattice vertex. Intramolecular potentials are chosen as bond length and/or bond angle dependent according to the physical problem one wants to model. Note that, as in all coarse-grained models, the potentials in the bond-fluctuation model do not correspond physically to bond stretches and valence angle bending potentials in a chemically realistic polymer chain. When one implements an MC stochastic dynamics algorithm in this model (consisting of random-hopping moves of the monomers by one lattice constant in a randomly chosen lattice direction), the chosen set of bond vectors induces the preservation of chain connectivity as a consequence of excluded volume alone, which thus allows for efficient simulations.

Figure 3 Sketch of the bond-fluctuation lattice model. The monomer units are represented by unit cubes on the simple cubic lattice connected by bonds of varying length. One example of each bond vector class is shown in the sketch.

This class of moves
allows for a physical interpretation of the obtained stochastic dynamics41 that generates Rouse-like motion of the chains12 in the simulation of dense polymer melts.
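The bond set of the bond-fluctuation model described above can be enumerated directly. The sketch below generates all vectors obtained from the six class representatives by permuting components and flipping signs, and it recovers the 108 bond vectors and 5 distinct bond lengths quoted in the text. The helper `move_keeps_bonds` is a hypothetical illustration that checks only the bond restriction for a trial hop; the excluded-volume test on the lattice is deliberately left out.

```python
from itertools import permutations, product

# One representative per bond class, as listed in the text.
BOND_CLASSES = [(2, 0, 0), (2, 1, 0), (2, 1, 1), (3, 0, 0), (2, 2, 1), (3, 1, 0)]

def bond_vectors(classes=BOND_CLASSES):
    """All vectors reachable from the representatives by lattice symmetry."""
    vectors = set()
    for rep in classes:
        for perm in permutations(rep):
            for signs in product((1, -1), repeat=3):
                vectors.add(tuple(s * c for s, c in zip(signs, perm)))
    return vectors

BONDS = bond_vectors()
SQUARED_LENGTHS = sorted({x * x + y * y + z * z for x, y, z in BONDS})

def move_keeps_bonds(prev_pos, pos, next_pos, step):
    """True if hopping the monomer at pos by step keeps both bonds allowed."""
    new = tuple(p + s for p, s in zip(pos, step))
    b1 = tuple(a - b for a, b in zip(new, prev_pos))
    b2 = tuple(a - b for a, b in zip(next_pos, new))
    return b1 in BONDS and b2 in BONDS
```

For a straight trimer at (0,0,0), (2,0,0), (4,0,0), hopping the middle monomer by (0,1,0) keeps both bonds in the class [2,1,0], whereas hopping it by (1,0,0) would leave a bond of length 1, which is not in the allowed set.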
SIMULATION METHODS

The simulation methods most commonly used for atomistic or coarse-grained molecular models are MD simulations and MC simulations. In MD simulations, Newton’s equations of motion are integrated to generate a trajectory (a history) of the model system. The method can capture all vibrational and relaxational processes contained in the chosen model Hamiltonian. MC is a stochastic simulation method that can capture relaxational and diffusive processes. MC is commonly used to generate equilibrium configurations, either for sampling thermodynamic and structural properties or for providing starting configurations for ensuing MD runs that are used to evaluate the dynamics of the model system. In the following discussion, we will examine the two methods with regard to their two main applications: equilibration and generation of trajectories for dynamic measurements, respectively.
Monte Carlo Method

The MC method considers the configuration space of a model and generates a discrete-time random walk through configuration space following a master equation41,51

$$P(x, t_n) = P(x, t_{n-1}) + \sum_{x'} W(x' \to x)\, P(x', t_{n-1}) - \sum_{x'} W(x \to x')\, P(x, t_{n-1}) \qquad [18]$$

Here $x, x'$ denote two configurations of the system (specified, for instance, by the set of coordinates of all atoms $\{\vec{r}_n\}$ or the position of one chain end for all chains and all bond lengths, bond angles, and torsion angles $\{\vec{r}_1^{\,a}, l_i^a, \theta_j^a, \phi_k^a\}$, where $a = 1, \ldots, M$ runs over all chains and the indices $i, j, k$ run over all internal degrees of freedom of one chain). The transition rates $W(x \to x')$ are chosen to fulfill the detailed balance condition

$$W(x' \to x)\, P_{eq}(x') = W(x \to x')\, P_{eq}(x) \qquad [19]$$
which ensures an equal probability flow from $x'$ to $x$ as in the reverse direction in equilibrium. Here

$$P_{eq}(x) = \frac{1}{Z} \exp\{-\beta H(x)\} \qquad [20]$$

where $\beta = 1/k_B T$, $H(x)$ is the Hamiltonian of the system, and $Z$ is the canonical partition function

$$Z = \sum_x \exp\{-\beta H(x)\} \qquad [21]$$
Equation [19] ensures that the thermodynamic equilibrium distribution of Eq. [20] is the stationary (long-time) limit of the Markov chain generated by Eq. [18]. It does not specify the transition rates uniquely, however. Let us write them in the following way:

$$W(x \to x') = W_0(x \to x')\, W_T(x \to x') \qquad [22]$$
where $W_0$ is the probability of suggesting $x'$ as the next state, i.e., of suggesting a certain MC move, and $W_T$ is the thermal acceptance probability chosen to fulfill Eq. [19]. This requires the suggestion probabilities to be reversible

$$W_0(x \to x') = W_0(x' \to x) \qquad [23]$$
Only a few choices for $W_T$ exist in the literature, i.e., Metropolis rates, Glauber rates, or heat-bath,51 but there is an unlimited variety of possible choices for $W_0$, and this is the great advantage of the MC method. Only some choices for $W_0$ result in physically reasonable dynamics (in general, local moves like selecting a monomer at random and then moving it in a randomly chosen direction by a small distance), but all reversible choices lead to the correct equilibrium distribution of states. One can therefore invent MC moves targeted at overcoming the main physical barriers leading to slow equilibration of a model system. The two main sources for slow relaxation in polymers are entanglement effects and the glass transition. The first is entropic in origin, whereas the second, at least in chemically realistic polymer models, is primarily enthalpic. We write the largest relaxation time in the melt as

$$\tau_l(T, N) = \tau_{mes}(T)\, N^x \qquad [24]$$
where $\tau_{mes}$ is a mesoscopic time scale. The chain length dependence crosses over from $x = 2$ for Rouse behavior to $x = 3.4$ for reptating chains.12 Every simulation method that performs configuration changes typical for the mesoscopic time scale $\tau_{mes}$, i.e., local rearrangements, leads to a relaxation of the large-scale structure of the polymer chains in the melt only after $O(N^x)$ such configuration changes. This in turn quickly limits the range of chain lengths one can treat in thermodynamic equilibrium.

Figure 4 Sketch of the double-bridging algorithm. Starting from monomer $i$ on the white chain, a trimer bridge to monomer $j$ on the black chain is initiated. If the formation of this connection is geometrically possible, then a bridge between monomers $i'$ and $j'$ has to be built, as they are four monomers removed from $i$ and $j$, respectively. If both bridges can be formed, the intermediate monomers are excised. Two new chains with the same chain length as the original ones are created.

To circumvent this problem,
one has to use advanced MC techniques that implement global configurational changes within a single Monte Carlo step. A class of these advanced MC techniques consists of the so-called ‘‘connectivity altering moves’’ like the cooperative motion algorithm52 and the end-bridging algorithm53–55 and its newest variant, the double-bridging algorithm,56,57 which is sketched in Figure 4. The latter two algorithms have been developed with chemically realistic polymer models in mind, and we will now briefly discuss the concepts behind, and properties of, these algorithms. In the original end-bridging algorithm,53–55 an end monomer i of one chain in the melt attacks a backbone atom j of another chain that is sufficiently close in proximity and tries to initiate a change in connectivity of the two involved chains by forming a trimer bridge to this backbone atom. In the event of a successful bridging, the attacking chain grows by a part that is cut off of the attacked chain, and the attacked chain shrinks by a corresponding amount. This description already exhibits the main drawbacks of the algorithm: It generates polydisperse polymer melts, and it needs a sufficient number of chain ends to be efficient. It was found empirically that the efficiency of the algorithm dropped considerably (1) as the stiffness of the chains was increased and (2) in the presence of chain orientation. The algorithm was nonetheless applied successfully to polyethylene melts58,59 and cis-1,4 polyisoprene melts,60,61 for example. In the double-bridging algorithm, an inner monomer of a chain attacks an inner monomer of another chain (or the same chain) and tries to form a
trimer bridge (see Figure 4). Simultaneously another bridge is formed between two monomers that are four steps apart from the first two monomers along the two chains, thereby generating two new chains with exactly the same length as the original chains. These requirements understandably put heavy geometric constraints on the configurations of the two chains for which this type of move is feasible (because only special choices of involved monomers $(i, j)$ and excised trimers conserve the chain lengths in the move). The trimer bridge is made of three monomers (atoms) connected by bonds of fixed length $l$ making a fixed angle, which is chosen to be the maximum value of the bond-angle distribution of the model. Whenever the monomers $i$ and $j$ are at a distance less than the maximum bridgeable distance of $4l\cos((\pi - \theta_{\max})/2)$, this geometric problem is in principle solvable. Connectivity changing algorithms are especially efficient in decorrelating large-scale structure in the melt. These algorithms are typically augmented by local moves and reptation moves (a randomly chosen end monomer of a chain is cut off and reattached to the opposite end of the same chain with a random orientation) to equilibrate the local structure.41

An alternative method for overcoming the entropic slowdown in a polymer melt caused by packing and connectivity is the so-called ‘‘4d-algorithm.’’ The idea of this algorithm62 is to turn some monomers into ghost particles (alternatively, one can think of removing some particles into the fourth dimension, which is similar to desorbing and readsorbing particles from a two-dimensional film into the third dimension) and then forcing those particles back into the three-dimensional structure by applying an external field in an extended ensemble simulation.
The algorithm is similar to other suggestions to reduce the packing effects.63–65 So far it has been tested on the structural relaxation of a collapsed polymer globule where the connectivity of the chains and the high density inside the globule lead to a dramatic increase in the structural relaxation time of the globule. True equilibration in a polymer melt is only reached when the largest length scales (e.g., the end-to-end orientation of the chains) are decorrelated. The connectivity altering moves overcome the chain length dependence of this time scale, and the extended ensemble simulations overcome some packing effects influencing the prefactor, but we have not yet discussed any method that can overcome completely the slowing down in the prefactor of Eq. [24], i.e., the deceleration of the structural relaxation accompanying the glass transition. No efficient MC algorithm exists yet to overcome the slowdown of structural relaxation connected with the glass transition! All MC moves employed thus far share the same fate as MD simulations, which face an increase of relaxation time (that is, the simulation time needed) by 14 orders of magnitude on approaching $T_g$. Because this range of relaxation times cannot be covered by modern computing machinery, one is limited to following the glass transition in equilibrium over about 3–4 orders of magnitude in the relaxation time.
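The interplay of Eqs. [18]–[23] can be checked numerically on a toy system. The sketch below is a minimal illustration under stated assumptions: five states on a ring with arbitrary energies, a symmetric suggestion probability $W_0 = 1/2$ for each neighbor, and the Metropolis choice for $W_T$. It builds the rates of Eq. [22] and iterates the master equation of Eq. [18]; the state space, energies, and function names are hypothetical and serve only to show that the iteration relaxes toward the Boltzmann distribution of Eq. [20].

```python
import math

def boltzmann(energies, beta):
    """Equilibrium distribution of Eq. [20] for a finite set of states."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)                     # partition function, Eq. [21]
    return [w / z for w in weights]

def metropolis_rates(energies, beta):
    """Rates W = W0 * WT (Eq. [22]) for nearest-neighbor hops on a ring."""
    n = len(energies)
    w = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for xp in ((x - 1) % n, (x + 1) % n):
            w0 = 0.5                     # symmetric suggestion, Eq. [23]
            wt = min(1.0, math.exp(-beta * (energies[xp] - energies[x])))
            w[x][xp] = w0 * wt
    return w

def master_equation_step(p, w):
    """One discrete time step of the master equation, Eq. [18]."""
    n = len(p)
    return [p[x] + sum(w[xp][x] * p[xp] - w[x][xp] * p[x] for xp in range(n))
            for x in range(n)]

ENERGIES = [0.0, 1.0, 0.5, 2.0, 1.5]     # arbitrary illustrative energies
BETA = 1.0
W = metropolis_rates(ENERGIES, BETA)
P_EQ = boltzmann(ENERGIES, BETA)

p = [1.0, 0.0, 0.0, 0.0, 0.0]            # start with all weight in state 0
for _ in range(2000):
    p = master_equation_step(p, W)
```

Any other symmetric suggestion scheme could replace the nearest-neighbor hops without changing the stationary distribution; only the speed of relaxation toward it would differ, which is the freedom that connectivity-altering moves exploit.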
Molecular Dynamics Method

MD involves integrating Newton’s equations of motion, which we write in the Hamiltonian form

$$\frac{d\vec{r}_i}{dt} = \vec{v}_i, \qquad \frac{d\vec{v}_i}{dt} = \vec{F}_i / m_i \qquad [25]$$
where $m_i$ is the mass of particle $i$, $\vec{r}_i$ is its position, $\vec{v}_i$ is its velocity, and $\vec{F}_i = -\nabla_i V$ is the force acting on it, where $V$ is the force field in the Hamiltonian. Because of its conceptual simplicity, the MD method is contained in almost all commercial simulation packages and is therefore widely used. To use it correctly and efficiently, however, requires having some knowledge about the integrators employed and the strengths and limitations of the method. The numerical solution of Eq. [25] is typically performed using the velocity Verlet integrator,37,38 which is a second-order symplectic integrator.66 Symplectic integrators conserve phase space volume and are therefore reversible, endowing them with an excellent stability even for relatively large time steps and making them good at conserving energy along a microcanonical trajectory. The time step in the MD integrator is limited by the fastest degrees of freedom in the material being modeled. When we denote with $\tau_f$ a typical vibrational period of such a fast degree of freedom, the integration step $\Delta t$ has to be of the order of $\tau_f/30$ to $\tau_f/10$. The theory of symplectic integrators is also the starting point to derive multiple time-step integrators,67,68 which increase the efficiency of the MD simulation scheme by calculating weak forces less frequently than strong forces. One can also constrain certain fast degrees of freedom using methods that ensure that constraints are conserved.34,69,70 Between MC and MD methods, Brownian dynamics (sometimes called stochastic dynamics) methods exist:71

$$d\vec{r}_i = \vec{v}_i\, dt, \qquad d\vec{v}_i = \left( \frac{\vec{F}_i}{m_i} - \gamma \vec{v}_i \right) dt + \sigma\, d\vec{W}_i(t) \qquad [26]$$

The $d\vec{W}_i$ are Gaussian white noise processes, and their strength $\sigma$ is related to the kinetic friction $\gamma$ through the fluctuation-dissipation relation.72 When deriving integrators for these methods, one has to be careful to take into account the special character of the random forces employed in these simulations.73 A variant of the velocity Verlet method, including a stochastic dynamics treatment of constraints, can be found in Ref. 74. The stochastic
simulation methods introduce an additional, external dissipation mechanism into the simulation, which alters the native dynamics of the model system under study on time scales larger than the typical $1/\gamma$ time scale for this external dissipation mechanism. This is also true for MD simulations in ensembles other than the microcanonical ensemble. The Nosé–Hoover method75,76 for MD simulations in the canonical ensemble introduces a time-dependent friction $\gamma_{NH}$ as an additional dynamical variable. Because its average value along the trajectory is small, $\langle \gamma_{NH} \rangle \ll 1$, it affects only very slow processes. This is not true, however, for MD simulations in the NpT ensemble, where the additional barostat dynamics77 strongly changes the intrinsic dynamics of the system. The safest approach is to work in the microcanonical ensemble, if possible. Starting from an equilibrated configuration, the MD method is used to generate a trajectory of the model system, and one typically stores configurations along the trajectory for a postsimulation analysis of the relaxation processes.
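As an illustration of the velocity Verlet scheme mentioned above for Eq. [25], the following sketch integrates a single harmonic degree of freedom with a time step of one thirtieth of its vibrational period, the lower end of the range quoted in the text. The oscillator, its parameters, and the function names are assumptions made for the example; a production MD code would apply the same update to all particles with forces from the full force field.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one degree of freedom; returns the list of (x, v) pairs."""
    traj = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass      # half kick with the old force
        x = x + dt * v_half                   # drift
        f = force(x)                          # force at the new position
        v = v_half + 0.5 * dt * f / mass      # half kick with the new force
        traj.append((x, v))
    return traj

K_SPRING, MASS = 1.0, 1.0                     # illustrative oscillator
PERIOD = 2.0 * math.pi * math.sqrt(MASS / K_SPRING)
DT = PERIOD / 30.0                            # time step of tau_f / 30

def harmonic_force(x):
    return -K_SPRING * x

def energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x

trajectory = velocity_verlet(1.0, 0.0, harmonic_force, MASS, DT, 30 * 100)
energies = [energy(x, v) for x, v in trajectory]
max_rel_dev = max(abs(e - energies[0]) / energies[0] for e in energies)
```

Over the 100 periods simulated here the energy oscillates within a narrow band instead of drifting, and reversing the final velocity and integrating for the same number of steps retraces the trajectory back to the starting point up to rounding.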
THERMODYNAMIC PROPERTIES

In this section we will discuss two approaches commonly used by scientists to study the glass transition in polymers when they determine the temperature dependence of thermodynamic properties. These properties include the specific volume and the specific entropy. As discussed in the Introduction, the break in the temperature dependence of the specific volume served as one of the first experimental measures of the glass transition.4 In the experiment by Kovacs, a very careful study was performed of the cooling rate dependence of the glass transition temperature. The break in the temperature dependence of the specific volume signifies that the system’s internal relaxation times have reached the time scale of the experiment (inverse cooling rate), at which point the system is no longer in equilibrium. With simulations one can also use the temperature dependence of other properties. These properties include the internal energy or the size of the polymer chains as determined from the radius of gyration of the chains,78 both of which are readily accessible via computation. When using an atomic scale MD simulation of this cooling process, the integration time step $\delta t$ is typically as small as 1 fs ($10^{-15}$ s), and accordingly, even very long runs (on the order of $10^7$ or $10^8$ time steps) will not exceed a time interval of 100 ns. It is not even clear that on such short time scales the true physical traits of the glass transition can emerge fully. The ‘‘glass transition’’ that was claimed to be observed in an atomistic model for polyethylene chains from simulation time periods less than 1 ns79,80 implied that a freezing of individual jumps between the energy minima of the torsional potential of the chains took place, rather than a more collective behavior (‘‘cooperatively rearranging units’’81) that might occur in the system at somewhat lower
temperatures and on much longer time scales. Even if one takes the view that the only influence the short time window (ns) of MD simulation has is to shift the apparent glass transition temperature $T_g$(MD) upward in comparison with that of experiment $T_g$(expt), one still has to assess the amount of this upward shift. Practitioners of MD simulation studying glass forming fluids often compare MD results on glass transition temperatures directly with experimental data,82–84 ignoring this systematic difference between $T_g$(MD) and $T_g$(expt). For ‘‘strong’’ glass formers such as molten SiO2, this issue has been studied carefully.85 A very strong dependence of $T_g$(MD) on the cooling rate $\Gamma$ was found (in the case of SiO2 the rates were in the range $10^{12}$ K/s $< \Gamma <$ $10^{15}$ K/s, which is many orders of magnitude higher than in the experiments86). The computed $T_g$(MD) differed from $T_g$(expt) $\approx 1450$ K by more than 1000 K! Because of the much steeper variation of the structural relaxation time $\tau$ with $T$ near $T_g$(expt), one does not expect such dramatic effects for fragile glass formers like most polymers, and indeed for simulations,82–84 it was found that $T_g$(MD) agreed with $T_g$(expt) reasonably well. One reason for ignoring the cooling rate dependence of the simulated $T_g$ in these publications is that MD simulations of chemically realistic models are generally too time consuming for a systematic study of different cooling rates (for an exception, see Ref. 87), especially when one takes into account that for each cooling rate one must average over several independent cooling runs, as will be described later. To evaluate cooling rate dependence, one therefore best uses a coarse-grained model, as was done for the bead-spring model suggested in Refs. 88 and 89. In the latter paper, the melting temperature of the bead-spring model was determined to be $T = 0.76$ in Lennard–Jones units.
Upon cooling at a fixed rate in an MD simulation at constant temperature and pressure (NpT) employing a Nosé–Hoover thermostat75,76 and an Andersen barostat,77 one observes the temperature-dependent specific volumes shown in Figure 5. In the top panel, the temperature dependence of the specific volume is shown for a cooling rate of $\Gamma = 52.083 \times 10^{-6}$. Straight line fits in the melt and in the glassy phase assume a constant thermal expansion coefficient in both phases. The intersection point between the straight lines defines the glass transition temperature $T_g(\Gamma)$ for this cooling rate. In the lower panel, the fit curves are shown for different cooling rates. The melt curve does not depend on the cooling rate, but the glass curves show a systematic shift, although, within the uncertainty of the data, the slope of the curves (the thermal expansion coefficient) is independent of the cooling rate also in the glass. It is obvious from the plot that only a small (but systematic) variation in $T_g(\Gamma)$ exists in the range of cooling rates that was accessible in the simulation ($3.3 \times 10^{-6} < \Gamma < 8.3 \times 10^{-4}$ in Lennard–Jones units).

[Figure 5 plot: ln(V) versus T for 0.3 ≤ T ≤ 0.6 in two panels; lower-panel legend: cooling rates 8.3 × 10^-4, 4.2 × 10^-4, 5.2 × 10^-5, 6.5 × 10^-6]

Figure 5 The upper panel shows the logarithm of the specific volume as a function of temperature for a cooling rate $\Gamma = 52.083 \times 10^{-6}$, with error bars determined from 55 independent cooling runs. The lines are fits with a constant expansion coefficient in the melt (continuous line) and glass phase (dashed line), respectively. The lower panel shows the common fit curve for all cooling rates in the melt and fit curves in the glass for four cooling rates given in the legend.

To interpret the cooling rate dependence of the glass transition temperature, one can use the Vogel–Fulcher law discussed in the section on the
phenomenology of the glass transition. If one assumes that the break in the observed temperature dependencies occurs when the internal relaxation time is equal to the time scale of the cooling experiment $\tau_{exp}$, one obtains

$$\tau_{exp} = \tau_\infty \exp\{E_A / (T_g(\Gamma) - T_0)\} \qquad [27]$$

In a stepwise cooling experiment, $\tau_{exp}$ is equal to the time spent at every temperature step

$$\tau_{exp} = \Delta T / \Gamma \qquad [28]$$

so that one obtains

$$T_g(\Gamma) = T_0 - E_A / \ln(\tau_\infty \Gamma / \Delta T) \qquad [29]$$
Applying this prediction to the cooling rate dependence of the break points in the specific volume curves, one obtains a Vogel–Fulcher temperature of $T_0 = 0.35$ that agrees well with that determined from the temperature dependence of the diffusion constant in this model, which is $T_0^D = 0.32$. From these results obtained from MD simulations with a coarse-grained model, the following picture emerges. When one cools down a model system in a computer simulation, the break in the temperature dependence of the specific volume indicates the temperature at which the time scale of those internal relaxation processes involved in volume relaxation equals the time scale of the cooling process. In strong glass formers, where the typical time scales at a high temperature increase in an Arrhenius fashion with an activation energy that is much higher than for the fragile glass formers, this means that the system falls out of equilibrium on the time scales accessible in an MD simulation at temperatures that are much higher than the experimental glass transition temperature and that consequently we obtain a very bad estimate for this temperature. For fragile glass formers, in contrast, the high-temperature increase of relaxation times is in general slow, thus often allowing the simulations to approach more closely to the experimental $T_g$ before falling out of equilibrium. As a result, the glass transition temperatures can be in reasonable agreement with the experimental data. For polymers, where a larger high-temperature activation energy for volume relaxation exists, arising from a coupling to conformational rearrangements involving activated jumps over large dihedral barriers, for example, one anticipates the cooling method to be incapable of locating the experimentally relevant $T_g$ on typical simulation time scales.
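The two numerical steps of this analysis, locating $T_g$ as the crossing point of the melt and glass fit lines and relating it to the cooling rate through Eqs. [27]–[29], can be sketched as follows. Only $T_0 = 0.35$ is taken from the text; the activation energy, the attempt time $\tau_\infty$, the temperature step $\Delta T$, and all function names are hypothetical values chosen for illustration.

```python
import math

def line_intersection(slope_a, icpt_a, slope_b, icpt_b):
    """Temperature at which two fitted straight lines cross."""
    return (icpt_b - icpt_a) / (slope_a - slope_b)

def vogel_fulcher_time(temp, t0, e_a, tau_inf):
    """Relaxation time of Eq. [27] at temperature temp > t0."""
    return tau_inf * math.exp(e_a / (temp - t0))

def tg_from_cooling_rate(gamma, t0, e_a, tau_inf, delta_t):
    """Glass transition temperature of Eq. [29] for cooling rate gamma."""
    arg = tau_inf * gamma / delta_t
    if not 0.0 < arg < 1.0:
        raise ValueError("requires 0 < tau_inf * gamma / delta_t < 1")
    return t0 - e_a / math.log(arg)

T0 = 0.35                                 # Vogel-Fulcher temperature from the text
E_A, TAU_INF, DELTA_T = 0.4, 1.0, 0.02    # hypothetical parameters
RATES = [8.3e-4, 4.2e-4, 5.2e-5, 6.5e-6]  # cooling rates from the Figure 5 legend
TG = [tg_from_cooling_rate(g, T0, E_A, TAU_INF, DELTA_T) for g in RATES]
```

With these assumptions $T_g(\Gamma)$ decreases monotonically toward $T_0$ as the cooling rate is lowered, but only logarithmically, which is consistent with the small variation of $T_g$ over the accessible range of rates.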
This reasoning also means that we were not really describing a thermodynamic measurement of the glass transition in a polymer melt but instead a macroscopic determination of the temperature dependence of volume-related internal relaxation processes, i.e., a dynamic measurement in the disguise of a thermodynamic measurement. Let us now turn to a discussion of the relation of the temperature dependence of the polymer melt’s configurational entropy with its glass transition and address the famous paradox of the Kauzmann temperature of glass-forming systems.90 It had been found experimentally that the excess entropy of super-cooled liquids, compared with the crystalline state, seemed
to vanish when extrapolated to low temperatures. The extrapolated temperature of vanishing excess entropy, the so-called Kauzmann temperature $T_K$, is generally in close agreement with the extrapolated Vogel–Fulcher temperature $T_0$ derived from the temperature dependence of relaxation times. A theoretical derivation of the Kauzmann temperature for polymeric glass formers was given by Gibbs and Di Marzio.91 Consider the canonical partition function of $K$ polymer chains of chain length $N$ in a volume $V$,

$$Z = \sum_E \Omega(E, K, N, V) \exp(-E / k_B T) \qquad [30]$$
where $E$ is the internal energy of the system and $\Omega$ is the microcanonical partition function (i.e., the total number of states). For simplicity, a lattice with $M$ sites is considered with

$$M = \frac{L^3}{8} = KN + H \qquad [31]$$
with $H$ being the number of vacant sites (holes). Equation [31] is valid for a simulation of the glass transition in the bond-fluctuation lattice model where each repeat unit occupies the eight lattice vertices of a unit cube. The glass transition in this model was studied by employing a Hamiltonian that singled out the bonds of the class [3,0,0] giving them zero energy92

$$\mathcal{H}(b) = \begin{cases} 0 & \text{if } b \in [3,0,0] \\ \mathcal{E} & \text{otherwise} \end{cases} \qquad [32]$$
The entropy density $s$ is then

$$s = (\ln \Omega) / M \qquad [33]$$

In the thermodynamic limit $M \to \infty$, one can consider $s = s(e, \rho)$, where $e = E/M$ is the internal energy per lattice site and $\rho = KN/M$ is the monomer density. We can also do this as a function of temperature because $e$ can be replaced by $T$ via the appropriate Legendre transformation. Variants of an approximate calculation of the configurational entropy of lattice chains have been developed by Flory,93 Gibbs and Di Marzio,91 and Milchev.94 All three treatments write $\Omega$ as a product of an intrachain ($\Omega_{intra}$) contribution and an interchain ($\Omega_{inter}$) contribution

$$\Omega = \Omega_{intra}\, \Omega_{inter} \qquad [34]$$
In Flory’s original treatment, $\Omega_{intra}$ accounts for the increase of the chain stiffness when the temperature is lowered. Flory93 described this chain stiffening by
an energy $\varepsilon$ if two consecutive bonds along a chain are not collinear, whereas no energy is assigned if they are collinear. In essence, this calculation relies on the fact that one has a two-level system for the internal degrees of freedom as was realized in the choice of Hamiltonian in Eq. [32]. If we denote by $f$ the probability of finding a bond in the excited state, we get for the intramolecular part of the partition function, neglecting excluded volume effects but assuming a nonreversal random walk, Eq. [35]

$$\Omega_{intra} = \binom{K(N-1)}{fK(N-1)} \left( \frac{z-2}{z-1} \right)^{fK(N-1)} \left( \frac{1}{z-1} \right)^{(1-f)K(N-1)} \qquad [35]$$
where $z$ is an effective coordination number in the melt. The last term in this equation is obviously correct for the original Flory model where $z - 1$ possibilities exist for the next bond, only one of which is straight and does not carry an energy penalty. For the bond-fluctuation model with the Hamiltonian given by Eq. [32], 1 in 12 bonds does not carry an energy penalty. Furthermore, the effective coordination number in the melt is around 12, so again one can assume that one of the $z - 1$ neighbor bonds is in the ground state. The treatments of Flory,93 Gibbs and Di Marzio,91 and Milchev94 differ in the way they calculate the second factor $\Omega_{inter}$. This microcanonical partition function describes the number of ways in which the $K$ chains can be put on the lattice,

$$\Omega_{inter} = 2^{-K} \frac{1}{K!} \prod_{k=0}^{K-1} \nu_{k+1} \qquad [36]$$
with $\nu_{k+1}$ being the total number of configurations of the $(k+1)$th chain if there are already $k$ chains on the lattice, which can be approximated by

$$\nu_{k+1} \approx (M - kN)\, z (z-1)^{N-2}\, N_{empty}^{N-1} \qquad [37]$$
Here, $M - kN$ is the number of empty sites after $k$ chains have been placed on the lattice and constitutes the number of potential starting points for the $(k+1)$th chain. The factor $z(z-1)^{N-2}$ represents the number of possibilities to place the remaining $N - 1$ monomers of the chain after the first monomer has been placed, forbidding only the immediate back-folding of the walk. The factor $N_{empty}^{N-1}$, which for the bond-fluctuation model counts the number of empty unit cubes, accounts approximately for the chains being self-avoiding and mutually avoiding and is approximated in the three approaches as in Eqs. [38]–[40]:

$$N_{empty} = 1 - kN/M \quad \text{(Flory)} \qquad [38]$$

or

$$N_{empty} = \frac{1 - kN/M}{1 - k(N-1)/(Mz/2)} \quad \text{(Gibbs–Di Marzio)} \qquad [39]$$

or

$$N_{empty} = \frac{1 - kN/M}{1 - k/K} \quad \text{(Milchev)} \qquad [40]$$
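The three approximations for $N_{empty}$ in Eqs. [38]–[40] are easy to compare numerically. The sketch below codes them as plain functions; the lattice size, chain length, number of chains, and coordination number used in the example are hypothetical values, not those of the simulations discussed in the text.

```python
def n_empty_flory(k, n, m):
    """Eq. [38]: fraction of sites not yet occupied by the first k chains."""
    return 1.0 - k * n / m

def n_empty_gibbs_dimarzio(k, n, m, z):
    """Eq. [39]: Flory's estimate corrected for bonds already placed."""
    return (1.0 - k * n / m) / (1.0 - k * (n - 1) / (m * z / 2.0))

def n_empty_milchev(k, n, m, n_chains):
    """Eq. [40]: Flory's estimate divided by the fraction of chains left."""
    return (1.0 - k * n / m) / (1.0 - k / n_chains)

# Hypothetical example: K = 50 chains of length N = 10 on M = 1000 sites, z = 12.
N, M, K, Z = 10, 1000, 50, 12
table = [(k,
          n_empty_flory(k, N, M),
          n_empty_gibbs_dimarzio(k, N, M, Z),
          n_empty_milchev(k, N, M, K))
         for k in range(K)]
```

All three expressions equal 1 for the first chain ($k = 0$); for every later chain both corrections raise the estimate above Flory's value, because each divides it by a factor smaller than 1.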
The idea behind these corrections is to recognize that not all empty lattice sites can serve as starting points for the new polymer; only those lying outside of the volume already consumed by the other $k$ chains can be used. Unfortunately, neither Eq. [37] nor the expressions for $N_{empty}$ can be justified with mathematical rigor. For both the Flory93 and the Gibbs and Di Marzio91 approximations, the entropy at low temperatures is negative (in the limit $N \to \infty$ and $\rho \to 1$)

$$s(T \to 0) = -1 \quad \text{(Flory)} \qquad [41]$$

$$s(T \to 0) = \left( \frac{z}{2} - 1 \right) \ln\left( 1 - \frac{2}{z} \right) < 0 \quad \text{(Gibbs–Di Marzio)} \qquad [42]$$
whereas Milchev’s94 entropy remains non-negative ($s(T \to 0) = 0$). A comprehensive account of these different mean-field-like theories has been given by Wittmann.95 To test these theoretical approaches in a computer simulation, one needs to proceed in several steps. In the first step, one determines the entropy per monomer in the simulation by measuring the energy per monomer $e(\rho, T)$ and by calculating the free energy per monomer through thermodynamic integration of the excess chemical potential96

$$s(\rho, T) = \frac{e(\rho, T)}{T} - \frac{\rho}{N} \left[ \ln \frac{\rho}{N} - 1 \right] + \frac{\rho}{N} \ln Z_p(N, T) - \frac{1}{NT} \int_0^{\rho} \mu_{ex}(\rho', T)\, d\rho' \qquad [43]$$
where $Z_p(N, T)$ is the partition function of a single chain of length $N$. The excess chemical potential can be measured according to

$$\mu_{\mathrm{ex}}(\rho, T) = -T \ln p_{\mathrm{ins}}(\rho, T) \qquad [44]$$
by evaluating the insertion probability for a chain of length $N$ into a solution at density $\rho$.97,98 In the second step, one measures the fraction of bonds in the excited state $f$, the effective coordination number $z$ (the number of monomers around a given monomer in the melt that are within a distance given by the maximum bond length), and the number of holes $H$ (the fraction of the empty lattice sites where one can put another monomer of the bond-fluctuation model) and inserts these temperature-dependent quantities into Eqs. [38]–[40] for the specific entropy. A comparison between the three theories and computer simulation is shown in Figure 6. The theories of Flory and Gibbs–Di Marzio result in practically identical predictions, both leading to a Kauzmann paradox (negative excess entropy) around $T = 0.18$, which is a temperature in the vicinity of the Vogel–Fulcher temperature, $T_0 \approx 0.13$, determined for this model. Both theories, however, strongly underestimate the value of the configurational entropy, which is always positive. The reason is that they underestimate the intermolecular part of the partition function. This underestimation can be observed when comparing the behavior at $1/T = 0$ with the simulation data and with the results of Milchev’s theory, which agrees more closely with the simulation data and stays positive throughout the whole temperature range. All theories reproduce the shape of the simulated curve reasonably well, which explains why the Gibbs–Di Marzio theory is so successful in predicting experimental results on the glass transition in polymer melts that depend only on derivatives of the specific entropy. From the simulations we conclude, however, that the Kauzmann paradox of vanishing excess entropy is an extrapolation artifact and that the theoretical descriptions reproducing this finding are based on inappropriate approximations. These studies therefore have not revealed any evidence for a phase transition underlying the glass transition in polymer melts.

Figure 6 Entropy per monomer in the bond-fluctuation model as a function of inverse temperature. The results from the simulation (filled circles) are compared with the theoretical predictions discussed in the text. (Axes: entropy versus $1/T$; legend: Flory, Gibbs–Di Marzio, Milchev, Simulation.)

The strong reduction in specific entropy that is observable in experiment as well as in simulation, i.e., the reduction in configuration space available to
the chains in the melt, has been linked by Adam and Gibbs [81] to the slowing down of the dynamics of the system. In the Adam–Gibbs theory, the center of mass self-diffusion coefficient of the chains is related to the entropy by

$$D(T) = D(\infty) \exp\left[-\frac{A}{T S(T)}\right] \qquad [45]$$

Using the specific entropy determined in the simulations, one can test this theoretical approach by fitting this expression to the temperature dependence of $D$ observed in the simulations. It has been concluded that the Adam–Gibbs theory cannot predict the temperature dependence of the dynamics from the thermodynamic information contained in the temperature dependence of the entropy.92 In the next sections we will focus on analyzing the dynamics of supercooled liquids in more detail and discuss our findings in terms of the mode-coupling theory of the glass transition, which is a liquid state theory that predicts the dynamics from the structural properties of the liquid.
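A test of Eq. [45] of the kind described above can be set up as a straight-line fit, because $\ln D$ is linear in $1/(T S)$. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the temperatures, entropies, and parameters are synthetic numbers invented for the example, not simulation data from the chapter.

```python
import math


def adam_gibbs_diffusion(T, S, D_inf, A):
    """Eq. [45]: D(T) = D(inf) * exp(-A / (T * S(T)))."""
    return D_inf * math.exp(-A / (T * S))


def fit_adam_gibbs(temps, entropies, diffusions):
    """Least-squares line through ln D versus 1/(T S); returns (D_inf, A)."""
    x = [1.0 / (T * S) for T, S in zip(temps, entropies)]
    y = [math.log(D) for D in diffusions]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope


# Synthetic (hypothetical) data generated from Eq. [45] itself.
temps = [0.3, 0.4, 0.5, 0.7, 1.0]
entropies = [0.06, 0.10, 0.13, 0.17, 0.21]
diffusions = [adam_gibbs_diffusion(T, S, 0.05, 0.02)
              for T, S in zip(temps, entropies)]
D_inf_fit, A_fit = fit_adam_gibbs(temps, entropies, diffusions)
```

With real simulation data, systematic curvature in a plot of $\ln D$ against $1/(T S)$ would signal the kind of failure of the Adam–Gibbs relation reported in the text.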
DYNAMICS IN SUPER-COOLED POLYMER MELTS

It is not easy to obtain a crystalline polymeric material. In order to crystallize, a homopolymer chain built of simple regular building blocks, e.g., a bead-spring polymer model, can arrange the beads on some regular cubic lattice and have the chains run along one of the lattice directions.89 To do this, the bead-spring chain must change its random-coil-like state into a stretched out state. In real polymer chains, the repeat units will attempt to obtain locally ground state conformations given by the dihedral conformer energies. These conformers may result in non-space-filling structures. Additionally there is polydispersity or chemical randomness, as in the 1,4-polybutadiene (PB) example that we will discuss in the section on “Dynamics in 1,4-Polybutadiene.” Polymers, therefore, in principle, should be good candidates for (quasi-) equilibrium theories of super-cooled liquids. A liquid state theory that can describe the onset of slowdown in super-cooled liquids successfully is the mode coupling theory (MCT).99–101 It is a microscopic approach to the glass transition starting from the observation of the freezing-in of the structural relaxation in the glass transition. The theory assumes density fluctuations to be the dominating slow variables in glass-forming systems. Although the theory was formulated originally for simple (monatomic) fluids only, it is believed to be of much wider applicability and it has been applied to interpret experiments on the polymer glass transition. Starting from the Liouville equation as the fundamental microscopic evolution equation for the dynamics of all phase-space variables, MCT uses
well-established projection operator techniques that are used for eliminating the fast variables to arrive at an equation for the correlation functions of density fluctuations in Fourier space, i.e., the intermediate scattering functions:

$$\frac{\partial^2}{\partial t^2} S(q,t) + \Omega_q^2\, S(q,t) + \zeta_q \frac{\partial}{\partial t} S(q,t) + \Omega_q^2 \int_0^t m_q(t - t')\, \frac{\partial}{\partial t'} S(q,t')\, dt' = 0 \qquad [46]$$

In this generalized oscillator equation, the frequency $\Omega_q$ is related to the restoring force acting on a particle and $\zeta_q$ is a friction constant. The key quantity of the theory is the memory kernel $m_q(t - t')$, which involves higher order correlation functions and hence needs to be approximated. The memory kernel is expanded as a power series in terms of $S(q,t)$:

$$m_q(t) = \sum_{i=1}^{i_0} \frac{1}{i!} \sum_{k_1 \ldots k_i} V^{(i)}(q; k_1, \ldots, k_i)\, S(k_1, t) \cdots S(k_i, t) \qquad [47]$$
The coefficients $V^{(i)}$ of this mode-coupling functional are the basic control parameters of this idealized version of MCT. One sees that Eqs. [46] and [47] amount to a set of nonlinear equations for the correlators $S(q,t)$ that must be solved self-consistently. The basic qualitative prediction of MCT is that, upon lowering the temperature or increasing the density of a melt, one observes a separation of time scales between the microscopic dynamics and the structural relaxation, leading to a two-step decay of all relaxation functions. One imagines the molecules to be trapped within a cage formed by their neighbors for some time span between the short time dynamics and the large-scale structural relaxation that comes about when particles leave their cages. The long-time behavior is the structural (or $\alpha$-) relaxation, and the plateau regime occurring between the vibrational dynamics and this $\alpha$-relaxation is termed the $\beta$-regime in the theory. For a well-developed intermediate plateau regime between microscopic and structural relaxation, MCT predicts solutions of the type

$$S(q,t) = f_q^c + h_q\, G(t/t_0, \sigma) \qquad [48]$$

in a time window $t_0 \ll t \ll \tau'_\sigma$, where $f_q^c$ is the nonergodicity parameter, $h_q$ is some wave-vector dependent amplitude, $G$ is a $q$-independent scaling function of time, and $t_0$ is the microscopic scale. The separation parameter $\sigma \propto 1 - T/T_c$ measures the distance from the singularity representing the “ideal” glass transition. For $\sigma \geq 0$, one has

$$\lim_{t \to \infty} S(q,t) = f_q^c + h_q \sqrt{\sigma/(1 - \lambda)} + O(\sigma) \qquad [49]$$
where the parameter $\lambda$ ($\lambda < 1$) is called the “exponent parameter.” For $\sigma < 0$, on the other hand, one has $\lim_{t \to \infty} \phi_q(t) = 0$. For $\sigma \to 0$, the function $G(t/t_0, \sigma)$ can be linked to a universal correlation function $g_\pm(t/\tau_\sigma)$, where “$+$” indicates $\sigma > 0$ and “$-$” indicates $\sigma < 0$:

$$G(t/t_0, \sigma) = |\sigma|^{1/2}\, g_\pm(t/\tau_\sigma), \qquad \sigma > 0\ (< 0) \qquad [50]$$

where

$$\tau_\sigma = t_0\, |\sigma|^{-1/(2a)} \qquad [51]$$

and the exponent $a$ is related to $\lambda$ by

$$\lambda = \frac{\Gamma(1 - a)^2}{\Gamma(1 - 2a)} \qquad [52]$$

For short times, one has a power law decay

$$g_\pm(t/\tau_\sigma) = \left(\frac{t}{\tau_\sigma}\right)^{-a} \pm A_1 \left(\frac{t}{\tau_\sigma}\right)^{a} + \ldots \qquad [53]$$

where $A_1$ is some amplitude. For large $t/\tau_\sigma$, $g_+(t/\tau_\sigma \to \infty) = 1/\sqrt{1 - \lambda}$, consistent with Eq. [49]: the correlator approaches structural arrest. The liquid phase solution, on the other hand, exhibits another power law, the so-called von Schweidler law,

$$g_-(t/\tau_\sigma) = -B \left(\frac{t}{\tau_\sigma}\right)^{b}, \qquad \frac{t}{\tau_\sigma} \gg 1 \qquad [54]$$

where $B$ is another amplitude, and the von Schweidler exponent $b$ is related to $\lambda$ as

$$\frac{\Gamma(1 + b)^2}{\Gamma(1 + 2b)} = \lambda \qquad [55]$$

One can rewrite the von Schweidler law using Eqs. [48], [50]–[52], [54], and [55] as follows:

$$S(q,t) = f_q^c - h_q B\, (t/\tau'_\sigma)^{b} \qquad [56]$$

where $\tau'_\sigma$ is a characteristic time scale that diverges as the ideal glass transition is approached from above,

$$\tau'_\sigma = t_0\, |\sigma|^{-\gamma}, \qquad \gamma = \frac{1}{2a} + \frac{1}{2b} \qquad [57]$$
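Equations [52], [55], and [57] tie all MCT exponents to a single number, so knowing one of them fixes the others. The following Python sketch illustrates this chain, assuming the usual reading of Eqs. [52] and [55] in terms of the Gamma function; the function names are hypothetical, and Eq. [52] is inverted by simple bisection. As input it uses $b = 0.75$, the von Schweidler exponent quoted later in this chapter for the bead-spring model, for which the chapter reports $\gamma = 2.09$.

```python
import math


def lambda_from_b(b):
    """Eq. [55]: exponent parameter from the von Schweidler exponent b."""
    return math.gamma(1.0 + b) ** 2 / math.gamma(1.0 + 2.0 * b)


def lambda_from_a(a):
    """Eq. [52]: exponent parameter from the critical exponent a (0 < a < 1/2)."""
    return math.gamma(1.0 - a) ** 2 / math.gamma(1.0 - 2.0 * a)


def a_from_lambda(lam, tol=1e-12):
    """Invert Eq. [52] by bisection; lambda_from_a decreases from 1 to 0 on (0, 1/2)."""
    lo, hi = 1e-9, 0.5 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lambda_from_a(mid) > lam:
            lo = mid  # lambda still too large: the root lies at larger a
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gamma_exponent(a, b):
    """Eq. [57]: gamma = 1/(2a) + 1/(2b)."""
    return 1.0 / (2.0 * a) + 1.0 / (2.0 * b)


b = 0.75
lam = lambda_from_b(b)
a = a_from_lambda(lam)
gamma = gamma_exponent(a, b)
print(f"lambda = {lam:.3f}, a = {a:.3f}, gamma = {gamma:.2f}")
```

Bisection is used here only because it needs nothing beyond the standard library; any one-dimensional root finder would serve equally well.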
This time $\tau'_\sigma$ actually is the maximum time for which Eq. [48] is valid, $t_0 \ll t \ll \tau'_\sigma$. The exponent $\gamma$ characterizes the behavior of the $\alpha$-relaxation or structural relaxation. Typical $\alpha$ time scales diverge as

$$\tau_\alpha \propto (T - T_c)^{-\gamma} \qquad [58]$$
As indicated, the power law approximations to the $\beta$-correlator described above are only valid asymptotically for $\sigma \to 0$, but corrections to these predictions have been worked out.102,103 More important, however, is the assumption of the idealized MCT that density fluctuations are the only slow variables. This assumption breaks down close to $T_c$. The MCT has been augmented by coupling to mass currents, which is sometimes termed “inclusion of hopping processes,” but the extension of the theory to temperatures below $T_c$ or even down to $T_g$ has not yet been successful.101 Also, the theory is often not applied to experimental density fluctuations directly (observed by neutron scattering) but instead to dielectric relaxation or to NMR experiments. These latter techniques probe reorientational motion of anisotropic molecules, whereas the MCT equation describes a scalar quantity. Using MCT results to compare with dielectric or NMR experiments thus forces one to assume that a direct coupling of orientational correlations with density fluctuations exists. The different orientational correlation functions and the question to what extent they directly couple to the density fluctuations have been considered in extensions to the standard MCT picture.104–108 Of the available experimental techniques, the various neutron scattering methods most directly measure structural relaxation. Like simulation techniques, however, their dynamic range is limited and several experimental setups have to be combined to obtain information on polymer relaxation from the picosecond scale up to the longest time accessible in neutron spin echo experiments (about 100 ns depending on momentum transfer), with all the experimental correction and normalization issues involved in matching results from different experiments. To our knowledge, the reconstruction of a single relaxation curve $S(q,t)$ out of experimental information for the different frequency and time windows has not yet been tried.
Therefore, simulation studies of the glass transition still provide us with the most detailed information on the structural relaxation processes.109 Before we examine in more detail the dynamics of a super-cooled melt of coarse-grained chains and of PB chains, respectively, let us first compare the structure of these two glass-forming systems. Structure is obtained experimentally from either the neutron or the X-ray structure factors. The melt (or liquid) structure factor is given as110

$$S_m(q) = \frac{1}{M} \sum_{n,m}^{M} b_n(q)\, b_m(q) \left\langle e^{i \vec{q} \cdot (\vec{r}_n - \vec{r}_m)} \right\rangle \qquad [59]$$
Here the angular brackets indicate thermal as well as isotropic averages, and the sum runs over all $M$ scattering centers in the melt. The quantities $b_n(q)$ are the scattering form factors of the different scatterers in the sample. For X-ray scattering, the momentum transfer ($q$) dependence of the form factor of the electronic clouds must be taken into account. For neutron scattering, the form factors reduce to $q$-independent scattering lengths, $b_n(q) = b_n$. Neutron scattering studies of the melt structure are typically performed on perdeuterated samples, i.e., where all H atoms have been replaced by D atoms, because for deuterium and carbon atoms, coherent scattering dominates (they have about the same coherent scattering lengths), whereas hydrogen atoms scatter neutrons incoherently. X-ray scattering is most sensitive to the positional correlations of the heavy atoms in the sample with their large associated electron clouds. Performing neutron scattering not on perdeuterated samples but on a single deuterated chain in a protonated matrix (or vice versa; both ways provide the same contrast) gives the single-chain structure factor,

$$S_{\mathrm{ch}}(q) = \frac{1}{N} \sum_{n,m}^{N} b_n b_m \left\langle e^{i \vec{q} \cdot (\vec{r}_n - \vec{r}_m)} \right\rangle \qquad [60]$$
where now the sum runs only over all monomers of a single chain. When we think of simulations involving bead-spring models, all scatterers can be assigned the same scattering lengths [that are absorbed into arbitrary units for $S(q)$], and for united atom models like the one used for PB, we can consider scattering from the united atoms in the same way. This simplifies the scattering functions of Eqs. [59] and [60] to

$$S_m(q) = \frac{1}{M} \sum_{n,m}^{M} \left\langle e^{i \vec{q} \cdot (\vec{r}_n - \vec{r}_m)} \right\rangle \qquad [61]$$

and

$$S_{\mathrm{ch}}(q) = \frac{1}{N} \sum_{n,m}^{N} \left\langle e^{i \vec{q} \cdot (\vec{r}_n - \vec{r}_m)} \right\rangle \qquad [62]$$
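For a single stored configuration, the isotropic average in Eqs. [61] and [62] can be carried out analytically, since averaging $e^{i \vec{q} \cdot \vec{r}}$ over all orientations of $\vec{q}$ gives $\sin(qr)/(qr)$. The sketch below uses this (the Debye formula) for unit scattering lengths. It is a minimal illustration with hypothetical function names; it assumes open boundaries, so applying it to a periodic simulation box would need additional care, and the thermal average still has to be taken over many configurations.

```python
import math


def debye_structure_factor(positions, q):
    """Orientationally averaged S(q) of one configuration, unit scattering lengths."""
    m = len(positions)
    total = float(m)  # the n == m terms each contribute exp(0) = 1
    for i in range(m):
        for j in range(i + 1, m):
            x = q * math.dist(positions[i], positions[j])
            sinc = math.sin(x) / x if x > 1e-12 else 1.0
            total += 2.0 * sinc  # the pairs (n, m) and (m, n) are equal
    return total / m


def single_chain_structure_factor(chains, q):
    """Eq. [62] for one configuration: average of the per-chain S(q) over all chains."""
    return sum(debye_structure_factor(chain, q) for chain in chains) / len(chains)


# Hypothetical check: a dimer with bond length 1 has S(q) = 1 + sin(q)/q.
dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
s_dimer = debye_structure_factor(dimer, math.pi / 2.0)
```

The double loop scales with the square of the number of scatterers, which is acceptable for a single chain but slow for a full melt; binning pair distances into $g(r)$ first is the usual way around this.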
The structure factors are Fourier transforms of radial pair-distribution functions for the complete melt or the single chain, respectively,

$$S_m(q) = 1 + 4\pi\rho \int_0^\infty r^2 g_m(r)\, \frac{\sin(qr)}{qr}\, dr \qquad [63]$$

$$S_{\mathrm{ch}}(q) = 1 + 4\pi (N - 1) \int_0^\infty r^2 g_{\mathrm{ch}}(r)\, \frac{\sin(qr)}{qr}\, dr \qquad [64]$$
where we already performed the angular average. The pair-distribution functions $g_m(r)$ and $g_{\mathrm{ch}}(r)$ measure structural correlations directly in real space and are, of course, also observables in the simulations. We show typical examples for the melt structure factor and for the single-chain structure factor in Figure 7. The upper panel is for a chemically realistic simulation of PB,111 where the scattering was calculated with the united atoms as scattering centers of unit scattering length. The lower panel is for a simulation of a bead-spring model.88

Figure 7 Comparison of melt structure factor and single-chain structure factor for PB (upper panel, calculated as scattering from the united atoms only) and a bead-spring melt (lower panel, in Lennard–Jones units). (Upper panel: $S_m(q)$ and $S_{\mathrm{ch}}(q)$ at $T = 273$ K versus $q$ [Å$^{-1}$]; lower panel: $S_m$ and $S_{\mathrm{ch}}$ versus $q$.)

Recall that the first maximum in the melt structure factor is called the first sharp diffraction peak or amorphous halo. The position of the amorphous halo for the PB simulation agrees well with experimental data.112,113 We can see from Figure 7 that for momentum transfers larger than about 3 Å$^{-1}$ in PB, i.e., starting around the second maximum, one observes only intramolecular correlations in the melt structure factor112–114 when one considers only scattering from the united atom centers. The melt structure factor can always be decomposed into a chain contribution ($S_{\mathrm{ch}}(q)$) and a contribution that captures the correlations between distinct melt chains ($S_{\mathrm{md}}(q)$):

$$S_m(q) = S_{\mathrm{ch}}(q) + S_{\mathrm{md}}(q) \qquad [65]$$
where

$$S_{\mathrm{md}}(q) = 4\pi\rho \int_0^\infty r^2 \left(g_{\mathrm{md}}(r) - 1\right) \frac{\sin(qr)}{qr}\, dr \qquad [66]$$

contains only scattering contributions from scattering centers belonging to different chains ($g_{\mathrm{md}}(r)$ is the pair correlation function for atoms belonging to different chains). Intermolecular scattering is responsible for only half of the intensity of the first sharp diffraction peak. For the intermolecular contribution to the scattering, the position of the amorphous halo is given approximately by $2\pi/d$, where $d$ is the typical intermolecular distance between scattering centers. The behavior of the bead-spring model is different from that of PB. The melt structure factor and the single-chain structure factor depicted in Figure 7 only start to agree at the third peak in the melt structure factor. For smaller momentum transfer, they oscillate with the same wavelength but with a phase shift. The intramolecular structure factor has a minimum preceding the amorphous halo and a maximum shifted slightly with respect to, but still within, the amorphous halo. For the chemically realistic united atom chain, we observe a shoulder in the intramolecular structure factor at the position of the amorphous halo and a first minimum where the melt structure factor also has its first minimum. The shoulder tells us that an intramolecular correlation exists in the PB chain on a scale given by the typical intermolecular distance of about 4–5 Å, which agrees with the size of a repeat unit comprising the chain. Pictorially, one can think of the bead-spring chain having the local structure of a pearl necklace with the beads touching each other, whereas the PB chain consists of overlapping spheres with a distance between their centers that is roughly a third of their diameter. The local packing in a hydrocarbon melt like PB, therefore, resembles more the packing of spaghetti than of billiard balls.
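When $g_{\mathrm{md}}(r)$ is available as a table from a simulation, Eq. [66] can be evaluated by straightforward numerical quadrature. The following sketch uses the trapezoidal rule; the function name is hypothetical, and the test case (a pure excluded-volume hole, $g_{\mathrm{md}} = 0$ for $r < d$ and $1$ beyond) is an artificial model chosen only because its transform is known in closed form, not a pair correlation function from the chapter.

```python
import math


def intermolecular_structure_factor(r_values, g_values, rho, q):
    """Trapezoidal estimate of Eq. [66] from tabulated g_md(r) on the grid r_values."""

    def integrand(r, g):
        x = q * r
        sinc = math.sin(x) / x if x > 1e-12 else 1.0
        return r * r * (g - 1.0) * sinc

    total = 0.0
    for i in range(len(r_values) - 1):
        dr = r_values[i + 1] - r_values[i]
        total += 0.5 * dr * (integrand(r_values[i], g_values[i])
                             + integrand(r_values[i + 1], g_values[i + 1]))
    return 4.0 * math.pi * rho * total


# Artificial model: g_md = 0 inside a hole of radius d, so only [0, d] contributes.
d, rho, q, n = 1.0, 0.8, 3.0, 2000
r_grid = [d * i / n for i in range(n + 1)]
g_grid = [0.0] * (n + 1)
numeric = intermolecular_structure_factor(r_grid, g_grid, rho, q)
analytic = (-4.0 * math.pi * rho
            * (math.sin(q * d) - q * d * math.cos(q * d)) / q ** 3)
```

For tabulated data the upper integration limit is the largest $r$ in the table, so the table has to extend far enough for $g_{\mathrm{md}}(r) - 1$ to have decayed; otherwise truncation ripples appear at small $q$.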
This specific local packing can give rise to scattering behavior that might be puzzling at first glance. When one imagines the packing in a polymer melt to be that of billiard balls, one would predict that upon increasing the pressure a better defined packing will result in sharper radial distribution functions and, consequently, in a sharpening and increase in height of the amorphous halo. Moreover, that halo would move to larger $q$ because of the overall compression of the melt. However, a series of experiments on the structure and dynamics of polymers under pressure115–118 has been reported, showing a very different behavior of the first sharp diffraction peak: It shifted to larger momentum transfer values as expected, but it simultaneously broadened and decreased in height. This behavior has been reproduced in a simulation of a chemically realistic model of PB119 under pressure as shown in Figure 8.

Figure 8 Behavior of the first sharp diffraction peak of PB with experimental scattering lengths for carbon and deuterium. The deuterium atoms are placed at their mechanical equilibrium positions determined by the positions of the united atom centers and the equilibrium CH bond length and HCH and HCC bond angles along a united atom MD trajectory. With increasing pressure (values given in the legend, simulation performed at $T = 293$ K), the first sharp diffraction peak shifts to larger $q$ as expected but unexpectedly decreases in height. (Axes: $S(q)$ versus $q$ [Å$^{-1}$]; legend: $p = 1$ atm, $p = 2500$ atm, $p = 27000$ atm.)

To understand the experimental behavior of PB, one has to take into account the fact that the scattering was performed on a perdeuterated sample and that carbon and deuterium have about the same coherent scattering length. Therefore, instead of having one melt structure factor one must actually consider three partial structure factors, $S_{\mathrm{CC}}$, $S_{\mathrm{DD}}$, and $S_{\mathrm{CD}}$, that are weighted by the appropriate combination of scattering lengths (see Ref. 110). The partial structure factor $S_{\mathrm{CC}}$ is the one we used in Figure 7. To calculate the structure factor shown in Figure 8, in contrast, we used the trick of reinserting deuterium atoms into a time series of stored united atom configurations that have been sampled along the MD trajectory. By knowing the equilibrium CH bond length and the equilibrium HCH and HCC bond angles, the hydrogen (deuterium) positions can be uniquely determined from the backbone configuration of the united atom polymer chain.120,121 Knowing all the partial structure factors, we can conclude that the unexpected behavior of the scattering function has been induced by the $q$-dependence of the carbon-deuterium cross correlations (which contribute positively when the amorphous halo is located at smaller momentum transfers but negatively at larger ones) and by a different $q$-dependence of intramolecular and intermolecular contributions.119 We caution the reader to be careful with the interpretation of experimental structure factors, and not just for polymers. Given the same molecular packing, neutron scattering on a perdeuterated sample, on a partially deuterated sample, or on a protonated sample and X-ray scattering may yield different experimental structure factors. On the other hand, careful analysis of results obtained from different scattering techniques and/or isotopic substitution can offer a way to glean information on partial structure factors from experiment. We now turn to a characterization of the dynamics in a polymer melt that, as it is supercooled, approaches its glass transition temperature. We begin by looking at the translational dynamics in a bead-spring model and consider its analysis in terms of MCT.
DYNAMICS IN THE BEAD-SPRING MODEL

Early simulation studies on the structural aspects of the glass transition in polymer melts were performed using the simple bond-fluctuation lattice model.122–125 The missing inertial regime of the short-time dynamics and the discreteness of the lattice, however, limited the information that one could obtain on structural relaxation using this model. The next simplest polymer models are hard-sphere chains, studied by Rosche et al.126 using MC simulations, and the bead-spring off-lattice model that was studied along an isobar using MD simulations in an NVT ensemble88 (see also Refs. 127 and 128 for reviews). Using MD as the simulation method has the advantage of capturing the short-time vibrational dynamics when compared with MC simulations. Our analysis of the melt dynamics begins by looking at large length and long time scales where we can assess the temperature dependence of the center of mass self-diffusion coefficient of the chains. This self-diffusion is measured in the simulations by monitoring the average mean-squared center of mass displacement of all chains and then employing the Einstein relation

$$D(T) = \lim_{t \to \infty} \frac{\left\langle \left(\vec{R}_{\mathrm{cm}}(t) - \vec{R}_{\mathrm{cm}}(0)\right)^2 \right\rangle}{6t} \qquad [67]$$
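In practice, Eq. [67] is applied to stored configurations by computing the center of mass of each chain, accumulating its squared displacement, and dividing by $6t$ at the longest accessible times. The sketch below illustrates this for equal monomer masses and a single time origin; the function names and the synthetic trajectory are assumptions made for the example, and the coordinates are taken to be unwrapped (not folded back into a periodic box).

```python
import math


def center_of_mass(monomers):
    """Center of mass of one chain of equal-mass monomers given as (x, y, z) tuples."""
    n = len(monomers)
    return tuple(sum(p[d] for p in monomers) / n for d in range(3))


def com_msd(com_frames):
    """Chain-averaged squared center-of-mass displacement relative to frame 0."""
    reference = com_frames[0]
    msd = []
    for frame in com_frames:
        total = 0.0
        for pos, pos0 in zip(frame, reference):
            total += sum((x - x0) ** 2 for x, x0 in zip(pos, pos0))
        msd.append(total / len(reference))
    return msd


def diffusion_coefficient(msd, dt, frame_index):
    """Finite-time estimate of Eq. [67]: MSD(t) / (6 t) at t = frame_index * dt."""
    return msd[frame_index] / (6.0 * frame_index * dt)


# Synthetic trajectory of two chains whose displacement obeys MSD = 6 D t exactly.
D_true, dt = 0.01, 0.5
frames = []
for i in range(11):
    s = math.sqrt(6.0 * D_true * i * dt)
    frames.append([(s, 0.0, 0.0), (0.0, -s, 5.0)])
msd = com_msd(frames)
D_est = diffusion_coefficient(msd, dt, 10)
```

With real data one would additionally average over many time origins and check that $\mathrm{MSD}(t)/(6t)$ has reached a plateau before quoting a value of $D$.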
Figure 9 Chain center of mass self-diffusion coefficient for the bead-spring model as a function of temperature (open circles). The full line is a fit with the Vogel–Fulcher law in Eq. [3]. The dashed and dotted lines are two fits with a power-law divergence at the mode-coupling critical temperature. (Axes: $D$, logarithmic, versus $T$; legend: VF law, MCT $\gamma = 2.09$, MCT $\gamma = 1.8$.)
For a polymer chain, the long time limit in Eq. [67] means that one has to be able to simulate the model system for times on the order of several Rouse times or, to put it another way, long enough for the chains to diffuse over a spatial range a few times their size. This simulation is possible for a bead-spring model down to rather low temperatures, but for a chemically realistic model with reasonably long chains, one typically cannot perform such lengthy simulations. Upon super-cooling the bead-spring melt below its crystallization temperature (which is $T = 0.76$; see the section on thermodynamic properties), a large decrease in the self-diffusion coefficient is observed (see Figure 9). The temperature dependence below $T = 1$ is compatible with a Vogel–Fulcher law with a seemingly vanishing self-diffusion coefficient at $T_0 = 0.32 \pm 0.02$. Note, however, that even for this coarse-grained model, which is much easier to simulate than chemically realistic models, the information on the chain center of mass diffusion derived from the simulation in a (meta-)stable equilibrium is limited to temperatures above approximately $1.44\,T_0$, which makes the deduction of $T_0$ from these data a risky extrapolation. We will comment on the MCT fits in this figure later in this chapter. No crystalline order is visible for the bead-spring model upon cooling to the frozen-in phase at $T = 0.3$. The break in the volume-temperature curve (described in the section on thermodynamic information) occurring between $T = 0.4$ and $T = 0.45$ leads us to expect that the two-step decay described by MCT should be observable at simulation temperatures above (and close to) this region. This expectation is borne out in Figure 10, which shows the intermediate incoherent scattering function in the bead-spring model for several values of momentum transfer at $T = 0.48$.129

Figure 10 Intermediate incoherent scattering function for the bead-spring model at $T = 0.48$ for different values of momentum transfer given in the legend. (Axes: $\phi_q(t)$ versus $t$, logarithmic time axis; legend: $q = 1.0$, $2.0$, $6.9$, $9.5$, $15$; the time scale $\tau_\sigma$ is marked.)

The basic length scale of MCT is the intermolecular distance as given by the position of the amorphous halo, which is $q = 6.9$ for the bead-spring model. For $q \geq 6.9$, we see a well-developed plateau regime in the figure. The amount of decorrelation on the microscopic time scale increases with $q$. Also indicated in the figure is the time scale $\tau_\sigma$ derived from an application of the MCT predictions (Eq. [51]) for the $\beta$-relaxation regime. The time scale $\tau_\sigma$ and the amplitudes $h_q$ from Eq. [56] are predicted by MCT to show a power law dependence on $T - T_c$. When one plots $\tau_\sigma$ and the amplitudes $h_q$ taken to the inverse of the predicted exponent versus temperature, one can directly find the critical temperature of MCT, $T_c = 0.45$, as shown in Figure 11. From the MCT analysis in the $\beta$-regime, one also obtains the von Schweidler exponent, $b = 0.75$, and therefore all other exponents through Eqs. [52], [55], and [57]. Another test of MCT, which is suggested by the form of Eq. [56], is to plot the ratio130,131

$$R(t) = \frac{\phi_q(t) - \phi_q(t')}{\phi_q(t'') - \phi_q(t')} \qquad [68]$$
where all times $t$, $t'$, $t''$ are within the plateau region ($\beta$-regime, see Figure 12). It follows from Eq. [56] that the function $R(t)$ defined in this way has to be independent of the correlation function that one studies. This so-called factorization theorem, i.e., Eq. [68], has been tested in detail for the bead-spring model132,133 and shown to be valid for many correlators, including coherent as well as incoherent scattering and the Rouse modes.

Figure 11 MCT $\beta$-scaling for the amplitudes of the von Schweidler laws fitting the plateau decay in the incoherent intermediate scattering function for a $q$-value smaller than the position of the amorphous halo, $q = 3.0$, at the amorphous halo, $q = 6.9$, and at the first minimum, $q = 9.5$. Also shown with filled squares is the $\beta$ time scale. All quantities are taken to the inverse power of their predicted temperature dependence such that linear laws intersecting the abscissa at $T_c$ should result. (Abscissa: $T$; legend: $q = 3$, $q = 6.9$, $q = 9.5$, $\tau_\sigma$.)

Figure 12 Test of the factorization theorem of MCT for the intermediate coherent scattering function for the bead-spring model and a range of $q$-values indicated in the figure. Data taken from Ref. 132 with permission. (Axes: $R_q(t)$ versus $t$, logarithmic time axis; legend: $q = 1$, $q = 6.9$, $q = 19$.)

The von Schweidler exponent, $b = 0.75$, obtained from the $\beta$-relaxation determines the exponent for the $\alpha$-relaxation to be $\gamma = 2.09$. This exponent should be observable in the temperature dependence of the self-diffusion coefficient shown in Figure 9. The dashed line in this figure is a fit with fixed values of $T_c = 0.45$ and $\gamma = 2.09$ as determined from the $\beta$-relaxation analysis; the dotted line is the best fit at fixed $T_c$. The quality of the fit can be improved if one allows the exponent to differ from the prediction based on the $\beta$-relaxation behavior. Actually, a systematic decrease of the best-fit value for the exponent in the $\alpha$-relaxation temperature dependence with increasing length scale was observed.134 The $\alpha$ time scale is also the time scale of the final decay in the scattering functions $S(q,t)$, be it coherent or incoherent scattering. This time scale is typically obtained by fitting a KWW (Eq. [9]) time dependence to the final decay of the correlators. In a finite temperature regime above $T_c$, the stretching exponent $\beta$ of the KWW functions is independent of temperature. In this temperature range, $\beta$ is momentum transfer dependent with values between 0.65 and 0.75, approaching the value of the von Schweidler exponent for $q \to \infty$. In the temperature window with constant $\beta$, the time-temperature superposition principle, which is often used to reconstruct complete time-dependent curves from experimental measurements at different temperatures, is valid. The physical interpretation of the time-temperature superposition is that molecular relaxation mechanisms are the same within this temperature window. This is generally valid only over limited temperature ranges. The structural relaxation time scale, however, also determines relaxation processes on large length scales like, for example, the decay of the Rouse modes.135,136 In Figure 13 we can see that the time scales of the first five Rouse modes follow the predicted $\alpha$-scaling of relaxation times over a certain temperature window above $T_c$. When one comes too close to $T_c$, the system does not actually freeze. Instead, other relaxation processes not considered in the idealized MCT take over. The glassy freezing therefore enters into the Rouse picture only through the temperature dependence of the segmental friction $\zeta(T)$, following the temperature dependence of the $\alpha$-relaxation time scale. One also finds that the value of the $\alpha$-exponent (and consequently all other exponents) does not depend on the thermodynamic path one follows to reach the state-point given by $(T_c, \rho_c)$,134 which conforms to another prediction by MCT. In summary, MCT has been found to be consistently applicable to the glassy slowdown in the bead-spring polymer model over a narrow range of temperature above $T_c$. What is the reason underlying the applicability of MCT, considering the fact that the theory was developed for simple liquids and has no connectivity built in? The physics behind the success of MCT in describing the slowdown in bead-spring melts becomes clear when we look at the mean-square displacement master curve. This curve is obtained
Dynamics in the Bead-Spring Model
39
Figure 13 Temperature dependence of the time scales for the first five Rouse modes in the bead-spring model in the vicinity of the MCT Tc .
by plotting the displacements at all simulated temperatures against DðTÞt, where DðTÞ is the chain center of a mass diffusion coefficient at a given temperature, as shown in Figure 14. Also shown in the figure is the same master curve constructed for a binary Lennard–Jones fluid.137 For t ! 1, the data for 2
10
10
Re
2
2
Rg
0 2
0.75
g1(t)
6 rsc + A (Dt)
10
-2
diffusive
t 10
0.63
-4
LJ mixture -7
10
-6
10
-5
10
-4
10
-3
10
-2
10
-1
10
0
10
1
10
2
10
Dt
Figure 14 Master curve generated from mean-square displacements at different temperatures, plotting them against the diffusion coefficient at that temperature times time. Shown are only the envelopes of this procedure for the monomer displacement in the bead-spring model and for the atom displacement in a binary Lennard–Jones mixture. Also indicated are the long-time Fickian diffusion limit, the Rouse-like subdiffusive regime for the bead-spring model ( t0:63 ), the MCT von Schweidler description of the plateau regime, and typical length scales R2g and R2e of the bead-spring model.
40
Determining the Glass Transition in Polymer Melts
the two models must agree by construction. At very early times, the particles move freely, and both models exhibit ballistic motion, followed by the slow displacement characteristic for the β-process of MCT. However, where the Lennard–Jones fluid directly crosses over from the cage effect to the free diffusion, the polymer exhibits an intervening connectivity-dominated regime for length scales between the bond length, l = 1, and the end-to-end distance. In this regime, the observed mean-square displacement increases less quickly (∝ t^0.63) than it does in the MCT description, which is here displayed as the effective von Schweidler law

\[ g_1(t) = 6 r_{sc}^2 + A_1 (Dt)^{0.75}, \qquad (r_{sc} = 0.087,\ A_1 = 11.86) \qquad [69] \]
before free diffusion sets in. This difference from simple liquids, i.e., the crossover to Rouse-like motion, has recently been included in mode coupling theory.138,139 Furthermore, it is important to note that the values of the plateau displacement in the bead-spring model as well as in the Lennard–Jones liquid model are well below one. For the Lennard–Jones liquid, this is the scale of the intermolecular packing σ = 1, but for the bead-spring polymer, this is also the scale of the bond length l ≈ σ = 1. The packing constraints thus act on a length scale much smaller than the bond length in this polymer model; i.e., the monomers are caged before they actually feel that they are bonded along the chain. This might also explain why both models have very similar critical temperatures (Tc = 0.45 for the bead-spring model vs. Tc = 0.435 for the LJ mixture). As we discussed in the section on the structural properties of amorphous polymers, the relative size of the bond length and the Lennard–Jones scale is very different when comparing coarse-grained models with real polymers or chemically realistic models, which leads to observable differences in the packing. Furthermore, the dynamics in real polymer melts is, to a large extent, determined by the presence of dihedral angle barriers that inhibit free rotation. We will examine the consequences of these differences for the glass transition in the next section.
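As a quick numerical illustration of Eq. [69], the following minimal Python sketch evaluates the effective von Schweidler law with the parameters quoted there and returns the plateau value 6 rsc², which can be compared with the squared bond length l² = 1. The function names are hypothetical and chosen only for this illustration; they are not taken from the original work.

```python
# Parameters quoted with Eq. [69] (Lennard-Jones units).
R_SC = 0.087   # r_sc
A_1 = 11.86    # amplitude A_1
B_EXP = 0.75   # effective von Schweidler exponent


def g1_von_schweidler(dt_scaled):
    """Eq. [69]: mean-square displacement as a function of scaled time D*t."""
    return 6.0 * R_SC ** 2 + A_1 * dt_scaled ** B_EXP


def plateau_msd():
    """Plateau height 6*r_sc^2, the limit of Eq. [69] for D*t -> 0."""
    return 6.0 * R_SC ** 2
```

Calling `plateau_msd()` gives the height of the plateau in units of σ², and `g1_von_schweidler(x)` can be tabulated over the range of Dt covered by Figure 14 to see where the law departs from the plateau.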
DYNAMICS IN 1,4-POLYBUTADIENE

Structural relaxation in glass-forming polymers has been studied for many years using chemically realistic simulations. Most of the early work that examined incoherent, as well as coherent scattering functions, is more of a qualitative nature because of the unsatisfactory quality of the force fields employed and the severe limitations on the length of the MD simulations performed. Roe studied the slowdown of structural relaxation in a PE-like model140,141 as well as for polystyrene.142 More recently Okada et al.143,144
performed MD simulations of cis-1,4-polybutadiene143,144 to identify candidates for jump motions in asymmetric double-well potentials. The idea of jumps in double-well potentials had been used earlier to explain quasi-elastic neutron scattering (QENS) data145,146 on PB. A detailed analysis of mode-coupling predictions was performed by van Zon and de Leeuw for a PB-like model147 and a PE-like model148 and by Lyulin et al.149–151 for polystyrene. In the work by van Zon and de Leeuw, no quantitative comparison with experimental data was possible because of the limitations of the force field quality and the short runs performed. In the work of Lyulin et al., the simulations extended to several tens of nanoseconds, but there too no quantitative comparison was made with the experiment. In the works of Lyulin et al. and of van Zon and de Leeuw, MCT was found to provide a consistent description of the coherent as well as incoherent intermediate scattering functions over a temperature range between Tc and about 1.2 Tc. The value of the critical temperature van Zon and de Leeuw obtained for PB was Tc ≈ 162 K, which is about 50 K below the experimental value. Their conclusions mostly agreed with those derived for the bead-spring model with an important difference:148 The relaxation in real polymers is far more stretched than in the bead-spring model. This stretching results in a smaller value of the von Schweidler exponent (b = 0.46 for the PE-like model vs. b = 0.75 for the bead-spring model) as well as for the KWW stretching exponent (β ≈ 0.4 for the PE-like model vs. β ≈ 0.7 for the bead-spring model), which are interrelated by the MCT prediction βq → b for q → ∞. The result for the KWW exponent for the chemically realistic simulation agrees well with typical values found in neutron scattering or dielectric experiments. In contrast, the result for the bead-spring model is much larger.
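To make the effect of the stretching exponent concrete, the sketch below evaluates the Kohlrausch–Williams–Watts (KWW) function exp[−(t/τ)^β] and its time integral, (τ/β)Γ(1/β), which is a standard closed-form result. The function names and the choice τ = 1 are assumptions made for illustration only.

```python
import math


def kww(t, tau, beta):
    """KWW (stretched exponential) decay exp[-(t/tau)^beta]."""
    return math.exp(-((t / tau) ** beta))


def kww_mean_time(tau, beta):
    """Time integral of the KWW function from 0 to infinity."""
    return (tau / beta) * math.gamma(1.0 / beta)


# Compare the two stretching exponents quoted in the text at equal tau.
mean_pe_like = kww_mean_time(1.0, 0.4)      # PE-like model, beta of about 0.4
mean_bead_spring = kww_mean_time(1.0, 0.7)  # bead-spring model, beta of about 0.7
```

For equal τ, the smaller exponent yields the longer mean relaxation time because the slowly decaying tail of the stretched exponential carries more weight.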
We can therefore conclude that differences in the structural relaxation between bead-spring and chemically realistic models can be attributed to either the differences in packing that we discussed above or the presence of barriers in the dihedral potential in atomistic models. To quantify the role of dihedral barriers in polymer melt dynamics, we now examine high-temperature relaxation in polymer melts. There has been extensive effort in recent years to use coordinated experimental and simulation studies of polymer melts to better understand the connection between polymer motion and conformational dynamics. Although no experimental method directly measures conformational dynamics, several experimental probes of molecular motion are spatially local or are sensitive to local motions in polymers. Coordinated simulation and experimental studies of local motion in polymers have been conducted for dielectric relaxation,152–158 dynamic neutron scattering,157,159–164 and NMR spin-lattice relaxation.17,152,165–168 A particularly important outcome of these studies is the improved understanding of the relationship between the probed motions of the polymer chains and the underlying conformational dynamics that leads to observed motions. In the following discussion, we will focus on the
information obtained from NMR experiments that have been used to probe local reorientational motion. NMR 13C spin-lattice relaxation times are sensitive to the reorientational dynamics of 13C–1H vectors. The motion of the attached proton(s) causes fluctuations in the magnetic field at the 13C nuclei, which results in decay of their magnetization. Although the time scale for the experimentally measured decay of the magnetization of a 13C nucleus in a polymer melt is typically on the order of seconds, the corresponding decay of the 13C–1H vector autocorrelation function is on the order of nanoseconds, and, hence, is amenable to simulation. The spin-lattice relaxation time T1 can be determined from simulation by using the relationship169 of Eq. [70]

\[ \frac{1}{nT_1} = K \left[ J(\omega_H - \omega_C) + 3J(\omega_C) + 6J(\omega_H + \omega_C) \right] \qquad [70] \]
where J(ω) is the spectral density as a function of angular frequency given by

\[ J(\omega) = \frac{1}{2} \int_{-\infty}^{\infty} P_2(t) \exp\{ i \omega t \} \, dt \qquad [71] \]
Here n is the number of attached protons at a given carbon atom and ωH and ωC are the proton and 13C resonance frequencies. The constant K assumes a value of 2.29 × 10^9 s^-2 and 2.42 × 10^9 s^-2 for sp3 and sp2 nuclei, respectively. The orientational autocorrelation function is obtained from the simulation trajectory using the relationship

\[ P_2(t) = \frac{1}{2} \left[ 3 \left\langle \left| \hat{e}_{CH}(t) \cdot \hat{e}_{CH}(0) \right|^2 \right\rangle - 1 \right] \qquad [72] \]
where ê_CH(t) is the unit vector along a C–H bond at time t. Equation [72] is an ensemble average over all carbon atoms with the same chemical environment. Experimentally, T1 values can be determined for 13C nuclei in various chemical (bonding) environments because of the different chemical shifts of these nuclei (the resonances one can distinguish in a cis-trans copolymer of polybutadiene are shown in Figure 15). The chemically realistic simulations we are discussing have been performed using a united atom representation of PB, which leads to the question: How does one actually measure a CH vector reorientation for such a model? The answer to this question is to use the trick we discussed in the analysis of the pressure dependence of the melt structure factor of PB. Hydrogen atoms are placed on the backbone carbons at their mechanical equilibrium positions for each structure that has been sampled along the MD trajectory. The CH vector dynamics we are showing in Figure 16 is solely from the backbone reorientations of the chain.
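The evaluation of Eq. [72] from stored configurations can be sketched as follows. This is a minimal pure-Python illustration, not the analysis code used in the work discussed: the input layout (a list of frames, each holding the reconstructed C–H unit vectors of one chemical environment) and the additional averaging over time origins are assumptions of this sketch.

```python
def p2_autocorrelation(frames, max_lag):
    """Second-Legendre autocorrelation of unit vectors, following Eq. [72].

    frames[k][j] is unit vector j (a 3-tuple) in stored frame k; all frames
    must list the same vectors in the same order and max_lag < len(frames).
    Returns [P2(0), P2(1), ..., P2(max_lag)] with the lag counted in frames,
    averaged over all vectors and over all available time origins.
    """
    n_frames = len(frames)
    result = []
    for lag in range(max_lag + 1):
        total = 0.0
        count = 0
        for k in range(n_frames - lag):
            for e0, et in zip(frames[k], frames[k + lag]):
                dot = e0[0] * et[0] + e0[1] * et[1] + e0[2] * et[2]
                total += dot * dot
                count += 1
        result.append(0.5 * (3.0 * total / count - 1.0))
    return result
```

Multiplying the lag index by the time between stored frames converts the result into P2(t); separate calls for each group of chemically equivalent carbons give one curve per resonance.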
Figure 15 Sketch of the local environment along a polybutadiene chain of cis- and trans-conformers. For sp3-hybridized carbon atoms (indicated by the gray spheres), the chemical shift is different when they belong to a cis-monomer than when they belong to a trans-monomer. For sp2-hybridized carbon atoms (shown by black spheres) in a cis-monomer, NMR shows a different chemical shift depending on whether they have another cis-monomer or a trans-monomer as a neighbor, and it is similar for the sp2-hybridized carbon atoms in the trans-monomer.
We can see that the different positions along the chain show distinct temperature-dependent relaxation curves. To further analyze these relaxation functions, we must Fourier transform them to determine their spectral density, which is best done employing an analytic representation of the data that
Figure 16 Second Legendre polynomial of the CH vector autocorrelation function, P2(t) versus t [ps], for the sp3 cis-carbon (dashed lines) and the sp2 carbon in a trans-group next to a trans-group (dashed-dotted lines) for two different temperatures (T = 353 K and T = 293 K). The fit curves to the cis-correlation functions are a superposition of exponential and stretched exponential decay discussed in the text.
Figure 17 Spin-lattice relaxation times for six resonances along a PB chain. Trans and cis denote sp3-hybridized carbons in the respective monomer type; trans-trans, trans-cis, cis-cis, and cis-trans denote sp2-hybridized carbons in a trans-group with a trans-group as neighbor, a trans-group with a cis-group as neighbor, and so on. Open bars are for simulation, and filled ones for experiment. Values are shown for 273 K (short bars) and 400 K (longer bars).
enables us to extrapolate to “infinite” time. For this process, we fit the data to the following superposition of exponential and stretched exponential decay

\[ P_2(t) = A e^{-t/\tau_1} + (1 - A) e^{-(t/\tau_2)^{\beta}} \qquad [73] \]
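The route from the fit function of Eq. [73] to T1 via Eqs. [70] and [71] can be sketched numerically as below. This is an illustrative implementation under stated assumptions: P2(t) is continued as an even function of t, so that Eq. [71] reduces to a cosine transform over positive times, and the integral is evaluated with a simple trapezoidal rule up to a user-chosen cutoff `t_max`. All function names and the argument layout `fit = (A, tau1, tau2, beta)` are hypothetical.

```python
import math


def p2_fit(t, a, tau1, tau2, beta):
    """Eq. [73]: exponential plus stretched exponential decay."""
    return a * math.exp(-t / tau1) + (1.0 - a) * math.exp(-((t / tau2) ** beta))


def correlation_time(a, tau1, tau2, beta):
    """Closed-form integral of Eq. [73] over t from 0 to infinity."""
    return a * tau1 + (1.0 - a) * (tau2 / beta) * math.gamma(1.0 / beta)


def spectral_density(omega, fit, t_max, n_steps=100000):
    """Cosine transform of the fitted P2 on [0, t_max] (trapezoidal rule).

    Equals Eq. [71] if P2 is extended as an even function of t (assumption).
    t_max must be large enough for P2 to have decayed to zero, and the step
    t_max / n_steps must resolve the oscillation period 2*pi/omega.
    """
    h = t_max / n_steps
    total = 0.5 * (p2_fit(0.0, *fit) + p2_fit(t_max, *fit) * math.cos(omega * t_max))
    for i in range(1, n_steps):
        t = i * h
        total += p2_fit(t, *fit) * math.cos(omega * t)
    return total * h


def inverse_n_t1(omega_h, omega_c, k_const, fit, t_max):
    """Eq. [70]: returns 1/(n*T1); use seconds and rad/s if K is in s^-2."""
    j_diff = spectral_density(omega_h - omega_c, fit, t_max)
    j_c = spectral_density(omega_c, fit, t_max)
    j_sum = spectral_density(omega_h + omega_c, fit, t_max)
    return k_const * (j_diff + 3.0 * j_c + 6.0 * j_sum)
```

When all three frequencies are small compared with the inverse correlation time, each spectral density approaches the time integral of P2(t), and Eq. [70] reduces to 1/(nT1) ≈ 10 K times that integral; `correlation_time` gives this integral in closed form for the fit function.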
As one can see in Figure 16, where two fit functions are included, our ansatz for P2(t) can describe the data very well, except for the sub-picosecond vibrational dynamics, which, however, has a negligible contribution to the spin-lattice relaxation time. From this information, we can then calculate the T1 times for different positions along the chain. For PB, comparison of 13C NMR spin-lattice relaxation times and nuclear Overhauser enhancement (NOE) values from simulation and experiment over a wide range of melt temperatures revealed excellent agreement.168 A comparison between simulation and experiment for two temperatures for six different resonances is shown in Figure 17. Comparing the variation in T1 for carbon atoms in different chemical environments with the variation in mean waiting time between conformational transitions for the different types of torsions present in PB (β, cis-allyl, and trans-allyl), one concludes that spin-lattice relaxation for a given nucleus in PB cannot be associated with the dynamics of any particular torsion. Instead, the 13C relaxation occurs as the result of multiple conformational events involving several neighboring torsions.168 However, a close correspondence was found to torsional autocorrelation times τ_TOR. The torsional autocorrelation time is given by the time integral of the torsional autocorrelation function

\[ \Gamma_{\mathrm{tor}}(t) = \frac{\langle \cos\phi(t) \cos\phi(0) \rangle - \langle \cos\phi(0) \rangle^2}{\langle \cos^2\phi(0) \rangle - \langle \cos\phi(0) \rangle^2} \qquad [74] \]
Figure 18 Temperature dependence of the C–H vector (selected, filled symbols) and torsional correlation (open symbols) times for PB from simulation. Also shown is the mean waiting time between transitions for the cis-allyl, trans-allyl, and β torsions in PB. The solid lines are VF fits, whereas the dashed lines assume an Arrhenius temperature dependence.
where φ(t) is the dihedral angle for a particular torsion at time t and the average is taken over all dihedrals of a given type. The C–H vector correlation time τ_CH is given as the time integral of P2(t) (see Eq. [73], upon which T1 depends). This correspondence is illustrated in Figure 18. The close correspondence between τ_TOR and τ_CH shown in Figure 18 has also been observed in simulations of other polymer melts.152,165 Interestingly, both the C–H vector and the torsional correlation times exhibit stronger than exponential slowing with decreasing temperature, whereas the rate of conformational transitions exhibits Arrhenius temperature dependence as shown in Figure 18. The divergence of time scales between the torsional correlation time and the rate of conformational transitions is a first indicator of increasing dynamic heterogeneity with decreasing temperature. We will come back to this point later on. The T1 values themselves were found to correlate well with τ_CH at higher temperatures as expected for the extreme narrowing regime (NOE is approximately three). However, at lower temperatures, the temperature dependence of T1 corresponds neither to that observed for τ_CH nor to that observed for the mean conformational transition times, which implies that the temperature dependence of the experimentally measurable T1 values reveals no quantitative information about the dynamics of the underlying conformational motions that lead to spin-lattice relaxation.168 A similar analysis in terms of conformational dynamics can be performed as well for the interpretation of neutron scattering data in the picosecond time window159 and dielectric data.156 How do these findings then
relate to the interpretation of translational motion and scattering functions in super-cooled liquids in terms of MCT that was presented above for the bead-spring model? To separate the contributions of dihedral barriers and packing effects, Krushev and Paul111 compared simulations of PB using the chemically realistic force field (CRC) with identical calculations in which all torsion energies were set to zero (FRC = freely rotating chain). Because the different conformational states in PB are almost isoenergetic and the torsion potentials are highly symmetric, no discernible influence was detected on either the single chain structure factor or the liquid structure factor.111 Furthermore, the long-time dynamics was only rescaled by a change in diffusion coefficient.170 All simulations were performed at high temperatures using runs on the order of 100 ns, to ensure that the results were not influenced by quenching in non-equilibrium structures. A comparison of the melt structure factor and the single-chain structure factor of these models shows complete agreement.111 According to mode-coupling theory, both models should then show the same dynamics. Figure 19 reveals that this is not the case: For the FRC model at 273 K, one observes a crossover from short-time vibrational motion to Rouse-like motion, whereas the CRC model shows a well-defined plateau regime between the short-time and the long-time behavior at this temperature. This plateau regime is not present for the CRC model at high temperatures and extends in time as the temperature is lowered. The physical origin for this separation of time scales is not the packing as assumed in MCT but the presence of intramolecular barriers. The short-time vibrational motions are damped out on a time scale of
Figure 19 Mean square monomer displacements, ΔR² [Å²] versus t [ps], using the CRC model of PB at three temperatures (353 K, 273 K, and 240 K) compared with the monomer displacement in an FRC version of the polymer model at 273 K. Also indicated is the Rouse-like regime with the subdiffusive t^0.61 power law entered after the caging regime (CRC at low T) or after the short-time dynamics (FRC and CRC at 353 K).
about 1 ps for all temperatures. Local reorientation needs transitions over the barriers in the torsion potentials (as discussed in detail earlier). These thermally activated processes occur on an average time scale of several picoseconds at high temperatures. Upon lowering the temperature, the waiting time between torsional transitions increases in an Arrhenius fashion (see Figure 18), which leads to a separation of time scales between vibrational motions and structural relaxation. Thus, in polymers, we have a second mechanism for the time scale separation between short-time vibrational motion and α-relaxation besides the packing mechanism considered in MCT. Packing effects can contribute to the increase in waiting times between torsional transitions that require enough thermal energy to overcome the intramolecular barrier as well as space to accommodate the change in local conformation. This may require neighboring chains to move out of the way, which in turn may require those neighbor chains to undergo a torsional transition. So an intricate interaction exists between the intramolecular energetics and the local packing. It is not yet understood how the influence of packing versus the influence of torsional barriers balances as a function of temperature. For the melt regime in PB, however, it has been established that the activation energy for the Arrhenius temperature dependence of the mean waiting time between torsional transitions is given by the intramolecular dihedral barriers alone168 (see Figure 18). A quantitative assessment (by simulations) of the low-temperature dynamics in the super-cooled melt for a polymer like PB requires “well-equilibrated” starting configurations. For the PB model we discuss here these configurations have been generated using parallel tempering techniques171 combined with very long (several hundred nanoseconds) MD runs.
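The two temperature laws contrasted in Figure 18 can be written down compactly. The sketch below is illustrative only: the functional forms are the standard Arrhenius and Vogel–Fulcher (VF) expressions with energies given in kelvin, the function names are hypothetical, and no parameter values from the simulations are implied.

```python
import math


def arrhenius_time(temp, tau0, e_act):
    """Arrhenius law tau = tau0 * exp(E_a / T), with E_a in kelvin."""
    return tau0 * math.exp(e_act / temp)


def vogel_fulcher_time(temp, tau0, b, t0):
    """VF law tau = tau0 * exp(B / (T - T0)), defined for T > T0."""
    if temp <= t0:
        raise ValueError("the VF law is only defined for T > T0")
    return tau0 * math.exp(b / (temp - t0))


def vf_apparent_activation(temp, b, t0):
    """Local slope d ln(tau) / d(1/T) of the VF law: B * T^2 / (T - T0)^2."""
    return b * temp ** 2 / (temp - t0) ** 2
```

For T0 = 0 the VF law reduces to the Arrhenius law. For T0 > 0 the apparent activation energy grows on cooling, which is the stronger-than-exponential slowing seen for the correlation times, whereas an Arrhenius process keeps a constant slope in a plot of ln τ against 1/T.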
We surround “well equilibrated” with quotation marks because, as discussed, one cannot propagate the chemically realistic model chains into the free diffusion limit at low temperatures (temperatures approaching Tc). However, the volume of the model systems can be equilibrated and the runs are more than long enough to equilibrate local conformational statistics. As we discussed, no relevant temperature dependence of the coil-structure exists in PB, which makes this polymer an ideal model system where one can expect to observe only small effects developing from the time-scale limitations of chemically realistic simulations. In the discussion on the dynamics in the bead-spring model, we have observed that the position of the amorphous halo marks the relevant local length scale in the melt structure, and it is also central to the MCT treatment of the dynamics. The structural relaxation time in the super-cooled melt is best defined as the time it takes density correlations of this wave number (i.e., the coherent intermediate scattering function) to decay. In simulations one typically uses the time it takes S(q,t) to decay to a value of 0.3 (or 0.1 for larger q-values). The temperature dependence of this relaxation time scale, which is shown in Figure 20, provides us with a first assessment of the glass transition
Figure 20 Temperature dependence of the α-relaxation time scale for PB, τα(q = 1.4 Å^-1) [ps] versus T [K]. The time is defined as the time it takes for the incoherent (circles) or coherent (squares) intermediate scattering function at a momentum transfer given by the position of the amorphous halo (q = 1.4 Å^-1) to decay to a value of 0.3. The full line is a fit using a VF law with the Vogel–Fulcher temperature T0 fixed to a value (T0 = 127 K) obtained from the temperature dependence of the dielectric α relaxation in PB. The dashed line is a superposition of two Arrhenius laws (see text).
in PB. The temperature dependence of the α-relaxation time can be described over a large temperature interval by a VF law, as was found when computing the diffusion coefficient of the bead-spring model. The Vogel–Fulcher temperature used in Figure 20 is not an independent fitting parameter; it was obtained by determining from the simulation the dielectric α process for temperatures above 253 K.156 T0 determined in this way agrees with the results from dielectric experiments172,173 and with dynamic mechanical measurements.13 This VF law, however, fails to describe the low-temperature behavior of the α time scale, which is purely Arrhenius, as indicated by the dashed fit curve, which is actually a superposition of two Arrhenius laws. The low-temperature Arrhenius behavior has an activation energy of 5650 K, which illustrates the general finding that the VF law is a crossover law that can interpolate successfully between a high- and a low-temperature behavior. Upon approaching the glass transition temperature, which for PB is 178 K, the relaxation time temperature dependence becomes Arrhenius-like and the VF law fails.174,175 The Vogel–Fulcher temperature T0, therefore, is an extrapolation artifact similar to the Kauzmann temperature. In the interval between 198 K and 253 K, the form of the structural relaxation does not change114 as is evidenced by the success of the time-temperature superposition shown in Figure 21. One can also see from this figure that an additional regime intervenes between the short-time dynamics (first 10% of the decay at the lowest temperatures) and the structural relaxation (last 80% of the decay). We will identify this regime as the MCT β-regime
Figure 21 Coherent intermediate scattering functions S(q = 1.4 Å^-1, t) at the position of the amorphous halo, for temperatures between 198 K and 253 K, versus time scaled by the α time (t/τα), which is the time it takes the scattering function to decay by 70%. The thick gray line shows that the α-process can be fitted with a Kohlrausch–Williams–Watts (KWW) law.
later. The duration of this additional regime increases with decreasing temperature, and its amplitude increases with decreasing temperature. We can analyze this intervening time regime as we did for the bead-spring model by fitting it with an extended von Schweidler law

\[ S(q,t) = f_q^c - h_q t^{b} + h_q^{(2)} t^{2b} \ldots \qquad [75] \]
The von Schweidler law describes well the decay from the plateau in both the coherent and the incoherent scattering functions. All correlators for PB can be fitted with a von Schweidler exponent b = 0.3 (see Figure 22). Like the findings by van Zon and de Leeuw148 and by Lyulin and Michels,149 the decay is much more stretched for the chemically realistic PB model than for the bead-spring model. Typical values for the stretching exponent in the KWW fit to the α relaxation (which should approach b for q → ∞) are around 0.5, which agrees well with experimental values. Using the β fit parameters, we can determine the critical temperature in a manner that is similar to what we did for the bead-spring model. The value we obtain this way is Tc = 214 ± 2 K,176 which agrees perfectly with experiment.177 It has so far not been possible to obtain a value for the von Schweidler exponent experimentally, which we can therefore predict by these chemically realistic simulations of PB. Finding that the scattering functions at low temperature are amenable to an MCT description, we are faced with a dilemma. On the one hand, the high-temperature mean-square displacement curves lead us to conclude that dihedral barriers constitute a second mechanism for time scale separation in super-cooled polymer melts besides packing effects. On the other hand, the
Figure 22 von Schweidler fits (dotted lines) to the plateau decay of the coherent intermediate scattering function S(q = 1.4 Å^-1, t) versus t [ps] in the temperature interval 198–253 K.
plateau regime in the low-temperature scattering data is perfectly described by MCT. The resolution of this dilemma can be observed in the fact that for PB MCT does not correctly predict the temperature dependence of the α time scale in the vicinity of Tc. The exponent γ governing the divergence of the α time scale on approaching Tc is much smaller than that calculated from the von Schweidler exponent using the exponent relations of MCT.176 The difference between successful description of the β regime and the failure in the α regime can be an indication that the plateau regime is packing dominated and the structural relaxation is influenced by both mechanisms for time scale separation, but at this point in time, our arguments are only speculative. Additional studies of PB models with modified dihedral barriers are under way178 to provide more insight into which mechanism dominates the relaxation processes in which time regime.
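The exponent relations referred to above are not written out in this section. In the standard idealized MCT they read λ = Γ(1−a)²/Γ(1−2a) = Γ(1+b)²/Γ(1+2b) and γ = 1/(2a) + 1/(2b). The following sketch solves them numerically for a given von Schweidler exponent b; the function names are hypothetical, and the bisection solver is merely one convenient choice.

```python
import math


def exponent_parameter_from_b(b):
    """MCT exponent parameter lambda from the von Schweidler exponent b."""
    return math.gamma(1.0 + b) ** 2 / math.gamma(1.0 + 2.0 * b)


def critical_exponent_a(lam, tol=1e-12):
    """Solve Gamma(1-a)^2 / Gamma(1-2a) = lam for 0 < a < 1/2 by bisection.

    The left-hand side decreases from 1 to 0 on this interval, so a unique
    root exists for 0 < lam < 1.
    """
    lo, hi = 1e-12, 0.5 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        value = math.gamma(1.0 - mid) ** 2 / math.gamma(1.0 - 2.0 * mid)
        if value > lam:
            lo = mid   # ratio still too large: the root lies at larger a
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gamma_exponent(b):
    """Divergence exponent gamma = 1/(2a) + 1/(2b) implied by b."""
    a = critical_exponent_a(exponent_parameter_from_b(b))
    return 1.0 / (2.0 * a) + 1.0 / (2.0 * b)
```

Evaluated for b = 0.3, these relations imply a γ of approximately 4, considerably larger than the value implied by b = 0.75 for the bead-spring model; this is the MCT prediction against which the fitted temperature dependence of the α time scale is compared.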
DYNAMIC HETEROGENEITY

It has been clearly demonstrated by experiments as well as by simulations that the glass transition phenomenon is associated with an increase in dynamic heterogeneity in the motion of the glass-forming moieties. This heterogeneity is best observed experimentally at temperatures close to or below Tg, where the heterogeneity is well developed.179–181 The existence of domains of fast- and slow-moving molecules is closely connected with the existence of a characteristic length scale measuring correlated behavior and with the temperature dependence of this length scale.182 As far as geometrical information can be inferred from the experiments (as, for example, in References 183–185), these regions
seem to be on the order of a few nanometers in size, even at and below Tg, and there is no indication that this length scale is becoming macroscopic. Using computer simulations of simple liquids,186,187 it was suggested that the non-Gaussianity of the van Hove correlation functions for tagged particle motion can identify fast-moving particles. Typically, the van Hove function Gs(r,t) from the simulation develops a tail when compared with the Gaussian approximation Gs^G(r,t) having the same second moment, i.e., having the same mean-square displacement of the molecules. Defining the fraction of molecules with displacements beyond the crossing point r* defined by Gs(r*,t) = Gs^G(r*,t) as fast-moving particles yields around 6% of fast particles. This definition of fast-moving and variants thereof have been applied to identify such particles and an eventual clustering phenomenon from simulations of the bead-spring polymer model.188–190 When one studies the clustering properties of these fast particles, one finds that the clusters are typically very ramified, string-like objects. The average mass of the clusters as a function of time lag from a starting configuration has a peak in the vicinity of the late β regime, where the non-Gaussianity of the van Hove function is maximum. At this point in time, the particles break out of their cages. This peak in the average cluster size is a consequence of the identification of the fast particles as those being faster than predicted by the Gaussian behavior. A dynamic correlation length (defined as the weight average mean-squared size of the cluster) increases only from ξ ≈ 2.5σ to ξ ≈ 3.1σ on approaching Tc from above, where σ is the Lennard–Jones radius of the bead-spring monomers. This small increase is compatible with the experimental finding that the typical size of a domain of fast relaxing molecules at temperatures below Tc is only a few nanometers, which translates into between 3 and 10 Lennard–Jones radii. The new qualitative insight obtained from the cluster analysis is the string-like character of clusters of fast particles, which means that these particles tend to follow each other along their paths of movement. However, only very short scale correlation of this motion exists. There is also no observable tendency for the bead-spring monomers to move along their chain contour to take up the position of their bonded neighbor.
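A simple way to quantify the non-Gaussianity mentioned above is the standard non-Gaussian parameter α2 = 3⟨r⁴⟩/(5⟨r²⟩²) − 1, which vanishes for a Gaussian displacement distribution in three dimensions. This scalar measure is not the van Hove crossing-point construction described in the text; it is shown here as a common companion diagnostic. The sketch below also counts the fraction of particles displaced beyond a threshold, which must be supplied by the user (for example, a crossing point determined separately). Function names and the input layout are assumptions of this illustration.

```python
def non_gaussian_parameter(displacements):
    """alpha_2 = 3<r^4> / (5 <r^2>^2) - 1 for 3D displacement vectors.

    displacements is a list of (dx, dy, dz) tuples, one per particle, all
    measured over the same time lag.
    """
    n = len(displacements)
    r2 = [dx * dx + dy * dy + dz * dz for dx, dy, dz in displacements]
    m2 = sum(r2) / n
    m4 = sum(v * v for v in r2) / n
    return 3.0 * m4 / (5.0 * m2 * m2) - 1.0


def fast_fraction(displacements, r_star):
    """Fraction of particles whose displacement magnitude exceeds r_star."""
    n_fast = sum(
        1 for dx, dy, dz in displacements
        if dx * dx + dy * dy + dz * dz > r_star * r_star
    )
    return n_fast / len(displacements)
```

Evaluating `non_gaussian_parameter` as a function of time lag locates the time at which the deviation from Gaussian behavior is largest; `fast_fraction` then gives the share of particles beyond the chosen threshold at that time lag.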
In contrast, for PB, a clear tendency exists for a monomer to replace its bonded neighbor on the average time scale of a torsional transition.191 Torsional transitions combined with the local packing in a melt of chemically realistic chains (see the discussion on structural properties) give rise to a distinct type of motion191 as shown in Figure 23. Here we show isosurfaces of the intrachain distinct part of the van Hove function, that is, the probability that one particle of a chain is at the origin at time zero and another particle is at position r at time t. The structure along the r-axis for small t gives the intrachain radial distribution function with distinct peaks created by the C–C bond length, the average C–C–C bond angle, and the torsional isomers. On the average time scale of a torsional transition (around 100 ps), the bonded neighbor moves into the space
Figure 23 Isosurface of the intrachain distinct part of the van Hove function projected onto the time-distance plane. For t → 0, one observes the intrachain pair correlation function along the radial axes. On the average time scale of a torsional transition, a bonded neighbor moves into the position that the center particle occupied at time zero; i.e., the chain slithers along its contour.
that had been occupied by the reference united atom at time zero. That is, the torsional transitions lead to a slithering motion of the chain along its contour. This motion is also observable in the van Hove self-functions for the attached hydrogen atoms,192 and it influences the scattering at lower temperatures.164 As mentioned, the measures for dynamic heterogeneity typically applied to the bead-spring model are similar to those used for simple liquids. For chemically realistic polymer models, however, a much simpler measure of dynamic heterogeneity has been used for many years in this type of simulation. This heterogeneity is found in transition rates for different chemically identical dihedrals in the melt. This heterogeneity shows up most strikingly in the divergence in the rate of conformational transitions for dihedrals, which follow an Arrhenius temperature dependence with activation barriers resulting primarily from internal rotation barriers, and the relaxation time for the torsional autocorrelation function154,156,165,168,193,194 shown in Figure 18. Although at high temperatures the relaxation times for torsional autocorrelation functions closely follow the rate of conformational transitions,165 the former exhibit Vogel–Fulcher-like temperature dependence, whereas the latter stay Arrhenius-like. The torsional autocorrelation times are closely related to the rates of local relaxation, which give rise to experimentally measurable quantities like the NMR spin-lattice relaxation time, as discussed. The Vogel–Fulcher parameters that one typically finds for the relaxation time scales for the torsional autocorrelation function agree well with those parameters obtained for dielectric or magnetic relaxation from experiment.168 The divergence between the rate of conformational transitions and the decay of the torsional autocorrelation functions (and hence local relaxations)
in amorphous polymers may lie in the increasingly heterogeneous nature of conformational transitions with decreasing temperature, which is an observation that has now been confirmed by computer simulations.152,154,156,166,168,193–196 Heterogeneity here is defined as some dihedrals exhibiting much faster motion than average dynamics, whereas other (chemically equivalent) dihedrals exhibit much slower motion than average. The mean conformational transition time, which follows Arrhenius temperature dependence, is sensitive to fast events. For example, a few fast dihedrals may exist in a system, with the remaining being quiescent. This system can yield the same mean conformational transition time as a system in which all dihedrals have average dynamics. On the other hand, for the torsional autocorrelation function to decay, and hence for 13C NMR relaxation and dielectric relaxation to occur, each dihedral in the polymer must visit its available conformational states with ensemble average probability. Heterogeneity in conformational dynamics can be quantified by measuring the distribution of waiting times for a given number of transitions. For PB, as shown in Figure 24,156 one observes an increase in probability for very short waiting times, which arises from correlated transitions. Furthermore, a long-time tail develops, indicating the existence of very slow dihedrals. These results are compared in the figure with the expected Poisson behavior of independent random events. At high temperatures, the distribution approaches the expected Poisson behavior. When reducing the temperature, the distribution becomes increasingly heterogeneous.
Figure 24 Probability distributions for the waiting time for 10 dihedral transitions. Time is given in units of the average waiting time 10t. The distributions are peaked around 10t = 1 and are much broader than the Poisson distribution but approach it for high T. For low T, a high probability for short waiting times exists and a long-time tail of the distribution develops.
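The Poisson reference used in this comparison can be written down in closed form: for independent random events, the waiting time for n transitions follows an Erlang (gamma) distribution. The following minimal Python sketch illustrates this under stated assumptions; the function names, the synthetic transition data, and the chosen rates are hypothetical and are not taken from the simulations discussed here. It evaluates the reference density in units of the mean waiting time and compares the second central moment of a homogeneous set of dihedrals with that of a pooled set of fast and slow dihedrals.

```python
import math
import random


def erlang_reduced_pdf(x, n=10):
    """Density of the waiting time for n independent Poisson events,
    with time measured in units of the mean waiting time."""
    if x <= 0.0:
        return 0.0
    # Gamma (Erlang) density with shape n and mean 1, evaluated in log space.
    log_p = n * math.log(n) + (n - 1) * math.log(x) - n * x - math.lgamma(n)
    return math.exp(log_p)


def waiting_times(event_times, n=10):
    """Waiting times for n successive transitions of a single dihedral,
    taken over non-overlapping windows of a sorted list of transition times."""
    return [event_times[i + n] - event_times[i]
            for i in range(0, len(event_times) - n, n)]


def reduced_dispersion(samples):
    """Second central moment of the samples after rescaling by their mean."""
    mean = sum(samples) / len(samples)
    return sum((s / mean - 1.0) ** 2 for s in samples) / len(samples)


def synthetic_transitions(rate, count, rng):
    """Hypothetical test data: transition times of a Poisson process."""
    t, times = 0.0, []
    for _ in range(count):
        t += rng.expovariate(rate)
        times.append(t)
    return times


rng = random.Random(0)
# Homogeneous case: every dihedral has the same transition rate.
homogeneous = waiting_times(synthetic_transitions(1.0, 200_000, rng))
# Heterogeneous case: pooled waiting times of fast and slow dihedrals.
heterogeneous = (waiting_times(synthetic_transitions(5.0, 100_000, rng))
                 + waiting_times(synthetic_transitions(0.2, 100_000, rng)))
print(reduced_dispersion(homogeneous), reduced_dispersion(heterogeneous))
```

For the Erlang reference the reduced second central moment is 1/n, so any excess over this value in simulation data is one possible indicator of heterogeneity. The pooling of two Poisson populations is only a caricature of a real melt, in which slow and fast dihedrals exchange roles over time.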
The dispersion of this waiting time distribution, i.e., its second central moment, is a measure that we can use to define a “homogenization” time scale, on which the dispersion equals that of a homogeneous (Poisson) system on a time scale given by the torsional autocorrelation time. The homogenization time scale shows a clear non-Arrhenius temperature dependence and is comparable with the time scale for dielectric relaxation at low temperatures.156 The source of the emerging heterogeneity (slow movers and fast movers) in the conformational dynamics of amorphous polymers with decreasing temperature remains unknown. In light of the packing arguments underlying MCT, it is reasonable to attempt to associate differences in transition rates with the local packing environment, i.e., a dense packing environment for slow dihedrals and a looser packing environment for faster dihedrals. The work by Jin and Boyd153 in this direction and efforts by de Pablo’s group197 to relate heterogeneous dynamics to inhomogeneous stress distributions have so far been inconclusive. Phenomenologically, it is clear from simulations that conformational transitions become increasingly self-correlated with decreasing temperature. Here, self-correlation is defined in terms of the probability that, once a dihedral undergoes a transition, the same dihedral undergoes another transition (usually back to the original state) before a neighboring torsion undergoes a transition. It has been observed for several polymer melts193,194 that the probability of self-correlation increases dramatically with decreasing temperature.
Self-correlation may account for the decreasing effectiveness of conformational transitions in inducing relaxation as the temperature decreases: relatively few torsions jumping back and forth rapidly between two conformational states can contribute significantly to the rate of conformational transitions, but they contribute little to the experimentally observable local polymer relaxations. The homogenization (or equilibration) process of the torsional transitions can be examined in even more detail. For example, one can differentiate between a first regime, in which every dihedral becomes mobile and is able to visit other isomeric states, and a second regime, in which the frequency of these visits approaches its thermal equilibrium value. Bedrov and Smith198,199 have recently analyzed the time scales for these two regimes using modified PB models and have shown that the first regime corresponds to the time scale of the dielectric β-relaxation, whereas the second regime follows the α-relaxation time scale. This study is a first inroad into a mechanistic understanding of dielectric processes based on MD simulations.
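The self-correlation probability described in this section can be estimated from a list of recorded transition events. The following Python sketch is one possible implementation under stated assumptions: the function name, the event format (time, dihedral index), and the choice that dihedrals whose indices differ by at most `neighbor_range` along one chain count as neighbors are all hypothetical and not taken from the cited studies.

```python
def self_correlation_probability(events, neighbor_range=1):
    """Fraction of transitions that are followed by another transition of the
    same dihedral before any neighboring dihedral makes a transition.

    events: iterable of (time, dihedral_index) pairs for one chain.
    neighbor_range: how many positions along the chain count as neighbors
    (an assumption of this sketch).
    """
    ordered = sorted(events)
    followed_by_self = 0
    counted = 0
    for k, (_, i) in enumerate(ordered):
        # Look for the next transition of this dihedral or one of its neighbors.
        for m in range(k + 1, len(ordered)):
            j = ordered[m][1]
            if abs(j - i) <= neighbor_range:
                counted += 1
                if j == i:
                    followed_by_self += 1
                break
    return followed_by_self / counted if counted else float("nan")


# Hypothetical event list: dihedral 5 jumps twice before its neighbor 6 moves.
example = [(0.0, 5), (1.0, 5), (2.0, 6), (3.0, 5), (4.0, 9)]
print(self_correlation_probability(example))
```

Transitions that have no later event in their neighborhood are left out of the count, so the estimate is not biased by the finite length of the trajectory. The double loop is quadratic in the number of events, which is adequate for a sketch but would need an indexed search for long trajectories.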
SUMMARY
This chapter has the title “Determining the Glass Transition in Polymer Melts,” but we might ask: “Which glass transition?” Do we consider
the Vogel–Fulcher temperature T0, the calorimetric (or viscosimetric) glass transition temperature Tg, or the mode-coupling critical temperature Tc to mark the transition? The calorimetric or viscosimetric definitions of the glass transition temperature are arbitrary because they single out a temperature where the intrinsic relaxation time of the glass-forming system is approximately 100 s. When one performs either a calorimetric experiment or a volumetric experiment with suitable cooling rates, a smeared step in the corresponding thermodynamic response function (specific heat or isobaric expansion coefficient) is observed around Tg. When one is more patient in performing the experiment, as exemplified by the classic work by Kovacs, the temperature at which the step in the response function is observed shifts to lower temperatures. For the class of fragile glass formers, to which most polymers belong, this shift is small, which gives some validity to the arbitrary definition of Tg. For strong glass formers, however, the shift can be of the same order as Tg itself (for example, for silica melts). Also, there is no qualitative change in any system-specific time scale near the viscosimetric Tg. From a basic physical point of view, the calorimetric or viscosimetric glass transition temperature is therefore an inadequate measure to use for defining the glass transition. We have shown in the section on the thermodynamics of the glass transition that extrapolating $T_g(\Gamma)$, the cooling-rate-dependent step temperature in the thermal expansion coefficient, while postulating the empirical Vogel–Fulcher law as a description of the temperature dependence of internal relaxation times, yields $T_g(\Gamma \to 0) = T_0 = 0.35$ for the bead-spring model. This is in good agreement with what one obtains for T0 from the temperature dependence of the chain center-of-mass self-diffusion coefficient for this model.
Additionally, one generally finds that the so-called Kauzmann temperature (where the extrapolated excess entropy of the supercooled liquid in comparison with the crystal seems to vanish) agrees closely with T0. This agreement has led people to speculate about an underlying phase transition around the Kauzmann temperature, especially based on the Gibbs–Di Marzio theory for the excess configurational entropy of polymer melts, which also produced a Kauzmann paradox. We have shown, however, that, although the theory nicely reproduced the shape of the entropy curve as a function of temperature, the predicted absolute values were too small compared with the simulation data and the entropy catastrophe was just a consequence of inaccurate approximations. Furthermore, careful studies revealed that the Vogel–Fulcher law cannot describe the temperature dependence of the α-relaxation time scale close to Tg. Instead, this temperature dependence can be described by a low-temperature Arrhenius law to which the high-temperature dependence crosses over. Thus, the Vogel–Fulcher temperature and the Kauzmann temperature are purely extrapolation artifacts having no underlying physical significance.
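Why a Vogel–Fulcher extrapolation of the cooling-rate-dependent step temperature necessarily ends at T0 can be made explicit with a short derivation. The sketch below rests on illustrative assumptions that are not spelled out in this summary: Γ denotes the cooling rate, the relaxation time follows the Vogel–Fulcher form, and the melt falls out of equilibrium when the product of relaxation time and cooling rate reaches a constant C. The symbols τ∞, B, C, and A are introduced here only for illustration.

```latex
\begin{align}
  \tau(T) &= \tau_\infty \exp\!\left[\frac{B}{T - T_0}\right]
    && \text{(Vogel--Fulcher law)} \\
  \tau\bigl(T_g(\Gamma)\bigr)\,\Gamma &= C
    && \text{(assumed criterion for falling out of equilibrium)} \\
  \Rightarrow\quad T_g(\Gamma) &= T_0 + \frac{B}{\ln\!\bigl[C/(\tau_\infty \Gamma)\bigr]}
    = T_0 - \frac{B}{\ln(A\,\Gamma)}, \qquad A = \frac{\tau_\infty}{C} \\
  \lim_{\Gamma \to 0} T_g(\Gamma) &= T_0
\end{align}
```

In this form the limit Γ → 0 returns T0 by construction, so the extrapolated value is only as meaningful as the assumed Vogel–Fulcher law itself.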
The only remaining candidate for the definition of the glass transition temperature is therefore the crossover temperature from the high-temperature behavior to the low-temperature behavior of the α-relaxation time scale. This crossover occurs around some temperature Tx, which one also finds to be the merging temperature of the dielectric α- and β-relaxations and the critical temperature of mode-coupling theory. Physically, at Tx, a crossover occurs from a high-temperature transport (or relaxation) mechanism to a low-temperature mechanism. For simple liquids, MCT identifies this crossover as originating from the cage effect. At high temperatures, the cage of neighboring particles opens up on the same time scale it takes the central particle to reach its boundary. In contrast, at lower temperatures, the cage particles are themselves caged and an activated process is needed for the central particle to be able to leave its cage. This simple picture, and the theoretical predictions for the behavior of the density fluctuations as the dominant slow variables within the theory, have been tested on diverse systems like hard-sphere colloids,200 Lennard-Jones mixtures,137 and silica melts.201 We have shown that MCT can also describe fairly well the glass transition in a bead-spring polymer melt and in a chemically realistic model of PB. Deviations between MCT predictions and simulation results arising from chain connectivity, which were found in the bead-spring chain simulations, have been incorporated into an extension of the theory. For a chemically realistic model, we have shown that a competition exists between two mechanisms, which leads to a time scale separation between vibrational motion and structural relaxation. One mechanism is related to the packing effects captured by MCT, and the other arises from the presence of dihedral barriers. How these two mechanisms can be joined theoretically remains an open question.
The answer to our question at the beginning of this summary therefore has to be as follows. When you want to locate the glass transition of a polymer melt, find the temperature at which a change in dynamics occurs. You will be able to observe a developing time-scale separation between short-time, vibrational dynamics and structural relaxation in the vicinity of this temperature. Below this crossover temperature, you will find that the temperature dependence of relaxation times follows an Arrhenius law. Whether MCT is the final answer to describe this process in complex liquids like polymers may be a point of debate, but this crossover temperature is the temperature at which the glass transition occurs.
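One possible way to turn this recipe into a data analysis step is sketched below in Python. It fits an Arrhenius line to the relaxation times at the lowest temperatures and reports the lowest sampled temperature at which the data depart from that line. The function names, the parameters `n_low` and `tol`, and the synthetic relaxation times (a Vogel–Fulcher-like stand-in for the high-temperature branch joined smoothly to an Arrhenius branch at a hypothetical crossover temperature of 0.5) are assumptions made for illustration only and do not reproduce any data from this chapter.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def crossover_temperature(temps, taus, n_low=4, tol=0.05):
    """Lowest sampled temperature at which ln(tau) departs by more than tol
    from an Arrhenius line fitted to the n_low lowest temperatures.
    Returns None if no sampled point departs from the line."""
    data = sorted(zip(temps, taus))            # ascending temperature
    inv_t = [1.0 / t for t, _ in data]
    ln_tau = [math.log(tau) for _, tau in data]
    # Arrhenius law: ln(tau) = intercept + slope / T, with slope = E_a / k_B.
    slope, intercept = fit_line(inv_t[:n_low], ln_tau[:n_low])
    for (t, _), x, y in zip(data, inv_t, ln_tau):
        if abs(y - (slope * x + intercept)) > tol:
            return t
    return None


def synthetic_tau(t, t_x=0.5, t_0=0.35, b=1.0):
    """Hypothetical relaxation time: Vogel-Fulcher-like above t_x,
    Arrhenius below, matched in value and slope at t_x."""
    if t >= t_x:
        return math.exp(b / (t - t_0))
    e_a = b * t_x ** 2 / (t_x - t_0) ** 2      # slope of ln(tau) versus 1/T at t_x
    return math.exp(b / (t_x - t_0) + e_a * (1.0 / t - 1.0 / t_x))


temps = [0.40, 0.42, 0.44, 0.46, 0.48, 0.50, 0.55, 0.60, 0.70, 0.80, 1.00]
taus = [synthetic_tau(t) for t in temps]
print(crossover_temperature(temps, taus))
```

The resolution of such an estimate is limited by the temperature grid: the crossover lies between the returned temperature and the next lower sampled one. With noisy simulation data, `tol` would have to be chosen relative to the statistical error of the relaxation times.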
ACKNOWLEDGMENTS
I am grateful to all my collaborators on the different aspects of the polymer glass transition, and in particular I would like to thank J. Baschnagel, D. Bedrov, K. Binder, and G. D. Smith for a longstanding, stimulating, and fruitful collaboration. Funding through the German Science Foundation under Grant PA 437/3 and the BMBF under Grant 03N6015 is gratefully acknowledged.
REFERENCES
1. K. Binder and W. Kob, Glassy Materials and Disordered Solids, World Scientific, Singapore, 2005.
2. K. L. Ngai, E. Riande, and M. D. Ingram, Eds., J. Non-Cryst. Solids, 235-237, 1 (1998). Proceedings of the Third International Discussion Meeting on Relaxations in Complex Systems.
3. K. L. Ngai, G. Floudas, A. K. Rizos, and E. Riande, Eds., J. Non-Cryst. Solids, 307-310, 1–1080 (2002). Proceedings of the Fourth International Discussion Meeting on Relaxations in Complex Systems.
4. A. J. Kovacs, J. Polymer Sci., 30, 131 (1958). La Contraction Isotherme du Volume des Polymères Amorphes.
5. C. A. Angell, J. Res. Natl. Inst. Stand. Technol., 102, 171 (1997). Entropy and Fragility in Supercooling Liquids.
6. I. Gutzow and J. Schmelzer, The Vitreous State. Thermodynamics, Structure, Rheology and Crystallization, Springer, Berlin, 1995.
7. J. Jäckle, Rep. Progr. Phys., 49, 171 (1986). Models of the Glass Transition.
8. J. Zarzycki, Ed., Glasses and Amorphous Materials, Materials Science and Technology, Vol. 9, VCH Publishers, Weinheim, 1991.
9. G. B. McKenna, in Comprehensive Polymer Science: Vol 2, C. Booth and C. Price, Eds., Pergamon, Oxford, 1989, pp. 311–362. Glass Formation and Glassy Behavior.
10. G. R. Strobl, The Physics of Polymers, Springer, Berlin, 1996.
11. P. G. De Gennes, Scaling Concepts in Polymer Physics, Cornell University Press, Ithaca, 1979.
12. M. Doi and S. F. Edwards, The Theory of Polymer Dynamics, Clarendon Press, Oxford, 1986.
13. J. D. Ferry, Viscoelastic Properties of Polymers, Wiley, New York, 1980.
14. T. P. Lodge, N. A. Rotstein, and S. Prager, Adv. Chem. Phys., 79, 1 (1991). Dynamics of Entangled Polymer Liquids. Do Linear Chains Reptate?
15. D. Richter, M. Monkenbusch, A. Arbe, and J. Colmenero, Neutron Spin Echo in Polymer Systems, Advances in Polymer Science, Vol. 174, Springer, Berlin, 2005.
16. W. Paul, G. D. Smith, D. Y. Yoon, B. Farago, S. Rathgeber, A. Zirkel, L. Willner, and D. Richter, Phys. Rev. Lett., 80, 2346 (1998). Chain Motion in an Unentangled Polymer Melt: A Critical Test of the Rouse Model by Molecular Dynamics Simulations and Neutron Spin Echo Spectroscopy.
17. G. D. Smith, W. Paul, M. Monkenbusch, L. Willner, D. Richter, X. H. Qiu, and M. D. Ediger, Macromolecules, 32, 8857 (1999). Molecular Dynamics of a 1,4-Polybutadiene Melt. Comparison of Experiment and Simulation.
18. W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell, and P. A. Kollman, J. Am. Chem. Soc., 117, 5179 (1995). A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules.
19. J. W. Ponder and D. A. Case, Adv. Prot. Chem., 66, 27 (2003). Force Fields for Protein Simulations.
20. W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives, J. Am. Chem. Soc., 118, 11225 (1996). Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids.
21. H. Sun, J. Phys. Chem. B, 102, 7338 (1998). COMPASS: An Ab Initio Force-Field Optimized for Condensed-Phase Applications-Overview with Details on Alkane and Benzene Compounds.
22. A. D. MacKerell Jr., D. Bashford, R. L. Bellott, R. L. Dunbrack Jr., J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher III, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, J. Phys. Chem. B, 102, 3586 (1998). All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins.
23. J. R. Maple, M. J. Hwang, T. P. Stockfish, U. Dinur, M. Waldman, C. S. Ewig, and A. T. Hagler, J. Comput. Chem., 15, 162 (1994). Derivation of Class II Force Fields. I. Methodology and Quantum Force Field for the Alkyl Functional Group and Alkane Molecules.
24. J. P. Bowen and N. L. Allinger, in Reviews in Computational Chemistry, Vol. 2, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, Weinheim, Germany, 1991, pp. 81–98. Molecular Mechanics: The Art and Science of Parameterization.
25. U. Dinur and A. T. Hagler, in Reviews in Computational Chemistry, Vol. 2, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, Weinheim, Germany, 1991, pp. 99–164. New Approaches to Empirical Force Fields.
26. C. R. Landis, D. M. Root, and T. Cleveland, in Reviews in Computational Chemistry, Vol. 6, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, Weinheim, Germany, 1995, pp. 73–148. Molecular Mechanics Force Fields for Modeling Inorganic and Organometallic Compounds.
27. S. L. Price, in Reviews in Computational Chemistry, Vol. 14, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley-VCH, New York, 2000, pp. 225–289. Toward More Accurate Model Intermolecular Potentials For Organic Molecules.
28. M. Jalaie and K. B. Lipkowitz, in Reviews in Computational Chemistry, Vol. 14, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley-VCH, New York, 2000, pp. 441–486. Appendix: Published Force Field Parameters for Molecular Mechanics, Molecular Dynamics, and Monte Carlo Simulations.
29. G. D. Smith, in Handbook of Materials Modeling, S. Yip, Ed., Springer, New York, 2005, pp. 2561–2573. Atomistic Potentials for Polymers and Organic Materials.
30. A. Arnold and C. Holm, in Advanced Computer Simulation Approaches for Soft Matter Sciences II, Advances in Polymer Science, Vol. 185, K. Kremer and C. Holm, Eds., Springer, Berlin, 2005, pp. 59–109. Efficient Methods to Compute Long-Range Interactions for Soft Matter Systems.
31. O. Borodin and G. D. Smith, J. Phys. Chem. B, 107, 6801 (2003). Development of Quantum Chemistry-Based Force Fields for Poly(ethylene oxide) with Many-Body Polarization Interactions.
32. O. Borodin and G. D. Smith, in Computational Materials Chemistry: Methods and Applications, L. Curtiss and M. S. Gordon, Eds., Kluwer Academic Publishers, Dordrecht, The Netherlands, 2004, pp. 35–90. Molecular Modeling of Poly(ethylene oxide) Melts and Poly(ethylene oxide) Based Polymer Electrolytes.
33. “Transferable Potentials from Phase Equilibria” force fields. Available: http://siepmann6.chem.umn.edu/trappe/intro.php.
34. M. Kutteh and T. P. Straatsma, in Reviews in Computational Chemistry, Vol. 12, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, Weinheim, Germany, 1998, pp. 75–136. Molecular Dynamics with General Holonomic Constraints and Application to Internal Coordinate Constraints.
35. W. Paul, in Computational Soft Matter: From Synthetic Polymers to Proteins, NIC Symposium Series, Jülich, Germany, 2004, pp. 169–193. Chemically Realistic Computer Simulations of Polymer Melts: Equilibration Issues and Study of Relaxation Processes.
36. K. Binder, Ed., Monte Carlo and Molecular Dynamics Simulations in Polymer Science, Oxford University Press, Oxford, 1995.
37. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
38. M. Kotelyanski and D. N. Theodorou, Eds., Simulation Methods for Polymers, Marcel Dekker, New York, 2004.
39. S. C. Glotzer and W. Paul, Annu. Rev. Mater. Res., 32, 401 (2002). Molecular and Mesoscale Simulation Methods for Polymer Materials.
40. K. Kremer and K. Binder, Comput. Phys. Rep., 7, 259 (1988). Monte Carlo Simulation of Lattice Models for Macromolecules.
41. K. Binder and W. Paul, J. Polym. Sci. Polym. Phys., 35, 1 (1997). Monte Carlo Simulations of Polymer Dynamics: Recent Advances.
42. J. Baschnagel, J. P. Wittmer, and H. Meyer, in Computational Soft Matter: From Synthetic Polymers to Proteins, NIC Symposium Series, Jülich, Germany, 2004, pp. 83–140. Monte Carlo Simulation of Polymers: Coarse-Grained Models.
43. R. Faller, in Reviews in Computational Chemistry, Vol. 23, K. B. Lipkowitz and T. R. Cundari, Eds., Wiley-VCH, New York, 2006. Coarse Grain Modelling of Polymers.
44. K. Kremer and G. S. Grest, J. Chem. Phys., 92, 5057 (1990). Dynamics of Entangled Linear Polymer Melts: A Molecular Dynamics Simulation.
45. I. Gerroff, A. Milchev, K. Binder, and W. Paul, J. Chem. Phys., 98, 6526 (1993). A New Off-Lattice Monte Carlo Model for Polymers: A Comparison of Static and Dynamic Properties with the Bond-Fluctuation Model and Application to Random Media.
46. H. Meyer and F. Müller-Plathe, Macromolecules, 35, 1241 (2002). Formation of Chain-Folded Structures in Supercooled Polymer Melts Examined by MD Simulations.
47. I. Carmesin and K. Kremer, Macromolecules, 21, 2819 (1988). The Bond Fluctuation Method: A New Effective Algorithm for the Dynamics of Polymers in All Spatial Dimensions.
48. H.-P. Wittmann and K. Kremer, Comput. Phys. Commun., 61, 309 (1990). Vectorized Version of the Bond Fluctuation Method for Lattice Polymers; Erratum Notice, ibid., 71, 343 (1992).
49. H.-P. Deutsch and K. Binder, J. Chem. Phys., 94, 2294 (1991). Interdiffusion and Self-Diffusion in Polymer Mixtures: A Monte Carlo Study.
50. W. Paul, K. Binder, W. W. Heermann, and K. Kremer, J. Phys. II, 1, 37 (1991). Crossover Scaling in Semidilute Polymer Solutions: A Monte Carlo Test.
51. D. P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, Cambridge, 2000.
52. S. Geyler, T. Pakula, and J. Reiter, J. Chem. Phys., 92, 2676 (1990). Monte Carlo Simulation of Dense Polymer Systems on a Lattice.
53. P. V. K. Pant and D. N. Theodorou, Macromolecules, 28, 7224 (1995). Variable Connectivity Method for the Atomistic Monte Carlo Simulation of Polydisperse Polymer Melts.
54. V. G. Mavrantzas, T. D. Boone, E. Zervopoulou, and D. N. Theodorou, Macromolecules, 32, 5072 (1999). End-Bridging Monte Carlo: A Fast Algorithm for Atomistic Simulation of Condensed Phases of Long Polymer Chains.
55. V. G. Mavrantzas and D. N. Theodorou, Comput. Theor. Polym. Sci., 10, 1 (2000). Atomistic Simulation of the Birefringence of Uniaxially Stretched Polyethylene Melts.
56. N. C. Karayiannis, V. G. Mavrantzas, and D. N. Theodorou, Phys. Rev. Lett., 88, 105503 (2002). A Novel Monte Carlo Scheme for the Rapid Equilibration of Atomistic Model Polymer Systems of Precisely Defined Molecular Architecture.
57. N. C. Karayiannis, A. E. Giannousaki, V. G. Mavrantzas, and D. N. Theodorou, J. Chem. Phys., 117, 5465 (2002). Atomistic Monte Carlo Simulation of Strictly Monodisperse Long Polyethylene Melts through a Generalized Chain Bridging Algorithm.
58. A. Uhlherr, V. G. Mavrantzas, M. Doxastakis, and D. N. Theodorou, Macromolecules, 34, 8554 (2001). Directed Bridging Methods for Fast Atomistic Monte Carlo Simulations of Bulk Polymers.
59. A. Uhlherr, M. Doxastakis, V. G. Mavrantzas, D. N. Theodorou, S. J. Leak, N. E. Adam, and P. E. Nyberg, Europhys. Lett., 57, 506 (2002). Atomic Structure of a High Polymer Melt.
60. M. Doxastakis, V. G. Mavrantzas, and D. N. Theodorou, J. Chem. Phys., 115, 11339 (2001). Atomistic Monte Carlo Simulation of cis-1,4 Polyisoprene Melts. I. Single Temperature End-Bridging Monte Carlo Simulations.
61. M. Doxastakis, V. G. Mavrantzas, and D. N. Theodorou, J. Chem. Phys., 115, 11352 (2001). Atomistic Monte Carlo Simulation of cis-1,4 Polyisoprene Melts. II. Parallel Tempering End-Bridging Monte Carlo Simulations.
62. W. Paul and M. Müller, J. Chem. Phys., 115, 630 (2001). Enhanced Sampling in Simulations of Dense Systems: The Phase Behavior of Collapsed Polymer Globules.
63. R. C. van Schaik, H. J. C. Berendsen, A. Torda, and W. F. van Gunsteren, J. Mol. Biol., 234, 751 (1993). A Structure Refinement Method Based on Molecular Dynamics in Four Spatial Dimensions.
64. T. C. Beutler and W. F. van Gunsteren, J. Chem. Phys., 101, 1417 (1994). Molecular Dynamics Free Energy Calculation in Four Dimensions.
65. G. Chikenji, M. Kikuchi, and Y. Iba, Phys. Rev. Lett., 83, 1886 (1999). Multi-Self-Overlap Ensemble for Protein Folding: Ground State Search and Thermodynamics.
66. H. Yoshida, Phys. Lett. A, 150, 262 (1990). Construction of Higher Order Symplectic Integrators.
67. M. Tuckerman, B. J. Berne, and G. J. Martyna, J. Chem. Phys., 97, 1990 (1992). Reversible Multiple Time Scale Molecular Dynamics.
68. A. Kopf, B. Dünweg, and W. Paul, Comput. Phys. Commun., 101, 1 (1997). Multiple Time Step Integrators and Momentum Conservation.
69. J.-P. Ryckaert, G. Ciccotti, and H. J. C. Berendsen, J. Comput. Phys., 23, 327 (1977). Numerical Integration of the Cartesian Equations of Motion of a System With Constraints: Molecular Dynamics of n-Alkanes.
70. H. C. Andersen, J. Comput. Phys., 52, 24 (1983). Rattle: A “Velocity” Version of the Shake Algorithm for Molecular Dynamics Calculations.
71. H. C. Öttinger, Stochastic Processes in Polymeric Fluids, Springer, New York, 1996.
72. W. Paul and J. Baschnagel, Stochastic Processes: From Physics to Finance, Springer, Berlin, 1999.
73. P. E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations, Springer, Berlin, 1995.
74. W. Paul and D. Y. Yoon, Phys. Rev. E, 92, 2076 (1995). Stochastic Phase Space Dynamics With Constraints for Molecular Systems.
75. S. Nosé, Molec. Phys., 52, 255 (1984). A Molecular Dynamics Method for Simulations in the Canonical Ensemble.
76. W. G. Hoover, Phys. Rev. A, 31, 1695 (1985). Canonical Dynamics: Equilibrium Phase Space Distributions.
77. H. C. Andersen, J. Chem. Phys., 72, 2384 (1980). Molecular Dynamics Simulations at Constant Pressure and/or Temperature.
78. W. Paul and J. Baschnagel, in Monte Carlo and Molecular Dynamics Simulations in Polymer Science, K. Binder, Ed., Oxford University Press, Oxford, 1995, pp. 307–355. Monte Carlo Simulations of the Glass Transition in Polymers.
79. D. J. Rigby and R. J. Roe, J. Chem. Phys., 87, 7285 (1987). Molecular Dynamics Simulation of Polymer Liquid and Glass. I. Glass Transition.
80. D. J. Rigby and R. J. Roe, J. Chem. Phys., 89, 5280 (1988). Molecular Dynamics Simulation of Polymer Liquid and Glass. II. Short Range Order and Orientation Correlation.
81. G. Adam and J. H. Gibbs, J. Chem. Phys., 43, 139 (1965). On the Temperature Dependence of Cooperative Relaxation Properties in Glassforming Liquids.
82. R. H. Boyd, R. H. Gee, J. Han, and Y. Jin, J. Chem. Phys., 101, 788 (1994). Conformational Dynamics in Bulk Polyethylene: A Molecular Dynamics Simulation Study.
83. J. Han, R. H. Gee, and R. H. Boyd, Macromolecules, 27, 7781 (1994). Glass Transition Temperatures of Polymers from Molecular Dynamics Simulations.
84. K.-Q. Yu, Z.-S. Li, and J. Sun, Macromol. Theory Simul., 10, 624 (2001). Polymer Structures and Glass Transition: A Molecular Dynamics Simulation Study.
85. K. Vollmayr, W. Kob, and K. Binder, Phys. Rev. B, 54, 15808 (1996). Cooling-Rate Effects in Amorphous Silica: A Computer-Simulation Study.
86. R. Brüning and K. Samwer, Phys. Rev. B, 46, 11318 (1992). Glass Transition on Long Time Scales.
87. A. V. Lyulin, N. K. Balabaev, and M. A. J. Michels, Macromolecules, 36, 8574 (2003). Molecular-Weight and Cooling-Rate Dependence of Simulated Tg for Amorphous Polystyrene.
88. C. Bennemann, W. Paul, K. Binder, and B. Dünweg, Phys. Rev. E, 57, 843 (1998). Molecular Dynamics Simulations of the Thermal Glass Transition in Polymer Melts: α-Relaxation Behavior.
89. J. Buchholz, W. Paul, F. Varnik, and K. Binder, J. Chem. Phys., 117, 7364 (2002). Cooling Rate Dependence of the Glass Transition Temperature of Polymer Melts: A Molecular Dynamics Study.
90. F. Stillinger, J. Chem. Phys., 88, 7818 (1988). Supercooled Liquids, Glass Transitions and the Kauzmann Paradox.
91. J. H. Gibbs and E. A. Di Marzio, J. Chem. Phys., 28, 373 (1958). Nature of the Glass Transition and the Glassy State.
92. M. Wolfgardt, J. Baschnagel, W. Paul, and K. Binder, Phys. Rev. E, 54, 1535 (1996). Entropy of Glassy Polymer Melts: Comparison Between Gibbs-Di Marzio and Simulation.
93. P. J. Flory, Proc. Roy. Soc. London A, 234, 60 (1956). Statistical Thermodynamics of Semi-Flexible Chain Molecules.
94. A. Milchev, C. R. Acad. Bulg. Sci., 36, 1415 (1983). On the Statistics of Semiflexible Polymer Chains.
95. H.-P. Wittmann, J. Chem. Phys., 95, 8449 (1991). On the Validity of the Gibbs–Di Marzio Theory of the Glass Transition.
96. D. Frenkel and B. Smit, Understanding Molecular Simulation. From Algorithms to Applications, Academic Press, San Francisco, 1996.
97. M. Wolfgardt, J. Baschnagel, and K. Binder, J. Chem. Phys., 103, 7166 (1995). On the Equation of State of Thermal Polymer Solutions and Melts.
98. H. Meirovitch, in Reviews in Computational Chemistry, Vol. 12, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley-VCH, New York, 1998, pp. 1–74. Calculation of the Free Energy and the Entropy of Macromolecular Systems by Computer Simulation.
99. W. Götze, in Liquids, Freezing and the Glass Transition, J. P. Hansen, D. Levesque, and J. Zinn-Justin, Eds., North-Holland, Amsterdam, 1990, pp. 287–503. Aspects of Structural Glass Transitions.
100. W. Götze and L. Sjögren, Rep. Progr. Phys., 55, 241 (1992). Relaxation Processes in Supercooled Liquids.
101. W. Götze, J. Phys.: Condens. Matter, 11, A1 (1999). Recent Tests of the Mode-Coupling Theory for Glassy Dynamics.
102. T. Franosch, M. Fuchs, W. Götze, M. R. Mayr, and A. P. Singh, Phys. Rev. E, 55, 7153 (1997). Asymptotic Laws and Preasymptotic Correction Formulas for the Relaxation Near Glass-Transition Singularities.
103. M. Fuchs, W. Götze, and M. R. Mayr, Phys. Rev. E, 58, 3384 (1998). Asymptotic Laws for Tagged-Particle Motion in Glassy Systems.
104. R. Schilling and T. Scheidsteger, Phys. Rev. E, 56, 2932 (1997). Mode Coupling Approach to the Ideal Glass Transition of Molecular Liquids: Linear Molecules.
105. R. Schilling, J. Phys.: Condens. Matter, 12, 6311 (2000). Mode-Coupling Theory for Translational and Orientational Dynamics Near the Ideal Glass Transition.
106. R. Schilling, Phys. Rev. E, 65, 051206 (2002). Reference-Point-Independent Dynamics of Molecular Liquids and Glasses in the Tensorial Formalism.
107. S. H. Chong and W. Götze, Phys. Rev. E, 65, 041503 (2002). Idealized Glass Transitions for a System of Dumbbell Molecules.
108. S. H. Chong and W. Götze, Phys. Rev. E, 65, 051201 (2002). Structural Relaxation in a System of Dumbbell Molecules.
109. W. Paul and G. D. Smith, Rep. Prog. Phys., 67, 1117 (2004). Structure and Dynamics of Amorphous Polymers: Computer Simulations Compared to Experiment and Theory.
110. J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids, Academic Press, London, 1986.
111. S. Krushev and W. Paul, Phys. Rev. E, 67, 021806 (2003). Intramolecular Caging in Polybutadiene Due to Rotational Barriers.
112. D. Richter, B. Frick, and B. Farago, Phys. Rev. Lett., 61, 2465 (1988). Neutron-Spin-Echo Investigation on the Dynamics of Polybutadiene Near the Glass Transition.
113. B. Frick, D. Richter, and C. Ritter, Europhys. Lett., 9, 557 (1989). Structural Changes Near the Glass Transition: Neutron Diffraction on a Simple Polymer.
114. G. D. Smith, D. Bedrov, and W. Paul, J. Chem. Phys., 121, 4961 (2004). Molecular Dynamics Simulation Study of the α-Relaxation in a 1,4-Polybutadiene Melt as Probed by the Coherent Dynamic Structure Factor.
115. B. Frick and C. Alba-Simionesco, Appl. Phys. A (Suppl.), 74, S549 (2002). Pressure Dependence of the Boson Peak in Poly(butadiene).
116. B. Frick, C. Alba-Simionesco, K. H. Andersen, and L. Willner, Phys. Rev. E, 67, 051801 (2003). Influence of Density and Temperature on the Microscopic Structure and the Segmental Relaxation of Polybutadiene.
117. A. Cailliaux, C. Alba-Simionesco, B. Frick, L. Willner, and I. Goncharenko, Phys. Rev. E, 67, 010802 (2003). Local Structure and Glass Transition of Polybutadiene up to 4 GPa.
118. B. Frick, G. Dosseh, A. Cailliaux, and C. Alba-Simionesco, Chem. Phys., 292, 311 (2003). Pressure Dependence of the Segmental Relaxation of Polybutadiene and Polyisobutylene and Influence of Molecular Weight.
119. D. Bedrov, G. D. Smith, and W. Paul, Phys. Rev. E, 70, 011804 (2004). On the Anomalous Pressure Dependence of the Structure Factor of 1,4-Polybutadiene Melts. A Molecular Dynamics Simulation Study.
120. W. Paul, D. Y. Yoon, and G. D. Smith, J. Chem. Phys., 103, 1702 (1995). An Optimized United Atom Model for the Simulation of Polymethylene.
121. W. Paul, G. D. Smith, and D. Y. Yoon, Macromolecules, 30, 7772 (1997). Static and Dynamic Properties of a n-C100H202 Melt from Molecular Dynamics Simulations.
122. J. Baschnagel and K. Binder, Physica A, 204, 47 (1994). Structural Aspects of a Three-Dimensional Lattice Model for the Glass Transition of Polymer Melts. A Monte Carlo Simulation.
123. H.-P. Wittmann, K. Kremer, and K. Binder, J. Chem. Phys., 96, 6291 (1992). Glass Transition of Polymer Melts: A 2-D Monte Carlo Study in the Framework of the Bond Fluctuation Method.
124. J. Baschnagel, Phys. Rev. B, 49, 135 (1994). Analysis of the Incoherent Intermediate Scattering Function in the Framework of the Idealized Mode-Coupling Theory: A Monte Carlo Study for Polymer Melts.
125. J. Baschnagel and M. Fuchs, J. Phys.: Condens. Matter, 7, 6761 (1995). Monte Carlo Simulation of the Glass Transition in Polymer Melts: Extended Mode-Coupling Analysis.
126. M. Rosche, R. G. Winkler, P. Reineker, and M. Schulz, J. Chem. Phys., 112, 3051 (2000). Topologically Induced Glass Transition in Dense Polymer Systems.
127. K. Binder, J. Baschnagel, and W. Paul, Prog. Polym. Sci., 28, 115 (2003). Glass Transition of Polymer Melts: Test of Theoretical Concepts by Computer Simulation.
128. J. Baschnagel and F. Varnik, J. Phys.: Condens. Matter, 17, R851 (2005). Computer Simulations of Supercooled Polymer Melts in the Bulk and in Confined Geometry.
129. C. Bennemann, J. Baschnagel, and W. Paul, Eur. Phys. J. B, 10, 323 (1999). Molecular Dynamics Simulation of a Glassy Polymer Melt: Incoherent Scattering Function.
130. W. Kob, J. Horbach, and K. Binder, in Slow Dynamics in Complex Systems: 8th Tohwa University Int’l Symposium, M. Tokuyama and I. Oppenheim, Eds., AIP Press, Woodbury, CT, 1999, pp. 441–451. The Dynamics of Non-Crystalline Silica: Insight from Molecular Dynamics Computer Simulations.
131. T. Gleim and W. Kob, Eur. Phys. J. B, 13, 83 (2000). The β-Relaxation Dynamics of a Simple Liquid.
132. M. Aichele and J. Baschnagel, Eur. Phys. J. E, 5, 229 (2001). Glassy Dynamics of Simulated Polymer Melts: Coherent Scattering and van Hove Correlation Functions. Part I: Dynamics in the β-Relaxation Regime.
133. M. Aichele and J. Baschnagel, Eur. Phys. J. E, 5, 245 (2001). Glassy Dynamics of Simulated Polymer Melts: Coherent Scattering and van Hove Correlation Functions. Part II: Dynamics in the α-Relaxation Regime.
134. C. Bennemann, W. Paul, J. Baschnagel, and K. Binder, J. Phys.: Condens. Matter, 11, 2179 (1999). Investigating the Influence of Different Thermodynamic Paths on the Structural Relaxation in a Glass-Forming Polymer Melt.
135. C. Bennemann, J. Baschnagel, W. Paul, and K. Binder, Comput. Theor. Polym. Sci., 9, 217 (1999). Molecular-Dynamics Simulation of a Glassy Polymer Melt: Rouse Model and Cage Effect.
136. J. Baschnagel, C. Bennemann, W. Paul, and K. Binder, J. Phys.: Condens. Matter, 12, 6365 (2000). Dynamics of a Supercooled Polymer Melt Above the Mode-Coupling Critical Temperature: Cage Versus Polymer-Specific Effects.
137. W. Kob and H. C. Andersen, Phys. Rev. E, 51, 4626 (1995). Testing Mode-Coupling Theory for a Supercooled Binary Lennard-Jones Mixture I: The van Hove Correlation Function.
138. S. H. Chong and M. Fuchs, Phys. Rev. Lett., 88, 185702 (2002). Mode-Coupling Theory for Structural and Conformational Dynamics of Polymer Melts.
139. M. Aichele, S. H. Chong, J. Baschnagel, and M. Fuchs, Phys. Rev. E, 69, 061801 (2004). Static Properties of a Simulated Supercooled Polymer Melt: Structure Factors, Monomer Distributions Relative to the Center of Mass, and Triple Correlation Functions.
140. R. J. Roe, J. Chem. Phys., 100, 1610 (1994). Short Time Dynamics of Polymer Liquid and Glass Studied by Molecular Dynamics Simulation.
141. D. J. Rigby and R. J. Roe, in Computer Simulation of Polymers, R. J. Roe, Ed., Prentice Hall, Englewood Cliffs, NJ, 1991, pp. 79–93. Local Chain Motion Studied by Molecular Dynamics Simulation of Polymer Liquid and Glass.
142. R. J. Roe, J. Non-Cryst. Solids, 235-237, 308 (1998). Molecular Dynamics Simulation Study of Short Time Dynamics in Polystyrene.
143. O. Okada and H. Furuya, Polymer, 43, 971 (2002). Molecular Dynamics Simulation of cis-1,4-Polybutadiene. 1. Comparison With Experimental Data for Static and Dynamic Properties.
144. O. Okada, H. Furuya, and T. Kanaya, Polymer, 43, 977 (2002). Molecular Dynamics Simulation of cis-1,4-Polybutadiene. 2. Chain Motion and Origin of the Fast Process.
145. T. Kanaya, K. Kaji, and K. Inoue, Macromolecules, 24, 1826 (1991). Local Motions of cis-1,4-Polybutadiene in the Melt. A Quasielastic Neutron-Scattering Study.
146. K. Inoue, T. Kanaya, S. Ikeda, K. Kaji, M. Shibata, and Y. Kiyanagi, J. Chem. Phys., 95, 5332 (1991). Low-Energy Excitations in Amorphous Polymers.
147. A. van Zon and S. W. de Leeuw, Phys. Rev. E, 58, R4100 (1998). Structural Relaxations in Glass Forming Poly(butadiene): A Molecular Dynamics Study.
148. A. van Zon and S. W. de Leeuw, Phys. Rev. E, 60, 6942 (1999). Self-Motion in Glass-Forming Polymers: A Molecular Dynamics Study.
149. A. V. Lyulin and M. A. J. Michels, Macromolecules, 35, 1463 (2002). Molecular Dynamics Simulation of Bulk Atactic Polystyrene in the Vicinity of Tg.
150. A. V. Lyulin, N. K. Balabaev, and M. A. J. Michels, Macromolecules, 35, 9595 (2002). Correlated Segmental Dynamics in Amorphous Atactic Polystyrene: A Molecular Dynamics Simulation Study.
64
Determining the Glass Transition in Polymer Melts
151. A. V. Lyulin, J. de Groot, and M. A. J. Michels, Macromol. Symp., 191, 167 (2003). Computer Simulation Study of Bulk Atactic Polystyrene in the Vicinity of the Glass Transition. 152. G. D. Smith, D. Y. Yoon, C. G. Wade, D. O’Leary, A. Chen, and R. L. Jaffe, J. Chem. Phys., 106, 3798 (1997). Dynamics of Poly(oxyethylene) Melts: Comparison of 13C Nuclear Magnetic Resonance Spin-Lattice Relaxation and Dielectric Relaxation as Determined from Simulations and Experiments. 153. Y. Jin and R. H. Boyd, J. Chem. Phys., 108, 9912 (1998). Subglass Chain Dynamics and Relaxation in Polyethylene: A Molecular Dynamics Simulation Study. 154. S. U. Boyd and R. H. Boyd, Macromolecules, 34, 7219 (2001). Chain Dynamics and Relaxation in Amorphous Poly(ethylene terephthalate): A Molecular Dynamics Simulation Study. 155. O. Borodin, D. Bedrov, and G. D. Smith, Macromolecules, 35, 2410 (2002). Molecular Dynamics Simulation Study of Dielectric Relaxation in Aqueous Poly(ethylene oxide) Solutions. 156. G. D. Smith, O. Borodin, and W. Paul, J. Chem. Phys., 117, 10350 (2002). A MolecularDynamics Simulation Study of Dielectric Relaxation in a 1,4-Polybutadiene Melt. 157. O. Borodin, R. Douglas, G. D. Smith, F. Trouw, and S. Petrucci, J. Phys. Chem. B, 107, 6813 (2003). MD Simulations and Experimental Study of Structure, Dynamics, and Thermodynamics of Poly(ethylene oxide) and its Oligomers. 158. M. Doxastakis, D. N. Theodorou, G. Fytas, F. Kremer, R. Faller, F. Mu¨ller-Plathe, and N. Hadjichristidis, J. Chem. Phys., 119, 6883 (2003). Chain and Local Dynamics of Polyisoprene as Probed by Experiments and Computer Simulations. 159. G. D. Smith, W. Paul, D. Y. Yoon, A. Zirkel, J. Hendricks, D. Richter, and H. Schober, J. Chem. Phys., 107, 4751 (1997). Local Dynamics in a Long-Chain Alkane Melt from Molecular Dynamics Simulations and Neutron Scattering Experiments. 160 G. D. Smith, W. Paul, M. Monkenbusch, and D. Richter, Chem. Phys., 261, 61 (2000). 
A Comparison of Neutron Scattering Studies and Computer Simulations of Polymer Melts. 161. M. L. Saboungi, D. L. Price, G. M. Mao, R. Fernandez-Perea, O. Borodin, G. D. Smith, M. Armand, and W. S. Howells, Solid State Ionics, 147, 225 (2002). Coherent Neutron Scattering from PEO and a PEO-Based Polymer Electrolyte. 162. J. Colmenero, A. Arbe, F. Alvarez, M. Monkenbusch, D. Richter, B. Farago, and B. Frick, J. Phys. Condens. Matter, 15, S1127 (2003). Self-Motion and the a-Relaxation in GlassForming Polymers. Molecular Dynamic Simulation and Quasielastic Neutron Scattering Results in Polyisoprene. 163. O. Ahumada, D. N. Theodorou, A. Triolo, V. Arrighi, C. Karatasos, and J.-P. Ryckaert, Macromolecules, 35, 7110 (2002). Segmental Dynamics of Atactic Polypropylene as Revealed by Molecular Simulations and Quasielastic Neutron Scattering. 164. J. Colmenero, A. Arbe, F. Alvarez, A. Narros, M. Monkenbusch, and D. Richter, Europhys. Lett., 71, 262 (2005). The Decisive Influence of Local Chain Dynamics on the Overall Dynamic Structure Factor Close to the Glass Transition. 165. G. D. Smith, D. Y. Yoon, W. Zhu, and M. D. Ediger, Macromolecules, 27, 5563 (1994). Comparison of Equilibrium and Dynamic Properties of Polymethylene Melts of n-C44H90 Chains from Simulations and Experiments. 166. S. J. Antoniadis, C. T. Samara, and D. N. Theodorou, Macromolecules, 31, 7944 (1998). Molecular Dynamics of Atactic Polypropylene Melts. 167 X. H. Qiu and M. D. Ediger, Macromolecules, 33, 490 (2000). Local and Global Dynamics of Unentangled Polyethylene Melts by 13C NMR. 168. G. D. Smith, O. Borodin, D. Bedrov, W. Paul, X. H. Qiu, and M. D. Ediger, Macromolecules, 34, 5192 (2001). 13C NMR Spin-Lattice Relaxation and Conformational Dynamics in a 1,4Polybutadiene Melt. 169. D. J. Gisser, S. Glowinkowski, and M. D. Ediger, Macromolecules, 24, 4270 (1991). Local Dynamics of Polyisoprene in Toluene. 170. S. Krushev, W. Paul, and G. D. Smith, Macromolecules, 35, 4198 (2002). 
The Role of Internal Rotational Barriers in Polymer Melt Chain Dynamics.
References
65
171. D. Bedrov and G.D. Smith, J. Chem. Phys., 115, 1121 (2001). Exploration of Conformational Phase Space in Polymer Melts: A Comparison of Parallel Tempering and Conventional Molecular Dynamics Simulations. 172. A. Arbe, D. Richter, J. Colmenero, and B. Farago, Phys. Rev. E, 54, 3853 (1996). Merging of the a and b Relaxations in Polybutadiene: A Neutron Spin Echo and Dielectric Study. 173. A. Aouadi, M. J. Lebon, C. Dreyfus, B. Strube, W. Steffen, A. Patkowski, and R. M. Pick, J. Phys.: Condens. Matter, 9, 3803 (1997). A Light-Scattering Study of 1,4-cis-trans Polybutadiene. 174. F. Stickel, E. W. Fischer, and R. Richert, J. Chem. Phys., 102, 6251 (1995). Dynamics of Glass-Forming Liquids: I. Temperature Derivative Analysis of Dielectric Relaxation Data. 175. F. Stickel, E. W. Fischer, and R. Richert, J. Chem. Phys., 104, 2043 (1996). Dynamics of Glass-Forming Liquids: II. Detailed Comparison of Dielectric Relaxation, DC-Conductivity and Viscosity Data. 176. W. Paul, D. Bedrov, and G. D. Smith, Phys. Rev. E, 74, 021501(2006) The Glass Transition in 1,4-Polybutadiene: Mode-Coupling Theory Analysis of Molecular Dynamics Simulations Using a Chemically Realistic Model. 177. B. Frick, B. Farago, and D. Richter, Phys. Rev. Lett., 64, 2921 (1990). Temperature Dependence of the Nonergodicity Parameter in Polybutadiene in the Neighborhood of the Glass Transition. 178. G.D. Smith and D. Bedrov, J. Polym. Sci. B: Polym. Phys., 45, 627(2007). Relationship between the k- and b-Relaxation Processes in Amorphous Polymers: Insight from Atomistic Molecular Dynamics Simulations of 1,4-polybutadiene Melts and Blends. 179. H. Sillescu, J. Non-Cryst. Solids, 243, 81(1999). Heterogeneityat the Glass Transition:A Review. 180. M. D. Ediger, Annu. Rev. Phys. Chem., 51, 99 (2000). Spatially Heterogeneous Dynamics in Supercooled Liquids. 181. R. Richert, J. Phys. Condens. Matter, 14, R703 (2002). Heterogeneous Dynamics in Liquids: Fluctuations in Space and Time. 182. E. 
Donth, The Glass Transition: Relaxation Dynamics in Liqids and Disordered Materials, Springer, Berlin, 2001. 183. U. Tracht, M. Wilhelm, A. Heuer, H. Feng, K. Schmidt-Rohr, and H. W. Spiess, Phys. Rev. Lett., 81, 2727 (1998). Length Scale of Dynamic Heterogeneities at the Glass Transition Determined by Multidimensional Nuclear Magnetic Resonance. 184. J. P. Bouchaud and G. Biroli, Phys. Rev. B, 72, 064204 (2005). Nonlinear Susceptibility in Glassy Systems: A Probe for Cooperative Dynamical Length Scales. 185. L. Berthier, G. Biroli, J. P. Bouchaud, L. Cipelletti, D. El Masri, D. L’Hote, F. Ladieu, and M Perino, Science, 310, 1797 (2005). Direct Experimental Evidence of a Growing Length Scale Accompanying the Glass Transition. 186. W. Kob, C. Donati, S. J. Plimpton, P. H. Poole, and S. C. Glotzer, Phys. Rev. Lett., 79, 2827 (1997). Dynamical Heterogeneities in a Supercooled Lennard–Jones Liquid. 187. C. Donati, J. F. Douglas, W. Kob, S. J. Plimpton, P. H. Poole, and S. C. Glotzer, Phys. Rev. Lett., 80, 2338 (1998). Stringlike Cooperative Motion in a Supercooled Liquid. 188. C. Bennemann, C. Donati, J. Baschnagel, and S. C. Glotzer, Nature (London), 399, 246 (1999). Growing Range of Correlated Motion in a Polymer Melt on Cooling Towards the Glass Transition. 189. Y. Gebremichael, T. B. Schrøder, F. W. Starr, and S. C. Glotzer, Phys. Rev. E, 64, 051503 (2001). Spatially Correlated Dynamics in a Simulated Glass-Forming Polymer Melt: Analysis of Clustering Phenomena. 190. M. Aichele, Y. Gebremichael, F. W. Starr, J. Baschnagel, and S. C. Glotzer, J. Chem. Phys., 119, 5290 (2003). Polymer-Specific Effects of Bulk Relaxation and Stringlike Correlated Motion in the Dynamics of a Supercooled Polymer Melt. 191. S. Krushev, Computersimulationen zur Dynamik und Statik von Polybutadienschmelzen, Dissertation, University of Mainz, 2002.
66
Determining the Glass Transition in Polymer Melts
192. E. G. Kim and W. L. Mattice, J. Chem. Phys., 117, 2389 (2002). Radial Aspect of Local Dynamics in Polybutadiene Melts as Studied by Molecular Dynamics Simulation: To Hop or not to Hop. 193. R. H. Boyd, R. H. Gee, J. Han, and Y. Jin, J. Chem. Phys., 101, 788 (1994). Conformational Dynamics in Bulk Polyethylene: A Molecular Dynamics Simulation Study. 194. O. Borodin, and G. D. Smith, Macromolecules, 33, 2273 (2000). Molecular Dynamics Simulations of Poly(ethylene oxide)/LiI Melts. 2. Dynamic Properties. 195. G. D. Smith, D. Y. Yoon, and R. L. Jaffe, Macromolecules, 28, 5897 (1995). Long-Time Molecular Motions and Local Chain Dynamics in n-C44H90 Melts by Molecular Dynamics Simulations. 196. W. Jin and R. H. Boyd, Polymer, 43, 503 (2002). Time Evolution of Dynamic Heterogeneity in a Polymeric Glass: A Molecular Dynamics Simulation Study. 197. K. Yoshimoto, T. S. Jain, K. V. Workum, P. F. Nealey, and J. J. de Pablo, Phys. Rev. Lett., 93, 175501 (2004). Mechanical Heterogeneities in Model Polymer Glasses at Small Length Scales. 198. D. Bedrov and G. D. Smith, Macromolecules, 38, 10314 (2005). A Molecular Dynamics Simulation Study of Relaxation Processes in the Dynamical Fast Component of Miscible Polymer Blends. 199. D. Bedrov and G. D. Smith, Phys. Rev. E, 71, 050801(R) (2005). Molecular Dynamics Simulation Study of the a- and b-Relaxation Processes in a Realistic Model Polymer. 200. W. van Megen and P. N. Pusey, Phys. Rev. A, 43, 5429 (1991). Dynamic Light-Scattering Study of the Glass Transition in a Colloidal Suspension. 201. J. Horbach and W. Kob, Phys. Rev. E, 64, 041503 (2001). Relaxation Dynamics of a Viscous Silica Melt: The Intermediate Scattering Functions.
CHAPTER 2
Atomistic Modeling of Friction
Nicholas J. Mosey (a) and Martin H. Müser (b)
(a) Department of Chemistry, University of Western Ontario, London, Ontario, Canada
(b) Department of Applied Mathematics, University of Western Ontario, London, Ontario, Canada
Reviews in Computational Chemistry, Volume 25, edited by Kenny B. Lipkowitz and Thomas R. Cundari. Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.

INTRODUCTION
Friction is a well-known, but poorly understood, phenomenon that affects virtually all aspects of daily life. In some cases, friction is desirable; e.g., high friction in clutches leads to the effective transmission of forces between an automobile’s engine and its wheels, whereas in other cases friction is a significant drawback, e.g., friction between the piston and cylinder wall decreases the efficiency of automobile engines. Although macroscopic friction laws were introduced a few centuries ago,1 and the existence of friction was recognized long before that, the underlying atomic-level mechanisms leading to friction have remained elusive. The identification of these mechanisms has emerged as a topic of significant interest, which has been driven by the miniaturization of mechanical devices, the peculiar behavior of condensed matter at the nanoscale, and advances in simulating chemically complex lubricants and surfaces with ever-increasing accuracy.2–4 Although a great deal of research has been directed toward elucidating the fundamental, atomic-level origins of friction in recent years, many key questions remain unanswered. Atomic-level simulation has been used extensively in the study of friction, not simply as a means of supplementing experimental studies, but
also as a powerful tool for gaining unique insight into the relevant processes. Indeed, simulations allow one to study well-defined systems under a variety of conditions that may be difficult, or even impossible, to examine in real laboratory experiments. As such, simulations have shed much-needed light on fundamental aspects of friction and, in some cases, have even overturned conventional wisdom regarding the origins of friction. In this chapter, we discuss the key points associated with performing tribological simulations (tribology is the science of surfaces in relative motion, sometimes also defined as the science of friction, lubrication, and wear) and review representative studies in which such simulations have been applied.

Tribological simulations inherently involve studying systems that are far from equilibrium. As such, many general principles, such as the minimization of free energy, no longer apply. Along similar lines, the equivalence of ensembles in the thermodynamic limit does not apply to many tribological systems, and hence, the proper use of boundary conditions is crucial. This is illustrated by the following example, where we consider two simulations of a system composed of two sliding surfaces performed with different boundary conditions. In one simulation, the kinetic friction force Fk is determined at constant load and constant sliding velocity. In the second simulation, similar parameters are employed; however, instead of applying a constant load, the distance between the surfaces is constrained to a constant value. This corresponds to performing the simulation at constant separation and constant sliding velocity. These two simulations are likely to yield completely different values for Fk, which demonstrates that implementing boundary conditions properly is crucial in nonequilibrium simulations if one wants to make reliable predictions. It also illustrates one of the many pitfalls that can diminish considerably the value of a largely well-designed tribological simulation. These pitfalls often result from convenience, where the unrealistic treatment, in this case constant separation, is easier to implement than the experimental condition, that is, constant normal load.

Additional considerations relating to nonequilibrium simulations arise from the fact that imposing shear and load requires one to constantly pump energy into the system. This energy is converted into heat and must be removed during the simulation, which is achieved through the use of thermostats. However, nonequilibrium simulations can be more sensitive to temperature control than equilibrium simulations, and applying thermostats in a naive manner may induce unrealistic velocity profiles in the sheared lubricant or lead to other undesired artifacts. The topic of nonequilibrium molecular dynamics has been reviewed previously in this series.5

Another important issue associated with tribological simulations involves the definition of the system to be studied. For example, a simple tribological system consists of two atomically flat, defect-free surfaces sliding past one another. Because of computational convenience, it is common
practice to orient the surfaces in a commensurate manner. That is, the surfaces are in perfect alignment and share a common periodicity. However, real engineering contacts rarely contain commensurate surfaces; indeed, it is exceedingly difficult to purposely devise such systems. Moreover, the frictional properties of systems that contain commensurate surfaces differ significantly from those of systems with incommensurate surfaces, i.e., surfaces that do not share a common periodicity in a systematic fashion. Thus, to obtain meaningful results from simulations, one must take care to ensure that the systems being studied do not contain commensurate surfaces or other artificial symmetries. The design of realistic systems is further complicated by the presence of defects, surface curvature and roughness, lubricant molecules, and surface contaminants, all of which must be treated appropriately. For instance, preventing lubricant molecules from becoming squeezed out of contacts, or even neglecting atomic-scale surface roughness, may lead to erroneous results in simulations. In some cases, it is even necessary to consider the chemical reactivity of the lubricant molecules. In this chapter, we discuss how to perform meaningful tribological simulations by avoiding the potential pitfalls that were mentioned above. In the next section, some theoretical aspects of friction between solids will be explained. Then an overview of algorithms that have been used in the simulation of tribological phenomena is provided. Selected case studies will be presented in the last section.
THEORETICAL BACKGROUND
Everyday experience indicates that a finite threshold force, namely the static friction force Fs, must be overcome to initiate lateral motion of one solid body relative to another. Experience also indicates that a finite minimum force, namely the kinetic friction force Fk, must be applied in order to keep a solid body moving at a constant velocity. In plain terms, we must expend a certain minimum amount of energy to overcome friction if we want to move an object and keep it moving. This contrasts with the situation encountered when one pulls a solid through a fluid medium where there is no such threshold, and instead, one only needs to overcome frictional forces that are linear in the final velocity v0. The notion of a threshold force is indicated in Figure 1, where the applied force is increased with time until the static friction force is overcome, i.e., F/Fs = 1, and sliding is initiated. In the sliding regime, the applied force is lower than that required to initiate sliding and corresponds to the kinetic friction force Fk.

In this section, we provide a short introduction to the topic of friction. Without such a background, it is difficult to ask meaningful questions and interpret the outcome of simulations. After all, our goals extend beyond
Figure 1 Force against time for a sliding system (vertical axis: F/Fs; horizontal axis: time). Sliding is initiated once the threshold corresponding to the static friction force is surpassed, i.e., when F/Fs = 1.
simply reproducing experimental results. Moreover, a good understanding of the theoretical background will aid in determining which aspects of a simulation deserve particular focus and which details are essentially irrelevant. The discussion begins below with an overview of proposed energy dissipation mechanisms that lead to friction. This is followed by brief discussions of phenomenological friction laws that describe the dependence of friction upon normal load and sliding velocity. The dependence of friction on the symmetry of the surfaces that are in contact is discussed later.
Friction Mechanisms
Given the ubiquitous nature of friction in our daily lives, it came as a surprise when Hirano and Shinjo suggested that friction between solids in ultra-high vacuum may essentially disappear.6,7 Although this proposal clearly conflicts with our intuition, it does not necessarily violate Newtonian mechanics. Consider the following case in which a slider is dragged over a surface. If the slider and substrate have homogeneous surfaces and if wear and plastic deformation are negligible, one may expect the same (free) energy at the beginning of the sliding process as at its end because of translational invariance. Thus, no work will be performed on the system, which opens up the possibility of ultra-low friction. This situation is commonly referred to as superlubricity and is currently an active area of research.

The demonstration that superlubricity is possible, at least from a theoretical standpoint, indicates that solids are not required to exhibit friction under all conditions. However, virtually all surfaces exhibit friction as a result of various mechanisms that lead to deviations from the conditions necessary
for superlubricity. In general, these mechanisms involve hysteresis and energy dissipation, which prevents the recovery of the energy that was expended during sliding. In what follows, we provide a brief overview of common friction mechanisms.

It has long been recognized that solid friction is intimately connected to hysteresis, as best illustrated by the model proposed independently by Prandtl8 and Tomlinson.9 In this model, a surface atom of mass m is coupled to its lattice site via a harmonic spring of stiffness k. The lattice site, which moves at constant velocity v0, is assumed to be located at the origin at time t = 0. In addition to the interaction with its lattice site, the atom experiences a coupling V0 cos(2πx/a) to the substrate, where V0 has units of energy and reflects the strength of the coupling, a is the lattice constant of the substrate, and x is the current position of the surface atom. Upon the introduction of a viscous damping term that is proportional to the velocity ẋ and a damping coefficient γ, the equation of motion for this atom is

$$ m\ddot{x} + \gamma\dot{x} = k(v_0 t - x) + \frac{2\pi}{a} V_0 \sin(2\pi x/a) \qquad [1] $$
If k is very large, i.e., k is greater than the maximum curvature of the potential, V''max = (2π/a)² V0, there will always be a unique equilibrium position for the atom of xeq ≈ v0 t and the atom will always be close to xeq. Consequently, the friction will be linear in v0 for small values of v0. Things become more interesting once V''max exceeds k. Now more than one stable position at certain points in time exists, as shown in Figure 2.
Figure 2 Illustration of an instability in the Prandtl–Tomlinson model (vertical axis: net potential energy; horizontal axis: position of surface atom; successive curves correspond to increasing time, and the instability is marked). The sum of the substrate potential and the elastic energy of the spring is shown at various instances in time. The energy difference between the initial and the final point of the thick line will be the dissipated energy when temperature and sliding velocities are very small.
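The two regimes of Eq. [1] can be explored numerically. The following minimal Python sketch integrates the equation of motion with a semi-implicit Euler step and returns the time-averaged spring force, which equals the mean friction force once a periodic steady state is reached. The function name, the reduced units (m = V0 = γ = 1 and a = 2π, so that V''max = 1), the time step, the integration scheme, and the number of discarded periods are illustrative assumptions, not choices taken from the original work.

```python
import math


def mean_friction(k, v0, m=1.0, gamma=1.0, V0=1.0, a=2.0 * math.pi,
                  dt=0.02, n_discard=3, n_average=3):
    """Time-averaged spring force for the Prandtl-Tomlinson model, Eq. [1].

    All parameter values are hypothetical reduced units chosen for
    illustration. The first n_discard substrate periods are treated as a
    transient; the average runs over the next n_average periods.
    """
    q = 2.0 * math.pi / a
    # Number of time steps needed for the lattice site to advance by one
    # substrate lattice constant a.
    steps_per_period = int(round(a / (v0 * dt)))
    first_sampled_step = n_discard * steps_per_period
    n_steps = (n_discard + n_average) * steps_per_period

    x, v = 0.0, 0.0
    total = 0.0
    for step in range(n_steps):
        t = step * dt
        spring = k * (v0 * t - x)                # force from the driven spring
        substrate = q * V0 * math.sin(q * x)     # -d/dx of V0 cos(2 pi x / a)
        force = spring + substrate - gamma * v   # right-hand side of Eq. [1]
        v += dt * force / m                      # semi-implicit Euler: velocity first,
        x += dt * v                              # then position with the new velocity
        if step >= first_sampled_step:
            total += spring
    return total / (n_average * steps_per_period)


if __name__ == "__main__":
    for k in (5.0, 0.2):          # stiff (k > V''max) and soft (k < V''max) spring
        for v0 in (0.01, 0.02):
            print(f"k = {k:3.1f}  v0 = {v0:4.2f}  <F> = {mean_friction(k, v0):.4f}")
```

Under these assumptions, the stiff spring (k = 5) should give a mean force roughly proportional to v0, whereas the soft spring (k = 0.2) should give a much larger force that changes only weakly when v0 is doubled.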
The time dependence of the combined substrate and spring potential reveals that mechanically stable positions disappear at certain points in time because of the motion of the spring. Consequently, an atom cannot find a mechanically stable position at time t + dt in the vicinity of a position that was stable a small moment dt ago. As a result, at times slightly past t, the position of the atom becomes unstable and it must move rapidly toward the next potential energy minimum. After sufficiently many oscillations around the new mechanical equilibrium, most of the potential energy difference between the new and old equilibrium positions will be dissipated into the damping term. This process will repeat itself periodically as the atom moves between potential energy minima at finite sliding velocities. Consequently, the dissipated energy per sliding distance is independent of v0 and γ (in the current example of a bistable system) for sufficiently small values of v0.

Despite its merits, the Prandtl–Tomlinson model should not be taken too literally. There simply is no reason why the inter-bulk coupling, reflected by V''max, should be stronger than the intra-bulk coupling k. But even if a reason did exist, one would have to expect more dramatic processes than elastic instabilities, such as cold welding and plastic deformation, so that the assumption of elastic coupling in the slider would break down completely. One could certainly argue that similar instabilities involving collective degrees of freedom may occur on longer length scales.
However, it seems that elastic instabilities do not contribute considerably to dissipation.10 A notable exception to this rule is rubber, for which sliding friction is related to internal friction rather than energy dissipation taking place at the interface.11 A traditional explanation of solid friction, which is mainly employed in engineering sciences, is based on plastic deformation.12 Typical surfaces are rough on microscopic length scales, as indicated in Figure 3. As a result, intimate mechanical contact between macroscopic solids occurs only at isolated points, typically at a small fraction of the apparent area of contact.
Figure 3 Microscopic contact between surfaces. Note that the two rough surfaces only make intimate contact at a small number of distinct points. The sum of the areas at these points corresponds to the microscopic area of contact Areal.
The net area of this intimate contact is called the real area of contact Areal. It is assumed that plastic flow occurs at most microscopic points of contact, so that the normal, local pressures correspond to the hardness σh of the softer of the two materials that are in contact. The (maximum) shear pressure is given by the yield strength σy of the same material. The net load L and the net shear force Fs follow by integrating σh and σy over the real area of contact Areal. That is, L = σh Areal and Fs = σy Areal. Hence, the plastic deformation scenario results in the following (static) friction coefficient:

$$ \mu_s = \sigma_y / \sigma_h \qquad [2] $$
where μs is defined as the ratio of Fs and L. Although this explanation for a linear relationship between friction and load has been used extensively in the literature, Bowden and Tabor, who originally suggested this idea, were aware of the limitations of their model and only meant to apply it to contacts between (bare) metals.12 Two important objections exist to the claim that plastic deformation is generally a dominant friction mechanism. First, friction between two solids does not typically depend solely on the mechanical properties of the softer of the two materials in contact but on the properties of both of these materials and the lubricant between them. Second, theoretical calculations of typical surface profiles have shown that plastic flow should occur at only a very small fraction of the total number of contact points, and hence, it is unlikely that plastic deformation contributes significantly to friction in real engineering contacts.13

So far, we have not considered lubricants added intentionally, such as oils, or unintentionally in the form of contaminants such as airborne hydrocarbons. It is known that adsorbed molecules can alter the behavior of sliding contacts dramatically as long as these molecules remain at the microscopic points of contact.14,15 From an engineering point of view, such molecules prevent surfaces from making intimate mechanical contact, thereby reducing plastic deformation and wear. However, they also prevent surfaces from becoming superlubric. Under sliding conditions where a sufficient quantity of lubricant exists, the last one or two layers of that lubricant remain in the contact, where they solidify because of the typically large pressures at the microscopic scale. In this regime, one generally talks about boundary lubrication, and because the interactions between lubricant particles are relatively weak, the adsorbed atoms and molecules will try to optimize their interactions with the confining walls.
This can lock the surfaces geometrically, as illustrated schematically in Figure 4, and when sliding one surface relative to another, an energy barrier has to be overcome, which generates a static friction force. Many other mechanisms lead to energy dissipation, although they may be less universal than those related to boundary lubricant-induced geometric frustration. Chemical changes in lubricant molecules, reversible or irreversible, produce heat. Examples are configurational changes in hydrogen-terminated
Figure 4 Schematic representation of the way in which adsorbed atoms can lock two nonmatching solids. Reprinted with permission from Ref. 15.
diamond surfaces16 or the terminal groups of alkane chains,17 as well as sliding- and pressure-induced changes in the coordination numbers of surface or lubricant atoms.18,19 Although the microscopic details of these processes differ significantly, they all exhibit molecular hystereses that are similar to that described by the Prandtl–Tomlinson model. Many irreversible tribological phenomena also exist, such as cold-welding, scraping, cutting, or uncontrolled, catastrophic wear. Characterizing these phenomena is often tedious because many of these processes are system specific, and in addition to being far out of equilibrium, they lack a steady state. For these reasons, we will not elaborate on these processes here.
Load-Dependence of Friction
Many macroscopic systems show an almost linear relationship between friction F and load L:

$$ F = \mu L \qquad [3] $$
where the friction coefficient μ does not depend on the apparent area of contact. This linear dependence, which is routinely taught in introductory physics classes, is referred to as Amontons’s law.1 The origin of this relationship is the subject of a great deal of controversy. Initial attempts to explain Amontons’s law focused on the nature in which surface asperities slide past one another. The basic argument put forward was that, if an asperity protrudes from the surface at some angle θ, the force required to move past that asperity will be F = L tan(θ). When one averages over all asperities on a surface, tan(θ) will attain some constant value and the friction will be proportional to the load. Although this explanation seems to account for Amontons’s law at first glance, it was quickly pointed out that, once an asperity reaches the top of another, it will slide down the other side and the applied energy will be recovered. As, in a statistical sense, the average value of θ will be constant irrespective of the sliding direction, there is no reason to anticipate any net loss of energy, and friction should essentially be nonexistent in such a scenario. In fact, similar arguments have been used to support the possibility of low-friction situations, such as superlubricity.
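The cancellation argument can be made explicit with a short worked example. As an illustrative assumption, consider a rigid, symmetric asperity of height h whose flanks make an angle θ with the surface, and neglect any dissipation during the motion. The symbols h, d, and W are introduced here only for illustration and do not appear in the original discussion.

```latex
\begin{align*}
  d &= \frac{h}{\tan\theta}
    && \text{(horizontal extent of one flank)} \\
  W_{\mathrm{up}} &= L\tan\theta \cdot d = L h
    && \text{(work done while climbing against the load)} \\
  W_{\mathrm{down}} &= -L\tan\theta \cdot d = -L h
    && \text{(work recovered while descending)} \\
  \langle F \rangle &= \frac{W_{\mathrm{up}} + W_{\mathrm{down}}}{2d} = 0
    && \text{(mean lateral force over the whole asperity)}
\end{align*}
```

In this idealized picture the instantaneous lateral force has magnitude L tan(θ), but its average over a full asperity vanishes, so a purely geometric argument cannot by itself account for a finite kinetic friction force.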
More recently, arguments for the origin of Amontons’s law have arisen that are based on experimental studies demonstrating that the shear stress σs varies with the local pressure P according to Eq. [4]

$$ \sigma_s = \sigma_0 + \alpha P \qquad [4] $$
where σ0 and α are constants. The reason for the linear relationship between shear and normal pressure, even in the presence of adsorbed atoms, can be rationalized qualitatively by considering Figure 4. For the top wall to move to the right, it must move up a slope, which is dependent on how the adsorbed atom is interlocked between the substrate (bottom) and the slider (top). If the system remains rigid, this will lead to F = L tan(θ) and, hence, a linear relationship between friction and load as discussed above. Of course, this argument is highly qualitative, because it assumes implicitly that nonbonded atoms behave like hard disks in areas of high pressure. Moreover, this argument must be modified if curved surfaces are considered.20 However, it seems to be a reasonable approximation for many systems. Integrating Eq. [4] over Areal yields the friction, which can then be divided by the load to give the following relationship for the friction coefficient:

$$ \mu = F/L = \alpha + \sigma_0/P \qquad [5] $$
where the local pressure $P$ is given by $P = L/A_{\mathrm{real}}$. Based on this equation, it is clear that Amontons’s law will be satisfied if $P$ is constant and/or $\sigma_0$ is small. Numerous simulations of boundary lubricants have shown that $\sigma_0$ is indeed small, even at pressures close to the yield strengths of solids.14,21 Thus, it seems that Amontons’s law should hold in a wide range of cases. However, exceptions are observed when adhesive interactions are strong, which leads to large values of $\sigma_0$. Conditions that lead to constant $P$ can be understood from macroscopic contact mechanics. Even highly polished surfaces are rough on many different length scales and when two macroscopic solids are brought into contact, only a small fraction of these surfaces will be in microscopic, mechanical contact. It can be shown that the pressure distribution averaged over these real contacts is surprisingly independent of the externally imposed load $L$ provided that the surfaces are not too adhesive or too compliant.13,22,23 Basically, the number of points in microscopic contact increases with the load in such a way that $A_{\mathrm{real}} \propto L$, and hence $P$, which is given by $L/A_{\mathrm{real}}$, will remain relatively constant. Thus, Amontons’s law can also be understood in terms of macroscopic contact mechanics. This interpretation indicates that this law should hold irrespective of the local relation between normal and shear pressure. However, it is important to note that the independence of the pressure distribution on the
normal load is not valid when conditions are less ideal and adhesion and plastic deformation play a role.24
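The role of the two conditions in Eq. [5] can be illustrated with a minimal numerical sketch. The function names and the parameter values below are hypothetical and chosen only for illustration (reduced units); they are not taken from the simulations cited above.

```python
def friction_force(load, real_area, sigma0, alpha):
    """Integrate Eq. [4] over the real contact area: F = sigma0*A_real + alpha*L."""
    return sigma0 * real_area + alpha * load


def friction_coefficient(load, real_area, sigma0, alpha):
    """Eq. [5]: mu = F/L = alpha + sigma0/P with P = L/A_real."""
    pressure = load / real_area
    return alpha + sigma0 / pressure


# Hypothetical parameter values in arbitrary reduced units.
SIGMA0, ALPHA, P_CONTACT = 0.02, 0.1, 5.0

# Case 1: A_real grows in proportion to L, so P stays at P_CONTACT.
mu_proportional = [friction_coefficient(L, L / P_CONTACT, SIGMA0, ALPHA)
                   for L in (1.0, 10.0, 100.0)]

# Case 2: A_real is fixed, so P grows with L and mu drifts toward alpha.
mu_fixed_area = [friction_coefficient(L, 1.0, SIGMA0, ALPHA)
                 for L in (1.0, 10.0, 100.0)]

print(mu_proportional)
print(mu_fixed_area)
```

In the first case the friction coefficient is independent of load, as Amontons’s law requires; in the second it is load dependent, and the deviation shrinks as $\sigma_0/P$ becomes small.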
Velocity-Dependence of Friction

In general, solid friction is relatively independent of the sliding velocity $v_0$, with corrections on the order of $\ln(v_0)$. This finding, also known as Coulomb’s law of friction,1 can be rationalized nicely in the Prandtl–Tomlinson model when the spring constant $k$ is sufficiently small. Recall that friction was noted above to be relatively independent of the sliding velocity in this regime because of instabilities, which cause the surface atoms to undergo rapid transitions between minima and lead to energy dissipation. A certain number of instabilities will occur per sliding distance $\Delta x$, with each instability producing a similar amount of heat $\Delta Q$. Thus, in the steady state, one may associate the kinetic friction force $F_k$ with the quotient

$$F_k = \frac{\Delta Q}{\Delta x} \qquad [6]$$
Once temperature comes into play, the jumps of atoms between minima may be invoked prematurely, i.e., before the formation of instabilities, via thermal fluctuations. These thermally activated jumps decrease the force that is required to pull the surface atom, which leads to a decrease in the kinetic friction. The probability that a jump will be thermally activated is exponentially related to the energetic barrier of the associated process, which can be understood in terms of Eyring theory. In general, the energetic barriers are lower when the system is not at its thermal equilibrium position, which is a scenario that is more prominent at higher sliding velocities. Overall, this renders $F_k$ rate or velocity dependent, typically in the following form:

$$F_k \approx F_k(v_{\mathrm{ref}}) + c \left[\ln\frac{v_0}{v_{\mathrm{ref}}}\right]^{\gamma} \qquad [7]$$
where $c$ is a constant, $v_{\mathrm{ref}}$ is a suitable reference velocity, and $\gamma$ is an exponent equal to or slightly less than one. Of course, this equation will only be valid over a limited velocity range. In many cases, $F_k$ becomes linear in $v_0$ at very small values of $v_0$, i.e., when one enters the linear response regime, in which the system is always close to thermal equilibrium. An example of the velocity dependence of friction is given in Figure 5 for a boundary lubricant confined between two incommensurate surfaces.25 For the given choice of normal pressure and temperature, one finds four decades in sliding velocity for which Eq. [7] provides a reasonably accurate description.
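As a rough illustration of Eq. [7], the following sketch evaluates the logarithmic velocity dependence for two exponents. The function name and all numerical values are hypothetical and are not fitted to Figure 5; the sketch is also restricted to $v_0 \ge v_{\mathrm{ref}}$ so that a fractional exponent remains real valued.

```python
import math


def kinetic_friction(v0, f_ref, c, v_ref, gamma=1.0):
    """Eq. [7]: F_k ~ F_k(v_ref) + c * [ln(v0 / v_ref)]**gamma.

    Restricted to v0 >= v_ref so that the logarithm is non-negative and a
    fractional exponent gamma stays real-valued.
    """
    if v0 < v_ref:
        raise ValueError("this sketch only covers v0 >= v_ref")
    return f_ref + c * math.log(v0 / v_ref) ** gamma


# Hypothetical values in reduced units; not taken from Figure 5.
F_REF, C, V_REF = 0.3, 0.05, 1.0e-4

for v0 in (1.0e-4, 1.0e-3, 1.0e-2, 1.0e-1):
    print(v0, kinetic_friction(v0, F_REF, C, V_REF, gamma=1.0),
          kinetic_friction(v0, F_REF, C, V_REF, gamma=2.0 / 3.0))
```

With $\gamma = 1$ each decade in velocity adds the same increment $c\ln 10$ to the friction, whereas a smaller exponent makes the increase per decade shrink at higher velocities.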
[Figure 5 plot: $F_k/F_s$ (vertical axis, 0.0 to 1.0) versus $v^*$ (horizontal axis, logarithmic, $10^{-6}$ to $10^{0}$) for T = 0.03, 0.07, 0.12, and 0.2; annotated regions: linear response in thermal equilibrium, activated regime, athermal regime.]

Figure 5 Typical velocity relationship of kinetic friction for a sliding contact in which friction is from adsorbed layers confined between two incommensurate walls. The kinetic friction $F_k$ is normalized by the static friction $F_s$. At extremely small velocities $v^*$, the confined layer is close to thermal equilibrium and, consequently, $F_k$ is linear in $v^*$, as expected from linear response theory. In an intermediate velocity regime, the velocity dependence of $F_k$ is logarithmic. Instabilities or ‘‘pops’’ of the atoms can be thermally activated. At large velocities, the surface moves too quickly for thermal effects to play a role. Time-temperature superposition could be applied. All data were scaled to one reference temperature. Reprinted with permission from Ref. 25.
In the example shown in Figure 5, $c$ is positive and the exponent $\gamma$ is unity; however, neither of these statements is universal. For example, the Prandtl–Tomlinson model can best be described with $\gamma = 2/3$ in certain regimes,26,27 whereas confined boundary lubricants are best fit with $\gamma = 1$.25,28 Moreover, the constant $c$ can become negative, in particular when junction growth is important, where the local contact areas can grow with time as a result of slow plastic flow of the opposed solids or the presence of adhesive interactions that are mediated by water capillaries.29,30
Role of Interfacial Symmetry

The two surfaces that comprise a contact can be oriented in any number of specific ways; however, for crystalline surfaces, interfacial symmetries correspond to either of two broad classifications. The first type of orientation is called the commensurate case and is found when two identical surfaces are perfectly aligned. The term incommensurate corresponds to the case in which two crystalline surfaces are misoriented or have different periodicities. An example of a commensurate system is given as structure A in Figure 7, whereas structures B through D are incommensurate. Interestingly, the orientation of the surfaces within a contact has a tremendous influence on
the frictional properties of the system. Because commensurate surfaces are rarely found in real engineering contacts, it is important to avoid incorporating such artificial symmetries into calculations. In what follows, we illustrate the importance of this point by considering the effect of interfacial symmetry on static and kinetic friction. Commensurate surfaces have a tendency to exhibit much larger static friction than incommensurate surfaces. This tendency can be understood through the following example. Imagine two egg cartons sitting on top of one another. If the cartons are perfectly aligned, the peaks of one carton will each sit in a valley on the other, and a large force will be required to simultaneously lift all of the peaks out of the valleys in order to initiate lateral motion. On the other hand, if the cartons are brought out of registry, e.g., through rotation, only a few of the peaks and valleys will be aligned and less energy will be required to initiate motion. This scenario still holds if the cartons are separated by eggs, which introduce new peaks and valleys with the same periodicity as the underlying carton. The rationale behind this simple example applies to surfaces that are composed of atoms and may be separated by other atoms and molecules. In this case, the periodicity of the surface atoms defines the peaks and valleys, and any confined atoms and molecules will attempt to adopt this periodicity. It may come as a surprise to some that two commensurate surfaces withstand finite shear forces even if they are separated by a fluid.31 But one has to keep in mind that breaking translational invariance automatically induces a potential of mean force $F$. Because of this symmetry breaking, commensurate walls can be pinned even by an ideal gas embedded between them.32 The reason is that $F$ scales linearly with the area of contact.
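The egg-carton argument can be made quantitative with a deliberately simple sketch: a rigid one-dimensional chain of atoms resting on a sinusoidal substrate potential. This toy model, the function name, and the choice of the golden ratio as the incommensurate lattice ratio are assumptions made here for illustration; they are not the systems studied in the cited references.

```python
import cmath
import math


def max_lateral_force(n_atoms, spacing, period=1.0):
    """Largest total lateral force (in units of the per-atom amplitude).

    Atom n of a rigid chain sits at X + n*spacing and feels a force
    sin(2*pi*(X + n*spacing)/period). Maximizing the sum over the chain
    position X gives the modulus of a sum of unit phasors.
    """
    theta = 2.0 * math.pi * spacing / period
    return abs(sum(cmath.exp(1j * theta * n) for n in range(n_atoms)))


GOLDEN = (1.0 + math.sqrt(5.0)) / 2.0  # assumed incommensurate ratio

for n in (10, 100, 1000):
    f_comm = max_lateral_force(n, spacing=1.0)
    f_incomm = max_lateral_force(n, spacing=GOLDEN)
    print(n, f_comm / n, f_incomm / n)
```

In this rigid model the force needed to dislodge the commensurate chain grows in proportion to the number of atoms, so the force per atom stays finite, whereas the forces on the incommensurate chain largely cancel and the force per atom decays with system size.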
In the thermodynamic limit, the energy barrier for the slider to move by one lattice constant becomes infinitely high so that the motion cannot be thermally activated, and hence, static friction becomes finite. No such argument applies when the surfaces do not share a common period. The kinetic friction Fk is also affected by commensurability. If two crystalline surfaces are separated by one atomic layer only, Fk may actually be reduced because of commensurability, although static friction is increased.25 The strikingly different behavior for commensurate and incommensurate systems is demonstrated in Figure 6. Therefore, whenever we introduce symmetries into our systems, we risk observing behavior that is inconsistent with that observed when these symmetries are absent. Because opposing surfaces are almost always incommensurate unless they are prepared specifically, it will be important to avoid symmetries in simulations as much as possible. Unfortunately, it can be difficult to make two surfaces incommensurate in simulations, particularly when the interface is composed of two identical crystalline surfaces. These difficulties arise from the fact that only a limited number of geometries conform to the periodic boundary conditions in the lateral direction. Each geometry needs to be analyzed separately
Figure 6 Kinetic friction Fk as a function of the stiffness k of the spring pulling the upper wall at constant, small velocity. The inset shows a part of the simulated system. At large values of k, the slider moves at the same velocity as the spring and the smooth sliding kinetic friction is probed. At small values of k, the system manages to lock into a potential energy minimum, which is similar to what happens in the Prandtl–Tomlinson model. The surface then undergoes plugging or ‘‘stick-slip’’ motion as a whole. In that regime, the measured friction approaches the value for static friction. Commensurability affects the measured values for Fk in both regimes sensitively. Reprinted with permission from Ref. 25.
to ensure that the contact remains incommensurate, and there are few general suggestions that one can offer. For surfaces with hexagonal symmetry, such as (111) surfaces of face-centered cubic crystals, it is often convenient to rotate the top wall by 90°. This rotation does not map the hexagonal lattice onto itself. The number of unit cells in the x and y directions should be chosen so that only marginal strain is needed to form an interface with a square geometry. The top view of some incommensurate structures between hexagonal surfaces is shown in Figure 7. In most cases, the measured friction between incommensurate walls is relatively insensitive to how incommensurability is achieved, as long as the roughness of the two opposing walls remains constant.15 As a final point, we note that typical surfaces are usually not crystalline but instead are covered by amorphous layers. These layers are much rougher at the atomic scale than the model crystalline surfaces that one would typically use for computational convenience or for fundamental research. The additional roughness at the microscopic level from disorder increases the friction between surfaces considerably, even when they are separated by a boundary lubricant.15 However, no systematic studies have been performed to explore the effect of roughness on boundary-lubricated systems, and only a few attempts have been made to investigate dissipation mechanisms in the amorphous layers under sliding conditions from an atomistic point of view.
[Figure 7 image: panels A, B, C, and D; in-plane axes x and y.]

Figure 7 Projections of atoms from the bottom (solid circles) and top (open circles) surfaces into the plane of the walls. (A through C) The two walls have the same structure and lattice constant, but the top wall has been rotated by 0°, 11.6°, and 90°, respectively. (D) The walls are aligned, but the lattice constant of the top wall has been reduced by 12/13. The atoms can only achieve perfect registry in the commensurate case (A). Reprinted with permission from Ref. 14.
COMPUTATIONAL ASPECTS

A typical model system used in tribological simulations is shown in Figure 8. In this system, two walls are separated by a fluid and shear is applied by pulling the top wall with an external device, whereas the bottom wall is held fixed. In atomic-level simulations, the two walls correspond to atomically discrete surfaces and the fluid is composed of atoms or molecules, which represent lubricants or contaminants.
Figure 8 Left: Schematic graph of the setup for the simulation of rubbing surfaces. Upper and lower walls are separated by a fluid or a boundary lubricant of thickness $D$. The outermost layers of the walls, represented by a dark color, are often treated as a rigid unit. The bottommost layer is fixed in a laboratory system, and the uppermost layer is driven externally, for instance, by a spring of stiffness $k$. Also shown is a typical, linear velocity profile for a confined fluid with finite velocities at the boundary. The length at which the fluid’s drift velocity would extrapolate to the wall’s velocity is called the slip length. Right: The top wall atoms in the rigid top layer are set onto their equilibrium sites or coupled elastically to them. The remaining top wall atoms interact through interatomic potentials, which certainly may be chosen to be elastic.
In most atomic-level tribological simulations, the behavior of such systems is explored with molecular dynamics (MD). However, as noted, certain aspects of MD simulations performed under the nonequilibrium conditions that are inherent to tribology require more attention than in typical MD simulations of bulk systems that are at equilibrium. For instance, shear and load must be introduced in a meaningful way. Additionally, a great deal of heat is generated during tribological simulations, which must be removed from the system with thermostats. This process requires an understanding of how thermostats affect the behavior of the system. In this section, we cover these, and other, key points related to tribological simulations. We start by discussing various means of incorporating surface roughness into the model systems in order to perform more realistic simulations. The means of subjecting the system to shear and load are discussed below. Thermostats are then discussed. Finally, we consider cases in which one can neglect the walls and treat the system as a bulk fluid. We finish with a discussion of different computational methodologies that are used in tribological simulations.
Surface Roughness

The natural starting point for a tribological simulation, or any other simulation, is to define the system that is to be simulated. Working within the context of the generic model shown in Figure 8, it is apparent that this starts by specifying the walls, which form the contact. In simulations and experiments aimed at exploring the fundamental aspects of friction, it has become common practice to employ crystalline surfaces. Setting up such surfaces is straightforward and will not be elaborated on here. Instead, we focus on situations in which one is interested in studying realistic engineering contacts. As mentioned above, surfaces in such contacts are rough on many length scales. Recently, there has been an intensified interest in modeling more realistic surface profiles; however, this research has focused so far on contact mechanics rather than on sliding motion between fractal surfaces.13,22,23 In this section, we examine ways of characterizing and generating rough surfaces. The roughness of a surface can be characterized by averaging its height difference autocorrelation function over one or several statistically identical samples. The height difference autocorrelation function $C_2(\Delta r)$ is given by

$$C_2(\Delta r) = \langle [h(r) - h(r + \Delta r)]^2 \rangle \qquad [8]$$
where $h(r)$ is the height of a sample’s surface at the position $r = (x, y)$. Thus, $C_2(\Delta r)$ states the variation in height we would expect to encounter if we move a distance $\Delta r$ away from our current position. For many real surfaces, $C_2(\Delta r)$ exhibits power law behavior according to Eq. [9]

$$C_2(\Delta r) \propto \Delta r^{2H} \qquad [9]$$
[Figure 9 image: two panels labeled L = 0.01 and L = 0.1.]

Figure 9 Flat elastic manifold pressed against a self-affine rigid surface for different loads $L$ per atom in top wall.
where $H$ is called the Hurst roughness exponent. $H = 1/2$ corresponds to a random walk in height as we move laterally over the surface. Surfaces satisfying Eq. [9] are called self-similar (or, more precisely, self-affine). A profile of a self-similar surface is shown in Figure 9 along with a flat elastic object pressed onto the rough substrate. Various means of constructing self-similar surfaces are known.33 Some of them, for example those that make use of the Weierstrass function, do not allow one to produce different realizations of surface profiles. These methods should be avoided in the current context because it would be difficult to make statistically meaningful statements without averaging over a set of statistically independent simulations. An appropriate method through which to construct self-similar surfaces is to use a representation of the height profile $h(x)$ via its Fourier transform $\tilde{h}(q)$. In reciprocal space, self-similar surfaces described by Eqs. [8] and [9] are typically characterized by the spectrum $\tilde{S}(q)$ defined as

$$\tilde{S}(q) = \langle \tilde{h}(q)\, \tilde{h}^*(q) \rangle \qquad [10]$$

with

$$\langle \tilde{h}(q) \rangle = 0, \qquad \langle \tilde{h}(q)\, \tilde{h}^*(q') \rangle \propto q^{-2H-d}\, \delta(q - q') \qquad [11]$$
where $d$ is the number of independent coordinates on which the height depends, i.e., $d = 1$ if $h = h(x)$ and $d = 2$ if $h = h(x, y)$, and $\tilde{S}(q)$ is the Fourier transform of the height autocorrelation function $S(\Delta x) = \langle h(x)\, h(x + \Delta x) \rangle$. The height-difference autocorrelation function $C_2(\Delta x)$ and $S(\Delta x)$ are related through $2S(\Delta x) = 2S(0) - C_2(\Delta x)$, which follows from expanding the square in Eq. [8] for a statistically homogeneous surface. The full characterization of the stochastic properties of a surface requires consideration of higher order correlations of the height function. However, it can be difficult to construct surfaces in this manner without experimental input. As an approximation, it may be reasonable to neglect the higher order terms. One means of generating height profiles is to draw (Gaussian) random numbers for the real and imaginary parts of $\tilde{h}(q)$ with zero mean and defined variance, and then divide the random number by a term proportional to $q^{H+d/2}$ such that Eq. [11] is satisfied. Furthermore, $\tilde{h}(-q)$ must be chosen to be the complex conjugate of $\tilde{h}(q)$ to ensure that $h(x)$ is a real-valued function. Alternatively, one may simply write $h(x)$ as a sum over terms $\tilde{h}(q)\cos(qx + \varphi_q)$. In this case, one needs to draw one (Gaussian) random number with zero mean and the proper second moment for each amplitude $\tilde{h}(q)$ and one random number for each phase $\varphi_q$, which is uniformly distributed between 0 and $\pi$, and filter the absolute value of $\tilde{h}(q)$ in the same way as described above. Other methods exist with which to generate self-similar surfaces, such as the midpoint technique, described in Ref. 24.
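The cosine-sum construction described above can be sketched for a one-dimensional periodic profile ($d = 1$). The function names, grid size, and seed handling are hypothetical choices made for illustration, and only the standard library is used; the sketch also evaluates $C_2$ of Eq. [8] so that the growth of height differences with lateral distance can be inspected.

```python
import math
import random


def self_affine_profile(n_points, hurst, seed=0):
    """Periodic 1D height profile h(x) built as a sum of cosine modes.

    Mode k has wave number q_k = 2*pi*k/n_points, a Gaussian amplitude with
    zero mean and standard deviation q_k**-(hurst + 1/2) (d = 1), and a
    phase drawn uniformly between 0 and pi.
    """
    rng = random.Random(seed)
    heights = [0.0] * n_points
    for k in range(1, n_points // 2):
        q = 2.0 * math.pi * k / n_points
        amplitude = rng.gauss(0.0, 1.0) / q ** (hurst + 0.5)
        phase = rng.uniform(0.0, math.pi)
        for x in range(n_points):
            heights[x] += amplitude * math.cos(q * x + phase)
    return heights


def height_difference_correlation(heights, lag):
    """C2(lag) of Eq. [8], averaged over all positions of a periodic profile."""
    n = len(heights)
    return sum((heights[i] - heights[(i + lag) % n]) ** 2
               for i in range(n)) / n


profile = self_affine_profile(n_points=512, hurst=0.8, seed=1)
for lag in (1, 2, 4, 8, 16):
    print(lag, height_difference_correlation(profile, lag))
```

Changing the seed yields a statistically independent realization with the same spectrum, which is the property that makes this kind of construction suitable for averaging over several samples.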
Imposing Load and Shear

To explore tribological phenomena, it is necessary to subject the system to external shear and load. When working within the context of a model such as that shown in Figure 8, this is achieved by pulling the top wall with an external driving device while holding the bottom wall fixed. It is only natural to subdivide such a system into a slider (the top wall), a substrate (the bottom wall), and the remaining system. In general, it is desirable to keep the interface as unperturbed by the external forces as possible, and hence, it is important to only explicitly apply any external forces and constraints to the outermost layers of the substrate and the slider. In this section, we discuss how this can be achieved in a manner that allows one to accurately simulate real sliding contacts. It is noted that some systems do not require the explicit consideration of confining walls and can be treated as bulk fluids; we defer the discussion of such systems to the section on bulk systems. It will be useful to establish some general nomenclature before proceeding. We have already defined the terms slider and substrate within the context of the model in Figure 8. To properly impose shear and load, it will prove convenient to subdivide the system even further. Specifically, only the outermost layer of the slider will be coupled to an external driving device. We will refer to this layer as the top layer (tl). Similarly, the substrate will be constrained by fixing the center of mass of the bottom layer of atoms in the substrate. This layer will be called the bottom layer (bl). All other atoms will be referred to as the embedded system, regardless of whether they are in the slider, substrate, or fluid. The top layer is commonly driven in one of three modes:
1. Predefined trajectory, e.g., $X = X(t)$.
2. Predefined force, e.g., $F = F(t)$.
3. Pulling with a spring, e.g., $F_x = -k[X - X_0(t)]$, where $F_x$ is the force acting on the top layer in the x direction, $k$ reflects the (effective) stiffness of the driving device, and $X_0(t)$ denotes the position of the driving device as a function of time.
The typical choices for the predefined trajectories or forces are constant velocity, including zero velocity, constant separation, constant forces, and/or
oscillatory velocities and forces. It is certainly possible to apply different driving modes along different directions, e.g., applying a constant force or load perpendicular to the interface and using a predefined velocity parallel to a direction that has no component normal to the interface. Simulating tribometer experiments would best be achieved using constant velocity or constant force modes, whereas it is probably best to employ oscillatory motion in a lateral direction to mimic measurements by a rheometer. In general, one should avoid using a scheme where the surfaces are held at a constant separation during sliding because such conditions rarely occur in experiments and such simulations may lead to erroneous results. Note that pulling a point particle over a periodic potential in driving mode (3) resembles the Prandtl–Tomlinson model discussed above. As in that model, the calculated (kinetic) friction force can depend sensitively on the stiffness of the driving spring (see also Figure 6). Weak springs tend to produce higher friction than do stronger springs. The fact that the calculated friction is not only a function of the interface, but also depends on how the interface is driven is important to keep in mind when comparing simulations to experiments. It is often beneficial to define a coordinate $R_{tl}$ that describes the center of mass of the top layer. There are three common ways to set up the top layer. (1) The positions of top layer atoms $r_n$ are confined to (lattice) sites $r_{n,0}$, which are connected rigidly to the top layer. (2) The top layer atoms are coupled elastically to sites $r_{n,0}$ fixed relative to the top layer, e.g., with springs of stiffness $k$. (3) An effective potential, such as a Steele potential $V_S$,34 is applied between embedded atoms and the top layer. Specific advantages and disadvantages are associated with each method.
Approach (1) may be the one that is most easily coded, (2) allows one to thermostat the outermost layer in an effective manner, whereas (3) is probably cheapest in terms of CPU time. The force on the top wall $F_{tl}$ is evaluated differently for each of the three setups listed above:

$$F_{tl} = F_{\mathrm{ext}} + \begin{cases} \sum_{n \in tl} f_n & (1) \\ \sum_{n \in tl} k\,(r_n - r_{n,0}) & (2) \\ \sum_{n \in em} \nabla_n V_S(r_n) & (3) \end{cases} \qquad [12]$$
where $f_n$ in line (1) denotes the force on atom $n$ and $\nabla_n V_S$ in line (3) is the gradient of the surface potential with respect to an embedded atom’s position. $F_{tl}$ will be used to calculate the acceleration of the top layer, which results in a displacement $\Delta R_{tl}$. This displacement needs to be added to the sites $r_{n,0}$ contained in the top layer in cases (1) and (2). It is possible to set the mass of the top layer $M_{tl}$ arbitrarily. For example, one may increase $M_{tl}$ beyond the total mass of the atoms in the top layer to incorporate some mass of the top wall that is not explicitly included in the simulation. However, one should be aware of two effects that arise from
altering $M_{tl}$. First, altering $M_{tl}$ affects the dynamics of the system. Specifically, if $M_{tl}$ is increased, a time scale gap will arise between the fast atomic motion of the embedded system and the slow collective motion of the top layer. However, having the top wall move on shorter time scales than in real systems may help to overcome the time scale gap between simulations and experiments. Second, one should be aware that the measured friction may depend on the mass of the top wall when it is pulled with a spring. Large masses favor smooth sliding over stick-slip motion and, hence, reduce the calculated friction.35,36
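The sensitivity of the measured friction to the driving spring can be explored with a minimal sketch of driving mode (3) applied to a single point particle. The sinusoidal potential, the damping relative to the substrate, the integrator, and all parameter values are assumptions made here for illustration; the function name is hypothetical.

```python
import math


def spring_driven_friction(k, v0=0.05, gamma=1.0, mass=1.0,
                           t_total=1500.0, t_skip=500.0, dt=0.02):
    """Time-averaged spring force for driving mode (3) on a point particle.

    The particle moves in V(x) = 1 - cos(x) (so V''_max = 1), is pulled by a
    spring whose far end sits at X0(t) = v0*t, and is damped relative to the
    substrate with an assumed coefficient gamma.
    """
    x, v, t = 0.0, 0.0, 0.0
    force_sum, samples = 0.0, 0
    while t < t_total:
        spring = -k * (x - v0 * t)           # F_x = -k [X - X0(t)]
        accel = (spring - math.sin(x) - gamma * v) / mass
        v += accel * dt                      # semi-implicit Euler step
        x += v * dt
        t += dt
        if t >= t_skip:                      # discard the initial transient
            force_sum += spring
            samples += 1
    return force_sum / samples


soft = spring_driven_friction(k=0.1)    # k < V''_max: stick-slip
stiff = spring_driven_friction(k=10.0)  # k > V''_max: smooth sliding
print(soft, stiff)
```

In this sketch the weak spring should yield the larger average force because the particle repeatedly sticks and slips, whereas the stiff spring drags the particle smoothly and mainly probes the viscous contribution. Increasing the `mass` argument offers a simple way to examine the inertia effects mentioned above.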
Imposing Constant Temperature

The external driving force imposed on solids leads to the dissipation of energy as heat. In experiments, this heat diffuses away from the interface into the bulk and eventually into the experimental apparatus. However, this does not occur in simulations because of limited system size, and hence, artificial means of controlling temperature must be employed. Ideally, this process is done by applying thermostats to only the outermost layers of the system. In some cases, such as the simulation of bulk fluids, there are no confining walls, and in other cases, it is necessary to keep the confining walls in a rigid configuration. Under such circumstances, which are described below, thermostats must be applied directly on the sheared system. Numerous ways of applying thermostats in MD simulations of systems that are in equilibrium exist, each with its specific advantages and disadvantages. When the system is far from equilibrium, such as during tribological simulations, stochastic thermostats have proven particularly beneficial. Langevin thermostats are the prototypical stochastic thermostats,37 and dissipative particle dynamics (DPD) is a modern variation thereof.38 Although the use of such thermostats can be motivated from first principles through linear response theory, i.e., there are rigorous schemes for the derivation of the damping terms and the fluctuation terms contained in stochastic thermostats,39 we will not provide these arguments here. Instead, we will focus on their implementation and properties.

Langevin Thermostat

In the Langevin description, one assumes that the degrees of freedom within the system that are not explicitly considered in the simulation exert, on average, a damping force that is linear in velocity, $\gamma_i \dot{r}_i$, along with additional random forces $\Gamma_i(t)$. This leads to the following equation of motion for particle number $i$:

$$m_i \ddot{r}_i + \gamma_i (\dot{r}_i - \langle v_i \rangle) = -\nabla_i V + \Gamma_i(t) \qquad [13]$$
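where the damping coefficient $\gamma_i$ and the $\alpha$ component of the random forces $\Gamma_{i\alpha}(t)$ acting on particle $i$ should obey Eq. [14]

$$\begin{aligned} \langle \Gamma_{i\alpha}(t) \rangle &= 0 \\ \langle \Gamma_{i\alpha}(t)\, \Gamma_{j\beta}(t') \rangle &= 2 \gamma_i k_B T\, \delta(t - t')\, \delta_{ij}\, \delta_{\alpha\beta} \\ &\to 2 \gamma_i k_B T\, \frac{1}{\Delta t}\, \delta_{t,t'}\, \delta_{ij}\, \delta_{\alpha\beta} \end{aligned} \qquad [14]$$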
in order to satisfy the fluctuation-dissipation theorem. In Eq. [13], $V$ denotes the total potential energy and $\langle v_i \rangle$ the expected drift velocity,40 e.g., $\langle v_i \rangle = 0$ in the bottom layer and $\langle v_i \rangle = v_{tl}$ if atom $i$ belongs to the top layer. The last line in Eq. [14] refers to the discrete time description used in molecular dynamics in which $\Delta t$ is the time step. When using predictor-corrector methods (velocity Verlet is a second-order Gear predictor corrector method), it is necessary to keep in mind that random terms cannot be predicted. Therefore, one should only apply the predictor-corrector schemes to the deterministic parts of the equation of motion. In cases where very high damping is applied, time steps can be kept large when employing efficient integration schemes.41 In general, however, one should ensure that the coupling between the system and the thermostat is sufficiently weak in order to avoid externally imposed overdamping. It should also be emphasized that there is no need to choose the random forces from a Gaussian distribution unless one is interested in short-time dynamics. It is much faster to generate uniformly distributed random numbers for the $\Gamma_i(t)$’s on an interval $[-\sqrt{3}\sigma, \sqrt{3}\sigma]$, where $\sigma$ is the standard deviation of the Gaussian distribution. Moreover, having a strict upper bound on the $\Gamma_i(t)$’s eliminates potentially bad surprises when using higher order predictor-corrector schemes and, thus, allows one to use a large time step while producing accurate thermal averages and trajectories. It is also certainly possible to apply damping, and hence the thermostat, in a direction-dependent manner. For example, the damping terms can be suppressed parallel to the sliding direction. This is particularly important, for instance, when the system has a small viscosity or when the shear rates are high. Otherwise, one is likely to create artificial dynamics.
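The uniform random forces described above can be sketched as follows. The function names, the one-dimensional setting, the simple semi-implicit Euler update, and the parameter values are assumptions made for illustration and are not the integration schemes of the cited references; the variance follows the discrete form of Eq. [14].

```python
import math
import random


def random_force(gamma, kT, dt, rng):
    """One Cartesian component of the random force for a time step dt.

    Drawn uniformly from [-sqrt(3)*sigma, sqrt(3)*sigma], which has zero mean
    and variance sigma**2 = 2*gamma*kT/dt (discrete form of Eq. [14]).
    """
    sigma = math.sqrt(2.0 * gamma * kT / dt)
    bound = math.sqrt(3.0) * sigma
    return rng.uniform(-bound, bound)


def langevin_step(x, v, force, mass, gamma, kT, dt, rng, drift=0.0):
    """Advance one particle by dt using Eq. [13] in one dimension.

    `force` is a callable returning -dV/dx; `drift` is the expected drift
    velocity <v_i>. A simple semi-implicit Euler update is assumed here.
    """
    total = force(x) - gamma * (v - drift) + random_force(gamma, kT, dt, rng)
    v_new = v + total / mass * dt
    x_new = x + v_new * dt
    return x_new, v_new


rng = random.Random(42)
GAMMA, KT, DT = 0.5, 1.0, 0.01          # hypothetical reduced units
samples = [random_force(GAMMA, KT, DT, rng) for _ in range(100000)]
variance = sum(s * s for s in samples) / len(samples)
print(variance, 2.0 * GAMMA * KT / DT)
```

The `drift` argument is where a layer-dependent expected velocity would enter, and applying the thermostat only to selected Cartesian components amounts to calling `langevin_step` with `gamma=0` along the sliding direction.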
Using the correct velocity profile $\langle v \rangle$ before the simulation can also reduce the problem of perturbing the dynamics in an undesirably strong fashion. However, anticipating certain velocity profiles will always suppress other modes, e.g., assuming laminar flow in a thermostat is likely to artificially bias the system toward laminar flow42 and may create additional artifacts.43–45

Effects of Damping on Calculated Friction

Making assumptions regarding the dissipation of heat can also influence solid friction, although typically it is less of an issue. This can be explored most easily within the Prandtl–Tomlinson model; however, the lessons to be learned
apply to a large degree to more general circumstances. In the original formulation of this model (see Eq. [1]), damping takes place relative to the substrate. However, one may also assume that the conversion of energy into heat takes place within the top solid.46 Thus, a generalized Prandtl–Tomlinson model would be

$$m\ddot{x} + \gamma_{\mathrm{sub}}\, \dot{x} + \gamma_{\mathrm{top}}\, (\dot{x} - v_0) = -\nabla V - k(x - v_0 t) + \Gamma_{\mathrm{sub}}(t) + \Gamma_{\mathrm{top}}(t) \qquad [15]$$
where the indices ‘‘sub’’ and ‘‘top’’ denote the thermal coupling to substrate and top solid, respectively. To investigate the effect of the thermostat on the frictional forces, it is instructive to study slightly underdamped or slightly overdamped motion. In the following discussion, we will set $m = 1$, $a = 1$, $V_0 = 1$, and $k = 0.5\,V''_{\max}$. Damping $\gamma$ and temperature $T$ will be varied, but we will first consider the athermal case where $T = 0$. With this choice of parameters, the maximum curvature of the potential $V''_{\max}$ will be greater than $k$ so that instabilities will occur during sliding and lead to finite kinetic friction at small $v_0$ in the absence of thermal fluctuations. Figure 10(a) shows the friction-velocity dependence for the following choices of thermostats: (1) $\gamma_{\mathrm{substrate}} = 1$, $\gamma_{\mathrm{tip}} = 0$; (2) $\gamma_{\mathrm{tip}} = 1/4$, $\gamma_{\mathrm{substrate}} = 0$; (3) $\gamma_{\mathrm{tip}} = 1$, $\gamma_{\mathrm{substrate}} = 0$; and (4) $\gamma_{\mathrm{tip}} = 4$, $\gamma_{\mathrm{substrate}} = 0$. We see that the kinetic friction is insensitive to the precise choice of the thermostat for small values of $v_0$, at least when the temperature is sufficiently small. This is because friction is dominated by fast ‘‘pops’’ in that regime. This conclusion becomes invalid only if $\gamma$ is sufficiently small such that $a/v_0$, the time it takes the driving stage to move by one lattice constant, is not long enough to transfer most of the ‘‘heat’’ produced during the last instability into the thermostat. At high
[Figure 10 plots: panels (a) and (b), each showing $F_k$ (vertical axis, 0.0 to 0.6) versus $v_0$ (horizontal axis, logarithmic, $10^{-4}$ to $10^{0}$) for $\gamma_{\mathrm{substrate}} = 1.0$, $\gamma_{\mathrm{tip}} = 0.25$, $\gamma_{\mathrm{tip}} = 1.0$, and $\gamma_{\mathrm{tip}} = 4.0$.]

Figure 10 Friction velocity relationship $F_k(v_0)$ in the Prandtl–Tomlinson model at (a) zero and (b) finite thermal energy; i.e., $k_B T = 0.1 V_0$. Different damping with respect to substrate $\gamma_{\mathrm{substrate}}$ and top solid $\gamma_{\mathrm{tip}}$ for different realizations of damping. The arrow in (b) points to the zero-velocity limit in the athermal case.
velocities, the sliding velocity $v_0$ is no longer negligible when compared with the peak velocity during the instability. This renders the friction-velocity dependence susceptible to the choice of the thermostats. Damping with respect to the substrate leads to strictly monotonically increasing friction forces, whereas damping with respect to the top wall can result in non-monotonic friction-velocity relationships.46

So far, we have considered the zero temperature case. Once finite thermal fluctuations are allowed, the friction-velocity relationship changes qualitatively, which can be shown by choosing thermal energies as small as $k_B T = 0.1\,V_0$ [see Figure 10(b)]. Jumps can now be thermally activated, and the friction force decreases with decreasing velocity. Yet again, at small $v_0$, there is little effect of the thermostat on the measured friction forces. Even changing $\gamma_{\rm tip}$ by as much as a factor of 16 results in an almost undetectable effect at small $v_0$. This behavior originates from the fact that the system can get very close to thermal equilibrium at every position of the top wall for very small sliding velocities. As such, linear response theory applies and it follows that friction and velocity are linearly related at sufficiently small values of $v_0$. This linearity is generally valid unless the energy barriers to sliding are infinitely high, and it explains the linear dependence of friction upon the velocity at small $v_0$ and finite $T$, as shown in Figure 5.

Dissipative-Particle-Dynamics Thermostat

A disadvantage of Langevin thermostats is that they require a (local) reference system. Dissipative particle dynamics (DPD) overcomes this problem by assuming that damping and random forces act on the center-of-mass system of a pair of atoms. The DPD equations of motion read as follows:

$$m\ddot{\mathbf{r}}_i = -\nabla_i V + \sum_j \left[ -\gamma_{ij}\,(\dot{\mathbf{r}}_i - \dot{\mathbf{r}}_j) + \boldsymbol{\Gamma}_{ij}(t) \right] \qquad [16]$$
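where $\boldsymbol{\Gamma}_{ij}(t) = -\boldsymbol{\Gamma}_{ji}(t)$. The usual relations for fluctuation and dissipation apply:

$$\langle \Gamma_{ij,\alpha}(t) \rangle = 0 \qquad [17]$$

$$\langle \Gamma_{ij,\alpha}(t)\,\Gamma_{kl,\beta}(t') \rangle = 2 k_B T\,\gamma_{ij}\,(\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk})\,\delta_{\alpha\beta}\,\frac{\delta_{t,t'}}{\Delta t} \qquad [18]$$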
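As a rough illustration of how pairwise damping and random forces can be generated, the following Python sketch evaluates the thermostat part of the DPD equation of motion for particles on a line. It is a minimal sketch under stated assumptions, not a production implementation: the function name `dpd_forces`, the one-dimensional geometry, the constant $\gamma_{ij} = \gamma$ inside a cut-off, and the Gaussian random numbers are all choices made here for illustration. Each random force is drawn once per pair and applied with opposite signs to the two partners, so the total momentum is unchanged.

```python
import math
import random

def dpd_forces(pos, vel, gamma, kT, dt, r_cut, rng):
    """Damping plus random DPD forces for particles on a line (1-D sketch).

    Assumptions (not from the text): constant gamma inside r_cut,
    Gaussian random numbers, one spatial dimension.
    """
    n = len(pos)
    forces = [0.0] * n
    # Width of the discretized random force: variance 2 kT gamma / dt
    sigma = math.sqrt(2.0 * kT * gamma / dt)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pos[i] - pos[j]) >= r_cut:
                continue  # gamma_ij = 0 beyond the cut-off radius
            damping = -gamma * (vel[i] - vel[j])
            noise = sigma * rng.gauss(0.0, 1.0)
            f_ij = damping + noise
            forces[i] += f_ij   # force on i due to the pair (i, j)
            forces[j] -= f_ij   # equal and opposite force on j
    return forces

rng = random.Random(0)
pos = [0.0, 0.5, 1.0, 5.0]
vel = [1.0, -0.5, 0.2, 3.0]
f = dpd_forces(pos, vel, gamma=1.0, kT=0.1, dt=0.01, r_cut=1.5, rng=rng)
print("net thermostat force:", sum(f))
```

Because the pair forces cancel, the net thermostat force vanishes up to rounding, and the isolated particle at `pos = 5.0` feels no thermostat force at all.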
Note that $\gamma_{ij}$ can be chosen to be distance dependent. A common approach is to assume that $\gamma_{ij}$ is a constant for a distance smaller than a cut-off radius $r_{\rm cut,DPD}$ and to set $\gamma_{ij} = 0$ otherwise. As calculating random numbers may be a task of relatively significant computational effort in force-field-based MD simulations, it may be sensible to make $r_{\rm cut,DPD}$ smaller than the cut-off radius
for the interaction between the particles, or to have the thermostat act only every few time steps. Among the advantages of DPD over Langevin dynamics are conservation of momentum and the ability to properly describe hydrodynamic interactions with longer wavelengths,47,48 which ensures that “macroscopic” properties are less affected with DPD than with Langevin dynamics. To illustrate this point, it is instructive to study the effect that DPD and Langevin thermostats have on a one-dimensional, linear harmonic chain with nearest-neighbor coupling, which is the simplest model to study long wavelength vibrations. The Lagrange function $L$ of the harmonic chain without thermostats is given by

$$L = \sum_{i=1}^{N} \left[ \frac{m}{2}\,\dot{x}_i^2 - \frac{k}{2}\,(x_i - x_{i-1} - a)^2 \right] \qquad [19]$$
where $a$ is the lattice constant and $k$ is the stiffness of the springs. Periodic boundary conditions are employed after a distance $Na$. The equations of motion at zero temperature with damping are

$$m\ddot{x}_i + \gamma\,\dot{x}_i = -k\,(2x_i - x_{i+1} - x_{i-1}) \quad \text{(Langevin)} \qquad [20]$$

$$m\ddot{x}_i + \gamma\,(2\dot{x}_i - \dot{x}_{i+1} - \dot{x}_{i-1}) = -k\,(2x_i - x_{i+1} - x_{i-1}) \quad \text{(DPD)} \qquad [21]$$
As usual, it is possible to diagonalize these equations of motion by transforming them into reciprocal space. The equations of motion of the Fourier transforms $\tilde{x}(q,\omega)$ then read as follows:

$$-m\omega^2\,\tilde{x} + i\omega\gamma\,\tilde{x} + 4\sin^2(qa/2)\,k\,\tilde{x} = 0 \quad \text{(Langevin)} \qquad [22]$$

$$-m\omega^2\,\tilde{x} + 4\sin^2(qa/2)\,i\omega\gamma\,\tilde{x} + 4\sin^2(qa/2)\,k\,\tilde{x} = 0 \quad \text{(DPD)} \qquad [23]$$
Thus, although Langevin and DPD damping do not alter the eigenfrequencies of the chain, i.e., their $q$ dependence is $\omega_0(q) = \sqrt{4k\sin^2(qa/2)/m}$, the quality factor $Q$, defined as the ratio of the eigenfrequency to the damping rate, does differ between the two methods, as shown in Eq. [24]:

$$Q(q) = \frac{m\,\omega_0(q)}{\gamma} \quad \text{(Langevin)}, \qquad Q(q) = \frac{m\,\omega_0(q)}{\gamma}\,\frac{1}{4\sin^2(qa/2)} \quad \text{(DPD)} \qquad [24]$$
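The wavelength dependence of the two quality factors can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the quality factor is taken as the ratio of the eigenfrequency to the damping rate $\gamma/m$ (Langevin) or $4\sin^2(qa/2)\,\gamma/m$ (DPD), which for $m = 1$ reduces to dividing $\omega_0$ by the damping coefficient. The default values $k = m = a = 1$ are assumptions.

```python
import math

def omega0(q, k=1.0, m=1.0, a=1.0):
    """Eigenfrequency of the undamped harmonic chain at wave number q."""
    return 2.0 * math.sqrt(k / m) * abs(math.sin(0.5 * q * a))

def q_langevin(q, gamma, k=1.0, m=1.0, a=1.0):
    """Quality factor with Langevin damping (damping rate gamma / m)."""
    return m * omega0(q, k, m, a) / gamma

def q_dpd(q, gamma, k=1.0, m=1.0, a=1.0):
    """Quality factor with DPD damping (rate 4 sin^2(qa/2) gamma / m)."""
    return m * omega0(q, k, m, a) / (gamma * 4.0 * math.sin(0.5 * q * a) ** 2)

# Compare both thermostats from the zone boundary down to long wavelengths
for q in (math.pi, 1.0, 0.1, 0.01):
    print(f"q = {q:7.4f}  Q_Langevin = {q_langevin(q, 1.0):10.4f}"
          f"  Q_DPD = {q_dpd(q, 1.0):10.4f}")
```

As $q$ decreases, the Langevin quality factor drops toward zero while the DPD quality factor grows roughly as $1/q$, which is the contrast discussed in the text that follows.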
In the long wavelength limit where $q \to 0$, Langevin dynamics will always be overdamped, whereas DPD dynamics will be underdamped, provided that the system is not intrinsically overdamped, as would be the case in the vicinity of a continuous phase transition. Although this example was based on calculations of a linear harmonic chain, the results suggest that DPD has little effect on dynamical quantities that couple to long wavelengths. The bulk viscosity of a system is an example of a property where the measured value in sheared fluids depends on the precise choice of $\gamma$ only in a negligible manner as long as $\gamma$ is reasonably small.48

In some cases, it may yet be beneficial to work with Langevin thermostats. The reason is that (elastic) long wavelength modes equilibrate notoriously slowly. Consider the case in which an elastic solid is pressed onto a fractal surface. Because the DPD thermostat barely damps long-range oscillations, we must expect a lot of bumping before the center-of-mass of the top wall finally comes to rest. Conversely, Langevin dynamics can lead to faster convergence because the thermostat couples more strongly to long-wavelength oscillations, as shown in Figure 11. When the Langevin thermostat is used, the system quickly reaches its mechanical equilibrium position, whereas the DPD-thermostatted system is strongly underdamped, despite the fact that the damping coefficient was 10 times larger for DPD than for Langevin. In general, one needs to keep in mind that equilibrating quickly and producing realistic dynamics (or calculating thermal expectation values) are often
Figure 11 Time dependence of the normal position $Z_{\rm tl}$ of an elastic solid, which is pressed against a self-affine substrate similar to the one shown in Figure 9. Two different damping/thermostatting schemes are employed, Langevin (broken lines) and DPD (full lines), with $\gamma_{\rm DPD} = 10\,\gamma_{\rm Langevin}$. The plot shows $Z_{\rm tl}(t)$ for $0 \le t \le 5000$. Although the damping coefficient is 10 times greater in DPD than in Langevin, DPD-based dynamics are too strongly underdamped to relax efficiently to the right position.
mutually exclusive in simulations, and it may be necessary to consider carefully which aspect is more important for a given question of interest.
Bulk Systems

In some cases, friction between two surfaces is dominated by the bulk viscosity of the fluid embedded between them.49 In these cases, it is often suitable to model the bulk sheared fluid and neglect the presence of confining walls. In this section, we describe computational approaches for shearing bulk systems and identify the conditions under which it is appropriate to treat the system in this manner. We start in the next section with a discussion of the conditions under which one may neglect confining walls. This is followed by a discussion of how to impose shear on bulk systems. We then close by exploring ways in which the system can be constrained to accurately reproduce certain phenomena.

Theoretical Background

When surfaces are sufficiently far from one another and shear rates are low, one can usually assume that the solid and first layer of fluid move at the same velocity. This situation is called a stick condition. For the calculation of shear forces in this scenario, it is possible to ignore the walls altogether by simply applying shear directly to the fluid. As the distance $D$ between the two walls of the contact decreases, slip may occur, in which case the wall and nearby fluid no longer move at the same velocity. The concept of slip is alluded to in Figure 8, where the slip length $\Lambda$ is introduced. The calculation of $\Lambda$ from atomistic simulations is a subtle issue,50 which we will not touch on here. When the fluid is confined even further, the concept of slip length might break down altogether and the measured friction becomes a true system function that cannot be subdivided into smaller, independent entities. The discussion is summarized in the following equation:

$$F/A = \begin{cases} \eta v / D & \text{hydrodynamic regime} \\ \eta v / (D + \Lambda) & \text{moderate confinement} \\ ? & \text{strong confinement} \end{cases} \qquad [25]$$
where $F/A$ is the force per surface area that is required to slide two solids separated by a distance $D$ at a velocity $v$, and $\eta$ denotes the (linear-response) viscosity of the fluid between the walls. Figure 12 illustrates the effect of $D$ on the measured friction of a system similar to that shown in the inset of Figure 6. At large separations, the behavior is reminiscent of hydrodynamic lubrication; i.e., the damping coefficient $\gamma_{\rm rheo} = F/(Av)$ is approximately inversely proportional to $D$, and $\gamma_{\rm rheo}$ is relatively independent of the orientation of the surfaces. As $D$ is decreased, the
Figure 12 Damping coefficient $\gamma_{\rm rheo} = F/(Av)$ obtained from simulating two atomically flat surfaces separated by a simple fluid consisting of monomers at constant temperature and normal pressure. Different coverages were investigated. Data sets: commensurate; incommensurate, full layers; incommensurate, noninteger number of layers. Guide lines proportional to $3/D$ and $1/D$ are included. The numbers in the graph denote the ratio of atoms contained in the fluid $N_{\rm fl}$ relative to the atoms contained per surface layer of one of the two confining walls $N_{\rm w}$. The walls are (111) surfaces of face-centered-cubic solids. They are rotated by 90° with respect to each other in the incommensurate cases. Full circles represent data for which $N_{\rm fl}/N_{\rm w}$ is an integer. The arrow indicates the point at which the damping coefficient for commensurate walls increases exponentially.
damping force is no longer inversely proportional to $D$ and begins to exhibit a dependence on the orientation of the walls. In fact, the damping force for commensurate surfaces increases by several orders of magnitude by going from four layers of lubricant atoms to three, as indicated by the vertical arrow in Figure 12. The large values for the effective damping can be understood from the above discussion of lubricated commensurate surfaces. Incommensurate walls do not show such dramatic effects because a fluid lubricant does not lock the surfaces on very long time scales. However, the system with incommensurate surfaces still deviates from hydrodynamic behavior as $D$ decreases.

Boundary Conditions

In the preceding section, it was demonstrated that the walls of the system can be neglected when the system is in the hydrodynamic regime, i.e., when $D$ is sufficiently large. In this situation, it is often desirable to treat the system as a bulk fluid and apply shear directly without any boundary effects. In this section, we describe two different methods with which to shear bulk systems. The first method we will discuss was proposed by Lees and Edwards,51 and is outlined qualitatively in Figure 13. In this technique, periodic boundary conditions are employed in all three spatial directions; however, although the
Figure 13 Visualizations of Lees–Edwards periodic boundary conditions (panels at $t = 0$, $t = t_1$, and $t = 2t_1$). At time $t = 0$, regular periodic boundary conditions are employed. As time moves on, the periodic images of the central simulation cell move relative to the central cell in the shear direction as shown in the middle and the right graph. The circle and square show points in space that are fixed with respect to the (central) simulation cell. It is important to distribute the effect of shear homogeneously through the simulation cell. Otherwise, velocities will be discontinuous in shear direction whenever a particle crosses the simulation cell’s boundary across the shear gradient direction. In this graph, $x$ corresponds to the shear direction and $y$ to the shear gradient direction.
center-of-mass of the central simulation cell remains fixed in space, many of its periodic images are moved parallel to the shear direction. Thus, even when a particle is fixed with respect to the central image, the distance to its periodically repeated images will change with time if the vector connecting the two images contains a component parallel to the shear gradient direction. To be specific, let $\mathbf{R}_{ij}$ denote the position in the periodically repeated cell, which is the $i$th image to the right and the $j$th image on top of the central cell. (A potential third dimension remains unaffected and will therefore not be mentioned in the following discussion.) The position in real space of the vector $\mathbf{R}_{ij} = (X, Y)_{ij}$ would be

$$\begin{pmatrix} X \\ Y \end{pmatrix}_{ij} = \begin{pmatrix} X \\ Y \end{pmatrix}_{00} + \begin{pmatrix} 1 & \dot{\epsilon}\,t \\ 0 & 1 \end{pmatrix} \begin{pmatrix} i L_x \\ j L_y \end{pmatrix} \qquad [26]$$
with $\dot{\epsilon}$ being the shear rate and $L_x$ and $L_y$ being the lengths of the simulation cell in $x$ and $y$ directions, respectively. Conventional periodic boundary conditions can be reproduced by setting $\dot{\epsilon}$ to 0. When using the Lees–Edwards periodic boundary condition, thermostatting is most naturally performed with DPD thermostats because no reference system needs to be defined. When integrating the equations of motion, it is important to not impose the shear only at the boundaries because this would break translational invariance. Instead, we need to correct the position in the shear direction at each MD step of size $\Delta t$. This correction is done, for instance, in the following fashion:

$$X_{n+1} = X_n + \Delta X_n + \dot{\epsilon}\,\Delta t\,Y_n \qquad [27]$$
where $\Delta X_n$ is the change in the $x$ coordinate if no external shear was applied. This way, the effect of shear is more homogeneously distributed over the system.

An alternative to Lees–Edwards boundary conditions is the formalism put forth by Parrinello and Rahman for the simulation of solids under constant stress.52,53 They described the positions of particles by reduced, dimensionless coordinates $r_\alpha$, where the $r_\alpha$ can take the values $0 \le r_\alpha < 1$ in the central image. Periodic images of a given particle are generated by adding or subtracting integers from the individual components of $\mathbf{r}$. The real coordinates $\mathbf{R}$ are obtained by multiplying $\mathbf{r}$ with the matrix $h$ that contains the vectors spanning the simulation cell. In the current two-dimensional (2-D) example, this would read as follows:

$$R_\alpha = \sum_\beta h_{\alpha\beta}\,r_\beta \qquad [28]$$

$$h = \begin{pmatrix} L_x & 0 \\ 0 & L_y \end{pmatrix} \begin{pmatrix} 1 & \dot{\epsilon}\,t \\ 0 & 1 \end{pmatrix} \qquad [29]$$
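The mapping between reduced and real coordinates can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names (`h_matrix`, `to_real`, `to_reduced`, `wrap`) are hypothetical, the numerical values are arbitrary, and the $h$ matrix is built as the product of a diagonal cell matrix and a shear matrix in the order written in Eq. [29].

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def h_matrix(lx, ly, shear_rate, t):
    """Cell matrix: diagonal box matrix times shear matrix."""
    box = [[lx, 0.0], [0.0, ly]]
    shear = [[1.0, shear_rate * t], [0.0, 1.0]]
    return matmul2(box, shear)

def to_real(h, r):
    """Real coordinates R_alpha = sum_beta h_alpha,beta r_beta."""
    return [h[0][0] * r[0] + h[0][1] * r[1],
            h[1][0] * r[0] + h[1][1] * r[1]]

def to_reduced(h, R):
    """Inverse mapping, using the explicit inverse of a 2x2 matrix."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [(h[1][1] * R[0] - h[0][1] * R[1]) / det,
            (-h[1][0] * R[0] + h[0][0] * R[1]) / det]

def wrap(r):
    """Fold reduced coordinates back into the central image, 0 <= r < 1."""
    return [c - math.floor(c) for c in r]

h = h_matrix(lx=10.0, ly=5.0, shear_rate=0.1, t=2.0)
r = wrap([1.25, -0.25])          # a particle that left the central cell
R = to_real(h, r)
print("reduced:", r, "real:", R, "back:", to_reduced(h, R))
```

Working in reduced coordinates keeps the periodic wrapping trivial (a floor operation), while all information about the sheared geometry is carried by $h$.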
In this approach, the potential energy $V$ is a function of the reduced coordinates and the $h$ matrix. For the kinetic energy, one would only be interested in the motion of the particle relative to the distorted geometry, so that a suitable Lagrange function $L_0$ for the system would read as follows:

$$L_0 = \sum_i \frac{1}{2}\,m_i \sum_\alpha \left( \sum_\beta h_{\alpha\beta}\,\dot{r}_{i\beta} \right)^{2} - V(h, \{\mathbf{r}\}) \qquad [30]$$
in which the $h$ matrix may be time dependent. From this Lagrangian, it is straightforward to derive the Newtonian equations for the reduced coordinates, which can then be solved according to any number of integration schemes. One advantage of the scheme outlined in Eqs. [28] to [30] is that it is relatively easy to allow for fluctuations of the size of the central cell. This scheme is described further below.

Geometric and Topological Constraints

An important aspect of the methods described in the preceding section is that $L_x$ and $L_y$ can be time dependent. As we will show in this section, this flexibility allows the simulation cell to fluctuate independently along different spatial dimensions during the simulation. This capability is useful in simulations of systems such as self-assembled monolayers under shear. However, care must be taken when allowing for this additional flexibility because, for some systems, e.g., simple fluids under shear, there is no particular reason why $L_x$ and $L_y$ should be chosen to be independent of one another. In this
Figure 14 Schematic representation of the microphase separation of block copolymers. The left graph shows atomic-scale details of the phase separation at intermediate temperatures, and the right graph shows a lamellar phase formed by block copolymers at low temperatures. The block copolymers have solid-like properties normal to the lamellae, because of a well-defined periodicity. In the other two directions, the system is isotropic and has fluid-like characteristics. From reference 54.
section, we use the specific example of simulating a diblock copolymer to explore aspects of allowing $L_x$ and $L_y$ to fluctuate independently. The simplest diblock copolymers are linear chains, in which one part of the chain consists of one type of monomer, say polystyrene (PS), and the other one of another type, say polybutadiene (PB), as illustrated in Figure 14. PS and PB usually phase separate at low temperatures; however, because of their chemical connectivity, block copolymers cannot unmix on a macroscopic scale. They can only phase separate on a microscopic scale, the size of which is determined by the length of the polymers.

When lamellar structures are formed, it is necessary to ensure that the dimensions of the simulation cell are commensurate with the intrinsic periodicity of the lamellae. This prevents unintentionally subjecting the system to artificial pressure as a result of the geometric constraints. Subjecting the system to a predetermined pressure, or stress, in a controlled manner can be achieved by allowing the system to fluctuate parallel to “solid directions,” which are introduced in Figure 14. For these directions, it would be appropriate to employ the usual techniques related to constant stress simulations.52,53 Let us consider the three-dimensional case and work within the Parrinello–Rahman framework. A rather general three-dimensional $h$ matrix, given in Eq. [31], will be considered:

$$h = \begin{pmatrix} L_{xx} & 0 & L_{xz} \\ 0 & L_{yy} & L_{yz} \\ L_{xz} & L_{yz} & L_{zz} \end{pmatrix} \begin{pmatrix} 1 & \dot{\epsilon}\,t & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad [31]$$
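One property of this construction is easy to verify numerically: because the shear factor has unit determinant, the cell volume $\det h$ does not change with time for fixed cell lengths. The sketch below illustrates this with plain Python lists; the helper names and the numerical entries of the cell matrix are hypothetical values chosen only for the example.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def h_matrix_3d(lengths, shear_rate, t):
    """Symmetric cell matrix times a shear matrix (x: shear, y: gradient)."""
    shear = [[1.0, shear_rate * t, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
    return matmul3(lengths, shear)

# Hypothetical cell lengths; the matrix is symmetric with L_xy = 0
L = [[10.0, 0.0, 0.5],
     [0.0, 8.0, 0.3],
     [0.5, 0.3, 6.0]]

for t in (0.0, 1.0, 10.0):
    h = h_matrix_3d(L, shear_rate=0.05, t=t)
    print(f"t = {t:5.1f}  det h = {det3(h):.6f}")
```

The printed volume is the same at every time, so any change in $\det h$ during a simulation can be attributed to fluctuations of the cell lengths rather than to the imposed shear.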
It is now possible to treat the variables Lij as generalized coordinates and to allow them to change during the MD simulation. For this purpose, it is
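necessary to define a kinetic energy $T_{\rm cell}$ associated with the fluctuation of the cell geometry as a bilinear function of generalized velocities $\dot{L}_{ij}$,

$$T_{\rm cell} = \sum_{\alpha\beta\gamma\delta} \frac{1}{2}\,M_{\alpha\beta\gamma\delta}\,\dot{L}_{\alpha\beta}\,\dot{L}_{\gamma\delta} \qquad [32]$$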
where $M_{\alpha\beta\gamma\delta}$ must be a positive definite matrix with the units of mass. Although the optimum choice for the $M$ matrix is a matter of debate,55 a reasonable approach is to treat the various $\dot{L}_{\alpha\beta}$ as independent, uncoupled variables and assign the same mass $M_{\rm cell}$ to all of them, which simplifies Eq. [32] to Eq. [33]:

$$T_{\rm cell} = \sum_{\alpha,\,\beta \le \alpha} \frac{1}{2}\,M_{\rm cell}\,\dot{L}_{\alpha\beta}^{2} \qquad [33]$$
It is often sensible to choose $M_{\rm cell}$ such that the simulation cell adjusts to the external pressure and the thermostat on microscopic time scales. The Lagrangian $L$ for Lees–Edwards boundary conditions combined with Parrinello–Rahman fluctuations for the cell geometry now reads as follows:

$$L = L_0 + T_{\rm cell} - p\,\det h \qquad [34]$$
where $p$ is an isotropic pressure and $\det h$ is the volume of the simulation cell. The Newtonian equations of motion for the generalized coordinates $L_{\alpha\beta}$ and $\mathbf{r}_i$ follow from the Lagrange formalism. Furthermore, it is possible to couple fluctuating cell geometries not only to constant isotropic pressure but to non-isotropic stresses as well. The description of these approaches is beyond the scope of the current tutorial, but it can be found in the literature.52,53

When studying systems with mixed “fluid” and “solid” directions, it is important to keep in mind that each solid direction should be allowed to breathe and fluid directions need to be scaled isotropically or constrained to a constant value. Allowing two fluid directions to fluctuate independently from one another allows the simulation cell to become flat like a pancake, which we certainly would like to avoid. As an example, consider Figure 15, in which a lamellar block copolymer phase is sheared. The convention would be to have the shear direction parallel to $x$ and the shear gradient direction parallel to $y$. No reason exists for the simulation cell to distort such that $L_{xz} = L_{yz} = 0$ would not be satisfied on average, so one may fix the values of $L_{xz}$ and $L_{yz}$ from the beginning. As a result, one solid direction exists plus two fluid directions. We can also constrain $L_{xx}$ to a constant value, because the shear direction will always be fluid and another fluid direction can fluctuate. This result means that we should allow the simulation cell to fluctuate independently in only the directions of
Figure 15 A lamellar block copolymer phase is reoriented through external shear. The initial phase has the direction of the lamellae parallel to the shear gradient direction. The most stable state would be to orient the director parallel to the shear and shear gradient direction. However, the reorientation process gets stuck before true equilibrium is reached. The stuck orientation is relatively stable, because the lamellae have to be broken up before they can further align with respect to the shear flow. Reprinted with permission from Ref. 56.
the shear and the shear gradient. Yet during the reorientation process, i.e., during the intermediate stage shown in Figure 15, simulation cells do have the tendency to flatten out, because periodicity, and hence solid-like behavior, is lost for a brief moment in time.

It is interesting to note that the lamellar structure in Figure 15 does not find the true equilibrium state during reorientation, but instead it adopts a metastable state. This behavior occurs because the periodic boundary conditions impose a topological constraint and prevent the system from simply reorienting. It is conceivable that similar metastable states are also obtained experimentally; however, the nature of the constraints leading to these states differs between the two cases. One means of overcoming this topological constraint is to impose a higher temperature $T_h$ at the boundaries of the simulation cell (e.g., at $0 \le r_y \le 0.2$ and $0 \le r_z \le 0.2$) and to keep the temperature low in the remainder of the system.55 This method would melt the lamellar structure at the boundary and allow the remaining lamellae to reorient freely with respect to the shear flow.
Computational Models

In the preceding sections, we have discussed how to set up a tribological simulation by properly defining the system, applying thermostats, and imposing shear. However, to perform the simulation, one must describe the interactions between the particles that comprise the system. Numerous techniques exist for this purpose, ranging from continuum-based models, which neglect the atoms altogether, to quantum chemical methods, which explicitly consider the system down to the level of the electrons. Generally, the choice of model is dictated by the nature and goals of the simulation that will be performed. In cases where one needs to describe chemical reactions, it is necessary to use first-principles methods or, in some special cases, reactive force fields. However, if one wishes to study large systems or examine longer time scales, it is necessary to use more approximate methods.
In this section, we give a brief overview of theoretical methods used to perform tribological simulations. We restrict the discussion to methods that are based on an atomic-level description of the system. We begin by discussing generic models, such as the Prandtl–Tomlinson model. Next, we explore the use of force fields in MD simulations. Then we discuss the use of quantum chemical methods in tribological simulations. Finally, we briefly discuss multiscale methods that incorporate multiple levels of theory into a single calculation.

Generic Models

The Prandtl–Tomlinson (PT) model introduced above is the most commonly used generic model for tribological simulations. It has tremendous didactic value because it clearly demonstrates the role of instabilities in energy dissipation and friction. Moreover, it is often used to describe accurately the dynamics of systems such as an atomic force microscope tip dragged over a periodic substrate.57 Even though the PT model is a significant approximation to realistic systems, it is worthwhile to run a few simulations with this model to explore its rich behavior. In particular, interesting dynamics occur when $k$ and $\gamma$ are so small that the surface atom is not only bistable but multistable and the motion is underdamped. In such a situation, the atoms do not necessarily become arrested in the next available, mechanically stable site after depinning, and interesting nonlinear dynamics can occur, such as non-monotonic friction-velocity dependence.

Another frequently used model system is the Frenkel–Kontorova (FK) model, in which a linear harmonic chain is embedded in an external potential. For a review, we direct the interested reader to Ref. 62. The potential energy in the FK model reads as follows:

$$V = \sum_i \left[ \frac{1}{2}\,k\,(x_{i+1} - b - x_i)^2 - V_0 \cos(2\pi x_i / a) \right] \qquad [35]$$
where a and b are the lattice constant of substrate and slider, respectively, and k is the strength of the spring connecting two adjacent atoms in the slider. As in the PT model, finite friction is found when atoms can find more than one mechanically metastable position and become unstable during sliding. Experience indicates that it is not possible to reproduce the results of many tribological experiments with the FK model, despite the increased complexity with respect to that of Prandtl and Tomlinson. In particular, when parametrized realistically and generalized to higher dimensions, it is found that most incommensurate interfaces between crystals should be superlubric within the approximations of the FK model.7 In other cases, when instabilities do occur, the FK model can only describe the early time behavior of flat sliding interfaces.63 As such, one must be cautious when using the FK model.
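For readers who wish to run a few simulations of the Prandtl–Tomlinson model mentioned at the start of this subsection, the following Python sketch integrates the athermal model with damping relative to the substrate and to the top solid. It is a minimal sketch under explicit assumptions: the substrate potential is taken as $V(x) = -V_0\cos(2\pi x/a)$, the spring constant is set to half the maximum curvature of that potential, the integrator is a simple semi-implicit Euler scheme, and the function name and default parameters are hypothetical. The kinetic friction is estimated as the time-averaged spring force after discarding the first lattice period as a transient.

```python
import math

def pt_kinetic_friction(v0, gamma_sub=1.0, gamma_top=0.0,
                        m=1.0, a=1.0, V0=1.0, dt=0.01, n_periods=4):
    """Mean spring force in the athermal PT model (T = 0), n_periods >= 2."""
    kappa = 2.0 * math.pi / a
    k = 0.5 * V0 * kappa ** 2          # spring constant, half of V''_max
    x, v, t = 0.0, 0.0, 0.0            # start at rest in a potential minimum
    steps_per_period = int(round(a / (v0 * dt)))
    force_sum = 0.0
    n_samples = 0
    for step in range(n_periods * steps_per_period):
        f_pot = -V0 * kappa * math.sin(kappa * x)     # -dV/dx
        f_spring = k * (v0 * t - x)                   # pull of the driving stage
        f_damp = -gamma_sub * v - gamma_top * (v - v0)
        v += dt * (f_pot + f_spring + f_damp) / m     # semi-implicit Euler step
        x += dt * v
        t += dt
        if step >= steps_per_period:                  # skip the first period
            force_sum += f_spring
            n_samples += 1
    return force_sum / n_samples

fk_substrate = pt_kinetic_friction(0.01, gamma_sub=1.0, gamma_top=0.0)
fk_top = pt_kinetic_friction(0.01, gamma_sub=0.0, gamma_top=1.0)
print("F_k, damping relative to substrate:", fk_substrate)
print("F_k, damping relative to top solid:", fk_top)
```

Varying `v0`, `gamma_sub`, and `gamma_top` in this sketch is one way to explore how sensitive the friction-velocity relationship is to the choice of thermostat; smaller time steps and longer averaging windows are advisable before drawing quantitative conclusions.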
Force-Field Methods

Most atomic-scale tribological simulations use force fields (FFs) to describe the interactions between atoms. A huge amount of literature exists regarding the development and use of FFs, and we will not attempt to cover this vast topic here. Instead, we will point out aspects of FFs and their use that are relevant to tribological simulations. The reader interested in a more general discussion of FFs is directed to the chapters by Bowen and Allinger58 and Dinur and Hagler59 in volume 2 of this series and by Landis et al. in volume 6.60

FFs are parameterized functions that relate the structure of a system to its potential energy, which is taken as a sum of simpler energy terms originating from, for instance, bond distances and angles, as well as non-bonded terms, such as electrostatics and van der Waals interactions. In most cases, the parameters used in the FF are chosen to reproduce experimental results obtained under specific conditions. It is then assumed that the FF will be accurate for similar systems under similar conditions. As discussed, systems are exposed to large pressures and shear during tribological simulations; however, FFs are typically not parameterized with data obtained under such conditions, and no guarantee exists that FFs developed for simulations at ambient conditions are suitable for use under more extreme situations. As a result, one must take particular care to ensure that the FF used in a tribological simulation provides an accurate representation of the system over the pressure range encountered in that simulation. A natural starting point is to test the ability of existing FFs to reproduce experimental data obtained under the conditions of interest; however, in some cases, it may be necessary to generate new sets of parameters. FFs that are parameterized for high-pressure conditions can still lead to behavior that differs from that observed in experiments.
For instance, it is common practice to treat the interatomic interactions with Lennard–Jones (LJ) potentials. Although this method is convenient from a computational standpoint, it is known that LJ potentials do not reproduce experimentally observed behavior such as “necking,” where a material attempts to minimize surface area and will break under large tensile stresses. Many other examples exist where particular types of FFs cannot reproduce properties of materials, and once again, we emphasize that one should ensure that the FF used in the simulation is sufficiently accurate.

The calculation of electrostatic interactions in FF-based tribological simulations also requires care. The typical model used in tribological simulations consists of two surfaces separated by a fluid, with the whole system subject to periodic boundary conditions (PBCs). If we define the system such that the surfaces extend in the $xy$ plane, it seems only natural to apply PBCs in these two dimensions. However, care must be taken when treating the third dimension $z$, which lies normal to the surfaces. Specifically, one must ensure that the length of the simulation cell in the $z$ direction is large enough to leave
a large vacuum space between periodic images, which will minimize electrostatic interactions between these images. This precaution is particularly important when performing simulations with standard packages that use regular routines for the Ewald summation. In some cases, however, it may be possible to employ fast-multipole methods that are specific to 2-D systems.

We end this section by mentioning reactive FFs,61,64,65 which are capable of describing changes in chemical bonding and atomic hybridization. The development of reactive FFs has emerged as a topic of significant interest. However, as one can imagine, this task is exceedingly difficult and these FFs only apply to very specific systems and reactions. Nonetheless, reactive FFs have been used in many areas of simulation as a cost-effective alternative to first-principles calculations. In tribology, reactive FFs have been used to study tribochemical reactions in hydrocarbons.66 The simulations of these reactions are discussed in greater detail below.

Quantum Chemical Methods

Although most atomic-level tribological simulations are performed with FFs, circumstances exist when such approaches are insufficient. This is particularly true when studying tribochemistry, i.e., chemical reactions that are induced by shear and load. Because chemical reactions involve changes in bonding, it is necessary to employ first-principles methods that explicitly consider the electronic structure of the system.67 These methods include those typically used in quantum chemistry (QC), such as molecular-orbital methods, e.g., Hartree–Fock, and density functional theory (DFT), as well as approximate QC techniques, such as tight-binding DFT and semi-empirical molecular orbital methods. In this section, we focus on the use of QC methods to study tribological phenomena. The greatest limitation of QC methods is computational expense.
This expense restricts system sizes to a few hundred atoms at most, and hence, it is not possible to examine highly elaborate systems with walls that are several atomic layers thick separated by several lubricant atoms or molecules. Furthermore, the expense of first-principles calculations imposes significant limitations on the time scales that can be examined in MD simulations, which may lead to shear rates that are orders of magnitude greater than those encountered in experiments. One should be aware of these inherent differences between first-principles simulations and experiments when interpreting calculated results.

In general, two types of first-principles tribological simulations have been reported in the literature. The first type of simulation is based on static energy calculations in which the dynamics of the system are not considered.68,69 A typical simulation would be set up as follows. An initial structure is generated where the two surfaces are aligned in the desired configuration and fixed at some constant separation distance $D$. The structure is relaxed, and the energy is recorded. Then the top layer is moved by some small distance relative to
the bottom surface while holding D constant. The new structure is relaxed, and the energy is recorded. This procedure is then repeated until the desired sliding distance is covered. In this approach, one generates an energy profile along the given trajectory. The (static) friction force can be estimated as the maximum slope ∂E/∂x of the energy along this trajectory between a potential energy minimum and the following maximum. To determine the kinetic friction force, one must identify instabilities and evaluate Fk as

$$F_k = \frac{E_{\mathrm{diss}}}{a} \qquad [36]$$
where Ediss is the total energy lost to instabilities over the total sliding distance a.70 It is important to note that the number of instabilities that occur over a given sliding distance increases with the size of the system, and hence for large systems, one must use a very small step size to avoid missing instabilities. The calculated energetics also allow one to estimate the normal load, thereby providing access to the friction coefficient once the friction force is known. This method, albeit crude, has been shown to yield good results when compared with experiments in cases such as graphite sheets sliding past one another.68 However, one should realize that approximations in the determination of the normal load, the assumption that friction only depends on energy barriers, and the lack of a consideration of dynamical aspects of the system may lead to significant deviations from experimental results for many other systems.

The second approach used in first-principles tribological simulations focuses on the behavior of the sheared fluid. That is, the walls are not considered and the system is treated as bulk fluid, as discussed. These simulations are typically performed using ab initio molecular dynamics (AIMD) with DFT and plane-wave basis sets. A general tribological AIMD simulation would be run as follows. A system representing the fluid would be placed in a simulation cell repeated periodically in all three directions. Shear or load is applied to the system using schemes such as that of Parrinello and Rahman, which was discussed above. In this approach, one defines a (potentially time-dependent) reference stress tensor σref and alters the nuclear and cell dynamics, such that the internal stress tensor σsys is equal to σref. When σsys = σref, the internal and external forces on the cell vectors balance, and the system is subject to the desired shear or load.
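For the static approach, the post-processing of a computed energy profile (maximum slope for the static friction force, Eq. [36] for the kinetic friction force) can be sketched as follows. This is only a minimal illustration and not the procedure of the cited studies: the function names, the sinusoidal test profile, and the list of energy drops are hypothetical, and the slope is taken by finite differences over the sampled points.

```python
import numpy as np

def static_friction(x, energy):
    """Estimate the static friction force as the maximum slope dE/dx
    of a sampled energy profile (finite differences)."""
    slope = np.gradient(energy, x)
    return float(np.max(slope))

def kinetic_friction(energy_drops, sliding_distance):
    """Eq. [36]: F_k = E_diss / a, with E_diss the sum of the energy
    lost at each instability over the sliding distance a."""
    return float(np.sum(energy_drops)) / sliding_distance

# Hypothetical sinusoidal profile E(x) = -E0 cos(2 pi x / lam)
E0, lam = 0.05, 2.5              # assumed units: eV and angstrom
x = np.linspace(0.0, lam, 2001)
E = -E0 * np.cos(2.0 * np.pi * x / lam)
F_s = static_friction(x, E)      # analytic value: 2 pi E0 / lam

# Hypothetical energy drops (eV) at three instabilities over a = 4 lam
F_k = kinetic_friction([0.01, 0.02, 0.01], sliding_distance=4.0 * lam)
```

In practice the spacing of the sampled points would have to be fine enough that no instability is missed, as noted in the text.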
When performing variable-cell AIMD simulations with plane-wave basis sets, problems originate from the fact that the basis set is not complete with respect to the cell vectors.71 This incompleteness can introduce fictitious forces (Pulay forces) into σsys and lead to artificial dynamics. To overcome this problem, one must ensure that σsys is well converged with respect to the basis set size. In general, it is found that one needs to employ a plane-wave kinetic
energy cut-off Ecut that is approximately 1.3–1.5 times greater than that used in a comparable constant volume simulation. This guideline is only a general one, as opposed to a strict principle, and one should test the convergence of σsys with respect to Ecut before performing any serious first-principles tribological simulations.

A related issue originates from the fact that the quality of a plane-wave basis set is dependent on the volume of the cell. In most variable-cell AIMD simulations performed with plane-wave basis sets, the number of plane waves is determined for the initial cell volume and held fixed throughout the simulation. That is, as the cell changes shape and size, the number of plane waves remains constant, and hence, the quality of the basis set, which is more closely related to Ecut, is not constant throughout the simulation. This inconstancy can lead to problems when calculating relative energies, which govern chemical reactions. Essentially, a relative energy is the difference between the absolute energies of two systems, and if the systems are treated at the same level of theory, any errors in the absolute energies of the individual species because of the quality of the basis set will roughly cancel. On the other hand, if one system is treated with a higher quality basis set, these errors will no longer cancel, but instead they will become incorporated into the relative energy. These errors can become significant if the cell undergoes a substantial change in volume, which may occur during a tribological simulation.

Two popular approaches exist for minimizing the errors caused by changes in basis set quality during variable-cell AIMD simulations. The first approach is to set Ecut, which determines the number of plane waves considered in the calculation, high enough to ensure that the absolute energies are sufficiently converged for all structures within the range of volumes that one would anticipate encountering in the simulation.
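The volume dependence of the basis-set quality can be made concrete with the standard counting estimate for plane waves inside a cutoff sphere, N ≈ Ω (2 Ecut)^(3/2) / (6π²) in Hartree atomic units, where Ω is the cell volume. The sketch below assumes this continuum (large-cell) estimate; the function names and numerical values are illustrative and are not taken from any particular plane-wave code.

```python
import math

def n_plane_waves(volume, e_cut):
    """Approximate number of plane waves with kinetic energy <= e_cut
    in a cell of the given volume (Hartree atomic units)."""
    return volume * (2.0 * e_cut) ** 1.5 / (6.0 * math.pi ** 2)

def effective_cutoff(volume, n_pw):
    """Cutoff energy implied by a fixed number of plane waves."""
    return 0.5 * (6.0 * math.pi ** 2 * n_pw / volume) ** (2.0 / 3.0)

v0, e_cut = 5000.0, 30.0               # assumed values: bohr^3, hartree
n_fixed = n_plane_waves(v0, e_cut)     # basis chosen for the initial cell

# Compressing the cell by 20% at fixed n_fixed raises the effective cutoff
ratio = effective_cutoff(0.8 * v0, n_fixed) / e_cut
print(f"N_pw ~ {n_fixed:.0f}, effective cutoff ratio = {ratio:.3f}")
```

Under these assumptions, a 20% compression at a fixed number of plane waves corresponds to an effective cutoff that is higher by a factor of 1.25^(2/3), so energies computed at the two volumes are not at the same level of basis-set quality.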
The second approach involves altering the number of plane waves that are explicitly considered during the simulation such that Ecut remains constant with changes in volume.72 We note that regardless of which approach one uses it is necessary to ensure that the basis set is large enough to yield an accurate σsys. An advisable strategy involves performing the AIMD simulation with a basis that is large enough to yield converged results for the pressure and exploring the energetics of any observed processes in greater detail through a second set of calculations with a basis set that yields converged absolute energies. Such a strategy has been used in static quantum chemical calculations of phase transitions in CO₂ (Ref. 73) and could also be used by researchers to study tribochemical reactions.

Multiscale Approaches

In many simulations, it is desirable to simulate as many layers of the confining walls as possible to closely reproduce experimental situations. However, from a computational point of view, one would like to simulate as few degrees of freedom as possible. Unless conditions are special, all processes far away
Figure 16 Representation of a finite-element mesh for the simulation between a fractal, elastic object and a flat substrate. Reproduced with permission from reference 24.
from the interface can be described accurately within elastic theory or other methods that allow for a description of plastic deformations, such as finite elements. The advantage of these continuum-theory-based methods is that it is possible to increasingly coarse-grain the system as one moves away from the interface, thereby reducing the computational effort. New methodological developments even allow one to couple atomistic simulations to continuum-theory descriptions.74,75 It is beyond the scope of this chapter to provide a detailed description of all available methods; however, the coordinate discretization scheme shown in Figure 16 alludes to how one must proceed when incorporating different mesh sizes into a simulation.

Although quasi-static processes can be modeled quite well with continuum-mechanics-based models employing varying mesh size, this is not the case for dynamic processes. Whenever a region exists where the coarse-grain level is changed, one risks introducing artificial dynamics. In particular, the transmission of sound waves and energy density is suppressed whenever the mesh size changes, and hence, it is not possible to have the proper momentum and energy transfer across the boundary when employing a Hamiltonian-based description. It is important, however, to realize that the computational effort required to simulate a three-dimensional system of linear dimension L only scales with L², or sometimes L² ln L, using coarse-grained models, as opposed to the L³ scaling for brute-force methods. It is often advisable to sacrifice realistic dynamics rather than system size.
An alternative approach involves integrating out the elastic degrees of freedom located above the top layer in the simulation.76 The elimination of the degrees of freedom can be done within the context of Kubo theory, or more precisely the Zwanzig formalism, which leads to effective (potentially time-dependent) interactions between the atoms in the top layer.77–80 These effective interactions include those mediated by the degrees of freedom that have been integrated out. For periodic solids, a description in reciprocal space decouples different wave vectors q at least as far as the static properties are concerned. This description in turn implies that the computational effort also remains in the order of L² ln L, provided that use is made of the fast Fourier transform for the transformation between real and reciprocal space. The description is exact for purely harmonic solids, so that one can mimic the static contact mechanics between a purely elastic lattice and a substrate with one single layer only.81 The possibility even exists of including dynamical effects with time-dependent friction terms (plus random forces at finite temperatures).77–80 However, it may not be advisable to take advantage of this possibility, as the simulation would become increasingly slow with increasing number of time steps. Moreover, the simulation will slow down considerably in higher dimensions because of the nonorthogonality of the dynamical coupling in reciprocal space.

To be specific regarding the formalism, let $\tilde{u}_{qi\alpha}(t)$ denote the α component of the displacement field associated with wave vector q and eigenmode i at time t. In the absence of external forces, which can simply be added to the equation, the equation of motion for the coordinates $\tilde{u}_{qi\alpha}$ that are not thermostatted explicitly would read as follows:

$$M \ddot{\tilde{u}}_{qi\alpha}(t) = -G_{qi}\,\tilde{u}_{qi\alpha}(t) + \int_{-\infty}^{t} dt' \sum_{q',j} \sum_{\beta=1}^{3} \gamma^{q'j\beta}_{qi\alpha}(t-t')\,\dot{\tilde{u}}_{q'j\beta}(t') + \Gamma^{q'j\beta}_{qi\alpha}(t') \qquad [37]$$

where the $G_{qi}$ are the (static) Green’s functions, or effective spring constants, associated with eigenmode i and wavelength q. The knowledge of these functions enables us to work out the static contact mechanics. The time-dependent damping coefficients $\gamma^{q'j\beta}_{qi\alpha}(t-t')$ in Eq. [37] reflect the dynamical coupling between various eigenmodes. No reason exists why this coupling should be diagonal in any of its indices, and thus including the terms related to dynamics increases memory requirements and slows down the speed of the calculation tremendously, i.e., beyond the expense of approximating a semi-infinite solid by a discrete, elastic lattice of size L³. Included in Eq. [37] are random forces $\Gamma^{q'j\beta}_{qi\alpha}(t')$, which must be used at finite temperature to counterbalance the time-dependent damping term. The random and damping terms have to be chosen such that they satisfy the fluctuation-dissipation theorem.39
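The balance between damping and random forces required by the fluctuation-dissipation theorem is easiest to see in the memory-free (Markovian) limit, which is a strong simplification of Eq. [37]: a single coordinate with a constant friction coefficient γ and the sign convention m dv/dt = −γv + Γ(t). In that limit the random force sampled once per time step Δt must have variance 2γkT/Δt. The sketch below assumes this simplified model, reduced units, and a plain Euler integrator; all names are illustrative.

```python
import math
import random

def random_force_std(gamma, kT, dt):
    """Standard deviation of the discretized random force for a
    memory-free friction coefficient gamma (fluctuation-dissipation)."""
    return math.sqrt(2.0 * gamma * kT / dt)

def mean_square_velocity(mass, gamma, kT, dt, n_steps, seed=0):
    """Euler integration of m dv/dt = -gamma v + Gamma(t) for a free
    coordinate; returns the time average of v^2."""
    rng = random.Random(seed)
    sigma = random_force_std(gamma, kT, dt)
    v, acc = 0.0, 0.0
    for _ in range(n_steps):
        force = -gamma * v + rng.gauss(0.0, sigma)
        v += force * dt / mass
        acc += v * v
    return acc / n_steps

# Reduced units (assumed): equipartition predicts <v^2> close to kT / m
v2 = mean_square_velocity(mass=1.0, gamma=1.0, kT=1.0, dt=0.05,
                          n_steps=200_000)
```

If the random force were scaled differently from the damping term, the sampled mean-square velocity would drift away from kT/m, which is the practical signature of a violated fluctuation-dissipation relation.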
SELECTED CASE STUDIES The last few years have seen an explosion in the number of atomic-level tribological simulations aimed not just at understanding fundamental aspects of friction, but also at determining the frictional properties of systems used in real-world applications. In this section, we will discuss selected studies in an effort to demonstrate how the principles discussed earlier in this chapter are used in practice. Unfortunately, it is not possible to consider all of the important, high-caliber research within the space of this chapter. Instead, we focus on a few key areas that encompass both fundamental and applied research in computational tribology.
Instabilities, Hysteresis, and Energy Dissipation

Earlier we used a relatively simple model composed of a slider and substrate to demonstrate how mechanical instabilities lead to energy dissipation and friction. However, realistic contacts are much more complex. For instance, real contacts can rarely be described as one-dimensional and almost always contain some molecules that act as impurities. Understanding the frictional aspects of these systems will require a consideration of the role instabilities play in systems that are more complex than the PT model. In this section, we discuss studies that investigate instabilities in more realistic systems.

The role of instabilities involving confined impurity atoms has been investigated by Müser using a model in which two one-dimensional (1-D) or 2-D surfaces were separated by a very low concentration of confined atoms and slid past one another.25 The motion of the confined atoms was simulated with Langevin dynamics where the interactions between these atoms were neglected and the atom-wall interactions were described by

$$V_t = V_t^{(0)} \cos\left[(x - v_0 t)/b_t\right] + V_t^{(1)} \cos\left[2(x - v_0 t)/b_t\right] + \dots$$
$$V_b = V_b^{(0)} \cos(x/b_b) + V_b^{(1)} \cos(2x/b_b) \qquad [38]$$

where the subscripts t and b denote the top and bottom walls, respectively; 2πbt and 2πbb are the periods of the top and bottom walls; and v0 is the sliding velocity. Only the first higher order harmonic was considered and defined as $V_1 = V^{(1)}_{t,b}$. In that study, the "impurity" limit was considered in which the concentration of impurity atoms is so low that interactions between these atoms can be neglected. Simulations with commensurate, 1-D surfaces showed that the behavior of the system was sensitive to V1, as summarized in Figure 17. In the case where V1 < 0, the atom becomes unstable at some point in time and slides into the next minimum. However, the curve representing the mechanically stable position xms is continuous and the atom can always remain close to
Figure 17 Mechanical equilibrium position for adsorbed atoms between two commensurate solid surfaces as a function of relative displacement Δxwall between the walls. The gray lines indicate the motion of the adsorbed atom if the walls are in slow relative sliding motion. (a) V1 < 0, (b) V1 = 0, and (c) V1 > 0. (Axes: xn [a.u.] versus Δxwall [a.u.].) Reproduced with permission from Ref. 25.
the equilibrium position, which minimizes energy dissipation. When V1 = 0, the motion of xms is discontinuous; however, the instability leads to a transition between symmetrically equivalent minima. Consequently, there will be no net change in energy and dissipation will be minimal. On the other hand, when V1 > 0, the motion of xms is discontinuous and the transitions, or "pops" as they were termed by the author of the study, occur between inequivalent minima. This leads to energy dissipation and friction. Thus, different types of instabilities were observed in commensurate systems depending on the interference between Vt and Vb, which was altered with V1. Instabilities that occur when V1 ≤ 0 were termed continuous instabilities and lead to vanishing friction as the sliding velocity v0 goes to zero. When V1 > 0, first-order instabilities occur, which lead to energy dissipation and finite friction as v0 approaches zero.

Simulations of incommensurate surfaces showed a similar dependence on V1, with first-order instabilities occurring if V1 > V1*, where V1* is some positive, critical value that depends on the degree of mismatch between the lattice constants of the top and bottom surfaces. This process leads to nonvanishing Fk as v0 goes to zero. In the case where V1 < V1*, the atoms are dragged with the wall that exerts the maximum lateral force. This, in turn, leads to friction that scales linearly with the sliding velocity. As a result, the friction force will go to zero with v0.

The effect of dimensionality was also considered in that study. It was found that systems with commensurate 2-D walls yield results that are similar to the 1-D case because the interference between Vt and Vb persists. This situation is no longer true for the incommensurate case, where the adsorbed atoms can circumnavigate the points of maximum lateral force, which permits first-order instabilities regardless of the nature of the higher order harmonics in the wall-atom potential.
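The distinction between continuous and first-order instabilities in the commensurate 1-D case can be reproduced with a small quasi-static calculation based on the potentials of Eq. [38]. The sketch below is not the Langevin simulation of the original study: it assumes zero temperature, identical walls with V(0) = 1 and bt = bb = 1, an illustrative value |V1| = 0.2, and a greedy descent on a position grid as a stand-in for slow relaxation. The function name and parameters are hypothetical.

```python
import math

def max_jump(V1, V0=1.0, n_grid=2000, n_steps=2000):
    """Follow one adsorbed atom quasi-statically between two
    commensurate 1-D walls (periods 2*pi) while the top wall is
    displaced from 0 to 4*pi. Returns the largest change in the
    atom position during a single displacement step."""
    h = 2.0 * math.pi / n_grid

    def energy(i, d):
        x = i * h
        top = V0 * math.cos(x - d) + V1 * math.cos(2.0 * (x - d))
        bottom = V0 * math.cos(x) + V1 * math.cos(2.0 * x)
        return top + bottom

    i = n_grid // 2          # x = pi is the energy minimum at d = 0
    largest = 0
    for step in range(1, n_steps + 1):
        d = 4.0 * math.pi * step / n_steps
        start = i
        while True:          # greedy descent to the nearest local minimum
            e_here = energy(i, d)
            e_left, e_right = energy(i - 1, d), energy(i + 1, d)
            if e_left < e_here and e_left <= e_right:
                i -= 1
            elif e_right < e_here:
                i += 1
            else:
                break
        largest = max(largest, abs(i - start))
    return largest * h

smooth = max_jump(V1=-0.2)   # continuous instability expected
pops = max_jump(V1=+0.2)     # first-order instability ("pop") expected
```

With these assumed parameters, the atom position changes only by small amounts per step for V1 < 0, whereas for V1 > 0 it jumps by roughly half a lattice period when the metastable minimum disappears.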
Thus, one would expect friction to remain finite as v0
goes to zero for incommensurate systems, whereas this may not be the case for commensurate surfaces. These differences were demonstrated by considering the change in the friction force as a system proceeds from stick-slip behavior to smooth sliding motion. In the stick-slip regime, the friction force is dominated by static friction, whereas kinetic friction plays the major role in the smooth sliding regime. The simulations showed that an abrupt change in the friction force was observed when commensurate systems underwent a transition from stick-slip to smooth sliding motion, which is consistent with vanishing kinetic friction as the sliding velocity goes to zero. For incommensurate surfaces, the friction force varied smoothly as the system passed between these two regimes. These results are summarized in Figure 6.

That study was extended by considering higher concentrations of confined atoms or molecules, 1-D, 2-D, and three-dimensional (3-D) surfaces, and the internal structure of the confined molecules.82 In what follows, we will focus on the effect of the concentration of confined particles. The relevant results of the study are summarized in Figure 18 and show that the dependence of the friction coefficient upon the concentration of lubricant molecules differs for systems with commensurate and incommensurate walls. For incommensurate systems, it was found that the magnitude of the friction coefficient μ was relatively insensitive to the concentration of confined molecules for coverages between 0.25 and 2.5 monolayers. Furthermore, μ was not highly dependent on v0 and remained finite as the sliding velocity went to zero. This behavior is caused by the ability of first-order instabilities to occur within incommensurate systems under all conditions. Meanwhile, first-order instabilities do not necessarily occur within systems with commensurate walls, and hence, such systems exhibit a significant dependence on surface coverage. The results
Figure 18 Coverage dependence of the kinetic friction coefficient μk of a system containing 0.25–2.5 monolayers of a simple fluid. Commensurate systems (c) are denoted with open symbols, and incommensurate systems (ic) are designated with closed symbols. (Axes: µd versus v0, logarithmic scales; legend: c and ic at coverages of 0.25, 0.5, 0.75, 1.0, 1.5, and 2.0.) Reproduced with permission from Ref. 82.
indicate that for sub-monolayer coverages, commensurate systems exhibit μ that vanishes as v0 goes to zero because the confined molecules move coherently, which prevents first-order instabilities. Above one monolayer coverage, the confined molecules do not move in this manner and first-order instabilities can occur, which leads to nonvanishing friction as v0 goes to zero. The increase in first-order instabilities with coverage for commensurate systems is apparent from the data in Figure 18, where, in some cases, μ of the commensurate system is higher than that of systems with incommensurate walls.

First-order instabilities may not only involve the translational motion of atoms confined within contacts, but they may also involve chemical reactions within the confined fluid itself. This has been demonstrated recently in first-principles studies of zinc phosphates, which are found in protective films formed in automobile engines.19,83 Here, we focus on simulations of systems containing phosphate molecules in which pressure-induced chemical reactions lead to hysteresis and energy dissipation. The reactions involving zinc phosphates are discussed below along with other tribochemical reactions. Systems composed of phosphate molecules were exposed to isotropic pressures that were increased from 2.5 GPa to 32.5 GPa and then returned to the initial pressure. The resultant equation of state (see Figure 19) exhibited a hysteresis loop between 18 GPa and 26 GPa. Such hystereses lead to energy dissipation and friction. Analysis of the dynamics of the system indicated that this hysteresis was the result of a reversible chemical reaction. Essentially, when the system was compressed to 26 GPa, a P–O bond formed abruptly between two neighboring phosphate molecules, as indicated by the rapid decrease in the relevant P–O distance shown in Figure 20. This rapid transition
Figure 19 Volume V of phosphates as a function of pressure p. An initial compression cycle and a subsequent compression/decompression cycle is shown. A hysteresis occurs in the pressure range 18 GPa < p < 26 GPa, indicated by points A and B. Reproduced with permission from Ref. 19.
Figure 20 Bond lengths dPO between the phosphorus and oxygen atoms as a function of pressure p during the compression of phosphates. (Axes: dPO [Å] versus p [GPa]; curves: compression and decompression.) Reproduced with permission from Ref. 19.
between two inequivalent states is consistent with a first-order instability, as defined above. When the system was decompressed, the P–O bond persisted until the pressure dropped below 18 GPa, at which point the P–O bond dissociated rapidly, as indicated by the data in Figure 20. Once again, the abruptness with which this transition occurred is consistent with the formation of instabilities discussed earlier in this chapter. Integrating over the hysteresis loop between the ‘‘compression’’ and ‘‘decompression’’ curves in Figure 19 yields the amount of energy dissipated through the reversible bond formation/dissociation process. Unfortunately, it is not possible to determine the contribution of these transitions to the friction of phosphate films because such a calculation would require knowledge of the number of similar instabilities that occur per sliding distance, which is certainly beyond the limits of first-principles calculations. Nonetheless, the results do indicate that pressure- and shear-induced chemical reactions can contribute to the friction of materials.
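The integration over the hysteresis loop amounts to evaluating the closed integral of p dV over one compression/decompression cycle. A minimal sketch is given below; the idealized rectangular loop, the volume values, and the function names are invented for illustration and are not data from Ref. 19. Only the transition pressures of 26 GPa and 18 GPa are taken from the text.

```python
import numpy as np

GPA_A3_TO_EV = 1.0e-21 / 1.602176634e-19   # 1 GPa * A^3 expressed in eV

def path_integral(p, v):
    """Trapezoidal estimate of the integral of p dV along one branch."""
    p, v = np.asarray(p, float), np.asarray(v, float)
    return float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(v)))

def dissipated_energy(p_comp, v_comp, p_decomp, v_decomp):
    """Net work done on the system over a closed compression /
    decompression cycle, -(closed integral of p dV), in eV."""
    loop = path_integral(p_comp, v_comp) + path_integral(p_decomp, v_decomp)
    return -loop * GPA_A3_TO_EV

# Idealized, made-up loop: the volume drops by 10 A^3 at 26 GPa on
# compression and recovers at 18 GPa on decompression.
p_c, v_c = [2.5, 26.0, 26.0, 32.5], [100.0, 100.0, 90.0, 90.0]
p_d, v_d = [32.5, 18.0, 18.0, 2.5], [90.0, 90.0, 100.0, 100.0]
e_diss = dissipated_energy(p_c, v_c, p_d, v_d)   # 80 GPa*A^3, about 0.5 eV
```

For real equation-of-state data the branches are not flat, but the reversible parts of the compression and decompression curves cancel in the closed integral, so only the area enclosed by the loop contributes.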
The Role of Atomic-Scale Roughness Recently, interest has arisen in simulating realistic contacts. A key question involves the degree to which the atomistic nature of the surfaces in these contacts affects calculated values of friction. In this section, we will discuss studies that shed light on the importance of including atoms and atomic-scale roughness in tribological simulations. We start with a recent study in which continuum models (no atoms) were compared with atomistic models representing similar contact geometries. We then examine simulations
demonstrating that even sub-monolayer atomic roughness can significantly influence the calculated friction forces. The importance of treating surfaces atomistically in tribological simulations can be determined by comparing results from simulations that employ an atomistic description of the system with those based on continuum models. Such a comparison was reported recently by Luan and Robbins in an investigation of the breakdown of continuum models at small length scales.84 In their study, calculations were performed using a system composed of a cylinder pressed against a flat, elastic substrate. Four distinct atomistic models of the cylinder were considered: (1) a bent crystal with lattice spacing matching that of the substrate to render the system commensurate, (2) a bent crystal with lattice spacing differing from that of the substrate to render the system incommensurate, (3) a stepped crystal, and (4) an amorphous structure. Despite clear differences at the atomic level, all of these systems can be treated with the same continuum level model in which the atomic details are neglected. These models were used to evaluate various quantities related to contact mechanics. We will focus on the results pertaining to static friction. It was found that the static friction force was affected by the atomistic nature of the system. The commensurate and stepped cylinders yielded results that were in good agreement with the predictions from the continuum model. This agreement was attributed to the fact that, in these systems, the atoms on the cylinder can lock into registry with those on the substrate, which leads to a large friction force that is proportional to the apparent area of contact. On the other hand, the results showed that the amorphous and incommensurate systems gave values for the static friction force that were approximately an order of magnitude lower than those calculated with the continuum model. 
This was attributed to the fact that the atoms on the cylinders do not lock into registry with those on the substrate, which decreases the force necessary to initiate sliding. Overall, these results indicate that an atomistic description of the surfaces can have a tremendous influence on the calculated friction. This is not surprising given the discussion in earlier sections of this chapter. It is also important to note that the results for the amorphous and incommensurate systems, which are typical of surfaces in real contacts, differ most from the continuum models. This indicates that including atomic-level details of the contact may be necessary to simulate realistic systems.

The comparison of continuum and atomistic models by Luan and Robbins demonstrates that the atomic details of this contact can have a significant influence on the calculated friction. However, those calculations did not explore atomically rough surfaces, which are most likely found in real engineering contacts. The effect of roughness has been investigated recently by Qi et al. in a study of the friction at the interface between two Ni(100) surfaces.85 Two models were considered in that work. In the first model, both surfaces were atomically flat; i.e., the rms roughness was 0.0 Å. In the
Figure 21 Friction coefficient for differently oriented Ni(100)/Ni(100) interfaces. Rough surfaces have a 0.8 Å rms variation in roughness added to the atomically smooth surfaces. Reproduced with permission from Ref. 85.
second model, 25% of the surface atoms were removed to give an rms roughness of 0.8 Å. The surfaces were either perfectly aligned or misoriented through rotation. The results showed that roughening the surfaces increased the friction coefficients by up to an order of magnitude, as can be observed in Figure 21. In the case of atomically flat surfaces, misoriented by 45°, the static friction was very low (μs = 0.21), as anticipated by arguments from analytical theories of friction. Roughening these surfaces increased μs to 2.06. The increase in the calculated friction with roughening was less dramatic for the aligned surfaces, where the atomically flat surfaces form a commensurate contact, which as discussed, results in large static friction forces regardless of surface roughness.

The microscopic origin of the increase in friction with roughening can be understood from the microstructures shown in Figure 22. These structures clearly show that plastic deformation occurs during sliding, except for the atomically flat, incommensurate case. This difference accounts for the low friction coefficient that was calculated for the incommensurate system. In the three cases where plastic deformation occurs, the atoms are no longer elastically coupled to their lattice sites and can interlock the surface in a manner akin to that shown in Figure 4. This interlocking increases the force required to move the surface. It is interesting to point out that, upon bringing the two atomically flat commensurate surfaces together, the identity of each surface disappears and one ends up shearing a crystal, which exhibits finite resistance to shear. In such a scenario, no intervening layer is needed to interlock the surfaces.
Figure 22 Snapshots from the simulations leading to the friction coefficients shown in Figure 21. From left to right: Atomically flat commensurate, atomically flat incommensurate, rough commensurate, and rough incommensurate geometries. Only the flat incommensurate surfaces remain undamaged, resulting in abnormally small friction coefficients. Reproduced with permission from Ref. 85.
The calculated results were compared with experimental data86 regarding the influence of the surface orientation on the friction of Ni(100)/(100) interfaces. Interestingly, the calculated results obtained with the atomically flat surfaces differed significantly from the experimental findings. The agreement between theory and experiment was vastly improved when the surface roughness was taken into account in the simulations. This demonstrates that the mere presence of atoms in the surfaces of the contact may not be sufficient to obtain realistic results, but instead it may be necessary to include surface roughness at the atomic level.
Superlubricity Another area of current interest is superlubricity, which was introduced above. In that section, an argument in favor of superlubricity was given based on the translational invariance of solids. Basically, if plastic deformation and wear are negligible, one would anticipate having the same free energy at the start and end of a sliding process. As there is no net change in free energy, there will be no friction. An additional, microscopic argument for superlubricity is as follows. When a slider is moved relative to a substrate, in a statistical sense, there will be as many surface irregularities pushing the slider to the right as there are pushing it to the left. It will lead to a cancellation of lateral forces and ultra-low friction; see Figure 23. Once again, this mechanism is only operative when energy dissipation is minimal. In recent years, the search for superlubric materials has become a subject of practical importance, which may be most evident in the development of miniaturized systems, where nanoscopic surfaces slide past one another. In many cases, friction and wear are the main impediments to miniaturizing such devices even further. Identifying or designing superlubric materials may
Figure 23 Cancellation of lateral forces between two surfaces. The atoms in the top layer, represented by circles, experience forces that are dependent on the position of the atom with respect to the periodic substrate. The arrows on the atoms indicate the magnitude and direction of these forces. For contacts lacking commensurability that contain a sufficiently large number of surface atoms or irregularities, these forces will cancel in a statistical sense.
allow these technological barriers to be overcome. A thorough understanding of the theoretical aspects of superlubricity will aid in achieving these goals.

Based on the discussion in earlier sections of this chapter, one may expect atomically flat incommensurate surfaces to be superlubric. Indeed the first suggestion that ultra-low friction may be possible was based on simulations of copper surfaces.6,7 Furthermore, the simulations of Ni(100)/(100) interfaces discussed in the previous section showed very low friction when the surfaces were atomically flat and misoriented (see the data for the atomically flat system between 30° and 60° in Figure 21). In general, however, it is reasonable to assume that bare metals are not good candidates for superlubric materials because they are vulnerable to a variety of energy dissipation mechanisms such as dislocation formation, plastic deformation, and wear.

Layered materials such as graphite and MoS2 have been the focus of much attention in terms of superlubricity. These systems are expected to exhibit low friction because the coupling between atoms within a given sheet is much stronger than that between layers. As a result, the sheets will remain relatively rigid under sliding conditions, which in turn will minimize energy dissipation and lead to ultra-low friction. A major problem associated with layered materials is that they tend to rub away, leading to debris formation, which increases friction. Indeed, in some first experimental studies to demonstrate superlubricity, it was found that MoS2 sheets exhibited ultra-low friction prior to the onset of wear.87

Carbon nanotubes have also received a great deal of attention within the context of low-friction materials. These structures are consistent with rolled up graphite sheets and can exist as either solitary tubes (single-walled nanotubes) or several concentric tubes of increasing size (multiwalled nanotubes).
In multiwalled nanotubes, the interactions between atoms within the tubes are much stronger than those between the atoms in two neighboring tubes, which opens up the possibility of ultra-low friction. As such, it has been suggested that multiwalled nanotubes could be used as molecular-level bearings88 and springs89,90 as well as gigahertz nano-oscillators.91 In the remainder of this section we will discuss recent theoretical studies of friction in double-walled carbon nanotubes.
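The statistical cancellation of lateral forces that motivates the search for superlubric incommensurate contacts can be illustrated with a minimal sketch. The model is an assumption made purely for illustration and is not taken from the studies cited here: a rigid one-dimensional chain of atoms rests on a sinusoidal substrate of period 1, each atom feels a force sin(2πx), and the function names and the golden-ratio spacing are hypothetical choices. For a commensurate chain the per-atom force stays of order one, whereas for an incommensurate chain it should fall off roughly as 1/N.

```python
import math

def net_lateral_force(n_atoms, spacing, period=1.0, offset=0.0):
    """Sum of sinusoidal substrate forces acting on a rigid chain of atoms."""
    return sum(
        math.sin(2.0 * math.pi * (offset + i * spacing) / period)
        for i in range(n_atoms)
    )

def max_force_per_atom(n_atoms, spacing, n_offsets=200):
    """Largest |net force| per atom found by scanning rigid shifts of the chain."""
    return max(
        abs(net_lateral_force(n_atoms, spacing, offset=k / n_offsets))
        for k in range(n_offsets)
    ) / n_atoms

# Irrational spacing ratio, used here as a stand-in for an incommensurate contact.
GOLDEN = (1.0 + math.sqrt(5.0)) / 2.0

for n in (10, 100, 1000):
    # Columns: chain length, commensurate result, incommensurate result.
    print(n, max_force_per_atom(n, 1.0), max_force_per_atom(n, GOLDEN))
```

The sketch treats the chain as perfectly rigid, which is the idealization behind the cancellation argument; elastic relaxation of the atoms, which can restore pinning, is deliberately left out.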
Tangney et al.92 studied the friction between an inner and an outer carbon nanotube. Realistic potentials were used for the interactions within each nanotube and LJ potentials were employed to model the dispersive interactions between nanotubes. The intra-tube interaction potentials were varied and for some purposes even increased by a factor of 10 beyond realistic parameterizations, thus artificially favoring the onset of instabilities and friction. Two geometries were studied, one in which inner and outer tubes were commensurate and one in which they were incommensurate. In the simulations, the inner nanotube was initially displaced relative to the outer tube by a distance x. To reduce surface area, the inner nanotube was pulled into the outer tube and a potential energy minimum was found to exist when the inner tube is completely embedded. However, because of the accumulation of kinetic energy, the inner tube will move past this minimum and extend beyond the other end of the outer tube. In the absence of frictional forces, one would anticipate that this process would be repeated indefinitely, with the relative displacements of the two nanotubes oscillating between +x and −x. In the presence of frictional forces, the maximum displacement will decrease with time. This can be seen from the data shown in Figure 24, where the relative displacement of the nanotubes and frictional forces are shown as functions of time.
[Figure 24 plot. Top panel: Force [nN] versus Time [ns]. Bottom panel: Displacement [nm] versus Time [ns]. Curves labeled Incommensurate and Commensurate.]
Figure 24 Top: Friction force between two nanotubes as a function of time. Bottom: Displacement of the nanotubes as a function of time. Gray and black lines indicate incommensurate and commensurate geometries, respectively. Reproduced with permission from reference 92.
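The qualitative expectation described above, undamped oscillation between +x and −x without friction and a shrinking maximum displacement with friction, can be reproduced with a one-dimensional toy model. This is only a sketch under stated assumptions and is not the atomistic simulation of reference 92: the retraction force is taken to have nearly constant magnitude and to change sign at full embedding (modeled with a tanh of hypothetical width), and friction is represented by a hypothetical viscous term −γv in reduced units.

```python
import math

def simulate(x0, gamma, f0=1.0, mass=1.0, width=0.05, dt=1e-3, n_steps=40000):
    """Return the turning-point amplitudes |x| of a telescoping oscillation.

    Restoring force: -f0 * tanh(x / width), i.e. nearly constant magnitude
    with a sign change at x = 0 (inner tube fully embedded).
    Friction: -gamma * v, a hypothetical viscous term chosen for illustration.
    """
    def force(x, v):
        return -f0 * math.tanh(x / width) - gamma * v

    x, v = x0, 0.0
    amplitudes = []
    for _ in range(n_steps):
        # Velocity-Verlet-style step; the friction term is evaluated with
        # the half-step velocity in the second force evaluation.
        a = force(x, v) / mass
        v_half = v + 0.5 * dt * a
        x_new = x + dt * v_half
        a_new = force(x_new, v_half) / mass
        v_new = v_half + 0.5 * dt * a_new
        if v * v_new < 0.0:  # velocity changed sign: a turning point
            amplitudes.append(abs(x_new))
        x, v = x_new, v_new
    return amplitudes

free = simulate(1.0, gamma=0.0)     # no friction: amplitude should persist
damped = simulate(1.0, gamma=0.05)  # with friction: amplitude should decay
print(free[:3], damped[:3])
```

In this picture the decay of the turning-point amplitudes plays the role of the envelope of the displacement curves in Figure 24; the nearly identical curves reported for the two geometries would correspond to nearly identical effective damping.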
Interestingly, the displacement curves shown in Figure 24 are virtually identical for the commensurate and incommensurate systems. This seems counterintuitive as one would expect low-energy dissipation for the incommensurate system and high friction for the commensurate system. To understand this apparent discrepancy, let us return briefly to the system of two idealized egg cartons mentioned above. Aligning the two egg cartons explains static friction because the cartons will be locked. However, once motion is initiated, there is no reason why the system should exhibit kinetic friction because whenever the top egg carton slides downward, kinetic energy is produced that will help it to climb up the next potential energy maximum. To encounter kinetic friction, an additional "microscopic" dissipation mechanism must be operative. The same scenario applies to atomic systems and is exactly demonstrated by the data for the commensurate system shown in Figure 24. The differences between the incommensurate and the commensurate systems are evident from the relative forces between the nanotubes, which are also plotted in Figure 24. In both cases, the forces change sign as the system passes through the potential energy minimum, which is found when the inner tube is completely embedded in the outer tube. However, for the portions of the simulation between these transitions, the magnitude of the force for the incommensurate system is relatively constant, whereas that for the commensurate system oscillates rapidly with what was described as a "butterfly" pattern. This behavior is directly related to commensurability. For the commensurate tubes, the oscillations are from aligned atoms on the two tubes passing over one another. The number of such interactions depends on the degree to which the two tubes overlap. Thus, when the inner tube is completely embedded, i.e., when the force changes sign, the magnitude of the force will be largest.
As the inner tube moves out of either end of the outer tube, the number of inter-tube interactions will decrease, which leads to the butterfly pattern as the inner tube moves back and forth. No such systematic alignment of atoms on the two tubes exists for the incommensurate system, and the oscillations in the force are of a much smaller amplitude. It is important to note that, despite the large oscillations, the average value of the force for the commensurate system is the same as that for the incommensurate system, which leads to the same behavior regarding the relative displacement of the tubes over time.
The study by Tangney et al. indicates that energy dissipation in double-walled nanotubes does not depend on commensurability. The absence of significant dissipation on long time scales indicates that no instabilities occur. Thus, the simulated, low-dimensional rubbing system exhibits superlubricity, not only for incommensurate but even for commensurate surfaces. Finite static and zero kinetic friction forces have also been observed experimentally, albeit for a different system.93 Although the nanotube simulations were based on reasonably realistic potentials, it needs to be emphasized that real carbon nanotubes contain many chemical defects. Additional simulations94,95 of double-walled nanotubes have demonstrated that defects in the walls of the nanotubes as well as the chemical details of the ends of the nanotubes can lead to deviations from superlubric behavior and induce nonviscous types of friction forces, which are also observed in experiments.
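The origin of the butterfly pattern, and the reason it does not change the average force, can be sketched with a simple functional form. Everything in the block below is an illustrative assumption and not the force law of reference 92: the incommensurate force is a constant-magnitude term that changes sign at full embedding, and the commensurate force adds a lattice-periodic ripple whose amplitude is proportional to the overlap of the two tubes. The parameter values (amp, length, period) are hypothetical and in arbitrary units.

```python
import math

def force_incommensurate(x, f0=1.0):
    """Nearly constant retraction force that changes sign at x = 0."""
    return -f0 * math.copysign(1.0, x)

def force_commensurate(x, f0=1.0, amp=3.0, length=10.0, period=0.25):
    """Same smooth force plus a lattice-periodic ripple scaled by the overlap."""
    overlap = max(0.0, 1.0 - abs(x) / length)  # equals 1 when fully embedded
    ripple = amp * overlap * math.sin(2.0 * math.pi * x / period)
    return force_incommensurate(x, f0) + ripple

def mean_force(force, x_start, period=0.25, n=1000):
    """Midpoint-rule average of force(x) over one lattice period."""
    dx = period / n
    return sum(force(x_start + (k + 0.5) * dx) for k in range(n)) / n

def ripple_amplitude(x_start, period=0.25, n=1000):
    """Largest deviation of the commensurate force from the smooth force."""
    dx = period / n
    return max(
        abs(force_commensurate(x) - force_incommensurate(x))
        for x in (x_start + k * dx for k in range(n))
    )

# Period-averaged forces nearly coincide for the two cases.
print(mean_force(force_incommensurate, 2.0), mean_force(force_commensurate, 2.0))
# The ripple is larger near full embedding than near the tube ends.
print(ripple_amplitude(1.0), ripple_amplitude(8.0))
```

Because the ripple averages to nearly zero over each lattice period, a trajectory driven by either force law follows almost the same envelope, which is consistent with the nearly identical displacement curves described above.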
Self-Assembled Monolayers
Self-assembled monolayers (SAMs) are highly ordered organic thin films that have been proposed as potential low-friction, protective coatings for devices such as MEMS/NEMS and computer hard drives.96 The influence of the chemical details of the SAM, such as chain length and chemical nature of the terminal group, on the observed friction has been investigated both experimentally97,98 and theoretically.99 As anticipated, it has been found that these chemical details can have a significant effect on the frictional properties of SAMs. For instance, certain types of end groups promote adhesion between two SAMs, thereby increasing friction. Recent simulations have shown that the frictional aspects of SAMs are not only influenced by the chemical details of the film, but they can also be affected by the degree of order within the SAM. In their study, Park et al.100 investigated the frictional properties of fluorine-terminated alkanethiol SAMs grafted to gold surfaces. The frictional properties of the system were investigated by sliding two SAMs past one another at velocities in the stick-slip regime under various external loads. The simulations yield the shear stress σ_s, and the kinetic friction coefficient μ_k can be estimated from the slope of a plot of σ_s versus load, using the relationships contained in Eqs. [4] and [5]. Two distinct types of fluorine-terminated SAMs emerged from the simulations. One exhibited low shear strength, whereas the other showed a much higher resistance to shear. Interestingly, it was found that the two films had significantly different friction coefficients. Considering external pressures above 400 MPa, it was found that the low-shear strength SAM had μ_k ≈ 0.12, whereas the friction coefficient of the high-shear strength SAM was approximately zero. The authors of that study investigated the origins of these differences.
The differences in the frictional aspects of the two SAMs were found to arise from differences in the structures of the films. Specifically, the high-shear strength SAMs exhibited a high degree of order, such that each terminal group of the SAM on one surface was perfectly aligned with a terminal group of the SAM on the opposing surface. Thus, during sliding all chains of one SAM will lock with all of those on the opposite surface, which leads to a stick condition and high friction. The low-shear strength SAM, on the other hand, exhibited less order and only a fraction of the terminal groups would lock, thereby leading to lower friction. Thus, the differences in the order of the SAMs account for the differences in the shear strengths of the films. The authors also attributed the lower kinetic friction coefficient for the high-shear strength SAM to the increased order in the film, as higher order had previously been shown to lead to lower values for μ_k.101 Overall, this work demonstrates that the structure and order of films can have a significant influence on friction, even when the surfaces are of identical chemical composition.
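The estimate of a kinetic friction coefficient from the slope of shear stress versus load can be sketched as a straight-line fit. The block below assumes a linear relation of the form shear stress = intercept + μ_k × pressure, which is one common reading of such a plot; the exact form of Eqs. [4] and [5] is not reproduced here, and the data are synthetic values generated for illustration only, not results from reference 100. The names fit_line, loads, and shear are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic, illustrative data only (MPa); not simulation results.
loads = [400.0, 500.0, 600.0, 700.0, 800.0]
shear = [30.0 + 0.12 * p for p in loads]

# The slope is read as the friction coefficient, the intercept as the
# shear stress extrapolated to zero load.
mu_k, sigma_0 = fit_line(loads, shear)
print(round(mu_k, 3), round(sigma_0, 1))
```

With real simulation output the fit would normally be restricted to the load range over which the relation is linear, such as the pressures above 400 MPa mentioned in the text.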
Tribochemistry
An emerging subdiscipline of tribological simulation involves the study of tribochemical reactions, that is, reactions that are activated by pressure and shear. These reactions alter the structure of lubricants and films that are used to protect surfaces from wear. Understanding the effects of these reactions on the intended behavior of these films is of utmost importance. However, simulation studies of tribochemical reactions have been impeded by the difficulty in accurately describing changes in chemical bonding. In a limited number of cases, this can be achieved with the use of reactive FFs, as noted above, whereas in other cases, one must resort to expensive quantum chemical calculations. In this section, we will describe two studies where such methods were used to examine tribochemical reactions.
Chateauneuf et al. investigated the reactions that occur when SAMs containing diacetylene moieties are compressed and sheared.66 The authors were interested in exploring the origin of experimental results, which indicate that polymerization within such films can significantly influence the observed friction. To investigate this behavior, model SAMs composed of long alkyne chains grafted onto a diamond surface were compressed and sheared with a model for an AFM tip composed of amorphous carbon. The authors employed a reactive FF that was capable of modeling changes in hybridization of carbon atoms,64 and could therefore account for the formation of chemical bonds leading to polymerization within the film. The results demonstrated that both compression and shear can induce the formation of C–C bonds between sp-hybridized carbon atoms, which leads to polymerization within the SAM. Interestingly, it was found that the location of these reactive sites within the film could influence the calculated friction.
For instance, if the diacetylene components in the chains were close to the tip/film interface, reactions between the film and tip could occur, which led to wear and high friction. On the other hand, if the diacetylene moieties were far from the tip, the reactions did not lead to wear and had little effect on the average calculated friction. These observations demonstrate that a proper treatment of the chemical reactivity of the system may be necessary in some cases to calculate friction accurately. Reactive FFs can only be applied to a few specific cases for which they have been developed, such as the hydrocarbon systems discussed in the first part of this section. For other systems, describing tribochemical reactions requires the use of quantum chemical methods. In recent studies, such methods have been applied to investigate the behavior of zinc phosphates (ZPs) in response to high pressures. ZPs form the basis of anti-wear films derived from zinc dialkyldithiophosphates (ZDDPs), which are additives that have
been incorporated into virtually all motor oils for the last seven decades. It has recently become apparent that these additives must be replaced because they are harmful to the environment and do not inhibit wear effectively in modern engines composed of aluminum. Despite extensive experimental research,102 the mechanisms through which the ZP films form and function have remained unknown, thus hindering efforts to develop new additives. One key experimental observation regarding the ZP films is that the films found on the tops of asperities are stiffer and exhibit chemical spectra indicative of longer phosphate chain lengths than films found in the valleys between asperities. These observations suggest that differences in the conditions at the two distinct locations alter the elastic and chemical properties of the films. One of the key differences between the tops of asperities and the valleys is the pressure experienced by the zinc phosphates. Since the highest pressures, and greatest potential for wear, are achieved at the tops of the asperities, determining the response of ZPs to these pressures may aid in developing a clear picture of how the anti-wear films work. In a recent study, Mosey, Müser, and Woo used ab initio molecular dynamics simulations to investigate the behavior of ZPs under high-pressure conditions encountered during asperity collisions in running engines.83 Systems composed of zinc phosphate molecules, which are formed when ZDDP decomposes thermally in the engine, were subjected to pressures that were increased from a low value to the theoretical yield strength of iron or aluminum and then decreased to the initial ambient values of pressure. Several successive compression cycles were performed. Relevant structures observed during these simulations are shown in Figure 25.
Figure 25 Molecular configuration of zinc phosphates. (a) The structure of the system before compression. (b) The structure of the system when compressed to 16 GPa. (c) The structure of the system when fully decompressed. Adapted from Ref. 83.
The simulations demonstrated that compressing the system to a pressure of at least 5 GPa results in the irreversible formation of chemical cross-links between the phosphate groups with Zn acting as a cross-linking agent. This irreversible change is evident through a comparison of structures (a) and (c) in Figure 25, which show the structure of the system before and after compression. In the first case, the system is composed of disconnected ZP molecules, whereas in the latter, extended bonding is present throughout the system. It is important to note that the presence of chain-like structures in (c) is a consequence of the limited size of the systems that were considered, and in larger systems, cross-linking would lead to structures where networking extends in all three spatial dimensions on longer length scales. A comparison with simulations of pure phosphates (see above) showed that this irreversible behavior does not occur without zinc. The results indicated that cross-link formation increased the bulk modulus of the system. As noted, cross-linking was a pressure-induced effect that was facilitated by a change in the coordination at zinc when the pressure reached 5 GPa. The observation that stiffening of the film is a pressure-induced phenomenon is consistent with the differences in the measured elastic properties of films found on the tops of the asperities and those found in the valleys between asperities as mentioned above. Basically, in real systems, pressures high enough to form stiff cross-linked films are achieved on top of the asperities, but they are not encountered between the asperities. It was also found that cross-linking, which occurs at pressures accessible on aluminum, causes the films to become harder than aluminum. Thus, on aluminum surfaces, one could expect the films to act as abrasives that will induce wear, as has been observed in sliding experiments. The authors of the study suggested that the inability of ZDDP additives to protect aluminum surfaces from wear may be caused by the pressure-induced stiffening of the film. An additional pressure-induced reaction was observed during the simulations at a pressure of 16 GPa, which is accessible on iron. This reversible process involved an increase in the coordination number of Zn through Zn–O bond formation and further increased the degree of cross-linking in the system. It was found that this increase in cross-linking increased the stiffness of the film to a value approaching that of iron. It was suggested that this may contribute to the anti-wear properties of the film.
Essentially, stiffening allows the film to accommodate and redistribute applied loads, thereby protecting an underlying iron surface. This behavior clearly requires the presence of cations that can undergo changes in coordination, which may explain why the anti-wear capabilities of the ZP films are reduced significantly if some zinc atoms are replaced with calcium (which does not exhibit variable coordination). Overall, this work highlights how quantum chemical methods can be used to study tribochemical reactions within chemically complex lubricant systems. The results shed light on processes that are responsible for the conversion of loosely connected ZP molecules derived from anti-wear additives into stiff, highly connected anti-wear films, which is consistent with experiments. Additionally, the results explain why these films inhibit wear of hard surfaces, such as iron, yet do not protect soft surfaces such as aluminum. The simulations also explained a large number of other experimental observations pertaining to ZDDP anti-wear films and additives.103 Perhaps most importantly, the simulations demonstrate the importance of cross-linking within the films, which may aid in the development of new anti-wear additives.
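The pressure history described for the zinc phosphate simulations, an irreversible change near 5 GPa and a reversible one near 16 GPa, can be summarized with a small bookkeeping sketch. This is not an ab initio calculation and not the method of reference 83; it only encodes the two thresholds quoted in the text as a toy state model. The class and function names are hypothetical, and the peak pressures used in the example (3, 10, and 20 GPa) are arbitrary illustrative values, since the yield strengths used in the study are not quoted here.

```python
from dataclasses import dataclass

CROSSLINK_GPA = 5.0      # irreversible cross-linking threshold quoted in the text
COORDINATION_GPA = 16.0  # reversible Zn coordination change quoted in the text

@dataclass(frozen=True)
class FilmState:
    cross_linked: bool = False
    high_coordination: bool = False

def apply_pressure(state, pressure_gpa):
    """Update the toy film state for a given instantaneous pressure."""
    # Cross-links persist once formed (irreversible).
    cross_linked = state.cross_linked or pressure_gpa >= CROSSLINK_GPA
    # The higher Zn coordination exists only while the pressure is high (reversible).
    high_coordination = pressure_gpa >= COORDINATION_GPA
    return FilmState(cross_linked, high_coordination)

def compression_cycle(state, peak_gpa, n_steps=50):
    """Ramp the pressure from 0 to peak_gpa and back; return (peak state, final state)."""
    ramp = [peak_gpa * k / n_steps for k in range(n_steps + 1)]
    for p in ramp:
        state = apply_pressure(state, p)
    at_peak = state
    for p in reversed(ramp):
        state = apply_pressure(state, p)
    return at_peak, state

for peak in (3.0, 10.0, 20.0):  # hypothetical peak pressures in GPa
    print(peak, compression_cycle(FilmState(), peak))
```

In this toy picture a cycle that never reaches 5 GPa leaves the film unchanged, which mirrors the contrast drawn in the text between films on top of asperities and films in the valleys between them.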
CONCLUDING REMARKS
Interest in tribological simulations has developed because of the need to understand the fundamental details of friction and wear. In this chapter, we have provided a detailed overview of several aspects related to such simulations. Basic theories of friction have been covered, and potential pitfalls associated with tribological simulations were described. Particular emphasis was placed on designing simulations that are representative of realistic systems, so that meaningful results can be calculated. Finally, several recently reported tribological simulation studies were discussed. These studies span topics ranging from fundamental issues, such as the nature of instabilities during sliding, to practical issues, such as the function of engine anti-wear films. These studies represent the current status of simulations in exploring both fundamental and applied areas of tribology.
ACKNOWLEDGMENTS
Financial support from the Natural Sciences and Engineering Research Council of Canada and SHARCnet (Ontario) is gratefully acknowledged.
REFERENCES
1. D. Dowson, History of Tribology, Longman, London, 1999.
2. J. A. Harrison, S. J. Stuart, and D. W. Brenner, in Handbook of Micro/Nanotribology, B. Bhushan, Ed., CRC Press, Boca Raton, FL, 1999, pp. 525–597. Atomic-Scale Simulations of Tribological and Related Phenomena.
3. M. O. Robbins and M. H. Müser, in Modern Tribology Handbook, B. Bhushan, Ed., CRC Press, Boca Raton, FL, 2001, pp. 717–770. Computer Simulations of Friction, Lubrication, and Wear.
4. M. H. Müser, M. Urbakh, and M. O. Robbins, Adv. Chem. Phys., 126, 187 (2003). Statistical Mechanics of Static and Low-Velocity Kinetic Friction.
5. C. J. Mundy, S. Balasubramanian, K. Baghi, M. E. Tuckerman, G. J. Martyna, and M. L. Klein, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley, New York, 2000, pp. 291–397. Nonequilibrium Molecular Dynamics.
6. M. Hirano and K. Shinjo, Phys. Rev. B, 41, 11837 (1990). Atomic Locking and Friction.
7. K. Shinjo and M. Hirano, Surf. Sci., 283, 473 (1993). Dynamics of Friction: Superlubric State.
8. L. Prandtl, Z. Angew. Math. Mech., 8, 85 (1928). Ein Gedankenmodell zur Kinetischen Theorie der Festen Körper.
9. G. A. Tomlinson, Philos. Mag. Series, 7, 905 (1929). A Molecular Theory of Friction.
10. M. H. Müser, Europhys. Lett., 66, 97 (2004). Structural Lubricity: Role of Dimension and Symmetry.
11. B. N. J. Persson, J. Chem. Phys., 115, 3840 (2001). Theory of Rubber Friction and Contact Mechanics.
12. F. P. Bowden and D. Tabor, The Friction and Lubrication of Solids, Clarendon Press, Oxford, 1986.
13. S. Hyun, L. Pei, J.-F. Molinari, and M. O. Robbins, Phys. Rev. E, 70, 026117 (2004). Finite-Element Analysis of Contact between Elastic Self-Affine Surfaces.
14. G. He, M. H. Müser, and M. O. Robbins, Science, 284, 1650 (1999). Adsorbed Layers and the Origin of Static Friction.
15. M. H. Müser, L. Wenning, and M. O. Robbins, Phys. Rev. Lett., 86, 1295 (2001). Simple Microscopic Theory of Amontons’s Laws for Static Friction.
16. J. A. Harrison, C. T. White, R. J. Colton, and D. W. Brenner, Thin Solid Films, 260, 205 (1995). Investigation of the Atomic-Scale Friction and Energy Dissipation in Diamond using Molecular Dynamics.
17. J. N. Glosli and G. McClelland, Phys. Rev. Lett., 70, 1960 (1993). Molecular-Dynamics Study of Sliding Friction of Ordered Organic Monolayers.
18. G. T. Gao, P. T. Mikulski, and J. A. Harrison, J. Am. Chem. Soc., 124, 7202 (2002). Molecular-Scale Tribology of Amorphous Carbon Coatings: Effects of Film Thickness, Adhesion, and Long-Range Interactions.
19. N. J. Mosey, T. K. Woo, and M. H. Müser, Phys. Rev. B, 72, 054124 (2005). Energy Dissipation via Quantum Chemical Hysteresis during High-Pressure Compression: A First-Principles Molecular Dynamics Study of Phosphates.
20. L. Wenning and M. H. Müser, Europhys. Lett., 54, 693 (2001). Friction Laws for Elastic Nanoscale Contacts.
21. G. He and M. O. Robbins, Phys. Rev. B, 64, 035413 (2001). Simulations of the Static Friction due to Adsorbed Molecules.
22. A. W. Bush, R. D. Gibson, and T. R. Thomas, Wear, 35, 87 (1975). The Elastic Contact of a Rough Surface.
23. B. N. J. Persson, Phys. Rev. Lett., 87, 116101 (2001). Elastoplastic Contact between Randomly Rough Surfaces.
24. B. Q. Luan, S. Hyun, M. O. Robbins, and N. Bernstein, Mater. Res. Soc. Symp. Proc., 841, R7.4.1 (2005). Multiscale Modeling of Two Dimensional Rough Surface Contacts.
25. M. H. Müser, Phys. Rev. Lett., 89, 224301 (2002). Nature of Mechanical Instabilities and their Effect on Kinetic Friction.
26. Y. Sang, M. Dubé, and M. Grant, Phys. Rev. Lett., 87, 174301 (2001). Thermal Effects on Atomic Friction.
27. O. K. Dudko, A. E. Filippov, J. Klafter, and M. Urbakh, Chem. Phys. Lett., 352, 499 (2002). Dynamic Force Spectroscopy: A Fokker-Planck Approach.
28. G. He and M. O. Robbins, Tribol. Lett., 10, 7 (2001). Simulations of the Kinetic Friction due to Adsorbed Surface Layers.
29. J. H. Dieterich and B. D. Kilgore, Pure Appl. Geophys., 143, 283 (1994). Direct Observation of Frictional Contacts - New Insights for State-Dependent Properties.
30. J. H. Dieterich and B. D. Kilgore, Tectonophysics, 256, 219 (1996). Imaging Surface Contacts: Power Law Contact Distributions and Contact Stresses in Quartz, Calcite, Glass and Acrylic Plastic.
31. M. Schoen, C. L. Rhykerd, D. J. Diestler, and J. H. Cushman, Science, 245, 1223 (1989). Shear Forces in Molecularly Thin Films.
32. M. H. Müser and M. O. Robbins, Phys. Rev. B, 61, 2335 (2000). Conditions for Static Friction between Flat Crystalline Surfaces.
33. P. Meakin, Fractals, Scaling and Growth Far from Equilibrium, Cambridge University Press, Cambridge, 1998.
34. W. Steele, Surf. Sci., 36, 317 (1973). Physical Interaction of Gases with Crystalline Solids. I. Gas-Solid Energies and Properties of Isolated Adsorbed Atoms.
35. M. O. Robbins and P. A. Thompson, Science, 253, 916 (1991). Critical Velocity of Stick-Slip Motion.
36. B. Q. Luan and M. O. Robbins, Phys. Rev. Lett., 93, 036105 (2004). Effect of Inertia and Elasticity on Stick-Slip Motion.
37. T. Schneider and E. Stoll, Phys. Rev. B, 17, 1302 (1978). Molecular-Dynamics Study of a Three-Dimensional One-Component Model for Distortive Phase Transitions.
38. P. Español and P. Warren, Europhys. Lett., 30, 191 (1995). Statistical Mechanics of Dissipative Particle Dynamics.
39. R. Kubo, Rep. Prog. Phys., 29, 255 (1966). Fluctuation-Dissipation Theorem.
40. S. S. Sarman, D. J. Evans, and P. T. Cummings, Phys. Rep., 305, 1 (1998). Recent Developments in Non-Newtonian Molecular Dynamics.
41. A. Ricci and G. Ciccotti, Mol. Phys., 101, 1927 (2003). Algorithms for Brownian Dynamics.
42. D. J. Evans and G. P. Morriss, Statistical Mechanics of Nonequilibrium Liquids, Academic Press, London, 1990.
43. D. J. Evans and G. P. Morriss, Phys. Rev. Lett., 56, 2172 (1986). Shear Thickening and Turbulence in Simple Fluids.
44. W. Loose and G. Ciccotti, Phys. Rev. A, 45, 3859 (1992). Temperature and Temperature Control in Non-Equilibrium-Molecular-Dynamics Simulations of the Shear Flow of Dense Liquids.
45. M. J. Stevens and M. O. Robbins, Phys. Rev. E, 48, 3778 (1993). Simulations of Shear-Induced Melting and Ordering.
46. P. Reimann and M. Evstigneev, Phys. Rev. Lett., 93, 230802 (2004). Nonmonotonic Velocity Dependence of Atomic Friction.
47. P. Español, Phys. Rev. E, 52, 1734 (1995). Hydrodynamics from Dissipative Particle Dynamics.
48. T. Soddemann, D. Dünweg, and K. Kremer, Phys. Rev. E, 68, 046702 (2003). Dissipative Particle Dynamics: A Useful Thermostat for Equilibrium and Nonequilibrium Molecular Dynamics Simulations.
49. D. Dowson and G. R. Higginson, Elastohydrodynamic Lubrication, Pergamon, Oxford, 1968.
50. L. Bocquet and J.-L. Barrat, Phys. Rev. Lett., 70, 2726 (1993). Hydrodynamic Boundary-Conditions and Correlation-Functions of Confined Fluids.
51. A. W. Lees and S. F. Edwards, J. Phys. C, 5, 1921 (1972). The Computer Study of Transport Processes Under Extreme Conditions.
52. M. Parrinello and A. Rahman, J. Chem. Phys., 76, 2662 (1982). Strain Fluctuations and Elastic Constants.
53. M. Parrinello and A. Rahman, J. Appl. Phys., 52, 7182 (1981). Polymorphic Transitions in Single Crystals: A New Molecular Dynamics Method.
54. B. Fraser, M. H. Müser, and C. Denniston, J. Pol. Sci. B, 43, 969 (2005). Diffusion, Elasticity, and Shear Flow in Self-Assembled Block Copolymers: A Molecular Dynamics Study.
55. R. M. Wentzcovitch, Phys. Rev. B, 44, 2358 (1991). Invariant Molecular-Dynamics Approach to Structural Phase-Transitions.
56. B. Fraser, Ph.D. thesis, University of Western Ontario, 2005. Molecular Dynamics Simulations of Diblock Copolymers Under Shear.
57. E. Gnecco, R. Bennewitz, T. Gyalog, and E. Meyer, J. Phys.-Condens. Mat., 13, R619–R642 (2001). Friction Experiments on the Nanometre Scale.
58. J. P. Bowen and N. L. Allinger, in Reviews in Computational Chemistry, Vol. 2, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, Weinheim, Germany, 1991, pp. 81–97. Molecular Mechanics: The Art and Science of Parameterization.
59. U. Dinur and A. T. Hagler, in Reviews in Computational Chemistry, Vol. 2, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, Weinheim, Germany, 1991, pp. 99–164. New Approaches to Empirical Force Fields.
60. C. R. Landis, D. M. Root, and T. Cleveland, in Reviews in Computational Chemistry, Vol. 6, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, Weinheim, Germany, 1995, pp. 73–148. Molecular Mechanics Force Fields for Modeling Inorganic and Organometallic Compounds.
61. D. W. Brenner, O. A. Shenderova, and D. A. Areshkin, in Reviews in Computational Chemistry, Vol. 12, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley, New York, 1998, pp. 207–239. Analytic Interatomic Forces and Materials Simulation.
62. O. M. Braun and Y. S. Kivshar, Phys. Rep., 306, 1 (1998). Non-Linear Dynamics of the Frenkel-Kontorova Model.
63. J. E. Hammerberg, B. L. Holian, J. Roder, A. R. Bishop, and S. J. Zhou, Physica D, 123, 330 (1998). Nonlinear Dynamics and the Problem of Slip at Material Interfaces.
64. S. J. Stuart, A. B. Tutein, and J. A. Harrison, J. Chem. Phys., 112, 6472 (2000). A Reactive Potential for Hydrocarbons with Intermolecular Interactions.
65. D. W. Brenner, Phys. Rev. B, 42, 9458 (1990). Empirical Potential for Hydrocarbons for Use in Simulating the Chemical Vapor Deposition of Diamond Films.
66. G. M. Chateauneuf, P. T. Mikulski, G.-T. Gao, and J. A. Harrison, J. Phys. Chem. B, 108, 16626 (2004). Compression- and Shear-Induced Polymerization in Model Diacetylene-Containing Monolayers.
67. C. Cramer, Computational Chemistry: Theories and Models, John Wiley and Sons, Ltd., Chichester, U.K., 2002.
68. R. Neitola and T. A. Pakkanen, J. Phys. Chem. B, 105, 1338 (2001). Ab initio Studies on the Atomic-Scale Origin of Friction between Diamond (111) Surfaces.
69. W. Zhong and D. Tomanek, Phys. Rev. Lett., 64, 3054 (1990). First-Principles Theory of Atomic-Scale Friction.
70. J. B. Sokoloff, Thin Solid Films, 206, 208 (1991). The Relationship between Static and Kinetic Friction and Atomic Level Stick-Slip Motion.
71. P. G. Dacosta, O. H. Nielsen, and K. Kunc, J. Phys. C: Solid State Phys., 19, 3163 (1986). Stress Theorem in the Determination of Static Equilibrium by the Density Functional Method.
72. M. Bernasconi, G. L. Chiarotti, P. Focher, S. Scandolo, E. Tosatti, and M. Parrinello, J. Phys. Chem. Solids, 56, 501 (1995). First-Principle-Constant Pressure Molecular Dynamics.
73. S. Serra, C. Cavazzoni, G. L. Chiarotti, S. Scandolo, and E. Tosatti, Science, 284, 788 (1999). Pressure-Induced Solid Carbonates from Molecular CO2 by Computer Simulation.
74. J. Q. Broughton, F. F. Abraham, N. Bernstein, and E. Kaxiras, Phys. Rev. B, 60, 2391 (1999). Concurrent Coupling of Length Scales: Methodology and Application.
75. W. A. Curtin and R. E. Miller, Model. Simul. Mater. Sci., 11, R33 (2003). Atomistic/Continuum Coupling in Computational Materials Science.
76. Y. Saito, J. Phys. Soc. Jpn., 73, 1816 (2004). Elastic Lattice Green’s Function in Three Dimensions.
77. W. Cai, M. de Koning, V. V. Bulatov, and S. Yip, Phys. Rev. Lett., 85, 3213 (2000). Minimizing Boundary Reflections in Coupled-Domain Simulations.
78. S. A. Adelman and J. D. Doll, J. Chem. Phys., 64, 2375 (1976). Generalized Langevin Equation Approach for Atom/Solid-Surface Scattering: General Formulation for Classical Scattering off Harmonic Solids.
79. S. A. Adelman and J. D. Doll, J. Chem. Phys., 62, 2518 (1975). Generalized Langevin Equation Approach for Atom/Solid-Surface Scattering. Collinear Atom/Harmonic Chain Model. [Erratum to document cited in CA 82:77265].
80. S. A. Adelman and J. D. Doll, J. Chem. Phys., 61, 4242 (1974). Generalized Langevin Equation Approach for Atom/Solid-Surface Scattering. Collinear Atom/Harmonic Chain Model.
81. C. Campana and M. H. Müser, Phys. Rev. B, 74, 075420 (2006). A Practical Green’s Function Approach to the Simulation of Elastic, Semi-Infinite Solids.
82. M. Aichele and M. Müser, Phys. Rev. E, 68, 016125 (2003). Kinetic Friction and Atomistic Instabilities in Boundary-Lubricated Systems.
83. N. J. Mosey, M. H. Müser, and T. K. Woo, Science, 307, 1612 (2005). Molecular Mechanisms for the Functionality of Lubricant Additives.
84. B. Luan and M. O. Robbins, Nature, 435, 929 (2005). The Breakdown of Continuum Models for Mechanical Contacts.
85. Y. Qi, Y.-T. Cheng, T. Cagin, and W. A. Goddard III, Phys. Rev. B, 66, 085420 (2002). Friction Anisotropy at Ni(100)/(100) Interfaces. Molecular Dynamics Studies.
86. J. S. Ko and A. J. Gellman, Langmuir, 16, 8343 (2000). Friction Anisotropy at Ni(100)/Ni(100) Interfaces.
87. J. M. Martin, C. Donnet, and Th. Le Mogne, Phys. Rev. B, 48, 10583 (1993). Superlubricity of Molybdenum Disulfide.
88. W. Guo and H. Gao, Comput. Model. Eng. Sci., 7, 19 (2005). Optimized Bearing and Interlayer Friction in Multiwalled Carbon Nanotubes.
89. J. Cumings and A. Zettl, Science, 289, 602 (2000). Low-Friction Nanoscale Linear Bearing Realized from Multiwall Carbon Nanotubes.
90. J. Cumings and A. Zettl, Nature, 405, 586 (2000). Peeling and Sharpening Multiwall Nanotubes.
91. W. Guo, Y. Guo, H. Gao, Q. Zheng, and W. Zhong, Phys. Rev. Lett., 91, 125501 (2003). Energy Dissipation in Gigahertz Oscillators from Multiwalled Carbon Nanotubes.
92. P. Tangney, S. G. Louie, and M. L. Cohen, Phys. Rev. Lett., 93, 065503 (2004). Dynamic Sliding Friction between Concentric Carbon Nanotubes.
93. A. Socoliuc, R. Bennewitz, E. Gnecco, and E. Meyer, Phys. Rev. Lett., 92, 134301 (2004). Transition from Stick-Slip to Continuous Sliding in Atomic Friction: Entering a New Regime of Ultralow Friction.
94. Z. Xia and W. A. Curtin, Phys. Rev. B, 69, 233408 (2004). Pullout Forces and Friction in Multiwall Carbon Nanotubes.
95. W. Guo, W. Zhong, Y. Dai, and S. Li, Phys. Rev. B, 72, 075409 (2005). Coupled Defect-Size Effects on Interlayer Friction in Multiwalled Carbon Nanotubes.
96. A. Ulman, Chem. Rev., 96, 1533 (1996). Formation and Structure of Self-Assembled Monolayers.
97. J. E. Houston and H. I. Kim, Acc. Chem. Res., 35, 547 (2002). Adhesion, Friction, and Mechanical Properties of Functionalized Alkanethiol Self-Assembled Monolayers.
98. H. I. Kim and J. E. Houston, J. Am. Chem. Soc., 122, 12045 (2000). Separating Mechanical and Chemical Contributions to Molecular-Level Friction.
99. B. Park, M. Chandross, M. J. Stevens, and G. S. Grest, Langmuir, 19, 9239 (2003). Chemical Effects on the Adhesion and Friction between Alkanethiol Monolayers: Molecular Dynamics Simulations.
100. B. Park, C. D. Lorenz, M. Chandross, M. J. Stevens, and G. S. Grest, Langmuir, 20, 10007 (2004). Frictional Dynamics of Fluorine-Terminated Alkanethiol Self-Assembled Monolayers.
101. E. D. Smith, M. O. Robbins, and M. Cieplak, Phys. Rev. B, 54, 8252 (1996). Friction on Adsorbed Monolayers.
102. M. A. Nicholls, T. Do, P. R. Norton, M. Kasrai, and G. M. Bancroft, Tribol. Int., 38, 15 (2005). Review of the Lubrication of Metallic Surfaces by Zinc Dialkyldithiophosphates.
103. N. J. Mosey, T. K. Woo, M. Kasrai, P. R. Norton, G. M. Bancroft, and M. H. Müser, Tribol. Lett., 24, 105 (2006). Interpretation of Experiments on ZDDP Anti-Wear Additives and Films through Pressure-Induced Cross Linking.
CHAPTER 3
Computing Free Volume, Structural Order, and Entropy of Liquids and Glasses

Jeetain Mittal (a), William P. Krekelberg (a), Jeffrey R. Errington (b), and Thomas M. Truskett (a,c)

(a) Department of Chemical Engineering, The University of Texas at Austin, Austin, Texas
(b) Department of Chemical & Biological Engineering, University at Buffalo, The State University of New York, Buffalo, New York
(c) Institute for Theoretical Chemistry, The University of Texas at Austin, Austin, Texas
Reviews in Computational Chemistry, Volume 25, edited by Kenny B. Lipkowitz and Thomas R. Cundari. Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.

INTRODUCTION

The macroscopic properties of a material are related intimately to the interactions between its constituent particles, be they atoms, ions, molecules, or colloids suspended in a solvent. Such relationships are fairly well understood for cases where the particles are present in low concentration and interparticle interactions occur primarily in isolated clusters (pairs, triplets, etc.). For example, the pressure of a low-density vapor can be accurately described by the virial expansion,1 whereas its transport coefficients can be estimated from kinetic theory.2,3 On the other hand, using microscopic information to predict the properties, and in particular the dynamics, of condensed phases such as liquids and solids remains a far more challenging task. In these states
of highly concentrated matter, the particles remain in constant “contact” with many neighbors, and as a result, their dynamic processes often rely on cooperative structural rearrangements.4–8 In short, the simplifications typically invoked in the study of dilute systems do not apply to these states. The close average proximity of interacting particles in condensed phases also leads to pronounced structural correlations;9 i.e., the position and orientation of a given particle strongly bias the arrangement of other particles in the system. These correlations extend over macroscopic distances in the case of crystals, where they reflect an underlying morphology that can be described uniquely by a periodically replicated unit cell. Such regularity gives rise to several profound simplifications that are central to solid-state physics, a field that has developed an arsenal of precise experimental and theoretical tools for characterizing crystalline matter.10 However, the conceptual picture for liquids, glasses, and other forms of soft condensed matter7,11–14 is considerably more complex. The average packing structures in these materials are too dense (i.e., too correlated), on the one hand, to be amenable to a “dilute gas” description, and at the same time too disordered to be suitable for the simplifying unit-cell framework of crystals. As a result, progress in the theories of liquids and glasses (and of the intervening glass transition itself) has lagged behind that for dilute gases and crystals. Nonetheless, major advances have occurred in the past 20 years, fueled largely by the development of (1) efficient algorithms for simulating the liquid state,15–17 and (2) faster computers to accommodate the exploration of these systems across increasingly broader length and time scales.
In a sense, many advances in understanding the liquid and glassy states of matter can be directly attributed to the ability of computer simulations to provide precise information on the ‘‘trinity’’ of relevant material properties (structure, thermodynamics, and dynamics) for a variety of model systems. This new information is allowing for stringent tests of the most pervasive ideas about the microscopic origins of the characteristic behaviors of the liquid and glassy states. This chapter focuses on three central concepts (structural ordering, free volume, and entropy) that continue to gain traction in this area largely because they offer promising, albeit still empirical, means for connecting static structural or thermodynamic properties to dynamics. For each concept, basic definitions are presented and some relevant computational strategies that are currently being employed are outlined. We discuss the rationale for their continued use, but we also mention some of their practical shortcomings. Finally, examples are given where the computational application of these concepts has led to new physical insights. This chapter is not meant to serve as a general review of the phenomenology of the liquid and glassy states because several excellent reviews and papers on this topic are already available.6,13,14,18–28 Rather, we provide here a compact and useful starting point for both novice graduate students and experts alike who are interested in contributing to an improved understanding of this important class of material systems.
METRICS FOR STRUCTURAL ORDER

Visualization is perhaps the most intuitive approach for gaining insights into the degree of structure exhibited by a collection of objects, and many experimental devices like microscopes and computational protocols like algorithms for rendering three-dimensional space have been developed to facilitate this approach. Although one can obtain valuable information about a system through simple visual inspection, the conclusions drawn from such an approach will inevitably be subjective because of the qualitative nature of the visualization process. A more objective approach is to develop structural order metrics that use an image (or other microscopic information) to determine quantitatively both the type and the amount of structural order present in a system. Once properly formulated, these metrics can then be used to gain a mechanistic understanding of the system by relating its observable macroscopic properties to the manner in which its individual components organize.9 We describe in this section how to compute order metrics designed for quantifying the packing arrangements typically found in liquids and glasses and provide some examples illustrating how these metrics have produced new insights into the connections among structure, thermodynamics, and dynamics in condensed states of matter from the computer simulations of model materials.29–33

If one has a priori knowledge of the types of structural order relevant to a system of interest, one can then generally construct metrics that are capable of detecting and quantifying that order. Such metrics are often designed to report the deviation of a structure from a reference arrangement of particles. This information can be especially useful for studying the behavior of supercooled liquids and related systems that exhibit transient structural precursors of the stable crystalline phase.34 The structural order metric t,30,31

$$ t = \frac{\sum_{i=1}^{N_c} \left( n_i - n_i^{\mathrm{ideal}} \right)}{\sum_{i=1}^{N_c} \left( n_i^{\mathrm{crystal}} - n_i^{\mathrm{ideal}} \right)} \qquad [1] $$

measures the comparative degree of translational order of a system using a scale that spans the range between two idealized structures with the same number density $\rho$: an uncorrelated system ($t = 0$) and a reference (perhaps ground-state) crystalline lattice ($t = 1$). The quantity $n_i^{\mathrm{crystal}}$ indicates the number of neighbors that are located in a particle's $i$th neighbor shell (centered a distance $r_i$ from the molecule) in the perfect reference lattice. Similarly, $n_i^{\mathrm{ideal}}$ is the average number of neighbors located in a spherical shell centered a distance $r$ from a particle center (where $r_i - a\delta/2 < r < r_i + a\delta/2$) in a structurally uncorrelated system. Finally, $n_i$ measures the average number of neighbors within that same spherical shell surrounding a particle center in
the actual system of interest. In the above discussion, $a = r_1$ (the first nearest-neighbor distance for the reference crystal), $\delta$ is a shell width parameter, and $N_c$ is the total number of neighbor shells considered for comparison in the reference crystal.30,31

Unfortunately, this type of structural order parameter also has limitations, the most serious of which is the need to know the structure of the relevant crystalline phase for the system. This limitation does not pose a particular problem for simple, well-studied systems such as the monatomic hard-sphere and Lennard–Jones models discussed below, which exhibit only one type of crystalline solid phase over a broad range of conditions,35 but it is a problem for more complicated condensed matter systems for which the relevant crystalline structure may not be known in advance. Furthermore, many materials, like water, can form multiple crystalline phases; i.e., they are polymorphic.36 In these latter systems, the relevant reference structure changes as a function of thermodynamic conditions, requiring multiple translational order metrics for a complete description. Finally, a stable solid phase usually will not exist at low particle concentrations, so it is not clear whether there is any value in comparing the structure of a fluid under these conditions with that of a crystalline arrangement. For all of the above reasons, one might prefer to use so-called crystal-independent metrics30 that probe the magnitude of interparticle correlations in a material but do not rely on a comparison with a particular crystalline reference structure.
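As a concrete illustration of Eq. [1], the following minimal Python sketch evaluates the crystal-referenced metric from per-shell neighbor counts. It assumes that the ideal (uncorrelated) count in a shell is simply the number density times the shell volume; the function and argument names are hypothetical and are not taken from Refs. 30 and 31.

```python
import math

def ideal_shell_count(rho, r_i, a, delta):
    """Average neighbor count in the shell r_i - a*delta/2 < r < r_i + a*delta/2
    for an uncorrelated system (assumption: density times shell volume)."""
    r_in = max(r_i - 0.5 * a * delta, 0.0)
    r_out = r_i + 0.5 * a * delta
    return rho * (4.0 / 3.0) * math.pi * (r_out ** 3 - r_in ** 3)

def translational_order_eq1(n_actual, n_crystal, shell_radii, rho, delta):
    """Evaluate Eq. [1] from per-shell counts.

    n_actual    : average neighbor counts per shell in the system of interest
    n_crystal   : neighbor counts per shell in the reference crystal
    shell_radii : shell radii r_i of the reference crystal (r_1 first)
    """
    a = shell_radii[0]  # first nearest-neighbor distance of the reference crystal
    n_ideal = [ideal_shell_count(rho, r, a, delta) for r in shell_radii]
    numerator = sum(n - ni for n, ni in zip(n_actual, n_ideal))
    denominator = sum(nc - ni for nc, ni in zip(n_crystal, n_ideal))
    return numerator / denominator
```

By construction, feeding the crystal counts in as `n_actual` returns 1, and feeding in the ideal counts returns 0, which matches the two limits described in the text.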
CRYSTAL-INDEPENDENT STRUCTURAL ORDER METRICS

The packing arrangements of the particles in liquids and glasses can be characterized generally in terms of both their translational and so-called bond-orientational intermolecular correlations (see Figure 1). As discussed, the former correlation indicates the degree to which pairs of molecules adopt preferential separations. Bond-orientational order, on the other hand, describes the extent to which regular angles are formed by fictitious “bonds” that can be drawn from the center of a given molecule to those of its nearest neighbors. It is important to note that bond-orientational order is qualitatively different from molecular orientational order, which describes the extent to which the orientations of asymmetric molecules are spatially correlated (e.g., as in the nematic phase of a liquid crystal12). Below, we first examine two crystal-independent metrics for quantifying translational order in liquids and then shift focus to discuss measures for bond-orientational correlations. For all of the systems touched on here, the position of a molecule is mapped onto a single site (e.g., its center of mass or the location of a central atom), and the intramolecular distribution of atoms within a molecule is not considered further.
[Figure 1: panel (a) bond-orientational order, showing a 60° bond angle; panel (b) translational order.]

Figure 1 Two basic types of ordering typically found in liquids. (a) Bond-orientational order describes the tendency of molecules to form well-defined angles between the fictitious “bonds” that can be drawn between the molecule of interest and two of its nearest neighbors. (b) Translational order describes the tendency of molecules to adopt preferential interparticle separations. (Adapted from Ref. 30.)
The first crystal-independent structural order metric that we will explore is the translational parameter t,30 an integral measure of the amplitude of the material's total correlation function $h(r)$,

$$ t = \frac{1}{r_c} \int_0^{r_c} \left| h(r) \right| \, \mathrm{d}r \qquad [2] $$

Here, $h(r) = g(r) - 1$, where $g(r)$ is the radial distribution function of the material;7 the value $r_c = \rho^{-1/3} \xi_c$ is an integration cut-off30 that scales with the average intermolecular separation $\rho^{-1/3}$ ($\rho$ is the number density), and $\xi_c$ is a parameter that determines the number of coordination shells surrounding the particle to be included in the integration. This translational order metric provides an average measure of the density modulations surrounding a particle in the material. For a completely uncorrelated system, $g(r) \equiv 1$, and thus $t$ vanishes. Conversely, the value of $t$ is larger for systems with long-range translational order. One can determine the value of $t$ analytically for a perfect crystalline lattice. For example, when using $\xi_c = 3.5$, $t$ evaluates to approximately 1.7893 for a face-centered cubic (fcc) lattice at any density.30
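The quoted fcc value can be checked with a short calculation. The sketch below is an illustration under stated assumptions rather than the procedure of Ref. 30: it takes the integrand of Eq. [2] to be the magnitude of h(r), works in the reduced distance xi = r * rho^(1/3), and treats g for a perfect lattice as a sum of delta peaks of weight 1/(4 pi xi_s^2) at each lattice site. Between peaks g = 0, so |h| = 1, and the integral reduces to xi_c plus the summed peak weights. The function name is hypothetical.

```python
import math

def fcc_translational_order(xi_c=3.5):
    """t of Eq. [2] for a perfect fcc lattice, in reduced units xi = r * rho**(1/3)."""
    # The conventional cubic cell holds 4 particles, so its reduced edge is 4**(1/3).
    half_edge = 0.5 * 4.0 ** (1.0 / 3.0)
    n_max = int(xi_c / half_edge) + 1
    peak_integral = 0.0
    for i in range(-n_max, n_max + 1):
        for j in range(-n_max, n_max + 1):
            for k in range(-n_max, n_max + 1):
                # fcc sites are the integer triples with an even coordinate sum
                if (i + j + k) % 2 != 0 or (i, j, k) == (0, 0, 0):
                    continue
                xi = half_edge * math.sqrt(i * i + j * j + k * k)
                if xi < xi_c:
                    # weight of one delta peak in the integral of g over xi
                    peak_integral += 1.0 / (4.0 * math.pi * xi * xi)
    # |g - 1| integrates to (peak weights) + (background of 1 over 0 < xi < xi_c)
    return (peak_integral + xi_c) / xi_c
```

Because the calculation is carried out entirely in reduced units, the result does not depend on density, consistent with the statement in the text; with `xi_c = 3.5` nine neighbor shells fall inside the cut-off.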
A second quantity used to describe the translational order is $s^{(2)}/k_B$, which is defined as

$$ \frac{s^{(2)}}{k_B} = -2\pi\rho \int_0^{\infty} \left\{ g(r) \ln g(r) - \left[ g(r) - 1 \right] \right\} r^2 \, \mathrm{d}r \qquad [3] $$

Here, $s^{(2)}$ is the “two-body” contribution to the excess entropy of a material relative to an ideal gas with the same number density and $k_B$ is the Boltzmann constant. The quantity $s^{(2)}$ formally emerges as the leading order term in the N-body distribution function expansion of the excess entropy of an isotropic liquid.37 The definition in Eq. [3] was originally introduced by Nettleton and Green38 (and later developed by Raveché39 using an alternative approach) as an expression for the two-body contribution to the excess entropy appropriate for use only in the grand canonical ensemble. This restriction was later lifted by Baranyai and Evans, who showed that the expression is actually ensemble invariant.40 The quantity is defined such that $s^{(2)}/k_B = 0$ for an ideal gas (which lacks any interparticle correlations) and $s^{(2)}/k_B \to -\infty$ for a perfect crystalline lattice. Several studies involving model isotropic liquids40,41 have shown that $s^{(2)}$ can account for a large fraction of the total excess entropy $s^{\mathrm{ex}}$ over a broad range of conditions. It is worth noting, however, that Eq. [3] above only accounts for the translational component of the two-body excess entropy. Lazaridis and Karplus have shown how to calculate approximately the two-body contribution of the excess entropy arising from correlations between molecular orientations of neighbors.42

Finally, the following caveat exists for the translational order metrics $t$ and $s^{(2)}/k_B$: Although both can effectively distinguish between structures with different morphologies at the same particle packing fraction (e.g., equilibrium crystals versus liquid, polycrystalline, or glassy samples), these metrics are not always robust enough to distinguish morphological differences between two samples at significantly different densities.30 Under these latter conditions, the translational order metrics should be used in combination with a bond-orientational order parameter.
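In practice, Eq. [3] is evaluated on a radial distribution function tabulated from simulation. A minimal sketch, assuming a simple trapezoidal rule on a user-supplied grid and a finite upper limit where g(r) has decayed to 1 (the function name and the quadrature choice are illustrative assumptions):

```python
import math

def two_body_excess_entropy(r, g, rho):
    """Trapezoidal estimate of s2/kB from Eq. [3] for a tabulated g(r).

    r, g : equal-length sequences with g[k] = g(r[k]); r must be increasing
    rho  : number density in units consistent with r
    """
    def integrand(ri, gi):
        # g ln g tends to 0 as g tends to 0, so empty bins contribute only via -(g - 1)
        g_ln_g = gi * math.log(gi) if gi > 0.0 else 0.0
        return (g_ln_g - (gi - 1.0)) * ri * ri

    total = 0.0
    for k in range(len(r) - 1):
        f0 = integrand(r[k], g[k])
        f1 = integrand(r[k + 1], g[k + 1])
        total += 0.5 * (f0 + f1) * (r[k + 1] - r[k])
    return -2.0 * math.pi * rho * total
```

Two simple limits are useful checks: g(r) = 1 everywhere returns zero, and a step function that vanishes inside a core of diameter sigma and equals 1 outside returns -2 pi rho sigma^3 / 3.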
We now turn our attention to describing structural metrics, first introduced by Steinhardt, Nelson, and Ronchetti,43 that are commonly used to quantify the bond-orientational order of a system. The initial step in calculating these bond-orientational order parameters is to determine the nearest neighbors of each particle. Two particles are considered to be nearest neighbors if their separation is less than the distance of the first minimum in the radial distribution function $g(r)$, although alternative definitions can be used. After identifying the neighbors, “bonds” are defined as being the vectors $\mathbf{r}_{ij}$ pointing from a central particle to each of its nearest neighbors. For each bond, one first determines the quantity

$$ Q_{lm}(\hat{\mathbf{r}}_{ij}) = Y_{lm}(\theta_{ij}, \phi_{ij}) \qquad [4] $$

where $\hat{\mathbf{r}}_{ij}$ is the unit vector of $\mathbf{r}_{ij}$; $\theta_{ij}$ and $\phi_{ij}$ are the associated polar and azimuthal angles, respectively, and $Y_{lm}$ is the corresponding spherical harmonic.44 Subsequently, an average over all bonds in the system is performed to obtain $\bar{Q}_{lm} = \langle Q_{lm}(\hat{\mathbf{r}}_{ij}) \rangle$. Finally, $\bar{Q}_{lm}$, which depends on the choice of reference frame, is used to calculate the rotationally invariant order metric $Q_l$,

$$ Q_l = \left[ \frac{4\pi}{2l+1} \sum_{m=-l}^{l} \left| \bar{Q}_{lm} \right|^2 \right]^{1/2} \qquad [5] $$

In general, the even-$l$ structural order metrics grow in value as the crystalline order of a system increases. The upper limiting value of a given order metric for a perfect crystalline structure depends on both the value of $l$ and the type of crystalline lattice (see, e.g., Figure 2 of Ref. 43). For instance, the $l = 6$ value for a perfect (defect-free) fcc crystal43 is $Q_6^{\mathrm{fcc}} \approx 0.57452$. For a completely uncorrelated system, $Q_l$ is of order $1/\sqrt{N_{\mathrm{bond}}}$, where $N_{\mathrm{bond}}$ is the total number of nearest-neighbor bonds in the system [$N_{\mathrm{bond}}$ is typically an $O(N)$ quantity in liquids, where $N$ is the number of particles]. Therefore, in the thermodynamic limit, $Q_l$ of a material will generally take values between zero (indicating ideal randomness) and the finite value that reflects the perfect crystalline structure of the material.
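The rotational invariance of Eq. [5] can be made explicit, and the spherical harmonics avoided altogether, by using the addition theorem for spherical harmonics: the sum over m of products of Y_lm for two directions equals (2l+1)/(4 pi) times the Legendre polynomial P_l of the cosine of the angle between them. Eq. [5] then becomes Q_l^2 = (1/N_bond^2) times the double sum of P_l(cos gamma) over all pairs of bonds. The sketch below uses this equivalent form; it is an illustrative rewrite rather than the procedure of Ref. 43, its cost grows as the square of the number of bonds, and the function names are hypothetical.

```python
import math

def legendre(l, x):
    """Legendre polynomial P_l(x) from the standard three-term recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def bond_orientational_order(bonds, l=6):
    """Q_l of Eq. [5] for a list of bond vectors (x, y, z), via the addition theorem."""
    units = []
    for x, y, z in bonds:
        norm = math.sqrt(x * x + y * y + z * z)
        units.append((x / norm, y / norm, z / norm))
    total = 0.0
    for u in units:
        for v in units:
            cos_gamma = u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
            cos_gamma = max(-1.0, min(1.0, cos_gamma))  # guard against round-off
            total += legendre(l, cos_gamma)
    return math.sqrt(max(total, 0.0)) / len(units)
```

Applied to the twelve nearest-neighbor bonds of a perfect fcc site, this form reproduces the value of Q6 quoted above.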
[Figure 2: translational order, t (vertical axis, 0.0 to 1.0), versus bond-orientational order, Q (horizontal axis, 0.0 to 1.0), with the fluid, glass, and crystal branches labeled.]

Figure 2 Two-parameter ordering map for a simulated monatomic hard-sphere system.30,31 Shown are realizable states of the system plotted in structural order parameter space: the translational order metric t versus the bond-orientational order metric Q. Data are presented for the equilibrium fluid (dashed), the equilibrium fcc crystal (dashed), and a set of nonequilibrium glasses (filled circles) produced by applying different compression rates to a fluid. In contrast to the equilibrium state points, the degree of ordering and the packing fraction $\phi$ for the glasses are generally determined by the processing history (in this case, the compression rate $\Gamma$). The locations of the freezing and melting transitions are indicated by the diamond and the square, respectively. (Adapted from Refs. 30 and 31.)
STRUCTURAL ORDERING MAPS

A useful concept for relating structural order to material properties is the ordering map,30,31 which projects the possible states of a system onto the plane of its relevant structural order metrics. Such maps provide a broad representation of the evolution of structural order along equilibrium as well as nonequilibrium paths. An ordering map for a monatomic hard-sphere system derived from molecular dynamics simulations30,31 is shown in Figure 2. The two relevant forms of order for this system, bond-orientational and translational order, serve as the two axes of the plot. The bond-orientational order is presented as $Q = Q_6/Q_6^{\mathrm{fcc}}$ (see Eq. [5]); i.e., this is the raw quantity $Q_6$ normalized by its value for a perfect fcc crystal. The translational order parameter t is given by Eq. [1], and it measures the similarity of the average interparticle separations in the fluid with that of a perfect fcc crystal. Defined in this manner, both structural order metrics take on values between zero (ideal randomness) and unity (perfect crystalline order) in the hard-sphere system. Inspection of the ordering map enables one to readily quantify the type and degree of order displayed by the equilibrium and glassy states of a material. One important point is that the glasses of the hard-sphere system occupy a different region of order parameter space than does the equilibrium fluid, even though both are nominally “disordered.”30 Because the hard-sphere system captures the basic ordering patterns of many atomic and molecular fluids, this finding contradicts the traditional notion that glassy structure should be considered simply “liquid-like.” Also note that the compression rate strongly affects the structural order of the nonequilibrium hard-sphere glasses (i.e., structural order in glasses is highly history-dependent). Finally, the “randomness” (or lack of structural order) is a matter of degree for the hard-sphere glasses.
That is to say, glasses with slightly higher packing fractions $\phi$ can be created at the expense of small increases in order. The latter observation has recently led to a reassessment of the conceptual value of the so-called random close-packed state.31

Given the preceding results, a natural question to ask is as follows: Do simple interparticle attractions change the ordering map? As a first step toward answering this question, Errington, Debenedetti, and Torquato examined the temperature and density dependencies of various order metrics for a collection of identical particles interacting through the shifted-force Lennard–Jones potential.32 Their examination of bond-orientational and translational order along paths involving the equilibrium vapor, liquid, and crystalline phases of this system led to several interesting findings. Figure 3 shows an ordering map for this Lennard–Jones system, with the translational order t (of Eq. [2]) plotted against the bond-orientational order $Q_6$ (of Eq. [5]). It can be observed that the data, collected over a wide range of temperatures and densities, collapse onto two distinct equilibrium branches
[Figure 3: translational order, t, versus bond-orientational order, Q6; legend: T = 0.75, T = 0.935, T = 1.5, V-L Equil., L-S Equil.; fluid and crystal branches labeled.]

Figure 3 Comparison of Lennard–Jones (solid line) and hard-sphere (circles) model ordering maps for the equilibrium fluid and the fcc crystal. (Adapted from Ref. 32.)
for the fluid and the crystal. Notice also that the ordering map of the equilibrium hard-sphere system falls onto the same fluid and crystalline branches. Stated more generally, fluids with simple, spherically symmetric pair potentials, with or without slowly varying attractions, seem to sample the same configurations in their equilibrium states.45 This numerical result provides further justification for thermodynamic perturbation theory,46 which begins from the assumption that all simple fluids have the same underlying structure.

The metrics described above report only the global order of a sample, not the local order (e.g., the order associated with the structure surrounding a particular particle or spatial region of a material). An instructive example of the use of local order metrics47 was given by ten Wolde, Ruiz-Montero, and Frenkel. These authors first developed local bond-orientational metrics through modification of the original global versions and subsequently used these metrics to identify “crystalline atoms” to study the mechanism by which homogeneous crystal nucleation proceeds in a supercooled Lennard–Jones fluid. Local order parameters have also been used to explore the inhomogeneous structural properties of condensed phases under conditions of confinement48 and to quantify the order exhibited by nonequilibrium two-dimensional materials.49

Although the bond-orientational metrics defined above have proven useful for identifying numerous space-filling crystalline morphologies43 like face-centered cubic, body-centered cubic, simple cubic, and hexagonally close-packed lattices, they are inadequate for detecting order in systems that organize
into tetrahedral particle arrangements, the most prominent of which is water. However, the order metric q, defined by Errington and Debenedetti as

$$ q = 1 - \frac{3}{8} \sum_{j=1}^{3} \sum_{k=j+1}^{4} \left( \cos\psi_{jk} + \frac{1}{3} \right)^2 \qquad [6] $$

provides an effective means for quantifying tetrahedral order. Here, $\psi_{jk}$ is the angle formed by the fictitious bonds joining a specified molecule of interest and two of its four nearest neighbors j and k. This definition is similar to the parameter introduced by Chau and Hardwick,50 only rescaled in such a way that the average value of q varies between 0 (for completely uncorrelated systems) and 1 (for a perfect tetrahedral network). Note that this metric provides local and, when averaged over all particles, global measures of the tetrahedral order. The so-called tetrahedrality parameter51 of Naberukhin, Voloshin, and Medvedev is another metric that can be used to detect and quantify tetrahedral order.

The competition between directional attractions and short-range repulsions in tetrahedral liquids, such as water or silica, gives rise to some highly nontrivial physical and structural properties. Examples include expansion upon cooling at constant pressure13,52 and increased molecular mobility upon isothermal compression at sufficiently low temperatures.53–56 Given the prominence of tetrahedral fluids in both scientific phenomena and industrial processes, a deeper understanding of the relationship between macroscopic thermodynamic and kinetic properties and the molecular-level organization of these fluids has long been desired. In a recent study, Errington and Debenedetti29 investigated the connection between measures of structural order and macroscopic properties of stable, supercooled, and stretched liquid water using the extended simple point charge (SPC/E) model.57 They could relate well-known macroscopically observed thermodynamic and kinetic anomalies to microscopic structural anomalies.29 The use of an ordering map once again proved fruitful. Figure 4 depicts the evolution of structural order upon isothermal compression of liquid water at two temperatures. The coordinates of the order map are the translational order metric (defined as t in Eq.
[2]) and the tetrahedral order metric (defined as q of Eq. [6]). The data indicate that at sufficiently low temperatures a range of densities exists over which both forms of order decrease upon isothermal compression. This behavior is in clear contrast to what is observed in the simple hard-sphere and Lennard–Jones fluids discussed above. The conditions on the phase diagram for which this anomalous behavior occurs have been termed water's “structurally anomalous” region. Inspection of the order map (Figure 4) reveals a dome of structural anomalies within the temperature-density plane, bounded by loci of maximum tetrahedral order (at low densities) and minimum translational order (at high densities) as shown in Figure 5. Also marked on Figure 5 are regions of diffusive anomalies,
[Figure 4: translational order, t, versus tetrahedral order, q; points A through E marked along the two isotherms.]

Figure 4 The path traversed in structural order-metric space as liquid water (SPC/E) is compressed isothermally at two different temperatures. Filled diamonds represent T = 260 K, and open triangles represent T = 400 K. The arrows indicate the direction of increasing density. A and C are states of maximum tetrahedral order at the respective temperatures, whereas B is a state of minimum translational order. Reprinted with permission from Ref. 29.
[Figure 5: temperature, T (K), versus density, ρ (g cm⁻³); points A and B marked.]

Figure 5 Relationship among loci of structural, dynamic, and thermodynamic anomalies in SPC/E water. The “structurally anomalous” region is bounded by the loci of q maxima (upward-pointing triangles) and t minima (downward-pointing triangles). Inside of this region, water becomes more disordered when compressed. The loci of diffusivity minima (circles) and maxima (diamonds) define the region of dynamic anomalies, where self-diffusivity increases with density. Inside of the thermodynamically anomalous region (squares), the density increases when water is heated at constant pressure. Reprinted with permission from Ref. 29.
identified by state conditions for which the diffusivity increases upon isothermal compression, and a region of density anomalies, defined by state conditions for which the liquid expands upon isobaric cooling. The physical picture29,58 that emerged from the calculations, and one that has since been verified by numerous models of liquids that exhibit water-like properties,33,59–61 is that a cascade of anomalies occurs within liquid water, whereby structural, diffusive, and thermodynamic anomalies occur successively, as water becomes progressively ordered. Although thus far our focus has been on the means for detecting and quantifying the structural order reflected by the packing arrangements of particles in condensed phases, a complementary aspect of structure exists that is also thought to be important for determining material properties: free volume. We now discuss how to compute free volume and then provide examples of its relationship to the thermodynamic and dynamic properties of glass-forming materials.
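Before moving on to free volume, it may help to see how the tetrahedral order metric q of Eq. [6], which underlies the water ordering maps above, is evaluated for a single molecule. The following minimal sketch assumes that the four nearest neighbors have already been identified and that all positions are given as unwrapped Cartesian coordinates (no periodic-image handling); the function name is hypothetical.

```python
import math

def tetrahedral_order(center, neighbors):
    """q of Eq. [6] for one molecule and its four nearest neighbors.

    center    : (x, y, z) of the molecule of interest
    neighbors : four (x, y, z) positions of its nearest neighbors
    """
    if len(neighbors) != 4:
        raise ValueError("Eq. [6] requires exactly four nearest neighbors")
    units = []
    for p in neighbors:
        d = [p[i] - center[i] for i in range(3)]
        norm = math.sqrt(sum(c * c for c in d))
        units.append([c / norm for c in d])
    total = 0.0
    for j in range(3):
        for k in range(j + 1, 4):
            # cosine of the angle psi_jk between bonds j and k
            cos_psi = sum(units[j][i] * units[k][i] for i in range(3))
            total += (cos_psi + 1.0 / 3.0) ** 2
    return 1.0 - 0.375 * total
```

For a perfect tetrahedron every pair of bonds has cos psi = -1/3, so each term vanishes and q = 1; averaging the per-molecule values over a configuration gives the global measure discussed in the text.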
FREE VOLUME

Several different methods have been introduced over the years for describing free volume. However, in a qualitative sense, all of them essentially characterize the local space available for motion of the constituent particles of a material (i.e., they all characterize the “breathing room” of the particles in their packing structures). As one might expect, isothermal compression of a fluid generally brings the constituent particles closer together, which reduces the free volume and, for most systems, also reduces both the entropy and the particle mobility (liquid water under some conditions being a notable exception62). Consequently, it is natural to wonder whether a detailed understanding of free volume might reveal new microscopic insights into the thermodynamic and dynamic properties of condensed-phase materials.

The simplest conceptual picture for relating single-particle dynamics of liquids to free volume is to assume that a critical amount of local free volume $v_f$ is necessary for a particle to “escape” from the transient cage formed by its neighboring particles and thus contribute to self-diffusion. In this view, microscopic density fluctuations play a central role in facilitating particle mobility. Cohen and Turnbull63 introduced a qualitative model based on this idea. It begins from the following relationship between the translational self-diffusion coefficient $D$ and the probability density $p(v_f)$ associated with finding a particle with free volume between $v_f$ and $v_f + \mathrm{d}v_f$:

$$ D = \int_{v_f}^{\infty} \mathrm{d}v_f \, p(v_f) \, D(v_f) \qquad [7] $$

Here, $D(v_f)$ represents the contribution to the self-diffusivity from particles having free volume $v_f$. Various approximations for $p(v_f)$ and $D(v_f)$, as well as other alternative theoretical approaches in the same spirit as this model, have been explored over the past five decades (see, e.g., Refs. 64–68), but the basic conceptual point of view has not changed significantly (Ref. 69 provides a recent review). Despite these efforts, a comprehensive and microscopically testable free-volume-based theory for dynamics has yet to emerge, owing in large part to the lack of experimental and computational tools available for measuring and characterizing free volumes. However, recent advances in computational statistical geometry70,71 are now making it feasible to calculate free volumes efficiently from particle configurations obtained from either experiments (e.g., confocal microscopy of particle suspensions72,73) or computer simulations,71,74–76 which opens the door for a careful examination of the basic ideas underlying the free-volume-based perspective for dynamics.

We focus here exclusively on one commonly invoked geometric definition of a particle's free volume $v_f$. That definition is simply the cage of accessible space (see Figure 6) that a particle center could access from its current state if all neighboring particles were fixed in their current configuration.77 This type of geometric definition assumes that the short-range repulsive interactions between particles are steep enough to warrant the assignment of effective (perhaps temperature-dependent78) “hard-core” radii to the particles, which is an assumption that also underlies thermodynamic perturbation theories of the liquid state.46 The surface area $s_f$, which bounds the free volume of a particle, is termed its free surface area. Notice that, in using the local geometric definition discussed above, each particle can be unambiguously associated with a free volume and a free surface area.
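To make the structure of Eq. [7] concrete, the sketch below evaluates the integral numerically for one illustrative pair of inputs. The specific forms are assumptions chosen for the example and are not taken from this chapter: an exponential free-volume distribution with mean v_mean and overlap factor gamma, and a step-function D(v_f) that is zero below a critical free volume v_star and equal to a constant d0 above it. With these choices the integral has the closed form d0 * exp(-gamma * v_star / v_mean), which provides a check on the quadrature. All names are hypothetical.

```python
import math

def diffusivity_from_free_volume(p, d_of_v, v_max, n=20000):
    """Midpoint-rule estimate of the integral of p(v) * D(v) over 0 < v < v_max."""
    dv = v_max / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * dv
        total += p(v) * d_of_v(v)
    return total * dv

# Illustrative (assumed) inputs: exponential p(v) and a step-function D(v).
v_mean, gamma, v_star, d0 = 1.0, 1.0, 2.0, 1.0

def p_exponential(v):
    return (gamma / v_mean) * math.exp(-gamma * v / v_mean)

def d_step(v):
    # particles contribute only once their free volume exceeds v_star
    return d0 if v >= v_star else 0.0

d_numeric = diffusivity_from_free_volume(p_exponential, d_step, v_max=50.0)
d_closed = d0 * math.exp(-gamma * v_star / v_mean)
```

Writing the integrand as a product of two user-supplied functions keeps the sketch usable with a free-volume distribution measured from simulation, for example a normalized histogram of the geometric free volumes described above.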
Figure 6 (Left) Schematic of a configuration of particles (black) with exclusion spheres (gray). (Right) The cross-hatched volume of available space that is formed upon “removal” of the particle is equal to that particle’s free volume $v_f$. The surface area of the cavity is that particle’s free surface area $s_f$. (Adapted from Ref. 71).
Figure 6 shows a schematic collection of particles surrounded by their effective exclusion spheres. The exclusion sphere of each particle denotes the volume of space from which the other particle centers are excluded because of their hard-core interactions with that particle. As can be seen, exclusion spheres will generally overlap partially in condensed phases because of the relatively small average interparticle separation (on the order of a particle diameter). A cavity in such a configuration is defined to be a volume of connected space lying outside of the exclusion spheres. The union of the cavities is often referred to as the available volume. This volume is available in the sense that it could accommodate an additional sphere center without producing an overlap with an existing particle. Cavity volumes are intimately related to the free volumes discussed above.79 In particular, to compute the free volume of a particle, one virtually “removes” the particle of interest from a snapshot of the configuration. The volume and surface area of the newly formed cavity at the location of the removed sphere represent that particle’s free volume and free surface area, respectively. We outline the basic steps for calculating cavity volumes and free volumes below. More details regarding efficient algorithms for carrying out these calculations can be found in the work of Sastry and co-workers.70,71
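The “remove the particle and measure the resulting cavity” definition can be illustrated with a brute-force grid estimate. The sketch below is not the exact tessellation-based algorithm of Refs. 70 and 71; it is a simple approximation that assumes a cubic periodic box, a single exclusion radius `r_excl` (equal to the hard-core diameter for monodisperse spheres), and a hypothetical function name and grid resolution chosen for illustration.

```python
from collections import deque

import numpy as np


def free_volume_grid(positions, i, box, r_excl, n_grid=40):
    """Grid estimate of the free volume of particle i in a cubic periodic box."""
    pos = np.asarray(positions, dtype=float)
    others = np.delete(pos, i, axis=0)            # virtually "remove" particle i
    h = box / n_grid                              # grid spacing
    axis = (np.arange(n_grid) + 0.5) * h          # cell-center coordinates
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.stack([gx, gy, gz], axis=-1)        # shape (n, n, n, 3)

    # A cell is "available" if its center lies outside every exclusion sphere.
    available = np.ones((n_grid,) * 3, dtype=bool)
    for p in others:
        d = grid - p
        d -= box * np.round(d / box)              # minimum-image convention
        available &= np.sum(d * d, axis=-1) >= r_excl ** 2

    # Flood fill from the cell containing the removed particle's center, so
    # that only the connected cavity (the particle's cage) is counted.
    start = tuple(np.floor(pos[i] / h).astype(int) % n_grid)
    if not available[start]:
        return 0.0
    seen = np.zeros_like(available)
    seen[start] = True
    queue = deque([start])
    count = 0
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    while queue:
        x, y, z = queue.popleft()
        count += 1
        for dx, dy, dz in steps:
            nb = ((x + dx) % n_grid, (y + dy) % n_grid, (z + dz) % n_grid)
            if available[nb] and not seen[nb]:
                seen[nb] = True
                queue.append(nb)
    return count * h ** 3
```

The accuracy of such an estimate is limited by the grid spacing, and the cost grows with the cube of `n_grid`, which is one reason the analytical approach outlined in the text is preferred for production calculations.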
IDENTIFYING CAVITIES AND COMPUTING THEIR VOLUMES
Consider a collection of spherical particles with associated exclusion spheres in a large, periodically replicated simulation cell. The first step in the calculation of the cavity volumes70,71 of the system is to generate the so-called Voronoi and Delaunay tessellations80 (see Figure 7). The Voronoi tessellation tiles space into polyhedra $\mathrm{VP}_i$, each comprising the points in space closer to a given particle center i than to any other particle center. Consequently, every point on a face of a Voronoi polyhedron is equidistant from two particle centers. Several $\mathrm{VP}_k$ share a face with each $\mathrm{VP}_i$, and the particles associated with these $\mathrm{VP}_k$ are designated as the “nearest neighbors” of particle i. The Delaunay tessellation, which tiles space into nonoverlapping tetrahedra, is obtained by connecting all nearest neighbors by fictitious bonds. Algorithms for constructing these tessellations efficiently are presented elsewhere.80,81 To compute cavity volumes, one then proceeds as follows.70,71
1. Identify the cavities: Each connected cluster of Voronoi polyhedra vertices and edges that avoid overlap with the exclusion spheres in the system identifies a distinct cavity of available space.
2. Identify the Delaunay tetrahedra enclosing the cavities: The union of the Delaunay tetrahedra corresponding to the aforementioned cluster of Voronoi vertices completely encloses the associated cavity. The volume of this union provides an upper bound on the cavity’s volume.
3. Determine the actual cavity volume and surface area within the Delaunay tetrahedra: The overlap of the exclusion spheres with the relevant Delaunay tetrahedra is subtracted analytically, leaving only the actual cavity volume and surface area. This calculation is nontrivial because of the multiple overlaps of exclusion spheres, but a systematic method for carrying it out is available.70,71
Figure 7 (Left) Typical configuration of particles before particle i is removed. (Right) After particle i is removed, tessellations must be reconstructed within the “superpolyhedron” (bold, dashed line). The volume and surface area of the cavity that once held the removed particle can then be determined. (Adapted from Ref. 71).
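Part of step 1 of the cavity procedure can be sketched with a standard computational-geometry library. Each Voronoi vertex is the circumcenter of a Delaunay tetrahedron, and its nearest particle centers are the four vertices of that tetrahedron, so a Voronoi vertex lies outside all exclusion spheres exactly when the circumradius exceeds the exclusion radius. The sketch below uses `scipy.spatial.Delaunay`; the function name and return values are hypothetical, periodic boundaries are ignored (so tetrahedra near the hull are artifacts), and the clustering of vertices into cavities and the analytical volume subtraction of steps 2 and 3 are not attempted.

```python
import numpy as np
from scipy.spatial import Delaunay


def cavity_vertices(points, r_excl):
    """Flag Voronoi vertices (Delaunay circumcenters) lying in available space."""
    pts = np.asarray(points, dtype=float)
    tri = Delaunay(pts)
    tets = pts[tri.simplices]                     # shape (n_tet, 4, 3)
    a = tets[:, 0]

    # The circumcenter x of each tetrahedron satisfies
    #   2 (p_k - a) . x = |p_k|^2 - |a|^2   for k = 1, 2, 3.
    lhs = 2.0 * (tets[:, 1:] - a[:, None, :])     # shape (n_tet, 3, 3)
    rhs = np.sum(tets[:, 1:] ** 2, axis=-1) - np.sum(a ** 2, axis=-1)[:, None]
    centers = np.linalg.solve(lhs, rhs[..., None])[..., 0]
    radii = np.linalg.norm(centers - a, axis=1)

    # Empty-circumsphere property: the closest particle centers to a Voronoi
    # vertex are at distance `radii`, so the vertex avoids every exclusion
    # sphere when radii > r_excl.
    in_cavity = radii > r_excl
    return tri, centers, radii, in_cavity
```

The flagged tetrahedra are the candidates whose union encloses the cavities; grouping them into distinct cavities would additionally require checking which shared faces (Voronoi edges) also avoid the exclusion spheres.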
COMPUTING FREE VOLUMES
The calculation of free volumes and free surface areas requires an efficient way of extending the cavity algorithm. To calculate the free volume of a particle i, the particle and its associated exclusion sphere are effectively removed from a snapshot of the particle configuration. The volume and surface area of the cavity that is produced in the absence of particle i are equal to the free volume and free surface area of particle i, respectively. After particle i has been removed from the configuration, the Voronoi and Delaunay tessellations must be reconstructed before the resulting cavity volume can be calculated. It can be shown rigorously that, at least in the case of monodisperse sphere packings, the tessellations must be redone only for the neighboring particles of the removed particle (i.e., within the local “superpolyhedron” shown in Figure 7).71 Once the local re-tessellations are completed, the cavity volume algorithm described above is used to determine the free volume and free surface area of particle i. Particle i is then returned to its previous position in the snapshot of the configuration, and the original tessellation of the entire configuration is restored. This sequence of steps (removal of a particle, local re-tessellation, local use of the cavity volume algorithm to determine that particle’s free volume and free surface area, replacement of the particle, and restoration of the original tessellation) is repeated for every particle in the system.
Given this background, we can now provide several examples of how free volumes have been related to thermodynamic and dynamic properties of liquids, and how their measurement has been employed in computer simulations to derive microscopic insights that are otherwise not accessible from experiment. We consider first how to compute thermodynamic values from free volumes and follow that with the relationship of free volumes to dynamics.
COMPUTING THERMODYNAMICS FROM FREE VOLUMES
As discussed, the intuitive notion that there should be a connection between the statistics of the free volumes of a fluid and its measurable macroscopic properties has a long history in studies of the liquid state. In fact, this connection turns out to be exact in the case of the thermodynamics of the single-component hard-sphere fluid. Specifically, Hoover, Ashurst, and Grover77 and Speedy82 have provided independent derivations of the relationship between the hard-sphere compressibility factor $Z = P/\rho k_B T$ and the geometric properties of its free volumes:
$$Z = 1 + \frac{\sigma}{6} \left\langle \frac{s_f}{v_f} \right\rangle \qquad [8]$$
where $\sigma$ represents the hard-sphere diameter and the angular brackets indicate an ensemble average. A more general form of this relationship, which applies to multicomponent hard-sphere systems with either additive or nonadditive diameters, has been derived more recently by Corti and Bowles.83
From a computational perspective, the expression in Eq. [8] provides a straightforward test for the free-volume algorithm discussed above. In particular, the right-hand side of Eq. [8] can be computed from the configurations of hard spheres generated in a molecular dynamics simulation, whereas the left-hand side can be computed directly from measurement of the hard-sphere collision rate in the same simulation. Sastry and coworkers71 carried out that test, the results of which are shown in Figure 8. As can be seen, the agreement is excellent.
Figure 8 Compressibility factor $P/\rho k_B T$ versus reduced density $\rho^* = \rho\sigma^3$ of the hard-sphere system as calculated from both free-volume information (Eq. [8]) and the collision rate measured in molecular dynamics simulations. The empirically successful Carnahan–Starling84 equation of state for the hard-sphere fluid is also shown for comparison. (Adapted from Ref. 71).
Can the geometric properties on the right-hand side of Eq. [8] be predicted theoretically (i.e., rather than just “measured” from simulation)? Krekelberg, Ganesan, and Truskett85 (KGT) have taken a step in this direction by introducing and testing an approximate analytical model for the free volumes of equilibrium fluids. The KGT model successfully reproduces the scaling behavior of the density-dependent free-volume and free-surface distributions of the single-component hard-sphere fluid. It can also predict the manner in which the second virial coefficients of fluids with short-range attractions (e.g., colloids or globular proteins) affect their free-volume distributions. Unfortunately, there is not yet a means to improve systematically on the results of the KGT model in Ref. 85, because that model was not derived within a formally exact statistical mechanical framework. In fact, the development of such a rigorous formalism for predicting free volumes remains one of the unresolved theoretical challenges in liquid-state theory.
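The consistency check based on Eq. [8] is simple to express in code once per-particle free volumes and free surface areas are available. The sketch below is illustrative only: the function names are hypothetical, the inputs `s_f` and `v_f` are assumed to come from a free-volume calculation such as the one outlined earlier, and the Carnahan–Starling expression is included as the familiar reference equation of state for hard spheres.

```python
import math


def z_from_free_volume(sigma, s_f, v_f):
    """Eq. [8]: Z = 1 + (sigma / 6) <s_f / v_f>, averaged over all samples."""
    ratios = [s / v for s, v in zip(s_f, v_f)]
    return 1.0 + sigma / 6.0 * sum(ratios) / len(ratios)


def z_carnahan_starling(rho_star):
    """Carnahan-Starling compressibility factor, with rho_star = rho * sigma**3."""
    eta = math.pi * rho_star / 6.0               # packing fraction
    return (1.0 + eta + eta ** 2 - eta ** 3) / (1.0 - eta) ** 3


if __name__ == "__main__":
    # Reference value at a density inside the range plotted in Figure 8.
    print(f"Carnahan-Starling Z at rho* = 0.90: {z_carnahan_starling(0.90):.3f}")
```

In practice, the average in `z_from_free_volume` would run over all particles in many uncorrelated snapshots, and the result would be compared with the pressure obtained from the collision rate at the same density.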
RELATING DYNAMICS TO FREE VOLUMES
Because efficient methods for computing free volumes from molecular simulations were introduced only recently, their connections to the dynamical properties of liquids have yet to be explored systematically. Nonetheless, initial investigations have already allowed scrutiny of some historical notions about these properties. Here, we briefly discuss two of these initial studies. Their results illustrate that some early “free-volume based” ideas about the origins of dynamics are consistent with simulation data, but those ideas will need significant revision if they are to be applied in a general way.
The first example is from Starr et al.,76 who used the algorithm of Sastry and coworkers71 in their molecular dynamics simulation study of a model super-cooled polymer melt. One of their main findings was that a quantitative connection does exist between the average free volume and the dynamics of that system. Specifically, Starr et al. demonstrated that the temperature dependence of the average monomeric free volume (defined with a corresponding temperature-dependent Boltzmann diameter78) is well described by a power-law fit that extrapolates to zero free volume at the same temperature at which the dominant relaxation time of the melt diverges. These simulation data seem entirely consistent with the historical notion that the glass transition may be related to a vanishing free volume, which is a central tenet of free-volume theories of vitrification.13
The generality of this finding is challenged, however, by the work of Krekelberg, Ganesan, and Truskett,75 which demonstrates that static free volumes do not always correlate strongly with dynamics. These authors also used the algorithm from Ref. 71 in conjunction with molecular dynamics simulations to probe the connection between free volume and the self-diffusivity of model colloids that interact via polymer-induced depletion attractions (for a detailed discussion of the model, see Refs. 86 and 87). Because this model system displays anomalous dynamic properties, it can be used to put the free-volume-based picture for dynamics to a stringent test. The most notable dynamic anomaly of this model colloidal suspension is that, at constant colloid concentration, the colloid self-diffusivity shows a maximum as a function of $\epsilon/k_B T$ [see Figure 9(a)], where $\epsilon$ measures the strength (well depth) of the depletion attraction. Interestingly, Figure 9(b) shows that increasing the strength of attraction between the colloidal particles monotonically increases the average free volume of the colloids.
The monotonic increase reflects the tendency of attractive particles to cluster, a process that naturally forms transient, open channels in the fluid.88,89 The implication is that the average free volume and the diffusion coefficient are not correlated in a simple way for this colloidal system, as can be inferred from inspection of Figure 9(c). This lack of correlation represents a significant departure from the behavior of the hard-sphere reference fluid [also shown in Figure 9(c)] and of the polymer melt simulated by Starr et al.76 It also differs from the qualitative expectations of free-volume theories of dynamics,69 which assume that increasing the static free volume leads to an increase in particle mobility.
Figure 9 Properties of the attractive colloidal fluid investigated in Ref. 75: (a) self-diffusivity and (b) average free volume versus strength of the interparticle attraction; (c) self-diffusivity versus average free volume for the hard-sphere fluid (open circles) and the attractive colloidal fluid (closed circles). Data compiled from Ref. 75.
To fully understand the anomalous dynamics of an attractive colloidal fluid from a free-volume perspective, one must consider two effects of attractions on free volumes.75 First, attractions increase the average local space available to the particles and render the free-volume distribution more inhomogeneous than when no attractions exist. These changes act to increase the mobility of the fluid. Second, strong attractions also lead to long-lived associations between the colloids, which in turn slow down dynamical processes. This latter effect suggests that any correlation between free volume and a system’s dynamics must also incorporate the lifetimes of the free volumes. To measure such time scales, the free-volume autocorrelation function $C_{v_f}$ can be calculated.75 This quantity is defined as
$$C_{v_f}(t) = \frac{1}{N} \sum_{i=1}^{N} \frac{\langle \delta v_{f,i}(t) \, \delta v_{f,i}(0) \rangle}{\langle \delta v_{f,i}(0) \, \delta v_{f,i}(0) \rangle} \qquad [9]$$
where $\delta v_{f,i}(t) = v_{f,i}(t) - \langle v_{f,i}(t) \rangle$ is the deviation of particle i’s free volume at time t from its average value. This autocorrelation function characterizes the manner in which the size of a particle’s free volume loses correlation with its initial value over time because of thermal fluctuations. The decay of the correlator can be used to identify the “persistence” times of free volumes $\tau_f$,75 which are displayed in Figure 10(a) as a function of the interparticle attractive strength. As expected, the persistence time of the free volumes increases rapidly as the strength of attractions is increased. KGT75 also noted that the simple scaling relationship $D = A \langle v_f \rangle^{2/3} / \tau_f$ is capable of capturing the diffusion coefficient’s nontrivial dependence on attractive strength. The good qualitative correspondence between this relationship and the actual self-diffusivity, which is shown in Figure 10(b), implies that a description of self-diffusion in liquids generally requires information about both the static and the dynamic properties of the free volumes.
Figure 10 (a) Free-volume persistence time extracted from the free-volume autocorrelation function (Eq. [9]) for an attractive colloidal fluid as a function of the strength of the interparticle attraction, $\epsilon/k_B T$. (b) Comparison of colloidal self-diffusivity (closed symbols) with that estimated using the free-volume scaling relationship $D = A \langle v_f \rangle^{2/3} / \tau_f$ discussed in the text (open symbols). Data taken from Ref. 75.
Another static quantity, the entropy, is also thought to be intimately related to various dynamic properties of liquids such as the viscosity $\mu$ and the self-diffusivity $D$. In the next section, we discuss some of the main ways in which computer simulations have been used to probe this connection in recent years.
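As a concrete illustration of Eq. [9], the free-volume autocorrelation function can be estimated from a time series of per-particle free volumes. The sketch below makes several assumptions that are not taken from the text: the ensemble averages are replaced by averages over time origins within a single trajectory, the persistence time is taken as the first time at which the correlator falls below 1/e, and all function names are hypothetical.

```python
import numpy as np


def free_volume_autocorrelation(vf):
    """Estimate C_vf(t) of Eq. [9] from vf with shape (n_frames, n_particles)."""
    vf = np.asarray(vf, dtype=float)
    dv = vf - vf.mean(axis=0)                 # delta v_{f,i}(t) for each particle
    n_frames = vf.shape[0]
    var = np.mean(dv * dv, axis=0)            # <dv_i(0) dv_i(0)> per particle
    c = np.empty(n_frames)
    for lag in range(n_frames):
        # Average over time origins first, then over particles.
        num = np.mean(dv[lag:] * dv[:n_frames - lag], axis=0)
        c[lag] = np.mean(num / var)
    return c


def persistence_time(c, dt, threshold=np.exp(-1.0)):
    """First time at which the correlator drops below the (assumed) threshold."""
    below = np.nonzero(c < threshold)[0]
    return below[0] * dt if below.size else np.nan


def scaling_diffusivity(prefactor, vf_mean, tau_f):
    """Scaling estimate D = A <v_f>^(2/3) / tau_f discussed in the text."""
    return prefactor * vf_mean ** (2.0 / 3.0) / tau_f
```

Large lags are averaged over few time origins and are therefore noisy, so in practice only the early part of the correlator would be used to extract $\tau_f$.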
ENTROPY
A well-known prediction connecting thermodynamics to dynamics appeared in the seminal work of Adam and Gibbs (AG),90 where semiempirical arguments led to the following exponential relationship:
$$\tau_R = \tau_0 \exp[A/(T S_C)] \qquad [10]$$
Here, $\tau_R$ is a dominant structural relaxation time of the liquid (e.g., viscosity $\mu \propto \tau_R$), $T$ is temperature, $S_C$ is the molar configurational entropy (defined below), and both $\tau_0$ and $A$ are temperature-independent parameters. Equation [10] has been shown empirically to describe the behaviors of a diverse variety of computer-simulated liquids in their super-cooled states, including models for silica91 and water,92 a binary Lennard–Jones alloy,93 and a monatomic model glass-former introduced by Dzugutov.94 The AG relationship has also been validated through analysis of experimentally obtained thermodynamic and kinetic data of various super-cooled liquids.95–98
Given the empirical success of Eq. [10], it is natural to ask what, precisely, the configurational entropy measures. Although no general agreement about its structural interpretation exists yet, Adam and Gibbs originally proposed that there is a connection between $S_C$ and the size of the so-called “cooperatively rearranging regions” that play an important role in the dynamics of super-cooled liquids. The idea is that relaxation processes occur primarily by independent motions of the structural regions, whose average size is expected to increase with decreasing temperature, which in turn reduces the number of thermodynamically available configurations (and hence the “configurational” entropy) of the system. Although this picture is intuitively appealing, a universally accepted metric for identifying cooperatively rearranging regions, and thus for testing the structural interpretation of the AG relation, has yet to be discovered. In an effort to clarify the situation, various authors24,99,100 have proposed alternative means for deriving Eq. [10]. The results of these investigations, although insightful, do not represent the final word on the microscopic origin of the AG relation, and understanding this apparent connection between the thermodynamics and the dynamics of super-cooled fluids remains one of the most intriguing and long-standing challenges in liquid-state theory.
Here, we do not explore the derivation or the possible molecular basis of the AG relation. Rather, we discuss how computer simulations of model systems can be used to put the AG relation to a stringent empirical test. The first question that must be addressed in this endeavor is as follows: How does one compute configurational entropy? Initial ideas by Goldstein4 followed by a formal statistical mechanical derivation by Stillinger and Weber5 have provided important guidance in this regard by drawing a connection between configurational entropy and the potential energy landscape (PEL) of the liquid. The PEL (see Figure 11) is the multidimensional hypersurface formed when the total potential energy of the liquid is plotted as a function of its microscopic configurational degrees of freedom (i.e., its generalized particle coordinates). At any instant of time, the vector of generalized particle coordinates locates a single point on the surface of the PEL, the so-called configuration point of the liquid. Self-diffusion translates into motion of the configuration point on the PEL as the configuration point executes (1) “vibrations” within the basins of attraction surrounding each local energy minimum (inherent structure), and (2) transitions across saddle points from one “configurational” basin to another. These latter transitions, which often involve cooperative rearrangements of the molecules, occur less frequently as the temperature is lowered and thus become largely decoupled from the vibrational motions. From a thermodynamic perspective, Stillinger and Weber demonstrated that the total entropy of the liquid can similarly be divided into two additive terms, a “configurational” and a “vibrational” contribution.5,6 The configurational part $S_C$ measures the number of structurally distinct basins of attraction on the PEL that the configuration point accesses at a given temperature, whereas the vibrational contribution $S_{vib}$ characterizes the number of states associated with intra-basin fluctuations. Thus, the AG relationship, when viewed from the PEL perspective, suggests that it is the thermodynamic availability of basins on the landscape that dominates the rate of liquid-state diffusive processes.
Figure 11 Simplified two-dimensional schematic of a multidimensional potential energy surface (potential energy versus particle coordinates) as a function of its configurational degrees of freedom, showing basins, transition states (saddle points), amorphous inherent structures, and crystal permutations. The landscape topology is specified by the density, whereas the system’s elevation on the landscape is dictated by temperature. Reprinted with permission from Ref. 6.
These ideas have been employed to compute configurational entropy and hence test the AG relation via molecular simulation of several model systems.91–94,101,102 The approach used in those studies is conceptually simple. First, the total entropy of the fluid $S$ is calculated by integration of standard thermodynamic relationships, for example, as discussed below. Then, the configurational contribution to the entropy, $S_C = S - S_{vib}$, is approximated by subtracting from the total entropy an estimate for the vibrational contribution $S_{vib}$.
One way to obtain $S$ at a given density $\rho$ and temperature $T$ is to first calculate the total entropy at a reference state with the same density $\rho$ but
with a much higher temperature $T_0$. This calculation can be done by performing an isothermal integration over a reversible path from the ideal gas state,
$$S(T_0, \rho) = S_{ideal}(T_0, \rho) + \frac{E(T_0, \rho)}{T_0} + N k_B \int_0^{\rho} \left[ 1 - \frac{P(T_0, \rho')}{\rho' k_B T_0} \right] \frac{d\rho'}{\rho'} \qquad [11]$$
where $E(T_0, \rho)$ is the energy of the fluid at the reference state and $P(T_0, \rho')$ is the fluid pressure at density $\rho'$ and reference temperature $T_0$. The exact expression for the ideal gas contribution to the entropy $S_{ideal}$ is available in analytical form.1 Moreover, both $E(T_0, \rho)$ and $P(T_0, \rho')$ can be computed readily using standard molecular simulation techniques for equilibrium fluids, such as Monte Carlo or molecular dynamics.15–17 The data collected for $P(T_0, \rho')$ along the isotherm can then be fitted numerically to an analytical equation to facilitate the integration. A key issue to keep in mind is that the chosen reference temperature $T_0$ should be high enough to avoid crossing a liquid–gas phase boundary along the isothermal integration path.23 The total entropy $S$ at any other temperature of interest $T$ is then obtained by thermodynamic integration along an isochoric path,
$$S(T, \rho) = S(T_0, \rho) + \int_{T_0}^{T} \frac{dT'}{T'} \left( \frac{\partial E(T', \rho)}{\partial T'} \right)_{\rho} \qquad [12]$$
where the derivative in the integrand is also typically obtained by fitting a series of temperature-dependent molecular simulation data to a simple analytical equation.101
The above method provides a general way to calculate the total entropy. At low temperature, however, prohibitively long equilibration and production simulations would be required if standard Monte Carlo and/or molecular dynamics algorithms were to be used to calculate the necessary thermodynamic properties. Here, one should instead invoke one of the powerful simulation methods designed for equilibrating systems with large free energy barriers, including (but not limited to) parallel tempering,103 the cluster pivot algorithm of Santen and Krauth,104 the swap Monte Carlo algorithm of Grigera and Parisi,105 and “density-of-states” algorithms.106–110 The comparative performance of some of these methods has been examined recently, and a discussion of the results can be found elsewhere.111,112 Alternatively, one can invoke an approximate analytical scaling relation introduced earlier by Rosenfeld and Tarazona113 to extrapolate high-temperature entropy behavior down to the deeply super-cooled liquid region. This approach has been widely used,92,101,114–116 in large part because of its simplicity and its reasonable accuracy.
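The two integrations described by Eqs. [11] and [12] can be sketched numerically as follows. This is a minimal illustration under stated assumptions: quantities are per particle in reduced units with $k_B = 1$, the excess contribution along the isotherm is written as $\int_0^\rho [1 - Z(\rho')] \, d\rho'/\rho'$ with $Z = P/\rho k_B T_0$, the lowest density of the supplied grid is assumed to lie in the dilute regime so that the integrand can be extended to zero density as a constant, and the energy data along the isochore are fitted with a polynomial. All function names and the fit order are hypothetical.

```python
import numpy as np


def trapezoid(y, x):
    """Signed trapezoidal-rule integral of y over the (possibly decreasing) grid x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def entropy_at_reference(s_ideal, e_ref, t0, rho_grid, z_grid):
    """Eq. [11] per particle (k_B = 1): isothermal path from the ideal gas."""
    rho = np.asarray(rho_grid, dtype=float)
    z = np.asarray(z_grid, dtype=float)
    f = (1.0 - z) / rho                       # integrand [1 - Z] / rho'
    # Extend to rho' = 0 assuming the first grid point is already dilute.
    rho_full = np.concatenate(([0.0], rho))
    f_full = np.concatenate(([f[0]], f))
    return s_ideal + e_ref / t0 + trapezoid(f_full, rho_full)


def entropy_along_isochore(s_ref, t0, t, t_data, e_data, deg=3, n=2001):
    """Eq. [12] per particle (k_B = 1): isochoric path from T0 to T."""
    coeffs = np.polyfit(t_data, e_data, deg)  # smooth fit of E(T) at fixed density
    de_dt = np.polyder(coeffs)                # analytic derivative of the fit
    grid = np.linspace(t0, t, n)
    return s_ref + trapezoid(np.polyval(de_dt, grid) / grid, grid)
```

A useful sanity check is a fluid with $Z = 1 + B_2 \rho$ and zero energy, for which the reference-state integral must return $-B_2 \rho$, and an energy that is linear in temperature, for which the isochoric integral must return a logarithm of the temperature ratio.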
The next step when computing configurational entropy is to calculate the vibrational contribution to the entropy $S_{vib}$. The most commonly employed technique is to assume that the configuration point of the liquid executes harmonic vibrations around its inherent structures (i.e., $S_{vib} \approx S_{harm}$), a description that can be expected to be accurate at low temperatures. The quantity $S_{harm}$ for a given basin is then computed as117
$$S_{harm} = k_B \sum_{i=1}^{3N-3} \left[ 1 - \ln\left( \hbar\omega_i / k_B T \right) \right] \qquad [13]$$
where $\hbar$ is Planck’s constant divided by $2\pi$. The $\omega_i$ are the intra-basin vibrational eigenfrequencies, which are the square roots of the eigenvalues of the Hessian matrix117 evaluated at the inherent structure of the basin. To complete this analysis, the right-hand side of Eq. [13] needs to be averaged over a representative set of inherent structures obtained from each equilibrium state point. The inherent structure configurations are determined by mapping equilibrium particle configurations at the temperature and density of interest to their associated local potential energy minima using, e.g., a conjugate gradient algorithm or an alternative optimization routine for large-scale problems (see, for example, Ref. 118). In practice, good statistics for this calculation can usually be obtained when using 1000 inherent structures per state point.23 A comparison of the properties of inherent structures recovered by different minimization routines has been published.119 One of the main findings was that although different minimization routines yield slightly different sets of inherent structures, the average properties obtained are virtually indistinguishable.
The harmonic approximation discussed above works well for simple liquids in their super-cooled state, but it is not sufficient for estimating $S_{vib}$ in the case of strong network-forming liquids like water or silica.91,92 For these fluids, the anharmonic contribution $S_{anh}$, where $S_{vib} = S_{harm} + S_{anh}$, is important and is normally computed by calculating approximately the anharmonic part of the energy $E_{anh}$, where $E_{vib} = E_{harm} + E_{anh}$. The anharmonic contribution to the energy is determined by subtracting the average inherent structure energy $E_{IS}$ and the harmonic vibrational contribution, $E_{harm} = (3N - 3) k_B T / 2$ (see Ref. 1), from the total energy $E$, that is, $E_{anh} = E - E_{IS} - E_{harm}$, and then integrating the isochoric temperature derivative of $E_{anh}$ from $T = 0$ to the temperature of interest as in Eq. [12].
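The harmonic estimate of Eq. [13] for a single basin reduces to an eigenvalue problem once the Hessian at the inherent structure is known. The sketch below assumes equal particle masses, reduced units in which $\hbar = k_B = 1$ by default, and that the three smallest eigenvalues are the zero-frequency translational modes of a periodic system; the function name and argument list are hypothetical.

```python
import numpy as np


def harmonic_entropy(hessian, temperature, mass=1.0, hbar=1.0, k_b=1.0, n_zero=3):
    """Eq. [13] for one basin: S_harm = k_B * sum_i [1 - ln(hbar w_i / k_B T)]."""
    # Eigenvalues of the mass-weighted Hessian are the squared eigenfrequencies.
    eigvals = np.linalg.eigvalsh(np.asarray(hessian, dtype=float) / mass)
    eigvals = np.sort(eigvals)[n_zero:]       # discard the zero translational modes
    if np.any(eigvals <= 0.0):
        raise ValueError("Hessian is not positive definite: not a local minimum")
    omega = np.sqrt(eigvals)
    return k_b * np.sum(1.0 - np.log(hbar * omega / (k_b * temperature)))
```

In a full calculation this function would be applied to the Hessian of each inherent structure sampled at a state point, and the results averaged as described in the text.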
In practice, the computed variation of $E_{anh}$ with temperature can be fitted approximately by a polynomial having both zero value and zero slope at $T = 0$.91 This analysis implicitly assumes that the basin shapes on the PEL do not change appreciably with inherent structure energy (i.e., with basin “elevation”). Finally, the required configurational entropy is obtained by difference,
$$S_C = S - S_{harm} - S_{anh} \qquad [14]$$
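To make the anharmonic step explicit, suppose (as an assumption for illustration) that the fitted polynomial is $E_{anh}(T) = \sum_{k=2}^{n} c_k T^k$, whose lowest power is quadratic so that both the value and the slope vanish at $T = 0$. The coefficients $c_k$ and the order $n$ are hypothetical fit parameters. Integrating the isochoric temperature derivative in the manner of Eq. [12] then gives the anharmonic entropy in closed form:

```latex
S_{anh}(T) = \int_0^{T} \frac{dT'}{T'} \, \frac{\partial E_{anh}(T')}{\partial T'}
           = \int_0^{T} \sum_{k=2}^{n} k \, c_k \, T'^{\,k-2} \, dT'
           = \sum_{k=2}^{n} \frac{k}{k-1} \, c_k \, T^{\,k-1}
```

For example, a purely quadratic fit $E_{anh} = c_2 T^2$ yields $S_{anh} = 2 c_2 T$. The requirement of zero slope at $T = 0$ is what keeps the integrand finite at the lower limit.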
Examples of this type of calculation are reported in Refs. 91–94, 101, and 102. We present below representative results for a few network-forming liquids that are known to display anomalous thermodynamic and dynamic behavior.91,92
TESTING THE ADAM–GIBBS RELATIONSHIP
One of the most convincing tests of the AG relationship appeared in the work of Scala et al.92 for the SPC/E model of water,57 which is known to reproduce qualitatively many of water’s distinctive properties in its super-cooled liquid state. In this study, the dynamical quantity used to correlate with the configurational entropy was the self-diffusivity $D$, which Scala et al. computed via molecular dynamics simulations. The authors calculated the various contributions to the liquid entropy using the methods described above for a wide range of temperature and density [shown in Figure 12(a–c)].
Figure 12 (a–d) The total liquid entropy $S$ (Eq. [12]), the vibrational entropy $S_{vib} = S_{harm} + S_{anh}$, the configurational entropy $S_C \equiv S_{conf}$ (Eq. [14]), and the diffusivity $D$ as functions of density for the SPC/E water model. Panels (a)–(c) are plotted in units of $k_B$, panel (d) in units of $10^{-5}$ cm$^2$ s$^{-1}$, and the density $\rho$ in g cm$^{-3}$. The temperatures from top to bottom are T = 300, 260, 240, 230, 220, and 210 K. Reprinted with permission from Ref. 92.
Figure 13 Self-diffusivity $D$ (in units of $10^{-5}$ cm$^2$ s$^{-1}$) versus configurational entropy, $S_C \equiv S_{conf}$, plotted as $10^3/(T S_{conf})$ in mol J$^{-1}$, for the SPC/E water model at densities $\rho$ = 0.95, 1.00, 1.05, 1.10, 1.20, 1.30, and 1.40 g cm$^{-3}$. The lines are fits to the AG form given by Eq. [10] with $\tau_R \propto 1/D$. Reprinted with permission from Ref. 92.
The behavior of the self-diffusivity for these state points is displayed in Figure 12(d). Clearly, the non-monotonic dependence of $D$ on density $\rho$ is directly reflected in the configurational contribution to the entropy. In fact, the quantitative relationship between $D$ and $S_C$ predicted by Eq. [10] (i.e., the AG relationship) holds remarkably well for SPC/E water over this wide range of thermodynamic conditions, as is shown in Figure 13.
An investigation by Saika-Voivod, Poole, and Sciortino91 provides another convincing example of a network-forming liquid that obeys the AG relationship (in this case, silica described by the BKS model120). Again, the authors compared the behavior of the temperature- and density-dependent self-diffusivity, which they computed via molecular dynamics simulations, with the configurational entropy, obtained by the methods discussed earlier in this section. The BKS silica model represents a particularly interesting case because its self-diffusivity shows a non-Arrhenius T dependence at high T (so-called “fragile” behavior) and an Arrhenius T dependence at low T (so-called “strong” behavior). This type of fragile-to-strong transition for dynamics is discussed in more detail in Ref. 121. The AG relation again holds to a very good approximation, as shown in Figure 14.
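A test of the AG relationship of the kind described above amounts to a straight-line fit once $D$ and $S_C$ are available at a set of temperatures. With $\tau_R \propto 1/D$, Eq. [10] implies $\ln D = \ln D_\infty - A/(T S_C)$. The sketch below is a minimal illustration; the function name and the symbol $D_\infty$ for the fitted prefactor are assumptions, and no data from the cited studies are reproduced.

```python
import numpy as np


def fit_adam_gibbs(temperature, s_conf, diffusivity):
    """Least-squares fit of ln D = ln D_inf - A / (T S_C); returns (A, D_inf)."""
    x = 1.0 / (np.asarray(temperature, dtype=float) * np.asarray(s_conf, dtype=float))
    y = np.log(np.asarray(diffusivity, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)    # straight line in the AG variables
    return -slope, np.exp(intercept)
```

If the AG relation holds, data at a fixed density collapse onto a single line when $\ln D$ is plotted against $1/(T S_C)$, and systematic curvature in that plot signals a breakdown of Eq. [10].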
Figure 14 Self-diffusivity $D$ (in cm$^2$ s$^{-1}$) versus configurational entropy $S_C$ data (symbols), plotted as $1{,}000/(T S_C)$ in mol J$^{-1}$, for the BKS silica model at two different densities, $\rho$ = 3.01 and 2.36 g cm$^{-3}$. The lines are fits to the AG form given by Eq. [10] with $\tau_R \propto 1/D$. Reprinted with permission from Ref. 91.
AN ALTERNATIVE TO ADAM–GIBBS?
Finally, we note that a different type of exponential scaling that links single-particle dynamics to entropy has now been found to hold for atomistic super-cooled liquids.122 In particular, the relationship
$$D = D_0 \exp[B s^{ex}] \qquad [15]$$
captures, to an excellent approximation, the behavior of several diverse model liquids upon isochoric super-cooling. Here, $s^{ex}$ is the molar excess entropy, i.e., the molar entropy of the liquid minus that of an ideal gas with the same number density. The quantities $D_0$ and $B$ are temperature-independent parameters, and so Eq. [15] indicates that the nontrivial T dependence of $D$ for super-cooled liquids is captured entirely by $s^{ex}$.
Because the relationship in Eq. [15] involves the excess entropy $s^{ex}$ instead of the configurational entropy $S_C$, it has three important practical advantages over the AG relationship that make it a particularly intriguing subject of future investigation. First, unlike $S_C$, the quantity $s^{ex}$ is a standard thermodynamic variable123 that does not require detailed topographical information about the PEL. Second, the determination of $s^{(2)}$ (see Eq. [3]), the “two-body” approximation for $s^{ex}$, only requires knowledge of the radial distribution function $g(r)$, which is an experimentally accessible quantity.40 As a result, Eq. [15] promises to lend new insights into the elusive connection between structure and dynamics of super-cooled liquids.124 Third, although $S_C$ is not thought to be particularly relevant for the dynamics of fluids above their freezing point, $s^{ex}$ is known to provide scaling relationships for the transport coefficients of simple equilibrium fluids,125–130 which means that $s^{ex}$ seems to be a relevant thermodynamic quantity for dynamics from “ideal glass to gas.”122 However, because the success of Eq. [15] is a very recent finding, more studies are warranted to probe the generality of the relation for the super-cooled liquid state. Also, as with the AG relation, a fundamental “first-principles” justification for the empirically observed connections between $s^{ex}$ and the dynamical properties of fluids is still lacking.
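The practical appeal of the two-body approximation is that it can be evaluated directly from a tabulated radial distribution function. The sketch below assumes that Eq. [3] takes the standard form $s^{(2)}/k_B = -2\pi\rho \int_0^\infty \{ g(r) \ln g(r) - [g(r) - 1] \} r^2 \, dr$ (per particle), uses a simple trapezoidal rule on the supplied grid, and then inserts an excess entropy into Eq. [15]. The function names are hypothetical, and the inputs `d0` and `b` stand for fitted parameters that are not given in the text.

```python
import numpy as np


def two_body_excess_entropy(r, g, rho):
    """s2 / k_B = -2 pi rho * integral of [g ln g - (g - 1)] r^2 dr (assumed form)."""
    r = np.asarray(r, dtype=float)
    g = np.asarray(g, dtype=float)
    g_ln_g = np.zeros_like(g)
    mask = g > 0.0                            # g ln g -> 0 as g -> 0
    g_ln_g[mask] = g[mask] * np.log(g[mask])
    integrand = (g_ln_g - g + 1.0) * r ** 2
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))
    return -2.0 * np.pi * rho * float(integral)


def diffusivity_from_excess_entropy(s_ex, d0, b):
    """Eq. [15]: D = D0 * exp(B * s_ex), with s_ex in the units used to fit B."""
    return d0 * np.exp(b * s_ex)
```

The tabulated $g(r)$ must extend far enough that it has decayed to unity, because the integrand vanishes only where $g(r) = 1$; truncating the table too early biases $s^{(2)}$.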
CONCLUSIONS In this chapter, we reviewed some of the ways computing has emerged as a tool for investigating the microscopic origins of the behaviors of liquid and glassy states. Our discussion centered around three concepts that have strongly influenced “physically based” models for these systems: structural ordering of the particles, free volume, and entropy. For each of these concepts, we provided introductory material and a description of some of the basic algorithmic tools that we feel a graduate student or a scientist new to the field might find helpful. We also highlighted some of the key research findings from extensive tests of these ideas in model systems. One pervasive theme is that the three different ways of studying the liquid state share a common hypothesis: namely, that an intimate connection exists between static (structural and thermodynamic) properties, on the one hand, and dynamics, on the other. Although there is mounting empirical evidence from simulations that such connections exist, the specific relations observed have not yet been shown to follow from a formal statistical mechanical treatment. However, because simulations can provide detailed information on the structure, thermodynamics, and dynamics of model systems, they will continue to play an integral role in the testing of future theories that aim to provide an understanding of these observations.
ACKNOWLEDGMENTS T. M. Truskett acknowledges funding from the National Science Foundation, the American Chemical Society Petroleum Research Fund, the David and Lucile Packard Foundation, the Alfred P. Sloan Foundation, and the Texas Advanced Computing Center. W. P. Krekelberg acknowledges support from the National Science Foundation. J. R. Errington acknowledges support from the
National Science Foundation, the American Chemical Society Petroleum Research Fund, and the University at Buffalo Center for Computational Research.
REFERENCES
1. D. A. McQuarrie, Statistical Mechanics, Harper-Collins, New York, 1976.
2. S. Chapman and T. G. Cowling, The Mathematical Theory of Non-Uniform Gases, Cambridge University Press, Cambridge, Massachusetts, 1970.
3. J. O. Hirschfelder, C. F. Curtiss, and R. B. Bird, Molecular Theory of Gases and Liquids, Wiley, New York, 1954.
4. M. Goldstein, J. Chem. Phys., 51, 3728 (1969). Viscous Liquids and the Glass Transition: A Potential Energy Barrier Picture.
5. F. H. Stillinger and T. A. Weber, Phys. Rev. A, 25, 978 (1982). Hidden Structure in Liquids.
6. F. H. Stillinger, Science, 267, 1935 (1995). A Topographic View of Supercooled Liquids and Glass Formation.
7. J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids, Academic Press, London, 1986.
8. S. C. Glotzer, J. Non-Cryst. Sol., 274, 342 (2000). Spatially Heterogeneous Dynamics in Liquids: Insights from Simulation.
9. S. Torquato, Random Heterogeneous Materials: Microstructure and Macroscopic Properties, Springer-Verlag, New York, 2002.
10. C. Kittel, Introduction to Solid State Physics, Wiley, New York, 1996.
11. J.-L. Barrat and J.-P. Hansen, Basic Concepts for Simple and Complex Fluids, Cambridge University Press, Cambridge, Massachusetts, 2003.
12. P. M. Chaikin and T. C. Lubensky, Principles of Condensed Matter Physics, Cambridge University Press, Cambridge, Massachusetts, 1995.
13. P. G. Debenedetti, Metastable Liquids. Concepts and Principles, Princeton University Press, Princeton, New Jersey, 1996.
14. P. G. Debenedetti and F. H. Stillinger, Nature (London), 410, 259 (2001). Supercooled Liquids and the Glass Transition.
15. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, United Kingdom, 1989.
16. D. Frenkel and B. Smit, Understanding Molecular Simulation, Academic Press, Orlando, Florida, 2001.
17. D. C. Rapaport, The Art of Molecular Dynamics Simulation, Cambridge University Press, Cambridge, Massachusetts, 2004.
18. C. A. Angell, Science, 267, 1924 (1995). Formation of Glasses from Liquids and Biopolymers.
19. M. D. Ediger, C. A. Angell, and S. R. Nagel, J. Phys. Chem., 100, 13200 (1996). Supercooled Liquids and Glasses.
20. W. Götze and L. Sjögren, Rep. Prog. Phys., 55, 241 (1992). Relaxation Processes in Supercooled Liquids.
21. W. Kauzmann, Chem. Rev., 43, 219 (1948). The Nature of the Glassy State and the Behavior of Liquids at Low Temperatures.
22. D. R. Reichman and P. Charbonneau, J. Stat. Mech., P05013 (2005). Mode-Coupling Theory.
23. F. Sciortino, J. Stat. Mech., P05015 (2005). Potential Energy Landscape Description of Supercooled Liquids and Glasses.
24. J.-P. Bouchaud and G. Biroli, J. Chem. Phys., 121, 7347 (2004). On the Adam-Gibbs-Kirkpatrick-Thirumalai-Wolynes Scenario for the Viscosity Increase in Glasses.
25. T. R. Kirkpatrick, D. Thirumalai, and P. G. Wolynes, Phys. Rev. A, 40, 1045 (1989). Scaling Concepts for the Dynamics of Viscous Liquids near an Ideal Glassy State.
26. F. Ritort and P. Sollich, Adv. Phys., 52, (2003). Glassy Dynamics of Kinetically Constrained Models.
27. W. Kob, J. Phys.: Cond. Matt., 11, R85 (1999). Computer Simulations of Supercooled Liquids and Glasses.
28. X. Xia and P. G. Wolynes, Proc. Natl. Acad. Sci., 97, 2990 (2000). Fragilities of Liquids Predicted from the Random First Order Transition Theory of Glasses.
29. J. R. Errington and P. G. Debenedetti, Nature (London), 409, 318 (2001). Relationship between Structural Order and the Anomalies of Liquid Water.
30. T. M. Truskett, S. Torquato, and P. G. Debenedetti, Phys. Rev. E, 62, 993 (2000). Towards a Quantification of Disorder in Materials: Distinguishing Equilibrium and Glassy Sphere Packings.
31. S. Torquato, T. M. Truskett, and P. G. Debenedetti, Phys. Rev. Lett., 84, 2064 (2000). Is Random Close Packing of Spheres Well Defined?
32. J. R. Errington, P. G. Debenedetti, and S. Torquato, J. Chem. Phys., 118, 2256 (2003). Quantification of Order in the Lennard-Jones System.
33. M. S. Shell, P. G. Debenedetti, and A. Z. Panagiotopoulos, Phys. Rev. E, 66, 011202 (2002). Molecular Structural Order and Anomalies in Liquid Silica.
34. T. M. Truskett, S. Torquato, S. Sastry, P. G. Debenedetti, and F. H. Stillinger, Phys. Rev. E, 58, 3083 (1998). Structural Precursor to Freezing in the Hard-Disk and Hard-Sphere Systems.
35. P. A. Monson and D. A. Kofke, Adv. Chem. Phys., 115, 113 (2000). Solid-Fluid Equilibrium: Insights from Simple Molecular Models.
36. J. M. Bernstein, Polymorphism in Molecular Crystals, Oxford University Press, Oxford, United Kingdom, 2002.
37. H. S. Green, The Molecular Theory of Fluids, North-Holland, Amsterdam, The Netherlands, 1958.
38. R. E. Nettleton and M. S. Green, J. Chem. Phys., 29, 1365 (1958). Expression in Terms of Molecular Distribution Functions for the Entropy Density in an Infinite System.
39. H. J. Raveché, J. Chem. Phys., 55, 2242 (1971). Entropy and Molecular Correlation Functions in Open Systems. I. Derivation.
40. A. Baranyai and D. J. Evans, Phys. Rev. A, 40, 3817 (1989). Direct Entropy Calculation from Computer Simulation of Liquids.
41. R. D. Mountain and H. J. Raveché, J. Chem. Phys., 55, 2250 (1971). Entropy and Molecular Correlation Functions in Open Systems. II. Two- and Three-Body Correlations.
42. T. Lazaridis and M. Karplus, J. Chem. Phys., 105, 4294 (1996). Orientational Correlations and Entropy in Liquid Water.
43. P. J. Steinhardt, D. R. Nelson, and M. Ronchetti, Phys. Rev. B, 28, 784 (1983). Bond-Orientational Order in Liquids and Glasses.
44. E. W. Hobson, The Theory of Spherical and Ellipsoidal Harmonics, Chelsea, New York, 1955.
45. J. D. Weeks, D. Chandler, and H. C. Andersen, J. Chem. Phys., 54, (1971). Role of Repulsive Forces in Determining the Equilibrium Structure of Simple Liquids.
46. R. W. Zwanzig, J. Chem. Phys., 22, 1420 (1954). High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases.
47. P. R. ten Wolde, M. J. Ruiz-Montero, and D. Frenkel, J. Chem. Phys., 104, 9932 (1996). Numerical Calculation of the Rate of Crystal Nucleation in a Lennard-Jones System at Moderate Undercooling.
48. A. Vishnyakov and A. V. Neimark, J. Chem. Phys., 118, 7585 (2003). Specifics of Freezing of Lennard-Jones Fluid Confined to Molecularly Thin Layers.
49. A. R. Kansal, T. M. Truskett, and S. Torquato, J. Chem. Phys., 113, 4844 (2000). Non-Equilibrium Hard-Disk Packings with Controlled Orientational Order.
50. P. L. Chau and A. J. Hardwick, Mol. Phys., 93, 511 (1998). A New Order Parameter for Tetrahedral Configurations.
51. Y. I. Naberukhin, V. P. Voloshin, and N. N. Medvedev, Mol. Phys., 73, 917 (1991). Geometrical Analysis of the Structure of Simple Liquids: Percolation Approach.
52. C. A. Angell and H. Kanno, Science, 193, 1121 (1976). Density Maxima in High-Pressure Supercooled Water and Liquid Silicon Dioxides.
53. C. A. Angell, P. A. Cheeseman, and S. Tamaddon, Science, 218, 885 (1982). Pressure Enhancement of Ion Mobilities in Liquid Silicates from Computer-Simulation Studies to 800-Kilobars.
54. C. A. Angell, E. D. Finch, L. A. Woolf, and P. Bach, J. Chem. Phys., 65, 3063 (1976). Spin-Echo Diffusion-Coefficients of Water to 2380 Bar and 20 degrees C.
55. F. X. Prielmeier, E. W. Lang, R. J. Speedy, and H. D. Ludemann, Phys. Rev. Lett., 59, 1128 (1987). Diffusion in Supercooled Water to 300 MPa.
56. S. Tsuneyuki and Y. Matsui, Phys. Rev. Lett., 74, 3197 (1995). Molecular-Dynamics Study of Pressure Enhancement of Ion Mobilities in Liquid Silica.
57. H. J. C. Berendsen, J. R. Grigera, and T. P. Straatsma, J. Chem. Phys., 91, 6269 (1987). The Missing Term in Effective Pair Potentials.
58. S. Sastry, Nature, 409, 300 (2001). Water Structure: Order and Oddities.
59. R. M. Lynden-Bell and P. G. Debenedetti, J. Phys. Chem. B, 109, 6527 (2005). Computational Investigation of Order, Structure and Dynamics in Modified Water Models.
60. T. M. Truskett and K. A. Dill, J. Phys. Chem. B, 106, 11829 (2002). A Simple Statistical Mechanical Model of Water.
61. Z. Yan, S. V. Buldyrev, N. Giovambattista, and H. E. Stanley, Phys. Rev. Lett., 95, 130604 (2005). Structural Order for One-Scale and Two-Scale Potentials.
62. O. Mishima and H. E. Stanley, Nature (London), 396, 329 (1998). The Relationship between Liquid, Supercooled and Glassy Water.
63. M. H. Cohen and D. Turnbull, J. Chem. Phys., 31, 1164 (1959). Molecular Transport in Liquids and Glasses.
64. D. Turnbull and M. H. Cohen, J. Chem. Phys., 52, 3038 (1970). On the Free-Volume Model of the Liquid-Glass Transition.
65. A. K. Doolittle, J. Appl. Phys., 22, 1471 (1951). Studies in Newtonian Flow. II. The Dependence of the Viscosity of Liquids on Free-Space.
66. P. B. Macedo and T. A. Litovitz, J. Chem. Phys., 42, 245 (1965). On the Relative Roles of Free Volume and Activation Energy in the Viscosity of Liquids.
67. J. H. Hildebrand, Science, 174, 490 (1971). Motions of Molecules in Liquids: Viscosity and Diffusivity.
68. J. H. Dymond, J. Chem. Phys., 60, 969 (1974). Corrected Enskog Theory and the Transport Coefficients of Liquids.
69. H. Liu, C. M. Silva, and E. A. Macedo, Fluid Phase Equilib., 202, 89 (2002). Generalised Free-Volume Theory for Transport Properties and New Trends About the Relationship between Free Volume and Equations of State.
70. S. Sastry, D. S. Corti, P. G. Debenedetti, and F. H. Stillinger, Phys. Rev. E, 56, 5524 (1997). Statistical Geometry of Particle Packings. I. Algorithm for Exact Determination of Connectivity, Volume, and Surface Areas of Void Space in Monodisperse and Polydisperse Sphere Packings.
71. S. Sastry, T. M. Truskett, P. G. Debenedetti, and S. Torquato, Mol. Phys., 95, 289 (1998). Free Volume in the Hard Sphere Liquid.
72. J. C. Conrad, F. W. Starr, and D. A. Weitz, J. Phys. Chem. B, 109, 21235 (2005). Weak Correlations between Local Density and Dynamics near the Glass Transition.
73. R. P. A. Dullens, D. G. A. L. Aarts, and W. K. Kegel, Proc. Nat. Acad. Sci. USA, 103, 529 (2006). Direct Measurement of the Free Energy by Optical Microscopy.
74. P. G. Debenedetti and T. M. Truskett, Fluid Phase Equilib., 158, 549 (1999). The Statistical Geometry of Voids in Liquids.
75. W. P. Krekelberg, V. Ganesan, and T. M. Truskett, J. Phys. Chem. B, 110, 5166 (2006). Free Volumes and the Anomalous Self-Diffusivity of Attractive Colloids.
76. F. W. Starr, S. Sastry, J. F. Douglas, and S. C. Glotzer, Phys. Rev. Lett., 89, 125501 (2002). What Do We Learn from the Local Geometry of Glass-Forming Liquids?
77. W. G. Hoover, W. T. Ashurst, and R. Grover, J. Chem. Phys., 57, 1259 (1972). Exact Dynamical Basis for a Fluctuating Cell Model.
78. D. Ben-Amotz and G. Stell, J. Phys. Chem. B, 108, 6877 (2004). Reformulation of Weeks-Chandler-Andersen Perturbation Theory Directly in Terms of a Hard-Sphere Reference System.
79. R. J. Speedy and H. Reiss, Mol. Phys., 72, 1015 (1991). A Computer-Simulation Study of Cavities in the Hard Disk Fluid and Crystal.
80. M. Tanemura, T. Ogawa, and N. Ogita, J. Comput. Phys., 51, 191 (1983). A New Algorithm for Three-Dimensional Voronoi Tessellation.
81. N. N. Medvedev, A. Geiger, and W. J. Brostow, J. Chem. Phys., 93, 8337 (1990). Distinguishing Liquids from Amorphous Solids: Percolation Analysis on the Voronoi Network.
82. R. J. Speedy, J. Chem. Soc., Faraday Trans. 2, 77, 329 (1981). Cavities and Free-Volume in Hard-Disk and Hard-Sphere Systems.
83. D. S. Corti and R. K. Bowles, Mol. Phys., 96, 1623 (1999). Statistical Geometry of Hard Sphere Systems: Exact Relations for Additive and Non-Additive Mixtures.
84. N. F. Carnahan and K. E. Starling, J. Chem. Phys., 51, 635 (1969). Equation of State for Nonattracting Rigid Spheres.
85. W. P. Krekelberg, V. Ganesan, and T. M. Truskett, J. Chem. Phys., 124, 214502 (2006). Model for the Free-Volume Distributions of Equilibrium Fluids.
86. A. M. Puertas, M. Fuchs, and M. E. Cates, Phys. Rev. Lett., 88, 098301 (2002). Comparative Simulation Study of Colloidal Gels and Glasses.
87. A. M. Puertas, M. Fuchs, and M. E. Cates, Phys. Rev. E, 67, 031406 (2003). Simulation Study of Nonergodicity Transitions: Gelation in Colloidal Systems with Short-Range Attractions.
88. A. M. Puertas, M. Fuchs, and M. E. Cates, J. Chem. Phys., 121, 2813 (2004). Dynamical Heterogeneities Close to a Colloidal Gel.
89. F. Sciortino, Nat. Mater., 1, 145 (2002). One Liquid, Two Glasses.
90. G. Adam and J. H. Gibbs, J. Chem. Phys., 43, 139 (1965). The Temperature Dependence of Cooperative Relaxation Properties in Glass-Forming Liquids.
91. I. Saika-Voivod, P. H. Poole, and F. Sciortino, Nature (London), 412, 514 (2001). Fragile-to-Strong Transition and Polyamorphism in the Energy Landscape of Liquid Silica.
92. A. Scala, F. W. Starr, E. La Nave, F. Sciortino, and H. E. Stanley, Nature (London), 406, 166 (2000). Configurational Entropy and Diffusivity of Supercooled Water.
93. S. Sastry, Nature (London), 409, 164 (2001). The Relationship between Fragility, Configurational Entropy and the Potential Energy Landscape of Glass-Forming Liquids.
94. Y. Gebremichael, M. Vogel, M. N. J. Bergroth, F. W. Starr, and S. C. Glotzer, J. Phys. Chem. B, 109, 15068 (2005). Spatially Heterogeneous Dynamics and the Adam-Gibbs Relation in the Dzugutov Liquid.
95. C. A. Angell, J. Res. Nat. Inst. Stand. Tech., 102, 171 (1997). Entropy and Fragility in Supercooling Liquids.
96. J. H. Magill, J. Chem. Phys., 47, 2802 (1967). Physical Properties of Aromatic Hydrocarbons. III. Test of the Adam-Gibbs Relaxation Model for Glass Formers Based on the Heat-Capacity Data of 1,3,5-Tri-α-Naphthylbenzene.
97. R. Richert and C. A. Angell, J. Chem. Phys., 108, 9016 (1998). Dynamics of Glass-Forming Liquids. V. On the Link between Molecular Dynamics and Configurational Entropy.
98. C. M. Roland, S. Capaccioli, M. Lucchesi, and R. Casalini, J. Chem. Phys., 120, 10640 (2004). Adam-Gibbs Model for the Supercooled Dynamics in the Ortho-Terphenyl Ortho-Phenylphenol Mixture.
99. V. Lubchenko and P. G. Wolynes, J. Chem. Phys., 119, 9088 (2003). Barrier Softening near the Onset of Nonactivated Transport in Supercooled Liquids: Implications for Establishing Detailed Connection between Thermodynamic and Kinetic Anomalies in Supercooled Liquids.
100. U. Mohanty, I. Oppenheim, and C. H. Taubes, Science, 266, 425 (1994). Low-Temperature Relaxation and Entropic Barriers in Supercooled Liquids.
101. S. Sastry, Phys. Rev. Lett., 85, 590 (2000). Liquid Limits: Glass Transition and Liquid-Gas Spinodal Boundaries of Metastable Liquids.
102. C. DeMichele, F. Sciortino, and A. Coniglio, J. Phys.: Cond. Matt., 16, L489 (2004). Scaling in Soft Spheres: Fragility Invariance on The Repulsive Potential Softness.
103. Q. Yan and J. J. de Pablo, J. Chem. Phys., 111, 9509 (1999). Hyper-Parallel Tempering Monte Carlo: Application to the Lennard-Jones Fluid and the Restricted Primitive Model.
104. L. Santen and W. Krauth, Nature (London), 405, 550 (2000). Absence of Thermodynamic Phase Transition in a Model Glass Former.
105. T. S. Grigera and G. Parisi, Phys. Rev. E, 63, 045102 (2001). Fast Monte Carlo Algorithm for Supercooled Soft Spheres.
106. F. Wang and D. P. Landau, Phys. Rev. Lett., 86, 2050 (2001). Efficient, Multiple-Range Random Walk Algorithm to Calculate the Density of States.
107. F. Wang and D. P. Landau, Phys. Rev. E, 64, 056101 (2001). Determining the Density of States for Classical Statistical Models: A Random Walk Algorithm to Produce a Flat Histogram.
108. Q. Yan, T. S. Jain, and J. J. de Pablo, Phys. Rev. Lett., 92, 235701 (2004). Density-of-States Monte Carlo Simulation of a Binary Glass.
109. Q. Yan and J. J. de Pablo, Phys. Rev. Lett., 90, 035701 (2003). Fast Calculation of the Density of States of a Fluid by Monte Carlo Simulations.
110. M. S. Shell, P. G. Debenedetti, and A. Z. Panagiotopoulos, J. Phys. Chem. B, 108, 19748 (2004). Flat Histogram Dynamics and Optimization in Density of States Simulations of Fluids.
111. C. De Michele and F. Sciortino, Phys. Rev. E, 65, 051202 (2002). Equilibration Times in Numerical Simulation of Structural Glasses: Comparing Parallel Tempering and Conventional Molecular Dynamics.
112. Y. Brumer and D. R. Reichman, J. Phys. Chem. B, 108, 6832 (2004). Numerical Investigation of the Entropy Crisis in Model Glass Formers.
113. Y. Rosenfeld and P. Tarazona, Mol. Phys., 95, 141 (1998). Density Functional Theory and the Asymptotic High Density Expansion of the Free Energy of Classical Solids and Fluids.
114. B. Coluzzi, G. Parisi, and P. Verrocchio, Phys. Rev. Lett., 84, 306 (2000). Thermodynamical Liquid-Glass Transition in a Lennard-Jones Binary Mixture.
115. S. Sastry, J. Phys.: Cond. Matt., 12, 6515 (2000). Evaluation of the Configurational Entropy of a Model Liquid from Computer Simulations.
116. F. Sciortino, W. Kob, and P. Tartaglia, Phys. Rev. Lett., 83, 3214 (1999). Inherent Structure Entropy of Supercooled Liquids.
117. D. J. Wales, Energy Landscapes, Cambridge University Press, Cambridge, United Kingdom, 2003.
118. R. H. Byrd, P. H. Lu, J. Nocedal, and C. Y. Zhu, SIAM J. Sci. Comput., 16, 1190 (1995). A Limited Memory Algorithm for Bound Constrained Optimization.
119. C. Chakravarty, P. G. Debenedetti, and F. H. Stillinger, J. Chem. Phys., 123, 206101 (2006). Generating Inherent Structures of Liquids: Comparison of Local Minimization Algorithms.
120. B. W. H. van Beest, G. J. Kramer, and R. A. van Santen, Phys. Rev. Lett., 64, 1955 (1990). Force Fields for Silicas and Aluminophosphates Based on Ab Initio Calculations.
121. A. Saksaengwijit, J. Reinisch, and A. Heuer, Phys. Rev. Lett., 93, 235701 (2004). Origin of the Fragile-to-Strong Crossover in Liquid Silica as Expressed by Its Potential-Energy Landscape.
122. J. Mittal, J. R. Errington, and T. M. Truskett, J. Chem. Phys., 125, 076102 (2006). Relationship between Thermodynamics and Dynamics of Supercooled Liquids.
123. J. M. Smith, H. C. V. Ness, and M. Abbott, Introduction to Chemical Engineering Thermodynamics, McGraw-Hill, New York, 2000.
124. J. Mittal, J. R. Errington, and T. M. Truskett, J. Phys. Chem. B, 110, 18147 (2006). Quantitative Link Between Single-Particle Dynamics and Static Structure of Supercooled Liquids.
125. Y. Rosenfeld, Phys. Rev. A, 15, 2545 (1977). Relation between Transport-Coefficients and Internal Entropy of Simple Systems.
126. Y. Rosenfeld, J. Phys.: Cond. Matt., 11, 5415 (1999). A Quasi-Universal Scaling Law for Atomic Transport in Simple Fluids.
127. Y. Rosenfeld and A. Baram, J. Chem. Phys., 75, 427 (1981). Universal Strong Coupling Equation of State for Inverse Power Potentials.
128. Y. Rosenfeld, E. Nardi, and Z. Zinamon, Phys. Rev. Lett., 75, 2490 (1995). Corresponding States Hard-Sphere Model for the Diffusion-Coefficients of Binary Dense-Plasma Mixtures.
129. M. Dzugutov, Nature, 381, 137 (1996). A Universal Scaling Law for Atomic Diffusion in Condensed Matter.
130. J. Mittal, J. R. Errington, and T. M. Truskett, Phys. Rev. Lett., 96, 177804 (2006). Thermodynamics Predicts How Confinement Modifies Hard-Sphere Dynamics.
CHAPTER 4
The Reactivity of Energetic Materials at Extreme Conditions Laurence E. Fried Chemistry, Materials Science, and Life Sciences Directorate Lawrence Livermore National Laboratory, Livermore, California
INTRODUCTION Energetic materials are unique for having a strong exothermic reactivity, which has made them desirable for both military and commercial applications. Energetic materials are commonly divided into high explosives, propellants, and pyrotechnics. We will focus on high explosive (HE) materials here, although a great deal of commonality exists between the classes of energetic materials. Although the history of HE materials is long, their condensed-phase properties are poorly understood. Understanding the condensed-phase properties of HE materials is important for determining stability and performance. Information regarding HE material properties [such as the physical, chemical, and mechanical behaviors of the constituents in plastic-bonded explosive (PBX) formulations] is necessary for efficiently building the next generation of explosives as the quest for more powerful energetic materials (in terms of energy per volume) moves forward.1 There is a need to better understand the physical, chemical, and mechanical behaviors when modeling HE materials from fundamental theoretical principles. Among the quantities of interest in PBXs, for example, are thermodynamic stabilities, reaction kinetics, equilibrium transport coefficients, mechanical moduli, and interfacial properties between HE materials and the
polymeric binders. These properties are needed (as functions of stress state and temperature) for the development of improved micro-mechanical models,2 which represent the PBX at the level of high explosive grains and polymeric binder.3,4 Improved micro-mechanical models are needed to describe the responses of PBXs to dynamic stress or thermal loading, thus yielding information for use in developing continuum models. Detailed descriptions of the chemical reaction mechanisms of condensed energetic materials at high densities and temperatures are essential for understanding events that occur at the reactive front under combustion or detonation conditions. Under shock conditions, for example, energetic materials undergo rapid heating to a few thousand degrees and are subjected to a compression of hundreds of kilobars,5 which results in almost 30% volume reduction. Complex chemical reactions are thus initiated, in turn releasing large amounts of energy to sustain the detonation process. Clearly, an understanding of the various chemical events at these extreme conditions is essential in order to build predictive material models. Scientific investigations into the reactive process have been undertaken over the past two decades. However, the sub-microsecond time scale of explosive reactions, together with the highly exothermic conditions of an explosion, makes experimental investigation of the decomposition pathways difficult at best. More recently, new computational approaches to investigate condensed-phase reactivity in energetic materials have been developed. Here we focus on two different approaches to condensed-phase reaction modeling: chemical equilibrium methods and atomistic modeling of condensed-phase reactions. These complementary approaches assist in understanding the chemical reactions of high explosives.
Chemical equilibrium modeling uses a highly simplified thermodynamic picture of the reaction process, which leads to a convenient and predictive model of detonation and other decomposition processes. Chemical equilibrium codes are often used in the design of new materials, both at the level of synthesis chemistry and formulation. Atomistic modeling is a rapidly emerging area. The doubling of computational power approximately every 18 months that is predicted by Moore’s law has made atomistic condensed-phase modeling more feasible. Atomistic calculations employ far fewer empirical parameters than chemical equilibrium calculations. Nevertheless, the atomistic modeling of chemical reactions requires an accurate global Born–Oppenheimer potential energy surface. Traditionally, such a surface is constructed by representing the potential energy surface with an analytical fit. This approach is only feasible for simple chemical reactions involving a small number of atoms. More recently, first-principles molecular dynamics, where the electronic Schrödinger equation is solved numerically at each configuration in a molecular dynamics simulation, has become the method of choice for treating complicated chemical reactions.6
CHEMICAL EQUILIBRIUM The energy content of an HE material often determines its practical utility. Accurate estimates of the energy content are essential in the design of new materials1 and for understanding quantitative detonation tests.7 The useful energy content is determined by the anticipated release mechanism. Because detonation events occur on a microsecond time frame, chemical reactions significantly faster than this may be considered to be in an instantaneous chemical equilibrium. It is generally believed that reactions involving the production of small gaseous molecules (CO2, H2O, etc.) are fast enough to be treated in chemical equilibrium for most energetic materials. This belief is based partly on success in modeling a wide range of materials with the assumption of chemical equilibrium.8–12 Unfortunately, direct measurements of chemical species involved in the detonation of a solid or liquid HE material are difficult to perform. Blais, Engelke, and Sheffield13 have measured some of the species produced in detonating nitromethane using a special mass spectroscopic apparatus. These measurements pointed to the importance of condensation reactions in detonation. The authors estimate that the hydrodynamic reaction zone of detonating base-sensitized liquid nitromethane is 50 μm in thickness, with a reaction time of 7 ns. The hydrodynamic reaction zone dictates the point at which the material ceases to release enough energy to drive the detonation wave forward. Reactions may continue to proceed behind the reaction zone, but the time scales for such reactions are harder to estimate. Typical explosive experiments are performed on parts with dimensions on the order of 1–10 cm. In this case, hydrodynamic confinement is expected to last for roughly 1 μs, based on a high-pressure sound speed of several centimeters/microsecond.
Thus, chemical equilibrium is expected to be a valid assumption for nitromethane, based on the time scale separation between the 7-ns reaction zone and the microsecond time scale of confinement. The formation of solids, such as carbon, or the combustion of metallic fuels, such as Al, is believed to yield significantly longer time scales of reaction.14 In this case, chemical equilibrium is a rough, although useful, approximation to the state of matter of a detonating material. Thermodynamic cycles are a useful way to understand energy release mechanisms. Detonation can be thought of as a cycle that transforms the unreacted explosive into stable product molecules at the Chapman–Jouguet (C-J) state,15 which is simply described as the slowest steady-state shock state that conserves mass, momentum, and energy (see Figure 1). Similarly, the deflagration of a propellant converts the unreacted material into product molecules at constant enthalpy and pressure. The nature of the C–J state and other special thermodynamic states important to energetic materials is determined by the equation of state of the stable detonation products.
[Figure 1 image: energy (vertical axis) versus volume (horizontal axis); labeled states: Unreacted, Chapman–Jouguet, Expanded products, Combustion in air.]
Figure 1 A thermodynamic picture of detonation: The unreacted material is compressed by the shock front and reaches the Chapman–Jouguet point. From there adiabatic expansion occurs, which leads to a high-volume state. Finally, detonation products may mix in air and combust.
A purely thermodynamic treatment of detonation ignores the important question of reaction time scales. The finite time scale of reaction leads to strong deviations in detonation velocities from values based on the Chapman–Jouguet theory.16 The kinetics of even simple molecules under high-pressure conditions is not well understood. High-pressure experiments promise to provide insight into chemical reactivity under extreme conditions. For instance, chemical equilibrium analysis of shocked hydrocarbons predicts the formation of condensed carbon and molecular hydrogen.17 Similar mechanisms are at play when detonating energetic materials form condensed carbon.10 Diamond anvil cell experiments have been used to determine the equation of state of methanol under high pressures.18 We can then use a thermodynamic model to estimate the amount of methanol formed under detonation conditions.19 Despite the importance of chemical kinetic rates, chemical equilibrium is often nearly achieved when energetic materials react. As discussed, this is a useful working approximation, although it has not been established through direct measurement. Chemical equilibrium can be reached rapidly under high-temperature (up to 6000 K) conditions produced by detonating energetic materials.20 We begin our discussion by examining thermodynamic cycle theory as applied to high explosive detonation. This is a current research topic because high explosives produce detonation products at extreme pressures and temperatures: up to 40 GPa and 6000 K. These conditions make it extremely difficult to probe chemical speciation. Relatively little is known about the equations of state under these conditions. Nonetheless, shock experiments on a wide range of materials have generated sufficient information to allow reliable thermodynamic modeling to proceed.
One of the attractive features of thermodynamic modeling is that it requires very little information regarding the unreacted energetic material; elemental composition, density, and heat of formation of the material are the only information needed. As elemental composition is known once the material is specified, only density and heat of formation need to be predicted. The C–J detonation theory15 implies that the performance of an explosive is determined by thermodynamic states: the C–J state and the connected expansion region, as illustrated in Figure 1. As detonation processes are so rapid, there is insufficient time for thermal conduction during expansion, which implies that the expansion from the C–J state lies on an adiabat: $dE = -p\,dV$. The adiabatic expansion of the detonation products releases energy in the form of PV work and heat. Subsequent turbulent mixing of the detonation products in air surrounding the energetic material leads to combustion processes that release more energy. Thermochemical codes use thermodynamics to calculate states illustrated in Figures 1 and 2 and, thus, predict explosive performance. The allowed thermodynamic states behind a shock are intersections of the Rayleigh line (expressing conservation of mass and momentum) and the shock Hugoniot (expressing conservation of energy). The C–J theory assumes that a stable detonation occurs when the Rayleigh line is tangent to the shock Hugoniot, as shown in Figure 2. This point of tangency can be determined, assuming that the equation of state $P = P(V, E)$ of the products is known. The chemical composition of the products changes with the thermodynamic state, so thermochemical codes must solve for state variables and chemical concentrations simultaneously. This problem is relatively straightforward, provided that the equations of state (EOS) of the fluid and solid products are known.
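The tangency condition described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the detonation products behave as a polytropic gas with constant exponent gamma and a fixed heat release q per unit mass, so the composition is frozen; real thermochemical codes use far more elaborate product EOSs and solve for the composition at the same time. The C–J state is located as the point on the fully reacted Hugoniot that minimizes the Rayleigh-line speed, which is equivalent to tangency. All function names and numerical inputs are hypothetical.

```python
import math


def hugoniot_pressure(v, v0, q, gamma, p0=0.0):
    """Pressure on the fully reacted Hugoniot at specific volume v.

    Illustrative assumption: polytropic products, E = P*v/(gamma - 1) - q,
    and unreacted energy E0 = p0*v0/(gamma - 1), inserted into the energy
    condition E - E0 = 0.5*(P + p0)*(v0 - v).
    """
    numerator = q + p0 * v0 / (gamma - 1.0) + 0.5 * p0 * (v0 - v)
    denominator = v / (gamma - 1.0) - 0.5 * (v0 - v)
    return numerator / denominator


def shock_speed_squared(v, v0, q, gamma, p0=0.0):
    """D**2 of the Rayleigh line joining (v0, p0) to the Hugoniot point at v."""
    p = hugoniot_pressure(v, v0, q, gamma, p0)
    return v0 ** 2 * (p - p0) / (v0 - v)


def chapman_jouguet(v0, q, gamma, p0=0.0, iterations=100):
    """Locate the minimum-speed (tangent) Rayleigh line by golden-section search."""
    # Bracket the region where the Hugoniot pressure is finite and v < v0.
    a = v0 * (gamma - 1.0) / (gamma + 1.0) * (1.0 + 1e-6)
    b = v0 * (1.0 - 1e-6)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    fc = shock_speed_squared(c, v0, q, gamma, p0)
    fd = shock_speed_squared(d, v0, q, gamma, p0)
    for _ in range(iterations):
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = shock_speed_squared(c, v0, q, gamma, p0)
        else:
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = shock_speed_squared(d, v0, q, gamma, p0)
    v_cj = 0.5 * (a + b)
    p_cj = hugoniot_pressure(v_cj, v0, q, gamma, p0)
    d_cj = math.sqrt(shock_speed_squared(v_cj, v0, q, gamma, p0))
    return v_cj, p_cj, d_cj


# Hypothetical inputs: initial density 1600 kg/m^3, q = 4 MJ/kg, gamma = 3.
v_cj, p_cj, d_cj = chapman_jouguet(v0=1.0 / 1600.0, q=4.0e6, gamma=3.0)
print(f"V_CJ/V0 = {v_cj * 1600.0:.4f}, P_CJ = {p_cj / 1e9:.2f} GPa, D_CJ = {d_cj:.1f} m/s")
```

For this simple model with negligible initial pressure, the numerical result can be checked against the closed-form values $V_{CJ} = \gamma V_0/(\gamma+1)$ and $D^2 = 2(\gamma^2-1)q$. With a realistic EOS no such closed form exists, which is why the search is written numerically.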
Figure 2 Allowed thermodynamic states in detonation are constrained to the shock Hugoniot. Steady-state shock waves follow the Rayleigh line. (Plot of pressure P versus volume V showing the fully reacted Hugoniot, the Rayleigh line, the Chapman–Jouguet point, and the unreacted state.)
The Reactivity of Energetic Materials at Extreme Conditions
One of the most difficult parts of this problem is describing the EOS of the fluid components accurately. Because of its simplicity, the Becker–Kistiakowsky–Wilson (BKW)21 EOS is used in many practical applications involving energetic materials. Numerous parameter sets have been proposed for the BKW EOS.22–25 Kury and Souers7 have critically reviewed these sets by comparing their predictions to a database of detonation tests. They concluded that the BKW EOS does not model the detonation of a copper-lined cylindrical charge adequately. The BKWC parameter set26 overcomes this deficiency partially through multivariate parameterization techniques. However, the BKWC parameter set is not reliable when applied to explosives that are very high in hydrogen content. It has long been recognized that the validity of the BKW EOS is questionable.12 This is particularly important when designing new materials that may have unusual elemental compositions. Efforts to develop better EOSs have been based largely on the concept of model potentials. With model potentials, molecules interact via idealized spherical pair potentials. Statistical mechanics is then employed to calculate the EOS of the interacting mixture of effective spherical particles. Most often, the exponential-6 (exp-6) potential is used for the pair interactions:

$$V(r) = \frac{\varepsilon}{\alpha - 6}\left[6\exp\left(\alpha\left(1 - \frac{r}{r_m}\right)\right) - \alpha\left(\frac{r_m}{r}\right)^{6}\right] \qquad [1]$$

Here, $r$ is the distance between particles, $r_m$ is the position of the minimum of the potential well, $\varepsilon$ is the well depth, and $\alpha$ is the softness of the potential well. The Jacobs–Cowperthwaite–Zwissler (JCZ3) EOS was the first successful model based on a pair potential that was applied to detonation.27 This EOS was based on fitting Monte Carlo simulation data to an analytic functional form. Ross, Ree, and others successfully applied a soft-sphere EOS based on perturbation theory to detonation and shock problems.10,28–30 Computational cost is a significant difficulty with an EOS based on fluid perturbation theory.
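As a minimal sketch, the exp-6 pair potential of Eq. [1] can be written as a short function. The function name and the parameter values in the usage lines are hypothetical and chosen only for illustration; they are not a published parameter set.

```python
import math


def exp6_potential(r, eps, r_m, alpha):
    """Exponential-6 pair potential, Eq. [1].

    r:     separation between the two particles
    eps:   well depth (the result has the same energy units)
    r_m:   separation at which the potential is at its minimum
    alpha: softness of the repulsive wall (must not equal 6)
    """
    prefactor = eps / (alpha - 6.0)
    repulsion = 6.0 * math.exp(alpha * (1.0 - r / r_m))
    attraction = alpha * (r_m / r) ** 6
    return prefactor * (repulsion - attraction)


# Hypothetical parameters: eps in K (energy divided by k_B), r_m in angstroms.
eps, r_m, alpha = 100.0, 4.0, 13.0
for r in (0.8 * r_m, r_m, 2.0 * r_m):
    print(f"r = {r:4.1f} A   V = {exp6_potential(r, eps, r_m, alpha):9.3f} K")
```

At r = r_m the bracket reduces to 6 - alpha, so the potential equals -eps there, which is a quick sanity check on any implementation.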
Byers Brown31 developed an analytic representation of the Kang et al. EOSs using Chebyshev polynomials. The accuracy of the Byers Brown EOS has been evaluated by Charlet et al.;12 these authors concluded that Ross’s approach is the most reliable. Fried and Howard32 have used a combination of integral equation theory and Monte Carlo simulations to generate a highly accurate EOS for the exp-6 fluid. The exp-6 model is not well suited to molecules with large dipole moments. To account for this, Ree9 used a temperature-dependent well depth ε(T) in the exp-6 potential to model polar fluids and fluid phase separations. Fried and Howard have developed an effective cluster model for HF.33 The effective cluster model remains valid at lower temperatures than the variable well-depth model, but it employs two more adjustable parameters than does the latter. Jones et al.34 have applied thermodynamic perturbation theory to
polar detonation-product molecules. Despite these successes, more progress needs to be made in the treatment of polar detonation-product molecules. Efforts have been made to develop EOS for detonation products based on direct Monte Carlo simulations instead of on analytical approaches.35–37 This approach is promising given recent increases in computational capabilities. One of the greatest advantages of direct simulation is the ability to go beyond van der Waals 1-fluid theory, which approximately maps the equation of state of a mixture onto that of a single component fluid.38 In most cases, interactions between unlike molecules (treated as single spherical sites) are treated with Lorentz–Berthelot combination rules.39 These rules determine the interactions between unlike molecules from those between like molecules: the unlike-pair parameters are taken to be arithmetic or geometric averages of the corresponding like-pair parameters. It seems that these rules work well in practice, although they have not been extensively tested through experiment. Highly non-additive pair interactions have been proposed for N2 and O2.30 The resulting N2 model accurately matches double-shock data, but it is not accurate at lower temperatures and densities.32 A combination of experiments on mixtures along with advancements in theory is needed to develop reliable unlike-pair interaction potentials. The exp-6 potential has also proved successful in modeling chemical equilibrium at the high pressures and temperatures characteristic of detonation. However, to calibrate the parameters for such models, it is necessary to have experimental data for product molecules and mixtures of molecular species at high temperature and pressure. Static compression and sound-speed measurements provide important data for these models. Exp-6 potential models can be validated through several independent means.
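The Lorentz–Berthelot combination rules mentioned above can be sketched for exp-6 parameters as follows. The assignment used here is the conventional one (arithmetic mean for the size parameter, geometric mean for the well depth); applying a geometric mean to the softness parameter as well is an assumption of this sketch, and the text does not specify which average is used for which parameter. The class name, function name and numerical values are hypothetical.

```python
import math
from typing import NamedTuple


class Exp6Params(NamedTuple):
    eps: float    # well depth
    r_m: float    # position of the potential minimum
    alpha: float  # softness of the repulsive wall


def lorentz_berthelot(a: Exp6Params, b: Exp6Params) -> Exp6Params:
    """Unlike-pair exp-6 parameters built from the two like-pair parameter sets."""
    return Exp6Params(
        eps=math.sqrt(a.eps * b.eps),        # Berthelot rule: geometric mean
        r_m=0.5 * (a.r_m + b.r_m),           # Lorentz rule: arithmetic mean
        alpha=math.sqrt(a.alpha * b.alpha),  # assumed: geometric mean
    )


# Hypothetical like-pair parameters for two species (not fitted values).
species_1 = Exp6Params(eps=100.0, r_m=4.0, alpha=13.0)
species_2 = Exp6Params(eps=250.0, r_m=3.4, alpha=12.0)
print(lorentz_berthelot(species_1, species_2))
```

By construction the rule is symmetric in its arguments and returns the like-pair parameters when both species are identical. Non-additive models such as the N2 and O2 interactions cited in the text deviate from it.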
Fried and Howard33 have considered the shock Hugoniots of liquids and solids in the ‘‘decomposition regime’’ where thermochemical equilibrium is established. As an example of a typical thermochemical implementation, consider the Cheetah thermochemical code.32 Cheetah is used to predict detonation performance for solid and liquid explosives. Cheetah solves thermodynamic equations between product species to find chemical equilibrium for a given pressure and temperature. From these properties and elementary detonation theory, the detonation velocity and other performance indicators are computed. Thermodynamic equilibrium is found by balancing chemical potentials, where the chemical potentials of condensed species are functions of only pressure and temperature, whereas the potentials of gaseous species also depend on concentrations. To solve for the chemical potentials, it is necessary to know the pressure-volume relations for species that are important products in detonation. It is also necessary to know these relations at the high pressures and temperatures that typically characterize the C–J state. Thus, there is a need for improved high-pressure equations of state for fluids, particularly for molecular fluid mixtures.
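The idea of finding equilibrium by balancing chemical potentials can be illustrated with a deliberately simple case: a single gas-phase reaction A ⇌ 2B in an ideal-gas mixture at fixed pressure and temperature. This is only a sketch under stated assumptions. The ideal-gas chemical potential, the function names and the standard chemical potentials below are hypothetical, and this is not the algorithm or EOS used in Cheetah, which treats many species with a non-ideal fluid EOS.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def mu_ideal(mu0, x, p, temperature, p_ref=1.0e5):
    """Ideal-gas chemical potential: mu0 + R*T*ln(x*p/p_ref)."""
    return mu0 + R * temperature * math.log(x * p / p_ref)


def equilibrium_extent(mu0_a, mu0_b, p, temperature, iterations=100):
    """Extent xi of A -> 2B (starting from 1 mol of A) at which 2*mu_B = mu_A."""

    def imbalance(xi):
        n_total = 1.0 + xi
        x_a = (1.0 - xi) / n_total
        x_b = 2.0 * xi / n_total
        return (2.0 * mu_ideal(mu0_b, x_b, p, temperature)
                - mu_ideal(mu0_a, x_a, p, temperature))

    lo, hi = 1e-12, 1.0 - 1e-12  # imbalance is negative at lo, positive at hi
    for _ in range(iterations):  # bisection on a monotonically increasing function
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical standard chemical potentials (J/mol) at the chosen temperature.
xi = equilibrium_extent(mu0_a=0.0, mu0_b=10000.0, p=1.0e5, temperature=3000.0)
print(f"equilibrium extent of reaction: {xi:.4f}")
```

For this one-reaction case the result can be compared with the closed form xi = sqrt(K / (K + 4 p/p_ref)), where K = exp(-(2 mu0_B - mu0_A) / RT). With many species and condensed phases, the same balance becomes a constrained free energy minimization.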
In addition to the intermolecular potential, there is an intramolecular portion of the Helmholtz free energy. Cheetah uses a polyatomic model to account for this portion including electronic, vibrational, and rotational states. Such a model can be expressed conveniently in terms of the heat of formation, standard entropy, and constant-pressure heat capacity of each species. We now consider how the EOS described above predicts the detonation behavior of condensed explosives. The overdriven shock Hugoniot of an explosive is an appropriate EOS test, because it accesses a wide range of high pressures. Overdriven states lie on the shock Hugoniot at pressures above the C–J point (see Figure 2). The Hugoniot of pentaerythritol tetranitrate (PETN) is shown in Figure 3. Fried, Howard, and Souers40 have calculated the Hugoniot with the exp-6 model and with the JCZS41 product library. Figure 3 shows that the exp-6 model lies within 1% of the measured data for pressures up to 120 GPa (1.2 Mbar). The JCZS model is accurate to within 1% up to a pressure of 90 GPa, but it disagrees with experiment at 120 GPa. As the exp-6 model is not calibrated to condensed explosives, such agreement is a strong indication of the validity of the chemical equilibrium approximation to detonation. Despite the many successes in the thermochemical modeling of energetic materials, several significant limitations exist. One such limitation is that real systems do not always attain chemical equilibrium during the relatively short (nanosecond to microsecond) time scales of detonation. When this occurs, a thermochemical calculation commonly predicts quantities such as the energy of detonation and the detonation velocity to be 10–20% higher than experiment.
Figure 3 The shock Hugoniot of PETN as calculated with exp-6 (solid line) and the JCZS library (dotted line) vs. experiment (error bars).
Chemical kinetic modeling is another way to treat detonation. Several well-developed chemical kinetic mechanisms exist for highly studied materials such as hexahydro-1,3,5-trinitro-1,3,5-s-triazine (RDX) and 1,3,5,7-tetranitro-1,3,5,7-tetraazacyclooctane (HMX).42 Unfortunately, detailed chemical kinetic mechanisms are not available for high-pressure conditions. Some workers have applied simplified chemical kinetics to detonation processes.16 The primary difficulty in high-pressure chemical kinetic models is a lack of experimental data on speciation. First-principles simulations, discussed below, have the potential to provide chemical kinetic information for fast processes. This information could then conceivably be applied to longer time scales and lower temperatures using high-pressure chemical kinetics. Finally, there are several issues to be addressed in determining the EOS of detonation products. Although the exp-6 model is convenient, it does not treat electrostatic interactions adequately. In a condensed phase, effects such as dielectric screening and charge-induced dipoles need to be considered. Also, non-molecular phases are possible under high-pressure and high-temperature conditions. Molecular shape is also neglected in exp-6 models. Although the small size of most detonation product molecules limits the importance of molecular shape, lower temperature conditions could yield long-chain molecules, where molecular shape becomes more important. The possible occurrence of ionized species as detonation products is a further complication that cannot be modeled using the exp-6 representation alone. Recent results on the superionic behavior of water at high pressures (see discussion below) provide compelling evidence for a high-pressure ionization scenario. These results suggest, for example, that polar and ionic species interactions may account for approximately 10% of the C–J pressure of PETN.
In addition, we note that thermochemical calculations of high explosive formulations rich in highly electronegative elements, such as F and Cl, typically have substantially higher errors than calculations performed on formulations containing only the elements H, C, N, and O. The difficulty in modeling the C–J states of these formulations successfully may stem from the neglect of ionic species. Bastea, Glaesemann, and Fried43 have extended the exp-6 free energy approach to include the explicit thermodynamic contributions arising from dipolar and ionic interactions. The main task of their theory involves calculating the Helmholtz free energy per particle, f, of the detonation products. The theory starts with a mixture of molecular species whose short-range interactions are well described by isotropic exp-6 potentials. This mixture includes, for example, all molecules commonly encountered as detonation products, such as N2, H2O, CO2, CO, NH3, and CH4. As documented previously,44 a one-fluid representation of this system, where one replaces the different exp-6 interactions between species by a single interaction depending on both individual interactions and mixture composition, is a very good approximation. Bastea, Glaesemann, and Fried therefore chose this nonpolar and neutral one-component exp-6 fluid to be the reference fluid. If the mixture
components possess no charge or permanent dipole moments, the calculation of the corresponding free energy per particle, designated as $f_{\mathrm{exp\text{-}6}}$, suffices to yield the mixture thermodynamics and all desired detonation properties. This physical model has been used in many thermochemical codes for the calculation of high explosives behavior. It is worth noting that at high detonation pressures and temperatures the behavior of the exp-6 fluid so introduced is dominated by short-range repulsions and is similar to that of a hard repulsive sphere fluid. In fact, the variational theory treatment45 of the exp-6 thermodynamics employs a reference hard sphere system with an effective, optimal diameter $\sigma_{\mathrm{eff}}$ that depends on density and temperature. Bastea, Glaesemann, and Fried pursued this connection to the hard sphere fluid by considering first a fluid of equisized hard spheres of diameter $\sigma$ with dipole moments $\mu$. For this simple model of a polar liquid, Stell et al.46,47 had previously suggested a Padé approximation approach for calculating the free energy $f_d$,

$$f_d = f_0 + \Delta f_d \qquad [2]$$

$$\Delta f_d = \frac{f_2}{1 - f_3/f_2} \qquad [3]$$

where $f_0$ corresponds to the simple hard sphere fluid and $f_2$ and $f_3$ are terms (second and third order, respectively) of the perturbation expansion in the dipole–dipole interaction ($\propto \mu^2$) such that

$$f_d = f_0 + f_2 + f_3 + \ldots \qquad [4]$$

The first order term $f_1$ can be shown to be identically zero, whereas $f_2$ and $f_3$ have been calculated explicitly.46 The resulting thermodynamics can be written in scaled variables as

$$f_d = f_d(\rho^*, \beta_d), \qquad \rho^* = \rho\sigma^3, \qquad \beta_d = \frac{\mu^2}{k_B T \sigma^3} \qquad [5]$$

where $\rho$ is the (number) density and $T$ is the temperature. The same Padé approximation also holds for a mixture of identical hard spheres with different dipole moments $\mu_i$.48,49 We note that within this approximation, it is easy to show that the mixture thermodynamics is equivalent with that of a simple hard spheres polar fluid with an effective dipole moment $\mu$ given by

$$\mu^2 = \sum_i x_i \mu_i^2 \qquad [6]$$

where $x_i = \rho_i/\rho$ is the concentration of particles with dipole moment $\mu_i$.
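The bookkeeping in Eqs. [3], [5], and [6] can be sketched in a few lines. This sketch does not evaluate the perturbation terms f2 and f3 themselves, which come from the cited hard-sphere theory; they are treated as inputs. The function names, the Gaussian-unit constants, the hard-sphere diameter of 3 Å, and the sample mixture are assumptions made for illustration only.

```python
import math

K_B = 1.380649e-16  # Boltzmann constant in erg/K (Gaussian/cgs units assumed)
DEBYE = 1.0e-18     # 1 debye in esu*cm


def effective_dipole(fractions, dipoles):
    """Eq. [6]: effective dipole moment of a mixture, mu^2 = sum_i x_i * mu_i^2."""
    return math.sqrt(sum(x * mu ** 2 for x, mu in zip(fractions, dipoles)))


def reduced_dipole_strength(mu_debye, sigma_cm, temperature):
    """Eq. [5]: beta_d = mu^2 / (k_B * T * sigma^3), dimensionless."""
    mu = mu_debye * DEBYE
    return mu ** 2 / (K_B * temperature * sigma_cm ** 3)


def pade_dipolar_term(f2, f3):
    """Eq. [3]: Pade resummation of the second- and third-order terms (f2 != 0)."""
    return f2 / (1.0 - f3 / f2)


# Hypothetical 50/50 mixture of a nonpolar species and one with a 2.2 D dipole.
mu_eff = effective_dipole([0.5, 0.5], [0.0, 2.2])
beta_d = reduced_dipole_strength(mu_eff, sigma_cm=3.0e-8, temperature=2000.0)
print(f"mu_eff = {mu_eff:.3f} D, beta_d = {beta_d:.3f}")
```

When f3 is small compared with f2, the Padé form reduces to f2 + f3, so it agrees with the truncated series of Eq. [4] through third order while remaining better behaved at stronger dipolar coupling.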
Figure 4 Comparison of pressure results for a model of polar water at T = 2000 K: MD simulations (symbols), newly developed theory for polar fluids (lower line) and exp-6 calculations alone (upper line). (Plot of pressure P [GPa] versus density ρ [g/cc].)
We also adopt the above combination rule (Eq. [6]) for the general case of exp-6 mixtures that include polar species. Moreover, in this case, we calculate the polar free energy contribution $f_d$ using the effective hard sphere diameter $\sigma_{\mathrm{eff}}$ of the variational theory. We show a comparison of this procedure with MD simulation results for an exp-6 model of polar water in Figures 4 and 5. Also shown are the results of
Figure 5 Same as Figure 4 for energy per particle. (Plot of E/Nk_B [10^3 K] versus density ρ [g/cc].)
exp-6 thermodynamics alone. For both the pressure and the energy, the agreement is very good and the dipole moment contribution is sizeable. The thermodynamic theory for exp-6 mixtures of polar materials is now implemented in the thermochemical code Cheetah.32 We considered first the major polar detonation products H2O, NH3, CO, and HF. The optimal exp-6 parameters and dipole moment values for these species were determined by fitting to a variety of available experimental data. We find, for example, that a dipole moment of 2.2 Debye for water reproduces very well all available experiments. Incidentally, this value is in very good agreement with values typically used to model supercritical water.50 A comparison of our Cheetah polar water model predictions with both high-pressure Hugoniot data,51 and low-density (steam at 800 K) experimental data52 is presented in Figure 6. The agreement is very good for both cases. The newly developed equation of state was applied to the calculation of detonation properties. In this context, one stringent test of any equation of state is the prediction of detonation velocities as a function of initial densities, and we chose for this purpose PETN. The Cheetah results are shown in Figure 7 along with the experimental data.53 The agreement is again very good. Advances continue in the treatment of detonation mixtures that include explicit polar and ionic contributions. The new formalism places on a solid footing the modeling of polar species, opens the possibility of realistic multiple fluid phase chemical equilibrium calculations (polar—nonpolar phase segregation), extends the validity domain of the EXP6 library,40 and opens the possibility of applications in a wider regime of pressures and temperatures.
Figure 6 Comparison of theory for polar water: experimental data (Hugoniot, circles; steam at T = 800 K, diamonds) and theory (lines). (Plot of pressure P [GPa] versus density ρ [g/cc].)
Figure 7 PETN detonation velocity as a function of initial density; experiments (symbols) and Cheetah calculation (line). (Plot of detonation velocity D_CJ [km/s] versus initial density ρ0 [g/cc].)
Predictions of high explosive detonation based on the new approach yield excellent results. A similar theory for an ionic species model43 compares very well with MD simulations. Nevertheless, high explosive chemical equilibrium calculations that include ionization are beyond the current abilities of the Cheetah code, because of the presence of multiple minima in the free energy surface. Such calculations will require additional algorithmic developments. In addition, the possibility of partial ionization, suggested by the first-principles simulations of water discussed below, also needs to be added to the Cheetah code framework.
ATOMISTIC MODELING OF CONDENSED-PHASE REACTIONS

Chemical equilibrium methods provide useful predictions of the EOS of detonation processes and the product molecules formed, but no details of the atomistic mechanisms in the detonation are revealed. We now discuss condensed-phase detonation simulations using atomistic modeling techniques to evaluate reaction mechanisms on the microscopic level. Numerous experimental studies have investigated the atomistic details of HE decomposition by examining the net products after thermal (low-pressure) decomposition (see, for example, Ref. 54). For RDX and HMX, the rate limiting reaction is most likely NO2 dissociation and a plethora of final products in the decomposition process have been isolated. Several theoretical studies have also
been reported on the energetics of gas-phase decomposition pathways for HE materials using a variety of methods. For example, we point to work on RDX and HMX where both quantum chemistry42,55–57 and classical simulations of unimolecular dissociation58,59 were used. Gas-phase results provide insight into the reaction pathways for isolated HE molecules; however, the absence of the condensed-phase environment is believed to affect reaction pathways strongly. Some key questions related to condensed-phase decomposition are as follows: (1) How do the temperature and pressure affect the reaction pathways? (2) Are there temperature- or pressure-induced phase transitions that play a role in the reaction pathways that may occur? (3) What happens to the reaction profiles in a shock-induced detonation? These questions can be answered with condensed-phase simulations, but such simulations would require large-scale reactive chemical systems consisting of thousands of atoms. Here we present results of condensed-phase atomistic simulations, which are pushing the envelope toward reaching the required simulation goal. In our group, we are considering whether non-molecular phases of such species could be formed at conditions approaching those of detonation. Condensed-phase explosives typically have C–J pressures in the neighborhood of 20–40 GPa and temperatures between 2500 K and 4000 K. Early in the reaction zone, energetic materials are thought to be cooler but more compressed. The Zeldovich–von Neumann–Döring60–62 (ZND) state is defined by the Hugoniot of the unreacted material, which can be probed by shock experiments carefully designed to avoid HE initiation. Estimates of the temperature at the ZND state are in the neighborhood of 1500 K, whereas pressures as high as 60 GPa are possible. One possible non-molecular phase that may exist is a superionic solid.
Superionic solids are compounds that exhibit exceptionally high ionic conductivity, where one ion type diffuses through a crystalline lattice of the remaining types. In this unique phase of matter, chemical bonds are breaking and reforming rapidly. Since their discovery in 1836, a fundamental understanding of superionic conductors has been one of the major challenges in condensed matter physics.63 In general, it has been difficult to create a simple set of rules governing superionic phases. Studies have been limited mostly to metal-based compounds, such as AgI and PbF2.63 However, the existence of superionic solid phases of hydrogen-bonded compounds had been theorized previously.64,65 Recent experimental and computational results indicate the presence of a high-pressure triple point in the H2O phase diagram,66–68 including a so-called superionic solid phase with fast hydrogen diffusion.68,69 Goldman et al. have described the emergence of symmetric hydrogen bonding in superionic water at 2000 K and 95 GPa.69 In symmetric hydrogen bonding, the intramolecular X–H bond becomes identical to the intermolecular X–H bond, where X is an electronegative element. It has been suggested that for superionic solids a mixed ionic/covalent bonding character stabilizes the mobile ion during the
diffusion process.63 Symmetric hydrogen bonding provides mixed ionic/covalent bonding and thus could be a key factor in superionic diffusion in hydrogen-bonded systems. Because of current limitations in diamond anvil cell techniques, the temperatures and pressures that can be investigated experimentally are too low to probe the role of hydrogen bonding in previously studied hydrides (i.e., H2O and NH3). On the other hand, current shock compression experiments have difficulty resolving transient chemical species. The density profiles of large planets, such as Uranus and Neptune, suggest that a thick layer of “hot ice” exists, which is thought to be 56% H2O, 36% CH4, and 8% NH3.70 This hot ice layer has led to theoretical investigations of the water phase diagram,64 in which Car–Parrinello Molecular Dynamics (CPMD) simulations6 were conducted at temperatures and pressures ranging from 300 K to 7000 K and 30–300 GPa.65 In these molecular dynamics simulations, the electronic degrees of freedom are treated explicitly at each time step, effectively solving the electronic Schrödinger equation at each step. At temperatures above 2000 K and pressures above 30 GPa, a superionic phase was observed in which the oxygen atoms had formed a bcc lattice, and the hydrogen atoms diffused extremely rapidly (ca. $10^{-4}$ cm$^2$/s) via a hopping mechanism between oxygen lattice sites. Experimental results for the ionic conductivity of water at similar state conditions71,72 agree well with the results from Ref. 3, confirming the idea of a superionic phase and indicating a complete atomic ionization of water molecules under extreme conditions (P > 75 GPa; T > 4000 K).72 More recent quantum-based MD simulations were performed at temperatures up to 2000 K and pressures up to 30 GPa.73,74 Under these conditions, it was found that the molecular ions H3O+ and OH− are the major charge carriers in a fluid phase, in contrast to the bcc crystal predicted for the superionic phase.
The fluid high-pressure phase has been confirmed by X-ray diffraction results of water melting at ca. 1000 K and up to 40 GPa of pressure.66,75,76 In addition, extrapolations of the proton diffusion constant of ice into the superionic region were found to be far lower than a commonly used criterion for superionic phases of $10^{-4}$ cm$^2$/s.77 A great need exists for additional work to resolve the apparently conflicting data. The superionic phase has been explored with more extensive CPMD simulations.69 Calculated power spectra (i.e., the vibrational density of states or VDOS) have been compared with measured experimental Raman spectra68 at pressures up to 55 GPa and temperatures of 1500 K. The agreement between theory and experiment was very good. In particular, weakening and broadening of the OH stretch mode at 55 GPa was found both theoretically and experimentally. A summary of our results on the phase diagram of water is shown in Figure 8. We find that the molecular to non-molecular transition in water occurs in the neighborhood of the estimated ZND state of HMX. This transition shows that the detonation of typical energetic materials occurs in the neighborhood of the molecular to non-molecular transition.
Figure 8 The phase diagram of H2O as measured experimentally68 (black solid) and through first principles simulations of the superionic phase (gray dash).68,69 The estimated ZND state of HMX is shown as a square for reference. (Plot of temperature (K) versus pressure (GPa); labeled regions: molecular liquid, superionic phase, ice X, ice VII, and ice VIII.)
For our simulations, we used CPMD v.3.91, with the BLYP exchange-correlation functional,78,79 and Troullier–Martins pseudo-potentials80 for both oxygen and hydrogen. A plane wave cut-off of 120 Ry was employed to ensure convergence of the pressure, although all other properties were observed to converge with a much lower cut-off (85 Ry). The system size was 54 H2O molecules. The temperature was controlled by using Nosé–Hoover thermostats81 for all nuclear degrees of freedom. We chose a conservative value of 200 au for the fictitious electron mass and a time step of 0.048 fs. Initial conditions were generated in two ways: (1) A liquid configuration at 2000 K was compressed from 1.0 g/cc to the desired density in sequential steps of 0.2 g/cc from an equilibrated sample. (2) An ice VII configuration was relaxed at the density of interest and then heated to 2000 K in steps of 300 degrees each, for a duration of 0.5–1 ps. While heating, the temperature was controlled via velocity scaling. We will refer to the first set of simulations as the “L” set and the second as the “S” set. Unless stated otherwise, the results (including the pressures) from the “S” initial configurations are those reported. Once the desired density and/or temperature was achieved, all simulations were equilibrated for a minimum of 2 ps. Data were then collected for 5–10 ps after equilibration. The calculated diffusion constants of hydrogen and oxygen atoms are shown in Figure 9. The inset plot shows the equation of state for this isotherm for both “L” and “S” simulations. The two results are virtually identical up until 2.6 g/cc. At 34 GPa (2.0 g/cc), the hydrogen atom diffusion constant has achieved values associated with superionic conductivity (greater than
Figure 9 Diffusion constants for O and H atoms at 2000 K as a function of density. The lines with circles correspond to hydrogen and the lines with squares to oxygen. The solid lines correspond to a liquid (“L”) initial configuration and the dashed lines to an ice VII (“S”) initial configuration. The inset plot shows the pressure as a function of density at 2000 K, where the triangles correspond to “L” and the Xs to “S.” (Main plot: diffusion constant D [10^-4 cm^2/s] versus density [g/cc]; inset: pressure [GPa] versus density [g/cc].)
$10^{-4}$ cm$^2$/s). The diffusion constant remains relatively constant with increasing density, in qualitative agreement with the experimental results of Chau et al.72 for the ionic conductivity. In contrast, the O diffusion constant drops to zero at 75 GPa (2.6 g/cc) for both “L” and “S” initial configurations. The surprisingly small hysteresis in the fluid to superionic transition allows us to place the transition point between 70 GPa (2.5 g/cc) and 77 GPa (2.6 g/cc). The small hysteresis is most likely caused by the weak O–H bonds at the conditions studied, which have free energy barriers to dissociation comparable with $k_B T$ (see below). Simulations that start from the “L” initial configurations are found to quench to an amorphous solid upon compression to 2.6 g/cc. The transition pressure of 75 GPa is much higher than the 30 GPa predicted earlier.65 This difference is likely caused by the use of a much smaller basis set (70 Ry) by Cavazzoni et al. Our results are also in disagreement with simple extrapolations of the proton diffusion constant to high temperatures.77 Radial distribution functions (RDFs) for the “S” simulations are shown in Figure 10. Analysis of the oxygen–oxygen RDF (not shown) for all pressures yields a coordination number of just over 14 for the first peak, which is consistent with a high-density bcc lattice in which the first two peaks are broadened because of thermal fluctuations. The RDF can be further analyzed by calculating an “average position” RDF in which the position of each oxygen is averaged over the course of the trajectory. The results for
Figure 10 O–H radial distribution function as a function of density at 2000 K. At 34 GPa, we find a fluid state. At 75 GPa, we show a ‘‘covalent’’ solid phase. At 115 GPa, we find a ‘‘network’’ phase with symmetric hydrogen bonding. Graphs are offset by 0.5 for clarity.
75–115 GPa indicate the presence of a bcc lattice undergoing large amplitude vibrations, even though each RDF in Figure 10 has width similar to that of a liquid or a glass. The RDFs for the amorphous phase (not shown) are similar to those of the solid phase obtained in the “S” simulations. The O–O and H–H RDFs (not shown) indicate that no O–O or H–H covalent bonds are formed during the simulations at any density. The $g(R_{OH})$ shows a lattice-like structure at 115 GPa, which is consistent with proton diffusion via a hopping mechanism between lattice sites.65 At 34 GPa, the coordination number for the first peak in $g(R_{OH})$ is 2, which indicates molecular H2O. Between 95 GPa and 115 GPa, however, the coordination number for the first peak in $g(R_{OH})$ becomes four, which indicates that water has formed symmetric hydrogen bonds where each oxygen has four nearest-neighbor hydrogens. Concomitant with the change in the oxygen coordination number is a shift of the first minimum of the O–H RDF from 1.30 Å at 34 GPa to 1.70 Å at 115 GPa. We observe a similar structural change in the H–H RDF in which the first peak lengthens from 1.63 Å (close to the result for ambient conditions) to 1.85 Å. These observations bear a strong resemblance to the ice VII to ice X transition in which the covalent O–H bond distance of ice becomes equivalent to the hydrogen bond distance as pressure is increased.82 However, the superionic phase differs from ice X, in that the position of the first peak in $g(R_{OH})$ is not half the distance of the first O–O peak.82 We analyze the effect
Figure 11 $R_{OH}$ free energy surface at 2000 K. The lines are offset by 4 kcal/mol for clarity.
of the change in $g(R_{OH})$ below in terms of the molecular speciation in the simulations. We determined the free energy barrier for dissociation by defining a free energy surface for the oxygen–hydrogen distances, viz. $W(r) = -k_B T \ln[g(R_{OH})]$, where $W(r)$ is the free energy surface (potential of mean force). The results are shown in Figure 11. The free energy barrier can then be defined as the difference in height between the first minimum and the second maximum in the free energy surface. The free energy barrier is 11 kcal/mol at 34 GPa and 8 kcal/mol at 115 GPa. The remainder of the results discussed below are for the “S” simulations. We now analyze the chemical species prevalent in water at these extreme conditions by defining instantaneous species based on the O–H bond distance. If that distance is less than a cut-off value $r_c$, we count the atom pair as being bonded. Determining all bonds in the system gives the chemical species at each point in time. Species with lifetimes less than an O–H bond vibrational period (10 fs) are “transient” and do not represent bound molecules. The optimal cut-off $r_c$ between bonded and nonbonded species is given by the location of the maximum in the free energy surface.83 Using the free energy maximum to define a bond cut-off provides a clear picture of qualitative trends. As expected from the $g(R_{OH})$, at 34 GPa, the free energy peak is found at 1.30 Å, which is approximately the same value obtained from simulations of ambient water. At 75 GPa, the free energy peak maintains almost the same position but broadens considerably. At 115 GPa, the peak has sharpened once again, and the maximum is now at 1.70 Å.
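The construction just described, a potential of mean force from the O–H radial distribution function followed by a barrier height and a bond cut-off, can be sketched as follows. The g(r) values are synthetic and for illustration only; they are not data from the simulations. The function names are hypothetical, and the extremum search is a naive one that assumes a smooth, noise-free curve.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def potential_of_mean_force(g, temperature):
    """W(r) = -k_B * T * ln g(r), in kcal/mol; bins with g = 0 map to +infinity."""
    return [(-K_B_KCAL * temperature * math.log(gi)) if gi > 0.0 else math.inf
            for gi in g]


def barrier_and_cutoff(r, g, temperature):
    """Return (barrier, r_c): the height from the first minimum of W(r) to the
    maximum that follows it, and the position of that maximum (bond cut-off)."""
    w = potential_of_mean_force(g, temperature)
    # First local minimum of W corresponds to the bonded O-H peak in g(r).
    i_min = next(i for i in range(1, len(w) - 1)
                 if w[i] < w[i - 1] and w[i] <= w[i + 1])
    # The next local maximum of W separates bonded from nonbonded pairs.
    i_max = next(i for i in range(i_min + 1, len(w) - 1)
                 if w[i] > w[i - 1] and w[i] >= w[i + 1])
    return w[i_max] - w[i_min], r[i_max]


# Synthetic O-H radial distribution function on a coarse grid (angstroms).
r = [0.5, 0.7, 0.9, 1.0, 1.15, 1.3, 1.5, 1.7, 1.9]
g = [0.0, 0.0, 0.5, 3.0, 1.0, 0.2, 0.6, 1.0, 1.0]
barrier, r_c = barrier_and_cutoff(r, g, temperature=2000.0)
print(f"barrier = {barrier:.1f} kcal/mol, bond cut-off r_c = {r_c:.2f} A")
```

At 2000 K, k_B T is about 4 kcal/mol, which gives a sense of scale for the barriers quoted in the text. Applying this to real trajectory data would call for a finer grid and some smoothing before the extrema are located.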
Figure 12 Mole fraction of species found at 34–115 GPa and 2000 K. The filled circles correspond to H3O+, whereas the open circles correspond to OH−.
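The species counted in such an analysis follow from the bond cut-off: atoms joined by O–H bonds form a graph, and each connected component is one instantaneous species. The sketch below illustrates that idea only. The function names and the three-molecule test geometry are hypothetical, periodic images are ignored for brevity, and charges are not assigned, so hydronium and hydroxide appear as "H3O" and "HO".

```python
from collections import Counter

import numpy as np


def formula(atoms):
    """Compact formula with H listed before O, e.g. ['O', 'H', 'H'] -> 'H2O'."""
    counts = Counter(atoms)
    return "".join(
        f"{el}{counts[el] if counts[el] > 1 else ''}"
        for el in ("H", "O") if counts[el]
    )


def find_species(symbols, positions, r_c):
    """Group atoms into species using O-H bonds shorter than r_c (no PBC)."""
    positions = np.asarray(positions, dtype=float)
    n = len(symbols)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # Only O-H pairs may bond; O-O and H-H contacts are ignored.
            if {symbols[i], symbols[j]} == {"O", "H"}:
                if np.linalg.norm(positions[i] - positions[j]) < r_c:
                    parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(symbols[i])
    return Counter(formula(atoms) for atoms in groups.values())


# Hypothetical snapshot: one water, one hydronium, one hydroxide (angstroms).
symbols = ["O", "H", "H", "O", "H", "H", "H", "O", "H"]
positions = [
    (0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0),
    (5.0, 0.0, 0.0), (5.98, 0.0, 0.0), (4.5, 0.85, 0.0), (4.5, -0.85, 0.0),
    (10.0, 0.0, 0.0), (10.97, 0.0, 0.0),
]
species = find_species(symbols, positions, r_c=1.30)
```

Lifetimes would additionally require tracking how long each component persists from frame to frame, which this sketch does not attempt.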
Given the above definition of a bond distance, we can analyze species lifetimes. The lifetime of all species is less than 12 fs above 2.6 g/cc, which is roughly the period of an O–H bond vibration (ca. 10 fs). Hence, water does not contain any molecular states above 75 GPa and at 2000 K but instead forms a collection of short-lived “transient” states. The “L” simulations at 2.6 g/cc (77 GPa) and 2000 K yield lifetimes nearly identical to those found in the “S” simulations (within 0.5 fs), which indicates that the amorphous states formed from the “L” simulations are closely related to the superionic bcc crystal states found in the “S” simulations. Species concentrations are shown in Figure 12. At 34 GPa (2.0 g/cc), H2O is the predominant species, with H3O+ and OH− having mole fractions of ca. 5%. In addition, some aggregation has occurred in which neutral and ionic clusters containing up to six oxygens have formed. The concentrations of OH− and H3O+ are low for all densities investigated and nonexistent at 95 and 115 GPa (2.8 and 3.0 g/cc, respectively). The calculated lifetimes for these species are well below 10 fs for the same thermodynamic conditions (less than 8 fs at 34 GPa). At pressures of 95 and 115 GPa, the increase in the O–H bond distance leads to the formation of extensive bond networks (Figure 13). These networks consist entirely of O–H bonds, whereas O–O and H–H bonds were not found to be present at any point. A maximally localized Wannier function analysis84–86 was performed to better analyze the bonding in our simulations. The maximally localized Wannier functions express the quantum wave function in terms of functions localized at centers, rather than as delocalized plane waves. The positions of these centers give us insight into the localization of charge during the
Figure 13 Snapshots of the simulations at 75 GPa (left) and 115 GPa (right). The temperature for both is 2000 K. At 75 GPa, the water molecules are starting to cluster, and at 115 GPa, a well-defined network has been formed. The protons dissociate rapidly and form new clusters (at 75 GPa) or networks of bonds (at 115 GPa).
simulation. We computed the percentage of O–H bonds with a Wannier center along the bond axis. Surprisingly, the results for pressures of 34–75 GPa consistently showed that 85–95% of the O–H bonds are covalent. For 95 GPa and 115 GPa, we find about 50–55% of the bonds are covalent. This result is consistent with symmetric hydrogen bonding, for which the split between ionic and covalent bonds would be 50/50. The above simulations show that the molecular to non-molecular transition in H2O lies just above the operating range of most typical condensed explosives—about 50 GPa. This range presents a considerable challenge for thermochemical calculations, because a simple statistical mechanical treatment of non-molecular phases such as superionic water does not yet exist.
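The test of whether an O–H bond has a Wannier center along its axis can be sketched geometrically. The text does not state the criterion that was actually used, so everything below is an assumption for illustration: a center counts as lying on the bond when its projection falls between the two atoms and its perpendicular distance from the axis is below a hypothetical tolerance `tol`.

```python
import numpy as np


def center_on_bond_axis(o_pos, h_pos, center, tol=0.1):
    """True if `center` lies within `tol` of the O-H line segment.

    `tol` (in the same length units as the coordinates) is a hypothetical
    threshold, not a value taken from the text.
    """
    o, h, c = (np.asarray(v, dtype=float) for v in (o_pos, h_pos, center))
    bond = h - o
    # Fractional position of the projection of the center along the bond.
    t = np.dot(c - o, bond) / np.dot(bond, bond)
    if not 0.0 <= t <= 1.0:
        return False
    perpendicular = np.linalg.norm(c - (o + t * bond))
    return bool(perpendicular <= tol)


def covalent_fraction(bonds, centers, tol=0.1):
    """Fraction of (O, H) position pairs with at least one center on the axis."""
    hits = sum(
        any(center_on_bond_axis(o, h, c, tol) for c in centers)
        for o, h in bonds
    )
    return hits / len(bonds)


# Hypothetical example: one center sits on the first bond and not the second.
bonds = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
         ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
centers = [(0.5, 0.02, 0.0)]
fraction = covalent_fraction(bonds, centers)
```

A production analysis would also need minimum-image handling for bonds that cross the cell boundary, which is omitted here.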
FIRST PRINCIPLES SIMULATIONS OF HIGH EXPLOSIVES

Quantum mechanical methods can now be applied to systems with up to 1000 atoms;87 this capability stems not only from advances in computer technology but also from improvements in algorithms. Recent developments in reactive classical force fields promise to allow the study of significantly larger systems.88 Many approximations can also be made to yield a variety of methods, each of which can address a range of questions based on the inherent accuracy of the method chosen. We now discuss a range of quantum mechanical-based methods that one can use to answer specific questions regarding shock-induced detonation conditions. Atomistic simulations have been performed on condensed-phase HMX, which is a material that is widely used as an ingredient in various explosives and propellants. A molecular solid at standard state, it has four known
polymorphs, of which δ-HMX is believed to be the most highly reactive. In fact, β-HMX often transforms into δ-HMX before reacting violently.89 Manaa et al.20 have conducted quantum-based molecular dynamics simulations of the chemistry of HMX and nitromethane90 under extreme conditions similar to those encountered at the C–J detonation state. They studied the reactivity of dense (1.9 g/cm3) fluid HMX at 3500 K for reaction times up to 55 ps, using the “Self-Consistent Charge Density-Functional Tight-Binding” (SCC-DFTB) method.91 Stable product molecules are formed rapidly (in less than 1 ps) in these simulations. Plots of chemical speciation, however, indicate that a time greater than 100 ps is needed to reach chemical equilibrium. Reactions occur rapidly in these simulations because the system is “preheated” to 3500 K. In a detonation, on the other hand, a temperature close to 3500 K would only be found after stable product molecules had been formed. The initial temperature of unreacted nitromethane, after being shocked, has been estimated to be 1800 K.13 HMX likely has a similar initial temperature. Nonetheless, the simulations of Manaa et al. provide useful insight into the chemistry of dense, hot energetic materials and demonstrate that such simulations are a useful complement to more traditional gas-phase calculations. Numerous experimental characterizations of the decomposition products of condensed-phase HMX exist at low temperatures (i.e., < 1000 K, well below detonation temperature).54,92–100 These studies tend to identify final gas products (such as H2O, N2, H2, CO, and CO2) from the surface burn, and the authors aspire to establish a global decomposition mechanism. Similar experimental observations at detonation conditions (temperatures 2000–5000 K and pressures 10–30 GPa) have not been realized to date, however.
Computer simulations provide the best access to the short time scale processes occurring in these regions of extreme conditions of pressure and temperature.101 In particular, simulations employing many-body potentials102,103 or tight-binding models have emerged as viable computational tools, the latter of which have been demonstrated successfully in studies of shocked hydrocarbons.104,105 Lewis et al.106 calculated four possible decomposition pathways of the α-HMX polymorph: N–NO2 bond dissociation, HONO elimination, C–N bond scission, and concerted ring fission. Based on energetics, it was determined that N–NO2 dissociation was the initial mechanism of decomposition in the gas phase, whereas they proposed HONO elimination and C–N bond scission to be favorable in the condensed phase. The more recent study of Chakraborty et al.,42 using density functional theory (DFT), reported detailed decomposition pathways of β-HMX, which is the stable polymorph at room temperature. It was concluded that consecutive HONO elimination (4 HONO) and subsequent decomposition into HCN, OH, and NO are the most energetically favorable pathways in the gas phase. The results also showed that the formation of CH2O and N2O could occur preferentially from secondary decomposition of methylenenitramine.
The computational approach to simulate the condensed-phase chemical reactivity of HMX employed by Manaa et al.20 is based on the SCC-DFTB scheme.91 This approach is an extension of the standard tight-binding approach in the context of DFT that describes total energies, atomic forces, and charge transfer in a self-consistent manner. The initial conditions of the simulation included six HMX molecules, which correspond to a single unit cell of the δ-phase, with a total of 168 atoms. The density was 1.9 g/cm3 and the temperature 3500 K in the simulations. These thermodynamic quantities place the simulation in the neighborhood of the C–J state of δ-HMX (3800 K, 2.0 g/cm3) as predicted through thermochemical calculations. The closest experimental condition corresponding to this simulation would be a sample of HMX that is suddenly heated under constant volume conditions, such as in a diamond anvil cell. A molecular dynamics simulation of the 168-atom system was conducted at constant volume and constant temperature. Periodic boundary conditions, whereby a particle exiting the supercell on one side is reintroduced on the opposite side with the same velocity, were imposed. Under the simulation conditions, the HMX was found to exist as a highly reactive dense fluid. Important differences exist between the dense fluid (supercritical) phase and the solid phase, which is stable at standard conditions. One difference is that the dense fluid phase cannot accommodate long-lived voids, bubbles, or other static defects, whereas voids, bubbles, and defects are known to be important in initiating the chemistry of solid explosives.107 Instead, numerous fluctuations in the local environment occur within a time scale of tens of femtoseconds (fs) in the dense fluid phase.
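The periodic boundary conditions and the atom count quoted above can be illustrated with a short sketch. It assumes an orthorhombic cell with edge lengths `box`, which is a simplification chosen for the example (the δ-HMX unit cell used in the simulation need not be orthorhombic), and the function names are hypothetical.

```python
import numpy as np

# HMX is C4H8N8O8, so each molecule contributes 28 atoms.
ATOMS_PER_HMX = 4 + 8 + 8 + 8
N_ATOMS = 6 * ATOMS_PER_HMX  # six molecules in the simulation cell


def wrap_positions(positions, box):
    """Map coordinates back into [0, L) along each axis.

    Velocities are left untouched: a particle leaving through one face
    re-enters through the opposite face with the same velocity.
    """
    return np.mod(np.asarray(positions, dtype=float), box)


def minimum_image_distance(r_i, r_j, box):
    """Distance between two atoms using the nearest periodic image."""
    d = np.asarray(r_i, dtype=float) - np.asarray(r_j, dtype=float)
    d -= box * np.round(d / box)
    return float(np.linalg.norm(d))


box = np.array([10.0, 10.0, 10.0])  # hypothetical edge lengths in angstroms
wrapped = wrap_positions([10.5, -0.5, 3.0], box)
dist = minimum_image_distance([0.5, 0.0, 0.0], [9.5, 0.0, 0.0], box)
```

In this example two atoms that appear 9 Å apart inside the cell are only 1 Å apart across the periodic boundary, which is the distance a bond analysis should use.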
The fast reactivity of the dense fluid phase and the short spatial coherence length make it well suited for molecular dynamics study with a finite system for a limited period of time; chemical reactions occurred within 50 fs under the simulation conditions. Stable molecular species such as H2O, N2, CO2, and CO were formed in less than 1 ps. Figure 14 displays the product formation of H2O, N2, CO2, and CO. The concentration C(t) is represented by the actual number of product molecules formed at time t. Each point on the graphs (open circles) represents an average over a 250-fs interval. The number of molecules in the simulation was sufficient to capture clear trends in the chemical composition of the species involved. It is not surprising to find that the rate of H2O formation is much faster than that of N2. Fewer reaction steps are required to produce a triatomic species like water, whereas the formation of N2 involves a much more complicated mechanism.108 Furthermore, the formation of water starts around 0.5 ps and seems to have reached a steady state at 10 ps, with oscillatory behavior of decomposition and formation clearly visible. The formation of N2, on the other hand, starts around 1.5 ps and is still progressing (as the slope of the graph is slightly positive) after 55 ps of simulation time, albeit slowly.
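The 250-fs averaging of the molecule counts can be sketched as a simple block average. The function name, the 50-fs sampling interval, and the sample counts below are hypothetical and serve only to show the binning.

```python
import numpy as np


def block_average(times_fs, counts, window_fs=250.0):
    """Average a molecule-count time series over consecutive windows.

    Returns (window centers, mean count per window); empty windows are skipped.
    """
    times = np.asarray(times_fs, dtype=float)
    counts = np.asarray(counts, dtype=float)
    bins = np.floor((times - times[0]) / window_fs).astype(int)
    n_bins = bins.max() + 1
    sums = np.bincount(bins, weights=counts, minlength=n_bins)
    samples = np.bincount(bins, minlength=n_bins)
    centers = times[0] + (np.arange(n_bins) + 0.5) * window_fs
    keep = samples > 0
    return centers[keep], sums[keep] / samples[keep]


# Hypothetical series: one sample every 50 fs, so five samples per window.
times = np.arange(0.0, 1000.0, 50.0)
counts = np.arange(20)
centers, means = block_average(times, counts)
```

Averaging in this way smooths the rapid formation and decomposition events so that slower trends in C(t) are easier to see.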
Figure 14 Product particle-number formations for H2O, N2, CO2, and CO as a function of time.
Because of the lack of high-pressure experimental reaction rate data for HMX and other explosives with which to compare, we present in Figure 15 a comparison of dominant species formation for decomposing HMX obtained from two entirely different theoretical approaches. The concentration of species at chemical equilibrium can be estimated through thermodynamic calculations with the Cheetah thermochemical code.32,109 The results of the MD simulation compare well with the formation of H2O, N2, and HNCO predicted by Cheetah. The relative concentrations of CO and CO2, however, are reversed, possibly because of the limited time duration of the simulation. Another discrepancy is that Cheetah predicts that carbon in the diamond phase is in equilibrium with the other species at a concentration of 4.9 mol/kg HMX. No condensed carbon was observed in the simulation. Several other products and intermediates with lower concentrations, common to the two methods, have also been identified, including HCN, NH3, N2O, CH3OH, and CH2O. A comparison between
Figure 15 Comparison of relative composition of dominant species found in the MD simulation and in a thermodynamic calculation.
the two vastly different approaches needs to be established using much longer simulation times. Also, the product-molecule set of the thermochemical code needs to be expanded with important species determined from quantum-based simulations. It should also be noted that the accuracy of DFT calculations for chemistry under extreme conditions needs further experimental validation. One expects the Cheetah result, in which more CO2 than CO is formed as a final product, because disproportionation of CO to condensed C + CO2 is energetically favorable. The results displayed in Figure 14 show that at a simulation time of 40 ps the system is still in the second stage of reaction chemistry. At this stage, the CO concentration is rising but has not yet undergone conversion by the water gas shift reaction (CO + H2O → CO2 + H2). Interestingly, this shift occurs around 50 ps in the simulation, when CO2 molecules are being formed while the CO concentration is correspondingly diminished. Although the simulation sheds light on the chemistry of HMX under extreme conditions, some methodological shortcomings need to be overcome in the future. The demanding computational requirements of the quantum-based MD method limit its applicability to short times and high-temperature conditions. For example, the HMX simulations discussed here took over a year of wall clock time. Moreover, the SCC-DFTB method is not as accurate as high-level quantum-based methods. Nonetheless, the SCC-DFTB approach can still be considered a promising direction for future research on the chemistry of energetic materials.
CONCLUSIONS

The ability to model chemical reaction processes in condensed-phase energetic materials at the extreme conditions typified by a detonation is progressing. Chemical equilibrium modeling is a mature technique with some limitations. Progress in this area continues, but it is hampered by a lack of knowledge of condensed-phase reaction mechanisms and rates. A useful theory of the equation of state for ionic and highly polar molecular species needs to be more fully developed. The role of unconventional molecular species in detonation needs to be investigated, and high-pressure chemical kinetics needs to be developed further as a field of study. Atomistic molecular dynamics modeling is computationally intensive and is currently limited in the realm of detonations to picosecond time scales. Nonetheless, this methodology promises to yield the first reliable insights into the condensed-phase processes responsible for high explosive detonation. First principles simulations reveal that the transition to non-molecular phases lies close to the operating range of common explosives such as HMX. Additional work is necessary to extend the time scales involved in atomistic simulations. Alternatively, advanced force fields may offer the ability to model the reactions of energetic materials for periods of many picoseconds. Recent work in implementing thermostat methods appropriate to shocks110,111 may promise to overcome time scale limitations in the non-equilibrium molecular dynamics method itself and allow the reactions of energetic materials to be determined for up to several nanoseconds.
ACKNOWLEDGMENTS

The author is grateful for the contributions of many collaborators to the work reviewed here. Nir Goldman and M. Riad Manaa played a central role in the atomistic simulations. W. Michael Howard, Kurt R. Glaesemann, P. Clark Souers, Peter Vitello, and Sorin Bastea developed many of the thermochemical simulation techniques discussed here. This work was performed under the auspices of the U. S. Department of Energy by the University of California Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.
REFERENCES

1. L. E. Fried, M. R. Manaa, P. F. Pagoria, and R. L. Simpson, Annu. Rev. Mater. Res., 31, 291 (2001). Design and Synthesis of Energetic Materials.
2. T. D. Sewell, R. Menikoff, D. Bedrov, and G. D. Smith, J. Chem. Phys., 119, 7417 (2003). A Molecular Dynamics Simulation Study of Elastic Properties of HMX.
3. I. P. H. Do and D. J. Benson, Int. J. Plasticity, 17, 641 (2001). Micromechanical Modeling of Shock-induced Chemical Reactions in Heterogeneous Multi-Material Powder Mixtures.
4. M. R. Baer, Thermochimica Acta, 384, 351 (2002). Modeling Heterogeneous Energetic Materials at the Mesoscale.
5. Y. B. Zel’dovich and Y. P. Raizer, Physics of Shock Waves and High-Temperature Hydrodynamic Phenomena, Academic Press, New York, 1966.
6. R. Car and M. Parrinello, Phys. Rev. Lett., 55, 2471 (1985). Unified Approach for Molecular Dynamics and Density-Functional Theory.
7. P. C. Souers and J. W. Kury, Propellants, Explosives, Pyrotechnics, 18, 175 (1993). Comparison of Cylinder Data and Code Calculations for Homogeneous Explosives.
8. M. Cowperthwaite and W. H. Zwisler, J. Phys. Chem., 86, 813 (1982). Thermodynamics of Nonideal Heterogeneous Systems and Detonation Products Containing Condensed Al2O3, Al, and C.
9. F. H. Ree, J. Chem. Phys., 84, 5845 (1986). Supercritical Fluid Phase Separations - Implications for Detonation Properties of Condensed Explosives.
10. M. van Thiel and F. H. Ree, J. Appl. Phys., 62, 1761 (1987). Properties of Carbon Clusters in TNT Detonation Products: Graphite-Diamond Transition.
11. W. C. Davis and C. Fauquignon, Journal de Physique IV, 5, 3 (1995). Classical Theory of Detonation.
12. F. Charlet, M. L. Turkel, J. F. Danel, and L. Kazandjian, J. Appl. Phys., 84, 4227 (1998). Evaluation of Various Theoretical Equations of State used in Calculation of Detonation Properties.
13. N. C. Blais, R. Engelke, and S. A. Sheffield, J. Phys. Chem. A, 101, 8285 (1997). Mass Spectroscopic Study of the Chemical Reaction Zone in Detonating Liquid Nitromethane.
14. M. Cowperthwaite, Tenth International Detonation Symposium, Boston, Massachusetts, (1993). Nonideal Detonation in a Composite CHNO Explosive Containing Aluminum.
15. W. Fickett and W. C. Davis, Detonation, University of California Press, Berkeley, California, 1979.
16. W. M. Howard, L. E. Fried, and P. C. Souers, 11th International Symposium on Detonation, Snowmass, Colorado, (1998). Kinetic Modeling of Non-ideal Explosives with Cheetah.
17. F. H. Ree, J. Chem. Phys., 70, 974 (1979). Systematics of High-pressure and High-temperature Behavior of Hydrocarbons.
18. J. M. Zaug, L. E. Fried, E. H. Abramson, D. W. Hansen, J. C. Crowhurst, and W. M. Howard, High-Pressure Research, 23, 229 (2003). Measured Sound Velocities of H2O and CH3OH.
19. J. M. Zaug, E. H. Abramson, D. W. Hansen, L. E. Fried, W. M. Howard, G. S. Lee, and P. F. Pagoria, 12th International Detonation Symposium, San Diego, California, (2002). Experimental EOS and Chemical Studies of High-Pressure Detonation Products and Product Mixtures.
20. M. R. Manaa, L. E. Fried, C. F. Melius, M. Elstner, and T. Frauenheim, J. Phys. Chem. A, 106, 9024 (2002). Decomposition of HMX at Extreme Conditions: A Molecular Dynamics Simulation.
21. G. B. Kistiakowsky and E. B. Wilson, Rep. OSRD-69, Office of Scientific Research and Development, 1941. Report on the Prediction of Detonation Velocities of Solid Explosives.
22. M. Finger, E. Lee, F. H. Helm, B. Hayes, H. Hornig, R. McGuire, M. Kahara, and M. Guidry, Sixth International Symposium on Detonation, 1976, Coronado, California, Office of Naval Research, pp. 710–22. The Effect of Elemental Composition on the Detonation Behavior of Explosives.
23. C. L. Mader, Numerical Modeling of Detonations, University of California Press, Berkeley, California, 1979.
24. S. A. Gubin, V. V. Odintsov, and V. I. Pepekin, Sov. J. Chem. Phys., 3, 1152 (1985). BKW-RR EOS.
25. M. L. Hobbs and M. R. Baer, Tenth International Detonation Symposium, Boston, Massachusetts, (1993). Calibrating the BKW-EOS with a Large Product Species Data Base and Measured C-J Properties.
26. L. E. Fried and P. C. Souers, Propellants, Explosives, Pyrotechnics, 21, 215 (1996). BKWC: An Empirical BKW Parametrization Based on Cylinder Test Data.
27. M. Cowperthwaite and W. H. Zwisler, Sixth International Symposium on Detonation, 1976, Coronado, California, Office of Naval Research, p. 162. The JCZ Equation of State for Detonation Products and their Incorporation into the TIGER Code.
28. M. Ross and F. H. Ree, J. Chem. Phys., 73, 6146 (1980). Repulsive Forces of Simple Molecules and Mixtures at High Density and Temperature.
29. F. H. Ree, J. Chem. Phys., 81, 1251 (1984). A Statistical Mechanical Theory of Chemically Reacting Multiple Phase Mixtures: Application to the Detonation Properties of PETN.
30. M. van Thiel and F. H. Ree, J. Chem. Phys., 104, 5019 (1996). Accurate High-pressure and High-temperature Effective Pair Potentials for the System N2-N and O2-O.
31. W. Byers Brown, J. Chem. Phys., 87, 566 (1987). Analytical Representation of the Excess Thermodynamic Equation of State for Classical Fluid Mixtures of Molecules Interacting with Alpha-exponential-six Pair Potentials up to High Densities.
32. L. E. Fried and W. M. Howard, J. Chem. Phys., 109, 7338 (1998). An Accurate Equation of State for the Exponential-6 Fluid Applied to Dense Supercritical Nitrogen.
33. L. E. Fried and W. M. Howard, J. Chem. Phys., 110, 12023 (1999). The Equation of State of Supercritical HF, HCl, and Reactive Supercritical Mixtures Containing the Elements H, C, F, and Cl.
34. H. D. Jones, Shock Compression of Condensed Matter, 2001, Atlanta, Georgia: AIP, pp. 103–106. Theoretical Equation of State for Water at High Pressures.
35. M. S. Shaw, J. Chem. Phys., 94, 7550 (1991). Monte-Carlo Simulation of Equilibrium Chemical Composition of Molecular Fluid Mixtures in the Natoms PT Ensemble.
36. J. K. Brennan and B. M. Rice, Phys. Rev. E, 66, 021105 (2002). Molecular Simulation of Shocked Materials Using the Reactive Monte Carlo Method.
37. J. K. Brennan, M. Lisal, K. E. Gubbins, and B. M. Rice, Phys. Rev. E, 70, 061103 (2004). Reaction Ensemble Molecular Dynamics: Direct Simulation of the Dynamic Equilibrium Properties of Chemically Reacting Mixtures.
38. T. W. Leland, J. S. Rowlinson, and G. A. Sather, Trans. Faraday Soc., 64, 1447 (1947). Van der Waals 1-Fluid Mixture Model.
39. T. M. Reed and K. E. Gubbins, Statistical Mechanics, McGraw-Hill, New York, 1973.
40. L. E. Fried, W. M. Howard, and P. C. Souers, 12th International Symposium on Detonation, 2002, San Diego, CA, US Naval Research Office. EXP6: A New Equation of State Library for High Pressure Thermochemistry.
41. M. L. Hobbs, M. R. Baer, and B. C. McGee, Propellants, Explosives, Pyrotechnics, 24, 269 (1999). JCZS: An Intermolecular Potential Database for Performing Accurate Detonation and Expansion Calculations.
42. D. Chakraborty, R. P. Muller, S. Dasgupta, and W. A. Goddard III, J. Phys. Chem. A, 105, 1302 (2001). Mechanism for Unimolecular Decomposition of HMX (1,3,5,7-tetranitro-1,3,5,7-tetrazocine), an Ab Initio Study.
43. S. Bastea, K. Glaesemann, and L. E. Fried, 13th International Symposium on Detonation, McLean, Virginia, (2006). Equation of State for High Explosive Detonation Products with Explicit Polar and Ionic Species.
44. F. H. Ree, J. Chem. Phys., 78, 409 (1978). Simple Mixing Rule for Mixtures with Exp-6 Interactions.
45. M. Ross, J. Chem. Phys., 71, 1567 (1979). A High Density Fluid-perturbation Theory Based on an Inverse 12th Power Hard-sphere Reference System.
46. G. Stell, J. C. Rasaiah, and H. Narang, Mol. Phys., 23, 393 (1972). Thermodynamic Perturbation Theory for Simple Polar Fluids. 1.
47. G. S. Rushbrooke, G. Stell, and J. S. Hoye, Mol. Phys., 26, 1199 (1973). Theory of Polar Liquids. I. Dipolar Hard Spheres.
48. K. E. Gubbins and C. H. Twu, Chem. Eng. Sci., 33, 863 (1977). Thermodynamics of Polyatomic Fluid Mixtures. I.
49. C. H. Twu and K. E. Gubbins, Chem. Eng. Sci., 33, 879 (1977). Thermodynamics of Polyatomic Fluid-Mixtures. II.
50. B. Guillot, J. Mol. Liq., 101, 219 (2002). A Reappraisal of What we Have Learned During Three Decades of Computer Simulation.
51. S. P. Marsh, LASL Shock Hugoniot Data, University of California Press, Berkeley, California, 1980.
52. W. Wagner and A. Pruss, J. Phys. Chem. Ref. Data, 31, 387 (2002). The IAPWS Formulation 1995 for the Thermodynamic Properties of Ordinary Water Substance for General and Scientific Use.
53. H. C. Hornig, E. L. Lee, M. Finger, and J. E. Kurly, Proceedings of the 5th International Symposium on Detonation, Office of Naval Research, 1970. Detonation Velocity of PETN.
54. R. Behrens and S. Bulusu, J. Phys. Chem., 95, 5838 (1991). Thermal Decomposition of Energetic Materials. 2. Deuterium Isotope Effects and Isotopic Scrambling in Condensed-Phase Decomposition of Octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine.
55. C. Wu and L. Fried, J. Phys. Chem. A, 101, 8675 (1997). Ab Initio Study of RDX Decomposition Mechanisms.
56. S. Zhang and T. Truong, J. Phys. Chem. A, 105, 2427 (2001). Branching Ratio and Pressure Dependent Rate Constants of Multichannel Unimolecular Decomposition of Gas-phase α-HMX: An Ab Initio Dynamics Study.
57. J. Lewis, K. Glaesemann, K. Van Opdorp, and G. Voth, J. Phys. Chem. A, 104, 11384 (2000). Ab Initio Calculations of Reactive Pathways for α-Octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine (α-HMX).
58. T. Sewell and D. Thompson, J. Phys. Chem., 95, 6228 (1991). Classical Dynamics Study of Unimolecular Dissociation of Hexahydro-1,3,5-trinitro-1,3,5-s-triazine (RDX).
59. C. Chambers and D. Thompson, J. Phys. Chem., 99, 15881 (1995). Further Studies of the Classical Dynamics of the Unimolecular Dissociation of RDX.
60. Y. B. Zel’dovich, Sov. Phys. J. Exp. Theor. Phys., 10, 542 (1940). On the Theory of the Propagation of Detonation in Gaseous Systems.
61. J. Von Neumann, Theory of Detonation Waves, Pergamon Press, 1963.
62. W. Döring, Ann. Phys., 43, 421 (1943). On Detonation Processes in Gases.
63. S. Hull, Rep. Prog. Phys., 67, 1233 (2004). Superionics: Crystal Structures and Conduction Processes.
64. P. Demontis, R. LeSar, and M. L. Klein, Phys. Rev. Lett., 60, 2284 (1988). New High-pressure Phases of Ice.
65. C. Cavazzoni, G. L. Chiarotti, S. Scandolo, E. Tosatti, M. Bernasconi, and M. Parrinello, Science, 283, 44 (1999). Superionic and Metallic States of Water and Ammonia at Giant Planet Conditions.
66. B. Schwager, L. Chudinovskikh, A. Gavriliuk, and R. Boehler, J. Phys.: Condensed Matter, 16, 1177 (2004). Melting Curve of H2O to 90 GPa Measured in a Laser-heated Diamond Cell.
67. J. F. Lin, E. Gregoryanz, V. V. Struzhkin, M. Somayazulu, H. K. Mao, and R. J. Hemley, Geophys. Res. Lett., 32, 11306 (2005). Melting Behavior of H2O at High Pressures and Temperatures.
68. A. F. Goncharov, N. Goldman, L. E. Fried, J. C. Crowhurst, I. F. W. Kuo, C. J. Mundy, and J. M. Zaug, Phys. Rev. Lett., 94, 125508 (2005). Dynamic Ionization of Water Under Extreme Conditions.
69. N. Goldman, L. E. Fried, I. F. W. Kuo, and C. J. Mundy, Phys. Rev. Lett., 94, 217801 (2005). Bonding in the Superionic Phase of Water.
70. W. B. Hubbard, Science, 214, 145 (1981). Interiors of the Giant Planets.
71. W. J. Nellis, N. C. Holmes, A. C. Mitchell, D. C. Hamilton, and M. Nicol, J. Chem. Phys., 107, 9096 (1997). Equation of State and Electrical Conductivity of “Synthetic Uranus,” A Mixture of Water, Ammonia, and Isopropanol, at Shock Pressure up to 200 GPa (2 Mbar).
72. R. Chau, A. C. Mitchell, R. W. Minich, and W. J. Nellis, J. Chem. Phys., 114, 1361 (2001). Electrical Conductivity of Water Compressed Dynamically to Pressures of 70–180 GPa (0.7–1.8 Mbar).
73. E. Schwegler, G. Galli, F. Gygi, and R. Q. Hood, Phys. Rev. Lett., 87, 265501 (2001). Dissociation of Water under Pressure.
74. C. Dellago, P. L. Geissler, D. Chandler, J. Hutter, and M. Parrinello, Phys. Rev. Lett., 89, 199601 (2002). Comment on “Dissociation of Water under Pressure.”
75. M. R. Frank, Y. W. Fei, and J. Z. Hu, Geochimica et Cosmochimica Acta, 68, 2781 (2004). Constraining the Equation of State of Fluid H2O to 80 GPa Using the Melting Curve, Bulk Modulus, and Thermal Expansivity of Ice VII.
76. J. F. Lin, B. Militzer, V. V. Struzhkin, E. Gregoryanz, and R. J. Hemley, J. Chem. Phys., 121, 8423 (2004). High Pressure-temperature Raman Measurements of H2O Melting to 22 GPa and 900 K.
77. E. Katoh, H. Yamawaki, H. Fujihisa, M. Sakashita, and K. Aoki, Science, 295, 1264 (2002). Protonic Diffusion in High-pressure Ice VII.
78. A. D. Becke, Phys. Rev. A, 38, 3098 (1988). Density-Functional Exchange-Energy Approximation with Correct Asymptotic Behavior.
79. C. T. Lee, W. T. Yang, and R. G. Parr, Phys. Rev. B, 37, 785 (1988). Development of the Colle-Salvetti Correlation-Energy Formula into a Functional of the Electron Density.
80. N. Troullier and J. Martins, Phys. Rev. B, 43, 1993 (1991). Efficient Pseudopotentials for Plane-Wave Calculations.
81. S. Nosé, Mol. Phys., 52, 255 (1984). A Unified Formulation of the Constant Temperature Molecular Dynamics Methods.
82. M. Benoit, A. H. Romero, and D. Marx, Phys. Rev. Lett., 89, 145501 (2002). Reassigning Hydrogen Bond Centering in Dense Ice.
83. D. Chandler, J. Chem. Phys., 68, 2959 (1978). Statistical Mechanics of Isomerization Dynamics in Liquids and Transition-state Approximation.
84. G. H. Wannier, Phys. Rev., 52, 191 (1937). The Structure of Electronic Excitation Levels in Insulating Crystals.
85. N. Marzari and D. Vanderbilt, Phys. Rev. B, 56, 12847 (1997). Maximally Localized Generalized Wannier Functions for Composite Energy Bands.
86. P. L. Silvestrelli, Phys. Rev. B, 59, 9703 (1999). Maximally Localized Wannier Functions for Simulations with Supercells of General Symmetry.
87. I. F. W. Kuo and C. J. Mundy, Science, 303, 658 (2004). Ab Initio Molecular Dynamics Study of the Aqueous Liquid-Vapor Interface.
88. A. Strachan, E. M. Kober, A. C. T. van Duin, J. Oxgaard, and W. A. Goddard III, J. Chem. Phys., 122, 054502 (2005). Thermal Decomposition of RDX from Reactive Molecular Dynamics.
89. A. G. Landers and T. B. Brill, J. Phys. Chem., 84, 3573 (1980). Pressure-temperature Dependence of the Beta-delta-polymorph Interconversion in Octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine.
90. M. R. Manaa, E. J. Reed, L. E. Fried, G. Galli, and F. Gygi, J. Chem. Phys., 120, 10145 (2004). Early Chemistry in Hot and Dense Nitromethane: Molecular Dynamics Simulations.
91. M. Elstner, D. Porezag, G. Jungnickel, J. Elsner, M. Haugk, T. Frauenheim, S. Suhai, and G. Seifert, Phys. Rev. B, 58, 7260 (1998). Self-consistent-charge Density-Functional Tight-binding Method for Simulations of Complex Materials Properties.
92. B. Suryanarayana, R. J. Graybush, and J. R. Autera, Chem. Ind. London, 52, 2177 (1967). Thermal Degradation of Secondary Nitramines - A Nitrogen-15 Tracer Study of HMX (1,3,5,7-tetranitro-1,3,5,7-tetraazacyclooctane).
93. S. Bulusu, T. Axenrod, and G. W. A. Milne, Org. Mass. Spectrom., 3, 13 (1970). Electron-impact Fragmentation of Some Secondary Aliphatic Nitramines. Migration of Nitro Group in Heterocyclic Nitramines.
94. C. V. Morgan and R. A. Beyer, Combust. Flame, 36, 99 (1979). Electron-spin-Resonance Studies of HMX Pyrolysis Products.
95. R. A. Fifer, in Progress in Astronautics and Aeronautics, K. K. Kuo, M. Summerfield, Eds., AIAA Inc., New York, 1984, p. 177. Fundamentals of Solid Propellant Combustion.
96. R. Behrens, International Journal of Chemical Kinetics, 22, 159 (1990). Determination of the Rates of Formation of Gaseous Products from the Pyrolysis of Octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine (HMX) by Simultaneous Thermogravimetric Modulated Beam Mass Spectrometry.
97. J. C. Oxley, A. B. Kooh, R. Szekers, and W. Zhang, J. Phys. Chem., 98, 7004 (1994). Mechanisms of Nitramine Thermolysis.
98. T. Brill, P. Gongwer, and G. Williams, J. Phys. Chem., 98, 12242 (1994). Thermal Decomposition of Energetic Materials. 66. Kinetic Compensation Effects in HMX, RDX and NTO.
99. T. Brill, Journal of Propulsion and Power, 4, 740 (1995). Multiphase Chemistry Considerations at the Surface of Burning Nitramine Monopropellants.
100. C.-J. Tang, Y. J. Lee, G. Kudva, and T. A. Litzinger, Combust. Flame, 117, 170 (1999). A Study of the Gas-phase Chemical Structure During CO2 Laser Assisted Combustion of HMX.
101. P. Politzer and S. Boyd, Struct. Chem., 13, 105 (2002). Molecular Dynamics Simulations of Energetic Solids.
102. D. W. Brenner, D. H. Robertson, M. L. Elert, and C. T. White, Phys. Rev. Lett., 70, 2174 (1993). Detonations at Nanometer Resolution using Molecular Dynamics.
103. M. L. Elert, S. V. Zybin, and C. T. White, J. Chem. Phys., 118, 9795 (2003). MD of Shock-induced Chemistry in Small Hydrocarbons.
104. S. R. Bickham, J. D. Kress, and L. A. Collins, J. Chem. Phys., 112, 9695 (2000). Molecular Dynamics Simulations of Shocked Benzene.
105. J. D. Kress, S. R. Bickham, L. A. Collins, B. L. Holian, and S. Goedecker, Phys. Rev. Lett., 83, 3896 (1999). Tight-binding Molecular Dynamics of Shock Waves in Methane.
106. J. Lewis, T. Sewell, R. Evans, and G. Voth, J. Phys. Chem. B, 104, 1009 (2000). Electronic Structure Calculation of the Structures and Energies of the Three Pure Polymorphic Forms of Crystalline HMX.
107. C. M. Tarver, S. K. Chidester, and A. L. Nichols III, J. Phys. Chem., 100, 5794 (1996). Critical Conditions for Impact and Shock-induced Hot Spots in Solid Explosives.
108. C. F. Melius, in Chemistry and Physics of Energetic Materials, D. N. Bulusu, Ed., Kluwer, Dordrecht, The Netherlands, 1990. HMX Decomposition.
109. L. E. Fried and W. M. Howard, Phys. Rev. B, 61, 8734 (2000). Explicit Gibbs Free Energy Equation of State Applied to the Carbon Phase Diagram.
110. E. J. Reed, J. D. Joannopoulos, and L. E. Fried, Phys. Rev. Lett., 90, 235503 (2003). A Method for Tractable Dynamical Studies of Single and Double Shock Compression.
111. J. B. Maillet, M. Mareschal, L. Soulard, R. Ravelo, P. S. Lomdahl, T. C. Germann, and B. L. Holian, Phys. Rev. E, 63, 016121 (2001). Uniaxial Hugoniotstat: A Method for Atomistic Simulations of Shocked Materials.
CHAPTER 5
Magnetic Properties of Atomic Clusters of the Transition Elements
Julio A. Alonso
Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, Valladolid, Spain
Donostia International Physics Center (DIPC), San Sebastián, Spain
Reviews in Computational Chemistry, Volume 25, edited by Kenny B. Lipkowitz and Thomas R. Cundari. Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.

INTRODUCTION

Atomic clusters are aggregates containing from a few to several thousand atoms. Their properties are different from those of the corresponding bulk material because of the sizable fraction of atoms forming the cluster surface. Many differences between clusters and bulk materials originate from the small volume of the potential well confining the electrons in the clusters. In such cases, the electrons of clusters fill discrete levels, instead of having the continuous distribution (bands) characteristic of the solid. How many atoms are required for a cluster to show the properties of the macroscopic material? This important question still lacks a convincing answer. By studying the properties of clusters, scientists expect to obtain information on the early stages of growth of condensed matter and on the evolution of the chemical and physical properties as a function of cluster size. Knowing how the properties of clusters evolve with size may have interesting technological implications. For instance, the melting temperature of small particles decreases linearly as a function of the inverse particle radius 1/R. This decrease affects sintering processes, in which fine powders are compressed and heated until the particles coalesce: lower sintering temperatures will be required for particles with very small radii. Also, given the current trend to nanoscale technologies, the extremely small size of the components will affect their electrical and mechanical stabilities at increased temperatures.

Most studies of the magnetic properties of small atomic clusters have focused on the transition elements. These elements form three series in the Periodic Table that are characterized by the progressive filling of electronic d shells. The d electrons are responsible for many properties of these elements as free atoms and in the bulk phase. In the same way, most properties of clusters of the transition elements, and in particular the magnetic properties, reflect the localized behavior of the d electrons. The objective of this chapter is to review the theoretical work performed in recent years to understand, explain, and predict the magnetism in clusters of the transition elements.

The structure of the chapter is as follows. After introducing a few basic concepts, some key experiments are presented revealing the broad features of the variation of magnetic moments as the cluster size increases. We will see that the average magnetic moment decreases overall in going from the free atom value toward the value for the bulk metal. Models based on a simplified description of the density of electronic states have been introduced to explain this main feature. However, superimposed on this rough decrease of the average magnetic moment is a rich structure, with the magnetic moment oscillating as the cluster size increases. This structure can only be explained using more accurate methods, and the calculations existing in the literature can be classified into one of two groups: tight binding calculations or density functional theory calculations, both of which are summarized before we review several of their applications to clusters of elements of the 3d and 4d series.
BASIC CONCEPTS

The magnetism of small clusters is sensitive to the symmetry of the cluster, the atomic coordination, and the interatomic distances between neighboring atoms. The magnetic moments in clusters of Fe, Co, and Ni can nevertheless be estimated from a simple argument. First consider the free atoms Fe, Co, and Ni, which have eight, nine, and ten outer electrons, respectively, to be distributed among the 3d and 4s shells. Hund's rule requires the spin to be a maximum, and this leads to the following electronic configurations: $3d{\uparrow}^5\,3d{\downarrow}^1\,4s^2$ for Fe, $3d{\uparrow}^5\,3d{\downarrow}^2\,4s^2$ for Co, and $3d{\uparrow}^5\,3d{\downarrow}^3\,4s^2$ for Ni. The up ($3d{\uparrow}$) and down ($3d{\downarrow}$) spin subshells are separated in energy by the exchange interaction. The Fe, Co, and Ni atoms have nonzero spins, and because the spin magnetic moment of an electron is 1 Bohr magneton ($\mu_B$), the atoms have substantial magnetic moments.

When atoms condense to form a cluster or a metal, the overlap between the atomic orbitals of neighboring atoms leads to the formation of bands of electronic levels. The orbitals corresponding to 4s electrons produce a nearly free electron band with a width in the solid of $W = 20$–$30$ eV, whereas the d electrons stay localized on the atomic sites, and consequently the d band width is much smaller, typically 5–10 eV. The crystal potential stabilizes the d and s states by different amounts, because of their different degree of localization. This process, plus partial hybridization between the s and d shells, leads to charge transfer from s to d states, and the occupation number of s electrons for clusters and metals is close to 1 per atom. Assuming that the 3d orbitals are still atomic-like, Hund's rule requires the majority $3d{\uparrow}$ sub-band to be fully occupied with five electrons per atom, whereas the minority $3d{\downarrow}$ sub-band has to be occupied with two, three, and four electrons per atom in Fe, Co, and Ni, respectively. Therefore, the difference in the number of spin ↑ and spin ↓ 3d electrons per atom is $n_d(\uparrow) - n_d(\downarrow) = 3$, 2, and 1 for Fe, Co, and Ni, respectively, and the magnetic moments per atom are $\bar{\mu}(\mathrm{Fe}) = 3\mu_B$, $\bar{\mu}(\mathrm{Co}) = 2\mu_B$, and $\bar{\mu}(\mathrm{Ni}) = 1\mu_B$. These simple estimates are close to the actual magnetic moments of very small clusters. In comparison, the magnetic moments of the bulk metals, $\bar{\mu}_b(\mathrm{Fe}) = 2.2\mu_B$, $\bar{\mu}_b(\mathrm{Co}) = 1.7\mu_B$, and $\bar{\mu}_b(\mathrm{Ni}) = 0.64\mu_B$, are smaller, and their noninteger values are caused by the partial delocalization of the 3d electrons. The exchange interaction between these delocalized electrons (known as itinerant exchange) also contributes to the mutual alignment of the atomic moments.

Experiments have been performed to reveal how the magnetic moments evolve as the number of atoms in the cluster increases.1–5 That evolution is very rich and has unexpected features. The clusters in a molecular beam are free from any interaction with a matrix, so it is possible to measure their intrinsic magnetic properties. The magnetic moment can be determined by an experimental technique similar to that used by Stern and Gerlach to demonstrate the quantization of the components of the angular momentum in the early days of quantum theory.
In this way, experimentalists can investigate the dependence of the cluster's magnetic moment on the cluster size.1,2 The clusters interact with an external inhomogeneous magnetic field B and are deflected from their original trajectory. The deflection l of a cluster moving with a velocity v in a direction perpendicular to the field gradient direction (defined as the z direction) is given by2

$$ l = K \, \frac{M(B)}{m v^2} \, \frac{\partial B}{\partial z} \qquad [1] $$
where m is the cluster mass, $\partial B/\partial z$ is the gradient of the magnetic field, and K is a constant that depends on the geometry of the apparatus. This equation shows that the deflection is proportional to the cluster magnetization M(B), which is the time-averaged projection of the magnetic moment $\mu$ of the particle along the field gradient direction.

When analyzing the experiments, one normally assumes that ferromagnetic clusters are monodomain particles; that is, all magnetic moments of the particle are parallel, aligned in the same direction. In contrast, the case of a macroscopic crystal is more complex. A bulk crystal is formed by magnetic domains. In each domain, the magnetic moments may be aligned, but the direction of the magnetization is not the same in different domains. In the analysis of the experiments, one also usually assumes that the clusters follow super-paramagnetic behavior. Super-paramagnetism is a phenomenon by which magnetic materials may exhibit a behavior similar to paramagnetism at temperatures below the Curie or the Néel temperatures. This effect is observed in small particles. In this case, although the thermal energy is not sufficient to overcome the coupling forces between the magnetic moments of neighboring atoms (that is, the relative orientation of the moments of two neighbor atoms cannot be modified), the energy required to change collectively the direction of the magnetic moment of the entire particle is comparable with the ambient thermal energy. This occurs because the crystalline magnetic anisotropy energy, which is the energy required to change the direction of magnetization of a crystallite, decreases strongly as the crystallite size decreases and is negligible in clusters. In that case, the N atomic magnetic moments of a cluster with N atoms are coupled by the exchange interaction, which gives rise to a large total magnetic moment $\mu_N$ that is free of the cluster lattice. This orientational freedom allows the magnetic moment to align easily with an external magnetic field B, as in a paramagnetic material. For an ensemble of particles in thermodynamic equilibrium at a temperature T in an external magnetic field, the magnetization reduces, in the limit of low field ($\mu_N B \ll k_B T$, where $k_B$ is the Boltzmann constant), to

$$ M(B) = \frac{\mu_N^2 B}{3 k_B T} \qquad [2] $$
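The low-field limit in Eq. [2] can be checked numerically. The sketch below assumes the classical Langevin form $M = \mu_N[\coth x - 1/x]$ with $x = \mu_N B / k_B T$, which is the standard super-paramagnetic expression but is not written out in the text; the function names and the sample cluster (100 atoms carrying $1\mu_B$ each at 78 K) are illustrative assumptions only.

```python
import math

MU_B = 9.2740100783e-24   # Bohr magneton in J/T
K_B = 1.380649e-23        # Boltzmann constant in J/K


def langevin_magnetization(mu_n, field, temperature):
    """Classical super-paramagnetic magnetization (assumed Langevin form), in J/T."""
    x = mu_n * field / (K_B * temperature)
    if abs(x) < 1e-6:
        return mu_n * x / 3.0          # series limit avoids 0/0
    return mu_n * (1.0 / math.tanh(x) - 1.0 / x)


def low_field_magnetization(mu_n, field, temperature):
    """Eq. [2]: M(B) = mu_N^2 B / (3 k_B T)."""
    return mu_n ** 2 * field / (3.0 * K_B * temperature)


def moment_from_magnetization(magnetization, field, temperature):
    """Invert Eq. [2] to recover the total cluster moment mu_N."""
    return math.sqrt(3.0 * K_B * temperature * magnetization / field)


mu_n = 100 * 1.0 * MU_B               # hypothetical 100-atom cluster
for field in (0.01, 0.1, 1.0):        # field strengths in tesla
    exact = langevin_magnetization(mu_n, field, 78.0)
    approx = low_field_magnetization(mu_n, field, 78.0)
    print(f"B = {field:5.2f} T  Langevin / Eq.[2] = {exact / approx:.4f}")
```

The ratio stays close to one while $\mu_N B \ll k_B T$ and falls below one as the field grows, which is where Eq. [2] should no longer be used to extract $\mu_N$ from a measured deflection.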
The average magnetic moment per atom $\bar{\mu} = \mu_N / N$ of a monodomain cluster is analogous to the saturation magnetization $M_s$ of the bulk. However, at zero field, a magnetic monodomain cluster has a nonzero magnetic moment. In contrast, for a multidomain bulk sample, the magnetic moment may be much smaller than $M_s$ because the different magnetic domains are not mutually aligned. Equations [1] and [2] can be used to determine $\mu_N$ in monodomain clusters.

The evolution of the average magnetic moment as a function of cluster size is not smooth. The overall decrease of $\bar{\mu}_N$ toward the bulk value $\bar{\mu}_b$ is caused by the increasing number of nearest neighbors, which is an effect that enhances the itinerant character of the d electrons, that is, the possibility of hopping between neighboring atoms. Surface atoms have a smaller number of neighbors than do bulk atoms. Convergence of the value of $\bar{\mu}_N$ to $\bar{\mu}_b$ is thus achieved when the fraction of surface atoms becomes small. In addition, clusters can have complex structures; i.e., they are not simple fragments of the crystal. These ingredients affect the detailed broadening of the electronic levels that form the d bands. The exchange splitting between ↑ and ↓ d sub-bands, charge transfer from the s to the d band, and s–d hybridization depend sensitively on the cluster size and thus control how $\bar{\mu}_N$ evolves with cluster size.
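The electron-counting estimate given at the start of this section can be written as a few lines of code. This is a minimal sketch of that argument only: it assumes one s electron per atom and atomic-like d sub-bands filled according to Hund's rule, and the function name and keyword arguments are hypothetical.

```python
def localized_moment(valence_electrons, s_electrons=1, d_capacity_per_spin=5):
    """Spin moment per atom (in Bohr magnetons) from Hund's-rule filling of the d sub-bands."""
    d_electrons = valence_electrons - s_electrons
    majority = min(d_electrons, d_capacity_per_spin)   # fill the spin-up sub-band first
    minority = d_electrons - majority                  # the remainder goes spin-down
    return majority - minority


valence = {"Fe": 8, "Co": 9, "Ni": 10}
bulk = {"Fe": 2.2, "Co": 1.7, "Ni": 0.64}              # bulk values quoted in the text
for element, n_valence in valence.items():
    estimate = localized_moment(n_valence)
    print(f"{element}: estimate {estimate} mu_B, bulk {bulk[element]} mu_B")
```

Setting `s_electrons=2` reproduces the free-atom configurations listed earlier, for which the spin count of Fe is 4 rather than 3.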
EXPERIMENTAL STUDIES OF THE DEPENDENCE OF THE MAGNETIC MOMENTS WITH CLUSTER SIZE

The magnetic moments of Fe, Co, and Ni clusters with sizes up to 700 atoms have been measured by Billas et al.1,2 Those measurements were made under conditions where the clusters exhibit super-paramagnetic behavior for low cluster temperatures (vibrational temperature $T_{vib} = 78$ K for Ni and Co clusters and 120 K for Fe clusters). Their results are shown in Figure 1. As expected, the largest magnetic moments occur for the smallest clusters. The magnetic moment per atom $\bar{\mu}$ decreases for increasing cluster size and converges to the bulk value for clusters consisting of a few hundred atoms. This convergence is fastest for Ni clusters. However, in the three cases shown, some oscillations are superimposed onto the global decrease of $\bar{\mu}$. Apsel et al.3 have also performed high-precision measurements of the magnetic moments of nickel clusters with $N = 5$ to 740, and more recently, values have been reported by Knickelbein.4

[Figure 1: three panels plotting the magnetic moment per atom (in $\mu_B$) against cluster size N from 0 to 700, with the bulk value marked in each panel: (a) Ni_N at T = 78 K, (b) Co_N at T = 78 K, (c) Fe_N at T = 120 K.]

Figure 1 Experimental magnetic moments per atom of Ni, Co and Fe clusters with sizes up to 700 atoms. Reproduced with permission from Ref. 2.

Experiments have also been performed for clusters of the 4d and 5d metals, which are nonmagnetic in the bulk.5 Rhodium was the first case of a nonmagnetic metal in which magnetism was observed in clusters. Magnetic moments were observed for Rh clusters with fewer than 60 atoms, but larger clusters are nonmagnetic. Rh_N clusters with about ten atoms have magnetic moments $\bar{\mu} \approx 0.8\,\mu_B$; then $\bar{\mu}$ decays quickly for cluster sizes between $N = 10$ and $N = 20$, showing, however, oscillations and sizable magnetic moments for Rh15, Rh16, and Rh19. This behavior differs from that of Fe, Co, and Ni, where the variation of $\bar{\mu}$ extends over a much wider range of cluster sizes (cf. Figure 1). In contrast to rhodium, clusters of ruthenium and palladium (with $N = 12$ to more than 100) are reported to be nonmagnetic.5
SIMPLE EXPLANATION OF THE DECAY OF THE MAGNETIC MOMENTS WITH CLUSTER SIZE

Simple models can explain the general decay of the magnetic moment as the cluster size increases,6 but they cannot explain the fine details. Neglecting the contribution of the sp electrons and using the Friedel model, in which the d electrons form a rectangular band,7 the local density of electronic states (LDOS) with spin $\sigma$ (that is, ↑ or ↓) at site i can be expressed as8

$$ D_{i\sigma}(\varepsilon) = \frac{5}{W_i} \quad \text{for} \quad -\frac{W_i}{2} < \varepsilon - \varepsilon_{d\sigma} < \frac{W_i}{2} \qquad [3] $$
where $\varepsilon_{d\sigma}$ is the energy of the center of the $\sigma$ spin sub-band and $W_i$ is the local band width (assumed to be equal for ↑ and ↓ spins). The tight binding theory (see the next section) gives a scaling relation8 in which $W_i$ is proportional to the square root of the local atomic coordination number $Z_i$

$$ W_i = W_b \, (Z_i / Z_b)^{1/2} \qquad [4] $$
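where $W_b$ and $Z_b$ are the corresponding quantities for the bulk solid. If the d band splitting $\Delta = |\varepsilon_{d\uparrow} - \varepsilon_{d\downarrow}|$ caused by the exchange interaction is assumed equal to the splitting in the bulk, the local magnetic moment

$$ \mu_i = \int_{-\infty}^{\varepsilon_F} \left( D_{i\uparrow}(\varepsilon) - D_{i\downarrow}(\varepsilon) \right) d\varepsilon \qquad [5] $$

becomes

$$ \mu_i = \begin{cases} (Z_b / Z_i)^{1/2} \, \mu_b & \text{if } Z_i \ge Z_c \\ \mu_{dim} & \text{otherwise} \end{cases} \qquad [6] $$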
Here $\varepsilon_F$ is the Fermi energy, that is, the energy of the highest occupied level; $Z_c$ is a limiting atomic coordination number below which the local magnetic moment of that atom adopts the value $\mu_{dim}$ of the dimer, and one can choose $Z_c = 5$ in Ni.9 The average magnetic moment $\bar{\mu}_N = (1/N) \sum_{i=1}^{N} \mu_i$ depends sensitively on the ratio between the number of surface atoms and the number of bulk-like atoms. Surface atoms have small values of $Z_i$ and large values of $\mu_i$, whereas the internal atoms have $Z_i = Z_b$ and $\mu_i = \mu_b$. In the case of small clusters, most atoms are on the surface and hence $\bar{\mu}$ is large. But as the cluster size increases, the fraction of surface atoms decreases and $\bar{\mu}$ also decreases. Assuming magnetic moments $\mu_s$ for the surface atoms and $\mu_b$ for the bulk atoms, Jensen and Bennemann10 calculated the average magnetic moment of the cluster using Eq. [7]

$$ \bar{\mu} = \mu_b + (\mu_s - \mu_b) \, N^{-1/3} \qquad [7] $$

This formula shows a smooth decrease of $\bar{\mu}$ toward the bulk magnetic moment with increasing N. However, the experimental results graphed in Figure 1 indicate that the variation of $\bar{\mu}$ with N has a more complex, oscillatory behavior. Its explanation requires a detailed consideration of the geometry of the cluster and a better treatment of its electronic structure.
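Equations [6] and [7] are simple enough to evaluate directly. For rectangular sub-bands that are both partially filled, Eq. [5] gives $\mu_i = 5\Delta / W_i$, so the scaling of Eq. [4] leads to the factor $(Z_b/Z_i)^{1/2}$ used below. The sketch is illustrative only: the function names are hypothetical, the bulk moment of Ni and $Z_c = 5$ are taken from the text, and the dimer and surface moments of $1.0\,\mu_B$ are placeholder values, not data from the chapter.

```python
import math


def local_moment(z_i, z_bulk, mu_bulk, mu_dimer, z_c):
    """Eq. [6]: local moment from the local coordination number."""
    if z_i >= z_c:
        return math.sqrt(z_bulk / z_i) * mu_bulk
    return mu_dimer


def average_moment(coordinations, z_bulk, mu_bulk, mu_dimer, z_c):
    """Average of Eq. [6] over all sites of the cluster."""
    moments = [local_moment(z, z_bulk, mu_bulk, mu_dimer, z_c) for z in coordinations]
    return sum(moments) / len(moments)


def jensen_bennemann(n_atoms, mu_bulk, mu_surface):
    """Eq. [7]: smooth interpolation between surface and bulk moments."""
    return mu_bulk + (mu_surface - mu_bulk) * n_atoms ** (-1.0 / 3.0)


# 13-atom icosahedron: 12 surface atoms with Z = 6 and one central atom with Z = 12.
icosahedron = [6] * 12 + [12]
print(average_moment(icosahedron, z_bulk=12, mu_bulk=0.64, mu_dimer=1.0, z_c=5))
for n_atoms in (13, 55, 147, 700):
    print(n_atoms, round(jensen_bennemann(n_atoms, mu_bulk=0.64, mu_surface=1.0), 3))
```

Both functions decrease monotonically toward the bulk value, which is why neither can reproduce the oscillations seen in Figure 1.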
TIGHT BINDING METHOD

Tight Binding Approximation for the d Electrons

The orbitals of the d states in clusters of the 3d, 4d, and 5d transition elements (or in the bulk metals) are fairly localized on the atoms as compared with the sp valence states of comparable energy. Consequently, the d states are not much perturbed by the cluster potential, and the d orbitals of one atom do not strongly overlap with the d orbitals of other atoms. Intraatomic d–d correlations tend to give a fixed integral number of d electrons in each atomic d shell. However, the small interatomic d–d overlap terms and s–d hybridization induce intraatomic charge fluctuations in each d shell. In fact, a d orbital contribution to the conductivity of the metals and to the low temperature electronic specific heat is obtained only by starting with an extended description of the d electrons.7

The partially filled d band of the transition metals, or the d states in clusters, are described well by the tight binding (TB) approximation11 using a linear combination of atomic d orbitals. The basic concepts of the method are as follows: (1) The lattice potential V (or the cluster potential) is written as a sum of atomic potentials $V_i$ centered on the lattice sites i. (2) The electronic states in the cluster (or in the solid metal) are expressed as a linear combination of atomic states (LCAO)

$$ |\psi(\varepsilon)\rangle = \sum_{i,m} a_{im} \, |im\rangle \qquad [8] $$
The sum in m goes from 1 to 5, because there are five different atomic d orbitals $\phi_m$. In the usual notation, these orbitals are labeled $d_{xy}$, $d_{xz}$, $d_{zy}$, $d_{x^2-y^2}$, and $d_{3z^2-r^2}$. The normalized atomic orbitals are eigenfunctions of the atomic Hamiltonian $T + V_i$ with energy $\varepsilon_0$. As a first approximation, the overlap integrals of the atomic orbitals across neighboring sites can be neglected. (3) Of the matrix elements $\langle im | V_l | jm' \rangle$, only the two-center integrals between first or second nearest neighbors are retained. The coefficients $a_{im}$ then satisfy the set of coupled linear equations

$$ (\varepsilon_0 + \alpha_{im} - \varepsilon) \, a_{im} + \sum_{j \neq i,\, m'} t_{im}^{jm'} \, a_{jm'} = 0 \qquad [9] $$

where

$$ \alpha_{im} = \Big\langle im \Big| \sum_{j \neq i} V_j \Big| im \Big\rangle = \sum_{j \neq i} \int \phi_{im}^{*}(\vec{r} - \vec{R}_i) \, V_j(\vec{r} - \vec{R}_j) \, \phi_{im}(\vec{r} - \vec{R}_i) \, d^3r \qquad [10] $$

$$ t_{im}^{jm'} = \langle im | V_j | jm' \rangle = \int \phi_{im}^{*}(\vec{r} - \vec{R}_i) \, V_j(\vec{r} - \vec{R}_j) \, \phi_{jm'}(\vec{r} - \vec{R}_j) \, d^3r \qquad [11] $$
The $\alpha_{im}$ integrals shift the energy of the reference atomic levels $\varepsilon_0$, and the $t_{im}^{jm'}$ integrals mix them into molecular states. From the set of Eqs. [9], one arrives at a $5N \times 5N$ secular determinant from which the electronic levels of the cluster can be obtained.12 The 5N atomic d states $|im\rangle$ give rise to 5N electronic levels distributed between the two extremes $\varepsilon_a$ and $\varepsilon_b$. The lowest level, with energy $\varepsilon_b$, corresponds to the formation of d–d bonds between most pairs of atoms. In the bonding states, the electron density increases along the bonds, compared with the simple superposition of the electron densities of the free atoms. In going from $\varepsilon_b$ to $\varepsilon_a$, the number of antibonds increases. At $\varepsilon_a$, antibonds have been created between most pairs of atoms (in antibonding states, the electron density between atoms decreases compared with the superposition of densities of the free atoms). The energy broadening can be viewed as resulting from a resonance between the atomic d levels, allowing the electrons to hop from atom to atom through the cluster (or through the lattice in the solid). From experience acquired in metals, it is known that the d band width $W = \varepsilon_a - \varepsilon_b$ is larger than the shift S. The shift is the energy difference between the atomic level $\varepsilon_0$ and the average band level $(\varepsilon_a + \varepsilon_b)/2$. Typical values in metals are $W = 5$–$10$ eV and $S = 1$–$2$ eV.

Atomistic simulations usually require the calculation of the total energy of the system. The band energy of the solid or cluster is evaluated by integrating the density of electronic states $D(\varepsilon)$

$$ E_{band} = \int^{\varepsilon_F} \varepsilon \, D(\varepsilon) \, d\varepsilon \qquad [12] $$

The part of the energy not included in $E_{band}$ can be modeled by pairwise repulsive interactions

$$ E_{rep} = \sum_{i \neq j} U_{ij} \qquad [13] $$
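As a concrete but deliberately reduced illustration of the secular problem behind Eqs. [9] and [12], the sketch below treats an open linear chain with a single orbital per site rather than five d orbitals, so the Hamiltonian is tridiagonal instead of $5N \times 5N$. The on-site energies, the hopping value, the level solver (a Sturm-sequence bisection), and all function names are assumptions made for this example and are not taken from the chapter. The chain must contain at least two sites.

```python
def count_levels_below(onsite, hopping, energy):
    """Sturm count: number of eigenvalues of the tridiagonal chain Hamiltonian below `energy`."""
    count = 0
    q = onsite[0] - energy
    if q < 0.0:
        count += 1
    for k in range(1, len(onsite)):
        if q == 0.0:
            q = 1e-300                      # avoid division by zero
        q = onsite[k] - energy - hopping[k - 1] ** 2 / q
        if q < 0.0:
            count += 1
    return count


def chain_levels(onsite, hopping, tol=1e-12):
    """All electronic levels of an open chain, found by bisection on the Sturm count."""
    reach = 2.0 * max(abs(t) for t in hopping)
    lo_bound, hi_bound = min(onsite) - reach, max(onsite) + reach   # Gershgorin bounds
    levels = []
    for k in range(len(onsite)):
        lo, hi = lo_bound, hi_bound
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if count_levels_below(onsite, hopping, mid) > k:
                hi = mid                    # level k lies below mid
            else:
                lo = mid
        levels.append(0.5 * (lo + hi))
    return levels


def band_energy(levels, n_electrons):
    """Discrete analogue of Eq. [12]: sum of occupied levels, two electrons per level."""
    energy, remaining = 0.0, n_electrons
    for level in sorted(levels):
        occupied = min(2, remaining)
        energy += occupied * level
        remaining -= occupied
        if remaining == 0:
            break
    return energy


levels = chain_levels([0.0] * 4, [-1.0] * 3)   # four sites, hopping of -1 eV (illustrative)
print(levels)
print(band_energy(levels, 4))
```

The levels spread symmetrically about the on-site energy, with the bonding combinations at the bottom and the antibonding ones at the top, which mirrors the picture of $\varepsilon_b$ and $\varepsilon_a$ described above.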
Introduction of s and p Electrons

Elements like carbon or silicon have only s and p electrons in the valence shell. Of course, s and p electrons are also present in the transition elements. Their contribution to the electronic and magnetic properties of the transition metal clusters is discussed later in this chapter. In particular, their effect can be taken into account in the tight binding method by a simple extension of the ideas presented above. For this purpose, the basis of atomic orbitals has to be extended by adding s and p orbitals. The new basis thus contains s, $p_x$, $p_y$, $p_z$ atomic orbitals in addition to $d_{xy}$, $d_{xz}$, $d_{zy}$, $d_{x^2-y^2}$, and $d_{3z^2-r^2}$ orbitals. It was also pointed out in the previous section that the overlap integrals of the atomic orbitals across neighboring sites can be neglected as a first approximation. To overcome this limitation, an often applied improvement is to substitute the original basis set of atomic orbitals with an orthogonalized basis. This orthogonalization can be performed using the method introduced by Löwdin.13 The orthogonalized orbitals

$$ \psi_{i\alpha} = \sum_{i'\alpha'} \left( S^{-1/2} \right)_{i\alpha}^{i'\alpha'} \phi_{i'\alpha'} \qquad [14] $$

preserve the symmetry properties of the original set. S in Eq. [14] is the overlap matrix $S_{i\alpha}^{i'\alpha'} = \langle \phi_{i\alpha} | \phi_{i'\alpha'} \rangle$, and the index $\alpha$ indicates the different atomic orbitals ($\alpha$ generalizes the index m used above). Consequently, the integrals in Eqs. [10] and [11] now become integrals between Löwdin orbitals. A key approximation that makes the TB calculations simple and practical is to replace the $\alpha$ and t integrals of Eqs. [10] and [11] by parameters depending only on the interatomic distance $|\vec{R}_i - \vec{R}_j|$ and the symmetry of the orbitals involved.
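For two normalized orbitals with overlap s, the matrix $S^{-1/2}$ of Eq. [14] has a closed form, which makes a small check possible without any linear-algebra library. The sketch below is an assumption-laden toy (a single pair of orbitals, an illustrative overlap of 0.2, hypothetical function names); it only verifies that the transformed orbitals come out orthonormal.

```python
def lowdin_2x2(overlap):
    """Return S^(-1/2) for the overlap matrix [[1, s], [s, 1]] of two normalized orbitals."""
    if not -1.0 < overlap < 1.0:
        raise ValueError("overlap must satisfy |s| < 1")
    plus = (1.0 + overlap) ** -0.5     # from eigenvalue 1 + s, eigenvector (1, 1)/sqrt(2)
    minus = (1.0 - overlap) ** -0.5    # from eigenvalue 1 - s, eigenvector (1, -1)/sqrt(2)
    diag = 0.5 * (plus + minus)
    off = 0.5 * (plus - minus)
    return [[diag, off], [off, diag]]


def matmul(a, b):
    """Plain matrix product for small lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


s = 0.2                                     # illustrative overlap between neighboring orbitals
S = [[1.0, s], [s, 1.0]]
X = lowdin_2x2(s)
print(matmul(matmul(X, S), X))              # overlap matrix of the orthogonalized orbitals
```

Because the transformation matrix is symmetric and treats both orbitals identically, the orthogonalized orbitals keep the symmetry of the original pair, as stated for the general case.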
Formulation of the Tight Binding Method in the Notation of Second Quantization

Recent applications of the TB method to transition metal clusters have often made use of a convenient formulation in the language of second quantization.14 In this formalism, the TB Hamiltonian in the unrestricted Hartree–Fock approximation can be written as a sum of diagonal and nondiagonal terms15

$$ H = \sum_{i,\alpha,\sigma} \varepsilon_{i\alpha\sigma} \, \hat{n}_{i\alpha\sigma} + \sum_{\substack{\alpha,\beta,\sigma \\ i \neq j}} t_{i\alpha}^{j\beta} \, \hat{c}^{+}_{i\alpha\sigma} \hat{c}_{j\beta\sigma} \qquad [15] $$
In this expression, $\hat{c}^{+}_{i\alpha\sigma}$ is an operator representing the creation of an electron with spin $\sigma$ ($\sigma = \uparrow$ or $\downarrow$) and orbital state $\alpha$ at site i, $\hat{c}_{i\alpha\sigma}$ is the corresponding annihilation operator, and the operator $\hat{n}_{i\alpha\sigma} = \hat{c}^{+}_{i\alpha\sigma} \hat{c}_{i\alpha\sigma}$ appearing in the diagonal terms is the so-called number operator. As indicated, $\alpha\,(\beta) \equiv s, p_x, p_y, p_z, d_{xy}, d_{xz}, d_{yz}, d_{x^2-y^2}, d_{3z^2-r^2}$. The hopping integrals $t_{i\alpha}^{j\beta}$ between orbitals $\alpha$ and $\beta$ at neighbor atomic sites i and j are assumed to be independent of spin and are usually fitted to reproduce the first-principles band structure of the bulk metal at the observed lattice constant. The variation of the hopping integrals with the interatomic distance $R_{ij} = |\vec{R}_i - \vec{R}_j|$ is often assumed to follow a power law $(R_0 / R_{ij})^{l + l' + 1}$, where $R_0$ is the equilibrium nearest-neighbor distance in the bulk solid and $l$ and $l'$ are the angular momenta of the two orbitals, $\alpha$ and $\beta$, involved in the hopping.16 An exponential decay is sometimes used instead of the power law.

The spin-dependent diagonal terms $\varepsilon_{i\alpha\sigma}$ contain all of the many-body contributions. In a mean field approximation, these environment-dependent energy levels can be written as

$$ \varepsilon_{i\alpha\sigma} = \varepsilon^{0}_{i\alpha} + \sum_{\beta\sigma'} U_{\alpha\sigma\beta\sigma'} \, \Delta n_{i\beta\sigma'} + \sum_{j \neq i} \frac{e^2}{R_{ij}} \, \Delta n_j + Z_i \, \Omega_\alpha \qquad [16] $$
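where $\varepsilon^{0}_{i\alpha}$ are the reference orbital energies. These energies could be the atomic levels, but most often the bare orbital energies of the paramagnetic bulk metal are taken as reference energies. The second term gives the shifts of the energies caused by intraatomic Coulomb interactions. The intraatomic Coulomb integrals

$$ U_{\alpha\sigma\beta\sigma'} = \iint \psi^{*}_{i\alpha\sigma}(\vec{r}) \, \psi_{i\alpha\sigma}(\vec{r}) \, \frac{1}{|\vec{r} - \vec{r}\,'|} \, \psi^{*}_{i\beta\sigma'}(\vec{r}\,') \, \psi_{i\beta\sigma'}(\vec{r}\,') \, d^3r \, d^3r' \qquad [17] $$

give the electrostatic interaction between two charge clouds corresponding to the orbitals $|i\alpha\sigma\rangle$ and $|i\beta\sigma'\rangle$ on the same atom. The quantity $\Delta n_{i\beta\sigma} = n_{i\beta\sigma} - n^{0}_{i\beta\sigma}$, where $n_{i\beta\sigma} = \langle \hat{n}_{i\beta\sigma} \rangle$ is the average occupation of the spin-orbital $|i\beta\sigma\rangle$ and $n^{0}_{i\beta\sigma}$ is the corresponding occupation in the paramagnetic solution for the bulk metal. The intraatomic Coulomb integrals $U_{\alpha\sigma\beta\sigma'}$ can be expressed in terms of two more convenient quantities, the effective direct integrals $U_{\alpha\beta} = (U_{\alpha\uparrow\beta\downarrow} + U_{\alpha\uparrow\beta\uparrow})/2$ and the exchange integrals $J_{\alpha\beta} = U_{\alpha\uparrow\beta\downarrow} - U_{\alpha\uparrow\beta\uparrow}$. Then, the intraatomic term of Eq. [16] splits into two contributions

$$ \sum_{\beta\sigma'} U_{\alpha\sigma\beta\sigma'} \, \Delta n_{i\beta\sigma'} = \sum_{\beta} U_{\alpha\beta} \, \Delta n_{i\beta} + z_\sigma \sum_{\beta} \frac{J_{\alpha\beta}}{2} \, \mu_{i\beta} \qquad [18] $$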
where $\Delta n_{i\beta} = \Delta n_{i\beta\uparrow} + \Delta n_{i\beta\downarrow}$, $\mu_{i\beta} = n_{i\beta\uparrow} - n_{i\beta\downarrow}$, and $z_\sigma$ is the sign function ($z_\uparrow = -1$; $z_\downarrow = +1$). The term $U_{\alpha\uparrow\beta\downarrow}$ refers to the Coulomb interaction between electrons with opposite spin and $U_{\alpha\uparrow\beta\uparrow}$ to the interaction between electrons with the same spin. The first contribution in Eq. [18] arises from the change in electronic occupation of the orbital $|i\beta\rangle$ and the second contribution from the change of the magnetization (spin polarization) of that orbital. Because of the different local environments of the atoms in a cluster, charge transfer between nonequivalent sites can occur. However, the direct Coulomb repulsion tends to suppress charge redistribution between atoms and to establish approximate local charge neutrality (i.e., $\Delta n$ is small).

The direct and exchange integrals, $U_{\alpha\beta}$ and $J_{\alpha\beta}$, are usually parametrized. The difference between s–s, s–p, and p–p direct Coulomb integrals is often neglected by writing $U_{ss} = U_{sp} = U_{pp}$, and it is assumed that $U_{sd} = U_{pd}$. The ratio $U_{ss} : U_{sd} : U_{dd}$ of the magnitudes of $U_{ss}$, $U_{sd}$, and $U_{dd}$ can be taken from Hartree–Fock calculations for atoms. The absolute value of $U_{dd}$ can be estimated by another independent method, for instance, from atomic spectroscopic data.17,18 Typical values for the $U_{ss} : U_{sd} : U_{dd}$ ratios are 0.32 : 0.42 : 1 for Fe, with $U_{dd} = 5.40$ eV.15 The direct Coulomb integral between d electrons, $U_{dd}$, dominates $U_{ss}$ and $U_{sd}$. The magnetic properties of clusters are not very sensitive to the precise value of $U_{dd}$ because the charge transfer $\Delta n$ is typically small. In most cases, all exchange integrals involving s and p electrons are neglected and the d exchange integral $J_{dd}$ is determined in order to reproduce the bulk magnetic moment. Typical J values for Cr, Fe, Ni, and Co are between 0.5 and 1.0 eV.15,17,18

The third term in Eq. [16] represents the Coulomb shifts resulting from electronic charge transfer between the atoms. The quantity $\Delta n_j = n_j - n^{0}_j$, where $n_j = \sum_{\beta} \left( \langle \hat{n}_{j\beta\uparrow} \rangle + \langle \hat{n}_{j\beta\downarrow} \rangle \right)$ is the total electronic charge on atom j and $n^{0}_j$ is the bulk reference value. In Eq. [16], the interatomic Coulomb integrals

$$ V_{i\alpha\sigma j\beta\sigma'} = \iint \psi^{*}_{i\alpha\sigma}(\vec{r}) \, \psi_{i\alpha\sigma}(\vec{r}) \, \frac{1}{|\vec{r} - \vec{r}\,'|} \, \psi^{*}_{j\beta\sigma'}(\vec{r}\,') \, \psi_{j\beta\sigma'}(\vec{r}\,') \, d^3r \, d^3r' \qquad [19] $$
have been approximated as $V_{ij} = e^2 / R_{ij}$. Finally, the last term in Eq. [16] takes into account the energy level corrections arising from nonorthogonality effects15,19 and from the crystal field potential of the neighboring atoms,8 which are approximately proportional to the local atomic coordination number $Z_i$. The constants $\Omega_\alpha$ ($\alpha = s, p, d$) are orbital-dependent and can be obtained from the difference between the bare energy levels (that is, excluding Coulomb shifts) of the isolated atom and the bulk. These constants can have different signs for s–p orbitals as compared with d orbitals. For instance, Vega et al.15 obtained $\Omega_s = 0.31$ eV, $\Omega_p = 0.48$ eV, and $\Omega_d = -0.10$ eV for Fe, which means that the repulsive overlap effects dominate the orbital shifts for s and p electrons, whereas the dominant contribution for the more localized d electrons is the negative crystal field shift. One can also model, through this term, effects on the energy levels arising from changes in the bond length associated with a lowering of the coordination number.8,15

The spin-dependent local electronic occupations $\langle \hat{n}_{i\alpha\sigma} \rangle$ and the local magnetic moments $\mu_i = \sum_{\alpha} \left( \langle \hat{n}_{i\alpha\uparrow} \rangle - \langle \hat{n}_{i\alpha\downarrow} \rangle \right)$ are self-consistently determined from the local (and orbital-dependent) density of states $D_{i\alpha\sigma}(\varepsilon)$

$$ \langle \hat{n}_{i\alpha\sigma} \rangle = \int_{-\infty}^{\varepsilon_F} D_{i\alpha\sigma}(\varepsilon) \, d\varepsilon \qquad [20] $$
The Fermi energy is determined from the condition of global charge neutrality. In this way, the local magnetic moments and the average magnetic moment $\bar{\mu} = \left( \sum_i \mu_i \right) / N$ are obtained at the end of the self-consistency cycle. The local density of states can be calculated at each iteration step during the calculation from the imaginary part of the local Green's function

$$ D_{i\alpha\sigma}(\varepsilon) = -\frac{1}{\pi} \, \mathrm{Im} \, G_{i\alpha\sigma, i\alpha\sigma}(\varepsilon) \qquad [21] $$

and the local Green's function $G_{i\alpha\sigma, i\alpha\sigma}(\varepsilon)$ can be determined efficiently from the moments of the local density of states,20 as indicated in the Appendix.

The tight binding framework discussed here is general, although the specific calculations may incorporate some differences or simplifications with respect to the basic method. For instance, Guevara et al.21 have pointed out the importance of the electron spillover through the cluster surface. These researchers incorporated this effect by adding extra orbitals with s symmetry outside the surface. This development will be considered later in some detail.
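The self-consistency cycle described above can be illustrated with a drastically simplified model. The sketch below is an assumption, not the scheme used in the cited calculations: it keeps a single site with d electrons only, replaces the Green's function density of states by the rectangular band of Eq. [3], retains only the exchange shift $z_\sigma (J/2)\mu$ from Eq. [18], and fixes the Fermi energy by the electron count as in Eq. [20]. The function names and the parameter values are illustrative.

```python
def spin_occupations(moment, n_d, width, exchange_j):
    """Fill two rectangular d sub-bands whose centers are shifted by -/+ J*moment/2."""
    centers = {"up": -0.5 * exchange_j * moment, "down": +0.5 * exchange_j * moment}

    def filling(center, fermi):
        # Rectangular band: 5 states per spin spread uniformly over `width`.
        fraction = (fermi - center + 0.5 * width) / width
        return 5.0 * min(1.0, max(0.0, fraction))

    lo = min(centers.values()) - 0.5 * width      # every state empty
    hi = max(centers.values()) + 0.5 * width      # every state full
    for _ in range(200):                          # bisection for the Fermi energy
        fermi = 0.5 * (lo + hi)
        total = filling(centers["up"], fermi) + filling(centers["down"], fermi)
        if total < n_d:
            lo = fermi
        else:
            hi = fermi
    return filling(centers["up"], fermi), filling(centers["down"], fermi)


def self_consistent_moment(n_d, width, exchange_j, seed=0.1, max_iter=500, tol=1e-10):
    """Iterate moment -> level shifts -> occupations -> moment until it stops changing."""
    moment = seed
    for _ in range(max_iter):
        n_up, n_down = spin_occupations(moment, n_d, width, exchange_j)
        new_moment = n_up - n_down
        converged = abs(new_moment - moment) < tol
        moment = new_moment
        if converged:
            break
    return moment


print(self_consistent_moment(n_d=7.0, width=4.0, exchange_j=1.0))    # narrow band
print(self_consistent_moment(n_d=7.0, width=10.0, exchange_j=1.0))   # wide band
```

In this toy model a small seed moment is amplified by the factor $5J/W$ at each iteration while both sub-bands are partially filled, so a moment survives only when the band is narrow enough; real calculations use the full orbital-resolved density of states and the remaining terms of Eq. [16].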
SPIN-DENSITY FUNCTIONAL THEORY

General Density Functional Theory

The basic variable in density functional theory (DFT)22 is the electron density $n(\vec{r})$. In the usual implementation of DFT, the density is calculated from the occupied single-particle wave functions $\psi_i(\vec{r})$ of an auxiliary system of noninteracting electrons

$$ n(\vec{r}) = \sum_i \theta(\varepsilon_F - \varepsilon_i) \, |\psi_i(\vec{r})|^2 \qquad [22] $$

and the orbitals $\psi_i(\vec{r})$ are obtained by solving the Kohn–Sham equations23

$$ \left[ -\frac{\nabla^2}{2} + V_{KS}(\vec{r}) \right] \psi_i(\vec{r}) = \varepsilon_i \, \psi_i(\vec{r}) \qquad [23] $$
written in atomic units. The symbol $\theta$ in Eq. [22] is the step function, which ensures that all orbitals with energies $\varepsilon_i$ below the Fermi level $\varepsilon_F$ are occupied and all orbitals with energies above $\varepsilon_F$ are empty. The Fermi level is determined by the normalization condition

$$ \int n(\vec{r}) \, d^3r = N \qquad [24] $$

where N is the number of electrons. The effective Kohn–Sham potential $V_{KS}(\vec{r})$ appearing in Eq. [23] is the sum of several distinct contributions:

$$ V_{KS}(\vec{r}) = V_{ext}(\vec{r}) + V_H(\vec{r}) + V_{xc}(\vec{r}) \qquad [25] $$
The external potential $V_{ext}(\vec{r})$ contains the nuclear or ionic contributions and possible external field contributions. The Hartree term $V_H(\vec{r})$ is the classic electrostatic potential of the electronic cloud

$$ V_H(\vec{r}) = \int \frac{n(\vec{r}\,')}{|\vec{r} - \vec{r}\,'|} \, d^3r' \qquad [26] $$
and $V_{xc}(\vec{r})$ is the exchange–correlation potential. Exchange effects between the electrons originate from the antisymmetric character of the many-electron wave function of a system of identical fermionic particles: two electrons cannot occupy the same single-particle state (characterized by orbital and spin quantum numbers) simultaneously. This effect has the consequence of building a hole, usually called the Fermi hole, around an electron that excludes the presence of other electrons of the same spin orientation (up ↑ or down ↓, in the usual notation for the z component). Additionally, there are Coulombic correlations between the instantaneous positions of the electrons because these are charged particles that repel each other. Because of this repulsion, two electrons cannot approach one another too closely, independent of their spin orientation. The combined effect of the Fermi and Coulomb correlations can be described as an exchange–correlation hole built around each electron. In practice $V_{xc}(\vec{r})$ is calculated, using its definition in DFT, as the functional derivative of an exchange–correlation energy functional $E_{xc}[n]$,

$$ V_{xc}(\vec{r}) = \frac{\delta E_{xc}[n]}{\delta n(\vec{r})} \qquad [27] $$
The local density approximation (LDA)24 is often used to calculate $E_{xc}[n]$ and $V_{xc}(\vec r)$. The LDA uses as input the exchange–correlation energy of an electron gas of constant density. In a homogeneous system the exchange energy per particle is known exactly and it has the expression
$$\varepsilon_x(n_0) = -\frac{3}{4}\left(\frac{3}{\pi}\right)^{1/3} n_0^{1/3}$$ [28]
where $n_0$ is the constant density of the system. The exchange energy of an inhomogeneous system with density $n(\vec r)$ is then approximated by assuming that Eq. [28] is valid locally; that is,
$$E_x[n] = -\frac{3}{4}\left(\frac{3}{\pi}\right)^{1/3}\int n(\vec r)^{4/3}\,d^3r$$ [29]
Performing the functional derivative of Eq. [29] (see Eq. [27]) leads to
$$V_x^{LDA}(\vec r) = -\left(\frac{3}{\pi}\right)^{1/3} n(\vec r)^{1/3}$$ [30]
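As a quick numerical check of Eqs. [28]–[30], the short Python sketch below evaluates the LDA exchange energy per particle and the exchange potential in atomic units, and compares Eq. [30] with a finite-difference derivative of the energy density $n\,\varepsilon_x(n)$. The function names, the step size, and the sample densities are illustrative assumptions, not part of the original text.

```python
import math

# Prefactor of Eq. [28] in atomic units (hartree).
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def eps_x(n):
    """Exchange energy per particle of a homogeneous electron gas, Eq. [28]."""
    return -C_X * n ** (1.0 / 3.0)


def v_x_lda(n):
    """LDA exchange potential evaluated at the local density n, Eq. [30]."""
    return -((3.0 / math.pi) ** (1.0 / 3.0)) * n ** (1.0 / 3.0)


def v_x_numeric(n, rel_step=1e-4):
    """Central finite difference of the energy density n * eps_x(n)."""
    h = rel_step * n
    return ((n + h) * eps_x(n + h) - (n - h) * eps_x(n - h)) / (2.0 * h)


if __name__ == "__main__":
    for n in (0.01, 0.1, 1.0):  # sample densities in electrons per bohr^3
        print(n, v_x_lda(n), v_x_numeric(n))
```

For any density the potential equals 4/3 of the energy per particle, which is a convenient sanity check on an implementation.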
An exact expression for the correlation energy per particle $\varepsilon_c(n_0)$ of a homogeneous electron gas does not exist, but good approximations are available.24 Also, nearly exact correlation energies have been obtained numerically for different densities25 and the results have been parametrized as useful functions $\varepsilon_c(n)$.26 The corresponding LDA correlation potential
$$V_c^{LDA}(\vec r) = \left.\frac{d\,(n\,\varepsilon_c(n))}{dn}\right|_{n=n(\vec r)}$$ [31]
is then immediately obtained. In summary, in the LDA, $V_{xc}^{LDA}(\vec r)$ at point $\vec r$ in space is assumed to be equal to the exchange–correlation potential in a homogeneous electron gas with "constant" density $n = n(\vec r)$, precisely equal to the local density $n(\vec r)$ at that point. The LDA works in practice better than expected, and this success is rooted in the fulfillment of several formal properties of the exact $E_{xc}[n]$ and in subtle error cancellations. Substantial improvements have been obtained with the generalized gradient approximations (GGAs)
$$E_{xc}^{GGA}[n] = \int f_{xc}^{GGA}(n(\vec r), \nabla n(\vec r))\,d^3r$$ [32]
which include $\nabla n(\vec r)$, or even higher order gradients of the electron density, in the exchange–correlation energy functional.27–29
Spin Polarization in Density Functional Theory
Some generalization is required when the external potential $V_{ext}$ is spin-dependent (for instance, when there is an external magnetic field), or if one wants to take into account relativistic corrections such as the spin-orbit term. Von Barth and Hedin30 formulated DFT for spin-polarized cases. The basic variable in this case is the $2 \times 2$ spin-density matrix $\rho_{\alpha\beta}(\vec r)$, defined as
$$\rho_{\alpha\beta}(\vec r) = N\int d\vec x_2 \ldots d\vec x_N\, \Psi(\vec r\alpha, \vec x_2, \ldots, \vec x_N)\,\Psi^{*}(\vec r\beta, \vec x_2, \ldots, \vec x_N)$$ [33]
where the notation $\vec x$ includes both the position $\vec r$ and the spin variable, $\alpha = +1/2$ or $\alpha = -1/2$. The $2 \times 2$ matrix $\rho_{\alpha\beta}(\vec r)$ is then Hermitian and defined at each point $\vec r$. The spinless density is the trace of this density matrix
$$n(\vec r) = \mathrm{Tr}\,\rho_{\alpha\beta}(\vec r) = n_{+}(\vec r) + n_{-}(\vec r)$$ [34]
where $n_{+}(\vec r) = \rho_{++}(\vec r)$ and $n_{-}(\vec r) = \rho_{--}(\vec r)$ are the diagonal terms. To quantify the magnetic effects, one can define the magnetization density vector $\mathbf m(\vec r)$ such that
$$\rho_{\alpha\beta}(\vec r) = \frac{1}{2}\,n(\vec r)\,I + \frac{1}{2}\,\mathbf m(\vec r)\cdot\boldsymbol{\sigma}$$ [35]
where $I$ is the $2 \times 2$ unit matrix and $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$, with $\sigma_x$, $\sigma_y$, and $\sigma_z$ being the $2 \times 2$ Pauli spin matrices. Consequently, $n(\vec r)$ and $\mathbf m(\vec r)$ form an alternative representation of $\rho_{\alpha\beta}(\vec r)$. The one-particle representation is now based on two-component spinors (when adding spin to the spatial orbitals, the two components, + and −, of the spinor correspond to the two projections of the spin, up ↑ and down ↓, along a quantization axis)
$$\vec\psi_i(\vec r) = \begin{pmatrix}\psi_{i+}(\vec r)\\ \psi_{i-}(\vec r)\end{pmatrix}$$ [36]
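The purpose of spin-polarized DFT is again to describe the system (molecule, cluster, ...) with an auxiliary noninteracting system of one-particle spinors $\{\vec\psi_1, \ldots, \vec\psi_N\}$. The ground state density matrix of this noninteracting system
$$\rho_{\alpha\beta}(\vec r) = \sum_i \theta(\varepsilon_F - \varepsilon_i)\,\psi_{i\alpha}(\vec r)\,\psi^{*}_{i\beta}(\vec r)$$ [37]
should be equal to that of the interacting system. In terms of the one-particle spin-orbitals
$$n(\vec r) = \sum_i \theta(\varepsilon_F - \varepsilon_i)\,\vec\psi_i^{\,\dagger}(\vec r)\,\vec\psi_i(\vec r)$$ [38]
$$\mathbf m(\vec r) = \sum_i \theta(\varepsilon_F - \varepsilon_i)\,\vec\psi_i^{\,\dagger}(\vec r)\,\boldsymbol{\sigma}\,\vec\psi_i(\vec r)$$ [39]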
Spin-dependent operators are now introduced. The external potential can be an operator $\hat V_{ext}$ acting on the two-component spinors. The exchange–correlation potential is defined as in Eq. [27], although $E_{xc}$ is now a functional $E_{xc} = E_{xc}[\rho_{\alpha\beta}]$ of the spin-density matrix. The exchange–correlation potential is then
$$V_{xc}^{\alpha\beta}(\vec r) = \frac{\delta E_{xc}[\rho_{\alpha\beta}]}{\delta\rho_{\alpha\beta}(\vec r)}$$ [40]
This potential is often written in terms of a fictitious exchange–correlation magnetic field $\mathbf B_{xc}$
$$V_{xc}^{\alpha\beta}(\vec r) = \frac{\delta E_{xc}}{\delta n(\vec r)}\,I - \mathbf B_{xc}(\vec r)\cdot\boldsymbol{\sigma}$$ [41]
$$\mathbf B_{xc}(\vec r) = -\frac{\delta E_{xc}}{\delta \mathbf m(\vec r)}$$ [42]
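The Kohn–Sham Hamiltonian is now
$$\hat H_{KS}[\rho_{\alpha\beta}] = \hat T + \hat V_{ext} + \hat V_H[n]\,I + \hat V_{xc}[\rho_{\alpha\beta}]$$ [43]
and the corresponding Kohn–Sham equations become
$$\hat H_{KS}[\rho_{\alpha\beta}]\,|\vec\psi_j\rangle = \varepsilon_j\,|\vec\psi_j\rangle$$ [44]
that is,
$$\sum_\beta\left[-\frac{\nabla^2}{2}\,\delta_{\alpha\beta} + V_{KS}^{\alpha\beta}(\vec r)\right]\psi_{i\beta}(\vec r) = \varepsilon_i\,\psi_{i\alpha}(\vec r)$$ [45]
where the Kohn–Sham effective potential is now
$$V_{KS}^{\alpha\beta}(\vec r) = V_{ext}^{\alpha\beta}(\vec r) + \delta_{\alpha\beta}\int\frac{n(\vec r\,')}{|\vec r - \vec r\,'|}\,d^3r' + V_{xc}^{\alpha\beta}(\vec r)$$ [46]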
In most cases of interest, the spin density is collinear; that is, the direction of the magnetization density $\mathbf m(\vec r)$ is the same over the space occupied by the system; it is customary to identify this as the z-direction. The Hamiltonian is then diagonal if the external potential is diagonal, which allows one to decouple the spin ↑ and spin ↓ components of the spinors and to obtain two sets of equations. This method is known as a spin-polarized calculation. The degree of spin polarization is defined as $\xi = (n_{+} - n_{-})/n$, which ranges from 0 to 1. When $\xi = 1$, we have a fully polarized system, and when $\xi = 0$, the system is unpolarized. This approach is adequate to treat ferromagnetic or antiferromagnetic order, found in some solids.31,32 In both ferromagnetic and antiferromagnetic ordering, the spin magnetic moments are aligned along a common axis, but in the ferromagnetic order, all moments point in the same direction (↑↑↑ ...), whereas in antiferromagnetically ordered solids, a spin ↑ magnetic moment at a given lattice site is surrounded at neighbor lattice sites by spin moments ↓ pointing in the opposite direction, and vice versa.
Local Spin-Density Approximation (LSDA)
As in the non-spin-polarized case, the main problem with the spin-polarized method comes from our limited knowledge of the exchange–correlation energy functional $E_{xc}[\rho_{\alpha\beta}]$, which is not known in general. However, $E_{xc}[\rho_{\alpha\beta}]$ is well known for a homogeneous gas of interacting electrons that is fully spin-polarized, i.e., $n_{+}(\vec r) = n$, $n_{-}(\vec r) = 0$ (and, of course, for a nonpolarized homogeneous electron gas, with $n_{+}(\vec r) = n_{-}(\vec r) = n/2$; see above). As a result, von Barth and Hedin30 proposed an interpolation formula for the exchange–correlation energy per electron in a partially polarized electron gas (the z-axis is taken as the spin quantization direction)
$$\varepsilon_{xc}(n, \xi) = \varepsilon_{xc}^{P}(n) + [\varepsilon_{xc}^{F}(n) - \varepsilon_{xc}^{P}(n)]\,f(\xi)$$ [47]
where the function $f(\xi)$ gives the exact spin dependence of the exchange energy
$$f(\xi) = \frac{1}{2}\,(2^{1/3} - 1)^{-1}\,\{(1 + \xi)^{4/3} + (1 - \xi)^{4/3} - 2\}$$ [48]
In Eq. [47], $\varepsilon_{xc}^{P}(n)$ and $\varepsilon_{xc}^{F}(n)$ are the exchange–correlation energy densities for the nonpolarized (paramagnetic) and fully polarized (ferromagnetic) homogeneous electron gas. The form of both $\varepsilon_{xc}^{F}(n)$ and $\varepsilon_{xc}^{P}(n)$ has been conveniently parameterized by von Barth and Hedin. Other interpolations have also been proposed24,33 for $\varepsilon_{xc}(n, \xi)$. The results for the homogeneous electron gas can be used to construct an LSDA
$$E_{xc}[\rho_{\alpha\beta}] = \int n(\vec r)\,\varepsilon_{xc}(n(\vec r), \xi(\vec r))\,d^3r$$ [49]
$$V_{xc}^{\alpha}(\vec r) = \frac{\delta E_{xc}[\rho_{\alpha\beta}]}{\delta n_{\alpha}(\vec r)}$$ [50]
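The statement that $f(\xi)$ carries the exact spin dependence of the exchange energy can be checked numerically. The sketch below restricts Eq. [47] to exchange only and compares it with the standard spin-scaling relation for exchange, $E_x[n_{+}, n_{-}] = \frac{1}{2}(E_x[2n_{+}] + E_x[2n_{-}])$. That relation is not quoted in the text and is used here as an independent reference. The function names are illustrative, and correlation is deliberately left out because its parameterization is not given here.

```python
import math


def f_vbh(xi):
    """Interpolation function of Eq. [48] for a polarization 0 <= xi <= 1."""
    bracket = (1.0 + xi) ** (4.0 / 3.0) + (1.0 - xi) ** (4.0 / 3.0) - 2.0
    return 0.5 * bracket / (2.0 ** (1.0 / 3.0) - 1.0)


def eps_x_para(n):
    """Exchange energy per particle of the unpolarized gas, Eq. [28]."""
    return -0.75 * (3.0 / math.pi) ** (1.0 / 3.0) * n ** (1.0 / 3.0)


def eps_x_interpolated(n, xi):
    """Eq. [47] restricted to exchange; the fully polarized gas has 2^(1/3) eps_x^P."""
    eps_p = eps_x_para(n)
    eps_f = 2.0 ** (1.0 / 3.0) * eps_p
    return eps_p + (eps_f - eps_p) * f_vbh(xi)


def eps_x_spin_scaled(n, xi):
    """Reference value from the spin-scaling relation for exchange."""
    n_up = 0.5 * n * (1.0 + xi)
    n_dn = 0.5 * n * (1.0 - xi)
    # Energy per unit volume: (1/2) [2 n_up eps(2 n_up) + 2 n_dn eps(2 n_dn)].
    energy_density = n_up * eps_x_para(2.0 * n_up) + n_dn * eps_x_para(2.0 * n_dn)
    return energy_density / n


if __name__ == "__main__":
    for xi in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(xi, f_vbh(xi), eps_x_interpolated(0.05, xi), eps_x_spin_scaled(0.05, xi))
```

The two routes agree for every polarization, and $f$ runs from 0 for the unpolarized gas to 1 for the fully polarized gas.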
As with the unrestricted Hartree–Fock approximation, the LSDA allows for different orbitals for different spin orientations. The LSDA gives a simplified treatment of exchange but also includes Coulomb correlations.
NONCOLLINEAR SPIN DENSITY FUNCTIONAL THEORY
In many systems of interest, the spin density is collinear; that is, the direction of the magnetization vector $\mathbf m(\vec r)$ is the same at any point in space. There are other systems, however, in which the direction of $\mathbf m(\vec r)$ changes in space, a well-known example being the γ-phase of solid Fe.34 Noncollinear magnetic configurations occur easily in systems with low symmetry or those that are disordered.35,36 One can then expect the occurrence of noncollinear spin arrangements in clusters of the transition metals. Generalized LSDA calculations allowing for noncollinear magnetic structures have been performed for solids.37–39 Implementation of the noncollinear formalism for clusters is more recent40,41 and again uses the LSDA as a basis. For local exchange–correlation functionals, the following key idea was introduced by von Barth and Hedin.30 One can divide the volume of the system into small independent boxes and consider that within each small box the electrons form a spin-polarized electron gas, whose densities are $n_\uparrow(\vec r)$ and $n_\downarrow(\vec r)$, the two real and positive eigenvalues of the spin-density matrix $\rho_{\alpha\beta}(\vec r)$ at $\vec r$. At each point $\vec r$, one can then choose a local coordinate system such that the z-axis coincides with the direction of the local spin. In this way, one can use the LSDA exchange and correlation functionals and calculate the exchange–correlation potential in this locally diagonal frame. This strategy provides a local magnetization density approximation, which is similar in spirit to the local density approximation. That is, in the LDA, the exchange–correlation energy density and the exchange–correlation potential at the point $\vec r$ are calculated by assuming that the system behaves locally (at $\vec r$) as a homogeneous electron gas with constant density $n$ equal to $n(\vec r)$, the true density at point $\vec r$. Similarly, in the local magnetization density approximation, the exchange–correlation energy density and the exchange–correlation potential at $\vec r$ are calculated by assuming that the system behaves locally as a partially spin-polarized electron gas with a magnetization density vector $\mathbf m$ equal to $\mathbf m(\vec r)$, the true magnetization density vector at point $\vec r$. The procedure used to calculate the exchange–correlation potential involves carrying the density matrix to the local reference frame where it is diagonal, using the spin-1/2 rotation matrix42
$$U(\vec r) = \begin{pmatrix} \cos\dfrac{\theta(\vec r)}{2}\,e^{(i/2)\phi(\vec r)} & \sin\dfrac{\theta(\vec r)}{2}\,e^{-(i/2)\phi(\vec r)} \\[2ex] -\sin\dfrac{\theta(\vec r)}{2}\,e^{(i/2)\phi(\vec r)} & \cos\dfrac{\theta(\vec r)}{2}\,e^{-(i/2)\phi(\vec r)} \end{pmatrix}$$ [51]
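The angles $\theta(\vec r)$ and $\phi(\vec r)$ are calculated in such a way that $U$ diagonalizes the density matrix
$$U(\vec r)\,\rho(\vec r)\,U^{\dagger}(\vec r) = \begin{pmatrix} n_\uparrow(\vec r) & 0 \\ 0 & n_\downarrow(\vec r) \end{pmatrix}$$ [52]
where $U^{\dagger}(\vec r)$ is the adjoint (or Hermitian conjugate) of $U(\vec r)$. The exchange–correlation potential is then calculated in this local reference frame, in which it is a diagonal operator with components $V_{xc}^{\uparrow}$ and $V_{xc}^{\downarrow}$, and then it must be transformed back to the original reference frame. The local spin rotation angles $\theta(\vec r)$ and $\phi(\vec r)$, the local polar and azimuthal angles of the magnetization density vector, are computed through the requirement of having the off-diagonal elements vanish in the matrix of Eq. [52]. The result is
$$\phi(\vec r) = -\tan^{-1}\frac{\mathrm{Im}\,\rho_{\uparrow\downarrow}(\vec r)}{\mathrm{Re}\,\rho_{\uparrow\downarrow}(\vec r)}$$ [53]
$$\theta(\vec r) = \tan^{-1}\frac{2\{[\mathrm{Re}\,\rho_{\uparrow\downarrow}(\vec r)]^2 + [\mathrm{Im}\,\rho_{\uparrow\downarrow}(\vec r)]^2\}^{1/2}}{\rho_{\uparrow\uparrow}(\vec r) - \rho_{\downarrow\downarrow}(\vec r)}$$ [54]
This leads to an exchange–correlation potential in the form of a $2 \times 2$ Hermitian matrix in spin space
$$V_{xc}(\vec r) = \frac{1}{2}\left(V_{xc}^{\uparrow}(\vec r) + V_{xc}^{\downarrow}(\vec r)\right) I + \frac{1}{2}\left(V_{xc}^{\uparrow}(\vec r) - V_{xc}^{\downarrow}(\vec r)\right)\boldsymbol{\sigma}\cdot\mathbf d(\vec r)$$ [55]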
where $\mathbf d(\vec r)$ is the unit vector in the direction of the magnetization $\mathbf m(\vec r)$. The presence of the second term in Eq. [55] effectively couples the ↑ and ↓ components of the spinor. To interpret the magnetic properties, one can use the spin-density matrix of Eq. [35] to compute the magnetization density $\mathbf m(\vec r)$. Local magnetic moments $\boldsymbol{\mu}_{at}$ can be associated with individual atoms by integrating each component of $\mathbf m(\vec r)$ within spheres centered on the ions. A reasonable choice for the radius of those spheres is one half of the smallest interatomic distance in the cluster. This choice avoids overlap between neighbor spheres; however, some interstitial space remains between spheres, and one should be aware that those atom-centered spheres contain only about 80–90% of the magnetization. It is worth stressing again that the method explained above has assumed local exchange–correlation functionals. A route that includes density gradients has been explored by Capelle and Gross.43
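The local-frame procedure of Eqs. [51]–[55] can be sketched at a single point in space. The Python example below builds the spin-density matrix from $n$ and $\mathbf m$ (Eq. [35]), obtains the rotation angles, rotates to the local frame, and transforms a diagonal local potential back to the global frame. It assumes the convention $\rho_{\uparrow\downarrow} = \frac{1}{2}(m_x - i m_y)$ and uses `atan2` instead of a bare arctangent to select the correct quadrant. The sample values of $n$, $\mathbf m$, and the two local potentials are hypothetical.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger2(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def spin_density_matrix(n, m):
    """Eq. [35]: rho = (n I + m . sigma) / 2 for m = (mx, my, mz)."""
    mx, my, mz = m
    return [[complex(0.5 * (n + mz)), 0.5 * (mx - 1j * my)],
            [0.5 * (mx + 1j * my), complex(0.5 * (n - mz))]]


def rotation_angles(rho):
    """Eqs. [53] and [54], with atan2 used to resolve the quadrant."""
    off = rho[0][1]
    phi = -math.atan2(off.imag, off.real)
    theta = math.atan2(2.0 * abs(off), (rho[0][0] - rho[1][1]).real)
    return theta, phi


def rotation_matrix(theta, phi):
    """Eq. [51]: spin-1/2 rotation to the locally diagonal frame."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    ep, em = cmath.exp(0.5j * phi), cmath.exp(-0.5j * phi)
    return [[c * ep, s * em], [-s * ep, c * em]]


# Hypothetical values at one grid point.
n, m = 1.0, (0.3, -0.2, 0.1)
rho = spin_density_matrix(n, m)
theta, phi = rotation_angles(rho)
U = rotation_matrix(theta, phi)
rho_local = matmul2(matmul2(U, rho), dagger2(U))      # Eq. [52]

# Illustrative local potentials for the two spin channels (hartree).
v_up, v_dn = -0.40, -0.30
V_local = [[v_up, 0.0], [0.0, v_dn]]
V_xc = matmul2(matmul2(dagger2(U), V_local), U)       # global frame, Eq. [55]
```

In the local frame the diagonal entries are $(n \pm |\mathbf m|)/2$, and the back-transformed potential has exactly the structure of Eq. [55] with $\mathbf d = \mathbf m/|\mathbf m|$.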
MEASUREMENT AND INTERPRETATION OF THE MAGNETIC MOMENTS OF NICKEL CLUSTERS
Interpretation Using Tight Binding Calculations
Bloomfield et al.3 have performed accurate measurements of the magnetic moments of size-selected NiN clusters with N between 5 and 700. Their results up to N = 60 are plotted as black dots in Figure 2. The average magnetic moment $\bar\mu$ of the cluster shows an overall decrease with increasing cluster size, but oscillations are superimposed on this decreasing behavior. In the small size range, for N < 10, where $\bar\mu$ decreases most rapidly, there is a local minimum at Ni6 and a local maximum at Ni8. Thereafter, $\bar\mu$ displays a deep minimum for Ni13 and another minimum at Ni56. The latter minimum is so close to Ni55 that it is tempting to conclude that the Ni clusters grow following an icosahedral pattern (clusters with perfect icosahedral structure and one and two shells, shown in Figure 3, have 13 and 55 atoms, respectively44). A third important minimum occurs around Ni34. The magnetic moment goes through a broad maximum between Ni13 and Ni34, and again between Ni34 and Ni56.
Figure 2 Comparison between the experimental average magnetic moments of Ni clusters measured by Apsel et al.3 (black dots) and the moments calculated by a tight binding method45,46 (light circles). Reproduced with permission from Ref. 44.
Figure 3 Clusters with perfect icosahedral structure: one shell (N = 13), two shells (N = 55), and three shells (N = 147).
Theoretical studies attempting to rationalize the behavior of the magnetic moment of NiN clusters45–48 have relied on the tight binding method. The magnetic moments calculated by Aguilera-Granja et al.46 for sizes up to N = 60 are plotted in Figure 2. The calculations used the theory described above, with some simplifications. Local charge neutrality was assumed, leading to the following expression for the environment-dependent energy levels:
$$\varepsilon_{i\alpha\sigma} = \varepsilon^{0}_{i\alpha} + z_\sigma \sum_\beta \frac{J_{\alpha\beta}}{2}\,\mu_{i\beta} + Z_i\,\Omega_{i\alpha}$$ [56]
which is simpler than the expression in Eq. [16]. Two principles are useful when interpreting these results.45,46 The first principle is that the local magnetic moments of the atoms in the cluster decrease when the number of neighbors (local coordination) around the atoms increases. The second principle is that the average magnetic moment of the cluster decreases when the interatomic distances decrease, which occurs because the width of the d band increases. In metallic clusters, the average atomic coordination number increases for increasing N. The average nearest-neighbor distance d also increases with N, from the value for the molecule $d_{mol}$ to the value for the bulk $d_{bulk}$. The two effects oppose each other, and for that reason, the behavior of $\bar\mu(N)$ in a growing cluster is complex. For N ≤ 20, the cluster geometries used to perform the calculations of the magnetic moments were obtained from molecular dynamics simulations using a many-atom potential49,50 based on the TB theory, with parameters fitted to the properties of Ni2 and bulk Ni. That potential is typically referred to as the Gupta potential.50 The geometries for N = 5–16 are shown in Figure 4, which shows a pattern of icosahedral growth. A qualitative agreement exists between the experimental and theoretical magnetic moments of small clusters. The TB calculations predict pronounced local minima at Ni6 and Ni13 and a local maximum at Ni8. Ni13 is an icosahedron with an internal atom at its center. The local atomic coordination of the surface atoms in Ni13 is Z = 6. On the other hand, Ni12 and Ni14 contain some atoms with coordination smaller than 6, and this leads to an increase of the local magnetic moments of those atoms. Consequently, the compact structure of Ni13 explains the minimum of $\bar\mu$ occurring at that cluster size. Ni6 is an
Figure 4 Ground state geometries of Ni clusters with 5 to 16 atoms (symmetries are indicated), obtained using the Gupta potential. Reproduced with permission from Ref. 45.
octahedron formed by atoms with coordination Z = 4. In Ni7, which has the structure of a flattened pentagonal bipyramid, the coordination of two atoms increases to Z = 6 and remains equal to 4 for the rest of the atoms. Ni8 has four atoms with coordination Z = 5 and four atoms with coordination Z = 4, which leads to a mean coordination number that is slightly smaller than in Ni7. The coordination increases again in Ni9. This behavior of the mean coordination number would lead one to expect a maximum of $\bar\mu$ for Ni8, which is indeed observed in the experiments, and a minimum at Ni7. However, the observed minimum and the calculated minimum occur at Ni6, and the reason for this is that the average nearest-neighbor distance $d_n$ has a local maximum at Ni7. The larger value of $d_n$ counteracts the increase of the coordination number when going from Ni6 to Ni7 and produces the minimum of $\bar\mu$ at Ni6. To summarize, the oscillations of the average magnetic moment in small Ni clusters can be explained by two purely geometrical effects: (1) compact clusters, that is, clusters with high average atomic coordination number, have small values of $\bar\mu$; and (2) clusters with large interatomic distances have large $\bar\mu$. The densities of states of Ni5, Ni6, and Ni7, decomposed into d and sp contributions, are compared in Figure 5. The occupied states of the majority-spin sub-band (↑) have mainly d character, except for a small peak with sp character at the Fermi level; on the other hand, d holes are present in the minority-spin sub-band (↓). Integration of the density of states gives average d
Figure 5 Density of states of NiN clusters with N = 5, 6, and 7, calculated by the tight binding method: sp (dashed lines) and d (continuous lines). Positive and negative values correspond to up ↑ and down ↓ spins, respectively. The Fermi level is at the energy zero. Adapted with permission from Ref. 45.
magnetic moments of 1.60 $\mu_B$, 1.52 $\mu_B$, and 1.50 $\mu_B$ for Ni5, Ni6, and Ni7, respectively. A comparison with the calculated moments of Figure 2 reveals that the sp electrons make a sizable contribution. The sp moments in Ni5 (0.29 $\mu_B$) and Ni7 (0.21 $\mu_B$) reinforce the d moment, whereas for Ni6, the sp moment (0.15 $\mu_B$) opposes the d moment. The sp contribution decreases quickly with increasing cluster size. Icosahedral structures were assumed in these calculations for N > 20, although those structures were reoptimized using the Gupta potential.46 In addition, extensive molecular dynamics simulations were performed for a few selected cluster sizes. In all cases, the icosahedral structures were predicted as the ground state geometry, except for Ni38, which is a special case that will be discussed later. Icosahedral growth thus seems to be consistent with the interpretation of experiments probing the reactivity of Ni clusters with light molecules.51
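The coordination numbers used in the geometrical argument for small clusters can be reproduced by counting nearest neighbors in ideal geometries. The sketch below does this for a regular octahedron (Ni6) and a centered icosahedron (Ni13). The ideal, unrelaxed coordinates and the neighbor cutoff of 1.2 times the shortest distance are assumptions made for illustration, not the Gupta-potential geometries of the original work. The per-atom coordinations of Ni7 and Ni8 are taken directly from the text.

```python
import itertools
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio


def octahedron():
    """Six vertices of a regular octahedron."""
    pts = []
    for axis in range(3):
        for sign in (-1.0, 1.0):
            p = [0.0, 0.0, 0.0]
            p[axis] = sign
            pts.append(tuple(p))
    return pts


def icosahedron_with_center():
    """Central atom followed by the 12 vertices of a regular icosahedron."""
    pts = [(0.0, 0.0, 0.0)]
    for a, b in itertools.product((-1.0, 1.0), repeat=2):
        pts += [(0.0, a, b * PHI), (a, b * PHI, 0.0), (b * PHI, 0.0, a)]
    return pts


def coordination_numbers(points, tol=1.2):
    """Neighbors closer than tol times the shortest interatomic distance."""
    dmin = min(math.dist(p, q) for p, q in itertools.combinations(points, 2))
    cutoff = tol * dmin
    return [sum(1 for j, q in enumerate(points)
                if j != i and math.dist(p, q) < cutoff)
            for i, p in enumerate(points)]


def mean_coordination(zs):
    """Average coordination number of a cluster."""
    return sum(zs) / len(zs)


z6 = coordination_numbers(octahedron())
z13 = coordination_numbers(icosahedron_with_center())
z7_quoted = [6, 6, 4, 4, 4, 4, 4]      # flattened pentagonal bipyramid
z8_quoted = [5, 5, 5, 5, 4, 4, 4, 4]

if __name__ == "__main__":
    for name, zs in (("Ni6", z6), ("Ni7", z7_quoted), ("Ni8", z8_quoted), ("Ni13", z13)):
        print(name, zs, round(mean_coordination(zs), 3))
```

The mean coordination rises from Ni6 to Ni7 and falls slightly at Ni8, which is the trend invoked to explain the local maximum of the moment at Ni8.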
The calculated magnetic moments of Figure 2 reveal a decrease of $\bar\mu$ for sizes up to N ≈ 28, followed by a weak increase between N ≈ 28 and N = 60. This behavior is primarily a result of the variation of the average coordination number $\bar Z$,46 which increases smoothly with N up to N = 27 and then drops. By extrapolating the smooth behavior of $\bar Z(N)$ to sizes larger than N = 27, it was found that for N between 27 and 54, the actual values of $\bar Z(N)$ fall below the extrapolated values. In fact, $\bar Z(N)$ decreases between N = 27 and N = 30 and then begins to increase again after N = 30. The break in $\bar Z$ at N = 27 suggests a flattening of $\bar\mu$, which is confirmed by the calculations. This break in the coordination number is caused by a structural transition in the icosahedral clusters,51 which occurs at precisely N = 28. Starting with the first complete icosahedron shown in Figure 4, atoms can be added on top of this 13-atom core in two different ways. In a first type of decoration, atoms cover sites at the center of the triangular faces (F sites) and vertices (V sites). Those F and V sites provide a total of 32 sites (20 + 12), and full coverage produces a cluster with 45 atoms; this type of decoration can be denoted FC (face centered) as it emphasizes the coverage of the faces of the icosahedron. Alternatively, atoms can decorate the edges (E sites) and vertices (V). These E and V sites provide a total of 42 sites (30 + 12), and completion of this layer leads to the next Mackay icosahedron with 55 atoms; these structures are called multilayer icosahedral (or MIC) structures. FC growth is favored at the beginning of building a shell, up to a point when a transition occurs because MIC growth becomes preferred. The cluster size for the transition depends slightly on the details of the interatomic interactions in different materials. For Ni clusters, it occurs between N = 27 and N = 28. The calculated minimum of $\bar\mu$ at N = 55 corresponds with a minimum in the measured magnetic moment at N = 56.
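The site counts quoted for the two decoration schemes follow from the geometry of the icosahedron (20 faces, 30 edges, 12 vertices). The sketch below reproduces the 45-atom FC and 55-atom MIC totals and checks them against the closed-form number of atoms in a Mackay icosahedron with k complete shells, $N(k) = (10k^3 + 15k^2 + 11k + 3)/3$. That formula is a standard result not stated in the text, and the variable names are illustrative.

```python
FACES, EDGES, VERTICES = 20, 30, 12   # geometry of an icosahedron


def mackay_magic_number(k):
    """Atoms in a Mackay icosahedron with k complete shells around a central atom."""
    return (10 * k**3 + 15 * k**2 + 11 * k + 3) // 3


core = mackay_magic_number(1)         # 13-atom icosahedron
fc_sites = FACES + VERTICES           # face-centered (FC) decoration
mic_sites = EDGES + VERTICES          # multilayer icosahedral (MIC) decoration

fc_total = core + fc_sites
mic_total = core + mic_sites

if __name__ == "__main__":
    print("FC cover:", fc_sites, "sites, total", fc_total)
    print("MIC cover:", mic_sites, "sites, total", mic_total)
    print("Mackay sizes:", [mackay_magic_number(k) for k in (1, 2, 3)])
```

The MIC total coincides with the two-shell Mackay icosahedron, and the first three Mackay sizes are the 13, 55, and 147 atoms of Figure 3.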
Also, the calculated minimum in the region Ni28–Ni37, associated with the FC → MIC transition, can be correlated with the broad experimental minimum of $\bar\mu$ in that region. The experiments also show a weak minimum at Ni19, which can be tentatively associated with a double icosahedron structure (an icosahedron covered by an FC cap formed by six atoms, one in a vertex site and the others in the five associated F sites),51 although this local minimum does not show up in the calculations. Another weak feature is the drop of $\bar\mu$ between Ni22 and Ni23, which has a counterpart in the calculation (the structure of Ni23 results by covering an icosahedron with two FC caps; its structure can be viewed as three interpenetrating double icosahedra). One may conclude with some confidence that the minima displayed by the measured magnetic moments provide some support to a pattern of icosahedral growth. It was indicated above that Ni38 is a special case. Reactivity experiments52 measuring the saturation coverage of this cluster with N2, H2, and CO molecules suggest that the structure of Ni38 is a truncated octahedron cut from a face-centered cubic (fcc) lattice. This structure is shown in
Figure 6 Calculated minimum energy structure of Ni38. It is a piece of an fcc crystal. Dark and light atoms are internal and surface atoms, respectively.
Figure 6. Motivated by this result, a detailed comparison between the energies of fcc and icosahedral structures was performed by Aguilera-Granja et al.46 for other NiN clusters with N = 13, 14, 19, 23, 24, 33, 36, 37, 39, 43, 44, 55, and 68. Ni36, Ni37, Ni38, and Ni39 were the sizes selected in the neighborhood of Ni38. For most other sizes selected, it was possible to construct fcc clusters with filled coordination shells around the cluster center. In all cases, the icosahedral structure was predicted to be more stable, except for the Ni38 cluster. The difference in binding energy between the icosahedral and fcc structures is, however, small. This difference, plotted in Figure 7, is lower than 0.2 eV between Ni24 and Ni39 and larger for other cluster sizes. For the truncated octahedral structure, $\bar\mu(\mathrm{Ni}_{38}) = 0.99\,\mu_B$. This value reduces the difference between the experimental and theoretical results to one third of the value in Figure 2 ($\bar\mu_{exp} - \bar\mu_{fcc} = 0.04\,\mu_B$ and $\bar\mu_{exp} - \bar\mu_{ico} = 0.11\,\mu_B$). The moderate increase of $\bar\mu_{fcc}$ with respect to $\bar\mu_{ico}$ arises from the lower average coordination in the fcc structure ($\bar Z(\mathrm{fcc}) = 7.58$ and $\bar Z(\mathrm{ico}) = 7.74$). The calculated values of $\bar\mu$ are very similar for the icosahedral and fcc structures of Ni36 (0.87 $\mu_B$ and 0.86 $\mu_B$, respectively). Because the energy differences between isomers for N = 24–40 are small (less than 0.4 eV), the possibility of different isomers contributing to the measured values of the magnetic moment cannot be excluded. Rationalizing the observed broad maxima of $\bar\mu$ at values of N around 20 and 42 is more difficult than for minima. These maxima are not observed in the TB results of Figure 2. One possibility, which has been suggested from
Figure 7 Difference in binding energy of icosahedral and fcc isomers of NiN clusters as a function of cluster size (vertical axis: ΔE(ico-fcc) in eV; horizontal axis: number of atoms). Reproduced with permission from Ref. 46.
some molecular dynamics simulations,53 is that the structures of Ni clusters are bulk-like fcc instead of icosahedral for cluster sizes near those maxima. Using fcc structures covering the whole range of cluster sizes, Guevara et al.21 predicted sharp maxima for $\bar\mu$ at Ni19 and Ni43 and minima at Ni28 and Ni55. However, the measured $\bar\mu$ shows a local minimum at Ni19, and reactivity experiments suggest that Ni19 is a double icosahedron.54 So the only clear prediction in favor of fcc structures is the maximum of $\bar\mu$ at Ni43. Rodríguez-López et al.55 have performed additional TB calculations of $\bar\mu$ for cluster structures that other authors had obtained using different semiempirical interatomic potentials. Their conclusion is that the changes of $\bar\mu$ arising from different cluster structures are not large. The oscillations of $\bar\mu$ at small N are accounted for reasonably well for all structural families considered; however, a fast approach of $\bar\mu(N)$ toward the bulk limit occurs in all cases. These results do not resolve the discrepancies between TB calculations and experiment, which indicates that a possible misrepresentation of the exact geometry is not the only problem with the computational results. Other possibilities are explored in the next sections.
Influence of the s Electrons
An alternative model for explaining the behavior of the magnetic moment of Ni clusters has been proposed by Fujima and Yamaguchi (FY).56 This model is of interest because it may contain some additional ingredients required to explain the observed maxima of $\bar\mu$. However, as a function of N, the FY model cannot predict the minima. It is intriguing that the observed maxima of $\bar\mu$ are at N = 8 and near N = 20 and N = 40.3 These numbers
immediately bring to mind well-known electronic shell closing numbers of alkali metal clusters.44,57 Electrons move as independent particles in an effective potential with approximate spherical shape in the alkali clusters. In that potential, the electronic energy levels group in shells with a degeneracy equal to 2(2l + 1) caused by orbital angular momentum and spin considerations. A realistic form of the effective potential, obtained for instance with the jellium background model for the positive ionic charge,44,58 produces a sequence of electronic shells 1S, 1P, 1D, 2S, 1F, 2P, 1G, 2D, ..., where the notation for the angular momentum character of the shells is given in capital letters (S, P, D, ...) to avoid confusion with s, p, and d atomic orbitals. Clusters with closed electronic shells are especially stable. These "magic" clusters contain a number of valence electrons $N_{magic}$ = 2, 8, 18, 20, 34, 40, 58, 68, ... The same magic numbers have been observed for noble metal clusters.59 The electronic configurations of free noble metal atoms are $3d^{10}4s$, $4d^{10}5s$, and $5d^{10}6s$ for Cu, Ag, and Au, respectively, and the interpretation of the magic numbers in the noble metal clusters is that the shell effects are caused by the external s electrons. In a similar way, the FY model for Ni clusters distinguishes between localized 3d atomic-like orbitals, responsible for the magnetic moment of the cluster, and delocalized molecular orbitals derived from the atomic 4s electrons, and it neglects hybridization between the 3d electrons and the delocalized 4s electrons. The delocalized 4s electrons are treated as moving in an effective harmonic potential. The energy levels of the delocalized 4s electrons lie just above the Fermi energy in very small NiN clusters. But, as N grows, the binding energy of the delocalized 4s states increases and these states progressively move down below the 3d band.
The existence of some 4s states below the 3d band causes the presence of holes at the top of the minority-spin 3d band (the majority-spin 3d band is filled). The number of 3d holes is equal to the number of 4s states buried below the 3d band, because the total number of valence electrons per Ni atom is 10. The FY model assumes that the transfer of 4s states to energies below the 3d band occurs abruptly when the number of delocalized 4s electrons is just enough to completely fill an electronic shell in the harmonic potential. As a consequence of the stepwise motion, there is a sudden increase in the number of holes at the top of the minority-spin 3d band of the cluster, and because the number of holes is equal to the number of unpaired electrons in the cluster, an abrupt increase of the magnetic moment $\bar\mu$ then occurs. The stepwise transfer of 4s-derived levels from above the Fermi energy to below the d band is supported by density functional calculations.60 The maxima of the magnetic moment observed in the Ni experiments near N = 20 and N = 42 could be related to this effect, because closing of electronic shells occurs in clusters of s-like electrons at N = 20 and N = 40. On the other hand, the FY model predicts maxima and minima of $\bar\mu$ that are too close together, because of the assumption of the sudden transfer of a whole shell of electrons when the conditions of shell closing are met. This contrasts with experiment, where the maxima and the minima are well separated.
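The magic numbers quoted for closed electronic shells follow from filling the shell sequence 1S, 1P, 1D, 2S, 1F, 2P, 1G, 2D with 2(2l + 1) electrons per shell. The short sketch below accumulates those capacities. The shell ordering is taken from the text, while the data structures and function names are illustrative.

```python
# Shell ordering quoted in the text for a jellium-like effective potential.
SHELL_ORDER = ["1S", "1P", "1D", "2S", "1F", "2P", "1G", "2D"]
L_VALUES = {"S": 0, "P": 1, "D": 2, "F": 3, "G": 4}


def shell_capacity(label):
    """Degeneracy 2(2l + 1) of a shell such as '1D' (orbital and spin)."""
    l = L_VALUES[label[-1]]
    return 2 * (2 * l + 1)


def magic_numbers(order):
    """Cumulative electron counts at each successive shell closing."""
    total, closings = 0, []
    for label in order:
        total += shell_capacity(label)
        closings.append(total)
    return closings


if __name__ == "__main__":
    print(magic_numbers(SHELL_ORDER))
```

The closings at 8, 20, and 40 electrons are the ones the FY model associates with the observed maxima of the moment in Ni clusters.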
Density Functional Calculations for Small Nickel Clusters
DFT studies of transition-element clusters have focused primarily on small sizes because the calculations become very time consuming for large clusters. Reuse and Khanna61 have calculated $\bar\mu$ for NiN clusters with N = 2–6, 8, and 13. They found $\bar\mu(\mathrm{Ni}_6) < \bar\mu(\mathrm{Ni}_5)$ and $\bar\mu(\mathrm{Ni}_{13}) < \bar\mu(\mathrm{Ni}_8)$, which agrees with the experimental trend; however, the magnetic moments of Ni6 and Ni8 were nearly equal whereas the experiment indicates a larger $\bar\mu$ for Ni8 (Fig. 2). Bouarab et al.45 also performed TB calculations with the same structures and interatomic distances used by Reuse and Khanna. Their magnetic moments differed by no more than 0.06 $\mu_B$ from the TB values of Fig. 2. Therefore, the differences between TB and DFT results have to be mainly ascribed to the different treatment of the electronic interactions.46 Desmarais et al.62 have studied Ni7 and Ni8. The same value $\bar\mu = 1.14\,\mu_B$ was obtained for the ground state of Ni7, a capped octahedron, and for all its low-lying isomers. Similarly, an average moment $\bar\mu = 1.0\,\mu_B$ was obtained for the ground state and for the low-lying isomers of Ni8. The insensitivity of the magnetic moments to atomic structure in Ni7 and Ni8, also found for Ni4,61 is striking. Reddy et al.63 have calculated the magnetic moments for sizes up to N = 21. For N ≤ 6, they employed ab initio geometries, and for N > 6, geometries were optimized with the Finnis–Sinclair potential.64 Compared with the experiment, the calculated magnetic moments are substantially lower, and important discrepancies occur in the evolution of $\bar\mu$ with cluster size. Fujima and Yamaguchi65 have calculated the local magnetic moments at different atomic sites within model Ni19 and Ni55 clusters with fcc structure and bulk interatomic distances. The octahedral shape was assumed for Ni19 and the cuboctahedral shape for Ni55.
No significant differences were found between the magnetic moments of atoms on different surface sites, but the moments of atoms in the layer immediately below the surface were 0.2 $\mu_B$ smaller than those of the surface atoms. The average magnetic moments $\bar{\mu}(\mathrm{Ni}_{19}) = 0.58\ \mu_B$ and $\bar{\mu}(\mathrm{Ni}_{55}) = 0.73\ \mu_B$ are significantly smaller than the measured moments. Pacchioni et al.66 calculated the electronic structure of Ni6, Ni13, Ni19, Ni38, Ni44, Ni55, Ni79, and Ni147. Icosahedral structures were assumed for Ni13, Ni55, and Ni147 and structures with Oh symmetry for Ni6, Ni13, Ni19, Ni38, Ni44, Ni55, and Ni79 (in most cases, fragments of an fcc crystal). Convergence of $\bar{\mu}$ to the bulk limit was not observed, despite the width of the 3d band being almost converged for N = 40–50.
Orbital Polarization
The calculations discussed above considered spin magnetic moments but not the orbital magnetic moments. However, it is known that orbital correlation has a strong effect in low-dimensional systems, which leads to orbital polarized ground states.67–69 Based on this fact, Guirado-López et al.69 and
Magnetic Properties of Atomic Clusters of the Transition Elements
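Wan and coworkers48 studied the magnetic moments of Ni clusters, taking into account both spin and orbital effects. Wan et al.48 used the following TB Hamiltonian

$$
\begin{aligned}
H = {} & \sum_{iL\sigma} \varepsilon^{0}_{il}\, \hat{c}^{+}_{iL\sigma} \hat{c}_{iL\sigma} + \sum_{ij} \sum_{LL'\sigma} t_{iL}^{jL'}\, \hat{c}^{+}_{iL\sigma} \hat{c}_{jL'\sigma} + H_{SO} + H_{ee} \\
& + \sum_{i'\sigma} \left[ \varepsilon^{0}_{i's'}\, \hat{c}^{+}_{i's'\sigma} \hat{c}_{i's'\sigma} + t_{ss'}(Z_{i'}) \left( \hat{c}^{+}_{i's'\sigma} \hat{c}_{i's\sigma} + \hat{c}^{+}_{i's\sigma} \hat{c}_{i's'\sigma} \right) \right] \\
& + \sum_{i'L\sigma} \varepsilon_{i'}(n_{i's'}) \left( \hat{c}^{+}_{i'L\sigma} \hat{c}_{i'L\sigma} + \hat{c}^{+}_{i's'\sigma} \hat{c}_{i's'\sigma} \right)
\end{aligned}
\qquad [57]
$$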
where the meaning of the different symbols is the same as described in the section on the TB method. $L = (l, m)$ indicates the orbital angular momentum quantum numbers. There are some differences with respect to the Hamiltonian in Eqs. [15] and [16]. The first two terms in Eq. [57] are already included in Eq. [15]. The term $H_{SO}$ is the spin-orbit coupling operator in the usual intraatomic approximation,

$$
H_{SO} = \xi \sum_{i,L\sigma,L'\sigma'} \left\langle L\sigma \left| \vec{S}_i \cdot \vec{L}_i \right| L'\sigma' \right\rangle \hat{c}^{+}_{iL\sigma} \hat{c}_{iL'\sigma'}
\qquad [58]
$$
where $\xi$ gives the strength of the interaction ($\xi = 0.073$ for d orbitals). The term $H_{ee}$, to be discussed below, is an intraatomic d–d electron–electron interaction. The two final terms in Eq. [57] apply specifically to the surface atoms of the cluster, labeled by the subscripts $i'$. To take into account the electronic spillover at the surface,21 an extra orbital, labeled $s'$, is attached to each surface atom $i'$. The intraatomic d–d electron–electron interaction includes Coulomb and exchange interactions, and it is responsible for orbital and spin polarization. To account for orbital polarization, the idea of the LDA + U method was followed.70 A generalized Hartree–Fock approximation including all possible pairings was then used to write

$$
H_{ee} = \sum_{i,L\sigma,L'\sigma'} V^{i}_{L\sigma,L'\sigma'}\, \hat{c}^{+}_{iL\sigma} \hat{c}_{iL'\sigma'}
\qquad [59]
$$
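where

$$
\begin{aligned}
V^{i}_{L\sigma,L'\sigma'} = {} & \sum_{L_2 L_3} \Big( \big\{ U_{L L_2 L' L_3}\, n_{i,L_2\bar{\sigma},L_3\bar{\sigma}} + (U_{L L_2 L' L_3} - U_{L L_2 L_3 L'})\, n_{i,L_2\sigma,L_3\sigma} \big\}\, \delta_{\sigma\sigma'} \\
& \qquad - U_{L L_2 L_3 L'}\, n_{i,L_2\bar{\sigma},L_3\sigma}\, \delta_{\bar{\sigma}\sigma'} \Big) - U (n_i - 0.5)\, \delta_{LL'} \delta_{\sigma\sigma'} + J (n_{i\sigma} - 0.5)\, \delta_{LL'} \delta_{\sigma\sigma'}
\end{aligned}
\qquad [60]
$$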
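In this expression, $n_{i,L\sigma,L'\sigma'} = \langle \hat{c}^{+}_{iL\sigma} \hat{c}_{iL'\sigma'} \rangle$ is the single-site density matrix, $n_{i\sigma}$ is the trace of $n_{i,L\sigma,L'\sigma'}$, $n_i = \sum_{\sigma} n_{i\sigma}$, and $\bar{\sigma} = -\sigma$. The matrix elements $U_{L L_2 L' L_3}$ can be determined by two parameters, the average on-site Coulomb repulsion U and the exchange J,

$$
U = \frac{1}{(2l+1)^2} \sum_{mm'} U_{mm'mm'}
\qquad [61]
$$

$$
U - J = \frac{1}{2l(2l+1)} \sum_{mm'} \left( U_{mm'mm'} - U_{mm'm'm} \right)
\qquad [62]
$$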
This can be seen by expressing $U_{L L_2 L' L_3}$ in terms of complex spherical harmonics and effective Slater integrals $F^k$ as in Eq. [63]70

$$
\langle m, m'' | U | m', m''' \rangle = \sum_{k} a_k(m, m', m'', m''')\, F^k
\qquad [63]
$$

where $0 \le k \le 2l$ and

$$
a_k(m, m', m'', m''') = \frac{4\pi}{2k+1} \sum_{q=-k}^{k} \langle lm | Y_{kq} | lm' \rangle \langle lm'' | Y^{*}_{kq} | lm''' \rangle
\qquad [64]
$$
Because we are dealing with d electrons, l = 2, and the notation for $U_{LL'L''L'''}$ has been simplified to $U_{mm'm''m'''}$ in Eqs. [61] and [62]. The Slater integrals $F^k$, which carry the radial integrations in Eq. [63], are expressions of the type71

$$
F^k(s, t) = \int\!\!\int dr\, dr'\, P_{n_s l_s}(r)\, P_{n_s l_s}(r)\, \frac{r_<^{k}}{r_>^{k+1}}\, P_{n_t l_t}(r')\, P_{n_t l_t}(r')
\qquad [65]
$$
where the symbols s and t refer to two different electrons and $P_{n_s l_s}(r)$ is the product of the distance r from the nucleus and the radial function $R_{n_s l_s}(r)$ of the s electron. The terms $r_<$ ($r_>$) correspond to the smaller (larger) of r and r'. Only the Slater integrals $F^0$, $F^2$, and $F^4$ are required for d electrons, and these can be linked to the U and J parameters through the relations $U = F^0$ and $J = (F^2 + F^4)/14$, whereas the ratio $F^4/F^2$ is a constant, approximately 0.625, for the 3d elements. In this formalism, the Stoner parameter, which determines the splitting of the bulk bands,67 is $I = (2lJ + U)/(2l + 1)$. Wan et al.48 chose I = 1.12 eV in their cluster calculations, in order to have consistency with the exchange splitting of the bands of bulk Ni obtained in LSDA calculations. They used a value U = 2.6 eV, although other values (U = 1.8 eV and U = 3.2 eV) were explored. It is evident that orbital polarization is included in this approach via the orbital-dependent effects coming from $F^2$ and $F^4$. The orbital
polarization differentiates the approach discussed in this section from that embodied in the simpler formulation in Eqs. [16] and [17]. The fifth term in the Hamiltonian of Eq. [57] is added to account for the electronic spillover at the surface.21 One extra orbital with s symmetry (the $s'$ orbital) is added to each surface atom $i'$ and located in the vacuum region near the atom. This $s'$ orbital interacts with the s orbital of the same surface atom through the hopping integral $t_{ss'}$, and the occupation of the $s'$ orbitals represents the spillout. The hopping integral is parameterized such that $t_{ss'}(Z_{i'}) = V_{ss'\sigma} \sqrt{Z_{\max} - Z_{i'}}$, where $Z_{\max} - Z_{i'}$ is the local deficit in the atomic coordination ($Z_{\max}$ is the maximum coordination; 12 for an fcc solid) and $V_{ss'\sigma}$ is the hopping strength. The last term in Eq. [57], which accounts for the orbital shifts from intersite Coulomb interactions, is related to the last two terms in Eq. [16]. However, Wan et al.48 restricted the orbital shifts to the surface atoms $i'$. The eigenvalue equation corresponding to the Hamiltonian of Eq. [57] can be solved self-consistently by an iterative procedure for each orientation of the spin magnetization (identified as the z direction). The self-consistent density matrix is then employed to calculate the local spin and orbital magnetic moments. For instance, the local orbital moments at different atoms i are determined from

$$
\mu_{i,\mathrm{orb}} = \sum_{\sigma} \sum_{m=-2}^{2} \int_{-\infty}^{\varepsilon_F} m\, n_{im\sigma}(\varepsilon)\, d\varepsilon
\qquad [66]
$$
where m refers to the magnetic quantum number. The spin magnetic moment $\vec{\mu}_{\mathrm{spin}}$ and the orbital magnetic moment $\vec{\mu}_{\mathrm{orb}}$ of the cluster are calculated as a vector average of the atomic moments. The total magnetic moment $\vec{\mu}$ is obtained as the vector sum of $\vec{\mu}_{\mathrm{spin}}$ and $\vec{\mu}_{\mathrm{orb}}$. The self-consistent solution of the Hamiltonian generates a non-uniform and non-collinear distribution of spin and orbital moments; however, it was found by Wan et al. that noncollinearity is very weak in this case. Also, the spin-orbit interaction can generate anisotropy, but a comparison of calculations with spin along the three principal axes of the clusters revealed very small energy differences of less than 0.005 eV/atom. The results reported immediately below correspond to spin orientation along the largest inertia axis. A first test of the accuracy of the theory is provided by calculations for the bulk metals, performed by Guirado-López et al.,69 using a very similar model: These authors found orbital magnetic moments $\mu_{b,\mathrm{orb}}(\mathrm{Fe}) = 0.094\ \mu_B$, $\mu_{b,\mathrm{orb}}(\mathrm{Co}) = 0.131\ \mu_B$, and $\mu_{b,\mathrm{orb}}(\mathrm{Ni}) = 0.056\ \mu_B$, all in good agreement with the experimental values $\mu_{b,\mathrm{orb}}(\mathrm{Fe}) = 0.092\ \mu_B$, $\mu_{b,\mathrm{orb}}(\mathrm{Co}) = 0.147\ \mu_B$, and $\mu_{b,\mathrm{orb}}(\mathrm{Ni}) = 0.051\ \mu_B$. The spin magnetic moments of Ni clusters calculated by Wan et al.48 are in reasonable agreement with density functional calculations,61,63,72 but both approaches, that is, TB and DFT, give values substantially smaller than the experimental magnetic moments. The results of Wan et al. improve by adding
Figure 8 Calculated spin, orbital, and total magnetic moments per atom of Ni clusters. Reproduced with permission from Ref. 48.
the orbital magnetic moment: The magnitude of the total $\vec{\mu}$ becomes closer to the experimental values, and its detailed variation as the cluster size increases also improves. One can observe in Figure 2 that the magnitude of the spin magnetic moments obtained by Aguilera-Granja et al.46 is good, but Wan et al.48 suggested that the result of Aguilera-Granja et al. is caused by the parameterization used. Figure 8 shows the spin, orbital, and total magnetic moments obtained by Wan et al. Except for the smallest clusters, $\mu_{\mathrm{orb}}$ varies between 0.3 and 0.5 $\mu_B$/atom, which represents a large enhancement (6 to 10 times) with respect to the orbital moments in bulk Ni. The orbital magnetic moment of the free Ni ($d^9 s^1$) atom is $\mu_{at,\mathrm{orb}}(\mathrm{Ni}) = 2\ \mu_B$. Therefore, a large part of the quenching of $\mu_{\mathrm{orb}}$ is already manifested in the smallest clusters, as soon as full rotational symmetry is lost. However, a substantial enhancement of $\mu_{\mathrm{orb}}$ with respect to the bulk orbital moment is still observed at Ni60. On the other hand, the oscillations of the total magnetic moment come from $\mu_{\mathrm{orb}}$. The orbital moments depend on the choice of the correlation parameter U, and a value U = 2.6 eV gives the best agreement with the experiment. The positions of the minima may depend on U, except the four minima indicated by the vertical lines in Figure 8. A comparison with the experiment is provided in Figure 9. The qualitative trend up to N = 28 is reproduced approximately by the calculations. In the size range 9 ≤ N ≤ 28, the calculated moments are in reasonable agreement with the experimental results of Knickelbein,4 but they are smaller than the magnetic moments measured by Apsel et al.3 (the discrepancies between the two sets of experimental results are largest between N = 10 and N = 20). The calculated moments for 30 ≤ N ≤ 38 are larger, by 0.1–0.2 $\mu_B$/atom, than those of Apsel et al., and they display the minimum at Ni34. Finally, there is excellent agreement with experiment between N = 40
Figure 9 Comparison between the calculated magnetic moments of Ni clusters (dark squares) and the experimental results of Apsel et al.3 and Knickelbein.4 Reproduced with permission from Ref. 48.
and N = 60. The predicted minimum at Ni28 is not observed in the experiments, although there is a break in the decreasing behavior of $\bar{\mu}$ at Ni28. The minimum in the theoretical curve (see also Figure 2) is related to a structural transition in the model of icosahedral growth used in the theoretical calculations.46,48,51 The enhancement of $\mu_{\mathrm{orb}}$ in the clusters with respect to the bulk moment $\mu_{b,\mathrm{orb}}$ results from several contributions related to changes in the local environment of the atoms.69 The first contribution is the reduction of the local coordination number leading to an increase of the local spin polarizations, which through spin-orbit interactions induce large orbital moments. The second contribution is the orbital dependence of the intraatomic Coulomb interaction, which favors the occupation of states with high m and contributes to the enhancement of $\mu_{\mathrm{orb}}$ in clusters. The final contribution is the presence of degeneracies in the spectrum of one-electron energies that allows for a very effective spin-orbit mixing that enhances $\mu_{\mathrm{orb}}$. The magnetic moments of the Ni clusters are dominated by the contribution from surface atoms.48,69 The analysis of Wan et al. indicates that the orbital and spin local moments of cluster atoms with atomic coordination 8 or larger are similar to those in the bulk ($\mu_{b,\mathrm{spin}} \approx 0.55\ \mu_B$ and $\mu_{b,\mathrm{orb}} \approx 0.05\ \mu_B$);73 that is, the orbital moment is almost quenched for internal cluster atoms. In contrast, there is a large enhancement of the spin and orbital moments for atoms with coordination less than 8. This enhancement increases with the coordination deficit, and it is larger for the orbital moment. Wan et al.48 also analyzed the quantum confinement effect proposed by Fujima and Yamaguchi,56 i.e., the
Clusters of Other 3d Elements
sudden immersion of delocalized states from above the Fermi energy to below the d band when the number of delocalized electrons is enough to close an electronic shell. This effect is confirmed by the TB calculations of Wan et al. and is found to be relevant in small Ni clusters. However, as the cluster size increases, immersion below the d band seems to be gradual rather than sharp. In summary, the works of Guirado-López et al.69 and of Wan et al.48 have shown the importance of the orbital contribution to the magnetic moment of nickel clusters. The TB method provides a convenient framework to understand the variation of $\bar{\mu}(\mathrm{Ni}_N)$ with cluster size, and this understanding is good, although not perfect. In fact, the work of Andriotis and Menon,74 formulated in the TB framework, while supporting the idea that the enhanced orbital moments are aligned parallel to the spin moments in Ni clusters, also raises the possibility that these states are energetically unfavorable. This may be attributed to the interplay between the action of the spin-orbit interaction $H_{SO}$, which favors the alignment of L along the direction of S, and the action of the crystal field, which tends to align L along the easy magnetization axis (in materials with magnetic anisotropy, the magnetization is easier along a particular direction of the crystal). Another fact that makes the theoretical analysis difficult arises from the differences between the magnetic moments measured in different experiments for the same clusters.2–4
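As a worked illustration of the parameter relations quoted in this section for the d–d interaction, the following sketch inverts the Stoner relation I = (2lJ + U)/(2l + 1) for J and then splits J into Slater integrals using U = F0 and J = (F2 + F4)/14. It assumes that the fixed 0.625 ratio is F4/F2; the function and variable names are hypothetical, and the output is not claimed to be the parameter set actually used by Wan et al.

```python
def slater_parameters(U, I, f4_over_f2=0.625):
    """Return (J, F0, F2, F4) for d electrons (l = 2), in the units of U and I."""
    l = 2
    # Invert I = (2 l J + U) / (2 l + 1) for the exchange parameter J.
    J = ((2 * l + 1) * I - U) / (2 * l)
    F0 = U  # U = F0
    # Combine J = (F2 + F4) / 14 with the assumed ratio F4 = f4_over_f2 * F2.
    F2 = 14.0 * J / (1.0 + f4_over_f2)
    F4 = f4_over_f2 * F2
    return J, F0, F2, F4


if __name__ == "__main__":
    # Values quoted in the text: U = 2.6 eV and I = 1.12 eV.
    J, F0, F2, F4 = slater_parameters(U=2.6, I=1.12)
    print(f"J = {J:.3f} eV, F0 = {F0:.3f} eV, F2 = {F2:.3f} eV, F4 = {F4:.3f} eV")
```

With the quoted U and I, these relations give J = 0.75 eV, which shows how strongly the choice of U constrains the exchange parameter once I is fixed to the bulk value.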
CLUSTERS OF OTHER 3d ELEMENTS
Chromium and Iron Clusters
Chromium is an antiferromagnetic metal in the bulk phase, and the calculations of Cheng and Wang75 show that the Cr clusters also have a strong tendency toward antiferromagnetic spin ordering (although the fact that the number of atoms is small makes the distribution of the magnetic moments more complex). Small Cr clusters are "special" compared with clusters of the other 3d metals.75,76 The electronic configuration of the atom, $3d^5 4s^1$, has six unpaired electrons. This half-filled electronic configuration leads to strong d–d bonding in Cr2, with a bond length of 1.68 Å, which is very short compared with the interatomic distance of 2.50 Å in Cr metal. The dimer is a closed shell molecule with a strong sextuple bond.75 The strong binding arises from the filling of the 3d-bonding molecular orbitals: $\sigma_{3d}^2 \pi_{3d}^4 \delta_{3d}^4 \sigma_{4s}^2$ ($^1\Sigma_g^+$). The electronic structure of the dimer is robust and controls the growth of the small clusters. The optimized geometries of Cr clusters are given in Figure 10. Cr3 is composed of a dimer plus an atom: The electronic structure of the dimer is little affected by the presence of the third atom, which remains in its atomic electronic state, leaving six unpaired electrons in the cluster. A new pair forms
Figure 10 Optimized structures of CrN clusters, N = 2–15. Bond lengths are in Å. The arrows indicate the orientation of the local atomic spins. Strong dimer bonds are represented by thick lines. Reproduced with permission from Ref. 75.
by adding the fourth atom, and Cr4 is formed by two strong dimers with weak interdimer bonding. The dimerization effect controls growth up to N = 11; those clusters are formed by dimers with short bond lengths and one isolated atom (in Cr5, Cr7, Cr9), or two isolated atoms (in Cr10) bonded to adjacent dimers. The structure of Cr11 is similar to that of Cr10 with an atom at the cluster center. The dimer growth route stops at Cr11, at which point the bond lengths suddenly increase and dimer bonds can no longer be identified for N > 11. The arrows in Figure 10 indicate the orientation of the atomic spins. There is an anisotropic distribution of the magnetic moments, but the strong tendency to antiferromagnetic ordering is clear, especially as N increases. The local moments of the capping atoms are much larger than those of the dimerized atoms. The average magnetic moments of the small clusters
are as follows: Cr2 (0), Cr3 (2), Cr4 (0), Cr5 (0.93), Cr6 (0.33), Cr7 (0.29), Cr8 (0), Cr9 (0.22), Cr10 (0.2), Cr11 (0.55), Cr12 (1.67), Cr13 (1.06), Cr14 (0.74), and Cr15 (0.48), in units of $\mu_B$/atom. The dimer growth route leads to an odd–even alternation of the average magnetic moments: Small moments for clusters with even N and large moments for clusters with odd N. The large moments arise from the quasiatomic character of the capping atoms; the dimer-paired even-N clusters have low $\bar{\mu}$ because of the strong intradimer 3d–3d interaction. In most cases, the magnitudes of the calculated moments are within the upper limit of 1.0 $\mu_B$ imposed by the experiments,77,78 but for Cr12 and Cr13, the predicted $\bar{\mu}$ is larger than this limit. Fujima and Yamaguchi65 studied chromium clusters and iron clusters both with 15 and 35 atoms, assuming the body-centered cubic (bcc) structure of the bulk metals and a symmetric shape of a rhombic dodecahedron. For Cr, an alternation of the signs of the local moments as a function of the distance to the cluster center was found. The absolute values of the local moments decrease with increasing local coordination, and they also decrease for decreasing interatomic distance. The local moments of the Fe clusters are less sensitive to atomic coordination, although small magnetic moments were obtained for Fe atoms on the layer below the surface. Calculations allowing for noncollinear arrangements of the spins have been performed for small Fe and Cr clusters. Calculations for Fe2 and Fe4 by Oda et al.40 resulted in collinear ground states. The ground state of Fe3 is an equilateral triangle with a collinear spin arrangement. It has a total moment of 8.0 $\mu_B$ and a binding energy of $E_b$ = 2.64 eV/atom. A linear isomer with noncollinear arrangement was also found. The central atom has a moment of 1.27 $\mu_B$ oriented perpendicular to the linear axis, and the two edge atoms have moments of magnitude 2.89 $\mu_B$, tilted by 10° with respect to the cluster axis.
This isomer has a total moment of 2.04 $\mu_B$ and a binding energy of $E_b$ = 2.17 eV/atom. Two other linear isomers were also found with collinear ferromagnetic and antiferromagnetic configurations. The total moments of those two isomers are 6.0 $\mu_B$ and 0 $\mu_B$, and their binding energies are 1.80 and 2.15 eV/atom, respectively. A trigonal bipyramid structure (D3h symmetry) with a noncollinear spin arrangement was obtained for the ground state of Fe5. The three atoms of the basal plane have magnetic moments of 2.72 $\mu_B$ and point in the same direction. The two apical atoms have moments of magnitude 2.71 $\mu_B$ tilted in opposite directions by approximately 30° with respect to the moments of the basal atoms. The total moment of the cluster is 14.6 $\mu_B$, and its binding energy is 3.46 eV/atom. An isomer with D3h structure, lying 0.01 eV/atom above the ground state, was also found having a collinear spin arrangement with atomic moments of 2.58 $\mu_B$ and 2.55 $\mu_B$ for the basal and apical atoms, respectively. Kohl and Bertsch41 studied Cr clusters with sizes between N = 2 and N = 13 and obtained noncollinear arrangements for all cluster sizes except for N = 2 and N = 4. They suggested that the trend of noncollinear configurations is
Figure 11 Character of the arrangement of the spin magnetic moments, and average magnetic moment, in seven-atom clusters with a pentagonal bipyramid structure and interatomic distances ranging from dbulk to 80% dbulk . Reproduced with permission from Ref. 79.
likely a feature common to most antiferromagnetic clusters, because noncollinear effects are caused by frustration, that is, by the impossibility of forming perfect antiferromagnetic arrangements. The nature of the spin arrangement depends sensitively on the interatomic distances. A comparative study of Cr7, Mn7 (Mn clusters will be studied in detail in the next section), and Fe7 was made by Fujima79 by assuming a pentagonal bipyramid structure for the three clusters and varying the interatomic distance d from the bulk value $d_{bulk}$ down to a value 20% lower than $d_{bulk}$. The results are summarized in Figure 11. For Cr7 with $d_{bulk}$, the magnetic moments are arranged in a coplanar, noncollinear configuration; that is, the vector moments lie on the same plane but point in different directions. When d decreases, the magnetic moments are ordered in a parallel (P) fashion. The situation is similar in Mn7 for interatomic distances close to $d_{bulk}$; that is, the cluster shows a coplanar configuration of the spins. However, a non-coplanar configuration first appears when d decreases, which then changes to a collinear antiparallel (AP) configuration with a further decrease of d. Finally, the arrangement of the spins in Fe7 is parallel for $d \approx d_{bulk}$ and $d \approx 0.8\, d_{bulk}$, and noncollinear for d in between these limits. Similar work80 for five-atom clusters with the structure of a trigonal bipyramid indicates that noncollinear magnetic arrangements appear for Mn5 and Cr5 with interatomic distances close to $d_{bulk}$, which change to antiparallel arrangements with decreasing d. Another interesting result is that parallel magnetic moments appear for Ni5 (also for Co5 and Fe5) for almost all bond lengths between $d_{bulk}$ and 0.8 $d_{bulk}$.
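The odd–even alternation described earlier in this subsection for the dimer growth regime of Cr clusters can be checked directly from the average moments quoted in the text for Cr2 to Cr11. The sketch below is only a bookkeeping aid; the dictionary and function names are hypothetical.

```python
# Average magnetic moments (mu_B per atom) quoted in the text for Cr2 to Cr11,
# the size range in which growth proceeds through dimers.
CR_MOMENTS = {2: 0.0, 3: 2.0, 4: 0.0, 5: 0.93, 6: 0.33,
              7: 0.29, 8: 0.0, 9: 0.22, 10: 0.2, 11: 0.55}


def mean_moment(moments, parity):
    """Mean moment over cluster sizes N with N % 2 == parity."""
    values = [mu for n, mu in moments.items() if n % 2 == parity]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(f"even N: {mean_moment(CR_MOMENTS, 0):.2f} mu_B/atom")
    print(f"odd N:  {mean_moment(CR_MOMENTS, 1):.2f} mu_B/atom")
```

The odd-N mean (about 0.80 $\mu_B$/atom) is several times the even-N mean (about 0.11 $\mu_B$/atom), although individual neighbors such as Cr6 and Cr7 do not follow the alternation strictly.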
Manganese Clusters
Manganese is the 3d metal with the smallest bulk modulus and cohesive energy. It has a complex lattice structure with several allotropic forms. Some of these bulk phases are antiferromagnetic, whereas monolayers81 and supported clusters82 exhibit nearly degenerate ferromagnetic and antiferromagnetic states. The dimer is also peculiar.83 In contrast to other transition elements, the bond length of Mn2 is larger than the nearest-neighbor distance in the bulk. In addition, its estimated binding energy is between 0.1 and 0.6 eV, which puts Mn2 in a category similar to van der Waals molecules. These properties arise from the electronic configuration of the atom, $3d^5 4s^2$. The electrons of the half-filled 3d shell are well localized and do not interact with those of the other atom in the dimer. Binding arises from the interaction between the filled $4s^2$ shells. A nonmetal-to-metal transition occurs as the Mn clusters reach a critical size. From experiments of the reactivity with hydrogen, Parks et al.84 have suggested that this transition occurs at Mn16. The large magnetic moment of the free atom (5 $\mu_B$) and the weak interaction between the atoms in the dimer lead one to expect an interesting magnetic behavior for Mn clusters. Some measurements of the magnetic moments of Mn clusters containing fewer than ten atoms have been performed for clusters embedded in matrices. Electron spin paramagnetic resonance (ESR) experiments of Mn2 in condensed rare gas matrices yield an antiferromagnetic configuration, but $\mathrm{Mn}_2^+$ is ferromagnetic, with a total magnetic moment of 11 $\mu_B$.85 A moment of 20 $\mu_B$ has been measured for Mn4 in a silicon matrix.86 $\mathrm{Mn}_5^+$ embedded in inert gas matrices has a moment of 25 $\mu_B$, although the cluster actually studied could be larger.87 We close by noting that neutral Mn2 is antiferromagnetic, whereas the other Mn clusters are ferromagnetic. The computational results on small Mn clusters are also controversial.
An early Hartree–Fock study of Mn2 predicted a $^1\Sigma_g^+$ ground state resulting from the antiferromagnetic coupling of the localized spins.88 Fujima and Yamaguchi89 used DFT to study clusters of size Mn2 to Mn7. The interatomic distances were optimized for constrained geometries, and all clusters were predicted to show antiparallel spin ordering. Nayak and Jena90 optimized the structures of clusters with N ≤ 5 at the LSDA and GGA levels of theory. Only the GGA calculations reproduce some of the observed features of Mn2, namely a bond length larger than the nearest-neighbor distance in the bulk and a small binding energy (the calculated bond length is 6.67 a.u., and the binding energy is 0.06 eV). The cluster is predicted to be ferromagnetic with a total magnetic moment of 10 $\mu_B$. The binding energy increases in $\mathrm{Mn}_2^+$, and the bond length decreases relative to Mn2 because the electron is removed from an antibonding orbital. The total magnetic moment of $\mathrm{Mn}_2^+$ is 11 $\mu_B$, in agreement with the experimental estimation for clusters in rare gas matrices.
The optimized geometries of Mn3, Mn4, and Mn5 obtained by Nayak and Jena90 are an equilateral triangle, a Jahn–Teller distorted tetrahedron, and a trigonal bipyramid, respectively. The strength of the bonding increases relative to the dimer because of s–d hybridization. The predicted geometries are consistent with those deduced from experiments in matrices. The hyperfine pattern observed for Mn4 embedded in a silicon matrix86 indicates that the four atoms are equivalent, as would occur in a tetrahedron. The triangular bipyramid is one of the possible structures of Mn5 consistent with the ESR measurements.87 The calculated interatomic distances decrease substantially from Mn2 to Mn3, which signals the onset of delocalization and hybridization between atomic orbitals at various sites. But the most striking property of these clusters is their ability to retain their atomic moments. Mn3, Mn4, and Mn5 in their ground state are predicted to be ferromagnetic, with moments of 5 $\mu_B$ per atom (low-lying structural isomers are also ferromagnetic). Experiments for thin layers support the possibility of large moments.91,92 The calculations of Pederson et al.93 provide additional insight into the magnetism of small Mn clusters. These authors studied Mn2 using LDA and GGA functionals and concluded that the manganese dimer is ferromagnetic with a total moment of 10 $\mu_B$, a bond length of 4.93 a.u., and a binding energy of 0.99 eV. They also found an antiferromagnetic state whose properties, a binding energy of 0.54 eV and a bond length of 5.13 a.u., are closer to those of Mn2 in condensed rare gas matrices. A plausible resolution of the discrepancies for Mn2 offered by Pederson et al.93 is that the ferromagnetic state is the true ground state of free Mn2 but that the interaction with the condensed rare gas matrix may stretch the bond, which leads to the appearance of an antiferromagnetic state in the embedded cluster.
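Because the Mn2 bond lengths are quoted in atomic units while the Cr bond lengths are given in Å, a small conversion helper makes the comparison easier. The sketch below uses the standard value of the bohr radius (about 0.529177 Å); the dictionary labels are shorthand introduced here, not notation from the cited works.

```python
BOHR_IN_ANGSTROM = 0.529177  # one atomic unit of length (bohr) in angstrom


def bohr_to_angstrom(length_au):
    """Convert a length from atomic units (bohr) to angstrom."""
    return length_au * BOHR_IN_ANGSTROM


# Calculated Mn2 bond lengths quoted in the text, in atomic units.
MN2_BOND_LENGTHS_AU = {
    "GGA (Nayak and Jena)": 6.67,
    "ferromagnetic (Pederson et al.)": 4.93,
    "antiferromagnetic (Pederson et al.)": 5.13,
}

if __name__ == "__main__":
    for label, d_au in MN2_BOND_LENGTHS_AU.items():
        print(f"{label}: {d_au:.2f} a.u. = {bohr_to_angstrom(d_au):.2f} angstrom")
```

In these units the calculated Mn2 bond lengths range from roughly 2.6 Å to 3.5 Å, much longer than the 1.68 Å quoted for Cr2.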
However, very recent calculations by Yamamoto et al.94 using a high-level ab initio method (second-order quasidegenerate perturbation theory,95 MCQDPT2) predict antiferromagnetic coupling for the Mn dimer. Larger clusters were also studied by Pederson et al.93 Mn3 has different magnetic states close in energy. The ground state is an isosceles triangle in a ferromagnetic configuration with a total moment of 15 $\mu_B$. A frustrated antiferromagnetic state also exists with the atomic spins of the shorter side of the triangle antiferromagnetically coupled to the third atom, whereas the first two atoms are ferromagnetically aligned (perfect antiferromagnetism is impossible in the triangular structure because the moments of two atoms necessarily point in the same direction; this represents a frustration of the tendency to antiferromagnetism). This state, with a net magnetic moment of 5 $\mu_B$, is only 0.014 eV above the ground state. Mn4 is a tetrahedron with a total moment of 20 $\mu_B$. The calculations predict a trigonal bipyramid as the ground state of Mn5 with a net moment of 23 $\mu_B$, which is lower than the measured value of 25 $\mu_B$.87 Trigonal bipyramid and square pyramid states with moments of 25 $\mu_B$ were found 0.62 eV and 1.20 eV above the ground state, respectively. Pederson and coworkers concluded that either the matrix influences the
Table 1. Calculated Average Bond Distance d, Number of Bonds Per Atom NB, Magnetic Moment Per Atom $\bar{\mu}$, Binding Energy Per Atom Eb, and Spin Gaps $\Delta_1$ and $\Delta_2$ of MnN Clusters.93

N    d (a.u.)    NB     $\bar{\mu}$ ($\mu_B$)    Eb (eV)    $\Delta_1$ (eV)    $\Delta_2$ (eV)
2    4.927       0.5    5.0                      0.50       0.65               1.30
3    5.093       1.0    5.0                      0.81       0.46               1.38
4    5.162       1.5    5.0                      1.19       0.62               2.31
5    5.053       1.8    4.6                      1.39       0.50               0.79
6    5.002       2.0    4.3                      1.56       0.90               1.13
7    4.970       2.1    4.2                      1.57       0.70               0.47
8    4.957       2.2    4.0                      1.67       0.93               0.37
ground state multiplicity of Mn5 or the cluster formed in the experiment is other than Mn5; the latter possibility had also been admitted in the original experimental work.87 A square bipyramid and a pentagonal pyramid were investigated for Mn6. The total moments are 26 $\mu_B$ and 28 $\mu_B$, respectively, and this cluster was proposed as a possible candidate for the cluster with $\mu = 25\ \mu_B$ observed in the ESR experiments. Table 1 gives the results of Pederson et al.93 for the average bond distance, the number of bonds per atom, the magnetic moment, and the binding energy of Mn2 to Mn8. Also given are the two spin gaps $\Delta_1 = -(\varepsilon^{\mathrm{majority}}_{\mathrm{HOMO}} - \varepsilon^{\mathrm{minority}}_{\mathrm{LUMO}})$ and $\Delta_2 = -(\varepsilon^{\mathrm{minority}}_{\mathrm{HOMO}} - \varepsilon^{\mathrm{majority}}_{\mathrm{LUMO}})$, which represent the energy required to move an electron from the HOMO of one spin sub-band to the LUMO of the other. The two spin gaps must be positive for the system to be magnetically stable. In a Stern–Gerlach deflection experiment, Knickelbein96 measured the magnetic moments of free MnN clusters for sizes between N = 11 and N = 99. The magnetic moments were obtained from Eq. [2] assuming superparamagnetic behavior. The moment $\bar{\mu}$ shows local minima for N = 13 and N = 19, which suggest icosahedral growth in that size range; for larger sizes, $\bar{\mu}$ shows a minimum in the region Mn32–Mn37 and a broad maximum in the region Mn47–Mn56 followed by a weak minimum at Mn57. The maximum value of the magnetic moment found in the experiment was $\bar{\mu}(\mathrm{Mn}_{15}) = 1.4\ \mu_B$, which is substantially smaller than the calculated moments given in Table 1, and this result is puzzling. The interpretation of the experimental results has been challenged by Guevara et al.97 They performed TB calculations for Mn clusters up to Mn62 using several model structures (icosahedral, bcc, and fcc), and they obtained several magnetic solutions for each cluster size and structure. In general, the magnetic moments are not ferromagnetically aligned.
A comparison of the experimental and calculated moments led to the suggestion that the structures are mainly icosahedral for N < 30, and that bcc structures begin to compete with icosahedral structures for larger clusters. Jena and coworkers98,99 arrived at similar conclusions for the magnetic ordering: Non-ferromagnetic ordering is responsible for the small moments measured for the Mn clusters. The non-ferromagnetic ordering was proposed to be ferrimagnetic: That is,
Figure 12 Measured magnetic moments per atom of MnN clusters with N between 5 and 22. Reproduced with permission from Ref. 103.
the magnitudes of the moments at the different atomic sites are different, the number of atoms with ↑ and ↓ spins are unequal, or both. This proposal is supported by the most recent DFT calculations for Mn13,99–102 by a combined experimental and theoretical analysis of Mn7,99 and by the most recent Stern–Gerlach deflection experiments for free clusters performed by Knickelbein103 for N = 5–22. The results of the latter experiments are given in Figure 12, where one can again see that the values of $\bar{\mu}$ are small. In summary, from the latest experimental103 and theoretical98–102 works, a clearer picture of the magnetic properties of Mn clusters is emerging. For the smaller cluster sizes (N ≤ 6), a strong competition exists between ferromagnetic and antiferromagnetic ordering of the atomic moments, which results in a near degeneracy between the two types of ordering. The calculations of Bobadova-Parvanova et al.101 clearly illustrate this competition. Mn2 is ferromagnetic, with a total $\mu = 10\ \mu_B$, but an antiferromagnetic state lies only 0.44 eV above it in energy. Mn3 is ferromagnetic with $\mu = 10\ \mu_B$, but an antiferromagnetic state with a similar triangular structure and a net moment of 5 $\mu_B$ exists only 0.05 eV higher in energy. The ground state of Mn4 has a tetrahedral structure and is ferromagnetic with $\mu = 20\ \mu_B$, but antiferromagnetic states with a similar tetrahedral structure exist 0.11 eV and 0.24 eV higher in energy, respectively. Mn5 is antiferromagnetic with $\mu = 3\ \mu_B$ (the structure and distribution of atomic magnetic moments are shown in Figure 13), but a ferromagnetic and two other antiferromagnetic states lie within a small energy range of 0.05 eV above the ground state. The ground state, with $\bar{\mu} = 0.6\ \mu_B$, explains the result obtained in the Stern–Gerlach experiments of Figure 12. On the other hand, the ferromagnetic state with $\mu = 23\ \mu_B$ could explain the result
obtained for Mn5 embedded in a matrix. Mn6 has three nearly degenerate octahedral structures competing for the ground state. The lowest energy state has an antiferromagnetic spin arrangement with a net magnetic moment per atom of 1.33 μB. The other two states are only 0.03 eV higher in energy: one is antiferromagnetic, with μ̄ = 2.66 μB, and the other is ferromagnetic, with μ̄ = 4.33 μB. Another antiferromagnetic state with an octahedral structure and μ̄ = 0.33 μB lies 0.08 eV above the ground state. The distribution of atomic moments for this isomer is given in Figure 13. The calculations of Jones et al.102 lead to the same picture outlined here for the ground state and the low-lying isomers. Knickelbein103 has interpreted his Stern–Gerlach result μ̄(Mn6) = 0.55 μB as possibly arising from the contribution of several
Figure 13 Ground state structures and local spin magnetic moments (in μB) of Mn5 and Mn7 determined by DFT calculations. For Mn6, the structure and local moments correspond to a relevant isomer 0.07 eV above the ground state. Some bond lengths are also given, in Å. Reproduced with permission from Ref. 103.
isomers in the experiment: the isomer with μ̄ = 0.33 μB and one or more of the higher-moment isomers. The structure of Mn7 is a distorted pentagonal bipyramid (see Figure 13), and the magnitude of the local moments is about 5 μB, but the coupling is ferrimagnetic and the net magnetic moment of the cluster is only 0.71 μB per atom, in good agreement with the Stern–Gerlach experiment (μ̄ = 0.72 ± 0.42 μB). This ferrimagnetic coupling is representative of the situation for N > 6, which is corroborated by calculations for Mn13 and larger clusters.100–102 Although the local atomic moments are in the range 3.5–4 μB, the tendency toward antiferromagnetic ordering leads to ferrimagnetic structures with magnetic moments of 1 μB per atom or less.
All calculations for Mn clusters described above assumed collinear spin configurations. A few calculations have been performed for small Mn clusters that allow for noncollinear arrangements of the spins. Mn7 has been discussed above. Using the DFT code SIESTA,104 Longo et al.105 found four antiferromagnetic states for Mn6 (with octahedral structure), in good agreement with collinear-constrained calculations;101 however, the ground state has a noncollinear spin configuration with a total binding energy 0.46 eV larger than that of the most stable antiferromagnetic isomer. The net magnetic moment of this noncollinear structure is 5.2 μB, corresponding to 0.78 μB per atom, still a little larger than the experimental magnetic moment of 0.55 μB per atom given in Figure 12.
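As a rough consistency check on the ferrimagnetic picture of Mn7 described above, one can idealize the cluster as N↑ atoms carrying a moment +μ_loc and N↓ atoms carrying −μ_loc. The equal magnitudes and the 4:3 split between ↑ and ↓ atoms are assumptions made here for illustration only; they are not values taken from the cited calculations.

```latex
\bar{\mu} \;=\; \frac{(N_\uparrow - N_\downarrow)\,\mu_{\mathrm{loc}}}{N}
          \;=\; \frac{(4 - 3)\times 5\,\mu_B}{7}
          \;\approx\; 0.71\,\mu_B
```

Under these assumptions, local moments of about 5 μB are compatible with a net moment of only about 0.71 μB per atom, because all but one of the atomic moments cancel in pairs.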
CLUSTERS OF THE 4d ELEMENTS
The 4d metals are nonmagnetic in the bulk phase. However, the free atoms are magnetic, and consequently, it is reasonable to expect that small clusters of these elements could be magnetic. Experiments5,106 show that Rh clusters with fewer than 60 atoms and Ru and Pd clusters with fewer than 12 atoms are magnetic. Several calculations have investigated the magnetism of those clusters assuming model structures. In particular, trends across the 4d period of the periodic table have been studied by performing DFT calculations for six-atom clusters with octahedral structure,107 and the magnetic moments are given in Table 2. All clusters, except Y6, Pd6, and Cd6, have finite magnetic moments, and the largest moments occur for Ru6 and Rh6 (1.00 μB/atom and 0.99 μB/atom, respectively). The large moments of these two clusters arise because the density of electronic states (DOS) shows a large peak in the region of the Fermi level. Just a small exchange splitting (the shift between the ↑ and ↓ spin sub-bands) then produces a sizable difference between the populations of electrons with ↑ and ↓ spins. Ru6, Rh6, and Nb6 have the largest exchange splittings. In contrast to the small clusters, the Fermi levels of the bulk metals lie in a dip of the DOS. The main contribution to the DOS comes from the d electrons, which gives support to models in which the effect of the sp electrons has been neglected. Two factors contribute to the large DOS near the Fermi
Table 2. Binding Energy Per Atom Eb, Distance D from Atoms to the Cluster Center, and Average Magnetic Moment Per Atom μ̄ for Octahedral Six-Atom Clusters. Data Collected from Zhang et al.107
Cluster   Eb (eV)   D (a.u.)   μ̄ (μB)
Y         3.53      4.40       0.00
Zr        5.23      3.96       0.33
Nb        5.07      3.64       0.67
Mo        4.05      3.40       0.33
Tc        4.91      3.36       0.33
Ru        4.70      3.40       1.00
Rh        4.03      3.48       0.99
Pd        3.14      3.50       0.00
Ag        1.56      3.76       0.33
Cd        0.39      4.48       0.00
energy. First, the bandwidth in the cluster is narrower than in the solid, because of the reduced atomic coordination. Second, high symmetry is assumed in the calculation. The latter effect suggests that some magnetic moments of Table 2 may be overestimated.
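The link between a high DOS at the Fermi level and a large moment can be illustrated with a minimal rigid-band sketch. This is not the DFT calculation of Ref. 107: the Gaussian DOS shape, the two band widths, the half filling, and the 0.2 eV exchange splitting are all hypothetical values chosen only to show the trend. The ↑ and ↓ sub-bands are shifted rigidly in opposite directions, the Fermi level is adjusted to conserve the electron count, and the moment is the difference between the two populations.

```python
import math

def gaussian_dos(energy, width, states=5.0):
    """Model d-band DOS per spin (states/eV), centered at 0 and holding `states` electrons."""
    norm = states / (width * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * (energy / width) ** 2)

def occupation(width, e_fermi, shift, lo=-20.0, steps=2000):
    """Electrons in one spin sub-band rigidly shifted by `shift`, filled up to e_fermi."""
    if e_fermi <= lo:
        return 0.0
    h = (e_fermi - lo) / steps
    # Midpoint rule; the DOS of the shifted band at energy E is dos(E - shift).
    return h * sum(gaussian_dos(lo + (k + 0.5) * h - shift, width) for k in range(steps))

def moment(width, n_electrons, splitting):
    """Spin moment (in Bohr magnetons) for a given exchange splitting between sub-bands."""
    up, down = -0.5 * splitting, +0.5 * splitting  # majority band moves down in energy
    a, b = -10.0, 10.0
    for _ in range(50):  # bisection on the Fermi level to conserve the electron count
        mid = 0.5 * (a + b)
        total = occupation(width, mid, up) + occupation(width, mid, down)
        if total < n_electrons:
            a = mid
        else:
            b = mid
    e_fermi = 0.5 * (a + b)
    return occupation(width, e_fermi, up) - occupation(width, e_fermi, down)

# Same splitting, same filling; only the band width (hypothetical values, in eV) differs.
narrow = moment(width=0.3, n_electrons=5.0, splitting=0.2)  # cluster-like narrow band
broad = moment(width=1.5, n_electrons=5.0, splitting=0.2)   # bulk-like broad band
print(f"narrow band: {narrow:.2f} mu_B, broad band: {broad:.2f} mu_B")
```

With the same small splitting, the narrow band yields a moment several times larger than the broad band, which is the qualitative argument made in the text for band narrowing in clusters.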
Rhodium Clusters
Experiments on Rh clusters5,106 reveal an oscillatory pattern of the average magnetic moment, with large values for N = 15, 16, and 19, and local minima for N = 13–14, 17–18, and 20. DFT calculations have been performed for selected clusters in that size range, usually assuming symmetric structures except for the smallest clusters.108–112 The conclusion reached by the various researchers is that the Rh clusters are magnetic. However, different experiments for the same cluster size show a lot of dispersion. The self-consistent TB method has been employed to study several Rh clusters in the size range N = 9–55 atoms.113 Only the d electrons were taken into account, and model structures, restricted to be fcc, bcc, or icosahedral, were assumed, although relaxation of bond lengths that preserves the cluster symmetry was allowed. Bond length contractions of 2% to 9% with respect to the bulk were found, and these affect the magnetic moments. The magnetic moments oscillate and tend to decrease with increasing N, and the structures predicted as being most stable by the TB calculation lead to consistent agreement with the measured magnetic moments. The largest cohesive energy of Rh9 (2.38 eV/atom) was found for a twisted double-square, capped in the form of a pyramid. This Rh9 structure has a magnetic moment of μ̄ = 0.66 μB, in good agreement with the measured value of μ̄ = 0.8 ± 0.2 μB. The icosahedral and the fcc structures are degenerate for Rh11, although only the magnetic moment of the icosahedral isomer (μ̄ = 0.73 μB) is consistent with the experiment (μ̄ = 0.8 ± 0.2 μB). The most stable structure
of Rh13 is bcc with μ̄ = 0.62 μB, in better agreement with experiment (μ̄ = 0.48 ± 0.13 μB) than the other structures considered. Fcc structures are predicted in the size range 15 ≤ N ≤ 43, and the observed trends in the magnetic moments are reproduced, i.e., local minima of μ̄ at N = 13 and N = 17, and local maxima at N = 15 and N = 19. The magnetic moments, however, are larger than the experimentally measured values. Other structures fail to reproduce those oscillations, which further suggests that the geometrical structure in the size range from 15 to 20 atoms may be fcc. Rh55 is icosahedral, and its nonmagnetic character is also consistent with the experiment. Regarding the distribution of the magnetic moments, the bcc isomers order ferromagnetically and the atomic moments tend to increase when going from the cluster center to the surface atoms. On the other hand, the distribution in fcc and icosahedral structures is more complex and the magnetic order is sometimes antiferromagnetic, with the local moments changing sign between adjacent shells. A similar behavior has been predicted for Rh fcc surfaces and films.114,115 The effect of the sp electrons was analyzed for Rh13: the local moments show some sensitivity to sp–d hybridization, but the total magnetic moment of the cluster is not altered. In another TB calculation116 for Rh13, Rh19, Rh43, Rh55, and Rh79 with fcc structures, ferromagnetic ordering was found for Rh13, Rh19, and Rh43, and antiferromagnetic configurations for Rh55 and Rh79. The magnetic moments of the two largest clusters are very close to the experimental values, and this was interpreted as supporting fcc structures for N > 40. The magnetic-to-nonmagnetic transition was estimated at N ≈ 80.
Rh4 was investigated to study the relationship among magnetism, topology, and reactivity.117 Working at the GGA level of DFT, the ground state was found to have a nonmagnetic tetrahedral structure.
The cluster also has a magnetic isomer that is a square with a moment of 1 μB/atom, 0.60 eV/atom less stable than the ground state. The difference in the magnetic character can be attributed to the different atomic coordination in the isomers: three in the tetrahedron and two in the square. More insight is obtained from the analysis of the distribution of the electronic energy levels. The square isomer of Rh4 has a larger number of states near the HOMO, and work for extended systems has shown that a large density of states near the Fermi energy usually leads to magnetic structures. By simulating the reaction of those two isomers with molecular hydrogen, the following conclusions were obtained: (1) H2 dissociates and binds atomically to both isomers, (2) the H2 binding energy to the nonmagnetic isomer is larger by a factor of 2, and (3) the spin multiplicities of the two isomers change upon reaction with H2. These results imply that the reactivity of transition metal clusters may depend sensitively on both their magnetic structure and their topology. In fact, the existence of isomers has been detected in reactivity experiments of some clusters.54,118,119 In the current case, only the magnetic isomer of Rh4, with the predicted structure of a square, can be deflected in a Stern–Gerlach magnet. On the other
hand, the two reacted forms of Rh4H2 are magnetic and have different spin multiplicities. Consequently the two reacted clusters will be deflected by different amounts in a Stern–Gerlach field, which provides a route to test the theoretical predictions on the relation among magnetism, topology, and reactivity in Rh4.
Ruthenium and Palladium Clusters
Density functional108,120 and TB calculations113,121 have been performed for ruthenium clusters. Antiferromagnetic ordering of the magnetic moments is preferred for most structures studied. The TB method predicts lower average moments compared with DFT, which are in better agreement with the experimental upper limits,5,106 but the sp electrons were not included in the calculations. The magnetic-to-nonmagnetic transition is estimated to occur around Ru19, which is in qualitative agreement with the experimental bound of N ≥ 13.
The experiments of Cox et al.5,106 set upper limits of 0.40 μB/atom for the average magnetic moment of Pd13 and 0.13 μB/atom for Pd105. DFT calculations support the existence of small magnetic moments in Pd clusters.122–124 Calculations by Moseler et al.124 for neutral clusters with N ≤ 7 and N = 13 predict a monotonic decrease of μ̄ between Pd2 (μ̄ = 1 μB) and Pd7 (μ̄ = 0.3 μB), and an unexpectedly high value of 0.62 μB for Pd13. Negatively charged clusters are more complex. The magnitude of μ̄ oscillates and is relatively large for N = 5, 7, and 13 (μ̄ = 0.6, 0.7, and 0.54 μB, respectively). The total magnetic moment arises from sizable local atomic moments of magnitude 0.3–0.6 μB. These moments couple antiferromagnetically in some cases and align ferromagnetically in other cases.
EFFECT OF ADSORBED MOLECULES
The electronic structure of a cluster is perturbed by the presence of molecules adsorbed on the cluster surface. A striking example is the quenching of the magnetic moments of Ni clusters caused by the adsorption of CO.125 Magnetic deflection experiments for NiNCO clusters with N = 8–18 reveal that the presence of just a single CO molecule reduces the magnetic moment of most of those clusters.126 The quenching effect is particularly large for Ni8, Ni9, Ni15, and Ni18. For instance, the total magnetic moment of Ni8 is reduced by 5 μB, that is, 0.63 μB per atom. Nickel cluster carbonyl complexes like [Ni9(CO)18]2− display vanishing magnetic susceptibilities, revealing Ni moments of 0 μB. Calculations for [Ni6(CO)12]2−, [Ni32(CO)32]n−, [Ni44(CO)48]n−, and other complexes predict low-spin structures, which is consistent with the very low magnetic susceptibilities measured for macroscopic samples of these compounds.125,127 The proposed explanation is that
ligands with σ lone pairs, like CO, interact repulsively with the diffuse 4sp electrons of the Ni atoms, inducing an electronic transition of the type 3d9 4s1 → 3d10 that causes the filling of the atomic 3d shell. The calculations show that this repulsive destabilization occurs even when the Ni cluster is covered by a shell of inert He atoms.66 DFT studies of the adsorption of NH3 on NiN clusters with N = 1–4 also indicate that the adsorbed molecules have a significant effect on the magnetism: a decrease of the Ni moments is predicted, and the moments are completely quenched when the number of NH3 molecules equals the number of Ni atoms.128 The nitrogen atom binds directly to a Ni atom, and the quenching of the magnetic moment of Ni results from the short distance between the Ni and N atoms in the Ni–N bond. When the number of molecules is larger than the number of Ni atoms, the Ni–N bonds become stretched because of steric hindrance. Once Ni–N distances exceed the critical distance of 3.59 a.u., magnetism reappears.
Adsorbed species can also increase the magnetic moments of ferromagnetic clusters. The magnetic moments of free and hydrogenated iron clusters measured by Knickelbein129 are shown in Figure 14. The Fe clusters become saturated with a layer of dissociatively chemisorbed hydrogen under the conditions of the experiment. For most cluster sizes studied, the FeNHm clusters have larger magnetic moments than the corresponding pure FeN clusters, and the enhancement is particularly large between N = 13 and N = 18. This result contrasts with analogous studies for Ni clusters; in that case, quenching of the magnetic moments is observed after hydrogenation.129
Figure 14 Measured magnetic moments of FeN (circles) and FeNHm (squares). Adapted with permission from Ref. 129.
DETERMINATION OF MAGNETIC MOMENTS BY COMBINING THEORY AND PHOTODETACHMENT SPECTROSCOPY
The measurement of the magnetic moment of very small clusters by Stern–Gerlach deflection techniques is not simple. In such cases, the total magnetic moment is also small and the deflection in the magnetic field may lie within the error of the experiment. Motivated by this difficulty, an alternative method to determine the magnetic moments has been proposed by Khanna and Jena,130 based on combining calculations for the neutral and negatively charged (anionic) species, XN and XN−, respectively, with electron photodetachment spectroscopy experiments for the anionic cluster. Let us consider a ferromagnetic anionic cluster that has n unpaired spins and, thus, a magnetic moment nμB and multiplicity M = n + 1. When an electron is detached from the anion, the neutral cluster has a multiplicity of M + 1 if the electron is removed from the minority band, or M − 1 if the electron is removed from the majority band. The measured photoelectron energy peaks can be compared with theoretical calculations in which one first determines the ground state of the anion, including its spin multiplicity M, and then the energy for the transition to the neutral species with multiplicities M + 1 and M − 1 at the anion geometry. Quantitative agreement between the calculated energies and the observed spectral peaks indicates that the calculated multiplicity must be correct.
The Khanna–Jena method has been applied to Ni5.130 The photoelectron spectrum of Ni5−, measured by Wang and Wu,131 shows a prominent and broad peak at 1.80 eV and a minor peak at 2.11 eV. A careful investigation was performed, using DFT with the GGA for exchange and correlation, of the equilibrium structures of anionic Ni5− corresponding to spin multiplicities M = 2, 4, 6, 8, and 10, and of neutral Ni5 with spin multiplicities M = 1, 3, 5, 7, and 9.
The ground state structure of the neutral cluster is a square pyramid with spin multiplicity M = 7 (total magnetic moment of 6 μB). This state is almost degenerate with an isomer having the structure of a distorted trigonal bipyramid and M = 5 (μ = 4 μB). In the case of Ni5−, the structure, for all of the spin multiplicities studied, is a slightly distorted square pyramid. The ground state has M = 8, and this can only arise by adding an electron to the majority-spin band of neutral Ni5 with M = 7 (which is precisely the ground state of Ni5). The structure of Ni5− with M = 6 has an energy only 0.05 eV above the ground state, so both isomers with M = 6 and 8 are expected to exist in the beam. The calculated vertical transition energies from the anionic to the neutral cluster are plotted in Figure 15. The transitions from the ground state of the anionic cluster (with M = 8) to states of the neutral cluster with the anion geometry and M = 7 and 9 (the transition energies are obtained as a difference of the total energies of the corresponding clusters) are shown on the left side of Figure 15. These transitions yield energies of 1.64 eV and 2.21 eV. On the other hand, the transitions from the M = 6 state of Ni5− yield energies of 1.58 eV and 1.79 eV. It
Figure 15 Transitions from the Ni5− anionic isomers with spin multiplicity M to the corresponding neutrals with multiplicities differing by 1 from the anion. Adapted with permission from Ref. 130.
is plausible that the broad peak reported in the experiments originates from transitions from both isomers of Ni5−, whereas the peak at 2.11 eV can only arise from the state of Ni5− with M = 8.
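The bookkeeping behind this assignment can be sketched in a few lines. The snippet below only encodes the multiplicity rules and the calculated transition energies quoted in the text for Ni5−; the function names and the 0.15 eV matching tolerance are hypothetical choices made for illustration, and the sketch is not the procedure of Ref. 130 itself.

```python
def neutral_multiplicities(anion_multiplicity):
    """Multiplicities reachable by detaching one electron from an anion of multiplicity M."""
    m = anion_multiplicity
    # Removing a majority-spin electron gives M - 1; a minority-spin electron gives M + 1.
    return (m - 1, m + 1)

def moment_from_multiplicity(multiplicity):
    """Spin moment in Bohr magnetons: n unpaired spins correspond to M = n + 1."""
    return multiplicity - 1

# Calculated vertical transition energies (eV) quoted in the text for Ni5-,
# keyed by the spin multiplicity of the anionic isomer.
calculated = {8: [1.64, 2.21], 6: [1.58, 1.79]}

def anion_states_matching(peak, tolerance):
    """Anion multiplicities with at least one calculated transition close to a measured peak."""
    return sorted(
        m for m, energies in calculated.items()
        if any(abs(energy - peak) <= tolerance for energy in energies)
    )

minor_peak = 2.11  # eV, measured minor peak quoted in the text
candidates = anion_states_matching(minor_peak, tolerance=0.15)  # tolerance is an assumption

for m in candidates:
    print(m, neutral_multiplicities(m), moment_from_multiplicity(m))
```

With this tolerance, only the M = 8 anion has a calculated transition near the 2.11 eV peak, in line with the assignment made in the text.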
SUMMARY AND PROSPECTS
The magnetic properties of small clusters of the transition elements are often different from those of the same material in the macroscopic bulk. This difference arises because magnetism is very sensitive to the density of electronic states in the energy region around the Fermi level of the system, and the density of states in a cluster is strongly affected by the confinement of the electrons in a small volume. The atoms forming the cluster surface have a different local environment compared with the bulk-like atoms and thus a different local density of states. In addition, the geometrical structure of small clusters changes as the size of the cluster increases. These effects lead to a complex and nonmonotonic variation of the ordering of the atomic magnetic moments as the cluster size increases. The magnetic moment per atom μ̄ of small magnetic clusters is higher than the magnetic moment per atom in the bulk metal. μ̄ decreases as the cluster size increases but not in a smooth way. Instead, μ̄ displays oscillations superimposed on that overall decrease, before converging to the value for the bulk metal. Moreover, nonzero magnetic moments have been measured in clusters of some metals that are nonmagnetic in the bulk phase.
Many experimental studies of the magnetism in transition metal clusters use the method of Stern–Gerlach deflection of a cluster beam in an inhomogeneous magnetic field. Two computational methods have been mainly used to help in the interpretation of the experimental results. One is the tight binding method, and the other is density functional theory in its spin-polarized version. Both methods are reviewed in this chapter, and their performance is illustrated by showing several applications to the study of the magnetic properties of clusters of the 3d and 4d elements of the periodic table. In general, the two
methods are successful in the description of the magnetic ordering of transition metal clusters. However, both methods make approximations in the treatment of the electronic correlations, and because of those approximations, there are conflicting cases that resist a conclusive analysis; the magnetic ordering in Mn2 is a good example.
The well-known ferromagnetic and antiferromagnetic orderings typical of many materials in the bulk phase become more complex in clusters. For instance, for a material with a tendency to antiferromagnetic ordering of the atomic spins, a perfect antiferromagnetic configuration is not possible in a trimer with the geometry of a triangle, because two of the three magnetic moments must necessarily point in the same direction. This simple example of magnetic frustration is induced by the finite number of atoms of the system. This type of frustration occurs in many clusters with a tendency to antiferromagnetic ordering. Sometimes the magnetic frustrations cost a sizable amount of energy and the magnetic moments reorder by pointing toward different directions in space in order to reduce the cluster energy; this is called a noncollinear magnetic configuration. Current improvements in the theoretical tools allow one to study noncollinear magnetic ordering in clusters, and this is one of the recent trends in the literature. As a consequence of those improved studies, it is expected that the results of some previous calculations and the interpretation of some experiments will have to be revised in light of possible noncollinear magnetic arrangements. Many experiments, so far, have been interpreted by taking into account the spin magnetism only. However, recent work has pointed out the importance of orbital magnetism and of the spin–orbit coupling.
A good example is the deep insight into the evolution of the magnetic moment of nickel clusters as a function of the cluster size obtained by taking into account the effects of orbital magnetism.48 However, the general relevance of this effect has not yet been assessed, and more work is required. To summarize, the magnetic characteristics of small clusters of the transition metals vary in a nonmonotonic way as a function of the number of atoms in the cluster. This nonscalable behavior is what makes small clusters interesting and complex at the same time, offering possibilities for future technological applications.
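The triangular-trimer frustration argument made in this summary can be checked with a toy model. The sketch below uses a classical Heisenberg energy with unit spins confined to a plane, which is a deliberately simplified stand-in and not the tight binding or DFT machinery reviewed in this chapter; the coupling constant J and the function names are hypothetical.

```python
import math
from itertools import product

J = 1.0  # antiferromagnetic coupling strength (J > 0), arbitrary units; hypothetical value
PAIRS = [(0, 1), (1, 2), (0, 2)]  # the three bonds of a triangular trimer

def energy(angles):
    """E = J * sum over bonds of cos(theta_i - theta_j) for unit spins lying in one plane."""
    return J * sum(math.cos(angles[i] - angles[j]) for i, j in PAIRS)

# Collinear case: each spin points either up (angle 0) or down (angle pi).
best_collinear = min(energy(config) for config in product((0.0, math.pi), repeat=3))

# Noncollinear case: spins rotated by 120 degrees relative to one another.
noncollinear = energy((0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0))

print(f"best collinear: {best_collinear:.2f} J, 120-degree state: {noncollinear:.2f} J")
```

In this toy model every collinear arrangement leaves at least one bond with parallel spins, so its best energy is -J, whereas the 120-degree arrangement reaches -1.5 J. This is the sense in which a frustrated cluster can lower its energy by adopting a noncollinear configuration.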
APPENDIX. CALCULATION OF THE DENSITY OF ELECTRONIC STATES WITHIN THE TIGHT BINDING THEORY BY THE METHOD OF MOMENTS
Let $H$ be the Hamiltonian for an electron interacting through a potential $V(\vec{r} - \vec{R}_i)$ with the $N$ atoms of the cluster placed at the sites $\vec{R}_i$:
$$H = T + \sum_i V(\vec{r} - \vec{R}_i) = T + \sum_i V_i \qquad \text{[A.1]}$$
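The density of electronic states can be written
$$D(\varepsilon) = \mathrm{Tr}\,\delta(\varepsilon - H) \qquad \text{[A.2]}$$
where Tr indicates the trace of the operator $\delta(\varepsilon - H)$. The moments $\mu^{(p)}$ of the density of states are defined as
$$\mu^{(p)} = \int \varepsilon^{p} D(\varepsilon)\, d\varepsilon = \mathrm{Tr}\, H^{p} \qquad \text{[A.3]}$$
These moments can be calculated using the tight binding approximation. Introducing a complete set of atomic orbitals $|i\alpha\rangle$ satisfying the equations
$$\left[ T + V_i(\vec{r} - \vec{R}_i) \right] \phi_{i\alpha}(\vec{r} - \vec{R}_i) = \varepsilon_{\alpha}\, \phi_{i\alpha}(\vec{r} - \vec{R}_i) \qquad \text{[A.4]}$$
the moments $\mu^{(p)}$ can be calculated by expanding the trace over this set
$$\mu^{(p)} = \sum_{i_1\alpha_1 \ldots i_p\alpha_p} \langle \phi_{i_1\alpha_1} | H | \phi_{i_2\alpha_2} \rangle \langle \phi_{i_2\alpha_2} | H | \phi_{i_3\alpha_3} \rangle \cdots \langle \phi_{i_p\alpha_p} | H | \phi_{i_1\alpha_1} \rangle \qquad \text{[A.5]}$$
and keeping only two-center nearest-neighbor integrals. In addition, integrals $\langle \phi_i | V_j | \phi_i \rangle$ will be neglected in comparison with those of type $\langle \phi_i | V_i | \phi_i \rangle$. The sum in Eq. [A.5] goes over all paths of length $p$ that start and finish at a given atom, such that the electron hops between nearest neighbors. If we work with the local (and orbital-dependent) density of states $D_{i\alpha\sigma}(\varepsilon)$, moments $\mu^{(p)}_{i\alpha}$ can also be calculated
$$\mu^{(p)}_{i\alpha} = \int \varepsilon^{p} D_{i\alpha}(\varepsilon)\, d\varepsilon = \sum_{i_2\alpha_2 \ldots i_p\alpha_p} \langle \phi_{i_1\alpha_1} | H | \phi_{i_2\alpha_2} \rangle \langle \phi_{i_2\alpha_2} | H | \phi_{i_3\alpha_3} \rangle \cdots \langle \phi_{i_p\alpha_p} | H | \phi_{i_1\alpha_1} \rangle \qquad \text{[A.6]}$$
where the starting orbital $i_1\alpha_1 = i\alpha$ is held fixed. Equation [A.6] shows a simple connection between the local bonding of an atom and its electronic structure. The density of states is then calculated from all the moments $\mu^{(p)}$. This theory offers a promising way of calculating the density of states. Many numerical methods are unstable, although the recursion method of Haydock20 works efficiently. In this Green's function method, the local density of states is written in terms of the local Green's function $G_{i\alpha,i\alpha}(\varepsilon)$ as
$$D_{i\alpha}(\varepsilon) = -\frac{1}{\pi} \lim_{\eta \to 0} \mathrm{Im}\, G_{i\alpha,i\alpha}(\varepsilon + i\eta) \qquad \text{[A.7]}$$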
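The path-counting content of Eqs. [A.3], [A.5], and [A.6] can be made concrete with a small numerical sketch. It assumes a toy model that is not taken from the chapter: one orbital per atom on a six-atom octahedron, a single nearest-neighbor hopping integral of -1 (arbitrary units), and zero on-site energy. The function names are hypothetical, and plain nested lists are used so that the example runs without external libraries.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def octahedron_hamiltonian(hopping=-1.0, on_site=0.0):
    """One orbital per atom on a six-atom octahedron; atoms i and j are nearest
    neighbors unless they sit on opposite vertices (|i - j| == 3)."""
    n = 6
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                h[i][j] = on_site
            elif abs(i - j) != 3:
                h[i][j] = hopping
    return h

def moments(h, p_max):
    """Return (total, local): total[p] = Tr H^p and local[p][i] = (H^p)_ii."""
    n = len(h)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # H^0
    total, local = [], []
    for _ in range(p_max + 1):
        diagonal = [power[i][i] for i in range(n)]
        local.append(diagonal)
        total.append(sum(diagonal))
        power = matmul(power, h)  # next power of H
    return total, local

total, local = moments(octahedron_hamiltonian(), p_max=4)
print("total moments:", total)
print("local second moments:", local[2])
```

In this toy model the local second moment equals the number of nearest neighbors times the squared hopping integral (four for every atom of the octahedron), and the third moment counts closed three-step paths through triangles of neighbors. This illustrates how the moments tie the local density of states to the local bonding environment.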
ACKNOWLEDGMENTS
This work was supported by MEC (Grant MAT2005-06544-C03-01) and Junta de Castilla y León (Grant VA039A05). I acknowledge the hospitality and support of DIPC during the summer of 2006.
REFERENCES
1. I. M. L. Billas, A. Chatelain, and W. A. de Heer, Science, 265, 1682 (1994). Magnetism of Fe, Co and Ni Clusters in Molecular Beams.
2. I. M. L. Billas, A. Chatelain, and W. A. de Heer, J. Magn. Mag. Mater., 168, 64 (1997). Magnetism from the Atom to the Bulk in Iron, Cobalt and Nickel Clusters.
3. S. E. Apsel, J. W. Emmert, J. Deng, and L. A. Bloomfield, Phys. Rev. Lett., 76, 1441 (1996). Surface-Enhanced Magnetism in Nickel Clusters.
4. M. B. Knickelbein, J. Chem. Phys., 116, 9703 (2002). Nickel Clusters: The Influence of Adsorbates on Magnetic Moments.
5. A. J. Cox, J. G. Louderback, S. E. Apsel, and L. A. Bloomfield, Phys. Rev. B, 49, 12295 (1994). Magnetism in 4d-Transition Metal Clusters.
6. J. Zhao, X. Chen, Q. Sun, F. Liu, and G. Wang, Phys. Lett. A, 205, 308 (1995). A Simple d-Band Model for the Magnetic Property of Ferromagnetic Transition Metal Clusters.
7. J. Friedel, in The Physics of Metals, J. M. Ziman, Ed., Cambridge University Press, Cambridge, United Kingdom, 1969, pp. 340–408. Transition Metals. Electronic Structure of the d-Band. Its Role in the Crystalline and Magnetic Structures.
8. G. Pastor, J. Dorantes-Dávila, and K. H. Bennemann, Chem. Phys. Lett., 148, 459 (1988). A Theory for the Size and Structural Dependence of the Ionization and Cohesive Energy of Transition Metal Clusters.
9. F. Aguilera-Granja, J. M. Montejano-Carrizales, and J. L. Morán-López, Solid State Commun., 107, 25 (1998). Geometrical Structure and Magnetism of Nickel Clusters.
10. P. Jensen and K. H. Bennemann, Z. Phys. D, 35, 273 (1995). Theory for the Atomic Shell Structure of the Cluster Magnetic Moment and Magnetoresistance of a Cluster Ensemble.
11. J. Callaway, Energy Band Theory, Academic Press, London, 1964.
12. J. C. Slater and G. F. Koster, Phys. Rev., 94, 1498 (1954). Simplified LCAO Method for the Periodic Potential Problem.
13. P. Löwdin, J. Chem. Phys., 18, 365 (1950). On the Non-Orthogonality Problem Connected with the Use of Atomic Wave Functions in the Theory of Molecules and Crystals.
14. A. L. Fetter and J. D. Walecka, Quantum Theory of Many Particle Systems, McGraw Hill, New York, 1971.
15. A. Vega, J. Dorantes-Dávila, L. C. Balbás, and G. M. Pastor, Phys. Rev. B, 47, 4742 (1993). Calculated sp-Electron and spd-Hybridization Effects on the Magnetic Properties of Small FeN Clusters.
16. V. Heine, Phys. Rev., 153, 673 (1967). s-d Interaction in Transition Metals.
17. G. M. Pastor, J. Dorantes-Dávila, and K. H. Bennemann, Physica B, 149, 22 (1988). The Magnetic Properties of Small Fen Clusters.
18. G. M. Pastor, J. Dorantes-Dávila, and K. H. Bennemann, Phys. Rev. B, 40, 7642 (1989). Size and Structural Dependence of the Magnetic Properties of 3d Transition Metal Clusters.
19. W. A. Harrison, Electronic Structure and the Properties of Solids, Freeman, San Francisco, California, 1980.
20. R. Haydock, in Solid State Physics, H. Ehrenreich, F. Seitz, and D. Turnbull, Eds., Academic Press, New York, 35 (1980), pp. 215–294. The Recursive Solution of the Schrödinger Equation.
21. J. Guevara, F. Parisi, A. M. Llois, and M. Weissmann, Phys. Rev. B, 55, 13283 (1997). Electronic Properties of Transition Metal Clusters: Consideration of the Spillover in a Bulk Parametrization.
22. P. Hohenberg and W. Kohn, Phys. Rev., 136, B864 (1964). Inhomogeneous Electron Gas.
23. W. Kohn and L. J. Sham, Phys. Rev., 140, A1133 (1965). Self-Consistent Equations Including Exchange and Correlation Effects.
24. S. Lundqvist and N. H. March, Eds., Theory of the Inhomogeneous Electron Gas, Plenum Press, New York, 1986.
25. D. M. Ceperley and B. J. Alder, Phys. Rev. Lett., 45, 566 (1980). Ground State of the Electron Gas by a Stochastic Method.
26. J. P. Perdew and A. Zunger, Phys. Rev. B, 23, 5048 (1981). Self-Interaction Correction to Density Functional Approximations for Many Electron Systems.
27. A. D. Becke, Phys. Rev. A, 38, 3098 (1988). Density Functional Exchange Energy Approximation with Correct Asymptotic Behavior.
28. J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett., 77, 3865 (1996). Generalized Gradient Approximation Made Simple.
29. J. P. Perdew and S. Kurth, in A Primer in Density Functional Theory, Vol. 620, C. Fiolhais, F. Nogueira and M. Marques, Eds., Lecture Notes in Physics, Springer, Berlin, 2003, pp. 1–55. Density Functionals for Non-Relativistic Coulomb Systems in the New Century.
30. U. von Barth and L. Hedin, J. Phys. C: Solid State Phys., 5, 1629 (1972). A Local Exchange-Correlation Potential for the Spin Polarized Case: I.
31. J. K. Kübler, Theory of Itinerant Electron Magnetism, Oxford University Press, Oxford, United Kingdom, 2000.
32. D. J. Singh and D. A. Papaconstantopoulos, Eds., Electronic Structure and Magnetism of Complex Materials, Springer, Berlin, 2003.
33. S. H. Vosko, L. Wilk, and M. Nusair, Can. J. Phys., 58, 1200 (1980). Accurate Spin-dependent Electron Liquid Correlation Energies for Local Spin Density Calculations: A Critical Analysis.
34. Y. Tsunoda, J. Phys.: Cond. Matter, 1, 10427 (1989). Spin-density Wave in Cubic γ-Fe and γ-Fe100−xCox Precipitates in Cu.
35. R. Lorenz, J. Hafner, S. S. Jaswal, and D. J. Sellmyer, Phys. Rev. Lett., 74, 3688 (1995). Disorder and Non-Collinear Magnetism in Permanent-Magnet Materials with ThMn12 Structure.
36. M. Liebs, K. Hummler, and M. Fähnle, Phys. Rev. B, 51, 8664 (1995). Influence of Structural Disorder on Magnetic Order: An Ab Initio Study of Amorphous Fe, Co, and Ni.
37. O. N. Mryasov, A. I. Liechtenstein, L. M. Sandratskii, and V. A. Gubanov, J. Phys.: Cond. Matter, 3, 7683 (1991). Magnetic Structure of fcc Iron.
38. V. P. Antropov, M. I. Katsnelson, M. van Schilfgaarde, and B. N. Harmon, Phys. Rev. Lett., 75, 729 (1995). Ab Initio Spin Dynamics in Magnets.
39. M. Uhl and J. Kübler, Phys. Rev. Lett., 77, 334 (1996). Exchange-Coupled Spin-Fluctuation Theory. Application to Fe, Co, and Ni.
40. T. Oda, A. Pasquarello, and R. Car, Phys. Rev. Lett., 80, 3622 (1998). Fully Unconstrained Approach to Non-Collinear Magnetism: Application to Small Fe Clusters.
41. C. Kohl and G. F. Bertsch, Phys. Rev. B, 60, 4205 (1999). Non-Collinear Magnetic Ordering in Small Chromium Clusters.
42. J. Kübler, K. H. Höck, J. Sticht, and A. R. Williams, J. Phys. F, 18, 469 (1988). Density Functional Theory of Non-Collinear Magnetism.
43. K. Capelle and E. K. U. Gross, Phys. Rev. Lett., 78, 1872 (1997). Spin-Density Functionals from Current-Density Functional Theory and Vice Versa: A Road Towards New Approximations.
44. J. A. Alonso, Structure and Properties of Atomic Nanoclusters, Imperial College Press, London, 2005.
45. S. Bouarab, A. Vega, M. J. López, M. P. Iñiguez, and J. A. Alonso, Phys. Rev. B, 55, 13279 (1997). Geometrical Effects on the Magnetism of Small Ni Clusters.
46. F. Aguilera-Granja, S. Bouarab, M. J. López, A. Vega, J. M. Montejano-Carrizales, M. P. Iñiguez, and J. A. Alonso, Phys. Rev. B, 57, 12469 (1998). Magnetic Moments of Ni Clusters.
47. J. A. Alonso, Chem. Rev., 100, 637 (2000). Electronic and Atomic Structure and Magnetism of Transition-Metal Clusters.
48. X. Wan, L. Zhou, J. Dong, T. K. Lee, and D. Wang, Phys. Rev. B, 69, 174414 (2004). Orbital Polarization, Surface Enhancement and Quantum Confinement in Nanocluster Magnetism.
49. F. Ducastelle, J. Phys. (Paris), 31, 1055 (1970). Elastic Moduli of Transition Metals.
50. R. P. Gupta, Phys. Rev. B, 23, 6265 (1981). Lattice Relaxations at a Metal Surface.
51. J. M. Montejano-Carrizales, M. P. Iñiguez, J. A. Alonso, and M. J. López, Phys. Rev. B, 54, 5961 (1996). Theoretical Study of Icosahedral Ni Clusters within the Embedded Atom Method.
52. E. K. Parks, G. C. Nieman, K. P. Kerns, and S. J. Riley, J. Chem. Phys., 107, 1861 (1997). Reactions of Ni38 with N2, H2 and CO: Cluster Structure and Adsorbate Binding Sites.
53. N. N. Lathiotakis, A. N. Andriotis, M. Menon, and J. Connolly, J. Chem. Phys., 104, 992 (1996). Tight Binding Molecular Dynamics Study of Ni Clusters.
54. E. K. Parks, L. Zhu, J. Ho, and S. J. Riley, J. Chem. Phys., 102, 7377 (1995). The Structure of Small Nickel Clusters. II. Ni16–Ni28.
55. J. L. Rodríguez-López, F. Aguilera-Granja, A. Vega, and J. A. Alonso, Eur. Phys. J. D, 6, 235 (1999). Magnetic Moments of NiN Clusters (N ≤ 34): Relation to Atomic Structure.
56. N. Fujima and T. Yamaguchi, Phys. Rev. B, 54, 26 (1996). Magnetic Moment in Nickel Clusters Estimated by an Electronic Shell Model.
57. W. D. Knight, K. Clemenger, W. A. de Heer, W. A. Saunders, M. Y. Chou, and M. L. Cohen, Phys. Rev. Lett., 52, 2141 (1984). Electronic Shell Structure and Abundances of Sodium Clusters.
58. W. Ekardt, Phys. Rev. B, 29, 1558 (1984). Work Function of Small Metal Particles: Self-Consistent Spherical Jellium-Background Model.
59. I. Katakuse, Y. Ichihara, Y. Fujita, T. Matsuo, T. Sakurai, and H. Matsuda, Int. J. Mass Spectrom. Ion Proc., 74, 33 (1986). Mass Distributions of Negative Cluster Ions of Copper, Silver and Gold.
60. N. Fujima and T. Yamaguchi, J. Phys. Soc. Japan, 58, 3290 (1989). Magnetic Anomaly and Shell Structure of Electronic States of Nickel Microclusters.
61. F. A. Reuse and S. N. Khanna, Chem. Phys. Lett., 234, 77 (1995). Geometry, Electronic Structure, and Magnetism of Small Nin (n = 2–6, 8, 13) Clusters.
62. N. Desmarais, C. Jamorski, F. A. Reuse, and S. N. Khanna, Chem. Phys. Lett., 294, 480 (1998). Atomic Arrangements in Ni7 and Ni8 Clusters.
63. B. M. Reddy, S. K. Nayak, S. N. Khanna, B. K. Rao, and P. Jena, J. Phys. Chem. A, 102, 1748 (1998). Physics of Nickel Clusters. 2. Electronic Structure and Magnetic Properties.
64. M. W. Finnis and J. E. Sinclair, Phil. Mag., 50, 45 (1984). A Simple Empirical N-Body Potential for Transition Metals.
65. N. Fujima and T. Yamaguchi, Mater. Sci. Eng. A, 217, 295 (1996). Geometrical Magnetic Structures of Transition-Metal Clusters.
66. G. Pacchioni, S. C. Chung, S. Krüger, and N. Rösch, Chem. Phys., 184, 125 (1994). On the Evolution of Cluster to Bulk Properties: A Theoretical LCGTO-DFT Study of Free and Coordinated Nin Clusters (n = 6–147).
246
Magnetic Properties of Atomic Clusters of the Transition Elements
67. L. Zhou, D. S. Wang, and Y. Kawazoe, Phys. Rev. B, 60, 9545 (1999). Orbital Correlation and Magnetocrystalline Anisotropy in One-Dimensional Transition Metal Systems. 68. M. Komelj, C. Ederer, J. W. Davenport, and M. Fa¨hnle, Phys. Rev. B, 66, 140407 (2002). From Bulk to Monatomic Wires: An Ab Initio Study of Magnetism in Co Systems with Various Dimensionality. 69. R. A. Guirado-Lo´pez, J. Dorantes-Da´vila, and G. M. Pastor, Phys. Rev. Lett., 90, 226402 (2003). Orbital Magnetism in Transition Metal Clusters: From Hund’s Rules to Bulk Quenching. 70. A. I. Liechtenstein, V. I. Anisimov, and J. Zaanen, Phys. Rev. B, 52, R5467 (1995). Density Functional Theory and Strong Interactions: Orbital Ordering in Mott-Hubbard Insulators. 71. B. R. Judd, Operator Techniques in Atomic Spectroscopy, McGraw-Hill, New York, 1963. 72. F. A. Reuse, S. N. Khanna, and S. Bernel, Phys. Rev. B, 52, R11650 (1995). Electronic Structure and Magnetic Behavior of Ni13 Clusters. 73. B. T. Thole, P. Carra, F. Sette, and G. van der Laan, Phys. Rev. Lett., 68, 1943 (1992). X-Ray Circular Dichroism as a Probe of Orbital Magnetization. 74. A. N Andriotis and M. Menon, Phys. Rev. Lett., 93, 026402 (2004). Orbital Magnetism: Pros and Cons for Enhancing the Cluster Magnetism. 75. H. Cheng and L. S. Wang, Phys. Rev. Lett., 77, 51 (1996). Dime Growth, Structural Transition, and Antiferromagnetic Ordering of Small Chromium Clusters. 76. L. S. Wang, H. Wu, and H. Cheng, Phys. Rev. B, 55, 12884 (1997). Photoelectron Spectroscopy of Small Chromium Clusters: Observation of Even-Odd Alternations and Theoretical Interpretation. 77. D. C. Douglass, J. P. Bucher, and L. A. Bloomfield, Phys. Rev. B, 45, 6341 (1992). Magnetic Studies of Free Nonferromagnetic Clusters. 78. L. A. Bloomfield, J. Deng, H. Zhang, and J. W. Emmert, in Proceedings of the International Symposium on Cluster and Nanostructure Interfaces. P. Jena, S. N. Khanna, and B. K. Rao, Eds., World Scientific, Singapore, 2000, pp. 131–138. 
Magnetism and Magnetic Isomers in Chromium Clusters. 79. N. Fujima, Eur. Phys. J. D, 16, 185 (2001). Non-Collinear Magnetic Moments of Seven-Atom Cr, Mn and Fe Clusters. 80. N. Fujima, J. Phys. Soc. Japan, 71, 1529 (2002). Non-Collinear Magnetic Moments of FiveAtom Transition Metal Clusters 81. S. Blu¨gel, B. Drittler, R. Zeller, and P. H. Dederichs, Appl. Phys. A, 49, 547 (1989). Magnetic Properties of 3d Transition Metal Monolayers on Metal Substrates. 82. V. S. Stepanyuk, W. Hergert, K. Wildberger, S. K. Nayak, and P. Jena, Surf. Sci. Lett., 384, L892 (1997). Magnetic Bistability of Supported Mn Clusters. 83. J. R. Lombardi and B. Davis, Chem. Rev. 102, 2431 (2002). Periodic Properties of Force Constants of Small Transition Metal and Lanthanide Clusters. 84. E. K. Parks, G. C. Nieman, and S. J. Riley, J. Chem. Phys., 104, 3531 (1996). The Reaction of Manganese Clusters and Manganese Cluster Carbides with Hydrogen. The Mn-CH3 Bond Energy. 85. R. J. Van Zee and W. Weltner, J. Chem. Phys., 89, 4444 (1988). The Ferromagnetic Mnþ 2 Molecule. 86. G. W. Ludwig, H. H. Woodbury, and R. O. Carlson, J. Phys. Chem. Solids, 8, 490 (1959). Spin Resonance of Deep Level Impurities in Germanium and Silicon. 87. C. A. Baumann, R. J. Van Zee, S. Bhat, and W. Weltner, J. Chem. Phys., 78, 190 (1983). ESR of Mn2 and Mn5 Molecules in Rare Gas Matrices. 88. R. K. Nesbet, Phys. Rev., 135, A460 (1964). Heisenberg Exchange Interaction of Two Mn Atoms. 89. N. Fujima and T. Yamaguchi, J. Phys. Soc. Japan, 64, 1251 (1995). Chemical Bonding in Mn Clusters, MnN and MnN± (N ¼ 2–7).
References
247
90. S. K. Nayak and P. Jena, Chem. Phys. Lett., 289, 473 (1998). Anomalous Magnetism in Small Mn Clusters. 91. P. Schieffer, C. Krembel, M. C. Hanf, D. Bolmont, and G. Gewinner, J. Magn. Mag. Mater., 165, 180 (1997). Stabilization of a Face-Centered-Cubic Mn Structure with the Ag Lattice Parameter. 92. O. Rader, W. Gudat, D. Schmitz, C. Carbone, and W. Eberhardt, Phys. Rev. B, 56, 5053 (1997). Magnetic Circular X-Ray Dichroism of Submonolayer Mn on Fe(100). 93. M. R. Pederson, F. A. Reuse, and S. N. Khanna, Phys. Rev. B, 58, 5632 (1998). Magnetic Transition in Mnn (n ¼ 2–8) Clusters. 94. S. Yamamoto, H. Tatewaki, H. Moriyama, and H. Nakano, J. Chem Phys. 124, 124302 (2006). A Study of the Ground State of Manganese Dimer Using Quasidegenerate Perturbation Theory. 95. H. Nakano, J. Chem. Phys. 99, 7983 (1993). Quasidegenerate Perturbation Theory with Multiconfigurational Self-Consistent-Field Reference Functions. 96. M. B. Knickelbein, Phys. Rev. Lett., 86, 5255 (2001). Experimental Observation of SuperParamagnetism in Manganese Clusters. 97. J. Guevara, A. M. Llois, F. Aguilera-Granja, and J. M. Montejano-Carrizales, Phys. Stat. Sol. B, 239, 457 (2003). Magnetism of Small Mn Clusters. 98. S. K. Nayak, M. Nooijen, and P. Jena, J. Phys. Chem. A, 103, 9853 (1999). Isomerism and Novel Magnetic Order in Mn13 Cluster. 99. S. N. Khanna, B. K. Rao, P. Jena, and M. Knickelbein, Chem. Phys. Lett., 378, 374 (2003). Ferrimagnetism in Mn7 Cluster. 100. T. M. Briere, H. F. Sluiter, V. Kumar, and Y. Kawazoe, Phys. Rev. B, 66, 064412 (2002). Atomic Structures and Magnetic Behavior of Mn Clusters. 101. P. Bobadova-Parvanova, K. A. Jackson, S. Srinivas, and M. Horoi, Phys. Rev. A, 67, 061202 (2003). Emergence of Antiferromagnetic Ordering in Mn Clusters. 102. N. O. Jones, S. H. Khanna, T. Baruath, and M. R. Pederson, Phys. Rev. B, 70, 045416 (2004). Classical Stern–Gerlach Profiles of Mn5 and Mn6 Clusters. 103. M. B. Knickelbein, Phys. Rev. B, 70, 014424 (2004). 
Magnetic Ordering in Manganese Clusters. 104. J. M. Soler, E. Artacho, J. D. Gale, A. Garcı´a, J. Junquera, P. Ordejo´n, and D. Sa´nchez-Portal, J. Phys.: Cond. Matter, 14, 2745 (2002). The SIESTA Method for Ab Initio Order-N Materials Simulation. 105. R. C. Longo, E. G. Noya, and L. J. Gallego, J. Chem. Phys., 122, 226102 (2005). NonCollinear Magnetic Order in the Six-Atom Mn Cluster. 106. A. J. Cox, J. G. Lourderback, and L. A. Bloomfield, Phys. Rev. Lett., 71, 923 (1993). Experimental Observation of Magnetism in Rhodium Clusters. 107. G. W. Zhang, Y. P. Feng, and C. K. Ong, Phys. Rev. B, 54, 17208 (1996). Local Binding Trend and Local Electronic Structures of 4d Transition Metals. 108. R. V. Reddy, S. N. Khanna, and B. Dunlap, Phys. Rev. Lett., 70, 3323 (1993). Giant Magnetic Moments in 4d Clusters. 109. B. Piveteau, M. C. Desjonque´res, A. M. Ole´s, and D. Spanjard, Phys. Rev. B, 53, 9251 (1996). Magnetic Properties of 4d Transition-Metal Clusters. 110. Y. Jinlong, F. Toigo, W. Kelin, and Z. Manhong, Phys. Rev. B, 50, 7173 (1994). Anomalous Symmetry Dependence of Rh13 Magnetism. 111. Y. Jinlong, F. Toigo, and W. Kelin, Phys. Rev. B, 50, 7915 (1994). Structural, Electronic, and Magnetic Properties of Small Rhodium Clusters. 112. Z. Q. Li, J. Z. Yu, K. Ohno, and Y. Kawazoe, J. Phys.: Cond. Matter, 7, 47 (1995). Calculations on the Magnetic Properties of Rhodium Clusters. 113. P. Villasen˜or-Gonza´lez, J. Dorantes-Da´vila, H. Dreysse´, and G. Pastor, Phys. Rev. B, 55, 15084 (1997). Size and Structural Dependence of the Magnetic Properties of Rhodium Clusters.
248
Magnetic Properties of Atomic Clusters of the Transition Elements
114. A. Chouairi, H. Dreysse´, H. Nait-Laziz, and C. Demangeat, Phys. Rev. B, 48, 7735 (1993). Rh Polarization in Ultrathin Rh Layers on Fe(001). 115. A. Mokrani and H. Dreysse´, Solid State Commun., 90, 31 (1994). Magnetism of Rh Vicinal Surfaces? 116. R. Guirado-Lo´pez, D. Spanjaard, and M. C. Desjonque´res, Phys. Rev. B, 57, 6305 (1998). Magnetic-Nonmagnetic Transition in fcc 4d-Transition-Metal Clusters. 117. S. K. Nayak, S. E. Weber, P. Jena, K. Wildberger, R. Zeller, P. H. Dederichs, S. V. Stepanyuk, and W. Hergert, Phys. Rev. B, 56, 8849 (1997). Relationship Between Magnetism, Topology, and Reactivity of Rh Clusters. 118. E. K. Parks, K. P. Kerns, and S. J. Riley, J. Chem. Phys., 109, 10207 (1998). The Structure of Ni39. 119. M. E. Geusic, M. D. Morse, and R. E. Smalley, J. Chem. Phys., 82, 590 (1985). Hydrogen Chemisorption on Transition Metal Clusters. 120. D. Kaiming, Y. Jinlong, X. Chuamyun, and W. Kelin, Phys. Rev. B, 54, 2191 (1996). Electronic Properties and Magnetism of Ruthenium Clusters. 121. R. Guirado-Lo´pez, D. Spanjaard, M. C. Desjonque´res, and F. Aguilera-Granja, J. Magn. Mag. Mater., 186, 214 (1998). Electronic and Geometrical Effects on the Magnetism of Small RuN Clusters. 122. K. Lee, Phys. Rev. B, 58, 2391 (1998). Possible Magnetism in Small Palladium Clusters. 123. K. Lee, Z. Phys. D, 40, 164 (1997). Possible Large Magnetic Moments in 4d Transition Metal Clusters. 124. M. Moseler, H. Ha¨kkinen, R. N. Barnett, and U. Landman, Phys. Rev. Lett., 86, 2545 (2001). Structure and Magnetism of Neutral and Anionic Palladium Clusters. 125. D. A. van Leeuwen, J. M. van Ruitenbeek, L. J. de Jongh, A. Ceriotti, G. Pacchioni, O. D. Ha¨berlen, and N. Ro¨sch, Phys. Rev. Lett., 73, 1432 (1994). Quenching of Magnetic Moments by Ligand-Metal Interactions in Nanosized Magnetic Metal Clusters. 126. M. B. Knickelbein, J. Chem. Phys., 115, 1983 (2001). Nickel Clusters: The Influence of Adsorbed CO on Magnetic Moments. 127. G. Pacchioni and N. Ro¨sch, Acc. 
Chem. Res., 28, 390 (1995). Carbonylated Nickel Clusters: From Molecules to Metals. 128. B. Chen, A. W. Castleman, and S. N. Khanna, Chem. Phys. Lett., 304, 423 (1999). Structure, Reactivity, and Magnetism: Adsorption of NH3 Around Nin. 129. M. B. Knickelbein, Chem. Phys. Lett., 353, 221 (2002). Adsorbate-Induced Enhancement of the Magnetic Moments of Iron Clusters. 130. S. N. Khanna and P. Jena, Chem. Phys. Lett., 336, 467 (2001). Magnetic Moment and PhotoDetachment Spectroscopy of Ni5 Clusters. 131. L. S. Wang and H. Z. Wu, Z. Phys. Chem., 203, 45 (1998). Photoelectron Spectroscopy of Transition Metal Clusters.
CHAPTER 6
Transition Metal- and Actinide-Containing Systems Studied with Multiconfigurational Quantum Chemical Methods

Laura Gagliardi
University of Geneva, Geneva, Switzerland
Reviews in Computational Chemistry, Volume 25, edited by Kenny B. Lipkowitz and Thomas R. Cundari. Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.

INTRODUCTION

Ab initio quantum chemistry has advanced so far in the last 40 years that it now allows the study of molecular systems containing any atom in the Periodic Table. Transition metal and actinide compounds can be treated routinely, provided that electron correlation (Ref. 1) and relativistic effects (Ref. 2) are properly taken into account. Computational quantum chemical methods can be employed in combination with experiment to predict a priori, to confirm, or eventually to refine experimental results. These methods can also predict the existence of new species, which may eventually be made by experimentalists. This latter use of computational quantum chemistry is especially important for species that are not easy to handle in a laboratory, for example explosive or radioactive ones. A good understanding of the chemistry of such species can be useful in several areas of scientific and technological exploration. Quantum chemistry can model molecular properties and transformations, and in combination with experiment, it can lead to an improved understanding of processes such as nuclear waste extraction and storage procedures for radioactive materials.

Quantum chemists have developed considerable experience over the years in inventing new molecules by quantum chemical methods, some of which have subsequently been characterized by experimentalists (see, for example, Refs. 3 and 4). The general philosophy is to explore the Periodic Table and to attempt to understand the analogies between the behavior of different elements. For first row atoms, chemical bonding usually follows the octet rule. In transition metals, this rule is replaced by the 18-electron rule. Upon going to lanthanides and actinides, the valence f shells are expected to play a role. In lanthanide chemistry, the 4f shell is contracted and usually does not directly participate in the chemical bonding. In actinide chemistry, on the other hand, the 5f shell is more diffuse and participates actively in the bonding.

Actinide chemistry presents a challenge for quantum chemistry mainly because of the complexity of the electronic structure of actinide atoms. The ground state of the uranium atom is, for example, $(5f)^3(6d)(7s)^2$, $^5L_6$. The ground level is thus 13-fold degenerate ($2J + 1$ with $J = 6$) and is described using 7 + 5 + 1 = 13 atomic orbitals (seven 5f, five 6d, and one 7s). The challenge for actinide quantum chemistry is to handle systems with a high density of states involving many active orbitals while also including relativistic effects. It is true that much actinide chemistry involves highly oxidized actinide ions with few atomic valence electrons, usually occupying the 5f shells. A good example is the uranium chemistry involving the U$^{6+}$ ion (in the uranyl ion UO$_2^{2+}$). Such compounds are often closed-shell species and can be treated using well-established quantum chemical tools where only scalar relativistic effects are taken into account.
However, an extensive actinide chemistry involves ions of lower valency and even atoms. Also, in some chemical processes, the oxidation number may change from zero to a large positive number, an example being the small molecule NUN that will be discussed in this review. The formal oxidation number of the uranium ion is six, and the UN bonds are strongly covalent. But consider the formation of this molecule, which is done by colliding uranium atoms with N$_2$: U + N$_2$ → NUN (Ref. 5). Here, the oxidation number of U changes from zero to six along the reaction path, and the spin quantum number changes from two to zero. The quantum chemical description of the reaction path requires methods that can handle complex electronic structures involving several changes of the spin state as well as many close lying electronic states. Another issue involving actinide complexes in the zero formal oxidation state is the possible formation of actinide–actinide bonds. For example, the molecule U$_2$ has recently been described theoretically (Ref. 6); its electronic structure is characterized by a large number of nearly degenerate electronic states and by wave functions composed of multiple electronic configurations.

The methods used to describe the electronic structure of actinide compounds must, therefore, be relativistic and must also have the capability to describe complex electronic structures. Such methods will be described in the next section. The main characteristic of successful quantum calculations for such systems is the use of multiconfigurational wave functions that include relativistic effects. These methods have been applied to a large number of molecular systems containing transition metals or actinides, and we shall give several examples from recent studies of such systems.

We first describe some recent advances in transition metal chemistry, e.g., the study of Re$_2$Cl$_8^{2-}$, the inorganic chemistry of the Cr$_2$ unit, and the theoretical characterization of the end-on and side-on peroxide coordination in ligated Cu$_2$O$_2$ models. The second part of this chapter focuses on actinide chemistry, where we start by describing some triatomic molecules containing a uranium atom, which have been studied both in the gas phase and in rare gas matrices. Most actinide chemistry occurs, however, in solution, so we then describe actinide ions in solution. The extensive study of the multiple bond between two uranium atoms in the U$_2$ molecule and in other diactinides is then reported. Finally, several examples of inorganic compounds that include U$_2$ as a central unit are presented.
THE MULTICONFIGURATIONAL APPROACH

We describe here the methods that have been used in quantum chemical applications to transition metal- and actinide-containing molecules. These methods are available in the computer software package MOLCAS-6 (Ref. 7), which has been employed in all reported calculations. Many such systems cannot be well described using single configurational methods like Hartree–Fock (HF), density functional theory (DFT), or coupled cluster (CC) theory. Accordingly, a multiconfigurational approach is needed, where the wave function is described as a combination of different electronic configurations. A three-step procedure is used to accomplish this. In the first step, a multiconfigurational wave function is defined using the complete active space SCF (CASSCF) method. This wave function is employed in the second step to estimate the remaining (dynamic) correlation effects using multiconfigurational second-order perturbation theory. Scalar relativistic effects are included in both of these steps, but spin-orbit coupling (SOC) is not. SOC is included in a third step, where a set of CASSCF wave functions is used as basis functions to set up a spin-orbit Hamiltonian that is diagonalized to obtain the final energies and wave functions. We describe each of these steps in more detail below.
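The third step of the procedure can be pictured as a small matrix problem. The following is a minimal sketch, not the MOLCAS implementation: it assumes that the spin-free state energies and the spin-orbit coupling matrix elements between the states have already been computed, and all numbers and the function name `spin_orbit_states` are invented for illustration.

```python
import numpy as np

# Hypothetical spin-free energies of three states, in hartree (illustrative values only).
spin_free_energies = np.array([-1.000, -0.990, -0.985])

# Hypothetical spin-orbit coupling matrix elements between these states (hartree).
# The matrix must be Hermitian; off-diagonal elements may be complex.
h_so = np.array([
    [0.0,            0.002 + 0.001j, 0.0],
    [0.002 - 0.001j, 0.0,            0.003j],
    [0.0,           -0.003j,         0.0],
])

def spin_orbit_states(energies, coupling):
    """Diagonalize H = diag(energies) + coupling in the basis of spin-free states."""
    hamiltonian = np.diag(energies).astype(complex) + coupling
    # eigh assumes a Hermitian matrix and returns real eigenvalues in ascending order.
    return np.linalg.eigh(hamiltonian)

so_energies, so_vectors = spin_orbit_states(spin_free_energies, h_so)

for k, energy in enumerate(so_energies):
    # Squared moduli of the eigenvector give the weight of each spin-free state.
    weights = np.abs(so_vectors[:, k]) ** 2
    print(f"state {k}: E = {energy:.6f}  weights = {np.round(weights, 3)}")
```

The eigenvectors show how strongly each spin-free state is mixed into a given spin-orbit state, which is the information one would inspect to judge whether SOC changes the character of the ground state.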
The Complete Active Space SCF Method

The CASSCF method was developed almost 30 years ago. It was inspired by the development of the Graphical Unitary Group Approach (GUGA) to the full CI problem by Shavitt (Ref. 8), which made it possible to solve large full CI problems with full control of spin and space symmetry. The GUGA approach is in itself not very helpful because full CI can only be used with very small basis sets and few electrons. It was known, however, that the important configurations (those with coefficients appreciably different from zero) in a full CI expansion used only a limited set of molecular orbitals. The following idea emerged, building in particular on the concept of a fully optimized reaction space (FORS) introduced by Ruedenberg and Sundberg in 1976 (Ref. 9): The molecular orbital space is divided into three subspaces: inactive, active, and external orbitals. The inactive orbitals are assumed to be doubly occupied in all configuration functions (CFs) used to build the wave function. The inactive orbitals thus constitute a Hartree–Fock “sea” in which the active electrons move. The remaining electrons occupy a set of predetermined active orbitals. The external orbitals are assumed to be unoccupied in all configurations. Once the assignment of electrons to active orbitals is done, the wave function is fully defined within the set of active orbitals. All CFs with a given space and spin symmetry are included in the multiconfigurational wave function. This concept of CAS was introduced by B. O. Roos in the 1980s (Refs. 10, 11). A scheme of how the orbitals can be subdivided is presented in Figure 1. The choice of the correct active space for a specific application is not trivial, and one often has to make several “experiments.” It is difficult to derive any general rules because every chemical system poses its own problems. The rule of thumb is that all orbitals intervening in the chemical process must be included.
For example, in a chemical reaction where a bond is formed or broken, all orbitals involved in the bond formation or breaking must be included in the active space. If, on the other hand, several electronic states are considered, the molecular orbitals from or to which the electronic excitation occurs have to be included in the active space. There is also a tight connection with the choice of atomic orbital (AO) basis, which must be extensive enough to describe the occupied molecular orbitals (MOs) properly. Moreover, the size of the active space is limited: in most software packages the practical limit is around 15 orbitals when the number of active electrons equals the number of active orbitals. This is the most severe limitation of the CASSCF method and sometimes makes it difficult or even impossible to perform a specific study. In this chapter, we shall exemplify how active orbitals are chosen for compounds that pose special difficulties in this respect because of the large number of valence orbitals that may contribute to actinide chemical bonds (5f, 6d, 7s, and possibly 7p). We shall also illustrate one case, the Cu$_2$O$_2$ models, in which none of the affordable active spaces describes the system in a satisfactory way.

Figure 1 Orbital spaces for CAS wave functions.

An extension to the CASSCF method exists that has not been used much but may become more applicable in the future: the restricted active space (RAS) SCF method (Refs. 12, 13), where the active subspace is divided into three regions: RAS1, RAS2, and RAS3. The orbitals in RAS1 are doubly occupied, but a limited number of holes are allowed. Arbitrary occupation numbers are allowed in RAS2. A limited number of electrons is allowed to occupy the orbitals in RAS3. Many different types of RAS wave functions can be constructed. Leaving RAS1 fully occupied and RAS3 empty, one obtains the CAS wave function. If there are no orbitals in RAS2, a wave function that includes all single, double, etc. excitations out of a closed shell reference function (the SDTQ...-CI wave function) is obtained. The interesting feature of the RAS wave function is that it can work with larger active spaces than CAS, without exploding the CI expansion.
It thus has the potential to perform multiconfigurational calculations that cannot be performed today with the CASSCF method. The problem with a RASSCF wave function is how to add the effects of dynamic electron correlation. For CASSCF wave functions, second-order perturbation theory (CASPT2, see below) can be used to accomplish this, but this is not yet possible for RASSCF wave functions. Recent developments in our research group and in the Lund group of Roos indicate, however, that this may become possible in the near future through the development of a RASPT2 method, thus extending the applicability of the multiconfigurational methods to new classes of problems that cannot be treated today. This work is currently in progress.
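The limit of roughly 15 active orbitals follows from how quickly the number of CFs grows with the size of the active space. As a rough illustration that is not part of the original text, the sketch below evaluates Weyl's dimension formula for the number of spin-adapted configurations obtained when N electrons are distributed over n active orbitals with total spin S, ignoring spatial symmetry. The function name `n_csf` is our own choice.

```python
from math import comb

def n_csf(n_orbitals, n_electrons, spin):
    """Number of spin-adapted configurations from Weyl's formula.

    D = (2S+1)/(n+1) * C(n+1, N/2 - S) * C(n+1, N/2 + S + 1)

    spin is the total spin quantum number S (0 for a singlet, 0.5 for a doublet, ...).
    """
    two_s = round(2 * spin)
    # N/2 - S must be a non-negative integer for a valid (N, S) combination.
    if two_s > n_electrons or (n_electrons - two_s) % 2 != 0:
        return 0
    lower = (n_electrons - two_s) // 2        # N/2 - S
    upper = (n_electrons + two_s) // 2 + 1    # N/2 + S + 1
    # math.comb returns 0 when the lower index exceeds the upper one,
    # which covers the case of more electrons than the orbitals can hold.
    return (two_s + 1) * comb(n_orbitals + 1, lower) * comb(n_orbitals + 1, upper) // (n_orbitals + 1)

# Singlet CAS(n, n) expansions: equal numbers of active electrons and orbitals.
for n in range(2, 20, 2):
    print(f"CAS({n},{n}) singlet: {n_csf(n, n, 0):,} CFs")
```

For equal numbers of electrons and orbitals, the singlet count grows from 175 at n = 6 to about 2.8 million at n = 14 and about 35 million at n = 16, which is consistent with the practical limit quoted in the text.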
Multiconfigurational Second-Order Perturbation Theory, CASPT2

If the active space has been adequately chosen, the CASSCF wave function will include the most important CFs in the full CI wave function. In this way we include all near-degenerate configurations, which describe static correlation effects, as for example, in a bond breaking process. The CASSCF wave function will then be qualitatively correct for the entire chemical process studied, which can be an energy surface for a chemical reaction, a photochemical process, etc. The energies that emerge are, however, not very accurate. We need to include the part of the CF space that describes the remaining (dynamic) correlation effects. This requirement is as necessary in the multiconfigurational approach as it would be if we started from the HF single determinant approximation.

How can dynamic electron correlation be included? In a single configuration approach, the obvious choices are CC methods or, if the system is too large, second-order perturbation theory (MP2), which is already accurate. A practical multiconfigurational CC theory does not exist yet. A method that has been used with great success since the 1980s is Multi-Reference CI (MRCI), where the most important of the CFs of the CAS wave function are used as reference configurations in a CI expansion that includes all CFs that can be generated by single and double replacements of the orbitals in the reference CFs (Ref. 14). The method is still used with some success because of recent technological developments (Ref. 15). It becomes time consuming for systems with many electrons, however, and also has the disadvantage of lacking size-extensivity, even if this latter problem can be corrected for approximately.

Another way to treat dynamic correlation effects is to use perturbation theory. Such an approach has the virtue of being size-extensive and ought to be computationally more efficient than the MRCI approach. Møller–Plesset second-order perturbation theory (MP2) has been used for a long time to treat electron correlation for ground states, where the reference function is a single determinant.
It is known to give accurate results for structural, energetic, and other properties of closed-shell molecules. Could such an approach also work for a multiconfigurational reference function like CASSCF? This approach was suggested soon after the introduction of the CASSCF method (Ref. 16), but technical difficulties delayed a full implementation until the late 1980s (Refs. 17, 18). Today it is the most widely used method to compute dynamic correlation effects for multiconfigurational (CASSCF) wave functions. The principle is simple: One computes the second-order energy with a CASSCF wave function as the zeroth-order approximation. That said, there are some problems to be solved that do not occur in single determinant MP2. One needs to define a zeroth-order Hamiltonian with the CASSCF function as an eigenfunction. It should preferably be a one-electron Hamiltonian in order to avoid an overly complicated formalism. One then needs to define an interacting space of configurations. These configurations are given as

$$\hat{E}_{pq}\hat{E}_{rs}\,|\mathrm{CASSCF}\rangle \qquad [1]$$
Equation [1] defines an internally contracted configuration space, doubly excited with respect to the CAS reference function $|0\rangle = |\mathrm{CASSCF}\rangle$; one or two of the four indices $p, q, r, s$ must be outside the active space. The functions of Eq. [1] are linear combinations of CFs and span the entire configuration space that interacts with the reference function. Labeling the compound index $pqrs$ as $m$ or $n$, we can write the first-order equation as

$$\sum_{n}\left[H^{(0)}_{mn} - E_0 S_{mn}\right]C_n = -V_{0m} \qquad [2]$$

Here, $H^{(0)}_{mn}$ are matrix elements of a zeroth-order Hamiltonian, which is chosen as a one-electron operator in the spirit of MP2. $S_{mn}$ is an overlap matrix: The excited CFs are not in general orthogonal to each other. Finally, $V_{0m}$ represents the interaction between the excited function and the CAS reference function. The difference between Eq. [2] and ordinary MP2 is the more complicated structure of the matrix elements of the zeroth-order Hamiltonian; in MP2 it is a simple sum of orbital energies. Here $H^{(0)}_{mn}$ is a complex expression involving matrix elements of a generalized Fock operator $\hat{F}$ combined with up to fourth-order density matrices of the CAS wave function. Additional details are given in the original papers by Andersson and coworkers (Refs. 17, 18). We here mention only the basic principles. The zeroth-order Hamiltonian is written as a sum of projections of $\hat{F}$ onto the reference function $|0\rangle$ and onto two complementary subspaces:

$$\hat{H}_0 = \hat{P}_0\hat{F}\hat{P}_0 + \hat{P}_{SD}\hat{F}\hat{P}_{SD} + \hat{P}_X\hat{F}\hat{P}_X \qquad [3]$$

where $\hat{P}_0$ projects onto the reference function, $\hat{P}_{SD}$ projects onto the interacting configuration space of Eq. [1], and $\hat{P}_X$ projects onto the remaining configuration space that does not interact with $|0\rangle$. $\hat{F}$ has been chosen as the generalized Fock operator:

$$\hat{F} = \sum_{p,q} f_{pq}\hat{E}_{pq} \qquad [4]$$

with

$$f_{pq} = h_{pq} + \sum_{r,s} D_{rs}\left[(pq|rs) - \tfrac{1}{2}(pr|qs)\right] \qquad [5]$$
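The generalized Fock matrix elements of Eq. [5] map directly onto two tensor contractions: a Coulomb-like term and an exchange-like term weighted by one half. The following is a minimal NumPy sketch, assuming the one-electron integrals, the spin-summed one-particle density matrix, and the two-electron integrals in the (pq|rs) ordering are already available as arrays. The function name and the toy one-orbital numbers are hypothetical and serve only as an illustration.

```python
import numpy as np

def generalized_fock(h, density, eri):
    """Build f_pq = h_pq + sum_rs D_rs [ (pq|rs) - 1/2 (pr|qs) ].

    h       : (n, n) one-electron integrals
    density : (n, n) spin-summed one-particle density matrix D
    eri     : (n, n, n, n) two-electron integrals with eri[p, q, r, s] = (pq|rs)
    """
    coulomb = np.einsum("rs,pqrs->pq", density, eri)    # sum_rs D_rs (pq|rs)
    exchange = np.einsum("rs,prqs->pq", density, eri)   # sum_rs D_rs (pr|qs)
    return h + coulomb - 0.5 * exchange

# Toy example: a single doubly occupied orbital (all values invented).
h_toy = np.array([[-1.0]])
d_toy = np.array([[2.0]])
eri_toy = np.full((1, 1, 1, 1), 0.5)
f_toy = generalized_fock(h_toy, d_toy, eri_toy)
print("toy f_00 =", f_toy[0, 0])

# Random test data with the usual permutational symmetry of real integrals,
# generated from symmetric factor matrices B[L] so that (pq|rs) = sum_L B[L,p,q] B[L,r,s].
rng = np.random.default_rng(0)
n = 4
b = rng.normal(size=(3, n, n))
b = 0.5 * (b + b.transpose(0, 2, 1))
eri = np.einsum("Lpq,Lrs->pqrs", b, b)
h = rng.normal(size=(n, n))
h = 0.5 * (h + h.T)
d = np.diag([2.0, 1.5, 0.5, 0.0])   # hypothetical occupation numbers
f = generalized_fock(h, d, eri)
print("f is symmetric:", np.allclose(f, f.T))
```

With a vanishing density matrix the function returns the bare one-electron matrix, which is a quick sanity check when wiring such a routine into a larger code.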
With such a formulation, $f_{pp} = -\mathrm{IP}_p$ (ionization potential) when the orbital $p$ is doubly occupied and $f_{pp} = -\mathrm{EA}_p$ (electron affinity) when the orbital is empty. The value of $f_{pp}$ will be somewhere between these two extremes for active orbitals. Thus, for orbitals with occupation number one, $f_{pp} = -\tfrac{1}{2}(\mathrm{IP}_p + \mathrm{EA}_p)$. This formulation is somewhat unbalanced and will favor systems with open shells, leading, for example, to low binding energies, as shown in the paper by Andersson and Roos (Ref. 19). The problem is that one would like to separate the energy connected with excitation out of an orbital from that of excitation into the orbital. Very recently, a modified zeroth-order Hamiltonian has been suggested by Ghigo and coworkers (Ref. 20) to accomplish this, which removes the systematic error and considerably improves both dissociation and excitation energies. Equation [5] can be written approximately as an interpolation between the two extreme cases

$$F_{pp} = -\tfrac{1}{2}\left(D_{pp}(\mathrm{IP})_p + (2 - D_{pp})(\mathrm{EA})_p\right) \qquad [6]$$
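where $D_{pp}$ is the diagonal element of the one-particle density matrix for orbital $p$. The formula is correct for $D_{pp} = 0$ and 2 and for a singly occupied open shell. Assume now that when exciting into an active orbital, one wants its energy to be replaced by $-(\mathrm{EA})_p$, the empty-orbital limit. This is achieved by adding a shift to Eq. [6]:

$$\sigma_p^{(EA)} = \tfrac{1}{2}D_{pp}\left((\mathrm{IP})_p - (\mathrm{EA})_p\right) \qquad [7]$$

Conversely, if one excites out of this orbital, its energy has to be replaced by $-(\mathrm{IP})_p$, the doubly occupied limit. The corresponding shift is

$$\sigma_p^{(IP)} = -\tfrac{1}{2}(2 - D_{pp})\left((\mathrm{IP})_p - (\mathrm{EA})_p\right) \qquad [8]$$

The definitions of $(\mathrm{IP})_p$ and $(\mathrm{EA})_p$ are not straightforward. Therefore, $(\mathrm{IP})_p - (\mathrm{EA})_p$ was replaced with an average shift parameter $\epsilon$. The two shifts are then

$$\sigma_p^{(EA)} = \tfrac{1}{2}D_{pp}\,\epsilon \qquad [9]$$

$$\sigma_p^{(IP)} = -\tfrac{1}{2}(2 - D_{pp})\,\epsilon \qquad [10]$$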
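A short numerical check makes the role of the two shifts concrete. The sketch below assumes the sign convention in which a doubly occupied orbital has the diagonal Fock element −IP and an empty orbital has −EA, and evaluates the interpolation of Eq. [6] together with the shifts of Eqs. [9] and [10]. When the shift parameter is set to IP − EA, as in Eqs. [7] and [8], the shifted element equals −EA for excitation into the orbital and −IP for excitation out of it, whatever the occupation. The IP and EA values and the function names are invented for illustration.

```python
def fock_diagonal(d_pp, ip, ea):
    """Eq. [6]: interpolated diagonal Fock element for occupation d_pp (0 to 2)."""
    return -0.5 * (d_pp * ip + (2.0 - d_pp) * ea)

def shift_into(d_pp, eps):
    """Eq. [9]: shift applied when exciting into an active orbital."""
    return 0.5 * d_pp * eps

def shift_out_of(d_pp, eps):
    """Eq. [10]: shift applied when exciting out of an active orbital."""
    return -0.5 * (2.0 - d_pp) * eps

# Hypothetical orbital with IP = 0.40 and EA = 0.10 (atomic units, illustrative only).
ip, ea = 0.40, 0.10
eps_exact = ip - ea   # Eqs. [7] and [8] correspond to this choice of shift parameter

for d_pp in (0.0, 0.5, 1.0, 1.5, 2.0):
    f = fock_diagonal(d_pp, ip, ea)
    into = f + shift_into(d_pp, eps_exact)      # should equal -EA
    out_of = f + shift_out_of(d_pp, eps_exact)  # should equal -IP
    print(f"D_pp = {d_pp:.1f}: F_pp = {f:+.3f}, into -> {into:+.3f}, out of -> {out_of:+.3f}")
```

Replacing `eps_exact` with a single average value for all orbitals, as described in the text, gives up the exact limits for each orbital but avoids having to define IP and EA orbital by orbital.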
A large number of tests showed that a value of 0.25 au for $\epsilon$ was optimal. The mean error in the dissociation energies for 49 diatomic molecules was reduced from 0.2 eV to 0.1 eV. The improvement was particularly impressive for triply bonded molecules: The average error for N$_2$, P$_2$, and As$_2$ was reduced from 0.45 eV to less than 0.15 eV. Similar absolute improvements were obtained for excitation and ionization energies (Ref. 20).

Perturbation theory like MP2 or CASPT2 should be used only when the perturbation is small. Orbitals that give rise to large coefficients for the states in Eq. [1] should be included in the active space. Large coefficients in the first-order wave function are the result of small zeroth-order energy differences between the CAS reference state and one or more of the excited functions. We call these functions intruder states. In cases where the interaction term $V_{0m}$ is small, one can remove the intruder using a level shift technique that does not affect the contributions from the other states (Refs. 21–23).

The reference (zeroth-order) function in the CASPT2 method is a predetermined CASSCF wave function. The coefficients in the CAS function are thus fixed and are not affected by the perturbation operator. This choice of the reference function often works well when the other solutions to the CAS Hamiltonian are well separated in energy, but there may be a problem when two or more electronic states of the same symmetry are close in energy. Such situations are common for excited states. One can then expect the dynamic correlation to also affect the reference function. This problem can be handled by extending the perturbation treatment to include electronic states that are close in energy. This extension, called the Multi-State CASPT2 method, has been implemented by Finley and coworkers (Ref. 24). We will briefly summarize the main aspects of the Multi-State CASPT2 method. Assume several CASSCF wave functions, $\Psi_i$, $i = 1, \ldots, N$, obtained in a state average calculation. The corresponding (single state) CASPT2 functions are $\chi_i$, $i = 1, \ldots, N$. The functions $\Psi_i + \chi_i$ are used as basis functions in a “variational” calculation where all terms higher than second order are neglected. The corresponding effective Hamiltonian has the elements:

$$(H_{\mathrm{eff}})_{ij} = \delta_{ij}E_i + \langle\Psi_i|\hat{H}|\chi_j\rangle \qquad [11]$$
where $E_i$ is the CASSCF energy for state $i$. This Hamiltonian is not symmetric, and in practice a symmetrized matrix is used. The symmetrization may cause problems if the non-Hermiticity is large, in which case it is advisable to extend the active space. One can expect this extension of the CASPT2 method to be particularly important for metal compounds, where the density of states is often high.
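The construction of Eq. [11] and the symmetrization step can be sketched for a two-state case. This is only a minimal illustration, not the implementation of Finley and coworkers: the function names, the energies, and the coupling elements below are hypothetical, and the closed-form diagonalization is valid only for a 2x2 matrix.

```python
import math


def effective_hamiltonian(casscf_energies, couplings):
    """Return (H_eff)_ij = delta_ij * E_i + <Psi_i|H|chi_j> as a nested list."""
    n = len(casscf_energies)
    return [
        [(casscf_energies[i] if i == j else 0.0) + couplings[i][j] for j in range(n)]
        for i in range(n)
    ]


def asymmetry(matrix):
    """Largest |H_ij - H_ji|, a simple measure of the non-Hermiticity."""
    n = len(matrix)
    return max(
        (abs(matrix[i][j] - matrix[j][i]) for i in range(n) for j in range(i + 1, n)),
        default=0.0,
    )


def symmetrize(matrix):
    """Replace H by (H + H^T) / 2."""
    n = len(matrix)
    return [[0.5 * (matrix[i][j] + matrix[j][i]) for j in range(n)] for i in range(n)]


def eigenvalues_2x2(sym):
    """Closed-form eigenvalues of a real symmetric 2x2 matrix, lowest first."""
    a, b, d = sym[0][0], sym[0][1], sym[1][1]
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    return mean - half_gap, mean + half_gap


# Hypothetical two-state input in hartree (illustrative numbers only).
casscf_energies = [-100.000, -99.980]            # E_i
couplings = [[-0.250, 0.004], [0.006, -0.255]]   # <Psi_i|H|chi_j>

h_eff = effective_hamiltonian(casscf_energies, couplings)
h_sym = symmetrize(h_eff)
e_low, e_high = eigenvalues_2x2(h_sym)
print("non-Hermiticity:", asymmetry(h_eff))
print("multi-state energies:", e_low, e_high)
```

A large value returned by `asymmetry` relative to the state separation would be the warning sign mentioned above that the active space should be extended.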
Treatment of Relativity

Nonrelativistic quantum chemistry has been discussed so far. But transition metal (starting already from the first row) and actinide compounds cannot be studied theoretically without a detailed account of relativity. Thus, the multiconfigurational method needs to be extended to the relativistic regime. Can this be done with enough accuracy for chemical applications without using the four-component Dirac theory? Much work has also been done in recent years to develop a reliable and computationally efficient four-component quantum chemistry.25,26 Nowadays it can be combined, for example, with the CC approach for electron correlation. The problem is that an extension to multiconfigurational
wave functions is difficult and would, if pursued, lead to lengthy and complex calculations, which allow only applications to small molecules. It is possible, however, to transform the four-component Dirac operator to a two-component form where one simultaneously analyzes the magnitude of the different terms and keeps only the most important of these terms. The most widely used transformation of this type leads to the second-order Douglas–Kroll–Hess (DKH) Hamiltonian.27,28

The DKH Hamiltonian can be divided into a scalar part and a spin-orbit coupling part. The scalar part includes the mass-velocity term and modifies the potential close to the nucleus such that the relativistic weak singularity of the orbital is removed. The effect on energies is similar to that of the Darwin term, but the resulting operator is variationally stable. This part of the relativistic corrections can easily be included in a nonrelativistic treatment. Usually, only contributions to the one-electron Hamiltonian are included. For lighter atoms, the scalar relativistic effects will be dominant, and calculations on, say, first-row transition metal compounds can safely be performed by adding only this term to the one-electron Hamiltonian that is used in nonrelativistic quantum chemical methods. The scalar DKH Hamiltonian has been implemented recently into the CASSCF/CASPT2 version of the multiconfigurational approach by Roos and Malmqvist.29

The scalar terms are only one part of the DKH Hamiltonian. There is also a true two-component term whose dominant part is the spin-orbit interaction. This is a two-electron operator and is therefore difficult to implement for molecular systems. However, in 1996, Hess and coworkers30 suggested an effective one-electron Fock-type spin-orbit Hamiltonian that significantly simplifies the algorithm for the subsequent calculation of spin-orbit matrix elements.
Two-electron terms are treated as screening corrections of the dominant one-electron terms, at least for heavy elements. The atomic mean field integrals (AMFI) method is used, which, based on the short-range behavior of the spin-orbit interaction, avoids the calculation of multi-center one- and two-electron spin-orbit integrals and thus reduces the integral evaluation to individual atoms, taking advantage of full spherical symmetry. The approach reduces the computational effort drastically but leads to a negligible loss of accuracy compared with, e.g., basis set or correlation limitations, as shown by Christiansen et al.31

The treatment of the spin-orbit part of the DKH Hamiltonian within the AMFI scheme is based on the assumption that the strongest effects of spin-orbit coupling (SOC) arise from the interaction of electronic states that are close in energy. For these states, independent CASSCF/CASPT2 calculations are performed. The resulting CASSCF wave functions are then used as basis functions for the calculation of the spin-orbit coupling. The diagonal elements of the spin-orbit Hamiltonian can be modified to account for dynamic correlation effects on the energy by, for example, replacing the CASSCF energies with CASPT2 energies. To be able to use the above procedure, one needs to compute matrix elements between complex CASSCF wave functions, which is not trivial because the orbitals of two
different CASSCF wave functions are usually not orthogonal. A method to deal with this problem was developed by Malmqvist in the late 1980s.32,33 The method has become known as the CASSCF State Interaction (CASSI) method. It is also effective for long CAS-CI expansions and was recently extended to handle the integrals of the spin-orbit Hamiltonian.34

Is the method outlined above accurate enough for heavy element quantum chemistry? Several studies have been performed on atoms and molecules, showing that the approach is capable of describing relativistic effects in molecules containing most atoms of the periodic system with good accuracy, with the exception of the fifth-row elements Tl–At. Here the method gives larger errors than for any other atoms in the periodic system.35 Studies on actinide atoms and molecules show, however, that the method works well for the f-elements. Several examples will be given below.
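The state-interaction idea described above, with spin-free states as the basis, correlated energies on the diagonal, and spin-orbit matrix elements off the diagonal, can be illustrated with a two-state model. The sketch below is not the CASSI algorithm itself: the function name and all numbers are hypothetical, and a real calculation couples all spin components of many states.

```python
import math


def two_state_spin_orbit(e1, e2, so_element):
    """Eigenvalues of the Hermitian matrix [[e1, h], [conj(h), e2]], lowest first."""
    mean = 0.5 * (e1 + e2)
    # For a 2x2 Hermitian matrix only |h| enters the eigenvalues.
    half_gap = math.hypot(0.5 * (e1 - e2), abs(so_element))
    return mean - half_gap, mean + half_gap


# Hypothetical spin-free CASPT2 energies (eV) used as diagonal elements,
# and a hypothetical complex spin-orbit matrix element between the two states.
e_caspt2 = (0.00, 0.30)
h_so = complex(0.0, 0.20)

lower, upper = two_state_spin_orbit(e_caspt2[0], e_caspt2[1], h_so)
print("spin-orbit coupled levels (eV):", lower, upper)
```

The closer the two diagonal energies, the larger the shift produced by a given coupling element, which is the reason the scheme concentrates on states that are close in energy.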
Relativistic AO Basis Sets

It is not possible to use normal AO basis sets in relativistic calculations: the relativistic contraction of the inner shells makes it necessary to design new basis sets to account for this effect. Specially designed basis sets have therefore been constructed using the DKH Hamiltonian. These basis sets are of the atomic natural orbital (ANO) type and are constructed such that semi-core electrons can also be correlated. They have been given the name ANO-RCC (relativistic with core correlation) and cover all atoms of the Periodic Table.36–38 They have been used in most applications presented in this review. ANO-RCC are all-electron basis sets. Deep core orbitals are described by a minimal basis set and are kept frozen in the wave function calculations. The extra cost compared with using effective core potentials (ECPs) is therefore limited. ECPs, however, have been used in some studies, and more details will be given in connection with the specific application. The ANO-RCC basis sets can be downloaded from the home page of the MOLCAS quantum chemistry software (http://www.teokem.lu.se/molcas).
THE MULTIPLE METAL–METAL BOND IN Re₂Cl₈²⁻ AND RELATED SYSTEMS

In 1965 Cotton and Harris examined the crystal structure of K₂[Re₂Cl₈]·2H₂O (Ref. 39) and reported a surprisingly short Re–Re distance of 2.24 Å. This was the first reported example of a multiple bond between two metal atoms, and the Re₂Cl₈²⁻ ion (Figure 2) has since become the prototype for this family of complexes. Cotton analyzed the bonding using simple MO theory and concluded that a quadruple Re–Re bond was formed.39,40 Two parallel ReCl₄ units are connected by the Re–Re bond. The dx²−y², px, py, and s orbitals of the valence
Figure 2 The structure of Re₂Cl₈²⁻.
shell of each Re atom form the σ bonds to each Cl atom. The remaining dz² and pz orbitals with σ symmetry relative to the Re–Re axis, the dxz and dyz with π symmetry, and the dxy with δ symmetry form the quadruple Re–Re bond. The system thus contains one σ bond, two π bonds, and one δ bond between the Re atoms. Because there are eight electrons (Re³⁺ is d⁴) to occupy these MOs, the ground state configuration will be σ²π⁴δ². The presence of the δ bond explains the eclipsed conformation of the ion. In a staggered conformation, the overlap of the δ atomic orbitals is zero and the δ bond disappears. The visible spectrum was also reported in these early studies.

The notion of a quadruple bond is based on the inherent assumption that four bonding orbitals are doubly occupied. Today we know that this is not the case for weak inter-metallic bonds. The true bond order depends on the relation between the occupation of the bonding and antibonding orbitals, respectively. Such a description is, however, only possible if a quantum chemical model is used that goes beyond the single configuration Hartree–Fock model. The CASSCF model has been used to study this system, and it has been demonstrated that the true bond order between the two Re atoms is closer to three than to four.

Because the Re₂Cl₈²⁻ ion is such an important entity in inorganic chemistry, we decided to study its structure and electronic spectrum using multiconfigurational quantum chemistry.41 Scalar relativistic effects and spin-orbit coupling were included in this study. The geometry of Re₂Cl₈²⁻ was obtained at the CASPT2 level of theory, and several excited states were calculated at this geometry. The calculations were performed using the active space formed by 12 active electrons in 12 active orbitals (12/12) (reported in Figure 3).
It comprises one 5dσ, two 5dπ, and one 5dδ Re–Re bonding orbitals and the corresponding antibonding orbitals, and two Re–Cl δ bonding orbitals and the corresponding two antibonding orbitals. They are nicely paired such that the sum of the occupation numbers for the Re–Re bonding and antibonding orbitals of a given type is almost exactly two. The two bonding Re–Cl
Figure 3 The molecular orbitals describing the bonds in Re₂Cl₈²⁻.
orbitals are located mainly on Cl as expected, whereas the antibonding orbitals have large contributions from 3dx²−y². The occupation is low, and these orbitals are thus almost empty and may be used as acceptor orbitals for electronic transitions. The strongest bond between the two Re atoms is the σ bond, with an occupation number of $\eta_b = 1.92$ for the bonding and $\eta_a = 0.08$ for the antibonding natural orbital. We could estimate the effective bond order as $(\eta_b - \eta_a)/(\eta_b + \eta_a)$, and for the σ bond, we obtain the value 0.92. The corresponding value for the π bond is 1.74. The δ pair gives an effective bond order of only 0.54. Adding up these numbers results in a total effective bond order of 3.20 for Re₂Cl₈²⁻. The main reduction of the bond order from 4.0 to 3.2 is thus from the δ bond. Note that natural orbital occupation numbers that deviate substantially from zero and two indicate the need for a CASSCF description of Re₂Cl₈²⁻.

Vertical excitation energies and oscillator strengths have been determined at the CASPT2 level with and without the inclusion of spin-orbit coupling. We refer the interested reader to the original manuscript41 for the details of the calculations and describe here only the most significant features of the spectrum. The most relevant transitions are reported in Table 1. The lowest band detected experimentally occurs at 1.82 eV (14,700 cm⁻¹) with an oscillator strength of 0.023. It has been assigned to the δ → δ* (¹A1g → ¹A2u) transition. Our 12/12 calculation predicts an excitation energy of 2.03 eV at the CASPT2 level with an oscillator strength equal to 0.004. Calculations with enlarged active spaces were also performed, for example, using 16 electrons in 14 orbitals (16/14). These calculations predict a
Table 1 Spin-Free Excitation Energies in Re₂Cl₈ (in eV) Calculated at the CASSCF (CAS) and CASPT2 (PT2) Level.

State                        E(CAS)   E(PT2)            Expt.ᵃ            Q(Re)
δ → δ*, ¹A2u                 3.08     2.03(0.0037)      1.82(0.023)       1.03
δ → π*, ¹A1g                 2.90     2.29(f)           2.19(weak)        1.03
π → δ*, ¹Eg                  3.41     2.70(f)                             1.04
δ → π*, π → δ*, ¹Eg          3.87     3.10(f)                             1.04
δ → σ*, ¹B1u                 4.47     3.10(f)                             1.00
δ → dx²−y², ¹A2g             3.96     3.37(f)                             1.11
(δ,π) → (δ*)², ¹Eu           4.20     3.38(0.29E-03)                      1.04
Cl(3p) → δ* LMCT, ¹Eu        6.37     3.56(0.60E-04)    3.48              0.84
δ → dx²−y², ¹A1u             4.24     3.59(f)                             1.13
π → π*, ¹A1u                 5.02     3.76(f)                             1.04
(δ,π) → (δ*)², ¹Eu           4.81     3.80(0.92E-04)                      1.04
(δ,π) → (δ*π*), ¹A1g         5.01     3.91(f)                             1.05
π → π*, ¹B1u                 5.17     4.00(f)                             1.05
Cl(3p) → δ*; LMCT, ¹Eu       6.54     4.08(0.08)        3.83(intense)     0.88
σ → δ*; π → π*, ¹B1u         6.01     4.13(f)                             1.05
π → dx²−y², ¹Eu              6.15     4.17(0.009)                         1.08
(δ,π) → (δ*π*), ¹B2g         5.66     4.30(f)                             1.04
δπ → δ*σ*, ¹Eu               6.79     4.40(1.0E-04)     4.42(complex)     1.03
σ → σ*; π → π*, ¹A2u         6.66     4.56(0.015)       4.86(intense)     1.04

ᵃ From Ref. 42. Further experimental bands from Ref. 42: 2.60, 2.93 (very weak), and 3.35 eV.
Notes: Oscillator strengths are given within parentheses; (f) marks a forbidden transition. Q(Re) gives the Mulliken charge on one Re atom.
CASPT2 excitation energy that varies between 1.68 and 1.74 eV and an oscillator strength that varies between 0.007 and 0.092, which shows that the oscillator strength is very sensitive to the active space. The low energy of this transition is a result of the weak δ bond, which places the δ* orbital at low energy.

In the region of weak absorption between 1.98 and 3.10 eV (16,000–25,000 cm⁻¹), the first peak occurs at 2.19 eV (17,675 cm⁻¹) and has been assigned to a δ → π* (1¹A1g → 2¹A1g) transition located mostly on Re. We predicted it to be at 2.29 eV, and it is a forbidden transition. Two bands have then been assigned to charge transfer (CT) states. They occur at 3.35 eV and 3.48 eV, respectively. It was suggested that they correspond to two A2u spin-orbit components of two close-lying ³Eu states.43 We have not studied the triplet ligand to metal charge transfer (LMCT) states, but our first singlet CT state was predicted at 3.56 eV, corresponding to a Cl(3p) → δ* (¹A1g → ¹Eu) LMCT transition. Thus, it seems natural to assign the upper of the two bands to this transition. The peak at 3.35 eV has been assigned to a metal localized transition. A (δ,π) → (δ*,π*) (¹A1g → ¹A1g) transition is predicted at 3.91 eV and a π → π* (¹A1g → ¹B1u) transition at 4.00 eV. No corresponding experimental
bands could be found. An intense CT state is found in the experimental spectrum at 3.83 eV, and it is assigned to the Cl(3p) → δ* (¹A1g → ¹Eu) transition that we predict at 4.08 eV with an oscillator strength of 0.08. Trogler et al.43 have suggested that the complex band found at 4.42 eV should be a mixture of two LMCT transitions. We find no evidence of this mixture in the calculations, but a weak ¹Eu state is found at 4.40 eV and there are other symmetry forbidden transitions nearby. An intense band is found at 4.86 eV with a tentative assignment π → π* (¹A1g → ¹A2u). We agree with this assignment and compute the state to occur at 4.56 eV with an oscillator strength of 0.015.

The spectrum of Re₂Cl₈²⁻ was recomputed with the inclusion of spin-orbit coupling, leading to no change in the qualitative features of the spectrum. There is a small shift in the energies and intensities, but we do not see any new states with intensities appreciably different from zero. We may, however, have lost some information because we have not studied the LMCT triplet states and the corresponding effects of spin-orbit splitting.

Four compounds containing metal–metal quadruple bonds, the [M₂(CH₃)₈]ⁿ⁻ ions, where M = Cr, Mo, W, Re and n = 4, 4, 4, 2, respectively, have also been studied theoretically44 using the same CASPT2 method employed in the Re₂Cl₈²⁻ case. The molecular structure of the ground state of these compounds has been determined, and the energy of the δ → δ* transition has been calculated and compared with previous experimental measurements. The high negative charges on the Cr, Mo, and W complexes lead to difficulties in the successful modeling of the ground-state structures, a problem that has been addressed by the explicit inclusion of four Li⁺ ions in these calculations. The ground-state geometries of the complexes and the δ → δ* transition energies are in excellent agreement with experiment for Re, but in only satisfactory agreement for Mo, Cr, and W.
The primary goal of this study44 was to provide a theoretical understanding of the apparently linear relationship between metal–metal bond length and δ → δ* excitation energy for the octamethyldimetallates of Re, Cr, Mo, and W. As we demonstrated, these seemingly simple anionic systems represent a surprising challenge to modern electronic structure methods, largely because of the difficulty in modeling systems (without electronegative ligands) that have large negative charges. Nevertheless, by using the CASPT2 method with Li⁺ counterions, one can model the ground-state geometries of these complexes in a satisfactory way. This multiconfigurational approach, which is critical for the calculation of excited-state energies of the complexes, does a fairly good job of modeling trends in the δ → δ* excitation energy with the metal–metal bond length, although the accuracy is such that we are not yet able to explain fully the linear relationship discovered by Sattelberger and Fackler.45 Progress on these systems will require better ways to accommodate the highly negative charges, which are in general difficult to describe because of the intrinsic problem of the localization of the negative charges. These efforts are ongoing.
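For the Re₂Cl₈²⁻ bands whose assignments are discussed earlier in this section, the computed and measured positions can be compared with a few lines of code. The sketch below uses only the CASPT2 and experimental energies quoted in the text for the five bands with an explicit assignment. The conversion factor of 8065.544 cm⁻¹ per eV is the standard value, and the variable and function names are illustrative choices.

```python
EV_TO_WAVENUMBER = 8065.544  # cm^-1 per eV


def ev_to_wavenumber(energy_ev):
    """Convert an energy in eV to a wavenumber in cm^-1."""
    return energy_ev * EV_TO_WAVENUMBER


# (assignment, CASPT2 energy in eV, experimental band position in eV),
# taken from the discussion of the spectrum in the text.
bands = [
    ("delta -> delta*, 1A2u", 2.03, 1.82),
    ("delta -> pi*, 1A1g", 2.29, 2.19),
    ("Cl(3p) -> delta* LMCT, 1Eu", 3.56, 3.48),
    ("Cl(3p) -> delta* LMCT, 1Eu", 4.08, 3.83),
    ("pi -> pi*, 1A2u", 4.56, 4.86),
]

deviations = [calc - expt for _, calc, expt in bands]
mean_abs_dev = sum(abs(d) for d in deviations) / len(deviations)

for (label, calc, expt), dev in zip(bands, deviations):
    print(f"{label:30s} calc {calc:.2f}  expt {expt:.2f}  diff {dev:+.2f} eV "
          f"({ev_to_wavenumber(dev):+.0f} cm^-1)")
print(f"mean absolute deviation: {mean_abs_dev:.3f} eV")
```

The 4.40 eV state is left out because its relation to the complex band at 4.42 eV is described only tentatively.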
THE Cr–Cr MULTIPLE BOND

The chromium atom has a ground state with six unpaired electrons (3d⁵4s, ⁷S). Forming a bond between two Cr atoms could, in principle, result in a hextuple bond, so it is not surprising that the chromium dimer has become a challenging test for various theoretical approaches to chemical bonding. Almost all existing quantum chemical methods have been used. The results vary widely in quality (see Ref. 46 for references to some of these studies). It was not until Roos applied the CASSCF/CASPT2 approach to Cr₂ that a consistent picture of the bonding was achieved.46 This study resulted in a bond energy (D₀) of 1.65 eV, a bond distance of 1.66 Å, and an ωe value of 413 cm⁻¹ (experimental values are 1.53 ± 0.06 eV,47 1.68 Å,48 and 452 cm⁻¹,49 respectively).

Do the two chromium atoms form a hextuple bond? The calculations by Roos46 gave the following occupations of the bonding and antibonding orbitals: 4sσg 1.90, 4sσu 0.10, 3dσg 1.77, 3dσu 0.23, 3dπu 3.62, 3dπg 0.38, 3dδg 3.16, and 3dδu 0.84, yielding a total effective bond order of 4.46. The δ bond is weak and could be considered as intermediate between a "true" chemical bond and four antiferromagnetically coupled electrons. The chromium dimer could thus also be described as a quadruply bonded system with the δ electrons localized on the separate atoms and coupled in a way to give a total spin of zero.

The difficulty in forming all six bonds arises mainly from the large difference in size between the 3d and 4s orbitals. When the Cr–Cr distance is such that the 3d orbitals reach an effective bonding distance, the 4s orbitals are already far up on the repulsive part of their potential curve, a behavior that explains why the bond energy is so small despite the high bond order. The difference in orbital size decreases for heavier atoms. The 5s orbital of Mo is more similar in size to the Mo 4d orbital.
Even more pronounced is the effect for W, where the relativistic contraction of the 6s orbital and the corresponding expansion of the 5d orbital make them very similar in size. The result is a much stronger bond for W₂ with a bond energy above 5 eV and an effective bond order of 5.19.50 The tungsten dimer can thus be described as very nearly a true hextuply bonded system. The occupation numbers of the bonding orbitals are never smaller than 1.8, and the resulting bond order is the highest of any dimer in the Periodic Table.

Nguyen et al. synthesized a dichromium compound with the general structure ArCrCrAr, where Cr is in the +1 oxidation state.51 This is the first example of a compound with Cr in that oxidation state. A bond distance of 1.83 Å was determined for the Cr–Cr bond, and it was concluded that a quintuple bond was formed. CASSCF/CASPT2 calculations on the model compound PhCrCrPh (Ph = phenyl) subsequently confirmed this picture.52 The natural orbital occupation numbers (NOONs) were found to be 3dσg 1.79, 3dσu 0.21, 3dπu 3.54, 3dπg 0.46, 3dδg 3.19, and 3dδu 0.81, which were very
similar to those of the chromium dimer, again with a weak δ bond. The total effective bond order is 3.52, so the bond is intermediate between a triple bond with four antiferromagnetically coupled δ electrons and a true quintuple bond. The bond energy was estimated to be about 3.3 eV, which is twice as much as for the chromium dimer. The reason for this large bond energy is the absence of the 4s electron in the Cr(I) ion.

Dichromium(II) compounds have been known for a long time. In particular, the tetracarboxylates have been studied extensively since the first synthetic work of Cotton.53 The Cr–Cr bond length varies extensively depending on the donating power of the bridging ligands and the existence of additional axial ligands. The shortest bond length, 1.966 Å, was found for Cr₂(O₂CCH₃)₄ in a gas phase measurement.54 A CASSCF calculation at this bond distance yields the following natural orbital occupation numbers: 3dσ 1.68, 3dσ* 0.32, 3dπ 3.10, 3dπ* 0.90, 3dδ 1.21, and 3dδ* 0.79, giving an effective bond order of only 1.99. Note that, as in other examples discussed in this chapter, NOONs for the antibonding orbitals that are significantly greater than zero (the value they would have in a HF or DFT calculation) indicate the need for a CASSCF description. A bond order of 1.99 is far from a quadruple bond, which explains the great variability in bond length depending on the nature of the ligands. Another feature of these compounds is their temperature-dependent paramagnetism, explained by the existence of low-lying triplet excited states, which arise from a shift of the weakly coupled δ electron spin.55

A general picture of the Cr–Cr multiple bond emerges from these studies. Not unexpectedly, fully developed bonds are formed by the 3dσ and 3dπ orbitals, whereas the 3dδ orbitals are only weakly coupled. The notion of a hextuple bond in the Cr₂ system, a quintuple bond in ArCrCrAr, and a quadruple bond in the Cr(II)–Cr(II) complexes is therefore an exaggeration.
The situation is different for the corresponding compounds containing the heavier atoms Mo and W, where more fully developed multiple bonds can be expected in all three cases.50
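The effective bond orders quoted in this section can be checked from the natural orbital occupation numbers. The sketch below applies the expression (η_b − η_a)/(η_b + η_a) to each bonding/antibonding pair and multiplies by the number of bonds in the pair. Treating pairs whose occupations sum to about four as doubly degenerate is an assumption made here for bookkeeping, and the function and variable names are illustrative.

```python
def effective_bond_order(pairs):
    """Sum degeneracy * (eta_b - eta_a) / (eta_b + eta_a) over orbital pairs.

    Each entry is (label, eta_bonding, eta_antibonding, degeneracy), where the
    occupations are totals over the degenerate components of the pair.
    """
    total = 0.0
    for _label, eta_b, eta_a, degeneracy in pairs:
        total += degeneracy * (eta_b - eta_a) / (eta_b + eta_a)
    return total


# Occupation numbers as quoted in the text.
cr2 = [
    ("4s sigma", 1.90, 0.10, 1),
    ("3d sigma", 1.77, 0.23, 1),
    ("3d pi", 3.62, 0.38, 2),
    ("3d delta", 3.16, 0.84, 2),
]
phcrcrph = [
    ("3d sigma", 1.79, 0.21, 1),
    ("3d pi", 3.54, 0.46, 2),
    ("3d delta", 3.19, 0.81, 2),
]
cr2_acetate = [
    ("3d sigma", 1.68, 0.32, 1),
    ("3d pi", 3.10, 0.90, 2),
    ("3d delta", 1.21, 0.79, 1),
]

bo_cr2 = effective_bond_order(cr2)
bo_phcrcrph = effective_bond_order(phcrcrph)
bo_acetate = effective_bond_order(cr2_acetate)
print(f"Cr2: {bo_cr2:.2f}  PhCrCrPh: {bo_phcrcrph:.2f}  Cr2(O2CCH3)4: {bo_acetate:.2f}")
```

With the rounded occupations listed in the text, the Cr₂ value comes out at 4.45, within rounding of the quoted 4.46, while the values for PhCrCrPh and the tetraacetate reproduce 3.52 and 1.99.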
Cu₂O₂ THEORETICAL MODELS

An accurate description of the relative energetics of alternative bis(μ-oxo) and μ-η²:η² peroxo isomers of Cu₂O₂ cores supported by 0, 2, 4, and 6 ammonia ligands (Figure 4) is remarkably challenging for a wide variety of theoretical models, primarily because of the difficulty of maintaining a balanced description of rapidly changing dynamical and nondynamical electron correlation effects and the varying degree of biradical character along the isomerization coordinate. The isomerization process interconverting the three isomers depicted in Figure 4, with and without ammonia ligands, has been studied recently,54,58 using the completely renormalized coupled cluster level of theory, including
Figure 4 Some isomers of two supported copper(I) atoms and O₂.
triple excitations, various density functional levels of theory, and the CASSCF/CASPT2 method. The completely renormalized coupled cluster level of theory including triple excitations and the pure density functional levels of theory agree quantitatively with one another and also agree qualitatively with experimental results for Cu₂O₂ cores supported by analogous but larger ligands. The CASPT2 approach, by contrast, significantly overestimates the stability of bis(μ-oxo) isomers.

The relative energies of μ-η¹:η¹ (trans end-on) and μ-η²:η² (side-on) peroxo isomers (Figure 4) of Cu₂O₂ fragments supported by 0, 2, 4, and 6 ammonia ligands have also been computed with various density functional, CC, and multiconfigurational protocols. Substantial disagreement exists among the different levels of theory for most cases, although completely renormalized CC methods seem to offer the most reliable predictions. The significant biradical character of the end-on peroxo isomer is problematic for the density functionals, whereas the demands on active space size and the need to account for interactions between different states in second-order perturbation theory prove to be challenging for the multireference treatments.

For the details of the study, the reader should refer to the original papers.56,57 We focus here on the CASSCF/CASPT2 calculations and try to understand why in the current case the method has not been able to produce satisfactory results. As stated, the method depends on the active space. What are the relevant molecular orbitals that need to be included to have an adequate description of Cu₂O₂? A balanced active space would include the molecular orbitals generated as linear combinations of the Cu 3d and O 2p atomic orbitals. Previous work also discussed the importance of including a second d shell in the active space for systems where the d shell is more than half filled58,59 (the double-shell effect).
In total this would add up to 28 active electrons in 26 active orbitals. Such an active space is currently too large to be treated with the CASSCF/CASPT2 method. Several attempts have been made to truncate the 28/26 active space to smaller and affordable active spaces, but with little success. The Cu₂O₂ problem represents a case in which the CASSCF/CASPT2 method currently still fails. The relative energies (kcal mol⁻¹) of the triligated bis(μ-oxo) and μ-η²:η² (side-on) peroxo isomers are reported in Table 2.
Table 2 Relative Energies (kcal mol⁻¹) of the bis(μ-oxo) Isomer of Cu₂O₂ with Respect to the μ-η²:η² Peroxo Isomer with Various Methods.

Method               ΔE
CCSD(T)              6.3
CR-CCSD(T)           4.3
CR-CCSD(T)L/BS2      13.1
CR-CCSD(T)Lᵃ         10.1
CASSCF(8,8)          17.9
CASSCF(16,14)        29.8
CASSCF(14,15)        22.5
CASPT2(8,8)          12.1
CASPT2(16,14)        17.2
CASPT2(14,15)        16.6
BS-BLYP              8.4
BS-B3LYP             26.8
BS-mPWPW91           9.1
BS-TPSS              7.9

Notes: CCSD(T): coupled cluster method. BLYP, B3LYP, mPWPW91, and TPSS: various density functional theory-based methods. BS means broken symmetry DFT. See Refs. 56 and 59 for a description of the details of the calculations.
Although CC and DFT (in agreement with experiment) predict the peroxo structure to be more stable than the bis(μ-oxo) structure by about 10 kcal mol⁻¹, CASPT2 always overestimates the stability of the bis(μ-oxo) isomer, by about 30 kcal mol⁻¹, independent of the active space used. We believe that this result stems from the inadequacy of the active spaces used. Similar results are obtained for the isomerization reaction interconverting the trans end-on μ-η¹:η¹ and the side-on μ-η²:η² isomers. Extending the CASSCF/CASPT2 approach to RASSCF/RASPT2, so as to handle larger active spaces, up to 28 electrons in 26 orbitals, seems to give promising results.60
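The size of the balanced active space mentioned above can be made concrete with a short count. The sketch below reproduces the 28/26 figure under one possible bookkeeping, assumed here rather than taken from the source: ten 3d electrons per Cu(I) and four 2p electrons per O atom. It then counts the Slater determinants with equal numbers of α and β electrons, ignoring spatial symmetry. The function names are illustrative.

```python
from math import comb


def cu2o2_active_space(second_d_shell=True):
    """Return (electrons, orbitals) for Cu 3d + O 2p, optionally with a second Cu d shell."""
    n_cu, n_o = 2, 2
    orbitals = n_cu * 5 + n_o * 3          # Cu 3d and O 2p orbitals
    if second_d_shell:
        orbitals += n_cu * 5               # correlating (double-shell) d orbitals
    electrons = n_cu * 10 + n_o * 4        # assumed: Cu(I) 3d10 and O 2p4
    return electrons, orbitals


def n_determinants(n_electrons, n_orbitals):
    """Number of M_S = 0 Slater determinants for a complete active space."""
    n_alpha = n_electrons // 2
    n_beta = n_electrons - n_alpha
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)


print("full space:", cu2o2_active_space())
for n_el, n_orb in [(8, 8), (16, 14), (14, 15), (28, 26)]:
    print(f"CAS({n_el},{n_orb}): {n_determinants(n_el, n_orb):.3e} determinants")
```

The count grows from a few thousand determinants for the (8,8) space to roughly 10¹⁴ for the (28,26) space, which illustrates why the full space is out of reach for a complete active space treatment and why restricted active space schemes are attractive.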
SPECTROSCOPY OF TRIATOMIC MOLECULES CONTAINING ONE URANIUM ATOM

The chemistry of uranium interacting with atmospheric components, like carbon, nitrogen, and oxygen, poses a formidable challenge to both experimentalists and theoreticians. Few spectroscopic observations for actinide compounds are suitable for direct comparison with properties calculated for isolated molecules (ideally, gas phase data are required for such comparisons). It has been found that even data for molecules isolated in cryogenic rare gas matrixes, a medium that is usually considered to be minimally perturbing, can
be influenced by the host. Calculations on isolated molecules are thus of great help in interpreting such experimental measurements. We have studied several triatomic compounds of general formula XUY, where X, Y = C, N, O, and U is the uranium atom in the formal oxidation state 4+, 5+, or 6+. We have determined the vibrational frequencies for the electronic ground state of NUN, NUO⁺, NUO, OUO²⁺, and OUO⁺ (Ref. 61) and have compared them with the experimental measurements performed by Zhou and coworkers.62 The CASSCF/CASPT2 method has proven to be able to reproduce experimental results with satisfactory agreement for all these systems.

The electronic ground state and excited states of OUO were studied extensively.63–65 The ground state was found to be a (5fφ)(7s), ³Φ2u state. The lowest state of gerade symmetry, ³H4g, corresponding to the electronic configuration (5f)², was found to be 3300 cm⁻¹ above the ground state. The computed energy levels and oscillator strengths were used for the assignment of the experimental spectrum,66,67 in energy ranges up to 32,000 cm⁻¹ above the ground state.

The reaction between a uranium atom and a nitrogen molecule N₂ leading to the formation of the triatomic molecule NUN was investigated.68 The system proceeds from a neutral uranium atom in its (5f)³(6d)(7s)², ⁵L ground state to the linear molecule NUN, which has a ¹Σg⁺ ground state and a formal U(VI) oxidation state. The effect of spin-orbit coupling was estimated at crucial points along the reaction coordinate. The system proceeds from a quintet state for U + N₂, via a triplet transition state, to the final closed shell molecule. A possible energy barrier for the insertion reaction is caused primarily by the spin-orbit coupling energy.

The lowest electronic states of the CUO molecule were also studied.69 The ground state of linear CUO was predicted to be a Φ₂ state (a Φ state with the total angular momentum Ω equal to two).
The calculated energy separation between the Σ₀⁺ and the Φ₂ states is 0.36 eV at the geometry of the Σ₀⁺ state [(C–U) = 1.77 Å and (U–O) = 1.80 Å], and 0.55 eV at the geometry of the Φ₂ state [(C–U) = 1.87 Å and (U–O) = 1.82 Å]. These results indicate that the Φ₂ state is the ground state of free CUO. This prediction is at odds with the experimental results,70 which are also supported by some DFT calculations. According to the results of Andrews and co-workers, the ground state of the CUO molecule shifts from a closed shell ground state to a triplet ground state when going from a Ne matrix (analogous to free CUO) to an Ar matrix. Other groups are also working on the topic,71 which remains under debate.

For the systems described here, a multiconfigurational treatment is needed, especially in the case of OUO, where the ground state is not a closed shell and several electronic states lie close in energy to the ground state. In general, the ground state and low-lying excited states of these systems are described in a satisfactory way in comparison with experiment with the CASSCF/CASPT2 approach, whereas the high-lying excited states are in less accurate agreement with experiment, because it becomes difficult to include all relevant orbitals in the active space.
ACTINIDE CHEMISTRY IN SOLUTION

The elucidation of actinide chemistry in solution is important for understanding actinide separation and for predicting actinide transport in the environment, particularly with respect to the safety of nuclear waste disposal.72,73 The uranyl ion, UO₂²⁺, for example, has received considerable interest because of its importance for environmental issues and its role as a computational benchmark system for higher actinides. Direct structural information on the coordination of uranyl in aqueous solution has been obtained mainly by extended X-ray absorption fine structure (EXAFS) measurements,74–76 whereas X-ray scattering studies of uranium and actinide solutions are rarer.77 Various ab initio studies of uranyl and related molecules, with a polarizable continuum model to mimic the solvent environment and/or a number of explicit water molecules, have been performed.78–82

We have performed a structural investigation of the carbonate system of dioxouranyl (VI) and (V), [UO₂(CO₃)₃]⁴⁻ and [UO₂(CO₃)₃]⁵⁻, in water.83 This study showed that only minor geometrical rearrangements occur upon the one-electron reduction of [UO₂(CO₃)₃]⁴⁻ to [UO₂(CO₃)₃]⁵⁻, which supports the reversibility of this reduction. We have also studied the coordination of the monocarbonate, bicarbonate, and tricarbonate complexes of neptunyl in water, by using both explicit water molecules and a continuum solvent model.84 The monocarbonate complex was shown to have a pentacoordinated structure, with three water molecules in the first coordination shell, and the bicarbonate complex has a hexacoordinated structure, with two water molecules in the first coordination shell. Overall good agreement with experimental results was obtained.
To understand the structural and chemical behavior of uranyl and actinyls in solution, it is necessary to go beyond a quantum chemical model of the actinyl species in a polarizable continuum medium by including several explicit water molecules. A dynamic description of these systems is important for understanding the effect of the solvent environment on the charged ions. It is thus necessary to combine quantum chemical results with potential-based molecular dynamics simulations. Empirical and/or semiempirical potentials are commonly used in most commercial molecular simulation packages (for example, AMBER), and they are generated to reproduce information obtained by experiment or, to some extent, results obtained from theoretical modeling. Simulations using these potentials are accurate only when they are performed on systems similar to those for which the potential parameters were fitted. This approach is not adequate for simulating actinide chemistry in solution, because few experimental data (structural and energetic) are available for actinides in solution, especially for actinides heavier than uranium. An alternative way to perform a simulation is to generate intermolecular potentials fully ab initio, from molecular wave functions for the separate
entities. We have studied the structure and dynamics of the water environment around a uranyl ion using such an approach (the nonempirical model potential, NEMO, method), which has been developed during the last 15 years.85,86 It has been used primarily to study systems like liquid water and water clusters, liquid formaldehyde and acetonitrile, and the solvation of organic molecules and inorganic ions in water. A recent review article85 by Engkvist contains references on specific applications. The interaction between uranyl and a water molecule has been studied using accurate quantum chemical methods.87 The information gained has been used to fit a NEMO potential, which is then used to evaluate other interesting structural and dynamical properties of the system. Multiconfigurational wave function calculations were performed to generate pair potentials between uranyl and water. The quantum chemical energies were used to fit parameters in a polarizable force field with an added charge transfer term. Molecular dynamics simulations were then performed for the uranyl ion solvated in up to 400 water molecules. The results showed a uranyl ion with five water molecules coordinated in the equatorial plane. The U–water distance is 2.40 Å, which is close to the experimental estimates. A second coordination shell starts at about 4.7 Å from the uranium atom. Exchange of waters between the first and second solvation shells is found to occur through a path intermediate between association and interchange. This study is the first fully ab initio determination of the solvation of the uranyl ion in water.
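The first-shell coordination number quoted above can be read off a simulation frame by counting the water oxygens that lie closer to uranium than a cutoff placed between the two shells. The following is a minimal sketch of that counting step only, not the NEMO analysis used in the study: the function names, the 3.5 Å cutoff, and the distances are illustrative assumptions, chosen to mimic a first shell near 2.40 Å and a second shell starting near 4.7 Å.

```python
def first_shell(u_o_distances, cutoff=3.5):
    """Return the U-O(water) distances (in angstrom) inside the first shell.

    The cutoff is an assumed value lying between the first shell
    (about 2.4 angstrom) and the onset of the second shell (about 4.7 angstrom).
    """
    return [r for r in u_o_distances if r < cutoff]


def first_shell_count(u_o_distances, cutoff=3.5):
    """Coordination number: how many water oxygens sit inside the cutoff."""
    return len(first_shell(u_o_distances, cutoff))


def mean_first_shell_distance(u_o_distances, cutoff=3.5):
    """Average U-O distance over the first-shell waters."""
    shell = first_shell(u_o_distances, cutoff)
    if not shell:
        raise ValueError("no water molecule inside the cutoff")
    return sum(shell) / len(shell)


# Hypothetical U-O(water) distances (angstrom) for a single frame.
frame = [2.38, 2.41, 2.40, 2.43, 2.39, 4.72, 4.80, 4.95, 5.10]

print(first_shell_count(frame))                    # prints 5
print(round(mean_first_shell_distance(frame), 2))  # prints 2.4
```

In a real trajectory the count would be averaged over many frames, and the cutoff would be taken from the first minimum of the U–O radial distribution function rather than fixed by hand.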
THE ACTINIDE–ACTINIDE CHEMICAL BOND
After studying single actinide-containing molecules, the next question that one tries to answer is whether it is possible to form bonds between actinide atoms and, if so, what the nature of these bonds is. Experimentally, there is some evidence of such bonds both in the gas phase and in a low-temperature matrix. The uranium diatomic molecule U2 was detected in the gas phase in 1974.88 The dissociation energy was estimated to be 50 ± 5 kcal/mol. Andrews and co-workers found both U2 and Th2 molecules using matrix isolation spectroscopy.89 Both molecules were also found in the gas phase using laser vaporization of a solid uranium or thorium target.90 Small molecules containing U2 as a central unit were also reported, for example, H2U–UH2 (Ref. 91) and OUUO.88 Not much was known theoretically about the nature of the chemical bond between actinides before the study of U2 by Gagliardi and Roos.6 The same molecule was studied theoretically in 1990,92 but the methods used were not advanced enough to allow for a conclusive characterization of the chemical bond. Is it possible to say something about the bonding pattern of a molecule like U2 based on qualitative arguments? Before undertaking the study of the
diuranium molecules, some systems containing a transition metal and a uranium atom were studied, for example, the UAu4 and UAu6 molecules and NUIr.4,93,94 The ground state of the uranium atom is $(5f)^3(6d)^1(7s)^2$, $^5L_6$, with four unpaired electrons that could in principle form a quadruple bond. The double occupancy of the 7s orbital, however, prevents the unpaired orbitals from coming in contact to form the bonds. We find, on the other hand, a valence state with six unpaired electrons only 0.77 eV above the ground level: $(5f)^3(6d)^2(7s)^1$, $^7M_6$. A hextuple bond could in principle be formed if it is strong enough to overcome the needed atomic promotion energy of 1.54 eV (2 × 0.77 eV, one promotion per atom). There is, however, one more obstacle to bond formation. The 7s and 6d orbitals can be expected to overlap more strongly than the 5f orbitals. In particular, the $5f\phi$ orbitals, which are occupied in the free atom, will have little overlap. Thus, there must be a transfer of electrons from 5f to 6d to form a strong bond. As we shall see, it is this competition between the atomic configuration and the ideal molecular state that determines the electronic structure of the uranium dimer. To proceed further with the analysis, one needs to perform explicit calculations, and such calculations were done using a basis set of ANO type, with inclusion of scalar relativistic effects, ANO-RCC, of the size 9s8p6d5f2g1h.6 As pointed out, potentially 13 active orbitals on each atom are involved in the bonding (5f, 6d, 7s). This would yield an active space of 26 orbitals with 12 active electrons, an impossible calculation, so a number of trial calculations were performed using different smaller active spaces. The results had one important feature in common: They all showed that a strong triple bond was formed involving the $7s\sigma_g$ and $6d\pi_u$ orbitals. The occupation numbers of these three orbitals were close to two, with small occupation of the corresponding antibonding orbitals.
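To make concrete why 12 electrons in 26 orbitals is out of reach, one can count the spin-adapted configurations (CSFs) that a complete active space generates. The sketch below uses Weyl's dimension formula and ignores point-group symmetry, which would reduce the counts; the function name `n_csf` and the choice of spin states are illustrative, and the 6-electrons-in-20-orbitals septet is included only as an example of a much smaller space.

```python
from math import comb


def n_csf(n_orbitals, n_electrons, two_s):
    """Number of CSFs for a complete active space (Weyl's formula).

    two_s is twice the total spin S, so a singlet has two_s = 0 and a
    septet (S = 3) has two_s = 6. Spatial symmetry is not exploited.
    """
    if two_s > n_electrons or (n_electrons - two_s) % 2:
        raise ValueError("electron count and spin are incompatible")
    a = (n_electrons - two_s) // 2       # N/2 - S
    b = (n_electrons + two_s) // 2 + 1   # N/2 + S + 1
    m = n_orbitals + 1
    # Weyl: (2S + 1)/(n + 1) * C(n + 1, N/2 - S) * C(n + 1, N/2 + S + 1)
    return (two_s + 1) * comb(m, a) * comb(m, b) // m


if __name__ == "__main__":
    print("CAS(12,26), singlet:", n_csf(26, 12, 0))
    print("CAS(12,26), septet: ", n_csf(26, 12, 6))
    print("CAS(6,20),  septet: ", n_csf(20, 6, 6))
```

Under these assumptions the full valence space gives roughly 10^10 singlet CSFs, whereas six parallel-spin electrons in 20 orbitals give 38,760, which is simply the number of ways to choose six singly occupied orbitals out of 20.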
It was therefore decided to leave these orbitals inactive in the CASSCF wave function and to also remove the antibonding counterparts $7s\sigma_u$ and $6d\pi_g$. This approximation should work well near the equilibrium bond length, but of course it prevents the calculation of full potential curves. With six electrons and six MOs removed from the active space, one is left with 6 electrons in 20 orbitals, a calculation that could be performed easily. Several calculations were thus done with different space and spin symmetry of the wave function. The resulting ground state was found to be a septet state with all six electrons having parallel spin, and the orbital angular momentum was high, with $\Lambda = 11$. Spin-orbit calculations showed that the spin and orbital angular momenta combined to form an $\Omega = 8$ state. The final label of the ground state is thus $^7O_8$. The main terms of the multiconfigurational wave function were found to be

$$
\begin{aligned}
\Psi(S=3,\ \Lambda=11) ={}& 0.782\,(7s\sigma_g)^2(6d\pi_u)^4(6d\sigma_g)(6d\delta_g)(5f\delta_g)(5f\pi_u)(5f\phi_u)(5f\phi_g) \\
&+ 0.596\,(7s\sigma_g)^2(6d\pi_u)^4(6d\sigma_g)(6d\delta_g)(5f\delta_u)(5f\pi_g)(5f\phi_u)(5f\phi_g)
\end{aligned}
$$
This wave function reflects nicely the competition between the preferred atomic state and the optimal binding situation. We have assumed that the triple bond is fully formed. In addition, two electrons occupy 6d-dominated one-electron bonds, $6d\sigma_g$ and $6d\delta_g$. The remaining MOs are dominated by 5f. Two weak bonds (one $\delta$ and one $\pi$) are formed using the $5f\delta_g$ and $5f\pi_u$ orbitals. Note that there is substantial occupation of the corresponding antibonding orbitals. Finally, the $5f\phi$ orbitals remain atomic and do not contribute to the bonding (equal occupation of the bonding and antibonding combinations). Formally, a quintuple bond is formed, but because the 5f bonds are weak and some antibonding orbitals are substantially occupied, the effective bond order in U2 is closer to four than five. It is interesting to note the occupation of the different atomic valence orbitals on each uranium atom: 7s, 0.94; 6d, 2.59; and 5f, 2.44. Compare that with the population in the lowest atomic level with 7s singly occupied: 7s, 1.00; 6d, 2.00; and 5f, 3.00. We see a transfer of about 0.6 electrons from 5f to 6d, which allows the molecule to use the better bonding power of the 6d orbitals compared with 5f. The calculations gave a bond distance of 2.43 Å and a bond energy of about 35 kcal/mol, including the effects of spin-orbit coupling. An experimental value of 50 ± 5 kcal/mol was reported in 1974.88 Is it possible that other actinides can also form dimers? We already mentioned that Th2 has been detected in the gas phase and in a rare gas matrix. We have studied this dimer and the dimers of Ac and Pa.95 Some major findings are reported here. We present in Table 3 the excitation energies needed to produce a valence state with all orbitals singly occupied. The largest excitation energy is for Ac.
The price to pay for forming a triple bond between two Ac atoms is 2.28 eV. For Th, only 1.28 eV is needed, after which a quadruple bond can, in principle, form. Note that in these two cases only 7s and 6d orbitals are involved. For Pa, 1.67 eV is needed, which results in the possibility of a quintuple bond. The uranium case was already described above, where we saw that, despite six unpaired atomic orbitals, only a quintuple bond is formed, with an effective bond order that is closer to four than five. It is the competition between the needed atomic promotion energy and the strength of the bond that will determine the electronic structure. In
Table 3 The Energy Needed to Reduce the Occupation Number of the 7s Orbital from Two to One in the Actinide Atoms Ac–U (in eV).a

| Atom | Transition | Energy (eV) |
|---|---|---|
| Ac | $(7s)^2(6d)^1$, $^2D_{3/2}$ → $(7s)^1(6d)^2$, $^4F_{3/2}$ | 1.14 |
| Th | $(7s)^2(6d)^2$, $^3F_2$ → $(7s)^1(6d)^3$, $^5F_1$ | 0.64 |
| Pa | $(7s)^2(6d)^1(5f)^2$, $^4K_{11/2}$ → $(7s)^1(6d)^2(5f)^2$, $^6L_{11/2}$ | 0.87 |
| U | $(7s)^2(6d)^1(5f)^3$, $^5L_6$ → $(7s)^1(6d)^2(5f)^3$, $^7M_7$ | 1.01 |

a From the NIST tables in Ref. 96.
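Table 4 The Dominant Electronic Configuration for the Lowest Energy State of the Early di-Actinides.

| Dimer | Dominant configuration | State |
|---|---|---|
| Ac2 | $(7s\sigma_g)^2(7s7p\sigma_u)^2(6d\pi_u)^2$ | $^3\Sigma_g^-$ |
| Th2 | $(7s\sigma_g)^2(6d\pi_u)^4(6d\delta_g)^1(6d\sigma_g)^1$ | $^3\Delta_g$ |
| Pa2 | $(7s\sigma_g)^2(6d\pi_u)^4(6d\delta_g)^2(5f6d\sigma_g)^2$ | $^3\Sigma_g^-$ |
| U2 | $(7s\sigma_g)^2(6d\pi_u)^4(6d\sigma_g)^1(6d\delta_g)^1(5f\delta_g)^1(5f\pi_u)^1(5f\phi_u)^1(5f\phi_g)^1$ | $^7O_g$ |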
Table 4, we present the results of the calculations, and in Table 5, the populations of the atomic orbitals in the dimer are given. The results illustrate nicely the trends in the series. A double bond is formed in the actinium dimer involving the $7s\sigma_g$ and the $6d\pi_u$ orbitals. But the $\sigma_u$ orbital is also doubly occupied, which would reduce the bond order to one. The Ac2 molecule mixes in 7p orbital character to reduce the antibonding power of the $\sigma_u$ orbital, which results in a unique population of the 7p orbital that we do not see for the other di-actinides. The populations are, with this exception, close to those of the free atom. The calculated bond energy of Ac2 is also small (1.2 eV) and the bond length large (3.63 Å). Already in the thorium dimer, Th2, we see another pattern. The 7s population is reduced to close to one. The electron is moved to 6d, and a strong quadruple bond is formed, involving three two-electron bonds and two 6d one-electron bonds. We also start to see some population of the 5f orbitals, which hybridize with 6d. The strongest bond is formed between the Pa atoms in Pa2. Here the contribution of 6d is maximum, and we see a complete promotion to the atomic state with five unpaired electrons. A quintuple bond is formed with a short bond distance and a bond energy close to 5 eV. The bond contains the $(7s\sigma_g)^2(6d\pi_u)^4$ triple bond plus a $6d\sigma_g$ two-electron bond and two $6d\delta_g$ one-electron bonds. The 5f population is increased to one electron, but we still do not see any molecular orbital dominated by this atomic orbital. The 5f orbitals are all used, but rather in combination with the 6d orbitals. With the Pa2 dimer, we have reached the maximum bonding power among the actinide dimers. In U2 the bond energy decreases and the bond length increases, which results from the increased stabilization of the 5f orbitals and the corresponding destabilization of 6d.
Large transfer of electrons from 5f to 6d is no longer possible, and the bonds become weaker and more dominated by the atomic ground state, even if we still see a complete promotion from a $(7s)^2$ to a $(7s)^1$ state. This trend will most certainly continue for the heavier di-actinides, and we can thus, without further calculations, conclude that Pa2 is the most strongly bound dimer, with its fully developed quintuple bond having an effective bond order not much smaller than five.

Table 5 Mulliken Populations (Per Atom), Bond Distances, and Bond Energies (D0) for the Early di-Actinides.

| Dimer | 7s | 7p | 6d | 5f | Re (Å) | D0 (eV) |
|---|---|---|---|---|---|---|
| Ac2 | 1.49 | 0.49 | 0.96 | 0.04 | 3.63 | 1.2 |
| Th2 | 0.93 | 0.01 | 2.83 | 0.21 | 2.76 | 3.3 |
| Pa2 | 0.88 | 0.02 | 3.01 | 1.06 | 2.37 | 4.0 |
| U2 | 0.94 | 0.00 | 2.59 | 2.44 | 2.43 | 1.2 |
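The Mulliken populations of Table 5 can be checked with a few lines of arithmetic: for each dimer the per-atom populations should add up to the number of valence electrons of the atom, and for U2 the difference from the atomic $(7s)^1(6d)^2(5f)^3$ reference gives the 5f to 6d transfer discussed in the text. The sketch below copies the populations from Table 5; the variable and function names are illustrative, and taking a 7p population of zero for the atomic reference is an assumption.

```python
# Per-atom Mulliken populations copied from Table 5.
TABLE5_POPULATIONS = {
    "Ac2": {"7s": 1.49, "7p": 0.49, "6d": 0.96, "5f": 0.04},
    "Th2": {"7s": 0.93, "7p": 0.01, "6d": 2.83, "5f": 0.21},
    "Pa2": {"7s": 0.88, "7p": 0.02, "6d": 3.01, "5f": 1.06},
    "U2": {"7s": 0.94, "7p": 0.00, "6d": 2.59, "5f": 2.44},
}

# Valence electrons per atom (Ac 3, Th 4, Pa 5, U 6).
VALENCE_ELECTRONS_PER_ATOM = {"Ac2": 3, "Th2": 4, "Pa2": 5, "U2": 6}

# Atomic reference with 7s singly occupied, as quoted in the text for U;
# the 7p entry of zero is an assumption.
U_ATOM_7S1 = {"7s": 1.00, "7p": 0.00, "6d": 2.00, "5f": 3.00}


def total_population(dimer):
    """Sum of the 7s, 7p, 6d, and 5f populations on one atom."""
    return sum(TABLE5_POPULATIONS[dimer].values())


def population_shift(dimer, reference):
    """Population in the dimer minus population in the atomic reference."""
    pops = TABLE5_POPULATIONS[dimer]
    return {shell: pops[shell] - reference[shell] for shell in reference}


if __name__ == "__main__":
    for dimer, n_valence in VALENCE_ELECTRONS_PER_ATOM.items():
        print(dimer, round(total_population(dimer), 2), "expected", n_valence)
    shift = population_shift("U2", U_ATOM_7S1)
    print({shell: round(value, 2) for shell, value in shift.items()})
```

The totals come out within a few hundredths of 3, 4, 5, and 6, and the U2 shifts are about −0.56 for 5f and +0.59 for 6d, consistent with the transfer of roughly 0.6 electrons stated in the text.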
INORGANIC CHEMISTRY OF DIURANIUM
The natural tendency of a uranium atom to be preferentially complexed by a ligand, rather than to form a direct U–U bond, has precluded the isolation of stable uranium species containing direct metal-to-metal bonding. Although the uranium ionic radius is not exceedingly large, the presence of many electrons combined with the preference for large coordination numbers with common ligands makes the task of stabilizing the hypothetical U–U bond difficult. The greater stability of the higher oxidation states of uranium would suggest that if a bond is to be formed between uranium atoms, such species would rather bear several ligands on each multivalent U center. As discussed, the uranium atom has six valence electrons, and the U–U bond in U2 is composed of three normal two-electron bonds, four electrons in different bonding orbitals, and two non-bonded electrons, leading to a quintuple bond between the two uranium atoms. Multiple bonding is also found between transition metal atoms. The Cr, Mo, and W atoms have six valence electrons, and a hextuple bond is formed in the corresponding dimers, even if the sixth bond is weak. The similarity between these dimers and the uranium dimer suggests the possibility of an inorganic chemistry based on the latter. Several compounds with the M2 (M = Cr, Mo, W, Re, etc.) unit are known.39 Among them are the chlorides, for example, Re2Cl6 and $\mathrm{Re_2Cl_8^{2-}}$,97,98 and the carboxylates, for example, Mo2(O2CCH3)4. The simplest of them are the tetraformates, which in the absence of axial ligands have a very short metal–metal bond length.99 Recently, calculations have suggested that diuranium compounds should be stable with a multiple U–U bond and short bond distances.100 We have studied two chlorides, U2Cl6 and $\mathrm{U_2Cl_8^{2-}}$, both with U(III) as the oxidation state of uranium (see Figure 5), and three different carboxylates (see Figure 6), U2(OCHO)4, U2(OCHO)6, and U2(OCHO)4Cl2.
All species have been found to be bound with a multiply bonded U2 unit. In the diuranium chlorides, the formal charge of the uranium ion is +3. Thus, 6 of the 12 valence electrons are available and a triple bond can in principle be formed. U2Cl6 can have either an eclipsed or a staggered conformation. Preliminary calculations have indicated that the staggered conformation is about 12 kcal/mol lower in energy than the eclipsed form, so we focus our analysis on the staggered structure.
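The electron counting behind the statement that U(III) can form at most a triple bond is simple bookkeeping: each metal–metal bond needs one electron from each atom, so the number of valence electrons left on a metal center after oxidation caps the formal bond order. The sketch below encodes only that rule; the function name is illustrative, and the result is an upper bound on the formal bond order, not a prediction of the effective bond order obtained from a calculation.

```python
# Valence electrons of the neutral atoms
# (U: 5f3 6d1 7s2; Cr: 3d5 4s1; Re: 5d5 6s2).
VALENCE_ELECTRONS = {"U": 6, "Cr": 6, "Re": 7}


def max_formal_bond_order(metal, oxidation_state):
    """Upper bound on the formal metal-metal bond order in an M2 unit.

    Each bond takes one electron from each metal atom, so the bound is
    the number of valence electrons remaining on one center.
    """
    remaining = VALENCE_ELECTRONS[metal] - oxidation_state
    if remaining < 0:
        raise ValueError("oxidation state exceeds the valence electron count")
    return remaining


if __name__ == "__main__":
    print("U(III):", max_formal_bond_order("U", 3))    # triple bond at most
    print("Re(III):", max_formal_bond_order("Re", 3))  # quadruple bond at most
    print("Cr(0):", max_formal_bond_order("Cr", 0))    # hextuple bond at most
```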
Figure 5 The structure of U2 Cl6 .
The diuranium chloride and diuranium formate calculations were performed at the CASSCF/CASPT2 level of theory. The active orbitals were those that describe the U–U bond. Enough orbitals were included such that the method can make the optimal choice between the 5f and 6d orbitals in forming the bonding and antibonding orbitals. The number of active electrons was eight for the $\mathrm{U_2^{4+}}$ unit and six for $\mathrm{U_2^{6+}}$. A basis set of the atomic natural orbital type, including scalar relativistic effects, was used. The U–U and U–Cl bond distances and the U–U–Cl angle have been optimized at the CASPT2 level of theory. The ground state of U2Cl6 is a singlet state with the electronic configuration $(\sigma_g)^2(\pi_u)^4$. The U–U bond distance is 2.43 Å, the U–Cl distance 2.46 Å, and the U–U–Cl angle 120.0 degrees. At
Figure 6 The structure of U2 (OCHO)4 .
the equilibrium bond distance, the lowest triplet lies within 2 kcal/mol of the singlet ground state. The two states are expected to interact via the spin-orbit coupling Hamiltonian, which will further lower the energy but is expected to have a negligible effect on the geometry of the ground state, because it is a singlet state. The dissociation of U2Cl6 to 2 UCl3 has also been studied. UCl3, unlike U2Cl6, is known experimentally101 and has been the subject of previous computational studies.102 Single-point CASPT2 energy calculations have been performed at the experimental geometry, as reported in Ref. 102, namely a pyramidal structure with a U–Cl bond distance of 2.55 Å and a Cl–U–Cl angle of 95 degrees. U2Cl6 was found to be about 20 kcal/mol more stable than two UCl3 moieties. $\mathrm{U_2Cl_8^{2-}}$ is the analog of $\mathrm{Re_2Cl_8^{2-}}$. The structure of $\mathrm{U_2Cl_8^{2-}}$ has been optimized using an active space formed by 6 active electrons in 13 active orbitals, assuming $D_{4h}$ symmetry. As in the U2Cl6 case, the molecular orbitals are linear combinations of U 7s, 7p, 6d, and 5f orbitals with Cl 3p orbitals. The ground state of $\mathrm{U_2Cl_8^{2-}}$ is a singlet state with an electronic configuration of $(5f\sigma_g)^2(5f\pi_u)^4$. The molecule possesses a U–U triple bond. The U–U bond distance is 2.44 Å, the U–Cl bond distance is 2.59 Å, and the U–U–Cl angle is 111.2 degrees. $\mathrm{U_2Cl_8^{2-}}$ differs from $\mathrm{Re_2Cl_8^{2-}}$ in terms of molecular bonding, in the sense that the bond in $\mathrm{Re_2Cl_8^{2-}}$ is formally a quadruple bond, even though the $\delta_g$ bond is weak, because $\mathrm{Re^{3+}}$ has four electrons available to form the metal–metal bond. Only a triple bond can form in $\mathrm{U_2Cl_8^{2-}}$ because only three electrons are available on each $\mathrm{U^{3+}}$ unit. Based on several experimental reports of compounds in which the uranium is bound to a carbon atom, we have considered the possibility that a CUUC core containing two $\mathrm{U^{1+}}$ ions could be incorporated between two sterically hindered ligands.
We have performed a theoretical study of a hypothetical molecule, namely PhUUPh (Ph = phenyl), the uranium analog of the previously studied PhCrCrPh compound.102 We have chosen to mimic the bulky terphenyl ligands, which could be promising candidates for the stabilization of multiply bonded uranium compounds, using the simplest phenyl model. We demonstrate that PhUUPh could be a stable chemical entity with a singlet ground state. The CASSCF method was used to generate molecular orbitals and reference functions for subsequent CASPT2 calculations. The structures of two isomers were initially optimized using DFT, namely the bent planar PhUUPh isomer (Isomer A, Figure 7) and the linear isomer (Isomer B, Figure 8). Starting from a trans-bent planar structure, the geometry optimization for isomer A predicted a rhombic structure (a bis($\mu$-phenyl) structure), belonging to the $D_{2h}$ point group and analogous to the structure of the experimentally known species U2H2 (Ref. 91). Linear structure B also belongs to the $D_{2h}$ point group. CASPT2 geometry optimizations for several electronic states of various spin multiplicities were then performed on selected structural parameters, namely the U–U and U–Ph bond distances, whereas the geometry of the phenyl fragment was kept fixed. The most
Figure 7 The bent planar PhUUPh isomer.
Figure 8 The linear PhUUPh isomer.
relevant CASPT2 structural parameters for the lowest electronic states of the isomers A and B, together with the relative CASPT2 energies, are reported in Table 6. The ground state of PhUUPh is a $^1A_g$ singlet, with a bis($\mu$-phenyl) structure (Figure 7), and an electronic configuration $(\sigma)^2(\sigma)^2(\pi)^4(\delta)^2$, which corresponds to a formal U–U quintuple bond. The effective bond order between the two uranium atoms is 3.7. It is interesting to investigate briefly the difference in the electronic configurations of the formal $\mathrm{U_2^{2+}}$ moiety in PhUUPh and that of the bare metastable $\mathrm{U_2^{2+}}$ cation.6 The ground state of $\mathrm{U_2^{2+}}$ has an electronic configuration $(\sigma)^2(\pi)^4(\delta_g)^1(\delta_u)^1(\phi_g)^1(\phi_u)^1$, which corresponds to a triple bond between the two U atoms and four fully localized electrons. In PhUUPh, the electronic configuration is different, because the molecular environment decreases the Coulomb repulsion between the two $\mathrm{U^{1+}}$ centers, thus making the U–U bond stronger than in $\mathrm{U_2^{2+}}$. The corresponding U–U bond distance, 2.29 Å, is also slightly shorter than in $\mathrm{U_2^{2+}}$ (2.30 Å). A single bond is present between the U and C atoms. Inspection of Table 6 shows that the lowest triplet state, $^3A_g$, is almost degenerate with the ground state, lying only 0.76 kcal/mol higher in energy. Several triplet and quintet states of various symmetries lie 5–7 kcal/mol above the ground state. The lowest electronic states of the linear structure (Figure 8) lie about 20 kcal/mol above the ground state of the bis($\mu$-phenyl) structure. As the $^1A_g$ ground state and the $^3A_g$ triplet state are very close in energy, they may be expected to interact via the spin-orbit coupling operator. To evaluate the impact of such interaction on the electronic states of PhUUPh, the spin-orbit coupling between several singlet and triplet states was computed at the ground state ($^1A_g$) geometry. The ordering of the electronic states is not affected by the
Table 6 CASPT2 Optimized Most Significant Structural Parameters (Distances in Å, Angles in Degrees) and Relative Energies (kcal/mol) for the Lowest Electronic States of Isomer A and B of PhUUPh.

| Isomer | Elec. State | R(U-U) | R(U-Ph) | UPhU angle | PhUPh angle | ΔE |
|---|---|---|---|---|---|---|
| A | $^1A_g$ | 2.286 | 2.315 | 59.2 | 120.8 | 0 |
| A | $^3A_g$ | 2.263 | 2.325 | 58.3 | 121.8 | +0.76 |
| A | $^5B_{3g}$ | 2.537 | 2.371 | 64.7 | 115.3 | +4.97 |
| A | $^5B_{3u}$ | 2.390 | 2.341 | 61.4 | 118.6 | +7.00 |
| A | $^3B_{3g}$ | 2.324 | 2.368 | 58.8 | 121.2 | +7.00 |
| A | $^1B_{3g}$ | 2.349 | 2.373 | 59.3 | 120.7 | +7.14 |
| B | $^3B_{3g}$ | 2.304 | 2.395 | – | 180 | +19.67 |
| B | $^3A_g$ | 2.223 | 2.430 | – | 180 | +22.16 |
| B | $^1B_{3g}$ | 2.255 | 2.416 | – | 180 | +27.62 |
inclusion of spin-orbit coupling. To assess the strength of the U–U bond in PhUUPh, its bonding energy was computed as the difference between the energy of the latter and those of the two unbound PhU fragments. PhUUPh is lower in energy than two PhU fragments by about 60 kcal/mol, with the inclusion of the basis set superposition error correction. The question that one would like to answer is how to make PhUUPh and analogous species. PhUUPh could in principle be formed in a matrix, analogously to the already detected diuranium polyhydride species,9,10 by laser ablation of uranium and co-deposition with biphenyl in an inert matrix. The phenyl ligand might, however, be too large for such a species to be made, and its reactions controlled, in a matrix, so species like CH3UUCH3, for example, may be more feasible to construct.
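The bonding energy described above is a difference between the energy of the dimer and the energies of its two fragments, corrected for basis set superposition error (BSSE). A minimal sketch of that bookkeeping follows, assuming the common counterpoise scheme in which each fragment is recomputed in the full dimer basis; the chapter does not state which correction scheme was used, the function names are illustrative, and all energies below are made-up placeholder values, not results for PhUUPh.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509


def binding_energy(e_dimer, e_fragments):
    """Binding energy in hartree: fragment energies minus dimer energy.

    A positive value means the dimer is more stable than the fragments.
    """
    return sum(e_fragments) - e_dimer


def bsse(e_fragments_own_basis, e_fragments_dimer_basis):
    """Counterpoise estimate of the basis set superposition error (hartree)."""
    return sum(
        own - ghost
        for own, ghost in zip(e_fragments_own_basis, e_fragments_dimer_basis)
    )


# Placeholder energies in hartree (hypothetical, for illustration only).
e_dimer = -10.150
e_frag_own = [-5.000, -5.000]    # each fragment in its own basis
e_frag_ghost = [-5.002, -5.002]  # each fragment in the full dimer basis

raw = binding_energy(e_dimer, e_frag_own)
corrected = raw - bsse(e_frag_own, e_frag_ghost)

print(round(raw * HARTREE_TO_KCAL_PER_MOL, 1), "kcal/mol uncorrected")
print(round(corrected * HARTREE_TO_KCAL_PER_MOL, 1), "kcal/mol corrected")
```

Because the fragments are stabilized by the partner's basis functions, the counterpoise correction always lowers the computed binding energy, which is why the corrected value is the one quoted.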
CONCLUSIONS
Exploring the nature of the chemical bond has been a central issue for theoretical chemists since the dawn of quantum chemistry 80 years ago. We now have a detailed understanding of what the electrons are doing in molecules on both a qualitative and a quantitative basis. We also have quantum chemical methods that allow us to compute, with high accuracy, the properties of chemical bonds, such as bond energies, charge transfer, and back bonding. In recent years, it has been possible to extend these methods to treat bonding involving atoms from the lower part of the Periodic Table. In this chapter we illustrated how the CASSCF/CASPT2 method can be used to explore the nature of such chemical bonds. Classic cases are the Re–Re multiple bond in $\mathrm{Re_2Cl_8^{2-}}$, and the Cr–Cr bond ranging from the quadruply bonded Cr(II)–Cr(II) moiety to the formal hextuple bond between two neutral chromium atoms. The bonding between the $3d\delta$ electrons is weak and should be considered as an intermediate between two pairs of antiferromagnetically
coupled localized 3d electrons and a true chemical bond. The Cr–Cr case also illustrates that no simple relation exists between bond order and bond energy. The energy of the bond in the Cr(I) compound PhCrCrPh is twice as large as that of the formally Cr(0) compound, Cr2, despite the decreased bond order. On the other hand, the Cr(II)–Cr(II) moiety would hardly be bound at all without the help of bridging ligands such as carboxylate ions. In the study of Cu2O2, the CASSCF/CASPT2 method is unsatisfactory. This and related problems motivate the extension of the CASSCF/CASPT2 method to handle larger active spaces. The chemical bond in systems containing actinide atoms, in particular uranium, was also addressed. A formal quintuple bond was found for the uranium diatomic molecule U2, with a unique electronic structure involving six one-electron bonds with all electrons ferromagnetically coupled, which results in a high-spin ground state. It was questioned whether the U2 unit could be used as a central moiety in inorganic complexes similar to those explored by Cotton et al. for transition metal dimers. Corresponding chlorides and carboxylates were found to be stable units with multiply bonded U(III) ions. It might even be possible to use the elusive U(I) ion in metal–metal bonding involving protective organic aryl ligands, in parallel to the recently synthesized ArCrCrAr compound. Many challenges exist, and issues still remain open. The interplay between theoreticians and experimentalists will certainly enhance the possibilities for further progress in transition metal and actinide chemistry.
ACKNOWLEDGMENTS
A wonderful collaboration and friendship with Björn O. Roos over the years has certainly been inspiring for the author. All the developers of MOLCAS, whose effort has been essential in order to study such an exciting chemistry, should also be acknowledged, especially Roland Lindh, Per-Åke Malmqvist, Valera Veryazov, and Per-Olof Widmark. The Swiss National Science Foundation, Grant 200021-111645/1, is acknowledged for financial support.
REFERENCES
1. P. O. Löwdin, Phys. Rev., 97, 1474 (1955). Quantum Theory of Many-Particle Systems I. Physical Interpretations by Means of Density Matrices, Natural Spin-Orbitals, and Convergence Problems in the Method of Configurational Interaction.
2. P. Pyykkö, Adv. Quantum Chem., 11, 353 (1978). Relativistic Quantum Chemistry.
3. L. Gagliardi, J. Am. Chem. Soc., 124, 8757 (2002). New Group 2 Chemistry: A Multiple Barium-Nitrogen Bond in CsNBa.
4. L. Gagliardi, J. Am. Chem. Soc., 125, 7504 (2003). When Does Gold Behave as a Halogen? Predicted Uranium Tetra-auride and Other M(Au)4 Tetrahedral Species (M = Ti, Zr, Hf, Th).
5. M. Zhou, L. Andrews, N. Ismail, and C. Marsden, J. Phys. Chem. A, 104, 5495 (2000). Infrared Spectra of UO2, $\mathrm{UO_2^{+}}$, and $\mathrm{UO_2^{-}}$ in Solid Neon.
6. L. Gagliardi and B. O. Roos, Nature, 433, 848 (2005). Quantum Chemical Calculations Show That the Uranium Molecule U2 has a Quintuple Bond.
7. G. Karlström, R. Lindh, P.-Å. Malmqvist, B. O. Roos, U. Ryde, V. Veryazov, P.-O. Widmark, M. Cossi, B. Schimmelpfennig, P. Neogrady, and L. Seijo, Computat. Mat. Sci., 28, 222 (2003). Molcas: A Program Package for Computational Chemistry.
8. I. Shavitt, Int. J. Quantum Chem.: Quantum Chem. Symp., 12, 5 (1978). Matrix Element Evaluation in the Unitary Group Approach to the Electron Correlation Problem.
9. K. Ruedenberg and K. R. Sundberg, in Quantum Science; Methods and Structure, J.-L. Calais, Ed., Plenum Press, New York, 1976.
10. B. O. Roos, P. R. Taylor, and P. E. M. Siegbahn, Chem. Phys., 48, 157 (1980). A Complete Active Space SCF Method (CASSCF) Using a Density Matrix Formulated Super-CI Approach.
11. B. O. Roos, in Advances in Chemical Physics; Ab Initio Methods in Quantum Chemistry - II, K. P. Lawley, Ed., chapter 69, 399. John Wiley & Sons Ltd., Chichester, England, 1987. The Complete Active Space Self-Consistent Field Method and its Applications in Electronic Structure Calculations.
12. J. Olsen, B. O. Roos, P. Jørgensen, and H. J. A. Jensen, J. Chem. Phys., 89, 2185 (1988). Determinant Based Configuration Interaction Algorithms for Complete and Restricted Configuration Interaction Spaces.
13. P.-Å. Malmqvist, A. Rendell, and B. O. Roos, J. Phys. Chem., 94, 5477 (1990). The Restricted Active Space Self-Consistent Field Method, Implemented with a Split Graph Unitary Group Approach.
14. P. E. M. Siegbahn, J. Chem. Phys., 72, 1647 (1980). Generalizations of the Direct CI Method Based on the Graphical Unitary Group Approach. 2. Single and Double Replacements From any Set of Reference Configurations.
15. H. Lischka, R. Shepard, I. Shavitt, R. M. Pitzer, M. Dallos, T. Müller, P. G. Szalay, F. B. Brown, R. Ahlrichs, H. J. Böhm, A. Chang, D. C. Comeau, R. H. Gdanitz, H. Dachsel, C. Ehrhardt, M. Ernzerhof, P. Höchtl, S. Irle, G. Kedziora, T. Kovar, V. Parasuk, M. J. M. Pepper, P. Scharf, H. Schiffer, M. Schindler, M. Schüler, M. Seth, E. A. Stahlberg, J.-G. Zhao, S. Yabushita, and Z. Zhang, COLUMBUS, an ab initio electronic structure program, release 5.9, (2004).
16. B. O. Roos, P. Linse, P. E. M. Siegbahn, and M. R. A. Blomberg, Chem. Phys., 66, 197 (1982). A Simple Method for the Evaluation of the Second-Order Perturbation Energy from External Double-Excitations with a CASSCF Reference Wavefunction.
17. K. Andersson, P.-Å. Malmqvist, B. O. Roos, A. J. Sadlej, and K. Wolinski, J. Phys. Chem., 94, 5483 (1990). Second-Order Perturbation Theory with a CASSCF Reference Function.
18. K. Andersson, P.-Å. Malmqvist, and B. O. Roos, J. Chem. Phys., 96, 1218 (1992). Second-Order Perturbation Theory with a Complete Active Space Self-Consistent Field Reference Function.
19. K. Andersson and B. O. Roos, Int. J. Quantum Chem., 45, 591 (1993). Multiconfigurational Second-Order Perturbation Theory: A Test of Geometries and Binding Energies.
20. G. Ghigo, B. O. Roos, and P.-Å. Malmqvist, Chem. Phys. Lett., 396, 142 (2004). A Modified Definition of the Zeroth Order Hamiltonian in Multiconfigurational Perturbation Theory (CASPT2).
21. B. O. Roos and K. Andersson, Chem. Phys. Lett., 245, 215 (1995). Multiconfigurational Perturbation Theory with Level Shift - The Cr2 Potential Revisited.
22. B. O. Roos, K. Andersson, M. P. Fülscher, L. Serrano-Andrés, K. Pierloot, M. Merchán, and V. Molina, J. Mol. Struct. (THEOCHEM), 388, 257 (1996). Applications of Level Shift Corrected Perturbation Theory in Electronic Spectroscopy.
23. N. Forsberg and P.-Å. Malmqvist, Chem. Phys. Lett., 274, 196 (1997). Multiconfiguration Perturbation Theory with Imaginary Level Shift.
24. J. Finley, P.-Å. Malmqvist, B. O. Roos, and L. Serrano-Andrés, Chem. Phys. Lett., 288, 299 (1998). The Multi-State CASPT2 Method.
25. L. Visscher, J. Comput. Chem., 23, 759 (2002). The Dirac Equation in Quantum Chemistry: Strategies to Overcome the Current Computational Problems.
26. M. Abe, T. Nakajima, and K. Hirao, J. Chem. Phys., 125, 234110 (2006). Electronic Structures of PtCu, PtAg, and PtAu Molecules: A Dirac Four-Component Relativistic Study.
27. M. Douglas and N. M. Kroll, Ann. Phys., 82, 89 (1974). Quantum Electrodynamical Corrections to Fine-Structure of Helium.
28. B. A. Hess, Phys. Rev. A, 33, 3742 (1986). Relativistic Electronic-Structure Calculations Employing a 2-Component No-Pair Formalism With External-Field Projection Operators.
29. B. O. Roos and P.-Å. Malmqvist, Phys. Chem. Chem. Phys., 6, 2919 (2004). Relativistic Quantum Chemistry - The Multiconfigurational Approach.
30. B. A. Hess, C. Marian, U. Wahlgren, and O. Gropen, Chem. Phys. Lett., 251, 365 (1996). A Mean-Field Spin-Orbit Method Applicable to Correlated Wavefunctions.
31. O. Christiansen, J. Gauss, and B. Schimmelpfennig, Phys. Chem. Chem. Phys., 2, 965 (2000). Spin-Orbit Coupling Constants from Coupled-Cluster Response Theory.
32. P.-Å. Malmqvist, Int. J. Quantum Chem., 30, 479 (1986). Calculation of Transformation Density Matrices by Nonunitary Orbital Transformations.
33. P.-Å. Malmqvist and B. O. Roos, Chem. Phys. Lett., 155, 189 (1989). The CASSCF State Interaction Method.
34. P.-Å. Malmqvist, B. O. Roos, and B. Schimmelpfennig, Chem. Phys. Lett., 357, 230 (2002). The Restricted Active Space (RAS) State Interaction Approach With Spin-Orbit Coupling.
35. B. O. Roos, R. Lindh, P.-Å. Malmqvist, V. Veryazov, and P.-O. Widmark, J. Phys. Chem. A, 108, 2851 (2004). Main Group Atoms and Dimers Studied with a New Relativistic ANO Basis Set.
36. B. O. Roos, V. Veryazov, and P.-O. Widmark, Theor. Chem. Acc., 111, 345 (2004). Relativistic ANO Type Basis Sets for the Alkaline and Alkaline Earth Atoms Applied to the Ground State Potentials for the Corresponding Dimers.
37. B. O. Roos, R. Lindh, P.-Å. Malmqvist, V. Veryazov, and P.-O. Widmark, J. Phys. Chem. A, 109, 6575 (2005). New Relativistic ANO Basis Sets for Transition Metal Atoms.
38. B. O. Roos, R. Lindh, P.-Å. Malmqvist, V. Veryazov, and P.-O. Widmark, Chem. Phys. Lett., 409, 295 (2005). New Relativistic ANO Basis Sets for Actinide Atoms.
39. F. A. Cotton and C. B. Harris, Inorg. Chem., 4, 330 (1965). Crystal and Molecular Structure of Dipotassium Octachlorodirhenate(III) Dihydrate K2[Re2Cl8]·2H2O.
40. F. A. Cotton, Inorg. Chem., 4, 334 (1965). Metal-Metal Bonding in $\mathrm{[Re_2X_8]^{2-}}$ Ions and Other Metal Atom Clusters.
41. L. Gagliardi and B. O. Roos, Inorg. Chem., 42, 1599 (2003). A Theoretical Study of the Electronic Spectrum of the $\mathrm{Re_2Cl_8^{2-}}$ Ion.
42. W. C. Trogler and H. B. Gray, Acc. Chem. Res., 11, 232 (1978). Electronic Spectra and Photochemistry of Complexes Containing Quadruple Metal-Metal Bonds.
43. W. C. Trogler, C. D. Cowman, H. B. Gray, and F. A. Cotton, J. Am. Chem. Soc., 99, 2993 (1977). Further Studies of Electronic Spectra of $\mathrm{Re_2Cl_8^{2-}}$ and $\mathrm{Re_2Br_8^{2-}}$ - Assignment of Weak Bands in 600-350-nm Region - Estimation of Dissociation Energies of Metal-Metal Quadruple Bonds.
44. F. Ferrante, L. Gagliardi, B. E. Bursten, and A. P. Sattelberger, Inorg. Chem., 44, 8476 (2005). Multiconfigurational Theoretical Study of the Octamethyldimetalates of Cr(II), Mo(II), W(II), and Re(III): Re-visiting the Correlation between the M-M Bond Length and the Delta-Delta* Transition Energy.
45. A. P. Sattelberger and J. P. Fackler, J. Am. Chem. Soc., 99, 1258 (1977). Spectral Studies of Octamethyldimetalates of Molybdenum(II), Rhenium(III), and Chromium(II) - Assignment of Delta-Delta* Transition.
46. B. O. Roos, Collect. Czech. Chem. Commun., 68, 265 (2003). The Ground State Potential for the Chromium Dimer Revisited.
282
Transition Metal- and Actinide-Containing Systems
47. B. Simard, M.-A. Lebeault-Dorget, A. Marijnissen, and J. J. ter Meulen, J. Chem. Phys., 108, 9668 (1998). Photoionization Spectroscopy of Dichromium and Dimolybdenum: Ionization Potentials and Bond Energies. 48. S. M. Casey and D. G. Leopold, J. Phys. Chem., 97, 816 (1993). Negative-Ion PhotoelectronSpectroscopy of Cr2. 49. K. Hilpert and K. Ruthardt, Ber. Bunsenges. Physik. Chem., 91, 724 (1987). Determination of the Dissociation-Energy of the Cr2 Molecule. 50. B. O. Roos, A. C. Borin, and L. Gagliardi, Angew. Chem. Int. Ed., 46, 1469 (2007). The Maximum Multiplicity of the Covalent Chemical Bond. 51. T. Nguyen, A. D. Sutton, M. Brynda, J. C. Fettinger, G. J. Long, and P. P. Power, Science, 310, 844 (2005). Synthesis of a Stable Compound with Fivefold Bonding between Two Chromium(I) Centers. 52. M. Brynda, L. Gagliardi, P.-O. Widmark, P. P. Power, and B. O. Roos, Angew. Chem. Int. Ed., 45, 3804 (2006). The Quintuple Bond between Two Chromiums in PhCrCrPh (Ph ¼ Phenyl). Trans-Bent Versus Linear Geometry: A Quantum Mechanical Study. 53. F. A. Cotton, Chem. Soc. Rev., 4, 27 (1975). Quadruple Bonds and Other Multiple Metal to Metal Bonds. 54. S. N. Ketkar and M. Fink, J. Am. Chem. Soc., 107, 338 (1985). Structure of Dichromium Tetraacetate by Gas-Phase Electron-Diffraction. 55. K. Andersson, Jr., C. W. Bauschlicher, B. J. Persson, and B. O. Roos, Chem. Phys. Lett., 257, 238 (1996). The Structure of Dichromium Tetraformate. 56. C. J. Cramer, M. Wloch, P. Piecuch, C. Puzzarini, and L. Gagliardi, J. Phys. Chem. A, 110, 1991 (2006). Theoretical Models on the Cu2O2 Torture Track: Mechanistic Implications for Oxytyrosinase and Small-Molecule Analogues. 57. C. J. Cramer, A. Kinal, M. Wloch, P. Piecuch, and L. Gagliardi, J. Phys. Chem. A, 110, 11557 (2006). Theoretical Characterization of End-On and Side-On Peroxide Coordination in Ligated Cu2O2 Models. 58. K. Andersson and B. O. Roos, Chem. Phys. Lett., 191, 507 (1992). 
Excitation Energies in the Nickel Atom Studied With the Complete Active Space SCF Method and Second-Order Perturbation Theory. 59. M. Mercha´n, R. Pou-Ame´rigo, and B. O. Roos, Chem. Phys. Lett., 252, 405 (1996). A Theoretical Study of the Dissociation Energy of Niþ 2 — A Case of Broken Symmetry. ˚ . Malmqvist, and L. Gagliardi, to be published. 60. A. Rehaman, P.-A 61. L. Gagliardi and B. O. Roos, Chem. Phys. Lett., 331, 229 (2000). Uranium Triatomic Compounds XUY (X,Y ¼ C,N,O): A Combined Multiconfigurational Second Order Perturbation and Density Functional Study. 62. M. Zhou, L. Andrews, J. Li, and B. E. Bursten, J. Am. Chem. Soc., 121, 9712 (1999). Reaction of Laser-Ablated Uranium Atoms with CO: Infrared Spectra of the CuO, CuO-, OUCCO, (eta(2)-C-2)UO2,-and U(CO)x (x ¼ 1–6) Molecules in Solid Neon. ˚ . Malmqvist, and J. M. Dyke, J. Phys. Chem. A, 105, 10602 63. L. Gagliardi, B. O. Roos, P.-A (2001). On the Electronic Structure of the UO2 Molecule. 64. J. Paulovic, L. Gagliardi, J. M. Dyke, and K. Hirao, J. Chem. Phys., 122, 144317 (2005). A Theoretical Study of the Gas-Phase Chemi-Ionization Reaction between Uranium and Oxygen Atoms. 65. L. Gagliardi, M. C. Heaven, J. W. Krogh, and B. O. Roos, J. Am. Chem. Soc., 127, 86 (2005). The Electronic Spectrum of the UO2 Molecule. 66. J. Han, V. Goncharov, L. A. Kaledin, A. V. Komissarov, and M. C. Heaven, J. Chem. Phys., 120, 5155 (2004). Electronic Spectroscopy and Ionization Potential of UO2 in the Gas Phase. 67. C. J. Lue, J. Jin, M. J. Ortiz, J. C. Rienstra-Kiracofe, and M. C. Heaven, J. Am. Chem. Soc., 126, 1812 (2004). Electronic Spectroscopy of UO2 Isolated in a Solid Ar Matrix.
References
283
68. L. Gagliardi, G. La Manna, and B. O. Roos, Faraday Discuss., 124, 63 (2003). On the Reaction of Uranium Atom with the Nitrogen Molecule: A Theoretical Study. 69. B. O. Roos, P.-O. Widmark, and L. Gagliardi, Faraday Discuss., 124, 57 (2003). The Ground State and Electronic Spectrum of CuO - A Mystery. 70. J. Li, B. E. Bursten, B. Liang, and L. Andrews, Science, 259, 2242 (2002). Noble Gas-Actinide Compounds: Complexation of the CuO Molecule by Ar, Kr, and Xe Atoms in Noble Gas Matrices. 71. I. Infante and L. Visscher, J. Chem. Phys., 121, 5783 (2004). The Importance of Spin-Orbit Coupling and Electron Correlation in the Rationalization of the Ground State of the CuO Molecule. 72. R. Silva and H. Nitsche, Radiochim. Acta, 70, 377 (1995). Comparison of Chemical Extractions and Laser Photoacoustic-Spectroscopy for the Determination of Plutonium Species in Near-Neutral Carbonate Solutions. 73. I. Grenthe, J. Fuger, R. Konings, R. Lemire, A. Muller, C. Nguyen-Trung, and H. Wanner, Chemical Thermodynamics of Uranium. North Holland, Amsterdam, 1992. 74. P. G. Allen, J. J. Bucher, D. K. Shuh, N. M. Edelstein, and T. Reich, Inorg. Chem., 36, 4676 2þ 4þ 3þ by (1997). Investigation of Aquo and Chloro Complexes of UO2þ 2 , NpO , Np , and Pu X-ray Absorption Fine Structure Spectroscopy. 75. L. Se´mon, C. Boehem, I. Billard, C. Hennig, K. Lu¨tzenkirchen, T. Reich, A. Rossberg, I. Rossini, and G. Wipff, Comput. Phys. Commun., 2, 591 (2001). Do Perchlorate and Triflate Anions Bind to the Uranyl Cation in an Acidic Aqueous Medium? A Combined EXAFS and Quantum Mechanical Investigation. 76. V. Vallet, U. Wahlgren, B. Schimmelpfenning, H. Moll, Z. Szabo´, and I. Grenthe, Inorg. Chem., 40, 3516 (2001). Solvent Effects on Uranium(VI) Fluoride and Hydroxide Complexes Studied by EXAFS and Quantum Chemistry. 77. J. Neuefeind, L. Soderholm, and S. Skanthakumar, J. Phys. Chem. A, 108, 2733 (2004). Experimental Coordination Environment of Uranyl(VI) in Aqueous Solution. 78. C. 
Clavague´ra-Sarrio, V. Brenner, S. Hoyau, C. J. Marsden, P. Millie´, and J.-P. Dognon, J. Phys. Chem. B, 107, 3051 (2003). Modeling of Uranyl Cation-Water Clusters. 79. S. Tsushima, T. Yang, and A. Suzuki, Chem. Phys. Lett., 334, 365 (2001). Theoretical Gibbs Free Energy Study on UO2(H2O)2þ and its Hydrolysis Products. 80. V. Vallet, U. Wahlgren, B. Schimmelpfenning, Z. Szabo´, and I. Grenthe, J. Am. Chem. Soc., 123, 11999 (2001). The Mechanism for Water Exchange in [UO2(H2O)5]2þ and [UO2(oxalate)2(H2O)]2, as Studied by Quantum Chemical Methods. 81. L. Hemmingsen, P. Amara, E. Ansoborlo, and M. Field, J. Phys. Chem. A, 104, 4095 (2000). Importance of Charge Transfer and Polarization Effects for the Modeling of Uranyl-Cation Complexes. 82. S. Spencer, L. Gagliardi, N. Handy, A. Ioannou, C.-K. Skylaris, and A. Willetts, J. Phys. Chem. 2þ A, 103, 1831 (1999). Hydration of UO2þ 2 and PuO2 . 83. L. Gagliardi, I. Grenthe, and B. O. Roos, Inorg. Chem., 40, 2976 (2001). A Theoretical Study of the Structure of Tricarbonatodioxouranate. 84. L. Gagliardi and B. O. Roos, Inorg. Chem., 41, 1315 (2002). The Coordination of the Neptunyl Ion With Carbonate Ions and Water: A Theoretical Study. 85. A. Wallqvist and G. Karlstro¨m, Chem. Scripta, 29A, 131 (1989). A New Non-Empirical Force Field for Computer Simulations. 86. O. Engkvist, P.-O. A˚strand, and G. Karlstro¨m, Chem. Rev., 100, 4087 (2000). Accurate Intermolecular Potentials Obtained from Molecular Wave Functions: Bridging the Gap between Quantum Chemistry and Molecular Simulations. 87. D. Hagberg, G. Karlstro¨m, B. O. Roos, and L. Gagliardi, J. Am. Chem. Soc., 127 (2005). The Coordination of Uranyl in Water: A Combined Ab Initio and Molecular Simulation Study. 88. L. N. Gorokhov, A. M. Emelyanov, and Y. S. Khodeev, Teplofiz. Vys. Temp., 12, 1307 (1974).
284
Transition Metal- and Actinide-Containing Systems
89. L. Andrews, Private Communication (2006). 90. M. C. Heaven, Private Communication (2006). 91. P. F. Souter, G. P. Kushto, L. Andrews, and M. Neurock, J. Am. Chem. Soc., 119, 1682 (1997). Experimental and Theoretical Evidence for the Formation of Several Uranium Hydride Molecules. 92. M. Pepper and B. E. Bursten, J. Am. Chem. Soc., 112, 7803 (1990). Ab Initio Studies of the Electronic Structure of the Diuranium Molecule. 93. L. Gagliardi and P. Pyykko¨, Angew. Chem. Int. Ed., 43, 1573 (2004). Theoretical Search for Very Short Metal-Actinide Bonds: NUIr and Isoelectronic Systems. 94. L. Gagliardi and P. Pyykko¨, Chem. Phys. Phys. Chem., 6, 2904 (2004). Study of the MAu6 Molecular Species (M ¼ Cr, Mo, W): A Transition from Halogenlike to Hydrogenlike Chemical Behavior for Gold. ˚ . Malmqvist, and L. Gagliardi, J. Am. Chem. Soc., 128, 17000 (2006). 95. B. O. Roos, P.-A Exploring the Actinide-Actinide Bond: Theoretical Studies of the Chemical Bond in Ac2, Th2, Pa2, and U2. 96. J. Sansonetti, W. Martin, and S. Young, Handbook of Basic Atomic Spectroscopic Data (version 1.00). [Online] Available: http://physics.nist.gov/Handbook., National Institute of Standards and Technology, Gaithersburg, Maryland, (2003). 97. D. Lawton and R. Mason, J. Am. Chem. Soc., 87, 921 (1965). The Molecular Structure of Molybdenum(II) Acetate. 98. T. A. Stephenson, E. Bannister, and G. Wilkinson, J. Chem. Soc., 2538 (1964). Molybdenum(II) Carboxylates. 99. F. A. Cotton, E. A. Hillard, C. A. Murillo, and H.-C. Zhou, J. Am. Chem. Soc., 122, 416 (2000). After 155 Years, A Crystalline Chromium Carboxylate With a Supershort Cr-Cr Bond. 100. B. O. Roos and L. Gagliardi, Inorg. Chem., 45, 803 (2006). Quantum Chemistry Predicts Multiply Bonded Diuranium Compounds to be Stable. 101. V. I. Bazhanov, S. A. Komarov, V. G. Sevast’yanov, M. V. Popik, N. T. Kutnetsov, and Y. S. Ezhov, Vysokochist. Veshchestva, 1, 109 (1990). 102. L. Joubert and P. Maldivi, J. Phys Chem. A, 105, 9068 (2001). 
A Structural and Vibrational Study of Uranium(III) Molecules by Density Functional Methods. 103. G. La Macchia, M. Brynda, and L. Gagliardi, Angew. Chem. Int. Ed., 45, 6210 (2006). Quantum Chemical Calculations Predict the Diphenyl Diuranium Compound, PhUUPh, to Have a Stable 1Ag Ground State.
CHAPTER 7
Recursive Solutions to Large Eigenproblems in Molecular Spectroscopy and Reaction Dynamics

Hua Guo
University of New Mexico, Albuquerque, New Mexico
INTRODUCTION

Quantum Mechanics and Eigenproblems

The fundamental equation in quantum mechanics, namely the time-independent Schrödinger equation,

$$\hat{H}|\psi\rangle = E|\psi\rangle \qquad [1]$$
suggests that many problems in chemical physics can be reformulated as the solution of the corresponding eigenequations or their generalized counterparts.1 Examples of such problems include the Hartree–Fock and configuration interaction (CI) equations in electronic structure theory,2 ro-vibrational spectra of molecular systems,3–5 and resonances in scattering6,7 and in photodissociation.8 In addition, eigenproblems for other Hermitian operators in quantum mechanics are also common and can be handled using similar strategies. Unfortunately, most such eigenequations do not have analytical solutions, and one often has to rely on approximate methods, such as perturbation theory and variation methods, which may involve substantial numerical computations. As a result, the efficiency and accuracy of such methods are of paramount importance. In this review, we will restrict our attention to methods based on the variation principle and consider some nontraditional diagonalization algorithms. These algorithms take advantage of recursive matrix-vector multiplication and are thus amenable to large eigenproblems. In particular, vibrational spectra of polyatomic molecules will be used as examples to illustrate the numerical approaches.

Reviews in Computational Chemistry, Volume 25, edited by Kenny B. Lipkowitz and Thomas R. Cundari. Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.
Discretization

A numerical solution of the Schrödinger equation in Eq. [1] often starts with the discretization of the wave function. Discretization is necessary because it converts the differential equation to a matrix form, which can then be readily handled by a digital computer. This process is typically done using a set of basis functions in a chosen coordinate system. As discussed extensively in the literature,5,9–11 the proper choice of the coordinate system and the basis functions is vital in minimizing the size of the problem and in providing a physically relevant interpretation of the solution. However, this important topic is beyond the scope of this review, and we will only discuss some related issues in the context of recursive diagonalization. Interested readers are referred to other excellent reviews on this topic.5,9,10

Assuming that the basis functions used in discretization $(|\phi_i\rangle)$ are complete and orthonormal, the wave function in Eq. [1] can be expressed in the following expansion:

$$|\psi\rangle = \sum_{i=1}^{N} b_i\,|\phi_i\rangle \qquad [2]$$
In principle, the sum in Eq. [2] contains infinitely many terms. However, a judicious choice of the coordinate system and basis functions allows for a truncation to a finite number (N) of terms without sacrificing accuracy. Substituting Eq. [2] back into Eq. [1], we have

$$\mathbf{H}\mathbf{b} = \varepsilon\,\mathbf{b} \qquad [3]$$

Here, $\varepsilon$ is the eigenvalue, and the Hamiltonian matrix and the eigenvector are respectively defined as

$$H_{ij} = \langle\phi_i|\hat{H}|\phi_j\rangle \qquad [4]$$

$$\mathbf{b} = (b_1, b_2, \ldots, b_N)^{T} \qquad [5]$$
where T denotes transpose. It is well known that a square matrix of order N has N eigenvalues and N eigenvectors. Therefore, Eq. [3] for all eigenpairs (i.e., eigenvalues and eigenvectors) can be written in a single matrix equation:

$$\mathbf{H}\mathbf{B} = \mathbf{B}\mathbf{E} \qquad [6]$$

where E is a diagonal matrix containing the eigenvalues of H,

$$E_{nn'} = \varepsilon_n\,\delta_{nn'} \qquad [7]$$

and the nth column of B contains the eigenvector corresponding to $\varepsilon_n$:

$$B_{in} = b_{in} \qquad [8]$$
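As a small numerical check of Eq. [6], the sketch below diagonalizes a 2 × 2 real-symmetric matrix in closed form and verifies that HB = BE. The function names (`eig_sym_2x2`, `matmul`) and the test matrix are illustrative assumptions, not part of the original text.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenpairs of the real-symmetric matrix [[a, b], [b, d]].

    Returns (eigenvalues, B); the columns of B are the normalized
    eigenvectors, so that H B = B E with E = diag(eigenvalues).
    """
    theta = 0.5 * math.atan2(2.0 * b, a - d)   # rotation angle that zeros the off-diagonal
    c, s = math.cos(theta), math.sin(theta)
    B = [[c, -s],
         [s,  c]]
    e1 = a * c * c + 2.0 * b * c * s + d * s * s   # Rayleigh quotient of column 1
    e2 = a * s * s - 2.0 * b * c * s + d * c * c   # Rayleigh quotient of column 2
    return [e1, e2], B

def matmul(X, Y):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Hypothetical test matrix.
H = [[2.0, 1.0],
     [1.0, 2.0]]
evals, B = eig_sym_2x2(2.0, 1.0, 2.0)
E = [[evals[0], 0.0],
     [0.0, evals[1]]]

HB, BE = matmul(H, B), matmul(B, E)
residual = max(abs(HB[i][j] - BE[i][j]) for i in range(2) for j in range(2))
print("eigenvalues:", evals, " max |HB - BE| =", residual)
```

For larger matrices no closed form exists, which is what motivates the diagonalization algorithms discussed later.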
Because quantum-mechanical operators that represent observables are Hermitian, the corresponding matrices are also Hermitian. In other words, the complex conjugate of the transpose of such a matrix (denoted as †) is equal to itself:

$$\mathbf{A}^{\dagger} = \mathbf{A}, \quad \text{or} \quad a_{ij} = a_{ji}^{*} \qquad [9]$$
It is well established that the eigenvalues of a Hermitian matrix are all real, and their corresponding eigenvectors can be made orthonormal. A special case arises when the elements of the Hermitian matrix A are real, which can be achieved by using real basis functions. Under such circumstances, the Hermitian matrix is reduced to a real-symmetric matrix:

$$\mathbf{A}^{T} = \mathbf{A}, \quad \text{or} \quad a_{ij} = a_{ji} \qquad [10]$$
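For completeness, the two properties quoted above follow from a short standard argument. The symbols $\mathbf{b}_m$, $\mathbf{b}_n$, $\varepsilon_m$, and $\varepsilon_n$ below denote two generic eigenpairs of a Hermitian matrix A and are introduced here only for illustration. Evaluating $\mathbf{b}_m^{\dagger}\mathbf{A}\mathbf{b}_n$ in two ways gives

$$
\mathbf{b}_m^{\dagger}(\mathbf{A}\mathbf{b}_n) = \varepsilon_n\,\mathbf{b}_m^{\dagger}\mathbf{b}_n,
\qquad
(\mathbf{A}\mathbf{b}_m)^{\dagger}\mathbf{b}_n = \varepsilon_m^{*}\,\mathbf{b}_m^{\dagger}\mathbf{b}_n
\quad\Longrightarrow\quad
(\varepsilon_n - \varepsilon_m^{*})\,\mathbf{b}_m^{\dagger}\mathbf{b}_n = 0
$$

where the second expression uses $\mathbf{A}^{\dagger} = \mathbf{A}$. Setting m = n shows that $\varepsilon_n = \varepsilon_n^{*}$ is real, and for $\varepsilon_m \neq \varepsilon_n$ the eigenvectors must be orthogonal. Eigenvectors within a degenerate subspace can be orthogonalized explicitly.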
Without loss of generality, this review will concentrate on real-symmetric matrices, whereas their Hermitian counterparts can be handled in a similar way. In some special cases, solutions of complex-symmetric matrices are required. This situation will be discussed separately.

Let us now illustrate the discretization process using the vibration of a triatomic molecule (ABC) as an example. The nuclear Hamiltonian with zero total angular momentum (J = 0) can be conveniently written in the Jacobi coordinates ($\hbar = 1$ hereafter):

$$\hat{H} = -\frac{1}{2\mu_R}\frac{\partial^2}{\partial R^2} - \frac{1}{2\mu_r}\frac{\partial^2}{\partial r^2} + \left(\frac{1}{2\mu_R R^2} + \frac{1}{2\mu_r r^2}\right)\hat{j}^2 + V(R, r, \gamma) \qquad [11]$$
where r and R are, respectively, the diatomic (BC) and atom–diatom (A–BC) distances with $\mu_r$ and $\mu_R$ as their reduced masses, and $\hat{j}$ denotes the diatomic rotational angular momentum operator for the Jacobi angle $\gamma$. Although the kinetic energy part of the above Hamiltonian consists of factorizable operators, the potential energy operator (V) is typically nonseparable and system dependent. As a result, commonly used discretization schemes often rely on the so-called discrete variable representation (DVR),10 which defines a set of grid points in the coordinate representation and thus renders the potential energy operator diagonal. For simplicity, a direct product DVR is often used. Under such circumstances, the wave function and the Hamiltonian matrix can be expressed in the following form:

$$|\psi\rangle = \sum_{lmn} f_{lmn}\,|l\rangle|m\rangle|n\rangle \qquad [12]$$
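$$
\begin{aligned}
H_{lmn,l'm'n'} &\equiv \langle l|\langle m|\langle n|\hat{H}|n'\rangle|m'\rangle|l'\rangle \\
&= -\frac{1}{2\mu_R}\left\langle l\left|\frac{\partial^2}{\partial R^2}\right|l'\right\rangle\delta_{mm'}\delta_{nn'}
 - \frac{1}{2\mu_r}\left\langle m\left|\frac{\partial^2}{\partial r^2}\right|m'\right\rangle\delta_{ll'}\delta_{nn'} \\
&\quad + \left(\frac{1}{2\mu_R R_l^2} + \frac{1}{2\mu_r r_m^2}\right)\langle n|\hat{j}^2|n'\rangle\,\delta_{ll'}\delta_{mm'}
 + V(R_l, r_m, \gamma_n)\,\delta_{ll'}\delta_{mm'}\delta_{nn'}
\end{aligned}
\qquad [13]
$$

As indicated by the Kronecker deltas in the above equation, the resulting Hamiltonian matrix is extremely sparse and its action onto a vector can be readily computed one term at a time.12,13 This property becomes very important for recursive diagonalization methods, which rely on matrix-vector multiplication:

$$\hat{H}|\psi\rangle = \sum_{lmn} f'_{lmn}\,|l\rangle|m\rangle|n\rangle \qquad [14]$$

where

$$
\begin{aligned}
f'_{lmn} &= -\frac{1}{2\mu_R}\sum_{l'}\left\langle l\left|\frac{\partial^2}{\partial R^2}\right|l'\right\rangle f_{l'mn}
 - \frac{1}{2\mu_r}\sum_{m'}\left\langle m\left|\frac{\partial^2}{\partial r^2}\right|m'\right\rangle f_{lm'n} \\
&\quad + \left(\frac{1}{2\mu_R R_l^2} + \frac{1}{2\mu_r r_m^2}\right)\sum_{n'}\langle n|\hat{j}^2|n'\rangle f_{lmn'}
 + V(R_l, r_m, \gamma_n)\,f_{lmn}
\end{aligned}
\qquad [15]
$$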
As a result, the partial sum in Eq. [15] is so efficient that it can often lead to pseudo-linear scaling.14 The scaling may be further improved in some cases by using a fast Fourier transform (FFT).9

A note of caution should be given here regarding the Hamiltonian matrix in Eq. [13]. It is not difficult to see that singularities can arise when the radial coordinates approach zero, which in turn could result in serious convergence problems. Similar singularities may also exist in other coordinate systems and are particularly relevant for "floppy" molecules. Several effective strategies have been proposed to avoid or alleviate problems associated with such singularities.15,16
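The term-by-term action of a direct-product Hamiltonian on a vector, as in Eq. [15], can be sketched on a simplified two-dimensional model. Everything specific below is an assumption made for illustration: the two Cartesian-like coordinates, the finite-difference kinetic matrices standing in for a DVR, the model potential, and the function names. The explicit matrix is built only to confirm that the matrix-free product gives the same result.

```python
def kinetic_1d(n, h=1.0):
    """Hypothetical 1D kinetic-energy matrix, -1/2 d^2/dx^2, by second-order
    finite differences on n grid points with spacing h (a stand-in for a DVR)."""
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        K[i][i] = 1.0 / h**2
        if i > 0:
            K[i][i - 1] = -0.5 / h**2
        if i < n - 1:
            K[i][i + 1] = -0.5 / h**2
    return K

def apply_h(f, Kx, Ky, V):
    """Return f' = H f one term at a time, without ever forming H.

    f[l][m] is the coefficient on grid point (x_l, y_m); V[l][m] is the
    potential on that point (diagonal in a grid representation).
    """
    nx, ny = len(Kx), len(Ky)
    out = [[V[l][m] * f[l][m] for m in range(ny)] for l in range(nx)]
    for l in range(nx):
        for m in range(ny):
            out[l][m] += sum(Kx[l][lp] * f[lp][m] for lp in range(nx))  # x kinetic term
            out[l][m] += sum(Ky[m][mp] * f[l][mp] for mp in range(ny))  # y kinetic term
    return out

def dense_h(Kx, Ky, V):
    """Explicit (nx*ny) x (nx*ny) matrix assembled from the Kronecker deltas;
    used here only to check apply_h on a tiny grid."""
    nx, ny = len(Kx), len(Ky)
    H = [[0.0] * (nx * ny) for _ in range(nx * ny)]
    for l in range(nx):
        for m in range(ny):
            for lp in range(nx):
                for mp in range(ny):
                    val = 0.0
                    if m == mp:
                        val += Kx[l][lp]
                    if l == lp:
                        val += Ky[m][mp]
                    if l == lp and m == mp:
                        val += V[l][m]
                    H[l * ny + m][lp * ny + mp] = val
    return H

nx, ny = 3, 4
Kx, Ky = kinetic_1d(nx), kinetic_1d(ny)
V = [[0.5 * (l - 1) ** 2 + 0.5 * (m - 1.5) ** 2 for m in range(ny)] for l in range(nx)]
f = [[float(l + 2 * m + 1) for m in range(ny)] for l in range(nx)]

fast = apply_h(f, Kx, Ky, V)
H = dense_h(Kx, Ky, V)
flat = [f[l][m] for l in range(nx) for m in range(ny)]
slow = [sum(H[i][j] * flat[j] for j in range(nx * ny)) for i in range(nx * ny)]
max_diff = max(abs(fast[l][m] - slow[l * ny + m]) for l in range(nx) for m in range(ny))
print("max difference between matrix-free and dense products:", max_diff)
```

In this sketch the matrix-free product costs on the order of N(nx + ny) operations for N = nx·ny grid points, compared with N² for the dense product, and only the small one-dimensional matrices and the diagonal potential are stored.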
Direct Diagonalization

The matrix eigenequation in Eq. [6] can also be considered as the result of the Rayleigh–Ritz variation with the trial function given in Eq. [2]. According to MacDonald's theorem,17 the approximate eigenvalues obtained by solving Eq. [6] provide the upper bounds for the exact counterparts. Hence, eigenvalues $\varepsilon_n$ of the $N \times N$ Hamiltonian matrix can be determined by optimizing the variation parameters (B).3,4

In principle, the eigenequation in Eq. [6] can be solved by finding the roots of the Nth-order characteristic polynomial $\det|\mathbf{H} - \varepsilon\mathbf{I}|$, where I is the identity matrix. However, root-searching becomes inefficient when the order of the matrix N increases, especially considering that there is no explicit formula for N > 4. A more effective way to find the eigenpairs of the Hamiltonian matrix is by diagonalization. Indeed, multiplying both sides of Eq. [6] on the left by $\mathbf{B}^{-1}$ yields

$$\mathbf{B}^{-1}\mathbf{H}\mathbf{B} = \mathbf{E} \qquad [16]$$
where $\mathbf{B}^{-1}\mathbf{B} = \mathbf{B}\mathbf{B}^{-1} = \mathbf{I}$. In other words, the diagonalization of H by a similarity transformation gives its eigenvalues as the diagonal matrix elements, whereas the matrix that transforms the Hamiltonian to its diagonal form contains the eigenvectors. For a real-symmetric matrix, the transformation matrix B is orthogonal, whereas the B matrix becomes unitary if H is Hermitian.

When a non-orthogonal basis is used, a generalized eigenequation will ensue. Its solution becomes more involved because of a non-diagonal overlap matrix, but methods to handle such problems exist and readers are referred to other sources for details.18 One popular approach is to remove the linear dependence by diagonalizing the overlap matrix. This so-called singular value decomposition (SVD) reduces a generalized eigenproblem to a regular one. In some techniques discussed below, the solution of a small generalized eigenproblem might be required.

There are several ways to diagonalize a real-symmetric matrix. The common strategy is to use orthogonal matrices to gradually reduce the magnitude of off-diagonal elements.18,19 The most rudimentary and well-known approach is the Jacobi rotation, which zeros one such element at a time. The alternative rotation scheme devised by Givens can be used to make the process more efficient. Perhaps the most efficient and most commonly used approach is that from Householder, which reduces the full matrix H to a symmetric tridiagonal form (T) by using a finite number (N − 2) of reflections:

$$\mathbf{T} = \mathbf{Q}_{N-2}^{-1}\mathbf{Q}_{N-3}^{-1}\cdots\mathbf{Q}_{1}^{-1}\,\mathbf{H}\,\mathbf{Q}_{1}\cdots\mathbf{Q}_{N-3}\mathbf{Q}_{N-2} \qquad [17]$$
where the inverse of the orthogonal matrix Q can be readily obtained: $\mathbf{Q}^{-1} = \mathbf{Q}^{T}$. The efficiency of the Householder method stems from the fact that multiple zeros can be produced at each step.

The subsequent diagonalization of the symmetric tridiagonal matrix T is relatively straightforward and numerically inexpensive. It can be carried out by root-searching methods such as bisection if a small number of eigenvalues is of interest.18,19 When the entire spectrum is needed, on the other hand, one would use factorization methods such as QR (or the related QL), which are named after the factorized matrices shown below.18,19 In the QR approach, the tridiagonal matrix is factorized into the product of an orthogonal matrix Q and an upper triangular matrix R:

$$\mathbf{T} = \mathbf{Q}\mathbf{R} \qquad [18]$$

It can be shown that the new matrix after the similarity transform

$$\mathbf{T}' = \mathbf{R}\mathbf{Q} = \mathbf{Q}^{-1}\mathbf{T}\mathbf{Q} \qquad [19]$$

maintains the tridiagonal form and, when the factorization is repeated, eventually converges to the diagonal form. We note in passing that the QR factorization can itself be used to diagonalize matrices, but the efficiency is usually not optimal. Other methods such as inverse iteration may also be used to find eigenpairs of a symmetric tridiagonal matrix.18,20

As the modification of the original Hamiltonian matrix is involved in the diagonalization methods discussed above, we denote such approaches as direct diagonalization to distinguish them from the recursive ones discussed below. The direct diagonalization process is illustrated in Figure 1.
Figure 1 Schematic procedure for direct diagonalization.
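The repeated QR step of Eqs. [18] and [19] can be sketched on a small symmetric tridiagonal matrix. This is a deliberately simple illustration under stated assumptions: the factorization is done by Gram–Schmidt orthogonalization on the full matrix and no shifts are used, whereas production routines exploit the tridiagonal structure and use shifts to accelerate convergence. The test matrix, the iteration count, and the helper names are hypothetical.

```python
import math

def qr_gram_schmidt(A):
    """Factor a nonsingular square matrix as A = Q R by modified Gram-Schmidt."""
    n = len(A)
    cols = [[A[i][j] for i in range(n)] for j in range(n)]   # columns of A
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for i, q in enumerate(q_cols):
            R[i][j] = sum(q[k] * v[k] for k in range(n))      # projection on q_i
            v = [v[k] - R[i][j] * q[k] for k in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / R[j][j] for x in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]  # q_j as columns
    return Q, R

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Hypothetical 3 x 3 symmetric tridiagonal matrix.
T = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]

for _ in range(200):
    Q, R = qr_gram_schmidt(T)   # Eq. [18]: T = Q R
    T = matmul(R, Q)            # Eq. [19]: T' = R Q = Q^{-1} T Q

off = max(abs(T[i][j]) for i in range(3) for j in range(3) if i != j)
eigenvalues = sorted(T[i][i] for i in range(3))
print("largest off-diagonal element:", off)
print("eigenvalues:", eigenvalues)
```

Because each step is an orthogonal similarity transformation, the eigenvalues are preserved while the off-diagonal elements decay. The decay rate of the unshifted iteration depends on the ratios of neighboring eigenvalues, which is why shifted variants are preferred in practice.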
Scaling Laws and Motivation for Recursive Diagonalization

Direct diagonalization has three major advantages. First, it yields all eigenpairs at the same time. Second, it is very stable and robust. For instance, the orthonormality of the eigenvectors can be achieved to machine precision. Third, several well-developed and easy-to-use "canned" routines exist that allow robust diagonalization of Hermitian and real-symmetric matrices. Similar routines can also be found for solving generalized eigenproblems. One can find such routines in numerical libraries such as Numerical Recipes,19 LAPACK or formerly EISPACK (www.netlib.org/lapack/), IMSL (www.vni.com/products/imsl/), and NAG (www.nag.co.uk/). Among these, LAPACK is free. As a result, direct diagonalization has been the method of choice for many years.

However, a potentially serious problem for direct diagonalization is its scaling. The number of arithmetic operations in the Householder method scales as $N^3$. Moreover, all $N^2$ matrix elements must be stored in the core memory, even if the matrix is sparse. The scaling laws for diagonalizing a tridiagonal matrix are much less severe than those of the tridiagonalization step. Hence, diagonalization becomes a computational bottleneck when N > 10,000, even with rapidly increasing central processing unit (CPU) speed and core memory. The adverse effects of the steep scaling laws for direct diagonalization can best be illustrated when calculating the vibrational spectrum of polyatomic molecules. If a direct product basis is used and each degree of freedom is allocated ten basis functions, the dimensionality of the Hamiltonian matrix (N) becomes $10^3$ for a triatomic molecule, and it increases to $10^6$ for a tetratomic molecule. The storage of $10^{12}$ double-precision numbers is difficult, if not impossible, to accomplish even with a state-of-the-art supercomputer.
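The storage estimate above is easy to reproduce. The short sketch below assumes a nonlinear molecule with 3·(number of atoms) − 6 internal degrees of freedom and 8 bytes per double-precision number. The function names are illustrative.

```python
def direct_product_size(n_atoms, n_basis=10):
    """Basis size N for a nonlinear molecule with 3*n_atoms - 6 internal
    degrees of freedom and n_basis functions per degree of freedom."""
    return n_basis ** (3 * n_atoms - 6)

def dense_storage_bytes(N, bytes_per_number=8):
    """Memory needed to hold all N*N double-precision matrix elements."""
    return N * N * bytes_per_number

for n_atoms in (3, 4):
    N = direct_product_size(n_atoms)
    print(f"{n_atoms} atoms: N = {N}, "
          f"dense H = {dense_storage_bytes(N) / 1e9:.3g} GB, "
          f"one vector = {8 * N / 1e6:.3g} MB")
```

For the tetratomic case the full matrix would need 8 × 10^12 bytes (8 TB), whereas a single vector of length 10^6 needs only 8 MB. This contrast is what makes methods that store a few vectors, rather than the matrix, attractive.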
Of course, various basis truncation and/or optimization schemes have been proposed to minimize the size of the Hamiltonian matrix,5,21,22 but these schemes are system dependent, in general more difficult to program, and more importantly merely postpone the onset of the bottleneck. There is thus a strong desire to use diagonalization schemes that scale better than direct methods.

Apart from the steep scaling laws, one can also argue that the complete solution of the eigenequation mandated by direct diagonalization might not be necessary because only small regions of the spectrum are often of interest. It is thus desirable sometimes to extract only a few eigenvalues in a given spectral range.

As discussed, an alternative to direct diagonalization is recursion. The recursive diagonalization approach has several attractive features, including more favorable scaling laws, which make it ideally suited for large eigenproblems. For example, some applications of linear-scaling recursive methods in quantum chemistry have been discussed in this book series.23 A related problem in quantum mechanics is the recursive interpolation of operator functions, such as the time propagator. Although diagonalization in those cases might be overkill, recursive methods are nonetheless extensively used. As a result, we will offer some discussions about this issue when appropriate.
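RECURSION AND THE KRYLOV SUBSPACE

To understand the recursive diagonalization idea, it is instructive to examine the power method.18,20 Assuming that the eigenvector of H corresponding to its largest (in magnitude) eigenvalue ($\mathbf{b}_1$, $\varepsilon_1 = \varepsilon_{\max}$) is contained in an initial vector (q):

$$\mathbf{q} = \sum_{n=1} a_n \mathbf{b}_n \qquad [20]$$

the repeated multiplication of the Hamiltonian matrix onto the initial vector generates a series of vectors:

$$\mathbf{q},\ \mathbf{H}\mathbf{q},\ \mathbf{H}^2\mathbf{q},\ \ldots,\ \mathbf{H}^K\mathbf{q} \qquad [21]$$

As K increases, the vector $\mathbf{H}^K\mathbf{q}$ will eventually converge, apart from normalization, to $\mathbf{b}_1$ because it overwhelms all other terms in q with its large eigenvalue:

$$\mathbf{H}^K\mathbf{q} = \sum_{n=1} a_n \varepsilon_n^{K}\,\mathbf{b}_n \;\xrightarrow{K\to\infty}\; a_1 \varepsilon_1^{K}\,\mathbf{b}_1 \qquad [22]$$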
Here, we have assumed that the dominant eigenvector $\mathbf{b}_1$ is nondegenerate. The power method uses only the last vector in the recursive sequence in Eq. [21], discarding all information provided by preceding vectors. It is not difficult to imagine that significantly more information may be extracted from the space spanned by these vectors, which is often called the Krylov subspace:20,24

$$\mathcal{K}^{(K)}(\mathbf{q}, \mathbf{H}) = \mathrm{span}\{\mathbf{q},\ \mathbf{H}\mathbf{q},\ \mathbf{H}^2\mathbf{q},\ \ldots,\ \mathbf{H}^{K-1}\mathbf{q}\} \qquad [23]$$

Unless the initial vector is already an eigenvector, the Krylov vectors are linearly independent and they eventually span the eigenspace of H:

$$\mathcal{K}^{(N)}(\mathbf{q}, \mathbf{H}) = \mathrm{span}\{\mathbf{q},\ \mathbf{H}\mathbf{q},\ \mathbf{H}^2\mathbf{q},\ \ldots,\ \mathbf{H}^{N-1}\mathbf{q}\} = \mathrm{span}\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_N\} \qquad [24]$$
provided that the initial vector has nonzero overlaps with all eigenvectors. This conclusion suggests that one can in principle extract eigenpairs of H from the Krylov subspace. Several such strategies to accomplish this extraction are discussed below.
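The power method of Eqs. [20] to [22] can be sketched in a few lines. The vector is renormalized after every multiplication to avoid overflow, and the eigenvalue is estimated from the Rayleigh quotient of the final vector; both choices, the test matrix, and the function names are assumptions made for this illustration.

```python
import math

def matvec(H, v):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(hij * vj for hij, vj in zip(row, v)) for row in H]

def power_method(H, q, K):
    """Apply H to q K times, renormalizing at each step.

    Returns the Rayleigh quotient (q, Hq) and the final normalized vector,
    which approximate the dominant eigenpair when it is nondegenerate and
    q has a nonzero overlap with the dominant eigenvector.
    """
    norm = math.sqrt(sum(x * x for x in q))
    q = [x / norm for x in q]
    for _ in range(K):
        w = matvec(H, q)
        norm = math.sqrt(sum(x * x for x in w))
        q = [x / norm for x in w]
    eps = sum(x * y for x, y in zip(q, matvec(H, q)))
    return eps, q

# Hypothetical test matrix with eigenvalues 2 - sqrt(2), 2, and 2 + sqrt(2).
H = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
eps_max, b1 = power_method(H, [1.0, 0.0, 0.0], 100)
print("dominant eigenvalue estimate:", eps_max)
print("dominant eigenvector estimate:", b1)
```

Only the last vector is kept here. The intermediate vectors, which this loop throws away, are exactly the Krylov vectors of Eq. [23].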
LANCZOS RECURSION

Exact Arithmetic

An obvious strategy to make use of the Krylov subspace is to orthogonalize its spanning vectors. However, it could be very costly in both CPU and core memory if the orthogonalization is done after all vectors are generated and stored. In 1950, Lanczos25 proposed an elegant scheme that orthogonalizes the vectors along with their generation, which works perfectly in exact arithmetic where round-off errors are absent. The Lanczos algorithm for a real-symmetric H can be expressed in the following three-term recursion formula:

$$\mathbf{q}_{k+1} = \beta_k^{-1}\left[(\mathbf{H} - \alpha_k)\mathbf{q}_k - \beta_{k-1}\mathbf{q}_{k-1}\right], \qquad k = 1, 2, \ldots \qquad [25]$$
starting with a normalized, but otherwise arbitrarily chosen, initial vector $\mathbf{q}_1 = \mathbf{q}/\|\mathbf{q}\|$ and $\beta_0 = 0$. Each step of the recursion yields two scalar quantities:

$$\alpha_k = (\mathbf{q}_k,\ \mathbf{H}\mathbf{q}_k - \beta_{k-1}\mathbf{q}_{k-1}) \qquad [26]$$

$$
\begin{aligned}
\beta_k &= \|(\mathbf{H} - \alpha_k)\mathbf{q}_k - \beta_{k-1}\mathbf{q}_{k-1}\| \\
&= \left((\mathbf{H} - \alpha_k)\mathbf{q}_k - \beta_{k-1}\mathbf{q}_{k-1},\ (\mathbf{H} - \alpha_k)\mathbf{q}_k - \beta_{k-1}\mathbf{q}_{k-1}\right)^{1/2}
\end{aligned}
\qquad [27]
$$

where $(\cdot,\cdot)$ defines the inner product as follows:

$$(\mathbf{a}, \mathbf{b}) = \mathbf{a}^{T}\mathbf{b} \qquad [28]$$

for real-symmetric matrices. The inner product for Hermitian matrices is given by

$$(\mathbf{a}, \mathbf{b}) = \mathbf{a}^{\dagger}\mathbf{b} \qquad [29]$$

It should be noted that some variations for calculating $\alpha$ and $\beta$ exist, and their numerical accuracy might not be the same.26 The vectors generated by the Lanczos recursion differ from the Krylov vectors in that the former are mutually orthogonal and properly normalized, at least in exact arithmetic. In fact, the Lanczos vectors can be considered as the Gram–Schmidt orthogonalized Krylov vectors.27 Because the orthogonalization is performed implicitly along the recursion, the numerical costs are minimal.
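A minimal sketch of the three-term recursion in Eqs. [25] to [27] for a real-symmetric matrix is given below. The Hamiltonian enters only through a user-supplied `matvec` callback, and only two Lanczos vectors are held at any time. The function name, the breakdown threshold, and the diagonal toy matrix are assumptions made for this illustration.

```python
import math

def lanczos(matvec, q, K):
    """Run up to K steps of the Lanczos recursion.

    matvec(v) must return H v for a real-symmetric H. Returns (alpha, beta),
    where alpha[k] and beta[k] hold alpha_{k+1} and beta_{k+1} of Eqs. [26]
    and [27]; beta has one entry fewer than alpha.
    """
    norm = math.sqrt(sum(x * x for x in q))
    q_curr = [x / norm for x in q]          # q_1
    q_prev = [0.0] * len(q)                 # q_0, multiplied by beta_0 = 0
    beta_prev = 0.0
    alpha, beta = [], []
    for k in range(K):
        w = matvec(q_curr)
        w = [wi - beta_prev * qp for wi, qp in zip(w, q_prev)]   # H q_k - beta_{k-1} q_{k-1}
        a = sum(qc * wi for qc, wi in zip(q_curr, w))            # Eq. [26]
        w = [wi - a * qc for wi, qc in zip(w, q_curr)]           # (H - alpha_k) q_k - beta_{k-1} q_{k-1}
        b = math.sqrt(sum(wi * wi for wi in w))                  # Eq. [27]
        alpha.append(a)
        if k == K - 1 or b < 1e-12:   # last requested step, or the recursion terminated
            break
        beta.append(b)
        q_prev, q_curr, beta_prev = q_curr, [wi / b for wi in w], b   # Eq. [25]
    return alpha, beta

# Hypothetical toy problem: H = diag(1, 2, 3, 4), uniform starting vector.
diag = [1.0, 2.0, 3.0, 4.0]
alpha, beta = lanczos(lambda v: [d * x for d, x in zip(diag, v)],
                      [1.0, 1.0, 1.0, 1.0], K=4)
print("alpha:", alpha)
print("beta: ", beta)
```

For this toy problem with K = N, the sum of the alpha values reproduces the trace of H up to round-off, as expected when the Lanczos vectors form a complete orthonormal set.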
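After K steps of recursion, the Lanczos vectors can be arranged into an $N \times K$ matrix Q, which reduces the Hamiltonian matrix to a tridiagonal form:

$$\mathbf{Q}^{T}\mathbf{H}\mathbf{Q} = \mathbf{T}^{(K)} \qquad [30]$$

where

$$\mathbf{Q}^{T}\mathbf{Q} = \mathbf{I}, \quad \text{or} \quad \mathbf{q}_k^{T}\mathbf{q}_{k'} = \delta_{kk'} \qquad [31]$$

and the tridiagonal matrix is symmetric and has the following form:

$$
\mathbf{T}^{(K)} =
\begin{pmatrix}
\alpha_1 & \beta_1 & & & 0 \\
\beta_1 & \alpha_2 & \beta_2 & & \\
 & \beta_2 & \ddots & \ddots & \\
 & & \ddots & \ddots & \beta_{K-1} \\
0 & & & \beta_{K-1} & \alpha_K
\end{pmatrix}
\qquad [32]
$$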
In fact, the Lanczos reduction was originally proposed as a tridiagonalization scheme, predating the Givens and Householder methods. Unlike the latter methods, however, the Lanczos method is recursive. This means that the dimensionality of the T matrix is determined by the number of steps of the Lanczos recursion (K), which is usually much smaller than the dimensionality of the Hamiltonian matrix (N) in real calculations.

Finally, the conversion of the tridiagonal matrix to a diagonal form yields the approximate eigenvalues of H.20 In particular,

$$\mathbf{Z}^{-1}\mathbf{T}^{(K)}\mathbf{Z} = \mathbf{E}^{(K)} \qquad [33]$$

where $\mathbf{Z} = (\mathbf{z}_1^{(K)}, \mathbf{z}_2^{(K)}, \ldots, \mathbf{z}_K^{(K)})$ is an orthogonal matrix that contains the eigenvectors of $\mathbf{T}^{(K)}$ as its columns. In other words,

$$\mathbf{T}^{(K)}\mathbf{z}_n^{(K)} = \varepsilon_n^{(K)}\mathbf{z}_n^{(K)} \qquad [34]$$
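The extraction of approximate eigenvalues from $\mathbf{T}^{(K)}$ can be sketched by combining a Lanczos recursion with bisection on the Sturm sequence of the tridiagonal matrix. The sketch is self-contained and rests on several assumptions: the function names, the tolerances, and a model diagonal Hamiltonian with N = 100 whose largest eigenvalue (2.0) is well separated from the rest (0.00 to 0.98). Production codes would normally call a library routine for the tridiagonal step.

```python
import math

def lanczos(matvec, q, K):
    """Return the Lanczos coefficients (alpha, beta) after up to K steps."""
    norm = math.sqrt(sum(x * x for x in q))
    q_curr, q_prev, beta_prev = [x / norm for x in q], [0.0] * len(q), 0.0
    alpha, beta = [], []
    for k in range(K):
        w = [wi - beta_prev * qp for wi, qp in zip(matvec(q_curr), q_prev)]
        a = sum(qc * wi for qc, wi in zip(q_curr, w))
        w = [wi - a * qc for wi, qc in zip(w, q_curr)]
        b = math.sqrt(sum(wi * wi for wi in w))
        alpha.append(a)
        if k == K - 1 or b < 1e-12:
            break
        beta.append(b)
        q_prev, q_curr, beta_prev = q_curr, [wi / b for wi in w], b
    return alpha, beta

def count_below(alpha, beta, x):
    """Number of eigenvalues of the tridiagonal matrix smaller than x
    (sign count of the Sturm sequence)."""
    count, d = 0, 1.0
    for k in range(len(alpha)):
        off = beta[k - 1] ** 2 if k > 0 else 0.0
        d = (alpha[k] - x) - off / d
        if d == 0.0:
            d = 1e-300          # avoid division by zero in the next step
        if d < 0.0:
            count += 1
    return count

def tridiag_eigenvalue(alpha, beta, j, tol=1e-13):
    """j-th smallest eigenvalue (j = 0, 1, ...) of the tridiagonal matrix,
    located by bisection inside the Gershgorin bounds."""
    K = len(alpha)
    radius = [(abs(beta[k - 1]) if k > 0 else 0.0) +
              (abs(beta[k]) if k < K - 1 else 0.0) for k in range(K)]
    lo = min(a - r for a, r in zip(alpha, radius))
    hi = max(a + r for a, r in zip(alpha, radius))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(alpha, beta, mid) > j:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Model spectrum: 99 closely spaced levels plus one well-separated level at 2.0.
diag = [i / 100.0 for i in range(99)] + [2.0]
matvec = lambda v: [d * x for d, x in zip(diag, v)]
q = [1.0] * len(diag)

ritz_top = {}
for K in (3, 6, 12):
    alpha, beta = lanczos(matvec, q, K)
    ritz_top[K] = tridiag_eigenvalue(alpha, beta, len(alpha) - 1)
    print(f"K = {K:2d}: largest approximate eigenvalue = {ritz_top[K]:.12f}")
```

In this model the isolated extreme eigenvalue is recovered with far fewer recursion steps than the matrix dimension, while each step costs one matrix-vector product and a handful of vector operations.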
As discussed, the diagonalization of the tridiagonal matrix is straightforward and relatively inexpensive. The recursive diagonalization method outlined in Eqs. [25] to [34] is depicted in Figure 2 and is referred to below as the Lanczos algorithm.

Figure 2 Schematic procedure for the Lanczos algorithm.

If the Hamiltonian matrix H has N non-degenerate eigenpairs, it is easy to imagine that in exact arithmetic, the Lanczos recursion terminates when K = N because the Lanczos vectors span the eigenspace of H completely. However, it is important to point out that some eigenvalues converge much earlier than the entire eigenspace is generated.20,27 In fact, this is the beauty of a recursive method, which generates converged eigenvalues gradually rather than waiting until all of them are obtained. In the Lanczos algorithm, the first to converge are those eigenvalues near the spectral extrema and having large gaps with their adjacent levels. This behavior is understandable because the Lanczos recursion provides a uniform interpolation of the energy axis, as discussed below, but with more interpolation points near the energy extrema than in the interior of the spectrum. In molecular spectroscopic problems, the lowest lying states are often the most important, thus making the Lanczos algorithm ideally suited for such problems.

The numerical advantage is apparent for a recursive method that relies on repeated matrix-vector multiplication, such as the Lanczos algorithm. First, the matrix is not modified so that only the nonzero elements require storage. In practice, it is more convenient to calculate the action of the Hamiltonian matrix onto the recurring vector directly, as suggested by Eq. [15]. Second, the vectors generated by the recursion need not all be stored in the core memory. In particular, the three-term recursion in Eq. [25] stipulates that only two vectors are necessary for the recursion. As a result, the storage requirement is proportional to N. Third, the CPU scaling law for the generation of the Lanczos vectors is at most proportional to $N^2$, because of the matrix-vector multiplication. In most cases, however, the sparsity of the matrix renders the CPU dependency pseudo-linear. Fourth, one has the freedom to stop the recursion anytime, which is a highly attractive feature if the desired part of the spectrum converges first. Finally, a practical benefit is that the implementation of the Lanczos algorithm is straightforward.

Sometimes, the eigenvectors are needed in addition to the eigenvalues, and they can also be obtained by the Lanczos algorithm. For a K-step recursion, an eigenvector of H can be expressed as a linear combination of the Lanczos vectors:

$$\mathbf{b}_n^{(K)} = \sum_{k=1}^{K} z_{kn}^{(K)}\,\mathbf{q}_k \qquad [35]$$
in which the expansion coefficients ðzkn Þ belong to the corresponding eigenvector of the Lanczos tridiagonal matrix ðTðKÞ Þ. Because the qk vectors are
typically not stored, the above assembly has to be carried out with a second Lanczos recursion with exactly the same initial vector and the same number of recursion steps. On the other hand, Eq. [35] can be used to generate multiple eigenvectors using the same Lanczos recursion because all energy information is contained in the coefficients. It is important at this point to note that the number of Lanczos recursion steps needed to converge the eigenvector is typically larger than that needed for converging the corresponding eigenvalue.

Interestingly, if the Lanczos vectors $(\mathbf{q}_k)$ are considered as evolution states in a generalized time domain, Eq. [35] can be thought of as a transformation to the energy domain. Indeed, this conjugacy of energy and generalized time domains is a common theme in recursive methods that will be stressed throughout this review.

It is noted in passing that the Lanczos algorithm is closely related to several recursive linear equation solvers, such as the conjugate gradient (CG),28 minimal residual (MINRES),29 generalized minimal residual (GMRES),30 and quasi-minimal residual (QMR) methods,31,32 all of which are based on Lanczos-like recursions. An excellent discussion of the recursive linear equation solvers can be found in Ref. 33. These linear equation solvers are useful in constructing filtered vectors in the energy domain, as discussed below.
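The recursion described above can be illustrated with a minimal Python sketch. It assumes the standard symmetric Lanczos three-term recursion (taken here to correspond to Eqs. [25] to [34], which are not reproduced in this section); the function name `lanczos`, the test matrix, and the tolerance are hypothetical choices made only for illustration. Only two vectors are held at any time, and the Hamiltonian enters solely through a matrix-vector product.

```python
import numpy as np

def lanczos(matvec, q1, K):
    """Run up to K steps of the symmetric Lanczos three-term recursion.

    Returns the diagonal (alpha) and off-diagonal (beta) elements of the
    tridiagonal matrix. Only two Lanczos vectors are kept in memory.
    """
    q_prev = np.zeros_like(q1)
    q = q1 / np.linalg.norm(q1)
    alpha, beta = [], []
    b = 0.0
    for _ in range(K):
        w = matvec(q) - b * q_prev      # action of H on the recurring vector
        a = q @ w
        w = w - a * q
        b = np.linalg.norm(w)
        alpha.append(a)
        beta.append(b)
        if b < 1e-12:                   # invariant subspace reached; stop
            break
        q_prev, q = q, w / b
    return np.array(alpha), np.array(beta)

# Toy "Hamiltonian" (an assumption of this sketch): diagonal plus weak coupling.
rng = np.random.default_rng(0)
N, K = 200, 120
A = 0.01 * rng.standard_normal((N, N))
H = np.diag(np.arange(N, dtype=float)) + (A + A.T) / 2

alpha, beta = lanczos(lambda v: H @ v, rng.standard_normal(N), K)
T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
ritz = np.linalg.eigvalsh(T)            # Lanczos eigenvalues
exact = np.linalg.eigvalsh(H)           # dense reference, for comparison only
print(abs(ritz[0] - exact[0]), abs(ritz[-1] - exact[-1]))
```

In this toy case the extremal eigenvalues are already well converged with $K < N$, whereas the interior Lanczos eigenvalues are not, which is the behavior described in the text.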
Finite-Precision Arithmetic

The utility of the original Lanczos algorithm is hampered by its behavior in finite-precision arithmetic, where the inevitable round-off errors cause the Lanczos vectors to lose global orthogonality and even their linear independence. This is nonetheless accompanied by well-maintained short-range orthogonality. The deterioration of long-range orthogonality leads to the regeneration of the existing Lanczos vectors and allows the recursion to proceed far beyond $N$. A notorious manifestation of the problem is the emergence of so-called “spurious” eigenvalues, which may pop up randomly in the energy spectrum or appear as redundant copies of converged eigenvalues. This phenomenon was noticed from the very beginning when Lanczos himself suggested reorthogonalization of the Lanczos vectors as a possible solution. However, such a remedy can be very costly for long recursions, and understandably, these numerical problems greatly dampened the initial enthusiasm for the Lanczos algorithm.34

The pathology of the “spurious” eigenvalues was not fully understood until the work of Paige,20,26,27,35,36 who undertook in the 1970s a detailed analysis of the Lanczos algorithm in finite-precision arithmetic. He discovered, to everyone’s surprise, that the loss of global orthogonality and the emergence of the “spurious” eigenvalues coincide with the convergence of some eigenvalues, implicating the interaction of the round-off errors with the convergence, rather than the round-off errors alone, as the culprit. It was also observed
that these “spurious” eigenvalues eventually converge to one of the true eigenvalues if a sufficient number of recursion steps is taken. (The corresponding eigenvectors also converge but within a normalization factor.) In other words, the round-off errors simply delay the appearance of converged eigenvalues, but they do not affect their accuracy. In addition, all eigenvalues of $\mathbf{H}$ eventually appear if the recursion is allowed to proceed sufficiently long. This so-called Lanczos phenomenon27,37 is important because it establishes the usefulness of the Lanczos algorithm in diagonalizing large matrices. Paradoxically, the Lanczos phenomenon holds even when the initial vector has zero projection onto some eigenvectors; the round-off errors actually help to find them.

With the pathology of the Lanczos algorithm clarified, it is a relatively simple matter to sort out the converged eigenvalues from those that are “spurious.” There are several ways to examine the convergence of the Lanczos eigenpairs numerically. The most straightforward test would be to compare the convergence of eigenvalues with respect to the number of Lanczos recursion steps, but this approach can be error prone because Lanczos eigenvalues often cluster together. In what follows, we discuss two robust tests that allow for the identification of converged copies.

The Paige test26 identifies a converged eigenvalue $e_i^{(K)}$ by the smallness of the last element of the eigenvector $\mathbf{z}_i^{(K)}$, namely $|z_{Ki}^{(K)}|$. (Here, we have used $i$ instead of $n$ to denote the Lanczos eigenvalues, because of the possibility of multiple converged copies generated by the finite-precision Lanczos algorithm.) This test is based on the observation that the Lanczos algorithm of Eqs. [25]–[34] can be rewritten as follows:

$$\mathbf{H}\mathbf{b}_i^{(K)} = e_i^{(K)} \mathbf{b}_i^{(K)} + \beta_K z_{Ki}^{(K)} \mathbf{q}_{K+1} \qquad [36]$$

Hence, if $|\beta_K z_{Ki}^{(K)}|$ is sufficiently small, the eigenpair $\{e_i^{(K)}, \mathbf{b}_i^{(K)}\}$ satisfies the original eigenequation; in other words, it converges to the true eigenpair of $\mathbf{H}$. This behavior persists in finite-precision arithmetic and was termed “stabilization” by Paige.36 An error bound can thus be used to determine the convergence of an eigenpair.

The convergence dynamics of the Lanczos algorithm is illustrated in Fig. 3 for a Hamiltonian in the form of Eq. [13].38 The absolute values of the elements of the eigenvector(s) near a chosen energy are plotted at several $K$ values. As the figure shows, the first copy starts to converge near $K = 80$ and is well converged at $K = 200$, when the last element of its eigenvector reaches $10^{-14}$. At this point, the orthogonalization of the Lanczos vectors starts to deteriorate (not shown here). The second copy appears near $K = 300$ and converges near $K = 400$. This process repeats at larger $K$ values. The converged copies are typically accurate up to machine precision, and their corresponding eigenvectors are also accurate.
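The Paige test lends itself to a short numerical illustration. The Python sketch below is only a toy under stated assumptions: the test matrix, the recursion length $K > N$, and the tolerance `1e-8` are hypothetical choices, and the plain `lanczos` helper stands in for the recursion of Eqs. [25]–[34]. It flags every Lanczos eigenvalue whose error bound $|\beta_K z_{Ki}^{(K)}|$ from Eq. [36] is small and compares the flagged values with a dense reference diagonalization.

```python
import numpy as np

def lanczos(H, q1, K):
    """Plain K-step Lanczos recursion without reorthogonalization."""
    q_prev, q = np.zeros_like(q1), q1 / np.linalg.norm(q1)
    alpha, beta, b = np.zeros(K), np.zeros(K), 0.0
    for k in range(K):
        w = H @ q - b * q_prev
        alpha[k] = q @ w
        w -= alpha[k] * q
        b = beta[k] = np.linalg.norm(w)
        q_prev, q = q, w / b
    return alpha, beta

rng = np.random.default_rng(0)
N, K = 100, 150                          # K > N is possible only in finite precision
A = 0.01 * rng.standard_normal((N, N))
H = np.diag(np.arange(N, dtype=float)) + (A + A.T) / 2

alpha, beta = lanczos(H, rng.standard_normal(N), K)
T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
e, Z = np.linalg.eigh(T)                 # Lanczos eigenvalues and eigenvectors

bound = np.abs(beta[-1] * Z[-1, :])      # |beta_K z_Ki| for every Lanczos eigenpair
converged = bound < 1e-8                 # illustrative tolerance (an assumption)

exact = np.linalg.eigvalsh(H)            # dense reference, for comparison only
dist = np.abs(e[converged][:, None] - exact[None, :]).min(axis=1)
copies = int(np.sum(np.abs(e[converged] - exact[0]) < 1e-6))
print(int(converged.sum()), "eigenvalues pass the test;",
      copies, "converged copies of the lowest eigenvalue")
```

Every flagged value should lie very close to a true eigenvalue, whether it is the first copy or a later one, which is the sense in which the test identifies converged copies rather than removing duplicates.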
Figure 3 Absolute values of the elements of the Lanczos eigenvectors of the ground state of HOCl. Adapted with permission from Ref. 38.
Several observations in Figure 3 are worth noting. First, the first converging copy of the eigenvector consists primarily of the first 160 Lanczos vectors, whereas the second copy is composed largely of the 160 latter vectors, implying regeneration of some recurring vectors after the convergence of the first copy. Starting to appear at $K = 300$, the second eigenvector has an extremely small first element $(|z_{1i}|)$, indicating that in a loose sense this copy is not contained in the initial vector (cf. Eq. [35]). In other words, this copy is generated from round-off errors. Indeed, the Lanczos algorithm routinely generates eigenpairs that are not contained in the initial vector. Second, once converged, the copies tend to mix with each other because they are practically degenerate. Nonetheless, a closer look at the curves in Figure 3 reveals that each copy still has its dominant contributions from different $k$ ranges.

The Cullum–Willoughby test,27,37 on the other hand, was designed to identify the so-called “spurious” eigenvalues, rather than the converged eigenvalues. In particular, the tridiagonal matrix $\mathbf{T}^{(K)}$ and its submatrix, obtained
Figure 4 Distribution of Lanczos eigenvalues in the HO2 system (adapted with permission from Ref. 40) and Gauss–Chebyshev quadrature points.
by deleting the first row and first column of $\mathbf{T}^{(K)}$, are diagonalized. This can be done using either the QR or a modified bisection method suggested by Cullum and Willoughby.27 Their numerically identical eigenvalues are regarded as being “spurious” and are thus discarded, whereas the remaining eigenvalues are labeled as being “good” and retained. The advantage of this test is that no reference to the tolerance is made and the process is thus free of subjective interference. Also, for each eigenvalue, only one converged copy exists, which is often called the “principal” copy because of its large overlap with the initial vector. The disadvantage of using the Cullum–Willoughby test is that it might discard converged copies that are not well represented in the initial vector.39

An interesting observation concerning the convergence behavior of the Lanczos algorithm is illustrated in Figure 4, where the (unconverged) Lanczos eigenvalues are plotted against the normalized index $(k/K)$ for several values of $K$.40 These so-called “convergence curves” show the distribution of Lanczos eigenvalues in the energy domain at different recursion steps, and the corresponding eigenvalues can be viewed as interpolation points in the energy axis. It is interesting to note that these curves are almost independent of the recursion length $(K)$, and it is clear from the figure that there are more points near the extrema of the spectrum than in the interior. This is a direct result of the matrix-vector multiplication approach in the Lanczos recursion, which can be inferred from our earlier discussion about the power method. As a result, the eigenvalues near the spectral extrema converge first, whereas eigenvalues in the spectral interior and in regions with high densities of states converge much more slowly. Also plotted in the figure are Gauss–Chebyshev quadrature points, which give the distribution of the interpolation points in a Chebyshev expansion (vide infra). The similarities between the two curves are striking.

It has been long recognized that the convergence rate of the Lanczos algorithm depends on the spectral range of the Hamiltonian matrix $(\Delta H)$. Recently, it was shown from several numerical examples that the convergence rate is actually inversely proportional to the square root of $\Delta H$.41,42 This
finding can be reconciled by considering the interpolation picture in Figure 4, where more Lanczos interpolation points are needed to achieve the same resolution when $\Delta H$ is increased. Accordingly, it is extremely important to control the spectral range of the Hamiltonian in practical calculations, particularly when the DVR grid points are near singularities. A simple and commonly used strategy is to remove the DVR points above a certain energy cutoff. More sophisticated methods are discussed below.

The implementation of various forms of the Lanczos algorithm is straightforward, and a library of routines has been collected in the second volume of Cullum and Willoughby’s book.27 Applications of the Lanczos algorithm to solve molecular vibration problems, pioneered by Wyatt,43–47 Carrington,12,13,15,48,49 and others,50–55 have been reviewed by several authors56–60 and will be discussed in more detail below. A list of other applications of the Lanczos algorithm in different fields of science can be found in the review by Wyatt.56
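The Cullum–Willoughby test described in this section can be sketched in Python as follows. This is a toy illustration, not the library routines of Ref. 27: the helper names, the test matrix, and in particular the small threshold `rel_tol` used to decide when two computed eigenvalues are "numerically identical" are assumptions of the sketch. The one-to-one pairing of the two sorted eigenvalue lists is likewise an implementation choice made here so that one copy of a repeated eigenvalue survives.

```python
import numpy as np

def lanczos(H, q1, K):
    """Plain K-step Lanczos recursion without reorthogonalization."""
    q_prev, q = np.zeros_like(q1), q1 / np.linalg.norm(q1)
    alpha, beta, b = np.zeros(K), np.zeros(K), 0.0
    for k in range(K):
        w = H @ q - b * q_prev
        alpha[k] = q @ w
        w -= alpha[k] * q
        b = beta[k] = np.linalg.norm(w)
        q_prev, q = q, w / b
    return alpha, beta

def cullum_willoughby(alpha, beta, rel_tol=1e-10):
    """Split the Lanczos eigenvalues into 'good' and 'spurious' ones."""
    K = len(alpha)
    T = np.diag(alpha) + np.diag(beta[:K - 1], 1) + np.diag(beta[:K - 1], -1)
    e = np.linalg.eigvalsh(T)                # eigenvalues of T
    e_hat = np.linalg.eigvalsh(T[1:, 1:])    # first row and column deleted
    tol = rel_tol * max(1.0, np.abs(e).max())
    spurious = np.zeros(K, dtype=bool)
    i = j = 0
    while i < K and j < K - 1:               # pair the two sorted lists one-to-one
        if abs(e[i] - e_hat[j]) <= tol:
            spurious[i] = True               # also an eigenvalue of the submatrix
            i += 1
            j += 1
        elif e[i] < e_hat[j]:
            i += 1
        else:
            j += 1
    return e[~spurious], e[spurious]

rng = np.random.default_rng(0)
N, K = 100, 150
A = 0.01 * rng.standard_normal((N, N))
H = np.diag(np.arange(N, dtype=float)) + (A + A.T) / 2
alpha, beta = lanczos(H, rng.standard_normal(N), K)

good, spurious = cullum_willoughby(alpha, beta)
exact = np.linalg.eigvalsh(H)                # dense reference, for comparison only
print(len(good), "good and", len(spurious), "spurious Lanczos eigenvalues")
```

Note that "good" in this sense does not mean converged: unconverged interior Lanczos eigenvalues are also retained, so a separate convergence check is still needed.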
Extensions of the Original Lanczos Algorithm

Implicitly Restarted Lanczos Algorithms

As discussed, the original Lanczos algorithm generates eigenvalues easily, but it requires additional computational resources to obtain eigenvectors. A recently developed implicitly restarted Lanczos method (IRLM) allows for the accurate determination of the lowest eigenvalues and the corresponding eigenvectors with relative ease.61 This is achieved by storing and orthogonalizing a small number of Lanczos vectors and by combining them with an implicitly shifted QR method that requires no additional matrix-vector multiplications. The advantages of this approach include the availability of eigenvectors and the avoidance of “spurious” eigenvalues. However, IRLM has a much larger memory requirement than does the original Lanczos algorithm, even though the memory scaling is still linear. In addition, IRLM may not extract highly excited eigenpairs in the interior of the spectrum effectively.62 IRLM routines are available in the public domain,63 and several applications to molecular vibration problems have appeared recently.64–67

Block Lanczos Algorithm

One of the potentially fatal problems of the original Lanczos algorithm is its inability to handle degenerate eigenvectors. As we discuss below, most degeneracy problems in molecular spectroscopy are caused by symmetry, but the degeneracy can be removed by symmetry adaptation. In cases where no physical insight can be used to remove the degeneracy, the block version of the Lanczos algorithm may be effective.68 The basic idea here is to generate recursively not one but instead a few vectors simultaneously using the same three-term recursion (Eq. [25]). The resulting block tridiagonal matrix is
further tridiagonalized and then diagonalized to give the eigenvalues. The multiple initial vectors introduce additional linear independence necessary for resolving the multiplicity. Evidence has shown that the multiple initial vectors help to converge degenerate or near-degenerate eigenvalues. It should be noted that the scaling laws of the block Lanczos algorithm are generally less favorable than those of the original Lanczos algorithm because more recurring vectors must be stored and more arithmetic operations are required.

Spectral Transform Lanczos Algorithms

Another shortcoming associated with the original Lanczos algorithm is its inefficiency in extracting interior eigenvalues and those in dense spectral regions. In practical calculations, it is not uncommon for the Lanczos algorithm to generate hundreds of converged copies of the lowest eigenvalues before converging the desired high-energy eigenvalues in the spectral interior. To remedy the problem, Ericsson and Ruhe suggested the use of a spectral transform of the following form:69

$$F(\mathbf{H}|E) = (E\mathbf{I} - \mathbf{H})^{-1} \qquad [37]$$

which replaces $\mathbf{H}$ in the Lanczos recursion. The spectral transform dilates the spectral density near the shifting parameter $E$ so that nearby eigenvalues converge with a small number of recursion steps. The spectral transform can also be viewed as a filter, which will be discussed below. One is free to tailor the spectral transform for the specific problem of interest. In addition to the Green filter in Eq. [37],55,69–71 spectral transforming filters reported in the literature include the exponential form $(e^{-\alpha(\mathbf{H} - E\mathbf{I})})$,72,73 the Gaussian form $(e^{-\alpha(\mathbf{H} - E\mathbf{I})^2})$ and its derivatives,74,75 the hyperbolic form $(\tanh[\alpha(\mathbf{H} - E\mathbf{I})])$,76 and Chebyshev polynomials.65

However, there is a price to pay in a spectral transform Lanczos algorithm: At each recursion step, the action of the filter operator onto the Lanczos vectors has to be evaluated. In the original version, Ericsson and Ruhe update the Lanczos vectors by solving the following linear equation:

$$(E\mathbf{I} - \mathbf{H})\mathbf{q}_{k+1} = \mathbf{q}_k \qquad [38]$$

by factorization.69 In cases where the above solution is not possible by factorization because of the large matrix size, recursive linear equation solvers such as MINRES,29 GMRES,30 and QMR31,32 methods can be used.77,78 Other options for approximating the filter also exist, such as those based on polynomial expansions,73,79 as discussed in more detail below. Unfortunately, all of these two-layered methods require many matrix-vector multiplications, despite a relatively short Lanczos recursion. Thus, the total number of matrix-vector multiplications can still be large, and such methods do not necessarily lead to computational savings over the original Lanczos
algorithm.80 On the other hand, the spectral transform Lanczos approach does have some advantage if the eigenvectors are of interest. Because of the short Lanczos recursion, one can afford to store all of the Lanczos vectors, which can be used to obtain both eigenvalues and eigenvectors. Interestingly, the spectral transform Lanczos algorithm can be made more efficient if the filtering is not executed to the fullest extent. This can be achieved by truncating the Chebyshev expansion of the filter,76,81 or by terminating the recursive linear equation solver prematurely.82 In doing so, the number of matrix-vector multiplications can be reduced substantially.

Preconditioned Lanczos Algorithms

The idea of preconditioning is related to the spectral transform strategy described in the previous subsection. In the Davidson method,83 for example, the Lanczos recursion is augmented with a preconditioner at every recursion step. The Davidson method is designed to extract the lowest eigenpair of the large configuration interaction (CI) matrix typically found in electronic structure calculations. Instead of $\mathbf{H}$, the matrix used in the Lanczos recursion is given as $(E\mathbf{I} - \mathbf{H}_0)^{-1}(\mathbf{H} - E\mathbf{I})$ and is updated at every step. Here, $\mathbf{H}_0$ is a zeroth-order Hamiltonian, which is easily invertible, and $E$ is the estimated eigenvalue. If $\mathbf{H}_0$ is a good approximation of $\mathbf{H}$, the preconditioning works efficiently to converge the eigenpair, with a speed faster than the original Lanczos algorithm. A possible choice of $\mathbf{H}_0$ is to use the diagonal part of $\mathbf{H}$, especially when the off-diagonal elements of $\mathbf{H}$ are small. Of course, non-diagonal forms of $\mathbf{H}_0$ may also be used,84 but they are more expensive numerically than using the diagonal elements alone.
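As a rough illustration of the preconditioning idea, the following Python sketch implements a textbook-style Davidson iteration for the lowest eigenpair with the diagonal of $\mathbf{H}$ as $\mathbf{H}_0$. It is a minimal sketch under stated assumptions, not the implementation of Ref. 83: the function name `davidson_lowest`, the safeguard against small denominators, the double orthogonalization, and the diagonally dominant test matrix are all hypothetical choices.

```python
import numpy as np

def davidson_lowest(H, max_iter=50, tol=1e-8):
    """Lowest eigenpair of a symmetric matrix with a diagonal preconditioner."""
    N = H.shape[0]
    d = np.diag(H).copy()                    # H0: the diagonal part of H
    V = np.zeros((N, 1))
    V[np.argmin(d), 0] = 1.0                 # start from the lowest diagonal element
    for _ in range(max_iter):
        w, S = np.linalg.eigh(V.T @ H @ V)   # small projected eigenproblem
        E, b = w[0], V @ S[:, 0]             # current estimate of the eigenpair
        r = H @ b - E * b                    # residual (H - E I) b
        if np.linalg.norm(r) < tol:
            break
        denom = E - d
        denom[np.abs(denom) < 1e-6] = 1e-6   # guard against division by ~0
        t = r / denom                        # (E I - H0)^{-1} (H - E I) b
        for _pass in range(2):               # orthogonalize against the basis, twice
            t -= V @ (V.T @ t)
        norm_t = np.linalg.norm(t)
        if norm_t < 1e-12:
            break
        V = np.hstack([V, (t / norm_t)[:, None]])
    return E, b

# Diagonally dominant toy matrix (an assumption of this sketch).
rng = np.random.default_rng(3)
N = 200
A = 0.01 * rng.standard_normal((N, N))
H = np.diag(np.arange(1, N + 1, dtype=float)) + (A + A.T) / 2

E, b = davidson_lowest(H)
exact = np.linalg.eigvalsh(H)                # dense reference, for comparison only
print(abs(E - exact[0]), np.linalg.norm(H @ b - E * b))
```

Unlike the plain Lanczos recursion, this scheme stores and orthogonalizes all basis vectors, which is affordable only because few iterations are needed when $\mathbf{H}_0$ approximates $\mathbf{H}$ well.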
If more than one eigenpair is required, a block version of the Davidson method is preferred.85,86 The Davidson method has been successfully applied to extract eigenpairs in molecular vibration problems, some with high energies.87–90

Carrington and coworkers have recently devised a Lanczos-based recursive scheme based on an approximate preconditioner.78,82,91,92 This so-called preconditioned inexact spectral transform (PIST) method is formally a spectral transform method because the Green filter, namely $(E\mathbf{I} - \mathbf{H})^{-1}$, is used in the Lanczos recursion instead of $\mathbf{H}$ itself. Only a small number of Lanczos vectors is generated and stored, so the memory requirement scales linearly with $N$. The Lanczos vectors are then orthogonalized, and a small Hamiltonian matrix is diagonalized. Unlike the original Lanczos algorithm, both eigenvalues and eigenvectors are obtained in the prespecified spectral window.

PIST distinguishes itself from other spectral transform Lanczos methods by using two important innovations. First, the linear equation Eq. [38] is solved by QMR but not to a high degree of accuracy. In practice, the QMR recursion is terminated once a prespecified (and relatively large) tolerance is reached. Consequently, the resulting Lanczos vectors are only approximately filtered. This “inexact spectral transform” is efficient because far fewer matrix-vector multiplications are needed, and its deficiencies can subsequently
be compensated by diagonalization. Indeed, PIST has some similarities with the filter-diagonalization method that will be discussed later. The second important innovation is that an efficient preconditioner $[(E\mathbf{I} - \mathbf{H}_0)^{-1}]$ is used to accelerate the convergence of the QMR solution of the linear equation.

An attractive characteristic of PIST is its weak dependence on the spectral range of the Hamiltonian,82 which allows one to focus on a particular spectral window of interest. Recall that the convergence rate of the original Lanczos algorithm is inversely proportional to the square root of the spectral range,41,42 which may result in numerical inefficiency for problems with large spectral ranges. The key element of PIST is the judicious choice of $\mathbf{H}_0$, which should be sufficiently close to the true Hamiltonian while making the preconditioner easy to construct. Several publications by Poirier and Carrington were devoted to the choice of $\mathbf{H}_0$.91,92 These authors observed that with a good choice of $\mathbf{H}_0$, the number of matrix-vector multiplications in converging certain eigenpairs can be substantially less than what is needed by the original Lanczos algorithm. We point out that the design of the zeroth-order Hamiltonian may be system dependent and requires intimate knowledge of the structure of the Hamiltonian.
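The spectral transform of Eqs. [37] and [38] can be illustrated with a small dense example in Python. The sketch below is a toy under stated assumptions, not the Ericsson–Ruhe or PIST implementation: the function name, the test matrix, and the shift $E$ are hypothetical, the linear system is solved exactly with a dense solver at every step for brevity (a real code would factorize once or use an iterative solver such as QMR), and all Lanczos vectors are stored and fully reorthogonalized because the recursion is short.

```python
import numpy as np

def spectral_transform_lanczos(H, E, q1, K):
    """K-step Lanczos recursion on the Green filter (E I - H)^{-1}."""
    N = H.shape[0]
    A = E * np.eye(N) - H
    Q = np.zeros((N, K))                      # short recursion: keep all vectors
    alpha, beta = np.zeros(K), np.zeros(K)
    q = q1 / np.linalg.norm(q1)
    for k in range(K):
        Q[:, k] = q
        w = np.linalg.solve(A, q)             # Eq. [38]; dense solve for brevity
        alpha[k] = q @ w
        for _pass in range(2):                # full reorthogonalization, twice
            w -= Q[:, :k + 1] @ (Q[:, :k + 1].T @ w)
        beta[k] = np.linalg.norm(w)
        q = w / beta[k]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    mu, Z = np.linalg.eigh(T)                 # eigenvalues of the filter
    return E - 1.0 / mu, Q @ Z, mu            # map back: mu = 1 / (E - lambda)

rng = np.random.default_rng(4)
N, K, E_shift = 200, 40, 100.3                # shift placed in the spectral interior
A = 0.01 * rng.standard_normal((N, N))
H = np.diag(np.arange(N, dtype=float)) + (A + A.T) / 2

lam, B, mu = spectral_transform_lanczos(H, E_shift, rng.standard_normal(N), K)
i = np.argmax(np.abs(mu))                     # filter eigenvalue nearest the shift
exact = np.linalg.eigvalsh(H)                 # dense reference, for comparison only
nearest = exact[np.argmin(np.abs(exact - E_shift))]
print(lam[i], nearest, np.linalg.norm(H @ B[:, i] - lam[i] * B[:, i]))
```

The interior eigenvalue closest to the shift becomes an extremal, well-separated eigenvalue of the filter, which is why it converges within a few tens of steps here; the cost has simply moved into the linear solve at each step.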
Transition Amplitudes

Recursive Residue Generation Method

In chemical physics, having knowledge of just the eigenpairs of the relevant Hamiltonian is often insufficient because many processes involve transition between different states. In such cases, the transition amplitudes between these states under a quantum mechanical propagator may be required to solve the problem:

$$C_{mm'} \equiv \langle \chi_m | U(\hat{H}) | \chi_{m'} \rangle = \mathbf{v}_m^T U(\mathbf{H}) \mathbf{v}_{m'} \qquad [39]$$

where $U(\mathbf{H})$ is a function of the Hamiltonian such as the time propagator and $|\chi_m\rangle$ are prespecified states. Such amplitudes are prevalent in quantum mechanics, and their examples include absorption/emission spectra, resonance Raman cross sections, correlation functions, rate constants, and S-matrix elements for reactive scattering.1,56

A special case of Eq. [39] is recognized for transitions between molecular quantum states caused by interaction with an external source, which can be another molecule during a collision event or an electromagnetic field in a laser-molecule interaction. Under such circumstances, the total Hamiltonian is the sum of the molecular Hamiltonian $\mathbf{H}_0$ and its interaction with the external source $\mathbf{V}$:

$$\mathbf{H} = \mathbf{H}_0 + \mathbf{V} \qquad [40]$$

Here, $\mathbf{v}_m$ are eigenvectors of the molecular Hamiltonian: $\mathbf{H}_0 \mathbf{v}_m = E_m^0 \mathbf{v}_m$.
A commonly used approach for computing the transition amplitudes is to approximate the propagator in the Krylov subspace, in a similar spirit to the time-dependent wave packet approach.7 For example, the Lanczos-based QMR has been used for $U(\mathbf{H}) = (E - \mathbf{H})^{-1}$ when calculating S-matrix elements from an initial channel $(\mathbf{v}_{m'})$.93–97 The transition amplitudes to all final channels $(\mathbf{v}_m)$ can be computed from the “cross-correlation functions,” namely their overlaps with the recurring vectors. Since the initial vector is given by $\mathbf{v}_{m'}$, only a column of the S-matrix can be obtained from a single Lanczos recursion.

The entire amplitude matrix can be calculated in a straightforward fashion if the complete set of eigenpairs $\{E_n, \mathbf{b}_n\}$ of the total Hamiltonian $(\mathbf{H})$ is known:

$$C_{mm'} = \sum_n \Gamma_{m,n} U(E_n) \Gamma_{m',n} = \sum_n R_{mm',n} U(E_n) \qquad [41]$$

where $\Gamma_{m,n} = \mathbf{v}_m^T \mathbf{b}_n$ are overlaps between $\mathbf{v}_m$ and the eigenvectors $\mathbf{b}_n$. The quantity $R_{mm',n} = \Gamma_{m,n}\Gamma_{m',n}$ is referred to as the residue.56 Recall, however, that the calculation of eigenvectors with the Lanczos algorithm is typically much more demanding in both CPU and core memory, so one should avoid such calculations as much as possible.

A closer look at Eq. [41] reveals that, in addition to the eigenvalues, only overlaps between the prespecified states and the eigenvectors of $\mathbf{H}$ are needed. Both are scalar quantities. It is thus desirable to develop methods that are capable of computing both the eigenvalues and the overlaps efficiently and accurately but with no explicit recourse to eigenvectors. Such a method was first proposed by Wyatt and co-workers.43–47,56 In their so-called recursive residue generation method (RRGM), both eigenvalues and overlaps are obtained using the Lanczos algorithm, without explicit calculation and storage of eigenvectors. In particular, the residue in Eq. [41] can be expressed as a linear combination of two residues:

$$R_{mm',n} = [R_{+,n} - R_{-,n}]/2 \qquad [42]$$

where

$$\mathbf{v}_{\pm} = (\mathbf{v}_m \pm \mathbf{v}_{m'})/\sqrt{2} \qquad [43]$$

$$R_{\pm,n} = [\mathbf{v}_{\pm}^T \mathbf{b}_n]^2 \qquad [44]$$

The two vectors in Eq. [43] are used to initiate two Lanczos recursions that yield not only converged eigenvalues but also the residues $R_{\pm,n}$. In particular, Wyatt and Scott have shown that these residues are simply the squared first elements of the Lanczos eigenvectors in a $K$-step recursion:46

$$R_{\pm,n}^{(K)} = \sum_i |z_{1i}^{(K)}|^2 \qquad [45]$$
where the sum runs over all multiple converged copies of the eigenpair, $E_i^{(K)} \approx E_n$, including the ones labeled as “spurious” by the Cullum–Willoughby test. This somewhat surprising result comes about because the initial vectors for the Lanczos recursions are the same vectors that define the residues. The eigenvectors $\mathbf{z}$ of Eq. [45] are defined in Eq. [44] and can be obtained by diagonalizing the tridiagonal matrix using QR or QL. Wyatt and Scott46 further showed that the first elements of all Lanczos eigenvectors can be calculated efficiently using a modified QL method,98 which iterates only the first row of the eigenvector matrix instead of the entire $\mathbf{Z}$ matrix. Consequently, the main numerical task in the RRGM is the Lanczos propagation and the subsequent QL.

Single Lanczos Propagation Method

Although improvements to the original RRGM were later proposed,47,99 multiple recursions are still needed to generate the full transition amplitude matrix. Clearly, one would like to minimize the number and length of the recursions because they usually represent the most time-consuming part of the calculation. One such method has recently been suggested by Chen and Guo.38,100 The premise of this so-called single Lanczos propagation (SLP) method is that projections of all prespecified states onto the eigenvectors can be obtained from a single Lanczos recursion starting with an arbitrary initial state. As a result, SLP should be more efficient than RRGM in calculating the entire transition amplitude matrix. We note that a closely related idea implemented with the Chebyshev recursion was proposed by Mandelshtam.101

To illustrate the principles of SLP, we note that a Lanczos recursion initiated by an arbitrary vector $\mathbf{q}_1$ can, in exact arithmetic, yield not only the eigenvalues of $\mathbf{H}$, but also overlaps of prespecified vectors with eigenvectors, as shown below:

$$\Gamma_{m,n}^{(K)} = \sum_k z_{kn}^{(K)} \mathbf{v}_m^T \mathbf{q}_k = \sum_k z_{kn}^{(K)} c_{m,k} \qquad [46]$$
where $c_{m,k} \equiv \mathbf{v}_m^T \mathbf{q}_k$ can be loosely regarded as correlation functions. This expression can thus be considered as a spectral method in which the spectrum $(\Gamma_{m,n}^{(K)})$ is obtained from the “correlation function” $(c_{m,k})$ by a transformation matrix $(\mathbf{Z})$, which is reminiscent of the Chebyshev spectral method that will be described below. Unlike RRGM, however, there is no restriction on the initial state. In fact, in the special case where $\mathbf{q}_1 = \mathbf{v}_m$, we have

$$\Gamma_{m,n}^{(K)} = \sum_k z_{kn}^{(K)} \mathbf{v}_m^T \mathbf{q}_k = \sum_k z_{kn}^{(K)} \delta_{k,1} = z_{1n}^{(K)} \qquad [47]$$

where we have used the orthonormality of the Lanczos vectors.
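Equations [46] and [47] can be checked on a toy problem with the Python sketch below. It is only an illustration under stated assumptions: the symbol `Gamma` mirrors the overlap notation used above, the helper name and the prespecified vectors are hypothetical, and the test matrix is built with a well-separated lowest state so that a single converged copy appears after a short recursion, which sidesteps the finite-precision complications of long recursions. The "correlation functions" $c_{m,k}$ are accumulated during the recursion, so no Lanczos vector is stored.

```python
import numpy as np

def lanczos_with_overlaps(H, q1, V, K):
    """K-step Lanczos recursion that accumulates c[m, k] = v_m^T q_k.

    The columns of V are the prespecified vectors; the Lanczos vectors
    themselves are discarded after use.
    """
    q_prev, q = np.zeros_like(q1), q1 / np.linalg.norm(q1)
    alpha, beta, b = np.zeros(K), np.zeros(K), 0.0
    c = np.zeros((V.shape[1], K))
    for k in range(K):
        c[:, k] = V.T @ q                   # overlaps with the current Lanczos vector
        w = H @ q - b * q_prev
        alpha[k] = q @ w
        w -= alpha[k] * q
        b = beta[k] = np.linalg.norm(w)
        q_prev, q = q, w / b
    return alpha, beta, c

rng = np.random.default_rng(1)
N, K = 100, 15
d = np.concatenate(([0.0], np.linspace(5.0, 10.0, N - 1)))  # isolated lowest level
A = 0.01 * rng.standard_normal((N, N))
H = np.diag(d) + (A + A.T) / 2

q1 = rng.standard_normal(N)
q1 /= np.linalg.norm(q1)
# Prespecified vectors: the initial vector itself plus two unit vectors.
V = np.column_stack([q1, np.eye(N)[:, 0], np.eye(N)[:, 1]])

alpha, beta, c = lanczos_with_overlaps(H, q1, V, K)
T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
e, Z = np.linalg.eigh(T)

Gamma = c @ Z[:, 0]                         # Eq. [46] for the lowest eigenpair
_, B = np.linalg.eigh(H)                    # dense reference, for comparison only
Gamma_exact = V.T @ B[:, 0]
print(np.abs(Gamma), np.abs(Gamma_exact), abs(Z[0, 0]))
```

Only absolute values are compared because the sign of an eigenvector is arbitrary. For the first prespecified vector, which equals the initial vector, the overlap reduces to the first element of the Lanczos eigenvector, as in Eq. [47] and as exploited by RRGM.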
Despite the simplicity of the above scheme, however, a straightforward implementation in finite-precision arithmetic may cause severe problems in calculating the overlaps. The only circumstance that permits the direct application of Eq. [46] in practical calculations is when there is only a single converged copy, whose normalization is always maintained even in finite-precision arithmetic.38 When multiple converged copies are present, the normalization of the approximate eigenvectors may not hold because of the loss of global orthogonality among the Lanczos vectors arising from round-off errors. Indeed, tests have shown that the norm of the approximate eigenvectors, $\|\mathbf{b}_i^{(K)}\|^2$, fluctuates once multiple converged copies appear.38 On the other hand, any converged eigenvector $\mathbf{b}_i$, as judged by the smallness of the last element of the corresponding $\mathbf{z}_i$, is nonetheless a good approximation of the true eigenvector of $\mathbf{H}$ $(\mathbf{b}_n)$ except for a normalization constant.

Despite these problems, it was realized that one can still compute the overlaps accurately by using the following formula:38

$$|\Gamma_{m,n}^{(K)}|^2 = \sum_i |\Gamma_{m,i}^{(K)}|^2 / N_n^{(K)} \qquad [48]$$

where the sum runs over all converged copies $\{E_i^{(K)}\}$ of the true eigenvalue $E_n$. This is possible because of a remarkable observation made by Chen and Guo on the Lanczos algorithm in finite-precision arithmetic;38 i.e., the sum of the squared norms of all converged copies of an eigenvalue equals the number of copies:

$$\sum_i \|\mathbf{b}_i^{(K)}\|^2 \cong N_n^{(K)} \qquad [49]$$
For real-symmetric systems, the above relation holds up to machine precision despite the fact that individual copies are not normalized.38 It also works reasonably well for complex-symmetric Hamiltonians.102 Unfortunately, there has not yet been a formal proof of this striking observation of the Lanczos algorithm. Equation [48] allows for the calculation of the squared overlaps, which are often sufficient for many problems. However, when the sign of an overlap is needed, it should be chosen to be that of the so-called “principal copy,” which has the largest $|z_{1i}|$ in the group. In many applications, the actual sign of the overlaps is not important as long as the same “principal copy” is used throughout the calculation.

The efficiency of the SLP method can be further improved. According to Eq. [46], both the overlaps $c_{m,k}$ and all the Lanczos eigenvectors $\mathbf{z}_i$ are needed. Although the latter can be obtained by QL, its explicit calculation is unnecessary, as noted by Chen and Guo.100 In particular, the overlaps $\Gamma_{m,i}^{(K)}$ can be obtained efficiently and directly without the explicit calculation of the Lanczos eigenvectors, which can lead to substantial savings for long recursions. This is
done using a modified QL method similar to the one used to compute the first or last elements of the Lanczos eigenvectors.46 In particular, the unit initial matrix in QL can be replaced by the diagonal matrix with the overlaps $(c_{m,k})$ in the diagonal positions, followed by the same QL iteration, details of which are given in Ref. 100.

Another version of SLP was proposed more recently.103 Instead of using renormalization, as alluded to above, the new scheme updates the prespecified vectors. Specifically, these vectors are modified at each Lanczos step:

$$\tilde{\mathbf{v}}_m^{(k)} = \tilde{\mathbf{v}}_m^{(k-1)} - \lambda^{(k-1)} \mathbf{q}_{k-1} \qquad [50]$$

where $\lambda^{(k-1)}$ is the projection of the $(k-1)$th Lanczos vector on the corresponding prespecified vector:

$$\lambda^{(k-1)} = \mathbf{q}_{k-1}^T \tilde{\mathbf{v}}_m^{(k-1)} \qquad [51]$$

The amplitude is finally computed as follows:

$$|\Gamma_{m,n}^{(K)}|^2 = \Big| \sum_i \sum_k z_{kn}^{(K)} [\tilde{\mathbf{v}}_m^{(k)}]^T \mathbf{q}_k \Big|^2 = \Big| \sum_i \sum_k z_{kn}^{(K)} \tilde{c}_{m,k} \Big|^2 \qquad [52]$$

where $\tilde{c}_{m,k} = [\tilde{\mathbf{v}}_m^{(k)}]^T \mathbf{q}_k$ and $i$ runs over all converged eigenvalues at $E_n$. Note that in comparison with Eq. [48], the normalization is avoided. The strategy here is based on earlier observations that the Lanczos recursion regenerates vectors that have already been generated, because of the loss of long-range orthogonality. These redundant vectors will make contributions to the overlap $(\tilde{c}_{m,k})$ that have already been accounted for. The above procedure essentially removes the contributions of these redundant Lanczos vectors from the prespecified vectors. In the ideal case where there is no loss of orthogonality among the Lanczos vectors, there is only one copy of a converged eigenpair, and Eq. [52] is identical to Eq. [48], and eventually Eq. [46]. Numerical tests showed that the results obtained from the two SLP versions agree to within machine precision.103

Both RRGM and SLP have been used to compute various transition amplitudes with high efficiency and accuracy. Their applications, which have been reviewed in the literature,56,57,59 include laser-molecule interaction,43,44,99 correlation functions,45,104 absorption and emission spectra,100,103,105–107 intramolecular energy transfer,108–115 vibrational assignment,103,116,117 and reaction dynamics.96,102,118–120
Expectation Values

It was demonstrated in the above subsection that the Lanczos algorithm can be used to compute scalar quantities such as transition amplitudes without explicit calculation and storage of the eigenvectors. We discuss here another
low-storage approach that allows for the calculation of the expectation value of an operator $(\hat{\Omega})$ that does not commute with the Hamiltonian $([\hat{\Omega}, \hat{H}] \neq 0)$.121 Such an operator could be, say, $R^2$, which is often used in assigning vibrational quantum numbers. Our perturbative scheme starts with the following effective Hamiltonian:

$$\hat{H}' = \hat{H} + \lambda\hat{\Omega} \qquad [53]$$

where $\lambda$ is a sufficiently small parameter. The Lanczos recursion under the effective Hamiltonian yields a set of eigenvalues $\{E'_n(\lambda)\}$. The expectation value of $\hat{\Omega}$ can then be computed using the Hellmann–Feynman theorem:

$$\langle \Omega \rangle = \langle E_n | \hat{\Omega} | E_n \rangle \cong \frac{E'_n(\lambda) - E_n}{\lambda} \qquad [54]$$

where $E_n = E'_n(\lambda \to 0)$ are obtained from another Lanczos recursion under the original Hamiltonian. Numerical tests indicated that the accuracy is reasonable.
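The finite-difference use of the Hellmann–Feynman theorem in Eq. [54] can be checked numerically with the short Python sketch below. The matrices, the value of $\lambda$, and the variable names are hypothetical choices for illustration, and dense diagonalization stands in for the two Lanczos recursions purely to keep the example brief. Only eigenvalues enter the estimate; the eigenvectors are computed solely to provide a reference.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 50
A = 0.01 * rng.standard_normal((N, N))
H = np.diag(np.arange(N, dtype=float)) + (A + A.T) / 2   # toy Hamiltonian
Omega = np.diag(np.linspace(0.0, 1.0, N) ** 2)           # does not commute with H

lam = 1e-4                                   # small perturbation parameter
E = np.linalg.eigvalsh(H)                    # eigenvalues of H
E_prime = np.linalg.eigvalsh(H + lam * Omega)  # eigenvalues of H' = H + lam * Omega
estimate = (E_prime - E) / lam               # Eq. [54], from eigenvalues only

# Reference expectation values <E_n| Omega |E_n> from explicit eigenvectors.
_, B = np.linalg.eigh(H)
reference = np.einsum("in,ij,jn->n", B, Omega, B)
print(np.abs(estimate - reference).max())
```

The choice of $\lambda$ is a compromise: the truncation error of the finite difference grows linearly with $\lambda$, whereas the round-off error in the eigenvalue difference is amplified by $1/\lambda$.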
CHEBYSHEV RECURSION
Chebyshev Operator and Cosine Propagator
The generation of Krylov subspaces by the Lanczos recursion discussed in the previous section is just one of several strategies for recursive diagonalization of a large, real-symmetric matrix. Indeed, Krylov subspaces can also be generated using three-term recursion relations of classic orthogonal polynomials. The Chebyshev polynomials, for example, use the following recursion formula:122
$$T_k = 2xT_{k-1} - T_{k-2} \quad \text{for } k \ge 2 \qquad [55]$$
with $T_1 = x$ and $T_0 = 1$. The variable $x$ is defined on the real axis in $[-1, 1]$ and the polynomials diverge exponentially outside this range. Starting with a normalized initial vector $q_0$, one can generate the Chebyshev vectors recursively:
$$q_k \equiv T_k(H)q_0 = 2Hq_{k-1} - q_{k-2} \quad \text{for } k \ge 2 \qquad [56]$$
with $q_1 = Hq_0$. Here, the Hamiltonian matrix has to be scaled so that all of its eigenvalues lie in $[-1, 1]$. This is achieved readily by setting $H_{\text{scaled}} = (H - H^{+})/H^{-}$ with $H^{\pm} = (H_{\max} \pm H_{\min})/2$. (In the discussion below, the Hamiltonian is assumed to have been scaled.) The extrema of the spectrum ($H_{\min}$ and $H_{\max}$) can be estimated by using, for example, a short Lanczos recursion. The Chebyshev vectors span a Krylov space, but unlike the Lanczos vectors they are not orthogonal. The scaling laws for the Chebyshev recursion are essentially the same as for the Lanczos recursion.
The usefulness of the Chebyshev polynomials as both efficient and accurate building blocks in numerically approximating operator functions was realized first by Tal-Ezer and Kosloff,79,123 and later by Kouri, Hoffman, and coworkers.124–129 Aspects of their pioneering work will be discussed later in this review. A unique and well-known property of the Chebyshev polynomials is that they can be mapped onto a cosine function:
$$T_k(E) = \cos(k \arccos E) \equiv \cos(k\theta) \qquad [57]$$
with $\theta \equiv \arccos E$. In essence, then, the Chebyshev polynomials are a cosine function in disguise. This duality underscores the utility of the Chebyshev polynomials in numerical analysis, which has long been recognized by many,130 including Lanczos.131 It is straightforward to extend the definition to a matrix or to an operator:123
$$T_k(H) = \cos(k \arccos H) \equiv \cos k\Theta \qquad [58]$$
with $\Theta \equiv \arccos H$. The mapping is unique if the spectrum of $H$ is in $[-1, 1]$. Interestingly, the Chebyshev operator defined in Eq. [58] can be considered as the real part of an evolution operator or propagator ($e^{-ik\Theta}$). In other words, the Chebyshev operator can be regarded as a discrete cosine propagator with $k$ as the discrete generalized time and $\Theta$ as the effective Hamiltonian.132–134 For this reason, we will use the words "propagation" and "recursion" interchangeably when describing the Chebyshev recursion. It can be further shown that the Chebyshev order ($k$) and angle ($\theta$) form a conjugate pair of variables, similar to energy and time.135 The two conjugated representations are related by an orthogonal cosine transform. Thus, properties in the angle domain can be extracted readily from propagation in the order domain and the convergence is uniform. The Chebyshev angle does not introduce any complication because its mapping to energy is single-valued, albeit nonlinear. In many cases we are interested in the dynamics near the low end of the eigenspectrum of the Hamiltonian and the nonlinear mapping actually provides more interpolation points in this range (see Figure 4), and thus leads to better and faster convergence.
The propagator nature of the Chebyshev operator is not merely a formality; it has several important numerical implications.136 Because of the similarities between the exponential and cosine propagators, any formulation based on time propagation can be readily transplanted to one that is based on the Chebyshev propagation. In addition, the Chebyshev propagation can be implemented easily and exactly with no interpolation errors using Eq. [56], whereas in contrast the time propagator has to be approximated.
Like the time propagation, the major computational task in Chebyshev propagation is repetitive matrix-vector multiplication, a task that is amenable to sparse matrix techniques with favorable scaling laws. The memory requirement is minimal because the Hamiltonian matrix need not be stored and its action on the recurring vector can be generated on the fly. Finally, the Chebyshev propagation can be performed in real space as long as a real initial wave packet and real-symmetric Hamiltonian are used.
The recursion scheme in Eq. [56] is very stable for real-symmetric (or Hermitian) Hamiltonian matrices. However, it might diverge for complex-symmetric matrices, such as those used to describe resonances (see below).6 This divergence arises because the complex-symmetric Hamiltonian (e.g., $\hat{H}' = \hat{H} - iV$, where $V$ is the optical potential137–139) has complex eigenvalues, whereas the Chebyshev polynomials are defined on the real axis. To avoid this problem, Mandelshtam and Taylor proposed replacing the negative imaginary potential with the following damping scheme:140,141
$$q^d_k = D(2Hq^d_{k-1} - Dq^d_{k-2}) \qquad [59]$$
where $q^d_1 = DHq_0$. The damping function ($D$) is real, decays smoothly from unity in the asymptotic region, and has the effect of removing outgoing waves near the end of the grid. These authors further demonstrated that such a damping term is related to an energy-dependent optical potential,140 whose form can be chosen arbitrarily as long as it enforces the outgoing boundary conditions.142–144 The advantage of such a damping scheme is that the corresponding wave packet can still be propagated in real space, which greatly enhances the applicability and efficiency of the Chebyshev propagator for systems containing continua.133
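The scaling step and the recursions in Eqs. [56] and [59] can be sketched as below. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the spectral bounds come from a dense eigensolver rather than a short Lanczos recursion, and the damping function is assumed to be diagonal in the working representation, so that it can be passed as a vector `damp` and applied elementwise.

```python
import numpy as np

def scale_hamiltonian(H, Hmin, Hmax):
    """Map the spectrum of H into [-1, 1]: H_scaled = (H - H+) / H-."""
    Hplus, Hminus = 0.5 * (Hmax + Hmin), 0.5 * (Hmax - Hmin)
    return (H - Hplus * np.eye(H.shape[0])) / Hminus

def chebyshev_vectors(Hs, q0, K, damp=None):
    """Return [q_0, ..., q_K] from Eq. [56], or from Eq. [59] when `damp` is given.

    `damp` holds the diagonal of D (assumed diagonal here); damp=None means D = 1.
    """
    D = np.ones_like(q0) if damp is None else damp
    q = [q0, D * (Hs @ q0)]                           # q_1 = D H q_0
    for _ in range(2, K + 1):
        q.append(D * (2.0 * (Hs @ q[-1]) - D * q[-2]))
    return q

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
H = 0.5 * (A + A.T)                                   # hypothetical real-symmetric Hamiltonian
e = np.linalg.eigvalsh(H)                             # dense stand-in for a short Lanczos estimate
Hs = scale_hamiltonian(H, e[0] - 0.1, e[-1] + 0.1)    # small safety margin on both ends
q0 = rng.standard_normal(8)
q0 /= np.linalg.norm(q0)

K = 50
q = chebyshev_vectors(Hs, q0, K)

# Check Eq. [58]: q_K should equal cos(K arccos H_scaled) q_0.
es, Us = np.linalg.eigh(Hs)
ref = Us @ (np.cos(K * np.arccos(es)) * (Us.T @ q0))
print(np.max(np.abs(q[K] - ref)))
```

Only the two most recent vectors are needed to continue the recursion; the full list is kept here solely so that the last vector can be compared with the cosine form.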
Spectral Method
How does one extract eigenpairs from Chebyshev vectors? One possibility is to use the spectral method. The commonly used version of the spectral method is based on the time-energy conjugacy and extracts energy domain properties from those in the time domain.145,146 In particular, the energy wave function, obtained by applying the spectral density, or Dirac delta filter, operator ($\delta(E - \hat{H})$) onto an arbitrary initial wave function ($|\Psi(0)\rangle$):
$$|\Psi(E)\rangle \equiv \delta(E - \hat{H})|\Psi(0)\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\, e^{i(E - \hat{H})t}|\Psi(0)\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\, e^{iEt}|\Psi(t)\rangle \qquad [60]$$
is expressed as an exponential Fourier transform of the time-dependent wave packet: $|\Psi(t)\rangle \equiv e^{-i\hat{H}t}|\Psi(0)\rangle$. Similarly, the energy spectrum can be obtained as an exponential Fourier transform of the autocorrelation function ($C(t) \equiv \langle\Psi(0)|\Psi(t)\rangle$):
$$\eta(E) \equiv \langle\Psi(0)|\Psi(E)\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\, e^{iEt}\langle\Psi(0)|\Psi(t)\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\, e^{iEt}C(t) \qquad [61]$$
As pointed out in the previous section, the Chebyshev operator can be viewed as a cosine propagator. By analogy, both the energy wave function and the spectrum can also be obtained using a spectral method. More specifically, the spectral density operator can be defined in terms of the conjugate Chebyshev order ($k$) and Chebyshev angle ($\theta$):128,132
$$\delta(EI - H) = \frac{1}{\pi\sin\theta}\sum_{k=0}^{\infty}(2 - \delta_{k0})\cos k\theta\,\cos k\Theta = \frac{1}{\pi\sqrt{1 - E^2}}\sum_{k=0}^{\infty}(2 - \delta_{k0})\,T_k(E)\,T_k(H) \qquad [62]$$
where the angles in the first equation are the mapped energy and Hamiltonian, respectively, as defined before. Applying the delta operator onto an arbitrary initial vector $q_0$, one can expect to obtain a "filtered" vector near $E$:
$$g(E) \equiv \delta(EI - H)q_0 = \frac{1}{\pi\sin\theta}\sum_{k=0}^{\infty}(2 - \delta_{k0})\cos k\theta\; q_k \qquad [63]$$
where the Chebyshev vectors $q_k$ are generated recursively using Eq. [56]. In other words, the energy domain vector is a cosine Fourier transform of the Chebyshev vectors. Compared with time propagation, the Chebyshev recursion is much more efficient because it can be carried out exactly using Eq. [56], whereas the time propagator has to be approximated. A desirable feature of Eq. [63] is that one can scan the energy axis looking for eigenvectors in the spectral regions of interest. In the limit where a sufficiently large number of terms in Eq. [63] is included, $g(E)$ will be the (unnormalized) eigenvector when the energy variable ($E$) hits an eigenvalue ($E_n$). All eigenvectors can be obtained from the same set of Chebyshev vectors because the Chebyshev vectors are energy global, i.e., independent of $E$ (or $\theta$). Thus, all information about the energy is contained in the expansion coefficients. In reality, as will be discussed below, the infinite sum in Eq. [63] is always truncated and $g(E)$ will only approximate an eigenvector when the width of the truncated delta filter is narrower than the local level spacing (the inverse of the spectral density of the region).
One can extend Eq. [63] to compute the entire eigenspectrum of $H$. This can be achieved by calculating and storing, along the Chebyshev recursion, the autocorrelation function, which is the overlap between the initial vector and the Chebyshev vectors:
$$C_k \equiv q_0^T q_k \qquad [64]$$
It is easy to show that the eigenspectrum of $H$ is simply a cosine Fourier transform of the Chebyshev autocorrelation function:
$$\eta(E) \equiv q_0^T\,\delta(EI - H)\,q_0 = \frac{1}{\pi\sin\theta}\sum_{k=0}^{\infty}(2 - \delta_{k0})\cos k\theta\; C_k \qquad [65]$$
The efficient FFT scheme can be used to extract the eigenspectrum from the correlation function. We note in passing that the eigenspectrum can also be obtained using cross-correlation functions: $C'_k \equiv p^T q_k$, where the vector $p$ can be chosen arbitrarily. Once the eigenvalues are determined accurately, the corresponding (unnormalized) eigenvectors can be assembled from a second recursion:
$$b_n = \delta(E_n I - H)q_0 = \frac{1}{\pi\sin\theta_n}\sum_{k=0}^{\infty}(2 - \delta_{k0})\cos k\theta_n\; q_k \qquad [66]$$
where $\theta_n = \arccos E_n$. The cosine form of the Chebyshev propagator also affords symmetry in the effective time domain, which allows for doubling of the autocorrelation function. In particular, $2K$ values of the autocorrelation function can be obtained from a $K$-step propagation:147
$$C_{2k} \equiv q_0^T q_{2k} = 2q_k^T q_k - q_0^T q_0 \qquad [67]$$
$$C_{2k+1} \equiv q_0^T q_{2k+1} = 2q_{k+1}^T q_k - q_1^T q_0 \qquad [68]$$
based on the trigonometric relationship:
$$2\cos(k_1\theta)\cos(k_2\theta) = \cos[(k_1 + k_2)\theta] + \cos[(k_1 - k_2)\theta] \qquad [69]$$
It is advantageous to use the doubling property of the autocorrelation function to reduce computational costs. Apart from the delta filter discussed here, one can define other filters using the same Chebyshev operators. In fact, any analytic function of the Hamiltonian can be expressed as an expansion in terms of the Chebyshev operator.148 For instance, the Green filter can be expressed as follows:126,127,149
$$G(E) = (EI - H)^{-1} = \frac{-i}{\sqrt{1 - E^2}}\sum_{k=0}^{\infty}(2 - \delta_{k0})\,e^{-ik\arccos E}\,T_k(H) \qquad [70]$$
Indeed, the Green filter and the delta filter operator are related:
$$\delta(EI - H) = -\frac{1}{\pi}\,\mathrm{Im}\,G(E) \qquad [71]$$
and the spectral method illustrated for the delta filter can be readily implemented for the Green filter. In that case, an exponential Fourier transform replaces the cosine Fourier transform.
The major shortcoming of the spectral method is the rate of convergence. Its ability to resolve eigenvalues is restricted by the width of the filter, which in turn is inversely proportional to the length of the Fourier series (the uncertainty principle). Thus, to accurately characterize an eigenpair in a dense spectrum, one might have to use a very long Chebyshev recursion.
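The spectral method with the doubling relations of Eqs. [67] and [68] can be sketched as follows. The example is a toy illustration under stated assumptions: the four-level scaled Hamiltonian, the equal-overlap initial vector, and the function names are all hypothetical, and the truncated form of Eq. [65] is evaluated by a direct cosine sum rather than by FFT.

```python
import numpy as np

def autocorrelation_by_doubling(Hs, q0, K):
    """K matrix-vector products give C_0 .. C_2K through Eqs. [67] and [68]."""
    C = np.zeros(2 * K + 1)
    q_prev, q_cur = q0, Hs @ q0              # q_0 and q_1
    C[0], C[1] = q0 @ q0, q0 @ q_cur
    for k in range(1, K + 1):                # q_cur holds q_k at the top of the loop
        C[2 * k] = 2.0 * (q_cur @ q_cur) - C[0]
        if k < K:
            q_next = 2.0 * (Hs @ q_cur) - q_prev
            C[2 * k + 1] = 2.0 * (q_next @ q_cur) - C[1]
            q_prev, q_cur = q_cur, q_next
    return C

def spectrum(C, E):
    """Truncated Eq. [65]: cosine transform of C_k evaluated at energies |E| < 1."""
    theta = np.arccos(np.atleast_1d(E))
    k = np.arange(len(C))
    w = np.where(k == 0, 1.0, 2.0)           # the factor (2 - delta_k0)
    return (np.cos(np.outer(theta, k)) @ (w * C)) / (np.pi * np.sin(theta))

# Hypothetical scaled Hamiltonian with known eigenvalues.
levels = np.array([-0.8, -0.3, 0.1, 0.6])
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Hs = U @ np.diag(levels) @ U.T
q0 = U @ np.full(4, 0.5)                     # unit vector with equal overlap on every eigenvector

C = autocorrelation_by_doubling(Hs, q0, K=500)
peak, valley = spectrum(C, [-0.3, -0.1])     # on an eigenvalue, and between two eigenvalues
print(peak, valley)
```

The peak height grows linearly with the number of correlation points while the background between levels stays bounded, which is the resolution argument made in the text: a denser spectrum requires a proportionally longer recursion.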
FILTER-DIAGONALIZATION
Spectral methods, whether based on the time or Chebyshev propagation, converge slowly because of the uncertainty principle. This behavior stems from the fact that the time or Chebyshev propagation is energy global, thus providing a uniform convergence in the entire energy range. Neuhauser suggested an elegant scheme that combines propagation with diagonalization to overcome this shortcoming.150–152 The central idea is to first construct a set of primitive energy bases in a prespecified energy window via propagation, a process that is denoted as filtering. These primitive bases need not be well resolved in energy. Consequently, the propagation length can be shortened significantly. The final resolution of the eigenpairs is achieved by solving a small local generalized eigenproblem via direct diagonalization. The original filter-diagonalization (FD) method of Neuhauser was formulated based on time propagation and the time-energy conjugacy. It has since been extended to other types of propagation.76,77,101,135,147,148,153–164 Some aspects of the FD method have been reviewed.58,136,165,166 Here, we discuss two implementations based on the Chebyshev and Lanczos recursions.
Filter-Diagonalization Based on Chebyshev Recursion
To describe the FD method,150–152 we first define a filter operator using the Chebyshev propagator. The definition of a filter operator using the time propagator can be given in a similar manner, but it is not discussed here because it is considered to be inferior to the Chebyshev-based approach. However, we note in passing that the time-based FD is very important in signal processing, which is an important topic in many fields. The form of the filter is flexible, but it should enrich components near the energy of interest and depress contributions elsewhere. Both the Green operator and the spectral density operator are good examples of a filter operator. For numerical convenience, we define a generalized filter operator centered at $E_l$ as follows:135,148
$$F(H|E_l) \equiv \sum_{k=0}^{K} f_k(E_l)\,T_k(H) \qquad [72]$$
where the expansion coefficients are obtained from a cosine Fourier transform of an analytic filter function:
$$f_k(E_l) = \frac{2 - \delta_{k0}}{\pi}\int_{-1}^{1} dE\,\frac{\mathcal{F}(E|E_l)\,T_k(E)}{\sqrt{1 - E^2}} = \frac{2 - \delta_{k0}}{\pi}\int_{0}^{\pi} d\theta\,\mathcal{F}(\cos\theta|\cos\theta_l)\cos(k\theta) \qquad [73]$$
where $E = \cos\theta$ and $E_l = \cos\theta_l$. The analytical filter $\mathcal{F}(E|E_l)$ may be the Green function ($(E - E_l)^{-1}$),124,153 the delta function ($\delta(E - E_l)$),128 or a Gaussian as in Eq. [74]:148
$$\mathcal{F}(E|E_l) = e^{-(E - E_l)^2/\sigma^2} \qquad [74]$$
where $\sigma$ is related to the width of the filter. Other forms of the filter function can also be used.79,167,168 The reason that the generalized filter is defined as an expansion, rather than in the analytical form, is purely a numerical one, namely that one has to truncate the expansion in a real calculation. The inclusion of a finite number of Chebyshev terms in Eq. [72] also allows for the use of a $K$-point energy grid based on the Gauss–Chebyshev quadrature points, which are the equidistant Fourier grid points in the Chebyshev angle domain. By defining the filter in terms of a finite sum of Chebyshev terms, a discrete orthogonal transformation can be established between the so-called discrete energy representation (DER), defined by the energy grid, and the generalized time representation (GTR), defined by the Chebyshev order ($k$).135 In addition, coefficients in Eq. [73] can be obtained by a discrete cosine Fourier transform. Numerically, FFT19 can be used for long recursions.
The generalized filter defined in Eq. [72] converges rapidly and uniformly to the corresponding analytical filter when $K \to \infty$. In fact, it is well known that the best interpolation of a nonperiodic function is given by the Chebyshev polynomials because they provide the smallest maximum errors (the minimax property).130 When $K$ is finite, the generalized filter may deviate from the analytic function, due, for example, to the Gibbs phenomenon. However, the deviations can be reduced by increasing $K$, as shown in Figure 5.
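The convergence of the truncated expansion toward the analytic filter can be checked with a short sketch. Here the coefficients of Eq. [73] for the Gaussian of Eq. [74] are computed by a midpoint rule on Gauss–Chebyshev angles, and the scalar counterpart of Eq. [72] is compared with the Gaussian itself. The function names, the number of quadrature points `M`, and the values of `El` and `sigma` are illustrative assumptions.

```python
import numpy as np

def filter_coefficients(F, K, M=1024):
    """f_0 .. f_K of Eq. [73], using M Gauss-Chebyshev angles theta_j = (j + 1/2) pi / M."""
    theta = (np.arange(M) + 0.5) * np.pi / M
    k = np.arange(K + 1)
    w = np.where(k == 0, 1.0, 2.0)                   # the factor (2 - delta_k0)
    # (1/pi) * integral over [0, pi] becomes (1/M) * sum over the M angles.
    return w / M * (np.cos(np.outer(k, theta)) @ F(np.cos(theta)))

def evaluate_filter(f, E):
    """Sum_k f_k T_k(E): the scalar counterpart of the operator expansion in Eq. [72]."""
    k = np.arange(len(f))
    return np.cos(np.outer(np.arccos(E), k)) @ f

El, sigma = 0.2, 0.1                                 # hypothetical filter center and width
gauss = lambda E: np.exp(-((E - El) / sigma) ** 2)   # Eq. [74]

E = np.linspace(-0.99, 0.99, 199)
errors = {K: np.max(np.abs(evaluate_filter(filter_coefficients(gauss, K), E) - gauss(E)))
          for K in (20, 120)}
print(errors)
```

For a filter of width `sigma`, the number of terms needed scales roughly as the inverse of the width measured in the Chebyshev angle, which is why narrow filters require long recursions.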
Figure 5 Gaussian filter approximated by Chebyshev expansion with various numbers of terms. Adapted with permission from Ref. 148.
Once the filter is defined, it is a simple matter to generate the filtered vectors. This can be done in a parallel fashion using the same set of Chebyshev vectors:
$$g_l \equiv F(H|E_l)q_0 = \sum_{k=0}^{K} f_k(E_l)\,q_k, \qquad l = 1, 2, \ldots, L \qquad [75]$$
where $E_l$ are chosen in the spectral window of interest ($[E_{\text{lower}}, E_{\text{upper}}]$) and $q_k$ are generated recursively using Eq. [56] directly or via the damped Chebyshev recursion (Eq. [59]). With a sufficiently large $L$, these vectors span the local eigenspace and the eigenpairs in the energy range can be obtained by solving the following small dimensional generalized eigenproblem:
$$\mathbf{H}\mathbf{B} = \mathbf{S}\mathbf{B}\mathbf{E} \qquad [76]$$
where the Hamiltonian and overlap matrices are given in terms of the filtered vectors:
$$H_{ll'} = g_l^T H g_{l'}, \qquad S_{ll'} = g_l^T g_{l'} \qquad [77]$$
and $\mathbf{E}$ and $\mathbf{B}$ contain the eigenvalues and eigenvectors, respectively. Because the dimensionality of the matrices is small, direct generalized eigensolvers, such as the generalized Gram–Schmidt orthogonalization and singular value decomposition (SVD),19 can be used to remove the linear dependence of the basis. In some cases, the removal of the linear dependence in the filtered vectors by SVD can be arbitrary. We have found that the RGG approach in EISPACK169 is more reliable than the SVD-based approach. The FD scheme is illustrated in Figure 6.
Figure 6 Schematic procedure for filter-diagonalization.
We note in passing that the spectral method can be regarded as a special case of FD with $L = 1$, for which the uncertainty principle dictates that the spectral resolution is inversely proportional to the propagation length. In FD, the spectral resolution is enhanced beyond the uncertainty principle, because of the diagonalization of the Hamiltonian in the subspace spanned by multiple filtered vectors.
In practice, the filtering energies ($E_l$) are often chosen as the Gauss–Chebyshev quadrature points in the energy range $[E_{\text{lower}}, E_{\text{upper}}]$ for a particular $K$. Both $K$ and $L$ are often treated as convergence parameters. Generally, FD works well for well-separated eigenvalues and those near the spectral extrema. The better convergence for the extremal eigenvalues can be readily understood because the cosine mapping affords more interpolation points near both ends of the spectrum (see Figure 4). As expected, the resolution of closely spaced eigenpairs requires longer recursion. Mandelshtam and Taylor proposed the following estimate of the number of recursion steps:154
$$K \approx 2\rho\,\Delta H \qquad [78]$$
where $\rho$ is the local density of states and $\Delta H$ is the spectral range. As with the Lanczos algorithm, the dependence of the convergence rate on the spectral range is an issue of great importance. Thus, one often places a premium on minimizing $\Delta H$ in FD and other recursive diagonalization methods. Other caveats of FD include problems with eigenpairs that have small overlaps with the initial vector and the need to identify spurious eigenvalues arising from the linear dependence of the filtered vectors. Hence, careful convergence tests are essential. The scaling laws of FD are dominated by the recursion because diagonalization of small matrices is relatively inexpensive. However, because one must store multiple filtered vectors along the recursion, FD could be a burden for large systems.
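The FD procedure of Eqs. [72] to [77] can be sketched end to end on a deliberately easy toy problem. Everything specific here is an assumption made for illustration: the 14-level scaled Hamiltonian, the Gaussian filter width, the six filter energies, and the function names. Linear dependence among the filtered vectors is removed by discarding small eigenvalues of the overlap matrix (canonical orthogonalization), which is an SVD-like choice and not the EISPACK RGG route mentioned in the text.

```python
import numpy as np

def gaussian_coefficients(El, sigma, K, M=2048):
    """Chebyshev coefficients (Eq. [73]) of the Gaussian filter (Eq. [74]) by quadrature."""
    theta = (np.arange(M) + 0.5) * np.pi / M
    F = np.exp(-((np.cos(theta) - El) / sigma) ** 2)
    k = np.arange(K + 1)
    return np.where(k == 0, 1.0, 2.0) / M * (np.cos(np.outer(k, theta)) @ F)

def filter_diagonalize(Hs, q0, centers, sigma, K, tol=1e-10):
    """Build filtered vectors (Eq. [75]) and solve HB = SBE (Eq. [76]) in their span."""
    f = np.array([gaussian_coefficients(El, sigma, K) for El in centers])  # shape (L, K+1)
    G = np.zeros((len(centers), q0.size))            # row l is the filtered vector g_l
    q_prev, q_cur = q0, Hs @ q0
    G += np.outer(f[:, 0], q_prev) + np.outer(f[:, 1], q_cur)
    for k in range(2, K + 1):                        # one recursion feeds all L filters
        q_prev, q_cur = q_cur, 2.0 * (Hs @ q_cur) - q_prev
        G += np.outer(f[:, k], q_cur)
    S, Hfd = G @ G.T, G @ Hs @ G.T                   # Eq. [77]
    s, V = np.linalg.eigh(S)
    keep = s > tol * s.max()                         # drop linearly dependent directions
    X = V[:, keep] / np.sqrt(s[keep])
    E, Y = np.linalg.eigh(X.T @ Hfd @ X)
    return E, X @ Y                                  # eigenvalues and coefficients B

# Toy spectrum: four levels inside the window, ten far outside it.
cluster = np.array([-0.075, -0.025, 0.025, 0.075])
far = np.concatenate([np.linspace(-0.95, -0.8, 5), np.linspace(0.8, 0.95, 5)])
levels = np.concatenate([cluster, far])
rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.standard_normal((levels.size, levels.size)))
Hs = U @ np.diag(levels) @ U.T
q0 = U @ np.full(levels.size, levels.size ** -0.5)

E, B = filter_diagonalize(Hs, q0, np.linspace(-0.1, 0.1, 6), sigma=0.1, K=120)
print(E)
```

Each Gaussian filter here is wider than the level spacing in the window, so no single filtered vector is an eigenvector; the levels are separated only by the small diagonalization. In realistic calculations the states outside the window are not so cleanly suppressed, and both `K` and the number of filters have to be converged as described above.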
Low-Storage Filter-Diagonalization
The FD method can be further extended to a low-storage version, often denoted as LSFD. Wall and Neuhauser first showed that the matrix elements in Eq. [77] can be obtained directly from the time autocorrelation function, without explicitly resorting to the filtered vectors.147 In doing so, the construction and storage of the filtered vectors can be avoided, which is advantageous when only eigenvalues are of interest. Mandelshtam and Taylor extended the idea to the delta filter based on the Chebyshev propagation, and they derived analytical equations for calculating both the $\mathbf{S}$ and $\mathbf{H}$ matrices from the correlation function.154,156 Chen and Guo further developed several efficient schemes to construct the matrix elements for general filters.158,162 To illustrate the LSFD idea, we note that for the delta filter (Eq. [62]), the matrix elements can be written as
$$S_{ll'} = \sum_{k=0}^{K}\sum_{k'=0}^{K}(2 - \delta_{k0})(2 - \delta_{k'0})\cos k\theta_l\,\cos k'\theta_{l'}\; q_k^T q_{k'} \qquad [79]$$
$$H_{ll'} = \sum_{k=0}^{K}\sum_{k'=0}^{K}(2 - \delta_{k0})(2 - \delta_{k'0})\cos k\theta_l\,\cos k'\theta_{l'}\; q_k^T H q_{k'} \qquad [80]$$
where $E_l = \cos\theta_l$. An important observation is that the last terms in Eqs. [79] and [80] are related to the autocorrelation function ($C_k$):
$$q_k^T q_{k'} = (C_{k+k'} + C_{k-k'})/2 \qquad [81]$$
$$q_k^T H q_{k'} = (C_{k+k'+1} + C_{k+k'-1} + C_{k-k'+1} + C_{k-k'-1})/4 \qquad [82]$$
where the trigonometric relationship of Eq. [69] is used. In addition, the double sums in Eqs. [79] and [80] can be reduced analytically to a single sum,154 or evaluated via FFT.162 The calculation of the matrix elements can be carried out in quadruple precision to minimize errors. The final formulas for computing these matrix elements can be found in the original publications154,156,158,162,170 and are not given here. The solution of the generalized eigenproblem is handled the same way as before. One can also compute any selected eigenvector and its overlap with the initial vector used in the Chebyshev recursion:
$$b_n = \sum_{k=0}^{K}(2 - \delta_{k0})\left[\sum_{l} B_{ln}\cos k\theta_l\right] q_k \qquad [83]$$
$$\eta_n = q_0^T b_n = \sum_{l} B_{ln}\sum_{k=0}^{K}(2 - \delta_{k0})\cos k\theta_l\; C_k \qquad [84]$$
where the eigenvectors $\mathbf{B}$ are obtained by solving the generalized eigenequation in Eq. [76]. Note that the eigenvectors require an additional Chebyshev recursion because the Chebyshev vectors are normally not stored. The convergence of the eigenvalues obtained by LSFD can be checked by varying $L$ and $K$. For practical purposes, $L$ is typically on the order of 100. The number of recursion steps ($K$) is proportional to the average density of states and the spectral range. The error of a particular eigenvalue can be estimated from its dispersion $\|(\mathbf{H}^{(2)} - E_n^2\mathbf{S})\mathbf{b}_n\|$, where $[\mathbf{H}^{(2)}]_{ll'} = g_l^T H^2 g_{l'}$ can also be expressed in terms of the autocorrelation function.147,156
The LSFD method has essentially the same memory requirement as the Lanczos algorithm. Likewise, the CPU scaling law is similar and dominated by the recursion (because the numerical costs for solving the generalized eigenproblem are usually much smaller). In addition, the energy global nature of the propagator allows one to determine the eigenvalues in multiple spectral windows from a single autocorrelation function, whereas the energy grid $E_l$ ($l = 1, 2, \ldots, L$) has to be defined a priori in the original FD. Of course, a much longer Chebyshev recursion might be needed for spectral windows in the interior of the spectrum. The relative merits of LSFD and the Lanczos algorithm will be discussed below.
Most LSFD applications to date are based on the autocorrelation function obtained by propagating a single vector. However, it has been pointed out that the use of cross-correlation functions might be beneficial in determining the eigenvalues with LSFD, particularly for cases with high levels of degeneracy or fast deteriorating signals.147,171 This can be achieved by propagating multiple initial vectors and by computing the cross-correlation matrix at every step, similar to the block-Lanczos algorithm discussed above. The Hamiltonian and overlap matrices in Eq. [76] can be obtained in a similar fashion, albeit with a much larger dimension because of the block nature of the correlation functions. The benefits of cross-correlation functions are obvious because they have much higher information content than does the autocorrelation function.
We note that the Chebyshev recursion-based LSFD can be used to extract frequencies from a time signal in the following form:
$$C_k = \sum_{n} a_n\,e^{-ik\tau E_n} \qquad [85]$$
by assuming the signal corresponds, implicitly, to a quantum system in which the explicit form of the Hamiltonian is of no importance. Specifically, the signal in Eq. [85] can be considered as a correlation function under a discrete time propagator with a step of $\tau$. This strategy is successful for classical and semiclassical dynamics,101,170,172–176 and for NMR signal processing.170,176–178 In addition, several signal processing methods related to LSFD have been proposed by various authors.179,180 Interested readers are referred to an excellent review by Mandelshtam.166
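The low-storage idea can be made concrete by building the matrices of Eqs. [79] and [80] from the autocorrelation function alone, via Eqs. [81] and [82], and comparing them with the matrices formed from explicitly stored filtered vectors. The sketch below evaluates the double sums directly rather than using the analytical single-sum or FFT schemes cited in the text; the test Hamiltonian, filter energies, and function name are hypothetical.

```python
import numpy as np

def lsfd_matrices(C, centers, K):
    """S and H of Eqs. [79]-[80] from C_0 .. C_{2K+1} only, using Eqs. [81]-[82]."""
    k = np.arange(K + 1)
    A = np.where(k == 0, 1.0, 2.0) * np.cos(np.outer(np.arccos(centers), k))  # (L, K+1)
    c = lambda m: C[np.abs(m)]                       # C_{-m} = C_m
    ks, kd = k[:, None] + k[None, :], k[:, None] - k[None, :]
    QQ = 0.5 * (c(ks) + c(kd))                                        # q_k^T q_k'
    QHQ = 0.25 * (c(ks + 1) + c(ks - 1) + c(kd + 1) + c(kd - 1))      # q_k^T H q_k'
    return A @ QQ @ A.T, A @ QHQ @ A.T

rng = np.random.default_rng(3)
N, K = 12, 30
A0 = rng.standard_normal((N, N))
H = 0.5 * (A0 + A0.T)
Hs = H / (1.05 * np.abs(np.linalg.eigvalsh(H)).max())   # crude scaling into (-1, 1)
q0 = rng.standard_normal(N)
q0 /= np.linalg.norm(q0)

# Chebyshev vectors are stored here only to build the reference matrices.
Q = [q0, Hs @ q0]
for _ in range(2, 2 * K + 2):
    Q.append(2.0 * (Hs @ Q[-1]) - Q[-2])
Q = np.array(Q)                                      # rows q_0 .. q_{2K+1}
C = Q @ q0                                           # autocorrelation function, Eq. [64]

centers = np.linspace(-0.2, 0.2, 5)
S, Hfd = lsfd_matrices(C, centers, K)

# Reference: explicit (unnormalized) delta-filtered vectors g_l.
kk = np.arange(K + 1)
W = np.where(kk == 0, 1.0, 2.0) * np.cos(np.outer(np.arccos(centers), kk))
G = W @ Q[:K + 1]
print(np.max(np.abs(S - G @ G.T)), np.max(np.abs(Hfd - G @ Hs @ G.T)))
```

In an actual low-storage calculation only `C` would be kept, and it could itself be generated with half as many matrix-vector products through the doubling relations of Eqs. [67] and [68].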
Filter-Diagonalization Based on Lanczos Recursion
The filtering, namely the construction of energy local bases, can also be carried out using the Lanczos recursion or similar recursive methods. However, filtered vectors at $E_l$ can only be obtained using the Green filter:
$$g_l = \frac{1}{E_l I - H}\,q_0 \qquad [86]$$
by solving linear equations
$$(E_l I - H)\,g_l = q_0 \qquad [87]$$
If $H$ is Hermitian (or real-symmetric), the minimal residual (MINRES) method29 can be used for solving Eq. [87]. MINRES relies on the Lanczos recursion and is well suited for sparse Hamiltonian matrices. Other methods of Lanczos-based filtering have also been proposed.157,164 Once the filtered vectors are generated, they can be used to obtain eigenpairs by solving the same generalized eigenproblem (Eq. [76]) as discussed above.
Smith and Yu have used MINRES to construct filtered vectors in Lanczos-based FD calculations for Hermitian/real-symmetric matrices.76,77,181,182 To this end, these authors demonstrated that the filtered vectors can be written as a linear combination of the Lanczos vectors:
$$g_l = \sum_{k=0} y_k^{(l)}\,q_k \qquad [88]$$
where the expansion coefficients ($y_k^{(l)}$) for all filtered vectors can be obtained by solving the MINRES equation, which requires having knowledge of only $\{\alpha_k, \beta_k\}$.29 A byproduct of the MINRES-FD scheme is the elimination of "spurious" eigenvalues in the original Lanczos algorithm.183
For complex-symmetric Hamiltonian matrices, the generalized minimal residual (GMRES)30 and quasi-minimal residual (QMR) methods31,32 are available. The former method is applicable to the more general non-symmetric linear systems, and the Arnoldi recursion (described later) is used to generate the Krylov subspace. GMRES stores all the recurring vectors for the purpose of reorthogonalization and minimization of the true residual. Thus, the storage requirement of GMRES increases linearly with the number of recursion steps. This problem can sometimes be alleviated by restarting the recursion, but instability and stagnation may still exist. The QMR, on the other hand, uses the Lanczos recursion as its workhorse and the loss of orthogonality among the Lanczos vectors is not explicitly treated. As a result, QMR minimizes the quasi-residual, but it requires only a small number of vectors to be stored. To avoid a possible breakdown of the Lanczos recursion, a so-called "look-ahead" algorithm is often used.184 QMR is applicable to both complex-symmetric and non-Hermitian matrices. The GMRES and QMR methods are completely equivalent in exact arithmetic, but they have different convergence behaviors in finite-precision arithmetic. GMRES typically converges faster and more smoothly than does QMR, but the latter is much more efficient computationally because it avoids storing and orthogonalizing the recurring vectors.93 Thus, QMR is often the method of choice for solving large dimensional linear equations. The use of QMR to construct filtered states was introduced by Karlsson,93 and its implementation in the FD framework was advanced by several authors for obtaining eigenpairs of complex-symmetric matrices.78,82,159,163 A powerful feature of these recursive linear solvers is the possibility to institute preconditioners, which can accelerate the convergence significantly.91–94,185
It should be pointed out that a low-storage version of the Lanczos-FD can also be formulated without explicit recourse to the filtered vectors.77,164 Such a low-storage version is preferred if only eigenvalues are needed. For example, Yu and Smith have shown that the overlap and Hamiltonian matrices in a prespecified energy range can be directly obtained as follows:77
$$\mathbf{S} = \mathbf{Y}^T\mathbf{Y} \qquad [89]$$
$$\mathbf{H} = \mathbf{Y}^T\mathbf{T}\mathbf{Y} \qquad [90]$$
where the ($K \times L$) matrix $\mathbf{Y}$ contains all the expansion coefficients in Eq. [88], with one column per filtered vector. As in the Chebyshev-based LSFD, the error of an eigenvalue can be determined from the dispersion $\|(\mathbf{H}^{(2)} - E_n^2\mathbf{S})\mathbf{b}_n\|$, in which the $\mathbf{H}^{(2)}$ matrix can also be constructed with $\mathbf{T}$. For very long recursions, the large size of $\mathbf{Y}$ might cause problems. To avoid those problems, Zhang and Smith recently proposed a new LSFD scheme based on solving homogeneous linear equations in the Lanczos subspace.164 These authors showed that the $\mathbf{S}$ and $\mathbf{H}$ matrices can be built recursively along the Lanczos recursion, thus avoiding the storage of $\mathbf{Y}$. Nonetheless, it is not entirely clear if significant numerical savings can be achieved when compared with the original Lanczos algorithm.
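The relations in Eqs. [88] to [90] can be illustrated with a small sketch. Several simplifications are assumed and should not be read as the published algorithm: the Lanczos recursion is run with full reorthogonalization and for as many steps as the matrix dimension, and the expansion coefficients are obtained from a dense solve of the tridiagonal system in the Lanczos subspace, which stands in for the MINRES recursion. The test matrix, filter energies, and function names are hypothetical.

```python
import numpy as np

def lanczos(H, q0, m):
    """m-step Lanczos with full reorthogonalization (assumed here for a clean demo)."""
    Q = np.zeros((H.shape[0], m))
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    Q[:, 0] = q0 / np.linalg.norm(q0)
    for k in range(m):
        w = H @ Q[:, k]
        alpha[k] = Q[:, k] @ w
        # Project out q_0 .. q_k: removes alpha_k q_k, beta_{k-1} q_{k-1}, and round-off drift.
        w -= Q[:, :k + 1] @ (Q[:, :k + 1].T @ w)
        if k < m - 1:
            beta[k] = np.linalg.norm(w)
            Q[:, k + 1] = w / beta[k]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return Q, T

def lanczos_filter_coefficients(T, centers, norm_q0=1.0):
    """Columns y^(l) solving (E_l I - T) y = |q_0| e_1; a dense stand-in for MINRES."""
    m = T.shape[0]
    e1 = np.zeros(m)
    e1[0] = norm_q0
    return np.column_stack([np.linalg.solve(El * np.eye(m) - T, e1) for El in centers])

levels = np.linspace(-1.0, 1.0, 10)                  # hypothetical eigenvalues
rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((10, 10)))
H = U @ np.diag(levels) @ U.T
q0 = U @ np.full(10, 10 ** -0.5)                     # unit initial vector
centers = np.array([-0.2222, 0.0, 0.2222])           # filter energies between eigenvalues

Q, T = lanczos(H, q0, m=10)
Y = lanczos_filter_coefficients(T, centers)          # shape (K, L) = (10, 3)
S, Hfd = Y.T @ Y, Y.T @ T @ Y                        # Eqs. [89] and [90]

G = Q @ Y                                            # explicit filtered vectors, Eq. [88]
print(np.max(np.abs(S - G.T @ G)), np.max(np.abs(Hfd - G.T @ H @ G)))
```

The point of the comparison is that `S` and `Hfd` never require the full-length vectors in `G`; only the tridiagonal matrix and the short coefficient vectors enter.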
SYMMETRY ADAPTATION
Because many physical systems possess certain types of symmetry, its adaptation has become an important issue in theoretical studies of molecules. For example, symmetry facilitates the assignment of energy levels and determines selection rules in optical transitions. In direct diagonalization, symmetry adaptation, often performed on a symmetrized basis, significantly reduces the numerical costs in diagonalizing the Hamiltonian matrix because the resulting block-diagonal structure of the Hamiltonian matrix allows for the separate
treatment of each symmetry block, each of which has a much smaller dimensionality.186,187 However, such an approach can become complicated depending on the system under investigation, and it may also partially destroy the sparsity of the matrix, which is an issue that becomes important for recursive methods.
For Krylov subspace methods, symmetry adaptation can also lower the computational costs, although the savings may not be as dramatic as in direct diagonalization. To this end, the breakdown of the spectrum into symmetric species reduces the spectral density, thus rendering faster convergence. It may also remove degeneracy or near-degeneracy in the spectrum that is pathologically difficult to converge using a recursive approach. An example of such near-degeneracy is the local vibrational modes coupled by the Darling–Dennison resonance.188 The unique operation in recursive methods, namely matrix-vector multiplication, demands different strategies in symmetry adaptation. We discuss several of these strategies below, with the assumption that a coordinate system has been chosen such that symmetry operations in the group to which the molecular system belongs can be readily realized.
The simplest approach to symmetry adaptation is to recur several symmetry-adapted vectors.12,145,189–191 This approach is not optimal because multiple recursions have to be executed. The numerical efficiency can be improved by propagating only a single vector and constructing multiple symmetry-adapted vectors and/or correlation functions at each step. This approach is possible because symmetry operators and the Hamiltonian commute. Using the Chebyshev propagator as an example, a symmetry-adapted Chebyshev vector for the $\mu$th irreducible representation of the symmetry group to which the molecular system belongs can be obtained by applying the appropriate projection operator ($P^{(\mu)}$) onto the original Chebyshev vector:192
$$q_k^{(\mu)} \equiv T_k(H)\,q_0^{(\mu)} = T_k(H)P^{(\mu)}q_0 = P^{(\mu)}T_k(H)q_0 = P^{(\mu)}q_k \qquad [91]$$
where the projection operator, being a linear combination of symmetry operators, commutes with the Hamiltonian and thus with the Chebyshev propagator. Similarly, autocorrelation functions belonging to different symmetries can be obtained from a single propagation:
$$C_k^{(\mu)} \equiv [q_0^{(\mu)}]^T q_k^{(\mu)} = [q_0^{(\mu)}]^T q_k = [q_0]^T q_k^{(\mu)} \qquad [92]$$
where the idempotency of the projection operator ($P^{(\mu)}P^{(\mu)} = P^{(\mu)}$) is used. These symmetry-adapted autocorrelation functions can be used directly in a spectral method, or they can be used to construct symmetry-adapted generalized eigenequations in FD. Numerical tests have shown that these strategies are both accurate and efficient.193,194
Unfortunately, the symmetry adaptation scheme described above for the Chebyshev recursion cannot be applied directly to the Lanczos recursion.
Because of round-off errors, symmetry contamination is often present even when the initial vector is properly symmetrized. To circumvent this problem, an effective scheme to reinforce the symmetry at every Lanczos recursion step has been proposed independently by Chen and Guo100 and by Wang and Carrington.195 Specifically, the Lanczos recursion is executed with symmetry-adapted vectors, but the matrix-vector multiplication is performed at every Lanczos step with the unsymmetrized vector. In other words, the symmetrized vectors are combined just before the operation $Hq_k$, and the resultant vector is symmetrized using the projection operators:
$$[Hq_k]^{(\mu)} = \left[H\left(\sum_{\mu'} q_k^{(\mu')}\right)\right]^{(\mu)} = P^{(\mu)}\left[H\left(\sum_{\mu'} q_k^{(\mu')}\right)\right] \qquad [93]$$
Such a strategy gives rise, from a single Lanczos recursion, to multiple T matrices that can be subsequently diagonalized for eigenvalues in different irreducible representations. The symmetrized vectors need not be stored in full length but instead can be squeezed into the space of a single vector. As a result, the memory requirement of this symmetry-adapted Lanczos algorithm remains unchanged. The CPU requirement also remains the same because the projection can be readily performed with few arithmetic operations. Applications to various problems have demonstrated the power of this symmetry-adaptation method.103,196–205 Even when the system has no symmetry, one can still take advantage of the inherent symmetry in some operators in the Hamiltonian. An example is the reflection symmetry in the kinetic energy operator $\partial^2/\partial x^2$, which can be used to reduce its DVR matrix into a block-diagonal form.206 When the computation of its action onto the recurring vector is rate limiting, the efficiency will double if the symmetrized representation is used, even when the potential has no such symmetry. In practical implementations, the recurring vector is first symmetrized before applying the symmetrized DVR matrix. This is followed by recombination of the resultant vectors to give the non-symmetrized vector. This so-called extended symmetry-adapted discrete variable representation (ESADVR) can be generalized to any reference Hamiltonian that has high symmetry. The computational savings, which are proportional to the number of symmetry species in the group, can be significant for large systems. For details of the discussion, the reader is referred to Ref. 206.
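The symmetrize, multiply, recombine sequence described for the kinetic energy operator can be sketched as follows. This is only an illustration of the idea, not the ESADVR implementation of Ref. 206: the sinc-DVR (Colbert-Miller) kinetic matrix, the grid, the model potential, and the helper names are all assumptions made here.

```python
import numpy as np

def sinc_dvr_kinetic(n, dx, mass=1.0):
    """Colbert-Miller sinc-DVR matrix for -(1/2m) d^2/dx^2 (hbar = 1)."""
    i = np.arange(n)
    d = i[:, None] - i[None, :]
    safe = np.where(d == 0, 1, d)          # avoid division by zero on the diagonal
    T = np.where(d == 0, np.pi**2 / 3.0, 2.0 / safe**2) * (-1.0) ** d
    return T / (2.0 * mass * dx**2)

def kinetic_matvec_symmetrized(T_even, T_odd, v):
    """Apply T through its even/odd blocks: symmetrize, multiply, recombine."""
    h = v.size // 2
    left, right = v[:h], v[h:][::-1]
    s = (left + right) / np.sqrt(2.0)      # even (symmetric) component
    a = (left - right) / np.sqrt(2.0)      # odd (antisymmetric) component
    ws, wa = T_even @ s, T_odd @ a         # two half-size multiplications
    return np.concatenate([(ws + wa) / np.sqrt(2.0),
                           ((ws - wa) / np.sqrt(2.0))[::-1]])

n, dx = 40, 0.2                            # even number of points, grid symmetric about x = 0
x = (np.arange(n) - (n - 1) / 2.0) * dx
T = sinc_dvr_kinetic(n, dx)

h = n // 2
A = T[:h, :h]
BJ = T[:h, h:][:, ::-1]                    # off-diagonal block with reversed columns
T_even, T_odd = A + BJ, A - BJ             # diagonal blocks in the symmetrized basis

V = 0.5 * x**2 + 0.3 * x**3                # potential WITHOUT reflection symmetry
v = np.random.default_rng(0).standard_normal(n)

Hv_direct = T @ v + V * v
Hv_sym = kinetic_matvec_symmetrized(T_even, T_odd, v) + V * v
```

The two half-size products cost about half the arithmetic of the full product, which is consistent with the factor of two quoted above for a two-element symmetry group.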
COMPLEX-SYMMETRIC PROBLEMS

Complex-symmetric matrices might arise in some problems in chemical physics. Examples include the electron paramagnetic resonance (EPR) and nuclear magnetic resonance (NMR) line shape problems.207 Another
prominent case is resonance states embedded in a continuum, formed, for example, by the temporary trapping of energy in one or more internal modes. Because resonances can affect the scattering processes significantly, they have been investigated extensively. Numerically, resonance states can be considered as eigenstates of a complex-symmetric Hamiltonian with a negative imaginary absorbing potential $\hat{H}' = \hat{H} - iV$.137–139 This approach is related to the more rigorous complex scaling method in which the dissociation coordinate is rotated into the complex plane.208,209 The diagonalization of the complex-symmetric Hamiltonian matrix yields eigenvalues in the form of $E - i\Gamma/2$, where $E$ and $\Gamma$ represent the position and width of the resonance, respectively. In the limit of isolated resonances, the lifetime of a resonance is given by $1/\Gamma$. Like the real-symmetric case, the size of the matrix limits the applicability of direct diagonalization methods. A complex-symmetric matrix can be diagonalized recursively using the same Lanczos algorithm outlined in Eqs. [25] to [34].27,51,207 Under such circumstances, however, the inner product is based on the complex (non-conjugate) product,210 superficially identical to Eq. [28], and the resulting tridiagonal matrix (T) is thus complex-symmetric. The diagonalization of the complex-symmetric T cannot be done with bisection, but the inverse iteration or a modified QL method can be used instead, as suggested by Cullum and Willoughby.27 Like the real-symmetric case, spurious eigenvalues appear, but they can be identified using the same tests discussed in Sec. II. Because the eigenvalues are located in the complex plane, rather than on the real axis, the identification becomes more difficult, especially for systems with large spectral densities. This is because the multiple copies of an eigenpair do not converge to machine precision.
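A minimal sketch of the complex-symmetric Lanczos recursion with the non-conjugate product is given below. It assumes a tiny random model (a real-symmetric matrix plus a negative imaginary diagonal term standing in for the absorbing potential) and uses a general dense eigensolver for the tridiagonal matrix in place of the inverse iteration or modified QL routines mentioned above; the function and variable names are illustrative only.

```python
import numpy as np

def complex_symmetric_lanczos(H, q0, n_steps):
    """Lanczos recursion using the non-conjugate product u^T v; returns T."""
    cdot = lambda u, v: np.sum(u * v)          # no complex conjugation
    q = q0 / np.sqrt(cdot(q0, q0))
    q_prev = np.zeros_like(q)
    beta = 0.0
    alphas, betas = [], []
    for k in range(n_steps):
        w = H @ q - beta * q_prev
        alpha = cdot(q, w)
        w = w - alpha * q
        alphas.append(alpha)
        if k == n_steps - 1:
            break
        beta = np.sqrt(cdot(w, w))             # complex in general
        betas.append(beta)
        q_prev, q = q, w / beta
    return np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
H0 = 0.5 * (A + A.T)                           # hypothetical real-symmetric part
W = np.array([0.0, 0.0, 0.0, 0.0, 0.2, 0.5])   # hypothetical absorbing potential
H = H0 - 1j * np.diag(W)                       # complex symmetric, not Hermitian

q0 = rng.standard_normal(n).astype(complex)
T = complex_symmetric_lanczos(H, q0, n)        # full-length recursion on a tiny matrix

ritz = np.linalg.eigvals(T)                    # eigenvalues E - i*Gamma/2
exact = np.linalg.eigvals(H)
positions, widths = ritz.real, -2.0 * ritz.imag
```

For a matrix this small the recursion can be run to full length, so the eigenvalues of T should reproduce those of H; for realistic sizes only part of the spectrum converges and the spurious-eigenvalue tests discussed in the text become necessary.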
A general non-symmetric eigenproblem can also be solved recursively using the Arnoldi method.211 Like the Lanczos algorithm for real-symmetric problems, the Arnoldi recursion generates a set of orthonormal vectors to span the Krylov subspace. Instead of a tridiagonal matrix, however, it yields an associated matrix in Hessenberg form, which can then be diagonalized. The Lanczos algorithm can also be modified for such problems. Instead of orthonormal vectors, the nonsymmetric Lanczos recursion generates biorthogonal vectors for two Krylov subspaces.25 This so-called dual Lanczos algorithm results in a generally nonsymmetric tridiagonal matrix.34,212 Both methods are discussed in detail by Saad.24 Although these recursive methods are amenable to complex-symmetric problems, their applications to chemical physics have been attempted only recently. In a recent publication, Tremblay and Carrington proposed a clever real-arithmetic method for calculating resonance energies and widths.213 Their method is based on a conversion of the original complex-symmetric Hamiltonian to a larger real nonsymmetric matrix, following a recipe for solving a pseudo-time Schrödinger equation proposed by Mandelshtam and Neumaier.214 It was demonstrated that a dual Lanczos recursion can be
used to reduce this real nonsymmetric matrix to a complex-symmetric tridiagonal matrix, which yields the final complex eigenvalues. Because the real matrix is extremely sparse, its larger size has little impact on the recursion. On the other hand, the recursion is efficient as it is carried out with real vectors. The application of the Chebyshev recursion to complex-symmetric problems is more restricted because Chebyshev polynomials may diverge outside the real axis. Nevertheless, eigenvalues of a complex-symmetric matrix that are close to the real energy axis can be obtained using the FD method based on the damped Chebyshev recursion.155,215 For broad and even overlapping resonances, it has been shown that the use of multiple cross-correlation functions may be beneficial.216 Because of the damping in the Chebyshev recursion, however, the doubling formulas for the autocorrelation function (Eqs. [67] and [68]) do not hold any longer.156 Consequently, one might have to compute all correlation function values from the original definition (Eq. [64]), which would result in numerical inefficiency. Nevertheless, it has been shown by Li and Guo that the doubled autocorrelation function according to Eqs. [67] and [68] can still be used to calculate both the positions and the widths of narrow resonances accurately with LSFD based on the damped Chebyshev recursion,217,218 even though the errors can be large for broad resonances. This observation can be rationalized by the fact that damping in the asymptotic regions does not significantly affect narrow resonances because they are largely localized in the interaction region. Numerically, the doubling cuts the computational cost by half and the savings can be significant for large systems. A formal discussion about the validity of the doubling scheme has been given by Neumaier and Mandelshtam.219
PROPAGATION OF WAVE PACKETS AND DENSITY MATRICES

The solution of the time-dependent Schrödinger equation

\[ i \frac{\partial}{\partial t} \Psi(t) = \hat{H} \Psi(t) \qquad [94] \]

constitutes the propagation of a wave packet in the time domain with the evolution operator $e^{-i\hat{H}t}$. As discussed, the discretized Hamiltonian $H$ may be very large and sparse. As a result, many techniques introduced above can be used to approximate the time propagator. We emphasize that if the eigenpairs are all known, the time propagation can be performed analytically with minimal computational effort. However, it is often unnecessary to resolve the eigenpairs. Interpolation works well, particularly for relatively short time events.
For example, the time propagator can be approximated by a Chebyshev expansion123

\[ e^{-iHt} = \sum_{k=0}^{\infty} (2 - \delta_{k0}) (-i)^k J_k(t) T_k(H) \qquad [95] \]
where $J_k$ are the Bessel functions of the first kind and the spectral range of $H$ is assumed to be normalized. Other orthogonal polynomials can also be used to approximate the time propagator.125–127,220–225 Using the damping technique in Eq. [59], one can usually avoid the problems introduced by a negative imaginary potential. The Lanczos algorithm can also be used to approximate a short-time propagator. The so-called short-iterative Lanczos (SIL) method of Park and Light constructs a small set of Lanczos vectors,226 which can be summarized by Eq. [96]:

\[ e^{-iHt} \approx Q Z e^{-i\varepsilon t} Z^{\dagger} Q^{\dagger} \qquad [96] \]
where Q and Z are the matrices that tridiagonalize H and diagonalize T, respectively. Because the time step is short, only a few Lanczos vectors are needed to approximate the propagator. Note that even if the eigenvalues in Eq. [96] might not be converged, they are sufficient to provide an interpolation of the propagator for a short time step. For time-dependent Hamiltonians, one can reformulate the problem with the $(t, t')$ scheme,227 in which the time is treated as an extra degree of freedom. Thus, the techniques developed for stationary problems can be applied in a straightforward manner. Applications of recursive methods to laser-driven dynamics have been reported by several authors.99,228–230 By analogy the propagation of a density matrix, which corresponds to the solution of the Liouville–von Neumann equation:231

\[ \frac{\partial \hat{\rho}}{\partial t} = \hat{\hat{L}} \hat{\rho} \qquad [97] \]

requires calculating the Liouville–von Neumann propagator ($e^{\hat{\hat{L}} t}$). The Liouville super-operator ($\hat{\hat{L}}$) is typically complex nonsymmetric and much larger than the Hamiltonian because the density matrix ($\hat{\rho}$) is a rank-2 tensor. The corresponding eigenvalues are generally complex, and diagonalization is not always possible. The most successful strategy for approximating the Liouville–von Neumann propagator is to interpolate the operator with polynomial operators. To this end, Newton and Faber polynomials have been suggested to globally approximate the propagator,126,127,225,232–234 as in Eq. [95]. For short-time propagation, short-iterative Arnoldi,235 dual Lanczos,236 and Chebyshev
approaches237,238 have been reported. The former two approaches are similar to the SIL discussed above, whereas the latter is essentially an interpolation scheme similar to that in Eq. [95]. All of these methods are based on the recursive generation of a Krylov subspace, and they are thus numerically efficient.
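For the wave packet case, the global Chebyshev expansion of Eq. [95] and the short-iterative Lanczos step of Eq. [96] can be compared on a small model. The sketch below assumes a random real-symmetric matrix whose spectrum has been scaled into (-1, 1), evaluates the Bessel functions from their integral representation to avoid extra dependencies, and uses illustrative names (`bessel_j`, `chebyshev_propagate`, `sil_step`) and parameters (40 expansion terms, 8 Lanczos vectors, 100 time steps) chosen here, not taken from the cited papers.

```python
import numpy as np

def bessel_j(k, t, m=512):
    """J_k(t) from Bessel's integral, evaluated with an m-point periodic rule."""
    tau = 2.0 * np.pi * np.arange(m) / m
    return np.mean(np.cos(k * tau - t * np.sin(tau)))

def chebyshev_propagate(H, psi, t, n_terms):
    """Truncated Eq. [95] applied to psi; spectrum of H assumed within [-1, 1]."""
    q_prev, q = psi, H @ psi
    out = bessel_j(0, t) * q_prev + 2.0 * (-1j) * bessel_j(1, t) * q
    for k in range(2, n_terms):
        q_prev, q = q, 2.0 * (H @ q) - q_prev
        out = out + 2.0 * (-1j) ** k * bessel_j(k, t) * q
    return out

def sil_step(H, psi, dt, n_krylov=8):
    """One short-iterative Lanczos step, Eq. [96], for real-symmetric H."""
    norm = np.linalg.norm(psi)
    Q = np.zeros((psi.size, n_krylov), dtype=complex)
    alpha, beta = np.zeros(n_krylov), np.zeros(n_krylov - 1)
    Q[:, 0] = psi / norm
    for k in range(n_krylov):
        w = H @ Q[:, k]
        if k > 0:
            w -= beta[k - 1] * Q[:, k - 1]
        alpha[k] = np.vdot(Q[:, k], w).real
        w -= alpha[k] * Q[:, k]
        if k < n_krylov - 1:
            beta[k] = np.linalg.norm(w)
            Q[:, k + 1] = w / beta[k]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    eps, Z = np.linalg.eigh(T)                       # possibly unconverged eigenvalues
    coeff = Z @ (np.exp(-1j * eps * dt) * Z[0, :])   # Z exp(-i eps dt) Z^T e_1
    return norm * (Q @ coeff)

rng = np.random.default_rng(3)
n = 30
A = rng.standard_normal((n, n))
H = 0.5 * (A + A.T)
evals, evecs = np.linalg.eigh(H)
scale = 1.05 * np.abs(evals).max()
H, evals = H / scale, evals / scale                  # normalized spectral range

psi0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
psi0 /= np.linalg.norm(psi0)
t_final, n_steps = 10.0, 100

psi_exact = evecs @ (np.exp(-1j * evals * t_final) * (evecs.T @ psi0))
psi_cheb = chebyshev_propagate(H, psi0, t_final, n_terms=40)
psi_sil = psi0
for _ in range(n_steps):
    psi_sil = sil_step(H, psi_sil, t_final / n_steps)
```

The Chebyshev route covers the whole interval in one expansion whose length grows with the time span, whereas the Lanczos route takes many short steps with a fixed, small Krylov space; both rely only on matrix-vector products.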
APPLICATIONS

Bound States and Spectroscopy

The Lanczos algorithm has traditionally been considered an efficient way to extract the lowest few eigenvalues of a large sparse matrix. However, many researchers have come to realize that it is equally powerful for mapping out large portions of, or even entire, bound-state spectra of polyatomic molecules. The equally powerful Chebyshev FD method, particularly the low-storage version, has also been used very successfully for extracting bound-state spectra. A typical recursive diagonalization calculation starts with an arbitrary initial vector, which is often chosen randomly to minimize the possibility of missing certain eigenpairs. The recursion, which is the most numerically costly part of the calculation, generates the correlation function or the tridiagonal matrix. The final step involves the diagonalization of a small generalized eigenproblem or of a tridiagonal matrix. Some sorting and convergence testing might be required to remove the spurious eigenvalues. When eigenvectors are needed, several approaches exist, as discussed above. In recent years, state-of-the-art recursive diagonalization methods have been applied to bound-state problems for LiCN,152 H2O,12,117,239–241 CH2,242 HCN,13,80,105–107,241 HO2,40,67,164,243–245 ArHCl,246 HOCl,247,248 NO2,76,249–253 CS2,254 O3,54,255,256 SO2,194,241,257–259 HOOH,12,196,197,260–264 HCCH,103,193,198 HFCO,89,90,265,266 NH3,261,267 H2CO,12,39,48,87,182,261,264,266,268,269 HOCO,66 CH4,75,116,200,201,204,270–274 CH3F,275,276 C6H6,108–112,115,277,278 and several van der Waals systems.199,205,279–282 This list is by no means complete, but it does reflect major activity in the field. These calculations complement the traditional sequential diagonalization and truncation approaches,5 and they have advanced our understanding of vibrational dynamics in polyatomic molecules and their roles in both unimolecular and bimolecular reactions significantly.
The recursive solution of the ro-vibrational Schrödinger equation not only gives the eigenvalues that form the spectrum but also additional information about the intermodal coupling and dynamics. A significant question is how the energy injected in a particular vibrational mode is dispersed in a polyatomic molecule.57,283–286 Experimentally, the intramolecular vibrational energy redistribution (IVR) can be investigated by overtone spectroscopy and stimulated emission pumping. A theoretical understanding of such problems often requires knowledge of the vibrational energy spectrum in regions with
very high spectral density and the corresponding spectral intensities. Such problems are ideally suited for the recursive methods described in this review. In particular, both the position and the intensities of the spectral lines can be obtained by efficient recursive methods such as RRGM and SLP without resorting to the explicit calculations of the eigenvectors.108–115 An important problem in molecular spectroscopy is the assignment of vibrational states, assuming the system is in the regular regime, which is trivial to do if the eigenvectors are known. However, it has been shown that even without the eigenvectors, important information about the eigenvectors can be extracted to allow an unambiguous assignment of vibrational quantum numbers. Two strategies have been proposed and demonstrated to accomplish this assignment. The first is based on a perturbative scheme to compute the expectation values of the vibrational operator for a particular mode,121 as described above. Such an operator could be chosen as the one-dimensional vibrational Hamiltonian ($-(1/2m)\,\partial^2/\partial R^2 + (k/2)(R - R_0)^2$) or the squared displacement from equilibrium ($(R - R_0)^2$). This approach, which is amenable to both the Lanczos algorithm and the Chebyshev-based LSFD, is especially efficient for assigning normal mode quantum numbers near the potential minimum. The second strategy takes advantage of the SLP by computing the overlaps of the eigenvectors with a set of prespecified target functions. For example, the target functions could be chosen to be Gaussian functions placed at large R values, which have large overlaps with highly excited stretching local-mode states.103,117,198 This latter approach is particularly effective in "fishing out" eigenvectors with well-defined characteristics.
Reaction Dynamics

The dynamics of a scattering event can be described by the causal Green operator:287,288

\[ G^{+} = \frac{1}{E I - H + i\varepsilon} \qquad [98] \]

where $\varepsilon$ is an infinitesimally small number that can be interpreted as being the absorbing boundary condition.139,289 In particular, the S-matrix element for a transition from the initial (i) to final (f) state at energy E is given as

\[ S_{f \leftarrow i}(E) = \frac{i}{2\pi a_i(E) a_f(E)} \langle \chi_f | G^{+} | \chi_i \rangle \qquad [99] \]
where $a_i$ and $a_f$ are the energy amplitudes of the initial and final wave packets, respectively. The S-matrix elements can be computed using the time-dependent wave packet theory that expands the Green operator in terms of the time propagator.7 As discussed above, the Chebyshev propagator bears many
similarities with the time propagator, and a similar expansion (Eq. [70]) can therefore be used.125,126 The Chebyshev propagation is superior to time propagation because the action of the propagator on a vector can be calculated exactly with no approximation and a real algorithm is possible when the damped Chebyshev recursion140,141 is used. Indeed, studies of reaction dynamics using the Chebyshev propagation have been advocated by several authors.126,127,134,140,141,290–294 Techniques for extracting the cumulative reaction probability,295 reactive flux,296,297 and final state distributions127,134,141,292,298–300 have been reported for Chebyshev propagation. The applications of the Chebyshev recursion to reactive scattering problems have been discussed in several recent reviews.133,301 Lanczos-based methods have similarly been used for studying reaction dynamics. The most straightforward application involves the solution of a linear equation in the form of Eq. [87] using QMR or GMRES, which effectively compute the action of the Green operator on the initial vector ($|\chi_i\rangle$).93–97 Recent progress in the area of reaction dynamics includes a Lanczos implementation of the artificial boundary inhomogeneity (ABI),302 which allows for the calculation of S-matrix elements with real-symmetric Lanczos recursion.303,304 The Lanczos algorithm has also been used to diagonalize the so-called reaction probability operator,305–307 which allows for the direct calculation of the cumulative reaction probability without the S-matrix elements. This operator has only a few nonzero eigenvalues and is thus well suited for the Lanczos algorithm. Unlike the bound-state calculations where a random initial vector is commonly used, calculating the S-matrix element in Eq. [99] involves a well-defined initial vector. Such an initial vector often consists of a product of a translational wave packet and an eigenstate for the internal degrees of freedom, placed in the dissociation asymptote.
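The statement that a linear solve yields the action of the Green operator on the initial vector can be illustrated on a toy problem. In this sketch a dense direct solver (`numpy.linalg.solve`) stands in for the iterative QMR or GMRES solvers named above, the matrix and wave packets are random placeholders, and a small finite value of the damping parameter is assumed in place of the infinitesimal one in Eq. [98].

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n))
H = 0.5 * (A + A.T)                     # hypothetical real-symmetric Hamiltonian
chi_i = rng.standard_normal(n)          # stand-ins for the initial/final wave packets
chi_f = rng.standard_normal(n)
E, eps = 0.3, 0.05                      # finite eps mimics a uniform absorbing term

# Solve (E*I - H + i*eps) x = chi_i, i.e. x = G^+ chi_i, without forming G^+.
x = np.linalg.solve((E + 1j * eps) * np.eye(n) - H, chi_i)
amp_linear = chi_f @ x                  # <chi_f | G^+ | chi_i>

# Reference: spectral representation of the same matrix element.
evals, evecs = np.linalg.eigh(H)
amp_spectral = np.sum((evecs.T @ chi_f) * (evecs.T @ chi_i)
                      / (E - evals + 1j * eps))
```

The linear-solve route never needs the eigenpairs, which is the property that makes iterative Krylov solvers attractive when the matrix is too large to diagonalize.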
Because each recursion produces a single column of the S-matrix, many recursions might be needed to obtain the entire matrix. Recently, however, Mandelshtam has argued that it is possible to obtain the entire S-matrix from cross-correlation functions based on a single damped Chebyshev recursion starting with an arbitrary initial vector.101,180 A similar formulation based on the Lanczos recursion has also been advocated by several research groups with some success.96,97,102,119,120 The aforementioned applications of recursive methods in reaction dynamics do not involve diagonalization explicitly. In some quantum mechanical formulations of reactive scattering problems, however, diagonalization of sub-Hamiltonian matrices is needed. Recursive diagonalizers for Hermitian and real-symmetric matrices described earlier in this chapter have been used by several authors.73,81 Many bimolecular and unimolecular reactions are dominated by long-lived resonances. As a result, having knowledge about the positions and lifetimes of such resonance states is highly desired. Recursive calculations of resonance states have been reported for many molecular systems, including
H3+,12,15,215 H2O,240 CH2,308 HCO,92,153,213,217 HO2,243–245 HN2,218 HOCl,309,310 HArF,311 and ClHCl.71 Most of these calculations were carried out using either the complex-symmetric Lanczos algorithm or filter-diagonalization based on the damped Chebyshev recursion. The convergence behavior of these two algorithms is typically much less favorable than in Hermitian cases because the matrix is complex symmetric. In some chemical reactions, both direct and resonance-dominated pathways coexist. Techniques designed for extracting narrow resonances may be inefficient for the (fast) direct channel. In such cases, it might be profitable to treat the two events separately. In particular, one can institute a short propagation first to give an accurate description of the direct process, which is fast. The propagation is terminated at some point in time, and the resulting state provides a starting point for extracting the relevant resonances, using either the Lanczos312 or the FD method.313 The S-matrix in the slower channel can thus be reconstructed from the resonances using Eq. [99].
Lanczos vs. Chebyshev

It is not difficult to see that the two major recursion schemes described in this review are very similar; both use three-term recursion formulas to generate the Krylov subspace and have favorable scaling laws with respect to the dimensionality of the matrix. Although the coefficients in the recursion formula for the Chebyshev recursion are known a priori, their counterparts in the Lanczos recursion depend on the Hamiltonian matrix and on the initial vector. Both recursions can be considered as propagations in the generalized time domain for which the transformation to the energy domain can be found. The Lanczos algorithm attempts to impose orthogonalization among the recurring vectors, but unfortunately, an instability of the Lanczos algorithm in finite-precision arithmetic emerges because of numerical round-off errors.27 The Chebyshev vectors, on the other hand, are not orthogonalized. The analytical properties of the Chebyshev polynomials allow for the uniformly converging interpolation of any function in the entire spectral range. The interested reader is referred to several excellent books and reviews on the topic.18,24 Interestingly, the loss of global orthogonality in the Lanczos recursion sometimes works in favor of eigenpairs that have little or no overlap with the initial vector. The round-off errors are often sufficient to create copies of these eigenpairs in a long recursion. In other words, all eigenpairs eventually appear regardless of their amplitudes in the initial vector. Such a process does not occur in the Chebyshev recursion-based methods, where an eigenpair that is not contained in the initial vector simply will not appear. In other words, the Chebyshev-based methods yield the spectral information of the initial vector faithfully.
The relative merits of the two recursive methods in computing both the bound-state and the resonance spectra have been examined and discussed by several authors,41,42,80,247,314–318 and the consensus is that their efficiency
and accuracy are comparable. To be more specific, the number of matrix-vector multiplications needed to converge specific eigenpairs is roughly the same in the two recursions. In our own experience, the real-symmetric Lanczos algorithm is generally preferred for bound-state calculations because of its simplicity and fool-proof ability to extract all eigenvalues. Evidence also shows that the convergence of low-lying levels with the Lanczos algorithm is somewhat faster than with Chebyshev-based methods.42,318 On the other hand, the convergence of the Chebyshev-based methods is typically more uniform than found with Lanczos-based methods. For narrow resonances, the LSFD method with the doubled Chebyshev autocorrelation function is probably the most effective,217,218,316 largely because the Chebyshev recursion can be carried out in real space. The recently proposed real-arithmetic Lanczos method may also be competitive.213 For specific systems, more sophisticated approaches might be devised to gain additional advantages. Observations on the convergence rate of both the Chebyshev-based LSFD method and the original Lanczos algorithm indicate that the number of converged eigenvalues is linearly proportional to the number of recursion steps and inversely proportional to the square root of the spectral range.41,42 These empirical scaling laws point to the importance of reducing the spectral range of the Hamiltonian matrix.
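The empirical scaling just described can be written compactly. In the relation below, $N_{\mathrm{conv}}$, $K$, and $\Delta H$ are labels introduced here for illustration (they are not the authors' notation) for the number of converged eigenvalues, the number of recursion steps, and the spectral range of the Hamiltonian matrix:

\[ N_{\mathrm{conv}} \propto \frac{K}{\sqrt{\Delta H}} \quad \Longrightarrow \quad \frac{K_2}{K_1} \approx \sqrt{\frac{\Delta H_2}{\Delta H_1}} \quad \text{at fixed } N_{\mathrm{conv}} \]

Read this way, and only to the extent that the empirical law holds for the system at hand, reducing the spectral range by a factor of four would roughly halve the number of recursion steps needed to converge the same number of levels.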
SUMMARY

The recursive methods discussed in this tutorial, based on the Lanczos and Chebyshev recursions, have many attractive features. For example, their implementation is in most cases straightforward. Unlike the direct method in which all eigenpairs are obtained at the end of the calculation, a recursive diagonalizer can be terminated once the desired eigenpairs converge. This is advantageous for problems in molecular spectroscopy and reaction dynamics because these problems are dominated by low-lying eigenstates. Some methods, such as filter-diagonalization, also allow one to extract eigenpairs in one or more prespecified spectral windows. The most attractive feature of these Krylov subspace-based methods is the fact that they rely on matrix-vector multiplication and consequently they have favorable scaling laws making them amenable to larger matrices. In particular, the memory requirement scales linearly with dimensionality because only a few recurring vectors are stored and the matrix is neither stored nor modified. The CPU scaling law, which is dominated by the matrix-vector multiplication in the recursion, is often pseudo-linear with respect to the dimensionality thanks to the sparsity or factorizability of the matrix. The recursive approaches are conceptually reminiscent of the popular time-dependent wave packet method, and techniques developed for wave packet propagation can be transplanted easily to various recursive methods. For these and other reasons, these recursive
approaches have become the methods of choice for diagonalizing large sparse matrices in science and engineering, and they have found wide applications in other fields of research as well. In this review, we have discussed in detail the original Lanczos algorithm and its convergence behavior, noting that its notorious "spurious" eigenvalue problem can be effectively managed without reorthogonalization. Several extensions of the original Lanczos algorithm were presented that may be very useful in different scenarios. Discussions on the Chebyshev recursion were provided, with an emphasis placed on its propagation characteristics. Spectral analysis based on the spectral method and the filter-diagonalization approach were also discussed. Finally, the pros and cons of the two methods, namely the Lanczos and Chebyshev recursions, and their relationship have been presented. In this chapter, we also discussed several schemes that allow for the computation of scalar observables without explicit construction and storage of the eigenvectors. This is important not only numerically for minimizing the core memory requirement but also conceptually because such a strategy is reminiscent of the experimental measurement, which almost never measures the wave function explicitly. Both the Lanczos and the Chebyshev recursion-based methods for this purpose have been developed and applied to both bound-state and scattering problems by various groups. Future applications of recursive methods in molecular spectroscopy and reaction dynamics will inevitably face increasingly large bases needed for highly excited energy regions as well as for larger polyatomic systems. For direct product bases, the size of the wave function grows exponentially with the number of degrees of freedom.
Despite favorable scaling laws, the increase of basis functions or grid points will still impact the efficiency of a recursive method by increasing the size of the recurring vector, thus leading to larger memory and CPU requirements. These difficulties are compounded by the increase of the spectral range of the Hamiltonian, which results in a longer recursion to resolve the eigen-spectrum. As a result, it will be vital to minimize the size of the basis in treating nuclear dynamics of large systems, such as those with more than three atoms. Not surprisingly, recent efforts in the recursive solution of the molecular vibrational problem have concentrated on deriving the exact kinetic energy operator in various coordinate systems and basis contraction. The appropriate kinetic energy operator allows not only an efficient representation of the Hamiltonian matrix by minimizing the intermodal coupling, but it also has several important advantages, such as symmetry adaptation, avoidance of singularities, and a meaningful interpretation of the results. For triatomic systems, the Jacobi and Radau coordinates are the most commonly used, and the corresponding kinetic energy operators are well known. For molecules with more atoms, much work has been devoted recently to the derivation of the exact form of the kinetic energy operator in various coordinate systems
and their discretization schemes.12,261,272,319–325 The appropriate kinetic energy operator used in a variational calculation depends on the molecular geometry of the potential energy minimum of interest. Understandably, it is difficult to choose a coordinate system for floppy molecules or for highly excited spectral regions where more than one molecular configuration is possible. Much progress has also been made recently on basis contraction schemes designed for recursive methods. The essence of these schemes is to construct non-direct product bases that have smaller sizes and narrower spectral ranges. One possibility is to prune the direct product basis using a set of criteria such as energy. In a DVR, for example, this amounts to the removal of all grid points above a certain cut-off potential energy. This strategy has been successfully used by several authors.241,264,272,326 An alternative contracting scheme is to break the system into several subsystems, so that eigenfunctions of the subsystem Hamiltonians can be used to construct the complete basis.262,273 This approach involves the solution of not only the full eigenproblems, but also those for the smaller subsystems, both of which are amenable to recursive diagonalization methods. Applications of this contraction idea have allowed for the determination of energy levels in polyatomic systems with up to 12 degrees of freedom.200,201,203,327 It can be expected that this research area will remain active and vital to advance further the recursive methods.328–331
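The energy-based pruning of a direct product grid mentioned above can be sketched in a few lines. Everything specific in this example is an assumption made for illustration: the two-dimensional model potential, the cut-off value `V_cut`, and the finite-difference kinetic matrix, which stands in for a proper DVR kinetic energy operator.

```python
import numpy as np

def second_difference_kinetic(n, dx, mass=1.0):
    """Finite-difference stand-in for -(1/2m) d^2/dx^2 on n grid points."""
    T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return T / (2.0 * mass * dx**2)

n = 21
x = np.linspace(-4.0, 4.0, n)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")
V = 0.5 * (X**2 + Y**2) + 0.05 * X**2 * Y     # hypothetical model potential

# Full direct-product Hamiltonian on the n*n grid
T1, I1 = second_difference_kinetic(n, dx), np.eye(n)
H_full = np.kron(T1, I1) + np.kron(I1, T1) + np.diag(V.ravel())

# Prune: keep only grid points whose potential energy lies below the cut-off
V_cut = 6.0
keep = np.flatnonzero(V.ravel() <= V_cut)
H_pruned = H_full[np.ix_(keep, keep)]

E_full = np.linalg.eigvalsh(H_full)
E_pruned = np.linalg.eigvalsh(H_pruned)
range_full = E_full[-1] - E_full[0]
range_pruned = E_pruned[-1] - E_pruned[0]
```

The pruned basis is smaller and its spectral range cannot exceed that of the full grid, which are the two properties the contraction schemes aim for; how far the low-lying levels shift depends on the cut-off and should be checked case by case.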
ACKNOWLEDGMENTS

I am deeply indebted to the members of my research group, Guohui Li, Shenmin Li, Shi Ying Lin, Guobin Ma, Daiqian Xie, and Dingguo Xu, and especially Rongqing Chen. I would also like to thank Stephen Gray, Vladimir Mandelshtam, Tucker Carrington, Jr., and Hua-gen Yu for many in-depth discussions regarding recursive diagonalization approaches and related topics. I dedicate this review to my parents for their eternal love and encouragement. This work was funded by the National Science Foundation.
REFERENCES

1. A. Messiah, Quantum Mechanics, Wiley, New York, 1968. 2. W. J. Hehre, L. Radom, P. v. R. Schleyer, and J. A. Pople, Ab Initio Molecular Orbital Theory, Wiley, New York, 1986. 3. G. D. Carney, L. L. Sprandel, and C. W. Kern, Adv. Chem. Phys., 37, 305 (1978). Variational Approaches to Vibration-Rotation Spectroscopy for Polyatomic Molecules. 4. S. Carter and N. C. Handy, Comput. Phys. Rep., 5, 115 (1986). The Variational Method for the Calculation of Ro-Vibrational Energy Levels. 5. Z. Bacic and J. C. Light, Annu. Rev. Phys. Chem., 40, 469 (1989). Theoretical Methods for Rovibrational States of Floppy Molecules. 6. D. G. Truhlar, Ed., Resonances in Electron-Molecule Scattering, van der Waals Complexes, and Reactive Chemical Dynamics, ACS, Washington, D.C., 1984.
7. J. Z. H. Zhang, Theory and Application of Quantum Molecular Dynamics, World Scientific, Singapore, 1999. 8. R. Schinke, Photodissociation Dynamics, Cambridge University Press, Cambridge, United Kingdom, 1993. 9. R. Kosloff, in Dynamics of Molecular and Chemical Reactions, R. E. Wyatt and J. Z. H. Zhang, Eds., Marcel Dekker, New York, 1996, pp. 185–230. Quantum Molecular Dynamics on Grids. 10. J. C. Light and T. Carrington Jr., Adv. Chem. Phys., 114, 263 (2000). Discrete-Variable Representations and Their Utilization. 11. D. Feller and E. R. Davidson, in Reviews in Computational Chemistry, Vol. 1, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, Weinheim, 1990, pp. 1–43. Basis Sets for Ab Initio Molecular Orbital Calculations and Intermolecular Interactions. 12. M. J. Bramley and T. Carrington Jr., J. Chem. Phys., 99, 8519 (1993). A General Discrete Variable Method to Calculate Vibrational Energy Levels of Three- and Four-Atom Molecules. 13. M. J. Bramley and T. Carrington Jr., J. Chem. Phys., 101, 8494 (1994). Calculation of Triatomic Vibrational Eigenstates: Product or Contracted Basis Sets, Lanczos or Conventional Eigensolvers? What Is the Most Efficient Combination? 14. G. C. Corey, J. W. Tromp, and D. Lemoine, in Numerical Grid Methods and Their Applications to Schroedinger's Equation, C. Cerjan, Ed., Kluwer, Dordrecht, The Netherlands, 1993, pp. 1–23. Fast Pseudospectral Algorithm Curvilinear Coordinates. 15. M. J. Bramley, J. W. Tromp, T. Carrington Jr., and G. C. Corey, J. Chem. Phys., 100, 6175 (1994). Efficient Calculation of Highly Excited Vibrational Energy Levels of Floppy Molecules: The Band Origins of H3+ up to 35000 cm−1. 16. G. Czako, V. Szalay, A. G. Csaszar, and T. Furtenbacher, J. Chem. Phys., 122, 024101 (2005). Treating Singularities Present in the Sutcliffe-Tennyson Vibrational Hamiltonian in Orthogonal Internal Coordinates. 17. J. K. L. MacDonald, Phys. Rev., 48, 830 (1933). Successive Approximations by the Rayleigh-Ritz Variation Method. 18. G.
H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed., The Johns Hopkins University Press, Baltimore, 1996. 19. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes, 2nd ed, Cambridge University Press, Cambridge, United Kingdom, 1992. 20. B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, New Jersey, 1980. 21. J. M. Bowman and B. Gazdy, J. Chem. Phys., 94, 454 (1991). A Truncation/Recoupling Method for Basis Set Calculations of Eigenvalues and Eigenvectors. 22. S. E. Choi and J. C. Light, J. Chem. Phys., 97, 7031–7054 (1992). Highly Excited Vibrational Eigenstates of Nonlinear Triatomic Molecules. Application to H2O. 23. C. Ochsenfeld, J. Kussmann, and D. S. Lambrecht, in Reviews in Computational Chemistry, Vol. 23, K. B. Lipkowitz and T. R. Cundari, Eds., Wiley, New York, 2006. Linear Scaling Methods in Quantum Chemistry. 24. Y. Saad, Numerical Methods for Large Eigenvalue Problems, Manchester University Press, Manchester, United Kingdom, 1992. 25. C. Lanczos, J. Res. Natl. Bur. Stand., 45, 255 (1950). An Iteration Method for the Solution of the Eigenvalue Problem of Linear Differential and Integral Operators. 26. C. C. Paige, J. Inst. Math. Appl., 10, 373 (1972). Computational Variants of the Lanczos Method for the Eigenproblem. 27. J. K. Cullum and R. A. Willoughby, Lanczos Algorithms for Large Symmetric Eigenvalue Computations, Birkhauser, Boston, 1985. 28. M. R. Hestenes and E. L. Steifel, J. Res. Natl. Bur. Stand., 49, 409 (1952). Methods of Conjugate Gradients for Solving Linear Systems.
Recursive Solutions to Large Eigenproblems
29. C. C. Paige and M. A. Saunders, SIAM J. Numer. Anal., 12, 617 (1975). Solution of Sparse Indefinite Systems of Linear Equations.
30. Y. Saad and M. H. Schultz, SIAM J. Sci. Stat. Comput., 7, 856 (1986). GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems.
31. R. W. Freund and N. M. Nachtigal, Numer. Math., 60, 315 (1991). QMR: A Quasi-Minimal Residual Method for Non-Hermitian Linear Systems.
32. R. W. Freund, SIAM J. Sci. Stat. Comput., 13, 425 (1992). Conjugate Gradient-Type Methods for Linear Systems with Complex Symmetric Coefficient Matrices.
33. R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. van der Vorst, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, SIAM, Philadelphia, 1994.
34. J. H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, 1965.
35. C. C. Paige, J. Inst. Math. Appl., 18, 341 (1976). Error Analysis of the Lanczos Algorithm for Tridiagonalizing a Symmetric Matrix.
36. C. C. Paige, Linear Algebra and App., 34, 235 (1980). Accuracy and Effectiveness of the Lanczos Algorithm for the Symmetric Eigenproblem.
37. J. Cullum and R. A. Willoughby, J. Comput. Phys., 44, 329 (1981). Computing Eigenvalues of Very Large Symmetric Matrices – An Implementation of a Lanczos Algorithm with no Reorthogonalization.
38. R. Chen and H. Guo, J. Chem. Phys., 111, 9944 (1999). A Single Lanczos Propagation Method for Calculating Transition Amplitudes.
39. N. M. Poulin, M. J. Bramley, T. Carrington Jr., H. G. Kjaergaard, and B. R. Henry, J. Chem. Phys., 104, 7807 (1996). Calculation of Vibrational (J=0) Excitation Energies and Band Intensities of Formaldehyde Using the Recursive Residue Generation Method.
40. R. Chen and H. Guo, Chem. Phys. Lett., 277, 191 (1997). Benchmark Calculations of Bound States of HO2 via Basic Lanczos Algorithm.
41. R. Chen and H. Guo, Chem. Phys. Lett., 369, 650 (2003). Effect of Spectral Range on Convergence in Lanczos Algorithm: A Numerical Approach.
42. R. Chen and H. Guo, J. Chem. Phys., 119, 5762 (2003). On the Convergence Scaling Laws of Lanczos and Chebyshev Recursion Methods.
43. A. Nauts and R. E. Wyatt, Phys. Rev. Lett., 51, 2238 (1983). New Approach to Many-State Quantum Dynamics: The Recursive-Residue-Generation Method.
44. A. Nauts and R. E. Wyatt, Phys. Rev. A, 30, 872 (1984). Theory of Laser-Molecule Interaction: The Recursive-Residue-Generation Method.
45. R. A. Friesner and R. E. Wyatt, J. Chem. Phys., 82, 1973 (1985). Quantum Statistical Mechanics via the Recursive Residue Generation Method.
46. R. E. Wyatt and D. S. Scott, in Large Scale Eigenvalue Problems, J. Cullum and R. A. Willoughby, Eds., North Holland, Amsterdam, 1986, pp. 67–79. Quantum Dynamics with the Recursive Residue Generation Method: Improved Algorithm for Chain Propagation.
47. N. Moiseyev, R. A. Friesner, and R. E. Wyatt, J. Chem. Phys., 85, 331 (1986). Natural Expansion of Vibrational Wave Functions: RRGM with Residue Algebra.
48. A. McNichols and T. Carrington Jr., Chem. Phys. Lett., 202, 464 (1993). Vibrational Energy Levels of Formaldehyde Calculated from an Internal Coordinate Hamiltonian Using the Lanczos Algorithm.
49. G. Charron and T. Carrington Jr., Molec. Phys., 79, 13 (1993). A Fourier-Lanczos Method for Calculating Energy Levels without Storing or Calculating Matrices.
50. H. Koeppel, W. Domcke, and L. S. Cederbaum, Adv. Chem. Phys., 57, 59 (1984). Multimode Molecular Dynamics Beyond the Born-Oppenheimer Approximation.
51. K. F. Milfeld and N. Moiseyev, Chem. Phys. Lett., 130, 145 (1986). Complex Resonance Eigenvalues by the Lanczos Recursion Method.
References
52. C. Iung and C. Leforestier, J. Chem. Phys., 90, 3198 (1989). Accurate Determination of a Potential Energy Surface for CD3H.
53. G. C. Groenenboom and H. M. Buck, J. Chem. Phys., 92, 4374 (1990). Solving the Discretized Time-Independent Schrödinger Equation with the Lanczos Procedure.
54. F. LeQuere and C. Leforestier, J. Chem. Phys., 94, 1118 (1991). Quantum Exact 3D Study of the Photodissociation of Ozone Molecule.
55. S. Dallwig, N. Fahrer, and C. Schlier, Chem. Phys. Lett., 191, 69 (1992). The Combination of Complex Scaling and the Lanczos Algorithm.
56. R. E. Wyatt, Adv. Chem. Phys., 73, 231 (1989). The Recursive Residue Generation Method.
57. R. E. Wyatt and C. Iung, in Dynamics of Molecular and Chemical Reactions, R. E. Wyatt and J. Z. H. Zhang, Eds., Marcel Dekker, New York, 1996. Quantum Mechanical Studies of Molecular Spectra and Dynamics.
58. G. Nyman and H.-G. Yu, J. Comput. Methods Sci. Eng., 1, 229 (2001). Iterative Diagonalization of a Large Sparse Matrix Using Spectral Transform and Filter-Diagonalization.
59. H. Guo, R. Chen, and D. Xie, J. Theor. Comput. Chem., 1, 173 (2002). Calculation of Transition Amplitudes with a Single Lanczos Propagation.
60. T. Carrington Jr., Can. J. Chem., 82, 900 (2004). Methods for Calculating Vibrational Energy Levels.
61. D. C. Sorensen, SIAM J. Matrix Anal. Appl., 13, 357 (1992). Implicit Application of Polynomial Filters in a K-Step Arnoldi Method.
62. S.-W. Huang and T. Carrington Jr., Appl. Num. Math., 37, 307 (2001). Calculating Interior Eigenvalues and Eigenvectors with an Implicitly Restarted and Filter Diagonalization Method.
63. R. B. Lehoucq, D. C. Sorensen, and C. Yang, ARPACK User Guide: Solution of Large Scale Eigenvalue Problems by Implicitly Restarted Arnoldi Methods, SIAM, Philadelphia, Pennsylvania, 1998.
64. P. Pendergast, Z. Darakjian, E. F. Hayes, and D. C. Sorensen, J. Comput. Phys., 113, 201 (1994). Scalable Algorithms for Three-Dimensional Reactive Scattering: Evaluation of a New Algorithm for Obtaining Surface Functions.
65. P. P. Korambath, X. T. Wu, and E. F. Hayes, J. Phys. Chem., 100, 6116 (1996). Enhanced Method for Determining Rovibrational Eigenstates of van der Waals Molecules.
66. R. B. Lehoucq, S. K. Gray, D.-H. Zhang, and J. C. Light, Comput. Phys. Commun., 109, 15 (1997). Vibrational Eigenstates of Four-Atom Molecules: A Parallel Strategy Employing the Implicitly Restarted Lanczos Method.
67. X. T. Wu and E. F. Hayes, J. Chem. Phys., 107, 2705 (1997). HO2 Rovibrational Eigenvalues Studies for Non-Zero Angular Momentum.
68. G. H. Golub and R. Underwood, in Mathematical Software III, J. R. Rice, Ed., Academic Press, New York, 1977, pp. 361–377. The Block Lanczos Method for Computing Eigenvalues.
69. T. Ericsson and A. Ruhe, Math. Comput., 35, 1251 (1980). The Spectral Transformation Lanczos Method for the Numerical Solution of Large Sparse Generalized Symmetric Eigenvalue Problems.
70. R. E. Wyatt, Phys. Rev. E, 51 (4), 3643 (1995). Matrix Spectroscopy: Computation of Interior Eigenstates of Large Matrices Using Layered Iteration.
71. C. Leforestier, K. Yamashita, and N. Moiseyev, J. Chem. Phys., 103, 8468 (1995). Transition State Resonances by Complex Scaling: A Three-Dimensional Study of ClHCl.
72. F. Webster, P. J. Rossky, and R. A. Friesner, Comput. Phys. Commun., 63, 494 (1991). Nonadiabatic Processes in Condensed Matter: Semi-classical Theory and Implementation.
73. H.-G. Yu and G. Nyman, Chem. Phys. Lett., 298, 27 (1998). A Spectral Transform Krylov Subspace Iteration Approach to Quantum Scattering.
74. H. Kono, Chem. Phys. Lett., 214, 137 (1993). Extraction of Eigenstates from an Optically Prepared State by a Time-Dependent Quantum Mechanical Method.
75. C. Iung and C. Leforestier, J. Chem. Phys., 102, 8453 (1995). Direct Calculation of Overtones: Application to the CD3H Molecule.
76. H.-G. Yu and G. Nyman, J. Chem. Phys., 110, 11133 (1999). A Spectral Transform Minimum Residual Filter Diagonalization Method for Interior Eigenvalues of Physical Systems.
77. H.-G. Yu and S. C. Smith, Ber. Bunsenges. Phys. Chem., 101, 400 (1997). The Calculation of Vibrational Eigenstates by MINRES Filter Diagonalization.
78. S.-W. Huang and T. Carrington Jr., J. Chem. Phys., 114, 6485 (2001). Using the Symmetric Quasiminimal Residuals Method to Accelerate an Inexact Spectral Transform Calculation of Energy Levels and Wave Functions.
79. R. Kosloff and H. Tal-Ezer, Chem. Phys. Lett., 127, 223 (1986). A Direct Relaxation Method for Calculating Eigenfunctions and Eigenvalues of the Schroedinger Equation on a Grid.
80. P.-N. Roy and T. Carrington Jr., J. Chem. Phys., 103, 5600 (1995). An Evaluation of Methods Designed to Calculate Energy Levels in Selected Range and Application to a (One-Dimensional) Morse Oscillator and (Three-Dimensional) HCN/HNC.
81. H.-G. Yu and G. Nyman, J. Chem. Phys., 110, 7233 (1999). A Four Dimensional Quantum Scattering Study of the Cl + CH4 = HCl + CH3 Reaction via Spectral Transform Iteration.
82. S.-W. Huang and T. Carrington Jr., J. Chem. Phys., 112, 8765 (2000). A New Iterative Method for Calculating Energy Levels and Wave Functions.
83. E. R. Davidson, J. Comput. Phys., 17, 87 (1975). The Iterative Calculation of a Few of the Lowest Eigenvalues and Corresponding Eigenvectors of Large Real Symmetric Matrices.
84. R. B. Morgan and D. C. Scott, SIAM J. Sci. Stat. Comput., 7, 817 (1986). Generalizations of Davidson Method for Computing Eigenvalues of Sparse Symmetrical Matrices.
85. E. R. Davidson, Comput. Phys. Commun., 53, 49 (1989). Super-Matrix Methods.
86. C. Murray, S. Racine, and D. F. Davidson, J. Comput. Chem., 103, 382 (1993). Improved Algorithms for the Lowest Few Eigenvalues and Associated Eigenvectors of Large Matrices.
87. M. Aoyagi and S. K. Gray, J. Chem. Phys., 94, 195 (1991). Rotation-Vibration Interactions in Formaldehyde: Results for Low Vibrational Excitations.
88. G. G. Balint-Kurti and P. Pulay, J. Molec. Struct. (THEOCHEM), 341, 1 (1995). A New Grid-Based Method for the Direct Computation of Excited Molecular Vibrational States: Test Application to Formaldehyde.
89. F. Ribeiro, C. Iung, and C. Leforestier, Chem. Phys. Lett., 362, 199 (2002). Calculation of Highly Excited Vibrational Levels: A Prediagonalized Davidson Scheme.
90. C. Iung and F. Ribeiro, J. Chem. Phys., 123, 174105 (2005). Calculation of Specific, Highly Excited Vibrational States Based on a Davidson Scheme: Application to HFCO.
91. B. Poirier and T. Carrington Jr., J. Chem. Phys., 114, 9254 (2001). Accelerating the Calculation of Energy Levels and Wave Functions Using an Efficient Preconditioner with the Inexact Spectral Transform Method.
92. B. Poirier and T. Carrington Jr., J. Chem. Phys., 116, 1215 (2002). A Preconditioned Inexact Spectral Transform Method for Calculating Resonance Energies and Widths, as Applied to HCO.
93. H. O. Karlsson, J. Chem. Phys., 103, 4914 (1995). The Quasi-Minimal Residual Algorithm Applied to Complex Symmetric Linear Systems in Quantum Reactive Scattering.
94. U. Peskin, W. H. Miller, and A. Edlund, J. Chem. Phys., 102, 10030 (1995). Quantum Time Evolution in Time-Dependent Fields and Time-Independent Reactive-Scattering Calculations via an Efficient Fourier Grid Preconditioner.
95. H. Zhang and S. C. Smith, J. Chem. Phys., 115, 5751 (2001). Calculation of Product State Distributions from Resonance Decay via Lanczos Subspace Filter Diagonalization: Application to HO2.
96. H. O. Karlsson and S. Holmgren, J. Chem. Phys., 117, 9116 (2002). Cross Correlation Functions Cnm(E) via Lanczos Algorithms without Diagonalization.
97. H. O. Karlsson, J. Theor. Comput. Chem., 2, 523 (2003). Lanczos Algorithms and Cross-Correlation Functions Cif(E).
98. G. H. Golub and J. H. Welsch, Math. Comput., 23, 221 (1969). Calculation of Gauss Quadrature Rules.
99. G. Yao and R. E. Wyatt, Chem. Phys. Lett., 239, 207 (1995). A Krylov-Subspace Chebyshev Method and Its Application to Pulsed Laser-Molecule Interaction.
100. R. Chen and H. Guo, J. Chem. Phys., 114, 1467 (2001). A Single Lanczos Propagation Method for Calculating Transition Amplitudes. II. Modified QL and Symmetry Adaptation.
101. V. A. Mandelshtam, J. Chem. Phys., 108, 9999 (1998). Harmonic Inversion of Time Cross-Correlation Functions. The Optimal Way to Perform Quantum or Semiclassical Dynamics Calculations.
102. S. Li, G. Li, and H. Guo, J. Chem. Phys., 115, 9637 (2001). A Single Lanczos Propagation Method for Calculating Transition Amplitudes. III. S-Matrix Elements with a Complex-Symmetric Hamiltonian.
103. D. Xu, R. Chen, and H. Guo, J. Chem. Phys., 118, 7273 (2003). Probing Highly Excited Vibrational Eigenfunctions Using a Modified Single Lanczos Method: Application to Acetylene (HCCH).
104. M. Alacid and C. Leforestier, Internat. J. Quantum Chem., 68, 317 (1998). Direct Calculation of Long Time Correlation Functions Using an Optical Potential.
105. J.-P. Brunet, R. A. Friesner, R. E. Wyatt, and C. Leforestier, Chem. Phys. Lett., 153, 425 (1988). Theoretical Study of the IR Absorption Spectrum of HCN.
106. R. A. Friesner, J. A. Bentley, M. Menou, and C. Leforestier, J. Chem. Phys., 99, 324 (1993). Adiabatic Pseudospectral Methods for Multidimensional Vibrational Potential.
107. D. Xu, H. Guo, and D. Xie, J. Theor. Comput. Chem., 2, 639 (2003). Theoretical Studies of A 1A'' → X 1A' Resonance Emission Spectra of HCN/DCN Using Single Lanczos Propagation Method.
108. R. E. Wyatt, C. Iung, and C. Leforestier, J. Chem. Phys., 97, 3458 (1992). Quantum Dynamics of Overtone Relaxation in Benzene, I. 5 and 9 Mode Models for Relaxation from CH(v=3).
109. R. E. Wyatt, C. Iung, and C. Leforestier, J. Chem. Phys., 97, 3477 (1992). Quantum Dynamics of Overtone Relaxation in Benzene, II. 16 Mode Models for Relaxation from CH(v=3).
110. R. E. Wyatt and C. Iung, J. Chem. Phys., 98, 3577 (1993). Quantum Dynamics of Overtone Relaxation in Benzene: IV. Relaxation from CH(v=4).
111. R. E. Wyatt and C. Iung, J. Chem. Phys., 98, 6758 (1993). Quantum Dynamics of Overtone Relaxation in Benzene: V. CH(v=3) Dynamics Computed with a New Ab Initio Force Field.
112. R. E. Wyatt and C. Iung, J. Chem. Phys., 98, 5191 (1993). Quantum Dynamics of Overtone Relaxation in Benzene. III. Spectra and Dynamics for Relaxation from CH(v=3).
113. S. A. Schofield, P. G. Wolynes, and R. E. Wyatt, Phys. Rev. Lett., 74, 3720 (1995). Computational Study of Many-Dimensional Quantum Energy Flow: From Action Diffusion to Localization.
114. S. A. Schofield, P. G. Wolynes, and R. E. Wyatt, J. Chem. Phys., 105, 940 (1996). Computational Study of Many-Dimensional Quantum Vibrational Energy Redistribution. I. Statistics of the Survival Probability.
115. R. E. Wyatt, J. Chem. Phys., 109, 10732 (1998). Quantum Mechanical Study of the CH(v=2) Overtone in 30-Mode Benzene.
116. C. Iung and C. Leforestier, J. Chem. Phys., 90, 3198 (1989). Accurate Determination of a Potential Energy Surface for CD3H.
117. G. Li and H. Guo, J. Molec. Spectrosc., 210, 90 (2001). The Vibrational Level Spectrum of H2O(X1A’) from the Partridge-Schwenke Potential up to Dissociation Limit.
118. R. E. Wyatt, Chem. Phys. Lett., 121, 301 (1985). Direct Computation of Quantal Rate Constants: Recursive Development of the Flux Autocorrelation Function.
119. H. O. Karlsson and O. Goscinski, J. Phys. Chem. A, 105, 2599 (2001). Correlation Functions and Thermal Rate Constants.
120. D. Xu, D. Xie, and H. Guo, J. Chem. Phys., 116, 10626 (2002). Theoretical Study of Predissociation Dynamics of HCN/DCN in Their First Absorption Bands.
121. R. Chen and H. Guo, Chem. Phys. Lett., 308, 123–130 (1999). A Low-Storage Filter-Diagonalization Method to Calculate Expectation Values of Operators Non-Commutative to the Hamiltonian. Vibrational Assignment of HOCl.
122. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1970.
123. H. Tal-Ezer and R. Kosloff, J. Chem. Phys., 81, 3967 (1984). An Accurate and Efficient Scheme for Propagating the Time Dependent Schroedinger Equation.
124. Y. Huang, W. Zhu, D. J. Kouri, and D. K. Hoffman, Chem. Phys. Lett., 206, 96 (1993). A General Time-to-Energy Transform of Wavepackets. Time-Independent Wavepacket-Schroedinger and Wavepacket-Lippmann-Schwinger Equations.
125. Y. Huang, W. Zhu, D. J. Kouri, and D. K. Hoffman, Chem. Phys. Lett., 214, 451 (1993). Analytical Continuation of the Polynomial Representation of the Full, Interacting Time-Independent Green Function.
126. Y. Huang, D. J. Kouri, and D. K. Hoffman, Chem. Phys. Lett., 225, 37 (1994). A General, Energy-Separable Polynomial Representation of the Time-Independent Full Green Operator with Application to Time-Independent Wavepacket Forms of Schrodinger and Lippmann-Schwinger Equations.
127. Y. Huang, D. J. Kouri, and D. K. Hoffman, J. Chem. Phys., 101, 10493 (1994). General, Energy-Separable Faber Polynomial Representation of Operator Functions: Theory and Application in Quantum Scattering.
128. W. Zhu, Y. Huang, D. J. Kouri, C. Chandler, and D. K. Hoffman, Chem. Phys. Lett., 217, 73 (1994). Orthogonal Polynomial Expansion of the Spectral Density Operator and the Calculation of Bound State Energies and Eigenfunctions.
129. D. J. Kouri, W. Zhu, G. A. Parker, and D. K. Hoffman, Chem. Phys. Lett., 238, 395 (1995). Acceleration of Convergence in the Polynomial-Expanded Spectral Density Approach to Bound and Resonance State Calculations.
130. J. P. Boyd, Chebyshev and Fourier Spectral Methods, Springer-Verlag, Berlin, 1989.
131. C. Lanczos, Applied Analysis, Prentice Hall, Englewood Cliffs, New Jersey, 1956.
132. R. Chen and H. Guo, J. Chem. Phys., 105, 3569 (1996). Evolution of Quantum System in Order Domain of Chebychev Operator.
133. V. A. Mandelshtam, in Multiparticle Quantum Scattering with Applications to Nuclear, Atomic and Molecular Physics, D. G. Truhlar and B. Simon, Eds., Springer, New York, 1996, pp. 389–402. Global Recursion Polynomial Expansions of the Green’s Function and Time Evolution Operator for the Schrödinger Equation with Absorbing Boundary Conditions.
134. S. K. Gray and G. G. Balint-Kurti, J. Chem. Phys., 108, 950 (1998). Quantum Dynamics with Real Wavepackets, Including Application to Three-Dimensional (J = 0) D + H2 → HD + H Reactive Scattering.
135. R. Chen and H. Guo, J. Chem. Phys., 108, 6068 (1998). Discrete Energy Representation and Generalized Propagation of Physical Systems.
136. R. Chen and H. Guo, Comput. Phys. Commun., 119, 19 (1999). The Chebyshev Propagator for Quantum Systems.
137. G. Jolicard and E. J. Austin, Chem. Phys., 103, 295 (1986). Optical Potential Method of Calculating Resonance Energies and Widths.
138. D. Neuhauser and M. Baer, J. Chem. Phys., 90, 4351 (1989). The Time-Dependent Schrödinger Equation: Application of Absorbing Boundary Conditions.
139. T. Seideman and W. H. Miller, J. Chem. Phys., 96, 4412 (1992). Calculation of the Cumulative Reaction Probability via a Discrete Variable Representation with Absorbing Boundary Conditions.
140. V. A. Mandelshtam and H. S. Taylor, J. Chem. Phys., 103 (8), 2903 (1995). A Simple Recursion Polynomial Expansion of the Green’s Function with Absorbing Boundary Conditions. Application to the Reactive Scattering.
141. V. A. Mandelshtam and H. S. Taylor, J. Chem. Phys., 102, 7390 (1995). Spectral Projection Approach to the Quantum Scattering Calculations.
142. Y. Huang, S. S. Iyengar, D. J. Kouri, and D. K. Hoffman, J. Chem. Phys., 105, 927 (1996). Further Analysis of Solutions to the Time-Independent Wave Packet Equations of Quantum Dynamics II. Scattering as a Continuous Function of Energy Using Finite, Discrete Approximate Hamiltonians.
143. R. Chen and H. Guo, Chem. Phys. Lett., 261, 605 (1996). Extraction of Resonances via Wave Packet Propagation in Chebyshev Order Domain: Collinear H + H2 Scattering.
144. H.-G. Yu and S. C. Smith, J. Chem. Phys., 107, 9985 (1997). The Simulation of Outgoing-Wave Boundary Conditions via a Symmetrically Damped, Hermitian Hamiltonian Operator.
145. M. D. Feit, J. A. Fleck, and A. Steger, J. Comput. Phys., 47, 412 (1982). Solution of the Schroedinger Equation by a Spectral Method.
146. R. Kosloff, J. Phys. Chem., 92, 2087 (1988). Time-Dependent Quantum-Mechanical Methods for Molecular Dynamics.
147. M. R. Wall and D. Neuhauser, J. Chem. Phys., 102, 8011 (1995). Extraction, through Filter-Diagonalization, of General Quantum Eigenvalues or Classical Normal Mode Frequencies from a Small Number of Residues or a Short-Time Segment of a Signal. I. Theory and Application to a Quantum-Dynamics Model.
148. R. Chen and H. Guo, J. Chem. Phys., 105, 1311 (1996). A General and Efficient Filter-Diagonalization Method without Time Propagation.
149. B. Hartke, R. Kosloff, and S. Ruhman, Chem. Phys. Lett., 158, 238 (1989). Large Amplitude Ground State Vibrational Coherence Induced by Impulsive Absorption in CsI. A Computer Simulation.
150. D. Neuhauser, J. Chem. Phys., 93, 2611 (1990). Bound State Eigenfunctions from Wave Packets: Time → Energy Resolution.
151. D. Neuhauser, J. Chem. Phys., 95, 4927 (1991). Time-Dependent Reactive Scattering in the Presence of Narrow Resonances: Avoiding Long Propagation Times.
152. D. Neuhauser, J. Chem. Phys., 100, 5076 (1994). Circumventing the Heisenberg Principle: A Rigorous Demonstration of Filter-Diagonalization on a LiCN Model.
153. T. P. Grozdanov, V. A. Mandelshtam, and H. S. Taylor, J. Chem. Phys., 103, 7990 (1995). Recursion Polynomial Expansion of the Green’s Function with Absorbing Boundary Conditions: Calculations of Resonances of HCO by Filter Diagonalization.
154. V. A. Mandelshtam and H. S. Taylor, J. Chem. Phys., 106, 5085 (1997). A Low-Storage Filter Diagonalization Method for Quantum Eigenenergy Calculation or for Spectral Analysis of Time Signals.
155. V. A. Mandelshtam and H. S. Taylor, Phys. Rev. Lett., 78, 3274 (1997). Spectral Analysis of Time Correlation Function for a Dissipative Dynamical System Using Filter Diagonalization: Application to Calculation of Unimolecular Decay Rates.
156. V. A. Mandelshtam and H. S. Taylor, J. Chem. Phys., 107, 6756 (1997). Harmonic Inversion of Time Signals and Its Applications.
157. R. Chen and H. Guo, J. Comput. Phys., 136, 494 (1997). Determination of Eigenstates via Lanczos Based Forward Substitution and Filter-Diagonalization.
158. R. Chen and H. Guo, Chem. Phys. Lett., 279, 252 (1997). Calculation of Matrix Elements in Filter Diagonalization: A Generalized Method Based on Fourier Transform.
159. H.-G. Yu and S. C. Smith, Chem. Phys. Lett., 283, 69 (1998). Calculation of Quantum Resonance Energies and Lifetimes via Quasi-Minimum Residual Filter Diagonalization.
160. M. H. Beck and H.-D. Meyer, J. Chem. Phys., 109, 3730 (1998). Extracting Accurate Bound-State Spectra from Approximate Wave Packet Propagation Using the Filter-Diagonalization Method.
161. M. Gluck, H. J. Korsch, and N. Moiseyev, Phys. Rev. E, 58, 376 (1998). Selective Quasienergies from Short Time Cross-Correlation Probability Amplitudes by the Filter-Diagonalization Method.
162. R. Chen and H. Guo, J. Chem. Phys., 111, 464 (1999). Efficient Calculation of Matrix Elements in Low Storage Filter Diagonalization.
163. M. Alacid, C. Leforestier, and N. Moiseyev, Chem. Phys. Lett., 305, 258 (1999). Bound and Resonance States by a Time-Independent Filter Diagonalization Method for Large Hamiltonian Matrices.
164. H. Zhang and S. C. Smith, Phys. Chem. Chem. Phys., 3, 2282 (2001). Lanczos Subspace Filter Diagonalization: Homogeneous Recursive Filtering and a Low-Storage Method for the Calculation of Matrix Elements.
165. D. Neuhauser, in Highly Excited Molecules, A. S. Mullin and G. C. Schatz, Eds., American Chemical Society, Washington DC, 1997, pp. 26–38. A General Approach for Calculating High-Energy Eigenstates and Eigenfunctions and for Extracting Frequencies from a General Signal.
166. V. A. Mandelshtam, Progress NMR Spectrosc., 38, 159 (2001). FDM: The Filter Diagonalization Method for Data Processing in NMR Experiments.
167. T. Takatsuka and N. Hashimoto, J. Chem. Phys., 103, 6057 (1995). A Novel Method to Calculate Eigenfunctions and Eigenvalues in a Given Energy Range.
168. A. Vijay, J. Chem. Phys., 118, 1007 (2003). A Lorentzian Function Based Spectral Filter for Calculating the Energy of Excited Bound States in Quantum Mechanics.
169. B. S. Garbow, J. M. Boyle, J. J. Dongarra, and C. B. Moler, Matrix Eigensystem Routines EISPACK Guide Extension, Springer-Verlag, New York, 1977.
170. J. W. Pang, T. Dieckman, J. Feigon, and D. Neuhauser, J. Chem. Phys., 108, 8360 (1998). Extraction of Spectral Information from a Short-Time Signal Using Filter-Diagonalization: Recent Developments and Applications to Semiclassical Reaction Dynamics and Nuclear Magnetic Resonance Signals.
171. V. A. Mandelshtam, J. Theor. Comput. Chem., 2, 497 (2003). On Harmonic Inversion of Cross-Correlation Functions by the Filter Diagonalization Method.
172. J. W. Pang and D. Neuhauser, Chem. Phys. Lett., 252, 173 (1996). Application of Generalized Filter-Diagonalization to Extract Instantaneous Normal Modes.
173. J. Main, V. A. Mandelshtam, and H. S. Taylor, Phys. Rev. Lett., 79, 825 (1997). Periodic Orbit Quantization by Harmonic Inversion of Gutzwiller’s Recurrence Function.
174. F. Grossmann, V. A. Mandelshtam, H. S. Taylor, and J. S. Briggs, Chem. Phys. Lett., 279, 355 (1997). Harmonic Inversion of Semiclassical Short Time Signals.
175. V. A. Mandelshtam and M. Ovchinnikov, J. Chem. Phys., 108, 9206 (1998). Extraction of Tunneling Splitting from a Real Time Semiclassical Propagation.
176. V. A. Mandelshtam and H. S. Taylor, J. Chem. Phys., 108, 9970 (1998). Multidimensional Harmonic Inversion by Filter Diagonalization.
177. H. Hu, Q. N. Van, V. A. Mandelshtam, and A. J. Shaka, J. Magn. Reson., 134, 76 (1998). Reference Deconvolution, Phase Correction and Line Listing of NMR Spectra by the 1D Filter Diagonalization Method.
178. J. Chen and V. A. Mandelshtam, J. Chem. Phys., 112, 4429 (2000). Multiscale Filter Diagonalization Method for Spectral Analysis of Noisy Data with Nonlocalized Features.
179. D. Belkic, P. A. Dando, J. Main, and H. S. Taylor, J. Chem. Phys., 113, 6542 (2000). Three Novel High-Resolution Nonlinear Methods for Fast Signal Processing.
180. V. A. Mandelshtam, J. Phys. Chem. A, 105, 2764 (2001). The Regularized Resolvent Transform for Quantum Dynamics Calculations.
181. S. C. Smith, Faraday Disc. Chem. Soc., 102, 17 (1995). Towards Quantum Mechanical Characterization of the Dissociation Dynamics of Ketene.
182. H.-G. Yu and S. C. Smith, J. Chem. Soc., Faraday Trans., 93, 861 (1997). Restarted Krylov-Space Spectral Filtering.
183. H.-G. Yu and S. C. Smith, J. Comput. Phys., 143, 484 (1998). The Elimination of Lanczos Ghosting Effects by MINRES Filter Diagonalization.
184. R. W. Freund, M. H. Gutknecht, and N. M. Nachtigal, SIAM J. Sci. Comput., 14, 137 (1993). An Implementation of the Look-Ahead Lanczos Algorithm for Non-Hermitian Matrices.
185. B. Poirier and W. H. Miller, Chem. Phys. Lett., 265, 77 (1997). Optimized Preconditioners for Green Function Evaluation in Quantum Reactive Scattering Calculations.
186. P. R. Bunker, Molecular Symmetry and Spectroscopy, Academic Press, New York, 1979.
187. R. M. Whitnell and J. C. Light, J. Chem. Phys., 89, 3674 (1988). Symmetry-Adapted Discrete Variable Representation.
188. M. S. Child and L. Halonen, Adv. Chem. Phys., 57, 1–58 (1984). Overtone Frequencies and Intensities in the Local Mode Picture.
189. Y. Shi and D. J. Tannor, J. Chem. Phys., 92, 2517 (1990). Symmetry Adapted Fourier Solution of the Time-Dependent Schrödinger Equation.
190. J. A. Bentley, R. E. Wyatt, M. Menou, and C. Leforestier, J. Chem. Phys., 97, 4255 (1992). A Finite Base-Discrete Variable Representation Calculation of Vibrational Levels of Planar Acetylene.
191. L. Liu and J. T. Muckerman, J. Chem. Phys., 107, 3402 (1997). Vibrational Eigenvalues and Eigenfunctions for Planar Acetylene by Wavepacket Propagation, and Its Mode Selective Infrared Excitation.
192. R. Chen and H. Guo, Phys. Rev. E, 57, 7288 (1998). Symmetry Enhanced Spectral Analysis via Spectral Method and Filter-Diagonalization.
193. R. Chen, H. Guo, L. Liu, and J. T. Muckerman, J. Chem. Phys., 109, 7128 (1998). Symmetry-Adapted Filter-Diagonalization: Calculation of Vibrational Spectrum of Planar Acetylene from Correlation Functions.
194. G. Ma, R. Chen, and H. Guo, J. Chem. Phys., 110, 8408 (1999). Quantum Calculations of Highly Excited Vibrational Spectrum of Sulfur Dioxide. I. Eigenenergies and Assignments up to 15000 cm-1.
195. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 114, 1473 (2001). A Symmetry Adapted Lanczos Method for Calculating Energy Levels with Different Symmetries from a Single Sequence of Iterations.
196. R. Chen, G. Ma, and H. Guo, Chem. Phys. Lett., 320, 567 (2000). Full-Dimensional Calculation of Vibrational Spectrum of Hydrogen Peroxide (HOOH).
197. R. Chen, G. Ma, and H. Guo, J. Chem. Phys., 114, 4763 (2001). Six-Dimensional Quantum Calculation of Highly Excited Vibrational Energy Levels of Hydrogen Peroxide and Its Deuterated Isotopomers.
198. D. Xu, G. Li, D. Xie, and H. Guo, Chem. Phys. Lett., 365, 480 (2002). Full-Dimensional Quantum Calculations of Vibrational Energy Levels of Acetylene (HCCH) up to 13000 cm-1.
199. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 115, 9781 (2001). Six-Dimensional Variational Calculation of the Bending Energy Levels of HF Trimer and DF Trimer.
200. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 118, 6946 (2003). A Finite Basis Representation Lanczos Calculation of the Bend Energy Levels of Methane.
201. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 119, 101 (2003). A Contracted Basis-Lanczos Calculation of Vibrational Levels of Methane: Solving the Schrödinger Equation in Nine Dimensions.
202. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 119, 94 (2003). Using C3v Symmetry with Polyspherical Coordinates for Methane.
203. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 121, 2937 (2004). A Finite Basis Representation Lanczos Calculation of the Bend Energy Levels of Methane.
204. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 123, 154303 (2005). Improving the Calculation of Rovibrational Spectra of Five-Atom Molecules with Three Identical Atoms by Using a C3(G6) Symmetry-Adapted Grid: Applied to CH3D and CHD3.
205. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 123, 034301 (2005). Theoretical and Experimental Studies of the Infrared Rovibrational Spectrum of He2-N2O.
206. R. Chen and H. Guo, J. Chem. Phys., 110, 2771–2777 (1999). Extended Symmetry-Adapted Discrete Variable Representation and Accelerated Calculation of Hψ.
207. G. Moro and J. H. Freed, J. Chem. Phys., 74, 3757 (1981). Calculation of ESR Spectra and Related Fokker-Planck Forms by the Use of the Lanczos Algorithm.
208. W. P. Reinhardt, Annu. Rev. Phys. Chem., 33, 223 (1982). Complex Coordinates in the Theory of Atomic and Molecular Structure and Dynamics.
209. N. Moiseyev, Israel J. Chem., 31, 311 (1991). Resonances, Cross Sections and Partial Widths by the Complex Coordinate Method.
210. N. Moiseyev, P. R. Certain, and F. Weinhold, Molec. Phys., 36, 1613 (1978). Resonance Properties of Complex-Rotated Hamiltonians.
211. W. E. Arnoldi, Q. Appl. Math., 9, 17 (1951). The Principle of Minimized Iterations in the Solution of the Matrix Eigenvalue Problem.
212. J. Cullum and R. A. Willoughby, in Large Scale Eigenvalue Problems, J. Cullum and R. A. Willoughby, Eds., North Holland, Amsterdam, 1986, pp. 193–240. A Practical Procedure for Computing Eigenvalues of Large Sparse Nonsymmetric Matrices.
213. J. C. Tremblay and T. Carrington Jr., J. Chem. Phys., 122, 244107 (2005). Computing Resonance Energies, Widths, and Wave Functions Using a Lanczos Method in Real Arithmetic.
214. V. A. Mandelshtam and A. Neumaier, J. Theor. Comput. Chem., 1, 1 (2002). Further Generalization and Numerical Implementation of Pseudo-Time Schroedinger Equation for Quantum Scattering Calculations.
215. V. A. Mandelshtam and H. S. Taylor, J. Chem. Soc. Faraday Trans., 93, 847 (1997). The Quantum Resonance Spectrum of the H3+ Molecular Ions for J=0. An Accurate Calculation Using Filter Diagonalization.
216. E. Narevicius, D. Neuhauser, H. J. Korsch, and N. Moiseyev, Chem. Phys. Lett., 276, 250 (1997). Resonances from Short Time Complex-Scaled Cross-Correlation Probability Amplitudes by the Filter-Diagonalization Method.
217. G. Li and H. Guo, Chem. Phys. Lett., 336, 143 (2001). Doubling of Chebyshev Correlation Function for Calculating Narrow Resonances Using Low-Storage Filter Diagonalization.
218. G. Li and H. Guo, Chem. Phys. Lett., 347, 443 (2001). Efficient Calculation of Resonance Positions and Widths Using Doubled Chebyshev Autocorrelation Functions.
219. A. Neumaier and V. A. Mandelshtam, Phys. Rev. Lett., 86, 5031 (2001). Pseudo-Time Schrödinger Equation with Absorbing Potential for Quantum Scattering Calculations.
220. S. M. Auerbach and C. Leforestier, Comput. Phys. Commun., 78, 55 (1993). A New Computational Algorithm for Green’s Functions: Fourier Transform of the Newton Polynomial Expansion.
221. G. Ashkenazi, R. Kosloff, S. Ruhman, and H. Tal-Ezer, J. Chem. Phys., 103, 10005 (1995). Newtonian Propagation Methods Applied to the Photodissociation Dynamics of I3-.
222. X.-G. Hu, Phys. Rev. E, 59, 2471 (1999). Laguerre Scheme: Another Member for Propagating the Time-Dependent Schrödinger Equation.
223. A. Vijay, R. E. Wyatt, and G. D. Billing, J. Chem. Phys., 111, 10794 (1999). Time Propagation and Spectral Filters in Quantum Dynamics: A Hermite Polynomial Perspective.
224. A. J. Rasmussen, S. J. Jeffrey, and S. C. Smith, Chem. Phys. Lett., 336, 149 (2001). Subspace Wavepacket Evolution with Newton Polynomials.
225. R. Kosloff, Annu. Rev. Phys. Chem., 45, 145 (1994). Propagation Methods for Quantum Molecular Dynamics.
References
343
226. T. J. Park and J. C. Light, J. Chem. Phys., 85, 5870 (1986). Unitary Quantum Time Evolution by Iterative Lanczos Reduction. 227. U. Peskin and N. Moiseyev, J. Chem. Phys., 99, 4590 (1993). The Solution of the Time Dependent Schro¨dinger Equation by the (t,t0 ) Method: Theory, Computational Algorithm and Applications. 228. U. Peskin, R. Kosloff., and N. Moiseyev, J. Chem. Phys., 100, 8849 (1994). The Solution of the Time Dependent Schro¨dinger Equation by the (t,t) Method: The Use of Global Polynomial Propagators for Time Dependent Hamiltonians. 229. G. Yao and R. E. Wyatt, J. Chem. Phys., 101, 1904 (1994). Stationary Approaches for Solving the Schro¨dinger Equation with Time-Dependent Hamiltonians. 230. C. S. Guiang and R. E. Wyatt, Int. J. Quant. Chem., 67, 273 (1998). Quantum Dynamics with Lanczos Subspace Propagation: Application to a Laser-Driven Molecular System. 231. K. Blum, Density Matrix Theory and Applications, Plenum, New York, 1981. 232. M. Berman and R. Kosloff, Comput. Phys. Commun., 63, 1 (1991). Time-Dependent Solution of the Liouville-von Neumann Equation: Non-Dissipative Evolution. 233. M. Berman, R. Kosloff, and H. Tal-Ezer, J. Phys. A, 25, 1283 (1992). Solution of the TimeDependent Liouville-von Neumann Equation: Dissipative Evolution. 234. W. Huisinga, L. Pesce, R. Kosloff, and P. Saalfrank, J. Chem. Phys., 110, 5538 (1999). Faber and Newton Polynomial Integrators for Open-System Density Matrix Propagation. 235. W. T. Pollard and R. A. Friesner, J. Chem. Phys., 100, 5054 (1994). Solution of the Redfield Equation for the Dissipative Quantum Dynamics of Multilevel Systems. 236. R. S. Dumont, P. Hazendonk, and A. Bain, J. Chem. Phys., 113, 3270 (2000). Dual Lanczos Simulation of Dynamic Nuclear Magnetic Resonance Spectra for Systems with Many Spins or Exchange Sites. 237. R. S. Dumont, S. Jain, and A. Bain, J. Chem. Phys., 106, 5928 (1997). Simulation of ManySpin System Dynamics via Sparse Matrix Methodology. 238. H. Guo and R. Chen, J. Chem. 
Phys., 110, 6626 (1999). Short-Time Chebyshev Propagator for the Liouville-von Neumann Equation. 239. P. Sarkar, N. Poilin, and T. Carrington Jr., J. Chem. Phys., 110, 10269 (1999). Calculating Rovibrational Energy Levels of a Triatomic Molecule with a Simple Lanczos Method. 240. S. K. Gray and E. M. Goldfield, J. Phys. Chem. A, 105, 2634 (2001). The Highly Excited Bound and Low-Lying Resonance States of H2O. 241. H.-S. Lee and J. C. Light, J. Chem. Phys., 118, 3458 (2003). Molecular Vibrations: Iterative Solution with Energy Selected Bases. 242. S. C. Farantos, S. Y. Lin, and H. Guo, Chem. Phys. Lett., 399, 260 (2004). A Regular Isomerization Path among Chaotic Vibrational States of CH2( a˜1A1). 243. V. A. Mandelshtam, T. P. Grozdanov, and H. S. Taylor, J. Chem. Phys., 103, 10074 (1995). Bound States and Resonances of the Hydroperoxyl Radical HO2. An Accurate Quantum Mechanical Calculation Using Filter Diagonalization. 244. H. Zhang and S. C. Smith, J. Chem. Phys., 118, 10042 (2003). Calculation of Bound and Resonance States of HO2 for Non-Zero Total Angular Momentum. 245. H. Zhang and S. C. Smith, J. Chem. Phys., 123, 014308 (2005). Unimolecular Rovibrational Bound and Resonance States for Large Angular Momentum: J¼20 Calculations for HO2. 246. Y. Wang, T. Carrington Jr., and G. C. Corey, Chem. Phys. Lett., 228, 144 (1994). A Time-to-Energy Fourier Resolution Method for Calculating Bound State Energies and Wavefunctions. 247. S. Skokov, J. Qi, J. M. Bowman, C.-Y. Yang, S. K. Gray, K. A. Peterson, and V. A. Mandelshtam, J. Chem. Phys., 109, 10273 (1998). Accurate Variational Calculations and Analysis of the HOCl Vibrational Energy Spectrum. 248. R. Chen, H. Guo, S. Skokov, and J. M. Bowman, J. Chem. Phys., 111, 7290 (1999). Theoretical Studies of Rotation Induced Fermi Resonances in HOCl.
344
Recursive Solutions to Large Eigenproblems
249. S. J. Jeffrey, S. C. Smith, and D. C. Clary, Chem. Phys. Lett., 273, 55 (1997). Calculation of the Vibrational Spectral Density of NO2 via Density Correlation Functions. 250. R. F. Salzgeber, V. A. Mandelshtam, C. Schlier, and H. S. Taylor, J. Chem. Phys., 109, 937 (1998). All the Adiabatic Bound States of NO2. 251. R. F. Salzgeber, V. A. Mandelshtam, C. Schlier, and H. S. Taylor, J. Chem. Phys., 110, 3756 (1999). All the Nonadiabatic (J ¼ 0) Bound States of NO2. ~ 2B2 ~ 2A1/A 252. F. Santoro, J. Chem. Phys., 109, 1824 (1998). Statistical Analysis of the Computed X Spectrum of NO2: Some Insights into the Causes of Its Irregularity. 253. A. Back, J. Chem. Phys., 117, 8314 (2002). Vibrational Eigenstates of NO2 by a ChebyshevMINRES Spectral Filtering Procedure. 254. C. Zhou, D. Xie, R. Chen, G. Yan, H. Guo, V. Tyng, and M. E. Kellman, Spectrochim. Acta, A58, 727 (2002). Quantum Calculation of Highly Excited Vibrational Energy Levels of ~ on a New Empirical Potential Energy Surface and Semiclassical Analysis of 1:2 Fermi CS2(X) Resonances. 255. R. Siebert, P. Fleurat-Lessard, R. Schinke, M. Bittererova, and S. C. Farantos, J. Chem. Phys., 116, 9749 (2002). The Vibrational Energies of Ozone up to the Dissociation Threshold: Dynamics Calculations on an Accurate Potential Energy Surface. 256. H.-S. Lee and J. C. Light, J. Chem. Phys., 120, 5859 (2004). Vibrational Energy Levels of Ozone up to Dissociation Revisited. 257. P.-N. Roy and T. Carrington Jr., Chem. Phys. Lett., 257, 98 (1996). A Direct-Operation Lanczos Approach for Calculating Energy Levels. 258. G. Ma and H. Guo, J. Chem. Phys., 111, 4032–4040 (1999). Quantum Calculations of Highly Excited Vibrational Spectrum of Sulfur Dioxide. II. Normal to Local Mode Transition and Quantum Stochasticity. 259. D. Xie, H. Guo, O. Bludsky, and P. Nachtigall, Chem. Phys. Lett., 329, 503 (2000). 
~ 1A1/C ~ 1B2) Calculated from Ab Initio Absorption and Resonance Emission Spectra of SO2(X Potential Energy and Transition Dipole Moment Surfaces. 260. J. Antikainen, R. Friesner, and C. Leforestier, J. Chem. Phys., 102, 1270 (1995). Adiabatic Pseudospectral Calculation of Vibrational States of Four Atom Molecules: Application to Hydrogen Peroxide. 261. H.-G. Yu and J. T. Muckerman, J. Molec. Spectrosc., 214, 11 (2002). A General Variational Algorithm to Calculate Vibrational Energy Levels of Tetraatomic Molecules. 262. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 117, 6923 (2002). New Ideas for Using Contracted Basis Functions with a Lanczos Eigensolver for Computing Vibrational Spectra of Molecules with Four or More Atoms. 263. S. Y. Lin and H. Guo, J. Chem. Phys., 119, 5867 (2003). Exact Quantum Mechanical Calculations of Rovibrational Energy Levels of Hydrogen Peroxide (HOOH). 264. H.-S. Lee and J. C. Light, J. Chem. Phys., 120, 4626 (2004). Iterative Solutions with Energy Selected Bases for Highly Excited Vibrations of Tetra-Atomic Molecules. 265. A. Viel and C. Leforestier, J. Chem. Phys., 112, 1212 (2000). Six-Dimensional Calculation of the Vibrational Spectrum of the HFCO Molecule. 266. F. Ribeiro, C. Iung, and C. Leforestier, J. Chem. Phys., 123, 054106 (2005). A Jacobi-Wilson Description Coupled to a Block-Davidson Algorithm: An Efficient Scheme to Calculate Highly Excited Vibrational Levels. 267. F. Gatti, C. Iung, C. Leforestier, and X. Chapuisat, J. Chem. Phys., 111, 7236 (1999). Fully Coupled 6D Calculations of the Ammonia Vibration-Inversion-Tunneling States with a Split Hamiltonian Pseudospectral Approach. 268. H.-G. Yu, Chem. Phys. Lett., 365, 189 (2002). Accelerating the Calculation of the RoVibrational Energies of Tetraatomic Molecules Using a Two-Layer Lanczos Algorithm. 269. F. Ribeiro, C. Iung, and C. Leforestier, J. Theor. Comput. Chem., 2, 609 (2003). 
Calculation of Selected Highly Excited Vibrational States of Polyatomic Molecules by the Davidson Algorithm.
References
345
270. C. Iung and C. Leforestier, J. Chem. Phys., 97, 3458 (1992). Intramolecular Vibrational Energy Redistribution in the CD3H Molecule. 271. C. Iung, C. Leforestier, and R. E. Wyatt, J. Chem. Phys., 98, 6722 (1993). Wave Operator and Artificial Intelligence Contraction Algorithms in Quantum Dynamics: Application to CD3H and C6H6. 272. H.-G. Yu, J. Chem. Phys., 117, 2030 (2002). An Exact Variational Method to Calculate Vibrational Energies of Five Atom Molecules Beyond the Normal Mode Approach. 273. H.-G. Yu, J. Chem. Phys., 117, 8190 (2002). Two-Layer Lanczos Iteration Approach to Molecular Spectroscopic Calculation. 274. H.-G. Yu, J. Chem. Phys., 121, 6334 (2004). Converged Quantum Dynamics Calculations of Vibrational Energies of CH4 and CH3D Using an Ab Initio Potential. 275. A. T. Maynard, C. Iung, and R. E. Wyatt, J. Chem. Phys., 103, 8372 (1995). A Quantum Dynamical Study of CH Overtones in Fluoroform. I. A Nine-Dimensional Ab Initio Surface, Vibrational Spectra and Dynamics. 276. A. T. Maynard, R. E. Wyatt, and C. Iung, J. Chem. Phys., 106, 9483 (1997). A Quantum Dynamical Study of CH Overtones in Fluoroform. II. Eigenstate Analysis of the vCH = 1 and vCH = 2 Regions. 277. R. E. Wyatt, J. Chem. Phys., 103, 8433 (1995). Computation of High-Energy Vibrational Eigenstates: Application to C6H5D. 278. T. J. Minehardt, J. D. Adcock, and R. E. Wyatt, Phys. Rev. E, 56, 4837 (1997). Enhanced Matrix Spectroscopy: The Preconditioned Green-Function Block Lanczos Algorithm. 279. C. Leforestier, J. Chem. Phys., 101, 7357 (1994). Grid Method for the Wigner Functions. Application to the van der Waals System Ar-H2O. 280. C. Leforestier, L. B. Braly, K. Liu, M. J. Elrod, and R. J. Saykally, J. Chem. Phys., 106, 8527 (1997). Fully Coupled Six-Dimensional Calculations of the Water Dimer Vibration-Rotation-Tunneling States with a Split Wigner Pseudo Spectral Approach. 281. W. Kim, D. Neuhauser, M. R. Wall, and P. M. Felker, J. Chem. Phys., 110, 8461 (1999). 
SixDimensional Calculation of Intermolecular States in Molecular-Large Molecule Complexes by Filter Diagonalization: Benzene-H2O. 282. C. Leforestier, F. Gatti, R. S. Feller, and R. J. Saykally, J. Chem. Phys., 117, 8710 (2002). Determination of a Flexible (12D) Water Dimer Potential via Direct Inversion of Spectroscopic Data. 283. T. Uzer, Phys. Rep., 199, 73 (1991). Theories of Intramolecular Vibrational Energy Transfer. 284. K. K. Lehmann, G. Scoles, and B. H. Pate, Annu. Rev. Phys. Chem., 45, 241 (1994). Intramolecular Dynamics from Eigenstate-Resolved Infrared Spectra. 285. D. J. Nesbitt and R. W. Field, J. Phys. Chem., 100, 12735 (1996). Vibrational Energy Flow in Highly Excited Molecules: Role of Intramolecular Vibrational Redistribution. 286. M. Silva, R. Jongma, R. W. Field, and A. M. Wodtke, Annu. Rev. Phys. Chem., 52, 811 (2001). The Dynamics of "Stretched Molecules": Experimental Studies of Highly Vibrationally Excited Molecules with Stimulated Emission Pumping. 287. D. J. Tannor and D. E. Weeks, J. Chem. Phys., 98, 3884 (1993). Wave Packet Correlation Function Formulation of Scattering Theory: The Quantum Analog of Classical S-Matrix Theory. 288. D. J. Kouri, Y. Huang, W. Zhu, and D. K. Hoffman, J. Chem. Phys., 100, 3662 (1994). Variational Principles for the Time-Independent Wave-Packet-Schro¨dinger and WavePacket-Lippmann-Schwinger Equations. 289. T. Seideman and W. H. Miller, J. Chem. Phys., 97, 2499 (1992). Quantum Mechanical Reaction Probabilities via a Discrete Variable Representation- Absorbing Boundary Condition Green Function. 290. W. Zhu, Y. Huang, D. J. Kouri, M. Arnold, and D. K. Hoffman, Phys. Rev. Lett., 72, 1310 (1994). Time-Independent Wave Packet Forms of Schro¨dinger and Lippmann-Schwinger Equations.
346
Recursive Solutions to Large Eigenproblems
291. G.-J. Kroes and D. Neuhauser, J. Chem. Phys., 105, 8690 (1996). Performance of a Time-Independent Scattering Wave Packet Technique Using Real Operators and Wave Functions. 292. S. C. Althorpe, D. J. Kouri, and D. K. Hoffman, J. Chem. Phys., 106, 7629 (1997). A Chebyshev Method for Calculating State-to-State Reaction Probabilities from the TimeIndependent Wavepacket Reactant-Product Decoupling Equations. 293. H. Guo, J. Chem. Phys., 108, 2466 (1998). A Time-Independent Theory of Photodissociation Based on Polynomial Propagation. 294. H. Guo, Chem. Phys. Lett., 289, 396 (1998). An Efficient Method to Calculate Resonance Raman Amplitudes via Polynomial Propagation. 295. D. Xie, S. Li, and H. Guo, J. Chem. Phys., 116, 6391 (2002). Direct Calculation of Cumulative Reaction Probabilities from Chebyshev Correlation Functions. 296. A. J. H. M. Meijer, E. M. Goldfield, S. K. Gray, and G. G. Balint-Kurti, Chem. Phys. Lett., 293, 270 (1998). Flux Analysis for Calculating Reaction Probabilities with Real Wave Packets. 297. S. Y. Lin and H. Guo, J. Chem. Phys., 119, 11602 (2003). Quantum Wave Packet Study of Reactive and Inelastic Scattering between C(1D) and H2. 298. H. Guo and T. Seideman, Phys. Chem. Chem. Phys., 1, 1265 (1999). Quantum Mechanical Study of Photodissociation of Oriented ClNO(S1). 299. S. C. Althorpe, J. Chem. Phys., 114, 1601 (2001). Quantum Wavepacket Method for State-toState Reactive Cross-Sections. 300. D. Xu, D. Xie, and H. Guo, J. Phys. Chem. A, 106, 10174 (2002). Predissociation of HCN/ DCN in Two Lowest-Lying Singlet Excited States: Effect of Fermi Resonances on Spectra and Dynamics. 301. H. Guo, in Theory of Chemical Reaction Dynamics, A. Lagana and G. Lendvay, Eds., Kluwer, Dordrecht, The Netherlands, 2004, pp. 217–229. Chebyshev Propagation and Applications to Scattering Problems. 302. H. W. Jang and J. C. Light, J. Chem. Phys., 102, 3262 (1995). Artificial Boundary Inhomogeneity Method for Quantum Scattering Solutions in an Lt2 Basis. 303. D. 
Reignier and S. C. Smith, Chem. Phys. Lett., 366, 390 (2002). A Real Symmetric Lanczos Subspace Implementation of Quantum Scattering Using Boundary Inhomogeneity. 304. H. Zhang and S. C. Smith, J. Theor. Comput. Chem., 2, 563 (2003). A Comparative Study of Iterative Chebyshev and Lanczos Implementations of the Boundary Inhomogeneity Method for Quantum Scattering. 305. U. Manthe and W. H. Miller, J. Chem. Phys., 99, 3411 (1993). The Cumulative Reaction Probability as Eigenvalue Problem. 306. U. Manthe, T. Seideman, and W. H. Miller, J. Chem. Phys., 99, 10078 (1993). FullDimensional Quantum Mechanical Calculation of the Rate Constant for the H2 þ OH ! H2O þ H Reaction. 307. U. Manthe, T. Seideman, and W. H. Miller, J. Chem. Phys., 101, 4759 (1994). Quantum Mechanical Calculations of the Rate Constant for the H2 þ OH ! H þ H2O Reaction: FullDimensional Results and Comparison to Reduced Dimensionality Models. 308. S. Y. Lin, H. Guo, and S. C. Farantos, J. Chem. Phys., 122, 124308 (2005). Resonances of "1A1) and Their Roles in Unimolecular and Bimolecular Reactions. CH2(a 309. S. Skokov, J. M. Bowman, and V. A. Mandelshtam, Phys. Chem. Chem. Phys., 1, 1279 (1999). Calculation of Resonance States of Non-Rotating HOCl Using an Accurate Ab Initio Potential. 310. W. Bian and B. Poirier, J. Chem. Phys., 121, 4467 (2004). Accurate and Highly Efficient Calculation of the Highly Excited Pure OH Stretching Resonances of O(1D)HCl, Using a Combination of Methods. 311. H. Li, D. Xie, and H. Guo, J. Chem. Phys., 120, 4273 (2004). An Ab Initio Potential Energy Surface and Predissociative Resonances of HArF.
References
347
312. G.-J. Kroes and D. Neuhauser, J. Chem. Phys., 105, 9104 (1996). Avoiding Long Propagation Times in Wave Packet Calculations on Scattering with Resonances: A Hybrid Approach Involving the Lanczos Method. 313. G.-J. Kroes, M. R. Wall, J. W. Peng, and D. Neuhauser, J. Chem. Phys., 106, 1800 (1997). Avoiding Long Propagation Times in Wave Packet Calculations on Scattering with Resonances: A New Algorithm Involving Filter Diagonalization. 314. D. A. McCormack, G.-J. Kroes, and D. Neuhauser, J. Chem. Phys., 109, 5177 (1998). Resonance Affected Scattering: Comparison of Two Hybrid Methods Involving Filter Diagonalization and the Lanczos Method. 315. S.-W. Huang and T. Carrington Jr., Chem. Phys. Lett., 312, 311 (1999). A Comparison of Filter Diagonalization Methods with the Lanczos Method for Calculating Vibrational Energy Levels. 316. D. Xie, R. Chen, and H. Guo, J. Chem. Phys., 112, 5263 (2000). Comparison of Chebyshev, Faber and Lanczos Propagation Based Methods in Calculating Resonances. 317. H. Zhang and S. C. Smith, Chem. Phys. Lett., 347, 211 (2001). A Comparison of Low-Storage Strategies for Spectral Analysis in Dissipative Systems: Filter Diagonalization in the Lanczos Representation and Harmonic Inversion of the Chebychev Order Domain Autocorrelation Function. 318. V. A. Mandelshtam and T. Carrington Jr., Phys. Rev. E, 65, 028701 (2002). Comment on "Spectral Filters in Quantum Mechanics: A Measurement Theory Prospective". 319. X. Chapuisat and C. Iung, Phys. Rev. A, 45, 6217 (1992). Vector Parametrization of the NBody Problem in Quantum Mechanics: Polyspherical Coordinates. 320. F. Gatti, C. Iung, M. Menou, Y. Justum, A. Nauts, and X. Chapuisat, J. Chem. Phys., 108, 8804 (1998). Vector Parameterization of the N-Atom Problem in Quantum Mechanics. I. Jacobi Vectors. 321. F. Gatti, C. Iung, M. Menou, and X. Chapuisat, J. Chem. Phys., 108, 8821 (1998). Vector Parameterization of the N-Atom Problem in Quantum Mechanics. II. 
Coupled-AngularMomentum Spectral Representations for Four-Atom Systems. 322. M. Mladenovic, J. Chem. Phys., 112, 1070 (2000). Rovibrational Hamiltonian for General Polyatomic Molecules in Spherical Polar Parameterization. I. Orthogonal Representations. 323. X.-G. Wang and T. Carrington Jr., J. Chem. Phys., 113, 7097 (2000). A Simple Method for Deriving Kinetic Energy Operators. 324. C. Leforestier, A. Viel, F. Gatti, C. Mun˜oz, and C. Iung, J. Chem. Phys., 114, 2099 (2001). The Jacobi-Wilson Method: A New Approach to the Description of Polyatomic Molecules. 325. F. Gatti, C. Mun˜oz, and C. Iung, J. Chem. Phys., 114, 8275 (2001). A General Expression of the Exact Kinetic Energy Operator in Polyspherical Coordinates. 326. X.-G. Wang and T. Carrington Jr., J. Phys. Chem. A, 105, 2575 (2001). The Utility of Constraining Basis Function Indices When Using the Lanczos Algorithm to Calculate Vibrational Energy Levels. 327. H.-G. Yu, J. Chem. Phys., 120, 2270 (2004). Full-Dimensional Quantum Calculations of Vibrational Molecules. I. Theory and Numerical Results. 328. R. G. Littlejohn, M. Cargo, T. Carrington Jr., K. A. Mitchell, and B. Poirier, J. Chem. Phys., 116, 8691 (2002). A General Framework for Discrete Variable Representation Basis Sets. 329. R. Dawes and T. Carrington Jr., J. Chem. Phys., 121, 726 (2004). A Multidimensional Discrete Variable Representation Basis Obtained by Simultaneous Diagonalization. 330. R. Dawes and T. Carrington Jr., J. Chem. Phys., 122, 134101 (2005). How to Choose OneDimensional Basis Functions So That a Very Efficient Multidimensional Basis May Be Extracted from a Direct Product of the One-Dimensional Functions: Energy Levels of Coupled Systems with as Many as 16 Coordinates. 331. H.-G. Yu, J. Chem. Phys., 122, 164107 (2005). A Coherent Discrete Variable Representation Method for Multidimensional Systems in Physics.
This Page Intentionally Left Blank
CHAPTER 8
Development and Uses of Artificial Intelligence in Chemistry
Hugh Cartwright
University of Oxford, Oxford, United Kingdom
INTRODUCTION

A decade ago, artificial intelligence (AI) was mainly of interest to computer scientists. Few researchers in the physical sciences were familiar with the area; fewer still had tried to put its methods to practical use. However, in the past few years, AI has moved into the mainstream as a routine method for assessing data in the experimental sciences; it promises to become one of the most important scientific tools for data analysis.

In some respects, this is a strange state of affairs. The limits of the field are vague: Even computer scientists sometimes find it difficult to pin down exactly what characterizes an AI application. Nevertheless, in a review that focuses on the use of AI in science, it would be cowardly to hide behind the excuse of vagueness, so we shall have a stab at defining AI:

An artificial intelligence program is a piece of computer software that can learn.

Not all computer scientists would agree with this broad statement; however, it does encompass virtually every AI method of interest to the chemist. As we shall see, learning is a key part of the definition. In each method discussed in this chapter, there is some means by which the algorithm learns, and then stores, knowledge as it attempts to solve a problem.
Reviews in Computational Chemistry, Volume 25, edited by Kenny B. Lipkowitz and Thomas R. Cundari. Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.
Trying to nail down just what AI is, is not the only difficulty that awaits those new to the field. The manipulations within AI algorithms are often described by specifying a sequence of procedures or operations, rather than by casting them in the form of equations, with which many scientists feel more comfortable. At first sight, there also seems to be a slightly alarming uncertainty in the way that AI methods are used: Two users who attack the same problem using the same algorithm might obtain different results. To someone raised on the certainty of calculus, this can be unsettling, but to obtain two different answers to a problem when only one is sought is not necessarily bad, nor does it imply that there is something amiss with the method of solution. Indeed, at times, the ability to locate multiple solutions can be a definite advantage. Despite this assertion, if you are new to AI you may already feel doubt creeping up on you; AI is beginning to seem like a slippery concept. It is difficult to believe that a method that is less precise than deterministic methods such as calculus could outperform them. Yet the choice of ways to tackle a scientific problem that AI offers provides opportunities for inventiveness in use that are generally absent from traditional methods, and this is one of its strengths. Furthermore, AI methods show a tolerance of user uncertainty or lack of data, and they are often capable of providing results of value even when not used in an optimum fashion. They can also tackle some types of problem with which conventional methods struggle. Consequently, many scientists are now coming to the view that it is worth investing time and effort to learn about AI. This chapter outlines how some of the most widely used methods work and what they can achieve. 
Several thousand papers that address the use of AI in science are published each year, so there is room in this chapter to mention only a small fraction of these applications; nevertheless even a taster of this size should give the reader a hint of how AI can help in the assessment of scientific data. Most AI methods used in science lie within one of three areas: evolutionary methods, neural networks and related methods, and knowledge-based systems. Additional methods, such as automated reasoning, hybrid systems, fuzzy logic, and case-based reasoning, are also of scientific interest, but this review will focus on the methods that seem to offer the greatest near-term potential in science.
EVOLUTIONARY ALGORITHMS

Principles of Genetic Algorithms

As the name suggests, the inspiration for evolutionary algorithms is the evolutionary behavior of natural systems; links to natural systems are in fact evident in several AI techniques. Evolutionary algorithms are particularly valuable in the solution of problems of high dimensionality and those that involve large, complex search spaces. Of the various types of evolutionary algorithm that exist, genetic algorithms (GAs), a topic that has been reviewed previously in this book series,1 are currently the most widely adopted in the physical and life sciences, and it is on this method that we shall concentrate.

The manipulations in all evolutionary algorithms follow a similar course, an iterative process that is used to develop progressively better solutions from an initial random starting point (Figure 1). The use of iteration in solving problems is of course not limited to evolutionary approaches. Where the GA differs from other iterative techniques is in its simultaneous manipulation of many possible solutions to a problem, rather than just one. Typically, the algorithm will operate on a group of 40–100 solutions, but populations larger than this, perhaps running to many thousands, may be used. Members of the population are repeatedly selected, reproduced, and modified, in a manner analogous to the evolution of a natural population, with the aim of evolving high-quality solutions.

Figure 1 An outline of the steps in an evolutionary algorithm.

When considering the use of a GA to tackle a scientific problem, the most fundamental requirement is that it must be possible to express the solution in vector form, generally, but not necessarily, of one dimension. This vector is
known as an individual, a chromosome (by analogy with natural evolution), or a string; in this chapter we shall refer to these vectors as strings. The need to express the solution as a vector limits the range of problems that can be tackled, but as the section on applications will illustrate, a wide variety of problems is still amenable to attack using a GA. The iterative refinement applied to the population of these vectors using evolutionary operators is, as we shall show shortly, a simple process.
Genetic Algorithm Implementation

The most distinctive feature of the algorithm is its use of a population of potential solutions, so it is reasonable to ask why it might be more effective to work with many potential solutions when conventional methods require only one. To answer this question, and to appreciate how the genetic algorithm works, we consider a simple example.

Imagine several identical dipoles spaced evenly along a straight line (Figure 2). The center of each dipole is pinned down, but the dipole can rotate to adopt any orientation in the plane of the page. The problem is to find the lowest energy arrangement of the set of dipoles. Although it is easy to think of a solution to this problem without the need to introduce computers, it is nevertheless instructive to observe how the genetic algorithm works its way toward this solution. Potential solutions, constructed as vectors, can easily be prepared by specifying the angle that each dipole makes with the vertical axis. A typical string would then be written as an ordered list of these angles, for example:

⟨10, 71, 147, 325, 103, 133, 142, 160, 20, 153⟩

To run the algorithm, the steps shown in Figure 1 are executed:

1. Create an initial population of random strings.
2. Calculate the quality (the fitness) of each string.
3. Initiate the creation of a new population by selecting members from the old one, choosing the better members stochastically.
4. Modify members of the new population to create fresh solutions.
5. Repeat steps 2–4 until a solution of acceptable quality emerges.

It is not immediately obvious that this process will do anything other than consume computer time, so let us see what happens when we put this sequence of steps into action.
Figure 2 A dipole-alignment task to be solved by the genetic algorithm.
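The five steps listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the chapter's own implementation: the energy expression (unit point dipoles at unit spacing, scaled by 0.5), the mutation operator, the fixed generation count used as a stopping rule, and every function name are assumptions made for the sketch. Fitness takes the reciprocal form 1/(C + e) that the chapter adopts in Eq. [1] with C = 10.0, and parents are drawn by fitness-proportional (roulette wheel) selection, which is only one of several possible selection schemes.

```python
import math
import random

C = 10.0  # offset in the reciprocal fitness; keeps every fitness positive


def energy(angles):
    """Nearest-neighbor interaction energy in arbitrary units.

    ASSUMPTION: unit point dipoles at unit spacing along the x axis, each
    angle measured in degrees from the vertical. The factor 0.5 keeps the
    total for ten dipoles inside (-C, C).
    """
    total = 0.0
    for a, b in zip(angles, angles[1:]):
        ta, tb = math.radians(a), math.radians(b)
        total += math.cos(ta - tb) - 3.0 * math.sin(ta) * math.sin(tb)
    return 0.5 * total


def fitness(angles):
    """Reciprocal fitness: low energy gives high fitness."""
    return 1.0 / (C + energy(angles))


def roulette_select(population, fitnesses, rng):
    """Pick one string with probability proportional to its fitness."""
    pick = rng.uniform(0.0, sum(fitnesses))
    running = 0.0
    for string, f in zip(population, fitnesses):
        running += f
        if pick <= running:
            return string
    return population[-1]  # guard against floating-point round-off


def mutate(string, rng, rate=0.1):
    """ASSUMPTION: each angle is replaced by a random one with probability `rate`."""
    return [rng.randrange(360) if rng.random() < rate else a for a in string]


def run_ga(n_dipoles=10, pop_size=10, generations=200, seed=0):
    rng = random.Random(seed)
    # Step 1: create an initial population of random strings.
    population = [[rng.randrange(360) for _ in range(n_dipoles)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    # Step 5: repeat steps 2-4 (here for a fixed number of generations).
    for _ in range(generations):
        # Step 2: calculate the fitness of each string.
        fitnesses = [fitness(s) for s in population]
        # Step 3: select parents stochastically, favoring fitter strings.
        parents = [roulette_select(population, fitnesses, rng)
                   for _ in range(pop_size)]
        # Step 4: modify the selected strings to create fresh solutions.
        population = [mutate(p, rng) for p in parents]
        # Record the best string seen so far; it is not reinserted.
        best = max(population + [best], key=fitness)
    return best


if __name__ == "__main__":
    best = run_ga()
    print(best, round(energy(best), 4))
```

Under the assumed energy expression, the lowest-energy arrangement has every dipole lying along the line (all angles 90° or all 270°), which gives a quick check on whether a run is heading in the right direction.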
The first step is to create the starting population. Although population sizes within evolutionary algorithms may be large, we shall use a small population of just ten strings so that the operation of the algorithm is clear. All angles in the starting strings are chosen at random; the initial population is shown in Table 1.

At the heart of the GA are evolutionary operators: These are a “survival of the fittest” operator, and a pair of modification operators whose role is to create new strings. To apply the first of these operators, the fitter members of the starting population are selected preferentially as parents for the next generation. This, of course, requires that we know what is meant by, and can calculate, the fitness of each string.

As one might guess from the name, the fitness function (or objective function) measures the quality of the solution that a string represents. Thus, in the current problem, the fitness must depend on the energy of interaction between all dipoles whose orientations the string represents. There is no prescribed recipe in the algorithm for constructing a relationship between fitness and quality of solution; we are free to choose any metric, provided that it assigns high fitness to good solutions. This is not to suggest that the choice of the relationship between quality and fitness is unimportant; indeed, choosing an appropriate relationship is a key step in the construction of a successful GA application, as there are subtle interactions between the form of the function chosen and the speed at which the GA can reach an optimum solution. For the current problem, though, it is not difficult to select a function that will do the job, and a simple fitness function is sufficient; we shall use the relationship given by Eq. [1]:

$$ f_i = \frac{1.0}{C + e_i} \qquad [1] $$
in which $e_i$ is the interaction energy in arbitrary units. For simplicity we have assumed that only nearest-neighbor dipoles interact, so $e_i$ is the sum of nine dipole–dipole interactions. The constant $C$ in Eq. [1] is introduced because the interaction energy between two dipoles may be positive (repulsive) or negative (attractive). Without the constant, the fitnesses could also be positive or negative, which would disrupt the GA calculation. Using Eq. [1] with a suitable choice for $C$ (a value of 10.0 gives good results in this case, but the success of the procedure is not intimately related to the value chosen), we can calculate the fitness of each starting string; the energies and fitnesses of all strings in the initial population are shown in Table 1.

The next step is to apply survival of the fittest within the population to determine which of the current strings will act as parents for the next generation. In the GA, as in nature, a stochastic (random) element enters into this process, so the process is more “survival of the fittest (usually).” Fitter strings
Table 1 The Initial, Random Genetic Algorithm Population (The Significance of the Angles Marked in Bold is Discussed in the Text.)
String   θ1   θ2   θ3   θ4   θ5   θ6   θ7   θ8   θ9  θ10    Energy  Fitness
1        10   71  147  325  103  133  142  160   20  153   -0.8713   0.1095
2       210  147   88   91  293   58  294  227   90   78   -0.2851   0.1029
3         3   57  259  266   62  232  118  302  152    8    0.9458   0.0914
4       125  148  322   66  299  129  322   85   99  331    0.5761   0.0946
5       142  144  110  229  334  278  148   62  223  220    0.0389   0.0996
6       249    3  354  299  128   46  216  141  299  213   -0.2143   0.1022
7         4  101  354  343  217   96  323  272  271   86   -0.4813   0.1051
8       313   96  354   12   75   13   95  119  153   83   -0.5184   0.1055
9        58   57  354   63   82   65  114    3  227  127   -1.5821   0.1188
10      227  202   11  162  239  357  109  221  309  209   -0.6110   0.1065
Figure 3 The roulette wheel selection operator.
are more likely to be selected than poorer ones, but all strings have some chance of being chosen. In other evolutionary algorithms, the selection process can be more deterministic. There are various ways to perform selection: One widely used method is to allocate to each string a space on a roulette wheel, or pie diagram, whose width is proportional to its fitness (Figure 3). The imaginary roulette wheel is spun, and the string into whose slot the virtual ball falls is selected to be a parent; it is clear that this procedure does, as desired, bias selection in favor of the fitter strings, but it still gives less-fit strings some chance of being chosen. Other selection methods include repeated binary tournament selection, in which two strings are chosen at random and the one with the higher fitness is, with a high probability, selected. We shall use roulette wheel selection. Spinning the virtual wheel ten times gives us ten strings as the starting point for the new population (Table 2). We note that the fitter strings are indeed now more numerous than before, as we would expect, although, as there is a stochastic element in the choice of parents, repeated runs on the same problem can be expected to generate different results. Selection has improved the average string fitness, but all strings are just copies of members of the starting population—no new solutions have yet been created. To fashion new, and potentially better, solutions,
Table 2 The Strings Selected as Parents for the Second Population
String   θ1   θ2   θ3   θ4   θ5   θ6   θ7   θ8   θ9  θ10    Energy  Fitness
2       210  147   88   91  293   58  294  227   90   78   -0.2851   0.1029
1        10   71  147  325  103  133  142  160   20  153   -0.8713   0.1095
7         4  101  354  343  217   96  323  272  271   86   -0.4813   0.1051
4       125  148  322   66  299  129  322   85   99  331    0.5761   0.0946
1        10   71  147  325  103  133  142  160   20  153   -0.8713   0.1095
9        58   57  354   63   82   65  114    3  227  127   -1.5821   0.1188
8       313   96  354   12   75   13   95  119  153   83   -0.5184   0.1055
10      227  202   11  162  239  357  109  221  309  209   -0.6110   0.1065
6       249    3  354  299  128   46  216  141  299  213   -0.2143   0.1022
9        58   57  354   63   82   65  114    3  227  127   -1.5821   0.1188
Figure 4 The genetic algorithm mating (crossover) operator.
some of these strings must be modified. Two operators exist for this purpose. The mating operator swaps information between strings (Figure 4). Two strings are selected at random, and a randomly chosen section of material is cut-and-pasted from one string into the other. This is often referred to as crossover, because material is crossed between the two strings. The action of crossover between the fourth and fifth genes on strings 8 and 9 is shown in Figure 4. The swapping of information usually creates strings that differ from both parents, but in the type of problem we are considering here, it cannot create information that was missing from both parents. In other words, it cannot turn a particular dipole into a new random orientation; it can only swap orientations between those contained in the parents. A different operator is needed to introduce new values. The injection of new data, which is required if the algorithm is to make progress, is accomplished by a mutation operator, which selects one of the new strings at random and then introduces a random change at a randomly selected position, as illustrated in Figure 5.
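The selection and modification operators described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used to produce the tables: the function names, the 0.9 tournament probability, and the use of integer angles are assumptions made for the example, and the two parent strings are strings 8 and 9 of Table 1.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one string with probability proportional to its fitness."""
    total = sum(fitnesses)
    spin = rng.uniform(0.0, total)      # where the virtual ball lands
    running = 0.0
    for string, fit in zip(population, fitnesses):
        running += fit
        if spin <= running:
            return string
    return population[-1]               # guard against floating-point round-off

def tournament_select(population, fitnesses, rng, p_better=0.9):
    """Binary tournament: the fitter of two random strings usually wins.

    p_better is an assumed value for the 'high probability' in the text.
    """
    i, j = rng.sample(range(len(population)), 2)
    better, worse = (i, j) if fitnesses[i] >= fitnesses[j] else (j, i)
    return population[better if rng.random() < p_better else worse]

def crossover(a, b, rng):
    """Swap a randomly chosen contiguous section between two strings."""
    lo, hi = sorted(rng.sample(range(len(a) + 1), 2))
    child_a = a[:lo] + b[lo:hi] + a[hi:]
    child_b = b[:lo] + a[lo:hi] + b[hi:]
    return child_a, child_b

def mutate(string, rng):
    """Replace one randomly chosen angle with a new random angle (degrees)."""
    mutated = list(string)
    mutated[rng.randrange(len(mutated))] = rng.randrange(360)
    return mutated

rng = random.Random(0)                                     # fixed seed for repeatability
string_8 = [313, 96, 354, 12, 75, 13, 95, 119, 153, 83]   # Table 1, string 8
string_9 = [58, 57, 354, 63, 82, 65, 114, 3, 227, 127]    # Table 1, string 9
child_a, child_b = crossover(string_8, string_9, rng)
mutant = mutate(child_a, rng)
print(child_a, child_b, mutant, sep="\n")
```

Note that crossover only rearranges angles already present in the two parents, whereas mutation is the only operator here that can introduce an angle found in neither.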
Figure 5 The genetic algorithm mutation operator.
Crossover is generally applied to most or all members of the new population, whereas mutation is usually applied more sparingly, typically to between 1% and 10% of the population. Once these operators have been applied, the new population is fully formed; one generation has passed. The process is now repeated: the fitness of each string in the new population is determined, the fitter members are selected stochastically as parents for the next population, and the mating and mutation operators are applied once again. Table 3 shows the progress made by generation 3, and generation 5 is shown in Table 4. It is evident that, even at this early stage, the population is beginning to homogenize, as the algorithm starts to “learn” the form of a good solution to the problem. By the tenth generation, the best string is ⟨6, 147, 88, 91, 88, 90, 95, 6, 272, 271⟩, which has an energy of -7.5318 and a fitness of 0.4051 (Eq. [1] with C = 10.0 gives 1.0/(10.0 - 7.5318) ≈ 0.4051); the algorithm is making good progress toward an acceptable solution. In due course, it will settle on something close to an optimum solution.
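The generational cycle just described (evaluate, select, mate, mutate, repeat) can be put together as a short program. The sketch below is illustrative only. The chapter does not give the expression used for the dipole energy, so the `energy` function here is an assumption: a nearest-neighbor point-dipole form in which 90° and 270° lie along the line of dipoles, scaled so that each pair contributes between -1 and +1. The energies it produces will therefore not reproduce the values in Tables 1 to 4. The mutation rate, generation count, and seed are likewise arbitrary choices.

```python
import math
import random

C = 10.0            # offset from Eq. [1]
N_DIPOLES = 10
POP_SIZE = 10

def energy(angles):
    """Assumed nearest-neighbor energy in arbitrary units (not the chapter's expression).

    Angles are in degrees; 90 and 270 point along the line of dipoles.
    Each pair contributes between -1 (head-to-tail) and +1 (head-to-head).
    """
    total = 0.0
    for a, b in zip(angles, angles[1:]):
        ra, rb = math.radians(a), math.radians(b)
        total += 0.5 * (math.cos(ra - rb) - 3.0 * math.sin(ra) * math.sin(rb))
    return total

def fitness(angles):
    """Eq. [1]: f = 1 / (C + e); C keeps the denominator positive."""
    return 1.0 / (C + energy(angles))

def next_generation(population, rng, mutation_rate=0.1):
    fits = [fitness(s) for s in population]
    # Roulette wheel selection: parents drawn with probability proportional to fitness.
    parents = [list(s) for s in rng.choices(population, weights=fits, k=len(population))]
    # Crossover: swap a random section between consecutive pairs of parents.
    for a, b in zip(parents[0::2], parents[1::2]):
        lo, hi = sorted(rng.sample(range(N_DIPOLES + 1), 2))
        a[lo:hi], b[lo:hi] = b[lo:hi], a[lo:hi]
    # Mutation: occasionally replace one angle with a random value.
    for s in parents:
        if rng.random() < mutation_rate:
            s[rng.randrange(N_DIPOLES)] = rng.randrange(360)
    return parents

def run(generations=200, seed=1):
    rng = random.Random(seed)
    population = [[rng.randrange(360) for _ in range(N_DIPOLES)] for _ in range(POP_SIZE)]
    best = max(population, key=fitness)
    for _ in range(generations):
        population = next_generation(population, rng)
        best = max(population + [best], key=fitness)   # remember the best string seen
    return best

best = run()
print(best, round(energy(best), 4), round(fitness(best), 4))
```

With this assumed energy, a string of ten aligned dipoles (all 90° or all 270°) has energy -9 and fitness 1.0, so C = 10.0 keeps every fitness positive. Keeping a record of the best string seen is a convenience added for the example; the procedure in the text does not require it.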
Why Does the Genetic Algorithm Work?
The problem tackled here is simple, and it may seem that there is nothing terribly clever about what has been done. Nevertheless, simple though it may seem, the algorithm can be applied successfully to problems that are much more challenging than aligning a set of dipoles, as the examples in the next section will illustrate. The algorithm works, but how does this happen? The theory of evolutionary algorithms is increasingly extensive, detailed, and in some areas, challenging. However, a simple insight into how the genetic algorithm creates good solutions is offered by the building-block hypothesis. Although this hypothesis is not regarded in the AI community as a complete explanation, it does provide an accessible qualitative picture of how evolutionary methods can discover optimum solutions.
Let us return to Table 1. In strings 2 and 7, there is one section in each string, shown in bold, where neighboring dipoles are nearly aligned (88 and 91 in string 2; 272 and 271 in string 7). This alignment, in one case with two neighboring dipoles having angles close to 90° and in the other having two angles near to 270°, gives rise to a low (favorable) interaction energy for that part of the string, and it is this sort of alignment that we can anticipate will appear in the optimum solution. These regions in the strings constitute “building blocks” for the construction of a good solution. If strings 2 and 7 are chosen as parents for the next generation, there is a chance that the mating operator will bring the two strings together for crossover. If this happens, these two building blocks might find themselves both in the same, new string, which will therefore contain at least two sections that will help improve (i.e., lower) the energy. That new string will be rewarded with a higher fitness and thus will be more likely to be selected as
Table 3 Generation 3
String   θ1   θ2   θ3   θ4   θ5   θ6   θ7   θ8   θ9  θ10    Energy  Fitness
1       313   96   17   12  129  322   85   99  331  180   -0.9293   0.1102
2         7   57  354   63   88   65  294  227  272  271   -2.5085   0.1335
3       210  147   88   91    4   96   41  272  271   86   -1.5253   0.1180
4       227  202   14   17  239  357  109  221  153   83    0.3279   0.0968
5        58   57  354  325   17   13   95  119  153   40   -0.4370   0.1046
6         7    7   11   12   75   41   95   60  153   83   -1.4222   0.1166
7        70   71  147  325  103  357  114    3  227  228   -1.3890   0.1161
8        11   41  322   66   82   65  114    3  227  228   -1.5818   0.1188
9         7    7   11   12   27   41   95   60  153  228   -0.4219   0.1044
10       58   57  354   63   17   13   95  119  153   83   -0.5445   0.1058
Table 4 Generation 5
String   θ1   θ2   θ3   θ4   θ5   θ6   θ7   θ8   θ9  θ10    Energy  Fitness
1        58  202  354   63   88   65  294  227  272  271   -3.0807   0.1445
2        70   71  318   63   88   65  294  114  227  228    0.9557   0.0913
3        11   71  147   88   88   65   95    6  153   83   -4.2025   0.1725
4        70   71  147  325  103   13   95   60  153  154   -2.1447   0.1273
5         7    7   11   63  114   65  294  271  272  271   -4.1138   0.1699
6         6  147   88   91   82   65  114  127  227  228   -4.4853   0.1813
7         7   57  106   63   88   65   95   60  271   86   -2.6745   0.1365
8        14   27   90   63   17  322   85   99  331  180   -1.5250   0.1180
9        70   71  147   40   82   65  114  119  227  228   -2.8340   0.1395
10       11   41  354   63   88   65  294    3  227  228   -0.0450   0.1005
a parent for the following generation, so the two building blocks, having met up, will have a good chance of surviving. As it happens, in generation 3, both building blocks remain in the population, but they have not yet been brought together by crossover. Both blocks are still present in generation 5, and although they still are in different strings, the ⟨…, 271, 272, …⟩ sequence has been expanded by mutation to ⟨…, 271, 272, 271⟩, in which three neighboring dipoles are almost aligned, giving the string a high fitness. These useful building blocks have a good chance of remaining in the population until they are displaced by even better segments, but we should recognize that there is nothing that guarantees their survival: The crossover operator may bring potentially valuable building blocks together to form a string of enhanced fitness, but there is nothing that makes this inevitable; a poor string is as likely to be created by the random effects of crossover as a good one. The key point is that, when multiple promising building blocks are brought together by chance in one string, the string gets a fitness boost from the presence of the blocks; this is the basis of the Schema Theorem (a schema is a contiguous set of genes that comprise a building block). This improved fitness may give it an advantage over the competition so that it is more likely to survive for a few generations. During this period, the useful building blocks have the opportunity to start spreading through the population. By contrast, solutions containing only inferior building blocks that are created by the crossover operator will have low fitness and will always be at risk of disappearing by being displaced by better solutions. As generations pass, building blocks created by the evolutionary operators will therefore tend to thrive if they are good, and be lost from the population if they are not, leading eventually to a population consisting predominantly of high-quality solutions.
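As a concrete illustration of how crossover can either unite or destroy building blocks, the short sketch below pastes sections of string 7 from Table 1 into string 2. The helper names `paste_section` and `has_block` and the choice of cut positions are hypothetical, made only for this example.

```python
# Strings 2 and 7 from Table 1 (angles in degrees).
string_2 = [210, 147, 88, 91, 293, 58, 294, 227, 90, 78]
string_7 = [4, 101, 354, 343, 217, 96, 323, 272, 271, 86]

def paste_section(receiver, donor, start, stop):
    """Copy donor[start:stop] into the same positions of a copy of receiver."""
    return receiver[:start] + donor[start:stop] + receiver[stop:]

def has_block(string, block):
    """True if `block` occurs as a contiguous run of genes in `string`."""
    n = len(block)
    return any(string[i:i + n] == block for i in range(len(string) - n + 1))

# A fortunate cut: genes 8 and 9 of string 7 (list indices 7 and 8) are pasted
# into string 2, so the child carries both building blocks.
lucky = paste_section(string_2, string_7, 7, 9)

# An unfortunate cut: genes 3 and 4 of string 7 overwrite the 88, 91 block
# of string 2, and the child carries neither block.
unlucky = paste_section(string_2, string_7, 2, 4)

print(lucky, has_block(lucky, [88, 91]), has_block(lucky, [272, 271]))
print(unlucky, has_block(unlucky, [88, 91]), has_block(unlucky, [272, 271]))
```

The first child is ⟨210, 147, 88, 91, 293, 58, 294, 272, 271, 78⟩, which contains both aligned pairs; the second loses the 88, 91 pair without gaining the other. Which outcome occurs depends only on where the random cut falls.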
(The alert reader will have noticed that the two building blocks discussed here are, in fact, part of two different, and incompatible, solutions. Two equivalent optimum solutions exist: one in which the dipoles all point to the left and one in which they all point to the right. At some stage in the calculation, one of these building blocks will be lost and the solution containing the other building block is likely to “take over” the population.)
Where Is the Learning in the Genetic Algorithm?
It was suggested earlier that a key feature of AI software is that it learns; so where is the learning in the GA? The initial GA population consists entirely of random strings. These strings contain no information about the nature of a solution to the problem, so at this point the algorithm knows nothing. (A technical exception to this rule occurs if the strings have been constructed taking advantage of heuristic or other knowledge about the nature of viable solutions. However, even in this case, the starting strings do no more than
represent the heuristics in a noisy form, so they do not tell us anything that we did not previously know.) As the calculation proceeds, strings within the population gradually begin to pick up the characteristics of high-quality solutions; the algorithm is developing an understanding of what such solutions look like and is storing this information in the strings. The GA learns in this instance that solutions in which neighboring dipoles are aligned have low energy; this knowledge appears as segments of information in the strings that survive, which is knowledge that is passed from one generation to the next through the evolution of the algorithm. In this fashion, therefore, the learning of the algorithm is encapsulated in the gradual development of encoded knowledge about good solutions within the population.
What Can the Genetic Algorithm Do?
Numerous applications of GAs within science and other fields have appeared in the literature; references to a few of them are given at the end of this chapter. The method has been used for computer learning, modeling of epidemics, the scheduling of the production of fine chemicals, the prediction of the properties of polymers, spectral analysis, and a wide variety of other investigations. In this section we consider a few examples of recent applications in chemistry.
Protein Structure
One of the most keenly studied topics in computational chemistry at present is determining how proteins fold. Several groups are using genetic algorithms in this area. Using a one-letter coding to identify amino acids, Ziegler and Schwarzinger² have used the GA to study the stabilization of alpha-helices and to design helices with predetermined solubility or other parameters. The role of the crossover and mutation operators in any GA application is to create new, potentially valuable strings, but as we have observed, these operators are also disruptive, fragmenting or destroying information. To prevent these operators from causing too much damage, the authors used masks to restrict the string positions that could be mutated, or the types of amino acid residue that could appear at certain spots in the structure. Their work illustrates how the mating and mutation operators can be adjusted by the user to suit the requirements of the problem, without invalidating the GA approach. Partly because of the fashion in which the authors chose to restrict the work of these operators, the system converged in 200 generations to sequences that, the authors argued, are potentially of value in gene therapy related to prion protein helix 1.
In a GA, the size of the population, rate of mutation, rate of crossover, choice of selection method, and other factors can all be selected by the user.
The fact that this degree of freedom exists does not imply that the value chosen for each parameter is of little consequence. On the contrary, the parameter
choice, in conjunction with the topography of the fitness surface, whose form is often almost completely unknown, determines how the search proceeds across that surface. The success of the search, and the speed with which acceptable answers are found, are therefore both strongly affected by choice of parameters. In another study of protein structure, Cox and Johnston³ analyzed how the choice of GA parameters affects the quality of the GA search. This sort of approach was also adopted by Djurdjevic and Biggs,⁴ who presented a detailed study of how evolutionary algorithms can be used, in combination with a full atomistic protein ab initio model, for fold prediction and, like Cox and Johnston, considered the influence of the different values of parameters on the success of their protein folding calculations.
Using a GA in Combination with Another Technique
GAs are often combined with other AI or non-AI methods, with the role of the GA being to find some optimum set of parameters. Gributs and Burns⁵ used a GA to select a set of wavelengths or of wavelets⁶ that would provide a parsimonious estimate of the properties of interest in a study of NIR spectra, whereas Dam and Saraf⁷ combined a GA with a neural network to predict the product mix from the crude distillation unit of a petroleum refinery. Several adjustable parameters determine the structure and performance of neural networks,⁸ so it is natural to consider using a GA to select the best neural network geometry and thus minimize development time.
Biomedical Applications
Biomedical spectra are often extremely complex. Hyphenated techniques such as MS–MS can generate databases that contain hundreds of thousands or millions of data points.
Reduction of dimensionality is then a common step preceding data analysis because of the computational overheads associated with manipulating such large datasets.⁹ To classify the very large datasets provided by biomedical spectra, some form of feature selection¹⁰ is almost essential. In sparse data, many combinations of attributes may separate the samples, but not every combination is plausible. Pranckeviciene et al.¹¹ have assessed the NMR spectra of pathogenic fungi and of human biofluids, finding the spectral signature that comprises a set of attributes that serve to uniquely identify and characterize the sample. This use of GAs effectively reduces the dimensionality of the data, and it can speed up later processing as well as make it more reliable.
Physical Chemistry
In kinetics studies, as in mass spectrometry, data reduction can be helpful before starting a detailed analysis. A typical application in which data reduction is of value is high-temperature kinetics. Reactions in flames are complex, so study of these reactions is challenging not just experimentally but also
computationally. At the high temperatures attained in a flame, the number of reactive species present may be large. Furthermore, the high temperature ensures that the rate constants for most conceivable reactions between those reactive species are high, so many reaction schemes must be taken into account. Detailed kinetic modeling is simplified if one can identify species that are in a quasi-steady-state. Montgomery and co-workers¹² have used a GA to identify the species for which such a quasi-steady-state can be assumed and have obtained results in good agreement with the predictions of a detailed model that included all species believed to be present in significant quantities. A similar problem has been tackled by Elliott et al.,¹³ who reduced a set of 388 reactions involving 67 species to 215 reactions and 50 species. They then used a GA to determine optimum values for the reaction rate parameters that best matched experimental profiles. In another combustion-related study, Lautenberger et al.¹⁴ used a GA to optimize a small parameter set, including the activation energy and pre-exponential factor for reactions. They recognized that the solutions were potentially unstable with respect to small changes in input data, so they incorporated heuristic information such as phase densities and specific heats, from a charring pyrolysis model,¹⁵ to ensure that the calculation was well behaved. Their work illustrates the manner in which extra heuristic information, if it can help define the nature of a good GA solution, can be incorporated into the evolution of the strings. Evolutionary algorithms have been widely used in other areas of physical chemistry, such as photonics. An interesting application is from Lipson et al.,¹⁶ where the spontaneous emergence of structure was evident when using GAs to design a high-confinement photonic structure. There have been several reports of the use of GAs in direct or indirect determination of crystal structures.
It is possible to use GAs to determine crystal structures through the analysis of experimental data; an alternative approach is to use them to predict crystal structures theoretically. Dova et al.¹⁷ used a GA incorporating a “parameter box” to analyze synchrotron powder diffraction data of spin-crossover complexes; the size of the parameter box was adjusted dynamically to include different volumes of search space. Working from a theoretical rather than an experimental viewpoint, Oganov and Glass¹⁸ combined quantum mechanical ab initio total energy calculations with an evolutionary algorithm to predict a crystal structure, although the number of reported structures was not large. Similarly, Abraham and Probert¹⁹ used a GA to predict the global energy minimum, without making prior assumptions about unit cell size, symmetry, or shape. In a somewhat less academically rigorous study, enjoyable but largely unsuccessful attempts have been made to use the genetic algorithm to optimize the taste of muesli.²⁰
Clusters
GAs have been widely used in the study of atom clusters. The number of ways in which atoms can be arranged so that a cluster of them lies in a local
energy minimum rises rapidly with the number of atoms, being of the order of 10¹⁰ for a cluster containing 50 identical atoms (an even larger number of structures exist if the atoms are not identical). Because an exhaustive search through billions of possible structures to locate the absolute energy minimum is not feasible, some intelligent search method, such as a GA, is required to find low-energy structures. Hsu and Lai²¹ combined a GA with a basin-hopping approach to avoid the pitfall that the algorithm can become trapped within local minima. They determined the structure of mixed copper-gold clusters containing up to 38 atoms by assessing the interatomic energy with a Gupta potential.²² Ona et al.²³ chose to use MSINDO²⁴ combined with local optimization in their study of small silicon clusters. Local minimization of this sort is often a helpful addition to GA methods, because, as the calculation converges, the mutation operator alone may not be sufficient to bring about the subtle adjustments to the string needed to move from good strings to the optimum in reasonable time. Marim et al.²⁵ investigated the structure of prolate and near-spherical geometries of small silicon clusters containing up to 18 atoms, whereas Juhas and co-workers²⁶ adopted the interesting approach of using GAs to solve the unassigned distance geometry problem in which data from pair distribution functions are used to determine the cluster structure. This latter method is a potentially powerful technique, because it might be applicable to structure determination even when X-ray crystallographic methods are impracticable.
What Can Go Wrong with the Genetic Algorithm?
The GA, like every other method of treating data, is not magical and must be used with care. In each application discussed below, the authors have combined a GA with an experimental measurement (a potentially promising tactic), but they have used the GA in a less-than-robust fashion. In an experiment-linked GA, the algorithm generates a potential solution for a problem and a real experiment is then run to return a value for the fitness function. However, as the fitness of many different strings must be evaluated in any GA calculation, this combination of computational with experimental investigation can be time consuming. It may be difficult to run the algorithm for a sufficiently large number of generations, or there may be a temptation to reduce the string length, so that fewer strings, or simpler strings, need to be laboratory tested. Be forewarned that, if the algorithm is allowed to run only for a few generations, it is far less effective than it could be and may in fact be little better than a simple random search.
Sohn et al.²⁷ used GAs to try to determine the composition of a high luminescence phosphor at 400 nm. Rather than relying on a theoretical model to assess the fitness of solid mixtures proposed by the GA, they synthesized each mixture and measured the emission efficiency experimentally. This is in
principle a productive way to proceed, but because of the amount of practical work required, the experimental assessment of each generation required 2–3 days of work in the laboratory. Perhaps because of the experimental demands, the group ran their algorithm for just ten generations, although one would normally expect the GA to require the completion of hundreds or thousands of generations to find optimum solutions, using a seven-parameter string. The best string emerged within six generations, which is early enough in the evolution of the algorithm to suggest that insufficient time was available to optimize the solution. Grubert et al.²⁸ also chose to abbreviate the operation of the algorithm. That group attempted to find an optimum catalyst for the water–gas shift reaction, starting in two different experiments from a pool of 72 or 36 catalytic materials. After running their calculation for seven generations or, in the latter case, for only three generations, they concluded that the best catalytic composition was “approached” by the genetic algorithm, but once again it is in the nature of an evolutionary algorithm that it needs to run for many generations if one is to have confidence that the optimum set of variable values has been found. In a study of the time dependence of the response of an ion-selective electrode, Watkins and Puxty²⁹ took reduction of the length of the GA string to an extreme limit, using a string consisting of just three values. The GA was used to provide initial estimates of the parameters describing the time-dependent response, and these parameters were then refined by nonlinear regression. Even with this two-step approach, the algorithm could not yield values within an order of magnitude of an alternative nonlinear fit. Rivera and co-workers³⁰ likewise applied the genetic algorithm to batch fermentation, using strings consisting of just five parameters.
When manipulating such short strings, it is evident that the building block mechanism (even given the limitations of that model of GA behavior) is unlikely to be effective at finding an optimum solution. As these examples suggest, the GA is not a universal optimizer, guaranteed to work with any kind of problem. It is true that, even when the problem is poorly suited to a GA approach, a slow drift toward good solutions may occur, because the GA may operate as a slightly intelligent random search machine; however, this does not imply that the GA would outperform other algorithms, whatever the problem.
NEURAL NETWORKS
Neural Network Principles
Although the operation of the neural network is rather different from the GA, it too derives its inspiration from nature.
Humans excel at some types of task: We can recognize the face of a friend almost instantly, even though the computations in the brain required to do so are complex. On the other hand, humans are not very adept at mathematical tasks: Although a computer could cube root 373232448215999 in milliseconds, few people could do this at all without the aid of pencil and paper or calculator, let alone manage it in less than a second. Conventional “linear” computer programs can readily outrun the human brain on numerical calculations, but computer scientists recognized several decades ago that, although the human mind is not engineered to cube root 15-digit numbers, it is well adapted to recognize patterns. They reasoned that, if it were possible to create a computer program that functioned in the same way as a human brain, the software might be able to emulate the learning abilities of humans and therefore be at least as good as humans at pattern recognition, but without the irritating propensity of humans to forget the important stuff and remember the trivial.
The artificial neural network, or ANN, was an attempt to create such software. The power of the method is now reflected in its widespread use in image recognition and classification tasks. Neural networks are valuable when one wishes to identify patterns or trends that are not readily apparent in data, whether the data consist of images or purely numerical data, and are also useful in providing answers to “what if” questions, in other words, in making predictions. The range of possible questions is very large, from “What would happen in this nuclear power station if the control rods were raised by 20%?” to “Would this applicant pay back £150,000 if we made a mortgage offer to him?” An additional advantage of neural networks is that, once fully prepared through suitable training, they are fast in execution, which makes them valuable in real-time applications such as process control.
It was not long after neural networks were proposed before computer scientists recognized that the natural world does not necessarily provide us with a perfect template for a learning machine. As with genetic algorithms, the links with the natural world that initially inspired this technique have become weaker as the method has matured. Although ANNs have changed substantially in structure and operation as they have developed, the basic model, in which many identical processing units cooperate to create a single large-scale tool, remains. To this limited extent, the structure of the most widely used neural network, the feedforward network, still resembles that of the brain. Just as there are several varieties of evolutionary algorithm, so the neural network is available in several flavors. We shall consider feedforward networks and, briefly, Kohonen networks and growing cell structures, but Hopfield networks, which we shall not cover in this chapter, also find some application in science.³¹
In a standard feedforward network, the raw data that the network is to assess are fed in and the network responds by generating some output. The input data might be, for example:
• The infrared absorption spectrum of a compound expressed as transmission intensities at a number of different wavelengths.
• In a process control application, the temperature, pH, viscosity, and composition of a chemical mixture in a continuous flow stirred reactor.
• A set of molecular descriptors for a particular molecule being considered as a possible drug.
The corresponding output could be:
• A number that signifies whether the molecule contains a carbonyl group.
• Outputs that specify what changes should be made to the composition of the feedstock and the rate of reactor heating in order to optimize product composition.
• An indication of whether this compound might be a suitable drug to treat tuberculosis.
Before a network can provide a meaningful response to the input data, it must be trained, so we turn now to the details of how to construct a neural network, and how one goes about training it.
Neural Network Implementation
A feedforward network, the type most commonly used in chemistry, is constructed from several artificial neurons (Figure 6), which are joined together to form a single processing unit. The operation of each artificial
Figure 6 An artificial neuron.
Figure 7 The step (Heaviside) threshold function.
neuron is particularly simple and loosely mimics that of the human neuron. Several inputs feed signals into the neuron; the neuron sums the inputs and then uses the result to determine what its output signal should be. The relationship between the summed inputs to a neuron and its output is an important characteristic of the network, and it is determined by a transfer function (or squashing function or activation function). The simplest of neurons, the perceptron, uses a step function for this purpose, generating an output of zero unless the summed input reaches a critical threshold (Figure 7); for a total input above this level, the neuron “fires” and gives an output of one. It is easy to construct a network of perceptrons by bolting them together so that the outputs of some of them form the inputs of others, but in truth it is hardly worth the effort. The perceptron is not just simple, it is too simple. A network of perceptrons constructed manually can perform a few useful tasks, but it cannot learn anything worthwhile, and since learning is the key to a successful neural network, some modification is needed. The problem with the behavior of the perceptron lies in the transfer function; if a neuron is to be part of a network capable of genuine learning, the step function used in the perceptron must be replaced by an alternative function that is slightly more sophisticated. The most widely used transfer function is sigmoidal in shape (Figure 8, Eq. [2]), although a linear relationship between input and output signals is used occasionally.
$$ f(v_k) = a \tanh(b v_k) \qquad [2] $$
The sigmoidal function generates a different output signal for each input signal, so the neuron can pass on information about the size of the input in a fashion that is not possible with a step function, which can transmit only an on/off signal. A network composed of neurons with sigmoidal functions can learn complicated behavior. Most importantly, it can learn to model nonlinear functions, and because nonlinear behavior is ubiquitous in science, this ability is crucial in producing a scientific tool of wide applicability.
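The contrast between the two transfer functions can be made concrete with a small sketch. The function names, the threshold of 0.5, and the input values below are illustrative assumptions, not taken from the chapter; only the form of the sigmoidal function follows Eq. [2].

```python
import math

def step(v, threshold=0.5):
    """Perceptron-style transfer function: 1 once the summed input
    reaches the threshold, 0 otherwise (threshold value is assumed)."""
    return 1.0 if v >= threshold else 0.0

def sigmoidal(v, a=1.0, b=1.0):
    """Sigmoidal transfer function of Eq. [2]: f(v) = a * tanh(b * v)."""
    return a * math.tanh(b * v)

def neuron_output(inputs, transfer):
    """Sum the incoming signals, then apply the transfer function."""
    return transfer(sum(inputs))

moderate = [0.4, 0.3]   # summed input 0.7
strong = [1.5, 2.0]     # summed input 3.5

# The step neuron fires identically for both; it only says on or off.
step_moderate = neuron_output(moderate, step)
step_strong = neuron_output(strong, step)

# The sigmoidal neuron passes on information about the size of the input.
sig_moderate = neuron_output(moderate, sigmoidal)
sig_strong = neuron_output(strong, sigmoidal)
```

Both inputs exceed the assumed threshold, so the step neuron cannot distinguish them, whereas the sigmoidal neuron returns a larger value for the stronger input.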
Development and Uses of Artificial Intelligence in Chemistry
Figure 8 The sigmoidal threshold function.
A feedforward neural network brings together several of these little processors in a layered structure (Figure 9). The network in Figure 9 is fully connected, which means that every neuron in one layer is connected to every neuron in the next layer. The first layer actually does no processing; it merely distributes the inputs to a hidden layer of neurons. These neurons process the input, and then pass the result of their computation on to the output layer. If there is a second hidden layer, the process is repeated until the output layer is reached.
We recall that AI tools need a memory; where is it in the neural network? There is an additional feature of the network that we have not yet introduced. The signal output by a neuron in one layer is multiplied by a connection weight w_ij (Figure 10) before being passed to the next neuron, and it is these connection weights that form the memory of the network. Each weight can be adjusted independently, so the neurons in the hidden layer, although they all take signals from the same set of input neurons, receive a different set of signals after the inputs have been multiplied by the connection weights. Training the network then consists of finding the set of weights that, when a particular input signal is passed through the network, will give the correct output signal.
At the start of training, a suitable network “geometry” must be chosen. A single hidden layer is common, although two or even more layers are sometimes used. The number of neurons is selected, bearing in mind the presumed complexity of the database. Once the geometry has been chosen, all connection weights are set to random values. This parallels what happens at the start of a GA run, so just like the GA, the neural network knows nothing to begin with.
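A minimal sketch of the layered, fully connected structure described above, assuming NumPy, a tanh transfer function in every processing layer, and an arbitrary geometry of 4 inputs, 3 hidden neurons, and 1 output. The names `init_weights` and `forward`, the weight range, and the omission of bias terms are illustrative choices rather than anything specified in the text.

```python
import numpy as np

def init_weights(layer_sizes, rng):
    """One matrix of connection weights w_ij per pair of adjacent layers,
    set to random values as at the start of training."""
    return [rng.uniform(-0.5, 0.5, size=(n_in, n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x, weights):
    """Pass an input vector through the network, layer by layer."""
    signal = np.asarray(x, dtype=float)  # input layer only distributes the data
    for w in weights:
        # Each signal is multiplied by its connection weight, the products
        # are summed at the receiving neuron, and the transfer function is applied.
        signal = np.tanh(signal @ w)
    return signal

rng = np.random.default_rng(0)
layer_sizes = [4, 3, 1]                  # hypothetical geometry
weights = init_weights(layer_sizes, rng)
output = forward([0.2, 0.5, 0.1, 0.9], weights)
n_weights = sum(w.size for w in weights)  # 4*3 + 3*1 connection weights
```

Because the weights are random, the output of this untrained network carries no meaning; everything the network will later "know" has to be stored in the entries of `weights`.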
Figure 9 A feedforward neural network.
Figure 10 Connection weights in a feedforward network.
A sample is chosen at random from the dataset and fed into the network. Each input node in the network accepts a particular item of data, so if an infrared spectrum was being fed in, the first neuron might be given the % transmission at 4000 cm⁻¹, the second the % transmission at 3990 cm⁻¹, the third the value at 3980 cm⁻¹, and so on. (In reality, it would be preferable to give the network only data that it might find useful, so positions in the spectrum at which absorption was likely to vary significantly among samples would be chosen. There is no advantage to be gained from spacing the input data evenly across the spectrum.) The network uses the input to generate some output; that output is then compared with the correct response and an error signal, normally the square of the difference between desired and actual output, is calculated. Suppose the input to the network is an infrared spectrum, from which the network must determine whether the molecule whose spectrum is being assessed contains a carbonyl group. We could require that the network output a value of one if it believes that a carbonyl group is present in the molecule, and zero otherwise. It is very unlikely that the untrained network will generate exactly the correct output when it is presented with the first sample, so the error in its prediction will be nonzero. In that case, the connection weights in the network are modified (see below) to reduce the error, making it more likely that the network will provide the correct answer the next time it sees this spectrum. A second sample is then chosen from the dataset and fed through the network; once again the network output is compared with the desired output and the network weights are adjusted in order to reduce the difference between desired and actual output. This process is repeated until all samples in the dataset have been fed through the network once; this constitutes one epoch.
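The error signal described above can be written out explicitly. The symbols d_p (desired output for sample p), y_p (actual network output), and N (number of samples in the dataset) are notation introduced here for illustration; the numerical values in the worked line are hypothetical.

```latex
% Squared-error signal for a single sample p
E_p = (d_p - y_p)^2
% Error accumulated over one epoch of N samples
E_{\mathrm{epoch}} = \sum_{p=1}^{N} (d_p - y_p)^2
% Hypothetical carbonyl example: desired output 1, untrained network outputs 0.3
E_p = (1 - 0.3)^2 = 0.49
```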
Many epochs are normally required to train the network, especially if the dataset is both small and diverse. In an alternative, and broadly equivalent, procedure the updating of the weights only occurs after the complete set of samples has been observed by the network, that is, after each epoch. It might seem that we should continue to train the network in this way until it can provide the desired output for every sample with an acceptable level of error. However, things are not quite so simple. The role of a neural network is to discover rules that allow it to generalize about the data in the database, not to learn to memorize each sample. Accordingly, we must judge the ANN’s performance not by how competently it can identify samples from within the training set (which, after all, it has already observed so it should know a bit about it), but by how well it does when confronted with data it has never observed before. For this purpose, before training begins, we divide the database into a training set, which is used to train the network—that is, to find the optimum set of connection weights—and a separate test set, which is used to assess how well the learning has progressed.
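The division of the database into a training set and a test set can be sketched as follows. The function name `split_dataset`, the 25% test fraction, and the placeholder samples are assumptions made for illustration; the chapter does not prescribe a particular ratio.

```python
import random

def split_dataset(samples, test_fraction=0.25, seed=0):
    """Shuffle the database and divide it into a training set and a
    separate test set that the network never sees during training."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical database: (sample identifier, carbonyl present? 1 or 0)
samples = [(f"spectrum_{i}", i % 2) for i in range(20)]
training_set, test_set = split_dataset(samples)
```

The split is made once, before training begins, so that performance on `test_set` measures generalization and not recall of samples already observed.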
Training is fundamental to the application of every neural network, and much effort has been devoted to determining how this should be done in a way that is both efficient and effective. The training of feedforward networks is a type of supervised learning, which means that the answer that the network is required to produce is always provided to it along with the sample; all the network has to do is to adjust itself so as to generate a prediction that is as close as possible to the right answer. The most widely used method of adjusting the connection weights is known as backpropagation. In this method, the signals that generated the final output of the network are inspected to see which contributed the most to the error signal. The weights of all connections to the output units are adjusted to reduce the error, with those connections that made the greatest contribution to that error being changed by the largest amount. The error on the output units is then propagated one layer backward to weights on the previous layer, and these weights are also adjusted. Once all weights have been adjusted, another sample is drawn from the database and the process is repeated, a forward pass generating the error signal and a reverse pass being used to modify connection weights, until the performance of the network is satisfactory (or until we run out of patience). The mathematical treatment of backpropagation, which is a type of gradient descent, can be found in any standard text on neural networks.
ANNs are versatile tools, so they can be applied to the analysis of many different types of data. The input might be the intensities at defined wavelengths from an infrared spectrum, and the output might be the identification of the functional groups in a molecule. A network of the same structure could just as easily be taught, through suitable training, to use personal details about a person’s credit history to make a judgment whether they were a good risk for a mortgage. 
Sharda and Delen32 have even used ANNs to predict the box-office success of films; an effort, one presumes, of considerable potential commercial value. However, there seem at present to be no reports in the literature demonstrating that their method has led to the creation of any new Hollywood blockbusters.
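The forward pass and reverse pass of backpropagation described above can be sketched for a network with one hidden layer. This is a minimal illustration under stated assumptions, not the treatment found in the standard texts: it assumes NumPy, tanh hidden neurons, a logistic output neuron, bias terms (which the chapter does not discuss), a learning rate of 0.5, and a synthetic dataset in which the target is 1 when one hypothetical "transmission" channel is low.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_network(X, d, n_hidden=4, rate=0.5, epochs=200, seed=1):
    """Online backpropagation; returns (error before, error after) training."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-0.5, 0.5, size=(X.shape[1], n_hidden))  # input -> hidden
    b1 = np.zeros(n_hidden)
    W2 = rng.uniform(-0.5, 0.5, size=n_hidden)                # hidden -> output
    b2 = 0.0

    def forward(x):
        h = np.tanh(x @ W1 + b1)
        return h, sigmoid(h @ W2 + b2)

    def mean_error():
        return float(np.mean((d - forward(X)[1]) ** 2))

    initial_error = mean_error()
    for _ in range(epochs):                      # one pass over all samples = one epoch
        for p in rng.permutation(len(X)):
            h, y = forward(X[p])                 # forward pass
            # Gradient of the squared error at the output neuron's summed input.
            delta_out = 2.0 * (y - d[p]) * y * (1.0 - y)
            # Propagate the error one layer backward (tanh derivative is 1 - h^2).
            delta_hidden = delta_out * W2 * (1.0 - h ** 2)
            # Gradient-descent updates: larger contributions change more.
            W2 -= rate * delta_out * h
            b2 -= rate * delta_out
            W1 -= rate * np.outer(X[p], delta_hidden)
            b1 -= rate * delta_hidden
    return initial_error, mean_error()

# Synthetic, hypothetical data: 3 channels; "carbonyl" if channel 1 absorbs strongly.
data_rng = np.random.default_rng(0)
X = data_rng.uniform(0.1, 0.9, size=(40, 3))
d = (X[:, 1] < 0.5).astype(float)
initial_error, final_error = train_network(X, d)
```

Here the weights are updated after every sample; accumulating the updates and applying them once per epoch would give the batch variant mentioned earlier.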
Why Does the Neural Network Work?
An untrained neural network knows nothing. During training, adjustment of network weights is designed to ensure that, when presented with a particular sample for the second time, the network is more likely to output the desired response. It is easy to accept that, shown the same data again and again, the network could adjust its weights in a fashion that leads to the required output. The clever bit is that it can learn to do this for many different samples; this requires the learning of rules, rather than specifics. The ability of the network to learn rules rather than remember samples depends on a variety of factors, such as how many rules are needed to
adequately describe the features of the samples in the database, and how the size of the database and the number of rules required compare with the number of connection weights. If the number of samples in the database is comparable with, or less than, the number of connection weights, the network may learn specific examples from the dataset and few general rules. However, if the dataset is sufficiently large and diverse, the network will learn the general rules that underlie the dataset, thus, in effect, deducing the laws buried in the database. This observation leads to the intriguing possibility of automatic rule discovery. Suppose that we launch a neural network into a large and complicated dataset in which the relationship between the data and the desired output is inadequately understood. Once fully trained, the network weights should encode relationships that allow it to describe these links. Such relationships are in effect equivalent to scientific laws, so this process is rule-discovery by machine. Software capable of doing this is not far over the scientific horizon; indeed, some rule-discovery networks are already operating. There is, however, a problem: It is difficult to extract laws from a standard feedforward neural network, because the laws are encoded in a complicated way in the connection weights. Our network trying to assess an infrared spectrum to determine whether it indicates that a carbonyl-containing compound is present will not, having assessed the database, report that “If there is a peak in the spectrum near 1760 cm⁻¹, the molecule contains a carbonyl group; if there is no peak it probably does not,”
even though it will have, in effect, deduced this. Instead, the network’s understanding of this rule is disguised in the set of connection weights from which the extraction of a human language rule may not be simple. Nevertheless, rule-discovery neural networks, or their derivatives, have great potential and will play an increasingly important role in scientific progress.33
What Can We Do with Neural Networks?
ANNs have the advantage that no assumptions need to be made about the data before analysis starts. In analyzing spectra, for example, the underlying shape of the spectral distribution or line shape is of no consequence to a neural network, unlike the requirements of more traditional methods. They are also tolerant of noise and outliers. In contrast to expert systems, they can accommodate aberrant points without too much difficulty. However, as noted above, it is difficult to extract from the network weights an algebraic function that relates input to output, so most neural networks function as black boxes; in addition, training is lengthy, although the operation of a trained network is rapid. Let us consider now some areas of scientific exploration in which ANNs have been used.
Reactivity
Laboratory or field testing of explosives is expensive, subject to large run-to-run variations and, it will be no surprise to learn, potentially hazardous. Keshavarz and Jaafari34 used ANNs to predict the impact sensitivity of explosives, based on a knowledge of a group of ten molecular descriptors. Most descriptors that they used were binary, denoting the presence or absence of a particular structural feature in the molecule, for example, an aromatic ring or an N–NO2 group. The drop height H50, the height from which a 2.5-kg weight when dropped on a sample will cause detonation in 50% of cases, was predicted by the network with a reasonable degree of success. Although the authors argue that their network can be used to make predictions, they did not attempt to interpret those results in which the network failed to make a reliable prediction. Santana et al.35 have investigated the use of ANNs to predict the cetane number, a measure of how readily a vapor ignites under compression, for a range of hydrocarbon fuels. Because a complete set of descriptors was not available for the molecules, the group chose to divide the hydrocarbons into two groups, based on what descriptors were known for each molecule. The tactic of using different sets of descriptors for each group and different networks to assess them was adopted to avoid the difficulties that neural networks may encounter when some data points are missing, a situation that would arise if two different descriptor sets were fed into a single network. The calculation was hampered by substantial uncertainty in the database of cetane numbers, but the workers quote generally accurate predicted CNs and consider the mechanism that might lead to the observed correlations.
QSAR
Molecular descriptors are widely used in quantitative structure-activity relationship (QSAR) and similar studies. Experimental values of IC50 for 277 inhibitors of glycogen synthase kinase-3 were used by Katritzky et al. 
to develop a QSAR model of the biological activity of the inhibitors, with a multilinear model and an ANN model being compared with experimental data. The set of descriptors was calculated entirely from the molecular structures. The number of descriptors that could potentially be of value in QSAR studies is large; for example, Katritzky et al.36 started their study with a pool of 961 descriptors. Such a large collection of descriptors is rarely carried through an entire calculation because it typically contains a considerable degree of redundancy; the set of descriptors is normally reduced to a more manageable size before additional analysis. By removing correlated descriptors, those with a small variance ratio or for which no values were available for some structures, and deleting those that showed apparently random correlations, Katritzky’s group reduced the number of descriptors to 12; from these, six were selected through sensitivity-stepwise analysis to build the ANN model. The selection of a limited set of descriptors that are most effective in describing the behavior of
the system helps to ensure a good working model, but it also helps in identifying the key interactions that determine behavior. Details about how variable selection is done can be found in a previous tutorial in this book series.10 Recognizing that eliminating compounds with unacceptable toxicity at an early stage can result in significant cost savings in the development of novel drugs, Molnar et al.37 considered ways to assess the cytotoxicity of potential drugs, choosing to work with a large pool of descriptors. Overall, 164 descriptors were used as inputs into a network containing 13 hidden neurons. Cytotoxic compounds in the library were identified correctly about 75% of the time, which might be viewed as a slightly disappointing result in view of the large number of descriptors used. However, even a 75% reduction in the number of chemicals that need to be synthesized and tested in the laboratory yields significant savings in both cost and time.
Physical Chemistry
Light scattering is a convenient technique for measuring particle sizes of polymers, animal and plant cells, and other microscopic particles.38 The prediction of the direction and intensity of light scattered from a known size distribution of particles is a well-studied problem, but the inverse problem, that of determining the range of particle sizes from an experimentally observed pattern of scattered light, is more challenging. Neural networks were used by Berdnik et al.39 to predict the size of an average particle and the variance of the size distribution function. The network was trained using a theoretical model, and then tested using experimental flow cytometry data. A modest performance in determining the mean refractive index was ascribed to the presence of noise in the data. 
In a paper relating to the partitioning of a substance between two solvents, Kalach40 comments that ‘‘based on the available concepts of solutions, it is impossible to make a priori estimates of extracting activities of organic solvents. . .’’ Whether this view is justifiable, Kalach’s work illustrates the fact that ANNs may be, and often are, applied to situations in which whatever correlations do exist may be unknown. Kalach trained the network using a small set of around 20 benzoic acid derivatives, using as input data a combination of discrete data (the presence or absence of a substituent in a particular ring position) and continuous data such as melting point and pKa. Reasonable results were obtained for solvent partitioning, although the testing sets used were small. Analytical methods are ripe for attack using AI methods. Capillary electrophoresis is a routine separation technique, but like other separation techniques, its effectiveness is correlated strongly with experimental conditions. Hence it is important to optimize experimental conditions to achieve the maximum degree of separation. Zhang and co-workers41 studied the separation of mixtures in reserpine tablets, in which vitamin B1 and dibazolum may be incompletely separated, as may promethazine hydrochloride and chloroquine
phosphate. One could feed raw electrophoresis data into a network, but many network inputs would have been required that would have slowed training, without any certainty that the final analyses would have been improved by the network having access to the complete instrumental output. Instead, a preliminary principal components analysis step was used to reveal that two principal components account for virtually all variation in the data; this reduced the dimensionality of the data greatly and, therefore, the time required for training, as these two components could then be used to allow quantitation. Shan and Seidel-Morgenstern42 studied a problem that shares features with the capillary electrophoresis studies of Zhang et al. and with Kalach’s work. The ability to separate components in a mixture using prep-liquid chromatography depends not only on finding the appropriate solvent, but also on selecting suitable experimental conditions, such as temperature and column packing. Gradient methods, in which temperature and/or solvent composition are varied during a chromatographic run, are widely used. These authors investigated the use of a neural network to approximate the relationship between the adjustable parameters and the objective functions, such as recovery yield, and then used this relationship to find the optimum conditions. Several authors have considered the use of ANNs in X-ray spectrometry. Luo43 has provided a useful overview of the use of AI methods in this area, with some discussion of their relative merits. Proteins Determination of the structure of proteins is a crucial first step in computational drug design. Various classification databases for proteins exist, some of which rely on manual classification and others automatic classification.44 However classification is attempted, it is a challenging task, as illustrated by recent work by Cheng, Sweredoski and Baldi,45 in which recursive neural networks were used to develop a protein domain predictor. 
Despite being among the most successful predictors to date in this area, their model could correctly predict the protein domain for barely two thirds of proteins in a dataset comprising both single- and multidomain proteins. The work of Passerini et al.46 also used neural networks in the prediction of protein structure, and they too have reported a success rate of around two thirds. Wang et al.,47 starting from an abstract representation of the protein structure, have used neural networks to provide efficient classification. Automatic classification and structure prediction methods are likely to become important tools because of the rate at which the number and size of protein databases are growing, but this is one of the most demanding scientific areas in which artificial intelligence tools are currently being employed. Sensors One of the most interesting areas of research that combines chemistry with ANNs is the use of networks to interpret data from sensor arrays.
A sensor array, often known as an electronic nose, consists of several fast sensors, each capable of responding to the presence of chemicals such as flavors, volatile organic compounds (VOCs), or pollutants in a sample. When a mixture of materials is passed across the sensor array, each sensor generates a different response, which may be expressed by the variation of several parameters, generally describing the variation of the sensor output with time. There are many applications of sensors in the recent literature, of which the report from Gualdron et al.,48 who used a bank of 12 Taguchi Gas Sensors (TGSs), is typical.
Combination of ANN with Another Technique
Huang and Tang49 trained a neural network with data relating to several qualities of polymer yarn and ten process parameters. They then combined this ANN with a genetic algorithm to find parameter values that optimize quality. Because the relationships between processing conditions and polymer properties are poorly understood, this combination of AI techniques is a potentially productive way to proceed. Cartwright, Sztandera and Chu50 have also used the combination of a neural network with a GA to study polymers, using the neural network to infer relationships between the structure of a polymer and polymer properties and the genetic algorithm to predict new promising polymer structures whose properties can be predicted by the network.
What Can Go Wrong?
As we saw earlier, user-selectable parameters exist in the genetic algorithm; in neural networks too, there are parameters whose values affect how successful the network may be and how rapidly training converges. The selection of suitable values is in fact even more important when using neural networks than was the case for a GA, because a badly chosen geometry for the ANN may create a tool whose predictions are not just poor but also misleading. If the network contains too few nodes, it will learn only the most general features of the data and be unable to make the necessary finer distinctions between samples. By contrast, if the network has too many neurons, the number of connection weights will be large and the network is likely to fall into the trap of recognizing specific examples in the database rather than learning more general rules. To tackle this problem, some workers such as Palmes and Usui51 have tried to evolve the structure of the network at the same time as the network is being trained, and this appears to be a promising approach to ensure that the ANN neither underperforms nor memorizes data. A similar approach is adopted in cascade-correlation learning,52 in which an initial network with no hidden units is trained. If the network cannot learn satisfactorily, a new hidden unit is chosen from a set of randomly-generated candidates and training is continued; the process is repeated until the network is of satisfactory quality.
Figure 11 Variation of training error (solid line) and testing error (dashed line) as a function of epoch during typical neural network training.
As we have observed, training consists of repeated passes through the training data set, so that the network learns the correlations that are buried within it. It might seem, thinking back to one’s school days, that there could never be too much training, but in the case of neural networks, this is not so. During training, a network first learns the general rules that exist within the dataset. If training is continued beyond this point, the network may then try to improve its performance further by starting to memorize particular training samples. Although this may improve the network’s performance on the training set, it usually leads to a degradation of performance with the testing set. Hence it is common practice to run the training until performance with the testing set begins to degrade, and then to bring the training to a halt, as depicted graphically in Figure 11. Other methods of assessing network performance also exist, most notably cross-validation, which is usually superior to the ‘‘split-sample’’ method described above when only a small dataset is available for training. The interested reader will find these methods described in detail in standard texts on neural networks. It is also important to ensure that the dataset adequately covers the range of tasks to which the network will be applied; examples of all features that the network will be asked to recognize must therefore exist in the dataset. We should not expect a network trained to detect the presence of a carbonyl group in a compound to be able to also determine the presence of a C–Cl moiety unless it has been exposed to several examples of the latter group and has been specifically trained to recognize them. Just as it is common to feel that if a little training is good then more must be better, some researchers have worked on the assumption that if a few neurons are good, many neurons must be better. This working assumption is often incorrect. 
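The practice of halting training once performance on the testing set begins to degrade (Figure 11) can be sketched as a simple rule applied to the sequence of testing errors. The function name `stopping_epoch`, the `patience` parameter (how many non-improving epochs to tolerate before halting), and the error values are illustrative assumptions; the text itself describes stopping as soon as degradation begins.

```python
def stopping_epoch(test_errors, patience=3):
    """Return the epoch with the lowest testing error seen before training
    is halted. Training halts once the testing error has failed to improve
    for `patience` consecutive epochs."""
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(test_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error   # still learning general rules
        elif epoch - best_epoch >= patience:
            break                                   # testing error keeps degrading
    return best_epoch

# Hypothetical testing errors, one value per epoch.
test_errors = [0.40, 0.30, 0.22, 0.18, 0.17, 0.19, 0.21, 0.24, 0.15]
best = stopping_epoch(test_errors, patience=3)
```

In this made-up sequence training is halted at epoch 7, so the low value at epoch 8 is never reached; the weights saved at the returned epoch would be the ones kept.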
Two hidden layers are all that is needed for an ANN to deal with discontinuous or nonlinear functions. More layers may be used, but they are not normally necessary except for specialized applications such as bottleneck networks,53 which are not covered in this chapter. The inclusion of extra
layers usually results in the inclusion of extra neurons, and whether this may lead to an improvement in performance or a degradation must be judged in the light of the nature of the database. The database must be large enough to cover the entire domain, and there must be more samples than weights; otherwise the network will train to recognize each sample individually rather than derive general rules as mentioned above. A number of samples equal to 1.5 times the number of connection weights is a realistic minimum, but a greater proportion of samples is desirable. Some data preprocessing may be needed, because the absolute magnitude of the inputs must be within an appropriate range for the ANN to function maximally. The sigmoidal transfer function works best when input data are in the range 0.1 to 0.9. If the input data are well outside this range, much time during training will be spent simply adjusting the network weights in order to scale the input data appropriately. As an example of the problems that can arise, we point to the work of Zampronio, Rohwedder and Poppi,54 who applied ANN to the analysis of citric and malic acids in fruit juices, which is a suitable area in which to apply neural networks. They tried a number of different network sizes, but they found that the standard error of prediction varied little with network geometry. Observing the lack of correlation between network performance and geometry, they concluded that the number of neurons in the hidden layer was not important. However, behavior that is indifferent to network geometry is more likely to indicate either insufficient training or that the number of factors needed to describe the dataset is small. Even a small network is therefore able to find general rules. The predictions of networks trained under such conditions, where the optimum number of neurons has not been determined, must be treated with some caution. 
Similarly, Song, Xu and Yu55 used a database of 16 potentiometric titrations to train a network with 35 hidden nodes. The number of samples must be greater than the number of connection weights, not less, which was not the case in this study, so the network is unlikely to generalize properly. The chemical solutions used in the titrations contained maleic acid, propanedioic acid, and succinic acid. One would expect that an ANN would be able to learn to recognize titration curves for these simple acids almost exactly, but the network gave results with 5% error, providing further evidence that the choice of the network geometry was probably inappropriate.
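Two of the checks discussed above, the ratio of samples to connection weights and the rescaling of inputs into the range 0.1 to 0.9, are easy to automate. In the sketch below the function names are hypothetical, bias weights are not counted, and the layer sizes of 10 inputs and 1 output are invented for illustration; only the 35 hidden nodes, the 16 samples, and the factor of 1.5 come from the text.

```python
def count_weights(layer_sizes):
    """Number of connection weights in a fully connected feedforward network."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

def enough_samples(n_samples, layer_sizes, ratio=1.5):
    """True if the dataset meets the minimum of `ratio` samples per weight."""
    return n_samples >= ratio * count_weights(layer_sizes)

def scale_inputs(values, low=0.1, high=0.9):
    """Linearly rescale raw input values into the range [low, high]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:                      # constant input: place it mid-range
        return [(low + high) / 2.0 for _ in values]
    return [low + (high - low) * (v - vmin) / (vmax - vmin) for v in values]

layer_sizes = [10, 35, 1]                 # input and output sizes are assumed
n_weights = count_weights(layer_sizes)    # 10*35 + 35*1
sufficient = enough_samples(16, layer_sizes)
scaled = scale_inputs([2.0, 7.0, 12.0])
```

Whatever the true number of inputs, a hidden layer of 35 nodes implies at least 35 weights into the output layer alone, so 16 samples cannot satisfy the criterion.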
SELF-ORGANIZING MAPS
The layered structure of feedforward neural networks provides a flexible tool that allows us to relate input data to some desired output, but what if there is no output? Can a neural network still do something useful? Rather curiously, the answer is yes, if we are prepared to employ a different kind of neural network. Numerous classification tasks exist in science, in which
each sample contained in a large database must be assigned to one of a limited number of groups. For example, we might wish to classify drugs by their range of effectiveness and side effects, or solvents by their ability to dissolve each of a number of different substances. The relevant AI method in this case is the self-organizing map (SOM) or Kohonen network, named after its originator.56 In a SOM, each member of the dataset is located on a two-dimensional map in such a fashion that similar samples, such as two compounds whose infrared spectra are alike, are positioned close together. Although this gathering together of samples on a two-dimensional map is not in itself particularly original, clustering can yield information if, when the clustered map is inspected, it becomes apparent that samples sharing a particular property, such as being indicators of breast cancer, lie close together in a single region of the map.57 The role of the SOM is to take data of high dimensionality, such as infrared spectra or the output from GC runs, and squash the data onto two dimensions in such a fashion that samples that are close together in n dimensions remain close together in two dimensions. Samples of similar characteristics, such as infrared spectra that display evidence of a carbonyl group, may then be found to be clustered in particular regions of the completed map. If a sample whose identity is unknown is fed into the map and is found to lie in one of those areas, one can conclude that it is likely that the sample contains a carbonyl group. The structure of a SOM is different from that of the feedforward network. Instead of the layered structure of the feedforward network, there is a single layer of nodes, which functions both as an input layer and an output layer. In a feedforward network, each node performs a processing task, accepting input, processing it, and generating an output signal. 
By contrast, in a SOM, every node stores a vector whose dimensionality and type matches that of the samples. Thus, if the samples consist of infrared spectra, each node on the SOM stores a pseudo-infrared spectrum (Figure 12). The spectra at the nodes are refined as the network learns about the data in the database and the vector at each node eventually becomes a blended composite of all spectra in the database. Because no ‘‘right answer’’ is attached to the database samples that are fed into a SOM, the process of training a SOM is rather different from that of training a feedforward network. A sample is selected at random from the database and given to the network. This sample is compared in turn with the vector stored at each node and the difference between sample and vector is calculated. The node whose vector most closely resembles the sample data, the winning node, is identified and the vector at that node is adjusted slightly to make it more like the sample. Similar, but smaller, adjustments are made to nodes that are neighbors to the winning node, and these adjustments ripple away from the winning node, with each vector being adjusted by an amount that falls off with distance from the winning node. Once the adjustments have
Development and Uses of Artificial Intelligence in Chemistry
Figure 12 The geometry of a typical self-organizing map.
been made, another sample is chosen from the database and the process of selecting a winning node and modifying the node vectors is repeated. The process continues until the presentation of further samples to the network produces a negligible change in the vectors at the nodes, which indicates that the map has converged.
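The training procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in any of the studies cited here: the function names `train_som` and `winning_node`, the Gaussian neighborhood function, the linear decay of the learning rate and neighborhood radius, the random starting vectors in [0, 1), and the tolerance `tol` are all choices introduced for this sketch.

```python
import math
import random


def winning_node(nodes, sample):
    """Return the map coordinates of the node whose vector is closest to sample."""
    def sq_dist(coord):
        return sum((w - x) ** 2 for w, x in zip(nodes[coord], sample))
    return min(nodes, key=sq_dist)


def train_som(samples, rows, cols, max_epochs=200, lr0=0.5, tol=1e-4, seed=0):
    """Train a rectangular SOM; returns a dict mapping (row, col) to a vector."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Every node stores a vector with the same dimensionality as the samples.
    # Random starting values in [0, 1) assume the data are scaled to that range.
    nodes = {(i, j): [rng.random() for _ in range(dim)]
             for i in range(rows) for j in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(max_epochs):
        frac = epoch / max_epochs
        lr = lr0 * (1.0 - frac)                     # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 0.5)   # assumed shrinking neighborhood
        largest_change = 0.0
        for _ in range(len(samples)):
            x = rng.choice(samples)                 # sample chosen at random
            wi, wj = winning_node(nodes, x)
            for (i, j), vec in nodes.items():
                d2 = (i - wi) ** 2 + (j - wj) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))  # falls off with map distance
                for k in range(dim):
                    delta = lr * h * (x[k] - vec[k])
                    vec[k] += delta
                    largest_change = max(largest_change, abs(delta))
        # Stop once further samples produce a negligible change in the vectors.
        if largest_change < tol:
            break
    return nodes


# Toy data: two well-separated groups of two-dimensional samples.
toy_samples = [[0.0, 0.0], [0.05, 0.0], [0.0, 0.05],
               [1.0, 1.0], [0.95, 1.0], [1.0, 0.95]]
toy_map = train_som(toy_samples, rows=3, cols=3)
print(winning_node(toy_map, [0.0, 0.0]), winning_node(toy_map, [1.0, 1.0]))
```

The winning node receives the full adjustment (the neighborhood factor is 1 at zero distance), and every other node receives a smaller one, so the two groups in the toy data end up associated with different nodes of the map.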
Where Is The Learning?

The memory of the SOM is contained in the node vectors. The effect of training is to modify these vectors so that they eventually resemble some average of the sample data. However, this does not imply that all node vectors converge to the same average. If the comparison between sample data and node vectors shows that the vector at node (i,j) most closely resembles a particular sample, that vector will be adjusted to make it more like the sample by an amount that is greater than the adjustment at any other node. More distant nodes are adjusted only slightly, and in time, they are likely to more accurately represent other spectra in the dataset. In this way, the vectors at nodes in different regions in the two-dimensional map gradually evolve to represent different classes of sample (Figure 13). If, by way of example, the database contained information on a large number of solvents, with data on their ability to dissolve a variety of common solids, once training was complete, those solvents that were effective
Figure 13 A typical trained self-organizing map (unpublished data).
for ionic solids would be likely to cluster in one region of the map, whereas those that were better solvents for organic molecules might be found in another region.

Such maps can be used in a most intuitive and straightforward way. To take the example of solvents once again, if we chose from the database a sample that is a good solvent for ionic substances, and fed it into the trained map, its properties would be similar to the vector at a node on the map that lies in the region corresponding to ionic-dissolving solvents. There is nothing surprising about this, but if we now take a second solvent whose properties as a solvent are unknown because it was not a member of the training set, and feed this into the map, its properties will most closely match the vector at a particular node. By checking the area of the map around that node to see whether it is dominated by ionic-dissolving solvents or organic-dissolving solvents, we can determine the likely behavior of this material as a solvent.

An early classic example of the use of a SOM is the clustering of samples of olive oil,58 where oils from similar geographic regions were found to be clustered together on the map, thus allowing determination of the origin of an unknown sample of oil, merely by checking to see the region on the map with which any particular oil was most strongly associated.

SOMs can be convenient and effective tools for organizing data. However, in view of what has been said above about GAs and ANNs, it will be no surprise to find that critical parameters exist in the development of a SOM. In particular, the dimension of the map is crucial. Enough space across the map must be available to allow samples of very different characteristics to position themselves far apart, whereas similar samples can cluster together.
This might suggest that maps should be large, but if the map is too large, all samples will be widely scattered and the clustering, which is the reason that we use a SOM, will hardly be evident, so the algorithm will have accomplished little. It is reasonable to assume that there is probably some ideal size for the map derived from a given dataset. The problem is knowing just what this size might be,
because it depends on the size of the dataset, its diversity, the number of “different” types of examples in the dataset (almost certainly unknown), and other factors. One effective technique that can be used to determine a suitable dimension for a SOM is to allow it to grow by itself, rather than forcing it to have a defined geometry as a starting point. In the growing cell structure (GCS) approach, the initial map consists of just three nodes, a number that is almost always too small to describe the variability in a diverse dataset effectively. The map is then encouraged to expand through the gradual addition of nodes, until it reaches a size that is sufficient to take account of the variation in the dataset.
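Whatever the size of the map, the way a trained map is used to judge an unknown sample, as in the solvent example, can be sketched as follows. This is a hedged illustration only: the five-node map, its two-component vectors (imagined here as scaled scores for dissolving ionic and organic solids), the helper names `nearest_node`, `label_map`, and `classify`, and the `reach` parameter that defines the "area of the map" are all hypothetical and are not taken from any cited study.

```python
from collections import Counter


def nearest_node(nodes, sample):
    """Map coordinates of the node whose vector best matches the sample."""
    return min(nodes,
               key=lambda c: sum((w - x) ** 2 for w, x in zip(nodes[c], sample)))


def label_map(nodes, samples, labels):
    """Record which known labels fall on each node of an already trained map."""
    hits = {}
    for sample, label in zip(samples, labels):
        hits.setdefault(nearest_node(nodes, sample), Counter())[label] += 1
    return hits


def classify(nodes, hits, unknown, reach=1):
    """Majority label among known samples within `reach` map steps of the winner."""
    wi, wj = nearest_node(nodes, unknown)
    votes = Counter()
    for (i, j), counts in hits.items():
        if max(abs(i - wi), abs(j - wj)) <= reach:
            votes.update(counts)
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical trained map: a single row of five nodes.
trained = {(0, 0): [0.9, 0.1], (0, 1): [0.7, 0.3], (0, 2): [0.5, 0.5],
           (0, 3): [0.3, 0.7], (0, 4): [0.1, 0.9]}
known = [[0.95, 0.05], [0.75, 0.2], [0.25, 0.8], [0.05, 0.95]]
known_labels = ["ionic", "ionic", "organic", "organic"]

hits = label_map(trained, known, known_labels)
print(classify(trained, hits, [0.85, 0.2]))
print(classify(trained, hits, [0.15, 0.8]))
```

Returning `None` when no labeled sample lies near the winning node makes explicit the case in which the map offers no evidence about the unknown sample.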
Some Applications of SOMs

The number of applications of SOMs in chemistry is at present small, but the method is becoming increasingly popular. Recognizing that drugs are usually small molecules and that many contain common motifs, Ertl et al.59 used a SOM to analyze molecular scaffolds of approximately 500,000 potential drug molecules, using several molecular descriptors for each molecule. By analyzing which molecules were clustered together on the map, they found strong correlations between possible therapeutic activity and the number of atoms in each molecule and between activity and the number of nitrogen, oxygen, or sulfur atoms, although there was no convincing link with the molecular dipole moment.

Zuppa et al.60 have used SOMs in the assessment of data from an electronic nose. Six chemicals (water, propanol, acetone, acetonitrile, butanol, and methanol) were presented at varying concentrations to a 32-element conducting polymer gas sensor array. The output was used to train a group of SOMs, rather than a single SOM, to avoid the problems of parameter drift. One SOM was associated with each vapor, and with suitable use of smoothing filters, the SOM array was found to perform effectively.

The GCS is relatively uncommon in chemistry at present. However, GCSs are generally applicable in situations in which a conventional SOM might be used, and because the final map is more likely to be of the appropriate size than if a geometry for the SOM is assumed in advance, it is expected to be effective at clustering. In an early application, Walker, Cross and Harrison57 applied GCSs to the assessment of fine-needle aspirates of breast lesions, finding good correlation between several parameters and benign or malignant regions on the map. More recently, Wong and Cartwright61 have demonstrated the power of the technique in the assessment of the very large datasets generated by mass spectrometric investigation of biofluids.
They compared their method of deterministic projection with centroid mapping and random projection and showed that, for such datasets, deterministic projection generally outperforms the other methods.
EXPERT SYSTEMS

Expert systems were once widely used in chemistry, but their use has diminished in recent years as other methods have taken over. An expert system is a software tool whose role is to mimic the decision-making process of a human expert. It encapsulates the knowledge of the expert as a set of rules so that advice can be provided to the user without direct access to a human expert. Such software can allow inexperienced users to take advantage of expert knowledge in a specialized area, such as the analysis of polluted river water, encompassing the types of methods to employ in the analysis of particular pollutants, what solvents to use for LC for optimum separation, what types of technique are most appropriate to minimize the effects of contamination by other species, and so on.

One might think that a neural network could be taught to derive rules such as these, but a neural network is not reliable when a database contains very few examples of a particular feature, and so is not well able to accommodate “one-off” examples. Nor would it be easy for a computer to learn such knowledge by observation, because an expert cannot always explain precisely why they reach a particular decision. In fact, human experts tend to rely not just on expert knowledge, but also on intuition, based on a “feel” for the situation.

A significant advantage of expert systems is that they are generally constructed using a natural language interface, so that one can hold something that passes for a conversation with the software. This interface is particularly appropriate for analytical laboratories in which work may be performed by technicians who may not possess the specialist knowledge required for some analyses. When computer software was relatively unsophisticated, these systems seemed to offer considerable promise, sometimes being incorporated into the software provided by manufacturers with instruments such as liquid or gas chromatographs.
To a large extent, such systems have been overtaken by software that can provide informative spectra without advance knowledge of the sample characteristics, and to a lesser degree, they are threatened by neural network software, which is gradually becoming more effective at learning automatically how the experts think. Furthermore, the development of an expert system is a slow process, determined by the speed with which an expert can be interrogated in order to fill the expert database. Nevertheless, where it is difficult for a computer to learn expert system rules, the traditional expert system still has a place.

A typical recent example of the use of expert systems is provided by the work of Dobrzanski and Madejski62 who have developed a prototype system for selecting a coating for metals that provides an optimum combination of appearance, abrasion resistance, color, and other factors. A less scientific, but still intriguing, example of the use of these systems is HuskEval, an expert system for evaluating the quality of Siberian Huskies.63
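The idea of encapsulating expert knowledge as a set of rules can be illustrated with a very small forward-chaining sketch. Everything in it is a placeholder introduced here: the rule texts are invented for illustration and are not analytical advice, the names `RULES` and `infer` are hypothetical, and real expert systems add features (a natural language interface, explanations, certainty factors) that this sketch omits.

```python
# Each rule pairs a set of conditions with the conclusion they support.
# The rule contents below are illustrative placeholders only.
RULES = [
    ({"sample is river water", "pollutant is a nonvolatile organic"},
     "analyze by liquid chromatography"),
    ({"analyze by liquid chromatography", "analyte is polar"},
     "consult the solvent table for polar analytes"),
    ({"sample is river water", "pollutant is a volatile organic"},
     "analyze by gas chromatography"),
]


def infer(facts, rules):
    """Apply rules repeatedly until no new conclusion can be drawn."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known - set(facts)   # report only the advice that was derived


advice = infer({"sample is river water",
                "pollutant is a nonvolatile organic",
                "analyte is polar"}, RULES)
print(sorted(advice))
```

Because each rule is stored as data rather than code, a single unusual case can be accommodated by adding one rule, which is the kind of "one-off" knowledge that the text notes a neural network handles poorly.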
CONCLUSION

This chapter has sought to provide a sample of AI methods that offer the greatest potential in chemistry, focusing on genetic algorithms and neural networks. Methods such as these offer the possibility of tackling problems that cannot be solved using conventional methods or can be solved only slowly. Their use is therefore set to grow, and to grow rapidly. Just as it is possible to use conventional methods in an inappropriate way, AI methods must be used with care, and with an appreciation of their limitations. As we have observed in this chapter, most AI algorithms contain adjustable parameters, such as the size of a genetic algorithm population, or the geometry of a neural network, and the values of such parameters must be chosen with care if the algorithm is to be able to compete against alternative methods. The characteristics of the data and the solution are important too; for the genetic algorithm to be effective, it must be possible to write the solution in linear (or array) format, and the data fed into a neural network often need to be scaled to reduce training time. However, these limitations are not too severe, and when used with understanding, AI algorithms can outperform other methods across a wide range of applications. Their future in chemistry thus seems assured.
REFERENCES

1. R. Judson, in Reviews in Computational Chemistry, Vol. 10, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley, New York, 1997, pp. 1–100. Genetic Algorithms and Their Use in Chemistry.
2. J. Ziegler and S. Schwarzinger, J. Computer-Aided Mol. Design, 20, 47 (2006). Genetic Algorithms as a Tool for Helix Design – Computational and Experimental Studies on Prion Protein Helix 1.
3. G. A. Cox and R. L. Johnston, J. Chem. Phys., 124, 204714 (2006). Analyzing Energy Landscapes for Folding Model Proteins.
4. D. P. Djurdjevic and M. J. Biggs, J. Comput. Chem., 27, 1177 (2006). Ab initio Protein Fold Prediction Using Evolutionary Algorithms: Influence of Design and Control Parameters on Performance.
5. C. E. W. Gributs and D. H. Burns, Chemometrics and Intelligent Lab. Systems, 83, 44 (2006). Parsimonious Calibration Models for Near-Infrared Spectroscopy using Wavelets and Scaling Functions.
6. C. M. Sunderling, N. Sukumar, H. Zhang, M. J. Embrechts, and C. M. Breneman, in Reviews in Computational Chemistry, Vol. 22, K. B. Lipkowitz, T. R. Cundari and V. J. Gillet, Eds., Wiley-VCH, New York, 2006, pp. 295–329. Wavelets in Chemistry and Cheminformatics.
7. M. Dam and D. N. Saraf, Computers and Chemical Engineering, 30, 722 (2006). Design of Neural Networks using Genetic Algorithm for On-Line Property Estimation of Crude Fractionator Products.
8. K. L. Peterson, in Reviews in Computational Chemistry, Vol. 16, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley-VCH, New York, 2000, pp. 53–140. Artificial Neural Networks and Their Use in Chemistry.
9. J. W. H. Wong, C. Durante, and H. M. Cartwright, Anal. Chem., 77, 5655 (2005). Application of Fast Fourier Transform Cross-Correlation for the Alignment of Large Chromatographic and Spectral Data Sets.
10. D. J. Livingstone and D. W. Salt, in Reviews in Computational Chemistry, Vol. 21, K. B. Lipkowitz, R. Larter and T. R. Cundari, Eds., Wiley-VCH, New York, 2005, pp. 287–348. Variable Selection – Spoilt for Choice?
11. E. Pranckeviciene, R. Somorjai, R. Baumgartner, and M.-G. Jeon, Artificial Intelligence in Medicine, 35, 215 (2005). Identification of Signatures in Biomedical Spectra using Domain Knowledge.
12. C. J. Montgomery, C. Yang, A. R. Parkinson, and J.-Y. Chen, Combustion and Flame, 144, 37 (2006). Selecting the Optimum Quasi-Steady-State Species for Reduced Chemical Kinetic Mechanisms using a Genetic Algorithm.
13. L. Elliott, D. B. Ingham, A. G. Kyne, N. S. Mera, M. Pourkashanian, and S. Whittaker, Computers and Chemical Engineering, 30, 889 (2006). Reaction Mechanism Reduction and Optimization for Modeling Aviation Fuel Oxidation using Standard and Hybrid Genetic Algorithms.
14. C. Lautenberger, G. Rein, and C. Fernandez-Pello, Fire Safety Journal, 41, 204 (2006). The Application of a Genetic Algorithm to Estimate Material Properties for Fire Modeling from Bench-Scale Fire Test Data.
15. K. B. McGrattan, G. P. Forney, J. E. Floyd, S. Hostikka, and K. O. Prasad, Fire Dynamics Simulator (version 4) user’s guide. National Institute of Standards and Technology, NISTIR 6784, 2004.
16. A. Gondarenko, S. Preble, J. Robinson, L. Chen, H. Lipson, and M. Lipson, Phys. Rev. Lett., 96, 143904 (2006). Spontaneous Emergence of Periodic Patterns in a Biologically Inspired Simulation of Photonic Structures.
17. E. Dova, R. Peschar, M. Sakata, K. Kato, and H. Schenk, Chem. Eur. J., 12, 5043 (2006). High-Spin and Low-Spin-State Structures of [Fe(chloroethyltetrazole)6](ClO4)2 from Synchrotron Powder Diffraction Data.
18. A. R. Oganov and C. W. Glass, J. Chem. Phys., 124, 244704 (2006). Crystal Structure Prediction using Ab Initio Evolutionary Techniques: Principles and Applications.
19. N. L. Abraham and M. I. J. Probert, Phys. Rev. B, 73, 224104 (2006). A Periodic Genetic Algorithm with Real-Space Representation for Crystal Structure and Polymorph Prediction.
20. H. M. Cartwright, unpublished work.
21. P. J. Hsu and S. K. Lai, J. Chem. Phys., 124, 044711 (2006). Structures of Bimetallic Clusters.
22. F. Cleri and V. Rosato, Phys. Rev. B, 48, 22 (1993). Tight-Binding Potentials for Transition Metals and Alloys.
23. O. Ona, V. E. Bazterra, M. C. Caputo, J. C. Facelli, P. Fuentealba, and M. B. Ferraro, Phys. Rev. A, 73, 053203 (2006). Modified Genetic Algorithm to Model Cluster Structures in Medium-Sized Silicon Clusters Si18-Si60.
24. B. Ahlswede and K. Jug, J. Comput. Chem., 20, 563 (1999). Consistent Modifications of SINDO1. I. Approximations and Parameters.
25. L. R. Marim, M. R. Lemes, and A. Dal Pino Jr., Phys. Stat. Sol. (B) - Basic Solid State Physics, 243, 449 (2006). Investigation of Prolate and Near Spherical Geometries of Mid-Sized Silicon Clusters.
26. P. Juhas, D. M. Cherba, P. M. Duxbury, W. F. Punch, and S. J. L. Billinge, Nature, 440/30, 655 (2006). Ab Initio Determination of Solid-State Nanostructure.
27. K.-S. Sohn, D. H. Park, S. H. Cho, B. I. Kim, and S. I. Woo, J. Comb. Chem., 8, 44 (2006). Genetic Algorithm-Assisted Combinatorial Search for a new Green Phosphor for use in Tricolor White LEDs.
28. G. Grubert, S. Kolf, M. Baerns, I. Vauthey, D. Farrusseng, A. C. van Veen, C. Mirodatos, E. R. Stobbe, and P. D. Cobden, Applied Catalysis A: General, 306, 17 (2006). Discovery of New Catalytic Materials for the Water-Gas Shift Reaction by High Throughput Experimentation.
29. P. Watkins and G. Puxty, Talanta, 68, 1336 (2006). A Hybrid Genetic Algorithm for Estimating the Equilibrium Potential of an Ion-Selective Electrode.
30. E. C. Rivera, A. C. Costa, D. I. P. Atala, F. Maugeri, M. R. Wolf Maciel, and R. M. Filho, Process Biochem., 41, 1682 (2006). Evaluation of Optimization Techniques for Parameter Estimation: Application to Ethanol Fermentation Considering the Effect of Temperature.
31. See, for example, M. Arakawa, M. Hasegawa, and K. Funatsu, J. Chem. Inform. Comput. Sci., 43, 1390 (2003). Novel Alignment Method of Small Molecules Using the Hopfield Neural Network.
32. R. Sharda and D. Delen, Expert Systems with Applications, 30, 243 (2006). Predicting Box-Office Success of Motion Pictures with Neural Networks.
33. L. M. Fu and E. H. Shortliffe, IEEE Trans. On Neural Nets, 11, 647 (2000). The Application of Certainty Factors to Neural Computing for Rule Discovery.
34. M. H. Keshavarz and M. Jaafari, Propellants, Explosives, Pyrotech., 31, 216 (2006). Investigation of the Various Structure Parameters for Predicting Impact Sensitivity of Energetic Molecules via Artificial Neural Network.
35. R. C. Santana, P. T. Do, M. Santikunaporn, W. E. Alvarez, J. D. Taylor, E. L. Sughrue, and D. E. Resasco, Fuel, 85, 643 (2006). Evaluation of Different Reaction Strategies for the Improvement of Cetane Number in Diesel Fuels.
36. A. R. Katritzky, L. M. Pacureanu, D. A. Dobchev, D. C. Fara, P. R. Duchowicz, and M. Karelson, Bioorganic and Medicinal Chemistry, 14, 4987 (2006). QSAR Modeling of the Inhibition of Glycogen Synthase Kinase-3.
37. L. Molnar, G. M. Keseru, A. Papp, Z. Lorincz, G. Ambrus, and F. Darvas, Bioorganic and Medicinal Chemistry Letters, 16, 1037 (2005). A Neural Network Based Classification Scheme for Cytotoxicity Predictions: Validation on 30,000 Compounds.
38. K. S. Schmitz, An Introduction to Dynamic Light Scattering by Macromolecules. Academic Press, London, 1990.
39. V. V. Berdnik, K. Gilev, A. Shvalov, V. Maltsev, and V. A. Loiko, J. Quant. Spectrosc. & Radiative Transfer., 102, 62 (2006). Characterization of Spherical Particles using High-Order Neural Networks and Scanning Flow Cytometry.
40. A. V. Kalach, Russian Chem. Bull. Int. Edn., 55, 212 (2006). Using Artificial Neural Networks for Prediction of Organic Acid Partition Coefficients.
41. Y. Zhang, H. Li, A. Hou, and J. Havel, Chemometrics and Intell. Lab. Systems, 82, 165 (2006). Artificial Neural Networks Based on Principal Component Analysis Input Selection for Quantification in Overlapped Capillary Electrophoresis Peaks.
42. Y. Shan and A. Seidel-Morgenstern, J. Chromatogr. A, 1093, 47 (2005). Optimization of Gradient Elution Conditions in Multicomponent Preparative Liquid Chromatography.
43. L. Luo, X-Ray Spectrom., 35, 215 (2006). Chemometrics and its Applications to X-Ray Spectrometry.
44. P. Koehl, in Reviews in Computational Chemistry, Vol. 22, K. B. Lipkowitz, T. R. Cundari and V. J. Gillet, Eds., Wiley-VCH, New York, 2006, pp. 1–55. Protein Structure Classification.
45. J. Cheng, M. J. Sweredoski, and P. Baldi, Data Mining Knowledge Disc., 13, 1 (2006). DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility and Recursive Neural Networks.
46. A. Passerini, M. Punta, A. Ceroni, B. Rost, and P. Frasconi, PROTEINS: Structure, Function and Bioinform., 65, 305 (2006). Identifying Cysteines and Histidines in Transition-Metal-Binding Sites Using Support Vector Machines and Neural Networks.
47. Y. Wang, L.-Y. Wu, X.-S. Zhang, and L. Chen, TAMC LNCS, 3959, 505 (2006). Automatic Classification of Protein Structures Based on Convex Hull Representation by Integrated Neural Network.
48. O. Gualdron, E. Llobet, J. Brezmes, X. Vilanova, and X. Correig, Sensors and Actuators B, 114, 522 (2006). Coupling Fast Variable Selection Methods to Neural Network-Based Classifiers: Application to Multi-Sensor Systems.
49. C.-C. Huang and T.-T. Tang, J. Appl. Polymer Sci., 100, 2532 (2006). Optimizing Multiple Qualities in As-spun Polypropylene Yarn by Neural Networks and Genetic Algorithms.
50. H. M. Cartwright, L. Sztandera, and C.-C. Chu, NTC Ann. Rep., Sept. 2005. Genetic Algorithms in Molecular Design of Novel Fibers.
51. P. P. Palmes and S. Usui, BioSystems, 82, 168 (2005). Robustness, Evolvability and Optimality of Evolutionary Neural Networks.
52. S. E. Fahlman and C. Lebiere, in Advances in Neural Information Processing Systems, D. S. Touretzky, Ed., Morgan Kaufmann, San Mateo, California, 1990, pp. 524–532. The Cascade-Correlation Learning Architecture.
53. See, for example, R. Linker, J. Chemometrics, 19, 492 (2005). Spectrum Analysis by Recursively Pruned Extended Auto-Associative Neural Network.
54. C. G. Zampronio, J. J. R. Rohwedder, and R. J. Poppi, Chemometrics and Intell. Lab. Syst., 62, 17 (2002). Artificial Neural Networks Applied to Potentiometric Acid-Base Flow Injection Titrations.
55. X.-H. Song, J. Xu, and R.-Q. Yu, Mikrochim. Acta, 111, 199 (1993). Artificial Neural Networks Applied to Potentiometric Titration of Multi-Component Polybasic Acid Mixtures.
56. T. Kohonen, Biol. Cybern., 43, 59 (1982). Self-Organized Formation of Topologically Correct Feature Maps.
57. A. J. Walker, S. S. Cross, and R. F. Harrison, Lancet, 354, 1518 (1999). Visualisation of Biomedical Datasets by Use of Growing Cell Structure Networks: A Novel Diagnostic Classification Technique.
58. X.-H. Song and P. K. Hopke, Analytica Chimica Acta, 334, 57 (1996). Kohonen Neural Network as a Pattern Recognition Method Based on the Weight Interpretation.
59. P. Ertl, S. Jelfs, J. Muhlbacher, A. Schuffenhauer, and P. Selzer, J. Med. Chem., 49, 4568 (2006). Quest for the Rings. In Silico Exploration of Ring Universe to Identify Novel Bioactive Heteroatomic Scaffolds.
60. M. Zuppa, C. Distante, P. Siciliano, and K. C. Persaud, Sensors and Actuators B, 98, 305 (2004). Drift Counteraction with Multiple Self-Organising Maps for an Electronic Nose.
61. J. W. H. Wong and H. M. Cartwright, J. Biomed. Inform., 38, 322 (2005). Deterministic Projection of Growing Cell Structure Networks for Visualization of High-Dimensionality Datasets.
62. L. A. Dobrzanski and J. Madejski, J. Materials Processing Technol., 175, 163 (2006). Prototype of an Expert System for Selection of Coatings for Metals.
63. B. Hinkemeyer, N. Januszewski, and B. A. Julstrom, Expert Syst. with Applicat., 30, 282 (2006). An Expert System for Evaluating Siberian Huskies.
Author Index Aarts, D., 156 Abbott, M., 158 Abe, M., 281 Abraham, F. F., 123 Abraham, N. L., 387 Abramowitz, M., 338 Abramson, E. H., 185 Adam, G., 60, 156 Adam, N. E., 59 Adcock, J. D., 345 Adelman, S. A., 123 Aguilera-Granja, F., 243, 245, 247, 248 Ahlrichs, R., 280 Ahlswede, B., 387 Ahumada, O., 64 Aichele, M., 63, 65, 124 Alacid, M., 337, 340 Alba-Simionesco, C., 62 Alder, B. J., 244 Allen, M. P., 58, 153 Allen, P. G., 283 Allinger, N. L., 58, 122 Alonso, J. A., 245 Althorpe, S. C., 346 Alvarez, F., 64, 66 Alvarez, W. E., 388 Amara, P., 283 Ambrus, G., 388 Andersen, H. C., 60, 63, 154 Andersen, K. H., 62 Andersson, K., 280, 282 Andrews, L., 280, 282, 283, 284 Andriotis, A. N., 246
Angell, C. A., 57, 153, 155, 156, 157 Anisimov, V. I., 246 Ansoborlo, E., 283 Antikainen, J., 344 Antoniadis, S. J., 65 Antropov, V. P., 244 Aoki, K., 188 Aouadi, A., 64 Aoyagi, M., 336 Apsel, S. E., 243 Arakawa, M., 388 Arbe, A., 57, 64, 66 Areshkin, D. A., 123 Armand, M., 66 Arnold, A., 58 Arnold, M., 345 Arnoldi, W. E., 342 Arrighi, V., 64 Artacho, E., 247 Ashkenazi, G., 342 Ashurst, W. T., 156 Åstrand, P.-O., 283 Atala, D. I. P., 388 Auerbach, S. M., 342 Austin, E. J., 338 Autera, J. R., 189 Axenrod, T., 189 Bach, P., 155 Bacic, Z., 332 Back, A., 344 Baer, M., 338 Baer, M. R., 184, 185, 186
Reviews in Computational Chemistry, Volume 25 edited by Kenny B. Lipkowitz and Thomas R. Cundari Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.
Baerns, M., 387 Baghi, K., 120 Bain, A., 343 Balabaev, N. K., 61, 63 Balasubramanian, S., 120 Balbás, L. C., 243 Baldi, P., 388 Balint-Kurti, G. G., 336, 338, 346 Bancroft, G. M., 124 Bannister, E., 284 Baram, A., 158 Baranyai, A., 154 Barnett, R. N., 248 Barrat, J.-L., 122, 153 Barret, R., 334 Baruath, T., 247 Baschnagel, J., 59, 60, 61, 62, 63, 65 Bashford, D., 57 Bastea, S., 186 Baumann, C. A., 246 Baumgartner, R., 387 Bauschlicher, Jr., C. W., 282 Bayly, C. I., 57 Bazhanov, V. I., 284 Bazterra, V. E., 387 Beck, M. H., 339 Becke, A. D., 188, 244 Bedrov, D., 62, 64, 65, 66, 184 Behrens, R., 187, 189 Belkic, D., 340 Bellott, R. L., 57 Ben-Amotz, D., 156 Bennemann, K. H., 61, 62, 63, 65, 243 Bennewitz, R., 122, 124 Benoit, M., 188 Benson, D. J., 184 Bentley, J. A., 337, 341 Berdnik, V. V., 388 Berendsen, H. J. C., 60, 155 Bergroth, M. N. J., 156 Berman, M., 343 Bernasconi, M., 123, 187 Berne, B. J., 60 Bernel, S., 246 Bernstein, J. M., 154 Bernstein, N., 121, 123 Berry, M., 334 Berthier, L., 65 Bertsch, G. F., 244 Beutler, T. C., 60 Beyer, R. A., 189 Bhat, S., 246
Bhushan, B., 120 Bian, W., 346 Bickham, S. R., 189 Biggs, M. J., 386 Billard, I., 283 Billas, I. M. L., 243 Billing, G. D., 342 Billinge, S. J. L., 387 Binder, K., 57, 58, 60, 61, 62, 63 Bird, R. B., 153 Biroli, G., 65, 153 Bishop, A. R., 123 Bittererova, M., 344 Blais, N. C., 185 Blomberg, M. R. A., 280 Bloomfield, L. A., 243, 246, 247 Bludsky, O., 344 Blügel, S., 246 Blum, K., 343 Bobadova-Parvanova, P., 247 Bocquet, L., 122 Boehler, R., 187 Boehme, C., 283 Böhm, H. J., 280 Bolmont, D., 247 Boone, T. D., 59 Booth, C., 57 Borin, A. C., 282 Borodin, O., 58, 64, 66 Bouarab, S., 245 Bouchaud, J. P., 65, 153 Bowden, F. P., 120 Bowen, J. P., 58, 122, 333 Bowles, R. K., 156 Bowman, J. M., 343, 346 Boyd, D. B., 58, 120, 122, 123, 333, 386 Boyd, J. P., 338 Boyd, R. H., 60, 64 Boyd, S., 189 Boyd, S. U., 64 Boyle, J. M., 340 Braly, L. B., 345 Bramley, M. J., 333, 334 Braun, O. M., 13 Breneman, C. M., 386 Brennan, J. K., 186 Brenner, D. W., 120, 121, 123, 189 Brenner, V., 283 Brezmes, J., 388 Briere, T. M., 247 Briggs, J. S., 340 Brill, T. B., 188, 189
Brostow, W. J., 156 Broughton, J. Q., 123 Brown, F. B., 280 Brumer, Y., 157 Brunet, J.-P., 337 Brüning, R., 61 Brynda, M., 282, 284 Bucher, J. J., 283 Bucher, J. P., 246 Buchholz, J., 61 Buck, H. M., 335 Bulatov, V. V., 123 Buldyrev, S. V., 155 Bulusu, S., 187, 189 Bunker, P. R., 341 Burke, K., 244 Burns, D. H., 386 Bursten, B. E., 281, 282, 283, 284 Bush, A. W., 121 Byers Brown, W., 186 Byrd, R. H., 157 Cagin, T., 124 Cai, W., 123 Cailliaux, A., 62 Calais, J.-L., 280 Caldwell, J. W., 57 Callaway, J., 243 Campana, C., 123 Capaccioli, S., 157 Capelle, K., 245 Caputo, M. C., 387 Car, R., 185, 244 Carbone, C., 247 Cargo, M., 347 Carlson, R. O., 246 Carmesin, I., 59 Carnahan, N. F., 156 Carney, G. D., 332 Carra, P., 246 Carrington Jr., T., 333, 334, 335, 336, 341, 342, 343, 344, 347 Carter, S., 332 Cartwright, H. M., 386, 387, 389 Casalini, R., 157 Case, D. A., 57 Casey, S. M., 282 Castleman, A. W., 249 Cates, M. E., 156 Cavazzoni, C., 123, 187 Cederbaum, L. S., 334 Ceperley, D. M., 244
Ceriotti, A., 248 Cerjan, C., 333 Ceroni, A., 388 Certain, P. R., 342 Chaikin, P. M., 153 Chakraborty, D., 186 Chakravarty, C., 158 Chambers, C., 187 Chan, T. F., 334 Chandler, C., 338 Chandler, D., 154, 188 Chandross, M., 124 Chang, A., 280 Chapman, S., 153 Chapuisat, X., 344, 347 Charbonneau, P., 153 Charlet, F., 185 Charron, G., 334 Chateauneuf, G. M., 123 Chatelain, A., 243 Chau, P. L., 155 Chau, R., 188 Cheeseman, P. A., 155 Chen, A., 64 Chen, B., 248 Chen, J., 340 Chen, J.-Y., 387 Chen, L., 387, 388 Chen, R., 334, 335, 337, 338, 339, 340, 341, 342, 343, 344, 347 Chen, X., 243 Cheng, H., 246 Cheng, J., 388 Cheng, Y.-T., 124 Cherba, D. M., 387 Chiarotti, G. L., 123, 187 Chidester, S. K., 189 Chikenji, G., 60 Child, M. S., 341 Cho, S. H., 387 Choi, S. E., 333 Chong, S. H., 61, 62, 63 Chou, M. Y., 245 Chouairi, A., 248 Christiansen, O., 281 Chu, C.-C., 389 Chuamyun, X., 248 Chudinovskikh, L., 187 Chung, S. C., 245 Ciccotti, G., 60, 122 Cieplak, M., 124 Cieplak, P., 57
Cipelletti, L., 65 Clary, D. C., 344 Clavaguéra-Sarrio, C., 283 Clemenger, K., 245 Cleri, F., 387 Cleveland, T., 58, 123 Cobden, P. D., 387 Cohen, M. H., 155 Cohen, M. L., 124, 245 Collins, L. A., 189 Colmenero, J., 57, 64, 66 Colton, R. J., 121 Coluzzi, B., 157 Comeau, D. C., 280 Coniglio, A., 157 Conrad, J. C., 155 Corey, G. C., 333, 343 Cornell, W. D., 57 Correig, X., 388 Corti, D. S., 155, 156 Cossi, M., 280 Costa, A. C., 388 Cotton, F. A., 281, 282, 284 Cowling, T. G., 153 Cowman, C. D., 281 Cowperthwaite, M., 185, 186 Cox, A. J., 243, 247 Cox, G. A., 386 Cramer, C., 123 Cramer, C. J., 282 Cross, S. S., 389 Crowhurst, J. C., 185, 187 Csaszar, A. G., 333 Cullum, J. K., 333, 334, 342 Cumings, J., 124 Cummings, P. T., 122 Cundari, T. R., 59, 61, 333, 386, 387, 388 Curtin, W. A., 123, 124 Curtiss, C. F., 153 Curtiss, L., 58 Cushman, J. H., 121 Czako, G., 333 Dachsel, H., 280 Dacosta, P. G., 123 Dai, Y., 124 Dal Pino, Jr., A., 387 Dallos, M., 280 Dallwig, S., 335 Dam, M., 386 Dando, P. A., 340 Danel, J. F., 185
Darakjian, Z., 335 Darvas, F., 388 Dasgupta, S., 186 Davenport, J. W., 246 Davidson, D. F., 336 Davidson, E. R., 333, 336 Davis, B., 246 Davis, W. C., 185 Dawes, R., 347 De Gennes, P. G., 57 de Groot, J., 64 de Heer, W. D., 243, 245 de Jongh, L. J., 248 de Koning, M., 123 de Leeuw, S. W., 63 De Michele, C., 157 de Pablo, J. J., 66, 157 Debenedetti, P. G., 153, 154, 155, 156, 157, 158 Dederichs, P. H., 246, 248 Delen, D., 388 Dellago, C., 188 Demangeat, C., 247 Demmel, J., 334 Demontis, P., 187 Deng, J., 243, 246 Denniston, C., 122 Desjonquéres, M. C., 247, 248 Desmarais, N., 245 Deutsch, H.-P., 59 Di Marzio, E. A., 61 Dieckman, T., 340 Diestler, D. J., 121 Dieterich, J. H., 121 Dill, K. A., 155 Dinur, U., 58, 122 Distante, C., 389 Djurdjevic, D. P., 386 Do, I. P. H., 184 Do, P. T., 388 Do, T., 124 Dobchev, D. A., 388 Dobrzanski, L. A., 389 Dognon, J.-P., 283 Doi, M., 57 Doll, J. D., 123 Domcke, W., 334 Donati, C., 65 Donato, J., 334 Dong, J., 245 Dongarra, J. J., 334, 340 Donnet, C., 124
Donth, E., 65 Doolittle, A. K., 155 Dorantes-Dávila, J., 243, 246, 247 Doring, W., 187 Dosseh, G., 62 Douglas, N., 281 Douglas, J. F., 65, 156 Douglas, R., 64 Douglass, D. C., 246 Dova, E., 387 Dowson, D., 120, 122 Doxastakis, M., 59, 60, 64 Dreyfus, C., 64 Dreyssé, H., 247, 248 Drittler, B., 246 Dubé, M., 121 Ducastelle, F., 245 Duchowicz, P. R., 388 Dudko, O. K., 121 Dullens, R. P. A., 156 Dumont, R. S., 343 Dunbrack Jr., R. L., 57 Dunlap, B., 247 Dünweg, D., 60, 61, 122 Durante, C., 386 Duxbury, P. M., 387 Dyke, J. M., 282 Dymond, J. H., 155 Dzugutov, M., 158 Eberhardt, W., 247 Edelstein, N. M., 283 Ederer, C., 246 Ediger, M. D., 57, 64, 65, 66, 153 Edlund, A., 336 Edwards, S. F., 57, 122 Ehrenreich, H., 244 Ehrhardt, C., 280 Eijkhout, V., 334 Ekardt, W., 245 El Masri, D., 65 Elert, M. L., 189 Elliott, L., 387 Elrod, M. J., 345 Elsner, J., 188 Elstner, M., 185, 188 Embrechts, M. J., 386 Emelyanov, A. M., 284 Emmert, J. W., 243, 246 Engelke, R., 185 Engkvist, O., 283 Ericsson, T., 335
Ernzerhof, M., 244, 280 Errington, J. R., 154, 158 Ertl, P., 389 Español, P., 122 Evans, D. J., 122, 154 Evans, R., 189 Evanseck, J. D., 57 Evstigneev, M., 122 Ewig, C. S., 58 Ezhov, Y. S., 284 Facelli, J. C., 387 Fackler, J. P., 282 Fahlman, S. E., 389 Fähnle, M., 244, 246 Fahrer, N., 335 Faller, R., 59, 64 Fara, D. C., 388 Farago, B., 57, 62, 64, 65, 66 Farantos, S. C., 343, 344, 346 Farrusseng, D., 387 Fauquignon, C., 185 Fei, Y. W., 188 Feigon, J., 340 Feit, M. D., 339 Felker, P. M., 345 Feller, D., 333 Feller, R. S., 345 Feng, H., 65 Feng, Y. P., 247 Ferguson, D. M., 57 Fernandez-Pello, C., 387 Fernandez-Perea, R., 66 Ferrante, F., 281 Ferraro, M. B., 387 Ferry, J. D., 57 Fetter, A. L., 243 Fettinger, J. C., 282 Fickett, W., 185 Field, M., 283 Field, M. J., 57 Field, R. W., 345 Fifer, R. A., 189 Filho, R. M., 388 Filippov, A. E., 121 Finch, E. D., 155 Finger, M., 185, 187 Fink, M., 282 Finley, J., 281 Finnis, M. W., 245 Fiolhais, C., 244 Fischer, E. W., 64
Fischer, S., 57 Flannery, B. P., 333 Fleck, J. A., 339 Fleurat-Lessard, P., 344 Flory, P. J., 61 Floudas, G., 57 Floyd, J. E., 387 Focher, P., 123 Forney, G. P., 387 Forsberg, N., 281 Fox, T., 57 Frank, M. R., 188 Franosch, T., 61 Frasconi, P., 388 Fraser, B., 122 Frauenheim, T., 185, 188 Freed, J. H., 342 Frenkel, D., 61, 153, 154 Freund, R. W., 334, 341 Frick, B., 62, 65, 66 Fried, L. E., 184, 185, 186, 187, 188, 189 Friedel, J., 243 Friesner, R. A., 334, 335, 337, 343, 344 Fu, L. M., 388 Fuchs, M., 61, 62, 63, 156 Fuentealbe, P., 387 Fuger, J., 283 Fujihisa, H., 188 Fujima, N., 245, 246 Fujita, Y., 245 Fülscher, M. P., 280 Funatsu, K., 388 Furtenbacher, T., 333 Furuya, H., 63 Fytas, G., 64 Gagliardi, L., 279, 280, 281, 282, 283, 284 Gale, J. D., 247 Gallego, L. J., 247 Galli, G., 188 Ganesan, V., 156 Gao, G.-T., 121, 123 Gao, H., 124 Gao, J., 57 Garbow, B. S., 340 García, A., 247 Gatti, F., 344, 345, 347 Gauss, J., 281 Gavriliuk, A., 187 Gayalog, T., 122 Gazdy, B., 333 Gdanitz, R. H., 280
Gebremichael, Y., 65, 156 Gee, R. H., 60, 65 Geiger, A., 156 Geissler, P. L., 188 Gellman, A. J., 124 Germann, T. C., 189 Gerroff, I., 59 Geusic, M. E., 248 Gewinner, G., 247 Geyler, S., 59 Ghigo, G., 280 Giannousaki, A. E., 59 Gibbs, J. H., 60, 61, 156 Gibson, R. D., 121 Gilev, K., 388 Gillet, V. J., 61, 386, 388 Giovambattista, N., 155 Gisser, D. J., 64 Glaesemann, K., 186, 187 Glass, C. W., 387 Gleim, T., 63 Glosli, J. N., 121 Glotzer, S. C., 58, 65, 153, 156 Glowinkowski, S., 64 Gluck, M., 340 Gnecco, E., 122, 124 Goddard III, W. A., 124, 186, 188 Goedecker, S., 189 Goldfield, E. M., 343, 346 Goldman, N., 187 Goldstein, M., 153 Golub, G. H., 333, 335, 337 Goncharenko, I., 62 Goncharov, A. F., 187 Goncharov, V., 283 Gondarenko, A., 387 Gongwer, P., 189 Gordon, M., 58 Gorokhov, L. N., 284 Goscinski, O., 338 Götze, W., 61, 62, 153 Gould, I. R., 57 Grant, M., 121 Gray, H. B., 281 Gray, S. K., 335, 336, 338, 343, 346 Graybush, R. J., 189 Green, H. S., 154 Gregoryanz, E., 187, 188 Grenthe, I., 283 Grest, G. S., 59, 124 Gributs, C. E. W., 386 Grigera, J. R., 155
Grigera, T. S., 157 Groenenboom, G. C., 335 Gropen, O., 281 Gross, E. K. U., 245 Grossmann, F., 340 Grover, R., 156 Grozdanov, T. P., 339, 343 Grubert, G., 387 Gualdron, O., 388 Gubanov, V. A., 244 Gubbins, K. E., 186, 187 Gubin, S. A., 185 Gudat, W., 247 Guevara, J., 244, 247 Guiang, C. S., 343 Guidry, M., 185 Guillot, B., 187 Guirado-López, R., 248, 249 Guo, H., 57, 334, 335, 337, 338, 339, 340, 341, 342, 343, 344, 346, 347 Guo, W., 124 Guo, Y., 124 Gutknecht, M. H., 341 Gutzow, I., 57 Gygi, F., 188 Ha, S., 57 Häberlen, O. D., 248 Hadjichristidis, N., 64 Hafner, J., 244 Hagberg, D., 284 Hagler, A. T., 58, 122 Häkkinen, H., 248 Halonen, L., 341 Hamilton, D. C., 188 Hammerberg, J. E., 123 Han, J., 60, 65, 283 Handy, N. C., 283, 332 Hanf, M. C., 247 Hansen, D. W., 185 Hansen, J.-P., 61, 62, 153 Hardwick, A. J., 155 Harmon, B. N., 244 Harris, C. B., 281 Harrison, J. A., 120, 121, 123 Harrison, R. F., 389 Harrison, W. A., 243 Hartke, B., 339 Hasegawa, M., 388 Hashimoto, N., 340 Hauk, M., 188 Havel, J., 388
Haydock, R., 244 Hayes, B., 185 Hayes, E. F., 335 Hazendonk, P., 343 Höchtl, P., 280 He, G., 121 Heaven, M. C., 282, 283, 284 Hedin, L., 244 Heermann, W. W., 59 Hehre, W. J., 332 Heine, V., 243 Helm, F. H., 185 Hemley, R. J., 188 Hemmingsen, L., 283 Hendricks, J., 64 Hennig, C., 283 Henry, B. R., 334 Hergert, W., 246, 248 Hess, B. A., 281 Hestenes, M. R., 333 Heuer, A., 65, 158 Higginson, G. R., 122 Hildebrand, J. H., 155 Hillard, E. A., 284 Hilpert, K., 282 Hinkemeyer, B., 389 Hirano, M., 120 Hirao, K., 281, 282 Hirschfelder, J. O., 153 Ho, J., 245 Hobbs, M. L., 185, 186 Hobson, E. W., 154 Höck, K. H., 244 Hoffman, D. K., 338, 339, 345, 346 Hohenberg, P., 244 Holian, B. L., 13, 189 Holm, C., 58 Holmes, N. C., 188 Holmgren, S., 336 Hood, R. Q., 188 Hoover, W. G., 60, 156 Hopke, P. K., 389 Horbach, J., 62, 66 Hornig, H., 185 Hornig, H. C., 186 Horoi, M., 247 Hostikka, S., 387 Hou, A., 388 Houston, J. E., 124 Howard, W. M., 185, 186, 189 Howells, W. S., 66 Hoyau, S., 283
Hoye, J. S., 186 Hsu, P. J., 387 Hu, H., 340 Hu, J. Z., 188 Hu, X.-G., 342 Huang, C.-C., 389 Huang, S.-W., 335, 336, 347 Huang, Y., 338, 339, 345 Hubbard, W. B., 187 Huisinga, W., 343 Hull, S., 187 Hummler, K., 244 Hutter, J., 188 Hwang, M. J., 58 Hyun, S., 121 Iba, Y., 60 Ichihara, Y., 245 Ikeda, S., 63 Infante, I., 283 Ingham, D. B., 387 Ingram, M. D., 57 Iñiguez, M. P., 245 Inoue, K., 63 Ioannou, I., 283 Irle, S., 280 Ismail, N., 280 Iung, C., 335, 336, 337, 344, 345, 347 Iyengar, S. S., 339 Jaafari, M., 388 Jäckle, J., 57 Jackson, K. A., 247 Jaffe, R. L., 64, 65 Jain, S., 343 Jain, T. S., 66, 157 Jalaie, M., 58 Jamorski, C., 245 Jang, H. W., 346 Januszewski, N., 389 Jaswal, S. S., 244 Jeffrey, S. J., 342, 344 Jelfs, S., 389 Jena, P., 245, 246, 247, 248 Jensen, H. J. A., 280 Jensen, P., 243 Jeon, M.-G., 387 Jin, J., 65, 283 Jin, Y., 60, 64 Jinlong, Y., 247, 248 Joannopoulos, J. D., 189 Johnston, R. L., 386
Jolicard, G., 338 Jones, H. D., 186 Jones, N. O., 247 Jongma, R., 345 Jørgensen, P., 280 Jorgensen, W. L., 57 Joseph-McCarthy, D., 57 Joubert, L., 284 Judd, B. R., 246 Judson, R., 386 Jug, K., 387 Juhas, P., 387 Julstrom, B. A., 389 Jungnickel, G., 188 Junquera, J., 247 Justum, Y., 347 Kahara, M., 185 Kaiming, D., 248 Kaji, K., 63 Kalach, A. V., 388 Kaledin, L. A., 283 Kanaya, T., 63 Kanno, H., 155 Kansal, A. R., 155 Karatasos, C., 64 Karayiannis, N. C., 59 Karelson, M., 388 Karlsson, H. O., 336, 337, 338 Karlström, G., 283, 284 Karlström, K., 280 Karplus, M., 58, 154 Kasrai, M., 124 Katakuse, I., 245 Kato, K., 387 Katoh, E., 188 Katritzky, A. R., 388 Katsnelson, M. I., 244 Kauzmann, W., 153 Kawazoe, Y., 246, 247 Kaxiras, E., 123 Kazandjian, L., 185 Kedziora, G., 280 Kegel, W. K., 156 Kelin, W., 247, 248 Kellman, M. E., 344 Kern, C. W., 332 Kerns, K. P., 248 Keseru, G. M., 388 Keshavarz, M. H., 388 Ketkar, S. N., 282 Khanna, S. N., 245, 246, 247, 248
Khodeev, Y. S., 284 Kikuchi, M., 60 Kilgore, B. D., 121 Kim, B. I., 387 Kim, E. G., 65 Kim, K. I., 124 Kim, W., 345 Kinal, A., 282 Kirkpatrick, T. R., 154 Kistiakowsky, G. B., 185 Kittel, C., 153 Kivshar, Y. S., 123 Kiyanagi, Y., 63 Kjaergaard, H. G., 334 Klafter, J., 121 Klein, M. L., 120, 187 Kloeden, P. E., 60 Knickelbein, M. B., 243, 247, 248 Knight, W. D., 245 Ko, J. S., 124 Kob, W., 57, 61, 62, 63, 65, 66, 154, 157 Kober, E. M., 188 Koehl, P., 388 Koeppel, H., 334 Kofke, D. A., 154 Kohl, C., 244 Kohn, W., 244 Kohonen, T., 389 Kolf, S., 387 Kollman, P. A., 57 Komarov, S. A., 284 Komelj, M., 246 Komissarov, A. V., 283 Konings, R., 283 Kono, G., 335 Kooh, A. B., 189 Kopf, A., 60 Korambath, P. P., 335 Korsch, H. J., 340, 342 Kosloff, R., 333, 336, 338, 339, 342, 343 Koster, G. F., 243 Kotelyanski, M., 58 Kouri, D. J., 338, 339, 345, 346 Kovacs, A. J., 57 Kovar, T., 280 Kramer, G. J., 158 Krauth, W., 157 Krekelberg, W. P., 156 Krembel, C., 247 Kremer, F., 64 Kremer, K., 58, 59, 62, 122 Kress, J. D., 189
Kroes, G.-J., 346, 347 Krogh, J. W., 282 Kroll, N. M., 281 Krüger, S., 245 Krushev, S., 62, 64 Kübler, J., 244 Kubo, R., 122 Kuchnir, L., 57 Kuczera, K., 57 Kudva, G., 189 Kumar, V., 247 Kunc, K., 123 Kuo, I. F. W., 187, 188 Kuo, K. K., 189 Kurly, J. E., 187 Kurth, S., 244 Kury, J. W., 185 Kushto, G. P., 284 Kussmann, J., 333 Kutnetsov, N. T., 284 Kutteh, M., 58 Kyne, A. G., 387 L’Hote, D., 65 La Macchia, G., 284 La Manna, G., 283 La Nave, E., 156 Ladieu, F., 65 Lagana, A., 346 Lai, S. K., 387 Lambrecht, D. S., 333 Lanczos, C., 333, 338 Landau, D. P., 59, 157 Landers, A. G., 188 Landis, C. R., 58, 123 Landman, U., 248 Lang, E. W., 155 Larter, R., 387 Lau, F. T. K., 57 Lautenberger, C., 387 Lawley, K. P., 280 Lawton, D., 284 Lazaridis, T., 154 Le Mogne, Th., 124 Leak, S. J., 59 Lebeault-Dorget, M.-A., 282 Lebiere, C., 389 Lebon, M. J., 64 Lee, C. T., 188 Lee, E., 185 Lee, E. L., 187 Lee, G. S., 185
Lee, H.-S., 343, 344 Lee, K., 248 Lee, T. K., 245 Lee, Y. J., 189 Lees, A. W., 122 Leforestier, C., 335, 336, 337, 340, 341, 342, 344, 345, 347 Lehmann, K. K., 345 Lehoucq, R. B., 335 Leland, T. W., 186 Lemes, M. R., 387 Lemire, R., 283 Lemoine, D., 333 Lendvay, G., 346 Leopold, D. G., 282 LeQuere, F., 335 LeSar, R., 187 Levesque, D., 61 Lewis, J., 187, 189 Li, G., 337, 341, 342 Li, H., 337, 346, 388 Li, J., 282 Li, S., 124, 346 Li, Z. Q., 247 Li, Z.-S., 60 Liang, B., 283 Liebs, M., 244 Liechtenstein, A. I., 244, 246 Light, J. C., 332, 333, 335, 341, 343, 344, 346 Lin, J. F., 187, 188 Lin, S. Y., 343, 344, 346 Lindh, R., 280, 281 Linker, R., 389 Linse, P., 280 Lipkowitz, K. B., 58, 59, 61, 120, 122, 123, 333, 386, 387, 388 Lipson, H., 387 Lipson, M., 387 Lisal, M., 186 Lischka, H., 280 Litovitz, T. A., 155 Littlejohn, R. G., 347 Litzinger, T. A., 189 Liu, F., 243 Liu, H., 155 Liu, K., 345 Liu, L., 341 Livingstone, D. J., 387 Llobet, E., 388 Llois, A. M., 244, 247 Lodge, T. P., 57 Loiko, V. A., 388
Lombardi, J. R., 246 Lomdahl, P. S., 189 Long, G. J., 282 Longo, R. C., 247 Loose, W., 122 López, M. J., 245 Lorenz, C. D., 124 Lorenz, R., 244 Lorincz, Z., 388 Louie, S. G., 124 Lourderback, J. G., 243, 247 Löwdin, P. O., 243, 279 Lu, P. H., 157 Luan, B. Q., 121, 122, 124 Lubchenko, V., 157 Lubensky, T. C., 153 Lucchesi, M., 157 Ludemann, H. D., 155 Ludwig, G. W., 246 Lue, C. J., 283 Lundqvist, S., 244 Luo, L., 388 Lützenkirchen, K., 283 Lynden-Bell, R. M., 155 Lyulin, A. V., 61, 63, 64 Ma, G., 341, 344 MacDonald, J. K. L., 333 Macedo, E. A., 155 Macedo, P. B., 155 MacKerell Jr., A. D., 57 Madejski, J., 389 Mader, C. L., 185 Magill, J. H., 156 Maillet, J. B., 189 Main, J., 340 Maldivi, P., 284 Malmqvist, P.-Å., 280, 281, 282, 284 Maltsev, V., 388 Manaa, M. R., 184, 185, 188 Mandelshtam, V. A., 337, 338, 339, 340, 342, 343, 344, 346, 347 Manhong, Z., 247 Manthe, U., 346 Mao, G. M., 66 Mao, M. K., 187 Maple, J. R., 58 Marazri, N., 188 March, N. H., 244 Mareschal, M., 189 Marian, C., 281 Marijnissen, A., 282
Marim, L. R., 387 Marques, M., 244 Marsden, C., 280 Marsden, C. J., 283 Marsh, S. P., 187 Martin, J. M., 124 Martin, W., 284 Martins, J., 188 Martyna, G. J., 60, 120 Marx, D., 188 Mason, R., 284 Matsuda, H., 245 Matsui, Y., 155 Matsuo, T., 245 Mattice, W. L., 65 Mattos, C., 58 Maugeri, F., 388 Mavrantzas, G., 59, 60 Maxwell, D. S., 57 Maynard, A. T., 345 Mayr, M. R., 61 Mc Kenna, G. B., 57 McClelland, G., 121 McCormack, D. A., 347 McDonald, I. R., 62, 153 McGee, B. C., 186 McGrattan, K. B., 387 McGuire, R., 185 McNichols, A., 334 McQuarrie, D. A., 153 Meakin, P., 121 Medvedev, N. N., 155, 156 Meijer, A. J. H. M., 346 Meirovitch, H., 61 Melius, C. F., 185, 189 Menikoff, R., 184 Menon, M., 246 Menou, M., 337, 341, 347 Mera, N. S., 387 Merchán, M., 280, 282 Merz, K. M., 57 Messiah, A., 332 Meyer, E., 122, 124 Meyer, H., 59 Meyer, H.-D., 339 Michels, M. A. J., 61, 63, 64 Michnick, S., 58 Mikulski, P. T., 121, 123 Milchev, A., 59, 61 Milfeld, K. F., 334 Militzer, B., 188 Miller, R. E., 123
Miller, W. H., 336, 338, 341, 345, 346 Millié, P., 283 Milne, G. W. A., 189 Minehardt, J. T., 345 Minich, R. W., 188 Mirodatos, C., 387 Mishima, O., 155 Mitchell, A. C., 188 Mitchell, K. A., 347 Mittal, J., 158 Mladenovic, M., 347 Mohanty, U., 157 Moiseyev, N., 334, 335, 340, 342, 343 Mokrani, A., 247 Moler, C. B., 340 Molina, V., 280 Molinari, J.-F., 121 Moll, H., 283 Molnar, L., 388 Monkenbusch, M., 57, 64, 66 Monson, P. A., 154 Montejano-Carrizales, J. M., 243, 245, 247 Montgomery, C. J., 387 Morán-López, J. L., 243 Morgan, C. V., 189 Morgan, R. B., 336 Moriyama, H., 247 Moro, G., 342 Morriss, G. P., 122 Morse, M. D., 248 Moseler, M., 248 Mosey, N. J., 121, 124 Mountain, R. D., 154 Mryasov, O. N., 244 Muckerman, J. T., 341, 344 Muhlbacher, J., 389 Muller, A., 283 Müller, M., 60 Muller, R. P., 186 Müller, T., 280 Müller-Plathe, F., 59, 64 Mullin, A. S., 340 Mundy, C. J., 120, 187, 188 Muñoz, C., 347 Murillo, C. A., 284 Murray, C., 336 Müser, M. H., 120, 121, 122, 123, 124 Naberukhin, Y. I., 155 Nachtigal, N. M., 341, 334 Nachtigall, P., 344 Nagel, S. R., 153
Nait-Laziz, H., 248 Nakajima, T., 281 Nakano, H., 247 Narang, H., 186 Nardi, E., 158 Narevicius, E., 342 Narros, A., 64 Nauts, A., 334, 347 Nayak, S., 147 Nayak, S. K., 245, 246, 247, 248 Nealey, P. F., 66 Neimark, A. V., 154 Neitola, R., 123 Nellis, W. J., 188 Nelson, D. R., 154 Neogrady, P., 280 Nesbet, R. K., 246 Nesbitt, D. J., 345 Ness, H. C. V., 158 Nettleton, R. E., 154 Neuefeind, J., 283 Neuhauser, D., 338, 339, 340, 342, 345, 346, 347 Neumaier, A., 342 Neurock, M., 284 Ngai, K. L., 57 Ngo, T., 58 Nguyen, D. T., 58 Nguyen, T., 282 Nguyen-Trung, C., 283 Nicholls, M. A., 124 Nichols III, A. L., 189 Nicol, M., 188 Nielsen, O. H., 123 Nieman, G. C., 246 Nitsche, H., 283 Nocedal, J., 157 Nogueira, F., 244 Nooijen, M., 247 Norton, P. R., 124 Nosé, S., 60, 188 Noya, E. G., 247 Nusair, M., 244 Nyberg, P. E., 59 Nyman, G., 335, 336 Ochsenfeld, C., 333 Oda, T., 244 Odintsov, V. V., 185 Oganov, A. R., 387 Ogawa, T., 156 Ogita, N., 156
Ohno, K., 247 Okada, O., 63 O’Leary, D., 64 Olés, A. M., 247 Olsen, J., 280 Ona, O., 387 Ong, C. K., 247 Oppenheim, I., 62, 157 Ordejón, P., 247 Ortiz, M. J., 283 Öttinger, H. C., 60 Ovchinnikov, M., 340 Oxgaard, J., 188 Oxley, J. C., 189 Pacchioni, G., 245, 248 Pacureanu, L. M., 388 Pagoria, P. F., 184, 185 Paige, C. C., 333, 334 Pakkanen, T. A., 123 Pakula, T., 59 Palmes, P. P., 389 Panagiotopoulos, A. Z., 154, 157 Pang, J. W., 340 Pant, P. V. K., 59 Papaconstantopoulos, D. A., 244 Papp, A., 388 Parasuk, V., 280 Parisi, F., 244 Parisi, G., 157 Park, B., 124 Park, D. H., 387 Park, T. J., 343 Parker, G. A., 338 Parkinson, A. R., 387 Parks, E. K., 245, 246, 248 Parlett, B. N., 333 Parr, R. G., 188 Parrinello, M., 122, 123, 185, 187, 188 Pasquarello, A., 244 Passerini, A., 388 Pastor, G. M., 243, 246, 247 Pate, B. H., 345 Patkowski, A., 64 Paul, W., 57, 58, 59, 60, 61, 62, 63, 64, 65, 66 Paulovic, J., 282 Pederson, M. R., 247 Pei, L., 121 Pendergast, P., 335 Peng, J. W., 347 Pepekin, V. I., 185 Pepper, M., 284
Pepper, M. J. M., 280 Perdew, J. P., 244 Perino, M., 65 Persaud, K. C., 389 Persson, B. J., 282 Persson, B. N. J., 120, 121 Pesce, L., 343 Peschar, R., 387 Peskin, U., 336, 343 Peterson, K. A., 343 Peterson, K. L., 386 Petrucci, S., 64 Pick, R. M., 64 Piecuch, P., 282 Pierloot, K., 280 Pitzer, R. M., 280 Piveteau, B., 247 Platen, E., 60 Plimpton, S. J., 65 Poilin, N., 343 Poirier, B., 336, 341, 346, 347 Politzer, P., 189 Pollard, W. T., 343 Ponder, J. W., 57 Poole, P. H., 65, 156 Popik, M. V., 284 Pople, J. A., 332 Poppi, R. J., 389 Porezag, D., 188 Pou-Amérigo, R., 282 Poulin, N. M., 334 Pourkashanian, M., 387 Power, P. P., 282 Pozo, R., 334 Prager, S., 57 Pranckeviciene, E., 387 Prandtl, L., 120 Prasad, K. O., 387 Preble, S., 387 Press, W. H., 333 Price, C., 57 Price, D. L., 66 Price, S. L., 58 Prielmeier, F. X., 155 Probert, M. I. J., 387 Prodhom, B., 58 Pruss, A., 187 Puertas, A. M., 156 Pulay, P., 336 Punch, W. F., 387 Punta, M., 388 Pusey, P. N., 66
Puxty, G., 387 Puzzarini, C., 282 Pyykkö, P., 279, 284 Qi, J., 343 Qi, Y., 124 Qiu, X. H., 57, 64, 66 Racine, S., 336 Rader, R., 247 Radom, L., 332 Rahman, A., 122 Raiser, Y. P., 185 Rao, B. K., 245, 246, 247 Rapaport, D. C., 153 Rasaiah, J. C., 186 Rasmussen, A. J., 342 Rathgeber, S., 57 Raveché, H. J., 154 Ravelo, R., 189 Reddy, B. M., 245 Reddy, R. V., 247 Ree, F. H., 185, 186 Reed, E. J., 188, 189 Reed, T. M., 186 Rehaman, A., 282 Reich, T., 283 Reichman, D. R., 153, 157 Reignier, D., 346 Reiher III, W. E., 58 Reimann, P., 122 Rein, G., 387 Reineker, P., 62 Reinhardt, W. P., 342 Reinisch, J., 157 Reiss, H., 156 Reiter, J., 59 Rendell, A., 280 Resasco, D. E., 388 Reuse, F. A., 245, 246, 247 Rhykerd, C. L., 121 Riande, E., 57 Ribeiro, F., 336, 344 Ricci, A., 122 Rice, B. M., 186 Rice, J. R., 335 Richert, R., 64, 157 Richter, D., 57, 62, 64, 65, 66 Rienstra-Kiracofe, J. C., 283 Rigby, D. J., 60, 63 Riley, S. J., 245, 246, 248 Ritort, F., 154
Ritter, C., 62 Rivera, E. C., 388 Rizos, A. K., 57 Robbins, M. O., 120, 121, 122, 124 Robertson, D. H., 189 Robinson, J., 387 Roder, J., 123 Rodríguez-López, J. L., 245 Roe, R. J., 60, 63 Rohwedder, J. J. R., 389 Roland, C. M., 157 Romero, A. H., 188 Romine, C., 334 Ronchetti, M., 154 Roos, B. O., 280, 281, 282, 283, 284 Root, D. M., 58, 123 Rosato, V., 387 Rösch, N., 245, 248 Rosche, M., 62 Rosenfeld, Y., 157, 158 Ross, M., 186 Rossberg, A., 283 Rossini, I., 283 Rossky, P. J., 335 Rost, B., 388 Rotstein, N. A., 57 Roux, B., 58 Rowlinson, J. S., 186 Roy, P.-N., 336, 344 Ruedenberg, K., 280 Ruhe, A., 335 Ruhman, S., 339, 342 Ruiz-Montero, M. J., 154 Rushbrooke, G. S., 186 Ruthardt, K., 282 Ryckaert, J.-P., 60, 64 Ryde, U., 280 Saad, Y., 333, 334 Saalfrank, P., 343 Saboungi, M. L., 66 Sadlej, A. J., 280 Saika-Voivod, I., 156 Saito, Y., 123 Sakashita, M., 188 Sakata, M., 387 Saksaengwijit, A., 157 Sakurai, T., 245 Salt, D. W., 387 Salzgeber, R. F., 344 Samara, C. T., 65 Samwer, K., 61
Sánchez-Portal, D., 247 Sandratskii, L. M., 244 Sang, Y., 121 Sansonetti, J., 284 Santana, R. C., 157, 388 Santikunaporn, M., 388 Santoro, F., 344 Saraf, D. N., 386 Sarkar, P., 343 Sarman, S. S., 122 Sastry, S., 154, 155, 156, 157 Sather, G. A., 186 Sattelberger, A. P., 281, 282 Saunders, M. A., 334 Saunders, W. A., 245 Saykally, R. J., 345 Scala, A., 156 Scandolo, S., 123, 187 Scharf, P., 280 Schatz, G. C., 340 Scheidsteger, T., 61 Schenk, H., 387 Schieffer, P., 247 Schiffer, H., 280 Schilling, R., 61 Schimmelpfennig, B., 280, 281, 283 Schindler, M., 280 Schinke, R., 333, 344 Schlenkrich, M., 58 Schüler, M., 280 Schleyer, P. v. R., 332 Schlier, C., 335, 344 Schmelzer, J., 57 Schmidt-Rohr, K., 65 Schmitz, D., 247 Schmitz, K. S., 388 Schneider, T., 122 Schober, H., 64 Schoen, M., 121 Schofield, S. A., 337 Schrøder, T. B., 65 Schuffenhauer, A., 389 Schultz, M. H., 334 Schulz, M., 62 Schwager, B., 187 Schwarzinger, S., 386 Schwegler, E., 188 Sciortino, F., 153, 156, 157 Scoles, G., 345 Scott, D. C., 336 Scott, D. S., 334 Seidel-Morgenstern, A., 388
Seideman, T., 338, 345, 346 Seifert, G., 188 Seijo, L., 280 Seitz, F., 244 Sellmyer, D. J., 244 Selzer, P., 389 Sémon, L., 283 Serra, S., 123 Serrano-Andrés, L., 280, 281 Seth, M., 280 Sette, F., 246 Sevast’yanov, V. G., 284 Sewell, T., 187, 189 Sewell, T. D., 184 Shaka, A. J., 340 Sham, L. J., 244 Shan, Y., 388 Sharda, R., 388 Shavitt, I., 280 Shaw, M. S., 186 Sheffield, S. A., 185 Shell, M. S., 154, 157 Shenderova, O. A., 123 Shepard, R., 280 Shi, Y., 341 Shibata, M., 63 Shinjo, K., 120 Shortliffe, E. H., 388 Shuh, D. K., 283 Shvalov, A., 388 Siciliano, P., 389 Siebert, R., 344 Siegbahn, P. E. M., 280 Sillescu, H., 65 Silva, C. M., 155 Silva, M., 345 Silva, R., 283 Silvestrelli, P. L., 188 Simard, B., 282 Simon, B., 338 Simpson, R. L., 184 Sinclair, J. E., 245 Singh, A. P., 61 Singh, D. J., 244 Sjögren, L., 61 Skanthakumar, S., 283 Skokov, S., 343, 346 Skylaris, C.-K., 283 Slater, J. C., 243 Sluiter, H. F., 247 Smalley, R. E., 248 Smit, B., 61, 153
Smith, E. D., 124 Smith, G. D., 57, 58, 62, 64, 65, 66, 184 Smith, J. C., 58 Smith, J. M., 158 Smith, S. C., 336, 339, 340, 341, 342, 343, 344, 346, 347 Socoliuc, A., 124 Soddemann, T., 122 Soderholm, L., 283 Sohn, K.-S., 387 Sokoloff, J. B., 123 Soler, J. M., 247 Sollich, P., 154 Somayazulu, M., 187 Somorjai, R., 387 Song, X.-H., 389 Sorensen, D. C., 335 Souers, P. C., 185, 186 Soulard, L., 189 Souter, P. F., 284 Spanjard, D., 247, 248 Speedy, R. J., 155, 156 Spellmeyer, D. C., 57 Spencer, S., 283 Spiess, H. W., 65 Sprandel, L. L., 332 Srinivas, S., 247 Stahlberg, E. A., 280 Stanley, H. E., 155, 156 Starling, K. E., 156 Starr, F. W., 65, 155, 156 Steele, W., 121 Steffen, W., 64 Steger, A., 339 Stegun, I. A., 338 Steifel, E. L., 333 Steinhardt, P. J., 154 Stell, G., 156, 186 Stepanyuk, V. S., 246, 248 Stephenson, T. A., 284 Stevens, M. J., 122, 124 Sticht, J., 244 Stickel, F., 64 Stillinger, F. H., 61, 153, 154, 155, 158 Stobbe, E. R., 387 Stockfish, T. P., 58 Stoll, E., 122 Stote, R., 58 Straatsma, T. P., 58, 155 Strachan, A., 188 Straub, J., 58 Strobl, G. R., 57
Strube, B., 64 Struzhkin, V. V., 187, 188 Stuart, S. J., 120, 123 Sughrue, E. L., 388 Suhai, S., 188 Sukumar, N., 386 Summerfield, M., 189 Sun, H., 57 Sun, J., 60 Sun, Q., 243 Sundberg, K. R., 280 Sunderling, C. M., 386 Suryanarayana, B., 189 Sutton, A. D., 282 Suzuki, A., 283 Sweredoski, M. J., 388 Szabó, Z., 283 Szalay, P. G., 280 Szalay, V., 333 Szekers, R., 189 Sztandera, L., 389 Tabor, D., 120 Takatsuka, T., 340 Tal-Ezer, H., 336, 338, 342, 343 Tamaddon, S., 155 Tanemura, M., 156 Tang, C.-J., 189 Tang, T.-T., 389 Tangney, P., 124 Tannor, D. J., 341, 345 Tarazona, P., 157 Tartaglia, P., 157 Tarver, C. M., 189 Tatewaki, H., 247 Taubes, C. H., 157 Taylor, H. S., 339, 340, 342, 343, 344 Taylor, J. D., 388 Taylor, P. R., 280 ten Wolde, P. R., 154 ter Meulen, J. J., 282 Teukolsky, S. A., 333 Theodorou, D. N., 58, 59, 60, 64, 65 Thirumalai, D., 154 Thole, B. T., 246 Thomas, T. R., 121 Thompson, D., 187 Thompson, P. A., 121 Tildesley, D. J., 58, 153 Tirado-Rives, J., 57 Toigo, F., 247 Tokuyama, M., 62
Tomanek, D., 123 Tomlinson, G. A., 120 Torda, A., 60 Torquato, S., 153, 154, 155 Tosatti, E., 123, 187 Touretzky, D. S., 389 Tracht, U., 65 Tremblay, J. C., 342 Triolo, A., 64 Trogler, W. C., 281 Tromp, J. W., 333 Troullier, N., 188 Trouw, F., 64 Truhlar, D. G., 332, 338 Truong, T., 187 Truskett, T. M., 154, 155, 156, 158 Tsuneyuki, S., 155 Tsunoda, Y., 244 Tsushima, S., 283 Tuckerman, M. E., 60, 120 Turkel, M. L., 185 Turnbull, D., 155, 244 Tutein, A. B., 123 Twu, C. H., 186, 187 Tyng, V., 344 Uhl, M., 244 Uhlherr, A., 59 Ulman, A., 124 Underwood, R., 335 Urbakh, M., 120, 121 Usui, S., 389 Uzer, T., 345 Vallet, V., 283 van Beest, B. W. H., 158 van der Laan, G., 246 van der Vorst, H., 334 van Duin, A. C. T., 188 van Gunsteren, W. F., 60 van Leeuwen, D. A., 248 Van Loan, C. F., 333 van Megen, W., 66 Van Opdorp, K., 187 van Ruitenbeek, J. M., 248 van Santen, R. A., 158 van Schaik, R. C., 60 van Schilfgaarde, M., 244 van Thiel, M., 185, 186 van Veen, A. C., 387 Van Zee, R. J., 246 van Zon, A., 63
Van, Q. N., 340 Vanderbilt, D., 188 Varnik, F., 61, 62 Vauthey, I., 387 Vega, A., 243, 245 Verrocchio, P., 157 Veryazov, V., 280, 281 Vetterling, W. T., 333 Viel, A., 344, 347 Vijay, A., 340, 342 Vilanova, X., 388 Villaseñor-González, P., 247 Vishnyakov, A., 154 Visscher, L., 281, 283 Vogel, M., 156 Vollmayr, K., 61 Voloshin, V. P., 155 von Barth, U., 244 von Neumann, J., 187 Vosko, S. H., 244 Voth, G., 187, 189 Wade, C. G., 64 Wagner, W., 187 Wahlgren, U., 281, 283 Waldman, M., 58 Walecka, J. D., 243 Wales, D. J., 157 Walker, A. J., 389 Wall, M. R., 339, 345, 347 Wallqvist, A., 283 Wan, X., 245 Wang, D., 245 Wang, D. S., 246 Wang, F., 157 Wang, G., 243 Wang, L. S., 246, 248 Wang, X.-G., 341, 342, 344, 347 Wang, Y., 343, 388 Wanner, H., 283 Wannier, G. H., 188 Warren, P., 122 Watanabe, M., 58 Watkins, P., 387 Weber, S. E., 248 Weber, T. R., 153 Webster, F., 335 Weeks, D. E., 345 Weeks, J. D., 154 Weinhild, F., 342 Weissmann, M., 244 Weitz, D. A., 155
Welsh, J. H., 337 Weltner, W., 246 Wenning, L., 121 Wentzcovitch, R. M., 122 White, C. T., 121, 189 Whitnell, R. M., 341 Whittaker, S., 387 Widmark, P.-O., 280, 281, 282, 283 Wildberger, K., 246, 248 Wilhelm, M., 65 Wilk, L., 244 Wilkinson, G., 284 Wilkinson, J. H., 334 Willetts, A., 283 Williams, A. R., 244 Williams, G., 189 Willner, L., 57, 62 Willoughby, R. A., 333, 334, 342 Wilson, E. B., 185 Winkler, R. G., 62 Wiorkiewicz-Kuczera, J., 58 Wipff, G., 283 Wittmann, H.-P., 59, 61, 62 Wittmer, J. P., 59 Wloch, M., 282 Wodtke, A. M., 345 Wolf Maciel, M. R., 388 Wolfgardt, M., 61 Wolinski, K., 280 Wolynes, P. G., 154, 157, 337 Wong, J. H., 386 Wong, J. W. H., 389 Woo, S. I., 387 Woo, T. K., 121, 124 Woodbury, H. H., 246 Woolf, L. A., 155 Workum, K. V., 66 Wu, C., 187 Wu, H., 246 Wu, H. Z., 248 Wu, L.-Y., 388 Wu, X. T., 335 Wyatt, R. E., 333, 334, 335, 337, 341, 342, 343, 345 Xia, X., 124, 154 Xie, D., 335, 337, 338, 341, 344, 346, 347 Xu, D., 337, 338, 341, 346 Xu, J., 389 Yabushita, S., 280 Yamaguchi, T., 245, 246
Yamamoto, S., 247 Yamashita, K., 335 Yamawaki, H., 188 Yan, G., 344 Yan, Q., 157 Yan, Z., 155 Yang, C., 335, 387 Yang, C.-Y., 343 Yang, T., 283 Yang, W. T., 188 Yao, G., 337, 343 Yin, D., 58 Yip, S., 58, 123 Yoon, D. Y., 57, 60, 62, 64, 65 Yoshida, H., 60 Yoshimoto, K., 66 Young, S., 284 Yu, H.-G., 335, 336, 339, 341, 344, 345, 347 Yu, J. Z., 247 Yu, K.-Q., 60 Yu, R.-Q., 389 Zaanen, J., 246 Zampronio, C. G., 389 Zarzycki, J., 57 Zaug, J. M., 185, 187 Zel'dovich, Y. B., 185, 187 Zeller, R., 246, 248 Zervopoulou, E., 59 Zettl, A., 124 Zhang, D.-H., 335
Zhang, G. W., 247 Zhang, H., 246, 336, 340, 343, 346, 347, 386 Zhang, J. Z. H., 335 Zhang, S., 187 Zhang, W., 189 Zhang, X.-S., 388 Zhang, Y., 388 Zhang, Z., 280 Zhang, Z. H. Z., 333 Zhao, J., 243 Zhao, J.-G., 280 Zheng, Q., 124 Zhong, W., 13, 124 Zhou, C., 344 Zhou, H.-C., 284 Zhou, L., 245, 246 Zhou, M., 280, 282 Zhou, S. J., 123 Zhu, C. Y., 157 Zhu, L., 245 Zhu, W., 64, 338, 345 Ziegler, J., 386 Ziman, J. M., 243 Zinamon, Z., 158 Zinn-Justin, J., 61 Zirkel, A., 57, 64 Zunger, A., 244 Zuppa, M., 389 Zwanzig, R. W., 154 Zwisler, W. H., 185, 186 Zybin, S. V., 189
Subject Index
Computer programs are denoted in boldface, databases and journals are in italic.
4d-Algorithm, 16 Ab initio molecular dynamics (AIMD), 101, 118 Abrasives, 119 Acceptance probability, 14 Actinide compounds, 249 Actinide-actinide bonds, 251, 270 Activation function, 369 Active orbitals, 252 Active space, 266 Adam-Gibbs relationship, 149 Adam-Gibbs theory, 26, 145 Adhesive interaction, 75 Adiabatic expansion, 163 Alpha-relaxation time, 3 AMBER, 9 Amorphous halo, 2, 32 Andersen barostat, 19 Angell plot, 4 Anionic clusters, 239 Annihilation operator, 200 ANO-RCC basis sets, 259 Antiferromagnetic clusters, 228 Antiferromagnetic spin ordering, 225 Anti-wear additives, 119 Anti-wear films, 117, 119 AO basis sets, 259 Apparent area of contact, 74, 110 Applications of SOMs, 384 Approximate preconditioner, 302 Arnoldi recursion, 319, 323 Arrhenius law, 4 Artificial neurons, 368
Artificial boundary inhomogeneity (ABI), 328 Artificial intelligence (AI), ix, 349 Artificial neural networks (ANNs), 367 Asperity, 74, 118 Atom clusters, 191, 364 Atomic force microscopy (AFM), 98 Atomic mean field integrals (AMFI), 258 Atomic natural orbital (ANO), 259 Atomic orbitals, 200 Atomic-scale roughness, 109 Atomistic modeling of friction, 68 Atomistic models, 9, 109, 160, 171 Atomistic simulations, 199 Autocorrelation function, 318 Automatic rule discovery, 374 Available volume, 138 Backpropagation, 373 Barostat, 18 Basis functions, 286 Basis set, 200 Basis set superposition error (BSSE), 278 Bead-spring model, 2, 6, 11, 19, 30, 34 Becker-Kistiakowski-Wilson (BKW) EOS, 164 Bessel functions, 325 Binary tournament selection, 355 Block copolymers, 95 Block Lanczos algorithm, 300 Bond-fluctuation lattice model, 11, 34, 22 Bond-orientational order, 128 Bound states, 326 Boundary conditions, 68, 92 Boundary lubricants, 73, 75
Reviews in Computational Chemistry, Volume 25 edited by Kenny B. Lipkowitz and Thomas R. Cundari Copyright © 2007 Wiley-VCH, John Wiley & Sons, Inc.
Brownian dynamics, 17 Brownian motion, 5 Bubbles, 181 Building-block hypothesis, 358 Bulk metals, 234 Bulk-like atoms, 197, 240 Byers Brown EOS, 164 Canonical ensemble, 18 Canonical partition function, 22 Capillary electrophoresis, 376 Carbon nanotubes, 113 Car-Parrinello Molecular Dynamics (CPMD), 173 Cascade-correlation learning, 378 CASPT2, 254 CASSCF state interaction (CASSI), 259 Cavities, 138 Cavity volumes, 138 Centroid mapping, 384 Cetane number, 375 Chain connectivity, 11 Chain stiffness, 22 Chapman-Jouguet (C-J) state, 161 Charge transfer, 202 Charge-induced dipoles, 167 CHARMM, 9 Chebyshev operator, 308 Chebyshev polynomials, 164, 309 Chebyshev propagation, 328 Chebyshev recursion, ix, 308 Cheetah, 165, 170 Chemical equilibrium, 161 Chemical kinetic modeling, 167 Chemically complex lubricant systems, 119 Chemically realistic modeling, 7 Chromium clusters, 225, 227 Chromium-chromium multiple bond, 264 Chromosome, 352 CI expansion, 253 C-J detonation theory, 163 Classification, 377, 380 Close lying electronic states, 250 Cluster pivot algorithm, 147 Cluster potential, 198 Cluster surface, 203, 237, 240 Clusters, 191, 192, 218, 364 Clusters ("Magic"), 218 Clusters of 4d elements, 234 Coarse-grained models, 6, 11, 19, 103 Coherent scattering function, 3 Cold welding, 72, 74
Combustion, 160 Commensurability, 78 Commensurate surfaces, 69, 78 Commensurate systems, 106 COMPASS, 9 Complete active space self-consistent field (CASSCF), viii, 251, 252 Complex-symmetric matrices, 287, 322 Compression, 117 Compression rate, 132 Computational bottleneck, 291 Computational chemistry, v Computational convenience, 79 Computational efficiency, 12 Computational materials chemistry, vi Computer simulations, 2, 7 Condensed-phase detonation simulations, 171 Configuration functions (CFs), 252 Configuration interaction (CI), 302 Configuration space, 13 Configurational entropy, 21, 22, 25, 145 Confining walls, 91 Conformational dynamics, 41, 45, 53 Conformational rearrangements, 21 Conformational transitions, 52 Conjugate gradient (CG) method, 296 Connection weight, 370 Connectivity altering moves, 15 Connectivity changing algorithm, 16 Conservation of momentum, 89 Continuous data, 376 Continuous instabilities, 106 Continuum models, 11, 109 Continuum-mechanics-based models, 103 Converged eigenvalues, 297 Convergence dynamics, 297 Cooling rate dependence, 18 Cooperative motion algorithm, 15 Copper-gold clusters, 365 Cosine propagator, 308 Coulomb correlation, 204 Coulomb integrals, 201 Coulomb’s law of friction, 76 Coupled cluster (CC) theory, 251, 254 CPU scaling law, 295 Creation operator, 200 Cross-correlation functions, 318 Crossover operator, 357 Crystal field potential, 202 Crystal nucleation, 133 Crystal structures, 364 Crystal-independent metrics, 128
Cullum-Willoughby test, 298, 305 Curie temperature, 194 Cut-off radius, 88 d Electrons, 192, 198, 235 Darling-Dennison resonance, 321 Darwin term, 258 Data analysis, 349 Davidson method, 302 Decision-making process, 385 Delaunay tessellations, 138 Delta filter, 312, 317 Delta function, 314 Density fluctuations, 26, 136 Density functional theory (DFT), 100, 180, 203, 240, 251 Density of electronic states, 241 Density-of-states algorithm, 147 Detailed balance condition, 13 Determination of crystal structures, 364 Deterministic projection, 384 Detonation, 160 Detonation conditions, 160 Detonation tests, 161 Detonation velocity, 166, 170 Detonation wave, 161 Diagonalization, ix Diamond anvil cell, 173, 181 Dielectric relaxation, 41 Dielectric screening, 167 Diffusion constant, 174 Diffusive motion, 5 Diffusivity, 136, 142 Dihedral barriers, 21, 46 Direct diagonalization, 289 Discrete energy representation (DER), 314 Discrete cosine propagator, 309 Discrete data, 376 Discrete variable representation (DVR), 288 Discretization, 286 Discretized Hamiltonian, 324 Dispersion interactions, 8 Dissipation mechanisms, 18 Dissipation of heat, 86 Dissipative particle dynamics (DPD), 88 Double-bridging algorithm, 15 Douglas-Kroll-Hess (DKH) Hamiltonian, 258 Drugs, 381, 384 Dual Lanczos algorithm, 323 Dynamic electron correlation, 251, 253, 254 Dynamic heterogeneity, 50, 53 Dynamic neutron scattering, 41
Dynamic scattering techniques, 3 Dynamics, 126 Effective core potentials (ECPs), 259 Effective direct integrals, 201 Effective Kohn-Sham potential, 204 Effective Slater integrals, 221 Eigenpairs, 329 Eigenproblems, 285 Eigenvalues, 287 EISPACK, 316 Elastic coupling, 72 Elastic instabilities, 72 18-Electron rule, 250 Electron correlation, 249, 254 Electron photodetachment spectroscopy, 239 Electron spin resonance (ESR) spectroscopy, 229 Electronic configurations, 192 Electronic nose, 377, 384 Electrostatic interactions, 167, 201 Empty lattice sites, 24 Entropy of glasses, 125 End-bridging algorithm, 15 Energetic materials, vii, 159 Energy content, 161 Energy dissipation, 71, 73, 85, 98, 105 Energy of detonation, 166 Entanglement, 14 Entanglement chain length, 5 Entropy, 7, 144 Entropy of liquids, 125 Epoch, 372 Equation of state (EOS), 163 Equations of motion, 89, 93 Equilibration in a polymer melt, 16 Equilibrium simulations, 68 Error cancellations, 205 Evolution operator, 324 Evolutionary algorithms, 350 Evolutionary operators, 353 Ewald summation, 100 Excess chemical potential, 24 Excess entropy, 130, 151 Exchange effects, 204 Exchange integrals, 201 Exchange interaction, 192 Exchange-correlation energy, 204 Exchange-correlation hole, 204 Exchange-correlation potential, 204 Excluded volume, 6, 11 Expectation values, 307
Experimental magnetic moment, 222 Expert systems, ix, 374, 385 Explicit water molecules, 269 Explosives, 375 Extended symmetry-adapted discrete variable representation (ESADVR), 322 Extended X-ray absorption fine structure (EXAFS) spectroscopy, 269 External magnetic field, 194, 205 External orbitals, 252 Extreme conditions, 159 Far from equilibrium, 68, 85, 180 Fast degrees of freedom, 17 Fast Fourier transform (FFT), 288 Fast-multipole methods, 100 Feature selection, 363 Feedforward network, 368 Fermi correlation, 204 Fermi energy, 197, 203 Fermi hole, 204 Fermi level, 204, 240 Ferromagnetic clusters, 193, 238 Fictitious forces, 101 Filter diagonalization, ix Filter function, 314 Filter operator, 313 Filter-diagonalization (FD), 313, 316 Filters, 312, 319 Finitely extensible nonlinear elastic (FENE) potentials, 11 Finnis-Sinclair potential, 219 First principles simulations of high explosives, 179 First-order instabilities, 106, 108 Fitness, 352 Fitness function, 353 Fluctuation-dissipation theorem, 104 Force constants, 9 Force field, 7, 8, 99, 179 Force field parameterization, 9 Force field validation, 9 Four-component Dirac operator, 258 Fourier transform, 3, 43, 82 Fractal surface, 90 Fragile glass formers, 4, 19, 20 Free surface area, 137 Free volume, 125, 139 Free-volume autocorrelation function, 143 Frenkel-Kontorova (FK) model, 98 Friction, vi, 68, 73, 98 Friction coefficient, 5, 107
Friction mechanisms, 70 Friction-velocity dependence, 87 Friedel model, 197 Full CI, 252 Fully optimized reaction space (FORS), 252 Gauss-Chebyshev quadrature points, 314 Gaussian filter, 315 Gaussian white noise processes, 17 Generalized gradient approximations (GGAs), 205, 229, 239 Generalized minimal residual (GMRES) method, 296, 301, 319 Generalized time representation (GTR), 314 Genetic algorithms, ix, 350 Geometric constraints, 94 Ghost particles, 16 Givens method, 289 Glass forming fluids, 19 Glass transition, v, 1, 2, 14, 126, 142 Glass-forming polymers, 1 Glassy freezing, 38 Glauber rates, 14 Global order, 133 Global orthogonality, 296, 300 Gram-Schmidt orthogonalization, 316 Grand canonical ensemble, 130 Graphical unitary group approach (GUGA), 252 Graphite, 113 Graphite sheets, 101 Green filter, 312, 319 Green operator, 327 Green’s functions, 104, 203, 314 Growing cell structure (GCS), 384 Gupta potential, 212, 365 Hamiltonian, 303 Hard-sphere chains, 34 Harmonic vibrations, 148 Hartree-Fock, 251 Heat capacity, 4 Heat-bath, 14 Hellmann-Feynman theorem, 308 Hermitian matrix, 287 Hermitian operators, 285 Hextuple bond, 265, 271, 274 Hidden layer, 370, 379 High explosive detonation, 162 High explosives, vii, 159 High spectral density, 327 High-pressure conditions, 99
High-temperature kinetics, 363 HMX, 167, 180 HMX a-polymorph, 180 Hopping integrals, 201 Householder method, 290 Hund’s rule, 192 Hurst roughness exponent, 82 Hydrodynamic interactions, 89 Hydrodynamic lubrication, 91 Hydrodynamic reaction zone, 161 Hydrogen bonding, 173 Hysteresis, 71, 74, 105, 108, 175 Icosahedral growth, 212 Ideal randomness, 131 Implicitly restarted Lanczos algorithms, 300 Improper torsion, 8 Inactive orbitals, 252 Incommensurate surfaces, 69, 76, 92, 106 Individual, 352 Information content, 318 Infrared spectrum, 372 Insertion probability, 24 Instabilities, 87, 98, 105 Integration time step, 18 Interfacial symmetry, 77 Internal relaxation processes, 21 Interstitial space, 210 Intramolecular vibrational energy redistribution (IVR), 326 Intruder states, 257 Inverse iteration, 290 Iron clusters, 225, 227, 238 Irreversible tribological phenomena, 74 Isothermal compression, 134, 136 Isotropic liquid, 130 Itinerant exchange, 193 Jacobi rotation, 289 Jacobs-Cowperthwaite-Zwissler (JCZ3) EOS, 164 Jellium model, 218 Kauzmann paradox, 25 Kauzmann temperature, 21, 22, 48 Kinetic friction, 17, 107 Kinetic friction force, 68, 69, 116 Kinetic hindrance of ordering, 7 Kinetic modeling, 364 Kinetic studies, 363 Kohlrausch-Williams-Watts (KWW) law, 6, 37, 49
Kohn-Sham equations, 203 Kohn-Sham Hamiltonian, 207 Kohonen network, 381 Kronecker delta, 288 Krylov subspace, 292, 304, 329 Kubo theory, 104 Lamellar phase, 95 Lanczos algorithm, 294 Lanczos interpolation, 300 Lanczos phenomenon, 297 Lanczos recursion, ix, 293 Langevin dynamics, 105 Langevin thermostat, 85 Large eigenproblems, viii Large matrices, 297 Large tensile stresses, 99 Lattice polymer models, 6, 11 Lattice potential, 198 LDA+U Method, 220 Learning, 349 Learning Genetic Algorithms, 361 Lees-Edwards periodic boundary conditions, 93 Legendre transformation, 22 Lennard-Jones (LJ) potentials, 99 Lennard-Jones system, 132 Ligand to metal charge transfer (LMCT), 262 Light scattering, 376 Linear response, 76 Linear response theory, 85, 88 Linear scaling, 288 Liouville equation, 26 Liouville super-operator, 325 Liouville-von Neumann equation, 325 Liquid water, 134 Load, 73, 83 Load-dependence of friction, 74 Local conformational flexibility, 12 Local coordination number, 224, 227 Local density approximation (LDA), 204 Local density of electronic states, 197, 240 Local Green’s functions, 242 Local magnetic moments, 212 Local moments, 227 Local orbital moments, 222 Local order, 133 Local spin density approximation (LSDA), 208, 229 Local spin polarization, 224 Long-lived resonances, 328 Long-range ordered structure, 2
Look-ahead algorithm, 320 Lorentz-Berthelot combination rules, 165 Löwdin orbitals, 200 Low-storage filter-diagonalization (LSFD), 317 Lubricant, 73 MacDonald’s theorem, 289 Macroscopic properties, 125, 140 Magnetic anisotropy energy, 194 Magnetic moments of bulk metals, 193 Magnetic properties, 191, 192 Magnetism, 196, 240 Magnetism of small clusters, 192, 193, 202 Magnetization density, 207 Manganese clusters, 228, 229 Markov chain, 14 Material properties, 132 Mating operator, 357 Matrix isolation spectroscopy, 270 Matrix-vector multiplication, 288 Mean field approximation, 201 Mean interparticle distance, 2 Melt structure factor, 46 Melt viscosity, 6 Mesoscopic scales, 11 Mesoscopic time scale, 14 Metallic clusters, 212 Metallic fuels, 161 Metal-metal bond length, 274 4d Metals, 234 Metastable state, 97 Method of moments, 241 Metropolis rates, 14 Microcanonical ensemble, 18 Microcanonical partition function, 22 Microcanonical trajectory, 17 Microscopic points of contact, 73 Minimal residual (MINRES) method, 296, 301, 319 Mode coupling theory (MCT), 26, 46 Model building, 7 Molar excess entropy, 151 MOLCAS-6, 251 Molecular descriptors, 375 Molecular dynamics (MD), 10, 13, 17, 81, 147, 217, 270 Molecular orientational order, 128 Molecular shape, 167 Molecular-orbital methods, 100 Møller-Plesset second-order perturbation theory (MP2), 254
Monte Carlo (MC), 10, 13, 147, 164, 165 Moore’s Law, 160 MoS2, 113 MSINDO, 365 Multi time-step integrators, 17 Multiconfigurational quantum methods, 249 Multiconfigurational wave function, 251 Multilayer icosahedral (MIC) structures, 215 Multiple metal-metal bond, 259 Multi-reference CI (MRCI), 254 Mutation operator, 357 Near IR (NIR) spectra, 363 Necking, 99 Néel temperature, 194 Network geometry, 370 Network-forming liquids, 148 Neural networks, ix, 363, 366 Neutron scattering, 29, 30, 41 Neutron spin echo experiment, 29 Newtonian mechanics, 70 Newton’s equations of motion, 17, 96 Nickel clusters, 211, 219 Nitromethane, 161, 180 NMR, 29 NMR spin-lattice relaxation, 41 Noble metal clusters, 218 Noise, 374 Non-additive pair interactions, 165 Nonbonded interactions, 8 Noncollinear magnetic configurations, 209, 241 Noncollinear spin DFT, 209 Nonequilibrium conditions, 81 Nonequilibrium simulations, 68 Non-hermiticity, 257 Non-isotropic stresses, 96 Nonlinear dynamics, 98 Nonlinear functions, 369, 379 Non-molecular phases, 167, 179 Nonorthogonality effects, 202 Nonrelativistic quantum chemistry, 257 Normal load, 68, 101 Normal pressure, 75, 76 Nosé-Hoover method, 18 Nosé-Hoover thermostat, 19, 174 Objective function, 353 Octet Rule, 250 Off-diagonal elements, 289 Off-lattice models, 11 Oligomers, 9
OPLS-AA, 9 Optimum solutions, 358 Orbital magnetic moment, 222 Orbital magnetism, 241 Orbital polarization, 219, 220 Order, 125, 132 Order metrics, 127 Ordering map, 132 Organizing data, 383 Orientation autocorrelation function, 42 Origins of friction, 68 Oscillator strengths, 261 Out-of-equilibrium, 4, 20 Outliers, 374 Out-of-plane bending, 8 Overlap integrals, 198 Overlap matrix, 200 p Electrons, 200, 217 Packing arrangements, 128 Packing effects, 46, 47 Padé approximation, 168 Paige test, 297 Palladium clusters, 234, 237 Parallel tempering, 47, 147 Parents, 353 Partition function, 14, 24 PCFF, 9 Penta-erythritol tetranitrate (PETN), 166, 170 Perceptron, 369 Perfect crystalline structure, 131 Periodic boundary conditions (PBCs), 92, 97, 99, 181 Persistence times, 143 Phase space volume, 17 Photodetachment spectroscopy, 239 Photoelectron spectrum, 239 Plane-wave basis sets, 101 Plastic deformation, 70, 72, 103, 111, 112 Plastic flow, 73 Plastic-bonded explosive (PBX), 159 Polarizable continuum medium, 269 Polarizable force field, 270 1,4-Polybutadiene, 40, 95 Polymer coil, 5 Polymer force fields, 8 Polymer melts, 1 Polymer properties, 377 Polymer repeat units, 9 Polymers, 1 Polymorph, 128, 180 Polystyrene, 40, 95
Potential drug molecules, 384 Potential energy landscape (PEL), 145 Power method, 292 Prandtl-Tomlinson model, 71, 98 Preconditioned inexact spectral transform (PIST), 302 Preconditioned Lanczos algorithms, 302 Preconditioners, 320 Predictive material models, 160 Predictor-corrector methods, 86 Pressure-induced chemical reactions, 108 Principal components analysis, 377 Projection operator, 321 Propagation of wave packets, 324 Protein domain predictor, 377 Protein structure, 362 Proteins, 377 Pulay forces, 101 QR factorization, 290 Quadruple bond, 263, 265, 271 Quantitative structure-activity relationship (QSAR), 375 Quantum chemical methods, 100, 117 Quantum chemistry, 249 Quantum mechanical methods, 179 Quantum mechanics, 285, 303 Quantum-based MD simulations, 173 Quasi-elastic neutron scattering (QENS), 41 Quasi-minimal residual (QMR) method, 296, 301, 319 Quasi-steady state, 364 Quintuple bond, 265, 274 Radial distribution function, 30, 129, 152, 175 Radius of gyration, 5, 18 Raman spectra, 173 Random close-packed state, 132 Random coil-like configuration, 5 Random forces, 17, 86 Random projection, 384 Random walk (RW), 11 Randomness, 132 Rare gas matrices, 229, 268 Rate of crossover, 362 Rate of mutation, 362 Rayleigh line, 163 Rayleigh-Ritz variation, 289 RDX, 167 Reaction dynamics, 327 Reaction probability operator, 328
Reaction time scales, 162 Reactive force fields, 97, 100, 117 Reactive scattering, 328 Real area of contact, 73 Realistic simulations, 81 Real-symmetric matrices, 287, 308 Reciprocal space, 82, 89, 104 Recursive diagonalization methods, 288, 291 Recursive linear equation solvers, 296, 301, 320 Recursive methods, 319 Recursive neural networks, 377 Recursive residue generation method (RRGM), 303, 304 Recursive solutions, 285 Reference function, 257 Relativistic AO basis sets, 259 Relativistic corrections, 205 Relativistic effects, 249, 251 Relativity, 257 Relaxation functions, 6 Relaxation processes, 145 Relaxation time, 2, 4, 14, 16 Reptation moves, 16 Reptation-like behavior, 5 Resonance states, 323, 328 Resonances, 330 Restricted active space (RAS) SCF method, 253 Rhodium clusters, 234, 235 Rough surfaces, 81 Roulette wheel selection, 355 Round-off errors, 296 Rouse mode, 6, 38 Ro-vibrational Schrödinger equation, 326 Rule-discovery by machine, 374 Rules, 385 Ruthenium clusters, 234, 237 s Electrons, 200 Scaling laws, 329 Schema theorem, 361 Shocked hydrocarbons, 162 Schrödinger equation, 160, 173, 286 Scraping, 74 Second quantization, 200 Second virial coefficients, 141 Second-order Gear predictor-corrector method, 86 Second-order perturbation theory, 266 Segment length, 5 Segmental friction, 38
Segmental friction coefficient, 6 Self-assembled monolayers (SAMs), 116 Self-avoiding random walk (SAW), 11 Self-consistent charge density-functional tight-binding (SCC-DFTB), 180 Self-diffusion, 34, 136, 142, 144 Self-diffusion coefficient, 6, 26, 34 Self-diffusivity, 149 Self-organizing maps (SOMs), ix, 380 Self-similar surface, 82 Semi-crystalline, 1 Semi-empirical molecular orbital methods, 100 Sensors, 377 Sextuple bond, 225 Shear, 83, 117 Shear force, 73 Shear rate, 93 Shear stress, 75, 116 Shock conditions, 160 Shock Hugoniot, 163 Shocked hydrocarbons, 180 Short-iterative Lanczos (SIL) method, 325 Short-range orientational correlations, 5 Silicon clusters, 365 Simulation methods, 13 Simulations, 68 Single Lanczos propagation (SLP) method, 305 Single-chain structure factor, 30, 46 Singular value decomposition (SVD), 289, 316 Sintering, 191 Size-extensivity, 254 Sliding velocity, 72, 76, 88 Slip length, 79, 91 Slow relaxation in polymers, 14 Solid friction, 72 Solvation shell, 270 Solvent partitioning, 376 Sparse data, 363 Sparse Hamiltonian matrices, 319 SPC/E Water, 135, 149 Specific entropy, 18 Specific heat, 7 Specific volume, 18, 19 Spectral density, 42, 43, 321 Spectral density operator, 311 Spectral method, 310, 313 Spectral transform Lanczos algorithm, 301 Spectral transforming filters, 301
Spectroscopy, 326 Spherical harmonic, 131, 221 Spin magnetic moment, 222 Spin magnetism, 241 Spin polarization, 202, 205 Spin-density matrix, 206 Spin-dependent operators, 207 Spin-lattice relaxation time (T1), 42, 45 Spin-orbit coupling (SOC), 251, 259, 268 Spin-orbit interaction, 222, 241 Spinors, 206 Spin-polarized DFT, 206, 208 Spurious eigenvalues, 323 Squashing function, 369 Starting population, 353 Static correlation effects, 254 Static defects, 181 Static friction force, 69, 107, 110 Statistical mechanics, 164 Stern-Gerlach deflection experiment, 231, 239 Stick condition, 91 Stick-slip motion, 79, 85, 107, 116 Stochastic dynamics, 13 String, 352 Strong glass formers, 4, 19, 20 Structural correlations, 126 Structural glass transition, 1 Structural metrics, 130 Structural order, 132 Structural order metrics, 127 Structural ordering maps, 132 Structure factor, 2, 3 Super-cooled Lennard-Jones fluid, 133 Super-cooled liquid, 7, 21, 127, 145 Super-cooled polymer melts, 26, 142 Supercritical phase, 181 Superionic solid, 172 Superionic water, 167, 172, 179 Superlubricity, 70, 74, 112 Super-paramagnetism, 194, 231 Supervised learning, 373 Surface area, 137 Surface asperities, 74 Surface atoms, 197, 220, 224 Surface roughness, 81 Survival of the fittest operator, 353 Swap Monte Carlo algorithm, 147 Symmetry adaptation, 320 Symmetry-adapted autocorrelation function, 321 Symmetry-adapted Lanczos algorithm, 322 Symplectic integrator, 17
Test set, 372 Tetrahedrality parameter, 134 Tetrahedral order, 134 Thermal expansion, 3 Thermal expansion coefficient, 3, 19 Thermal fluctuations, 76, 143, 175 Thermodynamic equilibrium, 194 Thermodynamic properties, 18 Thermostats, 19, 68, 85, 97, 134 Threshold force, 69 Tight binding calculations, 211, 241 Tight binding method, 198, 240 Tight-binding DFT, 100 Time propagator, 324, 327 Time scales, 184 Time-dependent Schrödinger equation, 324 Time-dependent friction, 18, 104 Time-temperature superposition principle, 6 Topological constraints, 94 Topology, 11 Torsional autocorrelation function, 52 Torsional correlation times, 45 Torsional transitions, 47 Toxicity, 376 Training set, 372, 379 Trajectory, 13, 86 Transfer function, 369 Transition amplitudes, 303 Transition elements, 191 Transition metal, 249 Transition metal clusters, vii Transition rates, 13 Translational order, 128 Tribochemical reactions, 108 Tribochemistry, 100, 117 Tribological simulation, 97 Tribology, 68 Tribometer experiments, 84 Two-center integrals, 198 Ultra-low friction, 70, 74, 113 Uncertainty, 350 Uncertainty principle, 316 Union volume, 138 United atoms, 9, 30 Unrestricted Hartree-Fock approximation, 200 U-U bond, 274 van-Hove correlation functions, 51 Variable-cell AIMD simulations, 101 Velocity Verlet integrator, 17, 86 Velocity-dependence of friction, 76
Vertical excitation energies, 261 Vibrational entropy, 146 Vibrational mode, 326 Vibrational quantum numbers, 327 Viscosimetric glass transition (Tg), 4 Viscosity, 4, 91 Vogel-Fulcher laws, 4, 19 Vogel-Fulcher temperature, 5, 20, 22, 25, 48 Voids, 181 Volume relaxation, 21 von Schweidler exponent, 28, 37 von Schweidler law, 28, 49 Voronoi tessellations, 138 Wannier function, 178 Water, 149
Water environment, 270 Water phase diagram, 172, 173 Wave packet, 324 Wavelets, 363 Wear, 70, 74, 112, 119 Winning node, 381 X-ray diffraction, 173 X-ray scattering, 30, 269 Yield strength, 73, 75 Zinc phosphates (ZPs), 117 Zeldovich-von Neumann-Döring (ZND) state, 172 Zwanzig formalism, 104