HANDBOOK OF MATERIALS MODELING
Part A. Methods
Editor
Sidney Yip, Massachusetts Institute of Technology
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN-10 1-4020-3287-0 (HB) Springer Dordrecht, Berlin, Heidelberg, New York
ISBN-10 1-4020-3286-2 (e-book) Springer Dordrecht, Berlin, Heidelberg, New York
ISBN-13 978-1-4020-3287-5 (HB) Springer Dordrecht, Berlin, Heidelberg, New York
ISBN-13 978-1-4020-3286-8 (e-book) Springer Dordrecht, Berlin, Heidelberg, New York
Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands.
Printed on acid-free paper
All Rights Reserved
© 2005 Springer
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Printed in The Netherlands
CONTENTS

PART A – METHODS
Preface
List of Subject Editors
List of Contributors
Detailed Table of Contents
Introduction
Chapter 1. Electronic Scale
Chapter 2. Atomistic Scale
Chapter 3. Mesoscale/Continuum Methods
Chapter 4. Mathematical Methods

PART B – MODELS
Preface
List of Subject Editors
List of Contributors
Detailed Table of Contents
Chapter 5. Rate Processes
Chapter 6. Crystal Defects
Chapter 7. Microstructure
Chapter 8. Fluids
Chapter 9. Polymers and Soft Matter
Plenary Perspectives
Index of Contributors
Index of Keywords
PREFACE

This Handbook contains a set of articles introducing the modeling and simulation of materials from the standpoint of basic methods and studies. The intent is to provide a compendium that is foundational to an emerging field of computational research, a new discipline that may now be called Computational Materials. This area has become sufficiently diverse that any attempt to cover all the pertinent topics would be futile. Even with a limited scope, the present undertaking has required the dedicated efforts of 13 Subject Editors to set the scope of nine chapters, solicit authors, and collect the manuscripts. The contributors were asked to target students and non-specialists as the primary audience, to provide an accessible entry into the field, and to offer references for further reading. With no precedents to follow, the editors and authors were guided only by a common goal – to produce a volume that would set a standard toward defining the broad community and stimulating its growth.

The idea of a reference work on materials modeling surfaced in conversations with Peter Binfield, then the Reference Works Editor at Kluwer Academic Publishers, in the spring of 1999. The rationale at the time already seemed quite clear – the field of computational materials research was taking off, powerful computer capabilities were becoming increasingly available, and many sectors of the scientific community were getting involved in the enterprise. It was felt that a volume that could articulate the broad foundations of computational materials and connect with the established fields of computational physics and computational chemistry through common fundamental scientific challenges would be timely. After five years, none of the conditions have changed: the need remains for a defining reference volume, interest in materials modeling and simulation is further intensifying, and the community continues to grow.
In this work materials modeling is treated in 9 chapters, loosely grouped into two parts. Part A, emphasizing foundations and methodology, consists of three chapters describing theory and simulation at the electronic, atomistic, and mesoscale levels, and a chapter on analysis-based methods. Part B is more concerned with models and basic applications. There are five chapters describing basic problems in materials modeling and simulation: rate-dependent phenomena, crystal defects, microstructure, fluids, and polymers and soft matter. In addition this part contains a collection of commentaries on a range of issues in materials modeling, written in a free-style format by experienced individuals with definite views that could enlighten the future members of the community. See the opening Introduction for further comments on modeling and simulation and an overview of the Handbook contents.

Any organizational undertaking of this magnitude can only be a collective effort. Yet the fate of this volume would not be so certain without the critical contributions from a few individuals. My gratitude goes to Liesbeth Mol, Peter Binfield’s successor at Springer Science + Business Media, for continued faith and support, Ju Li and Xiaofeng Qian for managing the websites and manuscript files, and Tim Kaxiras for stepping in at a critical stage of the project. To all the authors who found time in your hectic schedules to write the contributions, I am deeply appreciative and trust you are not disappointed. To the Subject Editors I say the Handbook is a reality only because of your perseverance and sacrifices. It has been my good fortune to have colleagues who were generous with advice and assistance. I hope this work motivates them even more to continue sharing their knowledge and insights in the work ahead.

Sidney Yip
Department of Nuclear Science and Engineering, Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
LIST OF SUBJECT EDITORS

Martin Bazant, Massachusetts Institute of Technology (Chapter 4)
Bruce Boghosian, Tufts University (Chapter 8)
Richard Catlow, Royal Institution, UK (Chapter 6)
Long-Qing Chen, Pennsylvania State University (Chapter 7)
William Curtin, Brown University (Chapter 1, Chapter 2, Chapter 4)
Tomas Diaz de la Rubia, Lawrence Livermore National Laboratory (Chapter 6)
Nicolas Hadjiconstantinou, Massachusetts Institute of Technology (Chapter 8)
Mark F. Horstemeyer, Mississippi State University (Chapter 3)
Efthimios Kaxiras, Harvard University (Chapter 1, Chapter 2)
L. Mahadevan, Harvard University (Chapter 9)
Dimitrios Maroudas, University of Massachusetts (Chapter 4)
Nicola Marzari, Massachusetts Institute of Technology (Chapter 1)
Horia Metiu, University of California Santa Barbara (Chapter 5)
Gregory C. Rutledge, Massachusetts Institute of Technology (Chapter 9)
David J. Srolovitz, Princeton University (Chapter 7)
Bernhardt L. Trout, Massachusetts Institute of Technology (Chapter 1)
Dieter Wolf, Argonne National Laboratory (Chapter 6)
Sidney Yip, Massachusetts Institute of Technology (Chapter 1, Chapter 2, Chapter 6, Plenary Perspectives)
LIST OF CONTRIBUTORS

Farid F. Abraham
IBM Almaden Research Center, San Jose, California
P20

Francis J. Alexander
Los Alamos National Laboratory, Los Alamos, NM, USA
8.7

N.R. Aluru
Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
8.3

Filippo de Angelis
Istituto CNR di Scienze e Tecnologie Molecolari ISTM, Dipartimento di Chimica, Università di Perugia, Via Elce di Sotto 8, I-06123, Perugia, Italy
1.4

Emilio Artacho
University of Cambridge, Cambridge, UK
1.5

Mark Asta
Northwestern University, Evanston, IL, USA
1.16

Robert Averback
Accelerator Laboratory, P.O. Box 43 (Pietari Kalmin k. 2), 00014, University of Helsinki, Finland; Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Illinois, USA
6.2

D.J. Bammann
Sandia National Laboratories, Livermore, CA, USA
3.2

K. Barmak
Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
7.19

Stefano Baroni
DEMOCRITOS-INFM, SISSA-ISAS, Trieste, Italy
1.10

Rodney J. Bartlett
Quantum Theory Project, Departments of Chemistry and Physics, University of Florida, Gainesville, FL 32611, USA
1.3

Corbett Battaile
Sandia National Laboratories, Albuquerque, NM, USA
7.17
Martin Z. Bazant
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
4.1, 4.10

Noam Bernstein
Naval Research Laboratory, Washington, DC, USA
2.24

Kurt Binder
Institut fuer Physik, Johannes Gutenberg Universitaet Mainz, Staudinger Weg 7, 55099 Mainz, Germany
P19

Peter E. Blöchl
Institute for Theoretical Physics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
1.6

Bruce M. Boghosian
Department of Mathematics, Tufts University, Bromfield-Pearson Hall, Medford, MA 02155, USA
8.1

Jean Pierre Boon
Center for Nonlinear Phenomena and Complex Systems, Université Libre de Bruxelles, 1050-Bruxelles, Belgium
P21

Iain D. Boyd
University of Michigan, Ann Arbor, MI, USA
P22

Vasily V. Bulatov
Lawrence Livermore National Laboratory, University of California, Livermore, CA 94550, USA
P7

Russel Caflisch
University of California at Los Angeles, Los Angeles, CA, USA
7.15

Wei Cai
Department of Mechanical Engineering, Stanford University, Stanford, CA 94305-4040, USA
2.21

Roberto Car
Department of Chemistry and Princeton Materials Institute, Princeton University, Princeton, NJ, USA
1.4

Paolo Carloni
International School for Advanced Studies (SISSA/ISAS) and INFM Democritos Center, Trieste, Italy
1.13

Emily A. Carter
Department of Mechanical and Aerospace Engineering and Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
1.8

C.R.A. Catlow
Davy Faraday Laboratory, The Royal Institution, 21 Albemarle Street, London W1S 4BS, UK; Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
2.7, 6.1

Gerbrand Ceder
Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
1.17, 1.18
Alan V. Chadwick
Functional Materials Group, School of Physical Sciences, University of Kent, Canterbury, Kent CT2 7NR, UK
6.5

Hue Sun Chan
University of Toronto, Toronto, Ont., Canada
5.16

James R. Chelikowsky
University of Minnesota, Minneapolis, MN, USA
1.7

Long-Qing Chen
Department of Materials Science and Engineering, Penn State University, University Park, PA 16802, USA
7.1

I-Wei Chen
Department of Materials Science and Engineering, University of Pennsylvania, Philadelphia, PA 19104-6282, USA
P27

Sow-Hsin Chen
Department of Nuclear Engineering, MIT, Cambridge, MA 02139, USA
P28

Christophe Chipot
Equipe de dynamique des assemblages membranaires, Unité mixte de recherche CNRS/UHP 7565, Institut nancéien de chimie moléculaire, Université Henri Poincaré, BP 239, 54506 Vandœuvre-lès-Nancy cedex, France
2.26

Giovanni Ciccotti
INFM and Dipartimento di Fisica, Università “La Sapienza,” Piazzale Aldo Moro, 2, 00185 Roma, Italy
2.17, 5.4

Marvin L. Cohen
University of California at Berkeley and Lawrence Berkeley National Laboratory, Berkeley, CA, USA
1.2

John Corish
Department of Chemistry, Trinity College, University of Dublin, Dublin 2, Ireland
6.4

Peter V. Coveney
Centre for Computational Science, Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
8.5

Jean-Paul Crocombette
CEA Saclay, DEN-SRMP, 91191 Gif/Yvette cedex, France
2.28

Darren Crowdy
Department of Mathematics, Imperial College, London, UK
4.10

Gábor Csányi
Cavendish Laboratory, University of Cambridge, UK
P16

Nguyen Ngoc Cuong
Massachusetts Institute of Technology, Cambridge, MA, USA
4.15

Christoph Dellago
Institute of Experimental Physics, University of Vienna, Vienna, Austria
5.3
J.D. Doll
Department of Chemistry, Brown University, Providence, RI, USA
5.2

Patrick S. Doyle
Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
9.7

Weinan E
Department of Mathematics, Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544-1000, USA
4.13

Jens Eggers
School of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, UK
4.9

Pep Español
Dept. Física Fundamental, Universidad Nacional de Educación a Distancia, Aptdo. 60141, E-28080 Madrid, Spain
8.6

J.W. Evans
Ames Laboratory - USDOE, and Department of Mathematics, Iowa State University, Ames, Iowa, 50011, USA
5.12

Denis J. Evans
Research School of Chemistry, Australian National University, Canberra, ACT, Australia
P17

Michael L. Falk
University of Michigan, Ann Arbor, MI, USA
4.3

Diana Farkas
Department of Materials Science and Engineering, Virginia Tech, Blacksburg, VA 24061, USA
2.23

Clemens J. Först
Institute for Theoretical Physics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
1.6

Glenn H. Fredrickson
Department of Chemical Engineering & Materials, The University of California at Santa Barbara, Santa Barbara, CA, USA
9.9

Daan Frenkel
FOM Institute for Atomic and Molecular Physics, Amsterdam, The Netherlands
2.14

Julian D. Gale
Nanochemistry Research Institute, Department of Applied Chemistry, Curtin University of Technology, Perth, 6845, Western Australia
1.5, 2.3

Giulia Galli
Lawrence Livermore National Laboratory, CA, USA
P8

Venkat Ganesan
Department of Chemical Engineering, The University of Texas at Austin, Austin, TX, USA
9.9

Alberto García
Universidad del País Vasco, Bilbao, Spain
1.5
C. William Gear
Princeton University, Princeton, NJ, USA
4.11

Timothy C. Germann
Applied Physics Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
2.11

Eitan Geva
Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, USA
5.9

Nasr M. Ghoniem
Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, CA 90095-1597, USA
7.11, P11, P30

Paolo Giannozzi
Scuola Normale Superiore and National Simulation Center, INFM-DEMOCRITOS, Pisa, Italy
1.4, 1.10

E. Van der Giessen
University of Groningen, Groningen, The Netherlands
3.4

Daniel T. Gillespie
Dan T Gillespie Consulting, 30504 Cordoba Place, Castaic, CA 91384, USA
5.11

George Gilmer
Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94550, USA
2.10

William A. Goddard III
Materials and Process Simulation Center, California Institute of Technology, Pasadena, CA 91125, USA
P9

Axel Groß
Physik-Department T30, TU München, 85747 Garching, Germany
5.10

Peter Gumbsch
Institut für Zuverlässigkeit von Bauteilen und Systemen izbs, Universität Karlsruhe (TH), Kaiserstr. 12, 76131 Karlsruhe, Germany and Fraunhofer Institut für Werkstoffmechanik IWM, Wöhlerstr. 11, D-79194 Freiburg, Germany
P10

François Gygi
Lawrence Livermore National Laboratory, CA, USA
P8

Nicolas G. Hadjiconstantinou
Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
8.1, 8.8

J.P. Hirth
Ohio State and Washington State Universities, 114 E. Ramsey Canyon Rd., Hereford, AZ 85615, USA
P31

K.M. Ho
Ames Laboratory-U.S. DOE and Department of Physics and Astronomy, Iowa State University, Ames, IA 50011, USA
1.15
Wesley P. Hoffman
Air Force Research Laboratory, Edwards, CA, USA
P37

Wm.G. Hoover
Department of Applied Science, University of California at Davis/Livermore and Lawrence Livermore National Laboratory, Livermore, California, 94551-7808
P34

M.F. Horstemeyer
Mississippi State University, Mississippi State, MS, USA
3.1, 3.5

Thomas Y. Hou
California Institute of Technology, Pasadena, CA, USA
4.14

Hanchen Huang
Department of Mechanical, Aerospace and Nuclear Engineering, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180-3590, USA
2.30

Gerhard Hummer
National Institutes of Health, Bethesda, MD, USA
4.11

M. Saiful Islam
Chemistry Division, SBMS, University of Surrey, Guildford GU2 7XH, UK
6.6

Seogjoo Jang
Chemistry Department, Brookhaven National Laboratory, Upton, New York 11973-5000, USA
5.9

C.S. Jayanthi
Department of Physics, University of Louisville, Louisville, KY 40292
P39

Raymond Jeanloz
University of California, Berkeley, CA, USA
P25

Pablo Jensen
Laboratoire de Physique de la Matière Condensée et des Nanostructures, CNRS and Université Claude Bernard Lyon-1, 69622 Villeurbanne Cédex, France
5.13

Yongmei M. Jin
Department of Ceramic and Materials Engineering, Rutgers University, 607 Taylor Road, Piscataway, NJ 08854, USA
7.12

Xiaozhong Jin
Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
8.3

J.D. Joannopoulos
Francis Wright Davis Professor of Physics, Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
P4

Javier Junquera
Rutgers University, New Jersey, USA
1.5

João F. Justo
Escola Politécnica, Universidade de São Paulo, São Paulo, Brazil
2.4
Hideo Kaburaki
Japan Atomic Energy Research Institute, Tokai, Ibaraki, Japan
2.18

Rajiv K. Kalia
Collaboratory for Advanced Computing and Simulations, Department of Physics & Astronomy, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
2.25

Raymond Kapral
Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Ont. M5S 3H6, Canada
2.17, 5.4

Alain Karma
Northeastern University, Boston, MA, USA
7.2

Johannes Kästner
Institute for Theoretical Physics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
1.6

Markos A. Katsoulakis
Department of Mathematics and Statistics, University of Massachusetts - Amherst, Amherst, MA 01002, USA
4.12

Efthimios Kaxiras
Department of Nuclear Science and Engineering and Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
2.1, 8.4

Ronald J. Kerans
Air Force Research Laboratory, Materials and Manufacturing Directorate, Wright-Patterson Air Force Base, Ohio, USA
P38

Ioannis G. Kevrekidis
Princeton University, Princeton, NJ, USA
4.11

Armen G. Khachaturyan
Department of Ceramic and Materials Engineering, Rutgers University, 607 Taylor Road, Piscataway, NJ 08854, USA
7.12

T.A. Khraishi
University of New Mexico, Albuquerque, NM, USA
3.3

Seong Gyoon Kim
Kunsan National University, Kunsan 573-701, Korea
7.3

Won Tae Kim
Chongju University, Chongju 360-764, Korea
7.3

Michael L. Klein
Center for Molecular Modeling, Chemistry Department, University of Pennsylvania, 231 South 34th Street, Philadelphia, PA 19104-6323, USA
2.26

Walter Kob
Laboratoire des Verres, Université Montpellier 2, 34095 Montpellier, France
P24
David A. Kofke
University at Buffalo, The State University of New York, Buffalo, New York, USA
2.14

Maurice de Koning
University of São Paulo, São Paulo, Brazil
2.15

Anatoli Korkin
Quantum Theory Project, Departments of Chemistry and Physics, University of Florida, Gainesville, FL 32611, USA
1.3

Kurt Kremer
MPI for Polymer Research, D-55021 Mainz, Germany
P5

Carl E. Krill III
Materials Division, University of Ulm, Albert-Einstein-Allee 47, D-89081 Ulm, Germany
7.6

Ladislas P. Kubin
LEM, CNRS-ONERA, 29 Av. de la Division Leclerc, BP 72, 92322 Chatillon Cedex, France
P33

D.P. Landau
Center for Simulational Physics, The University of Georgia, Athens, GA 30602, USA
P2

James S. Langer
Department of Physics, University of California, Santa Barbara, CA 93106-9530, USA
4.3, P14

C. Leahy
Department of Physics, University of Louisville, Louisville, KY 40292, USA
P39

R. LeSar
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
7.14

Ju Li
Department of Materials Science and Engineering, Ohio State University, Columbus, OH, USA
2.8, 2.19, 2.31

Xiantao Li
Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
4.13

Gang Li
Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
8.3

Vincent L. Lignères
Department of Chemistry, Princeton University, Princeton, NJ 08544, USA
1.8

Turab Lookman
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
7.5

Steven G. Louie
Department of Physics, University of California at Berkeley and Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
1.11
John Lowengrub
University of California, Irvine, California, USA
7.8

Gang Lu
Division of Engineering and Applied Science, Harvard University, Cambridge, Massachusetts, USA
2.20

Alexander D. MacKerell, Jr.
Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, 20 Penn Street, Baltimore, MD, 21201, USA
2.5

Alessandra Magistrato
International School for Advanced Studies (SISSA/ISAS) and INFM Democritos Center, Trieste, Italy
1.13

L. Mahadevan
Division of Engineering and Applied Sciences, Department of Organismic and Evolutionary Biology, Department of Systems Biology, Harvard University, Cambridge, MA 02138, USA

Dionisios Margetis
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
4.8

E.B. Marin
Sandia National Laboratories, Livermore, CA, USA
3.5

Dimitrios Maroudas
University of Massachusetts, Amherst, MA, USA
4.1

Richard M. Martin
University of Illinois at Urbana, Urbana, IL, USA
1.5

Georges Martin
Commissariat à l’Énergie Atomique, Cab. H.C., 33 rue de la Fédération, 75752 Paris Cedex 15, France
7.9

Nicola Marzari
Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
1.1, 1.4

Wayne L. Mattice
Department of Polymer Science, The University of Akron, Akron, OH 44325-3909
9.3

V.G. Mavrantzas
Department of Chemical Engineering, University of Patras, Patras, GR 26500, Greece
9.4

D.L. McDowell
Georgia Institute of Technology, Atlanta, GA, USA
3.6, 3.9

Michael J. Mehl
Center for Computational Materials Science, Naval Research Laboratory, Washington, DC, USA
1.14

Horia Metiu
University of California, Santa Barbara, CA, USA
5.1
R.E. Miller
Carleton University, Ottawa, ON, Canada
2.13

Frederick Milstein
Mechanical Engineering and Materials Depts., University of California, Santa Barbara, CA, USA
4.2

Y. Mishin
George Mason University, Fairfax, VA, USA
2.2

Francesco Montalenti
INFM, L-NESS, and Dipartimento di Scienza dei Materiali, Università degli Studi di Milano-Bicocca, Via Cozzi 53, I-20125 Milan, Italy
2.11

Dane Morgan
Massachusetts Institute of Technology, Cambridge MA, USA
1.18

John A. Moriarty
Lawrence Livermore National Laboratory, University of California, Livermore, CA 94551-0808
P13

J.W. Morris, Jr.
Department of Materials Science and Engineering, University of California, Berkeley, CA, USA
P18

Raymond D. Mountain
Physical and Chemical Properties Division, Chemical Science and Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD 20899-8380, USA
P23

Marcus Müller
Department of Physics, University of Wisconsin, Madison, WI 53706-1390, USA
9.5

Aiichiro Nakano
Collaboratory for Advanced Computing and Simulations, Department of Computer Science, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
2.25

A. Needleman
Brown University, Providence, RI, USA
3.4

Abraham Nitzan
Tel Aviv University, Tel Aviv, 69978, Israel
5.7

Kai Nordlund
Accelerator Laboratory, P.O. Box 43 (Pietari Kalmin k. 2), 00014, University of Helsinki, Finland; Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Illinois, USA
6.2

G. Robert Odette
Department of Mechanical Engineering and Department of Materials, University of California, Santa Barbara, CA, USA
2.29

Shigenobu Ogata
Osaka University, Osaka, Japan
1.20
Gregory B. Olson
Department of Materials Science and Engineering, Northwestern University, Evanston, IL, USA
P3

Pablo Ordejón
Instituto de Materiales, CSIC, Barcelona, Spain
1.5

Tadeusz Pakula
Max Planck Institute for Polymer Research, Mainz, Germany and Department of Molecular Physics, Technical University, Lodz, Poland
P35

Vijay Pande
Department of Chemistry and of Structural Biology, Stanford University, Stanford, CA 94305-5080, USA
5.17

I.R. Pankratov
Russian Research Centre, “Kurchatov Institute”, Moscow 123182, Russia
7.10

D.A. Papaconstantopoulos
Center for Computational Materials Science, Naval Research Laboratory, Washington, DC, USA
1.14

J.E. Pask
Lawrence Livermore National Laboratory, Livermore, CA, USA
1.19

Anthony T. Patera
Massachusetts Institute of Technology, Cambridge, MA, USA
4.15

Mike Payne
Cavendish Laboratory, University of Cambridge, UK
P16

Leonid Pechenik
University of California, Santa Barbara, CA, USA
4.3

Joaquim Peiró
Department of Aeronautics, Imperial College, London, UK
8.2

Simon R. Phillpot
Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611, USA
2.6, 6.11

G.P. Potirniche
Mississippi State University, Mississippi State, MS, USA
3.5

Thomas R. Powers
Division of Engineering, Brown University, Providence, RI, USA
9.8

Dierk Raabe
Max-Planck-Institut für Eisenforschung, Max-Planck-Str. 1, D-40237 Düsseldorf, Germany
7.7, P6

Ravi Radhakrishnan
Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
5.5
Christian Ratsch
University of California at Los Angeles, Los Angeles, CA, USA
7.15

John R. Ray
1190 Old Seneca Road, Central, SC 29630, USA
2.16

William P. Reinhardt
University of Washington, Seattle, Washington, USA
2.15

Karsten Reuter
Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, D-14195 Berlin, Germany
1.9

J.M. Rickman
Department of Materials Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
7.14, 7.19

Angel Rubio
Departamento Física de Materiales and Unidad de Física de Materiales Centro Mixto CSIC-UPV, Universidad del País Vasco and Donostia International Physics Center (DIPC), Spain
1.11

Robert E. Rudd
Lawrence Livermore National Laboratory, University of California, L-045 Livermore, CA 94551, USA
2.12

Gregory C. Rutledge
Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
9.1

Tomonori Sakai
Centre for Computational Science, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
8.5

Daniel Sánchez-Portal
Donostia International Physics Center, Donostia, Spain
1.5

Joachim Sauer
Institut für Chemie, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany
1.12

Avadh Saxena
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
7.5

Matthias Scheffler
Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, D-14195 Berlin, Germany
1.9

Klaus Schulten
Theoretical and Computational Biophysics Group, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
5.15

Steven D. Schwartz
Departments of Biophysics and Biochemistry, Albert Einstein College of Medicine, New York, USA
5.8

Robin L.B. Selinger
Physics Department, Catholic University, Washington, DC 20064, USA
2.23
Marcelo Sepliarsky
Instituto de Física Rosario, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, 27 de Febrero 210 Bis, (2000) Rosario, Argentina
2.6

Alessandro Sergi
Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Ont. M5S 3H6, Canada
2.17, 5.4

J.A. Sethian
Department of Mathematics, University of California, Berkeley, CA, USA
4.6

Michael J. Shelley
Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
4.7

C. Shen
The Ohio State University, Columbus, Ohio, USA
7.4

Spencer Sherwin
Department of Aeronautics, Imperial College, London, UK
8.2

Marek Sierka
Institut für Physikalische Chemie, Lehrstuhl für Theoretische Chemie, Universität Karlsruhe, Kaiserstraße 12, D-76128 Karlsruhe, Germany
1.12

Asimina Sierou
University of Cambridge, Cambridge, UK
9.6

Grant D. Smith
Department of Materials Science and Engineering, Department of Chemical Engineering, University of Utah, Salt Lake City, Utah, USA
9.2

Frédéric Soisson
CEA Saclay, DMN-SRMP, 91191 Gif-sur-Yvette, France
7.9

José M. Soler
Universidad Autónoma de Madrid, Madrid, Spain
1.5

Didier Sornette
Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, California, USA and CNRS and Université des Sciences, Nice, France
4.4

David J. Srolovitz
Princeton Materials Institute and Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ 08544, USA
7.1, 7.13

Marcelo G. Stachiotti
Instituto de Física Rosario, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, 27 de Febrero 210 Bis, (2000) Rosario, Argentina
2.6

Catherine Stampfl
Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, D-14195 Berlin, Germany; School of Physics, The University of Sydney, Sydney 2006, Australia
1.9
H. Eugene Stanley
Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215, USA
P36
P.A. Sterne
Lawrence Livermore National Laboratory, Livermore, CA, USA
1.19
Howard A. Stone
Division of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
4.8
Marshall Stoneham
Centre for Materials Research, and London Centre for Nanotechnology, Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT, UK
P12
Sauro Succi
Istituto Applicazioni Calcolo, National Research Council, viale del Policlinico, 137, 00161, Rome, Italy
8.4
E.B. Tadmor
Technion-Israel Institute of Technology, Haifa, Israel
2.13
Emad Tajkhorshid
Theoretical and Computational Biophysics Group, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
5.15
Meijie Tang
Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94550
2.22
Mounir Tarek
Equipe de dynamique des assemblages membranaires, Unité mixte de recherche CNRS/UHP 7565, Institut nancéien de chimie moléculaire, Université Henri Poincaré, BP 239, 54506 Vandœuvre-lès-Nancy cedex, France
2.26
DeCarlos E. Taylor
Quantum Theory Project, Departments of Chemistry and Physics, University of Florida, Gainesville, FL 32611, USA
1.3
Doros N. Theodorou
School of Chemical Engineering, National Technical University of Athens, 9 Heroon Polytechniou Street, Zografou Campus, 157 80 Athens, Greece
P15
Carl V. Thompson
Department of Materials Science and Engineering, M.I.T., Cambridge, MA 02139, USA
P26
Anna-Karin Tornberg
Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
4.7
S. Torquato
Department of Chemistry, PRISM, and Program in Applied & Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
4.5, 7.18
Bernhardt L. Trout
Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
5.5
Mark E. Tuckerman
Department of Chemistry, Courant Institute of Mathematical Sciences, New York University, New York, NY 10003, USA
2.9
Blas P. Uberuaga
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
2.11, 5.6
Patrick T. Underhill
Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
9.7
V.G. Vaks
Russian Research Centre, “Kurchatov Institute”, Moscow 123182, Russia
7.10
Priya Vashishta
Collaboratory for Advanced Computing and Simulations, Department of Chemical Engineering and Materials Science, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
2.25
A. Van der Ven
Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
1.17
Karen Veroy
Massachusetts Institute of Technology, Cambridge, MA, USA
4.15
Alessandro De Vita
King’s College London, UK, Center for Nanostructured Materials (CENMAT) and DEMOCRITOS National Simulation Center, Trieste, Italy
P16
V. Vitek
Department of Materials Science and Engineering, University of Pennsylvania, Philadelphia, PA 19104, USA
P32
Dionisios G. Vlachos
Department of Chemical Engineering, Center for Catalytic Science and Technology, University of Delaware, Newark, DE 19716, USA
4.12
Arthur F. Voter
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
2.11, 5.6
Gregory A. Voth
Department of Chemistry and Henry Eyring Center for Theoretical Chemistry, University of Utah, Salt Lake City, Utah 84112-0850, USA
5.9
G.Z. Voyiadjis
Louisiana State University, Baton Rouge, LA, USA
3.8
Dimitri D. Vvedensky
Imperial College, London, United Kingdom
7.16
Göran Wahnström
Chalmers University of Technology and Göteborg University, Materials and Surface Theory, SE-412 96 Göteborg, Sweden
5.14
Duane C. Wallace
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
P1
Axel van de Walle
Northwestern University, Evanston, IL, USA
1.16
Chris G. Van de Walle
Materials Department, University of California, Santa Barbara, California, USA
6.3
C.Z. Wang
Ames Laboratory-U.S. DOE and Department of Physics and Astronomy, Iowa State University, Ames, IA 50011, USA
1.15
Y. Wang
The Ohio State University, Columbus, Ohio, USA
7.4
Yu U. Wang
Department of Materials Science and Engineering, Virginia Tech., Blacksburg, VA 24061, USA
7.12
Hettithanthrige S. Wijesinghe
Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
8.8
Brian D. Wirth
Department of Nuclear Engineering, University of California, Berkeley, CA, USA
2.29
Dieter Wolf
Materials Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
6.7, 6.9, 6.10, 6.11, 6.12, 6.13
Chung H. Woo
The Hong Kong Polytechnic University, Hong Kong SAR, China
2.27
Christopher Woodward
Northwestern University, Evanston, Illinois, USA
P29
S.Y. Wu
Department of Physics, University of Louisville, Louisville, KY 40292, USA
P39
Yang Xiang
Department of Mathematics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
7.13
Sidney Yip
Department of Physics, Harvard University, Cambridge, MA 02138, USA
2.1, 2.10, 6.7, 6.8, 6.11
M. Yu
Department of Physics, University of Louisville, Louisville, KY 40292, USA
P39
H.M. Zbib
Washington State University, Pullman, WA, USA
3.3
Fangqiang Zhu
Theoretical and Computational Biophysics Group, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
5.15
M. Zikry
North Carolina State University, Raleigh, NC, USA
3.7
DETAILED TABLE OF CONTENTS
PART A – METHODS
Chapter 1. Electronic Scale
1.1 Understand, Predict, and Design 9
Nicola Marzari
1.2 Concepts for Modeling Electrons in Solids: A Perspective 13
Marvin L. Cohen
1.3 Achieving Predictive Simulations with Quantum Mechanical Forces Via the Transfer Hamiltonian: Problems and Prospects 27
Rodney J. Bartlett, DeCarlos E. Taylor, and Anatoli Korkin
1.4 First-Principles Molecular Dynamics 59
Roberto Car, Filippo de Angelis, Paolo Giannozzi, and Nicola Marzari
1.5 Electronic Structure Calculations with Localized Orbitals: The Siesta Method 77
Emilio Artacho, Julian D. Gale, Alberto García, Javier Junquera, Richard M. Martin, Pablo Ordejón, Daniel Sánchez-Portal, and José M. Soler
1.6 Electronic Structure Methods: Augmented Waves, Pseudopotentials and the Projector Augmented Wave Method 93
Peter E. Blöchl, Johannes Kästner, and Clemens J. Först
1.7 Electronic Scale 121
James R. Chelikowsky
1.8 An Introduction to Orbital-Free Density Functional Theory 137
Vincent L. Lignères and Emily A. Carter
1.9 Ab Initio Atomistic Thermodynamics and Statistical Mechanics of Surface Properties and Functions 149
Karsten Reuter, Catherine Stampfl, and Matthias Scheffler
1.10 Density-Functional Perturbation Theory 195
Paolo Giannozzi and Stefano Baroni
1.11 Quasiparticle and Optical Properties of Solids and Nanostructures: The GW-BSE Approach 215
Steven G. Louie and Angel Rubio
1.12 Hybrid Quantum Mechanics/Molecular Mechanics Methods and their Application 241
Marek Sierka and Joachim Sauer
1.13 Ab Initio Molecular Dynamics Simulations of Biologically Relevant Systems 259
Alessandra Magistrato and Paolo Carloni
1.14 Tight-Binding Total Energy Methods for Magnetic Materials and Multi-Element Systems 275
Michael J. Mehl and D.A. Papaconstantopoulos
1.15 Environment-Dependent Tight-Binding Potential Models 307
C.Z. Wang and K.M. Ho
1.16 First-Principles Modeling of Phase Equilibria 349
Axel van de Walle and Mark Asta
1.17 Diffusion and Configurational Disorder in Multicomponent Solids 367
A. Van der Ven and G. Ceder
1.18 Data Mining in Materials Development 395
Dane Morgan and Gerbrand Ceder
1.19 Finite Elements in Ab Initio Electronic-Structure Calculations 423
J.E. Pask and P.A. Sterne
1.20 Ab Initio Study of Mechanical Deformation 439
Shigenobu Ogata
Chapter 2. Atomistic Scale
2.1 Introduction: Atomistic Nature of Materials 451
Efthimios Kaxiras and Sidney Yip
2.2 Interatomic Potentials for Metals 459
Y. Mishin
2.3 Interatomic Potential Models for Ionic Materials 479
Julian D. Gale
2.4 Modeling Covalent Bond with Interatomic Potentials 499
João F. Justo
2.5 Interatomic Potentials: Molecules 509
Alexander D. MacKerell, Jr.
2.6 Interatomic Potentials: Ferroelectrics 527
Marcelo Sepliarsky, Marcelo G. Stachiotti, and Simon R. Phillpot
2.7 Energy Minimization Techniques in Materials Modeling 547
C.R.A. Catlow
2.8 Basic Molecular Dynamics 565
Ju Li
2.9 Generating Equilibrium Ensembles Via Molecular Dynamics 589
Mark E. Tuckerman
2.10 Basic Monte Carlo Models: Equilibrium and Kinetics 613
George Gilmer and Sidney Yip
2.11 Accelerated Molecular Dynamics Methods 629
Blas P. Uberuaga, Francesco Montalenti, Timothy C. Germann, and Arthur F. Voter
2.12 Concurrent Multiscale Simulation at Finite Temperature: Coarse-Grained Molecular Dynamics 649
Robert E. Rudd
2.13 The Theory and Implementation of the Quasicontinuum Method 663
E.B. Tadmor and R.E. Miller
2.14 Perspective: Free Energies and Phase Equilibria 683
David A. Kofke and Daan Frenkel
2.15 Free-Energy Calculation Using Nonequilibrium Simulations 707
Maurice de Koning and William P. Reinhardt
2.16 Ensembles and Computer Simulation Calculation of Response Functions 729
John R. Ray
2.17 Non-Equilibrium Molecular Dynamics 745
Giovanni Ciccotti, Raymond Kapral, and Alessandro Sergi
2.18 Thermal Transport Process by the Molecular Dynamics Method 763
Hideo Kaburaki
2.19 Atomistic Calculation of Mechanical Behavior 773
Ju Li
2.20 The Peierls–Nabarro Model of Dislocations: A Venerable Theory and its Current Development 793
Gang Lu
2.21 Modeling Dislocations Using a Periodic Cell 813
Wei Cai
2.22 A Lattice Based Screw-Edge Dislocation Dynamics Simulation of Body Center Cubic Single Crystals 827
Meijie Tang
2.23 Atomistics of Fracture 839
Diana Farkas and Robin L.B. Selinger
2.24 Atomistic Simulations of Fracture in Semiconductors 855
Noam Bernstein
2.25 Multimillion Atom Molecular-Dynamics Simulations of Nanostructured Materials and Processes on Parallel Computers 875
Priya Vashishta, Rajiv K. Kalia, and Aiichiro Nakano
2.26 Modeling Lipid Membranes 929
Christophe Chipot, Michael L. Klein, and Mounir Tarek
2.27 Modeling Irradiation Damage Accumulation in Crystals 959
Chung H. Woo
2.28 Cascade Modeling 987
Jean-Paul Crocombette
2.29 Radiation Effects in Fission and Fusion Reactors 999
G. Robert Odette and Brian D. Wirth
2.30 Texture Evolution During Thin Film Deposition 1039
Hanchen Huang
2.31 Atomistic Visualization 1051
Ju Li
Chapter 3. Mesoscale/Continuum Methods
3.1 Mesoscale/Macroscale Computational Methods 1071
M.F. Horstemeyer
3.2 Perspective on Continuum Modeling of Mesoscale/Macroscale Phenomena 1077
D.J. Bammann
3.3 Dislocation Dynamics 1097
H.M. Zbib and T.A. Khraishi
3.4 Discrete Dislocation Plasticity 1115
E. Van der Giessen and A. Needleman
3.5 Crystal Plasticity 1133
M.F. Horstemeyer, G.P. Potirniche, and E.B. Marin
3.6 Internal State Variable Theory 1151
D.L. McDowell
3.7 Ductile Fracture 1171
M. Zikry
3.8 Continuum Damage Mechanics 1183
G.Z. Voyiadjis
3.9 Microstructure-Sensitive Computational Fatigue Analysis 1193
D.L. McDowell
Chapter 4. Mathematical Methods
4.1 Overview of Chapter 4: Mathematical Methods 1217
Martin Z. Bazant and Dimitrios Maroudas
4.2 Elastic Stability Criteria and Structural Bifurcations in Crystals Under Load 1223
Frederick Milstein
4.3 Toward a Shear-Transformation-Zone Theory of Amorphous Plasticity 1281
Michael L. Falk, James S. Langer, and Leonid Pechenik
4.4 Statistical Physics of Rupture in Heterogeneous Media 1313
Didier Sornette
4.5 Theory of Random Heterogeneous Materials 1333
S. Torquato
4.6 Modern Interface Methods for Semiconductor Process Simulation 1359
J.A. Sethian
4.7 Computing Microstructural Dynamics for Complex Fluids 1371
Michael J. Shelley and Anna-Karin Tornberg
4.8 Continuum Descriptions of Crystal Surface Evolution 1389
Howard A. Stone and Dionisios Margetis
4.9 Breakup and Coalescence of Free Surface Flows 1403
Jens Eggers
4.10 Conformal Mapping Methods for Interfacial Dynamics 1417
Martin Z. Bazant and Darren Crowdy
4.11 Equation-Free Modeling for Complex Systems 1453
Ioannis G. Kevrekidis, C. William Gear, and Gerhard Hummer
4.12 Mathematical Strategies for the Coarse-Graining of Microscopic Models 1477
Markos A. Katsoulakis and Dionisios G. Vlachos
4.13 Multiscale Modeling of Crystalline Solids 1491
Weinan E and Xiantao Li
4.14 Multiscale Computation of Fluid Flow in Heterogeneous Media 1507
Thomas Y. Hou
4.15 Certified Real-Time Solution of Parametrized Partial Differential Equations 1529
Nguyen Ngoc Cuong, Karen Veroy, and Anthony T. Patera
PART B – MODELS
Chapter 5. Rate Processes
5.1 Introduction: Rate Processes 1567
Horia Metiu
5.2 A Modern Perspective on Transition State Theory 1573
J.D. Doll
5.3 Transition Path Sampling 1585
Christoph Dellago
5.4 Simulating Reactions that Occur Once in a Blue Moon 1597
Giovanni Ciccotti, Raymond Kapral, and Alessandro Sergi
5.5 Order Parameter Approach to Understanding and Quantifying the Physico-Chemical Behavior of Complex Systems 1613
Ravi Radhakrishnan and Bernhardt L. Trout
5.6 Determining Reaction Mechanisms 1627
Blas P. Uberuaga and Arthur F. Voter
5.7 Stochastic Theory of Rate Processes 1635
Abraham Nitzan
5.8 Approximate Quantum Mechanical Methods for Rate Computation in Complex Systems 1673
Steven D. Schwartz
5.9 Quantum Rate Theory: A Path Integral Centroid Perspective 1691
Eitan Geva, Seogjoo Jang, and Gregory A. Voth
5.10 Quantum Theory of Reactive Scattering and Adsorption at Surfaces 1713
Axel Groß
5.11 Stochastic Chemical Kinetics 1735
Daniel T. Gillespie
5.12 Kinetic Monte Carlo Simulation of Non-Equilibrium Lattice-Gas Models: Basic and Refined Algorithms Applied to Surface Adsorption Processes 1753
J.W. Evans
5.13 Simple Models for Nanocrystal Growth 1769
Pablo Jensen
5.14 Diffusion in Solids 1787
Göran Wahnström
5.15 Kinetic Theory and Simulation of Single-Channel Water Transport 1797
Emad Tajkhorshid, Fangqiang Zhu, and Klaus Schulten
5.16 Simplified Models of Protein Folding 1823
Hue Sun Chan
5.17 Protein Folding: Detailed Models 1837
Vijay Pande
Chapter 6. Crystal Defects
6.1 Point Defects 1851
C.R.A. Catlow
6.2 Point Defects in Metals 1855
Kai Nordlund and Robert Averback
6.3 Defects and Impurities in Semiconductors 1877
Chris G. Van de Walle
6.4 Point Defects in Simple Ionic Solids 1889
John Corish
6.5 Fast Ion Conductors 1901
Alan V. Chadwick
6.6 Defects and Ion Migration in Complex Oxides 1915
M. Saiful Islam
6.7 Introduction: Modeling Crystal Interfaces 1925
Sidney Yip and Dieter Wolf
6.8 Atomistic Methods for Structure–Property Correlations 1931
Sidney Yip
6.9 Structure and Energy of Grain Boundaries 1953
Dieter Wolf
6.10 High-Temperature Structure and Properties of Grain Boundaries 1985
Dieter Wolf
6.11 Crystal Disordering in Melting and Amorphization 2009
Sidney Yip, Simon R. Phillpot, and Dieter Wolf
6.12 Elastic Behavior of Interfaces 2025
Dieter Wolf
6.13 Grain Boundaries in Nanocrystalline Materials 2055
Dieter Wolf
Chapter 7. Microstructure
7.1 Introduction: Microstructure 2083
David J. Srolovitz and Long-Qing Chen
7.2 Phase-Field Modeling 2087
Alain Karma
7.3 Phase-Field Modeling of Solidification 2105
Seong Gyoon Kim and Won Tae Kim
7.4 Coherent Precipitation – Phase Field Method 2117
C. Shen and Y. Wang
7.5 Ferroic Domain Structures using Ginzburg–Landau Methods 2143
Avadh Saxena and Turab Lookman
7.6 Phase-Field Modeling of Grain Growth 2157
Carl E. Krill III
7.7 Recrystallization Simulation by Use of Cellular Automata 2173
Dierk Raabe
7.8 Modeling Coarsening Dynamics using Interface Tracking Methods 2205
John Lowengrub
7.9 Kinetic Monte Carlo Method to Model Diffusion Controlled Phase Transformations in the Solid State 2223
Georges Martin and Frédéric Soisson
7.10 Diffusional Transformations: Microscopic Kinetic Approach 2249
I.R. Pankratov and V.G. Vaks
7.11 Modeling the Dynamics of Dislocation Ensembles 2269
Nasr M. Ghoniem
7.12 Dislocation Dynamics – Phase Field 2287
Yu U. Wang, Yongmei M. Jin, and Armen G. Khachaturyan
7.13 Level Set Dislocation Dynamics Method 2307
Yang Xiang and David J. Srolovitz
7.14 Coarse-Graining Methodologies for Dislocation Energetics and Dynamics 2325
J.M. Rickman and R. LeSar
7.15 Level Set Methods for Simulation of Thin Film Growth 2337
Russel Caflisch and Christian Ratsch
7.16 Stochastic Equations for Thin Film Morphology 2351
Dimitri D. Vvedensky
7.17 Monte Carlo Methods for Simulating Thin Film Deposition 2363
Corbett Battaile
7.18 Microstructure Optimization 2379
S. Torquato
7.19 Microstructural Characterization Associated with Solid–Solid Transformations 2397
J.M. Rickman and K. Barmak
Chapter 8. Fluids
8.1 Mesoscale Models of Fluid Dynamics 2411
Bruce M. Boghosian and Nicolas G. Hadjiconstantinou
8.2 Finite Difference, Finite Element and Finite Volume Methods for Partial Differential Equations 2415
Joaquim Peiró and Spencer Sherwin
8.3 Meshless Methods for Numerical Solution of Partial Differential Equations 2447
Gang Li, Xiaozhong Jin, and N.R. Aluru
8.4 Lattice Boltzmann Methods for Multiscale Fluid Problems 2475
Sauro Succi, Weinan E, and Efthimios Kaxiras
8.5 Discrete Simulation Automata: Mesoscopic Fluid Models Endowed with Thermal Fluctuations 2487
Tomonori Sakai and Peter V. Coveney
8.6 Dissipative Particle Dynamics 2503
Pep Español
8.7 The Direct Simulation Monte Carlo Method: Going Beyond Continuum Hydrodynamics 2513
Francis J. Alexander
8.8 Hybrid Atomistic–Continuum Formulations for Multiscale Hydrodynamics 2523
Hettithanthrige S. Wijesinghe and Nicolas G. Hadjiconstantinou
Chapter 9. Polymers and Soft Matter
9.1 Polymers and Soft Matter 2555
L. Mahadevan and Gregory C. Rutledge
9.2 Atomistic Potentials for Polymers and Organic Materials 2561
Grant D. Smith
9.3 Rotational Isomeric State Methods 2575
Wayne L. Mattice
9.4 Monte Carlo Simulation of Chain Molecules 2583
V.G. Mavrantzas
9.5 The Bond Fluctuation Model and Other Lattice Models 2599
Marcus Müller
9.6 Stokesian Dynamics Simulations for Particle Laden Flows 2607
Asimina Sierou
9.7 Brownian Dynamics Simulations of Polymers and Soft Matter 2619
Patrick S. Doyle and Patrick T. Underhill
9.8 Mechanics of Lipid Bilayer Membranes 2631
Thomas R. Powers
9.9 Field-Theoretic Simulations 2645
Venkat Ganesan and Glenn H. Fredrickson
Plenary Perspectives
P1 Progress in Unifying Condensed Matter Theory 2659
Duane C. Wallace
P2 The Future of Simulations in Materials Science 2663
D.P. Landau
P3 Materials by Design 2667
Gregory B. Olson
P4 Modeling at the Speed of Light 2671
J.D. Joannopoulos
P5 Modeling Soft Matter 2675
Kurt Kremer
P6 Drowning in Data – A Viewpoint on Strategies for Doing Science with Simulations 2687
Dierk Raabe
P7 Dangers of “Common Knowledge” in Materials Simulations 2695
Vasily V. Bulatov
P8 Quantum Simulations as a Tool for Predictive Nanoscience 2701
Giulia Galli and François Gygi
P9 A Perspective of Materials Modeling 2707
William A. Goddard III
P10 An Application Oriented View on Materials Modeling 2713
Peter Gumbsch
P11 The Role of Theory and Modeling in the Development of Materials for Fusion Energy 2719
Nasr M. Ghoniem
P12 Where are the Gaps? 2731
Marshall Stoneham
P13 Bridging the Gap between Quantum Mechanics and Large-Scale Atomistic Simulation 2737
John A. Moriarty
P14 Bridging the Gap between Atomistics and Structural Engineering 2749
J.S. Langer
P15 Multiscale Modeling of Polymers 2757
Doros N. Theodorou
P16 Hybrid Atomistic Modelling of Materials Processes 2763
Mike Payne, Gábor Csányi, and Alessandro De Vita
P17 The Fluctuation Theorem and its Implications for Materials Processing and Modeling 2773
Denis J. Evans
P18 The Limits of Strength 2777
J.W. Morris, Jr.
P19 Simulations of Interfaces between Coexisting Phases: What Do They Tell us? 2787
Kurt Binder
P20 How Fast Can Cracks Move? 2793
Farid F. Abraham
P21 Lattice Gas Automaton Methods 2805
Jean Pierre Boon
P22 Multi-Scale Modeling of Hypersonic Gas Flow 2811
Iain D. Boyd
P23 Commentary on Liquid Simulations and Industrial Applications 2819
Raymond D. Mountain
P24 Computer Simulations of Supercooled Liquids and Glasses 2823
Walter Kob
P25 Interplay between Materials Theory and High-Pressure Experiments 2829
Raymond Jeanloz
P26 Perspectives on Experiments, Modeling and Simulations of Grain Growth 2837
Carl V. Thompson
P27 Atomistic Simulation of Ferroelectric Domain Walls 2843
I-Wei Chen
P28 Measurements of Interfacial Curvatures and Characterization of Bicontinuous Morphologies 2849
Sow-Hsin Chen
P29 Plasticity at the Atomic Scale: Parametric, Atomistic, and Electronic Structure Methods 2865
Christopher Woodward
P30 A Perspective on Dislocation Dynamics 2871
Nasr M. Ghoniem
P31 Dislocation-Pressure Interactions 2879
J.P. Hirth
P32 Dislocation Cores and Unconventional Properties of Plastic Behavior 2883
V. Vitek
P33 3-D Mesoscale Plasticity and its Connections to Other Scales 2897
Ladislas P. Kubin
P34 Simulating Fluid and Solid Particles and Continua with SPH and SPAM 2903
Wm.G. Hoover
P35 Modeling of Complex Polymers and Processes 2907
Tadeusz Pakula
P36 Liquid and Glassy Water: Two Materials of Interdisciplinary Interest 2917
H. Eugene Stanley
P37 Material Science of Carbon 2923
Wesley P. Hoffman
P38 Concurrent Lifetime-Design of Emerging High Temperature Materials and Components 2929
Ronald J. Kerans
P39 Towards a Coherent Treatment of the Self-Consistency and the Environment-Dependency in a Semi-Empirical Hamiltonian for Materials Simulation 2935
S.Y. Wu, C.S. Jayanthi, C. Leahy, and M. Yu
INTRODUCTION
Sidney Yip
Department of Nuclear Science and Engineering, Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139 (USA)
The way a scientist looks at the materials world is changing dramatically. Advances in the synthesis of nanostructures and in high-resolution microscopy are allowing us to create and probe assemblies of atoms and molecules at a level that was unimagined only a short time ago – the prospect of manipulating materials for device applications, one atom at a time, is no longer a fantasy. Being able to see and touch the materials up close means that we are more interested than ever in understanding their properties and behavior at the atomic level. Another factor contributing to the present state of affairs is the advent of large-scale computation, once a rare and highly sophisticated resource accessible only to a few privileged scientists. In the past few years materials modeling, in the broad sense of theory and simulation integrated with experiments, has emerged as a field of research with unique capabilities, most notably the ability to analyze and predict a very wide range of physical structures and phenomena. Some would now say the modeling approach is becoming an equal partner to theory and experiment, the traditional methods of scientific inquiry. Certain problems in the fundamental description of matter, previously regarded as intractable, are now amenable to simulation and analysis. The ab initio calculation of solid-state properties using electronic-structure methods and the direct estimation of free energies based on statistical mechanical formulations are just two examples where predictions are being made without input from experiments. Because materials modeling draws from all the disciplines in science and engineering, it greatly benefits from cross-fertilization within a multidisciplinary community.
There is recognition that Computational Materials is just as much a field as Computational Physics or Chemistry; it offers a robust framework for focused scientific studies and exchanges, from the introduction of new university curricula to the formation of centers for collaborative research among academia, corporate and government laboratories. A basic appeal to all members of the growing community is the challenge and opportunity of solving problems that are fundamental in nature and yet have great technological impact, problems spanning the disciplines of physics, chemistry, engineering and biology.
S. Yip (ed.), Handbook of Materials Modeling, 1–5. © 2005 Springer. Printed in the Netherlands.
Multiscale modeling has come to symbolize the emerging field of computational materials research. The idea is to link simulation models and techniques across the micro-to-macro length and time scales, with the goal of analyzing and eventually controlling the outcome of critical materials processes. Invariably these are highly nonlinear, inhomogeneous, or non-equilibrium phenomena in nature. In this paradigm, electronic structure would be treated by quantum mechanical calculations, atomistic processes by molecular dynamics or Monte Carlo simulations, mesoscale microstructure evolution by methods such as finite-element, dislocation dynamics, or kinetic Monte Carlo, and continuum behavior by field equations central to continuum elasticity and computational fluid dynamics. The vision of multiscale modeling is that by combining these different methods, one can deal with complex problems in a much more comprehensive manner than when the methods are used individually [1].
“Modeling is the physicalization of a concept, simulation is its computational realization.”
This is an oversimplified statement. On the other hand, it is a way to articulate the intellectual character of the present volume. This Handbook is certainly about modeling and simulation. Many would agree that conceptually the process of modeling ought to be distinguished from the act of simulation. Yet there seems to be no consensus on how the two terms should be used to show that each plays an essential role in computational research. Here we suggest a brief all-purpose definition (admittedly lacking specificity). By concept we have in mind an idea, an idealization, or a picture of a system (a scenario of a process) which has the connotation of functionality. For example, consider the subway map of Boston. Although it gives no information about the city streets, its purpose is to display the connectivity of the stations – few would dispute that for the given purpose it is a superb physical construct enabling any person to navigate from point A to point B [2]. So it is with our two-part definition; it is first a thoughtfully simplified representation of an object to be studied, a phenomenon, or a process (modeling), then it is the means with which to investigate the model (simulation). Notice also that when used together, modeling and simulation imply an element of coordination between what is to be studied and how the study is to be conducted.
Length/Time Scales in Materials Modeling
Many physical phenomena have significant manifestations on more than one level of length or time scale. For example, wave propagation and
attenuation in a fluid can be described at the continuum level using the equations of fluid dynamics, while the determination of shear viscosity and thermal conductivity is best treated at the level of molecular dynamics. While each level has its own set of relevant phenomena, an even more powerful description would result if the microscopic treatment of transport could be integrated into the calculation of macroscopic flows.
Generally speaking, one can identify four distinct length (and corresponding time) scales where materials phenomena are typically studied. As illustrated in Fig. 1, the four regions may be referred to as electronic structure, atomistic, microstructure, and continuum. Imagine a piece of material, say a crystalline solid. The smallest length scale of interest is about a few angstroms (1 angstrom = 10⁻⁸ cm). On this scale one deals directly with the electrons in the system, which are governed by the Schrödinger equation of quantum mechanics. The techniques that have been developed for solving this equation are extremely computationally intensive; as a result, they can be applied only to small simulation systems, at present no more than about 300 atoms. On the other hand, these calculations are theoretically the most rigorous; they are particularly valuable for developing and validating more approximate but computationally more efficient descriptions.
The scale at the next level, spanning from tens to about a thousand angstroms, is called atomistic. Here discrete particle simulation techniques, molecular dynamics (MD) and Monte Carlo (MC), are well developed, requiring the specification of an empirical classical interatomic potential function with parameters fitted to experimental data and electronic-structure calculations. The most important feature of atomistic simulation is that one can now study a system with a large number of atoms, at present as many as 10⁹. On the other hand, because the electrons are ignored, atomistic simulations are not as reliable as ab initio calculations.
Figure 1. Length scales in materials modeling showing that many applications in our physical world take place on the micron scale and higher, while our basic understanding and predictive ability lie at the microscopic levels.
Above the atomistic level the relevant length scale is a micron (10⁴ angstroms). Whether this level should be called microscale or mesoscale is a matter for which convention has not been clearly established. The simulation technique commonly in use is finite-element calculations (FEM). Because many useful properties of materials are governed by the microstructure in the system, this is perhaps the most critical level for materials design. However, the information required to carry out such calculations, for example, the stiffness matrix, or any material-specific physical parameters, has to be provided from either experiment or calculations at the atomistic or ab initio level. To a large extent, the same can be said for the continuum-level methods, such as computational fluid dynamics (CFD) and continuum elasticity (CE). The parameters needed to perform these calculations have to be supplied externally.
There are definite benefits when simulation techniques at different scales can be linked. Continuum or finite-element methods are often most practical for design calculations. They require parameters or properties which cannot be generated within the methods themselves. Also, they cannot provide the atomic-level insights needed for design. For these reasons continuum and finite-element calculations should be coupled to atomistic and ab initio methods. It is only when methods at different scales are effectively integrated that one can expect materials modeling to give fundamental insight as well as reliable predictions across the scales.
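The four length scales described above, together with the simulation methods associated with each, can be summarized in a small lookup sketch. The Python fragment below is only an illustration: the numerical cutoffs between scales are assumptions chosen to be roughly consistent with the ranges quoted in the text (a few angstroms, tens to about a thousand angstroms, about a micron), and the names `Scale`, `SCALES`, and `classify_length` are hypothetical. Only the two unit conversions are taken directly from the text.

```python
from dataclasses import dataclass

# Unit conversions quoted in the text.
ANGSTROM_IN_CM = 1e-8       # 1 angstrom = 10^-8 cm
MICRON_IN_ANGSTROM = 1e4    # 1 micron = 10^4 angstroms


@dataclass(frozen=True)
class Scale:
    name: str
    upper_bound_angstrom: float  # ASSUMED cutoff, for illustration only
    methods: tuple               # methods named in the text for this scale


# Ordered from smallest to largest length scale. The cutoffs are rough,
# assumed values; the text gives ranges, not sharp boundaries.
SCALES = (
    Scale("electronic structure", 10.0,
          ("quantum mechanical (ab initio) calculations",)),
    Scale("atomistic", 1.0e3,
          ("molecular dynamics (MD)", "Monte Carlo (MC)")),
    Scale("microstructure", 1.0e5,
          ("finite-element calculations (FEM)",)),
    Scale("continuum", float("inf"),
          ("computational fluid dynamics (CFD)", "continuum elasticity (CE)")),
)


def classify_length(length_angstrom: float) -> Scale:
    """Return the first scale whose assumed upper cutoff covers the length."""
    if length_angstrom <= 0:
        raise ValueError("length must be positive")
    for scale in SCALES:
        if length_angstrom <= scale.upper_bound_angstrom:
            return scale
    return SCALES[-1]
```

For instance, under these assumed cutoffs `classify_length(500.0).name` evaluates to `"atomistic"`, and a length of one micron (`MICRON_IN_ANGSTROM`) falls in the microstructure range. A real multiscale workflow would not use sharp boundaries like these; the overlap between neighboring scales is precisely where methods have to be coupled.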
The efficient bridging of the scales in Fig. 1 is a significant challenge in the further development of multiscale modeling. The classification of materials modeling and simulation in terms of length and time scales is but one way of approaching the subject. The point of Fig. 1 is to emphasize the theoretical and computational methods that have been developed to describe the properties and behavior of physical systems, but it does not address other equally important issues, those of applications. One might imagine discussing materials modeling through a matrix of methods and applications which could be useful for displaying their connection and particular suitability. This would be quite difficult to carry out at present because there are not enough clear-cut case studies in the literature to make the construction of such a matrix meaningful. From the standpoint of knowing what methods are best suited for certain problems, materials modeling is a field still in its infancy.
An Overview of the Handbook
The Handbook is laid out in 9 chapters, dealing with modeling and simulation methods (Part A) and models for specific areas of study (Part B). In Part A the first three chapters describe modeling concepts and simulation techniques at the electronic (Chapter 1), atomistic (Chapter 2), and mesoscale (Chapter 3) levels, in the spirit of Fig. 1. In contrast, Chapter 4 describes a variety of methods based on mathematical analysis. The chapters in Part B focus on systems in which basic studies have been carried out. Chapter 5 treats rate processes where time-scale problems are just as important and challenging as length-scale problems. The next four chapters cover a range of physical structures: crystal defects (Chapter 6) and microstructure (Chapter 7) in solids, various models and methods for fluid simulation (Chapter 8), and models of polymer and soft matter (Chapter 9). In each chapter there are other significant topics which have not been included; for these we recommend that readers consult the references given in each article. Each chapter begins with an introduction which serves to connect the individual articles in the chapter with the broad themes that are relevant to our growing community. While no single chapter attempts to be inclusive in treating the many important aspects of materials modeling, even with restrictions to fundamental methods and models, the entire Handbook is, we hope, a first step in that direction.
The Handbook also has a special section which we call Plenary Perspectives. This is a collection of commentaries by recognized authorities in materials modeling or related fields. Each author was invited to write briefly on a topic that would give the readers, especially the students, insight into different issues in materials modeling. Together with the 9 chapters these perspectives are meant to inform the future workers coming into this exciting field.
References
[1] S. Yip, “Synergistic science,” Nature Mater., 3, 1–3, 2003.
[2] M. Ashby, “Modelling of materials problems,” J. Comput.-Aided Mater. Des., 3, 95–99, 1996.
1.1 UNDERSTAND, PREDICT, AND DESIGN
Nicola Marzari
Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
S. Yip (ed.), Handbook of Materials Modeling, 9–11. © 2005 Springer. Printed in the Netherlands.
Electronic-structure approaches are dramatically changing the way much theoretical and computational research is done. This success derives from the ability to characterize from first principles many material properties with an accuracy that complements or even augments experimental observations. This accuracy can extend beyond the properties for which a real-life experiment is either feasible or just cost-effective, and it is based on our ability to compute and understand the quantum-mechanical behavior of interacting electrons and nuclei. Density-functional theory, for which the Nobel prize in chemistry was awarded in 1998, has been instrumental to this success, together with the availability of computers that are now routinely able to deal with the complexity of realistic problems. The extent of this revolution should not be underestimated, notwithstanding the many algorithmic and theoretical bottlenecks that await resolution, and the existence of hard problems rarely amenable to direct simulations.
Since ab initio methods combine fundamental predictive power with atomic resolution, they provide a quantitatively accurate first step in the study and characterization of new materials, and the ability to describe with unprecedented control molecular architectures exactly at those scales (hundreds to thousands of atoms) where some of the most promising and undiscovered properties are to be engineered. In the current effort to control and design the properties of novel molecules, materials, and devices, first-principles approaches thus constitute a unique and very powerful instrument. Complementary strategies emerge:
• Insight: First-principles simulations provide a unique connection between microscopic and macroscopic properties. When partnered with experimental tools – from spectroscopies to microscopies – they can deliver unique insight into, and understanding of, the detailed arrangements of atoms and molecules and their relation to the observed phenomena. Gedanken computational experiments can be used to prove or probe cause-effect relationships in ways that are different, and novel, compared with our established approaches.
• Control: Microscopic simulations provide an unprecedented degree of control over the systems studied. While macroscopic behavior often emerges from complexity – thus explaining all the ongoing efforts in overcoming the time- and length-scale limitations – fundamental understanding needs to be built from the bottom up, under the carefully controlled conditions of a computational experiment. Simulations can offer early and accurate insights into complex materials that are challenging to control or characterize.
• Design: Quantitatively accurate predictions of materials’ properties provide us with an unprecedented freedom, a “magic wand” that can be used with ingenuity to try and engineer novel material properties. Intuitions can often be rapidly validated, shifting and focusing appropriately the synthetic challenge to the later stages, once a promising class of materials has been identified.
• Optimization: Finally, the systematic exploration of material properties inside or across different classes of materials can highlight the potential for absolute or differential improvements. Stochastic techniques such as data mining and optimization then identify the most promising candidates, narrowing down the field of structures to be targeted in real-life testing.
While the extent and scope of this emerging discipline are nothing short of revolutionary, researchers in the field face key challenges that are worth remembering: achieving thermodynamical accuracy, bridging length scales, and overcoming time-scale limitations. It is unlikely that an overarching solution to these problems will appear, and much of the art of modeling goes into solving these challenges for the problem at hand.
It is nevertheless important to remark on the role of correlations: whenever the typical correlation lengths become smaller than the size of the simulation box (e.g., for a liquid studied in periodic-boundary conditions), the system studied becomes virtually infinite, and the finite-size bias irrelevant. The articles presented in this volume offer a glimpse of the panorama of electronic-structure modeling; in such distinguished company, it would be inappropriate for me to condense such diverse and exciting contributions into a few sentences. I will leave the science to the authors, and conclude with a few statements on future developments. The continuous improvement in the price vs. performance ratio for commodity CPUs is now widely apparent. Whereas computational resources never seem enough, and the desire for a longer and bigger simulation is always looming, we are now in the position where even a single desktop is sufficient to
sustain research of world-class quality (of course, human resources are even more precious, and human ingenuity can sometimes be light-heartedly traded for sheer computational power). This availability of computer power is now combined with the availability of state-of-the-art computer packages – some of them freely distributed and developed under a shared-community, public-license model akin to that, e.g., of Linux. The net result has been that “computational laboratories” around the world have been increasing in capability with a speed comparable to Moore’s law, their hardware and software infrastructures replicated almost at the flick of a switch. Some conclusions can be attempted:
• The geographic distribution of researchers in this field might change significantly. World-class science can now be done inexpensively and extensively, and know-how and human resources become almost exclusively the most precious commodities.
• Publicly available electronic-structure packages take the role of internationally shared infrastructures, in perfect analogy with the way brick-and-mortar facilities (such as synchrotrons) serve many groups in different countries. It could even be argued that investment in “computational infrastructures” (electronic-structure packages) can have comparable benefits, and a remarkable cost structure.
• While these technologies become faster, more robust, and prettier, they also become more and more complex, often requiring years of training to be mastered – content and expertise could also be developed and freely shared following similar public-license models.
The last point brings us back to one of the greatest challenges, and one for which we hope this Handbook will bring a positive contribution: how to avoid trading content for form, critical thinking for indiscriminate simulations. In T.S. Eliot’s words: “The last temptation is the greatest treason: To do the right deed for the wrong reason.”
1.2 CONCEPTS FOR MODELING ELECTRONS IN SOLIDS: A PERSPECTIVE
Marvin L. Cohen
University of California at Berkeley and Lawrence Berkeley National Laboratory, Berkeley, CA, USA
S. Yip (ed.), Handbook of Materials Modeling, 13–26. © 2005 Springer. Printed in the Netherlands.
1. The Electron’s Central Role
It’s clear that an understanding of the behavior of electrons in solids is essential for explaining and predicting solid state properties. Electrons provide the glue holding solids together, and hence they are central in determining structural, mechanical and vibrational properties. Under the influence of electromagnetic fields, electrical current transport involves electron transport for most solids. Optical properties for many ranges of frequency are dominated by electronic transitions. Understanding superconductivity, magnetism, dielectric properties, ferroelectricity, and most properties of solids requires a detailed knowledge of “electronic structure,” which is the term associated with the study of electronic energy levels, but more broadly a general label for the subfield of condensed matter physics which is focused on the properties of electrons in solids.
In the end, modeling, simulating, calculating, and computing refer to producing equations, numbers or pictures which describe, explain, and predict properties. So this general area has always had a mixed set of goals. Theoretical researchers vary in their emphasis on these goals. For example, some theorists are focused on explaining phenomena with the simplest possible models containing the fundamental physics. A good example is the Bardeen–Cooper–Schrieffer (BCS) [1] theory of superconductivity, which is one of the great achievements of 20th century physics. This theory brought new concepts, but the modeling of the electrons forming Cooper pairs considered electrons in free electron states because calculating normal-state properties for particular solids was not very far along in 1957. As a result, computing transition temperatures for specific solids using BCS theory was, and still is, difficult; and, for some researchers, this was viewed at the time as a defect in the fundamental theory, which it was not.
There are theorists interested in numerical precision. They continually push at the forefront of computer science and applied mathematics to develop consistent approaches that can deal with properties of clusters, molecules, and complex solids with many atoms in a unit cell. Sometimes these researchers have strong overlap with computer scientists and engineers and even get involved in hardware development.
Perhaps the largest and most dominant group of researchers in modeling solids at this time are theorists motivated by particular experimental properties or phenomena. Unlike the researchers interested only in phenomena, they are trying to calculate these properties for “real materials.” For these theorists, it is essential that interactions among electrons and ionic cores not be replaced by a constant (as in the BCS model), and that electrons not be viewed as completely free or as atomic states. They want the appropriate description of the electronic states for the material at hand and a computational approach to calculate measured properties. Successful comparison with experiment is the goal, and it is the degree of accuracy in these comparisons which measures the worth of the calculation rather than numerical precision.
In the papers presented in this volume, the reader will find authors with research goals having varying degrees of “accuracy for explaining and predicting properties” versus “calculational precision” as a primary goal. Irrespective of motivation, an essential component for modeling is the conceptual base, in other words, the way we picture solids on a microscopic or nanoscopic level.
2. Conceptual Base
Under pressure, gases made of atoms can condense to become liquids with molecular units of clusters or atoms, and then, with more pressure, they generally transform into solids. So most models of solids involve a picture of atoms interacting to form a periodic array of ions with electrons in various geometric configurations. Modern electron charge density plots [2] have influenced our mental images of covalent, ionic and metallic bonding using contour maps and pictures of dense dots to represent electrons confined in bonds appropriate for covalent or ionic semiconductors or spread out charge maps to represent electrons in metals. As an example, Fig. 1 shows the electronic charge density in the (110) plane for carbon and silicon both in the diamond structure. The bond lengths are 1.54 Å and 2.35 Å, respectively. It has been said that carbon is the basis of biology while silicon is the basis of geology, and it is the nature of the covalent bonds in these two systems which determines these properties. As
Figure 1. Contour maps of the valence electron charge density of C and Si in the diamond structure to illustrate a visual perception of covalent bonding. [Panel title: Valence charge density, (110) plane.]
shown in the figure, the carbon bond has two maxima while there is essentially one for silicon. The electrons in carbon can form sp² hybrids for three-fold coordination and multiple bonds, while elemental silicon at ambient pressures and temperatures forms sp³ bonds and is tetrahedrally coordinated. If solids are made of atoms, then it is the job of those modeling electronic behavior to illustrate this evolution of electrons from being localized around ions to the formation of covalent and metallic bonds. For this purpose, the old atomic models of Thomson and Newton work well pictorially. Thomson’s plum pudding model resembled our modern picture of jellium, with a positive smeared-out background representing the ions and electrons existing in this background. Unlike jellium, where the electrons are smeared out, Thomson’s electrons were plums. Hence, the essential difference is that the electrons in the jellium model are treated quantum mechanically and, despite the fact that they can be excited out of the metal and look like Thomson’s plums, inside the metal they are itinerant. The resulting jellium model works for many properties of metals. In contrast to Thomson’s atomic model, Newton’s atoms had hooks, and it takes little imagination to see how these atoms with interlocking hooks can be used to form the basis of covalent and ionic crystals. However, again we need to show how the electrons can become hooks and form covalent or ionic bonds, and this requires quantum mechanics.
Our modern quantum atom description is based on wavefunctions which yield probabilities for electron density. So, we can determine “exactly where an electron probably is.” This brings up the challenge of Dirac [3] posed after the development of quantum theory: “The underlying physical laws necessary for a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.” It is probably safe to say that to some extent we have answered Dirac’s challenge and we can now model electrons in some solids. Modern computing machines and new algorithms for solving complex equations have been an important ingredient, but just as important and probably more so is the conceptual base or modern “picture” of a solid that is inherently quantum mechanical.
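The remark that wavefunctions yield probabilities can be illustrated with the simplest case. The sketch below is not taken from the text: it assumes the hydrogen 1s state, works in units of the Bohr radius, and uses hypothetical function names. It evaluates the radial probability density P(r) = 4r² exp(−2r/a₀)/a₀³, locates its maximum, and integrates it out to a chosen radius.

```python
import math


def radial_probability_1s(r, a0=1.0):
    """Radial probability density of the hydrogen 1s state (assumed example).

    P(r) = 4 r^2 exp(-2 r / a0) / a0^3, normalised so that it integrates to 1.
    """
    return 4.0 * r * r * math.exp(-2.0 * r / a0) / a0 ** 3


def most_probable_radius(a0=1.0, r_max=10.0, steps=100_000):
    """Scan a uniform grid and return the radius at which P(r) is largest."""
    best_r, best_p = 0.0, 0.0
    for i in range(1, steps + 1):
        r = r_max * i / steps
        p = radial_probability_1s(r, a0)
        if p > best_p:
            best_r, best_p = r, p
    return best_r


def probability_within(radius, a0=1.0, steps=100_000):
    """Midpoint-rule integral of P(r) from 0 to radius."""
    h = radius / steps
    return h * sum(radial_probability_1s((i + 0.5) * h, a0) for i in range(steps))


r_peak = most_probable_radius()        # in units of a0
p_inside = probability_within(1.0)     # chance of finding the electron with r < a0
```

Under these assumptions the most probable radius coincides with the Bohr radius, while the probability of finding the electron inside that radius is 1 − 5e⁻², roughly one third; the electron is "probably" near a₀ without being confined there.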
3. Standard Model
Since solids are made of atoms, why not start with atomic wavefunctions and perturb them? This works; it is the tight binding model, which has had great success, especially for systems where electrons are not “too itinerant.” Methods like this represent a natural path for quantum chemists who start from atoms and study molecules. This is also a logical path for doing computations of finite small systems like clusters or nanostructures. Another approach is to think of the free electron metal, where each atom contributes its electrons to the soup of electrons in a solid. Perturbations on this model, such as the nearly free electron model, represent a very successful approach. Both of these very different paths will be represented in this volume and both are useful. The latter approach is conceptually the more difficult because in some sense it starts from a gas of electrons instead of electrons bound to atoms, but it has had widespread use and leads to very useful methods. One generally restricts the basis set to plane waves, which are appropriate for free electrons, but there are other approaches. So in this model, sometimes referred to as the “Standard Model,” one can visualize an array of positive cores in a background sea of valence electrons coming from the atoms. In the plane wave pseudopotential version of this model, there are two types of particles: valence electrons and positive cores. For a study of a particular solid, one arranges the cores in a periodic array and uses a plane wave basis set for the quantum mechanical calculations. The particles interact in the following way. The cores can be viewed as point-like Coulombic objects, so the core–core interactions can be represented by Madelung-type sums, which give an accurate description of these interactions. The electron–core
interaction is modeled using pseudopotentials [4, 5], and the electron–electron interactions are dealt with using density functional theory [6]. It is amazing how robust this model is when one considers the fact that for over 50 years, beginning with approaches like the OPW [7] and APW [8] methods, researchers struggled with the band structure dilemma of how to describe electrons which are atomic-like near the cores and free electron-like between the cores. The conceptual breakthrough was the pseudopotential, which accounted for the Pauli forces near the cores and led to weak effective potentials. Early versions were empirical [9] and fit to optical data, but eventually it became possible to construct pseudopotentials from first principles. Further discussion of pseudopotentials will be given in this volume. A convenient approach using the standard model is to calculate total energies [10] for model solids where the atoms are arranged in different configurations and only atomic information such as atomic numbers and masses is used as input. Hence different candidate crystal structures can be compared at varying volumes or pressures to explain the stability of observed structures or predict new ones. Here we find a major application of this method since, in addition to structural stability, properties such as lattice constants, elastic constants, bulk moduli, vibrational spectra, and even electron–phonon and anharmonic properties of solids can be evaluated. The techniques connected to this method have evolved and they too will be discussed in this volume. Whether one uses plane waves, other basis sets, or even tight binding schemes, there appears to be consensus in this area. Particularly dramatic early successes were the predictions of new high pressure crystalline phases of Si and Ge, and of superconductivity in high pressure phases of Si [11].
A more recent success is a detailed explanation of the unusual superconducting properties of MgB2 [12].
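The role of a weak effective potential in a plane wave basis, as described in this section, can be sketched in one dimension. The example below is an illustration under stated assumptions rather than a method from this volume: it assumes a model potential V(x) = 2 v_g cos(2πx/a) with a single Fourier component v_g, units in which ħ²/2m = 1, and hypothetical function and parameter names. It builds the Hamiltonian in a small plane wave basis and diagonalizes it with NumPy.

```python
import numpy as np


def nfe_bands_1d(k, v_g, n_max=5, a=1.0):
    """Band energies at wavevector k for a 1D nearly free electron model.

    Assumptions: units with hbar^2 / 2m = 1 and a model potential
    V(x) = 2 * v_g * cos(2 pi x / a), so only V(+-2 pi / a) = v_g is non-zero.
    """
    g0 = 2.0 * np.pi / a
    g = g0 * np.arange(-n_max, n_max + 1)      # reciprocal lattice "vectors"
    ham = np.diag((k + g) ** 2)                # kinetic energy of each plane wave
    coupling = np.full(len(g) - 1, v_g)        # V couples G and G +- g0 only
    ham = ham + np.diag(coupling, 1) + np.diag(coupling, -1)
    return np.linalg.eigvalsh(ham)             # sorted eigenvalues


v_weak = 0.2
bands_at_edge = nfe_bands_1d(np.pi, v_weak)    # zone boundary k = pi / a
gap_at_edge = bands_at_edge[1] - bands_at_edge[0]
```

For a weak potential the gap at the zone boundary comes out close to 2|v_g|, the familiar nearly free electron result. The same construction (kinetic energy on the diagonal, Fourier components of the potential off the diagonal) carries over to three dimensions, where the potential would be a pseudopotential.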
4. Now and Later
So what are the modern challenges? If in fact we have to some extent answered Dirac’s challenge of 75 years ago, what’s next? A few obvious areas at this point for future exploration and development are: studies of electron behavior and transport in confined or small systems; development of better order N methods for calculating electronic properties so that more complex systems can be addressed; further development of theories designed to study excited states for optical and related properties; and the evaluation of the effects of strong electron correlation. In addition, more semi-empirical models should be developed since they were important in the past, and there is reason to believe these will contribute to future development.
5. Confinement
It is clear that confinement sets the energy scale whether we are considering protons in nuclei, electrons in atoms or clusters, and to some extent, electrons in nano and macro materials. In the latter case, there are confinement scales set by the overall object size and by the components such as atoms or unit cells. One gets a good sense of how this works when considering shell models for nuclei or for alkali metal clusters [13, 14]. The so-called magic numbers emerge for the number of atoms in a cluster and stability of energy shells. The energy shell structure can influence overall structure and properties. For macrosystems, it is the atoms, their spacings, and the unit cell which set the energy scales. For confinement in macrosystems, their large sizes lead to such small energy splittings that the available energy states appear continuous even at the lowest attainable temperatures. However, size effects for small systems and surfaces can bring in a new scale and methods such as the supercell method [15] can be used to address situations like this where translational symmetry is lost. Clusters are good examples of systems where confinement effects can be dominant. Here, supercell techniques can be used, but real space methods, such as those described in this volume, can cover a wide range of situations where size matters. Nanotubes, peapods, atomic chains, quantum dots, large molecules, network systems, polymers, fullerenes, etc. are all examples of systems where electron confinement can lead to significant alterations in wavefunctions and hence properties. Transport is a particularly interesting field of study on the nanoscale. There are a number of research groups focused on the formulation of a transport theory for electron conduction through molecules and nanosystems. Here the vexing problem of contacts must be dealt with, and, for chains of atoms, questions related to even and odd numbers of atoms are relevant. 
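The statement that confinement sets the energy scale can be made quantitative with a textbook estimate. The sketch below is an illustration only: it assumes a free electron in a one-dimensional infinite well, E_n = (ħπn)²/(2mL²), which is a much cruder model than the shell and supercell methods mentioned above, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt
K_B = 1.380649e-23       # Boltzmann constant, J / K


def box_level(n, length_m):
    """Energy (eV) of level n for an electron in a 1D infinite well of width L."""
    return (HBAR * math.pi * n) ** 2 / (2.0 * M_E * length_m ** 2) / EV


def first_gap(length_m):
    """Spacing (eV) between the two lowest levels."""
    return box_level(2, length_m) - box_level(1, length_m)


gap_nano = first_gap(1e-9)           # 1 nm box
gap_macro = first_gap(1e-2)          # 1 cm box
thermal_1mK = K_B * 1e-3 / EV        # k_B T at 1 mK, in eV
```

In this model the spacing scales as 1/L²: it is of order 1 eV for a 1 nm box, but for a 1 cm box it falls many orders of magnitude below k_B T even at 1 mK, which is why the levels of a macroscopic sample appear continuous.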
Because the nanoscale is of interest to physicists, chemists, biologists, engineers, materials scientists, and computer scientists, there has been a great deal of synergy between these disciplines and surprising demonstrations of the commonality of the problems facing researchers in these fields. One example is molecular motors. The problem of understanding friction in molecular motors with nanotube bearings is not very different from similar questions posed by biologists studying friction in biomotors. Another example is the application of nanostructures for devices. Figure 2 shows the merging of an (8,0) semiconducting carbon nanotube with a (7,1) metallic carbon nanotube. This is achieved by inserting a defect between them with adjacent five-member and seven-member rings of carbon atoms. The result is a Schottky barrier whose properties are determined just by the action of a handful of atoms at the interface.
Figure 2. A schematic drawing of a Schottky barrier composed of semiconducting (8,0) and metallic (7,1) carbon nanotubes.
6. Methods
Many researchers are exploring so-called “order N” methods for attacking large or complex systems. As mentioned before, real space methods also appear promising. Researchers have developed new schemes for attempting to do inversions of matrices employing methods that resemble a “divide and conquer” approach. Schematically, a large matrix can be cut down through different point sampling into smaller units. The developments in this area are encouraging, and the collaborations between mathematicians doing numerical analysis and theoretical physicists and chemists appear to be productive. Another approach is to acknowledge that most problems on solids are multi-scale problems. A multi-scale approach can be most simply illustrated
by an example where one calculates microscopic parameters and uses them along with semi-empirical models at a larger scale. Many sophisticated versions of this approach have been developed in recent years. Some of this very interesting research is described in detail in this volume.
7. Excited States
Generally the problem which arises when excited states of solids are considered is that many of the standard methods used to compute the effects of electron–electron interactions use the local density approximation (LDA), which is not directly applicable for calculating excited state properties. For example, in the total energy LDA approach [10], ground state properties such as lattice constants and mechanical properties are determined quite accurately. However, in an optical process, photons create electron–hole pairs in the solid which influence the excited state properties of the many-electron system. When band gaps of semiconductors are evaluated from energy bands obtained using the LDA methods, the band gap is underestimated, typically by a factor of about two. In some cases metallic behavior is predicted for systems known to be semiconductors. The so-called “band gap” problem was of central concern when applications of the “standard model,” which were so successful for ground state properties, became clearly unusable for computing band gaps. The overall topology of the energy bands was approximately right and in agreement with empirical models and experimental data where checks were possible, but the details were wrong. Early suggestions such as the “scissors model,” where levels were artificially shifted by adding a constant energy to the calculated band gap, were considered to be “band aids” and not cures. Although this is still an active area of research, there are methods for evaluating quasiparticle energies. One of the most successful is the GW method [16], which works for a broad class of solids. Two major ingredients in this approach are the inclusion of electron self-energy effects and the modulation of the charge density in the crystal. This latter feature allows for the effects on exchange and correlation energies arising, for example, from the concentration of electrons into bonds.
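The “scissors model” mentioned above amounts to a rigid shift of the conduction bands, which a few lines of code make explicit. The sketch below uses hypothetical function names and made-up band energies chosen only to show the bookkeeping; they are not data for any real material.

```python
def band_gap(band_energies, n_valence):
    """Gap between the highest valence and lowest conduction energy over all k."""
    vbm = max(bands[n_valence - 1] for bands in band_energies)
    cbm = min(bands[n_valence] for bands in band_energies)
    return cbm - vbm


def scissors_shift(band_energies, n_valence, delta):
    """Rigidly raise every conduction band (index >= n_valence) by delta."""
    return [
        [e + delta if i >= n_valence else e for i, e in enumerate(bands)]
        for bands in band_energies
    ]


# Hypothetical energies (eV) at two k-points, two valence and two conduction bands.
lda_like_bands = [
    [-1.0, 0.0, 0.6, 2.0],
    [-1.5, -0.3, 0.9, 2.4],
]
target_gap = 1.2                                   # assumed "experimental" value
delta = target_gap - band_gap(lda_like_bands, 2)   # constant needed to open the gap
shifted = scissors_shift(lda_like_bands, 2, delta)
```

The shift reproduces the target gap by construction and leaves every band dispersion unchanged, which is why it was regarded as a “band aid”: it corrects one number without adding any physics, in contrast to a quasiparticle calculation.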
Another feature of the properties of the excited state which must be addressed is the role of electron–hole interactions. Two of the most dramatic effects are the formation of excitons and the alteration of oscillator strengths arising from electron–hole interactions. Again, this is an active area of research, but a workable theory is available [17] where the Bethe–Salpeter approach for two particle scatterings is adopted and applied along with the GW machinery. Forces in the excited state and other special features arising
from considering these interactions can be calculated. Comparisons between this method and others, such as time dependent density functional approaches [18], quantum Monte Carlo methods and more quantum chemistry oriented approaches are yielding new insights into this area. It appears that research in this field will remain active for some time as there are many possible applications.
8. Strongly Correlated Electrons
At this time, it is commonly believed that a forefront field of condensed matter theory is the study of strongly correlated electrons. However, as in the case of defining biophysics, the image of what is meant by this field of study varies among individuals. As was described at the beginning of this article, there are theorists attempting to use simplified models to get the essence of the physics associated with problems related to strongly correlated electron systems. A prime example is the large amount of research devoted to the study of superconductivity in copper oxide systems. Here it is clear why theorists are motivated. Electron correlation effects are important, there is no consensus yet on the underlying electron pairing mechanism, and the normal state and superconducting properties are very interesting. So the application of models such as the Hubbard model has attracted a large number of theoretical researchers. Many interesting proposals for explaining the electronic properties of the oxides using Hubbard-like models have been advanced. At present, this is an active field, but as mentioned before, there is still no general agreement on “the” appropriate description of these systems, and in general, there is a lack of definitive proof of good theoretical–experimental agreement. The more ab initio approaches designed for specific materials are beginning to make some impact on this area. Despite the known shortcomings of applying band structure calculations based on a density functional approach to materials of this kind, these were among the most useful calculations for interpreting experiments like photoelectron spectroscopy aimed at determining electronic structure. The Fermi surface topology and other electronic characteristics were explored with considerable success through experimental–theoretical comparisons along with reasonable empirical adjustments to the electronic structure calculations.
Currently, efforts are underway for a more frontal assault on this problem. By combining local spin density calculations together with Hubbard-like terms to account for electron–electron repulsion, more realistic electronic structure calculations are being done. Variations and improvements on these “LSDA + U” approaches [19] including the use of pseudopotentials appear to be promising. And it is possible that the more first
principles, materials-motivated approach may make important contributions to the conceptual development of this field.
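The competition between hopping and on-site repulsion that Hubbard-like models describe can be seen in the smallest possible case. The sketch below is an illustration under stated assumptions, not a method from this section: it assumes a two-site Hubbard model with two electrons, restricted to the S_z = 0 sector, with hopping t and on-site repulsion U. The basis ordering and the signs of the hopping elements follow one common convention; other orderings change signs but not the spectrum. Function names are hypothetical.

```python
import numpy as np


def two_site_hubbard(t, u):
    """Two-site, two-electron Hubbard Hamiltonian in the S_z = 0 sector.

    Assumed basis order: both electrons on site 1, both on site 2,
    (up on 1, down on 2), (down on 1, up on 2).
    """
    return np.array([
        [u,   0.0, -t,  t],
        [0.0, u,   -t,  t],
        [-t,  -t,  0.0, 0.0],
        [t,   t,   0.0, 0.0],
    ])


def ground_state_energy(t, u):
    """Lowest eigenvalue from exact diagonalisation."""
    return np.linalg.eigvalsh(two_site_hubbard(t, u))[0]


def analytic_ground_state(t, u):
    """Closed form for the singlet ground state: (U - sqrt(U^2 + 16 t^2)) / 2."""
    return 0.5 * (u - np.sqrt(u * u + 16.0 * t * t))


e_weak = ground_state_energy(1.0, 0.0)      # U = 0: two electrons in the bonding orbital
e_strong = ground_state_energy(1.0, 20.0)   # large U: approaches -4 t^2 / U
```

At U = 0 the result is the band-like value −2t, while for large U the energy approaches −4t²/U, the superexchange scale. A band calculation without a Hubbard-like term captures only the first limit, which is one way to see why approaches of the “LSDA + U” type add such a term.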
9. Empirical Models
Just as the atomic models of Thomson and Newton described earlier help to form a basis for the conceptual picture of electronic behavior, other empirical and semi-empirical models had a considerable effect on the development of this field of study. The Thomas–Fermi model, which allowed calculation of electron screening effects, and Slater’s and Wigner’s formulas for evaluating the effects of exchange and correlation gave important insight into the role of these many body effects. Free electron and nearly free electron models were extremely important, as were empirical tight binding models for estimating band structure effects. An example which illustrates the transition from an empirical model designed to explain experimental data into a first-principles approach is the Empirical Pseudopotential Method (EPM). In this approach [9], a few form factors (usually three per atom) of the potential in the unit cell are fit to yield band structures consistent with experimental measurements. For example, three band gaps in the optical spectrum of Si or Ge can be used to fix the potential for these atoms, and then the electronic band structure and other properties can be computed with a high degree of accuracy. When applying the EPM, the pseudopotential is taken to be the total potential a valence electron experiences; it combines the electron–ion and electron–electron interactions. In the course of fitting these potentials, the problem of how the optical properties of semiconductors were related to interband transitions was solved in the 1960s and 1970s. In addition, a great deal was learned about the pseudopotential. It was found that pseudopotentials were “transferable.” Pseudopotentials constructed for InAs, InSb and GaAs could be used to extract As, In, Sb and Ga pseudopotentials. In fact, the extracted In, Ga, As, and Sb pseudopotentials were transferable between compounds and even worked well to give the electronic structure of these metals and semi-metals.
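The structure of an EPM calculation with three form factors can be sketched compactly. The example below is an illustration under stated assumptions, not a reproduction of the calculations in [9]: the Si lattice constant (5.43 Å) and the form factors V₃ = −0.21, V₈ = 0.04, V₁₁ = 0.08 Ry are values commonly quoted for the local empirical pseudopotential of silicon and should be checked against the original literature, the plane wave cutoff is arbitrary, and all function names are hypothetical.

```python
import numpy as np

RY_TO_EV = 13.605693
BOHR_PER_ANGSTROM = 1.0 / 0.529177

# Assumed inputs (not given in the text): lattice constant in bohr and the three
# symmetric form factors in Ry, keyed by |G|^2 in units of (2 pi / a)^2.
A_SI = 5.43 * BOHR_PER_ANGSTROM
FORM_FACTORS = {3: -0.21, 8: 0.04, 11: 0.08}


def reciprocal_vectors(g2_max=20):
    """fcc reciprocal lattice vectors in units of 2 pi / a (h, k, l of equal parity)."""
    rng = range(-5, 6)
    return np.array([(h, k, l) for h in rng for k in rng for l in rng
                     if (h % 2 == k % 2 == l % 2) and h * h + k * k + l * l <= g2_max])


def epm_bands(k, g_vectors, a=A_SI, form_factors=FORM_FACTORS):
    """Band energies (eV) at wavevector k (units of 2 pi / a), diamond structure."""
    unit = (2.0 * np.pi / a) ** 2            # kinetic prefactor in Ry (hbar^2/2m = 1)
    n = len(g_vectors)
    ham = np.zeros((n, n))
    for i in range(n):
        ham[i, i] = unit * np.sum((np.asarray(k) + g_vectors[i]) ** 2)
        for j in range(i + 1, n):
            dg = g_vectors[i] - g_vectors[j]
            v = form_factors.get(int(dg @ dg), 0.0)
            # Structure factor for two atoms at +-(a/8)(1, 1, 1): cos(G . tau).
            ham[i, j] = ham[j, i] = v * np.cos(np.pi * dg.sum() / 4.0)
    return np.linalg.eigvalsh(ham) * RY_TO_EV


g_vecs = reciprocal_vectors()
gamma_bands = epm_bands((0.0, 0.0, 0.0), g_vecs)
direct_gap_gamma = gamma_bands[4] - gamma_bands[3]   # four valence bands, no spin
```

Only three numbers per atom enter the potential; everything else is fixed by the crystal structure. Evaluating `epm_bands` along a path of k-points would give a band structure of the kind compared with experiment in this section, with the accuracy limited by the assumed form factors and the basis cutoff.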
So it became clear that each atom had its own transferable potential, and at least to a first approximation, these could be extracted from experiment and applied widely. In addition to learning about the transferability of the pseudopotentials, their general form and properties gave a great deal of information which was used when first-principles potentials were developed. So this empirical approach, which is still used, not only provided an accessible and flexible calculational tool, it also provided ideas and facts for use in developing the fundamental theory. The resulting band structures were also accurate. Figure 3 shows a comparison between the predicted EPM band structures of GaAs and the subsequent experimentally determined data using Angular Resolved Photoemission Spectroscopy.

Figure 3. A comparison of the predicted pseudopotential band structure for occupied energy bands in GaAs together with the experimental bands determined by Angular Resolved Photoemission Spectroscopy.

Another example involved bulk moduli of semiconductors and insulators. The first-principles approach using total energy calculations as a function of volume, E(V), allows the determination of elastic constants and, in particular, the bulk modulus B. These calculations are fairly extensive and hence costly. Another approach based on concepts introduced by Phillips [20] yields a connection between spectral properties of semiconductors and insulators and their structural or bonding properties. By exploiting [21] these concepts, a simple formula can be derived for B which requires only the bond length d and an integer I = 0, 1, or 2 to indicate a group IV, III–V, or II–VI compound. The resulting formula, $B = (1972 - 220I)\,d^{-3.5}$, gives calculated values for B to within a few percent of the experimental values. Again, not only is this semi-empirical approach valuable because the calculation can be done on a hand calculator in a few seconds, it also gives insight into the nature of compressibilities. For example, one can make estimates and explore limits of B as aids in predicting the existence of superhard solids [22].
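The bulk modulus formula is simple enough to evaluate directly. The sketch below assumes the units of the original derivation [21], with d in ångströms and B in GPa; the text here does not state them, so treat that as an assumption. The bond lengths are approximate values supplied only for illustration.

```python
def bulk_modulus(d, group_index):
    """Semi-empirical bulk modulus B = (1972 - 220*I) * d**-3.5.

    d           -- nearest-neighbor bond length (assumed to be in angstroms)
    group_index -- I = 0, 1, 2 for group IV, III-V, II-VI materials
    Returns B (assumed to be in GPa).
    """
    if group_index not in (0, 1, 2):
        raise ValueError("I must be 0 (IV), 1 (III-V), or 2 (II-VI)")
    return (1972.0 - 220.0 * group_index) * d ** -3.5

# Approximate bond lengths, supplied here for illustration only.
examples = {
    "C (diamond)": (1.54, 0),
    "Si": (2.35, 0),
    "GaAs": (2.45, 1),
}
results = {name: bulk_modulus(d, i) for name, (d, i) in examples.items()}
for name, b in results.items():
    print(f"{name:12s} B = {b:6.1f}")
```

Because B varies as the inverse 3.5 power of d, short bonds dominate: this is why the search for low-compressibility solids mentioned above focuses on materials with short, covalent bonds.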
10. Future
As Yogi Berra stated, “Predictions are hard to make, especially about the future.” However, it is clear that this area of physics will expand. Multi-scale methods [23] to study materials assembled from fundamental building blocks that are understood at the micro or nano level will continue to be an active field with interest coming from materials science, chemistry, and physics. Problems like understanding the nature of growth, diffusion, amorphous materials, and even non-equilibrium processes can be addressed. Molecular dynamics [24] can also be used to attack problems of this kind [25]. Real space methods [26, 27] will also continue to impact this area of research. The general interest in clusters and how they develop properties associated with bulk properties and the study of the evolution of material properties as size changes will demand new methods and concepts. As mentioned in the section on excited states, there has been considerable progress in determining optical properties from first-principles theory for solids. There has also been progress on the calculation of optical properties for clusters and nanocrystals. These approaches [18] are sometimes labeled as time dependent LDA or TDLDA. Growth in this area is also expected. A frontier has always been the study of increasingly more complex solids. Many materials can be described in terms of unit cells with a finite number of atoms. Computational problems arise as the number of atoms increases. Here hardware development helps, and it is impressive how much progress continues to be made in extending the complexity of systems that can be studied. However, the appetite for considering more complex systems is large particularly at the border where this field of science merges with biophysics. Complex molecules and systems like DNA are coming into the range of study where researchers expect precision on the level of what has been achieved for crystals. 
Clearly this is an area of important research with a bright future, as are nanoscience and quantum computation, where we may possibly learn new things about quantum mechanics. As mentioned earlier, the frontier of correlated electrons remains, and many feel that present theory is up to the challenge. If success is achieved in this area and our ability to treat more complex systems is enhanced, it may be possible to predict new states of matter. I would expect that this phase of discovery, if it is in the cards for theorists, will be preceded by the development of semiempirical theories like the EPM. With good models and general knowledge of effects such as polarizability [28], one may be able to predict phenomena on the level of magnetism, superconductivity, and the quantum Hall effects. However, this may be a long way off, so we still need experimentalists.
Acknowledgments This work was supported by National Science Foundation Grant No. DMR00-87088 and by the Director, Office of Science, Office of Basic Energy Sciences, Division of Materials Sciences and Engineering, US Department of Energy under contract No. DE-AC03-76SF00098.
References
[1] J. Bardeen, L.N. Cooper, and J.R. Schrieffer, “Theory of superconductivity,” Phys. Rev., 108, 1175–1204, 1957.
[2] J.P. Walter and M.L. Cohen, “Electronic charge densities in semiconductors,” Phys. Rev. Lett., 26, 17–19, 1971.
[3] P.A.M. Dirac, “Quantum mechanics of many-electron systems,” Proc. R. Soc. (London), A123, 714–733, 1929.
[4] E. Fermi, “On the pressure shift of the higher levels of a spectral line series,” Nuovo Cimento, 11, 157, 1934.
[5] J.C. Phillips and L. Kleinman, “New method for calculating wave functions in crystals and molecules,” Phys. Rev., 116, 287–294, 1959.
[6] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev., 140, A1133–A1138, 1965.
[7] C. Herring, “A new method for calculating wave functions in crystals,” Phys. Rev., 57, 1169–1177, 1940.
[8] J.C. Slater, “Wave functions in a periodic potential,” Phys. Rev., 51, 846–851, 1937.
[9] M.L. Cohen and T.K. Bergstresser, “Band structures and pseudopotential form factors for fourteen semiconductors of the diamond and zincblende structures,” Phys. Rev., 141, 789–796, 1966.
[10] M.L. Cohen, “Pseudopotentials and total energy calculations,” Phys. Scripta, T1, 5–10, 1982.
[11] K.J. Chang, M.L. Cohen, J.M. Mignot, G. Chouteau, and G. Martinez, “Superconductivity in high-pressure metallic phases of Si,” Phys. Rev. Lett., 54, 2375–2378, 1985.
[12] H.J. Choi, D. Roundy, H. Sun, M.L. Cohen, and S.G. Louie, “The origin of the anomalous superconducting properties of MgB2,” Nature, 418, 758, 2002.
[13] W.D. Knight, K. Clemenger, W.A. de Heer, W.A. Saunders, M.Y. Chou, and M.L. Cohen, “Electronic shell structure and abundances of sodium clusters,” Phys. Rev. Lett., 52, 2141–2143, 1984.
[14] W.A. de Heer, W.D. Knight, M.Y. Chou, and M.L. Cohen, “Electronic shell structure and metal clusters,” In: H. Ehrenreich and D. Turnbull (eds.), Solid State Physics, vol. 40, Academic Press, New York, p. 93, 1987.
[15] M.L. Cohen, M. Schlüter, J.R. Chelikowsky, and S.G. Louie, “Self-consistent pseudopotential method for localized configurations: molecules,” Phys. Rev. B, 12, 5575–5579, 1975.
[16] M.S. Hybertsen and S.G. Louie, “First-principles theory of quasiparticles: calculation of band gaps in semiconductors and insulators,” Phys. Rev. Lett., 55, 1418–1421, Phys. Rev. B, 34, 5390–5413, 1986.
[17] M. Rohlfing and S.G. Louie, “Electron–hole excitations in semiconductors and insulators,” Phys. Rev. Lett., 81, 2312–2315, 1998, Phys. Rev. B, 62, 4927–4944, 2000.
[18] I. Vasiliev, S. Ogut, and J.R. Chelikowsky, “First-principles density-functional calculations for optical spectra of clusters and nanocrystals,” Phys. Rev. B, 65, 115416, 2002.
[19] V.I. Anisimov, J. Zaanen, and O.K. Andersen, “Band theory and Mott insulators: Hubbard U instead of Stoner I,” Phys. Rev. B, 44, 943–954, 1991.
[20] J.C. Phillips, Bonds and Bands in Semiconductors, Academic Press, New York, 1973.
[21] M.L. Cohen, “Calculation of bulk moduli of diamond and zinc-blende solids,” Phys. Rev. B, 32, 7988–7991, 1985.
[22] A.Y. Liu and M.L. Cohen, “Prediction of new low compressibility solids,” Science, 245, 841, 1989.
[23] N. Choly and E. Kaxiras, “Fast method for force computations in electronic structure calculations,” Phys. Rev. B, 67, 155101, 2003.
[24] R. Car and M. Parrinello, “Variational quantum Monte Carlo nonlocal pseudopotential approach to solids: cohesive and structural properties of diamond,” Phys. Rev. Lett., 61, 1631–1634, 1988.
[25] S. Yip, “Nanocrystalline metals – Mapping plasticity,” Nature Mater., 3, 11, 2004.
[26] J.R. Chelikowsky, N. Troullier, and Y. Saad, “The finite-difference-pseudopotential method: electronic structure calculations without a basis,” Phys. Rev. Lett., 72, 1240–1243, 1994.
[27] M.M.G. Alemany, M. Jain, J.R. Chelikowsky, and L. Kronik, “A real space pseudopotential method for computing the electronic properties of periodic systems,” Phys. Rev. B, 69, 075101, 2004.
[28] I. Souza, J. Iniguez, and D. Vanderbilt, “Dynamics of Berry-phase polarization in time-dependent electric fields,” Phys. Rev. B, 69, 085106, 2004.
[29] M.L. Cohen and J.R. Chelikowsky, Electronic Structure and Optical Properties of Semiconductors, Springer-Verlag, Berlin, 1988.
[30] C. Kittel, Introduction to Solid State Physics, seventh edition, Wiley, New York, 1996.
[31] J.C. Phillips, Bonds and Bands in Semiconductors, Academic Press, New York, 1973.
[32] P.Y. Yu and M. Cardona, Fundamentals of Semiconductors, Springer, Berlin, 1996.
1.3 ACHIEVING PREDICTIVE SIMULATIONS WITH QUANTUM MECHANICAL FORCES VIA THE TRANSFER HAMILTONIAN: PROBLEMS AND PROSPECTS Rodney J. Bartlett, DeCarlos E. Taylor, and Anatoli Korkin Quantum Theory Project, Departments of Chemistry and Physics, University of Florida, Gainesville, FL 32611, USA
1. Prologue
S. Yip (ed.), Handbook of Materials Modeling, 27–57. © 2005 Springer. Printed in the Netherlands.

According to the Westmoreland report [1], “in the next ten years, molecularly based modeling will profoundly affect how new chemistry, biology, and materials physics are understood, communicated, and transformed to technology, both intellectually and in commercial applications. It creates new ways of thinking – and of achieving.” Computer modeling of materials can potentially have an enormous impact in designing or identifying new materials, and in learning how they fracture or decompose, what their optical properties are, and how these and other properties can be modified. However, materials simulations can be no better than the forces provided by the potentials of interaction among the atoms involved in the material. Today, these are almost invariably classical, analytical, two- or three-body potentials, because only such potentials permit the very rapid generation of forces required by large-scale molecular dynamics. Furthermore, while such potentials have been laboriously developed over many years, adding new species frequently demands another long-term effort to generate potentials for the new interactions. Most simulations also depend upon idealized crystalline (periodic) symmetry, making it more difficult to describe the often more technologically important amorphous materials. If we also want to observe bond breaking and formation, optical properties, and chemical reactions, we must have a quantum mechanical basis for our simulations. This requires a multi-scale philosophy, where a quantum mechanical core is tied to a classical
atomistic region, which in turn is embedded in a continuum of some sort, like a reaction field or a finite-element region. It is now well known that ab initio quantum chemistry has achieved the quality of being “predictive” to within established small error bars for most properties of isolated, relatively small molecules, making it far easier to obtain requisite information about molecules from applications of theory than to attempt complicated and expensive experimental observation. In fact, applied quantum chemistry as implemented in many widely used computer programs (ACES II [2], GAUSSIAN, MOLPRO, MOLCAS, QCHEM, etc.) has now attained the status of a tool that is complementary to those of X-ray structure determination and NMR and IR spectra in the routine determination of the structure and spectra of molecules. However, there is an even greater need for the computer simulations of complex materials to be equally predictive. Unlike molecules, which can usually be characterized in detail by spectral and other means, materials are far more complex and cannot usually be investigated experimentally under similarly controlled conditions. They have to be studied at elevated temperatures and under non-equilibrium conditions. Frequently, the application of the material might be meant for extreme situations that might not even be accessible in a laboratory. Hence, if we use more economical computer models to learn how to suitably modify a material to achieve an objective, our materials simulations must be “predictive” for us to trust both the qualitative and quantitative consequences of the simulations. Besides the predictive aspect, another theme that permeates our work with materials is “chemistry.” By chemistry we mean that unlike the idealized systems that have been the focus of most of the simulation work in materials science, we want to consider the essential interactions among many different molecular species and, in particular, under stress.
As an example, a long unsolved problem in materials is why water will cause forms of silica to weaken by several orders of magnitude compared to their dry forms [3–5] while ammonia with silica shows a different behavior. A proper, quantum mechanically based simulation should reflect these differences, qualitatively and quantitatively. The third theme of our work is that by virtue of using a quantum mechanical (QM) core in multi-scale simulations, unlike all the simulations based upon classical potentials, we have quantum state specificity. In a problem like etching silica with $\mathrm{CF_4}$, which generates the etching agent $\mathrm{CF_3}$, a classical potential cannot distinguish between $\mathrm{CF_3^+}$, $\mathrm{CF_3^-}$, and $\mathrm{CF_3^{\cdot}}$, yet obviously the chemistry will be very different. Furthermore, we also need the capability to use excited electronic states in our simulations, to include species like $\mathrm{CF_3^*}$, for example, or to distinguish between different modes of fracture of the silica target, such as radical dissociation as opposed to ionic dissociation. Conventionally, the only quantum mechanically based multi-scale dynamics simulations that would permit as many as 500–1000 atoms in the QM region were based upon the tight-binding (TB) method, density functional theory (DFT) being used only for smaller QM regions. TB is a pervasive term that covers everything from crude, non-self-consistent descriptions like extended Hückel theory [6], to quasi-self-consistent schemes based upon Mulliken or other point charges [7], to a long history of solid state efforts [8, 9], to TB with three-body terms [10]. The poorest of these introduce no overlap, self-consistency, or explicit consideration of the nuclear–nuclear repulsion terms that would be essential in any ab initio approach; so in general such methods cannot correctly describe bond breaking, where charge transfer is absolutely essential. However, there have been significant improvements on several fronts in the recent TB literature [11, 12] which are helping to rectify these failings. The alternative approach to TB is that based upon the semi-empirical quantum chemistry tradition starting with Pariser and Parr [13, 14], Dewar et al. [15, 16] and Pople et al. [17, 18], and being extended on several fronts by Stewart [19–21], Thiel [22], Merz [23], Repasky et al. [24], and Tubert-Brohman et al. [25]. These “neglect of differential overlap” methods, of which the most flexible is the NDDO method, meaning “neglect of diatomic differential overlap,” will be our initial focus. As in TB methods, the Hamiltonian is greatly simplified, not necessarily by limiting all interactions to nearest neighbors but instead by operationally limiting interactions to mostly diatomic units in molecules. We will address some of the details later, but for most of our purposes, the particular form for the “transfer Hamiltonian” will be at our disposal, and suitable forms with rigorous justification are a prime objective of our research. It might be asked why a “Hamiltonian” instead of a potential energy surface? Fitting the latter, especially while including the plethora of bond-breaking regions, is virtually impossible for even simple molecules.
Highly parameterized molecular mechanics (MM) methods [26] can do a good job of generating a potential energy surface near equilibrium for well-defined and unmodified molecular units, but bond breaking and formation are outside the scope of MM. So our objective, instead of the PES (potential energy surface), is to create a “transfer Hamiltonian” that permits the very rapid determination of, in principle, all the properties of a molecule, and especially the forces on a PES for steps of the MD. The transfer Hamiltonian gives us a way to subsume most of the complications of a PES in a very convenient package that will yield the energy and first and second derivatives upon command. This has been done to some degree in rate constant applications for several-atom molecules where the complication is the need for multi-dimensional PES information [27–29]. Here, we conceive of the transfer Hamiltonian as a way to get all the relevant properties of a molecule including its electronic density, and related properties like dipole moments, and its photoelectron, electronic, and vibrational spectra. Except for the latter, these are purely “electronic” properties, which depend solely on the electronic Schrödinger equation. These should be distinguished from forces and the PES itself, which are properties of the total energy.
The distinction between the two has been at the heart of the principal dilemma in simplified or semi-empirical theory, where a set of parameters that give the total energy are not able to describe electronic properties equally well. It is also critical that the Hamiltonian be computed very rapidly to accommodate MD applications, and a form for it needs to be determined such that we retain the accuracy of the forces and other properties that would come from ab initio correlated theory. This is more an objective than a fait accompli, but we will discuss how to try to accomplish this in this contribution. Our approach is to appeal to the highest level of ab initio quantum chemistry, namely coupled-cluster (CC) theory, to use as a basis for a “transfer Hamiltonian” that embeds the accurate, predictive-quality CC forces taken from suitable clusters, but in an operator that is of very low rank, making it possible to do fully self-consistent calculations on ∼500–1000 atoms undergoing MD. Hence, as long as a phenomenon is accessible to MD, and if the transfer Hamiltonian forces retain the accuracy of CC theory, we should be able to retain the predictive quality of the CC method in materials simulations; and if we can also describe the electronic properties accurately, we have everything that the Schrödinger equation could tell us about our system. In addition, we have no problem with changing atoms or adding new molecules to our simulations, as our transfer Hamiltonian is applicable to any system once trained to ensure its proper description. We will also develop the transfer Hamiltonian approach from DFT considerations in the following to show the essential consistency between the wavefunction and density functional methods.
Our emphasis on predictability, chemistry, and state specificity offers a novel perspective in the field; and the tools we are developing, all tied together with highly flexible software, set the stage for the kinds of simulations that will lead to reliable materials design. As the Westmoreland report further states, ‘The top needs required by industry are methods that are “bigger, better, faster;” (with) more extensive validation, and multiscale techniques.’
2. Introduction
Our objective is predictive simulations of materials. The critical element in any such simulation is the forces that drive the molecular dynamics. For a reliable description of bond breaking, as in fracture or chemical reaction; to distinguish between a free radical and a cation or anion, that is, to be electronic state specific; or to account for optical spectra, the forces must be obtained from a quantum mechanical method. Today’s entirely first-principles, quantum chemical methods are “predictive” for small molecules in the sense that with a suitable level of electron correlation, notably with coupled-cluster (CC) theory [30], and large enough basis sets [30, 31], or to a lesser extent with density functional theory (DFT) [32–34], the results for molecular structure, spectra,
energetics and the associated atomic forces required for these quantities and for reaction paths are competitive with experiment. In particular, these highly correlated methods offer accurate results for transient molecules and other experimentally inaccessible species, and particularly reaction paths that can seldom be known from solely experimental considerations. In terms of ab initio theory, the established paradigm of results from converging, correlated methods is MP2
Figure 1. Comparison of CI, MBPT, and CC results with full CI. Results based on a DZP basis for BH and $\mathrm{H_2O}$ at $R_e$, $1.5R_e$, and $2.0R_e$.
well. When we go to X = 3, we get a third s, a third set of p functions, a second set of d functions, and a set of f functions. Clearly, we rapidly go to quite large basis sets when X ≥ 3. The fundamental problem with using these methods for large molecules is that after MP2 ($\sim n^5$) the above CC calculations scale non-linearly with the number of basis functions, as $\sim n^6$ for CCSD, $\sim n^7$ for (T), $\sim n^8$ for CCSDT, etc. The CC methods now in wide use were developed by the Bartlett group from 1978 to the present [36], and have now been implemented numerous times by independent researchers. As for benchmarks toward experiment, many studies of expected error bars exist in the literature for various levels of CC. Notably, the book by Helgaker et al. [31] shows many comparisons. We plot the normal distributions of their results for HF, CCSD and CCSD(T) in Figs. 2–4 for the pVDZ and pVTZ bases. All ab initio results depend upon the quality of the basis set as well as the correlation corrections. We can summarize the results in Table 1. With a triply polarized basis like cc-pVTZ, the CCSD(T) standard deviations are as follows: structure (∼0.0023 Å), dissociation energies for single bonds (∼3.5 kcal/mol), harmonic vibrational frequencies (∼5–20 cm$^{-1}$), excitation energies (∼0.1 eV for singly excited states) and NMR coupling constants (∼5 Hz), with similar ones for other properties. From the normal distributions of errors for bond lengths, dissociation energies, and heats of atomization in small molecules at various levels of theory, there is a dramatic improvement of CC methods over SCF, CISD, and MP2. There can also be a significant difference between CCSD and CCSD(T), where the triple excitations are added in a non-iterative form to CCSD [36]. There is an inadequate database about transition states and activation barriers, since few are known experimentally.
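The practical meaning of the formal scaling exponents quoted above can be seen with a few lines of arithmetic. The sketch below only uses the exponents stated in the text; the function name and the choice of doubling the basis are illustrative, and prefactors (which matter in practice) are ignored.

```python
# Formal scaling exponents quoted in the text: cost ~ n**p,
# where n is the number of basis functions.
SCALING = {"MP2": 5, "CCSD": 6, "(T)": 7, "CCSDT": 8}

def cost_ratio(method, growth):
    """Factor by which the formal cost grows when n is multiplied by `growth`."""
    return growth ** SCALING[method]

ratios = {m: cost_ratio(m, 2) for m in SCALING}
for method, ratio in ratios.items():
    print(f"{method:6s} doubling n multiplies the formal cost by {ratio}")
```

Since n grows both with the number of atoms and with the basis-set level X, the two sources of growth compound, which is the reason these methods are restricted to small systems.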
Figure 2. Normal distributions of the errors in calculated bond distances for a set of 28 molecules containing first row atoms.

Figure 3. Normal distributions of the errors in calculated atomization energies for a set of 16 molecules containing first row atoms.

Figure 4. Normal distributions for the errors in calculated reaction enthalpies for a set of 13 reactions containing first row atoms.

Table 1. Bond lengths and dissociation energies as a function of basis set and method [37, 38]

              Bond length          Dissociation energy
Method        DZ        TZ         DZ        TZ
HF            0.021     0.028      7.12      6.85
MP2           0.013     0.006      7.41      3.28
CCSD(T)       0.016     0.002      8.78      2.88

For complex systems of the type addressed by modern multi-scale simulations [39, 40], maintaining a chain of approximations built upon the quantum mechanical core, like the paradigm above, to retain the predictability of the underlying forces is even more important, as there is seldom the extent and quality of molecule-specific experimental data available to test the theory that there is for small molecules. Hence, evolving toward predictive simulations is critical to obtaining accurate, qualitative and quantitative conclusions. So how can we achieve the predictability we need for materials simulations? The problem is illustrated in Fig. 5. We can do highly accurate studies of molecular structure, spectra, and bond breaking for ∼20 atoms at the CC level; ∼50–200 at the MP2 level; and ∼100–300 at the DFT level. In an isolated case for the energy at a single geometry (not necessarily forces) with additional tricks we can go much further, to ∼1000 atoms [41, 42]. But here we are only concerned with methods for the forces that can be done on a time scale that can be reasonably tied to MD. This imposes a severe limitation on the size of system that can be addressed. The “transfer Hamiltonian” concept [43, 44] is meant to be a way to retain much of the accuracy of ab initio quantum chemistry, like that from CCSD, but in a way that permits ∼500–1000 atoms to be described by QM forces within a time-frame that can be tied to dynamics. We will first consider the wavefunction viewpoint and then that from DFT. After discussing the formal structure, we will specialize to a particular form for the transfer Hamiltonian and illustrate its application with numerical results.

Figure 5. Computational accuracy and efficiency of available potential forms compared to the transfer Hamiltonian.
3. Transfer Hamiltonian: Wavefunction Approach
In the correlated CC theory we start with the time-independent Schrödinger equation,

$$H\Psi = E\Psi \tag{1}$$

$$\Psi = \exp(T)|0\rangle \tag{2}$$

and introduce the CC ansatz by writing the wavefunction in the exponential form of Eq. (2). The operator $T$ is

$$T = T_1 + T_2 + T_3 + \cdots \tag{3}$$

$$T_1 = \sum_{a,i} t_i^a \{a^\dagger i\} \tag{4}$$

$$T_2 = \sum_{i>j,\,a>b} t_{ij}^{ab} \{a^\dagger i\, b^\dagger j\} \tag{5}$$

$$T_3 = \sum_{i>j>k,\,a>b>c} t_{ijk}^{abc} \{a^\dagger i\, b^\dagger j\, c^\dagger k\} \tag{6}$$

$T_1$ generates all single excitations from the vacuum, i.e., $T_1|0\rangle = \sum_{a,i} t_i^a |\Phi_i^a\rangle$, where $|\Phi_i^a\rangle$ is a singly excited determinant and the vacuum is usually HF (but could equally well be the Kohn–Sham determinant); a single excitation means excitation of an electron from an occupied orbital to an unoccupied one. We use the convention that $i, j, k, l$ represent orbitals occupied in the Fermi vacuum, while $a, b, c, d$ are unoccupied, and $p, q, r, s$ are unspecified. $T_2$ does the same for the double excitations, and $T_3$ for the triple excitations. Continuation through $T_n$ for $n$ electrons will give the full CI solution. Multiplying the Schrödinger equation from the left by $\exp(-T)$, the critical quantity in CC theory is the similarity-transformed Hamiltonian,

$$\exp(-T)\,H\exp(T) = \bar{H} \tag{7}$$

where the Schrödinger equation becomes

$$\bar{H}|0\rangle = E|0\rangle \tag{8}$$

$|0\rangle$ is the Fermi vacuum, or an independent-particle wavefunction, but $E(R) = \langle 0|\bar{H}|0\rangle$ is the exact energy at a given geometry, and the exact forces subject to atomic displacement are

$$\nabla E(R) = F(R) \tag{9}$$

The effects of electron correlation are contained in the cluster amplitudes, whose equations at a given $R$ are $Q_n \bar{H}|0\rangle = 0$, where $Q_1 = |\Phi_i^a\rangle\langle\Phi_i^a|$, $Q_2 = |\Phi_{ij}^{ab}\rangle\langle\Phi_{ij}^{ab}|$, $Q_3 = |\Phi_{ijk}^{abc}\rangle\langle\Phi_{ijk}^{abc}|$, and so on. The $Q_1$ projections give the equations for $\{t_i^a\}$, and similarly for the other amplitudes. Limiting ourselves to single and double excitations, we have CCSD, which is a highly correlated, accurate wavefunction. Consideration of triples provides CCSDT, the state of the art, while for practical application its non-iterative form CCSD[T], and especially its improved modification CCSD(T), is currently considered the “gold standard” for most molecular studies [36, 43].
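To make the projection equations concrete, the following minimal sketch applies them to a hypothetical two-state model (a reference determinant |0⟩ and one excited determinant |1⟩) that is not taken from the chapter. With T = t|1⟩⟨0|, the operator is nilpotent, so exp(T) = 1 + T exactly, and the amplitude equation ⟨1|H̄|0⟩ = 0 can be iterated to self-consistency. The energy ⟨0|H̄|0⟩ should then coincide with the lowest eigenvalue of H, the analogue of full CI for this model. All matrix elements are arbitrary illustrative numbers.

```python
import math

# Hypothetical 2x2 Hamiltonian in the basis {|0> (reference), |1> (excited)}.
H00, H01, H11 = -1.0, 0.2, 0.5   # arbitrary illustrative values

def hbar_elements(t):
    """Elements of Hbar = exp(-T) H exp(T) for T = t |1><0|.

    Because T*T = 0, exp(T) = 1 + T and exp(-T) = 1 - T exactly.
    Returns (<0|Hbar|0>, <1|Hbar|0>).
    """
    energy = H00 + H01 * t
    residual = H01 + (H11 - H00) * t - H01 * t * t
    return energy, residual

def solve_amplitude(max_iter=100, tol=1e-14):
    """Fixed-point iteration on the amplitude equation <1|Hbar|0> = 0."""
    t = 0.0
    for _ in range(max_iter):
        t_new = -H01 / (H11 - H00 - H01 * t)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t

t = solve_amplitude()
e_cc, residual = hbar_elements(t)

# Exact lowest eigenvalue of the 2x2 matrix, for comparison.
e_exact = 0.5 * (H00 + H11) - math.sqrt(0.25 * (H11 - H00) ** 2 + H01 ** 2)
print(f"t = {t:.6f}, E(CC) = {e_cc:.10f}, E(exact) = {e_exact:.10f}")
```

The first pass of the iteration, t = −H01/(H11 − H00), is the first-order perturbation estimate; later passes fold in the nonlinear term, which is the same pattern followed by production CC amplitude solvers on a much larger scale.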
Regardless of the choice of excitation, $\bar{H}$ may be written in second quantization as

$$\bar{H} = \bar{h}^{p}_{q}\, p^\dagger q + \frac{1}{2}\,\bar{g}^{pq}_{rs}\, p^\dagger q^\dagger s\, r + \mathrm{III} + \mathrm{IV} + \cdots \tag{10}$$

where summation of repeated indices is assumed and III and IV indicate three- and four-body operators. The indices can indicate either atomic or molecular orbitals. More explicitly, $\bar{g}^{pq}_{rs} = \langle pq|rs\rangle = (pr|qs) = \int d1\, d2\, \phi_p^*(1)\phi_r(1)\,\bar{g}_{12}\,\phi_q^*(2)\phi_s(2)$, where the latter two-electron integral indicates the interaction between the electron distributions associated with electrons 1 and 2, respectively. We use $\bar{g}_{12}$ instead of $r_{12}^{-1}$ because in the generalized form for $\bar{H}$ there may be additional operators of two-electron type besides just the familiar integrals. Such one- and two-electron quantities, further separated into one, two, and more atomic centers, are the quantities that will have to be computed or, in the case of simplified theories, approximated, to provide the results we require. At this point, we have an explicitly correlated, many-particle theory. It is important to distinguish this from an effective one-particle theory as in DFT or Hartree–Fock, which are much easier to apply to complicated systems. To make this connection, we choose to reformulate the many-particle theory into an effective one-particle form. This is accomplished by insisting that the energy variation $\delta E = 0$, which means that the derivative of $E$ with respect to the orbitals that will compose the single determinant, $|\Phi\rangle$, vanishes. As our expressions for $t_{ij\ldots}^{ab\ldots}$, the CC equations, will depend upon the integrals over these orbitals, and consequently on $\bar{H}$, this procedure is iterative. As any such variation of a determinant can be written in the form $|\Phi\rangle = \exp(T_1)|0\rangle$, the single-excitation projection of $\bar{H}$ has to vanish,

$$\langle\Phi_i^a|\bar{H}|0\rangle = 0 \tag{11}$$

$$= \langle a|h_T|i\rangle \tag{12}$$
where we introduce the “transfer Hamiltonian” operator, $h_T$. Since this matrix element vanishes between the occupied orbital, $i$, and the unoccupied orbital, $a$, we can use the resolution of the identity $1 = \sum_j |j\rangle\langle j| + \sum_b |b\rangle\langle b|$ to rewrite this equation in the familiar form,

$$h_T|i\rangle = \sum_j \lambda_{ji}|j\rangle = \varepsilon_i|i\rangle \tag{13}$$
where the first form retains the off-diagonal Lagrangian multipliers, while the second is canonical. The above can equally well be done for HF-SCF theory, except that $h_T = f = t + v + J - K = h + J - K$, where we have the kinetic-energy operator and the electron–nuclear attraction term, $-\sum_A Z_A/|r - R_A|$, combined together into the one-particle element of Eq. (13), and the Coulomb repulsion and the non-local exchange operator, respectively. The Hartree–Fock effective one-particle operator is $J - K = \sum_j \int d2\, \phi_j^*(2)(1 - P_{12})\phi_j(2)$, and there would be no correlation in the Fock operator. In that case, $\varepsilon_i$ provides the negative of the Koopmans’ estimate of ionization potentials, and $\varepsilon_a$ the Koopmans’ approximation to the electron affinities. For the correlated $h_T$, which is the one-particle theory originally due to Brueckner [45, 46], all single excitations vanish from the exact wavefunction, and as a consequence, we have maximum overlap of the Brueckner determinant with the exact wavefunction, $|\langle\Phi_B|\Psi\rangle|$. In general, Brueckner theory is not Hermitian, but in any order of perturbation theory we can insist upon its hermiticity, i.e., $\langle i|h_T|a\rangle = 0$, and that will be sufficient for our purposes. The specific form for the transfer Hamiltonian matrix element is

$$\langle a|h_T|i\rangle = \langle a|f|i\rangle + \frac{1}{2}\left(\langle aj\|cb\rangle\, t_{ij}^{cb} - \langle kj\|ib\rangle\, t_{kj}^{ab}\right) \tag{14}$$
where summation over repeated indices is implied. Keeping the form of the $h_T$ operator in the $\langle a|h_T|i\rangle$ matrix element the same, when a is replaced by an occupied orbital, m, we have

$$\langle m|h_T|i\rangle = \langle m|f|i\rangle + \tfrac{1}{2}\langle mj||cb\rangle t_{ij}^{cb} - \langle kj||ib\rangle t_{kj}^{mb} \qquad (15)$$
Then, we have the Hartree–Fock-like equations but now for the correlated one-particle operator, $h_T$, represented in the basis set, $|\chi\rangle$, where $S = \langle\chi|\chi\rangle$ is the overlap matrix,

$$h_T C = S C \epsilon \qquad (16)$$
and the (molecular) orbitals are $|\phi\rangle = |\chi\rangle C$. The Brueckner determinant, $\Phi_B$, is composed of the lowest n occupied MOs, $|\phi_0\rangle = |\chi\rangle C_0$. In particular, the matrix elements for the transfer Hamiltonian in terms of the atomic orbital basis set are

$$\langle\mu|h_T|\nu\rangle = h^{\mu}_{\nu} + P_{\alpha\beta}\,(g^{\mu\alpha}_{\nu\beta} - g^{\mu\alpha}_{\beta\nu}) \qquad (17)$$
$$P_{\mu\nu} = c_{\mu i}\, c_{i\nu} \qquad (18)$$

(summation of repeated indices is assumed), where $P_{\mu\nu}$ is the density matrix for the Brueckner determinant. Hence, subject to modified definitions for $h^{\mu}_{\nu}$ and $g^{\mu\alpha}_{\nu\beta}$, which we will assume are renormalized to include the critical parts of the three- and higher-electron effects, we have the matrix which contains the exact ionization potentials for the system.
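As a rough illustration of Eqs. (17) and (18), the following sketch builds the density matrix from occupied orbital coefficients and contracts it with antisymmetrized two-electron elements. The function names, the nested-list storage `g[mu][alpha][nu][beta]`, the restriction to real coefficients, and all numbers in the two-orbital toy model are assumptions made for this example only; they are not parameters of the transfer Hamiltonian.

```python
import math


def density_matrix(c_occ):
    """P[mu][nu] = sum_i c[mu][i] * c[nu][i] for real occupied coefficients (Eq. 18)."""
    nbasis = len(c_occ)
    nocc = len(c_occ[0])
    return [[sum(c_occ[m][i] * c_occ[n][i] for i in range(nocc))
             for n in range(nbasis)] for m in range(nbasis)]


def transfer_matrix(h, g, p):
    """Eq. (17): hT[mu][nu] = h[mu][nu] + sum_ab P[a][b] * (g[mu][a][nu][b] - g[mu][a][b][nu])."""
    n = len(h)
    ht = [[h[m][v] for v in range(n)] for m in range(n)]
    for m in range(n):
        for v in range(n):
            for a in range(n):
                for b in range(n):
                    ht[m][v] += p[a][b] * (g[m][a][v][b] - g[m][a][b][v])
    return ht


# Toy model (hypothetical numbers): two orthonormal basis functions, one occupied orbital.
h = [[-1.0, -0.5], [-0.5, -1.0]]
g = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
g[0][1][0][1] = g[1][0][1][0] = 0.4   # a single Coulomb-type element between the two functions
c_occ = [[1.0 / math.sqrt(2.0)], [1.0 / math.sqrt(2.0)]]

p = density_matrix(c_occ)
ht = transfer_matrix(h, g, p)
```

In an orthonormal basis the trace of `p` equals the number of occupied orbitals, which is a convenient sanity check before the matrix is passed to an eigenvalue solver for Eq. (16).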
The total energy,

$$E = \langle\Phi_B|\bar{H}|\Phi_B\rangle = \sum_i \langle i|h|i\rangle + \tfrac{1}{2}\sum_{i,j}\langle ij||ij\rangle + \tfrac{1}{4}\sum_{i,j,a,b}\langle ij||ab\rangle t_{ij}^{ab} \qquad (19)$$
$$= \sum_i \langle i|h|i\rangle + \tfrac{1}{2}\sum_{i,j}\langle ij|\bar{g}_{12}|ij\rangle \qquad (20)$$
$$= \tfrac{1}{2}\,\mathrm{Tr}\,P(h + \tilde{h}_T) \qquad (21)$$
$$\bar{g}_{12} = r_{12}^{-1} + \tfrac{1}{2}\sum_{a,b} T_2\,||ab\rangle\langle ab|| \qquad (22)$$

is also written in terms of the reference density matrix $P = C_0 C_0^{\dagger}$, evaluated from the occupied orbital coefficients, $C_0$. The quantity $\tilde{h}_T$ differs from the form in Eq. (15), because of the absence of the third term on the RHS. This term is an orbital relaxation term that only pertains to the ionization potentials, as there we would need to allow the system to relax after the ionization. Hence, it cannot contribute to the ground state energy, and the manifestation of that is that the total energy cannot be written in terms of the exact ionization potentials in Eq. (13), but can be written in terms of an approximation introduced by $\tilde{h}_T$. The analytical forces for MD can be written easily, as well. Notice that $h_T$ includes all electron correlation. Once $h^{\mu}_{\nu}$ and $g^{\mu\alpha}_{\nu\beta}$ are specified, which need to be viewed as quantities to be determined to reproduce the reference results from ab initio correlated calculations, we obtain self-consistent solutions for the correlated, effective, one-particle Hamiltonian. The self-consistency is essential in accounting for bond-breaking and associated charge rearrangement. The overlap matrix is included for generality, but as is often done in NDDO type theories, enforcing the ZDO approximation removes it. Another way to view this is to assume the parameters are based upon using the orthonormal expansion basis, $|\chi'\rangle = |\chi\rangle S^{-1/2}$, which gives $h'_T = S^{-1/2} h_T S^{-1/2}$. Developing this expression to include some low-order terms in S permits us to still retain the simpler and computationally faster orthogonal form of the eigenvalue equation, yet introduce what is sometimes called “Pauli repulsion” in the semi-empirical community [22]. A self-consistent solution provides the coefficients, C, and the reference orbital energies, $\epsilon$, which, as we discussed, are not the exact Ip’s that would come from including the contributions of the $t^{mb}_{jk}$ amplitudes, which contain three hole lines and one particle line. Such terms arise in the generalized EOM or Fock space CC theory for ionized, electron attached, and excited states. In lowest order, $t^{mb}_{jk} = \langle mb||jk\rangle/(\epsilon_j + \epsilon_k - \epsilon_b - \epsilon_m)$.
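The orthogonalization step mentioned above can be made concrete for the smallest possible case. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes two normalized basis functions with a hypothetical overlap s, for which $S^{-1/2}$ has a closed form from the eigenvalues 1 ± s, and transforms a hypothetical 2×2 $h_T$ into the orthonormal basis. The function names and numbers are invented for the example.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lowdin_inverse_sqrt_2x2(s):
    """S^(-1/2) for S = [[1, s], [s, 1]] with |s| < 1, built from the eigenvalues 1 +/- s."""
    a = 1.0 / math.sqrt(1.0 + s)
    b = 1.0 / math.sqrt(1.0 - s)
    return [[0.5 * (a + b), 0.5 * (a - b)],
            [0.5 * (a - b), 0.5 * (a + b)]]


def orthogonalize(h, s):
    """h' = S^(-1/2) h S^(-1/2) for the two-function basis."""
    x = lowdin_inverse_sqrt_2x2(s)
    return matmul(matmul(x, h), x)


s = 0.2                                   # hypothetical overlap
overlap = [[1.0, s], [s, 1.0]]
x = lowdin_inverse_sqrt_2x2(s)
check = matmul(matmul(x, overlap), x)     # should be the unit matrix
h_prime = orthogonalize([[-1.0, -0.5], [-0.5, -1.0]], s)

# Lowest-order expansion S^(-1/2) ~ 1 - (S - 1)/2: the off-diagonal element is about -s/2.
first_order_offdiag = -0.5 * s
```

Comparing `x[0][1]` with `first_order_offdiag` gives a feel for how quickly a low-order expansion in the overlap converges when s is small.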
4. Transfer Hamiltonian: Density Functional Viewpoint
The DFT approach to the $h_T$ starts from a different premise that is actually simpler, since DFT is already exact in an independent particle form, unlike the usual many-particle theory above. As is well known, we have the Kohn–Sham one-particle Hamiltonian [32] whose first n eigenvectors give the exact density,

$$h_S = t + v + J + V_x + V_c \qquad (23)$$
$$h_S|i\rangle = \epsilon_i|i\rangle \qquad (24)$$
$$h_S C = S C \epsilon \qquad (25)$$
$$\rho(1) = \sum_i \phi_i(1)\phi_i^*(1) = \sum_{\mu,\nu} \chi_\mu(1) P_{\mu\nu} \chi_\nu^*(1) \qquad (26)$$

and like the above, the density matrix is $P = C_0 C_0^{\dagger}$. The highest-occupied MO, n, has the property that $\epsilon_n = -\mathrm{Ip}(n)$. However, solving these equations does not provide an energy until we know the functional $E_{xc}[\rho]$, from which we know that $\delta E_{xc}[\rho]/\delta\rho(1) = V_{xc}(1)$, to close the cycle. The objective of DFT is to get the density, $\rho$, first; and then all other ground state properties follow; in particular, the energy and forces we need for MD. The transfer Hamiltonian in this case will be defined by the condition that $\rho_{\mathrm{CCSD}} = \rho_{\mathrm{KS}}$. Satisfying this condition means that we could obtain a $V_{xc}$ from this density by using the ZMP method [47], but our approach is simply to parameterize the elements in $h_S = h_T$ in analogy with that in semi-empirical quantum chemistry or TB such that the density condition is satisfied. This should specify $V_{xc}$, and indeed, the other terms in $h_T$, which is then sufficient to obtain the forces, $\{\partial E(R)/\partial X_A\}$. Note this bypasses the need to use an explicit $E_{xc}[\rho]$, but, of course, that would always be an option. We can also bypass any explicit treatment of the kinetic energy operator by virtue of parametrization of $h = t + v$ as in the semi-empirical approach discussed below. Besides the density condition, we also have the option to use the force condition in the sense that the forces can be obtained from CC theory, and then their values directly used to obtain the parameterized version of $h_S = h_T$. Ideally, the parameters will be able to describe both the densities and the forces, although this raises the issue of the long-standing inability of semi-empirical methods to describe structures and spectra with the same parameters, discussed further in the last section. As our objective is to be able to define an $h_T$ that will satisfy many of the essential elements of ab initio theory, the quantities of interest besides the forces are the density, the ionization potential, and the electron affinity.
The latter define the Mulliken electronegativity, $EN = (I + A)/2$, which should help to ensure that our calculations correctly describe the charge distribution in a system and the density. We also know that the correct long-range behavior of the density is determined by the HOMO ionization potential, $\rho(r) \propto \exp(-2\sqrt{2I}\,r)$, which is a property of exact DFT. If the density is right, then we also know that we will get the correct dipole moments for the molecules involved, and this is likely to be critical if we hope to correctly describe polar systems like water, along with their hydrogen bonding.
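The two conditions quoted above, the Mulliken electronegativity and the asymptotic decay of the density, are simple enough to evaluate directly. The sketch below uses the standard Mulliken definition (I + A)/2 and the decay constant $2\sqrt{2I}$ in atomic units; the function names are assumptions for this example, and the hydrogen-atom values serve only as a check, since the exact 1s density decays as exp(−2r).

```python
import math

HARTREE_EV = 27.211386  # eV per hartree (rounded)


def mulliken_electronegativity(ip_ev, ea_ev):
    """Mulliken electronegativity (I + A) / 2, in the same units as its inputs."""
    return 0.5 * (ip_ev + ea_ev)


def density_decay_constant(ip_ev):
    """kappa in rho(r) ~ exp(-kappa * r) with kappa = 2 * sqrt(2 * I); I in hartree, r in bohr."""
    return 2.0 * math.sqrt(2.0 * ip_ev / HARTREE_EV)


# Hydrogen atom as a check: I = 0.5 hartree, so kappa should be 2 bohr^-1.
ip_h = 0.5 * HARTREE_EV
ea_h = 0.754                      # approximate electron affinity of H in eV
kappa_h = density_decay_constant(ip_h)
en_h = mulliken_electronegativity(ip_h, ea_h)
```

A parameterized $h_T$ whose HOMO eigenvalue reproduces the reference ionization potential would, by this relation, also carry the reference decay length of the density tail.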
5. What About Semi-Empirical Methods?
Before embarking upon a particular form for the transfer Hamiltonian that must inevitably be of semi-empirical or TB type, we can ask what kind of accuracy is possible with such methods. In a recent paper on PM5, a parameterized NDDO Hamiltonian [20, 21], Stewart reports that the PM5 heats of formation for more than 1000 molecules composed of H, C, N, O, F, S, Cl, Br, and I have a mean absolute deviation (MAD) of 4.6 kcal/mol, nearly the same as DFT using BLYP or BPW91. The errors of PM3 are slightly larger (5.2), as are those of AM1 (7.2). The largest errors are 27.2 (PM5), 35.1 (PM3), and 54.8 (AM1), compared with 55.7 for BLYP and 34.5 for BPW91. Using a TZ instead of a DZ basis for the latter gives some improvement in the worst cases. For Jorgensen’s reparameterized PM3 and MNDO methods, referred to as PDDG [22, 25], the MAD heats of formation for 662 molecules limited to H, C, N, and O are reduced from 8.4 to 5.2, and with some extra PDDG additions, from 4.4 to 3.2 kcal/mol. For geometries, PDDG gets bond lengths to a MAD of 0.016 Å, bond angles to 2.3°, and dihedral angles to 29.0°. The principal Ip is typically within ∼0.5 eV – though it can be off by several – which is some 3% more accurate than PM3 and 12% less accurate than PM5. For dipole moments, the MAD is 0.24 Debye. There is less information about transition states and activation barriers, but these methods have seen extensive use for such problems in chemistry. Recent TB work termed SCC-DFTB, for self-consistent charge density functional TB [11], is based upon DFT rather than HF and is less empirical, but is still simplified using similar approximations for two-center interactions as in NDDO, discussed below. It is developed for solids as well as molecules. For the latter, in 63 organic examples the MADs are 0.012 Å in bond lengths and 1.80° in angles. For heats of reaction, in 36 example molecules composed of H, C, N, O the MAD is 12.5 kcal/mol compared to 11.1 for DFT-LSD. On the other hand, we can have dramatic failures.
None of these new semi-empirical methods yet even treat Si, much less heavier elements of the sort that are important in many materials applications. To quote just one example, in comparisons of nine Zn complexes with B3LYP and CCSD(T), “MNDO/d failed the case study” and the errors compared to ab initio or DFT were dramatic. The authors [48] say “No one semiempirical model is applicable for the calculations of the whole variety of structures found in Zn chemistry.”
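The statistics quoted in this section are mean absolute deviations and largest errors against reference data. For readers who want to apply the same measure to their own parameterization, a minimal sketch follows; the function name and the four heats of formation are hypothetical and are not taken from any of the cited benchmark sets.

```python
def mad_and_max_error(calculated, reference):
    """Return (mean absolute deviation, largest absolute error) for paired values."""
    errors = [abs(c - r) for c, r in zip(calculated, reference)]
    return sum(errors) / len(errors), max(errors)


# Hypothetical heats of formation in kcal/mol, for illustration only.
calculated = [-57.0, -20.5, 10.0, -95.2]
reference = [-57.8, -17.9, 12.5, -94.1]
mad, worst = mad_and_max_error(calculated, reference)
```

Reporting the largest error alongside the MAD, as the studies above do, matters because a small average can hide the occasional dramatic failure.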
6. Forms for Transfer Hamiltonian
Our objective is to model $h_T$ for the particular phenomena of interest and for chosen representative systems (i.e., unlike normal semi-empirical theory we do not expect the parameters to describe many elements at once) in a way that permits the routine, self-consistent treatment of a very large number of the same kinds of atoms. We also recognize that the traditional approaches are built upon approximating the HF-SCF one-particle Hamiltonian, f, not the more exact DFT or Brueckner approach discussed above. Also, traditionally, only a minimum basis set of an s orbital on H, and one s and a set of p orbitals on the other atoms, is used, until d orbitals are occupied. Thinking more like ab initio theory, we do not presuppose such restrictions, but will use polarization functions and potentially double zeta sets of s and p orbitals on all atoms. We recognize the attraction of a transfer Hamiltonian that (1) consists solely of atomic parameters; and (2) is essentially two-atom in form, as all three- and four-center contributions are excluded. This is the fundamental premise of all neglect of differential overlap approximations [15, 17, 19]. Hence, as a first realization, guided by many years of semi-empirical quantum chemistry, we choose the “neglect of diatomic differential overlap” (NDDO) Hamiltonian,

$$\langle\mu|h_T|\nu\rangle = \alpha_{\mu\nu}\delta_{\mu\nu}\Big|_{\mu\in A} + \sum_{\substack{\mu=\alpha,\ \nu=\beta \\ \mu,\nu\in A}} P_{\alpha\beta}(\mu\alpha|\nu\beta) - \sum_{\substack{\mu\beta=\nu\alpha,\ \mu\neq\beta \\ \mu,\beta\in A}} P_{\alpha\beta}(\mu\beta|\nu\alpha)$$
$$\qquad + \tfrac{1}{2}(\beta_\mu + \beta_\nu)S_{\nu\mu}\Big|_{\mu\in A,\ \nu\in B} + \sum_{\substack{\mu=\alpha\in A,\ \nu,\beta\in B \\ \nu=\beta\in B,\ \mu,\alpha\in A}} P_{\alpha\beta}(\mu\alpha|\nu\beta) - \sum_{\mu\neq\beta\in A,\ \nu\neq\alpha\in B} P_{\alpha\beta}(\mu\beta|\nu\alpha) \qquad (27)$$
consisting of atomic and diatomic units. $\alpha_{\mu\mu}$ is a purely atomic quantity that represents the one-particle part of the energy of an electron in its atomic orbital. We would have different values for s, p, d, . . . orbitals, collectively indicated as $\alpha_A$. The one-center, two-electron terms for atom A are separated into Coulomb and exchange terms and weighted by the density matrix. No explicit correlation operator as in DFT is yet considered. Instead, modifications (parameterizations) of the Coulomb and exchange terms are viewed as potentially accomplishing the same objective. $\beta_\mu$ is an atomic parameter indicative of each orbital type (s, p, d) on atom A and $S_{\mu\nu}$ is the overlap integral between, formally, two atomic orbitals on atoms A and B. A Slater type orbital on atom A is $\chi_A = r_A^{n-1}\exp(-\zeta_A r_A)\,Y_{l,m}(\vartheta_A, \varphi_A)$, and the overlap integral, $S_{\mu\nu}(\zeta_A, \zeta_B)$, depends upon $\zeta_A$ and $\zeta_B$, so it is entirely determined by what the atoms are. So it, too, consists of atomic parameters.
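To show how an overlap integral reduces to atomic exponents and a distance, the sketch below evaluates the normalized radial part of a Slater type orbital and the textbook closed form for the overlap of two 1s orbitals that share one exponent. The function names are assumptions for this example, and the general case with different exponents and higher quantum numbers used in NDDO codes is not covered.

```python
import math


def sto_radial(r, n, zeta):
    """Normalized radial STO: N * r^(n-1) * exp(-zeta * r), N = (2 zeta)^(n + 1/2) / sqrt((2n)!)."""
    norm = (2.0 * zeta) ** (n + 0.5) / math.sqrt(math.factorial(2 * n))
    return norm * r ** (n - 1) * math.exp(-zeta * r)


def overlap_1s_1s(zeta, distance):
    """Overlap of two 1s STOs with the same exponent: exp(-p) * (1 + p + p^2 / 3), p = zeta * R."""
    p = zeta * distance
    return math.exp(-p) * (1.0 + p + p * p / 3.0)


def radial_norm(n, zeta, step=0.001, r_max=40.0):
    """Numerical check of the normalization: integral of R(r)^2 r^2 dr by a midpoint sum."""
    total = 0.0
    for k in range(int(r_max / step)):
        r = (k + 0.5) * step
        total += sto_radial(r, n, zeta) ** 2 * r * r * step
    return total


s_h2 = overlap_1s_1s(1.0, 1.4)   # two hydrogen 1s functions 1.4 bohr apart
```

For unit exponents at 1.4 bohr the overlap is about 0.75, which illustrates why neglecting overlap outright is a strong approximation and why low-order overlap corrections are of interest.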
The terms which include density matrix elements account for the two-electron repulsion terms which depend upon the purely one-center two-electron integral type, $(\mu_A\nu_A|\mu_A\nu_A) = \gamma^{\mu\nu}_{AA}$. A typical choice for the two-center, two-electron term then becomes [49, 50]

$$(\mu_A\nu_A|\mu_B\nu_B) \propto \left[r_{AB}^2 + (c_A^{\mu\nu} + c_B^{\mu\nu})^2\right]^{-1/2} \qquad (28)$$
where $r_{AB} = R_{AB} + q_i$ and the additive terms $c^{\mu\nu}$ are numerically determined such that the two-center repulsion integral goes to the proper one-center limiting value. $R_{AB}$ is the distance between atoms A and B, but differs from $r_{AB}$ due to the multipole method used to compute the two-electron integral. For $(s_A s_A|p_B p_B)$, a monopole and quadrupole are used for the p electron distribution while a monopole is used for the s distribution. The radial extent of the multipoles is given by $q_i = q_p^B$, and is a function of the atomic orbital exponent $\zeta_B$ on atom B. This form for the two-electron integrals assumes the correct long-range (1/R) behavior. More general forms for the two-center, two-electron integrals combine such contributions together from several multipoles to distinguish (ss|ss) from (ss|dd), etc. [19, 51]. This set of approximations defines the NDDO form of the matrix elements of $h_T$ between two atomic orbitals. Now we have to consider the nuclear repulsion contribution to the energy, $\sum_{A,B} Z_A Z_B/R_{AB}$. Importantly, and unlike in ab initio theory, the effective atomic number, $Z_A$, which is chosen initially to be equal to the number of valence electrons being contributed by atom A, is also made a function of all $R_{AB}$ in the system. This introduces several new parameters into the calculation, justified roughly by some ideas of electron screening. The AM1 choice [16] for the latter reflects screening of the effective nuclear charge with the parameterized form

$$E_{CR} = Z_A Z_B\,(s_A s_A|s_B s_B)\left[1 + e^{-d_A R_{AB}} + e^{-d_B R_{AB}}\right] + \frac{Z_A Z_B}{R_{AB}}\left[\sum_k a_{Ak}\, e^{-b_A (R_{AB} - C_{kA})^2} + \sum_k a_{Bk}\, e^{-b_B (R_{AB} - C_{kB})^2}\right] \qquad (29)$$
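A short numerical sketch may help in reading Eqs. (28) and (29). It assumes atomic units, a unit prefactor in Eq. (28), and the common choice c = 1/(2γ) so that two identical centers at zero separation return the one-center value γ; the function names and every numerical value are hypothetical and are not AM1 or transfer Hamiltonian parameters.

```python
import math


def two_center_integral(r_ab, c_a, c_b):
    """Eq. (28) with unit prefactor: [r^2 + (cA + cB)^2]^(-1/2), in atomic units."""
    return 1.0 / math.sqrt(r_ab ** 2 + (c_a + c_b) ** 2)


def additive_term(gamma_aa):
    """Choose c so that identical centers at r = 0 reproduce gamma_AA: c = 1 / (2 gamma)."""
    return 0.5 / gamma_aa


def core_repulsion(r_ab, z_a, z_b, ss_integral, d_a, d_b, gauss_a, gauss_b, b_a, b_b):
    """Eq. (29); gauss_a and gauss_b are lists of (a_k, C_k) pairs for atoms A and B."""
    screened = z_a * z_b * ss_integral * (
        1.0 + math.exp(-d_a * r_ab) + math.exp(-d_b * r_ab))
    gaussians = sum(a * math.exp(-b_a * (r_ab - c) ** 2) for a, c in gauss_a)
    gaussians += sum(a * math.exp(-b_b * (r_ab - c) ** 2) for a, c in gauss_b)
    return screened + z_a * z_b / r_ab * gaussians


gamma = 0.45                               # hypothetical one-center integral (hartree)
c = additive_term(gamma)
one_center_limit = two_center_integral(0.0, c, c)
long_range = two_center_integral(100.0, c, c)
e_cr = core_repulsion(3.0, 4.0, 6.0, two_center_integral(3.0, c, c),
                      2.0, 2.5, [(0.1, 1.5)], [(-0.05, 2.0)], 5.0, 5.0)
```

The two limits are the useful checks: `one_center_limit` returns γ, and `long_range` approaches 1/R, which is the behavior the text requires of the two-center integrals.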
These core repulsion (CR) parameters, d, b, a and C, account for the nuclear repulsion, which means they contribute to total energies and forces, but not to purely electronic results. The latter depend upon the electronic parameters $\beta_A$, $\gamma_{AA}$, $\alpha_A$, . . . . In our work, both sets are specified via a genetic algorithm to ensure that correlated CCSD results are obtained for representative systems, tailored to the phenomena of interest. Looking at the above approximations, we see that we retain only one- and two-center two-electron integrals. In principle, we can have a three-center one-electron integral from $\langle\mu_A|\,Z_C/|r - R_C|\,|\nu_B\rangle$, but in NDDO, such terms are excluded as well. Any approximation of $h_T$ that is to be tied to ab initio
results has to have the property of “saturation.” To achieve this, we insist that our form for $h_T$ be fundamentally short range. We see from the above that our $h_T$ depends on two-center interactions, but unlike TB, not just those for the nearest neighbor atoms but for all the two-body interactions in the system. This short-range character helps to saturate the atomic parameters for comparatively small example systems that are amenable to ab initio correlated methods. Then once the atomic parameters are obtained, and found to be unchanged to within a suitable tolerance when redetermined for larger clusters, they define a saturated, self-consistent, correlated, effective one-particle Hamiltonian that can be readily solved for quite large systems to rapidly determine the forces required for MD. We also have easy access to the second derivatives (Hessians) for definitive saddle point determination, vibrational frequencies, and interpolation between calculations at different points for MD. Using H2O as an example for saturation, we can obtain the cartesian force matrix for the monomer by insisting that our simplified Hamiltonian provide the same force curves as a function of interatomic separation for breaking the O–H bond with the other degrees of freedom being optimum (i.e., a distinguished reaction path). Call this matrix $F_A$. From $F_A$ we use a GA to obtain the Hamiltonian parameters that, in turn, determine the h and g elements that make our transfer Hamiltonian reproduce these values. The more meaningful gradient norm |F| is used in practice rather than the individual cartesian elements. Now consider two water molecules interacting. The principal new element is the dihedral angle that orients one monomer relative to the other, but the H-bonding and dipole–dipole interaction will cause some small change when we break an O–H bond in the dimer. Our first approximation is $F_{AB} = F_A + F_B + V_{AB}$. Then by changing our parameters to accommodate the dimer bond breaking, we get slightly modified h and g elements in the transfer Hamiltonian. This makes $F_{AB} = F'_A + F'_B$, with the primed quantities evaluated from the modified parameters. Going to a third unit, we would add $V_{ABC}$, $V_{AC}$, and $V_{BC}$ perturbations and repeat the process to define $F_{ABC} = F''_A + F''_B + F''_C$. Since these atomic based interactions will rapidly fall off with distance, we expect that relatively quickly we would have a saturated set of parameters for the bond breaking in water with a relatively small number of clusters. We can obviously look at other properties, too, such as dipole moments, cluster structures, etc., to assess their degree of saturation with our $h_T$ parameters. If we fail to achieve a satisfactory saturation, then we have to pursue more flexible, or more accurate, forms of transfer Hamiltonians. It is essential to identify the terms that matter, and the DFT form provides complementary input to the wavefunction approach in this regard. Also, unlike most semi-empirical methods, we do not limit ourselves to a minimum basis set. The general level we would anticipate is CCSD with a double-zeta + polarization basis, while dropping the core electrons. This is viewed as the quality of ab initio result that we would pursue for complicated molecules.
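The saturation test described above amounts to refitting the parameters for clusters of increasing size and stopping once no parameter moves by more than a chosen tolerance. A minimal sketch of that bookkeeping follows; the function names, the parameter labels, the tolerance, and the fitted values are all hypothetical and stand in for the output of the actual fitting procedure.

```python
def is_saturated(previous, current, tol):
    """True when every parameter changed by less than tol between two cluster sizes."""
    return all(abs(current[name] - previous[name]) < tol for name in previous)


def first_saturated_size(fits, tol):
    """fits: list of (cluster_size, parameter_dict) ordered by increasing cluster size.

    Returns the first cluster size whose refit leaves all parameters unchanged
    to within tol relative to the preceding size, or None if that never happens.
    """
    for (_, old), (size, new) in zip(fits, fits[1:]):
        if is_saturated(old, new, tol):
            return size
    return None


# Hypothetical refits for the water monomer, dimer, and trimer.
fits = [
    (1, {"beta_O": -6.100, "d_O": 2.500}),
    (2, {"beta_O": -6.020, "d_O": 2.440}),
    (3, {"beta_O": -6.015, "d_O": 2.438}),
]
saturated_at = first_saturated_size(fits, tol=0.01)
```

The same check can be applied to derived properties such as dipole moments or cluster structures by storing them in the dictionaries alongside the parameters.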
In addition, following the equation-of-motion (EOM) CC approach [52], we insist that

$$\bar{H} R_k|0\rangle = \omega_k R_k|0\rangle \qquad (30)$$

where $R_k \exp(T)|0\rangle = \Psi_k$ and $\omega_k$ is the excitation energy for any ionized, $I_k$, electron-attached, $A_k$, or excited state. In other words, this provides Ips and Eas that tie to the Mulliken electronegativity, to help to ensure that our transfer Hamiltonian represents the correct charge distribution and density size. Furthermore, whereas forces and geometries are highly sensitive to the core-repulsion parameters, properties like I and A are sensitive to the electronic parameters in the transfer Hamiltonian. The transfer Hamiltonian procedure is far more general than the particular choice of Hamiltonian made here, since we can choose any expansion of $\bar{H}$ or $h_T$ that is formally correct and include elements to be computed or parameters to be determined, to define a transfer Hamiltonian. Furthermore, we can insist that it satisfy suitable exact and consistency conditions such as having the correct asymptotic or scaling behavior. Other desirable conditions might include the satisfaction of the virial and Hellmann–Feynman theorems. We can also choose to compute many of the terms, like the one-center ones, ab initio, and keep those values fixed subsequently. Then, our simplified forms $\tfrac{1}{2}(\beta_\mu + \beta_\nu)S_{\nu\mu}$ and that of Eq. (29) are the only ones where there is an electronic dependence upon geometry. Adding this dependence to that from the core–core repulsions has to provide the forces that drive the MD. We can explore many other practical approximations, such as suppressing self-consistency by setting P = 1 and imposing the restriction that only nearest neighbor two-atom interactions be retained, to extract a non-self-consistent TB Hamiltonian that should be very fast in application. We can obviously make many other choices and create, perhaps, a series of improving approximations to the ab initio results that parallel their computational demands.
7. Numerical Illustrations
As an illustration of the procedure, consider the prototype system for an Si–O–Si bond as in silica, pyrosilicic acid (Fig. 6). This molecule has been frequently used as a simple model for silica. We are interested in the Si–O bond rupture. Hence, we perform a series of CCSD calculations as a function of the Si–O distance all the way to the separated radical units, ·Si(OH)3 and ·O–Si(OH)3, relaxing all other degrees of freedom at each point (while avoiding any hydrogen bonding, which would be artificial for silica) using now well-known CC analytical gradient techniques [36]. For each point we compute the gradient norm of the forces for the 3N cartesian coordinates, $q_I$ (3 per atom A), $|F| = \left[\sum_I^{3N} (\partial E/\partial q_I)^2\right]^{1/2}$, and use the genetic algorithm PIKAIA [53] to minimize the difference $|F(\mathrm{CCSD}) - F(h_T)|$ between the transfer Hamiltonian and the CCSD solution. This is shown in Fig. 7. Since forces drive the MD, their determination is more relevant for the problem than the potential energy curves themselves. For this case, we find that adjusting only the parameters in our transfer Hamiltonian that are associated with the core-repulsion function is sufficient, leaving the electronic parameters at the standard values for the AM1 method. As seen in Fig. 7, these new parameters are responsible for removing AM1’s excessive repulsion at short Si–O distances and its erroneous behavior shortly beyond the equilibrium point. Hence, to a small tolerance, the transfer Hamiltonian provides the same forces as the highly sophisticated ab initio CCSD method. In a second study, QM forces permit the description of different electronic states. As an example, for this system we can also separate pyrosilicic acid into charged fragments, Si(OH)3+ and O–Si(OH)3−, and in a material undergoing bond-breaking, we would expect to take multiple paths such as this. A classical potential has no such capability. Figure 8 shows the curve, and once again we obtain a highly accurate representation from the transfer Hamiltonian, with the same parameters obtained for the radical dissociation. Hence, our transfer Hamiltonian has the capability of describing the effects of these different electronic states in simulations, which, besides enabling reliable descriptions of bond-breaking, should have an essential role if a material’s optical properties are of interest. Figure 9 shows the integrated force curves to illustrate that even though the parameters were determined from the forces, the associated potential energy surfaces are also accurate compared to the reference CCSD results, and more accurate than the conventional AM1 results. The latter has an error of ∼0.4 eV between the neutral and charged paths compared to the CCSD results. We have also investigated the parameter saturation. Moving to trisilicic acid we obtain the reference results without any further change in our parameters.

Figure 6. Structure of pyrosilicic acid.

Figure 7. Comparison of forces from standard semi-empirical theory (AM1) and the transfer Hamiltonian (TH-CCSD) with coupled-cluster (CCSD) results for dissociation of pyrosilicic acid into neutral fragments.
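The fitting step for the pyrosilicic acid forces can be outlined in a few lines. The sketch below is not PIKAIA and not the authors' objective function: it uses a minimal elitist evolutionary search, a hypothetical Morse-type stand-in for the model force norm, and synthetic reference data generated from known parameters, purely to show how the gradient norm enters the misfit that is minimized. All names and numbers are assumptions.

```python
import math
import random


def gradient_norm(gradient):
    """|F| = sqrt(sum_I (dE/dq_I)^2) over all 3N Cartesian components."""
    return math.sqrt(sum(g * g for g in gradient))


def morse_force_norm(r, params, r0=1.6):
    """Hypothetical one-coordinate stand-in for |F(hT)|: |dE/dr| of a Morse curve."""
    depth, width = params
    x = math.exp(-width * (r - r0))
    return abs(2.0 * depth * width * (1.0 - x) * x)


def misfit(params, distances, reference):
    """Summed |F(reference) - F(model)| over the sampled bond distances."""
    return sum(abs(morse_force_norm(r, params) - f)
               for r, f in zip(distances, reference))


def evolve(objective, start, sigma, generations, children, seed=0):
    """Minimal elitist evolutionary search: keep the best of the parent and its mutants."""
    rng = random.Random(seed)
    best, best_val = list(start), objective(start)
    for _ in range(generations):
        for _ in range(children):
            trial = [p + rng.gauss(0.0, sigma) for p in best]
            val = objective(trial)
            if val < best_val:
                best, best_val = trial, val
    return best, best_val


distances = [1.4 + 0.2 * i for i in range(12)]
reference = [morse_force_norm(r, (4.0, 1.8)) for r in distances]  # synthetic reference forces


def objective(params):
    return misfit(params, distances, reference)


start = [3.0, 1.5]
initial_misfit = objective(start)
fitted, final_misfit = evolve(objective, start, sigma=0.05, generations=200, children=10)
```

In a real application `morse_force_norm` would be replaced by a self-consistent solution of the parameterized Hamiltonian at each geometry, and `reference` by the CCSD gradient norms along the dissociation path.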
Figure 8. Comparison of forces from standard semi-empirical theory (AM1) and the Transfer Hamiltonian (TH-CCSD) with coupled-cluster (CCSD) results for dissociation of pyrosilicic acid into charged fragments.
Figure 9. Comparison of PES for dissociation of pyrosilicic acid. Each curve is labeled by the Hamiltonian used and the dissociation path followed.
The correct description of complicated phenomena in materials requires that the approach be able to describe, accurately, a wealth of different valence states and coordination states of the relevant atoms involved. For example, the surface structure of silica is known to show three-, four-, and five-coordinate Si atoms. Hence, a critical test of the $h_T$ is how well its form can account for the observed structure of such species with the same parameters already determined for bond breaking. In Figs. 10 and 11, we show comparisons of the $h_T$ results for some Si$_x$O$_y$ molecules with DFT (B3LYP), various two-body classical potentials [54, 55], a three-body potential [56] frequently used in simulations, and molecular mechanics [26]. The reference values are from CCSD(T), which are virtually the same as the experimental values when available. The $h_T$ results are competitive with DFT and superior to all classical forms, including even MM with standard parameterization. The latter is usually quite accurate for molecular structures at equilibrium geometries, but not necessarily for SiO2. MM methods do not attempt to describe bond breaking. The comparative timings using the various methods are shown in Table 2 for two different sized systems, pyrosilicic acid and a 108-atom SiO2 nanorod [57]. The 216-atom version is shown in Fig. 12. For pyrosilicic acid, the $h_T$ procedure is more than three orders of magnitude faster than the Gaussian basis B3LYP DFT calculation (0.17 s versus 375 s), which in turn is more than an order of magnitude faster than CCSD [ACES II] (8656 s). The 108-atom nanorod is clearly well beyond the capacity of CCSD ab initio calculations, but even the DFT result (in this case with a plane wave basis using the BO-LSD-MD (GGA) program) is excessive, while the $h_T$ is again three to four orders of magnitude faster. With streamlining of programs, we expect that this can still be significantly improved.
Figure 10. Error in computed Si$_x$O$_y$ equilibrium bond lengths relative to CCSD(T) using various potentials.
Figure 11. Error in computed Si$_x$O$_y$ equilibrium bond angles relative to CCSD(T) using various potentials.
Table 2. Comparative timings for electronic structure calculations (IBM RS/6000)

Method    Pyrosilicic acid, CPU time (s)    108-atom nanorod, CPU time (s)
CCSD      8656                              N/A
DFT       375                               85,019
h_T       0.17                              43
BKS       0.001                             0.02
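The speed ratios implied by Table 2 can be read off directly as base-10 logarithms of the timing ratios. The short sketch below does only that arithmetic on the tabulated CPU times; the dictionary layout and function name are assumptions for the example.

```python
import math

# CPU times in seconds, copied from Table 2.
timings = {
    "pyrosilicic acid": {"CCSD": 8656.0, "DFT": 375.0, "hT": 0.17, "BKS": 0.001},
    "108-atom nanorod": {"DFT": 85019.0, "hT": 43.0, "BKS": 0.02},
}


def orders_of_magnitude(slow, fast):
    """log10 of the ratio of two CPU times."""
    return math.log10(slow / fast)


acid = timings["pyrosilicic acid"]
rod = timings["108-atom nanorod"]
dft_over_ht_acid = orders_of_magnitude(acid["DFT"], acid["hT"])
ccsd_over_dft_acid = orders_of_magnitude(acid["CCSD"], acid["DFT"])
dft_over_ht_rod = orders_of_magnitude(rod["DFT"], rod["hT"])
```

With the tabulated values these come out near 3.3, 1.4, and 3.3, respectively.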
Finally, to illustrate the results of a simulation we consider the 216-atom SiO2 system of Fig. 12, subject to a uniaxial stress, using various classical potentials and our QM transfer Hamiltonian. The equilibrated nanorod was subjected to uniaxial tension by assigning a fixed velocity (25 m/s) in the loading direction to the 15 atoms in the caps at each end of the rod. The stress was computed by summing the forces in the end caps and dividing by the projected cross sectional area at each time step. The simulations evolved for (approximately) 10 ps, during which the system temperature was maintained at 1 K by velocity rescaling. Figure 13 shows the computed stress–strain curves. The main differences between the classical potentials and the QM potential seem to be the difference at the maximum and the long tail indicating surface reconstruction. The QM potential shows the expected brittle fracture, perhaps a little more than the classical potentials. The transfer Hamiltonian retains self-consistency and state specificity, and permits readily adding other molecules to simulations after ensuring that they, too, reflect the reference ab initio values for their various interactions. Hence, the transfer Hamiltonian built upon NDDO or more general forms would seem to offer a practical approach to moving toward the objective of predictive simulations. In Fig. 14 we show the same kind of information about bond-breaking in water, showing the substantial superiority of the $h_T$ results compared to standard AM1. A well-known failing of semi-empirical methods is their inability to correctly describe H-bonding. In Fig. 15 we compare the equilibrium structure of the water dimer obtained from the $h_T$, ab initio MBPT(2), and standard semi-empirical theory. The $h_T$ describes the notoriously difficult water dimer in excellent agreement with the first-principles calculations, contrary to AM1, which leads to errors in the donor–acceptor O–H bond of 0.15 Å. In this example, we have to change the electronic parameters along with the core–core repulsion. We would expect this to be the case for most applications. In the future, we hope we can develop the $h_T$ to the point that we will have an accurate, QM description of water and its interactions with other species.
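For the nanorod tension simulation described earlier in this section, the stress and strain bookkeeping reduces to two small formulas. The sketch below assumes forces in eV/Å and areas in Å², assumes that the two caps move apart at the stated speed in opposite directions, and uses a hypothetical rod length, cross section, and cap forces; none of these numbers are taken from the actual simulation.

```python
EV_PER_A3_TO_GPA = 160.2176634  # 1 eV/Angstrom^3 expressed in GPa


def cap_stress_gpa(cap_forces_ev_per_a, area_a2):
    """Sum of loading-direction forces on the cap atoms divided by the cross section."""
    return sum(cap_forces_ev_per_a) / area_a2 * EV_PER_A3_TO_GPA


def engineering_strain(pull_speed_m_s, time_ps, initial_length_a):
    """Strain when both caps recede at pull_speed; 1 m/s equals 0.01 Angstrom/ps."""
    elongation_a = 2.0 * pull_speed_m_s * 0.01 * time_ps
    return elongation_a / initial_length_a


# Hypothetical snapshot: 15 cap atoms each carrying 0.1 eV/Angstrom, 100 Angstrom^2 cross section.
stress = cap_stress_gpa([0.1] * 15, 100.0)
# 25 m/s for 10 ps on a hypothetical 50 Angstrom rod.
strain = engineering_strain(25.0, 10.0, 50.0)
```

At 25 m/s each cap travels 2.5 Å in 10 ps, which sets the strain range a simulation of that length can cover for a given rod length.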
Figure 12. Silica nanorod containing 216 atoms.
Figure 13. Stress–strain curve for 216-atom silica nanorod with classical and quantum potentials.
Figure 14. Comparison of forces for O–H bond breaking in water monomer.
Figure 15. Structure of water dimer using transfer Hamiltonian, MBPT(2), and standard AM1 Hamiltonian. Bond lengths in angstroms and angles in degrees.
8. Future
This article calls for some expectations for the future. We have little doubt that the future will demand QM potentials and forces in simulations. It seems to be the single most critical, unsolved, requirement if we aspire toward “predictive” quality. If we could use high-level CC forces in simulations for realistic systems, we would be as confident of our results – as long as the phenomena of interest is amendable to classical MD – as we would be for the determination of molecular properties at that level of theory and basis. Of course, in many cases we cannot run MD for long enough time periods to allow some phenomena to manifest themselves, perhaps forcing more of a kinetic Monte Carlo time extension at that point. We clearly also need much accelerated MD methods regardless of the choice of forces. Like the above NDDO and TB methods, DFT as used in practice, is also a “semi-empirical” theory, as methods like B3LYP now use many parameters to define their functionals and potentials. Even the bastion of state-of-the-art ab initio correlated methods – coupled-cluster theory – is not exact because it depends upon a basis set, as shown in the examples in the introduction. Since even DFT cannot generally be used in MD simulations involving more than ∼300 atoms, to make progress in this field demands that we have “simplified” methods that we can argue retain ab initio or DFT accuracy but now for
Achieving predictive simulations with quantum mechanical forces
53
>1000 atoms, and that can be readily tied to simulations. In this article, we have suggested a procedure for doing so. We showed that the many-electron CC theory could be reformulated into a single determinant form, but at the cost λδη of having a procedure to reliably introduce the quantites we called gνµ ,gλδ µν , gµν , etc. These are complicated quantities that in an ab initio calculation would depend upon one- and two-electron integrals over the basis functions and the cluster amplitudes in T . We could directly compute these elements from ab initio CC methods, to assess their more detailed importance and behavior, and expect to do so. But we prefer, initially, to obtain most of these elements from consideration of a smaller set of quantities and parameters like those in NDDO, or perhaps in TB; and investigatewhether those limited numbers of parameters will be capable of fixing hT = µ,ν |µµ|hT |νν| to the required accuracy. We believe in ensuring that hT has the correct long- and short-range behavior, including the united atom and the separated atom limits. We also want to make sure that the proper balance between the core–core repulsions and the electronic energy is maintained. In our opinion, this is the origin of the age-old problem in semi-empirical theory, that there needs to be different parameters for the total energy, forces, transition states, and those for purely electronic parameters like the electronic density, or photo-electron, or electronic spectra. The same features are observed in solid state applications where the accuracy of cohesive energies and lattice parameters does not transfer to the band structure. Such electronic properties do not depend upon the core– core repulsion at all, yet for many of the total energy properties, as we saw for SiO2 , only the core repulsion parameters need to be changed to get agreement with CCSD. This is not surprising. 
For total energies and forces, we are fitting the difference between two large numbers, which is much easier to fit than the much larger electronic energy, itself. It would be nice to develop a method that fully accounts for whatever the appropriate cancellation of the core–core effects with the electronic effects from the beginning. Only an ability to describe both reliably will pay the dividends of a truly predictive theory. DFT, MP2, and even higher level methods will continue to progress using local criteria [41], linear scaling, various density fitting tricks [58] and a wealth of other schemes; but regardless, if we can make a transfer Hamiltonian that is already ∼4–5 orders of magnitude faster than DFT, retain and transfer the predictive quality of ab initio or DFT results for clusters to very large molecules, there will always be a need to describe much larger systems accurately and smaller systems faster. In fact, it might be argued, that if such a procedure can be created that will be able to correctly reproduce high-level ab initio results for representative clusters – and fulfill the saturation property we emphasized – the final results might well exceed those from a purely ab initio or DFT method for ∼1000 atoms. The compromises made to make such large molecule applications possible, even at one geometry, forces
R.J. Bartlett et al.
restricting the basis sets, or number of grid points, or other assorted elements to accommodate the size of system. In principle, the transfer Hamiltonian would not be similarly compromised. Its compromises lie elsewhere.

Acknowledgments
This work was supported by the National Science Foundation under grant numbers DMR-9980015 and DMR-0325553.
References
[1] P.R. Westmoreland, P.A. Kollman, A.M. Chaka, P.T. Cummings, K. Morokuma, M. Neurock, E.B. Stechel, and P. Vashishta, “Applications of molecular and materials modeling,” NSF, DOE, NIST, DARPA, AFOSR, NIH, 2002.
[2] ACES II is a program product of the Quantum Theory Project, University of Florida. Authors: J.F. Stanton, J. Gauss, J.D. Watts, M. Nooijen, N. Oliphant, S.A. Perera, P.G. Szalay, W.J. Lauderdale, S.A. Kucharski, S.R. Gwaltney, S. Beck, A. Balková, D.E. Bernholdt, K.K. Baeck, P. Rozyczko, H. Sekino, C. Hober, and R.J. Bartlett. Integral packages included are VMOL (J. Almlöf and P.R. Taylor); VPROPS (P. Taylor); ABACUS (T. Helgaker, H.J. Aa. Jensen, P. Jørgensen, J. Olsen, and P.R. Taylor).
[3] D.T. Griggs and J.D. Blacic, “Quartz – anomalous weakness of synthetic crystals,” Science, 147, 292, 1965.
[4] G.V. Gibbs, “Molecules as models for bonding in silicates,” Am. Mineral, 67, 421, 1982.
[5] A. Post and J. Tullis, “The rate of water penetration in experimentally deformed quartzite, implications for hydrolytic weakening,” Tectonophysics, 295, 117, 1998.
[6] R. Hoffmann, “An extended Hückel theory. I. Hydrocarbons,” J. Chem. Phys., 39, 1397, 1963.
[7] M. Wolfsberg and L. Helmholz, “The spectra and electronic structure of the tetrahedral ions MnO4−, CrO4−−, and ClO4−,” J. Chem. Phys., 20, 837, 1952.
[8] J.C. Slater and G.F. Koster, “Simplified LCAO method for the periodic potential problem,” Phys. Rev., 94, 1167, 1954.
[9] W.A. Harrison, “Coulomb interactions in semiconductors and insulators,” Phys. Rev. B, 31, 2121, 1985.
[10] O.F. Sankey and D.J. Niklewski, “Ab initio multicenter tight binding model for molecular dynamics simulations and other applications in covalent systems,” Phys. Rev. B, 40, 3979, 1989.
[11] M. Elstner, D. Porezag, G. Jungnickel, J. Elsner, M. Haugk, T. Frauenheim, S. Suhai, and G. Seifert, “Self-consistent charge density functional tight binding method for simulations of complex materials properties,” Phys. Rev. B, 58, 7260, 1998.
[12] M.W. Finnis, A.T. Paxton, M. Methfessel, and M. van Schilfgaarde, “Crystal structures of zirconia from first principles and self-consistent tight binding,” Phys. Rev. Lett., 81, 5149, 1998.
[13] R. Pariser, “Theory of the electronic spectra and structure of the polyacenes and of alternant hydrocarbons,” J. Chem. Phys., 24, 250, 1956.
Achieving predictive simulations with quantum mechanical forces
[14] R. Pariser and R.G. Parr, “A semi-empirical theory of electronic spectra and electronic structure of complex unsaturated molecules,” J. Chem. Phys., 21, 466, 1953.
[15] M.J.S. Dewar and G. Klopman, “Ground states of sigma bonded molecules. I. A semi-empirical SCF MO treatment of hydrocarbons,” J. Am. Chem. Soc., 89, 3089, 1967.
[16] M.J.S. Dewar, J. Friedheim, G. Grady, E.F. Healy, and J.J.P. Stewart, “Revised MNDO parameters for silicon,” Organometallics, 5, 375, 1986.
[17] J.A. Pople, D.P. Santry, and G.A. Segal, “Approximate self-consistent molecular orbital theory. I. Invariant procedures,” J. Chem. Phys., 43, S129, 1965.
[18] J.A. Pople, D.L. Beveridge, and P.A. Dobosh, “Approximate self-consistent molecular orbital theory. 5. Intermediate neglect of differential overlap,” J. Chem. Phys., 47, 2026, 1967.
[19] J.J.P. Stewart, In: K.B. Lipkowitz and D.B. Boyd (eds.), Reviews in Computational Chemistry, VCH Publishers, Weinheim, 1990.
[20] J.J.P. Stewart, “Comparison of the accuracy of semiempirical and some DFT methods for predicting heats of formation,” J. Mol. Model, 10, 6, 2004.
[21] J.J.P. Stewart, “Optimization of parameters for semiempirical methods. IV. Extension of MNDO, AM1, and PM3 to more main group elements,” J. Mol. Model, 10, 155, 2004.
[22] W. Thiel, “Perspectives on semiempirical molecular orbital theory,” Adv. Chem. Phys., 93, 703, 1996.
[23] K.M. Merz, “Semiempirical quantum chemistry: where we are and where we are going,” Abstr. Pap. Am. Chem. Soc., 224, 205, 2002.
[24] M.P. Repasky, J. Chandrasekhar, and W.L. Jorgensen, “PDDG/PM3 and PDDG/MNDO: improved semiempirical methods,” J. Comput. Chem., 23, 1601, 2002.
[25] I. Tubert-Brohman, C.R.W. Guimaraes, M.P. Repasky, and W.L. Jorgensen, “Extension of the PDDG/PM3 and PDDG/MNDO semiempirical molecular orbital methods to the halogens,” J. Comput. Chem., 25, 138, 2003.
[26] M.R. Frierson and N.L. Allinger, “Molecular mechanics (MM2) calculations on siloxanes,” J. Phys. Org. Chem., 2, 573, 1989.
[27] I. Rossi and D.G. Truhlar, “Parameterization of NDDO wavefunctions using genetic algorithms – an evolutionary approach to parameterizing potential energy surfaces and direct dynamics for organic reactions,” Chem. Phys. Lett., 233, 231, 1995.
[28] K. Runge, M.G. Cory, and R.J. Bartlett, “The calculation of thermal rate constants for gas phase reactions: the quasi-classical flux–flux autocorrelation function (QCFFAF) approach,” J. Chem. Phys., 114, 5141, 2001.
[29] S. Sekusak, M.G. Cory, R.J. Bartlett, and A. Sabljic, “Dual-level direct dynamics of the hydroxyl radical reaction with ethane and haloethanes: toward a general reaction parameter method,” J. Phys. Chem. A, 103, 11394, 1999.
[30] R.J. Bartlett, “Coupled-cluster approach to molecular structure and spectra – a step toward predictive quantum chemistry,” J. Phys. Chem., 93, 1697, 1989.
[31] T. Helgaker, P. Jørgensen, and J. Olsen, Molecular Electronic Structure Theory, John Wiley and Sons, West Sussex, England, 2000.
[32] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev., 140, 1133, 1965.
[33] J.P. Perdew and W. Yue, “Accurate and simple density functional for the electronic exchange energy – generalized gradient approximation,” Phys. Rev. B, 33, 8800, 1986.
[34] A. Becke, “Density functional thermochemistry 3. The role of exact exchange,” J. Chem. Phys., 98, 5648, 1993.
[35] D.E. Woon and T.H. Dunning, Jr., “Gaussian basis sets for use in correlated molecular calculations. 4. Calculation of static electrical response properties,” J. Chem. Phys., 100, 2975, 1994.
[36] R.J. Bartlett, “Coupled-cluster theory: an overview of recent developments,” In: D. Yarkony (ed.), Modern Electronic Structure, II, World Scientific, Singapore, pp. 1047–1131, 1995.
[37] K. Bak, P. Jørgensen, J. Olsen, T. Helgaker, and W. Klopper, “Accuracy of atomization energies and reaction enthalpies in standard and extrapolated electronic wave function/basis set calculations,” J. Chem. Phys., 112, 9229, 2000.
[38] T. Helgaker, J. Gauss, P. Jørgensen, and J. Olsen, “The prediction of molecular equilibrium structures by the standard electronic wave functions,” J. Chem. Phys., 106, 6430, 1997.
[39] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, “Concurrent coupling of length scales: methodology and application,” Phys. Rev. B, 60, 2391, 1999.
[40] F. Abraham, J. Broughton, N. Bernstein, and E. Kaxiras, “Spanning the length scales in dynamic simulation,” Computers in Phys., 12, 538, 1998.
[41] M. Schütz and H.J. Werner, “Local perturbative triples correction (T) with linear cost scaling,” Chem. Phys. Lett., 318, 370, 2000.
[42] J. Cioslowski, S. Patchkovskii, and W. Thiel, “Electronic structures, geometries, and energetics of highly charged cations of the C-60 fullerene,” Chem. Phys. Lett., 248, 116, 1996.
[43] R.J. Bartlett, “Electron correlation from molecules to materials,” In: A. Gonis, N. Kioussis, and M. Ciftan (eds.), Electron Correlations and Materials Properties 2, Kluwer/Plenum, Dordrecht, pp. 219–236, 2003.
[44] C.E. Taylor, M.G. Cory, R.J. Bartlett, and W. Thiel, “The transfer Hamiltonian: a tool for large scale simulations with quantum mechanical forces,” Comput. Mater. Sci., 27, 204, 2003.
[45] K.A. Brueckner, “Many body problem for strongly interacting particles. 2. Linked cluster expansion,” Phys. Rev., 100, 36, 1955.
[46] P.O. Löwdin, “Studies in perturbation theory. 5. Some aspects on exact self-consistent field theory,” J. Math. Phys., 3, 1171, 1962.
[47] Q. Zhao, R.C. Morrison, and R.G. Parr, “From electron densities to Kohn–Sham kinetic energies, orbital energies, exchange-correlation potentials, and exchange correlation energies,” Phys. Rev. A, 50, 2138, 1994.
[48] M. Brauer, M. Kunert, E. Dinjus, M. Klussmann, M. Döring, H. Görls, and E. Anders, “Evaluation of the accuracy of PM3, AM1 and MNDO/d as applied to zinc compounds,” J. Mol. Struct. (Theochem), 505, 289, 2000.
[49] G. Klopman, “Semiempirical treatment of molecular structures. 2. Molecular terms + application to diatomic molecules,” J. Am. Chem. Soc., 86, 4550, 1964.
[50] K. Ohno, “Some remarks on the Pariser–Parr–Pople method,” Theor. Chim. Acta, 2, 219, 1964.
[51] M.J.S. Dewar and W. Thiel, “A semiempirical model for the two-center repulsion integrals in the NDDO approximation,” Theor. Chim. Acta, 46, 89, 1977.
[52] J.F. Stanton and R.J. Bartlett, “The equation of motion coupled-cluster method – a systematic biorthogonal approach to molecular excitation energies, transition probabilities and excited state properties,” J. Chem. Phys., 98, 7029, 1993.
[53] P. Charbonneau, “Genetic algorithms in astronomy and astrophysics,” Astrophys. J. (Suppl), 101, 309, 1995.
[54] S. Tsuneyuki, H. Aoki, M. Tsukada, and Y. Matsui, “First-principle interatomic potential of silica applied to molecular dynamics,” Phys. Rev. Lett., 61, 869, 1988.
[55] B.W.H. van Beest, G.J. Kramer, and R.A. van Santen, “Force fields for silicas and aluminophosphates based on ab initio calculations,” Phys. Rev. Lett., 64, 1955, 1990.
[56] P. Vashishta, R.K. Kalia, J.P. Rino, and I. Ebbsjö, “Interaction potential for SiO2 – a molecular dynamics study of structural correlations,” Phys. Rev. B, 41, 12197, 1990.
[57] T. Zhu, J. Li, S. Yip, R.J. Bartlett, S.B. Trickey, and N.H. de Leeuw, “Deformation and fracture of a SiO2 nanorod,” Mol. Simul., 29, 671, 2003.
[58] M. Schütz and F.R. Manby, “Linear scaling local coupled cluster theory with density fitting. Part I: 4-external integrals,” Phys. Chem. – Chem. Phys., 5, 3349, 2003.
1.4 FIRST-PRINCIPLES MOLECULAR DYNAMICS
Roberto Car¹, Filippo de Angelis², Paolo Giannozzi³, and Nicola Marzari⁴
¹ Department of Chemistry and Princeton Materials Institute, Princeton University, Princeton, NJ, USA
² Istituto CNR di Scienze e Tecnologie Molecolari ISTM, Dipartimento di Chimica, Università di Perugia, Via Elce di Sotto 8, I-06123, Perugia, Italy
³ Scuola Normale Superiore and National Simulation Center, INFM-DEMOCRITOS, Pisa, Italy
⁴ Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
S. Yip (ed.), Handbook of Materials Modeling, 59–76. © 2005 Springer. Printed in the Netherlands.

Ab initio or first-principles methods have emerged in the last two decades as a powerful tool to probe the properties of matter at the microscopic scale. These approaches are used to derive macroscopic observables under the controlled condition of a “computational experiment,” and with a predictive power rooted in the quantum-mechanical description of interacting atoms and electrons. Density-functional theory (DFT) has become de facto the method of choice for most applications, due to its combination of reasonable scaling with system size and good accuracy in reproducing most ground state properties. Such an electronic-structure approach can then be combined with classical molecular dynamics to provide an accurate description of thermodynamic properties and phase stability, atomic dynamics, and chemical reactions, or as a tool to sample the features of a potential energy surface.

In a molecular-dynamics (MD) simulation the microscopic trajectory of each individual atom in the system is determined by integration of Newton’s equations of motion. In classical MD, the system is considered to be composed of massive, point-like nuclei, with forces acting between them derived from empirical effective potentials. Ab initio MD maintains the same assumption of treating atomic nuclei as classical particles; however, the forces acting on them are considered quantum mechanical in nature, and are derived from an electronic-structure calculation. The approximation of treating quantum-mechanically only the electronic subsystem is usually perfectly appropriate, due to the large difference in mass between electrons and nuclei. Nevertheless, nuclear quantum effects can sometimes be relevant, especially for light elements such as hydrogen; classical or ab initio path integral approaches can then be applied, albeit at a higher computational cost. The use of Newton’s equations of motion for the nuclear evolution implies that vibrational degrees of freedom are not quantized, and will follow Boltzmann statistics. This approximation becomes fully justified only for temperatures comparable with the highest vibrational level in the system considered.

In the following, we will describe the combined approach of Car and Parrinello to determine the simultaneous “on-the-fly” evolution of the (Newtonian) nuclear degrees of freedom and of the electronic wavefunctions, as implemented in a modern density-functional code [1] based on plane-wave basis sets, and with the electron–ion interactions described by ultrasoft pseudopotentials [2].
1. Total Energies and the Ultrasoft Pseudopotential Method
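Within DFT, the ground-state energy of a system of $N_v$ electrons, whose one-electron Kohn–Sham (KS) orbitals are $\phi_i$, is given by

$$E_{\mathrm{tot}}[\{\phi_i\},\{\mathbf{R}_I\}] = \sum_i \left\langle \phi_i \left| -\frac{\hbar^2}{2m}\nabla^2 + V_{NL} \right| \phi_i \right\rangle + E_H[n] + E_{xc}[n] + \int d\mathbf{r}\, V^{\mathrm{ion}}_{\mathrm{loc}}(\mathbf{r})\, n(\mathbf{r}) + U(\{\mathbf{R}_I\}), \tag{1}$$

where the $i$ index runs over occupied KS orbitals ($N_v/2$ for closed-shell systems) and $n(\mathbf{r})$ is the electron density. $E_H[n]$ is the Hartree energy, defined as

$$E_H[n] = \frac{e^2}{2} \iint d\mathbf{r}\, d\mathbf{r}'\, \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}, \tag{2}$$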
$E_{xc}[n]$ is the exchange and correlation energy, $\mathbf{R}_I$ are the coordinates of the $I$th nucleus, $\{\mathbf{R}_I\}$ is the set of all nuclear coordinates, and $U(\{\mathbf{R}_I\})$ is the nuclear Coulomb interaction energy. In typical first-principles MD implementations, pseudopotentials (PPs) are used to describe the interaction between the valence electrons and the ionic core, which includes the nucleus and the core electrons. The use of PPs simplifies the many-body electronic problem by avoiding an explicit description of the core electrons, which in turn greatly reduces the number of orbitals and allows the use of plane waves as a basis set. In the following, we will consider the general case of ultrasoft PPs [2], which includes as a special case norm-conserving PPs [3] in separable form. The PP is composed of a local part $V^{\mathrm{ion}}_{\mathrm{loc}}$, given by a sum of atom-centred radial potentials:

$$V^{\mathrm{ion}}_{\mathrm{loc}}(\mathbf{r}) = \sum_I V^{I}_{\mathrm{loc}}(|\mathbf{r} - \mathbf{R}_I|) \tag{3}$$
and a nonlocal part $V_{NL}$:

$$V_{NL} = \sum_{nm,I} D^{(0)}_{nm}\, |\beta^I_n\rangle \langle \beta^I_m|, \tag{4}$$

where the functions $\beta^I_n$ and the coefficients $D^{(0)}_{nm}$ characterize the PP and are specific for each atomic species. For simplicity, we will consider only a single atomic species in the following. The $\beta^I_n$ functions, centred at site $\mathbf{R}_I$, depend on the nuclear positions via

$$\beta^I_n(\mathbf{r}) = \beta_n(\mathbf{r} - \mathbf{R}_I). \tag{5}$$
Here $\beta_n$ is the product of an angular momentum eigenfunction in the angular variables and a radial function which vanishes outside the core region; the indices $n$ and $m$ in Eq. (4) run over the total number $N_\beta$ of these functions. The electron density entering Eq. (1) is given by

$$n(\mathbf{r}) = \sum_i \left[ |\phi_i(\mathbf{r})|^2 + \sum_{nm,I} Q^I_{nm}(\mathbf{r})\, \langle \phi_i | \beta^I_n \rangle \langle \beta^I_m | \phi_i \rangle \right], \tag{6}$$

where the sum runs over occupied KS orbitals. The augmentation functions $Q^I_{nm}(\mathbf{r}) = Q_{nm}(\mathbf{r} - \mathbf{R}_I)$ are localized in the core. The ultrasoft PP is fully determined by the quantities $V^{I}_{\mathrm{loc}}(r)$, $D^{(0)}_{nm}$, $Q_{nm}(\mathbf{r})$, and $\beta_n(\mathbf{r})$. The functions $Q_{nm}(\mathbf{r})$ are related to atomic orbitals via $Q_{nm}(\mathbf{r}) = \psi^{ae*}_n(\mathbf{r})\, \psi^{ae}_m(\mathbf{r}) - \psi^{ps*}_n(\mathbf{r})\, \psi^{ps}_m(\mathbf{r})$, where $\psi^{ae}$ are the all-electron atomic orbitals (not necessarily bound), and $\psi^{ps}$ are the corresponding pseudo-orbitals. The $Q_{nm}(\mathbf{r})$ themselves can be smoothed for computational convenience, by taking a truncated multipole expansion [4]. For the case of norm-conserving PPs the $Q_{nm}(\mathbf{r})$ are identically zero. The KS orbitals obey generalized orthonormality conditions

$$\langle \phi_i | S(\{\mathbf{R}_I\}) | \phi_j \rangle = \delta_{ij}, \tag{7}$$

where $S$ is a Hermitian overlap operator given by

$$S = 1 + \sum_{nm,I} q_{nm}\, |\beta^I_n\rangle \langle \beta^I_m|, \tag{8}$$

and

$$q_{nm} = \int d\mathbf{r}\, Q_{nm}(\mathbf{r}). \tag{9}$$

The orthonormality condition (7) is consistent with the conservation of the charge $\int d\mathbf{r}\, n(\mathbf{r}) = N_v$. Note that the overlap operator $S$ depends on nuclear positions through the $|\beta^I_n\rangle$.
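The ground-state orbitals $\phi_i$ that minimize the total energy (1) subject to the constraints (7) are given by

$$\frac{\delta E_{\mathrm{tot}}}{\delta \phi_i^*(\mathbf{r})} = \epsilon_i\, S \phi_i(\mathbf{r}), \tag{10}$$

where the $\epsilon_i$ are Lagrange multipliers. Equation (10) yields the KS equations

$$H |\phi_i\rangle = \epsilon_i\, S |\phi_i\rangle, \tag{11}$$

where $H$, the KS Hamiltonian, is defined as

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V_{\mathrm{eff}} + \sum_{nm,I} D^I_{nm}\, |\beta^I_n\rangle \langle \beta^I_m|. \tag{12}$$

Here, $V_{\mathrm{eff}}$ is a screened effective local potential

$$V_{\mathrm{eff}}(\mathbf{r}) = V^{\mathrm{ion}}_{\mathrm{loc}}(\mathbf{r}) + V_H(\mathbf{r}) + \mu_{xc}(\mathbf{r}), \tag{13}$$

$\mu_{xc}(\mathbf{r})$ is the exchange-correlation potential

$$\mu_{xc}(\mathbf{r}) = \frac{\delta E_{xc}[n]}{\delta n(\mathbf{r})}, \tag{14}$$

and $V_H(\mathbf{r})$ is the Hartree potential

$$V_H(\mathbf{r}) = e^2 \int d\mathbf{r}'\, \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}. \tag{15}$$

The “screened” coefficients $D^I_{nm}$ appearing in Eq. (12) are defined as

$$D^I_{nm} = D^{(0)}_{nm} + \int d\mathbf{r}\, V_{\mathrm{eff}}(\mathbf{r})\, Q^I_{nm}(\mathbf{r}). \tag{16}$$

The $D^I_{nm}$ depend on the KS orbitals through $V_{\mathrm{eff}}$ (Eq. (13)) and the charge density $n(\mathbf{r})$ (Eq. (6)). Since the KS Hamiltonian in Eq. (11) depends on the KS orbitals $\phi_i$ via the charge density, the solution of Eq. (11) is achieved by an iterative self-consistent field procedure.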
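As an illustration of the generalized eigenvalue structure of Eq. (11), the sketch below solves $Hc = \epsilon S c$ once, for fixed matrices, by a Cholesky reduction to a standard Hermitian problem. It is a minimal sketch under stated assumptions, not the procedure of any particular code: the matrices produced by `toy_matrices` are hypothetical, with $S$ built as the identity plus a single projector in the spirit of Eq. (8), and the value `q = 0.3` is purely illustrative. A self-consistent calculation would repeat such a solve while updating $H$ from the density.

```python
import numpy as np

def solve_generalized_ks(H, S):
    """Solve H c = eps S c for Hermitian H and positive-definite S.

    Returns eigenvalues in ascending order and eigenvectors normalised so
    that C^dagger S C = 1, the matrix form of the condition in Eq. (7).
    """
    L = np.linalg.cholesky(S)            # S = L L^dagger
    Linv = np.linalg.inv(L)
    H_std = Linv @ H @ Linv.conj().T     # equivalent standard Hermitian problem
    eps, Y = np.linalg.eigh(H_std)
    C = Linv.conj().T @ Y                # back-transform the eigenvectors
    return eps, C

def toy_matrices(n_basis=6, seed=0):
    """Hypothetical small H and S = 1 + q |beta><beta| (one projector only)."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n_basis, n_basis))
    H = 0.5 * (M + M.T)                  # real symmetric stand-in for H
    beta = rng.standard_normal(n_basis)
    q = 0.3                              # illustrative value, not from the text
    S = np.eye(n_basis) + q * np.outer(beta, beta)
    return H, S
```

The Cholesky route is chosen here only because it needs nothing beyond NumPy; it relies on $S$ being positive definite, which holds for the toy matrix above but is an assumption in general.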
2. First-Principles Molecular Dynamics: Born–Oppenheimer and Car–Parrinello
We will assume here that all nuclei (together with their core electrons) can be treated as classical particles; furthermore, we consider only systems for which a separation between the classical motion of the atoms and the quantum motion of the electrons can be achieved, i.e., systems satisfying the Born–Oppenheimer adiabatic approximation. For any given ionic configuration, it is possible to calculate the self-consistent electronic ground state, and consequently the forces acting on the ions by virtue of the Hellmann–Feynman theorem. The knowledge of the ionic forces then allows one to evolve the nuclear trajectories in time, using any of the algorithms developed in classical mechanics for the finite-difference solution of Newton’s equations of motion (two of the most popular choices are Verlet algorithms and Gear predictor–corrector approaches). Born–Oppenheimer MD strives for an accurate evolution of the ions by alternately converging the electronic wavefunctions to full self-consistency, for a given set of nuclear coordinates, and then evolving the ions by a finite time step according to the quantum mechanical forces acting on them. A practical algorithm can be summarized as follows:
• self-consistent solution of the KS equations for a given ionic configuration $\{\mathbf{R}_I\}$;
• calculation of the forces acting on the nuclei via the Hellmann–Feynman theorem;
• integration of Newton’s equations of motion for the nuclei;
• update of the ionic configuration.
This way, the nuclei move on the Born–Oppenheimer surface, i.e., with the electrons in their ground state for any instantaneous configuration of the $\{\mathbf{R}_I\}$. An efficient implementation of this class of algorithms relies on efficient self-consistent minimization schemes for the electronic wavefunctions, and on accurate extrapolations of the electronic ground state from one step to the other. The time step itself will only be limited by the need to integrate accurately the highest ionic frequencies. In addition, due to the impossibility of reaching perfect electronic self-consistency, a drift of the constant of motion is unavoidable, and long simulations require the use of a thermostat to compensate.
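The four steps of the Born–Oppenheimer loop can be sketched on a toy problem. The example below is an illustration under stated assumptions rather than a DFT implementation: the "electronic structure" is a hypothetical two-state model Hamiltonian depending on a single nuclear coordinate `R`, the self-consistent solution is replaced by an exact diagonalisation, and all parameters (`K`, `A`, `COUPLING`, `MASS`) are invented for the purpose. The nuclei are propagated with velocity Verlet, one of the Verlet variants mentioned above.

```python
import numpy as np

K, A, COUPLING, MASS = 1.0, 2.0, 0.2, 1.0   # hypothetical model parameters

def hamiltonian(R):
    """Toy two-state 'electronic' Hamiltonian for nuclear coordinate R."""
    return np.array([[0.5 * K * R**2, COUPLING],
                     [COUPLING, 0.5 * K * (R - A)**2]])

def dH_dR(R):
    """Derivative of the model Hamiltonian with respect to R."""
    return np.array([[K * R, 0.0],
                     [0.0, K * (R - A)]])

def ground_state(R):
    """Step 1: electronic ground state (exact diagonalisation stands in for SCF)."""
    eps, vecs = np.linalg.eigh(hamiltonian(R))
    return eps[0], vecs[:, 0]

def hf_force(R):
    """Step 2: Hellmann-Feynman force F = -<psi| dH/dR |psi>."""
    _, psi = ground_state(R)
    return -psi @ dH_dR(R) @ psi

def born_oppenheimer_md(R0, n_steps=2000, dt=0.01):
    """Steps 3 and 4: velocity-Verlet propagation on the ground-state surface."""
    R, V = R0, 0.0
    F = hf_force(R)
    energies = []
    for _ in range(n_steps):
        V += 0.5 * dt * F / MASS
        R += dt * V                      # update of the ionic configuration
        F = hf_force(R)                  # electrons re-converged at the new R
        V += 0.5 * dt * F / MASS
        E0, _ = ground_state(R)
        energies.append(0.5 * MASS * V**2 + E0)
    return R, np.array(energies)
```

Because the electronic problem is solved exactly at every step in this toy model, the constant of motion shows only the small fluctuations of the integrator; the drift discussed above arises in real calculations from incomplete self-consistency.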
On the other hand, the Car–Parrinello approach [5] combines “on-the-fly” the simultaneous classical MD evolution of the atomic nuclei with the determination of the ground-state wavefunction for the electrons. A (fictitious) dynamics for the electronic degrees of freedom is introduced, defining a classical Lagrangian for the combined electronic and ionic degrees of freedom

$$L = \mu \sum_i \int d\mathbf{r}\, |\dot\phi_i(\mathbf{r})|^2 + \frac{1}{2} \sum_I M_I \dot{\mathbf{R}}_I^2 - E_{\mathrm{tot}}(\{\phi_i\},\{\mathbf{R}_I\}); \tag{17}$$

the wavefunctions above are subject to the set of orthonormality constraints

$$N_{ij}(\{\phi_i\},\{\mathbf{R}_I\}) = \langle \phi_i | S(\{\mathbf{R}_I\}) | \phi_j \rangle - \delta_{ij} = 0. \tag{18}$$

Here, $\mu$ is a mass parameter coupled to the electronic degrees of freedom, $M_I$ are the masses of the atoms, and $E_{\mathrm{tot}}$ and $S$ were given in Eqs. (1) and (8), respectively. The first term in Eq. (17) plays the role of a kinetic energy associated with the electronic degrees of freedom. The orthonormality constraints (18) are holonomic and do not lead to energy dissipation in an MD run. The Euler equations of motion generated by the Lagrangian of Eq. (17) under the constraints (18) are:

$$\mu \ddot\phi_i = -\frac{\delta E_{\mathrm{tot}}}{\delta \phi_i^*} + \sum_j \Lambda_{ij}\, S \phi_j, \tag{19}$$

$$\mathbf{F}_I = M_I \ddot{\mathbf{R}}_I = -\frac{\partial E_{\mathrm{tot}}}{\partial \mathbf{R}_I} + \sum_{ij} \Lambda_{ij} \left\langle \phi_i \left| \frac{\partial S}{\partial \mathbf{R}_I} \right| \phi_j \right\rangle, \tag{20}$$
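where $\Lambda_{ij}$ are Lagrange multipliers enforcing orthonormality. If the system is in the electronic ground state corresponding to the nuclear configuration at that time step, the forces acting on the electronic degrees of freedom vanish ($\mu\ddot\phi_i = 0$) and Eq. (19) reduces to the KS equations (10) or (11). A unitary rotation brings the $\Lambda$ matrix into diagonal form: $\Lambda_{ij} = \epsilon_i \delta_{ij}$. Similarly, the equilibrium nuclear configuration is achieved when the atomic forces $\mathbf{F}_I$ in Eq. (20) vanish. In deriving explicit expressions for the forces, Eq. (20), one should keep in mind that the electron density also depends on $\mathbf{R}_I$ through $Q^I_{nm}$ and $\beta^I_n$. Introducing the quantities

$$\rho^I_{nm} = \sum_i \langle \phi_i | \beta^I_n \rangle \langle \beta^I_m | \phi_i \rangle, \tag{21}$$

and

$$\omega^I_{nm} = \sum_{ij} \Lambda_{ij}\, \langle \phi_j | \beta^I_n \rangle \langle \beta^I_m | \phi_i \rangle, \tag{22}$$

we arrive at the expression

$$\mathbf{F}_I = -\frac{\partial U}{\partial \mathbf{R}_I} - \int d\mathbf{r}\, \frac{\partial V^{\mathrm{ion}}_{\mathrm{loc}}}{\partial \mathbf{R}_I}\, n(\mathbf{r}) - \int d\mathbf{r}\, V_{\mathrm{eff}}(\mathbf{r}) \sum_{nm} \frac{\partial Q^I_{nm}(\mathbf{r})}{\partial \mathbf{R}_I}\, \rho^I_{nm} - \sum_{nm} D^I_{nm} \frac{\partial \rho^I_{nm}}{\partial \mathbf{R}_I} + \sum_{nm} q_{nm} \frac{\partial \omega^I_{nm}}{\partial \mathbf{R}_I}, \tag{23}$$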
where $D^I_{nm}$ and $V_{\mathrm{eff}}$ have been defined in Eqs. (16) and (13), respectively. The last term of Eq. (23) gives the constraint contribution to the forces. We underline that the dynamical evolution for the electronic degrees of freedom should not be construed as representing the true electron dynamics; rather, it represents a dynamical system of fictitious degrees of freedom adiabatically decoupled from the moving ions, but driven to follow closely the ionic dynamics, with small and oscillatory departures from what would be the exact Born–Oppenheimer ground-state energy. As a consequence, even the Car–Parrinello dynamics for the nuclei becomes in principle inequivalent to the Born–Oppenheimer dynamics. However, suitable choices for the computational parameters used in the simulation exist, and are such that the two dynamics give the same macroscopic observables. The full self-consistency cycle of the Born–Oppenheimer dynamics can be dispensed with, at a great computational advantage only marginally offset by the need to use shorter time steps to integrate the fast electronic degrees of freedom. The adiabatic separation can be understood on the basis of the following argument [6, 7]. The fictitious electronic dynamics, once close to the ground state, can be described as a superposition of harmonic oscillators whose frequencies are given by:

$$\omega_{ij} = \left[ \frac{2(\epsilon_j - \epsilon_i)}{\mu} \right]^{1/2}, \tag{24}$$
where $\epsilon_i$ is the KS eigenvalue of the $i$th occupied orbital and $\epsilon_j$ is the KS eigenvalue of the $j$th unoccupied orbital. For a system with an energy gap $E_g$, the lowest frequency can be estimated to be $\omega_{\min} = (2E_g/\mu)^{1/2}$. If $\omega_{\min}$ is much larger than the highest frequency appearing in the nuclear motion, there is a large separation between electronic and nuclear frequencies. Under such conditions, the electronic motion is adiabatically decoupled from the nuclear motion and there is negligible energy transfer from nuclear to electronic degrees of freedom. This is a nonobvious result, since both dynamics are classical and subject to the equipartition of energy, and it is the key to understanding when and why the Car–Parrinello dynamics works. For typical $E_g$ values, of the order of a few electronvolts, the electronic mass parameter $\mu$ can be chosen relatively large, of the order of 300–500 a.u. or even more, without any loss of adiabaticity. The time step of the simulation can be chosen as the largest compatible with the resulting electronic dynamics. Larger values of $\mu$ allow the use of larger time steps, but the requirement of adiabaticity sets an upper limit to $\mu$. Time steps of a fraction of a femtosecond are typically accessible. The electronic dynamics is faster than the nuclear dynamics and averages out the error on forces that is present because the system is never at the instantaneous electronic ground state, but only close to it (the system has to be brought close to the electronic ground state at the beginning of the dynamics). In such conditions, the resulting nuclear dynamics is very close to the true Born–Oppenheimer dynamics, and the electronic dynamics is stable (with negligible energy transfer from the nuclei) even for long simulation times.
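To get a feeling for the numbers in the estimate $\omega_{\min} = (2E_g/\mu)^{1/2}$, the short sketch below converts it to a wavenumber that can be compared directly with vibrational frequencies. It assumes that $\mu$ is expressed in Hartree atomic units (energy times time squared) and the gap in electronvolts; the function name and the unit choices are ours, and the conversion constants are rounded.

```python
import math

HARTREE_IN_EV = 27.211386        # 1 Hartree in eV (rounded)
AU_TIME_IN_S = 2.4188843e-17     # atomic unit of time in seconds (rounded)
C_IN_CM_PER_S = 2.99792458e10    # speed of light in cm/s

def min_electronic_wavenumber(gap_ev, mu_au):
    """Lowest fictitious electronic frequency (2 E_g / mu)^(1/2), in cm^-1.

    gap_ev : energy gap in eV
    mu_au  : fictitious mass in Hartree atomic units (an assumption here)
    """
    omega_au = math.sqrt(2.0 * (gap_ev / HARTREE_IN_EV) / mu_au)
    omega_rad_per_s = omega_au / AU_TIME_IN_S
    return omega_rad_per_s / (2.0 * math.pi * C_IN_CM_PER_S)
```

Under these assumptions, `min_electronic_wavenumber(1.0, 400.0)` gives a value close to 3 × 10³ cm⁻¹, which is only comparable with the highest molecular stretching frequencies; a gap of several electronvolts moves $\omega_{\min}$ well above them, and quadrupling $\mu$ halves it.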
Moreover, the Car–Parrinello dynamics is computationally more convenient than the Born–Oppenheimer dynamics, because the latter requires a high accuracy in self-consistency in order to provide the needed accuracy on the forces. The Car–Parrinello dynamics does not provide accurate instantaneous forces, but it provides accurate average nuclear trajectories.
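The adiabatic following described above can be observed on a toy model. The sketch below is an illustration under stated assumptions, not a plane-wave implementation: a single real two-component "orbital" `c`, normalised to one, is coupled to one nuclear coordinate `R` through a hypothetical two-state Hamiltonian; both are propagated with the standard Verlet scheme, and the normalisation constraint is imposed at each step through a scalar Lagrange multiplier. All parameter values are invented for the example.

```python
import numpy as np

K, A, COUPLING = 1.0, 2.0, 0.2      # hypothetical two-state model
M_ION, MU = 1.0, 0.01               # nuclear mass and fictitious electronic mass

def hamiltonian(R):
    return np.array([[0.5 * K * R**2, COUPLING],
                     [COUPLING, 0.5 * K * (R - A)**2]])

def dH_dR(R):
    return np.diag([K * R, K * (R - A)])

def car_parrinello(R0=-0.5, n_steps=4000, dt=0.005):
    """Verlet propagation of (c, R) with the constraint c.c = 1."""
    c = np.linalg.eigh(hamiltonian(R0))[1][:, 0]   # start in the ground state
    c_old, R, R_old = c.copy(), R0, R0             # zero initial velocities
    excess, conserved = [], []
    for _ in range(n_steps):
        H = hamiltonian(R)
        # unconstrained electronic step: mu c'' = -H c
        c_bar = 2.0 * c - c_old - (dt**2 / MU) * (H @ c)
        # multiplier from |c_bar + lam c|^2 = 1, taking the root closest to zero
        a, b = c_bar @ c_bar, c @ c_bar
        lam = -b + np.sqrt(b * b - (a - 1.0))
        c_new = c_bar + lam * c
        # nuclear step, force evaluated with the current (not re-converged) orbital
        F = -c @ dH_dR(R) @ c
        R_new = 2.0 * R - R_old + (dt**2 / M_ION) * F
        # diagnostics at time t with central-difference velocities
        c_dot = (c_new - c_old) / (2.0 * dt)
        R_dot = (R_new - R_old) / (2.0 * dt)
        e_el = c @ H @ c
        excess.append(e_el - np.linalg.eigvalsh(H)[0])
        conserved.append(MU * c_dot @ c_dot + 0.5 * M_ION * R_dot**2 + e_el)
        c_old, c, R_old, R = c, c_new, R, R_new
    return np.array(excess), np.array(conserved)
```

The first returned array is the instantaneous energy above the exact ground-state surface, the second the constant of motion of the fictitious Lagrangian. With the fictitious mass chosen here the electronic oscillations are roughly an order of magnitude faster than the nuclear motion, so the excess energy is expected to stay small; increasing `MU` is a simple way to see adiabaticity degrade.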
2.1. Equations of Motion and Orthonormality Constraints
In Car–Parrinello implementations, the equations of motion (19) and (20) are discretized using the standard Verlet or the velocity-Verlet algorithm. The following discussion, including the treatment of the $\mathbf{R}_I$-dependence of the orthonormality constraints, applies to the standard Verlet algorithm used with the Fourier acceleration scheme of Tassone et al. [8]. (In this approach the fictitious electronic mass is generally represented by an operator $\Theta$, chosen in such a way as to reduce the highest electronic frequencies.*) From the knowledge of the electronic orbitals at time $t$ and $t - \Delta t$, the orbitals at $t + \Delta t$ are given, in the standard Verlet, by

$$\phi_i(t + \Delta t) = 2\phi_i(t) - \phi_i(t - \Delta t) - (\Delta t)^2\, \Theta^{-1} \left[ \frac{\delta E_{\mathrm{tot}}}{\delta \phi_i^*} - \sum_j \Lambda_{ij}(t + \Delta t)\, S(t)\, \phi_j(t) \right], \tag{25}$$

where $\Delta t$ is the time step, and $S(t)$ indicates the operator $S$ evaluated for nuclear positions $\mathbf{R}_I(t)$. Similarly the nuclear coordinates at time $t + \Delta t$ are given by:

$$\mathbf{R}_I(t + \Delta t) = 2\mathbf{R}_I(t) - \mathbf{R}_I(t - \Delta t) - \frac{(\Delta t)^2}{M_I} \left[ \frac{\partial E_{\mathrm{tot}}}{\partial \mathbf{R}_I} - \sum_{ij} \Lambda_{ij}(t + \Delta t) \left\langle \phi_i(t) \left| \frac{\partial S(t)}{\partial \mathbf{R}_I} \right| \phi_j(t) \right\rangle \right]. \tag{26}$$

The orthonormality conditions must be imposed at each time step:

$$\langle \phi_i(t + \Delta t) | S(t + \Delta t) | \phi_j(t + \Delta t) \rangle = \delta_{ij}, \tag{27}$$

leading to the following matrix equation:

$$A + \lambda B + B^\dagger \lambda^\dagger + \lambda C \lambda^\dagger = 1, \tag{28}$$

where the unknown matrix $\lambda$ is related to the matrix of Lagrange multipliers $\Lambda$ at time $t + \Delta t$ via $\lambda = (\Delta t)^2 \Lambda^*(t + \Delta t)$. In Eq. (28), the dagger indicates Hermitian conjugate ($\lambda = \lambda^\dagger$). The matrices $A$, $B$, and $C$ are given by:

$$A_{ij} = \langle \bar\phi_i | S(t + \Delta t) | \bar\phi_j \rangle, \quad B_{ij} = \langle \Theta^{-1} S(t) \phi_i(t) | S(t + \Delta t) | \bar\phi_j \rangle, \quad C_{ij} = \langle \Theta^{-1} S(t) \phi_i(t) | S(t + \Delta t) | \Theta^{-1} S(t) \phi_j(t) \rangle, \tag{29}$$

with

$$\bar\phi_i = 2\phi_i(t) - \phi_i(t - \Delta t) - (\Delta t)^2\, \Theta^{-1} \frac{\delta E_{\mathrm{tot}}(t)}{\delta \phi_i^*}. \tag{30}$$

* When using plane waves, a convenient choice for the matrix elements of such an operator is $\Theta_{\mathbf{G},\mathbf{G}'} = \max\left(\mu,\ \mu\,\frac{\hbar^2 G^2}{2m E_c}\right) \delta_{\mathbf{G},\mathbf{G}'}$, where $\mathbf{G}$, $\mathbf{G}'$ are the wave vectors of the PWs and $E_c$ is a cutoff (typically a few Ry) which defines the threshold for Fourier acceleration. The fictitious electron mass depends on $G$ as the kinetic energy for large $G$ and is constant for small $G$. This scheme allows us to use larger time steps with negligible computational overhead.
The solution of Eq. (28) in the ultrasoft PP case is not obvious, because Eq. (26) is not a closed expression for $\mathbf{R}_I(t + \Delta t)$. The problem is that $\Lambda(t + \Delta t)$ appearing in Eq. (26) depends implicitly on $\mathbf{R}_I(t + \Delta t)$ through $S(t + \Delta t)$. Consequently, it is in principle necessary to solve iteratively for $\mathbf{R}_I(t + \Delta t)$ in Eq. (26). A simple solution to this problem was provided in Laasonen et al. [4]. $\Lambda(t + \Delta t)$ is extrapolated using two previous values:

$$\Lambda^{(0)}_{ij}(t + \Delta t) = 2\Lambda_{ij}(t) - \Lambda_{ij}(t - \Delta t). \tag{31}$$

Equation (26) is used to find $\mathbf{R}^{(0)}_I(t + \Delta t)$, which is correct to $O(\Delta t^4)$. From $\mathbf{R}^{(0)}_I(t + \Delta t)$ we can obtain a new set $\Lambda^{(1)}_{ij}(t + \Delta t)$ and repeat the procedure until convergence is achieved. It turns out that in most practical applications the procedure converges at the very first iteration. Thus, the operations described above are generally executed only once per time step. The solution of Eq. (28) is found using a modified version [4, 9] of the iterative procedure of Car and Parrinello [10]. The matrix $B$ is decomposed into Hermitian ($B_h$) and anti-Hermitian ($B_a$) parts,

$$B = B_h + B_a, \tag{32}$$

and the solution is obtained by iteration:

$$\lambda^{(n+1)} B_h + B_h \lambda^{(n+1)} = 1 - A - \lambda^{(n)} B_a - B_a^\dagger \lambda^{(n)} - \lambda^{(n)} C \lambda^{(n)}. \tag{33}$$

The initial guess $\lambda^{(0)}$ can be obtained from

$$\lambda^{(0)} B_h + B_h \lambda^{(0)} = 1 - A. \tag{34}$$

Here, the $B_a$- and $C$-dependent terms are neglected because they are of higher order in $\Delta t$ ($B_a$ vanishes for vanishing $\Delta t$). Equations (34) and (33) have the same structure:

$$\lambda B_h + B_h \lambda = X, \tag{35}$$

where $X$ is a Hermitian matrix. Equation (35) can be solved exactly by finding the unitary matrix $U$ that diagonalizes $B_h$: $U^\dagger B_h U = D$, where $D_{ij} = d_i \delta_{ij}$. The solution is obtained from

$$(U^\dagger \lambda U)_{ij} = \frac{(U^\dagger X U)_{ij}}{d_i + d_j}. \tag{36}$$

When $X = 1 - A$, Eq. (36) yields the starting $\lambda^{(0)}$, while $\lambda^{(n+1)}$ is obtained from $\lambda^{(n)}$ by solving Eq. (36) with $X$ given by the right-hand side of Eq. (33). This iterative procedure usually converges in very few steps (ten or fewer).
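The iteration of Eqs. (33)–(36) is compact enough to be written out explicitly. The sketch below is a minimal illustration for the simplest setting, which is our assumption and not the general ultrasoft case: real orbitals stored as columns, $S = 1$ (norm-conserving limit) and the mass operator set to unity, so that $A$, $B$ and $C$ of Eq. (29) reduce to plain overlap matrices. The function names and the random test orbitals in `demo` are hypothetical.

```python
import numpy as np

def solve_sylvester_like(Bh, X):
    """Solve lam Bh + Bh lam = X (Eq. 35) by diagonalising the Hermitian Bh."""
    d, U = np.linalg.eigh(Bh)
    Xt = U.conj().T @ X @ U
    lam_t = Xt / (d[:, None] + d[None, :])       # Eq. (36), element by element
    return U @ lam_t @ U.conj().T

def constraint_multipliers(A, B, C, n_iter=50, tol=1e-14):
    """Iterate Eq. (33) for a Hermitian lam, starting from the guess of Eq. (34)."""
    one = np.eye(A.shape[0])
    Bh = 0.5 * (B + B.conj().T)                  # Hermitian part of B
    Ba = 0.5 * (B - B.conj().T)                  # anti-Hermitian part of B
    lam = solve_sylvester_like(Bh, one - A)
    for _ in range(n_iter):
        X = one - A - lam @ Ba - Ba.conj().T @ lam - lam @ C @ lam
        lam_new = solve_sylvester_like(Bh, X)
        if np.max(np.abs(lam_new - lam)) < tol:
            return lam_new
        lam = lam_new
    return lam

def demo(n_basis=20, n_states=4, seed=1):
    """Toy case: orthonormal phi(t), slightly perturbed unconstrained update phi_bar."""
    rng = np.random.default_rng(seed)
    phi, _ = np.linalg.qr(rng.standard_normal((n_basis, n_states)))
    phi_bar = phi + 0.01 * rng.standard_normal((n_basis, n_states))
    A = phi_bar.T @ phi_bar
    B = phi.T @ phi_bar
    C = phi.T @ phi
    lam = constraint_multipliers(A, B, C)
    phi_new = phi_bar + phi @ lam                # constrained orbitals at t + dt
    return phi_new, lam
```

In this setting the corrected orbitals `phi_new` should be orthonormal to numerical precision once the iteration has converged, which provides a direct check of Eq. (28).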
3. Plane-Wave Implementation
In most standard implementations, first-principles MD schemes employ a plane-wave (PW) basis set. An advantage of PWs is that they do not depend on atomic positions and are free of basis-set superposition errors. Total energies and forces on the atoms can be calculated using computationally efficient fast Fourier transform (FFT) techniques, and Pulay forces [11] vanish because PWs do not depend on atomic positions. Finally, the convergence of a calculation can be controlled in a simple way, since it depends only upon the number of PWs included in the expansion of the electron density. The dimension of a PW basis set is controlled by a cutoff in the kinetic energy of the PWs. A disadvantage of PWs is their extremely slow convergence in describing core states, which can however be circumvented by the use of PPs. Ultrasoft PPs make it possible to deal efficiently with this difficulty also in systems containing transition metals or first-row elements O, N, F, whose 3d and 2p orbitals, respectively, are very contracted. The use of a PW basis set implies that periodic boundary conditions are imposed. Systems not having translational symmetry in one or more directions have to be placed into a suitable periodically repeated box (a “supercell”). Let $\{\mathbf{R}\}$ be the translation vectors of the periodically repeated supercell. The corresponding reciprocal lattice vectors $\{\mathbf{G}\}$ obey the conditions $\mathbf{R}_i \cdot \mathbf{G}_j = 2\pi n$, with $n$ an integer number. The KS orbitals can be expanded in a plane-wave basis up to a kinetic energy cutoff $E^{\mathrm{wf}}_c$:

$$\phi_{j,\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{\Omega}} \sum_{\mathbf{G} \in \{\mathbf{G}^{\mathrm{wf}}_c\}} \phi_{j,\mathbf{k}}(\mathbf{G})\, e^{-i(\mathbf{k} + \mathbf{G}) \cdot \mathbf{r}}, \tag{37}$$

where $\Omega$ is the volume of the cell, $\{\mathbf{G}^{\mathrm{wf}}_c\}$ is the set of $\mathbf{G}$ vectors satisfying the condition

$$\frac{\hbar^2}{2m} |\mathbf{k} + \mathbf{G}|^2 < E^{\mathrm{wf}}_c, \tag{38}$$

and $\mathbf{k}$ is the Bloch vector of the electronic states. In crystals, one must use a grid of k-points dense enough to sample the Brillouin zone (the unit cell of the reciprocal lattice). In molecules, liquids and in general if the simulation cell is large enough, the Brillouin zone can be sampled using only the $\mathbf{k} = 0$ ($\Gamma$) point. An advantage of this choice is that the orbitals can be taken to be real in r-space. In the following we will drop the $\mathbf{k}$ vector index. Functions in real space and their Fourier transforms will be denoted by the same symbols, if this does not give rise to ambiguity. The $\phi_j(\mathbf{G})$ are the actual electronic variables in the fictitious dynamics. The calculation of $H\phi_j$ and of the forces acting on the ions are the basic ingredients of the computation. Scalar products $\langle \phi_j | \beta^I_n \rangle$ and their spatial derivatives are typically evaluated in G-space. An important advantage of working in G-space is that atom-centred functions like $\beta^I_n$ and $Q^I_{nm}$ are easily evaluated at any atomic position:

$$\beta^I_n(\mathbf{G}) = \beta_n(\mathbf{G})\, e^{-i\mathbf{G} \cdot \mathbf{R}_I}. \tag{39}$$

Thus,

$$\langle \phi_j | \beta^I_n \rangle = \sum_{\mathbf{G} \in \{\mathbf{G}^{\mathrm{wf}}_c\}} \phi^*_j(\mathbf{G})\, \beta_n(\mathbf{G})\, e^{-i\mathbf{G} \cdot \mathbf{R}_I} \tag{40}$$

and

$$\left\langle \phi_j \left| \frac{\partial \beta^I_n}{\partial \mathbf{R}_I} \right. \right\rangle = -i \sum_{\mathbf{G} \in \{\mathbf{G}^{\mathrm{wf}}_c\}} \mathbf{G}\, \phi^*_j(\mathbf{G})\, \beta_n(\mathbf{G})\, e^{-i\mathbf{G} \cdot \mathbf{R}_I}. \tag{41}$$
The kinetic energy term is diagonal in G-space and is easily calculated:
− ∇ 2 φ j (G) = G 2 φ j (G).
(42)
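As a minimal sketch of Eq. (42), assuming a one-dimensional periodic cell sampled on a uniform grid, the operator $-\nabla^2$ can be applied by multiplying each Fourier component by $G^2$. The function name `apply_kinetic`, the cell length and the test function are illustrative choices, not part of the code described in this chapter.

```python
import numpy as np

def apply_kinetic(f_r, length):
    """Return -d^2 f/dx^2 on a uniform periodic grid, evaluated in G-space."""
    n = f_r.size
    # G-vectors in FFT order: (2*pi/L) * (0, 1, ..., -2, -1)
    g = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    f_g = np.fft.fft(f_r)            # r-space -> G-space
    return np.fft.ifft(g**2 * f_g)   # multiply by G^2, then back to r-space

# Illustrative check on a single cosine, for which -f'' = g0^2 * f exactly.
length = 10.0                         # hypothetical cell length
n = 32
x = np.arange(n) * length / n
g0 = 2.0 * np.pi * 3 / length
f = np.cos(g0 * x)
kin = apply_kinetic(f, length)
error = np.max(np.abs(kin - g0**2 * f))
```

Because the operator is diagonal, the cost is that of two FFTs plus one multiplication per G-vector.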
In summary, the kinetic and nonlocal PP terms in $H\phi_j$ are calculated in G-space, while the local potential term $V_{\mathrm{eff}}\phi_j$, which could also be calculated in G-space, is more conveniently determined using a "dual space" technique: one switches between G- and r-space with FFTs and performs each calculation in the space where it is least expensive. In practice, the KS orbitals are first Fourier-transformed to r-space; then $(V_{\mathrm{eff}}\phi_j)(\mathbf{r}) = V_{\mathrm{eff}}(\mathbf{r})\phi_j(\mathbf{r})$ is calculated in r-space, where $V_{\mathrm{eff}}$ is diagonal; finally $(V_{\mathrm{eff}}\phi_j)(\mathbf{r})$ is Fourier-transformed back to $(V_{\mathrm{eff}}\phi_j)(\mathbf{G})$. In order to use FFTs, r-space is discretized by a uniform grid spanning the unit cell:

$$f(m_1, m_2, m_3) \equiv f(\mathbf{r}_{m_1,m_2,m_3}), \qquad \mathbf{r}_{m_1,m_2,m_3} = m_1\frac{\mathbf{a}_1}{N_1} + m_2\frac{\mathbf{a}_2}{N_2} + m_3\frac{\mathbf{a}_3}{N_3}, \tag{43}$$

where $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ are lattice basis vectors, the integer index $m_1$ runs from 0 to $N_1 - 1$, and similarly for $m_2$ and $m_3$. In the following we will assume
for simplicity that $N_1, N_2, N_3$ are even numbers. The FFT maps a discrete periodic function in real space $f(m_1, m_2, m_3)$ into a discrete periodic function in reciprocal space $\tilde f(n_1, n_2, n_3)$ (where $n_1$ runs from 0 to $N_1 - 1$, and similarly for $n_2$ and $n_3$), and vice versa. The link between G-space components and FFT indices is:

$$\tilde f(n_1, n_2, n_3) \equiv f(\mathbf{G}_{n_1',n_2',n_3'}), \qquad \mathbf{G}_{n_1',n_2',n_3'} = n_1'\mathbf{b}_1 + n_2'\mathbf{b}_2 + n_3'\mathbf{b}_3, \tag{44}$$

where $n_1 = n_1'$ if $n_1' \ge 0$, $n_1 = n_1' + N_1$ if $n_1' < 0$, and similarly for $n_2$ and $n_3$. The FFT dimensions $N_1, N_2, N_3$ must be big enough to include all non-negligible Fourier components of the function to be transformed: ideally the Fourier component corresponding to $n_1' = N_1/2$, and similarly for $n_2'$ and $n_3'$, should vanish. In the following, we will refer to the set of indices $n_1, n_2, n_3$ and to the corresponding Fourier components as the "FFT grid".

The soft part of the charge density, $n_{\mathrm{soft}}(\mathbf{r}) = \sum_j |\phi_j(\mathbf{r})|^2$, contains Fourier components up to a kinetic energy cutoff $E_c^{\mathrm{soft}} = 4E_c^{\mathrm{wf}}$. This is evident from the formula:

$$n_{\mathrm{soft}}(\mathbf{G}) = \sum_{\mathbf{G}'\in\{\mathbf{G}_c^{\mathrm{wf}}\}} \sum_j \phi_j^*(\mathbf{G}-\mathbf{G}')\,\phi_j(\mathbf{G}'). \tag{45}$$
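The factor of four can be checked numerically. The following is a minimal one-dimensional sketch, assuming an orbital with Fourier components only for integer indices $|m| \le m_{\mathrm{wf}}$ (the coefficients and grid size are arbitrary illustrative choices). The density is formed as a pointwise product in r-space, as in the dual-space technique, and its Fourier components are found to extend to $|m| = 2m_{\mathrm{wf}}$, i.e., to four times the kinetic energy cutoff, and no further.

```python
import numpy as np

m_wf = 4      # orbital cutoff: components with |m| <= m_wf (illustrative)
n_fft = 32    # FFT size, chosen larger than 4*m_wf so that nothing aliases

# Signed integer G-indices in FFT order: 0, 1, ..., 15, -16, ..., -1
m = np.fft.fftfreq(n_fft, d=1.0 / n_fft).round().astype(int)

# Arbitrary, deterministic orbital coefficients inside the cutoff
phi_g = np.zeros(n_fft, dtype=complex)
mask_wf = np.abs(m) <= m_wf
phi_g[mask_wf] = 1.0 / (1.0 + m[mask_wf] ** 2)

phi_r = np.fft.ifft(phi_g)      # G-space -> r-space
n_r = np.abs(phi_r) ** 2        # density: pointwise product in r-space
n_g = np.fft.fft(n_r)           # r-space -> G-space

outside = np.abs(m) > 2 * m_wf
max_outside = np.max(np.abs(n_g[outside]))       # vanishes to rounding error
edge_component = np.abs(n_g[m == 2 * m_wf][0])   # nonzero at |m| = 2*m_wf
```

Since the kinetic energy scales as $G^2$, doubling the largest $|\mathbf{G}|$ corresponds to quadrupling the energy cutoff.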
In the case of norm-conserving PPs, the entire charge density is given by $n_{\mathrm{soft}}(\mathbf{r})$. $V_{\mathrm{eff}}$ should be expanded up to the same $E_c^{\mathrm{soft}}$ cutoff, since all the Fourier components of $V_{\mathrm{eff}}\phi_j$ up to $E_c^{\mathrm{wf}}$ are required. Let us call $\{\mathbf{G}_c^{\mathrm{soft}}\}$ the set of G-vectors such that

$$\frac{\hbar^2}{2m}\,G^2 < E_c^{\mathrm{soft}}. \tag{46}$$

The soft part of the charge density is calculated in r-space, by Fourier-transforming $\phi_j(\mathbf{G})$ into $\phi_j(\mathbf{r})$ and summing over the occupied states. The exchange-correlation potential $\mu_{xc}(\mathbf{r})$, Eq. (14), is a function of the local charge density and, for gradient-corrected functionals, of its gradient at point $\mathbf{r}$:

$$\mu_{xc}(\mathbf{r}) = V_{xc}\big(n(\mathbf{r}), |\nabla n(\mathbf{r})|\big). \tag{47}$$

The gradient $\nabla n(\mathbf{r})$ is conveniently calculated from the charge density in G-space, using $(\nabla n)(\mathbf{G}) = -i\mathbf{G}\,n(\mathbf{G})$. The Hartree potential $V_H(\mathbf{r})$, Eq. (15), is also conveniently calculated in G-space:

$$V_H(\mathbf{G}) = \frac{4\pi}{G^2}\, n(\mathbf{G})^*. \tag{48}$$
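The G-space evaluation of the Hartree potential can be sketched as follows, under several assumptions that are not taken from the text: a cubic cell, atomic units, NumPy's FFT sign convention applied consistently to both density and potential (so that the relation reads $V_H(\mathbf{G}) = 4\pi n(\mathbf{G})/G^2$), and a dropped $\mathbf{G} = 0$ term, as appropriate for a neutral cell. The name `hartree_potential` is illustrative.

```python
import numpy as np

def hartree_potential(n_r, box):
    """Hartree potential of a periodic density in a cubic box (atomic units).

    Assumption: the G = 0 component is set to zero (neutral cell).
    """
    n_pts = n_r.shape[0]
    g1d = 2.0 * np.pi * np.fft.fftfreq(n_pts, d=box / n_pts)
    gx, gy, gz = np.meshgrid(g1d, g1d, g1d, indexing="ij")
    g2 = gx**2 + gy**2 + gz**2
    n_g = np.fft.fftn(n_r)
    v_g = np.zeros_like(n_g)
    nonzero = g2 > 0.0
    v_g[nonzero] = 4.0 * np.pi * n_g[nonzero] / g2[nonzero]
    return np.fft.ifftn(v_g).real

# Illustrative check: for n(r) = cos(g0 x), V_H(r) = (4 pi / g0^2) cos(g0 x).
box, n_pts = 8.0, 16
x = np.arange(n_pts) * box / n_pts
g0 = 2.0 * np.pi * 2 / box
n_r = np.cos(g0 * x)[:, None, None] * np.ones((n_pts, n_pts, n_pts))
v_r = hartree_potential(n_r, box)
err = np.max(np.abs(v_r - 4.0 * np.pi / g0**2 * n_r))
```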
Thus, in the case of norm-conserving PPs, a single FFT grid, large enough to accommodate the $\{\mathbf{G}_c^{\mathrm{soft}}\}$ set, can be used for orbitals, charge density, and potential.
The use of FFTs is mathematically equivalent to a pure G-space description (we neglect here a small inconsistency in the exchange-correlation potential and energy density, due to the presence of a small number of components beyond the $\{\mathbf{G}_c^{\mathrm{soft}}\}$ set). This has important consequences: working in G-space means that translational invariance is exactly conserved and that forces are analytical derivatives of the energy (apart from the effect of the small inconsistency mentioned above). Forces that are analytical derivatives of the energy ensure that the constant of motion (i.e., the sum of kinetic and potential energy of the ions in Newtonian dynamics) is conserved during the evolution.
3.1. Double-Grid Technique
Let us focus on ultrasoft PPs. In G-space the charge density is:

$$n(\mathbf{G}) = n_{\mathrm{soft}}(\mathbf{G}) + \sum_{i,nm,I} Q_{mn}^I(\mathbf{G})\,\langle\phi_i|\beta_n^I\rangle\langle\beta_m^I|\phi_i\rangle. \tag{49}$$

The augmentation term often requires a cutoff higher than $E_c^{\mathrm{soft}}$, and as a consequence a larger set of G-vectors. Let us call $\{\mathbf{G}_c^{\mathrm{dens}}\}$ the set of G-vectors that are needed for the augmented part:

$$\frac{\hbar^2}{2m}\,G^2 < E_c^{\mathrm{dens}}. \tag{50}$$
In typical situations, using pseudized augmented charges, $E_c^{\mathrm{dens}}$ ranges from $E_c^{\mathrm{soft}}$ to about 2–3 times $E_c^{\mathrm{soft}}$. The same FFT grid could be used both for the augmented charge density and for the KS orbitals. This, however, would imply using an oversized FFT grid in the most expensive part of the calculation, dramatically increasing computer time. A better solution is to introduce two FFT grids:

• a coarser grid (in r-space) for the KS orbitals and the soft part of the charge density. The FFT dimensions $N_1, N_2, N_3$ of this grid are big enough to accommodate all G-vectors in $\{\mathbf{G}_c^{\mathrm{soft}}\}$;

• a denser grid (in r-space) for the total charge density and the exchange-correlation and Hartree potentials. The FFT dimensions $M_1 \ge N_1$, $M_2 \ge N_2$, $M_3 \ge N_3$ of this grid are big enough to accommodate all G-vectors in $\{\mathbf{G}_c^{\mathrm{dens}}\}$.

In this framework, the soft part of the electron density, $n_{\mathrm{soft}}$, is calculated in r-space using FFTs on the coarse grid and transformed to G-space using a coarse-grid FFT on the $\{\mathbf{G}_c^{\mathrm{soft}}\}$ grid. The augmented charge density is calculated in G-space on the $\{\mathbf{G}_c^{\mathrm{dens}}\}$ grid, using Eq. (49) as described in the next section. $n(\mathbf{G})$ is used to evaluate the Hartree potential, Eq. (48). Then
$n(\mathbf{G})$ is Fourier-transformed to r-space on the dense grid, where the exchange-correlation potential, Eq. (47), is evaluated.

In real space, the two grids are not necessarily commensurate. Whenever the need arises to go from the coarse to the dense grid, or vice versa, this is done in G-space. For instance, the potential $V_{\mathrm{eff}}$, Eq. (13), is needed both on the dense grid, to calculate quantities such as the $D_{nm}^I$, Eq. (16), and on the coarse grid, to calculate $V_{\mathrm{eff}}\phi_j$, Eq. (11). The connection between the two grids occurs in G-space, where Fourier filtering is performed: $V_{\mathrm{eff}}$ is first transformed to G-space on the dense grid, then transferred to the coarse G-space grid by eliminating components incompatible with $E_c^{\mathrm{soft}}$, and then back-transformed to r-space using a coarse-grid FFT. We remark that for each time step only a few dense-grid FFTs are performed, while the number of necessary coarse-grid FFTs is much larger, proportional to the number of KS states $N_{ks}$.
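The transfer between grids in G-space can be illustrated with a one-dimensional sketch. It assumes a periodic function sampled on uniform grids and uses the wrap-around index rule of Eq. (44); the helper names `transfer_grid` and `sample` and the grid sizes are hypothetical. Components that do not fit on the target grid are discarded (Fourier filtering), and components absent from the source grid are set to zero.

```python
import numpy as np

def transfer_grid(f_r, n_new):
    """Move a periodic 1D function between FFT grids via G-space."""
    n_old = f_r.size
    f_g = np.fft.fft(f_r) / n_old                 # normalized Fourier components
    m_old = np.fft.fftfreq(n_old, d=1.0 / n_old).round().astype(int)
    # Keep only components representable on both grids (Nyquist index dropped).
    keep = np.abs(m_old) < min(n_old, n_new) / 2
    new_g = np.zeros(n_new, dtype=complex)
    # Wrap-around rule: n = n' if n' >= 0, n = n' + N otherwise.
    new_g[m_old[keep] % n_new] = f_g[keep]
    return np.fft.ifft(new_g) * n_new

def sample(n):
    """Band-limited test function on an n-point grid over [0, 1)."""
    x = np.arange(n) / n
    return 1.0 + np.cos(2 * np.pi * 3 * x) + 0.5 * np.sin(2 * np.pi * 5 * x)

dense, coarse = sample(48), sample(16)
err_down = np.max(np.abs(transfer_grid(dense, 16) - coarse))   # dense -> coarse
err_up = np.max(np.abs(transfer_grid(coarse, 48) - dense))     # coarse -> dense
```

Because the test function is band-limited below the coarse-grid cutoff, both transfers reproduce direct sampling to rounding error, even though the two grids (48 and 16 points here, but any pair would do) need not be commensurate.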
3.2. Augmentation Boxes
Let us consider the augmentation functions $Q_{nm}$, which appear in the calculation of the electron density, Eq. (49), in the calculation of $D_{nm}^I$, Eq. (16), and in the integrals involving $\partial Q_{nm}^I/\partial\mathbf{R}_I$ needed to compute the forces acting on the nuclei, Eq. (23). The calculation of the $Q_{nm}$ in G-space has a large computational cost because the cutoff for the $Q_{nm}$ is the large cutoff $E_c^{\mathrm{dens}}$. The computational cost can be significantly reduced if we take advantage of the localization of the $Q_{nm}$ in the core region.

We call "augmentation box" a fraction of the supercell containing a small portion of the dense grid in real space. An augmentation box is defined only for atoms described by ultrasoft PPs. The augmentation box for atom $I$ is centred at the point of the dense grid that is closest to the position $\mathbf{R}_I$. During an MD run, the centre of the $I$th augmentation box makes discontinuous jumps to one of the neighbouring grid points whenever the position vector $\mathbf{R}_I$ gets closer to that grid point. In an MD run, the augmentation box must always completely contain the augmented charge belonging to the $I$th atom; subject to this requirement, the augmentation box should be as small as possible. The volume of the augmentation box is much smaller than the volume of the supercell. The number of G-vectors in the reciprocal space of the augmentation box is smaller than the number of G-vectors in the dense grid by the ratio of the volumes of the augmentation box and of the supercell. As a consequence, the cost of calculations on the augmentation boxes increases linearly with the number of atoms described by ultrasoft PPs.

Augmentation boxes are used (i) to construct the augmented charge density, Eq. (6), and (ii) to calculate the self-consistent contribution to the
coefficients of the nonlocal PP, Eq. (16). In case (i), the augmented charge is conveniently calculated in G-space, following [4], and Fourier-transformed to r-space. All these calculations are done on the augmentation box grid. Then the calculated contribution at each r-point of the augmentation box grid is added to the charge density at the same point in the dense grid. In case (ii), it is convenient to calculate $D_{nm}^I$ as follows: for every atom described by an ultrasoft PP, take the Fourier transform of $V_{\mathrm{eff}}(\mathbf{r})$ on the corresponding augmentation box grid and evaluate the integral of Eq. (16) in G-space.
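The bookkeeping behind an augmentation box can be sketched along one lattice direction. The function below is a hypothetical helper, not taken from the code described here: it finds the dense-grid point nearest to an atom and returns the periodically wrapped dense-grid indices covered by a box centred there. The grid size and box width are arbitrary illustrative values.

```python
import numpy as np

def box_indices(frac_pos, n_dense, n_box):
    """Dense-grid indices covered by a box centred on the nearest grid point.

    frac_pos : fractional coordinate of the atom along one lattice vector
    n_dense  : number of dense-grid points along that direction
    n_box    : number of grid points spanned by the box (odd, < n_dense)
    """
    centre = int(np.rint(frac_pos * n_dense)) % n_dense   # nearest grid point
    offsets = np.arange(n_box) - n_box // 2
    return centre, (centre + offsets) % n_dense           # periodic wrap-around

# An atom near the cell boundary: the box wraps around periodically.
centre, idx = box_indices(0.02, n_dense=50, n_box=5)

# Adding a box contribution to the dense-grid density at the same points.
density = np.zeros(50)
density[idx] += np.ones(5)

# A small displacement makes the box centre jump to the next grid point.
centre_before = box_indices(0.509, 50, 5)[0]
centre_after = box_indices(0.511, 50, 5)[0]
```

In three dimensions the same index arrays, one per direction, could be combined (for instance with `np.ix_`) to address the sub-block of the dense grid.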
3.3. Parallelization
Various parallelization strategies for PW–PP calculations have been described in the literature. A strategy that ensures excellent scalability in terms of both computer time and memory consists in distributing the PW basis set and the FFT grid points in real and reciprocal space across processors. A crucial issue for the success of this approach is the FFT algorithm, which must be capable of performing three-dimensional FFTs on data distributed across different processors with good load balancing. The parallelization in the case of ultrasoft PPs is described in detail in Giannozzi et al. [12].
4. Applications
Presently, systems described by supercells containing up to a few hundred atoms are within the reach of first-principles MD. A large body of techniques developed for classical MD, such as simulated annealing, finite-temperature simulations, free-energy calculations, etc., can be straightforwardly extended to first-principles MD. Typical applications include the study of aperiodic systems (liquids, atomic clusters, large molecules, including biological active sites); complex solid-state systems (defects in solids, defect diffusion, surface reconstructions); and dynamical processes (chemical reactions, catalysis, and finite-temperature studies). The use of ultrasoft PPs is especially convenient in the simulation of systems containing first-row atoms (C, N, O, F) and transition metal elements, such as biological active sites involving Fe, Mn, Ni as catalytic centers.

A good example of an application of first-principles MD is the investigation of a complex organometallic reaction: the migratory insertion of carbon monoxide (CO) into zirconium–carbon bonds anchored to a calix[4]arene moiety, shown in Fig. 1 [13]. The investigated reactivity is representative of the large class of migratory insertions of carbon monoxide and alkyl-isocyanides into metal–alkyl bonds observed for most of the early d-block metals, leading to the formation of a new carbon–carbon bond [14].
Figure 1. Geometry of calix[4]arene.

Figure 2. Insertion of CO into the Zr–CH3 bond of a calix[4]arene.
The CO migratory insertion is thought to be initiated by the coordination of the nucleophilic CO species to the electron-deficient zirconium centre of [p-Bu^t-calix[4](OMe)2(O)2–Zr(Me)2], 1 in Fig. 2, to form the relatively stable adduct 2. MD simulations were started by heating the structure of 2 in small steps (via rescaling of atomic velocities) to a temperature of 300 K. Both electronic and nuclear degrees of freedom were allowed to evolve without any constraint for 2.4 ps. The migratory CO insertion can be followed by studying the time evolution of the carbon–carbon CH3–CO, metal–carbon Zr–CH3 and metal–oxygen Zr–O distances. Figure 3 clearly shows that the reactive CO migration takes place within ca. 0.4 ps: the fast decrease in the CH3–CO distance from ca. 2.7 Å to ca. 1.5 Å corresponds to the formation of the new CH3–CO carbon–carbon bond. At the same time the Zr–CH3 distance follows an almost complementary trajectory with respect to the CH3–CO distance and grows from ca. 2.4 Å up to ca. 3.7 Å, reflecting the methyl detachment from the metal centre upon CO insertion.
Figure 3. Evolution of carbon–carbon CH3–CO, metal–carbon Zr–CH3 and metal–oxygen Zr–O distances during the simulation of CO insertion into calix[4]arene. [Plot: distances (Å), from 1 to 4.5, versus time (ps), from 0 to 2.4; curves labelled C–C, Zr–C and Zr–O.]
The Zr–O distance is found to decrease from its initial value of ca. 3.5 Å in 2, to ca. 2.2 Å, corresponding to the Zr–O bond in 4, within 1.0 ps. The 0.6 ps delay between the formation of the CH3 –CO bond and the formation of the Zr–O bond suggests the initial formation of a transient species, 3 in Fig. 2, characterized by an η1 -coordination of the OC–CH3 acyl group with a formed CH3 –CO bond and still a long Zr–O bond; this η1 -acyl subsequently evolves to the corresponding η2 -bound acyl species. The short time stability of the η1 -acyl isomer (ca. 0.6 ps) suggests a negligible barrier for the conversion of the η1 into the more stable η2 -isomer, as confirmed by static DFT calculations.
Acknowledgments

The algorithms and codes presented in this work were originally developed at EPFL Lausanne by Alfredo Pasquarello and Roberto Car, and then at Princeton University by Paolo Giannozzi and Roberto Car. Several people have also contributed or are contributing to the current development and distribution under the GPL License: Kari Laasonen, Andrea Trave, Carlo Cavazzoni, and Nicola Marzari.
References

[1] A. Pasquarello, P. Giannozzi, K. Laasonen, A. Trave, N. Marzari, and R. Car, The Car–Parrinello molecular dynamics code described in this paper is freely available in the Quantum-espresso distribution, released under the GNU Public License at http://www.democritos.it/scientific.php, 2004.
[2] D. Vanderbilt, "Soft Self-Consistent Pseudopotentials in a Generalized Eigenvalue Formalism," Physical Review B, 41, 7892, 1990.
[3] D.R. Hamann, M. Schlüter, and C. Chiang, "Norm-Conserving Pseudopotentials," Physical Review Letters, 43, 1494, 1979.
[4] K. Laasonen, A. Pasquarello, R. Car, C. Lee, and D. Vanderbilt, "Car–Parrinello Molecular Dynamics with Vanderbilt Ultrasoft Pseudopotentials," Physical Review B, 47, 10142, 1993.
[5] R. Car and M. Parrinello, "Unified Approach for Molecular Dynamics and Density-Functional Theory," Physical Review Letters, 55, 2471, 1985.
[6] G. Pastore, E. Smargiassi, and F. Buda, "Theory of Ab Initio Molecular-Dynamics Calculations," Physical Review A, 44, 6334, 1991.
[7] D. Marx and J. Hutter, "Ab-Initio Molecular Dynamics: Theory and Implementation," In: Modern Methods and Algorithms of Quantum Chemistry, John von Neumann Institute for Computing, FZ Jülich, pp. 301–449, 2000.
[8] F. Tassone, F. Mauri, and R. Car, "Acceleration Schemes for Ab Initio Molecular-Dynamics Simulations and Electronic-Structure Calculations," Physical Review B, 50, 10561, 1994.
[9] C. Cavazzoni and G.L. Chiarotti, "A Parallel and Modular Deformable Cell Car–Parrinello Code," Computer Physics Communications, 123, 56, 1999.
[10] R. Car and M. Parrinello, "The Unified Approach for Molecular Dynamics and Density Functional Theory," In: A. Polian, P. Loubeyre, and N. Boccara (eds.), Simple Molecular Systems at Very High Density, Plenum, New York, p. 455, 1989.
[11] P. Pulay, "Ab Initio Calculation of Force Constants and Equilibrium Geometries," Molecular Physics, 17, 197, 1969.
[12] P. Giannozzi, F. De Angelis, and R. Car, "First-Principle Molecular Dynamics with Ultrasoft Pseudopotential: Parallel Implementation and Application to Extended Bio-Inorganic Systems," Journal of Chemical Physics, 120, 5903–5915, 2004.
[13] S. Fantacci, F. De Angelis, A. Sgamellotti, and N. Re, "Dynamical Density Functional Study of the Multistep CO Insertion into Zirconium–Carbon Bonds Anchored to a Calix[4]arene Moiety," Organometallics, 20, 4031, 2001.
[14] L.D. Durfee and I.P. Rothwell, "Chemistry of Eta-2-acyl, Eta-2-iminoacyl, and Related Functional Groups," Chemical Reviews, 88, 1059, 1988.
1.5 ELECTRONIC STRUCTURE CALCULATIONS WITH LOCALIZED ORBITALS: THE SIESTA METHOD

Emilio Artacho (1), Julian D. Gale (2), Alberto García (3), Javier Junquera (4), Richard M. Martin (5), Pablo Ordejón (6), Daniel Sánchez-Portal (7), and José M. Soler (8)

(1) University of Cambridge, Cambridge, UK
(2) Curtin University of Technology, Perth, Western Australia, Australia
(3) Universidad del País Vasco, Bilbao, Spain
(4) Rutgers University, New Jersey, USA
(5) University of Illinois at Urbana, Urbana, IL, USA
(6) Instituto de Materiales, CSIC, Barcelona, Spain
(7) Donostia International Physics Center, Donostia, Spain
(8) Universidad Autónoma de Madrid, Madrid, Spain
S. Yip (ed.), Handbook of Materials Modeling, 77–91. © 2005 Springer. Printed in the Netherlands.

Practical quantum mechanical simulations of materials, which take into account explicitly the electronic degrees of freedom, are presently limited to about 1000 atoms. In contrast, the largest classical simulations, using empirical interatomic potentials, involve over $10^9$ atoms. Much of this factor of $10^6$ is due to the existence of well-developed order-N algorithms for the classical problem, in which the computer time and memory scale linearly with the number of atoms $N$ of the simulated system. Furthermore, such algorithms are well suited for execution on parallel computers, requiring rather little interprocessor communication. In contrast, nearly all quantum mechanical simulations involve a computational effort which scales as $O(N^3)$, that is, as the cube of the number of atoms simulated. Such an intrinsically more expensive dependence is due to the delocalized character of the electron wavefunctions. Since the electrons are fermions, every one of the $\sim N$ occupied wavefunctions must be kept orthogonal to every other one, thus requiring $\sim N^2$ constraints, each involving an integral over the whole system, whose size is also proportional to $N$. Despite such intrinsic difficulties, the last decade has seen an intense advance in algorithms that allow quantum mechanical simulations with an
$O(N)$ computational effort. Such algorithms are based on avoiding the spatially extended electron eigenfunctions and using instead quantities, such as the one-electron density matrix, that are spatially localized, thus allowing for a spatial decomposition of the electronic problem. This strategy exploits what Walter Kohn has called the nearsightedness of the electron gas [1]. Its implementation requires, or is greatly facilitated by, the use of a spatially localized basis set, such as a linear combination of atomic orbitals (LCAO). This paper gives a brief overview of such methods and describes in some detail one of them, the Spanish Initiative for Electronic Simulations with Thousands of Atoms (SIESTA).
1. Order-N Algorithms
Despite their relatively recent development, there are already good reviews of $O(N)$ methods for the electronic structure problem, such as those of Ordejón [2] and Goedecker [3]. Here we will only explain briefly the basic difficulties and lines of solution, emphasizing the more practical aspects. Although some methods, such as that of Car and Parrinello, use a direct minimization approach, it is pedagogically convenient to consider the solution of the electronic problem as a two-step process. First, one needs to find the Hamiltonian (and, where needed, the overlap) matrix in some convenient basis. Second, one has to find the solution of Schrödinger's equation in that representation, that is, the electron wavefunctions or density matrix as a linear combination of basis functions. Since the effective electron potential, and therefore the Hamiltonian, depends on the electron density, this two-step process has to be iterated to self-consistency. Although both steps require highly nontrivial algorithms to be performed with $O(N)$ effort, from a physical point of view the second one involves more fundamental problems and solutions. We will therefore give first, in this section, an overview of the second step, and leave for the next section the technical solution of the first step (the construction of the Hamiltonian), in the context of SIESTA.

Although $O(N)$ methods have been developed for Hartree–Fock calculations as well, here we will restrict ourselves to density functional theory (DFT) because the methods are more mature and easier to understand in this context. There are numerous good introductory reviews of DFT, such as Ref. [4]. A central quantity in most $O(N)$ methods is the one-electron density operator

$$\hat\rho = \sum_i |\psi_i\rangle\, f(\epsilon_i)\, \langle\psi_i|. \tag{1}$$

Its representation in real space is the density matrix

$$\rho(\mathbf{r}, \mathbf{r}') = \sum_i f(\epsilon_i)\, \psi_i(\mathbf{r})\, \psi_i^*(\mathbf{r}'), \tag{2}$$
where $\psi_i(\mathbf{r})$ is the $i$th eigenfunction of the Kohn–Sham one-electron Hamiltonian of DFT, $\epsilon_i$ is its corresponding eigenvalue, and $f(\epsilon_i)$ is its Fermi–Dirac occupation factor. Such a representation is appropriate for recent schemes that use finite difference formulae, in a real space grid of points, to solve the Kohn–Sham equations. We will assume, however, that a basis set of some kind of localized orbitals $\phi_\mu(\mathbf{r})$ is used to expand the electron wavefunctions: $\psi_i(\mathbf{r}) = \sum_\mu c_{i\mu}\phi_\mu(\mathbf{r})$. In this case the density matrix takes the form $\rho(\mathbf{r}, \mathbf{r}') = \sum_{\mu\nu}\rho_{\mu\nu}\,\phi_\mu(\mathbf{r})\,\phi_\nu^*(\mathbf{r}')$, where $\rho_{\mu\nu} = \sum_i f(\epsilon_i)\, c_{i\mu} c_{i\nu}^*$. The density matrix allows one to generate all the quantities required for a self-consistent DFT calculation. The electron density is simply its diagonal, $\rho(\mathbf{r}) = \rho(\mathbf{r}, \mathbf{r})$, from which the Hartree (electrostatic) and exchange-correlation potentials can be calculated. The electronic kinetic energy is given by

$$E_{\mathrm{kin}} = -\frac{1}{2}\int \left[\nabla_{\mathbf{r}}^2\, \rho(\mathbf{r}, \mathbf{r}')\right]_{\mathbf{r}'=\mathbf{r}} d^3r = \sum_{\mu\nu} \rho_{\mu\nu} T_{\nu\mu}, \tag{3}$$

where, using atomic units ($e = m_e = \hbar = 1$),

$$T_{\nu\mu} = -\frac{1}{2}\int \phi_\nu^*(\mathbf{r})\, \nabla^2 \phi_\mu(\mathbf{r})\, d^3r. \tag{4}$$

Notice from Eq. (2) that the electron eigenstates $\psi_i(\mathbf{r})$ are also eigenvectors of the density matrix, whose corresponding eigenvalues are the occupation factors $f(\epsilon_i)$. However, diagonalizing $\rho_{\mu\nu}$ is an $O(N^3)$ operation, no cheaper than diagonalizing the Hamiltonian, so that quantities that depend on the eigenvectors, like the band structure or the density of states, are not usually obtained in $O(N)$ calculations (although there are special $O(N)$ techniques to obtain some of these quantities partially [3]).

The central role of $\rho(\mathbf{r}, \mathbf{r}')$ in $O(N)$ methods stems from the fact that it is sparse: when $\mathbf{r}$ and $\mathbf{r}'$ are far apart, $\rho(\mathbf{r}, \mathbf{r}')$ becomes negligibly small. To see this, it suffices to consider a uniform electron gas. In this case, the one-electron eigenfunctions become plane waves of the form $\psi_{\mathbf{k}}(\mathbf{r}) = \exp(i\mathbf{k}\cdot\mathbf{r})/\sqrt{\Omega}$, where $\mathbf{k}$ is a wave vector and $\Omega$ is the system volume. By substitution into Eq. (2), it is easy to see that $\rho(\mathbf{r}, \mathbf{r}')$, which in this case depends only on $|\mathbf{r} - \mathbf{r}'|$, is simply the Fourier transform of the Fermi function in k-space: $f(k) = 1$ if $|\mathbf{k}| \le k_F$, and $f(k) = 0$ otherwise, at zero temperature. Its Fourier transform $\rho(|\mathbf{r} - \mathbf{r}'|)$ decays as $\cos(k_F|\mathbf{r} - \mathbf{r}'|)/|\mathbf{r} - \mathbf{r}'|^2$. Furthermore, it turns out that the free electron gas at $T = 0$ is the worst possible case: at finite temperature the decay is exponential, with a decay constant proportional to the temperature. For an insulator, the decay is also exponential, even at zero temperature, with a decay constant that increases with the energy gap [3]. Therefore, the number of non-negligible values of $\rho(\mathbf{r}, \mathbf{r}')$ increases only linearly with the size of the system, with a prefactor that depends on its bonding character, and particularly on whether it is metallic or insulating. We will see that the computational effort (execution time and memory) is directly related to the number of those
non-negligible matrix elements. In practice, for metallic systems, the prefactor is so large that the crossover system size, at which $O(N)$ methods become computationally competitive with traditional $O(N^3)$ methods, has not yet been reached. We will therefore assume that the systems that we are considering are insulators, even though some (but not all) of the methods described could in principle be applied to metals as well.

Chronologically, the first quantum mechanical $O(N)$ method, the divide and conquer (DC) scheme of Weitao Yang et al., is also conceptually the simplest from a physical point of view (recursion and other methods based on Green's functions, developed in the 1970s, were also linear scaling; their linear-scaling character was not the driving force behind them, though, and they are not so well suited for self-consistent studies). It is based on dividing the whole system into smaller pieces, each surrounded by a buffer region, that are then treated (including the buffer) by conventional quantum mechanical methods, i.e., by diagonalizing the local Hamiltonian. Using a common value for the chemical potential (Fermi energy) allows for charge transfer among different regions. From this treatment, the density (in the first proposal) or the density matrix (in a subsequent development) of the different pieces are combined to generate that of the entire system. The matrix elements between points (or orbitals) in different spatial pieces are obtained from those between the pieces themselves and their buffer regions (the elements between two buffer points are not used). Thus, the width of the buffer regions must account fully for the decay of $\rho(\mathbf{r}, \mathbf{r}')$. Beyond this width, usually called the localization radius, the matrix elements are neglected. In practice, this implies rather large buffer regions, making the method more expensive than other, more recent, $O(N)$ methods.
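The contrast between metallic and insulating decay of the density matrix can be illustrated with a toy model. The sketch below is an assumption-laden example, not a calculation from this chapter: it uses a one-dimensional tight-binding ring with one orthonormal orbital per site, alternating hoppings `t1` and `t2`, half filling and zero temperature. Equal hoppings give a gapless (metallic) chain; unequal hoppings open a gap. In one dimension the metallic decay is a power law different from the three-dimensional result quoted above, so only the qualitative difference is meaningful.

```python
import numpy as np

def ring_density_matrix(n_sites, t1, t2):
    """Zero-temperature density matrix of a half-filled tight-binding ring
    with alternating hoppings t1, t2 (toy model, orthonormal basis)."""
    h = np.zeros((n_sites, n_sites))
    for i in range(n_sites):
        j = (i + 1) % n_sites
        h[i, j] = h[j, i] = -(t1 if i % 2 == 0 else t2)
    eps, c = np.linalg.eigh(h)
    occ = c[:, : n_sites // 2]      # lowest half of the states, f = 1
    return occ @ occ.T

# 202 sites: an even ring size for which both systems are closed-shell.
n_sites = 202
rho_metal = ring_density_matrix(n_sites, 1.0, 1.0)   # no gap
rho_insul = ring_density_matrix(n_sites, 1.0, 0.5)   # gapped

far = np.arange(41, 62)   # sites far from site 0
tail_metal = np.max(np.abs(rho_metal[0, far]))
tail_insul = np.max(np.abs(rho_insul[0, far]))
```

Comparing `tail_insul` with `tail_metal` shows how much faster the matrix elements of the gapped chain become negligible, which is what makes truncation at a localization radius practical for insulators.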
The second $O(N)$ method to be mentioned, the Fermi operator expansion (FOE), constructs the whole (though sparse) density matrix as an expansion in powers of the Hamiltonian. To this end, one expands the Fermi–Dirac function (conveniently smoothed) as a polynomial, within some energy range: $f(\epsilon) = \sum_{n=0}^{n_{\max}} a_n \epsilon^n$, for $\epsilon_{\min} < \epsilon < \epsilon_{\max}$. In practice, one uses $n_{\max} + 1$ Chebyshev polynomials rather than powers of $\epsilon$ for stability reasons, but this is just a technical point [3]. Then one constructs the density matrix (by performing $n_{\max}$ multiplications of the Hamiltonian) as

$$\hat\rho = \sum_{n=0}^{n_{\max}} a_n \hat H^n, \tag{5}$$

where the coefficients $a_n$ are the same as before. To keep the $O(N)$ scaling of the computation, one needs to restrict the spatial range, within the required localization radius, after each matrix multiplication. To understand the effect of this operator, consider its application to an eigenvector of the Hamiltonian. Provided that the eigenvalue is within the range of the expansion, the result
will be $\hat\rho\psi = \sum_{n=0}^{n_{\max}} a_n \epsilon^n \psi = f(\epsilon)\psi$. This is exactly the effect of the density matrix operator of Eq. (1).

A closely related method is the Fermi operator projection (FOP), in which one starts from a trial set of electron wavefunctions, each constrained within a different localization region (usually around atoms), and applies the expansion (5) of the density matrix operator (without constructing it) to the trial functions, projecting them into the occupied subspace. One still needs to make them orthogonal but, since they are spatially localized by construction, the process can be performed in $O(N)$ operations. The resulting functions are a complete representation of the density matrix, of size $N_{el} \times N_{loc}$, with $N_{el}$ the number of electrons and $N_{loc}$ the number of basis orbitals within a localization region. In contrast, the normal representation of the density matrix, used in the FOE method, has $N_{basis} \times N_{loc}$ nonzero matrix elements, where $N_{basis}$ is the number of basis orbitals, which is substantially larger than $N_{el}$. Therefore, the FOP method is more efficient than the FOE.

In the density matrix minimization (DMM) method of Li, Nunes and Vanderbilt, the entire sparse density matrix is also obtained, in this case by minimizing the total energy as a function of its matrix elements in a localized basis set of atomic orbitals [5], grid points, or some other kind of support functions [6]. Again, matrix elements separated by more than a pre-established localization radius are neglected. A complication is that in performing the minimization, one must impose the constraint that the eigenvalues of the density matrix (i.e., the occupation weights) must be between zero and one, as required by the Pauli exclusion principle (for simplicity, we will consider combined spin–orbital indexes $\mu$ and $i$, so that each basis orbital or electron state has a defined spin and contains a single electron).
At zero temperature, the constrained energy minimization will make all the eigenvalues either zero (above the Fermi energy) or one, which amounts to making the matrix $\rho$ idempotent: $\rho^2 = \rho$ (since all the eigenvalues of $\rho^2$ will be identical to those of $\rho$). To impose this constraint, one introduces an auxiliary matrix $\tilde\rho_{\mu\nu}$, with the same dimensions, and defines the density matrix using the McWeeny "purification" transformation $\rho = 3\tilde\rho^2 - 2\tilde\rho^3$. Thus, the eigenvalues of $\rho$ and $\tilde\rho$ are related by $f_i = 3\tilde f_i^2 - 2\tilde f_i^3$. It can be easily seen that, if $\tilde f_i$ is between $-1/2$ and $3/2$, then $f_i$ is within the required range $0 \le f_i \le 1$. And if $\tilde f_i$ is close to either 0 or 1, then $f_i$ is even closer to these values. This allows for an unconstrained minimization of the energy as a function of the auxiliary matrix: $E_{\mathrm{tot}}(\rho(\tilde\rho)) = \min$. A practical problem is that the spatial range of $\tilde\rho^3$ is three times larger than the localization radius of $\tilde\rho$. To improve efficiency, one may truncate $\rho$ further, although this degrades its exact idempotency, introducing extra errors. If the basis set is not orthonormal, $\tilde\rho^3$ becomes $(\tilde\rho S)^3$ and the problem worsens.

Like the FOP method, the orbital minimization (OM) approach uses a set of $\sim N_{el}$ localized wavefunctions, conventionally called Wannier functions.
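The two properties of the McWeeny transformation quoted above can be checked with a short sketch. It assumes an orthonormal basis and dense matrices with no truncation (unlike an actual $O(N)$ implementation), and the matrix size, occupation and noise amplitude are arbitrary illustrative values.

```python
import numpy as np

def mcweeny(rho_tilde):
    """One McWeeny purification step, rho = 3*rho~^2 - 2*rho~^3
    (orthonormal basis assumed, no sparsity truncation)."""
    rho2 = rho_tilde @ rho_tilde
    return 3.0 * rho2 - 2.0 * rho2 @ rho_tilde

# Scalar map on the eigenvalues: the interval [-1/2, 3/2] is sent into [0, 1].
f_tilde = np.linspace(-0.5, 1.5, 201)
f = 3.0 * f_tilde**2 - 2.0 * f_tilde**3

# A slightly non-idempotent matrix is driven back towards idempotency.
rng = np.random.default_rng(0)
n, n_occ = 20, 8
q, _ = np.linalg.qr(rng.normal(size=(n, n)))
projector = q[:, :n_occ] @ q[:, :n_occ].T       # exact projector, rho^2 = rho
noise = rng.normal(size=(n, n))
rho = projector + 0.005 * (noise + noise.T)     # symmetric perturbation
for _ in range(8):
    rho = mcweeny(rho)
idempotency_error = np.linalg.norm(rho @ rho - rho)
```

Each step roughly squares the deviation of an eigenvalue from 0 or 1, which is why a few iterations suffice in this example.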
These wavefunctions are optimized, within their respective localization regions, by minimizing a modified total energy functional proposed by Kim, Mauri, and Galli, which has the form

$$E = \mathrm{Tr}\left[(H - \mu I)(2S - I)\right], \tag{6}$$

where $H_{ij}$ and $S_{ij}$ are, respectively, the Hamiltonian and overlap matrix elements between the localized states $i$ and $j$, $I_{ij} \equiv \delta_{ij}$ is the identity matrix, and $\mu$ is the chemical potential (Fermi energy). Although not immediately obvious, it has been shown that this functional form has very convenient properties. Initially, the localized orbitals need not be orthonormal, but the functional penalizes them for not being so, in such a way that they become orthogonal as a result of the unconstrained minimization. Furthermore, although more localized orbitals are used than the number of electrons, the minimization retains only $N_{el}$ of them with norm equal to one, while the norms of the rest vanish. A problem with this method is that it usually requires a very large number (frequently over 1000) of iterations in the first functional minimization (for the first Hamiltonian). This is a consequence of the minimization problem becoming ill-conditioned when the localization regions are imposed on the wavefunctions. Subsequent minimizations, during the self-consistency process and geometry relaxation, require far fewer iterations (typically of the order of ten), so that the initial minimization problem is not so important in most practical calculation projects. Another practical problem is the choice of the chemical potential $\mu$, which must lie within the energy gap to ensure charge conservation. Furthermore, the self-consistency process and geometry relaxation may result in a shift of the gap, thus requiring cumbersome changes of $\mu$ during the calculation.

There are also hybrid methods. Gillan et al. use the DMM method, optimizing a density matrix expanded in a rather small basis of localized orbitals. These orbitals are in turn optimized by expanding them in terms of a much richer basis of finite elements called "blips" [6].
Bernholc et al. use a similar approach, sometimes called the quasi-O(N) method [7], in which a conventional diagonalization, rather than DMM, is used to find the eigenvectors (and the density matrix) in terms of the small basis of localized orbitals, which are then optimized on a fine real-space grid. Although the diagonalization step is $O(N^3)$, the small size of the localized orbital basis, and thus of the Hamiltonian, implies a small prefactor, allowing for simulations of rather large systems in practice, including metallic ones.
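Returning to the Kim, Mauri, and Galli functional of Eq. (6), its two key properties can be checked on a toy model. The following is only a minimal sketch, assuming the commonly quoted form E = Tr[(H − µS)(2I − S)] without a spin factor; the six-site dimerized chain, the value µ = 0 and the function name `kmg_energy` are illustrative assumptions, not part of SIESTA.

```python
import numpy as np

def kmg_energy(H, C, mu):
    """E = Tr[(H_loc - mu*S)(2I - S)] for orbitals stored as the columns of C.

    Assumed form of the functional (no spin factor); C need not be orthonormal.
    """
    S = C.T @ C              # overlap between the trial orbitals
    H_loc = C.T @ H @ C      # Hamiltonian between the trial orbitals
    I = np.eye(S.shape[0])
    return float(np.trace((H_loc - mu * S) @ (2.0 * I - S)))

# Hypothetical toy Hamiltonian: dimerized 6-site chain in an orthonormal site basis.
hops = [-1.0, -0.5, -1.0, -0.5, -1.0]
H = np.diag(hops, 1) + np.diag(hops, -1)
eps, V = np.linalg.eigh(H)
mu = 0.0                                  # chosen inside the gap of this toy model
n_occ = int(np.sum(eps < mu))

# One more orbital than occupied states: occupied eigenvectors plus a zero-norm orbital.
C_opt = np.zeros((6, n_occ + 1))
C_opt[:, :n_occ] = V[:, :n_occ]
E_opt = kmg_energy(H, C_opt, mu)
E_ref = float(np.sum(eps[:n_occ] - mu))   # band energy measured from mu

# Scan the norm of a single orbital: an occupied state prefers norm 1,
# an empty state prefers norm 0 (within the scanned range).
scales = np.linspace(0.0, 1.4, 141)
E_occ = [kmg_energy(H, c * V[:, :1], mu) for c in scales]
E_emp = [kmg_energy(H, c * V[:, -1:], mu) for c in scales]
best_occ = float(scales[int(np.argmin(E_occ))])
best_emp = float(scales[int(np.argmin(E_emp))])
print(E_opt, E_ref, best_occ, best_emp)
```

The scan is restricted to norms below √2 on purpose: for states above µ the functional is not bounded from below at large norm, so the zero-norm solution is a local minimum rather than a global one.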
2. The SIESTA method
The O(N) methods described in the previous section were developed initially in the context of tight-binding calculations, in which the Hamiltonian
matrix elements, between atomic orbitals of a minimal basis set, are given by empirical formulae for any atomic positions. This allows one to concentrate on the more fundamental problem of finding the electron states, given a Hamiltonian of minimum size, without worrying about how to obtain such a Hamiltonian self-consistently. This latter problem, although more prosaic and technical, involves a large number of small sub-problems, such as finding good and efficient pseudopotentials and basis sets, calculating the electron density from the electron wavefunctions, the Hartree and exchange-correlation potentials from the density, the matrix elements of the kinetic and potential operators, the atomic forces, etc. Although none of these problems poses essential difficulties, solving all of them with an O(N) effort is a major enterprise that involves tens or hundreds of thousands of lines of code. Therefore, there are not many well-developed codes able to perform practical O(N) DFT simulations. In this respect, we may cite, apart from SIESTA: the implementation of the DMM method in the GAUSSIAN code [5]; the CONQUEST code, which uses the hybrid approach mentioned in the previous section [6]; and the recent ONETEP code [8], which uses finite-cutoff representations of Dirac delta functions as its basis set. Although they do not use strictly O(N) methodology, we will also mention the FIREBALL code of Lewis et al. [9], which was the precursor of SIESTA in many technical aspects, as well as that of Lippert et al. [10], which employs a very similar approach.

The first major decision in any DFT implementation concerns the choice of the basis set. Traditionally, most codes developed in the condensed matter community employ plane waves (PWs). They are conceptually simple and asymptotically complete. Most importantly, this completeness is very easy to approach in a systematic way, which greatly simplifies their practical use.
Because they do not depend on the atomic positions, plane waves are also spatially unbiased, which simplifies many developments and eliminates spurious effects like Pulay forces, even when the basis is far from converged. In addition, there are some very efficient techniques, particularly the fast Fourier transform (FFT), that greatly help and simplify the implementation of an efficient plane wave code. PWs also have disadvantages: being unbiased, they can represent any function equally well, but they are not especially well suited to represent any one in particular. In comparison, the atomic orbitals traditionally used in quantum chemistry are especially well suited to represent the electron wavefunctions, and therefore they are much more efficient. Thus, one frequently needs tens or even hundreds of PWs per atom to achieve the same accuracy as a minimal basis of just four atomic orbitals. However, when comparing basis set efficiency, it is essential to consider the target accuracy of the calculations. LCAO bases are very efficient initially (i.e., for low accuracies). They can also achieve very high accuracies, but they are much harder to improve systematically than PWs. Therefore, in terms of both human and computational effort, LCAO basis sets become less and less convenient, compared to PWs, as the required accuracy increases.
In practice, most simulation projects involve a huge number of trial calculations, to check the importance and the convergence of many effects and parameters, to explore candidate geometries and compositions, etc. To perform this initial exploration efficiently, it is extremely useful to have a method (and a basis set in particular) that allows a uniform transition from very fast “quick and dirty” calculations to very accurate ones. LCAO bases allow precisely that. Apart from the pros and cons of PWs mentioned before, their main disadvantage for us is their intrinsic inadequacy for O(N) calculations. This is because each plane wave extends over the whole system, making PWs inadequate to expand localized wave functions. Partly for this reason, the last decade has seen a renaissance of real-space methods, in which the electron wave functions are represented directly on a grid of points [11]. Such a “basis” has many of the advantages of PWs, especially its systematic completeness, while it is also perfectly adequate to represent localized wave functions. It also allows for implementing a variety of boundary conditions, apart from the periodic ones imposed by PWs. In practice, however, considerably more real-space points are required than the already numerous PWs to achieve a similar precision, which imposes important limitations, especially in computer memory.

The other main alternative for bases to implement O(N) methods is LCAO. This is the traditional workhorse basis of quantum chemistry methods, in most of which the atomic orbitals are in turn expanded as a linear combination of Gaussian orbitals. This Gaussian expansion greatly facilitates the calculation of the three- and four-center integrals required in Hartree–Fock and configuration interaction methods. However, it is not especially useful for calculating the matrix elements of the nonlinear exchange and correlation potential, needed in DFT.
In this case, it is better to use numerical orbitals, given by the product of a spherical harmonic times a radial function, represented on a fine radial grid. Furthermore, in order to expand the localized electron states and density matrices used in O(N) methods, it is conceptually and practically useful that the basis functions are strictly localized, i.e., defined to be zero beyond a specified radius. Such orbitals were proposed by Sankey and Niklewski and implemented in the codes FIREBALL [9] and SIESTA [12, 13]. They are generated by solving, for each angular momentum, the radial Schrödinger equation for the corresponding nonlocal pseudopotential. At the atomic orbital eigenvalue, the wavefunction decays exponentially for r → ∞. If the energy is shifted to a slightly higher value, the wavefunction has a node at some radius $r_c$, and may be considered as the solution under the constraint of a hard wall at $r_c$. Using a common “energy shift” for all atoms and angular momenta (which implies a different $r_c$ for each one) provides a balanced basis, avoiding or mitigating spurious charge transfers. This scheme has the disadvantage of generating orbitals with a discontinuous derivative (kink) at $r_c$, although this has been shown to have only a small effect on the energy of condensed systems.
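The link between an “energy shift” and the cutoff radius $r_c$ can be illustrated on the simplest possible case. The sketch below is an assumption-laden toy, not the SIESTA generator: it uses the bare Coulomb potential of hydrogen (l = 0, atomic units) instead of a pseudopotential, integrates the radial equation outward with the Numerov method, and returns the first node of u(r) = r R(r). The function name and the step size are hypothetical choices.

```python
import numpy as np

def first_node_radius(energy, r_max=30.0, h=1e-3):
    """First node (bohr) of the outward l=0 solution u(r) for V(r) = -1/r.

    energy is in hartree; the hard wall of the confined orbital sits at this node.
    Returns inf if no node is found below r_max.
    """
    n = int(round(r_max / h))
    r = h * np.arange(1, n + 1)
    f = 2.0 * (-1.0 / r - energy)        # radial equation written as u'' = f(r) u
    w = 1.0 - h * h * f / 12.0           # Numerov weights
    # Small-r series for the Coulomb problem: u(r) = r - r^2 + O(r^3)
    u_prev, u = r[0] - r[0] ** 2, r[1] - r[1] ** 2
    for i in range(1, n - 1):
        u_next = ((12.0 - 10.0 * w[i]) * u - w[i - 1] * u_prev) / w[i + 1]
        if u_next <= 0.0:
            # Linear interpolation of the zero between r[i] and r[i+1]
            return float(r[i] + h * u / (u - u_next))
        u_prev, u = u, u_next
    return float("inf")

e_1s = -0.5                                    # free-atom 1s eigenvalue (hartree)
rc_small_shift = first_node_radius(e_1s + 0.01)
rc_large_shift = first_node_radius(e_1s + 0.02)
rc_check = first_node_radius(-0.125)           # 2s energy: exact node at r = 2 bohr
print(rc_small_shift, rc_large_shift, rc_check)
```

A larger energy shift gives a shorter orbital, which is the trade-off between accuracy and sparsity that the common “energy shift” parameter controls for all species at once.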
To generate a richer basis set, SIESTA splits these numerical atomic orbitals (NAOs) into the sum of a smooth part with even shorter range, plus a remainder, treating both parts as variationally independent basis orbitals, and in this way making the basis set radially more flexible. This splitting, inspired by the “split-valence” procedure used with Gaussian-expanded orbitals in quantum chemistry, can be repeated to generate multiple-ζ bases for each valence orbital. To add angular flexibility as well, polarization orbitals with higher angular momentum can be included. To provide them, SIESTA finds the perturbation created in the valence orbitals by an applied electric field. These polarization orbitals can also be “split,” using the previously described method, to create arbitrarily rich basis sets.

It is well known that the optimal atomic basis orbitals are environment dependent. The simplest example is the hydrogen molecule, in which the optimal exponential atomic orbitals decay as $e^{-r}$ (in atomic units) for large interatomic separations (isolated atoms) and as $e^{-2r}$ for zero separation (helium atom). To account for this effect, the basis orbitals can be optimized variationally (i.e., by minimizing the total energy) within an environment similar to (but simpler than) that in which they will be used. The transferability improves as the number of atomic orbitals in the basis set increases. To eliminate the kink present at $r_c$ in the orbitals of Sankey and Niklewski, it is convenient to use as variational parameters those defining a soft confinement potential, which diverges at $r_c$. As with the “energy shift” of the hard-potential orbitals, it is important to use a common “pressure” parameter, for all the atoms and angular momenta, that controls the range of the orbitals during the optimization process [14].
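The environment dependence of the optimal exponent quoted above can be reproduced with a one-line variational estimate. This is a hedged sketch under a strong simplifying assumption: a single electron in the field of a bare point charge Z, with electron–electron repulsion ignored, for which a normalized trial orbital $e^{-\zeta r}$ has energy E(ζ) = ζ²/2 − Zζ in atomic units. The function names and the scan grid are illustrative.

```python
import numpy as np

def trial_energy(zeta, Z):
    """One-electron energy (hartree) of a normalized exp(-zeta*r) orbital
    around a bare charge Z: kinetic zeta^2/2 plus potential -Z*zeta."""
    return 0.5 * zeta ** 2 - Z * zeta

def optimal_zeta(Z):
    """Variationally optimal exponent found by a simple grid scan."""
    grid = np.linspace(0.1, 4.0, 3901)     # step of 0.001 bohr^-1
    return float(grid[np.argmin(trial_energy(grid, Z))])

zeta_separated = optimal_zeta(1.0)   # isolated hydrogen atoms: Z = 1
zeta_united = optimal_zeta(2.0)      # united-atom limit: Z = 2
print(zeta_separated, zeta_united)
```

The same idea, minimizing the total energy with respect to the parameters that define the orbital shape, underlies the variational basis optimization described above, although there the parameters define a confinement potential rather than an exponent.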
To handle the core electrons efficiently, SIESTA uses the norm-conserving pseudopotentials of Troullier and Martins, in the fully nonlocal form of Kleinman and Bylander:

\[ \hat{V}^{PS} = \int d^3r\, |\mathbf{r}\rangle V_{local}(r) \langle\mathbf{r}| + \sum_{l,m}^{l_{max}} |\chi_{lm}\rangle V_l \langle\chi_{lm}|, \tag{7} \]

where $V_{local}(r)$ decays as $-Z_{val}/r$ when $r \to \infty$. Since these pseudopotentials have become standard in condensed matter electronic structure codes, and they have been covered in other chapters of this handbook, we will only mention that, in SIESTA, $V_{local}(r)$ is optimized for smoothness, rather than using the semilocal pseudopotential of a given angular momentum.

The Hamiltonian and overlap matrix elements contain several terms. The simplest ones to calculate in O(N) operations are those involving two-center integrals between overlapping orbitals, because each orbital overlaps only with a small number of other orbitals, independent of the system size. These matrix elements are the overlap elements themselves, $S_{\mu\nu} = \langle\phi_\mu|\phi_\nu\rangle$, the integrals $\langle\chi_{lm}|\phi_\mu\rangle$ involved in the second term of Eq. (7), and the kinetic matrix elements $T_{\mu\nu} = \langle\phi_\mu| -\tfrac{1}{2}\nabla^2 |\phi_\nu\rangle$. All of these are calculated in Fourier
space, using convolution techniques, and stored as a product of spherical harmonics times numerical radial functions, interpolated on a fine radial grid [13]. To compute the matrix elements of the local potentials, we first find the electron density $\rho(\mathbf{r})$, on a regular three-dimensional grid of points $\mathbf{r}$, from the density matrix:

\[ \rho(\mathbf{r}) = \sum_{\mu\nu} \rho_{\mu\nu}\, \phi_\mu(\mathbf{r})\, \phi_\nu(\mathbf{r}). \tag{8} \]
Notice that, for a given point $\mathbf{r}$, only a few orbitals are nonzero at $\mathbf{r}$ and contribute to the sum, so that the evaluation of $\rho(\mathbf{r})$ is an O(N) operation, given that the number of grid points scales linearly with the volume, which in turn is proportional to N. From $\rho(\mathbf{r})$ we calculate the Hartree potential $V_H(\mathbf{r})$ (the electrostatic potential created by $\rho(\mathbf{r})$) using FFT. This step scales as N log(N) and is therefore not strictly O(N). In practice it represents only a very minor part of the whole calculation, even for the largest systems considered up to now. Should this step become dominant, we may switch to other methods, like fast multipoles or multigrid algorithms, that are strictly O(N). The exchange and correlation potential $V_{xc}(\mathbf{r})$ is computed in the local density (LDA) or generalized gradient (GGA) approximation, the latter using finite difference derivatives. We then find the total effective potential $V_{eff}(\mathbf{r})$ by adding the local pseudopotentials of all the atoms to $V_H(\mathbf{r}) + V_{xc}(\mathbf{r})$. Since both $V_{local}$ and $V_H$ have long-range parts with opposite signs, we subtract from each of them the electrostatic potential created by a reference density, the sum of the electron densities of the free atoms. We then find the matrix elements $\langle\phi_\mu|V_{eff}|\phi_\nu\rangle$ by direct integration over the grid points. Like the evaluation of $\rho(\mathbf{r})$, the effort of this step has O(N) scaling, because the number of nonzero orbitals at each grid point is independent of the system size.

The evaluation of the total energy, atomic forces, and stress tensor proceeds simultaneously with that of the Hamiltonian matrix elements, using the last density matrix available during the self-consistency process. For example, the kinetic and Hartree energies are given by $E_{kin} = \sum_{\mu\nu} \rho_{\mu\nu} T_{\nu\mu}$ and $E_H = \frac{1}{2} \sum_{\mu\nu} \rho_{\mu\nu} \langle\phi_\nu|V_H|\phi_\mu\rangle$, respectively. The factor 1/2 prevents double counting of the electron–electron interactions.
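The grid evaluation of the density in Eq. (8) can be sketched in one dimension. Everything below is a toy under stated assumptions: the cos² bump standing in for a strictly localized orbital, the equally spaced chain of centers, the random symmetric matrix used as a density matrix, and the function names are all hypothetical. The point is only that each orbital pair touches the few grid points where both orbitals are nonzero, so the work per atom does not grow with the number of atoms.

```python
import numpy as np

def orbital(x, center, rc):
    """Toy strictly localized orbital: cos^2 bump, exactly zero for |x-center| >= rc."""
    s = np.abs(x - center)
    return np.where(s < rc, np.cos(0.5 * np.pi * s / rc) ** 2, 0.0)

def density_sparse(x, centers, spacing, rc, dm):
    """rho(x) = sum_{mu,nu} dm[mu,nu] phi_mu(x) phi_nu(x), visiting only overlapping pairs."""
    rho = np.zeros_like(x)
    lo = np.searchsorted(x, centers - rc, side="left")    # first grid point inside support
    hi = np.searchsorted(x, centers + rc, side="right")   # one past the last one
    reach = int(np.ceil(2.0 * rc / spacing))              # neighbours that can overlap
    n = len(centers)
    for mu in range(n):
        for nu in range(max(0, mu - reach), min(n, mu + reach + 1)):
            a, b = max(lo[mu], lo[nu]), min(hi[mu], hi[nu])
            if a >= b:
                continue                                   # supports do not intersect
            seg = x[a:b]
            rho[a:b] += dm[mu, nu] * orbital(seg, centers[mu], rc) * orbital(seg, centers[nu], rc)
    return rho

n_atoms, spacing, rc = 8, 1.5, 2.0
centers = spacing * np.arange(n_atoms)
x = np.linspace(-rc, centers[-1] + rc, 2001)
dx = x[1] - x[0]

rng = np.random.default_rng(0)
dm = rng.normal(size=(n_atoms, n_atoms))
dm = 0.5 * (dm + dm.T)                                     # symmetric toy density matrix

rho_sparse = density_sparse(x, centers, spacing, rc, dm)

# Dense reference: every orbital evaluated on every grid point.
phi = np.array([orbital(x, c, rc) for c in centers])
rho_dense = np.einsum("mg,mn,ng->g", phi, dm, phi)
overlap = phi @ phi.T * dx                                 # S_mu_nu by grid quadrature
charge_grid = float(np.sum(rho_sparse) * dx)
charge_trace = float(np.trace(dm @ overlap))
print(charge_grid, charge_trace)
```

The final comparison uses the identity that the integrated density equals Tr[ρS] when the overlap is computed with the same grid quadrature.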
For the forces and stress we directly use the analytic derivatives of each term of the total energy. For each term, energy, forces and stresses are computed simultaneously, in the same places of the code. This ensures exact consistency between the computed total energy and its derivatives, including all corrections like Pulay forces. Once the Hamiltonian and overlap matrices have been calculated, a new density matrix is obtained either by: (i) solving the generalized eigenvalue problem by conventional $O(N^3)$ methods of linear algebra, or (ii) using the O(N) orbital minimization method of Kim, Mauri, and Galli, described in the previous section. The first one must be used for systems that are metallic or undergo bond breaking that creates partially occupied states during the
simulation. Apart from those, systems below a threshold size actually run faster with the conventional $O(N^3)$ methods. This threshold depends on the bonding nature of the system, on the size of the basis set used, on the spatial range of the basis orbitals, and on other calculation parameters, but it is typically around 100 atoms. Even for sizes above this threshold, it may be more efficient, especially in terms of human investment, to use plain diagonalization. This is because the O(N) method is intrinsically more limited (especially for bond breaking) and more difficult to use, with more parameters to adjust: the localization radius of the Wannier orbitals and, especially, the chemical potential. As a rule of thumb, the O(N) method is practical for long geometry relaxations or molecular dynamics of systems with more than ∼300 atoms, or for short calculations with more than ∼500 atoms.

With conventional diagonalization, an important efficiency consideration is whether the computational effort is dominated by the diagonalization itself or by the construction of the Hamiltonian. In the first case, which occurs above ∼100 atoms, the only relevant efficiency parameter is the basis set size, while other parameters, like the spatial range of the basis orbitals or the fineness of the integration grid, can be increased at negligible cost, to improve the accuracy. In fact, it may be advantageous to increase the grid fineness even for efficiency reasons, since this will decrease the so-called “eggbox effect”: a spurious rippling of the potential, due to the dependence of the total energy on the atomic positions relative to the integration grid. Though slight in the energy, the effect is larger on the atomic forces, and may considerably increase the number of iterations required to relax the geometry.
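The diagonalization route, option (i) above, can be sketched for a nonorthogonal basis as follows. This is a minimal illustration rather than the SIESTA implementation: it assumes a closed-shell system with two electrons per occupied state, uses Löwdin orthogonalization with NumPy only, and the six-orbital chain with nearest-neighbour overlap 0.2 is a hypothetical test case.

```python
import numpy as np

def density_matrix(H, S, n_electrons):
    """Solve H c = eps S c and return (P, eps), with P = 2 * sum_occ c c^T.

    Assumes a closed shell (n_electrons even) and real symmetric H and S.
    """
    s, U = np.linalg.eigh(S)
    X = U @ np.diag(s ** -0.5) @ U.T      # S^(-1/2), valid because S is positive definite
    eps, C_orth = np.linalg.eigh(X @ H @ X)
    C = X @ C_orth                         # generalized eigenvectors, C^T S C = I
    C_occ = C[:, : n_electrons // 2]
    return 2.0 * C_occ @ C_occ.T, eps

# Hypothetical toy system: 6 orbitals on a chain, nearest-neighbour coupling only.
A = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
H = -1.0 * A                 # hopping matrix elements
S = np.eye(6) + 0.2 * A      # overlap matrix of the nonorthogonal basis
P, eps = density_matrix(H, S, n_electrons=6)

n_check = float(np.trace(P @ S))            # electron count, Tr[P S]
e_band = float(np.trace(P @ H))             # band energy, Tr[P H]
print(n_check, e_band)
```

Both the Löwdin step and the eigensolver scale as the cube of the matrix size, which is the $O(N^3)$ cost discussed above; the prefactor is small when the basis is small.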
We will finish this section by briefly mentioning some capabilities of SIESTA to perform a variety of calculations:
• For very fast “quick and dirty” calculations, it is possible to use the non-self-consistent Harris–Foulkes functional, in which the only Hamiltonian calculated derives from a superposition of free-atom densities. For diagonalization-dominated systems, with more than ∼100 atoms, and used in combination with a minimal basis set, this is essentially as fast as a tight-binding calculation.
• SIESTA contains algorithms for a large variety of geometry relaxations and dynamics, including the simultaneous relaxation of the lattice vectors and atomic positions, Parrinello–Rahman molecular dynamics, dynamics at constant pressure and/or temperature, etc.
• The SIESTA program itself does not consider symmetries because it is designed for large and/or dynamical systems, which generally have low or no symmetry. However, an accompanying package contains several tools to facilitate the evaluation of phonon modes and spectra, which prepare data files with the required geometries (considering the system symmetry) and process the resulting forces to calculate the phonons.
• SIESTA is able to apply an external electric field to systems like molecules, clusters, chains and slabs, as well as to calculate the spontaneous polarization of a solid, using the Berry phase formalism of King-Smith and Vanderbilt [4].
• It is also possible to simulate magnetic systems, using spin-dependent DFT, including the ability to impose the total magnetic moment, to start with antiferromagnetic configurations, and to allow noncollinear spin solutions.
• A forthcoming version will also include time-dependent DFT, using the method of Yabana and Bertsch [13].
3. DNA: A Prototype Application
SIESTA has been applied to hundreds of different systems, including solid metals, semiconductors and insulators, liquids, molecules, surfaces, nanotubes, and biological systems [15]. Of all these, for the reasons explained in the previous section, only a minority have been studied using the O(N) methodology to solve Schrödinger’s equation (although the Hamiltonian is always generated in O(N) operations). A good representative of this minority is the study of the electronic structure of DNA by Artacho et al. [16]. Apart from its obvious biological interest, DNA has generated much interest recently as a candidate for controlled self-assembly of molecular electronic devices. In this respect, its ability to conduct electricity is of the utmost interest, but very contradictory experimental results have been obtained on this ability. Furthermore, in such devices, DNA is normally found in a dry environment, very different from its conditions in vivo, which might strongly affect its structure. Thus, the goal of the calculations was to study the structural stability and the electrical conductivity of dry DNA. A preliminary calculation used the B conformation, but later studies used the A conformation, which is known experimentally to be more stable under dry conditions. The poly(C)–poly(G) sequence (only guanines in one of the strands and only cytosines in the other one) was chosen because guanine has the smallest ionization energy (and therefore the highest appetite for electron holes, which are suspected to be the relevant carriers) and because a uniform sequence is optimal for band conductivity. The CG base pair contains 65 atoms, including those in the sugar-phosphate side chains. Since the A conformation has a helix pitch of 11 base pairs, the total number of atoms per unit cell was 715. In solution, DNA is negatively ionized, by losing a proton in each phosphate group (two per base pair).
This negative charge is neutralized by positive ions in solution around the DNA chain. In dried DNA, like that deposited on surfaces, it is uncertain how the charge will be distributed, but a reasonable approximation was to restore the phosphate protons (acidic form). It must be kept in mind, however, that in reality some of
these protons (or whatever countercations) may be missing, in which case the charge must be compensated by electron holes, as in a doped semiconductor. The calculations were done with a double-ζ basis set, with additional polarization orbitals on the hydrogen atoms involved in hydrogen bonds and on the phosphorus atoms, for a total basis set size of 4510 orbitals. To find the chemical potential, an initial self-consistent calculation was performed using standard diagonalization. Then, the geometry relaxation proceeded for ∼800 steps using the O(N) method of Kim, Mauri, and Galli, with a localization radius of 4 Å for the Wannier orbitals. A final calculation, using standard diagonalization, was performed for the relaxed coordinates, to find the electron eigenfunctions and to compare the total energy and forces. The total energy with the extended eigenfunctions was only 5 meV/atom lower than with the localized Wannier functions, and the average residual force was 6 meV/Å, compared with 2 meV/Å for the linear-scaling calculation. While a geometry relaxation step takes only about one hour with the O(N) method, it takes 20 h using standard diagonalization, on a single 1 GHz Intel Pentium III processor. Despite the large number of relaxation steps, the relaxed geometry was rather close to the initial one, taken from X-ray diffraction experiments. Its structural parameters are typical of the A conformation, showing that this structure is indeed stable (or at least metastable) for dry DNA. The electronic structure shows clear bands, as expected for a periodic system. The highest valence band is formed by the guanine HOMO states, and has a width of only 40 meV. The lowest conduction band is formed by the cytosine LUMO states, with a width of 270 meV. Between them, there is a wide band gap of 2.0 eV, showing that undoped poly(C)–poly(G) must be an insulator.
Even for DNA doped with holes, the extremely narrow HOMO band suggests that the holes will become localized by any lattice disorder, according to Anderson’s model. To check this, we performed two calculations for “perturbed” systems. The first system has one of the base pairs inverted (GC instead of CG) as the simplest realization of sequence disorder, after which the geometry was relaxed again. As a result, the band structure of the system changed dramatically, and the extended Bloch states changed to states localized over two or three base pairs in particular sections of the 11-base-pair periodic cell. The second “perturbed” system was one of the intermediate geometries during the relaxation process, with “random” changes in the atomic coordinates, relative to the final relaxed positions. These coordinate changes led to a total energy difference compatible with that of thermal fluctuations at 300 K. Though not as dramatic as those of the base pair inversion, the changes in the electronic band structure were also substantial, and the electron states became localized as well, indicating in this case a strong electron–phonon interaction. These results ruled out band-like conduction of holes in doped DNA, suggesting also that holes would become localized by polaronic effects (structure deformations around
the hole). Such a suggestion was confirmed by later calculations of the hole polaron in poly(C)–poly(G) [17].
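The argument that a very narrow band is easily localized by disorder can be illustrated with a one-dimensional Anderson tight-binding chain. The sketch below is not the DFT calculation described above; it is a toy under stated assumptions: one site per base pair, a nearest-neighbour hopping of 10 meV chosen so that the clean band is 40 meV wide (4|t| for a chain), and a hypothetical uniform on-site disorder of width 100 meV. Localization is measured by the inverse participation ratio (IPR), whose inverse roughly counts the sites over which a state spreads.

```python
import numpy as np

def mean_ipr(n_sites, hopping, disorder_width, seed=0):
    """Mean inverse participation ratio of all eigenstates of an open Anderson chain.

    On-site energies are drawn uniformly from [-W/2, W/2]; energies in eV.
    """
    rng = np.random.default_rng(seed)
    onsite = disorder_width * (rng.random(n_sites) - 0.5)
    off = np.full(n_sites - 1, -hopping)
    H = np.diag(onsite) + np.diag(off, 1) + np.diag(off, -1)
    _, vecs = np.linalg.eigh(H)                 # columns are normalized eigenstates
    return float(np.mean(np.sum(vecs ** 4, axis=0)))

t = 0.010                                       # eV, gives a 40 meV clean bandwidth
ipr_clean = mean_ipr(200, t, 0.0)               # extended states: IPR of order 1/N
ipr_disordered = mean_ipr(200, t, 0.100)        # hypothetical 100 meV disorder
print(1.0 / ipr_clean, 1.0 / ipr_disordered)    # approximate number of sites per state
```

Because the disorder scale here is much larger than the hopping, the states collapse onto a few sites, in qualitative agreement with the localization over two or three base pairs reported for the perturbed systems.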
4. Outlook
Besides the differences in scaling with system size, a large part of the advantage of classical potentials for large systems stems from the ease of parallelizing the algorithms involved in their use. In the case of quantum simulations, there are codes, like CONQUEST, which have been designed from the beginning to run on massively parallel computers, and which have demonstrated their ability to run simulations with over ten thousand atoms on them. This was not the case for SIESTA, which was designed to run on modest workstations and PCs, and only later parallelized. The initial parallel versions were not very efficient, although demonstration runs with over one hundred thousand atoms were done. Recent versions have improved the parallel scaling considerably and now aim at demonstration runs with one million atoms. Much progress has also been made in a variety of acceleration techniques, from hybrid quantum mechanics–molecular mechanics to accelerated molecular dynamics. All this combined may lead very soon to unprecedented simulations of materials properties and devices with quantum mechanical methods. The major obstacle to making this possible, however, will be to find practical O(N) methods for metals and systems with broken bonds. This is a subject of very active research in which much progress is expected in the coming years.
References
[1] W. Kohn, “Density functional and density matrix method scaling linearly with the number of atoms,” Phys. Rev. Lett., 76, 3168–3171, 1996.
[2] P. Ordejón, “Order-N tight-binding methods for electronic-structure and molecular dynamics,” Comp. Mat. Sci., 12, 157–191, 1998.
[3] S. Goedecker, “Linear scaling electronic structure methods,” Rev. Mod. Phys., 71, 1085–1123, 1999.
[4] R.M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, Cambridge, 2004.
[5] G.E. Scuseria, “Linear scaling density functional calculations with Gaussian orbitals,” J. Phys. Chem. A, 103, 4782–4790, 1999.
[6] D.R. Bowler, T. Miyazaki, and M.J. Gillan, “Recent progress in linear scaling ab initio electronic structure techniques,” J. Phys. Condens. Matter, 14, 2781–2798, 2002.
[7] J.L. Fattebert and J. Bernholc, “Towards grid-based O(N) density-functional theory methods: optimized nonorthogonal orbitals and multigrid acceleration,” Phys. Rev. B, 62, 1713–1722, 2000.
[8] A.A. Mostofi, C.-K. Skylaris, P.D. Haynes, and M.C. Payne, “Total-energy calculations on a real space grid with localized functions and a plane-wave basis,” Comput. Phys. Commun., 147, 788–802, 2002.
[9] J.P. Lewis, K.R. Glaesemann, G.A. Voth, J. Fritsch, A.A. Demkov, J. Ortega, and O.F. Sankey, “Further developments in the local-orbital density-functional-theory tight-binding method,” Phys. Rev. B, 64, 195103.1–10, 2001.
[10] G. Lippert, J. Hutter, P. Ballone, and M. Parrinello, “A hybrid Gaussian and plane wave density functional scheme,” Mol. Phys., 92, 477–487, 1997.
[11] T.L. Beck, “Real-space mesh techniques in density-functional theory,” Rev. Mod. Phys., 72, 1041–1080, 2000.
[12] P. Ordejón, E. Artacho, and J.M. Soler, “Selfconsistent order-N density-functional calculations for very large systems,” Phys. Rev. B, 53, R10441–R10444, 1996.
[13] J.M. Soler, E. Artacho, J.D. Gale, A. García, J. Junquera, P. Ordejón, and D. Sánchez-Portal, “The SIESTA method for ab initio order-N materials simulation,” J. Phys. Condens. Matter, 14, 2745–2779, 2002.
[14] E. Anglada, J.M. Soler, J. Junquera, and E. Artacho, “Systematic generation of finite-range atomic basis sets for linear-scaling calculations,” Phys. Rev. B, 66, 205101.1–4, 2002.
[15] D. Sánchez-Portal, P. Ordejón, and E. Canadell, “Computing the properties of materials from first principles with SIESTA,” Struct. Bonding, 113, 103–170, 2004. See also http://www.uam.es/siesta.
[16] E. Artacho, M. Machado, D. Sánchez-Portal, P. Ordejón, and J.M. Soler, “Electrons in dry DNA from density functional calculations,” Mol. Phys., 101, 1587–1594, 2003.
[17] S.S. Alexandre, E. Artacho, J.M. Soler, and H. Chacham, “Small polarons in dry DNA,” Phys. Rev. Lett., 91, 108105–108108, 2003.
1.6 ELECTRONIC STRUCTURE METHODS: AUGMENTED WAVES, PSEUDOPOTENTIALS AND THE PROJECTOR AUGMENTED WAVE METHOD
Peter E. Blöchl, Johannes Kästner, and Clemens J. Först
Institute for Theoretical Physics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
The main goal of electronic structure methods is to solve the Schrödinger equation for the electrons in a molecule or solid, and to evaluate the resulting total energies, forces, response functions and other quantities of interest. In this paper we describe the basic ideas behind the main electronic structure methods, such as the pseudopotential and the augmented wave methods, and provide selected pointers to contributions that are relevant for a beginner. We give particular emphasis to the projector augmented wave (PAW) method developed by one of us, an electronic structure method for ab initio molecular dynamics with full wavefunctions. We feel that it is the best vehicle for showing the common conceptual basis of the most widespread electronic structure methods in materials science. The methods described below require as input only the charge and mass of the nuclei, the number of electrons and an initial atomic geometry. They predict binding energies accurate within a few tenths of an electron volt and bond lengths in the 1–2% range. Currently, systems with a few hundred atoms per unit cell can be handled. The dynamics of atoms can be studied up to tens of picoseconds. Quantities related to energetics, the atomic structure and to the ground-state electronic structure can be extracted. In order to lay a common ground and to define some of the symbols, let us briefly touch upon the density functional theory [1, 2]. It maps a description for interacting electrons, a nearly intractable problem, onto one of non-interacting electrons in an effective potential. Within density functional theory, the total
S. Yip (ed.), Handbook of Materials Modeling, 93–119. © 2005 Springer. Printed in the Netherlands.
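energy is written as

\[ E[\Psi_n(\mathbf{r}), \mathbf{R}_R] = \sum_n f_n \left\langle \Psi_n \left| \frac{-\hbar^2}{2m_e}\nabla^2 \right| \Psi_n \right\rangle + \frac{1}{2}\cdot\frac{e^2}{4\pi\epsilon_0} \int d^3r \int d^3r'\, \frac{[n(\mathbf{r}) + Z(\mathbf{r})]\,[n(\mathbf{r}') + Z(\mathbf{r}')]}{|\mathbf{r} - \mathbf{r}'|} + E_{xc}[n(\mathbf{r})] \tag{1} \]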
Here, $|\Psi_n\rangle$ are one-particle electron states, $f_n$ are the state occupations, $n(\mathbf{r}) = \sum_n f_n \Psi_n^*(\mathbf{r}) \Psi_n(\mathbf{r})$ is the electron density and $Z(\mathbf{r}) = -\sum_R Z_R\, \delta(\mathbf{r} - \mathbf{R}_R)$ is the nuclear charge density expressed in electron charges. $Z_R$ is the atomic number of a nucleus at position $\mathbf{R}_R$. It is implicitly assumed that the infinite self-interaction of the nuclei is removed. The exchange and correlation functional contains all the difficulties of the many-electron problem. The main conclusion of the density functional theory is that $E_{xc}$ is a functional of the density. We use Dirac’s bra and ket notation. A wavefunction $\Psi_n$ corresponds to a ket $|\Psi_n\rangle$, the complex conjugate wave function $\Psi_n^*$ corresponds to a bra $\langle\Psi_n|$, and a scalar product $\int d^3r\, \Psi_n^*(\mathbf{r}) \Psi_m(\mathbf{r})$ is written as $\langle\Psi_n|\Psi_m\rangle$. Vectors in the three-dimensional coordinate space are indicated by boldfaced symbols. Note that we use $\mathbf{R}$ as position vector and $R$ as atom index. In current implementations, the exchange and correlation functional $E_{xc}[n(\mathbf{r})]$ has the form

\[ E_{xc}[n(\mathbf{r})] = \int d^3r\, F_{xc}(n(\mathbf{r}), |\nabla n(\mathbf{r})|), \]
where $F_{xc}$ is a parameterized function of the density and its gradients. Such functionals are called gradient corrected. In local spin density functional theory, $F_{xc}$ furthermore depends on the spin density and its derivatives. A review of the earlier developments has been given by Parr and Yang [3]. The electronic ground state is determined by minimizing the total energy functional $E[\Psi_n]$ of Eq. (1) at a fixed ionic geometry. The one-particle wavefunctions have to be orthogonal. This constraint is implemented with the method of Lagrange multipliers. We obtain the ground state wavefunctions from the extremum condition for

\[ F[\Psi_n(\mathbf{r}), \Lambda_{m,n}] = E[\Psi_n(\mathbf{r})] - \sum_{n,m} \left[\langle\Psi_n|\Psi_m\rangle - \delta_{n,m}\right] \Lambda_{m,n} \tag{2} \]
with respect to the wavefunctions and the Lagrange multipliers $\Lambda_{m,n}$. The extremum condition for the wavefunctions has the form

\[ H|\Psi_n\rangle f_n = \sum_m |\Psi_m\rangle \Lambda_{m,n}, \tag{3} \]
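where $H = -\frac{\hbar^2}{2m_e}\nabla^2 + v_{eff}(\mathbf{r})$ is the effective one-particle Hamilton operator. The effective potential depends itself on the electron density via

\[ v_{eff}(\mathbf{r}) = \frac{e^2}{4\pi\epsilon_0} \int d^3r'\, \frac{n(\mathbf{r}') + Z(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} + \mu_{xc}(\mathbf{r}), \]

where $\mu_{xc}(\mathbf{r}) = \frac{\delta E_{xc}[n(\mathbf{r})]}{\delta n(\mathbf{r})}$ is the functional derivative of the exchange and correlation functional. After a unitary transformation that diagonalizes the matrix of Lagrange multipliers $\Lambda_{m,n}$, we obtain the Kohn–Sham equations:

\[ H|\Psi_n\rangle = |\Psi_n\rangle \epsilon_n. \tag{4} \]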
The one-particle energies $\epsilon_n$ are the eigenvalues of $\Lambda_{n,m} \frac{f_n + f_m}{2 f_n f_m}$ [4]. The remaining one-electron Schrödinger equations, namely the Kohn–Sham equations given above, still pose substantial numerical difficulties: (1) In the atomic region near the nucleus, the kinetic energy of the electrons is large, resulting in rapid oscillations of the wavefunction that require fine grids for an accurate numerical representation. On the other hand, the large kinetic energy makes the Schrödinger equation stiff, so that a change of the chemical environment has little effect on the shape of the wavefunction. Therefore, the wavefunction in the atomic region can be represented well already by a small basis set. (2) In the bonding region between the atoms the situation is opposite. The kinetic energy is small and the wavefunction is smooth. However, the wavefunction is flexible and responds strongly to the environment. This requires large and nearly complete basis sets. Combining these different requirements is nontrivial and various strategies have been developed.
• The atomic point of view has been most appealing to quantum chemists. Basis functions that resemble atomic orbitals are chosen. They exploit that the wavefunction in the atomic region can be described by a few basis functions, while the chemical bond is described by the overlapping tails of these atomic orbitals. Most techniques in this class are a compromise of, on the one hand, a well-adapted basis set, where the basis functions are difficult to handle, and on the other hand numerically convenient basis functions such as Gaussians, where the inadequacies are compensated by larger basis sets.
• Pseudopotentials regard an atom as a perturbation of the free electron gas. The most natural basis functions are plane waves. Plane wave basis sets are, in principle, complete and suitable for sufficiently smooth wavefunctions.
The disadvantage of the comparably large basis sets required is offset by their extreme numerical simplicity. Finite plane-wave expansions are, however, absolutely inadequate to describe the strong
oscillations of the wavefunctions near the nucleus. In the pseudopotential approach the Pauli repulsion of the core electrons is therefore described by an effective potential that expels the valence electrons from the core region. The resulting wavefunctions are smooth and can be represented well by plane waves. The price to pay is that all information on the charge density and wavefunctions near the nucleus is lost.
• Augmented wave methods compose their basis functions from atom-like wavefunctions in the atomic regions and a set of functions, called envelope functions, appropriate for the bonding in between. Space is divided accordingly into atom-centered spheres, defining the atomic regions, and an interstitial region in between. The partial solutions of the different regions are matched at the interface between atomic and interstitial regions.

The PAW method is an extension of augmented wave methods and the pseudopotential approach, which combines their traditions into a unified electronic structure method. After describing the underlying ideas of the various approaches, let us briefly review the history of augmented wave methods and the pseudopotential approach. We do not discuss the atomic-orbital based methods, because our focus is the PAW method and its ancestors.
1. Augmented Wave Methods
The augmented wave methods were introduced in 1937 by Slater [5] and were later modified by Korringa [6] and by Kohn and Rostoker [7]. They approached the electronic structure as a scattered-electron problem. Consider an electron beam, represented by a plane wave, traveling through a solid. It undergoes multiple scattering at the atoms. If, for some energy, the outgoing scattered waves interfere destructively, a bound state has been determined. This approach can be translated into a basis set method with energy- and potential-dependent basis functions. In order to make the scattered wave problem tractable, a model potential had to be chosen: The so-called muffin-tin potential approximates the true potential by a constant in the interstitial region and by a spherically symmetric potential in the atomic region.

Augmented wave methods reached adulthood in the 1970s: Andersen [8] showed that the energy-dependent basis set of Slater's APW method can be mapped onto one with energy-independent basis functions, by linearizing the partial waves for the atomic regions in energy. In the original APW approach, one had to determine the zeros of the determinant of an energy-dependent matrix, a nearly intractable numerical problem for complex systems. With the new energy-independent basis functions, however, the problem is reduced to
the much simpler generalized eigenvalue problem, which can be solved using efficient numerical techniques. Furthermore, the introduction of well-defined basis sets paved the way for full-potential calculations [9]. In that case the muffin-tin approximation is used solely to define the basis set $|\chi_i\rangle$, while the matrix elements $\langle\chi_i|H|\chi_j\rangle$ of the Hamiltonian are evaluated with the full potential.

In the augmented wave methods one constructs the basis set for the atomic region by solving the Schrödinger equation for the spheridized effective potential
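$$\left[\frac{-\hbar^2}{2m_e}\nabla^2 + v_{\mathrm{eff}}(\mathbf{r}) - \epsilon\right]\phi_{\ell,m}(\epsilon,\mathbf{r}) = 0$$
as a function of energy. Note that a partial wave $\phi_{\ell,m}(\epsilon,\mathbf{r})$ is an angular momentum eigenstate and can be expressed as a product of a radial function and a spherical harmonic. The energy-dependent partial wave is expanded in a Taylor expansion about some reference energy $\epsilon_{\nu,\ell}$,
$$\phi_{\ell,m}(\epsilon,\mathbf{r}) = \phi_{\nu,\ell,m}(\mathbf{r}) + (\epsilon - \epsilon_{\nu,\ell})\,\dot\phi_{\nu,\ell,m}(\mathbf{r}) + O\!\left((\epsilon - \epsilon_{\nu,\ell})^2\right),$$
where $\phi_{\nu,\ell,m}(\mathbf{r}) = \phi_{\ell,m}(\epsilon_{\nu,\ell},\mathbf{r})$. The energy derivative of the partial wave, $\dot\phi_{\nu,\ell,m}(\mathbf{r}) = \left.\frac{\partial\phi_{\ell,m}(\epsilon,\mathbf{r})}{\partial\epsilon}\right|_{\epsilon_{\nu,\ell}}$, solves the equation
$$\left[\frac{-\hbar^2}{2m_e}\nabla^2 + v_{\mathrm{eff}}(\mathbf{r}) - \epsilon_{\nu,\ell}\right]\dot\phi_{\nu,\ell,m}(\mathbf{r}) = \phi_{\nu,\ell,m}(\mathbf{r}).$$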
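The linearization in energy can be illustrated with a minimal sketch. It assumes a free particle ($v_{\mathrm{eff}} = 0$), $\ell = 0$, and Hartree atomic units ($\hbar = m_e = 1$), so that the regular partial wave is the spherical Bessel function $j_0(kr)$ with $k = \sqrt{2\epsilon}$. The function names and the chosen reference energy and radius are illustrative assumptions, not part of any LAPW or LMTO code.

```python
import math

def partial_wave(energy, r):
    """Regular l=0 free-particle partial wave j0(k r), with k = sqrt(2*energy)."""
    x = math.sqrt(2.0 * energy) * r
    return math.sin(x) / x

def energy_derivative(energy, r):
    """Analytic derivative of j0(k r) with respect to the energy."""
    k = math.sqrt(2.0 * energy)
    x = k * r
    return (x * math.cos(x) - math.sin(x)) / (k * k * x)

def linearized(energy, e_ref, r):
    """First-order Taylor expansion of the partial wave about e_ref."""
    return partial_wave(e_ref, r) + (energy - e_ref) * energy_derivative(e_ref, r)

def linearization_error(delta, e_ref=1.0, r=2.0):
    """Absolute error of the linearized partial wave at energy e_ref + delta."""
    exact = partial_wave(e_ref + delta, r)
    return abs(exact - linearized(e_ref + delta, e_ref, r))

# Halving the energy offset should reduce the error roughly fourfold,
# as expected for a remainder of second order in (energy - e_ref).
err_large = linearization_error(0.10)
err_small = linearization_error(0.05)
ratio = err_large / err_small
print(err_large, err_small, ratio)
```

The roughly fourfold drop of the error when the energy offset is halved reflects the $O((\epsilon - \epsilon_{\nu,\ell})^2)$ remainder of the expansion.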
Next, one starts from a regular basis set, such as plane waves, Gaussians or Hankel functions. These basis functions are called envelope functions $|\tilde\chi_i\rangle$. Within the atomic region they are replaced by the partial waves and their energy derivatives, such that the resulting wavefunction is continuous and differentiable:
$$\chi_i(\mathbf{r}) = \tilde\chi_i(\mathbf{r}) - \sum_R \theta_R(\mathbf{r})\,\tilde\chi_i(\mathbf{r}) + \sum_{R,\ell,m}\theta_R(\mathbf{r})\left[\phi_{\nu,R,\ell,m}(\mathbf{r})\,a_{R,\ell,m,i} + \dot\phi_{\nu,R,\ell,m}(\mathbf{r})\,b_{R,\ell,m,i}\right]. \tag{5}$$
$\theta_R(\mathbf{r})$ is a step function that is unity within the augmentation sphere centered at $\mathbf{R}_R$ and zero elsewhere. The augmentation sphere is atom-centered and has a radius about equal to the covalent radius. This radius is called the muffin-tin radius, if the spheres of neighboring atoms touch. These basis functions describe only the valence states; the core states are localized within the augmentation sphere and are obtained directly by radial integration of the Schrödinger equation within the augmentation sphere.
The coefficients $a_{R,\ell,m,i}$ and $b_{R,\ell,m,i}$ are obtained for each $|\tilde\chi_i\rangle$ as follows: The envelope function is decomposed around each atomic site into spherical harmonics multiplied by radial functions:
$$\tilde\chi_i(\mathbf{r}) = \sum_{\ell,m} u_{R,\ell,m,i}(|\mathbf{r} - \mathbf{R}_R|)\,Y_{\ell,m}(\mathbf{r} - \mathbf{R}_R). \tag{6}$$
Analytical expansions for plane waves, Hankel functions or Gaussians exist. The radial parts of the partial waves $\phi_{\nu,R,\ell,m}$ and $\dot\phi_{\nu,R,\ell,m}$ are matched with value and derivative to $u_{R,\ell,m,i}(|\mathbf{r}|)$, which yields the expansion coefficients $a_{R,\ell,m,i}$ and $b_{R,\ell,m,i}$. If the envelope functions are plane waves, the resulting method is called the linear augmented plane wave (LAPW) method. If the envelope functions are Hankel functions, the method is called the linear muffin-tin orbital (LMTO) method.

A good review of the LAPW method [8] has been given by Singh [10]. Let us now briefly mention the major developments of the LAPW method: Soler and Williams [11] introduced the idea of additive augmentation: While augmented plane waves are discontinuous at the surface of the augmentation sphere if the expansion in spherical harmonics in Eq. (5) is truncated, Soler and Williams replaced the second term in Eq. (5) by an expansion of the plane wave with the same angular momentum truncation as in the third term. This dramatically improved the convergence of the angular momentum expansion. Singh [12] introduced so-called local orbitals, which are nonzero only within a muffin-tin sphere, where they are superpositions of $\phi$ and $\dot\phi$ functions from different expansion energies. Local orbitals substantially increase the energy transferability. Sjöstedt et al. [13] relaxed the condition that the basis functions are differentiable at the sphere radius. In addition they introduced local orbitals that are confined inside the sphere and also have a kink at the sphere boundary. Because kinks are energetically costly, they cancel once the total energy is minimized. The increased variational degree of freedom in the basis leads to a dramatically improved plane-wave convergence [14].

The second variant of the linear methods is the LMTO method [8]. A good introduction to the LMTO method is the book by Skriver [15]. The LMTO method uses Hankel functions as envelope functions.
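The matching of value and derivative at the sphere boundary, described above for the coefficients $a$ and $b$, amounts to a 2×2 linear system per site, angular momentum and envelope function. The following is a minimal sketch; the function name and the numerical values at the sphere radius are hypothetical and serve only to illustrate the matching condition.

```python
def match_coefficients(phi, dphi, phidot, dphidot, u, du):
    """Solve  a*phi  + b*phidot  = u
              a*dphi + b*dphidot = du
    for (a, b) by Cramer's rule. All arguments are radial function values
    (phi, phidot, u) and radial derivatives (dphi, dphidot, du) taken at
    the sphere radius."""
    det = phi * dphidot - phidot * dphi
    if abs(det) < 1e-14:
        raise ValueError("partial wave and energy derivative are linearly dependent")
    a = (u * dphidot - phidot * du) / det
    b = (phi * du - u * dphi) / det
    return a, b

# Hypothetical boundary values, chosen only for illustration.
phi, dphi = 0.8, -0.3         # partial wave and its radial derivative
phidot, dphidot = 0.1, 0.25   # energy derivative function and its radial derivative
u, du = 0.5, -0.1             # radial part of the envelope function and its derivative

a, b = match_coefficients(phi, dphi, phidot, dphidot, u, du)
print(a, b)
```

With these coefficients the augmented function $a\,\phi + b\,\dot\phi$ joins the envelope function continuously and differentiably at the sphere radius.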
The atomic spheres approximation (ASA) provides a particularly simple and efficient approach to the electronic structure of very large systems. In the ASA, the augmentation spheres are blown up so that the sum of their volumes equals the total volume, and the first two terms in Eq. (5) are ignored. The main deficiency of the LMTO-ASA method is the limitation to structures that can be converted into a close-packed arrangement of atomic and empty spheres. Furthermore, energy differences due to structural distortions are often qualitatively incorrect. Full-potential versions of the LMTO method that avoid these deficiencies of the ASA have been developed. The construction of tight
binding orbitals as superpositions of muffin-tin orbitals [16] showed the underlying principles of the empirical tight-binding method and prepared the ground for electronic structure methods that scale linearly instead of with the third power of the number of atoms. The third-generation LMTO [17] allows one to construct true minimal basis sets, which require only one orbital per electron pair for insulators. In addition they can be made arbitrarily accurate in the valence band region, so that a matrix diagonalization becomes unnecessary. The first steps towards a full-potential implementation that promises good accuracy while maintaining the simplicity of the LMTO-ASA method are currently under way. Through the minimal basis-set construction the LMTO method offers unrivaled tools for the analysis of the electronic structure and has been extensively used in hybrid methods combining density functional theory with model Hamiltonians for materials with strong electron correlations [18].
2. Pseudopotentials
Pseudopotentials have been introduced to (1) avoid describing the core electrons explicitly and (2) avoid the rapid oscillations of the wavefunction near the nucleus, which normally require either complicated or large basis sets. The pseudopotential approach traces back to 1940, when Herring [19] invented the orthogonalized plane-wave method. Later, Phillips and Kleinman [20] and Antoncik [21] replaced the orthogonality condition by an effective potential, which mimics the Pauli repulsion by the core electrons and thus compensates the electrostatic attraction by the nucleus. In practice, the potential was modified, for example, by cutting off the singular potential of the nucleus at a certain value. This was done with a few parameters that were adjusted to reproduce the measured electronic band structure of the corresponding solid.

Hamann et al. [22] showed in 1979 how pseudopotentials can be constructed in such a way that their scattering properties are identical to those of an atom to first order in energy. These first-principles pseudopotentials relieved the calculations from the restrictions of empirical parameters. Highly accurate calculations have become possible, especially for semiconductors and simple metals. An alternative approach towards first-principles pseudopotentials [23] preceded the one mentioned above.
2.1. The Idea Behind Pseudopotential Construction
In order to construct a first-principles pseudopotential, one starts out with an all-electron density-functional calculation for a spherical atom. Such
calculations can be performed efficiently on radial grids. They yield the atomic potential and wavefunctions $\phi_{\ell,m}(\mathbf{r})$. Due to the spherical symmetry, the radial parts of the wavefunctions for different magnetic quantum numbers $m$ are identical. For the valence wavefunctions one constructs pseudo-wavefunctions $|\tilde\phi_{\ell,m}\rangle$. There are numerous ways [24–27] to construct the pseudo-wavefunctions. They must be identical to the true wavefunctions outside the augmentation region, which is called the core region in the context of the pseudopotential approach. Inside the augmentation region the pseudo-wavefunction should be nodeless and have the same norm as the true wavefunctions, that is $\langle\tilde\phi_{\ell,m}|\tilde\phi_{\ell,m}\rangle = \langle\phi_{\ell,m}|\phi_{\ell,m}\rangle$ (compare Fig. 1). From the pseudo-wavefunction, a potential $u_\ell(\mathbf{r})$ can be reconstructed by inverting the respective Schrödinger equation:
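$$\left[-\frac{\hbar^2}{2m_e}\nabla^2 + u_\ell(\mathbf{r}) - \epsilon_{\ell,m}\right]\tilde\phi_{\ell,m}(\mathbf{r}) = 0 \;\Rightarrow\; u_\ell(\mathbf{r}) = \epsilon_{\ell,m} + \frac{1}{\tilde\phi_{\ell,m}(\mathbf{r})}\cdot\frac{\hbar^2}{2m_e}\nabla^2\tilde\phi_{\ell,m}(\mathbf{r}).$$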
Figure 1. Illustration of the pseudopotential concept using the example of the 3s wavefunction of Si (horizontal axis: $r$ in units of $a_{\mathrm{bohr}}$). The solid line shows the radial part of the pseudo-wavefunction $\tilde\phi_{\ell,m}$. The dashed line corresponds to the all-electron wavefunction $\phi_{\ell,m}$, which exhibits strong oscillations at small radii. The angular momentum dependent pseudopotential $u_\ell$ (dash-dotted line) deviates from the all-electron one $v_{\mathrm{eff}}$ (dotted line) inside the augmentation region. The data are generated by the fhi98PP code [28].
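The inversion of the Schrödinger equation that yields $u_\ell$ can be sketched numerically. The example below is a minimal illustration under stated assumptions: Hartree atomic units, a uniform radial grid, a simple three-point finite difference, and the nodeless hydrogen 1s function $r\,e^{-r}$ (energy $-1/2$) as a stand-in for a pseudo-wavefunction, for which the inversion must return $-1/r$. The function name is hypothetical; production pseudopotential generators use logarithmic grids and more careful numerics.

```python
import math

def invert_radial_schrodinger(r, u, energy, l=0):
    """Return the potential on the interior grid points r[1:-1].

    r: uniform radial grid; u: r times the radial wavefunction (must be
    nodeless on the grid); energy: one-particle energy in Hartree.
    Uses v(r) = energy + u''/(2u) - l(l+1)/(2 r^2)."""
    h = r[1] - r[0]
    potential = []
    for i in range(1, len(r) - 1):
        d2u = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (h * h)
        centrifugal = l * (l + 1) / (2.0 * r[i] ** 2)
        potential.append(energy + 0.5 * d2u / u[i] - centrifugal)
    return potential

# Hydrogen 1s: u(r) = r*exp(-r), energy -1/2 Hartree, expected potential -1/r.
grid = [0.5 + 0.01 * i for i in range(301)]
u_1s = [x * math.exp(-x) for x in grid]
v = invert_radial_schrodinger(grid, u_1s, energy=-0.5, l=0)

max_error = max(abs(vi + 1.0 / ri) for vi, ri in zip(v, grid[1:-1]))
print(max_error)
```

The requirement that the pseudo-wavefunction be nodeless is visible in the code: a node would produce a division by zero and thus a singular potential.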
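This potential $u_\ell(\mathbf{r})$ (compare Fig. 1), which is also spherically symmetric, differs from one main angular momentum $\ell$ to the other. Next we define an effective pseudo-Hamiltonian
$$\tilde H_\ell = -\frac{\hbar^2}{2m_e}\nabla^2 + v^{ps}_\ell(\mathbf{r}) + \frac{e^2}{4\pi\epsilon_0}\int d^3r'\,\frac{\tilde n(\mathbf{r}') + \tilde Z(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} + \mu_{xc}([\tilde n(\mathbf{r})],\mathbf{r})$$
and determine the pseudopotentials $v^{ps}_\ell$ such that the pseudo-Hamiltonian produces the pseudo-wavefunctions, that is
$$v^{ps}_\ell(\mathbf{r}) = u_\ell(\mathbf{r}) - \frac{e^2}{4\pi\epsilon_0}\int d^3r'\,\frac{\tilde n(\mathbf{r}') + \tilde Z(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} - \mu_{xc}([\tilde n(\mathbf{r})],\mathbf{r}). \tag{7}$$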
This process is called "unscreening." $\tilde Z(\mathbf{r})$ mimics the charge density of the nucleus and the core electrons. It is usually an atom-centered, spherical Gaussian that is normalized to the charge of nucleus and core of that atom. In the pseudopotential approach, $\tilde Z_R(\mathbf{r})$ does not change with the potential. The pseudo density $\tilde n(\mathbf{r}) = \sum_n f_n \tilde\Psi_n^*(\mathbf{r})\tilde\Psi_n(\mathbf{r})$ is constructed from the pseudo-wavefunctions.

In this way we obtain a different potential for each angular momentum channel. In order to apply these potentials to a given wavefunction, the wavefunction must first be decomposed into angular momenta. Then the pseudopotential $v^{ps}_\ell$ for the corresponding angular momentum is applied to each component. The pseudopotential defined in this way can be expressed in a semilocal form
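$$v^{ps}(\mathbf{r},\mathbf{r}') = \bar v(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}') + \sum_{\ell,m}\left[Y_{\ell,m}(\mathbf{r})\left[v^{ps}_\ell(\mathbf{r}) - \bar v(\mathbf{r})\right]\frac{\delta(|\mathbf{r}| - |\mathbf{r}'|)}{|\mathbf{r}|^2}\,Y^*_{\ell,m}(\mathbf{r}')\right]. \tag{8}$$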
The local potential $\bar v(\mathbf{r})$ only acts on those angular momentum components not included in the expansion of the pseudopotential construction. Typically, it is chosen to cancel the most expensive nonlocal term, the one corresponding to the highest physically relevant angular momentum. The pseudopotential is nonlocal as it depends on two position arguments, $\mathbf{r}$ and $\mathbf{r}'$. The expectation values are evaluated as a double integral
$$\langle\tilde\Psi|v^{ps}|\tilde\Psi\rangle = \int d^3r\int d^3r'\,\tilde\Psi^*(\mathbf{r})\,v^{ps}(\mathbf{r},\mathbf{r}')\,\tilde\Psi(\mathbf{r}').$$
The semilocal form of the pseudopotential given in Eq. (8) is computationally expensive. Therefore, in practice, one uses a separable form of the pseudopotential [29–31]:
$$v^{ps} \approx \sum_{i,j} v^{ps}|\tilde\phi_i\rangle\left[\langle\tilde\phi_j|v^{ps}|\tilde\phi_i\rangle\right]^{-1}_{i,j}\langle\tilde\phi_j|v^{ps}. \tag{9}$$
Thus, the projection onto spherical harmonics used in the semilocal form of Eq. (8) is replaced by a projection onto angular momentum dependent functions $|v^{ps}\tilde\phi_i\rangle$. The indices $i$ and $j$ are composite indices containing the atomic-site index $R$, the angular momentum quantum numbers $\ell, m$ and an additional index $\alpha$. The index $\alpha$ distinguishes partial waves with otherwise identical indices $R, \ell, m$, as more than one partial wave per site and angular momentum is allowed. The partial waves may be constructed as eigenstates to the pseudopotential $v^{ps}_\ell$ for a set of energies. One can show that the identity of Eq. (9) holds by applying a wavefunction $|\tilde\Psi\rangle = \sum_i |\tilde\phi_i\rangle c_i$ to both sides. If the set of pseudo partial waves $|\tilde\phi_i\rangle$ in Eq. (9) is complete, the identity is exact. The advantage of the separable form is that $\langle\tilde\phi\,v^{ps}|$ is treated as one function, so that expectation values are reduced to combinations of simple scalar products $\langle\tilde\phi_i v^{ps}|\tilde\Psi\rangle$.

The total energy of the pseudopotential method can be written in the form
$$E = \sum_n f_n\left\langle\tilde\Psi_n\left|-\frac{\hbar^2}{2m_e}\nabla^2\right|\tilde\Psi_n\right\rangle + E_{\mathrm{self}} + \sum_n f_n\langle\tilde\Psi_n|v^{ps}|\tilde\Psi_n\rangle + \frac{1}{2}\cdot\frac{e^2}{4\pi\epsilon_0}\int d^3r\int d^3r'\,\frac{[\tilde n(\mathbf{r}) + \tilde Z(\mathbf{r})][\tilde n(\mathbf{r}') + \tilde Z(\mathbf{r}')]}{|\mathbf{r} - \mathbf{r}'|} + E_{xc}[\tilde n(\mathbf{r})]. \tag{10}$$
The constant $E_{\mathrm{self}}$ is adjusted such that the total energy of the atom is the same for an all-electron calculation and the pseudopotential calculation. For the atom from which it has been constructed, this construction guarantees that the pseudopotential method produces the correct one-particle energies for the valence states and that the wavefunctions have the desired shape. While pseudopotentials have proven to be accurate for a large variety of systems, there is no strict guarantee that they produce the same results as an all-electron calculation if they are used in a molecule or solid. The error sources can be divided into two classes:

• Energy transferability problems: Even for the potential of the reference atom, the scattering properties are accurate only in a given energy window.
• Charge transferability problems: In a molecule or crystal, the potential differs from that of the isolated atom. The pseudopotential, however, is strictly valid only for the isolated atom.
The plane-wave basis set for the pseudo wavefunctions is defined by the shortest wavelength $\lambda_{\min} = 2\pi/|G_{\max}|$ via the so-called plane-wave cutoff $E_{PW} = \frac{\hbar^2 G_{\max}^2}{2m_e}$. It is often specified in Rydberg (1 Ry = 1/2 H ≈ 13.6 eV). The plane-wave cutoff is the highest kinetic energy of all basis functions. The basis-set convergence can be controlled systematically by increasing the plane-wave cutoff.

The charge transferability is substantially improved by including a nonlinear core correction [32] in the exchange-correlation term of Eq. (10). Hamann [33] showed how to construct pseudopotentials from unbound wavefunctions as well. Vanderbilt [31] and Laasonen et al. [34] generalized the pseudopotential method to non-norm-conserving pseudopotentials, so-called ultra-soft pseudopotentials, which dramatically improve the basis-set convergence. The formulation of ultra-soft pseudopotentials already has many similarities with the projector augmented wave method. Truncated separable pseudopotentials sometimes suffer from so-called ghost states. These are unphysical core-like states, which render the pseudopotential useless. These problems have been discussed by Gonze et al. [35]. Quantities such as hyperfine parameters that depend on the full wavefunctions near the nucleus can be extracted approximately [36]. Good reviews of pseudopotential methodology have been written by Payne et al. [37] and Singh [10].

In 1985, Car and Parrinello [38] published the ab initio molecular dynamics method. Simulations of the atomic motion have become possible on the basis of state-of-the-art electronic structure methods. Besides making dynamical phenomena and finite temperature effects accessible to electronic structure calculations, the ab initio molecular dynamics method also introduced a radically new way of thinking into electronic structure methods. Diagonalization of a Hamilton matrix has been replaced by classical equations of motion for the wavefunction coefficients. If one applies friction, the system is quenched to the ground state. Without friction, truly dynamical simulations of the atomic structure are performed. Using thermostats [39–42], simulations at constant temperature can be performed. The Car–Parrinello method treats electronic wavefunctions and atomic positions on an equal footing.
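The idea of replacing diagonalization by damped equations of motion for the wavefunction coefficients can be sketched on a toy problem. The example below is not the Car–Parrinello implementation: it assumes a fixed 2×2 Hamilton matrix, a single state, a Verlet integrator with a dimensionless friction parameter, and a normalization constraint enforced by simple renormalization rather than by Lagrange multipliers. All names and parameter values are hypothetical.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(H, c):
    return [dot(row, c) for row in H]

def normalize(c):
    norm = math.sqrt(dot(c, c))
    return [ci / norm for ci in c]

def quench(H, c0, dt=0.1, mass=1.0, friction=0.2, steps=2000):
    """Damped Verlet dynamics for the coefficients c with force -H c.

    friction is a dimensionless damping parameter (0 means no damping).
    The norm constraint is restored after each step by renormalization."""
    c = normalize(c0)
    c_old = list(c)                      # start with zero velocity
    for _ in range(steps):
        force = [-g for g in matvec(H, c)]
        c_new = [(2.0 * ci - (1.0 - friction) * co + dt * dt / mass * fi)
                 / (1.0 + friction)
                 for ci, co, fi in zip(c, c_old, force)]
        c_old, c = c, normalize(c_new)
    return c

H = [[2.0, 1.0],
     [1.0, 3.0]]
c_ground = quench(H, [1.0, 0.0])
energy = dot(c_ground, matvec(H, c_ground))
exact = (5.0 - math.sqrt(5.0)) / 2.0     # lowest eigenvalue of H
print(energy, exact)
```

With friction the coefficients settle into the lowest eigenvector, which corresponds to the quench to the ground state mentioned above; in a real calculation the Hamiltonian depends on the wavefunctions and on the atomic positions, which evolve simultaneously.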
3. Projector Augmented Wave Method
The Car–Parrinello method was first implemented for the pseudopotential approach. There seemed to be insurmountable barriers against combining the new technique with augmented wave methods. The main problem was related to the potential-dependent basis set used in augmented wave methods: the Car–Parrinello method requires a well-defined and unique total energy functional of atomic positions and basis set coefficients. Furthermore, the analytic evaluation of the first partial derivatives of the total energy with respect
to wavefunctions, $H|\Psi_n\rangle$, and atomic positions, the forces, must be possible. Therefore, it was one of the main goals of the PAW method to introduce energy- and potential-independent basis sets that are as accurate as the previously used augmented basis sets. Other requirements have been: (1) The method should at least match the efficiency of the pseudopotential approach for Car–Parrinello simulations. (2) It should become an exact theory when converged and (3) its convergence should be easily controlled. We believe that these criteria have been met, which explains why the PAW method is becoming increasingly widespread today.
3.1. Transformation Theory
At the root of the PAW method lies a transformation that maps the true wavefunctions with their complete nodal structure onto auxiliary wavefunctions that are numerically convenient. We aim for smooth auxiliary wavefunctions, which have a rapidly convergent plane-wave expansion. With such a transformation we can expand the auxiliary wavefunctions into a convenient basis set such as plane waves, and evaluate all physical properties after reconstructing the related physical (true) wavefunctions.

Let us denote the physical one-particle wavefunctions as $|\Psi_n\rangle$ and the auxiliary wavefunctions as $|\tilde\Psi_n\rangle$. Note that the tilde refers to the representation of smooth auxiliary wavefunctions and $n$ is the label for a one-particle state and contains a band index, a k-point and a spin index. The transformation from the auxiliary to the physical wavefunctions is denoted by $\mathcal{T}$:
$$|\Psi_n\rangle = \mathcal{T}|\tilde\Psi_n\rangle. \tag{11}$$
Now we express the constrained density functional $F$ of Eq. (2) in terms of our auxiliary wavefunctions
$$F[\mathcal{T}\tilde\Psi_n, \Lambda_{m,n}] = E[\mathcal{T}\tilde\Psi_n] - \sum_{n,m}\left[\langle\tilde\Psi_n|\mathcal{T}^\dagger\mathcal{T}|\tilde\Psi_m\rangle - \delta_{n,m}\right]\Lambda_{m,n}. \tag{12}$$
The variational principle with respect to the auxiliary wavefunctions yields
$$\mathcal{T}^\dagger H\mathcal{T}|\tilde\Psi_n\rangle = \mathcal{T}^\dagger\mathcal{T}|\tilde\Psi_n\rangle\,\epsilon_n. \tag{13}$$
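That Eq. (13) is only a change of representation can be checked in a finite-dimensional model: for any invertible matrix standing in for the transformation, the generalized eigenvalue problem with transformed Hamiltonian and overlap has the same eigenvalues as the original Hamiltonian. The sketch below assumes real 2×2 matrices and hypothetical numbers; in the actual PAW method the transformation is the operator defined in Section 3.2, not an arbitrary matrix.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def generalized_eigenvalues_2x2(A, B):
    """Roots of det(A - e*B) = 0 for real 2x2 matrices, in ascending order."""
    a = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    b = -(A[0][0] * B[1][1] + A[1][1] * B[0][0]
          - A[0][1] * B[1][0] - A[1][0] * B[0][1])
    c = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    return sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])

identity = [[1.0, 0.0], [0.0, 1.0]]
H = [[2.0, 1.0], [1.0, 3.0]]          # model Hamiltonian
T = [[1.0, 0.5], [0.0, 2.0]]          # hypothetical invertible transformation

H_aux = matmul(transpose(T), matmul(H, T))   # T^dagger H T
O_aux = matmul(transpose(T), T)              # T^dagger T (overlap operator)

eps_true = generalized_eigenvalues_2x2(H, identity)
eps_aux = generalized_eigenvalues_2x2(H_aux, O_aux)
print(eps_true, eps_aux)
```

The one-particle energies are unchanged; only the vectors that represent the states differ, which is why the auxiliary wavefunctions can be chosen for numerical convenience.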
Again we obtain a Schrödinger-like equation (see derivation of Eq. (4)), but now the Hamilton operator has a different form, $\tilde H = \mathcal{T}^\dagger H\mathcal{T}$, an overlap operator $\tilde O = \mathcal{T}^\dagger\mathcal{T}$ occurs, and the resulting auxiliary wavefunctions are smooth.

When we evaluate physical quantities we need to evaluate expectation values of an operator $A$, which can be expressed in terms of either the true or the auxiliary wavefunctions:
$$\langle A\rangle = \sum_n f_n\langle\Psi_n|A|\Psi_n\rangle = \sum_n f_n\langle\tilde\Psi_n|\mathcal{T}^\dagger A\mathcal{T}|\tilde\Psi_n\rangle. \tag{14}$$
In the representation of auxiliary wavefunctions we need to use transformed operators $\tilde A = \mathcal{T}^\dagger A\mathcal{T}$. As it is, this equation only holds for the valence electrons. The core electrons are treated differently, as will be shown below.

The transformation takes us conceptually from the world of pseudopotentials to that of augmented wave methods, which deal with the full wavefunctions. We will see that our auxiliary wavefunctions, which are simply the plane-wave parts of the full wavefunctions, translate into the wavefunctions of the pseudopotential approach. In the PAW method, the auxiliary wavefunctions are used to construct the true wavefunctions and the total energy functional is evaluated from the latter. Thus it provides the missing link between augmented wave methods and the pseudopotential method, which can be derived as a well-defined approximation of the PAW method.

In the original paper [4], the auxiliary wavefunctions have been termed pseudo wavefunctions and the true wavefunctions have been termed all-electron wavefunctions, in order to make the connection more evident. We avoid this notation here, because it resulted in confusion in cases where the correspondence is not clear-cut.
3.2. Transformation Operator
So far, we have described how we can determine the auxiliary wavefunctions of the ground state and how to obtain physical information from them. What is missing is a definition of the transformation operator $\mathcal{T}$.

The operator $\mathcal{T}$ has to modify the smooth auxiliary wavefunction in each atomic region, so that the resulting wavefunction has the correct nodal structure. Therefore, it makes sense to write the transformation as identity plus a sum of atomic contributions $S_R$:
$$\mathcal{T} = 1 + \sum_R S_R. \tag{15}$$
For every atom, $S_R$ adds the difference between the true and the auxiliary wavefunction. The local terms $S_R$ are defined in terms of solutions $|\phi_i\rangle$ of the Schrödinger equation for the isolated atoms. This set of partial waves $|\phi_i\rangle$ will serve as a basis set so that, near the nucleus, all relevant valence wavefunctions can be expressed as superpositions of the partial waves with yet unknown coefficients:
$$\Psi(\mathbf{r}) = \sum_{i\in R}\phi_i(\mathbf{r})\,c_i \quad\text{for } |\mathbf{r} - \mathbf{R}_R| < r_{c,R}. \tag{16}$$
With $i \in R$ we indicate those partial waves that belong to site $R$.

Since the core wavefunctions do not spread out into the neighboring atoms, we will treat them differently. Currently we use the frozen-core approximation, which imports the density and the energy of the core electrons from
the corresponding isolated atoms. The transformation $\mathcal{T}$ shall produce only wavefunctions orthogonal to the core electrons, while the core electrons are treated separately. Therefore, the set of atomic partial waves $|\phi_i\rangle$ includes only valence states that are orthogonal to the core wavefunctions of the atom.

For each of the partial waves we choose an auxiliary partial wave $|\tilde\phi_i\rangle$. The identity
$$|\phi_i\rangle = (1 + S_R)|\tilde\phi_i\rangle \quad\text{for } i\in R, \qquad S_R|\tilde\phi_i\rangle = |\phi_i\rangle - |\tilde\phi_i\rangle \tag{17}$$
defines the local contribution $S_R$ to the transformation operator. Since $1 + S_R$ shall change the wavefunction only locally, we require that the partial waves $|\phi_i\rangle$ and their auxiliary counterparts $|\tilde\phi_i\rangle$ are pairwise identical beyond a certain radius $r_{c,R}$:
$$\phi_i(\mathbf{r}) = \tilde\phi_i(\mathbf{r}) \quad\text{for } i\in R \text{ and } |\mathbf{r} - \mathbf{R}_R| > r_{c,R}. \tag{18}$$
Note that the partial waves are not necessarily bound states and are therefore not normalizable, unless we truncate them beyond a certain radius $r_{c,R}$. The PAW method is formulated such that the final results do not depend on the location where the partial waves are truncated, as long as this is not done too close to the nucleus and is identical for auxiliary and all-electron partial waves.

In order to be able to apply the transformation operator to an arbitrary auxiliary wavefunction, we need to be able to expand the auxiliary wavefunction locally into the auxiliary partial waves,
$$\tilde\Psi(\mathbf{r}) = \sum_{i\in R}\tilde\phi_i(\mathbf{r})\,c_i = \sum_{i\in R}\tilde\phi_i(\mathbf{r})\,\langle\tilde p_i|\tilde\Psi\rangle \quad\text{for } |\mathbf{r} - \mathbf{R}_R| < r_{c,R}, \tag{19}$$
which defines the projector functions $|\tilde p_i\rangle$. The projector functions probe the local character of the auxiliary wavefunction in the atomic region. Examples of projector functions are shown in Fig. 2. From Eq. (19) we can derive $\sum_{i\in R}|\tilde\phi_i\rangle\langle\tilde p_i| = 1$, which is valid within $r_{c,R}$. It can be shown by insertion that the identity Eq. (19) holds for any auxiliary wavefunction $|\tilde\Psi\rangle$ that can be expanded locally into auxiliary partial waves $|\tilde\phi_i\rangle$, if
$$\langle\tilde p_i|\tilde\phi_j\rangle = \delta_{i,j} \quad\text{for } i,j\in R. \tag{20}$$
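The biorthogonality condition of Eq. (20) can be illustrated with a small vector model. The sketch below assumes two non-orthogonal vectors as stand-ins for the auxiliary partial waves and builds projectors as the dual basis within their span. This is only one possible choice, made here for illustration; it is not the construction used for actual PAW projector functions, and all names are hypothetical.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def dual_projectors(phis):
    """Projectors p_i = sum_j [S^-1]_ij phi_j with S_ij = <phi_i|phi_j>,
    written for exactly two real partial waves."""
    s00 = dot(phis[0], phis[0])
    s01 = dot(phis[0], phis[1])
    s11 = dot(phis[1], phis[1])
    det = s00 * s11 - s01 * s01
    inv = [[s11 / det, -s01 / det], [-s01 / det, s00 / det]]
    return [[inv[i][0] * a + inv[i][1] * b for a, b in zip(phis[0], phis[1])]
            for i in range(2)]

# Two non-orthogonal "auxiliary partial waves" on three grid points.
phis = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
projectors = dual_projectors(phis)

# <p_i|phi_j> should be the unit matrix although the phis are not orthogonal.
overlap = [[dot(p, phi) for phi in phis] for p in projectors]

# The projectors recover the expansion coefficients of 2*phi_0 - 1*phi_1.
psi = [2.0 * a - 1.0 * b for a, b in zip(phis[0], phis[1])]
coefficients = [dot(p, psi) for p in projectors]
print(overlap, coefficients)
```

The example shows that neither set needs to be orthogonal in itself; only the mutual condition between projectors and partial waves matters.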
Note that neither the projector functions nor the partial waves need to be orthogonal among themselves. The projector functions are fully determined with the above conditions and a closure relation, which is related to the unscreening of the pseudopotentials (see Eq. 90 in Ref. [4]).

By combining Eqs. (17) and (19), we can apply $S_R$ to any auxiliary wavefunction:
$$S_R|\tilde\Psi\rangle = \sum_{i\in R}S_R|\tilde\phi_i\rangle\langle\tilde p_i|\tilde\Psi\rangle = \sum_{i\in R}\left(|\phi_i\rangle - |\tilde\phi_i\rangle\right)\langle\tilde p_i|\tilde\Psi\rangle. \tag{21}$$
Figure 2. Projector functions of the chlorine atom. Top: two s-type projector functions, middle: p-type, bottom: d-type.
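Hence, the transformation operator is
$$\mathcal{T} = 1 + \sum_i\left(|\phi_i\rangle - |\tilde\phi_i\rangle\right)\langle\tilde p_i|, \tag{22}$$
where the sum runs over all partial waves of all atoms. The true wavefunction can be expressed as
$$|\Psi\rangle = |\tilde\Psi\rangle + \sum_i\left(|\phi_i\rangle - |\tilde\phi_i\rangle\right)\langle\tilde p_i|\tilde\Psi\rangle = |\tilde\Psi\rangle + \sum_R\left(|\Psi^1_R\rangle - |\tilde\Psi^1_R\rangle\right) \tag{23}$$
with
$$|\Psi^1_R\rangle = \sum_{i\in R}|\phi_i\rangle\langle\tilde p_i|\tilde\Psi\rangle \tag{24}$$
$$|\tilde\Psi^1_R\rangle = \sum_{i\in R}|\tilde\phi_i\rangle\langle\tilde p_i|\tilde\Psi\rangle. \tag{25}$$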
In Fig. 3, the decomposition of Eq. (23) is shown for the example of the bonding p-σ state of the Cl2 molecule.

Figure 3. Bonding p-σ orbital of the Cl2 molecule and the decomposition of the wavefunction into the auxiliary wavefunction and the two one-center expansions. Top-left: true and auxiliary wavefunction; top-right: auxiliary wavefunction and its partial wave expansion; bottom-left: the two partial wave expansions; bottom-right: true wavefunction and its partial wave expansion.

To understand the expression Eq. (23) for the true wavefunction, let us concentrate on different regions in space. (1) Far from the atoms, the partial waves are, according to Eq. (18), pairwise identical so that the auxiliary wavefunction is identical to the true wavefunction, that is $\tilde\Psi(\mathbf{r}) = \Psi(\mathbf{r})$. (2) Close to an atom $R$, however, the auxiliary wavefunction is, according to Eq. (19), identical to its one-center expansion, that is, $\tilde\Psi(\mathbf{r}) = \tilde\Psi^1_R(\mathbf{r})$. Hence, the true wavefunction $\Psi(\mathbf{r})$ is identical to $\Psi^1_R(\mathbf{r})$, which is built up from partial waves that contain the proper nodal structure.

In practice, the partial wave expansions are truncated. Therefore, the identity of Eq. (19) does not hold strictly. As a result, the plane waves also contribute to the true wavefunction inside the atomic region. This has the advantage that the missing terms in a truncated partial wave expansion are partly accounted for by plane waves, which explains the rapid convergence of
the partial wave expansions. This idea is related to the additive augmentation of the LAPW method of Soler and Williams [11].

Frequently, the question comes up whether the transformation Eq. (22) of the auxiliary wavefunctions indeed provides the true wavefunction. The transformation should be considered merely as a change of representation analogous to a coordinate transform. If the total energy functional is transformed consistently, its minimum will yield auxiliary wavefunctions that produce the correct wavefunctions $|\Psi\rangle$.
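The region-by-region reading of Eq. (23) can be reproduced on a toy grid. The sketch below assumes a single atom, four grid points of which the first two lie inside the augmentation region, and hand-picked partial waves and projectors that satisfy Eqs. (18) and (20). All numbers and names are hypothetical and serve only to show which term survives in which region.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def one_center(partial_waves, projectors, psi_aux):
    """One-center expansion sum_i |w_i> <p_i|psi_aux> for the given partial waves."""
    coeffs = [dot(p, psi_aux) for p in projectors]
    return [sum(c * w[k] for c, w in zip(coeffs, partial_waves))
            for k in range(len(psi_aux))]

def reconstruct(psi_aux, phi, phi_aux, projectors):
    """True wavefunction psi_aux + psi1 - psi1_aux, as in Eq. (23)."""
    psi1 = one_center(phi, projectors, psi_aux)
    psi1_aux = one_center(phi_aux, projectors, psi_aux)
    psi_true = [a + b - c for a, b, c in zip(psi_aux, psi1, psi1_aux)]
    return psi_true, psi1, psi1_aux

# Grid points 0 and 1 are inside the augmentation region, 2 and 3 outside.
phi_aux = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.4]]    # smooth partial waves
phi = [[3.0, 0.0, 0.5, 0.2], [0.0, -2.0, 0.1, 0.4]]       # differ only inside
projectors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # localized inside

psi_aux = [0.7, -0.4, 0.9, 0.3]
psi_true, psi1, psi1_aux = reconstruct(psi_aux, phi, phi_aux, projectors)
print(psi_true, psi1, psi1_aux)
```

Outside the augmentation region the two one-center expansions cancel and the true wavefunction equals the auxiliary one; inside, the auxiliary wavefunction cancels against its own expansion and the true wavefunction equals the expansion in true partial waves.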
3.3. Expectation Values
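Expectation values can be obtained either from the reconstructed true wavefunctions or directly from the auxiliary wavefunctions
$$\langle A\rangle = \sum_n f_n\langle\Psi_n|A|\Psi_n\rangle + \sum_{n=1}^{N_c}\langle\phi^c_n|A|\phi^c_n\rangle = \sum_n f_n\langle\tilde\Psi_n|\mathcal{T}^\dagger A\mathcal{T}|\tilde\Psi_n\rangle + \sum_{n=1}^{N_c}\langle\phi^c_n|A|\phi^c_n\rangle, \tag{26}$$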
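where $f_n$ are the occupations of the valence states and $N_c$ is the number of core states. The first sum runs over the valence states, and the second over the core states $|\phi^c_n\rangle$.

Now we can decompose the matrix element for a wavefunction $\Psi$ into its individual contributions according to Eq. (23):
$$\begin{aligned}
\langle\Psi|A|\Psi\rangle &= \Big\langle\tilde\Psi + \sum_R\left(\Psi^1_R - \tilde\Psi^1_R\right)\Big|A\Big|\tilde\Psi + \sum_{R'}\left(\Psi^1_{R'} - \tilde\Psi^1_{R'}\right)\Big\rangle \\
&= \underbrace{\langle\tilde\Psi|A|\tilde\Psi\rangle + \sum_R\left(\langle\Psi^1_R|A|\Psi^1_R\rangle - \langle\tilde\Psi^1_R|A|\tilde\Psi^1_R\rangle\right)}_{\text{part 1}} \\
&\quad + \underbrace{\sum_R\left(\langle\Psi^1_R - \tilde\Psi^1_R|A|\tilde\Psi - \tilde\Psi^1_R\rangle + \langle\tilde\Psi - \tilde\Psi^1_R|A|\Psi^1_R - \tilde\Psi^1_R\rangle\right)}_{\text{part 2}} \\
&\quad + \underbrace{\sum_{R\neq R'}\langle\Psi^1_R - \tilde\Psi^1_R|A|\Psi^1_{R'} - \tilde\Psi^1_{R'}\rangle}_{\text{part 3}}.
\end{aligned} \tag{27}$$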
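As a check on Eq. (27), the following short derivation shows how the three parts add up to the full matrix element. It is a sketch that uses the abbreviation $|\Delta_R\rangle$, a symbol introduced here only for bookkeeping and not part of the original notation.

```latex
\begin{align*}
|\Delta_R\rangle &:= |\Psi^1_R\rangle - |\tilde\Psi^1_R\rangle ,
\qquad |\Psi\rangle = |\tilde\Psi\rangle + \sum_R |\Delta_R\rangle \\[4pt]
\langle\Psi|A|\Psi\rangle
 &= \langle\tilde\Psi|A|\tilde\Psi\rangle
  + \sum_R \Big( \langle\Delta_R|A|\tilde\Psi\rangle
               + \langle\tilde\Psi|A|\Delta_R\rangle
               + \langle\Delta_R|A|\Delta_R\rangle \Big)
  + \sum_{R\neq R'} \langle\Delta_R|A|\Delta_{R'}\rangle \\[4pt]
\langle\Psi^1_R|A|\Psi^1_R\rangle - \langle\tilde\Psi^1_R|A|\tilde\Psi^1_R\rangle
 &= \langle\Delta_R|A|\Delta_R\rangle
  + \langle\Delta_R|A|\tilde\Psi^1_R\rangle
  + \langle\tilde\Psi^1_R|A|\Delta_R\rangle \\[4pt]
\langle\Delta_R|A|\tilde\Psi\rangle + \langle\tilde\Psi|A|\Delta_R\rangle
 &= \langle\Delta_R|A|\tilde\Psi - \tilde\Psi^1_R\rangle
  + \langle\tilde\Psi - \tilde\Psi^1_R|A|\Delta_R\rangle
  + \langle\Delta_R|A|\tilde\Psi^1_R\rangle
  + \langle\tilde\Psi^1_R|A|\Delta_R\rangle
\end{align*}
```

Inserting the last line into the second and collecting the terms that match the third line gives part 1; the remaining terms containing $\tilde\Psi - \tilde\Psi^1_R$ form part 2, and the sum over different sites is part 3.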
Only the first part of Eq. (27) is evaluated explicitly, while the second and third parts of Eq. (27) are neglected, because they vanish for sufficiently local operators as long as the partial wave expansion is converged: The function $\Psi^1_R - \tilde\Psi^1_R$ vanishes per construction beyond its augmentation region, because the partial waves are pairwise identical beyond that region. The function $\tilde\Psi - \tilde\Psi^1_R$ vanishes inside its augmentation region, if the partial wave expansion is sufficiently converged. In no region of space are both functions $\Psi^1_R - \tilde\Psi^1_R$ and $\tilde\Psi - \tilde\Psi^1_R$ simultaneously nonzero. Similarly, the functions $\Psi^1_R - \tilde\Psi^1_R$ from different sites are never nonzero in the same region in space. Hence, the second and third parts of Eq. (27) vanish for operators such as the kinetic energy $-\frac{\hbar^2}{2m_e}\nabla^2$ and the real space projection operator $|\mathbf{r}\rangle\langle\mathbf{r}|$, which produces the electron density. For truly nonlocal operators the parts 2 and 3 of Eq. (27) would have to be considered explicitly.

The expression, Eq. (26), for the expectation value can therefore be written with the help of Eq. (27) as
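$$\begin{aligned}
\langle A\rangle &= \sum_n f_n\left(\langle\tilde\Psi_n|A|\tilde\Psi_n\rangle + \langle\Psi^1_n|A|\Psi^1_n\rangle - \langle\tilde\Psi^1_n|A|\tilde\Psi^1_n\rangle\right) + \sum_{n=1}^{N_c}\langle\phi^c_n|A|\phi^c_n\rangle \\
&= \sum_n f_n\langle\tilde\Psi_n|A|\tilde\Psi_n\rangle + \sum_{n=1}^{N_c}\langle\tilde\phi^c_n|A|\tilde\phi^c_n\rangle \\
&\quad + \sum_R\left(\sum_{i,j\in R} D_{i,j}\langle\phi_j|A|\phi_i\rangle + \sum_{n\in R}^{N_{c,R}}\langle\phi^c_n|A|\phi^c_n\rangle\right) \\
&\quad - \sum_R\left(\sum_{i,j\in R} D_{i,j}\langle\tilde\phi_j|A|\tilde\phi_i\rangle + \sum_{n\in R}^{N_{c,R}}\langle\tilde\phi^c_n|A|\tilde\phi^c_n\rangle\right),
\end{aligned} \tag{28}$$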
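where $D_{i,j}$ is the one-center density matrix defined as
$$D_{i,j} = \sum_n f_n\langle\tilde\Psi_n|\tilde p_j\rangle\langle\tilde p_i|\tilde\Psi_n\rangle = \sum_n\langle\tilde p_i|\tilde\Psi_n\rangle f_n\langle\tilde\Psi_n|\tilde p_j\rangle. \tag{29}$$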
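The one-center density matrix of Eq. (29) is assembled from the projections $\langle\tilde p_i|\tilde\Psi_n\rangle$ and the occupations. A minimal sketch follows; the projection values and occupations are hypothetical, and the function name is an assumption made for this illustration.

```python
def one_center_density_matrix(projections, occupations):
    """D[i][j] = sum_n <p_i|psi_n> f_n <psi_n|p_j>.

    projections[n][i] holds the complex number <p_i|psi_n>;
    occupations[n] holds f_n."""
    size = len(projections[0])
    D = [[0j for _ in range(size)] for _ in range(size)]
    for c, f in zip(projections, occupations):
        for i in range(size):
            for j in range(size):
                D[i][j] += f * c[i] * c[j].conjugate()
    return D

# Two states, two projector functions on one site (hypothetical values).
projections = [[1.0 + 0.0j, 0.5j],
               [0.2 + 0.0j, -0.3 + 0.1j]]
occupations = [2.0, 1.0]

D = one_center_density_matrix(projections, occupations)
print(D)
```

By construction the matrix is Hermitian, so that the one-center densities built from it with the partial waves are real.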
The auxiliary core states $|\tilde\phi^c_n\rangle$ allow one to incorporate the tails of the core wavefunction into the plane-wave part, and therefore ensure that the integrations of partial wave contributions cancel strictly beyond $r_c$. They are identical to the true core states in the tails, but are a smooth continuation inside the atomic sphere. It is not required that the auxiliary wavefunctions are normalized.
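Following this scheme, the electron density is given by
$$n(\mathbf{r}) = \tilde n(\mathbf{r}) + \sum_R\left(n^1_R(\mathbf{r}) - \tilde n^1_R(\mathbf{r})\right) \tag{30}$$
$$\begin{aligned}
\tilde n(\mathbf{r}) &= \sum_n f_n\tilde\Psi^*_n(\mathbf{r})\tilde\Psi_n(\mathbf{r}) + \tilde n_c(\mathbf{r}) \\
n^1_R(\mathbf{r}) &= \sum_{i,j\in R} D_{i,j}\,\phi^*_j(\mathbf{r})\phi_i(\mathbf{r}) + n_{c,R}(\mathbf{r}) \\
\tilde n^1_R(\mathbf{r}) &= \sum_{i,j\in R} D_{i,j}\,\tilde\phi^*_j(\mathbf{r})\tilde\phi_i(\mathbf{r}) + \tilde n_{c,R}(\mathbf{r}),
\end{aligned} \tag{31}$$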
where $n_{c,R}$ is the core density of the corresponding atom and $\tilde n_{c,R}$ is the auxiliary core density, which is identical to $n_{c,R}$ outside the atomic region, but smooth inside.

Before we continue, let us discuss a special point: The matrix element of a general operator with the auxiliary wavefunctions may be slowly converging with the plane-wave expansion, because the operator $A$ may not be well behaved. An example for such an operator is the singular electrostatic potential of a nucleus. This problem can be alleviated by adding an "intelligent zero": If an operator $B$ is purely localized within an atomic region, we can use the identity between the auxiliary wavefunction and its own partial wave expansion
$$0 = \langle\tilde\Psi_n|B|\tilde\Psi_n\rangle - \langle\tilde\Psi^1_n|B|\tilde\Psi^1_n\rangle. \tag{32}$$
Now we choose an operator $B$ so that it cancels the problematic behavior of the operator $A$, but is localized in a single atomic region. By adding $B$ to the plane-wave part and the matrix elements with its one-center expansions, the plane-wave convergence can be improved without affecting the converged result. A term of this type, namely $\bar v$, will be introduced in the next section to cancel the Coulomb singularity of the potential at the nucleus.
4. Total Energy
Like wavefunctions and expectation values, the total energy can also be divided into three parts:
$$E[\tilde\Psi_n, R_R] = \tilde E + \sum_R\left(E^1_R - \tilde E^1_R\right). \tag{33}$$
The plane-wave part $\tilde E$ involves only smooth functions and is evaluated on equi-spaced grids in real and reciprocal space. This part is computationally most demanding, and is similar to the expressions in the pseudopotential approach:
$$\begin{aligned}
\tilde E &= \sum_n\left\langle\tilde\Psi_n\left|\frac{-\hbar^2}{2m_e}\nabla^2\right|\tilde\Psi_n\right\rangle \\
&\quad + \frac{1}{2}\cdot\frac{e^2}{4\pi\epsilon_0}\int d^3r\int d^3r'\,\frac{[\tilde n(r) + \tilde Z(r)][\tilde n(r') + \tilde Z(r')]}{|r - r'|} \\
&\quad + \int d^3r\,\bar v(r)\tilde n(r) + E_{xc}[\tilde n(r)].
\end{aligned} \tag{34}$$
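$\tilde Z(r)$ is an angular-momentum dependent core-like density that will be described in detail below. The remaining parts can be evaluated on radial grids in a spherical harmonics expansion. The nodal structure of the wavefunctions can be properly described on a logarithmic radial grid that becomes very fine near the nucleus,
$$\begin{aligned}
E^1_R &= \sum_{i,j\in R} D_{i,j}\left\langle\phi_j\left|\frac{-\hbar^2}{2m_e}\nabla^2\right|\phi_i\right\rangle + \sum_{n\in R}^{N_{c,R}}\left\langle\phi^c_n\left|\frac{-\hbar^2}{2m_e}\nabla^2\right|\phi^c_n\right\rangle \\
&\quad + \frac{1}{2}\cdot\frac{e^2}{4\pi\epsilon_0}\int d^3r\int d^3r'\,\frac{[n^1(r) + Z(r)][n^1(r') + Z(r')]}{|r - r'|} + E_{xc}[n^1(r)]
\end{aligned} \tag{35}$$
$$\begin{aligned}
\tilde E^1_R &= \sum_{i,j\in R} D_{i,j}\left\langle\tilde\phi_j\left|\frac{-\hbar^2}{2m_e}\nabla^2\right|\tilde\phi_i\right\rangle \\
&\quad + \frac{1}{2}\cdot\frac{e^2}{4\pi\epsilon_0}\int d^3r\int d^3r'\,\frac{[\tilde n^1(r) + \tilde Z(r)][\tilde n^1(r') + \tilde Z(r')]}{|r - r'|} \\
&\quad + \int d^3r\,\bar v(r)\tilde n^1(r) + E_{xc}[\tilde n^1(r)].
\end{aligned} \tag{36}$$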
The compensation charge density $\tilde Z(r) = \sum_R\tilde Z_R(r)$ is given as a sum of angular-momentum dependent Gauss functions, which have an analytical plane-wave expansion. A similar term also occurs in the pseudopotential approach. In contrast to the norm-conserving pseudopotential approach, however, the compensation charge of an atom $\tilde Z_R$ is nonspherical and constantly adapts to the instantaneous environment. It is constructed such that
$$n^1_R(r) + Z_R(r) - \tilde n^1_R(r) - \tilde Z_R(r) \tag{37}$$
has vanishing electrostatic multipole moments for each atomic site. With this choice, the electrostatic potentials of the augmentation densities vanish outside their spheres. This is the reason that there is no electrostatic interaction of the one-center parts between different sites.
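The statement that a charge distribution without multipole moments produces no potential outside its sphere can be illustrated for the simplest case. The sketch below is an assumption-laden toy model, not the PAW construction itself: it treats only the monopole, uses a point charge `q` compensated by a spherical Gaussian of total charge `-q` and width `sigma`, and works in units where a point charge gives a potential `q/r`.

```python
import math

def compensated_point_charge_potential(r, q=1.0, sigma=0.5):
    """Potential of a point charge q plus a Gaussian charge -q centered on it.

    The Gaussian density is -q * (pi * sigma**2)**-1.5 * exp(-(r/sigma)**2),
    whose potential is -q * erf(r/sigma) / r, so the sum is q * erfc(r/sigma) / r.
    """
    return q * math.erfc(r / sigma) / r
```

A few widths away from the center the total potential is negligible, which is the monopole analogue of the vanishing interaction between one-center parts on different sites. The higher multipoles handled by the angular-momentum dependent Gauss functions in the text are not covered by this sketch.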
The compensation charge density as given here is still localized within the atomic regions. A technique similar to an Ewald summation, however, allows it to be replaced by a very extended charge density. In this way the plane-wave convergence of the total energy is not affected by the auxiliary density. The potential $\bar v = \sum_R\bar v_R$, which occurs in Eqs. (34) and (36), enters the total energy in the form of the “intelligent zeros” described in Eq. (32)
$$\begin{aligned}
0 &= \sum_n f_n\left(\langle\tilde\Psi_n|\bar v_R|\tilde\Psi_n\rangle - \langle\tilde\Psi^1_n|\bar v_R|\tilde\Psi^1_n\rangle\right) \\
&= \sum_n f_n\langle\tilde\Psi_n|\bar v_R|\tilde\Psi_n\rangle - \sum_{i,j\in R} D_{i,j}\langle\tilde\phi_i|\bar v_R|\tilde\phi_j\rangle.
\end{aligned} \tag{38}$$
The main reason for introducing this potential is to cancel the Coulomb singularity of the potential in the plane-wave part. The potential $\bar v$ can be used to improve the plane-wave convergence without changing the converged result. $\bar v$ must be localized within the augmentation region, where Eq. (19) holds.
5. Approximations
Once the total energy functional provided in the previous section has been defined, everything else follows: Forces are partial derivatives with respect to atomic positions. The potential is the derivative of the nonkinetic energy contributions to the total energy with respect to the density, and the auxiliary Hamiltonian follows from derivatives $\tilde H|\tilde\Psi_n\rangle$ with respect to auxiliary wave functions. The fictitious Lagrangian approach of Car and Parrinello [38] does not allow any freedom in the way these derivatives are obtained. Anything other than analytic derivatives will violate energy conservation in a dynamical simulation. Since the expressions are straightforward, even though rather involved, we will not discuss them here. All approximations are already incorporated in the total energy functional of the PAW method. What are those approximations?
• First, we use the frozen-core approximation. In principle, this approximation can be overcome.
• The plane-wave expansion for the auxiliary wavefunctions must be complete. The plane-wave expansion is controlled easily by increasing the plane-wave cut-off defined as $E_{PW} = \frac{1}{2}\hbar^2 G^2_{max}$. Typically, we use a plane-wave cut-off of 30 Ry.
• The partial wave expansions must be converged. Typically we use one or two partial waves per angular momentum $(\ell, m)$ and site. It should be noted that the partial wave expansion is not variational, because it changes the total energy functional and not the basis set for the auxiliary wavefunctions.
We do not discuss here numerical approximations such as the choice of the radial grid, since those are easily controlled.
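To give a feeling for what a plane-wave cut-off means in practice, the following sketch counts the plane waves below a given cut-off. Several things here are assumptions for illustration only: the free-electron relation $E = \hbar^2 G^2/2m_e$, which in Rydberg atomic units reads E [Ry] = G² [bohr⁻²]; a simple cubic cell of side `box_bohr`; sampling at k = 0; and the function name itself.

```python
import itertools
import math

def count_plane_waves(ecut_ry, box_bohr):
    """Count reciprocal lattice vectors G of a cubic cell with G**2 <= ecut_ry.

    Assumes Rydberg atomic units, so the kinetic energy of a plane wave is G**2.
    """
    g_max = math.sqrt(ecut_ry)          # largest allowed |G| in bohr^-1
    step = 2.0 * math.pi / box_bohr     # spacing of the reciprocal lattice
    n_max = int(g_max / step)           # largest integer index along one axis
    count = 0
    for n1, n2, n3 in itertools.product(range(-n_max, n_max + 1), repeat=3):
        if step**2 * (n1 * n1 + n2 * n2 + n3 * n3) <= ecut_ry:
            count += 1
    return count
```

The count grows with the cell volume and with the cut-off to the power 3/2, which is why the cut-off is the single parameter that controls the cost and the completeness of the plane-wave part.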
6. Relation to the Pseudopotentials
We mentioned earlier that the pseudopotential approach can be derived as a well-defined approximation from the PAW method: The augmentation part of the total energy $\Delta E = E^1 - \tilde E^1$ for one atom is a functional of the one-center density matrix $D_{i,j\in R}$ defined in Eq. (29). The pseudopotential approach can be recovered if we truncate a Taylor expansion of $\Delta E$ about the atomic density matrix after the linear term. The term linear in $D_{i,j}$ is the energy related to the nonlocal pseudopotential.
$$\begin{aligned}
\Delta E(D_{i,j}) &= \Delta E(D^{at}_{i,j}) + \sum_{i,j}(D_{i,j} - D^{at}_{i,j})\frac{\partial\Delta E}{\partial D_{i,j}} + O(D_{i,j} - D^{at}_{i,j})^2 \\
&= E_{self} + \sum_n f_n\langle\tilde\Psi_n|v^{ps}|\tilde\Psi_n\rangle - \int d^3r\,\bar v(r)\tilde n(r) + O(D_{i,j} - D^{at}_{i,j})^2,
\end{aligned} \tag{39}$$
which can directly be compared to the total energy expression, Eq. (10), of the pseudopotential method. The local potential $\bar v(r)$ of the pseudopotential approach is identical to the corresponding potential of the projector augmented wave method. The remaining contributions in the PAW total energy, namely $\tilde E$, differ from the corresponding terms in Eq. (10) in only two features: our auxiliary density also contains an auxiliary core density, reflecting the nonlinear core correction of the pseudopotential approach, and the compensation density $\tilde Z(r)$ is nonspherical and depends on the wavefunction. Thus, we can also look at the PAW method as a pseudopotential method with a pseudopotential that adapts to the instantaneous electronic environment. In the PAW method, the explicit nonlinear dependence of the total energy on the one-center density matrix is properly taken into account. What are the main advantages of the PAW method compared to the pseudopotential approach? First, all errors can be systematically controlled so that there are no transferability errors. As shown by Watson and Carter [43] and Kresse and Joubert [44], most pseudopotentials fail for high-spin atoms such as Cr. While it is probably true that pseudopotentials can be constructed that cope even with this situation, a failure cannot be known beforehand, so that some empiricism remains in practice: A pseudopotential constructed from an isolated atom is
not guaranteed to be accurate for a molecule. In contrast, the converged results of the PAW method do not depend on a reference system such as an isolated atom, because PAW uses the full density and potential. Like other all-electron methods, the PAW method provides access to the full charge and spin density, which is relevant, for example, for hyperfine parameters. Hyperfine parameters are sensitive probes of the electron density near the nucleus. In many situations they are the only information available that allows one to deduce the atomic structure and chemical environment of an atom from experiment. The plane-wave convergence is more rapid than in norm-conserving pseudopotentials and should in principle be equivalent to that of ultra-soft pseudopotentials [31]. Compared to the ultra-soft pseudopotentials, however, the PAW method has the advantage that the total energy expression is less complex and can therefore be expected to be more efficient. The construction of pseudopotentials requires a number of parameters to be determined. As they influence the results, their choice is critical. The PAW method also provides some flexibility in the choice of auxiliary partial waves. However, this choice does not influence the converged results.
7. Recent Developments
Since the first implementation of the PAW method in the CP-PAW code, a number of groups have adopted the PAW method. The second implementation was done by the group of Holzwarth [45]. The resulting PWPAW code is freely available [46]. This code is also used as a basis for the PAW implementation in the AbInit project. An independent PAW code has been developed by Valiev and Weare [47]. Recently, the PAW method has been implemented into the VASP code [44]. The PAW method has also been implemented by Kromen into the ESTCoMPP code of Blügel and Schröder.
Another branch of methods uses the reconstruction of the PAW method, without taking into account the full wavefunctions in the energy minimization. Following chemists’ notation, this approach could be termed “post-pseudopotential PAW.” This development began with the evaluation of hyperfine parameters from a pseudopotential calculation using the PAW reconstruction operator [36] and is now used in the pseudopotential approach to calculate properties that require the correct wavefunctions, such as hyperfine parameters.
The implementation by Kresse and Joubert [44] has been particularly useful as they had an implementation of PAW in the same code as the ultra-soft pseudopotentials, so that they could critically compare the two approaches with each other. Their conclusion is that both methods compare well in most cases, but they found that magnetic energies are seriously in error (by a factor of 2) in the pseudopotential approach, while the results of the PAW method were in line with other all-electron calculations using the linear augmented plane-wave method. As a short note, Kresse and Joubert incorrectly claim that their implementation is superior as it includes a term that is analogous to the nonlinear core correction of pseudopotentials [32]: this term, however, is already included in the original version in the form of the pseudized core density.
Several extensions of the PAW method have been made in recent years: For applications in chemistry, truly isolated systems are often of great interest. As any plane-wave based method introduces periodic images, the electrostatic interaction between these images can cause serious errors. The problem has been solved by mapping the charge density onto a point charge model, so that the electrostatic interaction could be subtracted out in a self-consistent manner [48]. In order to include the influence of the environment, the latter was simulated by simpler force fields using the quantum-mechanics–molecular-mechanics (QM–MM) approach [49].
In order to overcome the limitations of density functional theory, several extensions have been implemented. Bengone et al. [50] implemented the LDA+U approach into the CP-PAW code. Soon after this, Arnaud and Alouani [51] accomplished the implementation of the GW approximation into the CP-PAW code. The VASP version of PAW [52] and the CP-PAW code have now been extended to include a noncollinear description of the magnetic moments. In a noncollinear description, the Schrödinger equation is replaced by the Pauli equation with two-component spinor wavefunctions.
The PAW method has proven useful to evaluate electric field gradients [53] and magnetic hyperfine parameters with high accuracy [54].
The prediction of NMR chemical shifts using the GIPAW method of Pickard and Mauri [55], which is based on their earlier work [56], will be invaluable. While the GIPAW method is implemented in a post-pseudopotential manner, the extension to a self-consistent PAW calculation should be straightforward. A post-pseudopotential approach has also been used to evaluate core level spectra [57] and momentum matrix elements [58].
Acknowledgments
We are grateful to S. Boeck, J. Noffke, and A. Poddey for carefully reading the manuscript, and to K. Schwarz for his continuous support. This work has benefited from the collaborations within the ESF Programme on “Electronic Structure Calculations for Elucidating the Complex Atomistic Behavior of Solids and Surfaces.”
References
[1] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas,” Phys. Rev., 136, B864, 1964.
[2] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev., 140, A1133, 1965.
[3] R.G. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford University Press, Oxford, 1989.
[4] P.E. Blöchl, “Projector augmented-wave method,” Phys. Rev. B, 50, 17953, 1994.
[5] J.C. Slater, “Wave functions in a periodic potential,” Phys. Rev., 51, 846, 1937.
[6] J. Korringa, “On the calculation of the energy of a Bloch wave in a metal,” Physica (Utrecht), 13, 392, 1947.
[7] W. Kohn and N. Rostoker, “Solution of the Schrödinger equation in periodic lattices with an application to metallic lithium,” Phys. Rev., 94, 1111, 1954.
[8] O.K. Andersen, “Linear methods in band theory,” Phys. Rev. B, 12, 3060, 1975.
[9] H. Krakauer, M. Posternak, and A.J. Freeman, “Linearized augmented plane-wave method for the electronic band structure of thin films,” Phys. Rev. B, 19, 1706, 1979.
[10] D.J. Singh, Planewaves, Pseudopotentials and the LAPW Method, Kluwer Academic, Dordrecht, 1994.
[11] J.M. Soler and A.R. Williams, “Simple formula for the atomic forces in the augmented-plane-wave method,” Phys. Rev. B, 40, 1560, 1989.
[12] D. Singh, “Ground-state properties of lanthanum: treatment of extended-core states,” Phys. Rev. B, 43, 6388, 1991.
[13] E. Sjöstedt, L. Nordström, and D.J. Singh, “An alternative way of linearizing the augmented plane-wave method,” Solid State Commun., 114, 15, 2000.
[14] G.K.H. Madsen, P. Blaha, K. Schwarz, E. Sjöstedt, and L. Nordström, “Efficient linearization of the augmented plane-wave method,” Phys. Rev. B, 64, 195134, 2001.
[15] H.L. Skriver, The LMTO Method, Springer, New York, 1984.
[16] O.K. Andersen and O. Jepsen, “Explicit, first-principles tight-binding theory,” Phys. Rev. Lett., 53, 2571, 1984.
[17] O.K. Andersen, T. Saha-Dasgupta, and S. Ezhov, “Third-generation muffin-tin orbitals,” Bull. Mater. Sci., 26, 19, 2003.
[18] K. Held, I.A. Nekrasov, G. Keller, V. Eyert, N. Blümer, A.K. McMahan, R.T. Scalettar, T. Pruschke, V.I. Anisimov, and D. Vollhardt, “The LDA+DMFT approach to materials with strong electronic correlations,” In: J. Grotendorst, D. Marx, and A. Muramatsu (eds.), Quantum Simulations of Complex Many-Body Systems: From Theory to Algorithms, Lecture Notes, vol. 10, NIC Series, John von Neumann Institute for Computing, Jülich, p. 175, 2002.
[19] C. Herring, “A new method for calculating wave functions in crystals,” Phys. Rev., 57, 1169, 1940.
[20] J.C. Phillips and L. Kleinman, “New method for calculating wave functions in crystals and molecules,” Phys. Rev., 116, 287, 1959.
[21] E. Antoncik, “Approximate formulation of the orthogonalized plane-wave method,” J. Phys. Chem. Solids, 10, 314, 1959.
[22] D.R. Hamann, M. Schlüter, and C. Chiang, “Norm-conserving pseudopotentials,” Phys. Rev. Lett., 43, 1494, 1979.
[23] A. Zunger and M. Cohen, “First-principles nonlocal-pseudopotential approach in the density-functional formalism: development and application to atoms,” Phys. Rev. B, 18, 5449, 1978.
[24] G.P. Kerker, “Non-singular atomic pseudopotentials for solid state applications,” J. Phys. C, 13, L189, 1980.
[25] G.B. Bachelet, D.R. Hamann, and M. Schlüter, “Pseudopotentials that work: from H to Pu,” Phys. Rev. B, 26, 4199, 1982.
[26] N. Troullier and J.L. Martins, “Efficient pseudopotentials for plane-wave calculations,” Phys. Rev. B, 43, 1993, 1991.
[27] J.S. Lin, A. Qteish, M.C. Payne, and V. Heine, “Optimized and transferable nonlocal separable ab initio pseudopotentials,” Phys. Rev. B, 47, 4174, 1993.
[28] M. Fuchs and M. Scheffler, “Ab initio pseudopotentials for electronic structure calculations of poly-atomic systems using density-functional theory,” Comput. Phys. Commun., 119, 67, 1999.
[29] L. Kleinman and D.M. Bylander, “Efficacious form for model pseudopotentials,” Phys. Rev. Lett., 48, 1425, 1982.
[30] P.E. Blöchl, “Generalized separable potentials for electronic structure calculations,” Phys. Rev. B, 41, 5414, 1990.
[31] D. Vanderbilt, “Soft self-consistent pseudopotentials in a generalized eigenvalue formalism,” Phys. Rev. B, 41, 7892, 1990.
[32] S.G. Louie, S. Froyen, and M.L. Cohen, “Nonlinear ionic pseudopotentials in spin-density-functional calculations,” Phys. Rev. B, 26, 1738, 1982.
[33] D.R. Hamann, “Generalized norm-conserving pseudopotentials,” Phys. Rev. B, 40, 2980, 1989.
[34] K. Laasonen, A. Pasquarello, R. Car, C. Lee, and D. Vanderbilt, “Implementation of ultrasoft pseudopotentials in ab initio molecular dynamics,” Phys. Rev. B, 47, 10142, 1993.
[35] X. Gonze, R. Stumpf, and M. Scheffler, “Analysis of separable potentials,” Phys. Rev. B, 44, 8503, 1991.
[36] C.G. Van de Walle and P.E. Blöchl, “First-principles calculations of hyperfine parameters,” Phys. Rev. B, 47, 4244, 1993.
[37] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, “Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate-gradients,” Rev. Mod. Phys., 64, 1045, 1992.
[38] R. Car and M. Parrinello, “Unified approach for molecular dynamics and density-functional theory,” Phys. Rev. Lett., 55, 2471, 1985.
[39] S. Nosé, “A unified formulation of the constant temperature molecular-dynamics methods,” Mol. Phys., 52, 255, 1984.
[40] W.G. Hoover, “Canonical dynamics: equilibrium phase-space distributions,” Phys. Rev. A, 31, 1695, 1985.
[41] P.E. Blöchl and M. Parrinello, “Adiabaticity in first-principles molecular dynamics,” Phys. Rev. B, 45, 9413, 1992.
[42] P.E. Blöchl, “Second generation wave function thermostat for ab initio molecular dynamics,” Phys. Rev. B, 65, 104303, 2002.
[43] S.C. Watson and E.A. Carter, “Spin-dependent pseudopotentials,” Phys. Rev. B, 58, R13309, 1998.
[44] G. Kresse and D. Joubert, “From ultrasoft pseudopotentials to the projector augmented-wave method,” Phys. Rev. B, 59, 1758, 1999.
[45] N.A.W. Holzwarth, G.E. Matthews, R.B. Dunning, A.R. Tackett, and Y. Zheng, “Comparison of the projector augmented-wave, pseudopotential, and linearized augmented-plane-wave formalisms for density-functional calculations of solids,” Phys. Rev. B, 55, 2005, 1997.
[46] A.R. Tackett, N.A.W. Holzwarth, and G.E. Matthews, “A projector augmented wave (PAW) code for electronic structure calculations. Part I: atompaw for generating atom-centered functions. A projector augmented wave (PAW) code for electronic structure calculations. Part II: pwpaw for periodic solids in a plane wave basis,” Comput. Phys. Commun., 135, 329–347, 2001. See also pp. 348–376.
[47] M. Valiev and J.H. Weare, “The projector-augmented plane wave method applied to molecular bonding,” J. Phys. Chem. A, 103, 10588, 1999.
[48] P.E. Blöchl, “Electrostatic decoupling of periodic images of plane-wave-expanded densities and derived atomic point charges,” J. Chem. Phys., 103, 7422, 1995.
[49] T.K. Woo, P.M. Margl, P.E. Blöchl, and T. Ziegler, “A combined Car–Parrinello QM/MM implementation for ab initio molecular dynamics simulations of extended systems: application to transition metal catalysis,” J. Phys. Chem. B, 101, 7877, 1997.
[50] O. Bengone, M. Alouani, P.E. Blöchl, and J. Hugel, “Implementation of the projector augmented-wave LDA+U method: application to the electronic structure of NiO,” Phys. Rev. B, 62, 16392, 2000.
[51] B. Arnaud and M. Alouani, “All-electron projector-augmented-wave GW approximation: application to the electronic properties of semiconductors,” Phys. Rev. B, 62, 4464, 2000.
[52] D. Hobbs, G. Kresse, and J. Hafner, “Fully unconstrained noncollinear magnetism within the projector augmented-wave method,” Phys. Rev. B, 62, 11556, 2000.
[53] H.M. Petrilli, P.E. Blöchl, P. Blaha, and K. Schwarz, “Electric-field-gradient calculations using the projector augmented wave method,” Phys. Rev. B, 57, 14690, 1998.
[54] P.E. Blöchl, “First-principles calculations of defects in oxygen-deficient silica exposed to hydrogen,” Phys. Rev. B, 62, 6158, 2000.
[55] C.J. Pickard and F. Mauri, “All-electron magnetic response with pseudopotentials: NMR chemical shifts,” Phys. Rev. B, 63, 245101, 2001.
[56] F. Mauri, B.G. Pfrommer, and S.G. Louie, “Ab initio theory of NMR chemical shifts in solids and liquids,” Phys. Rev. Lett., 77, 5300, 1996.
[57] D.N. Jayawardane, C.J. Pickard, L.M. Brown, and M.C. Payne, “Cubic boron nitride: experimental and theoretical energy-loss near-edge structure,” Phys. Rev. B, 64, 115107, 2001.
[58] H. Kageshima and K. Shiraishi, “Momentum-matrix-element calculation using pseudopotentials,” Phys. Rev. B, 56, 14985, 1997.
1.7 ELECTRONIC SCALE
James R. Chelikowsky
University of Minnesota, Minneapolis, MN, USA
S. Yip (ed.), Handbook of Materials Modeling, 121–135. © 2005 Springer. Printed in the Netherlands.
1. Real-space methods for ab initio calculations
Major computational advances in predicting the electronic and structural properties of matter come from two sources: improved performance of hardware and the creation of new algorithms, i.e., software. Improved hardware follows technical advances in computer design and electronic components. Such advances are frequently characterized by Moore’s Law, which states that computer power will double every 2 years or so. This law has held true for the past 20 or 30 years and most workers expect it to hold for the next decade, suggesting that such technical advances can be predicted. In clear contrast, the creation of new high performance algorithms defies characterization by a similar law, as creativity is clearly not a predictable activity. Nonetheless, over the past half century, most advances in the theory of the electronic structure of matter have been made with new algorithms as opposed to better hardware. One may reasonably expect these advances to continue. Physical concepts such as pseudopotentials and density functional theory, coupled with numerical methods such as iterative diagonalization, have permitted very large systems to be examined, much larger than could be handled solely through the increase in computational hardware performance. Systems with hundreds, if not thousands, of atoms can now be examined, whereas methods of a generation ago might handle only tens of atoms. The development of real-space methods for the electronic structure over the past ten years is a notable advance in high performance algorithms for solving the electronic structure problem. Real-space methods do not require an explicit basis. The convergence of the method, assuming a uniform grid, can be tested by varying only one parameter: the grid spacing. The method can easily be applied to neutral or charged systems, to extended or localized systems, and to diverse materials such as simple metals, semiconductors, and transition metals. These methods are also well suited for highly parallel computing platforms as few global communications are required. Review articles on these approaches can be found in Refs. [1–3].
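The claim that convergence on a uniform grid is controlled by the grid spacing alone can be tried out on a toy problem. The sketch below is not the method of the references cited above: it assumes a one-dimensional harmonic oscillator in atomic units (exact ground-state energy 0.5), a simple second-order finite-difference stencil rather than a high-order one, and hypothetical names such as `ground_state_energy` and `half_width`.

```python
import numpy as np

def ground_state_energy(h, half_width=8.0):
    """Lowest eigenvalue of H = -1/2 d^2/dx^2 + 1/2 x^2 on a uniform grid.

    h is the grid spacing; the wave function is set to zero at +/- half_width.
    """
    n = int(round(2.0 * half_width / h)) - 1        # number of interior points
    x = -half_width + h * np.arange(1, n + 1)
    diagonal = 1.0 / h**2 + 0.5 * x**2              # kinetic + potential terms
    off_diagonal = -0.5 / h**2 * np.ones(n - 1)     # coupling to neighbors
    hamiltonian = (np.diag(diagonal)
                   + np.diag(off_diagonal, 1)
                   + np.diag(off_diagonal, -1))
    return np.linalg.eigvalsh(hamiltonian)[0]
```

Evaluating the function for a sequence such as h = 0.4, 0.2, 0.1 shows the eigenvalue approaching the exact result as the spacing is reduced, with no other basis-set parameter to adjust.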
2. The Electronic Structure Problem
Most contemporary descriptions of the electronic structure problem for large systems cast the problem within density functional theory [4]. The many-body problem is mapped onto a one-electron Schrödinger equation called the Kohn–Sham equation [5]. For an atom, this equation can be written as
$$\left[\frac{-\hbar^2\nabla^2}{2m} - \frac{Ze^2}{r} + V_H(\vec r) + V_{xc}[\vec r, \rho(\vec r)]\right]\psi_n(\vec r) = E_n\psi_n(\vec r) \tag{1}$$
where there are $Z$ electrons in the atom, $V_H$ is the Hartree or Coulomb potential, and $V_{xc}$ is the exchange-correlation potential. The Hartree and exchange-correlation potentials can be determined from the electronic charge density. The eigenvalues and eigenfunctions, $(E_n, \psi_n(\vec r))$, can be used to determine the total electronic energy of the atom. The density is given by
$$\rho(\vec r) = -e\sum_{n,\,occup}|\psi_n(\vec r)|^2 \tag{2}$$
The summation is over all occupied states. The Hartree potential is then determined by
$$\nabla^2 V_H(\vec r) = -4\pi e\rho(\vec r) \tag{3}$$
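For a spherically symmetric density the Poisson equation can be integrated directly on a radial grid. The following sketch makes several assumptions that are not in the text: Hartree atomic units, an electron number density `n` instead of the charge density (sign conventions for the charge density differ between texts), so that the Hartree potential is the integral of n(r')/|r − r'| over all space, and helper names chosen for this example.

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y(x), starting from zero at x[0]."""
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return out

def hartree_potential_radial(r, n):
    """Hartree potential of a spherical number density n(r), in atomic units.

    Uses V_H(r) = Q(r)/r + 4*pi * integral_r^inf n(r') r' dr', where Q(r) is
    the number of electrons enclosed within radius r.
    """
    enclosed = cumulative_trapezoid(4.0 * np.pi * n * r**2, r)
    shell = cumulative_trapezoid(4.0 * np.pi * n * r, r)
    outer = shell[-1] - shell            # contribution of the density beyond r
    return enclosed / r + outer
```

For the hydrogen 1s density, n(r) = exp(−2r)/π, the result can be compared with the closed form 1/r − (1 + 1/r) exp(−2r), which gives a simple check of the grid and of the integration.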
The Hartree potential can be interpreted as the electrostatic interaction of an electron with the charge density of the system. The exchange-correlation potential is more problematic. Within density functional theory, one can define an exchange-correlation potential as a functional of the charge density. The central tenet of the local density approximation [5] is that the total exchange-correlation energy may be written as
$$E_{xc}[\rho] = \int\rho(\vec r)\,\epsilon_{xc}(\rho(\vec r))\,d^3r \tag{4}$$
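As an illustration of Eq. (4), the exchange part of the energy density for a spin-unpolarized homogeneous electron gas has the closed form ε_x(n) = −(3/4)(3/π)^{1/3} n^{1/3} in Hartree atomic units. The sketch below integrates it for a spherical density; the use of a number density `n`, the omission of correlation, and the function names are assumptions of this example.

```python
import numpy as np

def lda_exchange_energy_density(n):
    """Exchange energy per electron of the homogeneous electron gas (hartree)."""
    return -0.75 * (3.0 / np.pi) ** (1.0 / 3.0) * n ** (1.0 / 3.0)

def lda_exchange_energy_radial(r, n):
    """E_x = integral of n(r) * eps_x(n(r)) over space, for a spherical density."""
    integrand = 4.0 * np.pi * r**2 * n * lda_exchange_energy_density(n)
    # Trapezoidal rule on the (possibly nonuniform) radial grid.
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))
```

The only input is the density on the grid, which is what makes the local density approximation so simple to apply once the density is known.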
In Eq. (4), $\epsilon_{xc}$ is the exchange-correlation energy density. If one has knowledge of the exchange-correlation energy density, one can extract the potential and total electronic energy of the system. As a first approximation, the exchange-correlation energy density can be extracted from a homogeneous electron gas. It is common practice to separate exchange and correlation contributions to $\epsilon_{xc}$: $\epsilon_{xc} = \epsilon_x + \epsilon_c$ [4].
It is not difficult to solve the Kohn–Sham equation (Eq. 1) for an atom. The potential, and charge density, is assumed to be spherically symmetric and the Kohn–Sham problem reduces to solving a one-dimensional problem. The Hartree and exchange-correlation potentials can be iterated to form a self-consistent field. Usually the process is so quick for an atom that it can be done on a desktop or laptop computer in a matter of seconds.
In three dimensions, as for a complex atomic cluster, liquid or crystal, the problem is highly nontrivial. One major difficulty is the range of length scales involved. For example, in the case of a multielectron atom, the most tightly bound core electrons can be confined to within ∼0.1 Å whereas the outer valence electrons may extend over ∼1–5 Å. In addition, the nodal structure of the atomic wave functions is difficult to replicate with a simple basis, especially the cusp in a wave function at the nuclear site where the Coulomb potential diverges. One approach to this problem is to form a basis combining highly localized functions with extended functions. This approach enormously complicates the electronic structure problem, as valence and core states are treated on an equal footing whereas such states are not equivalent in terms of their chemical activity.
Consider the physical content of the periodic table, i.e., arranging the elements into columns with similar chemical properties. The Group IV elements such as C, Si, and Ge have similar properties because they share an outer $s^2p^2$ configuration. This chemical similarity of the valence electrons is recognized by the pseudopotential approximation [6, 7]. The pseudopotential replaces the “all electron” potential by one that describes only the chemically active, or valence, electrons. Usually, the pseudopotential subsumes the nuclear potential with those of the core electrons to generate an “ion core potential.” As an example, consider a sodium atom whose core electron configuration is $1s^2 2s^2 2p^6$ and whose valence electron configuration is $3s^1$.
The charge on the ion core pseudopotential is +1 (the nuclear charge minus the number of core electrons). Such a pseudopotential will bind only one electron. The length scale of the pseudopotential is now set by the valence electrons alone. This permits a great simplification of the Kohn–Sham problem in terms of choosing a basis. For the purposes of designing an ab initio pseudopotential, let us consider a sodium atom. By solving for the Na atom, we know the eigenvalue, $\epsilon_{3s}$, and the corresponding wave function, $\psi_{3s}(r)$, for the valence electron. We demand that the Na pseudopotential satisfy several conditions: (1) The potential binds only the valence electron, the 3s electron for the case at hand. (2) The eigenvalue of the corresponding valence electron is identical to the full potential eigenvalue. The full potential is also called the all-electron potential. (3) The wave function is nodeless and identical to the “all electron” wave function outside the core region. For example, we construct a pseudo-wave function, $\phi_{3s}(r)$, such that $\phi_{3s}(r) = \psi_{3s}(r)$ for $r > r_c$, where $r_c$ defines the size spanned by the ion core, i.e., the nucleus and core electrons. For Na, this means the “size” of the $1s^2 2s^2 2p^6$ states. Typically, the core radius is taken to be less than the distance corresponding to the maximum of the valence wave function, but greater than the distance of the outermost node. If the eigenvalue, $\epsilon_p$, and the wave function, $\phi_p(r)$, are known from solving the atom, it is possible to invert the Kohn–Sham equation to yield an ion core pseudopotential, i.e., a pseudopotential that when screened will yield the exact eigenvalue and wave function by construction:
$$V^p_{ion}(r) = \epsilon_p + \frac{\hbar^2\nabla^2\phi_p}{2m\phi_p} - V_H(r) - V_{xc}[r, \rho(r)] \tag{5}$$
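The inversion step in Eq. (5) can be sketched on a radial grid. The example below is a simplified illustration under stated assumptions: Hartree atomic units, a uniform grid, the radial function u(r) = r φ(r), and a second-order finite difference for the second derivative. It returns only the screened potential; subtracting $V_H$ and $V_{xc}$ as in Eq. (5) is left out, and the function name is hypothetical.

```python
import numpy as np

def invert_radial_equation(r, u, eps, ell=0):
    """Screened potential that reproduces a nodeless radial function u = r*phi.

    Solves -u''/2 + [ell(ell+1)/(2 r^2) + V] u = eps * u for V on the interior
    points of a uniform grid r. Returns the interior radii and V there.
    """
    h = r[1] - r[0]
    second_derivative = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2
    r_in, u_in = r[1:-1], u[1:-1]
    v = eps + second_derivative / (2.0 * u_in) - ell * (ell + 1) / (2.0 * r_in**2)
    return r_in, v
```

Feeding in the hydrogen 1s function u(r) = r exp(−r) with eigenvalue −0.5 should recover the Coulomb potential −1/r, which is a convenient check. The division by u is safe only because the function is nodeless, which is one reason a nodeless pseudo-wave function is required.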
Within this construction, the pseudo-wave function, $\phi_p(r)$, should be identical to the all-electron wave function, $\psi_{AE}(r)$, outside the core: $\phi_p(r) = \psi_{AE}(r)$ for $r > r_c$ will guarantee that the pseudo-wave function yields chemical properties similar to those of the all-electron wave function. For $r < r_c$, one may alter the all-electron wave function as one wishes, within certain limitations, and retain the chemical accuracy of the problem. For computational simplicity, we take the wave function in this region to be smooth and nodeless. Another very important criterion is mandated. Namely, the integral of the pseudocharge density, i.e., the square of the wave function $|\phi_p(r)|^2$, within the core should be equal to the integral of the all-electron charge density. Without this condition, the pseudo-wave function can differ by a scaling factor from the all-electron wave function, that is, $\phi_p(r) = C\times\psi_{AE}(r)$ for $r > r_c$, where the constant $C$ may differ from unity. Since we expect the chemical bonding of an atom to be highly dependent on the tails of the valence wave functions, it is imperative that the normalized pseudo-wave function be identical to the all-electron wave function. The criterion by which one ensures $C = 1$ is called norm conserving [2]. An example of a pseudopotential, in this case the Na pseudopotential, is presented in Fig. 1. The ion core pseudopotential is dependent on the angular momentum component of the wave function. This is apparent from Eq. (5), where $V^p_{ion}$ is “state dependent” or nonlocal. This nonlocal behavior is pronounced for first row elements, which lack p-states in the core, and for first row transition metals, which lack d-states in the core. A physical explanation for this behavior can be traced to the orthogonality requirement of the valence wave functions to the core states. This may be illustrated by considering the carbon atom.
The 2s state of carbon is orthogonal to the 1s state, whereas the 2p state is not required to be orthogonal to a 1p state. As such, the 2s state has a node; the 2p does not. In transforming these states to nodeless pseudo-wave functions, more kinetic energy is associated with the 2s state than with the 2p state. The additional kinetic energy cancels the strong Coulomb potential better for the 2s state than for the 2p. In terms of the ion core pseudopotential, the 2s potential is weaker than the 2p potential.
[Figure 1 plot: potential (Ry) versus r (a.u.), showing the s-, p-, and d-pseudopotentials and the all-electron potential.]
Figure 1. Pseudopotential compared to the all-electron potential for the sodium atom. This pseudopotential was constructed using the method of Troullier and Martins [8].
In the case of sodium, only three significant components (s, p, and d) are required for an accurate pseudopotential. Note how the d component is the strongest, following the argument that no core states of similar angular momentum exist within the Na core. For more complex systems such as rare earth metals, one might have four or more components. In Fig. 2, the 3s state for the all-electron potential is illustrated. It is compared to the lowest s-state for the pseudopotential illustrated in Fig. 1. The Kohn–Sham equation can be rewritten for a pseudopotential as
\[
\left[ -\frac{\hbar^2 \nabla^2}{2m} + V_{ion}^p(\mathbf{r}) + V_H(\mathbf{r}) + V_{xc}[\mathbf{r}, \rho(\mathbf{r})] \right] \psi_n(\mathbf{r}) = E_n \psi_n(\mathbf{r}) \qquad (6)
\]
where $V_{ion}^p$ can be expressed as
\[
V_{ion}^p(\mathbf{r}) = \sum_i V_{i,ion}^p(\mathbf{r} - \mathbf{R}_i) \qquad (7)
\]
where $V_{i,ion}^p$ is the ionic pseudopotential for the ith atomic species located at position $\mathbf{R}_i$. The charge density in Eq. (6) corresponds to a sum over the wave functions for occupied valence states.
[Figure 2: Na wave functions (3s and 3p) versus r (a.u.).]
Figure 2. Pseudopotential wave functions compared to all-electron wave functions for the sodium atom. The all-electron wave functions are indicated by the dashed lines.
Since the pseudopotential and corresponding wave functions vary slowly in space, a number of simple basis sets are possible, e.g., one could use Gaussians [9] or plane waves [6, 7]. Both methods often work quite well, although each has its limitations. Owing in part to their simplicity and ease of implementation, plane wave methods have become the method of choice for electronic structure work, especially for simple metals and semiconductors like silicon [7, 10]. Methods based on plane wave bases are often called “momentum” or “reciprocal” space approaches to the electronic structure problem. Plane wave approaches utilize a basis of “infinite extent.” The extended basis requires special techniques to describe localized systems. For example, suppose one wishes to examine a cluster of silicon atoms. A common approach is to use a “supercell method.” The cluster is placed in a large cell, which is periodically repeated to fill all space. The electronic structure of this system corresponds to that of an isolated cluster, provided sufficient “vacuum” surrounds each cluster. This method is very successful and has been used to consider localized systems such as clusters as well as extended systems such as surfaces or liquids [10]. In contrast, one can take a rather dramatic alternative view, eliminate an explicit basis altogether, and solve Eq. (6) completely in real space using
a grid. Real space or grid methods are typically used for engineering problems, e.g., one might solve for the strain field in an airplane wing using finite element methods. Such methods have not been commonly used for the electronic structure problem. There are at least two reasons for this situation. First, without the pseudopotential method, a nonlinear grid would be needed to describe the singular coulombic potential near the atomic nucleus and the corresponding cusp in the wave function. This would enormously complicate the problem and destroy the simplicity of the method. Second, the nonlocal nature of the pseudopotential can be easily addressed in grid methods, but until recently the formalism for this task has not been available.

Real-space approaches overcome many of the complications involved with an explicit basis, especially for describing nonperiodic systems such as molecules, clusters and quantum dots. Unlike localized orbitals such as Gaussians, the basis is unbiased. One need not specify whether the basis contains particular angular momentum components. Moreover, the basis is not “attached” to the atomic positions and no Pulay forces need to be considered [11]. Pulay forces arise from an incomplete basis. As atoms are moved, the basis needs to be recomputed as the convergence changes with the atomic configuration. Unlike an extended basis such as one based on plane waves, the vacuum is easily described by grid points. In contrast to plane waves, grids are efficient and easy to implement on parallel platforms. Real space algorithms avoid the use of fast Fourier transforms by performing all calculations in physical space instead of Fourier space. A benefit of avoiding Fourier transforms is that very few global communications are required.

Different numerical methods, such as finite element or finite difference methods, can be used to implement real space methods. Both approaches have advantages and liabilities.
Finite element methods can easily accommodate nonuniform grids and can reflect the variational principle as the mesh is refined [1]. This is an appropriate approach for systems in which complex boundary conditions exist. For systems where the boundary conditions are simple, e.g., outside a domain the wave function is set to zero, this is not an important consideration. Finite differencing methods are easier to implement compared to finite element methods, especially with uniform grids. Both approaches have been extensively utilized; however, owing to the ease of implementation, finite differencing methods have been applied to a wider range of materials and properties. For this reason, we will illustrate the finite differencing method. A key aspect to the success of the finite difference method is the availability of higher order finite difference expansions for the kinetic energy operator, i.e., expansions of the Laplacian [12]. Higher order finite difference methods significantly improve convergence of the eigenvalue problem when compared with standard finite difference methods. If one imposes a simple, uniform grid
on our system where the points are described in a finite domain by $(x_i, y_j, z_k)$, one may approximate the Laplacian operator at $(x_i, y_j, z_k)$ by
\[
\frac{\partial^2 \psi}{\partial x^2} = \sum_{n=-M}^{M} C_n \psi(x_i + nh, y_j, z_k) + O(h^{2M+2}), \qquad (8)
\]
where $h$ is the grid spacing and $M$ is a positive integer. This approximation is accurate to $O(h^{2M+2})$ under the assumption that $\psi$ can be approximated accurately by a power series in $h$. Algorithms are available to compute the coefficients $C_n$ for arbitrary order in $h$ [12]. With the kinetic energy operator expanded as in Eq. (8), one can set up the Kohn–Sham equation over a grid. For simplicity, let us assume a uniform grid, but this is not a necessary requirement. $\psi(x_i, y_j, z_k)$ is computed on the grid by solving the eigenvalue problem:
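\[
\begin{aligned}
-\frac{\hbar^2}{2m} \Bigg[ & \sum_{n_1=-M}^{M} C_{n_1} \psi_n(x_i + n_1 h, y_j, z_k) + \sum_{n_2=-M}^{M} C_{n_2} \psi_n(x_i, y_j + n_2 h, z_k) \\
& + \sum_{n_3=-M}^{M} C_{n_3} \psi_n(x_i, y_j, z_k + n_3 h) \Bigg] + \big[ V_{ion}(x_i, y_j, z_k) + V_H(x_i, y_j, z_k) \\
& + V_{xc}(x_i, y_j, z_k) \big] \psi_n(x_i, y_j, z_k) = E_n \psi_n(x_i, y_j, z_k) \qquad (9)
\end{aligned}
\]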
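The expansion coefficients in Eq. (8) can be generated in several ways. The following is a minimal sketch, not the algorithm of Ref. [12]: it solves the Taylor-series moment conditions exactly with rational arithmetic. The function names are hypothetical, and the sketch assumes the $C_n$ are dimensionless, so that the sum in Eq. (8) must be divided by $h^2$.

```python
import math
from fractions import Fraction


def second_derivative_coefficients(M):
    """Return [C_-M, ..., C_M] such that
    sum_n C_n * f(x + n*h) = h**2 * f''(x) + O(h**(2M+2)).

    The coefficients solve the moment conditions
    sum_n C_n * n**k = (2 if k == 2 else 0) for k = 0 .. 2M.
    """
    offsets = list(range(-M, M + 1))
    size = len(offsets)
    # Augmented matrix [A | b] in exact rational arithmetic.
    rows = [
        [Fraction(n) ** k for n in offsets] + [Fraction(2 if k == 2 else 0)]
        for k in range(size)
    ]
    # Gauss-Jordan elimination with a simple pivot search.
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [entry / pivot_value for entry in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][size] for i in range(size)]


def second_derivative_1d(f, x, h, M):
    """Apply the stencil to a callable f at the point x (assumed helper)."""
    coefficients = second_derivative_coefficients(M)
    total = sum(float(c) * f(x + n * h) for n, c in zip(range(-M, M + 1), coefficients))
    return total / h**2


# Example: the second derivative of sin(x) is -sin(x).
approx = second_derivative_1d(math.sin, 1.0, 0.1, 4)
```

For M = 1 this reproduces the standard three-point stencil (1, −2, 1); larger M adds neighbors and raises the order of accuracy, which is the trade-off between sparsity and grid spacing that the finite difference method exploits.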
For $L$ grid points, the size of the full matrix is $L^2$. A uniformly spaced grid in a three-dimensional cube is shown in Fig. 3. Each grid point corresponds to a row in the matrix. However, many points in the cube are far from any atoms in the system and the wave function on these points may be replaced by zero. Special data structures may be used to discard these points and retain only those having a nonzero value for the wave function. The size of the Hamiltonian matrix is usually reduced by a factor of two to three with this strategy, which is quite important considering the large number of eigenvectors which must be saved. Further, since the Laplacian can be represented by a simple stencil, and since all local potentials sum up to a simple diagonal matrix, the Hamiltonian need not be stored.

Nonlocality in the pseudopotential, i.e., the “state dependence” of the potential as illustrated in Fig. 1, is easily treated using a plane wave basis in Fourier space, but it may also be calculated in real space. The nonlocality appears only in the angular dependence of the potential and not in the radial coordinate. It is often advantageous to use a more advanced projection scheme, due to Kleinman and Bylander [13]. The interactions between valence electrons and pseudo-ionic cores in the Kleinman–Bylander form may be separated into a local potential and a nonlocal pseudopotential in real space [8], which differs from zero only inside the small core region around each atom.
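The remark that the Hamiltonian need not be stored can be illustrated with a small sketch. It is a one-dimensional toy model under stated assumptions, not the three-dimensional method of the text: atomic units ($\hbar = m = 1$), a fixed fourth-order (M = 2) stencil, a harmonic test potential in place of the Kohn–Sham potential, and a hypothetical function name `apply_hamiltonian`.

```python
import math

# Fourth-order (M = 2) central-difference weights for h**2 * d2/dx2.
STENCIL = {-2: -1.0 / 12.0, -1: 4.0 / 3.0, 0: -5.0 / 2.0, 1: 4.0 / 3.0, 2: -1.0 / 12.0}


def apply_hamiltonian(psi, potential, h):
    """Return H psi for H = -(1/2) d2/dx2 + V(x) without ever storing H.

    psi and potential are lists of values on a uniform grid of spacing h.
    The wave function is taken to be zero outside the grid.
    """
    n_points = len(psi)
    result = []
    for i in range(n_points):
        laplacian = 0.0
        for offset, weight in STENCIL.items():
            j = i + offset
            if 0 <= j < n_points:
                laplacian += weight * psi[j]
        # Kinetic term from the stencil, local potential on the diagonal.
        result.append(-0.5 * laplacian / h**2 + potential[i] * psi[i])
    return result


h = 0.1
grid = [-8.0 + h * i for i in range(161)]
potential = [0.5 * x * x for x in grid]        # harmonic test potential
psi = [math.exp(-0.5 * x * x) for x in grid]   # its exact ground state
h_psi = apply_hamiltonian(psi, potential, h)
center = len(grid) // 2
local_energy = h_psi[center] / psi[center]     # should be close to 1/2
```

Iterative eigensolvers only require this matrix-vector product, so memory is dominated by the eigenvectors rather than by the operator.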
Figure 3. Uniform grid illustrating a typical configuration for examining the electronic structure of a localized system. The gray sphere represents the domain where the wave functions are allowed to be nonzero. The light spheres within the domain are atoms.
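One can write the Kleinman–Bylander form in real space as
\[
V_{ion}^p(\mathbf{r})\,\psi_n(\mathbf{r}) = \sum_a V_{loc}(|\mathbf{r}_a|)\,\psi_n(\mathbf{r}) + \sum_{a,\,n,\,lm} G^a_{n,lm}\, u_{lm}(\mathbf{r}_a)\, \Delta V_l(r_a), \qquad (10)
\]
\[
G^a_{n,lm} = \frac{1}{\langle \Delta V^a_{lm} \rangle} \int u_{lm}(\mathbf{r}_a)\, \Delta V_l(r_a)\, \psi_n(\mathbf{r})\, d^3r, \qquad (11)
\]
and $\langle \Delta V^a_{lm} \rangle$ is the normalization factor,
\[
\langle \Delta V^a_{lm} \rangle = \int u_{lm}(\mathbf{r}_a)\, \Delta V_l(r_a)\, u_{lm}(\mathbf{r}_a)\, d^3r, \qquad (12)
\]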
where $\mathbf{r}_a = \mathbf{r} - \mathbf{R}_a$, and the $u_{lm}$ are the atomic pseudopotential wave functions of angular momentum quantum numbers $(l, m)$ from which the l-dependent ionic pseudopotential, $V_l(r)$, is generated. $\Delta V_l(r) = V_l(r) - V_{loc}(r)$ is the difference between the l component of the ionic pseudopotential and the local ionic potential. As a specific example, in the case of Na, we might choose the local part of the potential to replicate only the l = 0 component as defined by the 3s state. The nonlocal parts of the potential would then contain only the l = 1 and l = 2 components. The choice of which angular component is used for the local part of the potential is somewhat arbitrary. It is often convenient to choose the local potential to correspond to the highest l-component of interest. This
reduces the computational effort associated with higher l-components [3]. The choice of the local potential can be tested by utilizing different components for the local potential.

There are several difficulties with the eigenvalue problems generated in this application in addition to the size of the matrices. First, the number of required eigenvectors is proportional to the number of atoms in the system, and can grow into the thousands. Besides storage, maintaining the orthogonality of these vectors can be a formidable task. Second, the relative separation of the eigenvalues becomes increasingly poor as the matrix size increases, and this has an adverse effect on the rate of convergence of the eigenvalue solvers. Preconditioning techniques attempt to alleviate this problem. A brief review of these approaches can be found in Ref. [3].

The architecture of the Hamiltonian matrix is illustrated in Fig. 4 for a diatomic molecule. Although the details of the matrix structure will be a function of the geometry of the system, the essential elements remain the same. The off-diagonal elements arise from the expansion coefficients in Eq. (8) and the nonlocal potential in Eq. (10). These elements are not updated during the self-consistency cycle. The on-diagonal matrix elements consist of the local ion core pseudopotential, the Hartree potential and the exchange-correlation potential. These terms are updated in each self-consistency cycle.
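The nonlocal contribution to the off-diagonal elements can be made more explicit by reading Eqs. (10)–(12) in operator form. The Dirac notation and the symbol $\chi^a_{lm}$ below are introduced here only for illustration and do not appear in the original equations:

\[
\hat{V}_{ion}^{p} = \sum_a V_{loc}(|\mathbf{r}_a|) + \sum_{a,\,lm} \frac{|\chi^a_{lm}\rangle \langle \chi^a_{lm}|}{\langle \Delta V^a_{lm} \rangle},
\qquad
\chi^a_{lm}(\mathbf{r}) = u_{lm}(\mathbf{r}_a)\, \Delta V_l(r_a).
\]

Acting on a wave function, each term yields $\langle \chi^a_{lm} | \psi_n \rangle / \langle \Delta V^a_{lm} \rangle$, which is the coefficient defined in Eq. (11), multiplied by $u_{lm}(\mathbf{r}_a)\,\Delta V_l(r_a)$. Because $\Delta V_l$ vanishes outside the core, each projector couples only the grid points within the core region of atom $a$. This would explain why the nonlocal matrix elements form small dense blocks while the rest of the matrix stays sparse.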
Figure 4. Hamiltonian matrix for a diatomic molecule in real space. Nonzero matrix elements are indicated by black dots. The diagonal matrix elements consist of the local ionic pseudopotential, Hartree potential and local density exchange-correlation potential. The off-diagonal matrix elements consist of the coefficients in the finite difference expansion and the nonlocal matrix elements of the pseudopotential. The system contains about 4000 grid points or 16 million matrix elements.
Figure 5. Potentials and wave functions for the oxygen dimer molecule. The total electronic potential is shown on the left along a ray connecting the two oxygen atoms. The Kohn–Sham molecular orbitals are shown on the right side of the figure. The orbitals on the left are from a real space calculation and the ones on the right from a plane wave calculation.
While the Hamiltonian matrix in real space can be large, it never needs to be explicitly saved. Also, the matrix is sparse; the sparsity is a function of $M$ (see Eq. 8), which is the order of the higher order difference expansion. For larger values of $M$, the grid can be made coarser. However, this reduces the sparsity of the matrix. Conversely, if we use standard finite difference methods, the matrix is sparser, but the grid must be finer to retain the same accuracy. In practice, a value of $M$ = 4–6 appears to work very well.

There is a close relationship between the plane wave method and real-space methods. For example, one can always Fourier transform a real-space result to obtain it in reciprocal space, or perform the operation in reverse to go from Fourier space to real space. In this sense, higher order finite differences can be considered an abridged Fourier transform, as one does not sum over all grid points in the mesh. As a rough measure of the convergence of real space methods, one can consider a Fourier component or plane wave cut-off of $(\pi/h)^2$ Ry for a grid spacing $h$ in atomic units. Using this criterion, a grid spacing of $h$ = 0.5 a.u. (1 a.u. = 0.529 Å, or one bohr unit of length) would correspond to a plane wave cut-off of approximately 40 Ry. In Fig. 5, a comparison is made between the plane-wave supercell method and a real-space method for the oxygen dimer. The oxygen dimer is a difficult
molecular species to treat using pseudopotentials, as the potential is rather deep and quite nonlocal compared to that of second row elements such as silicon. The total local electronic potential is depicted along a ray containing the oxygen atoms [14]. Also shown are the Kohn–Sham one-electron orbitals. The agreement between the two methods is quite good; the differences are certainly smaller than the uncertainties involved in the local density approximation. The most noticeable difference in the potential occurs at the nuclear positions. At these points, the atomic pseudopotentials are quite strong and the variation in the wave function requires a fine mesh. However, it is important to note that this spatial regime is removed from the bonding region of the molecule. A survey of cluster and molecular species using both plane wave and real space methods confirms that the accuracy of the two methods is comparable, but the real space method is easier to implement [14].
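The rough correspondence between grid spacing and plane wave cut-off quoted above, and the dependence of sparsity on the stencil order, can be put in a few lines. This is only a sketch of the stated rule of thumb; the function names are hypothetical, and the count of nonzeros per row refers to the finite difference Laplacian alone, excluding any nonlocal pseudopotential blocks.

```python
import math


def plane_wave_cutoff_ry(h):
    """Rough equivalent plane wave cut-off (Ry) for a grid spacing h (a.u.)."""
    return (math.pi / h) ** 2


def grid_spacing_au(cutoff_ry):
    """Inverse relation: grid spacing (a.u.) matching a cut-off in Ry."""
    return math.pi / math.sqrt(cutoff_ry)


def laplacian_nonzeros_per_row(M):
    """3D stencil of order M: 2M neighbors along each of 3 axes plus the center."""
    return 6 * M + 1


cutoff = plane_wave_cutoff_ry(0.5)          # about 39.5 Ry, i.e. roughly 40 Ry
row_fill = laplacian_nonzeros_per_row(6)    # 37 nonzeros per row for M = 6
```

Even at M = 6 each row of the Laplacian holds only a few dozen nonzero entries, compared with the roughly 4000 entries per row of the full matrix in Fig. 4.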
3. Outlook
The focus of the electronic structure problem will likely not reside in solving for the energy bands of ordered solids. The energy band structure of crystalline matter, especially elemental solids, has largely been exhausted. This is not to say that elemental solids are no longer of interest. Certainly, interest in these materials will continue as testing grounds for new electronic structure methods. However, interest in nonperiodic systems such as amorphous solids, liquids, glasses, clusters, and nanoscale quantum dots is now a major focus of the electronic structure problem. Perhaps this is the greatest challenge for electronic structure methods, i.e., systems with many electronic and nuclear degrees of freedom and little or no symmetry. Often the structure of these materials is unknown and the materials properties may be a strong function of temperature. Real-space methods offer a new avenue for these large and complex systems. As an illustration of the potential of these methods, consider the example of quantum dots. In Fig. 6, we illustrate hydrogenated Ge clusters. These clusters are composed of bulk fragments of Ge whose dangling bonds are capped with hydrogen. The hydrogen passivates any electronically active dangling bonds. The larger clusters correspond to quantum dots, i.e., semiconductor fragments whose surface properties have been removed, but whose optical properties are dramatically altered by quantum confinement. It is well known that these systems have optical gaps much larger than that of the bulk crystal. The optical spectra of such clusters are shown in Fig. 7. The largest cluster illustrated contains over 800 atoms, although even larger clusters have been examined. A cluster of this size would be difficult to examine with traditional methods. Although these calculations were done with a ground state method, the general shape of the spectra is correct, and with increasing size the
Figure 6. Hydrogenated germanium clusters ranging from germane (GeH4) to Ge147H100.
[Figure 7: photoabsorption (arb. units) versus transition energy (eV) for Ge35H36, Ge87H76, Ge147H100, Ge191H148, Ge239H196, Ge293H172, Ge357H204, and Ge525H276, with features marked E0, E1, and E2.]
Figure 7. Photoabsorption spectra for hydrogenated germanium quantum dots. The labels E 0 , E 1 and E 2 refer to optical features.
spectra appear bulk-like by a few hundred atoms. Surfaces, clusters, magnetic systems, and complex solids have also been treated with real-space methods [1, 15]. Finally, as systems approach the macroscopic limit, it is common to employ finite element or finite difference methods to describe material properties. One would like to couple these methods to those appropriate at the quantum (or nano) limit. The use of real space methods at these opposite limits would be a natural choice. Some attempts along these lines exist. For example, fracture simulations often divide up a problem by treating the fracture tip with quantum mechanical methods, the surrounding area by molecular dynamics and the medium away from the tip by continuum mechanics [16].
References
[1] T.L. Beck, “Real-space mesh techniques in density functional theory,” Rev. Mod. Phys., 72, 1041, 2000.
[2] J.R. Chelikowsky, “The pseudopotential-density functional method applied to nanostructures,” J. Phys. D: Appl. Phys., 33, R33, 2000.
[3] C.L. Bris (ed.), Handbook of Numerical Analysis (Devoted to Computational Chemistry), Volume X, Elsevier, Amsterdam, 2003.
[4] S. Lundqvist and N.H. March (eds.), Theory of the Inhomogeneous Electron Gas, Plenum, New York, 1983.
[5] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev., 140, A1133, 1965.
[6] W. Pickett, “Pseudopotential methods in condensed matter applications,” Comput. Phys. Rep., 9, 115, 1989.
[7] J.R. Chelikowsky and M.L. Cohen, “Ab initio pseudopotentials for semiconductors,” In: T.S. Moss and P.T. Landsberg (eds.), Handbook of Semiconductors, 2nd edn., Elsevier, Amsterdam, 1992.
[8] N. Troullier and J.L. Martins, “Efficient pseudopotentials for plane-wave calculations,” Phys. Rev. B, 43, 1993, 1991.
[9] J.R. Chelikowsky and S.G. Louie, “First principles linear combination of atomic orbitals method for the cohesive and structural properties of solids: application to diamond,” Phys. Rev. B, 29, 3470, 1984.
[10] J.R. Chelikowsky and S.G. Louie (eds.), Quantum Theory of Materials, Kluwer, Dordrecht, 1996.
[11] P. Pulay, “Ab initio calculation of force constants and equilibrium geometries,” Mol. Phys., 17, 197, 1969.
[12] B. Fornberg and D.M. Sloan, “A review of pseudospectral methods for solving partial differential equations,” Acta Numerica, 94, 203, 1994.
[13] L. Kleinman and D.M. Bylander, “Efficacious form for model pseudopotential,” Phys. Rev. Lett., 48, 1425, 1982.
[14] J.R. Chelikowsky, N. Troullier, and Y. Saad, “The finite-difference-pseudopotential method: electronic structure calculations without a basis,” Phys. Rev. Lett., 72, 1240, 1994.
[15] J. Bernholc, “Computational materials science: the era of applied quantum mechanics,” Phys. Today, 52, 30, 1999.
[16] A. Nakano, M.E. Bachlechner, R.K. Kalia, E. Lidorkis, P. Vashishta, G.Z. Voyladjis, T.J. Campbell, S. Ogata, and F. Shimojo, “Multiscale simulation of nanosystems,” Comput. Sci. Eng., 3, 56, 2001.
1.8 AN INTRODUCTION TO ORBITAL-FREE DENSITY FUNCTIONAL THEORY
Vincent L. Lignères (1) and Emily A. Carter (2)
(1) Department of Chemistry, Princeton University, Princeton, NJ 08544, USA
(2) Department of Mechanical and Aerospace Engineering and Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
S. Yip (ed.), Handbook of Materials Modeling, 137–148. © 2005 Springer. Printed in the Netherlands.

Given a quantum mechanical system of N electrons and an external potential (which typically consists of the potential due to a collection of nuclei), the traditional approach to determining its ground-state energy involves the optimization of the corresponding wavefunction, a function of 3N spatial variables (not counting spin). As the number of particles increases, the computation quickly becomes prohibitively expensive. Nevertheless, electrons are indistinguishable, so one could intuitively expect that the electron density (N times the probability of finding any electron in a given region of space) might be enough to obtain all properties of interest about the system. Using the electron density as the sole variable would reduce the dimensionality of the problem from 3N to 3, thus drastically simplifying quantum mechanical calculations. This is in fact possible, and it is the goal of orbital-free density functional theory (OF-DFT). For a system of N electrons in an external potential $V_{ext}$, the total energy $E$ can be expressed as a functional of the density $\rho$ [1], taking the following form:
\[
E[\rho] = F[\rho] + \int_\Omega V_{ext}(\mathbf{r})\, \rho(\mathbf{r})\, d\mathbf{r} \qquad (1)
\]
Here, $\Omega$ denotes the system volume considered, while $F$ is the universal functional that contains all the information about how the electrons behave and interact with one another. The actual form of $F$ is currently unknown and one has to resort to approximations in order to evaluate it. Traditionally, it is split into kinetic and potential energy contributions, the exact forms of which are also unknown. Kohn and Sham first proposed replacing the exact kinetic energy of an interacting electron system with the kinetic energy of an approximate, noninteracting, single-determinantal wavefunction that gives rise to the same density [2]. This approach is general and remarkably accurate but involves the introduction of one-electron orbitals.
\[
E[\rho] = T_{KS}[\phi_1, \ldots, \phi_N] + \int_\Omega V_{ext}(\mathbf{r})\, \rho(\mathbf{r})\, d\mathbf{r} + J[\rho] + E_{xc}[\rho] \qquad (2)
\]
$T_{KS}$ denotes the Kohn–Sham (KS) kinetic energy for a system of N noninteracting electrons (i.e., for the case of noninteracting electrons, a single-determinantal wavefunction is the exact solution), the $\phi_i$ are the corresponding one-electron orbitals, $J$ is the classical electron–electron repulsion, and $E_{xc}$ is a correction term that should account for electron exchange, electron correlation, and the difference in kinetic energy between the interacting and noninteracting systems. If the $\phi_i$ are orthonormal, $T_{KS}$ has the following explicit form:
\[
T_{KS} = -\frac{1}{2} \sum_{i=1}^{N} \int_\Omega \phi_i^*(\mathbf{r})\, \nabla^2 \phi_i(\mathbf{r})\, d\mathbf{r} \qquad (3)
\]
Unfortunately, the required orthogonalization of these orbitals makes the computational time scale cubically in the number of electrons. Although linear-scaling KS algorithms exist, they require some degree of localization in the orbitals and, for this reason, are not applicable to metallic systems [3]. For condensed matter systems, the KS method has another bottleneck: the need to sample the Brillouin zone for the wavefunction (also called “k-point sampling”) can add several orders of magnitude in cost to the computation. Thus, a further advantage of OF-DFT is that, without a wavefunction, this very expensive computational prefactor of the number of k-points is completely absent from the calculation. At this point, many general, efficient and often accurate functionals are available to handle every term in Eq. (2) as functionals of the electron density alone, except for the kinetic energy. The development of a generally applicable, accurate, linear-scaling kinetic energy density functional (KEDF) would remove the last bottleneck in DFT computations and enable researchers to study much larger systems than are currently accessible. In the following, we will focus our discussion on such functionals.
1. General Overview
Historically, the first attempt at approximating the kinetic energy assumes a uniform, noninteracting electron gas [4, 5] and is known as the Thomas–Fermi (TF) model for a slowly varying electron gas.
\[
T_{TF} = \frac{3}{10} (3\pi^2)^{2/3} \int_\Omega \rho(\mathbf{r})^{5/3}\, d\mathbf{r} \qquad (4)
\]
The model, although crude, constitutes a reasonable first approximation to the kinetic energy of periodic systems. It fails for atoms and molecules, however, as it predicts no shell structure, no interatomic bonding, and the wrong behavior for $\rho$ at the $r = 0$ and $r = +\infty$ limits. We will discuss some ways to improve this model later. A deeper look at Eq. (3) reveals another approach to describing the kinetic energy as a functional of the density. Within the Hartree–Fock (HF) approximation [6], we have
\[
\rho(\mathbf{r}) = \sum_{i=1}^{N} \phi_i^*(\mathbf{r})\, \phi_i(\mathbf{r}) \qquad (5a)
\]
\[
\rho(\mathbf{r}) = \sum_{i=1}^{N} \rho_i(\mathbf{r}) \qquad (5b)
\]
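so that, applying the Laplacian to Eq. (5) and using the hermiticity of the gradient operator, we obtain
\[
\nabla^2 \rho(\mathbf{r}) = 2 \sum_{i=1}^{N} \left[ \phi_i^*(\mathbf{r})\, \nabla^2 \phi_i(\mathbf{r}) + \nabla \phi_i^*(\mathbf{r}) \cdot \nabla \phi_i(\mathbf{r}) \right] \qquad (6)
\]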
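Rearranging Eq. (6), integrating over $\Omega$, and substituting the result into Eq. (3) yields
\[
T_{KS} = -\frac{1}{4} \int_\Omega \nabla^2 \rho(\mathbf{r})\, d\mathbf{r} + \frac{1}{2} \sum_{i=1}^{N} \int_\Omega \nabla \phi_i^*(\mathbf{r}) \cdot \nabla \phi_i(\mathbf{r})\, d\mathbf{r} \qquad (7)
\]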
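Multiplying and dividing every term of the sum by $\rho_i$ naturally introduces $\nabla \rho_i$
\[
T_{KS} = -\frac{1}{4} \int_\Omega \nabla^2 \rho(\mathbf{r})\, d\mathbf{r} + \frac{1}{8} \sum_{i=1}^{N} \int_\Omega \frac{|\nabla \rho_i(\mathbf{r})|^2}{\rho_i(\mathbf{r})}\, d\mathbf{r} \qquad (8)
\]
but does not provide a form for which the sum can be evaluated simply. Nevertheless, the first term can be rewritten as the integral of the gradient of the density around the edge of space.
\[
\int_\Omega \nabla^2 \rho(\mathbf{r})\, d\mathbf{r} = \oint_{\partial\Omega} \nabla \rho(\mathbf{r}) \cdot d\mathbf{S} \qquad (9)
\]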
For a finite system, the gradient of the density vanishes at large distances and for a periodic system the gradients on opposite sides of a periodic cell cancel each other out, so that this integral evaluates to zero in both cases. Finally, for a one-orbital system, we obtain the following exact expression for the kinetic energy [7].
\[
T_{VW} = \frac{1}{8} \int_\Omega \frac{|\nabla \rho(\mathbf{r})|^2}{\rho(\mathbf{r})}\, d\mathbf{r} \qquad (10)
\]
Although only exact for up to two electrons, the von Weizsäcker (VW) functional is an essential component of the true kinetic energy and provides a good first approximation in the case of quickly varying densities such as those of atoms and molecules. Unfortunately, the total energy corresponding to the ground-state electron density has the same magnitude as the exact kinetic energy. Consequently, errors made in approximating the kinetic energy have a dramatic impact on the total energy and, by extension, on the ground state electron density computed by minimization. Unlike the exchange-correlation energy functionals, which represent a much smaller component of the total energy, kinetic-energy functionals must be highly accurate in order to achieve consistently accurate energy predictions.
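The contrast between the TF and VW functionals can be made concrete for the simplest one-orbital system. The sketch below evaluates Eqs. (4) and (10) for the hydrogen 1s density, $\rho(r) = e^{-2r}/\pi$ in atomic units, whose exact kinetic energy is 1/2 hartree. The choice of test density, the radial trapezoid quadrature, and all function names are assumptions made for illustration.

```python
import math


def density(r):
    """Hydrogen 1s density in atomic units: rho(r) = exp(-2 r) / pi."""
    return math.exp(-2.0 * r) / math.pi


def radial_integral(f, r_max=30.0, n=30000):
    """Trapezoid estimate of the 3D integral of a spherical function f(r)."""
    dr = r_max / n
    total = 0.0
    for k in range(n + 1):
        r = k * dr
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * 4.0 * math.pi * r * r * f(r)
    return total * dr


def tf_density(r):
    """Integrand of Eq. (4): (3/10) (3 pi^2)^(2/3) rho^(5/3)."""
    c_tf = 0.3 * (3.0 * math.pi**2) ** (2.0 / 3.0)
    return c_tf * density(r) ** (5.0 / 3.0)


def vw_density(r, delta=1.0e-5):
    """Integrand of Eq. (10), with the radial gradient by central difference."""
    grad = (density(r + delta) - density(r - delta)) / (2.0 * delta)
    return grad * grad / (8.0 * density(r))


n_electrons = radial_integral(density)   # normalization check, close to 1
t_tf = radial_integral(tf_density)       # roughly 0.289 hartree
t_vw = radial_integral(vw_density)       # close to the exact 0.5 hartree
```

For this density the VW functional recovers the exact kinetic energy, while the TF functional underestimates it by roughly 40%, an error far larger than what a total-energy calculation can tolerate.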
2. KEDFs for Finite Systems
In the case of a finite system such as a single atom, a few molecules in the gas phase, or a cluster, the electron density varies extremely rapidly near the nuclei, making the TF functional inadequate. Although many corrections have been suggested to improve upon the TF results for atoms, these modifications only yield acceptable results when densities obtained from a different method are used, usually HF. Left to determine their own densities self-consistently, these corrections still predict no shell structure for atoms. Nevertheless, the TF functional, or some fraction of it, may still be useful as a corrective term, as we will see later. Going back to the KS expression from Eq. (8), we introduce
\[
n_i(\mathbf{r}) = \frac{\rho_i(\mathbf{r})}{\rho(\mathbf{r})} \qquad (11)
\]
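which, when multiplying both sides by $\rho(\mathbf{r})$ and taking the gradient, yields
\[
\nabla \rho_i(\mathbf{r}) = n_i(\mathbf{r})\, \nabla \rho(\mathbf{r}) + \rho(\mathbf{r})\, \nabla n_i(\mathbf{r}) \qquad (12)
\]
Substituting Eq. (12) into Eq. (8) gives the following expression:
\[
T_{KS} = \frac{1}{8} \int_\Omega \sum_{i=1}^{N} \frac{\left( n_i(\mathbf{r})\, \nabla \rho(\mathbf{r}) + \rho(\mathbf{r})\, \nabla n_i(\mathbf{r}) \right)^2}{n_i(\mathbf{r})\, \rho(\mathbf{r})}\, d\mathbf{r} \qquad (13)
\]
The product is expanded into three sums and reorganized as
\[
T_{KS} = \frac{1}{8} \int_\Omega \left[ \frac{|\nabla \rho(\mathbf{r})|^2}{\rho(\mathbf{r})} \sum_{i=1}^{N} n_i(\mathbf{r}) + 2 \nabla \rho(\mathbf{r}) \cdot \sum_{i=1}^{N} \nabla n_i(\mathbf{r}) + \rho(\mathbf{r}) \sum_{i=1}^{N} \frac{|\nabla n_i(\mathbf{r})|^2}{n_i(\mathbf{r})} \right] d\mathbf{r} \qquad (14)
\]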
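From Eq. (11), it follows immediately that
\[
\sum_{i=1}^{N} n_i(\mathbf{r}) = 1 \qquad (15)
\]
and so, making use of the linearity of the gradient operator in the second term of Eq. (14)
\[
\sum_{i=1}^{N} \nabla n_i(\mathbf{r}) = \nabla \sum_{i=1}^{N} n_i(\mathbf{r}) = \nabla (1) = 0 \qquad (16)
\]
the expression further simplifies to
\[
T_{KS} = \int_\Omega \frac{|\nabla \rho(\mathbf{r})|^2}{8 \rho(\mathbf{r})}\, d\mathbf{r} + \int_\Omega \rho(\mathbf{r}) \sum_{i=1}^{N} \frac{|\nabla n_i(\mathbf{r})|^2}{8 n_i(\mathbf{r})}\, d\mathbf{r} \qquad (17)
\]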
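The decomposition in Eq. (17) can be checked numerically on a toy model. The sketch below uses a one-dimensional analogue, which is an assumption made for brevity and not part of the original derivation: two singly occupied particle-in-a-box orbitals on [0, 1] in atomic units, for which the noninteracting kinetic energy is known analytically. All names are hypothetical.

```python
import math


def orbital_density(n, x):
    """Density of the n-th box orbital, phi_n = sqrt(2) sin(n pi x), one electron."""
    return 2.0 * math.sin(n * math.pi * x) ** 2


def orbital_density_slope(n, x):
    """Derivative of the orbital density with respect to x."""
    return 2.0 * n * math.pi * math.sin(2.0 * n * math.pi * x)


def kinetic_terms(levels=(1, 2), cells=2000):
    """Midpoint-rule estimates of the two integrals in the 1D analogue of Eq. (17)."""
    dx = 1.0 / cells
    t_vw = 0.0
    t_rest = 0.0
    for k in range(cells):
        x = (k + 0.5) * dx  # midpoints avoid the nodes at x = 0, 1/2 and 1
        rho_i = [orbital_density(n, x) for n in levels]
        slope_i = [orbital_density_slope(n, x) for n in levels]
        rho = sum(rho_i)
        slope = sum(slope_i)
        t_vw += slope * slope / (8.0 * rho) * dx
        for r_i, s_i in zip(rho_i, slope_i):
            n_i = r_i / rho                              # Eq. (11)
            dn_i = (s_i * rho - r_i * slope) / rho**2    # gradient of n_i
            t_rest += rho * dn_i * dn_i / (8.0 * n_i) * dx
    return t_vw, t_rest


t_vw, t_rest = kinetic_terms()
# Exact noninteracting kinetic energy: sum over occupied levels of n^2 pi^2 / 2.
t_ks = 0.5 * math.pi**2 * (1**2 + 2**2)
```

In this example the two integrals sum to the exact value $5\pi^2/2$, and the VW term alone falls short of it by the strictly positive second term.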
As every quantity in the second integral is positive, we can conclude that the VW functional (the first term in Eq. 17) constitutes a lower bound on the noninteracting kinetic energy. This makes physical sense, as we know that the VW kinetic energy is exact for any one-orbital system (one or two electrons, or any number of bosons). Any other orbital introduced will have to be orthogonal to the first. This introduces nodes in the wavefunction, which raises the kinetic energy of the entire system. Therefore, further improvements upon the VW model involve adding an extra term to take into account the larger kinetic energy in the regions of space in which more than one orbital is significant. Far away from the molecule, only one orbital tends to dominate the picture and the VW functional is accurate enough to account for the relatively small contribution of these regions to the total kinetic energy. Most of the deviation from the exact, noninteracting kinetic energy is located close to the nuclei, in the core region of atoms. Corrections based on adding some fraction of the TF functional to the VW functional have been proposed (see, for instance, Ref. [8]), but only when nonlocal functionals (those depending on more than one point in space, e.g., $\mathbf{r}$ and $\mathbf{r}'$) are introduced is a convincing shell structure observed for atomic densities [9].

Even without such correction terms, the TF and VW functionals may still be enough to obtain an accurate description of the system in some limited cases. For instance, Wesolowski and Warshel used a simple, orbital-free KEDF to describe water molecules as a solvent for a quantum-chemically treated water molecule solute [10]. They were able to reproduce the solvation free energy of water accurately using this method. Although this result is encouraging, the ultimate goal of OF-DFT is to determine a KEDF that would be accurate even without the backup provided by the traditional quantum-mechanical method. One key to judging the quality of a given functional is to express it in terms of its kinetic-energy density.
\[
T[\rho] = \int_\Omega t(\rho(\mathbf{r}))\, d\mathbf{r} \qquad (18)
\]
The KS functional as it is expressed in Eq. (3) uniquely defines its kinetic-energy density. Certainly, if a given functional can reproduce the KS kinetic-energy density faithfully it must reproduce the total energy also. Any functional that differs from that one by a function that integrates to 0 over the entire system (for instance, the Laplacian of the density) will match the KS energy just as well but not the KS kinetic-energy density. For the VW functional, for instance, the corresponding kinetic-energy density should include a Laplacian contribution:
\[
T_{VW} = \int_\Omega t_{VW}(\rho)\, d\mathbf{r} \qquad (19)
\]
\[
t_{VW}(\rho) = -\frac{1}{4} \nabla^2 \rho(\mathbf{r}) + \frac{|\nabla \rho(\mathbf{r})|^2}{8 \rho(\mathbf{r})} \qquad (20)
\]
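That the Laplacian term in Eq. (20) changes the kinetic-energy density without changing the integrated energy can be seen in a short worked example. The hydrogen 1s density in atomic units, $\rho(r) = e^{-2r}/\pi$, is chosen here purely for illustration. Using the radial form of the Laplacian,

\[
\nabla^2 \rho = \rho'' + \frac{2}{r} \rho' = \left( 4 - \frac{4}{r} \right) \frac{e^{-2r}}{\pi},
\]
\[
\int \nabla^2 \rho \, d\mathbf{r} = 4\pi \int_0^\infty r^2 \left( 4 - \frac{4}{r} \right) \frac{e^{-2r}}{\pi}\, dr
= 16 \left[ \int_0^\infty r^2 e^{-2r}\, dr - \int_0^\infty r\, e^{-2r}\, dr \right]
= 16 \left[ \frac{1}{4} - \frac{1}{4} \right] = 0.
\]

Locally, however, $-\frac{1}{4}\nabla^2\rho$ is large and positive near the nucleus and negative for $r > 1$. Two functionals can therefore agree on the total kinetic energy while disagreeing on where in space that energy resides.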
OF-DFT has experienced its most encouraging successes for periodic systems using a different class of kinetic energy functionals described below. These achievements led to attempts to use this alternative class of functionals for nonperiodic systems as well. Choly and Kaxiras recently proposed a method to approximate such functionals and adapt them for nonperiodic systems [11]. If successful, their method may further enlarge the range of applications where currently available functionals yield physically reasonable results.
3. KEDFs for Periodic Systems
If the system exhibits translational invariance, or can be approximated using a system that does, it becomes advantageous to introduce periodic boundary conditions and thus reduce the size of the infinite system to a small number of atoms in a finite volume. A plane-wave basis set expansion most naturally describes the electron density under these conditions. As an additional advantage, quantities can be computed either in real or reciprocal space, by performing fast Fourier transforms (FFTs) on the density represented on a uniform grid. The number of functions necessary to describe the electron density in a given system is highly dependent upon the rate of fluctuation of said density. Quickly varying densities need more plane waves in real space, which translate into larger reciprocal-space grids and, consequently, into finer real-space meshes. Unfortunately, in real systems, electrons tend to stay mostly
around atomic nuclei and only occasionally venture into the interatomic regions of space. This makes the total electron density vary extremely rapidly close to the nuclei, in the core region of space. Consequently, an extremely large number of plane waves would be necessary to describe the total electron density. One can get around this problem by realizing that the core region density is often practically invariant upon physical and chemical change. This observation is similar to the realization that only valence shell electrons are involved in chemical bonding. The valence electron density varies much less rapidly than the total density, so that if the core electrons could be removed, one could drastically reduce the total number of plane waves required in the basis set. Of course, the influence of the core electrons on the geometry and energy of the system must still be accounted for. This is done by introducing pseudopotentials that mimic the presence of core electrons and the nuclei. Obviously, if one is interested in any properties that require an accurate description of the electron density near the nuclei of a system, such pseudopotential-based methods will be inappropriate. Each chemical element present in the system must be represented by its own unique pseudopotential, which is typically constructed as follows. First, an all-electron calculation on an atom is performed to obtain the valence eigenvalues and wavefunctions that one seeks to reproduce within a pseudopotential calculation. Then, the oscillations of the valence wavefunction in the core region are smoothed out to create a “pseudowavefunction,” which is then used to invert the KS equations for the atom to obtain the pseudopotential that corresponds to the pseudowavefunction, subject to the constraint that the all-electron eigenvalues are reproduced.
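To make the cost argument about plane waves concrete, the sketch below (an illustration added here, not part of the original discussion) counts how many Fourier components of a periodic one-dimensional Gaussian density exceed a relative threshold. The box length, grid size, widths, and tolerance are arbitrary assumptions; a narrow, core-like peak needs several times more plane waves than a smooth, valence-like one.

```python
import numpy as np

def n_plane_waves(width, box=20.0, n_grid=1024, tol=1e-8):
    """Count Fourier components of a periodic Gaussian 'density' whose
    magnitude exceeds tol times the largest component."""
    x = np.linspace(0.0, box, n_grid, endpoint=False)   # uniform real-space grid
    rho = np.exp(-((x - box / 2.0) / width) ** 2)        # peak centered in the box
    coeff = np.abs(np.fft.rfft(rho))                     # plane-wave amplitudes
    return int(np.count_nonzero(coeff > tol * coeff.max()))

n_smooth = n_plane_waves(width=1.0)   # slowly varying, valence-like
n_sharp = n_plane_waves(width=0.2)    # rapidly varying, core-like
print(n_smooth, n_sharp)
```

Because the Fourier transform of a Gaussian of width w decays on a scale proportional to 1/w, the count grows roughly in inverse proportion to the width, which is the one-dimensional analogue of the basis-set growth described above.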
Typically, this is done for each angular momentum channel, so that one obtains a pseudopotential that has an angular dependence, usually expressed as projection operators involving the atomic pseudowavefunctions. Such a pseudopotential is referred to as “nonlocal,” because it is not simply a function of the distance from the nucleus, but also depends on the angular nature of the wavefunction it acts upon. In other words, when a nonlocal pseudopotential acts on a wavefunction, s-symmetry orbitals will be subject to a different potential than p-symmetry orbitals, etc. (as in the exact solution to the Schroedinger equation for a one-electron atom or ion). This affords a nonlocal pseudopotential enough flexibility so that it is quite accurate and transferable to a diverse set of environments. The above discussion presents a second significant challenge for OF-DFT beyond kinetic energy density functionals, since nonlocal pseudopotentials cannot be employed in OF-DFT, because no wavefunction exists to be acted upon by the orbital-based projection operators intrinsic to nonlocal pseudopotentials. In the case of an orbital-free description of the density, the pseudopotentials must be local (depending only on one point in space) and spherically symmetrical around the atomic nucleus. Thus, in OF-DFT, the challenge is to
construct accurate and transferable local pseudopotentials for each element. An attempt in this direction specifically for OF-DFT was made by Madden and coworkers, where the OF-DFT equation
\[
\frac{\delta T_{\mathrm{KS}}}{\delta\rho} + V_{\mathrm{ext}} + \frac{\delta J}{\delta\rho} + \frac{\delta E_{\mathrm{xc}}}{\delta\rho} = \mu \tag{21}
\]
is inverted to find a local pseudopotential (the second term on the left-hand side of Eq. (21)) that reproduces a crystalline density derived from a KS calculation using a nonlocal pseudopotential [12]. Here the terms on the left-hand side of Eq. (21) are the density functional variations of the same terms given in Eq. (2), except that in OF-DFT, TKS will be a functional of the density only and not of the orbitals. On the right-hand side is µ, the chemical potential. This method yielded promising results for alkali and alkaline earth metals, but was not extended beyond such elements because inherent to the method was the assumption and use of a given approximate kinetic energy density functional. Hence the pseudopotential had built into it the success and/or failure associated with any given choice of kinetic energy functional. A related approach for constructing local pseudopotentials based on embedding an ion in an electron gas was proposed by Anta and Madden; this method yielded improved results for liquid Li, for example [13]. More recently, Zhou et al. proposed that improved local pseudopotentials for condensed matter could be obtained by inverting not the OF-DFT equations but instead the KS equations so that the exact kinetic energy could be used in the inversion procedure. This was done subject to the constraint of reproducing accurate crystalline electron densities, using a modified version of the method developed by Wang and Parr for the inversion procedure [14]. Zhou et al. showed that a local pseudopotential could be constructed in this way that, e.g., for silicon, yielded bulk properties for both semiconducting and metallic phases in excellent agreement with predictions by a nonlocal pseudopotential within the KS theory. This bulk-derived local pseudopotential also exhibited improved transferability over those derived from a single atomic density. 
In principle, Zhou et al.’s approach is a general scheme applicable to all elements, since the exact kinetic energy is utilized [15]. With local pseudopotentials now in hand, we turn our attention back to calculating accurate valence electron densities via kinetic-energy density functionals within OF-DFT. The valence electron density in condensed matter can be viewed as fluctuating around an average value that corresponds to the total number of electrons spread homogeneously over the system. If this were exactly the case, we would have a uniform electron gas for which the kinetic energy is described exactly by the TF functional in Eq. (4) with a constant density. For an inhomogeneous density, the TF functional still constitutes an
appropriate starting point and is the zeroth-order term of the conventional gradient expansion (CGE) [16]:
\[
T_{\mathrm{KS}}[\rho] = T_{\mathrm{TF}}[\rho] + T_{2}[\rho] + T_{4}[\rho] + T_{6}[\rho] + \cdots \tag{22}
\]
Here, $T_2$, $T_4$, and $T_6$ correspond to the second-, fourth-, and sixth-order corrections, respectively. All odd-order corrections are zero. The second-order correction is found to be one ninth of the VW kinetic energy, while the fourth-order term is [17]:
\[
T_{4}[\rho] = \frac{1}{540(3\pi^{2})^{2/3}} \int \rho^{1/3} \left[ \frac{(\nabla^{2}\rho)^{2}}{\rho^{2}} - \frac{9\,\nabla^{2}\rho\,(\nabla\rho)^{2}}{8\rho^{3}} + \frac{(\nabla\rho)^{4}}{3\rho^{4}} \right] d\mathbf{r} \tag{23}
\]
Starting with the sixth-order term, all further corrections diverge for quickly varying or exponentially decaying densities [18]. Moreover, the fourth-order correction constitutes only a minor improvement over the second-order term and its potential $\delta T_4[\rho]/\delta\rho$ also diverges for quickly varying or exponentially decaying densities. Usually then, the CGE is truncated at second order as
\[
T_{\mathrm{CGE}}[\rho] = T_{\mathrm{TF}}[\rho] + \frac{1}{9}\,T_{\mathrm{VW}}[\rho] \tag{24}
\]
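As a worked illustration of Eq. (24), the following sketch evaluates the TF and VW functionals for the hydrogen 1s density, ρ(r) = exp(−2r)/π in atomic units, for which the exact kinetic energy is 0.5 hartree. It assumes the standard TF form, T_TF = C_F ∫ ρ^{5/3} dr with C_F = (3/10)(3π²)^{2/3}, and the gradient-only form of the VW energy density; the radial grid is an arbitrary choice made for this example.

```python
import numpy as np

C_F = 0.3 * (3.0 * np.pi**2) ** (2.0 / 3.0)    # Thomas-Fermi constant (atomic units)

r = np.linspace(0.0, 40.0, 400001)[1:]          # radial grid, r = 0 excluded
dr = r[1] - r[0]
rho = np.exp(-2.0 * r) / np.pi                  # hydrogen 1s density
drho = np.gradient(rho, dr)                     # radial derivative d(rho)/dr
shell = 4.0 * np.pi * r**2                      # spherical volume element

T_TF = np.sum(shell * C_F * rho ** (5.0 / 3.0)) * dr
T_VW = np.sum(shell * drho**2 / (8.0 * rho)) * dr
T_CGE = T_TF + T_VW / 9.0                       # second-order truncation, Eq. (24)

print(f"T_TF = {T_TF:.4f}, T_VW = {T_VW:.4f}, T_CGE = {T_CGE:.4f} hartree")
```

For this one-electron density the VW functional alone reproduces the exact value, whereas the truncated expansion underestimates it; an isolated atom is far from the slowly varying regime in which the truncation is justified.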
For slowly varying densities, this truncation is reasonable. For the nearly-free electron gas, linear response theory can provide an additional constraint on the kinetic-energy functional [19].
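\[
\hat{F}\left[\left.\frac{\delta^{2}T[\rho]}{\delta\rho^{2}}\right|_{\rho_{0}}\right] = -\frac{1}{\chi_{\mathrm{Lind}}} = \frac{\pi^{2}}{k_{F}}\left(\frac{1}{2} + \frac{1-\eta^{2}}{4\eta}\ln\left|\frac{1+\eta}{1-\eta}\right|\right)^{-1} \tag{25}
\]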
Here $\hat{F}$ denotes the Fourier transform, $\delta$ the functional derivative, evaluated at a reference density $\rho_0$, and $\chi_{\mathrm{Lind}}$ is the Lindhard susceptibility function, the expression for which is detailed on the right-hand side, where $\eta = q/2k_F$, $q$ is the reciprocal-space wave vector, and $k_F = (3\pi^{2}\rho_0)^{1/3}$. Although the exact susceptibility is known in this case, the actual kinetic-energy functional is not. Its behavior at the small and large $q$ limits can be evaluated, however. The exact linear response matches the CGE only for very slowly varying densities, which correspond to small values of $q$.
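\[
\lim_{\eta\to 0}\hat{F}\left[\left.\frac{\delta^{2}T[\rho]}{\delta\rho^{2}}\right|_{\rho_{0}}\right] = \lim_{\eta\to 0}\hat{F}\left[\left.\frac{\delta^{2}\left(T_{\mathrm{TF}}[\rho] + \frac{1}{9}T_{\mathrm{VW}}[\rho]\right)}{\delta\rho^{2}}\right|_{\rho_{0}}\right] \tag{26}
\]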
In the limit of infinitely quickly varying densities or the large q limit (LQL), the linear response behavior is very different.
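\[
\lim_{\eta\to +\infty}\hat{F}\left[\left.\frac{\delta^{2}T[\rho]}{\delta\rho^{2}}\right|_{\rho_{0}}\right] = \lim_{\eta\to +\infty}\hat{F}\left[\left.\frac{\delta^{2}\left(-\frac{3}{5}T_{\mathrm{TF}}[\rho] + T_{\mathrm{VW}}[\rho]\right)}{\delta\rho^{2}}\right|_{\rho_{0}}\right] \tag{27}
\]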
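The two limits in Eqs. (26) and (27) can be checked numerically. The sketch below works with the dimensionless function G(η) = [1/2 + (1 − η²)/(4η) ln|(1 + η)/(1 − η)|]⁻¹, i.e., the inverse Lindhard response expressed in units in which the TF term contributes 1 and the VW term contributes 3η². This normalization and the function names are assumptions made for illustration.

```python
import numpy as np

def lindhard_inverse(eta):
    """Dimensionless inverse Lindhard response G(eta), in units where the
    Thomas-Fermi term equals 1 and the von Weizsacker term equals 3*eta**2.
    eta = 1 must be avoided: the formula has a removable log singularity there."""
    eta = np.asarray(eta, dtype=float)
    log_term = np.log(np.abs((1.0 + eta) / (1.0 - eta)))
    return 1.0 / (0.5 + (1.0 - eta**2) / (4.0 * eta) * log_term)

def small_q_limit(eta):
    return 1.0 + eta**2 / 3.0          # TF + (1/9) VW, cf. Eq. (26)

def large_q_limit(eta):
    return 3.0 * eta**2 - 3.0 / 5.0    # VW - (3/5) TF, cf. Eq. (27)

for eta in (0.01, 0.1, 10.0, 50.0):
    print(eta, float(lindhard_inverse(eta)), small_q_limit(eta), large_q_limit(eta))
```

At small η the exact response follows the second-order gradient expansion, and at large η it approaches the VW term minus three fifths of the TF term; neither simple form holds in between, which is the gap the functionals discussed next try to bridge.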
As we saw before though, the VW kinetic energy constitutes a lower bound to the kinetic energy. Therefore, here the linear response behavior cannot be correct (we are far from the small perturbations away from the uniform gas limit required in linear response theory) and we can conclude that linear response theory inadequately describes quickly varying densities. Nevertheless, a lot of effort has been made to determine the corresponding kinetic-energy functional. Bridging the gap between the small- and large-$q$ limits to obtain the linear response kinetic-energy functional involves explicitly enforcing the correct linear response behavior. Pioneering work in this direction by Wang and Teter [20], Perrot [21], and Smargiassi and Madden [22] produced impressive results for many main group metals. A correction term is added to the TF and VW functionals to enforce the linear response:
\[
T[\rho] = T_{\mathrm{TF}}[\rho] + T_{\mathrm{VW}}[\rho] + T_{X}[\rho] \tag{28}
\]
Here $T_X$ is the correction, usually a nonlocal functional of the density that can be expressed as a double integral
\[
T_{X}[\rho] = \iint \rho^{\alpha}(\mathbf{r})\, w(\mathbf{r}-\mathbf{r}')\, \rho^{\beta}(\mathbf{r}')\, d\mathbf{r}\, d\mathbf{r}' \tag{29}
\]
where $w$ is called the response kernel and is adjusted to produce the global linear response behavior, while $\alpha$ and $\beta$ are functional-dependent parameters. More complex functionals, based either on higher-order response theories ([23], for instance) or on density-dependent kernels (like those of Chacón and coworkers [24] or Wang et al. [25]), can produce more general and transferable results. However, their excellent performance comes with increased computational costs and, in the case of the Chacón functional, with quadratic scaling of the computational time with system size. Nevertheless, computations using these functionals are several orders of magnitude faster than those using the KS kinetic energy. For example, Jesson and Madden performed DFT molecular dynamics simulations of solid and liquid aluminum using the Foley and Madden KEDF, on systems four times larger and for simulation times twice as long [26] as previous KS molecular dynamics studies [27] could consider. Although the melting temperature they predicted was much lower than the experimental value and previous predictions, it appears that their pseudopotential, not their KEDF, was the main source of error. It is important to emphasize that even the best of today’s functionals do not exactly match the accuracy of the KS method, exhibiting non-negligible deviations from the KS densities and energies in many cases. This should spur further developments of kinetic-energy density functionals.
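Because the kernel in Eq. (29) depends only on the difference of its two arguments, the double integral is a convolution and can be evaluated with FFTs on a periodic grid in O(N log N) operations instead of O(N²). The one-dimensional sketch below compares the two routes. The Gaussian kernel, the test density, and the exponents α = β = 5/6 are placeholders chosen for illustration, not a response kernel derived from Eq. (25).

```python
import numpy as np

def tx_fft(rho, w, dx, alpha, beta):
    """T_X = sum_ij rho^alpha(x_i) w(x_i - x_j) rho^beta(x_j) dx^2 via FFT convolution."""
    conv = np.fft.ifft(np.fft.fft(w) * np.fft.fft(rho**beta)).real * dx
    return float(np.sum(rho**alpha * conv) * dx)

def tx_direct(rho, w, dx, alpha, beta):
    """Same quantity from the explicit O(N^2) double sum (periodic images)."""
    n = rho.size
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n   # index of x_i - x_j
    return float(rho**alpha @ w[idx] @ rho**beta * dx * dx)

n, box = 256, 10.0
dx = box / n
x = np.arange(n) * dx
rho = 1.0 + 0.3 * np.cos(2.0 * np.pi * x / box)     # density fluctuating about a mean
dist = np.minimum(x, box - x)                        # minimum-image distance |x - x'|
w = np.exp(-dist**2)                                 # placeholder kernel w(x - x')

t_fft = tx_fft(rho, w, dx, 5 / 6, 5 / 6)
t_direct = tx_direct(rho, w, dx, 5 / 6, 5 / 6)
print(t_fft, t_direct)
```

The FFT route is what makes kernels that do not depend on the density attractive on uniform grids; a density-dependent kernel is no longer a plain convolution, which is one reason for the higher cost mentioned above.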
4. Conclusions and Outlook
Despite more than seventy years of research in this field and some tremendous progress, kinetic-energy density functionals have not yet reached a degree of sophistication that allows their reliable and transferable use for all elements in the periodic table and for all phases of matter. One could easily view the development of accurate descriptions of the kinetic energy in terms of the density alone as the last great frontier of density functional theory. Currently, OF-DFT research is moving from the development of new, approximate functionals to attempting to determine the properties of the exact one [28]. Also, it is becoming clearer that reproducing the KS energy for a given system is not a guarantee of functional accuracy. More efforts have been devoted to trying to reproduce the kinetic energy density predicted by the KS method at every point in space [29]; one can expect this type of effort to intensify in the future. If highly accurate and general forms for the kinetic-energy density functional are discovered, which retain the linear scaling efficiency of current functionals, OF-DFT will undoubtedly become the quantum-based method of choice for investigating wavefunction-independent properties of large numbers of atoms. Aside from spectroscopic quantities, most properties of interest (e.g., vibrations, forces, dynamical evolution, structure, etc.) do not depend on knowledge of the electronic wavefunction and hence OF-DFT can be employed. For further reading about advanced technical details in kinetic-energy density functional theory, see Wang and Carter [30].
References
[1] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas,” Phys. Rev., 136, B864–B871, 1964.
[2] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev., 140, A1133–A1138, 1965.
[3] S. Goedecker, “Linear scaling electronic structure methods,” Rev. Mod. Phys., 71(4), 1085–1123, 1999.
[4] E. Fermi, “Un metodo statistico per la determinazione di alcune proprietà dell’atomo,” Rend. Accad. Lincei, 6, 602–607, 1927.
[5] L.H. Thomas, “The calculation of atomic fields,” Proc. Camb. Phil. Soc., 23, 542–548, 1927.
[6] C.C.J. Roothaan, “New developments in molecular orbital theory,” Rev. Mod. Phys., 23, 69–89, 1951.
[7] C.F. von Weizsäcker, “Zur Theorie der Kernmassen,” Z. Phys., 96, 431–458, 1935.
[8] P.K. Acharya, L.J. Bartolotti, S.B. Sears, and R.G. Parr, “An atomic kinetic energy functional with full Weizsacker correction,” Proc. Natl. Acad. Sci. USA, 77, 6978–6982, 1980.
[9] P. García-González, J.E. Alvarellos, and E. Chacón, “Kinetic-energy density functional: atoms and shell structure,” Phys. Rev. A, 54, 1897–1905, 1996.
[10] T. Wesolowski and A. Warshel, “Ab initio free-energy perturbation calculations of solvation free-energy using the frozen density-functional approach,” J. Phys. Chem., 98, 5183–5187, 1994.
[11] N. Choly and E. Kaxiras, “Kinetic energy density functionals for non-periodic systems,” Solid State Commun., 121, 281–286, 2002.
[12] S. Watson, B.J. Jesson, E.A. Carter, and P.A. Madden, “Ab initio pseudopotentials for orbital-free density functionals,” Europhys. Lett., 41, 37–42, 1998.
[13] J.A. Anta and P.A. Madden, “Structure and dynamics of liquid lithium: comparison of ab initio molecular dynamics predictions with scattering experiments,” J. Phys. Condens. Matter, 11, 6099–6111, 1999.
[14] Y. Wang and R.G. Parr, “Construction of exact Kohn–Sham orbitals from a given electron density,” Phys. Rev. A, 47, R1591–R1593, 1993.
[15] B. Zhou, Y.A. Wang, and E.A. Carter, “Transferable local pseudopotentials derived via inversion of the Kohn–Sham equations in a bulk environment,” Phys. Rev. B, 69, 125109, 2004.
[16] D.A. Kirzhnits, “Quantum corrections to the Thomas–Fermi equation,” Sov. Phys. – JETP, 5, 64–71, 1957.
[17] C.H. Hodges, “Quantum corrections to the Thomas–Fermi approximation – the Kirzhnits method,” Can. J. Phys., 51, 1428–1437, 1973.
[18] D.R. Murphy, “The sixth-order term of the gradient expansion of the kinetic energy density functional,” Phys. Rev. A, 24, 1682–1688, 1981.
[19] J. Lindhard, K. Dan. Vidensk. Selsk. Mat. Fys. Medd., 28, 8, 1954.
[20] L.-W. Wang and M.P. Teter, “Kinetic-energy functional of the electron density,” Phys. Rev. B, 45, 13196–13220, 1992.
[21] F. Perrot, “Hydrogen–hydrogen interaction in an electron gas,” J. Phys. Condens. Matter, 6, 431–446, 1994.
[22] E. Smargiassi and P.A. Madden, “Orbital-free kinetic-energy functionals for first-principles molecular dynamics,” Phys. Rev. B, 49, 5220–5226, 1994.
[23] M. Foley and P.A. Madden, “Further orbital-free kinetic-energy functionals for ab initio molecular dynamics,” Phys. Rev. B, 53, 10589–10598, 1996.
[24] P. García-González, J.E. Alvarellos, and E. Chacón, “Nonlocal symmetrized kinetic-energy density functional: application to simple surfaces,” Phys. Rev. B, 57, 4857–4862, 1998.
[25] Y.A. Wang, N. Govind, and E.A. Carter, “Orbital-free kinetic-energy density functionals with a density-dependent kernel,” Phys. Rev. B, 60, 16350–16358, 1999.
[26] B.J. Jesson and P.A. Madden, “Ab initio determination of the melting point of aluminum by thermodynamic integration,” J. Chem. Phys., 113, 5924–5934, 2000.
[27] G.A. de Wijs, G. Kresse, and M.J. Gillan, “First-order phase transitions by first-principles free-energy calculations: the melting of Al,” Phys. Rev. B, 57, 8223–8234, 1998.
[28] T. Gál and Á. Nagy, “A method to get an analytical expression for the non-interacting kinetic energy density functional,” J. Mol. Struct., 501–502, 167–171, 2000.
[29] E. Sim, J. Larkin, and K. Burke, “Testing the kinetic energy functional: kinetic energy density as a density functional,” J. Chem. Phys., 118, 8140–8148, 2003.
[30] Y.A. Wang and E.A. Carter, “Orbital-free kinetic energy density functional theory,” In: S.D. Schwartz (ed.), Theoretical Methods in Condensed Phase Chemistry, Kluwer, Dordrecht, pp. 117–184, 2000.
1.9 AB INITIO ATOMISTIC THERMODYNAMICS AND STATISTICAL MECHANICS OF SURFACE PROPERTIES AND FUNCTIONS
Karsten Reuter¹, Catherine Stampfl¹,², and Matthias Scheffler¹
¹ Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, D-14195 Berlin, Germany
² School of Physics, The University of Sydney, Sydney 2006, Australia
S. Yip (ed.), Handbook of Materials Modeling, 149–194. © 2005 Springer. Printed in the Netherlands.
Previous and present “academic” research aiming at atomic scale understanding is mainly concerned with the study of individual molecular processes possibly underlying materials science applications. In investigations of crystal growth one would, for example, study the diffusion of adsorbed atoms at surfaces, and in the field of heterogeneous catalysis it is the reaction path of adsorbed species that is analyzed. Appealing properties of an individual process are then frequently discussed in terms of their direct importance for the envisioned material function, or reciprocally, the function of materials is often believed to be understandable by essentially one prominent elementary process only. What is often overlooked in this approach is that in macroscopic systems of technological relevance typically a large number of distinct atomic scale processes take place. Which of them are decisive for observable system properties and functions is then not only determined by the detailed individual properties of each process alone, but in many, if not most cases, also the interplay of all processes, i.e., how they act together, plays a crucial role. For a predictive materials science modeling with microscopic understanding, a description that treats the statistical interplay of a large number of microscopically well-described elementary processes must therefore be applied. Modern electronic structure theory methods such as density-functional theory (DFT) have become a standard tool for the accurate description of the individual atomic and molecular processes. In what follows we discuss the present status of emerging methodologies that attempt to achieve a (hopefully seamless) match of DFT with concepts from statistical mechanics or thermodynamics, in order to also address the interplay of the various molecular processes. The
new quality of, and the novel insights that can be gained by, such techniques are illustrated by how they allow the description of crystal surfaces in contact with realistic gas-phase environments, which is of critical importance for the manufacture and performance of advanced materials such as electronic, magnetic and optical devices, sensors, lubricants, catalysts, and hard coatings. For obtaining an understanding, and for the design, advancement or refinement of modern technology that controls many (most) aspects of our life, a large range of time and length scales needs to be described, namely, from the electronic (or microscopic/atomistic) to the macroscopic, as illustrated in Fig. 1. Obviously, this calls for multiscale modeling, where corresponding theories (i.e., from the electronic, mesoscopic, and macroscopic regimes) and their results need to be linked appropriately. For each length and time scale regime alone, a number of methodologies are well established. It is, however, the appropriate linking of the methodologies that is only now evolving. Conceptually quite challenging in this hierarchy of scales are the transitions from what is often called a micro- to a mesoscopic system description, and from a meso- to a macroscopic system description. Due to the rapidly increasing number of particles and possible processes, the former transition is methodologically primarily characterized by the rapidly increasing importance of statistics, while in the latter, the atomic substructure is finally discarded in favor of a continuum modeling.

Figure 1. Schematic presentation of the time and length scales relevant for most material science applications. The elementary molecular processes, which rule the behavior of a system, take place in the so-called “electronic regime”. Their interplay, which frequently determines the functionalities however, only develops after meso- and macroscopic lengths or times. [Axes: length (m), $10^{-9}$ to 1; time (s), $10^{-15}$ to 1. Labels: electronic regime, mesoscopic regime, macroscopic regime; Density Functional Theory; Statistical Mechanics or Thermodynamics.]

In this contribution we will concentrate on the micro- to mesoscopic system transition, and correspondingly discuss some possibilities of how atomistic electronic structure theory can be linked with concepts and techniques from statistical mechanics and thermodynamics. Our aim is a materials science modeling that is based on understanding, predictive, and applicable to a wide range of realistic conditions (e.g., realistic environmental situations of varying temperatures and pressures). This then mostly excludes the use of empirical or fitted parameters – both at the electronic and at the mesoscopic level, as well as in the matching procedure itself. Electronic theories that do not rely on such parameters are often referred to as first-principles (or in Latin: ab initio) techniques, and we will maintain this classification also for the linked electronic-statistical methods. Correspondingly, our discussion will mainly (nearly exclusively) focus on such ab initio studies, although mentioning some other work dealing with important (general) concepts. Furthermore, this chapter does not (or only briefly) discuss equations; instead the concepts are demonstrated (and illustrated) by selected, typical examples. Since many (possibly most) aspects of modern material science deal with surface or interface phenomena, the examples are from this area, addressing in particular surfaces of semiconductors, metals, and metal oxides. Apart from sketching the present status and achievements, we also find it important to mention the difficulties and problems (or open challenges) of the discussed approaches. This can however only be done in a qualitative and rough manner, since the problems lie mostly in the details, the explanations of which are not appropriate for such a chapter.
To understand the elementary processes ruling the materials science context, microscopic theories need to address the behavior of electrons and the resulting interactions between atoms and molecules (often expressed in the terminology of chemical bonds). Electrons move and adjust to perturbations on a time scale of femtoseconds (1 fs = $10^{-15}$ s), atoms vibrate on a time scale of picoseconds (1 ps = $10^{-12}$ s), and individual molecular processes take place on a length scale of 0.1 nanometer (1 nm = $10^{-9}$ m). Because of the central importance of the electronic interactions, this time and length scale regime is also often called the “electronic regime”, and we will use this term here in particular, in order to emphasize the difference between ab initio electronic and semi-empirical microscopic theories. The former explicitly treat the electronic degrees of freedom, while the latter already coarse-grain over them and directly describe the atomic scale interactions by means of interatomic potentials. Many materials science applications depend sensitively on intricate details of bond breaking and making, which on the other hand are often not well (if at all) captured by existing semi-empirical classical potential schemes. A predictive first-principles modeling as outlined above must therefore be based on a proper description of molecular processes in the “electronic regime”, which is much harder to accomplish than just a microscopic description employing more or
less guessed potentials. In this respect we find it also appropriate to distinguish the electronic regime from the currently frequently cited “nanophysics” (or better “nanometer-scale physics”). The latter deals with structures or objects of which at least one dimension is in the range 1–100 nm, and which due to this confinement exhibit properties that are not simply scalable from the ones of larger systems. Although already quite involved, the detailed understanding of individual molecular processes arising from electronic structure theories is, however, often still not enough. As mentioned above, in many cases the system functionalities are determined by the concerted interplay of many elementary processes, not only by the detailed individual properties of each process alone. It can, for example, very well be that an individual process exhibits very appealing properties for a desired application, yet the process may still be irrelevant in practice, because it hardly ever occurs within the “full concert” of all possible molecular processes. Evaluating this “concert” of elementary processes one obviously has to go beyond separate studies of each microscopic process. However, taking the interplay into account, naturally requires the treatment of larger system sizes, as well as an averaging over much longer time scales. The latter point is especially pronounced, since many elementary processes in materials science are activated (i.e., an energy barrier must be overcome) and thus rare. This means that the time between consecutive events can be orders of magnitude longer than the actual event time itself. Instead of the above mentioned electronic time regime, it may therefore be necessary to follow the time evolution of the system up to seconds and longer in order to arrive at meaningful conclusions concerning the effect of the statistical interplay. 
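A rough estimate shows how quickly activated processes become rare. Assuming a simple Arrhenius form, rate = ν exp(−ΔE / k_B T), with an attempt frequency ν = 10¹³ s⁻¹ (of the order of atomic vibration frequencies) and a few illustrative barriers, none of which is taken from the text, the mean waiting time between events at room temperature can be tabulated as follows.

```python
import math

K_B = 8.617333262e-5      # Boltzmann constant in eV/K

def waiting_time(barrier_ev, temperature_k, prefactor_hz=1.0e13):
    """Mean time between events for an Arrhenius rate (assumed model)."""
    rate = prefactor_hz * math.exp(-barrier_ev / (K_B * temperature_k))
    return 1.0 / rate

for barrier in (0.25, 0.5, 0.75, 1.0):           # illustrative barriers in eV
    t = waiting_time(barrier, 300.0)
    print(f"barrier {barrier:.2f} eV -> waiting time {t:.2e} s")
```

With these assumed numbers, raising the barrier from 0.25 eV to 1 eV stretches the waiting time from about a nanosecond to more than an hour, while each event itself still lasts only about a vibrational period. This is the separation of time scales that the mesoscopic methods discussed below have to bridge.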
Apart from the system size, there is thus possibly the need to bridge some twelve orders of magnitude in time, which puts new demands on theories that are to operate in the corresponding mesoscopic regime. And also at this level, the ab initio approach is much more involved than an empirical one because it is not possible to simply “lump together” several not further specified processes into one effective parameter. Each individual elementary step must be treated separately, and then combined with all the others within an appropriate framework. Methodologically, the physics in the electronic regime is best described by electronic structure theories, among which density-functional theory [1–4] has become one of the most successful and widespread approaches. Apart from detailed information about the electronic structure itself, the typical output of such DFT calculations that is of relevance for the present discussion is the energetics, e.g., total energies, as well as the forces acting on the nuclei for a given atomic configuration. If this energetic information is provided as a function of the atomic configuration $\{\mathbf{R}_I\}$, one talks about a potential energy surface (PES) $E(\{\mathbf{R}_I\})$. Obviously, a (meta)stable atomic configuration corresponds to a (local) minimum of the PES. The forces acting on the given atomic configuration are just the local gradient of the PES, and the vibrational
modes of a (local) minimum are given by the local PES curvature around it. Although DFT mostly does not meet the frequent demand for “chemical accuracy” (1 kcal/mol ≈ 0.04 eV/atom) in the energetics, it is still often sufficiently accurate to allow for the aspired modeling with predictive character. In fact, we will see throughout this chapter that error cancellation at the statistical interplay level may give DFT-based approaches a much higher accuracy than may be expected on the basis of the PES alone. With the computed DFT forces it is possible to directly follow the motion of the atoms according to Newton’s laws [5, 6]. With the resulting ab initio molecular dynamics (MD) [7–11] only time scales up to the order of 50 ps are, however, currently accessible. Longer times may, e.g., be reached by so-called accelerated MD techniques [12], but for the desired description of a truly mesoscopic scale system which treats the statistical interplay of a large number of elementary processes over some seconds or longer, a match or combination of DFT with concepts from statistical mechanics or thermodynamics must be found. In the latter approaches, bridging of the time scale is achieved by either a suitable “coarse-graining” in time (to be specified below) or by only considering thermodynamically stable (or metastable) states. We will discuss how such a description, appropriate for a mesoscopic-scale system, can be achieved starting from electronic structure theory, as well as ensuing concepts like atomistic thermodynamics, lattice-gas Hamiltonians (LGH), equilibrium Monte Carlo simulations, or kinetic Monte Carlo simulations (kMC). Which of these approaches (or a combination) is most suitable depends on the particular type of problem. Table 1 lists the different theoretical approaches and the time and length scales that they treat. 
Table 1. The time and length scales typically handled by different theoretical approaches to study chemical reactions and crystal growth

Approach                              Information             Time scale         Length scale
Density-functional theory             Microscopic             –                  ≲ 10³ atoms
Ab initio molecular dynamics          Microscopic             t ≲ 50 ps          ≲ 10³ atoms
Semi-empirical molecular dynamics     Microscopic             t ≲ 1 ns           ≲ 10³ atoms
Kinetic Monte Carlo simulations       Micro- to mesoscopic    1 ps ≲ t ≲ 1 h     ≲ 1 µm
Ab initio atomistic thermodynamics    Meso- to macroscopic    Averaged           ≳ 10 nm
Rate equations                        Averaged                0.1 s ≲ t ≲ ∞      ≳ 10 nm
Continuum equations                   Macroscopic             1 s ≲ t ≲ ∞        ≳ 10 nm

While the concepts are general, we find it instructive to illustrate their power and limitations on the basis of a particular issue that is central to the field of surface-related studies including applications as important as crystal growth and heterogeneous catalysis, namely to treat the effect of a finite gas-phase. With surfaces forming the interface to the surrounding environment, a critical dependence of their properties on the species in this gas-phase, on their partial pressures and on the temperature can be intuitively expected [13, 14]. After all, we recall that for example in our oxygen-rich atmosphere, each atomic site of a close-packed crystal surface at room temperature is hit by of the order of $10^{9}$ O$_2$ molecules per second. That this may have profound consequences on the surface structure and composition is already highlighted by the everyday phenomena of oxide formation, and in humid oxygen-rich environments, eventually corrosion with rust and verdigris as two visible examples [15]. In fact, what is typically called a stable surface structure is nothing but the statistical average over all elementary adsorption processes from, and desorption processes to, the surrounding gas-phase. If atoms or molecules of a given species adsorb more frequently from the gas-phase than they desorb to it, the species’ concentration in the surface structure will be enriched with time, thus also increasing the total number of desorption processes. Eventually this total number of desorption processes will (averaged over time) equal the number of adsorption processes. Then the (average) surface composition and structure will remain constant, and the surface has attained its thermodynamic equilibrium with the surrounding environment. Within this context we may be interested in different aspects; for example, on the microscopic level, the first goal would be to separately study elementary processes such as adsorption and desorption in detail. With DFT one could, e.g., address the energetics of the binding of the gas-phase species to the surface in a variety of atomic configurations [16], and MD simulations could shed light on the possibly intricate gas-surface dynamics during one individual adsorption process [10, 11, 17].
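The balance between adsorption and desorption described above can be mimicked with a minimal rate model. The sketch below assumes Langmuir-type kinetics for the coverage θ of a single species, dθ/dt = k_ads(1 − θ) − k_des θ, with made-up rate constants. It is only meant to show the coverage relaxing to the value at which both processes occur equally often; it ignores lateral interactions and all the atomistic detail discussed in this chapter.

```python
def relax_coverage(k_ads, k_des, theta0=0.0, dt=1.0e-3, n_steps=20000):
    """Integrate d(theta)/dt = k_ads*(1 - theta) - k_des*theta with explicit Euler."""
    theta = theta0
    for _ in range(n_steps):
        adsorption = k_ads * (1.0 - theta)    # adsorption events per site per unit time
        desorption = k_des * theta            # desorption events per site per unit time
        theta += dt * (adsorption - desorption)
    return theta

k_ads, k_des = 2.0, 0.5                        # hypothetical rate constants (1/s)
theta_final = relax_coverage(k_ads, k_des)
theta_steady = k_ads / (k_ads + k_des)         # analytic steady state of the model
print(theta_final, theta_steady)
```

In this toy model the stationary coverage depends only on the ratio of the two rate constants, not on the starting coverage, which mirrors the statement that the equilibrated surface is a statistical average over adsorption and desorption events.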
Already the search for the most stable surface structure under given gas-phase conditions, however, requires the consideration of the interplay between the elementary processes (of at least adsorption and desorption) at the mesoscopic scale. If we are only interested in the equilibrated system, i.e., when the system has reached its thermodynamic ground (or a metastable) state, the natural choice would then be to combine DFT data with thermodynamic concepts. How this can be done will be exemplified in the first part of this chapter. On the other hand, the processes altering the surface geometry and composition from a known initial state to the final ground state can be very slow. And coming back to the above example of oxygen–metal interaction, corrosion is a prime example, where such a kinetic hindrance significantly slows down (and practically stops) further oxidation after an oxide film of certain thickness has formed at the surface. In such circumstances, a thermodynamic description will not be satisfactory and one would want to follow the explicit kinetics of the surface in the given gas-phase. Then the combination of DFT with concepts from statistical mechanics explicitly treating the kinetics is required, and we will illustrate some corresponding attempts in the last section entitled “First-principles kinetic Monte Carlo simulations”.
1. Ab Initio Atomistic Thermodynamics
First, let us discuss the matching of electronic structure theory data with thermodynamics. Although this approach applies “only” to systems in equilibrium (or in a metastable state), we note that at least, at not too low temperatures, a surface is likely to rapidly attain thermodynamic equilibrium with the ambient atmosphere. And even if it has not yet equilibrated, at some later stage it will have and we can nevertheless learn something by knowing about this final state. Thermodynamic considerations also have the virtue of requiring comparably less microscopic information, typically only about the minima of the PES and the local curvatures around them. As such, it is often advantageous to first resort to a thermodynamic description, before embarking upon the more demanding kinetic modeling described in the last section. The goal of the thermodynamic approach is to use the data from electronic structure theory, i.e., the information on the PES, to calculate appropriate thermodynamic potential functions like the Gibbs free energy G [18–21]. Once such a quantity is known, one is immediately in the position to evaluate macroscopic system properties. Of particular relevance for the spatial aspect of our multiscale endeavor is further that within a thermodynamic description larger systems may readily be divided into smaller subsystems that are mutually in equilibrium with each other. Each of the smaller and thus potentially simpler subsystems can then first be treated separately, and the contact between the subsystems is thereafter established by relating their corresponding thermodynamic potentials. Such a “divide and conquer” type of approach can be especially efficient, if infinite, but homogeneous parts of the system like bulk or surrounding gas-phase can be separated off [22–27].
1.1. Free Energy Plots for Surface Oxide Formation
Figure 2. Cartoon side views illustrating the effect of an increasingly oxygen-rich atmosphere on a metal surface. Whereas in perfect vacuum (left) the clean surface prevails, finite O2 pressures in the environment lead to an oxygen-enrichment in the solid and its surface. Apart from some bulk dissolved oxygen, frequently observed stages in this oxidation process comprise (from left to right) on-surface adsorbed O, the formation of thin (surface) oxide films, and eventually the transformation to an ordered bulk oxide compound. Note that all stages can be strongly kinetically-inhibited. It is, e.g., not clear whether the observation of a thin surface oxide film means that this is the stable surface composition and structure at the given gas-phase pressure and temperature, or whether the system has simply not yet attained its real equilibrium structure (possibly in form of the full bulk oxide). Such limitations can be due to quite different microscopic reasons: adsorption from or desorption to the gas-phase could be slow/hindered, or (bulk) oxide growth may be inhibited because metal diffusion through the oxide to its surface or oxygen diffusion from the surface to the oxide/metal interface is very slow.
How this quite general concept works and what it can contribute in practice may be illustrated with the case of oxide formation at late transition metal (TM) surfaces sketched in Fig. 2 [28, 29]. These materials have widespread technological use, for example, in the area of oxidation catalysis [30]. Although they are likely to form oxidic structures (i.e., ordered oxygen–metal compounds) in technologically-relevant high oxygen pressure environments, it is difficult to address this issue at the atomic scale with the corresponding experimental techniques of surface science because they often require Ultra-High Vacuum (UHV) [31]. Instead of direct, so-called in situ measurements, the surfaces are usually first exposed to a defined oxygen dosage, and the produced oxygen-enriched surface structures are then cooled down and analyzed in UHV. Due to the low temperatures, it is hoped that the surfaces do not attain their equilibrium structure in UHV during the time of the measurement, and thus provide information about the corresponding surface structure at higher oxygen pressures. This is, however, not fully certain, and it is also not guaranteed that the surface has reached its equilibrium structure during the time of oxygen exposure. Typically, a large variety of potentially kinetically-limited surface structures can be produced this way. Even though it can be academically very interesting to study all of them in detail, one would still like to have some guidance as to which of them would ultimately correspond to an equilibrium structure under which environmental conditions. Furthermore, the knowledge of a corresponding, so-called surface phase diagram as a function of, in this case, the temperature T and oxygen pressure pO2 can also provide useful information to the now surging in situ techniques, as to which phase to expect. The task for an ab initio atomistic thermodynamic approach would therefore be to screen a number of known (or possibly relevant) oxygen-containing surface structures, and evaluate which of them turns out to be the most stable one under which (T, pO2) conditions [24–27]. Translated into thermodynamic language, most stable means that the corresponding structure minimizes an appropriate thermodynamic function, which would in this case be the Gibbs free energy of adsorption ΔG [32, 33]. In other words, one has to compute ΔG as a function of the environmental variables for each structural model,
and the one with the lowest ΔG is identified as most stable. What needs to be computed are all thermodynamic potentials entering into the thermodynamic function to be minimized. In the present case of the Gibbs free energy of adsorption these are for example the Gibbs free energies of bulk and surface structural models, as well as the chemical potential of the O2 gas phase. The latter may, at the accuracy level necessary for the surface phase stability issue, well be approximated by an ideal gas. The calculation of the chemical potential µO(T, pO2) is then straightforward and can be found in standard statistical mechanics textbooks (e.g., [34]). Required input from a microscopic theory like DFT are properties like bond lengths and vibrational frequencies of the gas-phase species. Alternatively, the chemical potential may be directly obtained from thermochemical tables [35]. Compared to this, the evaluation of the Gibbs free energies of the solid bulk and surface is more involved. While in principle contributions from total energy, vibrational free energy or configurational entropy have to be calculated [24–26], a key point to notice here is that not the absolute Gibbs free energies enter into the computation of ΔG, but only the difference of the Gibbs free energies of bulk and surface. This often implies some error cancellation in the DFT total energies. It also leads to quite some (partial) cancellation in the free energy contributions like the vibrational energy. In a physical picture, it is thus not the effect of the absolute vibrations that matters for our considerations, but only the changes of vibrational modes at the surface as compared to the bulk. Under such circumstances it may turn out that the difference between the bulk and surface Gibbs free energies is already well approximated by the difference of their leading total energy terms, i.e., the direct output of the DFT calculations [24].
Although this is of course appealing from a computational point of view, and one would always want to formulate the thermodynamic equations in a way that they contain such differences, we stress that it is not a general result and needs to be carefully checked for every specific system. Once the Gibbs free energies of adsorption ΔG(T, pO2) are calculated for each surface structural model, they can be plotted as a function of the environmental conditions. In fact, under the imposed equilibrium the two-dimensional dependence on T and pO2 can be summarized into a one-dimensional dependence on the gas-phase chemical potential µO(T, pO2) [24]. This is done in Fig. 3(a) for the Pd(100) surface including, apart from the clean surface, a number of previously characterized oxygen-containing surface structures. These are two structures with ordered on-surface O adsorbate layers of different density (p(2 × 2) and c(2 × 2)), a so-called (√5 × √5)R27° surface oxide containing one layer of PdO on top of Pd(100), and finally the infinitely thick PdO bulk oxide [37]. If we start at very low oxygen chemical potential, corresponding to a low oxygen concentration in the gas-phase, we expectedly find the clean Pd(100) surface to yield the lowest ΔG line, which in fact is used here as the reference zero. Upon increasing µO in the gas-phase, the Gibbs free energies of adsorption of the other oxygen-containing surfaces decrease gradually, however, as it becomes more favorable to stabilize such structures with more and more oxygen atoms being present in the gas-phase. The more oxygen the structural models contain, the steeper the slope of their ΔG curves becomes, and above a critical µO we eventually find the surface oxide to be more stable than the clean surface. Since the PdO bulk oxide contains a macroscopic (or at least mesoscopic) number of oxygen atoms, its ΔG line has an infinite slope and cuts the other lines vertically at µO ≈ −0.8 eV. For any higher oxygen chemical potential in the gas-phase, the bulk PdO phase will then always result as most stable.
[Figure 3 graphic. Panel (a): ΔG (meV/Å^2) versus µO (eV) for the clean surface, the p(2 × 2) and c(2 × 2) adlayers, the (√5 × √5)R27° surface oxide, and the bulk oxide, with pO2 (atm) scales for 300 K and 600 K on the top axis. Panel (b): pO2 (atm) versus T (K), with stability regions labeled metal, surface oxide, and bulk oxide.]
Figure 3. (a) Computed Gibbs free energy of adsorption ΔG for the clean Pd(100) surface and several oxygen-containing surface structures. Depending on the chemical potential µO of the surrounding gas-phase, either the clean surface or a surface oxide film (labeled here according to its two-dimensional periodicity as (√5 × √5)R27°), or the infinite PdO bulk oxide exhibit the lowest ΔG and result as the stable phase under the corresponding environmental conditions (as indicated by the different background shadings). Note that a tiny reduction of its surface energy would suffice to make the p(2 × 2) adlayer structure most stable in an intermediate range of chemical potential between the clean surface and the surface oxide. Within the present computational uncertainty, no conclusion can therefore be made regarding the stability of this structure. (b) The stability range of the three phases, evaluated in (a) as a function of µO, plotted directly in (T, pO2)-space. Note the extended stability range of the surface oxide compared to the PdO bulk oxide (after Refs. [28, 36]).
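The construction of such a free energy plot can be sketched in a few lines. The sketch below assumes that each ΔG line is approximated by total-energy terms only, so that ΔG ≈ (N_O/A)(E_ads − ΔµO), with ΔµO measured relative to half the total energy of an O2 molecule; this sign convention, the structure names and all surface numbers are hypothetical and do not reproduce the Pd(100) data of Fig. 3. Only the bulk oxide threshold of about −0.8 eV is borrowed from the value quoted in the text.

```python
# Hypothetical structural models: (oxygen atoms per Å^2, adsorption energy per O in eV).
# These numbers are invented for illustration; they are not the Pd(100) values.
SURFACE_MODELS = {
    "clean": (0.00, 0.0),
    "adlayer": (0.03, -1.2),
    "surface oxide": (0.10, -1.0),
}

# Chemical potential above which the bulk oxide is stable (eV); the text quotes
# approximately -0.8 eV for PdO.
BULK_OXIDE_LIMIT = -0.8


def delta_g(oxygen_per_area, e_ads_per_o, delta_mu_o):
    """Free energy of adsorption per area (eV/Å^2), relative to the clean surface.

    The slope with respect to delta_mu_o is -oxygen_per_area, so structures
    containing more oxygen have steeper lines.
    """
    return oxygen_per_area * (e_ads_per_o - delta_mu_o)


def stable_phase(delta_mu_o):
    """Return the name of the structure with the lowest free energy line."""
    if delta_mu_o > BULK_OXIDE_LIMIT:
        # The bulk oxide line is vertical: beyond the limit it always wins.
        return "bulk oxide"
    return min(
        SURFACE_MODELS,
        key=lambda name: delta_g(*SURFACE_MODELS[name], delta_mu_o),
    )


if __name__ == "__main__":
    for mu in (-1.5, -1.1, -0.85, -0.5):
        print(f"delta_mu_O = {mu:5.2f} eV -> {stable_phase(mu)}")
```

Note that `stable_phase` can only rank the structures listed in `SURFACE_MODELS`; a structure that was never entered cannot appear in the resulting phase diagram.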
With the clean surface, the surface oxide and the bulk oxide, the thermodynamic analysis therefore yields three equilibrium phases for Pd(100) depending on the chemical potential of the O2 environment. Exploiting ideal gas laws, this one-dimensional dependence can be translated into the physically more intuitive dependence on temperature and oxygen pressure. For two fixed temperatures, this is also indicated by the resulting pressure scales at the top axis of Fig. 3(a). Alternatively, the stability range of the three phases can be directly plotted in (T, pO2)-space, as shown in Fig. 3(b). A most intriguing result is that the thermodynamic stability range of the recently identified surface oxide extends well beyond the one of the common PdO bulk oxide, i.e., the surface oxide could well be present under environmental conditions where the PdO bulk oxide is known to be unstable. This result is somewhat unexpected, in two ways: First, it had hitherto been believed that it is the slow growth kinetics (not the thermodynamics) that exclusively controls the thickness of oxide films at surfaces. Second, the possibility of (surface) oxides only a few atomic layers thick, with structures not necessarily related to the known bulk oxides, was traditionally not considered. The additional stabilization of the (√5 × √5)R27° surface oxide is attributed to the strong coupling of the ultrathin film to the Pd(100) substrate [37]. Similar findings have recently been obtained at the Pd(111) [28, 38] and Ag(111) [33, 39] surfaces. Interestingly, the low stability of the bulk oxide phases of these more noble TMs had hitherto often been used as an argument against the relevance of oxide formation in technological environments like in oxidation catalysis [30]. It remains to be seen whether the surface oxide phases and their extended stability range, which have recently been intensively discussed, will change this common perception.
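The translation between chemical potential and pressure can be sketched as follows, assuming the ideal gas relation µO(T, p) = µO(T, p0) + (1/2) kB T ln(p/p0), where the factor 1/2 arises because one O2 molecule supplies two O atoms. The function names are hypothetical, and the temperature-dependent reference term µO(T, p0) is left as an input to be taken, for instance, from thermochemical tables.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def pressure_shift_of_mu_o(temperature, p_o2, p_ref=1.0):
    """Pressure-dependent part of the O chemical potential (eV).

    temperature is in K; p_o2 and p_ref must share the same unit (e.g. atm).
    """
    return 0.5 * K_B * temperature * math.log(p_o2 / p_ref)


def mu_o(temperature, p_o2, mu_o_at_p_ref, p_ref=1.0):
    """Full chemical potential, given the tabulated value at the reference pressure."""
    return mu_o_at_p_ref + pressure_shift_of_mu_o(temperature, p_o2, p_ref)


def o2_pressure_from_shift(temperature, shift, p_ref=1.0):
    """Invert the relation: O2 pressure that produces a given shift (eV)."""
    return p_ref * math.exp(2.0 * shift / (K_B * temperature))


if __name__ == "__main__":
    for temperature in (300.0, 600.0):
        shift = pressure_shift_of_mu_o(temperature, 1e-10)
        print(f"T = {temperature:.0f} K, p = 1e-10 atm: shift = {shift:.3f} eV")
```

Because the shift scales with T, the same pressure change moves µO twice as far at 600 K as at 300 K, which is why the two pressure scales at the top of Fig. 3(a) are stretched differently.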
1.2. Free Energy Plots of Semiconductor Surfaces
Already in the introduction we had mentioned that the concepts discussed here are general and applicable to a wide range of problems. To illustrate this, we supplement the discussion by an example from the field of semiconductors, where the concepts of ab initio atomistic thermodynamics had in fact been developed first [18–21, 40]. Semiconductor surfaces exhibit complex reconstructions, i.e., surface structures that differ significantly in their atomic composition and geometry from the one of the bulk-truncated structure [13]. Knowledge of the correct surface atomic structure is, on the other hand, a prerequisite to understand and control the surface or interface electronic properties, as well as the detailed growth characteristics. While the number of possible configurations with complex surface unit-cell reconstructions is already large, searching for possible structural models becomes even more involved for surfaces of compound semiconductors. In order to minimize the number
of dangling bonds, the surface may exchange atoms with the surrounding gas-phase, which in molecular beam epitaxy (MBE) growth is composed of the substrate species at elevated temperatures and varying partial pressures. As a consequence of the interaction with this gas-phase, the surface stoichiometry may be altered and surface atoms be displaced to assume a more favorable bonding geometry. The resulting surface structure depends thus on the environment, and atomistic thermodynamics may again be employed to compare the stability of existing (or newly suggested) structural models as a function of the conditions in the surrounding gas-phase. The thermodynamic quantity that is minimized by the most stable structure is in this case the surface free energy, which in turn depends on the Gibbs free energies of the bulk and surface of the compound, as well as on the chemical potentials in the gas-phase. The procedure of evaluating these quantities goes exactly along the lines described above, where in addition, one frequently assumes the surface fringe not only to be in thermodynamic equilibrium with the surrounding gas-phase, but also with the underlying compound bulk [24]. With this additional constraint, the dependence of the surface structure and composition on the environment can, even for the two-component gas-phase in MBE, be discussed as a function of the chemical potential of only one of the compound species alone. Figure 4 shows as an example the dependence on the As content in the gas-phase for a number of surface structural models of the GaAs(001) surface. A reasonable lower limit for this content is given, when there is so little As2 in the gas-phase that it becomes thermodynamically more favorable for the arsenic to leave the compound. The resulting GaAs decomposition and formation of Ga droplets at the surface denotes the lower limit of As chemical potentials considered (As-poor limit), while the condensation of arsenic on the surface forms an appropriate upper bound (As-rich limit). Depending on the As to Ga stoichiometry at the surface, the surface free energies of the individual models have either a positive slope (As-poor terminations), a negative slope (As-rich terminations) or remain constant (stoichiometric termination). While the detailed atomic geometries behind the considered models in Fig. 4 are not relevant here, most of them may roughly be characterized as different ways of forming dimers at the surface in order to reduce the number of dangling orbitals [42]. In fact, it is this general “rule” of dangling bond minimization by dimer formation that has hitherto mainly served as inspiration in the creation of new structural models for the (001) surfaces of III–V zinc-blende semiconductors, thereby leading to some prejudice in the type of structures considered. In contrast, the so-called ζ(4 × 2) structure, which was first proposed theoretically, is driven by the filling of all As dangling orbitals and the emptying of all Ga dangling orbitals, as well as by a favorable electrostatic (Ewald) interaction between the surface atoms [41]. The virtue of the atomistic thermodynamic approach is now that such a new structural model can be directly compared in its stability against all existing ones. And indeed, the ζ(4 × 2) phase was found to be more stable than all previously proposed reconstructions at low As pressure. Returning to the methodological discussion, the results shown in Figs. 3 and 4 nicely summarize the contribution that can be made by such analysis.
Figure 4. Surface energies for GaAs(001) terminations as a function of the As chemical potential, µAs. The thermodynamically allowed range of µAs is bounded by the formation of Ga droplets at the surface (As-poor limit at −0.58 eV) and the condensation of arsenic at the surface (As-rich limit at 0.00 eV). The ζ(4 × 2) geometry is significantly lower in energy than the previously proposed β2(4 × 2) model for the c(8 × 2) surface reconstruction observed under As-poor growth conditions (from Ref. [41]).
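The role of the bounded chemical potential range and of the surface stoichiometry can be sketched as follows. The sketch assumes that each surface energy is linear in µAs with a slope given by the negative excess of As over Ga atoms per surface cell; the three terminations and their energies are hypothetical placeholders and not the models of Fig. 4. Only the limits of −0.58 eV and 0.00 eV are taken from the figure caption.

```python
AS_POOR_LIMIT = -0.58  # eV, Ga droplet formation (from the caption of Fig. 4)
AS_RICH_LIMIT = 0.00   # eV, condensation of arsenic (from the caption of Fig. 4)

# Hypothetical terminations:
# (excess As atoms per surface cell, surface energy at the As-rich limit in eV per cell)
TERMINATIONS = {
    "As-rich": (0.5, 0.90),
    "stoichiometric": (0.0, 1.00),
    "As-poor": (-0.5, 1.20),
}


def surface_energy(name, mu_as):
    """Surface energy (eV per cell), linear in the As chemical potential.

    An As excess gives a negative slope, an As deficit a positive slope, and
    a stoichiometric termination a constant.
    """
    excess_as, energy_at_rich_limit = TERMINATIONS[name]
    return energy_at_rich_limit - excess_as * mu_as


def most_stable_termination(mu_as):
    """Termination with the lowest surface energy inside the allowed range."""
    if not AS_POOR_LIMIT <= mu_as <= AS_RICH_LIMIT:
        raise ValueError("mu_As lies outside the thermodynamically allowed range")
    return min(TERMINATIONS, key=lambda name: surface_energy(name, mu_as))


if __name__ == "__main__":
    for mu in (AS_POOR_LIMIT, -0.30, AS_RICH_LIMIT):
        print(f"mu_As = {mu:5.2f} eV -> {most_stable_termination(mu)}")
```

Adding a newly proposed structural model amounts to adding one entry to `TERMINATIONS`, after which it is ranked against all existing ones over the whole allowed range.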
While ab initio atomistic thermodynamics has a much wider applicability (see Sections 1.3–1.5), the approach followed for obtaining Figs. 3 and 4 has some limitations. Most prominently, one has to be aware that the reliability is restricted to the number of considered configurations, or in other words that only the stability of those structures plugged in can be compared. Had, for example, the surface oxide structure not been considered in Fig. 3, the p(2 × 2) adlayer structure would have yielded the lowest Gibbs free energy of adsorption in a range of µO intermediate to the stability ranges of the clean surface and the bulk oxide, changing the resulting surface phase diagram accordingly. Alternatively, it is at present not completely clear whether the (√5 × √5)R27° structure is really the only surface oxide on Pd(100). If another yet unknown surface oxide exists and exhibits a sufficiently low ΔG for some oxygen chemical potential, it will similarly affect the surface phase diagram, as would another novel and hitherto unconsidered surface reconstruction with sufficiently low surface free energy in the GaAs example. As such, appropriate care should be taken when addressing systems where only limited information about surface structures is available. With this in mind, even in such systems the
atomistic thermodynamics approach can still be a particularly valuable tool though, since it allows one, for example, to rapidly compare the stability of newly devised structural models against existing ones. In this way, it gives tutorial insight into what structural motifs may be particularly important. This may even yield ideas about other structures that one should test as well, and the theoretical identification of the ζ(4 × 2) structure in Fig. 4 by Lee et al. [41] is a prominent example. In Section 1.4 we will discuss an approach that is able to overcome this limitation. This unfortunately comes at a significantly higher computational demand, so that it has up to now only been used to study simple adsorption layers on surfaces. This will then also provide more detailed insight into the transitions between stable phases. In Figs. 3 and 4, the transitions are simply drawn abrupt, and no reference is made to the finite phase coexistence regions that should occur at finite temperatures, i.e., regions in which with changing pressure or temperature one phase gradually becomes populated and the other one depopulated. That these regions do not appear in the discussed examples is not a general deficiency of the approach, but is due to the fact that the configurational entropy contribution to the Gibbs free energy of the surface phases has been deliberately neglected in the two corresponding studies. This is justified, since for the well-ordered surface structural models considered, this contribution is indeed small and will affect only a small region close to the phase boundaries. The width of this affected phase coexistence region can even be estimated [26], but if more detailed insight into this very region is desired, or if disorder becomes more important, e.g., at more elevated temperatures, then an explicit calculation of the configurational entropy contribution will become necessary.
For this, equilibrium MC simulations as described below are the method of choice, but before we turn to them there is yet another twist to free energy plots that deserves mentioning.
1.3. “Constrained Equilibrium”
Although a thermodynamic approach can strictly describe only the situation where the surface is in equilibrium with the surrounding gas-phase (or in a metastable state), the idea is that it can still give some insight when the system is close to thermodynamic equilibrium, or even when it is only close to thermodynamic equilibrium with some of the present gas-phase species [25]. For such situations it can be useful to consider “constrained equilibria,” and one would expect to get some ideas as to where in (T, p)-space thermodynamic phases may still exist, but also to identify those regions where kinetics may control the material function.
We will discuss heterogeneous catalysis as a prominent example. Here, a constant stream of reactants is fed over the catalyst surface and the formed products are rapidly carried away. If we take the CO oxidation reaction to further specify our example, the surface would be exposed to an environment composed of O2 and CO molecules, while the produced CO2 desorbs from the catalyst surface at the technologically employed temperatures and is then transported away. Neglecting the presence of the CO2, one could therefore model the effect of an O2/CO gas-phase on the surface, in order to get some first ideas of the structure and composition of the catalyst under steady-state operation conditions. Under the assumption that the adsorption and desorption processes of the reactants occur much faster than the CO2 formation reaction, the latter would not significantly disturb the average surface population, i.e., the surface could be close to maintaining its equilibrium with the reactant gas-phase. If at all, this equilibrium holds, however, only with each gas-phase species separately. Were the latter fully equilibrated among each other, too, only the products would be present under all environmental conditions of interest. It is in fact particularly the high free energy barrier for the direct gas-phase reaction that prevents such an equilibration on a reasonable time scale, and necessitates the use of a catalyst in the first place. The situation that is correspondingly modeled in an atomistic thermodynamics approach to heterogeneous catalysis is thus a surface in “constrained equilibrium” with independent reservoirs representing all reactant gas-phase species, namely O2 and CO in the present example [25]. It should immediately be stressed though, that such a setup should only be viewed as a thought construct to get a first idea about the catalyst surface structure in a high-pressure environment.
Whereas we could write before that the surface will sooner or later necessarily equilibrate with the gas-phase in the case of a pure O2 atmosphere, this need no longer be the case for a “constrained equilibrium”. The on-going catalytic reaction at the surface consumes adsorbed reactant species, i.e., it continuously drives the surface populations away from their equilibrium value, and even more so in the interesting regions of high catalytic activity. That the “constrained equilibrium” concept can still yield valuable insight is nicely exemplified for the CO oxidation over a “Ru” catalyst [43]. For ruthenium, the tendency described above to oxidize under oxygen-rich environmental conditions is much more pronounced than for the more noble metals Pd and Ag discussed above [28]. While for the latter the relevance of (surface) oxide formation under the conditions of technological oxidation catalysis is still under discussion [28, 33, 39, 44], it is by now established that a film of bulk-like oxide forms on the Ru(0001) model catalyst during high-pressure CO oxidation, and that this RuO2(110) is the active surface for the reaction [45]. When evaluating its surface structure in “constrained equilibrium” with an O2 and CO environment, four different “surface phases” result depending on the gas-phase conditions that are now described by the chemical potentials of both
reactants, cf. Fig. 5. The “phases” differ from each other in the occupation of two prominent adsorption site types exhibited by this surface, called bridge (br) and coordinatively unsaturated (cus) sites. At very low µCO, i.e., a very low CO concentration in the gas-phase, either only the bridge, or bridge and cus sites are occupied by oxygen depending on the O2 pressure. Under increased CO concentration in the gas-phase, both the corresponding Obr/− and the Obr/Ocus phase have to compete with CO that would also like to adsorb at the cus sites, and eventually the Obr/COcus phase develops. Finally, under very reducing gas-phase conditions with a lot of CO and essentially no oxygen, a completely CO covered surface results (CObr/COcus). Under these conditions the RuO2(110) surface can at best be metastable, however, as above the white-dotted line in Fig. 5 the RuO2 bulk oxide is already unstable against CO-induced decomposition. With the already described difficulty of operating the atomic-resolution experimental techniques of surface science at high pressures, the possibility of reliably bridging the so-called pressure gap is of key interest in heterogeneous catalysis research [30, 43, 46]. The hope is that the atomic-scale understanding gained in experiments with some suitably chosen low pressure conditions would also be representative of the technological ambient pressure situation. Surface phase diagrams like the one shown in Fig. 5 could give some valuable guidance in this endeavor. If the (T, pO2, pCO) conditions of the low pressure experiment are chosen such that they lie within the stability region of the same surface phase as at high-pressures, the same surface structure and composition will be present and scalable results may be expected. If, however, temperature and pressure are varied in such a way that one crosses from one stability region to another one, different surfaces are exposed and there is no reason to hope for comparable functionality.
This would, e.g., also hold for a naive bridging of the pressure gap by simply maintaining a constant partial pressure ratio. In fact, the comparability holds not only within the regions of the stable phases themselves, but with the same argument also for the phase coexistence regions along the phase boundaries. The extent of these configurational entropy induced phase coexistence regions has been indicated in Fig. 5 by white regions. Although, as already discussed, the above-mentioned approach gives no insight into the detailed surface structure under these conditions, pronounced fluctuations due to enhanced dynamics of the involved elementary processes can generally be expected due to the vicinity of a phase transition. Since catalytic activity is based on the same dynamics, these regions are therefore likely candidates for efficient catalyst functionality [25]. And indeed, very high and comparable reaction rates have recently been noticed for different environmental conditions that all lie close to the white region between the Obr/Ocus and Obr/COcus phases. It must be stressed, however, that exactly in this region of high catalytic activity one would similarly expect the breakdown of the “constrained equilibrium” assumption of a negligible effect of the on-going reaction on the average surface structure and stoichiometry. At least everywhere in the corresponding hatched regions in Fig. 5 such kinetic effects will lead to significant deviations from the surface phases obtained within the approach described above, even at “infinite” times after steady-state has been reached. Atomistic thermodynamics may therefore be employed to identify interesting regions in phase space. Their surface coverage and structure, i.e., the very dynamic behavior, must then however be modeled by statistical mechanics explicitly accounting for the kinetics, and the corresponding kMC simulations will be discussed towards the end of the chapter.
Figure 5. Top panel: Top view of the RuO2(110) surface explaining the location of the two prominent adsorption sites (coordinatively unsaturated, cus, and bridge, br). Also shown are perspective views of the four stable phases present in the phase diagram shown below (Ru = light large spheres, O = dark medium spheres, C = white small spheres). Bottom panel: Surface phase diagram for RuO2(110) in “constrained equilibrium” with an oxygen and CO environment. Depending on the gas-phase chemical potentials (µO, µCO), br and cus sites are either occupied by O or CO, or empty (–), yielding a total of four different surface phases. For T = 300 and 600 K, this dependence is also given in the corresponding pressure scales. Regions that are expected to be particularly strongly affected by phase coexistence or kinetics are marked by white hatching (see text). Note that conditions representative for technological CO oxidation catalysis (ambient pressures, 300–600 K) fall exactly into one of these ranges (after Refs. [25, 26]).
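The “constrained equilibrium” construction of this section, with two independent reservoirs, can be sketched as below. The sketch assumes that the br and cus sites do not interact, so each site simply takes the occupant with the lowest value of (binding energy − chemical potential); this simplification and all binding energies are hypothetical and are not the RuO2(110) values behind Fig. 5. The helper `scale_pressures` uses the ideal gas relations, under which multiplying both partial pressures by the same factor shifts µCO by kB T ln(factor) but µO by only half of that.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

# Hypothetical binding energies in eV, relative to 1/2 O2 and to gas-phase CO.
BINDING = {
    ("br", "O"): -2.3,
    ("br", "CO"): -1.6,
    ("cus", "O"): -1.0,
    ("cus", "CO"): -1.3,
}


def site_occupant(site, mu_o, mu_co):
    """Occupant ('-', 'O' or 'CO') that minimizes the free energy of one site."""
    candidates = {
        "-": 0.0,  # empty site is the reference
        "O": BINDING[(site, "O")] - mu_o,
        "CO": BINDING[(site, "CO")] - mu_co,
    }
    return min(candidates, key=candidates.get)


def surface_phase(mu_o, mu_co):
    """Phase label such as 'Obr/COcus' for given chemical potentials (eV)."""
    labels = []
    for site in ("br", "cus"):
        occupant = site_occupant(site, mu_o, mu_co)
        labels.append("-" if occupant == "-" else occupant + site)
    return "/".join(labels)


def scale_pressures(mu_o, mu_co, temperature, factor):
    """Chemical potentials after multiplying both partial pressures by factor."""
    shift = K_B * temperature * math.log(factor)
    return mu_o + 0.5 * shift, mu_co + shift


if __name__ == "__main__":
    start = (-0.6, -1.1)
    print("high pressure:", surface_phase(*start))
    low = scale_pressures(*start, temperature=600.0, factor=1e-10)
    print("low pressure :", surface_phase(*low))
```

In this toy example, lowering both pressures at a constant ratio moves the system out of its original stability region, which is the situation the text warns about for a naive bridging of the pressure gap.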
1.4. Ab Initio Lattice-gas Hamiltonian
The predictive power of the approach discussed in the previous sections extends only to the structures that are directly considered, i.e., it cannot predict the existence of unanticipated geometries or stoichiometries. To overcome this limitation, and to treat phase coexistence and order–disorder transitions in a more general and systematic way, configuration space must be properly sampled, instead of considering only a set of plausible structural models. Modern statistical mechanical methods like Monte Carlo (MC) simulations are designed to fulfill precisely this purpose efficiently [6, 47]. The straightforward matching with electronic structure theories would thus be to determine with DFT the energetics of all system configurations generated in the course of the statistical simulation. Unfortunately, this direct linking is currently, and for the foreseeable future, computationally infeasible. The exceedingly large configuration spaces of most materials science problems require a prohibitively large number of free energy evaluations (which can easily exceed $10^6$ for moderately complex systems), including also disordered configurations. With the direct matching impossible, an efficient alternative is to map the real system onto a simpler, typically discretized model system, the Hamiltonian of which is sufficiently fast to evaluate. This then enables us to evaluate the extensive number of free energies required by the statistical mechanics. Obvious uncertainties of this approach are how appropriately the model system represents the real system, and how its parameters can be determined from the first-principles calculations. The advantage, on the other hand, is that such a detour via an appropriate (“coarse-grained”) model system often provides deeper insight into and understanding of the ruling mechanisms.
If the considered problem can be described by a lattice defining the possible sites for the species in the system, a prominent example of such a mapping approach is given by the concept of a lattice-gas Hamiltonian (LGH) (or, in other terminology, an “Ising-type model” [48] or a “cluster-expansion” [49, 50]). Here, any system state
is defined by the occupation of the sites in the lattice and the total energy of any configuration is expanded into a sum of discrete interactions between these lattice sites. For a one-component system with only one site type, the LGH would then for example read (with obvious generalizations to multicomponent, multi-site systems):
\[
H = F \sum_i n_i + \sum_{m=1}^{p} V_m^{\mathrm{pair}} \sum_{(ij)_m} n_i n_j + \sum_{m=1}^{q} V_m^{\mathrm{trio}} \sum_{(ijk)_m} n_i n_j n_k + \ldots , \qquad (1)
\]
where the site occupation numbers $n_l = 0$ or $1$ tell whether site $l$ in the lattice is empty or occupied, and $F$ is the free energy of an isolated species at this lattice site, including static and vibrational contributions. There are $p$ pair interactions with two-body (or pair) interaction energies $V_m^{\mathrm{pair}}$ between species at $m$th nearest neighbor sites, and $q$ trio interactions with three-body interaction energies $V_m^{\mathrm{trio}}$. The sum labels $(ij)_m$ (and $(ijk)_m$) indicate that the sums run over all pairs of sites $(ij)$ (and three sites $(ijk)$) that are separated by $m$ lattice constants. Formally, higher and higher order interaction terms (quattro, quinto, . . . ) would follow in this infinite expansion. In practice, the series must (and can) be truncated after a finite number of terms. Figure 6 illustrates some of these interactions for the case of a two-dimensional
Figure 6. (a) Illustration of some types of lateral interactions for the case of a two-dimensional adsorbate layer (small dark spheres) that can occupy the two distinct threefold hollow sites of a (111) close-packed surface. $V_m^{\mathrm{pair}}$ ($m = 1, 2, 3$) are two-body (or pair) interactions at first, second, and third nearest neighbor distances of like hollow sites (i.e., fcc–fcc or hcp–hcp). $V_m^{\mathrm{trio}}$ ($m = 1, 2, 3$) are the three possible three-body (or trio) interactions between three atoms in like nearest neighbor hollow sites, and $V_m^{\mathrm{pair(h,f)}}$ ($m = 1, 2, 3$) represent pair interactions between atoms that occupy unlike hollow sites (i.e., one in fcc and the other in hcp or vice versa). (b) Example of an adsorbate arrangement from which an expression can be obtained for use in solving for interaction parameters. The (3 × 3) periodic surface unit-cell is indicated by the large darker spheres. The arrows indicate interactions between the adatoms. Apart from the obvious first nearest-neighbor interactions (short arrows), third nearest-neighbor two-body interactions (long arrows) also exist, due to the periodic images outside of the unit cell.
adsorbate layer that can occupy the two distinct threefold hollow sites of a (111) close-packed surface. In particular, the pair interactions up to third nearest neighbor between like and unlike hollow sites are shown, as well as three possible trio interactions between adsorbates in like sites. It is apparent that such a LGH is very general. The Hamiltonian can be evaluated equally well for any lattice occupation, be it dense or sparse, periodic or disordered. In all cases this merely requires an algebraic sum over a finite number of terms, i.e., it is computationally very fast. The disadvantage, on the other hand, is that for more complex systems with multiple sites and several species, the number of interaction terms in the expansion increases rapidly. Which of these (far-reaching or multi-body) interaction terms need to be considered, i.e., where the sum in Eq. (1) may be truncated, and how the interaction energies in these terms may be determined, is the really sensitive part of such a LGH approach and must be carefully checked. The methodology in itself is not new, and traditionally the interatomic interactions have often been assumed to be just pairwise additive (i.e., higher-order terms beyond pair interactions were neglected); the interaction energies were then obtained by simply fitting to experimental data (see, e.g., [51–53]). This procedure obviously results in “effective parameters” with an unclear microscopic basis, “hiding” or “masking” the effect and possible importance of three-body (trio) and higher-order interactions. As a consequence, while the Hamiltonian may be able to reproduce the specific experimental data to which the parameters were fitted, it is unlikely to be general and transferable to calculations of other properties of the system.
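The statement that evaluating a LGH is just a finite algebraic sum can be made concrete with a small sketch. The example below is not the hexagonal two-site lattice of Fig. 6; it assumes a periodic square lattice with one site type, three pair shells, and a single linear trio term, all of which are illustrative choices. The names `PAIR_SHELLS`, `LINEAR_TRIOS`, and `lgh_energy` are hypothetical.

```python
# Minimal sketch of evaluating a truncated lattice-gas Hamiltonian, Eq. (1).
# Assumptions (not from the text): periodic square lattice, one site type,
# pair shells at distances 1, sqrt(2), 2, and one linear three-in-a-row trio.

# Half of the neighbor vectors per shell, so every pair is counted once.
PAIR_SHELLS = [
    [(1, 0), (0, 1)],    # first nearest neighbors
    [(1, 1), (1, -1)],   # second nearest neighbors
    [(2, 0), (0, 2)],    # third nearest neighbors
]
# Each trio is given by the offsets of the two partner sites.
LINEAR_TRIOS = [((1, 0), (2, 0)), ((0, 1), (0, 2))]


def lgh_energy(occ, F, v_pair, v_trio):
    """Return H for an L x L occupation matrix of 0/1 entries (periodic)."""
    L = len(occ)

    def n(x, y):
        return occ[x % L][y % L]  # periodic boundary conditions

    energy = 0.0
    for x in range(L):
        for y in range(L):
            if not occ[x][y]:
                continue
            energy += F  # on-site term F * n_i
            for m, offsets in enumerate(PAIR_SHELLS):
                for dx, dy in offsets:
                    energy += v_pair[m] * n(x + dx, y + dy)
            for (d1, d2) in LINEAR_TRIOS:
                energy += v_trio * n(x + d1[0], y + d1[1]) * n(x + d2[0], y + d2[1])
    return energy


# Example with made-up parameters (eV): two adsorbates on neighboring sites.
lattice = [[0] * 4 for _ in range(4)]
lattice[0][0] = lattice[0][1] = 1
example_energy = lgh_energy(lattice, F=-1.0, v_pair=[0.1, 0.05, 0.02], v_trio=0.01)
```

The cost scales with the number of occupied sites times the number of retained interaction terms, which is why such a Hamiltonian can be called millions of times inside a statistical simulation.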
Indeed, the decisive contribution to the observed behavior of adparticles by higher-order, many-atom interactions has in the meanwhile been pointed out by a number of studies (see, e.g., [54–58]). As an alternative to this empirical procedure, the lateral interactions between the particles in the lattice can be deduced from detailed DFT calculations, and it is this approach in combination with the statistical mechanics methods that is of interest for this chapter. The straightforward way to do this is to directly compute these interactions as differences of calculations, with different occupations at the corresponding lattice sites. For the example of a pair interaction between two adsorbates at a surface, this would translate into two DFT calculations where only either one of the adsorbates sits at its lattice site, and one calculation where both are present simultaneously. Unfortunately, this type of approach is hard to combine with the periodic boundary conditions that are typically required to describe the electronic structure of solids and surfaces [16]. In order to avoid interactions with the periodic images of the considered lattice species, huge (actually often prohibitively large) supercells would be required. A more efficient and intelligent way of addressing the problem is instead to specifically exploit the interaction with the periodic images. For this, different configurations in various (feasible)
supercells are computed with DFT, and the obtained energies expressed in terms of the corresponding interatomic interactions. Figure 6 illustrates this for the case of two adsorbed atoms in a laterally periodic surface unit-cell. Due to this periodicity, each atom has images in the neighboring cells. Because of these images, each of the atoms in the unit-cell experiences not only the obvious pair interaction at the first neighbor distance, but also a pair interaction at the third neighbor distance (neglecting higher pairwise or multi-body interactions for the moment). The computed DFT binding energy for this configuration $i$ can therefore be written as $E_{\mathrm{DFT}}^{(3\times3),i} = 2E + 2V_1^{\mathrm{pair}} + 2V_3^{\mathrm{pair}}$. Doing this for a set of different configurations thus generates a system of linear equations that can be solved for the interaction energies either by direct inversion or, if more configurations than interaction parameters were computed, by fitting techniques. The crucial aspect in this procedure is the number and type of interactions to include in the LGH expansion, and the number and type of configurations that are computed to determine them. We note that there is no a priori way to know at how many, and what type of, interactions to terminate the expansion. While there are some attempts to automate this procedure [59–61], it is probably fair to say that the actual implementation remains to date a delicate task. Some guidelines to judge the convergence of the constructed Hamiltonian include its ability to predict the energies of a number of DFT-computed configurations that were not employed in the fit, or that it reproduces the correct lowest-energy configurations at T = 0 K (so-called “ground-state line”) [50].
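The linear system described above, and the check against configurations left out of the fit, can be sketched as follows. This is a minimal pure-Python illustration, not the procedure of Refs. [59–61]. Only the first row of `COUNTS` (2, 2, 2) comes from the text; the other rows, the "true" parameters, and the energies generated from them are hypothetical placeholders standing in for DFT data. The function names are likewise invented for this sketch.

```python
# Sketch: determine (E, V1_pair, V3_pair) from configuration energies by
# linear least squares, then estimate predictive quality by leave-one-out
# cross-validation. All numbers are synthetic, for illustration only.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("configurations do not determine all parameters")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_lgh(counts, energies):
    """Least-squares parameters from the normal equations (A^T A) p = A^T E."""
    k = len(counts[0])
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(k)] for i in range(k)]
    atb = [sum(row[i] * e for row, e in zip(counts, energies)) for i in range(k)]
    return solve_linear(ata, atb)


def loo_cv(counts, energies):
    """Root-mean-square error when each configuration is predicted by a fit
    that did not include it."""
    sq = []
    for i in range(len(counts)):
        p = fit_lgh(counts[:i] + counts[i + 1:], energies[:i] + energies[i + 1:])
        pred = sum(c * v for c, v in zip(counts[i], p))
        sq.append((pred - energies[i]) ** 2)
    return (sum(sq) / len(sq)) ** 0.5


TRUE_PARAMS = [-2.0, 0.30, -0.05]   # hypothetical E, V1, V3 in eV
COUNTS = [[2, 2, 2], [1, 0, 0], [2, 1, 0], [3, 3, 1], [2, 0, 2]]
ENERGIES = [sum(c * p for c, p in zip(row, TRUE_PARAMS)) for row in COUNTS]

fitted = fit_lgh(COUNTS, ENERGIES)
cv_error = loo_cv(COUNTS, ENERGIES)
```

With real DFT energies the cross-validation error would not vanish; a value that stays small as interaction terms are added or removed is one practical indicator that the truncation of the expansion is adequate.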
1.5. Equilibrium Monte Carlo Simulations
Once an accurate LGH has been constructed, one has at hand a very fast and flexible tool to provide the energies of arbitrary system configurations. This may in turn be used for MC simulations to obtain a good sampling of the available configuration space, i.e., to determine the partition function of the system. An important aspect of modern MC techniques is that this sampling is done very efficiently by concentrating on those parts of the configuration space that contribute significantly to the latter. The Metropolis algorithm [62], as a famous example of such so-called importance sampling schemes, proceeds therefore by generating at random new system configurations. If the new configuration exhibits a lower energy than the previous one, it is automatically “accepted” to a gradually built-up sequence of configurations. And even if the configuration has a higher energy, it still has an appropriately Boltzmann weighted probability to make it to the considered set. Otherwise it is “rejected” and the last configuration copied anew to the sequence. This way, the algorithm preferentially samples low energy configurations, which contribute most to the partition function. The acceptance criteria of the Metropolis, and of other
importance sampling schemes, furthermore fulfill detailed balance. This means that the forward probability of accepting a new configuration j from state i is related to the backward probability of accepting configuration i from state j by the free energy difference of both configurations. Taking averages of system observables over the thus generated configurations yields then their correct thermodynamic average for the considered ensemble. Technical issues regard finally how new trial configurations are generated, or how long and in what system size the simulation must be run in order to obtain good statistical averages [6, 47]. The kind of insights that can be gained by such a first-principles LGH + MC approach is nicely exemplified by the problem of on-surface adsorption at a close-packed surface, when the latter is in equilibrium with a surrounding gas-phase. If this environment consists of oxygen, this would, e.g., contribute to the understanding of one of the early oxidation stages sketched in Fig. 2. What would be of interest is for instance to know how much oxygen is adsorbed at the surface given a certain temperature and pressure in the gas-phase, and whether the adsorbate forms ordered or disordered phases. As outlined above, the approach proceeds by first determining a LGH from a number of DFT-computed ordered adsorbate configurations. This is followed by grand-canonical MC simulations, in which new trial system configurations are generated by randomly adding or removing adsorbates from the lattice positions and where the energies of these configurations are provided by the LGH. Evaluating appropriate order parameters that check on prevailing lateral periodicities in the generated sequence of configurations, one may finally plot the phase diagram, i.e., what phase exists under which (T, p)-conditions (or equivalently (T, µ)-conditions) in the gas-phase. The result of one of the first studies of this kind is shown in Fig. 7 for the system O/Ru(0001). 
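The grand-canonical Metropolis procedure outlined above can be sketched in a few lines. The example assumes a periodic square lattice and a LGH truncated after the first-neighbor pair term, which is far simpler than the fifteen-parameter Hamiltonian used for O/Ru(0001). The function name `gcmc_coverage` and all parameter values are hypothetical.

```python
import math
import random

KB = 8.617333262e-5  # Boltzmann constant in eV/K


def gcmc_coverage(L, T, mu, F, v1, sweeps=600, equil=200, seed=0):
    """Average coverage from grand-canonical Metropolis sampling.

    Trial moves add or remove one adsorbate at a random site; the quantity
    entering the acceptance test is the change in H - mu*N.
    """
    rng = random.Random(seed)
    occ = [[0] * L for _ in range(L)]
    beta = 1.0 / (KB * T)
    n_ads = 0
    total = 0.0
    for sweep in range(sweeps):
        for _ in range(L * L):
            x, y = rng.randrange(L), rng.randrange(L)
            nn = (occ[(x + 1) % L][y] + occ[(x - 1) % L][y]
                  + occ[x][(y + 1) % L] + occ[x][(y - 1) % L])
            sign = -1 if occ[x][y] else 1          # remove or add
            delta = sign * (F + v1 * nn - mu)      # change in H - mu*N
            # Metropolis criterion: always accept downhill moves, accept
            # uphill moves with Boltzmann probability.
            if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
                occ[x][y] ^= 1
                n_ads += sign
        if sweep >= equil:
            total += n_ads / (L * L)
    return total / (sweeps - equil)


# Non-interacting check: the coverage should approach the Langmuir value
# 1 / (1 + exp((F - mu) / kT)), here 0.75.
theta = gcmc_coverage(L=10, T=300.0, mu=-1.0 + KB * 300.0 * math.log(3.0),
                      F=-1.0, v1=0.0)
```

Scanning `mu` and `T` and monitoring an order parameter in addition to the coverage is what would turn such a sampler into a phase-diagram calculation of the kind shown in Fig. 7.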
The employed LGH comprised two types of adsorption sites, namely the hcp and fcc hollows, lateral pair interactions up to third neighbor and three types of trio interactions between like and unlike sites, thus amounting to a total of fifteen independent interaction parameters. At low temperature, the simulations yield a number of ordered phases corresponding to different periodicities and oxygen coverages. Two of these ordered phases had already been reported experimentally at the time the work was carried out. The prediction of two new (higher coverage) periodic structures, namely a 3/4 and a 1 monolayer phase, has in the meanwhile been confirmed by various experimental studies. This example thus demonstrates the predictive nature of the first-principles approach, and the stimulating and synergetic interplay between theory and experiment. It is also worth pointing out that these new phases and their coexistence in certain coverage regions were not obtained in early MC calculations of this system based on an empirical LGH, which was determined by simply fitting a minimal number of pair interactions to the then available experimental phase diagram [51]. We also like to
Figure 7. Phase diagram for O/Ru(0001) as obtained using the ab initio LGH approach in combination with MC calculations. The triangles indicate first order transitions and the circles second order transitions. The identified ordered structures are labeled as: (2×2)-O (A), (2×1)-O (B), $(\sqrt{3} \times \sqrt{3})R30^\circ$ (C), (2×2)-3O (D), and disordered lattice-gas (l.g.) (from Ref. [63]). [Plot axes: chemical potential (eV), 0.00 to 1.00, versus T (K), 200 to 800.]
stress the superior transferability of the first-principles interaction parameters. As an example we name simulations of temperature programmed desorption (TPD) spectra, which can among other possibilities be obtained by combining the LGH with a transfer-matrix approach and kinetic rate equations [61]. Figure 8 shows the result obtained with exactly the same LGH that also underlies the phase diagram of Fig. 7. Although empirical fits of TPD spectra may give better agreement between calculated and experimental results, we note that the agreement visible in Fig. 8 is in fact quite good. The advantage, on the other hand, is that no empirical parameters were used in the LGH, which makes it possible to trace the TPD features back unambiguously to lateral interactions with well-defined microscopic meaning. The results summarized in Fig. 7 also illustrate the already mentioned differences between the initially described free energy plots and the LGH + MC method. In the first approach, the stability of a fixed set of configurations is compared in order to arrive at the phase diagram. Consider, for example, that we had restricted our free energy analysis of the O/Ru(0001) system to only the O(2 × 2) and O(2 × 1) adlayer structures, which were the two experimentally known ordered phases before 1995. The stability region of the former phase, bounded at lower chemical potentials by the clean surface and at higher chemical potentials by the O(2 × 1) phase, then comes
Figure 8. Theoretical (left panel) and experimental (right panel) temperature programmed desorption curves. Each curve shows the rate of oxygen molecules that desorb from the Ru(0001) surface as a function of temperature, when the system is prepared with a given initial oxygen coverage θ ranging from 0.1 to 1 monolayer (ML). The first-principles LGH employed in the calculations is exactly the same as the one underlying the phase diagram of Fig. 7 (from Refs. [57, 58]). [Plot axes: O2 desorption rate (ML/s), 0.00 to 0.05, versus temperature (K), 800 to 1600; curves labeled by initial coverage, e.g., θ = 0.1, 0.8, and 1.0.]
out just as it does in Fig. 7. This stability range will, however, be independent of temperature, i.e., there is no order–disorder transition at higher temperature, because configurational entropy is neglected. More importantly, since the two higher-coverage phases would not have been explicitly considered, the stability of the O(2 × 1) phase would falsely extend over the whole higher chemical potential range. One would have to include these two configurations in the analysis to obtain the right result shown in Fig. 7, whereas the LGH + MC method yields them automatically. While this emphasizes the deeper insight and increased predictive power that is achieved by the proper sampling of configuration space in the LGH + MC technique, one must also recognize that the computational cost of the latter is significantly higher. It is, in particular, straightforward to directly compare the stability of qualitatively different geometries like the on-surface adsorption and the surface oxide phases in Fig. 3 in a free energy plot (or the various surface reconstructions entering Fig. 4). Setting up an LGH that would equally describe both systems, on the other hand, is far from trivial. Even if it were feasible to find a generalized lattice that would be able to treat all system states, disentangling and determining the manifold of interaction energies in such a lattice would be very involved. The required discretization of the real system, i.e., the mapping onto a lattice, is therefore to date the major limitation of the LGH + MC technique, be it applied to two-dimensional pure surface systems or
even worse to three-dimensional problems addressing a surface fringe of finite width. Still, it is also precisely this mapping and the resulting very fast analysis of the properties of the LGH that allows for an extensive and reliable sampling of the configuration space of complex systems that is hitherto unparalleled by other approaches. Having highlighted the importance of this sampling for the determination of unanticipated new ordered phases at lower temperatures, the final example in this section illustrates specifically the decisive role it also plays for the simulation and understanding of order–disorder transitions at elevated temperatures. A particularly intriguing transition of this kind is observed for Na on Al(001). The interest in such alkali metal adsorption systems has been intense, especially since in the early 1990s it was found (first for Na on Al(111) and then on Al(100)) that the alkali metal atoms may kick out surface Al atoms and adsorb substitutionally [65–67]. This was in sharp contrast to the “experimental evidence” and the generally accepted understanding of the time, which was that alkali-metal atoms adsorb in the highest coordinated on-surface hollow site, and cause little disturbance to a close-packed metal surface. For the specific system Na on Al(001) at a coverage of 0.2 monolayer, recent low energy electron diffraction experiments furthermore observed a reversible phase transition in the temperature range 220 K–300 K. Below this range, an ordered $(\sqrt{5} \times \sqrt{5})R27^\circ$ structure forms, where the Na atoms occupy surface substitutional sites, while above it, the Na atoms, still in the substitutional sites, form a disordered arrangement in the surface. Using the ab initio LGH + MC approach the ordered phase and the order–disorder transition could be successfully reproduced [67]. Pair interactions up to the sixth nearest neighbor and two different trio interactions, as well as one quarto interaction were included in the LGH expansion.
We note that determining these interaction parameters requires care and careful cross-validation. To specifically identify the crucial role played by configurational entropy in the temperature-induced order–disorder transition, a specific MC algorithm proposed by Wang and Landau [68] was employed. In contrast to the above outlined Metropolis algorithm, this scheme affords an explicit calculation of the density of configuration states, $g(E)$, i.e., the number of system configurations with a certain energy $E$. This quantity provides in turn all major thermodynamic functions, e.g., the canonical distribution at a given temperature, $g(E)\,e^{-E/k_B T}$, the free energy, $F(T) = -k_B T \ln\left(\sum_E g(E)\,e^{-E/k_B T}\right) = -k_B T \ln(Z)$, where $Z$ is the partition function, the internal energy, $U(T) = \left[\sum_E E\,g(E)\,e^{-E/k_B T}\right]/Z$, and the entropy $S = (U - F)/T$. Figure 9 shows the calculated density of configuration states $g(E)$, together with the internal and free energy derived from it. In the latter two quantities, the abrupt change corresponding to the first-order phase transition obtained at 301 K can be nicely discerned. This is also visible as a double peak in the logarithm of the canonical distribution (Fig. 9(a), inset) and as a singularity
Figure 9. (a) Calculated density of configuration states, $g(E)$, for Na on Al(100) at a coverage of 0.2 monolayers. Inset: Logarithm of the canonical distribution $P(E, T) = g(E)\,e^{-E/k_B T}$, at the critical temperature. (b) Free energy $F(T)$ and internal energy $U(T)$ as a function of temperature, derived from $g(E)$. The cusp in $F(T)$ and discontinuity in $U(T)$ at 301 K reflect the occurrence of the disorder–order phase transition, experimentally observed in the range 220–300 K (from Ref. [67]).
in the specific heat at the critical temperature (not shown) [67]. It can be seen that the free energy decreases notably with increasing temperature. The reason for this is clearly the entropic contribution (difference in the free and internal energies), the magnitude of which suddenly increases at the transition temperature and continues to increase steadily thereafter. Taking this configurational entropy into account is therefore (and obviously) the crucial aspect in the simulation and understanding of this order–disorder phase transition, and only the LGH+MC approach with its proper sampling of configuration space can provide it. What the approach does not yield, on the other hand, is how the phase transition actually takes place microscopically, i.e., how the substitutional Na atoms move their positions by necessarily displacing surface Al atoms, and on what time scale (with what kinetic hindrance) this all happens. For this, one necessarily needs to go beyond a thermodynamic description, and explicitly follow the kinetics of the system over time, which will be the topic of the following section.
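How the thermodynamic functions follow once $g(E)$ is known can be illustrated with a short post-processing sketch. It does not implement the Wang–Landau sampling itself; instead it assumes that $\ln g(E)$ is already available on a discrete energy grid. The toy density of states used here (N independent two-level sites, for which $g$ is a binomial coefficient) is an assumption chosen because its partition function is known in closed form. The function name `thermo_from_dos` is hypothetical.

```python
import math

KB = 8.617333262e-5  # Boltzmann constant in eV/K


def thermo_from_dos(energies, ln_g, T):
    """Return (F, U, S) from a tabulated density of states.

    Works with ln g(E) and subtracts the largest exponent before
    exponentiating, since g(E) itself easily overflows for real systems.
    """
    beta = 1.0 / (KB * T)
    w = [lg - beta * e for e, lg in zip(energies, ln_g)]  # ln of g(E) exp(-E/kT)
    w_max = max(w)
    norm = sum(math.exp(x - w_max) for x in w)
    ln_z = w_max + math.log(norm)                         # ln Z
    free = -KB * T * ln_z                                 # F = -kT ln Z
    internal = sum(e * math.exp(x - w_max) for e, x in zip(energies, w)) / norm
    entropy = (internal - free) / T                       # S = (U - F) / T
    return free, internal, entropy


# Toy model: N sites, each either in its ground state or excited by eps.
N, EPS = 20, 0.05
toy_energies = [k * EPS for k in range(N + 1)]
toy_ln_g = [math.log(math.comb(N, k)) for k in range(N + 1)]
F_300, U_300, S_300 = thermo_from_dos(toy_energies, toy_ln_g, 300.0)
```

Evaluating the same function on a fine temperature grid is inexpensive, because $g(E)$ does not depend on temperature; this is what makes curves such as those in Fig. 9(b) available from a single density-of-states calculation.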
2. First-Principles Kinetic Monte Carlo Simulations
Up to now we have discussed how equilibrium MC simulations can be used to explicitly evaluate the partition function, in order to arrive at surface phase diagrams as a function of temperature and partial pressures of the surrounding gas-phase. For this, statistical averages over a sequence of appropriately sampled configurations were taken, and it is appealing to also connect some time evolution to this sequence of generated configurations (MC steps). In fact, certain nonequilibrium problems can already be tackled on the basis of this uncalibrated “MC time” [47]. The reason why this does not work in general is twofold. First, equilibrium MC is designed to achieve an optimum sampling of configurational space. As such, even unphysical MC moves, like a particle hop from an occupied site to an unoccupied one hundreds of lattice spacings away, may be allowed if they help to obtain an efficient sampling of the relevant configurations. The remedy for this obstacle is straightforward, though, as one only needs to restrict the possible MC moves to “physical” elementary processes. The second reason is more involved, as it has to do with the probabilities with which the individual events are executed. In equilibrium MC the forward and backward acceptance probabilities of time-reversed processes, like hops back and forth between two sites, only have to fulfill the detailed balance criterion, and this is not enough to establish a proper relationship between MC time and “real time” [69]. In kinetic Monte Carlo simulations (kMC) a proper relationship between MC time and real time is achieved by interpreting the MC process as providing a numerical solution to the Markovian master equation describing the
dynamic system evolution [70–74]. The simulation itself still looks superficially similar to equilibrium MC in that a sequence of configurations is generated using random numbers. At each configuration, however, all possible elementary processes and the rates with which they occur are evaluated. Appropriately weighted by these different rates one of the possible processes is then executed randomly to achieve the new system configuration, as sketched in Fig. 10. This way, the kMC algorithm effectively simulates stochastic processes, and a direct and unambiguous relationship between kMC time and real time can be established [74]. Not only does this open the door to a treatment of the kinetics of nonequilibrium problems, but also it does so very efficiently, since the time evolution is actually coarse-grained to the really decisive rare events, passing over the irrelevant short-time dynamics. Time scales of the order of seconds or longer for mesoscopically-sized systems are therefore readily accessible by kMC simulations [12].
Figure 10. Flow diagram illustrating the basic steps in a kMC simulation. (i) Loop over all lattice sites of the system and determine the atomic processes that are possible for the current system configuration. Calculate or look up the corresponding rates. (ii) Generate two random numbers, (iii) advance the system according to the process selected by the first random number (this could, e.g., be moving an atom from one lattice site to a neighboring one, if the corresponding diffusion process was selected). (iv) Increment the clock according to the rates and the second random number, as prescribed by an ensemble of Poisson processes, and (v) start all over or stop, if a sufficiently long time span has been simulated.
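Steps (ii) to (iv) of the flow diagram can be written down compactly. The sketch below assumes that step (i) has already produced a list of possible processes with their rates for the current configuration; how that list is built and how the chosen process changes the lattice is system specific and left out. The function name `kmc_step` and the process labels are hypothetical.

```python
import math
import random


def kmc_step(rates, rng):
    """Select one process and a time increment.

    rates: list of (label, rate) pairs for all processes possible in the
    current configuration. Returns (chosen_label, dt).
    """
    total = sum(r for _, r in rates)
    # First random number: pick a process with probability rate / total.
    target = rng.random() * total
    acc = 0.0
    chosen = rates[-1][0]  # fallback guards against floating-point round-off
    for label, r in rates:
        acc += r
        if target < acc:
            chosen = label
            break
    # Second random number: exponentially distributed waiting time of the
    # combined Poisson process, with mean 1 / total.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


rng = random.Random(42)
clock = 0.0
history = []
for _ in range(5):
    label, dt = kmc_step([("hop_left", 1.0), ("hop_right", 3.0)], rng)
    clock += dt
    history.append(label)
```

The linear search over the process list is adequate for a handful of processes; for large lattices a binned or tree-based selection is a common alternative, at the price of more bookkeeping.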
2.1. Insights from MD, MC, and kMC
To further clarify the different insights provided by molecular dynamics, equilibrium and kinetic Monte Carlo simulations, consider the simple, but typical rare event type model system shown in Fig. 11. An isolated adsorbate vibrates at finite temperature T with a frequency on the picosecond time scale and diffuses about every microsecond between two neighboring sites of different stability. In terms of a PES, this situation is described by two stable minima of different depths separated by a sizable barrier. Starting with the particle in either of the two sites, a MD simulation would follow the thermal motion of the adsorbate in detail. In order to do this accurately, time steps in the femtosecond range are required. Before the first diffusion event can be observed at all, of the order of $10^9$ time steps therefore have to be calculated, in which the particle does nothing but vibrate around the stable minimum. Computationally this is infeasible for any but the simplest model systems, and even if it were feasible it would obviously not be an efficient tool to study the long-term time evolution of this system. For Monte Carlo simulations on the other hand, the system first has to be mapped onto a lattice. This is unproblematic for the present model and results
Figure 11. Schematic potential energy surface (PES) representing the thermal diffusion of an isolated adsorbate between two stable lattice sites A and B of different stability. A MD simulation would explicitly follow the dynamics of the vibrations around a minimum, and is thus inefficient to address the rare diffusion events happening on a much longer time scale. Equilibrium Monte Carlo simulations provide information about the average thermal occupation of the two sites, $\langle N \rangle$, based on the depth of the two PES minima ($E_A$ and $E_B$). Kinetic Monte Carlo simulations follow the “coarse-grained” time evolution of the system, $N(t)$, employing the rates for the diffusion events between the minima ($r_{A \to B}$, $r_{B \to A}$). For this, PES information not only about the minima, but also about the barrier height at the transition state (TS) between initial and final state is required ($\Delta E_A$, $\Delta E_B$).
in two possible system states with the particle being in one or the other minimum. Equilibrium Monte Carlo provides then only time-averaged information about the equilibrated system. For this, a sequence of configurations with the system in either of the two system states is generated, and considering the higher stability of one of the minima, appropriately more configurations with the system in this state are sampled. When taking the average, one arrives at the obvious result that the particle is with a certain higher (Boltzmann-weighted) probability in the lower minimum than in the higher one. Real information on the long-term time-evolution of the system, i.e., focusing on the rare diffusion events, is finally provided by kMC simulations. For this, first the two rates of the diffusion events from one system state to the other and vice versa have to be known. We will describe below that they can be obtained from knowledge of the barrier between the two states and the vibrational properties of the particle in the minima and at the barrier, i.e., from the local curvatures. A lot more information on the PES is therefore required for a kMC simulation than for equilibrium MC, which only needs input about the PES minima. Once the rates are known, a kMC simulation starting from any arbitrary system configuration will first evaluate all possible processes and their rates and then execute one of them with appropriate probability. In the present example, this list of events is trivial, since with the particle in either minimum only the diffusion to the other minimum is possible. When the event is executed, on average the time $(\mathrm{rate})^{-1}$ has passed and the clock is advanced accordingly. Note that as described initially, the rare diffusion events happen on a time scale of nano- to microseconds, i.e., with only one executed event the system time will be directly incremented by this amount.
In other words, the time is coarse-grained to the rare event time, and all the short-time dynamics (corresponding in the present case to the picosecond vibrations around the minimum) are efficiently contained in the process rate itself. Since the barrier seen by the particle when in the shallower minimum is lower than when in the deeper one, cf. Fig. 11, the rate to jump into the deeper minimum will correspondingly be higher than the one for the backwards jump. When generating the sequence of configurations, the clock will therefore on average be advanced by a larger amount for each diffusion event from deep to shallow than for the reverse process. When taking a long-time average, describing then the equilibrated system, one therefore arrives necessarily at the result that the particle is on average longer in the lower minimum than in the higher one. This is identical to the result provided by equilibrium Monte Carlo, and if only this information is required, the latter technique would most often be the much more efficient way to obtain it. kMC, on the other hand, has the additional advantage of shedding light on the detailed time-evolution itself, and can in particular also follow the explicit kinetics of systems that are not (or not yet) in thermal equilibrium.
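The two-site model lends itself to a direct numerical check that the long-time kMC average reproduces the Boltzmann occupation. The sketch below assumes Arrhenius-type rates with the same prefactor for both hop directions, which is a simplification not stated in the text. The energies, the prefactor, and the function name `two_site_kmc` are hypothetical.

```python
import math
import random

KB = 8.617333262e-5  # Boltzmann constant in eV/K


def two_site_kmc(e_a, e_b, e_ts, T, prefactor=1e12, n_events=20000, seed=1):
    """Time-weighted occupation of site A for hops A <-> B over one barrier.

    e_a, e_b: energies of the two minima; e_ts: energy of the transition
    state (all in eV). Returns (fraction_of_time_in_A, total_time_in_s).
    """
    rng = random.Random(seed)
    kT = KB * T
    # Rate of leaving each site; the barrier is measured from that minimum.
    rate = {"A": prefactor * math.exp(-(e_ts - e_a) / kT),
            "B": prefactor * math.exp(-(e_ts - e_b) / kT)}
    state = "A"
    time_in = {"A": 0.0, "B": 0.0}
    for _ in range(n_events):
        # Only one process is possible in each state, so no selection step
        # is needed; the waiting time is exponentially distributed.
        dt = -math.log(1.0 - rng.random()) / rate[state]
        time_in[state] += dt
        state = "B" if state == "A" else "A"
    total = time_in["A"] + time_in["B"]
    return time_in["A"] / total, total


p_a, t_total = two_site_kmc(e_a=-1.00, e_b=-0.95, e_ts=-0.50, T=300.0)
# Equilibrium (Boltzmann) occupation of A for comparison.
p_a_boltzmann = 1.0 / (1.0 + math.exp(-(-0.95 - (-1.00)) / (KB * 300.0)))
```

The simulated time `t_total` covers thousands of hops with one random number per event, whereas an MD trajectory of the same length would need of the order of a billion femtosecond steps per hop according to the estimate given above.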
From the discussion of this simple model system, it is clear that the key ingredients of a kMC simulation are the analysis and identification of all possibly relevant elementary processes and the determination of the associated rates. Once these are known, the coarse graining in time achieved in kMC immediately allows one to follow the time evolution and the statistical occurrence and interplay of the molecular processes of mesoscopically sized systems up to seconds or longer. As such, it is currently the most efficient approach to study long time scales and larger length scales while still providing atomistic information. In its original development, kMC was exclusively applied to simplified model systems, employing a few processes with guessed or fitted rates (see, e.g., Ref. [69]). The new aspect brought into play by so-called first-principles kMC simulations [75, 76] is that these rates and the processes are directly provided from electronic structure theory calculations, i.e., that the parameters fed into the kMC simulation have a clear microscopic meaning.
2.2. Getting the Processes and Their Rates
For the rare-event-type molecular processes mostly encountered in the surface science context, an efficient and reliable way to obtain the individual process rates is transition-state theory (TST) [77–79]. The two basic quantities entering this theory are an effective attempt frequency, Γ°, and the minimum energy barrier ΔE that needs to be overcome for the event to take place, i.e., to bring the system from the initial to the final state. The atomic configuration corresponding to ΔE is accordingly called the transition state (TS). Within the harmonic approximation, the effective attempt frequency is proportional to the ratio of the normal-mode vibrational frequencies at the initial and transition state. Just like the barrier ΔE, Γ° is thus also related to properties of the PES, and as such directly amenable to a calculation with electronic structure theory methods like DFT [80]. In the end, the crucial additional PES information required in kMC compared to equilibrium MC is therefore the location of the transition state in the form of the PES saddle point along a reaction path of the process. Particularly for a high-dimensional PES this is not at all a trivial problem, and the development of efficient and reliable transition-state-search algorithms is a very active area of current research [81, 82]. For many surface-related elementary processes (e.g., diffusion, adsorption, desorption or reaction events) the dimensionality is fortunately not excessive, or can be mapped onto a couple of prominent reaction coordinates as exemplified in Fig. 12. The identification of the TS and the ensuing calculation of the rate for individual identified elementary processes with TST are then computationally involved, but just feasible. This still leaves as a fundamental problem how the relevant elementary processes for any given system configuration can be identified in the first place.
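As a rough numerical illustration of how an attempt frequency and a barrier combine into a rate, the sketch below evaluates the exponential TST form, rate = Γ° exp(−ΔE / k_B T), with Γ° taken as the classical harmonic ratio of real normal-mode frequencies (the transition state has one real mode fewer than the initial state). The mode frequencies are hypothetical; only the 0.89 eV barrier (Fig. 12) and T = 600 K are taken from the text, and the function names are illustrative.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def harmonic_attempt_frequency(freqs_initial, freqs_transition):
    """Ratio of real normal-mode frequencies (initial state over TS).

    The transition state has one real mode fewer, because the mode along
    the reaction coordinate is unstable there.
    """
    if len(freqs_initial) != len(freqs_transition) + 1:
        raise ValueError("expected one mode fewer at the transition state")
    return math.prod(freqs_initial) / math.prod(freqs_transition)


def tst_rate(attempt_frequency, barrier_ev, temperature_k):
    """Rate in 1/s from an attempt frequency (1/s) and a barrier (eV)."""
    return attempt_frequency * math.exp(-barrier_ev / (K_B * temperature_k))


# Hypothetical mode frequencies in 1/s (assumptions for illustration only).
nu_initial = [1.2e13, 1.0e13, 0.8e13]
nu_transition = [1.1e13, 0.9e13]

gamma0 = harmonic_attempt_frequency(nu_initial, nu_transition)
rate = tst_rate(gamma0, barrier_ev=0.89, temperature_k=600.0)
print(f"attempt frequency: {gamma0:.2e} 1/s, rate: {rate:.2e} 1/s")
```

The exponential dependence on the barrier is what makes the saddle-point search the decisive step: the prefactor typically varies far less between processes than the Boltzmann factor does.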
[Figure 12 plot: contour map of the PES as a function of the Ocus position along [001] (Å) and the Ccus position along [001] (Å); energy scale in eV from < 0.10 to > 1.50; the inset shows the transition-state geometry with bond lengths and the 0.89 eV barrier.]
Figure 12. Calculated DFT-PES of a CO oxidation reaction process at the RuO2 (110) model catalyst surface. The high-dimensional PES is projected onto two reaction coordinates, representing two lateral coordinates of the adsorbed Ocus and COcus (cf. Fig. 5). The energy zero corresponds to the initial state at (0.00 Å, 3.12 Å), and the transition state is at the saddle point of the PES, yielding a barrier of 0.89 eV. Details of the corresponding transition state geometry are shown in the inset. Ru = light, large spheres, O = dark, medium spheres, and C = small, white spheres (only the atoms lying in the reaction plane itself are drawn as three-dimensional spheres) (from Ref. [26]).
Most TS-search algorithms require not only the automatically provided information of the actual system state, but also knowledge of the final state after the process has taken place [81]. In other words, quite some insight into the physics of the elementary process is needed in order to determine its rate and include it in the list of possible processes in the kMC simulation. How difficult and nonobvious this can be even for the simplest kind of processes is nicely exemplified by the diffusion of an isolated metal atom over a close-packed surface [82]. Such a process is of fundamental importance for the epitaxial growth of metal films, which is a necessary prerequisite in many applications like catalysis, magneto-optic storage media or interconnects in microelectronics. Intuitively, one would expect the surface diffusion to proceed by simple hops from one lattice site to a neighboring lattice site, as illustrated in Fig. 13(a) for an fcc(100) surface. It is by now well established, however, that on a number of substrates diffusion does not operate preferentially by such hopping processes, but by atomic exchange as explained in Fig. 13(b). Here, the adatom replaces a surface atom, and the latter then assumes the adsorption site. Even more complicated, correlated exchange diffusion processes involving a larger number of surface atoms are currently being discussed for some materials. The complexity of course increases further when diffusion along island edges, across steps and around defects needs to be treated in detail [82].
Figure 13. Schematic top view of a fcc(100) surface, explaining diffusion processes of an isolated metal adatom (white circle). (a) Diffusion by hopping to a neighboring lattice site, (b) diffusion by exchange with a surface atom.
While it is therefore straightforward to say that one wants to include, e.g., diffusion in a kMC simulation, it can in practice be very involved to identify the individual processes actually contributing to it. Some attempts to automate the search for the elementary processes possible for a given system configuration are currently being undertaken, but in the first-principles kMC studies performed to date (and in the foreseeable future), the process lists are simply generated by physical insight. This obviously bears the risk of overlooking a potentially relevant molecular process, and this still-evolving method has to be seen in that light. Contrary to traditional kMC studies, where an unknown number of real molecular processes is often lumped together into a handful of effective processes with optimized rates, first-principles kMC has the advantage, however, that the omission of a relevant elementary process will definitely show up in the simulation results. As such, first experience [15] shows that a much larger number of molecular processes needs to be accounted for in a corresponding modeling “with microscopic understanding” compared to traditional empirical kMC. In other words, the statistical interplay determining the observable function of materials takes place between quite a number of different elementary processes, and is therefore often far too complex to be understood by just studying one or another elementary process in detail.
2.3. Applications to Semiconductor Growth and Catalysis
The new quality of, and the novel insights gained by, mesoscopic first-principles kMC simulations were first demonstrated in the area of nucleation
and growth in metal and semiconductor epitaxy [75, 76, 83–87]. As one example from this field we return to the GaAs(001) surface already discussed in the context of the free energy plots. As apparent from Fig. 4, the so-called β2(2 × 4) reconstruction represents the most stable phase under moderately As-rich conditions, which are typically employed in the MBE growth of this material. Aiming at an atomic-scale understanding of this technologically most relevant process, first-principles LGH + kMC simulations were performed, including the deposition of As2 and Ga from the gas phase, as well as diffusion on this complex β2(2 × 4) semiconductor surface. In order to reach a trustworthy modeling, the consideration of more than 30 different elementary processes was found to be necessary, underlining our general message that complex materials properties cannot be understood by analyzing isolated molecular processes alone. Snapshots of characteristic stages during a typical simulation at realistic deposition fluxes and temperature are given in Fig. 14. They show a small part (namely 1/60) of the total mesoscopic simulation area, focusing on one “trench” of the β2(2 × 4) reconstruction. At the chosen conditions, island nucleation is observed in these reconstructed surface trenches, which is followed by growth along the trench, thereby extending into a new layer. Monitoring the density of the nucleated islands in huge simulation cells (160 × 320 surface lattice constants), a saturation indicating the beginning of steady-state growth is only reached after simulation times of the order of seconds for quite a range of temperatures. Obviously, neither such system sizes, nor time scales would have been accessible by direct electronic structure theory calculations combined, e.g., with MD simulations. In the ensuing steady-state growth, attachment of a deposited Ga atom to an existing island typically takes place before the adatom could take part in a new nucleation event. 
This leads to a very small nucleation rate that is counterbalanced by a simultaneous decrease in the number of islands due to coalescence. The resulting constant island density during steady-state growth is plotted in Fig. 15 for a range of technologically relevant temperatures. At the lower end around 500–600 K, this density decreases, as is consistent with the frequently employed standard nucleation theory. Under these conditions, the island morphology is predominantly determined by Ga surface diffusion alone, i.e., it may be understood on the basis of one molecular process class. Around 600 K the island density becomes almost constant, however, and even increases again above around 800 K. The determined magnitude is then orders of magnitude away from the prediction of classical nucleation theory, cf. Fig. 15, but in very good agreement with existing experimental data. The reason for this unusual behavior is that the adsorption of As2 molecules at reactive surface sites becomes reversible at these elevated temperatures. The initially formed Ga–As–As–Ga2 complexes required for nucleation, cf. Fig. 14(b), become unstable against As2 desorption, and a decreasing fraction of them can stabilize into larger aggregates. Due to the contribution of the decaying complexes, an
[Figure 14 panels (a)–(d): simulation snapshots with time stamps t = 100 ms, 135 ms, 170 ms and 400 ms.]
Figure 14. Snapshots of characteristic stages during a first-principles kMC simulation of GaAs homoepitaxy. Ga and As substrate atoms appear in medium and dark grey, Ga adatoms in white. (a) Ga adatoms preferentially wander around in the trenches. (b) Under the growth conditions used here, an As2 molecule adsorbing on a Ga adatom in the trench initiates island formation. (c) Growth proceeds into a new atomic layer via Ga adatoms forming Ga dimers. (d) Eventually, a new layer of arsenic starts to grow, and the island extends itself towards the foreground, while more material attaches along the trench. The complete movie can be retrieved via the EPAPS homepage (http://www.aip.org/pubservs/epaps.html), document No. E-PRLTAO-87-031152 (from Ref. [86]).
effectively higher density of mobile Ga adatoms results at the surface, which in turn yields a higher nucleation rate of new islands. The temperature window around 700–800 K, which is frequently used by MBE crystal growers, may therefore be understood as permitting a compromise between high Ga adatom mobility and stability of As complexes that leads to a low island density and correspondingly smooth films. Exactly under the technologically most relevant conditions, surface properties that decisively influence the growth behavior (and thereby the targeted functionality) therefore result from the concerted interdependence of distinct molecular processes, i.e., in this case diffusion, adsorption and desorption. To further show that this interdependence is in our opinion more the rule than an exception in materials science applications, we return in the remainder of
[Figure 15 plot: island density (µm⁻²) versus 1000/T (K⁻¹), with the corresponding temperatures 880 K, 800 K, 700 K, 600 K and 500 K marked on the upper axis; curves labeled “kMC simulation” and “nucleation theory i* = 1”.]
Figure 15. Saturation island density corresponding to steady-state MBE of GaAs as a function of the inverse growth temperature. The dashed line shows the prediction of classical nucleation theory for diffusion-limited attachment and a critical nucleus size equal to 1. The significant deviation at higher temperatures is caused by arsenic losses due to desorption, which is not considered in classical nucleation theory (from Ref. [87]).
this section to the field of heterogeneous catalysis. Here, the conversion of reactants into products by means of surface chemical reactions (A + B → C) adds another qualitatively different class of processes to the statistical interplay. In the context of the thermodynamic free energy plots we had already discussed that these on-going catalytic reactions at the surface continuously consume the adsorbed reactants, driving the surface populations away from their equilibrium value. If this has a significant effect, presumably, e.g., in regions of very high catalytic activity, the average surface coverage and structure never reach equilibrium with the surrounding reactant gas phase, even under steady-state operation, and must hence be modeled by explicitly accounting for the surface kinetics [88–90]. In terms of kMC, this means that in addition to the diffusion, adsorption and desorption of the reactants and products, reaction events also have to be considered. For the case of CO oxidation, one of the central reactions taking place in our car catalytic converters, this translates into the conversion of adsorbed O and CO into CO2. Even for the moderately complex model catalyst RuO2(110) discussed above, close to 30 elementary processes again result, comprising both adsorption to and desorption from the two prominent site types at the surface (br and cus, cf. Fig. 5), as well as diffusion between any nearest-neighbor site combination (br→br, br→cus, cus→br, cus→cus). Finally, reaction events account for the catalytic activity and are possible
whenever O and CO are simultaneously adsorbed in any nearest-neighbor site combination. For given temperature and reactant pressures, the corresponding kMC simulations are then first run until steady-state conditions are reached, and the average surface populations are thereafter evaluated over sufficiently long times. We note that even for elevated temperatures, both time periods may again largely exceed the time span accessible by current MD techniques, as exemplified in Fig. 16. The obtained steady-state average surface populations at T = 600 K are shown in Fig. 17 as a function of the gas-phase partial pressures. Comparing with the surface phase diagram of Fig. 5 from ab initio atomistic thermodynamics, i.e., neglecting the effect of the on-going catalytic reactions at the surface, similarities but also the expected significant differences under some environmental conditions can be discerned. The differences affect most prominently the presence of oxygen at the br sites, where it is much more strongly bound than CO. For the thermodynamic approach only the ratio of adsorption to desorption matters, and due to the ensuing very low desorption rate, Obr is correspondingly stabilized even when there is much more CO in the gas phase than O2 (left upper part of Fig. 5). The surface reactions, on the other hand, provide a very efficient means of
[Figure 16 plot: site occupation number (%) from 0 to 100 versus time (s) from 0.0 to 1.0; curves for Obr, Ocus, CObr and COcus.]
Figure 16. Time evolution of the site occupation by O and CO of the two prominent adsorption sites of the RuO2 (110) model catalyst surface shown in Fig. 5. The temperature and pressure conditions chosen (T = 600 K, pCO = 20 atm, pO2 = 1 atm) correspond to an optimum catalytic performance. Under these conditions kinetics builds up a steady-state surface population in which O and CO compete for either site type at the surface, as reflected by the strong fluctuations in the site occupations. Note the extended time scale, also for the “induction period” until the steady-state populations are reached when starting from a purely oxygen covered surface. A movie displaying these changes in the surface population can be retrieved via the EPAPS homepage (http://www.aip.org/pubservs/spaps.html), document No. E-PRLTAO93-006438 (from Ref. [90]).
Figure 17. Left panel: Steady-state surface structures of RuO2(110) in an O2/CO environment obtained by first-principles kMC calculations at T = 600 K. In all non-white areas, the average site occupation is dominated (> 90 %) by one species, and the site nomenclature is the same as in Fig. 5, where the same surface structure was addressed within the ab initio atomistic thermodynamics approach. Right panel: Map of the corresponding catalytic CO oxidation activity measured as so-called turn-over frequencies (TOFs), i.e., CO2 conversion per cm² and second: White areas have a TOF < 10¹¹ cm⁻² s⁻¹, and each increasing gray level represents one order of magnitude higher activity. The highest catalytic activity (black region, TOF > 10¹⁷ cm⁻² s⁻¹) is narrowly concentrated around the phase coexistence region that was already suggested by the thermodynamic treatment (from Ref. [90]).
removing this Obr species that is not accounted for in the thermodynamic treatment. As a net result, under most CO-rich conditions in the gas phase, oxygen is consumed by the reaction faster than it can be replenished from the gas phase. The kMC simulations covering this effect then yield a much lower surface concentration of Obr, and in turn show a much larger stability range of surface structures with CObr at the surface (blue and hatched blue regions). It is particularly interesting to notice that this yields a stability region of a surface structure consisting of only adsorbed CO at br sites that does not exist in the thermodynamic phase diagram at all, cf. Fig. 5. The corresponding CObr/− “phase” (hatched blue region) is thus a stable structure with defined average surface population that is entirely stabilized by the kinetics of this open catalytic system. These differences were conceptually anticipated in the thermodynamic phase diagram, and qualitatively delineated by the hatched regions in Fig. 5. Due to the vicinity to a phase transition and the ensuing enhanced dynamics at the surface, these regions were also considered as potential candidates for highly efficient catalytic activity. This is in fact confirmed by the first-principles kMC simulations as shown in the right panel of Fig. 17. Since the detailed statistics of all elementary processes are explicitly accounted for in the latter type of simulations, it is straightforward to also evaluate the average occurrence of
the reaction events over long time periods as a measure of the catalytic activity. The obtained so-called turnover frequencies (TOF, in units of formed CO2 per cm² per second) are indeed narrowly peaked around the phase coexistence line, where the kinetics builds up a surface population in which O and CO compete for either site type at the surface. This competition is in fact nicely reflected by the large fluctuations in the surface populations apparent in Fig. 16. The partial pressures and temperatures corresponding to this high activity “phase”, and even the absolute TOF values under these conditions, agree extremely well with detailed experimental studies measuring the steady-state activity in the temperature range of 300–600 K, both at high pressures and in UHV. Interestingly, under the conditions of highest catalytic performance it is not the reaction with the highest rate (lowest barrier) that dominates the activity. Although the particular elementary process itself exhibits very suitable properties for catalysis, it occurs too rarely in the full concert of all possible events to decisively affect the observable macroscopic functionality. This emphasizes again the importance of the statistical interplay and the novel level of understanding that can only be provided by first-principles based mesoscopic studies.
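The evaluation of a turnover frequency from a kMC trajectory amounts to counting reaction events inside a steady-state time window and normalizing by window length and surface area. The following sketch assumes that the simulation has logged the time stamp of every executed reaction event; the event list, the lattice size and the area per site are hypothetical placeholders, not values from the studies discussed here.

```python
import math


def turnover_frequency(reaction_times_s, t_start_s, t_end_s, area_cm2):
    """Reaction events per cm^2 per second inside [t_start_s, t_end_s)."""
    if t_end_s <= t_start_s or area_cm2 <= 0.0:
        raise ValueError("need a positive time window and a positive area")
    n_events = sum(1 for t in reaction_times_s if t_start_s <= t < t_end_s)
    return n_events / ((t_end_s - t_start_s) * area_cm2)


# Hypothetical kMC log: 200 reaction events spread over about 2 s.
reaction_times = [0.01 * (i + 0.5) for i in range(200)]

# Hypothetical lattice: 400 sites with an assumed area of 2.0e-15 cm^2 each.
area = 400 * 2.0e-15

# Skip the first second as an assumed induction period, then average.
tof = turnover_frequency(reaction_times, t_start_s=1.0, t_end_s=2.0, area_cm2=area)
print(f"TOF = {tof:.3e} cm^-2 s^-1")
```

Excluding the induction period matters because the populations at early times still reflect the chosen starting configuration rather than the steady state.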
3. Outlook
As highlighted by the few examples from surface physics, many materials’ properties and functions arise out of the interplay of a large number of distinct molecular processes. Theoretical approaches aiming at an atomic-scale understanding and predictive modeling of such phenomena therefore have to achieve both an accurate description of the individual elementary processes at the electronic regime and a proper treatment of how they act together on the mesoscopic level. We have sketched the current status and future direction of some emerging methods which correspondingly try to combine electronic structure theory with concepts from statistical mechanics and thermodynamics. The results already achieved with these techniques give a clear indication of the new quality and novelty of insights that can be gained by such descriptions. On the other hand, it is also apparent that we are only at the beginning of a successful bridging of the micro- to mesoscopic transition in the multiscale materials modeling endeavor. Some of the major conceptual challenges we see at present that need to be tackled when applying these schemes to more complex systems have been touched upon in this chapter. They may be summarized under the keywords accuracy, mapping and efficiency, and as an outlook we briefly comment further on them.
Accuracy: The reliability of the statistical treatment depends predominantly on the accuracy of the description of the individual molecular processes that are input to it. For the mesoscopic methods themselves it makes in fact no
difference whether the underlying PES comes from a semi-empirical potential or from first-principles calculations, but the predictive power of the obtained results (and the physical meaning of the parameters) will obviously be significantly different. In this respect, we only mention two somewhat diverging aspects. For the interplay of several (possibly competing) molecular processes, an “exact” description of the energetics of each individual process, e.g., in the form of a rate for kMC simulations, may be less important than the relative ordering among the processes as, e.g., provided by the correct trend in their energetics. In this case, the frequently requested chemical accuracy in the description of single processes could be a misleading concept, and modest errors in the PES would tend to cancel (or compensate each other) in the statistical mechanics part. Here, we stress the words modest errors, however, which, e.g., largely precludes semi-empirical potentials. Particularly for systems where bond breaking and making is relevant, the latter do not have the required accuracy. For the particular case of DFT as the current workhorse of electronic structure theories, in contrast, it appears that the present uncertainties due to the approximate treatment of electronic exchange and correlation are less problematic than hitherto often assumed (still, caution and systematic tests are necessary). On the other hand, in other cases where for example one process strongly dominates the concerted interplay, such an error cancellation in the statistical mechanics part will certainly not occur. Then, a more accurate description of this process will be required than can be provided by the exchange-correlation functionals in DFT that are available today. Improved descriptions based on wave-function methods and on local corrections to DFT exist or are being developed, but so far come at a high computational cost.
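The two aspects can be made concrete with the exponential TST rate form, writing the attempt frequency as Γ° and the barrier as ΔE. The relations below are a worked illustration under the assumption of a common barrier error δ; the value δ = 0.1 eV is a hypothetical example, not an error estimate from the studies discussed here.

```latex
% Effect of an error \delta in a single barrier on the absolute rate:
\frac{\Gamma(\Delta E + \delta)}{\Gamma(\Delta E)}
  = \exp\!\left(-\frac{\delta}{k_{\mathrm{B}} T}\right)

% Ratio of two competing processes whose barriers carry the same error \delta:
\frac{\Gamma_1}{\Gamma_2}
  = \frac{\Gamma^{\circ}_1}{\Gamma^{\circ}_2}
    \exp\!\left(-\frac{(\Delta E_1 + \delta) - (\Delta E_2 + \delta)}{k_{\mathrm{B}} T}\right)
  = \frac{\Gamma^{\circ}_1}{\Gamma^{\circ}_2}
    \exp\!\left(-\frac{\Delta E_1 - \Delta E_2}{k_{\mathrm{B}} T}\right)
```

For δ = 0.1 eV the absolute rate changes by a factor of roughly 7 at 600 K (k_B T ≈ 0.052 eV) and roughly 48 at 300 K (k_B T ≈ 0.026 eV), whereas the ratio of the two competing rates is unaffected by a common shift. This is the sense in which the relative ordering can matter more than the absolute accuracy, and also why no such cancellation helps when a single process dominates.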
Assessing what kind of accuracy is required for which process under which system state, possibly achieved by evolutionary schemes based on gradually improving PES descriptions, will therefore play a central role in making atomistic statistical mechanics methods computationally feasible for increasingly complex systems.
Mapping: The configuration space of most materials science problems is exceedingly large. In order to arrive at meaningful statistics, even the most efficient sampling of such spaces still requires (at present and in the foreseeable future) a number of PES evaluations that is prohibitively large to be directly provided by first-principles calculations. This problem is mostly circumvented by mapping the actual system onto a coarse-grained lattice model, in which the real Hamiltonian is approximated by discretized expansions, e.g., in certain interactions (LGH) or elementary processes (kMC). The expansions are then first parametrized by the first-principles calculations, while the statistical mechanics problem is thereafter solved exploiting the fast evaluations of the model Hamiltonians. Since in practice these expansions can only comprise a finite number of terms, the mapping procedure intrinsically bears the problem of overlooking a relevant interaction or process. Such an omission can
obviously jeopardize the validity of the complete statistical simulation, and there are at present no fool-proof or practical, let alone automated, schemes for deciding which terms to include in the expansion or for judging its convergence. In particular when going to more complex systems, the present “hand-made” expansions that are mostly based on educated guesses will become increasingly cumbersome. Eventually, the complexity of the system may become so large that even the mapping onto a discretized lattice itself will be problematic. Overcoming these limitations may be achieved by adaptive, self-refining approaches, and will certainly be of paramount importance to ensure the general applicability of the atomistic statistical techniques.
Efficiency: Even if an accurate mapping onto a model Hamiltonian is achieved, the sampling of the huge configuration spaces will still put increasing demands on the statistical mechanics treatment. In the examples discussed above, the actual evaluation of the system partition function, e.g., by MC simulations, is a small add-on compared to the computational cost of the underlying DFT calculations. With increasing system complexity, different problems and an increasing number of processes this may eventually change, requiring the use of more efficient sampling schemes. A major challenge for increasing efficiency is for example the treatment of kinetics, in particular when processes operate at largely different time scales. The computational cost of a certain time span in kMC simulations is dictated by the fastest process in the system, while the slowest process governs what total time period actually needs to be covered. If both process scales differ largely, kMC becomes expensive.
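The cost argument can be put into numbers with a back-of-the-envelope sketch. It assumes, for simplicity, that a fast and a slow process are both available at all times with constant rates, so that the mean kMC time step is the inverse of the total rate; the two rates are hypothetical.

```python
import math


def expected_event_count(rates_per_s, time_span_s):
    """Mean number of executed kMC events: total rate times time span."""
    return sum(rates_per_s) * time_span_s


def event_share(rate_per_s, rates_per_s):
    """Fraction of executed events that belong to one process."""
    return rate_per_s / sum(rates_per_s)


# Hypothetical rates in 1/s (assumptions for illustration only).
k_fast = 1.0e9
k_slow = 1.0e3

# Cover a time span in which the slow process occurs about ten times.
time_span = 10.0 / k_slow
n_events = expected_event_count([k_fast, k_slow], time_span)
slow_share = event_share(k_slow, [k_fast, k_slow])
print(f"events to execute: {n_events:.3e}, slow-process share: {slow_share:.1e}")
```

In this example about ten million events have to be executed to observe roughly ten occurrences of the slow process, so nearly all of the effort goes into the fast process.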
A remedy may, e.g., be provided by assuming the fast process to be always equilibrated on the time scale of the slow one, and correspondingly an appropriate mixing of equilibrium MC with kMC simulations may significantly increase the efficiency (as typically done in present-day TPD simulations). Alternatively, the fast process could be dropped from explicit consideration at the atomistic level, with only its effect incorporated into the remaining processes. Obviously, with such a grouping of processes one already approaches the meso- to macroscopic transition, gradually giving up the atomistic description in favor of a more coarse-grained or even continuum modeling. The crucial point to note here is that such a transition is done in a controlled and hierarchical manner, i.e., necessarily as the outcome and understanding from the analysis of the statistical interplay at the mesoscopic level. This is therefore in marked contrast to, e.g., the frequently employed rate equation approach in heterogeneous catalysis modeling, where macroscopic differential equations are directly fed with effective microscopic parameters. If the latter are simply fitted to reproduce some experimental data, at best a qualitative description can be achieved anyway. If really microscopically meaningful parameters are to be used, one does not know which of the many in principle possible elementary processes to consider. Simple-minded “intuitive” approaches like,
e.g., parametrizing the reaction equation with the data from the reaction process with the highest rate may be questionable in view of the results described above. This process may never occur in the full concert of the other processes, or it may only contribute under particular environmental conditions, or be significantly enhanced or suppressed due to an intricate interplay with another process. All this can only be filtered out by the statistical mechanics at the mesoscopic level, and can therefore not be grasped by the traditional rate equation approach omitting this intermediate time and length scale regime. The two key features of the atomistic statistical schemes reviewed here are in summary that they treat the statistical interplay of the possible molecular processes, and that these processes have a well-defined microscopic meaning, i.e., they are described by parameters that are provided by first-principles calculations. This distinguishes these techniques from approaches where molecular process parameters are either directly put into macroscopic equations neglecting the interplay, or where only effective processes with fitted or empirical parameters are employed in the statistical simulations. In the latter case, the individual processes lose their well-defined microscopic meaning and typically represent an unspecified lump sum of not further resolved processes. Both the clear cut microscopic meaning of the individual processes and their interplay are, however, decisive for the transferability and predictive nature of the obtained results. Furthermore, it is also precisely these two ingredients that ensure the possibility of reverse-mapping, i.e., the unambiguous tracing back of the microscopic origin of (appealing) materials’ properties identified at the meso- or macroscopic modeling level. 
We are convinced that primarily the latter point will be crucial when trying to overcome the present trial-and-error based system engineering in materials science in the near future. An advancement based on understanding requires theories that straddle various traditional disciplines. The approaches discussed here employ methods from various areas of electronic structure theory (physics as well as chemistry), statistical mechanics, mathematics, materials science, and computer science. This high interdisciplinarity makes the field challenging, but is also part of the reason why it is exciting, timely, and full of future perspectives.
References
[1] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas,” Phys. Rev., 136, B864, 1964.
[2] W. Kohn and L. Sham, “Self consistent equations including exchange and correlation effects,” Phys. Rev., 140, A1133, 1965.
[3] R.G. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford University Press, New York, 1989.
[4] R.M. Dreizler and E.K.U. Gross, Density Functional Theory, Springer, Berlin, 1990.
[5] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1997.
Ab initio atomistic thermodynamics and statistical mechanics
191
[6] D. Frenkel and B. Smit, Understanding Molecular Simulation, 2nd edn., Academic Press, San Diego, 2002. [7] R. Car and M. Parrinello, “Unified approach for molecular dynamics and densityfunctional theory,” Phys. Rev. Lett., 55, 2471, 1985. [8] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, “Iterative minimization techniques for ab initio total energy calculations: molecular dynamics and conjugate gradients,” Rev. Mod. Phys., 64, 1045, 1992. [9] G. Galli and A. Pasquarello, “First-principle molecular dynamics,” In: M.P. Allen, and D.J. Tildesley (eds.), Computer Simulations in Chemical Physics, Kluwer, Dordrecht, 1993. [10] A. Gross, “Reactions at surfaces studied by ab initio dynamics calculations,” Surf. Sci. Rep., 32, 293, 1998. [11] G.J. Kroes, “Six-dimensional quantum dynamics of dissociative chemisorption of H2 on metal surfaces,” Prog. Surf. Sci., 60, 1, 1999. [12] A.F. Voter, F. Montalenti, and T.C. Germann, “Extending the time scale in atomistic simulation of materials,” Annu. Rev. Mater. Res., 32, 321, 2002. [13] A. Zangwill, Physics at Surfaces, Cambridge University Press, Cambridge, 1988. [14] R.I. Masel, Principles of Adsorption and Reaction on Solid Surfaces, Wiley, New York, 1996. [15] C. Stampfl, M.V. Ganduglia-Pirovano, K. Reuter, and M. Scheffler, “Catalysis and corrosion: the theoretical surface-science context,” Surf. Sci., 500, 368, 2002. [16] M. Scheffler and C. Stampfl, “Theory of adsorption on metal substrates,” In: K. Horn and M. Scheffler (eds.), Handbook of Surface Science, vol. 2: Electronic Structure, Elsevier, Amsterdam, 2000. [17] G.R. Darling and S. Holloway, “The dissociation of diatomic molecules at surfaces,” Rep. Prog. Phys., 58, 1595, 1995. [18] E. Kaxiras, Y. Bar-Yam, J.D. Joannopoulos, and K.C. Pandey, “Ab initio theory of polar semiconductor surfaces. I. Methodology and the (22) reconstructions of GaAs(111),” Phys. Rev. B, 35, 9625, 1987. [19] M. 
Scheffler, “Thermodynamic aspects of bulk and surface defects – first-principles calculations,” In: J. Koukal (ed.), Physics of Solid Surfaces – 1987, Elsevier, Amsterdam, 1988. [20] M. Scheffler and J. Dabrowski, “Parameter-free calculations of total energies, interatomic forces, and vibrational entropies of defects in semiconductors,” Phil. Mag. A, 58, 107, 1988. [21] G.-X. Qian, R.M. Martin, and D.J. Chadi, “First-principles study of the atomic reconstructions and energies of Ga- and As-stabilized GaAs(100) surfaces,” Phys. Rev. B, 38, 7649, 1988. [22] X.-G. Wang, W. Weiss, Sh.K. Shaikhutdinov, M. Ritter, M. Petersen, F. Wagner, R. Schl¨ogl, and M. Scheffler, “The hematite (alpha–Fe2 O3 )(0001) surface: evidence for domains of distinct chemistry,” Phys. Rev. Lett., 81, 1038, 1998. [23] X.-G. Wang, A. Chaka, and M. Scheffler, “Effect of the environment on Al2 O3 (0001) surface structures,” Phys. Rev. Lett., 84, 3650, 2000. [24] K. Reuter and M. Scheffler, “Composition, structure, and stability of RuO2 (110) as a function of oxygen pressure,” Phys. Rev. B, 65, 035406, 2002. [25] K. Reuter and M. Scheffler, “First-principles atomistic thermodynamics for oxidation catalysis: surface phase diagrams and catalytically interesting regions,” Phys. Rev. Lett., 90, 046103, 2003. [26] K. Reuter and M. Scheffler, “Composition and structure of the RuO2 (110) surface in an O2 and CO environment: implications for the catalytic formation of CO2 ,” Phys. Rev. B, 68, 045407, 2003.
192
K. Reuter et al. [27] Z. Lodzianan and J.K. Nørskov, “Stability of the hydroxylated (0001) surface of Al2 O3 ,” J. Chem. Phys., 118, 11179, 2003. [28] K. Reuter and M. Scheffler, “Oxide formation at the surface of late 4d transition metals: insights from first-principles atomistic thermodynamics,” Appl. Phys. A, 78, 793, 2004. [29] K. Reuter “Nanometer and sub-nanometer thin oxide films at surfaces of late transition metals,” In: U. Heiz, H. Hakkinen, and U. Landman (eds.), Nanocatalysis: Principles, Methods, Case Studies, 2005. [30] G. Ertl, H. Kn¨ozinger, and J. Weitkamp (eds.), Handbook of Heterogeneous Catalysis, Wiley, New York, 1997. [31] D.P. Woodruff and T.A. Delchar, Modern Techniques of Surface Science, 2nd edn., Cambridge University Press, Cambridge, 1994. [32] W.-X. Li, C. Stampfl, and M. Scheffler, “Insights into the function of silver as an oxidation catalyst by ab initio atomistic thermodynamics,” Phys. Rev. B, 68, 16541, 2003. [33] W.-X. Li, C. Stampfl, and M. Scheffler, “Why is a noble metal catalytically active? the role of the O–Ag interaction in the function of silver as an oxidation catalyst,” Phys. Rev. Lett., 90, 256102, 2003. [34] D.A. Mc Quarrie, Statistical Mechanics, Harper and Row, New York, 1976. [35] D.R. Stull and H. Prophet, JANAF Thermochemical Tables, 2nd edn., U.S. National Bureau of Standards, Washington, D.C., 1971. [36] E. Lundgren, J. Gustafson, A. Mikkelsen, J.N. Andersen, A. Stierle, H. Dosch, M. Todorova, J. Rogal, K. Reuter, and M. Scheffler, “Kinetic hindrance during the initial oxidation of Pd(100) at ambient pressures,” Phys. Rev. Lett., 92, 046101, 2004. [37] M. Todorova, E. Lundgren, V. Blum, A. Mikkelsen, S. Gray, J. Gustafson, √M. Borg, √ J. Rogal, K. Reuter, J.N. Andersen, and M. Scheffler, “The Pd(100)-( 5 × 5) R27◦ -O surface oxide revisited,” Surf. Sci., 541, 101, 2003. [38] E. Lundgren, G. Kresse, C. Klein, M. Borg, J.N. Andersen, M. De Santis, Y. Gauthier, C. Konvicka, M. Schmid, and P. 
Varga, “Two-dimensional oxide on Pd(111),” Phys. Rev. Lett., 88, 246103, 2002. [39] A. Michaelides, M.L. Bocquet, P. Sautet, A. Alavi, and D.A. King, “Structures and thermodynamic phase transitions for oxygen and silver oxide phases on Ag{111},” Chem. Phys. Lett., 367, 344, 2003. [40] C.M. Weinert and M. Scheffler, In: H.J. von Bardeleben (ed.), Defects in Semiconductors, Mat. Sci. Forum, 10–12, 25, 1986. [41] S.-H. Lee, W. Moritz, and M. Scheffler, “GaAs(001) under conditions of low as pressure: edvidence for a novel surface geometry,” Phys. Rev. Lett., 85, 3890, 2000. [42] C.B. Duke, “Semiconductor surface reconstruction: the structural chemistry of twodimensional surface compounds,” Chem. Rev., 96, 1237, 1996. [43] T. Engel and G. Ertl, “Oxidation of carbon monoxide,” In: D.A. King and D.P. Woodruff (eds.), The Chemical Physics of Solid Surfaces and Heterogeneous Catalysis, Elsevier, Amsterdam, 1982. [44] B.L.M. Hendriksen, S.C. Bobaru, and J.W.M. Frenken, “Oscillatory CO oxidation on Pd(100) studied with in situ scanning tunnelling microscopy,” Surf. Sci., 552, 229, 2003. [45] H. Over and M. Muhler, “Catalytic CO oxidation over ruthenium – bridging the pressure gap,” Prog. Surf. Sci., 72, 3, 2003. [46] G. Ertl, “Heterogeneous catalysis on the atomic scale,” J. Mol. Catal. A, 182, 5, 2002. [47] D.P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, Cambridge, 2002. [48] D. de Fontaine, In: P.E.A. Turchi and A. Gonis (eds.), Statics and Dynamics of Alloy Phase Transformations, NATO ASI Series, Plenum Press, New York, 1994.
Ab initio atomistic thermodynamics and statistical mechanics
193
[49] J.M. Sanchez, F. Ducastelle, and D. Gratias, “Generalized cluster description of multicomponent systems,” Physica A, 128, 334, 1984. [50] A. Zunger, “First principles statistical mechanics of semiconductor alloys and intermetallic compounds,” In: P.E.A. Turchi and A. Gonis (eds.), Statics and Dynamics of Alloy Phase Transformations, NATO ASI Series, Plenum Press, New York, 1994. [51] P. Piercy, K. De’Bell, and H. Pfn¨ur, “Phase diagram and critical behavior of the adsorption system O/Ru(001): comparison with lattice-gas models,” Phys. Rev. B, 45, 1869, 1992. [52] G.M. Xiong, C. Schwennicke, H. Pfn¨ur, and H.-U. Everts, “Phase diagram and phase transitions of the adsorbate system S/Ru(0001): a monte carlo study of a lattice gas model,” Z. Phys. B, 104, 529, 1997. [53] V.P. Zhdanov and B. Kasemo, “Simulation of oxygen desorption from Pt(111),” Surf. Sci., 415, 403, 1998. [54] S.-J. Koh and G. Ehrlich, “Pair- and many-atom interactions in the cohesion of surface clusters: Pdx and Irx on W(110),” Phys. Rev. B, 60, 5981, 1999. ¨ [55] L. Osterlund, M.Ø. Pedersen, I. Stensgaard, E. Lægsgaard, and F. Besenbacher, “Quantitative determination of adsorbate-adsorbate interactions,” Phys. Rev. Lett., 83, 4812, 1999. [56] S.H. Payne, H.J. Kreuzer, W. Frie, L. Hammer, and K. Heinz, “Adsorption and desorption of hydrogen on Rh(311) and comparison with other Rh surfaces,” Surf. Sci., 421, 279, 1999. [57] C. Stampfl, H.J. Kreuzer, S.H. Payne, H. Pfn¨ur, and M. Scheffler, “First-principles theory of surface thermodynamics and kinetics,” Phys. Rev. Lett., 83, 2993, 1999. [58] C. Stampfl, H.J. Kreuzer, S.H. Payne, and M. Scheffler, “Challenges in predictive calculations of processes at surfaces: surface thermodynamics and catalytic reactions,” Appl. Phys. A, 69, 471, 1999. [59] J. Shao, “Linear model selection by cross-validation,” J. Amer. Statist. Assoc., 88, 486, 1993. [60] P. Zhang, “Model selection via multifold cross-validation,” Ann. statist., 21, 299, 1993. [61] A. 
van de Walle and G. Ceder, “Automating first-principles phase diagram calculations,” J. Phase Equilibria, 23, 348, 2002. [62] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, “Equation of state calculations by fast computing machines,” J. Chem. Phys., 21, 1087, 1976. [63] J.-S. McEwen, S.H. Payne, and C. Stampfl, “Phase diagram of O/Ru(0001) from first principles,” Chem. Phys. Lett., 361, 317, 2002. [64] H.J. Kreuzer and S.H. Payne, “Theoretical approaches to the kinetics of adsorption, desorption and reactions at surfaces,” In: M. Borowko (eds.), Computational Methods in Surface and Colloid, Marcel Dekker, New York, 2000. [65] C. Stampfl and M. Scheffler, “Theory of alkali metal adsorption on close-packed metal surfaces,” Surf. Rev. Lett., 2, 317, 1995. [66] D.L. Adams, “New phenomena in the adsorption of alkali metals on Al surfaces,” Appl. Phys. A, 62, 123, 1996. [67] M. Borg, C. Stampfl, A. Mikkelsen, J. Gustafson, E. Lundgren, M. Scheffler, and J.N. Andersen, “Density of configurational states from first-principles: the phase diagram of Al-Na surface alloys,” Chem. Phys. Chem. (in press), 2005. [68] F. Wang and D.P. Landau, “Efficient, multiple-range random walk algorithm to calculate the density of states,” Phys. Rev. Lett., 86, 2050, 2001.
194
K. Reuter et al. [69] H.C. Kang and W.H. Weinberg, “Modeling the kinetics of heterogeneous catalysis,” Chem. Rev., 95, 667, 1995. [70] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, “New algorithm for Monte Carlo simulation of ising spin systems,” J. Comp. Phys., 17, 10, 1975. [71] D.T. Gillespie, “General method for numerically simulating stochastic time evolution of coupled chemical reactions,” J. Comp. Phys., 22, 403, 1976. [72] A.F. Voter, “Classically exact overlayer dynamics: diffusion of rhodium clusters on Rh(100),” Phys. Rev. B, 34, 6819, 1986. [73] H.C. Kang and W.H. Weinberg, “Dynamic Monte Carlo with a proper energy barrier: surface diffusion and two-dimensional domain ordering,” J. Chem. Phys., 90, 2824, 1989. [74] K.A. Fichthorn and W.H. Weinberg, “Theoretical foundations of dynamical Monte Carlo simulations,” J. Chem. Phys., 95, 1090, 1991. [75] P. Ruggerone, C. Ratsch, and M. Scheffler, “Density-functional theory of epitaxial growth of metals,” In: D.A. King and D.P. Woodruff (eds.), Growth and Properties of Ultrathin Epitaxial Layers. The Chemical Physics of Solid Surfaces, vol. 8, Elsevier, Amsterdam, 1997. [76] C. Ratsch, P. Ruggerone, and M. Scheffler, “Study of strain and temperature dependence of metal epitaxy,” In: Z. Zhang and M.G. Lagally (eds.), Morphological Organization in Epitaxial Growth and Removal, World Scientific, Singapore, 1998. [77] S. Glasston, K.J. Laidler, and H. Eyring, The Theory of Rate Processes, McGrawHill, New York, 1941. [78] G.H. Vineyard, “Frequency factors and isotope effects in solid state rate processes,” J. Phys. Chem. Solids, 3, 121, 1957. [79] K.J. Laidler, Chemical Kinetics, Harper and Row, New York, 1987. [80] C. Ratsch and M. Scheffler, “Density-functional theory calculations of hopping rates of surface diffusion,” Phys. Rev. B, 58, 13163, 1998. [81] G. Henkelman, G. Johannesson, and H. Jonsson, “Methods for finding saddle points and minimum energy paths,” In: S.D. 
Schwartz (ed.), Progress on Theoretical Chemistry and Physics, Kluwer, New York, 2000. [82] T. Ala-Nissila, R. Ferrando, and S.C. Ying, “Collective and single particle diffusion on surfaces,” Adv. Phys., 51, 949, 2002. [83] S. Ovesson, A. Bogicevic, and B.I. Lundqvist, “Origin of compact triangular islands in metal-on-metal growth,” Phys. Rev. Lett., 83, 2608, 1999. [84] K.A. Fichthorn and M. Scheffler, “Island nucleation in thin-film epitaxy: a firstprinciples investigation,” Phys. Rev. Lett., 84, 5371, 2000. [85] P. Kratzer M. Scheffler, “Surface knowledge: Toward a predictive theory of materials,” Comp. in Science and Engineering, 3(6), 16, 2001. [86] P. Kratzer and M. Scheffler, “Reaction-limited island nucleation in molecular beam epitaxy of compound semiconductors,” Phys. Rev. Lett., 88, 036102, 2002. [87] P. Kratzer, E. Penev, and M. Scheffler, “First-principles studies of kinetics in epitaxial growth of III–V semiconductors,” Appl. Phys. A, 75, 79, 2002. [88] E.W. Hansen and M. Neurock, “Modeling surface kinetics with first-principles-based molecular simulation,” Chem. Eng. Sci., 54, 3411, 1999. [89] E.W. Hansen and M. Neurock, “First-principles-based Monte Carlo simulation of ethylene hydrogenation kinetics on Pd,” J. Catal., 196, 241, 2000. [90] K. Reuter, D. Frenkel, and M. Scheffler, “The steady state of heterogeneous catalysis, studied with first-principles statistical mechanics,” Phys. Rev. Lett., 93, 116105, 2004.
1.10 DENSITY-FUNCTIONAL PERTURBATION THEORY
Paolo Giannozzi¹ and Stefano Baroni²
¹ DEMOCRITOS-INFM, Scuola Normale Superiore, Pisa, Italy
² DEMOCRITOS-INFM, SISSA-ISAS, Trieste, Italy
S. Yip (ed.), Handbook of Materials Modeling, 195–214. © 2005 Springer. Printed in the Netherlands.

The calculation of vibrational properties of materials from their electronic structure is an important goal for materials modeling. A wide variety of physical properties of materials depend on their lattice-dynamical behavior: specific heats, thermal expansion, and heat conduction; phenomena related to the electron–phonon interaction such as the resistivity of metals, superconductivity, and the temperature dependence of optical spectra, are just a few of them. Moreover, vibrational spectroscopy is a very important tool for the characterization of materials. Vibrational frequencies are routinely and accurately measured mainly using infrared and Raman spectroscopy, as well as inelastic neutron scattering. The resulting vibrational spectra are a sensitive probe of the local bonding and chemical structure. Accurate calculations of frequencies and displacement patterns can thus yield a wealth of information on the atomic and electronic structure of materials.

In the Born–Oppenheimer (adiabatic) approximation, the nuclear motion is determined by the nuclear Hamiltonian $H$:

$$H = -\sum_I \frac{\hbar^2}{2M_I}\frac{\partial^2}{\partial \mathbf{R}_I^2} + E(\{\mathbf{R}\}), \tag{1}$$

where $\mathbf{R}_I$ is the coordinate of the $I$th nucleus, $M_I$ its mass, $\{\mathbf{R}\}$ indicates the set of all the nuclear coordinates, and $E(\{\mathbf{R}\})$ is the ground-state energy of the Hamiltonian, $H_{\{\mathbf{R}\}}$, of a system of $N$ interacting electrons moving in the field of fixed nuclei with coordinates $\{\mathbf{R}\}$:

$$H_{\{\mathbf{R}\}} = -\frac{\hbar^2}{2m}\sum_i \frac{\partial^2}{\partial \mathbf{r}_i^2} + \frac{e^2}{2}\sum_{i\neq j}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|} + \sum_{i,I} v_I(\mathbf{r}_i-\mathbf{R}_I) + E_N(\{\mathbf{R}\}), \tag{2}$$
where $\mathbf{r}_i$ is the coordinate of the $i$th electron, $m$ is the electron mass, $-e$ is the electron charge, and $E_N(\{\mathbf{R}\})$ is the nuclear electrostatic energy:

$$E_N(\{\mathbf{R}\}) = \frac{e^2}{2}\sum_{I\neq J}\frac{Z_I Z_J}{|\mathbf{R}_I-\mathbf{R}_J|}, \tag{3}$$

$Z_I$ being the charge of the $I$th nucleus, and $v_I$ is the electron–nucleus Coulomb interaction: $v_I(r) = -Z_I e^2/r$. In a pseudopotential scheme each nucleus is thought to be lumped together with its own core electrons in a frozen ion which interacts with the valence electrons through a smooth pseudopotential, $v_I(\mathbf{r})$. The equilibrium geometry of the system is determined by the condition that the forces acting on all nuclei vanish. The forces $\mathbf{F}_I$ can be calculated by applying the Hellmann–Feynman theorem to the Born–Oppenheimer Hamiltonian $H_{\{\mathbf{R}\}}$:

$$\mathbf{F}_I \equiv -\frac{\partial E(\{\mathbf{R}\})}{\partial \mathbf{R}_I} = -\left\langle \Psi_{\{\mathbf{R}\}} \left| \frac{\partial H_{\{\mathbf{R}\}}}{\partial \mathbf{R}_I} \right| \Psi_{\{\mathbf{R}\}} \right\rangle, \tag{4}$$
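where $\Psi_{\{\mathbf{R}\}}(\mathbf{r}_1, \ldots, \mathbf{r}_N)$ is the ground-state wavefunction of the electronic Hamiltonian, $H_{\{\mathbf{R}\}}$. Eq. (4) can be rewritten as:

$$\mathbf{F}_I = -\int n(\mathbf{r})\,\frac{\partial v_I(\mathbf{r}-\mathbf{R}_I)}{\partial \mathbf{R}_I}\,d\mathbf{r} - \frac{\partial E_N(\{\mathbf{R}\})}{\partial \mathbf{R}_I}, \tag{5}$$

where $n(\mathbf{r})$ is the electron charge density for the nuclear configuration $\{\mathbf{R}\}$:

$$n(\mathbf{r}) = N\int |\Psi_{\{\mathbf{R}\}}(\mathbf{r}, \mathbf{r}_2, \ldots, \mathbf{r}_N)|^2\, d\mathbf{r}_2 \cdots d\mathbf{r}_N. \tag{6}$$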
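The Hellmann–Feynman relation of Eq. (4) is easy to check numerically on a small model. The sketch below is only an illustration under stated assumptions: the two-level Hamiltonian `hamiltonian(r)`, its single "nuclear" coordinate `r`, the coupling `T_HOP`, and all function names are hypothetical choices made for this example and do not come from the text. It compares the ground-state expectation value of ∂H/∂R with a finite-difference derivative of the ground-state energy.

```python
import numpy as np

T_HOP = 0.7  # hypothetical coupling between the two model states


def hamiltonian(r):
    """Toy 2x2 'electronic' Hamiltonian depending on one coordinate r (illustrative only)."""
    return np.array([[r**2, T_HOP], [T_HOP, -r]])


def dh_dr(r):
    """Analytic derivative of the toy Hamiltonian with respect to r."""
    return np.array([[2.0 * r, 0.0], [0.0, -1.0]])


def ground_state(r):
    """Lowest eigenvalue and normalized eigenvector of H(r)."""
    eps, vec = np.linalg.eigh(hamiltonian(r))
    return eps[0], vec[:, 0]


def hellmann_feynman_force(r):
    """F = -<Psi| dH/dR |Psi>, Eq. (4), evaluated in the ground state."""
    _, psi = ground_state(r)
    return -psi @ dh_dr(r) @ psi


def finite_difference_force(r, h=1e-5):
    """F = -dE/dR from a central difference of the ground-state energy."""
    return -(ground_state(r + h)[0] - ground_state(r - h)[0]) / (2.0 * h)


if __name__ == "__main__":
    r0 = 0.4
    print(hellmann_feynman_force(r0), finite_difference_force(r0))
```

The two numbers should agree to within the finite-difference error. The point of the comparison is that the Hellmann–Feynman force needs only the wavefunction at a single geometry, whereas the finite-difference route needs two extra energy calculations per coordinate.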
For a system near its equilibrium geometry, the harmonic approximation applies and the nuclear Hamiltonian of Eq. (1) reduces to the Hamiltonian of a system of independent harmonic oscillators, called normal modes. Normal mode frequencies, $\omega$, and displacement patterns, $U_I^{\alpha}$ for the $\alpha$th Cartesian component of the $I$th atom, are determined by the secular equation:

$$\sum_{J,\beta}\left(C_{IJ}^{\alpha\beta} - M_I\,\omega^2\,\delta_{IJ}\,\delta_{\alpha\beta}\right)U_J^{\beta} = 0, \tag{7}$$

where $C_{IJ}^{\alpha\beta}$ is the matrix of interatomic force constants (IFCs):

$$C_{IJ}^{\alpha\beta} \equiv \frac{\partial^2 E(\{\mathbf{R}\})}{\partial R_I^{\alpha}\,\partial R_J^{\beta}} = -\frac{\partial F_I^{\alpha}}{\partial R_J^{\beta}}. \tag{8}$$
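Once the IFCs are known, Eq. (7) is a generalized symmetric eigenvalue problem. The following minimal sketch (the function name `normal_modes` and the two-atom test system are assumptions made for illustration) solves it by mass-weighting the force-constant matrix, which is one common way to reduce it to a standard symmetric eigenproblem.

```python
import numpy as np


def normal_modes(force_constants, masses):
    """Solve Eq. (7) for omega^2 and the displacement patterns U.

    force_constants: symmetric (n, n) array, one row/column per degree of freedom.
    masses: length-n array with the mass attached to each degree of freedom.
    """
    c = np.asarray(force_constants, dtype=float)
    inv_sqrt_m = 1.0 / np.sqrt(np.asarray(masses, dtype=float))
    # Mass-weighting turns C U = w^2 M U into a standard symmetric eigenproblem.
    d = c * np.outer(inv_sqrt_m, inv_sqrt_m)
    omega2, e = np.linalg.eigh(d)
    u = e * inv_sqrt_m[:, None]  # back-transform: U = M^(-1/2) e
    return omega2, u


if __name__ == "__main__":
    k, m1, m2 = 3.0, 1.0, 2.0
    c = np.array([[k, -k], [-k, k]])  # two atoms joined by one spring, in 1D
    omega2, u = normal_modes(c, [m1, m2])
    print(omega2)  # one rigid translation and one stretch, k * (1/m1 + 1/m2)
```

For a three-dimensional system the same routine applies with `n = 3 × (number of atoms)` and each atomic mass repeated for its three Cartesian components.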
Various dynamical models, based on empirical or semiempirical inter-atomic potentials, can be used to calculate the IFCs. In most cases, the parameters of the model are obtained from a fit to some known experimental data, such as a set of frequencies. Although simple and often effective, such approaches tend
to have a limited predictive power beyond the range of cases included in the fitting procedure. It is often desirable to resort to first-principles methods, such as density-functional theory, that have a far better predictive power even in the absence of any experimental input.
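1. Density-Functional Theory

Within the framework of density-functional theory (DFT), the energy $E(\{\mathbf{R}\})$ can be seen as the minimum of a functional of the charge density $n(\mathbf{r})$:

$$E(\{\mathbf{R}\}) = T_0[n(\mathbf{r})] + \frac{e^2}{2}\int \frac{n(\mathbf{r})\,n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}\,d\mathbf{r}' + E_{xc}[n(\mathbf{r})] + \int V_{\{\mathbf{R}\}}(\mathbf{r})\,n(\mathbf{r})\,d\mathbf{r} + E_N(\{\mathbf{R}\}), \tag{9}$$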
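with the constraint that the integral of $n(\mathbf{r})$ equals the number of electrons in the system, $N$. In Eq. (9), $V_{\{\mathbf{R}\}}$ indicates the external potential acting on the electrons, $V_{\{\mathbf{R}\}}(\mathbf{r}) = \sum_I v_I(\mathbf{r}-\mathbf{R}_I)$, and $T_0[n(\mathbf{r})]$ is the kinetic energy of a system of noninteracting electrons having $n(\mathbf{r})$ as ground-state density,

$$T_0[n(\mathbf{r})] = -2\,\frac{\hbar^2}{2m}\sum_{n=1}^{N/2}\int \psi_n^{*}(\mathbf{r})\,\frac{\partial^2 \psi_n(\mathbf{r})}{\partial \mathbf{r}^2}\,d\mathbf{r}, \tag{10}$$

$$n(\mathbf{r}) = 2\sum_{n=1}^{N/2}|\psi_n(\mathbf{r})|^2, \tag{11}$$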
and $E_{xc}$ is the so-called exchange-correlation energy. For notational simplicity, the system is supposed here to be a nonmagnetic insulator, so that each of the $N/2$ lowest-lying orbital states accommodates two electrons of opposite spin. The Kohn–Sham (KS) orbitals are the solutions of the KS equation:

$$H_{\mathrm{SCF}}\,\psi_n(\mathbf{r}) \equiv \left(-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial \mathbf{r}^2} + V_{\mathrm{SCF}}(\mathbf{r})\right)\psi_n(\mathbf{r}) = \epsilon_n\,\psi_n(\mathbf{r}), \tag{12}$$
where $H_{\mathrm{SCF}}$ is the Hamiltonian for an electron under an effective potential $V_{\mathrm{SCF}}$:

$$V_{\mathrm{SCF}}(\mathbf{r}) = V_{\{\mathbf{R}\}}(\mathbf{r}) + e^2\int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}' + v_{xc}(\mathbf{r}), \tag{13}$$

and $v_{xc}$ – the exchange-correlation potential – is the functional derivative of the exchange-correlation energy: $v_{xc}(\mathbf{r}) \equiv \delta E_{xc}/\delta n(\mathbf{r})$. The form of $E_{xc}$ is unknown: the entire procedure is useful only if reliable approximate expressions for $E_{xc}$ are available. It turns out that even the simplest of such expressions, the local-density approximation (LDA), is surprisingly good in many
cases, at least for the determination of electronic and structural ground-state properties. Well-established methods for the solution of KS equations, Eq. (12), in both finite (molecules, clusters) and infinite (crystals) systems, are described in the literature. The use of functionals more sophisticated and more accurate than the LDA (such as the generalized gradient approximation, or GGA) is now widespread.

An important consequence of the variational character of DFT is that the Hellmann–Feynman form for forces, Eq. (5), is still valid in a DFT framework. In fact, the DFT expression for forces contains a term coming from the explicit differentiation of the energy functional $E(\{\mathbf{R}\})$ with respect to atomic positions, plus a term coming from the implicit dependence via the derivative of the charge density:

$$\mathbf{F}_I^{\mathrm{DFT}} = -\int n(\mathbf{r})\,\frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I}\,d\mathbf{r} - \frac{\partial E_N(\{\mathbf{R}\})}{\partial \mathbf{R}_I} - \int \frac{\delta E(\{\mathbf{R}\})}{\delta n(\mathbf{r})}\,\frac{\partial n(\mathbf{r})}{\partial \mathbf{R}_I}\,d\mathbf{r}. \tag{14}$$

The last term in Eq. (14) vanishes exactly for the ground-state charge density: the minimum condition implies in fact that the functional derivative of $E(\{\mathbf{R}\})$ equals a constant – the Lagrange multiplier that enforces the constraint on the total number of electrons – and the integral of the derivative of the electron density is zero because of charge conservation. As a consequence, $\mathbf{F}_I^{\mathrm{DFT}} = \mathbf{F}_I$ as in Eq. (5). Forces in DFT can thus be calculated from the knowledge of the electron charge density.

IFCs can be calculated as finite differences of Hellmann–Feynman forces for small finite displacements of atoms around the equilibrium positions. For finite systems (molecules, clusters) this technique is straightforward, but it may also be used in solid-state physics (frozen phonon technique). An alternative technique is the direct calculation of IFCs using density-functional perturbation theory (DFPT) [1–3].
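The finite-difference route to the IFCs can be sketched in a few lines. The example below is a minimal illustration, not a DFT calculation: the forces come from a hypothetical one-dimensional chain with anharmonic springs (the function `chain_forces` and its parameters `k`, `c`, and `d0` are assumptions made for this example) standing in for Hellmann–Feynman forces, and Eq. (8) is approximated by central differences.

```python
import numpy as np


def chain_forces(x, k=2.0, c=5.0, d0=1.0):
    """Forces on a 1D open chain with bond energy k*s^2/2 + c*s^4/4, s = bond stretch.

    This toy model stands in for the Hellmann-Feynman forces of a real calculation.
    """
    x = np.asarray(x, dtype=float)
    s = np.diff(x) - d0          # stretch of each bond
    g = k * s + c * s**3         # derivative of the bond energy with respect to s
    f = np.zeros_like(x)
    f[:-1] += g                  # left atom of each bond is pulled towards the right
    f[1:] -= g                   # right atom of each bond is pulled towards the left
    return f


def force_constants(forces, r0, h=1e-3):
    """Central-difference estimate of C_IJ = -dF_I/dR_J, Eq. (8), at the geometry r0."""
    r0 = np.asarray(r0, dtype=float)
    n = r0.size
    c = np.empty((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        c[:, j] = -(forces(r0 + step) - forces(r0 - step)) / (2.0 * h)
    return 0.5 * (c + c.T)       # symmetrize away the small numerical asymmetry


if __name__ == "__main__":
    equilibrium = np.array([0.0, 1.0, 2.0])
    print(force_constants(chain_forces, equilibrium))
```

Each column of the matrix costs two force evaluations, and the step `h` has to be small enough for anharmonic terms to be negligible yet large enough to stay above numerical noise in the forces; in a real calculation this trade-off has to be checked case by case.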
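2. Density-Functional Perturbation Theory

An explicit expression for the IFCs can be obtained by differentiating the forces with respect to nuclear coordinates, as in Eq. (8):

$$\frac{\partial^2 E(\{\mathbf{R}\})}{\partial \mathbf{R}_I\,\partial \mathbf{R}_J} = \int \frac{\partial n(\mathbf{r})}{\partial \mathbf{R}_J}\,\frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I}\,d\mathbf{r} + \delta_{IJ}\int n(\mathbf{r})\,\frac{\partial^2 V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I\,\partial \mathbf{R}_J}\,d\mathbf{r} + \frac{\partial^2 E_N(\{\mathbf{R}\})}{\partial \mathbf{R}_I\,\partial \mathbf{R}_J}. \tag{15}$$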
The calculation of the IFCs thus requires the knowledge of the ground-state charge density, n(r), as well as of its linear response to a distortion of the nuclear geometry, ∂n(r)/∂R I .
The charge-density linear response can be evaluated by linearizing Eqs. (11)–(13), with respect to derivatives of KS orbitals, density, and potential, respectively. Linearization of Eq. (11) leads to:

$$\frac{\partial n(\mathbf{r})}{\partial \mathbf{R}_I} = 4\,\mathrm{Re}\sum_{n=1}^{N/2}\psi_n^{*}(\mathbf{r})\,\frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I}. \tag{16}$$

Whenever the unperturbed Hamiltonian is time-reversal invariant, eigenfunctions are either real, or they occur in conjugate pairs, so that the prescription to keep only the real part in the above formula can be dropped. The derivatives of the KS orbitals, $\partial\psi_n(\mathbf{r})/\partial\mathbf{R}_I$, are obtained from linearization of Eqs. (12) and (13):

$$(H_{\mathrm{SCF}} - \epsilon_n)\,\frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I} = -\left(\frac{\partial V_{\mathrm{SCF}}(\mathbf{r})}{\partial \mathbf{R}_I} - \frac{\partial \epsilon_n}{\partial \mathbf{R}_I}\right)\psi_n(\mathbf{r}), \tag{17}$$

where

$$\frac{\partial V_{\mathrm{SCF}}(\mathbf{r})}{\partial \mathbf{R}_I} = \frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I} + e^2\int \frac{1}{|\mathbf{r}-\mathbf{r}'|}\,\frac{\partial n(\mathbf{r}')}{\partial \mathbf{R}_I}\,d\mathbf{r}' + \int \frac{\delta v_{xc}(\mathbf{r})}{\delta n(\mathbf{r}')}\,\frac{\partial n(\mathbf{r}')}{\partial \mathbf{R}_I}\,d\mathbf{r}' \tag{18}$$

is the first-order derivative of the self-consistent potential, and

$$\frac{\partial \epsilon_n}{\partial \mathbf{R}_I} = \left\langle \psi_n \left| \frac{\partial V_{\mathrm{SCF}}}{\partial \mathbf{R}_I} \right| \psi_n \right\rangle \tag{19}$$

is the first-order derivative of the KS eigenvalue, $\epsilon_n$. The form of the right-hand side of Eq. (17) ensures that $\partial\psi_n(\mathbf{r})/\partial\mathbf{R}_I$ can be chosen so as to have a vanishing component along $\psi_n(\mathbf{r})$ and thus the singularity of the linear system in Eq. (17) can be ignored.

Equations (16)–(18) form a set of self-consistent linear equations. The linear system, Eq. (17), can be solved for each of the $N/2$ derivatives $\partial\psi_n(\mathbf{r})/\partial\mathbf{R}_I$ separately, the charge-density response is then calculated from Eq. (16), and the potential response $\partial V_{\mathrm{SCF}}/\partial\mathbf{R}_I$ is updated from Eq. (18), until self-consistency is achieved. Only the knowledge of the occupied states of the system is needed to construct the right-hand side of the equation, and efficient iterative algorithms – such as conjugate gradient or minimal residual methods – can be used for the solution of the linear system. In the atomic physics literature, an equation analogous to Eq. (17) is known as the Sternheimer equation, and its self-consistent version was used to calculate atomic polarizabilities. Similar methods are known in the quantum chemistry literature, under the name of coupled Hartree–Fock method for the Hartree–Fock approximation [4, 5].
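The self-consistent cycle formed by Eqs. (16)–(19) can be illustrated on a deliberately simple model. Everything in the sketch below is an assumption made for illustration and is not the authors' implementation: the "crystal" is a short tight-binding chain, the Hartree and exchange-correlation kernels of Eq. (18) are replaced by a hypothetical contact interaction of strength `g`, the linear systems are solved by dense least squares rather than by conjugate gradients, and a damped update (`mix`) is added to both loops as a numerical safeguard.

```python
import numpy as np


def ks_hamiltonian(v_ext, n, g):
    """Toy KS Hamiltonian on an open chain: hopping + v_ext + contact interaction g*n."""
    size = len(v_ext)
    hopping = -(np.eye(size, k=1) + np.eye(size, k=-1))
    return hopping + np.diag(v_ext + g * n)


def ground_state(v_ext, n_occ, g, mix=0.5, iters=300):
    """Self-consistent density, eigenvalues and orbitals (simple linear mixing)."""
    n = np.zeros(len(v_ext))
    for _ in range(iters):
        _, psi = np.linalg.eigh(ks_hamiltonian(v_ext, n, g))
        n_out = 2.0 * np.sum(psi[:, :n_occ] ** 2, axis=1)    # Eq. (11)
        n = (1.0 - mix) * n + mix * n_out
    eps, psi = np.linalg.eigh(ks_hamiltonian(v_ext, n, g))
    return n, eps, psi


def density_response(v_ext, dv_ext, n_occ, g, mix=0.5, iters=300):
    """Self-consistent first-order density response dn to a perturbation dv_ext."""
    n, eps, psi = ground_state(v_ext, n_occ, g)
    h_scf = ks_hamiltonian(v_ext, n, g)
    identity = np.eye(len(v_ext))
    dn = np.zeros(len(v_ext))
    for _ in range(iters):
        dv_scf = dv_ext + g * dn                             # toy analogue of Eq. (18)
        dn_out = np.zeros_like(dn)
        for k in range(n_occ):
            psi_k = psi[:, k]
            deps = psi_k @ (dv_scf * psi_k)                  # Eq. (19)
            rhs = -(dv_scf - deps) * psi_k                   # right-hand side of Eq. (17)
            # Minimum-norm least-squares solution: it has no component along psi_k.
            dpsi = np.linalg.lstsq(h_scf - eps[k] * identity, rhs, rcond=None)[0]
            dn_out += 4.0 * psi_k * dpsi                     # Eq. (16), real orbitals
        dn = (1.0 - mix) * dn + mix * dn_out                 # damped update (added safeguard)
    return dn


if __name__ == "__main__":
    v = 0.5 * (-1.0) ** np.arange(8)     # staggered potential opens a gap at half filling
    dv = np.cos(0.7 * np.arange(8))      # arbitrary perturbing potential
    print(density_response(v, dv, n_occ=4, g=0.2))
```

A useful check of such a sketch is to compare the result with a finite difference of two full ground-state calculations at `v + h*dv` and `v - h*dv`; the linear-response route obtains the same derivative from the unperturbed orbitals alone.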
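The connection with standard first-order perturbation (linear-response) theory can be established by expressing Eq. (17) as a sum over the spectrum of the unperturbed Hamiltonian:

$$\frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I} = \sum_{m\neq n}\psi_m(\mathbf{r})\,\frac{1}{\epsilon_n - \epsilon_m}\left\langle \psi_m \left| \frac{\partial V_{\mathrm{SCF}}}{\partial \mathbf{R}_I} \right| \psi_n \right\rangle, \tag{20}$$

with the sum running over all the states of the system, occupied and empty. Using Eq. (20), the electron charge-density linear response, Eq. (16), can be recast into the form:

$$\frac{\partial n(\mathbf{r})}{\partial \mathbf{R}_I} = 4\sum_{n=1}^{N/2}\sum_{m\neq n}\psi_n^{*}(\mathbf{r})\,\psi_m(\mathbf{r})\,\frac{1}{\epsilon_n - \epsilon_m}\left\langle \psi_m \left| \frac{\partial V_{\mathrm{SCF}}}{\partial \mathbf{R}_I} \right| \psi_n \right\rangle. \tag{21}$$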
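This equation shows that the contributions to the electron-density response coming from products of occupied states cancel each other. As a consequence, in Eq. (17) the derivatives $\partial\psi_n(\mathbf{r})/\partial\mathbf{R}_I$ can be assumed to be orthogonal to all states of the occupied manifold.

An alternative and equivalent point of view is obtained by inserting Eq. (16) into Eq. (18) and the resulting equation into Eq. (17). The set of $N/2$ self-consistent linear systems is thus recast into a single huge linear system for all the $N/2$ derivatives $\partial\psi_n(\mathbf{r})/\partial\mathbf{R}_I$:

$$(H_{\mathrm{SCF}} - \epsilon_n)\,\frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I} + \sum_{m=1}^{N/2}\left(K_{nm}\,\frac{\partial \psi_m}{\partial \mathbf{R}_I}\right)(\mathbf{r}) = -\frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I}\,\psi_n(\mathbf{r}), \tag{22}$$

under the orthogonality constraints:

$$\left\langle \psi_n \,\middle|\, \frac{\partial \psi_n}{\partial \mathbf{R}_I} \right\rangle = 0. \tag{23}$$

The nonlocal operator $K_{nm}$ is defined as:

$$\left(K_{nm}\,\frac{\partial \psi_m}{\partial \mathbf{R}_I}\right)(\mathbf{r}) = 4\int \psi_n(\mathbf{r})\left(\frac{e^2}{|\mathbf{r}-\mathbf{r}'|} + \frac{\delta v_{xc}(\mathbf{r})}{\delta n(\mathbf{r}')}\right)\psi_m^{*}(\mathbf{r}')\,\frac{\partial \psi_m(\mathbf{r}')}{\partial \mathbf{R}_I}\,d\mathbf{r}'. \tag{24}$$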
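Two statements made in connection with Eqs. (20) and (21) can be verified numerically: that the solution of the Sternheimer equation, Eq. (17), coincides with the sum over states of Eq. (20), and that terms with both states occupied drop out of the density response. The sketch below assumes a small random real symmetric matrix as a stand-in for the unperturbed Hamiltonian and a fixed (non-self-consistent) perturbation `dv`; all names are hypothetical and the "density" is simply evaluated on the basis index.

```python
import numpy as np


def first_order_orbital(h, dv, n, states):
    """Eq. (20), with the sum over m restricted to the indices listed in `states`."""
    eps, psi = np.linalg.eigh(h)
    return sum(psi[:, m] * (psi[:, m] @ dv @ psi[:, n]) / (eps[n] - eps[m])
               for m in states if m != n)


def sternheimer(h, dv, n):
    """Minimum-norm solution of Eq. (17) for orbital n at fixed perturbation dv."""
    eps, psi = np.linalg.eigh(h)
    identity = np.eye(len(h))
    deps = psi[:, n] @ dv @ psi[:, n]                  # first-order eigenvalue, Eq. (19)
    rhs = -(dv - deps * identity) @ psi[:, n]
    return np.linalg.lstsq(h - eps[n] * identity, rhs, rcond=None)[0]


def density_response(h, dv, n_occ, states):
    """Eq. (21) for real orbitals, with intermediate states m taken from `states`."""
    _, psi = np.linalg.eigh(h)
    return 4.0 * sum(psi[:, n] * first_order_orbital(h, dv, n, states)
                     for n in range(n_occ))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.normal(size=(2, 6, 6))
    h, dv = 0.5 * (a + a.T), 0.5 * (b + b.T)           # random symmetric test matrices
    all_states, empty_states = range(6), range(2, 6)
    print(np.abs(sternheimer(h, dv, 0) - first_order_orbital(h, dv, 0, all_states)).max())
    print(np.abs(density_response(h, dv, 2, all_states)
                 - density_response(h, dv, 2, empty_states)).max())
```

Both printed differences should be at the level of rounding error. The practical consequence is the one stated in the text: the linear system can be solved without ever constructing the empty states explicitly.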
The same expression can be derived from a variational principle. The energy functional, Eq. (9), is written in terms of the perturbing potential and of the perturbed KS orbitals:

$$V^{(u_I)} \simeq V_{\{\mathbf{R}\}}(\mathbf{r}) + u_I\,\frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I}, \qquad \psi_n^{(u_I)} \simeq \psi_n(\mathbf{r}) + u_I\,\frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I}, \tag{25}$$

and expanded up to second order in the strength $u_I$ of the perturbation. The first-order term gives the Hellmann–Feynman forces. The second-order one is a quadratic functional of the derivatives $\partial\psi_n(\mathbf{r})/\partial\mathbf{R}_I$, whose minimization yields Eq. (22). This approach forms the basis of variational DFPT [6, 7], in which all the IFCs are expressed as minima of suitable functionals. The big linear system of Eq. (22) can be directly solved with iterative methods, yielding a solution that is perfectly equivalent to the self-consistent solution of the smaller linear systems of Eq. (17). The choice between the two approaches is thus a matter of computational strategy.
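3. Phonon Modes in Crystals

In perfect crystalline solids, the position of the $I$th atom can be written as:

$$\mathbf{R}_I = \mathbf{R}_l + \boldsymbol{\tau}_s = l_1\mathbf{a}_1 + l_2\mathbf{a}_2 + l_3\mathbf{a}_3 + \boldsymbol{\tau}_s, \tag{26}$$

where $\mathbf{R}_l$ is the position of the $l$th unit cell in the Bravais lattice and $\boldsymbol{\tau}_s$ is the equilibrium position of the $s$th atom in the unit cell. $\mathbf{R}_l$ can be expressed as a sum of the three primitive translation vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, with integer coefficients $l_1$, $l_2$, $l_3$. The electronic states are classified by a wave-vector $\mathbf{k}$ and a band index $\nu$:

$$\psi_n(\mathbf{r}) \equiv \psi_{\nu,\mathbf{k}}(\mathbf{r}), \qquad \psi_{\nu,\mathbf{k}}(\mathbf{r} + \mathbf{R}_l) = e^{i\mathbf{k}\cdot\mathbf{R}_l}\,\psi_{\nu,\mathbf{k}}(\mathbf{r}) \quad \forall\, l, \tag{27}$$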
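where $\mathbf{k}$ is in the first Brillouin zone, i.e., the unit cell of the reciprocal lattice, the latter being defined as the set of all vectors $\{\mathbf{G}\}$ such that $\mathbf{G}_l\cdot\mathbf{R}_m = 2\pi n$, with $n$ an integer. Normal modes in crystals (phonons) are also classified by a wave-vector $\mathbf{q}$ and a mode index $\nu$. Phonon frequencies, $\omega(\mathbf{q})$, and displacement patterns, $U_s^{\alpha}(\mathbf{q})$, are determined by the secular equation:

$$\sum_{t,\beta}\left(\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) - M_s\,\omega^2(\mathbf{q})\,\delta_{st}\,\delta_{\alpha\beta}\right)U_t^{\beta}(\mathbf{q}) = 0. \tag{28}$$

The dynamical matrix, $\tilde{C}_{st}^{\alpha\beta}(\mathbf{q})$, is the Fourier transform of real-space IFCs:

$$\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) = \sum_l e^{-i\mathbf{q}\cdot\mathbf{R}_l}\,C_{st}^{\alpha\beta}(\mathbf{R}_l). \tag{29}$$

The latter are defined as

$$C_{st}^{\alpha\beta}(l, m) \equiv \frac{\partial^2 E}{\partial u_s^{\alpha}(l)\,\partial u_t^{\beta}(m)} = C_{st}^{\alpha\beta}(\mathbf{R}_l - \mathbf{R}_m), \tag{30}$$

where $\mathbf{u}_s(l)$ is the deviation from the equilibrium position of atom $s$ in the $l$th unit cell:

$$\mathbf{R}_I = \mathbf{R}_l + \boldsymbol{\tau}_s + \mathbf{u}_s(l). \tag{31}$$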
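Eqs. (28) and (29) can be put to work on a textbook case. The sketch below assumes a one-dimensional diatomic chain with nearest-neighbour springs of strength `k` and lattice parameter `a`; the helper names and the dictionary layout (cell index mapped to a 2 × 2 block of IFCs) are illustrative choices made here, not a prescribed data format.

```python
import numpy as np


def dynamical_matrix(q, ifcs, a):
    """Eq. (29) for a 1D crystal: C~(q) = sum_l exp(-i q l a) C(R_l)."""
    return sum(np.exp(-1j * q * l * a) * np.asarray(c, dtype=complex)
               for l, c in ifcs.items())


def phonon_frequencies(q, ifcs, masses, a=1.0):
    """Solve Eq. (28) at wave-vector q; returns the frequencies in ascending order."""
    m = np.asarray(masses, dtype=float)
    d = dynamical_matrix(q, ifcs, a) / np.sqrt(np.outer(m, m))  # mass-weighted, Hermitian
    omega2 = np.linalg.eigvalsh(d)
    return np.sqrt(np.clip(omega2, 0.0, None))                   # clip rounding noise below zero


def diatomic_chain_ifcs(k):
    """Real-space IFCs C_st(R_l) for nearest-neighbour springs k, keyed by the cell index l."""
    return {
        0: [[2.0 * k, -k], [-k, 2.0 * k]],
        1: [[0.0, -k], [0.0, 0.0]],     # atom 0 in cell +1 coupled to atom 1 in cell 0
        -1: [[0.0, 0.0], [-k, 0.0]],    # atom 1 in cell -1 coupled to atom 0 in cell 0
    }


if __name__ == "__main__":
    ifcs = diatomic_chain_ifcs(k=1.5)
    for q in np.linspace(0.0, np.pi, 5):
        print(q, phonon_frequencies(q, ifcs, masses=[1.0, 3.0]))
```

The lower branch vanishes at q = 0 (acoustic mode) and the upper one stays finite (optical mode), which reproduces the familiar analytic dispersion of the diatomic chain.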
Because of translational invariance, the real-space IFCs, Eq. (30), depend on l and m only through the difference Rl − Rm . The derivatives are evaluated
at $\mathbf{u}_s(l) = 0$ for all the atoms. The direct calculation of such derivatives in an infinite periodic system is however not possible, since the displacement of a single atom would break the translational symmetry of the system. The elements of the dynamical matrix, Eq. (29), can be written as second derivatives of the energy with respect to a lattice distortion of wave-vector $\mathbf{q}$:

$$\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) = \frac{1}{N_c}\,\frac{\partial^2 E}{\partial u_s^{*\alpha}(\mathbf{q})\,\partial u_t^{\beta}(\mathbf{q})}, \tag{32}$$

where $N_c$ is the number of unit cells in the crystal, and $\mathbf{u}_s(\mathbf{q})$ is the amplitude of the lattice distortion:

$$\mathbf{u}_s(l) = \mathbf{u}_s(\mathbf{q})\,e^{i\mathbf{q}\cdot\mathbf{R}_l}. \tag{33}$$
In the frozen-phonon approach, the calculation of the dynamical matrix at a generic point of the Brillouin zone presents the additional difficulty that a crystal with a small distortion, Eq. (33), “frozen-in,” loses the original periodicity, unless $\mathbf{q} = 0$. As a consequence, an enlarged unit cell, called supercell, is required for the calculation of IFCs at any $\mathbf{q} \neq 0$. The suitable supercell for a perturbation of wave-vector $\mathbf{q}$ must be big enough to accommodate $\mathbf{q}$ as one of the reciprocal-lattice vectors. Since the computational effort needed to determine the forces (i.e., the electronic states) grows approximately as the cube of the supercell size, the frozen-phonon method is in practice limited to lattice distortions that do not increase the unit cell size by more than a small factor, or to lattice-periodical ($\mathbf{q} = 0$) phonons.

The dynamical matrix, Eq. (32), can be decomposed into an electronic and an ionic contribution:

$$\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) = {}^{el}\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) + {}^{ion}\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}), \tag{34}$$

where:

$${}^{el}\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) = \frac{1}{N_c}\left[\int \left(\frac{\partial n(\mathbf{r})}{\partial u_s^{\alpha}(\mathbf{q})}\right)^{*}\frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial u_t^{\beta}(\mathbf{q})}\,d\mathbf{r} + \delta_{st}\int n(\mathbf{r})\,\frac{\partial^2 V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial u_s^{*\alpha}(\mathbf{q}=0)\,\partial u_t^{\beta}(\mathbf{q}=0)}\,d\mathbf{r}\right]. \tag{35}$$

The ionic contribution – the last term in Eq. (15) – comes from the derivatives of the nuclear electrostatic energy, Eq. (3), and does not depend on the electronic structure. The second term in Eq. (35) depends only on the charge density of the unperturbed system and is easy to evaluate. The first term in Eq. (35) depends on the charge-density linear response to the lattice distortion of Eq. (33), corresponding to a perturbing potential characterized by a single wave-vector $\mathbf{q}$:

$$\frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{u}_s(\mathbf{q})} = -\sum_l \frac{\partial v_s(\mathbf{r} - \mathbf{R}_l - \boldsymbol{\tau}_s)}{\partial \mathbf{r}}\,e^{i\mathbf{q}\cdot\mathbf{R}_l}. \tag{36}$$
An advantage of DFPT with respect to the frozen-phonon technique is that the linear response to a monochromatic perturbation is also monochromatic with the same wave-vector $\mathbf{q}$. This is a consequence of the linearity of DFPT equations with respect to the perturbing potential, especially evident in Eq. (22). The calculation of the dynamical matrix can thus be performed for any $\mathbf{q}$-vector without introducing supercells: the dependence on $\mathbf{q}$ factors out and all the calculations can be performed on lattice-periodic functions. Real-space IFCs can then be obtained via discrete (fast) Fourier transforms. To this end, dynamical matrices are first calculated on a uniform grid of $\mathbf{q}$-vectors in the Brillouin zone:

$$\mathbf{q}_{l_1,l_2,l_3} = l_1\frac{\mathbf{b}_1}{N_1} + l_2\frac{\mathbf{b}_2}{N_2} + l_3\frac{\mathbf{b}_3}{N_3}, \tag{37}$$

where $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ are the primitive translation vectors of the reciprocal lattice, and $l_1$, $l_2$, $l_3$ are integers running from 0 to $N_1 - 1$, $N_2 - 1$, $N_3 - 1$, respectively. A discrete Fourier transform produces the IFCs in real space: $\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}_{l_1,l_2,l_3}) \rightarrow C_{st}^{\alpha\beta}(\mathbf{R}_{l_1,l_2,l_3})$, where the real-space grid contains all $\mathbf{R}$-vectors inside a supercell, whose primitive translation vectors are $N_1\mathbf{a}_1$, $N_2\mathbf{a}_2$, $N_3\mathbf{a}_3$:

$$\mathbf{R}_{l_1,l_2,l_3} = l_1\mathbf{a}_1 + l_2\mathbf{a}_2 + l_3\mathbf{a}_3. \tag{38}$$

Once this has been done, the IFCs thus obtained can be used to calculate dynamical matrices inexpensively, via (inverse) Fourier transform, at any $\mathbf{q}$ vector not included in the original reciprocal-space mesh. This procedure is known as Fourier interpolation. The number of dynamical matrix calculations to be performed, $N_1 N_2 N_3$, is related to the range of the IFCs in real space: the real-space grid must be big enough to yield negligible values for the IFCs at the boundary vectors. In simple crystals, this goal is typically achieved for relatively small values of $N_1$, $N_2$, $N_3$ [8, 9]. For instance, the phonon dispersions of Si and Ge shown in Fig. 1 were obtained with $N_1 = N_2 = N_3 = 4$.
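Fourier interpolation can be demonstrated in one dimension with a model whose dynamical matrix is known in closed form. The sketch below is an illustration under stated assumptions: `exact_dynamical_matrix` is a hypothetical stand-in (a diatomic chain with nearest-neighbour springs) for the matrices that a DFPT code would produce on the grid of Eq. (37), and the mapping of cell indices to their shortest periodic image is one simple choice that suffices when the IFCs are short-ranged.

```python
import numpy as np


def exact_dynamical_matrix(q, k=1.0, a=1.0):
    """C~(q) of a diatomic chain with nearest-neighbour springs k (model 'DFPT output')."""
    off = -k * (1.0 + np.exp(-1j * q * a))
    return np.array([[2.0 * k, off], [np.conj(off), 2.0 * k]])


def real_space_ifcs(dyn_on_grid, a=1.0):
    """Inverse discrete Fourier transform of C~(q_j), q_j = 2*pi*j/(N*a), j = 0..N-1."""
    n_q = len(dyn_on_grid)
    q_grid = 2.0 * np.pi * np.arange(n_q) / (n_q * a)
    ifcs = {}
    for l in range(n_q):
        c_l = sum(np.exp(1j * q * l * a) * d for q, d in zip(q_grid, dyn_on_grid)) / n_q
        # Map the cell index into (-N/2, N/2] so that R_l is the shortest periodic image.
        l_short = l - n_q if l > n_q // 2 else l
        ifcs[l_short] = c_l
    return ifcs


def interpolate(q, ifcs, a=1.0):
    """Eq. (29): C~(q) = sum_l exp(-i q R_l) C(R_l), evaluated at an arbitrary q."""
    return sum(np.exp(-1j * q * l * a) * c for l, c in ifcs.items())


if __name__ == "__main__":
    n_q = 4
    grid = [exact_dynamical_matrix(2.0 * np.pi * j / n_q) for j in range(n_q)]
    ifcs = real_space_ifcs(grid)
    q_test = 0.37  # a wave-vector that is not on the grid
    print(np.abs(interpolate(q_test, ifcs) - exact_dynamical_matrix(q_test)).max())
```

Because the model IFCs vanish beyond first neighbours, a grid of four points already reproduces the dynamical matrix at any q to rounding error. With longer-ranged IFCs the interpolation error reflects whatever is truncated at the supercell boundary, which is why the grid size has to be converged.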
4. Phonons and Macroscopic Electric Fields
Phonons in the long-wavelength limit ($\mathbf{q} \rightarrow 0$) may be associated with a macroscopic polarization, and thus a homogeneous electric field, due to the long-range character of the Coulomb forces. The splitting between longitudinal optic (LO) and transverse optic (TO) modes at $\mathbf{q} = 0$ for simple polar semiconductors (e.g., GaAs), and the absence of LO–TO splitting in nonpolar semiconductors (e.g., Si), is a textbook example of the consequences of this phenomenon.

[Figure 1: two panels, Si and Ge; frequency (cm⁻¹) along Γ–K–X–Γ–L–X–W–L, with a DOS panel.]
Figure 1. Calculated phonon dispersions and density of states for crystalline Si and Ge. Experimental data are denoted by diamonds. Reproduced from Ref. [8].

Macroscopic electrostatics in extended systems is a tricky subject from the standpoint of microscopic ab initio theory. In fact, on the one hand, the macroscopic polarization of an extended system depends on surface effects; on the other hand, the potential which generates a homogeneous electric field is both nonperiodic and not bounded from below: an unpleasant situation when doing calculations using Born–von Kármán periodic boundary conditions. In the last decade, the whole field has been revolutionized by the advent of the so-called modern theory of electric polarization [10, 11]. From the point of view of lattice dynamics, however, a more traditional approach based on perturbation theory is appropriate, because all the pathologies of macroscopic electrostatics disappear in the linear regime, and the polarization response to a homogeneous electric field and/or to a periodic lattice distortion – which is all one needs in order to calculate long-wavelength phonon modes – is perfectly well-defined. In the long-wavelength limit, the most general expression of the energy as a quadratic function of atomic displacements, $\mathbf{u}_s(\mathbf{q} = 0)$ for atom $s$, and of a macroscopic electric field, $\mathbf{E}$, is:
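$$E(\{\mathbf{u}\},\mathbf{E}) = \frac{1}{2}\sum_{st}\mathbf{u}_s\cdot{}^{\mathrm{an}}\tilde{\mathbf{C}}_{st}\cdot\mathbf{u}_t - \frac{\Omega}{8\pi}\,\mathbf{E}\cdot\boldsymbol{\epsilon}_\infty\cdot\mathbf{E} - e\sum_s\mathbf{u}_s\cdot\mathbf{Z}^{*}_s\cdot\mathbf{E}, \tag{39}$$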
where $\Omega$ is the volume of the unit cell; $\boldsymbol{\epsilon}_\infty$ is the electronic (i.e., clamped-nuclei) dielectric tensor of the crystal; $\mathbf{Z}^{*}_s$ is the tensor of Born effective charges [12] for atom $s$; and ${}^{\mathrm{an}}\tilde{\mathbf{C}}$ is the $\mathbf{q}=0$ dynamical matrix of the system, calculated at vanishing macroscopic electric field. Because of Maxwell's equations, the polarization induced by a longitudinal phonon in the q → 0 limit generates a macroscopic electric field which exerts a force on the atoms, thus affecting the phonon frequency. This, in a nutshell, is the physical origin of the LO–TO splitting in polar materials. Minimizing Eq. (39) with respect to the electric field amplitude at fixed lattice distortion yields an expression for the energy which depends on atomic displacements only, defining an effective dynamical matrix which contains an additional (“nonanalytic”) contribution:

$$\tilde{C}_{st}^{\alpha\beta} = {}^{\mathrm{an}}\tilde{C}_{st}^{\alpha\beta} + {}^{\mathrm{na}}\tilde{C}_{st}^{\alpha\beta}, \tag{40}$$

where

$${}^{\mathrm{na}}\tilde{C}_{st}^{\alpha\beta} = \frac{4\pi e^2}{\Omega}\,\frac{\sum_\gamma Z_s^{*\gamma\alpha}q_\gamma\,\sum_\nu Z_t^{*\nu\beta}q_\nu}{\sum_{\gamma,\nu}q_\gamma\,\epsilon_\infty^{\gamma\nu}\,q_\nu} = \frac{4\pi e^2}{\Omega}\,\frac{(\mathbf{q}\cdot\mathbf{Z}^{*}_s)_\alpha\,(\mathbf{q}\cdot\mathbf{Z}^{*}_t)_\beta}{\mathbf{q}\cdot\boldsymbol{\epsilon}_\infty\cdot\mathbf{q}} \tag{41}$$
displays a nonanalytic behavior in the limit q → 0. As a consequence, the resulting IFCs are long-range in real space, with the inverse-cube dependence on the interatomic distance that is typical of the dipole–dipole interaction. Because of this long-range behavior, the Fourier technique described above must be modified: a suitably chosen function of q, whose q → 0 limit is the same as in Eq. (41), is subtracted from the dynamical matrix in q-space. This procedure makes the residual IFCs short-range and suitable for Fourier transform on a relatively small grid of points. The nonanalytic term previously subtracted out in q-space is then re-added in real space. An example of the application of this procedure is shown in Fig. 2, for the phonon dispersions of some III–V semiconductors. The link between the phenomenological parameters $\mathbf{Z}^{*}$ and $\boldsymbol{\epsilon}_\infty$ of Eq. (39) and their microscopic expression is provided by conventional electrostatics. From Eq. (39) we obtain the expression for the electric induction $\mathbf{D}$:

$$\mathbf{D} \equiv -\frac{4\pi}{\Omega}\frac{\partial E}{\partial\mathbf{E}} = \frac{4\pi e}{\Omega}\sum_s\mathbf{Z}^{*}_s\cdot\mathbf{u}_s + \boldsymbol{\epsilon}_\infty\cdot\mathbf{E}, \tag{42}$$
from which the macroscopic polarization, $\mathbf{P}$, is obtained via $\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}$. One finds the known result relating $\mathbf{Z}^{*}$ to the polarization induced by atomic displacements, at zero electric field:

$$Z_s^{*\alpha\beta} = \frac{\Omega}{e}\left.\frac{\partial P_\alpha}{\partial u_s^\beta(\mathbf{q}=0)}\right|_{\mathbf{E}=0}; \tag{43}$$

while the electronic dielectric-constant tensor $\boldsymbol{\epsilon}_\infty$ is the derivative of the polarization with respect to the macroscopic electric field at clamped nuclei:

$$\epsilon_\infty^{\alpha\beta} = \delta_{\alpha\beta} + 4\pi\left.\frac{\partial P_\alpha}{\partial E_\beta}\right|_{\mathbf{u}_s(\mathbf{q}=0)=0}. \tag{44}$$

Figure 2. Calculated phonon dispersions and density of states for several III–V zincblende semiconductors. Experimental data are denoted by diamonds. Reproduced from Ref. [8]. (Panels: GaAs, AlAs, GaSb, AlSb; frequency in cm$^{-1}$ along Γ–K–X–Γ–L–X–W–L, followed by the DOS.)
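As an illustration of the nonanalytic term of Eq. (41), the following sketch evaluates it with NumPy for given effective-charge and dielectric tensors. The function name, the array layout (`z_eff[s, gamma, alpha]`), and the parameter `e2` (the value of $e^2$ in the chosen unit system) are assumptions made for this example; they do not come from any particular code.

```python
import numpy as np


def nonanalytic_term(q, z_eff, eps_inf, volume, e2=1.0):
    """Nonanalytic contribution of Eq. (41), returned as C_na[s, alpha, t, beta].

    q       : wave-vector (only its direction matters), shape (3,)
    z_eff   : Born effective charges, z_eff[s, gamma, alpha], shape (n_atoms, 3, 3)
    eps_inf : electronic dielectric tensor, shape (3, 3)
    volume  : unit-cell volume (Omega)
    e2      : e^2 in the chosen units (assumption: 1.0 in Hartree atomic units)
    """
    q = np.asarray(q, dtype=float)
    q_dot_z = np.einsum("g,sga->sa", q, z_eff)   # (q . Z*_s)_alpha
    q_eps_q = q @ eps_inf @ q                    # q . eps_inf . q
    prefactor = 4.0 * np.pi * e2 / volume
    return prefactor * np.einsum("sa,tb->satb", q_dot_z, q_dot_z) / q_eps_q


# Toy two-atom cubic crystal with isotropic tensors (illustrative numbers only)
z_eff = np.zeros((2, 3, 3))
z_eff[0] = 2.0 * np.eye(3)
z_eff[1] = -2.0 * np.eye(3)
eps_inf = 10.0 * np.eye(3)
c_na = nonanalytic_term([0.0, 0.0, 1.0], z_eff, eps_inf, volume=5.0)
print(c_na[0, 2, 0, 2], c_na[0, 0, 0, 0])
```

Rescaling `q` leaves the result unchanged, so the q → 0 limit depends on the direction of approach. Only displacements along `q` (the longitudinal ones) pick up the extra restoring term, which is the LO–TO splitting in this toy setting.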
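DFPT provides an easy way to calculate $\mathbf{Z}^{*}$ and $\boldsymbol{\epsilon}_\infty$ from first principles [8, 9]. The polarization linearly induced by an atomic displacement is given by the sum of an electronic plus an ionic term:

$$\frac{\partial P_\alpha}{\partial u_s^\beta(\mathbf{q}=0)} = -\frac{e}{N_c\Omega}\int r_\alpha\,\frac{\partial n(\mathbf{r})}{\partial u_s^\beta(\mathbf{q}=0)}\,d\mathbf{r} + \frac{e}{\Omega}Z_s\delta_{\alpha\beta}. \tag{45}$$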
This expression is ill-defined for an infinite crystal with Born–von Kármán periodic boundary conditions, because $\mathbf{r}$ is not a lattice-periodic operator. We remark, however, that we actually only need off-diagonal matrix elements $\langle\psi_m|\mathbf{r}|\psi_n\rangle$ with $m \neq n$ (see the discussion of Eqs. 20 and 21). These can be rewritten as matrix elements of a lattice-periodic operator, using the following trick:

$$\langle\psi_m|\mathbf{r}|\psi_n\rangle = \frac{\langle\psi_m|[H_{\mathrm{SCF}},\mathbf{r}]|\psi_n\rangle}{\epsilon_m-\epsilon_n},\qquad\forall\,m\neq n. \tag{46}$$
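The algebra behind Eq. (46) can be checked numerically in a finite basis. The sketch below assumes a random real symmetric matrix as a stand-in for $H_{\mathrm{SCF}}$ and a diagonal matrix as a stand-in for one Cartesian component of $\mathbf{r}$; both are hypothetical. It illustrates the identity only and says nothing about periodic boundary conditions, where the point of the trick is that the commutator is lattice-periodic while $\mathbf{r}$ is not.

```python
import numpy as np


def position_from_commutator(h, x):
    """Off-diagonal <m|x|n> from <m|[H, x]|n> / (eps_m - eps_n), as in Eq. (46).

    Assumes a non-degenerate spectrum; diagonal elements are left at zero.
    """
    eps, psi = np.linalg.eigh(h)                      # columns of psi are eigenvectors
    comm_mn = psi.conj().T @ (h @ x - x @ h) @ psi    # <m|[H, x]|n>
    gap = eps[:, None] - eps[None, :]                 # eps_m - eps_n
    off_diag = ~np.eye(len(eps), dtype=bool)
    x_mn = np.zeros_like(comm_mn)
    x_mn[off_diag] = comm_mn[off_diag] / gap[off_diag]
    return x_mn, psi


rng = np.random.default_rng(0)
a = rng.normal(size=(6, 6))
h = 0.5 * (a + a.T)                       # stand-in Hermitian Hamiltonian
x = np.diag(np.linspace(-1.0, 1.0, 6))    # stand-in position operator
x_mn, psi = position_from_commutator(h, x)
direct = psi.T @ x @ psi                  # matrix elements computed directly
off_diag = ~np.eye(6, dtype=bool)
print(np.max(np.abs(x_mn[off_diag] - direct[off_diag])))
```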
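The quantity $|\bar{\psi}_n^\alpha\rangle = r_\alpha|\psi_n\rangle$ is the solution of a linear system, analogous to Eq. (17):

$$(H_{\mathrm{SCF}}-\epsilon_n)|\bar{\psi}_n^\alpha\rangle = P_c\,[H_{\mathrm{SCF}},r_\alpha]\,|\psi_n\rangle, \tag{47}$$

where $P_c = 1-\sum_{n=1}^{N/2}|\psi_n\rangle\langle\psi_n|$ projects out the component over the occupied-state manifold. If the self-consistent potential acting on the electrons is local, the above commutator is simply proportional to the momentum operator:

$$[H_{\mathrm{SCF}},\mathbf{r}] = -\frac{\hbar^2}{m}\frac{\partial}{\partial\mathbf{r}}. \tag{48}$$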
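Otherwise, the commutator will contain an explicit contribution from the nonlocal part of the potential [13]. The final expression for the effective charges reads:

$$Z_s^{*\alpha\beta} = Z_s\delta_{\alpha\beta} + \frac{4}{N_c}\sum_{n=1}^{N/2}\left\langle\bar{\psi}_n^\alpha\,\middle|\,\frac{\partial\psi_n}{\partial u_s^\beta(\mathbf{q}=0)}\right\rangle. \tag{49}$$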
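The calculation of $\boldsymbol{\epsilon}_\infty$ requires the response of a crystal to an applied electric field $\mathbf{E}$. The latter is described by a potential, $V(\mathbf{r}) = e\mathbf{E}\cdot\mathbf{r}$, that is neither lattice-periodic nor bounded from below. In the linear-response regime, however, we can use the same trick as in Eq. (46) and replace all the occurrences of $\mathbf{r}|\psi_n\rangle$ with $|\bar{\psi}_n^\alpha\rangle$ calculated as in Eq. (47). The simplest way to calculate $\boldsymbol{\epsilon}_\infty$ is to keep the electric field $\mathbf{E}$ fixed and to iterate on the potential:

$$\frac{\partial V_{\mathrm{SCF}}(\mathbf{r})}{\partial\mathbf{E}} = \frac{\partial V(\mathbf{r})}{\partial\mathbf{E}} + \int\left(\frac{e^2}{|\mathbf{r}-\mathbf{r}'|} + \frac{\delta v_{xc}(\mathbf{r})}{\delta n(\mathbf{r}')}\right)\frac{\partial n(\mathbf{r}')}{\partial\mathbf{E}}\,d\mathbf{r}'. \tag{50}$$

One finally obtains:

$$\epsilon_\infty^{\alpha\beta} = \delta_{\alpha\beta} - \frac{16\pi e}{N_c\Omega}\sum_{n=1}^{N/2}\left\langle\bar{\psi}_n^\alpha\,\middle|\,\frac{\partial\psi_n}{\partial E_\beta}\right\rangle. \tag{51}$$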
Effective charges can also be calculated from the response to an electric field. In fact, they are also proportional to the force acting on an atom upon application of an electric field. Mathematically, this is simply a consequence of the fact that the effective charge can be seen as the second derivative of the energy with respect to an ion displacement and an applied electric field, and its value is obviously independent of the order of differentiation. Alternative approaches – not using perturbation theory – to the calculation of effective charges and of dielectric tensors have been recently developed. Effective charges can be calculated as finite differences of the macroscopic polarization induced by atomic displacements, which in turn can be expressed in terms of a topological quantity – depending on the phase of ground-state orbitals – called the Berry’s phase [10, 11]. When used at the same level of accuracy, the linear-response and Berry’s phase approaches yield the same results. The calculation of the dielectric tensor using the same technique is possible by performing finite electric-field calculations (the electrical equivalent of the frozen-phonon approach). Recently, practical finite-field calculations have become possible [14, 15], using an expression of the position operator that is suitable for periodic systems.
5. Applications

The calculation of vibrational properties in the frozen-phonon approach can be performed using any method that provides accurate forces on atoms. Localized basis-set implementations suffer from the problem of Pulay forces: the last term of Eq. (14) does not vanish if the basis set is incomplete. In order to obtain accurate forces, the Pulay term must be taken into account. The plane-wave (PW) basis set is instead free from this problem: the last term in Eq. (14) vanishes exactly even if the PW basis set is incomplete.
Practical implementation of the DFPT equations is straightforward with PWs and norm-conserving pseudopotentials (PPs). In a PW-PP calculation, only valence electrons are explicitly accounted for, while the interactions between electrons and ionic cores are described by suitable atomic PPs. Norm-conserving PPs contain a nonlocal term of the form:

$$V^{\mathrm{NL}}_{\{\mathbf{R}\}}(\mathbf{r},\mathbf{r}') = \sum_{sl}\sum_{n,m}D_{nm}\,\beta_n^{*}(\mathbf{r}-\mathbf{R}_l-\boldsymbol{\tau}_s)\,\beta_m(\mathbf{r}'-\mathbf{R}_l-\boldsymbol{\tau}_s). \tag{52}$$
The nonlocal character of the PP requires some generalizations of the formulas described in the previous section, which are straightforward. More extensive modifications are necessary for “ultrasoft” PPs [16], which are appropriate for systems containing transition-metal or other atoms that would otherwise require a very large PW basis set when using norm-conserving PPs. Implementations for other kinds of basis sets, such as LMTO, FLAPW, and mixed basis sets (localized atomic-like functions plus PWs), exist as well. Presently, phonon spectra can be calculated for materials described by unit cells or supercells containing up to several tens of atoms. Calculations in simple semiconductors (Figs. 1 and 2) and metals (Fig. 3) are routinely performed with modest computer hardware. Systems that are well described by some flavor of DFT in terms of structural properties have a comparable accuracy in their phonon frequencies (with typical errors of the order of a few percent) and phonon-related quantities. The real interest of phonon calculations in simple systems, however, stems from the possibility to calculate real-space IFCs also in cases for which experimental data would not be sufficient to set up a reliable dynamical model (as, for instance, in AlAs, Fig. 2). The availability of IFCs in real space, and thus of the complete phonon spectra, allows for the accurate evaluation of thermal properties (such as thermal expansion coefficients in the quasi-harmonic approximation) and of electron–phonon coupling coefficients in metals. Calculations in more complex materials are computationally more demanding, but still feasible for a number of nontrivial systems [2]: semiconductor superlattices and heterostructures, ferroelectrics, semiconductor surfaces [18], metal surfaces, and high-Tc superconductors are just a few examples of systems successfully treated in the recent literature.
Figure 3. Calculated phonon dispersions, with spin-polarized GGA (solid lines) and LDA (dotted lines), for Ni in the face-centered cubic structure and Fe in the body-centered cubic structure. Experimental data are denoted by diamonds. Reproduced from Ref. [17]. (Panels: Fe and Ni; vertical axis ω in cm$^{-1}$.)

Figure 4. Phonon dispersions in ice XI at 0, 15, and 35 kbar. Reproduced from Ref. [19]. (Panels: (a) 0 kbar, (b) 15 kbar, (c) 35 kbar; vertical axis ω in cm$^{-1}$.)

Figure 5. Calculated phonon dispersion of Ni2MnGa in the fcc Heusler structure, along the Γ–K–X line in the [110] direction. Experimental data taken at 250 and 370 K are shown for comparison. Reproduced from Ref. [20]. (Branches: LA, TA1, TA2; horizontal axis q = ζ[110] 2π/a; vertical axis frequency in cm$^{-1}$.)

A detailed knowledge of phonon spectra is crucial for the explanation of phonon-related phenomena such as structural phase transitions (under pressure or with temperature) driven by “soft phonons,” pressure-induced amorphization, and Kohn anomalies. Some examples of such phonon-related phenomenology are shown in Figs. 4–6. Figure 4 shows the onset of a phonon anomaly at an incommensurate q-vector under pressure in ice XI, believed to be connected to the observed amorphization under pressure. Figure 5 displays a Kohn anomaly and the related lattice instability in the phonon spectra of the ferromagnetic shape-memory alloy Ni2MnGa. Figure 6 shows a similar anomaly in the phonon spectra of the hydrogenated W(110) surface. DFT-based methods can also be employed to determine Raman and infrared cross sections, which are very helpful quantities when analyzing experimental data. Infrared cross sections are proportional to the square of the polarization induced by a phonon mode. For the νth zone-center (q = 0) mode, characterized by a normalized vibrational eigenvector $U_s^\beta$, the oscillator strength $f$ is given by

$$f = \sum_\alpha\left|\sum_{s\beta}Z_s^{*\alpha\beta}U_s^\beta\right|^2. \tag{53}$$
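Equation (53) is a short tensor contraction. The sketch below is a direct transcription under an assumed array layout (`z_eff[s, alpha, beta]` for the effective charges and `mode[s, beta]` for the eigenvector); the two-atom example values are hypothetical.

```python
import numpy as np


def oscillator_strength(z_eff, mode):
    """Oscillator strength of Eq. (53): f = sum_alpha |sum_{s,beta} Z*_s^{ab} U_s^b|^2."""
    induced = np.einsum("sab,sb->a", z_eff, mode)   # mode-induced polarization direction
    return float(np.sum(np.abs(induced) ** 2))


# Toy diatomic cell with isotropic effective charges +2 and -2
z_eff = np.stack([2.0 * np.eye(3), -2.0 * np.eye(3)])
optical = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]) / np.sqrt(2.0)
rigid_shift = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]) / np.sqrt(2.0)
print(oscillator_strength(z_eff, optical))      # atoms move against each other
print(oscillator_strength(z_eff, rigid_shift))  # rigid translation: no induced dipole
```

The rigid translation gives zero because the effective charges of the toy cell sum to zero, so only the out-of-phase mode is infrared active.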
Figure 6. Phonon dispersions of the clean (left panel) and hydrogenated (right panel) W(110). Full dots indicate electron energy-loss data, open diamonds helium-atom scattering data. Reproduced from Ref. [21]. (Vertical axis: frequency in cm$^{-1}$.)
The calculation of Raman cross sections is difficult in resonance conditions, since knowledge of the excited-state Born–Oppenheimer surfaces is required. Off-resonance Raman cross sections are, however, simply related to the change of the dielectric constant induced by a phonon mode. If the frequency of the incident light, $\omega_i$, is much smaller than the energy band gap, the contribution of the νth vibrational mode to the intensity of the light scattered in Stokes Raman scattering is:

$$I(\nu) \propto \frac{(\omega_i-\omega_\nu)^4}{\omega_\nu}\,r^{\alpha\beta}(\nu), \tag{54}$$

where α and β are the polarizations of the incoming and outgoing light beams, $\omega_\nu$ is the frequency of the νth mode, and the Raman tensor $r^{\alpha\beta}(\nu)$ is defined as:

$$r^{\alpha\beta}(\nu) = \left\langle\left|\frac{\partial\chi^{\alpha\beta}}{\partial e_\nu}\right|^2\right\rangle, \tag{55}$$

where $\chi = (\epsilon_\infty-1)/4\pi$ is the electric polarizability of the system, $e_\nu$ is the coordinate along the vibrational eigenvector $U_s^\beta$ for mode ν, and $\langle\,\rangle$ indicates an average over all the modes degenerate with the νth one. The Raman tensor can be calculated as a finite difference of the dielectric tensor with a phonon frozen-in, or directly from higher-order perturbation theory [22].
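The finite-difference route to the Raman tensor mentioned above can be sketched as follows. Here `eps_of_e` is a hypothetical callable standing in for a frozen-phonon calculation that returns the dielectric tensor at mode amplitude `e`; the mode is assumed non-degenerate, so the average in Eq. (55) is trivial, and the step size is an arbitrary choice.

```python
import numpy as np


def raman_tensor_fd(eps_of_e, step=1e-3):
    """Central finite difference for Eq. (55), assuming a non-degenerate mode.

    eps_of_e(e) must return the 3x3 dielectric tensor with the phonon frozen in
    at amplitude e (hypothetical interface, not taken from any specific code).
    """
    identity = np.eye(3)
    chi_plus = (eps_of_e(+step) - identity) / (4.0 * np.pi)
    chi_minus = (eps_of_e(-step) - identity) / (4.0 * np.pi)
    dchi = (chi_plus - chi_minus) / (2.0 * step)
    return np.abs(dchi) ** 2


def stokes_intensity(omega_i, omega_nu, r_ab):
    """Relative Stokes intensity of Eq. (54), up to a constant prefactor."""
    return (omega_i - omega_nu) ** 4 / omega_nu * r_ab


# Toy model: dielectric tensor varying linearly with the mode amplitude
slope = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.5]])
toy_eps = lambda e: 12.0 * np.eye(3) + e * slope
print(raman_tensor_fd(toy_eps))
```

For a real material each call to `eps_of_e` would be a full dielectric-tensor calculation, so the central difference costs two such calculations per mode.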
6. Outlook

The field of lattice-dynamical calculations based on DFT, in particular in conjunction with perturbation theory, is mature enough to allow a systematic application to systems and materials of increasing complexity. Among the most promising fields of application, we mention the characterization of materials through the prediction of the relation between their atomistic structure and experimentally detectable spectroscopic properties; the study of the structural (in)stability of materials at extreme pressure conditions; the prediction of the temperature dependence of different materials properties using the quasi-harmonic approximation; and the prediction of superconducting properties via the calculation of electron–phonon coupling coefficients. We conclude by mentioning that sophisticated open-source codes for lattice-dynamical calculations [23] are freely available for download from the web.
References
[1] S. Baroni, P. Giannozzi, and A. Testa, “Green’s-function approach to linear response in solids,” Phys. Rev. Lett., 58, 1861, 1987.
[2] S. Baroni, S. de Gironcoli, A. Dal Corso, and P. Giannozzi, “Phonons and related crystal properties from density-functional perturbation theory,” Rev. Mod. Phys., 73, 515–562, 2001.
[3] X. Gonze, “Adiabatic density-functional perturbation theory,” Phys. Rev. A, 52, 1096, 1995.
[4] J. Gerratt and I.M. Mills, J. Chem. Phys., 49, 1719, 1968.
[5] R.D. Amos, In: K.P. Lawley (ed.), Ab initio Methods in Quantum Chemistry – I, Wiley, New York, p. 99, 1987.
[6] X. Gonze, “Perturbation expansion of variational principles at arbitrary order,” Phys. Rev. A, 52, 1086, 1995.
[7] X. Gonze, “First-principles responses of solids to atomic displacements and homogeneous electric fields: Implementation of a conjugate-gradient algorithm,” Phys. Rev. B, 55, 10337, 1997.
[8] P. Giannozzi, S. de Gironcoli, P. Pavone, and S. Baroni, “Ab initio calculation of phonon dispersions in semiconductors,” Phys. Rev. B, 43, 7231, 1991.
[9] X. Gonze and C. Lee, “Dynamical matrices, Born effective charges, dielectric permittivity tensors, and interatomic force constants from density-functional perturbation theory,” Phys. Rev. B, 55, 10355, 1997.
[10] D. Vanderbilt and R.D. King-Smith, “Electric polarization as a bulk quantity and its relation to surface charge,” Phys. Rev. B, 48, 4442, 1993.
[11] R. Resta, “Macroscopic polarization in crystalline dielectrics: the geometrical phase approach,” Rev. Mod. Phys., 66, 899, 1994.
[12] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Oxford University Press, Oxford, 1954.
[13] S. Baroni and R. Resta, “Ab initio calculation of the macroscopic dielectric constant in silicon,” Phys. Rev. B, 33, 7017, 1986.
[14] P. Umari and A. Pasquarello, “Ab initio molecular dynamics in a finite homogeneous electric field,” Phys. Rev. Lett., 89, 157602, 2002.
[15] I. Souza, J. Íñiguez, and D. Vanderbilt, “First-principles approach to insulators in finite electric fields,” Phys. Rev. Lett., 89, 117602, 2002.
[16] D. Vanderbilt, “Soft self-consistent pseudopotentials in a generalized eigenvalue formalism,” Phys. Rev. B, 41, 7892, 1990.
[17] A. Dal Corso and S. de Gironcoli, “Density-functional perturbation theory for lattice dynamics with ultrasoft pseudo-potentials,” Phys. Rev. B, 62, 273, 2000.
[18] J. Fritsch and U. Schröder, “Density-functional calculation of semiconductor surface phonons,” Phys. Rep., 309, 209–331, 1999.
[19] K. Umemoto, R.M. Wentzcovitch, S. Baroni, and S. de Gironcoli, “Anomalous pressure-induced transition(s) in ice XI,” Phys. Rev. Lett., 92, 105502, 2004.
[20] C. Bungaro, K.M. Rabe, and A. Dal Corso, “First-principle study of lattice instabilities in ferromagnetic Ni2MnGa,” Phys. Rev. B, 68, 134104, 2003.
[21] C. Bungaro, S. de Gironcoli, and S. Baroni, “Theory of the anomalous Rayleigh dispersion at H/W(110) surfaces,” Phys. Rev. Lett., 77, 2491, 1996.
[22] M. Lazzeri and F. Mauri, “High-order density-matrix perturbation theory,” Phys. Rev. B, 68, 161101, 2003.
[23] PWscf package: www.pwscf.org. ABINIT: www.abinit.org.
1.11 QUASIPARTICLE AND OPTICAL PROPERTIES OF SOLIDS AND NANOSTRUCTURES: THE GW-BSE APPROACH
Steven G. Louie¹ and Angel Rubio²
¹ Department of Physics, University of California at Berkeley and Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
² Departamento Física de Materiales and Unidad de Física de Materiales Centro Mixto CSIC-UPV, Universidad del País Vasco and Donostia International Physics Center (DIPC)
We present a review of recent progress in the first-principles study of the spectroscopic properties of solids and nanostructures employing a many-body Green’s function approach based on the GW approximation to the electron self-energy. The approach has been widely used to investigate the excited-state properties of condensed matter as probed by photoemission, tunneling, optical, and related techniques. In this article, we first give a brief overview of the theoretical foundations of the approach, then present a sample of applications to systems ranging from extended solids to surfaces to nanostructures, and discuss some possible ideas for further developments.
1. Background
A large part of research in condensed matter science is related to the characterization of the electronic properties of interacting many-electron systems. In particular, an accurate description of the electronic structure and its response to external probes is essential for understanding the behavior of systems ranging from atoms, molecules, and nanostructures to complex materials. Moreover, many characterization tools in physics, chemistry and materials science, as well as electro/optical devices, are spectroscopic in nature, based on the interaction of photons, electrons, or other quanta with matter, exciting the system to higher energy states. Yet, many fundamental questions concerning the conceptual and quantitative descriptions of excited states of condensed matter and their interactions with external probes are still open. Hence there is a strong need for theoretical approaches which can provide an accurate description of the excited-state electronic structure of a system and its response to external probes. In what follows we discuss some recent progress along a very fruitful direction in the first-principles studies of the electronic excited-state properties of materials, employing a many-electron Green’s function approach based on the so-called GW approximation [1–3].

S. Yip (ed.), Handbook of Materials Modeling, 215–240. © 2005 Springer. Printed in the Netherlands.

Solving for the electronic structure of an interacting electron system (in terms of the many-particle Schrödinger equation) has an intrinsically high complexity: while the problem is completely well defined in terms of the total number of particles N and the external potential V(r), its solution depends on 3N coordinates. This makes the direct search for either exact or approximate solutions to the many-body problem a task of rapidly increasing complexity. Fortunately, in the study of either ground- or excited-state properties, we seldom need the full solution to the Schrödinger equation. When one is interested in structural properties, the ground-state total energy is sufficient. In other cases, we want to study how the system responds to some external probe. Then knowledge of a few excited-state properties must be added. For instance, in a direct photoemission experiment, a photon impinges on the system and an electron is removed. In an inverse photoemission process, an electron is absorbed and a photon is ejected. In both cases we just have to deal with the gain or loss of energy of the N-electron system when a single particle is added or removed, i.e., with the one-particle excitation spectrum. If the electron is not removed after the absorption of the photon, the system evolves from its ground state to a neutral excited state, and the process may be described by correlated electron–hole excitation amplitudes.

At the simplest level of treating the many-electron problem, the Hartree–Fock (HF) theory is obtained by considering the ground-state wavefunction to be a single Slater determinant of single-particle orbitals. In this way the N-body problem is reduced to N one-body problems with a self-consistency requirement due to the dependence of the HF effective potential on the wavefunction. By the variational theorem, the HF total energy is a variational upper bound of the ground-state energy for a particular symmetry. The HF eigenvalues may also be used as rough estimates of the one-electron excitation energies. The validity of this procedure hinges on the assumption that the single-particle orbitals in the N and (N−1) systems are the same (Koopmans’ theorem), i.e., on neglecting the electronic relaxation of the system. A better procedure to estimate excitation energies is to perform self-consistent calculations for the N and (N−1) systems and subtract the total energies (this is called the “ΔSCF method” for excitation energies, which has also been used in other theoretical frameworks such as density-functional theory). For infinitely extended systems, this scheme gives the same result as Koopmans’ theorem, and more refined methods are needed to address the problem of one-particle (quasiparticle) excitation energies in solids. The HF theory in general is far from accurate because typically the wavefunction of a system cannot be written as a single determinant for the ground state and Koopmans’ theorem is a poor approximation.

On the other hand, within density-functional theory (DFT), the ground-state energy of an interacting system can be exactly written as a functional of the ground-state electronic density [4]. Compared with conventional quantum chemistry methods, this approach is particularly appealing since solving for the ground-state energy does not rely on the complete knowledge of the N-electron wavefunction but only on the electronic density, reducing the problem to that of a self-consistent field calculation. However, although the theory is exact, the energy functional contains an unknown quantity called the exchange-correlation energy, $E_{xc}[n]$, that has to be approximated in practical implementations. For ground-state properties, in particular those of solids and larger molecular systems, present-day DFT results are comparable to, or even surpass in quality, those from standard ab initio quantum chemistry techniques. Its use has continued to increase due to a better scaling in computational effort with the number of atoms in the system. As in HF theory, the Kohn–Sham eigenvalues of the DFT cannot be directly interpreted as the quasiparticle excitation energies. Such an interpretation has led to the well-known bandgap problem for semiconductors and insulators: the Kohn–Sham gap is typically 30–50% less than the observed band gap.
Indeed, the original formulation of the DFT is not applicable to excited states nor to problems involving time-dependent external fields, thus excluding the calculation of optical response, quasiparticle excitation spectrum, photochemistry, etc. Theorems have, however, been proved subsequently for time-dependent density functional theory (TDDFT), which extends the applicability of the approach to excited-state phenomena [5, 6]. The main result of TDDFT is a set of time-dependent Kohn–Sham equations that include all the many-body effects through a time-dependent exchange-correlation potential. As for static DFT, this potential is unknown and has to be approximated in any practical application. TDDFT has been applied with success to the calculations of quantities such as the electron polarizabilities for the optical spectra of finite systems. However, TDDFT encounters problems in studying spectroscopic properties of extended systems [7] and severely underestimates the high-lying excitation energies in molecules when simple exchange and correlation functionals are employed. These failures are related to our ignorance of the exact exchange-correlation potential in DFT. The actual functional relation between density, $n(\mathbf{r})$, and the exchange-correlation potential, $V_{xc}(\mathbf{r})$, is highly non-analytical and non-local. A very active field of current research is the search for robust, new exchange-correlation functionals for real material applications.
Alternatively, a theoretically well-grounded and rigorous approach for the excited-state properties of condensed matter is the interacting Green’s function approach. The n-particle Green’s function describes the propagation of the n-particle amplitude in an interacting electron system. It provides a proper framework for accurately computing the N-particle excitation properties. For example, knowledge of the one-particle and two-particle Green’s functions yields information, respectively, on the quasiparticle excitations and optical response of a system. The use of this approach for practical study of the spectroscopic properties of real materials is the focus of the present review. In the remainder of the article, we first present a brief overview of the theoretical framework for many-body perturbation theory and discuss the first-principles calculation of properties related to the one- and two-particle Green’s functions within the GW approximation to the electron self-energy operator. Then, we present some selected examples of applications to solids and reduced-dimensional systems. Finally, some conclusions and perspectives are given.
2. Many-body Perturbation Theory and Green’s Functions
A very successful and fruitful development for computing electron excitations has been a first-principles self-energy approach [1–3, 8] in which the quasiparticle’s (excited electron or hole) energy is determined directly by calculating the contribution of the dynamical polarization of the surrounding electrons. In many-body theory, this is obtained by evaluating the evolution of the amplitude of the added particle via the single-particle Green’s function, $G(xt, x't') = -i\langle N|T\{\psi(xt)\psi^\dagger(x't')\}|N\rangle$,* from which one obtains the dispersion relation and lifetime of the quasiparticle excited state. There are no adjustable parameters in the theory and, from the equation of motion of the single-particle Green’s function, the quasiparticle energies $E_{n\mathbf{k}}$ and wavefunctions $\psi_{n\mathbf{k}}$ are determined by solving a Schrödinger-like equation:

$$(T + V_{\mathrm{ext}} + V_H)\,\psi_{n\mathbf{k}}(\mathbf{r}) + \int d\mathbf{r}'\,\Sigma(\mathbf{r},\mathbf{r}';E_{n\mathbf{k}})\,\psi_{n\mathbf{k}}(\mathbf{r}') = E_{n\mathbf{k}}\,\psi_{n\mathbf{k}}(\mathbf{r}), \tag{1}$$

where $T$ is the kinetic energy operator, $V_{\mathrm{ext}}$ is the external potential due to the ions, $V_H$ is the Hartree potential of the electrons, and $\Sigma$ is the self-energy operator where all the many-body exchange and correlation effects are included. The self-energy operator describes an effective potential on the quasiparticle resulting from the interaction with all the other electrons in the system. In general $\Sigma$ is non-local, energy dependent and non-Hermitian, with the imaginary part giving the lifetime of the excited state.

* This corresponds to the Green’s function at zero temperature, where $|N\rangle$ is the many-electron ground state, $\psi(xt)$ is the field operator in the Heisenberg picture, $x$ stands for the spatial coordinates $\mathbf{r}$ plus the spin coordinate, and $T$ is the time-ordering operator. In this context, $\psi^\dagger(xt)|N\rangle$ represents an (N + 1)-electron state in which an electron has been added at time $t$ onto position $\mathbf{r}$.

Similarly, from the two-particle Green’s function, we can obtain the correlated electron–hole amplitude and excitation spectrum, and hence the optical properties. For details of the Green’s function formalism and many-body techniques applied to condensed matter, we refer the reader to several comprehensive papers in the literature [2, 3, 7–10]. Here we shall just present some of the main equations used for the quasiparticle and optical spectra calculations. (To simplify the presentation, we use atomic units in the following, $e = \hbar = m = 1$.) In standard textbooks, the unperturbed system is often taken to be the noninteracting system of electrons under the potential $V_{\mathrm{ion}}(\mathbf{r}) + V_H(\mathbf{r})$. However, for rapid convergence in a perturbation series, it is better to start from a different non-interacting or mean-field scenario, like the Kohn–Sham DFT system, which already includes an attempt to describe exchange and correlations in the actual system. Also, in a many-electron system, the Coulomb interaction between two electrons is readily screened by a dynamic rearrangement of the other electrons, reducing its strength. It is more natural to describe the electron–electron interaction in terms of a screened Coulomb potential $W$ and formulate the self-energy as a perturbation series in terms of $W$. In this approach [1–3], the electron self-energy can then be obtained from a self-consistent set of Dyson-like equations:

$$P(12) = -i\int d(34)\,G(13)\,G(41^+)\,\Gamma(34,2) \tag{2}$$

$$W(12) = v(12) + \int d(34)\,W(13)\,P(34)\,v(42) \tag{3}$$

$$\Sigma(12) = i\int d(34)\,G(14^+)\,W(13)\,\Gamma(42,3) \tag{4}$$

$$G(12) = G_0(12) + \int d(34)\,G_0(13)\,[\Sigma(34) - \delta(34)V_{xc}(4)]\,G(42) \tag{5}$$

$$\Gamma(12,3) = \delta(12)\delta(13) + \int d(4567)\,\frac{\delta\Sigma(12)}{\delta G(45)}\,G(46)\,G(75)\,\Gamma(67,3) \tag{6}$$
where 1 ≡ (x1 , t1 ) and 1+ ≡ (x1 , t1 + η)(η >0 infinitesimal). v stands for the bare Coulomb interaction, P is the irreducible polarization, W is the dynamical screened Coulomb interaction, and is the so-called vertex function. Here G 0 is the single-particle DFT Green’s function, G 0 (x, x ; ω) = n ψn (x)ψn∗ (x)/[ω−εn −iηsgn(µn )], with η a positive infinitesimal and ψn and εn the corresponding DFT wavefunctions and eigenenergies. This way of writing down the equations is in fact appealing since it highlights the important physical ingredients: the polarization (which contains the response of the system to the additional particle or hole) is built up by the creation of particle–hole pairs
S.G. Louie and A. Rubio
(described by the two-particle Green’s functions). The vertex function contains the information that the hole and the electron interact. This set of equations defines an iterative approach that allows us to gather information about quasiparticle excitations and dynamics. The iterative approach of course has to be approximated. We now describe some of the approximations used in the literature to address quasiparticle excitations and their subsequent extension to optical spectroscopy and exciton states.
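To make the structure of the Dyson-like equations concrete, the following minimal sketch solves the matrix analogues of Eqs. (3) and (5) at a single fixed frequency in a small finite basis. It is an illustration under stated assumptions, not the procedure used in the cited work: the two-level numbers, the function names, and the use of NumPy are all hypothetical choices, and the space-time integrals are replaced by plain matrix products.

```python
import numpy as np

def solve_dyson(g0, delta_sigma):
    """Closed-form solution of G = G0 + G0 @ delta_sigma @ G (matrix form of Eq. 5).

    delta_sigma stands for the bracket [Sigma - V_xc] at one fixed frequency.
    """
    identity = np.eye(g0.shape[0])
    # Rearranged as (1 - G0 dSigma) G = G0
    return np.linalg.solve(identity - g0 @ delta_sigma, g0)

def iterate_dyson(g0, delta_sigma, n_iter=200):
    """Fixed-point iteration of the same equation, starting from G = G0."""
    g = g0.copy()
    for _ in range(n_iter):
        g = g0 + g0 @ delta_sigma @ g
    return g

def screen_interaction(v, p):
    """Closed-form solution of W = v + W @ p @ v (matrix form of Eq. 3)."""
    identity = np.eye(v.shape[0])
    # Rearranged as W (1 - p v) = v and solved through the transposed system
    return np.linalg.solve((identity - p @ v).T, v.T).T

# Hypothetical two-level toy data in atomic units (not taken from the chapter).
omega = 0.3 + 0.05j
eps = np.array([-1.0, 1.0])
g0 = np.diag(1.0 / (omega - eps))
delta_sigma = np.array([[0.20, 0.05], [0.05, -0.10]])
v = np.array([[1.0, 0.3], [0.3, 1.0]])
p = np.array([[-0.4, 0.1], [0.1, -0.4]])

g_exact = solve_dyson(g0, delta_sigma)
g_iter = iterate_dyson(g0, delta_sigma)
w = screen_interaction(v, p)
```

For this weakly coupled toy input the fixed-point iteration converges to the closed-form result, and the screened interaction `w` is smaller than the bare `v` on the diagonal, which mirrors the reduction in strength by screening described in the text.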
3. Quasiparticle Excitations: the GW Approach
In practical first-principles implementations, the GW approximation [1] is employed, in which the self-energy operator is taken to be the first-order term in a series expansion in terms of the screened Coulomb interaction $W$ and the dressed Green's function $G$ of the electron:

$$P(12) = -i\, G(12)\, G(21) \tag{7}$$

$$\Sigma(12) = i\, G(12^+)\, W(12) \tag{8}$$

(in frequency space: $\Sigma(r, r'; \omega) = \frac{i}{2\pi} \int d\omega'\, e^{-i\omega'\eta}\, G(r, r', \omega - \omega')\, W(r, r', \omega')$). Vertex corrections are not included in this approximation. This corresponds to the simplest approximation for $\Gamma(12,3)$, assuming it to be diagonal in space and time coordinates, i.e., $\Gamma(12,3) = \delta(12)\delta(13)$. This has to be complemented with Eq. (5) above. Thus, even at the GW level, we have a many-body self-consistent problem. Most ab initio GW applications carry out this self-consistent loop by (1) taking the DFT results as the mean field and (2) varying the energy of the quasiparticle while keeping its wavefunction fixed (equal to the DFT wavefunction). This corresponds to the $G_0W_0$ scheme for the calculation of the quasiparticle energy as a first-order perturbation to the Kohn–Sham energy $\varepsilon_{nk}$:

$$E_{nk} \approx \varepsilon_{nk} + \langle nk|\Sigma(E_{nk}) - V_{xc}|nk\rangle, \tag{9}$$
where $V_{xc}$ is the exchange-correlation potential within DFT and $|nk\rangle$ is the corresponding wavefunction. This "$G_0W_0$" approximation reproduces to within 0.1 eV the experimental band gaps for many semiconductors and insulators and their surfaces, thus circumventing the well-known bandgap problem [2, 3]. It also gives much better HOMO–LUMO gaps and ionization energies in localized systems, as well as results for the lifetimes of hot electrons in metals and image states at surfaces [7]. For some systems, the quasiparticle wavefunction can differ significantly from the DFT wavefunction; one then needs to solve the quasiparticle equation, Eq. (1), directly.
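Because the self-energy in Eq. (9) is evaluated at the unknown quasiparticle energy, the equation has to be solved for $E_{nk}$. The sketch below shows one common way to do this for a single state: a Newton iteration, whose first step from $E = \varepsilon_{nk}$ is the familiar linearized form $E \approx \varepsilon + Z[\Sigma(\varepsilon) - V_{xc}]$ with $Z = 1/(1 - \partial\Sigma/\partial E)$. The model self-energy and all numerical values are hypothetical and serve only to exercise the solver; a real calculation would supply the matrix element $\langle nk|\Sigma(E)|nk\rangle$ from a GW code.

```python
def renormalization_factor(sigma, energy, step=1e-4):
    """Z = 1 / (1 - dSigma/dE), with a central finite difference for the slope."""
    slope = (sigma(energy + step) - sigma(energy - step)) / (2.0 * step)
    return 1.0 / (1.0 - slope)

def linearized_qp_energy(eps_ks, sigma, v_xc):
    """One Newton step from E = eps_ks: E = eps_ks + Z [Sigma(eps_ks) - V_xc]."""
    z = renormalization_factor(sigma, eps_ks)
    return eps_ks + z * (sigma(eps_ks) - v_xc)

def solve_qp_energy(eps_ks, sigma, v_xc, tol=1e-10, max_iter=100):
    """Newton iteration on f(E) = E - eps_ks - Sigma(E) + V_xc = 0 (Eq. 9)."""
    energy = eps_ks
    for _ in range(max_iter):
        residual = energy - eps_ks - sigma(energy) + v_xc
        if abs(residual) < tol:
            return energy
        # Newton update: 1 / f'(E) equals the renormalization factor Z
        energy -= residual * renormalization_factor(sigma, energy)
    raise RuntimeError("quasiparticle equation did not converge")

# Hypothetical single-state model (energies in eV), not taken from the chapter.
EPS_KS = 1.0      # Kohn-Sham eigenvalue
V_XC = -10.0      # <nk|V_xc|nk>

def model_sigma(energy):
    """Smooth, real model for <nk|Sigma(E)|nk> near the Kohn-Sham energy."""
    shift = energy - EPS_KS
    return -9.4 - 0.25 * shift + 0.01 * shift ** 2

e_lin = linearized_qp_energy(EPS_KS, model_sigma, V_XC)
e_qp = solve_qp_energy(EPS_KS, model_sigma, V_XC)
```

For a self-energy that is nearly linear in energy around $\varepsilon_{nk}$, as in this model, the linearized estimate and the converged solution differ only by a few meV.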
Quasiparticle and optical properties of solids and nanostructures
4. Optical Response: the Bethe–Salpeter Equation
From Eqs. (2)–(6) for the GW self-energy, we have a non-vanishing functional derivative $\delta\Sigma/\delta G$. One obtains a second-order correction to the bare vertex $\Gamma^{(1)}(12,3) = \delta(12)\delta(13)$:

$$\Gamma^{(2)}(12,3) = \delta(12)\delta(13) + \int d(4567)\, \frac{\delta\Sigma^{(1)}(12)}{\delta G_0(45)}\, G_0(46)\, G_0(75)\, \Gamma^{(1)}(67,3). \tag{10}$$
This can be viewed as the linear response of the self-energy to a change in the total potential of the system. The vertex correction accounts for exchange-correlation effects between an electron and the other electrons in the screening density cloud. In particular, it includes the electron–hole interaction (excitonic effects) in the dielectric response*. Indeed, the functional derivative of $\Sigma$ with respect to $G$ is responsible for the attractive direct term in the electron–hole interaction that goes into the effective two-particle equation, the Bethe–Salpeter equation, which determines the spectrum and wavefunctions of the correlated electron–hole neutral excitations created, for example, in optical experiments. Taking as first-order self-energy $\Sigma^{(1)} = G_0W_0$, one can derive a Bethe–Salpeter equation that correctly yields features like bound excitons and changes in absorption strength in the optical absorption spectra. Within this scheme [7, 10], the effective two-particle Hamiltonian takes (when static screening is used in $W$) a particularly simple, energy-independent form
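$$\sum_{n_3 n_4} \left[ (\varepsilon_{n_1} - \varepsilon_{n_2})\, \delta_{n_1 n_3} \delta_{n_2 n_4} + u_{(n_1 n_2)(n_3 n_4)} - W_{(n_1 n_2)(n_3 n_4)} \right] A^S_{(n_3 n_4)} = \Omega_S\, A^S_{(n_1 n_2)} \tag{11}$$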
where $A^S$ is the electron–hole amplitude and the matrix elements are taken with respect to the quasiparticle wavefunctions $n_1, \ldots, n_4$ as follows: $u_{(n_1 n_2)(n_3 n_4)} = \langle n_1 n_2|u|n_3 n_4\rangle$ and $W_{(n_1 n_2)(n_3 n_4)} = \langle n_1 n_3|W|n_2 n_4\rangle$, with $u$ equal to the Coulomb potential $v$ except for the long-range component $q = 0$, which is set to zero (that is, $u(q) = 4\pi/q^2$ but with $u(0) = 0$). The solution of Eq. (11) allows one to construct the optical absorption spectrum from the imaginary part of the macroscopic dielectric function $\varepsilon_M$:

$$\mathrm{Im}[\varepsilon_M(\omega)] = \frac{16\pi e^2}{\omega^2} \sum_S \left| \hat{e} \cdot \langle 0|\tfrac{i}{\hbar}[H, r]|S\rangle \right|^2 \delta(\omega - \Omega_S) \tag{12}$$
* Vertex corrections and self-consistency tend to cancel to a large extent for the 3D homogeneous electron gas. This cancellation of vertex corrections with self-consistency seems to be a quite general feature. However, there is no formal justification for it, and further work on consistently including dynamical effects and vertex corrections should be explored (Aryasetiawan and Gunnarsson, 1998; and references therein).
where $\hat{e}$ is the normalized polarization vector of the light and $\tfrac{i}{\hbar}[H, r]$ is the single-particle velocity operator. The sum runs over all the excited states $|S\rangle$ of the system (with excitation energy $\Omega_S$) and $|0\rangle$ is the ground state. One of the main effects of the electron–hole interaction is the coupling of different electron–hole configurations (denoted by $|he\rangle$), which modifies the usual interband transition matrix elements that appear in Eq. (12) to:

$$\langle 0|\tfrac{i}{\hbar}[H, r]|S\rangle = \sum_h^{\rm holes} \sum_e^{\rm electrons} A^S_{(h,e)}\, \langle h|\tfrac{i}{\hbar}[H, r]|e\rangle.$$

In this context, the Bethe–Salpeter approach to the calculation of two-particle excited states is a natural extension of the GW approach for the calculation of one-particle excited states, within the same theoretical framework and set of approximations (the GW-BSE scheme). As we shall see below, GW-BSE calculations have helped elucidate the optical spectra for a wide range of systems, from nanostructures to bulk semiconductors to surfaces and 1D polymers and nanotubes.
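The way Eqs. (11) and (12) work together can be sketched with a small model. In the block below, the static two-particle Hamiltonian is built in a basis of a few electron–hole pairs, diagonalized to obtain the excitation energies $\Omega_S$ and amplitudes $A^S$, and the amplitudes are then used to mix the independent-particle transition matrix elements. All inputs are hypothetical (six transitions, constant coupling matrices, unit dipoles), the $16\pi e^2/\omega^2$ prefactor is omitted, and the delta functions are replaced by Lorentzians of an assumed width.

```python
import numpy as np

def bse_hamiltonian(transition_energies, u, w):
    """Static two-particle Hamiltonian of Eq. (11) in an electron-hole pair basis."""
    return np.diag(transition_energies) + u - w

def solve_bse(hamiltonian, dipoles):
    """Return excitation energies Omega_S and strengths |sum_he A^S_he d_he|^2."""
    omega_s, amplitudes = np.linalg.eigh(hamiltonian)  # column S holds A^S
    strengths = np.abs(amplitudes.T @ dipoles) ** 2
    return omega_s, strengths

def absorption(omega, omega_s, strengths, gamma=0.05):
    """Lorentzian-broadened sum over excited states (prefactor of Eq. 12 omitted)."""
    diff = omega[:, None] - omega_s[None, :]
    lorentz = (gamma / np.pi) / (diff ** 2 + gamma ** 2)
    return lorentz @ strengths

# Hypothetical model data (energies in eV), not taken from the chapter.
n_pairs = 6
energies = np.linspace(2.0, 3.0, n_pairs)   # quasiparticle transition energies
u = 0.02 * np.ones((n_pairs, n_pairs))      # repulsive exchange term
w = 0.10 * np.ones((n_pairs, n_pairs))      # attractive screened direct term
dipoles = np.ones(n_pairs)                  # <h| velocity |e> matrix elements

omega_s, strengths = solve_bse(bse_hamiltonian(energies, u, w), dipoles)
free_omega, free_strengths = solve_bse(np.diag(energies), dipoles)

grid = np.linspace(1.5, 3.5, 201)
spectrum_bse = absorption(grid, omega_s, strengths)
spectrum_free = absorption(grid, free_omega, free_strengths)
```

In this toy model the lowest excitation falls below the lowest independent-particle transition and collects several times the strength of a single free transition, while the total strength is conserved. This is the redistribution through coherent mixing of electron–hole configurations that the text describes.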
5. Applications to Bulk Materials and Surfaces
Since the mid 1980s, the GW approach has been applied with success to the study of quasiparticle excitations in bulk semiconductors and insulators [2, 3, 9, 11, 12]. In Fig. 1, the calculated GW band gaps of a number of insulating materials are plotted against the measured quasiparticle gaps [11]. Perfect agreement between theory and experiment would place the data points on the diagonal line. As seen from the figure, the Kohn–Sham gaps in the local density approximation (LDA) significantly underestimate the experimental values, giving rise to the bandgap problem. Some of the Kohn–Sham gaps are even negative. However, the GW results (which provide an appropriate description of particle-like excitations in an interacting system) are in excellent agreement with experiment for a range of materials, from small-gap semiconductors such as InSb, to moderate-gap materials such as GaN and solid C$_{60}$, to large-gap insulators such as LiF. In addition, the GW quasiparticle band structures for semiconductors and conventional metals in general compare very well with data from photoemission and inverse photoemission measurements. Figure 2 depicts the calculated quasiparticle band structures of germanium [11] and copper [13] as compared to photoemission data for the occupied states and inverse photoemission data for the unoccupied states. For Ge, the agreement is within the error bars of the experiments. In fact, the conduction band energies of Ge were theoretically predicted before the inverse photoemission measurement. The results for Cu agree with photoemission data to within 30 meV for the highest d-band, correcting 90% of the LDA error. The energies of the other d-bands throughout the Brillouin zone are reproduced within 300 meV, and the maximum error (about 600 meV) is found for the bottom valence band at the
[Figure 1 plot: theoretical band gap (eV) versus experimental band gap (eV), both axes from 0 to 15 eV; plot labels: "Quasiparticle theory", "Many-body corrections", "LDA".]
Figure 1. Comparison of the GW bandgap with experiment for a wide range of semiconductors and insulators. The Kohn–Sham eigenvalue gaps calculated within the local density approximation (LDA) are also included for comparison. (after Ref. [11]).
Figure 2. Calculated GW quasiparticle band structure of Ge (left panel) and Cu (right panel) as compared with experiments (open and full symbols). In the case of Cu we also provide the DFT-LDA band structure as dashed lines (after Refs. [11, 13]).
Figure 3. Computed GW quasiparticle bandstructure for the Si(111) 2 × 1 surface compared with experimental results (dots). On the left we show a model of the surface reconstruction (after Ref. [15]).
point, where only 50% of the LDA error is corrected. This level of agreement for the d-bands cannot be obtained without including self-energy contributions*. Similar results have been obtained for other materials and even for some nonconventional insulating systems such as the transition metal oxides and metal hydrides. The GW approach has also been used to investigate the quasiparticle excitation spectrum of surfaces, interfaces and clusters. Figure 3 gives the example of the Si(111) 2 × 1 surface [14, 15]. This surface has a very interesting geometric and electronic structure. At low temperature, to minimize the surface energy, the surface undergoes a 2 × 1 reconstruction with the surface atoms forming buckled π-bonded chains. The ensuing structure has an occupied and an unoccupied quasi-1D surface-state band, which are dispersive only along the π-bonded chains and give rise to a quasiparticle surface-state bandgap of 0.7 eV that is very different from the bulk Si bandgap of 1.2 eV. The calculated quasiparticle surface-state bands are compared to photoemission and inverse photoemission data in Fig. 3. As seen in the figure, both the calculated surface-state band dispersion and bandgap are in good agreement with experiment, and these results are also in accord with results from scanning tunneling spectroscopy (STS), which physically also probes quasiparticle excitations. However, a long-standing puzzle in the literature has been that the measured surface-state gap of this system from
* On the other hand, the total bandwidth is still larger than the measured one. This overestimate of the GW bandwidth for metals with respect to the experimental one seems to be a rather general feature, which is not yet properly understood.
optical experiments differs significantly (by nearly 0.3 eV) from the quasiparticle gap, indicative of perhaps very strong electron–hole interaction on this surface. We shall take up this issue later when we discuss optical response. Owing to interactions with other excitations, quasiparticle excitations in a material are not exact eigenstates of the system and thus possess a finite lifetime. The relaxation lifetimes of excited electrons in solids can be attributed to a variety of inelastic and elastic scattering mechanisms, such as electron–electron (e–e), electron–phonon (e–p), and electron–imperfection interactions. The theoretical framework to investigate the inelastic lifetime of the quasiparticle (due to electron–electron interaction as manifested in the imaginary part of $\Sigma$) has been based for many years on the electron gas model of Fermi liquids, characterized by the electron-density parameter $r_s$. In this simple model, for either electrons or holes with energy $E$ very near the Fermi level, the inelastic lifetime is found to be, in the high-density limit ($r_s \ll 1$), $\tau(E) = 263\, r_s^{-5/2} (E - E_F)^{-2}$ fs, where $E$ and the Fermi energy $E_F$ are expressed in eV [16]. A proper treatment of the electron dynamics (quasiparticle damping rates or lifetimes), however, needs to include bandstructure and dynamical screening effects in order to allow quantitative comparison with experiment. An illustrative example is given in Fig. 4, where the quasiparticle lifetimes of electrons and holes in bulk Cu and Au have been evaluated within the GW scheme, showing an increase in the lifetime close to the Fermi level as compared to the predictions of the free-electron-gas model. For Au, a major contribution from the occupied d states to the screening yields lifetimes of electrons that are larger than those of electrons in a free-electron-gas model by a factor of about 4.5 for electrons with
Figure 4. Calculated GW electron and hole lifetimes for Cu and Au. Solid and open circles represent the ab initio calculation of τ (E) for electrons and holes, respectively, as obtained after averaging over wavevectors and the bandstructure for each k vector. The solid and dotted lines represent the corresponding lifetime of electrons (solid line) and holes (dotted line) in a free electron gas with rs = 2.67 for Cu and rs = 3.01 for Au. In the inset for Au the theoretical results (solid circles) are compared with experimental data (open circles) from Ref. [17]. (after Refs. [18, 19]).
energies 1–3 eV above the Fermi level. This prediction is in agreement with a recent experimental study of ultrafast electron dynamics in Au(111) films [17]. Up until the late 1990s, the situation for ab initio calculation of the optical properties of real materials was, however, not nearly as good as that for the quasiparticle properties. As discussed in Section 4, for the optical response of an interacting electron system, we must also include the electron–hole interaction, or excitonic effects. The important consequence of such effects is shown in Fig. 5, where the computed absorption spectrum of SiO$_2$ neglecting the electron–hole interaction is compared with the experimental spectrum [20]. There is hardly any resemblance between the spectrum from the non-interacting theory and that of experiment, which has led to extensive debates over the past 40 years on the nature of the four very sharp peaks observed in the experiment. We shall return to this technologically important material later. With the advance of the GW-BSE method [21–24], accurate ab initio calculation of the optical spectra of materials is now possible. As discussed above, solving the Bethe–Salpeter equation yields both the excitation energy and the coupling coefficients among the different electron–hole configurations that form the excited state. The resulting excited-state energies and electron–hole amplitudes can then be used to compute the optical (or energy loss and related) spectrum including excitonic effects. The approach has been employed to obtain quite accurately optical transitions to both the bound and continuum states of
[Figure 5 plot: Im ε(ω), from 0 to 10, versus photon energy (eV), from 0 to 20 eV.]
Figure 5. Comparison of the calculated absorption spectrum of SiO2 including excitonic effects (continuous curve) and neglecting electron–hole interaction (dot-dashed curve) with the experimental spectrum (dashed curve) taken from Ref. [25] (after Ref. [20]).
various materials [7, 10, 21, 22], including reduced dimensional systems and nanostructures. For bulk GaAs, the GW-BSE results for the optical absorption are compared with experiments in Fig. 6. We see that even for this simple and well-known semiconductor, only with the inclusion of the electron–hole interaction do we obtain good agreement between theory and experiment. The influence of the electron–hole interaction effects extends over an energy range far above the fundamental band gap. As seen from the figure, the optical strength of GaAs is enhanced by nearly a factor of two in the low-frequency regime. Also, the electron–hole interaction enhances and shifts the second prominent peak structure (the so-called $E_2$ peak) at 5 eV much closer to experiment. This very large shift of about 0.5 eV in the $E_2$ peak is not due to a negative shift of the transition energies, as one might naively expect from an attractive electron–hole interaction. The changes in the optical spectrum originate mainly from the coupling of different electron–hole configurations in the excited states, which leads to a constructive coherent superposition of the interband transition oscillator strengths for transitions at lower energies and to a destructive superposition at energies above 5 eV [21, 22]. In addition to the continuum part of the spectrum, one can also obtain the bound exciton states near the absorption edge from the Bethe–Salpeter equation from first principles, without making use of any effective mass approximation. For the case of GaAs, we see in Table 1 that the theory reproduces essentially all the observed bound exciton structures to a very high level of accuracy.
Figure 6. Theoretical (continuous line) and measured (dots) optical absorption spectra for bulk GaAs. The experimental data are taken from Refs. [26, 27]. The calculated spectrum without inclusion of electron–hole interaction (dashed curve) is also given for completeness (after Refs. [21, 22]).
Table 1. Calculated exciton binding energies near the absorption edge for GaAs. The GW-BSE calculations are from Refs. [21, 22] and the experimental data are from Ref. [26].
Binding energy
Theory (meV)
Experiment (meV)
E 1s E 2s E2 p
4.0 0.9 0.2–0.7
4.2 1.0 0–1
The scheme can thus directly be applied to situations in which simple empirical techniques do not hold. Similarly accurate results have been obtained for the other semiconductors. For larger gap materials, exciton effects are even more dramatic in the optical response as seen for the case of SiO2 [20] in Fig. 5. The quasiparticle gap of α-quartz is 10 eV. From the ab initio calculation, we learn that all the prominent peaks seen in the experiment and also in theory when electron–hole interaction is included are due to transitions to excitonic states. The much-debated peaks in the experimental spectrum are in fact due to the strong correlations between the excited electron and hole in resonant excitonic states since these excited states have energies that are higher than the value for the quasiparticle band gap.
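The free-electron-gas estimate for the inelastic lifetime quoted earlier in this section, $\tau(E) = 263\, r_s^{-5/2}(E - E_F)^{-2}$ fs, is simple enough to evaluate directly. The sketch below does so for the $r_s$ values given in the caption of Fig. 4 (2.67 for Cu and 3.01 for Au). Note the assumption involved: the formula is derived in the high-density limit $r_s \ll 1$, so using it at metallic densities is only a rough extrapolation, and the function name and units handling are illustrative choices.

```python
def feg_lifetime_fs(energy_ev, fermi_ev, r_s):
    """Free-electron-gas inelastic lifetime in fs: 263 r_s^(-5/2) (E - E_F)^(-2).

    Energies are in eV. Derived for r_s << 1; other densities are an extrapolation.
    """
    excess = energy_ev - fermi_ev
    if excess == 0.0:
        raise ValueError("the lifetime diverges at the Fermi level")
    return 263.0 * r_s ** -2.5 / excess ** 2

# r_s values as quoted in the Fig. 4 caption; energies measured from E_F = 0.
R_S = {"Cu": 2.67, "Au": 3.01}

tau_1ev = {metal: feg_lifetime_fs(1.0, 0.0, r_s) for metal, r_s in R_S.items()}
tau_2ev = {metal: feg_lifetime_fs(2.0, 0.0, r_s) for metal, r_s in R_S.items()}
```

At 1 eV above the Fermi level this gives roughly 23 fs for Cu and 17 fs for Au, and the $(E - E_F)^{-2}$ scaling reduces both by a factor of four at 2 eV. These are the model baselines against which the GW lifetimes in Fig. 4 are compared.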
6. Applications to Reduced Dimensional Systems and Nanostructures
The GW-BSE approach in particular has been valuable in explaining and predicting the quasiparticle excitations and optical response of reduced dimensional systems and nanostructures. This is because Coulomb interaction effects in general are more dominant in lower dimensional systems owing to geometrical and symmetry restrictions. As illustrated below, self-energy and electron–hole interaction effects can be orders of magnitude larger in nanostructures than in bulk systems made up of the same elements. A good example of a reduced dimensional system is provided by the conjugated polymers. The optical properties of these technologically important systems are still far from well understood when compared to conventional semiconductors [28]. For example, there has been much argument in the literature regarding the binding energy of excitons in polymers such as poly-phenylene-vinylene (PPV); values ranging from 0.1 to 1.0 eV have been suggested. Ab initio calculations using the GW-BSE approach show that excitonic effects in PPV are indeed dominant and qualitatively change the optical spectrum of the material. This is shown in Fig. 7, where we see that each of the 1D van Hove singularities in the
Figure 7. Optical absorption spectra of the polymer PPV. Theoretical results with (continuous line) and without (dashed line) including excitonic effects (after Ref. [28]).
interband absorption spectrum is replaced by a series of sharp peaks due to excitonic states. The lowest optically active exciton is a bound exciton state, but the others are strong resonant exciton states giving rise to peak structures that agree very well with experiment. In particular, when compared to the quasiparticle gap of 3.3 eV, the theoretical results in Fig. 7 yield a very large binding energy of nearly 1 eV for the lowest-energy bound exciton in PPV. The reduced dimensionality at a surface can also greatly enhance excitonic effects. For example, in the case of the Si(111) 2 × 1 surface [28], it is found that the surface optical spectrum at low frequencies is dominated by a surface-state exciton which has a binding energy that is an order of magnitude larger than that of bulk Si, and one cannot interpret the experimental spectrum without considering the excitonic effects. This is illustrated in Fig. 8, where the measured differential reflectivity is compared with theory. Here we find that the peak in the differential reflectivity spectrum is dictated by a surface-state exciton with a binding energy of 0.23 eV. This very large binding energy for the surface-state exciton is to be compared to the excitonic binding energy in bulk Si, which is only 15 meV. The large enhancement in the electron–hole interaction at this particular surface arises from the quasi-1D nature of the surface states, which are localized along the π-bonded atomic chains on the surface. Similar excitonic calculations for the Ge(111) 2 × 1 reconstructed surface demonstrate how optical differential reflectivity spectra can be used to distinguish between the two possible isomers of the reconstructed surface (see right panel in Fig. 8). This distinction has been enabled by the fact that a quantitative comparison between the calculated
Figure 8. Comparison between experiments and the computed differential reflectivity spectra with and without electron–hole interaction for the Si(111)2 × 1 surface (left panel) [28] and for Ge(111)2 × 1 (right panel) [29].
and experimental spectra is possible when electron–hole effects are treated correctly [29]. Another 1D system of great current interest is provided by the carbon nanotubes [30]. These are tubular structures of graphene with a diameter in the range of one nanometer and a length that can be many hundreds of microns or longer. Carbon nanotubes can be metals or semiconductors depending sensitively on their geometric structure, which is indexed by a pair of integers (m, n), where m and n specify the circumferential vector in units of the two primitive translation vectors of graphene. Recent experimental advances have allowed the measurement of the optical response of well-characterized individual single-walled carbon nanotubes (SWCNTs). For example, absorption measurements on well-aligned samples of SWCNTs of uniform diameter of 4 Å grown inside the channels of zeolites have been performed [31]. Moreover, through the use of photoluminescence excitation techniques, the Rice group has succeeded in measuring both the first and second optical transition energies of well-identified, individually isolated, semiconducting SWCNTs [32, 33]. The optical properties of these tubes are found to be quite unusual and cannot be explained by conventional theories. Because of the reduced dimensionality of the nanotubes, many-electron (both quasiparticle and excitonic) effects have been shown to be extraordinarily important in these systems [34, 35]. Figure 9 illustrates the effects of many-electron interactions on the quasiparticle excitation energies of the carbon nanotubes. Plotted in the figure are the quasiparticle corrections to the LDA Kohn–Sham energies for the metallic (3,3) carbon nanotube and the semiconducting (8,0) carbon nanotube.
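The relation between the index pair and the tube geometry can be made explicit with a short calculation. For an ideal rolled-up graphene sheet, the circumference is the length of the circumferential vector, so the diameter is $d = a\sqrt{n^2 + nm + m^2}/\pi$. The sketch below evaluates this and lists the tubes whose ideal diameter is close to 4 Å. The graphene lattice constant $a = 2.46$ Å, the search window of 3.8 to 4.3 Å, and the restriction to $n \ge m \ge 0$ are assumptions made for this illustration; relaxed tubes of such small radius will deviate somewhat from the ideal geometry.

```python
import math

A_GRAPHENE = 2.46  # graphene lattice constant in angstrom (assumed value)

def tube_diameter(n, m, a=A_GRAPHENE):
    """Ideal diameter (angstrom) of an (n, m) tube: |n a1 + m a2| / pi."""
    return a * math.sqrt(n * n + n * m + m * m) / math.pi

def tubes_in_window(d_min, d_max, n_max=12):
    """All index pairs with n >= m >= 0 whose ideal diameter lies in [d_min, d_max]."""
    return [
        (n, m)
        for n in range(1, n_max + 1)
        for m in range(0, n + 1)
        if d_min <= tube_diameter(n, m) <= d_max
    ]

# Hypothetical search window around 4 angstrom.
small_tubes = tubes_in_window(3.8, 4.3)
diameters = {pair: tube_diameter(*pair) for pair in small_tubes}
```

With these assumptions the search returns exactly the (3,3), (4,2) and (5,0) tubes, with ideal diameters between about 3.9 and 4.15 Å, and the (8,0) tube comes out at roughly 6.3 Å.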
Figure 9. Plot of the quasiparticle corrections to the DFT Kohn–Sham eigenvalues due to self-energy effects as a function of the energy of the states for the metallic (3,3) carbon nanotube (left panel) and the semiconducting (8,0) carbon nanotube (right panel) (after Refs. [34, 35]).
Figure 10. Calculated quasiparticle density of states (left panel) and optical absorption spectrum (right panel) for the (3,3) carbon nanotube (after Refs. [34, 35]).
The general trends are that, for the metallic tubes, the corrections are relatively straightforward. Basically they stretch the bands by ∼15%, as in the case of graphite [11]. However, the self-energy corrections to the quasiparticle energies of the semiconducting tubes are quite large. The corrections cause a large opening of the minimum band gap, as well as a stretching of the bands. As seen in Fig. 9, the self-energy corrections cause the minimum quasiparticle gap of the (8,0) carbon nanotube to open up by nearly 1 eV. Many-electron interaction effects play an even more important role in the optical response of the carbon nanotubes. The calculated optical spectrum of the metallic (3,3) nanotube (which is one of the 4 Å diameter SWCNTs) is presented in Fig. 10. The left panel shows the electronic density of states. Because of the symmetry of the states, only certain transitions between states (indicated by the arrow A) are optically allowed. The right panel compares the calculated
imaginary part of the dielectric response function with and without electron–hole interactions. The optical spectrum of the (3,3) nanotube is changed qualitatively due to the existence of a bound exciton, even though the system is metallic. This rather surprising result comes from the fact that, although the tube is metallic, there is a symmetry gap in the electron–hole spectrum (i.e., there are no free electron–hole states of the same symmetry as the exciton possible in the energy range of the excitonic state). The symmetry gap is possible here because the (3,3) tube is a 1D metal, i.e., all k-states can have well-defined symmetry. Figure 11 depicts the results for the (5,0) tube, which is another metallic SWCNT of 4 Å in diameter. The surprise here is that, for the range of frequencies considered, the electron–hole interaction in this tube is a net repulsion between the excited electron and hole. Unlike the case of bulk semiconductors, owing to the symmetry of the states involved and metallic screening, the repulsive exchange term dominates over the attractive direct term in the electron–hole interaction. As a consequence, there are no bound exciton states in Fig. 11 and there is a suppression of the optical strength at the van Hove singularities, especially for the second peak in the spectrum. One expects that the above excitonic effects should be even more pronounced in the semiconducting nanotubes. Indeed, this is the case. Figure 12 compares the calculated absorption spectrum of an (8,0) tube with and without electron–hole interactions. The two resulting spectra are qualitatively and dramatically different. When electron–hole interaction effects are included, the spectrum is dominated by bound and resonant excitonic states. With interactions, each van Hove singularity structure in the non-interacting spectrum gives rise to a series of exciton states.
For the (8,0) tube, the lowest-energy bound exciton has a binding energy of more than 1 eV. Note that the exciton binding energy for bulk semiconductors of similar size bandgap is in general only of the
Figure 11. Calculated quasiparticle density of states (left panel) and optical absorption spectrum (right panel) for the (5,0) carbon nanotube (after Refs. [34, 35]).
Figure 12. Optical absorption spectra for the (8,0) carbon nanotube (top panel) and the spatial extent of the excitonic wavefunction along the tube axis for a bound and resonant excitonic state (after Refs. [34, 35]).
order of tens of meV. This illustrates again the dominance of many-electron Coulomb effects in the carbon nanotubes owing to their reduced dimensionality. The bottom two panels in Fig. 12 give the spatial correlation between the excited electron and hole in two of the exciton states, one bound and one resonant. The extent of the exciton wavefunction is about 25–30 Å for both of these states. In Table 2 we compare the calculated results for the 4 Å diameter tubes with experimental data. For the samples of 4 Å diameter single-walled carbon nanotubes grown in the channels of the zeolite AlPO$_4$-5 crystal, the Hong Kong group observed three prominent peaks in the optical absorption spectrum [31]. There are only three possible types of carbon nanotubes with a diameter of 4 Å: (5,0), (4,2) and (3,3). All three types of tubes are expected to be present in these samples. The theoretical results quantitatively explain from first principles the three observed peaks and identify their physical origin. The first peak is due to
Table 2. Comparison between experimental [31] and calculated main absorption peaks for all possible 4 Å carbon nanotubes: (5,0), (4,2) and (3,3).

CNT      Theory (eV)   Experiment (eV)   Character
(5,0)    1.33          1.37              Interband
(4,2)    2.0           2.1               Exciton
(3,3)    3.17          3.1               Exciton
Table 3. Calculated lowest two optical transition energies for the (8,0) and (11,0) carbon nanotubes compared to experimental values [32, 33]. It is noted that the ratio between the two transition energies deviates strongly from the value of 2 predicted by a simple independent-particle model (after Refs. [34, 35]).

                   (8,0)                      (11,0)
                   Experiment   Theory        Experiment   Theory
$E_{11}$           1.6 eV       1.6 eV        1.2 eV       1.1 eV
$E_{22}$           1.9 eV       1.8 eV        1.7 eV       1.6 eV
$E_{22}/E_{11}$    1.19         1.13          1.42         1.45
an interband transition van Hove singularity from the (5,0) tubes, whereas the second and third peaks are due to the formation of excitons in the (4,2) and (3,3) tubes, respectively [34–36]. The theoretical results [34, 35] on the larger semiconducting tubes have also been used to elucidate the findings from photoluminescence excitation measurements, which yielded detailed information on optical transitions in individual single-walled nanotubes. Table 3 gives a comparison between experiment and theory for the particular cases of the (8,0) and (11,0) tubes. The measured transition energies are in excellent agreement with theory. In particular, we found that the large reduction in the ratio of the second transition energy to the first transition energy, $E_{22}/E_{11}$, from the value of 2 (predicted by simple interband transition theory) is due to a combination of factors: bandstructure effects, quasiparticle self-energy effects, and excitonic effects. One must include all these factors to have an understanding of the optical response of the semiconducting carbon nanotubes. Another example of low-dimensional systems is clusters. In Fig. 13, we show some results on the optical spectra of the Na$_4$ cluster calculated using the GW-BSE approach as well as those from TDLDA and experiment. The measured spectrum consists of three peaks in the 1.5–3.5 eV range and a broader feature around 4.5 eV. The agreement between results from TDDFT-based calculations and GW-BSE calculations is very good. The comparison with the experimental peak positions is also quite good, although the calculated peaks appear shifted to higher energies by approximately 0.2 eV. Good agreement has been obtained for other small semiconductor and metal clusters.
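The $E_{22}/E_{11}$ ratios discussed above can be rechecked from the transition energies listed in Table 3. The short sketch below recomputes them and compares each with the tabulated ratio. The dictionary layout and names are illustrative, and because the tabulated energies are rounded to 0.1 eV, agreement is only expected to within about 0.01.

```python
# Entries copied from Table 3: (E11 in eV, E22 in eV, tabulated E22/E11).
TABLE_3 = {
    ("(8,0)", "experiment"): (1.6, 1.9, 1.19),
    ("(8,0)", "theory"): (1.6, 1.8, 1.13),
    ("(11,0)", "experiment"): (1.2, 1.7, 1.42),
    ("(11,0)", "theory"): (1.1, 1.6, 1.45),
}

INDEPENDENT_PARTICLE_RATIO = 2.0  # value quoted for the simple interband model

def transition_ratio(e11, e22):
    """Ratio of the second to the first optical transition energy."""
    return e22 / e11

ratios = {key: transition_ratio(e11, e22) for key, (e11, e22, _) in TABLE_3.items()}
deviations = {key: INDEPENDENT_PARTICLE_RATIO - value for key, value in ratios.items()}
```

All four recomputed ratios agree with the tabulated ones at the stated precision and lie well below 2, with the (8,0) tube showing the larger deviation.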
Figure 13. Calculation of the optical absorption (proportional to the strength function) of a Na4 cluster using the GW-BSE scheme (dashed line) (from Ref. [37]) and with TDDFT using different kernels [38]: TDLDA (solid line), exact-exchange (dotted line). Filled dots represent the experimental results from Ref. [39] (after Ref. [7]).
The above are just several selected examples, given to illustrate the current status in ab initio calculations of quasiparticle and optical properties of materials. Similar results have been obtained for the spectroscopic properties of many other moderately correlated electron systems, in particular for semiconducting systems, to a typical level of accuracy of about 0.1 eV.
7. Conclusions and Perspectives
We have discussed in this article the theory and applications of an ab initio approach to calculating electron excitation energies, optical spectra, and exciton states in real materials. The approach is based on evaluating the one-particle and the two-particle Green’s function, respectively, for the quasiparticle and optical excitations of the interacting electron system, including relevant electron self-energy and electron–hole interaction effects at the GW approximation level. It provides a unified approach to the investigation of both extended and confined systems from first principles. Various applications have shown that the method is capable of describing successfully the spectroscopic properties of a range of systems including semiconductors, insulators, surfaces, conjugated polymers, small clusters, nanotubes and other nanostructures. The agreement between theoretical spectra and data from experiments such as photoemission, tunneling,
optical and related measurements is in general remarkably good for moderately correlated electron systems. A popular alternative scheme to address optical response is TDDFT. In particular, the optical response of simple metal clusters and biomolecules is well reproduced by the standard TDLDA approximation [7, 40]. However, if we increase the size of the system towards a periodic structure in one, two, or three dimensions (i.e., polymers, slabs, surfaces, or solids), we must be careful with the form of the exchange-correlation functional employed. In contrast to the GW-BSE scheme, difficulties arise when applying TDDFT, for example, to long conjugated molecular chains, where the strong non-locality of the exact functional is not well reproduced in the usual approximations. Similarly, for bulk semiconductors and insulators, the standard functionals fail to describe the optical absorption spectra. The reason has been traced to the fact that the exchange and correlation kernel $f_{xc}$ (which describes the electron–hole interaction within TDDFT) should behave asymptotically, in momentum space, as $1/q^2$ as q goes to 0 [7]. This condition, however, is not satisfied by the LDA or GGA. Input from the GW-BSE method has in fact been employed to improve the approximate exchange-correlation functionals for use in the TDDFT scheme ([41] and references therein). Such a new many-body based $f_{xc}$ has given results on the optical loss spectra of bulk materials such as LiF and SiO2 that are in quite good agreement with the Bethe–Salpeter equation results and with experiments (see Fig. 14). Both spatial nonlocality and frequency dependence of the $f_{xc}$ kernel turn out to be important in order to properly describe excitonic effects. However, quasiparticle effects still need to be embodied properly within this new approximated TDDFT scheme.
An interesting practical question is: which of the two approaches, the GW-BSE method or the TDDFT, would be more efficient in computing the optical properties of the different systems of interest in the future?∗ The overall success of the first-principles many-body Green’s function approach is impressive and has been highly valued. Nevertheless, the $G_0W_0$ scheme can be refined in some applications. Studies have shown that: (i) inclusion of vertex corrections improves the description of the absolute position of quasiparticle energies, although the amount of such corrections depends sensitively on the model used for the vertex [7, 9]; (ii) vertex effects slightly change the occupied bandwidth of the homogeneous electron gas, but this correction is not enough to fit the experimental results for metals such as Na; (iii) for the bandwidth of simple metals, self-consistency performed for the homogeneous electron gas [42]
* The GW Bethe–Salpeter equation approach offers a clear physical and straightforward picture for the analysis of results and further improvements. It works over a wide range of systems for both quasiparticle and optical excitations. The TDDFT approach, on the other hand, is appealing since it computes optical response more efficiently, but it is appropriate only for neutral excitations and its range of validity is uncertain because of uncontrolled approximations to the functionals.
Figure 14. Calculated optical absorption spectra within the GW-BSE approach (continuous line) and those from a new TDDFT f xc kernel derived from the BSE method (dashed line) are compared to experiment (open dots). The independent-quasiparticle response (dashed-dotted line) is also shown (after Ref. [41]).
showed that partially self-consistent $GW_0$ calculations – in which W is calculated only once using the random-phase approximation (RPA) so that Eq. (7) is not included in the iterative process – only slightly increase the $G_0W_0$ occupied bandwidth. Results are even worse at full self-consistency without vertex corrections. The effects of self-consistency must thus be balanced by the proper inclusion of vertex corrections. This, however, is not the case for the calculation of total energies, where the fully self-consistent GW solution appears to provide better results than the partial $G_0W_0$ procedure. But, if one is interested in spectroscopic properties, a self-consistent GW procedure seems to perform worse than the simpler $G_0W_0$ scheme. Experiences from numerous past applications to bulk solids and reduced dimensional systems have demonstrated that in general the GW scheme is an
excellent approximation for the evaluation of the quasiparticle and optical properties of moderately correlated systems. Methods beyond the GW approximation are expected to be required for the study of the spectral features of highly correlated systems. The GW-BSE approach described in this article, however, is arguably the most reliable, practical, and versatile tool we have at present to tackle the optical and electronic response of real material systems from first principles. Further developments in the field should address the proper treatment of self-consistency and vertex corrections. This would further extend the range of applicability of this, already successful, many-body Green’s function approach.
References
[1] L. Hedin, “New method for calculating the one-particle Green’s function with application to the electron-gas problem,” Phys. Rev., 139, A796, 1965.
[2] M.S. Hybertsen and S.G. Louie, “First-principles theory of quasiparticles: calculation of band gaps in semiconductors and insulators,” Phys. Rev. Lett., 55, 1418, 1985.
[3] M.S. Hybertsen and S.G. Louie, “Electron correlation in semiconductors and insulators: band gaps and quasiparticle energies,” Phys. Rev. B, 34, 5390, 1986.
[4] W. Kohn, “Nobel lecture: electronic structure of matter-wave functions and density functionals,” Rev. Mod. Phys., 71, 1253, 1999.
[5] E. Runge and E.K.U. Gross, “Density-functional theory for time-dependent systems,” Phys. Rev. Lett., 52, 997, 1984.
[6] E.K.U. Gross, J. Dobson, and M. Petersilka, “Density functional theory of time-dependent phenomena,” In: R.F. Nalewajski (ed.), Density Functional Theory II, Topics in Current Chemistry, vol. 181, Springer, Berlin, p. 81, 1996.
[7] G. Onida, L. Reining, and A. Rubio, “Electronic excitations: density functional versus many-body Green’s-function approaches,” Rev. Mod. Phys., 74, 601, 2002.
[8] L. Hedin and S. Lundqvist, “Effects of electron–electron and electron–phonon interactions on the one electron states of solids,” In: H. Ehrenreich, F. Seitz, and D. Turnbull (eds.), Solid State Physics, Academic Press, New York, vol. 23, p. 1, 1969.
[9] F. Aryasetiawan and O. Gunnarsson, “GW method,” Rep. Prog. Phys., 61, 3, 1998.
[10] M. Rohlfing and S.G. Louie, “Electron–hole excitations and optical spectra from first principles,” Phys. Rev. B, 62, 4927, 2000.
[11] S.G. Louie, “First-principles theory of electron excitation energies in solids, surfaces, and defects,” In: C.Y. Fong (ed.), Topics in Computational Materials Science, World Scientific, Singapore, p. 96, 1997.
[12] W.G. Aulbur, L. Jönsson, and J. Wilkins, “Quasiparticle calculations in solids,” In: Solid State Physics, vol. 54, p. 1, 2000.
[13] A. Marini, G. Onida, and R. Del Sole, “Quasiparticle electronic structure of copper in the GW approximation,” Phys. Rev. Lett., 88, 016403, 2002.
[14] J.E. Northrup, M.S. Hybertsen, and S.G. Louie, “Many-body calculation of the surface state energies for Si(111)2×1,” Phys. Rev. Lett., 66, 500, 1991.
[15] M. Rohlfing and S.G. Louie, “Optical excitations in conjugated polymers,” Phys. Rev. Lett., 82, 1959, 1999.
[16] P.M. Echenique, J.M. Pitarke, E. Chulkov, and A. Rubio, “Theory of inelastic lifetimes of low-energy electrons in metals,” Chem. Phys., 251, 1, 2000.
[17] J. Cao, Y. Gao, H.E. Elsayed-Ali, R.D.E. Miller, and D.A. Mantell, “Femtosecond photoemission study of ultrafast dynamics in single-crystal Au(111) films,” Phys. Rev. B, 50, 10948, 1998.
[18] I. Campillo, J.M. Pitarke, A. Rubio, E. Zarate, and P.M. Echenique, “Inelastic lifetimes of hot electrons in real metals,” Phys. Rev. Lett., 83, 2230, 1999.
[19] I. Campillo, A. Rubio, J.M. Pitarke, A. Goldman, and P.M. Echenique, “Hole dynamics in noble metals,” Phys. Rev. Lett., 85, 3241, 2000.
[20] E.K. Chang, M. Rohlfing, and S.G. Louie, “Excitons and optical properties of alpha-quartz,” Phys. Rev. Lett., 85, 2613, 2000.
[21] M. Rohlfing and S.G. Louie, “Excitonic effects and the optical absorption spectrum of hydrogenated Si clusters,” Phys. Rev. Lett., 80, 3320, 1998.
[22] M. Rohlfing and S.G. Louie, “Electron–hole excitations in semiconductors and insulators,” Phys. Rev. Lett., 81, 2312, 1998.
[23] L.X. Benedict, E.L. Shirley, and R.B. Bohn, “Optical absorption of insulators and the electron–hole interaction: an ab initio calculation,” Phys. Rev. Lett., 80, 4514, 1998.
[24] S. Albrecht, L. Reining, R. Del Sole, and G. Onida, “Ab initio calculation of excitonic effects in the optical spectra of semiconductors,” Phys. Rev. Lett., 80, 4510, 1998.
[25] H.R. Philipp, “Optical transitions in crystalline and fused quartz,” Solid State Commun., 4, 73, 1966.
[26] D.E. Aspnes and A.A. Studna, “Dielectric functions and optical parameters of Si, Ge, GaP, GaAs, GaSb, InP, InAs and InSb from 1.5 to 6.0 eV,” Phys. Rev. B, 27, 985, 1983.
[27] P. Lautenschlager, M. Garriga, S. Logothetidis, and M. Cardona, “Interband critical points of GaAs and their temperature dependence,” Phys. Rev. B, 35, 9174, 1987.
[28] M. Rohlfing and S.G. Louie, “Excitations and optical spectrum of the Si(111)-(2×1) surface,” Phys. Rev. Lett., 83, 856, 1999.
[29] M. Rohlfing, M. Palummo, G. Onida, and R. Del Sole, “Structural and optical properties of the Ge(111)-(2×1) surface,” Phys. Rev. Lett., 85, 5440, 2000.
[30] S. Iijima, “Helical microtubules of graphitic carbon,” Nature, 354, 56, 1991.
[31] Z.M. Li, Z.K. Tang, H.J. Liu, N. Wang, C.T. Chan, R. Saito, S. Okada, G.D. Li, J.S. Chen, N. Nagasawa, and S. Tsuda, “Polarized absorption spectra of single-walled 4 Å carbon nanotubes aligned in channels of an AlPO4-5 single crystal,” Phys. Rev. Lett., 87, 127401, 2001.
[32] M.J. O’Connell, S.M. Bachilo, C.B. Huffman, V.C. Moore, M.S. Strano, E.H. Haroz, K.L. Rialon, P.J. Boul, W.H. Noon, C. Kittrell, J. Ma, R.H. Hauge, R.B. Weisman, and R.E. Smalley, “Band gap fluorescence from individual single-walled carbon nanotubes,” Science, 297, 593, 2002.
[33] S.M. Bachilo, M.S. Strano, C. Kittrell, R.H. Hauge, R.E. Smalley, and R.B. Weisman, “Structure-assigned optical spectra of single-walled carbon nanotubes,” Science, 298, 2361, 2002.
[34] C.D. Spataru, S. Ismail-Beigi, L.X. Benedict, and S.G. Louie, “Excitonic effects and optical spectra of single-walled carbon nanotubes,” Phys. Rev. Lett., 92, 077402, 2004.
[35] C.D. Spataru, S. Ismail-Beigi, L.X. Benedict, and S.G. Louie, “Quasiparticle energies, excitonic effects and optical absorption spectra of small-diameter single-walled carbon nanotubes,” Appl. Phys. A, 78, 1129, 2004.
[36] E. Chang, G. Bussi, A. Ruini, and E. Molinari, “Excitons in carbon nanotubes: an ab initio symmetry-based approach,” Phys. Rev. Lett., 92, 196401, 2004.
[37] G. Onida, L. Reining, R.W. Godby, and W. Andreoni, “Ab initio calculations of the quasiparticle and absorption spectra of clusters: the sodium tetramer,” Phys. Rev. Lett., 75, 818, 1995.
[38] M.A.L. Marques, A. Castro, and A. Rubio, “Assessment of exchange-correlation functionals for the calculation of dynamical properties of small clusters in TDDFT,” J. Chem. Phys., 115, 3006, 2001. http://www.tddft.org/programs/octopus.
[39] C.R.C. Wang, S. Pollack, D. Cameron, and M.M. Kappes, “Optical absorption spectroscopy of sodium clusters as measured by collinear molecular-beam photodepletion,” J. Chem. Phys., 93, 3787, 1990.
[40] M.A.L. Marques, X. López, D. Varsano, A. Castro, and A. Rubio, “Time-dependent density-functional approach for biological photoreceptors: the case of the Green fluorescent protein,” Phys. Rev. Lett., 90, 158101, 2003.
[41] A. Marini, R. Del Sole, and A. Rubio, “Bound excitons in time-dependent density-functional theory: optical and energy-loss spectra,” Phys. Rev. Lett., 91, 256402, 2003.
[42] B. Holm and U. von Barth, “Fully self-consistent GW self-energy of the electron gas,” Phys. Rev. B, 57, 2108, 1998.
1.12 HYBRID QUANTUM MECHANICS/MOLECULAR MECHANICS METHODS AND THEIR APPLICATION
Marek Sierka1,∗ and Joachim Sauer2
1 Institut für Physikalische Chemie, Lehrstuhl für Theoretische Chemie, Universität Karlsruhe, Kaiserstraße 12, D-76128 Karlsruhe, Germany
2 Institut für Chemie, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany
Hybrid quantum mechanics (QM)/molecular mechanics (MM) methods allow simulations of much larger systems than are accessible by QM methods alone. The size of many systems of topical interest in chemistry and biochemistry prevents their efficient and accurate treatment by quantum mechanical ab initio methods. For reactions in the condensed phase and at surfaces, periodic boundary conditions (PBC) can be applied, reducing the size of the problem to a unit cell [1–3]. However, many interesting structural features such as defects or active sites require larger unit cells because of the broken space and translational symmetry. A computationally appealing alternative is the use of interatomic potential functions, ranging from molecular mechanics force fields to ion-pair potentials. They yield accurate equilibrium structures for the type of systems for which they are parameterized [4], but are usually not suitable for describing the active sites of catalysts with sufficient accuracy. Moreover, unless special modifications are made, they cannot be used to model reactions in which chemical bonding changes. The cluster model approach is an alternative that makes calculations on active sites and defects feasible for ab initio methods [5]. Only a fragment of the structure that contains the part of interest is considered, and the surroundings are neglected or included approximately. There are, however, classes of problems that require a computational treatment of the whole system. A prominent example is shape selectivity in zeolite catalysis.
∗ Present address: Institut für Chemie, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany
S. Yip (ed.), Handbook of Materials Modeling, 241–258. © 2005 Springer. Printed in the Netherlands.
Although zeolite catalysts with different framework structures have the same active sites in common, they may show very different catalytic performances. Hybrid quantum mechanics-molecular mechanics (QM/MM) methods combine the advantages of an accurate QM description of the important part of the system, e.g., the active site, with the computational efficiency of interatomic potential functions applied to its surroundings. In this way the most important environmental effects can be included, for example mechanical constraints, electrostatic interactions, and polarization. The idea of hybrid QM/MM methods goes back to the late 1970s [6]. Today, there are a large number of different implementations of the QM/MM hybrid approach and an increasing number of applications. One example is our own combined quantum mechanics–interatomic potential functions approach (QM–Pot) that has been applied to various problems of homogeneous and heterogeneous catalysis [7]. This more general name is chosen since the term “molecular mechanics” (MM) stresses the force field type of potential functions most often used for organic molecules and biomolecules, while inorganic solids are better described by ion-pair potential functions. In this contribution the QM–Pot method and its applications are reviewed, with special focus on problems in zeolite catalysis. We demonstrate that hybrid methods are not only an alternative to full QM calculations, in particular when active sites are considered, but in some cases can even compensate for the deficiencies of approximate QM methods. More complete reviews can be found, for example, in Gao and Thompson [8], Sherwood [9], and in several articles of the “Encyclopedia of Computational Chemistry” [10].
1. Definition of the QM/MM Potential Energy Surface
This section describes the theoretical background of hybrid QM/MM methods in general and the QM–Pot method in particular. The entire system (S) is partitioned into an inner or active part (I) and the outer part (O), as shown in Fig. 1. The interactions within the inner part are treated at the higher, usually QM level. All interactions within the outer part are described by a computationally less expensive, lower level method, for example, a parameterized potential function, MM force field or a more approximate QM method. Thus, the energy of the whole system is expressed as
$$E(S)_{\mathrm{high/low}} = E(O)_{\mathrm{low}} + E(I)_{\mathrm{high}} + E(I\text{–}O). \tag{1}$$
The E(I–O) term describes the mutual interaction between the I and O parts. It can be described by a potential function or an MM method alone (mechanical embedding) or include some QM terms, for example, the electrostatic interaction between I and O or mutual polarization terms (electronic embedding) [11].
Figure 1. (a) Chemical system partitioned into the active (inner) part I and an outer part O; (b) Link atoms L are used to saturate the inner part when its definition requires the breaking of covalent bonds, A–B.
If the E(I–O) term is given entirely by the low level method, Eq. (1) can be alternatively expressed as
$$E(S)_{\mathrm{high/low}} = E(S)_{\mathrm{low}} - E(I)_{\mathrm{low}} + E(I)_{\mathrm{high}}. \tag{2}$$
This so-called subtraction scheme [12–14] has the advantage that standard methods, QM (high) and MM (low), are applied to well-defined systems, I and O. This idea is followed in our QM–Pot approach [12, 15]. Note that in the case of Eq. (2) there is no direct influence of the O part on the wavefunction of the QM part. However, the electronic structure obtained from the QM/MM calculations is different from that of a QM calculation for the I part only, since the equilibrium structures obtained in the two optimizations are different. The advantage of Eq. (2) is that the electrostatic interactions between the I and O parts are treated at the same lower level, i.e., by the potential function used. Balanced and proven point charge or higher multipole models can be used, and even mutual polarization effects can be treated provided that the potential functions have this functional form (e.g., [11]).
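The equivalence of Eqs. (1) and (2) when E(I–O) is taken entirely from the low-level method can be checked numerically. The following is a minimal sketch, not the QM–Pot implementation: the energy functions `toy_low` and `toy_high` and the per-site parameters are purely hypothetical stand-ins for a potential function and a QM method.

```python
from typing import Callable, Sequence

# Hypothetical energy-function type: maps per-site parameters to an energy.
EnergyFn = Callable[[Sequence[float]], float]


def toy_low(sites: Sequence[float]) -> float:
    """Toy 'low-level' energy: one-body terms plus pairwise couplings."""
    one_body = -sum(sites)
    pair = sum(sites[i] * sites[j]
               for i in range(len(sites))
               for j in range(i + 1, len(sites)))
    return one_body + 0.1 * pair


def toy_high(sites: Sequence[float]) -> float:
    """Toy 'high-level' energy: a rescaled low-level model (illustrative only)."""
    return 1.05 * toy_low(sites)


def energy_eq2(inner, outer, low: EnergyFn, high: EnergyFn) -> float:
    """Subtraction scheme, Eq. (2): E(S)_low - E(I)_low + E(I)_high."""
    system = list(inner) + list(outer)
    return low(system) - low(inner) + high(inner)


def energy_eq1(inner, outer, low: EnergyFn, high: EnergyFn) -> float:
    """Eq. (1) with E(I-O) evaluated entirely at the low level."""
    system = list(inner) + list(outer)
    e_io = low(system) - low(inner) - low(outer)  # low-level I-O interaction
    return low(outer) + high(inner) + e_io


inner_sites = [1.0, 2.0]        # hypothetical inner part I
outer_sites = [0.5, 1.5, 1.0]   # hypothetical outer part O
e1 = energy_eq1(inner_sites, outer_sites, toy_low, toy_high)
e2 = energy_eq2(inner_sites, outer_sites, toy_low, toy_high)
print(e1, e2)
```

In this form the low-level method is only ever called on complete, well-defined sets of atoms, which is the practical appeal of the subtraction scheme.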
2. Link Atoms
The division of a system is sometimes trivial, for example, in solvated systems with solvent as the O part and solute as the I part [16]. Difficulties arise, however, if the I and O parts are connected by covalent bonds. Simply breaking the bonds would result in a highly charged or high spin system leading to a poor description of the interactions at the QM level. In such a case the definition of the I part requires proper description of the boundary region. Several methods have been developed to handle this situation, for example, localized orbitals (e.g., [17]) but the easiest and most commonly applied approach
involves so-called link atoms. Most of the hybrid QM/MM and embedding methods differ in the way the boundary region is treated and in the contributions to E(I–O) that are evaluated at the QM level. In the link atom approach the bonds between atoms of the inner region and atoms of the outer region (A–B, A ∈ I, B ∈ O) are replaced by bonds between the inner part atoms and link atoms (A–L), as shown in Fig. 1(b). The I part atoms together with the link atoms form a finite molecular cluster model (C). The type and number of link atoms should be chosen such that bond orders are conserved and no open valencies are left on the link atoms. In most cases hydrogen atoms seem to be the most reasonable choice for link atoms [5]. Introduction of link atoms requires modification of the QM–Pot energy expression, Eq. (2), which now takes the form
$$E(S,L)_{\mathrm{high/low}} = E(S)_{\mathrm{low}} - E(C)_{\mathrm{low}} + E(C)_{\mathrm{high}}, \tag{3}$$
where the E(S,L) notation stresses that the energy defined by Eq. (3) now also depends on the coordinates of the link atoms. The energy defined by Eq. (3) differs formally from that of Eq. (2) by a term Δ that involves differences of energy contributions at the high (QM) and low (Pot) levels,
$$\Delta = E(L)_{\mathrm{low}} - E(L)_{\mathrm{high}} + E(I\text{–}L)_{\mathrm{low}} - E(I\text{–}L)_{\mathrm{high}}, \tag{4}$$
where E(L) and E(I–L) denote contributions from the link atoms and the interactions between link atoms and the I part atoms. To maintain high accuracy of the QM/MM hybrid calculations the influence of this term on computed relative energies has to be minimized. This can be achieved in two ways: (a) use of potential functions that mimic the quantum mechanical energy contribution connected with link atoms sufficiently well, and (b) use of large enough QM clusters, so that the distance between the reaction center and the link atoms is sufficiently large to ensure that Δ remains constant during the course of the reaction. In practice, (a) can be achieved using potential functions parameterized to reproduce results of QM calculations on small model systems. For (b) it is difficult to judge a priori whether a given QM cluster size is sufficient to yield acceptable errors. Therefore, careful convergence studies of the calculated properties with increasing QM cluster size or comparison with full QM calculations are very important in the calibration of the QM–Pot results. The link atoms introduce additional, artificial degrees of freedom to the system and their proper treatment is very important in structure optimizations or molecular dynamics simulations. There are generally two ways of treating link atoms: unconstrained (e.g., [18]) and constrained (e.g., [12, 19]). In the unconstrained approach the link atoms are free to move and their positions are independently optimized. In the constrained approach the link atoms are kept fixed at some chosen position, usually on the bonds they terminate.
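The origin of the term in Eq. (4) can be made explicit with a short derivation. It is a formal sketch that assumes the cluster energy at either level can be split into contributions from the inner part, the link atoms, and their interaction; the symbol $\Delta$ is used here for the right-hand side of Eq. (4).

```latex
\begin{align*}
E(C)_{x} &= E(I)_{x} + E(L)_{x} + E(I\text{--}L)_{x},
  \qquad x \in \{\mathrm{low}, \mathrm{high}\}, \\
E(S,L)_{\mathrm{high/low}}
  &= E(S)_{\mathrm{low}} - E(C)_{\mathrm{low}} + E(C)_{\mathrm{high}} \\
  &= \bigl[E(S)_{\mathrm{low}} - E(I)_{\mathrm{low}} + E(I)_{\mathrm{high}}\bigr]
   - \bigl[E(L)_{\mathrm{low}} - E(L)_{\mathrm{high}}
   + E(I\text{--}L)_{\mathrm{low}} - E(I\text{--}L)_{\mathrm{high}}\bigr] \\
  &= E(S)_{\mathrm{high/low}} - \Delta .
\end{align*}
```

A constant $\Delta$ therefore cancels in any energy difference between two states, which is why only its variation along the reaction matters.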
In the QM–Pot approach the terminating link atoms are kept in a position in which they serve their purpose best: they are constrained to stay on the bonds between the inner and the outer part of the system that they terminate [12, 15]. This way the explicit dependence of the QM–Pot energy on link atom positions is removed and replaced by a parametric dependence in the form of constraints. This creates additional contributions to the forces and force constants on the atoms in the bonds linking the I and O parts. The advantage is that the derivatives of $E_{\mathrm{high/low}}$, Eq. (3), fulfill the requirements that follow from its translational and rotational invariance. Morokuma’s ONIOM method uses a similar approach [19].
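A constrained link atom and the resulting extra force contributions can be sketched as follows. This is an illustration under stated assumptions, not the QM–Pot or ONIOM implementation: the link atom L is assumed to sit on the A–B bond at a fixed distance `d_al` from A (other conventions, such as a fixed fraction of the bond length, are also in use), and the function names are hypothetical. The force on L is passed on to A and B by the chain rule.

```python
import math
from typing import List, Tuple

Vector = List[float]


def _unit_bond(r_a: Vector, r_b: Vector) -> Tuple[Vector, float]:
    """Return the unit vector from A to B and the A-B distance."""
    bond = [b - a for a, b in zip(r_a, r_b)]
    length = math.sqrt(sum(c * c for c in bond))
    if length == 0.0:
        raise ValueError("atoms A and B coincide")
    return [c / length for c in bond], length


def place_link_atom(r_a: Vector, r_b: Vector, d_al: float) -> Vector:
    """Position of L on the A->B bond at the assumed fixed distance d_al from A."""
    u, _ = _unit_bond(r_a, r_b)
    return [a + d_al * c for a, c in zip(r_a, u)]


def distribute_link_force(r_a: Vector, r_b: Vector, d_al: float,
                          f_l: Vector) -> Tuple[Vector, Vector]:
    """Chain-rule contributions of the force on L to the forces on A and B.

    With r_L = r_A + d_al * u, the part of f_l perpendicular to the bond is
    shared between A and B; the parallel part acts on A only.
    """
    u, length = _unit_bond(r_a, r_b)
    f_par = sum(f * c for f, c in zip(f_l, u))
    f_perp = [f - f_par * c for f, c in zip(f_l, u)]
    f_b = [(d_al / length) * f for f in f_perp]
    f_a = [f - fb for f, fb in zip(f_l, f_b)]
    return f_a, f_b


# Hypothetical geometry: A at the origin, B two length units away along z.
r_link = place_link_atom([0.0, 0.0, 0.0], [0.0, 0.0, 2.0], 1.0)
f_on_a, f_on_b = distribute_link_force([0.0, 0.0, 0.0], [0.0, 0.0, 2.0],
                                       1.0, [1.0, 0.0, 3.0])
print(r_link, f_on_a, f_on_b)
```

The two contributions sum to the original force on L, so the total force on the system is unchanged, in line with the translational invariance mentioned above.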
3. The Reaction Force Field
The QM–Pot approach relies on the subtraction scheme, and the potential function must also be known for the I part, at least the part that describes the long-range interaction with the O part. The usual force fields and potential functions describe the potential energy surface only in the vicinity of a stable minimum, but are not valid for the regions corresponding to transition structures. A solution to this problem was proposed by Warshel [20]. His empirical valence bond (EVB) method creates a smooth connection between the force fields describing different states (resonance forms) of the system. We have adopted the EVB idea for efficient location of transition structures in extended systems by the QM–Pot method [15]. In the simplest case of just two states described by single-minimum interatomic potential functions V1 and V2, a simple 2×2 eigenvalue problem is obtained with a nondiagonal V12 element that couples these two states (see Fig. 2). The lowest eigenvalue describes an adiabatic state that creates a smooth transition between the V1 and V2 states. The V12 term is defined in terms of a small set of internal coordinates in which only atoms with the largest displacement along the reaction path are involved, and the necessary parameters are obtained from QM calculations on small model systems. Since the I part is described by the QM method, this EVB blending of two potentials affects only its interaction with the O part, the E(I–O) term in Eq. (1), and even a crude estimate of the V12 parameters yields sufficiently accurate results [21].
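For the two-state case, the lowest eigenvalue of the symmetric 2×2 matrix with diagonal elements V1, V2 and off-diagonal element V12 has a closed form, which the sketch below evaluates along a one-dimensional model coordinate. The harmonic forms of `v1` and `v2`, the constant coupling `V12`, and all numerical parameters are illustrative assumptions; in the method described above V12 depends on internal coordinates and is fitted to QM data.

```python
import math


def evb_lowest(v1: float, v2: float, v12: float) -> float:
    """Lowest eigenvalue of [[v1, v12], [v12, v2]] (adiabatic ground state)."""
    mean = 0.5 * (v1 + v2)
    half_gap = 0.5 * (v1 - v2)
    return mean - math.sqrt(half_gap * half_gap + v12 * v12)


def v1(x: float) -> float:
    """Hypothetical reactant potential: harmonic well centred at x = -1."""
    return (x + 1.0) ** 2


def v2(x: float) -> float:
    """Hypothetical product potential: harmonic well centred at x = +1."""
    return (x - 1.0) ** 2


V12 = 0.25  # assumed constant coupling, arbitrary energy units

# Scan the model reaction coordinate from reactant to product.
for i in range(9):
    x = -1.0 + 0.25 * i
    print(f"x = {x:+.2f}  V1 = {v1(x):.3f}  V2 = {v2(x):.3f}  "
          f"E = {evb_lowest(v1(x), v2(x), V12):.3f}")
```

Far from the crossing the adiabatic energy follows the lower of the two diabatic curves, while at the crossing point it is lowered by |V12|, which gives the smooth barrier region sketched in Fig. 2.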
3.1. Long-Range Interactions
Reaction energies, energy barriers, or other relative energies calculated with the QM–Pot method consist of two contributions [22],
$$\Delta E_{\text{QM–Pot}} = \Delta E_{\text{QM//QM–Pot}} + \Delta E_{\text{Pot//QM–Pot}}. \tag{5}$$
Figure 2. Interatomic potential functions V1 and V2 for reactants (R) and products (P), respectively, coupled by the EVB method. The result is a smooth potential function E valid also in the region of the transition state (TS).
The notation “//QM–Pot” means that the energies are evaluated for the structures obtained by QM–Pot calculations. The first one, the direct QM contribution
$$\Delta E_{\mathrm{QM}} = E(C^{2})_{\text{QM//QM–Pot}} - E(C^{1})_{\text{QM//QM–Pot}} \tag{6}$$
is different from unconstrained cluster model results because the structures of the embedded clusters are different due to constraints imposed by the extended system (e.g., solid lattice). Superscripts “1” and “2” correspond to the two states of the system, for example, reactants and products or reactants and transition state. The second term includes all contributions due to the interatomic potential functions,
$$\Delta E_{\mathrm{LR}} = E(S^{2})_{\text{Pot//QM–Pot}} - E(S^{1})_{\text{Pot//QM–Pot}} - E(C^{2})_{\text{Pot//QM–Pot}} + E(C^{1})_{\text{Pot//QM–Pot}}. \tag{7}$$
If the QM cluster is large enough to account for all structure distortions upon the reaction, the latter contribution can be considered as a correction accounting for all long-range interactions not included in the QM part.
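The bookkeeping behind Eqs. (5)–(7) can be sketched in a few lines. The container `StateEnergies`, the function names, and all numerical values are hypothetical and serve only to show that the direct QM part and the long-range part add up to the difference of the total QM–Pot energies of Eq. (3).

```python
from dataclasses import dataclass


@dataclass
class StateEnergies:
    """Energies of one state, all evaluated at its QM-Pot structure."""
    e_s_pot: float  # E(S) with the potential functions
    e_c_pot: float  # E(C) with the potential functions
    e_c_qm: float   # E(C) with the QM method


def qm_pot_energy(state: StateEnergies) -> float:
    """Total QM-Pot energy of one state, Eq. (3)."""
    return state.e_s_pot - state.e_c_pot + state.e_c_qm


def decompose(state1: StateEnergies, state2: StateEnergies):
    """Return (direct QM part, long-range part) following Eqs. (6) and (7)."""
    delta_qm = state2.e_c_qm - state1.e_c_qm
    delta_lr = ((state2.e_s_pot - state1.e_s_pot)
                - (state2.e_c_pot - state1.e_c_pot))
    return delta_qm, delta_lr


# Made-up numbers in arbitrary energy units, for illustration only.
reactant = StateEnergies(e_s_pot=-1000.0, e_c_pot=-200.0, e_c_qm=-5000.0)
transition = StateEnergies(e_s_pot=-990.0, e_c_pot=-195.0, e_c_qm=-4930.0)

d_qm, d_lr = decompose(reactant, transition)
barrier = qm_pot_energy(transition) - qm_pot_energy(reactant)
print(d_qm, d_lr, barrier)
```

Reporting the two parts separately makes it easy to see how the long-range correction changes with QM cluster size while the total stays stable.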
3.2. Implementation
The QM–Pot method, the EVB coupling of parameterized potential energy surfaces and its combination with the QM–Pot approach have been implemented in the QMPOT program [15]. It is designed as an optimizer for
minima and saddle points. External programs provide the energies, forces, and force constants for the QM and Pot parts, and the communication is achieved through the interface functions. The main features of QMPOT are (a) QM–Pot and EVB structure optimizations to minima and saddle points, and (b) QM–Pot and EVB energy second derivatives and harmonic vibrational frequencies. Morokuma’s IMOMM, IMOMO, and ONIOM are other implementations of hybrid QM/MM and QM/QM methods and are available in the Gaussian program [19]. A third implementation with various options is the ChemShell software [9].
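How an optimizer might combine results delivered by external programs can be sketched as below. This is not the QMPOT interface: the `Backend` callable type, `make_harmonic`, and `combine` are hypothetical names, link atoms are omitted for brevity, and the harmonic toy back ends merely stand in for real QM and potential-function codes.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Result = Tuple[float, List[Vector]]          # (energy, gradient per atom)
Backend = Callable[[List[Vector]], Result]   # hypothetical interface function


def make_harmonic(k: float) -> Backend:
    """Toy back end: every atom is tied to the origin by a spring of constant k."""
    def backend(coords: List[Vector]) -> Result:
        energy = 0.5 * k * sum(c * c for r in coords for c in r)
        gradient = [[k * c for c in r] for r in coords]
        return energy, gradient
    return backend


def combine(coords: List[Vector], inner_idx: Sequence[int],
            low: Backend, high: Backend) -> Result:
    """Subtractive energy and gradient: low(S) - low(I) + high(I)."""
    inner_coords = [coords[i] for i in inner_idx]
    e_s, g_s = low(coords)
    e_i_low, g_i_low = low(inner_coords)
    e_i_high, g_i_high = high(inner_coords)
    energy = e_s - e_i_low + e_i_high
    gradient = [list(g) for g in g_s]
    # Inner-part atoms receive the high-minus-low gradient correction.
    for k, i in enumerate(inner_idx):
        for x in range(3):
            gradient[i][x] += g_i_high[k][x] - g_i_low[k][x]
    return energy, gradient


coords = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
energy, gradient = combine(coords, inner_idx=[0],
                           low=make_harmonic(1.0), high=make_harmonic(2.0))
print(energy, gradient)
```

Keeping the back ends behind a single callable signature is one way to let the same optimizer drive different QM and potential-function programs.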
4. Applications to Structure and Reactivity of Zeolite Catalysts
Zeolites are nanoporous crystalline solids built of three-dimensional networks of corner-sharing TO4 tetrahedra in which T is an electropositive element, typically Si, Al, or P. Depending on their composition, $(\mathrm{SiO_2})_x(\mathrm{AlO_2^-})_y(\mathrm{PO_2^+})_z$, frameworks are negatively charged and charge-compensating metal cations or protons are present on extra-framework positions. The different structure types and examples for zeolites found as minerals or synthesized in the laboratory are collected in the “Atlas of Zeolite Structure Types” [23]. Figure 3 shows four examples of zeolite lattices, chabazite (CHA), faujasite (FAU), mordenite (MOR), and ZSM-5 (MFI). Because of their unique nanoporous structure, zeolites have the ability to act as catalysts for chemical reactions which take place within the internal cavities. The most important sources of catalytic activity are: (i) Protons as charge-compensating cations (solid Brønsted acids). This is exploited in many organic reactions, including crude oil cracking, isomerization and fuel synthesis. (ii) Transition metal cations occupying extra-framework positions. Examples are Cu, Co, or Ag exchanged zeolites used in NOx decomposition. (iii) Isomorphous substitution of Si4+ by other tetrahedrally coordinated cations; for example, substitution of Ti into high-silica zeolite frameworks creates highly selective oxidation catalysts.
5. Acidic Zeolite Catalysts
The acid strength of zeolites containing Brønsted acidic sites can be characterized by different model reactions. (1) Deprotonation,
$$\mathrm{ZO{-}H \rightarrow ZO^{-} + H^{+}}. \tag{8}$$
Figure 3. Different zeolite framework structures: (a) CHA, (b) FAU, (c) MFI, and (d) MOR.
This hypothetical reaction defines gas phase acidities. The relative values of enthalpies of deprotonation for gas phase molecules are accessible from proton transfer equilibrium data. For acidic sites at surfaces only inferences can be made from spectroscopic data, and reliable values can only be provided by theoretical calculations. (2) Chemisorption of basic molecules, e.g., NH3,
$$\mathrm{ZO{-}H + NH_3 \rightarrow ZO{-}NH_4^{+}}. \tag{9}$$
This process can be studied by calorimetry and the reverse process is observed in temperature programmed desorption experiments. (3) Proton motion between two oxygen atoms,
$$\mathrm{Z(O2)O1{-}H \rightarrow Z(O1)O2{-}H}. \tag{10}$$
The reaction energy for this proton motion is, by definition, given by the relative deprotonation energy of the two sites involved. The barrier and the corresponding jump rate characterize the proton mobility of the active site. (4) Protonation of organic molecules involved in the catalytic reactions in zeolites, for example, hydrocarbon conversions.
The QM–Pot method has been applied to investigate the influence of the structure and chemical composition of zeolites on the acidity of their Brønsted sites [22, 24–26]. For deprotonation and ammonia adsorption the QM–Pot reaction energies deviate from the periodic full QM results by only 4–9 kJ/mol, which demonstrates the power of the combined approach [27]. A more detailed review of these results is given by Sauer and Sierka [28]. Here, we give a short overview of QM–Pot calculations on dynamic properties of Brønsted sites, i.e., reactions (3) and (4) above.
6. Proton Mobility in Zeolites
The simplest dynamic process that characterizes Brønsted acidic sites is the proton jump between oxygen atoms of the zeolite framework. In dehydrated acidic zeolites two types of proton motion can be distinguished (Fig. 4): (a) local, on-site jumps between the four oxygen atoms of the AlO4 tetrahedron, and (b) translational, inter-site motions between two different aluminum sites. Clearly, reliable theoretical predictions of how the proton jump barriers and rates depend on zeolite structure and chemical composition require a modeling method that takes the whole periodic lattice into account. Our QM–Pot approach is such a method. Predicting on-site jump barriers and rates requires locating all four local minima and six transition structures for a given crystallographic location of the Al atom. For chabazite, a zeolite with a small unit cell, the maximum deviation of the QM–Pot results from the full periodic QM treatment is 4 kJ/mol for reaction energies (stabilities of different proton positions around the AlO4 tetrahedron) and 6 kJ/mol for proton jump barriers, as shown in Table 1 [15]. For zeolites with a large unit cell, such as FAU and MFI, the convergence of the QM–Pot results with the size of the QM cluster was investigated [7]. The long-range correction to the barrier decreases with the QM cluster size, but shows large variations with the specific sites considered. Even for the largest clusters, comprising up to 25 TO4 units (T = Si, Al), the long-range corrections vary over 25 kJ/mol, but with the combined QM–Pot method the total barrier heights are stable within a few kJ/mol.
Figure 4. Two types of proton jumps in dehydrated zeolites: on-site (a) and inter-site (b).
Table 1. Comparison of relative stabilities (kJ/mol) of proton positions (E) and proton jump barriers (E‡) between two oxygen atoms calculated with the QM–Pot method and full periodic QM for zeolite chabazite. Data taken from Sierka and Sauer [15]
Jump path     E (a)                 E‡ (a)
              QM–Pot    Full QM     QM–Pot    Full QM
O3-O4 (b)       1.7       3.3         65        69
O1-O2          11.2      11.5         66        72
O3-O2           3.5       3.8         87        92
O1-O3           3.6       7.2         90        90
O2-O4           0.2       3.2         99        97
O1-O4          12.4      13.4        102       105
(a) At 0 K temperature, zero point energy correction not included.
(b) Numbers denote different crystallographic positions of oxygen atoms within the zeolite lattice.
Hence, use of cluster models of the same size without embedding by the QM–Pot scheme will produce large errors in the relative barrier heights. For the investigated zeolites (CHA, FAU, and MFI), the final on-site proton jump barriers including zero-point vibrational energies vary between 52 and 106 kJ/mol, depending on zeolite type, crystallographic site, and path for the proton motion. The predicted jump rates also show large variations, from 10^-6 to 10^5 s^-1. Estimates show that tunneling is not an important factor above room temperature. For the inter-site proton motion in acidic MFI, the activation barriers are found to depend on the spatial separation of the two neighboring Al sites. The calculated proton jump rates vary over a broad range of 10^-10 to 10^10 s^-1, depending on the proton jump path and the Al–Al distance [29].
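As a rough illustration of how barriers of this size translate into jump rates, the following Python sketch evaluates a simple Eyring-type expression, k = (k_B T / h) exp(−E / RT). This expression, the function name eyring_rate, and the default temperature of 298.15 K are assumptions made here for illustration only; the rates quoted above come from the cited calculations, whose rate expression and temperature are not stated in this section.

```python
import math

# Physical constants in SI units (exact values of the 2019 SI definition).
K_B = 1.380649e-23         # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s
R_GAS = 8.314462618        # molar gas constant, J/(mol K)


def eyring_rate(barrier_kj_mol: float, temperature_k: float = 298.15) -> float:
    """Classical transition-state-theory rate constant in s^-1.

    Assumption: k = (k_B T / h) * exp(-E / (R T)), i.e. the ratio of
    partition functions is set to one and tunneling is neglected.
    """
    prefactor = K_B * temperature_k / H_PLANCK
    return prefactor * math.exp(-barrier_kj_mol * 1000.0 / (R_GAS * temperature_k))


if __name__ == "__main__":
    # The two limits of the barrier range quoted in the text (kJ/mol).
    for barrier in (52.0, 106.0):
        print(f"E = {barrier:5.1f} kJ/mol  ->  k ~ {eyring_rate(barrier):.1e} s^-1")
```

Under these assumptions the two limiting barriers give rates of the order of 10^3 and 10^-6 s^-1, so a spread of more than nine orders of magnitude follows from the barrier range alone. Differences from the quoted values are to be expected, since prefactors and temperature are not taken from the source.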
6.1.
Hydrocarbon Conversion in Zeolites
Zeolites are very important catalysts in petroleum refining and petrochemical conversion processes. Experimental techniques cannot easily provide information about elementary catalytic steps because adsorption, desorption, and diffusion processes interplay with multiple simultaneous reactions. Although zeolites of different framework structures have the same Brønsted sites in common, they show orders-of-magnitude differences in catalytic performance. Thus, reliable modeling techniques must incorporate a realistic model of the active site environment. Large hydrocarbon species in zeolite catalysts with unit cells containing several hundred atoms are still a challenge for quantum chemistry methods. Recently, several studies using periodic density functional theory (DFT) with plane-wave basis sets for small and medium unit cell zeolites have been reported (see, e.g., [30–33]). However, such calculations are far from routine, remain a challenge for zeolites with unit cells of the size of ZSM-5, and suffer from an additional problem. While current density functionals are reasonably well suited for describing bond-breaking and bond-making reaction steps, they fail to yield reliable energies for the van der Waals (vdW) interactions that dominate the adsorption–desorption steps [34]. The QM–Pot approach is capable of treating the full active site environment at reasonable computational expense. It also has the advantage of producing more reliable adsorption energies than the full DFT treatment. When the QM part is chosen small enough, the most important vdW interactions between the hydrocarbon and the zeolite are described by the force field (which is superior to DFT for this purpose). A cluster model containing three T atoms with the formula (OH)2Al(OSi(OH)3)2 proved the best choice. This model is also large enough to describe bond breaking and bond making properly. We have applied our method in studies of two important issues in hydrocarbon chemistry in zeolites: the role of carbocations in hydrocarbon conversions [21] and the phenomenon known as transition state shape selectivity [35].
6.2.
Adsorption of Unsaturated Hydrocarbons and the Role of Carbocations
Adsorption of unsaturated hydrocarbons on zeolitic Brønsted sites results in the formation of an adsorption complex. In contrast to full periodic DFT calculations, our QM–Pot approach yields reasonable adsorption energies. For m-xylene in ZSM-5 the QM–Pot adsorption energy is approximately 61 kJ/mol, of which the DFT part contributes only 12 kJ/mol. Plane-wave periodic DFT calculations give a value of 28 kJ/mol. Experimental values are between 60 and 85 kJ/mol on NaY and KY zeolites [35]. DFT clearly fails to describe the dispersion interactions adequately, which results in strongly underestimated adsorption energies. The detailed mechanisms by which solid acids catalyze hydrocarbon conversion reactions are still not completely known. The work of Olah and others demonstrated that liquid superacids protonate hydrocarbons and stabilize carbenium ions [36, 37], but it is still controversial whether zeolites do the same. We have used our hybrid QM/MM method to study the bimolecular mechanism of the disproportionation of m-xylene into toluene and trimethylbenzene (TMB) [21]. The most important finding from our calculations is that benzenium-type carbenium ions are local minima on the potential energy surface (see Fig. 5) and, hence, possible intermediates in the reaction mechanism. Nicholas and Haw [38] have produced NMR evidence for some carbenium ions and concluded that only species with proton affinities (PA) greater than about 874 kJ/mol live long enough in zeolites to be observed on the NMR time scale.
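The adsorption energies quoted at the start of this section can be put side by side in a few lines of Python. The sketch below uses only the numbers given in the text; the interpretation of the remainder (total minus DFT part) as the share carried by the potential function is an assumption that follows from the additive form of the combined scheme, and the helper names are hypothetical.

```python
# Adsorption energies for m-xylene quoted in the text (kJ/mol).
QM_POT_TOTAL = 61.0              # QM-Pot result in ZSM-5
QM_POT_DFT_PART = 12.0           # DFT share of the QM-Pot result
PERIODIC_DFT = 28.0              # plane-wave periodic DFT result
EXPERIMENT_RANGE = (60.0, 85.0)  # measured on NaY and KY zeolites


def potential_function_share(total: float, dft_part: float) -> float:
    """Part of the QM-Pot energy not carried by the DFT cluster calculation.

    Assumption: the QM-Pot total is the sum of a DFT part and a
    potential-function (force field) part.
    """
    return total - dft_part


def within_range(value: float, bounds: tuple) -> bool:
    """True if value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high


if __name__ == "__main__":
    share = potential_function_share(QM_POT_TOTAL, QM_POT_DFT_PART)
    print(f"Potential-function share of QM-Pot energy: {share:.0f} kJ/mol")
    for label, value in (("QM-Pot", QM_POT_TOTAL), ("periodic DFT", PERIODIC_DFT)):
        verdict = "inside" if within_range(value, EXPERIMENT_RANGE) else "outside"
        print(f"{label}: {value:.0f} kJ/mol, {verdict} the experimental range")
```

Note that the experimental range refers to NaY and KY rather than ZSM-5, so the comparison is indicative only.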
Figure 5. Benzenium carbenium ion based on 3-methylphenyl-2,4-dimethylphenyl-methane (upper part of the figure) and its positions within the FAU zeolitic cage: the electrostatically stabilized position A in the vicinity of the Al atom (a), and the van der Waals stabilized position B far from the active site (b).
The PA of the benzenium ion shown in Fig. 5 is only 821 kJ/mol, and it remains to be seen whether this is enough to observe it experimentally. Both the ionic attraction between the negatively charged active Al site on the zeolite framework and the positively charged benzenium ion and the vdW (dispersion) interaction with the zeolite wall contribute to the stabilization of the benzenium-type intermediate in the zeolite cavity. While the electrostatic attraction dominates for position A close to the negatively charged active site, in structure B the benzenium ion fits tightly to the zeolite wall far from the active site, where it maximizes the vdW interaction. The unique feature of zeolite catalysts is that they combine the activation of molecules with the confinement of the nano-sized pores and cavities. This has led to the concept of transition state shape selectivity. It states that some isomers may not be observed in the product stream, not because they are too bulky to leave the pores, but because the transition state through which they form at the active site may be too bulky for the pore of a given zeolite. Experimental proof is difficult, and we have therefore used our hybrid method to study a sterically demanding reaction, the disproportionation of m-xylene into TMB and toluene [35]. The results of the QM–Pot calculations show that one product isomer (1,3,5-TMB) is disfavored, but the relative selectivity to the other two isomers varies with pore geometry, mechanistic pathway, and inclusion of entropic effects. The calculated barriers, including zero point energy corrections, are in general agreement with experimental data. For both pathways they fall into the range of 112–135 kJ/mol, compared with experimentally derived apparent reaction barriers of 80–110 kJ/mol. Variation of the environment shape at the critical transition states is shown to affect the course of the reaction in the three zeolites investigated (FAU, MFI, and MOR). Barrier height shifts on the order of 10–20 kJ/mol are achievable. However, observed selectivities do not agree with the calculated transition state characteristics and, hence, are most likely due to product shape selectivity.
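To give a feeling for what a barrier shift of 10–20 kJ/mol means kinetically, the following Python sketch converts a shift into the factor by which a rate constant would change, using the Boltzmann factor exp(ΔE / RT). Both the use of this simple factor and the temperature of 600 K are assumptions chosen here for illustration; neither is taken from the cited study.

```python
import math

R_GAS_KJ = 8.314462618e-3  # molar gas constant, kJ/(mol K)


def rate_ratio(barrier_shift_kj_mol: float, temperature_k: float) -> float:
    """Factor by which a rate constant increases when its barrier is
    lowered by barrier_shift_kj_mol, assuming unchanged prefactors."""
    return math.exp(barrier_shift_kj_mol / (R_GAS_KJ * temperature_k))


if __name__ == "__main__":
    temperature = 600.0  # hypothetical reaction temperature in K
    for shift in (10.0, 20.0):
        print(f"shift {shift:4.1f} kJ/mol -> rate factor {rate_ratio(shift, temperature):.1f}")
```

Because the factor is exponential in the shift, doubling the shift squares the rate factor.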
7.
Transition Metal Containing Zeolites
Systems containing transition metal ions (TMIs) show catalytic activity in settings ranging from homogeneous and heterogeneous catalysts to biomolecules. In particular, TMI-exchanged high-silica zeolites such as MFI and ferrierite (FER) show high catalytic activity for the direct conversion of NO into N2 and O2 (“deNOx activity”) and for the selective catalytic reduction of NO by hydrocarbons in the presence of excess oxygen. Among different materials, Cu-exchanged MFI shows an unusually high catalytic activity [39]. Because of the rather high Si/Al ratio and low TMI loading, detailed information about the catalytic processes is not easily accessible experimentally. Therefore, reliable theoretical studies are of great importance for understanding the distribution, local structure, and catalytic activity of TMI sites. Empirical interatomic potentials can be applied to periodic structures (e.g., [40]) and are capable of distinguishing between different zeolite frameworks and different sites within a given framework, but the reliability of the potential functions is an open question. DFT provides reliable results for TMI interactions [41], but problems arise from the use of cluster models. Cluster models cannot describe differences between zeolite frameworks, and their results also depend on the shape and size of the cluster as well as on the geometric constraints imposed on it. An example is the prediction of the coordination number (CN) of the Cu+ ions to the zeolite framework. It appears that linear cluster models are biased toward twofold coordination, whereas cyclic ones are biased toward structures with higher coordination numbers [42].
For linear chains of TO4 tetrahedra (T = Si, Al) with one to five T atoms, twofold coordination of Cu+ cations to two oxygen atoms of the AlO4 tetrahedron was consistently found by several authors (e.g., [43, 44]), while cluster models with a ring of TO4 tetrahedra containing four to six T atoms yielded structures with three- or fourfold coordinated Cu+ ions (e.g., [45]). Reliable results can only be obtained when the whole periodic zeolite structure is included, different possible locations of TMIs are considered, and the interactions with the zeolite framework are described accurately enough. This is readily achieved with the QM–Pot method, which, since its first application to the chemistry of Cu+ cations in MFI [46], has proved a powerful tool for studying TMIs in zeolites, particularly in MFI and FER [28, 42].
Determination of the preferred siting and coordination of Cu+ ions in zeolites such as MFI and FER requires investigation of many different arrangements of aluminum atoms and copper ions, since such information is not available from experiment. A two-step computational strategy proved very useful [47, 48]. First, lattice energy minimizations using an accurate potential function parametrized on DFT data alone were performed for a large number of initial Al and Cu+ distributions. This allows a fast determination of the most favored sites. Next, for selected structures, QM–Pot energy minimizations were performed using QM clusters large enough to capture the most important interactions between the Cu+ ion and the zeolite framework. In both zeolites, sites were found with two-, three-, or fourfold coordinated Cu+ ions and with average coordination numbers in close agreement with experimental data. The sites were classified according to the number of O atoms coordinating the Cu+ ion and its position in the framework, as shown in Fig. 4 of [42]. In type II sites, copper ions are coordinated to two framework O atoms, either at the channel intersection (I2 site in MFI and FER) or on the walls of the main or perpendicular channels (M2 and P2 sites, respectively, in FER). Higher coordinated sites, summarized as type I sites, have one or two additional coordinations to other oxygen atoms within a five- or six-membered (TO)n ring. The existence of the two types of sites also emerged from experimental photoluminescence spectra. While the observed 3d^10 (^1S_0) – 3d^9 4s^1 (^1D_2) excitation spectra show two well-separated bands, the band splitting almost disappears in the emission spectra. The QM–Pot calculations not only confirmed this observation but also provided an explanation [49]. In the ground state, different types of Cu+ coordination cause large variations in the excitation energies.
In contrast, in the excited state the coordination differences between type I and II sites disappear. The type I sites give up their additional coordination and retain only the twofold coordination to the AlO4 tetrahedron, whereas type II sites remain unchanged. The reason is that on excitation the 4s orbital, which is much larger than the 3d orbital, becomes occupied, and so the Cu+ ion moves away from the zeolite wall. Thus, because the excited structures are alike for all Cu+ sites considered, the emission energies are also very similar. The excitation process is accompanied by relatively large relaxation effects of the surrounding zeolite lattice, which shows that such phenomena cannot be adequately described by free-space or frozen-lattice cluster calculations. The most important question is whether the two different types of Cu+ sites in MFI and FER exhibit different catalytic properties. QM–Pot calculations have been performed to investigate the influence of the Cu+ ion location on the adsorption of small molecules, such as CO [50], NO, NO2, N2, and H2O [51]. Upon interaction with one molecule, the coordination of the TMI to the zeolite framework is unchanged for type II sites. For type I sites, the interaction of the TMI with any of the studied molecules leads to the loss of the coordination of the TMI to framework oxygen atoms outside the AlO4 tetrahedron, and the TMI moves farther from the channel wall. For this reason, the interaction energies with type II sites are 6–8 and 11–13 kcal/mol (about 25–34 and 46–54 kJ/mol) stronger for MFI and FER, respectively, than with type I sites, as shown in Table 2. Significant differences between type I and type II sites were also found for the interaction with two or three molecules. For example, the Cu+ ion in a type II site can bind two or even three CO molecules (in agreement with experimental observation), while the Cu+ ion in a type I site can bind two CO molecules at most. The two-step QM–Pot approach has also been successful in determining the siting and local coordination of Cu(I) pairs [52] and Ag+ ions in MFI [53], as well as the coordination of Cu+ and Cu2+ ions in MFI in the vicinity of two framework Al atoms [54].
Table 2. Calculated QM–Pot interaction energies (kJ/mol) (a) of NO, CO, N2, and NO2 molecules with Cu+ ions in zeolites MFI and FER. Data taken from Nachtigall et al. [51]
Molecule    Type I site       Type II site
            MFI      FER      MFI      FER
NO          117       84      146      138
CO          151      117      176      167
N2           84       54      109      100
NO2         146      109      180      159
(a) At 0 K temperature, zero point energy correction not included.
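The site preference quoted in kcal/mol can be checked directly against the kJ/mol entries of Table 2. The short Python sketch below does this; the dictionary layout and the function name are illustrative choices, and the conversion factor of 4.184 kJ per kcal (thermochemical calorie) is an assumption, since the source does not state which one was used.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie (assumed conversion factor)

# Interaction energies from Table 2 in kJ/mol:
# framework -> site type -> molecule -> energy
TABLE_2 = {
    "MFI": {
        "I": {"NO": 117, "CO": 151, "N2": 84, "NO2": 146},
        "II": {"NO": 146, "CO": 176, "N2": 109, "NO2": 180},
    },
    "FER": {
        "I": {"NO": 84, "CO": 117, "N2": 54, "NO2": 109},
        "II": {"NO": 138, "CO": 167, "N2": 100, "NO2": 159},
    },
}


def type_ii_preference(framework: str) -> dict:
    """Type II minus type I interaction energy per molecule, in kJ/mol."""
    site_i = TABLE_2[framework]["I"]
    site_ii = TABLE_2[framework]["II"]
    return {molecule: site_ii[molecule] - site_i[molecule] for molecule in site_i}


if __name__ == "__main__":
    for framework in ("MFI", "FER"):
        diffs = type_ii_preference(framework).values()
        low, high = min(diffs), max(diffs)
        print(
            f"{framework}: {low}-{high} kJ/mol "
            f"= {low / KJ_PER_KCAL:.1f}-{high / KJ_PER_KCAL:.1f} kcal/mol"
        )
```

The differences span 25–34 kJ/mol for MFI and 46–54 kJ/mol for FER, which round to the 6–8 and 11–13 kcal/mol given in the text.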
8.
Ti-Silicalite Catalysts
Isomorphous substitution of Si4+ by Ti4+ in synthetic zeolites gives rise to an interesting family of very active and selective catalysts. Ti-containing silicalite-1 (TS-1) shows an unprecedented catalytic activity for the oxidation of organic substances [55]. Experimental methods have difficulty locating the active Ti sites and characterizing their catalytic properties because of the low Ti concentration (Ti/Si < 0.025) and probable structural disorder. The QM–Pot method was used to examine the possible location of Ti within the framework and the interaction of the Ti sites with one or two water molecules [56]. Full periodic QM calculations on the small unit cell zeolite Ti-chabazite showed that the QM–Pot results converge to the true periodic QM limit when large enough QM clusters are used. In the dehydrated state, the stability differences between structures with Ti in different crystallographic positions are typically between 0 and 10 kJ/mol. A similar range of energies was obtained in calculations using cluster models [57] or periodic Hartree–Fock calculations [58]. A recent QM/MM study using partially constrained cluster models (ONIOM: DFT in combination with the UFF force field) yielded unreliable stability differences of up to 235 kJ/mol [59]. We suspect that the UFF force field used by the authors does not reliably describe zeolitic systems. The binding of H2O and NH3 to Ti sites was recently examined using the ONIOM method (DFT/larger basis set for a small cluster combined with Hartree–Fock/small basis set for a large cluster) [60]. The QM–Pot method was also used to study the spectroscopic properties of Ti-substituted zeolites [61, 62]. The QM–Pot method not only correctly reproduced the observed IR and 29Si NMR spectra of titanium silicalite-1 but also provided insight into the role of Ti substitution. Such substitution causes a shift of the 29Si NMR signal of neighboring Si nuclei of only ∼1 ppm to lower fields, while the dependence of the chemical shift on the average T–O–T angle remains unaltered. Hydration of the framework Ti site has been found to strongly influence the 29Si NMR chemical shift via structural distortions of the lattice and by hydrolysis of Si–O–Ti bridges. QM–Pot calculations confirmed that the IR mode at 960 cm^-1 is characteristic of Ti substitution and due to asymmetric TiO4 vibrations [62].
Acknowledgments
M. Sierka acknowledges support from the “Center for Functional Nanostructures”, which is funded by the “Deutsche Forschungsgemeinschaft,” the State of Baden-Württemberg and the Universität Karlsruhe. J. Sauer has been supported by the “Deutsche Forschungsgemeinschaft” (SPP 1155) and the “Fonds der chemischen Industrie.”
References
[1] C. Pisani (ed.), Quantum-Mechanical Ab-initio Calculation of the Properties of Crystalline Materials, Lecture Notes in Chemistry, vol. 67, Springer-Verlag, Berlin, 1996.
[2] M. Parrinello, Solid State Commun., 102, 107–120, 1997.
[3] D. Marx and J. Hutter, In: J. Grotendorst (ed.), Modern Methods and Algorithms of Quantum Chemistry, NIC Series, vol. 3, NIC Directors, FZ Jülich, Jülich, pp. 301–449, 2000.
[4] J.R. Hill, C.M. Freeman, and L. Subramanian, “Use of force fields in materials modeling,” In: K.B. Lipkowitz and D.B. Boyd (eds.), Reviews in Computational Chemistry, vol. 16, VCH, New York, pp. 141–216, 2000.
[5] J. Sauer, Chem. Rev., 89, 199–255, 1989.
[6] A. Warshel and M. Levitt, J. Mol. Biol., 103, 227–249, 1976.
[7] M. Sierka and J. Sauer, J. Phys. Chem. B, 105, 1603–1613, 2001.
[8] J. Gao and M.A. Thompson (eds.), Combined Quantum Mechanical and Molecular Mechanical Methods, ACS Symposium Series, vol. 712, American Chemical Society, Washington, 1998.
[9] P. Sherwood, In: J. Grotendorst (ed.), Modern Methods and Algorithms of Quantum Chemistry, NIC Series, vol. 3, NIC Directors, FZ Jülich, Jülich, pp. 257–277, 2000.
[10] P. von Ragué Schleyer, N.L. Allinger, T. Clark, J. Gasteiger, P.A. Kollman, H.F. Schaefer, III, and P.R. Schreiner (eds.), Encyclopedia of Computational Chemistry, Wiley, Chichester, 1998.
[11] D. Bakowies and W. Thiel, J. Phys. Chem., 100, 10580–10594, 1996.
[12] U. Eichler, C.M. Kölmel, and J. Sauer, J. Comput. Chem., 18, 463–477, 1997.
[13] S. Humbel, S. Sieber, and K. Morokuma, J. Chem. Phys., 105, 1959–1967, 1996.
[14] A.L. Shluger and J.D. Gale, Phys. Rev. B, 54, 962–969, 1996.
[15] M. Sierka and J. Sauer, J. Chem. Phys., 112, 6983–6996, 2000.
[16] J. Gao, “Methods and applications of combined quantum mechanical and molecular mechanical potentials,” In: K.B. Lipkowitz and D.B. Boyd (eds.), Reviews in Computational Chemistry, vol. 7, VCH, New York, pp. 119–185, 1995.
[17] M.F. Ruiz-López and J.L. Rivail, “Combined quantum mechanics and molecular mechanics approaches to chemical and biochemical reactivity,” In: P. von Ragué Schleyer, N.L. Allinger, T. Clark, J. Gasteiger, P.A. Kollman, H.F. Schaefer, III, and P.R. Schreiner (eds.), Encyclopedia of Computational Chemistry, vol. 1, Wiley, Chichester, pp. 437–448, 1998.
[18] J.R. Shoemaker, L.W. Burggraf, and M.S. Gordon, J. Phys. Chem. A, 103, 3245–3251, 1999.
[19] S. Dapprich, I. Komáromi, K.S. Byun, K. Morokuma, and M.J. Frisch, J. Mol. Struct. (Theochem), 461–462, 1–21, 1999.
[20] A. Warshel, Computer Modeling of Chemical Reactions in Enzymes and in Solutions, Wiley, New York, 1991.
[21] L.A. Clark, M. Sierka, and J. Sauer, J. Am. Chem. Soc., 125, 2136–2141, 2003.
[22] U. Eichler, M. Brändle, and J. Sauer, J. Phys. Chem. B, 101, 10035–10050, 1997.
[23] Ch. Baerlocher, W.M. Meier, and D.H. Olson, Atlas of Zeolite Framework Types, Elsevier, Amsterdam, 2001.
[24] M. Sierka and J. Sauer, Faraday Discuss., 106, 41–62, 1997.
[25] M. Sierka, U. Eichler, J. Datka, and J. Sauer, J. Phys. Chem. B, 102, 6397–6404, 1998.
[26] M. Brändle and J. Sauer, J. Am. Chem. Soc., 120, 1556–1570, 1998.
[27] M. Brändle, J. Sauer, R. Dovesi, and N.M. Harrison, J. Chem. Phys., 109, 10379–10389, 1998.
[28] J. Sauer and M. Sierka, J. Comput. Chem., 21, 1470–1493, 2000.
[29] M.E. Franke, M. Sierka, U. Simon, and J. Sauer, Phys. Chem. Chem. Phys., 4, 5207–5216, 2002.
[30] T. Demuth, X. Rozanska, L. Benco, J. Hafner, R.A. van Santen, and H. Toulhoat, J. Catal., 214, 68–77, 2003.
[31] X. Rozanska, R.A. van Santen, T. Demuth, F. Hutschka, and J. Hafner, J. Phys. Chem. B, 107, 1309–1315, 2003.
[32] X. Rozanska, R.A. van Santen, F. Hutschka, and J. Hafner, J. Am. Chem. Soc., 123, 7655–7667, 2001.
[33] A.M. Vos, X. Rozanska, R.A. Schoonheydt, R.A. van Santen, F. Hutschka, and J. Hafner, J. Am. Chem. Soc., 123, 2799–2809, 2001.
[34] T.A. Wesolowski, O. Parisel, Y. Ellinger, and J. Weber, J. Phys. Chem. A, 101, 7818–7825, 1997.
[35] L.A. Clark, M. Sierka, and J. Sauer, J. Am. Chem. Soc., 126, 936–947, 2004.
[36] J.F. Haw, Phys. Chem. Chem. Phys., 4, 5431–5441, 2002.
[37] G.A. Olah and A. Molnar, Hydrocarbon Chemistry, Wiley, New York, 1995.
[38] J.B. Nicholas and J.F. Haw, J. Am. Chem. Soc., 120, 11804–11805, 1998.
[39] M. Iwamoto, H. Furukawa, Y. Mine, F. Uemura, S. Mikuriya, and S. Kagawa, J. Chem. Soc. Chem. Commun., 1272–1273, 1986.
[40] D.C. Sayle, C.R.A. Catlow, J.D. Gale, M.A. Perrin, and P. Nortier, J. Mater. Chem., 7, 1635–1639, 1997.
[41] K. Koszinowski, D. Schröder, H. Schwarz, M.C. Holthausen, J. Sauer, H. Koizumi, and P.B. Armentrout, Inorg. Chem., 41, 5882–5890, 2002.
[42] J. Sauer, D. Nachtigallová, and P. Nachtigall, In: G. Centi, B. Wichterlová, and A.T. Bell (eds.), Catalysis by Unique Metal Ion Structures in Solid Matrices. From Science to Application, Nato Science Series, Sub-Series II, vol. 13, Kluwer Academic Publishers, Dordrecht, pp. 221–234, 2001.
[43] B.L. Trout, A.K. Chakraborty, and A.T. Bell, J. Phys. Chem., 100, 4173–4179, 1996.
[44] K.C. Haas and W.F. Schneider, Phys. Chem. Chem. Phys., 1, 639–648, 1999.
[45] E. Broclawik, J. Datka, B. Gill, and P. Kozyra, Phys. Chem. Chem. Phys., 2, 401–405, 2000.
[46] L. Rodriguez-Santiago, M. Sierka, V. Branchadell, M. Sodupe, and J. Sauer, J. Am. Chem. Soc., 120, 1545–1551, 1998.
[47] D. Nachtigallová, P. Nachtigall, M. Sierka, and J. Sauer, Phys. Chem. Chem. Phys., 1, 2019–2026, 1999.
[48] P. Nachtigall, M. Davidová, and D. Nachtigallová, J. Phys. Chem. B, 105, 3510–3517, 2001.
[49] P. Nachtigall, D. Nachtigallová, and J. Sauer, J. Phys. Chem. B, 104, 1738–1745, 2000.
[50] M. Davidová, D. Nachtigallová, R. Bulánek, and P. Nachtigall, J. Phys. Chem. B, 107, 2327–2332, 2003.
[51] P. Nachtigall, M. Davidová, M. Silhan, and D. Nachtigallová, In: R. Aiello, G. Giordano, and F. Testa (eds.), Studies in Surface Science and Catalysis, vol. 142, Elsevier, Amsterdam, pp. 101–108, 2002.
[52] P. Spuhler, M.C. Holthausen, D. Nachtigallová, P. Nachtigall, and J. Sauer, Chem. Eur. J., 8, 2099–2115, 2002.
[53] M. Silhan, D. Nachtigallová, and P. Nachtigall, Phys. Chem. Chem. Phys., 3, 4791–4795, 2001.
[54] D. Nachtigallová, P. Nachtigall, and J. Sauer, Phys. Chem. Chem. Phys., 3, 1552–1559, 2001.
[55] B. Notari, Adv. Catal., 41, 253–334 and references cited therein, 1996.
[56] G. Ricchiardi, A. de Man, and J. Sauer, Phys. Chem. Chem. Phys., 2, 2195–2204, 2000.
[57] C.A. Hijar, R.M. Jacubinas, J. Eckert, N.J. Henson, P.J. Hay, and K.C. Ott, J. Phys. Chem. B, 104, 12157–12164, 2000.
[58] C.M. Zicovich-Wilson and R. Dovesi, J. Phys. Chem. B, 102, 1411–1417, 1998.
[59] T. Atoguchi and S. Yao, J. Mol. Catal. A: Chem., 191, 281–288, 2003.
[60] A. Damin, S. Bordiga, A. Zecchina, and C. Lamberti, J. Chem. Phys., 117, 226–237, 2002.
[61] G. Ricchiardi and J. Sauer, Z. Phys. Chem. (Munich), 209, 21–32, 1999.
[62] G. Ricchiardi, A. Damin, S. Bordiga, C. Lamberti, G. Spano, F. Rivetti, and A. Zecchina, J. Am. Chem. Soc., 123, 11409–11419, 2001.
1.13 AB INITIO MOLECULAR DYNAMICS SIMULATIONS OF BIOLOGICALLY RELEVANT SYSTEMS
Alessandra Magistrato and Paolo Carloni
International School for Advanced Studies (SISSA/ISAS) and INFM Democritos Center, Trieste, Italy
S. Yip (ed.), Handbook of Materials Modeling, 259–274. © 2005 Springer. Printed in the Netherlands.
1.
Introduction
Ab initio (Car–Parrinello) molecular dynamics (AIMD) simulations [1] are increasingly used to investigate structural, dynamical, energetic, and electronic properties of biomolecules. In contrast to classical MD simulations, in this approach the underlying potential energy surface is calculated directly from first principles. This leads to a parameter-free molecular dynamics, in which interatomic forces are not empirically derived but are evaluated from electronic structure calculations as the simulation proceeds. In most of its implementations, Car–Parrinello AIMD relies on density functional theory (DFT) [2] as the electronic structure method. This is due to the relatively low computational cost of DFT compared with post-Hartree–Fock methods and to its wide range of applicability. The application of AIMD to biochemically relevant systems is, however, severely limited by the invariably large size of the systems under investigation. A natural way to reconcile the requirements on system size and accuracy is the use of mixed quantum/classical (AIMD/MM) simulations [3–5]. In this scheme, originally developed by Warshel [6], the chemically relevant part of the system (usually the active site) is treated at the quantum mechanical level, while the effects of the surroundings are explicitly taken into account within a mechanical force field description. This enables a realistic description of chemical reactions that occur in a complex heterogeneous environment, such as enzymatic reaction cycles in an explicit protein environment [7].
In this review, we first present the fundamental principles of the AIMD and hybrid AIMD/MM methods in their most widespread implementations, namely those using density functional theory and plane-wave basis sets. Subsequently, we illustrate the power and limitations of the techniques for the modeling of biological systems by a survey of selected applications from our own work. Because of the explicit treatment of the electronic degrees of freedom, AIMD/MM calculations allow the direct simulation of bond-breaking and bond-forming processes, such as those occurring in enzymatic reactions. Here, we report a study on the reaction catalyzed by caspase-3 [8], a current target for the cure of neurodegenerative diseases. We also show that fundamental insights can be obtained into an important class of biomimetics of iron-based enzymes [9]. We then move to the characterization of two types of transition-metal drugs, rhenium and technetium hexathioether complexes and cisplatin, along with their targeting abilities. This is a fundamental point in the design of new and highly selective drugs. Unfortunately, because of the presence of a transition metal ion, biomolecular force fields may have difficulty describing the structural and energetic properties of the drug, which depend in an intricate way on the electronic structure [10]. AIMD/MM may provide a valuable alternative, as it takes electronic properties into account. Here, we present an investigation of the structural and electronic properties of metal-based drugs targeting proteins and DNA [11, 12]. On the one hand, our approach provides insights into the reactivity of the metal-based drugs; on the other hand, it is used as a tool for docking these drugs on their targets. The article concludes with a brief perspective on some of the methodologies that may further extend the domain of applications of AIMD to biological systems.
2.
Methods
2.1.
The Basic “Idea” of Car–Parrinello Molecular Dynamics
Under the simplifying assumptions that the motion of the nuclei can be described by classical laws and that the Born–Oppenheimer approximation holds, the most intuitive way to combine electronic structure calculations with a classical molecular dynamics scheme is a straightforward coupling of the two approaches (“Born–Oppenheimer molecular dynamics”). In this approach, the total potential energy $E_{\mathrm{pot}}$ is calculated for a given nuclear configuration by solving the electronic structure problem:

$$E_{\mathrm{pot}}(\mathbf{R}_I) = E_n(\mathbf{R}_I) + E_e(\mathbf{R}_I) \tag{1}$$

where $E_n(\mathbf{R}_I)$ is the direct internuclear interaction energy and $E_e(\mathbf{R}_I)$ is the ground state energy evaluated at fixed nuclear positions $\mathbf{R}_I$. The nuclear forces are then calculated from $E_{\mathrm{pot}}$ and the nuclei are moved to new positions according to the laws of classical mechanics:

$$M_I \ddot{\mathbf{R}}_I = -\nabla_I \min_{\Psi_0} \left\{ \langle \Psi_0 | H_e | \Psi_0 \rangle \right\} \tag{2}$$

with the energy of the ground state given by

$$H_e \Psi_0 = E_0 \Psi_0 \tag{3}$$

where $H_e$ is the Hamiltonian of the electronic subsystem and $\Psi_0$ is the ground state wavefunction. A different approach, proposed in 1985 by Car and Parrinello [1, 13], treats the electronic degrees of freedom (represented by the one-electron wave functions $\psi_i$) as fictitious classical variables. The system is therefore described in terms of the following extended Lagrangian:

$$L_{\mathrm{CP}} = T_n + T_e - E_{\mathrm{pot}} \tag{4}$$
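where $L_{\mathrm{CP}}$ represents the Car–Parrinello extended Lagrangian, $T_n$ the kinetic energy of the nuclei, $T_e$ the fictitious kinetic energy of the electronic system, and $E_{\mathrm{pot}}$ the potential energy, which depends on both the nuclear positions $\mathbf{R}_I$ and the electronic variables $\psi_i$. The explicit form of this extended Lagrangian reads

$$L_{\mathrm{CP}} = \frac{1}{2}\sum_I M_I \dot{\mathbf{R}}_I^2 + \frac{1}{2}\sum_i \mu \,\langle \dot{\psi}_i | \dot{\psi}_i \rangle - E_{\mathrm{KS}}\left[\{\psi_i\}; \{\mathbf{R}_I\}\right] + \sum_{i,j} \lambda_{ij} \left( \int \psi_i^*(\mathbf{r})\, \psi_j(\mathbf{r})\, d\mathbf{r} - \delta_{ij} \right) \tag{5}$$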
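where $M_I$ represents the ionic masses and $\mu$ the fictitious mass associated with the electronic degrees of freedom; the last term represents the constraints that are added to ensure orthonormality of the one-electron wave functions. In most of the current Car–Parrinello AIMD implementations, the potential energy is given by the Kohn–Sham energy functional [2]

$$E_{\mathrm{KS}}\left[\{\psi_i\}; \{\mathbf{R}_I\}\right] = -\frac{1}{2}\sum_i \int d\mathbf{r}\, \psi_i^*(\mathbf{r})\, \nabla^2 \psi_i(\mathbf{r}) + \int d\mathbf{r}\, V_N(\mathbf{r})\, \rho(\mathbf{r}) + \frac{1}{2} \int d\mathbf{r}\, d\mathbf{r}'\, \frac{\rho(\mathbf{r})\, \rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} + E_{\mathrm{xc}}[\rho(\mathbf{r})] \tag{6}$$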
where $V_N(\mathbf{r})$ is the external potential, $E_{\mathrm{xc}}[\rho(\mathbf{r})]$ the exchange-correlation functional, and $\rho(\mathbf{r})$ the electron density. The extended Lagrangian reported in Eq. (5) determines the evolution of a fictitious classical system in which nuclear positions as well as electronic degrees of freedom are treated as dynamical variables. The Newtonian equations of motion of this system are given by the Euler–Lagrange equations

$$\frac{d}{dt} \frac{\delta L}{\delta \dot{\mathbf{R}}_I} = \frac{\delta L}{\delta \mathbf{R}_I} \tag{7}$$

$$\frac{d}{dt} \frac{\delta L}{\delta \dot{\psi}_i^*} = \frac{\delta L}{\delta \psi_i^*} \tag{8}$$
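and the corresponding Car–Parrinello equations of motion are

$$M_I \ddot{\mathbf{R}}_I(t) = -\frac{\delta}{\delta \mathbf{R}_I} \langle \Psi_0 | H_e | \Psi_0 \rangle \tag{9}$$

$$\mu \ddot{\psi}_i(t) = -\frac{\delta}{\delta \psi_i^*} \langle \Psi_0 | H_e | \Psi_0 \rangle + \sum_j \lambda_{ij} \psi_j = -H_e \psi_i + \sum_j \lambda_{ij} \psi_j \tag{10}$$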
where µ determines the rate at which the electronic degrees of freedom evolve in time. In particular, the ratio µ/M characterizes the speed at which the electronic variables are propagated relative to the nuclear positions. For $\mu \ll M$, the electronic degrees of freedom adjust instantaneously to changes in the nuclear coordinates and the resulting dynamics is adiabatic. Under this condition, $T_e \ll T_n$ and the extended Lagrangian becomes identical to the physical Lagrangian of the system ($L_{CP} \approx L$). According to Eqs. (9) and (10), while the nuclei move at a physical temperature proportional to their kinetic energy, the electronic degrees of freedom move at a “fictitious temperature”. Thus, for a finite value of µ, the electronic subsystem moves within a limited width, given by its fictitious kinetic energy, above the Born–Oppenheimer surface. Adiabaticity can therefore be ensured only if the highest frequency of the nuclear motion, $\omega_I^{max}$, is well separated from the lowest frequency associated with the fictitious motion of the electronic degrees of freedom, $\omega_e^{min}$. Since $\omega_e^{min}$ is proportional to the square root of the ratio between the electronic energy gap $E_{gap}$ (the energy difference between the lowest unoccupied and the highest occupied orbital, i.e., the HOMO–LUMO gap) and µ,

$$\omega_e^{min} \propto \sqrt{\frac{E_{gap}}{\mu}} \qquad (11)$$

the parameter µ can be chosen so as to ensure $\omega_e^{min} \gg \omega_I^{max}$. In this way, energy is not transferred from the electronic to the nuclear subsystem. In Car–Parrinello AIMD the explicitly treated electron dynamics limits the largest usable time step to only 0.1–0.2 fs. This limitation does of
course not exist in BO dynamics, where there is no explicit electron dynamics and the time step is usually one order of magnitude larger than in Car–Parrinello AIMD. Therefore, the advantage of diagonalizing the Hamiltonian only at the very first step of the dynamics must be weighed against the need for a small time step.

The Kohn–Sham one-electron orbitals $\psi_i$ are expanded in a basis set of plane waves (with wave vectors $\mathbf{G}_m$) up to a given kinetic energy cutoff $E_{cut}$:

$$\psi_i(\mathbf{r}) = \frac{1}{\sqrt{V_{cell}}} \sum_m c_{im}\, e^{i \mathbf{G}_m \cdot \mathbf{r}} \qquad (12)$$
In such a scheme, an adequate treatment of the inner core electrons would require prohibitively large basis sets. Therefore, only valence electrons are explicitly treated and the effect of the ionic core electrons is integrated out using the ab initio pseudopotential formalism. All calculations presented in Section 3 are performed with the original Car–Parrinello scheme [14], based on (gradient-corrected) [15] density functional theory in the framework of a pseudopotential approach [16] and a basis set of plane waves.
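As a rough illustration of how the basis-set size follows from the cutoff in Eq. (12), the Python sketch below counts the plane waves in a simple cubic cell and compares the count with the continuum estimate $N \approx V k_{cut}^3 / 6\pi^2$, where $k_{cut} = \sqrt{2E_{cut}}$ in atomic units. The cubic cell, the units (bohr, hartree), and the function names are assumptions made here for illustration; they are not taken from the CPMD code.

```python
import itertools
import math


def count_plane_waves(a_bohr, e_cut_hartree):
    """Count wave vectors G = (2*pi/a) * (n1, n2, n3) of a cubic cell of side a
    whose kinetic energy |G|^2 / 2 (atomic units) does not exceed E_cut."""
    g_unit = 2.0 * math.pi / a_bohr
    # Largest integer index that could possibly satisfy the cutoff, plus one.
    n_max = int(math.sqrt(2.0 * e_cut_hartree) / g_unit) + 1
    count = 0
    for n1, n2, n3 in itertools.product(range(-n_max, n_max + 1), repeat=3):
        kinetic = 0.5 * g_unit ** 2 * (n1 * n1 + n2 * n2 + n3 * n3)
        if kinetic <= e_cut_hartree:
            count += 1
    return count


def estimate_plane_waves(volume_bohr3, e_cut_hartree):
    """Continuum estimate: volume of the cutoff sphere in reciprocal space
    divided by the reciprocal-space volume per G vector, (2*pi)^3 / V."""
    k_cut = math.sqrt(2.0 * e_cut_hartree)
    return volume_bohr3 * k_cut ** 3 / (6.0 * math.pi ** 2)


if __name__ == "__main__":
    a, e_cut = 10.0, 10.0  # hypothetical cell side (bohr) and cutoff (hartree)
    print(count_plane_waves(a, e_cut), estimate_plane_waves(a ** 3, e_cut))
```

The count grows as $E_{cut}^{3/2}$, which is why the high cutoffs needed to resolve core states quickly become impractical.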
2.2. Hybrid Car–Parrinello/MM Calculations
Chemical and biochemical processes of relevance usually occur in heterogeneous condensed-phase environments consisting of thousands of atoms. A common way to model such systems is a hybrid QM/MM approach [3, 4], in which the whole system is partitioned into a localized, chemically active region, treated at the quantum mechanical level, and the remainder of the system, treated with empirical force fields. Several schemes exist in which Car–Parrinello molecular dynamics has been extended to a hybrid QM/MM framework [5, 17]. The general form of a mixed QM/MM Hamiltonian was introduced by Warshel [6]:

$$H = H_{QM} + H_{MM} + H_{QM/MM} \qquad (13)$$
where $H_{MM}$ is described by a standard biomolecular force field and comprises bonded (harmonic bonds, angles and dihedrals) and nonbonded interactions (electrostatic point charges and van der Waals interactions). The difficulty of any QM/MM implementation lies in the coupling between the QM and the MM regions, which is described by the $H_{QM/MM}$ part of the Hamiltonian. Recently, CPMD has been combined with an MM approach [5]. In this scheme, bonds between the QM and MM regions of the system are treated with specifically designed monovalent pseudopotentials [18], whereas the remaining bonded interactions are described by the classical force field. The same holds for van der Waals interactions between the QM and MM parts of the system. On the other hand, electrostatic effects of the classical environment are treated
as an additional contribution to the external field of the quantum system, and particular care is taken to avoid overpolarization of the electron cloud near the boundary region, the so-called spill-out effect. In addition, to limit the computational overhead, the electrostatic interactions between the QM system and the more distant MM atoms are included via a Hamiltonian term that explicitly couples the multipole moments of the quantum charge distribution with the classical point charges. In the scheme we have used to perform mixed QM/MM Car–Parrinello simulations, either the GROMOS96 [19] or the AMBER95 [20] force field can be used for the molecular mechanics part, in combination with a particle–particle particle–mesh (P3M) treatment of long-range electrostatic interactions [21]. This scheme provides an efficient computational tool that explicitly takes into account the entire system, including solvent effects.
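The additive structure of Eq. (13) can be made concrete with a deliberately simplified sketch of the nonbonded part of a coupling term. Here the QM atoms are represented by fixed point charges, which is an assumption for illustration only; in the scheme described above the classical charges act instead on the quantum electron density as an external field. The `Atom` class, the atomic units, and the Lorentz–Berthelot mixing rule are likewise illustrative choices and do not reproduce any specific force field implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Atom:
    position: tuple   # (x, y, z) in bohr
    charge: float     # point charge in units of e
    sigma: float      # Lennard-Jones diameter in bohr
    epsilon: float    # Lennard-Jones well depth in hartree


def pair_energy(a, b):
    """Coulomb plus Lennard-Jones energy of one atom pair (hartree)."""
    r = math.dist(a.position, b.position)
    coulomb = a.charge * b.charge / r
    # Lorentz-Berthelot mixing, chosen here only for illustration.
    sigma = 0.5 * (a.sigma + b.sigma)
    epsilon = math.sqrt(a.epsilon * b.epsilon)
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    return coulomb + lennard_jones


def coupling_energy(qm_atoms, mm_atoms):
    """Toy nonbonded QM/MM coupling: sum over all QM-MM atom pairs."""
    return sum(pair_energy(q, m) for q in qm_atoms for m in mm_atoms)


def total_energy(e_qm, e_mm, qm_atoms, mm_atoms):
    """Additive partition of Eq. (13), with E_QM and E_MM supplied by the caller."""
    return e_qm + e_mm + coupling_energy(qm_atoms, mm_atoms)
```

In this toy form the coupling is purely pairwise; the bonded terms across the QM/MM boundary and the multipole treatment of distant MM atoms described in the text are not represented.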
3. Applications

3.1. Enzymes and Biomimetic Compounds
3.1.1. Caspase-3

Caspase-3, a cysteine protease, plays a key role in cell apoptosis and is directly involved in the neurodegeneration of Alzheimer’s disease [22]. Up to now, the identification of inhibitors has been based only on combinatorial chemistry, which, however, has not completely solved efficiency and selectivity problems. Knowledge of the energetics and the structural features of the enzymatic reaction mechanism is relevant for developing transition state analog inhibitors. We have used AIMD/MM simulations to investigate the second reaction step, involving the deacylation of the peptide substrate (Chart 1) [8]. The activation free energy of the reaction is investigated using a thermodynamic integration technique. The attack of the hydrolytic water molecule implies an activation free energy of ∼20 kcal/mol and leads to a previously unrecognized gem-diol intermediate that can readily evolve to the enzyme products (Fig. 1). Analogs resembling the gem-diol transition state structures will therefore provide specific, powerful noncovalent inhibitors by capturing a fraction of the binding energy for the transition state species. The subsequent C–S bond dissociation, which requires a much lower activation free energy (∼5 kcal/mol), is concerted with a proton transfer to the side chain of the substrate Asp (Fig. 1). Such a mechanism is an alternative to the one proposed for the corresponding reaction in aqueous solution in the presence of a His residue, in which the positively charged His transfers a proton to the anionic intermediate [23].
Chart 1
In addition, it suggests that the decrease in catalytic efficiency on passing from papain to caspase-3 may be ascribed to both conformational and electrostatic properties.
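Thermodynamic integration, as used above for the activation free energy, amounts to integrating the mean free-energy gradient along a reaction coordinate ξ. The sketch below assumes that estimates of dF/dξ are already available at a set of constrained values of ξ (in practice these would come from constrained dynamics); the function names and the trapezoidal rule are illustrative choices, not a description of the authors' implementation.

```python
def free_energy_profile(xi, dF_dxi):
    """Integrate the mean free-energy gradient dF/dxi along the reaction
    coordinate with the trapezoidal rule; the profile starts at zero."""
    if len(xi) != len(dF_dxi):
        raise ValueError("xi and dF_dxi must have the same length")
    profile = [0.0]
    for k in range(1, len(xi)):
        step = 0.5 * (dF_dxi[k] + dF_dxi[k - 1]) * (xi[k] - xi[k - 1])
        profile.append(profile[-1] + step)
    return profile


def activation_free_energy(profile):
    """Barrier height: highest point of the profile relative to its start."""
    return max(profile) - profile[0]


if __name__ == "__main__":
    # Hypothetical mean gradients for a model profile F = 2*xi - 2*xi**2.
    xi = [0.0, 0.25, 0.5, 0.75, 1.0]
    gradient = [2.0, 1.0, 0.0, -1.0, -2.0]
    print(activation_free_energy(free_energy_profile(xi, gradient)))
```

The sign convention matters: if the sampled quantity is the mean constraint force rather than the gradient, its sign must be reversed before integration.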
3.1.2. Iron-based biomimetic compounds

Nonheme diiron enzymes [24–30] catalyze a variety of important and diverse chemical reactions, which may be of interest for industry [31, 32]. For instance, the diiron enzyme methane monooxygenase (MMO) is able to catalyze the oxidation of alkane, alkene and aromatic groups [33] under mild conditions and in an efficient and highly selective manner [24, 25], while the corresponding industrial process occurs under extreme conditions with low yields [31, 32]. Thus, the idea of synthesizing enzyme mimics emerges quite naturally. These mimics should retain the structural feature common to this class of diiron proteins, namely a bundle of four antiparallel α-helices in which each iron atom binds a histidine and a glutamate ligand and two bridging carboxylates bind both metal ions [33].
Figure 1. Water nucleophilic attack on the acyl-enzyme, as emerging from the QM/MM calculations: structure of the transition state (a) and of the intermediate I3 in Chart 1 (b). The H-bond pattern of His237 is shown with dotted lines. The QM part is indicated in thick lines. Labeling as in Chart 1.
Based on a retrostructural analysis of a series of diiron proteins [33–35], the synthetic biomimetic complex Due Ferro 1 (DF1, Fig. 2) has recently been synthesized and characterized by DeGrado’s group [33–35]. The complex resembles the common motif of diiron proteins. Besides fully emulating the active site and the tertiary structure of the real enzymes, DF1 also binds metal ions [34, 35] other than iron (such as zinc,
Figure 2. (a) Schematic structure of DF1 [33–35]. (b) Close-up of the bridged bimetallic putative catalytic center.
manganese), thus mimicking the corresponding dizinc- [36, 37] and dimanganese-containing [38] enzymes. AIMD and hybrid AIMD/MM simulations have elucidated the key factors governing the stability and reactivity of the active site of the two latter species. In the dizinc compound, our calculations have elucidated the crucial role of the environment (in particular, the second-shell ligands and the solvent waters) in stabilizing the hydrogen bond networks that surround the active site. Similar conclusions have been reached for other metal centers in proteins [39]. In addition, our calculations show the highly flexible nature of the carboxylate-bridged binuclear motif. The chelating carboxylate ligands are particularly mobile and, in the presence of the whole protein, they undergo a syn–anti isomerization in which the glutamate coordinates with the internal and the external oxygen lone pair, respectively. In the case of the manganese species, our simulations have shown that DF1 is not active as a mimic of manganese catalase [38], and calculations on chemically modified species have helped to shed light on the catalytic mechanism of the wild-type enzyme [40]. In conclusion, our calculations confirm that transition metal centers have a highly dynamical behaviour in which the coordination sphere undergoes continuous changes in the geometric arrangement of the ligands. In DF1, the mobility of the Glu ligands is expected to play an important role in the catalytic properties of the carboxylate-bridged binuclear motif. Therefore, our computational approach can be critically important for the tailoring of efficient and highly selective biomimetic catalysts.
3.2. Pharmaceutical Compounds
3.2.1. Metal-based radiopharmaceutical compounds

Metal complexes with radioactive nuclei find multiple applications in medicine, as they make it possible to monitor biological functions and constitute a tool for imaging tumors, organs and tissues [41, 42]. Over 90% of nuclear diagnostic medicine is carried out with technetium-99m. This is mainly due to the favorable properties of this radioisotope (⁹⁹ᵐTc has a half-life of 6 h and an emission energy of only 141 keV) and its ready availability from generators. In contrast, radioisotopes such as the β-emitting ¹⁸⁶Re and ¹⁸⁸Re are now widely used therapeutically for the in situ treatment of cancerous tissues [43, 44]. A central issue in the development of radiopharmaceuticals with improved imaging and therapeutic properties is the search for compounds with enhanced selectivity. Unfortunately, a rational design of highly selective agents is hampered by the limited knowledge of the factors determining their reactivity and biodistribution. In the human body these molecules encounter a variety of different chemical environments (such as different pH or redox potential), and their pathways and final destinations are crucially determined by their chemical transformations under these varying external conditions. A characterization of the detailed physicochemical behavior of these compounds is therefore important for developing new radiopharmaceuticals with improved features. This is the case for the crown thioether complexes of rhenium and technetium, of general formula [M(9S3)₂]²⁺ (M = Re, Tc), which are similar to the so-called “first generation” radiopharmaceutical agents [45]. These have been successful in the selective imaging of organs such as the heart, the brain, the liver, the kidneys and the bones.
In the presence of reducing agents (such as ascorbic acid, Zn, Cr or SnCl₂) under mild conditions, the Re and Tc compounds undergo instantaneous C–S bond cleavage to yield ethene and [M(9S3)L]⁺ (where L = SCH₂CH₂SCH₂CH₂S), whereas in the presence of other transition metals the reaction does not occur [46]. AIMD calculations have confirmed the hypothesis, based on experiments and semiempirical calculations [47], that the reductive bond scission in Tc and Re is caused by strong π-back-donation from donor t₂g metal orbitals into antibonding C–S σ* orbitals of the thioether ligands (Fig. 3) [11]. In addition, the calculations show that the reaction proceeds in two steps. The first step consists of the reduction of the doubly positively charged metal complex to the unipositive analogue. The additional electron lowers the activation energy barrier for the dissociation of ethene by about 10 kcal/mol [11], and this is sufficient to reduce the activation energy barrier to a level that only a
Figure 3. Contour plot of the HOMO-2 orbital at the transition state for [Re(9S3)2 ]2+ indicating the presence of π-back-donation from one of the d (t2g ) orbitals into C–S σ ∗ ligand orbitals. The contours are given at ±4.0 au.
short simulation of a few ps at room temperature is needed to observe directly the loss of the ethene molecule (Fig. 4). In conclusion, our study provides a detailed understanding of the mechanism of the reductive C–S bond cleavage in rhenium and technetium radioactive agents and contributes to a comprehensive characterization of their chemical behavior in redox-active environments.
3.2.2. Cisplatin binding to DNA

Cisplatin (cis-diamminedichloroplatinum(II)) is widely used in the clinical treatment of a variety of cancers [48]. This compound targets DNA, distorting its structure (a kink in the helical axis ranging from ∼50° to ∼80°) and thereby
Figure 4. Dissociation pathway of [Re(9S3)2 ]+ at 350 K. Snapshots of the most representative frames 1–4 are shown. (a) Conformation of the molecule after 0.1 ps simulation time. (b) Simultaneous dissociation of the two C–S bonds and release of ethene (0.2 ps). (c) Progressive removal of ethene. (d) Formation of the final product.
inhibiting the replication and transcription machinery of the cell. Upon DNA binding, this drug loses its two chlorine ligands and binds to the N7 atoms of two adjacent guanines (65% of total platinated DNA) or, to a lesser extent, to adenine N7 (25%). AIMD simulations were used to investigate the first (and rate-limiting) step of the DNA binding [49], which is believed to involve the substitution of a chlorine ligand with a water molecule (Fig. 5). The calculations provided a structural model of the transition state of the reaction, and the calculated free energy barrier compared remarkably well with the experimental data. Subsequently, we carried out QM/MM calculations in aqueous solution of the final product of the reaction, namely the complex between cisplatin and a DNA oligomer (cisplatin-d(CCTCTG*G*TCTCC)-d(GGAGACCAGAGG)) [12], for which both X-ray and NMR structures are available [50, 51].
Figure 5. (a) Cisplatin in water. H-bonds are denoted by dashed lines. (b) Initial structural model of the AIMD/MM simulations (cispt-d(CCTCTG*G*TCTCC)-d(GGAGACCAGAGG)). (c) Comparison between the initial and final AIMD/MM structures.
The platinated moiety was the QM region, while the biomolecular frame was treated with the AMBER force field [20]. During the dynamics, the structure of the platinated DNA dodecamer rearranged from the initial X-ray structure towards the structural determinants of the solution structure as obtained by NMR spectroscopy (Fig. 5) [52]. The calculated ¹⁹⁵Pt chemical shifts of the QM/MM structure relative to cisplatin in aqueous solution were in qualitative agreement with the experimental data. The [Pt(NH₃)₂]²⁺ moiety was subsequently docked onto DNA in its canonical B form. Within the relatively short time scale (∼7 ps), the DNA oligomer experienced a large kink and a rearrangement of DNA, as experimentally observed in the platinated adducts. The AIMD/MM approach described here can be used in the future to model the interaction of other platinum-based compounds with DNA oligomers and DNA nucleobases, for which a reliable force field parametrization has not yet been developed [52].
4. Concluding Remarks
Because of the large number of AIMD applications already present in the literature, it would clearly be impossible to review all the work that has appeared so far. Therefore, only very few examples are included here (for more exhaustive reviews the reader is referred to Refs. [53–55]). Before closing, we would like to mention some of the developments in the code that are extending the domain of AIMD applications to biomolecular modeling:

(i) The calculation of IR and Raman spectra [56], as well as of NMR chemical shifts [57], which may allow contact to be made with experiment.

(ii) The implementation of DFT-based methods for excited states such as ROKS [58] and time-dependent DFT (TDDFT) [59], which allows the study of photophysical processes such as the cis–trans isomerization of the retinal chromophore in rhodopsin [60].

(iii) The implementation of path integral MD (PIMD) simulations [61], which makes it possible to describe hydrogen tunneling. These quantum effects are believed to play an important role in some enzymatic reactions [62, 63].
Acknowledgments

We would like to thank the people who have contributed to this review, namely K. Spiegel, M. Sulpizi, and in particular U. Rothlisberger and M.L. Klein. In addition, we thank M. Parrinello for his continuous support.
References

[1] R. Car and M. Parrinello, Phys. Rev. Lett., 55, 2471, 1985.
[2] R.G. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford University Press, Oxford, 1989.
[3] P. Sherwood, In: Modern Methods and Algorithms of Quantum Chemistry, vol. 1, J. Grotendorst (ed.), John von Neumann Institute for Computing, Jülich, NIC Series, 1257, 2000.
[4] M. Colombo, L. Guidoni, A. Laio, A. Magistrato, P. Maurer, S. Piana, U. Roehrig, K. Spiegel, M. Sulpizi, J. VandeVondele, M. Zumstain, and U. Rothlisberger, Chimia, 56, 13, 2002.
[5] A. Laio, J. VandeVondele, and U. Rothlisberger, J. Chem. Phys., 116, 6941, 2002.
[6] A. Warshel and M. Levitt, J. Mol. Biol., 7, 718, 1976.
[7] D. Sebastiani and U. Röthlisberger, Advances in Density-Functional-Theory based Modeling Techniques – Recent Extension of the Car–Parrinello Approach. In: P. Carloni and F. Alber (eds.), Quantum Medicinal Chemistry, Chap. 1, p. 5, 2002.
[8] M. Sulpizi, A. Laio, J. VandeVondele, A. Cattaneo, U. Rothlisberger, and P. Carloni, Proteins, 52, 212–224, 2003.
[9] A. Magistrato, W.F. DeGrado, A. Laio, U. Rothlisberger, J. VandeVondele, and M.L. Klein, J. Phys. Chem. B, 107, 4182, 2003.
[10] L. Banci and P. Comba, Molecular Modeling and Dynamics of Bioinorganic Compounds, Kluwer Academic Publisher, Dordrecht, Boston, London, 1997.
[11] A. Magistrato, P. Maurer, T. Fässler, and U. Rothlisberger, J. Phys. Chem. A, 108, 2008–2013, 2004.
[12] K. Spiegel, U. Rothlisberger, and P. Carloni, J. Phys. Chem. B, 108, 2699–2707, 2004.
[13] D. Marx and J. Hutter, “Ab initio molecular dynamics: theory and implementations,” In: J. Grotendorst (ed.), Modern Methods and Algorithms in Quantum Chemistry, John von Neumann Institute for Computing, Jülich, p. 301, 2000.
[14] All calculations are performed with the code CPMD: J. Hutter, A. Alavi, T. Deutsch, P. Ballone, M. Bernasconi, P. Focher, S. Goedecker, M. Tuckerman, and M. Parrinello, CPMD, Max-Planck-Institut für Festkörperforschung, Stuttgart and IBM Research Laboratory Zürich, 1995–1999.
[15] The calculations presented in the next section have been performed using the gradient-corrected scheme developed by Becke (A.D. Becke, Phys. Rev. A, 38, 3098–3100, 1988) for the exchange and by Lee, Yang and Parr (C. Lee, W. Yang, and R.G. Parr, Phys. Rev. B, 37, 785–789, 1988), or Perdew (J.P. Perdew, Phys. Rev. B, 33, 8822, 1986), for the correlation part.
[16] In all our calculations we have employed pseudopotentials of the Martins–Troullier type (N. Troullier and J.L. Martins, Phys. Rev. B, 43, 1993, 1991).
[17] M. Eichinger, P. Tavan, J. Hutter, and M. Parrinello, J. Chem. Phys., 21, 10452, 1999.
[18] U. Rothlisberger, To be published.
[19] W.R.P. Scott, P. Hünenberger, I.G. Tironi, A.E. Mark, S.R. Billiter, A.E. Torda, T. Huber, P. Krueger, and W.F. van Gunsteren, J. Phys. Chem. A, 103, 3596, 1999.
[20] D. Pearlman, D.A. Case, J.W. Caldwell, W.S. Ross, T.E. Cheatham, S. Debolt, D. Ferguson, G. Seibel, and P. Kollman, Comput. Phys. Commun., 91, 1, 1995.
[21] P. Hünenberger, J. Chem. Phys., 113, 10464, 2000.
[22] S. Shimohama, Apoptosis, 5, 9, 2000.
[23] M. Strajbl, J. Florian, and A. Warshel, J. Phys. Chem. B, 105, 4471, 2001.
[24] (a) B.J. Wallar and J.D. Lipscomb, Chem. Rev., 96, 2625–2657, 1996. (b) A.L. Feig and S.J. Lippard, Chem. Rev., 94, 759, 1994.
[25] S.J. Lange and L. Que Jr., Curr. Opin. Chem. Biol., 2, 159, 1998.
[26] R.E. Stenkamp, Chem. Rev., 94, 715, 1994.
[27] (a) A.K. Powell, Met. Ions. Biol. Syst., 35, 515, 1998. (b) P.M. Harrison, P.D. Hempstead, P.J. Artymiuk, and S.C. Andrews, Met. Ions. Biol. Syst., 35, 435, 1998.
[28] P. Nordlund and H. Eklund, Curr. Opin. Struct. Biol., 5, 758, 1995.
[29] (a) S.C. Gallagher, A. George, and H. Dalton, Eur. J. Biochem., 254, 480, 1998. (b) M. Lee, M. Lenman, A. Banas, M. Bator, S. Singh, N. Schweizer, R. Nilsson, C. Liljenberg, A. Dahlquist, P.D. Gummeson, et al., Science, 280, 915, 1998.
[30] P. Nordlund, B.M. Sjöberg, and H. Eklund, Nature, 345, 593, 1990.
[31] A.E. Shilov and G.B. Shul’pin, Chem. Rev., 97, 2879, 1997.
[32] A.E. Shilov, Activation of Saturated Hydrocarbons by Transition Metal Complexes, D. Reidel Publishing Co., Dordrecht, The Netherlands, 1984.
[33] C.M. Summa, A. Lombardi, M. Lewis, and W.F. DeGrado, Curr. Opin. Struct. Biol., 9, 500, 1999.
[34] A. Lombardi, C.M. Summa, S. Geremia, L. Randaccio, V. Pavone, and W.F. DeGrado, Proc. Natl. Acad. Sci. USA, 97, 6298, 2000.
[35] L. Di Costanzo, H. Wade, S. Geremia, L. Randaccio, V. Pavone, W.F. DeGrado, and A. Lombardi, J. Am. Chem. Soc., 123, 12749, 2001.
[36] M.C.J. Wilce, C.S. Bond, N.E. Dixon, H.C. Freeman, J.M. Guss, P.E. Lilley, and J.A. Wilce, Proc. Natl. Acad. Sci. USA, 95, 3472, 1998.
[37] S.P. Liu, J. Widom, C.W. Kemp, C.M. Crews, and J. Clardy, Science, 282, 1324, 1998.
[38] G.C. Dismukes, Chem. Rev., 96, 2909, 1996.
[39] M. Dal Peraro, A.J. Vila, and P. Carloni, J. Biol. Inorg. Chem., 7, 704, 2002.
[40] A. Magistrato, W.F. DeGrado, and M.L. Klein, To be published.
[41] J.R. Dilworth and S.J. Parrott, Chem. Soc. Rev., 27, 43, 1998.
[42] W.A. Volkert and S. Jurisson, Technetium and Rhenium, 176, 123, 1996.
[43] S. Prakash, M.J. Went, and P.J. Blower, Nucl. Med. Biol., 23, 543, 1996.
[44] G. Schoeneich, H. Palmedo, D. Heimbach, H.J. Biersack, and S.C. Muller, Onkologie, 20, 316, 1997.
[45] www.cardiolite.com.
[46] (a) G.E.D. Mullen, M.J. Went, S. Wocadlo, A.K. Powell, and P.J. Blower, Angew. Chem. Int. Ed. Engl., 36, 1205, 1997. (b) G.E.D. Mullen, P.J. Blower, D.J. Price, A.K. Powell, M.J. Howard, and M.J. Went, Inorg. Chem., 39, 4093, 2000.
[47] G.E.D. Mullen, F.T. Fässler, M.J. Went, K. Howland, B. Stein, and P.J. Blower, J. Chem. Soc., Dalton Trans., 21, 3759, 1999.
[48] J. Reedijk, Proc. Natl. Acad. Sci. USA, 100, 3611, 2003.
[49] P. Carloni, M. Sprik, and W. Andreoni, J. Phys. Chem. B, 104, 823, 2000.
[50] P.M. Takahara, A.C. Rosenzweig, C.A. Frederick, and S.J. Lippard, Nature, 377, 649–652, 1995.
[51] A. Gelasco and S.J. Lippard, Biochemistry, 37, 9230, 1998.
[52] M.A. Elizondo-Riojas and J. Kozelka, J. Mol. Biol., 314, 1227, 2001.
[53] P. Carloni and U. Rothlisberger, In: L. Eriksson (ed.), Theoretical Biochemistry – Processes and Properties of Biological Systems, Elsevier Science, New York, 2000.
[54] W. Andreoni, A. Curioni, and T. Mordasini, IBM J. Res. Dev., 45, 397, 2001.
[55] P. Carloni, U. Rothlisberger, and M. Parrinello, Acc. Chem. Res., 35, 455, 2002.
[56] D. Sebastiani and M. Parrinello, J. Phys. Chem. A, 105, 1951, 2001.
[57] P. Silvestrelli and M. Parrinello, Phys. Rev. Lett., 82, 3308, 1999.
[58] I. Frank, J. Hutter, D. Marx, and M. Parrinello, J. Chem. Phys., 108, 4060, 1998.
[59] (a) E.K.U. Gross, F.J. Dobson, and M. Petersilka, Density Functional Theory, Springer, Berlin, 1996. (b) M.E. Casida, In: D.P. Chong (ed.), Recent Advances in Density Functional Methods, World Scientific, Singapore, 1995.
[60] U. Rohrig, L. Guidoni, A. Laio, J. VandeVondele, and U. Rothlisberger, To be published.
[61] S. Raugei and M.L. Klein, J. Am. Chem. Soc., 125, 8992, 2003.
[62] D.B. Northrop, Acc. Chem. Res., 34, 790, 2001.
[63] K.M. Doll, B.R. Bender, and R.G. Finke, J. Am. Chem. Soc., 125, 10877, 2003.
1.14 TIGHT-BINDING TOTAL ENERGY METHODS FOR MAGNETIC MATERIALS AND MULTI-ELEMENT SYSTEMS

Michael J. Mehl and D.A. Papaconstantopoulos
Center for Computational Materials Science, Naval Research Laboratory, Washington, DC, USA
S. Yip (ed.), Handbook of Materials Modeling, 275–305. © 2005 Springer. Printed in the Netherlands.

The classic paper of Slater and Koster [1] described a method for modifying a linear combination of atomic orbitals (LCAO) for use in an interpolation scheme to determine energy bands over the entire Brillouin zone while only fitting to the results of first-principles calculations at high symmetry points in the zone. This tight-binding (TB) method was shown to be extremely useful for the study of the band structure of solids with little computational cost. Harrison [2, 3] developed a “universal” set of parameters which are used both to obtain a basic understanding of band structures and for making approximate calculations. Papaconstantopoulos [4] computed the Slater–Koster parameters for most elements by fitting to results obtained from the first-principles augmented plane wave (APW) method. Numerous other applications of this method have appeared in the literature [5, 6].

As computational methods developed, it was realized [7–11] that tight-binding methods, properly applied, could be used as a scheme for determining structural energies as well as electronic structure. Since these methods use a minimal basis set for each atom, they are much faster than first-principles methods for similar size systems, and therefore useful for quickly studying systems containing several hundred atoms, e.g., in molecular dynamics simulations [12]. One example of the method is the two-center, nonorthogonal NRL-TB method [9, 10], which uses environment-dependent on-site parameters and bond-length dependent hopping parameters to go beyond interpolating between fitted structures to the determination of elastic constants, phonon spectra, and defect structures. A similar approach is used by the Ames group [11, 13, 14] who approximate the three-center integrals by modifying the
two-center hopping integrals according to the local environment. Cohen, Stixrude, and Wasserman [15] have modified the description of the on-site parameters, Eqs. (4)–(6), to include crystal-field like corrections, extending the work of Mercer and Chou [16] to include d orbitals. We have previously summarized much of this work [5, 6].

In this article we focus on extensions of the TB method beyond the original elemental systems. Specifically, we show how the method can be extended to spin-polarized systems, including noncollinear spins, using the atomic moment approximation (AMA) [17]. We also describe the development of parameters for binary and ternary compounds. As we will see, although the determination of the TB parameters is tedious, the resulting method is computationally efficient, capable of performing static and dynamic calculations beyond the limits of first-principles methods. The method has been applied to all of the magnetic elements, and many nonmagnetic compounds. The accuracy of electronic, elastic, and phonon properties is comparable to that of the original, nonmagnetic single element calculations.

In the discussion of our work below, our TB calculations are fitted to first-principles results obtained from the linearized augmented plane wave (LAPW) method [18], including full potential and total energy capabilities [19, 20]. Calculations used the Kohn–Sham independent electron formulation of Density Functional Theory [21, 22] with various local density approximations (LDA) [23] or the Perdew–Wang 1991 generalized gradient approximation (GGA) [24]. Other tight-binding methods use similar first-principles techniques, as described in the references.

This work is divided into two major parts. Section 1 describes work on spin-polarized systems, including noncollinear spins, while Section 2 shows how TB methods can be adapted to compounds. Finally, in Section 3 we briefly discuss the future of TB total energy methods.
1. Magnetic Systems
Since spin-polarized density functional calculations produce eigenvalues for both the majority and minority spin channels, it is rather easy to set up a Slater–Koster tight-binding parametrization for each channel. These parameter sets are bound together by the requirement that they reproduce the first-principles eigenvalues for each spin as well as the total energy. Accordingly, we modify the original nonpolarized TB procedure [9, 10] as follows. The total energy of the TB system is given by the sum over occupied states of the shifted spin-polarized eigenvalues:

$$E = \sum_i f(\varepsilon'_{i\uparrow} - \mu')\,\varepsilon'_{i\uparrow} + \sum_i f(\varepsilon'_{i\downarrow} - \mu')\,\varepsilon'_{i\downarrow}, \qquad (1)$$
where $f(\varepsilon)$ is a smoothing function, usually the Fermi function [25], $\mu'$ is the shifted Fermi level, which gives the correct number of occupied bands, and the arrows indicate the collinear spin-polarization of the electronic states. The eigenvalues $\varepsilon'$ are uniformly shifted from the eigenvalues $\varepsilon$ found from the density functional calculations:

$$\varepsilon'_{i\uparrow} = \varepsilon_{i\uparrow} + \varepsilon_s, \quad \text{and} \quad \varepsilon'_{i\downarrow} = \varepsilon_{i\downarrow} + \varepsilon_s. \qquad (2)$$
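The shift, $\varepsilon_s$, is defined so that the total energy E in (1) is equal to the total energy from the DFT calculation:

$$\varepsilon_s = \left\{ E - \left[ \sum_i f(\varepsilon_{i\uparrow} - \mu)\,\varepsilon_{i\uparrow} + \sum_i f(\varepsilon_{i\downarrow} - \mu)\,\varepsilon_{i\downarrow} \right] \right\} \Big/ N_e, \qquad (3)$$

where $N_e$ is the number of electrons in the system and $\mu$, related to the shifted level by $\mu' = \mu + \varepsilon_s$, is the Fermi level for the original DFT calculation. In our approach to spin-polarized TB [26] we assign all of the difference between the majority and minority bands to the on-site terms. Thus, with each atom i we associate both a majority and a minority “density” of nearby atoms:

$$\rho_{i(\uparrow,\downarrow)} = \sum_j \exp(-\lambda^2_{\uparrow,\downarrow} R_{ij})\, F(R_{ij}), \qquad (4)$$

where

$$F(R) = \theta(R_c - R) / \{1 + \exp[(R - R_c)/L + 5]\} \qquad (5)$$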
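is a screening function designed to smoothly take the densities (4) to zero at distances greater than $R_c$. Typically, we take $R_c$ between 10 and 16 a.u., and L between 0.25 and 0.5 a.u. Once we have the density in the neighborhood of each atom, we assign the spin-dependent on-site parameters for states with angular momentum $\ell$ = s, p, and d by

$$h_{i\ell(\uparrow,\downarrow)} = \alpha_{\ell(\uparrow,\downarrow)} + \beta_{\ell(\uparrow,\downarrow)}\,\rho_{i(\uparrow,\downarrow)}^{2/3} + \gamma_{\ell(\uparrow,\downarrow)}\,\rho_{i(\uparrow,\downarrow)}^{4/3} + \delta_{\ell(\uparrow,\downarrow)}\,\rho_{i(\uparrow,\downarrow)}^{2}. \qquad (6)$$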
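Equations (4)–(6) translate almost directly into code. The following Python sketch evaluates the screening function, the neighbor “density”, and one on-site term for a single spin channel; calling it once with the majority and once with the minority parameters gives the two spin-dependent on-site energies. The function names, the argument layout, and the numerical values in the example are assumptions for illustration and are not fitted NRL-TB parameters.

```python
import math


def cutoff(r, r_c, width):
    """Screening function F(R) of Eq. (5); zero at and beyond R_c."""
    if r >= r_c:
        return 0.0
    return 1.0 / (1.0 + math.exp((r - r_c) / width + 5.0))


def local_density(distances, lam, r_c, width):
    """Eq. (4): sum over neighbor distances R_ij of exp(-lam^2 R_ij) F(R_ij)."""
    return sum(math.exp(-lam ** 2 * r) * cutoff(r, r_c, width) for r in distances)


def onsite(rho, alpha, beta, gamma, delta):
    """Eq. (6): on-site term for one angular momentum and one spin channel."""
    return (alpha
            + beta * rho ** (2.0 / 3.0)
            + gamma * rho ** (4.0 / 3.0)
            + delta * rho ** 2)


if __name__ == "__main__":
    # Hypothetical example: eight equidistant neighbors at 4.7 a.u.
    rho_up = local_density([4.7] * 8, lam=1.0, r_c=16.0, width=0.5)
    print(onsite(rho_up, alpha=0.1, beta=1.0, gamma=-0.5, delta=0.2))
```

For the paramagnetic case described in the text, one would average the majority and minority on-site parameters before calling `onsite`.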
We will frequently find it useful to determine the energy of a paramagnetic system using these TB parameters. In the paramagnetic system the on-site parameters are the average of the majority and minority spin parameters in (6). The hopping and on-site terms have the same form here as in our unpolarized TB calculations, and are taken to be independent of the spin associated with each TB orbital. Thus the Slater–Koster hopping parameters between atoms separated by a distance R are given by

$$H_\mu = [A_\mu + B_\mu R + C_\mu R^2] \exp(-D_\mu^2 R)\, F(R), \qquad (7)$$
where µ = (ssσ, spσ, ppσ, ppπ, sdσ, pdσ, pdπ, ddσ, ddπ, ddδ) are the Slater–Koster parameters. We usually assume the TB basis to be nonorthogonal, requiring us to define a set of overlap parameters $S_\mu$ to complement (7). In the spin-polarized calculations we have done so far we have given S the same functional form as H, only noting here that this is not required for a successful theory [9, 10]. For an sp³d⁵ basis, the procedure above gives 106 independent parameters. For iron [26] we fit these parameters to reproduce a database of eigenvalues and total energies for paramagnetic bcc Fe, ferromagnetic bcc Fe, and ferromagnetic fcc Fe, using the GGA [24] to obtain the correct ferromagnetic body-centered cubic (bcc) ground state. The structural energies as a function of volume are shown in Fig. 1, where we compare our results to first-principles calculations. We note that the output paramagnetic fcc total energy closely tracks the paramagnetic fcc energy from LAPW calculations. It should be noted that the TB parametrization cannot reproduce the low-spin/high-spin discontinuity found in ferromagnetic fcc iron [27]. This is not usually a problem in Fe, especially when we consider that the paramagnetic fcc TB total energy is very close to the low-spin fcc LAPW total energy.
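The hopping form of Eq. (7) and the quoted total of 106 parameters can be checked with a short Python sketch. The parameter bookkeeping below is one accounting that reproduces the number given in the text, assuming one λ per spin, four on-site coefficients per angular momentum and spin, and four coefficients for each hopping and each overlap function; the function and variable names are illustrative.

```python
import math

# The ten two-center Slater-Koster interaction types for an s, p, d basis.
SK_TYPES = ("ss_sigma", "sp_sigma", "pp_sigma", "pp_pi", "sd_sigma",
            "pd_sigma", "pd_pi", "dd_sigma", "dd_pi", "dd_delta")


def cutoff(r, r_c, width):
    """Screening function F(R) of Eq. (5); zero at and beyond R_c."""
    if r >= r_c:
        return 0.0
    return 1.0 / (1.0 + math.exp((r - r_c) / width + 5.0))


def hopping(r, a, b, c, d, r_c, width):
    """Eq. (7): polynomial times exponential decay times screening function."""
    return (a + b * r + c * r * r) * math.exp(-d * d * r) * cutoff(r, r_c, width)


def count_parameters():
    """Assumed bookkeeping for a spin-polarized sp3d5 parametrization."""
    onsite_per_spin = 1 + 3 * 4          # lambda plus (alpha, beta, gamma, delta) for s, p, d
    hopping_terms = len(SK_TYPES) * 4    # (A, B, C, D) for each interaction type
    overlap_terms = len(SK_TYPES) * 4    # same functional form as the hopping terms
    return 2 * onsite_per_spin + hopping_terms + overlap_terms


if __name__ == "__main__":
    print(count_parameters())
```

Because the overlap functions share the functional form of Eq. (7), the same `hopping` routine could be reused for them with a separate coefficient set.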
[Figure 1: plot of energy per atom (Ry) versus volume per atom (a.u.) for bcc and fcc Fe in the ferromagnetic (FM) and paramagnetic (PM) states, from LAPW and TB calculations.]

Figure 1. Comparison of first-principles and tight-binding calculations for Fe, using a spin-polarized tight-binding parametrization [26]. Squares represent bcc phases, diamonds fcc phases. Solid symbols denote ferromagnetic phases, open symbols unpolarized phases. Red lines are LAPW calculations, blue lines tight-binding. The low-spin/high-spin discontinuity in the LAPW ferromagnetic phase is not reproduced by the tight-binding parametrization.
TB methods for magnetic materials and multi-element systems
The TB method also lets us examine the total polarization in a system, as the difference in occupation number between the majority and minority spin sites,

$$m = \sum_i [f(\varepsilon_{i\uparrow} - \mu) - f(\varepsilon_{i\downarrow} - \mu)]. \qquad (8)$$
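Equation (8) can be evaluated directly from the two sets of eigenvalues. The sketch below is illustrative only: it assumes a Fermi function with a small smearing `kT` (a hypothetical parameter, in the same energy units as the eigenvalues) and omits the k-point weights that a Brillouin-zone sum would carry.

```python
import math

def fermi(x, kT):
    """Fermi function 1/(exp(x/kT) + 1), written with tanh to avoid overflow."""
    return 0.5 * (1.0 - math.tanh(0.5 * x / kT))

def magnetic_moment(eps_up, eps_dn, mu, kT=0.005):
    """Eq. (8): majority occupation minus minority occupation."""
    n_up = sum(fermi(e - mu, kT) for e in eps_up)
    n_dn = sum(fermi(e - mu, kT) for e in eps_dn)
    return n_up - n_dn
```

For example, with one minority level pushed above the chemical potential, `magnetic_moment([-0.3, -0.1], [-0.3, 0.1], mu=0.0)` is very close to one spin.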
Figure 2 shows the magnetic moment for fcc and bcc Fe as a function of volume. Note that the first-principles high/low spin transition in fcc iron occurs at approximately the same volume at which the paramagnetic TB total energy becomes lower than the ferromagnetic TB energy for the fcc lattice.

We have extended our tight-binding calculations for magnetic systems to cobalt and nickel (as well as chromium, which will be discussed below). Both elements are substantially easier to fit than iron, since there is no high/low spin transition in any state. Our fitting database included first-principles LAPW total energy calculations for the fcc, bcc, and simple cubic structures, using the Hedin–Lundqvist LDA [23]. The resulting TB parameters correctly predict the ferromagnetic hcp lattice as the ground state of Co, even though we did not include this state in the fit. Table 1 shows our calculated elastic constants [28, 29] for the three ferromagnetic elements as well as Cr. We list the TB results at both the equilibrium and experimental volumes [30]. At the
[Figure 2: plot of magnetic moment per atom (spins) versus volume per atom (a.u.) for ferromagnetic bcc and fcc Fe, from LAPW and TB calculations.]

Figure 2. Comparison of first-principles and tight-binding calculations for the magnetic moment of Fe, using the spin-polarized tight-binding parametrization of Ref. [26]. The notation is the same as in Fig. 1.
Table 1. Elastic constants for the magnetic elements computed from the spin-polarized tight-binding parameters and compared to experiment [30]. Calculations for Fe, Co, and Ni were done with ferromagnetic spin orientations. The first “TB” column is at the tight-binding equilibrium volume, while the second is at the experimental equilibrium. For Co we use the tight-binding minimum energy value for c/a at the experimental volume. As explained in the text, we model the spin-density wave in chromium by a CsCl type unit cell, where one of the Cr atoms has spin “up” and the other spin “down”. All elastic constants are in GPa.

                    Cr                     Fe                     Co                     Ni
            TB     TB     Exp.     TB     TB     Exp.     TB     TB     Exp.     TB     TB     Exp.
a (a.u.)    5.280  5.451  5.451    5.373  5.416  5.416    4.797  4.786  4.743    6.483  6.652  6.652
c (a.u.)                                                  7.591  7.557  7.693
B           278    164    162      180    158    173      223    247    186      264    175    185
C11         599    407    350      250    223    237      348    359    287      358    251    249
C12         117    42     68       145    125    141      180    189    158      217    137    153
C13                                                       160    168    116
C33                                                       322    336    322
C44         142    105    101      142    132    116      78     80     66       75     69     96

Table 2. Phonon frequencies at selected high-symmetry points for ferromagnetic bcc iron and fcc nickel, computed from NRL tight-binding parameters and compared to experiment. Symmetry labels follow the notation of Miller and Love [33]. The column labeled “P” indicates the polarization of the mode, either longitudinal (L) or transverse (T), if it is defined. The column “D” indicates the degeneracy of the mode. All frequencies are in inverse centimeters.

Fe
Sym.   P    D    TB     Exp. [31]
H           3    289    286
P           3    262    240
N3     L    1    308    357
N2     T    1    221    215
N4     T    1    148    149

Ni
Sym.   P    D    TB     Exp. [32]
X3     L    1    273    285
X5     T    2    180    209
L2     L    1    265    296
L3     T    2    130    141
W2          1    170    207
W5          2    198    250
experimental volume we find that the elastic constants are in good agreement with experiment, and are at the same level of accuracy as first-principles DFT calculations.

Using our TB parameters we have determined phonon frequencies at high-symmetry locations in the Brillouin zone, using the frozen-phonon method. Table 2 shows phonon frequencies for iron and nickel, compared to experiment [31, 32]. The symmetry notation used here follows that of Miller and Love [33]. We see that the agreement here is comparable to similar calculations for nonmagnetic transition metals [9].

Barreteau et al. [34] have developed a method for the study of magnetism in transition metals by starting with an approach similar to ours for the nonmagnetic part of the interaction [35], and modeling the magnetic interactions
by a multiband Hubbard model treated in the Hartree–Fock approximation. The method has been applied to Rh and Pd clusters and slabs [36]. Recently, Barreteau et al. [37] analyzed the main effects due to the renormalization of the hopping integrals by the intersite Coulomb interactions. They find that these effects are strongly dependent on the relative values of the intersite electron–electron interaction and on the shape of the electronic density of states. The predicted electronic structures for bcc iron, hcp cobalt, and fcc nickel are in excellent agreement with first-principles calculations. Xie and Blackman [38] begin with a similar, though orthogonal, form for the nonmagnetic part of the TB calculation, and add parametrized terms for charge self-consistency and spin polarization. They use their method to study the magnetic properties of iron clusters embedded in cobalt. Finally, we note that one could apply the semiempirical approach of Krasko [39], using a Stoner model to add a magnetization energy to, in our case, a nonmagnetic TB parametrization. This approach has the advantage that a single set of parameters serves for both the magnetic and nonmagnetic cases, but it has not been applied to materials other than iron.

We have calculated vacancy formation energies by a supercell method [10, 25]. One atom in the supercell is removed and neighboring atoms are allowed to relax around this vacancy while preserving the symmetry of the lattice. The great advantage of the NRL-TB method over first-principles approaches is that we can do the calculation in a very large supercell, in a computationally efficient manner, including relaxation with the TBMD code [12]. We found that a supercell containing 216 atoms was sufficient to eliminate vacancy-vacancy interactions in ferromagnetic iron and nickel. For iron, we found an unrelaxed vacancy formation energy of 2.62 eV, and a relaxed formation energy of 2.33 eV. For nickel we found 1.87 and 1.60 eV for the unrelaxed and relaxed formation energies, respectively. The relaxed vacancy formation energies are in very good agreement with the experimental values of 2.0 eV for iron and 1.6 eV for nickel.
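The text does not spell out the formula used for the vacancy formation energy. A common supercell definition, which we assume here purely for illustration, is E_vac = E(N−1 atoms, one vacancy) − [(N−1)/N] E(N atoms, perfect crystal). The sketch below implements that assumed definition; the function names are hypothetical. The relaxation energy is simply the difference between the unrelaxed and relaxed values, for example 2.62 − 2.33 = 0.29 eV for iron and 1.87 − 1.60 = 0.27 eV for nickel using the numbers quoted above.

```python
def vacancy_formation_energy(e_defect, e_perfect, n_atoms):
    """Assumed definition: E(N-1, vacancy) - (N-1)/N * E(N, perfect)."""
    return e_defect - (n_atoms - 1) / n_atoms * e_perfect

def relaxation_energy(e_unrelaxed, e_relaxed):
    """Energy gained when the neighbors of the vacancy are allowed to relax."""
    return e_unrelaxed - e_relaxed
```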
1.1. Noncollinear Magnetism
The theory described above assumes, as in most versions of spin-dependent density functional theory, that the electronic spin points in a global “up” or “down” direction, excluding the possibility that electrons on different atoms might be aligned in different directions. This is a difficult problem in density functional theory. A simplified approach valid within the AMA was made by Pickett [17]. We have adapted it [40] to our TB procedure (1)–(7) as follows: For each atom, define the paramagnetic part of each on-site term as

$$t_{i\ell} = (h_{i\ell\uparrow} + h_{i\ell\downarrow})/2, \qquad (9)$$

where the $h_{i\ell(\uparrow,\downarrow)}$ are defined in (6). Define the exchange splitting introduced by the polarization by

$$\Delta_{i\ell} = (h_{i\ell\uparrow} - h_{i\ell\downarrow})/2. \qquad (10)$$
Note that both (9) and (10) define diagonal elements in the Slater–Koster Hamiltonian. To introduce noncollinear spin polarization, we give each atom a spin direction $\hat{d}_i$, where $|\hat{d}_i| = 1$. We then construct the nonorthogonal Slater–Koster Hamiltonian by coupling the majority and minority spin channels together. The hopping and overlap terms between majority and minority orbitals are assumed to be identical to the terms between orbitals of the same spin and have the form (7). The on-site terms, however, are mixed according to the rule

$$h_{i\ell s,\,j\ell' s'} = t_{i\ell}\,\delta_{i,j}\,\delta_{\ell,\ell'} - \tfrac{1}{2}\,\Delta_{i\ell}\,\delta_{i,j}\,\delta_{\ell,\ell'}\,\hat{d}_i \cdot \boldsymbol{\sigma}_{ss'}, \qquad (11)$$
where the $s$ and $s'$ components indicate the spin index (↑ or ↓), and $\boldsymbol{\sigma}_{ss'}$ is the vector form of the Pauli spin matrices for spins $s$ and $s'$.

The simplest application of noncollinear magnetization is an antiferromagnet, where the $\hat{d}_i$ are along the Cartesian directions $\hat{z}$ and $-\hat{z}$. This is a common model for chromium [41], which has a nominally bcc structure modulated by an incommensurate spin-density wave with vector q = (2π/a)(0, 0, 0.952). If we model this vector by (2π/a)(0, 0, 1), which is the ground state of all first-principles calculations using current density functionals [42], then the wave is commensurate and we can model it as an antiferromagnetic CsCl-like unit cell with atoms on the cesium sites having spins pointing in the $\hat{z}$ direction and atoms on the chlorine sites pointing along the opposite direction. We computed the total energy for this state by using our spin-polarized tight-binding parameters for Cr, and Eqs. (9)–(11), alternating the “up” and “down” spins in a CsCl structure, to yield the results shown in Fig. 3. We see that the antiferromagnetic phase has lower energy than the ferromagnetic phase for all volumes, in agreement with experimental data.

Manganese is another element with an antiferromagnetic ground state. We have previously shown [43] that paramagnetic TB parameters correctly predict the ground state α-Mn structure, but we did not consider the effects of magnetic interactions. Using a spin-polarized set of TB parameters, fitted to the fcc, bcc, and simple cubic structures, we computed the total energy of α-Mn for all possible spin configurations which preserve the symmetry of the crystal. As shown in Fig. 4, we found that a configuration with 13 “up” atoms and 16 “down” atoms gives the lowest energy. Given the constraints of the 29-atom unit cell we cannot get a perfect antiferromagnet. This will require (at least) doubling the unit cell.
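To make the spin mixing of Eqs. (9)–(11) concrete, the sketch below builds the 2×2 spin block of a single on-site term for an arbitrary spin direction. It is an illustration under stated assumptions, not the authors' implementation: the paramagnetic term is taken to multiply the identity in spin space, the sign and factor of the second term are copied literally from (11), and the function name and numerical values are hypothetical.

```python
import numpy as np

# Pauli matrices (sigma_x, sigma_y, sigma_z)
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def onsite_spin_block(h_up, h_dn, direction):
    """2x2 spin block of one on-site term for an atom with spin along `direction`."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)                  # enforce |d| = 1
    t = 0.5 * (h_up + h_dn)                    # Eq. (9)
    delta = 0.5 * (h_up - h_dn)                # Eq. (10)
    d_dot_sigma = np.einsum("k,kab->ab", d, SIGMA)
    return t * np.eye(2) - 0.5 * delta * d_dot_sigma   # Eq. (11), as written

# Antiferromagnetic pair: the same parameters with opposite directions
block_a = onsite_spin_block(-0.1, 0.1, [0, 0, 1])
block_b = onsite_spin_block(-0.1, 0.1, [0, 0, -1])
```

Rotating the direction changes the off-diagonal elements of the block but not its eigenvalues, which is why a collinear antiferromagnet can be described by flipping $\hat{d}_i$ on alternate sites.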
An alternative method for determining magnetization within a parametrized TB framework was developed by Mukherjee and Cohen [44]. In this method, the net magnetic moment (8) is considered to be a parameter, and is solved for
[Figure 3: plot of energy per atom (Ry) versus volume per atom (a.u.) for ferromagnetic (FM) and antiferromagnetic (AFM) chromium.]

Figure 3. Tight-binding total energy calculations for bcc chromium, using spin-polarized parameters. The ferromagnetic (FM) calculations were done in the bcc unit cell. The antiferromagnetic (AFM) calculations were performed using two atoms in a simple cubic unit cell, with one spin pointing “up”, and the other “down” [40].
self-consistently. This allows ferromagnetic and paramagnetic systems to be computed from the same set of parameters. The method has been successfully applied to high-pressure hcp iron [45], which has a rather unusual magnetic structure [46]. Zhuang and Halley [47] use a charge self-consistent TB method to describe the noncollinear magnetic spin structures of MnF2 and MnO2.
2. Compounds
Extension of the method to compounds requires several modifications [48]. As always, we begin by shifting the eigenvalues so that their sum is the total energy,

$$E[n(\mathbf{r})] = \sum_n f(\varepsilon_n - \mu)\,\varepsilon_n. \qquad (12)$$
There are three types of parameters in the fit: the on-site terms, which depend on the local environment and represent the energy required to put an electron in a specific atomic shell, the hopping parameters, which represent the energy required for the electron to move between atoms, and overlap parameters, detailing the
[Figure 4: plot of energy per unit cell (Ry) versus volume per unit cell (a.u.) for paramagnetic (PM), ferromagnetic (FM), and nearly antiferromagnetic (AFM) α-Mn.]

Figure 4. Tight-binding total energy calculations for α-manganese, using spin-polarized and unpolarized parameters and the noncollinear tight-binding method [40]. The paramagnetic (PM) calculations used the average of the spin-up and spin-down parameters. The ferromagnetic (FM) calculations used the spin-polarized parameters with all the atomic spins aligned. For nearly anti-ferromagnetic (AFM) calculations, atoms at the (2a) and one set of (24g) Wyckoff positions were aligned in the “up” direction, and atoms on the (8c) and second (24g) sites were aligned “down.” This yields the lowest possible total spin for the primitive 29-atom α-Mn unit cell.
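nonorthogonality of the TB orbitals. In all three cases we must now determine pairwise interactions between atoms of the same type as well as those between atoms of different species. The environmental dependence of the on-site parameters is controlled by a set of atomic-like densities,

$$\rho(i,\tilde{\jmath}) = \sum_{j \in \tilde{\jmath}} \exp[-\lambda^2_{\tilde{\imath}\tilde{\jmath}}\,|\mathbf{R}_i - \mathbf{R}_j|]\,F(|\mathbf{R}_i - \mathbf{R}_j|), \qquad (13)$$

where the $i$th atom is of type $\tilde{\imath}$, the $j$th atom is of type $\tilde{\jmath}$, $\rho(i,\tilde{\jmath})$ is the density on atom $i$ due to atoms of type $\tilde{\jmath}$, $\lambda_{\tilde{\imath}\tilde{\jmath}}$ is a fitting constant to be determined, and $F$ is defined in (4). The on-site terms themselves are polynomial functions in $\rho^{2/3}$:

$$h_\ell(i) = a_\ell(\tilde{\imath}) + \sum_{\tilde{\jmath}} [b_\ell(\tilde{\imath},\tilde{\jmath})\,\rho(i,\tilde{\jmath})^{2/3} + c_\ell(\tilde{\imath},\tilde{\jmath})\,\rho(i,\tilde{\jmath})^{4/3} + d_\ell(\tilde{\imath},\tilde{\jmath})\,\rho(i,\tilde{\jmath})^{2}], \qquad (14)$$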
where the sum is over all atom types in the system. Each atom type interacts with the target atom differently. The method used here was adopted for the sake of expediency, and is not the ideal form. However, it is a very useful form, as we shall see. In general, we use angular momenta $\ell$ = s, p, d. However, in systems with essentially cubic symmetry it is sometimes convenient to split the d on-site terms into $t_{2g}$ and $e_g$ components. We took this approach for the parametrization of FeAl [48], but not for Cu–Au [49]. The two-center Slater–Koster hopping integrals are determined using an exponentially damped polynomial, and depend only on the atomic species and the distance between the atoms:

$$H_\mu(i,j;R) = [A_\mu(\tilde{\imath},\tilde{\jmath}) + B_\mu(\tilde{\imath},\tilde{\jmath})\,R + C_\mu(\tilde{\imath},\tilde{\jmath})\,R^2]\,\exp[-D^2_\mu(\tilde{\imath},\tilde{\jmath})\,R]\,F(R). \qquad (15)$$
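The species-resolved densities of Eq. (13) and the on-site polynomial of Eq. (14) can be sketched as follows for a finite cluster of atoms. This is a minimal illustration, not the authors' code: the cutoff form is an assumption (a Fermi-like function that vanishes beyond $R_c$), the self term j = i is excluded by assumption, periodic images are ignored, and the dictionary layout of the parameters (`lam`, `a`, `bcd`) is hypothetical.

```python
import math
from collections import defaultdict

def cutoff(R, Rc=12.0, L=0.5):
    """Assumed screening function F(R), zero beyond Rc."""
    if R >= Rc:
        return 0.0
    return 1.0 / (1.0 + math.exp((R - Rc) / L + 5.0))

def species_densities(i, positions, species, lam):
    """Eq. (13): density on atom i from each species J; lam[(I, J)] is lambda_IJ."""
    rho = defaultdict(float)
    I = species[i]
    for j, (Rj, J) in enumerate(zip(positions, species)):
        if j == i:
            continue  # assumption: an atom does not contribute to its own density
        R = math.dist(positions[i], Rj)
        rho[J] += math.exp(-lam[(I, J)] ** 2 * R) * cutoff(R)
    return dict(rho)

def onsite_term(i, positions, species, lam, a, bcd):
    """Eq. (14) for one angular momentum: a[I] plus a polynomial in rho^(2/3) per species."""
    I = species[i]
    h = a[I]
    for J, rho in species_densities(i, positions, species, lam).items():
        b, c, d = bcd[(I, J)]
        h += b * rho ** (2.0 / 3.0) + c * rho ** (4.0 / 3.0) + d * rho ** 2
    return h
```

The hopping integrals of Eq. (15) follow the same pattern, with the coefficients looked up by the pair of species instead of by a single element.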
The $A$, $B$, $C$, and $D$ parameters are to be fit. For like-atom ($\tilde{\jmath} = \tilde{\imath}$) interactions, there are 10 independent Slater–Koster parameters: ssσ, spσ, ppσ, ppπ, sdσ, pdσ, pdπ, ddσ, ddπ, and ddδ. When the atoms are of different types, we must include an additional four parameters, psσ, dsσ, dpσ, and dpπ. Note that we do not distinguish between $t_{2g}$ and $e_g$ orbitals when computing the hopping integrals. Since we are using a nonorthogonal basis set, we must also parametrize the overlap integrals. These have a form similar to the hopping integrals:

$$S_\mu(i,j;R) = [O_\mu(\tilde{\imath},\tilde{\jmath}) + P_\mu(\tilde{\imath},\tilde{\jmath})\,R + Q_\mu(\tilde{\imath},\tilde{\jmath})\,R^2]\,\exp[-T^2_\mu(\tilde{\imath},\tilde{\jmath})\,R]\,F(R), \qquad (16)$$
where $O$, $P$, $Q$, and $T$ also represent parameters to be fit. Again, we do not distinguish between $t_{2g}$ and $e_g$ orbitals. For a two-component system with s, p, d orbitals, including $t_{2g}$ and $e_g$ on-site terms, there are 330 parameters (λs, a, b, c, d, A, B, etc.) which are used in the fit, in contrast to 97 for a single-element parametrization [10]. These parameters are chosen so as to reproduce the eigenvalues ε and energies E in Eq. (12). While the number of parameters may seem rather large, one must realize that we are using these parameters as a mathematical transformation from the DFT to the TB formalism. With this in mind, the number of parameters seems quite reasonable.
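Because the basis is nonorthogonal, the eigenvalues entering Eq. (12) come from the generalized problem H c = ε S c rather than an ordinary diagonalization. The sketch below shows one standard way to solve it, by Cholesky factorization of the overlap matrix; it is an illustration with hypothetical numbers for a two-orbital dimer, not a description of the NRL-TB code.

```python
import numpy as np

def solve_nonorthogonal(H, S):
    """Solve H c = eps S c for Hermitian H and positive-definite S."""
    L = np.linalg.cholesky(S)              # S = L L^dagger
    L_inv = np.linalg.inv(L)
    H_ortho = L_inv @ H @ L_inv.conj().T   # ordinary Hermitian problem
    eps, y = np.linalg.eigh(H_ortho)
    coeffs = L_inv.conj().T @ y            # back-transform to the original basis
    return eps, coeffs

# Hypothetical dimer: on-site energy e0, hopping h, overlap s
e0, h, s = -0.5, -0.2, 0.1
H = np.array([[e0, h], [h, e0]])
S = np.array([[1.0, s], [s, 1.0]])
eps, coeffs = solve_nonorthogonal(H, S)
```

For this dimer the two levels are (e0 + h)/(1 + s) and (e0 − h)/(1 − s), so the overlap shifts the bonding and antibonding states asymmetrically, which an orthogonal model cannot do.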
2.1. Copper–Gold
A good test case for the method is the Cu–Au system. Experimentally it is known that ordered phases exist up to 200–400 °C for Cu3Au ($L1_2$), CuAu ($L1_0$), and CuAu3 ($L1_2$) [50]. Theoretically, Ozoliņš et al. [51] have done extensive first-principles calculations on hypothetical ordered phases in this system, using the energetics data to fit a cluster expansion model for the alloy. In our calculations, we first obtained good TB parameters for Cu [52] and Au [12]. These were fixed throughout the remainder of the fit. We then fit the Cu–Au on-site, hopping, and overlap terms to reproduce the band structure and total energies of Cu3Au and CuAu3 in the $L1_2$ and $D0_3$ structures, and CuAu in the $L1_0$, $L1_1$, B1, and B2 structures. We then compute the total energies of a number of ordered structures, and compute the formation energy per atom, which, for a structure with formula unit Cu$_m$Au$_n$, is

$$E_{\mathrm{form}}(m,n) = [E_0(\mathrm{Cu}_m\mathrm{Au}_n) - m\,E_{\mathrm{fcc}}(\mathrm{Cu}) - n\,E_{\mathrm{fcc}}(\mathrm{Au})]/(m+n), \qquad (17)$$
where $E_0$ is the minimum energy for the structure in question, and $E_{\mathrm{fcc}}$ is the equilibrium energy of the pure element in the face-centered cubic phase. The results for the low-lying phases in the Cu–Au system are shown in Fig. 5. We see that these parameters do, in fact, predict the existence of ordered
[Figure 5: plot of formation energy (Ry/atom) versus gold concentration x for ordered Cu–Au structures, each point labeled by its Strukturbericht symbol.]

Figure 5. Formation energy diagram for ordered Cu$_{1-x}$Au$_x$ compounds, using our tight-binding parameters [6, 49]. Strukturbericht symbols are used to designate the phases, except for A1′ and A2′, which are ordered Cu7Au and CuAu7 supercells of the fcc and bcc lattices, respectively. The tie line connects the known ordered structures in the Cu–Au system [50]. The red dots represent structures used to fit the tight-binding parameters, while the blue dots are predictions.
[Figure 6: bar chart of formation energy (meV/atom) for ordered structures grouped by composition (Cu3Au, CuAu, CuAu2, CuAu3), comparing NREL LAPW, NRL LAPW, and TB results.]

Figure 6. Formation energy of several ordered phases in the Cu$_x$Au$_{1-x}$ system, calculated using our tight-binding parameters [6, 49] (blue bars), and compared to first-principles calculations performed by Ozoliņš et al. [51] (red bars). The structure notation is from Ref. [51]. On this scale, the cluster-expansion energies found in Ref. [51] are indistinguishable from the corresponding LAPW results. For comparison, we also plot our first-principles LAPW results (green bars), which were used in the Cu–Au tight-binding fitting process.
$L1_2$ Cu3Au and $L1_0$ CuAu. The $L1_2$ CuAu3 structure is, on the other hand, above the tie-line between CuAu and pure gold. This is consistent with our LAPW calculations, suggesting that $L1_2$ is not the ground state structure of CuAu3. Figure 6 compares some of our structural energies to the first-principles formation energies found by Ozoliņš et al. [51]. We see that we have very good agreement for the low-lying phases. Part of the discrepancy may be that we disagree slightly on the first-principles formation energies of some structures, as shown in the figure.

To further assess the transferability of the Cu–Au parameters, we computed elastic constants and zone-center phonon frequencies for ordered Cu3Au and compared them to experiment [53, 54] as well as first-principles LAPW calculations. The results are shown in Table 3. We find reasonable agreement between these values and results obtained from first-principles calculations.

The advantage of the tight-binding method over first-principles methods is that it allows us to quickly study systems with a large number of atoms. Accordingly, we used these parameters to seek understanding of the surface electronic
Table 3. Equilibrium bulk properties of Cu3Au in the $L1_2$ structure, as determined by our tight-binding parametrization [49], first-principles LAPW calculations, and from experiment.

Property        Experiment    LAPW    TB
a (Å)           3.755 [53]    3.68    3.69
C11 (GPa)       187 [30]      180     198
C12 (GPa)       135 [30]      120     98
C44 (GPa)       68 [30]               92
Γ4 (cm^-1)      125 [54]      110     153
Γ4 (cm^-1)      210 [54]      200     270
Γ5 (cm^-1)      161 [54]      159     195
Figure 7. Band structure of the Cu3 Au from [49]. (a) bulk system along the R direction, and (b) (111) surface along the M direction. E 1 and E 2 are the experimentally determined surface states [55].
structure of Cu3 Au [49]. Experiment [55] shows that two electronic surface states exist at in the (111) surface Brillouin zone of Cu3 Au. We model this system using our TB parameters and a slab consisting of 15 atomic layers and 60 atoms. In Fig. 7, we compare band structures for bulk Cu3 Au and the slab. We see that the surface states found experimentally agree nicely with the states found in our TB calculation.
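The formation energy of Eq. (17) and the tie-line test used in the discussion of Fig. 5 can be sketched as follows. The formula pairs m with Cu and n with Au, as the formula unit Cu$_m$Au$_n$ implies. The function names are hypothetical and the energies in the example are made-up numbers chosen only to show the mechanics, not values from the fit.

```python
def formation_energy(e_compound, m, n, e_cu, e_au):
    """Eq. (17): formation energy per atom of Cu_m Au_n.

    e_compound is the energy of one formula unit; e_cu and e_au are the
    equilibrium energies per atom of fcc Cu and fcc Au.
    """
    return (e_compound - m * e_cu - n * e_au) / (m + n)

def height_above_tie_line(x, e_form, left, right):
    """Signed distance of (x, e_form) from the line joining two phases.

    left and right are (concentration, formation energy) pairs. A positive
    result means the structure lies above the tie line and is unstable
    with respect to a mixture of the two end phases.
    """
    (x0, e0), (x1, e1) = left, right
    e_line = e0 + (e1 - e0) * (x - x0) / (x1 - x0)
    return e_form - e_line

# Hypothetical numbers: CuAu at x = 0.5, pure Au at x = 1, trial CuAu3 at x = 0.75
excess = height_above_tie_line(0.75, -0.0015, (0.5, -0.004), (1.0, 0.0))
```

A structure is a candidate ground state only if no pair of other phases gives a negative tie-line energy below it, which is the sense in which $L1_2$ CuAu3 is said to lie above the CuAu–Au line.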
2.2. Aluminides, Hydrides, and Carbides
To study aluminides we created a database of LAPW calculations for the B1 (NaCl), B2 (CsCl), $D0_3$ (Fe3Al), $C11_b$ (MoSi2), and B32 (NaTl) structures, generating TB Hamiltonians for FeAl [48], CoAl, and NiAl by fitting the energy bands for the B2 structure and the total energies for all the above structures. The TB Hamiltonian included the s, p, and d orbitals for both the metal and Al sites, which were all necessary for obtaining a good fit to the LAPW results. The RMS error for the total energy was less than 1 mRy for all structures fitted, and in the B2 structure the RMS error for the lowest 12 bands was less than 20 mRy. We were able to reproduce well the lattice constants and bulk moduli, and electronic properties, such as the densities of states and energy bands. In addition, quantities that were not fitted, such as elastic constants, are found to be in good agreement with independent LAPW calculations and experiment. Figure 8 shows that there is excellent agreement between the LAPW results and the TB results over a wide range of pressures for all the fitted phases. The agreement is especially good in the ground-state CsCl (B2) structure. We plot the formation energy, which is defined in analogy with (17).
[Figure 8: plot of formation energy per atom (Ry) versus volume per atom (Bohr^3) for the B1, B32, $D0_3$, $C11_b$, and B2 structures.]

Figure 8. The formation energies versus atomic volume for ordered Fe$_x$Al$_{1-x}$ structures, calculated using our TB parameters [48] and compared to first-principles LAPW calculations. The solid lines represent the TB results while the points represent the LAPW results.
The TB and LAPW band structures of the B2 FeAl structure are shown in Fig. 9. The original TB calculations [48] reproduce the main features of the first-principles results, but in detail there are significant differences. Here we use a parameter set which has a better fit to the band structure, and find that the behavior of the bands near the Fermi level is close to the LAPW results. We obtained the TB and LAPW electronic densities of states (DOS) by the tetrahedron method [56], using 165 k-points in the irreducible part of the Brillouin zone. The LAPW and TB DOS shown in Fig. 10 are in good agreement. Experimentally, the DOS at the Fermi energy is known only from specific-heat measurements, where it was measured to be ρ(εF) = 31.1 states/Ry/FeAl molecule [57]. Our TB calculation yields ρ(εF) = 48.7 states/Ry, slightly higher
[Figure 9: two band-structure panels, energy ε (Ry) along high-symmetry directions of the simple cubic Brillouin zone (Γ, X, M, R); tight-binding on the left, LAPW on the right.]

Figure 9. The band structure of FeAl in the CsCl structure, at the lattice constant a = 2.94 Å. The left figure shows the tight-binding band structure, while the LAPW results are on the right. In both cases the Fermi level has been set to zero. These calculations were done using a tight-binding parameter set which was selected to improve the fit to the FeAl band structure compared to our original parameters [48].
[Figure 10: two panels of density of states ρ(ε) (states/Ry/FeAl) versus energy ε (Ry), showing the total DOS and the Fe s, p, d, f and Al s, p, d partial DOS, with the Fermi level εF marked.]

Figure 10. The electronic density of states of B2 (CsCl) FeAl, using the TB (left) and LAPW (right) methods, at the lattice constant a = 2.94 Å. In each case the Fermi level has been set to zero. The partial densities of states are given according to the legend in each part of the figure. While the LAPW results have a longer tail at low energy, the DOS are essentially similar near the Fermi level.
than the LAPW value ρ(εF) = 36.8. Other reports in the literature also find the theoretical value of ρ(εF) to be greater than that from experiment, a discrepancy that does not allow for electron–phonon enhancement, which puts the experimental result into question. This discrepancy is possibly caused by the nonstoichiometry of the Fe–Al samples. Our predicted equilibrium lattice parameters and bulk modulus are also in good agreement with the first-principles results shown in Table 4. This is a result of the fitting procedure, as we fit the TB parameters to total energies at several volumes. However, the shear elastic moduli that we computed [29, 58] for the CsCl phase were not included in the fit, and except for C44 are in good agreement with the experimental results. In summary, we have presented a brief report of our TB study of the FeAl system. We showed that the parameters give an excellent description of several bcc and bcc-like phases as well as the NaCl phases.

We have also developed TB parametrizations for several other binary compounds. We can judge the transferability of the parameters by computing elastic constants for the equilibrium phase and comparing to experiment, as we do in Table 4. In many cases, the compound measured is not stoichiometric, e.g., PdH0.66 [59] or Fe0.5989Al0.4011 [30], or has only been measured in thin films [60]. In extreme cases, where there is no available experimental data, we compare to the results of LAPW calculations [58].

The TB method described here is not limited to the study of bulk systems. It can, indeed, be used to study chemisorption processes. Our initial work was on the Pd–H2 system [63]. Building on our previous parameters for Pd [9], and using a database of 55 ab initio total energy calculations, we were able to model dissociation of molecular hydrogen at the Pd (100) surface. We modified our usual procedure so that the fitting was done varying only the hydrogen on-site terms and the H–H and Pd–H Hamiltonian and overlap hopping parameters. The Pd on-site terms and Pd–Pd parameters were kept fixed to their pure Pd values. However, to obtain higher accuracy we expanded the polynomial that described the H–H and Pd–H parameters up to fourth order. Figure 11 shows potential energy surface cross-sections for two orientations of the H2 molecule above the surface. A comparison of the TB and ab initio results reveals that the fit reproduces the minimum energy paths and also the general shape of the elbow plots very well. The overall RMS error, including additional ab initio values that were not fitted, was only 0.1 eV, a value that is usually considered to be within the accuracy of the ab initio total energies.

Using similar techniques, we have also developed a set of TB parameters for studying the dissociation of the O2 molecule as it approaches a platinum surface [64]. In addition to the energy surfaces (as we computed for Pd–H2), we used the TB Molecular Dynamics (TBMD) [12] code to compute sticking probabilities. This was done by performing TBMD runs for a number of incident O2 kinetic energies in the range 0–1.5 eV, and averaging over 150 trajectories for a given energy. The results are shown in Fig. 12. We see that the
Table 4. Equilibrium lattice parameters and elastic constants (in GPa) for various cubic compounds in the NaCl or CsCl structure. Tight-binding results are compared to the available experimental data, noting that some compounds do not exist at the given stoichiometry. Values are calculated at the indicated equilibrium lattice constants (in atomic units).

Compound   Source        a (a.u.)      B     C11    C12    C44
NiH        TB            6.908         238   353    181    92
NiH        LAPW          6.908         234   311    196    64
PdH        TB            7.723         207   282    170    27
PdH        Exp. [59]     7.584         183   227    161    69
FeAl       TB            5.323         204   313    149    71
FeAl       Exp. [30]     5.479         136   181    114    127
NiAl       TB            5.389         195   247    168    60
NiAl       Exp. [30]     5.461         166   211    143    112
CoAl       TB            5.295         213   306    166    82
CoAl       LAPW [58]     5.408         157   257    107    130
NbC        TB            8.405         313   639    151    126
NbC        Exp. [61]     8.447         340   620    200    150
VN         TB            7.873         333   570    214    170
VN         Exp. [60]     7.810 [62]    268   533    135    133
[Figure 11: two contour plots, panels (a) and (b), of the potential energy as a function of the H2 height Z (Å) above the surface and the H–H distance $d_{\mathrm{H-H}}$ (Å); insets show the Pd and H positions.]

Figure 11. Contour plots of the TB-PES along two two-dimensional cuts through the six-dimensional coordinate space of H2/Pd(100) [63]. The coordinates in the figure are the H2 center-of-mass distance from the surface Z and the H–H interatomic distance $d_{\mathrm{H-H}}$. The lateral H2 center-of-mass coordinates in the surface unit cell and the orientation of the molecular axis, i.e., the coordinates X, Y, θ, and φ, are kept fixed for each 2D cut and depicted in the insets. The molecular axis is kept parallel to the surface; (a) corresponds to the dissociation at the bridge site, (b) to dissociation at the top site. The dots denote the points that have been used to obtain the fit. Energies are in eV per H2 molecule. The contour spacing is 0.1 eV.
[Figure 12: plot of trapping probability versus kinetic energy (eV). Series: Exp. Luntz et al., Ts = 200 K; Exp. Luntz et al., Ts = 90 K; Exp. Nolan et al., Ts = 77 K; TBMD, Ts = 0 K; TBMD, Ts = 0 K, Erot = 0.1 eV.]

Figure 12. Trapping probability of O2/Pt(111) as a function of the kinetic energy for normal incidence [64]. Results of molecular beam experiments for surface temperatures of 90 and 200 K [65] and 77 K [66] are compared to TBMD simulations for the surface initially at rest (Ts = 0 K).
trapping probability has the same basic behavior as found experimentally [65, 66], showing that we can successfully model the chemisorption of O2 on Pt.
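The averaging over trajectories described above amounts to estimating a probability from a finite number of yes/no outcomes. The sketch below shows that bookkeeping together with the binomial standard error, which indicates how much statistical scatter to expect from 150 trajectories per energy. It is an illustration only: the criterion for calling a trajectory "trapped" is not reproduced here, and the function name and sample counts are hypothetical.

```python
import math

def trapping_probability(outcomes):
    """Fraction of trapped trajectories and its binomial standard error.

    outcomes is a sequence of booleans, one per trajectory, True if the
    molecule was trapped at the surface.
    """
    n = len(outcomes)
    trapped = sum(1 for hit in outcomes if hit)
    p = trapped / n
    stderr = math.sqrt(p * (1.0 - p) / n)
    return p, stderr

# Hypothetical batch: 90 of 150 trajectories trapped at one kinetic energy
p, err = trapping_probability([True] * 90 + [False] * 60)
```

With 150 trajectories the standard error near p = 0.5 is about 0.04, which sets the scale below which differences between computed points should not be over-interpreted.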
2.3. Silicon Carbide
We previously developed parameter sets for both carbon [67] and silicon [68], so it is natural to extend the technique to the development of a parameter set for SiC [69]. Silicon carbide has a wide variety of polytypes, distinguished by the stacking of the SiC layers. It is therefore a good test of the ability of the method to develop transferable parameter sets. The parameters were developed by fitting to the zincblende (stacking ABCABC), wurtzite (stacking ABAB), and 4H (stacking ABACABAC) structures, several zone-boundary phonons, elastic constant modes, and diamond Si and C. The method was able to successfully reproduce the first-principles electronic band structure, as shown in Fig. 13. In addition, we computed phonon frequencies along the (001)
[Figure 13: band structure ε(k) (Ry) of zincblende SiC along high-symmetry directions of the fcc Brillouin zone (Γ, X, W, L, K), comparing LAPW and TB.]

Figure 13. Band structure of zincblende SiC along high-symmetry directions of the Brillouin zone, calculated from $sp^3d^5$ tight-binding parameters (solid lines) and LAPW–LDA (dashed lines) [69].
direction of the zincblende unit cell and compared them to experiment [70]. As seen in Fig. 14, the acoustic modes are in good agreement with experiment, though the optic modes are somewhat low. Thermal expansion was also computed using the TBMD program, and found to be in good agreement with experiment.
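Phonon frequencies of this kind are obtained from total energies of distorted structures. As a minimal sketch of the frozen-phonon idea, assuming a single harmonic mode with a known effective mass, the frequency follows from the curvature of the energy with respect to the mode amplitude, ω² = (d²E/du²)/M. The function name, the finite-difference step, and the model energy curve below are hypothetical; the unit conversions use CODATA values for the Rydberg, the Bohr radius, and the atomic mass unit.

```python
import math

RY_TO_J = 2.1798723611e-18      # 1 Ry in joules
BOHR_TO_M = 5.29177210903e-11   # 1 bohr in meters
AMU_TO_KG = 1.66053906660e-27   # 1 atomic mass unit in kg
C_CM_PER_S = 2.99792458e10      # speed of light in cm/s

def frozen_phonon_wavenumber(energy, mass_amu, u=0.01):
    """Wavenumber (cm^-1) of a harmonic mode from a total-energy function.

    energy(u) returns the total energy in Ry for mode amplitude u in bohr;
    mass_amu is the effective mass of the mode in atomic mass units.
    """
    # Central finite difference for the force constant, in Ry/bohr^2
    k = (energy(u) - 2.0 * energy(0.0) + energy(-u)) / u ** 2
    k_si = k * RY_TO_J / BOHR_TO_M ** 2               # N/m
    omega = math.sqrt(k_si / (mass_amu * AMU_TO_KG))  # rad/s
    return omega / (2.0 * math.pi * C_CM_PER_S)

# Hypothetical harmonic well with force constant 0.1 Ry/bohr^2 and the mass of Fe
nu = frozen_phonon_wavenumber(lambda u: 0.05 * u ** 2, 55.845)
```

In a real calculation `energy` would be a TB total-energy evaluation for the displaced supercell, and several amplitudes would normally be fitted to separate harmonic from anharmonic contributions.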
2.4. Tight-binding Description of MgB2
A nonorthogonal TB Hamiltonian for the superconductor MgB2 was derived [71] by fitting to both the total-energy and energy-band results of a first-principles full-potential LAPW calculation using the Hedin–Lundqvist parametrization of the local-density approximation (LDA). The LAPW calculations were performed in the ground-state (AlB2) structure, for 17 different combinations of c and a, that determined the LDA equilibrium volume. The LAPW results for the total energy and the energy bands at 76 k points in the irreducible hexagonal Brillouin zone were used as a database to determine the parameters of the TB Hamiltonian. Our basis included the s and p orbitals in both Mg and B in a nonorthogonal two-center representation. In order to obtain an accurate fit it was essential to block diagonalize the Hamiltonian at the high-symmetry points Γ, A, L, K, and H. We found that at a given set of lattice parameters (c, a) we can reproduce the energy bands of MgB2 quite
Figure 14. Phonon dispersion along the Γ–L direction in zincblende SiC. Circles are from an sp³ tight-binding parametrization [69], and diamonds are from experiment [70]. Longitudinal and transverse branches of the acoustic and optical modes are shown; the vertical axis is ν in cm⁻¹ and the horizontal axis is q.
well. A comparison is shown in Fig. 15, where the solid and broken lines represent the LAPW and TB bands, respectively, at the LDA values of the equilibrium lattice parameters. The TB bands are in very good agreement with the LAPW bands, including the two-dimensional B σ band in the Γ–A direction just above εF, which has been identified as the hole band that controls superconductivity. The RMS fitting error is 2 mRy for the total energy and close to 10 mRy for the first five bands. Beyond the fifth band our fit is less accurate because the Mg d bands, which are not included in our Hamiltonian, come into play. The values of our TB parameters are given in Ref. [71]. In Fig. 16 we compare the TB and LAPW densities of states (DOS). There is excellent agreement in both the total DOS and the B p-like DOS. The B and Mg s components of the DOS have their strongest presence at the bottom of the valence band, from −0.8 to −0.6 Ry on our scale. They are much smaller than the p-like DOS, so we chose not to include them in Fig. 16. We have also omitted the Mg p-like DOS, which is small below εF but becomes significant above εF. Our TB value of the total DOS at εF is ρ(εF) = 0.69 states/eV, which is almost identical to that found from our direct LAPW calculation. This value of ρ(εF) corresponds to the LDA equilibrium volume and is slightly smaller than the value of 0.71 states/eV reported by other workers at the experimental volume. Using our value of ρ(εF) and the measured value of the specific-heat coefficient γ we find a value
Figure 15. The band structure of MgB2 in the AlB2 structure at the theoretical equilibrium volume, as determined by the full-potential LAPW method (solid lines) and our tight-binding parametrization (dashed lines) [71]. The Fermi level is at zero. The vertical axis is ε in Ry.
Figure 16. The electronic density of states (DOS) of MgB2 in the AlB2 structure at the theoretical equilibrium volume, comparing the total DOS as determined by the full-potential LAPW method (upper solid line) and our tight-binding parametrization [71] (upper dashed line), and the partial single-atom B p decomposition (lower lines). The vertical axis is ρ(ε) in states/Ry/unit cell and the horizontal axis is ε in Ry; εF marks the Fermi level.
Figure 17. Total energy of MgB2 at fixed volume versus c/a, calculated from our tight-binding parametrization [71]. The points indicate the actual calculated energies, while the lines are cubic polynomial fits to the data. Curves are labeled by volume V (a.u.) from 170 to 200 in steps of 5; the vertical axis is E in Ry.
of the electron-phonon coupling constant λ = 0.65, which is consistent with the high superconducting transition temperature of MgB2. Our TB Hamiltonian also provides an accurate description of the energetics of MgB2, as shown in Fig. 17. We have tested our parameters by computing the TB equilibrium structure. We find equilibrium lattice parameters of c = 6.66 a.u. and a = 5.79 a.u., in good agreement with the LAPW result. At c/a = 1.14, the experimental value, we deduce a bulk modulus of B = 165 GPa, which is in good agreement with the experimental value of 120 GPa and with the calculated value of 147 GPa reported by Bohnen et al. [72].
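The step from ρ(εF) and γ to λ can be illustrated with the standard mass-enhancement relation γ = (1 + λ)(π²/3) k_B² ρ(εF). The sketch below is a hedged illustration, not the authors' procedure: it assumes that relation, assumes ρ(εF) is given per formula unit with both spins included, and uses hypothetical function names. Because the measured γ is not quoted in the text, the example only performs a round-trip check with the quoted λ and ρ(εF).

```python
import math

K_B_EV = 8.617333262e-5        # Boltzmann constant, eV/K
EV_TO_J_PER_MOL = 96485.33212  # 1 eV per formula unit expressed in J/mol


def gamma_band(dos_ef):
    """Band (unrenormalized) specific-heat coefficient in mJ/(mol K^2).

    dos_ef is the DOS at the Fermi level in states/eV per formula unit,
    both spins included (an assumed normalization).
    """
    gamma_ev = (math.pi ** 2 / 3.0) * K_B_EV ** 2 * dos_ef  # eV/K^2
    return gamma_ev * EV_TO_J_PER_MOL * 1.0e3


def gamma_from_lambda(lam, dos_ef):
    """Enhanced coefficient, gamma = (1 + lambda) * gamma_band."""
    return (1.0 + lam) * gamma_band(dos_ef)


def lambda_from_gamma(gamma_measured, dos_ef):
    """Invert the mass-enhancement relation for lambda."""
    return gamma_measured / gamma_band(dos_ef) - 1.0


g_band = gamma_band(0.69)              # DOS value quoted in the text
g_enh = gamma_from_lambda(0.65, 0.69)  # gamma implied by the quoted lambda
print(f"band gamma       = {g_band:.2f} mJ/(mol K^2)")
print(f"implied gamma    = {g_enh:.2f} mJ/(mol K^2)")
print(f"lambda recovered = {lambda_from_gamma(g_enh, 0.69):.2f}")
```

Under these assumptions the quoted λ corresponds to a specific-heat coefficient of a few mJ/(mol K²); substituting an actual measured γ into `lambda_from_gamma` would reproduce the kind of estimate described above.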
2.5. Ternary Systems: Ruthenates
The NRL-TB scheme has been applied to ternary systems as well. For such applications the number of parameters increases substantially. However, in most cases the number of parameters can be restricted by using an orthogonal Hamiltonian and by keeping only the orbitals that dominate in the particular system. We consider first [73] SrRuO3 and Sr2RuO4. For the former we constructed a 14×14 orthogonal Hamiltonian including Ru-d and O-p orbitals, and for the latter the Hamiltonian size is
27×27 with Sr-d, Ru-d, and O-p orbitals. In these calculations we did not fit the total energies, as we aimed only for a very accurate reproduction of the LAPW band structures. These Hamiltonians allow the band structure to be computed on very fine meshes in the Brillouin zone at low computational cost, and additionally yield an analytic form for the band velocities, while retaining the accuracy of the full-potential electronic structure calculations. This greatly facilitates the calculation of transport and superconducting parameters related to the fermiology. These features were exploited to calculate the Hall coefficient and an anisotropy parameter relevant to the superconducting vortex lattice geometry for Sr2RuO4. A comparison of the TB and LAPW Fermi surfaces for Sr2RuO4 is shown in Fig. 18, where the agreement is excellent.
Figure 18. Fermi surface of Sr2RuO4 from LAPW and TB calculations [73].
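The analytic band velocities mentioned above come from the fact that an orthogonal TB Hamiltonian is a finite Fourier sum, H(k) = Σ_R exp(ik·R) H(R), so dH/dk is known in closed form and the velocity of a nondegenerate band follows from the Hellmann–Feynman theorem. The sketch below illustrates this on a hypothetical one-dimensional, two-orbital chain (the matrices are placeholders, not the ruthenate parameters) and compares the analytic result with a finite difference; units are chosen so that ħ = 1 and the lattice constant is 1.

```python
import numpy as np


def hk_and_dhk(k, h_r):
    """Return H(k) and dH/dk for real-space blocks h_r = {R: H(R)}."""
    hk = sum(np.exp(1j * k * R) * h for R, h in h_r.items())
    dhk = sum(1j * R * np.exp(1j * k * R) * h for R, h in h_r.items())
    return hk, dhk


def bands_and_velocities(k, h_r):
    """Eigenvalues and Hellmann-Feynman velocities <n|dH/dk|n> at k."""
    hk, dhk = hk_and_dhk(k, h_r)
    eps, vecs = np.linalg.eigh(hk)
    vel = np.real(np.einsum("in,ij,jn->n", vecs.conj(), dhk, vecs))
    return eps, vel


# Hypothetical toy model: on-site block and nearest-neighbor hopping block.
ONSITE = np.array([[0.0, 0.3], [0.3, 1.0]])
T = np.array([[-0.5, 0.1], [0.2, 0.4]])
H_R = {0.0: ONSITE, 1.0: T, -1.0: T.T}  # H(-R) = H(R)^T keeps H(k) Hermitian

k0, step = 0.7, 1.0e-5
eps0, vel0 = bands_and_velocities(k0, H_R)
eps_plus, _ = bands_and_velocities(k0 + step, H_R)
eps_minus, _ = bands_and_velocities(k0 - step, H_R)
print("analytic velocities:    ", vel0)
print("finite-difference check:", (eps_plus - eps_minus) / (2.0 * step))
```

The analytic form avoids numerical differentiation on the fine k meshes needed for transport integrals. For degenerate bands or a nonorthogonal basis the expression needs additional terms that this sketch omits.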
2.6. NaxCoO2
The TB method has been applied to the study of the odd-gap superconductor NaxCoO2 [74, 75]. This system has strong nesting, involving nearly 70% of the electrons at the Fermi level. Since this effect primarily involves the Co and O atoms, the parametrization was restricted to those states. The crystal field of the octahedral structure splits the on-site Co d bands into a1g, eg, and e′g bands. To accommodate this, the on-site parameters of Eq. (14) were computed independently for the xy, yz, zx, x²−y², and 3z²−r² Co d orbitals and the x, y, and z O p orbitals. The dependence of the on-site and hopping parameters on bond distance was then used to analyze the change of the Fermi surface with interlayer distance. The band structure and Fermi surface were found to depend on the oxygen height in a nontrivial manner. In addition, the one-electron susceptibility was computed, as shown in Fig. 19. The nesting shown here leads to a charge density wave as well as spin fluctuations, suggesting that the system is an odd-gap triplet s-wave superconductor.
Figure 19. Low-frequency limit of χ0(q, ω)/ω in NaxCoO2, using a tight-binding parametrization of Co–O [74]. The double-humped peaks on the zone boundary indicate nesting.
2.7. Other Methods
Porezag et al. [8] developed an alternative method for computing total energies and electronic eigenvalues from a parametrized tight-binding scheme. In their work, based on the Linear Combination of Atomic Orbitals (LCAO) method, the hopping parameters are computed directly from first-principles calculations. A repulsive pair potential between the ions is then fitted so that the sum of the pair potential energies and the sum over the occupied states give the correct total energy. Finally, the on-site terms are corrected with a simulated Coulomb interaction to preserve charge self-consistency on each ion. The method has been applied to many sp³ systems, for example to predict the structure of tetragonal CN compounds [76], the electronic structure of GaN edge dislocations, and the structure of amorphous CN [77].
Halley and coworkers [7, 78, 79] have developed a similar charge self-consistent TB approach, which has been applied mainly to oxides (rutile TiO2) and fluorides (MnF2, discussed in Section 1). In this method the isolated ions are required to have the proper energy levels, which allows for better descriptions of electrochemistry. As noted above, this method has also been used to study magnetic systems.
Pan [80] adapted the work of the Ames group on carbon [11] to derive a TB parametrization for hydrocarbons. This has been used to study the geometries of small hydrocarbons and hydrogenated diamond surfaces, and it finds geometries in qualitative agreement with previous results.
We have discussed extensions of our original TB total energy method [9, 10] to spin-polarized systems, including noncollinear spins, and to compounds. Although the determination of the TB parameters is tedious, the resulting method is computationally efficient and capable of performing static and dynamic calculations beyond the limits of first-principles methods. We have applied the method to all of the magnetic elements and to many nonmagnetic compounds. The accuracy of electronic, elastic, and phonon properties is comparable to that of the original, nonmagnetic single-element calculations.
3. Outlook
Tight-binding total energy methods can be thought of as a mapping of a large set of first-principles data onto a compact TB Hamiltonian based on Slater–Koster parameters. As we have seen, these methods are nearly as accurate as first-principles calculations over a wide range of structures and densities. The calculations are also very fast. A typical first-principles calculation for a transition metal or intermetallic compound requires on the order of one hundred basis functions per atom to achieve convergence. The TB calculation uses only nine functions per atom, assuming an sp³d⁵ basis set. Given that the time to diagonalize the Hamiltonian scales as the cube of the number of basis functions, TB methods are inherently about 1000 times faster than the corresponding first-principles calculations. Furthermore, any algorithmic improvements in eigenvalue determination can be applied to TB methods as well as to first-principles methods. Tight-binding calculations will therefore always be faster than first-principles calculations, and so can be applied to much larger systems. As we have seen, these methods are routinely applied to molecular dynamics simulations containing hundreds of atoms, and have been applied to systems containing several thousand atoms.
The major bottleneck to the widespread use of TB methods is the development of accurate parameter sets, particularly for binary and ternary compounds. This involves large numbers of first-principles calculations and thorough testing of the resulting parameter sets. However, once a parameter set is validated, it can be used for a wide variety of applications. We expect the use of TB methods to grow rapidly as more systems are parametrized.
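The speed-up estimate above is simple cubic-scaling arithmetic. A minimal sketch, assuming only the two basis sizes quoted in the text (the function name is illustrative):

```python
def diagonalization_speedup(n_first_principles, n_tb):
    """Ratio of O(n^3) diagonalization costs for two basis sizes per atom."""
    return (n_first_principles / n_tb) ** 3


ratio = diagonalization_speedup(100, 9)
print(f"(100 / 9)^3 = {ratio:.0f}")
```

The ratio comes out slightly above one thousand, which is the order-of-magnitude figure quoted in the text; prefactors, matrix construction, and self-consistency cycles are ignored in this estimate.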
Acknowledgments
This work was supported by the U.S. Office of Naval Research (ONR). The development of the tight-binding codes was supported in part by the U.S. Department of Defense Common HPC Software Support Initiative (CHSSI). Work on the magnetic elements was sponsored in part by the ONR Design of Naval Steels program. In addition to our collaborators, we would like to thank W. Pickett for helpful discussions concerning noncollinear magnetization.
References
[1] J.C. Slater and G.F. Koster, Phys. Rev., 94, 1498, 1954.
[2] W.A. Harrison, Electronic Structure and the Properties of Solids, Freeman, San Francisco, 1980.
[3] W.A. Harrison, Elementary Electronic Structure, World Scientific, Singapore, 1999.
[4] D.A. Papaconstantopoulos, Handbook of the Band Structure of Elemental Solids, Plenum, New York, 1986.
[5] M.J. Mehl and D.A. Papaconstantopoulos, In: C.F. Yong (ed.), Topics in Computational Materials Science, World Scientific, Singapore, Chap. V, pp. 169–213, 1998.
[6] D.A. Papaconstantopoulos and M.J. Mehl, J. Phys. Condens. Matter, 15, R413, 2003.
[7] N. Yu and J.W. Halley, Phys. Rev. B, 51, 4768, 1995.
[8] D. Porezag, T. Frauenheim, T. Köhler, G. Seifert, and R. Kaschner, Phys. Rev. B, 51, 12947, 1995.
[9] R.E. Cohen, M.J. Mehl, and D.A. Papaconstantopoulos, Phys. Rev. B, 50, 14694, 1994.
[10] M.J. Mehl and D.A. Papaconstantopoulos, Phys. Rev. B, 54, 4519, 1996.
[11] M.S. Tang, C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 53, 979, 1996.
[12] F. Kirchhoff, M.J. Mehl, N.I. Papanicolaou, D.A. Papaconstantopoulos, and F.S. Khan, Phys. Rev. B, 63, 195101, 2001.
[13] H. Haas, C.Z. Wang, M. Fähnle, C. Elsässer, and K.M. Ho, Phys. Rev. B, 57, 1461, 1998.
[14] C.Z. Wang, B.C. Pan, and K.M. Ho, J. Phys. Condens. Matter, 11, 2043, 1999.
[15] R.E. Cohen, L. Stixrude, and E. Wasserman, Phys. Rev. B, 56, 8575, 1997, erratum Phys. Rev. B, 58, 5873, 1998.
[16] J.L. Mercer, Jr. and M.Y. Chou, Phys. Rev. B, 49, 8506, 1994.
[17] W.E. Pickett, J. Korean Phys. Soc. (Proc. Suppl.), 29, S70, 1996.
[18] O.K. Andersen, Phys. Rev. B, 12, 3060, 1975.
[19] S.-H. Wei and H. Krakauer, Phys. Rev. Lett., 55, 1200, 1985.
[20] D. Singh, Phys. Rev. B, 43, 6388, 1991.
[21] P. Hohenberg and W. Kohn, Phys. Rev., 136, B864, 1964.
[22] W. Kohn and L.J. Sham, Phys. Rev., 140, A1133, 1965.
[23] L. Hedin and B.I. Lundqvist, J. Phys. C: Solid State Phys., 4, 2064, 1971.
[24] J.P. Perdew, J.A. Chevary, S.H. Vosko, K.A. Jackson, M.R. Pederson, D.J. Singh, and C. Fiolhais, Phys. Rev. B, 46, 6671, 1992.
[25] M.J. Gillan, J. Phys. Condens. Matter, 1, 689, 1989.
[26] N.C. Bacalis, D.A. Papaconstantopoulos, M.J. Mehl, and M. Lach-hab, Physica B: Condens. Matter, 296, 125, 2001.
[27] P. Entel, R. Meyer, K. Kadau, H. Herper, and E. Hoffmann, Eur. Phys. J. B, 5, 379, 1998.
[28] M.J. Mehl, Phys. Rev. B, 47, 2493, 1993.
[29] M.J. Mehl, B.M. Klein, and D.A. Papaconstantopoulos, In: J.H. Westbrook and R.L. Fleischer (eds.), Intermetallic Compounds – Principles and Practice, vol. 1, John Wiley and Sons, London, pp. 195–210, 1994.
[30] G. Simmons and H. Wang, Single Crystal Elastic Constants and Calculated Aggregate Properties: A Handbook, 2nd ed., MIT Press, Cambridge, MA and London, 1971.
[31] V.J. Minkiewicz, G. Shirane, and R. Nathans, Phys. Rev., 162, 528, 1967.
[32] R.J. Birgeneau, J. Cordes, G. Dolling, and A.D.B. Woods, Phys. Rev., 136, A1359, 1964.
[33] S.C. Miller and W.F. Love, Tables of Irreducible Representations of Space Groups and Co-representations of Magnetic Space Groups, Pruett, Boulder, 1967.
[34] C. Barreteau, R. Guirado-López, M.C. Desjonquères, D. Spanjaard, and A.M. Oleś, Comput. Mat. Sci., 17, 211, 2000.
[35] C. Barreteau, D. Spanjaard, and M.C. Desjonquères, Phys. Rev. B, 58, 9721, 1998.
[36] C. Barreteau, R. Guirado-López, D. Spanjaard, M.C. Desjonquères, and A.M. Oleś, Phys. Rev. B, 61, 7781, 2000.
[37] C. Barreteau, M.-C. Desjonquères, A.M. Oleś, and D. Spanjaard, Phys. Rev. B, 69, 064432, 2004.
[38] Y. Xie and J.A. Blackman, Phys. Rev. B, 66, 085410, 2002.
[39] G.L. Krasko, J. Appl. Phys., 79, 4682, 1996.
[40] M.J. Mehl, D.A. Papaconstantopoulos, I.I. Mazin, N.C. Bacalis, and W.E. Pickett, J. Appl. Phys., 89, 6880, 2001.
[41] K. Hirai, J. Phys. Soc. Japan, 67, 1776, 1998.
[42] R. Hafner, D. Spisak, R. Lorenz, and J. Hafner, Phys. Rev. B, 65, 184432, 2002.
[43] M.J. Mehl and D.A. Papaconstantopoulos, Europhys. Lett., 31, 537, 1995.
[44] S. Mukherjee and R.E. Cohen, J. Comput.-Aided Mater. Des., 8, 107, 2001.
[45] R.E. Cohen and S. Mukherjee, Phys. Earth Planet. Int., 143–144, 445, 2004.
[46] G. Steinle-Neumann, L. Stixrude, and R.E. Cohen, Proc. Natl. Acad. Sci. USA, 101, 33, 2004.
[47] M. Zhuang and J.W. Halley, Phys. Rev. B, 64, 024413, 2001.
[48] S.H. Yang, M.J. Mehl, D.A. Papaconstantopoulos, and M.B. Scott, J. Phys. Condens. Matter, 14, 1895, 2002.
[49] C.E. Lekka, N. Bernstein, M.J. Mehl, and D.A. Papaconstantopoulos, Appl. Surf. Sci., 219, 158, 2003.
[50] T.B. Massalski (ed.), Binary Alloy Phase Diagrams, American Society for Metals, Metals Park, OH, 1987.
[51] V. Ozoliņš, C. Wolverton, and A. Zunger, Phys. Rev. B, 57, 6427, 1998.
[52] Y. Mishin, M.J. Mehl, D.A. Papaconstantopoulos, A.F. Voter, and J.D. Kress, Phys. Rev. B, 63, 224106, 2001.
[53] P.D. Bogdanoff, B. Fultz, and S. Rosenkranz, Phys. Rev. B, 60, 3976, 1999.
[54] S. Katano, M. Iizumi, and Y. Noda, J. Phys. F, 18, 2195, 1988.
[55] R. Courths, M. Lau, T. Scheunemann, H. Gollisch, and R. Feder, Phys. Rev. B, 63, 195110, 2001.
[56] O. Jepsen and O.K. Andersen, Solid State Commun., 9, 1763, 1971.
[57] H. Okamoto and P.A. Beck, Monatsh. Chem., 103, 907, 1972.
[58] M.J. Mehl, J.E. Osburn, D.A. Papaconstantopoulos, and B.M. Klein, Phys. Rev. B, 41, 10311, 1990, erratum Phys. Rev. B, 42, 5362, 1990.
[59] D.K. Hsu and R.G. Leisure, Phys. Rev. B, 20, 1339, 1979.
[60] J.O. Kim, J.D. Achenbach, P.B. Mirkarimi, M. Shinn, and S.A. Barnett, J. Appl. Phys., 72, 1805, 1992.
[61] W. Weber, Phys. Rev. B, 8, 5082, 1973.
[62] H. Holleck, J. Vac. Sci. Technol. A, 4, 2661, 1986.
[63] A. Gross, M. Scheffler, M.J. Mehl, and D.A. Papaconstantopoulos, Phys. Rev. Lett., 82, 1209, 1999.
[64] A. Groß, A. Eichler, J. Hafner, M.J. Mehl, and D.A. Papaconstantopoulos, Surf. Sci., 539, L542, 2003.
[65] A.C. Luntz, M.D. Williams, and D.S. Bethune, J. Chem. Phys., 89, 4381, 1988.
[66] P.D. Nolan, B.R. Lutz, P.L. Tanaka, J.E. Davis, and C.B. Mullins, J. Chem. Phys., 111, 3696, 1999.
[67] D.A. Papaconstantopoulos, M.J. Mehl, S.C. Erwin, and M.R. Pederson, In: P. Turchi, A. Gonis, and L. Colombo (eds.), Tight-Binding Approach to Computational Materials Science, vol. 491, Materials Research Society, Pittsburgh, p. 221, 1998.
[68] N. Bernstein, M.J. Mehl, D.A. Papaconstantopoulos, N.I. Papanicolaou, M.Z. Bazant, and E. Kaxiras, Phys. Rev. B, 62, 4477, 2000, erratum Phys. Rev. B, 65, 249002(E), 2002.
[69] N. Bernstein, H.J. Gotsis, D.A. Papaconstantopoulos, and M.J. Mehl, submitted to Phys. Rev. B (unpublished).
[70] D.W. Feldman, J.H. Parker, Jr., W.J. Choyke, and L. Patrick, Phys. Rev., 173, 787, 1968.
[71] D.A. Papaconstantopoulos and M.J. Mehl, Phys. Rev. B, 64, 172510, 2001.
[72] K.-P. Bohnen, R. Heid, and B. Renker, Phys. Rev. Lett., 86, 5771, 2001.
[73] I.I. Mazin, D.A. Papaconstantopoulos, and D.J. Singh, Phys. Rev. B, 61, 5223, 2000.
[74] M.D. Johannes, I.I. Mazin, D.J. Singh, and D.A. Papaconstantopoulos, cond-mat/0403135, Phys. Rev. Lett., 93, 101802, 2004.
[75] M.D. Johannes, D.A. Papaconstantopoulos, D.J. Singh, and M.J. Mehl, Europhys. Lett., 68, 433, 2004.
[76] E. Kim, C. Chen, T. Köhler, M. Elstner, and T. Frauenheim, Phys. Rev. Lett., 86, 652, 2001.
[77] F. Weich, J. Widany, and T. Frauenheim, Phys. Rev. Lett., 78, 3326, 2001.
[78] P.K. Schelling, N. Yu, and J.W. Halley, Phys. Rev. B, 58, 1279, 1998.
[79] J.W. Halley, Y. Lin, and M. Zhuang, Faraday Discuss., 121, 85, 2002.
[80] B.C. Pan, Phys. Rev. B, 64, 155408, 2001.
1.15 ENVIRONMENT-DEPENDENT TIGHT-BINDING POTENTIAL MODELS
C.Z. Wang and K.M. Ho
Ames Laboratory-U.S. DOE and Department of Physics and Astronomy, Iowa State University, Ames, IA 50011, USA
S. Yip (ed.), Handbook of Materials Modeling, 307–347. © 2005 Springer. Printed in the Netherlands.
The use of the tight-binding formalism to parametrize the electronic structures of crystals and molecules has been a subject of continuous interest since the pioneering work of Slater and Koster [1] half a century ago. In the last 15 years, the tight-binding method has attracted even more attention owing to the development of tight-binding total energy models that can provide interatomic forces for molecular dynamics simulations of materials [2–7]. The simplicity of the tight-binding description makes the method very promising for large-scale electronic calculations and atomistic simulations [8, 9]. However, studies of complex systems require that the tight-binding parameters be “transferable” [4], i.e., able to describe accurately the electronic structure and total energy of a material in different bonding configurations. Although tight-binding molecular dynamics has been successfully applied to a number of interesting systems such as carbon fullerenes and carbon nanotubes [10–12], the transferability of tight-binding potentials is still a major issue that hinders the widespread application of the method to more materials of current interest.
Most of the tight-binding models developed so far are based on the Slater–Koster formalism [1]. The accuracy and transferability of these tight-binding models are limited by the approximations inherent in the Slater–Koster theory. One of the key approximations is the assumption that the hopping integrals are independent of the bonding environment. Experience from first-principles calculations shows that a fixed minimal basis set optimized for a given atomic geometry will not usually give accurate results for the total energies of other atomic geometries. Minimal basis sets need the flexibility to adjust to the bonding environment of the atom on which they are based in order to give converged results for different geometries. Since tight-binding is a minimal-basis description of the electronic structure, it follows that the tight-binding hopping parameters should be environment dependent. Another major limitation of the Slater–Koster theory is the use of the two-center approximation [1]. Such an approximation can be justified only when the system is governed by strong covalent interactions (e.g., carbon). For systems where metallic bonding effects are significant, contributions beyond pairwise interactions should not be neglected. Furthermore, the Löwdin symmetric orthogonalization procedure [13] used to construct the orthogonal basis set in the Slater–Koster theory may also result in additional structure-dependent contributions to the two-center hopping integrals, because the overlap matrices are different for different structures.
Several developments that go beyond the Slater–Koster theory have been undertaken in the last ten years. The environment-dependent tight-binding (EDTB) potential models developed by the authors and co-workers are one such attempt. These potential models incorporate environment-dependent scaling functions not only for the diagonal matrix elements, but also for the off-diagonal hopping integrals and the repulsive energy in the tight-binding total energy expression. These models provide a mechanism for including some of the important effects that are ignored in the Slater–Koster theory, such as the variation of the local minimal basis set with environment, three-center interactions, and effects due to Löwdin orthogonality. The EDTB models have been demonstrated to describe well the properties of both the lower-coordinated covalent structures and the higher-coordinated metallic structures of carbon and silicon [14, 15].
Despite this progress, the development of EDTB models so far still relies on empirical fitting to the band structures and total energies of some standard structures.
The fitting procedure is quite laborious if we want to study a broad range of materials, especially in compound systems where different sets of interactions have to be determined simultaneously from a given set of electronic structures. Moreover, fundamental questions such as how and to what extent the approximations used in the Slater–Koster scheme influence the transferability of the tight-binding models are still not well understood from the empirical fitting approach. Information from first-principles calculations about these issues is highly desirable to guide the development of more accurate and transferable tight-binding models. In general, the overlap and one-electron Hamiltonian matrices from first-principles calculations cannot be used directly to infer the tight-binding parameters, because first-principles calculations use a large basis set in order to obtain converged results, while tight-binding parameters are based on a minimal basis representation. Recently, the authors and co-workers have developed a method for extracting a minimal basis Hamiltonian starting from ab initio calculation results obtained with large converged basis sets [16–19]. This new development provides a clear way to separate the electronic structure into different component interactions among the atomic orbitals in the system. This
provides the basis for developing a systematic scheme to simplify the potential generation process to make the problem much faster and more tractable, especially when we are dealing with compound systems.
1. Fundamentals of Tight-Binding Potential Models
The expression for the binding energy of a system with M atoms and N valence electrons in tight-binding molecular dynamics (TBMD) is given by

$$E_{\text{binding}} = E_{\text{bs}} + E_{\text{rep}} - E_0 \tag{1}$$

The first term on the right-hand side of Eq. (1) is the band structure energy, which is equal to the sum of the one-electron eigenvalues $\varepsilon_i$ of the occupied states given by a tight-binding Hamiltonian $H_{\text{TB}}$,

$$E_{\text{bs}} = \sum_i f_i \varepsilon_i \tag{2}$$

where $f_i$ is the electron occupation (Fermi–Dirac) function and $\sum_i f_i = N$. The second term on the right-hand side of Eq. (1) is a repulsive energy, usually expressed as a sum of short-ranged pairwise interactions

$$E_{\text{rep}} = \frac{1}{2} \sum_{i,j} \phi(r_{ij}) \tag{3}$$

or as a functional of the sum of pairwise interactions

$$E_{\text{rep}} = \sum_i F\Big(\sum_j \phi(r_{ij})\Big) \tag{4}$$

where $F$ is a function which, for example, can be a fourth-order polynomial [20]. The term $E_0$ in Eq. (1) is a constant that represents the sum of the energies of the individual atoms.
The founding work on tight-binding Hamiltonians $H_{\text{TB}}$ for the electronic structure of solids was done by Slater and Koster in 1954 [1]. Starting with a set of atomic orbitals $\phi_{i\alpha}(\mathbf{r} - \mathbf{R}_i)$ located on an atom at $\mathbf{R}_i$, they used the Löwdin symmetric orthogonalization scheme [13] to construct a set of orthogonal orbitals $\psi_{i\alpha}(\mathbf{r} - \mathbf{R}_i)$:

$$\psi_{i\alpha}(\mathbf{r} - \mathbf{R}_i) = \sum_{j\beta} S^{-1/2}_{i\alpha,j\beta}\, \phi_{j\beta}(\mathbf{r} - \mathbf{R}_j) \tag{5}$$

where $S$ is the overlap matrix of the original atomic orbitals:

$$S_{i\alpha,j\beta}(\mathbf{R}_i - \mathbf{R}_j) = \int \phi^{*}_{i\alpha}(\mathbf{r} - \mathbf{R}_i)\, \phi_{j\beta}(\mathbf{r} - \mathbf{R}_j)\, d^3r \tag{6}$$
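The symmetric orthogonalization of Eq. (5) can be sketched numerically: $S^{-1/2}$ is obtained from the eigendecomposition of the overlap matrix, and the transformed basis then has unit overlap. This is a minimal illustration with a hypothetical 3×3 overlap matrix; the function name is not from the text.

```python
import numpy as np


def lowdin_inverse_sqrt(S):
    """Return S^(-1/2) for a Hermitian, positive-definite overlap matrix."""
    w, U = np.linalg.eigh(S)
    if np.min(w) <= 0.0:
        raise ValueError("overlap matrix must be positive definite")
    # U diag(w^-1/2) U^dagger; dividing columns of U by sqrt(w) applies the diagonal
    return (U / np.sqrt(w)) @ U.conj().T


# Hypothetical overlap matrix of three nonorthogonal orbitals.
S = np.array([[1.0, 0.4, 0.1],
              [0.4, 1.0, 0.4],
              [0.1, 0.4, 1.0]])

X = lowdin_inverse_sqrt(S)
overlap_new = X.conj().T @ S @ X  # overlap matrix in the Löwdin basis
print(np.round(overlap_new, 12))
```

Because $S^{-1/2}$ depends on the whole overlap matrix, matrix elements between the orthogonalized orbitals on two atoms carry information about the surrounding atoms, which is one origin of the structure dependence discussed in this chapter.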
This set of orthogonal orbitals $\psi_{i\alpha}(\mathbf{r} - \mathbf{R}_i)$ has the symmetry properties of the corresponding $\phi_{i\alpha}$ [1]. If the system is a periodic crystal, then the Bloch sum can be used to construct the wave-vector ($\mathbf{k}$) dependent orbitals

$$\psi_{i\alpha,\mathbf{k}} = N^{-1/2} \sum_{\mathbf{R}_i} \exp(i\mathbf{k}\cdot\mathbf{R}_i)\, \psi_{i\alpha}(\mathbf{r} - \mathbf{R}_i) \tag{7}$$

where $N$ is here the number of unit cells. Let $H$ be the Hamiltonian of the system, which also has the periodicity of the lattice; then the $\mathbf{k}$-dependent Hamiltonian matrix elements can be written as

$$H_{i\alpha,j\beta}(\mathbf{k}) = \sum_{\mathbf{R}_j} \exp[i\mathbf{k}\cdot(\mathbf{R}_j - \mathbf{R}_i)] \int \psi^{*}_{i\alpha}(\mathbf{r} - \mathbf{R}_i)\, H\, \psi_{j\beta}(\mathbf{r} - \mathbf{R}_j)\, d^3r \tag{8}$$

One of the key approximations made by Slater and Koster, which led to the commonly used tight-binding formulation of the Hamiltonian matrix, is the so-called two-center approximation. They assumed that the potential part of the Hamiltonian is a sum of spherical potentials located on the two atoms on which the Löwdin orbitals are located, and the three-center integrals are disregarded. Under this assumption, the integral in Eq. (8) becomes similar to the type that one would expect in a diatomic molecule. The Hamiltonian matrix elements of Eq. (8) then depend only on the form of the Löwdin orbitals $\psi_{i\alpha}$ and the vector $(\mathbf{R}_j - \mathbf{R}_i)$. Since the Löwdin orbital $\psi_{i\alpha}$ has the same symmetry as the corresponding atomic orbital $\phi_{i\alpha}$, it can be expanded as a sum of functions with well-defined angular momentum with respect to the axis between the two atoms. The labels $\sigma$, $\pi$, $\delta$ are used (in analogy with s, p, d) for the angular momentum quantum numbers $m = 0, 1, 2$, respectively. For example, if $\psi_{i\alpha}$ is a p orbital, it can be expanded as a linear combination of $p\sigma$ and $p\pi_{\pm}$ functions with respect to the axis. For a system containing only s and p orbitals, there are only four types of hopping integrals, $h_{ss\sigma}$, $h_{sp\sigma}$, $h_{pp\sigma}$, and $h_{pp\pi}$, to be considered. These integrals depend only on the separation $r$ of the two atoms and can be treated as parameters to be determined by fitting to ab initio band structure calculations. Once the hopping integrals are obtained, the TB Hamiltonian matrix can be constructed as linear combinations of the hopping integrals using the direction cosines $(c_x, c_y, c_z)$ of the vector $(\mathbf{R}_j - \mathbf{R}_i)$. Note that when $\mathbf{R}_j = \mathbf{R}_i$ the integrals in Eq. (8) yield the diagonal matrix elements. The formulation described in this section can also be applied to nonperiodic systems (i.e., clusters).
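The construction of a two-center block from the four sp hopping integrals and the direction cosines can be written compactly. The sketch below uses the standard Slater–Koster combinations for an (s, px, py, pz) basis; the function name, the orbital ordering, and the numerical hopping values are illustrative assumptions, and the integrals are passed in already evaluated at the pair separation.

```python
import numpy as np


def sk_sp_block(d, h_sss, h_sps, h_pps, h_ppp):
    """4x4 two-center block between atom i (rows) and atom j (columns).

    d is the vector R_j - R_i; the orbital order is (s, px, py, pz).
    """
    d = np.asarray(d, dtype=float)
    c = d / np.linalg.norm(d)          # direction cosines (cx, cy, cz)
    block = np.empty((4, 4))
    block[0, 0] = h_sss                # s-s
    block[0, 1:] = c * h_sps           # s on i, p on j
    block[1:, 0] = -c * h_sps          # p on i, s on j (odd parity of p)
    # p-p: c_a c_b (pp_sigma - pp_pi) + delta_ab pp_pi
    block[1:, 1:] = np.outer(c, c) * (h_pps - h_ppp) + np.eye(3) * h_ppp
    return block


# Placeholder hopping integrals (arbitrary energy units).
HOPS = dict(h_sss=-2.0, h_sps=1.8, h_pps=3.0, h_ppp=-0.7)

along_z = sk_sp_block([0.0, 0.0, 2.35], **HOPS)
tilted = sk_sp_block([1.0, 1.0, 1.0], **HOPS)
print(np.round(along_z, 3))
print(np.round(tilted, 3))
```

For a bond along z the block reduces to the pure σ and π integrals, which is a convenient check of any implementation. Reversing the bond direction transposes the block, so the assembled Hamiltonian is symmetric.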
2. Environment-Dependent Tight-Binding Potential Models
Since the classic paper of Slater and Koster [1], much work has been devoted to the tight-binding parameterization of different materials. The Slater–Koster theory has been extended by incorporating continuous distance-dependent scaling functions for the hopping parameters. One such scaling function is the well-known $1/r^2$ scaling introduced by Harrison [21] and by Chadi [22]. Subsequently, Goodwin et al. (GSP) [4] showed that the transferability of tight-binding models for silicon can be improved by multiplying the simple power-law scaling of the tight-binding parameters and of the pairwise repulsion by an exponential attenuation function.
Although the two-center approximation greatly simplifies the tight-binding parameterization, it limits the accuracy and transferability of the tight-binding models. There is considerable evidence pointing to the necessity of going beyond the two-center approximation. Sawada [23], and subsequently Kohyama [24] and Mercer and Chou [6], showed that explicit inclusion of three-center interactions in the repulsive energy gives a better description of the energy–volume curves of silicon and germanium than the pure two-center model proposed by Goodwin et al. (GSP). Tight-binding models that allow the diagonal matrix elements to depend on the environment of the atoms, developed by Mercer and Chou [25] for silicon and by Cohen et al. [26] and Mehl et al. [7] for metallic elements, showed significant improvements in transferability. Li and Biswas [27] found that it is necessary to allow neighbor-dependent hopping integrals between silicon and hydrogen atoms for a correct description of the properties of an interstitial hydrogen atom in the silicon lattice. Tight-binding potential models that include environment-dependent scaling functions for off-diagonal as well as diagonal matrix elements were developed by Wang, Ho, and co-workers [14, 15].
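The difference between a pure power-law scaling and a power law multiplied by an exponential attenuation can be made concrete. The text does not give the functional form, so the sketch below assumes the form commonly quoted for the GSP scaling, $(r_0/r)^n \exp\{n[(r_0/r_c)^{n_c} - (r/r_c)^{n_c}]\}$; the parameter values are placeholders of the magnitude used for silicon and should not be read as the published set.

```python
import math


def harrison_scaling(r, r0):
    """Pure power law, h(r)/h(r0) = (r0/r)^2."""
    return (r0 / r) ** 2


def gsp_scaling(r, r0, n, n_c, r_c):
    """Power law times an exponential attenuation (assumed GSP-type form)."""
    return (r0 / r) ** n * math.exp(n * ((r0 / r_c) ** n_c - (r / r_c) ** n_c))


# Placeholder parameters: lengths in angstrom, exponents dimensionless.
R0, N, N_C, R_C = 2.35, 2.0, 6.48, 3.67

for r in (2.35, 3.0, 3.84, 4.5):
    print(f"r = {r:4.2f}  power law = {harrison_scaling(r, R0):.4f}  "
          f"attenuated = {gsp_scaling(r, R0, N, N_C, R_C):.4f}")
```

Both functions equal one at the reference distance. The attenuated form falls off much faster beyond the cutoff scale, so second and further neighbors contribute little without an abrupt cutoff.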
2.1. Formalism
In the environment-dependent tight-binding potential model of Wang et al., the minimal basis set is taken to be orthogonal. The effects of Löwdin orthogonality, three-center interactions, and the variation of the local basis set with environment are taken into account empirically by renormalizing the interaction strength between atom pairs according to the surrounding atomic configurations. The tight-binding hopping parameters and the repulsive interaction between atoms i and j depend on the environments of atoms i and j through two scaling functions [14]. The first one is a screening function that is designed to weaken the interactions between two atoms when there are
intervening atoms between them. Another is a bond-length scaling function which scales the interatomic distance (hence the interaction strength) between the two atoms according to their effective coordination numbers. Longer effective bond lengths are assumed for higher coordinated atoms. Specifically, the hopping parameters and the pairwise repulsive potential for silicon and carbon are expressed as α4 2 h(ri j ) = α1 Ri−α j exp[−α3 Ri j ](1 − Sij )
(9)
In this expression, $h(r_{ij})$ denotes the possible types of interatomic hopping parameters $h_{ss\sigma}$, $h_{sp\sigma}$, $h_{pp\sigma}$, $h_{pp\pi}$ and the pairwise repulsive potential $\phi(r_{ij})$ between atoms i and j. $r_{ij}$ is the real distance and $R_{ij}$ is a scaled distance between atoms i and j. $S_{ij}$ is a screening function. The parameters $\alpha_1$, $\alpha_2$, $\alpha_3$, $\alpha_4$, and the parameters for the bond-length scaling function $R_{ij}$ and the screening function $S_{ij}$, can be different for different hopping parameters and the pairwise repulsive potential. Note that Eq. (9) reduces to the traditional two-center form if we set $R_{ij} = r_{ij}$ and $S_{ij} = 0$. The screening function $S_{ij}$ is expressed as a hyperbolic tangent function, $S_{ij} = \tanh(\xi_{ij})$, with argument $\xi_{ij}$ given by

$$\xi_{ij} = \beta_1 \sum_l \exp\left[-\beta_2 \left(\frac{r_{il} + r_{jl}}{r_{ij}}\right)^{\beta_3}\right] \tag{10}$$
where $\beta_1$, $\beta_2$, and $\beta_3$ are adjustable parameters. The maximum screening effect occurs when atom l is situated close to the line connecting atoms i and j (i.e., $r_{il} + r_{jl}$ is a minimum). This approach allows us to distinguish between first- and further-neighbor interactions without explicit specification. It is well suited for molecular dynamics simulations, where it is difficult to define exactly which atoms are first neighbors and which are second neighbors. The bond-length scaling function scales the distance between two atoms according to their effective coordination numbers. Longer effective bond lengths are assumed for higher-coordinated atom pairs, leading to reduced interactions per atom pair for higher-coordinated structures. The scaling between the real and effective interatomic distances is given by

$$R_{ij} = r_{ij}\left(1 + \delta_1 \Delta + \delta_2 \Delta^2 + \delta_3 \Delta^3\right), \quad \text{where} \quad \Delta = \frac{1}{2}\left[\frac{n_i - n_0}{n_0} + \frac{n_j - n_0}{n_0}\right] \tag{11}$$

is the fractional coordination number relative to the coordination number of the diamond structure $n_0$, averaged between atoms i and j. The coordination number can be modeled by a smooth function, $n_i = \sum_j (1 - S_{ij})$, with a proper choice of parameters for $S_{ij}$, which has the form of the screening function described above.
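The way Eqs. (9)–(11) fit together can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names (`screening`, `coordination`, `scaled_distance`, `edtb_function`) are hypothetical, no periodic boundary conditions are applied, and the numerical values in the usage lines are the silicon $h_{ss\sigma}$ and coordination-number parameters quoted later in this section.

```python
import numpy as np

def screening(i, j, pos, beta1, beta2, beta3):
    """S_ij = tanh(xi_ij), Eq. (10); the sum runs over all atoms l other than i and j."""
    r_ij = np.linalg.norm(pos[i] - pos[j])
    xi = 0.0
    for l in range(len(pos)):
        if l in (i, j):
            continue
        ratio = (np.linalg.norm(pos[i] - pos[l]) + np.linalg.norm(pos[j] - pos[l])) / r_ij
        xi += beta1 * np.exp(-beta2 * ratio ** beta3)
    return np.tanh(xi)

def coordination(i, pos, beta_n, r_cut):
    """Smooth coordination number n_i = sum_j (1 - S_ij) over neighbors within r_cut."""
    n = 0.0
    for j in range(len(pos)):
        if j != i and np.linalg.norm(pos[i] - pos[j]) < r_cut:
            n += 1.0 - screening(i, j, pos, *beta_n)
    return n

def scaled_distance(r_ij, n_i, n_j, n0, deltas):
    """R_ij = r_ij (1 + d1*D + d2*D^2 + d3*D^3), Eq. (11)."""
    d1, d2, d3 = deltas
    D = 0.5 * ((n_i - n0) / n0 + (n_j - n0) / n0)
    return r_ij * (1.0 + d1 * D + d2 * D ** 2 + d3 * D ** 3)

def edtb_function(R_ij, S_ij, alphas):
    """h = a1 * R^(-a2) * exp(-a3 * R^a4) * (1 - S), Eq. (9)."""
    a1, a2, a3, a4 = alphas
    return a1 * R_ij ** (-a2) * np.exp(-a3 * R_ij ** a4) * (1.0 - S_ij)

# Usage: h_ss_sigma for an isolated Si dimer (distances in Angstrom, energy in eV).
pos = np.array([[0.0, 0.0, 0.0], [2.35, 0.0, 0.0]])
beta_n = (2.0, 0.02895, 7.96284)            # coordination-number parameters for Si
n_0 = coordination(0, pos, beta_n, r_cut=5.2)
n_1 = coordination(1, pos, beta_n, r_cut=5.2)
R = scaled_distance(2.35, n_0, n_1, n0=4.0, deltas=(0.0891, 0.0494, -0.0252))
S = screening(0, 1, pos, 4.4864, 0.1213, 6.0817)
h_ss_sigma = edtb_function(R, S, (-5.9974, 0.4612, 0.1040, 2.3000))
```

For the dimer there are no intervening atoms, so $S_{ij} = 0$ and only the bond-length scaling distinguishes the result from the two-center form. A production implementation would use neighbor lists and periodic images rather than the explicit loops shown here.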
Besides the hopping parameters, the diagonal matrix elements also depend on the bonding environment. The expression for the diagonal matrix elements is

$$e_{\lambda,i} = e_{\lambda,0} + \sum_j \Delta e_\lambda(r_{ij}) \tag{12}$$

where $\Delta e_\lambda(r_{ij})$ takes the same form as Eq. (9) and λ denotes the two types of orbitals (s or p). $e_{s,0}$ and $e_{p,0}$ are the on-site energies of a free atom. Finally, the repulsive energy term is expressed as a functional of the sum of pairwise interactions, as defined in Eq. (4) in the previous section. The parameters in the model are determined by fitting to self-consistent first-principles density functional results for the electronic band structures and the cohesive energy versus volume curves of several crystalline structures with different coordination numbers. These structures include diamond, graphite, β-tin, simple cubic, bcc, and fcc. In addition, some elastic constants and vibrational frequencies of the lowest-energy structures are included in the fitting to ensure that the model gives a good description of elastic and vibrational properties in addition to electronic structures and binding energies.
2.2. EDTB Potential for Carbon
Carbon is a strongly covalent material that is well described by the tight-binding scheme. The two-center tight-binding model for carbon developed by Xu et al. (XWCH model [20]) gives an accurate description of carbon in the low-coordination diamond, graphite, and linear-chain structures, as shown in Fig. 1. The potential also describes well the structures and energies of carbon fullerenes and carbon nanotubes. The two-center XWCH carbon potential is therefore adequate for studying most carbon systems. However, the two-center XWCH model describes the higher-coordinated carbon structures poorly (see Fig. 1) and is therefore not suitable for studying carbon structures at high pressure or under high compressive stress. To correct this deficiency, Tang et al. developed an environment-dependent tight-binding potential for carbon following the formalism described in the previous subsection. The parameters of this potential are given in Tables 1 and 2. The parameters for calculating the coordination number of carbon are β1 = 2.0, β2 = 0.0478, β3 = 7.16. The cutoff distance for the interaction is rij = 3.3 Å. As shown in Fig. 2, the environment-dependent tight-binding potential model for carbon describes very well the binding energies not only of the covalent (diamond, graphite, and linear chain) structures, but also of the higher-coordinated metallic (bcc, fcc, and simple cubic) structures. The EDTB potential is also more accurate for
Figure 1. The cohesive energies as a function of nearest neighbor distance for carbon in different crystalline structures calculated using the two-center XWCH TB model are compared with the results from the first-principles DFT-LDA calculations. The solid curves are the TB results and the dashed curves are the LDA results [20].
Table 1. The parameters of the EDTB model for carbon. The TB hopping integrals are in units of eV and the interatomic distances are in units of Å. φ is dimensionless.

| | α1 | α2 | α3 | α4 | β1 | β2 | β3 | δ |
|---|---|---|---|---|---|---|---|---|
| h_ssσ | −8.9491 | 0.8910 | 0.1580 | 2.7008 | 2.0200 | 0.2274 | 4.7940 | 0.0310 |
| h_spσ | 8.3183 | 0.6170 | 0.1654 | 2.4692 | 1.3000 | 0.2274 | 4.7940 | 0.0310 |
| h_ppσ | 11.7955 | 0.7620 | 0.1624 | 2.3509 | 1.0400 | 0.2274 | 4.7940 | 0.0310 |
| h_ppπ | −5.4860 | 1.2785 | 0.1383 | 3.4490 | 0.2000 | 8.5000 | 4.3800 | 0.0310 |
| φ | 30.0000 | 3.4905 | 0.00423 | 6.1270 | 1.5035 | 0.205325 | 4.1625 | 0.002168 |
| e_s, e_p | 0.1995275 | 0.029681 | 0.19667 | 2.2423 | 0.055034 | 0.10143 | 3.09355 | 0.272375 |
Table 2. The coefficients (in units of eV) of the polynomial function F(x) for the EDTB potential for carbon.

| c0 | c1 | c2 | c3 | c4 |
|---|---|---|---|---|
| 12.201499972 | 0.583770664 | 0.336418901 × 10⁻³ | −0.5334093735 × 10⁻⁴ | 0.7650717197 × 10⁻⁶ |
elastic constants and phonon frequencies of the diamond and graphite structures than the two-center tight-binding model (Tables 3 and 4). Another example that demonstrates the better transferability of the EDTB model over the two-center model in complex simulations is the study of diamond-like amorphous carbon. Diamond-like (or tetrahedral) amorphous carbon consists mostly of sp3-bonded carbon atoms and is produced under highly compressive stress, which promotes the formation of sp3 bonds, in contrast to the formation of sp2 graphite-like bonds under normal conditions [30–32].
Figure 2. The cohesive energies as a function of nearest neighbor distance for carbon in different crystalline structures calculated using the environment-dependent TB model are compared with the results from the first-principles DFT-GGA calculations. The solid curves are the TB results and the dashed curves are the GGA results [14].
Table 3. Elastic constants, phonon frequencies and Grüneisen parameters of diamond calculated from the XWCH-TB model [20] and the environment-dependent TB (EDTB) model [14], compared with experimental results [28]. Elastic constants are in units of 10¹² dyn/cm² and the phonon frequencies are in terahertz.

| | XWCH | EDTB | Experiment |
|---|---|---|---|
| a (Å) | 3.555 | 3.585 | 3.567 |
| B | 4.56 | 4.19 | 4.42 |
| c11 − c12 | 6.22 | 9.25 | 9.51 |
| c44 | 4.75 | 5.55 | 5.76 |
| ν_LTO(Γ) | 37.80 | 41.61 | 39.90 |
| ν_TA(X) | 22.42 | 25.73 | 24.20 |
| ν_TO(X) | 33.75 | 32.60 | 32.0 |
| ν_LA(X) | 34.75 | 36.16 | 35.5 |
| γ_LTO(Γ) | 1.03 | 0.93 | 0.96 |
| γ_TA(X) | −0.16 | 0.30 | |
| γ_TO(X) | 1.10 | 1.50 | |
| γ_LA(X) | 0.62 | 0.98 | |
Although the two-center XWCH carbon potential can produce the essential topology of the diamond-like amorphous carbon network [33], the comparison with experiment is not quite satisfactory, as one can see from Fig. 3. There are also some discrepancies in ring statistics between the diamond-like amorphous carbon models generated with the two-center tight-binding potential and those generated by ab initio molecular dynamics [34]. Specifically, a small fraction of three- and four-membered rings observed in the ab initio model is absent
Table 4. Elastic constants, phonon frequencies and Grüneisen parameters of graphite calculated from the XWCH-TB model [20] and the environment-dependent TB (EDTB) model [14], compared with experimental results [29]. Elastic constants are in units of 10¹² dyn/cm² and the phonon frequencies are in terahertz.

| | XWCH | EDTB | Experiment |
|---|---|---|---|
| c11 − c12 | 8.40 | 8.94 | 8.80 |
| E2g2 | 49.92 | 48.99 | 47.46 |
| A2u | 29.19 | 26.07 | 26.04 |
| γ(E2g2) | 2.00 | 1.73 | 1.63 |
| γ(A2u) | 0.10 | 0.05 | |
Figure 3. Radial distribution functions G(r) of the two tetrahedral amorphous carbon samples (F and G) generated by tight-binding molecular dynamics using the two-center XWCH TB potential (solid curve), compared with the neutron scattering data of Ref. [32] (dotted curve). The theoretical results have been convoluted with the experimental resolution corresponding to the termination of the Fourier transform at the experimental maximum scattering vector Q_max = 16 Å⁻¹ [33].
from the results of the two-center tight-binding model. These subtle deficiencies are corrected when the EDTB potential is used to generate diamond-like amorphous carbon [35]. The radial distribution function of the diamond-like a-C obtained from the EDTB potential is in much better agreement with experiment, as one can see from Fig. 4. More discussion of the applications of the EDTB carbon potential will be given in the next section.
Figure 4. Radial distribution function G(r) of the tetrahedral amorphous carbon structure generated by tight-binding molecular dynamics using the environment-dependent TB potential (solid curve), compared with the neutron scattering data of Ref. [32] (dotted curve). The theoretical result has been convoluted with the experimental resolution corresponding to the termination of the Fourier transform at the experimental maximum scattering vector Q_max = 16 Å⁻¹ [35].
2.3. EDTB Potential for Silicon
Although the diamond structure of Si has covalent sp3 bonding, the higher-coordinated metastable structures of Si are metallic, with energies close to that of the diamond structure. Therefore, Si can be metallic under high pressure or at high temperature. For example, the coordination of the liquid phase of Si is close to the coordination of the metallic structures (i.e., 6.5). These properties of Si pose a challenge for accurate tight-binding
modeling of Si: it is difficult to describe the low-coordinated covalent structures and the high-coordinated metallic structures with good accuracy using one set of tight-binding parameters. With the environment-dependent tight-binding formalism, Wang et al. showed that this difficulty can be overcome [15]. The EDTB Si potential they developed gives an excellent fit to the energy versus interatomic distance curves of various silicon crystalline structures with different coordination, as shown in Fig. 5. The EDTB Si potential also describes well the structures and energies of Si surfaces, in addition to other bulk properties such as elastic constants and phonon frequencies. These results can be seen in Tables 5 and 6. The parameters of the EDTB Si potential are listed in Tables 7 and 8. The parameters for calculating the coordination number of Si are β1 = 2.0, β2 = 0.02895, β3 = 7.96284. The cutoff distance for the interaction is rij = 5.2 Å. A useful benchmark for Si interatomic potentials is a series of model structures for the Σ = 13 {510} symmetric tilt boundary in Si [37]. Eight different structures, as indicated on the horizontal axis of Fig. 6, were selected for the calculations. These structures were not included in the database for fitting the parameters. The structures were relaxed by the steepest-descent
Figure 5. The cohesive energies as a function of nearest neighbor distance for silicon in different crystalline structures calculated using the environment-dependent TB model are compared with the results from the first-principles DFT-LDA calculations. The solid curves are the TB results and the dashed curves are the LDA results [15].
Table 5. Elastic constants and phonon frequencies of silicon in the diamond structure calculated from the two-center TB model [36] and the environment-dependent TB (EDTB) model [15], compared with experimental results [28]. Elastic constants are in units of 10¹² dyn/cm² and the phonon frequencies are in terahertz.

| | Two-center TB | EDTB | Experiment |
|---|---|---|---|
| a (Å) | | 5.450 | 5.430 |
| B | 0.876 | 0.90 | 0.978 |
| c11 − c12 | 0.939 | 0.993 | 1.012 |
| c44 | 0.890 | 0.716 | 0.796 |
| ν_LTO(Γ) | 21.50 | 16.20 | 15.53 |
| ν_TA(X) | 5.59 | 5.00 | 4.49 |
| ν_TO(X) | 20.04 | 12.80 | 13.90 |
| ν_LA(X) | 14.08 | 11.50 | 12.32 |
Table 6. Surface energies of the silicon (100) and (111) surfaces from the EDTB Si potential [15]. ΔE is the energy relative to that of the (1×1)-ideal surface. The energies are in units of eV/(1×1).

| Structure | Surface energy | ΔE |
|---|---|---|
| Si(100) | | |
| (1×1)-ideal | 2.292 | 0.0 |
| (2×1) | 1.153 | −1.139 |
| p(2×2) | 1.143 | −1.149 |
| c(4×2) | 1.148 | −1.144 |
| Si(111) | | |
| (1×1)-ideal | 1.458 | 0.0 |
| (1×1)-relaxed | 1.435 | −0.025 |
| (1×1)-faulted | 1.495 | 0.037 |
| √3×√3-t4 | 1.213 | −0.245 |
| √3×√3-h3 | 1.346 | −0.112 |
| (2×1)-Haneman | 1.188 | −0.270 |
| (2×1)-π-bonded chain | 1.138 | −0.320 |
| (7×7)-DAS | 1.099 | −0.359 |
Table 7. The parameters obtained from the fitting for the EDTB model of Si [15]. α1 is in units of eV. The other parameters are dimensionless.

| | α1 | α2 | α3 | α4 | β1 | β2 | β3 | δ1 | δ2 | δ3 |
|---|---|---|---|---|---|---|---|---|---|---|
| h_ssσ | −5.9974 | 0.4612 | 0.1040 | 2.3000 | 4.4864 | 0.1213 | 6.0817 | 0.0891 | 0.0494 | −0.0252 |
| h_spσ | 3.4834 | 0.0082 | 0.1146 | 1.8042 | 2.4750 | 0.1213 | 6.0817 | 0.1735 | 0.0494 | −0.0252 |
| h_ppσ | 11.1023 | 0.7984 | 0.1800 | 1.4500 | 1.1360 | 0.1213 | 6.0817 | 0.0609 | 0.0494 | −0.0252 |
| h_ppπ | −3.6014 | 1.3400 | 0.0500 | 2.2220 | 0.1000 | 0.1213 | 6.0817 | 0.4671 | 0.0494 | −0.0252 |
| φ | 126.640 | 5.3600 | 0.7641 | 0.4536 | 37.00 | 0.56995 | 19.30 | 0.082661 | −0.023572 | 0.006036 |
| e_s, e_p | 0.2830 | 0.1601 | 0.050686 | 2.1293 | 7.3076 | 0.07967 | 7.1364 | 0.7338 | −0.03953 | −0.062172 |
Table 8. The coefficients of the polynomial function f(x) for the EDTB potential of Si.

| | c0 (eV) | c1 | c2 (eV⁻¹) | c3 (eV⁻²) | c4 (eV⁻³) |
|---|---|---|---|---|---|
| x ≥ 0.7 | −0.739 × 10⁻⁶ | 0.96411 | 0.68061 | −0.20893 | 0.02183 |
| x < 0.7 | −1.8664 | 6.3841 | −3.3888 | 0.0 | 0.0 |
Figure 6. Energies of the Σ = 13 {510} symmetric tilt boundary structures in Si. Eight different structures, as indicated on the horizontal axis, were selected for the calculations. The energies are relative to that of structure M, which has been identified by experiment. The energies obtained from the calculations using the EDTB Si potential are compared with results from ab initio calculations, from two-center Si tight-binding potentials [36], and from classical potential calculations (classical I [38] and classical II [39]). The results of EDTB, ab initio, and classical I are taken from Ref. [37].
method until the forces on each atom were less than 0.01 eV/Å. The energies obtained from the calculations using the EDTB Si potential are compared with the results from ab initio calculations, from two-center Si tight-binding potentials [36], and from classical potential calculations in Fig. 6. The energy differences between the structures predicted by the EDTB calculations agree very well with those from the ab initio calculations. The two-center tight-binding potentials and the classical potentials do not reproduce the ab initio and EDTB results, even though the atoms in the structures are all four-fold coordinated.
Figure 7. Structure of Si13 cluster predicted by (a) classical potential, (b) two-center GSP TB potential, (c) ab initio Car–Parrinello method, and (d) EDTB. The formation energies listed under the structures are calculated using first-principles DFT-LDA. See the text for more details.
Another example of the predictive power of the EDTB silicon potential is the prediction of the ground-state structure of Si clusters. The structure of Si13 has been the subject of much debate [40–44, 46]. Si13 is special because ionized Si13+ clusters are observed to have very low chemical reactivity compared with other clusters in the range from Si11 to Si20 [47–49]. Based on theoretical calculations using a classical force field model, Chelikowsky et al. proposed an icosahedral structure with an atom in the center of an icosahedral cage (Fig. 7a) [40]. The argument in favor of this structure is that it is highly symmetric and seems to be chemically less reactive [43]. On the
other hand, tight-binding calculations using the two-center GSP tight-binding model favor a structure with an icosahedral cage plus an atom attached from the outside (Fig. 7b). However, theoretical calculations using ab initio methods showed that both structures are energetically very unfavorable [42, 44, 45]. In 1992, Rothlisberger et al. found that a structure with C3v symmetry (Fig. 7c) has an energy much lower than that of the icosahedral structure [42]. This C3v isomer was regarded as the ground-state structure of the Si13 cluster for years, until a new lowest-energy structure was revealed by calculations using the environment-dependent Si tight-binding potential. Combining tight-binding molecular dynamics calculations with a genetic algorithm for structural search, we obtained a new structure with Cs symmetry whose energy is even lower than that of the C3v isomer [50]. This new structure is shown in Fig. 7d. More about TBMD/GA applications will be discussed in the next section.
2.4. EDTB Potential for Molybdenum
Since the EDTB model describes the metallic phases of carbon and silicon accurately, it can also be applied to metallic elements. Here we use molybdenum, a bcc transition metal, as an example. We use an orthogonal minimal basis set of one s, three p, and five d atomic orbitals to construct the TB Hamiltonian. This choice of basis gives 10 independent hopping parameters $h_{ll'm}$ and three intra-atomic matrix elements ($\varepsilon_s$, $\varepsilon_p$, and $\varepsilon_d$). The environment dependence of the three on-site energies of atom i is taken into account by

$$\varepsilon_d = \varepsilon_d^0 + \sum_j \Delta\varepsilon_d(r_{ij}), \quad \varepsilon_s = \varepsilon_d + \varepsilon_{s-d}^0 + \sum_j \Delta\varepsilon_{s-d}(r_{ij}), \quad \varepsilon_p = \varepsilon_d + \varepsilon_{p-d}^0 + \sum_j \Delta\varepsilon_{p-d}(r_{ij}).$$

Here the quantities with superscript 0 denote the parts independent of the environment and the $\Delta\varepsilon_l$ are the environment-dependent contributions. We adopt the same functional form for the distance dependence of $h_{ll'm}$, $\Delta\varepsilon_l$, and $\phi(r_{ij})$, namely

$$f(r_{ij}) = \alpha_1 \exp(-\alpha_2 R_{ij})(1 - S_{ij}), \tag{13}$$
where $S_{ij}$ is the screening function discussed in the EDTB model for carbon and Si (see Eq. (10)). $R_{ij}$ is the bond-length scaling function defined in Eq. (11). The coordination numbers are calculated using the same parameters as for carbon, but the coordination of the bcc structure is chosen as $n_0$ for calculating the relative coordination Δ. Only the term linear in Δ in Eq. (11) is considered for the bond-length scaling for Mo, and two parameters ($\delta_1$), one for $h_{ll'm}$ and $\Delta\varepsilon_l$ and another for $\phi(r_{ij})$, are introduced to describe the bond-length scaling. Using this bond-length scaling considerably improves the TB total energies of the sc and A15 structures compared with our previous model without bond-length scaling [52].
In order to further reduce the number of parameters in the fitting, we impose the universal ratios $h_{pd\sigma} : h_{pd\pi} = -\sqrt{3} : 1$ and $h_{dd\sigma} : h_{dd\pi} : h_{dd\delta} = (-6) : 4 : (-1)$ for the pre-exponential factors, as we did in the previous model [52]. Furthermore, we found from the fitting that $h_{pp\pi}$ is small and can be neglected without a noticeable change in the results. Altogether the final model contains 55 parameters. The data used in the fitting are: ab initio band structures along lines of high symmetry in the Brillouin zone; the total energy of Mo in various crystal structures (sc, bcc, fcc) for a range of lattice parameters around the respective equilibrium values; the experimental phonon frequencies at the points N, H, and P of the Brillouin zone; the experimentally obtained elastic constant C44; the unrelaxed vacancy formation energy obtained by the mixed-basis pseudopotential (MBPP) method; and the unrelaxed (100) surface energy obtained by the MBPP method. The optimized parameters are listed in Table 9. The cutoff distance for the interactions is 8.5 a.u. The total energy curves for the sc, bcc, and fcc structures of Mo, as well as those of the hcp and A15 structures, which were not included in the fit, are shown in Fig. 8. Figure 9 shows the fit to the band structure of bcc Mo. Table 10 presents the TB results for bcc Mo for the equilibrium lattice constant a0, the elastic constants C11, C12, and C44, the vacancy formation energy $E_V^f$ for a relaxed supercell with 54 sites, and the formation energy $E_I^f$ for an octahedral interstitial atom in a relaxed supercell with 16 regular lattice sites, in comparison with experimental results and with results from the MBPP approach. The quantities a0 and C44 were included in the fit, and therefore the comparison simply tests the quality of the fit. In contrast, the results for C11, C12, for $E_V^f$ in the relaxed supercell, and for $E_I^f$ are predictions of the TB model, which
Table 9. The parameters of the EDTB model for Mo. α1 is in units of Ry. The environment-independent on-site terms are $\varepsilon_{s-d}^0 = -0.20387$ Ry, $\varepsilon_{p-d}^0 = 0.25666$ Ry, and $\varepsilon_d^0 = 0.09414$ Ry.

| | α1 | α2 | β1 | β2 | β3 | δ1 |
|---|---|---|---|---|---|---|
| h_ssσ | −1.11973 | 0.72213 | 1.02360 | 0.85483 | 1.75553 | 0.08233 |
| h_ppσ | 0.23682 | 0.29692 | 0.94862 | 0.71027 | 2.06777 | 0.08233 |
| h_ppπ | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
| h_ddσ | −4.28084 | 0.86823 | 0.61901 | 1.25923 | 1.53080 | 0.08233 |
| h_ddπ | 2.85390 | 0.86823 | 0.61901 | 1.25923 | 1.53080 | 0.08233 |
| h_ddδ | −0.71347 | 0.86823 | 0.61901 | 1.25923 | 1.53080 | 0.08233 |
| h_spσ | 0.21103 | 0.35923 | 0.63336 | 0.11474 | 3.02617 | 0.08233 |
| h_sdσ | −0.35016 | 0.61218 | 6.07645 | 0.74477 | 3.52520 | 0.08233 |
| h_pdσ | −4.97982 | 0.89071 | 3.56861 | 0.58919 | 3.12674 | 0.08233 |
| h_pdπ | 2.87510 | 0.89071 | 3.56861 | 0.58919 | 3.12674 | 0.08233 |
| Δε_s−d | 2.56385 | 0.89192 | 1.30193 | 0.91911 | 1.95662 | 0.08233 |
| Δε_p−d | 2.05237 | −0.83267 | 3.39410 | 0.96099 | 1.85273 | 0.08233 |
| Δε_d | −0.02691 | −0.72297 | 0.97189 | 0.96909 | 2.35201 | 0.08233 |
| φ | 344.66341 | 1.31611 | 2.91007 | 0.94413 | 1.77243 | 0.06595 |
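As a quick consistency check, the universal ratios imposed on the pre-exponential factors can be verified directly from the α1 column of Table 9. The short Python sketch below is illustrative only; the dictionary keys are hypothetical labels, and the numbers are copied from the table.

```python
import math

# Pre-exponential factors alpha_1 (in Ry) copied from Table 9.
alpha1 = {
    "pd_sigma": -4.97982, "pd_pi": 2.87510,
    "dd_sigma": -4.28084, "dd_pi": 2.85390, "dd_delta": -0.71347,
}

# h_pd_sigma : h_pd_pi should equal -sqrt(3) : 1.
pd_ratio = alpha1["pd_sigma"] / alpha1["pd_pi"]

# h_dd_sigma : h_dd_pi : h_dd_delta should equal (-6) : 4 : (-1).
# Normalizing by |h_dd_delta| keeps the signs of the three entries.
dd_ratios = tuple(alpha1[key] / abs(alpha1["dd_delta"])
                  for key in ("dd_sigma", "dd_pi", "dd_delta"))

print(f"pd ratio = {pd_ratio:.5f}, -sqrt(3) = {-math.sqrt(3.0):.5f}")
print("dd ratios =", tuple(round(value, 4) for value in dd_ratios))
```

The tabulated values satisfy both ratios to within the rounding of the table, which confirms that only one independent pre-exponential factor is fitted for the pd pair and one for the dd triplet.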
Figure 8. Total energy versus volume for the sc, bcc, fcc, hcp, and A15 structures of Mo. The dashed and the full lines represent the EDTB and ab initio data obtained by the linear-muffin-tin-orbital method in the atomic-sphere approximation (LMTO-ASA), respectively.
Figure 9. The EDTB band structure (dots) of bcc Mo at a0 = 5.8 a.u. in comparison with the ab initio LMTO-ASA band structure (solid lines).
Table 10. Results of the TB model for the equilibrium lattice constant a0, the elastic constants C11, C12 and C44, the vacancy formation energy $E_V^f$, and the formation energy $E_I^f$ of an octahedral interstitial atom in a relaxed supercell containing 16 sites, in comparison with results from ab initio MBPP calculations and experimental data.

| | a0 (a.u.) | C11 (Mbar) | C12 (Mbar) | C44 (Mbar) | E_V^f (eV) | E_I^f (eV) |
|---|---|---|---|---|---|---|
| TB | 5.935 | 4.75 ± 0.10 | 1.45 ± 0.10 | 0.99 ± 0.04 | 2.95 | 10.55 |
| MBPP | 5.926 | – | – | – | 2.90 ± 0.1 | 9.54 |
| Experiment | 5.945 [54] | 4.50 [55] | 1.73 [55] | 1.25 [55] | 2.9 [56] | – |
Figure 10. Comparison of the phonon frequencies in bcc Mo from the EDTB method (frozen-phonon calculation, full lines) and from inelastic neutron scattering [57] at T = 296 K (dots).
agree rather well with the data from experiments (C11, C12, $E_V^f$) and/or the MBPP calculation ($E_V^f$, $E_I^f$). The phonon dispersion curves calculated using the EDTB potential, shown in Fig. 10, also compare well with experiment. The TB model also describes well the structure, energy, and electronic properties of the Mo(001) surface reconstruction [58]. These benchmark results suggest that this tight-binding potential is accurate and suitable for molecular dynamics simulations of Mo in a variety of environments.
Using this environment-dependent tight-binding model, we have studied the core structure of the (a0/2)[111] screw dislocation in bcc molybdenum at T = 0 [59]. We carried out energy minimizations with two initial core structures: one generated by continuum theory, which has no polarity, and another fully relaxed with the Finnis–Sinclair potential, which has finite polarity (see Fig. 11, upper panel). In both cases, the atoms relax to the same configuration with the tight-binding potential: a zero-polarity core structure whose differential displacement (DD) map is plotted in Fig. 11, lower panel. The results predicted by the EDTB model are consistent with results from ab initio calculations, whereas all classical potentials favor structures with finite polarity.
3. More Applications of EDTB Potentials
3.1. Coupling TBMD with Genetic Algorithm for Structural Optimization
Global structure optimization for atomic clusters is a challenging problem. A widely used method is simulated annealing by molecular dynamics or Monte Carlo simulations. However, simulated annealing works well only for small clusters (typically fewer than 20 atoms). Because the number of metastable structures grows rapidly with the number of atoms, the time needed for a cluster with more than 20 atoms to reach its ground-state structure by simulated annealing is usually beyond the capability of our computers. An alternative optimization strategy inspired by the Darwinian evolution process [60], the genetic algorithm (GA), has been developed for atomistic structure optimization by Deaven and Ho [61, 62]. Coupling accurate TBMD simulations with the GA has led to an efficient TBMD/GA scheme for searching for candidate geometries of atomic clusters. In the TBMD/GA scheme, we start with a population of randomly generated structures. TBMD is used to relax the structures to the nearest local energy minimum. Using the energies of the relaxed structures as the fitness criterion, a fraction of the population (usually 10–20 different structures) is selected to be kept in the candidate pool. The next generation of candidates is then generated by a “cut-and-paste” mating operation [50, 61, 62] on parent structures selected from the candidate pool. When the structures of this new generation have been relaxed, the candidate pool is updated according to the fitness criterion mentioned above. This optimization procedure is repeated until the candidate pool is “converged”, i.e., no more low-energy structures can be found within a reasonable computational time. When the TBMD/GA structural optimization is done, the candidates that remain in the pool can be further evaluated by ab initio calculations in order to determine the ground-state structure.
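The loop described above (relax, rank by energy, mate, update the pool) can be sketched as follows. This is a minimal illustration under loud assumptions rather than the Deaven–Ho implementation: a toy Morse pair potential in reduced units stands in for the tight-binding total energy, plain steepest descent stands in for TBMD relaxation, and all function names and default settings (`energy_and_forces`, `relax`, `mate`, `genetic_search`, pool size, step size) are hypothetical.

```python
import numpy as np

def energy_and_forces(x):
    """Toy Morse pair potential with D = a = r0 = 1 (stand-in for the TB energy)."""
    d = x[:, None, :] - x[None, :, :]           # d[i, j] = x_i - x_j
    r = np.linalg.norm(d, axis=-1)
    np.fill_diagonal(r, np.inf)                 # exp(-inf) = 0 removes self-interaction
    e = np.exp(-(r - 1.0))
    energy = 0.5 * np.sum((1.0 - e) ** 2 - 1.0)
    dv_dr = 2.0 * (1.0 - e) * e
    forces = -np.sum((dv_dr / r)[:, :, None] * d, axis=1)
    return energy, forces

def relax(x, steps=500, step=0.05, fmax=1e-3):
    """Steepest descent toward a nearby local minimum (stand-in for TBMD relaxation)."""
    for _ in range(steps):
        _, f = energy_and_forces(x)
        if np.abs(f).max() < fmax:
            break
        x = x + step * f
    return x, energy_and_forces(x)[0]

def mate(a, b, rng):
    """Cut-and-paste: upper part of parent a joined to lower part of parent b."""
    n = len(a)
    a = a - a.mean(axis=0)                      # center both parents
    b = b - b.mean(axis=0)
    normal = rng.normal(size=3)                 # random orientation of the cutting plane
    normal /= np.linalg.norm(normal)
    k = n // 2
    top = a[np.argsort(a @ normal)[n - k:]]     # k atoms of a furthest along the normal
    bottom = b[np.argsort(b @ normal)[:n - k]]  # n - k atoms of b furthest against it
    return np.vstack([top, bottom])

def genetic_search(n_atoms, pool_size=4, generations=10, seed=0):
    rng = np.random.default_rng(seed)
    pool = [relax(rng.uniform(-1.0, 1.0, size=(n_atoms, 3))) for _ in range(pool_size)]
    pool.sort(key=lambda s: s[1])
    history = [pool[0][1]]                      # best energy after each generation
    for _ in range(generations):
        children = []
        for _ in range(pool_size):
            ia, ib = rng.choice(pool_size, size=2, replace=False)
            children.append(relax(mate(pool[ia][0], pool[ib][0], rng)))
        # Keep the lowest-energy structures among parents and children.
        pool = sorted(pool + children, key=lambda s: s[1])[:pool_size]
        history.append(pool[0][1])
    return pool[0], history

(best_x, best_e), history = genetic_search(n_atoms=6)
```

Because parents and children compete for places in the pool, the best energy in `history` can never increase from one generation to the next. A practical implementation would additionally reject near-duplicate structures from the pool so that it retains structural diversity.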
Figure 11. (Upper panel) DD map of the Mo screw dislocation using the Finnis–Sinclair potential. (Lower panel) DD map of the Mo screw dislocation using the environment-dependent tight-binding potential.
Deaven and Ho first applied this algorithm to optimize the geometry of carbon clusters up to C60 [61] using the XWCH carbon tight-binding potential. In all cases studied, the algorithm succeeded in finding the ground-state structures starting from an unbiased population of random atomic structures. This performance is very impressive, since the strong directional bonds in carbon systems result in large energy barriers between different isomers. Although there have been many previous attempts to generate the C60 buckyball structure by atomistic simulated annealing, none has yielded the ground-state structure [63, 64]. The genetic algorithm approach dramatically outperforms simulated annealing and arrives at the lowest-energy structure of C60 (the icosahedral buckminsterfullerene cage) in a relatively short simulation time, as one can see from Fig. 12. Ho et al. have also applied the TBMD/GA scheme to determine the structures of medium-sized silicon clusters Si$_n$ (n = 11–20) [50, 51]. Owing to the complexity of the bonding character, the structures of Si$_n$ clusters with n ≥ 11 have been an outstanding challenge. The structural optimizations require an accurate description of the interatomic potential that empirical classical potentials cannot provide. Using the accurate environment-dependent silicon tight-binding potential developed by Wang et al. [15] and the efficient GA described above, Ho et al. were able to locate candidate structures for the medium-sized silicon clusters Si$_n$ (n = 11–20) [50, 51]. The structures obtained by the TBMD/GA search were further studied by ab initio calculations, and the ground-state structures of the clusters were determined. The results are shown in Fig. 13. The properties of the clusters are in excellent agreement with experimental measurements [50, 51, 65, 66].

Figure 12. Generation of the C60 molecule, starting from random coordinates, using the genetic algorithm with four candidates (p = 4). The energy per atom is plotted for the lowest-energy (solid line) and highest-energy (dashed line) candidate structures as a function of MD time steps. Mating operations among the four candidates are performed every 30 MD time steps [61].
Figure 13. Structures of silicon clusters Si$_n$ (n = 11–20) obtained from TBMD/GA global optimization and DFT-GGA calculations. The binding energies (per atom) below the structures are from DFT-GGA calculations. A larger binding energy indicates that the isomer is more stable [51].
The TBMD/GA approach has also recently been applied to investigate the low-energy structures of hydrogenated silicon clusters (Si7H2m (m = 1–7) and Si8H8) [67, 68] and the structures of silicon “magic” clusters on the Si(111)-(7×7) surface [69]. These studies demonstrate that the combination of TBMD with GA is a powerful tool for global structure optimization of atomic clusters and surfaces.
3.2. Inclusion of Electronic Entropy Effects in TBMD
At high temperatures, electronic entropy plays a significant role in molecular dynamics simulations. Since the electronic structure is calculated explicitly in the tight-binding potential model, the effects of electronic entropy can be taken into account properly in TBMD simulations. At a finite electronic temperature $T_{el}$, the Fermi–Dirac (FD) distribution can be used to describe the occupation of the electronic states in the energy and force calculations in the TBMD simulations:

$$E_{TB} = 2 \sum_i \varepsilon_i f_i, \tag{14}$$

$$\mathbf{F}_l = -2 \sum_i \left\langle \psi_i \left| \frac{\partial H_{TB}}{\partial \mathbf{R}_l} \right| \psi_i \right\rangle f_i - 2 \sum_i \varepsilon_i \cdot \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l} \tag{15}$$

where

$$f_i = \frac{1}{e^{(\varepsilon_i - \mu)/k_B T_{el}} + 1} \tag{16}$$

µ, the chemical potential, is adjusted at every time step to guarantee the conservation of the total number of electrons $N_{el}$:

$$2 \sum_i f_i = N_{el} \tag{17}$$
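The occupation step defined by Eqs. (14), (16), and (17), together with the standard Fermi–Dirac electronic entropy of the occupations, can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the eigenvalues are assumed to be in eV, bisection is one simple choice for locating µ (the source does not specify how µ is adjusted), and the Fermi function is evaluated through the algebraically identical tanh form to avoid overflow.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def fermi(eps, mu, t_el):
    """Fermi-Dirac occupation, Eq. (16), written as 0.5 * (1 - tanh(x / 2))."""
    return 0.5 * (1.0 - np.tanh(0.5 * (eps - mu) / (K_B * t_el)))

def chemical_potential(eps, n_el, t_el, tol=1e-12):
    """Bisect for mu such that 2 * sum_i f_i = N_el, Eq. (17)."""
    lo = eps.min() - 50.0 * K_B * t_el
    hi = eps.max() + 50.0 * K_B * t_el
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if 2.0 * fermi(eps, mu, t_el).sum() < n_el:
            lo = mu                              # too few electrons: raise mu
        else:
            hi = mu
    return 0.5 * (lo + hi)

def occupations(eps, n_el, t_el):
    """Return mu, f_i, the band energy of Eq. (14), and the electronic entropy."""
    mu = chemical_potential(eps, n_el, t_el)
    f = fermi(eps, mu, t_el)
    e_band = 2.0 * np.sum(eps * f)
    s_el = 0.0
    for p in (f, 1.0 - f):                       # -2 k_B sum [f ln f + (1-f) ln(1-f)]
        p = p[p > 0.0]                           # 0 * ln(0) contributes nothing
        s_el -= 2.0 * K_B * np.sum(p * np.log(p))
    return mu, f, e_band, s_el

# Usage: a toy two-level spectrum at the two electron temperatures used in the text.
eps = np.array([-1.0, 1.0])
for t_el in (2700.0, 15000.0):
    mu, f, e_band, s_el = occupations(eps, n_el=2.0, t_el=t_el)
    free_energy = e_band - t_el * s_el
```

In an actual TBMD step the eigenvalues would come from diagonalizing the tight-binding Hamiltonian, and `occupations` would be called once per time step before the forces are evaluated.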
It was pointed out by Pederson and Jackson [70] that it is very difficult to calculate the second term in Eq. (15) in first-principles molecular dynamics simulations. However, Wentzcovitch et al. [71] introduced the Mermin free energy [72]:

$$G = E_{total} + K_I - T_{el} S_{el}, \tag{18}$$

$$S_{el} = -2 k_B \sum_i \left[ f_i \ln f_i + (1 - f_i) \ln(1 - f_i) \right] \tag{19}$$
and showed that MD simulations conserve the free energy G if one drops the second term in Eq. (15). It can also be shown analytically that only the first term in Eq. (15) is required if the Hellmann–Feynman forces are calculated using
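the electronic free energy instead of the electronic energy $E_{TB}$ [73]. The second term in Eq. (15) is canceled by the derivative of the electronic entropy $S_{el}$:

$$\frac{\partial(T_{el} S_{el})}{\partial \mathbf{R}_l} = T_{el} \sum_i \frac{\partial S_{el}}{\partial f_i} \cdot \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l} \tag{20}$$

Equation (20) can be rewritten as

$$\frac{\partial(T_{el} S_{el})}{\partial \mathbf{R}_l} = -2 k_B T_{el} \sum_i \frac{\partial \left[ f_i \ln f_i + (1 - f_i) \ln(1 - f_i) \right]}{\partial f_i} \times \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l} \tag{21}$$

and after some simple algebra, Eq. (20) becomes

$$\frac{\partial(T_{el} S_{el})}{\partial \mathbf{R}_l} = 2 \sum_i \varepsilon_i \cdot \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l} \tag{22}$$

Here, conservation of the total number of electrons is assumed:

$$\sum_i \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l} = \frac{\partial}{\partial \mathbf{R}_l} \sum_i f_i = \frac{1}{2} \frac{\partial N_{el}}{\partial \mathbf{R}_l} = 0 \tag{23}$$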
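For readers who want the intermediate step, the "simple algebra" leading from Eq. (21) to Eq. (22) can be spelled out as below. This is a reconstruction from the definitions in Eqs. (16), (19), and (23), not a passage from the original text.

```latex
\begin{align*}
\frac{\partial}{\partial f_i}\left[ f_i \ln f_i + (1 - f_i)\ln(1 - f_i) \right]
  &= \ln\frac{f_i}{1 - f_i}
   = -\frac{\varepsilon_i - \mu}{k_B T_{el}}
   \qquad \text{(using Eq. (16))}, \\[4pt]
\frac{\partial (T_{el} S_{el})}{\partial \mathbf{R}_l}
  &= 2 \sum_i (\varepsilon_i - \mu)\,
     \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot
     \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l} \\
  &= 2 \sum_i \varepsilon_i\,
     \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot
     \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l}
   \;-\; 2\mu \underbrace{\sum_i
     \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot
     \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l}}_{=\,0 \text{ by Eq. (23)}} .
\end{align*}
```

The first line follows because Eq. (16) gives $f_i/(1 - f_i) = e^{-(\varepsilon_i - \mu)/k_B T_{el}}$; the term proportional to µ drops out through electron-number conservation, leaving Eq. (22).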
Thus, the second term in Eq. (15) is canceled by the derivative of the electronic entropy, and the first term is $-\partial(E_{TB} - T_{el} S_{el})/\partial \mathbf{R}_l$. The inclusion of electronic temperature effects not only avoids the instability caused by changes in the occupancies of states near the Fermi level in metallic systems, but also brings the effects of electronic entropy into the calculation in a very convenient manner. Explicit inclusion of the electronic entropy in TBMD simulations allows us to investigate the behavior of the system when the electrons are highly excited, for example, by ultrafast laser pulses in laser ablation experiments. We have taken advantage of this approach and performed TBMD simulations to study structural changes on diamond (111) surfaces under laser irradiation [74]. The simulation results shown in Figs. 14 and 15 indicate that laser-treated diamond surfaces behave differently depending on the duration of the laser pulses used. Under nanosecond or longer pulses, the diamond (111) surface graphitizes via the formation of graphite–diamond interfaces driven by thermal fluctuations, leading to a mixed graphite–diamond surface after the laser treatment (see Fig. 14). With femtosecond laser pulses, graphitization of the surface is a nonthermal process that occurs in a layer-by-layer fashion, resulting in a clean diamond surface after the process (see Fig. 15). These results provide a microscopic explanation of experimentally observed differences in laser ablation of diamond surfaces [74]. With femtosecond pulses, there is efficient removal of material and the surface retains a diamond Raman
Figure 14. Graphitization of the diamond (111) surface via the thermal process. The snapshots are taken from the tight-binding molecular dynamics simulation in which the electrons and the ions are thermally equilibrated at 2700 K. The plots show the side view of the simulation unit cell, which is a 12-layer slab with two (111) surfaces (the top and bottom layers). In-plane periodic boundary conditions are imposed. Graphitization is found to occur through the formation of graphite–diamond interfaces (see (d)). The whole process takes about 3 ps [74].
signal after ablation, whereas with nanosecond pulses, the Raman signal characteristic of the diamond lattice disappears after ablation. In these simulations, the electronic entropy plays an important role in driving the graphitization under the femtosecond laser pulses. As shown in Fig. 16, the free energies (including the electronic entropy) calculated along the diamond-to-graphite transition pathway under the two different laser pulses (electron temperatures of 2700 and 15 000 K represent the system under nanosecond and femtosecond laser pulses, respectively; see Ref. [74] for details) are very different. The free-energy barrier at the higher electron temperature of 15 000 K is much
Figure 15. Graphitization of the diamond (111) surface due to the effects of the hot electron plasma (nonthermal process). The snapshots are taken from the tight-binding molecular dynamics simulation in which the electronic temperature is raised to 15 000 K and the ions evolve freely. The orientation of the simulation unit cell is the same as specified in Fig. 14. Note that the graphitization takes place in a layer-by-layer fashion. The slab is graphitized completely within 500 fs of simulation time [74].
Figure 16. The free energy (potential energy plus the contribution of electronic entropy) as a function of inter-layer distance along the diamond to rhombohedral graphite transition path at three given intra-layer lattice constants. (a) and (b) show the results at electron temperatures of 2700 K and 15 000 K, respectively. The intra-layer lattice constants correspond to diamond-structure lattice constants of 3.50 Å (open circles and dotted lines), 3.58 Å (filled circles and solid lines), and 3.66 Å (open squares and dashed lines), respectively [74].
smaller than that at low electronic temperatures. This will make the graphitization transition much easier at high electronic temperatures. It is also interesting to note that the free energy of the graphite phase is much lower than that of the diamond phase (by about 0.3 eV/atom) at high electronic temperatures. The free energy gain due to initial graphitization caused by femtosecond laser pulses will help the system complete the transition very quickly.
3.3.
TBMD Simulation of Nanodiamond to Carbon Cage Transformation
Nanometer-sized diamonds have been found in interstellar dust [75], solid detonation products [76], and diamond-like films [77]. Recently, Raty et al. [78] performed ab initio calculations and TBMD simulations to study the
structure of a nanodiamond and found that the carbon nanoparticle consists of a diamond core and a reconstructed fullerene-like surface. Experiments have shown that diamond nanoparticles of diameter ∼5 nm can be transformed into spherical and polyhedral carbon onions at high temperatures [79–81]. Using the environment-dependent carbon tight-binding potential developed by Tang et al. [14], Lee et al. have recently performed tight-binding molecular dynamics simulations to study the structural transformation of nanodiamond at high temperature. The simulations show that upon annealing up to 2500 K, a 1.4 nm-diameter nanodiamond is transformed into a cage structure that looks like a single-walled capped nanotube [82]. The simulation was performed with a bulk-terminated carbon cluster of 275 atoms within a sphere of diameter 1.4 nm cut from bulk diamond. This cluster was relaxed using the steepest-descent method with the environment-dependent tight-binding carbon potential. The cluster structure after the relaxation consists of a diamond core and a fullerene-like reconstructed surface, similar to the structure obtained by ab initio calculation [78]. Starting from this relaxed cluster geometry, a TBMD simulation was performed to investigate the structural transformation of the nanodiamond at high temperatures. Snapshots of the system during its structural transformation from a nanodiamond into a capped nanotube are shown in Fig. 17. The nanodiamond cluster was heated up to about 2500 K by constant-temperature molecular dynamics simulations. Near 2500 K, as shown in Fig. 17(b), the (111) surface layer of the nanodiamond begins to graphitize after a simulation time of 3 ps; exfoliation of the graphitized (111) layer occurs by breaking the bonds between graphene fragments and the underlying “core” atoms. This resembles the graphitization process of the (111) surface of bulk diamond induced by nanosecond laser pulses as discussed in the previous subsection [74].
As the simulation continues, the graphitized layer evaporates, breaking down into carbon dimers one by one from the end of the layer. Similarly, the $(\bar{1}\bar{1}\bar{1})$ surface layer consisting of three pentagons undergoes the same exfoliation and evaporation process as that of the (111) surface. At about 18 ps, as shown in Fig. 17(c), the graphitization process extends to the entire cluster surface. The “core” and the surface “shell” start to separate at the bottom side of the cluster. As the bonds between the “core” and “shell” atoms start to break, the cluster begins to inflate like a bubble. At this stage, if the thermostat were maintained to keep the system at a constant temperature of 2500 K, the whole cluster would completely evaporate within a simulation time of 45 ps. To prevent full vaporization, the system is cooled by decreasing the temperature from 2500 K to 2000 K in 10 ps (during the simulation time between 25 and 35 ps). The cluster is then further cooled down to a temperature of ∼1500 K in 20 ps, when a stable cage structure is found to form. The two holes (“H1” and “H2” in Fig. 17d), generated by the successive breaking of bonds among surface atoms, play an important role in pumping inner carbon atoms out onto the surface to
Figure 17. Atomic processes of the structural transformation of nanodiamond to capped nanotube by successive annealings. (a) 0 K (at time t = 0 ps), (b) ∼2500 K (t ≈ 3 ps), (c) ∼2500 K (t ≈ 19 ps), (d) ∼2100 K (t ≈ 35 ps), (e) ∼1900 K (t ≈ 50 ps), (f) ∼20 K (t ≈ 120 ps). Simulated annealings with temperatures up to 3000 K are performed during the process (e)→(f). White indicates atoms and bonds of two-fold or lower coordination. Red and yellow indicate atoms and bonds of three-fold and four-fold coordination, respectively. Green indicates atoms and bonds of five-fold or higher coordination. Note that two holes, H1 and H2, are created in (d) [82].
form the graphitic layer. This process is referred to as the “flow-out” mechanism by Lee et al. During the annealing process (stages (e)–(f) in Fig. 17), two other interesting atomic processes, namely the “direct absorption” and “push-out” mechanisms, have also been identified from the simulations as playing a crucial role in the conversion of the residual inner carbon atoms of the nanodiamond into the surface atoms of the nanotube [82].
4.
Recent Developments and Future Perspective
In order to develop more accurate and transferable tight-binding models for large-scale atomistic simulations, it is important to understand quantitatively the errors introduced by the various approximations used in the tight-binding models. In this regard, detailed information from first-principles calculations would be very useful. In general, overlap and one-electron Hamiltonian matrices from first-principles calculations cannot be used directly to infer the tight-binding parameters because fully converged first-principles calculations are done using a large basis set while tight-binding parameters are based on a minimal basis representation. Very recently, the authors and co-workers have developed a method for projecting a set of chemically deformed atomic minimal-basis-set orbitals from accurate first-principles wavefunctions [16–18]. These orbitals, referred to as “quasi-atomic minimal-basis-set orbitals” (QUAMBOs), are highly localized on atoms and exhibit shapes close to orbitals of the isolated atom. Moreover, the QUAMBOs span exactly the same occupied subspace as the original first-principles calculation with a large basis set. Therefore, accurate tight-binding Hamiltonian and overlap matrix elements can be obtained directly from ab initio calculations through the construction of QUAMBOs. This new development enables us to examine the accuracy and transferability of the tight-binding models from a first-principles perspective.

The key step in constructing the above-mentioned QUAMBOs is the selection of a small subset of unoccupied orbitals from the entire virtual space that have maximal overlap with the atomic orbitals of interest. For simplicity, let us assume that we are dealing with a nonperiodic system (i.e., a cluster). Generalization to periodic systems involves Bloch sums and is straightforward [18]. Suppose that a set of occupied valence orbitals $\phi_n$ ($n = 1, 2, \ldots, N_{val}$) and virtual orbitals $\phi_v$ ($v = N_{val} + 1, N_{val} + 2, \ldots, N_{val} + N_{vir}$) are obtained from ab initio molecular orbital calculations. Our objective is to construct a set of quasi-atomic orbitals $A_{i\alpha}$ spanned by the occupied valence orbitals $\phi_n$ and a small subset of orthogonal virtual orbitals $\varphi_p$ ($\varphi_p = \sum_v T_{p,v} \phi_v$, $p = 1, 2, \ldots, N_p < N_{vir}$). That is,

$$A = \sum_n a_n \phi_n + \sum_p b_p \varphi_p = \sum_n a_n \phi_n + \sum_{p,v} b_p T_{p,v} \phi_v = \sum_n a_n \phi_n + \sum_v a_v \phi_v \tag{24}$$

where $a_v = \sum_p b_p T_{p,v}$, and $b_p = \sum_v a_v T_{p,v}$ (because the $\varphi_p$ are orthogonal, i.e., $\sum_v T_{p,v} T_{q,v} = \delta_{p,q}$). The orbital index $i\alpha$, where $i$ is the index of the atom and $\alpha$ denotes the orbital type (e.g., s or p orbitals), is implied in Eq. (24) and will also be omitted in the rest of this subsection unless specified. The
requirement is that $A_{i\alpha}$ should be as close as possible to the corresponding free-atom orbitals $A^*_{i\alpha}$. Mathematically, this is a problem of minimizing $\langle A - A^* | A - A^* \rangle$ under the side condition $\langle A | A \rangle = 1$. Therefore the Lagrangian for this minimization problem is

$$L = \langle A - A^* | A - A^* \rangle - \lambda (\langle A | A \rangle - 1) = \sum_n (a_n - a_n^*)^2 + \sum_v (a_v - a_v^*)^2 - \lambda \left[ \sum_n a_n^2 + \sum_v a_v^2 - 1 \right] \tag{25}$$
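The independent variables are $a_n$ and $b_p$. Using the side condition, the Lagrange minimization leads to

$$A = D^{-1/2} \left[ \sum_n \phi_n \langle \phi_n | A^* \rangle + \sum_p \varphi_p \langle \varphi_p | A^* \rangle \right] \tag{26}$$

with

$$D = \sum_n \langle \phi_n | A^* \rangle^2 + \sum_p \langle \varphi_p | A^* \rangle^2 \tag{27}$$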
For this optimized $A$, the total mean-square deviation from $A^*$ is

$$\langle A - A^* | A - A^* \rangle = 2 (1 - \langle A | A^* \rangle) \tag{28}$$
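The projection of Eqs. (26)–(28) can be sketched in a few lines of NumPy by working with expansion coefficients in the orthonormal basis of occupied and virtual orbitals. Everything specific in the sketch is an assumption made for illustration: the orbital counts, the random unit vectors that stand in for the free-atom orbitals $A^*$, the choice $N_p$ = (number of atomic orbitals) − $N_{val}$, and the premise that the occupied and virtual orbitals together fully represent each $A^*$. The retained virtual combinations are taken as the leading eigenvectors of a virtual-space overlap matrix, which is the selection rule the text describes next.

```python
import numpy as np

rng = np.random.default_rng(0)
n_val, n_vir, n_atomic = 3, 12, 5   # hypothetical sizes
n_p = n_atomic - n_val              # assumed number of retained virtual combinations

# Row k holds the coefficients <phi|A*_k> of one free-atom orbital in the
# orthonormal basis {phi_n, phi_v}; random unit vectors stand in for real data.
c_star = rng.normal(size=(n_atomic, n_val + n_vir))
c_star /= np.linalg.norm(c_star, axis=1, keepdims=True)
occ, vir = c_star[:, :n_val], c_star[:, n_val:]

# Virtual-space matrix B_{mu,nu} = sum_k <phi_mu|A*_k><A*_k|phi_nu>
B = vir.T @ vir
eigvals, eigvecs = np.linalg.eigh(B)   # eigenvalues in ascending order
T = eigvecs[:, -n_p:].T                # rows p, columns v, so T @ T.T is the identity

# Eqs. (26)-(27): keep the occupied part of A* plus its part in the varphi_p subset
b_star = vir @ T.T                     # <varphi_p|A*_k>
D = (occ**2).sum(axis=1) + (b_star**2).sum(axis=1)
A = np.hstack([occ, b_star @ T]) / np.sqrt(D)[:, None]

# Eq. (28): the mean-square deviation equals 2 (1 - <A|A*>)
overlap = (A * c_star).sum(axis=1)
deviation = ((A - c_star) ** 2).sum(axis=1)
print(np.allclose(deviation, 2.0 * (1.0 - overlap)))
```

In this coefficient representation the overlap $\langle A | A^* \rangle$ equals $D^{1/2}$, so the deviation in Eq. (28) shrinks as more of $A^*$ is captured by the occupied orbitals and the selected virtual subset.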
It is clear from Eqs. (27) and (28) that the key step in obtaining the quasi-atomic minimal-basis-set orbitals is to select a subset of virtual orbitals $\varphi_p$ that have maximal overlap with the set of atomic orbitals $A^*_{i\alpha}$, i.e., $S = \sum_{i\alpha, p} \langle \varphi_p | A^*_{i\alpha} \rangle \langle A^*_{i\alpha} | \varphi_p \rangle$ is maximized. This can be achieved by forming the rectangular matrix $T$, which defines the subset of virtual states $\varphi_p$ ($p = 1, 2, \ldots, N_p$), from the eigenvectors with the largest eigenvalues of the matrix

$$B_{\mu,\nu} = \sum_{i\alpha} \langle \phi_\mu | A^*_{i\alpha} \rangle \langle A^*_{i\alpha} | \phi_\nu \rangle \tag{29}$$

where both indices, $\mu$ and $\nu$, run over the range of the virtual space. Once the $\varphi_p$ are determined, the localized QUAMBOs are constructed using Eqs. (26) and (27). As shown in Fig. 18, the QUAMBOs constructed by such a scheme are indeed atomic-like and well localized on the atoms. However, these QUAMBOs are different from the atomic orbitals of the free atoms because they are deformed according to the bonding environment. The Hamiltonian matrix in the QUAMBO representation by construction preserves the occupied valence subspace from the first-principles calculations, so it should give the same eigenvalues and eigenvectors for the occupied states as the first-principles calculations. This property can be seen from Fig. 19 where the eigenvalues of
Figure 18. QUAMBOs for the four non-equivalent atoms in a Si10 cluster. The QUAMBOs are similar to the 3s and 3p orbitals of a free silicon atom but are deformed according to the environment of the atoms [17].
a Si10 cluster from first-principles calculations and from the QUAMBO-based tight-binding Hamiltonian are compared. Once the QUAMBOs have been constructed, the overlap and one-electron Hamiltonian matrices from the first-principles calculations are readily calculated in terms of the QUAMBOs. The Slater–Koster tight-binding parameters can then be extracted by decomposing the matrix elements using the Slater–Koster geometrical factors [1]. Using Si as an example, Lu et al. [19] have performed calculations for three types (diamond, sc, and fcc) of bulk-fragment clusters, with several different bond lengths for each type of cluster, so that the tight-binding parameters can be studied in different bonding environments. Figure 20 shows the overlap parameters $S_{ss\sigma}$, $S_{sp\sigma}$, $S_{pp\sigma}$, and $S_{pp\pi}$
Figure 19. Electronic eigenvalues of Si10 in terms of QUAMBOs (QUAMBO Space) are compared to those from self-consistent DFT (Full AO Space) calculations. Note that the occupied states (below −5.0 eV) are exactly reproduced in the QUAMBO space [17].
from different structures and different pairs of atoms, plotted as a function of interatomic distance. Note that the two-center nature of overlap integrals for fixed atomic minimal-basis orbitals may not necessarily hold for the QUAMBOs because QUAMBOs are deformed according to the bonding environments of the atoms. Nevertheless, the overlap parameters obtained from the calculations, as plotted in Fig. 20, fall nicely onto smooth scaling curves. These results suggest that the two-center approximation is adequate for overlap integrals. By contrast, the hopping parameters plotted in Fig. 21 are far from being transferable, especially $h_{pp\sigma}$. Even for a given pair of atoms, the hopping parameters $h_{pp\sigma}$ and $h_{pp\pi}$ obtained from the decomposition of different matrix elements can exhibit slightly different values, especially for the sc and fcc structures. The hopping parameters from different structures do not follow the same scaling curve. For a given crystal-like structure, although the bond-length dependence of the hopping parameters for the first- and second-neighbor interactions can each be fitted to a separate smooth scaling curve, these two scaling curves cannot be joined together to define a unique transferable scaling function for the structure. Besides the hopping parameters, crystal-field effects on the on-site atomic energies can also be seen clearly from the density and structure dependence of the on-site matrix elements as
Figure 20. The overlap integrals as a function of interatomic distance for Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures.
plotted in Fig. 22. These results suggest that it is necessary to go beyond the two-center approximation even in a nonorthogonal tight-binding scheme. The QUAMBO-based tight-binding analysis will provide very useful input for future tight-binding model developments. Expressing the tight-binding Hamiltonian matrix in terms of QUAMBOs also allows us to address the effects of orthogonality on the transferability of tight-binding models from a first-principles perspective. By applying the Löwdin transformation to the QUAMBOs described above, we can obtain a set of orthogonal QUAMBOs. The Hamiltonian matrix, and hence the Slater–Koster hopping integrals, in the orthogonal QUAMBO representation can then be calculated. As shown in Fig. 23, the hopping parameters decay much faster than their nonorthogonal counterparts. The interactions in the orthogonal tight-binding scheme are essentially dominated by first-neighbor interactions, which depend not only on the interatomic separations but also on the coordination of the structures. In contrast to the nonorthogonal model, the magnitudes of the hopping parameters decrease as the coordination number
Figure 21. The nonorthogonal tight-binding hopping integrals for Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures obtained by decomposing the QUAMBO-based one-electron Hamiltonian according to the Slater–Koster tight-binding scheme.
Figure 22. The non-orthogonal QUAMBO-based Hamiltonian diagonal matrix elements ($E_s$ and $E_p$) of Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures are plotted as a function of density.
Figure 23. The orthogonal tight-binding hopping integrals for Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures obtained by decomposing the QUAMBO-based one-electron Hamiltonian according to the Slater–Koster tight-binding scheme.
of the structure increases. This coordination dependence of the hopping parameters and the short-range nature of the interactions are well described by the environment-dependent tight-binding model of Wang et al. [14, 15]. The on-site energies are also found to depend on the structures and densities. In contrast to the behavior in the nonorthogonal case described above, the on-site energies in the orthogonal tight-binding scheme decrease as the density is decreased, as one can see from Fig. 24. This behavior of the diagonal tight-binding matrix elements is also described by the environment-dependent tight-binding model of Wang et al. [14, 15]. However, in the orthogonal QUAMBO description, the second- and higher-neighbor hopping parameters, though small, are not entirely negligible. In particular, some hopping parameters in the orthogonal scheme are found to change sign for second and higher neighbors. Such behavior needs to be taken into account properly in future tight-binding model developments.
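The decomposition of interatomic matrix elements with the Slater–Koster geometrical factors [1], used above to extract the two-center parameters, can be illustrated for a single pair of atoms carrying s and p orbitals. The Python sketch below is a minimal illustration under stated assumptions: the function names, the bond direction, and the parameter values are hypothetical, and the extraction step is one simple projection among several possible choices. For QUAMBO-derived matrices the text notes that different matrix elements can give slightly different values, so an exact round trip such as the one shown here should not be expected in general.

```python
import math

def sk_block(d, v_sss, v_sps, v_pps, v_ppp):
    """Two-center 4x4 block in the (s, px, py, pz) basis for a bond along the
    unit vector d = (l, m, n) pointing from the first atom to the second."""
    blk = [[0.0] * 4 for _ in range(4)]
    blk[0][0] = v_sss
    for a in range(3):
        blk[0][a + 1] = d[a] * v_sps     # E_{s,x} = l V_sp_sigma
        blk[a + 1][0] = -d[a] * v_sps    # E_{x,s} = -l V_sp_sigma
        for b in range(3):
            delta = 1.0 if a == b else 0.0
            # E_{x,y} = l m V_pp_sigma + (delta_xy - l m) V_pp_pi
            blk[a + 1][b + 1] = d[a] * d[b] * v_pps + (delta - d[a] * d[b]) * v_ppp
    return blk

def sk_extract(blk, d):
    """Recover (V_ss_sigma, V_sp_sigma, V_pp_sigma, V_pp_pi) by projecting the
    block onto the bond direction d (assumes a unit vector)."""
    v_sss = blk[0][0]
    v_sps = sum(d[a] * blk[0][a + 1] for a in range(3))
    v_pps = sum(d[a] * blk[a + 1][b + 1] * d[b] for a in range(3) for b in range(3))
    trace_pp = sum(blk[a + 1][a + 1] for a in range(3))   # = V_pp_sigma + 2 V_pp_pi
    v_ppp = 0.5 * (trace_pp - v_pps)
    return v_sss, v_sps, v_pps, v_ppp

norm = math.sqrt(1.0 + 4.0 + 4.0)
d = (1.0 / norm, 2.0 / norm, 2.0 / norm)   # hypothetical bond direction
params = (-1.9, 1.8, 3.0, -0.8)            # hypothetical two-center parameters (eV)
block = sk_block(d, *params)
recovered = sk_extract(block, d)
print(recovered)
```

Repeating the extraction for many atom pairs and bond lengths, and plotting the recovered parameters against interatomic distance, is the kind of analysis summarized in Figs. 20, 21 and 23.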
Figure 24. The orthogonal QUAMBO-based Hamiltonian diagonal matrix elements ($E_s$ and $E_p$) of Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures are plotted as a function of density.
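The Löwdin transformation that takes a nonorthogonal representation to an orthogonal one can be written compactly as $H' = S^{-1/2} H S^{-1/2}$. The NumPy sketch below applies it to a toy model; the three-site chain, the on-site energy, and the hopping and overlap values are assumptions chosen only for illustration and are not QUAMBO data. It checks that the orthogonalized Hamiltonian reproduces the eigenvalues of the generalized problem $Hc = ESc$.

```python
import numpy as np

def lowdin_orthogonalize(H, S):
    """Return (S^{-1/2} H S^{-1/2}, S^{-1/2}) for a symmetric positive-definite S."""
    s, U = np.linalg.eigh(S)
    S_inv_half = (U / np.sqrt(s)) @ U.T   # U diag(s^{-1/2}) U^T
    return S_inv_half @ H @ S_inv_half, S_inv_half

# Hypothetical chain of three sites with one orbital each and first-neighbor terms only
eps, hop, ovl = -5.0, -2.0, 0.25
H = np.array([[eps, hop, 0.0], [hop, eps, hop], [0.0, hop, eps]])
S = np.array([[1.0, ovl, 0.0], [ovl, 1.0, ovl], [0.0, ovl, 1.0]])

H_orth, X = lowdin_orthogonalize(H, S)
E_orth = np.linalg.eigvalsh(H_orth)                             # orthogonal problem
E_gen = np.sort(np.linalg.eigvals(np.linalg.solve(S, H)).real)  # H c = E S c
print(E_orth)
print(E_gen)
```

Because the transformation mixes orbitals on neighboring sites, the matrix elements of `H_orth` differ from those of `H` even though the two spectra coincide, which is why the orthogonal and nonorthogonal hopping parameters have to be analyzed separately.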
Acknowledgments
We would like to thank Dr Wencai Lu for help in preparing the manuscript and the figures. We also thank Drs Songyou Wang and Cristian Ciobanu for performing the two-center TB and classical II calculations for Fig. 6. Ames Laboratory is operated for the U.S. Department of Energy by Iowa State University under Contract No. W-7405-Eng-82. This work was supported by the Director for Energy Research, Office of Basic Energy Sciences including a grant of computer time at the National Energy Research Supercomputing Center (NERSC) in Berkeley.
References
[1] J.C. Slater and G.F. Koster, Phys. Rev., 94, 1498, 1954.
[2] C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 39, 8592, 1989.
[3] F.S. Khan and J.Q. Broughton, Phys. Rev. B, 39, 3688, 1989.
[4] L. Goodwin, A.J. Skinner, and D.G. Pettifor, Europhys. Lett., 9, 701, 1989.
[5] C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 42, 11276, 1990.
[6] J.L. Mercer, Jr. and M.Y. Chou, Phys. Rev. B, 47, 9366, 1993.
[7] M.J. Mehl and D.A. Papaconstantopoulos, In: C.Y. Fong (ed.), Topics in Computational Materials Science, World Scientific, Singapore, pp. 169–213, 1997.
[8] C.Z. Wang and K.M. Ho, In: I. Prigogine and S.A. Rice (eds.), Advances in Chem. Phys., Vol. XCIII, John Wiley & Sons, New York, pp. 651–702, 1996.
[9] L. Colombo, Annu. Rev. Comput. Phys., 147(IV), 1, 1996.
[10] C.Z. Wang, B.L. Zhang, K.M. Ho, and X.Q. Wang, Int. J. Mod. Phys. B, 7, 4305, 1993.
[11] C.Z. Wang, B.L. Zhang, and K.M. Ho, In: D.A. Jelski and T.F. George (eds.), Computational Studies of New Materials, World Scientific, Singapore, pp. 74–111, 1999.
[12] C.Z. Wang and K.M. Ho, J. Comput. Theor. Nanosci., 1, 1, 2004.
[13] P.O. Lowdin, J. Chem. Phys., 18, 365, 1950.
[14] M.S. Tang, C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 53, 979, 1996.
[15] C.Z. Wang, B.C. Pan, and K.M. Ho, J. Phys. Condens. Matter, 11, 2043, 1999.
[16] W.C. Lu, C.Z. Wang, M.W. Schmidt, L. Bytautas, K.M. Ho, and K. Ruedenberg, J. Chem. Phys., 120, 2629, 2004.
[17] W.C. Lu, C.Z. Wang, M.W. Schmidt, L. Bytautas, K.M. Ho, and K. Ruedenberg, J. Chem. Phys., 120, 2638, 2004.
[18] W.C. Lu, C.Z. Wang, Z.L. Chan, K. Ruedenberg, and K.M. Ho, Phys. Rev. B, 70, 041101, 2004.
[19] W.C. Lu, C.Z. Wang, K. Ruedenberg, and K.M. Ho, to be published.
[20] C.H. Xu, C.Z. Wang, C.T. Chan, and K.M. Ho, J. Phys. Condens. Matter, 4, 6047, 1992.
[21] W.A. Harrison, Electronic Structure and the Properties of Solids, Freeman, San Francisco, 1980.
[22] D.J. Chadi, Phys. Rev. Lett., 41, 1062, 1978; Phys. Rev. B, 29, 785, 1984.
[23] S. Sawada, Vacuum, 41, 612, 1990.
[24] M. Kohyama, J. Phys. Condens. Matter, 3, 2193, 1991.
[25] J.L. Mercer, Jr. and M.Y. Chou, Phys. Rev. B, 49, 8506, 1994.
[26] R.E. Cohen, M.J. Mehl, and D.A. Papaconstantopoulos, Phys. Rev. B, 50, 14694, 1994.
[27] Q.M. Li and R. Biswas, Phys. Rev. B, 50, 18090, 1994.
[28] O. Madelung, M. Schulz, and H. Weiss (eds.), Semiconductors: Physics of Group IV Elements and III–V Compounds, Landolt-Börnstein New Series III/17a, Springer-Verlag, New York, 1982; O. Madelung and M. Schulz (eds.), Semiconductors: Intrinsic Properties of Group IV Elements and III–V, II–VI and I–VII Compounds, Landolt-Börnstein New Series III/22a, Springer-Verlag, New York, 1987.
[29] M.S. Dresselhaus and G. Dresselhaus, In: M. Cardona and G. Guntherodt (eds.), Light Scattering in Solids III, Springer, Berlin, pp. 8, 1982.
[30] For a review, see J. Robertson, Adv. Phys., 35, 317, 1986, and R. Clausing et al. (eds.), Diamond and Diamond-like Films and Coatings, NATO Advanced Study Institutes Ser. B, 266, 331, Plenum, New York, 1991.
[31] D.R. McKenzie, D. Muller, and B.A. Pailthorpe, Phys. Rev. Lett., 67, 773, 1991.
[32] P.H. Gaskell, A. Saeed, P. Chieux, and D.R. McKenzie, Phys. Rev. Lett., 67, 1286, 1991.
[33] C.Z. Wang and K.M. Ho, Phys. Rev. Lett., 71, 1184, 1993.
[34] N.A. Marks, D.R. McKenzie, B.A. Pailthorpe, M. Bernasconi, and M. Parrinello, Phys. Rev. Lett., 76, 768, 1996.
[35] C.Z. Wang and K.M. Ho, “Structural trends in amorphous carbon,” In: M.P. Siegal et al. (eds.), MRS Symposium Proceedings, 498, MRS, 1998.
[36] I. Kwon, R. Biswas, C.Z. Wang, K.M. Ho, and C.M. Soukoulis, Phys. Rev. B, 49, 7242, 1994.
[37] J.R. Morris, Z.Y. Lu, D.M. Ring, J.B. Xiang, K.M. Ho, C.Z. Wang, and C.L. Fu, Phys. Rev. B, 58, 11241, 1998.
[38] J. Tersoff, Phys. Rev. B, 38, 9902, 1988.
[39] T.J. Lenosky, B. Sadigh, E. Alonso, V.V. Bulatov, T. Diaz de la Rubia, J. Kim, A.F. Voter, and J.D. Kress, Modell. Simul. Mater. Sci. Eng., 8, 825, 2000.
[40] J.R. Chelikowsky, J.C. Phillips, M. Kamal, and M. Strauss, Phys. Rev. Lett., 62, 292, 1989.
[41] J.R. Chelikowsky, K.M. Glassford, and J.C. Phillips, Phys. Rev. B, 44, 1538, 1991.
[42] U. Rothlisberger, W. Andreoni, and P. Giannozzi, J. Chem. Phys., 96, 1248, 1992.
[43] J.C. Phillips, Phys. Rev. B, 47, 14132, 1993.
[44] J.C. Grossman and L. Mitas, Phys. Rev. Lett., 74, 1323, 1995.
[45] Y.F. Li, C.Z. Wang, and K.M. Ho, unpublished.
[46] J.C. Grossman and L. Mitas, Phys. Rev. B, 52, 16735, 1995.
[47] M.F. Jarrold, J.E. Bower, and K.M. Creegan, J. Chem. Phys., 90, 3615, 1989.
[48] U. Ray and M.F. Jarrold, J. Chem. Phys., 94, 2631, 1991.
[49] M.F. Jarrold and E.C. Honea, J. Am. Chem. Soc., 114, 459, 1992.
[50] K.M. Ho, A. Shvartsburg, B.C. Pan, Z.Y. Lu, C.Z. Wang, J. Wacker, J.L. Fye, and M.F. Jarrold, “Structures of Medium-Sized Silicon Clusters,” Nature, 392, 582, 1998.
[51] B. Liu, Ph.D. thesis, Iowa State University, 2001.
[52] H. Haas, C.Z. Wang, M. Fahnle, C. Elsasser, and K.M. Ho, Phys. Rev. B, 57, 1461, 1998.
[53] H. Haas, C.Z. Wang, M. Fahnle, C. Elsasser, and K.M. Ho, “Environment-dependent tight-binding model for molybdenum,” In: P.E.A. Turchi et al. (eds.), MRS Symposium Proceedings, 491, MRS, pp. 327, 1998.
[54] Y. Waseda, K. Hirata, and M. Ohtani, High Temp. High Press., 7, 221, 1975.
[55] F.H. Featherston and J.R. Neighbours, Phys. Rev., 130, 1324, 1963.
[56] R. Ziegler and H.-E. Schaefer, Mater. Sci. Forum, 15–18, 145, 1987.
[57] B.M. Powell, P. Martel, and A.D.B. Woods, Can. J. Phys., 55, 1601, 1977.
[58] H. Haas, C.Z. Wang, K.M. Ho, M. Fahnle, and C. Elsasser, Surf. Sci. Lett., 457, L397, 2000.
[59] Ju Li, C.Z. Wang, J.-P. Chang, W. Cai, V. Bulatov, K.M. Ho, and S. Yip, Phys. Rev. B, 70, 104113, 2004.
[60] J.H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, 1975.
[61] D. Deaven and K.M. Ho, Phys. Rev. Lett., 75, 288, 1995.
[62] D.M. Deaven, N. Tit, J.R. Morris, and K.M. Ho, Chem. Phys. Lett., 256, 195, 1996.
[63] C.Z. Wang, C.H. Xu, C.T. Chan, and K.M. Ho, J. Phys. Chem., 96, 7603, 1992.
[64] J.R. Chelikowsky, Phys. Rev. Lett., 67, 2970, 1991.
[65] B. Liu, Z.Y. Lu, B. Pan, C.Z. Wang, K.M. Ho, A.A. Shvartsburg, and M.F. Jarrold, J. Chem. Phys., 109, 9401, 1998.
[66] A.A. Shvartsburg, M.F. Jarrold, B. Liu, Z.Y. Lu, C.Z. Wang, and K.M. Ho, Phys. Rev. Lett., 81, 4616, 1998.
[67] Mingsheng Tang, Wencai Lu, C.Z. Wang, and K.M. Ho, Chem. Phys. Lett., 377, 413, 2003.
[68] Mingsheng Tang, C.Z. Wang, W.C. Lu, and K.M. Ho, Phys. Rev. B, (submitted).
[69] F.C. Chuang, B. Liu, C.Z. Wang, T.L. Chan, and K.M. Ho, Phys. Rev. B, (submitted).
[70] M. Pederson and K. Jackson, Phys. Rev. B, 43, 7312, 1991.
[71] R.M. Wentzcovitch, J.L. Martin, and P.B. Allen, Phys. Rev. B, 45, 11372, 1992.
[72] N.D. Mermin, Phys. Rev., 137, A1441, 1965.
[73] B.L. Zhang, C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 48, 11381, 1993.
[74] C.Z. Wang, K.M. Ho, M. Shirk, and P. Molian, Phys. Rev. Lett., 85, 4092, 2000.
[75] R.S. Lewis, E. Anders, and B.T. Draine, Nature (London), 339, 117, 1989.
[76] N.R. Greiner, D.S. Phillips, J.D. Johnson, and F. Volk, Nature (London), 333, 440, 1988.
[77] Y.K. Chang, H.H. Hsieh, W.F. Pong, M.H. Tsai, F.Z. Chien, P.K. Tseng, L.C. Chen, T.Y. Wang, K.H. Chen, and D.M. Bhusari, Phys. Rev. Lett., 82, 5377, 1999.
[78] J.Y. Raty, G. Galli, C. Bostedt, T.W. Van Buuren, and L.J. Terminello, Phys. Rev. Lett., 90, 037401, 2003.
[79] S. Tomita, M. Fujii, and S. Hayashi, Phys. Rev. B, 66, 245424, 2002.
[80] S. Tomita, M. Fujii, S. Hayashi, and K. Yamamoto, Chem. Phys. Lett., 305, 225, 1999.
[81] V.L. Kuznetsov, A.L. Chuvilin, Y.V. Butenko, I.Y. Mal’kov, and V.M. Titov, Chem. Phys. Lett., 222, 343, 1994.
[82] G.D. Lee, C.Z. Wang, J. Yu, E. Yoon, and K.M. Ho, Phys. Rev. Lett., 91, 265701, 2003.
1.16 FIRST-PRINCIPLES MODELING OF PHASE EQUILIBRIA
Axel van de Walle and Mark Asta
Northwestern University, Evanston, IL, USA
First-principles approaches to the modeling of phase equilibria rely on the integration of accurate quantum-mechanical total-energy calculations and statistical-mechanical modeling. This combination of methods makes possible parameter-free predictions of the finite-temperature thermodynamic properties governing a material’s phase stability. First-principles, computational-thermodynamic approaches have found increasing applications in phase diagram studies of a wide range of semiconductor, ceramic and metallic systems. These methods are particularly advantageous in the consideration of previously unexplored materials, where they can be used to ascertain the thermodynamic stability of new materials before they are synthesized, and in situations where direct experimental thermodynamic measurements are difficult due to constraints imposed by kinetics or metastability.
1.
First-Principles Calculations of Thermodynamic Properties: Overview
At finite temperature ($T$) and pressure ($P$), thermodynamic stability is governed by the magnitude of the Gibbs free energy ($G$):

$$G = E - TS + PV \tag{1}$$

where $E$, $S$ and $V$ denote energy, entropy and volume, respectively. In principle, the formal statistical-mechanical procedure for calculating $G$ from first principles is well defined. Quantum-mechanical calculations can be performed to compute the energy $E(s)$ of different microscopic states ($s$) of a system, which then must be summed up in the form of a partition function ($Z$):

$$Z = \sum_s \exp[-E(s)/k_B T] \tag{2}$$
S. Yip (ed.), Handbook of Materials Modeling, 349–365. © 2005 Springer. Printed in the Netherlands.
A. van de Walle and M. Asta
from which the free energy is derived as $F = E - TS = -k_B T \ln Z$, where $k_B$ is Boltzmann’s constant. Figure 1(a) illustrates, for the case of a disordered crystalline binary alloy, the nature of the disorder characterizing a representative finite-temperature atomic structure. This disorder can be characterized in terms of the configurational arrangement of the elemental species over the sites of the underlying parent lattice, coupled with the displacements characterizing positional disorder. In principle, the sum in Eq. (2) extends over all configurational and displacive states accessible to the system, a phase space that is astronomically large for a realistic system size. In practice, the methodologies of atomic-scale molecular dynamics (MD) and Monte Carlo (MC) simulations, coupled with thermodynamic integration techniques (Kofke and Frenkel, de Koning, Chapter 2), reduce the complexity of a free energy calculation to a more tractable problem of sampling on the order of several to tens of thousands of representative states.

Electronic density-functional theory (DFT) provides an accurate quantum-mechanical framework for calculating the relative energetics of competing atomic structures in solids, liquids and molecules for a wide range of materials classes (Kaxiras, Chapter 1). Due to the rapid increase in computational cost with system size, however, DFT calculations are typically limited to structures containing fewer than ≈1000 atoms, while ab initio MD simulations (Scheffler, Chapter 1) are practically limited to time scales of less than ≈1 ns. For liquids or compositionally ordered solids, where the time scales for structural rearrangements (displacive in the latter case, configurational and displacive
Figure 1. (a) Disordered crystalline alloy. The state of the alloy is characterized both by the atomic displacements $v_i$ and the occupation of each lattice site. (b) Mapping of the real alloy onto a lattice model characterized by occupation variables $\sigma_i$ describing the identity of atoms on each of the lattice sites.
First-principles modeling of phase equilibria
in the former) are sufficiently fast, and the sizes of the periodic cells required to accurately model the atomic structure are relatively small, DFT-based MD methods have found direct applications in the calculation of finite-temperature thermodynamic properties [1, 2]. For crystalline solids containing both positional and concentrated compositional disorder, however, direct application of DFT to the calculation of free energies remains intractable; the time scales for configurational rearrangements are set by solid-state diffusion, ruling out direct application of MD, and the system sizes required to accurately model configurational disorder are too large to permit direct application of DFT as the basis for MC simulations. Effective strategies have nonetheless been developed for bridging the size and time-scale limitations imposed by DFT in the first-principles computation of thermodynamic properties for disordered solids. The approach involves exploitation of DFT methods as a framework for parameterizing classical potentials and coarse-grained statistical models. These models serve as efficient “effective Hamiltonians” in direct simulation-based calculations of thermodynamic properties; they can also function as useful reference systems for thermodynamic-integration calculations.
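For a system small enough to enumerate, Eq. (2) and the relation $F = -k_B T \ln Z$ can be evaluated directly, which makes concrete both what an effective Hamiltonian supplies and why enumeration fails for realistic sizes. The Python sketch below is a toy illustration: the ring of eight sites with occupation variables σ = ±1, the nearest-neighbor pair energy, and its value are assumptions introduced here, not a model fitted to any first-principles data.

```python
import itertools
import math

KB = 8.617333262e-5  # Boltzmann constant in eV/K

def energy(sigma, j_nn):
    """Hypothetical effective Hamiltonian: E = j_nn * sum_i sigma_i sigma_{i+1} on a ring."""
    n = len(sigma)
    return j_nn * sum(sigma[i] * sigma[(i + 1) % n] for i in range(n))

def free_energy(n_sites, j_nn, temperature):
    """F = -kB T ln Z, with Z of Eq. (2) summed over all 2**n_sites configurations."""
    kt = KB * temperature
    energies = [energy(s, j_nn) for s in itertools.product((-1, 1), repeat=n_sites)]
    e_min = min(energies)
    # Factor out exp(-e_min / kT) so that the largest term in the sum is 1.
    z_shifted = sum(math.exp(-(e - e_min) / kt) for e in energies)
    return e_min - kt * math.log(z_shifted)

n_sites, j_nn = 8, 0.02   # hypothetical ring size and pair interaction (eV)
for temperature in (100.0, 300.0, 1000.0):
    f = free_energy(n_sites, j_nn, temperature)
    print(f"T = {temperature:6.0f} K   F = {f:.5f} eV")
```

The number of terms doubles with every added site, so the same loop is already impractical for a few dozen sites; this is the reason sampling methods such as MC are used with effective Hamiltonians in practice.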
2. Thermodynamics of Compositionally Ordered Solids
In an ordered solid, thermal fluctuations take the form of electronic excitations and lattice vibrations and, accordingly, the free energy can be written as $F = E_0 + F_{\mathrm{elec}} + F_{\mathrm{vib}}$, where $E_0$ is the absolute zero total energy while $F_{\mathrm{elec}}$ and $F_{\mathrm{vib}}$ denote electronic and vibrational free energy contributions, respectively. This section is devoted to the calculation of the electronic and vibrational contributions most commonly considered in phase-diagram calculations under the assumption that electron–phonon interactions are negligible (i.e., $F_{\mathrm{elec}}$ and $F_{\mathrm{vib}}$ are simply additive). To account for electronic excitations, electronic DFT (Kaxiras, Chapter 1) can be extended to nonzero temperatures by allowing for partial occupations of the electronic states [3]. Within this framework, the electronic contribution to the free energy $F_{\mathrm{elec}}(T)$ at temperature $T$ can be decomposed as*

$$F_{\mathrm{elec}}(T) = E_{\mathrm{elec}}(T) - E_{\mathrm{elec}}(0) - T S_{\mathrm{elec}}(T) \tag{3}$$

* Equations (3)–(5) also assume that both the electronic charge density and the electronic density of states can be considered temperature-independent.
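where the electronic band energy $E_{\mathrm{elec}}(T)$ and the electronic entropy $S_{\mathrm{elec}}(T)$ are respectively given by

$$E_{\mathrm{elec}}(T) = \int f_{\mu,T}(\varepsilon)\,\varepsilon\,g(\varepsilon)\,d\varepsilon \tag{4}$$

$$S_{\mathrm{elec}}(T) = -k_B \int \left[ f_{\mu,T}(\varepsilon) \ln f_{\mu,T}(\varepsilon) + \bigl(1 - f_{\mu,T}(\varepsilon)\bigr) \ln\bigl(1 - f_{\mu,T}(\varepsilon)\bigr) \right] g(\varepsilon)\,d\varepsilon \tag{5}$$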
where $g(\varepsilon)$ is the electronic density of states obtained from a density-functional calculation, while $f_{\mu,T}(\varepsilon)$ is the Fermi distribution when the electronic chemical potential is equal to $\mu$,

$$f_{\mu,T}(\varepsilon) = \left[ 1 + \exp\left( \frac{\varepsilon - \mu}{k_B T} \right) \right]^{-1}. \tag{6}$$
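As a rough numerical illustration of Eqs. (3)–(6), the sketch below evaluates the integrals on a uniform energy grid, fixing the chemical potential by requiring that the occupied states add up to the electron count. The flat model density of states, the grid, and all function names are assumptions made for this example; in practice $g(\varepsilon)$ would come from a density-functional calculation. The result is compared with the lowest-order Sommerfeld term, $-(\pi^2/6)\,k_B^2 T^2 g(\varepsilon_F)$.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def fermi(eps, mu, T):
    """Fermi distribution of Eq. (6), written to avoid overflow in exp."""
    x = (eps - mu) / (K_B * T)
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def find_mu(eps, g, de, n_el, T):
    """Bisect for the chemical potential that reproduces n_el electrons."""
    lo, hi = eps[0] - 1.0, eps[-1] + 1.0
    for _ in range(60):
        mu = 0.5 * (lo + hi)
        count = sum(fermi(e, mu, T) * gk for e, gk in zip(eps, g)) * de
        if count < n_el:
            lo = mu
        else:
            hi = mu
    return 0.5 * (lo + hi)

def band_energy_zero(eps, g, de, n_el):
    """E_elec(0): fill the lowest states until n_el electrons are placed."""
    energy, remaining = 0.0, n_el
    for e, gk in zip(eps, g):
        w = min(gk * de, remaining)
        energy += e * w
        remaining -= w
        if remaining <= 0.0:
            break
    return energy

def electronic_free_energy(eps, g, de, n_el, T):
    """F_elec(T) of Eq. (3), using the integrals of Eqs. (4) and (5)."""
    mu = find_mu(eps, g, de, n_el, T)
    e_T, s_T = 0.0, 0.0
    for e, gk in zip(eps, g):
        f = fermi(e, mu, T)
        e_T += f * e * gk * de
        if 0.0 < f < 1.0:
            s_T -= K_B * (f * math.log(f) + (1.0 - f) * math.log(1.0 - f)) * gk * de
    return e_T - band_energy_zero(eps, g, de, n_el) - T * s_T

# Hypothetical flat DOS: 1 state/eV on [-1, 1] eV, half filled.
N, de = 2000, 1.0e-3
eps = [-1.0 + (k + 0.5) * de for k in range(N)]
g = [1.0] * N
T = 1000.0
f_num = electronic_free_energy(eps, g, de, n_el=1.0, T=T)
f_sommerfeld = -(math.pi ** 2 / 6.0) * (K_B * T) ** 2 * g[N // 2]
print(f_num, f_sommerfeld)
```

For a density of states that varies strongly near the Fermi level the two numbers would be expected to differ, which is when the full integrals are worth evaluating.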
The chemical potential $\mu$ is the solution to $\int f_{\mu,T}(\varepsilon)\,g(\varepsilon)\,d\varepsilon = n_\varepsilon$, where $n_\varepsilon$ is the total number of electrons. Under the assumption that the electronic density of states near the Fermi level is slowly varying relative to $f_{\mu,T}(\varepsilon)$, the equations for the electronic free energy reduce to the well-known Sommerfeld model, an expansion in powers of $T$ whose lowest order term is

$$F_{\mathrm{elec}}(T) = -\frac{\pi^2}{6}\,k_B^2 T^2\,g(\varepsilon_F) \tag{7}$$
where $g(\varepsilon_F)$ is the zero-temperature value of the electronic density of states at the Fermi level ($\varepsilon_F$).

The quantum treatment of lattice vibrations in the harmonic approximation provides a reliable description of thermal vibrations in many solids for low to moderately high temperatures [4]. To describe this theory, consider an infinite periodic system with $n$ atoms per unit cell, let $u(l,i)$ for $i = 1, \ldots, n$ denote the displacement of atom $i$ in cell $l$ away from its equilibrium position, and let $M_i$ be the mass of atom $i$. Within the harmonic approximation, the potential energy $U$ of this system is entirely determined by: (i) the potential energy (per unit cell) of the system at its equilibrium position, $E_0$, and (ii) the force constant tensors $\Phi(l,i;\,l',j)$ whose components are given, for $\alpha, \beta = 1, 2, 3$, by

$$\Phi_{\alpha\beta}(l,i;\,l',j) = \frac{\partial^2 U}{\partial u_\alpha(l,i)\,\partial u_\beta(l',j)} \tag{8}$$
evaluated at $u(l,i) = 0$ for all $l$, $i$. Such a harmonic approximation to the Hamiltonian of a solid is often referred to as a Born–von Kármán model. The thermodynamic properties of a harmonic system are entirely determined by the frequencies of its normal modes of oscillation, which can be
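obtained by finding the eigenvalues of the so-called $3n \times 3n$ dynamical matrix of the system:

$$D(\mathbf{k}) = \sum_{l} e^{i 2\pi (\mathbf{k}\cdot\mathbf{l})}
\begin{pmatrix}
\dfrac{\Phi(0,1;\,l,1)}{\sqrt{M_1 M_1}} & \cdots & \dfrac{\Phi(0,1;\,l,n)}{\sqrt{M_1 M_n}} \\
\vdots & \ddots & \vdots \\
\dfrac{\Phi(0,n;\,l,1)}{\sqrt{M_n M_1}} & \cdots & \dfrac{\Phi(0,n;\,l,n)}{\sqrt{M_n M_n}}
\end{pmatrix} \tag{9}$$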
for all vectors $\mathbf{k}$ in the first Brillouin zone. The resulting eigenvalues $\lambda_b(\mathbf{k})$ for $b = 1, \ldots, 3n$ provide the frequencies of the normal modes through $\nu_b(\mathbf{k}) = \frac{1}{2\pi}\sqrt{\lambda_b(\mathbf{k})}$. This information for all $\mathbf{k}$ is conveniently summarized by $g(\nu)$, the phonon density of states (DOS), which specifies the number of modes of oscillation having a frequency lying in the infinitesimal interval $[\nu, \nu + d\nu]$. The vibrational free energy (per unit cell) $F_{\mathrm{vib}}$ is then given by

$$F_{\mathrm{vib}} = k_B T \int_0^\infty \ln\left[ 2 \sinh\left( \frac{h\nu}{2 k_B T} \right) \right] g(\nu)\,d\nu \tag{10}$$
where $h$ is Planck's constant and $k_B$ is Boltzmann's constant. The associated vibrational entropy $S_{\mathrm{vib}}$ of the system can be obtained from the well-known thermodynamic relationship $S_{\mathrm{vib}} = -\partial F_{\mathrm{vib}} / \partial T$. The high-temperature limit (which is also the classical limit) of Eq. (10) is often a good approximation over the range of temperature of interest in solid-state phase diagram calculations:

$$F_{\mathrm{vib}} = k_B T \int_0^\infty \ln\left( \frac{h\nu}{k_B T} \right) g(\nu)\,d\nu.$$
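The two expressions above can be compared numerically once a phonon DOS is available. The following sketch assumes a Debye-like model DOS normalized to three modes per atom and a Debye frequency of 10 THz; both are illustrative choices rather than values from the text, and the function names are hypothetical. The integral is replaced by a midpoint sum over a frequency grid.

```python
import math

K_B = 8.617333262e-5   # Boltzmann constant, eV/K
H = 4.135667696e-15    # Planck constant, eV s

def f_vib_quantum(nus, weights, T):
    """Eq. (10) on a discrete grid; weights[i] stands for g(nu_i) * d_nu."""
    total = 0.0
    for nu, w in zip(nus, weights):
        y = H * nu / (2.0 * K_B * T)
        # ln(2 sinh y) = y + ln(1 - exp(-2y)), which is stable at low T
        total += w * (y + math.log1p(-math.exp(-2.0 * y)))
    return K_B * T * total

def f_vib_classical(nus, weights, T):
    """High-temperature (classical) limit of Eq. (10)."""
    return K_B * T * sum(w * math.log(H * nu / (K_B * T))
                         for nu, w in zip(nus, weights))

def debye_dos(nu_debye, n_grid=2000):
    """Model DOS g(nu) = 9 nu^2 / nu_D^3 for nu < nu_D (3 modes per atom)."""
    d_nu = nu_debye / n_grid
    nus = [(k + 0.5) * d_nu for k in range(n_grid)]
    weights = [9.0 * nu ** 2 / nu_debye ** 3 * d_nu for nu in nus]
    return nus, weights

nus, weights = debye_dos(1.0e13)
for T in (100.0, 300.0, 1000.0, 2000.0):
    fq = f_vib_quantum(nus, weights, T)
    fc = f_vib_classical(nus, weights, T)
    print(T, fq, fc, fq - fc)
```

The printed difference shrinks as the temperature rises, which is the sense in which the classical expression is the high-temperature limit of Eq. (10).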
The high-temperature limit of the vibrational entropy difference between two phases is often used as a measure of the magnitude of the effect of lattice vibrations on phase stability. It has the advantage of being temperature-independent, thus allowing a unique number to be reported as a measure of vibrational effects. Figure 2 (from [5]) illustrates the use of the above formalism to assess the relative phase stability of the θ and θ′ phases responsible for precipitation hardening in the Al–Cu system. Interestingly, accounting for lattice vibrations is crucial in order for the calculations to agree with the experimentally observed fact that the θ phase is stable at typical processing temperatures (T > 475 K).

A simple improvement over the harmonic approximation, called the quasiharmonic approximation, is obtained by employing volume-dependent force constant tensors. This approach maintains all the computational advantages of the harmonic approximation while permitting the modeling of thermal expansion. The volume dependence of the phonon frequencies induced by the volume dependence of the force constants is traditionally described by the Grüneisen parameter $\gamma_{\mathbf{k}b} = -\partial \ln \nu_b(\mathbf{k}) / \partial \ln V$. However, for the purpose of
Figure 2. Temperature dependence of the free energy of the θ and θ′ phases of the Al2Cu compound. Insets show the crystal structures of each phase and the corresponding phonon density of states. Dashed lines indicate regions of metastability, and the θ phase is seen to become stable above about 475 K. (Adapted from Ref. [5] with the permission of the authors.)
modeling thermal expansion, it is more convenient to directly parametrize the volume dependence of the free energy itself. This dependence has two sources: the change in entropy due to the change in the phonon frequencies and the elastic energy change due to the expansion of the lattice:

$$F(T, V) = E_0(V) + F_{\mathrm{vib}}(T, V) \tag{11}$$
where $E_0(V)$ is the energy of a motionless lattice whose unit cell is constrained to remain at volume $V$, while $F_{\mathrm{vib}}(T, V)$ is the vibrational free energy of a harmonic system constrained to remain with a unit cell volume $V$ at temperature $T$. The equilibrium volume $V^*(T)$ at temperature $T$ is obtained by minimizing $F(T, V)$ with respect to $V$. The resulting free energy $F(T)$ at temperature $T$ is then given by $F(T, V^*(T))$. The quasiharmonic approximation has been shown to provide a reliable description of thermal expansion of numerous elements up to their melting points, as illustrated in Fig. 3.

First-principles calculations can be used to provide the necessary input parameters for the above formalism. The so-called direct force method proceeds by calculating, from first principles, the forces experienced by the atoms in response to various imposed displacements and by determining the value of the force constant tensors that match these forces through a least-squares fit.
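The minimization of $F(T, V)$ in Eq. (11) can be illustrated with a deliberately simple model. The sketch below assumes a quadratic static-lattice energy, a single Einstein frequency per atom, and a volume-independent Grüneisen parameter; none of these choices or parameter values come from the text, and in a real calculation both $E_0(V)$ and the phonon DOS at each volume would be computed from first principles.

```python
import math

K_B = 8.617333262e-5   # Boltzmann constant, eV/K
H = 4.135667696e-15    # Planck constant, eV s

# Hypothetical per-atom model parameters, chosen only for illustration.
V0 = 16.0       # static-lattice equilibrium volume, Angstrom^3
B_V0 = 10.0     # bulk modulus times V0, eV
NU0 = 6.0e12    # Einstein frequency at V0, Hz
GAMMA = 1.5     # Grueneisen parameter, assumed volume-independent

def e0(v):
    """Static-lattice energy E_0(V), harmonic in the volume strain."""
    return 0.5 * B_V0 * ((v - V0) / V0) ** 2

def f_vib(T, v):
    """Harmonic free energy of 3 Einstein modes with nu(V) = NU0 (V/V0)^-gamma."""
    nu = NU0 * (v / V0) ** (-GAMMA)
    y = H * nu / (2.0 * K_B * T)
    return 3.0 * K_B * T * (y + math.log1p(-math.exp(-2.0 * y)))

def free_energy(T, v):
    return e0(v) + f_vib(T, v)   # Eq. (11)

def equilibrium_volume(T, lo=0.9 * V0, hi=1.2 * V0, tol=1e-9):
    """Golden-section search for the volume V*(T) that minimizes F(T, V)."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    while b - a > tol:
        if free_energy(T, c) < free_energy(T, d):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    return 0.5 * (a + b)

for T in (1.0, 300.0, 600.0, 900.0):
    v_star = equilibrium_volume(T)
    # Linear expansion relative to the static lattice, in percent
    print(T, v_star, 100.0 * ((v_star / V0) ** (1.0 / 3.0) - 1.0))
```

Even at the lowest temperature the equilibrium volume exceeds the static-lattice value in this model, reflecting the zero-point contribution to $F_{\mathrm{vib}}$.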
Figure 3. Thermal expansion of selected metals calculated within the quasiharmonic approximation. Panels: Na and Al; vertical axes: Δl/l (%); horizontal axes: temperature (K). (Reproduced from Ref. [6] with the permission of the authors.)
Note that the simultaneous displacements of the periodic images of each displaced atom, due to the periodic boundary conditions used in most ab initio methods, typically require the use of a supercell geometry in order to be able to sample all the displacements needed to determine the force constants. While the number of force constants to be determined is in principle infinite, in practice it can be reduced to a manageable finite number by noting that, for many systems, the force constant tensor associated with two atoms that lie farther apart than a few nearest-neighbor shells can safely be neglected. Alternatively, linear response theory (Rabe, Chapter 1) can be used to calculate the dynamical matrix D(k) directly using second-order perturbation theory, thus circumventing the need for supercell calculations. Linear response theory is also particularly useful when a system is characterized by non-negligible long-range force constants, as in the presence of Fermi-surface instabilities or long-ranged electrostatic contributions.

The above discussion has centered around the application of harmonic (or quasiharmonic) approximations to the statistical modeling of vibrational contributions to free energies of solids. While harmonic theory is known to be highly accurate for a wide class of materials, important cases exist where this approximation breaks down due to large anharmonic effects. Examples include the modeling of ferroelectric and martensitic phase transformations where the high-temperature phases are often dynamically unstable at zero temperature, i.e., their phonon spectra are characterized by unstable modes. In such cases, effective Hamiltonian methods have been developed to model structural phase transitions from first principles (Rabe, Chapter 1). Alternatively, direct application of ab initio molecular dynamics offers a general framework for modeling thermodynamic properties of anharmonic solids [1, 2].
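The direct force method and the construction of the dynamical matrix can be sketched on a one-dimensional monatomic chain, where the matrix of Eq. (9) reduces to a single number per wavevector. In the example below, forces generated from a hidden spring model stand in for first-principles forces; the supercell size, spring constants, displacement amplitudes and function names are all assumptions made for illustration.

```python
import cmath
import math

def synthetic_forces(u0, n_cell, k1, k2):
    """Forces on all atoms of a periodic chain when atom 0 is displaced by u0.

    Stand-in for a first-principles supercell calculation: first- and
    second-neighbour springs k1 and k2 (hypothetical values).
    """
    u = [0.0] * n_cell
    u[0] = u0
    forces = []
    for i in range(n_cell):
        f = 0.0
        for d, k in ((1, k1), (2, k2)):
            f += k * (u[(i + d) % n_cell] - u[i]) + k * (u[(i - d) % n_cell] - u[i])
        forces.append(f)
    return forces

def fit_force_constants(displacements, force_sets):
    """Least-squares fit of Phi(l) in the linear model F_l = -Phi(l) * u."""
    denom = sum(u * u for u in displacements)
    n = len(force_sets[0])
    return [-sum(u * f[l] for u, f in zip(displacements, force_sets)) / denom
            for l in range(n)]

def frequency(k, phi, mass):
    """nu(k) = sqrt(D(k)) / (2 pi), with D(k) = sum_l exp(i 2 pi k l) Phi(l) / M."""
    n = len(phi)
    d = 0.0
    for idx, p in enumerate(phi):
        l = idx if idx <= n // 2 else idx - n   # signed lattice offset
        d += (cmath.exp(2j * math.pi * k * l) * p / mass).real
    return math.sqrt(max(d, 0.0)) / (2.0 * math.pi)

k1, k2, mass = 10.0, 2.0, 1.0          # arbitrary units
amplitudes = [0.01, 0.02]
force_sets = [synthetic_forces(u, 8, k1, k2) for u in amplitudes]
phi = fit_force_constants(amplitudes, force_sets)
print([round(p, 6) for p in phi])
print([frequency(k / 10.0, phi, mass) for k in range(6)])
```

The eight-atom supercell is large enough here because the model springs reach only second neighbours; with longer-ranged interactions the periodic images would contaminate the fitted force constants, which is the supercell-size issue noted above.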
3. Thermodynamics of Compositionally Disordered Solids
We now relax the main assumption made in the previous section by allowing atoms to exit the neighborhood of their local equilibrium position. This is accomplished by considering every possible way to arrange the atoms on a given lattice. As illustrated in Fig. 1(b), the state of order of an alloy can be described by occupation variables $\sigma_i$ specifying the chemical identity of the atom associated with lattice site $i$. In the case of a binary alloy, the occupations are traditionally chosen to take the values +1 or −1, depending on the chemical identity of the atom. Returning to Eq. (2), all the thermodynamic information of a system is contained in its partition function $Z$ and, in the case of a crystalline alloy system, the sum over all possible states of the system can be conveniently factored as follows:

$$Z = \sum_{\sigma} \sum_{v \in \sigma} \sum_{e \in v} \exp[-\beta E(\sigma, v, e)] \tag{12}$$
where $\beta = (k_B T)^{-1}$ and where

• $\sigma$ denotes a configuration (i.e., the vector of all occupation variables);
• $v$ denotes the displacement of each atom away from its local equilibrium position;
• $e$ is a particular electronic state when the nuclei are constrained to be in a state described by $\sigma$ and $v$; and
• $E(\sigma, v, e)$ is the energy of the alloy in a state characterized by $\sigma$, $v$ and $e$.

Each summation defines an increasingly coarse level of hierarchy in the set of microscopic states. For instance, the sum over $v$ includes all displacements such that the atoms remain close to the undistorted configuration $\sigma$. Equation (12) implies that the free energy of the system can be written as

$$F(T) = -k_B T \ln \sum_{\sigma} \exp[-\beta F(\sigma, T)] \tag{13}$$
where $F(\sigma, T)$ is nothing but the free energy of an alloy with a fixed atomic configuration, as obtained in the previous section:

$$F(\sigma, T) = -k_B T \ln \sum_{v \in \sigma} \sum_{e \in v} \exp[-\beta E(\sigma, v, e)] \tag{14}$$
The so-called “coarse graining” of the partition function illustrated by Eq. (13) enables, in principle, an exact mapping of a real alloy onto a simple lattice model characterized by the occupation variables $\sigma$ and a temperature-dependent Hamiltonian $F(\sigma, T)$ [7, 8].
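That the coarse graining of Eqs. (12)–(14) is exact can be checked on a toy system small enough to enumerate. The energies below are invented for illustration: each configuration of a two-site binary "alloy" is given three hypothetical displacive/electronic states, and the free energy obtained from the full sum is compared with the one obtained by first computing $F(\sigma, T)$.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

# Hypothetical energies (eV) of the states (v, e) attached to each
# configuration sigma of a toy two-site binary system.
STATES = {
    (+1, +1): [0.00, 0.03, 0.05],
    (+1, -1): [0.01, 0.02, 0.08],
    (-1, +1): [0.01, 0.02, 0.08],
    (-1, -1): [0.04, 0.06, 0.07],
}

def constrained_free_energy(energies, T):
    """F(sigma, T) of Eq. (14): sum over the states of one configuration."""
    beta = 1.0 / (K_B * T)
    return -K_B * T * math.log(sum(math.exp(-beta * e) for e in energies))

def free_energy_direct(states, T):
    """-k_B T ln Z with Z summed over every microscopic state, Eq. (12)."""
    beta = 1.0 / (K_B * T)
    z = sum(math.exp(-beta * e) for energies in states.values() for e in energies)
    return -K_B * T * math.log(z)

def free_energy_coarse_grained(states, T):
    """Eq. (13): sum over configurations of exp[-beta F(sigma, T)]."""
    beta = 1.0 / (K_B * T)
    z = sum(math.exp(-beta * constrained_free_energy(energies, T))
            for energies in states.values())
    return -K_B * T * math.log(z)

for T in (300.0, 1000.0):
    print(T, free_energy_direct(STATES, T), free_energy_coarse_grained(STATES, T))
```

The two columns agree to rounding error, while the intermediate quantity $F(\sigma, T)$ changes with temperature, which is why the resulting lattice Hamiltonian is temperature-dependent.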
Although we have reduced the problem of modeling the thermodynamic properties of configurationally disordered solids to a more tractable calculation for a lattice model, the above formalism would still require the calculation of the free energy for every possible configuration $\sigma$, which is computationally intractable. Fortunately, the configurational dependence of the free energy can often be parametrized using a convenient expansion known as a cluster expansion [7, 9]. This expansion takes the form of a polynomial in the occupation variables

$$F(\sigma, T) = J_\emptyset + \sum_{i} J_i \sigma_i + \sum_{i,j} J_{ij} \sigma_i \sigma_j + \sum_{i,j,k} J_{ijk} \sigma_i \sigma_j \sigma_k + \cdots$$
where the so-called effective cluster interactions (ECI) $J_\emptyset, J_i, J_{ij}, \ldots$ need to be determined. The cluster expansion can be recast into a form which exploits the symmetry of the lattice by regrouping the terms as follows:

$$F(\sigma, T) = \sum_{\alpha} m_\alpha J_\alpha \left\langle \prod_{i \in \alpha'} \sigma_i \right\rangle$$

where $\alpha$ is a cluster (i.e., a set of lattice sites) and where the summation is taken over all clusters that are symmetrically distinct, while the average $\langle \ldots \rangle$ is taken over all clusters $\alpha'$ that are symmetrically equivalent to $\alpha$. The multiplicities $m_\alpha$ weight each term by the number of symmetrically equivalent clusters in a given reference volume (e.g., a unit cell). While the cluster expansion is presented here in the context of binary alloys, an extension to multicomponent alloys (where $\sigma_i$ can take more than two different values) is straightforward [9].

It can be shown that when all clusters $\alpha$ are considered in the sum, the cluster expansion is able to represent any function of configuration $\sigma$ by an appropriate selection of the values of $J_\alpha$. However, the real advantage of the cluster expansion is that, for many systems, it is found to converge rapidly. An accuracy that is sufficient for phase diagram calculations can often be achieved by keeping only clusters $\alpha$ that are relatively compact (e.g., short-range pairs or small triplets, as illustrated in the left panel of Fig. 4). The unknown parameters of the cluster expansion (the ECI $J_\alpha$) can then be determined by fitting them to $F(\sigma, T)$ for a relatively small number of configurations $\sigma$ obtained from first-principles computations. Once the ECI have been determined, the free energy of the alloy for any given configuration can be quickly calculated, making it possible to explore a large number of configurations without recalculating the free energy of each of them from first principles.

In some applications the development of a converged cluster expansion can be complicated by the presence of long-ranged interatomic interactions mediated by electronic-structure (Fermi-surface), electrostatic and/or elastic effects. Long-ranged interactions lead to an increase in the number of ECIs
Figure 4. Typical choice of clusters (left) and structures (right) used for the construction of a cluster expansion on the hcp lattice. Big circles, small circles and crosses represent consecutive close-packed planes of the hcp lattice. Concentric circles represent two sites, one above the other in the [0001] direction. The unit cell of the structures (right) along the (0001) plane is indicated by lines while the third lattice vector, along [0001], is identical to the one of the hcp primitive cell. (Adapted, with the permission of the authors, from Ref. [10], a first-principles study of the metastable hcp phase diagram of the Ag–Al system.)
that must be computed, and a concomitant increase in the number of configurations that must be sampled to derive them. For metals it has been demonstrated how long-ranged electronic interactions can be derived from perturbation theory using coherent-potential approximations to the electronic structure of a configurationally disordered solid as a reference state [11]. Effective approaches to modeling long-ranged elastically mediated interactions have also been formulated [12]. Such elastic effects are known to be particularly important in describing the thermodynamics of mixtures of species with very large differences in atomic “size”.

The cluster expansion tremendously simplifies the search for the lowest energy configuration at each composition of the alloy system. Determining these ground states is important because they determine the general topology of the alloy phase diagram. Each ground state is typically associated with one of the stable phases of the alloy system. There are three main approaches to identifying the ground states of an alloy system. With the enumeration method, all the configurations whose unit cell contains fewer than a given number of atoms are enumerated and their energy
is quickly calculated using the value of $F(\sigma, 0)$ predicted from the cluster expansion. The energy of each structure can then be plotted as a function of its composition (see Fig. 5) and the points touching the lower portion of the convex hull of all points indicate the ground states. While this method is approximate, as it ignores ground states with unit cells larger than the given threshold, it is simple to implement and has been found to be quite reliable, thanks to the fact that most ground states indeed have a small unit cell.

Simulated annealing offers another way to find the ground states. It proceeds by generating random configurations via MC simulations using the Metropolis algorithm (G. Gilmer, Chapter 2) that mimic the ensemble sampled in thermal equilibrium at a given temperature. As the temperature is lowered, the simulation should converge to the ground state. Thermal fluctuations are used as an effective means of preventing the system from getting trapped in local minima of energy. While the constraints on the unit cell size are considerably relaxed relative to the enumeration method, the main disadvantage of this method is that, whenever the simulation cell size is not an exact multiple of the ground state unit cell, artificial defects will be introduced in the simulation that need to be manually identified and removed. Also, the risk of obtaining local rather than global minima of energy is not negligible and must be controlled by adjusting the rate of decay of the simulation temperature.
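The enumeration method can be sketched in a few lines for a one-dimensional binary chain. The two pair interactions below are hypothetical ECIs chosen so that the answer is easy to verify by hand; exact rational arithmetic is used so that the convex-hull test is not affected by rounding. A real application would enumerate three-dimensional superstructures of the parent lattice and use ECIs fitted to first-principles energies.

```python
from fractions import Fraction
from itertools import product

# Hypothetical ECIs for a 1D chain (arbitrary energy units): J1 > 0 favours
# unlike first neighbours, J2 < 0 favours like second neighbours.
J1, J2 = 2, -1

def energy_per_site(config):
    """Cluster-expansion energy per site of a periodic chain with unit cell `config`."""
    n = len(config)
    pair1 = sum(config[i] * config[(i + 1) % n] for i in range(n))
    pair2 = sum(config[i] * config[(i + 2) % n] for i in range(n))
    return Fraction(J1 * pair1 + J2 * pair2, n)

def enumerate_structures(max_cell):
    """Lowest formation energy at each composition x (fraction of +1 sites)."""
    e_minus, e_plus = energy_per_site((-1,)), energy_per_site((1,))
    best = {}
    for n in range(1, max_cell + 1):
        for config in product((-1, 1), repeat=n):
            x = Fraction(config.count(1), n)
            e_form = energy_per_site(config) - ((1 - x) * e_minus + x * e_plus)
            if x not in best or e_form < best[x]:
                best[x] = e_form
    return best

def ground_states(best):
    """Vertices of the lower convex hull of the (x, formation energy) points."""
    hull = []
    for x, e in sorted(best.items()):
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            if (x2 - x1) * (e - e1) - (e2 - e1) * (x - x1) <= 0:
                hull.pop()       # hull[-1] lies on or above the new chord
            else:
                break
        hull.append((x, e))
    return hull

for x, e in ground_states(enumerate_structures(6)):
    print(x, e)
```

With these interactions the hull has three vertices: the two pure chains and the alternating structure at $x = 1/2$. Structures whose energy merely touches a tie line are discarded by the `<= 0` test, a choice made here for simplicity.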
Figure 5. Ground state search using the enumeration method in the $\mathrm{Sc}_x\mathrm{Vacancy}_{1-x}\mathrm{S}$ system. Diamonds represent the formation energies of about $3 \times 10^6$ structures, predicted from a cluster expansion fitted to LDA energies. The ground states, indicated by open circles, are the structures whose formation energy touches the convex hull (solid line) of all points. (Reproduced from Ref. [13], with the permission of the authors.)
Finally, there exists an exact, although computationally demanding, algorithm to identify the ground states [14]. This approach relies on the fact that the cluster expansion is linear in the correlations $\langle \sigma_\alpha \rangle \equiv \langle \prod_{i \in \alpha} \sigma_i \rangle$. Moreover, it can be shown that the set of correlations $\langle \sigma_\alpha \rangle$ that correspond to “real” structures can be defined by a set of linear inequalities. These inequalities are the result of lattice-specific geometric constraints and there exist systematic methods to generate them [14]. As an example of such constraints, consider the fact that it is impossible to construct a binary configuration on a triangular lattice where the nearest-neighbor pair correlations take the value −1 (i.e., where all nearest-neighbor pairs are between different atomic species). Since both the objective function and the constraints are linear in the correlations, linear programming techniques can be used to determine the ground states. The main difficulty associated with this method is that the resulting linear programming problem involves a number of dimensions and a number of inequalities that grow exponentially fast with the range of interactions included in the cluster expansion.

Once the ground states have been identified, thermodynamic properties at finite temperature must be obtained. Historically, the infinite summation defining the alloy partition function has been approximated through various mean-field methods [7, 14]. However, the difficulties associated with extending such methods to systems with medium to long-ranged interactions, and the increase in available computational power enabling MC simulations to be directly applied, have led to reduced reliance upon these techniques more recently.
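The triangular-lattice constraint mentioned above can be made concrete with a small enumeration. Every nearest-neighbor bond of the triangular lattice is an edge of elementary triangles, so the lattice-averaged pair correlation is an average of the per-triangle values computed below; its lower bound is therefore the smallest value a single triangle can attain. This is only a worked example of one such inequality, not the general procedure of Ref. [14].

```python
from itertools import product

def triangle_pair_correlation(spins):
    """Average of the three pair products on one elementary triangle."""
    a, b, c = spins
    return (a * b + b * c + c * a) / 3.0

# All 2^3 occupations of a triangle with sigma = +1 or -1
values = sorted({triangle_pair_correlation(s) for s in product((-1, 1), repeat=3)})
print(values)   # only two distinct values occur
```

The smallest value is −1/3, not −1, which gives the linear inequality $\langle \sigma_i \sigma_j \rangle_{\mathrm{NN}} \geq -1/3$ that any real configuration on this lattice must satisfy.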
MC simulations readily provide thermodynamic quantities such as energy or composition by making use of the fact that averages over an infinite ensemble of microscopic states can be accurately approximated by averages over a finite number of states generated by “importance” sampling. Moreover, quantities such as the free energy, which cannot be written as ensemble averages, can nevertheless be obtained via thermodynamic integration (Frenkel, Chapter 2; de Koning, Chapter 2) using standard thermodynamic relationships to rewrite the free energy in terms of integrals of quantities that can be obtained via ensemble averages. For instance, since energy $E(T)$ and free energy $F(T)$ are related through $E(T) = \partial (F(T)/T) / \partial (1/T)$ we have

$$\frac{F(T)}{T} - \frac{F(T_0)}{T_0} = -\int_{T_0}^{T} \frac{E(T')}{T'^2}\,dT' \tag{15}$$
and free energy differences can therefore be obtained from MC simulations providing $E(T)$. Figures 6 and 7 show two phase diagrams obtained by combining first-principles calculations, the cluster expansion formalism and MC simulations, an approach which offers the advantage of handling, in a
Figure 6. Calculated composition–temperature phase diagram for a metastable hcp Ag–Al alloy. Note that the cluster expansion formalism enables a unified treatment of both solid solutions and ordered compounds. (Reproduced from Ref. [10], with the permission of the authors.)
Figure 7. Calculated composition–temperature solid-state phase diagram for a rocksalt-type CaO–MgO alloy. The inclusion of lattice vibrations via the coarse-graining formalism is seen to substantially improve the agreement with experimental observations (filled circles). (Reproduced from Ref. [15], with the permission of the authors.)
unified framework, both ordered phases (with potential thermal defects) and disordered phases (with potential short-range order).
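The thermodynamic integration of Eq. (15) can be checked on a lattice model small enough to solve exactly. In the sketch below, a six-site ring with a single hypothetical nearest-neighbor interaction is used, and the mean energy $E(T)$ is obtained by exact enumeration as a stand-in for the MC averages that would be used in practice; units are chosen so that $k_B = 1$. The integrated free energy is compared with the value computed directly from the partition function.

```python
import math
from itertools import product

J = 1.0   # hypothetical nearest-neighbour ECI (energy units with k_B = 1)
N = 6     # number of sites on the ring

ENERGIES = [J * sum(s[i] * s[(i + 1) % N] for i in range(N))
            for s in product((-1, 1), repeat=N)]

def partition_function(T):
    return sum(math.exp(-e / T) for e in ENERGIES)

def mean_energy(T):
    """<E>(T); exact enumeration stands in for a Monte Carlo average."""
    return sum(e * math.exp(-e / T) for e in ENERGIES) / partition_function(T)

def free_energy_exact(T):
    return -T * math.log(partition_function(T))

def free_energy_by_integration(T, T0, n_steps=400):
    """Eq. (15) with Simpson's rule: F(T)/T = F(T0)/T0 - int E(T')/T'^2 dT'."""
    h = (T - T0) / n_steps
    integrand = lambda t: mean_energy(t) / t ** 2
    total = integrand(T0) + integrand(T)
    for k in range(1, n_steps):
        total += (4 if k % 2 else 2) * integrand(T0 + k * h)
    integral = total * h / 3.0
    return T * (free_energy_exact(T0) / T0 - integral)

print(free_energy_by_integration(3.0, 1.0), free_energy_exact(3.0))
```

In an actual calculation the reference value $F(T_0)$ is not known from enumeration; it is usually taken from a limit where the free energy is available analytically, such as a fully ordered state at low temperature or an ideal solution at high temperature.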
4. Liquids and Melting Transitions
While first-principles thermodynamic methods have found the widest application in studies of solids, recent progress has been realized also in the development and application of methods for ab initio calculations of solid–liquid phase boundaries. This section provides a brief overview of such methods, based upon the application of thermodynamic integration methods within the framework of ab initio molecular dynamics simulations.

Consider the ab initio calculation of the melting point for an elemental system, as was first demonstrated by Sugino and Car [1] in an application to elemental Si. The approach is based on the use of thermodynamic-integration methods to compute temperature-dependent free energies for bulk solid and liquid phases. Let $U_1(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$ denote the DFT potential energy for a collection of ions at positions $(\mathbf{r}_1, \ldots, \mathbf{r}_N)$, while $U_0(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$ corresponds to the energy of the same collection of ions described by a reference classical-potential model. We suppose that the free energy of the reference system, $F_0$, has been accurately calculated, either analytically (as in the case of an Einstein crystal) or using the atomistic simulation methods reviewed by Kofke and Frenkel in Chapter 2. We proceed to calculate the difference $F_1 - F_0$ between the DFT free energy ($F_1$) and $F_0$ employing the statistical-mechanical relation:

$$F_1 - F_0 = \int_0^1 d\lambda \left\langle \frac{dU_\lambda}{d\lambda} \right\rangle_\lambda = \int_0^1 d\lambda\, \langle U_1 - U_0 \rangle_\lambda \tag{16}$$
where the brackets $\langle \cdots \rangle_\lambda$ denote an average over the ensemble generated by the potential energy $U_\lambda = \lambda U_1 + (1 - \lambda) U_0$. In practice, $\langle \cdots \rangle_\lambda$ can be calculated from a time average over an MD trajectory generated with forces derived from the hybrid energy $U_\lambda$. The integral in Eq. (16) is evaluated from results computed for a discrete set of $\lambda$ values, or from a time average over a simulation where $\lambda$ is slowly “switched” on from zero to one. Practical applications of this approach rely on the careful choice of the reference system to provide energies that are sufficiently “close” to DFT to allow the ensemble averages in Eq. (16) to be precisely calculated from relatively short MD simulations. It should be emphasized that the approach outlined in this paragraph, when applied to the solid phase, provides a framework for accurately calculating anharmonic contributions to the vibrational free energy.

Figure 8 shows results derived from the above procedure by Sugino and Car [1] in an application to elemental Si (using the Stillinger–Weber potential as a reference system). Temperature-dependent chemical potentials for solid and
Figure 8. Calculated chemical potential of solid and liquid silicon. Full lines correspond to theory and dashed lines to experiments. Vertical axis: chemical potential (eV/atom); horizontal axis: temperature (×1000 K). (Reproduced from Ref. [1], with the permission of the authors.)
liquid phases (referenced to the zero-temperature free energy of the crystal) are plotted with symbols and are compared to experimental data represented by the dashed lines. It can be seen that the temperature dependence of the solid and liquid free energies (i.e., the slopes of the curves in Fig. 8) is accurately predicted. Relative to the solid, the liquid chemical potentials are approximately 0.1 eV/atom lower than experiment, leading to a calculated melting temperature that is approximately 300 K lower than the measured value. Comparable and even somewhat higher accuracies have been demonstrated in more recent applications of this approach to the calculation of melting temperatures in elemental metal systems (see, e.g., the references cited in [2]).

The above formalism has been extended as a basis for calculating solid and liquid chemical potentials in binary mixtures [2]. In this application, thermodynamic integration for the liquid phase is used to compute the change in free energy accompanying the continuous interconversion of atoms from solute to solvent species. Such calculations form the basis for extracting solute and solvent atom chemical potentials. For the solid phase the vibrational free energy of formation of substitutional impurities is extracted either within the harmonic approximation (along the lines described above) and/or from thermodynamic integration to derive anharmonic contributions. In applications to Fe-based systems relevant to studies of the Earth’s core, the approach has been used to compute the equilibrium partitioning of solute atoms between
solid and liquid phases in binary mixtures at pressures that are beyond the range of direct experimental measurements.
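The λ-integration of Eq. (16) can be sketched on a one-dimensional toy problem for which the answer is known in closed form. Here both the "reference" and the "target" potentials are classical harmonic wells with hypothetical spring constants, Metropolis sampling stands in for the MD time averages, and the exact classical result $F_1 - F_0 = \tfrac{1}{2} k_B T \ln(K_1/K_0)$ serves as a check. Nothing in this example reproduces the actual calculations of Refs. [1, 2].

```python
import math
import random

KT = 0.05            # k_B T in eV (hypothetical)
K0, K1 = 1.0, 4.0    # spring constants of reference and target wells, eV/Angstrom^2

def u0(x):
    return 0.5 * K0 * x * x

def u1(x):
    return 0.5 * K1 * x * x

def average_du(lam, n_steps=20000, step=0.3, seed=0):
    """<U1 - U0>_lambda from Metropolis sampling of U_lambda = lam U1 + (1 - lam) U0."""
    rng = random.Random(seed)
    u_lam = lambda x: lam * u1(x) + (1.0 - lam) * u0(x)
    x, total = 0.0, 0.0
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)
        if rng.random() < math.exp(min(0.0, -(u_lam(trial) - u_lam(x)) / KT)):
            x = trial
        total += u1(x) - u0(x)
    return total / n_steps

def free_energy_difference(n_lambda=11):
    """Simpson integration of Eq. (16) over equally spaced lambda values."""
    lams = [k / (n_lambda - 1) for k in range(n_lambda)]
    vals = [average_du(lam, seed=k) for k, lam in enumerate(lams)]
    h = lams[1] - lams[0]
    total = vals[0] + vals[-1]
    total += sum((4 if k % 2 else 2) * v for k, v in enumerate(vals[1:-1], start=1))
    return total * h / 3.0

delta_f = free_energy_difference()
exact = 0.5 * KT * math.log(K1 / K0)
print(delta_f, exact)
```

The statistical error of the estimate is controlled by how similar $U_1$ and $U_0$ are: the closer the two wells, the smaller the fluctuations of $U_1 - U_0$ and the shorter the sampling needed, which is the reason for choosing a reference system "close" to DFT.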
5. Outlook
The techniques described in this article provide a framework for computing the thermodynamic properties of elements and alloys from first principles, i.e., requiring, in principle, only the atomic numbers of the elemental constituents as input. In the most favorable cases, these methods have been demonstrated to yield finite-temperature thermodynamic properties with an accuracy that is limited only by the approximations inherent in electronic DFT. For a growing number of metallic alloy systems, such accuracy can be comparable to that achievable in direct measurements of thermodynamic properties. In such cases, ab initio methods have found applications as a framework for augmenting the experimental databases that form the basis of “computational-thermodynamics” modeling in the design of alloy microstructure. First-principles methods offer the advantage of being able to provide estimates of thermodynamic properties in situations where direct experimental measurements are difficult due to constraints imposed by sluggish kinetics, metastability or extreme conditions (e.g., high pressures or temperatures). In the development of new materials, first-principles methods can be employed as a framework for rapidly assessing the thermodynamic stability of hypothetical structures before they are synthesized. With the continuing increase in computational power and improvements in the accuracy of first-principles electronic-structure methods, it is anticipated that ab initio techniques will find growing applications in predictive studies of phase stability for a wide range of materials systems.
References

[1] O. Sugino and R. Car, “Ab initio molecular dynamics study of first-order phase transitions: melting of silicon,” Phys. Rev. Lett., 74, 1823, 1995.
[2] D. Alfè, M.J. Gillan, and G.D. Price, “Ab initio chemical potentials of solid and liquid solutions and the chemistry of the Earth’s core,” J. Chem. Phys., 116, 7127, 2002.
[3] N.D. Mermin, “Thermal properties of the inhomogeneous electron gas,” Phys. Rev., 137, A1441, 1965.
[4] A.A. Maradudin, E.W. Montroll, and G.H. Weiss, Theory of Lattice Dynamics in the Harmonic Approximation, 2nd edn., Academic Press, New York, 1971.
[5] C. Wolverton and V. Ozoliņš, “Entropically favored ordering: the metallurgy of Al2Cu revisited,” Phys. Rev. Lett., 86, 5518, 2001.
[6] A.A. Quong and A.Y. Liu, “First-principles calculations of the thermal expansion of metals,” Phys. Rev. B, 56, 7767, 1997.
[7] D. de Fontaine, “Cluster approach to order-disorder transformation in alloys,” Solid State Phys., 47, 33, 1994.
[8] A. van de Walle and G. Ceder, “The effect of lattice vibrations on substitutional alloy thermodynamics,” Rev. Mod. Phys., 74, 11, 2002.
[9] J.M. Sanchez, F. Ducastelle, and D. Gratias, “Generalized cluster description of multicomponent systems,” Physica, 128A, 334, 1984.
[10] N.A. Zarkevich and D.D. Johnson, “Predicted hcp Ag–Al metastable phase diagram, equilibrium ground states, and precipitate structure,” Phys. Rev. B, 67, 064104, 2003.
[11] G.M. Stocks, D.M.C. Nicholson, W.A. Shelton, B.L. Gyorffy, F.J. Pinski, D.D. Johnson, J.B. Staunton, P.E.A. Turchi, and M. Sluiter, “First Principles Theory of Disordered Alloys and Alloy Phase Stability,” In: P.E. Turchi and A. Gonis (eds.), NATO ASI on Statics and Dynamics of Alloy Phase Transformation, vol. 319, Plenum Press, New York, p. 305, 1994.
[12] C. Wolverton and A. Zunger, “An Ising-like description of structurally-relaxed ordered and disordered alloys,” Phys. Rev. Lett., 75, 3162, 1995.
[13] G.L. Hart and A. Zunger, “Origins of nonstoichiometry and vacancy ordering in Sc1−x□xS,” Phys. Rev. Lett., 87, 275508, 2001.
[14] F. Ducastelle, Order and Phase Stability in Alloys, Elsevier Science, New York, 1991.
[15] P.D. Tepesch, A.F. Kohan, G.D. Garbulsky, et al., “A model to compute phase diagrams in oxides with empirical or first-principles energy methods and application to the solubility limits in the CaO–MgO system,” J. Am. Ceram. Soc., 79, 2033, 1996.
1.17 DIFFUSION AND CONFIGURATIONAL DISORDER IN MULTICOMPONENT SOLIDS

A. Van der Ven and G. Ceder
Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA

S. Yip (ed.), Handbook of Materials Modeling, 367–394. © 2005 Springer. Printed in the Netherlands.

1. Introduction

Atomic diffusion in solids is a kinetic property that affects the rates of important nonequilibrium phenomena in materials. The kinetics of atomic redistribution in response to concentration gradients determine not only the speed, but often also the mechanism by which phase transformations in multicomponent solids occur. In electrode materials for batteries and fuel cells, high mobilities of specific ions ranging from lithium or sodium to oxygen or hydrogen are essential. In many instances, diffusion occurs in nondilute regimes in which different migrating atoms interact with each other. For example, lithium intercalation compounds such as Li$_x$CoO$_2$ and Li$_x$C$_6$, which serve as electrodes in lithium-ion batteries, can undergo large variations in lithium concentration, ranging from very dilute concentrations to complete filling of all interstitial sites available for Li in the host. In nondilute regimes, diffusing atoms interact with each other, both electronically and elastically. A complete theory of nondilute diffusion in multi-component solids needs to account for the dependence of the energy and migration barriers on the configuration of diffusing ions.

In this chapter, we present the formalism to describe and model diffusion in multicomponent solids. With tools from alloy theory to describe configurational thermodynamics [1–3], it is now possible to rigorously calculate diffusion coefficients in nondilute alloys from first principles. The approach relies on the use of the alloy cluster expansion, which has proven to be an invaluable statistical mechanical tool that links first-principles energies to the thermodynamic and kinetic properties of solids with configurational disorder. Although diffusion is a nonequilibrium phenomenon, diffusion coefficients can nevertheless be calculated by considering fluctuations at equilibrium using Green–Kubo relations [4].

We first elaborate on the atomistic mechanisms of diffusion in solids with interacting diffusing species. This is followed by a discussion of the relevant Green–Kubo expressions for diffusion coefficients. We then introduce the cluster expansion formalism to describe the configurational energy of a multi-component solid. We conclude with several examples of first-principles calculations of diffusion coefficients in multi-component solids.
2. Migration in Solids with Configurational Disorder
Multicomponent crystalline solids under most thermodynamic boundary conditions are characterized by a certain degree of configurational disorder. The most extreme example of configurational disorder occurs in a solid solution in which, on average, the arrangement of the different components of the solid approximates randomness. But even ordered compounds exhibit some degree of disorder due to thermal excitations or slight off-stoichiometry of the bulk composition. Atoms diffusing over the crystal sites of a disordered solid sample a variety of different local environments along their trajectories. Diffusion in most crystals can be characterized as a Markov process whereby an atom completely thermalizes after each hop before migrating to the next site along its trajectory. Hence each hop is independent of all previous hops. With reasonable accuracy, the rate at which individual atomic hops occur can be described with transition state theory according to

\[ \Gamma = \nu^{*} \exp\left( \frac{-\Delta E_b}{k_B T} \right) \tag{1} \]

where $\Gamma$ is the hop frequency, $\nu^{*}$ is a vibrational prefactor (having units of Hz) and $\Delta E_b$ is an activation barrier. Within the harmonic approximation, the vibrational prefactor is the ratio of the vibrational eigenfrequencies of the solid at the initial state of the hop to the vibrational eigenfrequencies when the migrating atom is at the activated state [5]. In the presence of configurational disorder, the activation barrier and frequency prefactor depend on the local arrangement of atoms around the migrating atom. Modeling of diffusion in a multicomponent system therefore requires knowledge of the dependence of $\Delta E_b$ and $\nu^{*}$ on configuration. The configuration dependence of $\Delta E_b$ is especially important, as the hop frequency $\Gamma$ depends on it exponentially. We restrict ourselves here to migration that occurs by individual atomic hops to adjacent vacant sites. Hence we do not consider diffusion that occurs through either a ring or an interstitialcy mechanism. We also make a distinction between diffusion of interstitial species and substitutional species.
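The exponential sensitivity of Eq. (1) to the barrier can be made concrete with a minimal Python sketch. The prefactor of 10^13 Hz, the temperature and the two barriers below are illustrative assumptions, not values from this chapter, and the function name `hop_rate` is hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def hop_rate(nu_star_hz, barrier_ev, temperature_k):
    """Transition state theory hop frequency, Eq. (1)."""
    return nu_star_hz * math.exp(-barrier_ev / (K_B_EV * temperature_k))


# Illustrative inputs only: an assumed prefactor and two barriers 0.1 eV apart.
nu_star = 1.0e13
rate_low = hop_rate(nu_star, 0.30, 300.0)
rate_high = hop_rate(nu_star, 0.40, 300.0)

# The ratio depends only on the barrier difference and the temperature.
print(rate_low / rate_high)
```

Because the prefactor cancels in the ratio, even a modest configuration-induced change in the barrier shifts the hop frequency by a large factor at room temperature.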
2.1. Interstitial Diffusion
Interstitial diffusion occurs in many important materials. A common example is the diffusion of carbon atoms over the interstitial sites of bcc or fcc iron (i.e., steel). Many phase transformations in steel involve the redistribution of carbon atoms between growing precipitate phases and the consumed matrix phase. A defining characteristic of interstitial diffusion is the existence of an externally imposed lattice of sites over which atoms can diffuse. In steel, the crystallized iron atoms create the interstitial sites for carbon. A similar situation exists in Li$_x$CoO$_2$, in which a crystalline CoO$_2$ host creates an array of interstitial sites that can be occupied by lithium. While in Li$_x$CoO$_2$ the lithium concentration $x$ can be varied from 0 to 1, in steel, FeC$_y$, the carbon concentration $y$ is typically very low. Individual carbon atoms interfere minimally with each other as they wander over the interstitial sites of iron. In Li$_x$CoO$_2$, however, as the lithium concentration is typically large, migrating lithium atoms interact strongly with each other and influence each other's diffusive trajectories.

Another type of system that we place in the category of interstitial diffusion is adatom diffusion on the surface of a crystalline substrate. Often a crystalline surface creates an array of well-defined sites on which adsorbed atoms reside, such as the fcc sites on a (111)-terminated surface of an fcc crystal. Diffusion then involves the migration of adsorbed atoms over these surface sites. The presence of many diffusing atoms creates a state of configurational disorder over the interstitial sites that evolves over time as a result of the activated hops of individual atoms. Not only does the activation barrier of a migrating atom depend on the local arrangement of the surrounding interstitial atoms, but the migration mechanism can also depend on that arrangement.
This is the case in Li$_x$CoO$_2$, a layered compound consisting of close-packed oxygen planes stacked with an ABCABC sequence. Between the oxygen layers are alternating layers of Li and Co, which occupy octahedral sites of the oxygen sublattice. Within each lithium plane, the lithium ions occupy a two-dimensional triangular lattice. As lithium is removed from LiCoO$_2$, vacancies are created in the lithium planes. First-principles density functional theory calculations (LDA) have shown that two migration mechanisms for lithium exchange with an adjacent vacancy exist, depending on the arrangement of surrounding lithium atoms [3]. This is illustrated in Fig. 1. If the two sites adjacent to the end points of the hop (sites (a) and (b) in Fig. 1a) are simultaneously occupied by lithium ions, then the migration mechanism follows a direct path, passing through a dumbbell of oxygen atoms. The calculated activation barrier for this mechanism is high, approaching 0.8 eV. This mechanism occurs when lithium migrates to an isolated vacancy. If, however, one or both of the sites adjacent to the end points of the hop are vacant (Fig. 1b), then the migrating lithium follows a curved path which passes through an adjacent tetrahedral site, out of the plane formed by the Li sites. For this mechanism, the activation barrier is low, taking values in the vicinity of 0.3–0.4 eV. This mechanism occurs when lithium migrates into a divacancy. Comparison of the activation barriers for the two mechanisms clearly shows that lithium diffusion mediated by divacancies is more rapid than diffusion mediated by single vacancies. Nevertheless, we can already anticipate that the availability of divacancies will depend on the overall lithium concentration.

Figure 1. Two lithium migration mechanisms in Li$_x$CoO$_2$, depending on the arrangement of lithium ions around the migrating ion. (a) Single vacancy hop: when both sites a and b are occupied by Li, the migrating lithium performs a direct hop whereby it has to squeeze through a dumbbell of oxygen ions. This mechanism occurs when the migrating lithium ion hops into an isolated vacancy (square). (b) Divacancy hop: when either site a or site b is vacant, the migrating lithium ion performs a curved hop whereby it passes through a tetrahedrally coordinated site. This mechanism occurs when the migrating atom hops into a divacancy.

The complexity of diffusion in a disordered solid is evident in Fig. 2, which schematically illustrates a typical disordered arrangement of lithium atoms within a lithium plane of Li$_x$CoO$_2$. Hop 1, for example, must occur with a large activation barrier, as the lithium is migrating to an isolated vacancy. In hop 2, lithium migrates to a vacant site that belongs to a divacancy and hence follows a curved path passing through an adjacent tetrahedral site, characterized by a low activation barrier. In hop 3, lithium migrates to a vacant site belonging to two divacancies simultaneously, and hence has two low-energy paths available. Similar complexities can be expected for adatom diffusion on crystalline substrates.
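The rule that selects the migration mechanism from the occupancy of sites a and b can be written as a short Python sketch. The barrier of 0.8 eV is the upper value quoted in the text; 0.35 eV is an assumed midpoint of the quoted 0.3–0.4 eV range; the function names and the assumption of equal prefactors for both mechanisms are hypothetical simplifications.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

# Representative barriers based on the text: "approaching 0.8 eV" for the
# direct (oxygen dumbbell) hop, 0.3-0.4 eV for the tetrahedral-site hop.
BARRIER_DIRECT_EV = 0.8
BARRIER_TETRAHEDRAL_EV = 0.35  # assumed midpoint of the quoted range


def lithium_hop_barrier(site_a_occupied, site_b_occupied):
    """Return (mechanism, barrier in eV) for a Li hop into a vacancy."""
    if site_a_occupied and site_b_occupied:
        # Isolated vacancy: direct hop through the oxygen dumbbell.
        return "direct", BARRIER_DIRECT_EV
    # Divacancy: curved hop through the adjacent tetrahedral site.
    return "tetrahedral", BARRIER_TETRAHEDRAL_EV


def rate_ratio(temperature_k):
    """Divacancy to single-vacancy hop rate ratio, assuming equal prefactors."""
    delta = BARRIER_DIRECT_EV - BARRIER_TETRAHEDRAL_EV
    return math.exp(delta / (K_B_EV * temperature_k))


print(lithium_hop_barrier(True, True))
print(lithium_hop_barrier(True, False))
print(rate_ratio(400.0))
```

In a full simulation the barrier would vary continuously with the surrounding configuration rather than take one of two fixed values; the two-valued rule here only mirrors the qualitative distinction drawn in Fig. 1.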
2.2. Substitutional Diffusion
Substitutional diffusion is qualitatively different from interstitial diffusion in that an externally imposed lattice of sites for the diffusing atoms is absent.
Figure 2. A typical disordered lithium-vacancy arrangement within the lithium planes of Li$_x$CoO$_2$. In a given lithium-vacancy arrangement, several different migration mechanisms can occur.
Instead, the diffusing atoms themselves form the network of crystal sites. This describes the situation for most metallic and semiconductor alloys. Vacancies with which the atoms can exchange do exist in these crystalline alloys; however, their concentrations are often very dilute. Examples where substitutional diffusion is relevant are alloys such as Si–Ge, Al–Ti and Al–Li, in which the different species reside on the same crystal structure and migrate by exchanging with vacancies. As with interstitial compounds, widely varying degrees of local order or disorder exist, affecting migration barriers. Al$_{1-x}$Li$_x$, for example, is metastable on the fcc crystal structure for low $x$ and forms an ordered L1$_2$ compound at $x = 0.25$. Diffusion within a solid solution differs from that in the ordered compound because the local arrangements of Li and Al are different. Figure 3 illustrates a diffusive hop of an Al atom to a neighboring vacancy within the ordered L1$_2$ Al$_3$Li phase. The energy along the migration path as calculated with LDA is also illustrated in Fig. 3. Clearly, the vacancy prefers the Li sublattice, as the energy of the solid increases as the vacancy migrates from the Li sublattice to the Al sublattice by exchanging with an Al atom.

Figure 3. The energy (meV) along the migration path (Å) of an Al atom hopping into a vacancy (square) on the lithium sublattice of L1$_2$ Al$_3$Li. Lighter atoms are Al, darker atoms are Li.
3. Green–Kubo Expressions for Diffusion
While diffusion is complex at the atomic length scale, of central importance at the macroscopic length scale is the rate with which gradients in concentration dissipate. These rates can be described by diffusion coefficients that relate atomic fluxes to gradients in concentration. Green–Kubo methods make it possible to link kinetic coefficients to microscopic fluctuations of appropriate quantities at equilibrium. In this section we present the relevant Green–Kubo equations that allow us to calculate diffusion coefficients in multi-component solids from first-principles. We again make a distinction between interstitial and substitutional diffusers.
3.1. Interstitial Diffusion
3.1.1. Single component diffusion

For a single component occupying interstitial sites of a host, such as carbon in iron or Li in Li$_x$CoO$_2$, irreversible thermodynamics [4] stipulates that a net flux $J$ of particles occurs when a gradient in the chemical potential $\mu$ of the interstitial species exists, according to

\[ J = -L \nabla \mu \tag{2} \]

where $L$ is a kinetic coefficient that depends on the mobility of the diffusing atoms. Often it is more practical to express the flux in terms of a concentration gradient instead of a chemical potential gradient, as the former is more accessible experimentally:

\[ J = -D \nabla C. \tag{3} \]

$D$ in Eq. (3) is the diffusion coefficient and the concentration $C$ refers to the number of interstitial particles per unit volume. While the true driving force for diffusion is a gradient in chemical potential, it is nevertheless possible to work with Eq. (3) provided the diffusion coefficient is expressed as

\[ D = L \frac{d\mu}{dC}. \tag{4} \]

Hence the diffusion coefficient consists of a kinetic factor $L$ and a thermodynamic factor $d\mu/dC$. The validity of irreversible thermodynamics is restricted to systems that are not too far removed from equilibrium. To quantify this, it is useful to mentally divide the solid into small subregions that are microscopically large enough for thermodynamic variables to be meaningful, yet macroscopically small enough that the same thermodynamic quantities can be considered constant within each subregion. Hence, although the solid itself is removed from equilibrium, each subregion is locally at equilibrium. This is called the local equilibrium approximation, and it is within this approximation that the linear kinetic equation, Eq. (2), is considered valid. Within the local equilibrium approximation, the kinetic parameters $D$ and $L$ can be derived by a consideration of relevant fluctuations at thermodynamic equilibrium. Crucial in this derivation is the assumption made by Onsager in his proof of the reciprocity relations of kinetic parameters: that the regression of a fluctuation of a particular extensive property around its equilibrium value occurs on average according to the same linear phenomenological laws as those governing the regression of artificially induced fluxes of the same extensive quantity [4]. This regression hypothesis is a consequence of the fluctuation–dissipation theorem of nonequilibrium statistical mechanics [6].
Several techniques, collectively referred to as Green–Kubo methods, exist to link microscopic fluctuations to macroscopic kinetic quantities [7–9]. Neglecting crystallographic anisotropy, the Green–Kubo expression for the kinetic factor for diffusion can be written as [10–12]

\[ L = \frac{\left\langle \left( \sum_{\zeta} \vec{R}_{\zeta}(t) \right)^{2} \right\rangle}{(2d)\, t\, M v_s k_B T} \tag{5} \]

where $\vec{R}_{\zeta}(t)$ is the vector connecting the end points of the trajectory of particle $\zeta$ after a time $t$, $M$ refers to the total number of interstitial sites available, $v_s$ is the volume per interstitial site, $k_B$ is the Boltzmann constant, $T$ is the temperature and $d$ refers to the dimension of the interstitial network. The brackets indicate an ensemble average performed at equilibrium. Often, the diffusion coefficient is also written in an equivalent form as [10]

\[ D = D_J F \tag{6} \]

where

\[ D_J = \frac{\left\langle \left( \sum_{\zeta} \vec{R}_{\zeta}(t) \right)^{2} \right\rangle}{(2d)\, t\, N} \tag{7} \]

and

\[ F = \frac{d \left( \mu / k_B T \right)}{d \ln(x)}. \tag{8} \]

$N$ refers to the number of diffusing atoms and $x = N/M$ to the fraction of filled interstitial sites. $F$ is often called a thermodynamic factor and $D_J$ is sometimes called the jump-diffusion or self-diffusion coefficient. A common approximation is to neglect cross correlations between different diffusing species and to replace $D_J$ with the tracer diffusion coefficient defined as

\[ D^{*} = \frac{\left\langle \vec{R}_{\zeta}^{\,2}(t) \right\rangle}{(2d)\, t}. \tag{9} \]
The difference between $D_J$ and $D^{*}$ is that the former depends on the square of the displacement of all the particles, while the latter depends on the average of the square of the displacement of individual diffusing atoms. $D_J$ is a measure of collective fluctuations of the center of mass of all the diffusing particles. Figure 4 compares $D_J$ and $D^{*}$ calculated with kinetic Monte Carlo simulations for the Li$_x$CoO$_2$ system. Notice in Fig. 4 that $D_J$ is systematically larger than $D^{*}$ for all lithium concentrations $x$, only approaching $D^{*}$ for dilute lithium concentrations.
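The distinction between the collective and the single-particle averages can be sketched in Python as follows. The data layout (a list of equilibrium samples, each holding one displacement vector per particle) and the function name are assumptions made for illustration; the two toy samples are hand-built, not simulation output.

```python
def jump_and_tracer_diffusion(samples, elapsed_time, dimension):
    """Estimate D_J (Eq. 7) and D* (Eq. 9) from particle displacements.

    samples: list of equilibrium samples; each sample is a list holding one
             displacement vector R_zeta(t) (a tuple of floats) per particle.
    """
    n_samples = len(samples)
    n_particles = len(samples[0])
    collective = 0.0  # accumulates |sum_zeta R_zeta|^2 over samples
    individual = 0.0  # accumulates sum_zeta |R_zeta|^2 over samples
    for sample in samples:
        total = [sum(r[k] for r in sample) for k in range(dimension)]
        collective += sum(c * c for c in total)
        individual += sum(sum(c * c for c in r) for r in sample)
    norm = 2.0 * dimension * elapsed_time
    d_jump = collective / (n_samples * norm * n_particles)
    d_tracer = individual / (n_samples * n_particles * norm)
    return d_jump, d_tracer


# Two particles in two dimensions, one sample each.
correlated = [[(1.0, 0.0), (1.0, 0.0)]]
anticorrelated = [[(1.0, 0.0), (-1.0, 0.0)]]
print(jump_and_tracer_diffusion(correlated, 1.0, 2))      # (0.5, 0.25)
print(jump_and_tracer_diffusion(anticorrelated, 1.0, 2))  # (0.0, 0.25)
```

The tracer value is identical in both toy cases, while the collective value changes with the sign of the cross correlation between the two displacements, which is exactly the contribution that is dropped when $D_J$ is replaced by $D^{*}$.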
Figure 4. A comparison of the self-diffusion coefficient $D_J$ (crosses) and the tracer diffusion coefficient $D^{*}$ (squares) for lithium diffusion in Li$_x$CoO$_2$ calculated at 400 K. [Axes: $D\,(10^{13}/\nu^{*})$ (cm$^2$/s), tick labels from −11 to −6, versus Li concentration from 0 to 1.]
For interstitial components, the chemical potential of the diffusing atoms is defined as

\[ \mu = \frac{dG}{dN} = \frac{dg}{dx} \tag{10} \]

where $G$ is the free energy of the crystal containing the interstitial species and $g$ is the free energy normalized per interstitial site. While the thermodynamic factor is related to the chemical potential according to Eq. (8), it is often convenient to determine $F$ by considering fluctuations in the number of interstitial atoms within the grand canonical ensemble (constant $\mu$, $T$ and $M$):

\[ F = \frac{\langle N \rangle}{\langle N^{2} \rangle - \langle N \rangle^{2}} \tag{11} \]
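As a consistency check between Eqs. (8) and (11), consider a noninteracting lattice gas, for which $\mu / k_B T = \ln[x/(1-x)]$ and Eq. (8) gives $F = 1/(1-x)$. The Python sketch below evaluates Eq. (11) by exact summation over the grand canonical weights. The noninteracting model, the system size and the function name are assumptions chosen for illustration; they are not taken from this chapter.

```python
import math


def thermodynamic_factor_ideal(n_sites, mu_over_kt):
    """Evaluate Eq. (11) exactly for a noninteracting lattice gas.

    In the grand canonical ensemble the weight of N occupied sites out of
    n_sites is C(n_sites, N) * exp(N * mu / k_B T).
    """
    weights = [math.comb(n_sites, n) * math.exp(n * mu_over_kt)
               for n in range(n_sites + 1)]
    z = sum(weights)
    mean_n = sum(n * w for n, w in enumerate(weights)) / z
    mean_n2 = sum(n * n * w for n, w in enumerate(weights)) / z
    x = mean_n / n_sites
    factor = mean_n / (mean_n2 - mean_n ** 2)
    return x, factor


x, factor = thermodynamic_factor_ideal(50, -1.0)
print(x, factor, 1.0 / (1.0 - x))  # the last two numbers should agree
```

For interacting systems the sum over weights is replaced by grand canonical Monte Carlo sampling, but the estimator built from the mean and variance of $N$ is unchanged.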
Diffusion involves redistribution of particles from subregions of the solid with a high concentration of interstitial atoms to other subregions with a low concentration. The thermodynamic factor describes the thermodynamic response to concentration fluctuations within sub-regions.
3.1.2. Two component system

A similar formalism emerges when two different species reside and diffuse over the same interstitial sites of a host. This is the case, for example, for carbon and nitrogen diffusion in iron, or lithium and sodium diffusion over the interstitial sites of a transition metal oxide host. Referring to the two diffusing species as A and B, the flux equations become

\[ \begin{aligned} J_A &= -L_{AA} \nabla \mu_A - L_{AB} \nabla \mu_B \\ J_B &= -L_{BA} \nabla \mu_A - L_{BB} \nabla \mu_B \end{aligned} \tag{12} \]
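where $L_{ij}$ ($i, j$ = A or B) are kinetic coefficients similar to $L$ of Eq. (2). As with Eq. (2), gradients in chemical potential are often not readily accessible experimentally, and Eq. (12) can be written as

\[ \begin{aligned} J_A &= -D_{AA} \nabla C_A - D_{AB} \nabla C_B \\ J_B &= -D_{BA} \nabla C_A - D_{BB} \nabla C_B \end{aligned} \tag{13} \]

where the matrix of diffusion coefficients

\[ \begin{pmatrix} D_{AA} & D_{AB} \\ D_{BA} & D_{BB} \end{pmatrix} = \begin{pmatrix} L_{AA} & L_{AB} \\ L_{BA} & L_{BB} \end{pmatrix} \begin{pmatrix} \dfrac{\partial \mu_A}{\partial C_A} & \dfrac{\partial \mu_A}{\partial C_B} \\[2ex] \dfrac{\partial \mu_B}{\partial C_A} & \dfrac{\partial \mu_B}{\partial C_B} \end{pmatrix} \tag{14} \]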
can again be factorized into a product of a kinetic term (the 2 × 2 $L$ matrix) and a thermodynamic factor (the 2 × 2 matrix of partial derivatives of the chemical potentials). The Green–Kubo expressions relating the macroscopic diffusion coefficients to atomic fluctuations are [13, 14]

\[ L_{ij} = \frac{\left\langle \left( \sum_{\zeta} \vec{R}_{\zeta}^{\,i}(t) \right) \cdot \left( \sum_{\xi} \vec{R}_{\xi}^{\,j}(t) \right) \right\rangle}{(2d)\, t\, v_s M k_B T} \tag{15} \]
where $\vec{R}_{\zeta}^{\,i}$ is the vector linking the end points of the trajectory of atom $\zeta$ of species $i$ after time $t$. Another factorization of $D$ is practical when studying diffusion with a lattice model description of the interactions between the different constituents residing on the crystal network:

\[ D = \tilde{L} \tilde{\Theta} \tag{16} \]

where

\[ \tilde{L}_{ij} = \frac{\left\langle \left( \sum_{\zeta} \vec{R}_{\zeta}^{\,i}(t) \right) \cdot \left( \sum_{\xi} \vec{R}_{\xi}^{\,j}(t) \right) \right\rangle}{(2d)\, t\, M} \tag{17} \]

and

\[ \tilde{\Theta}_{ij} = \frac{\partial \left( \mu_i / k_B T \right)}{\partial x_j} \tag{18} \]
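are respectively matrices of kinetic coefficients and thermodynamic factors. As with the single component interstitial systems, the chemical potentials for a binary component interstitial system are defined as

\[ \mu_i = \frac{\partial G}{\partial N_i} = \frac{\partial g}{\partial x_i} \tag{19} \]

where $i$ refers to either A or B. The components of $\tilde{\Theta}$ can also be written in terms of variances of the number of particles residing on the $M$-site crystal network at constant chemical potentials, that is, in terms of measures of fluctuations:

\[ \tilde{\Theta} = \frac{M}{Q} \begin{pmatrix} \langle N_B^{2} \rangle - \langle N_B \rangle^{2} & -\left( \langle N_A N_B \rangle - \langle N_A \rangle \langle N_B \rangle \right) \\ -\left( \langle N_A N_B \rangle - \langle N_A \rangle \langle N_B \rangle \right) & \langle N_A^{2} \rangle - \langle N_A \rangle^{2} \end{pmatrix} \tag{20} \]

where

\[ Q = \left( \langle N_A^{2} \rangle - \langle N_A \rangle^{2} \right) \left( \langle N_B^{2} \rangle - \langle N_B \rangle^{2} \right) - \left( \langle N_A N_B \rangle - \langle N_A \rangle \langle N_B \rangle \right)^{2}. \]

These fluctuations in $N_A$ and $N_B$ are to be calculated in the grand canonical ensemble at the chemical potentials $\mu_A$ and $\mu_B$ corresponding to the concentrations at which the diffusion coefficient is desired.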
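Equation (20) amounts to $M$ times the inverse of the covariance matrix of $(N_A, N_B)$. A minimal Python sketch of the estimator follows. The function name and the four-entry sample list are hypothetical and serve only to exercise the arithmetic; real input would be particle counts recorded during a grand canonical Monte Carlo run.

```python
def thermodynamic_factor_matrix(samples, n_sites):
    """Evaluate Eq. (20) from grand canonical samples of (N_A, N_B)."""
    count = len(samples)
    mean_a = sum(na for na, _ in samples) / count
    mean_b = sum(nb for _, nb in samples) / count
    var_a = sum(na * na for na, _ in samples) / count - mean_a ** 2
    var_b = sum(nb * nb for _, nb in samples) / count - mean_b ** 2
    cov_ab = sum(na * nb for na, nb in samples) / count - mean_a * mean_b
    q = var_a * var_b - cov_ab ** 2  # determinant of the covariance matrix
    scale = n_sites / q
    return [[scale * var_b, -scale * cov_ab],
            [-scale * cov_ab, scale * var_a]]


# Toy samples with uncorrelated N_A and N_B (not simulation data).
toy_samples = [(1, 0), (0, 1), (1, 1), (0, 0)]
theta = thermodynamic_factor_matrix(toy_samples, 2)
print(theta)
```

When the two species fluctuate independently the off-diagonal elements vanish, and each diagonal element reduces to $M$ divided by the variance of the corresponding particle number.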
3.2. Substitutional Diffusion
The starting point for treating substitutional diffusion in a binary alloy is the set of Green–Kubo relations of Eqs. (14)–(18). However, several modifications and qualifications are necessary. These arise from the fact that alloys are characterized by a dilute concentration of vacancies and that the crystallographic sites on which the diffusing atoms reside are not created externally by a host, but are rather formed by the diffusing atoms themselves. The consequence of this for diffusion is that the chemical potentials appearing in the thermodynamic factor are not the conventional chemical potentials for the individual species A and B of a substitutional alloy, but are rather differences between the chemical potential of each diffusing species and the vacancy chemical potential. Hence the chemical potentials of Eqs. (12), (14) and (18) need to be replaced by $\tilde{\mu}_i$, in which

\[ \tilde{\mu}_i = \mu_i - \mu_V \tag{21} \]
where $\mu_V$ is the vacancy chemical potential in the solid. The reason for this modification is that the chemical potential appearing in the Green–Kubo expression for the diffusion coefficient matrix, Eq. (14), and defined in Eq. (19), corresponds to the change in free energy as component $i$ is added while holding the number of crystalline sites constant, meaning that $i$ is added at the expense of vacancies. This differs from the conventional chemical potentials of alloys, which are defined as the change in free energy of the solid as component $i$ is added by extending the crystalline network of the solid. $\tilde{\mu}_i$ refers to the chemical potential for a fixed crystalline network, while $\mu_i$ and $\mu_V$ correspond to chemical potentials for a solid in which the crystalline network is enlarged as more species are added. The use of $\tilde{\mu}_i$ instead of $\mu_i$ in the thermodynamic factor of the Green–Kubo expressions for the diffusion coefficients of crystalline solids also follows from irreversible thermodynamics [15, 16] as well as thermodynamic considerations of crystalline solids [17]. It can also be understood on physical grounds. By dividing the crystalline solid up into subregions, diffusion can be viewed as the flow of particles from one subregion to the next. Because of the constraint imposed by the crystalline network, the only way for excess atoms from one subregion to be accommodated by a neighboring subregion is through the exchange of vacancies: one subregion gains vacancies while the other loses them. The change in free energy in each subregion due to diffusion occurs by adding or removing atoms at the expense of vacancies.

Another important modification to the treatment of binary interstitial diffusion is the identification of interdiffusion. Interdiffusion in its most explicit form refers to the dissipation of concentration gradients by the intermixing of A and B atoms.
It is this phenomenon of intermixing that enters into continuum descriptions of diffusion couples and phase transformations involving atomic redistribution. Kehr et al. [18] demonstrated that in the limit of dilute vacancy concentrations, the full 2 × 2 diffusion matrix can be diagonalized, producing an eigenvalue $\lambda^{+}$ corresponding to density relaxations due to inhomogeneities in vacancies and an eigenvalue $\lambda^{-}$ corresponding to interdiffusion. The diagonalization of the $D$ matrix is accompanied by a coordinate transformation of the fluxes and the concentration gradients. In matrix notation,

\[ \mathbf{J} = -\mathbf{D} \nabla \mathbf{C} \tag{22} \]

where $\mathbf{J}$ and $\nabla \mathbf{C}$ are column vectors containing as elements $J_A$, $J_B$ and $\nabla C_A$, $\nabla C_B$, respectively. Diagonalization of $\mathbf{D}$ leads to

\[ \mathbf{D} = \mathbf{E} \boldsymbol{\lambda} \mathbf{E}^{-1} \tag{23} \]

where $\boldsymbol{\lambda}$ is a diagonal matrix with components $\lambda^{+}$ (the larger eigenvalue) and $\lambda^{-}$ (the smaller eigenvalue) in the notation of Kehr et al. [18], i.e.,

\[ \boldsymbol{\lambda} = \begin{pmatrix} \lambda^{+} & 0 \\ 0 & \lambda^{-} \end{pmatrix}. \]

The flux equation (22) can then be rewritten as

\[ \mathbf{E}^{-1} \mathbf{J} = -\boldsymbol{\lambda} \mathbf{E}^{-1} \nabla \mathbf{C}. \tag{24} \]
The eigenvalue $\lambda^{-}$, which describes the rate with which gradients in the concentration of A and B atoms dissipate by an intermixing mode, is the most rigorous formulation of what is commonly referred to as an interdiffusion coefficient.
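For a 2 × 2 matrix the eigenvalues of Eq. (23) follow in closed form from the trace and determinant. The Python sketch below uses that formula; the function name and the numerical matrix are hypothetical and chosen only so that the result is easy to verify by hand.

```python
import math


def diffusion_eigenvalues(d_matrix):
    """Return (lambda_plus, lambda_minus) of a 2x2 diffusion matrix."""
    (d_aa, d_ab), (d_ba, d_bb) = d_matrix
    trace = d_aa + d_bb
    det = d_aa * d_bb - d_ab * d_ba
    disc = trace * trace - 4.0 * det
    if disc < 0.0:
        # Eq. (23) with real lambda+ and lambda- requires real eigenvalues.
        raise ValueError("matrix has complex eigenvalues")
    root = math.sqrt(disc)
    return 0.5 * (trace + root), 0.5 * (trace - root)


# Hypothetical matrix in arbitrary units.
lam_plus, lam_minus = diffusion_eigenvalues([[2.0, 1.0], [1.0, 2.0]])
print(lam_plus, lam_minus)  # 3.0 1.0
```

The smaller value returned plays the role of the interdiffusion coefficient in the sense described above.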
4. Cluster Expansion
The Green–Kubo expressions for diffusion coefficients are proportional to the ensemble averages of the square of the collective distance travelled by the diffusing particles of the solid. Trajectories of interacting diffusing particles can be obtained with kinetic Monte Carlo simulations in which particles migrate on a crystalline network with migration rates given by Eq. (1). The migration rates of a specific atom, however, depend on the local arrangement of the other diffusing atoms through the configuration dependence of the activation barrier and frequency prefactor. Ideally, the activation barrier for each local environment would be calculated from first principles. In practice, this is computationally intractable, as the number of configurations is exceedingly large and first-principles activation barrier calculations have a high computational cost. It is here that the cluster expansion formalism [1–3] becomes invaluable as a tool to extrapolate energy values calculated for a few configurations to determine the energy for any arrangement of atoms in a crystalline solid. In this section, we describe the cluster expansion formalism and how it can be applied to characterize the configuration dependence of the activation barrier for diffusion. We first focus on describing the configurational energy of atoms residing at their equilibrium sites, i.e., the configurational energy of the end points of any hop.
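A kinetic Monte Carlo trajectory of the kind referred to above can be sketched in a few lines of Python. The chapter does not specify which kinetic Monte Carlo algorithm is used; the sketch assumes a rejection-free (residence-time) scheme, a one-dimensional ring of sites, and a user-supplied callback `rate_of_hop` that would return the hop frequency of Eq. (1) for the current configuration. All names are hypothetical, and the constant rate in the usage line is a placeholder for a configuration-dependent rate.

```python
import math
import random


def kmc_lattice_gas(n_sites, n_particles, rate_of_hop, n_steps, seed=0):
    """Rejection-free kinetic Monte Carlo for a lattice gas on a 1D ring.

    rate_of_hop(occupied, site, target) returns the hop frequency for the
    particle at `site` moving to the vacant neighboring site `target`.
    Returns (elapsed time, net displacement of each particle in hops).
    """
    rng = random.Random(seed)
    positions = rng.sample(range(n_sites), n_particles)
    occupied = set(positions)
    displacement = [0] * n_particles
    time = 0.0
    for _ in range(n_steps):
        events = []  # (rate, particle index, step direction)
        for p, site in enumerate(positions):
            for step in (-1, 1):
                target = (site + step) % n_sites
                if target not in occupied:
                    events.append((rate_of_hop(occupied, site, target), p, step))
        total = sum(rate for rate, _, _ in events)
        if total == 0.0:
            break  # no hop is possible
        # Select one event with probability proportional to its rate.
        threshold = rng.random() * total
        running = 0.0
        for rate, p, step in events:
            running += rate
            if running > threshold:
                break
        occupied.remove(positions[p])
        positions[p] = (positions[p] + step) % n_sites
        occupied.add(positions[p])
        displacement[p] += step
        # Advance the clock by an exponentially distributed waiting time.
        time += -math.log(1.0 - rng.random()) / total
    return time, displacement


elapsed, moves = kmc_lattice_gas(20, 5, lambda occ, site, target: 1.0, 1000)
print(elapsed, moves)
```

The displacements and elapsed time returned by such a run are the raw ingredients of the Green–Kubo averages in Eqs. (7) and (9); the configuration dependence enters only through the rate callback.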
4.1. General Formalism
We restrict ourselves to binary problems, though the cluster expansion formalism is valid for systems with any number of species [1, 2]. While it is clear that two-component alloys without crystalline defects such as vacancies are binary problems, atoms residing on the interstitial sites of a host can be treated as a binary system as well, with the interstitial atoms constituting one of the components and the vacancies the other. In crystals, atoms can be assigned to well-defined sites, even when relaxations from ideal crystallographic positions occur. There is always a one-to-one correspondence between each atom and a crystallographic site. If there are $M$ crystallographic sites, then there are $2^{M}$ possible arrangements of two species over those sites. To characterize a particular configuration, it is useful to introduce occupation variables $\sigma_i$ that are +1 if an A resides at site $i$ and −1 if a B (which could be an atom different from A or a vacancy) resides there. The vector $\vec{\sigma} = (\sigma_1, \sigma_2, \ldots, \sigma_i, \ldots, \sigma_M)$ then uniquely specifies a configuration. The use of $\vec{\sigma}$, however, is cumbersome, and a more versatile way of uniquely characterizing configurations can be achieved with polynomials $\phi_\alpha$ of occupation variables defined as [1]

\[ \phi_\alpha(\vec{\sigma}) = \prod_{i \in \alpha} \sigma_i \tag{25} \]
where $i$ are sites belonging to a cluster $\alpha$ of crystal sites. Typical examples of clusters are a nearest neighbor pair cluster, a next nearest neighbor pair cluster, a triplet cluster, etc. Examples of clusters on a two-dimensional triangular lattice are illustrated in Fig. 5. There are $2^{M}$ different clusters of sites and therefore $2^{M}$ cluster functions $\phi_\alpha(\vec{\sigma})$.

Figure 5. Examples of clusters for a two-dimensional triangular lattice.

It can be shown [1] that the set of cluster functions $\phi_\alpha(\vec{\sigma})$ forms a complete and orthonormal basis in configuration space with respect to the scalar product

\[ \langle f, g \rangle = \frac{1}{2^{M}} \sum_{\vec{\sigma}} f(\vec{\sigma})\, g(\vec{\sigma}) \tag{26} \]

where $f$ and $g$ are any scalar functions of configuration. The sum in Eq. (26) extends over all possible configurations of A and B atoms over the $M$ sites of the crystal. Because of their completeness and orthonormality over the space of configurations, it is possible to expand any function of configuration $f(\vec{\sigma})$ as a linear combination of the cluster functions $\phi_\alpha(\vec{\sigma})$. In particular, the configurational energy (with atoms relaxed around the crystallographic positions of the crystal) can be written as

\[ E(\vec{\sigma}) = E_o + \sum_{\alpha} V_\alpha \phi_\alpha(\vec{\sigma}) \tag{27} \]

where the sum extends over all clusters $\alpha$ over the $M$ sites. The coefficients $V_\alpha$ are constants and formally follow from the scalar product of $E(\vec{\sigma})$ with the cluster function $\phi_\alpha(\vec{\sigma})$:

\[ V_\alpha = \langle E(\vec{\sigma}), \phi_\alpha(\vec{\sigma}) \rangle = \frac{1}{2^{M}} \sum_{\vec{\sigma}} E(\vec{\sigma})\, \phi_\alpha(\vec{\sigma}). \tag{28} \]

$E_o$ is the coefficient of the empty cluster $\phi_o = 1$ and is the average of $E(\vec{\sigma})$ over all configurations. Equation (27) is referred to as a cluster expansion of the configurational energy, and the coefficients of the expansion $V_\alpha$ are called effective cluster interactions (ECI). Equation (27) can be viewed as a generalized Ising model Hamiltonian containing not only nearest neighbor pair interactions, but also all other pair and multibody interactions extending beyond the nearest neighbors. Through Eq. (28), a formal link is made between the interaction parameters of the generalized Ising model and the configuration-dependent ground state energies of the solid in each configuration $\vec{\sigma}$. Clearly, the cluster expansion for the configurational energy, Eq. (27), is only useful if it converges rapidly, i.e., if there exists a maximal cluster $\alpha_{\max}$ such that all ECI corresponding to clusters larger than $\alpha_{\max}$ can be neglected. In this case, the cluster expansion can be truncated to yield

\[ E(\vec{\sigma}) = E_o + \sum_{\alpha}^{\alpha_{\max}} V_\alpha \phi_\alpha(\vec{\sigma}) \tag{29} \]
A priori mathematical criteria for the convergence of the configurational energy cluster expansion do not exist. Experience indicates that convergence depends on the particular system being considered. In general, though, it can be expected that the lower order clusters extending over a limited range within the crystal will have the largest contribution in the cluster expansion.
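For a very small number of sites, the orthonormality of the cluster functions and the projection of Eq. (28) can be checked by brute-force enumeration. The Python sketch below does so for $M = 3$; the toy energy model and its ECI values are hypothetical and serve only to show that the scalar product recovers the coefficients that were put in.

```python
from itertools import combinations, product


def cluster_function(cluster, sigma):
    """phi_alpha(sigma) of Eq. (25): product of sigma_i over the cluster sites."""
    value = 1
    for site in cluster:
        value *= sigma[site]
    return value


def scalar_product(f, g, n_sites):
    """Eq. (26): average of f*g over all 2^M configurations."""
    configs = list(product((1, -1), repeat=n_sites))
    return sum(f(s) * g(s) for s in configs) / len(configs)


n_sites = 3
# All 2^M clusters, from the empty cluster () to the full triplet (0, 1, 2).
clusters = [c for size in range(n_sites + 1)
            for c in combinations(range(n_sites), size)]

# Hypothetical ECI for a toy energy model (empty, point and pair terms only).
true_eci = {(): -1.0, (0,): 0.2, (0, 1): 0.05, (1, 2): 0.05}


def energy(sigma):
    return sum(v * cluster_function(c, sigma) for c, v in true_eci.items())


# Eq. (28): project the energy onto each cluster function to recover the ECI.
recovered = {c: scalar_product(energy, lambda s, c=c: cluster_function(c, s), n_sites)
             for c in clusters}
print(recovered)
```

The enumeration scales as $2^{M}$ and is therefore feasible only for toy systems, which is the reason fitting procedures are used in practice instead of the formal scalar product.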
4.2. Symmetry and the Cluster Expansion
Simplifications to the cluster expansion (27) or (29) can be made by taking the symmetry of the crystal into account [2]. Clusters are said to be equivalent by symmetry if they can be mapped onto each other with at least one space group symmetry operation. For example, clusters $\alpha$ and $\beta$ of Fig. 5 are equivalent, since a clockwise rotation of $\alpha$ by 60° followed by a translation by the vector $2\vec{b}$ maps $\alpha$ onto $\beta$. The ECI corresponding to clusters that are equivalent by symmetry have the same numerical value. In the case of $\alpha$ and $\beta$ of Fig. 5, $V_\alpha = V_\beta$. All clusters that are equivalent by symmetry are said to belong to an orbit $\Omega_\alpha$, where $\alpha$ is a representative cluster of the orbit. For any arrangement $\vec{\sigma}$ we can define averages over cluster functions $\langle \phi_\alpha(\vec{\sigma}) \rangle$ as

\[ \langle \phi_\alpha(\vec{\sigma}) \rangle = \frac{1}{|\Omega_\alpha|} \sum_{\beta \in \Omega_\alpha} \phi_\beta(\vec{\sigma}) \tag{30} \]

where the sum extends over all clusters $\beta$ belonging to the orbit $\Omega_\alpha$ and $|\Omega_\alpha|$ represents the number of clusters that are symmetrically equivalent to $\alpha$. The $\langle \phi_\alpha(\vec{\sigma}) \rangle$ are commonly referred to as correlation functions. Using the definition of the correlation functions and the fact that symmetrically equivalent clusters have the same ECI, we can rewrite the configurational energy normalized by the number of primitive unit cells $N_p$ (i.e., the number of Bravais lattice points of the crystal, which is not necessarily equal to the number of crystal sites $M$) as

\[ e(\vec{\sigma}) = \frac{E(\vec{\sigma})}{N_p} = V_o + \sum_{\alpha} m_\alpha V_\alpha \langle \phi_\alpha(\vec{\sigma}) \rangle \tag{31} \]

where $m_\alpha$ is the multiplicity of the cluster $\alpha$, defined as the number of clusters per Bravais lattice point symmetrically equivalent to $\alpha$ (i.e., $m_\alpha = |\Omega_\alpha| / N_p$), and $V_o = E_o / N_p$. The sum in (31) is only performed over the symmetrically non-equivalent clusters.
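As a minimal illustration of Eqs. (30) and (31), consider a periodic one-dimensional chain with one site per primitive cell, so that the point orbit and the nearest neighbor pair orbit each have multiplicity 1. The chain geometry, the truncation after the nearest neighbor pair and the ECI values in the usage lines are assumptions made for this sketch only.

```python
def point_correlation(sigma):
    """Eq. (30) for the point-cluster orbit of a periodic chain."""
    return sum(sigma) / len(sigma)


def pair_correlation(sigma):
    """Eq. (30) for the nearest neighbor pair orbit (chain of 3 or more sites)."""
    n = len(sigma)
    return sum(sigma[i] * sigma[(i + 1) % n] for i in range(n)) / n


def energy_per_cell(sigma, v_empty, v_point, v_pair):
    """Eq. (31) truncated after the nearest neighbor pair; multiplicities are 1."""
    return (v_empty
            + v_point * point_correlation(sigma)
            + v_pair * pair_correlation(sigma))


ordered = [1, -1, 1, -1]   # alternating A/B arrangement
uniform = [1, 1, 1, 1]     # all sites occupied by A
print(point_correlation(ordered), pair_correlation(ordered))  # 0.0 -1.0
print(energy_per_cell(uniform, -1.0, 0.5, 0.25))              # -0.25
```

Only one ECI per orbit is needed, so the number of fitting parameters is set by the number of symmetrically distinct clusters retained rather than by the number of sites.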
4.3. Determination of the ECI
According to Eq. (28), the ECI for the energy cluster expansion are determined by the first-principles ground state energies for all the different configurations $\vec{\sigma}$. Explicitly calculating the ECI according to the scalar product Eq. (28) is intractable. Techniques such as direct configurational averaging (DCA), though, have been devised to approximate the scalar product (28) [2, 19, 20]. In recent years, the preferred method of obtaining ECI has been an inversion method [21–29]. In this approach, energies $E(\vec{\sigma}_I)$ for a set of $P$ periodic configurations $\vec{\sigma}_I$ with $I = 1, \ldots, P$ are calculated from first principles, and a truncated form of (31) is inverted such that it reproduces the $E(\vec{\sigma}_I)$ within a tolerable error when Eq. (31) is evaluated for configuration $\vec{\sigma}_I$. The simplest inversion scheme uses a least squares fit. More sophisticated algorithms involving linear programming techniques [30], cross-validation optimization [32] or the inclusion of k-space terms to account for long-range elastic strain have been developed [33, 34].
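The simplest inversion scheme, a least squares fit, can be sketched with the normal equations and a small Gaussian elimination routine in plain Python. The correlation rows below correspond to four orderings of a periodic four-site chain, and the energies are synthetic values generated from assumed ECI rather than first-principles data; all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a nonsingular matrix; a rank-deficient fit would divide by zero.
    """
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_eci(correlations, energies):
    """Least squares ECI from the normal equations (X^T X) V = X^T E."""
    n = len(correlations[0])
    xtx = [[sum(row[i] * row[j] for row in correlations) for j in range(n)]
           for i in range(n)]
    xte = [sum(row[i] * e for row, e in zip(correlations, energies))
           for i in range(n)]
    return solve_linear(xtx, xte)


# Each row holds (1, point correlation, pair correlation) for one ordering.
correlations = [[1.0, 1.0, 1.0], [1.0, -1.0, 1.0],
                [1.0, 0.0, -1.0], [1.0, 0.5, 0.0]]
reference_eci = [-1.0, 0.1, 0.05]  # assumed values used to generate the data
energies = [sum(v * c for v, c in zip(reference_eci, row)) for row in correlations]
fitted = fit_eci(correlations, energies)
print(fitted)
```

With noise-free synthetic data the fit returns the input ECI; with real first-principles energies the residual of the fit, or a cross-validation score, indicates whether the chosen truncation is adequate.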
4.4. Local Cluster Expansion
The traditional cluster expansion formalism described so far is applicable to the configurational energy of the solid, which is an extensive quantity. We will refer to these expansions as extended cluster expansions. Activation barriers, however, are equal to the difference between the energy of the solid when the migrating atom is at the activated state and that when the migrating atom is at the initial equilibrium site. Hence, the configuration dependence of the activation barrier of an atom needs to be described by a cluster expansion with no translational symmetry, and as such it converges to a fixed value as the system size grows. Not only is the activation barrier a function of configuration, but it also depends on the direction of the hop. This is schematically illustrated in Fig. 6, in which the end points of the hop have different configurational energies. Describing the configuration dependence of the activation barrier independent of the direction of the hop is straightforward if a kinetically resolved activation barrier is introduced [3], defined as

\[ \Delta E_{KRA} = E_{act} - \frac{1}{n} \sum_{j=1}^{n} E_j \tag{32} \]
∆Eb
∆EKRA
∆Eb
Figure 6. The activation barrier for migration depends on the direction of the hop when the energies of the end points of the hop are different.
384
A. Van der Ven and G. Ceder
where E act is the energy of the solid with the migrating atom at the activated state and E j are the energies of the solid with the migrating atom at the end points j of the hop. In most solids, there are n=2 end points to a hop, however, it is possible that more end points exist. All terms in Eq. (32) depend on the arrangement of atoms surrounding the end points of the hop and the activated state. The dependence of E KRA on configuration can, be described with a cluster expansion that has a point group symmetry compatible with the symmetry of the crystal as well as that of the activated state. For this reason, the cluster expansion of E KRA is called a local cluster expansion [3]. The kinetically resolved activation barrier is not the true activation barrier that enters in the transition state theory expression for the hop frequency, Eq. (1). It is merely a useful quantity that characterizes the configuration dependence of the activated state independent of the direction of the hop. The true activation barrier can be calculated from E KRA using
n 1 E j − Ei E b = E KRA + n j =1
(33)
where E i is the energy of the crystal with the migrating atom at the initial site of the hop. All quantities on the right hand side of Eq. (33) can be described with either a local cluster expansion (for E KRA ) or an extended cluster expansion (for the configurational energy of the solid).
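Equations (32) and (33) can be checked with a few lines of arithmetic. The sketch below uses hypothetical energies in eV for a hop with n = 2 end points; the function names are illustrative and not taken from any code of the authors.

```python
def kra_barrier(e_act, e_ends):
    """Kinetically resolved activation barrier, Eq. (32)."""
    return e_act - sum(e_ends) / len(e_ends)

def hop_barrier(e_kra, e_ends, e_init):
    """Direction-dependent activation barrier, Eq. (33)."""
    return e_kra + sum(e_ends) / len(e_ends) - e_init

# Hypothetical energies (eV) of the solid with the migrating atom at the
# left end point, at the right end point, and at the activated state.
e_left, e_right, e_act = 0.00, 0.10, 0.45
ends = [e_left, e_right]

e_kra = kra_barrier(e_act, ends)             # one value for both directions
forward = hop_barrier(e_kra, ends, e_left)   # hop starting from the left site
backward = hop_barrier(e_kra, ends, e_right) # hop starting from the right site
print(e_kra, forward, backward)
```

The two directional barriers differ by the energy difference of the end points, while the kinetically resolved barrier is the same for both directions, which is what makes it convenient to cluster expand.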
5. Practical Implementation
Calculating diffusion coefficients from first-principles in multicomponent solids involves three steps. First, a variety of ab initio energies for different atomic arrangements need to be calculated with an accurate first-principles method. This includes energies for a wide range of atomic arrangements over the sites of the crystal, as well as energies for migrating atoms placed at activated states surrounded by different arrangements. The latter calculations are typically performed with an atom at the activated state in large supercells. A useful technique to find the activated state between two equilibrium end points is the nudged elastic band method [31] which determines the lowest energy path between two equilibrium states. Calculating the vibrational prefactor requires a calculation of the phonon density of states for different atomic arrangements both with the migrating atom at its equilibrium site and at the activated state. While sophisticated techniques have been devised to characterize the configurational dependence of the vibrational free energy of a solid [35], for diffusion studies, a convenient simplification is the local harmonic approximation [36].
In the second step, the first-principles energy values for different atomic arrangements are used to determine the coefficients of both a local cluster expansion (for the kinetically resolved activation barriers) and a traditional extended cluster expansion (for the energy of the crystal with all atoms at non-activated crystallographic sites), with either a least squares fit or one of the more sophisticated methods alluded to above. The cluster expansions enable the calculation of the energy and activation barrier for any arrangement of atoms on the crystal. They serve as a convenient and robust tool to extrapolate accurate first-principles energies calculated for a few configurations to the energy of any configuration. Hence the migration rates of Eq. (1) can be calculated for any arrangement of atoms.
The final step is the combination of the cluster expansions with kinetic Monte Carlo simulations to calculate the quantities entering the Green–Kubo expressions for the diffusion coefficients. Kinetic Monte Carlo simulations have been discussed extensively elsewhere [3, 37, 38]. Applied to diffusion in crystals, kinetic Monte Carlo algorithms are used to simulate the stochastic migrations of many atoms, hopping to neighboring sites with frequencies given by Eq. (1). A kinetic Monte Carlo simulation starts from a representative arrangement of atoms (typically obtained with a standard Monte Carlo method for lattice models). As atoms migrate, their trajectories and the elapsed time are recorded, enabling the calculation of the quantities between the brackets in the Green–Kubo expressions. Since the Green–Kubo expressions involve ensemble averages, many kinetic Monte Carlo runs that start from different representative initial conditions are necessary. Depending on the desired accuracy, averages need to be performed over the trajectories departing from between 100 and 10 000 different initial conditions.
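The kinetic Monte Carlo step can be sketched as a rejection-free loop of the kind introduced in Ref. [37]. The example below rests on several assumptions: a one-dimensional periodic lattice gas with site exclusion, an Arrhenius form for the hop frequency of Eq. (1), and a hypothetical `barrier_for_hop` function that returns a constant where a real calculation would evaluate the local and extended cluster expansions.

```python
import math
import random

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def hop_rate(barrier_ev, temperature_k, prefactor_hz=1.0e13):
    """Arrhenius-type hop frequency (assumed form of the rate expression)."""
    return prefactor_hz * math.exp(-barrier_ev / (K_B_EV * temperature_k))

def barrier_for_hop(occupied, site, direction):
    """Hypothetical stand-in for a cluster-expanded activation barrier (eV)."""
    return 0.30

def kmc_run(n_sites, n_atoms, n_steps, temperature_k, rng):
    """Rejection-free kinetic Monte Carlo on a periodic 1D lattice gas."""
    occupied = [i < n_atoms for i in range(n_sites)]
    rng.shuffle(occupied)                       # random initial arrangement
    position = [i for i, occ in enumerate(occupied) if occ]
    displacement = [0] * n_atoms                # unwrapped, in lattice units
    elapsed = 0.0
    for _ in range(n_steps):
        events, rates = [], []
        for atom, site in enumerate(position):
            for direction in (-1, 1):
                if not occupied[(site + direction) % n_sites]:
                    events.append((atom, direction))
                    barrier = barrier_for_hop(occupied, site, direction)
                    rates.append(hop_rate(barrier, temperature_k))
        if not events:
            break
        total_rate = sum(rates)
        # Pick one event with probability proportional to its rate.
        atom, direction = rng.choices(events, weights=rates)[0]
        old_site = position[atom]
        new_site = (old_site + direction) % n_sites
        occupied[old_site], occupied[new_site] = False, True
        position[atom] = new_site
        displacement[atom] += direction
        # Advance the clock by an exponentially distributed waiting time.
        elapsed += -math.log(1.0 - rng.random()) / total_rate
    return displacement, elapsed

displacement, elapsed = kmc_run(n_sites=20, n_atoms=5, n_steps=500,
                                temperature_k=300.0, rng=random.Random(1))
print(displacement, elapsed)
```

The recorded displacements and elapsed time are the raw ingredients of the averages mentioned above. A production calculation would repeat the run from many equilibrated starting configurations and use a lattice and event list appropriate to the crystal.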
6. Examples
Two examples of first-principles calculations of diffusion coefficients in multicomponent solids are reviewed in this section. The first is for lithium diffusion in Li$_x$CoO$_2$ and is an example of nondilute interstitial diffusion. The second example, diffusion in the fcc-based Al–Li alloy, corresponds to a substitutional system.
6.1. Interstitial Diffusion
Li$_x$CoO$_2$ consists of a host structure made up of a CoO$_2$ framework. Layers of interstitial sites that can be occupied by lithium ions reside between O–Co–O slabs. The interstitial sites are octahedrally coordinated by oxygen and they form two-dimensional triangular lattices. As described in Section 2.1, two migration mechanisms exist for lithium: a single vacancy mechanism, whereby lithium squeezes through a dumbbell of oxygen atoms into an isolated vacancy, and a divacancy mechanism, in which lithium migrates through an adjacent tetrahedral site into a vacant site that is part of a divacancy [3]. The two migration mechanisms are illustrated in Fig. 1. Not only does the local arrangement of lithium ions around a hopping ion determine the migration mechanism, it also affects the value of the activation barrier for a particular migration mechanism. Figure 7 illustrates kinetically resolved activation barriers calculated from first principles (LDA) for a variety of different lithium-vacancy arrangements around the migrating ion at different bulk lithium concentrations [3]. Note that for a given bulk composition, many possible lithium-vacancy arrangements around an atom in the activated state exist. The kinetically resolved activation barriers illustrated in Fig. 7 correspond to only a small subset of these many configurations. The local cluster expansion is used to extrapolate from this set to all the configurations needed in a kinetic Monte Carlo simulation. Figure 7 shows that the activation barrier for the divacancy migration mechanism can vary by more than 200 meV with lithium concentration. The increase in activation barrier upon lithium removal from the host can be traced to the contraction of the host along the c-axis as the lithium concentration is reduced [3].
Figure 7. A sample of first-principles (LDA) kinetically resolved activation barriers $\Delta E_{KRA}$ for the divacancy hop mechanism (circles) and the single vacancy mechanism (squares). (Axes: activation barrier in meV versus Li concentration.)
This contraction disproportionately penalizes the activated state over the end point states of the divacancy hop mechanism. Another contribution to the variation in activation barrier with composition derives from the fact that the activated state is in very close proximity to a Co ion, which becomes progressively more oxidized (i.e., its effective charge becomes more positive) as the overall lithium concentration is reduced [3, 29]. This leads to an increase in the electrostatic repulsion between the activated Li and the Co as x is reduced. Extended and local cluster expansions can be constructed to describe both the configurational energy of Li$_x$CoO$_2$ and the configuration dependence of the kinetically resolved activation barriers. An extended cluster expansion for the first-principles configurational energy of Li$_x$CoO$_2$ has been described in detail in Ref. [29]. This cluster expansion, when combined with Monte Carlo simulations, accurately predicts phase stability in Li$_x$CoO$_2$. In particular, two ordered lithium-vacancy phases are predicted at x = 1/2 and x = 1/3. Both phases are observed experimentally [39, 40]. A local cluster expansion for the kinetically resolved activation barriers has been described in Ref. [3]. Figure 8 illustrates calculated diffusion coefficients at 300 K determined by applying kinetic Monte Carlo simulations to the cluster expansions of Li$_x$CoO$_2$ [3]. While the configuration dependence of the activation barriers was rigorously accounted for with the cluster expansions, no attempt was made in these calculations to describe the migration rate prefactor $\nu^*$ from first principles. Instead, a value of $10^{13}$ Hz was used for all compositions and environments. Figure 8(a) shows both $D_J$ and the chemical diffusion coefficient $D$, while Fig. 8(b) illustrates the thermodynamic factor F, which was determined by calculating fluctuations in the number of lithium particles in grand canonical Monte Carlo simulations [3] (see Section 3.1).
Notice that the calculated diffusion coefficient varies by several orders of magnitude with composition, showing that the assumption of a concentration independent diffusion coefficient in this system is unjustified. The thermodynamic factor F is a measure of the deviation from ideality. In the dilute limit (x → 0), interactions between lithium ions are negligible and the configurational thermodynamics approximates that of an ideal solution. In this limit the thermodynamic factor is 1. As x increases from 0, and the solid departs from ideal behavior, the thermodynamic factor increases substantially. The local minima in $D_J$ and $D$ at x = 1/2 and x = 1/3 are a result of lithium ordering at those compositions. Lithium-vacancy ordering effectively locks lithium ions into energetically favorable sublattice positions, which reduces ionic mobility. The thermodynamic factor, on the other hand, exhibits peaks at x = 1/2 and x = 1/3, as the configurational thermodynamics of an ordered phase deviates strongly from ideal behavior. The peak signifies the fact that in an ordered phase, a small gradient in composition leads to an enormous gradient in chemical potential, and hence a large thermodynamic driving force for diffusion. This partly compensates for the reduction in $D_J$.
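The connection between particle-number fluctuations and the thermodynamic factor can be made concrete with a small numerical check. The sketch assumes the fluctuation expression takes the commonly used form F = ⟨N⟩ / (⟨N²⟩ − ⟨N⟩²); Section 3.1 remains the authoritative statement of the formula used in the chapter. It is applied to a non-interacting lattice gas, for which grand canonical sampling reduces to independent site occupation and the expected result is 1/(1 − x).

```python
import numpy as np

def thermodynamic_factor(particle_numbers):
    """Assumed fluctuation estimate: <N> / (<N^2> - <N>^2)."""
    n = np.asarray(particle_numbers, dtype=float)
    return n.mean() / n.var()

def sample_ideal_lattice_gas(n_sites, x, n_samples, rng):
    """Grand canonical samples of N for non-interacting sites with occupancy x."""
    return rng.binomial(n_sites, x, size=n_samples)

rng = np.random.default_rng(0)
results = {}
for x in (0.01, 0.25, 0.50):
    samples = sample_ideal_lattice_gas(n_sites=400, x=x, n_samples=200_000, rng=rng)
    results[x] = thermodynamic_factor(samples)
    print(x, results[x], 1.0 / (1.0 - x))
```

In the dilute limit the estimate approaches 1, in line with the ideal-solution behavior described above. Interactions and ordering, which this toy model omits, are what suppress the fluctuations and produce the large peaks seen at the ordered compositions.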
Figure 8. (a) Calculated self diffusion coefficient $D_J$ and chemical diffusion coefficient $D$ for Li$_x$CoO$_2$ at 300 K. (b) The thermodynamic factor of Li$_x$CoO$_2$ at 300 K. (Axes: (a) $D\,(10^{13}/\nu^*)$ in cm$^2$/s on a logarithmic scale versus Li concentration; (b) thermodynamic factor on a logarithmic scale versus Li concentration.)
A similar computational approach can be followed to determine, for example, the diffusion coefficient for oxygen diffusion on a platinum (111) surface. If, in addition to oxygen, sulfur atoms are also adsorbed on the platinum surface, Green–Kubo relations for binary interstitial diffusion would be needed. Furthermore, ternary cluster expansions are then necessary to describe the configuration dependence of the energy and kinetically resolved activation barrier, as there are then three species: oxygen, sulfur and vacancies.
6.2. Substitutional Diffusion
To illustrate diffusion in a binary substitutional solid, we consider the fcc Al–Li alloy. While Al$_{1-x}$Li$_x$ is predominantly stable in the bcc-based crystal structure, it is metastable in fcc up to x = 0.25. In fact, it is the metastable fcc form of Al$_{1-x}$Li$_x$ that strengthens this alloy, an important candidate for aerospace applications. A first step in determining the diffusion coefficients in this system is an accurate first-principles characterization of the alloy thermodynamics. This can be done with a binary cluster expansion for the configurational energy [26]. The expansion coefficients of the cluster expansion were fit to the first-principles energies (LDA) of more than 70 different periodic lithium-aluminum arrangements on the fcc lattice [41]. Figure 9(a) illustrates the calculated metastable fcc-based phase diagram of Al$_{1-x}$Li$_x$ obtained by applying Monte Carlo simulations to the cluster expansion [41]. The phase diagram shows that a solid solution phase is stable at low lithium concentration and at high temperature. At x = 0.25, the L1$_2$ ordered phase is stable. In this ordered phase the Li atoms occupy the corner points of the conventional cubic fcc unit cell. Diffusion in most metals is dominated by a vacancy mechanism. Hence it is not sufficient to simply characterize the thermodynamics of the strictly binary Al–Li alloy. Real alloys always have a dilute concentration of vacancies that wander through the crystal and in the process redistribute the atoms of the solid. The vacancies themselves have a thermodynamic preference for particular local environments over others, which in turn affects the mobility of the vacancies. Treating vacancies in addition to Al and Li makes the problem a ternary one and in principle would require a ternary cluster expansion.
Nevertheless, since vacancies are present in dilute concentrations, a ternary cluster expansion can be avoided by using a local cluster expansion to describe the configuration dependence of the vacancy formation energy [41]. In effect, the local cluster expansion serves as a perturbation to the binary cluster expansion to describe the interaction of a dilute concentration of a third component, in this case the vacancy. A local cluster expansion for the vacancy formation energy in fcc Al–Li was constructed by fitting to first-principles (LDA) vacancy formation energies in 23 different Al–Li arrangements [41]. Combining the vacancy formation local cluster expansion with the binary cluster expansion for Al–Li in Monte Carlo simulations enables a calculation of the equilibrium vacancy concentration as a function of alloy composition and temperature. Figure 9(b) illustrates the result for Al–Li at 600 K [41]. While the vacancy concentration is more or less constant in the solid solution phase, it can vary by an order of magnitude over a small concentration range in the ordered L1$_2$ phase at 600 K. Another thermodynamic property that is of importance for diffusion is the equilibrium short range order around a vacancy in fcc Al–Li. Monte Carlo simulations using the cluster expansions predict that the vacancy repels lithium atoms, preferring a nearest neighbor shell rich in aluminum. Illustrated in Fig. 9(c) is the lithium concentration in shells at varying distance around a vacancy. The lithium concentration in the first nearest neighbor shell is less than the bulk alloy composition, while it is slightly higher than the bulk composition in the second nearest neighbor shell. This indicates that the vacancy repels Li and attracts Al. In the ordered phase, stable at 600 K between x = 0.23 and 0.3, the degree of order around the vacancy is even more pronounced, as illustrated in Fig. 9(c). Between x = 0.23 and 0.3, the vacancy is predominantly surrounded by Al in its first and third nearest neighbor shells and by Li in its second and fourth nearest neighbor shells. This corresponds to a situation in which the vacancy occupies the lithium sublattice of the L1$_2$ ordered phase. Clearly the thermodynamic preference of the vacancies for a specific local environment will have an impact on their mobility through the crystal.
Figure 9. (a) First-principles calculated phase diagram of the fcc-based Al$_{1-x}$Li$_x$ alloy. (b) Calculated equilibrium vacancy concentration as a function of bulk alloy composition at 600 K. (c) Average lithium concentration in the first two nearest neighbor shells around a vacancy. The dashed line corresponds to the average bulk lithium concentration. (Axes: (a) temperature in K, with the solid solution and L1$_2$ regions labeled; (b) vacancy concentration; (c) Li concentration around the vacancy for the 1st and 2nd nearest neighbor shells; each versus x in Al$_{1-x}$Li$_x$.)
While thermodynamic equilibrium determines the degree of order within the alloy and which environments the vacancies are attracted to, atomic migration mediated by a vacancy mechanism involves passing through activated states, which requires passing over an energy barrier that also depends on the local degree of order. Contrary to what is predicted for Li$_x$CoO$_2$, the kinetically resolved activation barriers in fcc Al$_{1-x}$Li$_x$ are not very sensitive to configuration and bulk composition [42]. For each type of atom (Al or Li), the variations in kinetically resolved activation barriers are within the numerical errors of the first-principles method (50 meV for a plane wave pseudopotential method using 107-atom supercells). This is likely the result of a negligible variation in volume of fcc Al$_{1-x}$Li$_x$ with composition. But while the migration barriers do not depend significantly on configuration, they are very different depending on which atom performs the hop. The first-principles calculated migration barriers for Al hops are systematically between 150 and 200 meV larger than for Li hops [42]. The thermodynamic tendency of the vacancy to repel lithium atoms deprives Li of diffusion-mediating defects. Kinetically, though, Li has a lower activation barrier relative to Al for migration into an adjacent vacancy. Hence a trade-off exists between thermodynamics and kinetics. While Li exchanges more readily with a neighboring vacancy, thermodynamically it has less access to those vacancies. Quantitatively determining the effect of this trade-off requires explicit evaluation of diffusion coefficients. This can be done by applying kinetic Monte Carlo simulations to cluster expansions that describe the configurational energy and kinetically resolved activation barriers for Al, Li and dilute vacancies on the fcc lattice. Figure 10 illustrates the calculated interdiffusion coefficient at 600 K obtained by diagonalizing the D matrix of Eq. (14) [42]. The coefficient for interdiffusion describes the rate at which the Al and Li atoms intermix in the presence of a concentration gradient in the two species. The calculated interdiffusion coefficient is more or less constant in the solid solution phase, but drops by more than an order of magnitude in the L1$_2$ ordered phase. The thermodynamic preference of the vacancies for the lithium sublattice sites of L1$_2$ dramatically constricts the trajectory of the vacancies, leading to a drop in overall mobility of Li and Al.
Figure 10. Calculated interdiffusion coefficient (the $\lambda^-$ eigenvalue of the 2 × 2 D matrix) for the fcc Al$_{1-x}$Li$_x$ alloy at 600 K. (Axes: interdiffusion coefficient in cm$^2$/s on a logarithmic scale versus x in Al$_{1-x}$Li$_x$; a two-phase coexistence region is marked.)
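Diagonalizing a 2 × 2 diffusion matrix is a one-line operation, sketched below. The matrix entries are hypothetical numbers chosen for easy checking, not values from Ref. [42], and $\lambda^-$ is taken here to be the smaller of the two eigenvalues, which is an assumption about the notation of Eq. (14).

```python
import numpy as np

# Hypothetical 2 x 2 diffusion coefficient matrix in cm^2/s (not data from the chapter).
D = np.array([[3.0e-12, 1.0e-12],
              [2.0e-12, 4.0e-12]])

# The matrix need not be symmetric, so the general eigenvalue routine is used.
eigenvalues = np.sort(np.linalg.eigvals(D).real)
lambda_minus, lambda_plus = eigenvalues
print(lambda_minus, lambda_plus)
```

For this example the eigenvalues follow from the trace (7e-12) and determinant (10e-24), giving 2e-12 and 5e-12 cm$^2$/s.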
7. Conclusion
In this chapter, we have presented the statistical mechanical formalism that relates phenomenological diffusion coefficients for multicomponent solids to microscopic fluctuations of the solid at equilibrium. We have focussed on diffusion that is mediated by a vacancy mechanism and have distinguished between interstitial systems and substitutional systems. An important property of multicomponent solids is the existence of configurational disorder among the constituent species. This adds a level of complexity in calculating diffusion coefficients from first principles, since the activation barriers vary along an atom's trajectory as a result of variations in the local degree of atomic order. In this respect, the cluster expansion is an invaluable tool to describe the dependence of the energy, in particular of the activation barrier, on atomic configuration. While the formalism of calculating diffusion coefficients from first principles in multicomponent solids has been established, many opportunities exist to apply it to a wide variety of multicomponent crystalline solids, including metals, ceramics and semiconductors. Faster computers and improvements to electronic structure methods that go beyond density functional theory will lead to more accurate first-principles approximations to activation barriers and vibrational prefactors. It is only a matter of time before first-principles diffusion coefficients for multicomponent solids are routinely used in continuum simulations of diffusional phase transformations and electrochemical devices such as batteries and fuel cells.
Acknowledgments
We acknowledge support from the AFOSR, grant F49620-99-1-0272, and the Department of Energy, Office of Basic Energy Sciences, under Contract No. DE-FG02-96ER45571. Additional support came from NSF (ACI-9619020) through computing resources provided by NPACI at the San Diego Supercomputer Center.
References
[1] J.M. Sanchez, F. Ducastelle, and D. Gratias, Physica A, 128, 334, 1984.
[2] D. de Fontaine, In: H. Ehrenreich and D. Turnbull (eds.), Solid State Physics, Academic Press, New York, p. 33, 1994.
[3] A. Van der Ven, G. Ceder, M. Asta, and P.D. Tepesch, Phys. Rev. B, 64, 184307, 2001.
[4] S.R. de Groot and P. Mazur, Non-Equilibrium Thermodynamics, Dover Publications, Mineola, NY, 1984.
[5] G.H. Vineyard, J. Phys. Chem. Solids, 3, 121, 1957.
[6] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, Oxford, 1987.
[7] R. Zwanzig, Annu. Rev. Phys. Chem., 16, 67, 1965.
[8] R. Zwanzig, J. Chem. Phys., 40, 2527, 1964.
[9] Y. Zhou and G.H. Miller, J. Phys. Chem., 100, 5516, 1996.
[10] R. Gomer, Rep. Prog. Phys., 53, 917, 1990.
[11] M. Tringides and R. Gomer, Surf. Sci., 145, 121, 1984.
[12] C. Uebing and R. Gomer, J. Chem. Phys., 95, 7626, 1991.
[13] A.R. Allnatt, J. Chem. Phys., 43, 1855, 1965.
[14] A.R. Allnatt, J. Phys. C: Solid State Phys., 15, 5605, 1982.
[15] R.E. Howard and A.B. Lidiard, Rep. Prog. Phys., 27, 161, 1964.
[16] A.R. Allnatt and A.B. Lidiard, Rep. Prog. Phys., 50, 373, 1987.
[17] J.W. Cahn and F.C. Larche, Scripta Met., 17, 927, 1983.
[18] K.W. Kehr, K. Binder, and S.M. Reulein, Phys. Rev. B, 39, 4891, 1989.
[19] C. Wolverton, G. Ceder, D. de Fontaine, and H. Dreysse, Phys. Rev. B, 45, 13105, 1992.
[20] C. Wolverton and A. Zunger, Phys. Rev. B, 50, 10548, 1994.
[21] J.W.D. Connolly and A.R. Williams, Phys. Rev. B, 27, 5169, 1983.
[22] J.M. Sanchez, J.P. Stark, and V.L. Moruzzi, Phys. Rev. B, 44, 5411, 1991.
[23] Z.W. Lu, S.H. Wei, A. Zunger, S. Frotapessoa, and L.G. Ferreira, Phys. Rev. B, 44, 512, 1991.
[24] M. Asta, D. de Fontaine, M. Vanschilfgaarde, M. Sluiter, and M. Methfessel, Phys. Rev. B, 46, 5055, 1992.
[25] M. Asta, R. McCormack, and D. de Fontaine, Phys. Rev. B, 48, 748, 1993.
[26] M.H.F. Sluiter, Y. Watanabe, D. de Fontaine, and Y. Kazazoe, Phys. Rev. B, 53, 6136, 1996.
[27] P.D. Tepesch, et al., J. Am. Cer. Soc., 79, 2033, 1996.
[28] V. Ozolins, C. Wolverton, and A. Zunger, Phys. Rev. B, 57, 6427, 1998.
[29] A. Van der Ven, M.K. Aydinol, G. Ceder, G. Kresse, and J. Hafner, Phys. Rev. B, 58, 2975, 1998.
[30] G.D. Garbulsky and G. Ceder, Phys. Rev. B, 51, 67, 1995.
[31] G. Mills, H. Jonsson, and G.K. Schenter, Surf. Sci., 324, 305, 1995.
[32] A. van de Walle and G. Ceder, J. Phase Equilib., 23, 348, 2002.
[33] D.B. Laks, L.G. Ferreira, S. Froyen, and A. Zunger, Phys. Rev. B, 46, 12587, 1992.
[34] C. Wolverton, Philos. Mag. Lett., 79, 683, 1999.
[35] A. van de Walle and G. Ceder, Rev. Mod. Phys., 74, 11, 2002.
[36] R. LeSar, R. Najafabadi, and D.J. Srolovitz, Phys. Rev. Lett., 63, 624, 1989.
[37] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, J. Comput. Phys., 17, 10, 1975.
[38] F.M. Bulnes, V.D. Pereyra, and J.L. Riccardo, Phys. Rev. E, 58, 86, 1998.
[39] J.N. Reimers and J.R. Dahn, J. Electrochem. Soc., 139, 2091, 1992.
[40] Y. Shao-Horn, S. Levasseur, F. Weill, and C. Delmas, J. Electrochem. Soc., 150, A366, 2003.
[41] A. Van der Ven and G. Ceder, Phys. Rev. B, 2005 (in press).
[42] A. Van der Ven and G. Ceder, Phys. Rev. Lett., 2005 (in press).
1.18 DATA MINING IN MATERIALS DEVELOPMENT
Dane Morgan and Gerbrand Ceder
Massachusetts Institute of Technology, Cambridge MA, USA
1. Introduction
Data Mining (DM) has become a powerful tool in a wide range of areas, from e-commerce, to finance, to bioinformatics, and increasingly, in materials science [1, 2]. Miners think about problems with a somewhat different focus than traditional scientists, and DM techniques offer the possibility of making quantitative predictions in many areas where traditional approaches have had limited success. Scientists generally try to make predictions through constitutive relations, derived mathematically from basic laws of physics, such as the diffusion equation or the ideal gas law. However, in many areas, including materials development, the problems are so complex that constitutive relations either cannot be derived, or are too approximate or intractable for practical quantitative use. The philosophy of a DM approach is to assume that useful constitutive relations exist, and to attempt to derive them primarily from data, rather than from basic laws of physics. As an example, consider what will likely stand forever as the greatest application of DM in the hard sciences, the periodic table. In 1869 Mendeleev organized the elements based on their properties, without any guiding theory, into the first modern periodic table [3]. With the advent of quantum theory it became possible to predict the structure of the periodic table and DM was no longer strictly necessary, but the results had already been known and used for many years. Even today, the easy organization of data made possible by the classifications in the periodic table makes it an everyday tool for research scientists. Mendeleev established a simple ordering based on a relatively small amount of data, and so could do it on paper. However, today's data sets can be many orders of magnitude larger, and an impressive array of computational algorithms has been developed to automate the task of identifying relationships within data.
S. Yip (ed.), Handbook of Materials Modeling, 395–421. © 2005 Springer. Printed in the Netherlands.
DM is becoming an increasingly valuable tool in the general area of materials development, and there are good reasons why this area is particularly fruitful for DM applications. There is an enormous range of possible new materials, and it is often difficult to physically model the relationships between constituents, processing, and final properties. For this reason, materials are primarily still developed by what one might call informed trial-and-error, where researchers are guided by experience and heuristic rules to a somewhat restricted space of constituents and processing conditions, but then try as many combinations as possible to find materials with desired properties. This is essentially human DM, where one's brain, rather than the computer, is being used to find correlations, make predictions, and design optimal strategies. Transferring DM tasks from human to computer offers the potential to enhance accuracy, handle more data, and allow wider dissemination of accrued knowledge. Other key drivers for growing DM use in materials development are ease of access to large databases of materials properties, new data being generated in large quantities by high-throughput experiments and quantitative computational models, and improved algorithms, computer speed, and software packages leading to more effective and easy-to-use DM methods. Note that DM is also used in other areas of materials science besides materials development, e.g., design and manufacturing [4, 5], but this work will not be discussed here. The interdisciplinary nature of DM creates a special challenge, since a typical materials scientist's education does not provide an introduction to DM techniques, and the computer scientists and statisticians usually involved in developing DM methods are equally unlikely to be versed in materials science. The goal of this paper is to help foster communication between the disciplines and show examples of how they can be joined productively.
We introduce DM concepts in a fairly general framework, discuss a few of the more common methods, and describe how DM is being used to tackle some materials development problems, including predicting physicochemical properties of compounds, modeling electrical and mechanical properties, developing more effective catalysts, and predicting crystal structure. The breadth of methods and applications makes a comprehensive discussion impossible, but hopefully this brief introduction will be enough to allow the interested reader to follow up on specific areas of interest.
2. Key Methods of Data Mining
Data Mining (DM) is a vast and rapidly changing topic, with many different techniques appearing in many different fields. Broad reviews of the issues, methods, and applications are given in Refs. [1, 2] and somewhat less comprehensively but more in depth in Refs. [6, 7]. There is some disagreement about exactly what constitutes DM, as opposed to, e.g., knowledge discovery or
statistical analysis. We will not worry much about such distinctions, and give DM the rather all encompassing definition of using your data to obtain information. This essentially defines every discovery task as some kind of DM, but there is really a continuum. The more data one has, and the less physical modeling one includes, then the more time one will spend on data management, models, and investigation, and the more DM the task will be. If one has eight data points of force and acceleration, and one performs a linear regression to fit mass, it is silly to consider it DM. There is very little time spent on the data, and one is essentially just fitting an unknown parameter in the known physical law F = ma. However, if one is trying to predict what song can be a commercial hit based on a database of song characteristics and sales data, then the primacy of data, and the absence of any guiding theory, make it clearly a DM problem [8]. DM in materials development generally focuses on prediction. Relationships are established between desired dependent properties (e.g., melting temperature or catalytic activity) and independent properties that are easily controlled and measured (e.g., precursor concentrations or annealing temperatures). Once such a relationship is established, dependent properties can be quickly predicted from independent ones, without having to perform costly and time consuming experiments. It is then possible to optimize over a large space of possible independent properties to obtain the desired dependent property. In general, we will define X as the independent properties or variables, Y as the dependent properties or variables, F as the derived relationship between X and Y , and YPred as the predicted values of Y based on F and X . The goal of a DM effort is usually to determine F such that YPred represents Y as effectively as possible. 
There are several key areas that need to be considered in a DM application such as the one described above: data management and preparation, prediction methods, assessment, optimization, and software.
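The notation X, Y, F and YPred can be made concrete with the force and acceleration example mentioned above. The sketch below generates eight synthetic data points from a hypothetical mass and noise level, which are assumptions for illustration, and fits the single unknown parameter of F = ma by least squares through the origin.

```python
import numpy as np

rng = np.random.default_rng(1)

# X: independent variable (acceleration, m/s^2); eight data points.
acceleration = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])

# Y: dependent variable (force, N), generated from a hypothetical 2 kg mass
# with a small amount of synthetic measurement noise.
true_mass = 2.0
force = true_mass * acceleration + rng.normal(0.0, 0.05, size=acceleration.size)

# F: the fitted relationship Y = m X; least squares through the origin
# has the closed form m = sum(X Y) / sum(X X).
fitted_mass = (acceleration @ force) / (acceleration @ acceleration)

# YPred: predictions of the fitted relationship.
force_pred = fitted_mass * acceleration
rms_error = np.sqrt(np.mean((force_pred - force) ** 2))
print(fitted_mass, rms_error)
```

Here the functional form is fixed by a known physical law and only one parameter is fitted. In a DM setting the form of F itself has to be inferred from the data.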
2.1. Data Preparation and Management
Data preparation and management will not be discussed in detail since the issues are very dependent on the specific data being used. However, the tasks associated with cleaning and managing the data can often take up the bulk of a DM project, and should not be underestimated. Data must be stored so that it can be accessed efficiently, interfaced with equipment, updated, etc. Solutions can range from simple flat files to sophisticated database software. Issues often exist with the type and quality of the data, and it is frequently necessary to make significant transformations to bring the data into a universally comparable format, and to regroup data into appropriate new variables. There is sometimes erroneous or just missing data, which may need to be dealt with
in some manner before or during the DM process. Finally, data must be adequately comprehensive to be amenable to DM. It may be necessary to obtain further data in key areas, perhaps guided by the DM results in an iterative procedure. These issues are described in many data mining books, e.g., Ref. [7].
2.2. Prediction Methods
Prediction methods form the heart of DM tools relevant for materials development. Although there are many DM approaches that can be used for prediction, here we focus on only three of the most popular: linear regression, neural networks, and classification methods.
Linear regression is often one of the first approaches to try in a DM project, unless one has reasons to expect nonlinear behavior. It is assumed that the relationship F is a linear function, and the unknown parameters are determined by multivariate linear regression to minimize the squared error between YPred and Y (these methods are discussed in many textbooks, e.g., Refs. [9, 10]). Linear regression is generally performed by matrix manipulations and is very robust and rapid. There are many variations on strict regression, e.g., adding weights or transforming variables with logarithms.
Some of the most useful regression tools are those for reducing the number of independent variables (X), sometimes called dimensional reduction. It is frequently the case that there are many possible independent variables, but not all of them will be truly independent or important. Furthermore, the original data categories may not be optimal, and linear combinations of the variables, called latent variables, might be more effective. For example, alloy properties affected by strain will depend on the differences in atomic sizes, rather than the size of each constituent element separately. It is often difficult to have enough data to properly fit coefficients for a large number of variables: for example, uniformly gridding a space of n variables with m points for each variable requires m^n data points, which rapidly becomes unmanageable. This is sometimes called the “curse of dimensionality” and is a much more significant problem in nonlinear fitting methods, such as the neural networks described below. Having too many variables that are not well constrained can lead to overfitting and poor predictive ability of the function F.
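As an illustration of the regression step described above, the following is a minimal sketch in Python with NumPy, not a prescribed implementation. The function names (fit_linear, predict_linear) and the synthetic data are assumptions introduced only for this example. It fits a linear F by least squares, evaluates YPred, and tabulates how quickly a uniform grid grows with the number of variables.

```python
import numpy as np

def fit_linear(X, Y):
    """Least-squares fit of Y ~ b0 + X @ b; returns [b0, b1, ..., bn]."""
    A = np.column_stack([np.ones(len(X)), X])      # constant column for the intercept
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)   # minimizes the squared error
    return coef

def predict_linear(coef, X):
    """Evaluate the fitted linear relationship F on new inputs X."""
    return coef[0] + X @ coef[1:]

# Synthetic data (an assumption for illustration): 50 samples, 3 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_coef = np.array([1.0, 2.0, -0.5, 0.0])
Y = true_coef[0] + X @ true_coef[1:] + 0.01 * rng.normal(size=50)

coef = fit_linear(X, Y)
Y_pred = predict_linear(coef, X)
rms = np.sqrt(np.mean((Y_pred - Y) ** 2))

# Curse of dimensionality: m = 10 grid points per variable needs m**n samples.
grid_points = {n: 10 ** n for n in (1, 2, 5, 10)}
```

With ten grid points per variable, five variables already require 100,000 samples, which is why reducing the number of variables matters before fitting.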
Ideally, the DM method will help the user define and include the most effective latent variables for prediction. One common method for defining latent variables is Principal Component Analysis (PCA), which yields latent variables that are orthogonal and ordered by decreasing variance [11]. Assuming that variance correlates well with the importance of the latent variable to the dependent variables, then the principal components are ordered in a sensible fashion and can be truncated at some point. Orthogonality assures that latent variables are independent and
will represent different variations. A limitation of this approach is that no information about Y is used in picking the variables. Some improvement can often be obtained by using Partial Least Squares (PLS) regression [9, 12–14], which is similar in spirit to PCA, but constructs orthogonal latent variables that maximize the covariance between X and Y . PLS latent variables capture a lot of the variation of X , but are also well correlated with Y , and so are likely to provide effective predictions. However one defines the latent variables, it is important to test their effectiveness, and there are a number of methods to identify statistically significant variables in a regression (e.g., ANOVA) [7, 9]. Another popular method is to make use of cross validation, which is discussed below, to exclude variables that are not predictive. Neural Network (NN) methods [15] are more general than linear approaches and have become a popular prediction tool for many areas. NNs loosely model the functioning of the brain, and consist of a network of neurons that can take inputs, sum them with weights, operate on the sum with a transfer function, and then emit an output. The NN is generally viewed as having layers, the first takes input from outside the NN, and the last outputs the final results to the user, while layers in between are called hidden and communicate only with other layers. For the problems considered here, the NN plays the role of the relationship F between X and Y . The weights of the neurons are unknown and must be determined by training based on known input X and output Y , where the goal is generally to minimize |YPred − Y |. The training process is analogous to a linear regression, except that the unknown weights are much more difficult to determine and many different training methods exist. Similar problems occur with excessive numbers of independent variables, and some dimensional reduction, e.g., by PCA, may be necessary. 
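The layered structure just described can be made concrete with a small forward-pass sketch. This is an illustrative assumption rather than any particular NN package: the layer sizes, the random weights, and the choice of a linear output neuron are hypothetical, and no training is performed here.

```python
import numpy as np

def sigmoid(z):
    """A common transfer function; maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """One hidden layer: weighted sums, transfer function, then a linear output."""
    hidden = sigmoid(X @ W1 + b1)   # each hidden neuron sums its weighted inputs
    return hidden @ W2 + b2         # output neuron emits the prediction YPred

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hidden = 4, 3
W1 = rng.normal(size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, 1))
b2 = np.zeros(1)

X = rng.normal(size=(5, n_in))      # five samples of four independent variables
Y_pred = forward(X, W1, b1, W2, b2)
```

Training would adjust W1, b1, W2, and b2 to reduce the difference between Y_pred and known Y values; that step is deliberately left out of this sketch.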
The strength of NNs is that they are very flexible, and with enough training can in principle represent any function, making them more powerful than linear methods. However, this increased power comes at a price of increased complexity. NNs have many choices that must be made correctly for optimal performance, including the number of layers, the number of neurons in each layer, the type of transfer function for each neuron, and the method of training the neural network. In general, training a NN is orders of magnitude slower than a linear regression, and convergence to the optimal parameters is by no means assured. NNs also have the drawback that it is less obvious how the X and Y variables are related than in a linear regression, making intuitive understanding more challenging. The problems of inadequate training and overfitting data are quite serious with NN’s. Some NN’s make use of “Bayesian regularization” [16–19], which includes uncertainty in the NN weights and provides some protection against overfitting. Another common solution is combining predictions from a number of differently trained NN’s (prediction by “committee”) (this approach is used
in, e.g., Refs. [20, 21]). Another interesting approach, which can only be used in cases where one is faced with many similar problems, is to retrain NNs on related problems, making use of the information already gained in their previous training (this is done in, e.g., Ref. [22]).
Classification maps data into predefined classes rather than continuous variables, where the classes are defined based on the dependent properties Y. For example, if Y is conductivity, one could classify materials into metals and insulators, and try to predict to which class a material should belong based on X, rather than performing a full regression of Y on X to predict the continuous conductivity values. Another example is predicting crystal structure, where each different structure type can be considered a class, and the goal is to be able to predict class (assign a structure type) based on the independent data X. In classification DM the relation F maps X onto categories YPred, rather than continuous values. There are a range of different classification methods, as described in most standard textbooks (we found Ref. [6] particularly lucid on these issues). The only classification scheme that will be discussed here is the K-nearest neighbor method, which is one of the simplest. This approach requires that one can define a distance between any two samples, d_ij = distance between X_i and X_j. Classification for a new X_i is performed by finding its K nearest neighbors in the existing data set, and then assigning X_i to the class that contains the most items from the K neighbors. The spirit of this approach underlies structure maps for crystal structure prediction, discussed in more detail below. Other classification approaches use Bayesian probabilistic methods, decision trees, NNs, etc., but will not be described here [1, 6, 7]. There are some issues with defining a metric of success for classifications.
Since YPred and Y represent class occupancies, there is not necessarily any way to measure a distance between them. One way to view the results is what is rather wonderfully called a confusion matrix, where matrix element m_ij gives the number of times a sample belonging in class C_i was assigned to C_j. In order to define a metric for success it is important to realize that when assigning samples to a class there are two parameters that characterize the accuracy: the fraction of samples correctly placed into the class (true positives), and the fraction of samples incorrectly placed into the class (false positives). These can vary independently and their importance can be very dependent on the problem (for example, in classifying blood as safe, it is important to get as many true positives as possible, but absolutely essential not to allow any false positives, since that would allow unsafe blood into the blood supply). Therefore, the metric for success in classification must be chosen with some care. Note that clustering, which is similar to classification, is differentiated by the fact that clustering groups data without the data clusters being predefined. This is sometimes called “unsupervised” learning and will not be discussed further here, but can be found in most DM references.
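The K-nearest neighbor rule and the confusion matrix can be sketched together in a few lines. The example below is a minimal illustration under stated assumptions: the Euclidean distance, the two-dimensional descriptor values, and the metal/insulator labels are all made up for the purpose of the sketch.

```python
import numpy as np
from collections import Counter

def knn_classify(x_new, X_train, labels, k=3):
    """Assign x_new to the class holding most of its k nearest neighbors."""
    d = np.linalg.norm(X_train - x_new, axis=1)   # distance to every known sample
    nearest = np.argsort(d)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def confusion_matrix(true_labels, assigned_labels, classes):
    """m[i, j] counts samples belonging to classes[i] that were assigned to classes[j]."""
    index = {c: i for i, c in enumerate(classes)}
    m = np.zeros((len(classes), len(classes)), dtype=int)
    for t, a in zip(true_labels, assigned_labels):
        m[index[t], index[a]] += 1
    return m

# Hypothetical two-descriptor data set with two predefined classes.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
labels = ["insulator"] * 3 + ["metal"] * 3

X_test = np.array([[0.05, 0.10], [1.00, 0.95], [0.45, 0.45]])
true_test = ["insulator", "metal", "metal"]

assigned = [knn_classify(x, X_train, labels, k=3) for x in X_test]
m = confusion_matrix(true_test, assigned, ["insulator", "metal"])
```

In this toy case the third test sample lies closer to the insulator cluster and is misassigned, so the off-diagonal element m[1, 0] is 1. Whether that kind of error is tolerable depends on the problem, as the blood-supply example illustrates.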
2.3. Assessment
Cross-validation (CV) [23, 24] is a technique to assess the predictive ability of a fit and reduce the danger of overfitting. In a CV test with N data points, N − n data points are fit and used to predict the n points excluded from the fit. The prediction error for the excluded points is the CV score. This process can be averaged over many possible subsets of the data, which is called “leave n out CV”. The key concept behind CV is that the CV score is based on data not used in the fit. For this reason, the CV score will decrease as the model becomes more predictive, but will start to increase if the model under- or overfits the data. This is in contrast to errors in data that is included in the fit, which will always decrease with more fitting degrees of freedom. For example, consider a linear regression on a set of latent variables. The root mean square (RMS) error in the fit data will be a monotonically decreasing function of the number of latent variables used in the regression. However, the CV score will generally decrease for the initial principal components, and then start to increase again as the number of principal components gets large. The initial decrease in the CV score occurs because statistically meaningful variables are being added and the regression model is becoming more accurate. The increasing CV score signals that too many variables are being used, the regression is fitting noise, and that the model is overfit. By minimizing the CV score it is therefore possible to select an optimal set of latent variables for prediction. This idea is illustrated schematically in Fig. 1.
Figure 1. A schematic comparison of the error calculated with data included in the fit (normal RMS fitting error – solid line) and excluded from the fit (CV score – dashed line). (Axes: error versus number of latent variables; curves labeled RMS and CV, with the optimal number of latent variables marked.)
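The behavior sketched in Fig. 1 can be reproduced numerically. The following is a hedged sketch, not the procedure of any cited reference: it regresses Y on the first k principal components (obtained here from a singular value decomposition), and compares the RMS error on the fit data with a leave-n-out CV score. The synthetic data, the ten-fold split, and all function names are assumptions made for illustration.

```python
import numpy as np

def pca_components(X):
    """Principal directions as rows, ordered by decreasing variance."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt

def pcr_fit_predict(X_fit, Y_fit, X_new, k):
    """Regress Y on the first k principal components of X_fit; predict for X_new."""
    mu = X_fit.mean(axis=0)
    V = pca_components(X_fit)[:k].T                       # shape (n_vars, k)
    A = np.column_stack([np.ones(len(X_fit)), (X_fit - mu) @ V])
    coef, *_ = np.linalg.lstsq(A, Y_fit, rcond=None)
    return coef[0] + ((X_new - mu) @ V) @ coef[1:]

def rms(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def cv_score(X, Y, k, n_folds=10):
    """Leave-n-out CV with n = N / n_folds; every point is predicted exactly once."""
    folds = np.array_split(np.arange(len(X)), n_folds)
    sq_errors = []
    for held_out in folds:
        keep = np.setdiff1d(np.arange(len(X)), held_out)  # points used in the fit
        pred = pcr_fit_predict(X[keep], Y[keep], X[held_out], k)
        sq_errors.append((pred - Y[held_out]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_errors))))

# Synthetic data (an assumption): only three of eight variables carry signal.
rng = np.random.default_rng(1)
N, n_vars = 60, 8
X = rng.normal(size=(N, n_vars)) * np.array([5, 4, 3, 1, 1, 1, 1, 1])
Y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + 0.3 * rng.normal(size=N)

fit_rms = [rms(pcr_fit_predict(X, Y, X, k), Y) for k in range(1, n_vars + 1)]
cv = [cv_score(X, Y, k) for k in range(1, n_vars + 1)]
best_k = 1 + int(np.argmin(cv))   # number of latent variables minimizing the CV score
```

The list fit_rms can never increase with k, because each added component only enlarges the space available to the fit, whereas cv is free to rise once the extra components start fitting noise. Note that the principal components are recomputed inside each CV fit so that the held-out points play no part in building the model.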
Test data is another important assessment tool, and simply refers to a set of data that is excluded from working data at the beginning of the project and then used to validate the model at the end of model building. To some extent, the CV method does this already, but in the common case where the model is altered to optimize the CV score, it will overestimate the true predictive accuracy of the model [23]. It is only by testing on an entirely new data set, which the model has not previously encountered, that a reliable estimate of the predictive capacity of the model can be established. Sometimes there is not enough data to create an effective test data set, but it is certainly advisable to do so if at all possible.
2.3.1. Optimization
Optimization methods [25, 26] are not usually considered DM, but they are an essential tool of many DM projects. For example, once a predictive model has been established, one frequently wants to optimize the inputs to give a desired output. This usually cannot be done with local optimization schemes (e.g., conjugate gradient methods) due to a rough optimization surface with many local minima. It is therefore frequently necessary to use an optimization method capable of finding at least close to the global minimum in a landscape with many local minima. A detailed discussion of these methods is beyond the scope of this article, but common approaches include simulated annealing Monte Carlo, genetic algorithms, and branch and bound strategies. Genetic algorithms seem to be the most popular in the DM applications discussed here, and work by “evolving” toward an optimal sample population through operations such as mixing, changing, and removing samples.
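The three genetic-algorithm operations named above (mixing, changing, and removing samples) can be sketched as follows. This is one simple variant among many, written under stated assumptions: the Rastrigin test function stands in for a rough model surface, and the population size, mutation scale, and uniform crossover are arbitrary illustrative choices.

```python
import numpy as np

def rastrigin(x):
    """Test landscape with many local minima; the global minimum is 0 at x = 0."""
    return 10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))

def genetic_minimize(f, n_dim=2, pop_size=40, n_gen=100, bounds=(-5.0, 5.0), seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(*bounds, size=(pop_size, n_dim))
    history = []                                    # best value at each generation
    for _ in range(n_gen):
        fitness = np.array([f(p) for p in pop])
        order = np.argsort(fitness)
        history.append(float(fitness[order[0]]))
        parents = pop[order[: pop_size // 2]]       # removing: drop the worse half
        # Mixing: each child takes every coordinate from one of two random parents.
        i, j = rng.integers(len(parents), size=(2, pop_size - len(parents)))
        mask = rng.random((len(i), n_dim)) < 0.5
        children = np.where(mask, parents[i], parents[j])
        # Changing: small Gaussian mutation, clipped to the search bounds.
        children = np.clip(children + rng.normal(scale=0.3, size=children.shape), *bounds)
        pop = np.vstack([parents, children])        # parents survive unchanged
    fitness = np.array([f(p) for p in pop])
    return pop[np.argmin(fitness)], float(fitness.min()), history

best_x, best_f, history = genetic_minimize(rastrigin)
```

Because the surviving parents are carried over unchanged, the best value in history cannot get worse from one generation to the next. There is no guarantee that the global minimum is reached, which is the usual trade-off for such methods.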
2.3.2. Software
Many DM algorithms are fairly simple, and can be programmed relatively quickly. Often the underlying numerical operations involve no more than standard matrix operations, and access to widely available basic linear algebra subroutines (BLAS) is adequate. However, DM is generally very explorative, and it is common to try many different approaches. Coding everything from scratch becomes prohibitive, and will lock the user into the few things they can readily implement. Fortunately, there are a large number of both free and commercial DM tools available for users. Some tools, like the Neural Net Toolbox in Matlab, are implemented in languages likely to be familiar to the materials scientist, and are readily accessible. An impressive list of possible tools is given in Appendix A of Refs. [6, 7]. It should also be remembered that for the academic user many companies will have special rates, so it is worth exploring commercial software.
3. Applications
There are far too many studies using DM methods to offer a comprehensive review. Therefore, we focus on a few key areas where DM techniques are highlighted and seem to be playing an increasingly important role.
3.1. Quantitative Structure–Property Relationships (QSPR)
Quantitative Structure–Property Relationships (QSPR), and the closely related techniques of Quantitative Structure–Activity Relationships (QSAR), are based on the fundamental tenet that many molecular properties, from boiling point to biological activity, can be derived from basic descriptors of molecular structure. For some examples, see the general review of using NNs to predict physicochemical properties in Ref. [27]. QSPR/QSAR are generally considered methods of chemistry, but are closely related to the activities of a DM material scientist. QSPR/QSAR is a large field and here we consider only one particularly illustrative example, the work of Chalk et al., predicting boiling points for molecules [20].
The boiling point for any given compound is not a particularly hard measurement, but the ability to quickly predict boiling points for many compounds, particularly ones that only exist as computer models, can be useful for screening in, e.g., drug design. Computing the boiling point of a compound directly from physical principles requires a very accurate model of the energetics and significant computation. Therefore, researchers have generally turned to DM applications in this area. Chalk et al. have a database of 6629 molecular structures and boiling points. The dependent variables Y are taken as the boiling points. A set of descriptors, X0, is developed based on structural and electronic characteristics (derived from semiempirical atomistic models). A technique called formal inference-based recursive modeling (FIRM) is then used to assess the relevance of each variable (this technique will not be described here but allows the influence of a variable to be tested). A set of 18 descriptors is settled on as likely to be significant and they are used for the independent variables X. A test data set of 629 molecules that span the whole range of boiling temperatures is removed.
The remaining 6000 molecules are then used to find the optimal model function F to map X to Y . F is represented by a NN, and after some initial testing one is chosen with 18 first layer nodes, 10 nodes in the hidden second layer, and a single node in the third layer. The transfer functions are all sigmoids (sig(x) = 1/(1 + exp(−x))) and trained with a back-propagation algorithm. In order to control for overfitting the data is broken up into 10 disjoint subsets and a “leave
600 out” cross validation is performed. This trains 10 distinct NNs on 5400 molecules each. The NN training is stopped when the CV score reaches a minimum. The prediction function F is taken to be a committee, and uses the mean result of the values predicted by all 10 NNs. The final test for F is done by comparing the predicted and true boiling points for the 629-molecule test set, giving errors with a standard deviation of only 19 K (the predicted vs. true boiling points for the test set are shown in Fig. 2). The predictive capacity is good enough that for many of the largest prediction errors it was possible to go back to the experimental data and show that the input data itself was in error. One could now imagine using a genetic algorithm and the predicting function F to search the space of molecular structures to find, e.g., a molecule with a very high boiling point, although no such work was performed by the authors. It is worth noting that computation plays an important role in providing the basic input data in the study. All of the structural and electrostatic descriptors were generated by semi-empirical atomistic models. Using computational methods can be an efficient way to generate large amounts of descriptor information, greatly reducing the amount of experimental work required.
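The bookkeeping of the ten-fold split and the committee prediction can be sketched as below. This is not the code of Chalk et al.: an ordinary least-squares fit is used as a loudly labeled stand-in for training each NN, the early stopping on the held-out fold is omitted, and the descriptor and target arrays are random placeholders with the same shapes as in the study (6000 samples, 18 descriptors).

```python
import numpy as np

def fit_member(X, Y):
    """Stand-in for training one NN: least squares with an intercept (an assumption)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef

def member_predict(coef, X):
    return coef[0] + X @ coef[1:]

def train_committee(X, Y, n_folds=10, seed=0):
    """Split the data into disjoint folds; fit one member with each fold left out."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    members, train_sizes = [], []
    for held_out in folds:
        keep = np.setdiff1d(np.arange(len(X)), held_out)
        members.append(fit_member(X[keep], Y[keep]))
        train_sizes.append(len(keep))
    return members, train_sizes

def committee_predict(members, X):
    """Committee prediction: the mean of the individual members' predictions."""
    return np.mean([member_predict(c, X) for c in members], axis=0)

# Random placeholder data with the shapes quoted in the text (not real descriptors).
rng = np.random.default_rng(0)
X = rng.normal(size=(6000, 18))
Y = X @ rng.normal(size=18) + 0.1 * rng.normal(size=6000)

members, train_sizes = train_committee(X, Y)
Y_pred = committee_predict(members, X[:5])
```

Each of the ten members sees 5400 of the 6000 samples, matching the “leave 600 out” arrangement, and averaging over members damps the dependence of the prediction on any single training run.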
Figure 2. Predicted vs. true boiling points for 629 compounds. Prediction is done by neural networks fit to 6000 boiling points that did not include the 629 shown here. (After [20], reproduced with permission).
3.2. Processing–Structure–Property Relationships
Processing–Structure–Property (PSP) relationships refer to the challenging materials problem of connecting the processing parameters of a material to its structure and properties. Processing conditions might include such things as initial composition of reactants and annealing schedule, while structural aspects might be crystal structure or grain size, and final properties are such characteristics as yield stress and corrosion resistance. PSP relationships are very important because they allow processing parameters to be adjusted to create optimal materials. PSP relationships tend to involve many different phenomena, with widely varying length and time scales, making direct modeling extremely challenging. However, analogous to QSPR’s reliance on the fact that properties must be a function of the structure of the molecules involved, in PSP relationships we know that properties must follow from structure in some manner, and that structure is somehow determined by processing. The assurance that PSP relationships exist, combined with the challenge of directly modeling them, makes this a good area for DM applications.
One of the most active groups in this area has been Bhadeshia and co-workers. Bhadeshia’s review in 1999 [21] covers a lot of the materials work that had been done up to that time in neural network (NN) modeling, and he and co-workers have continued to apply NN techniques in PSP applications to such areas as creep modeling [28, 29], mechanical weld properties [30, 31], and phase fractions in steel [32]. In general, these studies follow the DM framework used in QSPR above. Many of the data and codes used by Bhadeshia et al., as well as many others, can be found online as part of the Materials Algorithm Project [33]. Malinov and co-workers have also done extensive work with DM tools in PSP relationships, and have developed a code suite, complete with graphical user interface, to make use of their models [34].
Their work has focused primarily on Ti alloys [35–37] and nitrocarburized steels [38, 39]. The NN software they developed uses a cross validation (CV)-like strategy to assess the effectiveness of different NN architectures, training methods, and trainings, so that the best network can be obtained by optimization, rather than intuitive choice. It is a general trend in DM applications to try to automatically optimize as many choices as possible, since this gives the best results with the least user intervention. Many apparent DM choices, such as which latent variables or NN architectures to use, can in fact be determined by performing a large number of tests. Implementing this type of automation is generally limited by the user’s willingness to code the required tests, the time it takes to perform the optimization, and the amount of data required for sufficient testing. Also, one should ideally have a test set that is entirely excluded from all the optimization processes for final testing.
A particularly interesting application by Malinov et al. is the prediction of time–temperature-transformation (TTT) diagrams for Ti alloys [34, 35, 37]. TTT diagrams give the time to reach a specified fraction of phase transformation at each temperature, and for a given phase fraction they are a curve in time–temperature space. They can be modeled to some extent directly with Johnson–Mehl–Avrami theory, but Malinov et al. chose to use a NN model so as to be able to predict for many systems and composition variations. The details discussed here are all from Ref. [35]. The data set was 189 TTT diagrams for Ti alloys, and the independent variables were taken to be the compositions of the 8 most common alloying elements and oxygen. Some additional elements that were not prevalent enough in the data set for accurate treatment had to be removed or mapped onto a Mo equivalent. It should be noted that the authors are careful to identify the ranges of the concentrations of alloying elements present in the test set. This is very important, since given the limited data, it is not clear that this NN would give accurate predictions outside the concentration ranges used in training. The dependent variables represented more of a problem, since TTT diagrams are curves, not single values. Malinov et al. solved this problem by representing the TTT diagram as a 23-tuple. Two entries gave the position of the TTT graph nose, its time and temperature. Ten entries gave the upper portion of the curve, where each entry was the fractional change in time for a fixed change in temperature, and ten more the lower portion. Finally, one entry was reserved for the martensite start temperature. These considerations, for both the independent and dependent variables, demonstrate some of the data processing that can be required for successful DM. The final predictions are quite accurate for test sets, and allowed exploration of the dependence of TTT curves on alloy composition. 
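The 23-entry representation of a TTT curve described above (2 + 10 + 10 + 1 entries) can be written down as a small data structure. The sketch below is only an interpretation: the class and field names are hypothetical, the numerical values are made up, and the rule used to rebuild times from fractional changes, t_k = t_(k-1) (1 + f_k), is an assumption about how “fractional change in time” might be defined rather than the definition used in Ref. [35].

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TTTEncoding:
    """Hypothetical container for one TTT curve as a fixed-length tuple."""
    nose_time: float            # time at the nose of the curve
    nose_temperature: float     # temperature at the nose of the curve
    upper: List[float]          # 10 fractional changes in time, above the nose
    lower: List[float]          # 10 fractional changes in time, below the nose
    martensite_start: float     # martensite start temperature

    def to_vector(self) -> List[float]:
        """Flatten to the 23 dependent variables used as NN outputs."""
        if len(self.upper) != 10 or len(self.lower) != 10:
            raise ValueError("each branch needs exactly 10 entries")
        return [self.nose_time, self.nose_temperature,
                *self.upper, *self.lower, self.martensite_start]

def branch_times(nose_time: float, fractional_changes: List[float]) -> List[float]:
    """Rebuild times along one branch, assuming t_k = t_(k-1) * (1 + f_k)."""
    times, t = [], nose_time
    for f in fractional_changes:
        t *= 1.0 + f
        times.append(t)
    return times

# Made-up numbers, purely to exercise the encoding.
enc = TTTEncoding(10.0, 900.0, [0.5] * 10, [1.0] * 10, 800.0)
vector = enc.to_vector()
```

Encoding the curve relative to its nose gives every alloy an output vector of the same length, which is what allows a single network to be trained across all 189 diagrams.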
A number of TTT diagram predictions for (at that time) unmeasured materials were given, and some of these have since been measured, demonstrating reasonably good predictive ability for the NN model (see Fig. 3) [37]. A set of studies using DM techniques to model Al alloys recently came out of Southampton University [40–44]. The work by Starink et al. [44] summarizes studies on strength, electrical conductivity, and toughness. These studies are particularly interesting since they directly compare different DM methods as well as more physically based modeling, based on known constitutive relations. Starink et al. make use of linear regression and Bayesian NN models like those discussed above, but also apply neurofuzzy methods and support vector machines. We will not discuss these further except to point out that the latter is a relatively new development that seems to have some improved ability to give accurate predictions over the more common NN methods, and will likely grow in importance [45–47]. For the cases of direct comparison, Starink et al. find that physically based modeling performs slightly better. However, these examples involve very small data sets (around 30 samples),
Figure 3. Comparison of predicted and measured TTT diagrams for different Ti alloys. These predictions were made and published before the experimental measurements were taken. (After Ref. [37], reproduced with permission.)
so one expects there to be significant undertraining in DM methods. Also of interest is the over three-fold decrease in predictive error for conductivity when going from linear to nonlinear DM methods, demonstrating why nonlinear NN methods have become the dominant tool for many applications. Starink et al. make some use of the concept of hybrid physical and DM approaches. This is a very natural idea, but worth mentioning explicitly. The spirit of DM is often one of using as little physical knowledge as possible, and allowing the data to guide the results. However, by introducing a certain amount of physical knowledge, a DM effort can be greatly improved. As summarized by Starink et al., this can be done through initially choosing independent variables based on known physics, using functional forms that are physically motivated in the DM, and using DM to fit remaining errors after a physical model has been used.
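The last of the hybrid strategies listed above, using DM to fit the errors remaining after a physical model, can be illustrated with a short sketch. Everything specific here is an assumption made for the example: the proportional “physical” law, the second variable z, the quadratic regression on the residual, and the synthetic data.

```python
import numpy as np

def physical_model(x, a):
    """Hypothetical physically motivated form: response proportional to x."""
    return a * x

def rms(u, v):
    return float(np.sqrt(np.mean((u - v) ** 2)))

# Synthetic data (an assumption): a known law plus an unmodeled effect of z.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
z = rng.uniform(-2.0, 2.0, size=200)
y = 3.0 * x + 0.5 * z ** 2 + 0.1 * rng.normal(size=200)

# Step 1: fit the single parameter of the physical model by least squares.
a = float(x @ y / (x @ x))
y_physical = physical_model(x, a)

# Step 2: fit what the physical model leaves unexplained with a regression on z.
residual = y - y_physical
A = np.column_stack([np.ones_like(z), z, z ** 2])
coef, *_ = np.linalg.lstsq(A, residual, rcond=None)
y_hybrid = y_physical + A @ coef

rms_physical = rms(y_physical, y)
rms_hybrid = rms(y_hybrid, y)
```

On the fit data the hybrid error cannot exceed the purely physical one, since setting the correction to zero is always an allowed solution of the second fit. Whether the improvement carries over to new data should still be checked with cross-validation or a test set.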
3.3. Catalysis
A particularly exciting area of DM applications at present is in catalysis. A lot of recent activity in this field has been driven by the advent of high-throughput experiments, where the ability to rapidly create large data sets has created a new need for data mining concepts to interpret and guide experiment. Some reviews in this area can be found in Refs. [48–50].
Some authors have taken approaches similar to those used in QSPR/QSAR applications and the PSP modeling described above – finding a NN model to connect the properties of interest to tractable descriptors, and then exploring that model to understand dependencies or optimize properties [22, 50–56]. The input independent variables are generally the compositions of possible alloying materials in the catalyst, and the output is some measure of the catalytic activity. Note that it is quite possible to have multiple final nodes in the network to output multiple measures of interest, such as conversion of the reactants and percentages of different products [51, 52]. It is also possible to look at catalytic behavior for a fixed catalyst under different reactor conditions, where the reactor conditions become the independent variables [22]. Once a NN has been trained, the best catalyst can be found through optimization of the function defined by the NN. This is generally done with a genetic algorithm [51, 54, 56], but other methods have also been explored [55]. Baerns et al. have done influential work in using a genetic algorithm to design new catalysts, but have skipped the step of fitting a model altogether, directly running experiments on each new generation of catalysts suggested by the genetic algorithm [57–59]. For example, Baerns et al. studied oxidative dehydrogenation of propane to propene using metal oxide catalysts with up to eight metal constituents, and found a general trend toward better catalytic activity with each generation, as shown in Fig. 4. Although optimizing the direct experimental data limits the number of samples that can be examined (Baerns et al. generally look at only a few hundred) the results have been very encouraging, e.g., leading to an effective multicomponent catalyst for low-temperature oxidation of low-concentration propane [58]. Further success
Figure 4. The best (open bar) and mean (solid bar) yield of propene at each generation of catalysts created by genetic algorithm. (After [57], reproduced with permission.)
was obtained in studying oxidative dehydrogenation of propane to propene by following up on materials suggested by the combinatorial genetic algorithm search with further noncombinatorial “fundamental” studies [57]. Baerns et al.’s work demonstrates that the best results are sometimes obtained by combining DM and more traditional approaches. Further improvements in high-throughput methods will make direct iterative optimization of the experiments increasingly effective, but a fitted model will likely always be able to explore more samples and provide more rigorous optimization. The choice to use a fitted model is then a balance between the advantage of being able to optimize more accurately and the disadvantage of having a less accurate function to optimize. Umegaki et al. suggest that, in direct comparisons, a combined NN and genetic algorithm approach is more effective than direct optimization of experimental results, but this is a complex issue and will be problem dependent [56]. Despite many encouraging successes, DM in catalysis still faces a number of challenges. As pointed out by Hutchings and Scurrell [49] extending the independent variables to include more preparation and processing variables might significantly broaden the search for optimal materials. In addition, issues related to lifetime, stability, and other aspects of long-term performance are often difficult to predict and need to be addressed. Finally, Klanner et al. point out that there are different challenges for optimizing a library over a well known space of possible compositions and designing a discovery program for development in areas where there is essentially no precedent [50]. In the case of development of truly new materials, the problem of using a QSPR/QSAR approach in catalysis design is complicated because of the inherent difficulties of characterizing heterogeneous solids to build diverse initial libraries. 
Structure is a good metric for measuring diversity of molecular behavior, and therefore allows relatively easy assembly of diverse libraries for exploration. However, the very nonlinear behavior of solid catalysts, where activity is often dependent on such subtle details as surface defects, means that at this point there is no metric for measuring, a priori, the diversity of solid catalysts. Klanner et al. therefore suggest that development work will have to take place through building a large initial set of descriptors, based on synthesis data and properties of the constituent elements, and then using dimensional reduction to get a manageable number. Finally, no effort has been made here to make comparisons of DM to direct kinetic equation modeling in catalysis design. Some comments with regard to these methods, and how they can be integrated with DM approaches, are given in Ref. [60].
It should be noted that the above issue of assembling diverse libraries, along with using genetic algorithms for intelligent searching, can be viewed as parts of the general problem of optimized experimental design. This is not a new area, but has become increasingly important due to the advent of high-throughput methods. It also encompasses such well developed fields as
statistical Design of Experiments. This is a fruitful area for statistical and DM methods, and many of the relevant issues have already been mentioned, but we will not discuss it further here. The interested reader can consult the review by Harmon and references therein [48].
Another DM area that has been receiving increased attention due to high-throughput experiments is correlating the results of cheap and fast experimental measurements with properties of interest. This becomes particularly important when it is necessary to characterize large numbers of samples quickly, and careful measurement of the desired properties is not practical. For a discussion of this issue in high-throughput polymer research see Refs. [61, 62]; a number of rapid screening tools and detection schemes used in high-throughput catalysis development are described in Ref. [63].
3.4. Crystal Structure
The prediction of crystal structure is a classic materials problem that has been an area of ongoing research for many years. Now that modeling efforts have made computational materials design a real possibility in many areas, the problem of predicting crystal structure has become more practically pressing, since it is usually a prerequisite for any extensive materials modeling. Crystal structure prediction is an area well suited for DM efforts, since there is no generally reliable and tractable method to predict structure, and there is a lot of structural data collected in crystallographic databases (e.g., ICSD [64], Pauling files [65], CRYSTMET [66], ICDD [67]). Some of the most successful methods for crystal structure prediction are what are known as structure maps, reviewed at length in Refs. [68, 69]. Structure maps exist primarily for binary and ternary compounds, and the best known examples are probably the Pettifor maps [70]. To understand how Pettifor maps work, consider the map designed for AB binary alloys. Each possible element is assigned a number, called the Mendeleev number. Then each alloy AB can be plotted on Cartesian axes by assigning it the position (x, y), where x is the Mendeleev number for element A and y is the Mendeleev number for element B. At position (x, y) one places a symbol representing the structure type for alloy AB. When enough data are plotted, like symbols tend to cluster – in other words, alloys with the same structure type tend to be located near each other on the map. This can be clearly seen in the Pettifor map in Fig. 5. The probable structure type for a new alloy can simply be found by locating where the new alloy should reside in the map and examining the nearby structure types. Structure maps were not originally introduced as an example of DM, but can be understood within that framework. One can extend the idea of using Mendeleev number to a general “vector map,” which maps each alloy to a
Data mining in materials development
Figure 5. An AB binary alloy Pettifor map. Notice that like structure types show a clear tendency to cluster near one another. Provided by John Rodgers using the CRYSTMET database [66].
multicomponent vector. The vector components might be any set of descriptors for the alloy, such as Mendeleev numbers, melting temperatures, or differences in electronegativities. Once the alloys have been mapped to representative vectors they are amenable to different DM schemes. Since crystal structures are discrete categories, not continuous values, some sort of classification DM is going to be required. Structure maps work by defining a simple Euclidean metric on the alloy vectors and making the assumption that alloys with the same structure types will be close together. When a new alloy is encountered its crystal structure
is predicted by examining the neighborhood of the new alloy in the structure map. Structure types that appear frequently in a small neighborhood of the new alloy are good candidates for the alloy’s structure type. This is a geometric classification scheme, along the lines of K-nearest-neighbors described above. There is no unique way to define the vectors that create the structure map, and many different physical quantities, such as electronegativities and effective radii, have been proposed for constructing structure maps. Ref. [64] lists at least 53 different atomic parameters that could be used as descriptors to define a structure map. The most accurate Pettifor maps are built by mapping alloys to vectors using a specially devised chemical scale [71]. The chemical scale was motivated by many physical concerns, but is fundamentally an empirical way to map alloys to vectors, chosen to optimize the clustering of alloys with the same crystal structures. A number of new ideas are suggested by viewing crystal structure prediction from a DM framework. First, it is clear that many standard assessment techniques have only recently begun to be incorporated. It was not until about 20 years after the first Pettifor maps that an effort was made to formalize their clustering algorithm and assess their accuracy using cross validation techniques (the accuracy was found to be very good, in some cases giving correct predictions for non-unique structures 95% of the time) [72]. Also, the question of how to assess errors can be fruitfully thought of in terms of false positives (predicting a crystal structure that is wrong) and false negatives (failing to predict the crystal structure that is right).
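The neighborhood-voting idea and its cross-validated assessment can be sketched in a few lines of Python. This is only an illustrative sketch, not the algorithm of Ref. [72]: the function names, the choice of k, and the toy coordinates (which merely stand in for Mendeleev numbers) are all assumptions made for the example.

```python
from collections import Counter
import math


def rank_structure_types(train, query, k=3):
    """Rank structure types by how often they occur among the k nearest map points.

    train: list of ((x, y), structure_type) pairs; query: an (x, y) position.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return [label for label, _ in votes.most_common()]


def leave_one_out_hit_rate(data, k=3, list_length=1):
    """Fraction of alloys whose true type is among the top `list_length` candidates.

    Each alloy is removed in turn and predicted from the remaining ones, so a
    miss corresponds to a false negative for a candidate list of that length.
    """
    hits = 0
    for i, (point, true_label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        candidates = rank_structure_types(rest, point, k)[:list_length]
        hits += true_label in candidates
    return hits / len(data)


# Hypothetical toy map: the coordinates are invented, not real Mendeleev numbers.
toy_map = [
    ((10, 80), "NaCl"), ((11, 81), "NaCl"), ((12, 79), "NaCl"), ((11, 78), "NaCl"),
    ((60, 62), "CsCl"), ((61, 63), "CsCl"), ((59, 61), "CsCl"), ((62, 60), "CsCl"),
]

candidates = rank_structure_types(toy_map, (11, 80), k=3)
hit_rate = leave_one_out_hit_rate(toy_map, k=3, list_length=1)
```

Returning a ranked list rather than a single label makes it easy to study how the hit rate changes as more candidates are allowed.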
For many situations, e.g., predicting structures to be checked by ab initio methods or used as input for Rietveld refinements, a false positive is not a large problem, since the error will likely be discarded at a later stage, but a false negative is critical, since it means the correct answer will not be found with further investigation. This leads to the idea of using maps to suggest a candidate structure list, rather than a single candidate structure [72]. Using a list creates many false positives, but greatly reduces the chance of false negatives. A DM perspective on structure prediction encourages one to think of moving beyond present structure map methods. For example, different metrics, other classification algorithms, or mining on more complex alloy descriptors, might yield more accurate results. Some work along these lines has already occurred, including machine learning based structure maps [73] and NN and clustering predictions of compound formation [74]. A similarly spirited application used partial least squares to predict higher level structural features of zeolites in terms of simpler structural descriptors [75], and is part of a more general center focused on DM in materials science [76]. The structure maps have at least two severe limitations. As described above, they predict structure type given that the alloy has a structure at a given stoichiometry, but do not consider the question of whether or not an alloy will have an ordered phase at that stoichiometry. This is not a problem when a structure
is known to exist and one wants to identify it, but in many cases that information is not available. There are some successful methods for identifying alloys as compound forming versus non-compound forming, e.g., Miedema’s rules [77] or Villars’ maps for ternary compounds [68], but the problem of identifying when an alloy will show ordering at a given composition has not been thoroughly investigated in the context of structure maps. However, it is certainly possible that further DM work could be of value in solving this problem, and some potentially useful methods are discussed below. Another serious limitation on structure maps is that classification DM is only effective when an adequate number of samples of each class are available. There are already thousands of structure types, the number is still increasing, and only a small percentage of possible multicomponent alloy systems have been explored [68]. Therefore, it seems unlikely that sufficiently many examples of all the structure type classes will ever be available for totally general application of structure maps. Infrequent structure types are less robustly predicted with structure maps, and totally new structure types cannot be predicted at all. The problem of limited sampling can be alleviated by restricting the area of focus, e.g., considering only the most common structure types, which are likely to be well sampled, or only a subset of alloys, where all the relevant structure types can be discovered. However, the very significant challenge of sampling all the relevant structure types creates a need for other methods. One promising idea is to abandon the use of structure types as the most effective way to classify structures and replace it with a scheme easier to sample. An idea along these lines is to classify alloys by the local environments around each atom [68, 78].
Local environments may in fact be a more relevant method of classification than structure type for understanding physical properties, and there seem to be far fewer local environments than different structure types. This is analogous to classifying proteins by their different folds, which are essential to function and come in limited variety [79]. Computational methods, using different Hamiltonians, offer an increasingly practical route toward crystal structure prediction. Given an accurate Hamiltonian for an alloy, the stable crystal structures can be calculated by minimizing the total energy. These techniques can also predict entirely new structures never seen experimentally, since the prediction is done on the computer. Unfortunately, the structural energy landscape has many local minima, and it cannot be explored quickly or easily. Researchers in this area therefore are forced to make a tradeoff between the speed and accuracy of the energy methods, and the range of possible structures that are explored. For example, Jansen has used simple pair potentials to explore the energy landscape, and then applied more accurate ab initio methods for likely structural candidates [80]. This is a common approach, to optimize with simplified expressions and then use slower and more accurate ab initio energy methods on only the more promising areas. A similar approach was taken to predict a range of
inorganic structures from a genetic algorithm [81]. If one restricts the possible structures, then direct optimization of ab initio energies can be performed. For example, low cohesive energy structures for 32 possible alloying elements were found on a four atom, face centered cubic unit cell by optimizing ab initio energies using a genetic algorithm [82]. Although these approaches are quite promising, optimizing the energy over the space of all possible atomic arrangements is generally not practical. It is necessary to find some approach to guide the calculations to regions of structure space that are likely to have the lowest energy structures and can be explored effectively. A practical and common method to guide calculations is sometimes colloquially referred to as the “round up the usual suspects” approach, borrowing a quotation from Captain Louis Renault at the end of Casablanca. This approach simply involves calculating structures one thinks are likely to be ground states and is another example of human DM, where the scientist is drawing on their own experience to guide the calculations toward the correct structure. As mentioned in the introduction, formalizing human DM on the computer offers many advantages in accuracy, verification, portability, and efficiency. An improvement can be made by limiting the human component to suggesting a few likely parent lattices, and then fitting simplified Hamiltonians on each parent lattice to predict stable structures. This approach, called cluster expansion, has been well validated in many systems [83, 84] and has been successful in predicting some structures that had not been previously identified experimentally [85, 86]. However, choosing the correct parent lattice and performing the fitting required for cluster expansion is at present still difficult to automate, although efforts along these lines are being made [87].
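To make the idea of fitting a simplified Hamiltonian on a parent lattice concrete, the following Python sketch fits a toy cluster expansion on a periodic one-dimensional lattice and then searches all configurations for the lowest predicted energy. Everything here is an assumption made for illustration: the 8-site ring, the choice of clusters (empty, point, first- and second-neighbor pairs), and the "true" interaction values that stand in for ab initio energies. It is not the procedure of Refs. [83–87].

```python
import itertools
import numpy as np


def correlations(sigma):
    """Empty, point, and 1st/2nd-neighbor pair correlations on a periodic 1D lattice."""
    s = np.asarray(sigma, dtype=float)
    return np.array([1.0, s.mean(), (s * np.roll(s, 1)).mean(), (s * np.roll(s, 2)).mean()])


# Hypothetical "true" interactions; in practice the energies come from ab initio runs.
true_eci = np.array([-1.0, 0.1, 0.3, -0.05])

# Training set: a few ordered configurations plus random ones (occupations are +1/-1).
ordered = np.array([
    [-1] * 8,              # pure A
    [1] * 8,               # pure B
    [1, -1] * 4,           # alternating A/B
    [1, 1, -1, -1] * 2,    # double-period ordering
])
rng = np.random.default_rng(0)
train = np.vstack([ordered, rng.choice([-1, 1], size=(16, 8))])

X = np.array([correlations(c) for c in train])
y = X @ true_eci                       # noise-free toy energies per site

# Least-squares fit of the effective cluster interactions
fitted_eci, *_ = np.linalg.lstsq(X, y, rcond=None)

# Exhaustive search of all 2^8 configurations with the fitted Hamiltonian
all_configs = np.array(list(itertools.product([-1, 1], repeat=8)))
energies = np.array([correlations(c) @ fitted_eci for c in all_configs])
ground_state = all_configs[energies.argmin()]
```

With these assumed interactions the search returns an alternating arrangement. In a real application the fit would be noisy, and the choice of clusters would itself need cross validation.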
Ideally, the process of guiding computational crystal structure prediction would be entirely automated by DM methods. A step in this direction has been taken by Curtarolo et al. who have demonstrated how one might combine experimental data, high-throughput computation, and DM methods to guide new calculations toward likely stable crystal structures [88]. Experimental information is used to get a list of commonly occurring structure types, and then these are calculated using automated scripts for a large number of systems. Mined correlations between structural energies are then used to guide calculations on new systems toward stable regions, reducing the number of calculations required to predict crystal structures. This approach can, in theory, be expanded to totally new structure types, since these can be generated on the computer, and work in this direction is under development.
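One simple way to exploit correlations between structural energies is sketched below in Python. It is a stand-in technique (a truncated principal-component basis with a least-squares fit), not necessarily the method of Ref. [88]. The function name, the number of components, and the synthetic energy library are assumptions for illustration only.

```python
import numpy as np


def predict_missing_energies(library, known_idx, known_vals, n_components=2):
    """Estimate all structural energies of a new system from a few computed ones.

    library: (n_systems, n_structures) array of previously calculated energies.
    known_idx / known_vals: structures already calculated for the new system.
    """
    mean = library.mean(axis=0)
    _, _, vt = np.linalg.svd(library - mean, full_matrices=False)
    basis = vt[:n_components].T                    # (n_structures, n_components)
    coeffs, *_ = np.linalg.lstsq(
        basis[known_idx], known_vals - mean[known_idx], rcond=None
    )
    return mean + basis @ coeffs


# Synthetic library: energies driven by two hidden trends (purely hypothetical data).
rng = np.random.default_rng(1)
n_systems, n_structures = 30, 6
factors = rng.normal(size=(2, n_structures))
library = rng.normal(size=(n_systems, 2)) @ factors - 1.0

# A new system whose full set of energies would be unknown in practice
new_system = np.array([0.7, -1.2]) @ factors - 1.0
known_idx = [0, 2, 4]                              # only three structures calculated
predicted = predict_missing_energies(library, known_idx, new_system[known_idx])

# Suggest which structure to calculate next: the one predicted to be most stable
next_structure = int(np.argmin(predicted))
```

Because the synthetic library is exactly low rank, the prediction here is essentially exact. Real energy data would only be approximately low rank, so predictions would carry errors that should be assessed by cross validation.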
4. Conclusions
We have seen here a number of different examples of DM applications in different areas, and it is valuable to step back and note some overall
features. In general, DM applications in materials development still need to prove themselves, and relatively few new discoveries have been made using them. Many of the results in this field consist primarily of exploring new models to demonstrate that such modeling is possible, that accurate predictions can be made, and that useful understanding of dependencies on key variables can be obtained. This will inevitably cause some skepticism about the final utility of the methods, but it is appropriate for a field which is still relatively young and finding its place. A similar evolution has been followed by, e.g., ab initio quantum mechanical techniques. It is only recently that these methods have moved out of the stage where the accuracy of the model was the key issue to the stage where the bulk of papers focus on the materials results, not the techniques. All the drivers for using DM methods identified in the introduction (more data, databases, and DM tools) will only become increasingly forceful with continuing advances in experiment, computation, algorithms, and information technology. For these reasons, we believe that DM approaches are going to be increasingly important tools for the modern materials developer. A number of the above examples showed the necessity of combining DM methods with more traditional physical approaches. Whether it is microstructural modeling in the area of processing–structure–property prediction or kinetic equation modeling in catalysis design, physical modeling is by no means standing still, and its utility will continue to expand. In the few cases where authors make direct comparisons, it is not clear that DM applications have been more effective [44, 89]. It is already true that DM approaches, although more data focused, are deeply intertwined with traditional physical modeling.
A researcher’s knowledge of the physics of the problem strongly influences such things as choices of descriptors (e.g., exponentiating parameters where thermal activation is expected), choices in the predictive model (e.g., using linear models when linear relationships are expected), and many unwritten small decisions about how the DM is done. DM and physical modeling, despite an apparent conflict, are really best used collaboratively, and effective materials researchers will need to combine both tools to have maximal impact. Another important feature to note is the difference between DM in materials science and the more established areas of drug design and QSPR/QSAR. Although the overall framework is very similar, establishing effective descriptors for independent variables seems to be harder in materials applications. Bulk materials, more common in traditional materials science applications, often have atomic-, nano-, and micro-structural features that are hard to characterize and quantify with effective descriptors. In their absence, further progress on many problems will require additional descriptors relating to processing choices.
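As a small illustration of how physical knowledge shapes the choice of descriptor, the Python sketch below compares two linear fits of a thermally activated quantity: one on the raw temperature and one on the Arrhenius-motivated descriptor 1/T with a logarithmic target. The data are synthetic, and the prefactor and activation energy are hypothetical values chosen only for the example.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

# Synthetic, hypothetical data: rate = A * exp(-Ea / (kB * T)) with Ea = 0.8 eV
temperatures = np.array([600.0, 700.0, 800.0, 900.0, 1000.0])  # K
rate = 1.0e13 * np.exp(-0.8 / (K_B * temperatures))

# Physically motivated descriptor: 1/T, with log(rate) as the target
slope, intercept = np.polyfit(1.0 / temperatures, np.log(rate), 1)
activation_energy = -slope * K_B       # recovered Ea in eV
prefactor = np.exp(intercept)          # recovered A

# Naive descriptor for comparison: fit the rate directly against T
naive_coeffs = np.polyfit(temperatures, rate, 1)
naive_error = np.abs(np.polyval(naive_coeffs, temperatures) - rate).max() / rate.max()
```

The transformed descriptor makes the relationship exactly linear in this noise-free case, whereas the naive linear fit in T leaves large residuals.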
Finally, we would like to stress the natural synergy between DM and other kinds of computational modeling. High-throughput computation can help provide the wealth of data needed for robust data mining, as was illustrated above in the use of computationally optimized structures for boiling point modeling [20] and crystal structure prediction [80–82, 88]. Impressive examples of high-throughput ab initio computation providing large amounts of accurate materials data can be found in Refs. [90–92]. High-throughput computation not only increases the effectiveness of DM methods, but extends the reach of computational modeling, since DM methods can help span the challenging range of length and time scales involved in materials phenomena. The growing power of DM and other computational methods will only increase their interdependence in the future. On a more personal note, we have found that one of the most valuable contributions of DM to our research has been to expand how we think about problems. DM encourages one to ask how one can make optimal use of data and to look deeply for patterns that might provide valuable information. DM makes one think on a large scale, thereby encouraging the automation of experiment, computation, and data analysis for high-throughput production. DM also encourages a culture of careful testing for any kind of fitting, through cross validation and statistical methods. Lastly, DM is inherently interdisciplinary, encouraging materials scientists to learn more about analogous problems and techniques from across the hard and soft sciences, thereby enriching us all as researchers.
References
[1] W. Klosgen and J.M. Zytkow, Handbook of Data Mining and Knowledge Discovery, Oxford University Press, Oxford, 2002. [2] N. Ye, The Handbook of Data Mining, Lawrence Erlbaum Associates, London, 2003. [3] D. von Mendelejeff, “Ueber die Beziehungen der Eigenschaften zu den Atomgewichten der Elemente,” Zeit. Chem., 12, 405–406, 1869. [4] M.F. Ashby, Materials Selection in Mechanical Design, Butterworth-Heinemann, Boston, 1999. [5] D. Braha, Data Mining for Design and Manufacturing, Kluwer Academic Publishers, Boston, 2001. [6] M.H. Dunham, Data Mining: Introductory and Advanced Topics, Pearson Education, Inc., Upper Saddle River, New Jersey, 2003. [7] M. Kantardzic, Data Mining: Concepts, Models, Methods, and Algorithms, Wiley-Interscience, IEEE Press, Hoboken, New Jersey, 2003. [8] PolyphonicHMI, (http://www.polyphonichmi.com/technology.html). [9] M.H. Kutner, C.J. Nachtsheim, W. Wasserman, and J. Neter, Applied Linear Statistical Models, McGraw-Hill, New York, 1996. [10] A.C. Rencher, Methods of Multivariate Analysis, Wiley-Interscience, New York, 2002.
[11] J.E. Jackson, A User’s Guide to Principal Components, John Wiley & Sons, New York, 1991. [12] S. de Jong, “SIMPLS: an alternative approach to partial least squares regression,” Chemometrics and Intelligent Laboratory Systems, 18, 251–263, 1993. [13] B.M. Wise and N.B. Gallagher, PLS Toolbox 2.1 for Matlab, Eigenvector Research, Inc., Manson, WA, 2000. [14] S. Wold, A.H.W. Ruhe, and W.J. Dunn, “The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses,” SIAM J. Sci. Stat. Comput., 5, 735–743, 1984. [15] M.T. Hagan, H.B. Demuth, and M.H. Beale, Neural Network Design, Martin Hagan, 2002. [16] D.J.C. MacKay, “Bayesian interpolation,” Neural Comput., 4, 415–447, 1992. [17] D.J.C. MacKay, “A practical Bayesian framework for backpropagation networks,” Neural Comput., 4, 448–472, 1992. [18] D.J.C. MacKay, “Probable networks and plausible predictions – a review of practical Bayesian methods for supervised neural networks,” Network-Comput. Neural Syst., 6, 469–505, 1995. [19] D.J.C. MacKay, “Bayesian modeling with neural networks,” In: H. Cerjak (ed.), Mathematical Modeling of Weld Phenomena, vol. 3. The Institute of Materials, London, pp. 359–389, 1997. [20] A.J. Chalk, B. Beck, and T. Clark, “A quantum mechanical/neural net model for boiling points with error estimation,” J. Chem. Inf. Comput. Sci., 41, 457–462, 2001. [21] H. Bhadeshia, “Neural networks in materials science,” ISIJ Int., 39, 966–979, 1999. [22] J.M. Serra, A. Corma, A. Chica, E. Argente, and V. Botti, “Can artificial neural networks help the experimentation in catalysis?,” Catal. Today, 81, 393–403, 2003. [23] K. Baumann, “Cross-validation as the objective function for variable-selection techniques,” Trac-Trend Anal. Chem., 22, 395–406, 2003. [24] A.S. Goldberger, A Course in Econometrics, Harvard University Press, Cambridge, MA, 1991. [25] E.K.P. Chong and S.H. Zak, An Introduction to Optimization, John Wiley & Sons, New York, 2001. [26] W.H.
Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C, Cambridge University Press, Cambridge, 1992. [27] J. Taskinen and J. Yliruusi, “Prediction of physicochemical properties based on neural network modelling,” Adv. Drug Deliv. Rev., 55, 1163–1183, 2003. [28] H. Bhadeshia, “Design of ferritic creep-resistant steels,” ISIJ Int., 41, 626–640, 2001. [29] T. Sourmail, H. Bhadeshia, and D.J.C. MacKay, “Neural network model of creep strength of austenitic stainless steels,” Mater. Sci. Technol., 18, 655–663, 2002. [30] S.H. Lalam, H. Bhadeshia, and D.J.C. MacKay, “Estimation of mechanical properties of ferritic steel welds part 1: yield and tensile strength,” Sci. Technol. Weld. Joining, 5, 135–147, 2000. [31] S.H. Lalam, H. Bhadeshia, and D.J.C. MacKay, “Estimation of mechanical properties of ferritic steel welds part 2: Elongation and Charpy toughness,” Sci. Technol. Weld. Joining, 5, 149–160, 2000. [32] M.A. Yescas, H. Bhadeshia, and D.J.C. MacKay, “Estimation of the amount of retained austenite in austempered ductile irons using neural networks,” Mater. Sci. Eng. A, 311, 162–173, 2001. [33] S. Cardie and H.K.D.H. Bhadeshia, “Materials algorithms project (MAP): Public domain research software & data,” In: Mathematical Modelling of Weld Phenomena IV, Institute of Materials, London, 1998.
[34] S. Malinov and W. Sha, “Software products for modelling and simulation in materials science,” Comput. Mater. Sci., 28, 179–198, 2003. [35] S. Malinov, W. Sha, and Z. Guo, “Application of artificial neural network for prediction of time-temperature-transformation diagrams in titanium alloys,” Mater. Sci. Eng. Struct. Matter Properties Microstruct. Process, 283, 1–10, 2000. [36] S. Malinov, W. Sha, and J.J. McKeown, “Modelling the correlation between processing parameters and properties in titanium alloys using artificial neural network,” Comput. Mater. Sci., 21, 375–394, 2001. [37] S. Malinov and W. Sha, “Application of artificial neural networks for modelling correlations in titanium alloys,” Mater. Sci. Eng., A365, 202–211, 2004. [38] T. Malinova, S. Malinov, and N. Pantev, “Simulation of microhardness profiles for nitrocarburized surface layers by artificial neural network,” Surf. Coat. Technol., 135, 258–267, 2001. [39] T. Malinova, N. Pantev, and S. Malinov, “Prediction of surface hardness after ferritic nitrocarburising of steels using artificial neural networks,” Mater. Sci. Technol., 17, 168–174, 2001. [40] S. Christensen, J.S. Kandola, O. Femminella, S.R. Gunn, P.A.S. Reed, and I. Sinclair, “Adaptive numerical modelling of commercial aluminium plate performance,” Aluminium Alloys: Their Physical and Mechanical Properties, Pts 1–3, 331–3, 533–538, 2000. [41] O.P. Femminella, M.J. Starink, M. Brown, I. Sinclair, C.J. Harris, and P.A.S. Reed, “Data pre-processing/model initialisation in neurofuzzy modelling of structure-property relationships in Al–Zn–Mg–Cu alloys,” ISIJ Int., 39, 1027–1037, 1999. [42] O.P. Femminella, M.J. Starink, S.R. Gunn, C.J. Harris, and P.A.S. Reed, “Neurofuzzy and supanova modelling of structure–property relationships in Al–Zn–Mg–Cu alloys,” Aluminium Alloys: Their Physical and Mechanical Properties, Pts 1–3, 331–3, 1255–1260, 2000. [43] J.S. Kandola, S.R. Gunn, I. Sinclair, and P.A.S.
Reed, “Data driven knowledge extraction of materials properties,” In: Proceedings of Intelligent Processing and Manufacturing of Materials, Hawaii, USA, 1999. [44] M.J. Starink, I. Sinclair, P.A.S. Reed, and P.J. Gregson, “Predicting the structural performance of heat-treatable Al-alloys,” In: Aluminum Alloys - Their Physical and Mechanical Properties, Parts 1-3, vol. 331–337, pp. 97–110, Trans Tech Publications, Switzerland, 2000. [45] H. Byun and S.W. Lee, “Applications of support vector machines for pattern recognition: A survey,” Pattern Recogn. Support Vector Machines, Proc., 2388, 213–236, 2002. [46] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, Cambridge, UK, 2000. [47] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995. [48] L. Harmon, “Experiment planning for combinatorial materials discovery,” J. Mater. Sci., 38, 4479–4485, 2003. [49] G.J. Hutchings and M.S. Scurrell, “Designing oxidation catalysts – are we getting better?,” Cattech, 7, 90–103, 2003. [50] C. Klanner, D. Farrusseng, L. Baumes, C. Mirodatos, and F. Schuth, “How to design diverse libraries of solid catalysts?,” QSAR & Combinatorial Science, 22, 729–736, 2003.
[51] T.R. Cundari, J. Deng, and Y. Zhao, “Design of a propane ammoxidation catalyst using artificial neural networks and genetic algorithms,” Indust. & Eng. Chem. Res., 40, 5475–5480, 2001. [52] T. Hattori and S. Kito, “Neural-network as a tool for catalyst development,” Catal. Today, 23, 347–355, 1995. [53] M. Holena and M. Baerns, “Feedforward neural networks in catalysis - a tool for the approximation of the dependency of yield on catalyst composition, and for knowledge extraction,” Catal. Today, 81, 485–494, 2003. [54] K. Huang, X.L. Zhan, F.Q. Chen, and D.W. Lu, “Catalyst design for methane oxidative coupling by using artificial neural network and hybrid genetic algorithm,” Chem. Eng. Sci., 58, 81–87, 2003. [55] A. Tompos, J.L. Margitfalvi, E. Tfirst, and L. Vegvari, “Information mining using artificial neural networks and ‘holographic research strategy’,” Appl. Catal. A, 254, 161–168, 2003. [56] T. Umegaki, Y. Watanabe, N. Nukui, E. Omata, and M. Yamada, “Optimization of catalyst for methanol synthesis by a combinatorial approach using a parallel activity test and genetic algorithm assisted by a neural network,” Energy Fuels, 17, 850–856, 2003. [57] O.V. Buyevskaya, A. Bruckner, E.V. Kondratenko, D. Wolf, and M. Baerns, “Fundamental and combinatorial approaches in the search for and optimisation of catalytic materials for the oxidative dehydrogenation of propane to propene,” Catal. Today, 67, 369–378, 2001. [58] U. Rodemerck, D. Wolf, O.V. Buyevskaya, P. Claus, S. Senkan, and M. Baerns, “High-throughput synthesis and screening of catalytic materials – case study on the search for a low-temperature catalyst for the oxidation of low-concentration propane,” Chem. Eng. J., 82, 3–11, 2001. [59] D. Wolf, O.V. Buyevskaya, and M. Baerns, “An evolutionary approach in the combinatorial selection and optimization of catalytic materials,” Appl. Catal. A, 200, 63–77, 2000. [60] J.M. Caruthers, J.A. Lauterbach, K.T. Thomson, V. Venkatasubramanian, C.M. Snively, A.
Bhan, S. Katare, and G. Oskarsdottir, “Catalyst design: knowledge extraction from high-throughput experimentation,” J. Catal., 216, 98–109, 2003. [61] A. Tuchbreiter and R. Mulhaupt, “The polyolefin challenges: catalyst and process design, tailor-made materials, high-throughput development and data mining,” Macromol. Symp., 173, 1–20, 2001. [62] A. Tuchbreiter, J. Marquardt, B. Kappler, J. Honerkamp, M.O. Kristen, and R. Mulhaupt, “High-output polymer screening: exploiting combinatorial chemistry and data mining tools in catalyst and polymer development,” Macromol. Rapid Comm., 24, 47–62, 2003. [63] A. Hagemeyer, B. Jandeleit, Y.M. Liu, D.M. Poojary, H.W. Turner, A.F. Volpe, and W.H. Weinberg, “Applications of combinatorial methods in catalysis,” Appl. Catal. A, 221, 23–43, 2001. [64] G. Bergerhoff, R. Hundt, R. Sievers, and I.D. Brown, “The inorganic crystal-structure data-base,” J. Chem. Inf. Comput. Sci., 23, 66–69, 1983. [65] P. Villars, K. Cenzual, J.L.C. Daams, F. Hulliger, T.B. Massalski, H. Okamoto, K. Osaki, and A. Prince, Pauling File, ASM International, Materials Park, Ohio, USA, 2002. [66] P.S. White, J. Rodgers, and Y. Le Page, “Crystmet: a database of structures and powder patterns of metals and intermetallics,” Acta Cryst. B, 58, 343–348, 2002.
[67] S. Kabekkodu, G. Grosse, and J. Faber, “Data mining in the ICDD’s metals & alloys relational database,” Epdic 7: European Powder Diffraction, Pts 1 and 2, 378–3, 100–105, 2001. [68] P. Villars, “Factors governing crystal structures,” In: J.H. Westbrook and R.L. Fleischer (eds.), Intermetallic Compounds, Principle and Practice, vol. 1, John Wiley & Sons, New York, pp. 227–275, 1994. [69] J.K. Burdett and J. Rodgers, “Structure & property maps for inorganic solids,” In: R.B. King (ed.), Encyclopedia of Inorganic Chemistry, vol. 7, John Wiley & Sons, New York, 1994. [70] D.G. Pettifor, “The structures of binary compounds: I. Phenomenological structure maps,” J. Phys. C: Solid State Phys., 19, 285–313, 1986. [71] D.G. Pettifor, “A chemical scale for crystal-structure maps,” Solid State Commun., 51, 31–34, 1984. [72] D. Morgan, J. Rodgers, and G. Ceder, “Automatic construction, implementation and assessment of Pettifor maps,” J. Phys. Condens. Matter, 15, 4361–4369, 2003. [73] G.A. Landrum, Prediction of Structure Types for Binary Compounds, Rational Discovery, Inc., Palo Alto, pp. 1–8, 2001. [74] Y.H. Pao, B.F. Duan, Y.L. Zhao, and S.R. LeClair, “Analysis and visualization of category membership distribution in multivariate data,” Eng. Appl. Artif. Intell., 13, 521–525, 2000. [75] A. Rajagopalan, C.W. Suh, X. Li, and K. Rajan, “‘Secondary’ descriptor development for zeolite framework design: an informatics approach,” Appl. Catal. A, 254, 147–160, 2003. [76] K. Rajan, Combinatorial materials science and material informatics laboratory (COSMIC), (http://www.rpi.edu/∼rajank/materialsdiscovery/). [77] F.R. de Boer, R. Boom, W.C.M. Mattens, A.R. Miedema, and A.K. Niessen, Cohesion in Metals: Transition Metal Alloys, North Holland, Amsterdam, 1988. [78] J.L.C. Daams, “Atomic environments in some related intermetallic structure types,” In: J.H. Westbrook and R.L. Fleischer (eds.), Intermetallic Compounds, Principle and Practice, vol. 1, John Wiley & Sons, New York, pp. 
227–275, 1994. [79] S.
Dietmann, J. Park, C. Notredame, A. Heger, M. Lappe, and L. Holm, “A fully automatic evolutionary classification of protein folds: Dali domain dictionary version 3,” Nucleic Acids Res., 29, 55–57, 2001. [80] M. Jansen, “A concept for synthesis planning in solid-state chemistry,” Angew. Chem. Int. Ed., 41, 3747–3766, 2002. [81] S.M. Woodley, P.D. Battle, J.D. Gale, and C.R.A. Catlow, “The prediction of inorganic crystal structures using a genetic algorithm and energy minimisation,” Phys. Chem. Chem. Phys., 1, 2535–2542, 1999. [82] G.H. Johannesson, T. Bligaard, A.V. Ruban, H.L. Skriver, K.W. Jacobsen, and J.K. Norskov, “Combined electronic structure and evolutionary search approach to materials design,” Phys. Rev. Lett., 88, pp. 255506-1–255506-5, 2002. [83] D. de Fontaine, “Cluster approach to order-disorder transformations in alloys,” In: Solid State Physics, H. Ehrenreich and D. Turnbull (eds.), vol. 47, Academic Press, pp. 33–77, 1994. [84] A. Zunger, “First-principles statistical mechanics of semiconductor alloys and intermetallic compounds,” Statics and Dynamics of Alloy Phase Transformations, New York, 1994. [85] V. Blum and A. Zunger, “Structural complexity in binary bcc ground states: The case of bcc Mo–Ta,” Phys. Rev. B, 69, pp. 020103-1–020103-4, 2004. [86] G. Ceder, “Predicting properties from scratch,” Science, 280, 1099–1100, 1998.
[87] A. van de Walle, M. Asta, and G. Ceder, “The alloy theoretic automated toolkit: A user guide,” Calphad-Computer Coupling of Phase Diagrams and Thermochemistry, 26, 539–553, 2002. [88] S. Curtarolo, D. Morgan, K. Persson, J. Rodgers, and G. Ceder, “Predicting crystal structures with data mining of quantum calculations,” Phys. Rev. Lett., 91, 2003. [89] B. Chan, M. Bibby, and N. Holtz, “Predicting 800 to 500 Degrees C Weld Cooling Times by using Backpropagation Neural Networks,” Trans. Can. Soc. Mech. Eng., 20, 75, 1996. [90] T. Bligaard, G.H. Johannesson, A.V. Ruban, H.L. Skriver, K.W. Jacobsen, and J.K. Norskov, “Pareto-optimal alloys,” Appl. Phys. Lett., 83, 4527–4529, 2003. [91] S. Curtarolo, D. Morgan, and G. Ceder, “Accuracy of ab initio methods in predicting the crystal structures of metals: Review of 80 binary alloys,” submitted for publication, 2004. [92] A. Franceschetti and A. Zunger, “The inverse band-structure problem of finding an atomic configuration with given electronic properties,” Nature, 402, 60–63, 1999.
1.19 FINITE ELEMENTS IN AB INITIO ELECTRONIC-STRUCTURE CALCULATIONS J.E. Pask and P.A. Sterne Lawrence Livermore National Laboratory, Livermore, CA, USA
S. Yip (ed.), Handbook of Materials Modeling, 423–437. © 2005 Springer. Printed in the Netherlands.

Over the course of the past two decades, the density functional theory (DFT) (see e.g., [1]) of Hohenberg, Kohn, and Sham has proven to be an accurate and reliable basis for the understanding and prediction of a wide range of materials properties from first principles (ab initio), with no experimental input or empirical parameters. However, the solution of the Kohn–Sham equations of DFT is a formidable task and this has limited the range of physical systems which can be investigated by such rigorous, quantum mechanical means. In order to extend the interpretive and predictive power of such quantum mechanical theories further into the domain of “real materials”, involving nonstoichiometric deviations, defects, grain boundaries, surfaces, interfaces, and the like, robust and efficient methods for the solution of the associated quantum mechanical equations are critical.

The finite-element (FE) method (see e.g., [2]) is a general method for the solution of partial differential and integral equations which has found wide application in diverse fields ranging from particle physics to civil engineering. Here, we discuss its application to large-scale ab initio electronic-structure calculations. Like the traditional planewave (PW) method (see e.g., [3]), the FE method is a variational expansion approach, in which solutions are represented as a linear combination of basis functions. However, whereas the PW method employs a Fourier basis, with every basis function overlapping every other, the FE method employs a basis of strictly local piecewise polynomials, each overlapping only its immediate neighbors. Because the FE basis consists of polynomials, the method is completely general and systematically improvable, like the PW method. Because the basis is strictly local, however, the method offers some significant advantages. First, because the basis functions are localized, they can be concentrated where needed in real space to increase the efficiency of the representation. Second, a variety of boundary conditions can be accommodated, including Dirichlet boundary conditions for molecules or clusters, Bloch boundary conditions for crystals, or a mixture of these for surfaces. Finally, and most significantly for large-scale calculations, the strict locality of the basis facilitates implementation on massively parallel computational architectures by minimizing the need for nonlocal communications.

The advantages of such a local, real-space approach in large-scale calculations have been amply demonstrated in the context of finite-difference (FD) methods (see, e.g., [4]). However, FD methods are not variational expansion methods, and this leads to disadvantages such as limited accuracy in integrations and nonvariational convergence. By retaining the use of a basis while remaining strictly local in real space, FE methods combine significant advantages of both PW and FD approaches.
1. Finite Element Bases
The construction and key properties of FE bases are perhaps best conveyed in the simplest case: a one-dimensional (1D), piecewise-linear basis. Figure 1 shows the steps involved in the construction of such a basis on a domain $\Omega = (0, 1)$. The domain is partitioned into subdomains called elements (Fig. 1a). In this case, the domain is partitioned into three elements $\Omega_1$–$\Omega_3$; in practice, there are typically many more, so that each element encompasses only a small fraction of the domain. For simplicity, we have chosen a uniform partition, but this need not be the case in general. (Indeed, it is precisely the flexibility to partition the domain as desired which allows for the substantial efficiency of the basis in highly inhomogeneous problems.) A parent basis $\hat{\phi}_i$ is then defined on the parent element $\hat{\Omega} = (-1, 1)$ (Fig. 1b). In this case, the parent basis functions are $\hat{\phi}_1(\xi) = (1 - \xi)/2$ and $\hat{\phi}_2(\xi) = (1 + \xi)/2$. Since the parent basis consists of two (independent) linear polynomials, it is complete to linear order, i.e., a linear combination can represent any linear polynomial exactly. Furthermore, it is defined such that each function takes on the value 1 at exactly one point, called its node, and vanishes at all (one, in this case) other nodes. Local basis functions $\phi_i^{(e)}$ are then generated by transformations $\xi^{(e)}(x)$ of the parent basis functions $\hat{\phi}_i$ from the parent element $\hat{\Omega}$ to each element $\Omega_e$ (Fig. 1c). In the present case, for example, $\phi_1^{(1)}(x) \equiv \hat{\phi}_1(\xi^{(1)}(x)) = 1 - 3x$ and $\phi_2^{(1)}(x) \equiv \hat{\phi}_2(\xi^{(1)}(x)) = 3x$, where $\xi^{(1)}(x) = 6x - 1$. Finally, the piecewise-polynomial basis functions $\phi_i$ of the method are generated by piecing together the local basis functions (Fig. 1d). In the present case, for example,

\[
\phi_2(x) =
\begin{cases}
\phi_2^{(1)}(x), & x \in [0, 1/3] \\
\phi_1^{(2)}(x), & x \in [1/3, 2/3] \\
0, & \text{otherwise.}
\end{cases}
\]
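The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch for the three-element example above: the helper names (`parent_basis`, `make_hat_basis`) are hypothetical, and the basis functions are indexed from 0, so `basis[1]` plays the role of $\phi_2$.

```python
def parent_basis(xi):
    """Linear parent basis (phi_hat_1, phi_hat_2) on the parent element (-1, 1)."""
    return (0.5 * (1.0 - xi), 0.5 * (1.0 + xi))


def make_hat_basis(nodes):
    """Piece together local basis functions into piecewise-linear basis functions."""
    def make_phi(i):
        def phi(x):
            # Element to the left of node i: node i is the second local node there.
            if i > 0 and nodes[i - 1] <= x <= nodes[i]:
                a, b = nodes[i - 1], nodes[i]
                xi = 2.0 * (x - a) / (b - a) - 1.0   # affine map of the element to (-1, 1)
                return parent_basis(xi)[1]
            # Element to the right of node i: node i is the first local node there.
            if i < len(nodes) - 1 and nodes[i] <= x <= nodes[i + 1]:
                a, b = nodes[i], nodes[i + 1]
                xi = 2.0 * (x - a) / (b - a) - 1.0
                return parent_basis(xi)[0]
            return 0.0
        return phi
    return [make_phi(i) for i in range(len(nodes))]


nodes = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]   # three uniform elements on (0, 1)
basis = make_hat_basis(nodes)

# Nodal property: phi_i(x_j) = 1 if i == j, else 0.
nodal = [[round(phi(xj), 12) for xj in nodes] for phi in basis]

# Completeness to linear order: the nodal interpolant reproduces a linear function.
f = lambda x: 2.0 - 5.0 * x
x_test = 0.47
interp = sum(f(xj) * phi(x_test) for xj, phi in zip(nodes, basis))
```

On the first element the affine map reduces to $\xi^{(1)}(x) = 6x - 1$, so `basis[1](x)` evaluates to $3x$ there, in agreement with the expression for $\phi_2^{(1)}$ given above.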
Figure 1. 1D piecewise-linear FE bases. (a) Domain and elements. (b) Parent element and parent basis functions. (c) Local basis functions generated by transformations of parent basis functions to each element. (d) General piecewise-linear basis, generated by piecing together local basis functions across interelement boundaries. (e) Dirichlet basis, generated by omitting boundary functions. (f) Periodic basis, generated by piecing together boundary functions.
The above 1D piecewise-linear FE basis possesses the key properties of all such bases, whether of higher dimension or higher polynomial order. First, the basis functions are strictly local, i.e., nonzero over only a small fraction of the domain. This leads to sparse matrices and scalability, as in FD approaches,
while retaining the use of a basis, as in PW approaches. Second, within each element, the basis functions are simple, low-order polynomials, which leads to computational efficiency, generality, and systematic improvability, as in FD and PW approaches. Third, the basis functions are $C^0$ in nature, i.e., continuous but not necessarily smooth. As we shall discuss, this necessitates extra care in the solution of second-order problems, with periodic boundary conditions in particular. Finally, the basis functions have the key property $\phi_i(x_j) = \delta_{ij}$, i.e., each basis function takes on a value of 1 at its associated node and vanishes at all other nodes. By virtue of this property, an FE expansion $f(x) = \sum_i c_i \phi_i(x)$ has the property $f(x_j) = c_j$, so that the expansion coefficients have a direct, real-space meaning. This eliminates the need for computationally intensive transforms, such as Fourier transforms in PW approaches, and facilitates preconditioning in iterative solutions, such as multigrid in FD approaches (see, e.g., [4]).

Figure 1(d) shows a general FE basis, capable of representing any piecewise linear function (having the same polynomial subintervals) exactly. To solve a problem subject to vanishing Dirichlet boundary conditions, as occurs in molecular or cluster calculations, one can restrict the basis as in Fig. 1(e), i.e., omit boundary functions. To solve a problem subject to periodic boundary conditions, as occurs in solid-state electronic-structure calculations, one can restrict the basis as in Fig. 1(f), i.e., piece together local basis functions across the domain boundary in addition to piecing together across interelement boundaries. Regarding this periodic basis, however, it should be noted that an arbitrary linear combination $f(x) = \sum_i c_i \phi_i(x)$ necessarily satisfies

\[
f(0) = f(1), \tag{1}
\]

but does not necessarily satisfy

\[
f'(0) = f'(1). \tag{2}
\]
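Conditions (1) and (2) can be checked numerically for a small periodic piecewise-linear expansion. The sketch below assumes a uniform mesh on $(0, 1)$ and arbitrary, made-up coefficients; the function name `periodic_pl` is hypothetical.

```python
def periodic_pl(coeffs):
    """Periodic piecewise-linear function on (0, 1) built from nodal coefficients.

    coeffs[j] is the value at node x_j = j/n; node n is identified with node 0.
    Returns the function and its (elementwise constant) derivative.
    """
    n = len(coeffs)
    h = 1.0 / n

    def f(x):
        e = min(int(x / h), n - 1)          # index of the element containing x
        t = (x - e * h) / h                 # local coordinate in [0, 1]
        return (1.0 - t) * coeffs[e] + t * coeffs[(e + 1) % n]

    def dfdx(x):
        e = min(int(x / h), n - 1)
        return (coeffs[(e + 1) % n] - coeffs[e]) / h

    return f, dfdx


coeffs = [0.3, 1.0, -0.5]                   # arbitrary expansion coefficients
f, dfdx = periodic_pl(coeffs)

value_jump = f(1.0) - f(0.0)                # vanishes: condition (1) holds by construction
slope_jump = dfdx(1.0) - dfdx(0.0)          # nonzero here: condition (2) is not enforced
```

For these coefficients the slope in the last element is $(0.3 + 0.5)/h$ and in the first element $(1.0 - 0.3)/h$, so the two end-point slopes differ even though the end-point values agree.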
Thus, unlike PW or other such smooth bases, while the value condition (1) is enforced by the use of such an FE basis, the derivative condition (2) is not. And so for problems requiring the enforcement of both, as in solid-state electronic-structure, the derivative condition must be enforced by other means [5]. We address this further in the next section. Higher-order FE bases are constructed by defining more independent parent basis functions, which requires that some basis functions be of higher order than linear. And, as in the linear case, what is typically done is to define all functions to be of the same order so that, for example, to define a 1D quadratic basis, one would define three quadratic parent basis functions; for a 1D cubic basis, four cubic parent basis functions, etc. With higher-order basis functions,
however, come new possibilities. For example, with cubic basis functions there are sufficient degrees of freedom to specify both value and slope at end points, thus allowing for the possibility of both value and slope continuity across interelement boundaries, and so allowing for the possibility of a $C^1$ (continuous value and slope) rather than $C^0$ basis. For sufficiently smooth problems, such higher order continuity can yield greater accuracy per degree of freedom and such bases have been used in the electronic-structure context [6, 7]. However, while straightforward in one dimension, in higher dimensions this requires matching both values and derivatives (including cross terms) across entire curves or surfaces, which becomes increasingly difficult to accomplish and leads to additional constraints on the transformations, and thus meshes, which can be employed [8].

Higher-dimensional FE bases are constructed along the same lines as the 1D case: partition the domain into elements, define local basis functions within each element via transformations of parent basis functions, and piece together the resulting local basis functions to form the piecewise-polynomial FE basis. In higher dimensions, however, there arises a significant additional choice: that of shape. The most common 2D element shapes are triangles and quadrilaterals. In 3D, tetrahedra, hexahedra (e.g., parallelepipeds), and wedges are among the most common. A variety of shapes have been employed in atomic and molecular calculations (see, e.g., [9]). In solid-state electronic-structure calculations, the domain can be reduced to a parallelepiped and $C^0$ [5] as well as $C^1$ [7] parallelepiped elements have been employed.
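As a small illustration of a higher-order parent basis, the sketch below uses the standard quadratic Lagrange polynomials on $(-1, 1)$ with nodes at $-1$, $0$, and $1$. This particular choice of nodes is an assumption made for the example; the text above does not specify one.

```python
def quadratic_parent_basis(xi):
    """Quadratic Lagrange parent basis on (-1, 1) with nodes at -1, 0, 1."""
    return (0.5 * xi * (xi - 1.0), 1.0 - xi * xi, 0.5 * xi * (xi + 1.0))


parent_nodes = (-1.0, 0.0, 1.0)

# Nodal property: each function is 1 at its own node and 0 at the others.
nodal = [quadratic_parent_basis(xi) for xi in parent_nodes]

# Completeness to quadratic order: the nodal interpolant of a quadratic
# polynomial reproduces it exactly everywhere in the element.
p = lambda xi: 1.0 - 2.0 * xi + 3.0 * xi * xi
xi_test = 0.37
interp = sum(p(node) * phi
             for node, phi in zip(parent_nodes, quadratic_parent_basis(xi_test)))
```

Three quadratic functions are needed because a quadratic polynomial has three coefficients, which is the 1D counting rule stated above (three quadratic functions for a quadratic basis, four cubic functions for a cubic basis).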
2. Solution of the Schrödinger and Poisson Equations
The solution of the Kohn–Sham equations can be accomplished by a number of approaches, including direct minimization of the energy functional [10], solution of the associated Lagrangian equations [11], and self-consistent (SC) solution of associated Schrödinger and Poisson equations (see, e.g., [3]). A finite-element based energy minimization approach has been described by Tsuchida and Tsukada [7] in the context of molecular and Γ-point crystalline calculations. Here, we shall describe a finite-element based SC approach. In this section, we discuss the solution of the Schrödinger and Poisson equations; in the next, we discuss self-consistency. The solution of such equations subject to Dirichlet boundary conditions, as appropriate for molecular or cluster calculations, is discussed extensively in the standard texts and literature (see, e.g., [2, 9]). Here, we shall discuss their solution subject to boundary conditions appropriate for a periodic (crystalline) solid.
In a perfect crystal, the electronic potential is periodic, i.e.,

\[
V(\mathbf{x} + \mathbf{R}) = V(\mathbf{x}) \tag{3}
\]

for all lattice vectors $\mathbf{R}$, and the solutions of the Schrödinger equation satisfy Bloch’s theorem

\[
\psi(\mathbf{x} + \mathbf{R}) = e^{i\mathbf{k}\cdot\mathbf{R}}\, \psi(\mathbf{x}) \tag{4}
\]

for all lattice vectors $\mathbf{R}$ and wavevectors $\mathbf{k}$ [12]. Thus the values of $V(\mathbf{x})$ and $\psi(\mathbf{x})$ throughout the crystal are completely determined by their values in a single unit cell, and so the solutions of the Poisson and Schrödinger equations in the crystal can be reduced to their solutions in a single unit cell, subject to boundary conditions consistent with Eqs. (3) and (4), respectively. We consider first the Schrödinger problem:

\[
-\tfrac{1}{2}\nabla^2 \psi + V\psi = \varepsilon\psi \tag{5}
\]

in a unit cell, subject to boundary conditions consistent with Bloch’s theorem, where $V$ is an arbitrary periodic potential (atomic units are used throughout, unless otherwise specified). Since $V$ is periodic, $\psi$ can be written in the form

\[
\psi(\mathbf{x}) = u(\mathbf{x})\, e^{i\mathbf{k}\cdot\mathbf{x}}, \tag{6}
\]

where $u$ is a complex, cell-periodic function satisfying $u(\mathbf{x}) = u(\mathbf{x} + \mathbf{R})$ for all lattice vectors $\mathbf{R}$ [12]. Assuming the form (6), the Schrödinger equation (5) becomes

\[
-\tfrac{1}{2}\nabla^2 u - i\mathbf{k}\cdot\nabla u + \tfrac{1}{2}k^2 u + V_L u + e^{-i\mathbf{k}\cdot\mathbf{x}}\, V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}} u = \varepsilon u, \tag{7}
\]

where, allowing for the possibility of nonlocality, $V_L$ and $V_{NL}$ are the local and nonlocal parts of $V$. From the periodicity condition (4), the required boundary conditions on the unit cell are then [12]

\[
u(\mathbf{x}) = u(\mathbf{x} + \mathbf{R}_l), \quad \mathbf{x} \in \Gamma_l \tag{8}
\]

and

\[
\hat{\mathbf{n}}\cdot\nabla u(\mathbf{x}) = \hat{\mathbf{n}}\cdot\nabla u(\mathbf{x} + \mathbf{R}_l), \quad \mathbf{x} \in \Gamma_l, \tag{9}
\]
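where $\Gamma_l$ and $\mathbf{R}_l$ are the surfaces of the boundary $\Gamma$ and associated lattice vectors $\mathbf{R}$ shown in Fig. 2, and $\hat{\mathbf{n}}$ is the outward unit normal at $\mathbf{x}$. The required Bloch-periodic problem can thus be reduced to the periodic problem (7)–(9). However, since the domain has been reduced to the unit cell, nonlocal operators require further consideration. In particular, if, as is typically the case for ab initio pseudopotentials, the domain of definition is all space (i.e., the full crystal), they must be transformed to the relevant finite subdomain (i.e., the unit cell) [13].

Figure 2. Parallelepiped unit cell (domain) $\Omega$, boundary $\Gamma$, surfaces $\Gamma_1$–$\Gamma_3$, and associated lattice vectors $\mathbf{R}_1$–$\mathbf{R}_3$.

For a separable potential of the usual form

\[
V_{NL}(\mathbf{x}, \mathbf{x}') = \sum_{n,a,l,m} v_{lm}^a(\mathbf{x} - \boldsymbol{\tau}_a - \mathbf{R}_n)\, h_l^a\, v_{lm}^a(\mathbf{x}' - \boldsymbol{\tau}_a - \mathbf{R}_n), \tag{10}
\]

where $n$ runs over all lattice vectors and $a$ runs over atoms in the unit cell, the nonlocal term $e^{-i\mathbf{k}\cdot\mathbf{x}}\, V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}} u$ in Eq. (7) is

\[
e^{-i\mathbf{k}\cdot\mathbf{x}} \sum_{n,a,l,m} v_{lm}^a(\mathbf{x} - \boldsymbol{\tau}_a - \mathbf{R}_n)\, h_l^a \int_{\mathbb{R}^3} d\mathbf{x}'\, v_{lm}^a(\mathbf{x}' - \boldsymbol{\tau}_a - \mathbf{R}_n)\, e^{i\mathbf{k}\cdot\mathbf{x}'} u(\mathbf{x}'),
\]

where the integral is over all space. Upon transformation to the unit cell $\Omega$, this becomes

\[
e^{-i\mathbf{k}\cdot\mathbf{x}} \sum_{a,l,m} \sum_{n} e^{i\mathbf{k}\cdot\mathbf{R}_n}\, v_{lm}^a(\mathbf{x} - \boldsymbol{\tau}_a - \mathbf{R}_n)\, h_l^a
\times \int_{\Omega} d\mathbf{x}' \sum_{n'} e^{-i\mathbf{k}\cdot\mathbf{R}_{n'}}\, v_{lm}^a(\mathbf{x}' - \boldsymbol{\tau}_a - \mathbf{R}_{n'})\, e^{i\mathbf{k}\cdot\mathbf{x}'} u(\mathbf{x}').
\]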
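For reference, the step from Eqs. (5) and (6) to Eq. (7) is a routine product-rule calculation, written out here as a worked check rather than as part of the original derivation:

```latex
\begin{align*}
\nabla\!\left(u\,e^{i\mathbf{k}\cdot\mathbf{x}}\right)
  &= e^{i\mathbf{k}\cdot\mathbf{x}}\left(\nabla u + i\mathbf{k}\,u\right), \\
\nabla^2\!\left(u\,e^{i\mathbf{k}\cdot\mathbf{x}}\right)
  &= e^{i\mathbf{k}\cdot\mathbf{x}}\left(\nabla^2 u + 2i\,\mathbf{k}\cdot\nabla u - k^2 u\right), \\
-\tfrac{1}{2}\nabla^2\psi
  &= e^{i\mathbf{k}\cdot\mathbf{x}}\left(-\tfrac{1}{2}\nabla^2 u - i\,\mathbf{k}\cdot\nabla u + \tfrac{1}{2}k^2 u\right).
\end{align*}
```

Substituting the last line into Eq. (5) with $V = V_L + V_{NL}$ and multiplying through by $e^{-i\mathbf{k}\cdot\mathbf{x}}$ gives Eq. (7) term by term; the factor $e^{-i\mathbf{k}\cdot\mathbf{x}}$ cannot be commuted through the nonlocal operator, which is why it remains explicit in the last term on the left-hand side.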
Having reduced the required problem to a periodic problem on a finite domain, solutions may be obtained using a periodic FE basis. However, if the
basis is $C^0$, as is typically the case, rather than $C^1$ or smoother, some additional consideration is required. First, the direct application of the Laplacian to such a basis is problematic. Second, being periodic in value but not in derivative (as discussed in the preceding section), the basis does not satisfy the required boundary conditions. Both issues can be resolved by reformulating the original differential formulation in weak (integral) form. Such a weak formulation can be constructed which contains no derivatives higher than first order, and which requires only value-periodicity (i.e., Eq. (8)) of the basis, thus resolving both issues. Such a weak formulation of the required problem (7)–(9) is [5]: Find scalars $\varepsilon$ and functions $u \in \mathcal{V}$ such that

\[
\frac{1}{2}\int_{\Omega} d\mathbf{x}\, \nabla v^{*}\cdot\nabla u
+ \int_{\Omega} d\mathbf{x}\, v^{*}\left[-i\mathbf{k}\cdot\nabla u + \tfrac{1}{2}k^2 u + V_L u + e^{-i\mathbf{k}\cdot\mathbf{x}}\, V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}} u\right]
= \varepsilon \int_{\Omega} d\mathbf{x}\, v^{*} u
\quad \forall v \in \mathcal{V},
\]

where $\mathcal{V} = \{v : v(\mathbf{x}) = v(\mathbf{x} + \mathbf{R}_l),\ \mathbf{x} \in \Gamma_l\}$, and the $\mathbf{x}$ dependence of $u$ and $v$ has been suppressed for compactness. Having reformulated the problem in weak form, solutions may be obtained using a $C^0$ FE basis. Letting $u = \sum_j c_j \phi_j$ and $v = \sum_j d_j \phi_j$, where $\phi_j$ are real periodic finite element basis functions and $c_j$ and $d_j$ are complex coefficients, leads to a generalized Hermitian eigenproblem determining the approximate eigenvalues $\varepsilon$ and eigenfunctions $u$ of the weak formulation and thus of the required problem [5]:

\[
\sum_j H_{ij} c_j = \varepsilon \sum_j S_{ij} c_j, \tag{11}
\]

where

\[
H_{ij} = \int_{\Omega} d\mathbf{x} \left[ \tfrac{1}{2}\nabla\phi_i\cdot\nabla\phi_j - i\mathbf{k}\cdot\phi_i\nabla\phi_j + \tfrac{1}{2}k^2\phi_i\phi_j + V_L\phi_i\phi_j + \phi_i\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}} \phi_j \right] \tag{12}
\]

and

\[
S_{ij} = \int_{\Omega} d\mathbf{x}\, \phi_i\phi_j, \tag{13}
\]
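and again the $\mathbf{x}$ dependence of $\phi_i$ and $\phi_j$ has been suppressed for compactness. For a separable potential of the form (10), the nonlocal term in (12) becomes [13]

\[
\int_{\Omega} d\mathbf{x}\, \phi_i(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}} \phi_j(\mathbf{x}) = \sum_{a,l,m} f_{lm}^{ai}\, h_l^a \left(f_{lm}^{aj}\right)^{*},
\]

where

\[
f_{lm}^{ai} = \int_{\Omega} d\mathbf{x}\, \phi_i(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}} \sum_{n} e^{i\mathbf{k}\cdot\mathbf{R}_n}\, v_{lm}^a(\mathbf{x} - \boldsymbol{\tau}_a - \mathbf{R}_n).
\]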
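To make Eqs. (11)–(13) concrete, the following sketch assembles $H$ and $S$ for a deliberately simplified case: one dimension, $V = 0$ (free electrons), a unit cell $(0, 1)$, and uniform periodic linear elements. These simplifications, and all function names, are assumptions of the example and not the cubic 3D elements used for the results in this section. On a uniform periodic mesh both matrices are circulant, so the nodal values of $u(x) = e^{iGx}$ form an exact eigenvector of the discrete problem and the eigenvalue can be read off from a Rayleigh quotient.

```python
import cmath
import math


def assemble_free_electron(n, k):
    """H and S of Eqs. (12)-(13) in 1D with V = 0, for n uniform periodic
    linear elements on the unit cell (0, 1)."""
    h = 1.0 / n
    H = [[0j] * n for _ in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        r, l = (i + 1) % n, (i - 1) % n          # periodic neighbours of node i
        # (1/2) int phi_i' phi_j' dx  +  (k^2 / 2) int phi_i phi_j dx
        H[i][i] += 1.0 / h + 0.5 * k * k * (2.0 * h / 3.0)
        H[i][r] += -0.5 / h + 0.5 * k * k * (h / 6.0)
        H[i][l] += -0.5 / h + 0.5 * k * k * (h / 6.0)
        # -i k int phi_i phi_j' dx, with the integral equal to +1/2 (j = i+1),
        # -1/2 (j = i-1), and 0 (j = i)
        H[i][r] += -1j * k * 0.5
        H[i][l] += -1j * k * (-0.5)
        S[i][i] += 2.0 * h / 3.0
        S[i][r] += h / 6.0
        S[i][l] += h / 6.0
    return H, S


def rayleigh_quotient(H, S, c):
    n = len(c)
    num = sum(c[i].conjugate() * H[i][j] * c[j] for i in range(n) for j in range(n))
    den = sum(c[i].conjugate() * S[i][j] * c[j] for i in range(n) for j in range(n))
    return (num / den).real


def fe_eigenvalue(n, k, m=1):
    """FE eigenvalue of the state u(x) = exp(i G x), G = 2 pi m, and the
    residual of the generalized eigenproblem (11) for that state."""
    G = 2.0 * math.pi * m
    H, S = assemble_free_electron(n, k)
    c = [cmath.exp(1j * G * j / n) for j in range(n)]    # nodal values of u
    eps = rayleigh_quotient(H, S, c)
    res = max(abs(sum((H[i][j] - eps * S[i][j]) * c[j] for j in range(n)))
              for i in range(n))
    return eps, res


k = 0.5
exact = 0.5 * (k + 2.0 * math.pi) ** 2       # free-electron eigenvalue (k + G)^2 / 2
errors = {}
for n in (8, 16, 32):
    eps, res = fe_eigenvalue(n, k)
    errors[n] = eps - exact
```

In this toy setting the errors stored in `errors` are positive and shrink by roughly a factor of four each time the mesh spacing is halved, i.e., $O(h^2)$ for linear elements. This mirrors, at lower order, the variational $O(h^6)$ convergence reported below for cubic-complete elements.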
As in the PW method, the above matrix elements can be evaluated to any desired accuracy, so that the basis need only be large enough to provide a sufficient representation of the required solution, though other functions such as the nonlocal potential may be more rapidly varying. As in the FD method, the above matrices are sparse and structured due to the strict locality of the basis. Figure 3 shows a series of FE results for a Si pseudopotential [14]. Since the method allows for the direct treatment of any Bravais lattice, results are shown for a two-atom fcc primitive cell. The figure shows the sequence of band structures obtained for 3 × 3 × 3, 4 × 4 × 4, and 6 × 6 × 6 uniform meshes vs. exact values at selected k points (where “exact values” were obtained from a well converged PW calculation). The variational nature of the method is clearly manifested: the error is strictly positive and the entire band structure converges rapidly and uniformly from above as the number of basis functions is increased. Further analysis [5] shows that the convergence of the eigenvalues is in fact sextic, i.e., the error is of order $h^6$, where $h$ is the mesh spacing, consistent with asymptotic convergence theorems for the cubic-complete case [8].

The Poisson solution proceeds along the same lines as the Schrödinger solution. In this case, the required problem is

\[
-\nabla^2 V_C(\mathbf{x}) = f(\mathbf{x}), \quad \mathbf{x} \in \Omega \tag{14}
\]

subject to boundary conditions

\[
V_C(\mathbf{x}) = V_C(\mathbf{x} + \mathbf{R}_l), \quad \mathbf{x} \in \Gamma_l \tag{15}
\]

and

\[
\hat{\mathbf{n}}\cdot\nabla V_C(\mathbf{x}) = \hat{\mathbf{n}}\cdot\nabla V_C(\mathbf{x} + \mathbf{R}_l), \quad \mathbf{x} \in \Gamma_l, \tag{16}
\]

where the source term $f(\mathbf{x}) = -4\pi\rho(\mathbf{x})$, $V_C(\mathbf{x})$ is the potential energy of an electron in the charge density $\rho(\mathbf{x})$, and the domain $\Omega$, bounding surfaces $\Gamma_l$, and lattice vectors $\mathbf{R}_l$ are again as in Fig. 2. Reformulation of (14)–(16) in weak form and subsequent discretization in a real periodic FE basis $\phi_j$ leads to a symmetric linear system determining the approximate solution $V_C(\mathbf{x}) = \sum_j c_j \phi_j(\mathbf{x})$ of the weak formulation and thus of the required problem [5]:

\[
\sum_j L_{ij} c_j = f_i, \tag{17}
\]
[Plot: Si band structure, energy (eV) along L–Γ–X; FE results for 3 × 3 × 3, 4 × 4 × 4, and 6 × 6 × 6 meshes compared with exact values.]

Figure 3. Exact and finite-element (FE) band structures for a series of meshes, for a Si primitive cell. The convergence is rapid and variational: the entire band structure converges from above, with an error of $O(h^6)$, where $h$ is the mesh spacing.
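where

\[
L_{ij} = \int_{\Omega} d\mathbf{x}\, \nabla\phi_i(\mathbf{x})\cdot\nabla\phi_j(\mathbf{x}) \tag{18}
\]

and

\[
f_i = \int_{\Omega} d\mathbf{x}\, \phi_i(\mathbf{x})\, f(\mathbf{x}). \tag{19}
\]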
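A minimal 1D analogue of Eqs. (17)–(19) is sketched below, assuming uniform periodic linear elements on $(0, 1)$ and a made-up source $f(x) = (2\pi)^2 \sin(2\pi x)$, for which the zero-mean solution of $-V'' = f$ is $\sin(2\pi x)$. The load vector is approximated by nodal quadrature, $f_i \approx h f(x_i)$, and the system is solved with a plain conjugate-gradient iteration; both are choices made for this example, since the text only specifies "iterative methods".

```python
import math


def apply_L(c):
    """Matrix-free product with the periodic stiffness matrix of Eq. (18) for
    uniform linear elements on (0, 1): diagonal 2/h, off-diagonals -1/h."""
    n = len(c)
    h = 1.0 / n
    return [(2.0 * c[i] - c[(i + 1) % n] - c[(i - 1) % n]) / h for i in range(n)]


def conjugate_gradient(apply_A, b, tol=1e-10, max_iter=1000):
    """Plain conjugate gradients; each iteration costs one O(n) product."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rr = sum(v * v for v in r)
    for _ in range(max_iter):
        if math.sqrt(rr) < tol:
            break
        Ap = apply_A(p)
        alpha = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = sum(v * v for v in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


n = 64
h = 1.0 / n
nodes = [j * h for j in range(n)]
source = lambda x: (2.0 * math.pi) ** 2 * math.sin(2.0 * math.pi * x)
f_vec = [h * source(x) for x in nodes]       # f_i of Eq. (19) by nodal quadrature

V = conjugate_gradient(apply_L, f_vec)
mean = sum(V) / n
V = [v - mean for v in V]                    # remove the arbitrary additive constant
max_err = max(abs(v - math.sin(2.0 * math.pi * x)) for v, x in zip(V, nodes))
```

The periodic stiffness matrix is singular (constants lie in its null space), so the source must integrate to zero over the cell, as it does for a net neutral charge density, and the solution is fixed only up to an additive constant. The matrix-vector product touches each node once, which is the $O(n)$ cost per iteration that locality of the basis provides.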
As in the FD method, the above matrices are sparse and structured due to the strict locality of the basis, requiring only O(n) storage and O(n) operations
for solution by iterative methods, whereas O(n log n) operations are required in a PW basis, where n is the number of basis functions.
3. Self-Consistency
The above Schrödinger and Poisson solutions can be employed in a fixed point iteration to obtain the self-consistent solution of the Kohn–Sham equations. In the context of a periodic solid, the process is generally as follows (see, e.g., Ref. [3]): an initial electronic charge density $\rho_e^{\mathrm{in}}$ is constructed (e.g., by overlapping atomic charge densities). An effective potential $V_{\mathrm{eff}}$ is constructed based upon $\rho_e^{\mathrm{in}}$ (see below). The eigenstates $\psi_i$ of $V_{\mathrm{eff}}$ are computed by solving the associated Schrödinger equation subject to Bloch boundary conditions. From these eigenstates, or “orbitals”, a new electronic charge density $\rho_e$ is then constructed according to

\[
\rho_e = -\sum_i f_i\, |\psi_i|^2,
\]

where the sum is over occupied orbitals with occupations $f_i$. If $\rho_e$ is sufficiently close to $\rho_e^{\mathrm{in}}$, then self-consistency has been reached; otherwise, a new $\rho_e^{\mathrm{in}}$ is constructed based on $\rho_e$ and the process is repeated until self-consistency is achieved. The resulting density minimizes the total energy and is the DFT approximation of the physical density, from which other observables may be derived.

The effective potential can be constructed as the sum of ionic (or nuclear, in an all-electron context), Hartree, and exchange-correlation parts:

\[
V_{\mathrm{eff}} = V_i^L + V_i^{NL} + V_H + V_{XC}, \tag{20}
\]

where, allowing for the possibility of nonlocality, $V_i^L$ and $V_i^{NL}$ are the local and nonlocal parts of the ionic term. For definiteness, we shall assume that the atomic cores are represented by nonlocal pseudopotentials. $V_i^{NL}$ is then determined by the choice of pseudopotential. $V_{XC}$ is a functional of the electronic density determined by the choice of exchange-correlation functional. $V_i^L$ is the Coulomb potential associated with the ions (sum of local ionic pseudopotentials). $V_H$ is the Coulomb potential associated with electrons (the Hartree potential). In the limit of an infinite crystal, $V_i^L$ and $V_H$ are divergent due to the long range $1/r$ nature of the Coulomb interaction, and so their computation requires careful consideration. A common approach is to add and subtract analytic neutralizing densities and associated potentials, solve the resulting neutralized problems, and add analytic corrections (see, e.g., Ref. [3] in a reciprocal space context, [15] in real space). Alternatively [13], it may be noted that the local parts of the ionic potentials $V_{i,a}^L$ associated with each atom
can be replaced by corresponding localized ionic charge densities $\rho_{i,a}$ since the potentials fall off as $-Z/r$ (or rapidly approach this behavior) for $r > r_c$, where $Z$ is the number of valence electrons, $r$ is the distance from the ion center, and $r_c$ is on the order of half the nearest neighbor distance. The total Coulomb potential $V_C = V_i^L + V_H$ in the unit cell may then be computed at once by solving the Poisson equation $\nabla^2 V_C = 4\pi\rho$ subject to periodic boundary conditions, where $\rho = \rho_i + \rho_e$ is the sum of electronic and ionic charge densities in the unit cell, and the ionic charge densities $\rho_{i,a}$ associated with each atom $a$ are related to their respective local ionic potentials $V_{i,a}^L$ by Poisson’s equation

\[
\rho_{i,a} = \nabla^2 V_{i,a}^L / 4\pi.
\]
Since the ionic charge densities are localized, their summation in the unit cell is readily accomplished, whereas the summation of ionic potentials is not, due to their long range 1/r tails. With VC determined, Veff can then be constructed as in Eq. (20), and the self-consistent iteration can proceed.
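The structure of the self-consistent iteration described in this section can be sketched independently of any particular solver. In the sketch below, `density_from_input` stands in for the whole chain "construct $V_{\mathrm{eff}}$, solve the Schrödinger problem, accumulate the output density"; the toy map used to exercise the loop is hypothetical and is not a Kohn–Sham solver (it also uses a positive, normalized number density rather than the negative charge density $\rho_e$ of the text). Simple linear mixing is one common way to construct the new input density and is an assumption of the example.

```python
import math


def scf(rho_in, density_from_input, mix=0.5, tol=1e-10, max_iter=200):
    """Fixed-point iteration with simple linear mixing.

    Returns (density, number of iterations, converged flag).
    """
    for iteration in range(1, max_iter + 1):
        rho_out = density_from_input(rho_in)
        residual = max(abs(o - i) for o, i in zip(rho_out, rho_in))
        if residual < tol:
            return rho_out, iteration, True
        # New input density: linear mix of the old input and the output.
        rho_in = [(1.0 - mix) * i + mix * o for i, o in zip(rho_in, rho_out)]
    return rho_in, max_iter, False


# Hypothetical toy map on five grid points: the "potential" is an external
# term plus a term proportional to the input density, and the output density
# is a normalized Boltzmann-like weight. It only mimics the density -> potential
# -> density feedback of a real calculation.
v_ext = [0.0, 0.5, 1.0, 0.5, 0.0]
g = 0.5


def toy_density(rho_in):
    w = [math.exp(-(v + g * r)) for v, r in zip(v_ext, rho_in)]
    total = sum(w)
    return [x / total for x in w]


rho0 = [0.2] * 5                              # initial guess
rho, n_iter, converged = scf(rho0, toy_density)
```

In production codes the mixing step is usually more elaborate, but the control flow (construct potential, solve, compare output with input, mix, repeat) is the same as in the procedure described above.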
4. Total Energy
Like $V_{\mathrm{eff}}$, the computation of the total energy in a crystal requires careful consideration due to the long range nature of the Coulomb interaction and resulting divergent terms. In this case, the electron–electron and ion–ion terms are divergent and positive, while the electron–ion term is divergent and negative. As in the computation of $V_{\mathrm{eff}}$, a common approach involves the addition and subtraction of analytic neutralizing densities (see, e.g., Refs. [3, 15]). Alternatively, it may be noted that the replacement of the local parts of the ionic potentials by corresponding localized charge densities, as discussed above, yields a net neutral charge density $\rho = \rho_i + \rho_e$, and all convergent terms in the total energy. For sufficiently localized $\rho_{i,a}$, a quadratically convergent expression for the total energy in terms of Kohn–Sham eigenvalues $\varepsilon_i$ is then [13]

\[
E_{\mathrm{tot}} = \sum_i f_i \varepsilon_i
+ \int_{\Omega} d\mathbf{x}\, \rho_e(\mathbf{x}) \left[ V_L^{\mathrm{in}}(\mathbf{x}) - \tfrac{1}{2} V_C(\mathbf{x}) - \varepsilon_{XC}[\rho_e(\mathbf{x})] \right]
- \tfrac{1}{2} \int_{\Omega} d\mathbf{x}\, \rho_i(\mathbf{x}) V_C(\mathbf{x})
+ \tfrac{1}{2} \sum_a \int_{\mathbb{R}^3} d\mathbf{x}\, \rho_{i,a}(\mathbf{x}) V_{i,a}^L(\mathbf{x}), \tag{21}
\]

where $V_L^{\mathrm{in}}$ is the local part of $V_{\mathrm{eff}}$ constructed from the input charge density $\rho_e^{\mathrm{in}}$, $V_C$ is the Coulomb potential associated with $\rho_e$, i.e., $\nabla^2 V_C = 4\pi(\rho_i + \rho_e)$, $\varepsilon_{XC}$
is the exchange-correlation energy density, $i$ runs over occupied states with occupations $f_i$, and $a$ runs over atoms in the unit cell.

Figure 4 shows the convergence of FE results to well converged PW results as the number of elements in each direction of the wavefunction mesh is increased in a self-consistent GaAs calculation at an arbitrary k point, using the same pseudopotentials [16] and exchange-correlation functional. As in the PW method, higher resolution is employed in the calculation of the charge density and potential (twice that employed in the calculation of the wavefunctions, in the present case). The rapid, variational convergence of the FE approximations to the exact self-consistent solution is clearly manifested: the error is strictly positive and monotonically decreasing, with an asymptotic slope of approximately $-6$ on a log–log scale, indicating an error of $O(h^6)$, where $h$ is the mesh spacing, consistent with the cubic completeness of the basis. This is in contrast to FD approaches where, lacking a variational foundation, the error can be of either sign and may oscillate.
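The asymptotic slope quoted above is obtained by comparing errors at two mesh spacings on a log–log scale. A minimal sketch follows; the error values are synthetic, generated from $C h^6$ purely to exercise the function, and are not the GaAs results of Fig. 4.

```python
import math


def observed_order(h_coarse, err_coarse, h_fine, err_fine):
    """Slope of log(error) versus log(h) between two mesh spacings."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


# Synthetic data (an assumption of this example): err = C * h**6.
C = 2.0
spacings = [1.0 / n for n in (8, 16, 32)]     # h is inversely proportional to the element count
errors = [C * h ** 6 for h in spacings]
orders = [observed_order(spacings[i], errors[i], spacings[i + 1], errors[i + 1])
          for i in range(len(spacings) - 1)]
```

Because $h$ is inversely proportional to the number of elements in each direction, an observed order of 6 in $h$ corresponds to a slope of $-6$ when the error is plotted against the number of elements.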
[Plot: GaAs self-consistent total energy and eigenvalues; $E_{FE} - E_{EXACT}$ (Ha) on a logarithmic scale vs. number of elements in each direction (8 to 32); curves for $E_{tot}$, $E_1$, $E_2$, and $E_3$.]

Figure 4. Convergence of self-consistent FE total energy and eigenvalues with respect to number of elements, for a GaAs primitive cell. As for a fixed potential, the convergence is rapid and variational: the error is strictly positive and monotonically decreasing, with an error of $O(h^6)$, where $h$ is the mesh spacing.

5. Outlook

Because FE bases are simultaneously polynomial and strictly local in nature, FE methods retain significant advantages of FD methods without sacrificing the use of a basis, and in this sense, combine advantages of both PW and FD based approaches for ab initio electronic structure calculations. In particular, while variational and systematically improvable, the method produces sparse matrices and requires no computation- or communication-intensive transforms, and so is well suited to large, accurate calculations on massively parallel architectures. However, FE methods produce generalized rather than standard eigenproblems, require more memory than FD based approaches, and are more difficult to implement. Because of the relative merits of each approach, and because FE based approaches are yet at a relatively early stage of development, it is not clear which approach will prove superior in the large-scale ab initio electronic structure context in the years to come [4]. Early non-self-consistent applications to ab initio positron distribution and lifetime calculations involving over 4000 atoms [5] are promising indications, however, and the development and optimization of FE based approaches for a range of large-scale applications remains a very active area of research.
Acknowledgment This work was performed under the auspices of the U.S. Department of Energy by University of California, Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.
References
[1] R.O. Jones and O. Gunnarsson, “The density functional formalism, its applications and prospects,” Rev. Mod. Phys., 61, 689–746, 1989.
[2] O.C. Zienkiewicz and R.L. Taylor, The Finite Element Method, McGraw-Hill, New York, 4th edn., 1988.
[3] W.E. Pickett, “Pseudopotential methods in condensed matter applications,” Comput. Phys. Rep., 9, 115–198, 1989.
[4] T.L. Beck, “Real-space mesh techniques in density-functional theory,” Rev. Mod. Phys., 72, 1041–1080, 2000.
[5] J.E. Pask, B.M. Klein, P.A. Sterne, and C.Y. Fong, “Finite-element methods in electronic-structure theory,” Comput. Phys. Commun., 135, 1–34, 2001.
[6] S.R. White, J.W. Wilkins, and M.P. Teter, “Finite-element method for electronic structure,” Phys. Rev. B, 39, 5819–5833, 1989.
[7] E. Tsuchida and M. Tsukada, “Large-scale electronic-structure calculations based on the adaptive finite-element method,” J. Phys. Soc. Japan, 67, 3844–3858, 1998.
[8] G. Strang and G.J. Fix, An Analysis of the Finite Element Method, Prentice-Hall, Englewood Cliffs, NJ, 1973.
[9] L.R. Ram-Mohan, Finite Element and Boundary Element Applications in Quantum Mechanics, Oxford University Press, New York, 2002.
[10] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, “Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients,” Rev. Mod. Phys., 64, 1045–1097, 1992.
[11] T.A. Arias, “Multiresolution analysis of electronic structure: semicardinal and wavelet bases,” Rev. Mod. Phys., 71, 267–311, 1999.
[12] N.W. Ashcroft and N.D. Mermin, Solid State Physics, Holt, Rinehart and Winston, New York, 1976.
[13] J.E. Pask and P.A. Sterne, “Finite-element methods in ab initio electronic-structure calculations,” Modell. Simul. Mater. Sci. Eng., to appear, 2004.
[14] M.L. Cohen and T.K. Bergstresser, “Band structures and pseudopotential form factors for fourteen semiconductors of the diamond and zinc-blende structures,” Phys. Rev., 141, 789–796, 1966.
[15] J.L. Fattebert and M.B. Nardelli, “Finite difference methods in ab initio electronic structure and quantum transport calculations of nanostructures,” In: P.G. Ciarlet, (ed.), Handbook of Numerical Analysis, vol. X: Computational Chemistry, Elsevier, Amsterdam, 2003.
[16] C. Hartwigsen, S. Goedecker, and J. Hutter, “Relativistic separable dual-space Gaussian pseudopotentials from H to Rn,” Phys. Rev. B, 58, 3641–3662, 1998.
1.20 AB INITIO STUDY OF MECHANICAL DEFORMATION Shigenobu Ogata Osaka University, Osaka, Japan
S. Yip (ed.), Handbook of Materials Modeling, 439–448. © 2005 Springer. Printed in the Netherlands.

The mechanical properties of materials under finite deformation are of great interest to materials scientists, physicists, and mechanical and materials engineers. Many insightful experimental tests of the mechanical properties of such deformed materials have afforded an increased understanding of their behavior. Recently, since nanotechnologies have started to occupy the scientific spotlight, we must accept the challenge of studying these properties in small nano-scaled specimens and in perfect crystals under ideal conditions. While state-of-the-art experimental techniques have the capacity to make measurements in extreme situations, they are still expensive and require specialized knowledge. However, the considerable improvement in calculation methods and the striking development of computational capacity bring such problems within the range of atomic-scale numerical simulations. In particular, within the past decade, ab initio simulations, which can often give qualitatively reliable results without any experimental data as input, have become readily available.

In this section, we discuss methods for studying the mechanical properties of materials using ab initio simulations. At present, many ab initio methods have the potential to perform such mechanical tests. Here, however, we employ planewave methods based on density functional theory (DFT) and pseudopotential approximations because they are widely used in solid state physics. Details of the theory and of more sophisticated, state-of-the-art techniques can be found in other sections of this volume and in a review article [1]. Concrete examples of parameter settings appearing in this section presuppose that the reader is using the VASP (Vienna Ab initio Simulation Package) code [2, 3] and the ultrasoft pseudopotential. Other codes based on the same theory, such as ABINIT, CASTEP, and so on, should basically accept the same parameter settings as VASP.
1. Applying Deformation to Supercell
In the planewave methods, we usually use a parallelepiped-shaped supercell that has periodic boundary conditions in all directions and includes one or more atoms. The supercell is defined by three linearly independent basis vectors, $\mathbf{h}_1 = (h_{11}, h_{12}, h_{13})$, $\mathbf{h}_2 = (h_{21}, h_{22}, h_{23})$, $\mathbf{h}_3 = (h_{31}, h_{32}, h_{33})$. In investigating phenomena connected with a local atomic displacement, for example, a slip of adjacent atomic planes in a crystal, the atomic positions in the supercell can be moved directly within the system of fixed basis vectors. However, when we need a uniform deformation of the system under consideration, as, for example, in simulating a phase transition or crystal twinning, and in calculations of the elastic constants and ideal strength of a perfect crystal, we can accomplish this by changing the basis vectors directly. Let a deformation gradient tensor $\mathbf{F}$ represent the uniform deformation of the system. $\mathbf{F}$ is defined as

\[ F_{ij} = \frac{dx_i}{dX_j}, \]

where $\mathbf{x}$ and $\mathbf{X}$ are, respectively, the positions of a material particle in the deformed and in the reference state. Using $\mathbf{F}$, each basis vector $\mathbf{h}$ is mapped to a new basis vector $\mathbf{h}'$ via

\[ h'_k = F_{kj} h_j, \]

where $h_j$ denotes the $j$-th Cartesian component of the basis vector and summation over the repeated index $j$ is implied. For example, for a simple shear deformation, $\mathbf{F}$ can be written as

\[ \mathbf{F} = \begin{pmatrix} 1 & 0 & \gamma \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]

where $\gamma$ represents the magnitude of the shear corresponding to the engineering shear strain. In some cases, for ease of understanding, different coordinate systems are used for $\mathbf{F}$ and for the basis vectors. In this case, $\mathbf{F}$ is transformed into the coordinate system of the basis vectors by an orthogonal tensor $\mathbf{Q}$ ($\mathbf{Q}\mathbf{Q}^{T} = \mathbf{I}$):

\[ \mathbf{F}' = \mathbf{Q}\mathbf{F}\mathbf{Q}^{T}, \qquad h'_k = F'_{kj} h_j. \]
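The mapping of the basis vectors by a deformation gradient can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the basis vectors are stored as rows of a 3 × 3 array, the shear frame is chosen with axis 1 along [11-2] and axis 3 along the (111) plane normal, and Q is built with the shear-frame axes as its columns (one possible convention for the orthogonal tensor). The lattice constant is the calculated Al value quoted later in this section (4.04 Å).

```python
import numpy as np

def simple_shear(gamma):
    """Deformation gradient for a simple shear of engineering strain gamma:
    displacement along axis 1 proportional to the coordinate along axis 3."""
    F = np.eye(3)
    F[0, 2] = gamma
    return F

def rotate_to_cell_frame(F, Q):
    """Express F in the coordinate system of the basis vectors: F' = Q F Q^T."""
    assert np.allclose(Q @ Q.T, np.eye(3)), "Q must be orthogonal"
    return Q @ F @ Q.T

def deform_cell(h, F):
    """Map every basis vector (a row of h) to h'_k = F h_k."""
    return h @ F.T

# fcc primitive cell written in cubic axes (assumed storage: one basis vector per row)
a = 4.04  # Angstrom, calculated Al lattice constant from this section
h = 0.5 * a * np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)

# Assumed shear frame: e1 = shear direction, e3 = slip-plane normal, e2 = e3 x e1
e1 = np.array([1.0, 1.0, -2.0]) / np.sqrt(6.0)
e3 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
e2 = np.cross(e3, e1)
Q = np.column_stack([e1, e2, e3])  # columns: shear-frame axes in cubic coordinates

F_cubic = rotate_to_cell_frame(simple_shear(0.2), Q)
h_new = deform_cell(h, F_cubic)
print(h_new)
```

With this convention the rotated tensor equals the identity plus γ times the outer product of the shear direction and the plane normal, and the cell volume is unchanged because det F = 1.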
2. Simulation Setting
In DFT calculations, the pseudopotential (if the code is not full-potential code) and the exchange correlation potentials should be carefully selected.
Since these issues are not particular to deformation analysis, the reader who needs a more detailed discussion can find it elsewhere. Only a short commentary is given here. When we use the pseudopotential in a separable form [4], we need to pay attention to possible ghost bands [5]; almost all DFT codes use the separable form to save computational time and memory. The pseudopotentials supplied with package codes have usually been constructed very carefully to avoid a ghost band in the equilibrium state. However, even when a pseudopotential does not generate a ghost band in the equilibrium state, such a band may still appear in a deformed state. Therefore, it is strongly recommended that a pseudopotential result be confirmed by comparing it with the result of a full-potential calculation where possible. For the exchange-correlation potential, we can normally use functionals derived from the local density approximation (LDA), the generalized gradient approximation (GGA), and LDA+U. In many cases, the former two are comparably accurate. The LDA tends to underestimate lattice constants and overestimate elastic constants and strength, whereas the GGA tends to overestimate lattice constants and underestimate elastic constants and strength. The LDA+U sometimes offers significantly improved accuracy [6]. The above discussion of the pseudopotential and the exchange-correlation potential pertains to errors resulting from theoretical approximations. We should also take care of numerical errors. Numerical errors in planewave DFT calculations usually derive from the finite size of the k-point set and the finite number of planewaves, the latter being uniquely determined by the supercell shape and the planewave cut-off energy. In particular, a good estimate of the stress tensor to MPa accuracy requires a finer k-point sampling than an energy estimate with meV accuracy does.
Figure 1 shows the convergence of the energy and stress as the number of k-points is increased. The model supercell is a primitive cell with an fcc structure which contains just one aluminum atom. An engineering shear strain of 0.2 in the $\{111\}\langle 11\bar{2}\rangle$ direction has already been applied to the primitive cell. Only the shear stress component corresponding to the shearing direction is shown. Clearly, the stress converges very slowly even though the energy converges relatively quickly.

Figure 1. Total energy and stress vs. number of k-points curves for an aluminum primitive cell under 20% shear in the $\{111\}\langle 11\bar{2}\rangle$ direction. (a) Total energy (eV) vs. number of k-points. (b) Shear stress (GPa) vs. number of k-points.

Figure 2 shows the stress–strain curves of the Al primitive cell under a $\{111\}\langle 11\bar{2}\rangle$ shear deformation using two sets of k-points, the normal 15 × 15 × 15 and a fine 43 × 43 × 43 Monkhorst–Pack Brillouin zone sampling [7]. This sampling scheme is explained later. The curve for 15 × 15 × 15 is significantly wavy even though the total free energy of the primitive cell agrees to the order of meV with the energy of the 43 × 43 × 43 case. Apparently, a small set of k-points does not produce a smooth stress–strain curve. This is not a small problem for the study of mechanical properties of materials, because, in the above case, the ideal strength, that is, the maximum stress of the stress–strain curve, is overestimated by 20%, an error that usually corresponds to 2–20 GPa. Although there are many k-point sampling schemes, in recent practice, the Monkhorst–Pack sampling scheme is typically used for testing mechanical properties. Since more efficient schemes [8], in which a smaller number
of k-points can be used without loss of accuracy, are constructed based on crystal symmetries, a deformation that breaks the crystal symmetries removes their advantage. Therefore, the Monkhorst–Pack scheme is often favored because of its simplicity. In it, the sampling points are defined in the following manner:

\[ \mathbf{k}(n, m, l) = n\mathbf{b}_1 + m\mathbf{b}_2 + l\mathbf{b}_3, \qquad n, m, l = \frac{2r - q - 1}{2q}; \quad r = 1, 2, 3, \ldots, q, \]

where $\mathbf{b}_i$ are the reciprocal lattice vectors of the supercell and $q$ is the mesh size along the corresponding reciprocal lattice vector direction ($q_1$, $q_2$, and $q_3$ for $n$, $m$, and $l$, respectively). Therefore, the total number of sampled k-points is $q_1 \times q_2 \times q_3$. If, under the symmetries of the supercell, some of the k-points are equivalent, we consider only the nonequivalent k-points to save computational time.

Figure 2. Shear stress (GPa) vs. engineering shear strain curves calculated with different k-point sets (15 × 15 × 15 and 43 × 43 × 43). A shear deformation in the $\{111\}\langle 11\bar{2}\rangle$ direction is applied.

The planewave cut-off energy should also be carefully determined. We should use a large enough planewave cut-off energy to achieve convergence of the energy and stress to the required degree of accuracy. Since the atomic configuration affects the cut-off energy required, it is better to estimate that energy for the particular atomic configuration under consideration. However, in mechanical deformation analysis, it is difficult to fix the cut-off energy before starting the simulation because the deformation path cannot be predicted at the outset. In such a case, we have to add a safety margin of 10–20% to the cut-off energy estimated from a known atomic configuration, for example, that of an equivalent structure. In principle, a complete basis set is necessary to express an arbitrary function by a linear combination of the basis functions. As discussed above, the planewave basis set is used to express the wave functions of electrons in ordinary DFT calculations using the pseudopotential. Because an FFT algorithm can easily be used to calculate the Hamiltonian, we can save computational time.
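The Monkhorst–Pack mesh defined earlier in this section is simple enough to generate directly. The following is a minimal sketch, assuming the basis vectors are stored as rows of a 3 × 3 array and that the reciprocal vectors carry the 2π factor; the function names are hypothetical, and no symmetry reduction to nonequivalent points is attempted.

```python
import numpy as np
from itertools import product

def monkhorst_pack(q1, q2, q3):
    """Fractional coordinates (n, m, l) of a q1 x q2 x q3 Monkhorst-Pack mesh,
    using (2r - q - 1) / (2q) with r = 1..q along each direction."""
    def axis(q):
        r = np.arange(1, q + 1)
        return (2 * r - q - 1) / (2 * q)
    return np.array(list(product(axis(q1), axis(q2), axis(q3))))

def reciprocal_vectors(h):
    """Rows b_i satisfying b_i . h_j = 2*pi*delta_ij for basis vectors in rows of h."""
    return 2.0 * np.pi * np.linalg.inv(h).T

# fcc primitive cell of Al in cubic axes (a = 4.04 Angstrom, value from this section)
a = 4.04
h = 0.5 * a * np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)

frac = monkhorst_pack(15, 15, 15)          # fractional coordinates n, m, l
k_cart = frac @ reciprocal_vectors(h)      # k = n b1 + m b2 + l b3
print(len(frac))                           # total number of mesh points before symmetry reduction
```

A 43 × 43 × 43 mesh built the same way contains 79,507 points before any symmetry reduction, which is why fine meshes for sheared cells are costly.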
To achieve completeness, an infinite number of planewaves is necessary; however, to perform a practical numerical calculation, we must somehow reduce the infinite number to a finite one. Fortunately, we can ignore planewaves which have a higher energy than a cut-off value, termed the planewave cut-off energy, because the wave functions of electrons in a real system do not have components of extremely high frequency. To estimate the cut-off energy, we can perform a series of calculations with an increasing cut-off energy for a single system. By this means, we can find a cut-off energy which is large enough to ensure that the total energy and the stress of the supercell of interest converge to within the required accuracy. Usually, the incompleteness of a finite planewave basis set produces an unphysical stress, known as the Pulay stress. However, by using a large enough number of planewaves, we can avoid this problem. Therefore, both the stress convergence check and the energy convergence check are important in
deformation. Figure 3 shows the stress–strain curves obtained by the use of different planewave cut-off energies. The model and simulation procedure are the same as those used in the above k-point check. Clearly, even though the error due to a small cut-off energy is small in a near-equilibrium structure, it becomes larger in a highly strained structure.

Figure 3. Shear stress (GPa) vs. engineering shear strain curves calculated with different cut-off energies ($E_{\mathrm{cut}}$ = 90 eV and 129 eV). A shear deformation in the $\{111\}\langle 11\bar{2}\rangle$ direction is applied.
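The cut-off scan described in this section (a series of calculations with increasing cut-off energy, checked for both energy and stress) can be post-processed with a small helper. This is a sketch under assumptions: the function names and tolerances are hypothetical, the highest cut-off in the scan is taken as the reference, and the numbers in `scan` are placeholders made up for illustration, not results of any calculation.

```python
def converged_cutoff(results, e_tol=1e-3, s_tol=0.01):
    """results: iterable of (cutoff_eV, energy_eV, stress_GPa) from a scan.
    Returns the smallest cut-off from which both energy and stress stay within
    the tolerances of the highest-cut-off reference, or None if none qualifies."""
    results = sorted(results)
    _, e_ref, s_ref = results[-1]
    chosen = None
    # Walk downwards from the reference and stop at the first unconverged point
    for cutoff, energy, stress in reversed(results[:-1]):
        if abs(energy - e_ref) <= e_tol and abs(stress - s_ref) <= s_tol:
            chosen = cutoff
        else:
            break
    return chosen

def with_safety_margin(cutoff, margin=0.2):
    """Add the 10-20% safety margin suggested in the text (20% by default)."""
    return cutoff * (1.0 + margin)

# Placeholder scan data (illustrative only): (cut-off eV, energy eV, stress GPa)
scan = [
    (100, -3.6500, 2.100),
    (130, -3.6900, 2.600),
    (160, -3.6970, 2.790),
    (190, -3.6982, 2.795),
    (220, -3.6985, 2.800),
]
cutoff = converged_cutoff(scan)
print(cutoff, with_safety_margin(cutoff))
```

Checking the stress as well as the energy matters here because, as Figure 3 indicates, an under-converged cut-off shows up most strongly in highly strained configurations.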
3. Mechanical Deformation of Al and Cu
Many ab initio studies of mechanical deformation, such as tensile and shear deformation studies for metals and ceramics, have been done in the past two decades. An excellent summary of the history of ab initio mechanical testing can be found in a review paper by Šob et al. [9].

Here, we discuss as examples both a fully relaxed and an unrelaxed uniform shear deformation analysis [10], that is, an analysis of a pure shear and a simple shear, for aluminum and copper. The shear mode is the most important deformation mode in our consideration of the strength of a perfect crystalline solid. The shear deformation analysis usually involves more computational cost than the tensile analysis: because the shear deformation breaks many of the crystal symmetries, many nonequivalent k-points must be treated in the calculation.
The following analysis has been performed using the VASP code. The exchange-correlation functional adopted is the Perdew–Wang generalized gradient approximation (GGA) [11]; ultrasoft pseudopotentials [12] are used. Brillouin zone k-point sampling is performed using the Monkhorst–Pack algorithm, and the integration follows the Methfessel–Paxton scheme [13] with the smearing width chosen so that the entropic free energy (the “−TS” term) is less than 0.5 meV/atom. A six-atom fcc supercell which has three {111} layers is used, and 18 × 25 × 11 k-points for Al and 12 × 17 × 7 k-points for Cu are adopted. The k-point convergence is checked as shown in Table 1. The carefully determined planewave cut-off energies for the Al and Cu supercells are 162 and 292 eV, respectively. Incremental affine shear strains of 1% as described above are imposed on each crystal along the experimentally determined common slip systems to obtain the corresponding energies and stresses. In each step, the stress components, excluding the resolved shear stress along the slip system, are kept to a value less than 0.1 GPa during the simulation. In Table 2, the equilibrium lattice constants $a_0$ obtained from the energy minimization are listed and compared with the experimental data. The calculated relaxed and unrelaxed shear moduli $G_r$, $G_u$ for the common slip systems are compared with analytical values computed from the experimental elastic constants. A value of γ = 0.5% is used to interpolate the resolved shear stress (σ) versus engineering shear strain (γ) curves and to calculate the resolved shear moduli. In the relaxed analysis, the stress components are relaxed to within a convergence tolerance of 0.05 GPa.

Table 1. Calculated ideal pure shear ($\sigma_r$) and simple shear ($\sigma_u$) strengths using different k-point sets

                     Al                           Cu
No. of k-points      σ_u (GPa)    σ_r (GPa)       σ_u (GPa)    σ_r (GPa)
12 × 17 × 7          3.67         2.76            3.42         2.16
18 × 25 × 11         3.73         2.84            3.44         2.15
21 × 28 × 12         –            –               3.45         2.15
27 × 38 × 16         3.71         2.84            –            –

Table 2. Equilibrium lattice constant ($a_0$), relaxed ($G_r$) and unrelaxed ($G_u$) $\{111\}\langle 11\bar{2}\rangle$ shear moduli of Al and Cu

                 a_0 (Å)    G_r (GPa)    G_u (GPa)
Al (calc.)       4.04       25.4         25.4
Al (expt.)       4.03       27.4         27.6
Cu (calc.)       3.64       31.0         40.9
Cu (expt.)       3.62       33.3         44.4
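For readers who want to see how the settings quoted above might translate into VASP input, the sketch below writes minimal INCAR and KPOINTS text from Python. It is hedged in several ways: the helper names are hypothetical, only a handful of tags are shown (ENCUT, GGA, ISMEAR, SIGMA, as understood from the VASP documentation), the Methfessel–Paxton order of 1 is an assumption because the text does not state it, and the smearing width of 0.1 eV is a placeholder, since the text only requires that the −TS term stay below 0.5 meV/atom. The tags should be checked against the manual of the VASP version in use.

```python
def incar_text(encut_eV, sigma_eV):
    """Minimal INCAR fragment for the settings described in the text."""
    tags = {
        "ENCUT": encut_eV,   # planewave cut-off energy in eV
        "GGA": "91",         # Perdew-Wang 91 GGA
        "ISMEAR": 1,         # Methfessel-Paxton smearing (order 1 is an assumption)
        "SIGMA": sigma_eV,   # smearing width in eV (placeholder, tune via the -TS term)
    }
    return "\n".join(f"{key} = {value}" for key, value in tags.items()) + "\n"

def kpoints_text(mesh):
    """KPOINTS file requesting an automatically generated Monkhorst-Pack mesh."""
    n1, n2, n3 = mesh
    return f"Monkhorst-Pack mesh\n0\nMonkhorst-Pack\n{n1} {n2} {n3}\n0 0 0\n"

# Al supercell settings from the text: 162 eV cut-off, 18 x 25 x 11 k-points
print(incar_text(162, 0.1))
print(kpoints_text((18, 25, 11)))
```

For the Cu supercell the same helpers would be called with 292 eV and a 12 × 17 × 7 mesh.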
Figure 4. Shear stress (GPa) vs. displacement ($x/b_p$) curves for Al and Cu under fully relaxed shear deformation in the $\{111\}\langle 11\bar{2}\rangle$ direction.
At equilibrium, Cu is considerably stiffer, with simple and pure shear moduli greater by 65 and 25%, respectively, than those of Al. However, Al ends up with a 32% larger ideal pure shear strength $\sigma_m^r$ than Cu, because it has a longer range of strain before softening (see Fig. 4): $\gamma_m = 0.200$ in Al and $\gamma_m = 0.137$ in Cu. Figure 5 shows the changes of the iso-surfaces of the valence charge density during the shear deformation ($h \equiv V_{\mathrm{cell}}\rho_v$, where $V_{\mathrm{cell}}$ and $\rho_v$ are the supercell volume and valence charge density, respectively). At the octahedral interstice in Al, the pocket of charge density has cubic symmetry and is angular in shape, with a volume comparable to the pocket centered on every ion. In contrast, in Cu, there is no such interstitial charge pocket, the charge density being nearly spherical about each ion. Al has an inhomogeneous charge distribution in the interstitial region and bond directionality, while Cu has a relatively homogeneous charge distribution and little bond directionality. The charge density analysis gives a clear view of the electron activity under shear deformation, and sometimes informs us about the origin of the mechanical behavior of solids.
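The quantities discussed in this section, the ideal strength (the maximum of the stress–strain curve), the strain at which it occurs, and the small-strain shear modulus, are straightforward to extract once a stress–strain curve is available. The sketch below uses a synthetic sinusoidal curve as a stand-in for computed data: the sinusoidal shape is an assumption made only for illustration, with its peak placed at the Al values quoted in the text ($\gamma_m$ = 0.200, 2.84 GPa), and the function names are hypothetical. The modulus it yields therefore belongs to the toy curve, not to Table 2.

```python
import numpy as np

def ideal_strength(strain, stress):
    """Return (strain, stress) at the maximum of a sampled stress-strain curve."""
    i = int(np.argmax(stress))
    return strain[i], stress[i]

def initial_modulus(strain, stress):
    """Finite-difference slope of the curve over its first strain increment."""
    return (stress[1] - stress[0]) / (strain[1] - strain[0])

# Synthetic stand-in for a computed curve, sampled in 1% strain increments
strain = np.linspace(0.0, 0.30, 31)
stress = 2.84 * np.sin(np.pi * strain / (2.0 * 0.200))   # GPa, toy model only

gamma_m, sigma_m = ideal_strength(strain, stress)
G0 = initial_modulus(strain, stress)
print(gamma_m, sigma_m, G0)
```

With real data, a finer strain increment near the origin (or an interpolation of the curve) would give a better modulus estimate than a single 1% step.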
Figure 5. Charge density isosurface change in (a) Al; (b) Cu during the shear deformation in the $\{111\}\langle 11\bar{2}\rangle$ direction. Snapshots are labelled by the displacement x (x = 0.000, x = x1, x = x2); the axes are the ⟨110⟩, ⟨112⟩, and ⟨111⟩ directions.

4. Outlook

Currently, we can perform ab initio mechanical deformation analyses for many types of materials and for primitive and nano systems. However, in the near future, the most interesting studies incorporating these analyses might address not only the mechanical behavior of materials under deformation and loading, but also the relation between mechanical deformation and loading and physical and chemical reactions, such as stress corrosion. For this purpose, ab initio methods are the most powerful and reliable tools.
References
[1] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, “Iterative minimization techniques for ab initio total-energy calculations – molecular dynamics and conjugate gradients,” Rev. Mod. Phys., 64, 1045–1097, 1992.
[2] G. Kresse and J. Hafner, “Ab initio molecular dynamics for liquid metals,” Phys. Rev. B, 47, RC558–RC561, 1993.
[3] G. Kresse and J. Furthmüller, “Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set,” Phys. Rev. B, 54, 11169–11186, 1996.
[4] L. Kleinman and D.M. Bylander, “Efficacious form for model pseudopotentials,” Phys. Rev. Lett., 48, 1425–1428, 1982.
[5] X. Gonze, P. Käckell, and M. Scheffler, “Ghost states for separable, norm-conserving, ab initio pseudopotential,” Phys. Rev. B, 41, 12264–12267, 1990.
[6] S.L. Dudarev, G.A. Botton, S.Y. Savrasov, C.J. Humphreys, and A.P. Sutton, “Electron-energy-loss spectra and the structural stability of nickel oxide: An LSDA+U study,” Phys. Rev. B, 57, 1505–1509, 1998.
[7] H.J. Monkhorst and J.D. Pack, “Special points for Brillouin zone integrations,” Phys. Rev. B, 13, 5188–5192, 1976.
[8] D.J. Chadi, “Special points in the Brillouin zone integrations,” Phys. Rev. B, 16, 1746–1747, 1977.
[9] M. Šob, M. Friák, D. Legut, J. Fiala, and V. Vitek, “The role of ab initio electronic structure calculations,” Mat. Sci. Eng. A, to be published, 2004.
[10] S. Ogata, J. Li, and S. Yip, “Ideal pure shear strength of aluminum and copper,” Science, 298, 807–811, 2002.
[11] J.P. Perdew and Y. Wang, “Atoms, molecules, solids, and surfaces: application of the generalized gradient approximation for exchange and correlation,” Phys. Rev. B, 46, 6671–6687, 1992.
[12] D. Vanderbilt, “Soft self-consistent pseudopotentials in a generalized eigenvalue formalism,” Phys. Rev. B, 41, 7892–7895, 1990.
[13] M. Methfessel and A.T. Paxton, “High-precision sampling for Brillouin zone in metals,” Phys. Rev. B, 40, 3616–3621, 1989.
2.1 INTRODUCTION: ATOMISTIC NATURE OF MATERIALS
Efthimios Kaxiras¹ and Sidney Yip²
¹ Department of Nuclear Science and Engineering and Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
² Department of Physics, Harvard University, Cambridge, MA 02138, USA
S. Yip (ed.), Handbook of Materials Modeling, 451–458. © 2005 Springer. Printed in the Netherlands.

Materials are made of atoms. The atomic hypothesis was put forward by the Greek philosopher Demokritos about 25 centuries ago, but was only proven by quantitative arguments in the 19th and 20th centuries, beginning with the work of John Dalton (1766–1844) and continuing through the development of quantum mechanics, the theory that provided a complete and accurate description of the properties of atoms. The very large number of atoms encountered in a typical material (of order $\sim 10^{24}$ or more) precludes any meaningful description of its properties based on a complete account of the behavior of each and every atom that comprises it. Special cases, such as perfect crystals, are exceptions where symmetry reduces the number of independent atoms to very few; in such cases, the properties of the solid are indeed describable in terms of the behavior of the few independent atoms and this can be accomplished using quantum mechanical methods. However, this is only an idealized model of actual solids in which perfect order is broken either by thermal disorder or by the presence of defects that play a crucial role in determining the physical properties of the system. Dislocations are an example of a crystal defect: they determine the mechanical behavior of solids (their tendency for brittle or ductile response to external loading); these defects have a core which can only be described properly by its atomic scale structure, but they also have long range strain and stress fields which are adequately described by continuum elasticity theory (see Chapters 3 and 7). This situation typifies the dilemma of describing the behavior of real materials: the majority of atoms, far from the defect regions, behave in a manner consistent with a macroscopic, continuum description, where the atomic hypothesis is not important, while a small minority of atoms, in the immediate neighborhood of the defects, do not follow this rule and need to be
described individually. Neither aspect, atomistic or macroscopic, can provide by itself a satisfactory description of the defect and its role in determining the material’s behavior. The example of dislocations is representative: any type of crystal defect (vacancies, interstitials, impurities, grain boundaries, surfaces, interfaces, etc.) requires, at some level, atomic scale representation in order to fully understand its effect on the properties of the material. Similarly, disorder induced by thermal motion and other external agents (pressure, irradiation) can lead to changes in the structure of a solid, possibly driving it to new phases, which also requires a detailed atomistic description (see Chapters 2.29 and 6.11). Finally, the case of fluids or solids like polymers, in which there is no order at the atomic scale, is another example where an atomistic scale description is necessary to provide invaluable information for a comprehensive picture of the system’s behavior (see Chapters 8.1 and 9.1). These considerations provide the motivation for the description of materials properties based on atomistic simulations, by judiciously choosing the aspects that need to be explicitly modeled at the atomic scale. The term “atomistic simulations” has acquired a particular meaning: it refers to computational studies of materials properties based on explicit treatment of the atomic degrees of freedom within classical mechanics, either deterministically, that is, in accordance with the laws of classical dynamics (the so-called Molecular Dynamics or MD approach, see Chapter 2.8), or stochastically, that is, by appropriately sampling distributions from a chosen ensemble (the so-called Monte Carlo or MC approach, see Chapter 2.10). The energy functional underlying the calculation of forces for the dynamics of atoms, or the ensemble distribution, can be based either on a classical description or a quantum mechanical one.
We will discuss briefly the issues that arise from the various approaches and then elaborate on what these approaches can provide in terms of a detailed understanding of the behavior of materials.
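The distinction drawn above between the deterministic (MD) and stochastic (MC) routes can be made concrete with a toy example. The sketch below is an illustration only, under assumptions not taken from the text: a single particle in a one-dimensional harmonic well with unit mass and unit spring constant, a velocity Verlet integrator standing in for MD, and Metropolis sampling of the canonical ensemble standing in for MC. All names and parameter values are hypothetical.

```python
import math
import random

def potential(x, k=1.0):
    """Harmonic potential energy U(x) = k x^2 / 2."""
    return 0.5 * k * x * x

def force(x, k=1.0):
    """Force F = -dU/dx."""
    return -k * x

def velocity_verlet(x, v, dt, n_steps, mass=1.0):
    """Deterministic route: integrate Newton's equation of motion."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass
        x += dt * v
        f = force(x)
        v += 0.5 * dt * f / mass
    return x, v

def metropolis(x, kT, n_steps, max_step, rng):
    """Stochastic route: sample positions from the Boltzmann distribution."""
    samples = []
    for _ in range(n_steps):
        trial = x + rng.uniform(-max_step, max_step)
        dU = potential(trial) - potential(x)
        if dU <= 0.0 or rng.random() < math.exp(-dU / kT):
            x = trial                      # accept the trial move
        samples.append(x)                  # a rejected move repeats the old state
    return samples

# MD: total energy should stay close to its initial value of 0.5
x_end, v_end = velocity_verlet(1.0, 0.0, dt=0.01, n_steps=1000)
energy_end = 0.5 * v_end * v_end + potential(x_end)

# MC: the mean of x^2 should approach kT / k = 1 for kT = 1
samples = metropolis(0.0, kT=1.0, n_steps=50000, max_step=1.5, rng=random.Random(0))
mean_x2 = sum(s * s for s in samples) / len(samples)
print(energy_end, mean_x2)
```

In both routes the only physical input is the energy function; replacing `potential` and `force` by an interatomic potential or a quantum mechanical energy evaluation is what distinguishes the approaches discussed next.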
1. The Input to Atomistic Simulation
The energy of a system as a function of atomic positions should ideally be treated within quantum mechanics, with the valence electrons providing the interactions between atoms that hold the solid together. The development of Density Functional Theory [1, 2] and of pseudopotential theory (for a comprehensive review see, e.g., Ref. [3]) has produced a computational methodology which is accurate and efficient, and has the required chemical versatility to describe a very wide range of materials properties, fully within the quantum mechanical framework [4]. However, this is an approach which puts exceptionally large demands on computational resources for systems larger than a few tens of atoms, a situation that arises frequently in the descriptions of
realistic systems (the dislocation core is a case in point), and this limitation applies to a single atomic configuration. The description of systems comprising thousands to millions of atoms, and including a large number of atomistic configurations (as a molecular dynamics or a Monte Carlo simulation would require), is beyond current and anticipated computational capabilities. Consequently, alternative approaches have been pursued in order to be able to model such systems, which, though large on the atomistic scale, are still many orders of magnitude smaller than typical materials. The basic idea is to employ either a simplified quantum mechanical approach for the electrons, or a purely classical one in which the electronic degrees of freedom are completely eliminated and the interactions between atoms are modeled by an effective potential; in both cases, the computational resources required are greatly reduced, permitting the treatment of much larger systems and more extensive exploration of their configurational space (more time steps in an MD simulation or more samples in an MC simulation). The strategies for reducing the computational cost, whether quantum mechanical or classical in nature, are usually distinctly different when applied to systems with covalent versus those with metallic bonding, because of the difference in the nature of electronic bonds in these two situations. In the quantum case, covalent systems are typically modeled by a so-called tight-binding Hamiltonian, which restricts the electronic wavefunctions to linear combinations of localized atomic orbitals; this approach is adequate to describe the nature of the covalent bonds (see Chapters 1.14 and 1.15), but can also be extended to capture metallic systems. The restricted variational freedom of electronic wavefunctions greatly reduces the computational cost involved in finding the proper solution.
For simple metallic systems, an approach based on density functional theory but without requiring electronic orbitals has also been employed to approximate their properties, again with very substantial reduction in computational cost. These developments have made possible the quantum mechanical, atomistic scale simulation of systems consisting of up to a few thousand atoms (see Ref. [3] for examples). An altogether different methodology is to maintain a strictly classical description with interactions between the atoms provided by an effective potential which somehow encapsulates all the effects of valence electrons. The methodology used in this type of approach is again determined by the type of system to which it is applied. Specifically, for covalently bonded systems, the emphasis of the potential is to reproduce the energy cost of distorting the length of covalent bonds, the angles between them and the torsional angles, which are the basic features characterizing structures with predominantly covalent bonding; a characteristic example of such approaches is silicon, the prototypical covalently bonded solid, for which many attempts have been made to produce a reliable effective interatomic potential with various
degrees of success [5–7]. In contrast to this, for metallic systems the emphasis of the potential is to describe realistically the environment of an atom embedded in the background of valence electrons of the host solid; the approaches here often employ an effective (but not necessarily realistic) representation of the valence electron density and are referred to as the embedded atom method (for a review see Ref. [8]; see also Chapter 2.2). In both types of approaches, great care is given to ensuring that the potential reproduces accurately the energetics of at least a set of configurations, by fitting it to a database produced by the more elaborate and accurate quantum mechanical methods. Finally, there are also cases where a more generic type of approach can be employed, modeling for instance the interaction between atoms as a simple potential derived by heuristic arguments without fitting to any particular system. Examples of such potentials are the well-known van der Waals and Morse potentials, which have the general behavior of an attractive tail, a well-defined minimum and a repulsive core, as a function of the distance between two atoms (see Chapters 2.2–2.6, and 9.2). While not specific to any given material or system, these potentials can provide great insight as far as generic behavior of solids is concerned, including the role of defects in fairly complex contexts (see Chapters 6.1 and 7.1).
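The generic shape described above (repulsive core, well-defined minimum, attractive tail) can be checked numerically for two standard pair potentials. The sketch below assumes the Lennard-Jones 12-6 form as a common representation of a van der Waals interaction, which is a choice made here and not one named in the text, alongside the Morse form; the function names and the reduced-unit parameter values are illustrative.

```python
import numpy as np

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 pair potential: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def morse(r, D=1.0, alpha=1.5, r0=1.0):
    """Morse pair potential with well depth D and minimum at r0, zero at large r."""
    return D * (1.0 - np.exp(-alpha * (r - r0))) ** 2 - D

r = np.linspace(0.85, 3.0, 500)
u_lj = lennard_jones(r)
u_morse = morse(r)

# Locate the minima on the sampled grid
r_min_lj = r[np.argmin(u_lj)]       # analytic minimum at 2**(1/6) * sigma
r_min_morse = r[np.argmin(u_morse)] # analytic minimum at r0
print(r_min_lj, u_lj.min(), r_min_morse, u_morse.min())
```

Both curves are positive at short distance, reach a depth of one energy unit at their minimum, and approach zero from below at large separation.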
2. Unique Properties of Molecular Dynamics and Monte Carlo
There are certain aspects of atomistic simulation, particularly molecular dynamics and Monte Carlo, which make this approach quite unique. The basic underlying concept here is particle tracking. Without going into the distinction between the two methods of simulation, we make the following general observations.
(i) A few hundred particles are often sufficient to simulate bulk properties. Bulk or macroscopic properties like the system pressure and temperature can be determined with a simulation cell containing fewer than a thousand atoms, even though the number of atoms in a typical macroscopic system is of the order of Avogadro’s number, $6 \times 10^{23}$.
(ii) Simulation allows a unified study of all physical properties. A single simulation can generate the basic data, particle trajectories or configurations, with which one can calculate all the materials properties of interest: structural, thermodynamic, vibrational, mechanical, transport, etc.
(iii) Simulation provides a direct connection between the fundamental description of a material system, such as internal energy and atomic structure, and all the physical properties of interest. In essence, it is a “numerical theory of matter”.
(iv) In simulation one has complete control over the conditions under which the system study is carried out. This applies to the specification of interatomic interactions and the initial and boundary conditions. With this information and the simulation output one has achieved a precise characterization of the material being simulated.
(v) Simulation can give properties that cannot be measured. This can be a very significant feature with regard to testing theory. In situations where the most clean-cut test involves systems or properties not accessible by laboratory experiments, simulation can play the role of experiment and provide this information. Conversely, in those cases where there are no theories to interpret an experiment, simulation can play the role of theory.
(vi) Simulation makes possible the direct visualization of physical phenomena of interest. Visualization can play a very important role in modeling and simulation at all scales, for communication of results, gaining physical insights, and discovery. While its potential is recognized, its practical use remains underdeveloped.

We recall here an oft-quoted sentiment:

“Certainly no subject is making more progress on so many fronts than biology, and if we were to name the most powerful assumption of all, which leads one on and on in an attempt to understand life, it is that all things are made of atoms, and that everything that living things do can be understood in terms of the jiggling and wiggling of atoms.”
Richard Feynman, Lectures on Physics, vol. 1, p. 3–6 (1963)
3. Limitations of Atomistic Simulation
To balance the usefulness of molecular dynamics and Monte Carlo, it is appropriate to acknowledge at the same time the inherent limitations of atomistic simulation. As mentioned earlier, the first-principles, quantum mechanical description of atomic bonding in solids is restricted to very few (by macroscopic standards) atoms and to extremely short time scales: barely a few hundred atoms can be handled, for periods of a few hundred femtoseconds. Extending this fundamental description to larger systems and longer times of simulation requires the introduction of approximations in the quantum mechanical method (such as tight binding or orbital-free approaches), which significantly limit the accuracy of the quantum mechanical approach. With such restrictions on size and time-span of the simulation, the scope of applications to real materials properties is rather limited. The alternative is to use a purely classical description, based on empirical interatomic potentials to describe the interactions of atoms. This, however, introduces more severe approximations, which limit the
ability of the approach to capture realistically how the bonds between atoms are formed and dissolved during a simulation. Such uncertainties put bounds on the scope of physical phenomena that can be successfully addressed by simulations. The other limitation is a practical issue, that is, the finite capabilities of computers no matter how large they are. This translates into limits on the spatial size (usually identified with the number of atoms N in the model) and the temporal extent of simulations, which often fall short of desired values. It is quite safe to say that the upper bounds on system size and run time, whatever they are, will be pushed out further with time, because computer power is certain to increase in the foreseeable future. Probably more important in extending the effective size of simulations are novel algorithmic developments, which are likely to produce computational gains in the simulation size and duration much larger than any direct gains by raw increases in computer power. As an example of new approaches, we mention multiscale simulations of materials, which combine the different types of system description (quantum, classical and continuum) into a single method. Several approaches of this type have appeared in the last few years, and their development is at present a very active field which holds promise for bringing to fruition the full potential of atomistic simulations.
4. A Brief Survey of the Chapter Contents
The diversity of atomistic simulations, regarding either methods or applications, makes any attempt at a complete coverage a practically impossible task. The contributions that have been brought together here should give the reader a substantial overview of the basic capabilities of the atomistic simulation approach, along with emphasis on certain unique features of modeling and simulation at this scale from the standpoint of multiscale modeling. Leading off the discussions are five articles describing the development of interatomic potentials for specific classes of materials – metals (Chapter 2.2), ionic (Chapter 2.3) and covalent (Chapter 2.4) solids, molecules (Chapter 2.5), and ferroelectrics (Chapter 2.6). From these the reader gains an appreciation of the physics and the database that go into the models, and how the resulting potentials are validated. Immediately following are articles on the simulation methods where the potentials are the necessary inputs, energy minimization (Chapter 2.7), molecular dynamics (Chapters 2.8, 2.9, 2.11), Monte Carlo (Chapter 2.10), and methods at the mesoscale which incorporate atomistic information (Chapters 2.12, 2.13). In the next set of articles emphasis is directed at applications, beginning with free-energy calculations (Chapters 2.14, 2.15) for which atomistic simulations are uniquely well suited, followed by studies of elastic constants (Chapter 2.16), transport coefficients (Chapters 2.17, 2.18),
mechanical behavior (Chapter 2.19), dislocations (Chapters 2.20, 2.21, 2.22), fracture in metals (Chapter 2.23), and semiconductors (Chapter 2.24). The next two articles deal with large-scale simulations, on metallic and ceramic nanostructures (Chapter 2.25) and biological membranes (Chapter 2.26), followed by three articles on studies in radiation damage to which atomistic modeling and simulations have made significant contributions (Chapters 2.27, 2.28, 2.29). The next article, on thin-film deposition (Chapter 2.30), is an example of how simulation can address problems of technological relevance. The chapter concludes with an article on visualization at the atomistic level (Chapter 2.31), a topic which is destined to grow in recognized importance as well as opportunities for software innovation. The contents of this chapter clearly have a great deal of overlap with the rest of the Handbook. The connection between atomistic simulations using classical potentials and electronic structure calculations (Chapter 1.1) permeates the present chapter, since the potentials used in MD/MC simulations rely on first-principles quantum mechanical calculations for inspiration of the functional form of the potentials, for the database used to determine parameter values, and for benchmark results in model validation. The connection to the mesoscale (Chapter 3.1) is clearly also very intimate since this is the next level of length/time scale. Since atomistic simulation methods and results are used liberally throughout the Handbook, one may be tempted to say that this chapter serves as perhaps the most central link to the different parts of the volume. If we may be allowed another quote from R.P. Feynman, the following is a different way of expressing the centrality of the chapter.
“If, in some cataclysm, all of scientific knowledge were to be destroyed, and only one sentence passed on to the next generations of creatures, what statement would contain the most information in the fewest words? I believe it is the atomic hypothesis (or the atomic fact, or whatever you wish to call it) that all things are made of atoms – little particles that move around in perpetual motion, attracting each other when they are a little distance apart, but repelling upon being squeezed into one another. In that one sentence, you will see, there is an enormous amount of information about the world, if just a little imagination and thinking are applied.” Richard P. Feynman, Six Easy Pieces (Addison-Wesley, Reading, 1963), p. 4.
References
[1] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas,” Phys. Rev., 136, B864–871, 1964.
[2] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev. A, 140, 1133–1138, 1965.
[3] R.M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, Cambridge, 2004.
[4] R. Car and M. Parrinello, “Unified approach for molecular dynamics and density-functional theory,” Phys. Rev. Lett., 55, 2471–2474, 1985.
[5] F.H. Stillinger and T.A. Weber, “Computer simulation of local order in condensed phases of silicon,” Phys. Rev. B, 31, 5262–5271, 1985.
[6] J. Tersoff, “New empirical model for the structural properties of silicon,” Phys. Rev. Lett., 56, 632–635, 1986.
[7] J. Justo, M.Z. Bazant, E. Kaxiras, V.V. Bulatov, and S. Yip, “Interatomic potential for silicon defects and disordered phases,” Phys. Rev. B, 58, 2539–2550, 1998.
[8] A.F. Voter, Intermetallic Compounds, vol. 1, Wiley, New York, p. 77, 1994.
2.2 INTERATOMIC POTENTIALS FOR METALS
Y. Mishin
George Mason University, Fairfax, VA, USA
Many processes in materials, such as plastic deformation, fracture, diffusion and phase transformations, involve large ensembles of atoms and/or require statistical averaging over many atomic events. Computer modeling of such processes is made possible by the use of semi-empirical interatomic potentials allowing fast calculations of the total energy and classical interatomic forces. Due to their computational efficiency, interatomic potentials give access to systems containing millions of atoms and enable molecular dynamics simulations for tens or even hundreds of nanoseconds. State-of-the-art potentials capture the most essential features of interatomic bonding, reaching the golden compromise between computational speed and accuracy of modeling. This article reviews interatomic potentials for metals and metallic alloys. The basic concepts used in this area are introduced, the methodology commonly applied to generate atomistic potentials is outlined, and capabilities as well as limitations of atomistic potentials are discussed. Expressions for basic physical properties within the embedded-atom formalism are provided in a form convenient for computer coding. Recent trends in this field and possible future developments are also discussed.
1. Embedded-atom Potentials
S. Yip (ed.), Handbook of Materials Modeling, 459–478. © 2005 Springer. Printed in the Netherlands.

Molecular dynamics, Monte Carlo, and other simulation methods require multiple evaluations of the Newtonian forces $\mathbf{F}_i$ acting on individual atoms $i$ or (in the case of Monte Carlo simulations) the total energy of the system, $E_{\mathrm{tot}}$. Atomistic potentials, also referred to as force fields, parameterize the configuration space of a system and represent its total energy as a relatively simple function of the configuration point. The interatomic forces are then obtained as coordinate derivatives of $E_{\mathrm{tot}}$, $\mathbf{F}_i = -\partial E_{\mathrm{tot}}/\partial \mathbf{r}_i$, $\mathbf{r}_i$ being the radius-vector of an
atom $i$. This calculation of $E_{\mathrm{tot}}$ and $\mathbf{F}_i$ is a simple and fast numerical procedure that does not involve quantum-mechanical calculations, although the latter are often used when generating potentials, as will be discussed later. Potential functions contain fitting parameters, which are adjusted to give desired properties of the material known from experiment and/or first-principles calculations. Once the fitting procedure is complete, the parameters are not subject to any further changes and the potential thus defined is used in all subsequent simulations of the material. The underlying assumption is that a potential providing accurate energies/forces at configuration points used in the fit will also give reasonable results for configurations between and beyond them. This property of potentials, often referred to as “transferability,” is probably the most adequate measure of their quality.

Early atomistic simulations employed pair potentials, usually of the Morse or Lennard-Jones type [1, 2]. Although such potentials have been and still are a useful model for fundamental studies of generic properties of materials, the agreement between simulation results and experiment can only be qualitative at best. While such potentials can be physically justified for inert elements and perhaps some ionic solids, they do not capture the nature of atomic bonding even in simple metals, not to mention transition metals or covalent solids. Daw and Baskes [3] and Finnis and Sinclair [4] proposed a more advanced potential form that came to be known as the embedded atom method (EAM). In contrast to pair potentials, EAM incorporates, in an approximate manner, many-body interactions between atoms, which are responsible for a significant part of bonding in metals. The introduction of the many-body term has enabled a semi-quantitative, and in good cases even quantitative, description of metallic systems. In the EAM model, $E_{\mathrm{tot}}$ is given by the expression

\[ E_{\mathrm{tot}} = \frac{1}{2} \sum_{i,j\,(j \neq i)} \Phi_{s_i s_j}(r_{ij}) + \sum_i F_{s_i}(\bar{\rho}_i). \tag{1} \]
The first term is the sum of all pair interactions between atoms, $\Phi_{s_i s_j}(r_{ij})$ being a pair-interaction potential between atoms $i$ (of chemical sort $s_i$) and $j$ (of chemical sort $s_j$) at positions $\mathbf{r}_i$ and $\mathbf{r}_j = \mathbf{r}_i + \mathbf{r}_{ij}$, respectively. The function $F_{s_i}$ is the so-called embedding energy of atom $i$, which depends upon the host electron density $\bar{\rho}_i$ at site $i$ induced by all other atoms of the system. The host electron density is given by the sum

\[ \bar{\rho}_i = \sum_{j \neq i} \rho_{s_j}(r_{ij}), \tag{2} \]
where $\rho_{s_j}(r)$ is the electron density function assigned to atom $j$. The second term in Eq. (1) represents the many-body effects. The functional form of Eq. (1) was originally derived as a generalization of the effective medium theory [5] and the second moment approximation to tight-binding theory [4, 6]. Later, however, it lost its close ties with the original physical meaning
and came to be treated as a working semi-empirical expression with adjustable parameters. A complete EAM description of an $n$-component system requires $n(n+1)/2$ pair interaction functions $\Phi_{ss'}(r)$, $n$ electron density functions $\rho_s(r)$, and $n$ embedding functions $F_s(\bar{\rho})$ ($s = 1, \ldots, n$). An elemental metal is described by three functions $\Phi(r)$, $\rho(r)$ and $F(\bar{\rho})$,$^1$ while a binary system A–B requires seven functions $\Phi_{\mathrm{AA}}(r)$, $\Phi_{\mathrm{AB}}(r)$, $\Phi_{\mathrm{BB}}(r)$, $\rho_{\mathrm{A}}(r)$, $\rho_{\mathrm{B}}(r)$, $F_{\mathrm{A}}(\bar{\rho})$, and $F_{\mathrm{B}}(\bar{\rho})$. Notice that if potential functions for pure metals A and B are available, only the cross-interaction function $\Phi_{\mathrm{AB}}(r)$ is needed for describing the respective binary system. Over the past two decades, EAM potentials have been constructed for many metals and a number of binary systems. Potentials for ternary systems are scarce and their reliability is yet to be evaluated.

The pair-interaction and electron-density functions are normally forced to turn to zero together with several higher derivatives at a cutoff radius $R_c$. Typically, $R_c$ covers 3–5 coordination shells. EAM functions are usually defined by analytical expressions. Such expressions and their derivatives can be directly coded into a simulation program. However, a more common and computationally more efficient procedure is to tabulate each function at a large number of points (usually, a few thousand) and store it in the tabulated form for all subsequent simulations. At the beginning of each simulation run, the tables are read into the program and interpolated by a cubic spline, and the spline coefficients are used during the rest of the simulation for retrieving interpolated values of the functions and their derivatives for any desired value of the argument.

It is important to understand that the partition of $E_{\mathrm{tot}}$ into pair interactions and the embedding energy is not unique [7]. Namely, $E_{\mathrm{tot}}$ defined by Eq. (1) is invariant under the transformations

\[ F_s(\bar{\rho}) \to F_s(\bar{\rho}) + g_s \bar{\rho}, \tag{3} \]
\[ \Phi_{ss'}(r) \to \Phi_{ss'}(r) - g_s \rho_{s'}(r) - g_{s'} \rho_s(r), \tag{4} \]
where $s, s' = 1, \ldots, n$ and $g_s$ are arbitrary constants. In addition, all functions $\rho_s(r)$ can be scaled by the same arbitrary factor $p$ with a simultaneous scaling of the argument of the embedding functions:

\[ \rho_s(r) \to p\,\rho_s(r), \tag{5} \]
\[ F_s(\bar{\rho}) \to F_s(\bar{\rho}/p). \tag{6} \]
Thus, there is a large degree of ambiguity in defining EAM potential functions: the units of the electron density are arbitrary, the pair-interaction and electron-density functions can be mixed with each other, and the embedding energy can only be defined up to a linear function. It is important, however, that the embedding function be non-linear, otherwise the second term in Eq. (1) can be absorbed by the first one, resulting in a simple pair potential. The non-linearity of $F_s(\bar{\rho})$ reflects the bond-order character of atomic interactions by making the energy per nearest-neighbor bond decrease with increasing number of bonds. To capture this trend, the second derivative $F''_s(\bar{\rho})$ must be positive and thus $F_s(\bar{\rho})$ a convex curve, at least around the equilibrium volume of the crystal. Furthermore, in pure metals at equilibrium, $F''(\bar{\rho})$ is proportional to the Cauchy pressure $(c_{12} - c_{44})/2$, which is normally positive ($c_{ij}$ are elastic constants). Notice that all pair potentials inevitably give $c_{12} = c_{44}$, a relation which is rarely followed by real materials.

$^1$ For elemental metals, the chemical indices $s$ are often omitted.

Given this arbitrariness of EAM functions, one should be careful when comparing EAM potentials developed by different research groups for the same material: functions looking very different may actually give close physical properties. As a common platform for comparison, potentials are often converted to the so-called effective pair format. To bring potential functions to this format, apply the transformations of Eqs. (3) and (4) with coefficients $g_s$ chosen as $g_s = -F'_s(\bar{\rho})$, where the derivative is taken at the equilibrium lattice parameter of a reference crystal structure. For that structure, the transformed embedding functions will satisfy the condition $F'_s(\bar{\rho}) = 0$ at equilibrium. In other words, each embedding function $F_s(\bar{\rho})$ will have a minimum at the host electron density arising at atoms of the respective sort $s$ in the equilibrium reference structure. Together with the normalization condition $\bar{\rho}_1 = 1$ applied to sort $s = 1$ in that structure, the potential format is uniquely defined and different potentials can be conveniently compared with each other provided that their reference structures are identical. 
In elemental metals, the natural choice of the reference structure is the ground state, whereas for binary systems this choice is not unique and should always be specified by the author.
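The invariance expressed by Eqs. (3)–(6) is easy to check numerically. The following Python sketch evaluates Eqs. (1) and (2) for a small single-component cluster and compares the energy before and after the transformations. The functional forms (a Morse-type pair term, an exponential density and a square-root embedding function), all numerical parameters, the cluster coordinates and the names `eam_energy` and `transformed` are illustrative assumptions and do not represent a fitted potential.

```python
import math

# Hypothetical toy functions in reduced units (assumptions, not a fitted potential).
def phi(r):                      # pair interaction, Morse-type
    return math.exp(-2.0 * (r - 1.0)) - 2.0 * math.exp(-(r - 1.0))

def rho(r):                      # electron density function
    return math.exp(-1.5 * r)

def embed(rhobar):               # embedding energy, non-linear in rhobar
    return -math.sqrt(rhobar)

def eam_energy(pos, phi, rho, embed):
    """Total energy of Eq. (1), with host densities from Eq. (2)."""
    n = len(pos)
    e_pair = 0.0
    rhobar = [0.0] * n
    for i in range(n):
        for j in range(n):
            if j != i:
                r = math.dist(pos[i], pos[j])
                e_pair += 0.5 * phi(r)
                rhobar[i] += rho(r)
    return e_pair + sum(embed(x) for x in rhobar)

def transformed(phi, rho, embed, g, p):
    """Apply Eqs. (3)-(4) with constant g, then Eqs. (5)-(6) with factor p."""
    def phi_t(r):
        return phi(r) - 2.0 * g * rho(r)     # Eq. (4) with s = s'
    def rho_t(r):
        return p * rho(r)                    # Eq. (5)
    def embed_t(x):
        return embed(x / p) + g * (x / p)    # Eqs. (3) and (6) combined
    return phi_t, rho_t, embed_t

# Irregular four-atom cluster (arbitrary coordinates).
cluster = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.3, 1.0, 0.1), (0.4, 0.3, 0.9)]

e_original = eam_energy(cluster, phi, rho, embed)
phi_t, rho_t, embed_t = transformed(phi, rho, embed, g=0.37, p=2.5)
e_transformed = eam_energy(cluster, phi_t, rho_t, embed_t)
print(e_original, e_transformed)   # the two values should agree to rounding error
```

In this sketch the individual functions change noticeably while the total energy does not. Setting `p = 1` and choosing `g` as minus the slope of the embedding function at a reference density corresponds to the effective pair format described above.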
2. Calculation of Properties with EAM Potentials
Below we provide EAM expressions for some basic physical properties of materials in a form convenient for computer coding. We are using a laboratory reference system with rectangular Cartesian coordinates, so that positions of indices of vectors and tensors are unimportant. We will reserve superscripts for Cartesian coordinates of atoms and subscripts for their labels (all atoms are assumed to be labeled) and chemical sorts ($s$-indices). The force acting on a particular atom $i$ in a Cartesian direction $\alpha$ ($\alpha = 1, 2, 3$) is given by the expression

\[ F_i^{\alpha} = \sum_{j \neq i} f_{ij}(r_{ij}) \frac{r_{ij}^{\alpha}}{r_{ij}}, \tag{7} \]

where

\[ f_{ij}(r_{ij}) = \Phi'_{s_i s_j}(r_{ij}) + F'_{s_i}(\bar{\rho}_i)\,\rho'_{s_j}(r_{ij}) + F'_{s_j}(\bar{\rho}_j)\,\rho'_{s_i}(r_{ij}). \tag{8} \]
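As a hedged illustration of how Eqs. (7) and (8) translate into code, the sketch below computes analytical forces for a small single-component cluster and compares them with central finite differences of the total energy. The toy functions, their parameters, the cluster coordinates and all function names are assumptions made for this example only.

```python
import math

# Hypothetical toy functions and their first derivatives (reduced units).
def phi(r):
    return math.exp(-2.0 * (r - 1.0)) - 2.0 * math.exp(-(r - 1.0))

def dphi(r):
    return -2.0 * math.exp(-2.0 * (r - 1.0)) + 2.0 * math.exp(-(r - 1.0))

def rho(r):
    return math.exp(-1.5 * r)

def drho(r):
    return -1.5 * math.exp(-1.5 * r)

def embed(x):
    return -math.sqrt(x)

def dembed(x):
    return -0.5 / math.sqrt(x)

def host_densities(pos):
    """Host electron density at every site, Eq. (2)."""
    return [sum(rho(math.dist(pi, pj)) for j, pj in enumerate(pos) if j != i)
            for i, pi in enumerate(pos)]

def energy(pos):
    """Total energy, Eq. (1)."""
    pair = sum(0.5 * phi(math.dist(pi, pj))
               for i, pi in enumerate(pos) for j, pj in enumerate(pos) if j != i)
    return pair + sum(embed(x) for x in host_densities(pos))

def forces(pos):
    """Analytical forces from Eqs. (7) and (8)."""
    rhobar = host_densities(pos)
    out = []
    for i, pi in enumerate(pos):
        f = [0.0, 0.0, 0.0]
        for j, pj in enumerate(pos):
            if j == i:
                continue
            rij = [pj[a] - pi[a] for a in range(3)]       # r_ij = r_j - r_i
            r = math.sqrt(sum(c * c for c in rij))
            fij = (dphi(r) + dembed(rhobar[i]) * drho(r)
                   + dembed(rhobar[j]) * drho(r))         # Eq. (8), one chemical sort
            for a in range(3):
                f[a] += fij * rij[a] / r                  # Eq. (7)
        out.append(f)
    return out

def numerical_force(pos, i, a, h=1e-5):
    """Central-difference estimate of -dE/dr_i^a."""
    plus = [list(p) for p in pos]
    minus = [list(p) for p in pos]
    plus[i][a] += h
    minus[i][a] -= h
    return -(energy(plus) - energy(minus)) / (2.0 * h)

cluster = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.3, 1.0, 0.1), (0.4, 0.3, 0.9)]
analytic = forces(cluster)
max_error = max(abs(analytic[i][a] - numerical_force(cluster, i, a))
                for i in range(len(cluster)) for a in range(3))
print(max_error)   # small compared with the forces themselves
```

A finite-difference comparison of this kind is a common sanity check when coding a new potential, since a sign or index error in Eq. (8) shows up immediately as a large discrepancy.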
Notice that this force depends on the electron density on all neighboring atoms $j$, which in turn depends on the positions of all neighbors of atom $j$. It follows that force coupling between atoms extends effectively over a distance of $2R_c$ and not just $R_c$ as for pair potentials. EAM allows a direct calculation of the mechanical stress tensor for any atomic configuration:

\[ \sigma^{\alpha\beta} = \frac{1}{V} \sum_i \Omega_i \sigma_i^{\alpha\beta}, \tag{9} \]

where

\[ \Omega_i \sigma_i^{\alpha\beta} \equiv \sum_{j \neq i} \left[ \frac{1}{2} \Phi'_{s_i s_j}(r_{ij}) + F'_{s_i}(\bar{\rho}_i)\,\rho'_{s_j}(r_{ij}) \right] \frac{r_{ij}^{\alpha} r_{ij}^{\beta}}{r_{ij}}. \tag{10} \]
Here, $V = \sum_i \Omega_i$ is the total volume of the system and $\Omega_i$ are atomic volumes assigned to individual atoms. A partition of $V$ between atoms is somewhat arbitrary, but adopting a reasonable approximation (for example, equipartition) one can compute the local stress tensor $\sigma_i^{\alpha\beta}$ on individual atoms. Analysis of stress distribution can be especially useful in atomistic simulations of dislocations, grain boundaries and other crystal defects. The condition of mechanical equilibrium of an isolated or periodic system can be expressed as $\sigma^{\alpha\beta} = 0$ for all $\alpha$ and $\beta$:

\[ \sum_{i,j\,(j \neq i)} \left[ \frac{1}{2} \Phi'_{s_i s_j}(r_{ij}) + F'_{s_i}(\bar{\rho}_i)\,\rho'_{s_j}(r_{ij}) \right] \frac{r_{ij}^{\alpha} r_{ij}^{\beta}}{r_{ij}} = 0. \tag{11} \]

In particular, equilibrium with respect to volume variations requires that the hydrostatic stress vanish, $\sum_{\alpha} \sigma^{\alpha\alpha} = 0$, which reduces Eq. (11) to

\[ \sum_{i,j\,(j \neq i)} \left[ \frac{1}{2} \Phi'_{s_i s_j}(r_{ij}) + F'_{s_i}(\bar{\rho}_i)\,\rho'_{s_j}(r_{ij}) \right] r_{ij} = 0. \tag{12} \]
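Analysis of stresses also allows us to formulate equilibrium conditions of a crystal with respect to tetragonal or any other homogeneous distortion. We now turn to elastic constants of an equilibrium perfect crystal. The elastic constant tensor $C^{\alpha\beta\gamma\delta}$ of a general crystal structure is given by

\[ C^{\alpha\beta\gamma\delta} = \frac{1}{n_b \Omega_0} \sum_i \left[ U_i^{\alpha\beta\gamma\delta} + F'_{s_i}(\bar{\rho}_i)\, W_i^{\alpha\beta\gamma\delta} + F''_{s_i}(\bar{\rho}_i)\, V_i^{\alpha\beta} V_i^{\gamma\delta} \right], \tag{13} \]

where $\Omega_0$ is the equilibrium atomic volume and

\[ U_i^{\alpha\beta\gamma\delta} = \frac{1}{2} \sum_{j \neq i} \left[ \Phi''_{s_i s_j}(r_{ij}) - \frac{\Phi'_{s_i s_j}(r_{ij})}{r_{ij}} \right] \frac{r_{ij}^{\alpha} r_{ij}^{\beta} r_{ij}^{\gamma} r_{ij}^{\delta}}{(r_{ij})^2}, \tag{14} \]

\[ W_i^{\alpha\beta\gamma\delta} = \sum_{j \neq i} \left[ \rho''_{s_j}(r_{ij}) - \frac{\rho'_{s_j}(r_{ij})}{r_{ij}} \right] \frac{r_{ij}^{\alpha} r_{ij}^{\beta} r_{ij}^{\gamma} r_{ij}^{\delta}}{(r_{ij})^2}, \tag{15} \]

\[ V_i^{\alpha\beta} = \sum_{j \neq i} \rho'_{s_j}(r_{ij}) \frac{r_{ij}^{\alpha} r_{ij}^{\beta}}{r_{ij}}. \tag{16} \]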
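The structure of Eqs. (13)–(16) can be illustrated with a short Python sketch for an fcc crystal with one basis atom ($n_b = 1$). It evaluates $c_{12} = C^{1122}$ and $c_{44} = C^{1212}$ with and without the embedding terms. The toy functions, the lattice parameter and the hard cutoff are assumptions; in particular the lattice is not equilibrated for these functions, so the numbers are illustrative and only the comparison between the two cases is meaningful.

```python
import math

A_LAT = math.sqrt(2.0)        # hypothetical fcc lattice parameter (nearest-neighbor distance 1)
R_CUT = 1.8                   # hypothetical hard cutoff enclosing three coordination shells
OMEGA_0 = A_LAT ** 3 / 4.0    # atomic volume of the fcc structure

# Hypothetical toy functions: derivatives of a Morse pair term, an exponential
# density and a square-root embedding function (reduced units).
def dphi(r):
    return -2.0 * math.exp(-2.0 * (r - 1.0)) + 2.0 * math.exp(-(r - 1.0))

def d2phi(r):
    return 4.0 * math.exp(-2.0 * (r - 1.0)) - 2.0 * math.exp(-(r - 1.0))

def rho(r):
    return math.exp(-1.5 * r)

def drho(r):
    return -1.5 * math.exp(-1.5 * r)

def d2rho(r):
    return 2.25 * math.exp(-1.5 * r)

def dembed(x):
    return -0.5 / math.sqrt(x)

def d2embed(x):
    return 0.25 * x ** -1.5

def fcc_neighbors(a, rc):
    """Vectors from one fcc site to all neighbors within the cutoff rc."""
    n = int(2.0 * rc / a) + 2
    vecs = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                if (i + j + k) % 2 == 0 and (i, j, k) != (0, 0, 0):
                    v = (0.5 * a * i, 0.5 * a * j, 0.5 * a * k)
                    if math.sqrt(sum(c * c for c in v)) <= rc:
                        vecs.append(v)
    return vecs

def elastic_constant(al, be, ga, de, vecs, embedding=True):
    """C^{al be ga de} from Eqs. (13)-(16) for one basis atom."""
    u = w = v_ab = v_gd = rhobar = 0.0
    for v in vecs:
        r = math.sqrt(sum(c * c for c in v))
        quad = v[al] * v[be] * v[ga] * v[de] / r ** 2
        u += 0.5 * (d2phi(r) - dphi(r) / r) * quad      # Eq. (14)
        w += (d2rho(r) - drho(r) / r) * quad            # Eq. (15)
        v_ab += drho(r) * v[al] * v[be] / r             # Eq. (16)
        v_gd += drho(r) * v[ga] * v[de] / r
        rhobar += rho(r)
    c = u
    if embedding:
        c += dembed(rhobar) * w + d2embed(rhobar) * v_ab * v_gd
    return c / OMEGA_0                                  # Eq. (13) with n_b = 1

NEIGHBORS = fcc_neighbors(A_LAT, R_CUT)
c12_pair = elastic_constant(0, 0, 1, 1, NEIGHBORS, embedding=False)
c44_pair = elastic_constant(0, 1, 0, 1, NEIGHBORS, embedding=False)
c12_eam = elastic_constant(0, 0, 1, 1, NEIGHBORS)
c44_eam = elastic_constant(0, 1, 0, 1, NEIGHBORS)
print(c12_pair - c44_pair, c12_eam - c44_eam)
```

Without the embedding terms the two constants coincide, because $U_i^{\alpha\beta\gamma\delta}$ is symmetric in all four indices. With a convex embedding function the difference $c_{12} - c_{44}$ comes entirely from the $F''$ term of Eq. (13) and is positive.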
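In Eq. (13), $\sum_i$ is the summation over $n_b$ basis atoms defining the structure, while the summation $\sum_j$ extends over all neighbors of atom $i$ within its cutoff sphere. Expressions for contracted elastic constants $c_{ij}$ can be readily developed from the above equations. It is important to remember that Eqs. (13)–(16) have been derived by applying to the crystal an infinitesimal homogeneous strain. These equations are, thus, not valid for structures (e.g., HCP or diamond cubic) where the lack of inversion symmetry gives rise to internal atomic relaxations under applied strains. EAM provides relatively simple expressions for force constants and the dynamical matrix [8]. For off-diagonal ($i \neq j$) elements of the force-constant matrix $G_{ij}^{\alpha\beta}$ we have

\[
\begin{aligned}
G_{ij}^{\alpha\beta} \equiv \frac{\partial^2 E_{\mathrm{tot}}}{\partial r_i^{\alpha}\, \partial r_j^{\beta}}
= {} & -\delta_{\alpha\beta} \frac{f_{ij}(r_{ij})}{r_{ij}}
 - \left[ \Phi''_{s_i s_j}(r_{ij}) - \frac{\Phi'_{s_i s_j}(r_{ij})}{r_{ij}} \right] \frac{r_{ij}^{\alpha} r_{ij}^{\beta}}{(r_{ij})^2} \\
& - F'_{s_i}(\bar{\rho}_i) \left[ \rho''_{s_j}(r_{ij}) - \frac{\rho'_{s_j}(r_{ij})}{r_{ij}} \right] \frac{r_{ij}^{\alpha} r_{ij}^{\beta}}{(r_{ij})^2}
 - F'_{s_j}(\bar{\rho}_j) \left[ \rho''_{s_i}(r_{ij}) - \frac{\rho'_{s_i}(r_{ij})}{r_{ij}} \right] \frac{r_{ij}^{\alpha} r_{ij}^{\beta}}{(r_{ij})^2} \\
& - F''_{s_i}(\bar{\rho}_i)\, \rho'_{s_j}(r_{ij})\, Q_i^{\alpha} \frac{r_{ij}^{\beta}}{r_{ij}}
 + F''_{s_j}(\bar{\rho}_j)\, \rho'_{s_i}(r_{ij})\, Q_j^{\beta} \frac{r_{ij}^{\alpha}}{r_{ij}} \\
& + \sum_{k \neq i,j} F''_{s_k}(\bar{\rho}_k)\, \rho'_{s_i}(r_{ik})\, \rho'_{s_j}(r_{jk}) \frac{r_{ik}^{\alpha} r_{jk}^{\beta}}{r_{ik}\, r_{jk}},
\end{aligned}
\tag{17}
\]

where

\[ Q_i^{\alpha} = \sum_{m \neq i} \rho'_{s_m}(r_{im}) \frac{r_{im}^{\alpha}}{r_{im}} \tag{18} \]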
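and $f_{ij}(r_{ij})$ is given by Eq. (8). For the diagonal elements $G_{ii}^{\alpha\beta}$ we have

\[
\begin{aligned}
G_{ii}^{\alpha\beta} \equiv \frac{\partial^2 E_{\mathrm{tot}}}{\partial r_i^{\alpha}\, \partial r_i^{\beta}}
= {} & \delta_{\alpha\beta} \sum_{k \neq i} \frac{f_{ik}(r_{ik})}{r_{ik}}
 + \sum_{k \neq i} \left[ \Phi''_{s_i s_k}(r_{ik}) - \frac{\Phi'_{s_i s_k}(r_{ik})}{r_{ik}} \right] \frac{r_{ik}^{\alpha} r_{ik}^{\beta}}{(r_{ik})^2} \\
& + F'_{s_i}(\bar{\rho}_i) \sum_{k \neq i} \left[ \rho''_{s_k}(r_{ik}) - \frac{\rho'_{s_k}(r_{ik})}{r_{ik}} \right] \frac{r_{ik}^{\alpha} r_{ik}^{\beta}}{(r_{ik})^2}
 + \sum_{k \neq i} F'_{s_k}(\bar{\rho}_k) \left[ \rho''_{s_i}(r_{ik}) - \frac{\rho'_{s_i}(r_{ik})}{r_{ik}} \right] \frac{r_{ik}^{\alpha} r_{ik}^{\beta}}{(r_{ik})^2} \\
& + F''_{s_i}(\bar{\rho}_i)\, Q_i^{\alpha} Q_i^{\beta}
 + \sum_{k \neq i} F''_{s_k}(\bar{\rho}_k) \left[ \rho'_{s_i}(r_{ik}) \right]^2 \frac{r_{ik}^{\alpha} r_{ik}^{\beta}}{(r_{ik})^2}.
\end{aligned}
\tag{19}
\]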
If the system is subject to periodic boundary conditions or if there are no external fields, $G_{ii}^{\alpha\beta}$ can be simply found from the relation

\[ \sum_{j \neq i} G_{ij}^{\alpha\beta} + G_{ii}^{\alpha\beta} = 0, \tag{20} \]
expressing the invariance of $E_{\mathrm{tot}}$ with respect to arbitrary rigid translations of the system. Eqs. (17) and (19) reveal again that dynamic coupling between atoms in EAM extends over distances up to $2R_c$. Notice that these equations are not limited to a perfect crystal and are valid for any equilibrium atomic configuration. Knowing $G_{ij}^{\alpha\beta}$, we can construct the dynamical matrix

\[ D_{ij}^{\alpha\beta} = \frac{G_{ij}^{\alpha\beta}}{\sqrt{M_i M_j}}, \tag{21} \]

$M_i$ and $M_j$ being the atomic masses. A diagonalization of $D_{ij}^{\alpha\beta}$ gives us the squares, $\omega_n^2$, of the normal vibrational frequencies $\omega_n$ of our system. For a stable system all eigenvalues $\omega_n^2$ are non-negative, which allows us to determine the normal frequencies. These, in turn, can be immediately plugged into the relevant statistical-mechanical expressions for the free energy and other thermodynamic functions associated with atomic vibrations. This procedure, with possible slight modifications, lies at the foundation of all harmonic and quasi-harmonic thermodynamics calculations with atomistic potentials [9, 10]. In particular, a minimization of the total free energy (vibrational free energy plus $E_{\mathrm{tot}}$) with respect to volume provides a quasi-harmonic scheme of thermal expansion calculations [11]. Alternatively, for a perfect crystal it is straightforward to compute the Fourier transform, $D_{ij}^{\alpha\beta}(\mathbf{k})$, of the dynamical matrix for various $\mathbf{k}$-vectors within the Brillouin zone (here $i$ and $j$ refer to basis atoms). A diagonalization of $D_{ij}^{\alpha\beta}(\mathbf{k})$ permits a calculation of $3n_b$ phonon dispersion relations $\omega(\mathbf{k})$.
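The chain from force constants to thermodynamic functions can be sketched on the smallest possible example. The Python code below is not an EAM calculation: it assumes a one-dimensional two-atom system with a single harmonic force constant, builds the diagonal elements from the sum rule of Eq. (20), mass-weights the matrix as in Eq. (21), and inserts the resulting frequency into the standard harmonic free energy, $\sum_n [\hbar\omega_n/2 + k_B T \ln(1 - e^{-\hbar\omega_n/k_B T})]$. Reduced units with $\hbar = k_B = 1$, the stiffness, the masses and all function names are assumptions.

```python
import math

def force_constants(k):
    """1D force-constant matrix of two atoms joined by a spring of stiffness k."""
    g_off = -k                              # d2E/dx1dx2 for E = k (x2 - x1)^2 / 2
    g_diag = -g_off                         # sum rule, Eq. (20)
    return [[g_diag, g_off], [g_off, g_diag]]

def dynamical_matrix(g, masses):
    """Mass-weighted force constants, Eq. (21)."""
    n = len(masses)
    return [[g[i][j] / math.sqrt(masses[i] * masses[j]) for j in range(n)]
            for i in range(n)]

def eigenvalues_sym2(d):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, in ascending order."""
    half_trace = 0.5 * (d[0][0] + d[1][1])
    det = d[0][0] * d[1][1] - d[0][1] * d[1][0]
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return half_trace - root, half_trace + root

def vibrational_free_energy(omegas, temperature, hbar=1.0, k_b=1.0):
    """Harmonic free energy: sum of hbar*w/2 + kT ln(1 - exp(-hbar*w/kT))."""
    kt = k_b * temperature
    return sum(0.5 * hbar * w + kt * math.log1p(-math.exp(-hbar * w / kt))
               for w in omegas)

d = dynamical_matrix(force_constants(4.0), [1.0, 3.0])
squares = eigenvalues_sym2(d)
omegas = [math.sqrt(x) for x in squares if x > 1e-12]   # drop the rigid translation
print(omegas, vibrational_free_energy(omegas, temperature=0.5))
```

The zero eigenvalue corresponds to a rigid translation and is excluded before taking square roots. For a realistic system the closed-form 2x2 solver would be replaced by a numerical eigenvalue routine, which, as noted in the text, becomes the costly step for large systems.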
If an EAM potential is used in the effective pair format and we need to compute $G_{ij}^{\alpha\beta}$ or $D_{ij}^{\alpha\beta}$ for the equilibrium reference structure, then all $F'_s(\bar{\rho}) = 0$ and Eqs. (17) and (19) are somewhat simplified. But even without this simplification, the computation of $G_{ij}^{\alpha\beta}$ directly from Eqs. (17) and (19) is a straightforward and relatively fast computational procedure. In fact, it is the diagonalization of the dynamical matrix rather than its construction that becomes the bottleneck of harmonic calculations for large systems.

Finally, we will provide EAM expressions for the unrelaxed vacancy formation energy. The change in $E_{\mathrm{tot}}$ accompanying the creation of a vacancy at a site $i$ without relaxation equals

\[ E_i = -\sum_{j \neq i} \Phi_{s_i s_j}(r_{ij}) - F_{s_i}(\bar{\rho}_i) + \sum_{j \neq i} \left[ F_{s_j}\!\left(\bar{\rho}_j - \rho_{s_i}(r_{ij})\right) - F_{s_j}(\bar{\rho}_j) \right], \tag{22} \]

where $\bar{\rho}_j$ is the host electron density at site $j \neq i$ before the vacancy creation. The first two terms in Eq. (22) account for the energy of broken bonds and the loss of the embedding energy of atom $i$, whereas the third term represents the changes in embedding energies of neighboring atoms $j$ due to the reduction in their host electron density upon removal of atom $i$. For an elemental metal whose crystal structure consists of symmetrically equivalent sites,$^2$ the unrelaxed vacancy formation energy equals $E_v = E_i + E_0$, where

\[ E_0 = \frac{1}{2} \sum_{j \neq i} \Phi(r_{ij}) + F(\bar{\rho}) \tag{23} \]

is the cohesive energy of the crystal (the choice of site $i$ is unimportant). Thus,

\[ E_v = -\frac{1}{2} \sum_{j \neq i} \Phi(r_{ij}) + \sum_{j \neq i} \left[ F\!\left(\bar{\rho} - \rho(r_{ij})\right) - F(\bar{\rho}) \right]. \tag{24} \]
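Equations (23) and (24) require only the list of neighbor distances around one site, as the following Python sketch for an unrelaxed fcc crystal shows. The toy functions, the lattice parameter, the hard cutoff and the function names are assumptions for illustration; the pair-potential limit is obtained here by setting the embedding function to zero.

```python
import math

# Hypothetical toy functions in reduced units (assumptions, not a fitted potential).
def phi(r):
    return math.exp(-2.0 * (r - 1.0)) - 2.0 * math.exp(-(r - 1.0))

def rho(r):
    return math.exp(-1.5 * r)

def sqrt_embedding(rhobar):
    return -math.sqrt(rhobar)

def no_embedding(rhobar):
    return 0.0

def fcc_neighbor_distances(a, rc):
    """Distances from one fcc site to all neighbors within the cutoff rc."""
    n = int(2.0 * rc / a) + 2
    dists = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                if (i + j + k) % 2 == 0 and (i, j, k) != (0, 0, 0):
                    r = 0.5 * a * math.sqrt(i * i + j * j + k * k)
                    if r <= rc:
                        dists.append(r)
    return dists

def cohesive_energy(dists, embed):
    """Energy per atom of the perfect crystal, Eq. (23)."""
    return 0.5 * sum(phi(r) for r in dists) + embed(sum(rho(r) for r in dists))

def unrelaxed_vacancy_energy(dists, embed):
    """Unrelaxed vacancy formation energy, Eq. (24)."""
    rhobar = sum(rho(r) for r in dists)
    broken_bonds = -0.5 * sum(phi(r) for r in dists)
    embedding_change = sum(embed(rhobar - rho(r)) - embed(rhobar) for r in dists)
    return broken_bonds + embedding_change

DISTS = fcc_neighbor_distances(math.sqrt(2.0), 1.8)   # hypothetical a and cutoff

e0_pair = cohesive_energy(DISTS, no_embedding)
ev_pair = unrelaxed_vacancy_energy(DISTS, no_embedding)
e0_eam = cohesive_energy(DISTS, sqrt_embedding)
ev_eam = unrelaxed_vacancy_energy(DISTS, sqrt_embedding)
print(ev_pair / -e0_pair, ev_eam / -e0_eam)   # first ratio is 1, second is below 1
```

With the embedding function switched off, the sketch returns a vacancy energy equal to minus the cohesive energy. With the square-root embedding function assumed here, the ratio falls below one, which is the qualitative effect of the embedding terms in Eq. (24).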
The relaxation typically decreases $E_v$ by 10–20%. For a pair potential, Eq. (24) leads to $E_v = -E_0$, a relation which overestimates experimental values of $E_v$ by more than a factor of two. For example, in copper $E_v = 1.27$ eV while $E_0 = -3.54$ eV (both experimental numbers). The embedding energy terms in Eq. (24) make the agreement with experiment much closer.

$^2$ Some structures, for example A15, contain nonequivalent sites.

For an alloy or compound, Eq. (22) only gives the so-called “raw” formation energy of a vacancy [12]. This energy alone is not sufficient for calculating the equilibrium vacancy concentration but it serves as one of the ingredients required for such calculations. For an ordered intermetallic compound, “raw” energies of vacancies and antisite defects need to be computed for each sublattice. Expressions similar to Eq. (22) can be readily developed for antisite defects. Another ingredient is the average cohesive energy of the compound,

\[ E_0 = \frac{1}{n_b} \sum_i \left[ \frac{1}{2} \sum_{j \neq i} \Phi_{s_i s_j}(r_{ij}) + F_{s_i}(\bar{\rho}_i) \right], \tag{25} \]
where the summation $\sum_i$ is over $n_b$ basis atoms and the summation $\sum_j$ is over all neighbors of atom $i$. The set of all “raw” formation energies of point defects and $E_0$ provides input for statistical-mechanical models describing dynamic equilibrium among point defects and allowing a numerical calculation of their equilibrium concentrations [12, 13]. Although relaxations can reduce the “raw” energies significantly, fast unrelaxed calculations are very useful when generating potentials or making preliminary tests.

EAM potentials serve as a workhorse in the overwhelming majority of atomistic simulations of metallic materials. They are widely used in simulations of grain boundaries and interfaces [14], dislocations [15], fracture [16], diffusion and other processes [17]. EAM potentials have a good record of delivering reasonable results for a wide variety of properties. For elemental metals, elastic constants and the vacancy formation energies are usually reproduced accurately. Surface energies tend to lie 10–20% below experiment, a problem that can hardly be solved within regular EAM. Surface relaxations and reconstructions usually agree with experiment at least qualitatively. Vacancy migration energies tend to underestimate experimental values unless specifically fit to them. Phonon dispersion curves, thermal expansion, melting temperatures, stacking fault energies, and structural energy differences may not come out accurate automatically but can be adjusted during the potential generation procedure (see below). For binary systems, experimental heats of phase formation and properties of individual ordered compounds can be fitted with reasonable accuracy. For some binary systems, even basic features of phase diagrams can be reproduced without fitting to experimental thermodynamic data [18]. However, in systems with multiple intermediate phases, transferability across the entire phase diagram can be problematic [18].
3. Generation and Testing of Atomistic Potentials
We will first discuss potential generation procedures for elemental metals. The EAM functions $\Phi(r)$ and $\rho(r)$ are usually described by analytical expressions containing five to seven fitting parameters each. Different authors use polynomials, exponentials, Morse, Lennard-Jones or Gaussian functions, or their combinations. In the absence of strong physical leads, any reasonable function can be acceptable as long as it works. It is important, however, to keep the functions simple and smooth. Oscillations and wiggles can lead to
rapid changes or even discontinuities in higher derivatives and cause unphysical effects in phonon frequencies, thermal expansion and other properties. The risk increases when analytical forms are replaced by cubic splines (discontinuous third derivative), especially with a large number of nodes. Increasing the number of fitting parameters should be done with great caution. The observed improvement in accuracy of fit can be illusory, as the potential may perform poorly for properties not included in the fit. Many sophisticated potentials contain hidden flaws that only reveal themselves under certain simulation conditions. As a rough rule of thumb, potentials whose $\Phi(r)$ and $\rho(r)$ together contain over 15 fitting parameters may lack reliability in applications. At the same time, using too few (say, fewer than 10) parameters may not take full advantage of the capabilities of EAM. Since the speed of atomistic simulations does not depend on the complexity of potential functions or the number of fitting parameters,$^3$ it makes sense to put effort into optimizing them for the best accuracy and reliability.

There are two ways of constructing the embedding function $F(\bar{\rho})$. One way is to describe it by an analytical function (or cubic spline [19]) with adjustable parameters. Another way is to postulate an equation of state of the ground-state structure. Most authors use the universal binding curve [20],

\[ E(a) = E_0 (1 + \alpha x)\, e^{-\alpha x}, \tag{26} \]
where $E(a)$ is the crystal energy per atom as a function of the lattice parameter $a$, $x = a/a_0 - 1$ ($a_0$ being the equilibrium value of $a$),

\[ \alpha = \sqrt{-\frac{9 \Omega_0 B}{E_0}}, \]

and $B$ is the bulk modulus. $F(\bar{\rho})$ is then obtained by inverting Eq. (26). Namely, by varying the lattice parameter we compute $\bar{\rho}(a)$ and $F(a) = E(a) - E_p(a)$, where $E(a)$ is given by Eq. (26) and $E_p(a)$ is the pair-interaction part of $E_{\mathrm{tot}}$. The functions $F(a)$ and $\bar{\rho}(a)$ thus obtained parametrically define $F(\bar{\rho})$. Notice that this procedure automatically guarantees an exact fit to $E_0$, $a_0$ and $B$. A slightly improved procedure is to add a higher-order term $\sim \beta x^3$ to the pre-exponential factor of Eq. (26) and use the additional parameter $\beta$ to fit to an experimental pressure-volume relation under large compressions [21].

$^3$ We assume that potential functions are used by the simulation program in a tabulated form.

Even if we do not postulate Eq. (26) and treat $F(\bar{\rho})$ as a function with parameters, $E_0$, $a_0$, and $B$ can still be matched exactly using Eq. (23) for $E_0$, the lattice equilibrium condition

\[ \frac{1}{2} \sum_{j \neq i} \Phi'(r_{ij})\, r_{ij} + F'(\bar{\rho}) \sum_{j \neq i} \rho'(r_{ij})\, r_{ij} = 0 \tag{27} \]

(follows from Eq. (12)) and the expression for $B$,

\[ 9 \Omega_0 B = \frac{1}{2} \sum_{j \neq i} \Phi''(r_{ij}) (r_{ij})^2 + F'(\bar{\rho}) \sum_{j \neq i} \rho''(r_{ij}) (r_{ij})^2 + F''(\bar{\rho}) \left[ \sum_{j \neq i} \rho'(r_{ij})\, r_{ij} \right]^2 \tag{28} \]

(can be derived from Eqs. (13) and (27)). These three equations can be readily satisfied by adjusting the values of $F(\bar{\rho})$, $F'(\bar{\rho})$ and $F''(\bar{\rho})$ at $a = a_0$.

Fitting parameters of a potential are optimized by minimizing the weighted mean squared deviation of properties from their target values. The weights are used as a means of controlling the importance of some properties over others. Some properties are included with a very small weight that only prevents unreasonable values without pursuing an actual fit. While early EAM potentials were fit to experimental properties only, the current trend is to include in the fitting database both experimental and first-principles data [19, 21, 22]. In fact, some of the recent potentials are predominantly fit to first-principles data and only use a few experimental numbers, which essentially makes them a parameterization of first-principles calculations. The incorporation of first-principles data into the fitting database improves the reliability of potentials by sampling larger areas of configuration space, including atomic configurations away from those represented by experimental data.

Experimental properties used for potential generation traditionally include $E_0$, $a_0$, elastic constants $c_{ij}$, the vacancy formation energy, and often the stacking fault energy. Thermal expansion factors, phonon frequencies, surface energies, and the vacancy migration energy can also be included. Depending on the intended use of the potential, some of these properties are strongly enforced while others are only used for a sanity check (small weight). First-principles data usually come in the form of energy–volume relations for the ground-state structure and several hypothetical “excited” structures of the same metal. The role of these structures is to probe various local environments and atomic volumes of the metal. 
This sampling improves the transferability of potentials to atomic configurations occurring during subsequent atomistic simulations. Furthermore, first-principles energies along uniform deformation paths between different structures are often calculated, such as the tetragonal deformation path between the FCC and BCC structures (Bain path) or the trigonal deformation path FCC – simple cubic – BCC. Such deformations, however, are normally used for testing potentials rather than fitting. An alternative way of using first-principles data is to fit to interatomic forces drawn from snapshots of first-principles molecular dynamics simulations for solid as well as liquid phases of a metal (force matching method) [19]. The liquid-phase configurations can improve the accuracy of the potential in melting simulations.
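The construction of the embedding function by inversion of the universal binding curve, Eq. (26), can be sketched as follows for an fcc metal. The pair and density functions, the values of $a_0$, $E_0$ and $B$ (in reduced units), the hard cutoff and the function names are all assumptions. The range of lattice parameters is kept narrow so that no coordination shell crosses the hard cutoff; a production potential would use smoothly truncated functions instead.

```python
import math

# Hypothetical pair and density functions in reduced units.
def phi(r):
    return math.exp(-2.0 * (r - 1.0)) - 2.0 * math.exp(-(r - 1.0))

def rho(r):
    return math.exp(-1.5 * r)

A0 = math.sqrt(2.0)      # assumed equilibrium lattice parameter
E0 = -1.0                # assumed cohesive energy per atom
B = 0.5                  # assumed bulk modulus
R_CUT = 1.85             # hard cutoff between the third and fourth fcc shells
ALPHA = math.sqrt(-9.0 * (A0 ** 3 / 4.0) * B / E0)   # Omega_0 = a0^3 / 4 for fcc

def fcc_neighbor_distances(a, rc):
    """Distances from one fcc site to all neighbors within the cutoff rc."""
    n = int(2.0 * rc / a) + 2
    dists = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                if (i + j + k) % 2 == 0 and (i, j, k) != (0, 0, 0):
                    r = 0.5 * a * math.sqrt(i * i + j * j + k * k)
                    if r <= rc:
                        dists.append(r)
    return dists

def rose_energy(a):
    """Universal binding curve, Eq. (26)."""
    x = a / A0 - 1.0
    return E0 * (1.0 + ALPHA * x) * math.exp(-ALPHA * x)

def pair_energy(a):
    """Pair-interaction part E_p(a) of the energy per atom."""
    return 0.5 * sum(phi(r) for r in fcc_neighbor_distances(a, R_CUT))

def embedding_table(lattice_parameters):
    """Pairs (rhobar(a), F(a)) with F(a) = E(a) - E_p(a)."""
    table = []
    for a in lattice_parameters:
        rhobar = sum(rho(r) for r in fcc_neighbor_distances(a, R_CUT))
        table.append((rhobar, rose_energy(a) - pair_energy(a)))
    return table

grid = [A0 * (0.95 + 0.005 * k) for k in range(21)]   # roughly 0.95 a0 to 1.05 a0
table = embedding_table(grid)
rhobar_0, f_0 = embedding_table([A0])[0]
print(rhobar_0, f_0)
```

The table of $(\bar{\rho}, F)$ pairs can then be interpolated, for example by a cubic spline, to give $F(\bar{\rho})$. By construction the sum of the pair energy and the embedding energy at $a = a_0$ reproduces $E_0$ exactly.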
To illustrate the accuracy achievable by modern EAM potentials, Table 1 summarizes selected properties of copper calculated with an EAM potential [23] in comparison with experiment. This particular potential was parameterized by simple analytical functions. A universal equation of state was not enforced and $F(\bar{\rho})$ was described by a polynomial. The cutoff radius of the potential, $R_c = 0.551$ nm, covers four coordination shells but the contribution of the fourth shell is extremely small. Besides experimental properties indicated in Table 1, the fitting database included two experimental phonon frequencies at the zone-boundary point X, a high pressure–volume relation and, with a small weight, the dimer bond energy $E_d$ and thermal expansion factors at several temperatures. The first-principles data included energy–volume relations for several structures. Only the FCC, HCP, and BCC structures were used in the fit, while other structures were deferred for testing. The potential demonstrates excellent agreement with experiment for both fitted and predicted properties, except for the surface energies which are too low. Phonon dispersion relations and thermal expansion factors are also in accurate agreement with experiment (Fig. 1). The potential accurately reproduces first-principles energies of alternate structures not included in the fit, as well as energies along several deformation paths between them.
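The weighted fitting objective used in potential generation can be sketched in a few lines of Python. The property values below are the experimental and EAM numbers for Cu quoted in Table 1; the weights, the use of relative rather than absolute deviations, and the function name are assumptions made for illustration and are not taken from Ref. [23].

```python
def weighted_mean_squared_deviation(calculated, targets, weights):
    """Weighted mean of squared relative deviations from the target values."""
    total_weight = sum(weights.values())
    return sum(w * ((calculated[name] - targets[name]) / targets[name]) ** 2
               for name, w in weights.items()) / total_weight

# Experimental targets and EAM values for Cu (Table 1): elastic constants in GPa,
# dimer bond energy in eV.
targets = {"c11": 170.0, "c12": 122.5, "c44": 75.8, "E_d": -2.05}
calculated = {"c11": 169.9, "c12": 122.6, "c44": 76.2, "E_d": -1.93}

# Hypothetical weights: the dimer energy only guards against unreasonable values.
weights = {"c11": 1.0, "c12": 1.0, "c44": 1.0, "E_d": 0.01}

objective = weighted_mean_squared_deviation(calculated, targets, weights)
print(objective)
```

In an actual fit this objective would be evaluated inside an optimizer that varies the parameters of the potential functions. Raising the weight of a poorly reproduced property, such as the dimer energy here, increases its share of the objective and pulls the fit towards it at the expense of the other properties.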
Table 1. Selected properties of Cu calculated with an embedded-atom potential [23] in comparison with experimental data (see [23] for experimental references). Notations: $E_v^f$ and $E_v^m$ – vacancy formation and migration energies, $E_i^f$ and $E_i^m$ – self-interstitial formation and migration energies, $\gamma_{SF}$ – intrinsic stacking fault energy, $\gamma_{us}$ – unstable stacking fault energy, $\gamma_s$ – surface energy, $\gamma_T$ – symmetrical twin boundary energy, $T_m$ – melting temperature, $R_d$ – dimer bond length, $E_d$ – dimer bond energy. All other notations are explained in the text. All defect energies were obtained by static relaxation at 0 K.

Property                           Experiment    EAM
$a_0$ (nm)$^a$                     0.3615        0.3615
$E_0$ (eV)$^a$                     −3.54         −3.54
$c_{11}$ (GPa)$^a$                 170.0         169.9
$c_{12}$ (GPa)$^a$                 122.5         122.6
$c_{44}$ (GPa)$^a$                 75.8          76.2
$E_v^f$ (eV)$^a$                   1.27          1.27
$E_v^m$ (eV)$^a$                   0.71          0.69
$E_i^f$ (eV)                       2.8–4.2       3.06
$E_i^m$ (eV)                       0.12          0.10
$\gamma_{SF}$ (mJ/m$^2$)$^a$       45            44.4
$\gamma_{us}$ (mJ/m$^2$)           –             158
$\gamma_T$ (mJ/m$^2$)              24            22.2
$\gamma_s$(111) (mJ/m$^2$)         1790$^b$      1239
$\gamma_s$(110) (mJ/m$^2$)         1790$^b$      1475
$\gamma_s$(100) (mJ/m$^2$)         1790$^b$      1345
$T_m$ (K)                          1357          1327$^c$
$R_d$ (nm)                         0.22          0.218
$E_d$ (eV)$^d$                     −2.05         −1.93

$^a$ Used in the fit. $^b$ Average orientation. $^c$ Calculated by molecular dynamics (interface velocity method). $^d$ Used in the fit with a small weight.
[Figure 1 graphic. Panel (a): phonon frequency (THz) versus reduced wave vector q along [q00], [qq0], and [qqq] (path Γ–X–K–Γ–L), with longitudinal (L) and transverse (T, T1, T2) branches; legend: EAM, Experiment. Panel (b): linear expansion (%) versus temperature (K) from 0 to 1400 K, with Tm marked; legend: EAM Monte Carlo, Experiment.]
Figure 1. Comparison of embedded-atom calculations [23] with experimental data for Cu. (a) phonon dispersion curves, (b) linear thermal expansion relative to room temperature. The discrepancy in thermal expansion at low temperatures is due to quantum effects that are not captured by classical Monte Carlo simulations.
For a binary system A–B, the simplest potential generation scheme is to utilize existing potentials for two metals A and B and only construct a cross-interaction function $\Phi_{AB}(r)$.[4]

[Footnote 4: An alternative approach is to optimize all seven potential functions simultaneously; see, for example, Ref. [24].]
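To gain additional fitting parameters we take advantage of the fact that the transformations

$$F_A(\bar{\rho}) \rightarrow F_A(\bar{\rho}) + g_A \bar{\rho}, \tag{29}$$
$$\Phi_{AA}(r) \rightarrow \Phi_{AA}(r) - 2 g_A \rho_A(r), \tag{30}$$
$$F_B(\bar{\rho}) \rightarrow F_B(\bar{\rho}) + g_B \bar{\rho}, \tag{31}$$
$$\Phi_{BB}(r) \rightarrow \Phi_{BB}(r) - 2 g_B \rho_B(r), \tag{32}$$
$$\rho_B(r) \rightarrow p_B \rho_B(r), \tag{33}$$
$$F_B(\bar{\rho}) \rightarrow F_B(\bar{\rho}/p_B) \tag{34}$$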
leave the energies of elements A and B invariant while altering energies of binary alloys. Thus, $p_B$, $g_A$, and $g_B$ can be treated as adjustable parameters. After the fit, the new potential functions can be converted to the binary effective pair format by applying the invariant transformations of Eqs. (3)–(6) with $g_A = -F'_A(\bar{\rho}_A)$ and $g_B = -F'_B(\bar{\rho}_B)$, where $\bar{\rho}_A$ and $\bar{\rho}_B$ are the host electron densities in a reference compound. It should be remembered that the binary effective pair format thus obtained will produce elemental potential functions different from the initial ones. Thus, if the initial elemental potentials were in the effective pair format, it will generally be destroyed by the fitting process. Indeed, the reference state of an elemental potential is its ground state, while the reference state of the binary system is a particular binary compound. Physically, however, both elemental potentials will remain exactly the same. All these mathematical transformations should be carefully observed when comparing different potentials or reconstructing them from published parameters.

Experimental properties used for optimizing a binary potential typically include $E_0$, $a_0$, and $c_{ij}$ of a chosen intermetallic compound. For structural intermetallics, energies of generalized planar faults involved in dislocation dissociations can also be used in the fit to improve the applicability of the potential to simulations of mechanical behavior [15]. Fracture simulations [16] may additionally require reasonable surface energies, which can be adjusted to some extent during the fitting procedure. On the other hand, for thermodynamic and diffusion simulations it is more important to reproduce the heat of the compound formation and point defect characteristics. As with pure metals, the current trend in constructing binary potentials is to incorporate first-principles data, usually in the form of energy–volume relations for experimentally observed and hypothetical compounds.
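The invariance property used above can be checked numerically. The following is a minimal sketch, assuming toy analytical functions (a Morse-like pair term, exponential densities, a square-root embedding function) and a random six-atom cluster; none of these choices come from a fitted potential, and the names `eam_energy` and `transform_A` are hypothetical. It applies the transformation of element A with an arbitrary $g_A$ and compares energies of a pure-A cluster and a mixed A–B cluster before and after.

```python
import numpy as np

def eam_energy(pos, species, phi, rho, F):
    """E = 1/2 sum_{i != j} phi_{si sj}(r_ij) + sum_i F_{si}(rho_bar_i)."""
    energy = 0.0
    for i in range(len(pos)):
        rho_bar = 0.0
        for j in range(len(pos)):
            if i == j:
                continue
            r = np.linalg.norm(pos[i] - pos[j])
            key = tuple(sorted((species[i], species[j])))
            energy += 0.5 * phi[key](r)
            rho_bar += rho[species[j]](r)   # density donated by neighbor j
        energy += F[species[i]](rho_bar)
    return energy

def transform_A(phi, rho, F, g):
    """Apply F_A -> F_A + g*rho_bar and Phi_AA -> Phi_AA - 2*g*rho_A."""
    phi_new, F_new = dict(phi), dict(F)
    F_new["A"] = lambda x: F["A"](x) + g * x
    phi_new[("A", "A")] = lambda r: phi[("A", "A")](r) - 2.0 * g * rho["A"](r)
    return phi_new, F_new

# Toy functions: illustrative assumptions only, not a real Cu or alloy potential.
morse = lambda r: np.exp(-2.0 * (r - 1.0)) - 2.0 * np.exp(-(r - 1.0))
phi = {("A", "A"): morse, ("A", "B"): morse, ("B", "B"): morse}
rho = {"A": lambda r: np.exp(-1.5 * r), "B": lambda r: 0.7 * np.exp(-1.2 * r)}
F = {"A": lambda x: -np.sqrt(x), "B": lambda x: -0.8 * np.sqrt(x)}

rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 3.0, size=(6, 3))
pure, mixed = ["A"] * 6, ["A", "B"] * 3

phi_t, F_t = transform_A(phi, rho, F, g=0.3)
e_pure, e_pure_t = eam_energy(pos, pure, phi, rho, F), eam_energy(pos, pure, phi_t, rho, F_t)
e_mix, e_mix_t = eam_energy(pos, mixed, phi, rho, F), eam_energy(pos, mixed, phi_t, rho, F_t)

print("pure A :", e_pure, e_pure_t)   # equal up to round-off
print("mixed  :", e_mix, e_mix_t)     # differ: g acts as an alloy fitting parameter
```

In the mixed cluster the energy shift equals $g_A$ times the density that B atoms donate to A atoms, which is why the transformation parameters only affect alloy energies.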
The transferability of a potential can be significantly improved by including compounds with several different stoichiometries across the entire phase diagram [18, 21, 24]. Even if such compounds do not actually exist on the experimental diagram, they sample a broader area of configuration space and secure reasonable energies of various environments and chemical compositions that may occur locally during atomistic simulations, for example, in core regions of lattice
defects. Some of the recent binary potentials only use a few experimental numbers but otherwise heavily rely on first-principles input [18]. Besides structural energies, such input may include energies along deformation paths between compounds, energies of stable and unstable planar faults, point defect energies and other data. Some of this information can be deferred from the fitting database and used for testing the potential. The most critical test of transferability of a binary potential is its ability to reproduce the phase diagram at least qualitatively. Unfortunately, many existing potentials are nicely fit to specific properties of a particular compound but fail to describe other structures and compositions with any acceptable accuracy. Such potentials can easily produce incorrect structures of grain boundaries, interfaces or any other defects whose local chemical composition deviates significantly from the bulk composition.

A challenge of future research is to establish a procedure for generating reliable EAM potentials for ternary systems. A carefully chosen model system A–B–C must be used as a testing ground. The first step would be to simply construct three binary potentials, A–B, B–C, and C–A, based on the same set of high-quality elemental potentials and capable of reproducing the relevant binary phase diagrams at least on a qualitative level. Such potentials should be based on extensive first-principles input and a smart procedure for a simultaneous optimization of the transformation parameters $g_s$ and $p_s$ relating to different binaries. The critical test of this potential set would be an evaluation of thermodynamic stability of ternary compounds existing on the experimental diagram. At the next step, calculated properties of such compounds can be improved by further adjustments of the binary potentials.
4. Angular-dependent Potentials
EAM potentials work best for simple and noble metals but are less accurate for transition metals. The latter reflects an intrinsic limitation of EAM, which is essentially a central-force model that cannot capture the covalent component of bonding arising due to d-electrons in transition metals. Baskes et al. [25–28] developed a non-central-force extension of EAM, which they called the modified embedded-atom method (MEAM). In MEAM, electron density is treated as a tensor quantity and the host electron density $\bar{\rho}_i$ is expressed as a function of the respective tensor invariants. In the simplest approximation, $\bar{\rho}_i$ is given by the expansion
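$$(\bar{\rho}_i)^2 = \left(\bar{\rho}_i^{(0)}\right)^2 + \left(\bar{\rho}_i^{(1)}\right)^2 + \left(\bar{\rho}_i^{(2)}\right)^2 + \left(\bar{\rho}_i^{(3)}\right)^2, \tag{35}$$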
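where

$$\left(\bar{\rho}_i^{(0)}\right)^2 = \left[\sum_{j \neq i} \rho_{s_j}^{(0)}(r_{ij})\right]^2, \tag{36}$$

$$\left(\bar{\rho}_i^{(1)}\right)^2 = \sum_{\alpha} \left[\sum_{j \neq i} \rho_{s_j}^{(1)}(r_{ij}) \frac{r_{ij}^{\alpha}}{r_{ij}}\right]^2, \tag{37}$$

$$\left(\bar{\rho}_i^{(2)}\right)^2 = \sum_{\alpha,\beta} \left[\sum_{j \neq i} \rho_{s_j}^{(2)}(r_{ij}) \frac{r_{ij}^{\alpha} r_{ij}^{\beta}}{r_{ij}^2}\right]^2 - \frac{1}{3}\left[\sum_{j \neq i} \rho_{s_j}^{(2)}(r_{ij})\right]^2, \tag{38}$$

$$\left(\bar{\rho}_i^{(3)}\right)^2 = \sum_{\alpha,\beta,\gamma} \left[\sum_{j \neq i} \rho_{s_j}^{(3)}(r_{ij}) \frac{r_{ij}^{\alpha} r_{ij}^{\beta} r_{ij}^{\gamma}}{r_{ij}^3}\right]^2. \tag{39}$$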
The terms $\bar{\rho}_i^{(k)}$ ($k = 0, 1, 2, 3$) can be thought of as representing contributions of s, p, d, and f electronic orbitals, respectively. It should be emphasized, however, that the exact relation of these terms to electronic orbitals is not physically clear and Eqs. (35)–(39) can as well be viewed as ad hoc expressions whose only role is to introduce non-spherical components of bonding. The regular EAM is recovered by including only the electron density of “s-orbitals,” $\bar{\rho}_i^{(0)}$, and neglecting all other terms. In comparison with regular EAM, MEAM introduces three new functions, $\rho_s^{(1)}(r)$, $\rho_s^{(2)}(r)$, and $\rho_s^{(3)}(r)$ for each species $s$, which are fit to experimental and first-principles data in much the same manner as in EAM. While EAM potentials are smoothly truncated at a sphere embracing several coordination shells, MEAM includes only one or two coordination shells but introduces a many-body “screening” procedure described in detail by Baskes [27, 29]. Computationally, MEAM is roughly a factor of five to six slower than EAM but can be more accurate for transition metals. It has even been successfully applied to covalent solids, including Si and Ge [27]. Advantages of MEAM over EAM are particularly strong for noncentrosymmetric structures and materials with a negative Cauchy pressure. The latter can be readily reproduced by angular-dependent terms while keeping $F''(\bar{\rho}) > 0$. MEAM potentials have been constructed for a number of metals [27, 29, 30] and intermetallic compounds [31, 32].

Pasianot et al. [33] proposed a slightly different way of incorporating angular interactions into EAM. In their so-called embedded-defect method (EDM), the total energy is written in the form

$$E_{tot} = \frac{1}{2} \sum_{i,j\,(j \neq i)} \Phi_{s_i s_j}(r_{ij}) + \sum_i F_{s_i}(\bar{\rho}_i) + G \sum_i Y_i, \tag{40}$$

where

$$\bar{\rho}_i = \sum_{j \neq i} \rho_{s_j}(r_{ij}), \tag{41}$$

$$Y_i = \sum_{\alpha,\beta} \left[\sum_{j \neq i} \rho_{s_j}(r_{ij}) \frac{r_{ij}^{\alpha} r_{ij}^{\beta}}{r_{ij}^2}\right]^2 - \frac{1}{3}\left[\sum_{j \neq i} \rho_{s_j}(r_{ij})\right]^2. \tag{42}$$
Expression (40) was originally derived from physical considerations different from those underlying MEAM. Mathematically, however, Eqs. (40)–(42) present a particular case of Eqs. (35)–(39) in which $\bar{\rho}_i^{(1)}$ and $\bar{\rho}_i^{(3)}$ are neglected, $F(\bar{\rho}_i)$ is approximated by a linear expansion in terms of the small perturbation $\left(\bar{\rho}_i^{(2)}\right)^2$, and the latter is expressed through the undisturbed electron density function $\rho_s(r)$: $\rho_s^{(2)}(r) \equiv \rho_s(r)$. In comparison with EAM, EDM introduces only one additional parameter, $G$. Like EAM, EDM uses cutoff functions, thus avoiding the MEAM screening procedure. EDM potentials have been successfully constructed for several HCP [33] and BCC transition metals [33, 35–37]. While EDM is computationally faster than MEAM, it is less general and offers fewer fitting parameters for the angular part. However, the original EDM formulation can be readily generalized by including more angular-dependent terms:

$$E_{tot} = \frac{1}{2} \sum_{i,j\,(j \neq i)} \Phi_{s_i s_j}(r_{ij}) + \sum_i F_{s_i}(\bar{\rho}_i) + \sum_i \left[\left(\bar{\rho}_i^{(1)}\right)^2 + \left(\bar{\rho}_i^{(2)}\right)^2 + \left(\bar{\rho}_i^{(3)}\right)^2\right], \tag{43}$$
where $\bar{\rho}_i^{(k)}$ are expressed through parameterized functions $\rho_s^{(k)}(r)$ by Eqs. (37)–(39). Overall, MEAM, EDM, and Eq. (43) are all equally legitimate empirical expressions introducing angular-dependent forces. The role of the $\bar{\rho}_i^{(k)}$ terms is simply to penalize $E_{tot}$ for deviations from local cubic symmetry. These terms do not affect the energy–volume relations for cubic crystals but are important for structures with broken local cubic symmetry. Thus, energies of many common crystal structures, such as L1$_2$, L1$_0$, and L1$_1$, depend on the “quadrupole” term $\bar{\rho}_i^{(2)}$. This dependence opens new degrees of freedom for reproducing structural energies of intermetallic compounds. Since nonhydrostatic strains break cubic symmetry, $\bar{\rho}_i^{(2)}$ also affects elastic constants, which enables their more accurate fit and a reproduction of negative Cauchy pressures. In some structures, such as diamond and some binary compounds, elastic constants are also affected by the “dipole” term $\bar{\rho}_i^{(1)}$. Areas of broken symmetry inevitably
exist around lattice defects. Due to the additional penalty arising from angular terms, defect energies can be larger than in EAM. In particular, it becomes possible to reproduce higher surface energies and a more accurate vacancy migration energy. In sum, angular-dependent terms can improve the accuracy of fit of potentials in comparison with regular EAM. However, the effect of such terms on the transferability of potential needs to be studied in more detail.
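The statement that the angular terms vanish for local cubic symmetry and switch on when that symmetry is broken can be illustrated with a short numerical sketch of Eqs. (37)–(39). It assumes a single element, the same exponential function $\rho^{(k)}(r) = e^{-r}$ for all $k$, and only the first FCC coordination shell; these choices, and the helper names `fcc_first_shell` and `angular_terms`, are illustrative assumptions rather than part of any published MEAM or EDM parameterization.

```python
import itertools
import numpy as np

def fcc_first_shell(a=1.0):
    """The 12 nearest-neighbor vectors (a/2)(+-1, +-1, 0) and permutations."""
    vecs = set()
    for i, j in itertools.combinations(range(3), 2):
        for si, sj in itertools.product((-1, 1), repeat=2):
            v = [0, 0, 0]
            v[i], v[j] = si, sj
            vecs.add(tuple(v))
    return 0.5 * a * np.array(sorted(vecs), dtype=float)

def angular_terms(neigh, rho=lambda r: np.exp(-r)):
    """Return the squared dipole, quadrupole and octupole terms, Eqs. (37)-(39)."""
    r = np.linalg.norm(neigh, axis=1)
    n = neigh / r[:, None]                      # direction cosines r_ij^alpha / r_ij
    w = rho(r)                                  # assumed rho^(k)(r_ij), same for all k
    d1 = np.einsum("j,ja->a", w, n)
    d2 = np.einsum("j,ja,jb->ab", w, n, n)
    d3 = np.einsum("j,ja,jb,jc->abc", w, n, n, n)
    rho1_sq = np.sum(d1 ** 2)
    rho2_sq = np.sum(d2 ** 2) - w.sum() ** 2 / 3.0
    rho3_sq = np.sum(d3 ** 2)
    return rho1_sq, rho2_sq, rho3_sq

shell = fcc_first_shell()

shear = np.eye(3)
shear[0, 1] = 0.02                              # small simple shear, x -> x + 0.02 y
sheared = shell @ shear.T

perfect_terms = angular_terms(shell)            # cubic environment
sheared_terms = angular_terms(sheared)          # nonhydrostatic strain
vacancy_terms = angular_terms(shell[1:])        # one neighbor missing

print("perfect FCC     :", perfect_terms)
print("sheared FCC     :", sheared_terms)
print("missing neighbor:", vacancy_terms)
```

Under the homogeneous shear the environment keeps its inversion symmetry, so only the quadrupole term becomes nonzero, while removing a neighbor activates all three terms.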
5. Outlook
Embedded-atom potentials provide a reasonable description of a broad spectrum of properties of metallic systems and enable fast atomistic simulations of a variety of processes ranging from thermodynamic functions and diffusion to plastic deformation and fracture. There are intrinsic limitations of EAM, which is still a semi-empirical model based on central-force interactions. Such limitations set boundaries to the accuracy achievable within this method. However, the accuracy and robustness of EAM potentials gradually improve, within those boundaries, by developing more efficient fitting and testing procedures, using larger data sets, and most importantly, increasing the weight of first-principles data. The latter trend may eventually transform the method to a parameterization, or mapping, of first-principles data.

Much work needs to be done to improve transferability of binary EAM potentials. This, again, can be achieved by further optimizing the potential generation procedures and using more first-principles data. The most severe test of a binary potential is its ability to predict the correct phase stability across the entire phase diagram. It is not quite clear at this point how far EAM can be pushed in that direction, but this certainly deserves to be explored. Reliable ternary potentials remain a grand challenge of future research.

Presently, the only way of generalizing EAM to include non-central interactions is to introduce energy penalties for local deviations from cubic symmetry. This can be achieved by calculating local dipole, quadrupole, and perhaps higher order tensors and making the energy a function of their invariants. Depending on the initial physical motivation behind such tensors and some technical details (such as cutoff functions versus screening), this idea has been implemented first in MEAM and later in EDM.
It should be emphasized, however, that other equally legitimate forms of an angular-dependent potential can be readily constructed in the same spirit, Eq. (43) being just one example. Since there is no unique physical justification for those different forms, they all can simply be viewed as useful empirical expressions. Both MEAM and EDM potentials have been developed for a number of transition metals and have demonstrated an improved accuracy in reproducing their properties. MEAM has also been applied, with significant success, to
intermetallic compounds and even covalent solids. Future work may further develop this group of methods towards binary and eventually ternary systems.
References
[1] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd edn., Academic, San Diego, 2002.
[2] D.P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, Cambridge, 2000.
[3] M.S. Daw and M.I. Baskes, “Embedded-atom method: derivation and application to impurities, surfaces, and other defects in metals,” Phys. Rev. B, 29, 6443–6453, 1984.
[4] M.W. Finnis and J.E. Sinclair, “A simple empirical N-body potential for transition metals,” Philos. Mag. A, 50, 45–55, 1984.
[5] J.K. Nørskov, “Covalent effects in the effective-medium theory of chemical binding: Hydrogen heats of solution in the 3d metals,” Phys. Rev. B, 26, 2875–2885, 1982.
[6] D.G. Pettifor, Bonding and Structure of Molecules and Solids, Clarendon Press, Oxford, 1995.
[7] M.S. Daw, “Embedded-atom method: many-body description of metallic cohesion,” In: V. Vitek and D.J. Srolovitz (eds.), Atomistic Simulation of Materials: Beyond Pair Potentials, Plenum Press, New York, pp. 181–191, 1989.
[8] M.S. Daw and R.L. Hatcher, “Application of the embedded atom method to phonons in transition metals,” Solid State Comm., 56, 697–699, 1985.
[9] A. Van de Walle and G. Ceder, “The effect of lattice vibrations on substitutional alloy thermodynamics,” Rev. Mod. Phys., 74, 11–45, 2002.
[10] J.M. Rickman and R. LeSar, “Free-energy calculations in materials research,” Annu. Rev. Mater. Res., 32, 195–217, 2002.
[11] S.M. Foiles, “Evaluation of harmonic methods for calculating the free energy of defects in solids,” Phys. Rev. B, 49, 14930–14938, 1994.
[12] Y. Mishin and C. Herzig, “Diffusion in the Ti-Al system,” Acta Mater., 48, 589–623, 2000.
[13] M. Hagen and M.W. Finnis, “Point defects and chemical potentials in ordered alloys,” Philos. Mag. A, 77, 447–464, 1998.
[14] D. Wolf, Handbook of Materials Modeling, vol. 1, Chapter 8, Interfaces, 2004.
[15] W. Cai, “Modeling dislocations using a periodic cell,” Article 2.21, this volume.
[16] D. Farkas and R. Selinger, “Atomistics of fracture,” Article 2.33, this volume.
[17] A.F. Voter, “The embedded-atom method,” In: J.H. Westbrook and R.L. Fleischer (eds.), Intermetallic Compounds, vol. 1, John Wiley & Sons, New York, pp. 77–90, 1994.
[18] Y. Mishin, “Atomistic modeling of the γ and γ′ phases of the Ni-Al system,” Acta Mater., 52, 1451–1467, 2004.
[19] F. Ercolessi and J.B. Adams, “Interatomic potentials from first-principles calculations: the force-matching method,” Europhys. Lett., 26, 583–588, 1994.
[20] J.H. Rose, J.R. Smith, F. Guinea, and J. Ferrante, “Universal features of the equation of state of metals,” Phys. Rev. B, 29, 2963–2969, 1984.
[21] R.R. Zope and Y. Mishin, “Interatomic potentials for atomistic simulations of the Ti-Al system,” Phys. Rev. B, 68, 024102, 2003.
[22] Y. Mishin, D. Farkas, M.J. Mehl, and D.A. Papaconstantopoulos, “Interatomic potentials for monoatomic metals from experimental data and ab initio calculations,” Phys. Rev. B, 59, 3393–3407, 1999.
[23] Y. Mishin, M.J. Mehl, D.A. Papaconstantopoulos, A.F. Voter, and J.D. Kress, “Structural stability and lattice defects in copper: ab initio, tight-binding and embedded-atom calculations,” Phys. Rev. B, 63, 224106, 2001.
[24] Y. Mishin, M.J. Mehl, and D.A. Papaconstantopoulos, “Embedded-atom potential for B2-NiAl,” Phys. Rev. B, 65, 224114, 2002.
[25] M.I. Baskes, “Application of the embedded-atom method to covalent materials: a semi-empirical potential for silicon,” Phys. Rev. Lett., 59, 2666–2669, 1987.
[26] M.I. Baskes, J.S. Nelson, and A.F. Wright, “Semiempirical modified embedded-atom potentials for silicon and germanium,” Phys. Rev. B, 40, 6085–6110, 1989.
[27] M.I. Baskes, “Modified embedded-atom potentials for cubic metals and impurities,” Phys. Rev. B, 46, 2727–2742, 1992.
[28] M.I. Baskes, J.E. Angelo, and C.L. Bisson, “Atomistic calculations of composite interfaces,” Modelling Simul. Mater. Sci. Eng., 2, 505–518, 1994.
[29] M.I. Baskes, “Determination of modified embedded atom method parameters for nickel,” Mater. Chem. Phys., 50, 152–158, 1997.
[30] M.I. Baskes and R.A. Johnson, “Modified embedded-atom potentials for HCP metals,” Modelling Simul. Mater. Sci. Eng., 2, 147–163, 1994.
[31] M.I. Baskes, “Atomic potentials for the molybdenum–silicon system,” Mater. Sci. Eng. A, 261, 165–168, 1999.
[32] D. Chen, M. Yan, and Y.F. Liu, “Modified embedded-atom potential for L1$_0$-TiAl,” Scripta Mater., 40, 913–920, 1999.
[33] R. Pasianot, D. Farkas, and E.J. Savino, “Empirical many-body interatomic potentials for bcc transition metals,” Phys. Rev. B, 43, 6952–6961, 1991.
[34] J.R. Fernandez, A.M. Monti, and R.C. Pasianot, “Point defects diffusion in α-Ti,” J. Nucl. Mater., 229, 1–9, 1995.
[35] G. Simonelli, R. Pasianot, and E.J. Savino, “Point-defect computer simulation including angular forces in bcc iron,” Phys. Rev. B, 50, 727–738, 1994.
[36] G. Simonelli, R. Pasianot, and E.J. Savino, “Phonon-dispersion curves for transition metals within the embedded-atom and embedded-defect methods,” Phys. Rev. B, 55, 5570–5573, 1997.
[37] G. Simonelli, R. Pasianot, and E.J. Savino, “Self-interstitial configuration in BCC metals. An analysis based on many-body potentials for Fe and Mo,” Phys. Status Solidi (b), 217, 747–758, 2000.
2.3 INTERATOMIC POTENTIAL MODELS FOR IONIC MATERIALS
Julian D. Gale
Nanochemistry Research Institute, Department of Applied Chemistry, Curtin University of Technology, Perth, 6845, Western Australia
S. Yip (ed.), Handbook of Materials Modeling, 479–497. © 2005 Springer. Printed in the Netherlands.

Ionic materials are present in many key technological applications of the modern era, from solid state batteries and fuel cells, nuclear waste immobilization, through to industrial heterogeneous catalysis, such as that found in automotive exhaust systems. With the boundless possibilities for their utilization, it is natural that there has been a long history of computer simulation of their structure and properties in order to understand the materials science of these systems at the atomic level.

The classification of materials into different types is, of course, an arbitrary and subjective decision. However, when a binary compound is composed of two elements with very different electronegativities, as is the case for oxides and halides in particular, then it is convenient to regard it as being an ionic solid. The implication is that, as a result of charge transfer from one element to the other, the dominant binding force between particles is the Coulombic attraction between opposite charges. Such materials tend to be characterized by close-packed, dense structures that show no strong directionality in the bonding. Typically, most ionic materials possess a large band gap and are therefore insulating. As a consequence, the notion that the solid is composed of spherical ions whose interactions can be represented by simple distance-dependent functional forms is quite a reasonable one, since overtly quantum mechanical effects are lesser than in materials where covalent bonding occurs. Thus it is possible to develop force fields that are specific for ionic materials, and this approach can be surprisingly successful considering the simplicity of the interatomic potential model.

When considering how to construct a force field for ionic materials, the starting point, as is the case for all types of system, is to assume that the total energy, $U_{tot}$, can be decomposed into interactions between different numbers of atoms:

$$U_{tot} = \frac{1}{2!} \sum_i \sum_j U_{ij} + \frac{1}{3!} \sum_i \sum_j \sum_k U_{ijk} + \frac{1}{4!} \sum_i \sum_j \sum_k \sum_l U_{ijkl} + \cdots$$
Here, $U_{ij}$ is the energy of interaction between a pair of atoms, $i$ and $j$, or so-called two-body interaction energy; $U_{ijk}$ is the extra interaction that arises (beyond the sum of the three two-body energy components for the pairs $i$–$j$, $j$–$k$, and $i$–$k$) when a triad of atoms is considered, and so forth for higher order terms. Note that the inverse factorial prefactor is required to avoid double counting of interactions between particles. In principle, the above decomposition is exact if carried out to terms of high enough order. However, in practice it is necessary to truncate the expansion at some point. For many ionic materials it is often sufficient to only include the two-body term, though the extensions beyond this will be discussed later.

Imagining an ionic solid as being composed of cations and anions whose electron densities are frozen, which represents the simplest possible case, the physical interactions present can be intuitively understood. There will obviously be a Coulombic attraction between ions of opposite charge, with a corresponding repulsive force between those of like nature. Because ions are arranged such that the closest neighbours are of opposite sign, this gives rise to a strong net attractive energy that will tend to contract the solid in order to lower the energy. In order that an equilibrium structure is obtained there must be a counterbalancing repulsive force. This arises from the overlap of the electron densities of two ions, regardless of the sign of their charge, and has its origin in the Pauli repulsion between electrons. Hence, we can write the breakdown of the two-body energy in general terms as:

$$U_{ij} = U_{ij}^{Coulomb} + U_{ij}^{repulsive}$$

While real spherical ions will have a radial electron density distribution, it is convenient to treat the ions as point charges, i.e., as though all the electron density is situated at the nucleus. Within this approximation, the electrostatic interaction of two charged particles is just given by Coulomb’s law;

$$U_{ij}^{Coulomb} = \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}}$$

or, if written in atomic units, as will subsequently be done, we can drop the constant factor of $4\pi\varepsilon_0$:

$$U_{ij}^{Coulomb} = \frac{q_i q_j}{r_{ij}}$$
The error in the electrostatic energy arising from the point charge approximation is usually subsumed into the repulsive energy contribution, since this latter term is usually derived by a fitting procedure, rather than from direct theoretical considerations.
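As a small illustration of the two-body decomposition, the sketch below evaluates $U_{ij} = U_{ij}^{Coulomb} + U_{ij}^{repulsive}$ in atomic units for a single cation–anion pair and locates the separation at which attraction and repulsion balance. The exponential form chosen for the repulsion and the parameters `A` and `rho` are assumptions made purely for illustration; they are not fitted values for any real material.

```python
import numpy as np

def pair_energy(r, qi, qj, A, rho):
    """Two-body energy in atomic units: point-charge Coulomb term plus an
    assumed exponential short-range repulsion A*exp(-r/rho)."""
    return qi * qj / r + A * np.exp(-r / rho)

# Hypothetical parameters (hartree and bohr), chosen only to give a visible minimum.
A, rho = 50.0, 0.6
r = np.linspace(1.0, 15.0, 14001)          # separations in bohr, 0.001 bohr spacing
u = pair_energy(r, +1.0, -1.0, A, rho)

k = int(np.argmin(u))
r_eq, u_eq = r[k], u[k]
print(f"equilibrium separation ~ {r_eq:.3f} bohr, energy ~ {u_eq:.4f} hartree")
```

The grid deliberately starts at 1 bohr: with a finite exponential repulsion the point-charge attraction would win again at very small separations, which is one reason the short-range term is fitted rather than derived.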
1. Calculating the Electrostatic Energy
Not only is the electrostatic energy the dominant contribution to the total value, but it turns out that it is actually the most difficult to evaluate. While it is easy to write down that the electrostatic energy is the sum over all pairwise interactions, including all periodic images of the unit cell, the complication arises because the sum must be truncated for actual computation. Unfortunately, the summation is an example of a conditionally convergent series, i.e., the value of the sum depends on how the truncation is made. The reason for this can be understood by considering the interactions of a single ion with all other ions within a given radius, $r$. The convergence of the energy of interaction, $U_{tot}^r$, is given by the number of interactions, $N_r$, multiplied by the magnitude of the interaction, $U^r$:

$$U_{tot}^r = \sum_r N_r U^r$$
As $r$ increases, the number of interactions rises in proportion to the surface area of the cut-off sphere: $N_r \propto 4\pi r^2$. However, the interaction itself only decreases as the inverse power of $r$, as has been shown previously. Consequently, the magnitude of interaction potentially increases as the cut-off radius is extended. The fact that the magnitude converges in practice relies on the fact that there is cancelation between interactions with cations and anions.

It turns out that the electrostatic energy of a system actually depends on the macroscopic state of a crystal due to the long-ranged effect of Coulomb fields. In other words, it is not purely a property of the bulk crystal, but also depends, in general, on the nature of the surfaces and of the crystal morphology [3]. To make it feasible to define an electrostatic energy that is useful for the simulation of ionic materials, it is conventional to impose two conditions on the Coulomb summation:

1. The sum of the charges within the system must be equal to zero:

$$\sum_i q_i = 0$$

2. The total dipole moment of the system in all directions must also be equal to zero:

$$\mu_x = \mu_y = \mu_z = 0$$

If these conditions are satisfied, the electrostatic energy will always converge to the same value as the cut-off radius is incremented. It is also possible to define the electrostatic energy when the dipole moments along the three Cartesian axes differ from zero. This Coulomb energy is related to the value obtained when the dipole moment is zero, $U^0$, according to the following expression;

$$U = U^0 + \frac{2\pi}{3V}\left(\mu_x^2 + \mu_y^2 + \mu_z^2\right)$$

where $V$ is the volume of the unit cell. Considering the expression for the dipole moment in a given direction, $\alpha$;

$$\mu_\alpha = \sum_i q_i r_i^{\alpha}$$
where $r_i^{\alpha}$ is the position of the $i$th ion projected on to this axis, then there is a complication. Because there are multiple images of the same ion, due to the presence of periodic boundary conditions, the dipole contribution of any given ion is an ambiguous quantity. The only way to determine the true dipole moment is to perform the sum over all ions within the entire crystal, which includes those ions at the surface. This is the origin of the electrostatic energy being a macroscopic property of the system.

While it has been stated that the electrostatic energy is convergent if the above conditions are obeyed, it is not obvious how to achieve this in practice for a general crystal structure. Various methods have been proposed, the best known of which is that of Evjen, who constructed charge-neutral shells of ions about each interacting particle. However, this is more difficult to automate for a computational implementation and is best for high symmetry structures. Apart from the need to converge to a defined electrostatic energy, there is also the issue of how rapidly the sum converges, since it is required that the calculation be fast for numerical evaluation.

By far the dominant approach to evaluating the electrostatic energy is through the use of the summation method due to Ewald, which aims to accelerate the convergence by partially transforming the expression into reciprocal space. While the details of the derivation are beyond the scope of this text, and can be found elsewhere [2, 9], the concepts behind the approach and the final result will be given below. In Ewald’s approach, a Gaussian charge distribution of equal magnitude, but opposite sign, is placed at the position of every ion in the crystal. Because the charges cancel, all but for the contribution from the differing
shape of the distribution, the resulting electrostatic interaction between ions is now rapidly convergent when summed out in real space and converges to the energy $U^{real}$. In order to recover the original electrostatic energy it is then necessary to compute two further terms. Firstly, the interaction of the Gaussian charge distributions with each other must be subtracted. Because of the smooth nature of the electrostatic potential arising from such a distribution, it is possible to efficiently evaluate this term, $U^{recip}$, by expanding the charge density in planewaves with the periodicity of the reciprocal lattice. Again, the energy contribution is rapidly convergent with respect to the cut-off radius within reciprocal space. Finally, there is the self-energy, $U^{self}$, that arises from the interaction of the Gaussian with itself. Mathematically, the Ewald sum is derived by a Laplace transform of the Coulomb energy and the final expressions are given below;

$$U^{Coulomb} = U^{real} + U^{recip} + U^{self}$$

$$U^{real} = \frac{1}{2} \sum_{R} \sum_i \sum_j \frac{q_i q_j}{r_{ij}} \,\mathrm{erfc}\!\left(\eta^{1/2} r_{ij}\right)$$

$$U^{recip} = \frac{1}{2} \sum_{G} \sum_i \sum_j \frac{4\pi}{V}\, q_i q_j \exp\!\left(i\,\mathbf{G}\cdot\mathbf{r}_{ij}\right) \frac{\exp\!\left(-G^2/4\eta\right)}{G^2}$$

$$U^{self} = -\sum_i q_i^2 \left(\frac{\eta}{\pi}\right)^{1/2}$$
where $R$ denotes a real space lattice vector, $G$ represents a reciprocal lattice vector and $\eta$ is a parameter that determines the width of the Gaussian charge distribution. Note that the summation over reciprocal lattice vectors excludes the case when $G = 0$. The key to rapid convergence of the Ewald sum is to choose the optimal value of $\eta$. With $\eta$ entering as in the expressions above, a large value means that the Gaussians are narrow and so the real space expression converges quickly, while the reciprocal space sum requires a more extensive summation due to the higher degree of curvature of the charge density. Choosing a small value of $\eta$ obviously leads to the inverse situation. One approach to choosing the convergence parameter is to derive an expression for the total number of terms to be evaluated in real and reciprocal space for a given accuracy and then to find the stationary point where this quantity is at a minimum. The choice of $\eta_{opt}$ is then given by;

$$\eta_{opt} = \left(\frac{N \pi^3}{V^2}\right)^{1/3}$$
where $N$ is the number of particles within the unit cell. If the target accuracy, $A$, is represented by the given fractional degree of convergence (e.g., $A = 0.001$ would imply that the energy is converged to within 0.1%), then the cut-off radii in real and reciprocal space are given as follows:

$$r_{opt}^{max} = \left(\frac{-\ln A}{\eta}\right)^{1/2}$$

$$G_{opt}^{max} = 2\left(-\eta \ln A\right)^{1/2}$$
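The Ewald expressions and the cut-off prescriptions above can be put together in a compact sketch. The function below is a hedged illustration, not a production implementation: it assumes an orthorhombic cell, units in which $U = q_i q_j / r_{ij}$, and a charge-neutral cell with zero dipole moment; the name `ewald_energy` and its arguments are hypothetical. It is applied to the rock-salt structure with a nearest-neighbor distance of 1, for which the energy per ion pair should reproduce the Madelung constant of about 1.7476.

```python
import itertools
import math
import numpy as np

erfc = np.vectorize(math.erfc, otypes=[float])

def ewald_energy(pos, q, cell, accuracy=1e-8):
    """Coulomb energy per unit cell for an orthorhombic cell with edges `cell`."""
    pos, q, cell = np.asarray(pos, float), np.asarray(q, float), np.asarray(cell, float)
    n, vol = len(q), float(np.prod(cell))
    eta = (n * math.pi ** 3 / vol ** 2) ** (1.0 / 3.0)     # eta_opt
    log_a = -math.log(accuracy)
    r_max = math.sqrt(log_a / eta)                         # real-space cut-off
    g_max = 2.0 * math.sqrt(eta * log_a)                   # reciprocal-space cut-off

    # Real-space term: sum over lattice translations R and ion pairs (i, j).
    u_real = 0.0
    dr = pos[:, None, :] - pos[None, :, :]
    qq = q[:, None] * q[None, :]
    n_img = [int(math.ceil(r_max / length)) + 1 for length in cell]
    for shift in itertools.product(*(range(-m, m + 1) for m in n_img)):
        d = np.linalg.norm(dr + np.array(shift) * cell, axis=-1)
        mask = (d > 1e-12) & (d <= r_max)                  # skip self term at R = 0
        u_real += 0.5 * np.sum(qq[mask] * erfc(math.sqrt(eta) * d[mask]) / d[mask])

    # Reciprocal-space term: the double sum over i, j equals |S(G)|^2.
    u_recip = 0.0
    h_max = [int(math.floor(g_max * length / (2.0 * math.pi))) for length in cell]
    for h in itertools.product(*(range(-m, m + 1) for m in h_max)):
        if h == (0, 0, 0):
            continue
        g = 2.0 * math.pi * np.array(h) / cell
        g2 = float(g @ g)
        if g2 > g_max ** 2:
            continue
        s = np.sum(q * np.exp(1j * (pos @ g)))             # structure factor S(G)
        u_recip += 0.5 * (4.0 * math.pi / vol) * math.exp(-g2 / (4.0 * eta)) / g2 * abs(s) ** 2

    u_self = -np.sum(q ** 2) * math.sqrt(eta / math.pi)
    return u_real + u_recip + u_self

# Rock-salt conventional cell, lattice parameter 2 (nearest-neighbor distance 1).
cations = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
anions = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
positions = np.array(cations + anions, dtype=float)
charges = np.array([1.0] * 4 + [-1.0] * 4)

u_cell = ewald_energy(positions, charges, cell=(2.0, 2.0, 2.0))
madelung = -u_cell / 4.0                                   # four ion pairs per cell
print("Madelung constant estimate:", madelung)
```

Tightening or loosening `accuracy` changes both cut-offs together, which is the practical benefit of tying them to a single target through $\eta_{opt}$.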
Before leaving the evaluation of the electrostatic energy, it is important to comment on other dimensionalities than three-dimensional (3-D) periodic boundary conditions. There is also an analogous approach involving a partial reciprocal space transformation in two dimensions, due to Parry, which can be employed for slab or surface calculations [6]. For the 1-D case of a polymer, the Coulomb sum is now absolutely convergent for a charge neutral system. However, it is still beneficial to use methods that accelerate the convergence, though there is less consensus as to the most efficient technique.
2. Non-electrostatic Contributions to the Energy
While the electrostatic energy often accounts for the majority of the binding, the non-Coulombic contributions are equally critical since they determine the position and shape of the energy minimum. As previously mentioned, there must always be a short-ranged repulsive force between ions to counter the Coulomb attraction and therefore prevent the collapse of the solid. Most work has followed the pioneering work in the field, as embodied in the Born–Mayer and Born–Landé equations for the lattice energy, by utilizing either an exponential or inverse power-law repulsive term. This gives rise to two widely employed functional forms, namely the Buckingham potential;

$$U_{ij}^{short\text{-}ranged} = A_{ij} \exp\!\left(-\frac{r_{ij}}{\rho_{ij}}\right) - \frac{C_{ij}}{r_{ij}^6}$$

and that due to Lennard–Jones:

$$U_{ij}^{short\text{-}ranged} = \frac{B_{ij}}{r_{ij}^m} - \frac{C_{ij}}{r_{ij}^n}$$

For the Lennard–Jones potential, the exponents $m$ and $n$ are typically 9–12 and 6, respectively. This latter potential can also be recast in many different forms by rewriting in terms of the well-depth, $\varepsilon$, and either the distance at which the potential intercepts the $U_{ij}^{repulsive} = 0$ axis, $r_0$, or the position of the minimum, $r_{eq}$. Both the Buckingham and Lennard–Jones potentials have the same common features: a short-ranged repulsive term and a slightly longer-ranged attractive term. The latter contribution, often referred to as the C6 term,
arises as the leading term in the expansion of the dispersion energy between two non-overlapping charge densities.

When choosing between the use of Buckingham and Lennard–Jones potentials, there are arguments for and against both. Physically, the exponential form of the Buckingham potential should be more realistic because electron densities of ions decay with this shape and so it would seem natural that the repulsion follows the magnitude of the interacting ion densities, at least for weak overlap. However, in the limit of $r_{ij} \to 0$ the repulsive Buckingham potential tends to $A_{ij}$, i.e., a constant value that is unphysically low for nuclear fusion! Worse still, if the coefficient $C_{ij}$ is non-zero, then the potential, while initially repulsive, goes through a maximum and then tends to −∞ – a result that is physically absurd. In contrast, the Lennard–Jones potential behaves sensibly and tends to +∞ as long as m > n. While the false minimum of the Buckingham potential is not usually a problem for energy minimization studies, it can be an issue in molecular dynamics where there is a finite probability of the system gaining sufficient kinetic energy to overcome the repulsive barrier.

There is a further solution to the problems with the Buckingham potential at small distances. The problems arise from the simple power-law expression for the dispersion energy, which is itself incorrect at short range: once the electron densities begin to overlap, the dispersion contribution is reduced. This can be accounted for by explicitly damping the C6 term as the distance tends to zero, and the most widely used approach for doing this is to adopt the form proposed by Tang and Toennies:
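$$U_{ij}^{C6} = -\left[1 - \left(\sum_{k=0}^{6}\frac{\left(b r_{ij}\right)^{k}}{k!}\right)\exp\left(-b r_{ij}\right)\right]\frac{C_{ij}}{r_{ij}^{6}}$$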
Occasionally other short-ranged, two-body potentials are chosen, such as the Morse or a harmonic potential. However, these are normally selected when acting between two atoms that are bonded. In this situation, the potential is usually Coulomb-subtracted too, in order that the parameters can be directly equated with the bond length and curvature.

All the above short-ranged potentials are pairwise in form. However, there are instances where it is useful to include higher order contributions. For example, in the case of semi-ionic materials, such as silicates, where there is a need to reproduce a tetrahedral local coordination geometry, it is common to include three-body terms that act as a constraint on an angle:

$$U_{ijk} = \frac{1}{2}k_{3}\left(\theta_{ijk} - \theta_{ijk}^{0}\right)^{2}$$
There are also many variants on this, such as including higher powers of the deviation of the angle from the equilibrium value and the addition of an
exponential dependence on the bond lengths so that the potential becomes smooth and continuous with respect to coordination number changes. For systems containing particularly polarizable ions, there is also the possibility of including the three-body contribution to the dispersion energy, as embodied in the Axilrod–Teller potential.

As with all materials, it is necessary to select the most appropriate force field functional form based on the physical interactions that are likely to dominate in an ionic material. While this will often consist of just the electrostatic term and a two-body short-ranged contribution for dense close-packed materials, it may be necessary to contemplate adding further terms as the degree of covalency and structural complexity increases.
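The harmonic three-body angle term given earlier in this section can be sketched as follows. The convention that the first atom passed is the central (pivot) atom, the force constant of 2.0 eV/rad² and the test geometries are all assumptions made for illustration.

```python
import math


def bond_angle(r_centre, r_j, r_k):
    """Angle in radians at the central atom subtended by neighbours j and k."""
    v1 = [a - b for a, b in zip(r_j, r_centre)]
    v2 = [a - b for a, b in zip(r_k, r_centre)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(a * a for a in v2))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.acos(cos_theta)


def three_body_energy(r_centre, r_j, r_k, k3, theta0):
    """Harmonic angle-bending energy, 0.5 * k3 * (theta - theta0)^2."""
    delta = bond_angle(r_centre, r_j, r_k) - theta0
    return 0.5 * k3 * delta * delta


THETA_TET = math.acos(-1.0 / 3.0)  # ideal tetrahedral angle, about 109.47 degrees
K3 = 2.0                           # hypothetical force constant in eV/rad^2

centre = (0.0, 0.0, 0.0)
ideal = three_body_energy(centre, (1.0, 1.0, 1.0), (1.0, -1.0, -1.0), K3, THETA_TET)
bent = three_body_energy(centre, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), K3, THETA_TET)
print(f"ideal tetrahedral pair: {ideal:.6f} eV, 90 degree pair: {bent:.6f} eV")
```

The term vanishes for two neighbors at the ideal tetrahedral angle and penalizes any departure from it, which is how such a term biases a silicate-like site towards tetrahedral coordination.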
3. Ion Polarization
Up to this point we have considered ions to have a frozen spherical electron density that may be represented by a point charge. While this is a reasonable representation of many cations, it is not as accurate a description for anions, which tend to be much more polarizable. This can be readily appreciated for the oxide ion, O²⁻, in particular. In this case, the first electron affinity of oxygen is favourable, while the second electron affinity is endothermic due to the Coulomb repulsion between electrons. Consequently, the second electron is only bound by the electrostatic potential due to the surrounding cations, and therefore the distribution of this electron will be strongly perturbed by the local environment. It is therefore natural to include the polarizability of anions, and even some larger cations, in ionic potential models when reliable results are required.

While polarization may occur to arbitrary order, here the focus will be on the dipole polarizability, α, which is typically the dominant contribution. In the presence of an electric field, E, the dipole moment, µ, generated is given by:

$$\mu = \alpha E$$

and the polarization energy, $U^{\text{dipolar}}$, that results is:

$$U^{\text{dipolar}} = -\frac{1}{2}\alpha E^{2}$$

The electric field at an ion is given by the first derivative of the electrostatic potential with respect to the three Cartesian directions, and therefore can be calculated from the Ewald summation for a bulk material. In principle, it is then straightforward to apply the above point ion polarizability correction to the total energy of a simulation. However, it introduces extra complexity since
the induced dipole moments will also generate an electric field at all other ions in the system. Hence, it is necessary to consider the charge–dipole and dipole–dipole interactions as well. The whole procedure involves iteratively solving for the dipole moments on the ions until self-consistency is achieved in a manner analogous to the self-consistent field procedure that occurs in quantum mechanical methods.

There is one disadvantage to the use of point ion polarizabilities, as described above, which is that the value of α is a constant. Physically, the more polarized an ion becomes, the harder it should be to polarize it further, and so the induced dipole is prevented from reaching extreme values. If the polarizability is a constant, a so-called polarization catastrophe can occur in which the total electrostatic energy becomes exothermic faster than the repulsive energy increases, leading to the collapse of two ions onto each other. This is particularly problematic with the Buckingham potential since the energy at zero distance tends to −∞.

An alternative description of dipolar ion polarization that addresses the above problem is the shell model introduced by Dick and Overhauser [4]. Their approach is to create a simple mechanical model for polarization by dividing each ion into two particles, known as the core and the shell. Here the core can be conceptually thought of as representing the nucleus and core electrons, while the shell represents the more polarizable valence electrons. Thus the core is often positively charged, while the shell is negatively charged, though when utilizing a shell model for a cation it is not uncommon for both core and shell to share the positive charge. Both particles are Coulombically screened from each other and only interact via a harmonic restoring force:

$$U^{\text{core-shell}} = \frac{1}{2}k_{cs}r_{cs}^{2}$$
where $r_{cs}$ is the distance between the core and shell. There are two important consequences of the shell model approach. Firstly, because the shell enters the simulation as a point particle, the achievement of electrostatic self-consistency is transformed into a minimization of the shell coordinates. Consequently, this is achieved concurrently with the optimization of the real atomic positions (namely the core positions), though at the cost of doubling the number of variables. While this significantly increases the time required to invert the Hessian matrix, assuming Newton–Raphson optimization is being employed, the convergence rate is also enhanced through all the information on the coupling of coordinates with the polarization being utilized. Secondly, it is the usual convention for the short-ranged potentials to act on the shell of a particle, rather than on the core, which leads to the polarizability becoming environment dependent. If the force constant (second derivative) of the short-range potential acting on the shell is $k_{SR}$ and the shell charge is
$q_{\text{shell}}$, the polarizability of the ion is equal to:

$$\alpha = \frac{q_{\text{shell}}^{2}}{k_{cs} + k_{SR}}$$
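As a worked illustration of this expression, the sketch below converts shell model parameters into a polarizability in Å³. It assumes charges in units of the electron charge and force constants in eV/Å², so that the factor e²/(4πε₀) ≈ 14.3996 eV Å is needed to obtain a volume. The shell charge and spring constant are the oxide values quoted in Table 2 of this chapter; the short-range force constant of 10 eV/Å² is purely hypothetical.

```python
COULOMB_EV_ANGSTROM = 14.399645  # e^2 / (4 pi eps0) in eV * Angstrom


def shell_polarizability(q_shell, k_cs, k_sr=0.0):
    """Shell model polarizability in Angstrom^3.

    q_shell is in units of e; k_cs and k_sr are in eV/Angstrom^2.
    k_sr = 0 corresponds to the free ion, with no short-range damping.
    """
    return COULOMB_EV_ANGSTROM * q_shell ** 2 / (k_cs + k_sr)


alpha_free = shell_polarizability(-2.0816, 24.625)
alpha_env = shell_polarizability(-2.0816, 24.625, k_sr=10.0)  # hypothetical k_SR
print(f"free ion: {alpha_free:.3f} A^3, in environment: {alpha_env:.3f} A^3")
```

The second value is smaller than the first, which shows how the short-range potential acting on the shell makes the effective polarizability depend on the environment of the ion.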
Special handling of the shell model is required in some simulations. In particular, for molecular dynamics the presence of a particle with no mass potentially complicates the solution of Newton’s equations of motion. However, there are two solutions to this that parallel the techniques found in electronic structure methods. One approach is to divide the atomic mass so that a small fraction is attributed to the shell instead of the core. If this fraction is chosen to be small enough, the frequency spectrum of the shells lies above any mode of the real material, such that the shells are largely decoupled from the nuclear motions. The disadvantage of this is that a smaller timestep is required in order to achieve an accurate integration. Alternatively, the shells can be minimized at every timestep in order to follow the adiabatic surface. Although the same timestep can now be used as for core-only dynamics, the cost per move is greatly increased. Similarly in lattice dynamics, it is also necessary to consider the contribution from relaxation of the shell positions to the dynamical matrix, which will act to soften the energy surface.

Both point ion polarizabilities and the shell model have benefits for interatomic potential simulations of ionic materials. Firstly, they act to stabilize lower symmetry structures and hence it would not be possible to reproduce the structural distortion of various materials without their inclusion. Secondly, they make it possible to determine many materials properties that intrinsically have a strong electronic component. For instance, both the low and high frequency dielectric constant tensors may be calculated, where the former is determined by both the electronic and nuclear contributions, while the latter is purely dependent on the contribution from the polarization model.
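The iterative solution for point ion dipoles described earlier in this section, and the polarization catastrophe that can accompany it, can be illustrated with a deliberately minimal model: two identical polarizable ions on a common axis in a uniform external field along that axis. The sketch assumes Gaussian-style units (4πε₀ = 1, so that α is a volume) and uses the on-axis field of a point dipole, 2µ/r³; all numerical values are hypothetical.

```python
def induced_dipoles(alpha, r, e_ext, tol=1.0e-12, max_iter=200):
    """Iterate two mutually polarizing point dipoles to self-consistency.

    Each dipole responds to the external field plus the on-axis field
    2*mu/r^3 of its partner. Returns (mu1, mu2, iterations).
    """
    mu1 = mu2 = 0.0
    for iteration in range(1, max_iter + 1):
        new1 = alpha * (e_ext + 2.0 * mu2 / r ** 3)
        new2 = alpha * (e_ext + 2.0 * mu1 / r ** 3)
        change = max(abs(new1 - mu1), abs(new2 - mu2))
        mu1, mu2 = new1, new2
        if change < tol:
            return mu1, mu2, iteration
    raise RuntimeError("dipoles did not converge: polarization catastrophe")


ALPHA, FIELD = 2.5, 0.1  # hypothetical polarizability (A^3) and external field

mu1, mu2, n_iter = induced_dipoles(ALPHA, 2.5, FIELD)
closed_form = ALPHA * FIELD / (1.0 - 2.0 * ALPHA / 2.5 ** 3)
print(f"r = 2.5: mu = {mu1:.6f} after {n_iter} iterations "
      f"(closed form {closed_form:.6f})")

try:
    induced_dipoles(ALPHA, 1.5, FIELD)
except RuntimeError as err:
    print(f"r = 1.5: {err}")
```

For this geometry the self-consistent dipole is αE/(1 − 2α/r³), which diverges as r approaches (2α)^(1/3), about 1.71 Å for the value used here. Below that separation the iteration cannot converge, which is the catastrophe that a constant α permits.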
4. Derivation of Ionic Potentials
So far, the typical functional form of the interaction energy in ionic materials has been described, without discussing how the parameter values are arrived at within the model. Many aspects are similar to general force field derivation as practiced for organic and inorganic systems, be they ionic or not. However, there are also a few differences that will be highlighted below.

Given the dominance of the electrostatic contribution for ionic materials, the starting point for any force field is to determine the nature of the point charges to be employed. There are two broad approaches – either to employ the formal valence charge or to choose smaller partial charges. The main advantages of formal charges are that they remove a degree of freedom from the fitting process and also ensure wide compatibility of force fields, in
that parameters from binary compounds can be combined to model ternary or more complex phases where the cations do not have the same formal valence charge. Furthermore, when studying defects in materials the vacancy, interstitial or impurity will be guaranteed to carry the correct total charge. On the other hand, for materials with a formal valence of greater than +2 it is argued that formal charges are unrealistic and so partial charges must be used. Indeed, Mulliken charges from ab initio calculations do suggest that such materials are not fully ionic. However, the Mulliken charge is only one of several charge partitioning schemes. Arguably more pertinent measures of ionicity are the Born effective charges that describe the response of the charge density to an electric field. For a solid, where it is not possible to determine the charges that best reproduce the external electrostatic potential, as would be the case for molecules, considering the dipolar response is the next best thing. It is often the case that formal charges, in combination with a shell model for polarization, yield very similar Born effective charges to periodic density functional calculations [6]. Consequently, for low symmetry structures at least, both formal and partial charges can be equally valid in a well derived model. Having determined the charge states of the ions, it is then necessary to derive the short-range and other parameters for the force field by fitting. Parameter derivation falls into one of two classes, either being based on the use of theoretical or experimental data. While truly ab initio parameter derivation is desirable, most theoretical procedures are subject to systematic errors and so empirical fitting to experimental information has tended to be prevalent. 
Fitting consists of specifying a training set of observable quantities, which may be derived theoretically or experimentally, and then varying the parameters in a least squares procedure in order to minimize the discrepancy between the calculated and observed values [5]. Typically, the training set would consist of one or more structures that represent local energy minima (i.e., stable states with zero force) and data that provide information as to the curvature of the energy surface about these minima, such as bulk moduli, elastic constants, dielectric constants, phonon frequencies, etc. Ideally, multiple structures and as much data as possible should be included in the procedure in order to maximize transferability and to constrain the parameters to physically sensible values. Because it is possible to weight the observables according to their reliability or importance, there can never be a single unambiguous fit.

In the above brief statement of what fitting is, it is taken as given that the structural data are to be used as an observable. However, there are several distinct ways in which this can be done. If the force field is a perfect fit then the forces calculated at the observed experimental, or theoretically optimized, structure should be zero. Hence it is common to use the forces determined at this point as the observable for fitting, rather than the structure per se, since they are straightforward to calculate. In practice, the quality of the fit is usually imperfect and so there will be residual forces. Lowering the forces does not guarantee that the
discrepancy in the optimized structural parameters will be minimized though, since this also depends on the curvature. Assuming that the system is within the harmonic region, the errors in the structure, x, will be related to the residual force vector, $f^{\text{resid}}$, according to

$$x = H^{-1} f^{\text{resid}}$$

where H is the Hessian matrix containing the second derivatives. Thus one approach to directly fitting the structure is to use the above expression for the errors in the structure. Alternatively, the structure can be fully optimized for each evaluation of the fit quality, which is considerably more expensive, but guaranteed to be reliable regardless of whether the energy surface is quadratic or not. This latter method, referred to as relaxed fitting, also possesses the advantage that any curvature related properties can be evaluated for the structure of zero force, such that the harmonic expressions employed are truly valid.

The case of a shell model fit deserves special mention here, since the issues do not usually arise during fits to other types of model. Because of the mapping of dipoles to a coordinate space representation there is the question of how to handle the shell positions during a fit. Given that the cores are equated with the nuclear position, and that it is difficult to ascribe atom-centered dipoles in a crystal, there is rarely any information on where the shells should be sited. In a relaxed fit the issue disappears, since the shells just optimize to the position of minimum force. For a conventional force-based fit, the shells must either still be relaxed explicitly at each evaluation of the sum of squares, or their coordinates can be included as variable parameters such that the relaxation occurs concurrently with the fitting process.

Theoretical derivation of parameters can either closely resemble empirical fitting, by inputting calculated observables, or alternatively an energy hypersurface can be utilized.
In this latter case many different structures, usually sampled from around the energy minima, are specified along with their corresponding energies. As a result, the curvature of the energy surface is fitted directly rather than by assuming harmonic behavior about the minimum. Again the issue of weighting is particularly important since it tends to be more crucial to ensure a good quality of fit close to the minimum at the expense of points that are further away. To date it has been more common to utilize quantum mechanical data for finite clusters in potential derivation, rather than directly fitting solid state ab initio information. However, this introduces uncertainties, because gas phase clusters are dominated by surface effects and it is not clear how transferable such data will be to bulk materials.

There are two further theoretical methods for parameter derivation that deserve a mention, namely electron gas methods and rule-based methods. The first is particularly significant since it was a popular approach in the early days of the computer simulation of ionic materials at the atomistic level. In the electron gas method, the energy of overlapping frozen ion electron densities
is calculated according to density functional theory as a function of distance. These energies can then be used directly via splines or fitted to a functional form. Given that not all ions, such as O²⁻, are stable in vacuo, the ion densities were usually determined in an appropriate potential well to mimic the lattice environment. The results obtained directly from this procedure were not always accurate, given the limitations of density functional theory, so often the distance dependence was shifted to improve the position of the minimum. The second alternative theoretical approach is to use rules that encapsulate how to determine interactions from atomic properties, such as the polarizability and atomic radius, in order to generate force fields of universal applicability. Of course, this compromises the accuracy of the results for any given system, but can be useful for systems where there is little known data to fit to.
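The harmonic relation between residual forces and structural errors used in force-based fitting, x = H⁻¹ f^resid, can be illustrated with a two-variable example. The Hessians and forces below are invented for demonstration, and the small solver is a hypothetical helper written here rather than part of any fitting code.

```python
def solve_2x2(hessian, force):
    """Return x = H^-1 f for a 2x2 Hessian using the explicit inverse."""
    (a, b), (c, d) = hessian
    det = a * d - b * c
    if abs(det) < 1.0e-14:
        raise ValueError("Hessian is singular")
    return [(d * force[0] - b * force[1]) / det,
            (a * force[1] - c * force[0]) / det]


# Case 1: a quadratic surface whose minimum is displaced from the observed
# structure by a known amount, so the recovered displacement can be checked.
H = [[10.0, 2.0], [2.0, 5.0]]   # eV/Angstrom^2, hypothetical
shift = [0.02, -0.01]           # Angstrom, displacement to the minimum
f_resid = [H[0][0] * shift[0] + H[0][1] * shift[1],
           H[1][0] * shift[0] + H[1][1] * shift[1]]
print("recovered displacement:", solve_2x2(H, f_resid))

# Case 2: equal residual forces on a stiff and a soft coordinate.
H_soft = [[100.0, 0.0], [0.0, 0.5]]
print("stiff/soft displacements:", solve_2x2(H_soft, [0.1, 0.1]))
```

In the second case the same residual force of 0.1 eV/Å gives a displacement of 0.001 Å along the stiff coordinate but 0.2 Å along the soft one, which is why reducing the forces alone does not guarantee a small structural error.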
5. Applications of Ionic Potentials
Having defined the appropriate force field for a material, it is then possible to calculate many different properties in a very straightforward fashion. Simulations can be broadly divided into two categories – static and dynamic. In a static calculation, the structure of a material is optimized to the nearest local minimum, which may represent one desired polymorph of a system, as opposed to the global minimum, and then the properties are derived by consideration of the curvature about that position. For example, many of the mechanical, vibrational and electrical response properties are all functions of the second derivatives of the energy with respect to atomic coordinates and lattice strains. For pair potentials, the determination of these properties is not dramatically more expensive than the evaluation of the forces, with the exception of matrix inversions that may be required once the second derivative matrix has been calculated. This is in contrast to quantum mechanical methods where the determination of the wavefunction derivatives makes analytical property calculations almost as expensive as finite difference procedures.

In a dynamical simulation, the probability distribution, composed of many different nuclear configurations, is sampled to provide averaged properties that depend on temperature. This usually involves performing either molecular dynamics (in which case the time correlation between data is known) or Monte Carlo (where configurations are selected randomly according to the Boltzmann distribution). Fundamentally, static and dynamic methods differ because the former are founded within the harmonic approximation, while the latter allow for anharmonicity. For the purposes of this section, the focus will be placed on the static information that can be obtained from ionic potentials, but stochastic simulations would be equally applicable.

The first information to be yielded by an energy minimization is the equilibrium structure.
Given that many potentials are
fitted to such data, it is not surprising that the quality of structural reproduction, at least for simple binary materials, is usually high. Many force fields are derived without explicit reference to temperature, so consequently the structure that is calculated may contain implicit temperature effects even though the optimization was performed nominally at zero Kelvin.

As an example of the application of the formal charge, shell model potential, a set of parameters has been derived for alumina. The observables used consisted of the structure of corundum and its elastic and dielectric constants. As a starting model, the parameters originally derived by Catlow et al. [1] were used and subjected to the relax fitting approach. Alumina is a material that has been much studied already, so the aim here is just to illustrate typical results yielded by a fit to such a material and some of the related issues. Values of the calculated properties for corundum, α-Al₂O₃, are given in Table 1, along with the comparison against experiment, using the potentials derived, which are given in Table 2.

Table 1. Calculated versus experimental structure and properties for aluminium oxide in the corundum structure based on a shell model potential fitted to the same experimental data

| Observable | Experiment | Calculated |
|---|---|---|
| a (Å) | 4.7602 | 4.9084 |
| c (Å) | 12.9933 | 12.9778 |
| Al z (frac) | 0.3522 | 0.3597 |
| O x (frac) | 0.3062 | 0.2987 |
| C11 (GPa) | 496.9 | 567.1 |
| C12 (GPa) | 163.6 | 224.6 |
| C13 (GPa) | 110.9 | 158.1 |
| C14 (GPa) | −23.5 | −54.3 |
| C33 (GPa) | 498.0 | 453.3 |
| C44 (GPa) | 147.4 | 127.6 |
| C66 (GPa) | 166.7 | 171.2 |
| $\varepsilon_{11}^{0}$ | 9.34 | 8.70 |
| $\varepsilon_{33}^{0}$ | 11.54 | 13.38 |
| $\varepsilon_{11}^{\infty}$ | 3.1 | 2.88 |
| $\varepsilon_{33}^{\infty}$ | 3.1 | 3.06 |

Table 2. Interatomic potential parameters derived for alumina based on relax fitting to the experimental observables given in Table 1. The starting parameters were taken from Catlow et al. and a two-body cut-off distance of 16.0 Å was employed, while that for the core-shell interaction was 0.8 Å. All non-Coulombic interactions not explicitly given are implicitly zero. The shell charges for Al and O were −0.0395 and −2.0816, respectively

| Species 1 | Species 2 | A (eV) | ρ (Å) | C (eV Å⁶) | k_cs (eV/Å²) |
|---|---|---|---|---|---|
| Al shell | O shell | 1012.17 | 0.32709 | 0.0 | – |
| O shell | O shell | 22764.00 | 0.14900 | 22.368 | – |
| Al core | Al shell | – | – | – | 331.958 |
| O core | O shell | – | – | – | 24.625 |

Before considering the results, let us consider the parameters that resulted from the fit since they highlight a number of points. Firstly, by looking at the shell charges and spring constants it can be seen that the oxide ion is responsible for most of the polarizability of the system, as would be expected. This is a natural result of the fitting process since the charge distribution between core and shell, as well as the spring constant, was allowed to vary. Secondly, in accord with this picture the attractive dispersion term for Al–O is set to zero, though even if allowed to vary it remains small. Finally, the oxygen–oxygen
repulsive term is particularly short-ranged and only makes a minute contribution at the equilibrium structure. Consequently, the A and ρ values are rarely varied from the original starting values. The rhombohedral corundum structure is sufficiently complex that even though the potential was empirically fitted to this particular system it is still not possible to achieve a perfect fit. While for many dense high symmetry ionic compounds it is possible to obtain accuracy of better than 1% for structural parameters, the moment there are appreciable anisotropic effects it becomes more difficult. This is illustrated by corundum where it is impossible with the basic shell model to accurately describe the behavior in the ab plane and along the c axis simultaneously, leading to an error of 3% in the a and b cell parameters. Not only is this true for the structure, but it is even more valid for the curvature related properties. If the values of C11 and C33 are compared, which are indicative of the elastic behavior in the two distinct directions, the calculated values have to achieve a compromise by one value being higher than experiment, while the other is lower. In reality, alumina is elastically fairly isotropic, but a dipolar model cannot capture this. The above results for alumina also illustrate the fact that while it is usually possible to reproduce structural parameters to within a few percent, the errors associated with other properties can be considerably greater. As pointed out earlier, although a formal charge model for alumina was employed, the ions in fact behave as though the system is less than fully ionic due to the polarizability. The calculated Born effective charges show that aluminium has a reduced ionicity with a charge of +2.32 in the ab plane and a slightly higher value of +2.55 parallel to the c axis. These magnitudes are in good agreement with assessments of the degree of ionicity of corundum obtained from ab initio calculations. 
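The size of the discrepancies discussed above can be made explicit with a short calculation of the percentage deviation of the calculated values from experiment, using a selection of the entries in Table 1. The function and dictionary names are simply illustrative.

```python
def percent_error(experiment, calculated):
    """Signed deviation of the calculated value from experiment, in percent."""
    return 100.0 * (calculated - experiment) / experiment


# (experiment, calculated) pairs copied from Table 1
table1 = {
    "a (A)": (4.7602, 4.9084),
    "c (A)": (12.9933, 12.9778),
    "C11 (GPa)": (496.9, 567.1),
    "C33 (GPa)": (498.0, 453.3),
}

for name, (expt, calc) in table1.items():
    print(f"{name:10s} {percent_error(expt, calc):+7.2f} %")
```

The a parameter is overestimated by roughly 3% while c is reproduced to within about 0.1%, and the errors in C11 and C33 have opposite signs, which is the compromise between the ab plane and the c axis described in the text.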
There are many more bulk properties that can be readily determined from interatomic potentials than those given above. For instance, phonon
frequencies, dispersion curves and densities of states, acoustic velocities, thermal expansion coefficients, heat capacities, entropies and free energies can all be obtained from determining the dynamical matrix about an optimized structure [6]. Other important quantities can also be determined by creating defects in the system, such as vacancies, interstitials and grain boundaries, or by locating other stationary points, in particular transition states for ion diffusion. The possibilities are as boundless as the number of physical processes that can occur in a real material.
6. Discussion
So far, the basic ionic potential approach to the modeling of solids has been portrayed. While this is very successful for many of the materials for which it was intended, and that composed the majority of the earlier studies, there are increasingly many situations where extensions and modifications are required in order to broaden the scope of the technique. These enhancements recognize the fact that many systems comprise atoms that are less than fully ionic and often non-spherical.

One of the most limiting aspects of the ionic model is the use of fixed charges. It is often the case that potential parameters are derived for the bulk material alone, where a compound is at its most ionic. However, the ideal force field should also be transferable to lower coordination environments, such as surfaces and even gas phase clusters. Fundamentally, the problem with any fixed charge model, be it formally or partially charged, is that it cannot reproduce the proper dissociation limit of the interaction. Ultimately, if sufficiently far removed from each other, an ionic structure should transform into separate neutral atoms.

There is a more sophisticated way of determining partial charges within a force field that addresses the above issue, which is to calculate them as an explicit function of geometry. While this has only been sparsely utilized to date, due to the extra complexity, it has the potential to capture, through charge transfer, many of the higher order polarizabilities beyond the dipole level, as well as yielding the proper dissociation behavior. The predominant approach to determining the charges has been via electronegativity equalization [8].
Here the self energy of an ion is expressed as a quadratic function of the charge in terms of the electronegativity, χ, and hardness, µ:

$$U_{i}^{\text{self}}(q) = U_{i}^{\text{self}}(0) + \chi_{i} q + \frac{1}{2}\mu_{i} q^{2}$$

When coupled to the electrostatic energy of interaction between the ions, and solved subject to the condition of charge neutrality for the unit cell, this
determines the charges on the ions. The main variation between schemes is the form selected for the Coulomb interaction between ions. While some workers have used the limiting point-charge interaction of 1/r at all distances, it has been argued that damped interactions should be used that more realistically mimic the nature of two-centre integrals (i.e., tend to a constant value as r → 0).

Variable charge schemes have shown some promise, and have clear advantages since they allow multiple oxidation states to be treated with a single set of parameters, at least in principle. This simplifies the study of materials where the same cation occurs in multiple oxidation states, since no prior assumption needs to be made as to the charge ordering scheme. However, there are still many challenges in this area since it appears that choosing the more formally correct screened Coulomb interaction leads to the electrostatics only contributing weakly to the interionic forces to an extent that is unrealistic.

Looking beyond dipolar polarizability, which is a limitation of the most widely used form of ionic model, there are instances where higher order contributions are important. Here, we consider two examples that highlight the issues. Experimentally it is observed that many cubic rock salt structured materials exhibit a so-called Cauchy violation in that the elastic constants C12 and C44 are not equivalent. It has been demonstrated that two-body potential models are unable to reproduce this phenomenon, and inclusion of dipolar polarizability fails to improve the situation. The Cauchy violation actually requires a many-body coupling of the interactions through a higher order polarization. This can be handled through the inclusion of a breathing shell model. Here the shell is given a finite radius that is allowed to vary with a harmonic restoring force about an equilibrium value, with the repulsive short-ranged potential also acting on it.
This non-central ion force generates a Cauchy violation, though always of one particular sign (C44 > C12), while the experimental values can be in either direction.

A second example of the role of polarization is in the stability of polymorphs of alumina. If the relative energies of alumina adopting different possible M₂O₃ structures are examined using most standard interatomic potential models, including that given in the previous section, then it is found that the corundum structure (which is the experimental ground state under ambient conditions) is not the most stable, with the bixbyite form being preferred. Investigations have demonstrated that the inclusion of quadrupolar polarizability is essential here [7]. This can be readily achieved within the point ion approach, but is more difficult in the shell model case. While an elliptical breathing shell model can capture the effect, it highlights the fact that the extension of this mechanical approach to higher order terms becomes increasingly cumbersome.

While most alkali and alkaline earth metals conform reasonably well to the ionic model, there are substantial problems with describing many of the remaining cations in the periodic table. In particular, transition metal ions
are often non-spherical due to the partial occupancy of the d-orbitals. The classic example of this is when the anti-bonding e_g* orbitals of an octahedral ion are half-filled for a particular spin, giving rise to a Jahn–Teller distortion, as is the case for Cu2+. To describe this effect with a simple potential model is impossible, except by constructing a highly specific model with different short-ranged potentials for each metal–ligand interaction, regardless of the fact that they may be acting between the same species. So far, the only solution to the problem of ligand–field effects has been to resort to approaches that mimic the underlying quantum mechanics, but in an empirical fashion. Hence, most work has utilized the angular overlap model to describe a set of energy levels that are subsequently populated according to a Fermi–Dirac distribution, where the states are determined by diagonalizing a 5 × 5 matrix determined according to the local environment [11]. This approach has been successfully used to describe the manganate (Mn3+, d4) cation, as well as other systems within a molecular mechanics framework. At the heart of the ionic potential method is the electrostatic energy, normally evaluated according to the Ewald sum when working within 3-D boundary conditions. However, this approach possesses the disadvantage that it scales at best as N^{3/2}, where N again represents the number of atoms within the simulation cell. In an era when very large scale simulations are being targeted, it is necessary to also reassess the underlying algorithms to ensure the optimal efficiency is attained. Consequently, the fundamental task of calculating the Coulomb energy is still an area of active research. Approaches currently being employed include the particle-mesh and cell multipole methods. The desirable characteristics of an algorithm are now that it should both scale linearly with system size and also be amenable to parallel computation.
Both of these can be achieved as long as the method is local in real space, in some cases with complementary linear scaling in reciprocal space, or if a hierarchical scheme is utilized within the cell multipole method to make the problem increasingly coarse-grained the greater the distance of interaction is. Methods have been proposed that use a spherical cut-off in real space alone, which naturally satisfies both desirable criteria [10]. However, it becomes difficult to achieve the defined Ewald limiting value without a considerable prefactor.
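The text does not give the form of the spherically truncated sum proposed in [10]. As a rough illustration only, the sketch below assumes the damped, charge-neutralized pairwise expression usually associated with that work: each pair term erfc(αr)/r is shifted by its value at the cut-off, and a self term is subtracted. The rock salt test lattice, the reduced units (nearest-neighbour distance and |q| equal to one), the function name, and the values of `r_cut` and `alpha` are all assumptions made for this example.

```python
import math

def wolf_energy_per_ion(r_cut=12.0, alpha=0.3):
    """Damped, charge-neutralized spherical sum for one ion in rock salt.

    Assumed toy setup: simple cubic grid of alternating +1/-1 charges with
    nearest-neighbour distance 1; the central ion carries charge +1.
    """
    q_i = 1.0
    n = int(math.ceil(r_cut))
    # Value of the damped pair term at the cut-off (the neutralizing shift).
    shift = math.erfc(alpha * r_cut) / r_cut
    pair_sum = 0.0
    for x in range(-n, n + 1):
        for y in range(-n, n + 1):
            for z in range(-n, n + 1):
                if x == 0 and y == 0 and z == 0:
                    continue
                r = math.sqrt(x * x + y * y + z * z)
                if r >= r_cut:
                    continue  # strictly local in real space
                q_j = 1.0 if (x + y + z) % 2 == 0 else -1.0
                pair_sum += q_i * q_j * (math.erfc(alpha * r) / r - shift)
    # Self term of the damped, shifted sum.
    self_term = (shift / 2.0 + alpha / math.sqrt(math.pi)) * q_i * q_i
    return 0.5 * pair_sum - self_term

MADELUNG_NACL = 1.747565  # reference Madelung constant of rock salt
print(wolf_energy_per_ion(), -MADELUNG_NACL / 2.0)
```

The cost per ion depends only on the number of neighbours inside `r_cut`, which is why such a scheme scales linearly with system size. The cut-off needed for a given accuracy can be large, which is the prefactor issue noted above.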
7. Outlook
The state of the art in force fields for ionic materials looks set for a gradual evolution that sees it take on board many concepts from other types of system, while retaining the aim of an accurate evaluation of the electrostatic energy at the core. For the very short-ranged interactions it is likely that bond order models, widely used in the semiconductor and hydrocarbon fields, and
also closely related to the approach taken for metallic systems, will be blended with schemes that capture the variation of the charge and higher order multipole moments as a function of structure. The result will be force fields that are capable of simulating not only one category of material, but several distinct ones. Development of solid state quantum mechanical methods to increased levels of accuracy will increasingly provide the wealth of information required for parameterisation of more complex interatomic potentials for systems, especially where there is a paucity of experimental data. Ultimately, this will lead to a seamless transition to models capable of reliably describing interfaces between ionic and non-ionic systems – currently one of the most challenging problems in materials science.
References
[1] C.R.A. Catlow, R. James, W.C. Mackrodt, and R.F. Stewart, “Defect energetics in α-Al2O3 and rutile TiO2,” Phys. Rev. B, 25, 1006–1026, 1982.
[2] C.R.A. Catlow and W.C. Mackrodt, “Theory of simulation methods for lattice and defect energy calculations in crystals,” Lecture Notes in Phys., 166, 3–20, 1982.
[3] S.W. de Leeuw, J.W. Perram, and E.R. Smith, “Simulation of electrostatic systems in periodic boundary conditions. I. Lattice sums and dielectric constants,” Proc. R. Soc. London, Ser. A, 373, 27–56, 1980.
[4] B.G. Dick and A.W. Overhauser, “Theory of the dielectric constants of alkali halide crystals,” Phys. Rev., 112, 90–103, 1958.
[5] J.D. Gale, “Empirical potential derivation for ionic materials,” Phil. Mag. B, 73, 3–19, 1996.
[6] J.D. Gale and A.L. Rohl, “The general lattice utility program (GULP),” Mol. Simul., 29, 291–341, 2003.
[7] P.A. Madden and M. Wilson, “‘Covalent’ effects in ‘ionic’ systems,” Chem. Soc. Rev., pp. 339–350, 1996.
[8] W.J. Mortier, K. van Genechten, and J. Gasteiger, “Electronegativity equalization: applications and parameterization,” J. Am. Chem. Soc., 107, 829–835, 1985.
[9] M.P. Tosi, “Cohesion of ionic solids in the Born model,” Solid State Phys., 16, 1–120, 1964.
[10] D. Wolf, P. Keblinski, S.R. Phillpot, and J. Eggebrecht, “Exact method for the simulation of Coulombic systems by spherically truncated, pairwise r^{-1} summation,” J. Chem. Phys., 110, 8254–8282, 1999.
[11] S.M. Woodley, P.D. Battle, C.R.A. Catlow, and J.D. Gale, “Development of a new interatomic potential for the modeling of ligand field effects,” J. Phys. Chem. B, 105, 6824–6830, 2001.
2.4 MODELING COVALENT BOND WITH INTERATOMIC POTENTIALS
João F. Justo, Escola Politécnica, Universidade de São Paulo, São Paulo, Brazil
Atoms, the elementary carriers of chemical identity, interact strongly with each other to form solids. It is interesting that those interactions can be directly mapped to the electronic and structural properties of the resulting materials. This connection between the microscopic and macroscopic worlds is appealing, and suggests that a theoretical atomistic model could help to model and build materials with predetermined properties. Atomistic simulations represent one of the tools that can bridge those two worlds, giving access to information on microscopic mechanisms which, in many cases, cannot be probed by experiments. One of the most important elements in an atomistic simulation is the model describing the interatomic interactions. In principle, such a model should take into account all the particles (electrons and nuclei) of the system. Quantum mechanical (or ab initio) methods provide a precise description of those interactions, but they are computationally prohibitive. As a result, simulations would be restricted to systems involving only up to a thousand (or a few thousand) atoms, which is not enough to capture many important atomistic mechanisms. Some approximation, leading to less expensive models, should be implemented. A radical approach is to describe the interactions by classical potentials, in which the electronic effects are somehow integrated out, being taken into account only implicitly. The gain in computational efficiency comes with a price: a poorer description of the interactions. Ab initio methods will become increasingly important in materials science over the next decade. Even using the fastest computers, those methods will continue to be computationally expensive. Therefore, there is a demand for less expensive models to explore a number of important phenomena, to provide a qualitative view, and to scan for trends or insights on atomistic events, which could later be refined using ab initio methods.
Developing an interatomic potential involves a combination of intuitive thinking, which comes from our knowledge of the nature of the interatomic bonding, and theoretical input. However, there is no theory which would directly provide the functional form for an interatomic potential. As a result, depending on the bonding type, considerably distinct approaches have been devised to describe interatomic interactions [1, 2]. In any case, the functional form should have a physical motivation and enough flexibility, in terms of fitting parameters, to capture the essential aspects underlying the interatomic interactions. The next sections discuss the specific case of modeling covalent bonding by interatomic potentials, and the elements which should be present to properly describe such interactions.
S. Yip (ed.), Handbook of Materials Modeling, 499–507. © 2005 Springer. Printed in the Netherlands.
1. Pair Potentials
The cohesive energy (E_c) is the relevant property which quantifies cohesion in a solid. It is given by E_c(R_n, r_m), where R_n and r_m represent the degrees of freedom of the n nuclei and m electrons, respectively. While E_c could be computed by solving the quantum mechanical Schrödinger equation for the electrons of the system, one should inquire what kind of approximation could be made to describe E_c with less expensive methods. One strategy is to average the electronic effects out while still keeping the electronic degrees of freedom explicit. One of these approaches, called the tight-binding method, provides a realistic description of bonding in solids. However, those models are still computationally too expensive, although simulations with a few thousand atoms can be performed. An extreme approach is to remove all the electronic degrees of freedom, so that E_c(R_n, r_m) ≈ E_c(R_n). In this last case, the electronic effects are implicitly present in the functional form. Several interatomic potentials for covalent bonding have been developed over the years. For silicon alone, which is considered the prototypical covalent material, there are more than thirty models which have been extensively used and tested [3]. This and the following sections discuss the relevant elements of an interatomic potential to describe a typical covalent material. The discussion focuses on the two most important models which have been developed for silicon [4, 5]. The cohesive energy can be determined from the atomic arrangement, in terms of a many-body expansion [6]
$$E_c = \sum_{i}^{n} V_1(R_i) + \sum_{i,j}^{n} V_2(R_i, R_j) + \sum_{i,j,k}^{n} V_3(R_i, R_j, R_k) + \cdots, \qquad (1)$$
in which the sums are over all the n atoms of the system. In principle, E_c could be determined by an infinite many-body expansion, but the computational cost scales with n^l, where l is the order at which the expansion is truncated. The one-body terms (V1) are generally neglected, but the two-body (V2) and
three-body (V3) terms carry most of the relevant effects underlying bonding. While V2 and V3 have a simple physical interpretation, intuition for higher order terms is not so straightforward, and most models have avoided such terms. Could the expansion (1) be truncated at the two-body term and still capture the essential properties of covalent bonding? For a long period, pair potentials were used to investigate materials properties, and revealed a number of fundamental atomistic processes. Models including higher order expansions, developed later, provided results which were qualitatively consistent with those early investigations. This sheds light on the discussion of pair potentials. Although they provide an unrealistic description of covalent bonding, they still capture some of the essential aspects of cohesion. A typical V2 function has a strong repulsive interaction at short interatomic separations, changing to an attractive interaction at intermediate separations, which goes smoothly to zero at longer distances. The V2 interaction between atoms i and j can be written as a combination of a repulsive (V_R) plus an attractive (V_A) interaction in terms of the interatomic distance, r_ij = |R_i − R_j|.
Figure 1. The two-body interatomic potential, plotted as V2/ε against r/a. The figure presents V2 for two models: the Lennard–Jones (full line) and the Stillinger–Weber (dashed line) potentials. The functions are plotted normalized in terms of the minimum in energy and equilibrium separation (a).
The Lennard–Jones potential, shown in Fig. 1, is an example of a pair potential used to model cohesion in a solid
$$V_2(r) = V_R(r) + V_A(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right], \qquad (2)$$
where ε and σ are free parameters which can be fitted to properties of the material. The equilibrium interatomic distance (a) is related to the crystalline lattice parameter, while the curvature of the potential near a is directly correlated to the macroscopic bulk modulus. The functional form in Eq. (2) is long ranged, and the computational cost scales with n^2. On the other hand, this cost could scale linearly with n if a cut-off function f_c(r) were used. This f_c(r) function should not substantially change the interaction in the relevant region of bonding, near the minimum of V2(r), and should vanish at a certain interatomic distance R_c, defined as the cut-off of the interaction. Therefore, the two-body interaction is described by an effective potential V2eff(r) = V2(r) f_c(r). The functional form of the Lennard–Jones potential can provide a realistic description of noble gases in condensed phases. Although pair potentials capture some essential aspects of bonding, there are still some important elements missing in order to properly describe covalent bonding. If interatomic interactions were described only by pair potentials, there would be a gain in cohesive energy if an atom increased its coordination (number of nearest neighbors). Since there is no energy penalty for increasing coordination, pair potentials will always lead to close-packed crystalline structures. However, atoms in covalent materials sit in much more open crystalline structures, such as hexagonal or the diamond cubic. Pair potentials alone cannot describe covalent bonding, and many-body effects must be introduced in the description of cohesion.
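As a minimal sketch of Eq. (2) and of the effective potential V2eff(r) = V2(r) f_c(r), the code below uses reduced units (ε = σ = 1). The cosine switch in `cutoff` and the values of `r_on` and `r_c` are assumptions chosen for illustration, since the text does not prescribe a particular f_c(r). The helper `nn_energy_per_atom` is likewise hypothetical: it counts only nearest neighbours, all placed at the equilibrium separation.

```python
import math

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Eq. (2): 4*eps*[(sigma/r)**12 - (sigma/r)**6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def cutoff(r, r_on, r_c):
    """Assumed cosine switch: 1 for r <= r_on, 0 for r >= r_c."""
    if r <= r_on:
        return 1.0
    if r >= r_c:
        return 0.0
    x = (r - r_on) / (r_c - r_on)
    return 0.5 * (1.0 + math.cos(math.pi * x))

def v2_eff(r, r_on=1.5, r_c=2.5, epsilon=1.0, sigma=1.0):
    """Effective pair potential V2(r) * f_c(r)."""
    return lennard_jones(r, epsilon, sigma) * cutoff(r, r_on, r_c)

# Equilibrium separation of Eq. (2) for sigma = 1.
r_min = 2.0 ** (1.0 / 6.0)

def nn_energy_per_atom(z):
    """Energy per atom from z nearest-neighbour bonds of length r_min."""
    return 0.5 * z * v2_eff(r_min)

for name, z in (("diamond cubic", 4), ("close-packed", 12)):
    print(name, nn_energy_per_atom(z))
```

Because every bond contributes the same energy regardless of coordination, the close-packed arrangement (Z = 12) comes out lower in energy than the diamond cubic one (Z = 4), which is the shortcoming of pair potentials described above.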
2. Beyond Pair Potentials
The many-body effects [6] could be introduced in E c by several procedures: inside the two-body expansion (pair functionals), by an explicit many-body expansion (cluster potentials), or a combination of both (cluster functionals). Models which have been successfully developed to describe covalent systems fit into one of these categories. The Stillinger–Weber [4] and the Tersoff [5] models can be classified as a cluster potential and as a cluster functional, respectively. In a description using only pair potentials, as given by Eq. (2), the cohesive energy of an individual bond inside a crystal is constant for any atomic coordination. However, this departs from a realistic description. Figure 2(a) shows the cohesive energy per bond as a function of atomic coordination for several crystalline structures of silicon. There is a weakening of the bond strength
Figure 2. (a) Cohesive energy per bond (E_c/bond) as a function of atomic coordination (Z). Cohesive energies are taken from ab initio calculations (diamond), and the full and dashed lines represent fitting with Z^{-1/2} and exp(−βZ^2), respectively. (b) Bond order term b(Z) as a function of atomic coordination taken from ab initio calculations (diamond), and fitted to Z^{-1/2} (full line) and exp(−βZ^2) (dashed line).
with increasing coordination, a behavior that is observed in any material. However, bond strength weakens very quickly with coordination in molecular crystals and very slowly in most metals. That is why molecular solids favor very low coordinations and metals favor high coordinations. Covalent solids fall between those two extremes. The cohesive energy can be written as a sum over all the individual bonds V_ij
$$E_c = \frac{1}{2}\sum_{i,j} V_{ij} = \frac{1}{2}\sum_{i,j}\left[V_R(r_{ij}) + b_{ij} V_A(r_{ij})\right], \qquad (3)$$
where the parameter b_ij controls the strength of the attractive interaction in V_ij. The attractive interaction between two atoms, i.e., the interaction controlling cohesion, is a function of the local environment. This dependence could be translated into a physical quantity called local coordination (Z). As the coordination increases, valence electrons should be shared with more neighbors, so the individual bond between an atom and its neighbors weakens. Using chemistry arguments, it can be shown that the bond order term (b_ij) can be given as a function of the local coordination (Z_i) of atom i by
$$b_{ij}(Z_i) = \eta Z_i^{-1/2}, \qquad (4)$$
where η is a fitting parameter. Figure 2(b) shows the bond order term as a function of coordination for several crystalline structures. The Z^{-1/2} function is a good approximation for high coordinations, but fails for low coordinations. It has been recently shown [7] that an exponential behavior for b_ij would be more adequate. The introduction of the bond order term in V2 considerably improves the description of cohesion in a covalent material. With this new
term, the equilibrium distance and strength of a bond are also determined by the local coordination at each atomic center. Even using a bond order term, covalent bonding still requires a functional form with some angular dependence to stabilize the open crystalline structures. Angular functions could be introduced inside the bond order term b(Z), as developed by Tersoff [5], which becomes b(Z, θ), where θ represents the angles between adjacent bonds around each atom of the system. Another procedure is to use an explicit three-body expansion [4]. In terms of energetics, there is a parallel between two-body and three-body potentials. In the former case, there is an energy penalty for interatomic distances differing from a certain equilibrium value. In the latter case, there is a penalty for angles differing from a certain equilibrium value θ0. The three-body potentials are generally positive, being null at an equilibrium angle. The interaction for the (i, j, k) set of atoms is described by
$$V_3(r_{ij}, r_{ik}, r_{jk}) = h(r_{ij})\, h(r_{ik})\, g(\theta_{ijk}), \qquad (5)$$
where the radial function h(r) goes monotonically to zero with increasing interatomic distance. Figure 3 shows the behavior of typical angular functions g(θ). The Stillinger–Weber model used a three-body expansion, and the V3 potential was developed as a penalty function with a minimum at the tetrahedral angle (109.47°). On the other hand, the Tersoff potential introduced an angular function inside the bond order term, and the minimum of the angular term was a fitting parameter.
Figure 3. Angular function g(θ), plotted against θ from 0° to 180°, from the Stillinger–Weber (full line) and Tersoff (dashed line) models.
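The effect of the bond order term in Eqs. (3) and (4) on the preferred coordination can be illustrated with a toy calculation. The sketch below assumes Z equivalent bonds at a fixed bond length, so that V_R and V_A reduce to two constants. The numbers `v_rep`, `v_att` and `eta` are arbitrary illustrative values, not fitted parameters, and relaxation of the bond length with coordination is ignored.

```python
def energy_per_atom(z, v_rep=1.0, v_att=-4.0, eta=1.0, bond_order=True):
    """Energy per atom for z equivalent bonds at a fixed bond length.

    Each atom owns half of each of its z bonds (the 1/2 in Eq. (3)).
    With bond_order=True the attraction is scaled by b = eta * z**-0.5
    (Eq. (4)); with bond_order=False it is a plain pair potential (b = 1).
    """
    b = eta * z ** -0.5 if bond_order else 1.0
    return 0.5 * z * (v_rep + b * v_att)

coordinations = range(1, 13)
best_pair = min(coordinations,
                key=lambda z: energy_per_atom(z, bond_order=False))
best_bond_order = min(coordinations, key=lambda z: energy_per_atom(z))
print(best_pair, best_bond_order)
```

With these toy numbers the pair-only model selects the highest coordination available (Z = 12), while the bond order model selects Z = 4. This is only a caricature of the competition between repulsion, which grows as Z, and attraction, which grows as Z^{1/2}; it does not include the angular terms needed to stabilize a specific open structure.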
3. Models
Developing an interatomic potential involves several elements. The first one is the functional form, which should capture all the properties of covalent bonding. The functions should have enough flexibility, in terms of number of free parameters, to allow a description of a wide set of the materials properties. The second element is the fitting procedure used to find the set of free parameters that best describes a predetermined database. The database comprises a set of crystalline properties (such as cohesive energy, lattice parameter, elastic constants) and other specific properties (such as the formation energy of point defects) obtained from experiments or ab initio calculations. Additionally, the interatomic potential should be transferable, i.e., it should provide a realistic description of relevant configurations away from the database. Two interatomic potentials [4, 5] have prevailed over the others in studies of covalent materials. The Tersoff model is described by a two-body expansion, including a bond order term
$$E_c = \frac{1}{2}\sum_{i \neq j} V_{ij}, \qquad (6)$$
$$V_{ij} = f_c(r_{ij})\left[f_R(r_{ij}) + b_{ij} f_A(r_{ij})\right], \qquad (7)$$
where f_R(r_ij) and f_A(r_ij) are, respectively, the repulsive and attractive terms given by
$$f_R(r) = A \exp(-\lambda_1 r) \quad \text{and} \quad f_A(r) = -B \exp(-\lambda_2 r). \qquad (8)$$
The f_c(r) is a cut-off function which is one for the relevant region of bonding r < S, going smoothly to zero in the range S < r < R. The R and S, which control the range of interactions, are fitting parameters. The b_ij is the bond order term, which is given by
$$b_{ij} = \left(1 + \beta^{n} \zeta_{ij}^{n}\right)^{-1/2n}, \qquad (9)$$
$$\zeta_{ij} = \sum_{k \neq i,j} g(\theta_{ijk}) \exp\left[\alpha^{3} (r_{ij} - r_{ik})^{3}\right], \qquad (10)$$
$$g(\theta) = 1 + \frac{c^{2}}{d^{2}} - \frac{c^{2}}{d^{2} + (h - \cos\theta)^{2}}, \qquad (11)$$
where θ_ijk is the angle between the ij and ik bonds.
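The bond order term of Eqs. (9)–(11) can be sketched in a few lines. In the code below the function names and the argument layout are assumptions made for this example, and the caller is assumed to pass only those atoms k that lie within the interaction range of atom i. The parameter values in the usage lines are toy numbers, not the fitted silicon parameters of [5]; they are chosen so that g(θ) = 1 and the exponential equals one, in which case the bond order reduces to the Z^{-1/2} form of Eq. (4).

```python
import math

def tersoff_g(cos_theta, c, d, h):
    """Angular function of Eq. (11)."""
    return 1.0 + c * c / (d * d) - c * c / (d * d + (h - cos_theta) ** 2)

def tersoff_bond_order(r_i, r_j, neighbors_k, beta, n, alpha, c, d, h):
    """Bond order b_ij of Eq. (9) for bond i-j.

    neighbors_k: positions of the other neighbours k of atom i (k != i, j),
    assumed to be pre-selected by the caller.
    """
    v_ij = [b - a for a, b in zip(r_i, r_j)]
    r_ij = math.sqrt(sum(x * x for x in v_ij))
    zeta = 0.0
    for r_k in neighbors_k:
        v_ik = [b - a for a, b in zip(r_i, r_k)]
        r_ik = math.sqrt(sum(x * x for x in v_ik))
        cos_theta = sum(p * q for p, q in zip(v_ij, v_ik)) / (r_ij * r_ik)
        # One term of the sum in Eq. (10).
        zeta += tersoff_g(cos_theta, c, d, h) * math.exp(
            alpha ** 3 * (r_ij - r_ik) ** 3)
    return (1.0 + (beta * zeta) ** n) ** (-1.0 / (2.0 * n))

# Toy check: atom i at the origin with four tetrahedral neighbours.
origin = (0.0, 0.0, 0.0)
tetra = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0),
         (-1.0, 1.0, -1.0), (-1.0, -1.0, 1.0)]
toy = dict(beta=1.0, n=1.0, alpha=0.0, c=0.0, d=1.0, h=0.0)
b_isolated = tersoff_bond_order(origin, tetra[0], [], **toy)
b_tetra = tersoff_bond_order(origin, tetra[0], tetra[1:], **toy)
print(b_isolated, b_tetra)
```

For an isolated pair the bond order is one, and it drops as further neighbours are added around atom i, which is the weakening of the individual bond with coordination discussed in the previous section.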
The Tersoff potential was fitted to several silicon polytypes, and has been extended to other covalent systems, including multi-component materials. The Brenner potential [8], a model which resembles the Tersoff potential, is widely used to study hydrocarbon systems. The Stillinger–Weber potential is the most widely used model for covalent materials. It was developed as a three-body expansion
$$E = \sum_{i,j} V_2(r_{ij}) + \sum_{i,j,k} V_3(r_{ij}, r_{ik}, r_{jk}). \qquad (12)$$
The two-body term V2(r) is given by
$$V_2(r) = A\left[\frac{B}{r^{\rho}} - 1\right] f_c(r), \qquad (13)$$
where the cut-off function f_c(r) is given by
$$f_c(r) = \exp\left[\mu/(r - R)\right], \qquad (14)$$
if r < R and null otherwise. The three-body potential V3 is given by:
$$V_3(r_{ij}, r_{ik}) = h(r_{ij})\, h(r_{ik})\, g(\theta_{ijk}), \qquad (15)$$
$$h(r) = \exp\left[\gamma/(r - R)\right], \qquad (16)$$
$$g(\theta) = (\cos\theta + 1/3)^{2}. \qquad (17)$$
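A minimal sketch of the three-body term in Eqs. (15)–(17) is given below. The function names are hypothetical, and the values used for γ and R in the usage lines are placeholder numbers in reduced units chosen for illustration; they should not be read as the fitted parameters of [4]. The radial function is taken to vanish for r ≥ R, by analogy with the cut-off function of Eq. (14).

```python
import math

def sw_h(r, gamma, r_cut):
    """Radial function of Eq. (16); assumed zero at and beyond the cut-off."""
    return math.exp(gamma / (r - r_cut)) if r < r_cut else 0.0

def sw_g(cos_theta):
    """Angular penalty of Eq. (17), zero when cos(theta) = -1/3."""
    return (cos_theta + 1.0 / 3.0) ** 2

def sw_v3(r_ij, r_ik, cos_theta, gamma, r_cut):
    """Three-body energy of Eq. (15) for the triplet (i, j, k)."""
    return sw_h(r_ij, gamma, r_cut) * sw_h(r_ik, gamma, r_cut) * sw_g(cos_theta)

gamma, r_cut = 1.2, 1.8                      # placeholder reduced-unit values
tetrahedral = math.acos(-1.0 / 3.0)          # about 109.47 degrees
for label, theta in (("tetrahedral", tetrahedral),
                     ("90 degrees", math.pi / 2.0),
                     ("180 degrees", math.pi)):
    print(label, sw_v3(1.0, 1.0, math.cos(theta), gamma, r_cut))
```

The penalty vanishes at the tetrahedral angle and is positive for any other angle, so bond angles of the diamond cubic structure cost no three-body energy while other local geometries are penalized.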
This model was fitted to properties of the diamond cubic structure and the local order of liquid silicon. Other models have been developed to describe covalent materials. Those models have used different approaches, such as functional forms with up to 50 parameters and extensive databases. Some of those models have been compared with each other, especially in the case of silicon [3]. Such comparisons revealed that no interatomic potential is suitable for all situations, so there is still room for further developments. Recently, a new model for covalent materials was developed [7] that included the features of both the Tersoff and the Stillinger–Weber models. That model explicitly included bond order terms in the two-body and three-body interactions, which allowed a better description of covalent bonding as compared to previous models.
4. Perspectives
Interatomic potentials will continue to play an important role in atomistic simulations. Although potentials have been successfully applied to investigate covalent materials, they still face several challenges. As new models are developed, theoretical input will increasingly prevail over empirical input. So far, the physical properties of bonding have been introduced by trial and error. Attempts to improve models have been in the direction of trying new functional forms, going to higher order expansions or increasing the number of fitting parameters. This will give way to more sophisticated approaches, in which the functional forms could be directly extracted from theory. Interatomic potentials also face the challenge of describing materials with mixed bonding character (metallic, covalent, and ionic together). The Tersoff potential, for example, has been extended to systems with some ionic character, but still with prevailing covalent character. That model would not work for materials with stronger ionic character, which would require at least the introduction of a long-ranged Coulomb interaction term. Finally, even if sophisticated interatomic potentials are developed, one should keep in mind that every model has limited applicability and should always be used with caution.
References
[1] A.F. Voter, “Interatomic potentials for atomistic simulations,” MRS Bulletin, 21(2), 17–19, 1996 (and additional references in the same issue).
[2] R. Phillips, Crystals, Defects and Microstructures: Modeling Across Scales, Cambridge University Press, Cambridge, UK, 2001.
[3] H. Balamane, T. Halicioglu, and W.A. Tiller, “Comparative study of silicon empirical interatomic potentials,” Phys. Rev. B, 46, 2250–2279, 1992.
[4] F.H. Stillinger and T.A. Weber, “Computer simulation of local order in condensed phases of silicon,” Phys. Rev. B, 31, 5262–5271, 1985.
[5] J. Tersoff, “New empirical-approach for the structure and energy of covalent systems,” Phys. Rev. B, 37, 6991–7000, 1988.
[6] A.E. Carlsson, “Beyond pair potentials in elemental transition metals and semiconductors,” In: H. Ehrenreich and D. Turnbull (eds.), Solid State Physics, vol. 43, Academic Press, San Diego, pp. 1–91, 1990.
[7] J.F. Justo, M.Z. Bazant, E. Kaxiras, V.V. Bulatov, and S. Yip, “Interatomic potential for silicon defects and disordered phases,” Phys. Rev. B, 58, 2539–2550, 1998.
[8] D.W. Brenner, “Empirical potential for hydrocarbons for use in simulating the chemical vapor-deposition of diamond films,” Phys. Rev. B, 42, 9458–9471, 1990.
2.5 INTERATOMIC POTENTIALS: MOLECULES Alexander D. MacKerell, Jr. Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, 20 Penn Street, Baltimore, MD, 21201, USA
Interatomic interactions between molecules dominate their behavior in condensed phases, including the aqueous phase in which biologically relevant processes occur [1]. Accordingly, it is essential to accurately treat interatomic interactions using theoretical approaches in order to apply such methods to study condensed phase phenomena. Typical condensed phase systems subjected to theoretical studies include thousands to hundreds of thousands of particles. Thus, to allow calculations on such systems to be performed, simple, computationally efficient functions, termed empirical or potential energy functions, are applied to calculate the energy as a function of structure. In this chapter an overview of potential energy functions used to study condensed phase systems will be presented, with emphasis on biologically relevant systems. This overview will include information on the optimization of these models and address future developments in the field.
1. Empirical Force Fields
Potential energy functions used for condensed phase simulation studies comprise simple functions that relate the structure, R, to the energy, V, of the system. An example of such a function is shown in Eqs. (1)–(3).
$$V(R)_{\mathrm{total}} = V(R)_{\mathrm{internal}} + V(R)_{\mathrm{external}} \qquad (1)$$
$$V(R)_{\mathrm{internal}} = \sum_{\mathrm{bonds}} K_b (b - b_0)^2 + \sum_{\mathrm{angles}} K_\theta (\theta - \theta_0)^2 + \sum_{\mathrm{dihedrals}} K_\chi \left(1 + \cos(n\chi - \delta)\right) \qquad (2)$$
$$V(R)_{\mathrm{external}} = \sum_{\substack{\mathrm{nonbonded} \\ \mathrm{atom\ pairs}}} \left( \varepsilon_{ij}\left[\left(\frac{R_{\min,ij}}{r_{ij}}\right)^{12} - \left(\frac{R_{\min,ij}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{\varepsilon_D r_{ij}} \right) \qquad (3)$$
S. Yip (ed.), Handbook of Materials Modeling, 509–525. © 2005 Springer. Printed in the Netherlands.
The total potential energy, V(R)total, is separated into internal or intramolecular terms, V(R)internal, and external terms, V(R)external. The latter are also referred to as intermolecular or nonbonded terms. While interatomic interactions are dominated by external terms, the internal terms also make a significant contribution to condensed phase properties, requiring their consideration in this chapter [2]. Furthermore, it is not just the potential energy function alone that is required for determination of the energy as a function of the structure; the parameters in Eqs. (2) and (3) are also needed. The combination of the potential energy function along with the parameters is termed an empirical force field. Application of an empirical force field to a chemical system of interest, in combination with numerical approaches allowing for sampling of relevant conformations via, e.g., a molecular dynamics (MD) simulation [3] (see below), can be used to predict a variety of structural and thermodynamic properties via statistical mechanics [4]. Importantly, such approaches allow for comparisons with experimental thermodynamic data, and the atomic details of the interatomic interactions between molecules that dictate the thermodynamic properties can be obtained. Such atomic details are often difficult to access via experimental approaches, motivating the application of computational approaches. Equations (2) and (3) represent a compromise between simplicity and chemical accuracy. The structure or geometry of a molecule is simply represented by four terms, as shown in Fig. 1. The intramolecular geometry is based on bond lengths, b, valence angles, θ, and dihedral or torsion angles, χ, that describe the orientation of 1,4 atoms (i.e., atoms connected by 3 covalent bonds). Additional internal terms may be included in a potential energy function, as described elsewhere [5, 6].
The bond stretching and angle bending terms are treated harmonically; bond and angle parameters include b0 and θ0, the equilibrium bond length and equilibrium angle, respectively, and K_b and K_θ are the force constants associated with the bond and angle terms, respectively. The use of harmonic terms for the bonds and valence angles is typically sufficient for molecular distortions near ambient temperatures and in the absence of bond breaking or making events, due to the bonds and angles staying close to their equilibrium values at room temperature. Dihedral or torsion angles represent the rotations that occur about a bond. These terms are oscillatory in nature (e.g., rotation about the central carbon–carbon bond in ethane changes the structure from a low energy staggered conformation, to a high energy eclipsed conformation, back to a low energy staggered conformation and so on), requiring the use of a sinusoidal function to accurately model them. The dihedral angle parameters (Eq. (2)) include the
Figure 1. Schematic diagram of the terms used to describe the structure of molecules in empirical force fields. Internal or intramolecular terms include bond lengths, b, valence angles, θ, and dihedral or torsion angles, χ. For the intermolecular interactions only the distance between atoms i and j, rij , is required.
force constant, Kχ, the periodicity or multiplicity, n, and the phase, δ. The magnitude of Kχ dictates the height of the barrier to rotation, such that Kχ associated with a double bond would be significantly larger than that for a single bond. The multiplicity, n, indicates the number of cycles per 360° rotation about the dihedral. In the case of an sp3–sp3 bond, as in ethane, n would equal three, while an sp2–sp2 C=C double bond would have n equal to two. The phase, δ, dictates the location of the maxima in the dihedral energy surface, allowing for the location of the minima for a dihedral with n = 2 to be shifted from 0° to 90° and so on. Typically, δ is equal to 0° or 180°, although recent extensions allow any value from 0° to 360° to be assigned to δ [7]. Each dihedral angle in a molecule may be treated with a sum of dihedral terms that have different multiplicities, as well as force constants and phases. The use of a summation of dihedral terms for a single torsion angle, a Fourier series, greatly enhances the flexibility of the dihedral term, allowing for more accurate reproduction of experimental and quantum mechanical (QM) energetic target data. Equation (3) describes the intermolecular, external or nonbond interaction terms, which are dependent on the distance, rij, between two atoms i and j (Fig. 1). As stated above, these terms dominate the interactions between molecules and, accordingly, condensed phase properties. Intermolecular interactions are also important for the structure of biological macromolecules
due to the large number of interactions that occur between different regions of biological polymers that dictate their 3D conformation (e.g., hydrogen bonds between Watson–Crick base pairs in DNA or between peptide bonds in α-helices or β-sheets in proteins). Parameters associated with the external terms are the well depth, εij, between atoms i and j, the minimum interaction radius, Rmin,ij, and the partial atomic charge, qi. The dielectric constant, εD, is generally treated as equal to one, the permittivity of vacuum, although exceptions do exist when implicit solvent models are used to treat the condensed phase environment [8]. The first term in Eq. (3) is used to treat the van der Waals (vdW) interactions. The particular form in Eq. (3) is referred to as the Lennard–Jones (LJ) 6-12 term. The 1/r^12 term represents the exchange-repulsion between atoms associated with overlap of the electron clouds of the individual atoms (i.e., the Pauli exclusion principle). The strong distance dependence of the repulsion is indicated by the 12th power of this term. Representing London’s dispersion interactions or instantaneous-dipole induced-dipole interactions is the 1/r^6 term, which is negative, indicating its favorable nature. In the LJ 6-12 equation there are two parameters: the well depth, εij, dictating the magnitude of the favorable London’s dispersion interactions between two atoms, i and j, and Rmin,ij, the distance between atoms i and j at which the minimum LJ interaction energy occurs; the latter is related to the vdW radius of an atom. Typically, εij and Rmin,ij are not determined for every possible interaction pair, i, j. Instead, εi and Rmin,i parameters are determined for the individual atom types (e.g., sp2 carbon vs sp3 carbon) and then combining rules are used to create the ij cross terms.
These combining rules are generally quite simple being either the arithmetic mean (i.e., Rmin,ij = (Rmin,i + √ Rmin, j )/2) or the geometric mean (i.e., εij = ( εi · ε j )), although other variations exist [9]. The use of combining rules greatly simplifies the determination of the εi and Rmin,i parameters. In special cases the force field can be supplemented by specific i, j LJ parameters, referred to as off-diagnol terms, to treat interactions between specific atom types that are poorly modeled by the use of combining rules. There are several commonly used alternate forms for treatment of the vdW interactions. The three primary alternatives to the LJ 6-12 term included in Eq. (3) are designed to “soften” the repulsive wall associated with Pauli exclusion, yielding better agreement with high-level QM data [9]. For example, the Buckingham potential [10] uses an exponential term to treat repulsion while a buffered 14-7 term is used in the MMFF force field [11–13]. A simple alternative is to replace the r 12 repulsion with an r 9 term. The final term contributing to the intermolecular interactions is the electrostatic or Coulombic term. This term involves the interaction between partial atomic charges, qi and q j , on atoms i and j divided by the distance, rij , between those atoms with the appropriate dielectric constant taken into account. Use of a charge representation for the individual atoms, or monopoles,
Interatomic potentials: molecules
513
effectively includes all higher order electronic interactions, such as dipoles and quadrapoles. As will be discussed below, the majority of force fields treat the partial atomic charges as static in nature, due to computational considerations. These are referred to as non-polarizable or additive force fields. Finally, the use of a dielectric constant, ε, of one is appropriate when the condensed phase environment is treated explicitly (i.e., use of explicit water molecules to treat an aqueous condensed phase). Combined, the Lennard–Jones and Coulombic interactions have been shown to produce an accurate representation of the interaction between molecules, including both the distance and angle dependencies of hydrogen bonds [14]. This success has allowed for the omission of explicit terms to treat hydrogen bonding from the majority of empirical force fields. It is important to emphasize that the LJ and electrostatic parameters are highly correlated, such that LJ parameters determined for a set of partial atomic charges will not be applicable to another set of charges. In addition, the values of the internal parameters are dependent on the external parameters. For example, the barrier to rotation about the C–C bond in ethane includes electrostatic and vdW interactions between the hydrogens as well as contributions from the bond, angle and dihedral terms. Accordingly, if the LJ parameters or charges are changed, the internal parameters will have to be adjusted to reproduce the correct energy barrier. Finally, condensed phase properties obtained from empirical force field calculations contain contributions for the conformations of the molecules being studied as well as interatomic interactions between those molecules, emphasizing the importance of both internal and external portions of the force field for accurate condensed phase simulations.
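The dihedral and nonbond terms discussed in this section can be illustrated with a short Python sketch. It is not taken from any particular program. It assumes the commonly used functional forms Kχ(1 + cos(nχ − δ)) for a dihedral term, εij[(Rmin,ij/rij)^12 − 2(Rmin,ij/rij)^6] for the LJ 6-12 term, and qi qj/(εD rij) for the Coulombic term, with energies in kcal/mol, distances in Å and charges in units of the electron charge. The function names and the numerical parameters in the usage section are hypothetical and chosen only for illustration.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2). This is the value used by
# CHARMM; other programs use slightly different constants (assumption here).
COULOMB_KCAL = 332.0716


def dihedral_energy(chi_deg, terms):
    """Fourier series for one torsion angle.

    terms is a list of (K, n, delta_deg) tuples; each contributes
    K * (1 + cos(n*chi - delta)).
    """
    chi = math.radians(chi_deg)
    return sum(k * (1.0 + math.cos(n * chi - math.radians(delta_deg)))
               for k, n, delta_deg in terms)


def combine(eps_i, rmin_i, eps_j, rmin_j):
    """Combining rules from the text: geometric mean for the well depth,
    arithmetic mean for the minimum interaction radius."""
    return math.sqrt(eps_i * eps_j), 0.5 * (rmin_i + rmin_j)


def lj_energy(r, eps_ij, rmin_ij):
    """LJ 6-12 energy written so that the minimum, -eps_ij, is at r = rmin_ij."""
    x6 = (rmin_ij / r) ** 6
    return eps_ij * (x6 * x6 - 2.0 * x6)


def coulomb_energy(r, q_i, q_j, dielectric=1.0):
    """Interaction between two partial atomic charges separated by r."""
    return COULOMB_KCAL * q_i * q_j / (dielectric * r)


if __name__ == "__main__":
    # Hypothetical threefold term, as for an sp3-sp3 bond (n = 3, delta = 0).
    threefold = [(0.2, 3, 0.0)]
    for chi in (0.0, 60.0, 120.0, 180.0):
        print(chi, dihedral_energy(chi, threefold))

    # Hypothetical per-atom-type LJ parameters combined into a cross term.
    eps_ij, rmin_ij = combine(0.10, 3.6, 0.05, 2.6)
    print(eps_ij, rmin_ij, lj_energy(rmin_ij, eps_ij, rmin_ij))
```

Writing the LJ term in terms of Rmin,ij rather than the zero-crossing distance makes the two parameters directly readable from the curve: the depth of the minimum is εij and its position is Rmin,ij.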
2.
Parameter Optimization
Because the potential energy function used in empirical force fields is so simple, it is essential that the parameters in the function be optimized so that the force field yields accurate results, as judged by how well it reproduces the experimental regimen. Parameter optimization is based on reproducing a set of target data. The target data may be obtained from QM calculations or from experiment. QM data are generally readily accessible for most molecules; however, limitations in the QM level of theory, especially with respect to the treatment of dispersion interactions [15, 16], require the use of experimental data when available [6]. In the rest of this article, we will focus on optimization of the intermolecular parameters due to their dominant role in the interactions between molecules. Readers can obtain information on the optimization of internal parameters elsewhere [5, 11–13, 16, 17].

A large number of studies have focused on the determination of the electrostatic parameters, the partial atomic charges, qi. The most common charge determination methods are the supramolecular and QM electrostatic potential (ESP) approaches. Other variations include bond charge increments [19, 20] and electronegativity equalization methods [21]. An important consideration with the determination of partial atomic charges, related to the Coulombic treatment of electrostatics in Eq. (3), is the omission of explicit electronic polarizability or induction. Thus, it is necessary for static charges to reproduce the polarization that occurs in the condensed phase. To do this, the partial atomic charges of a molecule are “enhanced”, leading to an overestimation of the dipole moment as compared to the gas phase value and yielding an implicitly polarized model. For example, many of the water models used in additive empirical force fields (e.g., TIP3P, TIP4P, SPC) have dipole moments in the vicinity of 2.2 debye [22], vs. the gas phase value of 1.85 debye for water. Such implicit polarizability allows additive empirical force fields based on Eq. (3) to reproduce a wide variety of condensed phase properties [23]. However, such models are limited when treating molecules in environments of significantly different polar character.

Determination of partial atomic charges via the supramolecular approach is used in the OPLS [24, 25] and CHARMM [26–29] force fields. In this approach, the charges are optimized to reproduce QM determined minimum interaction energies and geometries of a model compound with, typically, individual water molecules or for model compound dimers. Historically, the HF/6-31G* level of theory was used for the QM calculations. This level typically overestimates dipole moments [30], thereby approximating the influence of the condensed phase on the obtained charge distribution and leading to the implicitly polarized model.
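The implicit polarization of additive water models noted above can be checked directly from a model's geometry and charges. The following Python sketch computes the dipole moment of a rigid three-site water model in which the charges sit on the atoms. The TIP3P and SPC parameters are the published ones as best recalled here (hydrogen charge, O–H bond length and H–O–H angle) and should be verified against the original papers before being relied on; the function name is hypothetical. The sketch does not apply to models such as TIP4P, where the negative charge is displaced from the oxygen.

```python
import math

# 1 e*Angstrom expressed in debye.
E_ANGSTROM_TO_DEBYE = 4.80320


def three_site_dipole(q_h, r_oh, hoh_deg):
    """Dipole moment (debye) of a rigid three-site water model.

    The oxygen carries -2*q_h and is placed at the origin; the hydrogens
    carry +q_h and lie in the xy-plane, symmetric about the x-axis.
    """
    half_angle = math.radians(hoh_deg) / 2.0
    sites = [
        (-2.0 * q_h, 0.0, 0.0),
        (q_h, r_oh * math.cos(half_angle), r_oh * math.sin(half_angle)),
        (q_h, r_oh * math.cos(half_angle), -r_oh * math.sin(half_angle)),
    ]
    mu_x = sum(q * x for q, x, _ in sites)
    mu_y = sum(q * y for q, _, y in sites)
    return math.hypot(mu_x, mu_y) * E_ANGSTROM_TO_DEBYE


# Model parameters: charge on H (e), O-H distance (Angstrom), H-O-H angle (deg).
TIP3P = dict(q_h=0.417, r_oh=0.9572, hoh_deg=104.52)
SPC = dict(q_h=0.410, r_oh=1.0, hoh_deg=109.47)

GAS_PHASE_DIPOLE = 1.85  # debye, experimental gas phase value quoted in the text

if __name__ == "__main__":
    for name, model in (("TIP3P", TIP3P), ("SPC", SPC)):
        mu = three_site_dipole(**model)
        print(name, round(mu, 2), "D; ratio to gas phase:",
              round(mu / GAS_PHASE_DIPOLE, 2))
```

Both models give dipole moments well above the gas phase value, which is the enhancement that lets a fixed-charge model mimic the polarization present in the liquid.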
In addition, the supramolecular approach implicitly includes local polarization effects due to the charge induction caused by the two interacting molecules, facilitating determination of charge distributions appropriate for the condensed phase. With CHARMM it was found that an additional scaling of the QM interaction energies prior to charge fitting was necessary to obtain the correct implicit polarization for accurate condensed phase studies of polar neutral molecules [31]. Even though recent studies have shown that QM methods can accurately reproduce gas phase experimental interaction energies for a range of model compound dimers [32, 33], it is important to maintain the QM level of theory that was historically used for a particular force field when extending that force field to novel molecules. This ensures that the balance of the nonbond interactions between different molecules in the system being studied is maintained. Finally, an advantage of charges obtained from the supramolecular approach is that they are generally developed for functional groups, such that they may be transferred between molecules, allowing charge assignment to novel molecules to be performed readily.

ESP charge fitting methods are based on the adjustment of charges to reproduce a QM determined ESP mapped onto a grid surrounding the model compound. Such methods are convenient and a number of charge fitting methods based on this approach have been developed [34–38]. However, there are limitations in ESP fitting methods. First, the ability to unambiguously fit charges to an ESP is not trivial [37]. Second, charges on “buried” atoms (e.g., a carbon to which three or four nonhydrogen atoms are covalently bound) tend to be underdetermined, requiring the use of restraints during fitting [36]; the latter method is referred to as Restrained ESP (RESP). Third, since the charges are based on a gas phase QM wave function, they may not necessarily be consistent with the condensed phase, although recent developments are addressing this limitation [39]. Finally, multiple conformations of a molecule, for which different charge distributions typically exist, must be taken into account [30]. It should be noted that the last two problems must also be considered when using the supramolecular approach. As with the supramolecular approach, the QM level of theory was often HF/6-31G*, as in the AMBER force fields [41], because that level typically overestimates the dipole moment. More recently, higher level QM calculations have been applied in conjunction with the RESP approach [42], although their ability to reproduce condensed phase thermodynamic properties has not been tested. Clearly, both the supramolecular and ESP methods are useful for the determination of partial atomic charges. Which one is used, therefore, should be based on compatibility with the method used for the remainder of the force field being applied.

Accurate optimization of the LJ parameters is one of the most important aspects in the development of a force field for condensed phase simulations. Due to limitations in QM methods for the determination of dispersion interactions, optimization of LJ parameters is dominated by the reproduction of thermodynamic properties in condensed phase simulations, generally of neat liquids [43, 44]. Typically, the LJ parameters for a model compound are optimized to reproduce experimentally measured values such as heats of vaporization, densities, isothermal compressibilities and heat capacities. Alternatively, heats or free energies of aqueous solvation, partial molar volumes or heats of sublimation and lattice geometries of crystals [45, 46] can be used as the target data. These methods have been applied extensively for development of the force fields associated with the programs AMBER, CHARMM, and OPLS. However, it should be noted that LJ parameters are typically underdetermined, because only a few experimental observations are available for the optimization of a significantly larger number of LJ parameters. This enhances the parameter correlation problem, in which LJ parameters for different atoms in a molecule (e.g., H and C in ethane) can compensate for each other, such that it is difficult to accurately determine the “correct” LJ parameters of a molecule based on reproduction of condensed phase properties alone [5]. To overcome this problem, an approach has been developed that determines the relative values of the LJ parameters based on high level QM data [47] and the absolute values based on the reproduction of experimental data [16, 48]. This approach is tedious as it requires supramolecular interactions involving rare gases; however, once satisfactory LJ parameters are optimized for atoms in a class of functional groups they can often be directly transferred to other molecules with those functional groups without further optimization.
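To make the use of neat liquid target data concrete, the Python sketch below shows how a heat of vaporization is commonly estimated from simulation averages and how computed and experimental properties might be combined into a single error function for LJ optimization. It assumes the standard relation ΔHvap = ⟨E⟩gas − ⟨E⟩liquid + RT, with the liquid energy expressed per molecule and the gas treated as ideal. The error function, its weights and all names are hypothetical and are not the procedure of any specific force field.

```python
GAS_CONSTANT = 0.0019872041  # kcal/(mol*K)


def heat_of_vaporization(e_gas, e_liquid_per_molecule, temperature):
    """Estimate the heat of vaporization (kcal/mol).

    e_gas is the average potential energy of one isolated molecule and
    e_liquid_per_molecule is the average potential energy of the liquid
    divided by the number of molecules, both in kcal/mol. The RT term is
    the pressure-volume work for an ideal gas, with the liquid volume
    neglected.
    """
    return e_gas - e_liquid_per_molecule + GAS_CONSTANT * temperature


def target_error(computed, experimental, weights):
    """Weighted sum of squared relative errors over the target properties."""
    return sum(
        weights[name]
        * ((computed[name] - experimental[name]) / experimental[name]) ** 2
        for name in experimental
    )


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from any simulation.
    computed = {
        "hvap": heat_of_vaporization(0.0, -9.9, 298.15),
        "density": 0.98,
    }
    experimental = {"hvap": 10.5, "density": 1.00}
    weights = {"hvap": 1.0, "density": 1.0}
    print(computed["hvap"], target_error(computed, experimental, weights))
```

Because only a handful of such properties are available per compound, many different combinations of LJ parameters can give a similar value of the error function, which is the underdetermination discussed above.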
3.
Considerations for Condensed Phase Simulations
Proper application of an empirical force field is obviously essential for the success of a condensed phase calculation. An important consideration is the inclusion of all nonbond interactions between all atom–atom pairs. For the electrostatic interactions this can be achieved via Ewald methods [49], including the particle mesh Ewald approach [50], for periodic systems, while reaction field methods can be used to simulate finite (e.g., spherical) systems [51–53]. For the LJ interactions, long-range corrections exist that treat the interactions beyond the atom–atom truncation distance (i.e., those beyond the distance within which the atom–atom interactions are calculated explicitly) as homogeneous in nature [54, 55]. Another important consideration is the use of integrators that generate proper ensembles in MD simulations, allowing for direct comparison with experimental data [3, 57–60]. In addition, a number of methods are available to increase the sampling of conformational space [60–62]. The availability and proper use of these different methods greatly facilitate investigations of molecular interactions via condensed phase simulations.
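The idea behind the homogeneous long-range LJ correction can be sketched for the simplest case of a single atom type. Assuming the radial distribution function equals one beyond the cutoff rc and the LJ form ε[(Rmin/r)^12 − 2(Rmin/r)^6], the energy per atom from the neglected pairs is 2πρ ∫ u(r) r^2 dr taken from rc to infinity, which integrates to 2πρε[Rmin^12/(9 rc^9) − 2 Rmin^6/(3 rc^3)]. The Python sketch below evaluates this expression and checks it against a direct numerical integration. The function names and parameter values are hypothetical, and production codes handle mixtures of atom types and pressure corrections that are omitted here.

```python
import math


def lj(r, eps, rmin):
    """LJ 6-12 pair energy with minimum -eps at r = rmin."""
    x6 = (rmin / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)


def lj_tail_energy(eps, rmin, density, cutoff):
    """Analytic long-range correction per atom, assuming g(r) = 1 for r > cutoff.

    density is the number density of atoms (atoms per cubic Angstrom).
    """
    repulsive = rmin ** 12 / (9.0 * cutoff ** 9)
    attractive = 2.0 * rmin ** 6 / (3.0 * cutoff ** 3)
    return 2.0 * math.pi * density * eps * (repulsive - attractive)


def lj_tail_numeric(eps, rmin, density, cutoff, r_max=400.0, dr=0.01):
    """Midpoint-rule integration of the same quantity out to r_max."""
    total = 0.0
    r = cutoff + 0.5 * dr
    while r < r_max:
        total += lj(r, eps, rmin) * r * r * dr
        r += dr
    return 2.0 * math.pi * density * total


if __name__ == "__main__":
    # Hypothetical parameters: eps in kcal/mol, rmin and cutoff in Angstrom.
    args = dict(eps=0.15, rmin=3.8, density=0.033, cutoff=12.0)
    print(lj_tail_energy(**args), lj_tail_numeric(**args))
```

The correction is negative and decays only as 1/rc^3, which is why neglecting it leads to systematic errors in densities and heats of vaporization at commonly used cutoffs.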
4.
Available Empirical Force Fields
A variety of empirical force fields have been developed. Force fields that focus on biological molecules include AMBER [18, 42], CHARMM [26–29], GROMOS [63, 64], and OPLS [24, 25]. All of these force fields have been parametrized to account for condensed phase properties, such that they all treat molecular interactions with a reasonably high level of accuracy [65, 66]. However, these force fields, to varying extents, do not treat the full range of pharmaceutically relevant compounds. Force fields designed for a broad range of compounds include MMFF [11–13, 67], CVFF [17, 68], the commercial CHARMm force field [69], CFF [70], COMPASS [71], the MM2/MM3/MM4 series [72–74], UFF [75], DREIDING [76], and the Tripos force field (Tripos, Inc.), among others. However, these force fields have been designed primarily to reproduce internal geometries, vibrations and conformational energies, often sacrificing the quality of the nonbond interactions [65]. Exceptions are MMFF and COMPASS, where nonbond parameters have been investigated at a reasonable level of detail. With all force fields the user is advised to perform tests on molecules for which experimental data are available to validate the quality of the model.
5.
Electronic Polarizability
Future improvements in the treatment of interatomic interactions between molecules will be based on the extension of the treatment of electrostatics to include explicit treatment of electronic polarizability [77, 78]. There are several methods by which electronic polarizability may be included in a potential energy function. These include fluctuating charge models [79–85], induced dipole models [85–89], or a combination of those methods [90, 91]. The classical Drude oscillator is an alternative method [92, 93] in which a “Drude” particle is attached to the nucleus of each atom and, by applying the appropriate charges to the atoms and “Drude” particles, the polarization response can be modeled. This method is also referred to as the shell model and has only been used in a few studies thus far [94–96]. In all of these approaches, the polarizability is solved analytically, iteratively or, in the case of MD simulations, via extended Lagrangian methods [3, 77]. In extended Lagrangian methods the polarizability is treated as a dynamic variable in MD simulations. Extended Lagrangian methods are important for the inclusion of polarizability in empirical force fields as they offer the necessary computational efficiency to perform simulations on large systems.

To date, work on water has dominated the application of polarizable force fields to molecular interactions. Polarizable water models have been shown to accurately treat both the gas and condensed phase properties [78, 86–89, 95, 97–99]. The ability to treat both the gas and condensed phases accurately marks a significant improvement over force fields where polarizability is not included explicitly. Other examples where the inclusion of electronic polarization has been shown to increase the accuracy of the treatment of molecular interactions include the solvation of ions [79, 85, 100, 101], ion-pair interactions in micellar systems [102], condensed phase properties of a variety of small molecules [78, 83, 103–107], cation–π interactions [103, 104], and interfacial systems [108]. With respect to biological macromolecules, only a few successful applications have been made thus far [109–111]. Thus, explicit treatment of electronic polarizability in empirical force fields, although computationally more expensive than nonpolarizable models, is anticipated to make a significant contribution to the understanding of molecular interactions at an atomic level of detail.

An interesting observation with electronic polarizability is the apparent inability to apply gas phase polarizabilities to condensed phase systems, as evidenced in studies on water [95]. This phenomenon appears to be associated with the Pauli exclusion principle, such that the deformability of the electron cloud due to induction by the environment is hindered by the presence of adjacent molecules in the condensed phase [112]. This would lead to a decreased effective polarizability in the condensed phase. Such a phenomenon has more recently been observed in QM studies of water clusters [113]. Further studies are required to better understand this phenomenon and properly treat it in empirical force fields.
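The classical Drude oscillator described earlier in this section lends itself to a minimal one-dimensional Python sketch, which also shows where a reduced condensed phase polarizability would enter. A Drude particle of charge qD is bound to its nucleus by a harmonic spring with force constant kD; in a uniform field E the energy is ½ kD d^2 − qD E d, which is minimized at the displacement d = qD E/kD. The induced dipole is then qD d = (qD^2/kD) E, so the polarizability is α = qD^2/kD. All quantities are assumed to be in one consistent set of units, and the function names and the scale factor are hypothetical, introduced only for illustration.

```python
def drude_polarizability(q_drude, k_spring):
    """Polarizability of a Drude oscillator: alpha = q_D**2 / k_D."""
    return q_drude ** 2 / k_spring


def drude_displacement(q_drude, k_spring, field):
    """Displacement of the Drude particle that minimizes the energy."""
    return q_drude * field / k_spring


def drude_energy(d, q_drude, k_spring, field):
    """Spring energy plus the interaction of the displaced charge with the field."""
    return 0.5 * k_spring * d * d - q_drude * field * d


def induced_dipole(q_drude, k_spring, field, scale=1.0):
    """Induced dipole alpha * E.

    scale is a hypothetical factor below one that mimics a reduced
    effective polarizability in the condensed phase.
    """
    return scale * drude_polarizability(q_drude, k_spring) * field


if __name__ == "__main__":
    q_d, k_d, field = 2.0, 4.0, 0.5  # illustrative values in arbitrary units
    d_min = drude_displacement(q_d, k_d, field)
    print("displacement:", d_min)
    print("gas phase dipole:", induced_dipole(q_d, k_d, field))
    print("scaled dipole:", induced_dipole(q_d, k_d, field, scale=0.7))
```

In an MD simulation the Drude particle positions would be either relaxed to this energy minimum at every step or propagated as dynamic variables in an extended Lagrangian scheme, as noted in the text.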
6.
Summary
Interatomic interactions involving molecules dominate the properties of condensed phase systems. Due to the number of particles in such systems, it is typically necessary to apply computationally efficient empirical force fields to study them via theoretical methods. The success of empirical force fields is based, in large part, on their accuracy in reproducing a variety of experimental observations, with that accuracy being dictated by the quality of the optimization of the parameters that comprise the empirical force field. Proper optimization requires careful selection of target data as well as use of the appropriate optimization process. In cases where empirical force field parameters are being developed as an extension of an available force field, the optimization strategy must be selected to ensure consistency with the previously parameterized molecules. These considerations will maximize the likelihood that the atomistic details obtained from condensed phase simulations are representative of the experimental regimen. Finally, when analyzing results from condensed phase simulations, possible biases due to the parameters themselves must be considered when interpreting the data.
Acknowledgments
Financial support from the NIH (GM51501) and the University of Maryland, School of Pharmacy, Computer-Aided Drug Design Center is acknowledged.
References
[1] O.M. Becker, A.D. MacKerell, Jr., B. Roux, and M. Watanabe (eds.), Computational Biochemistry and Biophysics, Marcel Dekker, Inc., New York, 2001.
[2] W.L. Jorgensen, “Theoretical studies of medium effects on conformational equilibria,” J. Phys. Chem., 87, 5304–5312, 1983.
[3] M.E. Tuckerman and G.J. Martyna, “Understanding modern molecular dynamics: techniques and applications,” J. Phys. Chem. B, 104, 159–178, 2000.
[4] D.A. McQuarrie, Statistical Mechanics, Harper & Row, New York, 1976.
[5] A.D. MacKerell, Jr., “Atomistic models and force fields,” In: O.M. Becker, A.D. MacKerell, Jr., B. Roux, and M. Watanabe (eds.), Computational Biochemistry and Biophysics, Marcel Dekker, Inc., New York, pp. 7–38, 2001.
[6] A.D. MacKerell, Jr., “Empirical force fields for biological macromolecules: overview and issues,” J. Comput. Chem., 25, 1584–1604, 2004.
[7] A. Blondel and M. Karplus, “New formulation of derivatives of torsion angles and improper torsion angles in molecular mechanics: elimination of singularities,” J. Comput. Chem., 17, 1132–1141, 1996.
[8] M. Feig, A. Onufriev, M.S. Lee, W. Im, D.A. Case, and C.L. Brooks, III, “Performance comparison of generalized Born and Poisson methods in the calculation of electrostatic solvation energies for protein structures,” J. Comput. Chem., 25, 265–284, 2004.
[9] T.A. Halgren, “Representation of van der Waals (vdW) interactions in molecular mechanics force fields: potential form, combination rules, and vdW parameters,” J. Am. Chem. Soc., 114, 7827–7843, 1992.
[10] A.D. Buckingham and P.W. Fowler, “A model for the geometries of van der Waals complexes,” Can. J. Chem., 63, 2018, 1985.
[11] T.A. Halgren, “Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94,” J. Comput. Chem., 17, 490–519, 1996a.
[12] T.A. Halgren, “Merck molecular force field. II. MMFF94 van der Waals and electrostatic parameters for intermolecular interactions,” J. Comput. Chem., 17, 520–552, 1996b.
[13] T.A. Halgren, “Merck molecular force field. III. Molecular geometries and vibrational frequencies for MMFF94,” J. Comput. Chem., 17, 553–586, 1996c.
[14] W.E. Reiher, Theoretical Studies of Hydrogen Bonding, Harvard University, 1985.
[15] G. Chalasinski and M.M. Szczesniak, “Origins of structure and energetics of van der Waals clusters from ab initio calculations,” Chem. Rev., 94, 1723–1765, 1994.
[16] I.J. Chen, D. Yin, and A.D. MacKerell, Jr., “Combined ab initio/empirical optimization of Lennard–Jones parameters for polar neutral compounds,” J. Comput. Chem., 23, 199–213, 2002.
[17] C.S. Ewig, R. Berry, U. Dinur, J.R. Hill, M.-J. Hwang, H. Li, C. Liang, J. Maple, Z. Peng, T.P. Stockfisch, T.S. Thacher, L. Yan, X. Ni, and A.T. Hagler, “Derivation of class II force fields. VIII. Derivation of a general quantum mechanical force field for organic compounds,” J. Comput. Chem., 22, 1782–1800, 2001.
[18] J. Wang and P.A. Kollman, “Automatic parameterization of force field by systematic search and genetic algorithms,” J. Comput. Chem., 22, 1219–1228, 2001.
[19] B.L. Bush, C.I. Bayly, and T.A. Halgren, “Consensus bond-charge increments fitted to electrostatic potential or field of many compounds: application of MMFF94 training set,” J. Comput. Chem., 20, 1495–1516, 1999.
[20] A. Jakalian, B.L. Bush, D.B. Jack, and C.I. Bayly, “Fast, efficient generation of high-quality atomic charges. AM1-BCC model: I. Method,” J. Comput. Chem., 21, 132–146, 2000.
[21] M.K. Gilson, H.S. Gilson, and M.J. Potter, “Fast assignment of accurate partial atomic charges: an electronegativity equalization method that accounts for alternate resonance forms,” J. Chem. Inf. Comp. Sci., 43, 1982–1997, 2003.
[22] W.L. Jorgensen, J. Chandrasekhar, J.D. Madura, R.W. Impey, and M.L. Klein, “Comparison of simple potential functions for simulating liquid water,” J. Chem. Phys., 79, 926–935, 1983.
[23] R.C. Rizzo and W.L. Jorgensen, “OPLS all-atom model for amines: resolution of the amine hydration problem,” J. Am. Chem. Soc., 121, 4827–4836, 1999.
[24] W.L. Jorgensen and J. Tirado-Rives, “The OPLS potential functions for proteins. Energy minimizations for crystals of cyclic peptides and crambin,” J. Am. Chem. Soc., 110, 1657–1666, 1988.
[25] W.L. Jorgensen, D.S. Maxwell, and J. Tirado-Rives, “Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids,” J. Am. Chem. Soc., 118, 11225–11236, 1996.
[26] A.D. MacKerell, Jr., D. Bashford, M. Bellott, R.L. Dunbrack, Jr., J. Evanseck, M.J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph, L. Kuchnir, K. Kuczera, F.T.K. Lau, C. Mattos, S. Michnick, T. Ngo, D.T. Nguyen, B. Prodhom, W.E. Reiher, III, B. Roux, M. Schlenkrich, J. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, “All-hydrogen empirical potential for molecular modeling and dynamics studies of protein using the Charmm22 force field,” J. Phys. Chem. B, 102, 3586–3616, 1998.
[27] A.D. MacKerell, Jr., D. Bashford, M. Bellott, R.L. Dunbrack, Jr., J. Evanseck, M.J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph, L. Kuchnir, K. Kuczera, F.T.K. Lau, C. Mattos, S. Michnick, T. Ngo, D.T. Nguyen, B. Prodhom, W.E. Reiher, III, B. Roux, M. Schlenkrich, J. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, “All-atom empirical potential for molecular modeling and dynamics studies of proteins,” J. Phys. Chem. B, 102, 3586–3616, 1998.
[28] N. Foloppe and A.D. MacKerell, Jr., “All-atom empirical force field for nucleic acids: 1) parameter optimization based on small molecule and condensed phase macromolecular target data,” J. Comput. Chem., 21, 86–104, 2000.
[29] S.E. Feller, K. Gawrisch, and A.D. MacKerell, Jr., “Polyunsaturated fatty acids in lipid bilayers: intrinsic and environmental contributions to their unique physical properties,” J. Am. Chem. Soc., 124, 318–326, 2002.
[30] P. Cieplak, W.D. Cornell, C.I. Bayly, and P.A. Kollman, “Application of the multimolecule and multiconformational RESP methodology to biopolymers: charge derivation for DNA, RNA, and proteins,” J. Comput. Chem., 16, 1357–1377, 1995.
[31] A.D. MacKerell, Jr. and M. Karplus, “Importance of attractive van der Waals contributions in empirical energy function models for the heat of vaporization of polar liquids,” J. Phys. Chem., 95, 10559–10560, 1991.
[32] K. Kim and R.A. Friesner, “Hydrogen bonding between amino acid backbone and side chain analogues: a high-level ab initio study,” J. Am. Chem. Soc., 119, 12952–12961, 1997.
[33] N. Huang and A.D. MacKerell, Jr., “An ab initio quantum mechanical study of hydrogen-bonded complexes of biological interest,” J. Phys. Chem. B, 106, 7820–7827, 2002.
[34] U.C. Singh and P.A. Kollman, “An approach to computing electrostatic charges for molecules,” J. Comput. Chem., 5, 129–145, 1984.
[35] L.E. Chirlian and M.M. Francl, “Atomic charges derived from electrostatic potentials: a detailed study,” J. Comput. Chem., 8, 894–905, 1987.
[36] K.M. Merz, “Analysis of a large data base of electrostatic potential derived atomic charges,” J. Comput. Chem., 13, 749–767, 1992.
[37] C.I. Bayly, P. Cieplak, W.D. Cornell, and P.A. Kollman, “A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model,” J. Phys. Chem., 97, 10269–10280, 1993.
[38] R.H. Henchman and J.W. Essex, “Generation of OPLS-like charges from molecular electrostatic potential using restraints,” J. Comput. Chem., 20, 483–498, 1999.
[39] A. Laio, J. VandeVondele, and U. Rothlisberger, “D-RESP: dynamically generated electrostatic potential derived charges from quantum mechanics/molecular mechanics simulations,” J. Phys. Chem. B, 106, 7300–7307, 2002.
[40] M.M. Francl, C. Carey, L.E. Chirlian, and D.M. Gange, “Charge fit to electrostatic potentials. II. Can atomic charges be unambiguously fit to electrostatic potentials?” J. Comput. Chem., 17, 367–383, 1996.
[41] W.D. Cornell, P. Cieplak, C.I. Bayly, I.R. Gould, K.M. Merz, D.M. Ferguson, D.C. Spellmeyer, T. Fox, J.W. Caldwell, and P.A. Kollman, “A second generation force field for the simulation of proteins, nucleic acids, and organic molecules,” J. Am. Chem. Soc., 117, 5179–5197, 1995.
[42] Y. Duan, C. Wu, S. Chowdhury, M.C. Lee, G. Xiong, W. Zhang, R. Yang, P. Cieplak, R. Luo, T. Lee, J. Caldwell, J. Wang, and P. Kollman, “A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations,” J. Comput. Chem., 24, 1999–2012, 2003.
[43] W.L. Jorgensen, “Optimized intermolecular potential functions for liquid hydrocarbons,” J. Am. Chem. Soc., 106, 6638–6646, 1984.
[44] W.L. Jorgensen, “Optimized intermolecular potential functions for liquid alcohols,” J. Phys. Chem., 90, 1276–1284, 1986.
[45] A. Warshel and S. Lifson, “Consistent force field calculations. II. Crystal structures, sublimation energies, molecular and lattice vibrations, molecular conformations, and enthalpy of alkanes,” J. Chem. Phys., 53, 582–594, 1970.
[46] A.D. MacKerell, Jr., J. Wiórkiewicz-Kuczera, and M. Karplus, “An all-atom empirical energy function for the simulation of nucleic acids,” J. Am. Chem. Soc., 117, 11946–11975, 1995.
[47] D. Yin and A.D. MacKerell, Jr., “Ab initio calculations on the use of helium and neon as probes of the van der Waals surfaces of molecules,” J. Phys. Chem., 100, 2588–2596, 1996.
[48] D. Yin and A.D. MacKerell, Jr., “Combined ab initio/empirical approach for the optimization of Lennard–Jones parameters,” J. Comput. Chem., 19, 334–348, 1998.
[49] P.P. Ewald, “Die Berechnung optischer und elektrostatischer Gitterpotentiale,” Annalen der Physik, 64, 253–287, 1921.
[50] T. Darden, “Treatment of long-range forces and potentials,” In: O.M. Becker, A.D. MacKerell, Jr., B. Roux, and M. Watanabe (eds.), Computational Biochemistry and Biophysics, Marcel Dekker, Inc., New York, pp. 91–114, 2001.
[51] D. Beglov and B. Roux, “Finite representation of an infinite bulk system: solvent boundary potential for computer simulations,” J. Chem. Phys., 100, 9050–9063, 1994.
[52] T.C. Bishop, R.D. Skeel, and K. Schulten, “Difficulties with multiple time stepping and fast multipole algorithm in molecular dynamics,” J. Comput. Chem., 18, 1785–1791, 1997.
[53] W. Im, S. Bernèche, and B. Roux, “Generalized solvent boundary potential for computer simulations,” J. Chem. Phys., 114, 2924–2937, 2001.
[54] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, New York, 1989.
[55] P. Lague, R.W. Pastor, and B.R. Brooks, “A pressure-based long-range correction for Lennard–Jones interactions in molecular dynamics simulations: application to alkanes and interfaces,” J. Phys. Chem. B, 108, 363–368, 2004.
[56] M. Tuckerman, B.J. Berne, and G.J. Martyna, “Reversible multiple time scale molecular dynamics,” J. Chem. Phys., 97, 1990–2001, 1992.
[57] G.J. Martyna, D.J. Tobias, and M.L. Klein, “Constant pressure molecular dynamics algorithms,” J. Chem. Phys., 101, 4177–4189, 1994.
[58] S.E. Feller, Y. Zhang, R.W. Pastor, and B.R. Brooks, “Constant pressure molecular dynamics simulation: the Langevin piston method,” J. Chem. Phys., 103, 4613–4621, 1995.
[59] E. Barth and T. Schlick, “Extrapolation versus impulse in multiple-timestepping schemes. II. Linear analysis and applications to Newtonian and Langevin dynamics,” J. Chem. Phys., 109, 1633–1642, 1998.
[60] R. Elber and M. Karplus, “Enhanced sampling in molecular dynamics: use of the time-dependent Hartree approximation for a simulation of carbon monoxide diffusion through myoglobin,” J. Am. Chem. Soc., 112, 9161–9175, 1990.
[61] U.H.E. Hansmann, “Parallel tempering algorithm for conformational studies of biological molecules,” Chem. Phys. Lett., 281, 140–150, 1997.
[62] C. Simmerling, T. Fox, and P.A. Kollman, “Use of locally enhanced sampling in free energy calculations: testing and application to the α→β anomerization of glucose,” J. Am. Chem. Soc., 120, 5771–5782, 1998.
[63] W.F. van Gunsteren, “GROMOS. Groningen molecular simulation program package,” University of Groningen, Groningen, 1987.
[64] W.F. van Gunsteren, S.R. Billeter, A.A. Eising, P.H. Hünenberger, P. Krüger, A.E. Mark, W.R.P. Scott, and I.G. Tironi, Biomolecular Simulation: The GROMOS96 Manual and User Guide, BIOMOS b.v., Zürich, 1996.
[65] G. Kaminski and W.L. Jorgensen, “Performance of the AMBER94, MMFF94, and OPLS-AA force fields for modeling organic liquids,” J. Phys. Chem., 100, 18010–18013, 1996.
[66] M.R. Shirts, J.W. Pitera, W.C. Swope, and V.S. Pande, “Extremely precise free energy calculations of amino acid side chain analogs: comparison of common molecular mechanics force fields for proteins,” J. Chem. Phys., 119, 5740–5761, 2003.
[67] T.A. Halgren, “MMFF VII. Characterization of MMFF94, MMFF94s, and other widely available force fields for conformational energies and for intermolecular-interaction energies and geometries,” J. Comput. Chem., 20, 730–748, 1999.
[68] S. Lifson, A.T. Hagler, and P. Dauber, “Consistent force field studies of intermolecular forces in hydrogen-bonded crystals. 1. Carboxylic acids, amides, and the C=O...H hydrogen bonds,” J. Am. Chem. Soc., 101, 5111–5121, 1979.
[69] F.A. Momany and R. Rone, “Validation of the general purpose QUANTA 3.2/CHARMm force field,” J. Comput. Chem., 13, 888–900, 1992.
[70] M.J. Hwang, T.P. Stockfisch, and A.T. Hagler, “Derivation of class II force fields. 2. Derivation and characterization of a class II force field, CFF93, for the alkyl functional group and alkane molecules,” J. Am. Chem. Soc., 116, 2515–2525, 1994.
[71] H. Sun, “COMPASS: an ab initio force-field optimized for condensed-phase applications. Overview with details on alkane and benzene compounds,” J. Phys. Chem. B, 102, 7338–7364, 1998.
[72] U. Burkert and N.L. Allinger, Molecular Mechanics, American Chemical Society, Washington, D.C., 1982.
[73] N.L. Allinger, Y.H. Yuh, and J.H. Lii, “Molecular mechanics, the MM3 force field for hydrocarbons. 1,” J. Am. Chem. Soc., 111, 8551–8566, 1989.
[74] N.L. Allinger, K.H. Chen, J.H. Lii, and K.A. Durkin, “Alcohols, ethers, carbohydrates, and related compounds. I. The MM4 force field for simple compounds,” J. Comput. Chem., 24, 1447–1472, 2003.
Interatomic potentials: molecules
523
[75] A.K. Rapp´e, C.J. Colwell, W.A. Goddard, III, and W.M. Skiff, “UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations,” J. Amer. Chem. Soc., 114, 10024–10035, 1992. [76] S.L. Mayo, B.D. Olafson, and I. Goddard, W.A. “DREIDING: a generic force field for molecular simulations,” J. Phys. Chem., 94, 8897–8909, 1990. [77] T.A. Halgren and W. Damm, “Polarizable force fields,” Curr. Opin. Struct. Biol., 11, 236–242, 2001. [78] S.W. Rick and S.J. Stuart, “Potentials and algorithms for incorporating polarizability in computer simulations,” Rev. Comp. Chem., 18, 89–146, 2002. [79] S.W. Rick, S. J. Stuart, J. S. Bader, and B. J. Berne, “Fluctuating charge force fields for aqueous solutions,” J. Mol. Liq., 66/66, 31–40, 1995. [80] S.W. Rick and B.J. Berne, “Dynamical fluctuating charge force fields: the aqueous solvation of amides,” J. Amer. Chem. Soc., 118, 672–679, 1996. [81] R.A. Bryce, M.A. Vincent, N.O.J. Malcolm, I.H. Hillier, and N.A. Burton, “Cooperative effects in the structure of fluoride water clusters: ab initio hybrid quantum mechanical/molecular mechanical model incorporating polarizable fluctuating charge solvent,” J. Chem. Phys., 109, 3077–3085, 1998. [82] J.L. Asensio, F.J. Canada, X. Cheng, N. Khan, D.R. Mootoo, and J. Jimenez-Barbero, “Conformational differences between O- and C-glycosides: the alpha-O-man(1-->1)-beta-Gal/alpha-C-Man-(1-->1)-beta-Gal case--a decisive demonstration of the importance of the exo-anomeric effect on the conformation of glycosides,” Chemistry, 6, 1035–1041, 2000. [83] N. Yoshii, R. Miyauchi, S. Niura, and S. Okazaki, “A molecular-dynamics study of the equation of water using a fluctuating-charge model,” Chem. Phys. Lett., 317, 414–420, 2000. [84] E. Llanta, K. Ando, and R. Rey, “Fluctuating charge study of polarization effects in chlorinated organic liquids,” J. Phys. Chem. B, 105, 7783–7791, 2001. [85] S. Patel and C.L. 
Brooks, III, “CHARMM fluctuating charge force field for proteins: I parameterization and application to bulk organic liquid simulations,” J. Comput. Chem., 25, 1–15, 2004. [86] J. Caldwell, L.X. Dang, and P.A. Kollman, “Implementation of nonadditive intermolecular potentials by use of molecular dynamics: development of a water–water potential and water–ion cluster interactions,” J. Amer. Chem. Soc., 112, 9144–9147, 1990. [87] A. Wallqvist and B.J. Berne, “Effective potentials for liquid water using polarizable and nonpolarizable models,” J. Phys. Chem., 97, 13841–13851, 1993. [88] D.N. Bernardo, Y. Ding, K. Krogh-Jespersen, and R.M. Levy, “An anisotropic polarizable water model: incorporation of all-atom polarizabilities into molecular mechanics force fields,” J. Phys. Chem., 98, 4180–4187, 1994. [89] L.X. Dang, “Importance of polarization effects in modeling hydrogen bond in water using classical molecular dynamics techniques,” J. Phys. Chem. B, 102, 620–624, 1998. [90] H.A. Stern, G.A. Kaminski, J.L. Banks, R. Zhou, B.J. Berne, and R.A. Friesner, “Fluctuating charge, polarizable dipole, and combined models: parameterization from ab initio quantum chemistry,” J. Phys. Chem. B, 103, 4730–4737, 1999. [91] B. Mannfors, K. Palmo, and S. Krimm, “A new electrostatic model for molecular mechanics force fields,” J. Mol. Struct., 556, 1–21, 2000. [92] B.G. Dick, Jr. and A.W. Overhauser, “Theory of the dielectric constants of alkali halide crystals,” Phys. Rev., 112, 90–103, 1958.
524
A.D. MacKerell
[93] L.R. Pratt, “Effective field of a dipole in non-polar polarizable fluids,” Mol. Phys., 40, 347–360, 1980. [94] P.J. van Marren and D. van der Spoel, “Molecular dynamics simulations of water with novel shell-model potentials,” J. Phys. Chem. B, 105, 2618–2626, 2001. [95] G. Lamoureux, A.D. MacKerell, Jr., and B. Roux, “A simple polarizable model of water based on classical Drude oscillators,” J. Chem. Phys., 119, 5185–5197, 2003. [96] G. Lamoureux and B. Roux, “Modelling induced polarizability with drude oscillators: theory and molecular dynamics simulation algorithm,” J. Chem. Phys., 119, 5185–5197, 2003. [97] M. Sprik and M.L. Klein, “A polarizable model for water using distributed charge sites,” J. Chem. Phys., 89, 7556–7560, 1988. [98] B. Chen, J. Xing, and I.J. Siepmann, “Development of polarizable water force fields for phase equilibrium calculations,” J. Phys. Chem. B, 104, 2391–2401, 2000. [99] H.A. Stern, F. Rittner, B.J. Berne, and R.A. Friesner, “Combined fluctuating charge and polarizable dipole models: application to a five-site water potential function,” J. Chem. Phys., 115, 2237–2251, 2001. [100] S.J. Stuart and B.J. Berne, “Effects of polarizability on the hydration of the chloride ion,” J. Phys. Chem., 100, 11934–11943, 1996. [101] A. Grossfield, P. Ren, and J.W. Ponder, “Ion solvation thermodynamics from simulation with a polarizable force field,” J. Amer. Chem. Soc., 125, 15671–15682, 2003. [102] J.C. Shelley, M. Sprik, and M.L. Klein, “Molecular dynamics simulation of an aqueous sodium octanoate micelle using polarizable surfactant molecules,” Langmuir, 9, 916–926, 1993. [103] J.W. Caldwell and P.A. Kollman, “Cation–π interactions: nonadditive effects are critical in their accurate representation,” J. Amer. Chem. Soc., 117, 4177–4178, 1995a. [104] J.W. Caldwell and P.A. Kollman, “Structure and properties of neat liquids using nonadditive molecular dynamics: water, methanol, and N-methylacetamide,” J. Phys. Chem., 99, 6208–6219, 1995b. 
[105] J. Gao, D. Habibollazadeh, and L. Shao, “A polarizable potential function for simulation of liquid alcohols,” J. Phys. Chem., 99, 16460–16467, 1995. [106] M. Freindorf and J. Gao, “Optimization of the Lennard–Jones parameter for combined ab initio quantum mechanical and molecular mechanical potential using the 3-21G basis set,” J. Comp. Chem., 17, 386–395, 1996. [107] P. Cieplak, J.W. Caldwell, and P.A. Kollman, “Molecular mechanical models for organic and biological systems going beyond the atom centered two body additive approximations: aqueous solution free energies of methanol and N-methyl acetamide, nucleic acid base, and amide hydrogen bonding and chloroform/water partition coefficients of the nucleic acid bases,” J. Comp. Chem., 22, 1048–1057, 2001. [108] L.X. Dang, “Computer simulation studies of ion transport across a liquid/liquid interface,” J. Phys. Chem. B, 103, 8195–8200, 1999. [109] G.A. Kaminski, H.A. Stern, B.J. Berne, R.A. Friesner, Y.X. Cao, R.B. Murphy, R. Zhou, and T.A. Halgren, “Development of a polarizable force field for proteins via ab initio quantum chemistry: first generation model and gas phase tests,” J. Comp. Chem., 23, 1515–1531, 2002. [110] V.M. Anisimov, I.V. Vorobyov, G. Lamoureux, S. Noskov, B. Roux, and A.D. MacKerell, Jr. “CHARMM all-atom polarizable force field parameter development for nucleic acids,” Biophys. J., 86, 415a, 2004. [111] S. Patel, A.D. MacKerell, Jr., and C.L. Brooks, III, “CHARMM fluctuating charge force field for proteins: II protein/solvent properties from molecular dynamics simulations using a non-additive electrostatic model,” 25, 1504–1514, 2004.
Interatomic potentials: molecules
525
[112] A. Morita and S. Kato, “An ab initio analysis of medium perturbation on molecular polarizabilities,” J. Chem. Phys., 110, 11987–11998, 1999. [113] A. Morita, “Water polarizability in condensed phase: ab initio evaluation by cluster approach,” J. Comp. Chem., 23, 1466–1471, 2002.
2.6 INTERATOMIC POTENTIALS: FERROELECTRICS
Marcelo Sepliarsky¹, Marcelo G. Stachiotti¹, and Simon R. Phillpot²
¹ Instituto de Física Rosario, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, 27 de Febrero 210 Bis, (2000) Rosario, Argentina
² Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611, USA
S. Yip (ed.), Handbook of Materials Modeling, 527–545. © 2005 Springer. Printed in the Netherlands.

Ferroelectric perovskites are important in many areas of modern technology, including memories, sensors, and electronic applications, and are of fundamental scientific interest. A fascinating feature of perovskites is that they exhibit a wide variety of structural phase transitions. Generically, these compounds have the chemical formula ABO3, where A is a monovalent or divalent cation and B is a transition metal cation; perovskites in which both A and B are trivalent, such as LaAlO3, also exist, though we will not discuss them here. Although the high-temperature structure is very simple (Fig. 1), it displays a wide variety of structural instabilities, which may involve rotations and distortions of the oxygen octahedra as well as displacements of the ions from their crystallographically defined sites. The crystal symmetries manifested in these materials and the type of phase-transition behavior depend on the individual compound. Among the perovskites one finds ferroelectric crystals such as BaTiO3 and KNbO3 (each displaying three solid-state phase transitions) and PbTiO3 (displaying only one transition), antiferroelectrics such as PbZrO3, and materials such as SrTiO3 that exhibit other nonpolar instabilities involving the rotation of the oxygen octahedra [1].

Figure 1. Cubic perovskite-type structure, ABO3, with the A, B, and O sites labeled.

In recent years, new applications have opened up for these materials as the systems exploited have become both chemically more complex (e.g., solid solutions and superlattices) and microstructurally more complex (e.g., thin films and nanocapacitors). While the overall properties of such systems can be investigated experimentally with relative ease, it is difficult to obtain microscopic information. There is thus a significant need for a simulation method that can provide atomic-level information on ferroelectric behavior and yet is computationally efficient enough to allow materials problems to be addressed. Computer
simulations based on interatomic potentials can provide such microscopic insights. However, the validity of any potential-based simulation study depends to a considerable extent on the quality of the interatomic potential used. Obtaining accurate interatomic potentials that are able to describe ferroelectricity in ABO3 perovskites constitutes a challenging problem, mainly because of the small energy differences (sometimes less than 10 meV/cell) involved in the lattice instabilities associated with the various phases.

The theoretical investigation of ferroelectric materials can be addressed at different length scales and levels of complexity, ranging from phenomenological theories (based on the continuous medium approximation) to first-principles methods. The traditional approach is based on Ginzburg–Landau–Devonshire (GLD) theory [2]. This mesoscale approach treats a ferroelectric as a continuum solid defined by components of polarization and by elastic strains or stresses. It has proved very successful in providing significant insights into the ferroelectric properties of perovskites. However, it cannot provide detailed microscopic information.

Over the last decade, considerable progress has been made in first-principles calculations of ferroelectricity in perovskites [3, 4]. These calculations have contributed greatly to the understanding of the origins of structural phase transitions in perovskites and of the nature of the ferroelectric instability. These methods are based upon a full solution for the quantum mechanical ground state of the electron system in the framework of Density Functional Theory (DFT). While able to provide detailed information on the structural, electronic, and lattice dynamical properties of single crystals, they also have limitations. In particular, because of the heavy computational load, only systems of up to approximately a hundred ions can be simulated. Moreover, at the moment such calculations cannot provide anything but static, zero-temperature properties.

An effective Hamiltonian method has been used for the simulation of finite-temperature properties of
perovskites [3]. Here, a model Hamiltonian is written as a function of a reduced number of degrees of freedom (a local mode amplitude vector and a local strain tensor). The parameters of the Hamiltonian are determined so as to reproduce the spectrum of low-energy excitations of a given material as obtained from first-principles calculations. This approach has been applied with considerable success to several ferroelectric materials (pure compounds and solid solutions), producing results in very good qualitative agreement with experiments. However, some quantitative predictions are less satisfactory; in particular, the calculated transition temperatures can differ from the experimental values by hundreds of degrees. Moreover, the lack of an atomistic description of the material makes the effective Hamiltonian approach inappropriate for the investigation of many interesting properties of perovskites, such as surface and interface effects.

Atomistic modeling using interatomic potentials has a long and illustrious history in the description of ionic materials. The fundamental idea is to describe a material at the atomic level, with the interatomic interactions defined by classical potentials, thereby providing spatially much more detailed information than the GLD approach, yet without the heavy computational load associated with the first-principles methods. In the context of ionic materials, the interactions between the point ions are generally described via the Coulombic interactions between the atoms, which provide cohesion. However, a neutral solid interacting purely by Coulombic interactions is unstable to a catastrophic collapse in which all the ions become arbitrarily close. Thus, to mimic the physical short-ranged repulsion that prevents such a collapse, an empirical, largely repulsive interaction is added. One standard choice for this function is the Buckingham potential, which consists of a purely repulsive, exponentially decaying Born–Mayer term between shells and an attractive van der Waals term to account for covalency effects:

$$V(r) = a\,e^{-r/\rho} - \frac{c}{r^{6}}.$$

This is the so-called rigid-ion model. In the shell model, an important improvement over the rigid-ion model, atomic polarizability is accounted for by defining a core and a shell for each ion (representing the ion core with the closed shells of electrons, and the valence electrons, respectively). The core and shell of an ion interact with each other through a harmonic spring (characterizing the ionic polarizability), and interact with the cores and shells of other ions via repulsive and Coulombic interactions. In some parameterizations, the ions (core plus shell) are assigned their formal charges. However, in ionic materials with a significant amount of covalency, such as perovskites, the incomplete transfer of electrons between the cations and anions can be accounted for by assigning partial charges (smaller than the formal charges) to the ions, together with the van der Waals term, which is non-zero only for the O–O interactions. For more details see the article “Interatomic potential models for ionic materials” by Julian Gale in this handbook.
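The two short-range ingredients described above can be sketched in a few lines of Python. This is a minimal illustration, not the parameterization used by the authors: the numerical parameters are hypothetical, and the fourth-order core–shell term is written in one commonly used form, k2*w**2/2 + k4*w**4/24, which is an assumption here.

```python
import math

def buckingham(r, a, rho, c):
    """Buckingham pair energy V(r) = a*exp(-r/rho) - c/r**6."""
    return a * math.exp(-r / rho) - c / r**6

def core_shell_spring(w, k2, k4=0.0):
    """Core-shell self-energy for a core-shell separation w.

    k2 is the harmonic spring constant. k4 is an optional fourth-order
    constant; the form k2*w**2/2 + k4*w**4/24 is an assumed convention.
    """
    return 0.5 * k2 * w**2 + k4 * w**4 / 24.0

# Hypothetical O-O parameters (eV, angstrom), chosen only for illustration.
A_OO, RHO_OO, C_OO = 2000.0, 0.30, 30.0

for r in (2.0, 2.8, 3.5):
    print(f"r = {r:.1f} A   V = {buckingham(r, A_OO, RHO_OO, C_OO):+.4f} eV")

# Energy cost of displacing a shell by 0.05 A from its core (hypothetical k2, k4).
print(core_shell_spring(0.05, k2=30.0, k4=1000.0))
```

In a full shell-model calculation these terms are summed over all core and shell pairs together with the Coulomb interaction; only the functional forms are shown here.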
The success of the atomistic approach is evident from the large number of investigations of complex oxide crystals. Regarding ferroelectric perovskites, we note the early work of Lewis and Catlow, who derived empirical shell-model potential parameters for the study of defect energies in cubic BaTiO3 [5, 18]. This model was subsequently used for more refined ab initio embedded-cluster calculations of impurities, as well as for the simulation of surface properties. For lattice dynamical properties, the most successful approach has been carried out in the framework of the nonlinear oxygen polarizability model [6]. In this shell model an anisotropic core–shell interaction is considered at the O2− ions, with a fourth-order core–shell interaction along the B–O bond. The potential parameters were obtained by fitting experimental phonon dispersion curves of the cubic phase. The main achievement of this model was the description of the soft-mode temperature dependence (the TO-phonon softening that is related to the ferroelectric transition). However, neither of these models was able to simulate the ferroelectric phase behavior of the perovskites.

Besides the traditional empirical approach, in which potentials are obtained by fitting to macroscopic physical properties, there is increasing interest in deriving pair potentials from first-principles calculations. In 1994, Donnerberg and Exner developed a shell model for KNbO3, deriving the Nb–O short-range pair potential from Hartree–Fock calculations performed on a cluster of ions [7]. They showed that this ab initio pair potential was in good agreement with a corresponding empirical potential obtained by fitting to macroscopic properties. Their model, however, was not able to simulate the structural phase transition sequence of KNbO3 either. They argued that the consideration of additional many-body potential contributions would enable them to model structural phase transitions.

However, as we will see, it is in fact possible to simulate ferroelectric phase transitions using only classical pairwise interatomic potentials fitted to first-principles calculations. Ab initio methods provide underlying potential surfaces and phonon dispersion curves at T = 0 K, thereby exposing the presence of structural instabilities in the full Brillouin zone, and this information is very useful for parameterizing classical potentials, which can then be used in molecular dynamics simulations. In this way, finite-temperature simulations of ABO3 perovskites and the properties of chemically and microstructurally more complex systems can be addressed at the atomic level.
1. Modeling Ferroelectric Perovskites
Among the perovskites, BaTiO3, which can be considered a prototypical ferroelectric, is one of the most exhaustively studied [8]. At high temperatures, it has the classic perovskite structure. This is cubic centrosymmetric, with the
Ba at the corners, Ti at the center, and oxygen at the face centers (see Fig. 1). However, as the temperature is lowered, it goes through a succession of ferroelectric phases with spontaneous polarizations along the [001], [011], and [111] directions of the cubic cell. These polarizations arise from net displacements of the cations with respect to the oxygen octahedra along the above directions. Each ferroelectric phase also involves a small homogeneous deformation, which can be thought of as an elongation of the cubic unit cell along the corresponding polarization direction. Thus the system becomes tetragonal at 393 K, orthorhombic at 278 K, and rhombohedral at 183 K.

An anisotropic shell model with pairwise repulsive Buckingham potentials was developed for the simulation of ferroelectricity in BaTiO3 [9]. This model is a classical shell model in which an anisotropic core–shell interaction is considered at the O2− ions, with a fourth-order core–shell interaction along the O–Ti bond. The Ba and Ti ions are considered to be isotropically polarizable. The set of seventeen shell-model parameters was obtained by fitting phonon frequencies, the lattice constant of the cubic phase, and underlying potential surfaces for various configurations of atomic displacements. In order to better quantify the ferroelectric instabilities of the cubic phase, a first-principles frozen-phonon calculation of the infrared-active modes was performed. Once the eigenvectors at Γ had been determined, the total energy as a function of the displacement pattern of the unstable mode was evaluated for different directions in the cubic phase, including also the effects of the strain. The first-principles total energy calculations were performed within DFT, using the highly precise full-potential Linearized Augmented Plane Wave (LAPW) method. The energy surfaces of the model for different ferroelectric distortions are shown in Fig. 2, where they are compared with the first-principles results.
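The shape of such an energy surface along an unstable mode is often pictured as a quartic double well. The sketch below assumes the simple form E(u) = -a*u**2 + b*u**4, which is an illustrative model and not the fitted shell-model energy. It uses the [001] energy lowering of about 1.2 mRy/cell reported for the model, while the position of the minimum is a hypothetical value.

```python
def double_well(u, a, b):
    """Model energy E(u) = -a*u**2 + b*u**4 along an unstable mode."""
    return -a * u**2 + b * u**4

def well_parameters(depth, u_min):
    """Return (a, b) so that the minimum sits at u_min with energy -depth.

    Setting dE/du = 0 gives u_min**2 = a/(2*b), and E(u_min) = -a**2/(4*b).
    """
    a = 2.0 * depth / u_min**2
    b = depth / u_min**4
    return a, b

depth_001 = 1.2   # mRy/cell, energy lowering quoted for [001] displacements
u_min = 0.05      # angstrom, hypothetical position of the minimum

a, b = well_parameters(depth_001, u_min)
for u in (0.0, 0.025, 0.05, 0.075):
    print(f"u = {u:.3f} A   E = {double_well(u, a, b):+.3f} mRy/cell")
```

The centrosymmetric configuration u = 0 is a local maximum of this model energy, which is the signature of a ferroelectric instability; a deeper well corresponds to a larger energy barrier between equivalent polar states.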
[Figure 2: three panels, [001], [011], and [111]; vertical axis E (mRy/cell), horizontal axis Ti displacement relative to Ba (Å); the strained curve in the [001] panel is labeled c/a = 1.01.]
Figure 2. Total energy as a function of the unstable mode displacements along the [001] (left panel), [011] (center panel), and [111] (right panel) directions. For the sake of simplicity, the mode displacement is represented through the Ti displacement relative to Ba; the oxygen ions are also displaced in a manner determined by the Ti ion displacement. Energies for [001] displacements in a tetragonally strained structure are also included in the left panel. First-principles calculations are denoted by squares (circles) for the unstrained (strained) structures. Full lines correspond to the shell model result.

A satisfactory overall agreement is achieved. The model yields clear ferroelectric instabilities with energies and minima locations similar to those of the LAPW calculations. Energy lowerings of ≈1.2, 1.65, and 1.9 mRy/cell are obtained for the (001), (011), and (111) ferroelectric mode displacements, respectively, which is consistent with the experimentally observed phase transition sequence. Concerning the energetics for the (001) displacements, it can also be seen in the left panel that the effect of the tetragonal strain is to stabilize these displacements, with a deeper minimum and a higher energy barrier at the centrosymmetric positions.

Phonon dispersion relations provide a global view of the harmonic energy surface around the cubic perovskite structure. In particular, the unstable modes, which have imaginary frequencies, determine the nature of the phase transitions. A first-principles linear response calculation of the phonon dispersion curves of cubic BaTiO3 revealed the presence of structural instabilities with pronounced two-dimensional character in the Brillouin zone, corresponding to chains of displaced Ti ions oriented along the [001] directions [10]. That the shell model reproduces these instabilities is illustrated in the calculated phonon
dispersion curves in Fig. 3. Excellent agreement with the ab initio linear response calculation is achieved, particularly for the unstable phonon modes. Two transverse optic modes are unstable at the Γ point, and they remain unstable along the Γ–X direction with very little dispersion. One of them stabilizes along the Γ–M and X–M directions, and both become stable along the Γ–R and R–M lines.

The Born effective charge tensor is conventionally defined as the proportionality coefficient between the components of the dipole moment per unit cell and the components of the κ sublattice displacement which gives rise to the dipole moment:

$$Z^{*}_{\kappa,\alpha\beta} = \frac{\partial P_{\beta}}{\partial \delta_{\kappa,\alpha}}. \tag{1}$$
For the cubic structure of ABO3 perovskites, this tensor is fully characterized by four independent numbers. Experimental data had suggested that the amplitude of the Born effective charges should deviate substantially from the nominal static charges, with two essential features: the oxygen charge tensor is highly anisotropic (with two inequivalent directions either parallel or perpendicular to the B–O bond), and the Ti and O|| effective charges are anomalously large. This was confirmed by more recent first-principles calculations [3] demonstrating the crucial role played by the B(d)–O(2p) hybridization as a dominant mechanism for such anomalous contributions.
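A quick consistency check on any set of these four numbers is the acoustic sum rule: because the crystal is charge neutral, the effective charges of all five atoms in the cell must add up to zero along each Cartesian direction. This rule is not stated in the text and is brought in here as a standard result. The sketch below applies it to the values reported in Table 1; the function and variable names are illustrative.

```python
# Born effective charges of cubic BaTiO3 as listed in Table 1:
# (Z*_Ba, Z*_Ti, Z*_O perpendicular, Z*_O parallel).
TABLE_1 = {
    "Nominal": (2.0, 4.0, -2.0, -2.0),
    "Experiment": (2.9, 6.7, -2.4, -4.8),
    "First principles": (2.75, 7.16, -2.11, -5.69),
    "Shell model (nominal)": (1.86, 3.18, -1.68, -1.68),
    "Shell model (effective)": (1.93, 6.45, -2.3, -3.79),
}

def sum_rule_residual(z_ba, z_ti, z_o_perp, z_o_par):
    """Sum of the effective charges along one cubic axis.

    For a displacement along a cubic axis, one O ion moves parallel to its
    Ti-O bond and the other two move perpendicular to theirs.
    """
    return z_ba + z_ti + 2.0 * z_o_perp + z_o_par

for label, charges in TABLE_1.items():
    print(f"{label:24s} residual = {sum_rule_residual(*charges):+.2f}")
```

All five rows satisfy the rule to within the rounding of the tabulated values, which also confirms how the oxygen columns pair with the perpendicular and parallel components.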
[Figure 3: frequency (cm⁻¹) versus wave vector along the path Γ–X–M–Γ–R–M.]
Figure 3. Phonon dispersion curves of cubic BaTiO3 calculated with the shell model. Imaginary phonon frequencies are represented as negative values.
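The plotting convention used in Fig. 3 can be made concrete with a small sketch. The squared frequencies are the eigenvalues of the dynamical matrix; a negative eigenvalue means an imaginary frequency, which is drawn as a negative number. The 2x2 matrix below is a hypothetical toy example in arbitrary units, not a BaTiO3 dynamical matrix.

```python
import numpy as np

def signed_frequencies(dynamical_matrix):
    """Return sign(lam) * sqrt(|lam|) for each eigenvalue lam.

    Stable modes (lam > 0) give real frequencies; unstable modes (lam < 0)
    give imaginary frequencies, reported here as negative values.
    """
    eigenvalues = np.linalg.eigvalsh(dynamical_matrix)
    return np.sign(eigenvalues) * np.sqrt(np.abs(eigenvalues))

# Hypothetical symmetric matrix with eigenvalues -2 and 4 (arbitrary units).
D = np.array([[1.0, 3.0],
              [3.0, 1.0]])
print(signed_frequencies(D))
```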
Although the shell model does not explicitly include charge transfer between atoms, it does account for electronic polarizability through the core–shell displacements. It is thus possible to evaluate the Born effective charge tensor by calculating the total dipole moment per unit cell created by the displacement of a given sublattice of atoms as a sum of two contributions:

$$P_{\alpha} = Z_{\kappa}\,\delta_{\kappa,\alpha} + \sum_{\kappa} Y_{\kappa}\, w_{\kappa,\alpha}. \tag{2}$$
The first term is the sublattice displacement contribution, while the second term is the electronic polarizability contribution. The calculated Born effective charges for cubic BaTiO3 are listed in Table 1 together with results obtained from different theoretical approaches. The two essential features of the Born effective charge tensor of BaTiO3 are satisfactorily simulated.

Table 1. Born effective charges of BaTiO3 in the cubic structure

                          Z*(Ba)   Z*(Ti)   Z*(O⊥)   Z*(O∥)
Nominal                   +2       +4       −2       −2
Experiment                +2.9     +6.7     −2.4     −4.8
First principles          +2.75    +7.16    −2.11    −5.69
Shell model (nominal)     +1.86    +3.18    −1.68    −1.68
Shell model (effective)   +1.93    +6.45    −2.3     −3.79

To this point, we have shown that this anisotropic shell model for BaTiO3 reproduces the lattice instabilities and several zero-temperature properties that are relevant for this material. To investigate whether the model can describe the temperature-driven structural transitions of BaTiO3, constant-pressure molecular dynamics (MD) simulations were performed. Although an excellent overall agreement was obtained for the structural parameters, showing that the model reproduces the delicate structural changes involved in the transitions, the theoretically determined transition temperatures were much lower
than in experiment [9]. Interestingly, the effective Hamiltonian approach presents the same problem. Since ferroelectricity is very sensitive to volume, the neglect of thermal expansion in the effective Hamiltonian approach was thought to be responsible for the shifts in the predicted transition temperatures. The MD simulations, however, properly simulate the thermal expansion and nevertheless result in a similar anomaly in the transition temperatures. This indicates the presence of inherent errors in the first-principles local density approximation (LDA) approach that tend to underestimate the ferroelectric instabilities. A recent study demonstrated that, in the effective Hamiltonian approach, there are at least two significant sources of error: the improper treatment of the thermal expansion and the LDA error. Both types of error may be of the same magnitude [11].

While the anisotropic shell model for BaTiO3 does have the desired effect of describing the ferroelectric phase transitions in perovskites, it can only be used when the O ions sit in a crystallographically well-defined environment. Unfortunately, it is not always possible to characterize the crystallographic environment of a given ion unambiguously, for example in the simulation of a grain boundary or other interface. For such systems isotropic models are required. Isotropic shell models have recently been developed which describe the phase behavior of both KNbO3 [12] and BaTiO3 [13]. The isotropic shell model differs from the anisotropic one only in that the anisotropic fourth-order core–shell interaction on the O ions is replaced by an isotropic fourth-order core–shell interaction on both the transition metal and the O ions, which together stabilize the ferroelectric phases. Since the LDA-fitted shell model gives theoretically determined transition temperatures much lower than in experiment, the parameters of the potential were improved in an ad hoc manner to give better agreement.
In this way, the model for KNbO3 displays the experimentally observed sequence of phases on heating: rhombohedral, orthorhombic, tetragonal and finally cubic with transition temperatures of 225 K, 475 K and 675 K, which are very close to the experimental values of 210 K, 488 K and 701 K, respectively. As shown in Fig. 4, for BaTiO3 , in comparison with the anisotropic model, the isotropic shell model gives transition temperature values (140 K, 190 K and 360 K) in better agreement with the experimental values (183 K, 278 K and 393 K).
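The quality of these predictions is easier to judge as relative deviations. The short sketch below only does arithmetic on the transition temperatures quoted in the text; the dictionary layout and function name are illustrative choices.

```python
# Transition temperatures (K) quoted in the text, in order of increasing
# temperature: rhombohedral-orthorhombic, orthorhombic-tetragonal,
# tetragonal-cubic.
SIMULATED = {"KNbO3": (225, 475, 675), "BaTiO3": (140, 190, 360)}
EXPERIMENT = {"KNbO3": (210, 488, 701), "BaTiO3": (183, 278, 393)}

def relative_errors(simulated, experiment):
    """Signed relative deviation (simulated - experiment) / experiment."""
    return [(s - e) / e for s, e in zip(simulated, experiment)]

for compound in SIMULATED:
    errors = relative_errors(SIMULATED[compound], EXPERIMENT[compound])
    print(compound, [f"{100 * err:+.1f}%" for err in errors])
```

For KNbO3 the deviations stay within roughly 7%, whereas for BaTiO3 the isotropic model still underestimates the orthorhombic-tetragonal transition by about 30%, even though it improves on the anisotropic model.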
[Figure 4: two panels for BaTiO3; top panel, lattice parameters (Å) versus temperature (K); bottom panel, polarization (µC/cm²) versus temperature (K).]
Figure 4. Phase diagram of BaTiO3 as determined by MD simulations for the isotropic shell model. Top panel: cell parameters as a function of temperature. Bottom panel: the three components of the average polarization (each one represented with a different symbol).
2. Solid Solutions
The current keen interest in solid solutions of perovskites is driven by the idea of tuning the composition to create structures with properties unachievable in single-component materials. Prototypical solid solutions are Ba$_x$Sr$_{1-x}$TiO3 (BST), a solid solution of BaTiO3 and SrTiO3, and KTa$_x$Nb$_{1-x}$O3, a solid solution of KTaO3 and KNbO3. Both solutions exist over the whole concentration range and are mixtures of a ferroelectric with an incipient ferroelectric. We briefly present the main features of the isotropic shell-model potentials developed to describe the structural behavior of BST.
In order to simulate BST solid solutions, it was also necessary to develop an isotropic model for SrTiO3. From a computational point of view, the SrTiO3 model must be compatible with the BaTiO3 model, in that the only differences between the two can be the Ba–O and Sr–O interactions and the polarizability parameters for Ba and Sr. The challenge is thus to reproduce, by changing only these interactions, the following main features of SrTiO3: (i) a smaller equilibrium volume, (ii) incipient ferroelectricity, and (iii) a tetragonal antiferrodistortive ground state. It is indeed possible to reproduce these three critical features. The equilibrium lattice constant of the resulting model in the cubic phase is a = 3.90 Å, which reproduces the extrapolation to T = 0 K of the experimental lattice constant. Regarding the other two conditions, the low-frequency phonon dispersion curves of the cubic structure are shown in Fig. 5. The model reproduces the rather subtle antiferrodistortive instabilities, driven by the unstable modes at the R and M points. It also presents a subtle ferroelectric instability (an unstable mode at the zone center). These detailed features of the dispersion of the unstable modes along different directions in the Brillouin zone are in good agreement with ab initio linear response calculations.

Random solid solutions of BST of various compositions in the range x = 0 (pure SrTiO3) to x = 1 (pure BaTiO3) have been simulated. In the simulation supercell the A-sites of the ATiO3 perovskite are randomly occupied by Ba and Sr ions. The results of the molecular dynamics simulations on the phase behavior of BST are summarized in Fig. 6 (filled symbols connected by solid lines) as the concentration dependence of the transition temperatures.
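The random A-site occupation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' setup code: the supercell size, the seed, and the function name are hypothetical, and the composition is fixed exactly by shuffling a list with the required number of Ba and Sr labels.

```python
import numpy as np

def random_a_site_occupation(n_cells, x, seed=0):
    """Assign Ba or Sr to each A site of an n_cells**3 perovskite supercell.

    Exactly round(x * N) of the N = n_cells**3 sites receive Ba and the
    rest receive Sr. Returns an (n_cells, n_cells, n_cells) array of labels.
    """
    n_sites = n_cells**3
    n_ba = int(round(x * n_sites))
    species = np.array(["Ba"] * n_ba + ["Sr"] * (n_sites - n_ba))
    rng = np.random.default_rng(seed)
    rng.shuffle(species)  # in-place random permutation of the labels
    return species.reshape(n_cells, n_cells, n_cells)

occupation = random_a_site_occupation(n_cells=6, x=0.5)
ba_fraction = np.mean(occupation == "Ba")
# Local composition: Ba fraction in each layer of A sites along the third axis.
layer_fractions = np.mean(occupation == "Ba", axis=(0, 1))
print(ba_fraction)
print(layer_fractions)
```

The layer-by-layer fractions fluctuate around the global composition, a simple picture of the local composition variations that arise in a random solid solution.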
Figure 5. Low-frequency phonon dispersion curves for cubic SrTiO3. The negative values correspond to imaginary frequencies, characteristic of the ferroelectric instability at the Γ point and the additional antiferrodistortive instabilities at the R and M points.
[Figure 6: phase diagram of Ba$_x$Sr$_{1-x}$TiO3; vertical axis T (K), horizontal axis x; phase fields labeled Cubic, Tetragonal, Orthorhombic, and Rhombohedral.]
Figure 6. Concentration dependence of transition temperatures (solid symbols and dark lines) shows good agreement with experimental values (open symbols and dotted lines).
With increasing concentration of Sr (i.e., decreasing x), the Curie temperature decreases essentially linearly with x. The simulations showed that all four phases remain stable down to x ≈ 0.2 at which the three transition temperatures essentially coincide. Below x ≈ 0.2 only the cubic and rhombohedral phases appear in the phase diagram. These results are similar to the experimental data (open symbols and dotted lines), giving particularly good agreement for the concentration at which the tetragonal and orthorhombic phases disappear from the phase diagram. The above analyses demonstrate that the atomistic approach can reproduce the basic features of the phase behavior of perovskite solid solutions, on a semiquantitative basis. There are two fundamental structural effects associated with the solid solution: a concentration dependence of the average volume and large variations in the local strain arising from strong variations in the local composition [12, 13]. SrTiO3 is denser than BaTiO3 . Thus in the solid solution, the SrTiO3 cells tend to be under a tensile strain (which tends to encourage a ferroelectric distortion) while the BaTiO3 cells tend to be under a compressive strain (which tends to suppress the ferroelectric distortion). Indeed, the large tensile strain on the SrTiO3 cells has the effect of inducing a polarization. Remarkably, at a given concentration (fixed volume) the polarization of the SrTiO3
cells is actually larger than that of the BaTiO3 cells. There is also an additional effect associated with the local environment of each unit cell. In particular, the simulations show that the maximum and minimum values of polarization for the SrTiO3 cells correspond to the polarizations of SrTiO3 cells (of the same average volume as that of the solid solution) embedded completely in a matrix of SrTiO3 and BaTiO3 cells, respectively. Likewise, for the BaTiO3 cells the maximum and minimum polarizations correspond to SrTiO3 and BaTiO3 embeddings, respectively.
3. Heterostructures
Superlattices containing ferroelectrics offer another approach to achieving dielectric and optical properties unachievable in the bulk. Among the heterostructures grown have been ferroelectric/paraelectric superlattices, including BaTiO3/SrTiO3 and KNbO3/KTaO3, and ferroelectric/ferroelectric superlattices PbTiO3/BaTiO3. In comparison with the well-documented tunability of the properties of solid solutions, the tunability of the properties of multilayer heterostructures has been less well demonstrated. While there is experimental evidence for a strong dependence of the properties of such superlattices on the modulation length Λ (the thickness of a KNbO3/KTaO3 bilayer), the underlying physics controlling their properties is only poorly understood. Atomic-level simulations are ideal for the study of multilayers because the simulations can be carried out on the same length scale as the experimental systems. Moreover, the crystallography of the multilayer can be defined and the position of every ion determined, thereby providing atomic-level information on the ferroelectric and dielectric properties. Furthermore, once the nature of the interactions between ions and the crystallographic structure of the interface are defined, the atomic-level simulations will determine the local atomic structure and polarization at the interfaces. To that purpose, the structure and properties of coherent KNbO3/KTaO3 superlattices were simulated using isotropic shell-model potentials for KNbO3 (KN) and KTaO3 (KT). Since the simulations were intended to model a superlattice on a KT substrate, as had been experimentally investigated, the in-plane lattice parameter was fixed to that of KT at zero temperature; however, since the heterostructure is not under any constraint in the modulation direction, the length of the simulation cell in the z direction was allowed to expand or contract to reach zero stress.
Figure 7 shows the variation in the polarization in the modulation direction, Pz (solid circles), and in the x–y plane, Px = Py (open circles), averaged over unit-cell-thick slices through the Λ = 36 superlattice. In analyzing these polarization profiles, we first address the strain effects produced by the KT substrate, which result in a compressive strain of 0.7% on the KN layers.
[Figure 7: polarization (µC/cm2) versus position Z (0 to 72), showing the Pz and Px = Py profiles.]
Figure 7. Components of polarization, Px (open circles) and Pz (solid circles), in unit-cell-thick slices through the Λ = 36 KN/KT superlattice on a KT substrate.
To compensate for this in-plane compression, the KN layers expand in the z direction, thereby breaking the strict rhombohedral symmetry of the polarization of KN; however, these strains are not sufficient to force the KN to become tetragonally polarized. Similarly, the absence of any in-plane polarization for the KT layer is consistent with the absence of any strain arising from the KT substrate. The finite value of Pz in the interior of the KT layer, however, is different from the expected value of Pz = 0 for this unstrained layer and arises from the very strong coupling of the electric field produced by the electric dipoles in the KNbO3 layers with the very large dielectric response of the KTaO3 [14, 15]. The switching behavior of ferroelectric heterostructures is of considerable interest. It was found that for Λ = 6, the polarization in the KTaO3 layers is almost as large as in the KNbO3 layers; moreover, the coercive fields for the KNbO3 and KTaO3 layers are identical. This single value for the coercive fields and the weak spatial variation in the polarization indicate that the entire superlattice is essentially acting as a single structure, with properties different from either of its components. For Λ = 36, the KNbO3 layer has a square hysteresis loop characteristic of a good ferroelectric; the polarization and coercive field are larger than for Λ = 6, consistent with more bulk-like
behavior of a thicker KNbO3 layer. The KTaO3 layer also displays hysteretic behavior. However, by contrast with the Λ = 6 superlattice, the coercive field for the KTaO3 layers is much smaller than for the KNbO3 layer, indicating that the KNbO3 and KTaO3 layers are much more weakly coupled than in the Λ = 6 superlattice. The hysteresis loop for the KTaO3 layers resembles the response of a poor ferroelectric; however, it was shown that it is actually the response of a paraelectric material under the combination of the applied electric field and the internal field produced by the polarized KNbO3 layers. The hysteretic behavior is, therefore, not an intrinsic property of the KTaO3 layer but arises from the switching of the KNbO3 layers under the large external electric field which, in turn, switches the sign of the internal field on the KTaO3 layers.
4. Nanostructures
The causes of size effects in ferroelectrics are numerous, and it is difficult to separate true size effects from other factors that change with film thickness or capacitor size, such as microstructure, defect chemistry, and electrode interactions. For this reason, atomic-level investigations play a crucial role in determining their intrinsic behavior. The anisotropic shell model for BaTiO3 was used to determine the critical thickness for ferroelectricity in a free-standing, stress-free BaTiO3 film (it was also shown that the model developed for the bulk material can describe static surface properties [16] such as structural relaxations and surface energies, which are in quite good agreement with first-principles calculations). For this investigation a [001] TiO2-terminated slab was chosen. The equilibrated zero-temperature structure of the films was determined by a zero-temperature quench. The size and shape of the simulation cell were allowed to vary to reach zero stress. Shown in the top panel of Fig. 8 is the cell-by-cell polarization profile pz(z) at T = 0 K of a randomly chosen chain perpendicular to the film surface for various film thicknesses. It is clear from this figure that the film of 2.8 nm thickness does not display ferroelectricity. As a consequence of surface atomic relaxations, the two unit cells nearest to each surface of the slab develop a small polarization pointing inwards towards the bulk, so the net chain polarization vanishes. For the cases of 3.6 nm and 4.4 nm film thickness, however, the chains develop a net out-of-plane polarization. Although these individual chains display a nonvanishing perpendicular polarization, the net out-of-plane polarization of the film is zero due to the development of stripe-like domains, as is shown in the bottom panel of Fig. 8.
It was demonstrated that the strain effect produced by the presence of a substrate can lead to the stabilization of a polydomain ferroelectric state in films as thin as 2.0 nm [16].
[Figure 8: top panel, pz (µC/cm2) versus z (nm) for slab thicknesses d = 2.8 nm, 3.6 nm and 4.4 nm; bottom panel, map of Pz (µC/cm2).]
Figure 8. Top panel: cell-by-cell out-of-plane polarization profile of a randomly chosen chain perpendicular to the film surface for different slab thicknesses. Bottom panel: top view of the out-of-plane polarization pattern for the case d = 4.4 nm showing stripe-like domains. A similar picture is obtained for d = 3.6 nm.
To investigate to what extent a decrease in lateral size will affect the ferroelectric properties of the film, the equilibrium atomic positions and local polarizations at T = 0 K for a stress-free cubic cell of 3.6 nm size were computed. The nanocell is constructed in such a way that the top and bottom faces (perpendicular to the z axis) are [001] TiO2 planes and its lateral faces (parallel to the z axis) are [100] BaO planes.
Shown in the top panel of Fig. 9 are the cell-by-cell polarization profiles pz(z) for three different chains along the z direction: one chain at an edge of the cell, one at the center of a face, and the last one inside the nanocell. It is clear from this figure that the total chain polarization at the edges and at the lateral faces is zero. The large local polarizations pointing in opposite directions, at both sides of the cell, are just a consequence of strong atomic relaxations at the nanocell surface. On the other hand, the chain inside the nanocell displays a net, nonvanishing polarization of ≈ 5 µC/cm2. For comparison we have also plotted in Fig. 9 the pz(z) profile of the stress-free film of 3.6 nm thickness. We can clearly see that the two profiles are very similar. This is an indication that the decrease in lateral size does not affect the original ferroelectric properties of the thin film. As in the film case, the net polarization of the nanocell is zero due to the development of domains with opposite polarizations, as is shown in the bottom panel of Fig. 9. It was further demonstrated that a nanocell with different lateral faces, TiO2 planes instead of BaO planes, presents a different domain structure and polarization due to a strong surface effect [17].
[Figure 9: top panel, pz (µC/cm2) versus z (nm, 0.0 to 3.6) for the edge, face and inside chains and for the film; bottom panel, map of Pz (µC/cm2).]
Figure 9. Top panel: cell-by-cell polarization profiles (pz(z)) of three chosen chains in the nanocell. The profile for the 3.6 nm slab is shown for comparison. Bottom panel: top view of the polarization pattern for the nanocell.
5. Outlook
First-principles calculations of ferroelectric materials can answer some important questions directly, but this approach by itself cannot address the most challenging materials-related and microstructure-related problems. Fortunately, first-principles methods can provide benchmarks for the validation of other, conceptually less sophisticated approaches that, because of their low computational loads, can address such issues. The atomistic approach presented here demonstrates that enough of the electronic effects associated with ferroelectricity can be mimicked at the atomic level to allow the fundamentals of ferroelectric behavior to be reproduced. Moreover, the interatomic potential approach, firmly grounded by having its parameters derived from first-principles calculations, will be a very useful tool for the theoretical design of new materials for specific target applications. One important challenge in this field is the simulation of technologically important solid solutions which are more complex than the ones discussed here; for example, PbZrxTi1−xO3 (PZT) and PbMg1/3Nb2/3O3–PbTiO3 (PMN-PT), which is a single-crystal piezoelectric with giant electromechanical coupling. The difficult point here is the development of interatomic potentials suitable for such investigations. The simultaneous fitting of transferable potentials for the different pure materials is a way to develop interatomic potentials for the solid solutions. This could be done by using an extensive first-principles database to adjust the potential parameters. Although the methodology presented here is computationally efficient enough to allow materials problems to be addressed, clearly there is a lot of work to do in order to achieve a closer coupling with experiment. Real ferroelectric materials are frequently ceramics, and a critical role is often played by grain boundaries, impurities, surfaces, dislocations, domain walls, etc.
The critical issues that atomic-level simulation should be able to address include the microscopic processes associated with ferroelectric switching by domain-wall motion and the coupling of ferroelectricity and microstructure in such ceramics. There are exciting challenges in the simulation of ferroelectric
device structures. However, since such structures can involve ferroelectrics, electrodes (metallic or conducting oxide) and semiconductors, atomic-level methods capable of simulating such chemically diverse materials will have to be developed; this is an exciting challenge for the future.
Acknowledgments
We would like to thank S. Tinte, D. Wolf, and R.L. Migoni, who collaborated in the work described in this review.
References
[1] M.E. Lines and A.M. Glass, Principles and Applications of Ferroelectric and Related Materials, Clarendon Press, Oxford, 1977.
[2] A.F. Devonshire, “Theory of ferroelectrics,” Phil. Mag., (Suppl.) 3, 85, 1954.
[3] D. Vanderbilt, “First-principles based modelling of ferroelectrics,” Current Opinion in Sol. Stat. Mater. Sci., 2, 701–705, 1997.
[4] R. Cohen, “Theory of ferroelectrics: a vision for the next decade and beyond,” J. Phys. Chem. Sol., 61, 139–146, 2000.
[5] G.V. Lewis and C.R.A. Catlow, “Potential model for ionic oxides,” J. Phys. C, 18, 1149–1161, 1985.
[6] R. Migoni, H. Bilz, and D. Bäuerle, “Origin of Raman scattering and ferroelectricity in oxide perovskites,” Phys. Rev. Lett., 37, 1155–1158, 1976.
[7] H. Donnerberg and M. Exner, “Derivation and application of ab initio Nb5+–O2− short-range effective pair potentials in shell-model simulations of KNbO3 and KTaO3,” Phys. Rev. B, 49, 3746–3754, 1994.
[8] F. Jona and G. Shirane, Ferroelectric Crystals, Dover Publications, New York, 1993.
[9] S. Tinte, M.G. Stachiotti, M. Sepliarsky, R.L. Migoni, and C.O. Rodriguez, “Atomistic modelling of BaTiO3 based on first-principles calculations,” J. Phys.: Condens. Matter, 11, 9679–9690, 1999.
[10] P.H. Ghosez, E. Cockayne, U.V. Waghmare, and K.M. Rabe, “Lattice dynamics of BaTiO3, PbTiO3 and PbZrO3: a comparative first-principle study,” Phys. Rev. B, 60, 836–843, 1999.
[11] S. Tinte, J. Iniguez, K. Rabe, and D. Vanderbilt, “Quantitative analysis of the first-principles effective Hamiltonian approach to ferroelectric perovskites,” Phys. Rev. B, 67, 064106, 2003.
[12] M. Sepliarsky, S.R. Phillpot, D. Wolf, M.G. Stachiotti, and R.L. Migoni, “Atomic-level simulation of ferroelectricity in perovskite solid solutions,” Appl. Phys. Lett., 76, 3986–3988, 2000.
[13] S. Tinte, M.G. Stachiotti, S.R. Phillpot, M. Sepliarsky, D. Wolf, and R.L. Migoni, “Ferroelectric properties of BaxSr1−xTiO3 solid solutions by molecular dynamics simulation,” J. Phys.: Condens. Matter, 16, 3495–3506, 2004.
[14] M. Sepliarsky, S. Phillpot, D. Wolf, M.G. Stachiotti, and R.L. Migoni, “Long-ranged ferroelectric interactions in perovskite superlattices,” Phys. Rev. B, 64, 060101 (R), 2001.
[15] M. Sepliarsky, S. Phillpot, D. Wolf, M.G. Stachiotti, and R.L. Migoni, “Ferroelectric properties of KNbO3/KTaO3 superlattices by atomic-level simulation,” J. Appl. Phys., 90, 4509–4519, 2001.
[16] S. Tinte and M.G. Stachiotti, “Surface effects and ferroelectric phase transitions in BaTiO3 ultrathin films,” Phys. Rev. B, 64, 235403, 2001.
[17] M.G. Stachiotti, “Ferroelectricity in BaTiO3 nanoscopic structures,” Appl. Phys. Lett., 84, 251–253, 2004.
[18] G.V. Lewis and C.R.A. Catlow, “Defect studies of doped and undoped Barium Titanate using computer simulation techniques,” J. Phys. Chem. Sol., 47, 89–97, 1986.
2.7 ENERGY MINIMIZATION TECHNIQUES IN MATERIALS MODELING
C.R.A. Catlow1,2
1 Davy Faraday Laboratory, The Royal Institution, 21 Albemarle Street, London W1S 4BS, UK
2 Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
1. Introduction
Energy minimization is one of the simplest but most widely applied modeling procedures; its applications have ranged from biomolecular systems to superconducting oxides. Moreover, minimization is often the first stage in any modeling procedure. In this section, we review the basic concepts and techniques before providing a number of topical examples. We aim to show both the wide scope of the method and its extensive limitations.
2. Basics and Definitions
The conceptual basis of energy minimization (EM) is simple: an energy function $E(r_1, \ldots, r_N)$ is minimized with respect to the nuclear coordinates $r_i$ (or combinations of these) of a system of N atoms, which may be a molecule or cluster, or a system with 1D, 2D or 3D periodicity; in the latter case, the minimization may be applied to the lattice parameter(s), in addition to the coordinates of the atoms within the repeat unit. E may be calculated using a quantum mechanical method, although the term energy minimization is often associated with interatomic potential methods or some simpler procedures. The term “molecular mechanics” is essentially synonymous but refers to applications to molecular systems. The term “static lattice” methods is also widely used and normally implies a minimization procedure followed by the calculation of properties of the minimized configuration. EM methods may be extended to “free energy minimization” if the entropy contribution can be calculated by configurational or by molecular or lattice dynamical procedures. But by definition, EM excludes any explicit treatment of thermal motions.
S. Yip (ed.), Handbook of Materials Modeling, 547–564. © 2005 Springer. Printed in the Netherlands.
EM methods normally involve the specification of a “starting point” or initial configuration and the subsequent application of a numerical algorithm to locate the nearest local minimum. From this arises possibly the most fundamental limitation of the approach, the “local minimum” problem: minimization can never be guaranteed to find the global minimum of an energy (or any other) function. Straightforward implementations of the method are therefore essentially refinements of approximately known structures. Indeed, for many complex systems, e.g., protein structures, unless the starting configuration is very close to the global minimum, a local minimum will invariably be generated by minimization. Procedures for attempting to identify global minima will be discussed later in the section. Although minimization by definition excludes dynamical effects, it is possible to apply the technique to rate processes (e.g., diffusion and reaction) using methods based on Absolute Rate Theory, in which rates (ν) are calculated according to the expression:
$$\nu = \nu_0 \exp\left(-\Delta G_{\mathrm{ACT}}/kT\right), \tag{1}$$
where the pre-exponential factor, $\nu_0$, may be loosely related to a vibrational frequency and $\Delta G_{\mathrm{ACT}}$ refers to the free energy of activation of the process, i.e., the difference between the free energy of the transition state for the process and that of the ground state of the system. If the transition states can be located via some search procedure (or can be postulated from symmetry or other considerations), then the activation energy and (much less commonly) activation free energy may be calculated. Such procedures have been widely used in modeling atomic transport in solids. In Section 2.1, we first consider the type of energy function employed; the methods used to identify minima are then discussed, followed by a more detailed survey of methodologies. Recent applications are reviewed in the final sub-section. In all cases, the emphasis is on applications to materials, but many of the considerations apply generally to atomistic modeling.
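To make the rate expression of Eq. (1) concrete, the following minimal Python sketch evaluates it at a few temperatures. The function name `jump_rate`, the attempt frequency of 1e13 Hz and the 0.5 eV activation free energy are illustrative assumptions, not values taken from the text.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def jump_rate(nu0_hz, dg_act_ev, temperature_k):
    """Rate from Eq. (1): nu = nu0 * exp(-dG_act / kT)."""
    return nu0_hz * math.exp(-dg_act_ev / (K_B_EV * temperature_k))


# Illustrative (assumed) numbers: attempt frequency 1e13 Hz, barrier 0.5 eV.
for t in (300.0, 600.0, 1200.0):
    print(f"T = {t:6.0f} K   rate = {jump_rate(1e13, 0.5, t):.3e} s^-1")
```

The exponential dependence means that the computed rate is very sensitive to the activation (free) energy, which is why locating the transition state accurately matters in practice.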
2.1. Energy Functions
As noted earlier, minimization may be applied to any energy function that may be calculated as a function of nuclear coordinates. In atomistic simulation studies, three types of energy function may be identified:
(i) Quantum mechanically evaluated energies, where essentially we use the energy calculated by solving the Schrödinger equation at some level of approximation. Extensive discussions of such methods are, of course, available elsewhere in this volume.
(ii) Interatomic potential based energy functions. Here we use interatomic potentials to calculate the total energy of the system with respect to component atoms (i.e., the cohesive energy) or ions (the lattice energy), i.e.,
$$E = \frac{1}{2}\sum_{i}^{N}\sum_{j \neq i}^{N} V^{2}_{ij}(r_{ij}) + \frac{1}{3}\sum_{i}^{N}\sum_{j \neq i}^{N}\sum_{k \neq j \neq i}^{N} V^{3}_{ijk}(r_i, r_j, r_k) + \cdots \tag{2}$$
where the $V_{ij}$ are the pair potential components, $V_{ijk}$ the three-body terms, and the series continues in principle to higher order terms. The sum is over all N atoms in the system, but would normally be terminated beyond a “cut-off” distance (although note the case of the electrostatic term discussed later). In a high proportion of calculations (especially on non-metallic systems) only the two-body term is included, which allows the energy, E, for periodic systems to be written as:
$$E = \frac{1}{2}\sum_{i=1}^{N_c}\sum_{j \neq i}^{N_{cut}} V_{ij}(r_{ij}), \tag{3}$$
where the first summation runs over all atoms in the unit cell, and interactions with all other atoms are summed up to the specified cut-off. It is common to separate off the electrostatic contributions to $V_{ij}$, i.e.,
$$V_{ij}(r_{ij}) = \frac{q_i q_j}{r_{ij}} + V^{SR}_{ij}(r_{ij}), \tag{4}$$
where $q_i$ and $q_j$ are atomic or ion charges and $V^{SR}_{ij}$ is the remaining, “short-range” component of the potential. This allows us to write:
$$E = E_c + \frac{1}{2}\sum_{i=1}^{N_c}\sum_{j \neq i}^{N_{cut}} V^{SR}_{ij}(r_{ij}), \tag{5}$$
where $E_c$ is the Coulomb term, obtained by summing the $r^{-1}$ terms, which should not be truncated in any accurate calculation. The short-range terms can, however, usually be safely truncated at a distance of 10–20 Å. The summation of the electrostatic term must be carefully undertaken, as it may be conditionally convergent if handled in real space. The most widely used procedure rests on the work of Ewald (see, e.g., [1]), which obtains rapid convergence by a partial transformation into reciprocal space. The procedure has been very extensively used, and for applications to materials we refer to the articles in Ref. [2].
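The truncated short-range sum in Eqs. (3) and (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the Buckingham form and its parameters are placeholders, a single potential is used for every pair, and the Coulomb term $E_c$ is deliberately omitted because it requires an Ewald-type summation rather than a real-space cut-off. The names `buckingham` and `short_range_energy` are hypothetical.

```python
import itertools

import numpy as np


def buckingham(r, a=1000.0, rho=0.3, c=10.0):
    """Short-range pair potential A*exp(-r/rho) - C/r^6 (placeholder parameters)."""
    return a * np.exp(-r / rho) - c / r**6


def short_range_energy(frac_coords, cell, pair_potential, r_cut):
    """(1/2) sum_i sum_{j != i} V(r_ij) for atoms i in the cell and r_ij < r_cut."""
    cell = np.asarray(cell, dtype=float)  # rows are the lattice vectors
    cart = np.asarray(frac_coords, dtype=float) @ cell
    # Perpendicular cell heights decide how many periodic images are needed.
    heights = 1.0 / np.linalg.norm(np.linalg.inv(cell), axis=0)
    n_img = np.ceil(r_cut / heights).astype(int)
    ranges = [range(-n, n + 1) for n in n_img]
    energy = 0.0
    for shift in itertools.product(*ranges):
        offset = np.asarray(shift, dtype=float) @ cell
        for i, ri in enumerate(cart):
            for j, rj in enumerate(cart):
                if i == j and not any(shift):
                    continue  # skip the self-interaction in the home cell
                r = np.linalg.norm(rj + offset - ri)
                if r < r_cut:
                    energy += 0.5 * pair_potential(r)
    return energy


# Simple cubic test cell (edge 2.0) with one atom; only the six nearest
# neighbours lie inside the cut-off of 2.5.
cell = 2.0 * np.eye(3)
e_sc = short_range_energy([[0.0, 0.0, 0.0]], cell, buckingham, r_cut=2.5)
print(e_sc)
```

The explicit triple loop is chosen for readability; production codes use neighbour lists and species-dependent potentials.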
2.2. Other Functions
In some cases, a simple “cost function” may be used based on geometrical criteria rather than energies. For example, the distance least squares (DLS) approach [3] is based on minimization of a cost function obtained by summing the squares of the differences between calculated and “standard” bond lengths for a structure. More complex cost functions include deviations between calculated and specified coordination numbers. We have also noted earlier that if entropy terms can be estimated, energy minimization can be extended to free energy minimization. Such extensions will be discussed in detail for the case of periodic lattices.
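A geometrical cost function of the kind described for DLS can be sketched as follows. This is an unweighted, non-periodic illustration only; the function name `dls_cost`, the bond list and the “standard” length of 1.6 are assumptions made for the example and are not taken from Ref. [3].

```python
import numpy as np


def dls_cost(coords, bonds, standard_lengths):
    """Sum of squared differences between calculated and standard bond lengths."""
    coords = np.asarray(coords, dtype=float)
    cost = 0.0
    for (i, j), d0 in zip(bonds, standard_lengths):
        d = np.linalg.norm(coords[i] - coords[j])
        cost += (d - d0) ** 2
    return cost


# Three atoms on a line with two bonds; both "standard" lengths are 1.6 (assumed).
coords = [[0.0, 0.0, 0.0], [1.7, 0.0, 0.0], [3.2, 0.0, 0.0]]
bonds = [(0, 1), (1, 2)]
print(dls_cost(coords, bonds, [1.6, 1.6]))
```

Such a function can be passed to any of the minimizers discussed in this section in place of an energy.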
2.3. Identification of Minima
We recall that standard minimization methods aim to identify the energy minimum starting from a specified initial configuration, using algorithms which will be discussed later. As argued earlier, it is impossible ever to guarantee that a global minimum has been achieved. However, a number of procedures are available to mitigate the effects of the local minimum problem, with the two main classes being:
(i) Simulated Annealing (SA), where the approach is to use molecular dynamics (MD) or Monte Carlo (MC) simulations, initially at high temperature, thereby allowing the system to explore the potential energy surface and escape from local minima into the region of the global minimum. The normal procedure is to “cool” the system during the course of the simulation, which usually concludes with a standard minimization. SA has been used successfully and predictively in a number of cases in crystal structure modeling. If used carefully and appropriately, the method offers a good probability of identifying the global minimum; but there always remains a distinct possibility that the simulation will fail to locate regions of configurational space close to the global minimum, especially if there are substantial energy barriers between this and other regions.
(ii) Genetic Algorithm (GA) methods, which have been widely used in optimization studies, and where the approach is fundamentally different from SA. Instead of one starting point, there are many, which may simply be different random arrangements of atoms (with some overall constraint such as unit-cell dimensions). A cost function is specified and is evaluated for each configuration. The population of configurations then evolves through successive generations. The “breeding” process involves exchange of features between different members of the population and is driven so as to generate a population with a low cost function. At the end of the procedure, selected members of the population are subjected to energy minimization, giving a range of minimum structures from which the lowest energy one may be selected.
GA methods again offer no guarantee that the global minimum has been located. Their particular merit is that they use a variety of initial configurations, rather than one as in SA. However, both approaches unquestionably have their value. A good account of the application of the GA method to periodic solids is given in Ref. [4].
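The simulated annealing idea in item (i) can be illustrated with a toy Monte Carlo sketch. Everything here is an assumption made for illustration: the one-dimensional double-well “energy”, the geometric cooling schedule, the step size and the function names. The final standard minimization mentioned in the text is omitted for brevity, and no GA example is attempted.

```python
import math
import random


def energy(x):
    """Toy 1D surface: a local minimum near x = +1 and a lower one near x = -1."""
    return x**4 - 2.0 * x**2 + 0.5 * x


def simulated_annealing(f, x0, t_start=2.0, t_end=1e-3, n_steps=20000,
                        step=0.5, seed=0):
    rng = random.Random(seed)
    x, e = x0, f(x0)
    best_x, best_e = x, e
    for k in range(n_steps):
        # Geometric cooling schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / (n_steps - 1))
        x_new = x + rng.uniform(-step, step)
        e_new = f(x_new)
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


# Start in the higher (local) well and let the annealing explore.
x_best, e_best = simulated_annealing(energy, x0=1.0)
print(x_best, e_best)
```

As the text stresses, a run of this kind gives no guarantee of reaching the global minimum; the outcome depends on the cooling schedule, the step size and the random sequence.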
3. Methodologies
Minimization methods may be applied to periodic lattices, to defects within lattices, to surfaces and to clusters. The methodological aspects are similar in all these different areas. In this section, we pay the greatest attention to perfect lattice minimization. The field of defect calculations is reviewed in Chapter 6.4.
3.1. Perfect Lattice Calculations
The first objective here is to calculate the lattice energy, in which the summation in Eq. (2) is taken over all atoms/ions in the unit cell interacting with all other species. The calculation is tractable via the use of the Ewald summation for the Coulombic terms and the cut-off for the short-range interactions. We note that the great majority of lattice energy calculations only include the two-body contribution to the short-range energy. One important matter of definition is that the lattice energy gives the energy of the crystal with respect to component ions at infinity. If it is desired to express the energy with respect to atoms at infinity (for which the more appropriate term is then the cohesive energy), then the appropriate ionization energies and electron affinities must be added. Lattice energy calculations are now routine, and may be carried out for very large unit cells containing several hundred atoms. The codes METAPOCS, THBREL and GULP undertake lattice energy calculations including both two- and three-body terms, using both bond-bending and triple-dipole formalisms. Lattice energy calculations provide valuable insight into the structures and stabilities of ionic and semi-ionic solids. The technique is most powerful when combined with energy minimization procedures, which generate the structure of minimum energy. These are discussed later, after the calculation of entropies has been described. The results in Table 1 give a good illustration of the value of lattice energy studies. They are the minimized lattice energies calculated for a number of purely siliceous microporous zeolitic structures, which are compared with the lattice energy of α-SiO2.

Table 1. Relative energies (per mol) of microporous siliceous structures with respect to quartz (after Ref. [5])
Structure     Energy (kJ/mol)
Silicalite    11.2
Mordenite     20.52
Faujasite     21.4

α-SiO2 has the lowest value, as would indeed be expected since the more porous structures are known to be metastable with respect to the dense α-SiO2 polymorph. Of greater interest is the observation that, of the porous structures, silicalite has the greatest stability. This accords with the fact that this polymorph can only be prepared as a highly siliceous compound, unlike the other zeolitic structures, which are normally synthesized with high aluminium contents. The calculations, which are discussed in greater detail by Ooms et al. [5], suggest that this behavior has its origin at least in part in the thermodynamic stability of the compounds. We note that more recently very similar results were obtained by Henson et al. [6], who also showed that the calculated values were in excellent agreement with experiment. In addition to calculating energies, it is also possible to calculate routinely a range of crystal properties, including the lattice stability, the elastic, dielectric and piezoelectric constants, and the phonon dispersion curves. The techniques used, which are quite standard, require knowledge of both first and second derivatives of the energy with respect to the atomic coordinates. Indeed it is useful to describe two quantities: first the vector, g, whose components $g^{\alpha}_i$ are defined as:
$$g^{\alpha}_i = \frac{\partial E}{\partial x^{\alpha}_i}, \tag{6}$$
i.e., the first derivative of the lattice energy with respect to a given Cartesian coordinate (α) of the ith atom. The second derivative matrix W has components $W^{\alpha\beta}_{ij}$, defined by:
$$W^{\alpha\beta}_{ij} = \frac{\partial^2 E}{\partial x^{\alpha}_i \, \partial x^{\beta}_j}. \tag{7}$$
The expressions used in calculating the properties referred to above from these derivatives are discussed in greater detail in Refs. [2] and [7]. For more detailed discussions of the calculation of phonon dispersion curves from the second derivative or “dynamical” matrix W, the reader should consult [8] and Parker and Price [9]. Finally, we note that by the term “lattice stability” we refer to the equilibrium conditions both for the atoms within the unit cell, and for the unit cell as a whole. The former are available from the gradient vector g, while the latter are described in terms of the six components $\varepsilon_1 \ldots \varepsilon_6$ which define the strain matrix ε, where
$$\varepsilon = \begin{pmatrix} \varepsilon_1 & \tfrac{1}{2}\varepsilon_4 & \tfrac{1}{2}\varepsilon_5 \\ \tfrac{1}{2}\varepsilon_4 & \varepsilon_2 & \tfrac{1}{2}\varepsilon_6 \\ \tfrac{1}{2}\varepsilon_5 & \tfrac{1}{2}\varepsilon_6 & \varepsilon_3 \end{pmatrix}. \tag{8}$$
So when the unit cell as a whole is strained, we describe the modification of an arbitrary vector r in the unstrained matrix to a vector r′ in the strained matrix, using the equation:
$$r' = (1 + \varepsilon)\, r, \tag{9}$$
where 1 is the unit matrix. The six derivatives of energy with respect to strain, $[\partial E/\partial \varepsilon_i]$, therefore measure the forces acting on the unit cell. The equilibrium condition for the crystal therefore requires that g = 0 and $[\partial E/\partial \varepsilon_i] = 0$ for all i.
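The quantities g and W of Eqs. (6) and (7) and the strain transformation of Eq. (9) can be sketched numerically as below. This is an illustration under assumptions: central finite differences stand in for the analytical derivatives that lattice codes normally use, all coordinates are flattened into one vector, and the placement of the off-diagonal strain components follows one reading of Eq. (8). The function names are hypothetical.

```python
import numpy as np


def gradient(f, x, h=1e-4):
    """Central-difference estimate of g_k = dE/dx_k."""
    x = np.asarray(x, dtype=float)
    steps = h * np.eye(x.size)
    return np.array([(f(x + s) - f(x - s)) / (2.0 * h) for s in steps])


def hessian(f, x, h=1e-4):
    """Central-difference estimate of W_kl = d2E/(dx_k dx_l)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    steps = h * np.eye(n)
    w = np.zeros((n, n))
    for k in range(n):
        for l in range(n):
            w[k, l] = (f(x + steps[k] + steps[l]) - f(x + steps[k] - steps[l])
                       - f(x - steps[k] + steps[l]) + f(x - steps[k] - steps[l])
                       ) / (4.0 * h * h)
    return w


def strain_matrix(eps):
    """Symmetric 3x3 strain matrix built from six components (assumed layout)."""
    e1, e2, e3, e4, e5, e6 = eps
    return np.array([[e1, 0.5 * e4, 0.5 * e5],
                     [0.5 * e4, e2, 0.5 * e6],
                     [0.5 * e5, 0.5 * e6, e3]])


def apply_strain(r, eps):
    """Eq. (9): r' = (1 + eps) r."""
    return (np.eye(3) + strain_matrix(eps)) @ np.asarray(r, dtype=float)


# Quadratic test "energy" E = 1/2 x.A.x, whose gradient is A x and Hessian is A.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
f = lambda x: 0.5 * x @ A @ x
x0 = np.array([0.3, -0.2])
print(gradient(f, x0), hessian(f, x0))
print(apply_strain([1.0, 0.0, 0.0], [0.01, 0.0, 0.0, 0.0, 0.0, 0.0]))
```

At equilibrium both the gradient returned by `gradient` and the analogous derivatives with respect to the strain components would vanish.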
3.2. Entropy Calculations
The entropy in a solid arises first from configurational terms, which for a perfect solid are zero, while for a solid showing orientational or translational disorder configurational expressions based on the Boltzmann expression S = k ln(W) may be used. In this section we shall pay more attention to the second term, which is due to the population of the vibrational degrees of freedom of the solid. Thus the vibrational entropy of a solid may be written as:
$$S_{vib} = k \int_0^{Q} dQ \sum_i \left\{ \frac{h\nu_i}{kT}\left[\exp\left(\frac{h\nu_i}{kT}\right) - 1\right]^{-1} - \ln\left[1 - \exp\left(\frac{-h\nu_i}{kT}\right)\right] \right\} \tag{10}$$
where the sum is over all phonon frequencies and the integral is over the Brillouin zone. In practice the integral is normally evaluated by sampling over the zone, for which a variety of techniques are available. Vibrational terms also give a contribution to the lattice energy of the crystal:
$$E_{vib} = kT \int_0^{Q} dQ \sum_i \left\{ \frac{h\nu_i}{2kT} + \frac{h\nu_i}{kT}\left[\exp\left(\frac{h\nu_i}{kT}\right) - 1\right]^{-1} \right\} \tag{11}$$
which results in the following expression for the crystal free energy with respect to ions at rest at infinity:
$$F = E + kT \int_0^{Q} dQ \sum_i \left\{ \frac{h\nu_i}{2kT} + \ln\left[1 - \exp\left(\frac{-h\nu_i}{kT}\right)\right] \right\} \tag{12}$$
where E is the lattice energy (omitting vibrational terms).
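The vibrational expressions of Eqs. (10) to (12) can be checked numerically with a short sketch. It rests on stated assumptions: the Brillouin zone integral is replaced by a plain sum over a supplied list of sampled mode frequencies (equal weights, no normalization), and the three frequencies used are arbitrary illustrative values. The function name `vibrational_terms` is hypothetical.

```python
import numpy as np

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def vibrational_terms(frequencies_hz, temperature_k):
    """Return (S_vib, E_vib, F_vib) summed over the supplied modes, in SI units."""
    nu = np.asarray(frequencies_hz, dtype=float)
    x = H * nu / (K_B * temperature_k)  # h nu_i / kT
    s_vib = K_B * np.sum(x / np.expm1(x) - np.log1p(-np.exp(-x)))
    e_vib = K_B * temperature_k * np.sum(0.5 * x + x / np.expm1(x))
    f_vib = K_B * temperature_k * np.sum(0.5 * x + np.log1p(-np.exp(-x)))
    return s_vib, e_vib, f_vib


# Three arbitrary mode frequencies (Hz) at 300 K.
s, e, f_v = vibrational_terms([1e12, 5e12, 1e13], 300.0)
print(s, e, f_v)
print(e - 300.0 * s)  # should reproduce the vibrational free energy
```

The last line illustrates that the vibrational part of Eq. (12) is simply $E_{vib} - T S_{vib}$, which is a useful consistency check on any implementation.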
3.3. Energy Minimization
Having evaluated energies and free energies of a crystal structure, we are now able to implement these in an energy (or free energy) minimization procedure. Let us consider first the simple case of minimization at constant volume (i.e., within fixed cell dimensions). We write the energy of the crystal as a Taylor expansion in the displacements of the atoms, δ, from the current configuration, giving:
$$E(\delta) = E_0 + g \cdot \delta + \frac{1}{2}\, \delta \cdot W \cdot \delta + \cdots \tag{13}$$
If we terminate this function at the second-order term and minimize E with respect to δ, we obtain for the energy minimum:
$$ 0 = g + W\delta, \quad \text{i.e.,} \quad \delta = -W^{-1} g \tag{14} $$
Displacement of the coordinates by δ as given in Eq. (14) will generate the energy minimum configuration. Of course, in practice, it will not be valid to truncate the expansion at the quadratic term, except when very close to the minimum. However, Eq. (14) provides the basis of an effective iterative procedure for attaining the minimum. Indeed this "Newton–Raphson" method is widely used in both perfect and defect lattice energy minimization, as it is generally rapidly convergent. Its main disadvantage is that it requires the calculation, inversion and storage of the second derivative matrix, W. Recalculation and inversion at each iteration may be avoided by use of updating procedures (see e.g., [10]). The storage problem may become serious with very large structures owing to the high memory requirements. Recourse may be made to gradient methods, e.g., the well-known conjugate gradients technique, which make use only of first derivatives. Such methods are, however, more slowly converging. The increasing availability of very large computer memories is, however, reducing the difficulties associated with the storage of the W matrix.
For evaluation of the energy minimum at constant pressure (i.e., with variable cell dimensions), first we note that we can define the six
components of the mechanical pressure acting on the solid, corresponding to the six strain components defined in Eq. (8), i.e.,
$$ P^{\varepsilon_i} = \frac{1}{V} \frac{\mathrm{d}U}{\mathrm{d}\varepsilon_i} \tag{15} $$
where V is the unit cell volume. The strains can then be evaluated using Hooke's law,
$$ \varepsilon = C^{-1} P \tag{16} $$
where C is the (6 × 6) elastic constant tensor, which may be calculated from W. Substitution of these calculated strain components into Eq. (9) then yields the new cell dimensions and atomic coordinates. Again, the procedure is iterative, as it is only strictly valid in the region of applicability of the harmonic approximation. With a sensible starting point, however, only a small number of iterations (typically 2–5) is required. The treatment above assumes that the pressure and corresponding strains are entirely mechanical in origin. However, at finite temperatures there will be a "kinetic pressure" arising from the changes in the vibrational free energy with volume. These may be written as:
$$ P^{\varepsilon_i}_{\mathrm{vib}} = \frac{1}{V} \frac{\mathrm{d}F_{\mathrm{vib}}}{\mathrm{d}\varepsilon_i} \tag{17} $$
where Fvib is the vibrational free energy. These kinetic pressures are most simply evaluated by applying small arbitrary strains to the structure and calculating the corresponding changes in Fvib. If Pvib is added to the mechanical pressure P in Eq. (15), it enables us to carry out free energy minimization (see e.g., [11]). A general computer code, PARAPOCS, is available for such calculations and the same functionality is available in the GULP code [12]. A detailed discussion is given by Parker and Price [9] and Watson et al. [11], who also describe how the techniques may be used to calculate lattice expansivity, either directly or by calculating the cell dimension as a function of temperature or by calculation of the thermal Grüneisen parameter.
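The constant-volume Newton–Raphson iteration built on Eq. (14) can be sketched as below. This is an illustrative toy, not the algorithm of any specific lattice code: the function names, the convergence tolerance, and the model energy (a quadratic form plus a quartic term standing in for anharmonicity) are assumptions chosen so that the example is self-contained.

```python
import numpy as np


def newton_raphson(gradient, hessian, x0, tol=1e-10, max_iter=50):
    """Minimize an energy by repeated use of delta = -W^-1 g, Eq. (14).

    gradient(x) returns g, hessian(x) returns the second derivative matrix W.
    """
    x = np.array(x0, dtype=float)
    for iteration in range(max_iter):
        g = gradient(x)
        if np.linalg.norm(g) < tol:
            return x, iteration
        # Solve W delta = -g rather than forming the inverse explicitly.
        delta = -np.linalg.solve(hessian(x), g)
        x = x + delta
    raise RuntimeError("Newton-Raphson did not converge")


# Hypothetical anharmonic model energy:
# E(x) = 1/2 x.A.x - b.x + 1/4 sum_i x_i^4
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
b = np.array([1.0, 1.0])


def model_gradient(x):
    return A @ x - b + x**3


def model_hessian(x):
    return A + np.diag(3.0 * x**2)


x_min, n_iter = newton_raphson(model_gradient, model_hessian, x0=[0.0, 0.0])
```

Because the model energy is not purely quadratic, a single step does not reach the minimum and several iterations are needed, which mirrors the remark above that truncation at second order is valid only close to the minimum.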
3.4.
Surface Simulations
The procedures here are closely related to those employed in perfect lattice calculations but adapted to 2D periodicity. The most widely used procedure is that pioneered by Tasker [13], in which a slab is taken and divided into
two regions. Full minimization is undertaken on the upper region which represents the relaxed surface structure and which is embedded in a rigid representation of the underlying lattice. The Ewald summation must be adapted for 2D periodicity using the formalism developed by Parry [14]. Surface simulations have been widely and successfully applied especially to the surfaces of ionic materials, and a number of standard codes are available, e.g., METADISE and MARVIN. The methods may also be readily adapted to study interfaces and other 2D periodic systems such as grain boundaries as will be discussed later in this chapter.
3.5.
Defect and Cluster Calculations
Defect simulations, as discussed in detail in Chapter 6.4, proceed by relaxation of an atomistically represented region of lattice which is embedded in a more approximate representation of the more distant regions of the lattice, whose dielectric and/or elastic response to the defect is calculated. An increasingly widely used extension of the procedure is to describe the immediate environment of the defect (the defect itself and a small number of surrounding coordination shells) quantum mechanically. The detailed discussion of such "embedded cluster" methods is beyond the scope of the present chapter; a recent review is available in Ref. [15]. Minimization of the energy of clusters is, of course, conceptually straightforward. Minimization algorithms are applied to the cluster energy (or free energy) obtained by direct summation. Considerable attention has been paid in this field to the use of global optimization techniques owing to the prevalence of multiple minima. A recent review of cluster simulations is available in Ref. [16].
4.
Discussion and Applications
Minimization methods have been extensively applied to metals, ceramics, silicates, semiconductors and molecular materials. In this section we will provide topical examples which will illustrate the current capabilities of the techniques.
4.1.
Predictions of the Structures of Microporous Materials
Microporous materials have been widely investigated over the last 50 years owing to their extensive range of applications in catalysis, gas separation and
ion exchange. Zeolites (originally observed as minerals, but now extensively available as synthetic materials) are all-silica or aluminosilicate materials, based on fully corner-shared networks of SiO4 and AlO4 tetrahedra, but with structures that contain channels, pores and voids of molecular dimensions; pore sizes are typically in the range 5–15 Å. The aluminosilicate materials contain exchangeable cations, while the microporous structures give rise to the applications in molecular sieving and sorption. Exchange of protons into the materials creates acid sites which promote catalytic reactions including cracking, isomerization and hydrocarbon synthesis, while metal ions in both framework and extraframework locations can act as active sites for partial oxidation reactions. Modeling techniques have been applied extensively and successfully to the study of microporous materials (see, e.g., the books edited by Catlow [17] and Catlow et al. [18]), and there have been a number of successful applications of minimization techniques to the accurate, and indeed predictive, modeling of microporous structures. Here we highlight a recent significant development, namely the prediction of new hypothetical structures. There have been many attempts to predict new microporous structures, most of which have rested on the fact that the very definition of these materials is based on geometry, rather than on precise chemical composition, occurrence or function. In order to be considered as a zeolite, or zeolite-type material (zeotype), a mineral or synthetic material must possess a 3D four-connected inorganic framework, i.e., a framework consisting of tetrahedra which are all corner-sharing. There is an additional criterion that the framework should enclose pores or cavities which are able to accommodate sorbed molecules or exchangeable cations, which leads to the exclusion of denser phases.
Topologically, the zeolite frameworks may thus be thought of as four-connected nets, where each vertex is connected to its four closest neighbours. So far 139 zeolite framework types are known, either from the structures of natural minerals or from synthetically produced inorganic materials. In enumerating microporous structures, a number of fruitful approaches have been developed. Some have involved the decomposition of existing structures into their various structural subunits, and then recombining these in such ways as to generate novel frameworks. Methods which involve combinatorial, or systematic, searches of phase space have also been successfully deployed. Recently, an approach based on mathematical tiling theory has also been reported [19]. It was established that there are exactly 9, 117 and 926 topological types of four-connected uninodal (i.e., containing one topologically distinct type of vertex), binodal and trinodal networks, respectively, derived from simple tilings (tilings with vertex figures which are tetrahedra), and at least 145 additional uninodal networks derived from quasi-simple tilings (the vertex figures of which are derived from tetrahedra, but contain double edges). In principle, the tiling
approach offers a complete solution to the problem of framework enumeration, although the number of possible nets is infinite. Potentially therefore we may be able to generate an unlimited number of possible zeolitic frameworks. Of these, only a portion is likely to be of interest as having desirable properties, with an even smaller fraction being amenable to synthesis in any given composition. It is this last problem, the feasibility of hypothetical frameworks, which is the key question in any analysis of such structures. The answer is not a simple one, since the factors which govern the synthesis of such materials are not fully understood. As discussed earlier, zeolites are metastable materials. Aside from this thermodynamic constraint, the precise identity of the phase or phases formed during hydrothermal synthesis is said to be under "kinetic control," although there is increasing sophistication in targeting certain types of framework using various templating methods, fluoride media and other synthesis parameters. Additionally, certain structural motifs are more likely to be formed within certain compositions, e.g., double four-rings in germanates, three-rings in beryllium-containing compounds and so on. A full characterization of any hypothetical zeolite must therefore include an analysis of framework topology and of the types of building unit present, as well as some estimate of the thermodynamic stability of the framework. Using an appropriate potential model, lattice energy minimization can, as shown above, provide a very good measure of this stability as well as optimizing structures to a high degree of accuracy. In the method adopted by Foster et al. [20], networks derived from tiling theory were first transformed into "virtual zeolites" of composition SiO2 by placing silicon atoms at the vertices of the nets, and bridging oxygens at the midpoints of connecting edges.
The structures were then refined using the geometry-based DLS procedure, referred to above, before final optimization by lattice energy minimization. Among the 150 or so uninodal structures examined, all 18 known uninodal zeolite frameworks were found. Moreover, most of the unknown frameworks had been described by previous authors; in fact there is a considerable degree of overlap between sets of uninodal structures generated by different methods. Most of the binodal and trinodal structures, however, are completely new. A number of the more interesting structures, selected using calculated lattice energy as an initial measure of feasibility, are shown in Fig. 1. The challenge is now to synthesize these structures.
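The first step of this procedure, decorating a four-connected net with Si at the vertices and O at the edge midpoints, can be sketched as follows. The data layout (fractional coordinates, with each edge carrying an integer lattice translation for its second vertex) and the function name are assumptions for illustration only; the diamond net is used here simply because it is the most familiar four-connected net, and the subsequent DLS refinement and lattice energy minimization are not shown.

```python
import numpy as np


def build_virtual_zeolite(vertices_frac, edges):
    """Place Si on net vertices and O on edge midpoints (fractional coordinates).

    vertices_frac : (n, 3) fractional coordinates of the net vertices
    edges         : list of (i, j, offset), where offset is the integer
                    lattice translation applied to vertex j
    """
    verts = np.asarray(vertices_frac, dtype=float)
    degree = np.zeros(len(verts), dtype=int)
    oxygens = []
    for i, j, offset in edges:
        degree[i] += 1
        degree[j] += 1
        end = verts[j] + np.asarray(offset, dtype=float)
        oxygens.append((0.5 * (verts[i] + end)) % 1.0)  # bridging O at midpoint
    if not np.all(degree == 4):
        raise ValueError("net is not four-connected")
    return verts % 1.0, np.array(oxygens)


# Diamond net in the primitive fcc cell: 2 vertices, 4 edges.
diamond_vertices = [[0.0, 0.0, 0.0], [0.25, 0.25, 0.25]]
diamond_edges = [(0, 1, (0, 0, 0)),
                 (0, 1, (-1, 0, 0)),
                 (0, 1, (0, -1, 0)),
                 (0, 1, (0, 0, -1))]
si_sites, o_sites = build_virtual_zeolite(diamond_vertices, diamond_edges)
```

A net with n four-connected vertices has 2n edges, so the decorated structure automatically has the SiO2 composition quoted in the text.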
4.2.
Grain Boundary Structures in Mantle Minerals
Grain boundaries are known to be a major factor controlling mechanical and rheological properties of materials. Detailed knowledge of their structures is, however, limited. Simulation methods have made a major contribution over
Figure 1. Illustrations of feasible uninodal zeolite structures generated by tiling theory and modeled using lattice energy minimization (panels labeled detl_14, detl_19, detl_11, delt_71 and delt_35).
the past 20 years in developing models for grain boundaries, as in the work of Keblinski et al. [21] on metal systems and Duffy, Harding and Stoneham [22] on ionic systems. Recent work has explored grain boundary properties in the Mantle mineral forsterite, Mg2SiO4, a member of the olivine group of minerals, which comprise a major proportion of the upper part of the Earth's Mantle. Knowledge of the grain boundary structure of this material is vital for developing an improved
understanding of the rheology of the Mantle. Modeling boundaries in this material, however, presents substantial challenges owing to the complexity of the crystal structure. The recent work of de Leeuw et al. [23] investigated this problem using static lattice simulation techniques. They modeled the forsterite grain boundaries using empirical potential models for SiO2 and MgO. Atomistic simulation techniques are appropriate for these calculations because they are capable of modeling systems consisting of large numbers of ions, which is necessary when modeling grain boundaries, as shown in many studies. Energy minimization techniques were used to investigate the structure and stability of the grain boundaries and the interactions between the lattice ions at the boundaries and adsorbed species, such as protons and dissociated water molecules, to identify the strength of interaction with specific boundary features. They employed the energy minimization code METADISE, which is designed to model dislocations, interfaces and surfaces. A grain boundary is created by fitting two surface blocks together in different orientations. In the present case, two series of tilt grain boundaries (M1 and M2, defined by the type of cation site at the surface) were created from appropriate models of stepped forsterite (010) surfaces at increasing boundary angles. Both boundary and adhesion energies were calculated, which describe the stability of the boundary with respect to the bulk material and free surfaces, respectively. Results are reported in Table 2 and Fig. 2. The atomistic models generated are shown in Fig. 3. The larger grain boundaries do not form a continuously disordered interface but rather a series of open channels in the interfacial region with practically bulk termination of the two mirror planes (Fig. 3).
We would expect that physical processes such as melting and diffusion of ions and molecules, e.g., oxygen or water, will be enhanced, especially at the larger-terraced boundaries, owing to the low density of these regions compared to the bulk crystal. The minima in the adhesion energies at φ ≈ 20° (M1) or ≈ 30° (M2) (Fig. 2) indicate the boundaries which are most easily cleaved and are due to the relative stabilities of the grain boundaries and corresponding free surfaces. Overall, the results show the ability of simulation methods to generate realistic models for these complex interfaces.

Table 2. Calculated boundary energies of (010) tilt grain boundaries in forsterite

Boundary    Boundary angle (°)    Boundary energy (J m−2)
M2          65                    1.32
M2          47                    2.72
M2          36                    3.57
M2          28                    3.50
M2          23                    3.09
M1          60                    2.12
M1          41                    3.13
M1          30                    3.19
M1          23                    2.94
M1          19                    2.88

Figure 2. Adhesion energies (J/m2) as a function of grain boundary tilt angle (degrees) for the M1 and M2 series.
4.3.
Nanocluster Structures in ZnS
Our final example is an intriguing case study in cluster chemistry. As part of an extensive study aimed at identifying the structures of the critical growth nuclei in the growth of ZnS crystals, Spano et al. [24, 25] have identified a whole series of stable open cluster structures for (ZnS)n clusters with n ranging from 1 to 80. They employed simulated annealing and minimization techniques using interatomic potentials, with critical structures also being modeled by Density Functional Theory electronic structure methods, the results of which validate the interatomic potential based simulations. The cluster structures have quite different topologies from bulk ZnS. A particularly interesting example is shown in Fig. 4. It is an onion-like cluster with an inner core and outer shell. Work is in progress aimed at detecting these structures experimentally.
Figure 3. Relaxed structures of tilt grain boundaries with (010) mirror terraces, top (100) step wall showing two round channels per terrace, bottom (001) step wall with one triangular channel per terrace.
5.
Conclusions
This chapter has surveyed the essential methodological aspects of minimization techniques and has illustrated the scope of the field by a number of recent examples. Despite their simplicity, minimization methods will remain powerful tools in materials simulation.
Figure 4. Predicted onion-like structure for (ZnS)60.
Acknowledgments I am grateful to many colleagues for their contributions to the work discussed in this chapter, but special thanks go to Robert Bell, Martin Foster, Nora de Leeuw, Stephen Parker and Said Hamad, whose recent work was highlighted in the applications section.
References
[1] M.P. Tosi, Solid State Phys., 16, 1, 1964.
[2] C.R.A. Catlow (ed.), Computer Modelling in Inorganic Crystallography, Academic Press, London, 1997.
[3] W.M. Meier and H. Villiger, Z. Kristallogr., 128, 352, 1969.
[4] S.M. Woodley, In: R.L. Johnston (ed.), Structure and Bonding, vol. 110, Springer, Heidelberg, 2004.
[5] G. Ooms, R.A. van Santen, C.J.J. den Ouden, R.A. Jackson, and C.R.A. Catlow, J. Phys. C: Condensed Matter., 92, 4462, 1988.
[6] N.J. Henson, A.K. Cheetham, and J.D. Gale, Chem. Mater., 6, 1647, 1994.
[7] C.R.A. Catlow and W.C. Mackrodt (eds.), "Computer simulation of solids," Lecture Notes in Physics, vol. 166, Springer, Berlin, 1982.
[8] W. Cochran, Crit. Rev. Solid Sci., 2, 1, 1971.
[9] S.C. Parker and G.D. Price, In: C.R.A. Catlow (ed.), Advanced Solid State Chemistry, vol. 1, JAI Press, 1990.
[10] M.J. Norgett and R. Fletcher, J. Phys. C: Condensed Matter, 3, L190, 1970.
[11] Watson et al., In: C.R.A. Catlow (ed.), Computer Modelling in Inorganic Crystallography, Academic Press, London, p. 55, 1997.
[12] J.D. Gale, J. Chem. Soc. Faraday Trans., 93, 629, 1997.
[13] P.W. Tasker, J. Phys. C: Condensed Matter., 12, 4977, 1979.
[14] D.E. Parry, Surf. Sci., 49, 433, 1975.
[15] P. Sherwood et al., J. Mol. Struct. – Theochem, 632, 1, 2003.
[16] R.L. Johnston, Dalton Trans., 22, 4193, 2003.
[17] C.R.A. Catlow (ed.), Modelling of Structure and Reactivity in Zeolites, Academic Press, London, 1992.
[18] C.R.A. Catlow, B. Smit, and R.A. van Santen (eds.), Modelling Microporous Materials, Elsevier, Amsterdam, 2004.
[19] O. Delgado Friedrichs, A.W.M. Dress, D.H. Huson, J. Klinowski, and A.L. Mackay, Nature, 400, 644, 1999.
[20] M.D. Foster, A. Simperler, R.G. Bell, O. Delgado Friedrichs, F.A. Almeida Paz, and J. Klinowski, Nature Mat., 3, 234, 2004.
[21] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, Philos. Mag. A, 79, 2735, 1999.
[22] D.M. Duffy, J.H. Harding, and A.M. Stoneham, Philos. Mag. A, 67, 865, 1993.
[23] N.H. De Leeuw, S.C. Parker, C.R.A. Catlow, and G.D. Price, Am. Mineral., 85, 1143, 2000.
[24] E. Spano, S. Hamad, and C.R.A. Catlow, J. Phys. Chem. B, 107, 10337, 2003.
[25] E. Spano, S. Hamad, and C.R.A. Catlow, Chem. Commun., 864, 2004.
2.8 BASIC MOLECULAR DYNAMICS
Ju Li
Department of Materials Science and Engineering, Ohio State University, Columbus, OH, USA
A working definition of molecular dynamics (MD) simulation is a technique by which one generates the atomic trajectories of a system of N particles by numerical integration of Newton's equation of motion, for a specific interatomic potential, with certain initial condition (IC) and boundary condition (BC). Consider, for example, a system with N atoms in a volume Ω. We can define its internal energy: E ≡ K + U, where K is the kinetic energy,
$$ K \equiv \sum_{i=1}^{N} \frac{1}{2} m_i \left| \dot{\mathbf{x}}_i(t) \right|^2, \tag{1} $$
and U is the potential energy,
$$ U = U(\mathbf{x}^{3N}(t)). \tag{2} $$
$\mathbf{x}^{3N}(t)$ denotes the collective of 3D coordinates $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_N(t)$. Note that E should be a conserved quantity, i.e., a constant of time, if the system is truly isolated. One can often treat a MD simulation like an experiment (Fig. 1). Below is a common flowchart of an ordinary MD run:
[system setup] sample selection (pot., N, IC, BC) → [equilibration] sample preparation (achieve T, P) → [simulation run] property average (run L steps) → [output] data analysis (property calc.)
in which we fine-tune the system until it reaches the desired condition (here, temperature T and pressure P), and then perform property averages, for instance calculating the radial distribution function g(r) [1] or thermal conductivity [2]. One may also perform a non-equilibrium MD calculation, during which the system is subjected to perturbational or large external driving forces, and we analyze its non-equilibrium response, such as in many mechanical deformation simulations.
S. Yip (ed.), Handbook of Materials Modeling, 565–588. © 2005 Springer. Printed in the Netherlands.
Figure 1. Illustration of the MD simulation system (N particles with trajectories $\mathbf{x}_i(t)$, shown with Cartesian axes x, y, z).
There are five key ingredients to a MD simulation: boundary condition, initial condition, force calculation, integrator/ensemble, and property calculation. A brief overview of them is given below, followed by more specific discussions.
Boundary condition. There are two major types of boundary conditions: isolated boundary condition (IBC) and periodic boundary condition (PBC). IBC is ideally suited for studying clusters and molecules, while PBC is suited for studying bulk liquids and solids. There could also be mixed boundary conditions such as slab or wire configurations, for which the system is assumed to be periodic in some directions but not in the others. In IBC, the N-particle system is surrounded by vacuum; these particles interact among themselves, but are presumed to be so far away from everything else in the universe that no interactions with the outside occur, except perhaps responding to some well-defined "external forcing." In PBC, one explicitly keeps track of the motion of N particles in the so-called supercell, but the supercell is surrounded by infinitely replicated, periodic images of itself. Therefore a particle may interact not only with particles in the same supercell but also with particles in adjacent image supercells (Fig. 2). While several polyhedral shapes (such as the hexagonal prism and the rhombic dodecahedron from the Wigner–Seitz construction) can be used as the space-filling unit and thus can serve as the PBC supercell, the simplest and most often used supercell shape is a parallelepiped, specified by its three edge vectors $\mathbf{h}_1$, $\mathbf{h}_2$ and $\mathbf{h}_3$. It should be noted that IBC can most often be well mimicked by a large enough PBC supercell so the images do not interact.
Initial condition. Since Newton's equations of motion are second-order ordinary differential equations (ODE), IC basically means $\mathbf{x}^{3N}(t = 0)$ and
Figure 2. Illustration of periodic boundary condition (PBC). We explicitly keep track of trajectories of only the atoms in the center cell called the supercell (defined by edge vectors $\mathbf{h}_1$, $\mathbf{h}_2$ and $\mathbf{h}_3$), which is infinitely replicated in all three directions (image supercells). An atom in the supercell may interact with other atoms in the supercell as well as atoms in the surrounding image supercells. $r_c$ is a cut-off distance of the interatomic potential beyond which interaction may be safely ignored.
$\dot{\mathbf{x}}^{3N}(t = 0)$, the initial particle positions and velocities. Generating the IC for crystalline solids is usually quite easy, but IC for liquids needs some work, and even more so for amorphous solids. A common strategy for creating a proper liquid configuration is to melt a crystalline solid. And if one wants to obtain an amorphous configuration, a strategy is to quench the liquid during the MD run. Let us focus on IC for crystalline solids. For instance, $\mathbf{x}^{3N}(t = 0)$ can be a fcc perfect crystal (assuming PBC), or an interface between two crystalline phases. For most MD simulations, one needs to write a structure generator. Before feeding the initial configuration thus created into a MD run, it is a good idea to visualize it first, checking bond lengths and coordination numbers, etc. [3]. A frequent cause of MD simulation breakdown is a pathological initial condition, in which the atoms are too close to each other initially, leading to huge forces. According to the equipartition theorem [4], each independent degree of freedom should possess $k_B T/2$ kinetic energy. So, one should draw each
component of the 3N-dimensional $\dot{\mathbf{x}}^{3N}(t = 0)$ vector from a Gaussian–Maxwell normal distribution $N(0, k_B T/m_i)$. After that, it is a good idea to eliminate the center of mass velocity, and for clusters, the net angular momentum as well. Force calculation. Before moving into details of force calculation, it should be mentioned that two approximations underlie the use of the classical equation of motion
$$ m_i \frac{\mathrm{d}^2 \mathbf{x}_i(t)}{\mathrm{d}t^2} = \mathbf{f}_i \equiv -\frac{\partial U}{\partial \mathbf{x}_i}, \qquad i = 1, \ldots, N \tag{3} $$
to describe the atoms. The first is the Born–Oppenheimer approximation [5], which assumes the electronic state couples adiabatically to nuclear motion. The second is that the nuclear motion is far removed from the Heisenberg uncertainty lower bound: $\Delta E \, \Delta t \gg \hbar/2$. If we plug in $\Delta E = k_B T/2$, the kinetic energy, and $\Delta t = 1/\omega$, where ω is a characteristic vibrational frequency, we obtain $k_B T/\hbar\omega \gg 1$. In solids, this means the temperature should be significantly greater than the Debye temperature, which is actually quite a stringent requirement. Indeed, large deviations from experimental heat capacities are seen in classical MD simulations of crystalline solids [2]. A variety of schemes exist to correct this error [1], for instance the Wigner–Kirkwood expansion [6] and path integral molecular dynamics [7]. The evaluation of the right-hand side of Eq. (3) is the key step that usually consumes most of the computational time in a MD simulation, so its efficiency is crucial. For long-range Coulomb interactions, special algorithms exist to break them up into two contributions: a short-ranged interaction, plus a smooth, field-like interaction, both of which can be computed efficiently in separate ways [8]. In this contribution we focus on issues concerning short-range interactions only. There is a section about the Lennard–Jones potential and its truncation schemes, followed by a section about how to construct and maintain an atom–atom neighborlist with O(N) computational effort per step. Finally, see Chap. 2.4 and 2.5 for the development of interatomic potential $U(\mathbf{x}^{3N})$ functions for metallic and covalent materials, respectively. Integrator/ensemble. Equation (3) is a set of second-order ODEs, which can be strongly nonlinear. By converting them to first-order ODEs in the 6N-dimensional space of $\{\mathbf{x}^N, \dot{\mathbf{x}}^N\}$, general numerical algorithms for solving ODEs such as the Runge–Kutta method [9] can be applied.
However, these general methods are rarely used in practice, because the existence of a Hamiltonian allows for more accurate integration algorithms, prominent among which are the family of predictor-corrector integrators [10] and the family of symplectic integrators [8, 11]. A section in this contribution gives a brief overview of integrators. Ensembles such as the micro-canonical, canonical, and grand-canonical are concepts in statistical physics that refer to the distribution of initial conditions. A system, once drawn from a certain ensemble, is supposed to follow strictly
the Hamiltonian equation of motion Eq. (3), with E conserved. However, ensemble and integrator are often grouped together because there exists a class of methods that generates the desired ensemble distribution via time integration [12, 13]. Equation (3) is modified in these methods to create a special dynamics whose trajectory over time forms a cloud in phase space that has the desired distribution density. Thus, the time-average of a single-point operator on one such trajectory approaches the thermodynamic average. However, one should be careful in using it to calculate two-point correlation function averages. See Chap. 2.4 for detailed description of these methods. Property calculation. A great strength of MD simulation is that it is "omnipotent" at the level of classical atoms. All properties that are well-posed in classical mechanics and statistical mechanics can in principle be computed. The remaining issue is computational efficiency. The properties can be roughly grouped into four categories:
1. Structural characterizations. Examples include radial distribution function, dynamic structure factor, etc.
2. Equations of state. Examples include free-energy functions, phase diagrams, static response functions like thermal expansion coefficient, etc.
3. Transport. Examples include viscosity, thermal conductivity (electronic contribution excluded), correlation functions, diffusivity, etc.
4. Non-equilibrium response. Examples include plastic deformation, pattern formation, etc.
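The velocity part of the initial condition described earlier (each component drawn from $N(0, k_B T/m_i)$, followed by removal of the center of mass velocity) can be sketched as follows. This is a minimal illustration in SI units; the function name, the use of NumPy's random generator, and the argon example are assumptions, and removal of net angular momentum for clusters is not included.

```python
import numpy as np

K_B = 1.380649e-23         # Boltzmann constant in J/K
AMU = 1.66053906660e-27    # atomic mass unit in kg


def initial_velocities(masses, temperature, rng):
    """Draw velocities with variance k_B T / m_i per component, zero net momentum."""
    masses = np.asarray(masses, dtype=float)
    sigma = np.sqrt(K_B * temperature / masses)[:, None]
    velocities = rng.normal(size=(len(masses), 3)) * sigma
    # Subtract the mass-weighted mean so that the total momentum vanishes.
    v_com = (masses[:, None] * velocities).sum(axis=0) / masses.sum()
    return velocities - v_com


def kinetic_temperature(masses, velocities):
    """Instantaneous temperature from equipartition over 3N degrees of freedom."""
    masses = np.asarray(masses, dtype=float)
    twice_kinetic = np.sum(masses[:, None] * velocities**2)
    return twice_kinetic / (3 * len(masses) * K_B)


# Hypothetical example: 2000 argon atoms at 100 K.
rng = np.random.default_rng(0)
masses = np.full(2000, 39.948 * AMU)
velocities = initial_velocities(masses, temperature=100.0, rng=rng)
t_kin = kinetic_temperature(masses, velocities)
```

The instantaneous kinetic temperature of a finite sample fluctuates around the target value, which is one reason an equilibration stage precedes the production run in the flowchart above.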
1.
The Lennard–Jones Potential
The solid and liquid states of rare gas elements Ne, Ar, Kr, Xe are better understood than other elements because their closed-shell electron configurations do not allow them to participate in covalent or metallic bonding with neighbors, which are strong and complex, but only to interact via weak van der Waals bonds, which are perturbational in nature in these elements and therefore mostly additive, leading to the pair-potential model:
$$ U(\mathbf{x}^{3N}) = \sum_{j>i}^{N} V(|\mathbf{x}_{ji}|), \qquad \mathbf{x}_{ji} \equiv \mathbf{x}_j - \mathbf{x}_i, \tag{4} $$
where we assert that the total potential energy can be decomposed into the direct sum of individual "pair-interactions." If there is to be rotational invariance in $U(\mathbf{x}^{3N})$, V can only depend on $r_{ji} \equiv |\mathbf{x}_{ji}|$. In particular, the Lennard–Jones potential
$$ V(r) = 4\varepsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right], \tag{5} $$
is a widely used form for V(r) that depends on just two parameters: a basic energy-scale parameter ε, and a basic length-scale parameter σ. The potential is plotted in Fig. 3. There are a few noteworthy facts about the Lennard–Jones potential:
• V(r = σ) = 0, at which point the potential is still repulsive, meaning V′(r = σ) < 0 and two atoms would repel each other if separated at this distance.
• The potential minimum occurs at $r_{\min} = 2^{1/6}\sigma$, and $V_{\min} = -\varepsilon$. When $r > r_{\min}$ the potential switches from being repulsive to being attractive.
• As r → ∞, V(r) is attractive and decays as $r^{-6}$, which is the correct scaling law for dispersion (London) forces between closed-shell atoms. To get a feel for how fast V(r) decays, note that V(r = 2.5σ) = −0.0163ε, V(r = 3σ) = −0.00548ε, and V(r = 3.5σ) = −0.00217ε.
• As r → 0, V(r) is repulsive as $r^{-12}$. In fact, $r^{-12}$ blows up so quickly that an atom seldom is able to penetrate r < 0.9σ, so the Lennard–Jones potential can be considered as having a "hard core." There is no conceptual basis for the $r^{-12}$ form, and it may be unsuitable as a model for certain materials, so it is sometimes replaced by a "soft core" of the form exp(−kr), which combined with the $r^{-6}$ attractive part is called the Buckingham exponential-6 potential. If the attractive part is also of an exponential form exp(−kr/2), then it is called a Morse potential.
Figure 3. The Lennard–Jones potential, plotted as $V_{LJ}(r)/\varepsilon$ against $r/\sigma$.
For definiteness, σ = 3.405 Å and ε = 119.8 $k_B$ = 0.01032 eV for Ar. The mass can be taken to be the isotopic average, 39.948 a.m.u.
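The facts listed above for the Lennard–Jones potential, and the argon parameters just quoted, can be checked with a few lines of code. This is only an illustrative sketch: the function names `lj_energy` and `lj_force` are assumptions, and the force is defined here as the radial component −dV/dr, positive when repulsive.

```python
import numpy as np

K_B_EV = 8.617333262e-5          # Boltzmann constant in eV/K
SIGMA_AR = 3.405                 # Angstrom, value quoted in the text
EPSILON_AR = 119.8 * K_B_EV      # eV, from epsilon = 119.8 k_B


def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy, Eq. (5)."""
    sr6 = (sigma / np.asarray(r, dtype=float)) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_force(r, epsilon=1.0, sigma=1.0):
    """Radial force -dV/dr; positive values mean the pair repels."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


r_min = 2.0 ** (1.0 / 6.0)                        # minimum position, units of sigma
tail = lj_energy(np.array([2.5, 3.0, 3.5]))       # decay of the attractive tail
well_depth_ar = -lj_energy(r_min * SIGMA_AR, EPSILON_AR, SIGMA_AR)  # eV
```

Working with ε = σ = 1, as in the default arguments, anticipates the reduced units discussed in the next subsection.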
1.1.
Reduced Units
Unit systems are invented to make physical laws look simple and numerical calculations easy. Take Newton's law: f = ma. In the SI unit system, this means that if an object of mass x (kg) is undergoing an acceleration of y (m/s2), the force on the object must be xy (N). However, there is nothing intrinsically special about the SI unit system. One (kg) is simply the mass of a platinum–iridium prototype in a vacuum chamber in Paris. If one wishes, one can define his or her own mass unit, ($\widetilde{\mathrm{kg}}$), which say is 1/7 of the mass of the Paris prototype: 1 (kg) = 7 ($\widetilde{\mathrm{kg}}$). If ($\widetilde{\mathrm{kg}}$) is one's choice of the mass unit, how about the rest of the unit system? One really has to make a decision here, which is either keeping all the other units unchanged and only making the (kg) → ($\widetilde{\mathrm{kg}}$) transition, or changing some other units along with the (kg) → ($\widetilde{\mathrm{kg}}$) transition. Imagine making the first choice, that is, keeping all the other units of the SI system unchanged, including the force unit (N), and only changing the mass unit from (kg) to ($\widetilde{\mathrm{kg}}$). That is all right, except in the new unit system Newton's law must be re-expressed as F = ma/7, because if an object of mass 7x ($\widetilde{\mathrm{kg}}$) is undergoing an acceleration of y (m/s2), the force on the object is xy (N). There is nothing inherently wrong with the F = ma/7 expression, which is just a recipe for computation, and a correct one for the newly chosen unit system. Fundamentally, F = ma/7 and F = ma describe the same physical law. But it is true that F = ma/7 is less elegant than F = ma. No one likes to memorize extra constants if they can be reduced to unity by a sensible choice of units. The SI unit system is sensible, because (N) is picked to work with other SI units to satisfy F = ma. How may we have a sensible unit system but with ($\widetilde{\mathrm{kg}}$) as the mass unit? Simple, just define ($\widetilde{\mathrm{N}}$) = (N)/7 as the new force unit. The (m)–(s)–($\widetilde{\mathrm{kg}}$)–($\widetilde{\mathrm{N}}$) unit system is sensible because the simplest form of F = ma is preserved.
Thus we see that when a certain unit in a sensible unit system is altered, other units must also be altered correspondingly in order to constitute a new sensible unit system, which keeps the algebraic forms of all fundamental physical laws unaltered. (A notable exception is the conversion between SI and Gaussian unit systems in electrodynamics, during which a non-trivial factor of 4π comes up.) In science people have formed deep-rooted conventions about the simplest algebraic forms of physical laws, such as $F = ma$, $K = mv^2/2$, $E = K + U$, $P = \rho RT$, etc. Although nothing forbids one from modifying the constant coefficients in front of each expression, one is better off not doing so. Fortunately, as long as one uses a sensible unit system, these algebraic expressions stay invariant.
Now, imagine we derive a certain composite law from a set of simple laws. On one side, we start with and consistently use a sensible unit system A. On the other side, we start with and consistently use another sensible unit system B. Since the two sides use exactly the same algebraic forms, the resultant algebraic expression must also be the same, even though for a given physical instance, a variable takes on two different numerical values on the two sides as different unit systems are adopted. This means that the final algebraic expression describing the physical phenomenon must satisfy a certain concerted scaling invariance with respect to its dependent variables, corresponding to any feasible transformation between sensible unit systems. This strongly limits the form of possible algebraic expressions describing physical phenomena, which is the basis of dimensional analysis.

As mentioned, once certain units are altered, other units must be altered correspondingly to make the algebraic expressions of physical laws look invariant. For example, for a single-element Lennard–Jones system, one can define a new energy unit $(\tilde{\mathrm{J}}) = \varepsilon$ (J), a new length unit $(\tilde{\mathrm{m}}) = \sigma$ (m), and a new mass unit $(\widetilde{\mathrm{kg}}) = m_a$ (kg), which is the atomic mass, where $\varepsilon$, $\sigma$ and $m_a$ are pure numbers. In the $(\tilde{\mathrm{J}})$–$(\tilde{\mathrm{m}})$–$(\widetilde{\mathrm{kg}})$ unit system, the potential energy function is

\[ V(r) = 4\left(r^{-12} - r^{-6}\right), \tag{6} \]

and the mass of an atom is $m = 1$. Besides that, all physical laws must remain invariant. For example, $K = mv^2/2$ in the SI system, and it should still hold in the $(\tilde{\mathrm{J}})$–$(\tilde{\mathrm{m}})$–$(\widetilde{\mathrm{kg}})$ unit system. This can only be achieved if the derived time unit (also called the reduced time unit), $(\tilde{\mathrm{s}}) = \tau$ (s), satisfies

\[ \varepsilon = m_a\sigma^2/\tau^2, \quad\text{or}\quad \tau = \sqrt{\frac{m_a\sigma^2}{\varepsilon}}. \tag{7} \]

To see this, note that $m = 1$ $(\widetilde{\mathrm{kg}})$, $v = 1$ $(\tilde{\mathrm{m}})/(\tilde{\mathrm{s}})$, and $K = 1/2$ $(\tilde{\mathrm{J}})$ is a solution to $K = mv^2/2$ in the $(\tilde{\mathrm{J}})$–$(\tilde{\mathrm{m}})$–$(\widetilde{\mathrm{kg}})$ unit system, but must also be a solution to $K = mv^2/2$ in the SI system. For Ar, $\tau$ turns out to be $2.156 \times 10^{-12}$, thus the reduced time unit $[\tilde{\mathrm{s}}]$ = 2.156 [ps]. This is roughly the timescale of one atomic oscillation period in Ar.
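As a quick numerical check of Eq. (7), the following sketch evaluates τ for argon from the parameters quoted above. The function name `reduced_time_unit` and the constant names are illustrative choices, and the values of k_B and the atomic mass unit are standard CODATA figures rather than numbers taken from this chapter.

```python
import math

# Standard SI constants (not from the chapter itself).
K_B = 1.380649e-23      # Boltzmann constant, J/K
AMU = 1.66053907e-27    # atomic mass unit, kg

def reduced_time_unit(eps_joule, sigma_m, mass_kg):
    """Return tau = sqrt(m_a * sigma^2 / eps) in seconds, as in Eq. (7)."""
    return math.sqrt(mass_kg * sigma_m ** 2 / eps_joule)

# Argon parameters quoted in the text.
eps = 119.8 * K_B        # epsilon / k_B = 119.8 K
sigma = 3.405e-10        # 3.405 angstrom
m_a = 39.948 * AMU       # isotopic-average mass

tau = reduced_time_unit(eps, sigma, m_a)
print(f"tau = {tau * 1e12:.3f} ps")
```

With these inputs the result agrees with the 2.156 ps quoted in the text, so one reduced time unit is indeed of the order of an atomic vibration period.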
1.2. Force Calculation
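For a pair potential of the form (4), the force on atom $i$ is

\[ \mathbf{f}_i = -\sum_{j\neq i}\frac{\partial V(r_{ij})}{\partial \mathbf{x}_i} = \sum_{j\neq i}\left(-\left.\frac{\partial V(r)}{\partial r}\right|_{r=r_{ij}}\right)\hat{\mathbf{x}}_{ij} = \sum_{j\neq i}\left(-\left.\frac{1}{r}\frac{\partial V(r)}{\partial r}\right|_{r=r_{ij}}\right)\mathbf{x}_{ij}, \tag{8} \]

where $\hat{\mathbf{x}}_{ij}$ is the unit vector,

\[ \hat{\mathbf{x}}_{ij} \equiv \frac{\mathbf{x}_{ij}}{r_{ij}}, \qquad \mathbf{x}_{ij} \equiv \mathbf{x}_i - \mathbf{x}_j. \tag{9} \]

One can define the force on $i$ due to atom $j$,

\[ \mathbf{f}_{ij} \equiv \left(-\left.\frac{1}{r}\frac{\partial V(r)}{\partial r}\right|_{r=r_{ij}}\right)\mathbf{x}_{ij}, \tag{10} \]

and so

\[ \mathbf{f}_i = \sum_{j\neq i}\mathbf{f}_{ij}. \tag{11} \]

It is easy to see that

\[ \mathbf{f}_{ij} = -\mathbf{f}_{ji}. \tag{12} \]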
MD programs tend to take advantage of symmetries like the above to save computations.
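The symmetry in Eq. (12) can be exploited as in the following minimal sketch, which visits each pair once and applies equal and opposite forces. It assumes the reduced-unit Lennard–Jones form of Eq. (6), no cutoff and no periodic boundaries; the function names are illustrative and not taken from any particular MD code.

```python
def lj_force_over_r(r2):
    """Return -(1/r) dV/dr for V(r) = 4 (r^-12 - r^-6), given r^2."""
    inv_r2 = 1.0 / r2
    inv_r6 = inv_r2 ** 3
    return 24.0 * inv_r2 * inv_r6 * (2.0 * inv_r6 - 1.0)

def compute_forces(positions):
    """Sum pair forces, Eq. (11), visiting each pair i < j only once."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n - 1):
        for j in range(i + 1, n):
            xij = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in xij)
            scale = lj_force_over_r(r2)      # prefactor of x_ij in Eq. (10)
            for k in range(3):
                fk = scale * xij[k]
                forces[i][k] += fk           # f_ij acts on atom i
                forces[j][k] -= fk           # f_ji = -f_ij acts on atom j
    return forces

forces = compute_forces([[0.0, 0.0, 0.0], [1.1, 0.2, 0.0], [0.3, 1.2, 0.5]])
```

Because every pair contribution is added to one atom and subtracted from the other, the total force on an isolated cluster vanishes up to round-off, which is a convenient sanity check.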
1.3. Truncation Schemes
Consider the single-element Lennard–Jones potential in (5). Practically we can only carry out the potential summation up to a certain cutoff radius. There are many ways to truncate, the simplest of which is to modify the interaction as
\[ V_0(r) = \begin{cases} V(r) - V(r_c), & r < r_c \\ 0, & r \ge r_c. \end{cases} \tag{13} \]

However, $V_0(r)$ is discontinuous in the first derivative at $r = r_c$, which causes large errors in time integration (especially with high-order algorithms and large time steps) if an atom crosses $r_c$, and is detrimental to calculating correlation functions over long times. Another commonly used scheme,

\[ V_1(r) = \begin{cases} V(r) - V(r_c) - V'(r_c)(r - r_c), & r < r_c \\ 0, & r \ge r_c, \end{cases} \tag{14} \]

makes the force continuous at $r = r_c$, but also makes the potential well too shallow (see Fig. 4). It is also slightly more expensive because we have to compute the square root of $|\mathbf{x}_{ij}|^2$ in order to get $r$. An alternative is to define

\[ \tilde{V}(r) = \begin{cases} V(r)\exp\left(r_s/(r - r_c)\right), & r < r_c \\ 0, & r \ge r_c, \end{cases} \]

which has all derivatives continuous at $r = r_c$. However, this truncation scheme requires another tunable parameter $r_s$.

Figure 4. Lennard–Jones (LJ6-12) potential and its modified forms $V(r)$, $V_0(r)$, $V_1(r)$ and $W(r)$, plotted as E [ε] versus r [σ], with cutoff $r_c = 2.37343\sigma$. Black lines indicate positions of neighbors in a single-element fcc crystal at 0 K.

The following truncation scheme,

\[ W(r) = \begin{cases} 4\varepsilon\left[\left(\dfrac{\sigma}{r}\right)^{12} - \left(\dfrac{\sigma}{r}\right)^{6} + \left(2\left(\dfrac{\sigma}{r_c}\right)^{18} - \left(\dfrac{\sigma}{r_c}\right)^{12}\right)\left(\dfrac{r}{\sigma}\right)^{6} - 3\left(\dfrac{\sigma}{r_c}\right)^{12} + 2\left(\dfrac{\sigma}{r_c}\right)^{6}\right], & r < r_c \\ 0, & r \ge r_c, \end{cases} \tag{15} \]

is recommended. $W(r)$, $V(r)$, $V_0(r)$ and $V_1(r)$ are plotted in Fig. 4 for comparison. $r_c$ is chosen to be $2.37343\sigma$, which falls exactly at 2/3 of the interval between the fourth and fifth neighbors of the equilibrated fcc lattice at 0 K. There is clearly a tradeoff in picking $r_c$. If $r_c$ is large, the effect of the artificial truncation is small. On the other hand, maintaining and summing over a large neighbor list (size $\propto r_c^3$) costs more. For a properly written O(N) MD code, the cost versus neighbor number relation is almost linear. Let us see what the minimal $r_c$ is for an fcc solid. Figure 5 shows the neighboring atom shells and their multiplicity. Also drawn are the three glide planes.
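The behaviour of the truncation schemes at the cutoff can be checked numerically. The sketch below implements $V$, $V_0$, $V_1$ and $W$ in reduced units (ε = σ = 1) and estimates the slope just inside $r_c$ with a one-sided finite difference; the helper `slope_just_inside` and its step size are illustrative assumptions.

```python
RC = 2.37343  # cutoff in units of sigma, as used for Fig. 4

def v_lj(r):
    """Untruncated Lennard-Jones potential, Eq. (6)."""
    return 4.0 * (r ** -12 - r ** -6)

def dv_lj(r):
    """First derivative dV/dr of the untruncated potential."""
    return 4.0 * (-12.0 * r ** -13 + 6.0 * r ** -7)

def v0(r, rc=RC):
    """Shifted potential, Eq. (13)."""
    return v_lj(r) - v_lj(rc) if r < rc else 0.0

def v1(r, rc=RC):
    """Shifted-force potential, Eq. (14)."""
    return v_lj(r) - v_lj(rc) - dv_lj(rc) * (r - rc) if r < rc else 0.0

def w(r, rc=RC):
    """Recommended truncation, Eq. (15)."""
    if r >= rc:
        return 0.0
    s6 = rc ** -6
    return 4.0 * (r ** -12 - r ** -6
                  + (2.0 * s6 ** 3 - s6 ** 2) * r ** 6
                  - 3.0 * s6 ** 2 + 2.0 * s6)

def slope_just_inside(pot, rc=RC, h=1e-6):
    """One-sided finite-difference slope, approached from r < rc."""
    return (pot(rc - h) - pot(rc - 2.0 * h)) / h

r_min = 2.0 ** (1.0 / 6.0)   # location of the untruncated minimum
for name, pot in (("V0", v0), ("V1", v1), ("W", w)):
    print(name, "slope at rc:", slope_just_inside(pot), "well depth:", pot(r_min))
```

In this check $V_0$ keeps a finite slope at $r_c$ while $V_1$ and $W$ go to zero smoothly, and the well of $W$ is deeper than that of $V_1$, which is consistent with the remarks above.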
Figure 5. FCC neighboring shells. Shell labels in the figure, in the form "shell number with multiplicity as subscript; cumulative neighbor count", are: 1_12; 12, 2_6; 18, 3_24; 42, 4_12; 54, 5_24; 78, 6_8; 86, 7_48; 134. For example, label "6_8; 86" means there are eight sixth nearest neighbors of the type shown in the figure, which adds up to 86 neighbors in all if included. The ABC stacking planes are also shown in the figure.
With (15), once the number of interacting neighbor shells is determined, we can evaluate the equilibrium volume and bulk modulus of the crystal in closed form. The total potential energy of each atom is

\[ e_i = \frac{1}{2}\sum_{j\neq i}^{r_{ji} < r_c} W(r_{ji}). \tag{16} \]

For an fcc crystal, we can extract scale-independent coefficients from the above summation and differentiate with respect to the lattice constant $a$; the minimum yields the equilibrium lattice constant $a_0$. If we demand that $r_c$ fall at an exact position between the highest included shell and the lowest excluded shell, we can iterate the process until mutual consistency is achieved. We then plug $a_0$ into (16) to calculate the binding energy per atom, $e_0$; the atomic volume

\[ \Omega_0 = \frac{a_0^3}{4}, \tag{17} \]

and the bulk modulus

\[ B \equiv -\left.\frac{dP}{d\log\Omega}\right|_{\Omega_0} = \frac{a_0^2}{9\Omega_0}\left.\frac{d^2e}{da^2}\right|_{a_0} = \frac{4}{9a_0}\left.\frac{d^2e}{da^2}\right|_{a_0} \quad \text{(for fcc)}. \tag{18} \]
Table 1. FCC neighboring shells included in Eq. (15) vs. properties

n    N     r_c [σ]          a_0 [σ]          Ω_0 [σ³]         e_0 [ε]           B [εσ⁻³]
1    12    1.44262944953    1.59871357076    1.02153204121    −2.03039845846    39.39360127902
2    18    1.81318453769    1.57691543349    0.98031403353    −4.95151157088    52.02448553061
3    42    2.11067974132    1.56224291246    0.95320365252    −6.12016548816    58.94148705580
4    54    2.37343077641    1.55584092331    0.94153307381    −6.84316556834    64.19738627468
5    78    2.61027143673    1.55211914976    0.93479241591    −7.27254778301    66.65093979162
6    86    2.82850677530    1.55023249772    0.93138774467    −7.55413237921    68.53093399765
7    134   3.03017270367    1.54842162594    0.92812761235    −7.74344974981    69.33961787572
8    140   3.21969263257    1.54727436382    0.92606612556    −7.88758411490    70.63452119577
9    176   3.39877500485    1.54643096926    0.92455259927    −7.99488847415    71.18713376234
10   200   3.56892997792    1.54577565469    0.92337773387    −8.07848627384    71.76659559499
The self-consistent results for rc ratio 2/3 are shown in Table 1. That is, rc is exactly at 2/3 the distance between the nth interacting shell and the (n +1)th non-interacting shell. The reason for 2/3(> 1/2) is that we expect thermal expansion at finite temperature. If one is after converged Lennard–Jones potential results, then rc = 4σ is recommended. However, it is about five times more expensive per atom than the minimum-cutoff calculation with rc = 2.37343σ .
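The n = 4 row of Table 1 can be reproduced approximately with a short script. The sketch below sums Eq. (16) over fcc shells, locates the minimum of the energy per atom with a golden-section search at fixed $r_c$, and estimates the bulk modulus from Eq. (18) by a central finite difference. The shell list, the search bracket and the helper names are assumptions made for illustration; the self-consistent iteration on $r_c$ described in the text is not repeated here, and the converged $r_c$ is simply taken from the table.

```python
import math

# (multiplicity, r^2 / a^2) of the first seven fcc neighbor shells (Fig. 5).
FCC_SHELLS = [(12, 0.5), (6, 1.0), (24, 1.5), (12, 2.0),
              (24, 2.5), (8, 3.0), (48, 3.5)]

def w(r, rc):
    """Truncated potential of Eq. (15) in reduced units (epsilon = sigma = 1)."""
    if r >= rc:
        return 0.0
    s6 = rc ** -6
    return 4.0 * (r ** -12 - r ** -6
                  + (2.0 * s6 ** 3 - s6 ** 2) * r ** 6
                  - 3.0 * s6 ** 2 + 2.0 * s6)

def energy_per_atom(a, rc):
    """Eq. (16) for a perfect fcc crystal with lattice constant a."""
    return 0.5 * sum(m * w(a * math.sqrt(q), rc) for m, q in FCC_SHELLS)

def golden_minimum(func, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if func(c) < func(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)

RC = 2.37343077641                       # Table 1, n = 4
a0 = golden_minimum(lambda a: energy_per_atom(a, RC), 1.52, 1.60)
e0 = energy_per_atom(a0, RC)
h = 1e-4                                 # finite-difference step in a
d2e = (energy_per_atom(a0 + h, RC) - 2.0 * e0
       + energy_per_atom(a0 - h, RC)) / h ** 2
bulk = 4.0 / (9.0 * a0) * d2e            # Eq. (18)
print(a0, a0 ** 3 / 4.0, e0, bulk)
```

The bracket [1.52, 1.60] is chosen so that exactly four shells lie inside $r_c$ throughout the search; a different row of the table would need a different bracket and cutoff.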
2. Integrators

An integrator serves the purpose of advancing the trajectory over small time increments $\Delta t$:

\[ \mathbf{x}^{3N}(t_0) \to \mathbf{x}^{3N}(t_0 + \Delta t) \to \mathbf{x}^{3N}(t_0 + 2\Delta t) \to \cdots \to \mathbf{x}^{3N}(t_0 + L\Delta t), \]

where $L$ is usually $\sim 10^4$ to $10^7$. Here we give a brief overview of some popular algorithms: central difference (Verlet, leap-frog, velocity Verlet), Beeman's algorithm [14], predictor-corrector [10], and symplectic integrators [8, 11].
2.1. Verlet Algorithm

Assuming the $\mathbf{x}^{3N}(t)$ trajectory is smooth, perform the Taylor expansion

\[ \mathbf{x}_i(t_0 + \Delta t) + \mathbf{x}_i(t_0 - \Delta t) = 2\mathbf{x}_i(t_0) + \ddot{\mathbf{x}}_i(t_0)(\Delta t)^2 + O((\Delta t)^4). \tag{19} \]

Since $\ddot{\mathbf{x}}_i(t_0) = \mathbf{f}_i(t_0)/m_i$ can be evaluated given the atomic positions $\mathbf{x}^{3N}(t_0)$ at $t = t_0$, $\mathbf{x}^{3N}(t_0 + \Delta t)$ in turn may be approximated by

\[ \mathbf{x}_i(t_0 + \Delta t) = -\mathbf{x}_i(t_0 - \Delta t) + 2\mathbf{x}_i(t_0) + \frac{\mathbf{f}_i(t_0)}{m_i}(\Delta t)^2 + O((\Delta t)^4). \tag{20} \]

By throwing out the $O((\Delta t)^4)$ term, we obtain a recursion formula to compute $\mathbf{x}^{3N}(t_0 + \Delta t)$, $\mathbf{x}^{3N}(t_0 + 2\Delta t)$, ... successively, which is the Verlet [15] algorithm. The velocities do not participate in the recursion but are needed for property calculations. They can be approximated by

\[ \mathbf{v}_i(t_0) \equiv \dot{\mathbf{x}}_i(t_0) = \frac{1}{2\Delta t}\left[\mathbf{x}_i(t_0 + \Delta t) - \mathbf{x}_i(t_0 - \Delta t)\right] + O((\Delta t)^2). \tag{21} \]
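A minimal sketch of the recursion (20) and the velocity estimate (21) is given below for a single degree of freedom. The test problem, a unit-mass harmonic oscillator with acceleration $-x$ started from the exact $x(-\Delta t)$, is an assumption chosen because the exact solution $\cos t$ is known; it is not an example from the chapter.

```python
import math

def verlet(x_prev, x_curr, accel, dt, n_steps):
    """Apply Eq. (20) n_steps times; returns all positions, seeds included."""
    xs = [x_prev, x_curr]
    for _ in range(n_steps):
        xs.append(-xs[-2] + 2.0 * xs[-1] + accel(xs[-1]) * dt * dt)
    return xs

def central_velocity(xs, n, dt):
    """Eq. (21): velocity at stored index n from its two neighbors."""
    return (xs[n + 1] - xs[n - 1]) / (2.0 * dt)

dt = 0.01
# xs[0] is x(-dt), xs[1] is x(0), xs[k + 1] is x(k * dt).
xs = verlet(math.cos(-dt), 1.0, lambda x: -x, dt, 1000)
print("x(10) error:", abs(xs[-1] - math.cos(10.0)))
print("v(5) error: ", abs(central_velocity(xs, 501, dt) + math.sin(5.0)))
```

Note that the velocity at a given time only becomes available after the next position has been computed, which is one practical motivation for the leap-frog and velocity Verlet variants.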
To what degree does the outcome of the above recursion mimic the real trajectory $\mathbf{x}^{3N}(t)$? Notice that in (20), assuming $\mathbf{x}_i(t_0)$ and $\mathbf{x}_i(t_0 - \Delta t)$ are exact, and assuming we have a perfect computer with no machine error in storing the relevant numbers or carrying out floating-point operations, the computed $\mathbf{x}_i(t_0 + \Delta t)$ would still be off from the real $\mathbf{x}_i(t_0 + \Delta t)$ by $O((\Delta t)^4)$, which is defined as the local truncation error (LTE). LTE is an intrinsic error of the algorithm. Clearly, as $\Delta t \to 0$, LTE $\to 0$, but that does not guarantee that the algorithm works, because what we want is $\mathbf{x}^{3N}(t_0 + t')$ for a given $t'$, not $\mathbf{x}_i(t_0 + \Delta t)$. To obtain $\mathbf{x}^{3N}(t_0 + t')$, we must integrate $L = t'/\Delta t$ steps, and the difference between the computed $\mathbf{x}^{3N}(t_0 + t')$ and the real $\mathbf{x}^{3N}(t_0 + t')$ is called the global error. An algorithm can be useful only if the global error $\to 0$ when $\Delta t \to 0$. Usually (but with exceptions), if the LTE in position is $\sim(\Delta t)^{k+1}$, the global error in position should be $\sim(\Delta t)^k$, in which case we call the algorithm a $k$-th order method. The Verlet algorithm is third order in position and potential energy, but only second order in velocity and kinetic energy.

This is only half the story, because the order of an algorithm only characterizes its performance when $\Delta t \to 0$. To save computational cost, most often one must adopt a quite large $\Delta t$. Higher-order algorithms do not necessarily perform better than lower-order algorithms at practical $\Delta t$'s. In fact, they could be much worse by diverging spuriously (causing overflow and NaN), while a more robust method would just give a finite but manageable error for the same $\Delta t$. This is the concept of the stability of a numerical algorithm. In linear ODEs, the global error $e$ of a certain normal mode $k$ can always be written as $e(\omega_k\Delta t, T/\Delta t)$ by dimensional analysis, where $\omega_k$ is the mode's frequency. One can then define the stability domain of an algorithm in the complex $\omega\Delta t$ plane as the border where $e(\omega_k\Delta t, T/\Delta t)$ starts to grow exponentially as a function of $T/\Delta t$.

To rephrase, a higher-order algorithm may have a much smaller stability domain than a lower-order algorithm even though its $e$ decays faster near the origin. Since $e$ is usually larger for larger $|\omega_k\Delta t|$, the overall quality of an integration should be characterized by $e(\omega_{\max}\Delta t, T/\Delta t)$, where $\omega_{\max}$ is the maximum intrinsic frequency of the molecular system that we explicitly integrate. The main reason behind developing constraint MD [1, 8] for some molecules is so that we do not have to integrate their stiff intramolecular vibrational modes, allowing one to take a larger $\Delta t$, so one can follow for longer the "softer modes" that we are more interested in. This is also the rationale behind developing multiple time step integrators like r-RESPA [11].

In addition to LTE, there is round-off error due to the computer's finite precision. The effect of round-off error can be better understood in the stability domain:

(1) In most applications, the round-off error $\ll$ LTE, but it behaves like white noise, which has a very wide frequency spectrum, and so for the algorithm to be stable at all, its stability domain must include the entire real $\omega\Delta t$ axis. However, as long as we ensure non-positive gain for all real $\omega\Delta t$ modes, the overall error should still be characterized by $e(\omega_k\Delta t, T/\Delta t)$, since the white noise has negligible amplitude.

(2) Some applications, especially those involving high-order algorithms, do push the machine precision limit. In those cases, equating LTE $\sim \epsilon$, where $\epsilon$ is the machine's relative accuracy, provides a practical lower bound to $\Delta t$, since by reducing $\Delta t$ one can no longer reduce (and indeed would increase) the global error. For single-precision arithmetic (4 bytes to store one real number), $\epsilon \sim 10^{-8}$; for double-precision arithmetic (8 bytes to store one real number), $\epsilon \approx 2.2 \times 10^{-16}$; for quadruple-precision arithmetic (16 bytes to store one real number), $\epsilon \sim 10^{-32}$.
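To make the notion of stability concrete, the sketch below applies the Verlet recursion (20) to a single harmonic mode, for which the update depends only on the product $\omega\Delta t$. For this particular recursion the characteristic roots leave the unit circle once $\omega\Delta t$ exceeds 2, a standard result that is not derived in this chapter. The function name and the starting values are illustrative assumptions. The last line prints the double-precision machine accuracy quoted above.

```python
import sys

def verlet_peak(omega_dt, n_steps=200):
    """Largest |x| reached by Eq. (20) for x'' = -omega^2 x.

    The recursion x_new = 2 x - x_old - (omega dt)^2 x depends only on
    omega * dt; both seed positions are set to 1 (assumed start).
    """
    x_prev, x_curr = 1.0, 1.0
    peak = 1.0
    for _ in range(n_steps):
        x_prev, x_curr = x_curr, 2.0 * x_curr - x_prev - omega_dt ** 2 * x_curr
        peak = max(peak, abs(x_curr))
    return peak

print("omega*dt = 1.9 ->", verlet_peak(1.9))   # stays bounded
print("omega*dt = 2.1 ->", verlet_peak(2.1))   # grows exponentially
print("double-precision epsilon:", sys.float_info.epsilon)
```

In practice the time step is set well below such a limit for the stiffest mode, $\omega_{\max}$, because accuracy degrades long before the recursion actually diverges.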
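2.2. Leap-frog Algorithm

Here we start out with $\mathbf{v}^{3N}(t_0 - \Delta t/2)$ and $\mathbf{x}^{3N}(t_0)$, then

\[ \mathbf{v}_i\left(t_0 + \frac{\Delta t}{2}\right) = \mathbf{v}_i\left(t_0 - \frac{\Delta t}{2}\right) + \frac{\mathbf{f}_i(t_0)}{m_i}\Delta t + O((\Delta t)^3), \tag{22} \]

followed by

\[ \mathbf{x}_i(t_0 + \Delta t) = \mathbf{x}_i(t_0) + \mathbf{v}_i\left(t_0 + \frac{\Delta t}{2}\right)\Delta t + O((\Delta t)^3), \tag{23} \]

and we have advanced by one step. This is a second-order method. The velocity at time $t_0$ can be approximated by

\[ \mathbf{v}_i(t_0) = \frac{1}{2}\left[\mathbf{v}_i\left(t_0 - \frac{\Delta t}{2}\right) + \mathbf{v}_i\left(t_0 + \frac{\Delta t}{2}\right)\right] + O((\Delta t)^2). \tag{24} \]

2.3. Velocity Verlet Algorithm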
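We start out with $\mathbf{x}^{3N}(t_0)$ and $\mathbf{v}^{3N}(t_0)$, then

\[ \mathbf{x}_i(t_0 + \Delta t) = \mathbf{x}_i(t_0) + \mathbf{v}_i(t_0)\Delta t + \frac{1}{2}\frac{\mathbf{f}_i(t_0)}{m_i}(\Delta t)^2 + O((\Delta t)^3), \tag{25} \]

evaluate $\mathbf{f}^{3N}(t_0 + \Delta t)$, and then

\[ \mathbf{v}_i(t_0 + \Delta t) = \mathbf{v}_i(t_0) + \frac{1}{2}\left[\frac{\mathbf{f}_i(t_0)}{m_i} + \frac{\mathbf{f}_i(t_0 + \Delta t)}{m_i}\right]\Delta t + O((\Delta t)^3), \tag{26} \]

and we have advanced by one step. This is a second-order method. Since we can have $\mathbf{x}^{3N}(t_0)$ and $\mathbf{v}^{3N}(t_0)$ simultaneously, it is very popular.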
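Equations (25) and (26) translate almost line by line into code. The following sketch works on flat lists of coordinates with a single common mass; the `force` callback, the harmonic test force and the step size are assumptions made for illustration.

```python
import math

def velocity_verlet_step(x, v, f, force, mass, dt):
    """Advance positions, velocities and forces by one step, Eqs. (25)-(26)."""
    x_new = [xi + vi * dt + 0.5 * fi / mass * dt * dt
             for xi, vi, fi in zip(x, v, f)]
    f_new = force(x_new)                       # one force evaluation per step
    v_new = [vi + 0.5 * (fi + fni) / mass * dt
             for vi, fi, fni in zip(v, f, f_new)]
    return x_new, v_new, f_new

def harmonic_force(x):
    """Test force f = -x for each coordinate (unit spring constant)."""
    return [-xi for xi in x]

x, v = [1.0], [0.0]
f = harmonic_force(x)
dt = 0.01
for _ in range(1000):
    x, v, f = velocity_verlet_step(x, v, f, harmonic_force, 1.0, dt)
print(x[0] - math.cos(10.0), v[0] + math.sin(10.0))
```

Carrying the force array from one step to the next keeps the cost at one force evaluation per step, the same as for the Verlet and leap-frog forms.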
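2.4. Beeman's Algorithm

It is similar to the velocity Verlet algorithm. We start out with $\mathbf{x}^{3N}(t_0)$, $\mathbf{f}^{3N}(t_0 - \Delta t)$, $\mathbf{f}^{3N}(t_0)$ and $\mathbf{v}^{3N}(t_0)$, then

\[ \mathbf{x}_i(t_0 + \Delta t) = \mathbf{x}_i(t_0) + \mathbf{v}_i(t_0)\Delta t + \frac{4\mathbf{f}_i(t_0) - \mathbf{f}_i(t_0 - \Delta t)}{m_i}\frac{(\Delta t)^2}{6} + O((\Delta t)^4), \tag{27} \]

evaluate $\mathbf{f}^{3N}(t_0 + \Delta t)$, and then

\[ \mathbf{v}_i(t_0 + \Delta t) = \mathbf{v}_i(t_0) + \frac{2\mathbf{f}_i(t_0 + \Delta t) + 5\mathbf{f}_i(t_0) - \mathbf{f}_i(t_0 - \Delta t)}{m_i}\frac{\Delta t}{6}, \tag{28} \]

and we have advanced by one step. This is a third-order method.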
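A one-dimensional sketch of Eqs. (27) and (28) follows. Compared with velocity Verlet, the only extra bookkeeping is the force from the previous step; here it is seeded from the exact solution of an assumed harmonic test problem, whereas in a real simulation some other start-up procedure would be needed.

```python
import math

def beeman_step(x, v, f_prev, f_curr, force, mass, dt):
    """One Beeman step, Eqs. (27)-(28); returns (x, v, f_prev, f_curr) at t0 + dt."""
    x_new = x + v * dt + (4.0 * f_curr - f_prev) / mass * dt * dt / 6.0
    f_new = force(x_new)
    v_new = v + (2.0 * f_new + 5.0 * f_curr - f_prev) / mass * dt / 6.0
    return x_new, v_new, f_curr, f_new

def force(x):
    """Harmonic test force with unit spring constant (assumed example)."""
    return -x

dt = 0.01
x, v = 1.0, 0.0
f_prev, f_curr = force(math.cos(-dt)), force(x)   # forces at t0 - dt and t0
for _ in range(1000):
    x, v, f_prev, f_curr = beeman_step(x, v, f_prev, f_curr, force, 1.0, dt)
print(x - math.cos(10.0), v + math.sin(10.0))
```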
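2.5. Predictor-corrector Algorithm

Let us take the often used 6-value predictor-corrector algorithm [10] as an example. We start out with $6 \times 3N$ storage: $\mathbf{x}^{3N(0)}(t_0)$, $\mathbf{x}^{3N(1)}(t_0)$, $\mathbf{x}^{3N(2)}(t_0)$, ..., $\mathbf{x}^{3N(5)}(t_0)$, where $\mathbf{x}^{3N(k)}(t)$ is defined by

\[ \mathbf{x}_i^{(k)}(t) \equiv \left(\frac{d^k\mathbf{x}_i(t)}{dt^k}\right)\frac{(\Delta t)^k}{k!}. \tag{29} \]

The iteration consists of prediction, evaluation, and correction steps:

2.5.1. Prediction step

\[ \begin{aligned}
\mathbf{x}_i^{(0)} &= \mathbf{x}_i^{(0)} + \mathbf{x}_i^{(1)} + \mathbf{x}_i^{(2)} + \mathbf{x}_i^{(3)} + \mathbf{x}_i^{(4)} + \mathbf{x}_i^{(5)}, \\
\mathbf{x}_i^{(1)} &= \mathbf{x}_i^{(1)} + 2\mathbf{x}_i^{(2)} + 3\mathbf{x}_i^{(3)} + 4\mathbf{x}_i^{(4)} + 5\mathbf{x}_i^{(5)}, \\
\mathbf{x}_i^{(2)} &= \mathbf{x}_i^{(2)} + 3\mathbf{x}_i^{(3)} + 6\mathbf{x}_i^{(4)} + 10\mathbf{x}_i^{(5)}, \\
\mathbf{x}_i^{(3)} &= \mathbf{x}_i^{(3)} + 4\mathbf{x}_i^{(4)} + 10\mathbf{x}_i^{(5)}, \\
\mathbf{x}_i^{(4)} &= \mathbf{x}_i^{(4)} + 5\mathbf{x}_i^{(5)}.
\end{aligned} \tag{30} \]

The general formula for the above is

\[ \mathbf{x}_i^{(k)} = \sum_{k'=k}^{M-1}\frac{k'!}{(k'-k)!\,k!}\,\mathbf{x}_i^{(k')}, \qquad k = 0, \ldots, M-2, \tag{31} \]

with $M = 6$ here. The evaluation must proceed from 0 to $M - 2$ sequentially, so that when the values are overwritten in place each right-hand side still uses the old higher-order entries.

2.5.2. Evaluation step

Evaluate the force $\mathbf{f}^{3N}$ using the newly obtained $\mathbf{x}^{3N(0)}$.

2.5.3. Correction step

Define the error $\mathbf{e}^{3N}$ as

\[ \mathbf{e}_i \equiv \mathbf{x}_i^{(2)} - \left(\frac{\mathbf{f}_i}{m_i}\right)\frac{(\Delta t)^2}{2!}. \tag{32} \]

Then apply corrections,

\[ \mathbf{x}_i^{(k)} = \mathbf{x}_i^{(k)} - C_{Mk}\,\mathbf{e}_i, \qquad k = 0, \ldots, M-1, \tag{33} \]

where $C_{Mk}$ are constants listed in Table 2. It is clear that the LTE for $\mathbf{x}^{3N}$ is $O((\Delta t)^M)$ after the prediction step. But one can show that the LTE is improved to $O((\Delta t)^{M+1})$ after the correction step if $\mathbf{f}^{3N}$ depends on $\mathbf{x}^{3N}$ only (i.e., is conservative). And so the global error would be $O((\Delta t)^M)$.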
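The three steps can be sketched compactly for one coordinate, using the $M = 6$ coefficients 3/20, 251/360, 1, 11/18, 1/6, 1/60. The prediction is built from a copy of the old values rather than in place, so the ordering caveat for Eq. (31) does not arise. The harmonic test problem and its exact initial derivatives are assumptions made so that the result can be compared with $\cos t$.

```python
import math

# Gear corrector coefficients C_{Mk} for M = 6.
GEAR6 = (3 / 20, 251 / 360, 1.0, 11 / 18, 1 / 6, 1 / 60)

def gear6_step(xs, accel, dt):
    """One predict-evaluate-correct cycle on xs[k] = x^(k) dt^k / k!."""
    m = len(GEAR6)
    # Prediction, Eq. (31): binomial-weighted sum of the old scaled derivatives.
    pred = [sum(math.comb(kp, k) * xs[kp] for kp in range(k, m))
            for k in range(m)]
    # Evaluation and error, Eq. (32): accel(x) plays the role of f / m.
    err = pred[2] - accel(pred[0]) * dt * dt / 2.0
    # Correction, Eq. (33).
    return [p - c * err for p, c in zip(pred, GEAR6)]

dt = 0.05
derivs = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0]      # d^k cos(t)/dt^k at t = 0
xs = [d * dt ** k / math.factorial(k) for k, d in enumerate(derivs)]
for _ in range(200):                           # integrate to t = 10
    xs = gear6_step(xs, lambda x: -x, dt)
print("x(10) error:", abs(xs[0] - math.cos(10.0)))
```

Only one force evaluation is made per step, as in the Verlet family; the extra cost lies in storing and updating the higher derivatives.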
2.6. Symplectic Integrators

In the absence of round-off error, certain numerical integrators rigorously maintain the phase-space volume conservation property (Liouville's theorem) of Hamiltonian dynamics; these are called symplectic. This severely limits the possibilities of mapping from initial to final states, and for this reason symplectic integrators tend to have much better total energy conservation in the long run. The velocity Verlet algorithm is in fact symplectic, and higher-order extensions followed [16, 17].

As with the predictor-corrector method, which can be derived up to order 14 following the original construction scheme [10] and is suitable for double-precision arithmetic, symplectic integrators also tend to perform better at higher orders, even on a per-cost basis. We have benchmarked the two families of integrators (Fig. 6) by numerically solving the two-body Kepler problem (eccentricity 0.5), which is nonlinear and periodic, and comparing with the exact analytical solution.

Table 2. Gear predictor-corrector coefficients $C_{Mk}$

M    k=0          k=1           k=2   k=3       k=4      k=5      k=6     k=7
4    1/6          5/6           1     1/3
5    19/120       3/4           1     1/2       1/12
6    3/20         251/360       1     11/18     1/6      1/60
7    863/6048     665/1008      1     25/36     35/144   1/24     1/360
8    1925/14112   19087/30240   1     137/180   5/16     17/240   1/120   1/2520

Figure 6. (a) Phase error after integrating 100 periods of Kepler orbitals. (b) Phase error after integrating 1000 periods of Kepler orbitals. Both panels plot the 2-norm of the final (p, q) error against the number of force evaluations per period, for eccentricity 0.5, with curves for Ruth83, Schlier98_6a, Tsitouras99, Calvo93, Schlier00_6b, Schlier00_8c, 4th-order Runge-Kutta, and 4th- to 8th-order Gear integrators.

The two families have different global error versus time characteristics: non-symplectic integrators all have linear energy error ($\Delta E \propto t$) and quadratic phase error ($|\Delta\phi| \propto t^2$), while symplectic integrators have constant (fluctuating) energy error ($\Delta E \propto t^0$) and linear phase error ($|\Delta\phi| \propto t$), with respect to time. Therefore the asymptotic long-term performance of a symplectic integrator is always superior to that of a non-symplectic integrator. However, it is found that for a reasonable integration duration, say 100 Kepler periods, high-order predictor-corrector integrators can perform better than the best of the symplectic integrators at large integration time steps (small number of force evaluations per period). This is important, because it means that in a real system, if one does not care about the autocorrelation of a mode beyond 100 oscillation periods, then high-order predictor-corrector algorithms can achieve the desired accuracy at a lower computational cost.
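The different long-time energy behaviour of the two families can be illustrated on a much simpler problem than the Kepler benchmark. The sketch below, which uses a unit harmonic oscillator as an assumed stand-in, compares velocity Verlet (symplectic) with the classical fourth-order Runge-Kutta method (non-symplectic) at a deliberately large time step. It is not a reproduction of Fig. 6, and Runge-Kutta uses four force evaluations per step against one for velocity Verlet.

```python
def vv_energy_error(dt, n_steps):
    """Worst |E - E0| seen with velocity Verlet for x'' = -x, x(0)=1, v(0)=0."""
    x, v = 1.0, 0.0
    worst = 0.0
    for _ in range(n_steps):
        a_old = -x
        x += v * dt + 0.5 * a_old * dt * dt
        v += 0.5 * (a_old - x) * dt          # new acceleration is -x
        worst = max(worst, abs(0.5 * (v * v + x * x) - 0.5))
    return worst

def rk4_energy_error(dt, n_steps):
    """Final |E - E0| with classical 4th-order Runge-Kutta for the same problem."""
    def deriv(x, v):
        return v, -x
    x, v = 1.0, 0.0
    for _ in range(n_steps):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return abs(0.5 * (v * v + x * x) - 0.5)

for n in (2000, 4000):
    print(n, vv_energy_error(0.5, n), rk4_energy_error(0.5, n))
```

In this example the velocity Verlet energy error stays bounded however long the run, while the Runge-Kutta energy keeps drifting as the number of steps grows.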
3. Order-N MD Simulation With Short-ranged Potential

We outline here a linked-bin algorithm that allows one to perform MD simulation in a PBC supercell with O(N) computational effort per time step, where N is the number of atoms in the supercell (Fig. 7). Such an approach is found to outperform the brute-force Verlet neighbor-list update algorithm, which is O(N²), when N exceeds a few thousand atoms. The algorithm to be introduced here allows for arbitrary supercell deformation during a simulation, and is implemented in large-scale MD and conjugate gradient relaxation programs as well as a visualization program [3].

Figure 7. There are N atoms in the supercell. (a) The circle around a particular atom with radius $r_c$ indicates the range of its interaction with other atoms; a brute-force search costs N² per time step. (b) The supercell is divided into a number of bins, which have dimensions such that an atom can only possibly interact with atoms in adjacent 27 bins in 3D (nine in 2D). (c) This shows that an atom–atom list is still necessary because on average only 16% of the atoms in adjacent bins in 3D (35% in 2D) interact with the particular atom.

Denote the three edges of a supercell in the Cartesian frame by row vectors $\mathbf{h}_1$, $\mathbf{h}_2$, $\mathbf{h}_3$, which stack together to form a 3 × 3 matrix $\mathbf{H}$. The inverse of the $\mathbf{H}$ matrix, $\mathbf{B} \equiv \mathbf{H}^{-1}$, satisfies

\[ \mathbf{I} = \mathbf{H}\mathbf{B} = \mathbf{B}\mathbf{H}. \tag{34} \]

If we define row vectors

\[ \mathbf{b}_1 \equiv (B_{11}, B_{21}, B_{31}), \quad \mathbf{b}_2 \equiv (B_{12}, B_{22}, B_{32}), \quad \mathbf{b}_3 \equiv (B_{13}, B_{23}, B_{33}), \tag{35} \]

then (34) is equivalent to

\[ \mathbf{h}_i \cdot \mathbf{b}_j \equiv \mathbf{h}_i\mathbf{b}_j^T = \delta_{ij}. \tag{36} \]

Since $\mathbf{b}_1$ is perpendicular to both $\mathbf{h}_2$ and $\mathbf{h}_3$, it must be collinear with the normal direction $\mathbf{n}$ of the plane spanned by $\mathbf{h}_2$ and $\mathbf{h}_3$: $\mathbf{b}_1 \equiv |\mathbf{b}_1|\mathbf{n}$. And so by (36),

\[ 1 = \mathbf{h}_1 \cdot \mathbf{b}_1 = \mathbf{h}_1 \cdot (|\mathbf{b}_1|\mathbf{n}) = |\mathbf{b}_1|(\mathbf{h}_1 \cdot \mathbf{n}). \tag{37} \]

But $|\mathbf{h}_1 \cdot \mathbf{n}|$ is nothing other than the thickness of the supercell along the $\mathbf{h}_1$ edge. Therefore, the thicknesses (distances between two parallel surfaces) of the supercell are

\[ d_1 = \frac{1}{|\mathbf{b}_1|}, \quad d_2 = \frac{1}{|\mathbf{b}_2|}, \quad d_3 = \frac{1}{|\mathbf{b}_3|}. \tag{38} \]
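The position of atom $i$ is specified by a row vector, $\mathbf{s}_i = (s_{i1}, s_{i2}, s_{i3})$, with $s_{i\mu}$ satisfying

\[ 0 \le s_{i\mu} < 1, \qquad \mu = 1, \ldots, 3, \tag{39} \]

and the Cartesian coordinate of this atom, $\mathbf{x}_i$, also a row vector, is

\[ \mathbf{x}_i = s_{i1}\mathbf{h}_1 + s_{i2}\mathbf{h}_2 + s_{i3}\mathbf{h}_3 = \mathbf{s}_i\mathbf{H}, \tag{40} \]

where $s_{i\mu}$ has the geometrical interpretation of the fraction of the $\mu$th edge needed to build $\mathbf{x}_i$. We will simulate particle systems that interact via short-ranged potentials of cutoff radius $r_c$ (see previous section for potential truncation schemes). In the case of a multi-component system, $r_c$ is generalized to a matrix $r_c^{\alpha\beta}$, where $\alpha \equiv c(i)$, $\beta \equiv c(j)$ are the chemical types of atoms $i$ and $j$, respectively. We then define

\[ \mathbf{x}_{ji} \equiv \mathbf{x}_j - \mathbf{x}_i, \qquad r_{ji} \equiv |\mathbf{x}_{ji}|, \qquad \hat{\mathbf{x}}_{ji} \equiv \frac{\mathbf{x}_{ji}}{r_{ji}}. \tag{41} \]

The design of the program should allow for arbitrary changes in $\mathbf{H}$ that include strain and rotational components (see Section 2.5). One should use the Lagrangian strain $\boldsymbol{\eta}$, a true rank-2 tensor under coordinate frame transformation, to measure the deformation of a supercell. To define $\boldsymbol{\eta}$, one needs a reference $\mathbf{H}_0$ of a previous time, with $\mathbf{x}_0 = \mathbf{s}\mathbf{H}_0$ and $d\mathbf{x}_0 = (d\mathbf{s})\mathbf{H}_0$, and imagines that with $\mathbf{s}$ fixed, $d\mathbf{x}_0$ is transformed to $d\mathbf{x} = (d\mathbf{s})\mathbf{H}$, under $\mathbf{H}_0 \to \mathbf{H} \equiv \mathbf{H}_0\mathbf{J}$. The Lagrangian strain (see Chap 2.4) is defined by the change in the differential line length,

\[ dl^2 = d\mathbf{x}\,d\mathbf{x}^T \equiv d\mathbf{x}_0(\mathbf{I} + 2\boldsymbol{\eta})\,d\mathbf{x}_0^T, \tag{42} \]

where by plugging in $d\mathbf{x} = (d\mathbf{s})\mathbf{H} = (d\mathbf{x}_0)\mathbf{H}_0^{-1}\mathbf{H} = (d\mathbf{x}_0)\mathbf{J}$, $\boldsymbol{\eta}$ is seen to be

\[ \boldsymbol{\eta} = \frac{1}{2}\left(\mathbf{H}_0^{-1}\mathbf{H}\mathbf{H}^T\mathbf{H}_0^{-T} - \mathbf{I}\right) = \frac{1}{2}\left(\mathbf{J}\mathbf{J}^T - \mathbf{I}\right). \tag{43} \]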
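The supercell geometry of Eqs. (34) to (43) can be exercised with a few small helpers. The sketch below uses plain nested lists and a hand-written 3 × 3 inverse so that it has no dependencies; the function names and the sheared test cell are illustrative assumptions. It computes the thicknesses of Eq. (38), converts reduced to Cartesian coordinates as in Eq. (40), and checks the line-length identity (42) for the strain of Eq. (43).

```python
import math

def inverse3(m):
    """Inverse of a 3x3 matrix (nested lists) by the cofactor formula."""
    (a, b, c), (d, e, f), (g, h, k) = m
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    return [[(e * k - f * h) / det, (c * h - b * k) / det, (b * f - c * e) / det],
            [(f * g - d * k) / det, (a * k - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

def matmul(p, q):
    return [[sum(p[r][n] * q[n][c] for n in range(3)) for c in range(3)]
            for r in range(3)]

def vecmat(s, m):
    """Row vector times matrix, e.g. x = s H in Eq. (40)."""
    return [sum(s[n] * m[n][c] for n in range(3)) for c in range(3)]

def thicknesses(h):
    """Eq. (38): d_j = 1 / |b_j|, with b_j the j-th column of B = H^-1."""
    b = inverse3(h)
    return [1.0 / math.sqrt(sum(b[r][col] ** 2 for r in range(3)))
            for col in range(3)]

def lagrangian_strain(h0, h):
    """Eq. (43): eta = (J J^T - I) / 2 with J = H0^-1 H."""
    j = matmul(inverse3(h0), h)
    return [[0.5 * (sum(j[r][n] * j[c][n] for n in range(3)) - (r == c))
             for c in range(3)] for r in range(3)]

H0 = [[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 4.0]]    # sheared test cell
J = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.02]]    # assumed deformation
H = matmul(H0, J)
eta = lagrangian_strain(H0, H)

s = [0.3, 0.2, 0.5]
dx0, dx = vecmat(s, H0), vecmat(s, H)
lhs = sum(c * c for c in dx)                                  # dx dx^T
rhs = sum(dx0[r] * ((r == c) + 2.0 * eta[r][c]) * dx0[c]
          for r in range(3) for c in range(3))                # dx0 (I + 2 eta) dx0^T
print(thicknesses(H0), lhs, rhs)
```

For the sheared cell the thickness along $\mathbf{h}_1$ is smaller than $|\mathbf{h}_1|$, which is why the bin partitioning uses thicknesses rather than edge lengths.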
Because $\boldsymbol{\eta}$ is a symmetric matrix, it always has three mutually orthogonal eigen-directions $\mathbf{x}_1\boldsymbol{\eta} = \mathbf{x}_1\eta_1$, $\mathbf{x}_2\boldsymbol{\eta} = \mathbf{x}_2\eta_2$, $\mathbf{x}_3\boldsymbol{\eta} = \mathbf{x}_3\eta_3$. Along those directions, the line lengths are changed by factors $\sqrt{1 + 2\eta_1}$, $\sqrt{1 + 2\eta_2}$, $\sqrt{1 + 2\eta_3}$, which achieve extrema among all line directions. Thus, as long as $\eta_1$, $\eta_2$ and $\eta_3$ oscillate between $[-\eta_{\mathrm{bound}}, \eta_{\mathrm{bound}}]$ for some chosen $\eta_{\mathrm{bound}}$, any line segment at $\mathbf{H}_0$ can be lengthened by no more than $\sqrt{1 + 2\eta_{\mathrm{bound}}}$ and shortened by no less than $\sqrt{1 - 2\eta_{\mathrm{bound}}}$. That is, if we define the length measure

\[ L(\mathbf{s}, \mathbf{H}) \equiv \sqrt{\mathbf{s}\mathbf{H}\mathbf{H}^T\mathbf{s}^T}, \tag{44} \]

then so long as $\eta_1$, $\eta_2$, $\eta_3$ oscillate between $[\eta_{\min}, \eta_{\max}]$, there is

\[ \sqrt{1 + 2\eta_{\min}}\,L(\mathbf{s}, \mathbf{H}_0) \le L(\mathbf{s}, \mathbf{H}) \le \sqrt{1 + 2\eta_{\max}}\,L(\mathbf{s}, \mathbf{H}_0). \tag{45} \]

One can use the above result to define a strain session, which begins with $\mathbf{H}_0 = \mathbf{H}$ and during which no line segment is allowed to shrink to less than a fraction $f_c \le 1$ of its length at $\mathbf{H}_0$. This is equivalent to requiring that

\[ f \equiv \sqrt{1 + 2\min(\eta_1, \eta_2, \eta_3)} \ge f_c. \tag{46} \]

Whenever the above condition is violated, the session terminates and a new session starts with the present $\mathbf{H}$ as the new $\mathbf{H}_0$, which triggers a repartitioning of the supercell into equal-size bins, called a strain-induced bin repartitioning.

The purpose of bin partition is the following: it can be a very demanding task to determine whether atoms $i$, $j$ are within $r_c$ or not, for all possible $ij$ combinations. Formally, this requires checking

\[ r_{ji} \equiv L(\mathbf{s}_{ji}, \mathbf{H}) \le r_c. \tag{47} \]
Because $\mathbf{s}_i$, $\mathbf{s}_j$ and $\mathbf{H}$ are all moving (they differ from step to step), it appears that we have to do this at each step. This O(N²) complexity would indeed be the case but for the observation that, in most MD, MC and static minimization procedures, the $\mathbf{s}_i$ of most atoms and $\mathbf{H}$ often change only slightly from the previous step. Therefore, once we have ensured that (47) holds at some previous step, we can devise a sufficient condition to test whether (47) must still hold now, at a much smaller cost. Only when this sufficient condition breaks down do we resort to a more complicated search and check in the fashion of (47).

As a side note, it is often more efficient to count interaction pairs if the potential function allows for easy use of such half-lists, such as pair or EAM potentials, which achieves a 1/2 saving in memory. In these scenarios we pick a unique "host" atom among $i$ and $j$ to store the information about the $ij$ pair; that is, a particle's list only keeps possible pairs that are under its own care. For load balancing it is best if the responsibilities are distributed evenly among particles. We use a pseudo-random choice: if $i + j$ is odd and $i > j$, or if $i + j$ is even and $i < j$, then $i$ is the host; otherwise it is $j$. As $i > j$ is "uncorrelated" with whether $i + j$ is even or odd, significant load imbalance is unlikely to occur even if the indices correlate strongly with the atoms' positions.

The step-to-step small change is exploited as follows: one associates each $\mathbf{s}_i$ with a semi-mobile reduced coordinate $\mathbf{s}_i^a$ called atom $i$'s anchor (Fig. 8). At each step, one checks if $L(\mathbf{s}_i - \mathbf{s}_i^a, \mathbf{H})$, that is, the current distance between $i$ and its anchor, is greater than a certain $r_{\mathrm{drift}} \ge r_{\mathrm{drift}}^0$ or not. If it is not, then $\mathbf{s}_i^a$ does not change; if it is, then one redefines $\mathbf{s}_i^a \equiv \mathbf{s}_i$ at this step, which is called atom $i$'s flash incident. At atom $i$'s flash, it is required to update the records of all atoms (part of the records may be stored in $j$'s list, if 1/2-saving is used and $j$ happens to be the host of the $ij$ pair) whose anchors satisfy

\[ L(\mathbf{s}_j^a - \mathbf{s}_i^a, \mathbf{H}_0) \le r_{\mathrm{list}} \equiv \frac{r_c + 2r_{\mathrm{drift}}^0}{f_c}. \tag{48} \]

Figure 8. This illustrates the concept of an anchor, which is the relatively immobile part of an atom's trajectory. Using an anchor–anchor list, we can derive a "flash" condition that locally updates an atom's neighbor list when the atom drifts sufficiently far away (a distance d, usually d = 0.05 $r_c$) from its anchor.
Note that the distance is between anchors instead of atoms ($\mathbf{s}_i^a \approx \mathbf{s}_i$, though), and the length is measured by $\mathbf{H}_0$, not the current $\mathbf{H}$. (48) nominally takes O(N) work per flash, but we may reduce it to O(1) work per flash by partitioning the supercell into $m_1 \times m_2 \times m_3$ bins at the start of the session, whose thicknesses by $\mathbf{H}_0$ (see (38)) are required to be greater than or equal to $r_{\mathrm{list}}$:

\[ \frac{d_1(\mathbf{H}_0)}{m_1},\ \frac{d_2(\mathbf{H}_0)}{m_2},\ \frac{d_3(\mathbf{H}_0)}{m_3} \ge r_{\mathrm{list}}. \tag{49} \]

The bins deform with $\mathbf{H}$ and remain commensurate with it; that is, their s-widths $1/m_1$, $1/m_2$, $1/m_3$ remain fixed during a strain session. Each bin keeps an updated list of all anchors inside. When atom $i$ flashes, it also updates the bin-anchor list if necessary. Then, if at the time of $i$'s flash two anchors are separated by more than one bin, there would be

\[ L(\mathbf{s}_j^a - \mathbf{s}_i^a, \mathbf{H}_0) > \frac{d_1(\mathbf{H}_0)}{m_1},\ \frac{d_2(\mathbf{H}_0)}{m_2},\ \frac{d_3(\mathbf{H}_0)}{m_3} \ge r_{\mathrm{list}}, \tag{50} \]

and they cannot possibly satisfy (48). Therefore we only need to test (48) for anchors within adjacent 27 bins. To synchronize, all atoms flash at the start of a strain session. From then on, atoms flash individually whenever $L(\mathbf{s}_i - \mathbf{s}_i^a, \mathbf{H}) > r_{\mathrm{drift}}$. If two anchors flash at the same step in a loop, the first flash may get it wrong, that is, miss the second anchor, but the second flash will correct the mistake. The important thing here is not to lose an interaction. We see that to maintain anchor lists that capture all solutions to (48) among the latest anchors takes only O(N) work per step, and the pre-factor is also small because flash events happen quite infrequently for a tolerably large $r_{\mathrm{drift}}^0$.
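The bin search can be sketched as follows under strong simplifying assumptions: an orthorhombic supercell (diagonal $\mathbf{H}$, edges at least twice $r_{\mathrm{list}}$), a one-off list build rather than incremental flashes, and atoms standing in for anchors. Each pair within $r_{\mathrm{list}}$ is stored once, under the host chosen by the parity rule described earlier in this section. All function names are illustrative.

```python
import itertools
import random

def host_of(i, j):
    """Parity rule from the text for choosing which atom stores the ij pair."""
    if ((i + j) % 2 == 1 and i > j) or ((i + j) % 2 == 0 and i < j):
        return i
    return j

def min_image_dist2(si, sj, box):
    """Squared minimum-image distance for reduced coordinates in a box."""
    d2 = 0.0
    for k in range(3):
        ds = si[k] - sj[k]
        ds -= round(ds)                 # wrap reduced separation to [-0.5, 0.5]
        d2 += (ds * box[k]) ** 2
    return d2

def build_half_lists(s, box, r_list):
    """Return per-atom half-lists of partners within r_list, using bins."""
    m = [max(1, int(box[k] // r_list)) for k in range(3)]   # bin width >= r_list
    bins = {}
    for i, si in enumerate(s):
        key = tuple(min(int(si[k] * m[k]), m[k] - 1) for k in range(3))
        bins.setdefault(key, []).append(i)
    lists = [set() for _ in s]
    for key, members in bins.items():
        # The set removes duplicate neighbor bins when m[k] < 3.
        neighbor_keys = {tuple((key[k] + off[k]) % m[k] for k in range(3))
                         for off in itertools.product((-1, 0, 1), repeat=3)}
        for nk in neighbor_keys:
            for i in members:
                for j in bins.get(nk, ()):
                    if i < j and min_image_dist2(s[i], s[j], box) <= r_list ** 2:
                        h = host_of(i, j)
                        lists[h].add(j if h == i else i)
    return lists

random.seed(0)
box = (10.0, 10.0, 10.0)
s = [[random.random() for _ in range(3)] for _ in range(200)]
lists = build_half_lists(s, box, 2.5)
print("pairs stored:", sum(len(p) for p in lists))
```

A production code would additionally keep the bin-anchor lists up to date at each flash and would measure lengths with the general $\mathbf{H}_0$ of Eq. (44) instead of a diagonal box.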
The central claim of the scheme is that if $j$ is not in $i$'s anchor records (suppose $i$'s last flash is more recent than $j$'s), which were created some time ago in the strain session, then $r_{ji} > r_c$. The reason is that the current separation between anchor $i$ and anchor $j$, $L(\mathbf{s}_j^a - \mathbf{s}_i^a, \mathbf{H})$, is greater than $r_c + 2r_{\mathrm{drift}}^0$, since by (45), (46) and (48),

\[ L(\mathbf{s}_j^a - \mathbf{s}_i^a, \mathbf{H}) \ge f \cdot L(\mathbf{s}_j^a - \mathbf{s}_i^a, \mathbf{H}_0) > f \cdot r_{\mathrm{list}} \ge f_c \cdot r_{\mathrm{list}} = f_c \cdot \frac{r_c + 2r_{\mathrm{drift}}^0}{f_c}. \tag{51} \]

So we see that $r_{ji} > r_c$ is maintained if neither $i$ nor $j$ currently drifts more than

\[ r_{\mathrm{drift}} \equiv \frac{f \cdot r_{\mathrm{list}} - r_c}{2} \ge r_{\mathrm{drift}}^0 \tag{52} \]

from their respective anchors. Put another way, when we design $r_{\mathrm{list}}$ in (48), we take into consideration both atom drifts and $\mathbf{H}$ shrinkage, both of which may bring $ij$ closer than $r_c$, but since the current $\mathbf{H}$ shrinkage has not yet reached the designed critical value, we can convert it to more leeway for the atom drifts. For multi-component systems, we define

\[ r_{\mathrm{list}}^{\alpha\beta} \equiv \frac{r_c^{\alpha\beta} + 2r_{\mathrm{drift}}^0}{f_c}, \tag{53} \]

where both $f_c$ and $r_{\mathrm{drift}}^0$ are species-independent constants, and $r_{\mathrm{drift}}^0$ can be thought of as putting a lower bound on $r_{\mathrm{drift}}$, so flash events cannot occur too frequently. At each bin repartitioning, we would require

\[ \frac{d_1(\mathbf{H}_0)}{m_1},\ \frac{d_2(\mathbf{H}_0)}{m_2},\ \frac{d_3(\mathbf{H}_0)}{m_3} \ge \max_{\alpha,\beta} r_{\mathrm{list}}^{\alpha\beta}. \tag{54} \]

And during the strain session, $f \ge f_c$, we have

\[ r_{\mathrm{drift}}^{\alpha} \equiv \min\left[\min_{\beta}\frac{f \cdot r_{\mathrm{list}}^{\alpha\beta} - r_c^{\alpha\beta}}{2},\ \min_{\beta}\frac{f \cdot r_{\mathrm{list}}^{\beta\alpha} - r_c^{\beta\alpha}}{2}\right], \tag{55} \]

a time- and species-dependent atom drift bound that controls whether an atom of species $\alpha$ needs to flash.
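A small worked example may help to fix the relation between $r_{\mathrm{list}}$, $f$ and the drift bound. The sketch below evaluates Eqs. (48), (52) and (55); the cutoff table, the choice $r_{\mathrm{drift}}^0 = 0.05\,r_c$ for the smallest cutoff and $f_c = 0.95$ are illustrative numbers, not values prescribed by the text.

```python
def r_list(rc, r_drift0, fc):
    """Eq. (48) / (53): list radius measured in the reference cell H0."""
    return (rc + 2.0 * r_drift0) / fc

def drift_bound(f, rc, r_drift0, fc):
    """Eq. (52): allowed drift from the anchor at current shrink factor f."""
    return 0.5 * (f * r_list(rc, r_drift0, fc) - rc)

def species_drift_bound(alpha, f, rc_table, r_drift0, fc):
    """Eq. (55): minimum over partner species, in both index orders."""
    species = {a for pair in rc_table for a in pair}
    return min(min(drift_bound(f, rc_table[(alpha, b)], r_drift0, fc),
                   drift_bound(f, rc_table[(b, alpha)], r_drift0, fc))
               for b in species)

# Hypothetical two-species cutoff matrix r_c^{alpha beta}.
RC_TABLE = {("A", "A"): 2.5, ("A", "B"): 3.0, ("B", "A"): 3.0, ("B", "B"): 3.5}
R_DRIFT0, FC = 0.125, 0.95

print(drift_bound(FC, 2.5, R_DRIFT0, FC))     # at f = f_c the bound equals r_drift0
print(drift_bound(1.0, 2.5, R_DRIFT0, FC))    # undeformed cell: extra leeway
print(species_drift_bound("B", 1.0, RC_TABLE, R_DRIFT0, FC))
```

The first two lines show the conversion described above: as long as the cell has not shrunk to the critical factor $f_c$, the unused margin is available as additional drift allowance.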
4. Molecular Dynamics Codes

At present there are several high-quality molecular dynamics programs in the public domain, such as LAMMPS [18], DL_POLY [19, 20], Moldy [21], and some codes with biomolecular focus, such as NAMD [22, 23] and Gromacs [24, 25]. CHARMM [26] and AMBER [27] are not free but are standard and extremely powerful codes in biology.
Basic molecular dynamics
587
References [1] M. Allen and D. Tildesley, Computer Simulation of Liquids, Clarendon Press, New York, 1987. [2] J. Li, L. Porter, and S. Yip, “Atomistic modeling of finite-temperature properties of crystalline beta-SiC - II. Thermal conductivity and effects of point defects,” J. Nucl. Mater., 255, 139–152, 1998. [3] J. Li, “AtomEye: an efficient atomistic configuration viewer,” Model. Simul. Mater. Sci. Eng., 11, 173–177, 2003. [4] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, New York, 1987. [5] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, 2nd edn., Clarendon Press, Oxford, 1954. [6] R. Parr and W. Yang, Density-functional Theory of Atoms and Molecules, Clarendon Press, Oxford, 1989. [7] S.D. Ivanov, A.P. Lyubartsev, and A. Laaksonen, “Bead-Fourier path integral molecular dynamics,” Phys. Rev. E, 67, art. no.–066710, 2003. [8] T. Schlick, Molecular Modeling and Simulation, Springer, Berlin, 2002. [9] W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes in C: the Art of Scientific Computing, 2nd edn., Cambridge University Press, Cambridge, 1992. [10] C. Gear, Numerical Initial Value Problems in Ordinary Differential Equation, Prentice-Hall, Englewood Cliffs, NJ, 1971. [11] M.E. Tuckerman and G.J. Martyna, “Understanding modern molecular dynamics: techniques and applications,” J. Phys. Chem. B, 104, 159–178, 2000. [12] S. Nose, “A unified formulation of the constant temperature molecular dynamics methods,” J. Chem. Phys., 81, 511–519, 1984. [13] W.G. Hoover, “Canonical dynamics – equilibrium phase-space distributions,” Phys. Rev. A, 31, 1695–1697, 1985. [14] D. Beeman, “Some multistep methods for use in molecular-dynamics calculations,” J. Comput. Phys., 20, 130–139, 1976. [15] L. Verlet, “Computer “experiments” on classical fluids. I. Thermodynamical properties of Lennard–Jones molecules,” Phys. Rev., 159, 98–103, 1967. [16] H. 
Yoshida, “Construction of higher-order symplectic integrators,” Phys. Lett. A, 150, 262–268, 1990. [17] J. Sanz-Serna and M. Calvo, Numerical Hamiltonian Problems, Chapman & Hall, London, 1994. [18] S. Plimpton, “Fast parallel algorithms for short-range molecular-dynamics,” J. Comput. Phys., 117, 1–19, 1995. [19] W. Smith and T.R. Forester, “DL POLY 2.0: a general-purpose parallel molecular dynamics simulation package,” J. Mol. Graph., 14, 136–141, 1996. [20] W. Smith, C.W. Yong, and P.M. Rodger, “DL POLY: application to molecular simulation,” Mol. Simul., 28, 385–471, 2002. [21] K. Refson, “Moldy: a portable molecular dynamics simulation program for serial and parallel computers,” Comput. Phys. Commun., 126, 310–329, 2000. [22] M.T. Nelson, W. Humphrey, A. Gursoy, A. Dalke, L.V. Kale, R.D. Skeel, and K. Schulten, “NAMD: a parallel, object oriented molecular dynamics program,” Int. J. Supercomput. Appl. High Perform. Comput., 10, 251–268, 1996. [23] L. Kale, R. Skeel, M. Bhandarkar, R. Brunner, A. Gursoy, N. Krawetz, J. Phillips, A. Shinozaki, K. Varadarajan, and K. Schulten, “NAMD2: Greater scalability for parallel molecular dynamics,” J. Comput. Phys., 151, 283–312, 1999.
[24] H.J.C. Berendsen, D. van der Spoel, and R. van Drunen, “Gromacs – a message-passing parallel molecular-dynamics implementation,” Comput. Phys. Commun., 91, 43–56, 1995. [25] E. Lindahl, B. Hess, and D. van der Spoel, “GROMACS 3.0: a package for molecular simulation and trajectory analysis,” J. Mol. Model., 7, 306–317, 2001. [26] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, and M. Karplus, “Charmm – a program for macromolecular energy, minimization, and dynamics calculations,” J. Comput. Chem., 4, 187–217, 1983. [27] D.A. Pearlman, D.A. Case, J.W. Caldwell, W.S. Ross, T.E. Cheatham, S. Debolt, D. Ferguson, G. Seibel, and P. Kollman, “Amber, a package of computer-programs for applying molecular mechanics, normal-mode analysis, molecular-dynamics and free-energy calculations to simulate the structural and energetic properties of molecules,” Comput. Phys. Commun., 91, 1–41, 1995.
2.9 GENERATING EQUILIBRIUM ENSEMBLES VIA MOLECULAR DYNAMICS
Mark E. Tuckerman
Department of Chemistry, Courant Institute of Mathematical Sciences, New York University, New York, NY 10003
Over the last several decades, molecular dynamics (MD) has become one of the most important and commonly used approaches for studying condensed phase systems. MD calculations generally serve two often complementary purposes. First, an MD simulation can be used to study the dynamics of a system starting from particular initial conditions. Second, MD can be employed as a means of generating a collection of classical microscopic configurations in a particular equilibrium ensemble. The latter of these uses shows that MD is intimately connected with statistical mechanics and can serve as a computational tool for solving statistical mechanical problems. Indeed, even when MD is used to study a system’s dynamics, one never uses just a single trajectory (generated from a single initial condition). Dynamical properties in the linear response regime, computed according to the rules of statistical mechanics from time correlation functions, require an ensemble of trajectories starting from an equilibrium distribution of initial conditions. These points underscore the importance of having efficient and rigorous techniques capable of generating equilibrium distributions. Indeed while the problem of producing classical trajectories from a distribution of initial conditions is relatively straightforward – one simply integrates Hamilton’s equations of motion – the problem of generating the equilibrium distribution for a complex system is an immense challenge for which advanced sampling techniques are often required. Whether or not one is employing MD on its own or combining it with one of a variety of advanced sampling methods, the underlying MD scheme must be tailored to generate the desired distribution. Once such a scheme is in place, it can be employed as is or adapted for advanced sampling techniques such as umbrella sampling [1], the bluemoon ensemble approach [2, 3], or variable transformations [4]. 
S. Yip (ed.), Handbook of Materials Modeling, 589–611. © 2005 Springer. Printed in the Netherlands.
In this contribution, our focus will be on the underlying MD schemes themselves and the problem of generating numerical integrators
for these schemes. The latter is still an open area of research in which a number of important theoretical questions remain unanswered. Thus, we will discuss the current state of knowledge and allude to the outstanding issues as they arise. At this point, it is worth mentioning that equilibrium ensemble distributions are not the sole domain of MD. Monte Carlo (MC) methods and hybrid MD/MC approaches can also be employed. Moreover, advanced sampling techniques designed to work with MC, such as configurational bias MC [5], and with hybrid methods, such as hybrid MC [6], exist as well. To some extent, the choice between MC, MD and hybrid MD/MC approaches is a matter of taste. Each has particular advantages and disadvantages and both allow for creative innovations within their respective frameworks. A particular advantage of the MD and hybrid MD/MC approaches lies in the fact that they lend themselves well to scalable parallelization, allowing large systems and long time scales to be accessed. Indeed, efficient parallel algorithms for MD have been proposed [7] and a wide variety of parallel MD codes are available to the community via the Web, such as the NAMD (www.ks.uiuc.edu/Research/namd) and PINY MD (homepages.nyu.edu/˜mt33/PINY MD/PINY.html) codes, to name just a few. In thermodynamics, one divides the thermodynamic universe into the system and its surroundings. How the system interacts with its surroundings determines the particular ensemble distribution the system will obey. The interaction between the system and its surroundings causes certain thermodynamic variables to fluctuate and others to remain fixed. For example, if the system can exchange thermal energy with its surroundings, its internal energy will fluctuate, however, its temperature will, when equilibrium is reached, be fixed at the temperature of the surroundings. 
Thermodynamic variables of the system that are fixed due to its interaction with the surroundings can be viewed as “control variables,” since they can be adjusted via the surroundings (e.g., changing the temperature of the surroundings will change the temperature of the system if the two can exchange thermal energy). These control variables, therefore, characterize the ensemble.

Let us begin our discussion with the simplest possible case, that of a system that has no interaction with its surroundings. Let the system contain $N$ particles in a container of volume $V$. Let the positions of the $N$ particles at time $t$ be designated $\mathbf{r}_1(t), \ldots, \mathbf{r}_N(t)$ and their velocities $\mathbf{v}_1(t), \ldots, \mathbf{v}_N(t)$, and let the particles have masses $m_1, \ldots, m_N$. In general, the time evolution of any classical system is given by Newton's equations of motion

$$ m_i \ddot{\mathbf{r}}_i = \mathbf{F}_i \tag{1} $$
where $\mathbf{F}_i$ is the total force on the $i$th particle, and the overdot notation signifies time differentiation, i.e., $\dot{\mathbf{r}}_i = d\mathbf{r}_i/dt = \mathbf{v}_i$. Thus, $\ddot{\mathbf{r}}_i$ is the acceleration of the $i$th particle. Since Newton's equations constitute a set of $3N$ coupled second-order differential equations, if an initial condition on the positions and velocities $\mathbf{r}_1(0), \ldots, \mathbf{r}_N(0), \mathbf{v}_1(0), \ldots, \mathbf{v}_N(0)$ is specified, the solution to Newton's equations will be a unique function of time.

For a system isolated from its surroundings, the force on each particle will only be due to its interaction with all of the other particles in the system. Thus, the forces $\mathbf{F}_1, \ldots, \mathbf{F}_N$ will be functions only of the particle positions, i.e., $\mathbf{F}_i = \mathbf{F}_i(\mathbf{r}_1, \ldots, \mathbf{r}_N)$, and, in addition, they will be conservative, meaning that they can be expressed as the gradient of a scalar potential energy function $U(\mathbf{r}_1, \ldots, \mathbf{r}_N)$:

$$ \mathbf{F}_i(\mathbf{r}_1, \ldots, \mathbf{r}_N) = -\frac{\partial}{\partial \mathbf{r}_i} U(\mathbf{r}_1, \ldots, \mathbf{r}_N) \tag{2} $$

If a conservative force is taken to act over a closed path that brings a particle back to its point of origin, no net work is done. When only conservative forces act within a system, the total energy

$$ E = \frac{1}{2} \sum_{i=1}^{N} m_i \mathbf{v}_i^2 + U(\mathbf{r}_1, \ldots, \mathbf{r}_N) \tag{3} $$
is conserved by the motion. Given the law of conservation of energy, the equations of motion for an isolated system can be cast in a way that is particularly useful for establishing the connection to equilibrium ensembles, namely, in terms of the classical Hamiltonian. The Hamiltonian is nothing more than the total energy $E$ expressed as a function of the positions and momenta, $\mathbf{p}_i = m_i \mathbf{v}_i$. Thus, the Hamiltonian $H$ is a function of these variables, i.e., $H = H(\mathbf{p}_1, \ldots, \mathbf{p}_N, \mathbf{r}_1, \ldots, \mathbf{r}_N)$. Introducing the shorthand notation $\mathbf{r} \equiv \mathbf{r}_1, \ldots, \mathbf{r}_N$, $\mathbf{p} \equiv \mathbf{p}_1, \ldots, \mathbf{p}_N$, and substituting $\mathbf{v}_i = \mathbf{p}_i/m_i$ into Eq. (3), the Hamiltonian becomes

$$ H(\mathbf{p}, \mathbf{r}) = \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2 m_i} + U(\mathbf{r}_1, \ldots, \mathbf{r}_N) \tag{4} $$
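The equations of motion for the positions and momenta are then given by Hamilton's equations

$$ \dot{\mathbf{r}}_i = \frac{\partial H}{\partial \mathbf{p}_i} = \frac{\mathbf{p}_i}{m_i}, \qquad \dot{\mathbf{p}}_i = -\frac{\partial H}{\partial \mathbf{r}_i} = -\frac{\partial U}{\partial \mathbf{r}_i} \tag{5} $$

It is straightforward to show, by substituting the time derivative of the equation for $\dot{\mathbf{r}}_i$ into the equation for $\dot{\mathbf{p}}_i$, that Hamilton's equations are mathematically equivalent to Newton's equations (1). It is also straightforward to show that $H(\mathbf{p}, \mathbf{r})$ is conserved by simply computing $dH/dt$ via the chain rule:

$$
\begin{aligned}
\frac{dH}{dt} &= \sum_{i=1}^{N} \left[ \frac{\partial H}{\partial \mathbf{r}_i} \cdot \dot{\mathbf{r}}_i + \frac{\partial H}{\partial \mathbf{p}_i} \cdot \dot{\mathbf{p}}_i \right] \\
&= \sum_{i=1}^{N} \left[ \frac{\partial H}{\partial \mathbf{r}_i} \cdot \frac{\partial H}{\partial \mathbf{p}_i} - \frac{\partial H}{\partial \mathbf{p}_i} \cdot \frac{\partial H}{\partial \mathbf{r}_i} \right] \\
&= 0
\end{aligned}
\tag{6}
$$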
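The cancellation in Eq. (6) can be checked numerically. The following is a minimal Python sketch for a single degree of freedom; the potential $U(q) = q^4/4 - \cos q$, the mass, and the sample phase-space points are illustrative assumptions, not taken from the text.

```python
import math

def hamilton_rhs(q, p, m, dU):
    """Right-hand side of Hamilton's equations for H = p^2/(2m) + U(q)."""
    q_dot = p / m        # dq/dt =  dH/dp
    p_dot = -dU(q)       # dp/dt = -dH/dq
    return q_dot, p_dot

def dH_dt(q, p, m, dU):
    """Chain-rule expression for dH/dt, term by term as in Eq. (6)."""
    q_dot, p_dot = hamilton_rhs(q, p, m, dU)
    dH_dq = dU(q)
    dH_dp = p / m
    return dH_dq * q_dot + dH_dp * p_dot

def dU(q):
    """Derivative of the illustrative potential U(q) = q^4/4 - cos(q)."""
    return q**3 + math.sin(q)

# Arbitrary sample points (q, p); the mass m = 2 is also arbitrary.
rates = [dH_dt(q, p, 2.0, dU) for q, p in [(0.3, -1.2), (1.7, 0.4), (-2.0, 3.1)]]
print(rates)
```

The rate vanishes at every phase-space point, not only along a particular trajectory, which is why energy conservation holds for any initial condition.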
(It is important to note that the form of Hamilton's equations is valid in any set of generalized coordinates $q_1, \ldots, q_{3N}, p_1, \ldots, p_{3N}$, i.e., $\dot{q}_k = \partial H/\partial p_k$, $\dot{p}_k = -\partial H/\partial q_k$.) Just as for Newton's equations, given an initial condition $(\mathbf{p}(0), \mathbf{r}(0))$, Hamilton's equations will generate a unique solution $(\mathbf{p}(t), \mathbf{r}(t))$ that conserves the total Hamiltonian, i.e., that satisfies $H(\mathbf{p}(t), \mathbf{r}(t)) = \text{constant}$. This condition tells us that the positions and momenta are not all independent variables.

In order to understand what this means, let us introduce an abstract $6N$-dimensional space, known as phase space, in which $3N$ of the mutually orthogonal axes are labeled by the $3N$ position variables and the other $3N$ axes are labeled by the $3N$ momentum variables. Since a classical system is completely specified by specifying all of the positions and momenta, a classical microscopic state, or classical microstate, is represented by a single point in the phase space. The condition $H(\mathbf{p}, \mathbf{r}) = \text{constant}$ defines a $(6N-1)$-dimensional hypersurface in the phase space known as the constant energy hypersurface. It, therefore, becomes clear that any solution to Hamilton's equations will, for all time, remain on a constant energy hypersurface determined by the initial conditions.

If the dynamics is such that the trajectory is able to visit every point of the constant energy hypersurface given an infinite amount of time, then the trajectory is said to be ergodic. There is no general way to prove that a given trajectory is ergodic, and, indeed, in many cases, an arbitrary solution of Hamilton's equations will not be ergodic. However, if a trajectory is ergodic, then it will generate a sampling of classical microscopic states corresponding to constant total energy $E$. Moreover, since the system is in isolation, the particle number $N$ and volume $V$ are trivially conserved.
The collection of classical microscopic states corresponding to constant $N$, $V$, and $E$ constitutes the statistical mechanical ensemble known as the microcanonical ensemble. In the microcanonical ensemble, the classical microstates must be distributed according to $f(\mathbf{p}, \mathbf{r}) \propto \delta(H(\mathbf{p}, \mathbf{r}) - E)$, which satisfies the equilibrium Liouville equation $\{f, H\} = 0$, where $\{\ldots, \ldots\}$ is the classical Poisson bracket. Thus, an ergodic trajectory generates not only the dynamics of the system but also the complete microcanonical ensemble. This tells us that any physical observable expressible as an average $\langle A \rangle$ over the ensemble

$$ \langle A \rangle = \frac{M_N}{\Omega(N, V, E)} \int d\mathbf{p} \int_{D(V)} d\mathbf{r}\, A(\mathbf{p}, \mathbf{r})\, \delta(H(\mathbf{p}, \mathbf{r}) - E) \tag{7} $$
of a classical phase space function $A(\mathbf{p}, \mathbf{r})$, where $M_N = E_0/(N! h^{3N})$, $E_0$ is a reference energy, $h$ is Planck's constant, $D(V)$ is the spatial domain defined by the containing volume, and $\Omega(N, V, E)$ is the microcanonical partition function

$$ \Omega(N, V, E) = M_N \int d\mathbf{p} \int_{D(V)} d\mathbf{r}\, \delta(H(\mathbf{p}, \mathbf{r}) - E) \tag{8} $$
can be computed from a time average over an ergodic trajectory

$$ \langle A \rangle = \bar{A} \equiv \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, A(\mathbf{p}(t), \mathbf{r}(t)) \tag{9} $$
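As a small illustration of Eq. (9), consider a one-dimensional harmonic oscillator, whose trajectory covers its entire constant-energy curve. The Python sketch below time-averages the kinetic energy along the exact trajectory and compares it with the microcanonical value $E/2$ for this system. The oscillator, its parameters, and the finite averaging time are assumptions chosen for illustration; a finite $T$ only approximates the limit in Eq. (9).

```python
import math

def time_average_kinetic(m, omega, amplitude, total_time, n_samples):
    """Time average of p^2/(2m) along the exact harmonic-oscillator trajectory.

    The trajectory is q(t) = A cos(omega t), p(t) = -m omega A sin(omega t);
    the integral in Eq. (9) is approximated by the midpoint rule.
    """
    total = 0.0
    for j in range(n_samples):
        t = total_time * (j + 0.5) / n_samples
        p = -m * omega * amplitude * math.sin(omega * t)
        total += p * p / (2.0 * m)
    return total / n_samples

m, omega, amplitude = 1.0, 1.0, 1.0
energy = 0.5 * m * omega**2 * amplitude**2          # conserved total energy E
avg = time_average_kinetic(m, omega, amplitude, total_time=2000.0, n_samples=400000)
print(avg, energy / 2.0)
```

For systems with many degrees of freedom no exact trajectory is available, and the time average must be accumulated along a numerically integrated trajectory instead.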
In Eq. (8), the phase space volume element $d\mathbf{p}\, d\mathbf{r} = d\mathbf{p}_1 \cdots d\mathbf{p}_N\, d\mathbf{r}_1 \cdots d\mathbf{r}_N$ is a $6N$-dimensional volume element. The Dirac delta-function $\delta(H(\mathbf{p}, \mathbf{r}) - E)$ restricts the integration over the phase space to only those points that lie on the constant energy hypersurface. Clearly, then, the microcanonical partition function corresponds to the total number of microscopic states contained in the microcanonical ensemble. It is, therefore, related to the entropy of the system $S(N, V, E)$ via Boltzmann's relation

$$ S(N, V, E) = k \ln \Omega(N, V, E) \tag{10} $$
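where $k$ is Boltzmann's constant. From this, it is clear that the partition function leads to other thermodynamic quantities via differentiation. The temperature, pressure and chemical potential, for example, are given by

$$
\begin{aligned}
\frac{1}{T} &= \left( \frac{\partial S}{\partial E} \right)_{N,V} = k \left( \frac{\partial \ln \Omega}{\partial E} \right)_{N,V} \\
\frac{P}{T} &= \left( \frac{\partial S}{\partial V} \right)_{N,E} = k \left( \frac{\partial \ln \Omega}{\partial V} \right)_{N,E} \\
\frac{\mu}{T} &= -\left( \frac{\partial S}{\partial N} \right)_{V,E} = -k \left( \frac{\partial \ln \Omega}{\partial N} \right)_{V,E}
\end{aligned}
\tag{11}
$$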
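As a brief worked illustration of Eq. (11), consider the ideal gas, which is not treated in the text and is used here only because its partition function is known in closed form. Setting $U = 0$ in Eq. (8) gives $\Omega(N, V, E) \propto V^N E^{3N/2 - 1}$, with a prefactor that depends only on $N$ and the particle masses, so that

$$
\begin{aligned}
\ln \Omega &= N \ln V + \left( \frac{3N}{2} - 1 \right) \ln E + \text{const}(N) \\
\frac{1}{T} &= k \left( \frac{\partial \ln \Omega}{\partial E} \right)_{N,V} = \frac{k}{E} \left( \frac{3N}{2} - 1 \right) \approx \frac{3Nk}{2E} \\
\frac{P}{T} &= k \left( \frac{\partial \ln \Omega}{\partial V} \right)_{N,E} = \frac{Nk}{V}
\end{aligned}
$$

For large $N$ these reduce to the familiar results $E \approx \frac{3}{2} N k T$ and $PV = NkT$.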
The complexity of the forces in Hamilton's equations is such that an analytical solution is not possible, and one must resort to numerical techniques. In constructing numerical integration schemes, it is important to preserve two properties that characterize Hamiltonian systems. The first is known as Liouville's theorem. For simplicity, let us denote the phase space trajectory $(\mathbf{p}(t), \mathbf{r}(t))$ simply by $x_t$, known as the phase space vector. Since the solution $x_t$ to Hamilton's equations is a unique function of the initial condition $x_0$, we can express $x_t$ as a function of $x_0$, i.e., $x_t = x_t(x_0)$. This designation shows that Hamilton's equations generate a transformation of the complete set of phase space variables from $x_0 \to x_t$. If we consider a small volume element $dx_t$ in phase space, this volume element will transform according to

$$ dx_t = J(x_t; x_0)\, dx_0 \tag{12} $$
where $J(x_t; x_0)$ is the Jacobian $|\partial x_t/\partial x_0|$ of the transformation. Liouville's theorem states that $J(x_t; x_0) = 1$ or, equivalently, that

$$ dx_t = dx_0 \tag{13} $$
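In other words, the phase space volume element is conserved. Liouville's theorem is a consequence of the fact that Hamiltonian systems have a vanishing phase space compressibility, $\kappa(x)$, defined in an analogous manner to the usual hydrodynamic compressibility:

$$
\begin{aligned}
\kappa(x) = \nabla \cdot \dot{x} &= \sum_{i=1}^{N} \left[ \frac{\partial}{\partial \mathbf{p}_i} \cdot \dot{\mathbf{p}}_i + \frac{\partial}{\partial \mathbf{r}_i} \cdot \dot{\mathbf{r}}_i \right] \\
&= \sum_{i=1}^{N} \left[ -\frac{\partial}{\partial \mathbf{p}_i} \cdot \frac{\partial H}{\partial \mathbf{r}_i} + \frac{\partial}{\partial \mathbf{r}_i} \cdot \frac{\partial H}{\partial \mathbf{p}_i} \right] \\
&= 0
\end{aligned}
\tag{14}
$$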
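The vanishing compressibility in Eq. (14) can be checked with a finite-difference divergence. The Python sketch below does this for one degree of freedom and, for contrast, for the same system with a friction term added, which is not Hamiltonian. The quartic potential, the friction coefficient `gamma`, and the sample point are illustrative assumptions.

```python
def hamiltonian_flow(x, m=1.0):
    """Phase-space velocity (q_dot, p_dot) for H = p^2/(2m) + q^4/4."""
    q, p = x
    return [p / m, -q**3]

def damped_flow(x, m=1.0, gamma=0.5):
    """Same system with an added friction force -gamma*p (non-Hamiltonian)."""
    q, p = x
    return [p / m, -q**3 - gamma * p]

def divergence(field, x, h=1e-5):
    """Central-difference estimate of kappa = sum_i d(field_i)/d(x_i)."""
    total = 0.0
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        total += (field(up)[i] - field(dn)[i]) / (2.0 * h)
    return total

point = [0.7, -1.3]                      # arbitrary phase-space point (q, p)
kappa_hamiltonian = divergence(hamiltonian_flow, point)
kappa_damped = divergence(damped_flow, point)
print(kappa_hamiltonian, kappa_damped)
```

The Hamiltonian flow gives zero, while the damped flow gives $-\gamma$: its phase space volume contracts, so Eq. (13) no longer holds for it.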
The second property is the time reversibility of Hamilton's equations. This property implies that if an initial condition $x_0$ is allowed to evolve up to time $t$, at which point all of the momenta are reversed, the system will, in another time interval of length $t$, return to the point $x_0$. Any numerical integration scheme applied to Hamilton's equations should respect these two properties, as they both ensure that all points of the constant energy hypersurface are given equal statistical weighting, as required by equilibrium statistical mechanics. One class of integrators that satisfies these conditions is the so-called symplectic integrators.

In devising a numerical integrator for Hamilton's equations, it is certainly possible to use a Taylor series approach and expand the solution $x_t$ for a short time $t = \Delta t$ about $t = 0$. While this method is adequate for Hamiltonian systems described by Eq. (4), it generally fails for more complicated Hamiltonian forms as well as for non-Hamiltonian systems of the type we will be considering shortly for generating other ensembles. For this reason, we will introduce a more powerful and elegant approach based on operator calculus. This approach begins by recognizing that Hamilton's equations can be cast in a compact form as

$$ \dot{\mathbf{r}}_i = iL\, \mathbf{r}_i, \qquad \dot{\mathbf{p}}_i = iL\, \mathbf{p}_i \tag{15} $$

where a linear operator $iL$ has been introduced ($i = \sqrt{-1}$), given by

$$
\begin{aligned}
iL &= \sum_{i=1}^{N} \left[ \frac{\partial H}{\partial \mathbf{p}_i} \cdot \frac{\partial}{\partial \mathbf{r}_i} - \frac{\partial H}{\partial \mathbf{r}_i} \cdot \frac{\partial}{\partial \mathbf{p}_i} \right] \\
&= \sum_{i=1}^{N} \left[ \frac{\mathbf{p}_i}{m_i} \cdot \frac{\partial}{\partial \mathbf{r}_i} + \mathbf{F}_i \cdot \frac{\partial}{\partial \mathbf{p}_i} \right]
\end{aligned}
\tag{16}
$$
This operator is known as the Liouville operator. Note that the operator $L$, itself, is Hermitian. Thus, the equations of motion can be cast in terms of the phase space vector as $\dot{x} = iLx$, which has the formal solution

$$ x_t = e^{iLt} x_0 \tag{17} $$
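One of the few cases in which Eq. (17) can be evaluated in closed form is the one-dimensional harmonic oscillator, $H = p^2/2m + m\omega^2 q^2/2$. This system is used here purely as an illustration. Its Liouville operator is $iL = (p/m)\,\partial/\partial q - m\omega^2 q\,\partial/\partial p$, so that $iL\,q = p/m$, $iL\,p = -m\omega^2 q$, and $(iL)^2$ acts on both $q$ and $p$ as multiplication by $-\omega^2$. Summing the exponential series term by term then gives

$$
\begin{aligned}
e^{iLt} q &= q \cos \omega t + \frac{p}{m\omega} \sin \omega t \\
e^{iLt} p &= p \cos \omega t - m\omega\, q \sin \omega t
\end{aligned}
$$

which is the familiar rotation in phase space. For general forces the series does not close in this way.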
The unitary operator $\exp(iLt)$ is known as the classical propagator. Since the classical propagator cannot be evaluated analytically for any but the simplest of systems, it would seem that Eq. (17) is little better than a formal device. In fact, Eq. (17) is the starting point for the derivation of practically useful numerical integrators. In order to use Eq. (17) in this way, it is necessary to introduce an approximation to the classical propagator. To begin, note that $iL$ can be written in the form

$$ iL = iL_1 + iL_2 \tag{18} $$
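where

$$ iL_1 = \sum_{i=1}^{N} \frac{\mathbf{p}_i}{m_i} \cdot \frac{\partial}{\partial \mathbf{r}_i}, \qquad iL_2 = \sum_{i=1}^{N} \mathbf{F}_i \cdot \frac{\partial}{\partial \mathbf{p}_i} \tag{19} $$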
Although these two operators do not commute, the propagator $\exp(iLt)$ can be factorized according to the Trotter theorem:

$$ e^{iLt} = \lim_{M \to \infty} \left[ e^{iL_2 t/2M} e^{iL_1 t/M} e^{iL_2 t/2M} \right]^M \tag{20} $$
where $M$ is an integer. As will be seen shortly, each of the operators in brackets can be evaluated analytically. Thus, the exact propagator could be evaluated by dividing the time $t$ into an infinite number of “steps” of length $t/M$ and evaluating the operator in brackets for each of these steps. While this is obviously not possible in practice, if we approximate $M$ as a finite number, a practical scheme emerges. For finite $M$, Eq. (20) becomes

$$
\begin{aligned}
e^{iLt} &\approx \left[ e^{iL_2 t/2M} e^{iL_1 t/M} e^{iL_2 t/2M} \right]^M + O(t^3/M^2) \\
e^{iLt/M} &\approx e^{iL_2 t/2M} e^{iL_1 t/M} e^{iL_2 t/2M} + O(t^3/M^3) \\
e^{iL\Delta t} &\approx e^{iL_2 \Delta t/2} e^{iL_1 \Delta t} e^{iL_2 \Delta t/2} + O(\Delta t^3)
\end{aligned}
\tag{21}
$$
where, in the second line, the $1/M$ power of both sides is taken, and, in the third line, the identification $\Delta t = t/M$ is made. The error terms in each line illustrate the difference between the global error in the long-time limit and the error in a single short time step. While the latter is $\Delta t^3$, the former is $t^3/M^2 = t \Delta t^2$, indicating that the error in a long trajectory generated by repeated application of the approximate propagator in Eq. (21) is actually $\Delta t^2$, despite the fact that the error in the approximate short-time propagator is $\Delta t^3$.

In order to illustrate how to evaluate the action of the approximate propagator in Eq. (21), consider a single particle moving in one dimension. Let $q$ and $p$ be the coordinate and conjugate momentum of the particle. The equations of motion are simply $\dot{q} = p/m$ and $\dot{p} = F(q)$. Thus, the approximate propagator becomes

$$ \exp[iL\Delta t] \approx \exp\left[ \frac{\Delta t}{2} F(q) \frac{\partial}{\partial p} \right] \exp\left[ \Delta t \frac{p}{m} \frac{\partial}{\partial q} \right] \exp\left[ \frac{\Delta t}{2} F(q) \frac{\partial}{\partial p} \right] \tag{22} $$

In order to evaluate the action of each of the operators, we only need the operator identity

$$ \exp\left[ c \frac{\partial}{\partial x} \right] f(x) = f(x + c) \tag{23} $$
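where $c$ is independent of $x$. This identity can be proved by expanding the exponential of the operator in a Taylor series. This type of operator is called a shift or translation operator because it has the effect of shifting $x$ by an amount $c$. Applying the operator to the phase space vector $(q, p)$ gives

$$
\begin{aligned}
\begin{pmatrix} q(\Delta t) \\ p(\Delta t) \end{pmatrix}
&\approx \exp\left[ \frac{\Delta t}{2} F(q) \frac{\partial}{\partial p} \right] \exp\left[ \Delta t \frac{p}{m} \frac{\partial}{\partial q} \right] \exp\left[ \frac{\Delta t}{2} F(q) \frac{\partial}{\partial p} \right] \begin{pmatrix} q \\ p \end{pmatrix} \\
&= \exp\left[ \frac{\Delta t}{2} F(q) \frac{\partial}{\partial p} \right] \exp\left[ \Delta t \frac{p}{m} \frac{\partial}{\partial q} \right] \begin{pmatrix} q \\ p + \frac{\Delta t}{2} F(q) \end{pmatrix} \\
&= \exp\left[ \frac{\Delta t}{2} F(q) \frac{\partial}{\partial p} \right] \begin{pmatrix} q + \frac{\Delta t}{m} p \\ p + \frac{\Delta t}{2} F\left( q + \frac{\Delta t}{m} p \right) \end{pmatrix} \\
&= \begin{pmatrix} q + \frac{\Delta t}{m} \left( p + \frac{\Delta t}{2} F(q) \right) \\ p + \frac{\Delta t}{2} F(q) + \frac{\Delta t}{2} F\left( q + \frac{\Delta t}{m} \left( p + \frac{\Delta t}{2} F(q) \right) \right) \end{pmatrix} \\
&= \begin{pmatrix} q + \Delta t \frac{p}{m} + \frac{\Delta t^2}{2m} F(q) \\ p + \frac{\Delta t}{2} \left[ F(q) + F\left( q + \Delta t \frac{p}{m} + \frac{\Delta t^2}{2m} F(q) \right) \right] \end{pmatrix}
\end{aligned}
\tag{24}
$$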
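Since the last line is just $(q(\Delta t), p(\Delta t))$ starting from the initial condition $(q, p)$, the algorithm becomes, after substituting in $(q(0), p(0))$ for the initial condition:

$$
\begin{aligned}
q(\Delta t) &= q(0) + \Delta t\, v(0) + \frac{\Delta t^2}{2m} F(q(0)) \\
v(\Delta t) &= v(0) + \frac{\Delta t}{2m} \left[ F(q(0)) + F(q(\Delta t)) \right]
\end{aligned}
\tag{25}
$$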
where the momentum has been replaced by the velocity $v = p/m$. Equation (25) is the well known velocity Verlet algorithm. However, it has been derived in a very powerful way starting from the classical propagator. In fact, the real power of the operator approach is that it can eliminate the need to derive a set of explicit finite difference equations. To see this, note that the velocity Verlet algorithm can be written in the following equivalent way

$$
\begin{aligned}
v(\Delta t/2) &= v(0) + \frac{\Delta t}{2m} F(q(0)) \\
q(\Delta t) &= q(0) + \Delta t\, v(\Delta t/2) \\
v(\Delta t) &= v(\Delta t/2) + \frac{\Delta t}{2m} F(q(\Delta t))
\end{aligned}
\tag{26}
$$
Written in this way, it becomes clear that the three assignments in Eq. (26) correspond to the three operators in Eq. (22), i.e., a shift by an amount $(\Delta t/2m) F(q(0))$ applied to the velocity $v(0)$, followed by a shift of the coordinate $q(0)$ by $\Delta t\, v(\Delta t/2)$, followed by a shift of $v(\Delta t/2)$ by an amount $(\Delta t/2m) F(q(\Delta t))$. Note that the input to each operation is just the output of the previous operation. This fact suggests that one can simply look at an operator such as that of Eq. (22) and directly write the instructions in code corresponding to each operator, only keeping in mind that when the coordinate changes, the force needs to be recalculated. We call this technique of translating the operators in a given factorization scheme directly into instructions in code the direct translation method [8]. Applying this approach to Eq. (22), the following pseudocode could be written down immediately just by looking at the operator expression:

    v ← v + 0.5 ∗ Δt ∗ F/m        !! Shift the velocity
    q ← q + Δt ∗ v                !! Shift the coordinate
    Call GetNewForce(q, F)        !! Evaluate force at new coordinate
    v ← v + 0.5 ∗ Δt ∗ F/m        !! Shift the velocity
                                                                (27)
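The direct translation above can be turned into a short runnable sketch. The Python code below follows the three shifts of Eq. (26) and then checks three properties discussed earlier: bounded energy error, time reversibility, and a unit Jacobian for a single step (Liouville's theorem, Eq. (13)). The pendulum force, the parameter values, and all function names are assumptions chosen for illustration.

```python
import math

def velocity_verlet_step(q, v, f, dt, m, force):
    """One velocity Verlet step; f is the force at the incoming coordinate q."""
    v = v + 0.5 * dt * f / m      # shift the velocity by half a step
    q = q + dt * v                # shift the coordinate
    f = force(q)                  # evaluate the force at the new coordinate
    v = v + 0.5 * dt * f / m      # shift the velocity by half a step
    return q, v, f

def run(q, v, dt, n_steps, m, force):
    """Apply the approximate propagator n_steps times."""
    f = force(q)
    for _ in range(n_steps):
        q, v, f = velocity_verlet_step(q, v, f, dt, m, force)
    return q, v

# Illustrative system (an assumption): pendulum with U(q) = -cos(q), m = 1.
m = 1.0
force = lambda q: -math.sin(q)
energy = lambda q, v: 0.5 * m * v * v - math.cos(q)

q0, v0, dt, n = 1.0, 0.0, 0.01, 10000
q1, v1 = run(q0, v0, dt, n, m, force)
energy_error = abs(energy(q1, v1) - energy(q0, v0))

# Time reversibility: reverse the velocity, integrate again, reverse back.
q2, v2 = run(q1, -v1, dt, n, m, force)
reversal_error = max(abs(q2 - q0), abs(-v2 - v0))

def jacobian_det(q, v, dt, h=1e-6):
    """Determinant of d(q', v')/d(q, v) for one step, by central differences."""
    def step(q, v):
        qn, vn, _ = velocity_verlet_step(q, v, force(q), dt, m, force)
        return qn, vn
    qa, va = step(q + h, v)
    qb, vb = step(q - h, v)
    qc, vc = step(q, v + h)
    qd, vd = step(q, v - h)
    dq_dq, dv_dq = (qa - qb) / (2 * h), (va - vb) / (2 * h)
    dq_dv, dv_dv = (qc - qd) / (2 * h), (vc - vd) / (2 * h)
    return dq_dq * dv_dv - dq_dv * dv_dq

det = jacobian_det(0.4, -0.8, dt)
print(energy_error, reversal_error, det)
```

Because the mass is constant, the Jacobian in $(q, v)$ equals the Jacobian in $(q, p)$. Note that the force is computed once per step and reused as the input to the next step.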
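The velocity Verlet method is an example of a symplectic integrator, as can be shown by computing the Jacobian of the transformation $(q(0), p(0)) \to (q(\Delta t), p(\Delta t))$. One could also factorize the propagator according to

$$ \exp[iL\Delta t] \approx \exp\left[ \frac{\Delta t}{2} \frac{p}{m} \frac{\partial}{\partial q} \right] \exp\left[ \Delta t\, F(q) \frac{\partial}{\partial p} \right] \exp\left[ \frac{\Delta t}{2} \frac{p}{m} \frac{\partial}{\partial q} \right] \tag{28} $$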
and obtain yet another symplectic integrator known as the position Verlet method [9]. The use of the Liouville operator formalism also allows for easy development of integrators capable of exploiting the natural separation of time scales in many complex systems to yield more efficient algorithms [9].

Having seen how to devise numerical integration algorithms for the microcanonical ensemble, we now take up the issue of generating other ensembles. The next case we will consider is that of a system interacting with its surroundings via exchange of thermal energy. If the temperature of the surroundings is $T$, then, in equilibrium, the system will also have this temperature, and its internal energy will fluctuate. However, since only thermal energy is exchanged with the surroundings, the number of particles $N$ and volume $V$ of the system are trivially conserved. Thus, in this case, we have an ensemble whose thermodynamic control variables are $N$, $V$ and $T$, known as the canonical ensemble. In this ensemble, the average of any quantity $A(\mathbf{p}, \mathbf{r})$ is given by

$$ \langle A \rangle = \frac{C_N}{Q(N, V, T)} \int d\mathbf{p} \int_{D(V)} d\mathbf{r}\, A(\mathbf{p}, \mathbf{r})\, e^{-\beta H(\mathbf{p}, \mathbf{r})} \tag{29} $$
where $C_N = 1/(N! h^{3N})$, $\beta = 1/kT$, and $Q(N, V, T)$ is the canonical partition function

$$ Q(N, V, T) = C_N \int d\mathbf{p} \int_{D(V)} d\mathbf{r}\, e^{-\beta H(\mathbf{p}, \mathbf{r})} \tag{30} $$
Thermodynamic quantities in the canonical ensemble are given in terms of the partition function as follows. The Helmholtz free energy is

$$ A(N, V, T) = -\frac{1}{\beta} \ln Q(N, V, T) \tag{31} $$
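The pressure, internal energy, chemical potential, and heat capacity at constant volume are given by

$$
\begin{aligned}
P &= kT \left( \frac{\partial \ln Q(N, V, T)}{\partial V} \right)_{N,T} \\
E &= -\left( \frac{\partial \ln Q(N, V, T)}{\partial \beta} \right)_{N,V} \\
\mu &= -kT \left( \frac{\partial \ln Q(N, V, T)}{\partial N} \right)_{V,T} \\
C_V &= k \beta^2 \left( \frac{\partial^2 \ln Q(N, V, T)}{\partial \beta^2} \right)_{N,V}
\end{aligned}
\tag{32}
$$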
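The derivative formulas in Eq. (32) can be exercised on a system whose partition function is known in closed form. The Python sketch below uses a single classical one-dimensional harmonic oscillator, for which Eq. (30) gives $Q = 1/(\beta \hbar \omega)$, and evaluates $E$ and $C_V$ by finite differences. The choice of system, the units $\hbar\omega = 1$, and the step sizes are assumptions made for illustration.

```python
import math

def ln_Q(beta, hbar_omega=1.0):
    """ln Q for one classical 1D harmonic oscillator, Q = 1/(beta*hbar*omega)."""
    return -math.log(beta * hbar_omega)

def internal_energy(beta, h=1e-4):
    """E = -d ln Q / d beta, by central difference."""
    return -(ln_Q(beta + h) - ln_Q(beta - h)) / (2.0 * h)

def heat_capacity_over_k(beta, h=1e-3):
    """C_V / k = beta^2 * d^2 ln Q / d beta^2, by central difference."""
    second = (ln_Q(beta + h) - 2.0 * ln_Q(beta) + ln_Q(beta - h)) / (h * h)
    return beta * beta * second

beta = 2.0                      # kT = 0.5 in units where hbar*omega = 1
E = internal_energy(beta)
cv_over_k = heat_capacity_over_k(beta)
print(E, cv_over_k)
```

The results reproduce $E = kT$ and $C_V = k$, as expected from equipartition for one harmonic degree of freedom.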
In the canonical ensemble, the surroundings act as a heat bath coupled to the system. Thus, unless we treat explicitly the surroundings that might be present in an actual constant temperature experiment, we cannot determine how this coupling will affect the dynamics of the system. Since this is clearly out of the question, the only alternative is to mimic the effect of the surroundings in a simple way so as to ensure that the system will be driven to generate a canonical distribution. There is no unique way to accomplish this, a fact that has led practitioners of MD to propose a variety of methods. One class of methods that has become increasingly popular since its introduction is the so-called extended phase space methods, originally pioneered by Andersen [10]. In this class of methods, the physical position and momentum variables of the particles in the system are supplemented by additional phase space variables that mimic the effect of the surroundings by controlling the fluctuations in certain quantities in such a way that their averages are consistent with the desired ensemble. For example, in the canonical ensemble, additional variables are used to control the fluctuation in the instantaneous kinetic energy $\sum_i \mathbf{p}_i^2/2m_i$ such that its average is $3NkT/2$.

Extended phase space methods based on both Hamiltonian and non-Hamiltonian dynamical systems have been proposed. The former include the original formulation by Nosé [11], and the more recent Nosé–Poincaré method [12]. The latter include the well known Nosé–Hoover [13] and Nosé–Hoover chain approaches [13] as well as the more recent generalized Gaussian moment method [14]. It is not possible to discuss all of these methods here, so we will focus on the Nosé–Hoover and Nosé–Hoover chain approaches, which are among the most widely used. Since these methods are of the non-Hamiltonian variety, it is necessary to review some of the basic statistical mechanics of non-Hamiltonian systems [15, 16]. Consider a non-Hamiltonian system with a generic smooth evolution equation

$$ \dot{x} = \xi(x) \tag{33} $$
where $\xi(x)$ is a vector function. A clear signature of a non-Hamiltonian system will be a non-vanishing compressibility, $\kappa(x)$, although non-Hamiltonian systems with vanishing compressibility exist as well. The consequence of nonzero compressibility is that the Jacobian of the transformation $x_0 \to x_t$ is no longer 1, and the Liouville theorem of Eq. (13) does not hold. However, for a large class of non-Hamiltonian systems described by Eq. (33), a generalization of Liouville's theorem can be derived [15, 16]. This generalization states that a metric-weighted volume element is conserved, i.e.,

$$ \sqrt{g(x_t, t)}\, dx_t = \sqrt{g(x_0, 0)}\, dx_0 \tag{34} $$

where the metric factor $\sqrt{g(x_t, t)}$ is given by

$$ \sqrt{g(x_t, t)} = e^{-w(x_t, t)} \tag{35} $$
where the function $w(x)$ is related to the compressibility by $\kappa(x_t) = dw(x_t, t)/dt$. Equation (34) shows that for non-Hamiltonian systems, phase space integrals should use $e^{-w(x,t)} dx$ as the integration measure rather than just $dx$. This will be an important point in the analysis of the dynamical systems we will be considering. Finally, although Eq. (34) allows for time-dependent metrics, the systems we will be considering all have time-independent metric factors.

Suppose the non-Hamiltonian system in Eq. (33) has a time-independent metric factor and a set of $N_c$ conservation laws $\Lambda_k(x) = C_k$, $k = 1, \ldots, N_c$, where $\Lambda_k$ is a function on the phase space and $C_k$ is a constant. Then, if the system is ergodic, it will generate a microcanonical distribution $f(x) = \prod_{k=1}^{N_c} \delta(\Lambda_k(x) - C_k)$, which satisfies a non-Hamiltonian generalization of the Liouville equation [15, 16]. The corresponding partition function is

$$ \Omega = \int dx\, e^{-w(x)} \prod_{k=1}^{N_c} \delta(\Lambda_k(x) - C_k) \tag{36} $$
The first non-Hamiltonian system we will consider for generating the canonical distribution is the set of Nosé–Hoover (NH) equations [17]. In the Nosé–Hoover system, an additional variable $\eta$ and its corresponding momentum $p_\eta$ and “mass” $Q$ (so designated because $Q$ actually has units of energy × time$^2$) are introduced into a Hamiltonian system as follows:

$$
\begin{aligned}
\dot{\mathbf{r}}_i &= \frac{\mathbf{p}_i}{m_i} \\
\dot{\mathbf{p}}_i &= \mathbf{F}_i - \frac{p_\eta}{Q} \mathbf{p}_i \\
\dot{\eta} &= \frac{p_\eta}{Q} \\
\dot{p}_\eta &= \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{m_i} - 3NkT
\end{aligned}
\tag{37}
$$

The physics embodied in Eqs. (37) is based on the fact that the term $-(p_\eta/Q)\mathbf{p}_i$ in the momentum equation acts as a kind of dynamic frictional force. Although the average $\langle p_\eta \rangle = 0$, instantaneously, $p_\eta$ can be positive or negative and, therefore, act to damp or boost the momentum. According to the equation for $p_\eta$, if the kinetic energy is larger than $3NkT/2$, $p_\eta$ will increase and have a greater damping effect on the momenta, while if the kinetic energy is less than $3NkT/2$, $p_\eta$ will decrease and have a greater boosting effect on the momenta. In this way, the NH system acts as a “thermostat” regulating the kinetic energy so that its average is the correct canonical value. Equations (37) have the conserved energy

$$
\begin{aligned}
H' &= \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2m_i} + U(\mathbf{r}_1, \ldots, \mathbf{r}_N) + \frac{p_\eta^2}{2Q} + 3NkT\eta \\
&= H(\mathbf{p}, \mathbf{r}) + \frac{p_\eta^2}{2Q} + 3NkT\eta
\end{aligned}
\tag{38}
$$
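where $H(\mathbf{p}, \mathbf{r})$ is the Hamiltonian of the physical system. Moreover, the compressibility of Eqs. (37) is

$$
\begin{aligned}
\kappa(x) &= \sum_{i=1}^{N} \left[ \frac{\partial}{\partial \mathbf{p}_i} \cdot \dot{\mathbf{p}}_i + \frac{\partial}{\partial \mathbf{r}_i} \cdot \dot{\mathbf{r}}_i \right] + \frac{\partial \dot{\eta}}{\partial \eta} + \frac{\partial \dot{p}_\eta}{\partial p_\eta} \\
&= -3N \frac{p_\eta}{Q} \\
&= -3N \dot{\eta}
\end{aligned}
\tag{39}
$$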
This implies that $w(x) = -3N\eta$, and the metric factor is $\sqrt{g(x)} = \exp(3N\eta)$. If Eq. (38) is the only conservation law, then the partition function generated by Eqs. (37) can be written down as

$$ \Omega = \int d\mathbf{p} \int_{D(V)} d\mathbf{r} \int d\eta \int dp_\eta\, e^{3N\eta}\, \delta\left( H(\mathbf{p}, \mathbf{r}) + \frac{p_\eta^2}{2Q} + 3NkT\eta - E \right) \tag{40} $$
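Performing the integrals over the variables $\eta$ and $p_\eta$ yields the partition function of the physical subsystem

$$
\begin{aligned}
\Omega &= \frac{1}{3NkT} \int d\mathbf{p} \int_{D(V)} d\mathbf{r} \int dp_\eta\, \exp\left[ \frac{1}{kT} \left( E - H(\mathbf{p}, \mathbf{r}) - \frac{p_\eta^2}{2Q} \right) \right] \\
&= \frac{\sqrt{2\pi QkT}\, e^{E/kT}}{3NkT} \int d\mathbf{p} \int_{D(V)} d\mathbf{r}\, e^{-H(\mathbf{p}, \mathbf{r})/kT}
\end{aligned}
\tag{41}
$$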
which shows that the partition function for the physical system is canonical apart from the prefactors. Although this analysis would suggest that the NH equations should always produce a canonical distribution, it turns out that if even a single additional conservation law is obeyed by the system, Eqs. (37) will fail [16]. Figure 1 shows that for a simple harmonic oscillator coupled to the NH thermostat, the physical phase space and position and momentum distributions are not those of the canonical ensemble. Note that in $N$-particle systems, a common additional conservation law is conservation of total momentum $\sum_{i=1}^{N} \mathbf{p}_i = \mathbf{K}$, where $\mathbf{K}$ is a constant vector. This conservation law is obeyed by systems on which no external forces act, so that $\sum_{i=1}^{N} \mathbf{F}_i = 0$. Conservation of total momentum is an example of a common conservation law in $N$-particle systems that can cause the NH equations to fail rather spectacularly [16].

A solution to this problem was devised by Martyna et al. [13] in the form of the Nosé–Hoover chain equations. In this scheme, the heat bath variables, themselves, are connected to a heat bath, which, in turn, is connected to a heat bath, until a “chain” of $M$ heat baths is generated. The equations of motion are

$$
\begin{aligned}
\dot{\mathbf{r}}_i &= \frac{\mathbf{p}_i}{m_i} \\
\dot{\mathbf{p}}_i &= \mathbf{F}_i - \frac{p_{\eta_1}}{Q_1} \mathbf{p}_i \\
\dot{\eta}_k &= \frac{p_{\eta_k}}{Q_k}, \qquad k = 1, \ldots, M \\
\dot{p}_{\eta_k} &= G_k - \frac{p_{\eta_{k+1}}}{Q_{k+1}} p_{\eta_k}, \qquad k = 1, \ldots, M-1 \\
\dot{p}_{\eta_M} &= G_M
\end{aligned}
$$
Figure 1. Simple harmonic oscillator with momentum $p$, coordinate $q$, mass $m = 1$, frequency $\omega = 1$ and temperature $kT = 1$. Top left: Poincaré section ($pq$ plane) of the oscillator when coupled to the Nosé–Hoover thermostat with $Q = 1$ and $q(0) = 0$, $p(0) = 1$, $\eta(0) = 0$, $p_\eta(0) = 1$. Middle left: The position distribution function of the oscillator. The solid line is the distribution function generated by the NH dynamics while the dashed line is the analytical result for a canonical ensemble. Bottom left: Same for the momentum distribution. Top right: Poincaré section for the Nosé–Hoover chain scheme with $M = 4$, $q(0) = 0$, $p(0) = 1$, $\eta_k(0) = 0$, $p_{\eta_k}(0) = (-1)^k$. Middle right: The position distribution function. The solid line is the distribution function generated by the NHC dynamics while the dashed line is the analytical result. Bottom right: Same for the momentum distribution. In all simulations, the equations of motion were integrated for $5 \times 10^6$ steps using a time step of 0.01 and a fifth-order SY decomposition with $n_c = 5$.
where the heat-bath forces have been introduced and are given by

$$ G_1 = \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{m_i} - 3NkT, \qquad G_k = \frac{p_{\eta_{k-1}}^2}{Q_{k-1}} - kT, \quad k = 2, \ldots, M \tag{42} $$
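Equations (42) have the conserved energy

$$ H' = H(\mathbf{p}, \mathbf{r}) + \sum_{k=1}^{M} \frac{p_{\eta_k}^2}{2Q_k} + dNkT\eta_1 + kT \sum_{k=2}^{M} \eta_k \tag{43} $$

(here $d$ is the number of spatial dimensions, so that $dN = 3N$ in three dimensions)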
and a compressibility

$$ \kappa(x) = -3N\dot{\eta}_1 - \sum_{k=2}^{M} \dot{\eta}_k \tag{44} $$
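The conservation law (43) and the compressibility (44) can be verified directly from the equations of motion. The Python sketch below does this for a single particle in one dimension, so the factor $3N$ (or $dN$) becomes 1. The harmonic force, the thermostat masses, the chain length, and the sample state are all illustrative assumptions.

```python
def nhc_rhs(state, M, mass, Q, kT, force):
    """Time derivatives for the NHC equations, one particle in one dimension.

    state = [q, p, eta_1..eta_M, p_eta_1..p_eta_M]
    """
    q, p = state[0], state[1]
    p_eta = state[2 + M:]
    # Heat-bath forces: G[0] is G_1, G[k] is G_{k+1}.
    G = [p * p / mass - kT] + [p_eta[k - 1] ** 2 / Q[k - 1] - kT for k in range(1, M)]
    q_dot = p / mass
    p_dot = force(q) - p_eta[0] / Q[0] * p
    eta_dot = [p_eta[k] / Q[k] for k in range(M)]
    p_eta_dot = [G[k] - p_eta[k + 1] / Q[k + 1] * p_eta[k] for k in range(M - 1)]
    p_eta_dot.append(G[M - 1])           # last chain element has no friction term
    return [q_dot, p_dot] + eta_dot + p_eta_dot

def conserved_energy_rate(state, M, mass, Q, kT, force):
    """dH'/dt from the chain rule applied to Eq. (43), with dN = 1."""
    q, p = state[0], state[1]
    p_eta = state[2 + M:]
    rhs = nhc_rhs(state, M, mass, Q, kT, force)
    q_dot, p_dot = rhs[0], rhs[1]
    eta_dot, p_eta_dot = rhs[2:2 + M], rhs[2 + M:]
    rate = p / mass * p_dot - force(q) * q_dot            # d/dt of H(p, r)
    rate += sum(p_eta[k] / Q[k] * p_eta_dot[k] for k in range(M))
    rate += kT * sum(eta_dot)                              # thermostat potential terms
    return rate

def compressibility(state, M, mass, Q, kT, force, h=1e-6):
    """Central-difference divergence of the NHC phase-space velocity."""
    total = 0.0
    for i in range(len(state)):
        up, dn = list(state), list(state)
        up[i] += h
        dn[i] -= h
        total += (nhc_rhs(up, M, mass, Q, kT, force)[i]
                  - nhc_rhs(dn, M, mass, Q, kT, force)[i]) / (2.0 * h)
    return total

M, mass, Q, kT = 3, 1.0, [1.0, 2.0, 0.5], 1.0
force = lambda q: -q                                       # harmonic oscillator
state = [0.3, -1.1, 0.2, -0.4, 0.1, 0.7, -0.5, 1.3]        # arbitrary sample state

rate = conserved_energy_rate(state, M, mass, Q, kT, force)
kappa = compressibility(state, M, mass, Q, kT, force)
eta_dot = nhc_rhs(state, M, mass, Q, kT, force)[2:2 + M]
kappa_expected = -sum(eta_dot)                             # Eq. (44) with 3N -> 1
print(rate, kappa, kappa_expected)
```

This checks the equations themselves at a single phase-space point. It says nothing about the accuracy of any particular integrator.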
By allowing the “length” of the chain to be arbitrarily long, the problem of unexpected conservation laws is avoided. In Fig. 1, the physical phase space and momentum and position distributions for a harmonic oscillator coupled to a thermostat chain of length $M = 4$ are shown. It can be seen that the correct canonical distribution is obtained. The general proof that the canonical distribution is generated by Eqs. (42) follows the same pattern as for the NH equations. However, unlike the NH equations, if additional conservation laws, such as conservation of total momentum, are obeyed, the NHC equations will still generate the correct distribution [16].

The NHC scheme can be used in a flexible manner to enhance the equilibration of a system. For example, rather than using a single global NHC thermostat, it is also possible to couple many NHCs to a system, one to each of a small number of degrees of freedom. In fact, coupling one NHC to each degree of freedom has been shown to lead to a highly effective method for studying quantum systems via the Feynman path integral using molecular dynamics [18].

In order to develop a numerical integration algorithm for the NHC equations, it is important to keep in mind the modified Liouville theorem, Eq. (34). The complexity of the NHC equations is such that a Taylor series approach cannot be employed to derive a satisfactory integrator, i.e., one that does not lead to substantial drifts in the conserved energy [19]. Thus, the NHC system is an example of a problem on which the power of the Liouville operator method can be brought to bear. We begin by writing the total Liouville operator for Eqs. (42) as

$$ iL = iL_1 + iL_2 + iL_T \tag{45} $$
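where $iL_1$ and $iL_2$ are given by Eq. (19) and

$$ iL_T = \sum_{k=1}^{M} \left[ \frac{p_{\eta_k}}{Q_k} \frac{\partial}{\partial \eta_k} + G_k \frac{\partial}{\partial p_{\eta_k}} \right] - \sum_{i=1}^{N} \frac{p_{\eta_1}}{Q_1}\, \mathbf{p}_i \cdot \frac{\partial}{\partial \mathbf{p}_i} - \sum_{k=1}^{M-1} \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k} \frac{\partial}{\partial p_{\eta_k}} \tag{46} $$

The propagator is now factorized in a manner very similar to the velocity Verlet algorithm:

$$ e^{iL\Delta t} = e^{iL_T \Delta t/2}\, e^{iL_2 \Delta t/2}\, e^{iL_1 \Delta t}\, e^{iL_2 \Delta t/2}\, e^{iL_T \Delta t/2} + O(\Delta t^3) \tag{47} $$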
The only new feature in this scheme is the operator $\exp(iL_T \Delta t/2)$. Application of this operator to the phase space requires some care. Clearly, the operator needs to be further factorized into individual operators that can be applied analytically. However, the NHC equations constitute a stiff set of differential equations and, therefore, a simple $O(\Delta t^3)$ factorization scheme will not be accurate enough. Thus, for this operator, a higher-order factorization is needed. Note that the overall integrator will still be $O(\Delta t^3)$ despite the use of a higher-order method on the thermostat operator. The higher-order method we choose is the Suzuki–Yoshida (SY) scheme [20, 21], which involves the introduction of weighted time steps $w_j \Delta t$, $j = 1, \ldots, n_{sy}$, where the value of $n_{sy}$ determines the order of the method. The weights $w_j$ are required to satisfy $\sum_{j=1}^{n_{sy}} w_j = 1$ and are chosen so as to cancel out the lower-order error terms. Applying the SY scheme, the operator $\exp(iL_T \Delta t/2)$ becomes

$$ e^{iL_T \Delta t/2} = \prod_{j=1}^{n_{sy}} e^{iL_T w_j \Delta t/2} \tag{48} $$
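As a small numerical illustration of such weights, the sketch below builds the widely used three-stage, fourth-order set and checks the two conditions it is designed to meet: the weights sum to one, and their cubes sum to zero, which is what removes the leading error term. The function name `yoshida4_weights` is hypothetical, and this particular set is only an example; it is not necessarily the one used for Fig. 1.

```python
def yoshida4_weights():
    """Three-stage, fourth-order Suzuki-Yoshida weights (n_sy = 3)."""
    cbrt2 = 2.0 ** (1.0 / 3.0)
    w_outer = 1.0 / (2.0 - cbrt2)
    w_middle = -cbrt2 / (2.0 - cbrt2)
    return [w_outer, w_middle, w_outer]


weights = yoshida4_weights()
sum_w = sum(weights)                    # normalization condition: equals 1
sum_w3 = sum(w ** 3 for w in weights)   # vanishes, cancelling the leading error term
print(weights, sum_w, sum_w3)
```

Note that the middle weight is negative, so one sub-step runs backward in time; this is a general feature of such higher-order compositions.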
In order to avoid needing to choose $n_{sy}$ too high, another device can be introduced, namely, simply cutting the time step by a factor of $n_c$ and applying the operator in Eq. (48) $n_c$ times, i.e.,

$$ e^{iL_T \Delta t/2} = \prod_{i=1}^{n_c} \prod_{j=1}^{n_{sy}} e^{iL_T w_j \Delta t/2n_c} \tag{49} $$
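In this way, both $n_c$ and $n_{sy}$ can be adjusted so as to minimize the number of operations needed for satisfactory performance of the overall integrator. Having introduced the above scheme, it only remains to specify a particular factorization of the operator $\exp(iL_T w_j \Delta t/2n_c)$. Defining $\delta_j = w_j \Delta t/n_c$, we choose the following factorization:

$$
\begin{aligned}
\exp\left(iL_T \frac{\delta_j}{2}\right) ={}& \exp\left(\frac{\delta_j}{4} G_M \frac{\partial}{\partial p_{\eta_M}}\right) \\
&\times \prod_{k=M-1}^{1} \left[ \exp\left(-\frac{\delta_j}{8} \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k} \frac{\partial}{\partial p_{\eta_k}}\right) \exp\left(\frac{\delta_j}{4} G_k \frac{\partial}{\partial p_{\eta_k}}\right) \exp\left(-\frac{\delta_j}{8} \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k} \frac{\partial}{\partial p_{\eta_k}}\right) \right] \\
&\times \prod_{i=1}^{N} \exp\left(-\frac{\delta_j}{2} \frac{p_{\eta_1}}{Q_1}\, \mathbf{p}_i \cdot \frac{\partial}{\partial \mathbf{p}_i}\right) \prod_{k=1}^{M} \exp\left(\frac{\delta_j}{2} \frac{p_{\eta_k}}{Q_k} \frac{\partial}{\partial \eta_k}\right) \\
&\times \prod_{k=1}^{M-1} \left[ \exp\left(-\frac{\delta_j}{8} \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k} \frac{\partial}{\partial p_{\eta_k}}\right) \exp\left(\frac{\delta_j}{4} G_k \frac{\partial}{\partial p_{\eta_k}}\right) \exp\left(-\frac{\delta_j}{8} \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k} \frac{\partial}{\partial p_{\eta_k}}\right) \right] \\
&\times \exp\left(\frac{\delta_j}{4} G_M \frac{\partial}{\partial p_{\eta_M}}\right)
\end{aligned}
\tag{50}
$$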
Although the overall scheme may seem complicated, the use of the direct translation technique simplifies considerably the job of coding the algorithm.
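To make the direct translation concrete, here is a minimal sketch for a single particle in one dimension (so that 3N and dN are replaced by 1), coupled to a chain of length M and moving in a harmonic potential. Each exponential in Eq. (50) becomes one line: translations are additions and the remaining operators are multiplications by an exponential factor. All function names are hypothetical, and the three-stage fourth-order weights are an assumption made for illustration rather than the scheme used for Fig. 1.

```python
import math


def heat_bath_force(k, p, p_eta, Q, m, kT):
    """G_k of Eq. (42) for one degree of freedom; k is a 0-based chain index."""
    if k == 0:
        return p * p / m - kT
    return p_eta[k - 1] ** 2 / Q[k - 1] - kT


def thermostat_half_step(p, eta, p_eta, Q, m, kT, dt, weights, n_c):
    """Apply exp(iL_T dt/2) via Eqs. (49)-(50); eta and p_eta change in place."""
    M = len(Q)
    for _ in range(n_c):
        for w in weights:
            delta = w * dt / n_c
            # Translate the last chain momentum.
            p_eta[M - 1] += 0.25 * delta * heat_bath_force(M - 1, p, p_eta, Q, m, kT)
            # Work down the chain: scale, translate, scale.
            for k in range(M - 2, -1, -1):
                s = math.exp(-0.125 * delta * p_eta[k + 1] / Q[k + 1])
                g = heat_bath_force(k, p, p_eta, Q, m, kT)
                p_eta[k] = (p_eta[k] * s + 0.25 * delta * g) * s
            # Scale the particle momentum, then translate the thermostat positions.
            p *= math.exp(-0.5 * delta * p_eta[0] / Q[0])
            for k in range(M):
                eta[k] += 0.5 * delta * p_eta[k] / Q[k]
            # Work back up the chain (mirror image of the first half).
            for k in range(M - 1):
                s = math.exp(-0.125 * delta * p_eta[k + 1] / Q[k + 1])
                g = heat_bath_force(k, p, p_eta, Q, m, kT)
                p_eta[k] = (p_eta[k] * s + 0.25 * delta * g) * s
            p_eta[M - 1] += 0.25 * delta * heat_bath_force(M - 1, p, p_eta, Q, m, kT)
    return p


def nhc_step(q, p, eta, p_eta, Q, m, omega, kT, dt, weights, n_c):
    """One step of Eq. (47) for a 1D harmonic oscillator with force -m omega^2 q."""
    p = thermostat_half_step(p, eta, p_eta, Q, m, kT, dt, weights, n_c)
    p -= 0.5 * dt * m * omega ** 2 * q
    q += dt * p / m
    p -= 0.5 * dt * m * omega ** 2 * q
    p = thermostat_half_step(p, eta, p_eta, Q, m, kT, dt, weights, n_c)
    return q, p


def conserved_energy(q, p, eta, p_eta, Q, m, omega, kT):
    """Eq. (43) with dN = 1."""
    h = 0.5 * p * p / m + 0.5 * m * omega ** 2 * q * q
    h += sum(pe * pe / (2.0 * Qk) for pe, Qk in zip(p_eta, Q))
    return h + kT * sum(eta)


# Initial conditions of Fig. 1 (right column); the weights are an assumed choice.
cbrt2 = 2.0 ** (1.0 / 3.0)
weights = [1.0 / (2.0 - cbrt2), -cbrt2 / (2.0 - cbrt2), 1.0 / (2.0 - cbrt2)]
M, m, omega, kT, dt, n_c = 4, 1.0, 1.0, 1.0, 0.01, 1
Q = [1.0] * M
q, p = 0.0, 1.0
eta = [0.0] * M
p_eta = [(-1.0) ** (k + 1) for k in range(M)]   # p_eta_k(0) = (-1)^k, k = 1..M
e_start = conserved_energy(q, p, eta, p_eta, Q, m, omega, kT)
for _ in range(5000):
    q, p = nhc_step(q, p, eta, p_eta, Q, m, omega, kT, dt, weights, n_c)
e_end = conserved_energy(q, p, eta, p_eta, Q, m, omega, kT)
print(f"drift in conserved energy: {e_end - e_start:.2e}")
```

Monitoring the quantity in Eq. (43) in this way is a convenient check on any implementation, since a substantial drift signals an error in the factorization or too large a time step.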
All of the operators appearing in Eq. (50) are either translation operators or operators of the form $\exp(cx\,\partial/\partial x)$, the action of which is

$$ \exp\left(cx \frac{\partial}{\partial x}\right) x = x e^{c} \tag{51} $$
We call such operators scaling operators, because the effect is to multiply x by an x-independent factor $e^c$. The examples of Fig. 1 were generated using the above scheme.

The last ensemble we will discuss corresponds to a system that interacts with its surroundings through exchange of thermal energy and via a mechanical piston that adjusts the volume of the system until its internal pressure is equal to the external pressure of the surroundings. Such an ensemble will be characterized by constant particle number, N, internal pressure P, and temperature T and is known as the isothermal-isobaric ensemble. In this ensemble, it is necessary to consider all possible values of the volume. Thus, the average of any quantity A(p, r) is given by

$$ \langle A \rangle = \frac{D_N}{\Delta(N, P, T)} \int_0^{\infty} dV\, e^{-\beta PV} \int d\mathbf{p} \int_{D(V)} d\mathbf{r}\, A(\mathbf{p}, \mathbf{r})\, e^{-\beta H(\mathbf{p}, \mathbf{r})} \tag{52} $$
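where $D_N = 1/(N!\,h^{3N} V_0)$, with $V_0$ being a reference volume, and where the partition function $\Delta(N, P, T)$ is given by

$$ \Delta(N, P, T) = D_N \int_0^{\infty} dV\, e^{-\beta PV} \int d\mathbf{p} \int_{D(V)} d\mathbf{r}\, e^{-\beta H(\mathbf{p}, \mathbf{r})} \tag{53} $$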
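The thermodynamic quantities defined in this ensemble are the Gibbs free energy, given by

$$ G(N, P, T) = -\frac{1}{\beta} \ln \Delta(N, P, T) \tag{54} $$

and the average volume, average enthalpy, chemical potential, and constant-pressure heat capacity, given, respectively, by

$$
\begin{aligned}
\bar{V} &= -kT \left( \frac{\partial \ln \Delta(N, P, T)}{\partial P} \right)_{N,T} \\
\bar{H} &= -\left( \frac{\partial \ln \Delta(N, P, T)}{\partial \beta} \right)_{N,P} \\
\mu &= -kT \left( \frac{\partial \ln \Delta(N, P, T)}{\partial N} \right)_{P,T} \\
C_P &= k\beta^2 \left( \frac{\partial^2 \ln \Delta(N, P, T)}{\partial \beta^2} \right)_{N,P}
\end{aligned}
\tag{55}
$$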
As with the canonical ensemble, there is no unique way to generate the correct volume fluctuations. However, it can be shown [16] that not all of the algorithms that have been proposed for constant-pressure MD generate the correct isothermal-isobaric distribution. We shall, therefore, focus on the Martyna–Tobias–Klein (MTK) algorithm [22], which has been shown to give both the correct phase space and volume distributions. The MTK approach uses a set of thermostat variables to control the kinetic energy fluctuations as well as a barostat to control the fluctuations in the instantaneous pressure. The latter is given by the virial expression

$$ P_{\mathrm{int}} = \frac{1}{3V} \left[ \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{m_i} + \sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{F}_i - 3V \frac{\partial U}{\partial V} \right] \tag{56} $$
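Finally, the volume V is also treated as a dynamical variable. Thus, the equations of motion take the form

$$
\begin{aligned}
\dot{\mathbf{r}}_i &= \frac{\mathbf{p}_i}{m_i} + \frac{p_\epsilon}{W}\, \mathbf{r}_i \\
\dot{\mathbf{p}}_i &= \mathbf{F}_i - \left(1 + \frac{1}{N}\right) \frac{p_\epsilon}{W}\, \mathbf{p}_i - \frac{p_{\eta_1}}{Q_1}\, \mathbf{p}_i \\
\dot{V} &= \frac{3V p_\epsilon}{W} \\
\dot{p}_\epsilon &= 3V (P_{\mathrm{int}} - P) + \frac{1}{N} \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{m_i} - \frac{p_{\xi_1}}{Q'_1}\, p_\epsilon \\
\dot{\eta}_k &= \frac{p_{\eta_k}}{Q_k}, \qquad k = 1, \ldots, M \\
\dot{p}_{\eta_k} &= G_k - \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k} \\
\dot{p}_{\eta_M} &= G_M \\
\dot{\xi}_k &= \frac{p_{\xi_k}}{Q'_k}, \qquad k = 1, \ldots, M \\
\dot{p}_{\xi_k} &= G'_k - \frac{p_{\xi_{k+1}}}{Q'_{k+1}}\, p_{\xi_k} \\
\dot{p}_{\xi_M} &= G'_M
\end{aligned}
\tag{57}
$$

In Eqs. (57), the variable $p_\epsilon$ with mass parameter W (having units of energy × time²) corresponds to the barostat, coupling both to the positions and the momenta. If the system is subject to a set of holonomic constraints, leaving only $N_f$ degrees of freedom, then the 1/N factors appearing in Eq. (57) must be replaced by $3/N_f$ in three spatial dimensions. Moreover, note that two Nosé–Hoover chains are coupled to the system, one to the particles and the other to the barostat. This device is particularly important, as the barostat tends to evolve on a much slower time scale than the particles. The heat-bath forces $G'_k$ are defined by

$$ G'_1 = \frac{p_\epsilon^2}{W} - kT, \qquad G'_k = \frac{p_{\xi_{k-1}}^2}{Q'_{k-1}} - kT \tag{58} $$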
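The MTK equations have the conserved energy

$$ H' = H(\mathbf{p}, \mathbf{r}) + \frac{p_\epsilon^2}{2W} + PV + \sum_{k=1}^{M} \left[ \frac{p_{\eta_k}^2}{2Q_k} + \frac{p_{\xi_k}^2}{2Q'_k} \right] + dNkT\,\eta_1 + kT \sum_{k=2}^{M} \eta_k + kT \sum_{k=1}^{M} \xi_k \tag{59} $$

and a phase space metric factor

$$ \sqrt{g(\mathrm{x})} = \exp\left[ dN\eta_1 + \sum_{k=2}^{M} \eta_k + \sum_{k=1}^{M} \xi_k \right] \tag{60} $$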
In order to prove that the MTK equations generate a correct isothermal-isobaric distribution, one needs to substitute Eqs. (60) and (59) into Eq. (36) and perform the integrals over all of the heat bath variables and $p_\epsilon$ following the same procedure as was done for the canonical ensemble. Moreover, since Nosé–Hoover chain thermostats are employed in the MTK scheme, the correct distribution will also be generated even if additional conservation laws, such as total momentum, are obeyed by the system.

Integrating the MTK equations is only slightly more difficult than integrating the NHC equations and builds on the technology already developed. We begin by introducing the variable $\epsilon = (1/3) \ln(V/V_0)$ and writing the total Liouville operator as

$$ iL = iL_1 + iL_2 + iL_{\epsilon,1} + iL_{\epsilon,2} + iL_{T\text{-baro}} + iL_{T\text{-part}} \tag{61} $$
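where

$$
\begin{aligned}
iL_1 &= \sum_{i=1}^{N} \left[ \frac{\mathbf{p}_i}{m_i} + \frac{p_\epsilon}{W}\, \mathbf{r}_i \right] \cdot \frac{\partial}{\partial \mathbf{r}_i} \\
iL_2 &= \sum_{i=1}^{N} \left[ \mathbf{F}_i - \alpha \frac{p_\epsilon}{W}\, \mathbf{p}_i \right] \cdot \frac{\partial}{\partial \mathbf{p}_i} \\
iL_{\epsilon,1} &= \frac{p_\epsilon}{W} \frac{\partial}{\partial \epsilon} \\
iL_{\epsilon,2} &= G_\epsilon \frac{\partial}{\partial p_\epsilon}
\end{aligned}
\tag{62}
$$

and $iL_{T\text{-part}}$ and $iL_{T\text{-baro}}$ are defined in an analogous manner to Eq. (46). In Eq. (62), $\alpha = 1 + 1/N$, and

$$ G_\epsilon = \alpha \sum_{i} \frac{\mathbf{p}_i^2}{m_i} + \sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{F}_i - 3V \frac{\partial \phi}{\partial V} - 3PV \tag{63} $$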
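The propagator is factorized in a manner that bears a very close resemblance to that of the NHC equations, namely

$$
\begin{aligned}
\exp(iL\Delta t) ={}& \exp\left(iL_{T\text{-baro}} \frac{\Delta t}{2}\right) \exp\left(iL_{T\text{-part}} \frac{\Delta t}{2}\right) \exp\left(iL_{\epsilon,2} \frac{\Delta t}{2}\right) \\
&\times \exp\left(iL_2 \frac{\Delta t}{2}\right) \exp\left(iL_{\epsilon,1} \Delta t\right) \exp\left(iL_1 \Delta t\right) \\
&\times \exp\left(iL_2 \frac{\Delta t}{2}\right) \exp\left(iL_{\epsilon,2} \frac{\Delta t}{2}\right) \exp\left(iL_{T\text{-part}} \frac{\Delta t}{2}\right) \\
&\times \exp\left(iL_{T\text{-baro}} \frac{\Delta t}{2}\right) + O(\Delta t^3)
\end{aligned}
\tag{64}
$$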
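In evaluating the action of this propagator, the Suzuki–Yoshida decomposition already developed for the NHC equations is applied to the operators $\exp(iL_{T\text{-baro}} \Delta t/2)$ and $\exp(iL_{T\text{-part}} \Delta t/2)$. The operators $\exp(iL_{\epsilon,1} \Delta t)$ and $\exp(iL_{\epsilon,2} \Delta t/2)$ are simple translation operators. The operators $\exp(iL_1 \Delta t)$ and $\exp(iL_2 \Delta t/2)$ are somewhat more complicated than their microcanonical or canonical ensemble counterparts due to the barostat coupling. The action of the operator $\exp(iL_1 \Delta t)$ can be determined by solving the differential equation

$$ \dot{\mathbf{r}}_i = \mathbf{v}_i + v_\epsilon \mathbf{r}_i \tag{65} $$

for constant $\mathbf{v}_i = \mathbf{p}_i/m_i$ and constant $v_\epsilon = p_\epsilon/W$ for an arbitrary initial condition $\mathbf{r}_i(0)$ and evaluating the solution at $t = \Delta t$. This yields the evolution

$$ \mathbf{r}_i(\Delta t) = \mathbf{r}_i(0)\, e^{v_\epsilon \Delta t} + \Delta t\, \mathbf{v}_i(0)\, e^{v_\epsilon \Delta t/2}\, \frac{\sinh(v_\epsilon \Delta t/2)}{v_\epsilon \Delta t/2} \tag{66} $$

Similarly, the action of $\exp(iL_2 \Delta t/2)$ can be determined by solving the differential equation

$$ \dot{\mathbf{v}}_i = \frac{\mathbf{F}_i}{m_i} - \alpha v_\epsilon \mathbf{v}_i \tag{67} $$

for an arbitrary initial condition $\mathbf{v}_i(0)$ and evaluating the solution at $t = \Delta t/2$. This yields the evolution

$$ \mathbf{v}_i(\Delta t/2) = \mathbf{v}_i(0)\, e^{-\alpha v_\epsilon \Delta t/2} + \frac{\Delta t}{2m_i}\, \mathbf{F}_i(0)\, e^{-\alpha v_\epsilon \Delta t/4}\, \frac{\sinh(\alpha v_\epsilon \Delta t/4)}{\alpha v_\epsilon \Delta t/4} \tag{68} $$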
In practice, the factor sinh(x)/x should be evaluated by a power series for small x to avoid numerical instabilities. These equations together with the Suzuki–Yoshida factorization of the thermostat operators completely define an integrator for the isothermal-isobaric ensemble that can be shown to satisfy Eq. (34). The integrator can be easily coded using the direct translation technique.

As an example, the MTK algorithm is applied to the problem of a particle moving in a one-dimensional potential

$$ \phi(q, V) = \frac{m\omega^2 V^2}{4\pi^2} \left[ 1 - \cos\left( \frac{2\pi q}{V} \right) \right] \tag{69} $$

where V is the one-dimensional “volume” or box length. The system is coupled to the MTK thermostat/barostat and subject to periodic boundary conditions. Figure 2 shows the position and volume distributions generated together with the analytical results. It can be seen that the method is capable of generating correct distributions of both the phase space and of the volume.

Figure 2. Top: The position distribution f(q) of the system described by the periodic potential of Eq. (69) in the isothermal-isobaric ensemble. The numerical and analytical distributions are shown as the solid and dashed lines, respectively. Bottom: Same for the volume distribution f(V). Nosé–Hoover chains of length 4 were coupled to the particle and to the barostat. The mass m and frequency ω were both taken to be 1, W = 18, kT = 1, P = 1, Q_k = 1, Q'_k = 9. The time step was taken to be 0.005, and the equations of motion were integrated for 5 × 10^7 steps using a seventh-order SY scheme with n_c = 6.

We conclude this contribution with a few closing remarks. First, the MTK equations can be generalized [22] to treat anisotropic pressure fluctuations, as in the Parrinello–Rahman scheme [23]. In this case, one considers the full 3 × 3 cell matrix h = (a, b, c), where a, b, and c, which form the columns of h, are the three cell vectors. The partition function for this ensemble is

$$ \Delta(N, P, T) = \int d\mathbf{h}\, \frac{1}{[\det(\mathbf{h})]^2}\, e^{-\beta P \det(\mathbf{h})} \int d\mathbf{p} \int_{D(\mathbf{h})} d\mathbf{r}\, e^{-\beta H(\mathbf{p}, \mathbf{r})} \tag{70} $$
Although we will not discuss the equations of motion here, we remark that it is important to generate the correct factors of det(h) (recall det(h) = V) in the distribution. The generalized MTK algorithm has been shown to achieve this.

Next, the reader may have noticed the absence of a pure MD-based approach to the grand canonical ensemble. Although a number of important proposals for generating this ensemble via MD have appeared in the literature, there is no standard, widely adopted approach to this problem, as there is for the canonical and isothermal-isobaric ensembles, and the development of such a method for the grand canonical ensemble remains an open question. The main problem with the grand canonical ensemble comes from the need to treat the fluctuations in a discrete variable, N. Here, adiabatic dynamics techniques adapted to allow slow insertion and deletion of particles in the system at constant chemical potential might be useful.

Finally, although we encourage the use of the Liouville operator approach in developing integrators for new sets of equations of motion, this method is not foolproof and must be used with some degree of caution, particularly for non-Hamiltonian systems. Not every factorization scheme applied to the propagator of a non-Hamiltonian system is guaranteed to preserve the phase space volume as Eq. (34) requires. Although significant attempts have been made to develop a general procedure for devising such factorization schemes, not enough is known at this point about the phase space structure of non-Hamiltonian systems for a truly general theory of numerical integration, so that this, too, remains an open area. An advantage, however, of the Liouville operator approach is that it renders the problem of combining the NHC and MTK schemes with multiple time scale methods [9] and constraints [24] relatively transparent.
References

[1] G.M. Torrie and J.P. Valleau, “Nonphysical sampling distributions in Monte Carlo free energy estimation: umbrella sampling,” J. Comp. Phys., 23, 187, 1977.
[2] E.A. Carter, G. Ciccotti, J.T. Hynes, and R. Kapral, “Constrained reaction coordinate dynamics for the simulation of rare events,” Chem. Phys. Lett., 156, 472, 1989.
[3] M. Sprik and G. Ciccotti, “Free energy from constrained molecular dynamics,” J. Chem. Phys., 109, 7737, 1998.
[4] Z. Zhu, M.E. Tuckerman, S.O. Samuelson, and G.J. Martyna, “Using novel variable transformations to enhance conformational sampling in molecular dynamics,” Phys. Rev. Lett., 88, 100201, 2002.
[5] J.I. Siepmann and D. Frenkel, “Configurational bias Monte Carlo – a new sampling scheme for flexible chains,” Mol. Phys., 75, 59, 1992.
[6] S. Duane, A.D. Kennedy, B.J. Pendleton, and D. Roweth, “Hybrid Monte Carlo,” Phys. Lett. B, 195, 216, 1987.
[7] S. Plimpton, “Fast parallel algorithms for short-range molecular dynamics,” J. Comput. Phys., 117, 1, 1995.
[8] G.J. Martyna, M.E. Tuckerman, D.J. Tobias, and M.L. Klein, “Explicit reversible integrators for extended systems dynamics,” Mol. Phys., 87, 1117, 1996.
[9] M.E. Tuckerman, G.J. Martyna, and B.J. Berne, “Reversible multiple time scale molecular dynamics,” J. Chem. Phys., 97, 1990, 1992.
[10] H. Andersen, “Molecular dynamics at constant temperature and/or pressure,” J. Chem. Phys., 72, 2384, 1980.
[11] S. Nosé, “A unified formulation of the constant temperature molecular dynamics methods,” J. Chem. Phys., 81, 511, 1984.
[12] S.D. Bond, B.J. Leimkuhler, and B.B. Laird, “The Nosé–Poincaré method for constant temperature molecular dynamics,” J. Comput. Phys., 151, 114, 1999.
[13] G.J. Martyna, M.E. Tuckerman, and M.L. Klein, “Nosé–Hoover chains: the canonical ensemble via continuous dynamics,” J. Chem. Phys., 97, 2635, 1992.
[14] Y. Liu and M.E. Tuckerman, “Generalized Gaussian moment thermostatting: a new continuous dynamical approach to the canonical ensemble,” J. Chem. Phys., 112, 1685, 2000.
[15] M.E. Tuckerman, C.J. Mundy, and G.J. Martyna, “On the classical statistical mechanics of non-Hamiltonian systems,” Europhys. Lett., 45, 149, 1999.
[16] M.E. Tuckerman, Y. Liu, G. Ciccotti, and G.J. Martyna, “Non-Hamiltonian molecular dynamics: Generalizing Hamiltonian phase space principles to non-Hamiltonian systems,” J. Chem. Phys., 115, 1678, 2001.
[17] W.G. Hoover, “Canonical dynamics – equilibrium phase space distributions,” Phys. Rev. A, 31, 1695, 1985.
[18] M.E. Tuckerman, B.J. Berne, G.J. Martyna, and M.L. Klein, “Efficient molecular dynamics and hybrid Monte Carlo algorithms for path integrals,” J. Chem. Phys., 99, 2796, 1993.
[19] M.E. Tuckerman and G.J. Martyna, Comment on “Simple reversible molecular dynamics algorithms for Nosé–Hoover chain dynamics,” J. Chem. Phys., 110, 3623, 1999.
[20] H. Yoshida, “Construction of higher-order symplectic integrators,” Phys. Lett. A, 150, 262, 1990.
[21] M. Suzuki, “General-theory of fractal path-integrals with applications to many-body theories and statistical physics,” J. Math. Phys., 32, 400, 1991.
[22] G.J. Martyna, D.J. Tobias, and M.L. Klein, “Constant-pressure molecular-dynamics algorithms,” J. Chem. Phys., 101, 4177, 1994.
[23] M. Parrinello and A. Rahman, “Crystal-structure and pair potentials – a molecular-dynamics study,” Phys. Rev. Lett., 45, 1196, 1980.
[24] J.P. Ryckaert, G. Ciccotti, and H.J.C. Berendsen, “Numerical-integration of cartesian equations of motion of a system with constraints – molecular-dynamics of n-alkanes,” J. Comput. Phys., 23, 327, 1977.
2.10 BASIC MONTE CARLO MODELS: EQUILIBRIUM AND KINETICS

George Gilmer¹ and Sidney Yip²
¹ Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94550 USA
² Department of Nuclear Science and Engineering and Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139 USA

1. Monte Carlo Simulations in Statistical Physics
Monte Carlo (MC) is a very general computational technique that can be used to carry out sampling of distributions. Random numbers are employed in the sampling, and often in other parts of the code. One definition of MC, based on common usage in the literature, is any calculation that involves significant applications of random numbers. Historical accounts place the naming of this method in March 1947, when Metropolis suggested it for his method of evaluating the equilibrium properties of atomic systems, and this is the application that we will discuss in this section [1]. An important sampling technique is the one named after Metropolis, which we will describe below.

There are several areas of computation besides the statistical mechanics of atomic systems where MC is used. An efficient method for the numerical evaluation of many-dimensional integrals is to apply random sampling techniques to the integrand [2]. A second application is the simulation of random walk diffusion processes in statistical mechanics and condensed matter physics [3]. Tracking particles and radiation (neutrons, photons, charged particles) during transport in non-equilibrium systems is another important area [4–7]. Models for crystal growth, ion implantation, radiation damage and other non-equilibrium systems often make use of random numbers. For example, in most of the MC models of ion implantation, the positions where the ions impinge on the surface of the target are selected using random numbers, whereas the trajectories of ions and target atoms are calculated deterministically using atomic collision theory. In models with diffusion, such as crystal growth and the annealing of radiation damage, the direction in which to move a particle or defect performing a random walk is determined by random numbers.

S. Yip (ed.), Handbook of Materials Modeling, 613–628. © 2005 Springer. Printed in the Netherlands.
1.1. Metropolis Sampling
In statistical physics one can find the average of a property $A(\{r\})$ that is a function of the coordinates $\{r\}$ of N particles, in a system that is in thermodynamic equilibrium,

$$ \langle A \rangle = \frac{\int d^{3N}r\, A(\{r\}) \exp[-U(\{r\})/kT]}{\int d^{3N}r\, \exp[-U(\{r\})/kT]}. \tag{1} $$
The calculation involves averaging the dynamical variable of interest, A, which depends on the positions of all the particles in the system, over an appropriate thermodynamic ensemble. Often the canonical ensemble is chosen: one with a fixed number of particles, volume and temperature, N, V, and T. In this case the configurations are weighted by the Boltzmann factor $\exp[-U(\{r\})/kT]$, where U is the potential energy of the system, and k the Boltzmann constant. Integration is over the positions of all particles (3N coordinates). The denominator in Eq. (1) is needed for normalization, and is an important quantity in its own right, because the Helmholtz free energy can be obtained from it (for a system with the independent variables N, V, and T).

We consider two ways to perform the indicated integral. It is clearly overkill to integrate over all of configuration space, because the integral is 3N-dimensional, where N may have values of thousands or millions. The selection of some representative points seems like a reasonable alternative. One approach is to sample distinct configurations randomly and then obtain $\langle A \rangle$ by approximating Eq. (1) by a sum over a set of configurations,

$$ \langle A \rangle = \frac{\sum_{i=1} A(\{r\}_i) \exp[-U(\{r\}_i)/kT]}{\sum_{i=1} \exp[-U(\{r\}_i)/kT]}. \tag{2} $$
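As a rough illustration of Eq. (2), the sketch below applies uniform random sampling to a toy system chosen here for convenience: a single coordinate in the well U(x) = x²/2 at kT = 1, for which the exact canonical average of x² equals kT. The function `naive_average`, the sampling interval and the sample count are all assumptions made for this example.

```python
import math
import random


def naive_average(prop, energy, kT, half_width, n_samples, seed=0):
    """Estimate <A> from Eq. (2) with configurations drawn uniformly at random."""
    rng = random.Random(seed)
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_samples):
        x = half_width * (2.0 * rng.random() - 1.0)   # uniform in (-L, L)
        weight = math.exp(-energy(x) / kT)            # Boltzmann factor
        numerator += prop(x) * weight
        denominator += weight
    return numerator / denominator


# Toy system (an assumption): U(x) = x^2 / 2 at kT = 1, property A = x^2.
estimate = naive_average(lambda x: x * x, lambda x: 0.5 * x * x,
                         kT=1.0, half_width=10.0, n_samples=200_000)
print(estimate)
```

Even in one dimension most of the uniformly drawn points fall where the Boltzmann factor is negligible, and this waste grows rapidly with the number of coordinates.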
The configurations could be selected by use of a random number generator. One could obtain coordinates to assign to the N atoms in the cell using a sequence of numbers $\xi_i$ that are uniform in the range (0, 1), and scaling 3N values of $\xi$ by the edge lengths of the rectangular computational cell. However, this procedure would also be grossly inefficient. In a solid or liquid system, many of the atoms in such a random configuration would be overlapping, giving a huge potential energy, and hence a negligible weight, $\exp[-U(\{r\})/kT]$, in the sampling procedure. The net result is that only a small fraction of “low energy” configurations would determine the value of $\langle A \rangle$, and even these configurations would likely have potential energies much larger than the actual value $\langle U \rangle$.

To get around this difficulty, a second approach may be used, where the sampled configurations are picked in a way that is biased by the probability that they will appear in the equilibrium ensemble, i.e., using the factor $\exp[-U(\{r\})/kT]$. Then $\langle A \rangle$ is determined by weighting the contribution from each configuration equally, since the bias in the selection of configurations accounts for the Boltzmann weighting factor,

$$ \langle A \rangle = \frac{\sum_{i=1}^{C_n} A(\{r\}_i)}{\sum_{i=1}^{C_n} \delta_{ii}}, \tag{3} $$

where $\{r\}_i$ are configurations sampled from the biased distribution, as indicated by $C_n$ above the summation sign. (The denominator is simply the number of states summed over.) How does one do this biased summation? One way is to adopt a procedure developed by Metropolis et al. in 1953 [8]. This procedure is an example of the concept of importance sampling in MC methods [9].
1.2. Metropolis Sampling
One option for obtaining a set of configurations biased by $\exp[-U(\{r\})/kT]$ is to take small excursions from an initial configuration that has a low energy $U(\{r\})$. The initial coordinates could be the coordinates of N atoms in a perfect crystalline lattice structure at 0 K. Then, an atom is picked at random, and given a displacement that is small enough that the atom will not approach a neighbor too closely, and yet large enough to produce a significant change in the system energy. Let the initial position of the particle be (x, y, z). Imagine now displacing the particle from its initial position to a trial position $(x + \alpha\xi_x,\ y + \alpha\xi_y,\ z + \alpha\xi_z)$, where $\alpha$ is a constant, and $\xi_x$, $\xi_y$, and $\xi_z$ are uniform in the interval (−1, 1). The value of $\alpha$ for obtaining the optimum sampling of phase space depends on the conditions, including density and T, among others. This could be determined from a preliminary run, or optimized as the simulation proceeds. With this move the system goes from configuration $\{r\}_j \to \{r\}_{j+1}$. The Metropolis procedure now consists of four steps.

1. Move the system in the way just described.
2. Calculate $\Delta U = U(\text{final}) - U(\text{initial}) = U_{j+1} - U_j$, i.e., $\Delta U$ is the energy change resulting from the move.
3. If $\Delta U < 0$, accept the move. This means leaving the particle in its new position.
4. If $\Delta U > 0$, accept the move provided $\xi < \exp[-\Delta U/kT]$, where $\xi$ is a fourth random number in the interval (0, 1).

The Metropolis sampling technique generates a series of configurations, each of which is closely related to the previous one. This is true because of the small change in the total configuration effected by the displacement of only one atom. The series is, however, a Markov chain, since it satisfies the condition that the new configuration is derived from the previous one, without taking into account the history of states before it. This is very different from molecular dynamics (MD) simulations, where the momentum of the particles plays an important role in determining the configuration of the next iteration. Figure 1 shows a schematic of the states generated by the Metropolis algorithm for a lattice gas, modeling a group of adatoms on the (100) face of a crystal. The elementary move in this model is a diffusion hop of an atom to a neighboring lattice site, and clearly the four hops in this series left much of the system unchanged.

Figure 1. Illustration of the chain of states i, i+1, ..., i+4 (a Markov chain in “time”) created by the Metropolis algorithm for a model of a group of adatoms on a crystal surface. Each state differs from the one preceding it by the displacement of one atom to a neighboring lattice site.

We see that it is the uphill moves of step 4 that account for the effect of temperature on the distribution of the system over the energy states. High temperature increases the magnitude of the Boltzmann factor, and therefore the probability of acceptance of moves that increase the energy of the system. If not for step 4, step 3 would only allow the system to go downhill in energy U, which would mean that the system of atoms would lose potential energy systematically and end up in a local energy minimum.
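The four steps can be sketched for a toy problem, assumed here for illustration: a single coordinate in the well U(x) = x²/2 at kT = 1, where the exact average of x² is kT. The function `metropolis_chain` and its parameters are hypothetical. The sketch also makes explicit a detail that the steps leave implicit: when a move is rejected, the old configuration is counted again in the chain.

```python
import math
import random


def metropolis_chain(energy, x0, alpha, kT, n_steps, seed=0):
    """Generate a Markov chain of 1D configurations with the four steps above."""
    rng = random.Random(seed)
    x, u = x0, energy(x0)
    chain = []
    for _ in range(n_steps):
        x_trial = x + alpha * (2.0 * rng.random() - 1.0)   # step 1: trial move
        delta_u = energy(x_trial) - u                      # step 2: energy change
        # Steps 3 and 4: always accept downhill moves; accept uphill moves
        # with probability exp(-delta_u / kT).
        if delta_u <= 0.0 or rng.random() < math.exp(-delta_u / kT):
            x, u = x_trial, u + delta_u
        # A rejected move leaves x unchanged, and the old state is counted again.
        chain.append(x)
    return chain


chain = metropolis_chain(lambda x: 0.5 * x * x, x0=0.0, alpha=1.5, kT=1.0,
                         n_steps=200_000)
mean_x2 = sum(x * x for x in chain) / len(chain)   # Eq. (3) with A = x^2
print(mean_x2)
```

Because successive configurations are correlated, the statistical error of such an average decreases more slowly than it would for the same number of independent samples.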
2. Proof that Metropolis Sampling Results in a Canonical Ensemble
One can show that the Metropolis procedure allows one to sample the distribution of states biased by $\exp[-U/kT]$. Consider two states (configurations) of the system, i and j, and let $U_i > U_j$. According to the Metropolis procedure, the probability of an (i → j) transition is $P_i \nu_{ij}$, where $P_i$ is the probability that the system is in state i, and $\nu_{ij}$ is the transition probability that a system in state i will go to state j. Similarly, the probability of a (j → i) transition is $P_j \nu_{ij} \exp[-(U_i - U_j)/kT]$, where we have used the fact that $\nu_{ji} = \nu_{ij} \exp[-(U_i - U_j)/kT]$ according to the Metropolis procedure described above. At equilibrium the two transitions must have equal probabilities; otherwise the populations of some states in the ensemble would be increasing, others decreasing, and the system would not be in equilibrium. This is the principle of microscopic reversibility, or detailed balance. Figure 2 shows an example of this for a lattice model of an atomic system. Thus, equating the probability of an (i → j) transition to that for the reverse transition, we find

$$ P_i = P_j \exp[-(U_i - U_j)/kT] \quad \text{or} \quad P_i = C \exp[-U_i/kT] \ \text{ and } \ P_j = C \exp[-U_j/kT], \tag{4} $$

where C is a normalization constant.

Figure 2. The microscopic reversibility condition, $\nu_{ij} \exp(-U_i/kT) = \nu_{ji} \exp(-U_j/kT)$, on the transition rates (or probabilities) between two states i and j. This condition is necessary to ensure that there is an equilibrium state for the system.

Whereas Eq. (4) relates the probability of finding the ensemble in state i to that for state j, based on the direct transitions between the two states, it also applies to states without direct transitions. Of course, a system can reach internal equilibrium only if there is a sequence of states, connected by direct transitions, between any two states in the system. That is, all of the states are interconnected. Any model that does not satisfy this condition will have isolated pockets of states in phase space that will not equilibrate with each other. But a system of states that are interconnected in this way will have all states satisfying Eq. (4), which is the canonical ensemble. This completes the proof of the Metropolis sampling method. Stated again, the Metropolis method is an efficient way to sample states of the system with a bias equal to the Boltzmann factor, which has the same form as the canonical distribution in thermodynamics.

It is worthwhile to note that this method can be used in optimization problems, where one is interested in finding the global minimum of a function of many parameters. One example is to calculate the optimum arrangement of the components of a silicon device to minimize the path length of electrical interconnect lines. The analog of energy is the total length of the conducting lines. The method has an advantage over standard energy minimization methods such as the conjugate gradient procedure, because it allows the system energy (length of interconnect lines) to increase occasionally in the search for the global minimum. This feature allows it to surmount energy barriers and visit more than one local minimum. The approach to optimization problems is similar to that used to find the global minimum in the energy of an atomic system. A large initial “annealing temperature” is chosen, since this allows the system to pass between local minima. The “temperature” is then reduced in steps for annealing until eventually reaching zero temperature and a minimum, hopefully the global minimum, and the desired optimum value. This is the basis of the “simulated annealing” algorithm used for optimization problems [10].
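The detailed-balance argument leading to Eq. (4) can be checked numerically on a small model. The sketch below assumes a hypothetical three-state system with arbitrary energies and a symmetric proposal that picks either of the other two states with equal probability; it builds the Metropolis transition matrix, verifies that the Boltzmann distribution satisfies detailed balance, and shows that repeated transitions drive an arbitrary starting distribution to it.

```python
import math


def metropolis_matrix(energies, kT):
    """Transition matrix for a symmetric proposal among all other states."""
    n = len(energies)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                accept = min(1.0, math.exp(-(energies[j] - energies[i]) / kT))
                t[i][j] = accept / (n - 1)
        t[i][i] = 1.0 - sum(t[i])   # probability of staying (rejected moves)
    return t


energies, kT = [0.0, 1.0, 2.5], 1.0     # hypothetical three-state model
t = metropolis_matrix(energies, kT)
z = sum(math.exp(-u / kT) for u in energies)
boltzmann = [math.exp(-u / kT) / z for u in energies]

# Detailed balance: P_i * nu_ij equals P_j * nu_ji for every pair of states.
max_violation = max(abs(boltzmann[i] * t[i][j] - boltzmann[j] * t[j][i])
                    for i in range(3) for j in range(3))

# Repeated application of the transition matrix drives any start toward Eq. (4).
p = [1.0, 0.0, 0.0]
for _ in range(200):
    p = [sum(p[i] * t[i][j] for i in range(3)) for j in range(3)]
print(max_violation, p, boltzmann)
```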
3. Free Energy Calculations
As mentioned earlier, the Helmholtz free energy of an atomic system can be obtained from an integration of the Boltzmann factor over phase space, and this is given by

$$ F = -kT \ln \left\{ V^{-N} \int d^{3N}r\, \exp[-U(\{r\})/kT] \right\}, $$

or, in an unbiased sample equivalent to Eq. (2),

$$ F = -kT \ln \left\{ \frac{\sum_{i=1} \exp[-U(\{r\}_i)/kT]}{\sum_{i=1} \delta_{ii}} \right\}. \tag{5} $$
This result is not very helpful for obtaining F, however, for the same reason that Eq. (2) is not a useful way to average properties in a canonical ensemble. Again, the major contributions to the sum in Eq. (5) come from well-ordered configurations with atoms avoiding close encounters with their neighbors, whereas the random sampling approach will yield very few such low potential energy configurations. The Metropolis algorithm does not help either. In the biased sample derived from the Metropolis technique, the equivalent of Eq. (5) includes a term that cancels the bias in the sums in the numerator and denominator. For purposes of understanding we can assume that we sum over the same states, but that the number of times a given state is included in the Metropolis series, or its degeneracy, is proportional to the Boltzmann factor. That is, each state in the canonical ensemble has, in effect, been multiplied by $\exp[-U(\{r\}_i)/kT]$ because of the preferential choice of states with low potential energy. Therefore, to obtain the equivalent of Eq. (5) in the canonical ensemble sums, we simply multiply each term of the sums by $\exp[U(\{r\}_i)/kT]$, giving

$$ F = -kT \ln \left\{ \frac{\sum_{i=1}^{C_n} \delta_{ii}}{\sum_{i=1}^{C_n} \exp[U(\{r\}_i)/kT]} \right\} = -kT \ln \left\{ \frac{1}{\langle \exp[U(\{r\}_i)/kT] \rangle_{C_n}} \right\}. \tag{6} $$
Although the evaluation of $\langle \exp[U(\{r\}_i)/kT] \rangle_{C_n}$ by the Metropolis method is a valid way to obtain the free energy, it is also totally impractical. The sum in the denominator of the middle expression in Eq. (6) will not be evaluated accurately, since each term is large when the bias factor is small, and vice versa. Therefore all states are equally important for evaluating the average $\langle \exp[U(\{r\}_i)/kT] \rangle_{C_n}$, with the bias factor canceling the exponential in each term of the sum. Importance sampling fails because each term is equally important, even those from states that have essentially zero probability of appearing in the ensemble, since these terms are multiplied by the huge exponential $\exp[U(\{r\}_i)/kT]$.

One approach to calculating the free energy of a system of atoms is to relate it to a known reference system, e.g., a set of Einstein oscillators. If we define a potential energy $U(\{r\}_i) = \lambda U_1(\{r\}_i) + (1 - \lambda) U_0(\{r\}_i)$, then when $\lambda$ goes from 0 to 1, the potential energy goes from that corresponding to the interatomic potential for $U_0(\{r\}_i)$ to that for $U_1(\{r\}_i)$. Differentiating Eq. (5) with respect to $\lambda$, using our definition of $U(\{r\}_i)$, we obtain

$$ \frac{\partial F}{\partial \lambda} = \frac{\sum_{i=1} \left[ U_1(\{r\}_i) - U_0(\{r\}_i) \right] \exp[-U(\{r\}_i)/kT]}{\sum_{i=1} \exp[-U(\{r\}_i)/kT]}, \tag{7} $$

or

$$ \frac{\partial F}{\partial \lambda} = \langle U_1(\{r\}_i) - U_0(\{r\}_i) \rangle_{C_n}, \tag{8} $$
where the sampling in Eq. (8) is over an ensemble weighted with the Boltzmann factor $\exp[-\{\lambda U_1(\{r\}_i) + (1 - \lambda) U_0(\{r\}_i)\}/kT]$. Integration of the derivative of F with respect to $\lambda$ then gives the change in F between the reference state and the state with the desired configuration.

Another method known as “umbrella sampling” has been used in situations where it is desired to compare two systems with almost identical interatomic potentials, or with slightly different temperatures [11, 12]. If the interatomic potential is changed only a small amount, $\Delta U(\{r\}_i) = U(\{r\}_i) - U_0(\{r\}_i)$, then it may be possible to make accurate calculations of the differences in the free energies or other properties A in a single Metropolis MC run. One then chooses an “unphysical” bias factor, $\exp[-U_{\mathrm{UMB}}(\{r\}_i)/kT]$, that will, ideally, reproduce the minimum values of both $U_0(\{r\}_i)$ and $U(\{r\}_i)$. Then $\langle A \rangle_0$ is given by

$$ \langle A \rangle_0 = \frac{\sum_{i=1}^{C_n^{\mathrm{UMB}}} A(\{r\}_i) \exp[(-U_0 + U_{\mathrm{UMB}})/kT]}{\sum_{i=1}^{C_n^{\mathrm{UMB}}} \delta_{ii} \exp[(-U_0 + U_{\mathrm{UMB}})/kT]}, \tag{9} $$
as discussed in Ref. [8]. Comparing Eq. (9) with Eq. (3) and the discussion following it, we see that the modified Metropolis method generates only one set of configurations, based on the bias potential, but that the average value of A must be calculated from these configurations weighted by the appropriate exponential, as shown in Eq. (9). An analogous expression holds for the average of A for the interatomic potential giving $U(\{r\}_i)$. Accurate results are only obtained for
small differences in the potential, and if the size of the atomic configuration is less than several hundred atoms. The choice of bias functions is also crucial for accurate results. But the selection of these functions usually requires some laborious trial-and-error runs. A more complete discussion on methods to obtain free energy differences is given in Chapter 2.15 by de Koning and Reinhardt. MC methods have a number of advantages over MD for obtaining free energies and other equilibrium properties. The ability to bias the sampling process and transition rates while retaining the conditions for an equilibrium ensemble provides some powerful methodologies. One of these applies to the evaluation of the properties of metastable and other defects such as dislocations, surfaces, and interfaces. Because of the small number of atoms involved compared to the total number in the system, statistical noise from the fluctuations in the bulk system will interfere with the measurement of the relatively small impact of the defect on the properties of the atomic system. MC methods allow the concentration of events on the region around the defect being investigated, while retaining the essential condition of microscopic reversibility. In this way, slowly relaxing regions can be allowed to approach a metastable equilibrium without spending most of the computer time on a less important part of the system. Slow structural rearrangements can be accommodated at the interface, without spending computer power simulating the uninteresting parts of the system as they perform their equilibrium fluctuations. MD simulations tend to be more efficient computationally than MC in the case where a system of atoms is being equilibrated at a new temperature or some other change in its conditions is implemented. The advantage for MD results from the fact that the displacements of the atoms during an MD time step are quite different from those discussed earlier for the MC methods. 
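The λ-integration of Eqs. (7) and (8) can be illustrated with a minimal sketch on a toy problem. The setup is an assumption made for illustration only: a single particle in one dimension, a reference potential U0 = k0 x²/2 (one Einstein oscillator) and a target potential U1 = k1 x²/2. For this toy the exact answer is ΔF = (kT/2) ln(k1/k0), which gives a check on the estimate. The function names, step size, number of λ points, and run lengths are all hypothetical choices.

```python
import math
import random


def mean_dU(lam, k0, k1, kT, n_steps, rng, step=1.0, n_burn=1000):
    """Metropolis estimate of <U1 - U0> sampled with U = lam*U1 + (1-lam)*U0."""
    k_lam = lam * k1 + (1.0 - lam) * k0      # U(x) = 0.5 * k_lam * x**2
    x, total = 0.0, 0.0
    for i in range(n_burn + n_steps):
        x_new = x + rng.uniform(-step, step)  # unbiased trial displacement
        dU = 0.5 * k_lam * (x_new * x_new - x * x)
        if dU <= 0.0 or rng.random() < math.exp(-dU / kT):
            x = x_new                         # accept the move
        if i >= n_burn:
            total += 0.5 * (k1 - k0) * x * x  # U1(x) - U0(x)
    return total / n_steps


def thermodynamic_integration(k0=1.0, k1=4.0, kT=1.0, n_lambda=11,
                              n_steps=20000, seed=1):
    """Integrate dF/dlambda (Eq. 8) over lambda with the trapezoid rule."""
    rng = random.Random(seed)
    lams = [i / (n_lambda - 1) for i in range(n_lambda)]
    dF_dlam = [mean_dU(lam, k0, k1, kT, n_steps, rng) for lam in lams]
    h = lams[1] - lams[0]
    return h * (sum(dF_dlam) - 0.5 * (dF_dlam[0] + dF_dlam[-1]))


delta_F = thermodynamic_integration()
exact = 0.5 * math.log(4.0)   # (kT/2) ln(k1/k0) for the toy parameters
print(f"MC estimate {delta_F:.3f}, exact {exact:.3f}")
```

Each λ point is an ordinary Metropolis run, so importance sampling works well at every stage; only the final integration over λ connects the reference system to the system of interest.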
With classical MC, a displacement of a particle has nothing to do with the environment of the particle, but is chosen by random numbers along the three orthogonal coordinate axes. A particle that is close to a neighbor and therefore in a strong repulsive force field may be given a displacement moving it even closer. Such a move will likely cause a large increase in energy and be rejected, but the cost of generating the random numbers for the unsuccessful move affects the efficiency of the process. Furthermore, coordinated moves of a number of particles such as those moving into a region of reduced pressure are not possible with Metropolis MC, whereas their presence in MD allows fast relaxation of a pressure pulse or recovery from artificial initial conditions. Force bias MC was developed to speed up MC relaxation of atomic systems [13]. In this technique, atomic displacements with a large component in the same direction as the force on an atom are selected preferentially to those that are mainly in a direction orthogonal to the force. To maintain microscopic reversibility, atoms moving against the force must also be given a larger selection probability, but
since they are likely to be moving uphill in energy and to have their move rejected, the result is that more atoms move in the desired direction. This technique is found to be effective and to increase the speed of relaxation in many MC systems. But the calculation of the forces requires extra computer time, so that some applications are still faster if done by basic MC methods [13]. In cases where the flexibility of the MC technique provides strong advantages, it is likely to be advantageous to implement the force-bias algorithm.
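A one-dimensional sketch may help make the force-bias idea concrete. It is not the implementation of Ref. [13]; it is a generic Metropolis-Hastings construction under stated assumptions: a harmonic potential U = k x²/2, trial displacements d in [-delta, delta] drawn with density proportional to exp(lam·F·d/kT), and an acceptance test that includes the ratio of reverse to forward selection probabilities so that microscopic reversibility is retained. The bias strength `lam = 0.5` and all names are illustrative assumptions.

```python
import math
import random


def sample_biased_step(a, delta, rng):
    """Draw d in [-delta, delta] with density proportional to exp(a*d)."""
    u = rng.random()
    if abs(a) < 1e-12:
        return (2.0 * u - 1.0) * delta        # no force: uniform step
    lo, hi = math.exp(-a * delta), math.exp(a * delta)
    return math.log(lo + u * (hi - lo)) / a   # inverse of the cumulative distribution


def log_norm(a, delta):
    """Log of the integral of exp(a*d) over [-delta, delta]."""
    if abs(a) < 1e-12:
        return math.log(2.0 * delta)
    return math.log(2.0 * math.sinh(abs(a) * delta) / abs(a))


def force_bias_mc(n_steps=50000, k=1.0, kT=1.0, delta=1.0, lam=0.5, seed=2):
    """Return <x^2> for U = 0.5*k*x^2 sampled with force-biased trial moves."""
    rng = random.Random(seed)
    beta = 1.0 / kT
    x, sum_x2 = 0.0, 0.0
    for _ in range(n_steps):
        a_fwd = lam * beta * (-k * x)          # bias from the force at x
        d = sample_biased_step(a_fwd, delta, rng)
        x_new = x + d
        a_rev = lam * beta * (-k * x_new)      # bias governing the reverse move
        log_acc = (-beta * 0.5 * k * (x_new * x_new - x * x)
                   + (-a_rev * d - log_norm(a_rev, delta))   # log T(x_new -> x)
                   - (a_fwd * d - log_norm(a_fwd, delta)))   # log T(x -> x_new)
        if log_acc >= 0.0 or rng.random() < math.exp(log_acc):
            x = x_new
        sum_x2 += x * x
    return sum_x2 / n_steps


mean_x2 = force_bias_mc()
print(f"<x^2> = {mean_x2:.3f} (equipartition value kT/k = 1.0)")
```

Because the selection bias is compensated in the acceptance test, the sampled distribution is still the canonical one; only the rate of relaxation toward it changes, at the price of one force evaluation per trial move.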
4. Kinetic Interpretation of MC [6]
The Metropolis algorithm was developed primarily for obtaining equilibrium properties of a physical system. Strictly speaking, however, the method never reaches a complete equilibrium condition; that is, an ensemble in which states appear with the probability $P_i = C \exp[-U_i/kT]$. Consider the behavior of an infinite ensemble, i.e., an infinite number of identical computational cells, all starting in the same state but run with different random number sequences. Calculate the ensemble average properties $\langle A \rangle_i$ at each MC step i, starting with the initial state i = 0. In other words, we obtain the average of the system property A by averaging over the computational cells composing the ensemble after each MC event. This differs from the usual procedure, where A is averaged over the successive states of a single computational cell generated by the Metropolis method. The ensemble average $\langle A \rangle_i$ will initially have properties similar to the initial state, $\langle A \rangle_0$, since most of the atoms will be in the same position as in the starting state. Unless the initial state has very unusual properties, $\langle A \rangle_i$ will change its value as i increases, and eventually approach an asymptotic value corresponding to equilibrium, with $P_i = C \exp[-U_i/kT]$.

The approach to the equilibrium ensemble is a property of the system “kinetics,” and depends strongly on the probabilities for transitions between states, $\nu_{ij}$. The $\nu_{ij}$ can be thought of as transition rates, in which case the approach to the equilibrium ensemble can be plotted as a function of time instead of MC event number i. A transition with ΔU < 0 has the highest probability $\nu_{ij}$, and would correspond to the highest transition rate. However, transition rates proportional to the Metropolis transition probabilities are unphysical, and would not yield the kinetics of any real system. To model real kinetics, it is necessary to obtain rate constants for atomic diffusion, chemical reactions, and other unit mechanisms that are relevant for the physical system being studied. These may be obtained by the use of interatomic potentials in molecular dynamics simulations as discussed in preceding chapters, or from molecular dynamics or saddle point evaluations using density functional theory as discussed in Chapter 1.
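The relaxation of an infinite ensemble can be followed exactly for a small system, because the ensemble is then just a probability vector that is multiplied by the Metropolis transition matrix at every MC step. The sketch below assumes a hypothetical three-state system in which a trial move proposes either of the other two states with equal probability; the energies and the number of steps are illustrative only.

```python
import math


def metropolis_matrix(energies, kT):
    """Transition probabilities nu_ij for one Metropolis step (toy system)."""
    n = len(energies)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dU = energies[j] - energies[i]
                # propose j with probability 1/(n-1), accept with min(1, e^{-dU/kT})
                T[i][j] = min(1.0, math.exp(-dU / kT)) / (n - 1)
        T[i][i] = 1.0 - sum(T[i])     # rejected moves leave the state unchanged
    return T


def evolve(p, T):
    """Advance the ensemble probabilities by one MC step."""
    n = len(p)
    return [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]


energies = [0.0, 0.5, 1.0]            # hypothetical state energies, units of kT
kT = 1.0
T = metropolis_matrix(energies, kT)

p = [0.0, 0.0, 1.0]                   # every cell starts in the highest state
history = [sum(pi * u for pi, u in zip(p, energies))]   # ensemble average of U
for step in range(50):
    p = evolve(p, T)
    history.append(sum(pi * u for pi, u in zip(p, energies)))

Z = sum(math.exp(-u / kT) for u in energies)
p_eq = [math.exp(-u / kT) / Z for u in energies]
print("after 50 steps:", [round(v, 6) for v in p])
print("Boltzmann     :", [round(v, 6) for v in p_eq])
```

The list `history` holds the ensemble average of U at each step. How quickly it settles depends entirely on the transition probabilities, which is why Metropolis probabilities give correct equilibrium averages but not realistic kinetics.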
Kinetic Monte Carlo (KMC) is similar to equilibrium MC, but with transition rates appropriate for real systems. It can be applied both to equilibrium conditions and to conditions where the system is out of equilibrium. In order to distinguish KMC from equilibrium MC, we will use different terminology. Let P(x, t) be the probability that the system configuration is x at time t. Note that the configuration previously represented by $\{r\}_i$ is now simply x. Then P(x, t) satisfies the equation

$$\frac{dP(x,t)}{dt} = -\sum_{x'} W(x \to x') P(x,t) + \sum_{x'} W(x' \to x) P(x',t), \tag{10}$$

where $W(x \to x')$ is the transition probability per unit time of going from x to x′ (W is analogous to $\nu_{ij}$ in the Metropolis method above). Equation (10) is called the Master equation. For the system to be able to reach equilibrium, as discussed above, the transition probabilities must satisfy the condition of microscopic reversibility (cf. Eq. (4)),

$$P_{\mathrm{eq}}(x) W(x \to x') = P_{\mathrm{eq}}(x') W(x' \to x). \tag{11}$$

At equilibrium, $P(x,t) = P_{\mathrm{eq}}(x)$ and $dP(x,t)/dt = 0$. Since the probability of occupying state x is

$$P_{\mathrm{eq}}(x) = \frac{1}{Z} \exp[-U(x)/kT], \tag{12}$$

where Z is the partition function, $Z = \sum_i \exp[-U(\{r\}_i)/kT]$, and Eq. (11) gives the basic condition that must be satisfied by the transition probabilities imposed by microscopic reversibility, we have

$$\frac{W(x \to x')}{W(x' \to x)} = \exp[\{U(x) - U(x')\}/kT]. \tag{13}$$
Equation (13) is satisfied by the Metropolis procedure, but other transition rates also satisfy this condition. As we noted above, the Metropolis procedure is unphysical, but real systems also have equilibrium states when their transition rates satisfy Eq. (13).
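As a hedged illustration of a non-Metropolis choice that satisfies Eq. (13), the sketch below uses Arrhenius-type rates, W = ν exp[−(E_saddle − U(x))/kT], where each pair of states shares one saddle-point energy. The three states, their energies, the saddle energies, and the attempt frequency are hypothetical values chosen only for the demonstration. The code checks the ratio in Eq. (13) and evaluates the right-hand side of the Master equation, Eq. (10), at the Boltzmann distribution.

```python
import math

kT = 0.025                                   # eV, roughly room temperature (assumed)
U = [0.00, 0.10, 0.05]                       # state energies in eV (hypothetical)
saddle = {(0, 1): 0.40, (1, 2): 0.30, (0, 2): 0.50}   # saddle energies in eV
nu = 1.0e13                                  # attempt frequency in 1/s (assumed)


def rate(i, j):
    """Arrhenius rate W(i -> j) over the saddle shared by states i and j."""
    e_s = saddle[(min(i, j), max(i, j))]
    return nu * math.exp(-(e_s - U[i]) / kT)


def dP_dt(P):
    """Right-hand side of the Master equation, Eq. (10), for every state."""
    n = len(P)
    out = []
    for x in range(n):
        loss = sum(rate(x, y) * P[x] for y in range(n) if y != x)
        gain = sum(rate(y, x) * P[y] for y in range(n) if y != x)
        out.append(gain - loss)
    return out


Z = sum(math.exp(-u / kT) for u in U)
P_eq = [math.exp(-u / kT) / Z for u in U]

ratio = rate(0, 1) / rate(1, 0)
expected = math.exp((U[0] - U[1]) / kT)      # right-hand side of Eq. (13)
print("W(0->1)/W(1->0) =", ratio, " exp[(U0-U1)/kT] =", expected)
print("dP/dt at equilibrium:", dP_dt(P_eq))
print("dP/dt away from equilibrium:", dP_dt([1.0, 0.0, 0.0]))
```

The saddle energy cancels in the ratio of forward to reverse rates, so the equilibrium distribution is independent of the barriers, while the time needed to reach it is not.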
5. Lattice MC: Crystal Growth
Kinetic Monte Carlo models of thin film and crystal growth are often based on the simplification of the lattice model, where atoms are confined to lattice sites on a perfect crystal lattice. We introduced a simple case in Fig. 1, where we discussed a model of a group of atoms diffusing on a crystal surface, and the model consisted of moving the atoms between lattice sites corresponding to a square array of binding sites on an fcc(100) substrate.
The potential energies of the KMC lattice gas model (KMC LG) can be obtained from empirical interatomic potentials developed for MD simulations, or from simple bond-counting methods if the properties of the model are not required to match experiments. Usually the interactions are limited to nearest neighbors, although the embedded atom potentials have an effective range that is greater than the cut-off value because of indirect interactions through the embedding function. Thus, a potential that has an embedding function and pair interaction limited to first neighbors actually has interactions extending to second or third neighbors. Most KMC LG models do not account for stress fields, and as a result the potential energies U(x) take on discrete values. The Boltzmann factors for the allowed displacements can then be easily tabulated for computational efficiency.

The efficiency of KMC LG models depends on the disparity of the different atomic displacement rates. The example of vapor deposition onto a crystal surface illustrates the possible effects of a large disparity. In the case of Al, the diffusion of an adatom to an adjacent site on a (111) surface requires crossing a potential energy barrier of less than 0.1 eV, according to first-principles calculations, implying a rate of approximately $10^{10}$ hops/s at room temperature. On the other hand, the deposition of atoms by sputtering gives an accumulation rate of only about 4 nm/s for the deposited material, or a rate of 20 atoms/s impinging on every surface site. Since the models are usually designed to measure film growth processes and morphologies, it is apparent that the simulations require runs corresponding to real deposition times on the order of a second or more. But it is also necessary to include all of the diffusion hops, which requires spending a large fraction of the computer time on moving adatoms around on the surface.
However, the capability for performing such simulations has been increasing dramatically, both as a result of cheaper computational power, and because of new algorithms that dramatically speed up the simulations. Techniques are being developed to model random walk diffusion processes, without the necessity of simulating explicitly each of the millions of diffusion hops, by making use of the known properties of random walk diffusion processes [14]. In addition, there are several methods that handle highly disparate events without the inefficiency of spending computer time calculating moves that subsequently get rejected, as in the case of the Metropolis algorithm [15–17]. Methods to treat systems with long-range correlations efficiently have also been developed [18].
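The rejection-free idea behind the event-driven methods cited above can be sketched in a few lines: every step executes one event, chosen with probability proportional to its rate, and the clock advances by an exponentially distributed increment with mean 1/(total rate). This is a generic residence-time sketch, not a transcription of any one of Refs. [15–17]. The two-event system is hypothetical, and the hop rate is deliberately reduced to 10³ s⁻¹ (rather than the roughly 10¹⁰ s⁻¹ quoted in the text) so that depositions appear in a short run.

```python
import math
import random


def kmc_step(rates, rng):
    """Select one event with probability rate/total and draw the time increment."""
    total = sum(rates)
    target = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1            # fallback guards against round-off
    for k, r in enumerate(rates):
        cumulative += r
        if target < cumulative:
            chosen = k
            break
    dt = -math.log(1.0 - rng.random()) / total   # exponential waiting time
    return chosen, dt


rng = random.Random(3)
rates = [1.0e3, 20.0]      # [adatom hop, deposition] in 1/s; hop rate reduced for the demo
counts = [0, 0]
t = 0.0
n_events = 100000
for _ in range(n_events):
    event, dt = kmc_step(rates, rng)
    counts[event] += 1
    t += dt

print(f"simulated time {t:.2f} s, hops {counts[0]}, depositions {counts[1]}")
print(f"hops per deposition at the rates quoted in the text: {1.0e10 / 20.0:.1e}")
```

No trial move is ever rejected, but the last line shows why disparity still matters: at the quoted rates about 5 × 10⁸ hops must be executed for every deposited atom on a site.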
6. Off-lattice KMC: Ion Implantation and Radiation Damage
The implantation of dopant ions into silicon wafers is the primary means to insert the electrically active atoms during the manufacturing of silicon
devices. Atomistic models of this process have received much attention recently because of the decreasing size of silicon device components. Atomistic effects are becoming important since fluctuations in dopant atoms may degrade uniformity in device properties, and control of the distribution of the dopant atoms is becoming more critical. Two distinct models are required for the simulations. The first describes the entry of the energetic ions into the crystal, together with the damage resulting from silicon atoms displaced from their lattice sites. Although these models, for example, MARLOWE [19], involve some use of random numbers as mentioned above, most of the computer time is spent calculating the collisions of the energetic particles with the silicon atoms. After the ions are implanted, the wafer is usually annealed to reduce the damage and improve the electrical properties of the device. The second model must therefore simulate several types of defects and dopant atoms diffusing through the crystal. Vacancies and interstitials are the two main defects, although the diffusion of complexes such as interstitial-dopant and vacancy-dopant pairs, interstitial dimers, divacancies, and larger clusters can have a significant influence on the redistribution and clustering of dopant atoms. A rather complex set of events can be simulated by the off-lattice KMC (KMC OL) method. In these simulations, the defects and clusters diffuse through a complex path of saddle points and potential energy minima; only the vacancy spends most of its time on lattice sites. Furthermore, the exact path of the diffusing species as a function of time is not particularly important for the KMC OL simulation, although it is essential for the more detailed first-principles calculations used to calculate overall diffusion rates.
The crucial parameters for KMC OL are the binding energies between defects and dopant atoms and their mobilities, the defect–defect binding energies, cross-sections for capture, and the recombination cross-section for vacancies and interstitials. Fortunately, there have been a number of first-principles calculations for these parameters, at least for the smaller clusters and defects. As in the case of surface diffusion, the disparity of diffusion rates is quite large, and it is essential to employ efficient algorithms for the simulations.

An example of the complexity of the simulations is given in Fig. 3, where we show model calculations of the relatively simple case of the implantation of silicon ions into a silicon target using the DADOS simulator [20]. Silicon ions (5 keV of kinetic energy) are implanted into perfect crystalline silicon, dislodging some silicon atoms from their lattice sites and creating vacancies (dark spheres) and interstitials (grey spheres). Figure 3(a) shows the high concentration of defects after implantation at room temperature, with many vacancy-interstitial pairs created by the energetic ions. After a few seconds of annealing, Fig. 3(b), a large number of point defects have recombined, leaving an excess of interstitials corresponding to the implanted ions. The excess interstitials gradually aggregate and form {311} defects, Fig. 3(c) and (d). Note that the simulation does not predict the structure of the interstitial clusters, because of the off-lattice nature of the model. The {311} structure of the defects is inserted into the model since it is important for the point-defect cluster interactions and cross-sections. As the defects diffuse and recombine in the initial stages and, later, as the {311} defects emit and absorb interstitials during the ripening phase, a very large number of diffusion hops take place, demanding long KMC simulations. Eventually the interstitial clusters dissolve as the interstitial excess equilibrates with the surface.

Figure 3. Kinetic Monte Carlo results showing point defects in crystalline silicon after implantation of Si ions into perfect crystalline Si at room temperature, and during subsequent annealing at 800 °C [19]. Grey spheres represent interstitials, and dark ones vacancies; only the defects are shown. (a) corresponds to the defects after implantation at room temperature, (b) 1 s anneal, (c) 40 s anneal, and (d) 250 s anneal.
7. Simulation of Particle and Radiation Transport
MC is quite extensively used to track the individual particles as each moves through the medium of interest, streaming and colliding with the atomic constituents of the medium. To give a simple illustration, we consider the trajectory of a neutron as it enters a medium, as depicted in Fig. 4. Suppose the first interaction of this neutron is a scattering collision at point 1. After the scattering the neutron moves to point 2 where it is absorbed, causing a fission reaction which emits two neutrons and a photon. One of the neutrons streams to point 3 where it suffers a capture reaction with the emission of a photon, which in turn leaves the medium at point 6. The other neutron and the photon from the fission event both escape from the medium, to points 4 and 7, respectively, without undergoing any further collisions. By sampling a trajectory we mean the process in which one determines the position of point 1 where the scattering occurs, the outgoing neutron direction and its energy, the position of point 2 where fission occurs, the outgoing directions and energies of the two fission neutrons and the photon, etc. After tracking many such trajectories one can estimate the probability of a neutron penetrating the medium and the amount of energy deposited in the medium as a result of the reactions induced along the path of each trajectory. This is the kind of information that one needs in shielding calculations, where one wants to know how much material is needed to prevent the radiation (particles) from getting across the medium (a biological shield), or in dosimetry calculations where one wants to know how much energy is deposited in the medium (human tissue) by the radiation.

Figure 4. Schematic of a typical particle trajectory simulated by Monte Carlo. By repeating the simulation many times one obtains sufficient statistics to estimate the probability of radiation penetration in the case of shielding calculations, or the probability of energy deposition in the case of dosimetry problems.
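A much-reduced version of such a shielding calculation can be sketched as follows. The model is an assumption made for illustration and is far simpler than the neutron history described above: particles enter a homogeneous slab at normal incidence, travel a free path drawn from an exponential distribution, and at each collision are either scattered isotropically or absorbed; energy dependence, fission, and secondary photons are all omitted. The names `sigma_t` (total macroscopic cross-section) and `scatter_prob` are hypothetical parameters. For a pure absorber the transmitted fraction must equal exp(−sigma_t × thickness), which provides a check.

```python
import math
import random


def slab_transmission(thickness, sigma_t, scatter_prob, n_histories, seed=4):
    """Fraction of normally incident particles that leave the far face of a slab."""
    rng = random.Random(seed)
    transmitted = 0
    for _ in range(n_histories):
        x, mu = 0.0, 1.0                  # depth and direction cosine
        while True:
            s = -math.log(1.0 - rng.random()) / sigma_t   # free path to next collision
            x += mu * s
            if x >= thickness:
                transmitted += 1          # escaped through the far face
                break
            if x < 0.0:
                break                     # escaped back through the entry face
            if rng.random() >= scatter_prob:
                break                     # absorbed at the collision site
            mu = 2.0 * rng.random() - 1.0  # isotropic scattering: new direction cosine
    return transmitted / n_histories


t_abs = slab_transmission(2.0, 1.0, 0.0, 100000)    # pure absorber
t_scat = slab_transmission(2.0, 1.0, 0.5, 100000)   # half of the collisions scatter
print(f"pure absorber: {t_abs:.4f} (analytic {math.exp(-2.0):.4f})")
print(f"with scattering: {t_scat:.4f}")
```

The statistical uncertainty of the transmitted fraction falls off as the inverse square root of the number of histories, which is why deep-penetration problems with very small transmission probabilities require many trajectories.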
8. Comparison of MC with MD
As discussed in several of the sections of Chapter 2, MD is a technique to generate the atomic trajectories of a system of N particles by direct numerical integration of Newton's equations of motion. In a similar spirit, we say that the purpose of MC is to generate an ensemble of atomic configurations by stochastic sampling. In both cases we have a system of N particles interacting through the same interatomic potential. In MD, the system evolves in time by following Newton's equations of motion, where particles move in response to forces created by their neighbors. The particles therefore follow the correct dynamics according to classical mechanics. In contrast, in MC the particles move by sampling a distribution such as the canonical distribution. The dynamics thus generated is stochastic or probabilistic rather than deterministic, as is the case for MD. This difference in dynamics becomes important in problems where we wish to simulate the system over a long period of time. Because MD is constrained to real dynamics, the time scale of the simulation is fixed by such factors as the interatomic potential and the mass of the particle. This time scale is of the order of picoseconds ($10^{-12}$ s). If one wants to observe a phenomenon on a longer scale such as microseconds, it would require extensive computer resources to simulate it directly by MD. On the other hand, the time scale of MC is not fixed in the same way. KMC models
often are able to simulate many of the same phenomena as MD, but on a much longer time scale by using a simplified description of the motion. If we consider the system of atoms on a crystal surface represented in Fig. 1, the MD simulation would consist of a substrate that provides a potential consisting of a square array of binding sites. Mobile atoms on the substrate would vibrate around the potential energy minimum of the binding site, and occasionally surmount the barrier and hop to a neighboring site. The vibrations of the atoms around the binding site may not be of importance for many applications, but the diffusion hops to neighboring sites and the aggregation into larger clusters on the substrate could be important for studying thin film structures during annealing, as discussed earlier. A KMC model could be developed where the elementary move is a diffusion hop to a neighboring site, ignoring the vibrations. Information from the MD model on the hop rate to neighboring sites, together with the effect of neighboring atoms on the hop rate, is often used to develop the KMC model. Because of the greatly reduced frequency of the diffusion events compared to the vibrations, the simulation can cover much larger time and length scales, and yet provide the needed information on the atomic diffusion and clustering.

Another way to characterize the difference between MC and MD is to consider each as a technique to sample the degrees of freedom of the system. Since we follow the particle positions and velocities in MD, we are sampling the evolution of the system of N particles in its phase space, the 6N-dimensional space of the positions and velocities of the N particles. In MC we generate a set of particle positions in the system of N particles, thus the sampling is carried out in the 3N-dimensional configurational space of the system. In both cases, the sampling generates a trajectory in the respective spaces, as shown in Fig. 5. Such trajectories then allow properties of the system to be calculated as averages over these trajectories. In MD one performs a time average whereas in MC one performs an average over discrete states. Under appropriate conditions MC and MD give the same results for equilibrium properties, a consequence of the so-called ergodic hypothesis (ensemble average = time average); however, dynamical properties calculated using the two methods in general will not be the same.

Figure 5. Schematic depicting the evolution of the same N-particle system in the 3N-dimensional configurational space (µ) as sampled by MC, and in the 6N-dimensional phase space (γ) sampled by MD. In each case, the sampling results in a trajectory in the appropriate space, which is the necessary information that allows average system properties to be calculated. For MC, the trajectory is that of a random walk (Markov chain) governed by stochastic dynamics, whereas for MD the trajectory is what we believe to be the correct dynamics as given by Newton's equations of motion in classical mechanics. The same interatomic potential is used in the two simulations.
References
[1] N. Metropolis, “The beginning of the Monte Carlo method,” Los Alamos Sci., Special Issue, 125, 1987.
[2] E.J. Janse van Rensburg and G.M. Torrie, “Estimation of multidimensional integrals: is Monte Carlo the best method?” J. Phys. A: Math. Gen., 26, 943–953, 1993.
[3] A.R. Kansal and S. Torquato, “Prediction of trapping rates in mixtures of partially absorbing spheres,” J. Chem. Phys., 116, 10589, 2002.
[4] H. Gould and J. Tobochnik, An Introduction to Computer Simulation Methods, Part 2, Chaps 10–12, 14, 15, Addison-Wesley, Reading, 1988.
[5] D.W. Hermann, Computer Simulation Methods, 2nd edn., Chap 4, Springer-Verlag, Berlin, 1990.
[6] K. Binder and D.W. Hermann, Monte Carlo Simulation in Statistical Physics, An Introduction, Springer-Verlag, Berlin, 1988.
[7] E.E. Lewis and W.F. Miller, Computational Methods of Neutron Transport, Chap 7, American Nuclear Society, La Grange Park, IL, 1993.
[8] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, “Equation of state calculations by fast computing machines,” J. Chem. Phys., 21, 1087, 1953.
[9] M.H. Kalos and P.A. Whitlock, Monte Carlo Methods, Wiley, New York, 1986.
[10] S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, “Optimization by simulated annealing,” Science, 220, 671, 1983.
[11] G.M. Torrie and J.P. Valleau, “Non-physical sampling distributions in Monte Carlo free energy estimation – umbrella sampling,” J. Comput. Phys., 23, 187, 1977.
[12] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1987.
[13] M. Rao, C. Pangali, and B.J. Berne, “On the force bias Monte Carlo simulation of water: methodology, optimization and comparison with molecular dynamics,” Mol. Phys., 37, 1773, 1979.
[14] J. Dalla Torre, C.-C. Fu, F. Willaime, and J.-L. Bocquet, Simulations multi-echelles des experiences de recuit de resistivite isochrone dans le Fer-ultra pur irradie aux electrons: premiers resultants, CEA Annuel Rapport, p. 94, 2003.
[15] D.T. Gillespie, “General method for numerically simulating stochastic time evolution of coupled chemical-reactions,” J. Comput. Phys., 22, 403–434, 1976.
[16] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, J. Comput. Phys., 17, 10, 1975.
[17] G.H. Gilmer, “Growth on imperfect crystal faces,” J. Cryst. Growth, 36, 15, 1976.
[18] R.H. Swendsen and J.S. Wang, “Replica Monte Carlo simulation of spin-glasses,” Phys. Rev. Lett., 57, 2607, 1986.
[19] M.T. Robinson, “The binary collision approximation: background and introduction,” Rad. Eff. Defects Sol., 130–131, 3, 1994.
[20] M.E. Law, G.H. Gilmer, and M. Jaraiz, “Simulation of defects and diffusion phenomena in silicon,” MRS Bull., 25, 45, 2000.
2.11 ACCELERATED MOLECULAR DYNAMICS METHODS
Blas P. Uberuaga¹, Francesco Montalenti², Timothy C. Germann³, and Arthur F. Voter⁴
¹ Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
² INFM, L-NESS, and Dipartimento di Scienza dei Materiali, Università degli Studi di Milano-Bicocca, Via Cozzi 53, I-20125 Milan, Italy
³ Applied Physics Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
⁴ Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Molecular dynamics (MD) simulation, in which atom positions are evolved by integrating the classical equations of motion in time, is now a well established and powerful method in materials research. An appealing feature of MD is that it follows the actual dynamical evolution of the system, making no assumptions beyond those in the interatomic potential, which can, in principle, be made as accurate as desired. However, the limitation in the accessible simulation time represents a substantial obstacle in making useful predictions with MD. Resolving individual atomic vibrations – a necessity for maintaining accuracy in the integration – requires time steps on the order of femtoseconds, so that reaching even one microsecond is very difficult on today’s fastest processors. Because this integration is inherently sequential in nature, direct, spatial parallelization does not help significantly; it just allows simulations of nanoseconds on much larger systems.

Beginning in the late 1990s, methods based on a new concept have been developed for circumventing this time scale problem. For systems in which the long-time dynamical evolution is characterized by a sequence of activated events, these “accelerated molecular dynamics” methods [1] can extend the accessible time scale by orders of magnitude relative to direct MD, while retaining full atomistic detail. These methods – hyperdynamics, parallel-replica dynamics, and temperature accelerated dynamics (TAD) – have already been demonstrated on problems in surface and bulk diffusion and surface growth. With more development they will become useful for a broad range of key materials problems, such as pipe diffusion along a dislocation core, impurity clustering, grain growth, dislocation climb and dislocation kink nucleation. Here we give an introduction to these methods, discuss their current strengths and limitations, and predict how their capabilities may develop in the next few years.

S. Yip (ed.), Handbook of Materials Modeling, 629–648. © 2005 Springer. Printed in the Netherlands.
1. Background
1.1. Infrequent Event Systems
We begin by defining an “infrequent-event” system, as this is the type of system we will focus on in this article. The dynamical evolution of such a system is characterized by the occasional activated event that takes the system from basin to basin, events that are separated by possibly millions of thermal vibrations within one basin. A simple example of an infrequent-event system is an adatom on a metal surface at a temperature that is low relative to the diffusive jump barrier. We will exclusively consider thermal systems, characterized by a temperature T, a fixed number of atoms N, and a fixed volume V; i.e., the canonical ensemble. Typically, there is a large number of possible paths for escape from any given basin. As a trajectory in the 3N-dimensional coordinate space in which the system resides passes from one basin to another, it crosses a (3N−1)-dimensional “dividing surface” at the ridgetop separating the two basins. While on average these crossings are infrequent, successive crossings can sometimes occur within just a few vibrational periods; these are termed “correlated dynamical events” [2–4]. An example would be a double jump of the adatom on the surface. For this discussion it is sufficient, but important, to realize that such events can occur. In most of the methods presented below, we will assume that these correlated events do not occur – this is the primary assumption of transition state theory – which is actually a very good approximation for many solid-state diffusive processes. We define the “correlation time” ($\tau_{\mathrm{corr}}$) of the system as the duration of the system memory. A trajectory that has resided in a particular basin for longer than $\tau_{\mathrm{corr}}$ has no memory of its history and, consequently, how it got to that basin, in the sense that when it later escapes from the basin, the probability for escape is independent of how it entered the state.
The relative probability for escape to a given adjacent state is proportional to the rate constant for that escape path, which we will define below. An infrequent event system, then, is one in which the residence time in a state ($\tau_{\mathrm{rxn}}$) is much longer than the correlation time ($\tau_{\mathrm{corr}}$). We will focus here on systems with energetic barriers to escape, but the infrequent-event concept applies equally well to entropic bottlenecks.¹ The key to the accelerated dynamics methods described here is recognizing that to obtain the right sequence of state-to-state transitions, we need not evolve the vibrational dynamics perfectly, as long as the relative probability of finding each of the possible escape paths is preserved.

¹ For systems with entropic bottlenecks, the parallel-replica dynamics method can be applied very effectively [1].
1.2. Transition State Theory
Transition state theory (TST) [5–9] is the formalism underpinning all of the accelerated dynamics methods, directly or indirectly. In the TST approximation, the classical rate constant for escape from state A to some adjacent state B is taken to be the equilibrium flux through the dividing surface between A and B (Fig. 1). If there are no correlated dynamical events, the TST rate is the exact rate constant for the system to move from state A to state B. The power of TST comes from the fact that this flux is an equilibrium property of the system. Thus, we can compute the TST rate without ever propagating a trajectory.

Figure 1. A two-state system illustrating the definition of the transition state theory rate constant as the outgoing flux through the dividing surface bounding state A. (Labels in the figure: A, Ea, B.)

The appropriate ensemble average for the rate constant for escape from A, $k^{\mathrm{TST}}_{A\to}$, is

$$k^{\mathrm{TST}}_{A\to} = \left\langle |dx/dt| \, \delta(x - q) \right\rangle_A, \tag{1}$$

where x ∈ r is the reaction coordinate and x = q the dividing surface bounding state A. The angular brackets indicate the ratio of Boltzmann-weighted integrals over 6N-dimensional phase space (configuration space r and momentum space p). That is, for some property P(r, p),

$$\langle P \rangle = \frac{\iint P(r,p) \exp[-H(r,p)/k_B T] \, dr \, dp}{\iint \exp[-H(r,p)/k_B T] \, dr \, dp}, \tag{2}$$

where $k_B$ is the Boltzmann constant and H(r, p) is the total energy of the system, kinetic plus potential. The subscript A in Eq. (1) indicates the configuration space integrals are restricted to the space belonging to state A. If the effective mass (m) of the reaction coordinate is constant over the dividing surface, Eq. (1) reduces to a simpler ensemble average over configuration space only [10],

$$k^{\mathrm{TST}}_{A\to} = \sqrt{2 k_B T / \pi m} \; \left\langle \delta(x - q) \right\rangle_A. \tag{3}$$
The essence of this expression, and of TST, is that the Dirac delta function picks out the probability of the system being at the dividing surface, relative to everywhere else it can be in state A. Note that there is no dependence on the nature of the final state B. In a system with correlated events, not every dividing surface crossing corresponds to a reactive event, so that, in general, the TST rate is an upper bound on the exact rate. For diffusive events in materials at moderate temperatures, these correlated dynamical events typically do not cause a large change in the rate constants, so TST is often an excellent approximation. This is a key point; this behavior is markedly different than in some chemical systems, such as molecular reactions in solution or the gas phase, where TST is just a starting point and dynamical corrections can lower the rate significantly [11]. While in the traditional use of TST, rate constants are computed after the dividing surface is specified, in the accelerated dynamics methods we exploit the TST formalism to design approaches that do not require knowing in advance where the dividing surfaces will be, or even what product states might exist.
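The step from Eq. (1) to Eq. (3) can be written out explicitly. The following is a sketch, assuming that the Hamiltonian separates as $H = m v^2/2 + (\text{terms independent of } v)$ with $v = dx/dt$, so that the velocity along the reaction coordinate factors out of the ensemble average:

```latex
\begin{align*}
\langle |dx/dt| \rangle
  &= \frac{\int_{-\infty}^{\infty} |v|\, e^{-m v^{2}/2 k_B T}\, dv}
          {\int_{-\infty}^{\infty} e^{-m v^{2}/2 k_B T}\, dv}
   = \frac{2 k_B T/m}{\sqrt{2\pi k_B T/m}}
   = \sqrt{\frac{2 k_B T}{\pi m}} , \\
k^{\mathrm{TST}}_{A\to}
  &= \langle |dx/dt| \rangle \, \langle \delta(x-q) \rangle_A
   = \sqrt{\frac{2 k_B T}{\pi m}}\; \langle \delta(x-q) \rangle_A .
\end{align*}
```

The remaining average involves only configuration space, which is why the rate can be evaluated without propagating a trajectory.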
1.3. Harmonic Transition State Theory
If we have identified a saddle point on the potential energy surface for the reaction pathway between A and B, we can use a further approximation to TST. We assume that the potential energy near the basin minimum is well described, out to displacements sampled thermally, with a second-order energy expansion (i.e., that the vibrational modes are harmonic) and that the same is true for the modes perpendicular to the reaction coordinate at the saddle point. Under these conditions, the TST rate constant becomes simply
$$k^{\mathrm{HTST}}_{A\to B} = \nu_0\, e^{-E_a/k_B T} , \tag{4}$$
where
$$\nu_0 = \frac{\prod_{i}^{3N} \nu_i^{\min}}{\prod_{i}^{3N-1} \nu_i^{\mathrm{sad}}} . \tag{5}$$
Here $E_a$ is the static barrier height, or activation energy (the difference in energy between the saddle point and the minimum of state A (Fig. 1)), $\{\nu_i^{\min}\}$ are the normal mode frequencies at the minimum of A, and $\{\nu_i^{\mathrm{sad}}\}$ are the nonimaginary normal mode frequencies at the saddle separating A from B. This is often referred to as the Vineyard [12] equation. The analytic integration of Eq. (1) over the whole phase space thus leaves a very simple Arrhenius temperature dependence (see footnote 2). To the extent that there are no recrossings and the modes are truly harmonic, this is an exact expression for the rate. This harmonic TST expression is employed in the temperature accelerated dynamics method (without requiring calculation of the prefactor $\nu_0$).
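As an illustration of Eqs. (4) and (5), the following minimal Python sketch evaluates the Vineyard prefactor and the harmonic TST rate. The function names and the frequency values are hypothetical, chosen only for the example; energies are assumed to be in eV and temperatures in K.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def vineyard_prefactor(freqs_min, freqs_saddle):
    """Prefactor nu_0 of Eq. (5): product of the 3N frequencies at the
    minimum divided by the product of the 3N-1 real saddle frequencies."""
    if len(freqs_min) != len(freqs_saddle) + 1:
        raise ValueError("need exactly one more minimum mode than saddle modes")
    # Work with logarithms so that long products do not overflow.
    log_nu0 = sum(math.log(f) for f in freqs_min)
    log_nu0 -= sum(math.log(f) for f in freqs_saddle)
    return math.exp(log_nu0)


def htst_rate(barrier_ev, temperature, nu0):
    """Harmonic TST rate of Eq. (4): nu_0 * exp(-E_a / k_B T)."""
    return nu0 * math.exp(-barrier_ev / (K_B_EV * temperature))


# Hypothetical toy system with three modes at the minimum (frequencies in Hz).
freqs_min = [5.0e12, 4.0e12, 3.0e12]
freqs_saddle = [6.0e12, 2.5e12]
nu0 = vineyard_prefactor(freqs_min, freqs_saddle)
rate_300 = htst_rate(0.5, 300.0, nu0)
rate_600 = htst_rate(0.5, 600.0, nu0)
```

Only the barrier height and the two sets of normal mode frequencies enter, so no trajectory is needed to evaluate the rate.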
1.4. Complex Infrequent Event Systems
The motivation for developing accelerated molecular dynamics methods becomes particularly clear when we try to understand the dynamical evolution of what we will term complex infrequent event systems. In these systems, we simply cannot guess where the state-to-state evolution might lead. The underlying mechanisms may be too numerous, too complicated, and/or have an interplay whose consequences cannot be predicted by considering them individually. In very simple systems we can raise the temperature to make diffusive transitions occur on an MD-accessible time scale. However, as systems become more complex, changing the temperature causes corresponding changes in the relative probability of competing mechanisms. Thus, this strategy will cause the system to select a different sequence of state-to-state dynamics, ultimately leading to a completely different evolution of the system, and making it impossible to address the questions that the simulation was attempting to answer. Many, if not most, materials problems are characterized by such complex infrequent events. We may want to know what happens on the time scale of milliseconds, seconds or longer, while with MD we can barely reach one microsecond. Running at higher T or trying to guess what the underlying atomic processes are can mislead us about how the system really behaves. Often for these systems, if we could get a glimpse of what happens at these longer times, even if we could only afford to run a single trajectory for that long, our understanding of the system would improve substantially. This, in essence, is the primary motivation for the development of the methods described here.
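The effect of temperature on competing mechanisms can be made concrete with a small numerical sketch. It assumes two hypothetical escape paths with Arrhenius rates, a low-barrier path with an ordinary prefactor and a higher-barrier path with a much larger prefactor; the names and numbers are illustrative only and are not taken from any system discussed here.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_rate(prefactor, barrier_ev, temperature):
    """Arrhenius rate nu * exp(-E_a / k_B T)."""
    return prefactor * math.exp(-barrier_ev / (K_B_EV * temperature))


def escape_probabilities(mechanisms, temperature):
    """Relative probability of each escape path: its rate over the total rate."""
    rates = {name: arrhenius_rate(nu, ea, temperature)
             for name, (nu, ea) in mechanisms.items()}
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


# Hypothetical mechanisms: name -> (prefactor in 1/s, barrier in eV).
mechanisms = {"hop": (1.0e12, 0.5), "concerted": (1.0e15, 0.8)}

p_low = escape_probabilities(mechanisms, 300.0)    # "hop" dominates
p_high = escape_probabilities(mechanisms, 1000.0)  # "concerted" dominates
```

With these numbers the low-barrier path carries almost all of the escape probability at 300 K, while the higher-barrier path dominates at 1000 K, so a high-temperature trajectory would follow a different sequence of states.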
Footnote 2: Note that although the exponent in Eq. (4) depends only on the static barrier height $E_a$, in this HTST approximation there is no assumption that the trajectory passes exactly through the saddle point.
1.5. Dividing Surfaces and Transition Detection
We have implied that the ridgetops between basins are the appropriate dividing surfaces in these systems. For a system that obeys TST, these ridgetops are the optimal dividing surfaces; recrossings will occur for any other choice of dividing surface. A ridgetop can be defined in terms of steepest-descent paths – it is the 3N –1-dimensional boundary surface that separates those points connected by steepest descent paths to the minimum of one basin from those that are connected to the minimum of an adjacent basin. This definition also leads to a simple way to detect transitions as a simulation proceeds, a requirement of parallel-replica dynamics and temperature accelerated dynamics. Intermittently, the trajectory is interrupted and minimized through steepest descent. If this minimization leads to a basin minimum that is distinguishable from the minimum of the previous basin, a transition has occurred. An appealing feature of this approach is that it requires virtually no knowledge of the type of transition that might occur. Often only a few steepest descent steps are required to determine that no transition has occurred. While this is a fairly robust detection algorithm, and the one used for the simulations presented below, more efficient approaches can be tailored to the system being studied.
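The detection procedure can be sketched in a few lines. The example below assumes a hypothetical one-dimensional double-well potential $V(x) = (x^2 - 1)^2$ with minima at $x = \pm 1$ and a fixed-step steepest-descent quench; the function names, step size, and tolerances are illustrative choices, not those used in the simulations presented here.

```python
def potential_gradient(x):
    """dV/dx for the hypothetical double well V(x) = (x**2 - 1)**2."""
    return 4.0 * x * (x * x - 1.0)


def quench(x, step=0.01, tol=1e-8, max_steps=100_000):
    """Follow the steepest-descent path from x until the gradient vanishes."""
    for _ in range(max_steps):
        grad = potential_gradient(x)
        if abs(grad) < tol:
            break
        x -= step * grad
    return x


def transition_detected(x_current, x_reference_min, distance_tol=1e-3):
    """True if the quenched configuration is distinguishable from the
    minimum of the basin in which the trajectory started."""
    return abs(quench(x_current) - x_reference_min) > distance_tol


# A trajectory that started in the basin at x = +1:
still_in_basin = not transition_detected(0.4, 1.0)   # quenches back to +1
has_escaped = transition_detected(-0.2, 1.0)         # quenches to -1
```

In a many-atom system the same test would compare full configurations, and the quench could be stopped early once it is clear that the trajectory is returning to the original minimum.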
2. Parallel-Replica Dynamics
The parallel-replica method [13] is the simplest and most accurate of the accelerated dynamics techniques, with the only assumption being that the infrequent events obey first-order kinetics (exponential decay); i.e., for any time $t > \tau_{\mathrm{corr}}$ after entering a state, the probability distribution function for the time of the next escape is given by
$$p(t) = k_{\mathrm{tot}}\, e^{-k_{\mathrm{tot}} t} , \tag{6}$$
where $k_{\mathrm{tot}}$ is the rate constant for escape from the state. For example, Eq. (6) arises naturally for ergodic, chaotic exploration of an energy basin. Parallel-replica allows for the parallelization of the state-to-state dynamics of such a system on $M$ processors. We sketch the derivation here for equal-speed processors. For a state in which the rate to escape is $k_{\mathrm{tot}}$, on $M$ processors the effective escape rate will be $M k_{\mathrm{tot}}$, as the state is being explored $M$ times faster. Also, if the time accumulated on one processor is $t_1$, on the $M$ processors a total time of $t_{\mathrm{sum}} = M t_1$ will be accumulated. Thus, we find that
\begin{align}
p(t_1)\, dt_1 &= M k_{\mathrm{tot}}\, e^{-M k_{\mathrm{tot}} t_1}\, dt_1 \tag{7a} \\
p(t_1)\, dt_1 &= k_{\mathrm{tot}}\, e^{-k_{\mathrm{tot}} t_{\mathrm{sum}}}\, dt_{\mathrm{sum}} \tag{7b} \\
p(t_1)\, dt_1 &= p(t_{\mathrm{sum}})\, dt_{\mathrm{sum}} \tag{7c}
\end{align}
and the probability to leave the state per unit time, expressed in tsum units, is the same whether it is run on one or M processors. A variation on this derivation shows that the M processors need not run at the same speed, allowing the method to be used on a heterogeneous or distributed computer; see Ref. [13]. The algorithm is schematically shown in Fig. 2. Starting with an N -atom system in a particular state (basin), the entire system is replicated on each of M available parallel or distributed processors. After a short dephasing stage during which each replica is evolved forward with independent noise for a time tdeph ≥ τcorr to eliminate correlations between replicas, each processor carries out an independent constant-temperature MD trajectory for the entire N -atom system, thus exploring phase space within the particular basin M times faster than a single trajectory would. Whenever a transition is detected on any processor, all processors are alerted to stop. The simulation clock is advanced by the accumulated trajectory time summed over all replicas, i.e., the total time τrxn spent exploring phase space within the basin until the transition occurred. The parallel-replica method also correctly accounts for correlated dynamical events (i.e., there is no requirement that the system obeys TST), unlike the other accelerated dynamics methods. This is accomplished by allowing the trajectory that made the transition to continue on its processor for a further amount of time tcorr ≥ τcorr , during which recrossings or follow-on events may occur. The simulation clock is then advanced by tcorr , the final state is replicated on all processors, and the whole process is repeated. 
Parallel-replica dynamics then gives exact state-to-state dynamical evolution, because the escape times obey the correct probability distribution, nothing about the procedure corrupts the relative probabilities of the possible escape paths, and the correlated dynamical events are properly accounted for.
Figure 2. Schematic illustration of the parallel-replica method (after Ref. [1]). The four steps, described in the text, are (A) replication of the system into M copies, (B) dephasing of the replicas, (C) evolution of independent trajectories until a transition is detected in any of the replicas, and (D) brief continuation of the transitioning trajectory to allow for correlated events such as recrossings or follow-on transitions to other states. The resulting configuration is then replicated, beginning the process again.
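The statement of Eqs. (7a)–(7c), that the summed time follows the single-processor escape distribution, can be checked numerically. The sketch below assumes ideal first-order kinetics and equal-speed replicas, and omits the dephasing and correlated-event stages; the function names and parameter values are hypothetical.

```python
import random


def first_escape_parallel(k_tot, n_replicas, rng):
    """Draw an independent escape time for each replica and stop all of them
    at the first escape. Returns (t1, t_sum): the time run on each replica
    and the time summed over all replicas."""
    t1 = min(rng.expovariate(k_tot) for _ in range(n_replicas))
    return t1, n_replicas * t1


def mean_summed_time(k_tot, n_replicas, n_trials, seed=0):
    """Average of t_sum over many independent escapes from the same state."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        _, t_sum = first_escape_parallel(k_tot, n_replicas, rng)
        total += t_sum
    return total / n_trials


# With k_tot = 2.0 the mean escape time should be close to 1/k_tot = 0.5,
# whether the state is explored by 1 replica or by 8.
mean_one = mean_summed_time(2.0, 1, 20000)
mean_eight = mean_summed_time(2.0, 8, 20000)
```

The time run on each replica shrinks by roughly a factor of $M$, while the simulation clock, advanced by the summed time, keeps the correct average escape time.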
The efficiency of the method is limited by both the dephasing stage, which does not advance the system clock, and the correlated event stage, during which only one processor accumulates time. (This is illustrated schematically in Fig. 2, where dashed line trajectories advance the simulation clock but dotted line trajectories do not.) Thus, the overall efficiency will be high when
$$\tau_{\mathrm{rxn}}/M \gg t_{\mathrm{deph}} + t_{\mathrm{corr}} . \tag{8}$$
Some tricks can further reduce this requirement. For example, whenever the system revisits a state, on all but one processor the interrupted trajectory from the previous visit can be immediately restarted, eliminating the dephasing stage. Also, the correlation stage (which only involves one processor) can be overlapped with the subsequent dephasing stage for the new state on the other processors, in the hope that there are no correlated crossings that lead to a different state. Figure 3 shows an example of a parallel-replica simulation; an Ag(111) island-on-island structure decays over a period of 1 µs at T = 400 K. Many of the transitions involve concerted mechanisms. Parallel-replica dynamics has the advantage of being fairly simple to program, with very few “knobs” to adjust – tdeph and tcorr , which can be conservatively set at a few ps for most systems. As multiprocessing environments become more ubiquitous, with more processors within a node or even on a chip, and loosely linked Beowulf clusters of such nodes, parallel-replica dynamics will become an increasingly important simulation tool. Recently, parallel-replica dynamics has been extended to driven systems, such as systems with some externally applied strain rate. The requirement here is that the drive rate is slow enough that at any given time the rates for the processes in the system depend only on the instantaneous configuration of the system.
Figure 3. Snapshots from a parallel-replica simulation of an island on top of an island on the Ag(111) surface at T = 400 K (after Ref. [1]). On a microsecond time scale, the upper island gives up all its atoms to the lower island, filling vacancies and kink sites as it does so. This simulation took 5 days to reach 1 µs on 32 1-GHz Pentium III processors. (Panel times: t = 0.00, 0.15, 0.25, 0.39, 0.41, 0.42, 0.44, 0.45, and 1.00 µs.)
3. Hyperdynamics
Hyperdynamics builds on the basic concept of importance sampling [14, 15], extending it into the time domain. In the hyperdynamics approach [16], the potential surface $V(r)$ of the system is modified by adding to it a nonnegative bias potential $V_b(r)$. The dynamics of the system is then evolved on this biased potential surface, $V(r) + V_b(r)$. A schematic illustration is shown in Fig. 4.
Figure 4. Schematic illustration of the hyperdynamics method. A bias potential ($V_b(r)$) is added to the original potential ($V(r)$, solid line). Provided that $V_b(r)$ meets certain conditions, primarily that it be zero at the dividing surfaces between states, a trajectory on the biased potential surface ($V(r) + V_b(r)$, dashed line) escapes more rapidly from each state without corrupting the relative escape probabilities. The accelerated time is estimated as the simulation proceeds. (Labels in the figure: states A, B, and C.)
The derivation of the method requires that the system obeys TST, i.e., that there are no correlated events. There are also important requirements on the form of the bias potential. It must be zero at all the dividing surfaces, and the system must still obey TST for dynamics on the modified potential surface. If such a bias potential can be constructed, a challenging task in itself, we can substitute the modified potential $V(r) + V_b(r)$ into Eq. (1) to find
$$k^{\mathrm{TST}}_{A\to} = \frac{\langle\, |v_A|\; \delta(x - q) \,\rangle_{A_b}}{\langle e^{\beta V_b(r)} \rangle_{A_b}} , \tag{9}$$
where $\beta = 1/k_B T$ and the state $A_b$ is the same as state A but with the bias potential $V_b$ applied. This leads to a very appealing result: a trajectory on this modified surface, while relatively meaningless on vibrational time scales, evolves correctly from state to state at an accelerated pace. That is, the relative rates of events leaving A are preserved:
$$\frac{k^{\mathrm{TST}}_{A_b\to B}}{k^{\mathrm{TST}}_{A_b\to C}} = \frac{k^{\mathrm{TST}}_{A\to B}}{k^{\mathrm{TST}}_{A\to C}} . \tag{10}$$
This is because these relative probabilities depend only on the numerator of Eq. (9), which is unchanged by the introduction of $V_b$ since, by construction, $V_b = 0$ at the dividing surface. Moreover, the accelerated time is easily estimated as the simulation proceeds. For a regular MD trajectory, the time advances at each integration step by $\Delta t_{\mathrm{MD}}$, the MD time step (often on the order of 1 fs). In hyperdynamics, the time advance at each step is $\Delta t_{\mathrm{MD}}$ multiplied by an instantaneous boost factor, the inverse Boltzmann factor for the bias potential at that point, so that the total time after $n$ integration steps is
$$t_{\mathrm{hyper}} = \sum_{j=1}^{n} \Delta t_{\mathrm{MD}}\; e^{V_b(r(t_j))/k_B T} . \tag{11}$$
Time thus takes on a statistical nature, advancing monotonically but nonlinearly. In the long-time limit, it converges on the correct value for the accelerated time with vanishing relative error. The overall computational speedup is then given by the average boost factor (with $t_{\mathrm{MD}} = n\, \Delta t_{\mathrm{MD}}$ the elapsed MD time),
$$\mathrm{boost(hyperdynamics)} = t_{\mathrm{hyper}}/t_{\mathrm{MD}} = \langle e^{V_b(r)/k_B T} \rangle_{A_b} , \tag{12}$$
divided by the extra computational cost of calculating the bias potential and its forces. If all the visited states are equivalent (e.g., this is common in calculations to test or demonstrate a particular bias potential), Eq. (12) takes on the meaning of a true ensemble average. The rate at which the trajectory escapes from a state is enhanced because the positive bias potential within the well lowers the effective barrier. Note, however, that the shape of the bottom of the well after biasing is irrelevant; no assumption of harmonicity is made.
Figure 5 illustrates an application of hyperdynamics for a two-dimensional, periodic model potential using a Hessian-based bias potential [16]. The hopping diffusion rate was compared against MD at high temperature, where the two calculations agreed very well. At lower temperatures where the MD calculations would be too costly, it is compared against the result computed using TST plus dynamical corrections. As the temperature is lowered, the effective boost gained by using hyperdynamics increased to the point that, at $k_B T = 0.09$, the boost factor was over 8500. See Ref. [16] for details.
Figure 5. Arrhenius plot of the diffusion coefficients for a model potential, showing a comparison of direct MD, hyperdynamics (•), and TST + dynamical corrections (+). The symbols are sized for clarity. The line is the full harmonic TST approximation, and is indistinguishable from a least-square line through the TST points (not shown). Also shown are the boost factors, relative to direct MD, for each hyperdynamics result. The boost increases dramatically as the temperature is lowered (after Ref. [16]). (Plot axes: ln(D) versus $1/k_B T$; boost factors marked on the plot: 47, 200, 3435, and 8682.)
The ideal bias potential should give a large boost factor, have low computational overhead (though more overhead is acceptable if the boost factor is very high), and, to a good approximation, meet the requirements stated above. This is very challenging, since we want, as much as possible, to avoid utilizing any prior knowledge of the dividing surfaces or the available escape paths. To date, proposed bias potentials typically have either been computationally intensive, have been tailored to very specific systems, have assumed localized transitions, or have been limited to low-dimensional systems. But the potential boost factor available from hyperdynamics is tantalizing, so developing bias potentials capable of treating realistic many-dimensional systems remains a subject of ongoing research by several groups. See Ref. [1] for a detailed discussion on bias potentials and results generated using various forms.
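The hyperdynamics clock of Eqs. (11) and (12) amounts to a running sum over the bias values visited by the trajectory. The sketch below assumes that those bias values have already been recorded at each MD step; the function names are hypothetical, and the cost of evaluating the bias potential itself is not modeled.

```python
import math


def hypertime(bias_values, dt_md, k_b_t):
    """Accumulated hyperdynamics time, Eq. (11): each MD step of length dt_md
    is weighted by exp(V_b / k_B T) evaluated at that step."""
    if any(v_b < 0.0 for v_b in bias_values):
        raise ValueError("the bias potential must be nonnegative")
    return sum(dt_md * math.exp(v_b / k_b_t) for v_b in bias_values)


def average_boost(bias_values, k_b_t):
    """Average boost factor, Eq. (12): hypertime divided by elapsed MD time."""
    return hypertime(bias_values, 1.0, k_b_t) / len(bias_values)


# Hypothetical record of the bias (in eV) at four MD steps, with k_B T = 0.025 eV.
# The bias is zero near a dividing surface, so those steps advance the clock
# by exactly one MD time step.
bias_trace = [0.10, 0.08, 0.02, 0.0]
t_hyper = hypertime(bias_trace, 1.0e-15, 0.025)
boost = average_boost(bias_trace, 0.025)
```

Because the weight is exponential in $V_b/k_B T$, the same bias gives a larger boost at lower temperature, consistent with the trend in Fig. 5.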
4. Temperature Accelerated Dynamics
In the temperature accelerated dynamics (TAD) method [17], the idea is to speed up the transitions by increasing the temperature, while filtering out the transitions that should not have occurred at the original temperature. This filtering is critical, since without it the state-to-state dynamics will be inappropriately guided by entropically favored higher-barrier transitions. The TAD method is more approximate than the previous two methods, as it relies on harmonic TST, but for many applications this additional approximation is acceptable, and the TAD method often gives substantially more boost than hyperdynamics or parallel-replica dynamics. Consistent with the accelerated dynamics concept, the trajectory in TAD is allowed to wander on its own to find each escape path, so that no prior information is required about the nature of the reaction mechanisms. In each basin, the system is evolved at a high temperature Thigh (while the temperature of interest is some lower temperature Tlow ). Whenever a transition out of the basin is detected, the saddle point for the transition is found. The trajectory is then reflected back into the basin and continued. This “basin constrained molecular dynamics” (BCMD) procedure generates a list of escape paths and attempted escape times for the high-temperature system. Assuming that TST holds and that the system is chaotic and ergodic, the probability distribution for the first-escape time for each mechanism is an exponential (Eq. (6)). Because harmonic TST gives an Arrhenius dependence of the rate on temperature (Eq. (4)), depending only on the static barrier height, we can then extrapolate each escape time observed at Thigh to obtain a corresponding escape time at Tlow that is drawn correctly from the exponential distribution at Tlow . This extrapolation, which requires knowledge of the saddle point energy, but not the preexponential factor, can be illustrated graphically in an
Arrhenius-style plot (ln(1/t) vs. 1/T), as shown in Fig. 6. The time for each event seen at $T_{\mathrm{high}}$ extrapolated to $T_{\mathrm{low}}$ is then
$$t_{\mathrm{low}} = t_{\mathrm{high}}\, e^{E_a(\beta_{\mathrm{low}} - \beta_{\mathrm{high}})} , \tag{13}$$
where, again, $\beta = 1/k_B T$. The event with the shortest time at low temperature is the correct transition for escape from this basin. Because the extrapolation can in general cause a reordering of the escape times, a new shorter-time event may be discovered as the BCMD is continued at $T_{\mathrm{high}}$. If we make the additional assumption that there is a minimum preexponential factor, $\nu_{\min}$, which bounds from below all the preexponential factors in the system, we can define a time at which the BCMD trajectory can be stopped, knowing that the probability that any transition observed after that time would replace the first transition at $T_{\mathrm{low}}$ is less than $\delta$. This “stop” time is given by
$$t_{\mathrm{high,stop}} \equiv \frac{\ln(1/\delta)}{\nu_{\min}} \left( \frac{\nu_{\min}\, t_{\mathrm{low,short}}}{\ln(1/\delta)} \right)^{T_{\mathrm{low}}/T_{\mathrm{high}}} , \tag{14}$$
where $t_{\mathrm{low,short}}$ is the shortest transition time at $T_{\mathrm{low}}$. Once this stop time is reached, the system clock is advanced by $t_{\mathrm{low,short}}$, the transition corresponding to $t_{\mathrm{low,short}}$ is accepted, and the TAD procedure is started again in the new basin.
Figure 6. Schematic illustration of the temperature accelerated dynamics method. Progress of the high-temperature trajectory can be thought of as moving down the vertical time line at $1/T_{\mathrm{high}}$. For each transition detected during the run, the trajectory is reflected back into the basin, the saddle point is found, and the time of the transition (solid dot on left time line) is transformed (arrow) into a time on the low-temperature time line. Plotted in this Arrhenius-like form, this transformation is a simple extrapolation along a line whose slope is the negative of the barrier height for the event. The dashed termination line connects the shortest-time transition recorded so far on the low temperature time line with the confidence-modified minimum preexponential ($\nu^{*}_{\min} = \nu_{\min}/\ln(1/\delta)$) on the y-axis. The intersection of this line with the high-T time line gives the time ($t_{\mathrm{stop}}$, open circle) at which the trajectory can be terminated. With confidence $1 - \delta$, we can say that any transition observed after $t_{\mathrm{stop}}$ could only extrapolate to a shorter time on the low-T time line if it had a preexponential lower than $\nu_{\min}$. (Plot axes: ln(1/t) versus 1/T, with vertical time lines at $1/T_{\mathrm{high}}$ and $1/T_{\mathrm{low}}$; marked values: $\ln(\nu_{\min})$, $\ln(\nu^{*}_{\min})$, $\ln(1/t_{\mathrm{low,short}})$, and $\ln(1/t_{\mathrm{stop}})$.)
The average boost in TAD can be dramatic when barriers are high and $T_{\mathrm{high}}/T_{\mathrm{low}}$ is large. However, any anharmonicity error at $T_{\mathrm{high}}$ transfers to $T_{\mathrm{low}}$; a rate that is twice the Vineyard harmonic rate due to anharmonicity at $T_{\mathrm{high}}$ will cause the transition times at $T_{\mathrm{high}}$ for that pathway to be 50% shorter, which in turn extrapolate to transition times that are 50% shorter at $T_{\mathrm{low}}$. If the Vineyard approximation is perfect at $T_{\mathrm{low}}$, these events will occur at twice the rate they should. This anharmonicity error can be controlled by choosing a $T_{\mathrm{high}}$ that is not too high. As in the other methods, the boost is limited by the lowest barrier, although this effect can be mitigated somewhat by treating repeated transitions in a “synthetic” mode [17]. This is in essence a kinetic Monte Carlo treatment of the low-barrier transitions, in which the rate is estimated accurately from the observed transitions at $T_{\mathrm{high}}$, and the subsequent low-barrier escapes observed during BCMD are excluded from the extrapolation analysis.
Temperature accelerated dynamics is particularly useful for simulating vapor-deposited crystal growth, where the typical time scale can exceed minutes. Figure 7 shows an example of TAD applied to such a problem. Vapor-deposited growth of a Cu(100) surface was simulated at a deposition rate of one monolayer per 15 s and a temperature T = 77 K, exactly matching (except for the system size) the experimental conditions of Ref. [18]. Each deposition event was simulated using direct MD for 2 ps, long enough for the atom to collide with the surface and settle into a binding site.
A TAD simulation with $T_{\mathrm{high}}$ = 550 K then propagated the system for the remaining time until the next deposition event was required, on average 0.3 s later. The overall boost factor was ∼$10^7$. A key feature of this simulation was that, even at this low temperature, many events accepted during the growth process involved concerted mechanisms, such as the concerted sliding of an eight-atom cluster [1].
Figure 7. Snapshots from a TAD simulation of the deposition of five monolayers (ML) of Cu onto Cu(100) at 0.067 ML/s and T = 77 K, matching the experimental conditions of Egelhoff and Jacob [18]. Deposition of each new atom was performed using direct molecular dynamics for 2 ps, while the intervening time (0.3 s on average for this 50 atom/layer simulation cell) was simulated using the TAD method. The boost factor for this simulation was ∼$10^7$ over direct MD (after Ref. [1]). (Panels: 1 ML, 2 ML, 3 ML, 4 ML, and 5 ML.)
This MD/TAD procedure for simulating film growth has been applied also to Ag/Ag(100) at low temperatures [19] and Cu/Ag(100) [20]. Heteroepitaxial systems are especially hard to treat with techniques such as kinetic Monte Carlo because of the increased tendency for the system to go off lattice due
to mismatch strain, and because the rate table needs to be considerably larger when neighboring atoms can have multiple types. Recently, enhancements to TAD, beyond the “synthetic mode” mentioned above, have been developed that can increase the efficiency of the simulation. For systems that revisit states, the time required to accept an event can be reduced for each revisit by taking advantage of the time accumulated in previous visits [21]. This procedure is exact; no assumptions beyond the ones required by the original TAD method are needed. After many visits, the procedure converges. The minimum barrier for escape from that state ($E_{\min}$) is then known to within uncertainty $\delta$. In this converged mode (ETAD), the average time at $T_{\mathrm{high}}$ required to accept an event no longer depends on $\delta$, and the average boost factor becomes simply
$$\mathrm{boost(ETAD)} = \frac{t_{\mathrm{low,short}}}{t_{\mathrm{high,stop}}} = \exp\left[ E_{\min} \left( \frac{1}{k_B T_{\mathrm{low}}} - \frac{1}{k_B T_{\mathrm{high}}} \right) \right] \tag{15}$$
for that state. The additional boost (when converged) compared to the original TAD can be an order of magnitude or more. For systems that seldom (or never) revisit the same state, it is still possible to exploit this extra boost by running in ETAD mode with E min supplied externally. One way of doing this is to combine TAD with the dimer method [22]. In this combined dimer-TAD approach, first proposed by Montalenti and Voter [21], upon entering a new state, a number of dimer searches are used to find the minimum barrier for escape, after which ETAD is employed to quickly find a dynamically appropriate escape path. This exploits the power of the dimer method to quickly find low-barrier pathways, while eliminating the danger associated with the possibility that it might miss important escape paths. Although the dimer method might fail to find the lowest barrier correctly, this is a much weaker demand on the dimer method than trying to find all relevant barriers. In addition, the ETAD phase has some chance of correcting the simulation during the BCMD if the dimer searches did not find E min .
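The TAD bookkeeping of Eqs. (13)–(15) can be summarized in a short sketch. It assumes that the basin-constrained MD has already produced a list of attempted escape times at $T_{\mathrm{high}}$ together with the barrier of each event; the function names, event list, and parameter values are hypothetical, with energies in eV and temperatures in K.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def beta(temperature):
    """Inverse temperature 1 / (k_B T) in 1/eV."""
    return 1.0 / (K_B_EV * temperature)


def extrapolate_time(t_high, barrier_ev, temp_low, temp_high):
    """Eq. (13): map an escape time observed at temp_high onto temp_low."""
    return t_high * math.exp(barrier_ev * (beta(temp_low) - beta(temp_high)))


def stop_time(t_low_short, nu_min, delta, temp_low, temp_high):
    """Eq. (14): high-temperature time after which BCMD can be stopped."""
    nu_star = nu_min / math.log(1.0 / delta)  # confidence-modified prefactor
    return (nu_star * t_low_short) ** (temp_low / temp_high) / nu_star


def etad_boost(e_min, temp_low, temp_high):
    """Eq. (15): converged (ETAD) boost for a state with minimum barrier e_min."""
    return math.exp(e_min * (beta(temp_low) - beta(temp_high)))


# Hypothetical BCMD record at 600 K: (attempted escape time in s, barrier in eV).
events = [(1.0e-10, 0.6), (3.0e-10, 0.4)]
t_low = [extrapolate_time(t, e_a, 300.0, 600.0) for t, e_a in events]
accepted = min(range(len(events)), key=lambda i: t_low[i])
t_stop = stop_time(t_low[accepted], 1.0e12, 0.05, 300.0, 600.0)
```

In this example the second event occurs later at 600 K but, having the lower barrier, extrapolates to the shorter time at 300 K, which illustrates the reordering of escape times described above. The values of `nu_min` and `delta` are inputs that the user must choose.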
5. Outlook
As these accelerated dynamics methods become more widely used and further developed (including the possible emergence of new methods), their application to important problems in materials science will continue to grow. We conclude this article by comparing and contrasting the three methods presented here, with some guidelines for deciding which method may be most appropriate for a given problem. We point out some important limitations of the methods, areas in which further development may significantly increase their usefulness. Finally, we discuss the prospects for these methods in the immediate future.
The key feature of all of the accelerated dynamics methods is that they collapse the waiting time between successive transitions from its natural time (τrxn ) to (at best) a small number of vibrational periods. Each method accomplishes this in a different way. TAD exploits the enhanced rate at higher temperature, hyperdynamics effectively lowers the barriers to escape by filling in the basin, and parallel-replica dynamics spreads the work across many processors. The choice of which accelerated dynamics method to apply to a problem will typically depend on three factors. The first is the desired level of accuracy in following the exact dynamics of the system. As described previously, parallel-replica is the most exact of the three methods; the only assumption is that the kinetics are first order. Not even TST is assumed, as correlated dynamical events are treated correctly in the method. This is not true with hyperdynamics, which does rely upon the assumptions of TST, in particular the absence of correlated events. Finally, temperature accelerated dynamics makes the further assumptions inherent in the harmonic approximation to TST, and is thus the most approximate of the three methods. If complete accuracy is the main goal of the simulation, parallel-replica is the superior choice. The second consideration is the potential gain in accessible time scales that the accelerated dynamics method can achieve for the system. Typically, TAD is the method of choice when considering this factor. While in all three methods the boost for escaping from each state will be limited by the smallest barrier, if the barriers are high relative to the temperature of interest, TAD will typically achieve the largest boost factor. 
In principle, hyperdynamics can also achieve very significant boosts, but, in practice, existing bias potentials have either a very simple form, which generally provides limited boosts for complex many-atom systems, or a more sophisticated (e.g., Hessian-based) form whose overhead reduces the boosts actually attainable. It may be possible, using prior knowledge about particular systems, to construct a computationally inexpensive bias potential which simultaneously offers large boosts, in which case hyperdynamics could be competitive with TAD. Finally, parallel-replica dynamics usually offers the smallest boost given the typical access to parallel computing today (e.g., tens of processors or fewer per user for continuous use), since the maximum possible boost is exactly the number of processors. For some systems, the overhead of, for example, finding saddle points in TAD may be so great that parallel-replica can give more overall boost. However, in general, the price of the increased accuracy of parallel-replica dynamics will be shorter achievable time scales. It should be emphasized that the limitations of parallel-replica in terms of accessible time scales are not inherent in the method, but rather are a consequence of the limited computing power that is currently available. As massively parallel processing becomes commonplace for individual users, and any number of processors can be used in the study of a given problem, parallel-replica should become just as efficient as the other methods. If enough processors are available
so that the amount of simulation time each processor has to do for each transition is on the order of ps, parallel-replica will be just as efficient as TAD or hyperdynamics. This analysis may be complicated by issues of communication between processors, but the future of parallel-replica is very promising. The last main factor determining which method is best suited to a problem is the shape of the potential energy surface (PES). Both TAD and hyperdynamics require that the PES be relatively smooth. In the case of TAD, this is because saddle points must be found and standard techniques for finding them often perform poorly for rough landscapes. The same is true for the hyperdynamics bias potentials that require information about the shape of the PES. Parallel-replica, however, only requires a method for detecting transitions. No further analysis of the potential energy surface is needed. Thus, if the PES describing the system of interest is relatively rough, parallel-replica dynamics may be the only method that can be applied effectively. The temperature dependence of the boost in hyperdynamics and TAD gives rise to an interesting prediction about their power and utility in the future. Sometimes, even accelerating the dynamics may not make the activated processes occur frequently enough to study a particular process. A common trick is to raise the temperature just enough that at least some events will occur in the available computer time, hoping, of course, that the behavior of interest is still representative of the lower-T system. When faster computers become available, the same system can be studied at a lower, more desirable, temperature. This in turn increases the boost factor (e.g., see Eqs. (12) and (14)), so that, effectively, there is a superlinear increase in the power of accelerated dynamics with increasing computer speed. 
Thus, the accelerated dynamics approaches will become increasingly powerful in future years simply because computers keep getting faster.
A particularly appealing prospect is that of accelerated electronic-structure-based molecular dynamics simulations (e.g., by combining density functional theory (DFT) or quantum chemistry with the methods discussed here), since accessible electronic structure time scales are even shorter, currently on the order of ps. However, because of the additional expense involved in these techniques, the converse of the argument given in the previous paragraph indicates that, for example, accelerated DFT dynamics simulations will not give much useful boost on current computers (i.e., using DFT to calculate the forces is like having a very slow computer). DFT hyperdynamics may be a powerful tool in 5–10 years, when breakeven (boost = overhead) is reached, and this could happen sooner with the development of less expensive bias potentials. TAD is probably close to being viable for combination with DFT, while parallel-replica dynamics and dimer-TAD could probably be used on today’s computers for electronic structure studies on some systems.
Currently, these methods are very efficient when applied to systems in which the barriers are much higher than the temperature of interest. This is often true
for systems such as ordered solids, but there are many important systems that do not so cleanly fall into this class, a prime example being glasses. Such systems are characterized by either a continuum of barrier heights, or a set of low barriers that describe uninteresting events, like conformational changes in a molecule. Low barriers typically degrade the boost of all of the accelerated dynamics methods, as well as the efficiency of standard kinetic Monte Carlo. However, even these systems will be amenable to study through accelerated dynamics methods as progress is made on this low-barrier problem.
A final note should be made about the computational scaling of these methods with system size. While the exact scaling depends on the type of system and many aspects of the implementation, a few general points can be made. In the case of TAD, if the work of finding saddles and detecting transitions can be localized, it can be shown that the scaling goes as $N^{2 - T_{\mathrm{low}}/T_{\mathrm{high}}}$ [21] for the simple case of a system that has been enlarged by replication. This is improved greatly with ETAD, which scales as $O(N)$, the same as regular MD. Real systems are more complicated and, typically, lower barrier processes will arise as the system size is increased. Thus, even hyperdynamics with a bias potential requiring no overhead might scale worse than $N$.
The accelerated dynamics methods, as a whole, are still in their infancy. Even so, they are currently powerful enough to study a wide range of materials problems that were previously intractable. As these methods continue to mature, their applicability, and the physical insights gained by their use, can be expected to grow.
Acknowledgments We gratefully acknowledge vital discussions with Graeme Henkelman. This work was supported by the United States Department of Energy (DOE), Office of Basic Energy Sciences, under DOE Contract No. W-7405-ENG-36.
References
[1] A.F. Voter, F. Montalenti, and T.C. Germann, “Extending the time scale in atomistic simulation of materials,” Annu. Rev. Mater. Res., 32, 321–346, 2002.
[2] D. Chandler, “Statistical-mechanics of isomerization dynamics in liquids and transition-state approximation,” J. Chem. Phys., 68, 2959–2970, 1978.
[3] A.F. Voter and J.D. Doll, “Dynamical corrections to transition state theory for multistate systems: surface self-diffusion in the rare-event regime,” J. Chem. Phys., 82, 80–92, 1985.
[4] C.H. Bennett, “Molecular dynamics and transition state theory: simulation of infrequent events,” ACS Symp. Ser., 63–97, 1977.
[5] R. Marcelin, “Contribution à l’étude de la cinétique physico-chimique,” Ann. Physique, 3, 120–231, 1915.
[6] E.P. Wigner, “On the penetration of potential barriers in chemical reactions,” Z. Phys. Chemie B, 19, 203, 1932.
[7] H. Eyring, “The activated complex in chemical reactions,” J. Chem. Phys., 3, 107–115, 1935.
[8] P. Pechukas, “Transition state theory,” Ann. Rev. Phys. Chem., 32, 159–177, 1981.
[9] D.G. Truhlar, B.C. Garrett, and S.J. Klippenstein, “Current status of transition state theory,” J. Phys. Chem., 100, 12771–12800, 1996.
[10] A.F. Voter and J.D. Doll, “Transition state theory description of surface self-diffusion: comparison with classical trajectory results,” J. Chem. Phys., 80, 5832–5838, 1984.
[11] B.J. Berne, M. Borkovec, and J.E. Straub, “Classical and modern methods in reaction-rate theory,” J. Phys. Chem., 92, 3711–3725, 1988.
[12] G.H. Vineyard, “Frequency factors and isotope effects in solid state rate processes,” J. Phys. Chem. Solids, 3, 121–127, 1957.
[13] A.F. Voter, “Parallel-replica method for dynamics of infrequent events,” Phys. Rev. B, 57, 13985–13988, 1998.
[14] J.P. Valleau and S.G. Whittington, “A guide to Monte Carlo for statistical mechanics: 1. highways,” In: B.J. Berne (ed.), Statistical Mechanics. A. A Modern Theoretical Chemistry, vol. 5, Plenum, New York, pp. 137–168, 1977.
[15] B.J. Berne, G. Ciccotti, and D.F. Coker (eds.), Classical and Quantum Dynamics in Condensed Phase Simulations, World Scientific, Singapore, 1998.
[16] A.F. Voter, “A method for accelerating the molecular dynamics simulation of infrequent events,” J. Chem. Phys., 106, 4665–4677, 1997.
[17] M.R. Sørensen and A.F. Voter, “Temperature-accelerated dynamics for simulation of infrequent events,” J. Chem. Phys., 112, 9599–9606, 2000.
[18] W.F. Egelhoff, Jr. and I. Jacob, “Reflection high-energy electron-diffraction (RHEED) oscillations at 77K,” Phys. Rev. Lett., 62, 921–924, 1989.
[19] F. Montalenti, M.R. Sørensen, and A.F. Voter, “Closing the gap between experiment and theory: crystal growth by temperature accelerated dynamics,” Phys. Rev. Lett., 87, 126101, 2001.
[20] J.A. Sprague, F. Montalenti, B.P. Uberuaga, J.D. Kress, and A.F. Voter, “Simulation of growth of Cu on Ag(001) at experimental deposition rates,” Phys. Rev. B, 66, 205415, 2002.
[21] F. Montalenti and A.F. Voter, “Exploiting past visits or minimum-barrier knowledge to gain further boost in the temperature-accelerated dynamics method,” J. Chem. Phys., 116, 4819–4828, 2002.
[22] G. Henkelman and H. Jónsson, “A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives,” J. Chem. Phys., 111, 7010–7022, 1999.
2.12 CONCURRENT MULTISCALE SIMULATION AT FINITE TEMPERATURE: COARSE-GRAINED MOLECULAR DYNAMICS
Robert E. Rudd
Lawrence Livermore National Laboratory, University of California, L-045 Livermore, CA 94551, USA
S. Yip (ed.), Handbook of Materials Modeling, 649–661. © 2005 Springer. Printed in the Netherlands.
1. Embedded Nanomechanics and Computer Simulation
With the advent of nanotechnology, predictive simulations of nanoscale systems have come into great demand. In some cases, nanoscale systems can be simulated directly at the level of atoms. The atomistic techniques used range from models based on a quantum mechanical treatment of the electronic bonds to those based on more empirical descriptions of the interatomic forces. In many cases, however, even nanoscale systems are too big for a purely atomistic approach, typically because the nanoscale device is coupled to its surroundings, and it is necessary to simulate the entire system comprising billions of atoms. A well-known example is the growth of nanoscale epitaxial quantum dots in which the size, shape and location of the dot are affected by the elastic strain developed in a large volume of the substrate as well as the local atomic bonding. The natural solution is to model the surroundings with a more coarse-grained (CG) description, suitable for the intrinsically longer length scale. The challenge then is to develop the computational methodology suitable for this kind of concurrent multiscale modeling, one in which the simulated length scale can be changed smoothly and seamlessly from one region of the simulation to another while maintaining the fidelity of the relevant mechanics, dynamics and thermodynamics.
The realization that Nature has different relevant length scales goes back at least as far as Democritus. Some 24 centuries ago he put forward the idea that solid matter is ultimately composed, at small scales, of a fundamental constituent that he termed the atom. Implicit in his philosophy was the idea that an
understanding of the atom would lead to a more robust understanding of the macroscopic world around us. In the intervening period, of course, not only has the science of this atomistic picture been put on a sound footing through the inventions of chemistry, the discovery of the nucleus and the development of quantum mechanics and modern condensed matter physics, but a host of additional length scales with their own relevant physics has been uncovered. A great deal of scientific innovation has gone into the development of physical models to describe the phenomena observed at these individual length scales. In the past decade a growing effort has been devoted to understanding how physics at different length scales works in concert to give rise to the observed behavior of solid materials. The use of models at multiple length scales, especially computer models optimized in this way, has become known as multiscale modeling.
An example of multiscale modeling that we will consider in some detail is the modeling of the elastic deformation of solids at the atomistic and continuum levels. Clearly one kind of multiscale model would be to calculate the mass density and elastic constants within an atomistic model, and to use those data to parameterize a continuum model to describe large-scale elastic deformation. Such a parameter-passing, hierarchical approach has been used extensively to study a variety of systems [1]. Its success relies on the occurrence of well-separated length scales. We shall refer to such an approach as sequential multiscale modeling.
In some systems, it is not clear how to separate the various length scales. An example would be turbulence, in which vortex structures are generated at many length scales and hierarchical models have to date only worked in very special cases [2]. Alternatively, the system of interest may be inhomogeneous and have regions in which small-scale physics dominates embedded in regions governed by large-scale physics.
Examples would include fracture [3, 4], various nucleation phenomena [5], nanoscale moving mechanical components on computer chips (NEMS) [6], ion implantation and radiation damage events [7], epitaxial quantum dot growth [8] and so on. In either case a hierarchical approach is not ideal, and concurrent multiscale modeling is preferred [9].
Here we focus on the inhomogeneous systems, and in particular on systems like those mentioned above in which the most interesting behavior involves the mechanics of a nanoscale region, but the overall behavior also depends on how the nanoscale region is coupled to its large-scale surroundings. This embedded nanomechanics may be studied effectively with concurrent multiscale modeling, where regions dominated by different length scales are treated with different models, either explicitly through a hybrid approach or effectively through a derivative approach.
Specifically, we focus on the methodology of coarse-grained molecular dynamics (CGMD) [9–12], one example of a concurrent multiscale model. CGMD describes the dynamic behavior of solids concurrently at the atomistic level and at more coarse-grained levels. The CG description is similar to finite element
modeling (FEM) of continuum elasticity, with several important distinctions. CGMD is derived directly from the atomistic model without recourse to a continuum description. This approach is important because it allows a more seamless coupling of the atomistic and coarse-grained models. The other important distinction is that CGMD is designed for finite temperature, and the coarse-graining procedure makes use of the techniques of statistical mechanics to ensure that the model provides a robust description of the thermodynamics.
Several other concurrent multiscale models for solids have been proposed and used [13–18]. The Quasicontinuum technique is of particular note in this context, because it is also derived entirely from the underlying atomistic model [14]. CGMD was the first concurrent multiscale model designed for finite temperature simulations [10]. Recently, another finite temperature concurrent multiscale model has been developed using renormalization group techniques, including time renormalization [17]. This model is very interesting, although to date its formulation is based on a bond decimation procedure that is limited to simple models with pair-wise nearest-neighbor interactions. The formulation of CGMD is more flexible, making it compatible with most classical interatomic potentials. It has been applied to realistic potentials in 3D whose range extends beyond nearest neighbors.
2. Formulation of CGMD
Coarse-grained molecular dynamics provides a model whose minimum length scale may vary from one location to another in the system. The CGMD formulation begins with a specification of a mesh that defines the length scales that will be represented in each region (see Fig. 1). As in finite element modeling [19], the mesh is unstructured, and it comes with a set of shape functions that define how fields are continuously interpolated on the mesh. For example, the displacement field is the most basic field in CGMD, and it is approximated as
$$\mathbf{u}(\mathbf{x}) \approx \sum_j \mathbf{u}_j N_j(\mathbf{x}), \tag{1}$$
where $N_j(\mathbf{x})$ is the value of the $j$th shape function evaluated at the point $\mathbf{x}$ in the undeformed (reference) configuration. It is often useful to let $N_j(\mathbf{x})$ have support at node $j$ so that the coefficient $\mathbf{u}_j$ represents the displacement at node $j$, but it need not be so for the derivation of CGMD. We will refer to $\mathbf{u}_j$ as nodal displacements, bearing in mind that the coarse-grained fields could be more general. Ultimately the usual criteria to ensure well-behaved numerics will apply, such as the cells should not have high aspect ratios and the mesh size should not change too abruptly; for the purposes of the formulation, the only requirement we impose is that if a region of the mesh is at the atomic scale, the positions of the nodes coincide with equilibrium lattice sites. This is not required for coarser regions of the mesh.
Figure 1. Schematic diagram of a concurrent multiscale simulation of a NEMS silicon microresonator [4–6] to illustrate how a system may be decomposed into atomistic (MD) and coarse-grained (CG) regions. The CG region comprises most of the volume, but the MD region contains most of the simulated degrees of freedom. Note that the CG mesh is refined to the atomic scale where it joins with the MD lattice. (Labels in the figure: Micron Resonator, CG, MD.)
To the first approximation, CGMD is governed by mass and stiffness matrices. They are derived from the underlying atomistic physics, described by a molecular dynamics (MD) model [20]. Define the discrete shape functions by evaluating the shape function $N_j(\mathbf{x})$ at the equilibrium lattice site $\mathbf{x}_{0\mu}$ of atom $\mu$:
$$N_{j\mu} = N_j(\mathbf{x}_{0\mu}). \tag{2}$$
The discrete shape functions allow us to approximate the atomic displacements $\mathbf{u}_\mu \approx \sum_j \mathbf{u}_j N_{j\mu}$. If we were to make this a strict equality, we would be on the path to the Quasicontinuum technique. Instead, we consider this a constraint on the system, and allow all of the unconstrained degrees of freedom in the system to fluctuate in thermal equilibrium. In particular, we demand that the interpolating fields be best fits to the underlying atomistic degrees of freedom of the system. In the case of the displacement field this requirement means that the nodal displacements minimize the chi-squared error of the fit:
$$\chi^2 = \sum_\mu \Big| \mathbf{u}_\mu - \sum_j \mathbf{u}_j N_{j\mu} \Big|^2. \tag{3}$$
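The discrete shape functions of Eq. (2) and the fit error of Eq. (3) can be illustrated on a one-dimensional chain. The following is a minimal sketch, not the implementation used in the work described here: it assumes scalar displacements, a lattice constant of 1, and linear (hat) shape functions, and the names `hat_shape_functions`, `interpolate` and `chi_squared` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def hat_shape_functions(n_atoms, node_sites):
    """Discrete shape functions N[j][mu] of Eq. (2) for linear (hat) elements.

    Atom mu sits at reference position mu (lattice constant 1) and node j
    sits on atom node_sites[j]; the first and last atoms must be nodes.
    """
    N = [[Fraction(0)] * n_atoms for _ in node_sites]
    for j in range(len(node_sites) - 1):
        left, right = node_sites[j], node_sites[j + 1]
        for mu in range(left, right + 1):
            t = Fraction(mu - left, right - left)
            N[j][mu] = 1 - t
            N[j + 1][mu] = t
    return N


def interpolate(u_nodes, N):
    """Interpolated atomic displacements sum_j u_j N[j][mu]."""
    return [sum(u_j * N_j[mu] for u_j, N_j in zip(u_nodes, N))
            for mu in range(len(N[0]))]


def chi_squared(u_atoms, u_nodes, N):
    """Fit error of Eq. (3) for scalar displacements."""
    return sum((u - v) ** 2 for u, v in zip(u_atoms, interpolate(u_nodes, N)))


node_sites = [0, 4, 8]                       # three nodes on a nine-atom chain
N = hat_shape_functions(9, node_sites)

u_linear = [Fraction(mu, 10) for mu in range(9)]        # uniform strain
u_curved = [Fraction(mu * mu, 100) for mu in range(9)]  # varies within a cell

# Use the atomic displacements at the node sites as trial nodal values.
chi2_linear = chi_squared(u_linear, [u_linear[s] for s in node_sites], N)
chi2_curved = chi_squared(u_curved, [u_curved[s] for s in node_sites], N)
print(chi2_linear, chi2_curved)
```

A uniformly strained chain is represented exactly on this mesh, so its fit error vanishes, whereas a field that varies within a cell leaves a nonzero residual carried by the degrees of freedom that the mesh does not represent.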
The minimum of $\chi^2$ is given by
$$\mathbf{u}_j = (N N^T)^{-1}_{jk} N_{k\mu} \mathbf{u}_\mu \equiv f_{j\mu} \mathbf{u}_\mu, \tag{4}$$
where repeated indices are summed and the inverse is a matrix inverse. We have introduced the weighting function expressed in terms of the discrete shape function as $f_{j\mu} = (N N^T)^{-1}_{jk} N_{k\mu}$. Equation (4) provides the needed correspondence between the coarse and fine degrees of freedom. Once the weighting function $f_{j\mu}$ is defined, the CGMD energy is defined as an average energy over the ensemble of systems in different points in phase space satisfying the correspondence relation (4). Mathematically, this is expressed as
$$E(\mathbf{u}_k, \dot{\mathbf{u}}_k) = Z^{-1} \int d\mathbf{x}_\mu \, d\mathbf{p}_\mu \, H_{\mathrm{MD}} \, e^{-\beta H_{\mathrm{MD}}} \, \Delta, \tag{5}$$
where $Z$ is the constrained partition function (the same integral without the $H_{\mathrm{MD}}$ pre-exponential factor). The integral runs over the full $6N_{\mathrm{atom}}$-dimensional MD phase space. The inverse temperature is given by $\beta = 1/kT$. The factor $H_{\mathrm{MD}}$ is the MD Hamiltonian, the sum of the atomistic kinetic and potential energies. The potential energy is determined by an interatomic potential, a generalization of the well-known Lennard–Jones potential that typically includes non-linear many-body interactions [20]. The factor $\Delta$ is a product of delta functions enforcing the constraint,
$$\Delta = \prod_j \delta\Big(\mathbf{u}_j - \sum_\mu \mathbf{u}_\mu f_{j\mu}\Big) \, \delta\Big(\dot{\mathbf{u}}_j - \sum_\mu \frac{\mathbf{p}_\mu f_{j\mu}}{m_\mu}\Big). \tag{6}$$
Once the energy (5) is determined, the equations of motion are derived as the corresponding Euler–Lagrange equations. The CGMD energy (5) consists of kinetic and potential terms. The CGMD kinetic energy can be computed exactly using analytic techniques for any system; the CGMD potential energy can also be calculated exactly, provided the MD interatomic potential is harmonic. Anharmonic corrections may be computed in perturbation theory. The details are given in Ref. [11]. Here we focus on the harmonic case, in which the potential energy is quadratic in the atomic displacements, and the coefficient of the quadratic term (times 2) is known as the dynamical matrix, $D_{\mu\nu}$. The result for harmonic CGMD is that
$$E(\mathbf{u}_k, \dot{\mathbf{u}}_k) = U_{\mathrm{int}} + \tfrac{1}{2}\left(M_{jk} \, \dot{\mathbf{u}}_j \cdot \dot{\mathbf{u}}_k + \mathbf{u}_j \cdot K_{jk} \mathbf{u}_k\right), \tag{7}$$
$$U_{\mathrm{int}} = N_{\mathrm{atom}} E^{\mathrm{coh}} + 3(N_{\mathrm{atom}} - N_{\mathrm{node}}) kT, \tag{8}$$
$$M_{ij} = m \, N_{i\mu} N_{j\mu}, \tag{9}$$
$$K_{ij} = \left(f_{i\mu} D^{-1}_{\mu\nu} f_{j\nu}\right)^{-1} \tag{10}$$
$$\phantom{K_{ij}} = N_{i\mu} D_{\mu\nu} N_{j\nu} - D^{\times}_{i\mu} \tilde{D}^{-1}_{\mu\nu} D^{\times}_{j\nu}, \tag{11}$$
where $M_{ij}$ is the mass matrix and $K_{ij}$ is the stiffness matrix. Here again and throughout this Article a sum is implied whenever indices are repeated on one side of an equation unless otherwise noted. The internal energy $U_{\mathrm{int}}$ includes the total cohesive energy of the system, $N_{\mathrm{atom}} E^{\mathrm{coh}}$, as well as the internal energy of a collection of $(N_{\mathrm{atom}} - N_{\mathrm{node}})$ harmonic oscillators at finite temperature. The form of the mass matrix (9) assumes a monatomic lattice. A more general form is given in Ref. [11]. The two forms of the stiffness matrix are equivalent in principle, although in practice numerical considerations have favored one form or the other for particular applications. The first form (10) was used for the early CGMD applications. It is most suited for applications in which the nodal index may be Fourier transformed, such as the computation of phonon spectra. The second form (11) is better suited for real space applications. It depends on an off-diagonal block of the dynamical matrix
$$D^{\times}_{j\mu} = \left(\delta_{\mu\rho} - N_{k\mu} f_{k\rho}\right) D_{\rho\nu} N_{j\nu} \tag{12}$$
and a regularized form of the lattice Green function $\tilde{D}^{-1}_{\mu\nu}$ for the internal degrees of freedom that is defined in Ref. [11]. Note that the mass matrix and the compliance matrix (the inverse of the stiffness matrix) are weighted averages of the corresponding MD quantities, the MD mass and MD lattice Green function, respectively.
The CGMD equations of motion are derived from the CGMD Hamiltonian (5) using the Euler–Lagrange procedure
$$M_{jk} \ddot{\mathbf{u}}_k = -K_{jk} \mathbf{u}_k + \mathbf{F}^{\mathrm{ext}}_j, \tag{13}$$
where we have included the possibility of an external body force on node $j$ given by $\mathbf{F}^{\mathrm{ext}}_j$. The anharmonic corrections to these equations of motion form an infinite Taylor series in powers of $\mathbf{u}_k$ [11]. In regions of the mesh refined to the atomic level, it has been shown that the infinite series sums up to the MD interatomic forces; i.e., the original MD equations of motion are recovered in regions of the mesh refined to the atomic scale [10]. In the case of a harmonic system, the recovery of the MD equations of motion in the atomic limit should be clear from the equations for the mass and stiffness matrices. In this limit $N_{i\mu} = \delta_{i\mu}$ and $f_{i\mu} = \delta_{i\mu}$, so $M_{ij} = m\delta_{ij}$ and $K_{ij} = D_{ij}$ from Eqs. (9) and (10), respectively.
In practice, we define two complementary regions of the simulation. In the CG region, the harmonic CGMD equations of motion (13) are used, whereas in the region of the mesh refined to the atomic level, called the MD region, the anharmonic terms are restored through the use of the full MD equations of motion. In a CGMD simulation the mass and stiffness matrices are calculated once at the beginning of the simulation. The reciprocal space (Fourier transform) representation of the dynamical matrix is used in order to make the calculation of the stiffness matrix tractable. This representation implicitly assumes that the solid is in the form of a crystal lattice free from defects in the CG region.
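The mass matrix (9), the first form of the stiffness matrix (10), and the atomic limit described above can be checked on a toy model. This is a minimal sketch under assumptions that are not taken from the original work: a five-atom one-dimensional chain with nearest-neighbor springs whose ends are tethered to walls (chosen so that the dynamical matrix is invertible), scalar displacements, unit mass and spring constant, and a direct real-space inverse rather than the reciprocal-space evaluation mentioned in the text. All function names are hypothetical.

```python
from fractions import Fraction as Fr


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def inverse(A):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(A)
    aug = [[Fr(x) for x in row] + [Fr(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def chain_dynamical_matrix(n, spring=Fr(1)):
    """Nearest-neighbor harmonic chain with both ends tethered to walls."""
    return [[2 * spring if i == j else (-spring if abs(i - j) == 1 else Fr(0))
             for j in range(n)] for i in range(n)]


def cgmd_matrices(N, D, m=Fr(1)):
    """Return (M, K) from Eqs. (9) and (10) for scalar displacements."""
    NNt = matmul(N, transpose(N))
    f = matmul(inverse(NNt), N)                    # weighting function, Eq. (4)
    M = [[m * x for x in row] for row in NNt]      # Eq. (9)
    compliance = matmul(matmul(f, inverse(D)), transpose(f))
    return M, inverse(compliance)                  # Eq. (10)


half = Fr(1, 2)
N = [[1, half, 0, 0, 0],
     [0, half, 1, half, 0],
     [0, 0, 0, half, 1]]
D = chain_dynamical_matrix(5)
M, K = cgmd_matrices(N, D)

# Atomic limit: every atom is a node, so N is the identity matrix.
identity = [[Fr(int(i == j)) for j in range(5)] for i in range(5)]
M_atomic, K_atomic = cgmd_matrices(identity, D)
print(M_atomic == identity, K_atomic == D)
```

On the coarse mesh the mass matrix couples neighboring nodes and the stiffness matrix is symmetric; in the atomic limit the sketch returns $M_{ij} = m\delta_{ij}$ and $K_{ij} = D_{ij}$, as stated above.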
The CGMD mass matrix involves couplings between nearest neighbor nodes in the CG region, just as the distributed mass matrix of finite element modeling does. The fact that the mass matrix is not diagonal is inconvenient, since a system of equations must be solved in order to determine the nodal accelerations. The system of equations is sparse, but this step introduces some computational overhead, and it is desirable to eliminate it. In FEM, the distributed mass matrix is often replaced by a diagonal approximation, the lumped mass matrix [19]. In CGMD, the lumped mass approximation,
$$M^{\mathrm{lump}}_{ij} = m \, \delta_{ij} \sum_\mu N_{i\mu} \quad \text{(no sum on } i\text{)}, \tag{14}$$
has proven useful in the same way [9]. This definition assumes that the shape functions form a partition of unity, so that $\sum_i N_{i\mu} = 1$ for all $\mu$.
In principle, the determination of the equations of motion together with the relevant initial and boundary conditions completely specifies the problem. In practice, we have typically used a thermalized initial state and a mixture of periodic and free boundary conditions suitable for the problem of interest. The equations of motion are integrated in time using a velocity Verlet time integrator [20] with the conventional MD time step used throughout the simulation. The natural time scale of the CG nodes is longer due to the greater mass and greater compliance of the larger cells, and it would be natural to use a longer time step in the CG region. We have found little motivation to explore this possibility, however, since the computational cost of our simulations is typically dominated by the MD region, so there is little to gain by speeding up the computation in the CG region.
We now turn to the question of how CGMD simulations are analyzed. Much of the analysis of CGMD simulations is accomplished using standard MD techniques. The simulations are typically constructed such that the most interesting phenomena occur in the MD region, and here most of the usual MD tools may be brought to bear. Thermodynamic quantities are calculated in the usual way, and the identification and tracking of crystal lattice defects may be accomplished with conventional techniques. In some cases it may be of interest to analyze the simulation in the CG region, as well. For example, it may be of interest to plot the temperature throughout the simulation in order to verify that the behavior at the MD/CG interface is reasonable. In MD the temperature is directly related to the mean kinetic energy of the atoms: $kT = \tfrac{1}{3} m \langle |\dot{\mathbf{u}}|^2 \rangle$, where the brackets indicate the average [20]. In CGMD, a similar expression holds [11],
$$kT = \tfrac{1}{3} \langle |\dot{\mathbf{u}}_i|^2 \rangle / M^{-1}_{ii} \quad \text{(no sum on } i\text{)}, \tag{15}$$
where $M^{-1}_{ii}$ is the diagonal component corresponding to node $i$ of the inverse of the mass matrix. This analysis of the temperature and thermal oscillations is
closely tied to the kinetic energy in the CG region. Similar tools are available to analyze the potential energy and the related quantities such as deformation, pressure and stress [11].
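The lumped mass approximation of Eq. (14) and the velocity Verlet integration mentioned above can be sketched together. The example below is illustrative only and rests on several assumptions: scalar displacements on a five-atom, three-node one-dimensional mesh, unit atomic mass, and a placeholder nodal stiffness matrix `K_demo` that is not derived from Eq. (10). The function names are hypothetical.

```python
from fractions import Fraction as Fr


def distributed_mass(N, m):
    """Eq. (9): M_ij = m * sum_mu N[i][mu] * N[j][mu] (monatomic lattice)."""
    return [[m * sum(a * b for a, b in zip(Ni, Nj)) for Nj in N] for Ni in N]


def lumped_mass(N, m):
    """Eq. (14): the diagonal entries m * sum_mu N[i][mu]."""
    return [m * sum(Ni) for Ni in N]


def velocity_verlet(u, v, masses, K, dt, n_steps):
    """Integrate M u'' = -K u; a diagonal mass matrix needs no linear solve."""
    def accel(x):
        return [-sum(k * xj for k, xj in zip(row, x)) / mi
                for row, mi in zip(K, masses)]
    a = accel(u)
    for _ in range(n_steps):
        v = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]   # half kick
        u = [ui + dt * vi for ui, vi in zip(u, v)]         # drift
        a = accel(u)
        v = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]   # half kick
    return u, v


def energy(u, v, masses, K):
    kinetic = 0.5 * sum(mi * vi * vi for mi, vi in zip(masses, v))
    potential = 0.5 * sum(ui * sum(k * uj for k, uj in zip(row, u))
                          for ui, row in zip(u, K))
    return kinetic + potential


half = Fr(1, 2)
N = [[1, half, 0, 0, 0],
     [0, half, 1, half, 0],
     [0, 0, 0, half, 1]]
M = distributed_mass(N, Fr(1))
M_lump = lumped_mass(N, Fr(1))

# Placeholder stiffness for the demonstration; NOT obtained from Eq. (10).
K_demo = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
masses = [float(x) for x in M_lump]
u0, v0 = [0.1, 0.0, -0.05], [0.0, 0.0, 0.0]
e0 = energy(u0, v0, masses, K_demo)
u1, v1 = velocity_verlet(u0, v0, masses, K_demo, dt=0.01, n_steps=2000)
e1 = energy(u1, v1, masses, K_demo)
print(M_lump, abs(e1 - e0) / e0)
```

Because the shape functions form a partition of unity, each lumped mass equals the corresponding row sum of the distributed mass matrix and the total mass is preserved. With the diagonal mass matrix the accelerations follow from a simple division, which is the overhead saving described above.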
3. Validation
Validation of concurrent multiscale models is a challenge in its own right, and the development of quantitative tools and performance measures to analyze models like CGMD has taken place at the same time as the development of the first models. CGMD has been tested in several ways to see how it compares with a full MD simulation of a test system, as well as other concurrent multiscale simulations. The first test was the calculation of the spectrum of elastic waves or phonons. The techniques to calculate these spectra in atomistic systems were developed long ago in the field of lattice dynamics [21]. In general the phonon spectrum comprises $D$ acoustic mode branches (where $D$ is the number of dimensions) together with $D(N_{\mathrm{unit}} - 1)$ optical branches (where $N_{\mathrm{unit}}$ is the number of atoms in the elementary unit cell of the crystal lattice) [22]. The acoustic modes are distinguished by the fact that their frequency goes to zero as their wavelength becomes large. The infinite wavelength corresponds to uniform translation of the system, a process that costs no energy and hence corresponds to zero frequency.
Elastic wave spectra are an interesting test of CGMD and other concurrent multiscale techniques because they represent a test of dynamics and because elastic waves have a natural length scale associated with them: the wavelength. When a CG mesh is introduced, the shortest wavelengths are excluded. These modes are eliminated because they are irrelevant in the CG region, and their elimination increases the efficiency of the simulation. The test then is to see how well the model describes those longer wavelength modes that are represented in the CG region. The elastic wave spectra for solid argon were computed in CGMD on a uniform mesh for various mesh sizes, and compared to the MD spectra and spectra computed using a FEM model based on continuum elasticity [9, 11].
The bonds between argon atoms were modeled with a Lennard–Jones potential cut off at the fifth shell of neighboring atoms. Several interesting results were found. First, both CGMD and FEM agreed with the MD spectrum at long wavelengths. This is to be expected, since for wavelengths much longer than the mesh spacing, the waveform should be well represented on the mesh. Also, at long wavelengths the FEM assumption of a continuous medium is justified, and the slope of the spectrum gives the sound velocity, $c = \omega/k$ for $k \to 0$. Here $\omega$ is the (angular) frequency and $k$ is the wave number. The error in $\omega(k)$ was found to be of order $O(k^2)$ for FEM, as expected. It goes to zero in the long wavelength limit, $k \to 0$. One nice feature of CGMD was a reduced
error of order $O(k^4)$ [10]. Moreover, CGMD provides a better approximation of the elastic wave spectra for all wavelengths supported on the mesh. Of course, CGMD also has the important feature that the elastic wave spectra are reproduced exactly when the mesh is refined to the atomic level, a property that FEM does not possess. Interatomic forces are not merely FEM elasticity on an atomic-sized grid.
Solid argon forms a face-centered cubic crystal lattice and hence has only three acoustic wave branches in its phonon spectrum. For crystals with optical phonon branches, there is more than one way to implement the coarse-graining, depending on the physics that is of interest, but the general CGMD framework continues to work well [23].
The other validation of CGMD has been the study of the reflection of elastic waves from the MD/CG interface. For applications such as crack propagation, it has proven important to control this unphysical reflection. The reflected waves can propagate back into the heart of the MD simulation and interfere with the processes of interest. In the case of crack propagation, a noticeable anomaly in the crack speed occurs at the point in time when the reflected waves reach the crack tip [24]. The reflection coefficient, a measure of the amount of elastic wave energy reflected at a given wavelength, has been calculated for CGMD and FEM based on continuum elasticity [10, 11]. Typical results are shown in Fig. 2. Long wavelength elastic waves are transmitted into the CG region, whereas short wavelength modes are reflected. The short wavelengths cannot be supported on the mesh, and since energy is conserved, they must go somewhere, so they are reflected. The transmission threshold is expected to occur at a wave number $k_0 = \pi/(N_{\mathrm{max}} a)$. The CGMD threshold occurs precisely at this wave number, while the threshold for transmission in distributed mass and lumped mass FEM models occurs somewhat above and below this value, respectively.
Figure 2. A comparison of the reflection of elastic waves from a CG region in three cases: CGMD and two varieties of FEM. Note that the reflection coefficient is plotted on a log scale. A similar graph plotted on a linear scale is shown in Ref. [10]. The dashed line marks the natural cutoff [$k_0 = \pi/(N_{\mathrm{max}} a)$], where $N_{\mathrm{max}}$ is the number of atoms in the largest cells. The bumps in the curves are scattering resonances. Note that at long wavelengths CGMD offers significantly suppressed scattering. (Axes: reflection coefficient versus wave number $k/k_0$; curves: CGMD, lumped mass FEM, distributed mass FEM.)
The scattering in the long wavelength limit shows a generalized Rayleigh scattering behavior. In conventional Rayleigh scattering the scattering cross-section goes like $\sigma \sim k^4$, which is the behavior exhibited by scattering here in FEM. For CGMD, the scattering drops off more quickly at long wavelengths, with the reflection coefficient approximately proportional to $k^8$ [11].
One aspect of concurrent multiscale modeling that remains poorly understood is the set of requirements for a suitable mesh. Certainly, many of the desired properties are clear either from the nature of the problem or from experience with FEM. For example, the mesh needs to be refined to the atomic level in the MD region, so here the mesh nodes should coincide with equilibrium crystal lattice sites. In the periphery large cells are desirable since the gain in efficiency is proportional to the cell size. From FEM it is well known that the aspect ratio of the cells should not be too large. Beyond these basic criteria, one is left with the task of generating a mesh that interpolates between the atomic-sized cells in the MD region and the large cells in the periphery without introducing high aspect ratio cells. One question we have investigated is whether the abruptness of this transition matters, and indeed it does. Figure 3 shows the reflection coefficient as a function of the wave number for two meshes that go between an MD region and a CG region with a maximum cell size of 20 lattice spacings. In one case, the transition is made gradually, whereas in the other case it is made abruptly. The mesh with the
1 CGMD smooth mesh
0.8
Reflection Coefficient
Reflection Coefficient
1
CGMD abrupt mesh 0.6 0.4 0.2 0
⫺5
10
⫺10
10
CGMD smooth mesh ⫺15
CGMD abrupt mesh
10
⫺20
0
0.2
0.4
0.6
0.8
Wave number k/k0
1
1.2
1.4
10
0
0.2
0.4
0.6
0.8
1
1.2
1.4
Wave number k/k0
Figure 3. A comparison of the reflection of elastic waves from a CG region whose mesh varies smoothly in cell size and one with an abrupt change in cell size, both computed in CGMD. In both cases the reflection coefficient is plotted as a function of the wave number in units of the natural cutoff [k0 = π/(Nmax a)], where a is the lattice constant and Nmax a = 20a is the maximum linear cell size in the mesh. The pronounced series of scattering resonances in the case of the abruptly changing mesh is undesirable. The second panel is a log-linear plot of the same data in order to show how the series of scattering resonances continues at decreasing amplitudes to long wavelengths.
abrupt transition exhibits markedly increased scattering, including a series of strong scattering resonances. Note that the envelope of the scattering curve is well defined in the case of the abrupt mesh, a property used to calculate the scaling of the reflection coefficient, $R \sim k^8$.
4. Outlook
CGMD provides a formalism for concurrent multiscale modeling at finite temperature. The initial tests have been very encouraging, but there are still many ways in which CGMD can be developed. One area of active research is numerical algorithms to make CGMD more efficient for large simulations. The calculation of the stiffness matrix involves the inverse of a large matrix whose size grows with the number of nodes in the CG region, $N_{\mathrm{CGnode}}$. For the exact matrix without any cutoff, the calculation of the inverse scales like $N_{\mathrm{CGnode}}^3$ and the matrix storage scales like $N_{\mathrm{CGnode}}^2$. Even though the calculation of the stiffness matrix need only be done once during the simulation, the calculation has proven sufficiently onerous to prevent the application of CGMD to the large-scale simulations for which it was originally intended. Only now are linear scaling CGMD algorithms starting to become available. There are several directions in which CGMD has begun to be extended for specific applications. The implementation of CGMD described in this article conserves energy. It implicitly makes the assumption that the only thermal fluctuations that are relevant to the problem are those supported on the mesh. Fluctuations of the degrees of freedom that have been integrated out are neglected. Those fluctuations can be physically relevant in several ways [12]. First, they exert random and dissipative forces on the coarse-grained degrees of freedom in a process that is analogous to the forces in Brownian motion exerted on a large particle by the atoms in the surrounding liquid. Second, they also act as a heat bath that is able to exchange and transport thermal energy. Finally, they can transport energy in non-equilibrium processes, such as the waves generated by a propagating crack discussed above. A careful treatment of the CG system leads to a generalization of the CGMD equations of motion presented above [12].
In addition to the conservative forces, there are random and dissipative forces that form a generalized Langevin equation. The dissipative forces involve a memory function in time and space that acts to absorb waves that cannot be supported in the CG region. The memory kernel is similar to those that have been discussed in the context of absorbing boundary conditions for MD simulations [25, 26], except that in CGMD the range of the kernel is shorter because the long wavelength modes are able to propagate into the CG region and do not need to be absorbed. Interestingly, in the case of a CG region surrounded by MD regions, the memory kernel also contains propagators that recreate the absorbed waves on the far
side of the CG region after the appropriate propagation delay [12]. Of course, use of the generalized Langevin equation incurs additional computational expense, both in run time and in memory. There are many other ways in which CGMD could be extended. Additional CG fields could be introduced to model various material phenomena such as electrical polarization, defect concentrations and local temperature. Fluxes such as heat flow and defect diffusion can be included through the technique of coarse-graining the atomistic conservation equations. CGMD provides a powerful framework in which to formulate finite temperature multiscale models for a variety of applications.
Acknowledgments This article was prepared under the auspices of the US Department of Energy by University of California, Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.
References
[1] J.A. Moriarty, J.F. Belak, R.E. Rudd, P. Soderlind, F.H. Streitz, and L.H. Yang, “Quantum-based atomistic simulation of materials properties in transition metals,” J. Phys.: Condens. Matter, 14, 2825–2857, 2002.
[2] A.A. Townsend, The Structure of Turbulent Shear Flow, 2nd edition, Cambridge University Press, Cambridge, 1976.
[3] F.F. Abraham, J.Q. Broughton, E. Kaxiras, and N. Bernstein, “Spanning the length scales in dynamic simulation,” Comput. in Phys., 12, 538–546, 1998.
[4] F.F. Abraham, R. Walkup, H. Gao, M. Duchaineau, T. Diaz de la Rubia, and M. Seager, “Simulating materials failure by using up to one billion atoms and the world’s fastest computer: work-hardening,” Proc. Natl. Acad. Sci. USA, 99, 5783–5787, 2002.
[5] D.R. Mason, R.E. Rudd, and A.P. Sutton, “Atomistic modelling of diffusional phase transformations with elastic strain,” J. Phys.: Condens. Matter, 16, S2679–S2697, 2004.
[6] R.E. Rudd and J.Q. Broughton, “Atomistic simulation of MEMS resonators through the coupling of length scales,” J. Model. Simul. Microsys., 1, 29–38, 1999.
[7] R.S. Averback and T. Diaz de la Rubia, “Fundamental studies of radiation effects in solids,” Solid State Phys., 51, 281–402, 1998.
[8] R.E. Rudd, G.A.D. Briggs, A.P. Sutton, G. Medeiros-Ribeiro, and R.S. Williams, “Equilibrium model of bimodal distributions of epitaxial island growth,” Phys. Rev. Lett., 90, 146101, 2003.
[9] R.E. Rudd and J.Q. Broughton, “Concurrent multiscale simulation of solid state systems,” Phys. Stat. Sol. (b), 217, 251–291, 2000.
[10] R.E. Rudd and J.Q. Broughton, “Coarse-grained molecular dynamics and the atomic limit of finite elements,” Phys. Rev. B, 58, R5893–R5896, 1998.
[11] R.E. Rudd and J.Q. Broughton, “Coarse-grained molecular dynamics: non-linear finite elements and finite temperature,” Phys. Rev. B, 2004 (unpublished).
[12] R.E. Rudd, “Coarse-grained molecular dynamics: dissipation due to internal modes,” Mater. Res. Soc. Symp. Proc., 695, T10.2, 2002.
[13] S. Kohlhoff, P. Gumbsch, and H.F. Fischmeister, “Crack-propagation in bcc crystals studied with a combined finite-element and atomistic model,” Philos. Mag. A, 64, 851–878, 1991.
[14] E.B. Tadmor, M. Ortiz, and R. Phillips, “Quasicontinuum analysis of defects in solids,” Philos. Mag. A, 73, 1529–1563, 1996.
[15] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, “Concurrent coupling of length scales: methodology and application,” Phys. Rev. B, 60, 2391–2403, 1999.
[16] L.E. Shilkrot, R.E. Miller, and W.A. Curtin, “Coupled atomistic and discrete dislocation plasticity,” Phys. Rev. Lett., 89, 025501, 2002.
[17] S. Curtarolo and G. Ceder, “Dynamics of an inhomogeneously coarse grained multiscale system,” Phys. Rev. Lett., 88, 255504, 2002.
[18] W.A. Curtin and R.E. Miller, “Atomistic/continuum coupling in computational materials science,” Modell. Simul. Mater. Sci. Eng., 11, R33–R68, 2003.
[19] T.J.R. Hughes, The Finite Element Method: Linear Static and Dynamic Finite Element Analysis, Dover, Mineola, 2000.
[20] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
[21] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Clarendon Press, Oxford, 1954.
[22] N.W. Ashcroft and N.D. Mermin, Solid State Physics, Saunders College Press, Philadelphia, 1976.
[23] B. Kraczek, private communication, 2003.
[24] B.L. Holian and R. Ravelo, “Fracture simulations using large-scale molecular-dynamics,” Phys. Rev. B, 51, 11275–11288, 1995.
[25] W. Cai, M. de Koning, V.V. Bulatov, and S. Yip, “Minimizing boundary reflections in coupled-domain simulations,” Phys. Rev. Lett., 85, 3213–3216, 2000.
[26] W. E and Z. Huang, “Matching conditions in atomistic-continuum modeling of materials,” Phys. Rev. Lett., 87, 135501, 2001.
2.13 THE THEORY AND IMPLEMENTATION OF THE QUASICONTINUUM METHOD
E.B. Tadmor¹ and R.E. Miller²
¹ Technion–Israel Institute of Technology, Haifa, Israel
² Carleton University, Ottawa, ON, Canada
While atomistic simulations have provided great insight into the basic mechanisms of processes like plasticity, diffusion and phase transformations in solids, there is an important limitation to these methods. Specifically, the number of atoms in any realistic macroscopic structure is typically much too large for direct simulation. Consider that the current benchmark for large-scale fully atomistic simulations is on the order of $10^9$ atoms, using massively parallel computer facilities with hundreds or thousands of CPUs. This represents 1/10 000 of the number of atoms in a typical grain of aluminum, and 1/1 000 000 of the atoms in a typical micro-electro-mechanical systems (MEMS) device. Further, it is apparent that with such a large number of atoms, substantial regions of a problem of interest are essentially behaving like a continuum. Clearly, while fully atomistic calculations are essential to our understanding of the basic “unit” mechanisms of deformation, they will never replace continuum models altogether. The goal for many researchers, then, has been to develop techniques that retain a largely continuum mechanics framework, but impart to that framework enough atomistic information to be relevant to modeling a problem of interest. In many examples, this means that a certain, relatively small, fraction of a problem requires full atomistic detail while the rest can be modeled using the assumptions of continuum mechanics. The quasicontinuum method (QC) has been developed as a framework for such mixed atomistic/continuum modeling. The QC philosophy is to consider the atomistic description as the “exact” model of material behaviour, but at the same time acknowledge that the sheer number of atoms makes most problems intractable in a fully atomistic framework. Then, the QC uses continuum assumptions to reduce the degrees of freedom and computational demand without losing atomistic detail in regions where it is required.
S. Yip (ed.), Handbook of Materials Modeling, 663–682. © 2005 Springer. Printed in the Netherlands.
The purpose of this article is to provide an overview of the theoretical underpinnings of the QC method, and to shed light on practical issues involved in its implementation. The focus of the article will be on the specific implementation of the QC method as put forward in Refs. [1–4]. Variations on this implementation, enhancements, and details of specific applications will not be presented. For the interested reader, these additional topics can be found in several QC review articles [5–8] and of course in the original references. The most recent of the QC reviews [5] provides an extensive literature survey, detailing many different implementations, extensions and applications of the QC. Also included in that review are several other coupled methods that are either direct descendants of the QC or are similar alternatives developed independently. For a detailed comparison between several coupled atomistic/continuum methods including the QC, the reader may find the review by Curtin and Miller [9] of interest. A QC website designed to serve as a clearinghouse for information on the QC method has been established at www.qcmethod.com. The site includes information on QC research, links to researchers, downloadable QC code and documentation. The downloadable code is freely available and corresponds to the QC implementation discussed in this paper.
1. Atomistic Modeling of Crystalline Solids
In the QC, the point-of-view which is adopted is that there is an underlying atomistic model of the material which is the “correct” description of the material behaviour. This could, in principle, be a quantum-mechanically based description such as density functional theory (DFT), but in practice the focus has been primarily on atomistic models based on semi-empirical interatomic potentials. A review of such methods can be found, for example, in [10]. Here, we present only the features of such models which are essential for our discussion. We focus on lattice statics solutions, i.e., we are looking for equilibrium atomic configurations for a given model geometry and externally imposed forces or displacements, because most applications of the QC have used a static implementation. Recent work to extend QC to finite temperature and dynamic simulations shows promise, and can be found in Ref. [11]. We assume that there is some reference configuration of $N$ atomic nuclei, confined to a lattice. Thus, the reference position of the $i$th atom in the model, $\mathbf{X}_i$, is found from an integer combination of lattice vectors and a reference (origin) atom position, $\mathbf{X}_0$:
$$\mathbf{X}_i = \mathbf{X}_0 + l_i \mathbf{A}_1 + m_i \mathbf{A}_2 + n_i \mathbf{A}_3, \tag{1}$$
where $(l_i, m_i, n_i)$ are integers and $\mathbf{A}_j$ is the $j$th Bravais lattice vector.¹ The deformed position of the $i$th atom, $\mathbf{x}_i$, is then found from a unique displacement vector $\mathbf{u}_i$ for each atom:
$$\mathbf{x}_i = \mathbf{X}_i + \mathbf{u}_i. \tag{2}$$
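Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `lattice_positions` and `deform`, the fcc primitive vectors with lattice constant 4.05 (roughly aluminum, in angstroms), and the 1% stretch are hypothetical choices made here and are not taken from the QC code.

```python
from itertools import product

def lattice_positions(origin, A1, A2, A3, n_cells):
    """Reference positions X_i = X_0 + l*A1 + m*A2 + n*A3 for 0 <= l, m, n < n_cells."""
    positions = []
    for l, m, n in product(range(n_cells), repeat=3):
        X = tuple(origin[c] + l * A1[c] + m * A2[c] + n * A3[c] for c in range(3))
        positions.append(X)
    return positions

def deform(X_ref, displacement):
    """Deformed positions x_i = X_i + u_i, with u_i = displacement(X_i)."""
    deformed = []
    for X in X_ref:
        u = displacement(X)
        deformed.append(tuple(X[c] + u[c] for c in range(3)))
    return deformed

# Illustrative input (assumption): fcc primitive lattice vectors, a = 4.05
a = 4.05
A1, A2, A3 = (0.0, a / 2, a / 2), (a / 2, 0.0, a / 2), (a / 2, a / 2, 0.0)
X_ref = lattice_positions((0.0, 0.0, 0.0), A1, A2, A3, n_cells=3)

# Illustrative displacement field (assumption): 1% stretch along the first axis
stretch = lambda X: (0.01 * X[0], 0.0, 0.0)
x_def = deform(X_ref, stretch)
```

Evaluating the displacement as a function of the reference position mirrors the continuous field $\mathbf{u}(\mathbf{X})$ with $\mathbf{u}(\mathbf{X}_i) \equiv \mathbf{u}_i$ used in the text.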
The displacements $\mathbf{u}_i$, while only having physical meaning on the atomic sites, can be treated as a continuous field $\mathbf{u}(\mathbf{X})$ throughout the body with the property that $\mathbf{u}(\mathbf{X}_i) \equiv \mathbf{u}_i$. This approach, while not the conventional one in atomistic models, is useful in effecting the connection to continuum mechanics. Note that for brevity we will often refer to the field $\mathbf{u}$ to represent the set of all atomic displacements $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_N\}$, where $N$ is the number of atoms in the body. In standard lattice statics approaches using semi-empirical potentials, there is a well defined total energy function $E^{\mathrm{tot}}$ that is determined from the relative positions of all the atoms in the problem. In many semi-empirical models, this energy can be written as a sum over the energy of each individual atom. Specifically,
$$E^{\mathrm{tot}} = \sum_{i=1}^{N} E_i(\mathbf{u}), \tag{3}$$
where $E_i$ is the site energy of atom $i$, which depends on the displacements $\mathbf{u}$ through the relative positions of all the atoms in the deformed configuration. For example, within the embedded atom method (EAM) [13, 14] atomistic model, this site energy is given by
$$E_i = U_i(\bar{\rho}_i) + \frac{1}{2} \sum_{j \neq i} V_{ij}(r_{ij}), \tag{4}$$
where $U_i$ can be interpreted as an electron-density dependent embedding energy, $V_{ij}$ is a pair potential between atom $i$ and its neighbor $j$, and $r_{ij} = \sqrt{(\mathbf{x}_i - \mathbf{x}_j) \cdot (\mathbf{x}_i - \mathbf{x}_j)}$ is the interatomic distance. The electron density at the position of atom $i$, $\bar{\rho}_i$, is the superposition of spherically averaged density contributions from each of the neighbors, $\rho_j$:
$$\bar{\rho}_i = \sum_{j \neq i} \rho_j(r_{ij}). \tag{5}$$
A similar site energy can be identified for other empirical atomistic models, such as those of the Stillinger–Weber type [15], for instance.

¹ We omit a discussion of complex lattices with more than one atom at each Bravais lattice site. This topic is discussed in Refs. [5, 12].
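The structure of Eqs. (3) to (5) can be sketched as follows. The three functions `rho_contrib`, `pair_potential` and `embedding` are hypothetical toy forms chosen only so that the example runs; they are not a fitted EAM potential, and the cutoff value is likewise an assumption.

```python
import math

def rho_contrib(r):
    """Hypothetical density contribution rho_j(r) of one neighbor."""
    return math.exp(-2.0 * r)

def pair_potential(r):
    """Hypothetical repulsive pair term V_ij(r)."""
    return 5.0 * math.exp(-4.0 * r)

def embedding(rho_bar):
    """Hypothetical embedding energy U_i(rho_bar)."""
    return -math.sqrt(rho_bar)

def site_energy(i, positions, cutoff=2.5):
    """E_i = U(rho_bar_i) + 1/2 * sum_{j != i} V(r_ij), Eqs. (4) and (5)."""
    rho_bar = 0.0
    pair_sum = 0.0
    for j, xj in enumerate(positions):
        if j == i:
            continue
        r = math.dist(positions[i], xj)   # interatomic distance r_ij
        if r < cutoff:
            rho_bar += rho_contrib(r)
            pair_sum += pair_potential(r)
    return embedding(rho_bar) + 0.5 * pair_sum

def total_energy(positions, cutoff=2.5):
    """E_tot = sum_i E_i, Eq. (3)."""
    return sum(site_energy(i, positions, cutoff) for i in range(len(positions)))

# Two atoms at unit separation (illustrative configuration)
dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
E_dimer = total_energy(dimer)
```

The double loop over all pairs is kept for clarity; a production code would use neighbor lists so that each site energy only visits atoms within the cutoff.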
In addition to the potential energy of the atoms, there may be energy due to external loads applied to atoms. Thus, the total potential energy of the system (atoms plus external loads) can be written as
$$\Phi(\mathbf{u}) = E^{\mathrm{tot}}(\mathbf{u}) - \sum_{i=1}^{N} \mathbf{f}_i \cdot \mathbf{u}_i, \tag{6}$$
where $-\mathbf{f}_i \cdot \mathbf{u}_i$ is the potential energy of the applied load $\mathbf{f}_i$ on atom $i$. In lattice statics, we seek the displacements $\mathbf{u}$ such that this potential energy is minimized.
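To make the minimization of Eq. (6) concrete, the following sketch uses a one-dimensional chain of harmonic bonds as a stand-in for $E^{\mathrm{tot}}$ and a plain steepest-descent loop. Both the toy energy and the minimizer are assumptions made for illustration; the function names are hypothetical.

```python
def total_potential(u, f, k=1.0):
    """Phi(u) = E_tot(u) - sum_i f_i * u_i for a 1D chain of harmonic bonds (toy E_tot)."""
    e_tot = sum(0.5 * k * (u[i + 1] - u[i]) ** 2 for i in range(len(u) - 1))
    work = sum(fi * ui for fi, ui in zip(f, u))
    return e_tot - work

def gradient(u, f, k=1.0):
    """dPhi/du_i, written out analytically for the harmonic chain."""
    g = [0.0] * len(u)
    for i in range(len(u) - 1):
        stretch = u[i + 1] - u[i]
        g[i] -= k * stretch
        g[i + 1] += k * stretch
    return [gi - fi for gi, fi in zip(g, f)]

def minimize(f, k=1.0, step=0.2, tol=1e-10, max_iter=200000):
    """Steepest descent with atom 0 held fixed (a displacement boundary condition)."""
    u = [0.0] * len(f)
    for _ in range(max_iter):
        g = gradient(u, f, k)
        g[0] = 0.0                      # fixed atom: never updated
        if max(abs(gi) for gi in g) < tol:
            break
        u = [ui - step * gi for ui, gi in zip(u, g)]
    return u

# Pull on the last atom of a five-atom chain with unit force (illustrative load)
f = [0.0, 0.0, 0.0, 0.0, 1.0]
u_min = minimize(f)
```

For this toy problem every bond carries the same tension at equilibrium, so the displacement grows linearly along the chain, which gives a simple check on the minimizer.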
2. The QC Method
The goal of the static QC method is to find the atomic displacements that minimize Eq. (6) by approximating the total energy of Eq. (3) such that:
1. the number of degrees of freedom is substantially reduced from $3N$, but the full atomistic description is retained in certain “critical” regions,
2. the computation of the energy in Eq. (3) is accurately approximated without the need to explicitly compute the site energy of all the atoms,
3. the fully atomistic, critical regions can evolve with the deformation during the simulation.
In this section, the details of how the QC achieves each of these goals are presented.
2.1. Removing Degrees of Freedom
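A key measure of a displacement field is the deformation gradient $\mathbf{F}$. A body deforms from reference state $\mathbf{X}$ to deformed state $\mathbf{x} = \mathbf{X} + \mathbf{u}(\mathbf{X})$, from which we define
$$\mathbf{F}(\mathbf{X}) \equiv \frac{\partial \mathbf{x}}{\partial \mathbf{X}} = \mathbf{I} + \frac{\partial \mathbf{u}}{\partial \mathbf{X}}, \tag{7}$$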
where I is the identity tensor. If the deformation gradient changes gradually on the atomic scale, then it is not necessary to explicitly track the displacement of every atom in the region. Instead, the displacements of a small fraction of the atoms (called representative atoms or “repatoms”) can be treated explicitly, with the displacements of the remaining atoms approximately found through interpolation. In this way, the degrees of freedom are reduced to only the coordinates of the repatoms.
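As a small numerical illustration of Eq. (7), the deformation gradient of a given displacement field can be estimated by central finite differences. The two-dimensional simple-shear field and the name `deformation_gradient` are assumptions introduced here for the example.

```python
def deformation_gradient(displacement, X, h=1e-6):
    """F = I + du/dX at point X (Eq. 7), using central finite differences."""
    dim = len(X)
    F = [[1.0 if a == b else 0.0 for b in range(dim)] for a in range(dim)]
    for b in range(dim):
        X_plus = list(X)
        X_plus[b] += h
        X_minus = list(X)
        X_minus[b] -= h
        u_plus = displacement(X_plus)
        u_minus = displacement(X_minus)
        for a in range(dim):
            F[a][b] += (u_plus[a] - u_minus[a]) / (2.0 * h)
    return F

# Illustrative field (assumption): simple shear of magnitude gamma, u = (gamma * X_2, 0)
gamma = 0.05
shear = lambda X: (gamma * X[1], 0.0)
F = deformation_gradient(shear, (0.3, 0.7))
```

For a homogeneous field such as this one, $\mathbf{F}$ is the same at every point, which is the limiting case in which tracking only a few representative atoms loses no information.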
The QC incorporates such a scheme by recourse to the interpolation functions of the finite element method (FEM) (see, for example, [16]). Figure 1 illustrates the approach in two dimensions in the vicinity of a dislocation core. The filled atoms are the selected repatoms, which are meshed by a space-filling set of linear triangular finite elements. Any atom not chosen as a repatom, like the one labeled “A”, is subsequently constrained to move according to the interpolated displacements of the element in which it resides. The density of repatoms is chosen to vary in space according to the needs of the problem of interest. In regions where full atomistic detail is required, all atoms are chosen as repatoms, with correspondingly fewer in regions of more slowly varying deformation gradient. This is illustrated in Fig. 1, where all the atoms around the dislocation core are chosen as repatoms. Further away, where the crystal experiences only the linear elastic strains due to the dislocation, the density of repatoms is reduced. This first approximation of the QC, then, is to replace the energy $E^{\mathrm{tot}}$ by $E^{\mathrm{tot},h}$:
$$E^{\mathrm{tot},h} = \sum_{i=1}^{N} E_i(\mathbf{u}^h). \tag{8}$$
In this equation the atomic displacements are now found through the interpolation functions and take the form
$$\mathbf{u}^h = \sum_{\alpha=1}^{N_{\mathrm{rep}}} S_\alpha \mathbf{u}_\alpha, \tag{9}$$
where $S_\alpha$ is the interpolation (shape) function associated with repatom $\alpha$, and $N_{\mathrm{rep}}$ is the number of repatoms, $N_{\mathrm{rep}} \ll N$. Note that the formal summation over the shape functions in Eq. (9) is in practice much simpler due to the compact support of the finite element shape functions. Specifically, shape functions are identically zero in every element not immediately adjacent to a specific repatom. Referring back to Fig. 1, this means that the displacement of atom A is determined entirely from the sum over the three repatoms B, C and D defining the element containing A:
$$\mathbf{u}^h(\mathbf{X}_A) = S_B(\mathbf{X}_A)\mathbf{u}_B + S_C(\mathbf{X}_A)\mathbf{u}_C + S_D(\mathbf{X}_A)\mathbf{u}_D. \tag{10}$$
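Eq. (10) can be sketched for a single linear triangle, for which the shape functions are the barycentric coordinates of the point. The node coordinates, nodal displacements and function names below are hypothetical values chosen for illustration.

```python
def shape_functions(X, XB, XC, XD):
    """Linear (barycentric) shape functions S_B, S_C, S_D of a triangle, evaluated at X."""
    det = (XC[0] - XB[0]) * (XD[1] - XB[1]) - (XD[0] - XB[0]) * (XC[1] - XB[1])
    SC = ((X[0] - XB[0]) * (XD[1] - XB[1]) - (XD[0] - XB[0]) * (X[1] - XB[1])) / det
    SD = ((XC[0] - XB[0]) * (X[1] - XB[1]) - (X[0] - XB[0]) * (XC[1] - XB[1])) / det
    return 1.0 - SC - SD, SC, SD

def interpolate(X, nodes, node_displacements):
    """u^h(X) = S_B u_B + S_C u_C + S_D u_D, Eq. (10)."""
    S = shape_functions(X, *nodes)
    return tuple(sum(S[a] * node_displacements[a][c] for a in range(3)) for c in range(2))

# Illustrative element (assumption): repatoms B, C, D and their displacements
nodes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
u_nodes = [(0.0, 0.0), (0.2, 0.0), (0.0, -0.1)]

X_A = (1.0, 2.0)                       # reference position of the constrained atom A
S_A = shape_functions(X_A, *nodes)     # (0.25, 0.25, 0.5) for this geometry
u_A = interpolate(X_A, nodes, u_nodes)
```

Each shape function equals one at its own node and zero at the other two, so a repatom always recovers its own displacement exactly.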
Introducing this kinematic constraint on most of the atoms in the body will achieve the goal of reducing the number of degrees of freedom in the problem, but notice that for the purpose of energy minimization we must still compute the energy and forces on the degrees of freedom by explicitly visiting every atom – not just the repatoms – and building its neighbor environment from the interpolated displacement fields. Next, we discuss how these calculations are approximated and made computationally tractable.
[Figure 1, panels (a) and (b); atom A and repatoms B, C and D are labeled in panel (b).]
Figure 1. Selection of repatoms from all the atoms near a dislocation core is shown in (a); the repatoms are then meshed by linear triangular elements in (b). The density of the repatoms varies according to the severity of the variation in the deformation gradient. After Ref. [5]. Reproduced with permission.
2.2. Efficient Energy Calculations: The Local QC
In addition to the degree of freedom reduction described in Section 2.1, the QC requires an efficient means of computing the energy and forces without the need to visit every atom in the problem as implied by Eq. (8). The first way to accomplish this is by recourse to the so-called Cauchy–Born (CB) rule (see Ref. [17] and references therein), resulting in what is referred to as the local formulation of the QC.¹ The use of linear shape functions to interpolate the displacement field means that within each element, the deformation gradient will be uniform. The Cauchy–Born rule assumes that a uniform deformation gradient at the macro-scale can be mapped directly to the same uniform deformation on the micro-scale. For crystalline solids with a simple lattice structure,² this means that every atom in a region subject to a uniform deformation gradient will be energetically equivalent. Thus, the energy within an element can be estimated by computing the energy of one atom in the deformed state and multiplying by the number of atoms in the element. In practice, the calculation of the CB energy is done separately from the model in a “black box,” where for a given deformation gradient $\mathbf{F}$, a unit cell with periodic boundary conditions is deformed appropriately and its energy is computed. The strain energy density in the element is then given by
$$\mathcal{E}(\mathbf{F}) = \frac{E_0(\mathbf{F})}{\Omega_0}, \tag{11}$$
where $\Omega_0$ is the unit cell volume (in the reference configuration) and $E_0$ is the energy of the unit cell when its lattice vectors are distorted according to $\mathbf{F}$. Now the total energy of an element is simply this energy density times the element volume, and the total energy of the problem is simply the sum of element energies:
$$E^{\mathrm{tot},h} \approx \sum_{e=1}^{N_{\mathrm{element}}} \Omega_e \mathcal{E}(\mathbf{F}_e), \tag{12}$$
where $\Omega_e$ is the volume of element $e$. The important computational saving made here is that a sum over all the atoms in the body has been replaced by a sum over all the elements, each one requiring an explicit energy calculation for only one atom. Since the number of elements is typically several orders of magnitude smaller than the total number of atoms, the computational savings is substantial. The number of elements scales linearly with the number of repatoms, and so the local QC scales as $O(N_{\mathrm{rep}})$.

¹ The term “local” refers to the fact that use of the CB rule implies that the energy at each point in the continuum will only be a function of the deformation at that point and not of its surroundings.
² A simple lattice structure is one for which there is only one atom at each Bravais lattice site. In a complex lattice with two or more atoms per site, the Cauchy–Born rule must be generalized to permit shuffling of the off-site atoms. See Ref. [12].

Note, however, that even in the case where the deformation is uniform within each element, the local prescription for the energy in the element is only approximate. This is because in the constrained displacement field $\mathbf{u}^h$, the deformation gradient varies from one element to the next. At element boundaries and free surfaces, atoms can have energies that differ significantly from that of an atom in a bulk, uniformly deformed lattice. Figure 2 illustrates this schematically for an initially square lattice deformed according to two different deformation gradients in two neighboring regions. The energy of the atom labeled as a “bulk atom” can be accurately computed from the CB rule; its neighbor environment is uniform even though some of its neighbors occupy other elements. However, the “interface atom” and “surface atom” are not accurately described by the CB rule, which assumes that these atoms see uniformly deformed bulk environments. In situations where the deformation is varying slowly from one element to the next and where surface energetics are not important, the local approximation is a good one. Using the CB rule as in Eq. (11), the QC can be thought of as a purely continuum formulation, but with a constitutive law that is based on
[Figure 2, panels “Reference” and “Deformed”; labeled atoms: interface atom, surface atom, bulk atom.]
Figure 2. On the left, the reference configuration of a square lattice meshed by triangular elements. On the right, the deformed mesh shows a bulk atom, for which the CB rule is exactly correct, and two other atoms for which the CB rule will give the wrong energy due to its inability to describe surfaces or changes in the deformation gradient. After Ref. [5]. Reproduced with permission.
atomistics rather than on an assumed phenomenological form. The CB constitutive law automatically ensures that the correct anisotropic crystal elasticity response will be recovered for small deformations. It is non-linear elastic (as dictated by the underlying atomistic potentials) for intermediate strains and includes lattice invariance for large deformations; for example, a shear deformation that corresponds to the twinning of the lattice will lead to a rotated crystal structure with zero strain energy density. An advantage of the local QC formulation is that it allows the use of quantum-mechanical atomistic models that cannot be written as a sum over individual atom energies, such as tight binding (TB) and DFT. In these models only the total energy of a collection of atoms can be obtained. However, for a lattice undergoing a uniform deformation it is possible to compute the energy density $\mathcal{E}(\mathbf{F})$ from a single unit cell with periodic boundary conditions. Incorporation of quantum-mechanical information into the atomic model generally ensures that the description is more transferable, i.e., it provides a better description of the energy of atomic configurations away from the reference structure to which empirical potentials are fitted. This allows truly first-principles simulations of some macroscopic processes such as homogeneous phase transformations.
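The “black box” Cauchy–Born calculation of Eq. (11) and the element sum of Eq. (12) can be sketched for a two-dimensional triangular lattice. Everything specific here is an assumption for illustration: the Lennard-Jones pair potential stands in for a real interatomic model, areas play the role of volumes, and the lattice spacing, cutoff and function names are arbitrary.

```python
import math

def lj(r, eps=1.0, sigma=1.0):
    """Lennard-Jones pair potential (stand-in for a real interatomic model)."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)

A1 = (1.1, 0.0)                                  # 2D triangular lattice vectors
A2 = (0.55, 0.55 * math.sqrt(3.0))
OMEGA_0 = A1[0] * A2[1] - A1[1] * A2[0]          # reference area per atom

def cb_energy_density(F, cutoff=2.5, n_max=4):
    """E_0(F) / Omega_0 (Eq. 11): energy of one atom in the uniformly deformed lattice.

    n_max bounds the lattice sum; 4 is enough for this cutoff at moderate strains.
    """
    e0 = 0.0
    for l in range(-n_max, n_max + 1):
        for m in range(-n_max, n_max + 1):
            if l == 0 and m == 0:
                continue
            R = (l * A1[0] + m * A2[0], l * A1[1] + m * A2[1])
            r = math.hypot(F[0][0] * R[0] + F[0][1] * R[1],
                           F[1][0] * R[0] + F[1][1] * R[1])
            if r < cutoff:
                e0 += 0.5 * lj(r)                # half of each bond belongs to this atom
    return e0 / OMEGA_0

def local_qc_energy(elements):
    """Eq. (12): sum over (reference area, F) pairs of area times energy density."""
    return sum(area * cb_energy_density(F) for area, F in elements)

identity = [[1.0, 0.0], [0.0, 1.0]]
c, s = math.cos(0.4), math.sin(0.4)
rotation = [[c, -s], [s, c]]                      # rigid rotation
g = 2.0 / math.sqrt(3.0)
invariant_shear = [[1.0, g], [0.0, 1.0]]          # maps A2 to A2 + A1, A1 to A1
stretch = [[1.02, 0.0], [0.0, 1.0]]

e_ref = cb_energy_density(identity)
e_rot = cb_energy_density(rotation)
e_shear = cb_energy_density(invariant_shear)
E_mesh = local_qc_energy([(2.0, identity), (3.0, stretch)])
```

The rotation and the lattice-invariant shear both return the energy density of the undeformed crystal, which is the frame indifference and lattice invariance of the CB constitutive law noted in the text. The energy here is not shifted to zero at the ground state, so it is the equality with `e_ref`, not a zero value, that expresses zero strain energy.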
2.3. More Accurate Calculations: Mixed Local/Non-Local QC
The local QC formulation successfully enhances the continuum FEM framework with atomistic properties such as nonlinearity, crystal symmetry and lattice invariance. The latter property means that dislocations may exist in the local QC. However, the core structure and energy of these dislocations will only be coarsely represented due to the CB approximation of the energy. The same is true for other defects such as surfaces and interfaces, where the deformation of the crystal is non-uniform over distances shorter than the cutoff radius of the interatomic potentials. For example, to correctly account for the energy of the interface shown in Fig. 2, the non-uniform environment of the atoms along the interface must be correctly accounted for. While the local QC can support deformations (such as twinning) which may lead to microstructures containing such interfaces, it will not account for the energy cost of the interface itself. In order to correctly capture these details, the QC must be made non-local in certain regions. The energy of Eq. (8), which in the local QC was approximated by Eq. (12), must instead be approximated in a way that is sensitive to non-uniform deformation and free surfaces, especially in the limit where full atomistic detail is required.
We now make the ansatz that the energy of Eq. (8) can be approximated by computing only the energy of the repatoms, but we will identify each repatom as being either local or non-local depending on its deformation environment. Thus, the repatoms are divided into $N_{\mathrm{loc}}$ local repatoms and $N_{\mathrm{nl}}$ non-local repatoms ($N_{\mathrm{loc}} + N_{\mathrm{nl}} = N_{\mathrm{rep}}$). The energy expression is then approximated as
$$E^{\mathrm{tot},h} \approx \sum_{\alpha=1}^{N_{\mathrm{nl}}} n_\alpha E_\alpha(\mathbf{u}^h) + \sum_{\alpha=1}^{N_{\mathrm{loc}}} n_\alpha E_\alpha(\mathbf{u}^h). \tag{13}$$
The important difference between Eq. (8) and Eq. (13) is that the sum on all the atoms in the problem has been replaced with a sum on only the repatoms. The function $n_\alpha$ is a weight assigned to repatom $\alpha$, which will be high for repatoms in regions of low repatom density and vice versa. For consistency, the weight functions must be chosen so that
$$\sum_{\alpha=1}^{N_{\mathrm{rep}}} n_\alpha = N, \tag{14}$$
which further implies (through the consideration of a special case where every atom in a problem is made a repatom) that in atomically-refined regions, all $n_\alpha = 1$. From Eq. (14), the weight functions can be physically interpreted as the number of atoms represented by each repatom $\alpha$. The weight $n_\alpha$ for each repatom (local or non-local) is determined from a tessellation that divides the body into cells around each repatom. One physically sensible tessellation is Voronoi cells [18], but an approximate Voronoi diagram can be used instead due to the high computational overhead of the Voronoi construction. In practice, the coupled QC formulation makes use of a simple tessellation based on the existing finite element mesh, partitioning each element equally between each of its nodes. The volume of the tessellation cell for a given repatom, divided by the volume of a single atom (the Wigner–Seitz volume), provides $n_\alpha$ for the repatom. In typical QC simulations, non-local regions are fully refined down to the atomic scale, and so the weight of the non-local repatoms is one. To compute the energy of a local repatom $\alpha$, we recognize that of the $n_\alpha$ atoms it represents, $n_\alpha^e$ reside in each element $e$ adjacent to the repatom. The weighted energy contribution of the repatom is then found by applying the CB rule within each element adjacent to $\alpha$ such that
$$E_\alpha = \sum_{e=1}^{M} \frac{n_\alpha^e}{n_\alpha} \Omega_0 \mathcal{E}(\mathbf{F}_e), \qquad n_\alpha = \sum_{e=1}^{M} n_\alpha^e, \tag{15}$$
where $\mathcal{E}(\mathbf{F}_e)$ is the energy density in element $e$ by the CB rule, $\Omega_0$ is the Wigner–Seitz volume of a single atom and $e$ runs over all $M$ elements adjacent to $\alpha$.
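The mesh-based weights of Eq. (14) and the local repatom energy of Eq. (15) can be checked on a one-dimensional periodic chain, where each element is the segment between two consecutive repatoms and is split equally between them. The chain, the repatom sites, the quadratic toy energy density and all function names are assumptions made for this sketch.

```python
def repatom_shares(repatom_sites, n_atoms):
    """Split each 1D element equally between its two end nodes (periodic chain).

    Returns one dict per repatom, {element index: n_alpha^e}; n_alpha is the sum
    of the values. Requires at least two repatoms, listed in increasing order.
    """
    n_rep = len(repatom_sites)
    shares = [dict() for _ in range(n_rep)]
    for e in range(n_rep):
        left, right = e, (e + 1) % n_rep
        n_in_element = (repatom_sites[right] - repatom_sites[left]) % n_atoms
        shares[left][e] = 0.5 * n_in_element
        shares[right][e] = 0.5 * n_in_element
    return shares

def local_repatom_energy(share, omega_0, density, F_elements):
    """E_alpha of Eq. (15) for one local repatom."""
    n_alpha = sum(share.values())
    return sum(n_e / n_alpha * omega_0 * density(F_elements[e]) for e, n_e in share.items())

N = 40
sites = [0, 1, 2, 3, 4, 8, 16, 28]       # dense near site 0, coarse elsewhere (illustrative)
shares = repatom_shares(sites, N)
weights = [sum(s.values()) for s in shares]          # the n_alpha of Eq. (14)

density = lambda F: 0.5 * (F - 1.0) ** 2             # toy strain energy density (assumption)
F_el = [1.00, 1.01, 1.02, 1.01, 1.005, 1.002, 1.001, 1.0005]   # one stretch per element
omega_0 = 1.0
lengths = [(sites[(e + 1) % len(sites)] - sites[e]) % N for e in range(len(sites))]

E_by_repatoms = sum(w * local_repatom_energy(s, omega_0, density, F_el)
                    for w, s in zip(weights, shares))
E_by_elements = sum(L * omega_0 * density(F) for L, F in zip(lengths, F_el))
```

In this example the weights sum to the total number of atoms, the repatoms in the fully refined stretch have weight one, and the repatom-by-repatom sum reproduces the element-by-element sum of Eq. (12).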
Note that this description of the local repatoms is exactly equivalent to the element-by-element summation of the local QC in Eq. (12); it is only the way that the energy partitioning is written that is different. In a mesh containing only local repatoms, the two formulations are the same, but the summations have been rearranged from one over elements in Eq. (12) to one over the repatoms here. The energy of each non-local repatom is computed from the deformed neighbor environment dictated by the current interpolated displacements in the elements. In essence, every atom in the vicinity of a non-local repatom is displaced to the deformed configuration, the energy of each non-local repatom in this configuration is computed from Eq. (4), and the total energy is the sum of these repatom energies weighted by $n_\alpha$. For example, the energy of the repatom identified as an “interface atom” in Fig. 2 requires that the neighbor environment be generated by displacing each neighbor according to the element in which it resides. Thus, the energy of each non-local repatom is exactly as it should be under the displacement field $\mathbf{u}^h$, while the local approximation is used in regions where the deformation is uniform on the atomic scale. From this starting point, the forces on all the repatoms can be obtained as the appropriate derivatives of Eq. (13), and energy minimization can proceed. When making use of the mixed formulation described in Eq. (13), it now becomes necessary to decide whether a given repatom should be local or non-local. This is achieved automatically in the QC using a non-locality criterion. Note that simply having a large deformation in a region does not in itself require a non-local repatom, as the CB rule of the local formulation will exactly describe the energy of any uniform deformation, regardless of the severity.
The key feature that should trigger a non-local treatment of a repatom is a significant variation in the deformation gradient on the atomic scale in the repatom’s proximity. Thus, the non-locality criterion is implemented as follows. A cut-off, $r_{\mathrm{nl}}$, is empirically chosen to be between two and three times the cut-off radius of the interatomic potentials. The deformation gradients in every element within this cut-off of a given representative atom are compared, by looking at the differences between their eigenvalues. The criterion is then:
$$\max_{a,b;k} |\lambda_k^a - \lambda_k^b| < \epsilon, \tag{16}$$
where $\lambda_k^a$ is the $k$th eigenvalue of the right stretch tensor $\mathbf{U}_a = \sqrt{\mathbf{F}_a^T \mathbf{F}_a}$ in element $a$, $k = 1, \ldots, 3$, and the indices $a$ and $b$ run over all elements within $r_{\mathrm{nl}}$ of a given repatom. The repatom will be made local if this inequality is satisfied, and non-local otherwise. In practice, the tolerance $\epsilon$ is determined empirically. A value of 0.1 has been used in a number of tests and found to give good results. The effect of this criterion is clusters of non-local atoms in regions of rapidly varying deformation.
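A two-dimensional sketch of the criterion in Eq. (16) follows. It assumes that the caller has already collected the deformation gradients of all elements within $r_{\mathrm{nl}}$ of the repatom, and that eigenvalues are compared after sorting in ascending order; the sorting convention and the function names are assumptions, not details given in the text.

```python
import math

def principal_stretches(F):
    """Eigenvalues of U = sqrt(F^T F) for a 2x2 deformation gradient, in ascending order."""
    c11 = F[0][0] ** 2 + F[1][0] ** 2            # components of C = F^T F
    c22 = F[0][1] ** 2 + F[1][1] ** 2
    c12 = F[0][0] * F[0][1] + F[1][0] * F[1][1]
    mean = 0.5 * (c11 + c22)
    radius = math.sqrt((0.5 * (c11 - c22)) ** 2 + c12 ** 2)
    return (math.sqrt(max(mean - radius, 0.0)), math.sqrt(mean + radius))

def is_local(F_nearby, tol=0.1):
    """Eq. (16): local if the largest eigenvalue difference between any two elements is < tol."""
    stretches = [principal_stretches(F) for F in F_nearby]
    worst = max(abs(sa[k] - sb[k]) for sa in stretches for sb in stretches for k in range(2))
    return worst < tol

identity = [[1.0, 0.0], [0.0, 1.0]]
big_uniform = [[1.3, 0.0], [0.0, 1.0]]           # large but uniform stretch
c, s = math.cos(0.3), math.sin(0.3)
rotated = [[c, -s], [s, c]]                      # rigid rotation, no stretch
mild = [[1.2, 0.0], [0.0, 1.0]]

uniform_case = is_local([big_uniform, big_uniform, big_uniform])
rotation_case = is_local([identity, rotated])
varying_case = is_local([identity, mild])
```

The first case stays local despite the 30% stretch because the deformation is the same in every element, the second stays local because the stretch tensor is insensitive to rigid rotation, and the third is flagged non-local because the stretch differs by 0.2 between neighboring elements.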
The fact that the non-local repatoms tend to cluster into atomistically refined regions surrounded by local regions leads to non-local/local interfaces in the QC. As in all attempts to couple a non-local atomistic region to a local continuum region found in the literature, this will lead to spurious forces near the interface. These forces, dubbed “ghost-forces” in the QC literature, arise due to the fact that there is an inherent mismatch between the local (continuum) and non-local (atomistic) regions in the problem. In short, the finite range of interaction in the non-local region means that the motion of repatoms in the local region will affect the energy of non-local repatoms, while the converse may not be true. Upon differentiating Eq. (13), forces on repatoms in the vicinity of the interface may include a non-physical contribution due to this asymmetry. Note that these ghost forces are a consequence of differentiating an approximate energy functional, and therefore they still are “real” forces in the sense that they come from a well-defined potential. The problem is that the mixed local/non-local energy functional of Eq. (13) is approximate, and the error in this approximation is most apparent at the interface. A consequence of this is that a perfect, undistorted crystal containing an artificial local/non-local interface will be able to lower its energy below the ground-state energy by rearranging the atoms in the vicinity of the interface. This is clearly a non-physical result. In Ref. [3], a solution to the ghost forces was proposed whereby corrective forces were added as dead loads to the interface region. In this way, there is a well-defined contribution of the corrective forces to the total energy functional (since the dead loads are constant) and the minimization of the modified energy can proceed using standard conjugate gradient or Newton–Raphson techniques. The procedure can be iterated to self-consistency.
2.4. Evolving Microstructure: Automatic Mesh Adaption
The QC approach outlined in the previous sections can only be successfully applied to general problems in crystalline deformation if it is possible to ensure that the fine structure in the deformation field will be captured. Without a priori knowledge of where the deformation field will require fine-scale resolution, it is necessary that the method have an automatic way to adapt the finite element mesh through the addition or removal of repatoms. To this end, the QC makes use of the finite element literature, where considerable attention has been given to adaptive meshing techniques for many years. Typically in finite element techniques, a scalar measure is defined to quantify the error introduced into the solution by the current density of nodes (or repatoms in the QC). Elements in which this error estimator is higher than some prescribed tolerance are targeted for adaption, while at the same time
the error estimator can be used to remove unnecessary nodes from the model. The error estimator of Zienkiewicz and Zhu [19], originally posed in terms of errors in the stresses, is re-cast for the QC in terms of the deformation gradient. Specifically, we define the error estimator to be
$$\varepsilon_e = \left[ \frac{1}{\Omega_e} \int_{\Omega_e} (\bar{F} - F_e) : (\bar{F} - F_e)\, d\Omega \right]^{1/2}, \qquad (17)$$
where $\Omega_e$ is the volume of element $e$, $F_e$ is the QC solution for the deformation gradient in element $e$, and $\bar{F}$ is the $L_2$-projection of the QC solution for $F$, given by
$$\bar{F} = S F_{avg}. \qquad (18)$$
Here, $S$ is the shape function array, and $F_{avg}$ is the array of nodal values of the projected deformation gradient $\bar{F}$. Because the deformation gradients are constant within the linear elements used in the QC, the nodal values $F_{avg}$ are simply computed by averaging the deformation gradients found in each element touching a given repatom. This is then interpolated throughout the elements using the shape functions, providing an estimate to the discretized field solution that would be obtained if higher order elements were used. The error, then, is defined as the difference between the actual solution and this estimate of the higher order solution. If this error is small, it implies that the higher order solution is well represented by the lower order elements in the region, and thus no refinement is required. The integral in Eq. (17) can be computed quickly and accurately using Gaussian quadrature. Elements for which the error $\varepsilon_e$ is greater than some prescribed error tolerance are targeted for refinement. Refinement then proceeds by adding three new repatoms at the atomic sites closest to the mid-sides of the targeted elements. Notice that since repatoms must fall on actual atomic sites in the reference lattice, there is a natural lower limit to element size; if the nearest atomic sites to the mid-sides of the elements are the atoms at the element corners, the region is fully refined and no new repatoms can be added. The same error estimator is used in the QC to remove unnecessary repatoms from the mesh. In this process, a repatom is temporarily removed from the mesh and the surrounding region is locally remeshed. If all of the elements produced by this remeshing process have a value of the error estimator below the threshold, the repatom can be eliminated.
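As an illustration of Eqs. (17) and (18), the following Python sketch evaluates the error estimator on a mesh of linear triangles. It is a minimal sketch under stated assumptions rather than the QC implementation: the function name `zz_error` and the array layout are hypothetical, every node is assumed to belong to at least one element, and the integral is evaluated with the three-point edge-midpoint rule instead of Gaussian quadrature. That rule is exact here because $\bar{F} - F_e$ varies linearly over a triangle, so the integrand is quadratic.

```python
import numpy as np

def zz_error(elements, F_elem):
    """Per-element error of Eq. (17) for linear triangles.

    elements: (n_elem, 3) node indices; F_elem: (n_elem, d, d) constant gradients.
    """
    elements = np.asarray(elements)
    F_elem = np.asarray(F_elem, dtype=float)
    n_nodes = elements.max() + 1
    # Nodal values F_avg: plain average over the elements touching each node.
    F_sum = np.zeros((n_nodes,) + F_elem.shape[1:])
    count = np.zeros(n_nodes)
    for e, nodes in enumerate(elements):
        for n in nodes:
            F_sum[n] += F_elem[e]
            count[n] += 1
    F_avg = F_sum / count[:, None, None]
    eps = np.empty(len(elements))
    for e, (i, j, k) in enumerate(elements):
        # With linear shape functions, F_bar at an edge midpoint is the mean of its end nodes.
        mids = [(F_avg[i] + F_avg[j]) / 2, (F_avg[j] + F_avg[k]) / 2, (F_avg[k] + F_avg[i]) / 2]
        # (1/Omega_e) * integral = mean of the integrand at the three edge midpoints.
        eps[e] = np.sqrt(np.mean([np.sum((Fm - F_elem[e]) ** 2) for Fm in mids]))
    return eps

# Two triangles sharing an edge, with a jump in the shear component between them.
I2 = np.eye(2)
F1 = np.array([[1.0, 0.2], [0.0, 1.0]])
errors = zz_error([[0, 1, 2], [1, 3, 2]], [I2, F1])
```

Because of the $1/\Omega_e$ prefactor, the element area cancels and the node coordinates are not needed. Elements whose entry in `errors` exceeds the chosen tolerance would be the candidates for refinement.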
3. Practical Issues in QC Simulations
In this section, we will use a specific, simple example to highlight the practical issues surrounding solutions using the QC method. The example to be
discussed is also provided with the QC download at qcmethod.com, and it is discussed in even greater detail in the documentation that accompanies that code.
3.1. Problem Definition
Consider the problem of a twin boundary in face-centered cubic (FCC) aluminum. The boundary is perfect but for a small step. A question of interest may be “how does this stepped boundary respond to mechanical load?” In this example, we probe this question by using the QC method to solve the problem shown in Fig. 3(a), where two crystals, joined by a stepped twin boundary, are sheared until the boundary begins to migrate due to the load. The result will elucidate the mechanism of this migration. The implementation of the QC method used to solve this problem has been described as “two and a half” dimensional to emphasize that, while it is not a fully 3D model, it is also not simply 2D. Specifically, the reference crystal structure is 3D, and all the underlying atomistic calculations (both local and non-local) consider the full, 3D environment of each atom. However, the deformation of the crystal is constrained such that the three components of displacement, $u_x$, $u_y$ and $u_z$, are functions only of two coordinates $x$ and $y$. This allows, for example, both edge and screw dislocations, but forces the line direction of the dislocations to be along $z$. For the reader who is familiar with purely atomistic simulations, this is equivalent to imposing periodic boundary conditions along the $z$ direction, and then using a periodic cell with the
[Figure 3: two mesh plots, panels (a) and (b), with axes x and y each spanning −200 to 200; labels mark the upper and lower fcc Al grains and the stepped twin boundary.]
Figure 3. (a) Initial coarse mesh used to define the simulation volume and (b) the final mesh after the automatic adaption.
minimum possible thickness along z to produce the correct crystal structure. We sometimes refer to this as a “2D” implementation for brevity, but ask that the reader bear in mind the true nature of the model. The use of a 2D implementation of the QC to study this problem is appropriate given its geometry. However, fully 3D implementations of the QC exist and these must be used for many problems of interest (see examples in Ref. [5]). The starting point for a QC simulation is a crystal lattice, defined by an origin atom and a set of Bravais vectors as in Eq. (1). To allow the QC method to model polycrystals, it is necessary to define a unique crystal structure within each grain. The shape of each grain is defined by a simple polygon in 2D. Physically, it makes sense that the polygons defining each grain do not overlap, although it may be possible to have holes between the grains. In our example, it is easy to see how the shape of the two grains could be defined to include the grain boundary step. Mathematically, the line defining the boundary should be shared identically by the two grains, but this can lead to numerical complications, for example in checking whether two grains overlap. Fortunately, realistic atomistic models are unlikely to encounter atoms that are less than an Ångström or so apart, and so there exists a natural “tolerance” in the definition of these polygons. For example, a gap between grains of 0.1 Å will usually provide sufficient numerical resolution between the grains without any atoms falling “in the gap” and therefore being omitted from the model. In the QC implementation, the definition of the grains is separate from the definition of the actual volume of material to be simulated. This simulation volume is defined by a finite element mesh between an initial set of repatoms.
Each element in this mesh must lie within one or more of the grain polygons described above, but the finite element mesh need not fill the entire volume of the defined grains. It is useful to think of the actual model (the mesh) being “cut-out” from the previously defined grain structure. For our problem, a sensible choice for the initial mesh is shown in Fig. 3(a), where the grain boundary lies approximately (to within the height of the step) along the line y = 0. Elements whose centroids lie above or below the grain boundary are assumed to contain material oriented according to the lattice of the upper or lower grain, respectively. Since our interest here is atomic scale processes along the grain boundary, it is clear that the model shown in Fig. 3(a), with elements approximately 50 Å in width, will not provide the necessary accuracy. Thus, we can make use of the QC’s automatic adaption to increase the resolution near the grain boundary. The main adaption criterion, as outlined earlier, is based on error in the finite element interpolation of the deformation gradient. However, there will initially be no deformation near the grain boundary and thus no reason for automatic adaption to be triggered. It is therefore necessary to force the model to adapt in regions that are inhomogeneous at the atomic scale for reasons other than deformation. To this end, we can identify certain segments of the
grain boundary as “active” segments. Any repatom within a prescribed distance of an active segment will be made non-local. This further implies that the elements touching this repatom will be targeted for refinement, since we require that $n_\alpha = 1$ for all non-local repatoms. The effect of such a technique is shown in Fig. 3(b), where the segment of the boundary between x = −100 and 100 Å was defined to be active. The result is that the grain boundary structure is correctly captured in the vicinity of the step, as well as for some distance on either side of the step.
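The active-segment rule amounts to a point-to-segment distance test in the 2D plane of the model. The Python sketch below shows one way this could be written; the function names, the argument `r_active` (the "prescribed distance"), and its value of 10 Å are assumptions made for illustration and are not taken from the QC code. Segments are assumed to have non-zero length.

```python
import numpy as np

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment with end points a and b."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    ab = b - a
    # Parameter of the closest point on the infinite line, clamped to the segment.
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def flag_nonlocal(repatoms, segments, r_active):
    """Boolean mask: True where a repatom lies within r_active of any active segment."""
    return np.array([
        any(distance_to_segment(p, a, b) <= r_active for a, b in segments)
        for p in repatoms
    ])

# Active segment along y = 0 between x = -100 and 100, as in the example above.
active = [((-100.0, 0.0), (100.0, 0.0))]
mask = flag_nonlocal([(0, 5), (150, 0), (105, 0), (0, 20)], active, r_active=10.0)
```

Repatoms flagged in `mask` would be made non-local, and the elements touching them would then be refined until $n_\alpha = 1$.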
3.2. Solution Procedure
In the static QC implementation, the solution procedure amounts to minimization of the total energy (elastic energy plus the potential energy of the applied loads, see Eq. (6)) for a given set of boundary conditions (applied displacements or forces on certain repatoms). However, problems solved using the QC method are typically highly nonlinear, and as such their energy functional typically includes many local minima. In order to find a physically realistic solution, it is necessary to use a quasi-static loading approach, whereby boundary conditions are gradually incremented, the energy is minimized, and the minimum energy configuration is used in generating an initial guess to the solution after the subsequent load increment. Again, we can refer to the specific example of the stepped twin boundary to make this more clear. Our desire, in this example, is to study the effect of applying a shear strain to the stepped twin boundary. Specifically, we may be interested in knowing the critical shear strain at which the boundary begins to migrate and to understand the mechanism of this migration. We begin by choosing a sensible strain increment to apply, such that the incremental deformation will not be too severe between minimization steps. For this example, the initial guess, $u_0^{n+1}$, used to solve for the relaxed displacement, $u^{n+1}$, of load step $n + 1$ is given by
$$u_0^{n+1} = u^n + \Delta F\, X, \qquad (19)$$
where $u^n$ is the relaxed, minimum energy displacement field from load step $n$, $u^0 = 0$, and the matrix $\Delta F$ corresponding to pure shear along the $y$ direction is
$$\Delta F = \begin{bmatrix} 1 & \Delta\gamma & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \qquad (20)$$
Thus, a shear strain increment of $\Delta\gamma$ is applied, the outer repatoms are held fixed to the resulting displacements, and all inner repatoms are relaxed until the
energy reaches a minimum. Then, another strain increment is superimposed on these relaxed displacements and the process repeated. After $n$ load steps, a total macroscopic shear strain of $\gamma = n\,\Delta\gamma$ has been applied to the outer boundary of the bi-crystal. The energy minimization can be performed using several standard approaches, such as the conjugate gradient (CG) or the Newton–Raphson (NR) methods (both of which are described, for example, in Ref. [20]). The CG method has the advantage over the NR technique in that it requires only the energy functional and its first derivatives with respect to the repatom positions (i.e., the forces). The NR method requires a second derivative, or “stiffness matrix”, that is not straightforward to derive or to code in an efficient manner. Once correctly implemented, however, the NR method has the advantage of quadratic convergence (compared to linear convergence for the CG method) once the system is close to the energy minimizing configuration. By monitoring the applied force (measured as the sum of forces in the y-direction applied to the top surface of the bi-crystal) versus the accumulated shear strain, $\gamma$, it can be observed that there is an essentially linear response for the first six load steps, and then a sudden load drop from step six to seven. This jump corresponds to the first inelastic behaviour of the boundary, the mechanism of which is shown in Fig. 4. In Fig. 4(a), a close-up of the relaxed step at an applied strain of $\gamma = 0.03$ is shown, while Fig. 4(b) shows the relaxed configuration after the next strain increment at $\gamma = 0.035$. The mechanism of this boundary motion is the motion of two Shockley partial dislocations from the corners of the step along the boundary. This can be seen clearly by observing the finite element mesh between the repatoms in Fig. 4(c).
Because the mesh is triangulated in the reference configuration, the effect of plastic slip is the shearing of a row of elements in the wake of the moving dislocations. One challenge in modeling dislocation motion in crystals at the atomic scale is evident in this simulation. In crystals with a low Peierls resistance like the FCC crystal modelled here, dislocations will move long distances under small applied stresses. In this simulation, the Shockley partials which nucleated at the step move to the ends of the region of atomic-scale refinement. In order to rigorously compute the equilibrium position of the dislocations, it would be necessary to further adapt the model. The presence of the dislocation in close proximity to the larger elements to the left of the fully refined region will trigger the adaption criterion, as well as increase the number of repatoms that are non-local according to the non-locality criterion defined earlier. This will allow the dislocations to move somewhat further upon subsequent relaxation. In principle, this process of iteratively adapting and relaxing can be repeated until the dislocations come to their true equilibrium, which in this example would be at the left and right free surfaces of the bi-crystal.
Figure 4. Mechanism of migration of the twin boundary under shear. (a) Before migration (panel title: Initial Boundary Location). (b) After migration (panel title: Boundary Migration). (c) Deformed mesh showing the motion of Shockley partial dislocations (panel title: Slip of Shockley partials).
In practice, however, we may not be interested in the full details of where this dislocation comes to rest, if we are willing to accept some degree of error in the simulation. Specifically, the fact that the dislocation is held artificially close to the step may affect the critical load level at which subsequent migration events occur. The compromise is made for the sake of computational speed, which would be significantly reduced if we were to iteratively adapt and relax many times for each load step.
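The quasi-static loading loop of this section (increment the boundary conditions, use the previous relaxed state plus the affine increment as the initial guess, then minimize) can be sketched on a toy problem. The Python example below is not a QC calculation: it replaces the bi-crystal by a 1D chain of harmonic springs with unequal stiffnesses, replaces CG or NR by plain steepest descent, and uses a scalar strain increment as a 1D analogue of Eq. (19). All names, the fixed step size (which must stay below 2 divided by the largest Hessian eigenvalue), and the stiffness values are assumptions made for illustration.

```python
import numpy as np

def energy_gradient(u, k):
    """Gradient of E = sum_i 0.5 * k[i] * (u[i+1] - u[i])**2 with respect to u."""
    tension = k * np.diff(u)           # force carried by each spring
    g = np.zeros_like(u)
    g[:-1] -= tension
    g[1:] += tension
    return g

def relax(u, k, step=0.1, tol=1e-10, max_iter=100_000):
    """Steepest descent on the interior nodes; the two end nodes are held fixed."""
    u = u.copy()
    for _ in range(max_iter):
        g = energy_gradient(u, k)[1:-1]
        if np.abs(g).max() < tol:
            break
        u[1:-1] -= step * g
    return u

def quasi_static(X, k, d_gamma, n_steps):
    """Apply n_steps strain increments, relaxing after each one."""
    u = np.zeros_like(X)               # u^0 = 0
    history = []
    for _ in range(n_steps):
        u_guess = u + d_gamma * X      # affine increment superimposed on the relaxed state
        u = relax(u_guess, k)          # end nodes keep the incremented displacements
        history.append(u.copy())
    return history

X = np.arange(7.0)                                # reference positions of 7 nodes
k = np.array([1.0, 2.0, 1.0, 2.0, 1.0, 2.0])      # unequal spring stiffnesses
history = quasi_static(X, k, d_gamma=0.005, n_steps=7)
```

In this linear toy problem the minimum is unique, so the load stepping only serves to show the structure of the loop. In the nonlinear QC setting the same structure keeps each minimization close to the previous solution, which is how a physically realistic branch among the many local minima is followed.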
4. Summary
This review has summarized the theory and practical implementation of the QC method. Rather than provide an exhaustive review of the QC literature (which can already be found, for example, in Ref. [5]), the intent has been to provide a simple overview for someone interested in understanding one implementation of the QC method. More specific details, including free, open-source code and documentation, can be found at www.qcmethod.com.
References
[1] E.B. Tadmor, M. Ortiz, and R. Phillips, “Quasicontinuum analysis of defects in solids,” Phil. Mag. A, 73, 1529–1563, 1996a.
[2] E.B. Tadmor, R. Phillips, and M. Ortiz, “Mixed atomistic and continuum models of deformation in solids,” Langmuir, 12, 4529–4534, 1996b.
[3] V.B. Shenoy, R. Miller, E. Tadmor, D. Rodney, R. Phillips, and M. Ortiz, “An adaptive methodology for atomic scale mechanics: the quasicontinuum method,” J. Mech. Phys. Sol., 47, 611–642, 1998a.
[4] V.B. Shenoy, R. Miller, E.B. Tadmor, R. Phillips, and M. Ortiz, “Quasicontinuum models of interfacial structure and deformation,” Phys. Rev. Lett., 80, 742–745, 1998b.
[5] R.E. Miller and E.B. Tadmor, “The quasicontinuum method: overview, applications and current directions,” J. of Computer-Aided Mater. Design, 9(3), 203–231, 2002.
[6] M. Ortiz, A.M. Cuitino, J. Knap, and M. Koslowski, “Mixed atomistic continuum models of material behavior: the art of transcending atomistics and informing continua,” MRS Bull., 26, 216–221, 2001.
[7] D. Rodney, “Mixed atomistic/continuum methods: static and dynamic quasicontinuum methods,” In: A. Finel, D. Maziere, and M. Veron (eds.), NATO Science Series II, Vol. 108, “Thermodynamics, Microstructures and Plasticity,” Kluwer Academic Publishers, Dordrecht, 265–274, 2003.
[8] M. Ortiz and R. Phillips, “Nanomechanics of defects in solids,” Adv. Appl. Mech., 36, 1–79, 1999.
[9] W.A. Curtin and R.E. Miller, “Atomistic/continuum coupling methods in multi-scale materials modeling,” Model. Simul. Mater. Sci. Eng., 11(3), R33–R68, 2003.
[10] A. Carlsson, “Beyond pair potentials in elemental transition metals and semiconductors,” Sol. Stat. Phys., 43, 1–91, 1990.
[11] V. Shenoy, V. Shenoy, and R. Phillips, “Finite temperature quasicontinuum methods,” Mater. Res. Soc. Symp. Proc., 538, 465–471, 1999.
[12] E. Tadmor, G. Smith, N. Bernstein, and E. Kaxiras, “Mixed finite element and atomistic formulation for complex crystals,” Phys. Rev. B, 59, 235–245, 1999.
[13] M. Daw and M. Baskes, “Embedded-atom method: derivation and application to impurities, surfaces, and other defects in metals,” Phys. Rev. B, 29, 6443–6453, 1984.
[14] J. Nørskov and N. Lang, “Effective-medium theory of chemical binding: application to chemisorption,” Phys. Rev. B, 21, 2131–2136, 1980.
[15] F. Stillinger and T. Weber, “Computer-simulation of local order in condensed phases of silicon,” Phys. Rev. B, 31, 5262–5271, 1985.
[16] O.C. Zienkiewicz, The Finite Element Method, vols. 1–2, 4th edn. McGraw-Hill, London, 1991.
[17] J. Ericksen, In: M. Gurtin (ed.), Phase Transformations and Material Instabilities in Solids, Academic Press, New York.
[18] A. Okabe, Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, Wiley, Chichester, England, 1992.
[19] O.C. Zienkiewicz and J.Z. Zhu, “A simple error estimator and adaptive procedure for practical engineering analysis,” Int. J. Numer. Meth. Eng., 24, 337–357, 1987.
[20] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge, 1992.
2.14 PERSPECTIVE: FREE ENERGIES AND PHASE EQUILIBRIA
David A. Kofke¹ and Daan Frenkel²
¹ University at Buffalo, The State University of New York, Buffalo, New York, USA
² FOM Institute for Atomic and Molecular Physics, Amsterdam, The Netherlands
Analysis of the free energy is required to understand and predict the equilibrium behavior of thermodynamic systems, which is to say, systems in which temperature has some influence on the equilibrium condition. In practice, all processes in the world around us proceed at a finite temperature, so any application of molecular simulation that aims to evaluate the equilibrium behavior must consider the free energy. There are many such phenomena to which simulation has been applied for this purpose. Examples include chemical-reaction equilibrium, protein-ligand affinity, solubility, melting and boiling. Some of these are examples of phase equilibria, which are an especially important and practical class of thermodynamic phenomena. Phase transformations are characterized by some macroscopically observable change signifying a wholesale rearrangement or restructuring occurring at the molecular level. Typically this change occurs at a specific value of some thermodynamic variable such as the temperature or pressure. At the exact point where the transition occurs, both phases are equally stable – have equal free energy – and we find a condition of phase equilibrium or coexistence [1].
S. Yip (ed.), Handbook of Materials Modeling, 683–705. © 2005 Springer. Printed in the Netherlands.
1. Free-Energy Measurement
Free-energy calculations are among the most difficult but most important encountered in molecular simulation. A key “feature” of these calculations is their tendency to be inaccurate, yielding highly reproducible results that are nevertheless wrong, despite the calculation being performed in a way that is technically correct. Often seemingly innocuous changes in the way the calculation is performed can introduce (or eliminate) significant inaccuracies. So it
is important when performing these calculations to have a strong sense of how they can go awry, and proceed in a way that avoids their pitfalls. The aim of any free-energy calculation is to evaluate the difference in free energy between two systems. “System” is used here in a very general sense. The systems may differ in thermodynamic state (temperature, pressure, chemical composition), in the presence or absence of a constraint, or most generally in their Hamiltonian. Often the free energy of one system is known, either because it is sufficiently simple to permit evaluation analytically (e.g., an ideal gas or a harmonic crystal), or because its free energy was established by a separate calculation. In many cases the free-energy difference is itself the principal quantity of interest. The important point here is that free-energy calculations always involve two (or more) systems. We will label these systems A and B in our subsequent discussion, and their free energy difference will be defined as $\Delta F = F_B - F_A$. Once the systems of interest have been identified, a large variety of methods are available to evaluate $\Delta F$. At first glance the methods seem to be very diverse and unrelated, but they nevertheless can be grouped into two broad categories: (a) methods based on measurement of density of states and (b) methods based on work calculations. Implicit in both approaches is the idea of a path joining the two systems, and one way that specific methods differ is in how this path is defined. As free energy is a state function, the free-energy difference of course does not depend on the path, but the performance of a method can depend greatly on this choice (and other details). It is always possible to define a parameter $\lambda$ that locates a position on the path, such that one value $\lambda_A$ corresponds to system A and another value $\lambda_B$ indicates system B. The parameter $\lambda$ may be continuous or discrete (in fact, it is not uncommon that it have only two values, $\lambda_A$ and $\lambda_B$), and may represent a single variable or a set of variables, depending on the choice of the path. Moreover, for a given path, the parameter $\lambda$ can be viewed as a state variable, such that a free energy $F(\lambda)$ can be associated with each value of $\lambda$. Thus $\Delta F = F(\lambda_B) - F(\lambda_A)$. The term “Landau free energy” is sometimes used in connection with this dependence.
1.1. Density-of-States Methods
If a system is given complete freedom to move back and forth across the path joining A and B, it will explore all possible values of the path variable $\lambda$, but it will (in general) not spend equal time at each value. The probability $p(\lambda)$ that the system is observed to be at a particular point $\lambda$ on the path is related to the value of the free energy there:
$$p(\lambda) \propto \exp\left(-F(\lambda)/kT\right), \qquad (1)$$
where T is the absolute temperature and k is Boltzmann’s constant. This relation is the basic idea behind the density-of-states methods. The specific way in which λ samples values depends on how the simulation is implemented. Typically density-of-states calculations are performed as part of Monte Carlo (MC) simulations. In this case sampling includes trial moves in which λ is perturbed to a new value, and a decision to accept the trial is taken in the usual MC fashion. It is possible also to have λ vary as part of a molecular dynamics (MD) simulation. In such a situation λ must couple to the equations of motion of the system, usually via an extended-Lagrangian formalism [2]. Then λ follows a deterministic dynamical trajectory akin to the way that the particles’ coordinates do. In almost all cases of practical interest, conventional Boltzmann sampling will probe only a small fraction of all possible λ-values. The variation of the free energy F(λ) can be many times kT when considered over all λ values of interest, and consequently the probability p(λ) can vary over many orders of magnitude. Extra measures must therefore be taken to ensure that sufficient information is gathered over all λ to evaluate the desired free-energy difference, and one of the features distinguishing different density-of-states methods is the way that they take these measures. Almost always an artificial bias φ(λ) must be imposed to force the system to examine values of λ where the free energy is unfavorable. Usually the aim is to formulate the bias to lead to a uniform sampling over λ, which is achieved if φ(λ) = −F(λ). Of course, inasmuch as the aim is to evaluate F(λ) it is necessary to set up a scheme in which the free energy can be estimated either through preliminary simulations or as part of a systematic process of iteration.
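The role of the bias can be illustrated with a toy calculation in Python. The sketch assumes that the bias enters as an additive term, so that the biased walk samples $p_b(\lambda) \propto \exp\{-[F(\lambda) + \phi(\lambda)]/kT\}$, which is the convention under which $\phi = -F$ gives uniform sampling. It also supplies the "true" $F(\lambda)$ directly on five discrete $\lambda$ values, whereas in a real simulation this dependence arises implicitly from the molecular configurations. The function names, the free-energy values, and the deliberately imperfect bias are all hypothetical.

```python
import math
import random

def sample_histogram(F, phi, kT=1.0, n_steps=400_000, seed=0):
    """Metropolis walk over discrete lambda indices with weight exp(-(F + phi)/kT)."""
    rng = random.Random(seed)
    n = len(F)
    lam = 0
    counts = [0] * n
    for _ in range(n_steps):
        trial = lam + rng.choice((-1, 1))
        if 0 <= trial < n:
            delta = (F[trial] + phi[trial]) - (F[lam] + phi[lam])
            if delta <= 0 or rng.random() < math.exp(-delta / kT):
                lam = trial
        counts[lam] += 1       # rejected or out-of-range trials recount the current state
    return counts

def free_energy_from_histogram(counts, phi, kT=1.0):
    """Invert Eq. (1) for the biased histogram, remove the bias, and set F[0] = 0."""
    F = [-kT * math.log(c) - p for c, p in zip(counts, phi)]
    return [f - F[0] for f in F]

F_true = [0.0, 2.0, 5.0, 3.0, 1.0]     # toy free-energy profile in units of kT
phi = [0.0, -1.5, -4.0, -3.5, -0.5]    # rough guess at -F, e.g. from a preliminary run
counts = sample_histogram(F_true, phi)
F_est = free_energy_from_histogram(counts, phi)
```

Even though the bias is only approximately equal to $-F$, every state is visited often enough for the histogram to be inverted, and subtracting the known bias recovers the profile. The estimate could then serve as the improved bias for the next iteration.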
The greatest difficulty is found if the free energy change is extensive, meaning that λ affects the entire system and not just a small part of it (e.g., a path that results in a change in the thermodynamic phase, versus a path in which a single molecule is added to the system). In such cases F(λ) scales with the system size and is likely to vary by very large amounts with λ. The practical consequence is that the bias must be tuned very precisely to ensure that good sampling over all λ is accomplished. A robust solution to the problem is the use of windowing, in which the problem of evaluating the full free energy profile F(λ) is broken into smaller problems, each involving only a small range of all λ of interest. Separate simulations are performed over each λ range, and the composite data are assembled to yield the full profile. Even here there are different ways that one can proceed, and a popular approach to this end uses the histogram-reweighting method, which optimally combines the data in a way that accounts for their relative precision. Histogram reweighting is discussed in another chapter of this volume. Within the framework outlined above, the most obvious way to measure the probability distribution p(λ) is to use a visited-states approach: MC or MD sampling of λ values is performed, perhaps in the presence of the bias φ, and
a histogram is recorded of the frequency with which each value (or bin of values) of λ is occupied. The Wang-Landau method [3, 4] (and its extensions) is the most prominent such technique today. Another approach of this type applies a history-dependent bias using a Gaussian basis [5]. An alternative to visited-states has recently emerged in the form of transition-matrix methods [6–10]. In such an approach one does not tabulate the occupancy of each λ value; rather one tallies statistics about the attempts to transition from one λ to another in a MC simulation. The movement among different λs forms a Markov process, and knowledge of the transition probabilities is sufficient to derive the limiting distribution p(λ). Interestingly, even rejected MC trials contribute information to the transition matrix, so it seems that this approach is gathering information that is discarded in visited-states methods. The transition-matrix approach has several other appealing features. The method can accommodate the use of a bias to flatten the sampling, but the bias does not enter into the transition matrix, so if the bias is updated as part of a scheme to achieve a flat distribution the previously recorded transition probabilities do not have to be discarded, as they must be in visited-states methods (at least in its simpler formulations). Moreover, if windowing is applied to obtain uniform samples across λ, it is easy to join data from different windows. It is not even required that adjacent windows overlap, just that they attempt trials (without necessarily accepting) into each other’s domain. Details of the transition-matrix methods are still being refined, and the versatility of the approach is currently being explored through its application to different problems. 
Additionally, there are efforts now to combine visited-states and transition-matrix approaches, exploiting the relatively fast (but rough) convergence of the former while relying on the more complete data collection abilities of the latter to obtain the best precision [11].
1.2. Work-Based Methods
Classical thermodynamics relates the difference in free energy between two systems to the work associated with a reversible process that takes one into the other. A straightforward application of this idea leads to the thermodynamic integration (TI) free-energy method, which has a long history and has seen widespread application. The TI method is but one of several approaches in a class based on the connection between $\Delta F$ and the work involved in transforming a system from A to B. A very important development in this area occurred recently, when Jarzynski showed that $\Delta F$ could be related to work associated with any such process, not just a reversible one [12–15]. Jarzynski’s non-equilibrium work (NEW) approach requires evaluation of an ensemble of
work values, and thus involves repeated transformation from A to B, evaluating the work each time. The connection to the free energy is then
$$\exp(-\Delta F/kT) = \overline{\exp(-W/kT)}, \qquad (2)$$
where $W$ is the total work, and the overbar on the right-hand side indicates an average taken over many realizations of the path from A to B, always starting from an equilibrium A condition. For an equilibrium (reversible) path, the repeated work measurements will each yield exactly the same value (within the precision of the calculations), while for an arbitrary non-equilibrium transformation a distribution of work values will be observed. It is remarkable that these non-equilibrium transformations can be analyzed to yield a quantity related to the equilibrium states. The instantaneous work $w$ involved in the transformation $\lambda \to \lambda + \Delta\lambda$ will in general depend upon the detailed molecular configuration of the system at the instant of the change. Assuming that there is no process of heat transfer accompanying the transformation, this work is given simply by the change in the total energy of the system
$$w = E(r^N; \lambda + \Delta\lambda) - E(r^N; \lambda). \qquad (3)$$
For sufficiently small Δλ, this difference can be given in terms of the derivative

\[ w = \left( \frac{dE(\lambda)}{d\lambda} \right)_{r^N} \Delta\lambda, \qquad (4) \]

which can be interpreted in terms of a force acting on the parameter λ. The derivative relation is the natural formulation for use in MD simulations, in which the work is evaluated by integrating the product of this force times the displacement in λ over the complete path. The former expression (Eq. (3)) is more appropriate for MC simulation, in which larger steps in λ are typically taken across the path from A to B.

Thermodynamic integration is perhaps the first method by which free energies were calculated by molecular simulation. Thermodynamic integration methods are usually derived from classical thermodynamics [1], with molecular simulation appearing simply to measure the integrand. As indicated above, TI also derives as a special (reversible) case of Jarzynski’s NEW formalism, whereby ΔF = W_rev for the reversible path. The total work W_rev is in turn given by integration of Eq. (4), leading to:

\[ \Delta F = \int_{\lambda_A}^{\lambda_B} w(\lambda)\, d\lambda. \qquad (5) \]
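As a minimal sketch of Eq. (5), consider a one-dimensional harmonic well whose spring constant is switched linearly in λ, so that the exact answer ΔF = (kT/2) ln(k_B/k_A) is available for comparison. The model, the function name `ti_harmonic`, and all parameter values are assumptions made for illustration; exact Gaussian sampling stands in for the equilibrium simulation that would be run at each λ point, and the integrand is taken to be the ensemble average of dE/dλ.

```python
import math
import random

def ti_harmonic(k_a=1.0, k_b=4.0, n_lambda=11, n_samples=20000, kT=1.0, seed=1):
    """Thermodynamic integration for E(x; lam) = 0.5 * k(lam) * x**2,
    with k(lam) = k_a + lam * (k_b - k_a). Illustrative model only."""
    rng = random.Random(seed)
    lams = [i / (n_lambda - 1) for i in range(n_lambda)]
    forces = []
    for lam in lams:
        k = k_a + lam * (k_b - k_a)
        sigma = math.sqrt(kT / k)  # exact Boltzmann width; stands in for a simulation
        # Ensemble average of dE/dlam = 0.5 * (k_b - k_a) * x**2 at this lambda
        mean_x2 = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n_samples)) / n_samples
        forces.append(0.5 * (k_b - k_a) * mean_x2)
    # Trapezoid rule over the discrete lambda points
    delta_f = 0.0
    for i in range(n_lambda - 1):
        delta_f += 0.5 * (forces[i] + forces[i + 1]) * (lams[i + 1] - lams[i])
    return delta_f

estimate = ti_harmonic()
exact = 0.5 * math.log(4.0 / 1.0)  # (kT/2) ln(k_b/k_a) with kT = 1
print(f"TI estimate {estimate:.3f}, exact {exact:.3f}")
```

With only a handful of λ points, the quadrature error and the statistical error of each ensemble average both contribute to the discrepancy, which is why the smoothness of the integrand matters.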
Equilibrium values of w are measured in separate simulations at a few discrete λ points along the path. It is then assumed that w is a smooth function
D.A. Kofke and D. Frenkel
of λ, and simple quadrature formulas (e.g., trapezoid rule) can be applied. The primary mechanism for the failure of TI is the occurrence of a phase transition, and therefore a discontinuity in w, along the path. Otherwise TI has been successfully applied to a very wide variety of systems, dating to the earliest simulations. Its primary disadvantage is that it does not provide direct measurement of the free energy, and if one is not interested in behavior for points along the integration path then another approach might be preferred.

TI approximates a reversible path by smoothing equilibrium, ensemble-averaged “forces” measured discretely along the path. Alternatively, one can access a reversible path by mimicking a truly reversible process, i.e., by attempting to traverse the path via a slow, continuous transition. In this manner the simulation constantly evolves from system A to system B, such that every MC or MD move is accompanied by a tiny step in λ (or some variation of this protocol). The differential work associated with these changes is accumulated to yield the total work W, which then approximates the free-energy difference. The process may proceed isothermally or adiabatically; the latter is the so-called adiabatic-switch method, which instead yields the entropy difference between A and B [16]. The weakness of these methods is the uncertainty about whether the evolution of the system is sufficiently slow to be considered reversible. Such concerns can be allayed by implementing the calculation using the Jarzynski free-energy formula, Eq. (2); however, this remedy then requires averaging of repeated realizations of the transition. One is then led to ask whether it is better to average, say, ten NEW passes, or to perform a single switch ten times more slowly.

Free-energy perturbation (FEP) is obtained as the special case of the NEW method in which the transformation from A to B is taken in a single step.
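The NEW average of Eq. (2) can be illustrated with a small sketch in which the spring constant of a one-dimensional harmonic well is switched from k_A to k_B in a finite number of increments, with one Metropolis move after each increment. The model, the function names `switching_work` and `jarzynski`, the step size, and the number of realizations are all assumptions chosen for illustration. Each realization starts from an equilibrium configuration of A, and the work of each increment is accumulated at fixed configuration, as in Eq. (3).

```python
import math
import random

def switching_work(k_a, k_b, n_steps, rng, step=1.0, kT=1.0):
    """One non-equilibrium realization: switch k from k_a to k_b in n_steps
    increments and return the accumulated work (illustrative model)."""
    x = rng.gauss(0.0, math.sqrt(kT / k_a))  # start from equilibrium in system A
    work = 0.0
    k = k_a
    dk = (k_b - k_a) / n_steps
    for _ in range(n_steps):
        # Work of the increment at fixed configuration: E(x; k + dk) - E(x; k)
        work += 0.5 * dk * x * x
        k += dk
        # One Metropolis move at the new spring constant
        x_new = x + rng.uniform(-step, step)
        d_e = 0.5 * k * (x_new * x_new - x * x)
        if d_e <= 0.0 or rng.random() < math.exp(-d_e / kT):
            x = x_new
    return work

def jarzynski(works, kT=1.0):
    """Free-energy difference from the exponential work average of Eq. (2)."""
    avg = sum(math.exp(-w / kT) for w in works) / len(works)
    return -kT * math.log(avg)

rng = random.Random(7)
works = [switching_work(1.0, 4.0, n_steps=20, rng=rng) for _ in range(5000)]
delta_f = jarzynski(works)
mean_work = sum(works) / len(works)
exact = 0.5 * math.log(4.0)  # (kT/2) ln(k_b/k_a) with kT = 1
print(f"Jarzynski {delta_f:.3f}, mean work {mean_work:.3f}, exact {exact:.3f}")
```

The plain average of the work is never smaller than the Jarzynski estimate, and the gap between the two shrinks as the switch is made more slowly. Varying `n_steps` against the number of realizations at fixed total cost is one way to explore the question raised above about ten fast passes versus one slow one.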
Free-energy perturbation is a well-established and widely used method. Its principal advantage is that it permits ΔF to be given as an ensemble average over configurations of the A system, removing the complication and expense of defining and traversing a path. The working formula emphasizes this feature:

\[ \exp(-\beta \Delta F) = \left\langle \exp[-\beta(E_B - E_A)] \right\rangle_A. \qquad (6) \]
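A minimal sketch of Eq. (6) follows, again for a pair of one-dimensional harmonic wells so that the exact result is known. The energy functions, the helper name `fep`, and the sample count are assumptions for illustration, and the A-system configurations are drawn directly from the exact Boltzmann distribution rather than from a simulation.

```python
import math
import random

def fep(samples_a, e_a, e_b, kT=1.0):
    """Single-stage FEP: average exp(-(E_B - E_A)/kT) over A-system samples."""
    avg = sum(math.exp(-(e_b(x) - e_a(x)) / kT) for x in samples_a) / len(samples_a)
    return -kT * math.log(avg)

k_a, k_b = 1.0, 4.0  # illustrative spring constants; B is the narrower well

def e_a(x):
    return 0.5 * k_a * x * x

def e_b(x):
    return 0.5 * k_b * x * x

rng = random.Random(3)
samples_a = [rng.gauss(0.0, math.sqrt(1.0 / k_a)) for _ in range(100000)]
delta_f = fep(samples_a, e_a, e_b)
exact = 0.5 * math.log(k_b / k_a)  # (kT/2) ln(k_b/k_a) with kT = 1
print(f"FEP estimate {delta_f:.3f}, exact {exact:.3f}")
```

No path is defined here: the only simulation needed is that of the A system, and E_B is merely evaluated on its configurations.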
A given NEW calculation can in principle be performed in either direction, starting from A and transforming to B, or vice versa. In practice the calculation will give different results when applied in one or the other direction; moreover these results will bracket the correct value of ΔF. The results differ because they are inaccurate, and the fact that they bracket the correct value makes it tempting to take their average as the “best” result. But this practice is not a good idea, because the magnitude of the inaccuracies is in general not the same for the two directions [17,18]. In fact, it is not uncommon for one direction to provide the right result while the other yields an inaccurate one. But it is also not uncommon in other cases for the average to give a better estimate than either direction individually. The point is that one often does not know what
is the best way to interpret the results. The more careful practitioners will apply sufficient calculation (and perhaps use sufficient stages) until a point is reached in which the results from each direction match each other. However, this practice can be wasteful. To understand the problem and its remedy it is helpful to consider the systems A and B from the perspective of configuration space.
1.3.
Configuration Space
Configuration space is a high-dimensional space of all molecular configurations, such that any particular arrangement of the N atoms in real space is represented by a single point in 3N-dimensional configuration space (more generally we may consider 6N-dimensional phase space, which includes also the momenta) [19]. An arbitrary point in configuration space will typically describe a configuration that is unrealistic and unimportant, in the sense that one would not expect ever to observe the configuration arising spontaneously in the course of the system’s natural dynamics. For example, it might be a configuration in which two atoms occupy overlapping positions. Configuration space will of course contain points that do represent realistic, or important, configurations, ones that are in fact observed in the system. It is helpful to consider the set Γ* of all such configurations, as we do schematically in Fig. 1. The enclosing square represents the high-dimensional configuration space, and the ovals drawn within it represent (in a highly simplified manner) the set of all important configurations for the systems.

The concept of “important configurations” is relevant to free-energy calculations because the ease with which a reliable (accurate) free-energy difference can be measured depends largely on the relation between the Γ* regions of the two systems defining the free-energy difference. There are five general possibilities [20], summarized in Fig. 1. In a FEP calculation perturbing from A to B, the simulation samples the region labeled Γ*_A and at intervals it examines its present configuration and gauges its importance to the B system. Three general outcomes are possible for the difference E_B − E_A seen in Eq. (6): (a) it is a large positive number and the contribution to the FEP average is small; this occurs if the point is in Γ*_A but not in Γ*_B; (b) it is a number of order unity, and a significant contribution is made to the FEP average; this occurs if the point is in Γ*_A and in Γ*_B; or (c) it is a large negative number, and an enormous contribution is made to the FEP average; this occurs if the point is not in Γ*_A but is in Γ*_B. The third case will arise rarely if ever, because the sampling is by definition largely confined to the region Γ*_A. This contradiction (a large contribution made by a configuration that is never sampled) is the source of the inaccuracy in FEP calculation, and it arises if any part of Γ*_B lies outside of Γ*_A.
Figure 1. Schematic depiction of types of structures that can occur for the region of important configurations involving two systems. The square region represents all of phase space Γ, and the filled regions are the important configurations Γ*_A and Γ*_B for the systems “A” and “B”, as indicated. (a) simple case in which Γ*_A and Γ*_B are roughly coincident, and there is no significant region of one that lies outside the other; (b) case in which the important configurations of A and B have no overlap, and energetic barriers prevent each from sampling the other; (c) case in which one system’s important configurations are a wholly contained, not-very-small subset of the other’s; (d) case in which Γ*_B is a very small subset of Γ*_A; (e) case in which Γ*_A and Γ*_B overlap, but neither wholly contains the other.
This observation leads us to the most important rule for the reliable application of FEP: the reference and target systems must obey a configuration-space subset relation. That is, the important configuration space of the target system (B) must be wholly contained within the important configuration space of the system governing the sampling (A). Failure to adhere to this requirement will lead to an inaccurate result. Note that the asymmetry of the relation “is a subset of” is directly related to the asymmetry of the FEP calculation. Exchange of the roles of A and B as target or reference can make or break the accuracy of the calculation. For example, consider the free-energy change associated with the addition of a molecule to the system. In this case, ΔF equals the excess chemical potential. The A system is one in which the “test” molecule has no interaction with the others, and the B system is one in which it interacts as all the other molecules do. Any configuration in which the test molecule overlaps another molecule is not important to B but is (potentially) important to A; the B system may be a subset of A, while A is most certainly not a subset of B. Whether all of Γ*_B is within Γ*_A cannot be stated for the general case. In more complex
systems (e.g., water) it is likely that there are configurations sampled by B that would not be important to A, while in simpler systems (a Lennard–Jones fluid at moderate density) the subset relation is satisfied.

This black-and-white picture, in which the Γ* regions are well defined with crisp boundaries, presents only a conceptual illustration of the nature of the calculations. In reality the “importance” of a given configuration (point in Γ) is not so clear-cut, and the Γ* regions for the A and B systems may overlap in shades of gray (i.e., degrees of importance).

The discussion here is given in the context of a FEP calculation, but the same ideas are relevant to the more general NEW calculation. Each increment of work performed in a NEW calculation must adhere to the subset relation too. The difference with NEW is that if the change is made sufficiently slowly (approaching reversibility), then the important phase spaces at each step will differ by only small amounts (cf. Fig. 1(a)), and the subset relation will be satisfied. To the extent that a NEW calculation is performed irreversibly, the issue of inaccuracy and asymmetry becomes increasingly important.
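The asymmetry implied by the subset relation can be seen in a toy calculation. In the sketch below, A is a wide one-dimensional harmonic well and B a very narrow one centered at the same point, so that the important configurations of B form a small subset of those of A (compare Fig. 1(d)). The spring constants, the sample size, and the function name `fep_delta_f` are illustrative assumptions, and exact Gaussian sampling replaces a simulation.

```python
import math
import random

def fep_delta_f(k_sample, k_target, n, rng):
    """FEP estimate of F_target - F_sample for 1D harmonic wells (kT = 1),
    sampling configurations from the well with spring constant k_sample."""
    sigma = math.sqrt(1.0 / k_sample)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        total += math.exp(-0.5 * (k_target - k_sample) * x * x)
    return -math.log(total / n)

k_wide, k_narrow = 1.0, 100.0  # A is the wide well, B the narrow one
exact = 0.5 * math.log(k_narrow / k_wide)  # F_B - F_A with kT = 1

rng = random.Random(11)
# Sample A and perturb into B: the target is a subset of the sampled region.
forward = fep_delta_f(k_wide, k_narrow, 5000, rng)
# Sample B and perturb into A, then flip the sign to report F_B - F_A.
reverse = -fep_delta_f(k_narrow, k_wide, 5000, rng)

print(f"exact {exact:.3f}, A->B {forward:.3f}, B->A {reverse:.3f}")
```

In the reverse direction the average is dominated by configurations far out in the tails of the narrow well, which a finite sample essentially never visits, so that estimate can be expected to be much less reliable than the forward one for the same effort.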
1.4.
Staging Strategies
In practice one is confronted with a pair of systems for which ΔF is desired, and there is no control over whether their Γ* regions satisfy a subset relation. Yet FEP and NEW cannot be safely applied unless this condition is met. Two remedies are possible.

Phase space can be redefined, such that a given point in it can represent different configurations for the A and B systems [21–23]. This approach has been applied to evaluate free-energy differences between crystal structures (e.g., fcc vs. bcc) of a given model system. The phase-space points are defined to represent deviations from a perfect-crystal configuration, and the reference crystal is defined differently for the two systems. The switch from A to B entails swapping the definition of the reference crystal while keeping the deviations (i.e., the redefined phase-space point) fixed. With this transformation, two systems having disjoint Γ* regions are redefined such that their Γ* regions at least have significant overlap, and perhaps obey the subset requirement.

Multiple staging is a more general approach to deal with systems that do not satisfy the subset relation [24–26]. Here the desired free-energy difference is expressed in terms of the free energy of one or more intermediate systems, typically defined only to facilitate the free-energy calculation. Thus,

\[ \Delta F = (F_B - F_M) + (F_M - F_A), \qquad (7) \]
where M indicates the intermediate. Free-energy methods are then brought to bear to evaluate separately the two differences, between the M and B systems and between the M and A systems, respectively. The M system should be defined such that a subset relation can be formed between it and both the A and B systems. There are
several options to this end, depending on the Γ* relation in place for the A and B systems. Figure 2 summarizes the possibilities, and the cases are named as follows:
• Umbrella sampling. Here M is formulated to contain both A and B, and sampling is performed from it into each [27].
• Funnel sampling. This is possible only if B is already a subset of A. Then M is defined as a subset of A and superset of B, and each perturbation stage is performed accordingly [20, 25, 28].
• Overlap sampling. Here M is formulated to be a subset of both A and B, and sampling is performed on each with perturbation into M [29].
General ways to define M to satisfy these requirements are summarized in Table 1, which also lists the general working equations for each multistage scheme. Umbrella sampling is a well-established method, but it has only recently been viewed from the perspective given here. Bennett’s acceptance-ratio method is a particular type of overlap sampling in which an optimal
Figure 2. Schematic depiction of types of structures that can occur for the region of important configurations involving two systems and a weight system formulated for multistage sampling. The square region represents all of phase space, and the filled regions are the important configurations Γ*_A, Γ*_B, and Γ*_M for the systems A, B, and M, as indicated. (a) well formulated umbrella potential defines important configurations that have both Γ*_A and Γ*_B as subsets; (b) safely formulated funnel potential needed to focus sampling on tiny set of configurations Γ*_B while still representing all configurations important to A; (c) well formulated overlap potential, with important configurations formed as a subset of both the A and B systems.

Table 1. Summary of staging methods for free-energy perturbation calculations

| Method | Formula for $e^{-\beta(F_B - F_A)}$ | Preferred staging potential, $e^{-\beta E_M}$ |
| --- | --- | --- |
| Umbrella sampling | $\langle e^{-\beta(E_B - E_M)} \rangle_M \,/\, \langle e^{-\beta(E_A - E_M)} \rangle_M$ | $e^{-\beta(E_A - F_A)} + e^{-\beta(E_B - F_B)}$ |
| Funnel sampling | $\langle e^{-\beta(E_M - E_A)} \rangle_A \, \langle e^{-\beta(E_B - E_M)} \rangle_M$ | No general formulation |
| Overlap sampling | $\langle e^{-\beta(E_M - E_A)} \rangle_A \,/\, \langle e^{-\beta(E_M - E_B)} \rangle_B$ | $\left[ e^{+\beta(E_A - F_A)} + e^{+\beta(E_B - F_B)} \right]^{-1}$ |
M is selected to minimize the variance of ΔF; it is a highly effective and underappreciated method. The funnel-sampling multistage scheme is new, and a general, effective formulation for an M system appropriate to it has not yet been identified. Overlap sampling and umbrella sampling are not particularly helpful if A and B already satisfy the subset relation: they do not give much better precision than a simple single-stage FEP calculation taken in the appropriate direction. However, if implemented correctly they do provide some measure of safety against problems of inaccuracy, which is useful because in most cases one does not know clearly the nature of the phase-space relation for the A and B systems, or whether (and in which direction) a single-stage calculation is safe to perform between them.
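To make the acceptance-ratio idea concrete, the sketch below solves Bennett's self-consistent equation in its standard form for equal numbers of A and B samples: the sum over A samples of f(β(E_B − E_A) − C) is set equal to the sum over B samples of f(C − β(E_B − E_A)), with f(x) = 1/(1 + e^x), and the root C is the estimate of βΔF. This form is taken from the general literature on the method rather than from the text above. The two harmonic wells (different widths, displaced centers, so that neither important region contains the other), the function names, and the sample sizes are assumptions for illustration.

```python
import math
import random

def fermi(x):
    """Fermi function 1 / (1 + exp(x)), arranged to avoid overflow."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def bennett(du_a, du_b, lo=-50.0, hi=50.0, tol=1e-8):
    """Bisection solve for C = beta * (F_B - F_A), assuming equal sample counts.
    du_a, du_b: beta * (E_B - E_A) evaluated on A-samples and B-samples."""
    def imbalance(c):
        lhs = sum(fermi(u - c) for u in du_a)
        rhs = sum(fermi(c - u) for u in du_b)
        return lhs - rhs  # increases monotonically with c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Illustrative systems with kT = 1: A is centered at 0 with k = 1, B at 1 with k = 4.
def delta_u(x):
    return 0.5 * 4.0 * (x - 1.0) ** 2 - 0.5 * 1.0 * x * x

rng = random.Random(5)
n = 5000
du_a = [delta_u(rng.gauss(0.0, 1.0)) for _ in range(n)]  # samples of A
du_b = [delta_u(rng.gauss(1.0, 0.5)) for _ in range(n)]  # samples of B
delta_f = bennett(du_a, du_b)
exact = 0.5 * math.log(4.0)  # depends only on the ratio of spring constants
print(f"Bennett estimate {delta_f:.3f}, exact {exact:.3f}")
```

Both systems are sampled and both contribute to the estimate, which is what gives the method its robustness when the direction of a safe single-stage perturbation is not known in advance.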
2.
Methods for Evaluation of Phase Coexistence
Our perspective now shifts to the calculation of phase coexistence by molecular simulation, for which free-energy methods play a major role. Applications in this area have exploded over the past decade or so, owing to fundamental advances in algorithms, hardware, and molecular models. Some of the methods and concepts surveyed here have been discussed in more detail in recent reviews [30, 31].
2.1.
What is a Phase?
An order parameter is a statistic for a configuration. It is a number (or perhaps a vector, tensor, or some other set of numbers) that can be calculated or measured for a system in a particular configuration, and that in some sense quantifies the configuration. Examples include the density, the mole fraction in a mixture, the magnetic moment of a ferromagnet, and so on. Some molecular order parameters are formulated as expansion coefficients of an appropriate distribution function rendered in a suitable basis set. For example, a natural choice for crystalline translation order parameters is the value of the structure factor for an appropriate wave vector k. Orientational order parameters are widely used in the field of liquid crystals, and a common choice is based on expansion of the orientation distribution in Legendre polynomials. Usually an order parameter is defined such that it has a physical manifestation that can be observed experimentally. A thermodynamic phase is the set of all configurations that have (or are near) a given value of an order parameter. Phases are important because a system will spontaneously change its phase in response to some external perturbation.
In doing so, the configurations exhibited by the system change from those associated with one value of the order parameter to those of another. Usually such a large shift in the predominant configurations will cause the system’s physical properties (mechanical, electrical, optical, etc.) to change in ways that might be very useful. A well-known example is the boiling of a liquid to form a vapor. In response to a small change in temperature, the observed configurations of the system go from those corresponding to a large density to those for a much smaller density. In both cases the system (being at fixed pressure) is free to adopt any desired density. In changing phase it overwhelmingly selects configurations for one density over another. This phenomenon, and its many variants, has a multitude of practical applications.

Clearly, there is a close connection between this molecular picture of a phase transformation and the ideas presented above about the important phase space for a system. When a system changes phase, it is actually changing its important phase space, and the Γ* regions for the system before and after the change can relate in any of the ways described in Fig. 1. Analysis of the free energy is required to identify the location of the phase change quantitatively. Often the order parameter describing the phase change serves as the path parameter λ when performing this analysis.
2.2.
Conditions for Phase Equilibria
In a typical phase-equilibrium problem one is interested in the two (or more) phases involved in the transformation. At the exact condition at which one becomes favored over the other, both are equally stable. Molecular simulation is applied to locate this point of phase equilibrium and to characterize the coexisting phases. Formally, the thermodynamic conditions of coexistence can be identified as those minimizing an appropriate free energy, or equivalently by finding the states in which the intensive “field” variables of temperature, pressure, and chemical potential (and perhaps others) are equal among the candidate phases. Most methods for evaluation of phase equilibria by molecular simulation are based on identifying the conditions that satisfy the thermodynamic phase-coexistence criteria, and consequently they require evaluation of free energies or a free-energy difference. Still there is a lot of variability in the approaches, because really there are two problems involved in the calculation. The first is the measurement of the thermodynamic properties, particularly the free energy, while the second is the numerical “root-finding” problem of locating the coexistence conditions. Methods differ largely in the way they combine these two numerical problems, and the most effective and popular methods synthesize these calculations in elegant ways.
2.3.
Direct Contact of Phases, Spontaneous Transformations
Before turning to the free-energy based approaches for evaluating phase coexistence, it is worthwhile to consider the more intuitive approaches that mimic the way phase transitions are studied experimentally. By this we mean methods in which a system is simulated and the phase it spontaneously adopts is identified as the stable thermodynamic phase. Two general approaches can be taken, depending on the types of variables that are fixed in the simulation (i.e., the governing ensemble).

In the first case, only one size variable is imposed (typically the number of molecules), and the remaining variables are fields (temperature, pressure, chemical potential difference). Then a scan is made of one or more of the fields (e.g., the temperature is increased), and one looks for the condition at which the phase changes spontaneously (e.g., the system undergoes a sudden expansion). The temperature at which this happens, together with the conditions of the phases before and after the transition, characterizes the coexistence point. In practice this method is effective only for producing a coarse description of the phase behavior. It is very easy for a system to remain in a metastable condition as the field variable moves through the transition point, and the spontaneous transformation may occur at a point well beyond the true value. The reverse process is susceptible to the same problem, so the transformation process exhibits hysteresis when the field is cycled back and forth through the transition value.

In the second case, two or more extensive variables are imposed (e.g., the number of molecules and the volume), and the system is simulated at a condition inside the two-phase region. A macroscopic system in this situation would separate into the two phases, and both would coexist in the given volume. In principle, this too happens in a molecular simulation, but usually the system size is not sufficiently large to wash out effects due to the presence of the interface.
In effect, neither bulk phase is simulated. Nevertheless, the direct-contact method does work in some situations. Solid-fluid phase behavior has been studied this way. The interface is slow to equilibrate in this system, so one must be careful to ensure that the simulation begins with a well equilibrated solid. Vapor-liquid equilibria have also been examined using direct contact of the phases. Of course, this approach cannot be applied when too close to the critical point. Often such systems are examined because the interfacial properties are themselves of direct interest.

Spontaneous formation of phases has been used recently to examine the behaviors of models that exhibit complex morphologies. Glotzer et al. have examined the mesophases formed by a wide variety of model nanoparticles, including hard particles with tethers, and particles with sticky patches [32].
The systems have been observed to spontaneously form many complex structures, including columns, lamellae, micelles, sheets, double layers, gyroid phases, and so on. The question remains of the absolute stability of the observed structures, but their spontaneous formation is a strong indicator that they are relevant, and could well be the most stable of all possible phases at the simulated conditions. The phase behaviors of other types of mesoscale models are also studied through the direct-observation methods. Systems modeled using dissipative particle dynamics [2, 33] are good candidates for this treatment: they have a very soft repulsion, so particles can in effect pass through each other, and as a consequence they equilibrate very quickly.
2.4.
Methods Based on Solution of Thermodynamic Equalities
A well-worn approach to the free-energy based evaluation of phase equilibria focuses on satisfying the coexistence conditions given in terms of equality of the field parameters. In this approach each phase is studied separately, and state conditions are varied systematically until the coexistence conditions are met. An effective way to attack this problem is to combine the search for the coexistence point with the evaluation of the free energy through thermodynamic integration. For example, to evaluate a vapor-liquid coexistence point, one can start with a subcooled liquid of known chemical potential (evaluated using any of the methods reviewed above), and proceed with a series of isothermal-isobaric simulations following a line of decreasing pressure. At each point the chemical potential can be evaluated through the thermodynamic integration using the measured density

\[ \mu(P) = \mu(P_0) + \int_{P_0}^{P} \frac{dp}{\rho(p)}. \qquad (8) \]
A similar series of simulations can be performed in the vapor separately, at the same temperature as the liquid simulations, but increasing the pressure toward the point of saturation (alternatively, an equation of state might be applied to characterize the vapor). Once the liquid and vapor simulations reach an overlapping range of pressures, the chemical potentials computed according to Eq. (8) can be examined at each pressure, until the point is found at which chemical potential is equal across the two phases for a given pressure. This general approach can be somewhat tedious to implement, but it is perhaps the most robust of all methods. It is likely to provide a good result for almost all types of coexistence. It has been applied to many types of phase equilibria, including those involving solids [34], liquid crystals [35], plastic
crystals, as well as fluids. The search for the coexistence condition can be applied using almost any order parameter (density was used in this example), although one must perhaps put some effort toward developing the appropriate formalism defining a field to couple to the parameter, and implementing a simulation in which this field is applied. Complications arise if many field parameters are relevant. For example, if one is studying a mixture, then a separate field parameter (chemical potential) is needed to couple to each mole-fraction variable. The problem can be simplified by fixing all but one of the field variables in the two phases, but often this leads to a statement of the coexistence problem that is at odds with the problem of real interest (e.g., one might want to know the composition of the incipient phase arising from another phase of given composition, which in the context of vapor-liquid equilibria is known as a bubble-point or a dew-point calculation). For mixtures, this formulation is expressed by the semigrand ensemble [36].

This method, like many others, will suffer when applied to characterize a weak phase transition, that is, one that is accompanied by only a small change in the relevant order parameter. The order parameter is related to the slope of the line that is being mapped in this calculation, and consequently for a weak transition the slopes of these lines for the two phases will not be very different from each other. It can be difficult to locate precisely the intersection of two nearly parallel lines: any errors in the position of the lines will have a greatly magnified effect on the error in the point of intersection. Therefore the application of this method to a weak transition can fail if the relevant ensemble averages and the free energies for the initial points of the integration are not measured with high precision and accuracy.
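The procedure built on Eq. (8) can be sketched with synthetic data in place of simulation results. In the example below the "vapor" is an ideal gas and the "liquid" is incompressible, both with kT = 1; the density tables, the starting chemical potentials, and the helper names `integrate_mu` and `interpolate` are hypothetical values and names chosen only to make the steps runnable. The chemical potential is integrated along each isotherm by the trapezoid rule, and the saturation pressure is located where the two curves cross.

```python
import math

def integrate_mu(pressures, densities, mu_start):
    """Eq. (8): mu(P) = mu(P0) + integral of dp / rho(p), by the trapezoid rule.
    Works for increasing or decreasing pressure scans."""
    mu = [mu_start]
    for i in range(1, len(pressures)):
        dp = pressures[i] - pressures[i - 1]
        mu.append(mu[-1] + 0.5 * dp * (1.0 / densities[i] + 1.0 / densities[i - 1]))
    return mu

def interpolate(xs, ys, x):
    """Linear interpolation on an increasing grid xs."""
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])
    return ys[-1]

# Hypothetical vapor scan: pressure increasing from 0.01 to 0.10, rho = P / kT.
p_vap = [0.01 + 0.001 * i for i in range(91)]
rho_vap = list(p_vap)
mu_vap = integrate_mu(p_vap, rho_vap, math.log(0.01))  # assumed known start value

# Hypothetical liquid scan: pressure decreasing from 0.50 to 0.01, constant density.
p_liq = [0.50 - 0.01 * i for i in range(50)]
rho_liq = [0.8 for _ in p_liq]
mu_liq = integrate_mu(p_liq, rho_liq, -2.5)  # assumed known start value

# Put the liquid scan on an increasing grid for interpolation.
p_liq_up, mu_liq_up = p_liq[::-1], mu_liq[::-1]

def mu_gap(p):
    """Difference mu_liquid(P) - mu_vapor(P); zero at coexistence."""
    return interpolate(p_liq_up, mu_liq_up, p) - interpolate(p_vap, mu_vap, p)

lo, hi = 0.01, 0.10  # overlapping pressure range of the two scans
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if mu_gap(lo) * mu_gap(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
p_sat = 0.5 * (lo + hi)
print(f"estimated saturation pressure {p_sat:.4f}")
```

In this toy model the two chemical-potential curves have very different slopes (1/ρ differs greatly between the phases), so the crossing is well conditioned; for a weak transition the slopes would be nearly equal and the same root search would amplify any error in either curve.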
2.5.
Gibbs Ensemble
A breakthrough in technique for the evaluation of phase coexistence by molecular simulation arrived in 1987 with the advent of the Gibbs ensemble [37]. This method presents a very clever synthesis of the problem of locating the conditions of coexistence and measuring the free energy in the candidate phases. It accomplishes this through the simulation of both phases simultaneously, each occupying its own simulation volume. Although the phases are not in “physical” contact, they are in contact thermodynamically. This means that they are capable of exchanging volume and mass in response to the thermodynamic driving forces of pressure and chemical potential difference, respectively. The systems evolve in this way, increasing or decreasing in density with the mass and volume exchanges, until the point of coexistence is found. Upon reaching this condition the systems will fluctuate in density about the values appropriate for the equilibrium state, which can then be measured as a simple
ensemble average. Details of the method are available in several reviews and texts [2, 37, 38].

The Gibbs ensemble is the method of choice for straightforward evaluation of vapor–liquid and liquid–liquid equilibria. It does not suffer any particular complications when applied to mixtures, and it has been applied with great success to many phase coexistence calculations. However, there are several ways in which it can fail. First, an essential element of the technique is the exchange of molecules at random between the coexisting phases. If trials of this type are not accepted with sufficient frequency, the systems will not equilibrate and a poor result is obtained. This problem arises in applications to large, complex molecules, and/or at low temperatures and high densities. It can be overcome to a useful degree through the application of special sampling techniques, such as configurational bias.

Second, in its basic form the Gibbs ensemble is not applicable to equilibria involving solids, or to lattice models. The problem is only partially due to the difficulty of inserting a molecule into a solid. The “mass balance” is the more insidious obstacle. The number of molecules present in each phase at equilibrium is set by the initial number of molecules and the volume of the composite system of both phases (as well as the values of the coexistence densities). A defect-free crystal can be set up in a periodic system using only a particular number of molecules. For example, an fcc lattice in cubic periodic boundaries can be set up using 32, 108, 256, 500, and so on molecules (i.e., 4n³, where n is an integer). When beginning a Gibbs ensemble calculation there is no simple way to ensure this condition will be met in the equilibrium system. Tilwani and Wu [39] have treated these problems with an alternative approach in which an atom is added to the unit box of the solid and this new unit box is used to fill up (tile) space.
In this way, particles can be added or removed from the system, while the crystal structure is maintained. The Gibbs ensemble fails also upon approach to the critical point. As this condition is reached, contributions to the averages increase for densities in the region between the two phases. It then becomes possible, even likely, that the simulated phases will swap their roles as the liquid and vapor phases. This is not a fatal flaw, but it presents a complication to the method, and it is an indicator that the general approach is beginning to fail. Thus the consensus today is that in this region of the phase envelope density-of-states methods are more suitable for characterizing the coexistence behavior. More generally, the Gibbs ensemble can encounter difficulty when applied to any weak phase transition, if only because it is necessary to configure the composite system so that it lies in the two phase region – this can be difficult to do if this region is very narrow. Interestingly enough, the Gibbs ensemble can fail also if it is applied using very large system sizes. In this situation an interface is increasingly likely to form in one or both phases, and the result is that a clean separation of phases between the volumes is no longer in place – instead both
simulation volumes each end up representing both phases. Typically the Gibbs ensemble is applied for its simplicity and ability to provide quick results, so the large systems needed to raise this problem are not usually encountered.
2.6. Gibbs–Duhem Integration
The Gibbs–Duhem integration (GDI) method [40] applies thermodynamic integration to both parts of the combined problem of evaluating the free energy and locating the point of transition. In particular, the path of integration is constructed to follow the line of coexistence. All of this is neatly packaged by the Clapeyron differential equation for the coexistence line, which in the pressure–temperature plane is [1]
$$\left(\frac{dP}{dT}\right)_{\sigma} = \frac{\Delta H}{T\,\Delta V}, \tag{9}$$
where $\Delta H$ and $\Delta V$ are the differences in molar enthalpy and molar volume, respectively, between the two phases; the $\sigma$ subscript indicates a path along the coexistence line. The GDI procedure treats Eq. (9) as a numerical problem of integrating an ordinary differential equation. The complication, of course, is that the right-hand side must be evaluated through molecular simulation at the temperature and pressure specified by the integration procedure, and moreover separate simulations are required to characterize both phases involved in the difference. A simple iterative process is applied to refine the pressure according to Eq. (9) after a step in temperature is taken, using preliminary results for the ensemble averages from the simulations. Predictor-corrector methods are effective in performing the integration, and inasmuch as the primary error in the calculation arises from the imprecision of the ensemble averages, a low-order integration scheme suffices for the purpose. The GDI method applies much more broadly than indicated in this description. Any type of field variable can be used in the role held by pressure and temperature in Eq. (9), with appropriate modification to the right-hand side. For example, integrations have been performed along paths of varying composition, polydispersity, orientational order, and interparticle-potential softness, rigidity, or shape [36]. The method applies equally well to equilibria involving fluids or solids, or other types of phases. It has been used to follow three-phase coexistence lines too. In this application one must integrate two differential equations similar to Eq. (9), involving three field variables. In all cases there are a number of practical implementation issues to consider, such as how the integration is started, and the proper selection of the functional form of the field variables (e.g., integration in ln(P) vs. 1/T has advantages for tracing
vapor–liquid coexistence lines). These issues have been discussed in some detail in recent reviews [36, 41]. The GDI method has some limitations. It does require an initial point of coexistence in order to begin the integration procedure. Concerns are often expressed that errors in this initial point will propagate throughout the integration, but this problem is not as bad as one might think. A stability analysis shows that any such errors will be attenuated if the integration is performed in a direction from a weaker to a stronger transition (e.g., away from the liquid–vapor critical point toward lower temperatures). On the other hand, if the integration is performed in the opposite direction, initial and accumulated errors will be amplified. Regardless, it seems that in practice such problems do not arise. A related concern is the general difficulty in treating weak phase transitions. If the differences on the right-hand side of Eq. (9) are small, and thus may be formed using averages that have stochastic errors comparable to the differences themselves, then it is clear that the method will not work well. In such cases one might be better off employing a method that directly bridges the difference between the phases, such as by mapping the full density of states in this region. The basic idea of tracing coexistence lines has been further generalized for mapping of other classes of phase equilibria, such as tracing of azeotropes [42], and dew/bubble-point lines [41]. Escobedo has developed and applied a general framework for these approaches [30, 43–47].
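The predictor-corrector integration of Eq. (9) described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `rhs` stands in for the pair of molecular simulations that would supply the enthalpy and volume differences, and here it is replaced by a hypothetical analytic model (ideal-gas vapor, negligible liquid volume, enthalpy of vaporization linear in temperature) so that the result can be checked against a closed form. The integration is carried out in ln(P) vs. 1/T and proceeds toward lower temperature.

```python
import math

R = 8.314  # gas constant, J/(mol K)
A_H, B_H = 40000.0, 40.0  # hypothetical model: delta_H(T) = A_H - B_H * T, J/mol


def rhs(inv_t, ln_p):
    """d ln(P) / d(1/T) along the coexistence line.

    Stand-in for two simulations (one per phase). With an ideal-gas vapor
    and negligible liquid volume, Eq. (9) reduces to -delta_H / R.
    The argument ln_p is unused by this model but kept in the signature,
    since a simulation-based right-hand side depends on pressure as well.
    """
    temperature = 1.0 / inv_t
    return -(A_H - B_H * temperature) / R


def gdi_trace(inv_t0, ln_p0, inv_t_end, n_steps):
    """Trace ln(P) vs 1/T with an Euler predictor and trapezoid corrector."""
    h = (inv_t_end - inv_t0) / n_steps
    x, y = inv_t0, ln_p0
    path = [(x, y)]
    for _ in range(n_steps):
        f0 = rhs(x, y)
        y_pred = y + h * f0               # predictor: guess the new pressure
        f1 = rhs(x + h, y_pred)           # "simulate" at the predicted state
        y = y + 0.5 * h * (f0 + f1)       # corrector: refine the pressure
        x = x + h
        path.append((x, y))
    return path


def ln_p_exact(inv_t, inv_t0, ln_p0):
    """Closed-form integral of the model right-hand side, for comparison."""
    return ln_p0 - A_H / R * (inv_t - inv_t0) + B_H / R * math.log(inv_t / inv_t0)


inv_t0, ln_p0 = 1.0 / 400.0, math.log(1.0e5)   # assumed starting coexistence point
path = gdi_trace(inv_t0, ln_p0, 1.0 / 300.0, n_steps=20)
inv_t_end, ln_p_end = path[-1]
error = abs(ln_p_end - ln_p_exact(inv_t_end, inv_t0, ln_p0))
print(f"ln P at 300 K: {ln_p_end:.5f}, deviation from closed form: {error:.2e}")
```

In an actual calculation the second call to `rhs` would reuse running averages from the simulations in progress, and the corrector would be iterated as those averages converge; the low order of the scheme reflects the remark above that stochastic error, not integration error, dominates.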
2.7. Mapping the Density of States
Density-of-states methods evaluate coexisting phases by calculating the full free-energy profile across the range of values of the order parameter between and including the two phases. It is only in the past few years that this method has come to be viewed as generally viable, and even a good choice for evaluating phase coexistence. The effort involved in collecting information for the intermediate points seems wasteful, although with this approach these data are needed to obtain the relative free energies of the real states of interest (i.e., the coexisting phases). The methods reviewed above are popular because they avoid this complication and are more efficient because of it. However, there are advantages in having the system cycle through the uninteresting states. First, it helps to move the sampling through phase space. Thus, a simulated system might go from a liquid configuration, then to a vapor, and back to the liquid but in a very different configuration from the one in which it started. This is particularly important for complex fluids such as polymers (in the context of other phase equilibria), in which it is otherwise difficult to escape from ergodic traps. Second, the intermediate states may be of interest in themselves; they can be used, for example, to evaluate the surface tension associated with contacting the two
phases [10]. Third, it may be that the distance between the coexisting phases is not so large (i.e., the transition is weak), so covering the ground between them does not introduce so much expense; moreover in such a situation other methods do not work very well. Regardless, continuing improvements in computing hardware and algorithms (some reviewed above), particularly in parallel methods and architectures, have made the density-of-states strategy look much more appealing. We describe the basic approach in the context of vapor–liquid equilibria. Simulation can be performed in the grand-canonical ensemble with a chemical potential selected to be in the vicinity of the coexistence value. The density of states is mapped as a function of number of molecules at fixed volume; the transition-matrix method with a biasing potential in N has been found to be convenient and effective in this application. The resulting density of states will most likely exhibit two unequal peaks, representing the two nearly coexisting phases. Histogram reweighting is then applied to the density of states to determine the value of the chemical potential that makes the peaks equal in size. This is taken to be the coexistence value of the chemical potential, and the positions of the peaks give the molecule numbers (densities) of the coexisting phases. The coexistence pressure can be determined from the grand potential, which is available from the density of states. Additional details are presented by Errington [9].
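The reweighting step described above can be sketched in a few lines. The example is a hedged illustration: the two-peaked distribution is synthetic (two Gaussians standing in for a distribution measured by transition-matrix Monte Carlo), "equal in size" is interpreted as equal peak areas, the dividing value `n_split` is chosen by hand, and all function names are hypothetical. Reweighting from chemical potential $\mu_0$ to $\mu_0 + \Delta\mu$ at fixed volume and temperature multiplies the probability of $N$ molecules by $\exp(\beta\,\Delta\mu\,N)$.

```python
import numpy as np


def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = np.max(values)
    return top + np.log(np.sum(np.exp(values - top)))


def reweight(ln_pi, n, beta_dmu):
    """Shift ln Pi(N) from mu0 to mu0 + dmu and renormalize."""
    shifted = ln_pi + beta_dmu * n
    return shifted - log_sum_exp(shifted)


def log_peak_ratio(ln_pi, n, beta_dmu, n_split):
    """ln(area of low-N peak / area of high-N peak) after reweighting."""
    lp = reweight(ln_pi, n, beta_dmu)
    return log_sum_exp(lp[n < n_split]) - log_sum_exp(lp[n >= n_split])


def coexistence_shift(ln_pi, n, n_split, lo=-1.0, hi=1.0, n_iter=100):
    """Bisect for the beta*dmu that equalizes the two peak areas.

    The log ratio decreases monotonically with beta*dmu, because raising
    the chemical potential always favors the high-N (liquid-like) peak.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if log_peak_ratio(ln_pi, n, mid, n_split) > 0.0:
            lo = mid    # low-N peak still heavier: raise mu further
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gaussian(x, mean, sigma):
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


# Synthetic stand-in for a measured distribution at mu0: unequal peaks.
n = np.arange(0, 201)
ln_pi0 = np.log(0.8 * gaussian(n, 30.0, 6.0) + 0.2 * gaussian(n, 150.0, 10.0))

n_split = 90
beta_dmu = coexistence_shift(ln_pi0, n, n_split)
p_coex = np.exp(reweight(ln_pi0, n, beta_dmu))
low, high = n < n_split, n >= n_split
n_vapor = np.sum(n[low] * p_coex[low]) / np.sum(p_coex[low])
n_liquid = np.sum(n[high] * p_coex[high]) / np.sum(p_coex[high])
print(f"beta*dmu = {beta_dmu:.5f}, <N>_vapor = {n_vapor:.2f}, <N>_liquid = {n_liquid:.2f}")
```

Dividing the mean molecule numbers by the fixed volume gives the coexistence densities. The sketch reports the mean of each peak rather than its maximum; for narrow, well-separated peaks the two are close.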
3. Outlook
The nature of the questions that we address with the help of computer simulations is changing. Increasingly, we wish to be able to predict the changes that will occur in a system when external conditions (e.g., temperature, pressure or the chemical potential of one or more species) are changed. In order to predict the stable phase of a many-body system, or the “native” conformation of a macromolecule, we need to know the accessible volume in phase space that corresponds to this state or, in other words, its free energy. Both the MC and the MD methods were created in effectively the form in which we use them today. However, the techniques used to compute free energy differences have expanded tremendously and have become much more powerful and much more general than they were only a decade ago. Yet, the roots of some of these techniques go back a long way. For instance, the density-of-states method was already considered in the late 1950s [48] and was first implemented in the 1960s [49]. The aim of the present chapter is to provide a (very concise) review of some of the major developments. As the developments are in a state of flux, this review provides nothing more than a snapshot.
It is always risky to identify challenges for the future, but some seem clear. First of all, it would seem that there must be a quantum-mechanical counterpart to Jarzynski’s NEW method. However, it is not at all obvious that this would lead to a tractable computational scheme. A second challenge has to do with the very nature of free energy. In its most general (Landau) form, the free energy of a system is a measure of the available phase space compatible with one or more constraints. In the case of the Helmholtz free energy, the quantities that we constrain are simply the volume V and the number of particles N. However, when we consider the pathway by which a system transforms from one state to another, the constraint may correspond to a non-thermodynamic order parameter. In simple cases, we know this order parameter, but often we do not. We know the initial and final states of the system and hopefully the transformation between the two can be characterized by one, or a few, order parameters. If such a low-dimensional picture is correct, it is meaningful to speak of the “free-energy landscape” of the system. However, although methods exist to find pathways that connect initial and final states in a barrier-crossing process [50], we still lack systematic ways to construct optimal low-dimensional order parameters to characterize the transformation of the system. To date, most successful schemes to map free-energy landscapes assume that the true reaction coordinates are spanned by a relatively small set of supposedly relevant coordinates. However, it is not obvious that it will always be possible to find such coordinates. Yet, without a physical picture of the constraint or reaction coordinate, free energy surfaces are hardly more informative than the high-dimensional potential-energy surface from which they are ultimately derived. 
Without this knowledge we can still compute the relative stability of initial and final state (provided we have a criterion to distinguish the two), but we will be unable to gain physical insight into the factors that affect the rate of transformation from the metastable to the stable state.
Acknowledgments
DAK’s activity in this area is supported by the U.S. Department of Energy, Office of Basic Energy Sciences. The work of the FOM Institute is part of the research program of FOM and is made possible by financial support from the Netherlands Organization for Scientific Research (NWO).
References [1] K. Denbigh, Principles of Chemical Equilibrium, Cambridge: Cambridge University, 1971. [2] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, Academic Press, San Diego, 2002.
[3] F. Wang and D.P. Landau, “Determining the density of states for classical statistical models: a random walk algorithm to produce a flat histogram,” Phys. Rev. E, 64, 056101-1–056101-16, 2001a. [4] F. Wang and D.P. Landau, “Efficient, multiple-range random walk algorithm to calculate the density of states,” Phys. Rev. Lett., 86, 2050–2053, 2001b. [5] A. Laio and M. Parrinello, “Escaping free-energy minima,” Proc. Nat. Acad. Sci., 99, 12562–12566, 2002. [6] M. Fitzgerald, R.R. Picard, and R.N. Silver, “Canonical transition probabilities for adaptive Metropolis simulation,” Europhys. Lett., 46, 282–287, 1999. [7] J.-S. Wang, T.K. Tay, and R.H. Swendsen, “Transition matrix Monte Carlo reweighting and dynamics,” Phys. Rev. Lett., 82, 476–479, 1999. [8] M. Fitzgerald, R.R. Picard, and R.N. Silver, “Monte Carlo transition dynamics and variance reduction,” J. Stat. Phys., 98, 321, 2000. [9] J. R. Errington, “Direct calculation of liquid–vapor phase equilibria from transition matrix Monte Carlo simulation,” J. Chem. Phys., 118, 9915–9925, 2003a. [10] J. R. Errington, “Evaluating surface tension using grand-canonical transition-matrix Monte Carlo simulation and finite-size scaling,” Phys. Rev. E, 67, 012102-1 – 012102-4, 2003b. [11] M.S. Shell, P.G. Debenedetti, and A.Z. Panagiotopoulos, “An improved Monte Carlo method for direct calculation of the density of states,” J. Chem. Phys., 119, 9406– 9411, 2003. [12] C. Jarzynski, “Equilibrium free-energy differences from nonequilibrium measurements: a master-equation approach,” Phys. Rev. E, 56, 5018–5035, 1997a. [13] C. Jarzynski, “Nonequilibrium equality for free energy difference,” Phys. Rev. Lett., 78, 2690–2693, 1997b. [14] G.E. Crooks, “Nonequilibrium measurements of free energy differences for microscopically reversible Markovian systems,” J. Stat. Phys., 90, 1481–1487, 1998. [15] G.E. Crooks, “Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences,” Phys. Rev. 
E, 60, 2721–2726, 1999. [16] M. Watanabe and W.P. Reinhardt, “Direct dynamical calculation of entropy and free energy by adiabatic switching,” Phys. Rev. Lett., 65, 3301–3304, 1990. [17] N.D. Lu and D.A. Kofke, “Accuracy of free-energy perturbation calculations in molecular simulation I. Modeling,” J. Chem. Phys., 114, 7303–7311, 2001a. [18] N.D. Lu and D.A. Kofke, “Accuracy of free-energy perturbation calculations in molecular simulation II. Heuristics,” J. Chem. Phys., 115, 6866–6875, 2001b. [19] J.P. Hansen and I.R. McDonald, Theory of Simple Liquids, Academic Press, London, 1986. [20] D.A. Kofke, “Getting the most from molecular simulation,” Mol. Phys., 102, 405– 420, 2004. [21] A.D. Bruce, N.B. Wilding, and G.J. Ackland, “Free energy of crystalline solids: a lattice-switch Monte Carlo method,” Phys. Rev. Lett., 79, 3002–3005, 1997. [22] A.D. Bruce, A.N. Jackson, G.J. Ackland, and N.B. Wilding, “Lattice-switch Monte Carlo method,” Phys. Rev. E, 61, 906–919, 2000. [23] C. Jarzynski, “Targeted free energy perturbation,” Phys. Rev. E, 65, 046122, 1–5, 2002. [24] J.P. Valleau and D.N. Card, “Monte Carlo estimation of the free energy by multistage sampling,” J. Chem. Phys., 57, 5457–5462, 1972. [25] D.A. Kofke and P.T. Cummings, “Quantitative comparison and optimization of methods for evaluating the chemical potential by molecular simulation,” Mol. Phys., 92, 973–996, 1997.
[26] R.J. Radmer and P.A. Kollman, “Free energy calculation methods: a theoretical and empirical comparison of numerical errors and a new method for qualitative estimates of free energy changes,” J. Comp. Chem., 18, 902–919, 1997. [27] G.M. Torrie and J.P. Valleau, “Nonphysical sampling distributions in Monte Carlo free-energy estimation: umbrella sampling,” J. Comp. Phys., 23, 187–199, 1977. [28] D.A. Kofke and P.T. Cummings, “Precision and accuracy of staged free-energy perturbation methods for computing the chemical potential by molecular simulation,” Fluid Phase Equil., 150, 41–49, 1998. [29] N.D. Lu, J.K. Singh, and D.A. Kofke, “Appropriate methods to combine forward and reverse free energy perturbation averages,” J. Chem. Phys., 118, 2977–2984, 2003. [30] J.J. de Pablo, Q.L. Yan, and F.A. Escobedo, “Simulation of phase transitions in fluids,” Ann. Rev. Phys. Chem., 50, 377–411, 1999. [31] A.D. Bruce and N.B. Wilding, “Computational strategies for mapping equilibrium phase diagrams,” Adv. Chem. Phys., 127, 1–64, 2003. [32] Z.L. Zhang, M.A. Horsch, M.H. Lamm, and S.C. Glotzer, “Tethered nano building blocks: Towards a conceptual framework for nanoparticle self-assembly,” Nano Lett., 3, 1341–1346, 2003. [33] R.D. Groot and P.B. Warren, “Dissipative particle dynamics: bridging the gap between atomistic and mesoscopic simulation,” J. Chem. Phys., 107, 4423–4435, 1997. [34] P.A. Monson and D.A. Kofke, “Solid–fluid equilibrium: insights from simple molecular models,” Adv. Chem. Phys., 115, 113–179, 2000. [35] M.P. Allen, G.T. Evans, D. Frenkel, and B.M. Mulder, “Hard convex body fluids,” Adv. Chem. Phys., 86, 1–166, 1993. [36] D.A. Kofke, “Semigrand canonical Monte Carlo simulation; Integration along coexistence lines,” Adv. Chem. Phys., 105, 405–441, 1999. [37] A.Z. Panagiotopoulos, “Direct determination of phase coexistence properties of fluids by Monte Carlo simulation in a new ensemble,” Mol. Phys., 61, 813–826, 1987. [38] A.Z. 
Panagiotopoulos, “Direct determination of fluid phase equilibria by simulation in the Gibbs ensemble: a review,” Mol. Sim., 9, 1–23, 1992. [39] P. Tilwani, “Direct simulation of phase coexistence in solids using the Gibbs ensemble: Configuration annealing Monte Carlo,” M.S. Thesis, Colorado School of Mines, Golden, Colorado, 1999. [40] D.A. Kofke, “Direct evaluation of phase coexistence by molecular simulation through integration along the saturation line,” J. Chem. Phys., 98, 4149–4162, 1993. [41] J. Henning, and D.A. Kofke, “Thermodynamic integration along coexistence lines,” In: P.B. Balbuena and J. Seminario (eds.), Molecular Dynamics, Amsterdam: Elsevier, 1999. [42] S.P. Pandit and D.A. Kofke, “Evaluation of a locus of azeotropes by molecular simulation,” AIChE J., 45, 2237–2244, 1999. [43] F.A. Escobedo, “Novel pseudoensembles for simulation of multicomponent phase equilibria,” J. Chem. Phys., 108, 8761–8772, 1998. [44] F.A. Escobedo, “Tracing coexistence lines in multicomponent fluid mixtures by molecular simulation,” J. Chem. Phys., 110, 11999–12010, 1999. [45] F.A. Escobedo, “Molecular and macroscopic modeling of phase separation,” AIChE J., 46, 2086–2096, 2000a. [46] F. A. Escobedo, “Simulation and extrapolation of coexistence properties with singlephase and two-phase ensembles,” J. Chem. Phys., 113, 8444–8456, 2000b. [47] F.A. Escobedo and Z. Chen, “Simulation of isoenthalps and Joule–Thomson inversion curves of pure fluids and mixtures,” Mol. Sim., 26, 395–416, 2001.
[48] Z.W. Salsburg, J.D. Jacobson, W. Fickett, and W.W. Wood, “Application of the Monte Carlo method to the lattice-gas model. I.Two-dimensional triangular lattice,” J. Chem. Phys., 30, 65–72, 1959. [49] I.R. McDonald and K. Singer, “Calculation of thermodynamic properties of liquid argon from Lennard-Jones parameters by a Monte Carlo method,” Discuss. Faraday Soc., 43, 40–49, 1967. [50] P.G. Bolhuis, D. Chandler, C. Dellago, and P.L. Geissler, “Transition path sampling: throwing ropes over rough mountain passes, in the dark,” Ann. Rev. Phys. Chem., 53, 291–318, 2002.
2.15 FREE-ENERGY CALCULATION USING NONEQUILIBRIUM SIMULATIONS
Maurice de Koning¹ and William P. Reinhardt²
¹ University of São Paulo, São Paulo, Brazil
² University of Washington, Seattle, Washington, USA
1. Introduction
Stimulated by the progress of computer technology over the past decades, the field of computer simulation has evolved into a mature branch of modern scientific investigation. It has had a profound impact in many areas of research including condensed-matter physics, chemistry, materials and polymer science, as well as in biophysics and biochemistry. Many problems of interest in all of these areas involve complex many-body systems and analytical solutions are generally not available. In this light, atomistic simulations play a particularly important role, giving detailed insight into the fundamental microscopic processes that control the behavior of complex systems at the macroscopic level. They are key and effective tools for making ab initio predictions, interpreting complex experimental data, and conducting computational “experiments” that are difficult or impossible to realize in a laboratory. In this article, we will discuss one of the most fundamental and difficult applications of atomistic simulation techniques such as Monte Carlo (MC) [1] and molecular dynamics (MD) [2, 3]: the determination of those thermodynamic properties that require determination of the entropy. The entropy, the chemical potential, and the various free energies are examples of thermal thermodynamic properties. In contrast to their mechanical counterparts, such as the enthalpy, thermal quantities cannot be computed as simple time, or ensemble, averages of functions of the dynamical variables of the system and, therefore, are not directly accessible in MC or MD simulations. Yet, the free energies are often the most fundamental of all thermodynamic functions. Under appropriate constraints they control chemical and phase equilibria, and transition state estimates of the rates of chemical reactions.
S. Yip (ed.), Handbook of Materials Modeling, 707–727. © 2005 Springer. Printed in the Netherlands.
Examples of applications
range from determination of the influence of crystal defects on the mechanical properties of materials, to the mechanisms of protein folding. The development of efficient and accurate techniques for their calculation has therefore attracted considerable attention during the past fifteen years, and is still a very active field of research [4]. As detailed in the previous chapter [4], the evaluation of free energies (or, more specifically free-energy differences) requires simulations that collect data along a sequence of states on a thermodynamic path linking two equilibrium states. If the system is at equilibrium at every point along such a path, the simulated process is quasistatic and reversible, and standard thermodynamic results may be used to interpret collected data and to estimate the free-energy difference between the initial and final equilibrium states. The present chapter generalizes this approach to the case where data is collected during nonequilibrium, and thus irreversible, processes. Several important themes will emerge, making clear why this generalization is of interest, and how nonequilibrium calculations may be set up to provide both upper and lower bounds (and thus systematic in addition to statistical error estimates) to the desired thermal quantities. Additionally, the irreversible process may be optimized in a variational sense so as to improve such bounds. The statistical–mechanical theory of nonequilibrium systems within the regime of linear response will prove particularly helpful in this endeavor. Finally, newly developed re-averaging techniques have appeared that, in some cases, allow quite precise estimates of equilibrium thermal quantities directly from nonequilibrium data. The combination of such techniques with near-optimal paths can give well converged results from relatively short computations. 
In the illustrations that follow, for the sake of conciseness, we will limit ourselves to the application of nonequilibrium methods within the realm of the classical canonical ensemble. For this representative case the relevant thermodynamic variables are the number of particles $N$, the volume $V$, and the temperature $T$; and the appropriate free energy is the Helmholtz free energy, $A(N, V, T) = E(N, V, T) - T\,S(N, V, T)$, $E$ and $S$ being the internal energy and entropy, respectively. However, appropriate generalizations of nonequilibrium methods to other classical ensembles, as well as to quantum systems, are readily available.
2. Equilibrium Free-Energy Simulations
The calculation of thermodynamic quantities by means of atomistic simulation is rooted in the framework of equilibrium statistical mechanics [5], which provides the link between the microscopic details of a system and its macroscopic thermodynamic properties. Let us consider a system consisting
of $N$ classical particles with masses $m_i$. A microscopic configuration of the system is fully specified by the set of $N$ particle momenta $\{\mathbf{p}_i\}$ and positions $\{\mathbf{r}_i\}$, and its energy is described in terms of a potential-energy function $U(\{\mathbf{r}_i\})$. Statistical mechanics in the canonical ensemble then tells us that the distribution of the particle positions and momenta is given by
$$\rho(\Gamma) = \frac{1}{Z(N, V, T)} \exp[-\beta H(\Gamma)], \tag{1}$$
where $\Gamma \equiv (\{\mathbf{p}\}, \{\mathbf{r}\})$ denotes a microstate of the system, $\beta = 1/k_B T$ (with $k_B$ Boltzmann’s constant) and $H(\Gamma)$ is the classical Hamiltonian. The denominator in Eq. (1) is referred to as the canonical partition function, defined as
$$Z(N, V, T) = \int d\Gamma\, \exp[-\beta H(\Gamma)], \tag{2}$$
and guarantees proper normalization of the distribution function. The mechanical thermodynamic properties such as the internal energy, enthalpy and pressure, can be expressed as ensemble averages over the distribution function $\rho(\Gamma)$. Here, the attribute “mechanical” means that the quantity of interest, $X$, is associated with a specific function $X = X(\Gamma)$ of the microstate, $\Gamma$, of the system and can be written as
$$\langle X \rangle = \int d\Gamma\, \rho(\Gamma)\, X(\Gamma). \tag{3}$$
Standard atomistic simulation techniques such as Metropolis MC [1] and MD [2, 3] provide powerful algorithms for generating sequences of microstates $(\Gamma_1, \Gamma_2, \ldots, \Gamma_M)$ that are distributed according to the particular statistical–mechanical (e.g., canonical) distribution function of interest. In this manner, the average implied by Eq. (3) is easily estimated by averaging the function $X(\Gamma)$ over a sequence, $\Gamma_j$, of microstates generated using MC or MD simulation,
$$\langle X \rangle = \lim_{M \to \infty} \frac{1}{M} \sum_{j=1}^{M} X(\Gamma_j). \tag{4}$$
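As a hedged illustration of Eq. (4), the sketch below estimates a mechanical average with a minimal Metropolis sampler. The system is an assumption made for the example (a single coordinate in a harmonic well, $U(x) = \tfrac{1}{2} k x^2$, for which $\langle U \rangle = 1/(2\beta)$ is known exactly), and the function name, step size, and seed are hypothetical choices. Note that the partition function never appears: only energy differences enter the acceptance test.

```python
import math
import random


def metropolis_average(observable, energy, beta, n_steps, step=2.0, x0=0.0, seed=1):
    """Estimate <observable> in the canonical ensemble by Metropolis MC.

    Generates a sequence of microstates x_1 ... x_M and returns the plain
    arithmetic mean of observable(x_j), i.e. the finite-M form of Eq. (4).
    """
    rng = random.Random(seed)
    x, u = x0, energy(x0)
    total = 0.0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)      # symmetric trial displacement
        u_new = energy(x_new)
        # accept with probability min(1, exp(-beta * (u_new - u)))
        if u_new <= u or rng.random() < math.exp(-beta * (u_new - u)):
            x, u = x_new, u_new
        total += observable(x)                     # rejected moves recount the old state
    return total / n_steps


k, beta = 1.0, 1.0


def potential(x):
    return 0.5 * k * x * x


mean_u = metropolis_average(potential, potential, beta, n_steps=400_000)
print(f"MC estimate of <U>: {mean_u:.4f}   exact: {0.5 / beta:.4f}")
```

The same sampler cannot return the free energy this way, because $A$ is not the average of any function of the microstate; that is the difficulty taken up in the text that follows.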
Although the partition function $Z$ itself is not known, this does not present a problem when one is interested in any of the mechanical properties of the system; since $Z$ is implicit in the generation of the sequence of microstates, $\Gamma_j$, it is not needed to perform the ensemble average of Eq. (3). The calculation of thermal quantities is not so straightforward, however. For example, the Helmholtz free energy
$$A(N, V, T) = -\frac{1}{\beta} \ln Z(N, V, T) = -\frac{1}{\beta} \ln \left\{ \int d\Gamma\, \exp[-\beta H(\Gamma)] \right\}, \tag{5}$$
is seen to be an explicit function of the partition function $Z$ rather than an average of the type shown in Eq. (3). Therefore, as $Z$ is not directly accessible in an MC or MD simulation, indirect strategies must be used. The most widely adopted strategy is to construct a real or artificial thermodynamic path that consists of a continuous sequence of equilibrium states linking two states of interest of the system and then attempt to calculate the free-energy difference between them. Should the free energy of one of these states be exactly known, the free energy of the other may then be put on an absolute basis. This approach provides the basis for the common thermodynamic integration (TI) method. Usually TI relies on the definition of a thermodynamic path in the space of system Hamiltonians. Typically, this involves the construction of an “artificial” Hamiltonian $H(\Gamma, \lambda)$, which, aside from the usual dependence on the microstate $\Gamma$, is also a function of some generalized coordinate or switching parameter $\lambda$. This generalized Hamiltonian is then constructed in such a way that it leads to a continuous transformation from the Hamiltonian of a system of interest to that of a reference system of which the free energy is known beforehand. Within the canonical ensemble, the Helmholtz free-energy difference between the initial and final states of the path, characterized by the switching coordinate values $\lambda_1$ and $\lambda_2$, respectively, is then given by
$$\Delta A \equiv A(\lambda_2; N, V, T) - A(\lambda_1; N, V, T) = \int_{\lambda_1}^{\lambda_2} d\lambda' \left( \frac{\partial A(\lambda; N, V, T)}{\partial \lambda} \right)_{\lambda'} = \int_{\lambda_1}^{\lambda_2} d\lambda' \left\langle \frac{\partial H(\Gamma, \lambda)}{\partial \lambda} \right\rangle_{\lambda'} \equiv W_{\mathrm{rev}}, \tag{6}$$
where $A(\lambda; N, V, T)$ is the Helmholtz free energy of the system as a function of the switching coordinate $\lambda$ for fixed $N$, $V$, and $T$, and the brackets in the second integral denote an average evaluated for the canonical ensemble associated with the generalized coordinate value $\lambda = \lambda'$. From a thermodynamic standpoint, Eq. (6) may be interpreted in the following way. The free-energy difference between the initial and final states is equal to the reversible work $W_{\mathrm{rev}}$ done by the generalized thermodynamic driving force $\partial H(\Gamma, \lambda)/\partial \lambda$ along a quasistatic, or reversible, process connecting the two states. By quasistatic we mean that the process is carried out so slowly that the system remains in equilibrium at all times and the instantaneous driving force is equal to the associated equilibrium ensemble average. In this way, the TI method represents a numerical discretization of the quasistatic process; $W_{\mathrm{rev}}$ is estimated by computing the equilibrium ensemble averages of the driving force on a grid of $\lambda$-values on the interval $[\lambda_1, \lambda_2]$, after which the integration is carried out using standard numerical techniques. For further details of the TI method and its applications we refer to the chapter by Kofke and Frenkel [4].
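The TI recipe of Eq. (6) (equilibrium averages of the driving force on a grid of $\lambda$-values, followed by numerical quadrature) can be sketched for a case with a known answer. Everything specific here is an assumption chosen for illustration: a single coordinate with $H(x, \lambda) = \tfrac{1}{2} k(\lambda) x^2$ and $k(\lambda) = (1 - \lambda) k_1 + \lambda k_2$, for which $\Delta A = (1/2\beta) \ln(k_2/k_1)$. The equilibrium "simulation" at each grid point is replaced by exact Gaussian sampling, where a real application would run an independent MC or MD simulation.

```python
import numpy as np


def ti_free_energy(k1, k2, beta=1.0, n_lambda=21, n_samples=20_000, seed=0):
    """Thermodynamic integration for H(x, lam) = 0.5 * k(lam) * x**2.

    Returns the trapezoid-rule estimate of the integral in Eq. (6).
    """
    rng = np.random.default_rng(seed)
    lam = np.linspace(0.0, 1.0, n_lambda)
    mean_force = np.empty(n_lambda)
    for i, l in enumerate(lam):
        k = (1.0 - l) * k1 + l * k2
        # Stand-in for an equilibrium run at this lambda: exact canonical samples.
        x = rng.normal(0.0, 1.0 / np.sqrt(beta * k), size=n_samples)
        # Driving force dH/dlambda = 0.5 * (k2 - k1) * x**2, averaged at fixed lambda.
        mean_force[i] = np.mean(0.5 * (k2 - k1) * x**2)
    # Trapezoid rule over the lambda grid.
    return float(np.sum(0.5 * (mean_force[1:] + mean_force[:-1]) * np.diff(lam)))


k1, k2, beta = 1.0, 4.0, 1.0
delta_a_ti = ti_free_energy(k1, k2, beta)
delta_a_exact = 0.5 / beta * np.log(k2 / k1)
print(f"TI estimate: {delta_a_ti:.4f}   exact: {delta_a_exact:.4f}")
```

The two error sources in the estimate are the statistical noise in each grid-point average and the discreteness of the $\lambda$ mesh; both can be reduced independently, at the cost of longer or more numerous equilibrium runs.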
3. Nonequilibrium Free-Energy Estimation
3.1. Establishing Free-Energy Bounds: Systematic and Statistical Errors
Nonequilibrium free-energy estimation is an alternative approach to measuring the reversible work $W_{\mathrm{rev}}$. Instead of discretizing the quasistatic process in terms of a sequence of independent equilibrium states, the reversible work is estimated by means of a single, dynamical sequence of nonequilibrium states, explored along an out-of-equilibrium simulation. This is achieved by introducing an explicit “time-dependent” element into the originally static sequence of states by making $\lambda = \lambda(t)$ an explicit function of the simulation “time” $t$. Here we have used the quotes to emphasize that $t$ should not always be interpreted as a real physical time. For instance, in contrast to MD simulations, typical displacement MC simulations do not involve a natural time scale, in which case $t$ is simply an index variable that orders the sequence of sampling operations, measured in simulation steps. Suppose we choose $\lambda(t)$ such that $\lambda(0) = \lambda_1$ and $\lambda(t_{\mathrm{sim}}) = \lambda_2$, so that $\lambda$ varies between $\lambda_1$ and $\lambda_2$ in a time $t_{\mathrm{sim}}$. Accordingly, the Hamiltonian $H(\Gamma, \lambda) = H(\Gamma, \lambda(t))$ also becomes a function of $t$, and is driven from the initial system $H_1$ to the final system $H_2$ in the same time. The irreversible work $W_{\mathrm{irr}}$ done by the driving force along this switching process, defined as
$$W_{\mathrm{irr}} = \int_0^{t_{\mathrm{sim}}} dt' \left( \frac{d\lambda}{dt} \right)_{t'} \left( \frac{\partial H}{\partial \lambda} \right)_{\lambda(t')}, \tag{7}$$
provides an estimator for the reversible work $W_{\mathrm{rev}}$ done along the corresponding quasistatic process. The point of this nonequilibrium procedure is that values of $W_{\mathrm{irr}}$ can be found, in principle, from a single simulation, because the integration in Eq. (7) involves instantaneous values of the function $\partial H/\partial \lambda$ rather than ensemble averages. If efficient, this would be much less costly than the TI procedure in Eq. (6), which requires a series of independent equilibrium simulations. But there is, of course, a trade-off. While the TI method is inherently “exact” in that the errors are associated only with statistical sampling and the discreteness of the mesh used for the numerical integration, the irreversible work procedure provides a biased estimator for $W_{\mathrm{rev}}$. That is, aside from statistical errors arising from different choices of initial configurations for calculation of Eq. (7), the irreversible estimator $W_{\mathrm{irr}}$ is subject to a systematic error $\Delta E_{\mathrm{syst}}$. Both types of error are due to the inherently irreversible nature of the nonequilibrium process. The statistical errors originate from the fact that, for a fixed and finite simulation time $t_{\mathrm{sim}}$, the value of the integral in Eq. (7) depends on the initial
conditions of the nonequilibrium process. In other words, for different initial conditions, $\Gamma_j(t = 0)$, and a finite simulation time $t_{\mathrm{sim}}$, the value of $W_{\mathrm{irr}}$ in Eq. (7) is not unique. Instead, it is a stochastic quantity characterized by a distribution function with a finite variance, giving rise to statistical errors of the sort arising in any MC or MD simulation. The systematic error manifests itself in terms of a shift of the mean of the irreversible work distribution with respect to the value of the ideal quasistatic work $W_{\mathrm{rev}}$. This shift is caused by the dissipative entropy production characteristic of irreversible processes [6]. Because the entropy always increases, the systematic error $\Delta E_{\mathrm{diss}}$ is always positive, regardless of the sign of the reversible work $W_{\mathrm{rev}}$. In this way, the average value $\overline{W}_{\mathrm{irr}}$ of many measurements of the irreversible work will yield an upper bound to the reversible work $W_{\mathrm{rev}}$, provided the average is taken over an ensemble of equilibrated initial conditions $\Gamma_j(t = 0)$ at the starting point, $t = 0$. The importance of satisfying the latter condition was demonstrated by Hunter et al. [7]. From a purely thermodynamic point of view, the bounding error is simply a consequence of the Helmholtz inequality. Starting from an equilibrium initial state, for instance at $\lambda = \lambda_1$, the irreversible work upon driving the system to $\lambda = \lambda_2$ is always an upper bound to the actual free-energy change between the equilibrium states of initial and final systems, i.e.,
$$\overline{W}_{\mathrm{irr}} \geq \Delta A = A(\lambda_2; N, V, T) - A(\lambda_1; N, V, T). \tag{8}$$
Only in the limit of an ideally quasistatic, or reversible, process, represented by the tsim → ∞ limit, does the inequality in Eq. (8) become the equality Wrev = ΔA, as also manifested in Eq. (6). The preceding ideas are illustrated conceptually in Fig. 1(a) and (b), which show typical distribution functions of irreversible work measurements starting from an ensemble of equilibrated initial conditions. Figure 1(a) compares the results that might be obtained for irreversible work measurements for two different finite simulation times, tsim = t1 and tsim = t2 with t2 > t1, to the ideally reversible tsim → ∞ limit. Both finite-time results show distribution functions with a finite variance whose mean values have been shifted with respect to the reversible work value by a positive systematic error. Both the variance and the systematic error for tsim = t1 are larger than the corresponding values for tsim = t2, given that the latter process proceeds more slowly, leading to smaller irreversibility. Figure 1(b) shows the irreversible work estimators obtained for the reversible work associated with a quasistatic process in which system 1 is transformed into system 2, as obtained in the forward (1 → 2) and backward (2 → 1) directions using the same simulation time tsim. Given that the systematic error is always positive, the forward and backward processes provide upper and lower bounds to the reversible work value, respectively. However, in general, the systematic and statistical errors need not be equal for both directions.
Figure 1. Conceptual illustration of typical irreversible work distributions obtained from nonequilibrium simulations, plotted as functions of Wirr. (a) compares the results that might be obtained for irreversible work measurements for two different finite simulation times, tsim = t1 and tsim = t2 with t2 > t1, to the ideally reversible tsim → ∞ limit at Wrev; the shifts of the mean values are ΔEdiss(t1) and ΔEdiss(t2). (b) shows the irreversible work estimators obtained for the reversible work Wrev (1 → 2) associated with a quasistatic process in which system 1 is transformed into system 2, as obtained in the forward (1 → 2) and backward (2 → 1) directions using the same simulation time tsim.
3.2. Optimizing Free-Energy Bounds: Insight from Nonequilibrium Statistical Mechanics
A natural question that arises from the discussion in the previous section is how one might tune the nonequilibrium process so as to minimize the systematic and statistical errors associated with the irreversibility, for given initial and final equilibrium states and a given simulation time tsim. To answer this question, it is useful to investigate the microscopic origin of entropy production in nonequilibrium processes. For this purpose, it is particularly helpful to consider the class of close-to-equilibrium processes for which the instantaneous distribution functions of nonequilibrium states do not deviate too much from the ideally quasistatic equilibrium distribution functions and where the theory of linear response [5] is appropriate. As we will see later on, it is not too difficult to reach this condition in practical situations. As described by Onsager’s regression hypothesis [5], when a nonequilibrium state is not too far from equilibrium, the relaxation of any mechanical property can be described in terms of the proper equilibrium autocorrelation function. In other words, the hypothesis states that the relaxation of a nonequilibrium disturbance is governed by the same laws as the regression of spontaneous microscopic fluctuations in an equilibrium system.
Under the assumption of proximity to equilibrium, one can then derive the following expression for the mean dissipated energy, i.e., the systematic error ΔEdiss(tsim), for a series of irreversible work measurements obtained from nonequilibrium simulations of duration tsim [8–10]:

\[
\Delta E_{\mathrm{diss}}(t_{\mathrm{sim}}) = \frac{1}{k_B T} \int_0^{t_{\mathrm{sim}}} dt' \left(\frac{d\lambda}{dt}\right)^2_{t'} \tau[\lambda(t')] \,\mathrm{var}\!\left(\frac{\partial H}{\partial \lambda}\right)_{\lambda(t')}. \tag{9}
\]
Aside from the switching rate, the integrand in Eq. (9) contains both the correlation time and the equilibrium variance of the driving force ∂H/∂λ. These two factors describe, respectively, how quickly the fluctuations in the driving force decay and how large these fluctuations are in the equilibrium state. It is clear that the integral is positive-definite, as it must be. Moreover, it indicates that, for near-equilibrium processes, the systematic error should be the same for forward and backward processes. This means that, in the linear-response regime, one can obtain an unbiased estimator for the reversible work Wrev by combining the results obtained from forward and backward processes. More specifically, in this regime we have

\[
\overline{W}_{\mathrm{irr}}(1 \to 2) = W_{\mathrm{rev}}(1 \to 2) + \Delta E_{\mathrm{diss}}, \tag{10}
\]

and

\[
\overline{W}_{\mathrm{irr}}(2 \to 1) = -W_{\mathrm{rev}}(1 \to 2) + \Delta E_{\mathrm{diss}}, \tag{11}
\]

leading to the unbiased estimator (i.e., subject to statistical fluctuations only)

\[
W_{\mathrm{rev}}(1 \to 2) = \tfrac{1}{2}\left[\overline{W}_{\mathrm{irr}}(1 \to 2) - \overline{W}_{\mathrm{irr}}(2 \to 1)\right]. \tag{12}
\]
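As a minimal sketch of how the forward and backward results may be combined in practice, the following Python function returns the lower bound, the combined estimate of Eq. (12), the upper bound, and the dissipation implied by adding Eqs. (10) and (11). The function name and the sample work values are hypothetical and serve only as an illustration; the combined estimate is unbiased only to the extent that the linear-response assumption holds.

```python
import numpy as np

def bidirectional_estimate(w_forward, w_backward):
    """Bounds and linear-response estimate of W_rev(1 -> 2).

    w_forward  : irreversible work values measured along 1 -> 2
    w_backward : irreversible work values measured along 2 -> 1
    """
    w_f = np.mean(w_forward)     # mean W_irr(1 -> 2): upper bound to W_rev(1 -> 2)
    w_b = np.mean(w_backward)    # mean W_irr(2 -> 1)
    lower = -w_b                 # sign flip refers the backward work to 1 -> 2
    upper = w_f
    w_rev = 0.5 * (w_f - w_b)    # Eq. (12)
    e_diss = 0.5 * (w_f + w_b)   # half the sum of Eqs. (10) and (11)
    return lower, w_rev, upper, e_diss

# Hypothetical data constructed with W_rev = -4.0 and a dissipation of 0.5
# in both directions.
lower, w_rev, upper, e_diss = bidirectional_estimate([-3.6, -3.4], [4.4, 4.6])
print(lower, w_rev, upper, e_diss)
```

If the forward and backward dissipations differ, as they generally do outside the linear-response regime, the combined value retains a bias equal to half their difference.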
Concerning minimization of dissipation, Eq. (9) tells us that one should attempt to reduce both the magnitude of the fluctuations in the driving force and the associated correlation times. This involves both a static component, i.e., the magnitude of the equilibrium fluctuations, and a dynamic one, namely the typical decay time of equilibrium correlations. This shows that not only the choice of the path, H(λ), but also the simulation algorithm by which the system is propagated in “time” (i.e., MC or MD simulation) will affect the dissipation in the irreversible work measurements. Whereas the magnitude of the equilibrium fluctuations should be algorithm-independent (as long as the algorithms sample the same equilibrium distribution function), the correlation time is certainly algorithm-dependent. In the case of displacement MC simulation, as we will see below, the choice of the maximum displacement parameter affects the correlation time τ and, consequently, the magnitude of the dissipation.
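The static and dynamic ingredients of Eq. (9) can both be estimated from a time series of the driving force ∂H/∂λ recorded in an equilibrium simulation at fixed λ. The sketch below is one possible way of doing so and rests on assumptions that are not taken from the text: the correlation time is defined as the sum of the normalized autocorrelation function over lags (in units of the sampling interval), and the sum is truncated at the first non-positive value, a common practical convention. The function name and the synthetic test series (an AR(1) process with known variance and correlation time) are hypothetical.

```python
import numpy as np

def variance_and_correlation_time(series):
    """Equilibrium variance and integrated correlation time of a time series."""
    f = np.asarray(series, dtype=float)
    df = f - f.mean()
    var = df.var()
    n = len(df)
    tau = 0.0
    for lag in range(n):
        # Normalized autocorrelation function at this lag.
        c = np.dot(df[:n - lag], df[lag:]) / ((n - lag) * var)
        if c <= 0.0:          # truncate once the signal is lost in the noise
            break
        tau += c
    return var, tau

# Synthetic check: AR(1) series f[t] = phi * f[t-1] + noise, for which the
# variance is 1 / (1 - phi^2) and the summed autocorrelation is 1 / (1 - phi).
rng = np.random.default_rng(0)
phi, n = 0.9, 100_000
noise = rng.normal(size=n)
series = np.empty(n)
series[0] = noise[0]
for t in range(1, n):
    series[t] = phi * series[t - 1] + noise[t]
var, tau = variance_and_correlation_time(series)
print(var, tau)
```

Repeating such a measurement for a set of λ values gives tabulated functions var(∂H/∂λ) and τ(λ) of the kind that enter the integrand of Eq. (9).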
Finally, let us now assume that we have a prescribed path H(λ) and a simulation algorithm to sample the nonequilibrium process between the systems H(λ1) and H(λ2). How do we now choose the functional form of the time-dependent switching function λ(t) to minimize the dissipation? Equation (9) provides us with an explicit answer. To see this, we first perform a change of integration variable, setting x = t'/tsim, obtaining

\[
\Delta E_{\mathrm{diss}}(t_{\mathrm{sim}}) = \frac{1}{t_{\mathrm{sim}}}\, \Delta E_{\mathrm{diss}}[\lambda(x)], \tag{13}
\]

with

\[
\Delta E_{\mathrm{diss}}[\lambda(x)] = \frac{1}{k_B T} \int_0^1 dx' \left(\frac{d\lambda}{dx}\right)^2_{x'} \tau(\lambda(x')) \,\mathrm{var}\!\left(\frac{\partial H}{\partial \lambda}\right)_{\lambda(x')}. \tag{14}
\]
Equation (14) is a functional of the common form [11]

\[
S[\lambda(x)] = \int_0^1 dx\, F(\lambda'(x), \lambda(x), x). \tag{15}
\]

The minimization of the dissipation is thus equivalent to finding the function λ(x) that minimizes a functional of the type (15) subject to the boundary conditions λ(0) = λ1 and λ(1) = λ2. Standard variational calculus then shows that the solution is obtained by solving the Euler–Lagrange equation [11] associated with the functional,

\[
\frac{d}{dx} \frac{\partial F}{\partial \lambda'} = \frac{\partial F}{\partial \lambda}, \tag{16}
\]

subject to the mentioned boundary conditions.
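For the specific integrand of Eq. (14), F = (λ')² g(λ) with g(λ) = τ(λ) var(∂H/∂λ)/kB T, the integrand has no explicit dependence on x, so that Eq. (16) has the first integral (λ')² g(λ) = constant. The optimal function then follows by quadrature, x(λ) = ∫ √g dλ (from λ1 to λ) divided by the same integral taken from λ1 to λ2. The Python sketch below implements this quadrature for tabulated g(λ) and compares the value of the functional for the optimal and linear switching functions. It is an illustration under stated assumptions, not the numerical procedure of the original work: the function names are hypothetical, and the test function g(λ) = 1/(1 + λ)² is chosen only because its optimum, λ(x) = 2^x − 1, is known in closed form.

```python
import numpy as np

def optimal_switching(lam_grid, g_grid, n_x=101):
    """Tabulate the lambda(x) for which (dlambda/dx)^2 * g(lambda) is constant.

    lam_grid : increasing lambda values running from lambda_1 to lambda_2
    g_grid   : g(lambda) = tau * var(dH/dlambda) / (k_B T) > 0 on that grid
    """
    lam_grid = np.asarray(lam_grid, dtype=float)
    root_g = np.sqrt(np.asarray(g_grid, dtype=float))
    # Cumulative trapezoidal integral of sqrt(g) from lambda_1 to each grid point.
    steps = 0.5 * (root_g[1:] + root_g[:-1]) * np.diff(lam_grid)
    cumulative = np.concatenate(([0.0], np.cumsum(steps)))
    x_of_lam = cumulative / cumulative[-1]        # x(lambda), running from 0 to 1
    x = np.linspace(0.0, 1.0, n_x)
    lam_of_x = np.interp(x, x_of_lam, lam_grid)   # invert x(lambda) by interpolation
    return x, lam_of_x

def dissipation_functional(x, lam_of_x, g):
    """Midpoint-rule estimate of the integral of (dlambda/dx)^2 * g(lambda)."""
    dlam = np.diff(lam_of_x)
    dx = np.diff(x)
    lam_mid = 0.5 * (lam_of_x[1:] + lam_of_x[:-1])
    return float(np.sum((dlam / dx) ** 2 * g(lam_mid) * dx))

# Hypothetical test case with a known optimum lambda(x) = 2**x - 1.
g = lambda lam: 1.0 / (1.0 + lam) ** 2
lam_grid = np.linspace(0.0, 1.0, 2001)
x, lam_opt = optimal_switching(lam_grid, g(lam_grid))
lam_lin = x.copy()                                # linear switching from 0 to 1
d_opt = dissipation_functional(x, lam_opt, g)
d_lin = dissipation_functional(x, lam_lin, g)
print(d_opt, d_lin)
```

The optimal function slows down where g(λ) is large and speeds up where it is small; by the Cauchy–Schwarz inequality its value of the functional, (∫ √g dλ)², can never exceed that of the linear function, ∫ g dλ.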
4. Applications of Nonequilibrium Free-Energy Estimation
To illustrate the discussion of the previous sections we will now discuss a number of applications of nonequilibrium free-energy estimation, demonstrating the bounding properties of irreversible-work measurements, as well as aspects of dissipation optimization.
4.1. Harmonic Oscillators
In the first application we consider the problem of computing the free-energy difference between two systems consisting of 100 identical, independent,
one-dimensional harmonic oscillators of unit mass with different characteristic frequencies [9]. In particular we will consider the path defined by

\[
H(\lambda) = \frac{1}{2} \sum_{i=1}^{100} \left[(1 - \lambda)\,\omega_1^2 + \lambda\,\omega_2^2\right] x_i^2, \tag{17}
\]
with ω1 = 4 and ω2 = 0.5 at a temperature kB T = 2. Note that we are considering only the potential energy of the oscillators here and have neglected any kinetic energy contributions. We can do this because the free-energy difference between two harmonic oscillators at a fixed temperature is determined only by the configurational part of the partition function. The value of the desired reversible work Wrev per oscillator associated with a quasistatic modification of the frequency from ω1 to ω2 is known analytically:

\[
W_{\mathrm{rev}}(\omega_1 \to \omega_2) = -k_B T \ln \frac{\omega_1}{\omega_2} = -4.15888. \tag{18}
\]

The simulation algorithm we utilize is standard Metropolis displacement MC with a fixed maximum trial displacement xmax = 0.3. First we consider the statistics of the irreversible work measurements as a function of the simulation “time” tsim, which here stands for the number of MC sweeps per process (one sweep corresponds to one trial displacement per oscillator), for a linear switching function. The results are shown as the dashed curves in Fig. 2(a) and (b), in which each data point represents the mean value of Wirr over 50 independent initial conditions. Figure 2(a) shows that the upper and lower bounds do converge toward the reversible value Wrev, although they do so quite slowly. The slow convergence becomes more apparent when we consider the behavior of the combined estimator of Eq. (12) in Fig. 2(b). If the process were sufficiently slow for linear-response theory to be accurate, the combined estimator should be unbiased and show no systematic deviation. It is clear that this is only the case for the slowest process, at tsim = 2.56 × 10^4 MC sweeps. All shorter simulations show a systematic deviation, indicating that the associated processes remain quite far from equilibrium, hampering convergence.

Figure 2. Results of irreversible-work measurements per oscillator as a function of the switching time tsim (in units of 10^4 MC sweeps) for the linear (dashed lines) and optimal (solid lines) switching functions. The analytical reversible work value is also shown (dot-dashed line). (a) shows the upper and lower bounds to the work obtained from the forward (upper bounds) and backward (lower bounds) directions. (b) shows the average of forward and backward results, i.e., the values of the combined estimator of Eq. (12).

Next, we attempt to minimize dissipation in the simulation by using the switching function λ(x) that satisfies the Euler–Lagrange Eq. (16). For this purpose we first measured the equilibrium variance of the driving force and its characteristic correlation decay time as a function of λ from a series of equilibrium simulations (i.e., at fixed λ), after which we numerically solved Eq. (16), subject to the boundary conditions λ(0) = 0 and λ(1) = 1. The equilibrium variances, correlation times and the resulting optimal switching function are shown in Fig. 3(a)–(c), respectively. The results in Fig. 3(a) and (b) indicate that the main contribution to the dissipation originates from the region λ ≈ 1, where both the magnitude and the characteristic decay time of the fluctuations in the driving force increase sharply. The optimal switching function in Fig. 3(c) captures this effect, prescribing a slow switching rate where one should and going faster where one can. The results obtained with this function for the irreversible work measurements are shown as the solid lines in Fig. 2(a) and (b). The improvement compared to the linear switching function is quite significant. Figure 2(b), for instance, shows that for tsim as short as 3.2 × 10^3 MC sweeps, the nonequilibrium process has already reached the linear-response regime.
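The linear-switching measurement described above can be sketched in a few lines of Python. The code below is a minimal illustration under stated assumptions rather than the program used to produce Fig. 2: the function name, the random seeds, the number of sweeps, and the choice of averaging over 5000 oscillators at once (permissible because the oscillators are independent) are all hypothetical. At each step λ is first advanced at fixed configuration, which contributes Δλ ∂H/∂λ to the work, and one Metropolis sweep is then performed at the new λ.

```python
import numpy as np

def switch_work(omega_a, omega_b, n_sweeps, n_osc, kT=2.0, x_max=0.3, seed=0):
    """Irreversible work per oscillator for linear switching omega_a -> omega_b."""
    rng = np.random.default_rng(seed)
    # Equilibrated start: the Boltzmann distribution at omega_a is Gaussian.
    x = rng.normal(0.0, np.sqrt(kT) / omega_a, size=n_osc)
    coeff = 0.5 * (omega_b**2 - omega_a**2)      # dH/dlambda = coeff * x^2 per oscillator
    dlam = 1.0 / n_sweeps
    work = np.zeros(n_osc)
    for step in range(1, n_sweeps + 1):
        # Work: advance lambda at fixed configuration (H is linear in lambda).
        work += dlam * coeff * x**2
        lam = step * dlam
        k = (1.0 - lam) * omega_a**2 + lam * omega_b**2   # current omega^2
        # One Metropolis trial displacement per oscillator at the new lambda.
        x_trial = x + rng.uniform(-x_max, x_max, size=n_osc)
        d_energy = 0.5 * k * (x_trial**2 - x**2)
        accept = rng.random(n_osc) < np.exp(-np.maximum(d_energy, 0.0) / kT)
        x = np.where(accept, x_trial, x)
    return work

w_rev = -2.0 * np.log(4.0 / 0.5)     # Eq. (18) with k_B T = 2
w_fwd = switch_work(4.0, 0.5, n_sweeps=200, n_osc=5000, seed=1).mean()
w_bwd = switch_work(0.5, 4.0, n_sweeps=200, n_osc=5000, seed=2).mean()
print(-w_bwd, w_rev, w_fwd)          # lower bound, analytical value, upper bound
```

For a switching time as short as the one used here, the two bounds are expected to lie far apart; increasing n_sweeps should bring them closer to the analytical value.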
Figure 3. (a) The equilibrium variance var(∂H/∂λ) and (b) the correlation decay time (in MC sweeps) as a function of λ. (c) shows the optimal switching function λ(x), as determined by numerically solving the Euler–Lagrange equation (16), together with the linear switching function.

The above optimization procedure is useful in cases where the thermodynamic path H(λ) is prescribed beforehand. This is the case, for instance, for the reversible-scaling method [12], in which each state along the fixed path H(λ) = λV (V is the interatomic interaction potential) represents the physical system of interest in a different temperature state. In this manner, a single irreversible-work simulation along the scaling path provides a continuous series of estimators of the system’s free energy on a finite temperature interval. If one has some information about the behavior of the magnitude and the correlation-decay times of the fluctuations of the driving force, one may use the variational method described above to optimize the switching function and minimize dissipation effects.
4.2. Compression of Confined Lennard–Jones Particles
In the following application we consider a system consisting of 30 Lennard–Jones particles, constrained to move on the x-axis only. In addition, the particles are subject to an external field whose strength is controlled by an external parameter L. More specifically, we consider the path

\[
H(L) = \sum_{i<j} \left[\left(\frac{\sigma}{x_{ij}}\right)^{12} - \left(\frac{\sigma}{x_{ij}}\right)^{6}\right] + \sum_{i} \left(\frac{2 x_i}{L}\right)^{26}, \tag{19}
\]
where xi describes the position of particle i on the x-axis and xij ≡ |xi − xj| is the distance between particles i and j. The second term in Eq. (19) is the external field, which is a very steeply rising potential and has the effect of confining the particles through very strong interactions with the first and last particles, effectively causing the 30 particles to lie approximately evenly spaced between x = ±L/2. Now consider the compression process wherein L changes from L0 = 30σ to L1 = 26σ, forcing the line of particles to undergo a one-dimensional compression. As in the previous example, we will attempt to compute the reversible work associated with this process by measuring the irreversible work Wirr for both process directions. Once again we utilize the Metropolis MC algorithm, but instead of fixing the algorithm parameter xmax, describing the maximum trial displacement, we now consider the effects of changing the sampling algorithm on the convergence of the upper and lower bounds. Although the variance of the driving force var(∂H/∂λ) will not be affected, the correlation time will certainly depend on the choice of xmax. This is illustrated in Fig. 4, which shows the convergence of the upper and lower bounds to the reversible work as obtained for three different values of xmax at a temperature kB T = 0.35: xmax = 0.6σ, 0.1σ, and 0.04σ, respectively. Effectively, the variation of this algorithm parameter may be thought of as changing the strength of the coupling between the MC “thermostat” and the system of particles. We utilized the linear switching function, which varies L linearly between L0 and L1 in tsim MC sweeps (each sweep consisting of 30 MC single-particle trial moves). Each data point and corresponding error bar (±1 standard deviation) were obtained from a set of 21 irreversible work measurements initiated from independent, equilibrated initial conditions.

Figure 4. Results of forward (upper bound) and backward (lower bound) irreversible-work measurements (in units of ε) as a function of the switching time tsim for the linear switching function for three different values of the MC algorithm parameter xmax.

It is also useful to note that it is not necessary to explicitly compute the work Wirr by using Eq. (7). All that is needed, through the first law of thermodynamics, which applies equally to reversible and irreversible processes, is to calculate the work as Wirr = ΔE − Q, where ΔE is the difference in internal energies of the system between the first and last switching steps, and Q is the heat accumulated during the switching process. This heat, Q, is simply the sum of energies added to, or subtracted from, the system as MC configurations evolve during a simulation. Given that these energies, εi, are already calculated in determining whether moves for particle i are to be accepted or rejected according to the canonical factor exp(−εi/kB T), no extra programming is needed to calculate Wirr.

It is immediately seen that the strength of the system-thermostat coupling through the algorithm parameter xmax is indeed a variational parameter for the free-energy computations. Accordingly, rather than selecting a pre-set acceptance ratio of trial moves, as is usually done in equilibrium MC simulations, xmax should be determined so as to minimize the difference between the upper and lower bounds to ΔA. The results show that for all three values of xmax, the upper and lower bounds show convergence. Yet, the convergence properties are clearly different for the three parameter values, giving the best results for xmax = 0.1σ and the worst for xmax = 0.04σ, indicating that the correlation decay time for the fluctuations in the driving force is the shortest for the former and the longest for the latter. Nevertheless, the convergence of the bounds is still quite slow, in that hundreds of thousands of MC sweeps are required to obtain convergence to within a few percent. This is a consequence of the strong interactions between the particles, as their hard cores interact during the compression from the “ends” of the line of particles, and such hard-core density gradients are typically slow to work themselves out through single-particle MC moves. In contrast to the simple harmonic oscillator problem discussed in the previous section, this problem will be ubiquitous in most atomic and molecular systems in the condensed phase, seemingly rendering free-energy computations on realistic systems of interest problematic. The questions that now arise are whether we can estimate the systematic errors ΔEdiss from data already in hand and use them to improve the estimates of Fig. 4, whether we can optimize the thermodynamic path to reduce dissipation and achieve better behavior at short switching times, or perhaps both.
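The first-law bookkeeping noted above, Wirr = ΔE − Q, can be checked directly in a small simulation. The Python sketch below does so for the harmonic-oscillator path of Eq. (17) rather than for the Lennard–Jones system, purely to keep the example short; the function name, the number of steps and the seed are hypothetical. The work is accumulated in two ways: as the sum of the energy changes produced by the switching steps at fixed configuration, and as the total energy change minus the heat, where the heat is the sum of the energy changes εi of all accepted MC moves.

```python
import numpy as np

def switch_with_heat(n_steps=500, n_osc=100, kT=2.0, x_max=0.3, seed=0):
    """Return (work from the switching steps, work from Delta E - Q) for one run."""
    rng = np.random.default_rng(seed)
    k_start, k_end = 16.0, 0.25      # omega^2 for omega_1 = 4 and omega_2 = 0.5
    energy = lambda k, x: 0.5 * k * np.sum(x**2)
    x = rng.normal(0.0, np.sqrt(kT / k_start), size=n_osc)   # equilibrated start
    e_first = energy(k_start, x)
    work_sum, heat = 0.0, 0.0
    for step in range(1, n_steps + 1):
        k_old = k_start + (k_end - k_start) * (step - 1) / n_steps
        k_new = k_start + (k_end - k_start) * step / n_steps
        # Switching step: the energy change at fixed configuration is work.
        work_sum += energy(k_new, x) - energy(k_old, x)
        # MC sweep: the energy changes of accepted moves are heat.
        for i in range(n_osc):
            trial = x[i] + rng.uniform(-x_max, x_max)
            eps = 0.5 * k_new * (trial**2 - x[i]**2)
            if eps <= 0.0 or rng.random() < np.exp(-eps / kT):
                x[i] = trial
                heat += eps
    e_last = energy(k_end, x)
    return work_sum, (e_last - e_first) - heat

w_direct, w_first_law = switch_with_heat()
print(w_direct, w_first_law)
```

The two numbers agree to within rounding error because every change of the total energy is either a switching step or an accepted MC move, so the sum telescopes.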
4.3. Estimating Equilibrium Work from Nonequilibrium Data
Recently, Jarzynski [13] has generalized the Gibbs–Feynman identity,

\[
\Delta A = A_1 - A_0 = -k_B T \ln \left\langle \exp[-(H_1 - H_0)/k_B T] \right\rangle_0, \tag{20}
\]

where ⟨· · ·⟩0 denotes canonical averaging with respect to configurations generated by H0, and which is the basis of thermodynamic perturbation theory [4], to finite-time processes. Equation (20) is an identity, but in practice it is useful only when the configurations generated by canonical sampling with respect to H0 strongly overlap those generated by H1. For hard-core fluids this would be unusual unless H1 and H0 are quite “close”, resulting in the perturbative use of Eq. (20). Jarzynski now allows H0 to dynamically approach H1 along a path, in analogy with the above discussions. The result, in the context discussed here, suggests that for a given set of M irreversible-work measurements Wi ≡ Wirr(Γi, t = 0), with i = 1, . . . , M, instead of estimating Wirr as the simple arithmetic mean of the Wi, one should calculate the Boltzmann-weighted “Jarzynski” (or “Jz”) average

\[
\overline{W}_{\mathrm{Jz}} = \frac{1}{M} \sum_{i=1}^{M} \exp(-W_i / k_B T), \tag{21}
\]

and then estimate the free-energy change as

\[
\Delta A_{\mathrm{Jz}} \equiv -k_B T \ln \overline{W}_{\mathrm{Jz}}. \tag{22}
\]

In this way bounding is sacrificed, but a more accurate result is not precluded given that, in principle, the Jz-average is unbiased. This approach has been shown to be effective both in the analysis of simulation data and in finite-time polymer extension experiments, which are of course irreversible. An immediate concern, however, is that, although in the limit of complete sampling the Jarzynski results are, as in the Gibbs–Feynman identity, exact in the context of a dissipation-free system, incomplete MC sampling may lead to unsatisfactory results.
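Equations (21) and (22) translate directly into code. The following Python sketch evaluates the Jarzynski estimate from a list of work values; the function name and the sample numbers are hypothetical. The only design choice beyond the formulas is that the largest exponent is factored out before exponentiating, so that the sum does not overflow or underflow when the work values are large compared with kB T.

```python
import numpy as np

def jarzynski_free_energy(work, kT):
    """Delta A_Jz = -kT * ln[(1/M) * sum_i exp(-W_i / kT)], Eqs. (21) and (22)."""
    a = -np.asarray(work, dtype=float) / kT
    a_max = np.max(a)
    # Factor out the largest exponent so that no term overflows.
    log_mean = a_max + np.log(np.mean(np.exp(a - a_max)))
    return -kT * log_mean

work = np.array([70.2, 68.9, 74.5, 69.4])       # hypothetical W_i values
dA_jz = jarzynski_free_energy(work, kT=0.35)
print(np.mean(work), dA_jz)
```

Because the exponential average is dominated by the smallest work values, the estimate never exceeds the arithmetic mean, and with few measurements it is controlled by one or two members of the set, which is the sampling concern raised above.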
Figure 5. Work of compression: results of forward and backward irreversible-work averages (upper and lower bounds to the work, in units of ε) for the 30-particle confined Lennard–Jones system as a function of the switching time tsim (MC sweeps). The results show both the simple arithmetic averages and the Boltzmann-weighted Jarzynski averages.
This is illustrated in Fig. 5, where the data used to generate the bounds to ΔA in Fig. 4 are plotted over a much larger range of switching times tsim and compared to the ΔAJz estimates. Both the simple arithmetic and the Jarzynski averages for both directions were computed over the 21 independent initial conditions. It is evident that, although not giving bounds, the ΔAJz estimates indeed improve on the upper and lower bounds calculated as simple averages. However, the Jarzynski averages become useful only when the convergence of the simple arithmetic averages has reached the order of less than 1 kB T per particle. Thus, although a promising computational asset, the Jarzynski procedure still requires systematic procedures for finding more reversible paths.
4.4. Path Optimization through Scaling and Shifting of Coordinates
As we have seen in the harmonic oscillator and Lennard–Jones problems, the choice of the thermodynamic path and of the switching function is quite crucial to the success of nonequilibrium free-energy estimation. In the case of the harmonic oscillator problem it was relatively straightforward to find a good switching function by explicitly solving the variational problem in Eqs. (15) and (16), which led to an optimized simulation that “spends the right amount of time along each segment” of the already defined path. Here it is important to note that this variational optimization should be carried out over an ensemble-averaged Wirr, being identical for every member of the ensemble, independently of any specific Γi (t = 0). This is the reason why early attempts by Pearlman and Kollman [14] to determine paths “on the fly”, by looking ahead and avoiding strong dissipative collisions in specific configurations, may result in the unintentional introduction of a Maxwell demon [15], violating the second law of thermodynamics, which is of course the fundamental origin of the Helmholtz inequality. Compared to the simple harmonic oscillator problem, the optimization of the nonequilibrium simulation of the confined Lennard–Jones system is significantly more challenging because of the strong interactions between the particles during the compression of the system. Given that this type of interaction is expected to occur in most interesting problems, it is of interest to design thermodynamic paths that are different from the ones in which one simply follows H(λ) as λ runs from an initial to a final value, as we did in the case of the harmonic oscillator problem. We now present two approaches that follow this idea and lead to thermodynamic paths that are significantly more reversible. Both the coordinate scaling [16] and coordinate shifting methods discussed below derive from
the same fundamental thought: is there a (λ-dependent) coordinate system in which all particles are apparently at rest relative to one another during the switching process? In such a coordinate system perhaps all particles will have little difficulty in remaining close to equilibrium during the whole switching process, with only the magnitude of their local fluctuations changing.
4.4.1. Coordinate scaling
Figure 6 illustrates the possibilities of such an approach when applied to the simple problem of compression discussed above. Here, in an admittedly simple example, all particles should be compressed “uniformly,” rather than by the nonuniform compression generated through the interactions of the confining potential with the particles at both ends of the line. This is accomplished by writing the coordinates as s(λ) xi, where s(λ) is a (common) scaling parameter, which may then be variationally optimized. The greatly improved bounds of Fig. 6 indicate that a better path has indeed been found. How does this fit the “at rest” criterion mentioned earlier? If one watches the MC dynamics in the unscaled “xi” coordinates using an optimized s(λ), rather than in the actual physical coordinates, s(λ) xi, it appears that the equilibrium positions xi do not change during the switching, and thus, indeed, the only irreversibility arises from the changes in the RMS fluctuations about the equilibrium positions. It should be noted, however, that, as these scalings may be regarded as a change in the metric that affects the definitions of length and volume, one should include a (calculable) entropic correction to obtain the desired free-energy difference. Recently, there has been a variety of applications of the scaling approach [16–18], including the determination of the absolute free energy of Lennard–Jones clusters and a smooth metric scaling through a first-order solid–solid phase transition, fcc to bcc, with no apparent hysteresis and its resulting irreversibility.
4.4.2. Coordinate shifting
In the applications of metric scaling, thermodynamic paths are often easily determined when a clear symmetry is present. Another approach, namely coordinate shifting, is more useful when such symmetries are absent. As an alternative to writing a moving coordinate using the scaling relation s(λ) xi, one can take xi = xifluct + xiref (λ). Here each particle moves in a concerted fashion along a λ-dependent reference path, chosen by symmetry, or by methods such as simulated annealing, to avoid strong hard-core interactions or other
likely causes of irreversibility. As λ evolves, only the fluctuation coordinates xifluct are subject to MC variations: should the physical environment of each particle remain at least roughly constant, one may hope that the fluctuations about the xiref (λ) do not depend strongly on λ. To the extent that this is the case, the fluctuation coordinates are always at equilibrium, and thus the path is reversible! Figure 7 illustrates the efficacy of this method for the linear compression problem. As opposed to coordinate scaling, coordinate shifting does not change the metric, dispensing with the need for entropic corrections and paving the way for applications involving inhomogeneous systems, where the possible absence of symmetries obscures the choice of an appropriate metric and complicates the computation of scaling entropy corrections. As is also clear from the results shown in Fig. 7, the finite-time upper and lower bounds converge sufficiently quickly for the Jarzynski averaging to markedly improve even the shortest-time results. More general “non-linear” combinations of scaling and shifting may also be used to advantage, as in [19].
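The idea behind coordinate shifting can be illustrated with a toy problem that is not taken from the text and is introduced here only as an assumption: a single particle in a harmonic well whose center is moved from 0 to d, for which the free-energy change is exactly zero. In the Python sketch below (all names and parameter values are hypothetical), standard switching lets the particle lag behind the moving well and so dissipates work, whereas with the reference path xref(λ) = λd the Hamiltonian expressed in the fluctuation coordinate no longer depends on λ and the work vanishes for any switching time.

```python
import numpy as np

def mean_work(shifted, n_steps=20, n_rep=2000, d=5.0, k=1.0, kT=1.0,
              x_max=0.5, seed=0):
    """Mean work for moving a harmonic well from 0 to d in n_steps steps.

    shifted=False: MC moves act on the physical coordinate x.
    shifted=True : x = x_fluct + x_ref(lam) with x_ref(lam) = lam * d, and
                   MC moves act on the fluctuation coordinate x_fluct only.
    """
    rng = np.random.default_rng(seed)
    h = lambda x, lam: 0.5 * k * (x - lam * d) ** 2        # H(x; lambda)
    x_ref = (lambda lam: lam * d) if shifted else (lambda lam: 0.0)
    x_fluct = rng.normal(0.0, np.sqrt(kT / k), size=n_rep)  # equilibrium at lam = 0
    work = np.zeros(n_rep)
    for step in range(1, n_steps + 1):
        lam_old, lam_new = (step - 1) / n_steps, step / n_steps
        # Work: change lambda, and with it the reference, at fixed x_fluct.
        e_old = h(x_fluct + x_ref(lam_old), lam_old)
        e_new = h(x_fluct + x_ref(lam_new), lam_new)
        work += e_new - e_old
        # Metropolis move on the fluctuation coordinate at the new lambda.
        trial = x_fluct + rng.uniform(-x_max, x_max, size=n_rep)
        d_e = h(trial + x_ref(lam_new), lam_new) - e_new
        accept = rng.random(n_rep) < np.exp(-np.maximum(d_e, 0.0) / kT)
        x_fluct = np.where(accept, trial, x_fluct)
    return work.mean()

w_standard = mean_work(shifted=False)
w_shifted = mean_work(shifted=True)
print(w_standard, w_shifted)    # the exact free-energy change is zero
```

In realistic systems the fluctuations about the reference path do depend on λ, so the shifted path is only approximately reversible; the toy case merely isolates the mechanism.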
Figure 6. Work of compression with optimized scaling: convergence of upper and lower bounds to the free-energy change associated with the compression of the confined Lennard–Jones system at kB T = 0.35 as a function of the switching time tsim (MC sweeps). The outer pair of lines are from standard finite-time switching, whereas the inner pair represents the results from finite-time switching using linear metric scaling. The vertical bars represent the standard error in the mean of 100 replicas.
Figure 7. Work of compression with optimized shifting: convergence of upper and lower bounds to the free-energy change associated with the compression of the confined Lennard–Jones system at kB T = 0.35 as a function of the switching time tsim (MC sweeps), as obtained by optimized coordinate shifting. The vertical bars represent the standard error in the mean of 21 replicas. The results obtained with Jarzynski averages (forward and backward) are shown together with the forward and backward arithmetic averages.
5. Outlook
One of the most fundamental and challenging applications of atomistic simulation techniques concerns the determination of those thermodynamic properties that require determination of the entropy. The entropy, the chemical potential and the various free energies are all examples of such thermal thermodynamic properties. In contrast to their mechanical counterparts (e.g., enthalpy, pressure), they cannot be computed as ensemble (or time) averages, and indirect strategies must be adopted. Here, we have discussed the basic aspects of a particular strategy, that of using nonequilibrium simulations to obtain estimators of reversible work between equilibrium states. The point of this approach is that, in contrast to equilibrium methods such as thermodynamic integration, the desired value can, in principle, be estimated from a single simulation. But there is a trade-off, in that the nonequilibrium estimators are subject to both systematic and statistical errors, caused by the inherently irreversible nature of nonequilibrium processes.
Yet, the approach allows one to systematically obtain upper and lower bounds to the desired reversible result by exploring the nonequilibrium processes in both the forward and backward directions. The bounds for a given process become tighter with decreasing process rates. But more importantly, it is possible to optimize the nonequilibrium process so as to minimize irreversibility and, for a given process time, tighten the bounds. We have discussed a number of methods by which to conduct this optimization task, including explicit functional optimization using standard variational calculus and techniques based on special coordinate transformations aimed at the reduction of irreversibility. These techniques have been quite successful so far, allowing accurate free-energy measurements using relatively short nonequilibrium simulations. In this light, the idea of using nonequilibrium simulations has now grown into a robust and efficient computational approach to the problem of computing thermal thermodynamic properties using atomistic simulation methods. Nevertheless, further development remains necessary, in particular toward improving and generalizing the existing optimization schemes.
References [1] G. Gilmer and S. Yip, Handbook of Materials Modeling, vol. I, chap. 2.14, Kluwer, 2004. [2] J. Li, Handbook of Materials Modeling, vol. I, chap. 2.8, Kluwer, 2004. [3] M.E. Tuckerman, Handbook of Materials Modeling, vol. I, chap. 2.9, Kluwer, 2004. [4] D.A. Kofke and D. Frenkel, Handbook of Materials Modeling, vol. I, chap. 2.14, Kluwer, 2004. [5] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, Oxford, 1987. [6] L.D. Landau and E.M. Lifshitz, Statistical Physics, Part 1, 3rd edn., Pergamon Press, Oxford, 1980. [7] J.E. Hunter III, W.P. Reinhardt, and T.F. Davis, “A finite-time variational method for determining optimal paths and obtaining bounds on free energy changes from computer simulations,” J. Chem. Phys., 99, 6856, 1993. [8] L.W. Tsao, S.Y. Sheu, and C.Y. Mou, “Absolute entropy of simple point charge water by adiabatic switching processes,” J. Chem . Phys., 101, 2302, 1994. [9] M. de Koning and A. Antonelli, “Einstein crystal as a reference system in free energy estimation using adiabatic switching,” Phys. Rev. E, 53, 465, 1996. [10] M. de Koning and A. Antonelli, “Adiabatic switching applied to realistic crystalline solids: vacancy-formation free energy in copper,” Phys. Rev. B, 55, 735, 1997. [11] R. Courant and D. Hilbert, Methods of Mathematical Physics, vol. 1, Wiley, New York, 1953. [12] M. de Koning, A. Antonelli, and S. Yip, “Optimized free energy evaluation using a single reversible-scaling simulation,” Phys. Rev. Lett., 83, 3973, 1999. [13] C. Jarzynski, “Nonequilibrium equality for free energy differences,” Phys. Rev. Lett., 78, 2690, 1997.
[14] D.A. Pearlman and P.A. Kollman, “The lag between the Hamiltonian and the system configuration in free energy perturbation calculations,” J. Chem. Phys., 91, 7831, 1989. [15] H.S. Leff and A.F. Rex, Maxwell’s Demon 2, Entropy, Classical and Quantum Information, Computing, Institute of Physics Publishing, Bristol, U.K, 2002. [16] M.A. Miller and W.P. Reinhardt, “Efficient free energy calculations by variationally optimized metric scaling: concepts and applications to the volume dependence of cluster free energies and to solid–solid phase transitions,” J. Chem. Phys., 113, 7035, 2000.
[17] L.M. Amon and W.P. Reinhardt, “Development of reference states for use in absolute free energy calculations of atomic clusters with application to 55-atom Lennard–Jones clusters in the solid and liquid states,” J. Chem. Phys., 113, 3573, 2000. [18] W.P. Reinhardt, M.A. Miller, and L.M. Amon, “Why is it so difficult to simulate entropies, free energies and their differences?” Accts. Chem. Res., 34, 607, 2001. [19] C. Jarzynski, “Targeted free energy perturbation,” Phys. Rev. E, 65, 046122, 2002.
2.16 ENSEMBLES AND COMPUTER SIMULATION CALCULATION OF RESPONSE FUNCTIONS John R. Ray 1190 Old Seneca Road, Central, South Carolina 29630, USA
1. Statistical Ensembles and Computer Simulation
Calculation of thermodynamic quantities in molecular dynamics (MD) and Monte Carlo (MC) computer simulations is a useful and widely employed tool [1–3]. In this procedure one chooses a particular statistical ensemble for the computer simulation. Historically, this was the microcanonical, or (EhN), ensemble for MD and the canonical, or (ThN), ensemble for MC, but several choices are available for either MD or MC. The notations (EhN) and (ThN) label ensembles by the thermodynamic state variables that are constant in an equilibrium simulation: energy E, shape-size matrix h, particle number N and temperature T. (There could be other thermodynamic state variables, g_i, i = 1, 2, . . . , such as an electric or magnetic field applied to the system, and these additional variables would appear in the defining brackets.) The shape-size matrix is made up of the three vectors defining the computational MD or MC cell. If the vectors defining the parallelepiped that contains the particles in the computational cell are denoted (a, b, c), then the 3×3 shape-size matrix has the three cell vectors as its columns, h = (a, b, c). The volume V of the computational cell is related to the h matrix by V = det(h). For simplicity, we assume that the atoms in the simulation are described by classical physics, using an effective potential energy function to describe the inter-particle interactions. Unless explicitly stated otherwise, we suppose that periodic boundary conditions are applied to the particles in the computational cell. The periodic boundary conditions remove surface effects and, conveniently, make the calculated system properties approximately equal to those of bulk matter. We assume the system obeys the Born–Oppenheimer approximation and can be described by a potential energy U using classical mechanics and classical statistical mechanics.

S. Yip (ed.), Handbook of Materials Modeling, 729–743. © 2005 Springer. Printed in the Netherlands.
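The shape-size matrix and the periodic boundary conditions described above can be sketched in a few lines. This is only an illustration: the cell vectors, the helper names `to_fractional` and `wrap_into_cell`, and the use of NumPy are assumptions, not part of the original text.

```python
import numpy as np

# Hypothetical cell vectors (arbitrary length units) for a slightly sheared cell.
a = np.array([4.0, 0.0, 0.0])
b = np.array([0.5, 4.0, 0.0])
c = np.array([0.0, 0.3, 5.0])

# Shape-size matrix: its columns are the cell vectors, h = (a, b, c).
h = np.column_stack((a, b, c))
volume = np.linalg.det(h)  # V = det(h)

def to_fractional(x, h):
    """Map Cartesian positions (rows of x) to fractional coordinates s, where x = h s."""
    return np.linalg.solve(h, x.T).T

def wrap_into_cell(x, h):
    """Periodic boundary conditions: fold every position back into the cell."""
    s = to_fractional(x, h)
    s -= np.floor(s)      # fractional coordinates now lie in [0, 1)
    return s @ h.T        # back to Cartesian coordinates, x = h s

x = np.array([[4.6, 4.1, 5.2],    # lies outside the cell
              [1.0, 1.0, 1.0]])   # already inside the cell
x_wrapped = wrap_into_cell(x, h)
s_wrapped = to_fractional(x_wrapped, h)
```

Working in fractional coordinates keeps the wrapping rule independent of the cell shape, which is convenient once h itself is allowed to fluctuate.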
2. Ensembles
For a single component system there are eight basic ensembles that are convenient to introduce. These ensembles and their connection to their reservoirs are shown in Fig. 1 [4]. Each ensemble represents a system in contact with different types of reservoirs. These eight systems are physically realizable and each can be employed in MD or MC simulations. The combined reservoir is a thermal reservoir, a tension (or stress) and pressure reservoir (the pressure reservoir in Fig. 1 represents a tension and pressure reservoir) and a chemical potential reservoir. The reservoirs are used to impose, respectively, constant temperature, tension and pressure, and chemical potential.

Figure 1. Shown are the eight ensembles for a single component system. The systems interact through a combined temperature, pressure and chemical potential reservoir. The ensembles on the left are adiabatically insulated from the reservoir while those on the right are in thermal contact with the reservoir. Pistons and porous walls allow for volume and particle exchange. Adiabatic walls are shown cross-hatched while diathermal walls are shown as solid lines. Ensembles on the same line, like a and e, are related by Laplace and inverse Laplace transformations. The pressure stands for the pressure and the tension.

The eight ensembles naturally divide into pairs. The ensembles in the left-hand column of Fig. 1, a–d, are constant energy ensembles, while those in the right-hand column, e–h, have constant temperature. The members of each pair are connected to each other by direct and inverse Laplace transformations, a ↔ e, etc. The energies that are associated with each ensemble are related to the internal energy E by Legendre transformations [4]. The eight ensembles may be defined using the state variables that are held constant in the ensemble ([5] pp. 293–304). They include the (EhN) and (ThN) ensembles introduced earlier. Another pair is the (Ht and PN) and (Tt and PN) ensembles, where H = E + Vo Tr(tε) + PV is the enthalpy, tij is the thermodynamic tension tensor, εij the strain tensor, P the pressure, and Tr represents the trace operation. The thermodynamic tension is a modified stress tensor applied to the system that is introduced in the thermodynamics of anisotropic media. Because of the definitions used in the thermodynamics of non-linear elasticity, we denote the tension and pressure separately. A third pair is the (Lhµ) and (Thµ) ensembles, where L is the Hill energy, L = E − µN, and µ is the chemical potential of the one component system. The isothermal member of this pair, the (Thµ) ensemble, is Gibbs's grand canonical ensemble. The final pair is the (Rt and Pµ) and (Tt and Pµ) ensembles, where R = E + Vo Tr(tε) + PV − µN is the R-energy. The latter member of this pair was introduced by Guggenheim [6] and is interesting because all of its defining variables, T, P, µ, are intensive and all are held fixed, although we know that only two of them can be independent. Nevertheless, this ensemble can be used in simulations, although the size of the system will increase or decrease during the simulation. The (Rt and Pµ) ensemble allows variable particle number along with variable shape/size.
These last four ensembles all have constant chemical potential and variable particle number. For multi-component systems there is a series of hybrid ensembles that are useful. As an example, for two component systems we can use the (Tt and Pµ1N2) ensemble, which is useful for studying the absorption of species 1 in species 2, for example the absorption of hydrogen gas in a solid [7, 8]. Each of the eight ensembles, for a single component system, may be simulated using either MD or MC simulations. The probability distributions are exponentials for the isothermal ensembles and power laws for the adiabatic ensembles. For example, for the (TVN) ensemble the probability density has the Boltzmann form $P(q; T, V, N) = C e^{-U(q)/(k_B T)}$, with U(q) the potential energy and C a constant. For the (Ht and PN) ensemble, $P(q; H, t, P, N) = C V^N (H - V_o \mathrm{Tr}(t\varepsilon) - PV - U(q))^{3N/2-1}$. The trial MC moves involve particle moves and shape/size matrix moves [9]. For the (Rt and Pµ) ensemble MC moves involve particle moves, shape/size matrix moves, and attempted creation and destruction events [10]. For MC simulation of these ensembles one uses the probability density directly in the simulation, whereas for MD simulations one solves ordinary differential equations of motion that arise from Hamilton's equations. An important advance in using MD to simulate different ensembles was the extended variable approach introduced by Andersen [11]. In this approach, some variation of which is used in all but the (EhN) ensemble, extra variables are introduced into the system to allow the fluctuating variables of the ensemble to vary. Although these variations are fictitious, it can be proven that the correct ensemble is generated by these extended variable schemes. In the original approach for the (HPN) ensemble, Andersen introduced an equation of motion for the volume, which responds to a force given by the difference between the internal microscopic pressure and an external constant pressure imposed by the reservoir. This leads to volume fluctuations that are appropriate to the (HPN) ensemble, see Fig. 1. Nosé thereafter generalized MD to the isothermal ensembles by introducing a mass scaling variable that allows for energy fluctuations in the (ThN) and the other isothermal ensembles [12]. These energy fluctuations mimic the interaction of the system with the heat reservoir and allow MD to generate the probability densities of the isothermal ensembles. Which ensemble or ensembles to use, and whether to use MD or MC, depends on user preference and on the particular problem under consideration. For the variable particle number ensembles (those involving the chemical potential in their designation) one usually employs MC methods, since simulations using these ensembles involve attempted creation and destruction of particles and this fits naturally with the stochastic nature of the MC method. However, MD simulations of these ensembles have been investigated and performed [13].
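To make the idea of using a power-law probability density directly in MC more concrete, the following sketch samples an adiabatic ensemble with a Metropolis-type rule. It assumes the (EhN) density W(q) = C(E − U(q))^(3N/2−1) quoted later in this chapter, and a toy system of N particles in a common harmonic well; the well, the parameter values and the function names are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 8                      # number of particles (3N = 24 degrees of freedom)
k_spring = 1.0             # assumed harmonic well, U = 0.5 * k * sum(x^2)
E = 12.0                   # fixed total energy of the toy system
exponent = 1.5 * N - 1.0   # 3N/2 - 1

def potential(x):
    return 0.5 * k_spring * np.sum(x * x)

def weight_ratio(u_new, u_old):
    """W(trial)/W(q) for W = C (E - U)^(3N/2 - 1); zero when U would exceed E."""
    if u_new >= E:
        return 0.0
    return ((E - u_new) / (E - u_old)) ** exponent

x = np.zeros((N, 3))       # start at the potential minimum, where U = 0 < E
u = potential(x)
n_steps, n_equil, step = 100_000, 10_000, 0.8
samples = []
for n in range(n_steps):
    a = rng.integers(N)                        # pick one particle at random
    trial = x.copy()
    trial[a] += rng.uniform(-step, step, size=3)
    u_trial = potential(trial)
    if weight_ratio(u_trial, u) > rng.random():  # accept or reject the trial move
        x, u = trial, u_trial
    if n >= n_equil:                           # collect only after equilibration
        samples.append(u)

mean_U = np.mean(samples)
```

For this quadratic toy potential the kinetic and potential energies share the total energy equally on average, so `mean_U` should come out close to E/2, which gives a simple sanity check of the sampling.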
3. Response Function Calculation
Response functions are thermodynamic properties of the system that are often measured, such as specific heats, heat capacities, expansion coefficients, and elastic constants, to name a few. Response functions are associated with derivatives of the basic thermodynamic state variables, like energy, pressure and entropy, and include the basic thermodynamic state variables themselves. We do not include (non-equilibrium) transport properties, such as thermal conductivity, electrical conductivity, and viscosity, in our discussion since they fall under a different calculation scheme that uses time correlation functions [14]. Formulas that may be used to calculate response functions in simulations may be derived by differentiating quantities that connect thermodynamic state variables with integrals over functions of the microscopic particle variables. These formulas are specific to each ensemble and are standard statistical mechanics relations. Such a quantity, in the canonical ensemble, is the partition function Z(T, h, N), which for an N-particle system in three dimensions has the form

$$Z(T, h, N) = \frac{1}{N!\,(2\pi\hbar)^{3N}} \int e^{-H(q,p,h)/k_B T}\, d^{3N}q\, d^{3N}p, \tag{1}$$

where q and p denote the 6N-dimensional phase space canonical coordinates of the system, H the system Hamiltonian, $k_B$ Boltzmann's constant, $\hbar$ Planck's constant (divided by $2\pi$), and $d\tau = d^{3N}q\, d^{3N}p$ the phase space volume element. The integral in Eq. (1) is carried out over the entire phase space. Although we have indicated only the dependence of the Hamiltonian on the cell vectors, h, it would also depend on any additional thermodynamic state variables $g_i$. For liquids and gases the dependence on h is replaced by a simple dependence on the volume V; for discussions of elastic properties of solids it is important to include the dependence on the shape and size of the system through the shape-size matrix h or some function of h. The Helmholtz free energy A(T, h, N) is obtained from the canonical ensemble partition function,

$$A(T, h, N) = -k_B T \ln Z(T, h, N). \tag{2}$$
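As a small worked check of Eqs. (1) and (2), the sketch below evaluates the partition function of a single classical harmonic oscillator in one dimension by brute-force quadrature and compares it with the closed form k_B T/(ħω). The one-dimensional, one-particle reduction (prefactor 1/(2πħ)), the reduced units and all parameter values are assumptions made for illustration; such a grid integration is only feasible because the phase space here is two-dimensional.

```python
import numpy as np

hbar, kB = 1.0, 1.0            # reduced units (assumption)
m, omega, T = 1.0, 2.0, 0.7    # illustrative oscillator parameters

def hamiltonian(q, p):
    return p**2 / (2.0 * m) + 0.5 * m * omega**2 * q**2

# Uniform phase-space grid, wide enough that the Boltzmann factor is negligible at the edges.
q = np.linspace(-10.0, 10.0, 801)
p = np.linspace(-20.0, 20.0, 801)
dq, dp = q[1] - q[0], p[1] - p[0]
Q, P = np.meshgrid(q, p, indexing="ij")
boltzmann = np.exp(-hamiltonian(Q, P) / (kB * T))

# Eq. (1) for N = 1 in one dimension: Z = (1 / (2 pi hbar)) * integral exp(-H / kB T) dq dp
Z_numeric = boltzmann.sum() * dq * dp / (2.0 * np.pi * hbar)
Z_exact = kB * T / (hbar * omega)

# Eq. (2): Helmholtz free energy from the partition function
A_numeric = -kB * T * np.log(Z_numeric)
```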
Average values of phase space functions may be calculated using the phase space probability, which for the canonical ensemble is the (normalized) integrand of the partition function in Eq. (1). For example, the canonical ensemble average of the phase space function f(q, p, h) is

$$\langle f \rangle = \frac{\int f\, e^{-H/k_B T}\, d\tau}{\int e^{-H/k_B T}\, d\tau}. \tag{3}$$
In an MD or MC simulation the thermodynamic quantity $\langle f \rangle$ is calculated as a simple average over the simulation configurations; for MD this is an average over time, whereas for MC it is an average over the Markov chain of configurations generated. If the value of f at each configuration (each value of q, p, h) is $f_n$, n = 1, 2, 3, . . . , M, for M time-steps in MD or trials in MC, then the average of f for the simulation is

$$\langle f \rangle = \frac{1}{M} \sum_{n=1}^{M} f_n. \tag{4}$$

In the simulation, Eq. (4) is the approximation to the phase space average in Eq. (3). If, for example, f = H, then this average gives the thermodynamic energy $E = \langle H \rangle$ and the caloric equation of state E = E(T, h, N). The assumption that Eq. (4) approximates the integral in Eq. (3) is often referred to in the literature by saying that MD or MC “generates the ensemble”. The approximate equality of these two results in MD is the quasi-ergodic hypothesis of statistical mechanics, which states that ensemble averages, Eq. (3), and time averages, Eq. (4), are equal. This hypothesis has never been proven for realistic Hamiltonians, but it is the pillar on which statistical mechanics rests. In what follows we shall assume that averages over simulation-generated configurations are equal to statistical mechanics ensemble averages. Thus, we use formulas from statistical mechanics but calculate the average values in simulations using Eq. (4), employing MD or MC. An important point to note is that, to calculate meaningful averages in a simulation, we must “equilibrate” the system before collecting the values $f_n$ in Eq. (4). This is done by carrying out the simulation for a “long enough time”, discarding the configurations generated so far, and collecting averages only from that point on. This prevents transient behavior, associated with the particular initial conditions used to start the simulation, from overly influencing the average in Eq. (4). How long one must “equilibrate” the system depends on relaxation rates in the system, which are initially unknown. Tasks like the equilibration of the system, the estimate of the accuracy of calculated values, and so forth are part and parcel of the art of carrying out valid and, therefore, useful simulations, and must be learned by actually carrying out simulations. In this respect computer simulations resemble experimental science, where one must gain experience with the measuring apparatus, but, of course, they are theoretical calculations made possible by computers. From our discussion so far it might seem, to those who know thermodynamics, that the problem of calculating all response functions is finished, since if the Helmholtz free energy is known from Eq. (2) then all response functions may be calculated by differentiating the Helmholtz free energy with respect to various variables. For example, the energy $\langle H \rangle$ may be found from

$$\langle H \rangle = -k_B T^2\, \frac{\partial (A/k_B T)}{\partial T}. \tag{5}$$
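The averaging in Eq. (4), together with the equilibration step discussed above, can be sketched as follows. The series used here is synthetic (a correlated random signal with an artificial initial transient), standing in for values f_n that a real MD or MC run would produce; the block-averaging error estimate is one common choice and is not taken from the original text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, correlated "measurement" series standing in for f_n (not simulation data):
# a stationary AR(1) signal around true_mean plus a decaying initial transient.
M, true_mean, phi = 20_000, 2.0, 0.9
noise = rng.normal(scale=np.sqrt(1.0 - phi**2), size=M)
x = np.empty(M)
x[0] = noise[0]
for n in range(1, M):
    x[n] = phi * x[n - 1] + noise[n]
f = true_mean + x + 5.0 * np.exp(-np.arange(M) / 500.0)

def simulation_average(f, n_equil):
    """Eq. (4) applied to the configurations kept after discarding the first n_equil."""
    return np.mean(f[n_equil:])

def block_error(f, n_blocks=20):
    """Standard error of the mean estimated from block averages of a correlated series."""
    usable = len(f) - len(f) % n_blocks
    blocks = f[:usable].reshape(n_blocks, -1).mean(axis=1)
    return blocks.std(ddof=1) / np.sqrt(n_blocks)

mean_all = simulation_average(f, 0)        # biased by the initial transient
mean_eq = simulation_average(f, 5_000)     # transient discarded
err_eq = block_error(f[5_000:])
```

Comparing `mean_all` with `mean_eq` shows how the un-equilibrated start pulls the average away from the stationary value; in practice the discard length has to be chosen by inspecting the run itself.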
Unfortunately, in MC or MD only average values like Eq. (3), which are ratios of phase space integrals, can be easily evaluated in simulations, and not a 6N-dimensional phase space integral itself, like Eq. (1). The reason for this is that in high dimensions (dimensions greater than, say, 10) the numerical methods used to calculate integrals accurately (e.g., Simpson's rule) require computer resources beyond those presently available. For example, in 10 dimensions, a grid of 100 intervals in each dimension requires $100^{10} = 10^{20}$ grid values. Even with the most advanced computer, this number of values is not easy to handle. In a typical simulation the dimension is hundreds or thousands, not ten. One might think that the high dimensional integrals could be calculated directly by MD or MC methods, but this also does not work, since the integrand in the high dimensional phase space is rapidly varying and one cannot sample for long enough to smooth out this rapid variation. The integral is determined by the value of the integrand in a few pockets (“equilibrium pockets”) in phase space that will only be sampled infrequently. For the ratio of high dimensional integrals, MD or MC methods have the effect of focusing the sampling on just those important regions. The difficulty, in high dimensions, of calculating quantities that require the evaluation of an integral, as compared to the ratio of integrals, leads to a classification of the quantities to be calculated by computer simulation as thermal or mechanical properties. Thermal properties require the value of the partition function, or some other high-dimensional integral, for their evaluation, whereas mechanical properties do not require the value of the partition function but are a ratio of two high dimensional integrals. As examples, for the canonical ensemble the Helmholtz free energy is a thermal variable and the energy is a mechanical variable. Other thermal variables are the entropy, chemical potential, and Gibbs free energy. Other mechanical variables are temperature, pressure, enthalpy, thermal expansion coefficient, elastic constants, heat capacity, and so forth. Special methods must be developed for calculating thermal properties, and the calculation of thermal properties is, in general, more difficult. We have developed novel methods to calculate thermal variables using different ensembles [15, 16] but shall not discuss them in detail in this contribution. As an example of the calculation of a mechanical response function, consider the fluctuation formula for the heat capacity in the canonical ensemble. Differentiating the average energy $\langle H \rangle$ in Eq. (3) with respect to T while holding the cell vectors rigid leads to the heat capacity at constant shape-size, $C_V$:

$$C_V = \frac{\partial \langle H \rangle}{\partial T} = \frac{1}{k_B T^2}\left(\langle H^2 \rangle - \langle H \rangle^2\right). \tag{6}$$
Recall that in the simulation the average values in Eq. (6) are approximated by simple averages of the quantities, Eq. (4). Thus, in a single canonical ensemble simulation, MC or MD, we may calculate the heat capacity of the system at the given thermodynamic state point by calculating the average value of the square of the energy, subtracting the square of the average energy, and dividing by $k_B T^2$. The quantity

$$\delta(H^2) = \langle H^2 \rangle - \langle H \rangle^2, \tag{7}$$

the variance in probability theory, is called the fluctuation in the energy H. The fluctuation of quantities enters into the formulas for response functions for mechanical variables. It should be noted that a direct way of calculating the heat capacity $C_V$ is to calculate the caloric equation of state at a number of temperatures and then numerically differentiate $\langle H \rangle$ with respect to T. This requires a series of simulations and is less convenient, and its accuracy is harder to estimate, but it is simple and provides a useful check on the value obtained from the fluctuation formula, Eq. (6). We refer to this method of calculating response functions as the direct method. Any mechanical response function can, in principle, be calculated by the direct method.
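The fluctuation formula and the direct method can be compared on a toy model for which the answer is known. The sketch below assumes N independent classical one-dimensional harmonic oscillators, for which canonical configurations can be drawn exactly and C_V = N k_B by equipartition; the model, the reduced units and the sample sizes are illustrative assumptions rather than anything prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(2)
kB = 1.0                         # reduced units (assumption)
N, m, k_spring = 100, 1.0, 1.0   # N independent 1D harmonic oscillators (toy model)
M = 20_000                       # number of sampled configurations

def sample_energies(T):
    """Draw M independent canonical configurations and return H for each of them."""
    x = rng.normal(scale=np.sqrt(kB * T / k_spring), size=(M, N))
    p = rng.normal(scale=np.sqrt(m * kB * T), size=(M, N))
    return np.sum(p**2 / (2.0 * m) + 0.5 * k_spring * x**2, axis=1)

T = 1.0
H = sample_energies(T)

# Fluctuation formula, Eqs. (6) and (7): C_V = (<H^2> - <H>^2) / (kB T^2)
cv_fluct = (np.mean(H**2) - np.mean(H)**2) / (kB * T**2)

# Direct method: centred finite difference of <H> between two nearby temperatures.
dT = 0.1
cv_direct = (sample_energies(T + dT).mean() - sample_energies(T - dT).mean()) / (2.0 * dT)

cv_exact = N * kB                # equipartition result for this toy model
```

The fluctuation route needs only the run at T, while the direct route needs two further runs, which mirrors the trade-off described in the text.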
4. Thermodynamics of Anisotropic Media
For the present we choose the reference state to be the equilibrium state of the system with zero tension applied to the system. The h matrix for this reference state is $h_o$, while for an arbitrary state of tension we have h. The following formulation of the thermodynamics of anisotropic media is consistent with nonlinear or finite elasticity theory. In the following, repeated indices are summed over. The elastic energy $U_{el}$ is defined by

$$U_{el} = V_o\, \mathrm{Tr}(t\varepsilon), \tag{8}$$

where $V_o$ is the reference volume, t is the thermodynamic tension tensor, ε is the strain tensor and Tr implies trace. The h matrix maps the particle coordinates into fractional coordinates, $s_{ai}$, in the unit cube through the relation $x_{ai} = h_{ij} s_{aj}$. The strain of the system relative to the unstressed state is

$$\varepsilon_{ij} = \tfrac{1}{2}\left(h_o^{T-1}\, G\, h_o^{-1} - I\right)_{ij}, \tag{9}$$

where $G = h^T h$ is the metric tensor. Here $h_o$ is the reference value for measuring strain, that is, the value of h when the system is unstrained. This value can be obtained by carrying out an (Ht and PN) simulation, MD or MC, with the tension set to zero. Equation (9) can be derived by noting that the deformation gradient can be written in terms of the h matrices as $\partial x_i/\partial x_{oj} = h_{ik} h^{-1}_{okj}$, and using this in the defining relation for the Lagrangian strain of the system. The thermodynamic tension tensor is defined so that the work done in an infinitesimal distortion of the system is given by $dW = V_o\, \mathrm{Tr}(t\, d\varepsilon)$. The stress tensor, σ, is related to the thermodynamic tension by

$$\sigma = V_o\, h\, h_o^{-1}\, t\, h_o^{T-1}\, h^T / V. \tag{10}$$
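Equations (9) and (10) translate directly into matrix operations. The following sketch computes the strain from h and h_o and converts between tension and stress; the function names, the cubic reference cell and the numerical values of the tension are hypothetical choices made only to exercise the formulas.

```python
import numpy as np

def strain(h, h0):
    """Eq. (9): eps = (h0^{-T} G h0^{-1} - I) / 2, with metric tensor G = h^T h."""
    h0_inv = np.linalg.inv(h0)
    G = h.T @ h
    return 0.5 * (h0_inv.T @ G @ h0_inv - np.eye(3))

def stress_from_tension(t, h, h0):
    """Eq. (10): sigma = V0 h h0^{-1} t h0^{-T} h^T / V."""
    F = h @ np.linalg.inv(h0)                     # deformation gradient
    V0, V = np.linalg.det(h0), np.linalg.det(h)
    return V0 * F @ t @ F.T / V

def tension_from_stress(sigma, h, h0):
    """Algebraic inverse of Eq. (10)."""
    F_inv = h0 @ np.linalg.inv(h)
    V0, V = np.linalg.det(h0), np.linalg.det(h)
    return V * F_inv @ sigma @ F_inv.T / V0

h0 = 4.0 * np.eye(3)                  # hypothetical cubic reference cell
h = 1.01 * h0                         # uniform 1% stretch of every cell vector
eps = strain(h, h0)                   # equals ((1.01**2 - 1) / 2) * I

t = np.diag([1.0, 2.0, 3.0])          # arbitrary symmetric tension (illustrative values)
sigma = stress_from_tension(t, h, h0)
t_back = tension_from_stress(sigma, h, h0)
```

Note that for the 1% stretch the Lagrangian strain is 0.01005 rather than 0.01, a reminder that Eq. (9) is the finite-strain measure; tension and stress coincide only when h = h_o.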
The thermodynamic law is

$$T\, dS = dE + V_o\, \mathrm{Tr}(t\, d\varepsilon), \tag{11}$$

where T is the temperature, S the entropy and E the energy of the particles. Using the definition of the strain, Eq. (9), the thermodynamic law can be recast as

$$T\, dS = dE + V_o\, \mathrm{Tr}\left(h_o^{-1}\, t\, h_o^{T-1}\, dG\right)/2. \tag{12}$$

From this latter we obtain

$$\left(\partial E/\partial G_{kn}\right)_S = -\left(V_o\, h_o^{-1}\, t\, h_o^{T-1}\right)_{kn}/2. \tag{13}$$

In the (EhN) ensemble we have the general relation

$$\left(\partial E/\partial G_{kn}\right)_S = \langle \partial H/\partial G_{kn} \rangle, \tag{14}$$
where H is the particle Hamiltonian and the average is the (EhN) ensemble average. Combining the last two equations leads to

$$\langle \partial H/\partial G_{kn} \rangle = -\left(V_o\, h_o^{-1}\, t\, h_o^{T-1}\right)_{kn}/2. \tag{15}$$

The particle Hamiltonian is transformed by the canonical transformation $x_{ai} = h_{ij} s_{aj}$, $p_{ai} = h^{T-1}_{ij} \pi_{aj}$, into

$$H(s_a, \pi_a, h) = \frac{1}{2} \sum_{a=1}^{N} \pi_{ai}\, G^{-1}_{ij}\, \pi_{aj}/m_a + U(r_{12}, r_{13}, \ldots), \tag{16}$$

where the distance between particles a and b is to be replaced by the relationship $r_{ab}^2 = s_{abi} G_{ij} s_{abj}$, and $s_{abi}$ is the fractional coordinate difference between a and b. The microscopic stress tensor $\mathcal{P}_{ij}$ may be obtained by differentiating the particle Hamiltonian with respect to the h matrix while holding $(s_a, \pi_a)$ constant: $\partial H/\partial h_{ij} = -\mathcal{P}_{ik} A_{kj}$, where A is the area tensor, $A = V h^{T-1}$. For the Hamiltonian, Eq. (16), the microscopic stress tensor is

$$\mathcal{P}_{ij} = \frac{1}{V}\left[\sum_{a} p_{ai}\, p_{aj}/m_a - \sum_{a<b} \frac{\partial U}{\partial r_{ab}}\, x_{abi}\, x_{abj}/r_{ab}\right]. \tag{17}$$

Differentiating the Hamiltonian with respect to the parameters $G_{kn}$ we obtain

$$M_{kn} \equiv \left(\partial H/\partial G_{kn}\right) = -\left(V\, h^{-1}\, \mathcal{P}\, h^{T-1}\right)_{kn}/2, \tag{18}$$

where $\mathcal{P}$ is the microscopic stress tensor, Eq. (17). If the average value of Eq. (18) is combined with Eq. (15) we obtain

$$t = V\, h_o\, h^{-1}\, \langle \mathcal{P} \rangle\, h^{T-1}\, h_o^{T} / V_o. \tag{19}$$

Comparing Eq. (19) and Eq. (10) we find

$$\sigma = \langle \mathcal{P} \rangle; \tag{20}$$

the stress tensor is the average of the microscopic stress tensor. Equation (20) holds in all ensembles, but the proofs would be different. For the (ThN) ensemble we would use the Helmholtz free energy A = E − TS instead of the energy E. The counterpart to Eq. (14) would be $\left(\partial A/\partial G_{kn}\right)_T = \langle \partial H/\partial G_{kn} \rangle$.
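The link between the microscopic stress tensor, Eq. (17), and the derivative of the Hamiltonian with respect to the metric tensor, Eq. (18), can be checked numerically for the potential-energy part. The sketch below assumes a Lennard-Jones pair potential in reduced units, a small cluster of eight particles without periodic images, and a hypothetical sheared cell; it compares the analytic expression with a finite-difference derivative of U with respect to each entry of G.

```python
import numpy as np

rng = np.random.default_rng(3)

def lj(r):
    """Lennard-Jones pair potential in reduced units (epsilon = sigma = 1)."""
    return 4.0 * (r**-12 - r**-6)

def lj_prime(r):
    return 4.0 * (-12.0 * r**-13 + 6.0 * r**-7)

def pair_separations(s, G):
    """Fractional separations s_ab (a < b) and distances r_ab = sqrt(s_ab G s_ab)."""
    ia, ib = np.triu_indices(len(s), k=1)
    s_ab = s[ia] - s[ib]
    r_ab = np.sqrt(np.einsum("pi,ij,pj->p", s_ab, G, s_ab))
    return s_ab, r_ab

def potential_energy(s, G):
    _, r_ab = pair_separations(s, G)
    return lj(r_ab).sum()

def potential_stress(s, h):
    """Potential part of Eq. (17): -(1/V) sum_{a<b} u'(r_ab) x_abi x_abj / r_ab."""
    s_ab, r_ab = pair_separations(s, h.T @ h)
    x_ab = s_ab @ h.T
    V = np.linalg.det(h)
    return -np.einsum("p,pi,pj->ij", lj_prime(r_ab) / r_ab, x_ab, x_ab) / V

h = np.array([[5.0, 0.4, 0.0], [0.0, 5.5, 0.2], [0.0, 0.0, 6.0]])
# Points on a jittered 2x2x2 grid in fractional coordinates, so no pair is very close.
grid = np.array([[i, j, k] for i in range(2) for j in range(2) for k in range(2)], dtype=float)
s = 0.25 + 0.5 * grid + rng.uniform(-0.05, 0.05, size=(8, 3))
G = h.T @ h

# Eq. (18), potential part: M = -(V h^{-1} P h^{-T}) / 2
h_inv = np.linalg.inv(h)
M_formula = -0.5 * np.linalg.det(h) * h_inv @ potential_stress(s, h) @ h_inv.T

# Finite-difference dU/dG_kn, treating all nine entries of G as independent variables.
M_fd = np.zeros((3, 3))
d = 1e-6
for k in range(3):
    for n in range(3):
        dG = np.zeros((3, 3))
        dG[k, n] = d
        M_fd[k, n] = (potential_energy(s, G + dG) - potential_energy(s, G - dG)) / (2.0 * d)
```

Agreement between `M_fd` and `M_formula` is a convenient unit test for a stress routine before it is used in the fluctuation formulas; the kinetic part of Eq. (17) would be handled analogously.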
5. Calculation of Elastic Constants in the (EhN) Ensemble
In order to discuss the calculation of the elastic constants we describe the system by the microcanonical, (EhN), ensemble. The adiabatic elastic constants are defined as the derivative of the tension with respect to the strain,

$$C^{(S)}_{ijkl} = -\left(\partial t_{ij}/\partial \varepsilon_{kl}\right)_S. \tag{21}$$

Note that the minus sign in Eq. (21) implies that the tension and stress are positive for compressive loading. Often the opposite convention is employed, in which case no minus sign occurs in Eq. (21). In the literature of finite elasticity the elastic constants defined in Eq. (21) are often called stiffness coefficients or elastic moduli. Assume the Hamiltonian describing the system has the form

$$H(x_a, p_a) = \frac{1}{2} \sum_{a=1}^{N} p_a^2/m_a + U(r_{12}, r_{13}, \ldots), \tag{22}$$

where $p_a$ is the momentum of particle a, $r_{ab}$ is the distance between particles a and b, and the system contains N particles. Let the reference value, $h_o$, denote the shape-size matrix for the unstressed system and h represent an arbitrary state of stress. The (EhN) fluctuation formula involving the adiabatic elastic constants, for a potential that depends only on interparticle distances, has the form

$$V_o\, h^{-1}_{oip}\, h^{-1}_{ojq}\, h^{-1}_{okr}\, h^{-1}_{ons}\, C^{(S)}_{pqrs} = -4\,\delta(M_{ij} M_{kn})/k_B T + 2 N k_B T \left(G^{-1}_{in} G^{-1}_{jk} + G^{-1}_{ik} G^{-1}_{jn}\right) + \Big\langle \sum_{a<b} \sum_{c<d} k(a,b,c,d)\, s_{abi}\, s_{abj}\, s_{cdk}\, s_{cdn} \Big\rangle, \tag{23}$$

where

$$k(a,b,c,d) = \left(\partial^2 U/\partial r_{ab}\, \partial r_{cd} - (\partial U/\partial r_{ab})\, \delta_{ac}\, \delta_{bd}/r_{ab}\right)/(r_{ab}\, r_{cd}). \tag{24}$$
The averages in Eq. (23) are calculated using (EhN) simulations, MD or MC. In (EhN) MD we would solve Newton's laws for the motion of the particles, $m_a \ddot{x}_{ai} = -\partial U/\partial x_{ai}$, to generate the configurations used to calculate averages, Eq. (4). In MC we would use the probability density [17] $W(q) = C\,(E - U(q))^{3N/2-1}$ to generate configurations by attempting a trial move of an atom, q → q(trial), and accepting the move if W(q(trial))/W(q) exceeds a random number drawn between 0 and 1. Equations (23) and (24) also hold for the isothermal elastic constants if one replaces $C^{(S)}_{pqrs}$ by the isothermal elastic constants, $C^{(T)}_{pqrs}$, and calculates the average values in Eq. (23) using (ThN) simulations. The three distinct terms in Eq. (23) are called the fluctuation term (the term involving the fluctuation of M), the kinetic term (the term with multiplier $2Nk_BT$) and the Born term (the term containing k(a,b,c,d)). Equations (23) and (24) are valid for any potential that depends only on the distances between particles; they are valid for many-body forces as long as these can be written in terms of only the distances between particles. In particular, this includes potentials that depend on tetrahedral and dihedral angles and, therefore, have many-body forces. For pairwise additive potentials, $U = \sum_{a<b} u(r_{ab})$, the last term in Eq. (23) reduces to the simpler form

$$\Big\langle \sum_{a<b}^{N} \left(r_{ab}^2\, u''(r_{ab}) - r_{ab}\, u'(r_{ab})\right) s_{abi}\, s_{abj}\, s_{abk}\, s_{abn}/r_{ab}^4 \Big\rangle.$$
$$X_{ij} = \int M_{ij}\, \Theta(E - H(q,p))\, d\tau, \tag{25}$$

where M is defined in Eq. (18) and $\Theta$ is the unit step function. For Eq. (25) applied to a large system one can keep, to good approximation, only the largest term, or

$$\langle M_{ij} \rangle\, \Phi = \int M_{ij}\, \Theta(E - H(q,p))\, d\tau, \tag{26}$$

where $\Phi$ is the phase volume inside the energy shell, H(q, p) = E. The entropy is related to the phase volume by the Boltzmann relation $S = k_B \ln \Phi$. Differentiation of Eq. (26) with respect to $G_{kn}$ leads to

$$V_o\, h^{-1}_{oip}\, h^{-1}_{ojq}\, h^{-1}_{okr}\, h^{-1}_{ons}\, C^{(S)}_{pqrs} = -4\,\delta(M_{ij} M_{kn})/k_B T + 4\,\langle \partial^2 H/\partial G_{ij}\, \partial G_{kn} \rangle. \tag{27}$$
Calculating the last term in Eq. (27) leads to Eqs. (23) and (24). More rigorous derivations of Eq. (23) are discussed by Ray [18]. Equations (23) and (24) have been used to calculate elastic constants in a nearest-neighbor Lennard–Jones (6–12) system in both the microcanonical and canonical ensembles using MD, and the results were compared with earlier canonical ensemble MC calculations of the same quantities [19, 20]. These calculations have been reproduced by several workers and can now be used to check programs that are written to calculate elastic constants. Since there are thermodynamic relations connecting the adiabatic and isothermal elastic constants (like the thermodynamic formulas connecting CV and CP), it is possible to calculate the adiabatic elastic constants in either the (EhN) ensemble or the (ThN) ensemble, and the same holds for the isothermal elastic constants. A comparison of the values obtained in the two ensembles can be looked upon as a stringent test of the validity of the Nosé [12] theory for isothermal MD simulations [20]. Equations (23) and (24) have also been used to calculate the elastic constants of crystalline and amorphous silicon modeled by the Stillinger–Weber potential [21, 22]. Equations (23) and (24) allow one to break down the Born term into a two-body Born term and a three-body Born term for the Stillinger–Weber potential. These values have also been checked by a number of workers and can now be used as program checks. Equations (23) and (24) were generalized to apply to potentials with an explicit volume-dependent term, such as in metallic potentials, or when using the Ewald method to evaluate the Coulomb potential. The resulting theory was then applied to a model of sodium [23]. Another generalization was the calculation of the third-order elastic constants using a generalization of Eqs. (23) and (24) [24]. For systems where the reference state for measuring strain is a stressed state of the system, generalizations of Eqs. (23) and (24) are required. This extension, with calculations for a model of solid helium, has been developed [25]. Equations (23) and (24) were applied in detail to embedded atom method potentials for palladium by Wolf et al. [26]. The extension of (Ht and PN) calculations to higher order elastic constants was given by Ray [27].
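As a minimal program check in the spirit of the tests mentioned above, the Born term for a pairwise additive potential can be evaluated for a static lattice, where the kinetic term vanishes and, for a centrosymmetric lattice such as fcc, the fluctuation term is expected to give no contribution. The sketch assumes a nearest-neighbor Lennard-Jones fcc crystal in reduced units with the neighbor distance at the pair-potential minimum, and writes the Born term in Cartesian form (h = h_o); these simplifications are assumptions made for the illustration and the numbers are not taken from the cited calculations.

```python
import numpy as np
from itertools import product

def lj_derivatives(r):
    """First and second derivatives of the reduced Lennard-Jones potential (epsilon = sigma = 1)."""
    u1 = 4.0 * (-12.0 * r**-13 + 6.0 * r**-7)
    u2 = 4.0 * (156.0 * r**-14 - 42.0 * r**-8)
    return u1, u2

r0 = 2.0 ** (1.0 / 6.0)        # minimum of the pair potential
a = np.sqrt(2.0) * r0          # fcc lattice constant with nearest-neighbour distance r0
v_atom = a**3 / 4.0            # volume per atom in fcc

# The 12 nearest-neighbour vectors of fcc: permutations of (+-a/2, +-a/2, 0).
neighbours = np.array([v for v in product((-1, 0, 1), repeat=3)
                       if sum(abs(c) for c in v) == 2], dtype=float) * a / 2.0

r = np.linalg.norm(neighbours, axis=1)
u1, u2 = lj_derivatives(r)

# Pairwise Born term per unit volume in Cartesian form (h = h_o, atoms at rest):
# C_pqrs = (1 / V) sum_{a<b} (u'' - u'/r) x_p x_q x_r x_s / r^2.
# Summing over the neighbours of one atom counts every pair twice, hence the factor 1/2.
weights = 0.5 * (u2 - u1 / r) / r**2 / v_atom
C = np.einsum("n,np,nq,nr,ns->pqrs", weights, *(neighbours,) * 4)

C11, C12, C44 = C[0, 0, 0, 0], C[0, 0, 1, 1], C[0, 1, 0, 1]
```

Because the Born term for a pair potential is fully symmetric in its four indices, the sketch reproduces the Cauchy relation C12 = C44, and for this nearest-neighbor model also C11 = 2 C12; finite-temperature values require the kinetic and fluctuation terms of Eq. (23).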
6. Calculation of Elastic Constants in the (Ht and PN) and (Tt and PN) Ensembles
In these ensembles the shape-size or strain of the system fluctuates. The Parrinello–Rahman fluctuation formula for the elastic constants involves just this fluctuation [28],

$$\delta(\varepsilon_{ij}\, \varepsilon_{kl}) = k_B T\, \left(C^{S}_{ijkl}\right)^{-1} / V_o, \tag{28}$$

where the adiabatic compliance tensor $(C^S)^{-1}$ is the inverse of the elastic constant tensor,

$$\left(C^{S}_{ijkl}\right)^{-1} = -\left(\partial \varepsilon_{ij}/\partial t_{kl}\right)_S, \tag{29}$$
and S is the entropy. The averages in Eq. (28) are calculated using (Ht and PN) MD or MC. The same formula, Eq. (28), holds in the (Tt and PN) ensemble if we change to the isothermal elastic constants and calculate averages using isothermal MD or MC. For MD the extended Hamiltonian for variable shape-size ensembles has the form [29]

$$H_1(s, \pi, h, \Pi, f, \rho) = \sum_a \pi_a^T G^{-1} \pi_a/(2 m_a f^2) + U + \mathrm{Tr}(\Pi^T \Pi)/(2W) + V_o\, \mathrm{Tr}(t\varepsilon) + PV + \rho^2/(2M) + (3N+1)\, k_B T_o \ln(f), \tag{30}$$

where (s, π) are the scaled coordinates and conjugate momenta, U is the potential energy, (h, Π) are the coordinates and momenta of the computational cell, and (f, ρ) are the Nosé mass scaling variable and its conjugate momentum. The constants W and M are introduced so that h and f satisfy dynamical equations; note that in classical statistical mechanics the equilibrium properties of the system are independent of the masses and, therefore, are independent of W, M and the particle masses $m_a$. $T_o$ is the reservoir temperature in the constant temperature ensembles. The physical particle variables $(x_a, p_a)$ are related to the scaled particle variables by $x_a = h s_a$, $p_a = h^{T-1} \pi_a / f$. The relationship between the physical variables and the scaled variables may be described by a canonical transformation defined by h, along with a mass scaling transformation with f. The equations of motion following from this Hamiltonian may be written in the form

$$m_a f^2\, \ddot{s}_{ai} = -\sum_{b \neq a} (\partial U/\partial r_{ab})\, s_{abi}/r_{ab} - m_a \left(f^2 G^{-1} \dot{G} + 2 f \dot{f}\right) \dot{s}_{ai}, \tag{31}$$

$$W \ddot{h} = (\mathcal{P} - P I)\, A - h\, \Gamma, \tag{32}$$

$$M \ddot{f} = 2K/f - (3N+1)\, k_B T_o / f, \tag{33}$$
where $\Gamma = V_o\, h_o^{-1}\, t\, h_o^{T-1}$ is related to the tension applied to the system and K is the particle kinetic energy. Equation (31) is just Newton's law applied to the particles, modified by the variable cell and the mass scaling variable. Equation (32) is the Parrinello–Rahman equation [28] as generalized [29, 30] to be valid for finite deformations, which involves introducing the tension instead of the stress; this leads to a form of the enthalpy for finite elasticity in agreement with Thurston [31]. Equation (33) [12] is the equation of motion for the mass scaling variable, which is introduced to drive the average temperature of the system to the reservoir temperature $T_o$ in an equilibrium simulation. If the Nosé mass scaling variable satisfies f = 1, df/dt = 0, then Eqs. (31) and (32) are the MD equations of motion for the (Ht and PN) ensemble and the trajectories yield averages in this ensemble. If the cell matrix satisfies h = const., dh/dt = 0, then Eqs. (31) and (33) are the equations of motion for the (ThN) ensemble and the trajectories yield averages in this ensemble. If the previous conditions on h and f are both satisfied, then the (EhN) ensemble is generated. If Eqs. (31)–(33) are solved in the general case, with f and h varying, then the (Tt and PN) ensemble is generated. The variable cell equations of motion have great utility in studying solid–solid phase transformations by computer simulation. These same transformations can be studied using MC methods. In the (Ht and PN) ensemble the calculation of elastic constants in MD is not as good as in MC; that is, Eq. (28) converges faster using MC than MD. This is illustrated in detail by Fay and Ray, 1992 [9] and Karimi et al., 1998 [32]. However, (Ht and PN) MC elastic constant calculations do not converge as fast as (EhN) MD or MC. The convergence is governed by the fluctuation terms in either Eq. (23) or Eq. (28). The fluctuation of the microscopic stress tensor in Eq. (23) converges faster than the fluctuation of the shape/size matrix in Eq. (28) in the cases we have investigated. This is unfortunate, since the (EhN) formulas require values of the second derivatives of the potential, whereas the (Ht and PN) fluctuation formulas require only first derivatives in MD or no derivatives in MC. The derivatives of the potential may not be easy to calculate for a many-body potential, although one could employ algebraic computer programs to calculate them. One can calculate elastic constants in the variable particle number ensembles, but we have not discovered a case where that offers any advantage over the four fixed particle number ensembles discussed. If the second derivatives of the potential can be evaluated or accurately approximated, then the (EhN) formulas, Eqs. (23) and (24), using either MD or MC, are the best choice for calculating the elastic constants. If the second derivative of the potential is not available, then MC using the probability density $P(q; H, t, P, N) = C V^N (H - V_o \mathrm{Tr}(t\varepsilon) - PV - U(q))^{3N/2-1}$ with Eq. (28) is the best choice. MC calculations in the (Ht and PN) ensemble also offer the advantage of not having to worry about the choice of the different fictitious kinetic energy and mass terms introduced in extended MD; these are not unique. Either Eq. (23) or Eq. (28) offers a convenient way of calculating elastic properties of condensed matter systems as a function of temperature or other parameters in a way that includes all anharmonic effects in an exact manner.
References
[1] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1987.
[2] G. Ciccotti, D. Frenkel, and I.R. McDonald, Simulation of Liquids and Solids, North-Holland, Amsterdam, 1987.
[3] D. Frenkel and B. Smit, Understanding Molecular Simulation, Academic Press, New York, 1996.
[4] H.W. Graben and J.R. Ray, "Eight physical systems of thermodynamics, statistical mechanics, and computer simulation," Mol. Phys., 80, 1183–1193, 1993.
[5] M.W. Zemansky and R.H. Dittman, Heat and Thermodynamics, 7th edn., McGraw Hill, New York, 1997.
[6] E.A. Guggenheim, J. Chem. Phys., 7, 103, 1939.
[7] R.J. Wolf, M.W. Lee, R.C. Davis, P.J. Fay, and J.R. Ray, "Pressure-composition isotherms for palladium hydride," Phys. Rev. B, 48, 12415–12418, 1993.
[8] R.J. Wolf, M.W. Lee, and J.R. Ray, "Pressure-composition isotherms for nanocrystalline palladium hydride," Phys. Rev. Lett., 73, 557–560, 1994.
[9] P.J. Fay and J.R. Ray, "Monte Carlo simulations in the isoenthalpic-isotension–isobaric ensemble," Phys. Rev. A, 46, 4645–4649, 1992.
[10] J.R. Ray and R.J. Wolf, "Monte Carlo simulations at constant chemical potential and pressure," J. Chem. Phys., 98, 2263–2267, 1993.
[11] H.C. Andersen, "Molecular dynamics simulations at constant pressure and/or temperature," J. Chem. Phys., 72, 2384–2393, 1980.
[12] S. Nosé, "A unified formulation of the constant temperature molecular dynamics method," J. Chem. Phys., 81, 511–519, 1984.
[13] T. Cagin and B.M. Pettitt, "Molecular dynamics with a variable number of particles," Mol. Phys., 72, 169, 1991.
[14] E. Helfand, "Transport coefficients from dissipation in a canonical ensemble," Phys. Rev., 119, 1, 1960.
[15] P.J. Fay, J.R. Ray, and R.J. Wolf, "Detailed balance method for chemical potential determination in Monte Carlo and molecular dynamics simulations," J. Chem. Phys., 100, 2154–2160, 1994.
[16] P.J. Fay, J.R. Ray, and R.J. Wolf, "Detailed balance method for chemical potential determination," J. Chem. Phys., 103, 7556–7561, 1995.
[17] J.R. Ray, "Microcanonical ensemble Monte Carlo method," Phys. Rev. A, 44, 4061–4064, 1991.
[18] J.R. Ray, "Elastic constants and statistical ensembles in molecular dynamics," Comput. Phys. Rep., 8, 109–152, 1988.
[19] J.R. Ray, M.C. Moody, and A. Rahman, "Molecular dynamics calculation of the elastic constants for a crystalline system in equilibrium," Phys. Rev. B, 32, 733–735, 1985.
[20] J.R. Ray, M.C. Moody, and A. Rahman, "Calculation of elastic constants using isothermal molecular dynamics," Phys. Rev. B, 33, 895–899, 1986.
[21] M.D. Kluge, J.R. Ray, and A. Rahman, "Molecular dynamic calculation of the elastic constants of silicon," J. Chem. Phys., 85, 4028–4031, 1987.
[22] M.D. Kluge and J.R. Ray, "Elastic constants and density of states of a molecular-dynamics model of amorphous silicon," Phys. Rev. B, 37, 4132–4136, 1988.
[23] T. Cagin and J.R. Ray, "Elastic constants of sodium from molecular dynamics," Phys. Rev. B, 37, 699–705, 1988.
[24] T. Cagin and J.R. Ray, "Third-order elastic constants from molecular dynamics: theory and an example calculation," Phys. Rev. B, 38, 7940–7946, 1988.
[25] J.R. Ray, "Effective elastic constants of solids under stress: theory and calculations for helium from 11.0 to 23.6 GPa," Phys. Rev. B, 40, 423–430, 1989.
[26] R.J. Wolf, K.A. Mansour, M.W. Lee, and J.R. Ray, "Temperature dependence of elastic constants of embedded-atom models of palladium," Phys. Rev. B, 46, 8027–8035, 1992.
[27] J.R. Ray, "Fluctuations and thermodynamic properties of anisotropic solids," J. Appl. Phys., 53, 6441–6443, 1982.
[28] M. Parrinello and A. Rahman, "Polymorphic transitions in single crystals: a new molecular dynamics method," J. Appl. Phys., 52, 7182–7190, 1981.
[29] J.R. Ray and A. Rahman, "Statistical ensembles and molecular dynamics studies of anisotropic solids II," J. Chem. Phys., 82, 4243–4247, 1985.
[30] J.R. Ray and A. Rahman, "Statistical ensembles and molecular dynamics studies of anisotropic solids," J. Chem. Phys., 80, 4423–4428, 1984.
[31] R.N. Thurston, Physical Acoustics: Principles and Methods, W.P. Mason (ed.), Academic Press, New York, 1964.
[32] M. Karimi, H. Yates, J.R. Ray, T. Kaplan, and M. Mostoller, "Elastic constants of silicon using Monte Carlo simulations," Phys. Rev. B, 58, 6019–6025, 1998.
2.17 NON-EQUILIBRIUM MOLECULAR DYNAMICS
Giovanni Ciccotti (1), Raymond Kapral (2), and Alessandro Sergi (2)
(1) INFM and Dipartimento di Fisica, Università "La Sapienza," Piazzale Aldo Moro, 2, 00185 Roma, Italy
(2) Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Ont. M5S 3H6, Canada
S. Yip (ed.), Handbook of Materials Modeling, 745–761. © 2005 Springer. Printed in the Netherlands.

Statistical mechanics provides a well-established link between microscopic equilibrium states and thermodynamics. If one considers systems out of equilibrium, the link between microscopic dynamical properties and non-equilibrium macroscopic states is more difficult to establish [1, 2]. For systems lying near equilibrium, linear response theory provides a route to derive linear macroscopic laws and the microscopic expressions for the transport properties that enter the constitutive relations. If the system is displaced far from equilibrium, no fully general theory exists. By restricting consideration to a class of non-equilibrium states that arise from perturbations (linear or non-linear) of an equilibrium state, methods can be developed to treat such states. Furthermore, non-equilibrium molecular dynamics (NEMD) simulation methods can be devised to provide estimates for the transport properties of these systems.

Molecular dynamics is typically based on equations of motion derived from a Hamiltonian. However, in the simulation of large complex systems, constraints are often introduced to remove certain "fast" degrees of freedom that are deemed to be unimportant for the phenomena under investigation. An important and prevalent example is the introduction of bond constraints to remove rapid vibrational degrees of freedom from the molecules of the system. Such constraints can be handled by the introduction of generalized coordinates, and in these coordinates a Hamiltonian description of the equations of motion may be written. However, it is often more convenient to work with Cartesian coordinates, with the holonomic constraints explicitly introduced in the equations of motion through Lagrange multipliers. One can treat the set of Lagrange multipliers as parameters that can be determined by SHAKE [3], and one still has a kind of Hamiltonian description involving these parameters [4]. Alternatively, one can explicitly determine the Lagrange multipliers as functions of the phase space coordinates; in this case the equations of motion are non-Hamiltonian and are characterized by a non-zero phase space compressibility [5, 6]. Consequently, such constrained systems are a special case of general non-Hamiltonian systems, whose statistical mechanical formulation has recently been a topic of considerable interest.

The statistical mechanical methods that have been developed for non-Hamiltonian flows [7] can be used to formulate a non-equilibrium statistical mechanics of constrained molecular systems. With such a formulation in hand, a response theory can be developed to investigate linear and non-linear perturbations of equilibrium states, thus providing the link between microscopic dynamics and macroscopic non-equilibrium properties. In this chapter we show how this program can be carried out.

In simulations, when external forces are applied to the system, the equations of motion must be supplemented with a thermostatting mechanism to compensate for the input of energy from the external forces. The resulting thermostatted equations are non-Hamiltonian in character. While, for simplicity, we do not explicitly consider the thermostat in the formulation presented below, the techniques we describe can also be extended to treat this more general situation.
1. Non-Hamiltonian Equations of Motion with Constraints
Consider an $N$-particle system with coordinates $\mathbf r_i$ and momenta $\mathbf p_i$ and Hamiltonian $H_0$,

$$H_0 = \sum_{i=1}^{N} \frac{\mathbf p_i^2}{2m_i} + V(\mathbf r), \tag{1}$$

where $V(\mathbf r)$ is the potential energy. We let phase space coordinate labels without particle indices stand for the entire set of coordinates, $(\mathbf r, \mathbf p) = (\mathbf r_1, \mathbf r_2, \ldots, \mathbf r_N, \mathbf p_1, \mathbf p_2, \ldots, \mathbf p_N)$. When we wish to refer to these variables collectively, we use the notation $\mathbf x = (\mathbf r, \mathbf p)$. We suppose that the system is subjected to holonomic constraints

$$\sigma_\alpha(\mathbf r) = 0, \qquad \alpha = 1, \ldots, \ell. \tag{2}$$
The $\sigma_\alpha$ could be the bond constraints mentioned above, or any other holonomic constraint such as a reaction coordinate constraint imposed in the simulation of rare reactive events [8]. The constrained equations of motion are

$$\dot{\mathbf r}_i = \frac{\mathbf p_i}{m_i}, \qquad \dot{\mathbf p}_i = \mathbf F_i - \lambda_\alpha \nabla_i \sigma_\alpha(\mathbf r), \tag{3}$$

where $\mathbf F_i = -\nabla_i V$ is the force on particle $i$ due to the potential energy and the second term represents the constraint forces with Lagrange multipliers $\lambda_\alpha$. We use the Einstein summation convention on the Greek indices. Equivalently, we may write this pair of equations as a single second-order equation,

$$m_i \ddot{\mathbf r}_i = \mathbf F_i - \lambda_\alpha \nabla_i \sigma_\alpha(\mathbf r). \tag{4}$$
Since $\sigma_\alpha$ is constrained at all times, $\dot\sigma_\alpha = \sum_i (\mathbf p_i/m_i) \cdot \nabla_i \sigma_\alpha = 0$. Typically, such equations are solved in molecular dynamics simulations using the SHAKE algorithm. However, in order to carry out the statistical mechanical treatment of such constrained systems it is more convenient to formulate the problem in a form where its non-Hamiltonian character is evident. To this end we first determine an explicit expression for the Lagrange multipliers. They can be found by differentiating the constraint $\dot\sigma_\alpha = 0$ with respect to time and using Eq. (4) to yield

$$\ddot\sigma_\alpha = \frac{d}{dt}\sum_i \frac{\mathbf p_i}{m_i}\cdot\nabla_i\sigma_\alpha = \sum_i \ddot{\mathbf r}_i\cdot\nabla_i\sigma_\alpha + \sum_{i,j}\dot{\mathbf r}_i\dot{\mathbf r}_j : \nabla_i\nabla_j\sigma_\alpha = \sum_i\left(\frac{\mathbf F_i}{m_i} - \frac{\lambda_\beta}{m_i}\nabla_i\sigma_\beta\right)\cdot\nabla_i\sigma_\alpha + \sum_{i,j}\frac{\mathbf p_i\mathbf p_j}{m_i m_j} : \nabla_i\nabla_j\sigma_\alpha = 0. \tag{5}$$
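Solving this equation for $\lambda_\alpha$ we find

$$\lambda_\alpha = \lambda_\alpha(\mathbf x) = \left[\sum_i \frac{\mathbf F_i}{m_i}\cdot\nabla_i\sigma_\beta + \sum_{i,j}\frac{\mathbf p_i\mathbf p_j}{m_i m_j} : \nabla_i\nabla_j\sigma_\beta\right](\mathbf Z^{-1})_{\beta\alpha}, \tag{6}$$

where

$$Z_{\alpha\beta} = \sum_i \frac{1}{m_i}(\nabla_i\sigma_\alpha)\cdot(\nabla_i\sigma_\beta). \tag{7}$$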
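As an illustration of Eqs. (6) and (7), the following is a minimal sketch for the simplest case of a single bond constraint $\sigma = |\mathbf r_1 - \mathbf r_2|^2 - d^2$ between two particles. The function names, the choice of constraint, and the numerical values are assumptions made for this example only. For this constraint $\nabla_1\sigma = 2\mathbf r_{12} = -\nabla_2\sigma$, so the double sum in Eq. (6) reduces to $2|\mathbf v_1 - \mathbf v_2|^2$; the sketch then checks that the resulting multiplier makes $\ddot\sigma$ vanish, as required by Eq. (5).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def bond_multiplier(r1, r2, p1, p2, f1, f2, m1, m2):
    """Eqs. (6)-(7) for one constraint sigma = |r1 - r2|^2 - d^2 (hypothetical helper)."""
    r12 = sub(r1, r2)
    v12 = sub([p / m1 for p in p1], [p / m2 for p in p2])
    # Z = sum_i (1/m_i) |grad_i sigma|^2, with grad_1 sigma = 2 r12 = -grad_2 sigma
    z = 4.0 * dot(r12, r12) * (1.0 / m1 + 1.0 / m2)
    # sum_i (F_i / m_i) . grad_i sigma
    force_term = 2.0 * dot(r12, sub([f / m1 for f in f1], [f / m2 for f in f2]))
    # sum_ij (p_i p_j / m_i m_j) : grad_i grad_j sigma = 2 |v1 - v2|^2
    velocity_term = 2.0 * dot(v12, v12)
    return (force_term + velocity_term) / z

def sigma_ddot(r1, r2, p1, p2, f1, f2, m1, m2, lam):
    """Second time derivative of sigma under the constrained dynamics, Eq. (5)."""
    r12 = sub(r1, r2)
    v12 = sub([p / m1 for p in p1], [p / m2 for p in p2])
    a1 = [(f - lam * 2.0 * x) / m1 for f, x in zip(f1, r12)]
    a2 = [(f + lam * 2.0 * x) / m2 for f, x in zip(f2, r12)]
    return 2.0 * dot(r12, sub(a1, a2)) + 2.0 * dot(v12, v12)

# Illustrative state: relative velocity perpendicular to the bond, so sigma_dot = 0.
m1, m2 = 1.0, 2.0
r1, r2 = [0.5, 0.0, 0.0], [-0.5, 0.0, 0.0]
p1, p2 = [0.0, 1.0, 0.2], [0.0, -0.5, 0.1]
f1, f2 = [0.3, -0.1, 0.0], [-0.2, 0.4, 0.1]

lam = bond_multiplier(r1, r2, p1, p2, f1, f2, m1, m2)
residual = sigma_ddot(r1, r2, p1, p2, f1, f2, m1, m2, lam)
print("lambda =", lam, " sigma_ddot =", residual)
```

With several constraints the scalar division by `z` would be replaced by the solution of the linear system defined by the matrix $\mathbf Z$.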
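Using this explicit form of the Lagrange multiplier, the resulting equations of motion,

$$\dot{\mathbf r}_i = \frac{\mathbf p_i}{m_i}, \qquad \dot{\mathbf p}_i = \mathbf F_i - \lambda_\alpha(\mathbf x)\nabla_i\sigma_\alpha(\mathbf r), \tag{8}$$

are no longer in Hamiltonian form and represent a motion with the constraints as conserved quantities [5, 6]. The phase space flow is compressible and the compressibility is given by

$$\kappa = \nabla_{\mathbf x}\cdot\dot{\mathbf x} = -\sum_{i=1}^{N}\frac{\partial\lambda_\alpha(\mathbf x)}{\partial\mathbf p_i}\cdot\nabla_i\sigma_\alpha(\mathbf r) = -2\sum_i\frac{1}{m_i}(\nabla_i\dot\sigma_\alpha)\cdot(\nabla_i\sigma_\beta)(\mathbf Z^{-1})_{\beta\alpha} = -\frac{d\ln Z}{dt}, \tag{9}$$

where $Z = \det\mathbf Z$.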
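A short worked example may help to show when the compressibility in Eq. (9) is non-zero. The sketch below assumes a bent triatomic molecule with two bond constraints of equal length $d$ sharing the central atom 2, and bond angle $\theta$ at that atom; this choice of molecule and the symbols $d$ and $\theta$ are assumptions introduced for illustration. The expressions hold on the constraint surface.

```latex
% Constraints: \sigma_1 = |\mathbf r_1-\mathbf r_2|^2 - d^2, \quad \sigma_2 = |\mathbf r_2-\mathbf r_3|^2 - d^2
\begin{align*}
Z_{11} &= 4d^2\left(\frac{1}{m_1}+\frac{1}{m_2}\right), \qquad
Z_{22} = 4d^2\left(\frac{1}{m_2}+\frac{1}{m_3}\right), \\
Z_{12} &= Z_{21} = \frac{1}{m_2}(\nabla_2\sigma_1)\cdot(\nabla_2\sigma_2)
        = \frac{4}{m_2}(\mathbf r_1-\mathbf r_2)\cdot(\mathbf r_3-\mathbf r_2)
        = \frac{4d^2}{m_2}\cos\theta, \\
Z &= \det\mathbf Z = 16 d^4\left[\left(\frac{1}{m_1}+\frac{1}{m_2}\right)\left(\frac{1}{m_2}+\frac{1}{m_3}\right)-\frac{\cos^2\theta}{m_2^2}\right], \\
\kappa &= -\frac{d\ln Z}{dt} = -\frac{32\, d^4}{m_2^2\, Z}\,\sin\theta\cos\theta\;\dot\theta .
\end{align*}
```

For a single rigid bond $Z = 4d^2(1/m_1 + 1/m_2)$ is constant on the constraint surface and $\kappa$ vanishes there; in the triatomic case $Z$ varies with the unconstrained bond angle, so the flow is compressible whenever $\dot\theta \neq 0$.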
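The constrained phase space flow in Eq. (8) may be generated by the action of the Liouville operator,

$$iL_0 = \dot{\mathbf x}\cdot\nabla_{\mathbf x} = \sum_{i=1}^{N}\left[\frac{\mathbf p_i}{m_i}\cdot\frac{\partial}{\partial\mathbf r_i} + \mathbf F_i\cdot\frac{\partial}{\partial\mathbf p_i} - \lambda_\alpha(\mathbf x)(\nabla_i\sigma_\alpha(\mathbf r))\cdot\frac{\partial}{\partial\mathbf p_i}\right], \tag{10}$$

on the phase space variables. More generally, the evolution of any dynamical variable, $B(\mathbf x)$, is given by

$$\frac{dB(\mathbf x(t))}{dt} = iL_0 B(\mathbf x(t)), \tag{11}$$

whose formal solution is

$$B(\mathbf x(t)) = e^{iL_0 t}B(\mathbf x). \tag{12}$$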
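We now wish to discuss the statistical mechanics of such a system. The existence of a phase space compressibility has implications for the nature of the phase space flow and the computation of statistical properties [6, 7]. The phase space volume element at time $t_0$, $d\mathbf x_{t_0}$, transforms into $d\mathbf x_t = J(\mathbf x_t; \mathbf x_{t_0})\,d\mathbf x_{t_0}$ at time $t$, where $J(\mathbf x_t; \mathbf x_{t_0}) = \det\mathbf J$ and the matrix $\mathbf J$ has elements $J_{ij} = \partial x_t^i/\partial x_{t_0}^j$. Using the fact that $J = \det\mathbf J = \exp(\mathrm{Tr}\ln\mathbf J)$, one may derive an equation of motion by differentiating this expression for $J$ to find

$$\frac{dJ(\mathbf x_t; \mathbf x_{t_0})}{dt} = \mathrm{Tr}\left(\mathbf J^{-1}\frac{d\mathbf J}{dt}\right)J = \sum_{i,j}\frac{\partial x_{t_0}^j}{\partial x_t^i}\frac{\partial\dot x_t^i}{\partial x_{t_0}^j}\,J = \sum_i\frac{\partial\dot x_t^i}{\partial x_t^i}\,J = \kappa(\mathbf x_t)\,J(\mathbf x_t; \mathbf x_{t_0}). \tag{13}$$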
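Integrating this equation and using the explicit expression for $\kappa$ given above, one may show that the Jacobian takes the form $J(\mathbf x_t; \mathbf x_{t_0}) = Z(\mathbf r_{t_0})/Z(\mathbf r_t)$. Consequently, we see that $Z(\mathbf r_t)\,d\mathbf x_t = Z(\mathbf r_{t_0})\,d\mathbf x_{t_0}$ and $d\mu(\mathbf r, \mathbf p) = Z(\mathbf r)\,d\mathbf r\,d\mathbf p$ is the invariant measure for the phase space flow.

Next we consider the time evolution of the phase space distribution function $f(\mathbf x)$, where $\int_{\mathcal V} d\mu(\mathbf x)\,f(\mathbf x) = \int_{\mathcal V} d\mathbf x\,Z(\mathbf r)f(\mathbf x) \equiv \int_{\mathcal V} d\mathbf x\,\rho(\mathbf x)$ is the fraction of systems in the phase space volume $\mathcal V$. The phase space flow is conserved, so that $\rho(\mathbf x) = Z(\mathbf r)f(\mathbf x)$ satisfies the continuity equation,

$$\frac{\partial\rho(\mathbf x, t)}{\partial t} + \nabla_{\mathbf x}\cdot(\dot{\mathbf x}\rho(\mathbf x, t)) = 0, \tag{14}$$

and, therefore,

$$\frac{\partial\rho(\mathbf x, t)}{\partial t} = -(\dot{\mathbf x}\cdot\nabla_{\mathbf x} + \nabla_{\mathbf x}\cdot\dot{\mathbf x})\rho(\mathbf x, t) = -(iL_0 + \kappa)\rho(\mathbf x, t). \tag{15}$$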
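We now want to derive the evolution equation for $f(\mathbf x, t)$. To this end we first note that we may again use the identity $Z = \det\mathbf Z = \exp(\mathrm{Tr}\ln\mathbf Z)$ and the fact that $Z$ depends only on the coordinates to compute

$$iL_0 Z = \sum_i\frac{\mathbf p_i}{m_i}\cdot\nabla_i Z = \sum_i\frac{\mathbf p_i}{m_i}\cdot(\nabla_i Z_{\alpha\beta})(\mathbf Z^{-1})_{\beta\alpha}Z = \sum_j\frac{1}{m_j}\left[(\nabla_j\dot\sigma_\alpha)\cdot(\nabla_j\sigma_\beta) + (\nabla_j\sigma_\alpha)\cdot(\nabla_j\dot\sigma_\beta)\right](\mathbf Z^{-1})_{\beta\alpha}Z = -\kappa Z, \tag{16}$$

where we have used Eq. (9) and the fact that $\mathbf Z$ is a symmetric matrix to relate the expression in the penultimate equality to the compressibility. Then, inserting the definition $\rho(\mathbf x) = Z(\mathbf r)f(\mathbf x)$ into Eq. (15) and using the result $iL_0 Z = -\kappa Z$, we find the Liouville equation for $f(\mathbf x)$,

$$\frac{df(\mathbf x, t)}{dt} = \frac{\partial f(\mathbf x, t)}{\partial t} + iL_0 f(\mathbf x, t) = 0. \tag{17}$$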
The equations of motion (8) have $H_0$, $\sigma_\alpha$ and $\dot\sigma_\alpha$ as constants of the motion. Consequently, the equilibrium density is given by

$$f_{eq}(\mathbf x) = \Omega(E)^{-1}\,\delta(H_0 - E)\prod_\alpha\delta(\sigma_\alpha)\,\delta(\dot\sigma_\alpha), \tag{18}$$

where $\Omega(E)$ is a normalizing factor and $E$ is the energy of the microcanonical system. In non-equilibrium statistical mechanics the average value of a dynamical variable at time $t$ is given by the integral over the phase space measure of the phase space probability density times the dynamical variable,

$$\bar B(t) = \int d\mu(\mathbf x)\,B(\mathbf x)f(\mathbf x, t) = \int d\mu(\mathbf x)\,B(\mathbf x)\,e^{-iL_0 t}f(\mathbf x). \tag{19}$$
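We may transfer the action of the evolution operator to the dynamical variable. To do this we first use the following identity for any phase space functions $A(\mathbf x)$ and $B(\mathbf x)$, which is obtained by integrating by parts:

$$\int d\mathbf x\,B(\mathbf x)\,iL_0 A(\mathbf x) = \int d\mathbf x\left[\left(-iL_0 + \sum_i\frac{\partial\lambda_\alpha(\mathbf x)}{\partial\mathbf p_i}\cdot(\nabla_i\sigma_\alpha)\right)B(\mathbf x)\right]A(\mathbf x) = -\int d\mathbf x\,((iL_0 + \kappa)B(\mathbf x))A(\mathbf x). \tag{20}$$

The last equality was obtained using

$$\sum_i\frac{\partial\lambda_\alpha(\mathbf x)}{\partial\mathbf p_i}\cdot(\nabla_i\sigma_\alpha) = -\kappa. \tag{21}$$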
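Making use of this result we may also show that

$$\int d\mu(\mathbf x)\,B(\mathbf x)\,iL_0 A(\mathbf x) = \int d\mathbf x\,Z(\mathbf r)B(\mathbf x)\,iL_0 A(\mathbf x) = -\int d\mathbf x\,((iL_0 + \kappa)(Z(\mathbf r)B(\mathbf x)))A(\mathbf x) = -\int d\mathbf x\,(iL_0 B(\mathbf x))Z(\mathbf r)A(\mathbf x), \tag{22}$$

where the last equality follows from the fact that $iL_0(ZA) = -\kappa ZA + Z\,iL_0 A$, again using $iL_0 Z = -\kappa Z$. Thus, expanding the propagator $\exp(-iL_0 t)$ in Eq. (19) as a power series, integrating by parts term by term, using the above identities and finally resumming, we obtain for $\bar B(t)$ in Eq. (19),

$$\bar B(t) = \int d\mu(\mathbf x)\,(e^{iL_0 t}B(\mathbf x))f(\mathbf x) = \int d\mu(\mathbf x)\,B(\mathbf x(t))f(\mathbf x). \tag{23}$$

Thus, we have the result

$$\int d\mu(\mathbf x)\,B(\mathbf x)\,e^{-iL_0 t}f(\mathbf x) = \int d\mu(\mathbf x)\,(e^{iL_0 t}B(\mathbf x))f(\mathbf x), \tag{24}$$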
which shows that when the scalar product is defined with respect to the measure $d\mu(\mathbf x)$, the Liouville operator $L_0$ is self-adjoint. Alternatively, we may write the right hand side of Eq. (24) and then integrate by parts, using the properties discussed above, to obtain the equivalent formulas,

$$\bar B(t) = \int d\mathbf x\,B(\mathbf x)\rho(\mathbf x, t) = \int d\mathbf x\,B(\mathbf x)\,e^{-(iL_0 + \kappa)t}\rho(\mathbf x) = \int d\mathbf x\,(e^{iL_0 t}B(\mathbf x))\rho(\mathbf x) = \int d\mu(\mathbf x)\,(e^{iL_0 t}B(\mathbf x))f(\mathbf x). \tag{25}$$

This result shows that if the scalar product is defined with respect to integration over the phase space coordinates, and not the invariant measure, the adjoint of $iL_0$ is $-(iL_0 + \kappa)$ and the operator is not self-adjoint.

The development we have presented contains the standard Hamiltonian description of statistical mechanics if the constraints are not present. In this case the terms involving the explicit forms of the Lagrange multipliers no longer appear and the equations of motion adopt a Hamiltonian form. The metric factor is $Z(\mathbf r) = 1$; consequently $d\mathbf x_t = d\mathbf x_{t_0}$ and the Liouville operator is self-adjoint with respect to this simple metric. We may now use this Liouville formulation of the dynamics of constrained systems to analyze how the system responds when external forces are applied.
2. Response Theory
We next examine how the constrained non-Hamiltonian system responds to external time dependent forces. In the presence of such external forces the equations of motion take the general form,

$$\dot{\mathbf r}_i = \frac{\mathbf p_i}{m_i} + \mathbf C_i^r(\mathbf x)F(t), \qquad \dot{\mathbf p}_i = \mathbf F_i - \lambda_\alpha(\mathbf x, t)\nabla_i\sigma_\alpha(\mathbf r) + \mathbf C_i^p(\mathbf x)F(t). \tag{26}$$

We write this set of equations compactly as

$$\dot{\mathbf x} = \mathbf G(\mathbf x, t), \tag{27}$$

where we have indicated the explicit time dependence in $\lambda_\alpha(\mathbf x, t)$ and $\mathbf G(\mathbf x, t)$ arising from the external force. Since the Lagrange multipliers must be determined in the presence of the external forces, they acquire explicit time dependence. In the general case we are considering, the external forces are not assumed to be derived from a generalized potential; i.e., there is no function $\mathcal V(\mathbf r, \mathbf p)$ such that $\mathbf C_i^r = \partial\mathcal V/\partial\mathbf p_i$ and $\mathbf C_i^p = -\partial\mathcal V/\partial\mathbf r_i$. We do assume that $\mathbf C^T(\mathbf x) = (\mathbf C^r, \mathbf C^p)$, where $T$ stands for the transpose, satisfies the incompressibility condition, $\nabla_{\mathbf x}\cdot\mathbf C = 0$. This latter condition guarantees that, even in the presence of the external forces, the compressibility arises only from the Lagrange multipliers, which impose the constraints, and is still given by

$$\kappa(t) = \nabla_{\mathbf x}\cdot\dot{\mathbf x} = -\sum_{i=1}^{N}\frac{\partial\lambda_\alpha(\mathbf x, t)}{\partial\mathbf p_i}\cdot\nabla_i\sigma_\alpha(\mathbf r). \tag{28}$$
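The Liouville operator that generates these equations of motion is $iL(t) = \dot{\mathbf x}\cdot\nabla_{\mathbf x} = \mathbf G(\mathbf x, t)\cdot\nabla_{\mathbf x}$, or, more explicitly,

$$iL(t) = \sum_{i=1}^{N}\left[\frac{\mathbf p_i}{m_i}\cdot\frac{\partial}{\partial\mathbf r_i} + (\mathbf F_i - \lambda_\alpha(\mathbf x, t)(\nabla_i\sigma_\alpha(\mathbf r)))\cdot\frac{\partial}{\partial\mathbf p_i}\right] + \sum_{i=1}^{N}\left[\mathbf C_i^r\cdot\frac{\partial}{\partial\mathbf r_i} + \mathbf C_i^p\cdot\frac{\partial}{\partial\mathbf p_i}\right]F(t). \tag{29}$$

We compute the response of the system to the external force by measuring the average value of a dynamical variable $B(\mathbf x)$ as (cf. Eq. (25))

$$\bar B(t) = \int d\mathbf x\,B(\mathbf x)\rho(\mathbf x, t), \tag{30}$$

where $\rho(\mathbf x, t)$ again satisfies the continuity Eq. (14), which now takes the form

$$\frac{\partial\rho(\mathbf x, t)}{\partial t} = -(\dot{\mathbf x}\cdot\nabla_{\mathbf x} + \nabla_{\mathbf x}\cdot\dot{\mathbf x})\rho(\mathbf x, t) = -(iL(t) + \kappa(t))\rho(\mathbf x, t). \tag{31}$$

The compressibility $\kappa(t)$ now also depends explicitly on time since the Lagrange multipliers appear in its expression. If we integrate Eq. (31) from some initial time $t_0$ to time $t$ and solve the resulting integral equation by iteration we obtain

$$\rho(\mathbf x, t) = \rho(\mathbf x, t_0) - \int_{t_0}^{t}dt_1\,(iL(t_1) + \kappa(t_1))\rho(\mathbf x, t_0) + \int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\,(iL(t_1) + \kappa(t_1))(iL(t_2) + \kappa(t_2))\rho(\mathbf x, t_0) + \cdots. \tag{32}$$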
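The series in Eq. (32) can be written formally as

$$\rho(\mathbf x, t) = U^\dagger(t, t_0)\rho_0(\mathbf x), \tag{33}$$

where $\rho_0(\mathbf x) = \rho(\mathbf x, t_0)$ and the propagator $U^\dagger(t, t_0)$ is defined by

$$U^\dagger(t, t_0) = \mathcal T\exp\left[-\int_{t_0}^{t}d\tau\,(iL(\tau) + \kappa(\tau))\right], \tag{34}$$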
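where $\mathcal T$ is the time-ordering operator. For any two phase space functions $A(\mathbf x)$ and $B(\mathbf x)$ we have the analog of Eq. (20):

$$-\int d\mathbf x\,B(\mathbf x)(iL(t) + \kappa(t))A(\mathbf x) = \int d\mathbf x\,(iL(t)B(\mathbf x))A(\mathbf x). \tag{35}$$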
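Consequently, we may substitute the series solution (32) for $\rho(\mathbf x, t)$ into Eq. (30) and integrate by parts term by term, using Eq. (35), to obtain

$$\bar B(t) = \int d\mathbf x\,B(\mathbf x)\rho(\mathbf x, t) = \int d\mathbf x\left[B(\mathbf x) + \int_{t_0}^{t}dt_1\,iL(t_1)B(\mathbf x) + \int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\,iL(t_1)\,iL(t_2)B(\mathbf x) + \cdots\right]\rho_0(\mathbf x). \tag{36}$$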
This series defines the evolution operator $U(t, t_0)$,

$$U(t, t_0) = \mathcal T\exp\left[\int_{t_0}^{t}d\tau\,iL(\tau)\right], \tag{37}$$

which is the formal solution of the equation of motion,

$$\frac{dU(t, t_0)}{dt} = iL(t)\,U(t, t_0). \tag{38}$$
The propagator $U(t, t_0)$ is the adjoint of $U^\dagger(t, t_0)$. As a result of these considerations we may write

$$\bar B(t) = \int d\mathbf x\,B(\mathbf x)\,U^\dagger(t, t_0)\rho_0(\mathbf x) = \int d\mathbf x\,(U(t, t_0)B(\mathbf x))\rho_0(\mathbf x). \tag{39}$$

This formula provides a means to determine the non-equilibrium macroscopic value of the dynamical variable $B(\mathbf x)$ by considering its evolution under the fully perturbed dynamics and taking the phase space average over the arbitrary initial preparation of the system described by $\rho_0(\mathbf x)$. If the initial distribution is taken to be the equilibrium distribution in the absence of the perturbing field, $\rho_{eq}(\mathbf x) = Z(\mathbf r)f_{eq}(\mathbf x)$, then the problem has a well-defined structure. Inserting the initial equilibrium distribution into Eq. (39) we find

$$\bar B(t) = \int d\mathbf x\,(U(t, t_0)B(\mathbf x))\rho_{eq}(\mathbf x) = \int d\mu(\mathbf x)\,(U(t, t_0)B(\mathbf x))f_{eq}(\mathbf x) \equiv \langle U(t, t_0)B(\mathbf x)\rangle_{eq}, \tag{40}$$
where the measure $d\mu(\mathbf x)$ is that for the unperturbed system discussed in the previous section. From this equation we see that, for systems initially at equilibrium, non-equilibrium properties may be obtained from the equilibrium ensemble average of the observable evolved under the full perturbed dynamics due to the external force. This equation expresses some fundamental features of non-equilibrium statistical mechanics. It is an expression of Onsager's regression hypothesis, which relates the decay of macroscopic observables to the regression of fluctuations about the equilibrium state [9], and it has been exploited in NEMD simulations to take dynamical averages out of equilibrium [10].

In the limiting case where the equations of motion are Hamiltonian in form and the external perturbation arises from a potential $V_I(t) = A(\mathbf x)F(t)$, Eq. (40) reduces to the microcanonical version of Kubo's linear response result [11] in the limit of weak perturbations. To see this we note that the perturbed Liouville operator in Eq. (29) reduces to the following usual form for this simple Hamiltonian case:

$$iL(t) = \sum_{i=1}^{N}\left[\frac{\mathbf p_i}{m_i}\cdot\frac{\partial}{\partial\mathbf r_i} + \mathbf F_i\cdot\frac{\partial}{\partial\mathbf p_i}\right] + F(t)\sum_{i=1}^{N}\left[\frac{\partial A(\mathbf x)}{\partial\mathbf p_i}\cdot\frac{\partial}{\partial\mathbf r_i} - \frac{\partial A(\mathbf x)}{\partial\mathbf r_i}\cdot\frac{\partial}{\partial\mathbf p_i}\right] \equiv iL_0 + iL_I(t). \tag{41}$$

Using the decomposition of $iL(t)$ into unperturbed and perturbed parts and the expression $U_0(t, t_0) = \exp(iL_0(t - t_0))$ for the propagator of the unperturbed system, which follows from Eq. (37) with $iL(\tau) = iL_0$, we may write Eq. (38) in the form of a Dyson equation for $U(t, t_0)$:

$$U(t, t_0) = U_0(t, t_0) + \int_{t_0}^{t}d\tau\,U_0(\tau, t_0)\,iL_I(\tau)\,U(t, \tau). \tag{42}$$
If we then insert this expression for $U(t, t_0)$, truncated to first order in the external force, into Eq. (40), we obtain

$$\bar B(t) = \langle B(\mathbf x)\rangle_{eq} + \int_{t_0}^{t}d\tau\int d\mathbf x\,(U_0(t, \tau)B(\mathbf x))\,\dot A(\mathbf x)\,\frac{df_{eq}(\mathbf x)}{dH_0}\,F(\tau), \tag{43}$$

where we have used the fact that $f_{eq}(\mathbf x) = \Omega(E)^{-1}\delta(H_0 - E)$ is a function of the phase space coordinates through the Hamiltonian. Finally, using the identity $g(z)\delta'(z - a) = -g'(a)\delta(z - a)$, and the fact that the microcanonical partition function is related to the entropy by $\Omega(E) = \exp(S(E)/k_B)$, we have $df_{eq}(\mathbf x)/dH_0 = -\beta f_{eq}(\mathbf x)$. Thus, we obtain the standard result

$$\Delta\bar B(t) = -\beta\int_{t_0}^{t}d\tau\,\langle(U_0(t, \tau)B(\mathbf x))\,\dot A(\mathbf x)\rangle_{eq}\,F(\tau). \tag{44}$$
Here $\Delta\bar B(t) = \bar B(t) - \langle B(\mathbf x)\rangle_{eq}$ is the deviation from the equilibrium average value.

However, the linear response of constrained systems to perturbations, either with Hamiltonian or non-Hamiltonian structure, is not simple. The external forces enter in the Lagrange multipliers that appear in the non-Hamiltonian equations of motion as well as in the explicit forces that couple the system to the external fields. Consequently, the form of the perturbation in the Liouville operator is complicated and unfamiliar terms appear in the linear response formulas. In addition, the equilibrium distribution function in Eq. (18) does not depend solely on the Hamiltonian but contains delta function contributions from the conserved constraint variables. As a result, some of the standard manipulations that are often carried out in linear response theory to obtain the response as a physically interesting correlation function in Eq. (43), such as those that give $df_{eq}(\mathbf x)/dH_0 = -\beta f_{eq}(\mathbf x)$, may no longer be carried out. For instance, even if the perturbation is of the form of $V_I(t)$ given above, the linear response of the constrained system is not simple because the form of the equilibrium density precludes a reduction to that in Eq. (44). These technical difficulties with the linear response derivation do not detract from the computational utility of Eq. (40), which forms the starting point for investigating the linear or non-linear response of either Hamiltonian or non-Hamiltonian systems by NEMD. Below, we discuss how such simulations may be carried out.
3. Simulation of Non-Equilibrium Systems
The dynamical response of a system subjected to the general time-dependent external force in Eq. (26) and initially in an equilibrium state of the unperturbed system can be computed from Eq. (40). To do this one samples phase space configurations along an equilibrium trajectory of the unperturbed system. For independent initial configurations extracted from this trajectory, one evolves the dynamical variable $B(\mathbf x)$ using Eq. (26) under the full perturbed dynamics for a time $t$. The ensemble average over these trajectory segments yields $\bar B(t)$. The method based on Eq. (40) is called the dynamical approach to non-equilibrium molecular dynamics.

In carrying out such NEMD simulations, one needs perturbation strengths that are huge on the macroscopic scale in order to produce a response that is larger than the statistical noise. From such a simulation it is difficult to obtain information in the region of linear behavior. Consequently, an extrapolation to small perturbation strengths is required in order to compare the numerical results with those of linear response theory. For example, consider the mobility of an ion with mass $m$ immersed in a fluid. In the absence of an external field the average ion velocity is zero and its variance is $k_B T/m$. The typical velocity of the ion is $(k_B T/m)^{1/2}$, and sampling of 100 independent configurations will yield an estimate of the zero average value with a statistical error of about $(k_B T/m)^{1/2}/10$, which is still a large number. If we wish to apply an external force to push the ion and to compute its drift velocity, the drift velocity should be significantly larger than the noise, $\langle v_{ion}\rangle_{neq} \gg 10^{-1}(k_B T/m)^{1/2}$. In order to fulfill this condition a huge field strength is required and one can no longer guarantee that the linear regime is being investigated.
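The sampling estimate in this example can be made explicit with a short worked calculation. This is only an illustrative sketch: the number of independent samples $n$, the target relative precision $\epsilon$, and the example drift velocity $u_D$ are assumptions introduced here, not values taken from the text.

```latex
\begin{align*}
% standard error of a direct average over n independent configurations
\delta v &\simeq \frac{(k_B T/m)^{1/2}}{\sqrt{n}}, \qquad
n = 100 \;\Rightarrow\; \delta v \simeq \frac{(k_B T/m)^{1/2}}{10}, \\
% samples needed to resolve a drift velocity u_D with relative precision \epsilon
\frac{\delta v}{u_D} \le \epsilon \;&\Rightarrow\; n \ge \frac{k_B T/m}{\epsilon^2 u_D^2}, \\
% assumed example: u_D = 10^{-3}(k_B T/m)^{1/2}, \epsilon = 0.1
n &\ge \frac{1}{(0.1)^2\,(10^{-3})^2} = 10^{8}.
\end{align*}
```

Under these assumed numbers a direct average would need of the order of $10^8$ independent trajectory segments, which indicates why a weak-field response is hard to resolve without some form of noise reduction.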
A solution to this problem was obtained by using the subtraction technique [12], a method that permits one to decrease the noise of the response. If we consider evolution under $U_0(t, t_0)$, the propagator of the unperturbed system, then, since the equilibrium distribution is stationary under this evolution, we have $U_0^\dagger(t, t_0)f_{eq}(\mathbf x) = f_{eq}(\mathbf x)$ and, for an observable whose equilibrium average vanishes, $\langle U_0(t, t_0)B(\mathbf x)\rangle_{eq} = \langle B(\mathbf x)\rangle_{eq} = 0$. Therefore we may write Eq. (40) in the form,

$$\bar B(t) = \langle U(t, t_0)B(\mathbf x) - U_0(t, t_0)B(\mathbf x)\rangle_{eq}. \tag{45}$$

The dynamical variable inside the angle brackets has the same average value as that in Eq. (40) but the variance is significantly different. This is easily seen for a time-impulsive perturbation at time $t \to t_0^+$:

$$\mathrm{Var}[U(t, t_0)B(\mathbf x) - U_0(t, t_0)B(\mathbf x)] = \mathrm{Var}[U(t, t_0)B(\mathbf x)] + \mathrm{Var}[U_0(t, t_0)B(\mathbf x)] - 2\,\mathrm{Cov}[U(t, t_0)B(\mathbf x), U_0(t, t_0)B(\mathbf x)]. \tag{46}$$
Using $\mathrm{Cov}[X, Y] = (\mathrm{Var}[X]\,\mathrm{Var}[Y])^{1/2}\gamma$, with the correlation coefficient $|\gamma| \le 1$, and noticing that for $t$ close to $t_0$ the correlation coefficient of the two microscopic fluxes is equal to $1 + O(F)$, one finds from Eq. (46) that the leading term of the variance of the difference between the two fluxes is

$$\mathrm{Var}\left[U(t \to t_0^+, t_0)B(\mathbf x) - U_0(t \to t_0^+, t_0)B(\mathbf x)\right] = \left(\mathrm{SD}\left[U(t \to t_0^+, t_0)B(\mathbf x)\right] - \mathrm{SD}\left[U_0(t \to t_0^+, t_0)B(\mathbf x)\right]\right)^2 + O(F) = O(F), \tag{47}$$

where SD stands for the standard deviation. This result applies only for $t \approx t_0$ and the variance will generally grow exponentially as $t$ increases. In many situations, though, the desired results can be obtained using a time range compatible with this divergence of the variance.

A simple illustration of the subtraction technique is provided by the mobility of a charged particle in an atomic fluid of Lennard–Jones particles. In this case the system has no constraints and we may use the simpler limiting forms of the equations presented above. The interaction between the charged particle and the neutral bath atoms is given by the Lennard–Jones potential plus a charge-induced dipole term, $V_D(r) = V_{LJ}(r) - \frac{1}{2}\alpha_P e^2 r^{-4}$, where $\alpha_P$ is the atomic polarizability and $e$ the electric charge. (See Ref. [12] for details.) To calculate the mobility of the ion we take $B(\mathbf x) = \mathbf v_c$, the velocity of the ion, and use Eq. (45). The molecular dynamics runs are broken into segments and two trajectories of the particles are computed in each segment, starting from the same initial configuration: one trajectory evolves in the absence of the external force, and a constant force $F$ of order 1 eV cm$^{-1}$ is applied to the charged particle in the other trajectory. The drift velocity $\mathbf u_D$ of the charged particle induced by the applied field is computed as a function of time simply by calculating the difference of the particle's velocity in the perturbed and unperturbed trajectories, averaged over all segments of the run. One obtains

$$\mathbf u_D \equiv \bar{\mathbf v}_c(t) = \langle U(t, 0)\mathbf v_c - U_0(t, 0)\mathbf v_c\rangle_{eq}. \tag{48}$$
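The two-trajectory procedure behind Eqs. (45) and (48) can be sketched on a toy model. The example below is not the ion-in-fluid calculation of Ref. [12]: it assumes a single unit-mass particle in a one-dimensional anharmonic well, Gaussian initial conditions standing in for proper equilibrium sampling, and a velocity Verlet integrator; all function names and parameter values are hypothetical. It compares the scatter of the direct estimator (perturbed velocity alone) with that of the subtracted estimator (perturbed minus unperturbed velocity from the same initial condition).

```python
import random
import statistics

def force(x, ext):
    # -dU/dx for the assumed potential U(x) = x^2/2 + x^4/4, plus a constant external force
    return -x - x**3 + ext

def evolve(x, v, ext, dt, nsteps):
    """Velocity Verlet for a unit-mass particle; returns the final velocity."""
    f = force(x, ext)
    for _ in range(nsteps):
        v += 0.5 * dt * f
        x += dt * v
        f = force(x, ext)
        v += 0.5 * dt * f
    return v

def subtraction_demo(nsamples=200, ext=1e-3, dt=0.01, nsteps=50, seed=1):
    rng = random.Random(seed)
    direct, diff = [], []
    for _ in range(nsamples):
        # Same initial condition for the perturbed and unperturbed segments
        x0, v0 = rng.gauss(0.0, 0.7), rng.gauss(0.0, 1.0)
        v_pert = evolve(x0, v0, ext, dt, nsteps)
        v_free = evolve(x0, v0, 0.0, dt, nsteps)
        direct.append(v_pert)          # estimator based on Eq. (40)
        diff.append(v_pert - v_free)   # estimator based on Eq. (45)
    return direct, diff

direct, diff = subtraction_demo()
print("direct     mean, sd:", statistics.mean(direct), statistics.stdev(direct))
print("subtracted mean, sd:", statistics.mean(diff), statistics.stdev(diff))
```

In this toy setting the subtracted estimator has a far smaller standard deviation than the direct one at short times, because the thermal part of the velocity cancels between the two trajectories; for chaotic many-body systems the cancellation degrades as the trajectories diverge, as noted after Eq. (47).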
The mobility constant $\mu$ is given by $u_D(\infty) = \mu F$ at vanishingly small $F$. The force applied in Ref. [12] was about $10^{-7}$ of the mean Lennard–Jones force. The calculation of the drift velocity induced by such a small external field in a simulation run of realistic length is made possible only by the subtraction technique. The results for the mobility agree well with experimental data for argon and with calculations using the Green–Kubo formula.

As a second illustration of the subtraction technique applied to simple systems, we consider the calculation of the shear-rate dependence of the viscosity of a Lennard–Jones fluid [13, 14]. To simulate the response of the equilibrium system, at $t = 0$ a fictitious external impulsive field is applied which induces a planar Couette flow (the so-called SLLOD perturbation [15, 16]). The equations of motion are

$$\dot{\mathbf r}_i = \mathbf p_i/m_i + \mathbf r_i\cdot\nabla\mathbf u, \tag{49}$$

$$\dot{\mathbf p}_i = \mathbf F_i - \mathbf p_i\cdot\nabla\mathbf u, \tag{50}$$

where $\mathbf F_i$ is the total force acting on atom $i$ with mass $m_i$ and the velocity gradient $\nabla\mathbf u$ has been chosen to yield a planar Couette flow in the $x$ direction, with shear along $y$:

$$\nabla\mathbf u = \begin{pmatrix} 0 & 0 & 0 \\ h_0\delta(t) & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{51}$$
where $h_0$ is a constant related to the shear rate $\dot\gamma$. The velocity gradient, induced in the MD cell by the application of the chosen external field, has to be accommodated at the boundaries by applying the appropriate generalization of the periodic boundary conditions, known in the literature as Lees–Edwards boundary conditions [17, 18]. The transient behavior of the off-diagonal element $S_{xy}$ of the stress tensor of the perturbed system was simulated over a time interval $t$ in order to determine the viscosity coefficient, $\eta$,

$$\eta = \lim_{t\to\infty}\eta(t) = \lim_{t\to\infty}\lim_{h_0\to 0} h_0^{-1}\int_0^t d\tau\,\langle U(\tau, 0)S_{xy} - U_0(\tau, 0)S_{xy}\rangle_{eq}. \tag{52}$$
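To make the impulsive perturbation of Eqs. (49)–(51) concrete, one can integrate the equations of motion across the delta function at $t = 0$. Since only the $yx$ element of $\nabla\mathbf u$ is non-zero and the finite forces $\mathbf F_i$ do not contribute over an infinitesimal interval, this gives the instantaneous changes $x_i \to x_i + h_0 y_i$ and $p_{ix} \to p_{ix} - h_0 p_{iy}$, with all other components unchanged. The following is a minimal sketch of that step; the function name and the tuple-based data layout are assumptions for illustration, and periodic (Lees–Edwards) boundary handling is deliberately left out.

```python
def apply_impulsive_shear(positions, momenta, h0):
    """Apply the t = 0 impulse implied by Eqs. (49)-(51) (hypothetical helper).

    positions, momenta: lists of (x, y, z) tuples, one per atom.
    h0: strength of the impulsive velocity gradient.
    """
    # x jumps by h0 * y because (r . grad u)_x = y * h0 * delta(t)
    new_positions = [(x + h0 * y, y, z) for (x, y, z) in positions]
    # p_x jumps by -h0 * p_y because (p . grad u)_x = p_y * h0 * delta(t)
    new_momenta = [(px - h0 * py, py, pz) for (px, py, pz) in momenta]
    return new_positions, new_momenta

# Illustrative use on a single atom
r_new, p_new = apply_impulsive_shear([(0.0, 1.0, 0.0)], [(0.0, 2.0, 0.0)], h0=0.1)
print(r_new, p_new)
```

After this step the perturbed trajectory would be propagated with the unperturbed equations of motion, and the difference in $S_{xy}$ with respect to the trajectory started from the unshifted configuration would be accumulated as in Eq. (52).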
In Fig. 1 we show the time dependent friction coefficient $\eta(t)$ calculated by the subtraction technique for a low shear rate ($\dot\gamma < 0.02$). One sees that good agreement between the non-equilibrium molecular dynamics simulation and linear response theory is found for low shear rates. In Ref. [13] it is also shown that, for shear rates below $10^{12}$ s$^{-1}$, the viscosity, considered as a function of the shear rate, does not differ significantly from its limiting value. It is only at rates higher than $10^{12}$ s$^{-1}$ (a rather high perturbation!) that it starts to depend on the shear rate. Moreover, the dependence is well represented by an analytical power series truncated at fourth order.

Non-equilibrium molecular dynamics has also been used to investigate transport properties of polyatomic systems using Cartesian coordinates with imposed holonomic constraints. Liquid butane has been the subject of a number of studies [19, 20] and the subtraction technique was used in [20] to compute the shear viscosity of this molecular system. In this calculation the symmetric part of the molecular stress tensor for a polyatomic fluid in a volume $V$ is determined from the center of mass positions and velocities of the butane molecules:

$$S^s_{xy} = -\frac{1}{V}\sum_I\left[\frac{1}{M}P_{Ix}P_{Iy} + \frac{1}{2}(F_{Ix}R_{Iy} + F_{Iy}R_{Ix})\right]. \tag{53}$$
[Figure 1: plot of ψ against t (0 to 800), showing curves A and B.]
Figure 1. Curve A: time dependent friction coefficient $\eta(t)$ for a Lennard–Jones atomic fluid, calculated by the subtraction technique at low shear rate ($\dot\gamma < 0.02$); curves above and below indicate the uncertainties. Curve B: $\eta(t)$ from the Green–Kubo expression, $\eta(t) = (V/k_BT)\int_0^t ds\,\langle S_{xy}(s)S_{xy}(0)\rangle_{eq}$; errors are shown as vertical bars.
Here $M$, $\mathbf{R}_I$, $\mathbf{P}_I$ are, respectively, the mass, the center of mass coordinate and the total linear momentum of molecule $I$, while $\mathbf{F}_I$ is the total force acting on the center of mass of the molecule; the sum runs over all molecules in the system. The shear viscosity coefficient $\eta$ can be directly obtained from the constitutive law [21],

$$\mathbf{S}^s - p\mathbf{I} = 2\eta\,\nabla\mathbf{u}, \tag{54}$$

where $p$ is the hydrostatic pressure and $\mathbf{u}$ is the local velocity field corresponding to a pure deformational flow ($\nabla\cdot\mathbf{u} = 0$). The equations of motion for such a polyatomic system can be written [15] (denoting the atomic coordinates and momenta of the $n$ atoms of molecule $I$ by $(\mathbf{r}_I, \mathbf{p}_I) = (\mathbf{r}_{1I}, \mathbf{r}_{2I}, \ldots, \mathbf{r}_{nI}, \mathbf{p}_{1I}, \mathbf{p}_{2I}, \ldots, \mathbf{p}_{nI})$) as

$$\dot{\mathbf{r}}_{kI} = \frac{\mathbf{p}_{kI}}{m_k} + \mathbf{R}_I\cdot\nabla\mathbf{u}, \tag{55}$$

$$\dot{\mathbf{p}}_{kI} = \mathbf{F}_{kI} - \frac{m_k}{M}\,\nabla\mathbf{u}\cdot\mathbf{P}_I - \sum_\alpha \lambda_{\alpha I}\,\nabla_{kI}\,\sigma_{\alpha I}(\mathbf{r}_I), \tag{56}$$
where $\mathbf{F}_{kI}$ is the force acting on atom $k$ of molecule $I$, the tensor $\nabla\mathbf{u}$ is the homogeneous velocity gradient (possibly time dependent), and $\sigma_{\alpha I}(\mathbf{r}_I)$ are the constraints acting on molecule $I$ with their associated Lagrange multipliers $\lambda_{\alpha I}$. The perturbed equations of motion (55)–(56) are known as the DOLLS [22] tensor equations and can be derived from a Hamiltonian perturbation,

$$H_I = \sum_I \mathbf{R}_I\mathbf{P}_I : (\nabla\mathbf{u})^T, \tag{57}$$
where (∇u)T is the transpose of ∇u. The calculation of the shear viscosity by the subtraction technique proceeds as in the previous application. An impulsive external force derived from a shear gradient of the form
$$\nabla\mathbf{u} = \begin{pmatrix} 0 & (h_0/2)\,\delta(t) & 0 \\ (h_0/2)\,\delta(t) & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{58}$$
is applied to the equilibrium system at $t = 0$. Notice that, because the velocity gradient perturbation of Eq. (58) is a symmetric tensor, no distinction exists in this case between the DOLLS tensor equations, used in this application, and the SLLOD equations employed in Ref. [13]. The viscosity is then computed from the analog of Eq. (52) using the symmetric part of the stress tensor, $S^s_{xy} = (S_{xy} + S_{yx})/2$. Since the molecular constraints do not act on the centers of mass of the molecules, no technical difficulties are encountered as a result of their presence. The viscosity obtained using this method has been compared with the corresponding Green–Kubo formula and with the available experimental data. Although the equivalence between the Green–Kubo and NEMD methods is not evident in the present molecular case, the authors find a remarkable agreement between the results of the two methods, while the known experimental value is about half of the computed values. There are many possible reasons for this discrepancy (and the authors correctly point them out). However, at this stage, we cannot exclude a theoretical inconsistency.
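The remark that the two sets of equations coincide for a symmetric gradient can be illustrated numerically. The sketch below assumes that the two formulations differ only in whether the momentum couples to the gradient as $\nabla\mathbf{u}\cdot\mathbf{P}$ or as $\mathbf{P}\cdot\nabla\mathbf{u}$ (an assumption about index conventions, not a statement taken from Refs. [13, 22]); the amplitudes and the helper name are illustrative only.

```python
import numpy as np

h0 = 0.01
# Amplitude of the symmetric gradient of Eq. (58), delta function omitted.
grad_sym = np.array([[0.0, h0 / 2, 0.0],
                     [h0 / 2, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])
# Amplitude of the non-symmetric planar Couette gradient of Eq. (51).
grad_couette = np.array([[0.0, 0.0, 0.0],
                         [h0, 0.0, 0.0],
                         [0.0, 0.0, 0.0]])

def coupling_difference(grad, momentum):
    """Difference between the (grad u).P and P.(grad u) coupling terms."""
    grad = np.asarray(grad, dtype=float)
    momentum = np.asarray(momentum, dtype=float)
    return grad @ momentum - momentum @ grad

P = np.array([1.0, 2.0, 3.0])
diff_sym = coupling_difference(grad_sym, P)          # vanishes identically
diff_couette = coupling_difference(grad_couette, P)  # does not vanish
```

The difference is zero for any momentum when the gradient tensor is symmetric, which is why the choice between the two coupling forms is immaterial for Eq. (58) but not for Eq. (51).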
4. Outlook
Non-equilibrium molecular dynamics is a field with a long history. For atomic systems the problem has been formulated completely and a variety of applications have been studied (see Ref. [18] for a review, as well as other chapters in this book). In this chapter we have shown that there are new issues that need to be considered when molecular systems with constraints are studied. In order to make molecular dynamics simulations practical, most complex molecular systems are treated by imposing constraints to remove certain degrees of freedom from the problem. It is therefore important to formulate a response theory for such constrained systems in order to be able to compute non-equilibrium properties. We have shown that it is possible to carry out this program for systems with constraints in the context of a non-Hamiltonian
formulation of the equations of motion. The general expression for the response (Eq. (40)) is simple and is in a form that permits direct application of the subtraction method to determine small responses. The passage to the linear regime and determination of the analogs of standard Green–Kubo expressions for transport properties involve subtle issues that deserve further study. We have not included a thermostat in the formulation presented here. In practice it is necessary to thermostat the system when external forces are applied to it to study the response. Such thermostats may also be implemented naturally in the context of a non-Hamiltonian framework.
References
[1] D.J. Evans and G.P. Morriss, Statistical Mechanics of Nonequilibrium Liquids, Academic Press, New York, 1990.
[2] G. Ciccotti, D. Frenkel, and I.R. McDonald, Simulations of Liquids and Solids, 3rd edn., North-Holland, Amsterdam, 1987.
[3] J.P. Ryckaert, G. Ciccotti, and H.J.C. Berendsen, “Numerical-integration of Cartesian equations of motion of a system with constraints–molecular-dynamics of n-alkanes,” J. Comp. Phys., 23, 327–341, 1977.
[4] T.O. White, G. Ciccotti, and J.P. Hansen, “Brownian dynamics with constraints,” Mol. Phys., 99, 2023–2036, 2001.
[5] S. Melchionna, “Constrained systems and statistical distributions,” Phys. Rev. E, 61, 6165–6170, 2000.
[6] M.E. Tuckerman, Y. Liu, G. Ciccotti, and G.L. Martyna, “Non-Hamiltonian molecular dynamics: generalizing Hamiltonian phase space principles to non-Hamiltonian systems,” J. Chem. Phys., 115, 1678–1702, 2001.
[7] M.E. Tuckerman, C.J. Mundy, and G.L. Martyna, “On the classical statistical mechanics of non-Hamiltonian systems,” Europhys. Lett., 45, 149–155, 1999.
[8] G. Ciccotti, R. Kapral, and A. Sergi, “Simulating reactions that occur once in a blue moon,” In: S. Yip (ed.), Handbook of Materials Modeling, Volume 1: Methods and Models, Springer, Berlin, 2005.
[9] L. Onsager, “Reciprocal relations in irreversible processes. I,” Phys. Rev., 37, 405–426, 1931. “Reciprocal relations in irreversible processes. II,” Phys. Rev., 38, 2265–2279, 1931.
[10] G. Ciccotti, G. Jacucci, and I.R. McDonald, “Thought experiments by molecular dynamics,” J. Stat. Phys., 21, 1–22, 1979.
[11] R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II: Nonequilibrium Statistical Mechanics, 2nd edn., Springer, Berlin, 1995.
[12] G. Ciccotti and G. Jacucci, “Direct computation of dynamical response by molecular-dynamics–mobility of a charged Lennard–Jones particle,” Phys. Rev. Lett., 35, 789–792, 1975.
[13] J.P. Ryckaert, A. Bellemans, G. Ciccotti, and G.V. Paolini, “Shear-rate dependence of the viscosity of simple fluids by nonequilibrium molecular dynamics,” Phys. Rev. Lett., 60, 128–131, 1988.
[14] J.P. Ryckaert, A. Bellemans, G. Ciccotti, and G.V. Paolini, “Evaluation of transport coefficients of simple fluids by MD: comparison of Green–Kubo and nonequilibrium approaches for shear viscosity,” Phys. Rev. A, 39, 259–267, 1989.
[15] A.J.C. Ladd, “Equations of motion for non-equilibrium molecular-dynamics simulations of viscous-flow in molecular fluids,” Mol. Phys. Rep., 53, 459–463, 1984.
[16] D.J. Evans and G.P. Morriss, “Non-Newtonian molecular-dynamics,” Comp. Phys. Rep., 1, 297–343, 1984.
[17] A.W. Lees and S.F. Edwards, “The computer study of transport processes under extreme conditions,” J. Phys. C, 5, 1921–1972, 1972.
[18] G. Ciccotti, C. Pierleoni, and J.P. Ryckaert, “Theoretical foundation and rheological application of non-equilibrium molecular dynamics,” In: M. Mareschal and B.L. Holian (eds.), Simulations of Complex Hydrodynamic Phenomena, NATO ASI Series B 292, Plenum Press, New York, 1992.
[19] R. Edberg, D.J. Evans, and G.P. Morriss, “Conformational dynamics in liquid butane by nonequilibrium molecular dynamics,” J. Chem. Phys., 87, 5700–5708, 1987.
[20] G. Marechal, J-P. Ryckaert, and A. Bellemans, “The shear viscosity of n-butane by equilibrium and non-equilibrium molecular dynamics,” Mol. Phys., 61, 33–49, 1987.
[21] S.R. de Groot and P. Mazur, Non-Equilibrium Thermodynamics, North-Holland, Amsterdam, 1962.
[22] W.G. Hoover, D.J. Evans, R.B. Hickman, A.J. Ladd, W.T. Ashurst, and B. Moran, “Lennard–Jones triple-point bulk and shear viscosities. Green–Kubo theory, Hamiltonian mechanics, and nonequilibrium molecular dynamics,” Phys. Rev. A, 22, 1690–1697, 1980.
2.18 THERMAL TRANSPORT PROCESS BY THE MOLECULAR DYNAMICS METHOD Hideo Kaburaki Japan Atomic Energy Research Institute, Tokai, Ibaraki, Japan
S. Yip (ed.), Handbook of Materials Modeling, 763–771. © 2005 Springer. Printed in the Netherlands.

In a molecular dynamics simulation of a system in equilibrium, for example at some finite temperature, averages over the spontaneous fluctuations can be used to evaluate the thermal transport coefficients that control the nonequilibrium system, such as the thermal conductivity, fluid viscosity, and diffusion constant. This is made possible by a key formula of nonequilibrium statistical mechanics, the Green–Kubo formula, which connects a macroscopic thermal transport coefficient to a microscopic time autocorrelation function.

1. Diffusion Process, Transport Properties, Macroscopic Equations

When we put a drop of ink in a fluid, particles of ink in the localized region of higher concentration spread to the region of lower concentration, that is, the system relaxes to a uniform state. This equalization process is a diffusion phenomenon, which is a fundamental process in a nonequilibrium state. A typical diffusion process is heat or thermal conduction, which is connected to the irreversibility stated in the second law of thermodynamics. When there is some nonuniformity of energy, momentum, or particle concentration in a material, a flow occurs, accompanied by dissipation. When the gradients of these quantities are small, so that thermodynamic quantities can be defined locally, transport coefficients, which regulate the flows of energy, momentum, and particles, can be defined for the processes of thermal conduction, viscous flow, and diffusion.

All the macroscopic phenomenological equations of motion that govern thermal conduction, viscous flow, and diffusion are written in conservation form, that is, as equations of continuity. In the case of thermal conduction, the conserved quantity is the energy per unit volume $\rho\varepsilon$, where $\rho$ is the density and $\varepsilon$ is the internal energy per unit mass. A conservation law is expressed by equating the time derivative of the conserved quantity to minus the divergence of the flux density vector. The magnitude of the flux density vector $\mathbf{j}$ is the amount of energy per unit time passing through a unit area perpendicular to the direction of flow. The conservation equation of energy is written as $\partial(\rho\varepsilon)/\partial t = -\operatorname{div}\mathbf{j}$. The time derivative of $\varepsilon$ is expressed as $\partial\varepsilon/\partial t = c\,\partial T/\partial t$, where $c$ is the specific heat, assuming that thermal expansion is neglected. The flux density $\mathbf{j}$ is related to the local temperature $T(\mathbf{r}, t)$ by Fourier's law of thermal conduction, $\mathbf{j} = -\kappa\nabla T$, through the thermal conductivity $\kappa$. With all these relations combined, we finally get the thermal conduction equation, or the diffusion equation of temperature, $\partial T/\partial t = (\kappa/\rho c)\nabla^2 T$. Here, $\kappa/(\rho c)$ is the thermal diffusivity or the temperature conductivity. A diffusion equation is derived in the same way for the mass density $\rho$, which defines the diffusion constant $D$. Likewise, fluid motion can be described using the momentum density $\rho\mathbf{v}$ and the momentum flux tensor, and the Navier–Stokes equation results, with the viscosity as its transport coefficient [1].
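To make the macroscopic picture concrete, the temperature diffusion equation derived above can be integrated numerically. The following is a minimal sketch only: it assumes one spatial dimension, a periodic grid, and an explicit finite-difference scheme, none of which is prescribed by the text, and the function name and parameters (`alpha` standing for $\kappa/(\rho c)$) are hypothetical.

```python
import numpy as np

def diffuse_temperature(T0, alpha, dx, dt, n_steps):
    """Explicit finite-difference integration of dT/dt = alpha * d2T/dx2
    on a periodic 1D grid (alpha = kappa / (rho * c))."""
    r = alpha * dt / dx**2
    if r > 0.5:
        raise ValueError("explicit scheme unstable: need alpha*dt/dx**2 <= 0.5")
    T = np.array(T0, dtype=float)
    for _ in range(n_steps):
        # Discrete Laplacian with periodic neighbours.
        lap = np.roll(T, 1) - 2.0 * T + np.roll(T, -1)
        T = T + r * lap
    return T
```

Starting from a single hot grid point, the profile spreads and flattens while the total (the conserved energy, up to the factor $\rho c$) stays constant, which mirrors the conservation-law structure described above.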
2. Classical Laws of Mechanics, Nonequilibrium Statistical Mechanics
The equations derived above are all phenomenological and are based on the continuum assumption, that is, the microscopic properties of a material are smeared out and the material is assumed to be smooth and continuous. A set of equations describing the space–time variations of the conserved quantities is closed once the transport coefficients are determined by experiment. The role of nonequilibrium statistical mechanics is to derive these properties starting from the laws of mechanics. Let us consider a material or a system as an assembly of $N$ atoms following the classical laws of mechanics. The phase space is defined as the space of the coordinates and momenta of the atoms $(\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_N; \mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N)$, with $6N$ degrees of freedom. The mechanical state of this system is described as a point in the phase space and the time development of this system is described by a set of equations in $6N$ variables. If the Hamiltonian of this system is given by

$$H = \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2m} + U(\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_N),$$

Hamilton's equations of motion are

$$\frac{d\mathbf{p}_i}{dt} = -\frac{\partial H}{\partial \mathbf{q}_i} = -\frac{\partial U}{\partial \mathbf{q}_i} = \mathbf{F}_i, \qquad \frac{d\mathbf{q}_i}{dt} = \frac{\partial H}{\partial \mathbf{p}_i} = \frac{\mathbf{p}_i}{m}, \qquad i = 1, \ldots, N.$$
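Hamilton's equations above are what a molecular dynamics code integrates in time. The sketch below uses the velocity Verlet scheme, a common choice that is assumed here rather than taken from the text, and a harmonic force as a stand-in for $-\partial U/\partial \mathbf{q}$; all function names are hypothetical.

```python
import numpy as np

def velocity_verlet(q, p, force, mass, dt, n_steps):
    """Integrate dq/dt = p/m, dp/dt = F(q) with the velocity Verlet scheme.
    Returns the list of (q, p) pairs along the trajectory."""
    q = np.array(q, dtype=float)
    p = np.array(p, dtype=float)
    f = force(q)
    traj = [(q.copy(), p.copy())]
    for _ in range(n_steps):
        p_half = p + 0.5 * dt * f          # half kick
        q = q + dt * p_half / mass         # drift
        f = force(q)                       # force at the new position
        p = p_half + 0.5 * dt * f          # second half kick
        traj.append((q.copy(), p.copy()))
    return traj

def harmonic_force(q, k=1.0):
    """F = -dU/dq for U = k q^2 / 2 (illustrative potential)."""
    return -k * q

def harmonic_energy(q, p, mass=1.0, k=1.0):
    """H = p^2 / (2m) + k q^2 / 2."""
    return float(np.sum(p**2) / (2.0 * mass) + 0.5 * k * np.sum(q**2))
```

Monitoring `harmonic_energy` along the returned trajectory is a simple check that the time step is small enough: for a well-chosen step the energy fluctuates about its initial value without drifting.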
A molecular dynamics simulation is equivalent to the time integration of these equations, and thus to a trajectory in the phase space. For a small volume in the phase space, $d\Gamma = d\mathbf{q}_1 \cdots d\mathbf{q}_N\, d\mathbf{p}_1 \cdots d\mathbf{p}_N$, we can define the particle density $f(\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_N; \mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N, t) = f(q, p, t)$. The number of phase points in the volume $d\Gamma$ is $f(q, p, t)\,d\Gamma$. If we consider the flow of points in the phase space, we obtain the equation for the density, the Liouville equation

$$\frac{\partial f}{\partial t} = -\sum_{i=1}^{N}\left(\frac{\partial H}{\partial \mathbf{p}_i}\cdot\frac{\partial f}{\partial \mathbf{q}_i} - \frac{\partial H}{\partial \mathbf{q}_i}\cdot\frac{\partial f}{\partial \mathbf{p}_i}\right),$$

which is equivalent to Hamilton's equations of motion. These are the fundamental microscopic formulas for a system consisting of $N$ atoms. Now consider some dynamical quantity $A(q, p)$ and ask what microscopic expression corresponds to the value observed in the macroscopic state. The particle density $f(q, p, t)$ is regarded as a distribution function, or the probability of finding a phase point in the phase space. We take the ensemble average

$$\langle A \rangle = \int dq\, dp\, A(q, p)\, f(q, p, t),$$

defined in terms of the distribution function in the phase space, to correspond to the observed value. Here, the distribution function $f(q, p, t)$ follows the Liouville equation, so that, by partial integration, the average above can be rewritten as an ensemble average over the distribution function of the initial conditions. Since we cannot control the initial conditions microscopically, an observed value corresponds to an average over samples drawn from the distribution of initial conditions $(q_0, p_0)$ [2]. In a molecular dynamics simulation, we follow the time evolution of a system of $N$ interacting particles starting from some initial condition $(q_0, p_0)$, which means that one long trajectory in the phase space is generated from a single initial point. In order to take the ensemble average $\langle A \rangle$ using molecular dynamics, we would repeat simulations starting from $N$ arbitrarily chosen initial conditions and average over these similar systems. However, time averages $\bar{A}$ are better suited to molecular dynamics, since they exploit a single long trajectory. To use time averages we invoke the ergodic hypothesis, which states that, for a stationary random process, the time average over $N$ instants sampled from a single long simulation is equal to the ensemble average.
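The equivalence of time and ensemble averages can be illustrated with a deliberately simple model. The sketch below assumes a one-dimensional harmonic oscillator with analytic trajectory $q(t) = A\cos(\omega t + \varphi)$ and compares the time average of $q^2$ along one trajectory with the average over many initial phases at the same energy; this toy system and the function names are assumptions made for illustration, not part of the chapter.

```python
import numpy as np

def time_average_q2(amplitude, omega, total_time, n_samples):
    """Time average of q(t)^2 along one trajectory q(t) = A cos(omega t)."""
    t = np.linspace(0.0, total_time, n_samples)
    return float(np.mean((amplitude * np.cos(omega * t)) ** 2))

def ensemble_average_q2(amplitude, n_members, seed=0):
    """Average of q^2 over initial phases drawn uniformly in [0, 2 pi),
    i.e. over initial conditions on the same energy shell."""
    rng = np.random.default_rng(seed)
    phases = rng.uniform(0.0, 2.0 * np.pi, n_members)
    return float(np.mean((amplitude * np.cos(phases)) ** 2))
```

Both estimates approach $A^2/2$, the first as the trajectory gets longer and the second as the number of sampled initial conditions grows. For an interacting many-particle system the equality is a hypothesis rather than something this toy example demonstrates.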
3. Linear Response Theory, Green–Kubo Formula
We consider a nonequilibrium system that is very close to equilibrium. To describe thermal conduction and diffusion, we have to consider the thermal internal forces as a perturbation to the system. The thermal internal forces are intrinsically statistical and, unlike external mechanical forces, cannot be treated as a perturbation to the Hamiltonian. In order to evaluate thermal transport properties, we need heat reservoirs at different temperatures, or a nonuniform temperature distribution, to establish the nonequilibrium state. There are various ways of describing the microscopic states of this nonequilibrium system, and many theories have been presented. The final formula for the thermal transport coefficients takes the same form in all of them, as long as the system is in the linear regime near equilibrium [2–5]. The Green–Kubo formula for thermal conductivity is expressed as

$$\kappa = \frac{1}{3Vk_BT^2}\int_0^\infty \langle \mathbf{J}(t)\cdot\mathbf{J}(0)\rangle\, dt,$$

where the heat flux $\mathbf{J}$ is

$$\mathbf{J} = \sum_i E_i\mathbf{v}_i + \frac{1}{2}\sum_{i>j\ \mathrm{pairs}} (\mathbf{r}_i - \mathbf{r}_j)\left[\mathbf{F}_{ij}\cdot(\mathbf{v}_i + \mathbf{v}_j)\right],$$

$$E_i = \frac{1}{2}m\mathbf{v}_i^2 + \sum_{i>j\ \mathrm{pairs}} \phi(|\mathbf{r}_i - \mathbf{r}_j|), \qquad \mathbf{F}_{ij} = -\frac{\partial\phi(|\mathbf{r}_i - \mathbf{r}_j|)}{\partial\mathbf{r}_i}.$$

Here $\langle\cdots\rangle$ denotes the ensemble average, $V$ is the volume, and $\phi$ is the pair potential energy between atoms. Also, $E_i$ and $\mathbf{v}_i$ are the energy and velocity of the $i$th particle and $\mathbf{F}_{ij}$ is the force on particle $i$ from particle $j$. The formula is expressed in terms of the autocorrelation function of the heat current density. Time correlation function expressions for all the transport coefficients in the macroscopic equations are derived in the same way [4].
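The heat flux expression above can be evaluated directly from a set of positions and velocities. The sketch below is illustrative only: it assumes a Lennard-Jones pair potential in reduced units, no periodic images, and an equal split of each pair energy between the two atoms (a common convention that is an assumption here, since the per-atom energy in the text is written with a pair sum). The function name is hypothetical.

```python
import numpy as np

def lj_heat_flux(pos, vel, mass, eps=1.0, sigma=1.0):
    """Instantaneous heat flux J for a Lennard-Jones system (no periodic images)."""
    pos = np.asarray(pos, dtype=float)
    vel = np.asarray(vel, dtype=float)
    n = len(pos)
    energy = 0.5 * mass * np.sum(vel**2, axis=1)   # kinetic part of E_i
    virial_term = np.zeros(3)
    for i in range(n - 1):
        for j in range(i + 1, n):
            rij = pos[i] - pos[j]
            r = np.linalg.norm(rij)
            sr6 = (sigma / r) ** 6
            phi = 4.0 * eps * (sr6**2 - sr6)
            # F_ij = -d(phi)/d(r_i): force on atom i due to atom j.
            fij = 24.0 * eps * (2.0 * sr6**2 - sr6) / r**2 * rij
            energy[i] += 0.5 * phi   # assumption: pair energy shared equally
            energy[j] += 0.5 * phi
            virial_term += 0.5 * rij * np.dot(fij, vel[i] + vel[j])
    # Convective term sum_i E_i v_i plus the pair (virial) term.
    return np.sum(energy[:, None] * vel, axis=0) + virial_term
```

In a production code the double loop would be replaced by a neighbour list with minimum-image separations; the explicit loop is kept here so that each term of the formula is visible.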
4. Time Correlation Function
The Green–Kubo formula reveals that for the system in the linear regime close to equilibrium the macroscopic transport coefficient is connected to microscopic quantities. For example, the macroscopic thermal conductivity coefficient is represented as the time integral of the ensemble averaged heat flux autocorrelation function. This indicates that the dynamical properties of the system can be derived from this formulation, which is in contrast to the kinetic approach where the stochasticity is introduced [6].
In order to have a finite thermal conductivity, the autocorrelation function should decay rapidly, for example, exponentially. There are some cases where the transport properties show anomalous behavior. For example, in the case of the two-dimensional fluid, the autocorrelation function decreases algebraically, which is called the long time tail. In this case, the self-diffusion coefficient diverges. In the case of one- and two-dimensional lattices, the thermal conductivity diverges as the number of particles increases [7]. Generally, the time correlation function is very difficult to calculate exactly, and for some cases, such as the classical fluid, molecular dynamics plays a very important role in studying the structure of liquids [8].
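As a worked illustration of why the decay law matters, compare two model correlation functions (both are assumed forms chosen for illustration, not results from the simulations): an exponential with amplitude $C_0$ and relaxation time $\tau$, and an algebraic tail $A/t$ of the kind usually quoted for two-dimensional fluids, integrated from a short-time cutoff $t_0$.

```latex
\int_0^\infty C_0\, e^{-t/\tau}\, dt = C_0\,\tau < \infty,
\qquad
\int_{t_0}^{t_{\max}} \frac{A}{t}\, dt = A \ln\frac{t_{\max}}{t_0}
\;\longrightarrow\; \infty \quad (t_{\max}\to\infty).
```

The exponential case gives a finite transport coefficient proportional to $C_0\tau$, whereas the $1/t$ tail makes the time integral grow logarithmically without bound.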
5. Numerical Calculations of Transport Coefficients by the Molecular Dynamics Method
Molecular dynamics simulation provides a method of calculating the heat flux autocorrelation function in the Green–Kubo formula for thermal conductivity. While the Green–Kubo formula is frequently used to determine the thermal conductivity of liquids [9], studies on solids using this method are rather recent. For a material whose thermal conductivity is not too high, such as argon, the heat flux autocorrelation function can be evaluated directly from the molecular dynamics trajectory, that is, the time sequence of positions and velocities. As stated above, the ensemble average is replaced by the time average over $N$ instants of time sampled from a single long simulation, as in Fig. 1. The sampling time $t_{\mathrm{sample}}$ should be long enough for the autocorrelation function to decay. Within this sampling time, the autocorrelation function is evaluated for the various time differences, of which $t_{\mathrm{sample}}$ is the longest. The time between samples, $t_{\mathrm{shift}}$, should be long enough for the samples to be independent. As an example, we show a molecular dynamics simulation of thermal conductivity for liquid and solid argon, where a system of 864 particles is numerically integrated using the Lennard-Jones potential $\phi(r) = 4\varepsilon[(\sigma/r)^{12} - (\sigma/r)^{6}]$, with $\varepsilon = 119.8$ K and $\sigma = 3.405$ Å, a time step of 2 fs, and up to $10^7$ total time steps. Here, the length of the sampling time $t_{\mathrm{sample}}$ is $2.0\times10^{-12}$ to $1.0\times10^{-11}$ s. The time between samples $t_{\mathrm{shift}}$ is taken as 0.1 ps in this case. Figure 2 shows how the ensemble average for the autocorrelation function converges to the final result as the number of samplings increases. It is seen that the result with 500 samplings has nearly converged to the final result obtained with a large number of samplings.

Figure 1. Sampling of ensembles from a single long time molecular dynamics simulation.

Figure 2. Convergence of the ensemble averaged autocorrelation function.

Figure 3 shows the time autocorrelation functions for liquid and solid argon at 90 K and 60 K under the freestanding condition in the Cartesian coordinate plot. The autocorrelation function for the liquid is clearly seen to decay exponentially, while that of the solid also decreases exponentially but has a longer tail. This is clearly seen by plotting the results in the log–log representation in Fig. 4. The relaxation of solid argon consists of two stages, a shorter atomistic part and a longer phonon part. The final thermal conductivity is derived by numerically integrating the time autocorrelation function, and, as seen in Fig. 5, the integral converges to a constant value of thermal conductivity. The final converged value of thermal conductivity for the liquid state is 0.1255 W m$^{-1}$ K$^{-1}$.

Figure 3. Heat flux autocorrelation function for argon liquid and solid in the Cartesian coordinates.

For a material with high thermal conductivity, such as a covalently bonded crystal, very long runs are required, since a smaller step size is needed and the correlation extends beyond 100 ps. One method is to derive the power spectrum from the original heat flux autocorrelation function through fast Fourier transforms and to take the zero-frequency limit $\omega \to 0$ to obtain the thermal conductivity. Since the length of the data is finite, some ambiguity is unavoidable in this extrapolation and care should be taken. This process can be understood by considering the more general expression of thermal conductivity for unsteady deviations from equilibrium. The formula for this case is generalized to [5]

$$\kappa(\omega) = \frac{1}{3Vk_BT^2}\int_0^\infty \langle \mathbf{J}(t)\cdot\mathbf{J}(0)\rangle\, e^{i\omega t}\, dt. \tag{1}$$
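The sampling procedure described in this section (windows of length $t_{\mathrm{sample}}$ whose time origins are separated by $t_{\mathrm{shift}}$, followed by numerical integration of the averaged autocorrelation function) can be sketched as follows. This is a minimal illustration under stated assumptions: the heat flux is stored as an array with one row per time step, the integral uses the trapezoidal rule, and the function names and the parameter `kB_T2` (standing for $k_BT^2$) are hypothetical.

```python
import numpy as np

def flux_autocorrelation(J, n_sample, n_shift):
    """Average <J(t).J(0)> over time origins spaced n_shift steps apart.

    J        : array of shape (n_steps, 3), heat flux time series
    n_sample : window length in steps (t_sample / dt)
    n_shift  : spacing between time origins in steps (t_shift / dt)
    """
    J = np.asarray(J, dtype=float)
    origins = range(0, len(J) - n_sample + 1, n_shift)
    acf = np.zeros(n_sample)
    for t0 in origins:
        window = J[t0:t0 + n_sample]
        acf += window @ window[0]      # J(t0 + t) . J(t0) for every lag t
    return acf / len(origins)

def green_kubo_running_integral(acf, dt, volume, kB_T2):
    """Running Green-Kubo integral: cumulative trapezoidal integral of the
    autocorrelation function divided by 3 V kB T^2."""
    increments = 0.5 * (acf[1:] + acf[:-1]) * dt
    integral = np.concatenate(([0.0], np.cumsum(increments)))
    return integral / (3.0 * volume * kB_T2)
```

Plotting the running integral against the upper time limit is the numerical counterpart of Fig. 5: the conductivity is read off where the curve reaches a plateau, and the number of time origins plays the role of the number of samplings in Fig. 2.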
Figure 4. Heat flux autocorrelation function for argon liquid and solid in the log–log plot.
Figure 5. Integral of heat flux autocorrelation function for liquid and solid argon.
Taking the zero-frequency limit ω → 0 means that the macroscopic time 2π/|ω| is much larger than the microscopic relaxation time of the autocorrelation function. The above expression of thermal conductivity reduces to the static expression under this limit.
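The reduction to the static expression can be made explicit for a model case. Assuming, purely for illustration, an exponentially decaying autocorrelation function with amplitude $C_0$ and relaxation time $\tau$ (neither symbol appears in the original text), the frequency-dependent formula gives

```latex
\langle \mathbf{J}(t)\cdot\mathbf{J}(0)\rangle = C_0\, e^{-t/\tau}
\;\Longrightarrow\;
\kappa(\omega) = \frac{C_0}{3Vk_BT^2}\int_0^\infty e^{-t/\tau}\, e^{i\omega t}\, dt
= \frac{C_0\,\tau}{3Vk_BT^2}\,\frac{1}{1 - i\omega\tau}
\;\xrightarrow{\;\omega\tau \ll 1\;}\;
\frac{C_0\,\tau}{3Vk_BT^2} = \kappa .
```

The correction factor $1/(1 - i\omega\tau)$ differs from unity only when $\omega\tau$ is not small, that is, when the macroscopic time $2\pi/|\omega|$ becomes comparable to the microscopic relaxation time.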
6. Outlook
The Green–Kubo formalism combined with molecular dynamics simulation was applied to the classical liquid problem immediately after the theory was formulated, and the method is well established in this area. The application of this method to the thermal conductivity of solids, on the other hand, came considerably later. Thermal conductivity problems in solids have been studied mostly by the phonon Boltzmann–Peierls method with the relaxation time approximation. With this method, the relaxation times are ultimately fitted to experimental data or evaluated by perturbation theory. The Green–Kubo method, in contrast, is formally exact and can open a way to the calculation of the thermal transport coefficients for any state: gas, liquid, or solid. However, this method requires the calculation of the autocorrelation function of the fluctuating fluxes, which amounts to solving the equations of motion for a system of $N$ interacting particles. With the development of computers, the molecular dynamics method with this formula continues to be an effective tool, in particular for delving into the dynamical aspects of thermal transport properties in various materials.
References
[1] L. Landau and E.M. Lifshitz, Fluid Mechanics, 2nd edn., Pergamon Press, New York, 1987.
[2] D.N. Zubarev, Nonequilibrium Statistical Thermodynamics, Consultants Bureau, New York, 1974.
[3] R. Zwanzig, “The correlation functions and transport coefficients in statistical mechanics,” Annu. Rev. Phys. Chem., 16, 67–102, 1965.
[4] D.A. McQuarrie, Statistical Mechanics, Harper & Row, New York, 1976.
[5] R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II, 2nd edn., Springer, Berlin, 1991.
[6] R.E. Peierls, Quantum Theory of Solids, Oxford University Press, New York, 1955.
[7] S. Lepri, R. Livi, and A. Politi, “Thermal conduction in classical low-dimensional lattices,” Phys. Rep., 377, 1–80, 2003.
[8] P. Boon and S. Yip, Molecular Hydrodynamics, Dover, New York, 1980.
[9] C. Hoheisel and R. Vogelsang, “Thermal transport coefficients for one- and two-component liquids from time correlation functions computed by molecular dynamics,” Comput. Phys. Rep., 8, 1–70, 1988.
2.19 ATOMISTIC CALCULATION OF MECHANICAL BEHAVIOR Ju Li Department of Materials Science and Engineering, Ohio State University, Columbus, OH, USA
S. Yip (ed.), Handbook of Materials Modeling, 773–792. © 2005 Springer. Printed in the Netherlands.

Mechanical behavior is stress-related behavior. This can mean that the material response is driven, wholly or partially, by externally applied stress, or that the underlying processes are mediated by an internal stress field; very often both are true. Due to defects and their collective behavior [1], the spatiotemporal spectrum of the stress field in a real material tends to have a very large spectral width, with non-trivial coupling between different scales, which is another way of saying that the mechanical behavior of real materials tends to be multiscale. The concept of a stress field is usually valid when coarse-grained above a few nm; in favorable circumstances, such as when crystalline order is preserved locally, it may be applicable down to the sub-nm lengthscale [2]. But overall, the atomic scale is where the stress concept breaks down, and atomistic simulations [3–5] provide very important termination or matching conditions for stress-based theories. Large-scale atomistic simulations (Chapter 2.27) are approaching the µm lengthscale and are starting to reveal the collective behavior of defects [6]. But studying defect unit processes is still a main task of atomistic simulation. It is infeasible to list the current developments in this area to any degree of completeness, so only a few highlights are given. A somewhat more detailed review can be found in Ref. [5].

• The study of deformation [7–11], grain growth [12] and fracture [13, 14] in nanocrystalline materials.
• Atomistic simulation of adhesion and friction [15, 16], and nanoindentation [17–20].
• The study of dislocation core structure and Peierls stress in BCC metals [21], semiconductors [22] and intermetallics [23]. A proper definition of dislocation core energy and numerically precise ways [24, 25] to extract the core energy from periodic boundary condition (PBC) atomistic calculations.
• Thin film deposition, texture evolution and mechanical properties [26, 27].
• The study of dynamical brittle fracture [28, 29] and lattice trapping barriers [30, 31], ductile fracture [6, 32].
• The study of phase and grain boundaries [33, 34].
• Deformation and fracture of amorphous materials [35].
• The application of Hessian-free minimum energy path (MEP) search algorithms [36] to study dislocation cross-slip in FCC metals [37], double kink nucleation and migration in semiconductors [38] and BCC metals [39], and heterogeneous dislocation nucleation at crack tips [2].
• Defect generation/evolution induced by irradiation, and effect on mechanical properties [40–42].
• Connection of atomistics to the mesoscale [43–46].
In this contribution, we review the basic concepts of strain, stress and elastic constant [47]. Then we move to a discussion about dislocation core energy [25]. Finally we discuss a minimum energy path calculation of heterogeneous dislocation nucleation at an atomically sharp crack tip [2].
1. Strain, Stress and Elastic Constants
Stress and strain have many definitions which, although they do not change the physics, differ in how efficiently they represent a particular problem. Here we introduce a system that is usually the most convenient for atomistic calculations. Strain should be relative. To define strain, one must first declare the reference state. This is reasonable because strain describes deformation. Strain should be frame-covariant like any true second-rank tensor [48], since how much an object is deformed does not really depend on the angle from which one looks at it. Here we denote the geometrical configuration of an object by $X$, $Y$ or $Z$, which describes its shape, i.e., its surface constraints. For periodic boundary condition (PBC) simulations, this would be the supercell $H$-matrix (Chapter 2.8). An affine transformation of an object from one shape to another is specified by the tensor $J$, expressed as $Y = JX$, which is homogeneous in the sense that the surface constraints of the object change uniformly according to $J$. But it does not have to be a microscopically homogeneous transformation, as different kinds of atoms may have different atomic-scale relaxations. The Lagrangian strain is defined to be

$$\eta_X^Y \equiv \tfrac{1}{2}\left(J^T J - 1\right). \tag{1}$$
The subscript $X$ in $\eta_X^Y$ denotes the reference state and the superscript $Y$ denotes the final state. If the final state is apparent we may omit the superscript and simply write $\eta_X$. The polar decomposition theorem [49] states that every non-singular matrix can be uniquely expressed as the left or right product of a symmetric matrix and a rotational matrix,

$$J = RM = ML, \qquad M^T = M, \qquad R^T R = L^T L = 1. \tag{2}$$

Therefore,

$$\eta_X = \tfrac{1}{2}\left(J^T J - 1\right) = \tfrac{1}{2}\left(M^2 - 1\right). \tag{3}$$

There is a one-to-one correspondence between $\eta_X$ and $M$, as

$$M = \sqrt{1 + 2\eta_X} = 1 + \eta_X - \tfrac{1}{2}\eta_X^2 + \ldots \tag{4}$$
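Equations (1)–(4) translate directly into a few lines of linear algebra. The sketch below is an illustration, assuming a non-singular $3\times3$ deformation matrix; the function names are hypothetical, and the symmetric factor is obtained from an eigendecomposition of $J^TJ$, which is one of several possible ways to compute a matrix square root.

```python
import numpy as np

def lagrangian_strain(J):
    """eta = (J^T J - 1) / 2 for a deformation matrix J, Eq. (1)."""
    J = np.asarray(J, dtype=float)
    return 0.5 * (J.T @ J - np.eye(J.shape[0]))

def right_polar_decomposition(J):
    """Return (R, M) with J = R M, M symmetric positive definite, R orthogonal."""
    J = np.asarray(J, dtype=float)
    w, v = np.linalg.eigh(J.T @ J)          # J^T J = M^2 is symmetric positive definite
    M = v @ np.diag(np.sqrt(w)) @ v.T       # M = sqrt(1 + 2 eta), Eq. (4)
    R = J @ np.linalg.inv(M)
    return R, M
```

Because $\eta_X$ depends on $J$ only through $J^TJ$, premultiplying $J$ by any rotation leaves the strain unchanged, which is the numerical expression of the statement that strain does not depend on the viewing angle.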
Let $Y = JX$, $Z = KY = KJX$. Then $\eta_Y^Z = \tfrac{1}{2}(K^T K - 1)$ and

$$\eta_X^Z = \tfrac{1}{2}\left(J^T K^T K J - 1\right) = \tfrac{1}{2}\left(J^T (1 + 2\eta_Y^Z) J - 1\right) = J^T \eta_Y^Z J + \eta_X^Y, \tag{5}$$

which is the law of $\eta$ conversion between reference systems. Contrary to strain, stress should be absolute, meaning it should not depend on any reference state besides the current state of the object. We use two definitions of stress here: the first is the external stress $\tau_{ij}$, which is the usual “force per area” definition used by engineers,

$$dT_i = \tau_{ij}\, n_j\, dS, \tag{6}$$

where $dT_i$ is the external traction force, $n_j$ is the outward surface normal, $dS$ is the surface area, and the Einstein summation convention is used. $\tau_{ij}$ is what the outside environment exerts on the object. To prevent rotation, it must satisfy $\tau_{ij} = \tau_{ji}$. The second kind of stress is the thermodynamic stress $t_{ij}$, also called the intrinsic stress of the volume, whose definition is based on the Helmholtz free energy $F(N, T, X)$ of the object:

$$F(N, T, X) = E - TS \equiv -k_B T \ln Z(N, T, X), \tag{7}$$

where $Z(N, T, X)$ is the partition function [50, 51],

$$Z(N, T, X) \equiv \int_X \exp\left(-\beta \mathcal{H}(q^N, p^N)\right) \frac{dq^N\, dp^N}{N!\, h^{3N}}. \tag{8}$$
Here $F$ is a function of the particle number $N$, temperature $T$, and geometrical constraint $X$. Since the Hamiltonian $\mathcal{H}(q^N, p^N)$ is usually rotationally invariant, $F$ is also rotationally invariant. Thus,

$$\begin{aligned} F(N, T, Y) &= F(N, T, JX) \\ &= F(N, T, RMX) \\ &= F(N, T, MX) \\ &= F(N, T, \sqrt{1 + 2\eta_X}\, X) \\ &= F(N, T, \eta_X, X), \end{aligned} \tag{9}$$

i.e., $F$ is a function of $\eta_X$, once $X$ is chosen. A function can always be expanded into a Taylor series:

$$F(\eta_X, X) = F(0, X) + \left.\frac{\partial F}{\partial \eta_{ij}}\right|_{\eta_X = 0} \eta_{ij} + \frac{1}{2}\left.\frac{\partial^2 F}{\partial \eta_{ij}\,\partial \eta_{kl}}\right|_{\eta_X = 0} \eta_{ij}\eta_{kl} + \ldots \tag{10}$$
Because $\eta_{ij}$ is symmetric, the expansion should only involve six independent variables: $\eta_{11}$, $\eta_{22}$, $\eta_{33}$, $\eta_{23}$, $\eta_{13}$, $\eta_{12}$. But that is often inconvenient for index contraction, so the usual practice is to symmetrize the expansion coefficients over $\eta_{ij}$ and $\eta_{ji}$ wherever possible, while treating $\eta_{ij}$ and $\eta_{ji}$ as distinct summation variables. Let us define second and fourth rank symmetrization operators:

$$\hat{S}_2(G_{ij}) = \tfrac{1}{2}\left(G_{ij} + G_{ji}\right), \tag{11}$$

$$\hat{S}_4(W_{ijkl}) = \tfrac{1}{4}\left(W_{ijkl} + W_{ijlk} + W_{jikl} + W_{jilk}\right). \tag{12}$$
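The two symmetrization operators of Eqs. (11) and (12) can be written compactly with array transposes. This is a minimal sketch with hypothetical function names, assuming the tensors are stored as dense NumPy arrays of shape (3, 3) and (3, 3, 3, 3).

```python
import numpy as np

def sym2(G):
    """S2 of Eq. (11): (G_ij + G_ji) / 2."""
    G = np.asarray(G, dtype=float)
    return 0.5 * (G + G.T)

def sym4(W):
    """S4 of Eq. (12): average of W_ijkl over i<->j and k<->l."""
    W = np.asarray(W, dtype=float)
    return 0.25 * (W
                   + W.transpose(0, 1, 3, 2)   # W_ijlk
                   + W.transpose(1, 0, 2, 3)   # W_jikl
                   + W.transpose(1, 0, 3, 2))  # W_jilk
```

Contracting a fourth-rank array with a symmetric strain on both index pairs gives the same number before and after `sym4` is applied, which is why the symmetrized coefficients can be used without changing the value of the expansion.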
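The thermodynamic stress at configuration $X$ is defined to be

$$t_{ij}(X) = \frac{1}{\Omega(X)}\, \hat{S}_2\!\left(\left.\frac{\partial F(\eta_X, X)}{\partial \eta_{ij}}\right|_{\eta_X = 0}\right), \tag{13}$$

and the elastic constant:

$$C_{ijkl}(X) = \frac{1}{\Omega(X)}\, \hat{S}_4\!\left(\left.\frac{\partial^2 F(\eta_X, X)}{\partial \eta_{ij}\,\partial \eta_{kl}}\right|_{\eta_X = 0}\right), \tag{14}$$

where $\Omega(X)$ is the volume of the object at $X$, so $t_{ij}$ and $C_{ijkl}$ are intensive quantities. By definition,

$$F(\eta_X, X) = F_0 + \Omega(X)\left[t_{ij}(X)\,\eta_{ij} + \tfrac{1}{2} C_{ijkl}(X)\,\eta_{ij}\eta_{kl} + \ldots\right], \qquad t_{ij} = t_{ji}, \qquad C_{ijkl} = C_{ijlk} = C_{jikl} = C_{jilk}. \tag{15}$$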
Notice that since $t_{ij}$ and $C_{ijkl}$ are expansion coefficients of $\eta_X$ in $F(\eta_X, X)$ at $\eta_X = 0$, they themselves are not functions of $\eta_X$, but only of $X$. That means the definitions of thermodynamic stress and elastic constant do not require a reference state, since to evaluate them we use the object itself at that moment as the reference state. The use of this co-moving reference frame has some “strange” consequences, which are covered generally in differential geometry [48]. For instance,

$$t_{ij}(Y) \neq t_{ij}(X) + C_{ijkl}(X)\,(\eta_X^Y)_{kl} + \ldots, \tag{16}$$

which is not what one may expect for the Taylor expansion of the “first-order derivative” in terms of the “second-order derivative”, an expansion that does work when we use a fixed reference frame. In fact, in light of (5),

$$F(Z) = F(\eta_Y^Z, Y) = F(Y) + \Omega(Y)\,\mathrm{Tr}\left[ t(Y)\,\eta_Y^Z \right] + O\!\left( (\eta_Y^Z)^2 \right) \tag{17}$$

$$\begin{aligned}
F(Z) = F(\eta_X^Z, X) &= F(X) + \Omega(X)\,\mathrm{Tr}\left[ t(X)\,\eta_X^Z \right] + \frac{\Omega(X)}{2}\,\mathrm{Tr}\left[ \eta_X^Z\, C(X)\, \eta_X^Z \right] + O\!\left( (\eta_X^Z)^3 \right) \\
&= F(X) + \Omega(X)\,\mathrm{Tr}\left[ t(X)\,\eta_X^Z \right] + \frac{\Omega(X)}{2}\,\mathrm{Tr}\left[ \eta_X^Z\, C(X)\, \eta_X^Z \right] + O\!\left( (\eta_X^Y)^2\, \eta_Y^Z \right) + O\!\left( (\eta_Y^Z)^2 \right).
\end{aligned} \tag{18}$$
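The linear coefficients of $\eta_Y^Z$ in (17) and (18) must be equal. Substituting (5) into (18), we have

$$F(Z) = \mathrm{const} + \Omega(X)\,\mathrm{Tr}\left[ J\, t(X)\, J^T \eta_Y^Z \right] + \Omega(X)\,\mathrm{Tr}\left[ J \left( \eta_X^Y\, C(X) \right) J^T \eta_Y^Z \right] + O\!\left( (\eta_X^Y)^2\, \eta_Y^Z \right) + O\!\left( (\eta_Y^Z)^2 \right). \tag{19}$$

Therefore, matching the linear coefficient of $\eta_Y^Z$ to that of (17), we have

$$t(Y) = \frac{J\, t(X)\, J^T}{\det|J|} + \frac{J \left( C(X)\, \eta_X^Y \right) J^T}{\det|J|} + O\!\left( (\eta_X^Y)^2 \right). \tag{20}$$

It can be shown that if $J$ is constrained to be symmetric, then

$$t_{ij}(Y) = t_{ij}(X) + B_{ijkl}(X)\,(\eta_X^Y)_{kl} + O\!\left( (\eta_X^Y)^2 \right), \tag{21}$$

where $B_{ijkl}(X)$ is the elastic stiffness coefficient [47]:

$$B_{ijkl}(X) = C_{ijkl}(X) + \tfrac{1}{2}\left( \delta_{ik} t_{jl}(X) + \delta_{jk} t_{il}(X) + \delta_{il} t_{jk}(X) + \delta_{jl} t_{ik}(X) - 2\delta_{kl} t_{ij}(X) \right). \tag{22}$$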
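As a worked check of Eq. (22), the sketch below builds $B_{ijkl}$ from a cubic $C_{ijkl}$ and a hydrostatic stress $t_{ij} = -P\delta_{ij}$. The numerical constants, the helper `cubic_elastic_constants`, and the function names are hypothetical values and names chosen for illustration; only the formula itself comes from the text.

```python
from itertools import product


def delta(a, b):
    """Kronecker delta."""
    return 1.0 if a == b else 0.0


def cubic_elastic_constants(c11, c12, c44):
    """Full C_ijkl of a cubic crystal in its cube axes (illustrative helper)."""
    C = [[[[0.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(3)]
    for i, j, k, l in product(range(3), repeat=4):
        all_equal = 1.0 if i == j == k == l else 0.0
        C[i][j][k][l] = (c12 * delta(i, j) * delta(k, l)
                         + c44 * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
                         + (c11 - c12 - 2.0 * c44) * all_equal)
    return C


def stiffness_from_elastic_constants(C, t):
    """Eq. (22): B = C + (1/2)(d_ik t_jl + d_jk t_il + d_il t_jk + d_jl t_ik - 2 d_kl t_ij)."""
    B = [[[[0.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(3)]
    for i, j, k, l in product(range(3), repeat=4):
        B[i][j][k][l] = C[i][j][k][l] + 0.5 * (
            delta(i, k) * t[j][l] + delta(j, k) * t[i][l]
            + delta(i, l) * t[j][k] + delta(j, l) * t[i][k]
            - 2.0 * delta(k, l) * t[i][j])
    return B


# Hypothetical numbers (GPa): a cubic crystal under hydrostatic pressure P.
c11, c12, c44, P = 170.0, 120.0, 75.0, 10.0
C = cubic_elastic_constants(c11, c12, c44)
t = [[-P * delta(i, j) for j in range(3)] for i in range(3)]
B = stiffness_from_elastic_constants(C, t)
B_unloaded = stiffness_from_elastic_constants(C, [[0.0] * 3 for _ in range(3)])
```

For this hydrostatic case the formula reduces to $B_{1111} = C_{1111} - P$, $B_{1122} = C_{1122} + P$ and $B_{1212} = C_{1212} - P$, and `B_unloaded` coincides with `C`, in line with the statement that $B$ and $C$ agree only at zero load.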
$B_{ijkl}(X)$ is equal to $C_{ijkl}(X)$ only when $t_{ij}(X) = 0$; therefore the use of the elastic constant as the linear expansion coefficient of stress versus strain (both defined above) is a valid practice only at zero load. It can be proven by minimizing the Gibbs free energy [47] that equilibrium is reached at $X$ when $t_{ij}(X) = \tau_{ij}$. Thus the two quantities have identical values at equilibrium; however, they have different physical connotations. Atomistic expressions for the thermodynamic stress and elastic constants can be derived for the canonical ensemble [50, 52]. The partition function for a deformed system is

$$Z(X, M) = \int_{MX} \exp\left(-\beta \mathcal{H}(\tilde{q}^N, \tilde{p}^N)\right) \frac{d\tilde{q}^N\, d\tilde{p}^N}{N!\, h^{3N}}, \tag{23}$$

where we assume

$$\mathcal{H}(\tilde{q}^N, \tilde{p}^N) = \sum_{n=1}^{N} \frac{\tilde{p}_n^T \cdot \tilde{p}_n}{2m_n} + V(\tilde{q}_1, \tilde{q}_2, \ldots, \tilde{q}_N). \tag{24}$$

Under a change of variables $\tilde{q}_n \to q_n$, $\tilde{p}_n \to p_n$:

$$\tilde{p}_n \equiv M^{-1} p_n, \qquad \tilde{q}_n \equiv M q_n, \qquad n = 1, \ldots, N, \tag{25}$$

the Hamiltonian can be written as

$$\mathcal{H}(q^N, p^N) = \sum_{n=1}^{N} \frac{p_n^T M^{-2} p_n}{2m_n} + V(Mq_1, Mq_2, \ldots, Mq_N). \tag{26}$$

Using (4) and also

$$M^{-2} = \frac{1}{1 + 2\eta_X} = 1 - 2\eta_X + 4\eta_X^2 + \ldots \tag{27}$$

the partition function can be written as:

$$Z(X, \eta_X) = \int_X \exp\left( -\beta \left[ \sum_{n=1}^{N} \frac{p_n^T (1 - 2\eta_X + 4\eta_X^2)\, p_n}{2m_n} + V\!\left( \left(1 + \eta_X - \tfrac{1}{2}\eta_X^2\right) q^N \right) \right] \right) dq^N\, dp^N, \tag{28}$$

where we threw away the $N!\, h^{3N}$ constant. Using index notation $\eta_{ij}$ for the matrix $\eta_X$:

$$\frac{\partial F}{\partial \eta_{ij}} = -\frac{1}{\beta} \cdot \frac{1}{Z} \frac{\partial Z}{\partial \eta_{ij}} = \frac{1}{Z} \int_X T_{ij} \exp(-\beta \mathcal{H})\, dq^N\, dp^N \tag{29}$$
where

$$\mathcal{H}(q^N, p^N) \approx \sum_{n=1}^{N} \frac{p_n^T (1 - 2\eta_X + 4\eta_X^2)\, p_n}{2m_n} + V\!\left( \left(1 + \eta_X - \tfrac{1}{2}\eta_X^2\right) q^N \right), \tag{30}$$

and

$$T_{ij} = \frac{\partial \mathcal{H}}{\partial \eta_{ij}} = \sum_{n=1}^{N} \left[ \frac{p_i^n (-\delta_{jk} + 4\eta_{jk})\, p_k^n}{m_n} + (\delta_{ik} - \eta_{ik})\, q_k^n\, \nabla_j^n V\!\left( (1 + \eta_X)\, q^N \right) \right]. \tag{31}$$

Setting $\eta_X$ to zero, we get the atomistic formula for the thermodynamic stress:

$$t_{ij}(X) = \frac{1}{\Omega(X)}\, \hat{S}_2\!\left( \left\langle \sum_{n=1}^{N} \left[ \frac{-p_i^n p_j^n}{m_n} + q_i^n\, \nabla_j^n V(q^N) \right] \right\rangle \right). \tag{32}$$

Here $\langle \cdot \rangle$ means the canonical ensemble average in the original configuration $X$. One may wonder why the sum (32) does not always give 0 at $T = 0$, since $\nabla_j^n V(q^N) \equiv 0$ for bulk atoms at equilibrium. The answer is that if we were to compute the stress using (32) as it stands, we must count the atoms on the surface, whose equilibrium conditions $F_j^n = \nabla_j^n V(q^N)$ in general require the presence of an external force $F_j^n$, which is the force the wall exerts on the atom to keep it within $X$. Since in (32) those $F^n$'s are weighted by $q^n$'s, this surface contribution does not vanish in the thermodynamic limit ($N, \Omega \to \infty$) on a per volume basis, as the surface energy does. On the other hand, it appeals to one's intuition that stress originates from the bulk, not from the surface, and is an intensive quantity. This can be seen in the following way. $V(q^N)$ is in general a sum of local interactions, for instance $V(q^N) = \sum_{\{lmn\}} W(q_l, q_m, q_n)$, where the $W$'s are three-body local interactions. Due to translational symmetry, $W(q_l + \delta, q_m + \delta, q_n + \delta) = W(q_l, q_m, q_n)$, one must have $\nabla^l W + \nabla^m W + \nabla^n W \equiv 0$, so the contribution of this specific interaction to the total sum in (32) can be rewritten as $(q_i^l - q_i^n)\nabla_j^l W + (q_i^m - q_i^n)\nabla_j^m W$, conceptualized as $F \cdot \Delta q$, i.e., a force contribution weighted by the relative distance between action and reaction. Through this localization transformation, all $q^n$ weighting factors in the sum can be converted to $\Delta q$'s which are not larger than the interatomic distance. In this transformed summation, which should be converted from (32) as soon as the interatomic potential model is known, the surface contribution vanishes in the thermodynamic limit on a per volume basis, like the surface energy. So for local interactions, we can prove that the stress is intensive and indeed may be thought of as originating from the bulk.
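The localization argument can be illustrated for the simplest case of a pair potential, where the potential part of Eq. (32) becomes a sum over pairs of $\phi'(r)\, r_i r_j / r$ with $r$ the pair separation. The sketch below is a minimal illustration under stated assumptions: zero temperature (no kinetic term), an orthorhombic periodic box with the minimum-image convention, and a hypothetical harmonic nearest-neighbour potential on a simple cubic lattice. None of these choices, nor the function name, come from the original text.

```python
import math
from itertools import product


def pair_virial_stress(positions, box, dphi, cutoff):
    """Potential part of Eq. (32) for a pair potential, in localized form:
    t_ij = (1/volume) * sum over pairs of phi'(r) * r_i * r_j / r,
    where r is the minimum-image separation in an orthorhombic periodic box."""
    volume = box[0] * box[1] * box[2]
    t = [[0.0] * 3 for _ in range(3)]
    n = len(positions)
    for a in range(n):
        for b in range(a + 1, n):
            r = []
            for c in range(3):
                dx = positions[a][c] - positions[b][c]
                dx -= box[c] * round(dx / box[c])  # minimum image
                r.append(dx)
            dist = math.sqrt(sum(x * x for x in r))
            if dist < cutoff:
                f = dphi(dist) / dist
                for i in range(3):
                    for j in range(3):
                        t[i][j] += f * r[i] * r[j]
    return [[t[i][j] / volume for j in range(3)] for i in range(3)]


# Hypothetical model: simple cubic lattice, harmonic springs between nearest neighbours.
k_spring, r_eq, a, cells = 2.0, 1.0, 1.05, 3
positions = [(a * i, a * j, a * k) for i, j, k in product(range(cells), repeat=3)]
box = (a * cells, a * cells, a * cells)
stress = pair_virial_stress(positions, box,
                            dphi=lambda r: k_spring * (r - r_eq),
                            cutoff=1.2 * a)
expected_txx = k_spring * (a - r_eq) / a ** 2  # analytic value for this lattice
```

Only pair separations enter the sum, so no surface atoms need to be identified, and the result matches the analytic value $k(a - r_{eq})/a^2$ for the stretched lattice (positive, i.e., tensile, in the sign convention of Eq. (32)).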
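To get the atomistic formula for elastic constants, we need to further differentiate (29):

$$\begin{aligned}
\frac{\partial^2 F}{\partial \eta_{ij}\, \partial \eta_{kl}} &= \frac{1}{Z} \int_X \left( \frac{\partial T_{ij}}{\partial \eta_{kl}} - \beta T_{ij} T_{kl} \right) \exp(-\beta \mathcal{H})\, dq^N\, dp^N \\
&\quad + \frac{\beta}{Z^2} \int_X T_{ij} \exp(-\beta \mathcal{H})\, dq^N\, dp^N \int_X T_{kl} \exp(-\beta \mathcal{H})\, dq^N\, dp^N \\
&= \beta \left( \langle T_{ij} \rangle \langle T_{kl} \rangle - \langle T_{ij} T_{kl} \rangle \right) + \left\langle \frac{\partial T_{ij}}{\partial \eta_{kl}} \right\rangle.
\end{aligned} \tag{33}$$

From (31) we can get:

$$\left.\frac{\partial T_{ij}}{\partial \eta_{kl}}\right|_{\eta_X = 0} = \sum_{n=1}^{N} \frac{4 p_i^n p_k^n}{m_n}\, \delta_{jl} + \sum_{m,n=1}^{N} q_k^m q_i^n\, \nabla_l^m \nabla_j^n V(q^N) - \delta_{il} \sum_{n=1}^{N} q_k^n\, \nabla_j^n V(q^N). \tag{34}$$

So we get the unsymmetrized form of elastic constants:

$$D_{ijkl} = \beta \Omega(X) \left( \langle t_{ij} \rangle \langle t_{kl} \rangle - \langle t_{ij} t_{kl} \rangle \right) + \frac{1}{\Omega(X)} \left\langle \sum_{n=1}^{N} \frac{4 p_i^n p_k^n}{m_n} \right\rangle \delta_{jl} + \frac{1}{\Omega(X)} \left\langle \sum_{m,n=1}^{N} q_k^m q_i^n\, \nabla_l^m \nabla_j^n V(q^N) - \sum_{n=1}^{N} q_k^n\, \nabla_j^n V(q^N)\, \delta_{il} \right\rangle. \tag{35}$$

The first term is defined to be the fluctuation term. The last term is defined to be the Born term, usually written as $C^B_{ijkl}$. The elastic constant is therefore

$$C_{ijkl} = \hat{S}_4(D_{ijkl}), \tag{36}$$

which is valid at finite temperature and for arbitrary stress. The summation (35) also needs to undergo the same localization procedure as (32) to be computable in atomistic calculations. Equations (32) and especially (35) are applicable only to the canonical ensemble. For the micro-canonical ensemble, a different set of formulas can be derived [53].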
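The fluctuation term of Eq. (35) can be estimated from a sequence of sampled stress tensors. The sketch below is a minimal illustration that assumes the $t_{ij}$ inside the averages are instantaneous per-configuration values and that `stress_samples` holds such 3×3 tensors drawn from a canonical simulation; the function name, the two-sample data set, and the numbers are hypothetical.

```python
from itertools import product


def fluctuation_term(stress_samples, beta, volume):
    """First term of Eq. (35): beta * volume * (<t_ij><t_kl> - <t_ij t_kl>),
    estimated as plain sample averages over instantaneous 3x3 stress tensors."""
    n = len(stress_samples)
    mean = [[sum(s[i][j] for s in stress_samples) / n for j in range(3)]
            for i in range(3)]
    D = [[[[0.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(3)]
    for i, j, k, l in product(range(3), repeat=4):
        mean_product = sum(s[i][j] * s[k][l] for s in stress_samples) / n
        D[i][j][k][l] = beta * volume * (mean[i][j] * mean[k][l] - mean_product)
    return D


def zero_stress():
    return [[0.0] * 3 for _ in range(3)]


# Hypothetical two-sample "trajectory": t_11 fluctuates between +0.5 and -0.5.
sample_plus, sample_minus = zero_stress(), zero_stress()
sample_plus[0][0], sample_minus[0][0] = 0.5, -0.5
D_fluct = fluctuation_term([sample_plus, sample_minus], beta=2.0, volume=10.0)
```

The diagonal entries such as `D_fluct[0][0][0][0]` are never positive, since they are minus a variance, which shows how thermal stress fluctuations soften the elastic constants relative to the Born term. A production estimate would also need the kinetic and Born terms and the symmetrization of Eq. (36).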
2. Dislocation Core Energy
The dislocation core is a remarkable bond-cutting machine (the “sharpest knife”) that nature comes up with to relieve the stored elastic energy. While the internal mechanisms of this machine can be highly complicated, the overall effect is that atomic bonds come into the machine, get cut in shear, and new bonds with dislocated neighbors are left in the wake, much like a combine in a crop field. With its operation, diffuse elastic strain in the environment is collected and condensed into local inelastic (transformation) strain in a one-atomic-layer thin platelet, the glide plane [5]. There are actually two definitions of the dislocation core size [24, 25]: a physical core width and a mathematical/elasticity core width. The physical core was described in the first paragraph, and is defined by atoms whose local atomic order, like the coordination number or inversion symmetry (chap. 2.32), is drastically different from that of the crystalline bulk, from which we may define a core size $r_0^{\rm phys}$. In other words, the physical core is the set of atoms which are participating actively in the bond-cutting business. Obviously, $r_0^{\rm phys}$ is significant and useful, but need not be a precise real number (like $1.8234a_0$) due to lattice discreteness. In contrast, the mathematical core radius $r_0$ and core energy $E_{\rm core}$ can be defined precisely as real numbers from an asymptotic expansion of the total energy of a dislocation dipole in an infinite, and otherwise perfect, atomic lattice,

$$E(\mathbf{d}) = 2E_{\rm core} + 2A(\theta) + \frac{K_s |\mathbf{b}|^2}{2\pi} \log\frac{|\mathbf{d}|}{r_0} + O\!\left( |\mathbf{d}|^{-1} \right), \tag{37}$$
at large $|\mathbf{d}|$. Here, $E(\mathbf{d})$ is defined to be the total energy increase in a thought experiment of an infinite lattice whose atoms displace according to the leading-order [55] solution $\mathbf{u}_G(\mathbf{x})$ at $|\mathbf{x} - \mathbf{d}/2|, |\mathbf{x} + \mathbf{d}/2| \gg r_0^{\rm phys}$, but which are allowed to relax atomistically near the physical cores. As the Stroh solution is self-equilibrating (stress equilibrium is satisfied), the above thought experiment is well-posed and $E(\mathbf{d})$ is the final increase in the atomistic total energy. At large $|\mathbf{d}|$, the leading $\mathbf{d}$-dependent term in $E(\mathbf{d})$ must be $K_s |\mathbf{b}|^2 \log|\mathbf{d}| / 2\pi$, with $K_s$ proven invariant with respect to the displacement cut direction $\hat{\mathbf{d}} \equiv \mathbf{d}/|\mathbf{d}|$ [56]. Let us define $\theta$ to be the angle between $\hat{\mathbf{d}}$ and an arbitrarily chosen reference direction $\hat{\mathbf{a}}$, with $\hat{\mathbf{d}} \perp \boldsymbol{\xi}$ and $\hat{\mathbf{a}} \perp \boldsymbol{\xi}$, $|\hat{\mathbf{d}}| = |\hat{\mathbf{a}}| = 1$, where $\boldsymbol{\xi}$ is the line direction of the straight dislocation. An asymptotic expansion of $E(\mathbf{d})$ at large $|\mathbf{d}|$ would yield $O(\log|\mathbf{d}|)$, $O(1)$, $O(|\mathbf{d}|^{-1})$, ... terms. The $O(1)$ term may contain a $\theta$-dependent component $2A(\theta)$ and a $\theta$-independent component. For the sake of definiteness, we require $A(\theta = 0) = 0$, and $\hat{\mathbf{a}}$ will be called the zero-angle reference axis. $A(\theta)$ is given entirely by anisotropic elasticity,

$$2A(\theta) = \sum_{\alpha=1}^{3} \frac{\mathbf{b}^T K_\alpha \mathbf{b}}{4\pi} \log\frac{(\hat{d}_x + p_\alpha^r \hat{d}_y)^2 + (p_\alpha^i \hat{d}_y)^2}{(\hat{a}_x + p_\alpha^r \hat{a}_y)^2 + (p_\alpha^i \hat{a}_y)^2}, \tag{38}$$

where $p_\alpha \equiv p_\alpha^r + i p_\alpha^i$, $\alpha = 1..3$, are the three Stroh eigenvalues with nonnegative imaginary parts, and $K_\alpha \equiv -2\left( \mathrm{Re}(L_\alpha)\mathrm{Im}(L_\alpha)^T + \mathrm{Im}(L_\alpha)\mathrm{Re}(L_\alpha)^T \right)$ is the mode-specific modulus [56], with $\sum_{\alpha=1}^{3} \mathbf{b}^T K_\alpha \mathbf{b} = K_s |\mathbf{b}|^2$. Physically, $2A(\theta)$ is the rotational energy landscape of a dislocation dipole with fixed $|\mathbf{d}|$ in an infinite anisotropic medium [24], when $|\mathbf{d}|$ is asymptotically large. It is seen
from (38) that $A(\theta) = A(\theta + \pi)$. To illustrate, $A(\theta)$'s for Si $a_0/2[\bar{1}10]$ and Mo $a_0/2[111]$ screw dislocations are evaluated and shown in Fig. 1. With the $O(\log|\mathbf{d}|)$ and $\theta$-dependent $O(1)$ parts known, the $|\mathbf{d}|$- and $\theta$-independent $O(1)$ part of $E(\mathbf{d})$ can be used to determine the mathematical core ($r_0$, $E_{\rm core}$) pair. Imagine that, for a fixed $\theta$, we plot the $E(\mathbf{d})$ data against $|\mathbf{d}|$ on a chart ($\mathbf{d}$ can only take discrete lattice spacings), and we would like to fit the data to a smooth function $\tilde{E}(\mathbf{d})$. We need to shift the function $K_s |\mathbf{b}|^2 \log|\mathbf{d}| / 2\pi$ up or down to get a good fit at large $|\mathbf{d}|$. That shift operation is well defined asymptotically and is mathematically unique. If we ignore the $|\mathbf{d}|^{-1}$, etc. terms in the fitting template $\tilde{E}(\mathbf{d}) \equiv 2E_{\rm core} + 2A(\theta) + \frac{K_s |\mathbf{b}|^2}{2\pi} \log\frac{|\mathbf{d}|}{r_0}$, then $2E_{\rm core} + 2A(\theta)$ would be the value of $\tilde{E}(\mathbf{d})$ at $|\mathbf{d}| = r_0$. It does not mean, however, that $E(r_0) = 2E_{\rm core} + 2A(\theta)$, as $\tilde{E}(\mathbf{d})$ only fits $E(\mathbf{d})$ well at large $|\mathbf{d}|$ (satisfying at minimum $|\mathbf{d}| \gg 2r_0^{\rm phys}$). It is thus clear that $r_0$ and $E_{\rm core}$ are mathematical instruments to fit $E(\mathbf{d})$ to an asymptotic form, and neither quantity carries physical meaning alone. If one likes, one may choose $r_0 = 1000|\mathbf{b}|$ and select $E_{\rm core}$ accordingly, so $\tilde{E}(\mathbf{d})$ remains the same function and nothing is changed. There are several popular choices, however, such as (a) take $r_0 = |\mathbf{b}|$, (b) choose $r_0$ so that $E_{\rm core} = 0$, (c) take $r_0 = r_0^{\rm phys}$ to minimize confusion, (d) take $r_0 = 1$ Å to simplify numerical calculation, etc. It is seen that except for (c), none of the $r_0$'s has anything to do with a physical core size. It is also clear that although $\tilde{E}(\mathbf{d})$ by definition must fit $E(\mathbf{d})$ well at large $|\mathbf{d}|$, there should be a big error as $|\mathbf{d}| \to 2r_0^{\rm phys}$ and the physical cores begin to overlap. Finally, $r_0$ and $E_{\rm core}$ (and $\hat{\mathbf{a}}$ too) combined
[Figure 1: two panels, (a) and (b), each plotting A(θ) [eV/Å] against θ [degree] from 0 to 180; the legend in panel (b) distinguishes the TB and FS curves.]

Figure 1. (a) The angular function $A(\theta)$ of the $a_0/2[\bar{1}10]$ shuffle-set screw dislocation in Stillinger–Weber potential Si [24], with $\langle 11\bar{2} \rangle$ as the zero-angle reference axis $\hat{\mathbf{a}}$. The corresponding core energy is computed to be 0.502 eV/Å for $r_0 = |\mathbf{b}|$. In a separate calculation [54], with $\langle 111 \rangle$ as the zero-angle reference axis, the core energy was computed to be 0.526 eV/Å. The 0.024 eV/Å difference is verified to be exactly $A(\theta = \pi/2)$, as shown above in circle. (b) The angular function $A(\theta)$ for the Mo $a_0/2[111]$ screw dislocation using the Finnis–Sinclair potential (dashed line) and the tight-binding potential (solid line), both with $\hat{\mathbf{a}}$ chosen to be $\langle 11\bar{2} \rangle$. Here $A(\theta) = A(\theta + \pi/3)$ due to crystal symmetry.
do carry physical meaning, as much as any other defect formation energy, for example in evaluating the absolute total energy of formation of a dislocation loop. The atomistically computed $E_{\rm core}$ is critical for constructing the total energy landscape of coarse-grained models like nodal dislocation dynamics. From the above, it is apparent that the choice of the zero-angle reference axis $\hat{\mathbf{a}}$ influences the numerical value of $E_{\rm core}$, in addition to the choice of $r_0$. This point is not widely appreciated. Indeed, even the existence of the dipole rotational energy $2A(\theta)$ has usually been ignored in the analyses of atomistic simulation results in the literature. Note from Eq. (38) that $A(\theta)$ originates entirely from elasticity. $A(\theta \neq n\pi)$ is generally non-zero for any dislocation dipole except a screw dislocation dipole in an isotropic medium. For example, $A(\theta)$ is nonzero for an edge dislocation dipole in an isotropic medium. $E_{\rm core}$ thoroughly characterizes the net energy consequence of core atomic relaxations, but one must be informed about what elasticity function parameters $r_0$ and $\hat{\mathbf{a}}$ are chosen as matching partners. For instance, it was reported [54] that $E_{\rm core}$ of the $a_0/2[\bar{1}10]$ shuffle-set screw dislocation in diamond cubic Si was 0.502 eV/Å, with $r_0 = |\mathbf{b}|$ and using the Stillinger–Weber potential. Later, a separate, independent calculation gave $E_{\rm core} = 0.526$ eV/Å for the same setup. It was then traced back and determined that while the latter calculation uses the definition $\hat{\mathbf{a}} = \langle 11\bar{2} \rangle$, the former calculation in effect used $\hat{\mathbf{a}} = \langle 111 \rangle$. The offset is exactly given by $A(\theta = \pi/2) = 0.024$ eV/Å as shown in Fig. 1(a). So both calculations are correct, with the only difference being the choice of the zero-angle reference axis $\hat{\mathbf{a}}$, and the $E_{\rm core}$'s are related by a trivial conversion. To reiterate, the numerical value of $E_{\rm core}$ carries no physical meaning unless $\hat{\mathbf{a}}$ and $r_0$ are specified. The conversion of $E_{\rm core}$ to another ($\hat{\mathbf{a}}$, $r_0$) “basis” can be performed easily using the fact that $E(\mathbf{d})$ of Eq. (37), being a physical measurable in a well-posed thought experiment, is invariant, while $\hat{\mathbf{a}}$, $r_0$, $E_{\rm core}$ are merely parameters in the mathematical representation of its asymptotic form. In the example next, we show how the core energy of the BCC Mo screw dislocation can be calculated in small supercells using the Finnis–Sinclair potential [57]. All our $E_{\rm core}$ values below will be based on $r_0 = |\mathbf{b}|$ and $\hat{\mathbf{a}} = \langle 11\bar{2} \rangle$. The setup is as follows. Define $\mathbf{e}_1 = a_0[11\bar{2}]$, $\mathbf{e}_2 = a_0[\bar{1}10]$, $\mathbf{e}_3 = a_0/2[111]$. An orthogonal supercell $7\mathbf{e}_1 \times 11\mathbf{e}_2 \times \mathbf{e}_3$ is almost square and contains 462 atoms, in which we can put four equally spaced screw dislocations to form a quadrupole. Because of symmetry redundancy, this quadrupole cell can be mapped to an entirely equivalent dipole cell half its size with three edges $\mathbf{h}_1 = 7\mathbf{e}_1$, $\mathbf{h}_2 = 3.5\mathbf{e}_1 + 5.5\mathbf{e}_2 + 0.5\mathbf{e}_3$, $\mathbf{h}_3 = \mathbf{e}_3$. The $0.5\mathbf{e}_3$ in $\mathbf{h}_2$ is critical to this mapping, in view of the fact that $\epsilon_{\rm total} = \epsilon_{\rm elastic} + \epsilon_{\rm plastic}$, where $\epsilon_{\rm total}$ is the total strain corresponding to the tilt of the supercell, $\epsilon_{\rm plastic}$ is the plastic strain generated by the displacement cut in the dipole cell (in the quadrupole cell, $\epsilon_{\rm plastic}$ is zero as there are two opposing cuts), and $\epsilon_{\rm elastic}$ is the volume-averaged elastic strain in the supercell, which relates directly to the cell-averaged virial stress $\tau_{\rm virial}$. So, by “preemptively” making $\epsilon_{\rm total} = \epsilon_{\rm plastic}$, we make sure that
$\epsilon_{\rm elastic} = 0$ and $\tau_{\rm virial} \approx 0$. It can be shown that (a) $\tau_{\rm virial} = 0$ minimizes the supercell total energy $E_{\rm atomistic}$ with respect to cell shape ($\mathbf{h}_1$, $\mathbf{h}_2$, $\mathbf{h}_3$) [24, 54], and (b) at dipole separation $\mathbf{d} = \mathbf{h}_1/2$, the local stresses at the first and second dislocations vanish simultaneously: $\tau_1 = \tau_2 = 0$. This stabilizes the two dislocations so they do not annihilate, which happens frequently in small supercell calculations. Even when they do not annihilate, a finite driving force would push the dislocation core against the lattice barrier and distort its shape from equilibrium, which introduces error in the computed core energy $E_{\rm core}$. We can now briefly discuss the image sum procedure for extracting the core energy from periodic supercell calculations. A detailed account is given in chap. 2.22. An instructive approach to this problem is to think about how to explicitly construct a displacement field $\mathbf{u}(\mathbf{x})$ in the supercell that (a) satisfies the displacement cut required by the dipole, (b) is self-equilibrating, and (c) is compatible with the periodic boundary condition (PBC): $\mathbf{u}(\mathbf{x} + \mathbf{h}_i^0) = \mathbf{u}(\mathbf{x})$, and likewise for derivatives of all orders including the first, with $\{\mathbf{h}_i^0\}$ being the supercell edges before the dipole cut. The following Green's function sum

$$\tilde{\mathbf{u}}_\lambda(\mathbf{x}) \equiv \lambda \left[ \mathbf{u}_G(\mathbf{x}) + \sum_{\mathbf{R} \neq 0} \mathbf{u}_G(\mathbf{x} - \mathbf{R}) \right] \tag{39}$$

could conceivably lead to $\mathbf{u}(\mathbf{x})$, where $\mathbf{u}_G(\mathbf{x})$ is the displacement field of an isolated dislocation dipole in an infinite medium (the one used in the thought experiment). The dislocation lines are all parallel to $\mathbf{h}_3^0$, and $\mathbf{R} = n_1 \mathbf{h}_1^0 + n_2 \mathbf{h}_2^0$, $n_1 = -N..N$, $n_2 = -\alpha N..\alpha N$. $\lambda$ runs from 0 to 1 to label the magnitude of the cut displacement from 0 to $\mathbf{b}$. The presence of the $\mathbf{u}_G(\mathbf{x})$ term in $\tilde{\mathbf{u}}_\lambda(\mathbf{x})$ satisfies condition (a). Condition (b) is trivially satisfied as all Green's function displacements are self-equilibrating away from the cores. Condition (c) is a bit more subtle. But it can be rigorously shown that

$$\tilde{\mathbf{u}}_\lambda(\mathbf{x} + \mathbf{h}_i^0) - \tilde{\mathbf{u}}_\lambda(\mathbf{x}) = \lambda D(\alpha)\, \mathbf{h}_i^0 + O\!\left( \frac{1}{N} \right) \tag{40}$$

as $N \to \infty$, where $D(\alpha)$ is a $3 \times 3$ affine transformation matrix that depends on the image summation aspect ratio $\alpha$ only. $D(\alpha)$ is the cause of the apparent conditional convergence. To get rid of it, we write:

$$\mathbf{u}_\lambda(\mathbf{x}) = \tilde{\mathbf{u}}_\lambda(\mathbf{x}) - \lambda D(\alpha)\, \mathbf{x}. \tag{41}$$
It is seen now that $\mathbf{u}_\lambda(\mathbf{x})$ satisfies (a), (b), (c) simultaneously, so one can use $\mathbf{u}_{\lambda=1}(\mathbf{x})$ to transform atoms in the PBC cell without creating gaps or stress non-equilibrium. In practice, $D(\alpha)$ is evaluated numerically by analyzing the behavior of $\tilde{\mathbf{u}}_\lambda(\mathbf{x})$ from image summations at a constant $\alpha$ and progressively larger $N$'s. Suppose we start out with a PBC supercell $\{\mathbf{h}_i^0\}$ containing a stress-free crystal. We adiabatically change $\lambda$ by effecting a cut increment $d\lambda\, \mathbf{b}$ along the dipole cut in the cell. At each instant, the displacement field in the cell is $\mathbf{u}_\lambda(\mathbf{x})$, so the stress field $\sigma_\lambda(\mathbf{x})$ is available by plugging in $\nabla \mathbf{u}_\lambda(\mathbf{x})$. The incremental work is simply:

$$dW = d\lambda \int \mathbf{b} \cdot \sigma_\lambda(\mathbf{x}) \cdot \mathbf{n}\, dS, \tag{42}$$
which is converted to potential energy. Equations (39), (41), and (42) combined give a total energy expression that consists of:
• dipole self-energy in the form of (37)
• image dipole/displacement-cut coupling energy
• $D(\alpha)$ stress/displacement-cut coupling energy
Summation over individual Stroh modes like Eq. (38) is needed to account for the dipole–dipole interaction energy $E_{\rm dipole-dipole}$. The expression

$$E_{\rm dipole-dipole} = \frac{K_s |\mathbf{b}|^2}{2\pi} \log\frac{|\mathbf{R} + \mathbf{d}|\, |\mathbf{R} - \mathbf{d}|}{|\mathbf{R}|^2} \tag{43}$$

is simply incorrect in an anisotropic medium as it ignores the $2A(\theta)$ angular-coupling terms. Note also that one needs to put in an extra factor of 1/2:

$$W_{\rm image\ dipole} = \tfrac{1}{2} E_{\rm dipole-dipole} \tag{44}$$

for the $\mathbf{R} \neq 0$ dipole–dipole interaction energy, since one dipole “owns” only one half of the total coupling energy. All these follow automatically from Eq. (42). The Eq. (41) setup is easier to explain, but gives a large supercell virial stress, because $\epsilon_{\rm total} = 0$ and therefore $\epsilon_{\rm elastic} = -\epsilon_{\rm plastic}$, where

$$\epsilon_{\rm plastic} \equiv \frac{D_{\rm plastic} + D_{\rm plastic}^T}{2}, \qquad D_{\rm plastic} \equiv \frac{\mathbf{b}\, (\mathbf{d} \times \mathbf{h}_3^0)^T}{V}. \tag{45}$$
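Therefore in practice we more often use the solution

$$\mathbf{u}_\lambda(\mathbf{x}) = \tilde{\mathbf{u}}_\lambda(\mathbf{x}) + \lambda \left( D_{\rm plastic} - D(\alpha) \right) \mathbf{x} \tag{46}$$

with a new supercell $\mathbf{h}_i = \mathbf{h}_i^0 + \lambda D_{\rm plastic} \mathbf{h}_i^0$, which is the setup introduced at the beginning of this section. The energy of this setup can be related to the previous one by accounting for the boundary work, which leads to a very simple result [24, 53]. To validate the above, we relax the Mo screw dislocation dipole in four supercell geometries using the Finnis–Sinclair potential:
i. $\mathbf{h}_1 = 7\mathbf{e}_1$, $\mathbf{h}_2 = 3.5\mathbf{e}_1 + 5.5\mathbf{e}_2 + 0.5\mathbf{e}_3$, $\mathbf{h}_3 = \mathbf{e}_3$ cell, containing 231 atoms,
ii. $\mathbf{h}_1 = 8\mathbf{e}_1$, $\mathbf{h}_2 = 16\mathbf{e}_2 + 0.5\mathbf{e}_3$, $\mathbf{h}_3 = \mathbf{e}_3$ cell, containing 768 atoms,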
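iii. $\mathbf{h}_1 = 16\mathbf{e}_1$, $\mathbf{h}_2 = 64\mathbf{e}_2 + 0.5\mathbf{e}_3$, $\mathbf{h}_3 = \mathbf{e}_3$ cell, containing 6,144 atoms,
iv. $\mathbf{h}_1 = 32\mathbf{e}_1$, $\mathbf{h}_2 = 32\mathbf{e}_2 + 0.5\mathbf{e}_3$, $\mathbf{h}_3 = \mathbf{e}_3$ cell, containing 6,144 atoms.
The differential displacement maps [58] of (i) and (ii) are shown in Fig. 2, in which the spontaneous polarities are manifest. If we use Å as the length unit, then we can write:

$$E_{\rm atom} = E_{\rm elastic} + 2\left( E_{\rm core} - \frac{K_s |\mathbf{b}|^2}{4\pi} \log r_0 \right) |\mathbf{h}_3^0|, \tag{47}$$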
where $E_{\rm atom}$ is the increase in total energy in the PBC supercell and $E_{\rm elastic}$ is the result of the elastic energy summation without the $r_0$, $E_{\rm core}$ constants. The $2A(\theta)$ term in Eq. (37) gives no contribution because we choose $\hat{\mathbf{a}} = \langle 11\bar{2} \rangle$ (but its effects are present in the image dipole coupling energies). $K_s |\mathbf{b}|^2 / 4\pi$, the single dislocation energy prefactor, is 0.499 eV/Å for the Finnis–Sinclair potential. Numerical results for (i)–(iv) are shown in Table 1. We see that as the supercell size and shape vary, the elastic energy contribution $E_{\rm elastic}$ dominates the total energy landscape. However, the differences between $E_{\rm atom}$ and $E_{\rm elastic}$ remain remarkably constant. If we take $r_0 = |\mathbf{b}|$ and $\hat{\mathbf{a}} = \langle 11\bar{2} \rangle$, then $E_{\rm core} = 0.300 \pm 0.001$ eV/Å, a definitive result. Further, we note that cell
[Figure 2: two panels, (i) and (ii).]

Figure 2. Differential displacement map [58] of Mo screw dislocation using the Finnis–Sinclair potential. (i) h1 = 7e1, h2 = 3.5e1 + 5.5e2 + 0.5e3, h3 = e3 cell. (ii) h1 = 8e1, h2 = 16e2 + 0.5e3, h3 = e3 cell.

Table 1. Mo screw dislocation core energy with r0 = |b| and â = ⟨11-2⟩ using the Finnis–Sinclair potential

Cell     E_supercell [eV]    E_elastic [eV]    E_core [eV/Å]
(i)      6.0410              7.1361            0.2995
(ii)     7.0069              8.0955            0.3006
(iii)    8.8935              9.9838            0.3003
(iv)     11.0432             12.1318           0.3007
(i), which contains only 231 atoms, is capable of representing the core energy very accurately.
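The entries of Table 1 can be cross-checked by solving Eq. (47) for $E_{\rm core}$. The sketch below does so and also converts a core energy to a different $r_0$ using the invariance of $E(\mathbf{d})$ in Eq. (37) at fixed $\hat{\mathbf{a}}$. The lattice constant $a_0 = 3.1472$ Å for Finnis–Sinclair Mo is an assumption not stated in the text, $|\mathbf{h}_3^0| = |\mathbf{b}| = a_0\sqrt{3}/2$ follows from $\mathbf{e}_3 = a_0/2[111]$, and the function names are illustrative.

```python
import math

A0_MO = 3.1472                         # Angstrom; assumed Finnis-Sinclair Mo lattice constant
PREFACTOR = 0.499                      # K_s |b|^2 / (4 pi) in eV/Angstrom, quoted in the text
B_LEN = A0_MO * math.sqrt(3.0) / 2.0   # |b| = |h3^0| for the a0/2[111] screw dislocation

# (E_supercell, E_elastic) in eV, copied from Table 1.
TABLE_1 = {
    "i": (6.0410, 7.1361),
    "ii": (7.0069, 8.0955),
    "iii": (8.8935, 9.9838),
    "iv": (11.0432, 12.1318),
}


def core_energy(e_atom, e_elastic, r0, h3_len, prefactor):
    """Solve Eq. (47) for E_core (eV/Angstrom), with lengths in Angstrom."""
    return (e_atom - e_elastic) / (2.0 * h3_len) + prefactor * math.log(r0)


def convert_core_energy(e_core, r0_old, r0_new, prefactor):
    """Change r0 at fixed a-hat: E_core - prefactor * log(r0) is invariant by Eq. (37)."""
    return e_core + prefactor * math.log(r0_new / r0_old)


e_core = {cell: core_energy(e_atom, e_el, B_LEN, B_LEN, PREFACTOR)
          for cell, (e_atom, e_el) in TABLE_1.items()}
e_core_1A = convert_core_energy(e_core["i"], B_LEN, 1.0, PREFACTOR)
```

Under these assumptions the four values agree with the last column of Table 1 to within about $10^{-4}$ eV/Å. Converting to $r_0 = 1$ Å gives a negative $E_{\rm core}$, which illustrates that the number alone carries no physical meaning without its $r_0$ and $\hat{\mathbf{a}}$.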
3. Crack-tip Dislocation Emission
A stressed crack tip has two basic options to relieve its stored strain energy: surface creation by breaking bonds, or plastic deformation (localized shearing). Whichever route has the lower activation energy in the long run should be the dominant mechanism. Therefore activation energy calculations are essential for understanding brittle-to-ductile transitions (BDT). Dislocation nucleation [59, 60] and migration [61] are both possible rate-limiting steps in BDT. The former has become one of the standard problems in nanomechanics [62, 63], because proper treatment of the crack and dislocation cores is necessary. Previous atomistic calculations focused on $K_{\rm emit}$, the athermal dislocation emission threshold, and the so-called 2D activation pathway in which the dislocation is constrained to be always straight. Zhu et al. [2] have applied the nudged elastic band (NEB) method [36] to calculate the 3D bow-out nucleation pathway atomistically. Figures 3a–3c show the calculation setup for a Cu (111) crack using the empirical potential of Mishin [64]. The 3D minimum energy path (MEP) obtained at $K_I = 0.44$ MPa$\sqrt{\rm m}$ is compared with the 2D MEP in Fig. 3d. It is seen to be the lower pathway for the same initial and final states. The external load is applied via a fixed-displacement boundary condition for all the NEB nodes (i–ix) during path relaxation. We find that for this model of Cu with the unstable stacking energy $\gamma_{\rm us} = 158$ mJ/m$^2$, the Rice–Peierls model [62] underestimates both $K_{I,{\rm emit}}$ and the activation energy $Q(K_I)$ of partial dislocation emission. $K_{I,{\rm emit}}$ turns out to be 0.508 MPa$\sqrt{\rm m}$, which is 45% greater than the 0.35 MPa$\sqrt{\rm m}$ from the analytic formula of Rice and Beltz [62]. Furthermore, at $(K_I/K_{I,{\rm emit}})^2 = G_I/G_{I,{\rm emit}} = 0.75$, we find $Q(K_I)$ to be 1.1 eV, which is significantly larger than the first continuum estimate of 0.18 eV based on a perturbative approach [62], and a second, improved estimate of 0.41 eV using a more flexible representation of the embryonic dislocation loop [63]. Preliminary analyses indicate that two factors may be causing the discrepancy, which if corrected, may lead to much better semi-continuum models. The first is the neglect of surface deformation energetics near the crack tip [59, 60]. The second is that we believe the continuum models may induce a systematic error in the dislocation core energy $E_{\rm core}$ (see last topic), which drives down the energy cost of nucleating a half loop. We suggest that whenever one uses semi-continuum models to calculate activation energies, the core energies of straight dislocations should first be calibrated against atomistic results. The semi-continuum model may then be systematically improved to give better core energies, or if not, very often the error can be conveniently absorbed in
[Figure 3: four panels. (a) Crack geometry with axes x1 and x2, crystal directions [111], [112] and [110], the (111) plane and the angle θ. (b) Stroh σyy solution. (c) Actual atomistic σyy. (d) ΔE [eV] against reaction coordinate (0 to 1) for 2D activation and 3D activation at G_I/G_I,emit = 0.75, with NEB images i to ix.]

Figure 3. (a) Geometry of the mode-I crack [2], containing 24 unit cells (61 Å) in x2 (periodic boundary condition) and 103,920 Cu atoms in an R = 80 Å cylinder. Atoms within 5 Å of the cylinder border are fixed according to the anisotropic linear elastic [65] solution. (b) Continuum Stroh solution and (c) the actual atomistic local stress distribution [20] of σyy at G_I/G_I,emit = 0.75. (d) 3D activation minimum energy path (solid line) of partial dislocation emission by bow-out, and its competing 2D pathway (dashed line). i–ix show the nine sequential NEB nodes or images on the minimum energy path, with iv being the saddle point; atoms whose coordination number [66] differs from 12 are not shown. Note that a stacking fault is actually dragged behind the dislocation.
heuristic gradient functionals like $\kappa |\nabla u_\parallel(\mathbf{x})|^2$. Otherwise, a semi-continuum model used to calculate activation energies will have a systematic “core energy error” compared to atomistic results. This recommendation is quite general since heterogeneous nucleation of dislocation half loops by 3D bow-out is ubiquitous: in cross-slip, slip transmission across grain and phase boundaries, initiation at surface asperities, etc. That it has not been carried out before has more to do with the fact that the proper definition of the dislocation core energy and a numerically precise way to calculate it atomistically were only worked out recently [24, 25]. Figure 4 shows the saddle-point configuration obtained at $G_I/G_{I,{\rm emit}} = 0.75$. It shows the birth of a shear-dominant singularity (embryonic dislocation loop) near a tensile-dominant singularity, the crack. To make connections with continuum models, we calculate the relative displacement between atoms on the two sides of the slip plane. This completely discrete data set is then interpolated to form a continuum field estimate $\mathbf{u}(\mathbf{x})$, which is further decomposed into a shear shock component $u_\parallel(\mathbf{x})$ parallel to the slip plane (localized inelastic, or transformation, strain), and a tensile opening component $u_\perp(\mathbf{x})$ normal to
Figure 4. Analysis of the shock displacement field $\mathbf{u}(\mathbf{x})$ on the inclined slip plane at the saddle point iv, obtained by 2D spline interpolation of the discrete atomic displacements. (a) Atomic view. (b) Shear component $u_\parallel(\mathbf{x})$ normalized by $b_p = a_0[112]/6$, and (c) $|\nabla u_\parallel(\mathbf{x})|^2$. (d) Tensile opening component $u_\perp(\mathbf{x})$ normalized by the interplanar spacing $h_0 = 3^{-1/2} a_0$.
the slip plane (large, but still elastic). The dislocation core is best visualized by looking at $|\nabla u_\parallel(\mathbf{x})|^2$ (Fig. 4c), showing that the core is simply the domain wall between inelastically sheared and unsheared regions [5]. Yet, in the heart of this shear-dominant secondary singularity, there is also a small tensile component. Figure 4d shows that $u_\perp(\mathbf{x})$ is maximized near where $|\nabla u_\parallel(\mathbf{x})|^2$ is maximized. Such are the intricacies of shear-tension coupling, and of one kind of singularity giving birth to the opposite kind. For instance, we know that when many dislocations are piled up on a hard interface, a microcrack may also be nucleated heterogeneously.
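The decomposition of a relative displacement into its in-plane (shear) and out-of-plane (opening) parts, as used for Fig. 4, amounts to a projection onto the slip-plane normal. The sketch below is a minimal illustration for a single displacement vector, not the authors' interpolation procedure. The slip-plane normal $[11\bar{1}]$, the lattice constant, and the test displacement are hypothetical choices, with $b_p = a_0[112]/6$ and $h_0 = 3^{-1/2} a_0$ taken from the caption of Fig. 4.

```python
import math


def decompose_displacement(u, normal):
    """Split u into (shear vector in the plane, signed opening along the unit normal)."""
    length = math.sqrt(sum(x * x for x in normal))
    n = [x / length for x in normal]
    opening = sum(a * b for a, b in zip(u, n))
    shear = [a - opening * b for a, b in zip(u, n)]
    return shear, opening


a0 = 3.615                                 # Angstrom; illustrative Cu lattice constant
normal = [1.0, 1.0, -1.0]                  # hypothetical {111}-type slip-plane normal
b_p = [a0 / 6.0, a0 / 6.0, 2.0 * a0 / 6.0]  # partial Burgers vector a0[112]/6
h0 = a0 / math.sqrt(3.0)                   # interplanar spacing 3^(-1/2) a0

# Hypothetical relative displacement: one full partial slip plus a 10% opening.
unit_n = [x / math.sqrt(3.0) for x in normal]
u = [b + 0.1 * h0 * n for b, n in zip(b_p, unit_n)]

shear, opening = decompose_displacement(u, normal)
shear_normalized = math.sqrt(sum(x * x for x in shear)) / math.sqrt(sum(x * x for x in b_p))
opening_normalized = opening / h0
```

The normalized values correspond to the quantities plotted in panels (b) and (d) of Fig. 4: here the shear component equals one partial Burgers vector and the opening is 0.1 of the interplanar spacing, by construction.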
References
[1] R. Phillips, Crystals, Defects and Microstructures: Modeling Across Scales, Cambridge University Press, Cambridge, 2001.
[2] T. Zhu, J. Li, K.J. Van Vliet, S. Ogata, S. Yip, and S. Suresh, “Predictive modeling of nanoindentation-induced homogeneous dislocation nucleation in copper,” J. Mech. Phys. Solids, 52, 691–724, 2004.
[3] M. Allen and D. Tildesley, Computer Simulation of Liquids, Clarendon Press, New York, 1987.
[4] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd edn., Academic, San Diego, 2002.
[5] J. Li, A.H.W. Ngan, and P. Gumbsch, “Atomistic modeling of mechanical behavior,” Acta Mater., 51, 5711–5742, 2003.
[6] F.F. Abraham, R. Walkup, H.J. Gao, M. Duchaineau, T.D. De la Rubia, and M. Seager, “Simulating materials failure by using up to one billion atoms and the world’s fastest computer: work-hardening,” Proc. Natl Acad. Sci. USA., 99, 5783–5787, 2002.
[7] J. Schiotz, F.D. Di Tolla, and K.W. Jacobsen, “Softening of nanocrystalline metals at very small grain sizes,” Nature, 391, 561–563, 1998.
[8] V. Yamakov, D. Wolf, S.R. Phillpot, and H. Gleiter, “Dislocation–dislocation and dislocation–twin reactions in nanocrystalline Al by molecular dynamics simulation,” Acta Mater., 51, 4135–4147, 2003.
[9] J. Schiotz and K.W. Jacobsen, “A maximum in the strength of nanocrystalline copper,” Science, 301, 1357–1359, 2003.
[10] V. Yamakov, D. Wolf, S.R. Phillpot, A.K. Mukherjee, and H. Gleiter, “Deformation-mechanism map for nanocrystalline metals by molecular-dynamics simulation,” Nat. Mater., 3, 43–47, 2004.
[11] H. Van Swygenhoven, P.M. Derlet, and A.G. Froseth, “Stacking fault energies and slip in nanocrystalline metals,” Nat. Mater., 3, 399–403, 2004.
[12] A.J. Haslam, V. Yamakov, D. Moldovan, D. Wolf, S.R. Phillpot, and H. Gleiter, “Effects of grain growth on grain-boundary diffusion creep by molecular-dynamics simulation,” Acta Mater., 52, 1971–1987, 2004.
[13] A. Hasnaoui, H. Van Swygenhoven, and P.M. Derlet, “Dimples on nanocrystalline fracture surfaces as evidence for shear plane formation,” Science, 300, 1550–1552, 2003.
[14] A. Latapie and D. Farkas, “Molecular dynamics investigation of the fracture behavior of nanocrystalline alpha-Fe,” Phys. Rev. B, 69, art. no.–134110, 2004.
[15] M.H. Muser, “Towards an atomistic understanding of solid friction by computer simulations,” Comput. Phys. Commun., 146, 54–62, 2002.
[16] M. Urbakh, J. Klafter, D. Gourdon, and J. Israelachvili, “The nonlinear nature of friction,” Nature, 430, 525–528, 2004.
[17] C.L. Kelchner, S.J. Plimpton, and J.C. Hamilton, “Dislocation nucleation and defect structure during surface indentation,” Phys. Rev. B, 58, 11085–11088, 1998.
[18] J.A. Zimmerman, C.L. Kelchner, P.A. Klein, J.C. Hamilton, and S.M. Foiles, “Surface step effects on nanoindentation,” Phys. Rev. Lett., 8716, art. no.–165507, 2001.
[19] G.S. Smith, E.B. Tadmor, N. Bernstein, and E. Kaxiras, “Multiscale simulations of silicon nanoindentation,” Acta Mater., 49, 4089–4101, 2001.
[20] K.J. Van Vliet, J. Li, T. Zhu, S. Yip, and S. Suresh, “Quantifying the early stages of plasticity through nanoscale experiments and simulations,” Phys. Rev. B, 67, 2003.
[21] V. Vitek, “Core structure of screw dislocations in body-centred cubic metals: relation to symmetry and interatomic bonding,” Philos. Mag., 84, 415–428, 2004.
[22] H. Koizumi, Y. Kamimura, and T. Suzuki, “Core structure of a screw dislocation in a diamond-like structure,” Philos. Mag. A, 80, 609–620, 2000.
[23] C. Woodward and S.I. Rao, “Ab initio simulation of (a/2)<110] screw dislocations in gamma-TiAl,” Philos. Mag., 84, 401–413, 2004.
[24] W. Cai, V.V. Bulatov, J.P. Chang, J. Li, and S. Yip, “Periodic image effects in dislocation modelling,” Philos. Mag., 83, 539–567, 2003.
[25] J. Li, C.-Z. Wang, J.-P. Chang, W. Cai, V.V. Bulatov, K.-M. Ho, and S. Yip, “Core energy and Peierls stress of screw dislocation in bcc molybdenum: a periodic cell tight-binding study,” Phys. Rev. B, (in print). See http://164.107.79.177/Archive/Papers/04/Li04c.pdf, 2004.
[26] H.C. Huang, G.H. Gilmer, and T.D. de la Rubia, “An atomistic simulator for thin film deposition in three dimensions,” J. Appl. Phys., 84, 3636–3649, 1998.
[27] L. Dong, J. Schnitker, R.W. Smith, and D.J. Srolovitz, “Stress relaxation and misfit dislocation nucleation in the growth of misfitting films: molecular dynamics simulation study,” J. Appl. Phys., 83, 217–227, 1998.
Atomistic calculation of mechanical behavior
791
[28] D. Holland and M. Marder, “Ideal brittle fracture of silicon studied with molecular dynamics,” Phys. Rev. Lett., 80, 746–749, 1998. [29] M.J. Buehler, F.F. Abraham, and H.J. Gao, “Hyperelasticity governs dynamic fracture at a critical length scale,” Nature, 426, 141–146, 2003. [30] R. Perez and P. Gumbsch, “Directional anisotropy in the cleavage fracture of silicon,” Phys. Rev. Lett., 84, 5347–5350, 2000. [31] N. Bernstein and D.W. Hess, “Lattice trapping barriers to brittle fracture,” Phys. Rev. Lett., 91, art. no.–025501, 2003. [32] S.J. Zhou, D.M. Beazley, P.S. Lomdahl, and B.L. Holian, “Large-scale molecular dynamics simulations of three-dimensional ductile failure,” Phys. Rev. Lett., 78, 479– 482, 1997. [33] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, “Structure of grain boundaries in nanocrystalline palladium by molecular dynamics simulation,” Scr. Mater., 41, 631–636, 1999. [34] M. Mrovec, T. Ochs, C. Elsasser, V. Vitek, D. Nguyen-Manh, and D.G. Pettifor, “Never ending saga of a simple boundary,” Z. Metallk., 94, 244–249, 2003. [35] M.L. Falk and J.S. Langer, “Dynamics of viscoplastic deformation in amorphous solids,” Phys. Rev. E, 57, 7192–7205, 1998. [36] G. Henkelman and H. Jonsson,“Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points,” J. Chem. Phys., 113, 9978–9985, 2000. [37] T. Vegge and W. Jacobsen, “Atomistic simulations of dislocation processes in copper,” J. Phys.-Condes. Matter, 14, 2929–2956, 2002. [38] V.V. Bulatov, S. Yip, and A.S. Argon, “Atomic modes of dislocation mobility in silicon,” Philos. Mag. A, 72, 453–496, 1995. [39] M. Wen and A.H.W. Ngan, “Atomistic simulation of kink-pairs of screw dislocations in body-centred cubic iron,” Acta Mater., 48, 4255–4265, 2000. [40] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, “Energetics of formation and migration of self-interstitials and self-interstitial clusters in alpha-iron,” J. Nucl. Mater., 244, 185–194, 1997. 
[41] T.D. de la Rubia, H.M. Zbib, T.A. Khraishi, B.D. Wirth, M. Victoria, and M.J. Caturia, “Multiscale modelling of plastic flow localization in irradiated materials,” Nature, 406, 871–874, 2000. [42] R. Devanathan, W.J. Weber, and F. Gao, “Atomic scale simulation of defect production in irradiated 3CSiC,” J. Appl. Phys., 90, 2303–2309, 2001. [43] E.B. Tadmor, M. Ortiz, and R. Phillips, “Quasicontinuum analysis of defects in solids,” Philos. Mag. A, 73, 1529–1563, 1996. [44] V. Bulatov, F.F. Abraham, L. Kubin, B. Devincre, and S. Yip, “Connecting atomistic and mesoscale simulations of crystal plasticity,” Nature, 391, 669–672, 1998. [45] V.B. Shenoy, R. Miller, E.B. Tadmor, D. Rodney, R. Phillips, and M. Ortiz, “An adaptive finite element approach to atomic-scale mechanics – the quasicontinuum method,” J. Mech. Phys. Solids, 47, 611–642, 1999. [46] R. Madec, B. Devincre, L. Kubin, T. Hoc, and D. Rodney, “The role of collinear interaction in dislocation-induced hardening,” Science, 301, 1879–1882, 2003. [47] J.H. Wang, J. Li, S. Yip, S. Phillpot, and D. Wolf, “Mechanical instabilities of homogeneous crystals,” Phys. Rev. B, 52, 12627–12635, 1995. [48] I.S. Sokolnikoff, Tensor Analysis, Theory and Applications to Geometry and Mechanics of Continua., 2nd edn., Wiley, New York, 1964. [49] S.C. Hunter, Mechanics of Continuous Media, 2nd edn., E. Horwood, Chichester, 1983.
792
J. Li [50] J.F. Lutsko, “Stress and elastic-constants in anisotropic solids – molecular dynamics techniques,” J. Appl. Phys., 64, 1152–1154, 1988. [51] J.F. Lutsko, “Generalized expressions for the calculation of elastic constants by computer-simulation,” J. Appl. Phys., 65, 2991–2997, 1989. [52] J.R. Ray, “Elastic-constants and statistical ensembles in moleculardynamics,” Comput. Phys. Rep., 8, 111–151, 1988. [53] T. Cagin and J.R. Ray, “Elastic-constants of sodium from molecular-dynamics,” Phys. Rev. B, 37, 699–705, 1988. [54] W. Cai, V.V. Bulatov, J.P. Chang, J. Li, and S. Yip, “Anisotropic elastic interactions of a periodic dislocation array,” Phys. Rev. Lett., 86, 5727–5730, 2001. [55] A. Stroh, “Steady state problems in anisotropic elasticity,” J. Math. Phys., 41, 77– 103, 1962. [56] J. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, 1982. [57] M.W. Finnis and J.E. Sinclair, “A simple empirical n-body potential for transitionmetals,” Philos. Mag. A, 50, 45–55, 1984. [58] V. Vitek, “Theory of core structures of dislocations in body-centered cubic metals,” Cryst Lattice Defects, 5, 1–34, 1974. [59] J. Knap and K. Sieradzki, “Crack tip dislocation nucleation in FCC solids,” Phys. Rev. Lett., 82, 1700–1703, 1999. [60] J. Schiotz and A.E. Carlsson, “The influence of surface stress on dislocation emission from sharp and blunt cracks in fcc metals,” Philos. Mag. A, 80, 69–82, 2000. [61] P. Gumbsch, J. Riedle, A. Hartmaier, and H.F. Fischmeister, “Controlling factors for the brittle-to-ductile transition in tungsten single crystals,” Science, 282, 1293–1295, 1998. [62] J.R. Rice and G.E. Beltz, “The activation-energy for dislocation nucleation at a crack,” J. Mech. Phys. Solids, 42, 333–360, 1994. [63] G. Xu, A.S. Argon, and M. Oritz, “Critical configurations for dislocation nucleation from crack tips,” Philos. Mag. A, 75, 341–367, 1997. [64] Y. Mishin, M.J. Mehl, D.A. Papaconstantopoulos, A.F. Voter, and J.D. 
Kress, “Structural stability and lattice defects in copper: ab initio, tight-binding, and embeddedatom calculations,” Phys. Rev. B, 6322, art. no.–224106, 2001. [65] A. Stroh, “Dislocations and cracks in anisotropic elasticity,” Phil. Mag., 7, 625, 1958. [66] J. Li, “Atomeye: an efficient atomistic configuration viewer,” Model. Simul. Mater. Sci. Eng., 11, 173–177, 2003.
2.20 THE PEIERLS–NABARRO MODEL OF DISLOCATIONS: A VENERABLE THEORY AND ITS CURRENT DEVELOPMENT
Gang Lu
Division of Engineering and Applied Science, Harvard University, Cambridge, Massachusetts, USA
Dislocations are central to the understanding of mechanical properties of crystalline solids. While continuum elasticity theory describes well the long-range elastic strain of a dislocation for length scales beyond a few lattice spacings, it breaks down near the singularity in the region surrounding the dislocation center, known as the dislocation core. There has been a great deal of interest in describing accurately the dislocation core structure on an atomic scale because of its important role in many phenomena of crystal plasticity [1–3]. The core properties control, for instance, the mobility of dislocations, which accounts for the intrinsic ductility or brittleness of solids. The core is also responsible for the interaction of dislocations at close distances, which is relevant to plastic deformation.

Two types of theoretical approaches have been employed to study dislocation core properties. The first is based on direct atomistic simulations using either empirical interatomic potentials or ab initio calculations. Empirical potentials involve the fitting of parameters to a predetermined database and hence may not be reliable in predicting the core properties, where severe distortions like bond breaking, bond formation and switching necessitate a quantum mechanical description of the electronic degrees of freedom. On the other hand, ab initio total energy calculations, though considerably more accurate, are computationally expensive for the studies of dislocation properties. The second approach is based on the framework of the Peierls–Nabarro (P–N) model which holds the promise of becoming a plausible alternative to direct atomistic simulations. For this reason, there has been a resurgence of interest in the simple and tractable P–N model for studying the dislocation core structure and mobility. In particular, the P–N model permits easy estimation of the key dislocation characteristics of nucleation and mobility, directly from quantities (GSF energy, see later) accessible through standard quantum mechanical or empirical atomistic computations.

S. Yip (ed.), Handbook of Materials Modeling, 793–811. © 2005 Springer. Printed in the Netherlands.
1. Original P–N Model
Peierls [4] first proposed the remarkable hybrid model in which some of the details of the discrete dislocation core were incorporated into an essentially continuum framework. Nabarro [5] and Eshelby [6] further developed Peierls' model and gave the first meaningful estimate of the lattice friction to dislocation motion. Later attempts to generalize the original treatment of Peierls and Nabarro assumed a more general core configuration from which they derived the interactions across the glide plane which satisfy the Peierls integral equation. The basic idea of the P–N model is illustrated in Fig. 1. The dislocated solid is separated into two elastic half-spaces joined by atomic-level forces across their common interface, known as the glide plane. The dislocation is characterized by a slip distribution $\delta(x) = u(x, 0^+) - u(x, 0^-)$, where $u(x)$ is the displacement vector at position $x$ in the glide plane. The goal of the P–N model is to determine the slip distribution $\delta(x)$ (or the displacement field $u(x)$) across the glide plane that minimizes the total energy of the solid.

Figure 1. A schematic illustration showing an edge dislocation in a lattice. The partition of the dislocated lattice into linear elastic region and nonlinear atomistic region allows a multiscale treatment of the problem. (Labels in the figure: axes X and Y; linear elastic half-spaces; nonlinear interplanar potential.)

The total energy includes two distinct contributions that compete with each other in determining the equilibrium slip distribution. One of the contributions accounts for the atomic interaction across the glide plane, which reflects the fact that there is an energy penalty for the misfit across the glide plane. Such misfit energy can be written as

$$U_{\mathrm{misfit}} = \int_{-\infty}^{+\infty} \gamma(\delta(x))\,dx, \tag{1}$$
where $\gamma(\delta(x))$ is the generalized stacking fault (GSF) energy, defined as follows [7]: consider a perfect crystal cut across a single plane into two parts which are then subjected to a relative displacement through an arbitrary vector $\delta$ and rejoined. The reconnected lattice has a surplus energy per unit area $\gamma(\delta)$. As the vector $\delta$ is varied to span a unit cell of the interface, $\gamma(\delta)$ generates the generalized stacking fault energy surface. The procedure can be repeated for various crystal planes. The significance of the GSF energy surface (or $\gamma$-surface) is that for a fault vector $\delta$ there is an interfacial restoring stress

$$F_b(\delta) = -\nabla\gamma(\delta), \tag{2}$$
which has the same formal interpretation as the restoring stress in the P–N model. Note that the GSF energy surface retains the translational and rotational symmetry of the underlying lattice. For example, there is no energy cost if the atoms across the glide plane experience a relative displacement equal to the Burgers vector.

The second contribution to the total energy is the elastic energy stored in the two elastic half-spaces. This energy corresponds to the elastic energy of the dislocation, and it depends on the slip distribution $\delta(x)$ as well. Without loss of generality, we first assume a one-dimensional slip $\delta(x)$ and deal with a three-dimensional slip vector later. As pointed out by Eshelby [6], a straight dislocation can be represented as a continuous distribution of infinitesimal dislocations whose Burgers vectors are defined as the local gradient of the slip distribution. For example, the infinitesimal dislocation lying between $x'$ and $x' + dx'$ has a Burgers vector

$$db(x') = \left.\frac{d\delta(x)}{dx}\right|_{x=x'} dx' \equiv \rho(x')\,dx', \tag{3}$$
where the local slip gradient $\rho(x)$ is also called the dislocation density. Integrating the dislocation density over all $x$, we find

$$\int_{-\infty}^{+\infty} \rho(x)\,dx = \int_{-\infty}^{+\infty} \frac{d\delta(x)}{dx}\,dx = \delta(+\infty) - \delta(-\infty) = b, \tag{4}$$
which is what we would expect from the definition of the dislocation density (see Fig. 1). These infinitesimal dislocations interact elastically, and the total elastic energy can be obtained through the superposition principle by adding up the contribution from each infinitesimal dislocation separately. More specifically, an infinitesimal edge dislocation located at $x'$ produces a shear stress at some other point $x$ which is given by

$$\sigma_{xy}(x, 0) = K_e\,\frac{db(x')}{x - x'}, \tag{5}$$

where $K_e$ is the prelogarithmic elastic factor for an edge dislocation. The displacement $u(x)$ necessary to create the infinitesimal dislocation at $x$ takes place in the presence of the shear stress from the dislocation at $x'$, giving the following contribution to the elastic energy from the latter dislocation:

$$dU_{\mathrm{elastic}} = K_e\,\frac{db(x')}{x - x'}\,u(x). \tag{6}$$
Integrating this expression over all values of $x$ from $-L$ to $L$, and over $db(x')$ to add the contribution from all infinitesimal dislocations, we obtain the total elastic energy of the original dislocation

$$U_{\mathrm{elastic}} = \frac{1}{2} K_e \int_{-L}^{L}\!\int_{0}^{b} \frac{db(x')}{x - x'}\,u(x)\,dx = \frac{1}{2} K_e \int_{-L}^{L}\!\int_{-L}^{L} u(x)\,\frac{\rho(x')}{x - x'}\,dx'\,dx, \tag{7}$$
where $L$ is an inconsequential constant introduced as a large cutoff distance. Performing an integration by parts over $x$, we arrive at the following expression for the elastic energy:

$$U_{\mathrm{elastic}} = \frac{1}{2} K_e b^2 \ln L - \frac{K_e}{2} \int_{-L}^{L}\!\int_{-L}^{L} \rho(x)\,\rho(x') \ln|x - x'|\,dx\,dx'. \tag{8}$$
Similar results can also be found for a screw dislocation, with $K_e$ replaced by the corresponding elastic constant for the screw dislocation. For a general mixed dislocation with an angle $\theta$ between the dislocation line and its Burgers vector, the elastic energy is given by

$$U_{\mathrm{elastic}} = \frac{1}{2} K b^2 \ln L - \frac{K}{2} \int_{-L}^{L}\!\int_{-L}^{L} \rho(x)\,\rho(x') \ln|x - x'|\,dx\,dx', \tag{9}$$

where

$$K = \frac{\mu}{2\pi}\left(\frac{\sin^2\theta}{1 - \nu} + \cos^2\theta\right), \tag{10}$$
for an isotropic solid. Here $\mu$ and $\nu$ are the shear modulus and Poisson's ratio, respectively. This result clearly separates the contribution of the long-range elastic field of the dislocation, embodied in the first term of Eq. (9), from the contribution of the large distortions at the dislocation core, embodied in the second term of the equation. We will drop the first term in our later discussion and concentrate on the second term, which represents the energy contribution from the dislocation core. We thus arrive at the expression for the total energy of the dislocation as a functional of the dislocation density $\rho(x)$:

$$U_{\mathrm{tot}}[\rho(x)] = \int_{-\infty}^{+\infty} \gamma(\delta(x))\,dx - \frac{K}{2} \int_{-L}^{L}\!\int_{-L}^{L} \rho(x)\,\rho(x') \ln|x - x'|\,dx\,dx'. \tag{11}$$
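The competition between the two terms of Eq. (11) can be illustrated numerically. The following is a minimal sketch rather than part of the original treatment. It assumes an arctan trial profile of adjustable half-width w and a cosine (Frenkel) misfit potential, both of which appear later in the text, and it uses the closed-form value b² ln(2w) of the double integral for the corresponding Lorentzian density on an unbounded domain. The function names, the units (µ = b = d = 1), and the value ν = 0.3 are illustrative assumptions.

```python
import math

def misfit_energy(w, b, f_max, n=200):
    """Misfit term of Eq. (11) for delta(x) = (b/pi)*atan(x/w) + b/2.

    The substitution x = w*tan(t) maps the infinite x-range onto
    t in (-pi/2, pi/2); the integral is done with the midpoint rule.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = -0.5 * math.pi + (k + 0.5) * h
        delta = b * t / math.pi + 0.5 * b            # trial slip at this t
        gamma = f_max * b / (2.0 * math.pi) * (1.0 - math.cos(2.0 * math.pi * delta / b))
        total += gamma * (w / math.cos(t) ** 2) * h  # dx = w / cos(t)**2 dt
    return total

def core_elastic_energy(w, b, k_factor):
    """Second term of Eq. (11) for rho(x) = (b/pi) * w / (x**2 + w**2).

    Assumes the closed form: double integral of rho*rho'*ln|x-x'| = b**2 * ln(2*w).
    """
    return -0.5 * k_factor * b * b * math.log(2.0 * w)

def best_width(b, f_max, k_factor, w_lo, w_hi, steps=1900):
    """Scan trial half-widths and return the one with the lowest total energy."""
    best_w, best_u = w_lo, float("inf")
    for s in range(steps + 1):
        w = w_lo + (w_hi - w_lo) * s / steps
        u = misfit_energy(w, b, f_max) + core_elastic_energy(w, b, k_factor)
        if u < best_u:
            best_w, best_u = w, u
    return best_w

# Illustrative (assumed) parameters for an edge dislocation.
mu, b, d, nu = 1.0, 1.0, 1.0, 0.3
f_max = mu * b / (2.0 * math.pi * d)        # maximum restoring stress
k_edge = mu / (2.0 * math.pi * (1.0 - nu))  # Eq. (10) with theta = 90 degrees

print("scanned width :", best_width(b, f_max, k_edge, 0.1, 2.0))
print("d / (2(1 - nu)):", d / (2.0 * (1.0 - nu)))
```

In this sketch the misfit term grows linearly with w while the elastic core term decreases logarithmically, so a finite width results. With K taken from Eq. (10), the scanned minimum should fall close to w = Kb/(2F_max) = d/[2(1 − ν)], the half-width quoted in the text for the analytical P–N solution of an edge dislocation.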
By minimizing the above energy functional, we can find the equilibrium structure of the dislocation. A variational derivative of Eq. (11) with respect to the dislocation density $\rho(x)$ leads to the P–N integro-differential equation:

$$K \int_{-\infty}^{+\infty} \frac{1}{x - x'}\,\frac{d\delta(x')}{dx'}\,dx' = F_b(\delta(x)). \tag{12}$$

If a simple sinusoidal form is assumed for $F_b(\delta(x))$, as in the original P–N treatment, the misfit is then given by the well-known analytical solution,

$$\delta(x) = \frac{b}{\pi}\tan^{-1}\frac{x}{\zeta} + \frac{b}{2}, \tag{13}$$

where

$$\zeta = \frac{Kb}{4\pi F_{\max}} = \frac{d}{2(1 - \nu)} \tag{14}$$
is the half-width of the dislocation core and $F_{\max} = \mu b/(2\pi d)$ is the maximum restoring stress, with $d$ as the interlayer distance between the glide planes. Note that the middle expression in Eq. (14) takes $K$ without the $1/(2\pi)$ prefactor of Eq. (10), i.e., $K = \mu/(1 - \nu)$ for an edge dislocation; with $K$ as defined in Eq. (10), the same width reads $\zeta = Kb/(2F_{\max})$.

One of the key features that emerges from this solution is that the P–N model removes the artificial divergence at the core that is associated with the idealized continuum dislocation of Volterra. By introducing the nonlinear and nonconvex interplanar potential into the model, the solution of the P–N model in terms of stress and strain is seen to be well behaved. One of the achievements of the P–N model is that it provides a reasonable estimate of the dislocation size, characterized by $\zeta$, as a result of the competition between the two energy contributions. The more important achievement of the P–N model is that it offers an insight into the value of the critical stress needed to move a dislocation in an otherwise perfect lattice. Such stress has thus been termed the Peierls stress.

In order to derive the Peierls stress, however, the P–N energy functional needs to be modified. The expression in Eq. (11) that we have discussed so far is invariant with respect to an arbitrary translation of the dislocation density, $\rho(x) \rightarrow \rho(x + t)$. In other words, the dislocation described by the P–N solution does not experience any resistance as it moves through the lattice. This is clearly unrealistic, and is a consequence of neglecting the discrete nature of the lattice: the P–N model views the solid as a continuous medium. The only effect of the lattice periodicity so far comes from the periodicity of the misfit potential with a period of the Burgers vector. In order to rectify this problem and to recover the lattice resistance to dislocation motion, the P–N energy functional was modified so that the misfit potential is not sampled continuously as in Eq. (11), but only at the positions of the actual atomic planes. This amounts to the following modification of the first term in the total energy of the dislocation in Eq. (11):

$$\int_{-\infty}^{+\infty} \gamma(\delta(x))\,dx \;\rightarrow\; \sum_{n=-\infty}^{+\infty} \gamma(\delta(x_n))\,\Delta x, \tag{15}$$
where $x_n$ is the position of the $n$th atomic plane and $\Delta x$ is the spacing between these atomic planes. Assuming a sinusoidal restoring stress $F[\delta(x)] = [\mu b/(2\pi d)]\sin[2\pi\delta(x)/b]$, the misfit potential (Frenkel potential) $\gamma[\delta(x)]$ is

$$\gamma[\delta(x)] = \frac{F_{\max}\,b}{2\pi}\left[1 - \cos\frac{2\pi\delta(x)}{b}\right]. \tag{16}$$
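If the center of the dislocation is displaced by $\alpha b$ with $\alpha < 1$, the total misfit energy becomes

$$U_{\mathrm{misfit}} = \sum_{n=-\infty}^{+\infty} \frac{\mu b^3}{8\pi^2 d}\left\{1 + \cos\left[2\tan^{-1}\left(2(1 - \nu)\left(\alpha + \frac{n}{2}\right)\frac{b}{d}\right)\right]\right\}, \tag{17}$$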
where we have used Eq. (13) for $\delta(x)$ and Eq. (16) for $\gamma(\delta)$ to evaluate the misfit energy in Eq. (15). After appropriate manipulations, Eq. (17) may be rewritten as

$$U_{\mathrm{misfit}}(\alpha) = \frac{\mu b^3}{4\pi^2 d}\,\frac{4\zeta^2}{b^2} \sum_{n=-\infty}^{+\infty} \frac{1}{(2\zeta/b)^2 + (2\alpha + n)^2}, \tag{18}$$
which can then be handled by the Poisson summation formula

$$\sum_{n=-\infty}^{+\infty} f(n) = \sum_{k=-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x)\,e^{2\pi i k x}\,dx. \tag{19}$$
After performing the relevant integrations, we arrive at the final expression for the misfit energy

$$U_{\mathrm{misfit}}(\alpha) = \frac{\mu b^2}{4\pi(1 - \nu)} + \frac{\mu b^2}{2\pi(1 - \nu)} \exp\left(-\frac{4\pi\zeta}{b}\right)\cos 4\pi\alpha. \tag{20}$$
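As a rough numerical cross-check of the step from Eq. (18) to Eq. (20), the lattice sum can be evaluated directly and compared with the closed form. The sketch below is illustrative only: energies are expressed in units of µb³/(4π²d), in which the constant term of Eq. (20) becomes 2πζ/b. The function names, the truncation limit n_max, and the test value ζ/b = 0.5 are assumptions made for the example.

```python
import math

def misfit_sum(alpha, zeta_over_b, n_max=20000):
    """Direct (truncated) lattice sum of Eq. (18), in units of mu*b**3/(4*pi**2*d)."""
    a = 2.0 * zeta_over_b
    total = 0.0
    for n in range(-n_max, n_max + 1):
        total += 1.0 / (a * a + (2.0 * alpha + n) ** 2)
    return a * a * total

def misfit_closed_form(alpha, zeta_over_b):
    """Eq. (20) in the same units: pi*a * [1 + 2*exp(-2*pi*a)*cos(4*pi*alpha)]."""
    a = 2.0 * zeta_over_b
    return math.pi * a * (1.0 + 2.0 * math.exp(-2.0 * math.pi * a)
                          * math.cos(4.0 * math.pi * alpha))

zeta_over_b = 0.5  # assumed illustrative core half-width
for alpha in (0.0, 0.125, 0.25):
    print(alpha, misfit_sum(alpha, zeta_over_b), misfit_closed_form(alpha, zeta_over_b))

# Barrier between the maximum (alpha = 0) and the minimum (alpha = 1/4).
barrier_sum = misfit_sum(0.0, zeta_over_b) - misfit_sum(0.25, zeta_over_b)
barrier_formula = (misfit_closed_form(0.0, zeta_over_b)
                   - misfit_closed_form(0.25, zeta_over_b))
print("barrier:", barrier_sum, barrier_formula)
```

The two columns are expected to agree closely because Eq. (20) keeps only the leading oscillating term of the Poisson-resummed series, and the neglected terms are exponentially smaller. Comparing the barrier rather than the absolute energy also cancels most of the truncation error of the direct sum.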
From the above expression, we see that the straight dislocation experiences periodic potential wells, and in passing from one potential well to the next it must cross an energy barrier known as the Peierls barrier $W_p$. The stress required to surmount this energy barrier is the Peierls stress, $\sigma_p$, given by [8]

$$\sigma_p = \max\left[\frac{1}{b^2}\,\frac{\partial U_{\mathrm{misfit}}(\alpha)}{\partial\alpha}\right] \equiv \frac{2\pi W_p}{b^2} = \frac{2\mu}{1 - \nu}\exp\left(-\frac{4\pi\zeta}{b}\right) = \frac{2\mu}{1 - \nu}\exp\left(-\frac{2\pi d}{b(1 - \nu)}\right). \tag{21}$$
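Eq. (21) can be evaluated directly to see how strongly the predicted Peierls stress depends on d/b. The short sketch below is only an illustration of the formula. The function names and the sampled values of d/b and ν are assumptions, not data from the text.

```python
import math

def peierls_stress_from_width(zeta_over_b, nu):
    """sigma_p / mu from Eq. (21), written in terms of the core half-width zeta/b."""
    return 2.0 / (1.0 - nu) * math.exp(-4.0 * math.pi * zeta_over_b)

def peierls_stress_over_mu(d_over_b, nu):
    """sigma_p / mu from Eq. (21), using zeta = d / (2*(1 - nu)) from Eq. (14)."""
    zeta_over_b = d_over_b / (2.0 * (1.0 - nu))
    return peierls_stress_from_width(zeta_over_b, nu)

nu = 0.3  # assumed Poisson's ratio
for d_over_b in (0.5, 0.8, 1.0, 1.5):
    print(f"d/b = {d_over_b:.1f}   sigma_p/mu = {peierls_stress_over_mu(d_over_b, nu):.3e}")
```

Because d/b enters through an exponential, modest changes in the interplanar spacing change the estimated stress by orders of magnitude, which is the sensitivity the original P–N formula is known for.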
A few observations are now in order. First, we note that the Peierls stress is extremely sensitive to the ratio $\zeta/b$ or $d/b$ for fixed values of the elastic constants $\mu$ and $\nu$. Therefore an edge dislocation is more mobile than a screw dislocation with the same Burgers vector in the same material, since the edge dislocation is wider (larger $\zeta$) than the corresponding screw dislocation. In general, the larger the edge component of a mixed dislocation, the wider the dislocation core, and hence the greater the mobility. In a given crystal, the slip system of dislocations corresponds to the largest value of $d/b$; namely, the slip plane tends to have the largest interplanar spacing, and the slip direction or Burgers vector is along the nearest neighbor direction (smaller $b$). In close-packed metallic systems, the values of $\zeta/b$ and $d/b$ are large, and these materials are usually ductile. In contrast, crystals with more complex unit cells (such as ceramics) have a relatively small $d/b$ ratio, giving a larger Peierls stress. In these materials, the shear stress for dislocation motion cannot be overcome before fracturing the solid, thus they are usually brittle.

A second observation concerns the magnitude of the Peierls stress. What we find from the P–N model is that the stress to move a dislocation is reduced by an exponential factor in comparison with the shear modulus. It explains the fact that in many materials, plastic deformation operates at a shear stress that is orders of magnitude below the shear modulus: it is all due to dislocation motion!

A final observation is that dislocation motion often takes place by kink mechanisms, in which a bulge on the dislocation line brings a segment of the dislocation into adjacent Peierls valleys, and the resulting kinks propagate with the end result being a net forward motion of the dislocation. In this case, what the kinks have to overcome is the secondary Peierls barrier along the dislocation line direction.

Despite the great heuristic value and useful insight that the original P–N model offers, it lacks the quantitative power for prediction. In particular, the model becomes increasingly inaccurate for dislocations with narrow cores, as is typically the case in covalently bonded solids. Since the P–N model represents a combination of the continuum model and the GSF interplanar potential, its accuracy can be affected by either component. One of the main deficiencies of the original P–N model is the assumption of a sinusoidal force law. In real materials, however, the interplanar potential (GSF energy) is not at all sinusoidal. At present, the GSF energies can be calculated very accurately by using an ab initio quantum mechanical framework, which shifts attention to possible inaccuracies in the continuum component. In the following, we will address the limitations of the original P–N model, and present some solutions that have been put forward to improve the quantitative description of dislocation core properties and mobility.

(1) The original P–N model is based on the isotropic or the pseudo-isotropic elasticity theory. Recently, full anisotropic treatments have been implemented in the P–N model, which do not require much more computational effort [9].

(2) The original P–N model is one-dimensional, assuming the slip is along one direction. This "constrained path approximation" fails for dislocations in many Bravais lattices. For example, in an fcc lattice, dislocations can dissociate into partials whose Burgers vectors are not parallel, and a treatment in two dimensions is mandatory. Currently, many P–N theories have been proposed to solve this problem. For example, we will introduce a powerful P–N model, namely, the Semidiscrete Variational P–N (SVPN) model, which is capable of dealing with a three-dimensional displacement field and is particularly useful for studying narrow dislocations.

(3) The original P–N model yields a variation of the misfit energy and the Peierls stress which has a periodicity of $b/2$, in contrast with the feature of the dislocation barrier, which must, in general, exhibit the periodicity of the Burgers vector $b$. There have been controversies and confusions with regard to this problem (e.g., see [10]), and it was attributed to an erroneous representation of the atomic positions across the glide plane in the original P–N model after the dislocation is translated by a distance [11]. By correcting this error, one can recover the correct periodicity of $b$ for the misfit energy and the Peierls stress. Another relevant idea has also been exploited in a numerical formulation of the P–N model. Namely, the misfit energy is not summed over the positions of atomic nuclei, but rather it is averaged over the Thomas–Fermi radius around the nuclei. This modification is particularly useful for metallic systems where electrons are more delocalized. It has been shown that the Peierls barrier and Peierls stress can be lowered considerably by this modification [12].

(4) In the original P–N model, the Peierls stress is derived by considering the misfit energy exclusively (see Eq. (20)). The elastic energy is assumed to be constant; in other words, the dislocation shape is assumed to be rigid during the dislocation translation process. This assumption turns out to be unrealistic. In fact, it is critical to include the variation of the elastic energy in Eq. (20) when evaluating the Peierls stress. However, in doing so, one faces an inconsistency in the formulation: while the elastic energy is computed by a continuous integration, the misfit energy has to be sampled discretely in order to incorporate the discrete nature of the lattice. Thus, the two energy contributions are not treated on an equal footing and the total energy is not variational. The SVPN model was developed precisely to resolve this inconsistency by discretizing the elastic energy [13]. As we shall see later, the variational formulation of the P–N theory permits us to compute the Peierls stress more accurately by allowing the dislocation shape to change during the translation. The relaxation of the dislocation core is particularly important for narrow dislocations. In fact, it has been shown that the Peierls stress can be reduced by three orders of magnitude for the screw dislocation in Si by allowing the relaxation of the dislocation shape. The reduced Peierls stress is in much better agreement with the direct atomistic simulation result. We should emphasize that the P–N model calculation takes only a small fraction of the computational time (a few minutes) that a direct atomistic simulation may take (hours or even days).
2. Semidiscrete Variational P–N Model
In the remainder of this article, we will introduce the SVPN model to exemplify the current development of the P–N theory. Two versions (planar and nonplanar) of the SVPN model have been developed. The planar model aims to treat a dislocation that is entirely confined to a single glide plane, while the nonplanar model deals with a dislocation that is spread onto more than one glide plane. The nonplanar model was developed in order to study stress-assisted dislocation cross-slip and constriction [14]. After the discussion of the models, we will apply the planar model to dislocations in Al with an ab initio determined GSF energy surface.

To facilitate the presentation, we adopt the following conventions. As defined in Fig. 2, the $xOz$ plane is the (111) glide plane for Al, the $z$ axis is in the direction of the dislocation line, and the $x$ axis is the glide direction, with the $y$ axis normal to the glide plane. For planar dislocations, the displacements along the $y$ direction are usually small. The Burgers vector $b$ lies on the glide plane, making an angle $\theta$ with the $z$ axis. The Burgers vector is along the $x$ axis ($\theta = 90°$) for an edge dislocation and along the $z$ axis ($\theta = 0°$) for a screw dislocation. The Burgers vector of a mixed dislocation has both an edge component, $b\sin\theta$, and a screw component, $b\cos\theta$. In general, the atomic displacements have components in all three directions rather than only along the direction of the Burgers vector, because the path along the Burgers vector may have to surmount a higher interplanar energy barrier in the GSF surface (see Fig. 4).

In the planar SVPN formalism, the dislocation slip is assumed to be confined to a single glide plane, separating two semi-infinite linear elastic continua. The equilibrium structure of a dislocation is obtained by minimizing the dislocation energy functional [15]

$$U_{\mathrm{disl}} = U_{\mathrm{elastic}} + U_{\mathrm{misfit}} + U_{\mathrm{stress}} + K b^2 \ln L, \tag{22}$$
Figure 2. Cartesian set of coordinates showing the directions relevant for dislocations in Al. (Labels in the figure: x, glide direction; y, [111] normal to the glide plane; z, dislocation line; Burgers vector b at angle θ to the z axis, with components b sin θ and b cos θ; stress components σxy and σzy.)
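where

$$U_{\mathrm{elastic}}[\{\rho\}] = \sum_{i,j} \frac{1}{4\pi}\,\chi_{ij}\left[K_e\left(\rho_i^{(1)}\rho_j^{(1)} + \rho_i^{(2)}\rho_j^{(2)}\right) + K_s\,\rho_i^{(3)}\rho_j^{(3)}\right], \tag{23}$$

$$U_{\mathrm{misfit}}[\{\delta\}] = \sum_{i} \Delta x\,\gamma_3(\delta_i), \tag{24}$$

$$U_{\mathrm{stress}}[\{\rho\}, \tau] = -\sum_{i,l} \frac{x_i^2 - x_{i-1}^2}{2}\,\rho_i^{(l)}\,\tau_i^{(l)}, \tag{25}$$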
with respect to the dislocation density or the slip vector. Here, $\rho_i^{(1)}$, $\rho_i^{(2)}$, and $\rho_i^{(3)}$ are the edge, vertical, and screw components of the general dislocation density at the $i$th nodal point, and $\gamma_3(\delta_i)$ is the three-dimensional GSF energy surface. $\Delta x$ is the area assigned to each atomic row (the length of all dislocation lines is 1 Å). The corresponding components of the applied stress interacting with $\rho_i^{(1)}$, $\rho_i^{(2)}$, and $\rho_i^{(3)}$ are $\tau^{(1)} = \sigma_{21}$, $\tau^{(2)} = \sigma_{22}$, and $\tau^{(3)} = \sigma_{23}$, respectively. $K$, $K_e$, and $K_s$ are the prelogarithmic energy factors defined earlier. The dislocation density at the $i$th nodal point is defined as $\rho_i = (\delta_i - \delta_{i-1})/(x_i - x_{i-1})$. The remaining quantities entering in this expression are $\chi_{ij} = \frac{3}{2}\phi_{i,i-1}\phi_{j,j-1} + \psi_{i-1,j-1} + \psi_{i,j} - \psi_{i,j-1} - \psi_{j,i-1}$, with $\phi_{i,j} = x_i - x_j$ and $\psi_{i,j} = \frac{1}{2}\phi_{i,j}^2 \ln|\phi_{i,j}|$.

The first term in the energy functional, $U_{\mathrm{elastic}}$, represents the configuration-dependent (density or slip) part of the elastic energy, which has been discretized. Since any details of the displacements across the glide plane other than those on the atomic rows are disregarded, it is consistent to assume that the dislocation density is constant between the nodal points. This explicit discretization of the elastic energy term removes the inconsistency in the original P–N model and produces a total energy functional which is variational. Another modification in this approach is that the nonlinear misfit potential in the energy functional, $U_{\mathrm{misfit}}$, is a function of all three components of the nodal displacements, $\delta(x_i)$. Namely, in addition to the displacements along the Burgers vector, lateral and even vertical displacements across the glide plane are also included. This, in turn, allows the treatment of straight dislocations of arbitrary orientation in arbitrary glide planes. Furthermore, because the slip vector $\delta(x_i)$ is allowed to change during the process of dislocation translation, the Peierls energy barrier can be significantly lowered compared to its corresponding value from a rigid translation. In order to examine the trend of energetics for different dislocations, we identify the dislocation configuration-dependent part of the total energy as the core energy, $U_{\mathrm{core}} = U_{\mathrm{elastic}} + U_{\mathrm{misfit}}$, which includes the density-dependent part of the elastic energy and the entire misfit energy, in the absence of external stress. The last term in Eq. (22), $K b^2 \ln L$, is independent of the dislocation density, and hence is irrelevant in the variational procedure and makes no contribution to the evaluation of the Peierls stress (a typical value for the outer cutoff radius $L$ is $10^3$ Å; we use this value for all dislocations in the calculations discussed below).

The response of a dislocation to an applied stress is determined by the minimization of the energy functional with respect to $\rho_i$ at a given value of the applied stress, $\tau_i^{(l)}$. An instability is reached when an optimal solution for $\rho_i$ no longer exists, which is manifested numerically by the failure of the minimization procedure to converge. The Peierls stress is defined as the critical value of the applied stress which gives rise to this instability.

Having developed the planar SVPN model, it is not difficult to extend it to more than one glide plane. We have recently developed the nonplanar SVPN model in order to study dislocation cross-slip and constriction in fcc metals [14]. As shown in Fig. 3, a screw dislocation placed at the intersection of the primary plane (plane I) and the cross-slip plane (plane II) is allowed to spread into the two planes simultaneously. The $X$ ($X'$) axis represents the glide direction of the dislocation in plane I (II). For an fcc lattice, the two slip
Figure 3. Cartesian set of coordinates showing the directions relevant to the screw dislocation located at the intersection of the two glide planes. Plane I (II) denotes the primary (cross-slip) plane. (Labels in the figure: Y along [111]; X along [121]; Z along [101], the dislocation line; X′ in plane II; angle θ between the two planes; length L.)
planes are (111) and $(1\bar{1}1)$, forming an angle $\theta \approx 71°$. The total energy of the dislocation is

$$U_{\mathrm{tot}} = U_{\mathrm{I}} + U_{\mathrm{II}} + \tilde{U}. \tag{26}$$
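Here, $U_{\mathrm{I}}$ and $U_{\mathrm{II}}$ are the energies associated with the dislocation spread on planes I and II, respectively, and $\tilde{U}$ represents the elastic interaction energy between the dislocation densities on planes I and II. The expressions for $U_{\mathrm{I}}$ and $U_{\mathrm{II}}$ are identical to that given earlier for the single glide plane case, while the new term $\tilde{U}$ can be derived from Nabarro's equation for general parallel dislocations [5]:

$$U_{\mathrm{I(II)}} = \sum_{i,j} \frac{1}{2}\,\chi_{ij}\left\{K_e\left[\rho_1^{\mathrm{I(II)}}(i)\,\rho_1^{\mathrm{I(II)}}(j) + \rho_2^{\mathrm{I(II)}}(i)\,\rho_2^{\mathrm{I(II)}}(j)\right] + K_s\,\rho_3^{\mathrm{I(II)}}(i)\,\rho_3^{\mathrm{I(II)}}(j)\right\} + \sum_{i} \Delta x\,\gamma_3\left(\delta_1^{\mathrm{I(II)}}(i), \delta_2^{\mathrm{I(II)}}(i), \delta_3^{\mathrm{I(II)}}(i)\right) - \sum_{i,l} \frac{x(i)^2 - x(i-1)^2}{2}\,\rho_l^{\mathrm{I(II)}}(i)\,\tau_l^{\mathrm{I(II)}} + K b^2 \ln L,$$

$$\tilde{U} = -\sum_{i,j} K_s\,\rho_3^{\mathrm{I}}(i)\,\rho_3^{p}(j)\,A_{ij} - \sum_{i,j} K_e\left[\rho_1^{\mathrm{I}}(i)\,\rho_1^{p}(j) + \rho_2^{\mathrm{I}}(i)\,\rho_2^{p}(j)\right]A_{ij} - \sum_{i,j} K_e\left[\rho_2^{\mathrm{I}}(i)\,\rho_2^{p}(j)\,B_{ij} + \rho_1^{\mathrm{I}}(i)\,\rho_1^{p}(j)\,C_{ij} - \rho_2^{\mathrm{I}}(i)\,\rho_1^{p}(j)\,D_{ij} - \rho_1^{\mathrm{I}}(i)\,\rho_2^{p}(j)\,D_{ij}\right].$$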
Here, $\delta_1^{\rm I(II)}(i)$, $\delta_2^{\rm I(II)}(i)$, and $\delta_3^{\rm I(II)}(i)$ represent the edge, vertical, and screw components of the general dislocation slip vector at the $i$th nodal point in plane I (II), respectively, while the corresponding components of the dislocation density in plane I (II) are defined as before in the planar case. The projected dislocation density $\rho^{p}(i)$ is the projection of the density $\rho^{\rm II}(i)$ from plane II onto plane I, introduced in order to deal with the nonparallel components of the slip vector. $\chi_{ij}$, $A_{ij}$, $B_{ij}$, $C_{ij}$, and $D_{ij}$ are double-integral kernels defined by

$$\chi_{ij} = \int_{x_{j-1}}^{x_j}\!\int_{x_{i-1}}^{x_i} \ln|x - x'|\,dx\,dx',$$

$$A_{ij} = \int_{x_{j-1}}^{x_j}\!\int_{x_{i-1}}^{x_i} \frac{1}{2}\ln\!\left(x_0^2 + y_0^2\right)dx\,dx',$$

$$B_{ij} = \int_{x_{j-1}}^{x_j}\!\int_{x_{i-1}}^{x_i} \ln\frac{x_0^2}{x_0^2 + y_0^2}\,dx\,dx',$$

$$C_{ij} = \int_{x_{j-1}}^{x_j}\!\int_{x_{i-1}}^{x_i} \ln\frac{y_0^2}{x_0^2 + y_0^2}\,dx\,dx',$$

$$D_{ij} = \int_{x_{j-1}}^{x_j}\!\int_{x_{i-1}}^{x_i} \ln\frac{x_0 y_0}{x_0^2 + y_0^2}\,dx\,dx',$$

where $x_0 = L - x + x'\cos\theta$ and $y_0 = -x'\sin\theta$. The equilibrium structure of the dislocation is again determined by minimizing the total dislocation energy functional with respect to the dislocation density.
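As a hedged illustration of how the simplest of these kernels can be evaluated, note that the double integral defining $\chi_{ij}$ has a closed form, because $G(u) = \tfrac{1}{2}u^2\ln|u| - \tfrac{3}{4}u^2$ satisfies $G''(u) = \ln|u|$. The sketch below uses this antiderivative and cross-checks it against a midpoint-rule quadrature on two separated intervals. The function names `G`, `chi` and `chi_midpoint` are ours and are not taken from the original work; the other kernels are not treated here.

```python
import math

def G(u):
    """Second antiderivative of ln|u| (G''(u) = ln|u|), with G(0) = 0."""
    if u == 0.0:
        return 0.0
    return 0.5 * u * u * math.log(abs(u)) - 0.75 * u * u

def chi(xi_lo, xi_hi, xj_lo, xj_hi):
    """Integral of ln|x - x'| over x in [xi_lo, xi_hi] and x' in [xj_lo, xj_hi]."""
    return (G(xi_hi - xj_lo) - G(xi_lo - xj_lo)
            - G(xi_hi - xj_hi) + G(xi_lo - xj_hi))

def chi_midpoint(xi_lo, xi_hi, xj_lo, xj_hi, n=200):
    """Midpoint-rule estimate of the same integral (use for non-overlapping intervals)."""
    hi = (xi_hi - xi_lo) / n
    hj = (xj_hi - xj_lo) / n
    total = 0.0
    for p in range(n):
        x = xi_lo + (p + 0.5) * hi
        for q in range(n):
            xp = xj_lo + (q + 0.5) * hj
            total += math.log(abs(x - xp))
    return total * hi * hj

# Diagonal element on a unit interval: the integrable singularity at x = x'
# is handled exactly by the closed form.
print(chi(0.0, 1.0, 0.0, 1.0))            # -1.5
# Off-diagonal element on separated intervals, closed form vs. quadrature.
print(chi(0.0, 1.0, 2.0, 3.0), chi_midpoint(0.0, 1.0, 2.0, 3.0))
```

The closed form is convenient for the diagonal elements ($i = j$), where a naive quadrature would sample the logarithmic singularity.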
3. Dislocation Core Properties in Aluminum
The GSF energy surface $\gamma_3(\delta_i)$ entering the P–N model can usually be determined from ab initio calculations based on density functional theory [15]. In Fig. 4, we show the GSF energy surface for Al, which was computed using a pseudopotential plane-wave method. The computational details can be found in [16]. As shown in Fig. 4, the calculated GSF energy surface maintains the underlying translational and rotational symmetry of the fcc lattice. The three high peaks of the GSF surface correspond to the run-on stacking fault configuration ABC|CABC, in which two C layers are nearest neighbors. The local minimum and maximum along the [112] direction correspond to the intrinsic and unstable stacking faults, respectively.

We first examine the core properties of four typical dislocations, i.e., the screw, 30°, 60° and edge dislocations. These dislocations have the same Burgers vector, $\mathbf{b} = a/2\,[101]$, but different orientations (characters). The results for the energetics and the Peierls stress of the four dislocations are presented in Table 1, along with the values of the core half-width $\zeta$. First, the half-width $\zeta$ increases monotonically with the dislocation angle $\theta$. Second, the misfit energy $U_{\rm misfit}$ also increases monotonically from the screw to the edge dislocation, while the configuration-dependent elastic energy $U_{\rm elastic}$ (negative in sign) decreases as the angle increases.
[Figure 4. The GSF energy surface (in J/m²) for displacements along a (111) plane in Al, from DFT calculations. The corners of the plane and its center correspond to identical equilibrium configurations, i.e., the ideal Al lattice. In-plane axes: [110] and [112]; the energy scale runs from 0 to 0.5.]
Table 1. Core half-widths ζ (in Å); core energies Ucore and the separate contributions from the configuration-dependent elastic energy Uelastic and the misfit energy Umisfit; K b² ln L (energies in eV/Å); and Peierls stress (in MPa) for the four dislocations.

Dislocation   Core width (Å)   Ucore (eV/Å)   Uelastic (eV/Å)   Umisfit (eV/Å)   K b² ln L (eV/Å)   Peierls stress (MPa)
Screw         2.1              −0.0834        −0.1828           0.0938           1.6050             256
30°           2.5              −0.1096        −0.2317           0.1221           1.8123             53
60°           3.0              −0.1678        −0.3199           0.1521           2.233              98
Edge          3.5              −0.1979        −0.3666           0.1688           2.446              35
The configuration-independent elastic energy $K b^2 \ln L$ is also included. Several points need to be emphasized: (1) The configuration-dependent elastic energy $U_{\rm elastic}$, ignored in some previous studies, is the dominant contribution to the core energy $U_{\rm core}$ (about a factor of two larger in magnitude than $U_{\rm misfit}$). More importantly, it depends strongly on the dislocation character. (2) While $U_{\rm elastic}$ is negative here, in principle it can be of either sign; for example, $U_{\rm elastic}$ was found to be positive in Si. (3) Inclusion of the configuration-independent elastic term, $K b^2 \ln L$, yields positive values for both the total energy and the total elastic energy.

As alluded to earlier, the Peierls stress in this work is calculated as the critical value of the applied stress $\tau$ at which the dislocation energy functional fails to be minimized with respect to $\rho_i$ through standard conjugate gradient techniques. This approach is more accurate and physically transparent, because it captures the nature of the Peierls stress as the stress at which the displacement field of the dislocation undergoes a discontinuous transition. A typical value for the Peierls stress of Al from the analysis of the Bordoni internal friction peaks is about 230 MPa [17], which is very close to our value for the screw dislocation (256 MPa).

In order to correlate dislocation properties with the dislocation character, we have studied 19 different dislocations that have the same Burgers vector but different orientations. The angle between the dislocation line and the Burgers vector varies from 0° to 90°. The core energy, along with its separate contributions from the configuration-dependent elastic energy $U_{\rm elastic}$ and the misfit energy $U_{\rm misfit}$, is presented in Fig. 5 as a function of the dislocation angle $\theta$. We find that $U_{\rm core}$ and $U_{\rm elastic}$ decrease monotonically as the angle increases, whereas $U_{\rm misfit}$ increases with $\theta$.
The configuration-dependent elastic energy $U_{\rm elastic}$ decreases with $\theta$ because the prelogarithmic factor $K$ increases with $\theta$. On the other hand, the monotonic increase of $U_{\rm misfit}$ with $\theta$ is due to the fact that the core width increases with the dislocation angle. Note that the configuration-dependent elastic energy is not only the dominant contribution to the total energy stored in the core region, but is also more sensitive to the dislocation character than the misfit energy.

[Figure 5. The core energy, elastic energy and misfit energy as functions of the dislocation orientation. Curves: Ucore, Uelastic, Umisfit. Vertical axis: Energy (eV/Å), from −0.40 to 0.20; horizontal axis: Angle θ, from 0° to 90°.]

To correlate the Peierls stress with the dislocation character, we plotted $\ln(\sigma_p \bar{a}/K b)$ as a function of $\zeta/\bar{a}$ in Fig. 6. Here, $\zeta$ is the half-width of a dislocation and $\bar{a}$ is the average nodal spacing along the $x$ direction. It should be pointed out that most of the dislocations in the fcc lattice have non-even nodal spacing, except for the 30° and edge dislocations. Most of the calculated values can be fitted (solid line) with

$$\sigma_p = \frac{2\pi K b}{\bar{a}}\,e^{-1.7\,\zeta/\bar{a}}. \tag{27}$$
The large deviation of $\sigma_p$ for the 30° and edge dislocations from the common trend indicates that the nodal spacing (even vs. non-even) between atomic planes plays an important role in determining the Peierls stress [18]. On the other hand, the origin of the deviation of the 10.9° and 14.9° dislocations from the common trend is unclear to us at present. Note that the Peierls stress is more sensitive to the average atomic spacing $\bar{a}$ than to the half-width. For example, while both the 0° and 14.9° dislocations have predominantly screw character and similar half-widths of 2.1 Å and 2.3 Å, respectively, they have quite different atomic spacings, 1.2 Å and 0.3 Å, respectively. This results in a Peierls stress of 6 MPa for the 14.9° dislocation, almost two orders of magnitude smaller than the 256 MPa found for the screw dislocation.

[Figure 6. The scaled Peierls stress as a function of the ratio of the core width to the average atomic spacing perpendicular to the dislocation line. Vertical axis: ln(σp ā b⁻¹ K⁻¹), from −12.0 to −4.0; horizontal axis: ζ/ā, from 0.0 to 10.0. Labeled points: 10.9°, 14.9°, 30° and 90°.]
4. Conclusion
To conclude, the P–N model serves as a link between atomistic and continuum approaches by providing a means to incorporate information obtained from atomistic calculations (ab initio or empirical) directly into continuum models. The resultant approach can then be applied to problems that neither atomistic nor conventional continuum models could handle separately. The simplicity of the P–N model makes it an attractive alternative to direct atomistic simulations of dislocation properties. It provides a rapid and inexpensive route to determine dislocation core structure and mobility. Combined with an ab initio determined GSF energy surface, the P–N model can give rather reliable quantitative predictions for various dislocation properties. Furthermore, since ab initio based P–N model calculations are much more expedient than direct ab initio atomistic calculations for dislocations, the P–N model could serve as a powerful and efficient tool for alloy design, where the goal is to select the "right" elements with the "right" alloy composition to tailor desired mechanical, and in particular dislocation, properties.

Finally, we should comment that the P–N model is just one example of more general cohesive surface models that are built upon the idea of limiting all constitutive nonlinearity to certain privileged interfaces, while the remainder of the material is treated through more conventional continuum theories. The same strategy has also been applied to the study of fracture and dislocation nucleation from a crack tip [19].
References

[1] M.S. Duesbery, "Dislocation core and plasticity," Dislocations in Solids, F.R.N. Nabarro, ed., vol. 8, 67, North-Holland, Amsterdam, 1989.
[2] M.S. Duesbery and G.Y. Richardson, "The dislocation core in crystalline materials," CRC Crit. Rev. Sol. State Mater. Sci., 17, 1, 1991.
[3] V. Vitek, "Structure of dislocation cores in metallic materials and its impact on their plastic behavior," Prog. Mater. Sci., 36, 1, 1992.
[4] R. Peierls, "The size of a dislocation," Proc. Phys. Soc. London, 52, 34, 1940.
[5] F.R.N. Nabarro, "Dislocations in a simple cubic lattice," Proc. Phys. Soc. London, 59, 256, 1947.
[6] J.D. Eshelby, "Edge dislocations in anisotropic materials," Phil. Mag., 40, 903, 1949.
[7] V. Vitek, "Intrinsic stacking faults in body-centered cubic crystals," Phil. Mag., 18, 773, 1968.
[8] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, 1992.
[9] G. Schoeck, "The core energy of dislocations," Acta Metall. Mater., 127, 3679, 1995.
[10] J.W. Christian and V. Vitek, "Dislocations and stacking faults," Rep. Prog. Phys., 33, 307, 1970.
[11] J. Wang, "A new modification of the formulation of Peierls stress," Acta Mater., 44, 1541, 1996.
[12] G. Schoeck, "Peierls energy of dislocations: a critical assessment," Phys. Rev. Lett., 82, 2310, 1999.
[13] V. Bulatov and E. Kaxiras, "Semidiscrete variational Peierls framework for dislocation core properties," Phys. Rev. Lett., 78, 4221, 1997.
[14] G. Lu, V. Bulatov, and N. Kioussis, "A non-planar Peierls–Nabarro model and its application to dislocation cross-slip," Phil. Mag., 83, 3539, 2003.
[15] P. Hohenberg and W. Kohn, "Inhomogeneous electron gas," Phys. Rev., 136, B864, 1964.
[16] G. Lu, N. Kioussis, V. Bulatov, and E. Kaxiras, "Generalized-stacking-fault energy surface and dislocation properties of aluminum," Phys. Rev. B, 62, 3099, 2000a.
[17] W. Benoit, N. Bujard, and G. Gremaud, "Kink dynamics in f.c.c. metals," Phys. Stat. Sol. (a), 104, 427, 1987.
[18] G. Lu, N. Kioussis, V. Bulatov, and E. Kaxiras, "The Peierls–Nabarro model revisited," Phil. Mag. Lett., 80, 675, 2000b.
[19] J.R. Rice, "Dislocation nucleation from a crack tip: an analysis based on the Peierls concept," J. Mech. Phys. Sol., 40, 239, 1992.
2.21 MODELING DISLOCATIONS USING A PERIODIC CELL

Wei Cai
Department of Mechanical Engineering, Stanford University, Stanford, CA 94305-4040
Dislocations are lattice defects responsible for many mechanical behaviors of crystalline materials, ranging from their growth to deformation and failure [1]. Dislocation motion leads to plastic deformation. In some cases, dislocation interactions give rise to materials strengthening, while in other cases they participate in ductile fracture and fatigue. Dislocation nucleation and subsequent multiplication are important processes for many new technologies based on thin films and micro-mechanical structures. Examples include relaxation of strained heteroepitaxial semiconductor layers [2] and materials characterization by micro- and nano-indentation [3].

While long-range interactions between dislocations are well described by continuum elasticity theory, dislocation nucleation, motion and short-range reactions depend sensitively on atomistic mechanisms in the nonlinear region of the dislocation core. Understanding these local unit mechanisms requires atomistic simulations such as Molecular Dynamics (MD). However, the long-range elastic fields of dislocations present some problems for atomistic simulations, which are usually limited in length scale. For example, a dislocation can generate appreciable stress over a range of a micrometer, whereas a cubic simulation cell containing $10^6$ atoms is only about 30 nm in size. Consequently, atomistic simulation results can easily be contaminated by artifacts if boundary conditions are not treated properly. A natural approach to this problem is to establish a good coupling between discrete atomistic models and continuum elasticity theory.

The purpose of this article is to discuss some ubiquitous problems of boundary conditions in atomistic modeling of dislocations. Specifically, we focus on the usage of periodic boundary conditions (PBC) and the problem of conditional convergence during error correction. The solution of this problem gives the proper procedure for setting up the initial dislocation structure, extracting the dislocation core energy, and computing the Peierls stress. It is also applicable to computing stress fields in microscale dislocation dynamics (DD) simulations using PBC.

S. Yip (ed.), Handbook of Materials Modeling, 813–826. © 2005 Springer. Printed in the Netherlands.
1. Setting up a Dislocation Structure
From a continuum mechanics point of view, a dislocation is the boundary line of a cut plane, the two sides of which have slipped with respect to each other by a constant vector $\mathbf{b}$, the Burgers vector. Figure 1(a) shows a screw dislocation ($\mathbf{b}$ parallel to the dislocation line) at the center of a cylinder. In isotropic elasticity, the displacement field of this dislocation is parallel to the $z$ direction and is given by

$$u_z(x, y) = b\,\frac{\theta}{2\pi}, \tag{1}$$

where $\theta \in (-\pi, \pi]$ is the polar angle of the field point, as shown in Fig. 1(a). In Fig. 1 we have assumed that the outer radius of the cylinder goes to infinity. The angle $\theta$ and the displacement $u_z$ become ill defined at the geometrical center of the dislocation ($x = y = 0$). Here the continuum theory breaks down. To circumvent this problem, a small tube with radius $r_c$ around the dislocation center is usually carved out in continuum models, as shown in Fig. 1(a); $r_c$ is called the core radius. In reality this problem does not exist because crystals are not continua but consist of discrete atoms. With the core cut-off, the total elastic energy stored within a radius $R$ of the cylinder can also be derived,

$$E_{\rm el}(R, r_c) = \frac{\mu b^2}{4\pi}\ln\frac{R}{r_c}, \tag{2}$$

where $\mu$ is the shear modulus of the medium. The choice of $r_c$ is merely a convention and hence somewhat arbitrary ($r_c = 1b$ is often used). We need to carefully specify our choice of $r_c$ when comparing results with others, since $r_c$ influences elastic energy values.

[Figure 1. (a) A straight screw dislocation (thick dashed line) in a cylindrical continuum medium, with Burgers vector magnitude b. (b) Atomic structure cell of a screw dislocation in Ta with boundary atoms (grey) fixed according to linear elastic solutions. The interior atoms are initially displaced by elasticity solutions and are subsequently relaxed to minimize total energy. Atoms with high local energies are plotted in dark color, showing the dislocation core region. (c) Side view of the same structure as in (b).]

To create an atomic structure of a dislocation, a standard approach is to start with a perfect lattice structure and then displace all atoms according to the prediction of elasticity theory, for example, Eq. (1). When choosing the positions of the dislocation center and cut plane, it is best to avoid intersection with any atoms, so that there will be no ambiguity in computing atomic displacements.¹ Figure 1(b) shows a choice of cut plane (dashed line) for creating a screw dislocation in the BCC metal tantalum. The displacement field obtained this way is usually accurate far away from the dislocation center. Hence we can fix the outermost layer of atoms at these positions to serve as boundary conditions, such as the grey atoms in Fig. 1(b). We can then relax the inner atoms to their equilibrium positions by minimizing the total atomistic energy. The atomic structure thus obtained is usually different from the elasticity theory prediction near the dislocation center, and is referred to as the core structure. The atoms near the core usually have high local energies. By plotting high-energy atoms with a different color, as in Fig. 1(b), we can identify the position and spread of the dislocation core.

We emphasize that the "physical spread" of the dislocation core in this visualization has nothing to do with the core radius $r_c$ introduced earlier in the elasticity theory. The latter is only a theoretical construct to get rid of the singularity. There is no singularity in the atomistic model. The local energy of every atom is finite, and so is the total energy $E_{\rm atm}$ of the relaxed structure within a cylinder of radius $R$. We define $E_{\rm atm}$ with respect to the energy of a perfect lattice with the same number of atoms. Thus $E_{\rm atm}$ and $E_{\rm el}$ in Eq. (2) refer to the same quantity, though defined in different models.
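The displacement step of the recipe above (apply Eq. (1) to every atom of a perfect lattice, with the dislocation center and cut plane placed between atoms) can be sketched as follows. This is a minimal illustration under stated assumptions: the 4 × 4 square lattice, the function names `screw_displacement` and `displace_lattice`, and the choice of a cut plane running along the negative x direction from the dislocation center are ours, not taken from the text, and no relaxation is performed.

```python
import math

def screw_displacement(x, y, b, x0=0.0, y0=0.0):
    """u_z of Eq. (1) for a screw dislocation centered at (x0, y0).

    math.atan2 returns theta in (-pi, pi], so the cut plane lies along
    y = y0, x < x0, where u_z jumps by b.
    """
    theta = math.atan2(y - y0, x - x0)
    return b * theta / (2.0 * math.pi)

def displace_lattice(sites, b, center):
    """Return a copy of sites [(x, y, z), ...] with u_z added to every z."""
    x0, y0 = center
    return [(x, y, z + screw_displacement(x, y, b, x0, y0))
            for (x, y, z) in sites]

# Hypothetical toy lattice with unit spacing; not a real crystal structure.
sites = [(float(i), float(j), 0.0) for i in range(4) for j in range(4)]
# Center chosen between lattice rows and columns, so that neither the
# center nor the cut plane passes through an atom.
center = (1.5, 1.5)
displaced = displace_lattice(sites, b=1.0, center=center)
print(displaced[0])
```

In an actual calculation the displaced positions would then be handed to an energy minimizer, with the outermost atoms held fixed as described above.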
If elasticity theory is valid, the two energies should agree with each other up to a constant (due to the arbitrariness of $r_c$), that is,

$$E_{\rm atm}(R) = E_{\rm el}(R, r_c) + E_{\rm core}(r_c). \tag{3}$$
In other words, $E_{\rm atm}$ and $E_{\rm el}$ should have the same dependence on $R$, provided that $R$ is large enough. $E_{\rm core}$ is called the core energy. We emphasize that the core energy depends on $r_c$; hence it is not a physical quantity by itself. Its dependence on $r_c$ should exactly cancel the dependence of $E_{\rm el}$ on $r_c$, so that the sum of the two gives a total energy that is invariant with respect to $r_c$. Therefore, the core energy supplements elasticity theory to form a complete physical description of a dislocation.

Equation (3) has been numerically verified by atomistic simulations [4], from which one can deduce the core energy. However, a simulation using cylindrical boundary conditions is not the most accurate way to compute the core energy, because the radius $R$ of a cylinder is ambiguous: one can always find a range of $R$ that encloses the same number of atoms. Using periodic boundary conditions is a better approach, which we will describe below.

¹ If the Burgers vector has a non-zero component out of the cut plane, it would also be necessary to insert atoms into or delete them from the original lattice [1].
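The statement that the $r_c$ dependence of $E_{\rm core}$ cancels that of $E_{\rm el}$ can be made concrete with Eqs. (2) and (3): changing the cut-off from $r_c$ to $r_c'$ must shift the core energy by $(\mu b^2/4\pi)\ln(r_c'/r_c)$. The short sketch below checks this numerically. The numbers are hypothetical values in reduced units ($\mu = b = 1$), not data from the text, and the function names are ours.

```python
import math

def elastic_energy(R, r_c, mu, b):
    """Eq. (2): elastic energy per unit length inside radius R with cut-off r_c."""
    return mu * b * b / (4.0 * math.pi) * math.log(R / r_c)

def convert_core_energy(e_core, r_c_old, r_c_new, mu, b):
    """Core energy for a new cut-off, chosen so that E_el + E_core is unchanged."""
    return e_core + mu * b * b / (4.0 * math.pi) * math.log(r_c_new / r_c_old)

# Hypothetical inputs in reduced units; e_core_1b is an assumed value.
mu, b = 1.0, 1.0
e_core_1b = 0.30
e_core_2b = convert_core_energy(e_core_1b, 1.0, 2.0, mu, b)

for R in (10.0, 100.0):
    total_1b = elastic_energy(R, 1.0, mu, b) + e_core_1b
    total_2b = elastic_energy(R, 2.0, mu, b) + e_core_2b
    # The two totals coincide for every R, as required by Eq. (3).
    print(R, total_1b, total_2b)
```

This is why a quoted core energy is only meaningful together with the value of $r_c$ for which it was extracted.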
2. Dislocations in a Supercell
Periodic boundary conditions are widely used in atomistic simulations of condensed matter. The principal advantage of PBC is that they eliminate artificial surfaces and preserve the translational invariance of space. In PBC, the atomic structure is periodically repeated in space with three repeat vectors, $\mathbf{c}_{1,2,3}$. This means that, whenever there is an atom at position $\mathbf{r}$, there are also atoms at positions $\mathbf{r} + n_1\mathbf{c}_1 + n_2\mathbf{c}_2 + n_3\mathbf{c}_3$, where $n_1$, $n_2$ and $n_3$ are arbitrary integers. As shown in Fig. 2(a), we can imagine a parallelepiped simulation cell defined by these three vectors [5, 6]. This simulation cell, usually called a supercell, is then surrounded by an infinite number of copies of itself. The border of the supercell is immaterial; only its shape (as specified by $\mathbf{c}_{1,2,3}$) is relevant. To see this, consider the infinite structure formed by periodically repeating the supercell. We can then arbitrarily shift the supercell to a different location, carve out the atoms within it and form a new supercell. The new supercell will correspond to exactly the same system as the original one. Therefore, no point in space is made more special than others and the translational invariance of space is preserved. This is not the case for other types of boundary conditions. For example, in Fig. 1(b) the atoms in the outer layer of the cylinder are fixed while the inner ones are allowed to move. This creates an interface between the two domains, which may be quite undesirable for certain applications. In addition, PBC is the de facto boundary condition for electronic structure calculations that employ plane-wave bases, simply because plane waves demand translational invariance.

[Figure 2. (a) Atomistic supercell containing a dislocation dipole in silicon. The repeat vectors of the supercell are c1 = 4[112], c2 = 3[111], c3 = [110]. The distance between the two dislocations is a = c1/2. (b) Schematic representation of the supercell and its image cells, each containing a dislocation dipole. Image dipoles are illustrated in grey, and the "ghost" dislocations introduced to facilitate the image energy calculation are plotted in white. Labels in (b): b, a, R, c1, c2, and ghost dislocations ±αb and ±βb.]

However, the advantages of PBC come at a price. First, supercells can only accommodate dislocation arrangements whose net Burgers vector is zero. Thus, the minimal number of dislocations that can be introduced in a supercell is two, forming a dipole. These two dislocations interact with each other, as well as with the periodic images of themselves. Associated with these interactions are additional strains, energies and forces whose effects can "pollute" the simulation results. Fortunately, it is possible to quantify such artifacts using continuum elasticity theory, after which they can be either corrected or minimized. This extra work is well worth it given the unique advantages offered by PBC: their simplicity, flexibility and versatility.

Let us begin our discussion with setting up the atomic structure of a dislocation dipole in a supercell. This turns out to be nontrivial. Consider two screw dislocations with Burgers vectors $\mathbf{b}$ and $-\mathbf{b}$, separated from each other by $\mathbf{a}$. According to Eq. (1), the displacement field of these two dislocations in an infinite medium (without any images) is simply

$$u_z^{\rm dipole}(x, y) = b\,\frac{\theta_1 - \theta_2}{2\pi}, \tag{4}$$
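Equation (4) can be evaluated directly, as in the sketch below. It assumes, for illustration only, that the $+b$ dislocation sits at $x = -0.5$ and the $-b$ dislocation at $x = +0.5$ (the text specifies only $x = \pm 0.5$), and that both angles are measured with `math.atan2`, so that each cut plane runs along the negative x direction from its dislocation. With this convention the two cuts cancel to the left of the dipole and the displacement jump survives only between the two dislocations.

```python
import math

def dipole_displacement(x, y, b=1.0, x_plus=-0.5, x_minus=0.5):
    """Eq. (4) for screw dislocations +b at (x_plus, 0) and -b at (x_minus, 0)."""
    theta1 = math.atan2(y, x - x_plus)    # angle seen from the +b dislocation
    theta2 = math.atan2(y, x - x_minus)   # angle seen from the -b dislocation
    return b * (theta1 - theta2) / (2.0 * math.pi)

eps = 1e-9
# Between the dislocations the field jumps by b across y = 0 ...
jump_inside = dipole_displacement(0.0, -eps) - dipole_displacement(0.0, eps)
# ... while to the left and right of the dipole it is continuous.
jump_left = dipole_displacement(-1.0, -eps) - dipole_displacement(-1.0, eps)
jump_right = dipole_displacement(1.0, -eps) - dipole_displacement(1.0, eps)
print(jump_inside, jump_left, jump_right)
```

Swapping the two positions only changes the overall sign of the field.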
where $\theta_1$ ($\theta_2$) is the angle between the cut plane and the vector from the dislocation $\mathbf{b}$ ($-\mathbf{b}$) to the field point, defined as in Fig. 1(a). Figure 3(a) plots the displacement field in the rectangular region $x \in [-1, 1]$, $y \in [-0.5, 0.5]$ for a dislocation dipole at $x = \pm 0.5$, $y = 0$. The cut planes of the two dislocations overlap with each other so that the discontinuity in the displacement field only occurs between the dislocations. This displacement field is obviously non-periodic. Attempts to fit this configuration into a periodic supercell will inevitably create some mismatch at the box boundaries. Nevertheless, taking this as an initial condition, it is often possible to relax the mismatch away, but this is not guaranteed. Some mismatch can persist as spurious interfaces that contaminate the simulation. It is thus desirable to create initial atomic structures that already satisfy PBC. A natural approach to generate the desired displacement field is to superimpose the displacement fields of many dislocation dipoles that form a periodic array, that is,

$$u_z^{\rm sum}(\mathbf{r}) = \sum_{\mathbf{R}} u_z^{\rm dipole}(\mathbf{r} - \mathbf{R}) = u_z^{\rm dipole}(\mathbf{r}) + u_z^{\rm img}(\mathbf{r}), \qquad u_z^{\rm img}(\mathbf{r}) \equiv \sum_{\mathbf{R} \neq 0} u_z^{\rm dipole}(\mathbf{r} - \mathbf{R}), \tag{5}$$
[Figure 3. Constructing the displacement field $u_z(x, y)$ of a screw dislocation dipole in PBC by superimposing the displacement fields of the primary dipole, in (a), and the image dipoles. The fully corrected result is shown in (d); see text. Panels: (a) $u_z^{\rm dipole}$; (b) $u_z^{\rm dipole} + u_z^{\rm img}$; (c) $u_z^{\rm err}$; (d) $u_z^{\rm dipole} + u_z^{\rm img} - u_z^{\rm err}$. Each panel is plotted over $x \in [-1, 1]$, $y \in [-0.5, 0.5]$.]
where the summation runs over the two-dimensional lattice $\mathbf{R} = n_1\mathbf{c}_1 + n_2\mathbf{c}_2$ ($n_1$, $n_2$ are integers) that specifies the offset of the image dipoles with respect to the primary dipole (which lies inside the supercell), and the dislocation lines are parallel to $\mathbf{c}_3$. The summation defining $u_z^{\rm img}$ excludes the term for the primary dipole, that is, $\mathbf{R} = 0$. In practice, this sum is evaluated only for a finite number of image dipoles closest to the primary dipole.

From Fig. 3(b) it is clear that, with the image contributions added, the displacement field now looks "more periodic" than before, but not exactly. Let us denote the desired but as yet unknown periodic solution as $u_z^{\rm PBC}$. It can be proved that the remaining non-periodic part, $u_z^{\rm err} = u_z^{\rm sum} - u_z^{\rm PBC}$, is a field with a constant slope [5, 6], as shown in Fig. 3(c). It can also be shown that the error field $u_z^{\rm err}$ can converge to arbitrary values, depending on how the terms are ordered during summation or, equivalently, how the sum is truncated at the end of the summation. For example, the field will converge to different values when summing over the dipoles contained in circles of increasing radius and over those in squares of increasing size. This undesirable behavior is called conditional convergence and is a consequence of the long-range character of the elastic fields of dislocations.

Fortunately, we are just one step away from obtaining the desired and unique solution $u_z^{\rm PBC}$. Since $u_z^{\rm err}$ must be a linear function, it can be easily measured by taking the differences $u_z^{\rm sum}(\mathbf{r} + \mathbf{c}_{1,2}) - u_z^{\rm sum}(\mathbf{r})$, because by definition $u_z^{\rm PBC}(\mathbf{r} + \mathbf{c}_{1,2}) = u_z^{\rm PBC}(\mathbf{r})$. Thus, we recover the periodic solution $u_z^{\rm PBC}(\mathbf{r})$ by subtracting off the linear term (see Fig. 3(d)). After correction, the result $u_z^{\rm sum}(x, y) - u_z^{\rm err}(x, y)$ becomes absolutely convergent, in that it no longer depends on the details of the summation procedure.

For completeness, it is often desirable to use a displacement field that is not strictly periodic with respect to $x$ and $y$ but includes a constant tilt. In terms of the supercell repeat vectors, this corresponds to the case where $\mathbf{c}_{1,2,3}$ are not orthogonal to each other. The purpose is to minimize the average elastic stress within the supercell. The required tilt can be computed through the following considerations. The creation of the dislocation dipole introduces a plastic strain to the supercell,

$$\boldsymbol{\epsilon}^{\rm pl} = \frac{1}{2}\left(\mathbf{A} \otimes \mathbf{b} + \mathbf{b} \otimes \mathbf{A}\right)/V, \tag{6}$$

where $\mathbf{A}$ is the area of the cut plane (times the plane normal vector) across which the displacement field jumps by $\mathbf{b}$, and $V$ is the volume of the supercell. $\boldsymbol{\epsilon}^{\rm pl}$ will cause non-zero average elastic strain and stress unless the supercell is tilted in such a way as to exactly accommodate it. In our example, $\mathbf{A} = (\mathbf{c}_3 \times \mathbf{c}_1)/2$, which is parallel to $\mathbf{c}_2$. Zero average internal stress can be achieved by using a new repeat vector,

$$\mathbf{c}_2' = \mathbf{c}_2 + \mathbf{b}/2. \tag{7}$$
This corresponds to introducing a linear term, $u^{\rm tilt}(x, y) = b y/2$, into the displacement field. The result, $u^{\rm PBC}(x, y) + u^{\rm tilt}(x, y)$, is no longer periodic in $y$. We emphasize that although $u^{\rm tilt}(x, y)$ and $u^{\rm err}(x, y)$ are both linear fields, they have completely different meanings. $u^{\rm err}(x, y)$ is an arbitrary error that needs to be subtracted from the summation in order to obtain a unique answer. On the other hand, $u^{\rm tilt}(x, y)$ has a specific value and is introduced intentionally to minimize internal stresses in the supercell.
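The image summation and the subtraction of the linear error field described in this section can be sketched in a few lines. The example below is a minimal illustration, not the implementation of [5, 6]: it assumes a rectangular cell with $\mathbf{c}_1 = (2, 0)$ and $\mathbf{c}_2 = (0, 1)$ in the x-y plane (matching the plotting region of Fig. 3), the $+b$ dislocation at $x = -0.5$, a rectangular truncation of the image sum, and function names of our own choosing. The slope of the error field is measured from differences of the truncated sum across $\mathbf{c}_1$ and $\mathbf{c}_2$ at one reference point, and the tilt term $u^{\rm tilt}$ is not included.

```python
import math

C1 = (2.0, 0.0)   # in-plane repeat vectors (assumed orthogonal in this sketch)
C2 = (0.0, 1.0)
B = 1.0           # Burgers vector magnitude

def u_dipole(x, y):
    """Eq. (4) for +b at (-0.5, 0) and -b at (+0.5, 0)."""
    return B * (math.atan2(y, x + 0.5) - math.atan2(y, x - 0.5)) / (2.0 * math.pi)

def u_sum(x, y, n1_max, n2_max):
    """Eq. (5), truncated to |n1| <= n1_max and |n2| <= n2_max."""
    total = 0.0
    for n1 in range(-n1_max, n1_max + 1):
        for n2 in range(-n2_max, n2_max + 1):
            rx = n1 * C1[0] + n2 * C2[0]
            ry = n1 * C1[1] + n2 * C2[1]
            total += u_dipole(x - rx, y - ry)
    return total

def error_slope(n1_max, n2_max, ref=(0.3, 0.2)):
    """Gradient (sx, sy) of the linear error field, from differences across c1, c2."""
    x, y = ref
    u0 = u_sum(x, y, n1_max, n2_max)
    d1 = u_sum(x + C1[0], y + C1[1], n1_max, n2_max) - u0
    d2 = u_sum(x + C2[0], y + C2[1], n1_max, n2_max) - u0
    det = C1[0] * C2[1] - C1[1] * C2[0]
    sx = (d1 * C2[1] - d2 * C1[1]) / det
    sy = (d2 * C1[0] - d1 * C2[0]) / det
    return sx, sy

def u_pbc(x, y, n1_max, n2_max, slope):
    """Truncated sum with the linear error term subtracted."""
    return u_sum(x, y, n1_max, n2_max) - (slope[0] * x + slope[1] * y)

slope_a = error_slope(15, 15)   # one truncation shape
slope_b = error_slope(15, 30)   # a different truncation shape
print("error slopes for two truncations:", slope_a, slope_b)

x, y = -0.7, -0.3
raw_mismatch = u_sum(x, y + 1.0, 15, 15) - u_sum(x, y, 15, 15)
fixed_mismatch = u_pbc(x, y + 1.0, 15, 15, slope_a) - u_pbc(x, y, 15, 15, slope_a)
print("mismatch across c2 before and after correction:", raw_mismatch, fixed_mismatch)
```

The two truncation shapes give visibly different error slopes, which is the conditional convergence discussed above, while the corrected field is periodic up to a small truncation error.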
3. Core Energy
The atomic structure created by the above procedure can be used as the initial condition for an energy minimization algorithm to find the equilibrium structure of the dislocation dipole in a supercell. Let the atomistic energy be $E_{\rm atm}(\mathbf{a})$.² This energy can also be derived from continuum elasticity theory, and we denote the result by $E_{\rm el}(\mathbf{a}, r_c)$. Again, for elasticity theory to be valid, the two energies must agree with each other up to a constant,

$$E_{\rm atm}(\mathbf{a}) = E_{\rm el}(\mathbf{a}, r_c) + 2E_{\rm core}(r_c), \tag{8}$$
where we have assumed that the core energies of the two dislocations are identical. This equation provides another way to extract dislocation core energies, provided we can compute $E_{\rm el}(\mathbf{a}, r_c)$. For self-consistency, the core energy $E_{\rm core}(r_c)$ thus obtained should depend only on the choice of $r_c$ and be independent of $\mathbf{a}$ and $\mathbf{c}_{1,2,3}$.

To compute $E_{\rm el}(\mathbf{a}, r_c)$, we start by writing down the elastic energy of an isolated dislocation dipole (assuming isotropic elasticity and screw dislocations),

$$E_{\rm el}^{\rm dipole}(\mathbf{a}, r_c) = \frac{\mu b^2}{2\pi}\ln\frac{a}{r_c}. \tag{9}$$

At the same time, the total elastic energy also includes interactions between the dislocation dipole in the primary supercell and those in the (infinitely many) image cells,

$$E_{\rm el}(\mathbf{a}, r_c) = E_{\rm el}^{\rm dipole}(\mathbf{a}, r_c) + E_{\rm el}^{\rm img}(\mathbf{a}), \tag{10}$$

$$E_{\rm el}^{\rm img}(\mathbf{a}) = \frac{1}{2}\sum_{\mathbf{R} \neq 0} E_{\rm dd}(\mathbf{R}), \tag{11}$$

$$E_{\rm dd}(\mathbf{R}) = \frac{\mu b^2}{2\pi}\ln\frac{|\mathbf{R} + \mathbf{a}| \cdot |\mathbf{R} - \mathbf{a}|}{|\mathbf{R}|^2}. \tag{12}$$
$E_{\rm dd}(\mathbf{R})$ is the interaction energy between the primary dipole and an image dipole offset by $\mathbf{R}$. The summation is over the lattice $\mathbf{R} = n_1\mathbf{c}_1 + n_2\mathbf{c}_2$, excluding the origin. The factor $\frac{1}{2}$ appears in Eq. (11) because only half of the interaction energy should be attributed to the primary supercell (the other half belongs to the image cell).

Given the above three equations, the task of computing $E_{\rm el}(\mathbf{a}, r_c)$, and hence $E_{\rm core}(r_c)$, does not look complicated. Unfortunately, we have the conditional convergence problem again, this time for the summation in Eq. (11). Depending on how we truncate the summation (e.g., by circles or by squares), we will converge to different numerical values for $E_{\rm el}^{\rm img}(\mathbf{a})$. This is obviously unacceptable.

It turns out that the solution to this problem is similar to the one we encountered earlier for setting up the initial atomic structure. One can show that the conditionally convergent component of the image energy is proportional to the spurious average stress ($\sigma_{ij}^{\rm err}$) generated by the image dipoles in the supercell, which is also a conditionally convergent quantity [5, 6]. The absolutely convergent form of the image energy is

$$E_{\rm el}^{\rm img}(\mathbf{a}) = \frac{1}{2}\sum_{\mathbf{R} \neq 0} E_{\rm dd}(\mathbf{R}) + \frac{1}{2} A_j b_i \sigma_{ij}^{\rm err}. \tag{13}$$

² The dependence on the repeat vectors $\mathbf{c}_{1,2,3}$ is implicitly assumed but not written out explicitly.
Similar to the previous approach of measuring $u_z^{\rm err}$, the second term in the above equation can be measured by the interaction energy between image dipoles and "ghost" dislocation dipoles introduced at the supercell boundary: when the spurious stress is not present, the interaction with the "ghost" dislocations should vanish. Therefore, the image energy can also be written as

$$E_{\rm el}^{\rm img}(\mathbf{a}) = \frac{1}{2}\sum_{\mathbf{R} \neq 0} E_{\rm dd}(\mathbf{R}) - \frac{1}{2}\sum_{\mathbf{R}} E_{\rm dg}(\mathbf{R}), \tag{14}$$
where $E_{dg}(\mathbf{R})$ is the interaction energy between an image dipole (offset by $\mathbf{R}$) and four ghost dislocations, as shown in Fig. 2(b). The Burgers vector magnitudes of the ghost dislocations satisfy the condition $\mathbf{a} = \alpha\mathbf{c}_1 + \beta\mathbf{c}_2$.

Figure 4 plots the numerical data [5, 6] for a dislocation dipole in silicon with the supercell geometry shown in Fig. 2(a). Both $E_{atm}$ and $E_{el}$ are computed for a few supercell geometries with $\mathbf{a}$ kept at $\mathbf{c}_1/2$. The fact that the difference between these two energies remains constant demonstrates the validity of our approach, and the dislocation core energy can be extracted from this difference. We note that in order to reach this accuracy, one needs to use anisotropic elasticity theory, in which the formulas for dislocation interactions are more complicated than Eq. (12). The elastic constants used in the elastic energy calculation should be the ones corresponding to the interatomic potential used for the atomistic simulation.

[Figure 4: plot of E (eV/Å) versus $c_1$ (in units of [112]).]
Figure 4. Atomistic and linear elastic energies of the dislocation dipole as functions of the supercell shape. $E_{atm}$ and $E_{el}$ (♦) are shown as data points. The solid line represents $2E_{core} = E_{atm} - E_{el}$.
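The truncation dependence of the image sum in Eq. (11) can be illustrated with a small numerical sketch. The pair energy used here is an assumption made for illustration only: two parallel screw dislocations in an isotropic medium interacting as $-K b_i b_j \ln r$ per unit length, on a unit square lattice with $\mathbf{a} = \mathbf{c}_1/2$. It is not the anisotropic expression used for Fig. 4, and the function name and parameters are hypothetical.

```python
import numpy as np

def dipole_image_energy(n1_max, n2_max, a=(0.5, 0.0), K=1.0):
    """Half of the summed dipole-image interaction energy, as in Eq. (11),
    truncated to the block |n1| <= n1_max, |n2| <= n2_max of a unit square lattice.

    Assumed toy model: screw dislocations +b and -b separated by vector a,
    pair energy -K*s1*s2*ln(r) per unit length (s = +/-1 is the sign of b).
    """
    n1 = np.arange(-n1_max, n1_max + 1, dtype=float)
    n2 = np.arange(-n2_max, n2_max + 1, dtype=float)
    X, Y = np.meshgrid(n1, n2, indexing="ij")
    keep = (X != 0) | (Y != 0)            # exclude the origin (the primary dipole)
    X, Y = X[keep], Y[keep]
    ax, ay = a
    r2 = X**2 + Y**2                      # |R|^2
    rp2 = (X + ax)**2 + (Y + ay)**2       # |R + a|^2
    rm2 = (X - ax)**2 + (Y - ay)**2       # |R - a|^2
    # E_dd(R) = K * (ln|R+a| + ln|R-a| - 2 ln|R|)
    e_dd = 0.5 * K * np.log(rp2 * rm2 / r2**2)
    return 0.5 * e_dd.sum()

# Square truncation versus a block twice as tall as it is wide.
e_square = dipole_image_energy(40, 40)
e_tall = dipole_image_energy(40, 80)
print(e_square, e_tall, e_tall - e_square)
```

In this toy model the two truncation shapes give sums that differ by a finite amount that does not shrink as the blocks grow, which is the signature of conditional convergence. The sketch only exposes the problem; the correction terms of Eqs. (13) and (14) are not implemented here.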
4. Peierls Stress
Continuum elasticity theory can be used to determine the thermodynamic driving force on a dislocation line, which is the ratio of the energy dissipation to a virtual dislocation displacement. Yet how fast the dislocation moves in response to its driving force is usually beyond the realm of continuum elasticity and requires an atomistic treatment. A fundamental property of dislocation mobility is the Peierls stress, which is the minimum stress required for a straight dislocation to move at zero temperature. Peierls stress is related, although not necessarily directly, to the macroscopic yield stress above which the crystal deforms plastically. It is an idealized concept, because in a real crystal dislocations are usually not straight, and zero temperature cannot be achieved in practice. Nonetheless, Peierls stress is a well-defined quantity (at least in theory) and is a useful measure of the intrinsic lattice resistance to dislocation motion, because dislocations with a higher Peierls stress generally have lower mobility under similar conditions.

Peierls stress can be computed from a series of atomistic simulations in which the applied stress is gradually increased. Each stress increment is followed by an energy minimization of the atomic structure. The critical stress $\tau_c$ at which the dislocation starts to move is an estimate of the Peierls stress $\tau_{PN}$. However, various boundary condition artifacts introduce errors into the measured critical stress $\tau_c$, making it deviate from the true Peierls stress $\tau_{PN}$. In the following, we describe how to compute the Peierls stress using a supercell and how to minimize the error coming from the boundary conditions. Figure 5(a) shows a suitable supercell setup for Peierls stress calculations. The two screw dislocations ($\mathbf{b}$ along z) are separated vertically ($\mathbf{a}$ along y) while their glide planes are horizontal. An applied stress ($\sigma_{yz}$) exerts forces of equal magnitude but opposite direction on the two dislocations.
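The stress-ramp procedure can be sketched on a one-dimensional toy model. The model is an assumption made for illustration: the dislocation is reduced to a single coordinate x in a sinusoidal lattice potential of amplitude `U0` and period `a`, so that the exact answer $\tau_{PN} = 2\pi U_0/(ab)$ is known. A real calculation would minimize the energy of the full atomic structure instead; all names and parameter values below are hypothetical.

```python
import math

def relax(x, tau, U0=1.0, a=1.0, b=1.0, step=0.01, tol=1e-10, max_iter=100_000):
    """Steepest-descent minimization of E(x) = U0*(1 - cos(2*pi*x/a)) - tau*b*x.

    Returns (x, converged). The run is flagged as not converged when the
    dislocation glides by more than one lattice period, i.e. it keeps moving.
    """
    x_start = x
    for _ in range(max_iter):
        force = tau * b - (2.0 * math.pi * U0 / a) * math.sin(2.0 * math.pi * x / a)
        if abs(force) < tol:
            return x, True
        x += step * force
        if x - x_start > a:
            return x, False
    return x, False

def find_critical_stress(d_tau=0.05, tau_max=10.0):
    """Raise the applied stress in increments of d_tau, relaxing after each one."""
    x = 0.0
    k = 0
    while k * d_tau <= tau_max:
        tau = k * d_tau
        x, converged = relax(x, tau)
        if not converged:
            return tau          # first stress at which minimization fails
        k += 1
    return None

tau_c = find_critical_stress()
print(tau_c, 2.0 * math.pi)     # estimate versus exact value for this toy model
```

The estimate overshoots the exact value by at most one stress increment, so the increment `d_tau` sets the resolution of the measurement.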
When the critical stress is reached, both dislocations start to move indefinitely across the supercell (without the danger of annihilating each other) and the energy minimization algorithm fails to converge. This makes the critical condition easy to detect. Because of the interaction between the two dislocations in the supercell, plus the interaction with their images, the actual force experienced by each dislocation does not come solely from the applied stress. To analyze the effect of boundary conditions, consider the energy variation as the relative position of the two dislocations changes along the x direction. As shown in Fig. 5(b), the energy variation E(x) is a periodic function of x. The data from anisotropic elasticity theory agree well with atomistic simulations [5, 6]. This shows that continuum elasticity can be used to accurately determine boundary effects by computing the elastic energy [Eq. (10)]. This is advantageous because direct computation of the energy variation from atomistic simulations is time consuming.

[Figure 5: (a) supercell schematic showing $\mathbf{c}_1$, $\mathbf{c}_2$, $\mathbf{a}$, x and the two dislocations $\mathbf{b}$ and $-\mathbf{b}$; (b) $E_{el}$ (eV/Å) versus $x/c_1$; (c) $\tau_c$ (GPa) versus $c_2/c_1$.]
Figure 5. (a) A supercell suitable for Peierls stress calculation with $\mathbf{a} = \mathbf{c}_2/2$ initially. The normal vectors of the dislocation glide planes are parallel to $\mathbf{c}_2$, so that the dislocations will not annihilate with each other by gliding. (b) The atomistic energy variation (◦) of a dislocation dipole in silicon as a function of the relative position of the two dislocations along the x direction. When x = 0, the two dislocations are separated by $\mathbf{a} = \mathbf{c}_2/2$. The solid line is the anisotropic elasticity prediction and the dashed line is the isotropic elasticity result. (c) Critical stress $\tau_c$ of screw dislocations in silicon as a function of the aspect ratio of the supercell ($c_2/c_1$) at fixed $\mathbf{c}_1 = 5[112]$. Separate markers denote the data points for x = 0 and for $x = c_1/2$; • are obtained by averaging the two.

The slope of the E(x) curve gives rise to an image force on the dislocations in addition to the Peach–Koehler force [1] due to the applied stress $\sigma_{yz}$. This extra force introduces an error in the Peierls stress calculation. Considering the shape of the E(x) curve, it is clear that this error is minimized for dislocation positions where dE/dx = 0, that is, either at x = 0 or at $x = c_1/2$. A second-order error still exists even in these two special configurations, due to the curvature ($d^2E/dx^2$) and lattice discreteness. Because the E(x) curve is close to sinusoidal, the errors in the critical stress are opposite in sign at x = 0 and $x = c_1/2$. Therefore the error can be further reduced by computing the Peierls stress at these two settings and averaging the results.

Figure 5(c) plots the critical stress $\tau_c$ of screw dislocations in silicon using supercells with different heights ($c_2$) but fixed width ($\mathbf{c}_1 = 5[112]$). Values of $\tau_c$ computed at x = 0 and at $x = c_1/2$ are shown with separate markers. Both sets of data converge to 1.98 GPa with increasing $c_2$, while their averaged values (•) reach this asymptotic value even for relatively short cells. This indicates that, owing to error cancellation, this procedure can determine the Peierls stress accurately using relatively small supercells. This is helpful for first-principles simulations, which are limited to small supercells.
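The error cancellation between the two starting positions can be checked on a toy energy landscape. This is a sketch under stated assumptions, not the silicon calculation of Fig. 5(c): the lattice resistance is a sinusoid of amplitude `U0` and period `a`, and the image energy is taken as exactly sinusoidal, `A*cos(2*pi*x/c1)`, with `c1` an integer multiple of `a`. The critical stress is estimated as the largest energy slope the dislocation meets while crossing its first lattice barrier. All names and values are hypothetical.

```python
import numpy as np

def critical_stress(x0, U0=1.0, a=1.0, A=2.0, c1=10.0, b=1.0, n=200_001):
    """Stress needed to cross the first lattice barrier to the right of x0.

    Total energy (toy model): U0*(1 - cos(2*pi*(x - x0)/a)) + A*cos(2*pi*x/c1).
    The first term is the lattice resistance, the second the image energy E(x).
    """
    x = np.linspace(x0, x0 + a, n)
    slope_lattice = (2.0 * np.pi * U0 / a) * np.sin(2.0 * np.pi * (x - x0) / a)
    slope_image = -(2.0 * np.pi * A / c1) * np.sin(2.0 * np.pi * x / c1)
    return np.max(slope_lattice + slope_image) / b

tau_true = 2.0 * np.pi                 # toy-model value without any image energy
tau_0 = critical_stress(0.0)           # dislocation starts at x = 0
tau_half = critical_stress(5.0)        # dislocation starts at x = c1/2
tau_avg = 0.5 * (tau_0 + tau_half)
print(tau_0 - tau_true, tau_half - tau_true, tau_avg - tau_true)
```

In this model the two individual errors have opposite signs and nearly equal magnitudes, so the average is much closer to the reference value than either estimate alone.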
5. Stress Field Calculations
The conditional convergence problems we encountered in atomistic simulations using supercells, for computing displacement fields or elastic energies, are caused by the intrinsic long-range character of dislocation interactions. Because periodic boundary conditions are widely used in various kinds of simulations, it is natural to expect this problem to be ubiquitous. For example, PBC is often used in microscale DD simulations, and the calculation of stress fields in DD also involves a conditionally convergent summation. In the following, we discuss how the same idea as developed above can be applied to solve this problem.

Dislocation dynamics simulations do not deal with atoms. The relevant degrees of freedom are mathematical lines, usually discretized into straight segments (or curved splines), representing the location of the dislocations. The driving force on each segment is related to the local stress field, which is the superposition of the applied stress and the internal stress from all other segments. Therefore, most of the time in DD simulations is spent on computing the stress at a material point P due to a segment centered at position S. When periodic boundary conditions are used, the stress field due to the infinite number of images of segment S needs to be included.

As an example, consider a differential segment at the origin and a field point at $\mathbf{r} = (x, y, z)$. A segment is called differential if its length dl is considerably smaller than r. The stress field of a differential segment is proportional to dl and takes a simpler form than that of a finite-length segment. To be specific, consider an edge dislocation segment with $\mathbf{b}$ along the y-axis and $d\mathbf{l}$ along the x-axis, and consider the x–z component of its stress field. In isotropic elasticity, it is given by [1]

$$\frac{\sigma_{13}^{seg}(\mathbf{r})}{\mu \cdot b \cdot dl} = \frac{\nu x}{(1-\nu)r^3} - \frac{3xz^2}{(1-\nu)r^5}, \tag{15}$$
where µ is the shear modulus and ν is the Poisson ratio. To construct the stress field when a supercell is used, we need to superimpose the above stress field for a periodic array of dislocation segments,

$$\sigma^{sum}(\mathbf{r}) = \sum_{\mathbf{R}} \sigma^{seg}(\mathbf{r} - \mathbf{R}), \tag{16}$$
where the sum runs over all lattice points $\mathbf{R} = n_1\mathbf{c}_1 + n_2\mathbf{c}_2 + n_3\mathbf{c}_3$. Because $\sigma^{seg}(\mathbf{r}) \sim r^{-2}$, the sum in Eq. (16) (now in three dimensions) is only conditionally convergent. Define the desired but as yet unknown stress field to be $\sigma^{PBC}(\mathbf{r})$. One can show that the difference between $\sigma^{sum}(\mathbf{r})$ and the correct solution is a field with a constant slope,

$$\sigma^{sum}(\mathbf{r}) = \sigma^{PBC}(\mathbf{r}) + \mathbf{g}\cdot\mathbf{r} + \sigma^{0}, \tag{17}$$
where $\mathbf{g}$ is a third-order tensor accounting for a stress gradient and $\sigma^{0}$ is an average stress. Both $\mathbf{g}$ and $\sigma^{0}$ are conditionally convergent: their values vary depending on how the summation is truncated. Because the stress field of a differential dislocation segment is antisymmetric with respect to inversion, that is, $\sigma^{seg}(-\mathbf{r}) = -\sigma^{seg}(\mathbf{r})$, it is a simple matter to ensure that $\sigma^{0} = 0$ by always including the image segment at $-\mathbf{R}$ whenever an image segment at $\mathbf{R}$ is encountered. The stress gradient $\mathbf{g}$, on the other hand, is generally nonzero. However, it can be easily computed after the summation is completed, by measuring the stress difference at the supercell borders, for example,

$$\mathbf{g}\cdot\mathbf{c}_i = \sigma^{sum}(\mathbf{c}_i/2) - \sigma^{sum}(-\mathbf{c}_i/2), \quad \text{for } i = 1, 2, 3. \tag{18}$$

It is then straightforward to obtain a regularized solution $\sigma^{PBC}(\mathbf{r})$ from an arbitrarily chosen summation sequence $\sigma^{sum}(\mathbf{r})$, by solving for $\mathbf{g}$ and subtracting the term $\mathbf{g}\cdot\mathbf{r}$ from $\sigma^{sum}(\mathbf{r})$. In practice, the summations over stress fields are performed before the DD simulation for efficiency. The results are tabulated so that no image summation is required during the DD simulation.
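The regularization of Eqs. (16)–(18) can be sketched for the single stress component of Eq. (15). The sketch makes several assumptions: the stress is in units of $\mu \cdot b \cdot dl$ as printed in Eq. (15), the Poisson ratio and cell vectors are illustrative values, and the image sum is truncated to a symmetric block of lattice vectors so that $\sigma^0 = 0$. For one fixed stress component, $\mathbf{g}$ reduces to an ordinary three-component vector. Function names are hypothetical.

```python
import numpy as np

NU = 0.3  # Poisson ratio (illustrative value)

def sigma_seg(r):
    """sigma_13 of a differential edge segment, Eq. (15), in units of mu*b*dl.
    r has shape (..., 3); b is along y and dl is along x."""
    x, z = r[..., 0], r[..., 2]
    d = np.linalg.norm(r, axis=-1)
    return NU * x / ((1.0 - NU) * d**3) - 3.0 * x * z**2 / ((1.0 - NU) * d**5)

def sigma_sum(r, cell, n_img=4):
    """Truncated image sum of Eq. (16). Rows of `cell` are c1, c2, c3.
    The block -n_img..n_img contains -R whenever it contains R."""
    idx = np.arange(-n_img, n_img + 1)
    n = np.stack(np.meshgrid(idx, idx, idx, indexing="ij"), axis=-1).reshape(-1, 3)
    R = n @ cell
    return sigma_seg(np.asarray(r, dtype=float) - R).sum()

def stress_gradient(cell, n_img=4):
    """Solve g . c_i = sigma_sum(c_i/2) - sigma_sum(-c_i/2) for g, Eq. (18)."""
    delta = np.array([sigma_sum(c / 2.0, cell, n_img) - sigma_sum(-c / 2.0, cell, n_img)
                      for c in cell])
    return np.linalg.solve(cell, delta)

def sigma_pbc(r, cell, g, n_img=4):
    """Regularized field: the truncated sum minus the linear term g . r."""
    return sigma_sum(r, cell, n_img) - g @ np.asarray(r, dtype=float)

cell = np.diag([1.0, 1.2, 1.5])        # orthorhombic supercell (illustrative)
g = stress_gradient(cell)
print(g, sigma_pbc([0.2, -0.1, 0.3], cell, g))
```

By construction the regularized field takes equal values at $\mathbf{c}_i/2$ and $-\mathbf{c}_i/2$, as a periodic field must. In a production code the sum would be evaluated once on a grid and tabulated, as described above.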
6. Summary
Dislocations are dual objects: they possess both a localized, highly nonlinear core region and a long-range elastic field. Because of this, setting up proper boundary conditions for dislocation modeling is not trivial, and usually requires coupling atomistic models with continuum elasticity theory. This article focuses on periodic boundary conditions and the ensuing conditional convergence problem, which appears both in setting up the initial dislocation structure and in extracting the dislocation core energy. The problem is solved by using the fact that the conditionally convergent term can always be related to a field with a constant slope, which can be measured and subtracted, so that the correct solution can be recovered. The idea can be applied to minimize the boundary artifacts in Peierls stress calculations, as well as to compute stress fields in microscale dislocation dynamics simulations using a supercell.
Acknowledgment
This work was performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory, under contract No. W-7405-Eng-48.
References
[1] J.P. Hirth and J. Lothe, Theory of Dislocations, Wiley, New York, 1982.
[2] K.W. Schwarz, Phys. Rev. Lett., 91, 145503, 2003.
[3] J. Li, K.J. Van Vliet, S. Yip, and S. Suresh, Nature, 418, 307, 2002.
[4] W. Xu and J. Moriarty, Phys. Rev. B, 54, 6941, 1996.
[5] W. Cai, V.V. Bulatov, J. Chang, J. Li, and S. Yip, Philos. Mag., 83, 539, 2003.
[6] W. Cai, V.V. Bulatov, J. Chang, J. Li, and S. Yip, Phys. Rev. Lett., 86, 5727, 2001.
2.22 A LATTICE BASED SCREW-EDGE DISLOCATION DYNAMICS SIMULATION OF BODY CENTER CUBIC SINGLE CRYSTALS
Meijie Tang
Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94550
1. Introduction and Historical Background
This article is an introduction to the three-dimensional (3d) dislocation dynamics (dd) simulation method. It is complementary to the article by Zbib in Chapter 3 of this handbook. It is the intention of the author to introduce a specific model of the method with examples that can be understood rather easily.

A complete understanding of plasticity involves the understanding of individual dislocation properties as well as their collective behavior at the mesoscopic scale. Although dislocation theory and transmission electron microscopy (TEM) have revealed significant basic properties and mechanisms of dislocations [1, 2], the multiplicity and complexity of the mechanisms of dislocation motion and interaction make it hopeless to reach a quantitative description of plasticity based on dislocation mechanisms. The 3d dd method was developed as a numerical means to track the complex collective dislocation motion and to provide a key link from individual dislocation properties and isolated dislocation mechanisms to the plastic deformation properties of materials at the macroscopic length scale.

In the mid-1980s, the first computational models for dd appeared. In these approaches, the dislocations were handled as simplified point objects, representing infinitely long parallel dislocation lines, in idealized 2d crystals with a set of rules defining their behavior [3, 4]. However, these simulations missed important effects such as slip geometry, line tension, and dislocation multiplication and intersection. In the early 1990s, new dd simulations in 3d started to emerge (for a review, see Bulatov et al. [5]). In these new simulation models, the dislocation lines are rigorously represented by a proper discretization scheme, either segment based or node based. The dislocation motion and mutual interactions are handled explicitly according to the physics input and the underlying governing mechanisms arising from either the dislocation core properties or the elastic properties. A wealth of plastic deformation properties can be obtained, including the stress-strain curves, the dislocation density evolution, and detailed information related to the dislocation microstructure evolution.

In this article, we use the screw-edge based dislocation dynamics method as an example to introduce some fundamental aspects of the 3d dd method, to show how the method is formulated, and to demonstrate how the input and output are constructed. Specific dd simulation examples are given for body-centered cubic (bcc) single crystals at low temperatures, where the screw-edge based method is most convenient.

S. Yip (ed.), Handbook of Materials Modeling, 827–837. © 2005 Springer. Printed in the Netherlands.
2. Lattice Based Screw-Edge Dislocation Dynamics Simulation Method
Unlike molecular dynamics simulations, dislocation dynamics simulations track the motion of line objects (dislocations) instead of point objects (atoms). These line objects can be straight or highly curved, and they interact with each other in complex ways. All dislocations interact with each other through the elastic field, which is long ranged, similar to the electric field of charges. When dislocations come into close contact with each other, various topological changes (so-called short-range interactions) can occur. Dislocations can annihilate, reconnect, and form new types of dislocations. Similar to what has been observed under TEM, the "spaghetti"-like dislocation microstructures can be quite complex. What makes the simulation even more complex is that the dislocation microstructure evolves as a function of time.
2.1. Dislocation Line Discretization and Topology Changes
The basic function of the dd simulation is to track the topological changes of the dislocation lines. A proper discretization scheme is needed to do so. In the screw-edge based approach, the dislocations are represented by piecewise connected screw and/or edge segments residing on a discretized sublattice [6]. The basic unit segments are the smallest segments defined on the underlying discretized lattice structure. For example, for a bcc single crystal, the lattice structure for the dd simulation is a simple cubic one with lattice parameter $a^*$, where $a^*$ is typically a few nanometers and is thus much larger than the atomistic crystalline lattice parameter (a few angstroms). For a given Burgers vector and slip plane, for example $[111](01\bar{1})$ in the case of bcc, one can define the unit (or smallest) screw and edge segments in the discretized lattice as shown in Fig. 1(a). An arbitrary dislocation loop, as shown in Fig. 1(b), can then be represented by screw and edge segments that are multiples of the unit ones for the slip system considered. The dislocation segments can have both positive and negative directions. A bcc single crystal has 12 slip systems of the $\langle 111\rangle\{110\}$ type. Unit screw and edge segments are defined for each slip system, and thus the whole dislocation configuration can be represented.

The screw-edge discretization scheme provides a convenient way to follow the topological changes, including dislocation motion as well as short-range interactions. A few examples of the topological changes are given below. These are segment movement, segment annihilation, and segment rediscretization. In order to maintain the connectivity of dislocation lines, when a dislocation segment moves, new segments are inserted between the moving segment and its previous neighbors, as shown in Fig. 2(a). If the moving segment is an edge (or screw), the inserted ones are screw (or edge). Also, the smallest distance of movement that one segment can make is determined by the unit segment of its opposite character in the discretized lattice. For example, the smallest moving distance of a screw is the length of a unit edge; the smallest moving distance of an edge is the length of a unit screw. Because all dislocation segments are made of multiples of the unit segments and all segment ends reside at lattice sites, the moving distances are multiples of the unit segment lengths. Figure 2(b) is an example of partial annihilation and reconnection between two dislocations with the same Burgers vector and opposite directions approaching each other in the same slip plane. Another type of topological change is rediscretization, as shown in Fig. 2(c). When the discretization is too coarse to describe detailed dislocation configurations, the long segment is cut into shorter ones according to prescribed criteria (such as the stress variation along the dislocation line or the curvature of the line), and the subsequent movements result in more refined dislocation line shapes.

Figure 1. (a) A dd sublattice with unit screw and edge segments defined. The unit screw segment is along $\langle 111\rangle$ with length $\sqrt{3}a^*$. The unit edge segment is along $\langle 112\rangle$ with length $\sqrt{6}a^*$. (b) An arbitrarily shaped dislocation loop is discretized into piecewise connected screw and edge segments.

Figure 2. A few examples of topological changes. (a) Segment movement. (b) Segment annihilation and reconnection. (c) Segment rediscretization and movement.
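The bookkeeping behind segment movement can be sketched with a minimal data model. The representation is an assumption made for illustration and is not the data structure of any particular dd code: a closed planar loop is a list of `(character, n)` pairs, where `n` is a signed integer number of unit segments, and moving a segment by `k` units inserts two connecting segments of the opposite character.

```python
import math

# Unit segment lengths in units of a* (see the caption of Fig. 1).
UNIT_LENGTH = {"screw": math.sqrt(3.0), "edge": math.sqrt(6.0)}

def merge(loop):
    """Combine neighboring segments of the same character and drop empty ones,
    treating the list as cyclic (the last segment connects to the first)."""
    out = []
    for kind, n in loop:
        if n == 0:
            continue
        if out and out[-1][0] == kind:
            total = out.pop()[1] + n
            if total != 0:
                out.append((kind, total))
        else:
            out.append((kind, n))
    while len(out) > 1 and out[0][0] == out[-1][0]:
        kind, n = out.pop()
        total = out[0][1] + n
        if total != 0:
            out[0] = (kind, total)
        else:
            out.pop(0)
    return out

def move_segment(loop, i, k):
    """Move segment i by k unit lengths of the opposite character, inserting
    connecting segments so that the line stays continuous."""
    kind, n = loop[i]
    other = "edge" if kind == "screw" else "screw"
    return merge(loop[:i] + [(other, k), (kind, n), (other, -k)] + loop[i + 1:])

def is_closed(loop):
    """A closed loop has zero net screw and zero net edge displacement."""
    return all(sum(n for kind, n in loop if kind == c) == 0 for c in UNIT_LENGTH)

def line_length(loop):
    """Total dislocation line length in units of a*."""
    return sum(abs(n) * UNIT_LENGTH[kind] for kind, n in loop)

loop = [("screw", 4), ("edge", 2), ("screw", -4), ("edge", -2)]   # rectangular loop
moved = move_segment(loop, 0, 1)      # shift the first screw by one unit edge
print(moved, is_closed(moved), line_length(moved))
```

Because every move inserts a pair of opposite connecting segments, the loop stays closed and all segment ends remain on lattice sites.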
2.2. Dislocation Motion and Plastic Strain
Dislocations move and make topological changes in response to the local resolved shear stress $\tau^*$, which is the total driving force for dislocation motion. It can be calculated using the Peach–Koehler formula from the various stresses the dislocations experience. Typically, it has the following contributions:

$$\tau^* = \tau_{elas} + \tau_{app} + \tau_{bc}, \tag{1}$$
where $\tau_{elas}$ comes from the elastic interaction stresses between dislocation segments, $\tau_{app}$ comes from the externally applied stresses, and $\tau_{bc}$ comes from the specified boundary conditions. For example, in the case of thin films, $\tau_{bc}$ comes from the image stresses due to the free surfaces. In the case of bulk systems with periodic boundary conditions, $\tau_{bc}$ is due to the periodic images of the dislocations in the simulation box. The local resolved shear stress is calculated at the center of each segment.

How dislocations move under the local resolved shear stress is defined by the mobility rules. Because the dislocation mobility is determined by the dislocation core structure, it is atomistic in nature and thus needs to be provided to the dd simulations as input. The dislocation mobility varies for different crystalline structures and for different characters of dislocations. In the case of bcc single crystals, the screw dislocations have a three-dimensional core structure and thus have high lattice friction for motion, that is, a high Peierls stress. As a result, they move by the thermally assisted kink pair mechanism at low temperatures [7]. On the other hand, the edge dislocations have a low Peierls stress and move by the phonon drag mechanism. The mobility for the edge and the screw in a bcc single crystal is given by [8]

$$v_{screw} = v_0 \exp[-H(\tau^*)/k_B T], \qquad v_{edge} = \tau^* b/B, \tag{2}$$
where $H(\tau^*)$ is the kink pair activation enthalpy, $v_0$ is a pre-factor, T is the temperature, $k_B$ is the Boltzmann constant, and B is the drag coefficient. For typical conditions at low temperatures, the screw can be several orders of magnitude slower than the edge. If the time step is dictated by the edges, the computation is very inefficient because the screws stay idle most of the time. In our method, the time step of the simulation is chosen based on the screw mobility instead, and the edges are assumed to move infinitely fast. The edges stop only at their equilibrium positions, determined by the balance of the applied stress, the line tension and other elastic stresses, or at positions where close contact interactions occur. This algorithm is approximate, but it is efficient and captures the dominant effect in the low temperature plastic behavior of bcc single crystals. In the screw mobility, the most critical input is the stress dependent activation enthalpy, which is extracted from experiments. It can also be calculated directly from atomistic simulations.

Once the mobility rules are provided, the dd simulations proceed by moving the segments with finite time steps. As the dislocations move, they accumulate plastic strain. The incremental plastic strain $\delta\varepsilon_p$ produced by all segments during a time step is given by

$$\delta\varepsilon_p = \sum_{i=1}^{n} \frac{L_i \cdot v_i\,\delta t \cdot b_i}{V}, \tag{3}$$

where V is the volume of the simulation box, n is the total number of segments, $L_i$ is the length of the ith segment, $v_i\,\delta t$ is the distance moved by the ith segment within the time step, and $b_i$ is the Burgers vector of the segment.
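Equations (2) and (3) can be evaluated in a few lines. The sketch below rests on assumptions that are not taken from the text: the functional form chosen for the activation enthalpy $H(\tau^*)$ and every numerical parameter (`h0`, `tau_p`, `p`, `q`, `v0`, `b`, `B`) are hypothetical placeholders, since in practice $H(\tau^*)$ comes from experiments or atomistic simulations. The edge velocity function simply evaluates Eq. (2); the infinitely fast edge treatment described above is not modeled.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def kink_pair_enthalpy(tau, h0=1.0, tau_p=400.0e6, p=0.5, q=1.25):
    """Hypothetical activation enthalpy in eV: h0*(1 - (tau/tau_p)**p)**q.
    This form and its parameters are placeholders, not fitted data."""
    s = min(max(tau / tau_p, 0.0), 1.0)
    return h0 * (1.0 - s**p) ** q

def v_screw(tau, temperature, v0=1.0e3):
    """Thermally activated screw velocity, first part of Eq. (2), in m/s."""
    return v0 * math.exp(-kink_pair_enthalpy(tau) / (K_B * temperature))

def v_edge(tau, b=2.86e-10, drag=1.0e-4):
    """Drag-controlled edge velocity, second part of Eq. (2), in m/s."""
    return tau * b / drag

def plastic_strain_increment(lengths, velocities, burgers, dt, volume):
    """Eq. (3): sum over segments of L_i * v_i * dt * b_i, divided by V."""
    return sum(L * v * dt * b for L, v, b in zip(lengths, velocities, burgers)) / volume

# Two segments in a (10 micrometer)^3 box, SI units throughout.
d_eps = plastic_strain_increment(
    lengths=[1.0e-6, 2.0e-6],
    velocities=[1.0e-3, 2.0e-3],
    burgers=[2.86e-10, 2.86e-10],
    dt=1.0e-3,
    volume=1.0e-15,
)
print(v_screw(300.0e6, 300.0), v_edge(100.0e6), d_eps)
```

With placeholder values of this kind the screw velocity is far more sensitive to stress and temperature than the edge velocity, which is the disparity that motivates choosing the time step from the screw mobility.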
2.3. Dislocation Junctions and Strain Hardening
Dislocation motion during plastic flow is rarely steady. Dislocations experience both long-range elastic interactions and close contact interactions. While the former are relatively smooth during dislocation motion, the latter can be quite jerky and complex, depending on the initial line shapes, characters, and trajectories of the reacting dislocations. These close interactions tend to have important consequences for the plastic flow. One of the most important types of close interaction is the formation of dislocation junctions. Junctions are dislocations with lower energy than the reacting ones, and they are often sessile, thus temporarily pinning the initially mobile dislocations. When the local stress is large enough to provide the work needed to break the junctions, the dislocations can be de-pinned from the junctions and set free to move again. In order to maintain the same plastic flow, the applied stress needs to be increased to overcome the junctions. This is one of the main mechanisms for strain hardening in single crystalline materials.

Figure 3 shows an example of dislocation junctions in bcc single crystals, schematically in Fig. 3(a) and as simulated in Fig. 3(b). In Fig. 3(a), the long dislocation line is a screw dislocation pinned at its two ends by junctions. Since the screw dislocation moves by the kink pair mechanism, the kinks nucleate at the center and pile up at its ends, so that the middle portion of the screw keeps moving forward. When the distance traveled by the screw reaches a critical value $X_c$, the line tension due to the curvature of the connecting arms between the junction and the screw provides a driving force large enough to break the junctions. The critical distance is determined by $X_c = \alpha\mu b/\tau_a$, where µ is the shear modulus, b the Burgers vector, $\tau_a$ the flow stress resolved on the slip plane, and α the average junction strength parameter.
When the junction is broken, the dislocation segment is set free and rejoins the plastic flow.
Figure 3. Dislocation junctions in bcc single crystals. (a) Schematic drawing showing the critical configuration of a junction. (b) A simulated array of dislocation junctions along a screw dislocation.
3. Examples of dd Simulations of bcc Single Crystals at Low Temperatures
The examples given below are generic to most bcc transition-metal single crystals. The first example is a Frank–Read source at low temperatures. A Frank–Read source is a mechanism that multiplies dislocations from a single dislocation, as shown in Fig. 4. At low temperatures, the edges move much more easily than the screws. They travel long distances and leave behind elongated screws. The simulated Frank–Read source resembles what has been observed in in situ TEM experiments.

Figure 4. Frank–Read sources for bcc at low temperatures. (a) A screw segment initially pinned at one end (the other end extending out of the simulation box). (b) The screw segment moves under an applied stress and creates an edge neighbor at the pinning point, which moves a large distance under the same applied stress. (c) The edge segment moves out of the box and leaves behind two screws with opposite directions. (d) Both screws move and the pinned screw creates a new edge neighbor again. The edge moves in the direction opposite to that in (b). (e) The process from (a) to (d) starts to repeat, with one pair of screw segments generated. (f) As the process continues, the screws move away from the center and the edges move out of the simulation box. Several pairs of elongated screw segments have been generated.

The next example is a 3d simulation of the yield stress of bcc crystals at low temperatures. The simulations are performed at a constant strain rate of $10^{-4}$/s under uniaxial loading. The simulation box is 15 µm in length. The simulation starts with a dislocation configuration with an initial density of $10^{11}$/m$^2$, with equal densities of screw and edge segments distributed randomly (in both length and location) over all slip systems. The constant strain rate is maintained by adjusting the applied stress according to

$$\delta\sigma = C(\dot{\varepsilon}\,\delta t - \delta\varepsilon_p), \tag{4}$$
where $\delta\sigma$ is the increment of the applied stress, C is the Young's modulus along the uniaxial loading orientation, $\dot{\varepsilon}$ is the applied constant strain rate, and $\delta\varepsilon_p$ is the total plastic strain increment along the loading orientation during the time step. By doing so, one is able to bring the system to a constant strain rate and maintain it. Thus, the so-called stress-strain curves can be generated. Some examples of stress-strain curves are shown in Fig. 5. These curves all show a distinctive turnover of the flow stress from rising sharply to being nearly flat. When the stresses are low, the edge segments move, but most screws do not. The screws hinder the overall plastic deformation because the edges are stopped by the line tension of their screw neighbors. When the plastic strain rate is below the applied value, the applied stress increases rapidly according to Eq. (4). As the stress increases to a level at which most screw dislocations start to move, the Frank–Read sources start to operate and the mobile dislocation density increases significantly. Then, much larger plastic strains are accumulated in each time step, and the simulated plastic strain rate approaches the applied value rapidly. Therefore, the flow stress increases much more slowly as it approaches the yield stress.

The yield stress is a macroscopic material property that can be measured in experiments. Its experimental definition is the flow stress at a strain of 1% or 2%. Essentially, it is defined as the stress at the onset of plastic deformation, below which elastic deformation dominates. In the dd simulations, it is defined through Eq. (4), that is, as the stress at which the plastic strain rate reaches the applied value. Fig. 5(b) shows the yield stresses obtained from the end of the stress-strain curves shown in Fig. 5(a). The comparison with experiments is quite reasonable [8].
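The stress update of Eq. (4) can be sketched as a simple time-stepping loop. In a dd simulation the plastic strain increment comes from moving the dislocation segments; here it is replaced by a hypothetical stand-in, a power law in stress, so that the loop is self-contained. The modulus, time step, and power-law parameters are illustrative assumptions, with stress in MPa.

```python
def constant_strain_rate_loading(strain_rate=1.0e-4, modulus=1.0e5, dt=0.01,
                                 n_steps=3000, plastic_rate=None):
    """Integrate Eq. (4), d_sigma = C*(strain_rate*dt - d_eps_p), step by step.

    modulus is C in MPa. plastic_rate(sigma) returns the plastic strain rate
    in 1/s; it stands in for the response of the dislocation ensemble.
    Returns a list of (total strain, stress) pairs.
    """
    if plastic_rate is None:
        # Hypothetical power-law response: equals the applied rate at 100 MPa.
        plastic_rate = lambda s: 1.0e-4 * (max(s, 0.0) / 100.0) ** 10
    sigma, strain, history = 0.0, 0.0, []
    for _ in range(n_steps):
        d_eps_p = plastic_rate(sigma) * dt           # plastic strain this step
        sigma += modulus * (strain_rate * dt - d_eps_p)
        strain += strain_rate * dt
        history.append((strain, sigma))
    return history

curve = constant_strain_rate_loading()
print(curve[-1])    # final (strain, stress) pair
```

With this stand-in response the stress first rises almost linearly with slope C and then levels off once the plastic strain rate matches the applied rate, which mirrors the turnover described for Fig. 5(a).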
Figure 5. Stress-strain curves and yield stresses as a function of temperature for single crystal tantalum. (a) Simulated stress-strain curves at different temperatures (from top to bottom, the temperatures are 160 K, 197 K, 250 K, and 300 K, respectively). Both the flow stress and the plastic strain are resolved in the primary slip plane. (b) The simulated resolved yield stresses (filled circles) are extracted from the end of the stress-strain curves in (a). They are compared with two sets of experimental measurements (filled triangles).
Figure 6. Dislocation configurations of bcc single crystals at low temperatures. (a) A snapshot taken at the start of the simulation for the initial relaxed configuration. The dislocation density is $2\times10^{11}$/m$^2$. (b) A snapshot taken from the simulation after plastic yielding. The dislocation density is $1.9\times10^{12}$/m$^2$. The simulation is performed at a constant strain rate of 1/s and at 300 K under a single slip orientation. The simulation box size is 15 µm. (c) A TEM micrograph of niobium on the $(01\bar{1})$ slip plane after plastic deformation at 50 K.
The elongated screw dislocations seen in the Frank–Read source in Fig. 4 are quite characteristic of bcc plastic deformation at low temperatures. Even during the pre-yield stage, the edges start to move at quite low stresses while the screws do not. Thus, the screws are elongated without moving. At yielding, the screws start to move and continue to produce long screws, as seen in Fig. 4. Some typical dislocation configurations are shown in Fig. 6. The two snapshots are taken from constant strain rate simulations of tantalum at 300 K and at a strain rate of 1/s. Also shown in the figure is a TEM micrograph of long screw dislocations in the $(01\bar{1})$ slip plane of a niobium crystal after deformation at 50 K.
4. Progress and Outlook
This article introduces the basic aspects of a dd simulation for bcc single crystals using the lattice based screw-edge type of model. A few examples are given to introduce some basic applications of the method. This model is most convenient for the plastic deformation of bcc single crystals at low temperatures, where the screw dislocations dominate the plastic deformation.

By now, various improved or more sophisticated versions of dd codes have been developed or are being developed. As far as the discretization scheme is concerned, there are several approaches, including the dislocation node based approach [9, 10], the parametric segment based approach [11], and the off-lattice mixed segments approach [12]. As for the lattice based approach, a few selected mixed segments are added to the screw and edge basis for face-centered cubic (fcc) simulations [13]. Most methods are general for either bcc or fcc. Some methods are extended to other single crystals such as diamond cubic crystals. As for the boundary conditions, the periodic boundary condition has been developed for the simulation of bulk materials [5]. For systems with free surfaces, it has become a rather standard approach to couple a dd code with a finite element method (FEM). An advanced algorithm is being developed that uses an analytical solution to account for the singular stresses at the intersection of the dislocations with the free surfaces.

Another important frontier in the development of the 3d dd method is to reach high scalability on massively parallel computers. The newly developed ParaDiS code at Lawrence Livermore National Laboratory is among the latest developments in performing large scale dd simulations using thousands of CPUs or more [9].

Progress has also been made in bridging length scales from the atomistic level to the continuum level. For example, atomistic simulations using rather accurate first-principles-based interatomic potential functions have been used to calculate the kink pair activation enthalpy for bcc single crystals [14]. At the other end, dd simulations are used to provide insight, fit parameters, and validate models that are based on dislocation density at the continuum level [15]. The reader is encouraged to consult the related parts of this handbook that discuss the other length scales, such as the atomistic and the continuum.
References [1] J. Friedel, Dislocations, Pergamon, Oxford, 1964. [2] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edition, Wiley, New York, 1982. [3] J. Lepinoux and L.P. Kubin, “The dynamics organization of dislocation structures: a simulation,” Scripta Metall., 21, 833–838, 1987. [4] N.M. Ghoniem and R.J. Amodeo, “Computer simulation of dislocation pattern formation,” Solid State Phenom., 3&4, 377–388, 1988. [5] V.V. Bulatov, M. Tang, and H.M. Zbib, “Crystal plasticity from dislocation dynamics,” MRS Bull., 26, 191–195, 2001. [6] B. Devincre, “Meso-scale simulation of the dislocation dynamics,” In: H.O. Kirchner, L.P. Kubin, and V. Pontikis (eds.), Computer Simulation in Materials Science, NATO ASI E 318, Kluwer, Dordrecht, 309–323, 1995. [7] L.P. Kubin, “The low temperature plastic deformation of bcc metals,” Rev. Deformation Behavior Mater., 1, 244–288, 1977. [8] M. Tang, L.P. Kubin, and G.R. Canova, “Dislocation mobility and the mechanical response of bcc single crystals: a mesoscopic approach,” Acta Mater., 46, 3221–3235, 1998. [9] V. Bulatov, W. Cai, M. Hiratani, M. Tang, J. Fier, G. Hommes, T. Pierce, M. Rhee, K. Yates, and T. Arsenlis, “Scalable line dynamics in ParaDiS,” Supercomputing 2004, http://www.sc-conference.org/sc2004/schedule/pdfs/pap206.pdf. [10] K.W. Schwarz, “Interaction of dislocations on crossed glide planes in a strained epitaxial layer,” Phys. Rev. Lett., 78, 4785–4788, 1997.
[11] N.M. Ghoniem, S.-H. Tong, and L.Z. Sun, “Parametric dislocation dynamics: a thermodynamics-based approach to investigations of mesoscopic plastic deformation,” Phys. Rev. B, 61, 913–927, 2000. [12] J.P. Hirth, M. Rhee, and H.M. Zbib, “Modeling of deformation by a 3D multi-pole curved dislocations,” J. Comput.-Aided Mater. Des., 3, 164–66, 1996. [13] R. Madec, B. Devincre, and L.P. Kubin, “From dislocation junctions to forest hardening,” Phys. Rev. Lett., 89, 255508–255512, 2002. [14] J.A. Moriarty, J.F. Belak, R.E. Rudd, P. Soderlind, F.H. Streitz, and L.H. Yang, “Quantum-based atomistic simulation of materials properties in transition metals,” J. Phys. Condens. Matter, 14, 2825, 2002. [15] A. Arsenlis and M. Tang, “Crystal plasticity continuum model development from dislocation dynamics,” Modelling Simul. Mater. Sci. Eng., 11, 1251, 2003.
2.23 ATOMISTICS OF FRACTURE
Diana Farkas (1) and Robin L.B. Selinger (2)
(1) Department of Materials Science and Engineering, Virginia Tech, Blacksburg, VA 24061, USA
(2) Physics Department, Catholic University, Washington, DC 20064, USA
Atomistic simulation studies of fracture are aimed at addressing both practical problems in materials engineering and providing basic understanding in fundamental issues in the science of solid mechanics. A practical goal is the development of computational tools to predict the fracture toughness of materials as a function of composition, microstructure, temperature, environment, and loading conditions. Such tools would be extremely useful in the engineering development of novel high-strength structural materials by identifying likely candidate formulations and reducing the number of laboratory trials needed for their testing and validation. As basic research, computer simulation of fracture in single crystals has provided new insight into the stability of crack propagation, the phenomenon of lattice trapping, and the origins of brittle and ductile behavior. Simulation studies of polycrystalline and particularly nanocrystalline solids are increasingly important research tools for investigating fracture and deformation mechanisms in these materials. Large scale simulations that are made possible by the increasing computational power available [1, 2] can shed new light on phenomena that can now be compared with experimental observations. For recent reviews, see Refs. [3, 4]. The accuracy of any atomistic model is primarily determined by the quality of the potential function used to calculate interatomic interactions. Most classical potentials have been fit to reproduce a material’s equilibrium bulk properties, which depend mostly on the shape and curvature of the potential near its minimum. By contrast, the behavior of a crack under loading depends sensitively on the mechanical response of the material surrounding the crack tip, where chemical bonds are stretched and bonding geometries are distorted, so that interactions are governed by the shape of the potential far away from its minimum. 
S. Yip (ed.), Handbook of Materials Modeling, 839–853. © 2005 Springer. Printed in the Netherlands.
Surface energy, a property that plays a crucial role in crack stability, may depend on the phenomenon of surface reconstruction, which
is not always well described by classical potentials. Where fracture is ductile, it is important also to consider whether dislocation core structure and mobility are accurately reproduced by the chosen interatomic potential. In spite of these concerns, the behavior of many metallic materials, particularly those with FCC structure, can be described with reasonable accuracy by computationally efficient many-body semi-empirical potentials such as the embedded atom method (EAM) [5] and effective medium theory (EMT) [6]. Such potentials can be developed by fitting to first principles calculations, as described in this volume in the chapter by Mishin, and have been successfully used to model fracture in FCC materials [2]. However, many classical potentials have been shown to be far less accurate in modeling the fracture behavior of other materials, notably semiconductors; see, e.g., Ref. [7]. Multi-scale methods discussed elsewhere in this volume allow the use of more accurate models to describe chemical bonding in a small region near the crack tip, while the rest of the system is modeled using classical potentials and continuum-level models of elastic-plastic behavior; see also Ref. [8]. After selecting an interatomic potential, the next tasks in constructing a fracture simulation include choosing the initial configuration; defining an appropriate loading geometry and boundary conditions; and deciding by what dynamical algorithm the system will evolve. A wide variety of loading and boundary condition schemes have been developed to study either quasistatic or dynamic crack propagation. Consider a three-dimensional simulation block of copper atoms arranged in an FCC structure. If the sample is a single crystal, its crystallographic orientation must be specified, keeping in mind that material properties such as surface energy and dislocation mobility are highly anisotropic and that the orientation of relevant slip planes will be an important factor. 
A crack can be introduced into the initial configuration by calculating the displacement field associated with an ideal straight crack in a linear elastic continuum solid under an overall strain and then displacing the atoms accordingly; details for different crack loading geometries can be found in Ref. [9]. Since an atomistic solid behaves non-linearly at large strains, we can anticipate that the atoms in the crack tip region are out of elastic equilibrium and will relax once we set the simulation in motion. An alternative way to introduce a defect is to remove atoms to create a crack-shaped void, or remove a wedge of atoms to create a notch. Dislocations can also be introduced into the initial configuration using continuum elastic displacements [10]. Again we expect that atoms in the core region will be out of elastic equilibrium and will relax once the simulation is set in motion. If desired, the initial configuration can be constructed as a polycrystal instead of a single crystal. A model grain structure can be constructed using a variety of algorithms, including the Voronoi construction [11]. Alternatively, amorphous configurations can be constructed by melting and then quenching a sample [12]. The technique can also be applied to study fracture behavior in
amorphous materials such as metallic glass [13]. The most stable amorphous systems are typically mixtures with atoms of two or more different sizes. To cause the sample to fracture or deform, we apply a strain or stress of specified character, and then maintain or perhaps increase it as the system evolves through some dynamical scheme. We accomplish this goal through the use of constraints, which are typically applied via the system’s boundary conditions. Thus the choice of boundary conditions is a crucial part of designing any fracture simulation. Though our simulated sample is by necessity relatively small, our goal in selecting boundary conditions and loading geometry is often to mimic a single isolated crack within an infinite solid. One choice is the use of periodic boundary conditions along one or more directions, but the crack and any other defects in the system will be affected by interactions with their periodic images, which may considerably complicate analysis of results. Periodic boundary conditions also introduce topological constraints on extended defects such as grain boundaries and dislocations. Another possibility is using free boundaries with the strain maintained by traction forces applied to atoms along the edges of the system. However, free boundaries introduce image interactions [10] which again complicate analysis of results; and surface atoms subject to traction forces may simply tear away from the sample if the stress is too large. Several better geometries have been developed. For studying the quasistatic propagation of a single crack, one may use a block geometry where atoms along the boundaries of the sample are constrained with positions calculated from the continuum elastic displacements associated with an ideal crack under the appropriate loading. Atoms in the interior of the sample, including the crack tip region, are unconstrained. 
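The block geometry just described can be sketched in a few lines. The fragment below is only an illustration, not the authors' code: it tags atoms in a rectangular block as constrained when they lie within a chosen distance of the block edges in the plane normal to the crack front, and leaves the interior atoms free. The function name `tag_boundary_atoms`, the shell thickness and the toy coordinates are all assumptions made for this example.

```python
def tag_boundary_atoms(positions, box_min, box_max, shell):
    """Return one boolean per atom: True if the atom is constrained.

    positions        : list of (x, y, z) tuples
    box_min, box_max : (x, y) corners of the block in the plane
                       perpendicular to the crack front (assumed along z)
    shell            : thickness of the constrained boundary layer
    """
    fixed = []
    for x, y, _z in positions:
        near_x = (x - box_min[0] < shell) or (box_max[0] - x < shell)
        near_y = (y - box_min[1] < shell) or (box_max[1] - y < shell)
        fixed.append(near_x or near_y)
    return fixed


# Toy example (hypothetical data): a 10 x 10 grid of atoms with unit spacing.
atoms = [(float(i), float(j), 0.0) for i in range(10) for j in range(10)]
is_fixed = tag_boundary_atoms(atoms, (0.0, 0.0), (9.0, 9.0), shell=1.5)
n_fixed = sum(is_fixed)
n_free = len(atoms) - n_fixed
```

In a real simulation the constrained atoms would then be placed at positions given by the continuum crack solution, while only the free atoms are passed to the relaxation or time-integration routine.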
As the applied strain is increased during the course of the simulation, the positions of the constrained atoms are recalculated using the continuum solution. Of all the options available, this geometry gives the closest approximation of an isolated crack moving at low speed in an infinite solid and is ideal for use with molecular statics methods. We discuss this technique in further detail below. A particularly interesting geometry for dynamic fracture simulation is a crack “treadmill”. As a crack propagates through a crystalline solid, broken crystal planes are removed from the trailing end of the sample and defect-free, strain-matched crystal planes are added on the leading end, so that the crack always remains in the middle of the simulation cell [14, 15]. Applied strain may be maintained or adjusted by constraining the top and bottom layers of the crystal to a given separation. This choice of loading conditions allows a dynamic crack to propagate long enough to reach steady-state speed, even in a relatively small sample. Marder has demonstrated that even extremely small samples give results for crack properties that converge rapidly toward bulk values. In a dynamic fracture simulation, the moving crack emits phonons which eventually bounce off the system boundaries – or propagate through periodic
boundary conditions – and impinge upon the crack tip. A “stadium” geometry has been developed to isolate a moving crack from these reflections by the use of surrounding damping regions to absorb them [16]. In a “treadmill” geometry, damping is also needed to protect the crack tip from any shock waves or other disturbance that may be generated when crystal planes are added or removed. Basic atomistic simulation techniques used in the simulation of fracture are of two types: molecular statics and molecular dynamics. Molecular statics (MS) is a technique designed to determine the lowest energy configuration of a system under its applied constraints. Every atom within the simulated system interacts with its neighbors according to an interaction potential, and the presence of a defect typically induces forces. The non-constrained atoms are moved using an iterative relaxation process to bring the system to a minimum energy configuration. A conjugate gradient method is commonly used for this minimization. In the first iteration the atoms are displaced along the resultant forces applied by their neighbors, i.e., in the direction of steepest energy decrease. In each subsequent iteration the search direction combines the current forces with the previous search direction, so that successive displacements are mutually conjugate and do not undo the progress made in earlier steps. The energy is computed after each iteration and the system is assumed to be at equilibrium when the energy gradient drops to zero or, in practice, when the forces on each atom are below a specified value. The number of iterations required to reach equilibrium may vary from several tens to several hundreds. Once the system reaches elastic equilibrium, the applied strain is incremented once more and the system is again relaxed to elastic equilibrium. This procedure is repeated until the process being studied, e.g., the motion of a crack across the sample, is complete.
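The load-then-relax cycle of molecular statics can be illustrated on a deliberately small toy problem. The sketch below is not the procedure of any particular code. It assumes a one-dimensional chain of six atoms with nearest-neighbour Lennard-Jones bonds in reduced units (the chapter itself uses EAM-type potentials), constrains the two end atoms, and relaxes the interior with a Polak-Ribière conjugate gradient scheme and a simple backtracking line search. All function names and parameter values are hypothetical.

```python
def lj_pair(r):
    """Lennard-Jones energy and derivative dE/dr in reduced units (eps = sigma = 1)."""
    sr6 = (1.0 / r) ** 6
    return 4.0 * (sr6 * sr6 - sr6), 4.0 * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def energy_and_forces(x):
    """Total energy and per-atom forces of a 1D chain with nearest-neighbour bonds."""
    energy, forces = 0.0, [0.0] * len(x)
    for i in range(len(x) - 1):
        e, de_dr = lj_pair(x[i + 1] - x[i])
        energy += e
        forces[i] += de_dr        # a stretched bond pulls atom i to the right
        forces[i + 1] -= de_dr    # and atom i + 1 to the left
    return energy, forces


def relax(x, free, ftol=1e-4, max_iter=2000):
    """Relax the free atoms with Polak-Ribiere conjugate gradients."""
    x = list(x)
    energy, f = energy_and_forces(x)
    g = [fi if ok else 0.0 for fi, ok in zip(f, free)]   # forces on free atoms only
    d = list(g)                                          # first direction: steepest descent
    for _ in range(max_iter):
        if max(abs(v) for v in g) < ftol:                # force-based convergence test
            break
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope <= 0.0:                                 # not downhill: restart along the forces
            d = list(g)
            slope = sum(gi * gi for gi in g)
        step = 0.1 / max(abs(v) for v in d)              # first trial moves no atom more than 0.1
        while True:                                      # backtracking line search
            trial = [xi + step * di for xi, di in zip(x, d)]
            e_trial, f_trial = energy_and_forces(trial)
            if e_trial <= energy - 0.4 * step * slope or step < 1e-12:
                break
            step *= 0.5
        g_new = [fi if ok else 0.0 for fi, ok in zip(f_trial, free)]
        beta = max(0.0, sum(a * (a - b) for a, b in zip(g_new, g)) / sum(b * b for b in g))
        d = [a + beta * di for a, di in zip(g_new, d)]   # mix new forces with old direction
        x, energy, g = trial, e_trial, g_new
    return x, energy, max(abs(v) for v in g)


def stretch_chain(n_atoms=6, strain_step=0.01, n_steps=40):
    """Quasistatic loading: move the constrained end atom, relax, and repeat."""
    r0 = 2.0 ** (1.0 / 6.0)                               # equilibrium bond length
    x = [i * r0 for i in range(n_atoms)]
    free = [0 < i < n_atoms - 1 for i in range(n_atoms)]  # the two end atoms are constrained
    length0 = x[-1] - x[0]
    history = []
    for step in range(1, n_steps + 1):
        strain = step * strain_step
        x[-1] = x[0] + length0 * (1.0 + strain)           # update the boundary condition
        x, energy, fmax = relax(x, free)
        bonds = [x[i + 1] - x[i] for i in range(n_atoms - 1)]
        history.append((strain, energy, fmax, bonds))
    return history


history = stretch_chain()
for strain, energy, fmax, bonds in history[::10]:
    print(f"strain={strain:.2f}  E={energy:.4f}  longest bond={max(bonds):.3f}")
```

At small strain the relaxed chain stretches uniformly. At large strain the deformation localizes in a single bond while the others relax back toward their equilibrium length, a one-dimensional caricature of bond breaking at 0 K.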
The molecular statics method has the advantage that it represents quasistatic evolution of the system under a slowly varying strain, but the disadvantage that it does not take account of the effects of finite temperature. No atomic vibrations due to thermal activation are taken into account and the results obtained only characterize the material at 0 K. To introduce time, strain rate, and temperature as meaningful variables, we turn to molecular dynamics (MD) simulation techniques, which model the motion of atoms according to Newton’s laws with forces arising from both interatomic interactions and applied constraints. For a general introduction to MD methods, see Ref. [17]. In an MD simulation, the initial configuration includes the position, mass, and velocity of each atom. The initial velocities are selected from a random distribution (e.g., the Maxwell–Boltzmann distribution) associated with a specific initial temperature; if a different random velocity distribution is used, the velocities will naturally evolve toward the Maxwell–Boltzmann distribution in the first steps of the simulation. Forces acting on each atom from its neighbors are summed by vector components, and the equations of motion are integrated forward in time. The
value of the integration time step Δt must be set small compared to the period of the highest frequency motion in the system in order to conserve energy and momentum to the desired precision. During each time step both atomic positions and velocities evolve. A variety of numerical methods may be used to carry out the numerical integration [17]; higher order methods allow the use of a larger time step. When the system evolves under Newtonian dynamics, the sum of the potential and kinetic energies remains constant. However, an applied constraint that changes over time may do work on the system, e.g., if the applied strain is gradually increased, the system’s total energy may increase. Motion of dislocations or propagation of a crack both relieve the applied strain and thus convert potential energy to kinetic energy, causing the system to heat up. This is a realistic effect, but because the strain rates are so high and the system size so small, the temperature increase may be much more extreme than that observed in a relevant experiment. To control the temperature in the simulation, we can place all or part of the system in contact with a model heat bath via the use of a “thermostat” algorithm derived from statistical thermodynamics. In simulating dynamic fracture, it is often useful to avoid applying the thermostat to the crack-tip region so that the crack’s stability is not affected. Any heat generated will diffuse toward the thermostat region, flowing down a temperature gradient. If periodic boundary conditions are used, stress or hydrostatic pressure can be similarly controlled through the use of a “barostat,” which allows the simulation cell size, and thus the strain, to fluctuate. Both thermostats and barostats may introduce a fictitious degree of freedom and an associated time scale [17]. Care must be taken that those time scales are well separated from the dynamics of any other type of mechanical response under study in the simulation.
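The dependence of energy conservation on the time step can be illustrated with a single harmonic bond. The sketch below uses the velocity-Verlet scheme, a commonly used MD integrator chosen here as an assumption, since the text does not name a specific method. The spring constant, mass, run length and the two time steps are arbitrary illustrative values in reduced units.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance one degree of freedom by n_steps with the velocity-Verlet scheme."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass      # first half kick with the old force
        x += dt * v                   # drift
        f = force(x)                  # force at the new position
        v += 0.5 * dt * f / mass      # second half kick with the new force
    return x, v


k, m = 1.0, 1.0                                   # spring constant and mass (reduced units)
period = 2.0 * math.pi * math.sqrt(m / k)         # period of the fastest (only) vibration


def total_energy(x, v):
    return 0.5 * m * v * v + 0.5 * k * x * x


def energy_drift(dt, t_total=50.0):
    """Largest relative deviation of the total energy from its initial value."""
    x, v = 1.0, 0.0
    e0 = total_energy(x, v)
    worst = 0.0
    for _ in range(int(t_total / dt)):
        x, v = velocity_verlet(x, v, lambda q: -k * q, m, dt, 1)
        worst = max(worst, abs(total_energy(x, v) - e0) / e0)
    return worst


drift_small = energy_drift(period / 100.0)        # 100 steps per vibration period
drift_large = energy_drift(period / 10.0)         # 10 steps per vibration period
print(f"dt = T/100: {drift_small:.1e}   dt = T/10: {drift_large:.1e}")
```

With this integrator the energy error stays bounded and scales roughly with the square of the time step, which is why the step is chosen as a small fraction of the shortest vibration period.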
Considering that the MD algorithm is directly derived from a classical mechanics treatment of the system, the simulated system is expected to evolve as it would during an experiment on a short (e.g., nanoseconds) timescale. Fracture mechanisms and diffusion mechanisms can be determined by direct observation, without any a priori assumptions. The relative importance of various mechanisms can also be studied as a result of the simulation. Thus, the MD simulation is a very powerful technique that produces very detailed information about the simulated system. It is an appropriate tool when the goal is precisely to study the exact nature of the fracture mechanisms. The main drawback of the molecular dynamics technique is the short time scale accessible. For studies of deformation this means that the deformation process is carried out at very high strain rates. Since the strain rate may affect the deformation mechanism, it follows that molecular dynamics, although very useful as a technique can give only part of the picture. On the other hand, the molecular statics alternative calculates quasi-equilibrium configurations at various stress intensities and is therefore a better model of stable crack growth. Table 1 compares and summarizes the basic algorithms associated with MS and
Table 1. Simulation procedure using molecular statics and molecular dynamics. Columns: MD crack simulation; MS crack simulation.
MD, as applied to the simulation of fracture introducing a semi-infinite crack loaded to a given stress intensity K. In both cases the loading is introduced by using displacement fields obtained from elasticity theory for the given value of the stress intensity. Because of their size and time-scale limitations, atomistic simulations cannot independently tell us everything we need to know about the fracture behavior of a bulk solid. Accurate atomistic simulations need to be bridged with simulations at other length scales. Multi-scale models that address hand-shaking issues between simulation techniques at different length scales are undergoing rapid development and are discussed elsewhere in this volume. However, even without
those techniques, the bridging of length scales can be accomplished in simpler ways by the use of (a) interatomic potentials that are developed based on calculations performed at the quantum theory level and (b) boundary conditions that are based on the continuum theory of fracture mechanics, e.g., using the block geometry discussed above. We now turn to the topic of interfacing ab initio calculations with molecular statics and dynamics. Classical interatomic potentials describe the energy associated with chemical interactions among atoms of the same or differing species as a function of atomic positions, in a simplified form that is computationally efficient. One way to bridge length scales is to derive interatomic potentials from first-principles simulations of impurity effects, mostly using ab initio density functional theory in the linearized augmented plane wave (LAPW) approach. These calculations can be performed for cluster sizes of 10–50 atoms, and they must be bridged in some way to techniques at a larger scale. These studies include atomic configurations that deviate significantly from equilibrium configurations. As discussed above, accurate modeling of a crack tip, where strains are large, requires good fitting of the potential not only near its minimum but also for atoms in high energy configurations. It is therefore important to use a description of interatomic interaction that, though empirical, remains reliable in these off-equilibrium configurations. Experimental information is usually linked to configurations that deviate only slightly from equilibrium, mostly in the elastic regime. With the possible exception of the activation energy for diffusion, there are no experimental properties of the bulk material that can provide information on energetic interactions far from equilibrium, so first-principles calculations play a major role in developing accurate interatomic potentials. 
Potentials derived in this manner yield a good description for pure metals and metallic alloys. In simulation studies of fracture, particular interest must be devoted to accurate values of the surface energies in various crystallographic directions. These properties of the potential directly influence the cleavage planes observed. For the study of ductile/brittle behavior the other important quantity that needs to be accurately reproduced by the potential is the unstable stacking fault energy. We must also address the issue of interfacing an atomistic simulation with the continuum through boundary conditions. From a macroscopic point of view, plastic deformation in a crystalline material proceeds by simultaneous sliding on available slip planes. In continuum theory simulations, slip systems on which the resolved shear stress exceeds a certain threshold value are assumed to be active. The sliding on these slip systems determines the overall plastic deformations of the body. This kind of macroscopic simulation setup requires input parameters such as threshold stresses for the activation of particular slip systems. One goal of atomistic simulations of fracture is to provide
precisely the parameters that the continuum type of simulation requires. Such macroscopic parameters and criteria should be consistent with the plasticity observed in the atomistic simulations. These parameters can be used in the continuum theory simulations. In turn, continuum theory provides the boundary conditions necessary for the atomistic simulations in many cases, and the appropriate criteria and parameter values will be found via an iterative process. The basic idea is that we will find a self-consistent solution for which the criteria for the onset of plasticity used in the macroscopic calculations of the boundary conditions are consistent with the atomistic results. We note that this is a very efficient way to interface atomistic and continuum calculations that uses mostly existing code, without the need for handshaking procedures. Figure 1 shows a schematic of how the continuum simulations are used as boundary conditions in a typical fracture simulation at the atomistic level. In this figure we indicate the possibility of introducing a buffer region of atoms that are at intermediate distances from the crack tip. The buffer atoms do not move independently but can be adjusted according to the forces they experience from the free atoms. The use of the buffer region is not necessary but makes it possible to use smaller simulation cells without significant effects from the fixed region. If a buffer region is not used, larger simulation cells will be necessary to avoid effects from the fixed region. The introduction of the crack is performed using the continuum solutions for all the regions indicated in Fig. 1. The role of the continuum solution is two-fold. First, it serves as an initial guess for the relaxed atomic configuration in all the regions of the simulation. Second, it serves as boundary conditions that are kept fixed in the fixed region far from the crack tip. Since
Figure 1. Illustration of block geometry. Labels: Mode I loading; FIXED, BUFFER and FREE regions; axes X, Y, Z; crack front line.
Figure 2. Modeling process to obtain three cracked samples at different temperatures. Steps shown: geometric construction of three nanocrystalline α-iron samples (S1, S2, S3); relaxation of the samples with MS and an EAM potential for α-iron to obtain a minimum energy atomistic configuration of nc α-iron; introduction of a Mode I crack by MS at 0 K, giving cracked samples at 0 K; temperature equilibration of samples S1, S2 and S3 at 100 K, 300 K and 600 K respectively using MD, giving cracked samples S1, S2 and S3 at 100 K, 300 K and 600 K respectively.
the solution based on elastic fracture mechanics should be valid far from the crack tip, the atomic positions in the fixed region are kept fixed during the energy minimization procedure in molecular statics or during a certain number of molecular dynamics time steps in the MD technique. As the simulation proceeds the loading can be increased, and this is accomplished by updating the fixed boundary conditions to those representative of a crack with a higher loading level. In the simplest case, the boundary conditions are given by the solution of the displacement field of a semi infinite crack in an isotropic medium. These are [18]:
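$$u_x = \frac{K}{2\mu}\sqrt{\frac{r}{2\pi}}\,\cos\frac{\theta}{2}\left[2 - 4\nu + 2\sin^2\frac{\theta}{2}\right]$$
$$u_y = \frac{K}{2\mu}\sqrt{\frac{r}{2\pi}}\,\sin\frac{\theta}{2}\left[4 - 4\nu - 2\cos^2\frac{\theta}{2}\right].$$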
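A minimal sketch of how these expressions might be evaluated for a set of atoms is given below. It assumes the usual convention that r and θ are polar coordinates centred on the crack tip, with the crack lying along the negative x axis (θ = ±π on the crack faces). K is the stress intensity, μ the shear modulus and ν Poisson's ratio. The function name and the sample coordinates are hypothetical; the material values are those quoted for Fe later in the text.

```python
import math


def mode1_displacement(x, y, K, mu, nu):
    """Isotropic Mode I crack-tip displacement (ux, uy) at the point (x, y).

    The crack tip is assumed to sit at the origin with the crack along the
    negative x axis; K, mu and lengths must be in mutually consistent units.
    """
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    prefactor = K / (2.0 * mu) * math.sqrt(r / (2.0 * math.pi))
    s, c = math.sin(theta / 2.0), math.cos(theta / 2.0)
    ux = prefactor * c * (2.0 - 4.0 * nu + 2.0 * s * s)
    uy = prefactor * s * (4.0 - 4.0 * nu - 2.0 * c * c)
    return ux, uy


# Values quoted in the text for Fe: K in eV/A^(5/2), mu in eV/A^3, lengths in A.
K, MU, NU = 0.6, 0.699, 0.293

# Displace a few illustrative atoms (hypothetical coordinates, in Angstrom).
atoms = [(5.0, 0.0), (-5.0, 1.0), (-5.0, -1.0), (0.0, 5.0)]
displaced = []
for x, y in atoms:
    ux, uy = mode1_displacement(x, y, K, MU, NU)
    displaced.append((x + ux, y + uy))
```

Atoms above and below the crack plane behind the tip move apart symmetrically, while an atom directly ahead of the tip has no displacement normal to the crack plane.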
Using this isotropic approximation, simulations of more complicated polycrystalline and multi-phase systems can also be performed using this technique, since the continuum solutions, to first approximation, are independent of the detailed crystal configuration in the atomistic simulation block. As a case study, we now consider the fracture of nanocrystalline Fe of varying grain size at different temperatures. Fracture of single crystal Fe has been studied since the early days of computer simulation. Cheung and Yip [19] used MD simulations for α-iron to show that a brittle-to-ductile transition occurs between 200 and 300 K for various crack tip geometries: ⟨100⟩{110}, ⟨100⟩{100} and ⟨110⟩{100}, where the crack is lying on the indicated plane and its crack front is located along the given direction. At low temperature, i.e., at temperatures of 200 K and below, the three crack orientations show brittle behavior and cleavage crack is observed to occur on {100} or {110} planes. With increasing temperature above the ductile-brittle transition (DBT) temperature, profuse dislocation emission accompanied with crack tip blunting is observed in the three orientations. ⟨111⟩{110} is identified as one of the slip systems activated. Furthermore, for the ⟨110⟩{100} orientation, additional features of local structural transformation and twinning associated with ⟨111⟩{112} are observed. DeCelis et al. [20] also found brittle cleavage to be the preferred mode of instability for cracks at 0 K. Kohlhoff et al. [21] have used a combined finite elements and atomistic model to study {100} and {110} cracks. Their approach is to consider both crack planes {100} and {110} with their crack front oriented along either ⟨100⟩ or ⟨110⟩ directions. Both cracks with either of the two orientations are observed to cleave without dislocation emission. However, cleavage on the {100} plane is found to be easier than {110}. 
Shastry and Farkas [22] investigated crack propagation under Mode I loading using molecular statics simulation models. Their study involved cracks on {110} planes but with different crack geometries than those considered previously, i.e., {110} crack plane associated with ⟨100⟩, ⟨110⟩ and ⟨111⟩ crack front directions. These results show that crack propagation in single crystal samples is very dependent on the particular crystallographic orientation of the crack front and plane. More recently, fracture of nanocrystalline Fe has been studied by Latapie and Farkas [23]. Figure 2 shows the procedure used for the simulation of fracture behavior in a nanocrystalline material. In this procedure, three different samples with varying grain sizes were created using a bcc crystalline structure and a Voronoi construction for the randomly oriented grain structure. Each sample contained 15 grains. These samples were equilibrated using both molecular statics and dynamics to obtain a stable grain boundary structure. The initial crack was then introduced using the equations above and stress intensity initially at about the Griffith value,
$$K_{IC} = \sqrt{\frac{2\mu G_{IC}}{1-\nu}},$$
with $G_{IC}$ being twice the surface energy.
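As a numerical check of this expression, the short sketch below evaluates it with the Fe parameters quoted in the paragraph that follows (γs = 0.089 eV/Å², ν = 0.293, μ = 0.699 eV/Å³) and converts the result to SI units. The function and constant names are illustrative assumptions.

```python
import math

# 1 eV/A^(5/2) expressed in MPa*m^(1/2): (1.602e-19 J) / (1e-10 m)^(5/2) / 1e6
EV_PER_A52_IN_MPA_SQRT_M = 1.602176634e-19 / (1e-10) ** 2.5 / 1e6


def griffith_stress_intensity(gamma_s, mu, nu):
    """Griffith value K_IC = sqrt(2 mu G_IC / (1 - nu)) with G_IC = 2 gamma_s."""
    g_ic = 2.0 * gamma_s
    return math.sqrt(2.0 * mu * g_ic / (1.0 - nu))


k_ic = griffith_stress_intensity(gamma_s=0.089, mu=0.699, nu=0.293)   # eV/A^(5/2)
k_ic_si = k_ic * EV_PER_A52_IN_MPA_SQRT_M                             # MPa*m^(1/2)
print(f"K_IC = {k_ic:.3f} eV/A^(5/2) = {k_ic_si:.3f} MPa*m^(1/2)")
```

The result is about 0.59 eV/Å^(5/2), or about 0.95 MPa·m^(1/2), consistent with the rounded values of 0.6 and 0.96 given in the text.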
For the case of Fe, with $G_{IC} = 2\gamma_s = 2 \times 0.089 = 0.178$ eV/Å$^2$, $\nu = 0.293$ and $\mu = C_{44} = 0.699$ eV/Å$^3$, we obtain $K_{IC} = 0.6$ eV/Å$^{5/2}$ = 0.96 MPa·m$^{1/2}$. The cracked samples are then equilibrated at three different temperatures (100, 300 and 600 K) using MD for 2000 time steps, with a step size of 8 × 10$^{-15}$ s. With the same technique, the fracture process in each sample was conducted for the three temperatures by incrementally loading the semi-infinite mode I crack starting from a stress intensity value slightly below the Griffith criterion for α-iron single crystal. We let the system evolve for 1000 MD steps between each loading, giving an overall simulation time of 200 ps. Since the MD technique follows the actual forces on the atoms as they migrate, the fracture mechanisms can be determined by direct observation, without a priori assumptions. As the simulation progresses and the stress intensity is increased, the crack begins to advance and we follow the crack for a stress intensity up to three times the Griffith value. The simulated strain rate is very high compared to real experiments, so to check for strain rate effects, MS simulations of the same samples are conducted to compare them with the MD simulation results at low temperature. This procedure verifies that the same fracture and deformation mechanisms occur using the conjugate gradient technique, helping to rule out effects of the unrealistically high strain rates typical of molecular dynamics. Visualization of the results is an important consideration, e.g., the nucleation and motion of lattice defects and propagation of cracks need to be clearly identified. The standard techniques of visualizing the atomic configurations use coloring schemes related to atomic environment or energetics. In the example of the simulation of fracture in nanocrystalline Fe, the color scheme denotes the coordination number for each atom. Darker shades of gray represent atoms with coordination numbers different from eight.
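The coordination-number coloring can be sketched as follows. This is an illustrative fragment, not the visualization code used for the figures. It builds a small bcc block with free surfaces, counts neighbours within a cutoff placed between the first and second bcc neighbour shells, and flags atoms whose coordination differs from eight. The lattice parameter of 2.87 Å for α-iron, the cutoff factor and the function names are assumptions of this sketch.

```python
def bcc_block(n_cells, a):
    """Atom positions of an n_cells^3 bcc block with free surfaces."""
    atoms = []
    for i in range(n_cells):
        for j in range(n_cells):
            for k in range(n_cells):
                atoms.append((i * a, j * a, k * a))                          # cube corner
                atoms.append(((i + 0.5) * a, (j + 0.5) * a, (k + 0.5) * a))  # body centre
    return atoms


def coordination_numbers(atoms, cutoff):
    """Number of neighbours within `cutoff` of each atom (simple O(N^2) search)."""
    counts = [0] * len(atoms)
    cutoff2 = cutoff * cutoff
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            d2 = sum((p - q) ** 2 for p, q in zip(atoms[i], atoms[j]))
            if d2 < cutoff2:
                counts[i] += 1
                counts[j] += 1
    return counts


A_FE = 2.87                      # assumed lattice parameter of alpha-iron, in Angstrom
CUTOFF = 0.933 * A_FE            # between first (0.866 a) and second (1.0 a) neighbours

atoms = bcc_block(4, A_FE)
coordination = coordination_numbers(atoms, CUTOFF)

# Atoms with coordination different from eight would be drawn in a darker shade.
is_defect = [c != 8 for c in coordination]
print(f"{is_defect.count(False)} bulk-like atoms, {is_defect.count(True)} flagged atoms")
```

In this defect-free block only the surface atoms are flagged. In a fracture simulation the same test highlights grain boundaries, dislocation cores and the newly created crack surfaces.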
The fully three-dimensional samples can then be visualized in slices perpendicular to the crack front. Each of the slices can be rendered using any molecular visualization package that takes as input the atomic coordinates, atom types and at least one extra parameter to control the color and/or size of the atomic symbols. The results of visualizing the fracture of nanocrystalline Fe using this technique are shown in Fig. 3, for a sample of 12 nm grain size loaded up to three times the Griffith stress intensity at different temperatures. The results clearly show the increasing ductility that is associated with increasing the temperature of the simulation. Quantitative evaluation of simulation data allows comparison between simulation and experiment. One important quantity is fracture resistance. By plotting the applied stress intensity as a function of crack tip position one can obtain crack resistance curves, such as are used in continuum fracture mechanics. These curves give information on how the crack advances as increasingly higher loading is applied, including the effects of the plastic deformation
Figure 3. Temperature dependence of intergranular fracture, at 100, 300 and 600 K.
processes that occur simultaneously with crack advance. These curves are particularly useful in studying effects of various parameters of the crack configuration and loading condition, such as the effects of crystallographic orientation, temperature, or grain size in polycrystalline simulations. In the example of nanocrystalline Fe, this technique can be used to study the effect of grain size on fracture resistance at 100 K. The results are shown in Fig. 4, where increased fracture resistance is shown with decreasing grain size. In coming years, we anticipate significant new results in atomistic simulation of fracture in metals and metal alloys using semi-empirical potentials. Current computing power allows the simulation of polycrystalline materials with grain sizes up to and above the 30 to 40 nm range. This is an important accomplishment because metals with grain sizes larger than these values begin to behave much like their macroscopic counterparts. Larger sample sizes will empower researchers to examine not only dislocation emission from the crack tip but also the subsequent evolution of the plastic zone surrounding the crack tip, and give researchers better ability to predict fracture mechanisms as a function of composition, microstructure, and loading geometry. Faster computers will also allow MD simulation of fracture at slower and more realistic strain rates. The predictive value of such large simulations will be limited primarily by the accuracy of the empirical interatomic potentials employed. Further improvement in available computing resources will make it possible to use more accurate methods than just semi-empirical potentials; these improvements will include
Figure 4. Fracture resistance curves (stress intensity factor KI versus crack tip position) from simulation studies of nanocrystalline Fe with two different average grain sizes (9 and 12 nm) at a temperature of 100 K.
the simulation of fracture in small samples using purely first principles techniques. Many of the same computational methods and geometries used at present with semi-empirical potentials will be useful in that context as well. Multi-scale methods also show enormous promise to overcome the accuracy limitations of semi-empirical potentials, but without the computational requirements of an exclusively first principles calculation. Techniques for direct coupling between classical atomistic simulation and first principles techniques are already being developed for semiconductors and could also be applied to metals. Other multiscale techniques such as the Quasicontinuum method, described in this volume in the chapter by Miller, will likely serve as important tools to couple atomistic simulations with larger scale modeling techniques.
References
[1] F.F. Abraham, “Very large scale simulations of materials failure,” Phil. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci., 360, 367–382, 2002.
[2] S.J. Zhou, P.S. Lomdahl, A.F. Voter, and B.L. Holian, “Three-dimensional fracture via large-scale molecular dynamics,” Eng. Fract. Mech., 61, 173–187, 1998.
[3] M. Marder, “Molecular dynamics of cracks,” Comput. Sci. Eng., 1, 48–55, 1999.
[4] R.L.B. Selinger and D. Farkas (eds.), “Atomistic theory and simulation of fracture,” MRS Bulletin, 25, No. 5, 2000.
[5] M.S. Daw and M.I. Baskes, “Semiempirical, quantum mechanical calculation of hydrogen embrittlement in metals,” Phys. Rev. Lett., 50, 1285–1288, 1983.
[6] K.W. Jacobsen, J.K. Norskov, and M.J. Puska, “Interatomic interactions in the effective-medium theory,” Phys. Rev. B, 35, 7423–7442, 1986.
[7] J.A. Hauch, D. Holland, M.P. Marder, and H.L. Swinney, “Dynamic fracture in single crystal silicon,” Phys. Rev. Lett., 82, 3823–3826, 1999.
[8] F.F. Abraham, N. Bernstein, J.Q. Broughton, and D. Hess, “Dynamic fracture of silicon: concurrent simulation of quantum electrons, classical atoms, and the continuum solid,” MRS Bull., 25(5), 27–32, 2000.
[9] B. Lawn, Fracture of Brittle Solids, Cambridge University Press, Cambridge, U.K., 1993.
[10] J.P. Hirth and J. Lothe, Theory of Dislocations, John Wiley & Sons, New York, 1992.
[11] D. Farkas, H. Van Swygenhoven, and P.M. Derlet, “Intergranular fracture in nanocrystalline metals,” Phys. Rev. B, 66, 060101–4(R), 2002.
[12] P. Keblinski, D. Wolf, and S.R. Phillpot, “Molecular dynamics simulation of grain-boundary diffusion creep,” Interface Sci., 6, 205–212, 1998.
[13] M. Falk, “Molecular-dynamics study of ductile and brittle fracture in model noncrystalline solids,” Phys. Rev. B, 60, 7062–7070, 1999.
[14] D. Holland and M. Marder, “Ideal brittle fracture of silicon studied with molecular dynamics,” Phys. Rev. Lett., 80, 746–749, 1997.
[15] R.L.B. Selinger and J.M. Corbett, “Dynamic fracture in disordered media,” MRS Bull., 25(5), 46–50, 2000.
[16] S.J. Zhou, P.S. Lomdahl, R. Thomson, and B.L. Holian, “Dynamic crack processes via molecular dynamics,” Phys. Rev. Lett., 76, 2318–2321, 1996.
[17] D. Rapaport, The Art of Molecular Dynamics Simulation, 2nd edn., Cambridge University Press, Cambridge, U.K., 2004.
[18] G.C. Sih and H. Liebowitz, Fracture: An Advanced Treatise, In: H. Liebowitz (ed.), vol. II, Academic Press, New York, 69, 189, 1968.
[19] K.S. Cheung and S. Yip, “Brittle–ductile transition in intrinsic fracture behavior of crystals,” Phys. Rev. Lett., 65, 2804–2807, 1990.
[20] B. DeCelis, A.S. Argon, and S. Yip, “Molecular dynamics simulation of crack tip processes in alpha-iron and copper,” J. Appl. Phys., 54, 4864–4878, 1983.
[21] S. Kohlhoff, P. Gumbsch, and H.F. Fischmeister, “Crack propagation in b.c.c. crystals studied with a combined finite-element and atomistic model,” Philos. Mag. A, 64, 851–878, 1991.
[22] C. Shastry and D. Farkas, “Molecular statics simulation of fracture in α-iron,” Modeling Simulation Mater. Sci. Eng., 4, 473–492, 1996.
[23] A. Latapie and D. Farkas, “Molecular dynamics investigation of the fracture behavior of nanocrystalline α-Fe,” Phys. Rev. B, 69, 134110, 2004.
2.24 ATOMISTIC SIMULATIONS OF FRACTURE IN SEMICONDUCTORS
Noam Bernstein
Naval Research Laboratory, Washington, DC, USA
S. Yip (ed.), Handbook of Materials Modeling, 855–873. © 2005 Springer. Printed in the Netherlands.

1. Introduction

Semiconductors are the materials that underlie nearly all modern electronics. They include elemental solids, such as silicon and germanium, as well as compounds such as gallium arsenide and silicon carbide. Since their main use is in electronic applications, semiconductors are not usually thought of as structural materials. Nevertheless, there are important reasons, both technological and scientific, for studying the mechanical properties of semiconductors. The developing field of micro-machines, from micro-electromechanical systems (MEMS) to nanotechnology, relies on fabrication techniques developed for electronic devices to make microscopic mechanical systems. To a large extent it is the link between these fabrication techniques, including deposition, masking, and etching, and the materials that has driven the use of semiconductors as structural components. On a more fundamental level, the ability to fabricate extremely pure and nearly defect-free samples makes semiconductors excellent model systems for studying the physics of fracture. In this section I will attempt to give an overview of the ways in which atomistic simulations have been applied to fracture in semiconductors, using a number of illustrative examples.

Fracture is one possible failure mode of materials under mechanical load [1]. It occurs when a crack grows, eventually entirely through a sample, causing it to fail. In brittle fracture the crack tip is sharp, and the geometry causes a concentration of stress at the tip. In a continuum description of the solid and in the limit of an infinitely sharp crack the stress concentration becomes singular, and the stress field diverges at the crack tip. This stress concentration makes the material ahead of an existing crack most susceptible to failure, and causes the behavior of the material to be dominated by preexisting cracks. Since the amount of stress concentration, i.e., the coefficient of the singular term [2], is correlated with the length of the crack, brittle materials tend to break catastrophically: once a crack has begun to propagate, it becomes longer, increasing the stress concentration, and making it even more likely that it will continue to propagate.

The presence of a singularity in the continuum elasticity solution of the stress field would naively indicate that the stress at the crack tip is infinite. While in a real material this singularity would be cut off by the discrete nature of the atomic lattice, even the continuum problem has an elegant solution. Griffith set up an energy balance equation, comparing the amount of elastic energy released by crack extension with the amount of surface energy needed to generate the newly exposed crack surface [1, 3]. From this equation emerged the Griffith criterion for brittle fracture

$$G \geq 2\gamma_s, \tag{1}$$
where $G$ is the elastic energy release rate and $\gamma_s$ is the surface energy. The elastic energy release rate is generally quadratic in the applied load (stress or strain). When the applied load is large enough, the elastic energy released by the lengthening crack will overcome the energy cost of making new surface, and the crack will be unstable with respect to growth. While this criterion is appealing, it is not necessarily valid in practice. Because it is based on a conservation of energy argument, it is most likely a good lower bound to the critical load. However, Griffith's derivation completely ignores any atomistic details of the bond breaking process. The lack of accurate, independent ways of measuring the surface energy experimentally makes it difficult to test the Griffith criterion, but simulation remains a possibility. Since semiconductors seem to be such ideal brittle materials, atomistic simulations of fracture can be used to test the Griffith criterion.

Another possible failure mode for a material under mechanical load is plastic deformation. The material can deform irreversibly by moving dislocations that allow the sample to relieve some of the applied stress [4]. The stress concentration at the tip of a crack tends to enhance the probability that dislocations will nucleate and move near the crack tip. However, unlike brittle fracture, plasticity dissipates a large amount of energy and blunts the crack, which reduces the stress concentration, and the dislocations can shield the crack tip from the applied stress. This type of ductile behavior, typical of metals, leads to robust structural materials: the initiation of failure does not necessarily extend catastrophically through the entire sample, and a large amount of energy is dissipated in the process of deforming the material [5].

The issue of brittle fracture, ductility, and the brittle-to-ductile transition (BDT) is in fact a very important aspect of semiconductor fracture.
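Returning briefly to Eq. (1), the following minimal sketch shows how the Griffith criterion turns a surface energy into a critical load. It assumes isotropic linear elasticity under plane strain, for which the standard continuum relation between the energy release rate and the mode I stress intensity factor is G = K²(1 − ν²)/E; that relation, the function names, and the numerical inputs are illustrative assumptions and are not taken from this chapter.

```python
import math

def energy_release_rate(k_i, youngs_modulus, poisson_ratio):
    """G = K_I^2 (1 - nu^2) / E, assuming plane strain and isotropic elasticity."""
    return k_i ** 2 * (1.0 - poisson_ratio ** 2) / youngs_modulus

def griffith_critical_k(surface_energy, youngs_modulus, poisson_ratio):
    """Stress intensity factor at which G equals 2 * gamma_s (Eq. (1) as an equality)."""
    return math.sqrt(2.0 * surface_energy * youngs_modulus
                     / (1.0 - poisson_ratio ** 2))

# Illustrative placeholder inputs in SI units (hypothetical, not from the text).
E = 160e9       # Young's modulus, Pa
nu = 0.22       # Poisson ratio
gamma_s = 1.4   # surface energy, J/m^2

k_g = griffith_critical_k(gamma_s, E, nu)
print(f"Griffith critical K_I: {k_g / 1e6:.2f} MPa m^0.5")

# G is quadratic in the load: a 10% overload in K raises G by a factor 1.1^2.
overload_ratio = energy_release_rate(1.1 * k_g, E, nu) / (2.0 * gamma_s)
print(f"G / (2 gamma_s) at 10% overload: {overload_ratio:.2f}")
```

The quadratic dependence of G on the load is what makes the criterion a threshold: below the critical load the released elastic energy cannot pay for the new surface, and above it every increment of crack length releases more energy than it costs.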
Many materials, including some of great technological interest as advanced structural materials, undergo a transition from brittle fracture at low temperature to ductile behavior at higher temperatures. Examples include steels, intermetallic alloys such as TiAl3, and semiconductors such as Si, Ge, and SiC. Since brittle materials tend to have low fracture toughness and to fail catastrophically, one possible route for improving their technological usefulness is to induce a transition to ductile behavior, if it can be done without compromising the strength of the material. Silicon has become a model system for the BDT because in silicon the transition is extremely sharp [6]. The competition that controls brittleness and ductility is whether the material near the crack is more likely to cleave or to emit and propagate dislocations.

Because both brittle and ductile failure of materials are controlled by atomic scale processes such as bond breaking and dislocation nucleation, atomistic simulation is nearly the only tool that can provide us with an atomic resolution view of what is happening at the crack tip during fracture. One question that we can address is whether the Griffith criterion for brittle fracture is valid given the discrete, atomistic nature of matter. Another is the nature of the microscopic processes that occur at the crack tip as the material fractures or deforms plastically. We can also address matters of technological importance, such as the development of new, stronger materials, or the tailoring of the failure properties of existing materials. To carry out the simulation we need both a procedure for computing some property that we can relate to an experimentally observed macroscopic property, and a procedure for computing the interaction between the atoms in the material.

The structural properties of semiconductors are controlled by their atomic composition and structure. Essentially all semiconductors, elemental or compound, consist of a network of atoms joined by covalent (or mixed covalent-ionic) bonds.
These covalent bonds typically involve sp$^3$ hybrid orbitals that favor tetrahedral coordination, leading to open lattices such as the diamond structure (for elemental semiconductors) or its two-component analog, the zinc-blende structure. The covalent bonds are stiff with respect to deformation of the angles between the bonds, leading to a strong resistance to shear in the lattice. The directionality of the bonds leads to a large energy cost for forming the defects that allow for plasticity, such as dislocations. This suppression of dislocations makes most semiconductors brittle, at least at low temperatures.

The nature of the bonding in semiconductors also affects the methods that can be used to simulate them. The basic ingredient that underlies all atomistic simulation is a method for computing the interactions between the atoms. Since covalent bonds are inherently a manifestation of the quantum-mechanical nature of the electrons, approaches that treat the quantum mechanics explicitly have been quite successful. These approaches range from first-principles methods such as density functional theory (DFT) [7] to faster approximations such as the tight-binding (TB) approach [8–10]. Many interatomic potentials that approximate, but do not explicitly describe, the quantum mechanics are also available for semiconductors. The potentials typically include bond stretching and bond bending terms [11].

Given a method for computing the interaction, we can compute the energy of a particular configuration of atoms and the forces on the atoms. However, using this capability to compute fracture properties is still quite challenging. In a real material, the fracture process is complex and spans a wide range of length scales. An elastic field that can extend over an entire macroscopic sample is focused, through the stress singularity, at the crack tip. The progress of the crack can be affected by many factors, including the crystal lattice, geometry, impurities, and defects. Clearly a single simulation describing this range of phenomena is too computationally expensive to be feasible, at least with the more accurate quantum-mechanical simulation methods. Thus, a number of approaches are used to simplify the problem. These can be roughly separated into two classes: idealized models and direct simulations, either quasistatic or dynamic.
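To make the phrase "bond stretching and bond bending terms" concrete, the sketch below evaluates a generic harmonic energy for one tetrahedrally coordinated atom: a spring on each bond length plus a penalty on the cosine of each bond angle. This is an illustrative functional form only, not any specific published potential; the function name `local_energy` and the force constants are hypothetical, and the bond length of 2.35 Å is the approximate Si value.

```python
import numpy as np

COS_TETRAHEDRAL = -1.0 / 3.0  # cosine of the ideal sp3 bond angle (about 109.47 degrees)

def local_energy(center, neighbors, r0, k_stretch, k_bend):
    """Harmonic bond-stretching plus bond-bending energy of one atom's bonds.

    Generic illustrative form with hypothetical force constants.
    """
    bonds = np.asarray(neighbors, dtype=float) - np.asarray(center, dtype=float)
    lengths = np.linalg.norm(bonds, axis=1)
    # Stretching: one spring per bond around the equilibrium length r0.
    energy = 0.5 * k_stretch * np.sum((lengths - r0) ** 2)
    # Bending: penalize deviation of every bond-angle cosine from -1/3.
    unit = bonds / lengths[:, None]
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            cos_ij = float(unit[i] @ unit[j])
            energy += 0.5 * k_bend * (cos_ij - COS_TETRAHEDRAL) ** 2
    return float(energy)

r0 = 2.35                      # approximate Si bond length, angstrom
k_stretch, k_bend = 10.0, 5.0  # hypothetical force constants
tetra = r0 / np.sqrt(3.0) * np.array(
    [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)

e_ideal = local_energy(np.zeros(3), tetra, r0, k_stretch, k_bend)

# Lengthen one bond by 0.1 angstrom along its own direction: angles are unchanged,
# so only the stretching term contributes.
stretched = tetra.copy()
stretched[0] *= (r0 + 0.1) / r0
e_stretched = local_energy(np.zeros(3), stretched, r0, k_stretch, k_bend)
print(e_ideal, e_stretched)
```

The angular term is what gives such a model the stiffness against shear described above; a pure pair potential with only the stretching term would not stabilize the open tetrahedral network.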
2. Idealized Models
The simplest approach to simulations of fracture properties is to ignore all of the details, and develop a highly idealized model that relates true fracture properties to some quantity that is easier to compute atomistically. The combination of model and atomistic calculation has several benefits for treating the range of length and time scales involved in fracture processes. In and of itself an atomistic simulation is limited to systems with size ranging from a few hundred atoms for a first-principles method, to a few million for an empirical interatomic potential. This translates to a size of about 10–500 Å. It is also governed by the time scale over which a single atom moves, comparable to the period of the fastest vibrational mode, about $10^{-13}$ s in silicon. Plugging the results of the atomistic simulations into the idealized model connects these tiny length and time scales to a description of processes that occur in macroscopic systems on experimental time scales.

A number of approximate calculations of fracture properties of semiconductors have used empirical relations between elastic moduli and some phenomenological measure of hardness. Usually this has been the Knoop or Vickers hardness, which is defined as the apparent pressure required to indent a material with a diamond indenter of a particular shape [12]. This type of relation was implied in the classic paper by Liu and Cohen predicting that cubic carbon nitride [13] can exist, and might be harder than the hardest known substance, diamond. While it was not the central point of the paper, a correlation between low compressibility (i.e., high bulk modulus) and high measured hardness was the main reason for the interest in this material. However, the shear modulus is actually better correlated with hardness, and the shear modulus of cubic carbon nitride is not as high as that of diamond.

While these phenomenological approaches have the advantage that they are among the most computationally inexpensive ways of computing anything related to fracture, they are at best approximate. The elastic moduli are near-equilibrium properties that represent the curvature of the potential energy surface for small deformation. Fracture, on the other hand, is a process that is far from equilibrium, and depends on the unstable part of the potential energy surface where bonds are broken or irreversible deformation occurs.

An alternative to the phenomenological approach is to use a microscopically motivated model to determine some quantities that are feasible to compute. The Griffith criterion (Eq. (1)) is probably the first example of this approach. Using an analytical solution of continuum theory, the macroscopic fracture toughness (i.e., the energy dissipated during the growth of the crack) is related to an essentially microscopic quantity, the surface energy. First-principles calculations of the surface energy of semiconductors are now routine, so they can be used for prediction of fracture properties by assuming the validity of the Griffith criterion. However, since there is no reliable independent way of measuring the surface energy, checking the accuracy of the prediction is impossible.

Another simple approach for the calculation of fracture properties neglects the complexities of fracture mechanics and computes instead an “ideal strength”. In a simplified picture, this is the peak stress that a uniform system experiences as a function of applied strain, typically uniaxial tension or simple shear. The ideal strength is relatively easy to compute, even with an accurate first-principles approach.
It requires that the energy and stress of a bulk system (i.e., a small unit cell with periodic boundary conditions) be computed as a function of strain for a range of applied strains. A minor complication is caused by the fact that semiconductors have complex lattices, i.e., lattices with more than one atom per unit cell, so the positions of the different atoms in the unit cell have to be relaxed at each applied strain. Figure 1 shows an example of this technique applied to three elemental semiconductors: Si, Ge, and C. The stress in the simulated system at a range of applied shears shows a linear rise of the stress in the elastic regime, followed by inelastic behavior and finally a maximum in the stress that the material can support. The results show the trend of decreasing strength from C to Si to Ge, consistent with experiment.

Figure 1. Plot of the calculated stress vs. applied tensile strain for three semiconducting materials (After Fig. 2 in D. Roundy and M.L. Cohen, Phys. Rev. B, 64, p. 212103, 2001. Reproduced with permission).

The ideal shear strengths of Si and Ge are quite low relative to their ideal tensile strengths. Since shear strength is (qualitatively) characteristic of resistance to dislocation formation and motion, and tensile strength is (qualitatively) characteristic of resistance to cleavage, this relation indicates that Si and Ge might be expected to be ductile. Since both Si and Ge are brittle, at least at low temperatures, the prediction of ductility shows the limitation of the simplified model that underlies the ideal strength approach. The quantitative values of the critical stress and strain in the simulation are also much higher than ever observed experimentally. Many complications present in real cracks might explain these discrepancies. The stress field at the tip of the crack is closer to biaxial tension than to uniaxial tension. The stress field is also highly inhomogeneous and anisotropic. The process of crack propagation by cleavage (opening a gap between two particular atomic planes) is not quite the uniform tension that ideal tensile strength measures, and dislocation nucleation depends on the slip between two atomic planes, which is not the same as uniform shear. All of these inhomogeneities are neglected by the ideal strength calculations, but gauging their significance is not easy. Accurate experimental measurements on samples free from material imperfections do not exist, so more sophisticated simulations are currently the only practical approach.
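The ideal strength procedure described above (compute the energy over a range of strains, differentiate to get the stress, take the maximum) can be sketched with a toy model. The example assumes a single bond described by a Morse function with hypothetical parameters, standing in for the first-principles energy of a strained unit cell; with one atom per cell there are no internal coordinates to relax, which a real semiconductor calculation would need at each strain.

```python
import numpy as np

def morse_energy(r, depth, width, r0):
    """Morse bond energy; a toy stand-in for a first-principles total energy."""
    return depth * ((1.0 - np.exp(-width * (r - r0))) ** 2 - 1.0)

# Hypothetical parameters in arbitrary units.
depth, width, r0 = 1.0, 2.0, 1.0

# Sample the energy on a grid of uniform tensile strains.
strain = np.linspace(0.0, 1.0, 2001)
energy = morse_energy(r0 * (1.0 + strain), depth, width, r0)

# Tension per bond is dE/dr = (1 / r0) * dE/d(strain), by finite differences.
tension = np.gradient(energy, strain) / r0

i_peak = int(np.argmax(tension))
ideal_strength = tension[i_peak]
critical_strain = strain[i_peak]
print(f"peak tension {ideal_strength:.4f} at strain {critical_strain:.4f}")
```

For the Morse form the peak can be checked analytically: the maximum tension is depth × width / 2 and occurs at a bond extension of ln 2 / width. The same differentiate-and-maximize step applies unchanged to a tabulated energy versus strain curve from any energy method.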
A more sophisticated idealized model that has played an important role in atomistic simulations of fracture in semiconductors is a criterion for dislocation nucleation analogous to the Griffith criterion for brittle fracture. The Rice criterion, as it is known, is based on an expression for the critical load for dislocation nucleation [14]. The critical load is computed by combining the continuum elasticity solution for a loaded crack with an atomistic expression for the energy of a solid as a function of slip between two atomic planes. When the critical load for dislocation nucleation is lower than the critical load for cleavage, the material is ductile: it will nucleate dislocations before it cleaves, and these dislocations will shield the crack tip. The essential ingredients for this calculation are the elastic constants and surface energy (needed for the Griffith criterion) and the unstable stacking energies, which are the saddle point energies of the so-called γ-surface. The γ-surface is the energy of the material as a function of slip. It is obtained by computing the energy of an infinite crystal (represented by a unit cell with periodic boundary conditions) separated into two halves by a plane, with one half translated with respect to the other, as a function of the relative translation vector. One interesting aspect of the γ-surface is that it is a theoretical construct. There is no way to deform a system experimentally and measure its γ-surface or its unstable stacking energies.

Since all three ingredients in the Rice criterion can be computed using DFT [7], it is possible to use this reliable and accurate method to study the complex interplay between brittleness and ductility. The geometry of the diamond structure lattice (common to elemental semiconductors) makes the details of the calculation complex. The dominant orientations for both cleavage and slip are the high-symmetry (111) planes of the lattice. There are two inequivalent places to cut the lattice (Fig. 2). For cleavage one of these cuts is much higher in energy, and therefore irrelevant. For slip, on the other hand, the saddle point energies are comparable (although the energy maxima are not). The main conclusion is that in Si the unstable stacking energies are large enough that Si should be brittle according to the Rice criterion. This conclusion is consistent with the experimental observations of silicon as a brittle material at low temperatures. However, it is in contradiction with Rice's original rough estimate, which was made before the unstable stacking energy was calculated.

It is unclear how to relate Rice's criterion and calculations of unstable stacking energies to the most interesting aspect of brittleness and ductility in semiconductors: the BDT. Rice's criterion in its original form is a zero-temperature theory that neglects kinetics and finite temperature effects, while the BDT is inherently a finite temperature phenomenon. A number of theoretical explanations for the BDT have been proposed, invoking thermally activated dislocation motion, a thermally activated shift from immobile to mobile dislocations, and collective effects of dislocation–dislocation interactions on nucleation or mobility. A detailed discussion of this topic is beyond the scope of this chapter, since the BDT theories are complicated and controversial.
However, it is clear that advances in simulation methods are making it possible to compute reliably and accurately the quantities that enter into these BDT theories. Perhaps future work based on atomistic calculations will help settle the mechanism for the BDT in Si and other semiconductors.
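The rigid-shift construction of the γ-surface described earlier in this section can be illustrated with a deliberately small toy: two rigid rows of atoms interacting through a Lennard-Jones pair potential, with the upper row translated along the slip direction. This is a one-dimensional cut through a γ-surface (energy per lattice period rather than per unit area) for a hypothetical model system; it is not a calculation for the diamond lattice and involves no relaxation. All names and parameters are illustrative assumptions.

```python
import numpy as np

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy (toy interaction, reduced units)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def interface_energy(shift, spacing, gap, n_images=20):
    """Energy per period between two rigid atom rows.

    The upper row starts in the hollow site (offset by half a period) and is
    rigidly translated by `shift` along the row direction.
    """
    n = np.arange(-n_images, n_images + 1)
    dx = 0.5 * spacing + shift - n * spacing
    return float(np.sum(lennard_jones(np.hypot(dx, gap))))

spacing = 2.0 ** (1.0 / 6.0)        # Lennard-Jones minimum distance for sigma = 1
gap = np.sqrt(3.0) / 2.0 * spacing  # row separation that makes hollow-site bonds ideal

shifts = np.linspace(0.0, spacing, 101)
gamma = np.array([interface_energy(u, spacing, gap) for u in shifts])
gamma -= gamma[0]                   # measure energy relative to the unslipped state

gamma_us = gamma.max()              # toy analog of the unstable stacking energy
print(f"barrier {gamma_us:.3f} at shift {shifts[np.argmax(gamma)] / spacing:.2f} periods")
```

In this toy the barrier sits where the upper atoms pass directly over the lower ones, half a period into the slip. In a real γ-surface calculation the energy comes from DFT or another atomistic method, the translation vector is two-dimensional, and the quantity of interest is the saddle point along the slip path rather than the maximum of a single line scan.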
3. Quasistatic Direct Numerical Simulation
A complementary approach to the use of idealized models is direct numerical simulation. As discussed in more detail in Chapter 1 [15, 16], it is possible to use energy and force calculations to follow the trajectory of a system of atoms. If the length and time scale issues can be adequately addressed, this type of simulation can give us the most direct view of the fracture process. The greatest difficulty in carrying out direct numerical simulations is in developing a method for computing the energy of the system: the method must be accurate enough to capture the important physics while remaining fast enough to be practical. Three possible approaches have been tried. One is to use very accurate first-principles or other electronic structure methods, but applied to very tiny systems, only a few tens of atoms. Another is to use interatomic potentials, which can easily be applied to $10^4$ atoms or more but, as we discuss in more detail below, have serious problems with accuracy that can lead to qualitatively wrong results. The third is a coupling of the two methods, using an accurate method near the tip of the crack and an interatomic potential far from the crack.

Figure 2. Schematic of the shuffle cut plane and γ-surface (upper panel, a) and glide cut plane and γ-surface (lower panel, b). Note the different energy scales for the two γ-surfaces. (After Fig. 1 in E. Kaxiras and M.S. Duesbery, Phys. Rev. Lett., 70, p. 3742, 1993. Reproduced with permission.)

To use a first-principles method such as density functional theory to directly simulate fracture, both the length and time scales must be minimized. In practice this means that the simulated system is made as small as possible, typically about 100 atoms. The time scale can be removed altogether by making the simulation quasistatic. This means that instead of simulating the dynamics of the atoms by integrating Newton's equations of motion (i.e., doing molecular dynamics [16]), the system is allowed to relax toward the minimum energy atomic positions at each applied load [15]. The energy minimization can find stable configurations, but the path the system takes from the initial state to the final state does not necessarily have physical meaning. Unless directly manipulated, energy minimization methods are usually designed never to go over energy barriers. This constraint can dramatically change the way kinetics, for example the competition between two mechanisms with comparable energy barriers, affect the simulated crack propagation process.

The energy minimization approach can be used to study the limits of the Griffith criterion in predicting the crack propagation process. One possible effect of the discrete nature of the atomic lattice is inherent in the localization of the bonding to pairs of atoms. The Griffith criterion treats the bonding energy as a uniform surface energy density. The connectivity of the covalent-bond network in the semiconductor makes this energy density, insofar as it is even well defined, inhomogeneous.
When two atoms that are directly bonded are being separated by the propagation of the crack, the energy cost, related to bond stretching, is large. When the hypothetical continuum crack tip is propagating through a region that does not cross any covalent bonds, atoms that are moving apart are connected through a chain of bonds. The opening of the crack faces can be accommodated primarily by bond bending, which is less energetically costly than bond stretching. Although the average energy cost for extending the crack surface is the same as the continuum value, there may be “lattice trapping”, where energy barriers associated with breaking each interatomic bond impede the propagation of the crack [17].

Another aspect of fracture in a real material that is neglected by the continuum description is the orientation of the crack front. In the continuum theory that underlies Griffith's work the only relevant parameter is the surface energy. In a crystal the surface energy is orientation dependent, and tends to be minimized for high-symmetry crystal faces such as the (111), (110), or (100), favoring cracks that create such high-symmetry surfaces. The geometry of the network of bonds that is being broken depends on the orientation of the crack front. Although this crack-front orientation dependence is not captured by the Griffith criterion, it may affect the true critical load.

To carry out the calculation, a small system of bulk Si surrounding the tip of a planar crack is deformed according to the continuum elasticity solution of the displacement field around a crack tip. A visualization of this configuration is shown in Fig. 3. Because the experimental system is many orders of magnitude larger, the edges of the simulated system are unphysical. To prevent electronic surface states that can form on these fictitious free surfaces, each broken bond must be passivated with a H atom. To apply the correct loading, two layers of Si atoms at the boundary are kept fixed at the continuum displacement field positions. Observing the system as it is relaxed reveals whether the crack propagates at each applied load. The behavior as a function of load indicates the minimum critical load for crack propagation (at or above the Griffith criterion) and the maximum critical load for crack healing (at or below the Griffith criterion). According to the Griffith criterion, where there are no energy barriers and no hysteresis, these two loads would be the same. The deviations of these two loads from the Griffith critical load are a measure of the lattice trapping. The variations in the critical loads as a function of crack front orientation (but always exposing the same surface with the same surface energy) quantify the cleavage anisotropy.

Figure 3. An image of one of the crack-tip systems simulated by energy minimization from Fig. 1 in Ref. [18]. Grey circles indicate Si atoms and white circles indicate H atoms passivating broken bonds. Atoms outside the dotted-line region are constrained to the continuum elasticity displacement field positions. (After Fig. 1 in R. Pérez and P. Gumbsch, Acta Mater., 48, p. 4517, 2000. Reproduced with permission.)

It is known experimentally that cracks that open (111) and (110) surfaces can propagate in a stable manner in Si, although the behavior of the (110) cracks is dependent on the propagation direction. Simulations on all of these crack geometries show significant deviation from the Griffith criterion predictions, revealing the importance of the discreteness of the atomic lattice. The critical loads for crack propagation are between 20 and 35% higher in applied stress (i.e., 40–70% higher in G, which is quadratic in applied stress) than the Griffith criterion prediction. This deviation in and of itself shows that significant energy barriers will affect crack propagation. The difference between the (111) cracks, which are isotropic with respect to propagation direction, and (110) cracks, which are anisotropic, is also explained by the simulation results. Cracks that expose (111) surfaces show the least lattice trapping.
Cracks that expose (110) surfaces, on the other hand, show more lattice trapping, and the amount depends on the crack front direction. The crack-front direction that corresponds to experimentally observed propagating cracks (a [001] front) shows a moderate amount of lattice trapping, while the direction where no stable crack propagation is observed experimentally (a $[\bar{1}10]$ front) shows the most. This orientation anisotropy is attributed to the geometry of the bonds that are just ahead of the crack. For the directions with low and moderate lattice trapping the load is concentrated on just one bond ahead of the crack and the bond breaking process is continuous: as the load is increased, the length of each bond increases smoothly from strained bulk-like to a broken bond. The high level of lattice trapping for the $[\bar{1}10]$ crack front is caused by the distribution of the load between two bonds, and the bond breaking process is discontinuous. The lengths of bonds ahead of the crack increase slowly until a critical load where the bonds snap open (Fig. 4).

Figure 4. Plot of bond distance for each bond along the crack propagation direction for a crack on the (110) plane with a $[\bar{1}10]$ crack front at different applied loads. The loads are scaled to the Griffith criterion critical stress intensity factor. (After Fig. 5 in R. Pérez and P. Gumbsch, Acta Mater., 48, p. 4517, 2000. Reproduced with permission.)

This example of quasistatic simulations of fracture shows the power of atomistic simulations to reveal details of the fracture process. The simulation results can be used in different ways. The simplest is for the calculation of quantities such as the critical load for brittle fracture. This load can be compared to experiment, or stand as a prediction for materials that have not been studied experimentally. A more sophisticated approach is to use the critical load as a parameter in a more coarse-grained simulation. Cohesive zone models, for example, numerically solve the continuum elasticity problem [19] with finite elements while including the possibility of cracks opening up in the material. One of the essential parameters for the cohesive zones, which model the opening crack, is the critical energy release rate, which can be obtained from reliable quasistatic first-principles simulations. Another way to use the simulation results is in more detailed analysis, not to make quantitative predictions, but to explain experimental observations. The relation between the propagation-direction dependence of (110) cracks and the way in which the peak crack-tip stress is distributed over the network of bonds is one example. This level of insight into the reason for a previously unexplained experimental observation is one of the great contributions that atomistic simulations can make.
4.
Dynamic Direct Numerical Simulation
Dynamic simulations of fracture are in many ways the closest we can get to a “computer experiment”. Molecular dynamics simulations [16] provide this capability, but both the time and length scales required for dynamic simulations are inherently substantial. To avoid transient startup effects and to gather reasonable statistics, it is helpful to be able to simulate the crack moving a significant distance (at least significant on the atomic scale). This requirement translates to systems that are large enough to enclose the distance the crack will travel, as well as enough surrounding material to insulate the crack tip from edge effects. It also requires simulations that are long enough in time to follow the crack as it moves this distance. Until recently only interatomic potentials have been sufficiently computationally efficient to make dynamical simulations of fracture practical. As I discuss below, it has turned out that most commonly used interatomic potentials for silicon fail qualitatively to simulate brittle fracture. This failure has motivated the use of hybrid methods, which combine an interatomic potential simulation of a fracturing sample with a more accurate electronic structure method near the crack tip.
Some aspects of the basic physics of fracture are inherently dynamic, and cannot be captured by quasistatic energy minimization calculations. One example is a dynamic form of lattice trapping in brittle fracture called the velocity gap [20]. The discrete nature of the fracturing material makes it impossible for a dynamic crack to propagate below a critical speed. In dynamic propagation this gap can manifest itself as a range of forbidden crack velocities, or as a difference between the loading required to make a crack begin to propagate and the loading required to stop a steady-state propagating crack. Since semiconductors are so brittle at low temperatures, they make good model systems for studying this basic instability in dynamic fracture.
Because the velocity gap is an inherently dynamic and steady-state phenomenon, the simulations need to be dynamic and free of transient effects. To meet these requirements, the molecular dynamics runs must have a long duration and be carefully monitored for their progress toward steady state. These simulations can be made computationally feasible by using a modified version of the Stillinger–Weber (SW) interatomic potential [11] (discussed in more detail below), together with a number of techniques to minimize the simulated system size in a controlled manner. The crack is simulated in a quasi-two-dimensional geometry: a thin sample with periodic boundary conditions in the direction along the crack front is used (Fig. 5). This simulates an infinitely thick system with a straight crack front. To minimize the size of the system perpendicular to the crack surface, the scaling of the crack phenomenon with respect to that system dimension can be studied analytically. With this analytical solution, results from a small system can be extrapolated to infinite system size. To minimize the size of the system along the crack propagation direction, a sort of virtual treadmill is used. Only a block of material near the crack front is explicitly simulated. Material behind the crack front, where the crack has already opened fully, is dropped from the simulated system. To compensate, more material is added ahead of the crack front, far enough that the new material has reached local equilibrium before the crack front reaches it.
Figure 5. Cartoon of a two-dimensional dynamic fracture simulation. The system is loaded in tension along y by fixing the positions of the top and bottom layers of atoms. The crack front is a straight line parallel to the z-axis. The crack is propagating from left to right (indicated by the arrow), parallel to the x-axis. Periodic boundary conditions are used along z. The dashed box on the left indicates the region where atoms are removed from the simulation, and the solid box on the right indicates the region where atoms are added ahead of the crack. In an actual simulation all of the in-plane dimensions are significantly larger.
The simulation results show the effects of both quasistatic lattice trapping and of the dynamic velocity gap. At very low temperatures, near 0 K, there is a difference of about 10% in applied strain between the loading required to initiate crack propagation and the loading required to stop a propagating crack. However, both of these dynamic critical loads are more than 20% in applied strain over the Griffith criterion critical load, indicating the presence of quasistatic lattice trapping as well. The velocity gap becomes smaller at higher temperatures, and essentially disappears at room temperature. This velocity gap has not been observed experimentally [21], although the experiments are quite challenging and it is not yet clear if it has been ruled out. The velocity gap simulations also show one important problem with empirical potential simulations of fracture in silicon: most commonly used empirical potentials for Si (the one known exception is discussed below) show ductile fracture. At the initial stages of loading dislocations form at the crack tip,
and at higher loads additional dislocations nucleate until the material ahead of the crack simply disintegrates. Since the velocity gap is a feature of brittle fracture, those simulations were carried out using a modified form of the SW potential that increased the energy cost for bond bending, thereby suppressing dislocation formation. Although perhaps not the best description of silicon, the modified SW is a more realistic model than the models previously used to study similar instabilities.
Rather than using simplified geometries and analytical extrapolations, sophisticated computational tools can be used to make simulations of large systems with many atoms computationally tractable. Parallel computers, often implemented by networking together a large number of low-priced workstations, have brought this approach within reach for many research groups. The development of parallel algorithms for material simulations, in particular the issues of distributing the work evenly between the parallel processors and analyzing the vast quantities of data that result, is beyond the scope of this discussion. However, even these computationally sophisticated approaches must beware of the problems with empirical potential simulations of fracture. Silicon nitride is a dielectric material used in Si and GaAs electronic devices. During production and operation, thermal and mechanical stresses can cause cracking of the Si3N4 in the device, but the cracks are arrested when they reach the Si layer. To understand the reason for the crack arrest, the system was simulated using the SW potential for Si, and a sophisticated potential including both covalent and electrostatic effects for the Si–N interactions. The technical achievements of the simulation were considerable: over $10^6$ atoms were used to minimize edge effects in the fully three-dimensional simulations.
The results show that brittle cracks in the Si3N4 are arrested at the Si interface, and emit dislocations into the bulk Si region. However, the behavior of the original and modified SW potentials in the velocity gap simulations suggests that this agreement with experiment may be fortuitous. Since SW simulations never show brittle fracture, it is unsurprising that the simulated cracks arrested at the Si3N4/Si interface. The unphysical extreme ductility of the SW potential implies that the simulated crack arrest is an artifact that is most likely unrelated to the reason for the crack arrest in the experiment.
The ostensible disagreement between simulations of Si, which show crack-tip ductility, and experiments on Si, which show apparently brittle fracture at low temperatures, is an instructive example of the difficulties in definitively comparing simulations and experiment. Is the ductility seen in simulations in fact unphysical? Large-scale simulations using empirical potentials show that the dislocations remain at or near the crack tip. The size of the disordered region is so small that, even if it is real, it is not clear whether it would have been noticed in experiments. Visualization of the simulation results shows a crack tip that is blunt on the atomic scale, but quite sharp (a few tens of Å) on the macroscopic scale. The speed of the crack, which one might naively
expect to be quite different in brittle fracture vs. localized crack-tip ductility, turns out to be quite insensitive to the mode of fracture. For both the ductile empirical potentials and the brittle modified SW, the speed goes up with applied load but never exceeds about 2/3 of the theoretical limiting speed, the Rayleigh wave speed [22]. A more sensitive quantitative measure is required to settle the question. The critical energy release rate G, which measures the amount of energy dissipated during crack propagation, provides the necessary information. If the critical G is close to the Griffith criterion value, the fracture process must be essentially brittle. If the critical G is much higher, microscopic ductility remains possible. Careful experimental measurements finally showed that the critical energy release rate for fracture in Si is quite close to twice the best estimate of the surface energy (from density functional theory calculations). While the uncertainty in the experiments and the surface energy calculations prevents this measurement from being an accurate test of the Griffith criterion prediction, it does seem to rule out significant ductility. The behavior of most empirical potentials for Si is simply not consistent with the experimentally measured energy release rate.
The specific problems with interatomic potential simulations of fracture in Si, as well as the general view that an explicitly quantum-mechanical method would be more reliable, drove the development of a multiscale method for simulating fracture and other material processes. The general approach, pioneered by Kohlhoff et al. [23], takes advantage of the fact that in most of the loaded sample continuum mechanics or interatomic potentials are a very accurate way of describing the material.
Only in the crack-tip region is the deviation from the continuum elasticity result significant, and most likely only at the crack tip, where bonds are being broken, are the shortcomings of the interatomic potential significant. Coupling together different computational methods, each applied in a different part of the system where it is valid, can combine the accuracy and efficiency of the different methods. This coupled approach was applied to fracture in silicon by embedding a tight-binding simulation of the crack-tip region into an interatomic-potential simulation that describes the rest of the system. The coupled simulation shows brittle fracture initiating at loads only slightly above the Griffith criterion prediction. Analyzing the change in energy as the crack moves by one atomic spacing shows that the main difference between the interatomic potentials and the (crack-tip) tight-binding results is in the size of the lattice trapping energy barrier. The interatomic potentials show large energy barriers, large enough to suppress fracture up to the critical loads for dislocation nucleation. This explains the unphysically large amount of ductility seen in empirical potential simulations. The coupled simulation has a much smaller energy barrier that disappears at the critical load for brittle fracture. Decomposing the energy changes during crack propagation into bond-breaking and elastic-relaxation parts indicates that two length scales control the height of the barrier: the
distance over which the bond is broken, which is related to the type of bonding and the method used to simulate it, and the elastic relaxation distance, which is controlled by the shape of the crack tip. The modified SW used for the velocity gap simulations is more brittle than SW not because the barrier is smaller, but because the dislocation nucleation point is pushed to higher loads by the artificially stiff bond angles.
There is one interatomic potential for Si that does produce brittle fracture, based on the modified embedded atom method (MEAM) [24]. Simulations using a MEAM potential for Si show brittle fracture with a critical load that is about 20% higher in applied stress than the Griffith criterion prediction. This amount of lattice trapping is comparable to that in the small-system quasistatic first-principles simulations, but significantly larger than that in the dynamic coupled empirical-potential/tight-binding simulations. At loads significantly above the critical load for crack propagation it has been observed experimentally that the crack speed increases with increasing load, but not as quickly as the continuum elasticity prediction. The reason for this deviation, and for the saturation of the experimental crack speed at about 2/3 of the continuum limit, had not been known. The MEAM simulation, which reproduces the experimental crack speed measurements at high loads, reveals the reason for the reduced crack speed. At higher loads some of the elastic energy provided by the load is dissipated in the creation of damage near the crack tip. This damage takes the form of dislocations at moderate loads, and surface steps of one to five atomic layers at larger loads (shown in Fig. 6). Even in an initially perfect material, the crack propagation process becomes unstable and produces an irregular crack surface.
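The critical loads in this article are quoted sometimes in applied stress and sometimes in energy release rate. The snippet below is a minimal sketch of the conversion, assuming only that G is quadratic in applied stress, as stated earlier in this article. The function name `excess_energy_release_rate` is hypothetical and introduced purely for illustration.

```python
def excess_energy_release_rate(stress_excess):
    """Fractional excess in G for a given fractional excess in applied stress.

    Assumes G is proportional to the square of the applied stress, so a load
    of (1 + s) times the Griffith stress gives (1 + s)**2 times the Griffith G.
    """
    return (1.0 + stress_excess) ** 2 - 1.0


# MEAM critical load quoted in the text: about 20% above Griffith in applied stress.
print(f"{excess_energy_release_rate(0.20):.0%}")  # prints 44%
```

Under this assumption, a critical load about 20% above the Griffith value in applied stress corresponds to a critical G roughly 44% above the Griffith value.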
Figure 6. Image of crack propagating through Si at high loading, showing surface steps and dislocations. The axes in the original figure are labeled [111] and [211]. (After Fig. 4 in J.G. Swadener, M.I. Baskes, and M. Nastasi, Phys. Rev. Lett., 89, p. 85503, 2002. Reproduced with permission.)
5.
Future Directions
The applications of atomistic simulation methods to fracture in silicon have come a long way since their beginnings in the 1960s and 1970s. The power of computers has grown by many orders of magnitude, and the methods for evaluating the energies and interatomic forces have become correspondingly more sophisticated. Nevertheless, the range of time and length scales that are inherently involved in the fracture process, from the atomic vibrations that lead to bond breaking to steady-state crack growth or the interplay with plasticity, will continue to make this a challenging computational problem. Many open questions are just beginning to be addressed: the nature of lattice trapping in its dynamic and static forms, the nature of instabilities in dynamic crack propagation, and the brittle-to-ductile transition. The unexpected difficulties in applying interatomic potentials will ensure a role for explicitly quantum-mechanical methods until more reliable interatomic potentials are developed. The need for such computationally expensive methods will ensure an important role for simulation approaches based on idealized models that make the calculations tractable. While simple empirical models may become less important, more sophisticated ways of linking atomistically calculated quantities, through continuum mechanics, to macroscopic mechanical properties will continue to be useful. For dynamical processes direct numerical simulations will require advances in interatomic potentials and ways of using quantum-mechanical methods just in the regions where they are most needed. With these ongoing advances atomistic simulations are poised to finally give us an atomic resolution view of the complex fracture properties of semiconductors.
References
[1] K.B. Broberg, Cracks and Fracture, Academic Press, San Diego, 1999.
[2] G.R. Irwin, “Analysis of stresses and strains near the end of a crack traversing a plate,” J. Appl. Mech., 24, 361–364, 1957.
[3] A.A. Griffith, “The phenomena of rupture and flow in solids,” Philos. Trans. R. Soc. London A, 221, 163, 1921.
[4] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, 1992.
[5] D. Farkas and R.L.B. Selinger, “Atomistics of fracture,” Article 2.23, this volume.
[6] J. Samuels and S.G. Roberts, “The brittle-ductile transition in silicon. I. Experiments,” Proc. Roy. Soc. London A, 421, 1–23, 1989.
[7] M.L. Cohen, “Concepts for modeling electrons in solids,” Article 1.2, this volume.
[8] W.A. Harrison, Electronic Structure and the Properties of Solids, Freeman, San Francisco, 1980.
[9] M.J. Mehl and D.A. Papaconstantopoulos, “Tight-binding total energy methods for magnetic materials and multi-element systems,” Article 1.14, this volume.
[10] C.Z. Wang and K.M. Ho, “Environment-dependent tight-binding potential models,” Article 1.15, this volume.
[11] J. Justo, “Interatomic potentials: covalent bonds,” Article 2.4, this volume.
[12] P. Haasen, Physical Metallurgy, Cambridge University Press, Cambridge, 1986.
[13] A.Y. Liu and M.L. Cohen, “Prediction of new low compressibility solids,” Science, 245, 841–842, 1989.
[14] J.R. Rice, “Dislocation nucleation from a crack tip: an analysis based on the Peierls concept,” J. Mech. Phys. Solids, 40, 239–271, 1992.
[15] C.R.A. Catlow, “Perspective: energy minimisation techniques in materials modelling,” Article 2.7, this volume.
[16] J. Li, “Basic molecular dynamics,” Article 2.8, this volume.
[17] B. Lawn, Fracture of Brittle Solids, Cambridge University Press, Cambridge, p. 148, 1993.
[18] R. Pérez and P. Gumbsch, “An ab initio study of the cleavage anisotropy in silicon,” Acta Mater., 48, 4517–4530, 2000.
[19] D.J. Bammann, “Perspective: continuum modeling of mesoscale/macroscale phenomena,” Article 3.2, this volume.
[20] M. Marder, “Molecular dynamics of cracks,” Comp. Sci. Eng., 1, 48–55, 1999.
[21] I. Beery, U. Lev, and D. Sherman, “On the lower limiting velocity of a dynamic crack in brittle solids,” J. Appl. Phys., 93, 2429–2434, 2003.
[22] L.B. Freund, Dynamic Fracture Mechanics, Cambridge University Press, Cambridge, 1998.
[23] S. Kohlhoff, P. Gumbsch, and H.F. Fischmeister, “Crack propagation in BCC crystals studied with a combined finite-element and atomistic model,” Phil. Mag. A, 64, 851–878, 1991.
[24] Y. Mishin, “Interatomic potentials: metals,” Article 2.2, this volume.
2.25 MULTIMILLION ATOM MOLECULAR-DYNAMICS SIMULATIONS OF NANOSTRUCTURED MATERIALS AND PROCESSES ON PARALLEL COMPUTERS
Priya Vashishta (1), Rajiv K. Kalia (2), and Aiichiro Nakano (3)
(1) Collaboratory for Advanced Computing and Simulations, Department of Chemical Engineering and Materials Science, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
(2) Collaboratory for Advanced Computing and Simulations, Department of Physics & Astronomy, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
(3) Collaboratory for Advanced Computing and Simulations, Department of Computer Science, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
1.
Introduction
Materials-by-design efforts have thus far focused on controlling structures at diverse length scales: atoms, defects, fibers, interfaces, grains, pores, etc. Because of the inherent complexity of such multiscale materials phenomena, atomistic simulations are expected to play an important role in the design of materials such as metals, semiconductors, ceramics, and glasses [1]. In recent years, we have witnessed rapid progress in large-scale atomistic simulations, highly efficient algorithms for massively parallel machines, and immersive and interactive virtual environments for analyzing and controlling simulations in real time. As a result of these advances, simulation efforts are being directed toward reliably predicting properties of materials in advance of fabrication. Thus, materials simulations are capable of complementing and guiding the experimental search for novel materials.
Computer simulation is a third mode of scientific research, bridging the gap between analytical theory and laboratory experiment. Experiments search for patterns in complex natural phenomena. Theories encode the discovered patterns into mathematical equations that provide predictive laws for the behavior of nature. Computer simulations solve these equations numerically in their full complexity, where analytical solutions are prohibitive due to a large number of degrees of freedom, nonlinearity, or lack of symmetry. In computer simulations, environments can be controlled with any desired accuracy and extreme conditions are accessible far beyond the scope of laboratory experiments.
Advanced materials and devices with nanometer grain/feature sizes are being developed to achieve higher strength and toughness in ceramic materials and greater speeds in semiconducting electronic and photonic devices. Below the length scale of 100 nm, however, continuum descriptions of materials and devices must be supplemented by atomistic descriptions [2]. Current state-of-the-art atomistic simulations involve 1 million to 1 billion atoms [3–5]. Finally, the impact of large-scale nanosystems simulations cannot be fully realized without major breakthroughs in scientific visualization [6]. The current practice of sequentially processing visualization data is highly ineffective for large-scale applications that produce terabytes of data. The only viable solution is to integrate visualization into simulation, so that both are performed concurrently on multiple parallel machines, and then to examine the results in real time in three-dimensional immersive and interactive virtual environments.
This article describes our efforts to combine scalable and portable simulation, visualization, and data-management algorithms to enable very large-scale molecular-dynamics (MD) simulations. The scalable multiresolution simulation, visualization, and data-management algorithms that enable these large-scale simulations are described in the first part of this article. In the second part, we discuss the molecular-dynamics simulations of various nanostructured materials and processes of great scientific and technological importance. The simulations described in this article were carried out in collaboration with our past and current graduate research assistants, postdoctoral research associates, and our long-term overseas collaborators.
S. Yip (ed.), Handbook of Materials Modeling, 875–928. © 2005 Springer. Printed in the Netherlands.
2.
Part I: Scalable Simulation and Visualization Algorithms
In this part, we describe our scalable simulation and visualization algorithms. Following a general introduction to the MD simulation method and interatomic potentials to describe various materials, we describe our space–time multiresolution MD algorithms and their implementation on massively parallel computers. We also describe a multiscale simulation approach, which seamlessly combines quantum-mechanical and MD simulations with continuum simulation based on the finite-element method, on a Grid of globally distributed
parallel computers. We conclude Part I with discussions of the management, mining, and immersive and interactive visualization of massive simulation data.
2.1.
Molecular Dynamics Simulation
In the MD approach, the phase-space trajectories of the system (positions and velocities of all the atoms at all times) are obtained from the numerical solution of Newton’s equations (see Fig. 1). This allows one to study how atomistic processes determine macroscopic materials properties. Recent advances in scalable, space–time multiresolution algorithms coupled with access to massively parallel computing resources have enabled us to perform some of the largest atomistic simulations of complex materials. The mathematical model underlying an MD simulation is Newton’s equation of motion, which states that the acceleration of an atom is proportional to the total force exerted on the atom by all the other atoms [5]. In MD simulations, a physical system consisting of $N$ atoms is represented by a set of coordinates, $\{\mathbf{r}_k = (x_k, y_k, z_k) \mid k = 1, \ldots, N\}$, and we trace the atomic trajectories (positions, $\mathbf{r}_k(t)$, and velocities, $\mathbf{v}_k(t)$) by integrating Newton’s equations numerically with respect to time, $t$ (see Fig. 1). Here $\mathbf{F}_k = -\partial E_{\mathrm{MD}}(\mathbf{r}^N)/\partial \mathbf{r}_k$ is the force acting on the $k$th atom, $E_{\mathrm{MD}}(\mathbf{r}^N)$ is the interatomic potential energy, and $\mathbf{r}^N = (\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$ is a $3N$-dimensional vector representing the positions of the atoms.
Figure 1. A molecular-dynamics simulation consists of a collection of atoms, which exert forces on each other, depending on their mutual interactions and relative positions. The figure labels atoms $i$, $j$, and $k$ and the interatomic vectors $\mathbf{r}_{ij}$ and $\mathbf{r}_{ik}$.
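The relation between the potential energy and the force on each atom can be checked numerically. The sketch below uses a Lennard-Jones pair potential purely as a stand-in (an assumption; it is not one of the potentials developed in this article) and compares analytic forces with a central finite-difference derivative of the energy. The function names `lj_energy`, `lj_forces`, and `numerical_forces` are hypothetical.

```python
import numpy as np


def lj_energy(pos, eps=1.0, sigma=1.0):
    """Total Lennard-Jones energy, summed once over each distinct pair (i < j)."""
    energy = 0.0
    n = len(pos)
    for i in range(n - 1):
        for j in range(i + 1, n):
            r = np.linalg.norm(pos[i] - pos[j])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * eps * (sr6**2 - sr6)
    return energy


def lj_forces(pos, eps=1.0, sigma=1.0):
    """Analytic forces F_k = -dE/dr_k for the Lennard-Jones energy above."""
    forces = np.zeros_like(pos)
    n = len(pos)
    for i in range(n - 1):
        for j in range(i + 1, n):
            rij = pos[i] - pos[j]
            r2 = rij @ rij
            sr6 = (sigma**2 / r2) ** 3
            # Pair force on atom i; atom j feels the opposite force.
            fij = 24.0 * eps * (2.0 * sr6**2 - sr6) / r2 * rij
            forces[i] += fij
            forces[j] -= fij
    return forces


def numerical_forces(pos, h=1e-6):
    """Central finite-difference estimate of -dE/dr_k, one coordinate at a time."""
    forces = np.zeros_like(pos)
    for k in range(pos.shape[0]):
        for a in range(pos.shape[1]):
            plus, minus = pos.copy(), pos.copy()
            plus[k, a] += h
            minus[k, a] -= h
            forces[k, a] = -(lj_energy(plus) - lj_energy(minus)) / (2.0 * h)
    return forces


if __name__ == "__main__":
    atoms = np.array([[0.0, 0.0, 0.0], [1.1, 0.0, 0.0],
                      [0.0, 1.2, 0.0], [0.3, 0.4, 1.3]])
    print(np.max(np.abs(lj_forces(atoms) - numerical_forces(atoms))))
```

Because every pair contributes equal and opposite forces, the total force on an isolated cluster sums to zero, which is a convenient second consistency check on a force routine.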
The choice of numerical algorithms is crucial for efficient simulations. For example, the velocity Verlet algorithm is time reversible, i.e., a simulation can be played back to recover the starting state exactly. In addition, the solution satisfies a certain symmetry property called symplecticness, which is related to the conservation of phase-space volume along the trajectory. These properties are essential for the long-time stability of a simulation. Mathematically, a force law is encoded in the interatomic potential energy, $E_{\mathrm{MD}}(\mathbf{r}^N)$. Over the past years, we have developed reliable interatomic potentials for a number of materials, including ceramics such as silica (SiO2) [7, 8], silicon nitride (Si3N4) [9–12], silicon carbide (SiC) [13–16], and alumina (Al2O3), as well as semiconductors such as gallium arsenide (GaAs), aluminum arsenide (AlAs), and indium arsenide (InAs) [17–20]. The interatomic potential energy for these materials consists of two- and three-body terms. The two-body potential energy is a sum over contributions from $N(N-1)/2$ atomic pairs, $(i, j)$. The contribution from each pair depends only on their relative distance, $|\mathbf{r}_{ij}|$ (see Fig. 1). Physically, the two-body terms are steric repulsion between atoms, electrostatic interaction due to charge transfer, charge–dipole interaction that takes into account the large electronic polarizability of negative ions, and van der Waals interaction. The three-body potential energy consists of contributions from atomic triples $(i, j, k)$, and takes into account covalent effects through bending and stretching of atomic bonds, $\mathbf{r}_{ij}$ and $\mathbf{r}_{ik}$ (see Fig. 1). These many-body potentials have been validated through comparison of simulation data with various experimental quantities. Theoretical results are in good agreement with experimental lattice constants, cohesive energy, elastic constants, melting temperature, phonon density of states, and fracture energies. As shown in Fig. 2, MD simulations on GaAs reproduce other experimental data as well: the phonon dispersion, the X-ray static structure factor, $S_X(q)$, of the amorphous state, and the high-pressure structural transformation [21]. In Fig. 3, we compare MD and experimental data on the neutron-scattering static structure factor in amorphous SiO2 [22]. Figure 3 also shows MD and neutron-scattering experimental data on the phonon density of states in crystalline α-Si3N4 [9].
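The time reversibility of the velocity Verlet algorithm mentioned above can be illustrated with a short experiment: integrate forward, reverse the velocity, integrate the same number of steps, and compare with the starting state. The sketch below assumes a one-dimensional harmonic oscillator as the test system, which is an illustrative choice and not a system from this article. The names `velocity_verlet` and `harmonic_force` are hypothetical, and in floating-point arithmetic the starting state is recovered to round-off rather than exactly.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of velocity Verlet."""
    f = force(x)
    for _ in range(n_steps):
        v = v + 0.5 * dt * f / mass   # first half kick
        x = x + dt * v                # drift
        f = force(x)                  # one new force evaluation per step
        v = v + 0.5 * dt * f / mass   # second half kick
    return x, v


def harmonic_force(x, k=1.0):
    """Force of a harmonic spring with stiffness k (illustrative test system)."""
    return -k * x


if __name__ == "__main__":
    x0, v0 = 1.0, 0.0
    x1, v1 = velocity_verlet(x0, v0, harmonic_force, 1.0, 0.01, 1000)
    # Play the trajectory back: flip the velocity and take the same steps.
    xb, vb = velocity_verlet(x1, -v1, harmonic_force, 1.0, 0.01, 1000)
    print(abs(xb - x0), abs(-vb - v0))  # both at round-off level
```

The same driver can be used to monitor the total energy, which for this integrator oscillates within a small bound instead of drifting.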
2.2.
Multiresolution Algorithms
Efficient algorithms are key to extending the scope of simulations to larger spatial and temporal scales that are otherwise impossible to be simulated. These algorithms often utilize multiresolutions in both space and time. The most computationally intensive problem in an MD simulation is the computation of the electrostatic energy for N charged atoms. Direct evaluation of all the atomic-pair contributions requires O(N 2 ) operations. In 1987, Greengard and Rokhlin discovered an O(N ) algorithm called the fast
Multimillion atom molecular-dynamics simulations
879
Figure 2. Comparison with MD and experimental results for GaAs. (a) Theoretical and experimental phonon dispersion of zinc-blende GaAs. (b) X-ray static structure factor of amorphous GaAs. The MD results (solid curves) are in excellent agreement with X-ray diffraction data (open circles). (c) MD and EXAFS results for the GaAs nearest-neighbor distance during forward (squares) and reverse (circles) structural transformations.
Figure 3. (Left) Neutron-scattering static structure factor, S N (q), of amorphous SiO2 : Solid curve, the MD result at 300 K; open circles, neutron diffraction experiment at 10 K. (Right) Neutron-weighted phonon density of states of α-Si3 N4 : Solid curve, MD result; open circles, neutron scattering result.
multipole method (FMM) [23]. The FMM groups distant atoms together and treats them collectively [23–25]. Hierarchical grouping is facilitated by recursively dividing the physical system into smaller cells, therefore generating a tree structure (see Fig. 4). The root of the tree is at level 0, and it corresponds to the entire simulation box. A parent cell at level l is decomposed into 2 × 2 × 2 children cells of equal volume at level l + 1. The FMM uses the truncated multipole expansion and the local Taylor expansion of the electrostatic
880
P. Vashishta et al.
Level 0
Level 1
Level 2
Level 3
Figure 4. Schematic of the far-field computation in a two-dimensional system in the fast multipole method. The multipoles of a parent cell at level l are obtained by shifting the multipoles of its children cells at level l + 1 and summing them. Solid circles represent charged particles, and vertical lines represent parent-child relationships.
potential field. By computing both expansions recursively for the hierarchy of cells, the electrostatic energy is computed with O(N ) operations. The FMM also has well defined error bounds. For systems with periodic boundary conditions, other schemes based on Ewald summations, such as the O(N log N ) particle-mesh Ewald method [26], are ideally suited. The discrete time step, t, in MD simulations must be chosen sufficiently small so that the fastest characteristic oscillations of the simulated system are accurately represented. However, many important physical processes are slow and are characterized by time scales that are many orders-of-magnitude larger than t. Molecular-dynamics simulations of such “stiff” systems require many iteration steps, and this severely restricts the applicability of the simulation. We have used an approach called the multiple time-scale (MTS) method [27] which uses different t for different force components to reduce
Multimillion atom molecular-dynamics simulations
Figure 5. Spatial decomposition (2 × 3 × 1) of a porous silica system into 6 subsystems, which are mapped onto 6 processors (P0–P5). Two types of spheres represent silicon and oxygen. Logical partition boundaries between subsystems are represented by yellow planes. As denoted by arrows, message passing for interprocessor caching is completed in 4 steps. (This number of message-passing steps is smaller than 6, since the system is not partitioned in the third dimension in this example.)
the number of force evaluations. To further speed up simulations, we have also used a hierarchy of dynamics including rigid-body motion of atomic clusters [28]. Our multiresolution molecular-dynamics (MRMD) algorithm [5, 24], which combines the FMM and MTS, has been implemented on a number of parallel computers using spatial decomposition (see Fig. 5). The MRMD algorithm is highly scalable: for a 664-million-atom SiO2 system, one MD step takes only 7 s on 1024 IBM SP3 nodes. The parallel efficiency of this algorithm-machine combination, defined as the speedup divided by the number of processors, is 93%.
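The upward pass of the FMM hierarchy (Fig. 4), in which a parent cell obtains its multipoles by shifting and summing those of its children, can be illustrated with a minimal sketch. It is truncated at the dipole term and is not the MRMD implementation. The function names, the test charges and the unit-cube cell are assumptions made purely for illustration.

```python
from itertools import product

def cell_moments(charges, center):
    """Monopole Q and dipole D of point charges about a cell center.

    charges: list of (q, (x, y, z)) pairs.
    """
    total = sum(q for q, _ in charges)
    dipole = tuple(sum(q * (r[k] - center[k]) for q, r in charges)
                   for k in range(3))
    return total, dipole

def shift_moments(moments, old_center, new_center):
    """Re-expand (Q, D) about new_center: D' = D + Q * (old - new)."""
    total, dipole = moments
    return total, tuple(dipole[k] + total * (old_center[k] - new_center[k])
                        for k in range(3))

def parent_moments(children, parent_center):
    """Shift each child's moments to the parent center and sum them."""
    total, dipole = 0.0, [0.0, 0.0, 0.0]
    for moments, center in children:
        q, d = shift_moments(moments, center, parent_center)
        total += q
        for k in range(3):
            dipole[k] += d[k]
    return total, tuple(dipole)

# Hypothetical test data: four charges in a unit cube (the level-l parent cell).
charges = [(1.0, (0.1, 0.2, 0.3)), (-0.5, (0.8, 0.7, 0.1)),
           (0.25, (0.6, 0.9, 0.9)), (2.0, (0.2, 0.4, 0.45))]
parent_center = (0.5, 0.5, 0.5)

# Level l + 1: sort the charges into the 2 x 2 x 2 child cells.
children = []
for octant in product((0, 1), repeat=3):
    center = tuple(0.25 + 0.5 * o for o in octant)
    members = [(q, r) for q, r in charges
               if all(int(r[k] >= 0.5) == octant[k] for k in range(3))]
    children.append((cell_moments(members, center), center))

upward = parent_moments(children, parent_center)   # via the hierarchy
direct = cell_moments(charges, parent_center)      # directly from the charges
```

Here `upward` and `direct` agree to rounding error, which is the property that lets each tree level be built from the level below it without revisiting individual atoms.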
2.3. Parallel Molecular Dynamics
Parallel computing technology has extended the scope of computer simulations in terms of simulated system size. In order to perform parallel computer simulations efficiently, however, algorithms developed for serial computers must often be modified.
Parallel computing requires decomposing the computation into subtasks and mapping them to multiple processors [5]. For MD simulations, the divide-and-conquer strategy based on spatial decomposition is commonly used. The total volume of the system is divided into P subsystems of equal volume, and each subsystem is assigned to a node in an array of P processors (see Fig. 5). The data associated with the atoms of a subsystem are assigned to the corresponding processor. To calculate the force on an atom in a subsystem, the coordinates of the atoms in the boundaries of neighbor subsystems must be “cached” from the corresponding processors. In the actual code, the message passing to the 26 neighbors is completed in six steps by sending the boundary-atom information to the east, west, north, south, up, and down neighbors sequentially. The corner and edge boundary atoms are copied to the proper neighbor processors by forwarding some of the received boundary atoms to other neighbors. After the atomic positions are updated by the time-stepping procedure, some atoms may have moved out of their subsystems. These atoms are “migrated” to the proper neighbor nodes.

With the spatial decomposition, the computation scales as N/P while communication scales as $(N/P)^{2/3}$, where N is the number of atoms. Thus the communication overhead becomes less significant when N (typically $10^6$–$10^9$) is much larger than P ($10^2$–$10^3$), i.e., for coarse-grained applications.

To implement the FMM discussed in the main text on parallel computers, processors are logically organized in a three-dimensional array of $P_x \times P_y \times P_z$. For deeper tree levels, $l \ge \log_2(\max(P_x, P_y, P_z))$, the calculation of the multipoles is local to each processor so that the computation scales as N/P [24, 25]. For lower levels, however, the number of FMM cells, $8^l$, becomes smaller than the number of processors.
Consequently, many processors become idle or, alternatively, duplicate the same computation, and this computation overhead scales as log P. For a coarse-grained decomposition ($N \gg P$), this log P overhead also becomes insignificant.

Many MD simulations are characterized by an irregular atomic distribution. Simulation of dynamic fracture is a typical example. One practical problem in simulating such irregular systems on parallel computers is that of load imbalance. Suppose that we partition the simulation system into subsystems of equal volume according to the three-dimensional array of processors, as in Fig. 5. Because of the irregular distribution of atoms, this uniform spatial decomposition results in an unequal partition of the workload among processors. As a result, the parallel efficiency is degraded significantly. This load-imbalance problem can be solved by partitioning the system not in the physical Euclidean space but in a computational space, which is related to the physical space by a curvilinear coordinate transformation (see Fig. 6). (The computational space shrinks where the workload density is high and expands where the density is low, so that the workload is uniformly distributed.) The optimal coordinate system is determined to minimize the load-imbalance and communication costs [29, 30].
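The uniform spatial decomposition and the load-imbalance problem discussed above can be made concrete with a small sketch. It assigns atoms to an equal-volume processor grid, measures the imbalance as the largest per-processor atom count divided by the mean, and evaluates the surface-to-volume communication estimate. The function names, the imbalance measure and the test configurations are assumptions for illustration, and the curvilinear remapping of [29, 30] is not implemented here.

```python
def owner_rank(position, box, grid):
    """Rank of the processor whose equal-volume subsystem contains position."""
    index = [min(int(position[k] / box[k] * grid[k]), grid[k] - 1)
             for k in range(3)]
    return (index[0] * grid[1] + index[1]) * grid[2] + index[2]

def load_imbalance(positions, box, grid):
    """Largest per-processor atom count divided by the mean (1.0 is ideal)."""
    counts = [0] * (grid[0] * grid[1] * grid[2])
    for r in positions:
        counts[owner_rank(r, box, grid)] += 1
    return max(counts) * len(counts) / len(positions)

def comm_to_comp(n_atoms, n_procs):
    """(N/P)^(2/3) / (N/P) = (N/P)^(-1/3), with all prefactors dropped."""
    return (n_atoms / n_procs) ** (-1.0 / 3.0)

box, grid = (10.0, 10.0, 10.0), (2, 3, 1)      # the 2 x 3 x 1 layout of Fig. 5
even = [(2.5 + 5.0 * i, (j + 0.5) * 10.0 / 3.0, 5.0)
        for i in range(2) for j in range(3)]   # one atom per subsystem
clumped = [(1.0, 1.0, 1.0)] * 6                # all atoms in one subsystem

balanced = load_imbalance(even, box, grid)       # ideal case
worst = load_imbalance(clumped, box, grid)       # one processor does all work
ratio = comm_to_comp(1.0e9, 1.0e3)               # N/P = 10^6 atoms per processor
```

In the clumped case one processor holds every atom, so the imbalance equals the number of processors. This is the situation that a workload-adapted partition is meant to avoid.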
Figure 6. Schematic of the computational-space decomposition for load balancing. A 2D slice of a 3D MD configuration shows atoms as circles and partition boundaries between subsystems as curves.
Figure 7. Snapshots of a variable-charge MD simulation of an oxygen molecule on an aluminum surface. Atomic charges vary according to the local environment. Aluminum and oxygen atoms are represented by small and large spheres, respectively.
With scalable MD algorithms on parallel computers established, the current focus of research is on enhancing the physical realism of these simulations. For example, conventional interatomic potential functions used in MD simulations are often fitted to bulk solid properties, and they are not easily transferable to systems containing defects, cracks, surfaces, and interfaces. In these systems, partial charges and other chemical properties of atoms vary dynamically according to changes in the local environment. In particular, the environment-dependent charge distribution is crucial for the physical properties of these systems, including the fracture toughness. Transferability of interatomic potentials is greatly enhanced by incorporating variable atomic charges, which dynamically adapt to the local environment. Recently, a simple semiempirical approach has been developed in which atomic charges are determined so as to equalize electronegativity (see Fig. 7) [31–33].
However, the increased physical realism of this model comes at the cost of computational overhead to determine atomic charges by minimizing the electrostatic energy at every MD step. This minimization is equivalent to solving a dense linear system of equations, and the computational cost scales as the cubic power of N. We have developed an acceleration scheme [34] that computes the matrix-vector multiplication in O(N) time using the FMM. In addition, a dynamical simulated-annealing scheme uses the charges determined at the previous MD step to initialize an iterative solution, reducing the number of iterations to O(1). To speed up the solution further, our multilevel preconditioned conjugate gradient (MPCG) method splits the Coulomb-interaction matrix into short- and long-range components. The method uses the sparse short-range matrix as a preconditioner to improve the linear system’s spectral properties, thereby accelerating the solution. The MPCG algorithm has enabled the first successful MD simulation of the oxidation of an aluminum nanocluster [35].
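As a rough illustration of the charge determination described above, the sketch below minimizes a quadratic electrostatic energy, E(q) = Σ χ_i q_i + ½ Σ q_i J_ij q_j, under charge neutrality using a preconditioned conjugate-gradient solver. Several simplifications are assumed. The matrix-vector product is dense rather than FMM-accelerated, and a diagonal (Jacobi) preconditioner stands in for the short-range matrix of the MPCG method. The quadratic energy form, the electronegativities `chi` and the matrix `hardness` are made-up illustrative choices, not values or formulas taken from [31–35].

```python
def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def pcg(a, b, x0, precond, tol=1e-12, max_iter=200):
    """Preconditioned conjugate gradient for a symmetric positive-definite a."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(a, x))]
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, iteration
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

def equalize(chi, hardness, guess=None):
    """Charges with chi_i + sum_j J_ij q_j = mu for all i and sum(q) = 0."""
    n = len(chi)

    def jacobi(r):                       # stand-in preconditioner (assumption)
        return [ri / hardness[i][i] for i, ri in enumerate(r)]

    u0, w0 = guess if guess is not None else ([0.0] * n, [0.0] * n)
    u, iters_u = pcg(hardness, chi, u0, jacobi)          # J u = chi
    w, iters_w = pcg(hardness, [1.0] * n, w0, jacobi)    # J w = 1
    mu = sum(u) / sum(w)                 # common electronegativity
    charges = [mu * wi - ui for ui, wi in zip(u, w)]
    return charges, mu, (u, w), iters_u + iters_w

chi = [3.0, 5.0, 8.0]                                    # hypothetical values
hardness = [[10.0, 3.0, 2.0], [3.0, 12.0, 3.0], [2.0, 3.0, 14.0]]
charges, mu, state, cold_iters = equalize(chi, hardness)
# Restarting from the previous solution mimics reusing the last MD step's charges.
_, _, _, warm_iters = equalize(chi, hardness, guess=state)
```

The second call shows the warm-start idea in its simplest form: when the initial guess is already close to the solution, few or no iterations are needed.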
2.4. Multiscale Simulation Approach
Processes such as crack propagation and fracture in real materials involve structures on many different length scales. They occur on a macroscopic scale, but require atomic-level resolution in highly nonlinear regions. To study such multiscale materials processes, we need a multiscale simulation approach that can describe physical and mechanical processes over several decades of length scales. Recently, Abraham, Broughton, Bernstein, and Kaxiras have developed a hybrid simulation approach that combines quantum mechanical (QM) calculations based on the tight-binding approximation with large-scale MD simulations embedded in a continuum, which is handled with the finite element (FE) approach based on linear elasticity [36]. Such a multiscale FE/MD/QM simulation approach is illustrated in Fig. 8 for a material with a crack [2, 37]. Let us denote the total system to be simulated as S0. A subregion near the crack, denoted S1 (⊂ S0), exhibits significant nonlinearity, and hence it is simulated atomistically, whereas the rest of the system, S0 − S1, is accurately described by the FE approach. In the region S2 (⊂ S1) near the crack surfaces, bond breakage during fracture and chemical reactions due to environmental effects are important. To handle such chemical processes, QM calculations must be performed in S2, while the subsystem S1 − S2 can be simulated with the classical MD method. Figure 8 also shows typical length scales covered by each of the FE, MD, and QM methods. In the following, we describe how an FE calculation can seamlessly embed an MD simulation, which in turn embeds a QM calculation.
Figure 8. Illustration of a hybrid FE/MD/QM simulation. The FE, MD, and QM approaches are used to compute forces on particles (either FE nodes or atoms) in subsystems S0 − S1 (represented by meshes), S1 − S2, and S2. These forces are then used in a time-stepping algorithm to update the positions and velocities of the particles. Typical length scales covered by each of the FE, MD, and QM methods are also shown.
2.5. Hybrid FE/MD Scheme
In continuum elasticity theory, a displacement vector, u(r), is associated with each point, r, in a deformed medium. In the FE method, space is tessellated with a mesh. The displacement field, u, is discretized on the mesh points (nodes), while its values within the mesh cells (elements) are interpolated from its nodal values. The time evolution of u(r) is governed by equations of motion that form a set of coupled ordinary differential equations, with each node subject to forces from the surrounding nodes. The nodal forces are derived from the potential energy, $E_{\mathrm{FE}}[u(r)]$, which encodes how the system responds mechanically in the framework of elasticity theory.

In hybrid FE/MD approaches, the physical system is spatially divided into FE, MD, and handshake (HS) regions [36–38]. Within the FE region, the equations for continuum elastic dynamics are solved on an FE mesh. To make the transition from the FE to the MD region seamless, the FE mesh in the HS region is refined down to the atomic scale near the FE/MD interface in such a way that each FE node coincides with an MD atom. The FE and MD regions
are made to overlap over the HS region, establishing a one-to-one correspondence between the atoms and the nodes. Figure 9(a) illustrates an FE/MD approach. On the top is the atomistic region (crystalline silicon in this example), and on the bottom is the FE region. The red box marks the HS region, in which particles are hybrid nodes/atoms, and the blue dotted line within the HS region marks the FE/MD interface. These hybrid nodes/atoms follow hybrid dynamics to ensure a smooth transition between the FE and MD regions. In the scheme by Abraham, Broughton, Bernstein, and Kaxiras, an explicit energy function, or Hamiltonian, for the transition zone is defined to ensure energy-conserving dynamics [36]. All finite elements that cross the interface contribute half their weight to the potential energy; similarly, any MD interaction between atomic pairs and triples that cross the FE/MD interface contributes half its value to the potential energy. We use a lumped-mass scheme in the FE region, i.e., the
Figure 9. (a) Illustration of a hybrid FE/MD scheme for a three-dimensional silicon crystal for the crystallographic orientation (011). On the top is the MD region, where spheres and lines represent atoms and atomic bonds, respectively. At the bottom is the FE region, where spheres represent FE nodes and FE cells are bounded by lines. The region enclosed between the lines is the handshake (HS) region, in which particles are hybrid nodes/atoms, and the dotted line within the HS region indicates the FE/MD interface. (b) Time evolution of FE nodes and MD atoms in a hybrid FE/MD simulation of a projectile impact on a silicon crystal. (The figure shows a thin slice of the crystal for clarity.) The absolute displacement of each particle from its equilibrium position is shown. No reflection is seen at the boundary.
mass is assigned to the nodes instead of being distributed continuously within an element. This reduces to the correct description in the atomic limit, where nodes coincide with atoms.

To rapidly develop an FE/MD code by reusing an existing MD code, we took advantage of formal similarities between the FE and MD dynamics. In our FE/MD program, particles are either FE nodes or MD atoms, and their positions and velocities are stored in a single array. The FE method requires additional bookkeeping, since each element must be associated with its corresponding nodes. This is done efficiently by using the linked-cell list in the MD code. The FE/MD simulation approach has been parallelized based on the same spatial decomposition scheme as in our parallel MD program.

To validate our FE/MD scheme, we simulated a projectile impact on a three-dimensional block of crystalline silicon (see Fig. 9(b)) [2]. The block has dimensions of 10.5 nm and 6.1 nm along the $[\bar{2}11]$ and $[0\bar{1}1]$ crystallographic orientations, respectively, and periodic boundary conditions are imposed in these directions. Along the [111] direction, the system consists of an 11.5 nm thick MD region, a 0.63 nm thick HS region, and a 19.6 nm thick FE region. The top surface in the MD region is free and the nodes at the bottom surface in the FE region are fixed. The fully three-dimensional FE scheme uses 20-node brick elements for the region far from the HS region, which provide a quadratic approximation for the displacement field and are adequate for the continuum. In the scaled-down region close to the FE/MD interface, we switch to eight-node brick elements, which provide a linear approximation for the displacement field. Within the HS region, the elements are distorted so as to exhibit the same lattice structure as crystalline silicon. In addition to these elements, prism-like elements are used for coarsening the FE mesh from the atomic to larger scales.
The projectile is approximated by an infinite-mass hard sphere of radius 1.7 nm, from which the silicon atoms scatter elastically. A harmonic motion of the projectile along the [111] direction creates small-amplitude waves in the silicon crystal. Figure 9(b) shows snapshots at three different times, in which only a thin slice is plotted for clarity. The color denotes the absolute displacement from the equilibrium positions, measured in Å. The induced waves in the MD region propagate into the FE region without reflection, demonstrating seamless handshaking between MD and FE.
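The half-weight rule for links that cross the FE/MD interface, together with the lumped-mass and single-array ideas, can be sketched on a one-dimensional harmonic chain. This is a toy model under stated assumptions and not the three-dimensional silicon scheme. Each link left of the interface is a two-node linear element of stiffness `k_fe`, each link right of it is a harmonic bond of stiffness `k_md`, and the one crossing link takes half of each. The chain length, the stiffness values and the initial pulse are hypothetical.

```python
import math

def forces_and_energy(u, m, k_fe, k_md):
    """Forces and potential energy of a 1D chain with displacements u.

    Link i joins particles i and i + 1. Links with i < m are FE elements,
    links with i > m are MD bonds, and link m crosses the interface and
    takes half of its weight from each description.
    """
    forces, energy = [0.0] * len(u), 0.0
    for i in range(len(u) - 1):
        if i < m:
            k = k_fe
        elif i == m:
            k = 0.5 * k_fe + 0.5 * k_md
        else:
            k = k_md
        stretch = u[i + 1] - u[i]
        energy += 0.5 * k * stretch ** 2
        forces[i] += k * stretch
        forces[i + 1] -= k * stretch
    return forces, energy

def total_energy(u, v, mass, m, k_fe, k_md):
    _, potential = forces_and_energy(u, m, k_fe, k_md)
    return potential + sum(0.5 * mi * vi ** 2 for mi, vi in zip(mass, v))

def velocity_verlet(u, v, mass, dt, steps, m, k_fe, k_md):
    """Nodes and atoms live in the same arrays and share one integrator."""
    f, _ = forces_and_energy(u, m, k_fe, k_md)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, mass)]
        u = [ui + dt * vi for ui, vi in zip(u, v)]
        f, _ = forces_and_energy(u, m, k_fe, k_md)
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, mass)]
    return u, v

n, interface = 40, 19
mass = [1.0] * n                      # lumped masses: one per node or atom
u0 = [0.01 * math.exp(-((i - 30) / 2.0) ** 2) for i in range(n)]  # pulse in MD
v0 = [0.0] * n
e_start = total_energy(u0, v0, mass, interface, 1.0, 1.0)
u1, v1 = velocity_verlet(u0, v0, mass, 0.05, 2000, interface, 1.0, 1.0)
e_end = total_energy(u1, v1, mass, interface, 1.0, 1.0)
```

With matched stiffnesses the hybrid chain is indistinguishable from a homogeneous one, which is the one-dimensional analogue of a wave crossing the handshake region without reflection.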
2.6. Hybrid MD/QM Scheme
Empirical interatomic potentials used in MD simulations fail to describe chemical processes. Instead, the interatomic interaction in reactive regions needs to be calculated by a QM method that can describe the breaking and formation of bonds. There has been growing interest in developing hybrid MD/QM
simulation schemes, in which a reactive region treated by a QM method is embedded in a classical system of atoms interacting via an empirical interatomic potential. An atom consists of a nucleus and surrounding electrons, and QM schemes treat the electronic degrees of freedom explicitly, thereby describing the wave-mechanical nature of electrons. One of the simplest QM schemes is based on the tight-binding (TB) method [36]. The TB method does not involve electronic wave functions explicitly, but solves an eigenvalue problem for the matrix that represents interference between electronic orbitals. In the TB scheme, the electronic contribution to interatomic forces is derived through the Hellmann–Feynman theorem, which states that only partial derivatives of the matrix elements with respect to $r^N$ contribute to forces.

A more accurate but compute-intensive QM scheme deals explicitly with electronic wave functions, $\psi^{N_{wf}}(r) = \{\psi_1(r), \psi_2(r), \ldots, \psi_{N_{wf}}(r)\}$ ($N_{wf}$ is the number of independent wave functions, or electronic bands, in the QM calculation), and their mutual interaction in the framework of density functional theory (DFT) [39–41], and with the electron–ion interaction using pseudopotentials [42]. The DFT, for the development of which Walter Kohn received the 1998 Nobel Prize in Chemistry, reduces the exponentially complex quantum N-body problem to a self-consistent eigenvalue problem that can be solved with $O(N_{wf}^3)$ operations. In the DFT scheme, not only are accurate interatomic forces obtained from the Hellmann–Feynman theorem, but electronic information such as the charge distribution can also be calculated.

Hybrid MD/QM schemes have been developed extensively by the quantum-chemistry community. In the seminal work by Abraham, Bernstein, Broughton, and Kaxiras [37], which combines the FE/MD/QM approaches in a single simulation, a semiempirical TB method is used as the QM method and an HS Hamiltonian is introduced to link the MD/TB boundary.
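The statement of the Hellmann–Feynman theorem for the TB method can be checked on the smallest possible example: a two-orbital dimer whose 2 × 2 Hamiltonian has an on-site energy on the diagonal and a distance-dependent hopping element t(r) off the diagonal. The exponential form of t(r) and its parameters are hypothetical, and only the band-structure part of the energy is considered (a real TB model would add a repulsive term).

```python
import math

def hopping(r, t0=-2.0, r0=1.0):
    """Hypothetical distance-dependent hopping matrix element, t(r) < 0."""
    return t0 * math.exp(-r / r0)

def hopping_derivative(r, t0=-2.0, r0=1.0):
    """Analytic dt/dr for the exponential form above."""
    return -hopping(r, t0, r0) / r0

def ground_state(r, onsite=0.0):
    """Lowest eigenpair of H = [[onsite, t], [t, onsite]] when t < 0."""
    energy = onsite + hopping(r)                 # bonding level
    vector = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
    return energy, vector

def band_energy(r, n_electrons=2):
    return n_electrons * ground_state(r)[0]

def hellmann_feynman_force(r, n_electrons=2):
    """F = -n <psi| dH/dr |psi>, where dH/dr = [[0, t'], [t', 0]]."""
    _, (c1, c2) = ground_state(r)
    expectation = 2.0 * c1 * c2 * hopping_derivative(r)
    return -n_electrons * expectation

r, h = 1.5, 1.0e-5
force_hf = hellmann_feynman_force(r)
# Central finite difference of the total band energy, for comparison.
force_fd = -(band_energy(r + h) - band_energy(r - h)) / (2.0 * h)
```

The two forces agree to within the finite-difference error even though the Hellmann–Feynman expression never differentiates the eigenvector, only the matrix element.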
We have developed a hybrid scheme for dynamic simulations of materials on parallel computers, in which a QM region is embedded in an atomistic region, see Fig. 9 [37, 43]. The motion of atoms in the QM region is described by a real-space [44] multigrid-based DFT [45–47] and in the surrounding region with the MD approach. To partition the total system into the cluster and its environmental regions, we use a modular approach that is based on a linear combination of QM and MD potential energies and consequently requires minimal modification of existing QM and MD codes [48]:

$$E = E_{\mathrm{CL}}^{\mathrm{system}} + E_{\mathrm{QM}}^{\mathrm{cluster}} - E_{\mathrm{CL}}^{\mathrm{cluster}},$$

where $E_{\mathrm{CL}}^{\mathrm{system}}$ is the classical semiempirical potential energy for the whole system and the last two terms encompass the QM correction to that energy. $E_{\mathrm{QM}}^{\mathrm{cluster}}$ is the QM energy for an atomic cluster cut out of the total system (its
dangling bonds are terminated by hydrogen atoms, denoted HS Hs) and $E_{\mathrm{CL}}^{\mathrm{cluster}}$ is the semiempirical potential energy of a classical cluster in which the HS Hs are replaced by appropriate atoms. In this approach, both the QM and the MD potential energies for the cluster need to be calculated. Termination atoms are introduced in both calculations for the cluster. Handshake atoms linking the cluster and the environment regions are treated by a novel scaled-position method, in which the positions of the handshake atoms are determined as functions of the original atomic positions in the system, with different scaling parameters in the QM and classical clusters to relate the HS atoms to the termination atoms.

The hybrid simulation scheme is implemented on massively parallel computers by first dividing the processors between the QM and the MD calculations (task decomposition), and then using spatial decomposition within each task. The parallel program is based on the message-passing paradigm and is written with the Message Passing Interface (MPI) standard. Processors are first grouped into MD and QM groups by defining two MPI communicators. (A communicator is an MPI data structure that represents a dynamic group of processes with a unique ID called a context.) The code is written in the single program multiple data (SPMD) programming style, so that each processor executes an identical program. Selection statements are used so that the QM and the MD processors execute only the QM and the MD code segments, respectively.

The hybrid MD/QM simulation code was applied to the oxidation of a silicon surface to demonstrate seamless coupling of the cluster and the environment atoms [45]. Figure 10 shows snapshots of the atomic configuration at 50 and 250 fs, in which atomic kinetic energies are color-coded. We see that the dissociation energy released in the reaction of an oxygen molecule with silicon atoms in the cluster region is transferred seamlessly to silicon atoms in the environment region.
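The additive energy expression of this section, the classical energy of the whole system plus the QM energy of the terminated cluster minus the classical energy of the same cluster, can be sketched with toy one-dimensional energies. Everything concrete here is an assumption for illustration: a harmonic chain plays the classical potential, a Morse chain stands in for the QM calculation, and `scaled_position` is one plausible reading of the scaled-position method, with hypothetical scaling parameters `beta_qm` and `beta_cl`.

```python
import math

def harmonic(r, k=1.0, r0=1.0):
    return 0.5 * k * (r - r0) ** 2

def morse(r, depth=1.0, width=1.0, r0=1.0):
    return depth * (1.0 - math.exp(-width * (r - r0))) ** 2

def chain_energy(xs, pair):
    """Sum a nearest-neighbor pair energy over a 1D set of positions."""
    ordered = sorted(xs)
    return sum(pair(b - a) for a, b in zip(ordered, ordered[1:]))

def scaled_position(x_cluster, x_handshake, beta):
    """Place a termination atom on the line from a cluster atom to its HS atom."""
    return x_cluster + beta * (x_handshake - x_cluster)

def hybrid_energy(x, cluster, handshake, beta_qm, beta_cl, e_cl, e_qm):
    """E = E_CL(system) + E_QM(cluster) - E_CL(cluster).

    cluster:   indices of the cluster atoms
    handshake: (cluster index, environment index) pairs for the cut bonds
    """
    core = [x[i] for i in cluster]
    term_qm = [scaled_position(x[c], x[h], beta_qm) for c, h in handshake]
    term_cl = [scaled_position(x[c], x[h], beta_cl) for c, h in handshake]
    return e_cl(x) + e_qm(core + term_qm) - e_cl(core + term_cl)

def classical(xs):
    return chain_energy(xs, harmonic)

def quantum(xs):                      # toy stand-in for a QM energy
    return chain_energy(xs, morse)

x = [0.0, 1.02, 1.97, 3.05, 4.0, 5.01]
cluster, handshake = [2, 3], [(2, 1), (3, 4)]
e_hybrid = hybrid_energy(x, cluster, handshake, 0.8, 1.0, classical, quantum)
# Consistency check: if the "QM" model equals the classical one, the
# correction cancels and the purely classical energy is recovered.
e_check = hybrid_energy(x, cluster, handshake, 1.0, 1.0, classical, classical)
```

Because the two cluster terms only call existing energy routines on a list of positions, neither the classical nor the QM code needs to know about the other, which is the modularity the scheme relies on.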
2.7. Hybrid FE/MD/QM Scheme
Due to the formal similarity between parallel MD and FE/MD codes and the modularity of the MD/QM scheme mentioned above, it is also straightforward to embed a QM subroutine in a parallel FE/MD code to develop a parallel FE/MD/QM program [37]. The parallel hybrid simulation code is applied to the oxidation of the Si (111) surface to demonstrate seamless coupling of the FE, MD, and QM regions (see Fig. 11). For the hybrid simulation, a slab with dimensions (212.8 Å, 245.7 Å, 83.1 Å) in the ($[\bar{2}11]$, $[0\bar{1}1]$, [111]) directions is cut out from bulk Si. The MD and the FE/MD–HS regions consist of 12 and 4 atomic layers along the [111] direction, respectively, whereas the FE region corresponds to the bottom two-thirds of the Si slab. Periodic boundary conditions are applied in the $[\bar{2}11]$ and $[0\bar{1}1]$ directions. The total number of atoms and FE
Figure 10. (Top) Initial configuration in the present hybrid MD/QM simulation for oxidation of Si (100) surface. Magenta spheres represent the cluster silicon atoms; gray, the environment silicon atoms; yellow, termination hydrogen atoms for QM calculations; blue, termination silicon atoms for MD calculations; green, cluster oxygen atoms. (Bottom) Snapshots at 50 and 250 fs in the present hybrid MD/QM simulation for oxidation of Si (100) surface. Colors represent kinetic energies of atoms in Kelvin.
Figure 11. Snapshots at 150 fs, 300 fs, and 900 fs in the hybrid simulation of oxidation of Si (111) surface. Colors represent [111]-displacements of the atoms and the FE nodes.
nodes for the Si slab is N = 15,212. The initial configuration of the hybrid simulation is obtained by placing an O2 molecule (oriented along the $[\bar{2}11]$ direction with zero velocity) 2.0 Å above the (111) surface of the slab. The O2 molecule and the surrounding Si atoms are treated in the DFT calculation. Figure 11 shows
snapshots of the atomic configuration at 150 fs, 300 fs, and 900 fs, in which [111] displacements of the atoms and the FE nodes are color-coded. The O2 molecule dissociates and each O atom is captured by a Si–Si bond at the surface to form a Si–O–Si structure, which is associated with an increase in the Si–Si distance. The resulting strains in the QM region are transferred to the surrounding Si atoms in the MD region, as shown in Fig. 11. Such strain waves reach the QM/MD–HS regions at ∼300 fs, and propagate into the FE regions at ∼900 fs with no reflection or refraction observed at the QM/MD and MD/FE boundaries.
2.8. Grid Computing
Metacomputing on a Grid [49] of geographically distributed Teraflop-to-Petaflop computers and immersive virtual reality environments connected via high-speed networks will revolutionize science and engineering, by enabling hybrid simulations that integrate multiple expertise distributed globally. We have performed a multidisciplinary, collaborative MD/DFT simulation on a Grid of geographically distributed Linux clusters in the US and Japan, based on the modular, additive hybridization scheme (see Fig. 12) [50]. The multiscale MD/QM simulation code has been Grid-enabled based on a divide-and-conquer scheme, in which the QM region is a union of multiple QM clusters.
Figure 12. Multiscale MD/DFT simulation of the reaction of water at a crack tip in silicon (top), on a Grid of distributed Linux clusters in the US and Japan (bottom). In this figure, five QM calculations (circles) around five water molecules are embedded in an MD simulation.
Since the energy is a sum of the QM energy corrections for the clusters in the additive divide-and-conquer hybridization scheme, each QM-cluster calculation does not access the atomic coordinates in the other clusters, and accordingly its parallel implementation involves no inter-QM-cluster communication. Furthermore, the multiple-QM-cluster scheme is computationally more efficient than the single-QM-cluster scheme because of the $O(N^3)$ scaling of conventional DFT algorithms. (The large prefactor of $O(N)$ DFT algorithms makes conventional $O(N^3)$ algorithms faster below a few hundred atoms.)

We have implemented the multiscale MD/DFT simulation algorithm as a single MPI program. The Globus middleware and the Grid-enabled MPI implementation, MPICH-G2, have been used to implement the MPI-based multiscale MD/DFT simulation code in a Grid environment. In the initial implementation, processors on multiple PC clusters are statically allocated using a host file. The user specifies the number of processors for each QM-cluster calculation in a configuration file. In more recent MD/DFT simulations, a simple local error indicator based on atomic bond lengths has been used to automatically change the size of the QM calculations at run time.

The Grid-enabled MD/QM simulation code has been used to study environmental effects of water molecules on fracture in silicon. A preliminary run of the code has achieved a parallel efficiency of 94% on 25 PCs distributed over 3 PC clusters in the US and Japan.
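The efficiency argument for multiple QM clusters follows from simple arithmetic on the cubic cost, sketched below. The assumptions are a pure cubic cost model with a common prefactor, equal-sized clusters, and no termination atoms or overlap between clusters. Real cluster calculations would add such overheads, so the numbers are only indicative.

```python
def cubic_cost(n_atoms, prefactor=1.0):
    """Cost model for a conventional O(N^3) DFT calculation (arbitrary units)."""
    return prefactor * n_atoms ** 3

def gain_from_splitting(n_atoms, n_clusters):
    """Cost of one large QM region divided by the cost of equal-sized clusters."""
    single = cubic_cost(n_atoms)
    split = n_clusters * cubic_cost(n_atoms / n_clusters)
    return single / split            # equals n_clusters ** 2 in this model

# Hypothetical example: 500 QM atoms treated as one region or as 5 clusters.
gain = gain_from_splitting(500, 5)
```

In this idealized model, splitting into k clusters reduces the total cost by a factor of k squared, and the k calculations can run on separate processor groups without communicating with each other.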
2.9. Data Management and Mining
A serious technological gap exists between the growth in processor power and that of input/output (I/O) speed. The I/O (including data transfer to remote archival storage devices) has thus become the bottleneck in our large-scale MD simulations. We address the I/O problem using a scalable data-compression scheme that we have developed recently [51]. It uses octree indexing and sorts atoms accordingly along the resulting space-filling curve (see Fig. 13). By storing differences between successive atomic coordinates, the I/O requirement for the same error tolerance is reduced from O(N log N) to O(N). This, together with a variable-length encoding to handle exceptional values, reduces the I/O size by an order of magnitude with a user-controlled error bound.

Large-scale MD simulations are expected to reveal atomistic correlations between local stresses and microstructural activities during dynamic fracture in complex materials. A challenge is to extract topological defects, such as dislocations, and their activities from massive data with large thermal noise, especially at high temperatures. This will require nontrivial knowledge-discovery or data-mining processes for very large, noisy data sets.
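The three ingredients of the compression scheme (octree indexing, sorting along the space-filling curve, and storing differences with a variable-length code) can be combined in a compact sketch. It is a simplified stand-in and not the algorithm of [51]. Coordinates are quantized to a grid whose spacing sets the error bound, the octree index is taken to be a Morton (Z-order) key, and the differences between successive sorted keys are written as 7-bit variable-length integers. Atom identities and species are ignored, and all function names are hypothetical.

```python
import random

def quantize(r, box, bits):
    """Integer cell indices of position r on a 2^bits grid per axis."""
    cells = 1 << bits
    return tuple(min(int(x / box * cells), cells - 1) for x in r)

def morton_key(index, bits):
    """Interleave the bits of (ix, iy, iz): the octree / Z-order index."""
    key = 0
    for b in range(bits):
        for axis in range(3):
            key |= ((index[axis] >> b) & 1) << (3 * b + axis)
    return key

def morton_index(key, bits):
    """Inverse of morton_key."""
    index = [0, 0, 0]
    for b in range(bits):
        for axis in range(3):
            index[axis] |= ((key >> (3 * b + axis)) & 1) << b
    return tuple(index)

def encode_varint(n):
    """7 payload bits per byte; the high bit marks 'more bytes follow'."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def compress(positions, box, bits):
    keys = sorted(morton_key(quantize(r, box, bits), bits) for r in positions)
    stream, previous = bytearray(), 0
    for key in keys:
        stream += encode_varint(key - previous)   # small when atoms are dense
        previous = key
    return bytes(stream)

def decompress(stream, box, bits):
    """Cell-center positions in curve order; error <= box / 2^(bits + 1) per axis."""
    cells = 1 << bits
    out, key, delta, shift = [], 0, 0, 0
    for byte in stream:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            key += delta
            delta, shift = 0, 0
            out.append(tuple((i + 0.5) * box / cells
                             for i in morton_index(key, bits)))
    return out

rng = random.Random(0)
atoms = [tuple(rng.uniform(0.0, 10.0) for _ in range(3)) for _ in range(200)]
packed = compress(atoms, 10.0, 8)
restored = decompress(packed, 10.0, 8)
```

The number of bits per axis plays the role of the user-controlled error bound. The benefit of the difference encoding grows with atom density, since neighboring atoms on the curve then have nearby keys.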
Figure 13. (a) A space-filling curve based on octree indexing maps the 3D space into a sequential list, while preserving spatial proximity of consecutive list elements. (The panel shows a 2D example.) (b) Atoms are sorted along the space-filling curve and only relative positions are stored. (c) A 3D space-filling curve color-coded from red (the head of the list) to blue (the tail).
Visualization of collective motion of many atoms is a difficult task because of the high dimensionality (3N dimensions for N atoms) of the space in which the collective motion occurs. We find that concerted motion of many atoms can be visualized effectively by using graph data structures. In this approach, atoms and interatomic bonds are regarded as nodes and edges of a graph, respectively [52]. Node degree, the number of neighbor atoms, is a measure of local chemical order. Atomic processes are characterized by reconnection of edges. We have found that intermediate-range order in amorphous solids, which often extends up to five edge-lengths, is closely related to the distribution of the shortest-path rings of the graph [53]. Furthermore, graph data structures even encode global properties such as the rigidity of the entire solid [54]. Recently, we have applied a graph-theoretical topological analysis to pressure-induced structural transformation in gallium arsenide nanocrystals
Figure 14. Graph-theoretical topological spectroscopy showing four- and six-membered rings during structural transformation of a gallium arsenide nanocrystal under pressure. The high-pressure phase represented by four-membered rings nucleates at the surface and grows inward.
[55, 56]. The low- and high-pressure phases of this system are characterized by six- and four-membered rings, respectively (see Fig. 14). We have found that these ring structures are insensitive to the existence of surfaces. Consequently, bulk topological defects and incipient phases during structural phase transformation hidden inside the system can be easily detected. We have also applied a graph-theoretical approach based on the shortest-path ring analysis to identify and track topological defects such as dislocations during indentation and impact on materials [15].
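The graph view of an atomic configuration, with atoms as nodes, bonds as edges, node degree as coordination, and rings as a signature of the phase, can be sketched as follows. Two simplifications are assumed. Bonds are defined by a plain distance cutoff with an O(N²) search (a linked-cell list would be used in practice), and the ring measure is the smallest ring through each bond found by breadth-first search, which is a simpler proxy than the shortest-path ring criterion of [53]. The test structures are hypothetical.

```python
import math
from collections import deque

def bond_graph(positions, cutoff):
    """Adjacency sets for all atom pairs closer than cutoff."""
    adj = {i: set() for i in range(len(positions))}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            if d2 <= cutoff ** 2:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def ring_size_through_bond(adj, i, j):
    """Number of atoms in the smallest ring containing bond i-j, or None."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if node == i and nxt == j:
                continue                  # do not walk along the bond itself
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                if nxt == j:
                    return dist[nxt] + 1  # path edges plus the closing bond
                queue.append(nxt)
    return None

def ring_histogram(adj):
    """Count bonds by the size of the smallest ring passing through them."""
    hist = {}
    for i in adj:
        for j in adj[i]:
            if i < j:
                size = ring_size_through_bond(adj, i, j)
                hist[size] = hist.get(size, 0) + 1
    return hist

# Hypothetical test structures: a planar hexagon and a square of unit bonds.
hexagon = [(math.cos(math.pi * k / 3), math.sin(math.pi * k / 3), 0.0)
           for k in range(6)]
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]

hex_graph = bond_graph(hexagon, 1.2)
hex_rings = ring_histogram(hex_graph)                   # six-membered ring
square_rings = ring_histogram(bond_graph(square, 1.2))  # four-membered ring
degrees = [len(hex_graph[i]) for i in hex_graph]        # coordination numbers
```

A shift of the histogram from six-membered toward four-membered rings is the kind of signal used in the text to distinguish the low- and high-pressure phases.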
2.10. Multiscale Visualization in Virtual Environment
The data structures associated with the MRMD algorithm (octree FMM cells, linked lists for the member atoms of FMM cells, and neighbor-atom lists) have been reused to efficiently visualize multimillion-atom simulations. We have developed software named Atomsviewer, which visualizes billion-atom data sets at interactive speeds in an immersive environment [6]. In visualization, polygon rendering on a graphics pipeline is the primary bottleneck; thus, we minimize the pipeline workload by processing only the data the viewer will see. To do this, we use data-management techniques based on the octree data structure, see Fig. 15. Novel algorithms and techniques, such as our
Figure 15. The octree data structure overlain on the atomistic data. The figure shows only the atoms that are selected for subsequent rendering.
probabilistic approach to removing hidden atoms and a parallel and distributed design, can further reduce the rendering pipeline’s workload. Furthermore, we offload all processing that precedes rendering to a PC cluster and dedicate the graphics server to rendering. The resulting architecture provides multiple viewpoints, thus enhancing the user experience.

A plan is under way to increase the system size of atomistic simulations significantly, using a “Grid” of distributed, heterogeneous parallel machines and immersive and interactive virtual environments. This will present unprecedented challenges of scalability, load balancing, distributed data access, latency hiding, and control of levels of detail for fast rendering. Accordingly, multidisciplinary research involving simulation, visualization, and data management/mining algorithms will become increasingly important. In particular, emerging hybrid simulation algorithms combining continuum and atomistic approaches will provide solid foundations for hybrid rendering algorithms combining atomistic, volumetric, and surface models.
3. Part II: Multimillion-atom Molecular Dynamics Simulations of Nanostructured Materials and Processes
The scalable simulation algorithms described in the previous section have been used to perform large-scale atomistic simulations of various
nanostructured materials and interfaces. In the following sections, we summarize some of the simulation results. The simulations described in this section deal with semiconductor, ceramic, and metallic nanostructures and with nanostructured materials and processes. These include sintering of nanoclusters and nanostructured ceramics, fracture in nanostructured materials and scaling properties of fractured surfaces, interfacial fracture of the silicon/silicon nitride interface, nanometer-scale stress patterns in silicon/silicon nitride nanopixels, self-limiting growth and critical lateral sizes in gallium arsenide/indium arsenide nanomesas, structural transformation in GaAs nanocrystals, nanoindentation of crystalline and amorphous ceramic films, dynamics of oxidation of aluminum nanoparticles, ceramic fiber composites, and environmental effects on fracture (stress corrosion).

For the next generation of aerospace engines and high-efficiency, environmentally clean turbines, it will be necessary to have materials that are mechanically stable at or above 1700 °C. This is a very challenging problem, and to accomplish the objective of making such materials, synthesis, processing, and simulations will have to be carried out concurrently. In the first set of simulations, we will discuss sintering of silicon carbide and silicon nitride nanoclusters, and crack propagation and fracture in nanostructured silicon nitride, including the scaling behavior of fractured surfaces.
3.1.
Sintering of Silicon Nitride Nanoclusters
Sintering is key to a number of advanced technologies. For example, multilayer ceramic integrated circuits (MCIC) are attracting much attention as an effective way to integrate discrete components for high-frequency wireless communication equipment. The major challenge in MCIC is to control constrained sintering of laminated ceramic multilayers to obtain mechanically stable products with desired properties. Computer simulations of MCIC are of particular interest to companies such as Motorola, Texas Instruments, and Intel, and to related industries in the area of wireless communication technologies. The first MD simulations of sintering of ceramic nanoclusters were carried out in our group in 1996. These MD simulations studied sintering of Si3N4 nanoclusters (each cluster consisting of 20 335 atoms) [57]. The simulations provide a microscopic view of anisotropic neck formation during the early stages of sintering (Fig. 16). In the case of Si3N4 nanocrystals at 2000 K, considerable relative motion of the clusters is observed in the initial stages. Subsequently a few Si and N atoms join the two nanocrystals and, thus bound, they continue to rotate relative to each other for 100 ps. In the next 100 ps, the relative motion subsides and a steady growth of an asymmetric neck between the two nanocrystals is observed. In the neck region, there are more four-fold than three-fold coordinated Si atoms.
Figure 16. (Left panel) Snapshots of Si3 N4 nanocrystals at 2000 K: (a) at time t = 0; (b) after 40 ps; (c) after 100 ps; and (d) close-up of the neck region after 200 ps. Two types of spheres denote Si and N atoms. (Right panel) Snapshot of sintered amorphous nanoclusters after 700 ps.
The sintering of amorphous Si3N4 nanoclusters has also been simulated at 2000 K. The neck between amorphous nanoclusters is much more symmetric than the neck between thermally rough nanocrystals (Fig. 16). The neck region between amorphous nanoclusters has nearly the same number of three- and four-fold coordinated Si atoms. For both nanocrystals and amorphous nanoclusters, sintering is driven by diffusion of surface atoms. The diffusion in the neck region of amorphous clusters is four times faster than in the neck between nanocrystals. MD simulations have also been performed to study sintering between three nanoclusters. For nanocrystals, a significant rearrangement of the nanocrystals occurs within 100 ps, followed by the onset of neck formation. Amorphous nanoclusters aligned along a straight line have also been simulated. Within 100 ps, we observed relative motion of the clusters. In the next 100 ps, a symmetric neck forms between each pair of clusters and, thereafter, the relative motion subsides. The simulation shows a chain-like structure, which has been observed experimentally as well.
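The three-fold and four-fold Si counts quoted above come from a coordination analysis of the neck region. A minimal sketch of such a count follows, assuming a plain distance cutoff for Si–N bonds and no periodic boundaries. The 2.0 Å cutoff, the function names, and the toy coordinates are illustrative assumptions, not values taken from the simulations.

```python
import math

def coordination_numbers(positions, species, cutoff, center="Si", neighbor="N"):
    """For every `center` atom, count `neighbor` atoms within `cutoff`.

    Brute-force O(N^2) search without periodic images (an assumption made
    for brevity; a production code would use cell lists).
    """
    counts = {}
    for i, (p_i, s_i) in enumerate(zip(positions, species)):
        if s_i != center:
            continue
        n = 0
        for j, (p_j, s_j) in enumerate(zip(positions, species)):
            if i != j and s_j == neighbor and math.dist(p_i, p_j) <= cutoff:
                n += 1
        counts[i] = n
    return counts

def coordination_histogram(counts):
    """Map coordination number -> number of center atoms with that value."""
    hist = {}
    for n in counts.values():
        hist[n] = hist.get(n, 0) + 1
    return hist

# Toy configuration (hypothetical): one four-fold and one three-fold Si atom.
positions = [
    (0.0, 0.0, 0.0),                                   # Si, tetrahedral site
    (1.0, 1.0, 1.0), (-1.0, -1.0, 1.0),
    (-1.0, 1.0, -1.0), (1.0, -1.0, -1.0),              # its four N neighbors
    (10.0, 0.0, 0.0),                                  # Si, under-coordinated
    (11.0, 1.0, 1.0), (9.0, -1.0, 1.0), (9.0, 1.0, -1.0),  # its three N neighbors
]
species = ["Si", "N", "N", "N", "N", "Si", "N", "N", "N"]

histogram = coordination_histogram(
    coordination_numbers(positions, species, cutoff=2.0)
)
```

Restricting `positions` to atoms inside the neck region before calling the function would give the kind of three-fold versus four-fold comparison described in the text.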
3.2.
Structure and Mechanical Properties of Nanostructured Ceramics
Advanced structural ceramics are highly desirable materials for applications in extreme operating conditions. Light-weight, elevated melting temperatures,
high strengths, and wear and corrosion resistance make them very attractive for high-temperature and high-stress applications. The only serious drawback of ceramics is that they are brittle at low to moderately high temperatures. In recent years, a great deal of progress has been made in the synthesis of ceramics that are much more ductile than conventional coarse-grained materials [58, 59]. These so-called nanostructured materials are fabricated by in situ consolidation of nanometer-size clusters. Despite a great deal of research, many perplexing questions concerning nanostructured ceramics remain unanswered. Experiments have yet to provide information regarding the morphology of pores or the structure and dynamics of atoms in nanostructured ceramics. As far as modeling is concerned, only a few atomistic simulations of nanostructured materials have been reported thus far. This is because these simulations are highly compute-intensive: a realistic MD simulation of a nanostructured solid requires $10^5$–$10^6$ time steps and ∼$10^6$ atoms (each nanocluster itself consists of $10^3$–$10^4$ atoms). Large-scale MD simulations have been performed to investigate sintering, structure, and mechanical behavior of nanostructured Si3N4 [60–62], SiC [13] and SiO2 [8]. Figure 17 shows the results of the first joint experimental and MD study of sintering of nanostructured SiC (n-SiC) [13]. In both experiment (solid diamonds) and simulation (open circles), the onset of sintering is around 1500 K. The MD simulations provide a microscopic picture of how the morphology of micropores in n-SiC changes with densification. The fractal dimension and the surface roughness exponent of micropores are found to be 2.4 and 0.45, respectively, over the entire pressure range between 0 and
Figure 17. (Left) Snapshot of nanophase SiC. (Right) The onset of sintering is indicated by an increase in the average particle size in the neutron data (solid diamonds) and an increase in the rate of bond formation between nanoparticles in the MD results (open circles). The dotted line is a guide to the eye for the MD results.
15 GPa. Small-angle neutron scattering at low wave vectors yields a fractal dimension of two for pores in n-SiC. MD calculations of pair-distribution functions and bond-angle distributions reveal that interfacial regions between nanoparticles are highly disordered, with nearly the same number of three-fold and four-fold coordinated Si atoms. The effect of consolidation on mechanical properties is also investigated with the MD approach. The results show a power-law dependence of elastic moduli on the density with an exponent of 3.4 ± 0.1. The simulation of nanostructured SiO2 involves amorphous nanoclusters, which are obtained from bulk amorphous SiO2 [8]. In n-SiO2 the morphology of micropores, mechanical behavior, and the effect of nanoscale structures on the short-range and intermediate-range order (SRO and IRO) are investigated (Fig. 18). Pores in nanostructured a-SiO2 are found to have a self-similar structure with a fractal dimension close to two; the pore surface width scales with the volume as $W \sim \sqrt{V}$. The MD simulations also reveal that the SRO in nanostructured silica glass is very similar to that in the bulk glass: both of them consist of corner-sharing Si(O1/2)4 tetrahedra. However, the IRO in nanostructured silica glass is quite different from that in the bulk glass. We have also investigated the mechanical behavior of nanostructured a-SiO2. The elastic moduli are found to have a power-law dependence on the density with
Figure 18. (Top) Snapshots of nanophase amorphous silica at densities 1.37, 1.59, 1.84, and 2.13 g/cc, corresponding to pressures 2, 4, 8 and 16 GPa, respectively. (Bottom) The same systems as above, with pores colored red.
an exponent of 3.5. These results are in excellent agreement with experimental measurements on high-density silica aerogels.
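A power-law exponent such as the 3.5 quoted above is typically extracted as the slope of a straight-line fit on a log–log plot. The sketch below shows that fit in plain Python. The densities are those listed in the caption of Fig. 18, but the moduli are synthetic numbers generated from an assumed prefactor of 2.0 purely to exercise the code; they are not simulation results.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of y = prefactor * x**exponent in log-log space."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    s_xx = sum((a - mean_x) ** 2 for a in lx)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    exponent = s_xy / s_xx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent

# Densities (g/cc) from the Fig. 18 caption; moduli are SYNTHETIC test data.
densities = [1.37, 1.59, 1.84, 2.13]
synthetic_moduli = [2.0 * rho ** 3.5 for rho in densities]

prefactor, exponent = fit_power_law(densities, synthetic_moduli)
```

With real data the scatter about the fitted line gives the uncertainty on the exponent, as in the ± 0.1 quoted for n-SiC.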
3.3.
Crack Propagation in Amorphous SiO2 and Nanostructured Si3 N4
Amorphous silica (a-SiO2 ) was obtained by heating β-cristobalite to 3200 K and then quenching the molten system to room temperature. The short-range spatial correlations and medium-range order in the computer-generated system are in good agreement with neutron scattering measurements. The calculated bond angle distribution, Si–O–Si, also compares very well with Nuclear Magnetic Resonance measurements [63]. The amorphous system was notched and a uniaxial strain was applied to atoms within 7.5 Å (cutoff in the potential) from the outermost layers. The system was relaxed for several thousand time steps before incrementing the strain. Figure 19 shows that crack propagation is accompanied by nucleation and growth of nanometer scale cavities ahead of the crack tip. Cavities coalesce and
Figure 19. (Top) Snapshot of atoms (t = 55 ps) in an MD simulation of fracture in a-SiO2 at room temperature, showing nanometer-scale cavities (black) in front of the crack, cavity coalescence, and merging of cavities with the advancing crack. (Bottom) AFM image of a stress corrosion crack (i.e., sub-critical crack growth, in which corrosion by atmospheric water assists crack propagation) in an aluminosilicate glass at room temperature, revealing nanometric cavities ahead of the crack. The Fracture-surface Topography Analysis (FRASTA) method shows that the voids contribute to the final fracture and are actually damage cavities. Recently, the group has observed the same fracture mechanism in silica glass.
merge with the advancing crack to ultimately cause failure. Recent experimental work of Bouchaud et al. [64], involving an Atomic Force Microscope study of fracture in an aluminosilicate glass, reveals nanocavitation and coalescence of cavities with the crack to be the mechanism of fracture. The calculated critical stress intensity factor, $K_{1C}$, in a-SiO2 is 1 MPa√m, and the experimental values range between 0.8 MPa√m and 1.2 MPa√m [65]. Turning to nanostructured silicon nitride (n-Si3N4), we first cut a spherical nanoparticle of diameter 6 nm from crystalline α-Si3N4 [60]. The nanoparticle is thermalized at room temperature and then 108 different configurations of the nanoparticle are placed randomly in a cubic MD box. (The system contains approximately $10^6$ atoms.) The initial configuration is heated to 2000 K and subsequently sintered under hydrostatic pressures of 5, 10, and 15 GPa. The final sintered system (at 15 GPa) is cooled down and thermalized at room temperature. Subsequently, the pressure is reduced to 10, 5, and 0 GPa. In each instance, the system is relaxed for thousands of time steps. The final n-Si3N4 configuration is consolidated to 92% of the density of crystalline α-Si3N4. MD calculations of Si–Si, Si–N, and N–N pair-distribution functions and Si–N–Si and N–Si–N bond-angle distributions reveal that interior regions of nanoparticles remain crystalline whereas interfacial regions between nanoparticles are akin to amorphous Si3N4 [60]. This was confirmed by MD simulations of amorphous Si3N4 of the same mass density as the average mass density of interparticle regions in n-Si3N4. Partial pair-distribution functions and bond-angle distributions for the two systems are similar. As we shall see momentarily, the amorphous structure of interparticle regions plays a key role in crack propagation in n-Si3N4.
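The first step of the preparation above, placing 108 nanoparticles at random in a cubic box, can be sketched as a random sequential placement of sphere centers. The text does not say how overlaps were handled, so the non-overlap test, the periodic minimum-image distance, the 50 nm box edge, and all names below are assumptions made for illustration; only the particle count and the 6 nm diameter come from the text.

```python
import random

def minimum_image_distance(a, b, box):
    """Distance between points a and b in a periodic cubic box of edge `box`."""
    d2 = 0.0
    for a_k, b_k in zip(a, b):
        d = abs(a_k - b_k) % box
        d = min(d, box - d)
        d2 += d * d
    return d2 ** 0.5

def place_particles(n, diameter, box, seed=0, max_tries=100000):
    """Random sequential placement of n non-overlapping sphere centers."""
    rng = random.Random(seed)
    centers = []
    tries = 0
    while len(centers) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("box too small for the requested packing")
        trial = tuple(rng.uniform(0.0, box) for _ in range(3))
        # Accept the trial center only if it overlaps no accepted sphere.
        if all(minimum_image_distance(trial, c, box) >= diameter for c in centers):
            centers.append(trial)
    return centers

N_PARTICLES = 108    # from the text
DIAMETER_NM = 6.0    # from the text
BOX_NM = 50.0        # hypothetical initial box edge, before sintering

centers = place_particles(N_PARTICLES, DIAMETER_NM, BOX_NM)
```

In an actual setup each center would receive a copy of the thermalized nanoparticle, presumably with its own random orientation, before heating and applying pressure.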
In MD simulations of dynamic fracture in n-Si3 N4 , the sintered system at room temperature is notched and subjected to an external strain [61]. Figure 20(a) is a snapshot of the system at 10 ps after the strain reaches
Figure 20. MD simulations of dynamic fracture in n-Si3 N4 at room temperature. Snapshots show the crack front and the cavities in n-Si3 N4 at 10 ps after applied strains of 5% (a), 11% (b), and 14% (c) were reached.
5%. (To highlight cavities and cracks in the system, atoms are not shown in the figure.) In addition to the notch (magenta), we observe nanoscale cavities in amorphous interparticle regions. As the strain is increased, the notch advances and cavities grow and coalesce among themselves and also with the advancing crack; see Fig. 20(b). The crack front meanders through amorphous interparticle regions; see Fig. 20(c). Nanoscale cavitation, crack meandering, and crack branching render n-Si3N4 much tougher than the α-Si3N4 crystal, which undergoes cleavage fracture. The fracture toughness of n-Si3N4 is estimated to be 6 times larger than that of the crystal. We have also investigated crack propagation in amorphous nanostructured silica (n-SiO2). The system was generated by cutting a spherical nanoparticle of diameter 8 nm from the bulk a-SiO2 system mentioned before. After thermalizing it at room temperature, 100 different configurations of the nanoparticle were placed randomly in a cubic box. Periodic boundary conditions were applied and the system was sintered at 1000 K under a hydrostatic pressure of 16 GPa. Subsequently, the system was cooled down to room temperature and thermalized both before and after removing the pressure. MD simulations of fracture in n-SiO2 reveal that the crack propagates through interparticle regions. At small values of the applied strain, these regions have a few isolated nanocavities. As the applied strain is increased, we observe: (a) the precrack advances mostly through interfacial regions; (b) nanocavities grow and coalesce; and (c) new nanocavities form ahead of the crack in interparticle regions. The crack meanders through nanoparticle boundaries, coalescing with nanocavities in its path, until the system completely fractures.
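Cavities like those shown in Fig. 20 have to be identified from atomic positions before they can be drawn or tracked. The chapter does not state how this was done; one common approach, sketched here as an assumption, is to bin atoms into a grid of cells, treat cells that contain no atom as void, and group face-sharing void cells into cavities. The grid size, the non-periodic neighbor rule, and the toy data are all hypothetical.

```python
from collections import deque

def find_cavities(positions, box, n_cells):
    """Group empty grid cells into face-connected clusters (cavities)."""
    cell = box / n_cells
    occupied = set()
    for x, y, z in positions:
        occupied.add(tuple(min(int(c / cell), n_cells - 1) for c in (x, y, z)))
    empty = {
        (i, j, k)
        for i in range(n_cells) for j in range(n_cells) for k in range(n_cells)
    } - occupied

    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    cavities = []
    while empty:
        start = empty.pop()
        cluster, queue = [start], deque([start])
        while queue:  # breadth-first flood fill over empty neighbor cells
            i, j, k = queue.popleft()
            for di, dj, dk in steps:
                nb = (i + di, j + dj, k + dk)
                if nb in empty:
                    empty.remove(nb)
                    cluster.append(nb)
                    queue.append(nb)
        cavities.append(cluster)
    return cavities

# Toy system: one atom per cell of a 4x4x4 grid, except three cells left empty.
holes = {(0, 0, 0), (1, 0, 0), (3, 3, 3)}
atoms = [
    (i + 0.5, j + 0.5, k + 0.5)
    for i in range(4) for j in range(4) for k in range(4)
    if (i, j, k) not in holes
]
cavities = find_cavities(atoms, box=4.0, n_cells=4)
cavity_sizes = sorted(len(c) for c in cavities)
```

Running the same analysis on successive snapshots and comparing cluster sizes is one way to follow cavity growth and coalescence with the crack.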
3.4.
Scaling Properties of Fracture Surfaces
We have examined the morphology of fracture surfaces in n-Si3N4 and have found scaling behavior akin to that observed experimentally in a variety of other materials. Fracture surfaces are self-affine objects with the height–height correlation function varying as
$$h(r) = \left\langle \left[ x(z + r) - x(z) \right]^2 \right\rangle_z^{1/2} \propto r^{\zeta}, \qquad (1)$$
where x is the height of the fracture profile normal to the plane of crack propagation and $\langle \ldots \rangle_z$ implies an average over z. Figure 21 shows the MD results for fracture surfaces in n-Si3N4. The log–log plot of h vs. r reveals two distinct power-law regimes with exponents ζ = 0.58 and 0.84 below and above a cross-over length, $\xi_c$, respectively. The smaller exponent (0.58) is found to be due to intra-cavity correlations while the larger one (0.84) results from inter-cavity correlations and crack–cavity coalescence. The cross-over length, $\xi_c$, is close to the size of the nanoparticle.
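Equation (1) translates directly into a few lines of code. The sketch below evaluates h(r) for a height profile sampled on a uniform grid and estimates ζ as the log–log slope over a chosen set of lags. The function names and the sampling on an integer grid are assumptions. The check uses a straight ramp, for which h(r) is exactly proportional to r, rather than any fracture data.

```python
import math

def height_height_correlation(profile, r):
    """h(r) = sqrt(< [x(z + r) - x(z)]^2 >_z) for an integer lag r."""
    n_pairs = len(profile) - r
    if r < 1 or n_pairs < 1:
        raise ValueError("lag r must satisfy 1 <= r < len(profile)")
    mean_sq = sum(
        (profile[i + r] - profile[i]) ** 2 for i in range(n_pairs)
    ) / n_pairs
    return math.sqrt(mean_sq)

def roughness_exponent(profile, lags):
    """Least-squares slope of log h(r) versus log r over the given lags."""
    lx = [math.log(r) for r in lags]
    ly = [math.log(height_height_correlation(profile, r)) for r in lags]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    s_xx = sum((a - mean_x) ** 2 for a in lx)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    return s_xy / s_xx

# Sanity check on a ramp x(z) = 0.5 z, where h(r) = 0.5 r and the slope is 1.
ramp = [0.5 * z for z in range(200)]
zeta_ramp = roughness_exponent(ramp, lags=[1, 2, 4, 8, 16, 32])
```

To look for two regimes as in Fig. 21, one would call `roughness_exponent` separately on lags below and above a trial cross-over length and compare the two slopes.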
Figure 21. Height–height correlation function for fracture surfaces in n-Si3 N4 . The MD results show that the roughness exponent is 0.58 and 0.84 below and above a certain crossover length, respectively. The cross-over length is close to the nanoparticle size.
Fracture experiments on various metals, alloys, ceramics, and glasses reveal similar scaling behavior [66]. The experimental value of the smaller exponent is around 0.5 while the larger exponent is always close to 0.8, independent of the material or its microstructure. The cross-over length $\xi_c$ is, however, a material characteristic, which decreases with an increase in the crack velocity.
3.5.
Interfacial Fracture at Silicon/Silicon Nitride Interface
Interfaces between dissimilar materials are ubiquitous in silicon integrated circuit and other heterojunction based technologies. Owing to the differences in their mechanical and thermal properties, high stresses are known to develop at such interfaces and at the edge regions generated in delineating discrete device elements or pixels [67]. This can cause defect formation, including crack initiation and propagation. Fracture at interfaces has been a subject of numerous experimental and theoretical studies. Cracking patterns range from surface cracks and channeling in the film to substrate damage, spalling and debonding of the interface. Silicon dioxide and silicon nitride are two dielectrics commonly employed in semiconductor technology for a variety of purposes such as gate insulator, trench isolation, encapsulation, etc. In recent years, finite element analyses have been undertaken to examine aspects of stress distributions in such situations to supplement the more limited results available from analytical theories [68]. Theoretical studies and simulations of crack initiation,
propagation (i.e., dynamics), and fracture have, however, been lacking for semiconductor/dielectric interfaces. One way to simulate crack initiation and propagation is to apply uniaxial strain parallel to the interface and examine, via molecular dynamics, the time evolution of the system to analyze its failure resistance. We will discuss the Si/Si3N4 interface and nanopixels in the following two sections. In an effort to increase processing speed and memory density, the feature sizes of semiconductor devices are expected to shrink to 50 nm or smaller in the next several years. Stresses induced in Si/Si3N4 nanopixels are major sources of defects and inhomogeneities in the system and they become significant for pixel sizes in the nanometer range. One of the main issues is the development of reliable physical models for the Si/Si3N4 interface, which can supplement empirical data in nanopixel design. In our simulations, silicon nitride is represented by an interatomic potential involving two- and three-body interactions. The two-body terms include steric repulsion, the effect of charge transfer via Coulomb interaction, and the large electronic polarizability of anions through the charge–dipole interaction. Three-body terms account for bond-bending and bond-stretching effects. Bulk and Young's moduli, along with the phonon density-of-states of a crystal and structural correlations in the amorphous state [21], are described well by the interaction potential. It has been used successfully to study fracture in crystalline, amorphous, and nanophase Si3N4 [9, 10, 60, 61]. The silicon system is described by the Stillinger–Weber potential [69]. To account for all the structural correlations for silicon, silicon nitride and the Si(111)/Si3N4(0001) interface, the system is modeled using eight components [70, 71].
These consist of: Si⁴⁺ and N³⁻ in bulk Si3N4; Si³⁺, N²⁻, and N³⁻ on the Si3N4 side of the interface; three-fold coordinated Si at the Si(111) interface and its four-fold coordinated neighboring silicon in the plane; and bulk Si. The multimillion-atom simulations were performed on a variety of parallel supercomputers using highly efficient space–time MRMD algorithms. Molecular dynamics, Langevin dynamics, and steepest-descent quench methods were used. The interfacial bond lengths obtained from our interaction potentials are consistent with chemical arguments and self-consistent linear combination of atomic orbitals (LCAO) calculations [72] and give a satisfactory description of the structure of the silicon nitride/silicon interface (see Fig. 22). A schematic of the geometry of the interface system is shown in Fig. 23 [71]. After thermalizing the system at 300 K, the system was stretched parallel to the interface, i.e., in the $[2\bar{1}\bar{1}0]$ direction for silicon nitride and in the $[\bar{2}11]$ direction for silicon, until it failed. For each percent of strain the system was subjected to a 2 ps stretching phase and a 2 ps relaxation phase, as seen in the time evolution of $\sigma_{xx}$, the stress tensor component in the stretching direction (see Fig. 23). The system did not show any failure up to 8% strain. At 9% strain, within the first 2 ps, $\sigma_{xx}$ decreased dramatically. This is due to the
Figure 22. (Left) The atomic structure of the Si (111)/Si3 N4 (0001) interface. The small red spheres are the Si (111) atoms; the small cyan spheres are the Si atoms of the Si3 N4 side; and the large spheres are the N atoms of the Si3 N4 side. (Right) The valence charge–density map of the same system calculated with an LCAO calculation.
Figure 23. (Left) Schematic of the simulated Si [111]/Si3 N4 [0001] system. (Center) Uniaxial stress in the x direction as a function of time. (Right) A slice of the system, in which a dislocation is highlighted by a circle. Solid dots are atoms.
fact that a crack started to form at the top surface of the silicon nitride layer and propagated through the whole silicon nitride layer within 17 ps. The system was monitored for an additional 80 ps. It was found that the crack does not propagate into Si but instead emits dislocations, which correlates well with an additional drop in $\sigma_{xx}$ after 48 ps. We have examined the structure of
Figure 24. Close-up of a dislocation loop. Only Si atoms with energies larger than the average silicon atom energy by +0.35 eV are plotted.
silicon at the interface to determine the nature of defects created by the crack arriving from silicon nitride. In Fig. 23 the extra line of atoms (in yellow) in a Si (111) plane parallel to the interface, which marks an edge dislocation, can be clearly seen. The dislocation core lies within the white dashed circle. The projection of the displacement vector onto the (111) plane is in the $[\bar{1}10]$ direction, as indicated by the arrow from a red to a yellow Si atom. The time evolution is given in Figs. 24 and 25(a)–(c). Only those Si atoms whose energies are higher than the average silicon energy by +0.35 eV are shown. Interfacial (blue) and surface atoms (red) also satisfy this criterion, i.e., their energy is 0.35 eV larger than the average energy, and can be seen at the top (interface, blue atoms) and bottom (silicon surface, red atoms). In Fig. 25(a), we see the formation of a dislocation loop at the interfacial plane (blue) and the right-hand silicon surface (surface atoms belonging to the vertical planes have been removed from the plot to make the dislocation loop visible). The dislocation loop lies on a $(\bar{1}\bar{1}1)$ plane denoted with dashed lines in Fig. 25(a). This loop has five segments. The first segment, the line in the interfacial plane (blue atoms at the top), is in direction $[\bar{1}10]$. Moving clockwise, the second segment, vertical, is in direction [011], the third in direction $[0\bar{1}1]$, the fourth in direction [110], and the last segment is in direction [011]. As time proceeds the dislocation loop grows (see Figs. 25(b) and (c)) until it reaches the silicon surface (red) at the bottom after 13 ps. From our simulation data we estimate the speed of the dislocation motion to be 500 (±100) m/s.
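The selection rule used for Figs. 24 and 25, keeping only Si atoms whose energy exceeds the average silicon energy by 0.35 eV, is a simple filter over per-atom energies. A minimal sketch follows. Taking the average over all Si atoms in the list is an assumption, and the energies in the example are made-up numbers chosen only to exercise the threshold.

```python
def high_energy_atoms(energies, species, threshold=0.35, element="Si"):
    """Indices of `element` atoms whose energy exceeds the element's
    mean per-atom energy by more than `threshold` (same units, e.g. eV)."""
    selected = [e for e, s in zip(energies, species) if s == element]
    if not selected:
        return []
    mean_energy = sum(selected) / len(selected)
    return [
        i for i, (e, s) in enumerate(zip(energies, species))
        if s == element and e - mean_energy > threshold
    ]

# Hypothetical per-atom energies (eV): eight bulk-like Si atoms, one N atom
# (ignored by the filter), and two Si atoms in a defect-like environment.
species = ["Si"] * 8 + ["N"] + ["Si", "Si"]
energies = [-4.3] * 8 + [-1.0] + [-3.5, -3.6]

defect_indices = high_energy_atoms(energies, species)
```

Because interfacial and surface atoms also pass this threshold, a plot built from the returned indices will show them alongside the dislocation core, as the text notes.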
3.6.
Nanometer-Scale Stress Patterns in Si/Si3 N4 Nanopixels
The first MD simulations of nanopixels in the size range of 25 to 70 nm were performed in our group by using parallel MD [70]. Large-scale
Figure 25. Time evolution of dislocation motion. Only Si atoms with energies larger than the average silicon atom energy by +0.35 eV are plotted, i.e., interfacial atoms (blue), surface atoms (red), and atoms in the dislocation core (red). (a) At 9.12 ps, formation of a dislocation loop at the interfacial plane (blue) and the right-hand silicon surface (surface atoms belonging to the vertical planes have been removed from the plot to make the dislocation loop visible). The dislocation loop lies on a $(\bar{1}\bar{1}1)$ plane denoted by dashed lines. The dislocation loop consists of five segments. The first segment is in the interfacial plane (blue atoms at the top) in direction $[\bar{1}10]$; moving clockwise, the second segment, vertical, is in direction [011], the third is in direction $[0\bar{1}1]$, the fourth in direction [110], and the last segment is in direction [011]. (b) and (c) show the dislocation loop after 10.56 ps and 12 ps, respectively.
computing resources located at the Caltech National Science Foundation facility and the DoD sites were used for these simulations. There are many interesting and challenging issues at the semiconductor/ceramic interface. These include stresses due to bonding of two very dissimilar materials and the effect of lattice mismatch on interfacial stresses. Beyond these, there is also the question of stresses due to edges and corners for such small nanostructures. Many interesting phenomena are associated with length scales far beyond those accessible to electronic structure calculations. At present, the only viable solution to this problem is large-scale MD simulation, provided the interatomic potentials are able to describe Si, Si3N4, and the interface in a seamless fashion.
In the interatomic interaction scheme, a clear distinction between Si atoms in the silicon substrate and those in silicon nitride is essential. In addition, the atoms near the interface have different charge transfer from those in bulk Si3N4. The LCAO electronic structure calculations for the Si(111)/Si3N4(0001) interface [72] indicate that the interatomic interaction in Si/Si3N4 can be modeled very well as an eight-component system, where each of the eight atom types is associated with a different set of parameters in the interatomic potential. Bulk Si is modeled by the Stillinger–Weber potential. The potential for bulk silicon nitride is a sum of two-body and three-body terms. The former includes the effects of charge transfer, electronic polarizability, steric repulsion and van der Waals interactions; the latter takes into account covalent effects through bond-bending and bond-stretching terms. The interatomic potential has been validated by comparison with experiments on crystalline and amorphous Si3N4. For atoms at the interface, the charge transfer, bond lengths, and bond angles are consistent with the results of the electronic-structure calculations. To study the atomic-level stress distribution in a Si/Si3N4 nanopixel, we have performed MD simulations involving up to 27 million atoms [70]. The interatomic potential model used in the simulations has been developed on the basis of LCAO electronic structure calculations [72]. The system consists of a Si mesa placed on top of a Si(111) substrate. The top surface of the mesa is covered with a crystalline Si3N4(0001) or amorphous Si3N4 film. The Si(111)/c-Si3N4(0001) interface has a 1.1% lattice mismatch, which induces stresses in the system. The lattice mismatch causes compressive stresses in Si3N4, while a tensile stress is observed in Si (Fig. 26(a)). Note the effects of surfaces and edges on the stress. In the case of an a-Si3N4 film, we find the stress to be nonuniform laterally, as seen in Fig. 26(b), which is quite different from that for crystalline films. Lateral stress domains on the scale of 100–150 Å are observed in the case of the amorphous Si3N4 film. Since the PECVD (plasma-enhanced chemical vapor deposition) films employed in practice are polycrystalline, such lateral inhomogeneity in stress is to be expected in them, and our results reveal a hitherto unappreciated, serious consideration for the processing of nanoscale pixels. Figure 27 shows stress patterns and the effect of the mesa shape (square or rectangular) on stresses. For a 25 nm square-mesa system with 3.7 million atoms, the three-fold symmetry of Si(111) gives rise to three tensile stress domains. For a 10 million-atom system with a rectangular mesa of dimensions 54 nm × 33 nm, a similar stress pattern is observed in silicon just below the interface. However, the aspect ratio of 1.6 for the 54 nm × 33 nm mesa does not accommodate two three-fold patterns like the one in the 25 nm square mesa. The two stress patterns are squeezed together into a Y shape with the longer leg along the longer side of the mesa. Pixel sizes on the order of or less than 50 nm are currently being considered by industry and government agencies for fabrication in the 2005–2010
Figure 26. Pressure distribution in a Si/Si/Si3 N4 nanopixel. (a) To show the pressure inside a nanopixel covered with crystalline Si3 N4 , one quarter of the system is removed. (b) Pressure distribution in Si substrate parallel to the interface with amorphous Si3 N4 .
Figure 27. Horizontal cross-sections of stress distributions in nanopixels covered with amorphous Si3 N4 for two different system sizes. The slices are taken through Si3 N4 above the interface and Si below the interface for the 25 nm square and the 54 nm × 33 nm rectangular mesas.
period. Stress domains in these pixels may have a significant effect on the performance of such devices: they may cause the dopant distribution to be highly inhomogeneous, since the size of the stress domains can be comparable to the dimensions of the nanopixel.
3.7.
Self-limiting Growth and Critical Lateral Sizes in GaAs/InAs Nanomesas
In recent years, coherently strained three-dimensional islands formed in semiconductor overlayers having high lattice mismatch with underlying substrates have attracted much attention due to their importance in the study of electronic behavior in zero dimensions and applications in electronic and optoelectronic devices. The role and manipulation of stress in the formation of such nanostructures have been systematically examined through a study of the growth of InAs on planar and patterned GaAs (001) substrates (these systems have a large lattice mismatch of 6.6%). On infinite planar substrates, the strain relief leads to the formation of coherent three-dimensional island structures above a critical amount, ∼1.6 monolayers (ML), of InAs deposition. In contrast, when InAs is deposited on ⟨100⟩-oriented GaAs square mesas of size ≤75 nm, the island morphology is suppressed and, instead, a continuous film with flat morphology is observed. This InAs film growth is, however, self-limiting and stops at ∼11 ML. In order to understand the self-limiting nature of the InAs film growth, we have recently performed MD simulations of InAs/GaAs nanomesas with {101}-type sidewalls, see Fig. 28(a) [18]. The in-plane lattice constant of InAs layers parallel to the InAs/GaAs (001) interface starts to exceed the InAs bulk
Figure 28. (a) Atomic-level hydrostatic stress in an InAs/GaAs square nanomesa with a 12 ML InAs overlayer. (b) Vertical displacement of As atoms in the first As layer above the InAs/GaAs interface in the 8.5 million-atom and the 2.2 million-atom nanomesas.
value at the 12th ML and the hydrostatic stresses in InAs layers become tensile above ∼12th ML. As a result, it is not favorable to have InAs overlayers thicker than 12 ML. This may explain the experimental findings of the growth of flat InAs overlayers with self-limiting thickness of ∼11 ML on GaAs nanomesas. Length scales are of critical significance for stress relaxation and manipulation leading to control of the island number on chosen nanoscale area arrays. For example, on stripe mesas of sub-100-nm widths on GaAs (001) substrates, deposition of InAs is shown to allow self-assembly of three, two, and single chains of InAs three-dimensional island quantum dots selectively on the stripe mesa tops for widths decreasing from 100 nm down to 30 nm. We have recently investigated lateral size effects on the stress distribution and morphology of InAs/GaAs nanomesas using parallel MD simulations, see Fig. 28(b) [17]. Two mesas with the same vertical size but different lateral sizes are simulated. For the smaller mesa, a single stress domain is observed in the InAs overlayer, whereas two stress domains are found in the larger mesa (a highly compressive domain is located at the center of the InAs overlayer, whereas the peripheral region of the InAs overlayer is less compressive). This indicates the existence of a critical lateral size for domain formation in accordance with recent experimental findings. We have also studied the morphology of the InAs overlayer near the InAs/GaAs interface. For the 2.2 million-atom nanomesa, the As layer is “dome” shaped. In contrast, the As layer in the 8.5 million-atom nanomesa shows a “dimple” at the center of the mesa. This provides clear evidence that there exists a critical lateral size for such stress domain formation and the critical value is somewhere between 124 and 407 Å. 
Detailed analysis of structural correlations has revealed that the InAs overlayer in the larger mesa is laterally constrained to the GaAs bulk lattice constant but vertically relaxed to the InAs bulk lattice constant, which is consistent with the Poisson effect.
3.8.
Structural Transformation in GaAs Nanocrystals
Aggregates of nanometer-size semiconductor crystals have promising applications as photovoltaics, light-emitting diodes, and single-nanocrystal, single-electron transistors. A self-organized assembly of colloidal nanocrystals acts as an intelligent photonic-crystal material, which can be used in sensors and optical switches. Recently, self-formation of a laser was demonstrated in semiconductor nanopowders due to disorder-induced photon localization mechanisms. Rod-shaped nanocrystals emit polarized light and will be useful for biological tagging applications. The most recent additions to this family of nanocrystals include tetrapods. Such systematically controlled anisotropic shapes can be used as building blocks for self-assembly of three-dimensionally integrated nanostructures through surface-stress encoded epitaxy, self-alignment, and
biological templates. Finally, these nanocrystals can be utilized as new synthetic paths to novel materials that do not exist in bulk form. Size-dependent phase stability plays an essential role in the synthesis of nanocrystals. For example, many III–V and II–VI semiconductors transform from a four-coordinated phase to a six-coordinated phase as the pressure is increased, and the transition pressure often exhibits strong size dependence. Upon release of pressure, the metastable high-pressure phase can be kinetically trapped in nanoclusters. This is due to the large number of surface atoms such that the surface energetics essentially affects the phase stability. We may thus be able to prepare interior bonding geometries that do not occur in the known extended solid by adjusting the surface energy. In other words, it is possible to manipulate nanocrystal surfaces to trap structures that might ordinarily be unstable in the bulk. Nanophase engineering uses controlled pressurization and annealing to achieve new material forms, which are nonexistent in the bulk. Molecular-dynamics simulations are expected to reveal microscopic mechanisms of the nanocrystalline phase kinetics. We have performed MD simulations to investigate pressure-induced structural transformations in GaAs nanocrystals of different sizes [55]. To simulate the experimental situation, the nanocrystals are immersed in a Lennard-Jones liquid so that they can be subjected to hydrostatic pressure, see Fig. 29. It is found that the transformation from four-fold (zinc-blende) to six-fold (rocksalt) coordination starts at the surfaces of nanocrystals and proceeds inwards with increasing pressure, see Fig. 30(a). Inequivalent nucleation of the high-pressure phase at different sites leads to an inhomogeneous deformation of the nanocrystal. For sufficiently
Figure 29. Initial thermalized system. The GaAs nanocrystal is embedded in the Lennard-Jones liquid that serves as a hydrostatic pressure medium.
Figure 30. Structural transformation in a GaAs nanocrystal from outer to inner shells. (a) An 8 Å slice of an initially spherical nanocrystal of diameter 60 Å that is partially transformed at a pressure of 17.5 GPa. The outermost shell shows the rocksalt structure (atoms forming four-membered rings) while the innermost shell still shows the zinc-blende structure (atoms forming six-membered rings). (b) The same slice with the nanocrystal completely transformed at 22.5 GPa. The rocksalt structure can now be seen in the innermost shell. The red lines are a guide to the eye for distinguishing the differently oriented grains.
large spherical nanocrystals, this gives rise to rocksalt structures of different orientations separated by grain boundaries, see Fig. 30(b). The absence of such grain boundaries in a faceted nanocrystal of moderate size indicates sensitivity of the transformation to the initial nanocrystal shape. The pressure corresponding to the complete transformation increases with the nanocrystal radius and it approaches the bulk value for a spherical nanocrystal of ∼5000 atoms.
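The four-fold to six-fold transformation described above is usually followed by counting nearest neighbors within a cutoff distance. The following is a minimal sketch of such a coordination analysis, not the analysis code used for the simulations in [55]. The function names, the rocksalt lattice constant of 5.30 Å, and the 3.0 Å cutoff are illustrative assumptions; only the zinc-blende GaAs lattice constant of about 5.65 Å is a standard value.

```python
import numpy as np

def coordination_numbers(positions, cutoff):
    """Count neighbors closer than `cutoff` for every atom (O(N^2), no periodic images)."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return ((dist < cutoff) & (dist > 0.0)).sum(axis=1)

def build_crystal(basis, a, n):
    """Replicate a fractional-coordinate basis on an n x n x n grid of cubic cells."""
    cells = np.array([[i, j, k] for i in range(n) for j in range(n) for k in range(n)])
    return a * (cells[:, None, :] + basis[None, :, :]).reshape(-1, 3)

def central_coordination(positions, cutoff):
    """Coordination number of the atom closest to the geometric center of the cluster."""
    center = positions.mean(axis=0)
    i = np.argmin(np.linalg.norm(positions - center, axis=1))
    return int(coordination_numbers(positions, cutoff)[i])

# Conventional-cell bases in fractional coordinates (two interpenetrating fcc sublattices).
fcc = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]])
zinc_blende = np.vstack([fcc, fcc + 0.25])           # second sublattice shifted by (1/4, 1/4, 1/4)
rocksalt = np.vstack([fcc, fcc + [0.5, 0.0, 0.0]])   # second sublattice shifted by (1/2, 0, 0)

zb = build_crystal(zinc_blende, a=5.65, n=3)  # lattice constant in angstrom
rs = build_crystal(rocksalt, a=5.30, n=3)     # assumed value, for illustration only

print("zinc-blende interior atom:", central_coordination(zb, cutoff=3.0))
print("rocksalt interior atom:   ", central_coordination(rs, cutoff=3.0))
```

Applying the same count shell by shell, from the surface of a nanocrystal inwards, would give the kind of radial picture of the transformation shown in Fig. 30. A production analysis would use a cell list instead of the all-pairs distance matrix.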
3.9. Nanoindentation of Silicon Nitride
Nanoindentation testing is a unique probe of mechanical properties of materials. Typically, an atomic force microscope tip is modified to indent the surface of a very thin film, see Fig. 31. The resulting damage is used to rank the ability of the material to withstand plastic damage against that of other materials. In addition, a load-displacement curve is constructed from the measured force at each displacement, and the elastic modulus in the direction of the indent can be measured from the initial part of the unloading curve. Commercial nanoindenting apparatus typically have a force resolution of ±75 nN and
Figure 31. (a) Schematic of an AFM modified for nanoindentation experiments, showing the cantilever arm, the indenter tip, the thin film coating on the substrate, and the damage caused by the indenter. (b) and (c) Schematic views of the indenter/substrate system. In our MD simulations the substrate has dimensions 60.6 × 60.6 × 30 nm$^3$ and contains 10,614,240 atoms. The x- and y-axes are normal to the $(10\bar{1}0)$ and $(\bar{1}2\bar{1}0)$ surfaces, respectively. The indent was done into the (0001) surface.
depth resolution of ±0.1 nm. Recent developments in parallel computing and multiscale algorithms have enabled MD simulations to reach the scale of such commercial nanoindenters, resulting in a better atomic-level understanding of the indentation process.
Figure 32. Local pressure distribution directly under the indenter (color scale: pressure in GPa, from −5 to 20). Frames from the loading and unloading cycles are shown in clockwise order. The displacement of the indenter (between 0 and 80 Å) is given in the top left corner of each frame.
We have performed MD simulations to investigate nanoindentation in Si3N4 [11, 12]. The nanoindentation simulation is performed on the (0001) surface of a 60 nm × 60 nm × 30 nm crystalline α-Si3N4 slab consisting of 10 million atoms (see Fig. 31). The sample is indented using a pyramid indenter with a load of ∼10 µN and an indentation depth of ∼10 nm (see Figs. 32 and 33). From the load-displacement curve, the hardness is estimated to be 50.3 GPa (see Fig. 34). (We have also calculated the hardness of amorphous Si3N4 to be 31.5 GPa using a similar geometry.) Our simulations reveal significant plastic deformation and pressure-induced amorphization under the indenter. The simulations also exhibit anisotropic fracture toughness: indentation cracks are observed along the $[\bar{1}2\bar{1}0]$ direction, which coincides with one of the diagonal directions of the indenter, but not along the other diagonal direction, $[1\bar{1}00]$. Simulations were also performed to determine temperature effects, load-rate effects, and simulation-size effects in crystalline and amorphous silicon nitride. The simulations were run on several different parallel platforms, including the 256- and 128-node IBM SPs at the US Army Engineer Research and Development Center (ERDC), the 1088-node Cray T3E at NAVO, and the 512-node Origin 2000 at the Aeronautical Systems Center (ASC). Recently we have completed nanoindentation simulations of SiC [15] and GaAs. MD simulations of nanoindentation of alumina are in progress.
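A hardness value is commonly obtained from a load-displacement curve as the peak load divided by the projected contact area, H = P_max / A. The sketch below illustrates that arithmetic only. The ideal square-pyramid geometry, the 35° half-angle, and the function names are assumptions made for the example; the text does not specify the indenter angle or the exact procedure used to obtain the quoted hardness.

```python
import math

def projected_area_square_pyramid(depth_nm, half_angle_deg):
    """Projected contact area (nm^2) of an ideal square pyramid at a given depth.

    half_angle_deg is the angle between the pyramid axis and a face (assumed value).
    """
    half_side = depth_nm * math.tan(math.radians(half_angle_deg))
    return (2.0 * half_side) ** 2

def hardness_gpa(load_uN, area_nm2):
    """Hardness H = P_max / A in GPa.

    Unit conversion: 1 uN / 1 nm^2 = 1e-6 N / 1e-18 m^2 = 1e12 Pa = 1000 GPa.
    """
    return 1000.0 * load_uN / area_nm2

# Hypothetical inputs of the same order as the load and depth quoted in the text.
area = projected_area_square_pyramid(depth_nm=10.0, half_angle_deg=35.0)
print(f"projected area = {area:.1f} nm^2")
print(f"hardness       = {hardness_gpa(10.0, area):.1f} GPa")
```

In experimental practice the contact depth, rather than the total displacement, is normally used to evaluate the area, so this estimate should be read as an order-of-magnitude check.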
3.10. Oxidation of Aluminum Nanoparticles
Oxidation plays a critical role in the performance and durability of various nanosystems. Oxidation of metallic nanoparticles offers an interesting possibility of synthesizing nanocomposites with both metallic and ceramic
Figure 33. Time sequence of an indented surface of α-Si3N4.
Figure 34. Load-displacement curves for (left) the 10 million-atom α-Si3N4 nanoindentation simulation and (right) the 10 million-atom amorphous Si3N4 simulation.
properties. We have performed the first successful MD simulation of oxidation of an Al nanoparticle (diameter 200 Å) [35]. The MD simulations are based on an interaction scheme developed by Streitz and Mintmire, which can successfully describe a wide range of physical properties of both metallic and ceramic systems [32]. This scheme is capable of treating bond formation and bond breakage, as well as changes in charge transfer as the atoms move and their local environments are altered. The MD simulations provide a detailed picture of the rapid evolution and culmination of the surface oxide thickness, local stresses, and atomic diffusivities, see Fig. 35. In the first 5 ps, oxygen molecules dissociate and the oxygen atoms diffuse first into octahedral and subsequently into tetrahedral sites in the Al nanoparticle. In the next 20 ps, as the oxygen atoms diffuse radially into and the Al atoms diffuse radially out of the nanoparticle, the fraction of six-fold coordinated oxygen atoms drops dramatically. Concurrently, there is a significant increase in the number of O atoms forming clusters of corner-sharing and edge-sharing OAl4 tetrahedra. Between 30 and 35 ps, clusters of OAl4 coalesce to form a neutral, percolating tetrahedral network that impedes further intrusion of oxygen atoms into and of Al atoms out of the nanoparticle. The electrostatic and non-electrostatic contributions to the local pressure in the nanocluster after 100 ps of simulation time are shown in Figs. 36(a) and (b), respectively. A stable oxide scale formed at the end of our simulation is shown in Fig. 37. Structural analysis reveals a 40 Å thick amorphous oxide scale on the Al nanoparticle. The thickness and structure of the oxide scale are in accordance with experimental results.
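Variable-charge schemes of this kind determine the atomic charges at each configuration by minimizing an electrostatic energy subject to overall charge neutrality. The sketch below shows a generic electronegativity-equalization step for a quadratic energy, E(q) = Σ χ_i q_i + ½ Σ J_ij q_i q_j with Σ q_i = 0, solved through a Lagrange multiplier. It is not the Streitz-Mintmire functional itself, whose interaction terms are built from atomic charge densities. The electronegativities χ, the interaction matrix J, and the function name are hypothetical.

```python
import numpy as np

def equilibrate_charges(chi, J):
    """Minimize E(q) = chi.q + 0.5 q^T J q subject to sum(q) = 0.

    Stationarity of the Lagrangian gives J q + chi = mu * 1, where mu is the
    Lagrange multiplier (the common chemical potential of all atoms).
    """
    n = len(chi)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = J      # J q
    A[:n, n] = -1.0    # - mu
    A[n, :n] = 1.0     # neutrality constraint: sum(q) = 0
    b = np.concatenate([-np.asarray(chi, dtype=float), [0.0]])
    solution = np.linalg.solve(A, b)
    return solution[:n], solution[n]

# Hypothetical two-atom example (arbitrary energy units): atom 1 is more electronegative.
chi = [0.0, 5.0]
J = [[10.0, 3.0], [3.0, 12.0]]
q, mu = equilibrate_charges(chi, J)
print("charges:", q, "chemical potential:", mu)
```

In this toy case the more electronegative atom acquires the negative charge. For large systems the dense solve would be replaced by an iterative method, such as the conjugate-gradient approach of [34].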
Figure 35. Initial stages of oxidation of an Al nanoparticle. Size distributions of OAl4 clusters between 20 and 31 ps are shown. The clusters coalesce and percolate rapidly.
The MD simulations provide a detailed picture of the rapid evolution and culmination of the surface oxide thickness, local stresses, and atomic diffusivities. Clusters of OAl4 coalesce to form a neutral, percolating tetrahedral network that impedes further intrusion of oxygen atoms into and of Al atoms out of the nanoparticle. As a result, a stable oxide scale is formed. Structural analysis reveals a 40 Å thick amorphous oxide scale on the Al nanoparticle, see Fig. 37. The thickness and structure of the oxide scale are in accordance with experimental results.
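Coalescence of OAl4 clusters, as summarized in the size distributions of Fig. 35, can be monitored by grouping connected tetrahedra into clusters and recording the cluster sizes at each time. A minimal union-find sketch of that bookkeeping follows. The list of bonds (pairs of tetrahedra that share a corner or an edge) is hypothetical input, and the function names are illustrative rather than taken from the simulation code.

```python
from collections import Counter

def find(parent, i):
    """Return the root of element i, shortening the path along the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i

def cluster_sizes(n_units, bonds):
    """Sizes of connected clusters, largest first, for n_units nodes joined by `bonds`."""
    parent = list(range(n_units))
    for a, b in bonds:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters
    counts = Counter(find(parent, i) for i in range(n_units))
    return sorted(counts.values(), reverse=True)

# Hypothetical example: six tetrahedra, three connections.
sizes = cluster_sizes(6, [(0, 1), (1, 2), (3, 4)])
print(sizes)
```

A sharp growth of the largest cluster relative to the total number of tetrahedra is one simple indicator that a percolating network has formed.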
3.11. Ceramic Fiber Composites
Physical properties of composite materials often exhibit synergistic enhancement. For example, the fracture toughness of a fiber composite is much larger than a linear combination of the toughness values of the constituent
Figure 36. (a) Electrostatic and (b) nonelectrostatic contributions to the local pressure in the nanocluster after 100 ps of simulation time.
materials. This enhanced toughness has been attributed to the frictional work associated with the pullout of fibers, which suggests that tough composites can be designed by combining strong fibers with weak fiber-matrix interfaces. Recently we have performed MD simulations (Fig. 38) to investigate the atomistic toughening mechanisms in a Si3N4 ceramic matrix (bulk modulus 285 GPa) reinforced with SiC fibers (bulk modulus 220 GPa, 16 vol. % fibers) coated with amorphous silica (bulk modulus 36 GPa) [73]. The simulations involving 1.5 billion atoms were performed on DoD parallel supercomputers. Fiber reinforcement is found to increase the fracture toughness by a factor of two. The atomic-stress distribution shows an enhancement of shear stresses at the interfaces. The enhanced toughness results from frictional work during the pullout of the fibers. Immersive visualization of these simulations reveals a rich diversity of atomistic processes including fiber rupture and emission of molecular fragments, which must be taken into account in the design of tough ceramic composites.
3.12. Environmental Effects on Fracture
The hybrid MD/QM simulation scheme was applied to study the effects of environmental molecules on fracture initiation in silicon, see Fig. 39 [43]. A (110) crack under tension (mode-I opening) is simulated with multiple H2O
Figure 37. Snapshot of the Al nanocluster after 0.5 ns of simulation time. (A quarter of the system is cut out to show the aluminum/aluminum-oxide interface.) The larger spheres correspond to oxygen and smaller spheres to aluminum; color represents the charge on an atom.
Figure 38. (Left panel) Fractured silicon nitride ceramic reinforced with silica-coated silicon carbide fibers. (Right panel) Close-up of the fractured composite system. Small spheres represent silicon atoms and large spheres represent nitrogen, carbon, and oxygen atoms.
Figure 39. Schematic of three types of reaction processes, (a) chemisorption, (b) oxidation, and (c) bond breakage, found in the MD/QM simulation with $K = 0.4$ and $0.5\ \mathrm{MPa}\sqrt{\mathrm{m}}$.
molecules around the crack front. Electronic structure near the crack front is calculated with density functional theory. To accurately model the long-range stress field, the quantum-mechanical description is embedded in a large classical molecular dynamics simulation. The hybrid simulation results show that the reaction of H2O molecules at a silicon crack tip is sensitive to the stress intensity factor K. For $K = 0.4\ \mathrm{MPa}\sqrt{\mathrm{m}}$, an H2O molecule either decomposes and adheres to dangling-bond sites on the crack surface or oxidizes Si, resulting in the formation of a Si–O–Si structure. For a higher K value, $0.5\ \mathrm{MPa}\sqrt{\mathrm{m}}$, an H2O molecule either oxidizes or breaks a Si–Si bond.
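To give a feel for the loading these K values represent, the textbook linear-elastic relation for a mode-I crack, K = Y σ √(π a), can be evaluated directly. This relation and the numbers below are not taken from the simulations described here: the geometry factor Y = 1 (through crack in an infinite plate) and the 10 nm crack half-length are assumptions chosen only to illustrate the unit handling.

```python
import math

def stress_intensity_mpa_sqrt_m(stress_mpa, crack_length_m, geometry_factor=1.0):
    """Mode-I stress intensity factor K = Y * sigma * sqrt(pi * a), in MPa*sqrt(m)."""
    return geometry_factor * stress_mpa * math.sqrt(math.pi * crack_length_m)

def stress_for_target_k(k_target, crack_length_m, geometry_factor=1.0):
    """Remote tensile stress (MPa) needed to reach a target K for a given crack length."""
    return k_target / (geometry_factor * math.sqrt(math.pi * crack_length_m))

# Hypothetical crack half-length of 10 nm (1e-8 m); K values as quoted in the text.
for k in (0.4, 0.5):
    sigma = stress_for_target_k(k, crack_length_m=1e-8)
    print(f"K = {k} MPa*sqrt(m)  ->  sigma = {sigma / 1000.0:.2f} GPa")
```

For nanometer-scale cracks the required remote stress is in the GPa range, which is why the long-range stress field has to be carried by the surrounding classical MD region.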
3.13. Conclusion and Future Research
Current multi-teraflop parallel supercomputers (performing trillions of floating-point operations per second) enable large-scale MD simulations involving up to a billion atoms [5]. Petaflop computers (performing $10^{15}$ floating-point
operations per second) anticipated to be built in the next 5–10 years are expected to enable trillion-atom MD simulations. In the same time frame, metacomputing on a Grid of geographically distributed supercomputers, mass storage, and virtual environment connected via high-speed networks will revolutionize computational research by enabling (i) very large-scale computations that are beyond the power of a single supercomputer, and (ii) collaborative, hybrid computations that integrate distributed, multiple expertise [49]. A multidisciplinary application that will soon require Grid-level computing is emerging at the forefront of computational science and engineering. We have recently developed such a multiscale simulation approach which seamlessly combines continuum mechanics based on the FE method, MD simulations to describe atomistic processes, and QM calculations based on the DFT to handle breakage and formation of atomic bonds [37]. These emerging new computer architectures, together with further developments in scalable simulation algorithms and parallel computing frameworks, will be critical for the advancement of modeling and simulation research. Some of the most exciting and challenging opportunities in simulation research lie at the nano-bio interface. The following illustrates several nanoscale systems that will be amenable to atomistic simulations in the near future.
3.14. Chemically Synthesized Quantum Dot Structures
3.14.1. Quantum rods and tetrapods
Self-organized assembly of colloidal nanocrystals acts as an intelligent photonic-crystal material, which can be used as sensors and optical switches. Rod-shaped nanocrystals synthesized by Paul Alivisatos’ group at Berkeley [74] (Fig. 40, left) emit polarized light and will be useful for biological tagging
Figure 40. Transmission electron micrographs of CdSe nanocrystal quantum rods (left) and tetrapod (right) (from Paul Alivisatos’s Group, University of California, Berkeley).
applications. The most recent additions to this family of nanocrystals include tetrapods (Fig. 40, right). Such systematically-controlled anisotropic shapes can be used as building blocks for self-assembly of three-dimensionally integrated nanostructures through surface-stress encoded epitaxy, self-alignment, and biological templates [75].
3.14.2. Core-shell nanoparticles
The nanocrystals mentioned above are often coated with heterogeneous materials to form so-called “core-shell” structures [76]. In semiconductor nanocrystals, the core-shell structures achieve better quantum confinement and enhanced luminescence quantum yield compared with their monolithic counterparts. Semiconductor nanocrystals can be passivated by both epitaxially grown heterogeneous semiconductor layers and disordered oxides to improve carrier confinement and enhance optical diffraction. Furthermore, these semiconductor quantum dots can be coated with conducting layers, enabling intercluster charge transfer to achieve tunable electrophotonic properties. Although core-shell quantum dots such as CdSe/CdS and CdSe/ZnS achieve higher luminescence quantum yields than their monolithic counterparts, the large lattice mismatch (the CdSe lattice constant is 4.0% and 12.7% larger than that of CdS and ZnS, respectively) causes residual stresses and mechanical instabilities such as cracking. From a geometrical consideration, materials with larger lattice constants are preferable as a shell (such as an InAs shell with a 6.6% larger lattice constant than that of a GaAs core). Molecular-dynamics simulations will be useful to study residual stresses and cracking in lattice-mismatched core-shell quantum dots.
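The mismatch figures quoted for the CdSe-based systems follow from a simple ratio of lattice constants. The sketch below shows the arithmetic. The lattice constants are approximate wurtzite a-axis values supplied here as assumptions (they are not given in the text), so the results should only be expected to land near the quoted percentages.

```python
def mismatch_percent(a_core, a_shell):
    """Percentage by which the core lattice constant exceeds that of the shell."""
    return 100.0 * (a_core - a_shell) / a_shell

# Approximate wurtzite a-axis lattice constants in angstrom (assumed values).
lattice_a = {"CdSe": 4.30, "CdS": 4.136, "ZnS": 3.82}

for shell in ("CdS", "ZnS"):
    m = mismatch_percent(lattice_a["CdSe"], lattice_a[shell])
    print(f"CdSe core / {shell} shell: core lattice constant larger by {m:.1f}%")
```

A positive value means the shell is stretched to accommodate the core, which is the situation in which residual tensile stress and cracking of the shell become a concern.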
3.14.3. Protein-based nanostructures
Self-assembled protein structures from extremophiles (microbes such as bacteria and archaea living in extreme environments), combined with advanced genetic-engineering techniques, offer tremendous opportunities for nanotechnology. Recently, double-ring structures composed of heat shock proteins isolated from hyperthermophilic archaea have been used as building blocks for synthesizing a wide variety of self-assembled nanostructures such as nanotubes and two-dimensional superlattices (Fig. 41). Protein nanostructures have potential applications as templates for self-assembled optoelectronic devices and as biocompatible coatings [75]. Since each protein consists of $10^4$–$10^5$ atoms, atomistic simulations of these protein-based nanostructures will be a challenge, requiring the Petaflop and Grid architectures.
Figure 41. (Left) Sliced top view revealing the ring structure in the thermosome from Thermoplasma acidophilum, a heat shock protein 60 in organisms (thermophiles) living at high temperatures. (Center) Aggregates of chaperonin filaments synthesized by Jonathan Trent’s group at NASA Ames. (Right) A two-dimensional superlattice of chaperonins synthesized by Jonathan Trent’s group at NASA Ames.
On Petaflop machines, due to be available in the 2010 time frame, it should be possible to simulate in its entirety the three-dimensional nanostructure built from nanoscale rods, tetrapods, and core-shell nanoparticles on an ordered array of proteins.
Acknowledgments
This work is partially supported by AFOSR, ARO, DARPA, DOE, NSF, and USC-Berkeley-Princeton-LSU DURINT. Simulations involving a few million atoms were performed using the in-house parallel computers at the Collaboratory for Advanced Computing and Simulations at the University of Southern California. Simulations involving ten million to a billion atoms were performed using parallel computers at the High Performance Computing Center at the University of Southern California and at the Department of Defense’s Major Shared Resource Centers under a DoD Challenge project.
References
[1] A. Pechenik, R.K. Kalia, and P. Vashishta, Computer-Aided Design of High-Temperature Materials, Oxford University Press, Oxford, UK, 1999.
[2] A. Nakano, M.E. Bachlechner, R.K. Kalia, E. Lidorikis, P. Vashishta, G.Z. Voyiadjis, T.J. Campbell, S. Ogata, and F. Shimojo, “Multiscale simulation of nanosystems,” Comput. Sci. Engrg., 3(4), 56–66, 2001.
[3] F.F. Abraham, R. Walkup, H.J. Gao, M. Duchaineau, T.D. De la Rubia, and M. Seager, “Simulating materials failure by using up to one billion atoms and the world’s fastest computer: Brittle fracture,” Proc. Nat. Acad. Sci. USA, 99, 5777–5782, 2002.
[4] T.C. Germann and P.S. Lomdahl, “Recent advances in large-scale atomistic materials simulations,” IEEE Comput. Sci. Eng., 1(2), 10, 1999.
[5] A. Nakano, R.K. Kalia, P. Vashishta, T.J. Campbell, S. Ogata, F. Shimojo, and S. Saini, “Scalable atomistic simulation algorithms for materials research,” Sci. Progr., 10, 263, 2002.
[6] A. Sharma, A. Nakano, R.K. Kalia, P. Vashishta, S. Kodiyalam, P. Miller, W. Zhao, X.L. Liu, T.J. Campbell, and A. Haas, “Immersive and interactive exploration of billion-atom systems,” Presence-Teleoper. Vir. Environ., 12, 85–95, 2003.
[7] P. Vashishta, R.K. Kalia, J.P. Rino, and I. Ebbsjö, “Interaction potential for SiO2 – a molecular-dynamics study of structural correlations,” Phys. Rev. B, 41, 12197–12209, 1990.
[8] T. Campbell, R.K. Kalia, A. Nakano, F. Shimojo, K. Tsuruta, P. Vashishta, and S. Ogata, “Structural correlations and mechanical behavior in nanophase silica glasses,” Phys. Rev. Lett., 82, 4018–4021, 1999.
[9] P. Vashishta, R.K. Kalia, and I. Ebbsjö, “Low-energy floppy modes in high-temperature ceramics,” Phys. Rev. Lett., 75, 858–861, 1995.
[10] A. Nakano, R.K. Kalia, and P. Vashishta, “Dynamics and morphology of brittle cracks – a molecular-dynamics study of silicon-nitride,” Phys. Rev. Lett., 75, 3138–3141, 1995.
[11] P. Walsh, R.K. Kalia, A. Nakano, P. Vashishta, and S. Saini, “Amorphization and anisotropic fracture dynamics during nanoindentation of silicon nitride: a multimillion atom molecular dynamics study,” Appl. Phys. Lett., 77, 4332–4334, 2000.
[12] P. Walsh, W. Li, R.K. Kalia, A. Nakano, P. Vashishta, and S. Saini, “Structural transformation, amorphization, and fracture in nanowires: a multimillion-atom molecular dynamics study,” Appl. Phys. Lett., 78, 3328–3330, 2001.
[13] A. Chatterjee, R.K. Kalia, A. Nakano, A. Omeltchenko, K. Tsuruta, P. Vashishta, C.K. Loong, M. Winterer, and S. Klein, “Sintering, structure, and mechanical properties of nanophase SiC: a molecular-dynamics and neutron scattering study,” Appl. Phys. Lett., 77, 1132–1134, 2000.
[14] F. Shimojo, I. Ebbsjö, R.K. Kalia, A. Nakano, J.P. Rino, and P. Vashishta, “Molecular dynamics simulation of structural transformation in silicon carbide under pressure,” Phys. Rev. Lett., 84, 3338–3341, 2000.
[15] I. Szlufarska, R.K. Kalia, A. Nakano, and P. Vashishta, “Nanoindentation-induced amorphization in silicon carbide,” Appl. Phys. Lett., 85, 378–380, 2004.
[16] J.P. Rino, I. Ebbsjö, P.S. Branicio, R.K. Kalia, A. Nakano, and P. Vashishta, “Short- and intermediate-range structural correlations in amorphous silicon carbide (a-SiC): a molecular dynamics study,” Phys. Rev. B, 70, 045207, 2004.
[17] X.T. Su, R.K. Kalia, A. Nakano, P. Vashishta, and A. Madhukar, “Critical lateral size for stress domain formation in InAs/GaAs square nanomesas: a multimillion-atom molecular dynamics study,” Appl. Phys. Lett., 79, 4577–4579, 2001.
[18] X.T. Su, R.K. Kalia, A. Nakano, P. Vashishta, and A. Madhukar, “Million-atom molecular dynamics simulation of flat InAs overlayers with self-limiting thickness on GaAs square nanomesas,” Appl. Phys. Lett., 78, 3717–3719, 2001.
[19] P.S. Branicio, R.K. Kalia, A. Nakano, J.P. Rino, F. Shimojo, and P. Vashishta, “Structural, mechanical, and vibrational properties of Ga1-xInxAs alloys: a molecular dynamics study,” Appl. Phys. Lett., 82, 1057–1059, 2003.
[20] P.S. Branicio, J.P. Rino, F. Shimojo, R.K. Kalia, A. Nakano, and P. Vashishta, “Molecular dynamics study of structural, mechanical, and vibrational properties of crystalline and amorphous Ga1-xInxAs alloys,” J. Appl. Phys., 94, 3840–3848, 2003.
[21] A. Nakano, M.E. Bachlechner, P. Branicio, T.J. Campbell, I. Ebbsjö, R.K. Kalia, A. Madhukar, S. Ogata, A. Omeltchenko, J.P. Rino, F. Shimojo, P. Walsh, and P. Vashishta, “Large-scale atomistic modeling of nanoelectronic structures,” IEEE T. Electron Dev., 47, 1804–1810, 2000.
[22] A. Nakano, R.K. Kalia, and P. Vashishta, “First sharp diffraction peak and intermediate-range order in amorphous silica – finite-size effects in molecular-dynamics simulations,” J. Non-Crystall. Sol., 171, 157–163, 1994.
[23] L. Greengard and V. Rokhlin, “A fast algorithm for particle simulations,” J. Comput. Phys., 73, 325, 1987.
[24] A. Nakano, R.K. Kalia, and P. Vashishta, “Multiresolution molecular-dynamics algorithm for realistic materials modeling on parallel computers,” Comput. Phys. Commun., 83, 197–214, 1994.
[25] S. Ogata, T.J. Campbell, R.K. Kalia, A. Nakano, P. Vashishta, and S. Vemparala, “Scalable and portable implementation of the fast multipole method on parallel computers,” Comput. Phys. Commun., 153, 445–461, 2003.
[26] T. Darden, D. York, and L. Pedersen, “Particle mesh Ewald: an N·log(N) method for Ewald sums in large systems,” J. Chem. Phys., 98, 10089, 1993.
[27] G.J. Martyna, M.E. Tuckerman, D.J. Tobias, and M.L. Klein, “Explicit reversible integrators for extended systems dynamics,” J. Chem. Phys., 101, 4177, 1994.
[28] A. Nakano, “Fuzzy clustering approach to hierarchical molecular-dynamics simulation of multiscale materials phenomena,” Comput. Phys. Commun., 105, 139, 1997.
[29] A. Nakano and T.J. Campbell, “An adaptive curvilinear-coordinate approach to dynamic load balancing of parallel multiresolution molecular dynamics,” Parallel Comput., 23, 1461, 1997.
[30] A. Nakano, “Multiresolution load balancing in curved space: the wavelet representation,” Concurrency: Prac. Exper., 11, 343, 1999.
[31] A.K. Rappe and W.A. Goddard, “Charge equilibration for molecular-dynamics simulations,” J. Phys. Chem., 95, 3358–3363, 1991.
[32] F.H. Streitz and J.W. Mintmire, “Electrostatic potentials for metal-oxide surfaces and interfaces,” Phys. Rev. B, 50, 11996, 1994.
[33] A.C.T. van Duin, S. Dasgupta, F. Lorant, and W.A. Goddard, “ReaxFF: a reactive force field for hydrocarbons,” J. Phys. Chem. A, 105, 9396–9409, 2001.
[34] A. Nakano, “Parallel multilevel preconditioned conjugate-gradient approach to variable-charge molecular dynamics,” Comput. Phys. Commun., 104, 59, 1997.
[35] T. Campbell, R.K. Kalia, A. Nakano, P. Vashishta, S. Ogata, and S. Rodgers, “Dynamics of oxidation of aluminum nanoclusters using variable charge molecular-dynamics simulations on parallel computers,” Phys. Rev. Lett., 82, 4866–4869, 1999.
[36] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, “Concurrent coupling of length scales: methodology and application,” Phys. Rev. B, 60, 2391–2403, 1999.
[37] S. Ogata, E. Lidorikis, F. Shimojo, A. Nakano, P. Vashishta, and R.K. Kalia, “Hybrid finite-element/molecular-dynamics/electronic-density-functional approach to materials simulations on parallel computers,” Comput. Phys. Commun., 138, 143–154, 2001.
[38] E. Lidorikis, M.E. Bachlechner, R.K. Kalia, A. Nakano, P. Vashishta, and G.Z. Voyiadjis, “Coupling length scales for multiscale atomistics-continuum simulations: atomistically induced stress distributions in Si/Si3N4 nanopixels,” Phys. Rev. Lett., 87, 086104, 2001.
[39] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas,” Phys. Rev., 136, 864, 1964.
[40] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev., 140, 1133, 1965.
[41] W. Kohn and P. Vashishta, “General density functional theory,” In: N.H. March and S. Lundquist (eds.), Inhomogeneous Electron Gas, Plenum, 79, 1983.
[42] N. Troullier and J.L. Martins, “Efficient pseudopotentials for plane-wave calculations. 2. Operators for fast iterative diagonalization,” Phys. Rev. B, 43, 8861–8869, 1991.
[43] S. Ogata, F. Shimojo, R.K. Kalia, A. Nakano, and P. Vashishta, “Environmental effects of H2O on fracture initiation in silicon: a hybrid electronic-density-functional/molecular-dynamics study,” J. Appl. Phys., 95, 5316–5323, 2004.
[44] J.R. Chelikowsky, Y. Saad, S. Öğüt, I. Vasiliev, and A. Stathopoulos, “Electronic structure methods for predicting the properties of materials: grids in space,” Phys. Stat. Sol. (b), 217, 173, 2000.
[45] S. Ogata, F. Shimojo, R.K. Kalia, A. Nakano, and P. Vashishta, “Hybrid quantum mechanical/molecular dynamics simulation on parallel computers: density functional theory on real-space multigrids,” Comput. Phys. Commun., 149, 30–38, 2002.
[46] F. Shimojo, R.K. Kalia, A. Nakano, and P. Vashishta, “Linear-scaling density-functional-theory calculations of electronic structure based on real-space grids: design, analysis, and scalability test of parallel algorithms,” Comput. Phys. Commun., 140, 303–314, 2001.
[47] J.-L. Fattebert and J. Bernholc, “Towards grid-based O(N) density-functional theory methods: optimized nonorthogonal orbitals and multigrid acceleration,” Phys. Rev. B, 62, 1713, 2000.
[48] S. Dapprich, I. Komáromi, K.S. Byun, K. Morokuma, and M.J. Frisch, “A new ONIOM implementation in Gaussian 98. I. The calculation of energies, gradients, vibrational frequencies, and electric field derivatives,” J. Mol. Struct. (Theochem.), 461–462, 1, 1999.
[49] I. Foster and C. Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, San Francisco, 2003.
[50] H. Kikuchi, R.K. Kalia, A. Nakano, P. Vashishta, H. Iyetomi, S. Ogata, T. Kouno, F. Shimojo, K. Tsuruta, and S. Saini, “Collaborative simulation Grid: multiscale quantum-mechanical/classical atomistic simulations on distributed PC clusters in the US and Japan,” Proc. Supercomputing ’02, IEEE, 2002.
[51] A. Omeltchenko, T.J. Campbell, R.K. Kalia, X.L. Liu, A. Nakano, and P. Vashishta, “Scalable I/O of large-scale molecular dynamics simulations: a data-compression algorithm,” Comput. Phys. Commun., 131, 78–85, 2000.
[52] A. Sharma, R.K. Kalia, A. Nakano, and P. Vashishta, “Large multidimensional data visualization for materials science,” Comput. Sci. Engrg., 5(2), 26–33, 2003.
[53] J.P. Rino, I. Ebbsjö, R.K. Kalia, A. Nakano, and P. Vashishta, “Structure of Rings in Vitreous SiO2,” Phys. Rev. B, 47, 3053–3062, 1993.
[54] D.J. Jacobs and M.F. Thorpe, “Generic rigidity percolation – the pebble game,” Phys. Rev. Lett., 75, 4051–4054, 1995.
[55] S. Kodiyalam, R.K. Kalia, H. Kikuchi, A. Nakano, F. Shimojo, and P. Vashishta, “Grain boundaries in gallium arsenide nanocrystals under pressure: a parallel molecular-dynamics study,” Phys. Rev. Lett., 86, 55–58, 2001.
[56] A. Nakano, R.K. Kalia, and P. Vashishta, “Scalable molecular-dynamics, visualization, and data-management algorithms for materials simulations,” Comput. Sci. Engrg., 1, 39–47, 1999.
[57] K. Tsuruta, A. Omeltchenko, R.K. Kalia, and P. Vashishta, “Early stages of sintering of silicon nitride nanoclusters: a molecular-dynamics study on parallel machines,” Europhys. Lett., 33, 441–446, 1996.
[58] H. Gleiter, “Materials with ultrafine microstructures: retrospectives and perspectives,” Nanostruct. Mater., 1, 1, 1992.
[59] R.W. Siegel, “Creating nanophase materials,” Sci. Amer., December, 74, 1996.
[60] R.K. Kalia, A. Nakano, K. Tsuruta, and P. Vashishta, “Morphology of pores and interfaces and mechanical behavior of nanocluster-assembled silicon nitride ceramic,” Phys. Rev. Lett., 78, 689–692, 1997.
[61] R.K. Kalia, A. Nakano, A. Omeltchenko, K. Tsuruta, and P. Vashishta, “Role of ultrafine microstructures in dynamic fracture in nanophase silicon nitride,” Phys. Rev. Lett., 78, 2144–2147, 1997.
[62] K. Tsuruta, A. Nakano, R.K. Kalia, and P. Vashishta, “Dynamics of consolidation and crack growth in nanocluster-assembled amorphous silicon nitride,” J. Amer. Ceram. Soc., 81, 433–436, 1998.
[63] R.F. Pettifer, R. Dupree, I. Farnan, and U. Sternberg, “NMR determinations of Si–O–Si bond angle distributions in silica,” J. Non-Crystall. Sol., 106, 408–412, 1988.
[64] E. Celarie, S. Prades, D. Bonamy, L. Ferrero, E. Bouchaud, C. Guillot, and C. Marliere, “Glass breaks like metal, but at the nanometer scale,” Phys. Rev. Lett., 90, 075504, 2003.
[65] L.V. Brutzel, C.L. Rountree, R.K. Kalia, A. Nakano, and P. Vashishta, MRS Proc., 703, 3.9.1–3.9.6, 2001.
[66] P. Daguier, B. Nghiem, E. Bouchaud, and F. Creuzet, “Pinning and depinning of crack fronts in heterogeneous materials,” Phys. Rev. Lett., 78, 1062–1065, 1997.
[67] S.M. Hu, “Stress-related problems in silicon technology,” J. Appl. Phys., 70, R53–R80, 1991.
[68] S.C. Jain, H.E. Maes, K. Pinardi, and I. DeWolf, Appl. Phys. Rev., 79, 8145, 1996.
[69] F.H. Stillinger and T.A. Weber, “Computer-simulation of local order in condensed phases of silicon,” Phys. Rev. B, 31, 5262–5271, 1985.
[70] A. Omeltchenko, M.E. Bachlechner, A. Nakano, R.K. Kalia, P. Vashishta, I. Ebbsjö, A. Madhukar, and P. Messina, “Stress domains in Si(111)/Si3N4(0001) nanopixel – 10 million-atom molecular dynamics simulations on parallel computers,” Phys. Rev. Lett., 84, 318, 2000.
[71] M.E. Bachlechner, A. Omeltchenko, A. Nakano, R.K. Kalia, P. Vashishta, I. Ebbsjö, and A. Madhukar, “Dislocation emission at the silicon/silicon nitride interface: a million-atom molecular dynamics simulation on parallel computers,” Phys. Rev. Lett., 84, 322–325, 2000.
[72] G.L. Zhao and M.E. Bachlechner, “Electronic structure and charge transfer in alpha- and beta-Si3N4 and at the Si(111)/Si3N4(001) interface,” Phys. Rev. B, 58, 1887–1895, 1998.
[73] P. Vashishta, R.K. Kalia, and A. Nakano, “Large-scale atomistic simulations of dynamic fracture,” Comput. Sci. Engrg., 1(5), 56–65, 1999.
[74] X.G. Peng, L. Manna, W.D. Yang, J. Wickham, E. Scher, A. Kadavanich, and A.P. Alivisatos, “Shape control of CdSe nanocrystals,” Nature, 404, 59–61, 2000.
[75] R.A. McMillan, C.D. Paavola, J. Howard, S.L. Chan, N.J. Zaluzec, and J.D. Trent, “Ordered nanoparticle arrays formed on engineered chaperonin protein templates,” Nat. Mater., 1, 247–252, 2002.
[76] M.C. Schlamp, X.G. Peng, and A.P. Alivisatos, “Improved efficiencies in light emitting diodes made with CdSe(CdS) core/shell type nanocrystals and a semiconducting polymer,” J. Appl. Phys., 82, 5837–5842, 1997.
2.26 MODELING LIPID MEMBRANES
Christophe Chipot,1 Michael L. Klein,2 and Mounir Tarek1
1 Equipe de dynamique des assemblages membranaires, Unité mixte de recherche Cnrs/Uhp 7565, Institut nancéien de chimie moléculaire, Université Henri Poincaré, BP 239, 54506 Vandœuvre–lès–Nancy cedex, France
2 Center for Molecular Modeling, Chemistry Department, University of Pennsylvania, 231 South 34th Street, Philadelphia, PA 19104–6323, USA
1. Introduction
Membranes consist of an assembly of a wide variety of lipids [1], proteins and carbohydrates that self-organize to assume a host of biological functions in the cell machinery, such as the passive and active transport of matter, the capture and storage of energy, the control of the ionic balance, and intercellular recognition and signalling. In essence, membranes act as walls that separate the interior of the cell from the outside environment, preventing the free translocation of small molecules from one side to the other. At the atomic level, knowledge of both the structure and the dynamics of membranes remains to a large extent fragmentary, on account of the remarkable fluidity of these systems under physiological conditions. As a result, the amount of experimental information that can be interpreted directly in terms of positions and motions is still rather limited. A method that could provide the atomic detail of lipid bilayers, which is often inaccessible to conventional experimental techniques, would therefore be extremely valuable for improving our understanding of how membranes function. It would further constitute a bridge between observations at the macroscopic and the microscopic levels, and possibly reconcile the two views. Atomic simulations [2], in general, and molecular dynamics (MD) simulations, in particular, have proven to be an effective approach for investigating lipid aggregates, providing new insights into both the structure and the dynamics of these systems.
S. Yip (ed.), Handbook of Materials Modeling, 929–958. © 2005 Springer. Printed in the Netherlands.
Basic structural characteristics of the membrane are determined by the nature of the lipids, and by how the latter self-organize into complex three-dimensional arrangements, exposing their polar head groups to the aqueous environment while shielding the aliphatic chains, which form the hydrophobic core. Atomic simulations have developed over the past two decades to such an extent that it is now possible to model these structural features with the desired accuracy. Statistical simulations rely on models that have undeniably improved over the years, getting steadily closer to the chemical, physical and biological reality of the systems investigated. Yet, they remain models, subject to a host of underlying approximations. It is, therefore, necessary to compare the results of numerical simulations as systematically as possible with the available experimental data. Only when the models have proven sufficiently robust and reliable can they serve as an explanatory, possibly predictive tool, capable of (i) rationalizing experimental findings, (ii) providing additional insights into experimentally observed phenomena, and (iii) suggesting new experiments. In the particular case of water–lipid assemblies, there is a considerable wealth of experimental information that can potentially be used to support or contradict in silico studies, although a direct comparison often turns out to be rather cumbersome. Modeling biological membranes raises a number of difficulties that still have not found a satisfactory solution. A lipid bilayer is, in essence, a disordered liquid crystal of virtually infinite extent. Truncation of this system into a finite-size patch, to comply with the current limitations of molecular simulations, de facto eliminates significant ranges of the wavelength spectrum that correspond, for instance, to bending and splay motions. 
Current limitations in the available computational resources impose restrictions not only on the size of the system, but also on the time-scales explored. In silico experiments, like MD simulations, nonetheless represent a powerful tool that is able to offer new insights into the structural and dynamical properties of lipid bilayers. This chapter is aimed at introducing this field to non-specialists, while providing the necessary guidance for setting up and understanding statistical simulations of lipid–water assemblies, together with key references for further reading. Up-to-date comprehensive reviews on modeling membranes can be found elsewhere [3–10]. After outlining the properties that govern self-organization, and the type of structural information accessible from experiment, the methodologies utilized to model these systems are described. Next, examples of atomic simulations of lipid bilayers are presented, emphasizing how the results can be compared to experiment. Last, selected simulations of more complex membrane assemblies are described and discussed critically, with a glimpse into the future of this very promising research area.
2. Lipid–Water Assemblies
2.1. What Are the Factors That Determine the Morphology?
By and large, lipids and surfactants are amphipathic chemical species formed, roughly speaking, by a hydrophilic head group and a hydrophobic alkyl tail. Depending on the chemical species, this non-polar tail may consist of one or two aliphatic chains, either saturated or unsaturated. In the case of phospholipids, the head group usually consists of a phosphate group bonded to one of a variety of functional moieties, like a choline, an ethanolamine, a serine, or a glycerol fragment. Depending upon the type of fragment, the lipid is either charged – e.g., dimyristoylphosphatidylglycerol (DMPG) – or neutral – e.g., dimyristoylphosphatidylcholine (DMPC). At the so-called sn–3 position, the phosphate group is attached to a glycerol hydroxyl moiety, the two remaining hydroxyl moieties being connected to aliphatic chains by means of ester linkages at positions sn–1 and sn–2. At low concentrations, lipids or surfactants in an aqueous medium usually remain in a monomeric state. Beyond the critical micelle concentration (CMC), they self-assemble into a wide variety of three-dimensional structures that encompass micelles, inverse micelles, bilayers, hexagonal tubular phases and more complicated bicontinuous labyrinths (see Fig. 1). The nature of the lipid determines the morphology of the three-dimensional arrangement [11, 12]. In water, lipids aggregate in such a fashion that the polar head groups are hydrated adequately, while the alkyl chains are protected from exposure to the aqueous environment. As a consequence, the cross-sections of both the head group and the chains dictate the morphology of the resulting lipid–water assembly. For instance, lipids featuring a large head group and a single alkyl chain usually form direct micelles, whereas lipids characterized by a smaller head group and possibly two alkyl chains tend to self-organize into inverse micelles [1]. 
For lipids forming planar bilayer assemblies, the net charge borne by the head group plays a noteworthy role in the self-organization process. Small, charged head groups tend to associate by means of intermolecular hydrogen bonds, resulting in compact structures with a small surface area per lipid – e.g., dilauroylphosphatidylethanolamine (DLPE) [13]. In larger, zwitterionic lipids, like phosphatidylcholine, the lipid head groups are organized in inter- and intramolecular charge pairs between the oppositely charged choline and phosphate groups [14]. Aside from the nature of the lipid, external conditions, like the concentration, the temperature, the pressure or the ionic strength of the solvent, strongly influence self-organization into a particular structure. Intensive variables, such as the temperature, can be used to control the transition between phases. At low temperatures, lipid bilayers remain in the gel, Lβ, phase, wherein the alkyl chains, mostly in an all-trans conformation, are well ordered and exhibit a reduced mobility. At higher temperatures, the gel phase transforms into a liquid crystal, Lα, phase characterized by an increase of the surface area per lipid and a decrease of the thickness of the bilayer, as a direct consequence of the “melting” of the participating alkyl chains. The transition temperature depends on the chemical nature of the lipid. For instance, it increases with the length of the alkyl chains, but it decreases with the number of unsaturations. Most cell membranes in vivo exist in the fluid, liquid crystal phase, barring a few cases, e.g., the specialized membranes of the stratum corneum [15]. It is, therefore, not too surprising that, with the exception of a handful of simulations of lipid bilayers in the gel phase, most investigations have focused on the biologically relevant Lα phase.
Figure 1. Polymorphism of lipid–water assemblies. (a) The cross-sectional area of the head group is larger than that of the alkyl tail. In an aqueous environment, this species forms direct micelles, which further organize into hexagonal HI phases. (b) The cross-sectional area of the head group is smaller than that of the alkyl tail. The lipids form inverted micelles in water, which may aggregate into hexagonal HII phases. (c) The cross-sectional areas of the head group and the tail are comparable. The lipids assemble into planar bilayers, in the gel, Lβ, phase (left) or in the liquid crystal, Lα, phase (right).
2.2. Available Experimental Information
To date, neutron and x-ray diffraction experiments probably remain the most powerful tools for determining structures of lipid bilayers at atomic resolution [16–19]. Particularly pertinent information supplied by diffraction experiments is the set of density distributions [20], which can be deconvoluted in terms of atomic positions in the direction normal to the water–membrane interface, for different types of atoms. High-resolution x-ray diffraction experiments may offer additional, valuable information that can directly serve as a reference for computational studies. Such is the case of the surface area per lipid, which may be derived from gravimetric x-ray methods or from electron density profiles. It should be underlined, however, that the highly disordered nature of liquid crystal, Lα, phases, and their fluctuations, make the observation of such systems particularly difficult, and explain the large uncertainty in the values supplied by the literature [20]. Whereas x-ray and neutron diffraction on multi-layered samples have historically been a source of high-resolution structural information on model membranes, neutron reflectivity has provided unique data on single lipid bilayers in contact with bulk water. Scattering length density (SLD) profiles along the normal to a layered system are deduced from the information collected as a function of the scattering wave-vector transfer (Q). Recently it has been shown that it is also possible to invert the reflectivity spectra directly to obtain SLD profiles [21]. It is important to note, however, that only the total SLD profile is determined. For more complex systems, atomistic modeling can provide valuable insight into such structures, thereby complementing the experimental studies [22–24]. Nuclear magnetic resonance (NMR) techniques are also used extensively to probe the molecular organization in lipid membranes. 
Early ²H NMR experiments on oriented lipid matrices supplied lipid order parameters, against which the average orientational order along the acyl chains calculated from simulations could be compared. 
Today, thanks to the introduction of magic angle spinning (MAS) techniques, a very large number of parameters from lipid bilayers are available, providing a wealth of information on the conformation of all lipid segments [25]. X-ray and neutron scattering measurements, as well as NMR experiments, may also be used as a source of comparison for the dynamical properties obtained from MD simulations. As will be seen in what follows, the significant computational effort involved in atomic simulations of large lipid–water assemblies limits their length to times that are relatively short from a biological perspective. Short time-scale dynamics is nonetheless amenable to MD, and the data determined by this approach can be compared directly with scattering experiments [4, 26], and, for instance, with nuclear Overhauser enhancement spectroscopy (NOESY) cross-relaxation rates [27], which probe motions occurring over comparable time-scales.
3. Modeling Lipid Bilayers
In order to eliminate edge effects and to mimic a macroscopic system, simulations of lipid bilayers consider a small patch of lipid and water molecules confined in a central simulation cell, and replicate the latter using periodic boundary conditions (PBCs) in the three directions of Cartesian space, as is done in simulations of molecular liquids and crystals. In doing so, the simulated system corresponds to a small fragment of either a multi-lamellar liposome or a multi-lamellar oriented lipid stack, similar to those deposited on a substrate (see Fig. 2). The limited size of the simulated sample results in artefactual, symmetry-induced effects and makes it impossible to observe collective phenomena like bending or splay motions that occur over length-scales exceeding the size of the cell [20, 28, 29]. If needed, one may obtain a more biologically or physically meaningful picture, consistent with experimentally observed phenomena, by incorporating a large number of lipid and water molecules [30]. Even then, the length of the simulation constitutes another critical aspect in the modeling of lipid–water assemblies, essentially because a number of motions in lipid bilayers occur over time-scales exceeding 10 ns (see Fig. 2).
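The replication of the central cell by PBCs is commonly implemented through the minimum-image convention. The following is a minimal sketch, assuming an orthorhombic cell; the function names, the cell dimensions and the two atomic positions are hypothetical and chosen only for illustration.

```python
import numpy as np

def minimum_image(displacement, box_lengths):
    """Map displacement vectors onto their nearest periodic image.

    Assumes an orthorhombic cell with edge lengths `box_lengths`
    (a simplification; triclinic cells need a different treatment).
    """
    displacement = np.asarray(displacement, dtype=float)
    box_lengths = np.asarray(box_lengths, dtype=float)
    return displacement - box_lengths * np.round(displacement / box_lengths)

def wrap_positions(positions, box_lengths):
    """Fold coordinates back into the central cell, [0, L) along each axis."""
    positions = np.asarray(positions, dtype=float)
    box_lengths = np.asarray(box_lengths, dtype=float)
    return positions - box_lengths * np.floor(positions / box_lengths)

# Hypothetical 6 x 6 x 8 nm cell with two atoms close to opposite faces.
box = np.array([6.0, 6.0, 8.0])
r_a = np.array([0.2, 3.0, 7.9])
r_b = np.array([5.9, 3.0, 0.1])

d_ab = minimum_image(r_b - r_a, box)     # shortest vector from a to b
distance = float(np.linalg.norm(d_ab))   # much shorter than |r_b - r_a|
print(d_ab, distance)
```

Because of the replication, the two atoms interact through the cell faces rather than across the whole cell, which is why the patch behaves as a fragment of a multi-lamellar stack along the bilayer normal.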
3.1. Choice of the Thermodynamic Ensemble
From a technical perspective, the simplest thermodynamic ensemble for simulating lipid–water assemblies is undeniably the microcanonical, (N, V, E), ensemble, or possibly the canonical, (N, V, T), ensemble, wherein the temperature is controlled rigorously by means of a thermostat. In this event, the modeler may choose to fix the cross-sectional area per lipid unit to its experimental value and leave an appropriate head space of air in contact with the water lamellae, above and below the membrane. Whereas this protocol is adequate in the case of a simple, homogeneous lipid bilayer, one may legitimately wonder how it will perform when additives – e.g., from small solutes to large proteins – are introduced into the membrane or in its vicinity. A better adapted thermodynamic ensemble should then be employed to allow the participating lipid chains to relax in response to the modification of the surface tension imposed by the additive. A very tempting solution consists in turning to the isobaric–isothermal, (N, P, T), ensemble, which makes use of rigorous barostats and thermostats to maintain, respectively, the pressure and the temperature at the desired values. This raises, however, difficulties of its own. In a mixture of oil and water with a positive surface tension, the free energy increases monotonically with the surface area, as the system minimizes the contact area between the two liquids. In the case of lipids interacting with water – viz. typically a hydrated lipid bilayer – the picture is somewhat more intricate. Just like for a mixture of oil and water, by virtue of the hydrophobic effect, the free energy increases with the surface area. This is evidently not the sole contribution governing the behavior of lipid bilayers; otherwise, the surface area would be minimized regardless of the temperature, thereby forcing the system into the gel, Lβ, phase. 
Small surface areas, indeed, restrain the alkyl chains in an ordered state, consequently decreasing the entropy of the lipid–water assembly. 
Figure 2. Left: small patch of lipid bilayer replicated by PBCs. Right: characteristic time-scales in lipid bilayers (panel labels: 10⁻¹¹ s, 10⁻⁹ s, 10⁻⁸ s, 10⁻⁶ s and 10⁺⁴ s). Overall, motions occur on times that range from a few ps for the separation of sn–1 and sn–2 alkyl chains, to a few hours for the so-called flip–flop, wherein a lipid unit migrates from one leaflet to the other.
As a result, the free energy no longer increases with the surface area, but, on the contrary, exhibits a minimum that corresponds to an optimum surface area for a given temperature. This also implies that the surface tension, γ, should be strictly zero, and, therefore, that the lateral pressure, P∥, be strictly equal to the pressure normal to the water–lipid interface, P⊥:
$$\gamma = \int \left[ P_\perp - P_\parallel(z) \right] \mathrm{d}z = 0. \qquad (1)$$
This important result, which is expected for a self-organized system, prompted a host of authors to simulate lipid bilayers in the isotropic isobaric–isothermal ensemble, (N, P, T) [31]. Whereas, strictly speaking, Eq. (1) is true for a lipid–water assembly of virtually infinite extent, it should be kept in mind that in atomic simulations, one models patches of finite size. Feller and Pastor put forward that a finite surface tension should be introduced to compensate for such finite-size effects, which eliminate the possibility of observing collective phenomena like undulations over significant length-scales [32, 33], as in ripple, Pβ, phases, for instance. Tieleman and Berendsen argued that in the systems they investigated, the dependence of the surface tension on the surface area was marginal [34]. Lindahl and Edholm later showed that an applied surface tension on the order of 10 mN/m would correct for the large fluctuations in the surface area per lipid unit that are witnessed in simulations of lipid–water assemblies of limited size [35]. One thing is certain: in atomic simulations of lipid bilayers, P∥ and P⊥ are anticipated to vary differently on account of the anisotropy of the environment. It is, therefore, strongly recommended to adopt an algorithm that generates the (N, P, T) distribution in such a way that the dimensions of the simulation cell are rescaled independently in the x, y (in-plane) directions and in the z-direction [2, 36, 37].
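Equation (1) can be evaluated numerically once a lateral pressure profile along the bilayer normal is available. The sketch below integrates the difference between the normal and the lateral pressure with the trapezoidal rule. The pressure profiles are synthetic Gaussians invented purely for illustration (they are not simulation data), and the function name and unit conversion constant are assumptions of this example.

```python
import numpy as np

BAR_NM_TO_MN_PER_M = 0.1  # 1 bar * 1 nm = 1e5 Pa * 1e-9 m = 0.1 mN/m

def surface_tension(z, p_lateral, p_normal):
    """Trapezoidal estimate of the integral of [P_perp - P_par(z)] over z.

    z in nm, pressures in bar; the result is returned in mN/m.
    """
    z = np.asarray(z, dtype=float)
    integrand = p_normal - np.asarray(p_lateral, dtype=float)
    dz = np.diff(z)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dz)
    return float(BAR_NM_TO_MN_PER_M * integral)

def gauss(z, center, width):
    return np.exp(-((z - center) ** 2) / (2.0 * width ** 2))

z = np.linspace(-4.0, 4.0, 801)   # nm, bilayer centered at z = 0
p_normal = 1.0                    # bar, constant along z

# Synthetic tension-free profile: the interfacial contributions at
# z = +/-2 nm are balanced exactly by an opposite one in the core.
excess_free = 200.0 * (gauss(z, 2.0, 0.3) + gauss(z, -2.0, 0.3)) \
              - 400.0 * gauss(z, 0.0, 0.3)
gamma_free = surface_tension(z, p_normal - excess_free, p_normal)

# Synthetic profile with an unbalanced interfacial contribution.
excess_stretched = 100.0 * (gauss(z, 2.0, 0.3) + gauss(z, -2.0, 0.3))
gamma_stretched = surface_tension(z, p_normal - excess_stretched, p_normal)

print(gamma_free, gamma_stretched)
```

In this toy setting the second profile yields a tension of the same order as the 10 mN/m discussed above, which gives a feeling for how small an imbalance in the pressure profile corresponds to such an applied tension.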
3.2. The Potential Energy Function
In atomic statistical simulations of membranes, all atoms pertaining to the system are treated classically as point masses, which, in the harmonic approximation, are connected to each other by means of springs. In some instances, to reduce the computational effort, certain groups of atoms, like methylene, –CH2–, or methyl, –CH3, moieties, are represented as a single, “united” atom of appropriate van der Waals radius and well depth [38]. Seminal simulations of lipid–water assemblies made use of the available multi-purpose force fields, often aimed at the modeling of solvated proteins and nucleic acids. It is, therefore, not too surprising that in early investigations, the agreement with experiment was either far from optimal, or so good that one could suspect a fortuitous cancellation of errors due to the conjunction of inadequate parameters and excessively short runs. In the following years, it was realized that a specific potential energy function should be employed to mimic accurately the properties of lipids, like the subtle trans-gauche equilibrium in the alkyl chains. A wealth of effort was, and still is, invested to improve the representation of lipids and surfactants by means of an appropriate parameterization of the force-field contributions likely to affect the structural and dynamical features of these systems [39–43]. In some of these force fields, to obtain a better description of the ordering in the fatty aliphatic chains, which can be ascribed to trans-gauche defects, the standard low-order Fourier series that is often used in conventional macromolecular force fields was replaced by the more sophisticated Ryckaert–Bellemans torsional potential [44]. In addition, correct packing of the alkyl chains depends to a large extent on the quality of the van der Waals parameters utilized. 
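To make the trans-gauche equilibrium concrete, the following sketch evaluates a Ryckaert–Bellemans torsional potential, written as a power series in cos ψ with ψ = φ − 180°. The coefficients are those commonly quoted for alkane chains and are an assumption of this example, not values taken from this chapter; they should be checked against the force field actually used.

```python
import math

# Coefficients C_0..C_5 in kJ/mol, as commonly quoted for alkane chains
# (an assumption here; verify against the force field you actually use).
RB_COEFFS = (9.28, 12.16, -13.12, -3.06, 26.24, -31.5)

def ryckaert_bellemans(phi_deg, coeffs=RB_COEFFS):
    """Torsional energy V = sum_n C_n cos^n(psi), with psi = phi - 180 deg.

    `phi_deg` uses the convention in which trans corresponds to 180 deg.
    """
    cos_psi = math.cos(math.radians(phi_deg - 180.0))
    return sum(c * cos_psi ** n for n, c in enumerate(coeffs))

v_trans = ryckaert_bellemans(180.0)   # reference minimum
v_gauche = ryckaert_bellemans(60.0)   # secondary minimum
v_cis = ryckaert_bellemans(0.0)       # eclipsed barrier

for label, v in (("trans", v_trans), ("gauche", v_gauche), ("cis", v_cis)):
    print(f"{label:7s} {v:8.3f} kJ/mol")
```

With this parameter set the gauche state lies only a few kJ/mol above trans, comparable to the thermal energy at physiological temperature, which is why small changes in the torsional parameters noticeably shift the population of gauche defects.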
One of the underlying assumptions made in the design of force fields is the transferability of these parameters between molecules – e.g., the van der Waals radius and well depth of an aliphatic sp3 carbon should be the same regardless of the chemical environment. The interaction parameters of the united methylene and methyl groups were originally derived from statistical simulations of short hydrocarbons like n-butane, as is the case for the OPLS force field [45]. The transferability hypothesis has proven to be inadequate when handling long alkyl chains, prompting a number of authors to reoptimize van der Waals interactions based on simulations of larger hydrocarbons like pentadecane [31]. Determination of net atomic charges for lipids and surfactants from sophisticated quantum mechanical calculations may turn out to be a difficult task, on account of the size of the molecules. Unquestionably, partial charges derived from the electrostatic potential constitute the most satisfactory solution among the arsenal of approaches available to the modeler [46]. Yet, as has been demonstrated, point charges are inherently conformation-dependent [47], thus making the derivation of a unique set of charges representative of all possible conformations questionable. To circumvent the difficulties connected to the size of the molecules, it has been proposed to derive the net atomic charges from independent fragments, which are ultimately pieced together. This scheme, although tempting, should be considered with great care if charges are delocalized over large spatial extents.
3.3. Intermolecular Interactions
As has been mentioned previously, physically and biologically realistic simulations should involve a sufficiently large number of lipid and water molecules to minimize finite-size effects. Much of the computational effort involved in atomic simulations of lipid–water assemblies lies in the evaluation of pairwise interactions, the number of which increases dramatically with the number of particles in the system. Based upon the assumption that intermolecular interactions decay with distance, earlier studies have employed a cut-off sphere, beyond which the interactions are truncated. This approximation is expected to be adequate for the short-range, van der Waals contribution. The use of an abrupt, finite spherical cut-off for truncating the short-range van der Waals interactions may, however, modulate the forces responsible for the cohesion of lipid–water assemblies. Accurate use of a cut-off requires taking into account the appropriate long-range corrections for both the energy and the pressure [48], based on the classical formulae utilized for Lennard–Jones fluids [49]. For Coulomb interactions, which vary as 1/rⁿ, where n ≤ 3 [2, 49], truncation becomes particularly questionable. In this event, the long-range character of the participating charge–charge (n = 1) and charge–dipole (n = 2) interactions makes the use of a spherical truncation unsuitable. Probably the most accurate approach for handling the long-range nature of electrostatic interactions in spatially replicated simulation cells is solving the Poisson equation. The Ewald approach [50], which decomposes the conditionally convergent Coulombic sum over periodic boxes into two rapidly decaying contributions evaluated respectively in the direct and reciprocal spaces, is the most widely used method. Formally, the computational effort involved in this method scales as O(N²), thus making statistical simulations of large ensembles of atoms particularly costly. 
This effort can be reduced to O(N ln N) by solving the Poisson equation numerically on a grid of points, onto which the charges of the particles are interpolated. Such a scheme constitutes the central idea of algorithms like particle–mesh Ewald (PME) or particle–particle–particle–mesh (P3M) [51]. For completeness, although to our knowledge it has not yet been applied in membrane simulations, it is worth mentioning the fast multipole approach, an alternative to Ewald summation that treats long-range interactions in a rigorous fashion, and scales linearly with N for very large systems – viz. on the order of 100 000 atoms [52]. The substantial computational investment required to attain a physically consistent description of the simulated molecular assembly may be further reduced by taking advantage of recent advances in the MD methodology. Considering that the different degrees of freedom involved in the system relax over distinct time-scales, it is not necessary that the corresponding force contributions be evaluated concurrently. This is, in essence, the basic principle of the so-called multiple time-step methods [53, 54], in which intramolecular, van der Waals and Coulomb forces can be updated with different frequencies [55]. In conjunction with constraint algorithms like SHAKE or RATTLE [56], which virtually eliminate the vibrations due to hard degrees of freedom, it is possible to explore large regions of phase space with a lesser computational effort, thus making long simulations of large lipid–water assemblies somewhat more affordable – the reader is referred to the chapter of Tuckerman and Martyna dedicated to integrators and ensembles in statistical simulations. 
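The decomposition at the heart of the Ewald approach can be illustrated on a single pair interaction. The sketch below splits 1/r into a complementary-error-function term, which decays rapidly and can be truncated in direct space, and an error-function term, which is smooth everywhere and is the part handled in reciprocal space or on a grid. The value of the splitting parameter is illustrative only, and the function name is hypothetical.

```python
import math

def coulomb_split(r, alpha):
    """Split 1/r into a short-range and a smooth long-range part.

    Returns (erfc(alpha*r)/r, erf(alpha*r)/r); the two always sum to 1/r.
    `alpha` is the Ewald splitting parameter (inverse length).
    """
    short_range = math.erfc(alpha * r) / r
    long_range = math.erf(alpha * r) / r
    return short_range, long_range

alpha = 3.0  # nm^-1, illustrative value only
for r in (0.2, 0.5, 1.0, 1.5):
    short, smooth = coulomb_split(r, alpha)
    print(f"r = {r:.1f} nm  short = {short:.3e}  "
          f"smooth = {smooth:.3e}  1/r = {1.0 / r:.3e}")
```

The short-range term becomes negligible within a typical cut-off distance, whereas the smooth term stays finite as r tends to zero, which is what allows it to be represented accurately by a modest number of reciprocal-space vectors or grid points.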
Contemporary, academic MD packages, like AMBER [57], CHARMM [58], GROMACS [59] or NAMD [60], have benefited from several recent methodological developments on the algorithmic front, and incorporate more or less all the key features discussed so far that are necessary to investigate lipid–water assemblies rigorously. As has been commented on, obtaining a realistic picture of complex chemical systems like membranes requires handling sufficiently large sets of atoms, which rapidly increases the computational effort. From a modeling perspective, numerical simulations, in order to prove their usefulness, should supply the desired answer within a reasonable computation time – i.e., ideally faster than experimental data acquisition and analysis could be carried out. Fulfilling this requirement implies taking advantage of modern, parallel architectures, over which the computational chore can be distributed. Yet, a number of the most popular MD codes were written several years ago, at the dawn of parallelism, when scalar computers were utilized predominantly. Although methodological and technical improvements of MD programs still constitute an ongoing process, the best performances in MD simulations can admittedly only be obtained using those codes that were designed specifically for parallel architectures, often based on a domain decomposition scheme in conjunction with an appropriate load balancing that spreads the computational effort evenly across the array of available processors. Among these programs, NAMD, for instance, was developed in the spirit of conserving an optimal scalability as the number of processors increases – assuming large enough ensembles of atoms. By and large, MD codes targeted at massively parallel environments have undeniably contributed to making atomic simulations of membranes more affordable, allowing the modeler to deal with up to a few hundred thousand atoms [61]. 
Aside from the purely computational aspect of membrane modeling, visualization has also proven to play a significant role in these advances by helping to interpret the raw results of MD simulations. Flexible, user-friendly visualization programs developed in academic environments tend to become an indispensable element in the arsenal of tools at the disposal of the modeler. Today, non-commercial packages, like VMD [62], offer an increasing number of functionalities that can be tailored to the needs of the modeler, through object-oriented languages and the possibility to introduce new features by means of plugins.
4. Atomic Simulations of Lipid Membranes
Traditionally, phospholipids have served as models for investigating in silico the structural and dynamical properties of membranes. From both a theoretical and an experimental perspective, zwitterionic phosphatidylcholine (PC) lipids constitute the best characterized systems. Hydrated DMPC [13, 63] and dipalmitoylphosphatidylcholine [31, 34, 64–67] (DPPC) bilayers have so far probably been the most extensively surveyed lipid membranes. Yet, on account of their intrinsic limitations – viz. the short alkyl chains in DMPC and the temperature of the Lβ to Lα phase transition in DPPC, which lies above physiological conditions – several authors have turned to biologically more relevant lipids like palmitoyloleoylphosphatidylcholine [68, 69] (POPC), in particular for examining membrane proteins in a realistic environment, and lipids based on mixtures of saturated/polyunsaturated alkyl chains (SDPC, 18:0/22:6 PC) [43, 70]. A variety of alternative lipids, featuring different, possibly charged, head groups, have also been explored – e.g., dilauroylphosphatidylethanolamine [13, 71] (DLPE), dipalmitoylphosphatidylserine [72, 73] (DPPS) or glycerolmonoolein [30, 74] (GMO). In several cases, however, the modeler is faced with an absence of experimental data with which the results of atomic simulations can be compared. Bilayers built from PC lipids, nonetheless, represent remarkable test systems not only to probe the methodology, but also to gain additional insight into the physical properties of membranes. This section details how these properties are derived from MD trajectories and how a bridge with experiment can be established.
4.1. Bilayer Structure
4.1.1. Density distributions
As can be seen in Fig. 3, the spatial extent encompassed by the head-group region of the DMPC units in a bilayer arrangement is remarkably broad. This is also apparent in the number density profiles computed from the MD trajectory, i.e., the in-plane densities of lipid and water atoms analyzed along the direction normal to the water–membrane interface. A striking feature emerging from these distributions is the penetration of water far into the head-group region. The farthest extent of water molecules roughly coincides with the ester moieties of the lipids. The width of the interfacial region, on the order of 8–10 Å for a fully hydrated DMPC bilayer, highlights the significant static and dynamic roughness of the membrane surface [74, 75], thereby refining the traditional textbook picture, like that of Figs. 1 and 2. Interestingly enough, phosphate and choline groups lie approximately at the same depth in the bilayer, indicating that head groups are oriented roughly in the plane of the bilayer. The average angle between the head-group P–N bond dipoles and the normal to the water–membrane interface is around 70°, with the dipoles pointing towards the aqueous medium, although the orientational distribution is remarkably wide and depends upon the temperature and, as expected, the potential energy function utilized [14]. In addition, the level of lipid hydration has been shown to play a noteworthy role in the orientation of the head groups [9]. In any case, it is crucial that the slow reorientation of the lipid head groups be considered when interpreting results from short MD trajectories. Estimates from single-molecule anisotropy imaging for fluorophore-tagged POPC molecules [77] indicate a rotational diffusion coefficient of ca. 0.7 rad²/ns, slightly below estimates from MD simulations [78], suggesting that sampling of the whole rotational space for each molecule would require more than a few tens of nanoseconds. The information provided by the density distributions can be compared directly with x-ray and neutron diffraction measurements [17] (considering, respectively, the electron densities or the atomic scattering length densities). The MD trajectories can further be used in conjunction with the scattering experiments in order to refine the data by, for instance, including fraction volumes extracted from the simulation [79].
Figure 3. Left: snapshot taken from an MD simulation of a fully hydrated DMPC bilayer. Methylene and methyl carbon atoms of the alkyl chains are shown in light grey. The lipid head-group atoms are shown in dark grey, and the water molecules are drawn in black and white. Note the protrusion of head groups that results locally in a rough water–membrane interface. Right: density distributions for selected groups of atoms in a fully hydrated DMPC bilayer examined at 303 K (from Ref. [76]).
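The number density profiles discussed in this subsection amount to a histogram of atomic positions along the bilayer normal, normalized by the slab volume. The following is a minimal sketch, assuming the coordinates of one group of atoms have already been extracted into a NumPy array and that the in-plane box area is constant; the function name and arguments are hypothetical and are not taken from any of the cited studies.

```python
import numpy as np

def number_density_profile(z, box_area, z_min, z_max, n_bins=100):
    """Return bin centres (A) and number density (atoms/A^3) along the normal.

    z        : array (n_frames, n_atoms), coordinates along the bilayer normal
    box_area : in-plane area of the simulation cell (A^2), assumed constant
    """
    z = np.asarray(z, dtype=float)
    n_frames = z.shape[0]
    # Accumulate all frames into a single histogram along the normal.
    counts, edges = np.histogram(z.ravel(), bins=n_bins, range=(z_min, z_max))
    slab_volume = box_area * (edges[1] - edges[0])
    density = counts / (n_frames * slab_volume)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, density
```

In a constant-pressure simulation the box area fluctuates, so a per-frame normalization would be needed instead of the single `box_area` assumed here. Weighting each atom by its electron count or by its neutron scattering length would turn the same histogram into the profiles compared with diffraction data.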
4.1.2. Lipid tail conformation
Deuterium quadrupole splitting, measured by ²H NMR on non-oriented samples of membrane preparations, is mainly determined by the average conformation of the phospholipid molecules and, as such, supplies valuable structural information about the system. Order parameters can be derived from an MD trajectory and expressed as a tensor, the elements of which read [80]:
$$ S_{\alpha\beta} = \frac{1}{2} \left\langle 3 \cos\varphi_\alpha \cos\varphi_\beta - \delta_{\alpha\beta} \right\rangle \tag{2} $$
Here, $\varphi_\alpha$ is the angle formed by the α-th molecular axis and the normal to the water–bilayer interface, and $\langle \cdots \rangle$ is an ensemble average over all lipid chains. In most circumstances, based on symmetry relationships, it is assumed that the order parameters, $S_{\mathrm{CD}}$, for an alkyl chain bearing deuterium labels can be expressed as:
$$ S_{\mathrm{CD}} = \frac{1}{2} \left\langle 3 \cos^2\theta - 1 \right\rangle \tag{3} $$
where θ is simply the angle between the C–D chemical bond and the normal to the bilayer. When the C–D bond orientations are isotropically distributed, $S_{\mathrm{CD}} = 0$, and when the chain is all-trans and aligned with the bilayer normal, $|S_{\mathrm{CD}}| = 0.5$. In general, for saturated lipids, $|S_{\mathrm{CD}}|$ exhibits a plateau value at ca. 0.2 for the upper chain segments (Fig. 4). Force fields of the new generation reproduce these order parameters quite well, barring small discrepancies for the second carbon atom of the alkyl chains. Further analysis of MD trajectories may be aimed at extracting additional information from the NMR experiments. For instance, one may refine those methods targeted at obtaining such quantities as the average chain length or the surface area per molecule [81]. The combination of MD simulation with experiment has also been used successfully to probe alkyl chain packing in lipid membranes: infrared (IR) data have been reinterpreted to estimate the concentrations of gauche–gauche, trans–gauche and trans–trans conformational sequences in a DPPC bilayer [82].
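Equation (3) translates directly into an average over C–D (or, in a simulation, C–H) bond vectors. The sketch below assumes that the bond vectors for one carbon position have been collected over all lipids and frames, and that the bilayer normal is fixed along z; both the function name and the data layout are illustrative assumptions.

```python
import numpy as np

def order_parameter_scd(cd_vectors, normal=(0.0, 0.0, 1.0)):
    """S_CD = 0.5 * <3 cos^2(theta) - 1> for a set of C-D bond vectors.

    cd_vectors : array (..., 3), one vector per C-D bond and frame
    normal     : bilayer normal (assumed fixed, here the z axis)
    """
    v = np.asarray(cd_vectors, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Cosine of the angle between each bond vector and the normal.
    cos_theta = (v @ n) / np.linalg.norm(v, axis=-1)
    return 0.5 * np.mean(3.0 * cos_theta**2 - 1.0)
```

Bonds lying in the plane of the bilayer give −0.5, which is the all-trans limit quoted in the text, and an isotropic distribution averages to zero. Repeating the calculation for each carbon position along the chain yields the usual order parameter profile.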
Figure 4. Snapshot taken from an MD simulation of a synthetic channel formed of cyclic peptides of alternating D- and L-chiralities, embedded in a fully hydrated DMPC bilayer. Color coding of the atoms is identical to that in Fig. 3. Note the antiparallel β-sheet-like conformation of the nanotube spanning the membrane. Within a few hundred ps, a single-file chain of water molecules is established in the hollow tubular structure.
4.1.3. Hydration of the head-group region
In atomic simulations, solvation properties are often measured by means of radial distribution functions (RDFs), also called pair correlation functions:
$$ g_{ij}(r) = \frac{N_j(r; r + \delta r)}{4\pi \rho_j r^2 \, \delta r} \tag{4} $$
where $N_j(r; r + \delta r)$ is the number of particles j at a distance from i between r and r + δr, and $\rho_j$ is the density of particles j. In essence, this definition is targeted at isotropic fluids and, in principle, should not be applied as is to anisotropic lipid–water assemblies [83]. To estimate the coordination number of site i – e.g., PC head groups – it seems far more appropriate to merely evaluate $N_j(r; r + \delta r)$ as a function of the separation r and determine its cumulative value at the first minimum of a qualitative RDF computed using Eq. (4).
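The direct-counting estimate of the coordination number described above can be sketched as follows for a single frame. This is an illustration only: it assumes an orthorhombic periodic box, two distinct sets of sites i and j, and a cutoff chosen by the user at the first RDF minimum; none of the names below comes from the cited work.

```python
import numpy as np

def coordination_number(pos_i, pos_j, box, r_cut):
    """Average number of j sites within r_cut of each i site (one frame).

    pos_i, pos_j : arrays (n_i, 3) and (n_j, 3), in A
    box          : orthorhombic box lengths (Lx, Ly, Lz), in A
    r_cut        : cutoff, e.g. the position of the first RDF minimum
    """
    pos_i = np.asarray(pos_i, dtype=float)
    pos_j = np.asarray(pos_j, dtype=float)
    box = np.asarray(box, dtype=float)
    d = pos_i[:, None, :] - pos_j[None, :, :]
    d -= box * np.round(d / box)          # minimum-image convention
    r = np.linalg.norm(d, axis=-1)
    # Count neighbours per i site, then average over the i sites.
    return np.count_nonzero(r < r_cut, axis=1).mean()
```

Because the neighbors are counted explicitly, no assumption of isotropy enters the result; only the choice of cutoff relies on the qualitative RDF. Averaging the returned value over frames gives the trajectory estimate.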
4.1.4. Transmembrane electrostatic potentials
Orientation of water molecules near the head-group region of the lipid bilayer is clearly anisotropic, compared to the bulk aqueous medium. This can be shown by measuring the average cosine of the angle formed by the dipole moment of the water molecules and the normal to the bilayer, as a function of the distance from its geometrical center. A marked peak emerges at a distance characteristic of the phosphate groups, emphasizing the orienting power exerted by this moiety on the surrounding aqueous environment. The preferential orientation of the dipole moment borne by the water molecules is at the origin of the term “dipole potential,” which has been employed extensively to denote the electrostatic potential across the water–membrane interface [84, 85]. This conspicuous ordering of water molecules was recently observed directly using coherent anti-Stokes Raman scattering microscopy [86]. In a number of in silico investigations, the electrostatic potential has been estimated from the knowledge of the charge density. In the spirit of atomic density distributions, charges are accumulated as a function of their position along the direction normal to the water–bilayer interface. The first integral of the charge density, divided by the vacuum permittivity, yields the electric field. In turn, the negative of the integral of the field provides the electrostatic potential. Not too surprisingly, the resulting “dipole potential” inherently depends upon the choice of the potential energy function and should, thus, be interpreted cautiously [4, 31, 63].
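The double integration of the charge density described above can be sketched with a running trapezoidal rule on a uniform grid. The boundary conditions used here (zero field and zero potential at the first grid point, taken to lie in bulk water) are an assumption of this example, as are the function names and the choice of units (e, Å, V).

```python
import numpy as np

EPS0 = 5.526349e-3  # vacuum permittivity in e / (V A)

def cumulative_trapezoid(y, dz):
    """Running trapezoidal integral of y on a uniform grid, starting at 0."""
    y = np.asarray(y, dtype=float)
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1])) * dz
    return out

def potential_profile(rho, dz):
    """Electrostatic potential (V) from a charge density profile rho(z) in e/A^3.

    Assumed boundary conditions: field and potential are both zero at the
    first grid point.
    """
    field = cumulative_trapezoid(rho, dz) / EPS0   # E(z) in V/A
    return -cumulative_trapezoid(field, dz)        # phi(z) in V
```

For two thin sheets of opposite charge the result reduces to the parallel-plate value, a potential step of magnitude σd/ε₀. In practice the profile is often symmetrized over the two leaflets, since small residual fields accumulate in the second integration.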
4.2. Dynamics
The increasing level of interaction between experimental studies and numerical simulations of lipid bilayers evidenced in the previous section also holds for the dynamics of lipid bilayers. Feller et al. [27] have used MD simulations to analyze NOESY cross-relaxation rates in lipid bilayers. Magnetic dipole–dipole correlation in such systems occurs over a variety of time scales and depends upon the probability of close approach for proton–proton interactions. The relaxation rates have been calculated directly from a 10 ns MD simulation of DPPC. Fitting the autocorrelation functions yields characteristic correlation times and weight factors that determine the relative contributions of the individual types of motion. Combining simulations and experiments, relaxation rates may, therefore, be assigned to various motions – viz. less than 1 ps for chemical bond vibrations, 50–100 ps for trans–gauche isomerization, 1–2 ns for molecular rotation and wobble, and beyond 100 ns for lateral diffusion. A model for the dynamics of individual lipid molecules has also been proposed based on a thorough comparison of simulation data and experimental measurements of the ¹³C NMR T₁ relaxation in DPPC alkyl chains [87]. Employing Brownian dynamics and MD simulations combined with fits of experimental data, it was found that lipid molecules confine themselves into a cylinder within the 100 ps time scale, and wobble in a cone-like potential on the nanosecond time scale. A similar model for lipid dynamics has emerged from an MD study aimed at interpreting inelastic neutron scattering (INS) data. One particular aspect of such experiments, which probe the motion of individual hydrogen nuclei – i.e., the self-correlation of single particles – is that they are space- and time-resolved. In the case of DPPC bilayers, good agreement between simulations and experiments probing the 100 ps time scale is attained [88]. The analysis corroborates the fact that the motion of the center of mass and the internal motions of lipid molecules are decoupled. Moreover, the former is well described as diffusion in a confined space, i.e., a cylinder. A refined picture of the internal dynamics arising from the simulation shows that protons of the alkyl chains move according to a chain defect model, where kinks or chain defects form and disappear randomly – i.e., a stochastic model – along the lipid tail, rather than diffuse along the chain. Collective dynamics of lipid bilayers have also been examined carefully as simulations over increasingly significant time scales and length scales have become feasible. Large systems involving 1,024 lipid molecules studied over 10 ns led to the direct observation of bilayer undulations and thickness fluctuations of mesoscopic nature [35]. Continuum properties such as bending modulus, surface compressibility and mode relaxation times were calculated and agreed nicely with experiment. Several processes occurring at different length scales were identified. The undulatory motions could be separated into two regimes – one involving more than 50 lipids, which can be ascribed to mesoscopic undulations, and the other, involving fewer than 25 lipids, which is attributed to collective lipid protrusion. Peristaltic modes – i.e., anti-correlated modes between the two layers – could also be divided into two types: bending modes involving 50–400 lipids, and protrusion modes over shorter length scales. Shorter wavelength collective dynamics may be probed using coherent inelastic, viz. neutron or x-ray, scattering.
Density fluctuations of length scales comparable to the interlipid distance are believed to play a pivotal role in the transport of small molecules across the bilayer. Recently, MD simulations have been used to complement inelastic x-ray data of lipid bilayers, both in the gel, Lβ, and the liquid crystal, Lα, phases [89]. The results support the applicability of generalized hydrodynamics to describe the motion of carbon atoms in the hydrophobic core, thus allowing the modeler to extract key parameters, such as sound mode propagation velocity, thermal diffusivity and kinematic longitudinal viscosity.
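The relaxation analyses mentioned in this section rest on time autocorrelation functions computed from the trajectory. As a generic illustration, and not a reproduction of the procedures of Refs. [27] or [87], the sketch below evaluates the second-rank reorientational correlation function of a single bond vector and a crude correlation time obtained by integration. The function names and the single-vector input are assumptions of this example.

```python
import numpy as np

def p2_autocorrelation(u, max_lag):
    """C(t) = <P2(u(0) . u(t))> for one vector time series u, shape (n_frames, 3).

    max_lag must be smaller than the number of frames.
    """
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u, axis=1, keepdims=True)
    c = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Average over all time origins available for this lag.
        dots = np.sum(u[: len(u) - lag] * u[lag:], axis=1)
        c[lag] = np.mean(1.5 * dots**2 - 0.5)
    return c

def correlation_time(c, dt):
    """Crude correlation time: trapezoidal integral of C(t) over the lags."""
    return dt * (np.sum(c) - 0.5 * (c[0] + c[-1]))
```

In a membrane, C(t) typically decays to a non-zero plateau related to the order parameter, so a multi-exponential fit with an offset is generally more informative than the plain integral shown here.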
4.3. Modeling Transport Phenomena
Models of lipid bilayers have been employed widely to investigate diffusion properties across membranes through assisted and non-assisted mechanisms. Simple ions, e.g., Na+, K+, Ca2+ or Cl−, have been shown to play a significant role in the cell machinery, in particular at the level of intercellular communication. In order to enter the cell, the ion must first permeate the lipid bilayer that acts as a rampart towards the cytoplasm. Wilson and Pohorille have investigated the passive transport of Na+ and Cl− ions across a lipid bilayer formed by glycerolmonoolein units, which undergoes severe deformations as the ions translocate across the water–membrane interface. This process is accompanied by thinning defects and the formation of water fingers that ensure an appropriate hydration of the ion as it penetrates into the non-polar environment [90]. Ideally, atomic simulations could also serve as a predictive tool for estimating water–membrane partition coefficients of small drugs, in strong connection with the so-called blood–brain barrier – the ultimate step in the de novo design of pharmacologically active molecules. Diffusion of small, organic solutes in lipid bilayers was examined for a variety of molecular species ranging from benzene [91, 92] to more complex anesthetics [93–95]. Yet, access to partition coefficients by means of statistical simulations implies the determination of the underlying free energy behavior along the direction normal to the interface [96]. In the specific instance of inhaled anesthetics, an analysis of the variations of the free energy for translocating the solute from the aqueous medium into the interior of the bilayer suggests that potent anesthetics reside preferentially near the water–membrane interface. Contrary to the dogmatic Meyer–Overton hypothesis [97], potency is shown to correlate with the interfacial concentration of the anesthetic, rather than its sole lipophilicity [98]. The considerable free energy associated with the transfer of ions from the aqueous medium to the interior of the membrane rationalizes the use in cells of specific transmembrane channels, pumps or carriers that facilitate, while selectively controlling, the passage of ionic species across the lipid bilayer [99].
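The link between a free energy profile along the bilayer normal and a partition coefficient can be illustrated with the Boltzmann relation K(z) = exp(−ΔG(z)/RT). The sketch below assumes that ΔG(z) is already available on a uniform grid, in kJ/mol and relative to bulk water; averaging K(z) over a slab is one simple convention among several and is not claimed to be the one used in the cited studies.

```python
import numpy as np

R_GAS = 8.314462618e-3  # gas constant in kJ / (mol K)

def local_partition(delta_g, temperature=303.0):
    """K(z) = exp(-dG(z)/RT), with dG in kJ/mol relative to bulk water."""
    return np.exp(-np.asarray(delta_g, dtype=float) / (R_GAS * temperature))

def membrane_partition(z, delta_g, z_lo, z_hi, temperature=303.0):
    """Average of K(z) over the slab z_lo <= z <= z_hi (uniform grid assumed)."""
    z = np.asarray(z, dtype=float)
    k = local_partition(delta_g, temperature)
    inside = (z >= z_lo) & (z <= z_hi)
    return k[inside].mean()
```

A free energy minimum located at the interface, as reported for potent anesthetics, shows up as a peak of K(z) in the head-group region rather than in the hydrocarbon core.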
Recent comprehensive reviews of the theoretical developments and simulation capabilities in ion channel modeling can be found in Refs. [100] and [101]. Here, we briefly describe some of the complex systems examined hitherto. Gramicidin A, a prototypical channel for assisted ion transport, has been the object of thorough analyses from both experimental and theoretical perspectives. Dimerization of individual protein units results in membrane-spanning channels suitable for ion conduction. MD simulations of gramicidin A embedded in hydrated lipid bilayers, e.g., DMPC, were able to reproduce the structural features observed experimentally [102]. Such studies have clearly shown that important questions related to ion selectivity, ion binding, gating and proton transfer mechanisms may be addressed with some confidence. The arrangement of water molecules in a single-file chain, characteristic of complex transporters [103], was also witnessed in a somewhat more rudimentary, synthetic channel formed by stacked cyclic peptides of alternating D- and L-chiralities (see Fig. 4) [76]. Such nanotubes have been recognized to modify in a selective fashion the permeability of cell membranes and are envisioned to act as potent therapeutic agents in response to bacterial resistance [104]. Aquaporins, membrane channels ubiquitous to most living species that control the water content of the cells, have also attracted much attention lately. They are formed of tetramers that organize to facilitate the transport of water, and possibly other small solutes, across the lipid bilayer. The resulting water pores remain, however, impervious to the passage of small ions to ensure a proper conservation of the electrochemical potential [105]. As a final note, it is worth mentioning that, as expected, the determination of the high-resolution structure of KcsA, a bacterial K+ channel, has motivated a large number of realistic simulations that take the lipid environment into account and aim at deciphering the underlying complex conduction mechanism.
Figure 5. MD simulation of DNA rods interacting with a membrane formed by cationic and neutral lipids, from Ref. [118]. Snapshot along the DNA axis (a) and perpendicular to it (b). Color coding: DNA phosphate moieties are shown in light grey, PC phosphate groups in black and choline groups in dark grey.
4.4. Interaction of Small Molecules, Nucleic Acids, Peptides and Proteins with Membranes
In most circumstances, the biological membrane is described at the theoretical level as a simple, homogeneous bilayer formed by a single type of lipid – usually the well-studied, zwitterionic PC lipids. Membranes, however, are infinitely more complex and consist of a heterogeneous assembly of different lipids, either charged or not, carbohydrates and proteins. Approaching the fine detail of the biological picture by incorporating chemical species of different natures in atomic simulations is evidently the direction in which the field is moving. From a modeling perspective, the influence of cholesterol [106–110], and more generally sterols [111], on the structure and dynamics of lipid bilayers has attracted a lot of attention in recent years. Although the limited sampling in some simulations calls into question the conclusions reached by the authors, cholesterol is shown to increase the order parameters of the alkyl chains while decreasing their tilt angle with respect to the normal to the water–membrane interface, in qualitative agreement with experiment [112]. Aside from transporters and channels that assist the transport of chemical species across lipid bilayers, a vast array of key cellular functions are accomplished by proteins that interact with the membrane, either spanning the latter, or bound to its surface [113]. Yet, interfacial and transmembrane proteins generally play distinct roles in the cell machinery, although the boundary between these two classes of proteins remains somewhat fuzzy. A number of proteins, for instance, are only partially buried in the membrane – e.g., melittin or alamethicin, the insertion of which is conditioned by the transmembrane electric field [114–117]. Recent MD simulations have focused on the association of DNA with lipid membranes, which results in stable complexes of potential use as viral-based carriers. Here, rods of DNA are intercalated between bilayer leaflets formed by mixed cationic and neutral, PC, lipid units, producing undulations of the membrane interface (see Fig. 5). In such a topology, where the host interacts with the head groups, it is shown that both PC and cationic lipids contribute to the overall screening of the phosphate groups of the nucleic acids [118].
The strength of in silico experiments is to provide glimpses into the atomic detail of biological membranes that conventional experimental techniques cannot capture. Of particular interest is the molecular interplay that governs membrane–protein association, accessible through large-scale atomic simulations. MD simulations illuminated, for instance, how the presence of a protein perturbs the structure of the lipid membrane. For example, the helices of the Influenza A M2 channel tilt in a DMPC bilayer to maximize membrane–protein hydrophobic contacts [119, 120]. In the case of gramicidin A, key residues located in the head-group region have been shown to stabilize the channel in the membrane [102]. The influence of the protein on the lipid bilayer can be viewed as the subtle balance between hydrophobic and hydrophilic contributions that, in principle, can be captured by MD simulations. Differences in the order parameters of lipid units adjacent to the protein and far from it have led to the concept of “boundary lipids”. In a vast number of instances, among which the Mycobacterium tuberculosis MscL channel [121], the Influenza A virus M2 protein [122], and the Escherichia coli OmpF trimer [123], it was observed that the membrane protein induces increased disorder of the lipid alkyl chains in its neighborhood. In sharp contrast, alkyl chains close to the transporter gramicidin A tend to be more ordered than those pertaining to the bulk lipid environment [102, 124]. In the light of these computational investigations, it would, therefore, appear that trans–gauche equilibria in lipid chains are dictated by the very nature of the membrane protein. Yet, as was shown recently [76], drawing definitive conclusions based on limited simulation lengths may give a distorted vision of the actual behavior of the lipid bilayer. In principle, exceedingly short simulations do not permit the complete relaxation of lipid chains in the vicinity of the protein, and should, thus, be interpreted cautiously. The close match between the thickness of the lipid bilayer and the length of the hydrophobic segment of the protein spanning the latter constitutes yet another important facet of the protein–membrane interplay. By providing the microscopic detail of the interactions of integral proteins with the lipid environment, atomic statistical simulations may contribute to advance our understanding of the underlying physical principles that govern the function and structure of membranes [125]. In the light of a series of experimental investigations on model peptides embedded in PC membranes with alkyl chains of increasing length, it was found that if the hydrophobic thickness of the peptide is greater than that of the bilayer, the latter becomes thinner, and vice versa [126]. A similar phenomenon was observed recently in the MD simulation of a single peptide nanotube inserted in a hydrated DMPC bilayer [76]. The hydrophobic thickness of the membrane adjusts itself as the synthetic channel tilts concurrently to adapt to its host lipid environment.
Whereas the so-called hydrophobic mismatch [127] does not appear to induce perturbations in peptide nanotubes, it can, however, modulate strongly the function of more complex proteins. As was observed recently for gramicidin A, minute changes in the length of the lipid alkyl chains – viz. from the 18-carbon oleyl- to the 20-carbon eicosenoylphosphatidylcholine – switch the protein from a stretch-activated to a stretch-inactivated channel. Symmetrically, the hydrophobic mismatch may alter the phase behavior of the membrane, as demonstrated in the case of WALP peptides that promote the formation of non-lamellar phases [128]. These remarkable results should, therefore, incline the modeler to be cautious when solvating membrane proteins in lipid surroundings. The choice of the lipid unit for a given protein may turn out to be a genuine leap of faith if attention has not been paid to the possible imbalance in the hydrophobic thicknesses of the membrane and the protein, which is likely to render a physically unrealistic picture of the assembly. When devised appropriately, atomic simulations can, nonetheless, shed new light on the nature of the protein–membrane interplay, by allowing the modeler not only to visualize, but also possibly to quantify, the strength of the participating interactions. Of particular interest, the non-covalent chemical bonds formed by L-Trp residues and acceptor moieties of the head-group region have been recognized to act as anchoring points of the protein into the lipid bilayer [129, 130]. As has been shown in the case of gramicidin A, the presence of several L-Trp amino acids at the level of the lipid head groups is expected to mediate the overall stabilization of the channel in the membrane [102].
5. Discussion, Outlook and Future Prospects
Retrospectively, with about 15 years of hindsight, it has become clear that atomic simulations, in particular MD simulations of lipid–water assemblies, have contributed in large measure to improving our knowledge of these very complex systems from both a structural and a dynamical point of view. It is also obvious that the successes of pioneering, tantalizing investigations, which not only ignited the field of lipid simulations, but were also rapidly followed by many studies on larger assemblies, often reflected as much good fortune as they did science. Yet, major advances on the hardware, software and algorithmic fronts progressively allowed the modeler to tackle systems of increasing complexity over time scales compatible with the physical, chemical and biological reality. Among these advances, the development of specific methods for performing the simulation in apt thermodynamic ensembles, the improvement of potential energy functions targeted at the specific modeling of lipid–water assemblies, and the continuous decrease of the price/performance ratio of modern computers have helped push back the intrinsic limitations of MD simulations. More recent studies have demonstrated that simulations at least an order of magnitude longer than those reported when the field was only in its infancy were required to obtain reliable and reproducible results [131]. Simulation of lipid–water systems still constitutes a research area seething with excitement. The development of all the ingredients to investigate in silico lipid bilayers with full confidence opens new perspectives, in particular on the biological front, and should rapidly allow the modeler to use lipids in a routine fashion, just like any other solvent. In this spirit, theoretical studies of membrane proteins in a realistic environment should continue to flourish in the near future.
Unfortunately, as the level of sophistication of atomic simulations increases, together with the available computational power, so does the ambition of the modeler, who attempts to deal with molecular systems that are even more complex, in terms of both size and time scales. This explains the current teeming activity in the development of approximate schemes that could serve as alternatives to a full-atomic description for the modeling of large lipid–water assemblies over long times. Among these alternatives, a great deal of effort has been invested in recent years in the field of implicit solvation [8]. Since the seminal work of Onsager on continuum electrostatics [132], the temptation to represent explicit surroundings by a simple dielectric medium for a myriad of chemical systems has been the object of tremendous interest. Modeling the complexity of lipid bilayers by means of a continuum description has been used, for example, to investigate the insertion of α-helical peptides in a membrane [133], or the interaction of a small toxin with the latter [116]. Results of continuum electrostatics simulations, which are based on solving the Poisson–Boltzmann equation numerically, are, in general, in qualitative agreement with atomic simulations. Yet, not too unexpectedly, this approximate description cannot capture subtle, specific interactions that govern the stability of the solute – e.g., a short peptide – at the water–membrane interface. As was underlined recently by Lin et al., the reproduction of membrane dipole potentials based on a sole continuum electrostatics representation is usually erroneous, but can be significantly improved by inclusion of explicit layers of water molecules near the head-group region [134]. Aside from implicit solvation approaches, the use of coarse-grained representations, wherein each lipid unit is described by a limited number of interacting sites, is probably the most promising. The underlying assumption that the formation of a lipid vesicle is a sufficiently robust process to be simulated by simplified models of lipids was ascertained recently by Marrink and Mark through a study of the aggregation of DPPC units into small unilamellar vesicles [135]. By and large, the strength of coarse-grained models resides in their ability to make simulations of self-assembly processes substantially more affordable than with conventional all- or even united-atom models [136].
The level of representation offered by this alternative is, in sharp contrast, incompatible with the fine description of specific interactions of the participating lipid units with small solutes. Yet, the major advantage of coarse-grained (CG) models is their usefulness in simulating processes that are otherwise either difficult or impossible to study using conventional atomistic approaches. Many phenomena involving membranes lie within the mesoscopic spatio-temporal scale that may be explored with coarse-grained methods [137]. Among those, recent studies have shown the power of such techniques in investigating lipid–protein interactions, and membrane–membrane interactions such as anti-microbial attack on membranes and membrane fusion [138].
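To make the idea of describing each lipid by a limited number of interacting sites concrete, the sketch below maps atomic coordinates onto beads placed at the center of mass of the atoms assigned to them. The atom-to-bead assignment is entirely hypothetical and supplied by the user; the example does not correspond to the specific coarse-grained models of Refs. [135, 136], and it covers only the geometric mapping, not the parameterization of bead interactions.

```python
import numpy as np

def map_to_beads(positions, masses, bead_index, n_beads):
    """Place each bead at the centre of mass of the atoms assigned to it.

    positions  : (n_atoms, 3) atomic coordinates
    masses     : (n_atoms,) atomic masses
    bead_index : (n_atoms,) integer bead label per atom (hypothetical mapping)
    n_beads    : number of beads; every bead must own at least one atom
    """
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    bead_index = np.asarray(bead_index)
    bead_mass = np.bincount(bead_index, weights=masses, minlength=n_beads)
    bead_pos = np.zeros((n_beads, 3))
    for axis in range(3):
        # Mass-weighted coordinate sum per bead along this axis.
        bead_pos[:, axis] = np.bincount(
            bead_index, weights=masses * positions[:, axis], minlength=n_beads
        )
    return bead_pos / bead_mass[:, None], bead_mass
```

Applying such a mapping to an atomistic trajectory yields bead distributions against which a coarse-grained model can be compared.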
References
[1] R.B. Gennis, Biomembranes: Molecular Structure and Function, Springer-Verlag, Heidelberg, 1989.
[2] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, Academic Press, San Diego, 1996.
[3] D.P. Tieleman, S.J. Marrink, and H.J.C. Berendsen, “A computer perspective of membranes: molecular dynamics studies of lipid bilayer systems,” Biochim. Biophys. Acta, 1331, 235–270, 1997.
[4] D.J. Tobias, “Water and membranes: molecular details from MD simulations,” In: M.C. Bellissent-Funel (ed.), Hydration Processes in Biology, vol. 305, NATO ASI Series A: Life Sciences, IOS Press, New York, pp. 293–310, 1999.
[5] L.R. Forrest and M.S.P. Sansom, “Membrane simulations: bigger and better,” Curr. Opin. Struct. Biol., 10, 174–181, 2000.
[6] S.E. Feller, “Molecular dynamics simulations of lipid bilayers,” Curr. Opin. Colloid Interface Sci., 5, 217–223, 2000.
[7] H.L. Scott, “Modeling the lipid component of membranes,” Curr. Opin. Struct. Biol., 12, 495–502, 2002.
[8] D.J. Tobias, “Electrostatic calculations: recent methodological advances and applications to membranes,” Curr. Opin. Struct. Biol., 11, 253–261, 2001.
[9] R.J. Mashl, H.L. Scott, S. Subramaniam, and E. Jakobsson, “Molecular simulation of dioleylphosphatidylcholine bilayers at differing levels of hydration,” Biophys. J., 81, 3005–3015, 2001.
[10] L. Saiz and M.L. Klein, “Computer simulation studies of model biological membranes,” Acc. Chem. Res., 35, 482–489, 2002.
[11] J. Israelachvili, S. Marcelja, and R.G. Horn, “Physical principles of membrane organization,” Quart. Rev. Biophys., 13, 121–200, 1980.
[12] J. Israelachvili, Intermolecular and Surface Forces, Academic Press, London, 1992.
[13] K.V. Damodaran and K.M. Merz Jr., “A comparison of DMPC- and DLPE-based lipid bilayers,” Biophys. J., 66, 1076–1087, 1994.
[14] L. Saiz and M.L. Klein, “Electrostatic interactions in a neutral model phospholipid bilayer by molecular dynamics simulations,” J. Chem. Phys., 116, 3052–3057, 2002.
[15] J.A. Bouwstra, M.A. Salomons-de Vries, J.A. Van der Spek, and W. Bras, “Structure of human stratum corneum as a function of temperature and hydration: a wide angle x-ray diffraction study,” Int. J. Pharmacol., 84, 205–216, 1992.
[16] G. Zaccai, G. Büldt, A. Seelig, and J. Seelig, “Neutron diffraction studies on phosphatidylcholine model membranes. II. chain conformation and segmental order,” J. Mol. Biol., 134, 693–706, 1979.
[17] M.C. Wiener and S.H. White, “Structure of fluid dioleylphosphatidylcholine bilayer determined by joint refinement of x-ray and neutron diffraction data. III. complete structure,” Biophys. J., 61, 434–447, 1992.
[18] J.F. Nagle, R. Zhang, S. Tristram-Nagle, W.J. Sun, H.I. Petrache, and R.M. Suter, “X-ray structure determination of fully hydrated Lα phase dipalmitoylphosphatidylcholine bilayers,” Biophys. J., 70, 1419–1431, 1996.
[19] K. Hristova and S.H. White, “Determination of the hydrocarbon core structure of fluid DOPC bilayers by X-ray diffraction using specific bromination of the double-bonds: effect of hydration,” Biophys. J., 74, 2419–2433, 1998.
[20] J.F. Nagle and S. Tristram-Nagle, “Lipid bilayer structure,” Curr. Opin. Struct. Biol., 10, 474–480, 2000.
[21] C.F. Majkrzak and N.F. Berk, “Exact determination of the phase in neutron reflectometry by variation of the surrounding media,” Phys. Rev. B, 58, 15416–15418, 1998.
[22] M. Tarek, K. Tu, M.L. Klein, and D.J. Tobias, “Molecular dynamics simulations of supported phospholipid/alkanethiol bilayers on a gold(111) surface,” Biophys. J., 77, 464–472, 1999.
[23] C.F. Majkrzak, N.F. Berk, S. Krueger, J.A. Dura, M. Tarek, D.J. Tobias, V. Silin, C.W. Meuse, J. Woodward, and A.L. Plant, “First principle determination of hybrid bilayer membrane structure by phase-sensitive neutron reflectometry,” Biophys. J., 79, 3330–3340, 2000.
[24] S. Krueger, C.W. Meuse, C.F. Majkrzak, J.A. Dura, N.F. Berk, M. Tarek, and A.L. Plant, “Investigation of hybrid bilayer membranes with neutron reflectometry: probing the interaction of melittin,” Langmuir, 17, 511–521, 2001.
[25] K. Gawrisch, N.V. Eldho, and I.V. Polozov, “Novel NMR tools to study structure and dynamics of biomembranes,” Chem. Phys. Lipids, 116, 135–151, 2002.
[26] S.J. Marrink, M. Berkowitz, and H.J.C. Berendsen, “Molecular dynamics simulation of a membrane–water interface: The ordering of water and its relation to the hydration force,” Langmuir, 9, 3122–3131, 1993.
[27] S.E. Feller, D. Huster, and K. Gawrisch, “Interpretation of NOESY cross-relaxation rates from molecular dynamics simulations of a lipid bilayer,” J. Am. Chem. Soc., 121, 8963–8964, 1999.
[28] K. Sengupta and J. Raghunathan, “Structure of ripple phase in chiral and racemic dimyristoylphosphatidylcholine multibilayers,” Phys. Rev. E, 59, 2455–2457, 1999.
[29] J. Katsaras, S. Tristram-Nagle, Y. Liu, R.L. Headrick, E. Fontes, P.C. Mason, and J.F. Nagle, “Clarification of the ripple phase of lecithin bilayers using fully hydrated aligned samples,” Phys. Rev. E, 61, 5668–5677, 2000.
[30] S.J. Marrink and A.E. Mark, “Effect of undulations on surface tension in simulated bilayers,” J. Phys. Chem. B, 105, 6122–6127, 2001.
[31] O. Berger, O. Edholm, and F. Jähnig, “Molecular dynamics simulations of a fluid bilayer of dipalmitoylphosphatidylcholine at full hydration, constant pressure, and constant temperature,” Biophys. J., 72, 2002–2013, 1997.
[32] S.E. Feller and R.W. Pastor, “On simulating lipid bilayers with an applied surface tension: periodic boundary conditions and undulations,” Biophys. J., 71, 1350–1355, 1996.
[33] S.E. Feller and R.W. Pastor, “Constant surface tension simulations of lipid bilayers: the sensitivity of surface areas and compressibilities,” J. Chem. Phys., 111, 1281–1287, 1999.
[34] D.P. Tieleman and H.J.C. Berendsen, “Molecular dynamics simulations of a fully hydrated dipalmitoylphosphatidylcholine bilayer with different macroscopic boundary conditions and parameters,” J. Chem. Phys., 105, 4871–4880, 1996.
[35] E. Lindahl and O. Edholm, “Mesoscopic undulations and thickness fluctuations in lipid bilayers from molecular dynamics simulations,” Biophys. J., 79, 426–433, 2000.
[36] G.J. Martyna, D.J. Tobias, and M.L. Klein, “Constant pressure molecular dynamics algorithms,” J. Chem. Phys., 101, 4177–4189, 1994.
[37] S.E. Feller, Y.H. Zhang, R.W. Pastor, and B.R. Brooks, “Constant pressure molecular dynamics simulations – the Langevin piston method,” J. Chem. Phys., 103, 4613–4621, 1995.
[38] A.M. Smondyrev and M.L. Berkowitz, “United atom force field for phospholipid membranes: constant pressure molecular dynamics simulation of dipalmitoylphosphatidicholine/water system,” J. Comput. Chem., 20, 531–545, 1999.
[39] E. Egberts, S.J. Marrink, and H.J.C. Berendsen, “Molecular dynamics simulation of a phospholipid membrane,” Eur. Biophys. J., 22, 423–436, 1994.
[40] D.J. Tobias, K. Tu, and M.L. Klein, “Assessment of all-atom potentials for modeling membranes: molecular dynamics simulations of solid and liquid alkanes and crystals of phospholipid fragments,” J. Chim. Phys., 94, 1482–1502, 1997.
[41] S.-W. Chiu, M. Clark, E. Jakobsson, S. Subramaniam, and H.L. Scott, “Optimization of hydrocarbon chain interaction parameters: application to the simulation of fluid phase lipid bilayers,” J. Phys. Chem. B, 103, 6323–6327, 1999.
[42] S.E. Feller and A.D. MacKerell Jr., “An improved empirical potential energy function for molecular simulations of phospholipids,” J. Phys. Chem. B, 104, 7510–7515, 2000. [43] S.E. Feller, K. Gawrisch, and A.D. MacKerell Jr., “Polyunsaturated fatty acids in lipid bilayers: intrinsic and environmental contributions to their unique physical properties,” J. Am. Chem. Soc., 124, 318–326, 2002. [44] J. Ryckaert and A. Bellemans, “Molecular dynamics of liquid alkanes,” Chem. Soc. Faraday Discuss., 66, 95–106, 1978. [45] W.L. Jorgensen and J. Tirado-Rives, “The OPLS potential functions for proteins: energy minimizations for crystals of cyclic peptides and crambin,” J. Am. Chem. Soc., 110, 1657–1666, 1988. [46] W.D. Cornell and C. Chipot, “Alternative approaches to charge distribution calculations,” In: P.v.R. Schleyer, N.L. Allinger, T. Clark, J. Gasteiger, P.A. Kollman, H. F. Schaefer III, and P. R. Schreiner (eds.), Encyclopedia of Computational Chemistry, vol. 1, Wiley and Sons, Chichester, pp. 258–263, 1998. [47] F. Colonna and E. Evleth, “Conformationally invariant modeling of atomic charges,” Chem. Phys. Lett., 212, 665–670, 1993. [48] K. Tu, D.J. Tobias, and M.L. Klein, “Constant pressure and temperature molecular dynamics simulation of a fully hydrated liquid crystal phase DPPC bilayer,” Biophys. J., 69, 2558–2562, 1995. [49] M.P. Allen and D.J. Tildesley. Computer Simulation of Liquids, Clarendon Press, Oxford, 1987. [50] A.Y. Toukmaji and J.A. Board Jr., “Ewald summation techniques in perspective: a survey,” Comput. Phys. Comm., 95, 73–92, 1996. [51] R.W. Hockney and J.W. Eastwood, Computer Simulation Using Particles, IOP Publishing Ltd., Bristol, England, 1988. [52] K.E. Schmidt and M.A. Lee, “Implementing the fast multiple method in three dimensions,” J. Stat. Phys., 63, 1223–1235, 1991. [53] G.J. Martyna, M.E. Tuckerman, D.J. Tobias, and M.L. Klein. “Explicit reversible integrators for extended systems dynamics,” Mol. Phys., 87, 1117–1128, 1996. [54] M.E. 
Tuckerman and G.J. Martyna, “Understanding modern molecular dynamics: techniques and applications,” J. Phys. Chem. B, 104, 159–178, 2000. [55] J. A. Izaguirre, S. Reich, and R.D. Skeel, “Longer time steps for molecular dynamics,” J. Chem. Phys., 110, 9853–9864, 1999. [56] J. Ryckaert, G. Ciccotti, and H.J.C. Berendsen, “Numerical integration of the cartesian equations of motion for a system with constraints: molecular dynamics of n-alkanes,” J. Comput. Phys., 23, 327–341, 1977. [57] D.A. Pearlman, D.A. Case, J.W. Caldwell, W.R. Ross, T.E. Cheatham III, S. DeBolt, D. Ferguson, G. Seibel, and P. Kollman. “AMBER, a computer program for applying molecular mechanics, normal mode analysis, molecular dynamics and free energy calculations to elucidate the structures and energies of molecules,” Comput. Phys. Commun., 91, 1–41, 1995. [58] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, and M. Karplus. “CHARMM: a program for macromolecular energy, minimization, and dynamics calculations,” J. Comput. Chem., 4, 187–217, 1983. [59] E. Lindhal, B. Hess, and D. van der Spoel, “GROMACS 3.0: a package for molecular simulation and trajectory analysis,” J. Mol. Mod., 7, 306–317, 2001. [60] L. Kal´e, R. Skeel, M. Bhandarkar, R. Brunner, A. Gursoy, N. Krawetz, J. Phillips, A. Shinozaki, K. Varadarajan, and K. Schulten. “NAMD2: greater scalability for parallel molecular dynamics,” J. Comput. Phys., 151, 283–312, 1999.
[61] E. Tajkhorshid, A. Aksimentiev, I. Balabin, M. Gao, B. Isralewitz, J.C. Phillips, F. Zhu, and K. Schulten, “Large scale simulation of protein mechanics and function,” In: F.M. Richards, D.S. Eisenberg, and J. Kuriyan (eds.), Advances in Protein Chemistry, vol. 66, Elsevier Academic Press, New York, pp. 195–247, 2003. [62] W. Humphrey, A. Dalke, and K. Schulten, “VMD – visual molecular dynamics,” J. Mol. Graph., 14, 33–38, 1996. [63] S.-W. Chiu, M. Clark, V. Balaji, S. Subramaniam, H. L. Scott, and E. Jakobsson, “Incorporation of surface tension into molecular dynamics simulation of an interface: a fluid phase lipid bilayer membrane,” Biophys. J., 69, 1230–1245, 1995. [64] R.M. Venable, Y. Zhang, B.J. Hardy, and R.W. Pastor, “Molecular dynamics simulations of a lipid bilayer and of hexadecane: an investigation of membrane fluidity,” Science, 262, 223–226, 1993. [65] S.E. Feller, R.M. Venable, and R.W. Pastor, “Computer simulation of a DPPC phospholipid bilayer: structural changes as a function of molecular surface area,” Langmuir, 13, 6555–6561, 1997. [66] U. Essman and M. Berkowitz, “Dynamical properties of phospholipid bilayers from computer simulations,” Biophys. J., 76, 2081–2089, 1999. [67] S.-W. Chiu, M. Clark, E. Jakobsson, S. Subramaniam, and H. L. Scott, “Application of combined Monte Carlo and molecular dynamics method to simulation of dipalmitoyl phosphatidylcholine lipid bilayer,” J. Comp. Chem., 11, 1153–1164, 1999. [68] S.-W. Chiu, E. Jakobsson, S. Subramaniam, and H.L. Scott, “Combined Monte Carlo and molecular dynamics simulation of fully hydrated dioleyl and palmitoyl–oleyl phosphatidylcholine lipid bilayers,” Biophys. J., 77, 2462–2469, 1999. [69] T. Róg, K. Murzyn, and M. Pasenkiewicz-Gierula, “The dynamics of water at the phospholipid bilayer: A molecular dynamics study,” Chem. Phys. Lett., 352, 323–327, 2002. [70] L. Saiz and M.L. 
Klein, “Structural properties of a highly polyunsaturated lipid bilayer from molecular dynamics simulations,” Biophys. J., 81, 204–216, 2001. [71] M.L. Berkowitz and M.J. Raghavan, “Computer simulation of a water/membrane interface,” Langmuir, 7, 1042–1044, 1991. [72] J.J. L´opez Cascales, H.J.C. Berendsen, and J. García de la Torre, “Molecular dynamics simulation of water between two charged layers of dipalmitoylphosphatidylserine,” J. Phys. Chem., 100, 8621–8627, 1996. [73] S.A. Pandit and M.L. Berkowitz, “Molecular dynamics simulation of dipalmitoylphosphatidylserine bilayer with Na counterions,” Biophys. J., 82, 1818–1827, 2002. [74] M. Wilson and A. Pohorille, “Molecular dynamics of a water–lipid bilayer interface,” J. Am. Chem. Soc., 116, 1490–1501, 1994. [75] A. Pohorille and M.A. Wilson, “Molecular dynamics studies of simple membrane– water interfaces: structure and functions in the beginnings of cellular life,” Orig. Life Evol. Biosph., 25, 21–46, 1995. [76] M. Tarek, B. Maigret, and C. Chipot, “Molecular dynamics investigation of an oriented cyclic peptide nanotube in DMPC bilayers,” Biophys. J., 85, 2287–2298, 2003. [77] G.S. Harms, M. Sonnleitner, G.J. Schtz, and T. Schmidt. “Single-molecule anisotropy imaging,” Biophys. J., 77, 2864–2870, 1999. [78] P.B. Moore, C.F. Lopez, and M.L. Klein, “Dynamical properties of a hydrated lipid bilayer from a multinanosecond molecular dynamics simulation,” Biophys. J., 81, 2484–2494, 2001. [79] R.S. Armen, O.D. Uitto, and S.E. Feller, “ Phospholipid component volumes: determination and application to bilayer structure calculations,” Biophys. J., 75, 734–744, 1998.
[80] J.P. Doulier, A.L´eonard, and E.J. Dufourc, “Restatement of order parameters in biomembranes: calculation of C–C bond order parameters from C–D quadrupolar splitting,” Biophys. J., 68, 1727–1739, 1995. [81] H.I. Petrache, K. Tu, and J.F. Nagle, “Analysis of simulated NMR order parameters for lipid bilayer structure determination,” Biophys. J., 76, 2479–2487, 1999. [82] R.G. Snyder, K. Tu, M.L. Klein, R. Mendelssohn, H.L. Strauss, and W. Sun, “Acyl chain conformation and packing in dipalmitoylphosphatidylcholine bilayers from MD simulations and IR spectroscopy,” J. Chem. Phys. B, 106, 6273–6288, 2002. [83] M. Tarek, D.J. Tobias, and M.L. Klein, “Molecular dynamics simualtion of tetradecyltrimethylammonium bromide monolayers at the air/water interface,” J. Phys. Chem., 99, 1393–1402, 1995. [84] K. Gawrisch, D. Ruston, J. Zimmerberg, V. Parsegian, R. Rand, and N. Fuller, “Membrane dipole potentials, hydration forces, and the ordering of water at membrane surfaces,” Biophys. J., 61, 1213–1223, 1992. [85] W. Shinoda, M. Shimizu, and S. Okazaki, “Molecular dynamics study on electrostatic properties of a lipid bilayer: polarization, electrostatic potential, and the effects on structure and dynamics of water near the interface,” J. Phys. Chem. B, 102, 6647–6654, 1998. [86] J.X. Cheng, S. Pautot, D.A. Weitz, and X.S. Xie, “Ordering of water molecules between phospholipid bilayers visualized by coherent anti-stokes raman scattering microscopy,” Proc. Natl Acad. Sci. USA, 100, 9826–9830, 2003. [87] R.W. Pastor, R.M. Venable, and S.E. Feller, “Lipid bilayers, NMR relaxation, and computer simulations,” Acc. Chem. Res., 35, 438–446, 2002. [88] D.J. Tobias, “Membrane simulations,” In: O.H. Becker, A.D. Mackerell Jr., B. Roux, and M. Watanabe (eds.), Computational Biochemistry and Biophysics, Marcel Dekker, New York, 2001. [89] M. Tarek, D.J. Tobias, S.H. Chen, and M.L. Klein, “Short waverlength collective dynamics in phospholipid bilayers: a molecular dynamics study,” Phys. 
Rev. Lett., 87, 238101, 2001. [90] M.A. Wilson and A. Pohorille, “Mechanism of unassisted ion transport across membrane bilayers,” J. Am. Chem. Soc., 118, 6580–6587, 1996. [91] H.E. Alper and T.R. Stouch, “Orientation and diffusion of a drug analogue in biomembranes molecular dynamics simulations,” J. Phys. Chem., 99, 5724–5731, 1995. [92] D. Bassolino-Klimas, H.E. Alper, and T.R. Stouch, “Drug–membrane interactions studied by molecular dynamics simulation: size dependence of diffusion,” Drug Des. Discov., 13, 135–141, 1996. [93] K. Tu, M. Tarek, M.L. Klein, and D. Scharf, “Effects of anesthetics on the structure of a phospholipid bilayer: molecular dynamics investigation of halothane in the hydrated liquid crystal phase of dipalmitoylphosphatidylcholine,” Biophys. J., 75, 2123–2134, 1998. [94] L. Koubi, M. Tarek, M.L. Klein, and D. Scharf, “Distribution of halothane in a dipalmitoylphosphatidylcholine bilayer from molecular dynamics calculations,” Biophys. J., 78, 800–811, 2000. [95] L. Koubi, M. Tarek, M.L. Bandyophadhyay, and D. Scharf. “Effects of the nonimmobilizer hexafluroethane on the model membrane DMPC,” Anesthesiology, 97, 848–855, 2002. [96] A. Pohorille and M.A. Wilson, “Excess chemical potential of small solutes across water–membrane and water–hexane interfaces,” J. Chem. Phys., 104, 3760–3773, 1996.
[97] E. Overton, Studien u¨ ber die Narkose zugleich ein Betrag zur allgemeinen Pharmakologie, Verlag von Gustav Fischer, Jena, 1901. [98] A. Pohorille, M.A. Wilson, M.H. New, and C. Chipot. “Concentrations of anesthetics across the water–membrane interface; the Meyer–Overton hypothesis revisited,” Toxicology Lett., 100, 421–430, 1998. [99] A. Pohorille, M.A. Wilson, K. Schweighofer, M.H. New, and C. Chipot, “Interactions of membranes with small molecules and peptides,” In: J. Leszczynski (ed.), Theoretical and Computational Chemistry – Computational Molecular Biology, vol. 8, Elsevier, The Netherlands, pp. 485–535, 1999. [100] D.P. Tieleman, P.C. Biggin, G.R. Smith, and M.S.P. Sansom, “Simulation approaches to ion channel structure-function relationships,” Quart. Rev. Biophys, 34, 473–561, 2001. [101] B. Roux, “Theoretical and computational models of ion channels,” Curr. Opin. Struct. Biol., 12, 182–189, 2002. [102] B. Roux, “Computational studies of the gramicidin channel,” Acc. Chem. Res., 35, 366–375, 2002. [103] R. Pom`es and B. Roux, “Molecular mechanism of H+ conduction in the single–file water chain of the gramicidin channel,” Biophys. J., 82, 2304–2316, 2002. [104] S. Fernandez-Lopez, H.S. Kim, E.C. Choi, M. Delgado, J.R. Granja, A. Khasanov, K. Kraehenbuehl, G. Long, D.A. Weinberger, K.M. Wilcoxen, and M.R. Ghadiri, “Antibacterial agents based on the cyclic d, l-α-peptide architecture,” Nature, 412, 452–455, 2001. [105] E. Tajkhorshid, P. Nollert, M.O. Jensen, L.J.W. Miercke, J. O’Connell, R.M. Stroud, and K. Schulten. “Control of the selectivity of the aquaporin water channel family by global orientational tuning,” Science, 296, 525–530, 2002. [106] O. Edholm and A.M. Nyberg, “Cholesterol in model membranes: a molecular dynamics study,” Biophys. J., 63, 1081–1089, 1992. [107] R.R. Gabdoulline, G. Vanderkooi, and C. 
Zheng, “Comparison of the structures of dimyristoylphosphatidylcholine in the presence and absence of cholesterol by molecular dynamics simulations,” J. Phys. Chem., 100, 15942–15946, 1996. [108] K. Tu, M.L. Klein, and D.J. Tobias, “Constant–pressure molecular dynamics investigation of cholesterol in a dipalmitoylphosphatidylcholine bilayer,” Biophys. J., 75, 2147–2156, 1998. [109] A.M. Smondyrev and M.L. Berkowitz, “Structure of dipalmitoylphosphatidylcholine/cholesterol bilayer at low and high cholesterol concentrations: molecular dynamics simulation,” Biophys. J., 77, 2075–2089, 1999. [110] S.-W. Chiu, E. Jakobsson, and H.L. Scott, “Combined Monte Carlo and molecular dynamics simulation of hydrated dipalmitoyl–phosphatidylcholine–cholesterol lipid bilayers,” Biophys. J., 114, 5435–5443, 2001. [111] A.M. Smondyrev and M.L. Berkowitz, “Molecular dynamics simulation of the structure of dimyristoylphosphatidylcholine bilayers with cholesterol, ergosterol, and lanosterol,” Biophys. J., 80, 1649–1658, 2001. [112] T.W. McMullen and R.N. McElhaney, “Physical studies of cholesterol–phospholipid interactions,” Curr. Opin. Coll. Int. Sci., 1, 83–90, 1996. [113] A. Watts, “Solid-state NMR apporaches for studying the interaction of peptides and proteins with membranes,” Biochim. Biophys. Acta, 1376, 297–318, 1998. [114] D.S. Cafiso, “Alamethicin: A peptide model for voltage gating and protein– membrane interactions,” Annu. Rev. Biophys. Biomol. Struct., 23, 141–165, 1994. [115] C.E. Dempsey, “The actions of melittin on membranes,” Biochim. Biophys. Acta, 1031, 143–161, 1990.
[116] S. Bern`eche, M. Nina, and B. Roux, “Molecular dynamics simulation of melittin in a dimyristoylphosphatidylcholine bilayer membrane,” Biophys. J., 75, 1603–1618, 1998. [117] D.P. Tieleman, H.J.C. Berendsen, and M.S.P. Sansom. “Voltage-dependent insertion of alamethicin at phospholipid/water and octane/water,” Biophys. J., 80, 331–346, 2001. [118] S. Bandyopadhyay, M. Tarek, and M.L. Klein, “Molecular dynamics study of lipid– DNA complexes,” J. Phys. Chem. B, 103, 10075–10080, 1999. [119] Q. Zhong, T. Hisslein, P.B. Moore, D.M. Newns, P. Pattnaik, and M.L. Klein, “The M2 channel of influenza A virus: a molecular dynamics study,” FEBS Lett., 434, 265–271, 1998. [120] K. Schweighofer and A. Pohorille, “Computer simulation of ion channel gating: the M2 channel of influenza a virus in a lipid bilayer,” Biophys. J., 78, 150–163, 2000. [121] D.E. Elmore and D.A. Dougherty, “Molecular dynamics simulations of wild-type and mutant forms of the mycobacterium tuberculosis MscL channel,” Biophys. J., 81, 1345–1359, 2001. [122] T. Husslein, P.B. Moore, Q.F. Zhong, D.M Newns, P.C. Pattnaik, and M.L. Klein. “Molecular dynamics simulation of a hydrated diphytanol phosphatidylcholine lipid bilayer containing an alpha-helical bundle of four transmembrane domains of the Influenza a virus M2 protein,” Faraday Disc., 111, 201–208, 1998. [123] D.P. Tieleman, L.R. Forrest, M.S.P. Sansom, and H.J.C. Berendsen. “Lipid properties and the orientation of aromatic residues in OMPF, Influenza M2 and alamethicin systems: molecular dynamics simulations,” Biochemistry, 37, 17544–17561, 1998. [124] S.W. Chiu, S. Subramanian, and E. Jakobsson, “Simulation study of a gramicidin/lipid bilayer system in excess water and lipid. II. rates and mechanisms of water transport,” Biophys. J., 76, 1939–1950, 1999. [125] O.G. Mouritsen and M. Bloom, “Mattress model of lipid–protein interactions in membranes,” Biophys. J., 46, 141–153, 1984. [126] M.R.R. de Planque, D.V. Greathouse, H. Sch¨afer, D. 
Marsh, and J.A. Killian, “Influence of lipid/peptide hydrophobic mismatch on the thickness of diacylphosphatidylcholine bilayers. A 2 H NMR and ESR study using designed transmembrane α-helical peptides and gramicidin A,” Biochemistry, 37, 9333–9345, 1998. [127] D. Duque, X.J. Li, K. Katsov, and M. Schick, “Molecular theory of hydrophobic mismatch between lipids and peptides,” J. Chem. Phys., 116, 10478–10484, 2002. [128] S. Morein, R.E. Koeppe II, G. Lindblom, B. de Kruijff, and J.A. Killian, “The effect of peptide/lipid hydrophobic mismatch on the phase behavior of model membranes mimicking the lipid composition of Escherichia coli membranes,” Biophys. J., 78, 2475–2485, 2000. [129] M.R.R. de Planque, J.A.W. Kruijtzer, R.M.J. Liskamp, D. Marsh, D.V. Greathouse, R.E. Koeppe II, B. de Kruijff, and J. A. Killian. “Different membrane anchoring positions of tryptophan and lysine in synthetic transmembrane α-helical peptides,” J. Biol. Chem., 274, 20839–20846, 1999. [130] W.M. Yau, W.C. Wimley, K. Gawrisch, and S.H. White, “The preference of tryptophan for membrane interfaces,” Biochemistry, 37, 14713–14718, 1998. [131] C. An´ezo, A.H. de Vries, H.D. H¨oltje, P. Tieleman, and S.J. Marrink, “Methodological issues in lipid bilayer simulations,” J. Phys. Chem. B, 107, 9424–9433, 2003. [132] L. Onsager, “Electric moments of molecules in liquids,” J. Am. Chem. Soc., 58:1486– 1493, 1936.
[133] N. Ben-Tal, A. Ben-Shaul, A. Nicholls, and B. Honig. “Free–energy determinants of α-helix insertion into lipid bilayers,” Biophys. J., 70, 1803–1812, 1996. [134] J.H. Lin, N. A. Baker, and J.A. McCammon, “Bridging implicit and explicit solvent approaches for membrane electrostatics,” Biophys. J., 83, 1374–1379, 2002. [135] S.J. Marrink and A.E. Mark, “Molecular dynamics simulation of the formation, structure, and dynamics of small phospholipid vesicles,” J. Am. Chem. Soc., 125, 15233–15242, 2003. [136] J.C. Shelley, M.Y. Shelley, R.C. Reeder, S. Bandyopadhay, P.B. Moore, and M.L. Klein. “Simulations of phospholipids using a coarse grain model,” J. Phys. Chem. B, 105, 9785–9792, 2001. [137] S.O. Nielsen, C.F. Lopez, G. Srinivas, M.L. Klein, “Coarse grain models and the computer simulation of soft materials,” J. Phys. Condens. Matter, 16, R481–R512, 2004. [138] C.F. Lopez, S.O. Nielsen, P.B. Moore, M.L. Klein, “Understanding nature’s design for a nanosyringe,” Proc. Natl. Acad. Sci. 101, 4431–4434, 2004.
2.27 MODELING IRRADIATION DAMAGE ACCUMULATION IN CRYSTALS

Chung H. Woo
The Hong Kong Polytechnic University, Hong Kong SAR, China

S. Yip (ed.), Handbook of Materials Modeling, 959–986. © 2005 Springer. Printed in the Netherlands.

Bombardment of crystalline solids by energetic particles produces lattice defects, the accumulation of which is the origin of the macroscopic effects of irradiation damage. In an all-inclusive theory, the defects produced fall into two categories: (1) atomic displacements creating freely migrating vacancies and interstitials and their clusters, both mobile and immobile; and (2) transmutations creating impurity elements, such as helium. The first type of damage is called displacement damage, and is recoverable via the recombination of the vacancies and the interstitials before they disappear into grain boundaries, voids and dislocations, but the second type is not. In the present article, our attention is on the former.

Depending on the energy transfer between the projectile particle and the atom of first encounter in the irradiated material, i.e., the primary knock-on atom or PKA, the initial displacement damage may take the form of (i) vacancy–interstitial (Frenkel) pairs, when the energy transferred is just sufficient to overcome the displacement threshold, e.g., irradiation by MeV electrons; or (ii) cascades and sub-cascades, when the energy transferred is substantially higher, e.g., irradiation by fast neutrons and heavy ions. The lattice defects generated directly from the displacement damage in case (i) are isolated vacancies and interstitials, and their generation rates can be approximated by the displacement rate [1], after discounting the recombination of correlated pairs [2]. In case (ii), due to the high concentration of displacement damage produced in the small cascade volume, and the high mobility of atoms resulting from the energy deposited by the PKA, substantial recombination of the lattice defects takes place already during the cooling-down phase. The final numbers of interstitials and vacancies produced from the cascade are only small fractions of those estimated from the available energy due to the PKA. While the recombination is taking place, a significant fraction of interstitials and vacancies cluster at the same time. The fraction of vacancies immobilized in the primary vacancy clusters (PVCs) need not be exactly the same as that of interstitials in the primary interstitial clusters (PICs), and the concepts of atomic displacements and residual defect production must be carefully distinguished.

It is obvious that the details of the initial displacement damage determine the characteristics of the lattice defects produced, and hence the kinetics of their reactions among themselves and with the existing microstructure, and ultimately, the macroscopic effects resulting from the ensuing microstructure evolution. For example, electron irradiation produces displacement damage in the form of isolated Frenkel pairs. Accordingly, microstructure evolution can be modeled in terms of the reaction kinetics of point-defects produced homogeneously in space and uniformly in time. On the other hand, irradiation damage with heavier particles begins in the form of cascades and sub-cascades, in which case defect clustering occurs heterogeneously and athermally. The modeling of the resulting microstructure evolution must take these characteristics into account.

In terms of spatial and temporal scales, investigations involving the details of the initial damage, the properties of the defects produced, and the characteristics of the subsequent reactions naturally lie within the realm of atomistic modeling using techniques such as molecular dynamics (MD) or lattice kinetic Monte Carlo (KMC) (see other papers in this volume). On the other hand, the macroscopic manifestation of the damage effects occurs through the accumulation of irradiation-induced defects, which causes changes in the underlying microstructure of the irradiated material, often through multiple development stages.
Modeling irradiation damage in this domain must consider spatial dimensions in the range of microns, time scales of days, weeks, years, or even decades, and the accumulation of astronomical numbers of events occurring with infinitesimal probabilities. Such scales make direct atomistic modeling prohibitively expensive, even allowing for the rapid development of computer hardware and computational techniques. Irradiation damage modeling must recognize the multi-scale nature of the problem.

Theories of irradiation effects, such as void swelling, irradiation creep and growth, based on the reaction kinetics of the underlying microstructure evolution, have been the mainstream approach of irradiation-damage modeling for half a century, since the early 1960s (see Ref. [3] for a comprehensive review). This is still by far the most effective approach for bridging the vast gap between mechanisms at the atomistic scale and the associated effects at the component scale. Many models have been developed from these theories, for use as technological tools of analysis and interpretation for reactor design and operation. However, due to the lack of information on the irradiation-induced defects, model parameters fitted to experimental data have to be used. In many cases, a model based on a single set of parameters cannot be made consistent with all existing experimental observations, a weakness which usually reflects inadequate theoretical understanding or model oversimplification. In this situation, the usefulness of the model for predictive purposes, or as an interpretative tool, is seriously compromised. This underpins the need for a well-articulated multi-scale approach to irradiation damage modeling, which spans the vast territory between the nano- and macro-scales.

In this article we present an overview of the reaction kinetic theory of irradiation damage. To facilitate the articulation with the domain of atomistic modeling, our approach will be based on the discrete crystal lattice description. To lay the foundations, we start with the general theory of bimolecular reaction kinetics in a diffusive medium, within the framework of which we discuss the standard rate-theory model, the concept of sink bias, the effect of anisotropic diffusion and the DAD bias, and the introduction of the effective lifetime for the treatment of sink competition. We then consider, within the atomistic picture, the effects on the kinetics due to interactions with applied fields, their implications for the reaction kinetics of one-dimensional diffusers, and the stress-induced preferred absorption (SIPA) effect due to elasto-diffusion. Damage caused by cascade-producing irradiation and its modeling using production bias is covered next. Various difficulties facing the production-bias model necessitate consideration of modeling issues beyond the mean-field approximation. To overcome the inadequacy of the mean-field approximation, stochastic effects due to random cascade initiation have to be taken into account. The nucleation of voids and dislocation loops during cascade damage, the stability of spatial homogeneity, and the development of heterogeneous microstructure are discussed in this context. The overview concludes with a summary and outlook section.
1. Reaction Kinetics in a Diffusive Medium
The kinetics of the reactions involving point-defects and microstructure components have been considered via several approaches, including the rate theory approach (see Ref. [4]), the master equation approach (see Ref. [5]), and the Fokker–Planck equation approach (see Ref. [6]). We begin systematically with the well-developed theory of diffusion-influenced chemical reaction kinetics for molecules diffusing in a solvent, and relate it to the other approaches as we proceed.
1.1. Bimolecular Reaction Kinetics
Chemical reaction kinetics theory has a long history, starting with the elegant work of von Smoluchowski [7]. Based on the concept of the pair probability function, Goesele and Seeger [8] developed a theory for the bimolecular reaction rates, which can be generalized to include several important factors in the study of irradiation damage, such as interactions between the reaction partners, proximity of other reaction partners, anisotropic diffusion, and finite lifetime of the reactants. Goesele [9] gave a detailed exposition on this topic. Starting with randomly distributed reactants A and B, having diffusion tensors $D_A$ and $D_B$, and an interaction potential $E(r)$ between them, the rate of change in the spatially averaged concentrations $C_A$ and $C_B$ (in atomic fractions) is governed by the reaction coefficient $\alpha(t)$:

$$\dot{C}_A = \dot{C}_B = -\bar{D}\,\alpha(t)\,C_A C_B, \tag{1}$$

where the dots over $C_A$ and $C_B$ denote the time derivative and

$$\bar{D} = (D_x D_y D_z)^{1/3}. \tag{2}$$

$D_x$, $D_y$, $D_z$ are the principal values of the relative diffusion tensor $D$, which is assumed to be constant in space and time:

$$D = D_A + D_B. \tag{3}$$

In the case of three-dimensional anisotropic diffusion, $D_x$, $D_y$, $D_z$ are all non-zero. Then $\alpha(t)$ is a function of time and is related to the average diffusivity $\bar{D}$ and the drift field $E$ originating from the interaction between A and B. Explicit expressions for the most interesting cases have been derived [9, 10]. If B is a microstructure component that is indestructible, $\dot{C}_B = 0$ and Eq. (1) only applies to $C_A$.

The concept of pair probability densities discussed above is only valid for the equilibrium case, and not for the dynamic case, in which the reactants are continuously generated and subsequently annihilated. Goesele [11] introduced the concept of the effective lifetime $\tau_{\mathrm{eff}}$ of a point-defect as the mean time until annihilation of the point-defect, whether through spontaneous or induced conversion, recombination with the anti-defect, or rejoining the crystal lattice at a microstructure component. In terms of $\tau_{\mathrm{eff}}$, the effective time-independent reaction constant $\alpha$ can be derived from known expressions of the equilibrium time-dependent annealing reaction coefficient $\alpha(t)$.
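As a rough numerical illustration of Eqs. (1)–(3), the sketch below adds the principal values of the two diffusion tensors, takes their geometric mean, and integrates Eq. (1) for the special case of a constant reaction coefficient. The function names, the fixed-step Runge–Kutta integrator, the assumption that both tensors share principal axes, and all numerical values are choices made for this example only; the actual time dependence of α(t) derived in Refs. [9, 10] is not reproduced.

```python
import math


def mean_diffusivity(d_a, d_b):
    """Geometric-mean diffusivity of Eq. (2) for D = D_A + D_B, Eq. (3).

    d_a and d_b hold the principal values (Dx, Dy, Dz) of each tensor.
    Assumption of this sketch: both tensors share the same principal axes,
    so their principal values can simply be added.
    """
    dx, dy, dz = (a + b for a, b in zip(d_a, d_b))
    return (dx * dy * dz) ** (1.0 / 3.0)


def integrate_eq1(c_a, c_b, d_bar, alpha, t_end, steps=2000):
    """Integrate Eq. (1) with a constant alpha using classical RK4."""
    k = d_bar * alpha
    dt = t_end / steps

    def rate(a, b):
        return -k * a * b  # same loss rate for both reactants

    for _ in range(steps):
        r1 = rate(c_a, c_b)
        r2 = rate(c_a + 0.5 * dt * r1, c_b + 0.5 * dt * r1)
        r3 = rate(c_a + 0.5 * dt * r2, c_b + 0.5 * dt * r2)
        r4 = rate(c_a + dt * r3, c_b + dt * r3)
        step = dt * (r1 + 2.0 * r2 + 2.0 * r3 + r4) / 6.0
        c_a += step
        c_b += step
    return c_a, c_b


def equal_concentration_solution(c0, d_bar, alpha, t):
    """Closed form of Eq. (1) for C_A = C_B = c0 at t = 0 and constant alpha."""
    return c0 / (1.0 + d_bar * alpha * c0 * t)


if __name__ == "__main__":
    # Illustrative, dimensionless numbers (not taken from the text).
    d_bar = mean_diffusivity((1.0, 1.0, 12.0), (1.0, 1.0, 4.0))
    c_a, c_b = integrate_eq1(0.01, 0.01, d_bar, 0.5, 100.0)
    print(d_bar, c_a, equal_concentration_solution(0.01, d_bar, 0.5, 100.0))
```

Because both concentrations decay at the same rate, their difference is conserved, which gives a simple check on any integrator used for Eq. (1).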
1.2. Rate Theory Model and the Concept of Sink Bias
The steady-state kinetic equation is similar to Eq. (1), but with $\alpha(t)$ replaced by $\alpha$. Goesele [11] showed that $\alpha$ is related to the sink strength in the effective medium approximation of the rate-theory model. Indeed, if B is an inexorable sink, then $\alpha C_B$ is the usual sink strength $k^2$ of a microstructure component (sink) in the effective medium approximation. The lifetime of an individual process and the corresponding reaction coefficients are also related, i.e.,

$$k_B^2 = \alpha C_B = (\bar{D}\tau_B)^{-1} \tag{4}$$

and

$$k^2 = \sum_B k_B^2 = (\bar{D}\tau_{\mathrm{eff}})^{-1}, \tag{5}$$

with

$$\tau_{\mathrm{eff}}^{-1} = \sum_B \tau_B^{-1}. \tag{6}$$
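The additivity expressed by Eqs. (4)–(6) can be checked with a few lines of arithmetic. The following sketch is illustrative only: the sink labels, the lifetimes and the diffusivity are hypothetical, dimensionless numbers, and the function names are not taken from the text.

```python
def sink_strengths(d_bar, lifetimes):
    """Per-sink strengths k_B^2 = 1 / (D_bar * tau_B), as in Eq. (4)."""
    return {name: 1.0 / (d_bar * tau) for name, tau in lifetimes.items()}


def total_sink_strength(d_bar, lifetimes):
    """Total sink strength k^2 as the sum over sinks, as in Eq. (5)."""
    return sum(sink_strengths(d_bar, lifetimes).values())


def effective_lifetime(lifetimes):
    """Effective lifetime from Eq. (6): the inverse lifetimes add."""
    return 1.0 / sum(1.0 / tau for tau in lifetimes.values())


if __name__ == "__main__":
    # Hypothetical lifetimes of a point-defect with respect to each sink type.
    lifetimes = {"voids": 4.0, "dislocations": 1.0, "grain_boundaries": 4.0}
    d_bar = 2.0
    k2 = total_sink_strength(d_bar, lifetimes)
    tau_eff = effective_lifetime(lifetimes)
    # Both routes to k^2 in Eq. (5) should agree.
    print(k2, 1.0 / (d_bar * tau_eff))
```

The effective lifetime is always shorter than the shortest individual lifetime, which is the sense in which the sinks compete for the same point-defects.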
An imbalance in the fluxes of vacancies and interstitials to a microstructure component causes microstructure evolution, which in most cases, results in macroscopic property changes. To be specific, let us consider a case in which vacancies and self-interstitials are produced at a rate G. Replacing α(t) by α in Eq. (1), the net interstitial annihilation rate at S, s , is then given by s = αis D¯ i Ci − αvs D¯ v Cv ,
(7)
where αis is the reaction constant of interstitials with a microstructure component S, and αvs is the corresponding quantity for vacancies. The symbols Ci and Cv are the interstitial and vacancy concentrations (i.e., the spatial and temporal mean) respectively, and D¯ i , D¯ v are their respective averaged diffusion coefficients. The concentrations Ci and Cv can be calculated using the usual particle conservation equations, giving Gα s ¯ s = ni n (βs − β) N αv
(8)
n
where βs = and
αis − αvs αis
(9)
N n αin βn n ¯ β= n n N αi
(10)
n
Here $N^n$ is the density of the $n$th type of sink. Note that the average is weighted by the reaction constant for the interstitials. Equation (8) carries an important message, namely, whether a microstructure component absorbs a net flow of vacancies or interstitials is determined by the difference between the respective reaction constants $\alpha_j^n$ for the two types of defects. $\alpha_j^n$ depends on a myriad of factors, including the size and shape of reaction volumes and the associated boundary conditions, anisotropy of the diffusion, interaction between reaction partners, proximity of other reaction partners, continuous generation of reaction partners, lifetime and geometric arrangement, and spatial correlation of the reacting partners. Thus, any behavioral difference between vacancies and interstitials relating to the foregoing factors will contribute to the preference of the sink for a particular type of defect. Historically, the effect of the dislocation strain field on the kinetics of such reactions may be the earliest factor studied in irradiation damage theory, leading to the concept of the dislocation bias to explain void swelling (see Ref. [11]). In that picture, neglecting diffusion anisotropy, the difference between $\alpha_i^s$ and $\alpha_v^s$ is caused only by the difference in the drift potentials $E$ (i.e., the elastic interaction between the reaction partners) between the vacancies and the interstitials. For example, for edge dislocations, $E$ is larger for the self-interstitials than for the vacancies because of the larger elastic interaction. For voids, $E$ is negligible for both kinds of point-defects. Thus, Eq. (9) gives $\beta_v = 0$ for voids and $\beta_D > 0$ for edge dislocations, and Eqs. (9) and (10) immediately give $\beta_v - \bar{\beta} < 0$ and $\beta_D - \bar{\beta} > 0$, showing that, in an irradiated crystal with voids and edge dislocations, the voids will grow because they absorb a net flux of vacancies. At the same time, interstitial loops will grow and edge dislocations will climb, producing a volume strain, because they receive a net flux of interstitials. This is the conventional understanding of the mechanism responsible for void swelling. $\beta_s$ is sometimes referred to as the bias of the microstructure component $s$.
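The sign argument for voids and edge dislocations can be reproduced with a small numerical sketch of Eqs. (8)-(10). The reaction constants and sink densities below are hypothetical, dimensionless values chosen only so that voids are neutral and dislocations prefer interstitials; the function names are likewise illustrative.

```python
def bias(alpha_i, alpha_v):
    """Eq. (9): bias of a single sink type."""
    return (alpha_i - alpha_v) / alpha_i

def mean_bias(sinks):
    """Eq. (10): bias averaged with weights N^n * alpha_i^n."""
    num = sum(s["N"] * s["alpha_i"] * bias(s["alpha_i"], s["alpha_v"])
              for s in sinks.values())
    den = sum(s["N"] * s["alpha_i"] for s in sinks.values())
    return num / den

def net_interstitial_rate(name, sinks, generation_rate):
    """Eq. (8): net interstitial annihilation rate at one sink of type `name`."""
    s = sinks[name]
    total_v = sum(x["N"] * x["alpha_v"] for x in sinks.values())
    return (generation_rate * s["alpha_i"] / total_v
            * (bias(s["alpha_i"], s["alpha_v"]) - mean_bias(sinks)))

# Hypothetical sink population (arbitrary units).
SINKS = {
    "void": {"N": 1.0, "alpha_i": 1.0, "alpha_v": 1.0},         # beta = 0
    "dislocation": {"N": 2.0, "alpha_i": 1.1, "alpha_v": 1.0},  # beta > 0
}
G = 1.0  # defect pair generation rate, arbitrary units

rate_void = net_interstitial_rate("void", SINKS, G)
rate_disl = net_interstitial_rate("dislocation", SINKS, G)

if __name__ == "__main__":
    print("void:", rate_void, "dislocation:", rate_disl)
```

With these inputs the void rate is negative (net vacancy absorption) and the dislocation rate is positive (net interstitial absorption). Weighting each rate by its sink density and summing gives zero, as required when every generated defect is eventually absorbed at some sink.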
1.3. Anisotropic Diffusion and the DAD Effect
Taking into account the crystal structure, the mobilities of the defects produced during irradiation are not always isotropic. This is of prime importance to the understanding of irradiation damage behavior in crystals. Diffusional anisotropy can be a consequence of either the non-cubic structure of the host lattice (e.g., the hexagonal close-packed, hcp, structure of the zirconium lattice), or that of the defect itself (e.g., a crowdion in a cubic lattice), or both. In such cases, the diffusional anisotropy difference (DAD) between the vacancies and interstitials may also introduce a large bias according to Eq. (9). Woo [10] comprehensively reviewed the effects of anisotropic diffusion in the theory of irradiation damage in non-cubic crystals. The bias caused by DAD is independent of the associated strain field of the sink, but may completely dominate the conventional dislocation bias caused by the elastic interaction between the point-defects and the sink. Thus, unlike the usual dislocation bias caused by the elastic interaction, the bias for edge dislocations in non-cubic metals depends on their line directions in the crystal, and does not have to be towards interstitials. Grain boundaries and surfaces, which are only weakly biased sinks when the effect of anisotropic diffusion is neglected, can also be strongly biased towards vacancies or interstitials according to their orientations. This large variability of biases for sinks is a source of the complex behavior of irradiated non-cubic metals. Indeed, Woo and Goesele [12] were the first to suggest the link between anisotropic diffusion and cold-worked zirconium alloys with the hexagonal close-packed (hcp) crystal structure. Subsequently, reviewing the irradiation damage accumulation behavior of hcp zirconium alloys, Woo [13, 14] traced many of its “anomalous” properties to the DAD effect. Anisotropic diffusion also offers a natural explanation for the ordering of microstructure, such as void lattice formation and the ordering of dislocation loops [15]. Subsequent atomistic studies using molecular dynamics and statics of point-defect diffusion in α-zirconium [16] and α-titanium [17], both of hcp crystal structure, have indeed found evidence of DAD in both cases. The bias calculated from the atomistic anisotropy ratios was found to be consistent with the experimental HVEM loop growth measurements over a wide range of temperatures [13]. To formalize these concepts in the understanding of irradiation damage, the terms elastic interaction difference (EID) and diffusional anisotropy difference (DAD) were introduced [10]. EID is a major source of driving force for microstructure evolution, traditionally considered to cause the dislocation bias, and DAD is a natural candidate responsible for irradiation-induced effects geometrically related to the crystallography of the host lattice, such as irradiation growth in non-cubic metals, void lattice formation, dislocation loop ordering, etc.
1.4. Sink Competition and Effective Lifetime
The effects of sink competition and similar topics have been investigated in the modeling of reaction kinetics involving continuously produced migrating defects. Two basic issues are of concern: (a) the limited lifetime of the migrating point-defects between their creation and their annihilation at sinks, and (b) the overlapping diffusion fields, realized through the applied boundary conditions. The first can be easily visualized through the classical probability $P$ of annihilation, by a sink, of a three-dimensional diffuser created in its neighborhood. Polya [18] first found that $P$ was a function of the boundary condition and the initial distribution. Simpson and Sosin [19] showed that for the Smoluchowski boundary condition and a δ-function initial distribution, $P$ was a function of the duration $t$, given by
$$P(r, t) = 1 - \frac{R}{r}\,\mathrm{erfc}\!\left(\frac{r - R}{\sqrt{4Dt}}\right) \qquad (11)$$
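The time dependence in Eq. (11) is easy to explore numerically. The sketch below evaluates the standard Smoluchowski result $(R/r)\,\mathrm{erfc}[(r-R)/\sqrt{4Dt}]$, which is the probability that a diffuser released at distance $r$ from the centre of a perfectly absorbing sphere of radius $R$ has been captured by time $t$; the expression printed in Eq. (11) is one minus this quantity. The function names and the sample values are assumptions made for illustration.

```python
import math

def capture_probability(r, t, sink_radius, diffusivity):
    """(R/r) * erfc((r - R) / sqrt(4 D t)) for a diffuser released at r >= R."""
    if r < sink_radius:
        raise ValueError("release point must lie outside the sink")
    if t <= 0.0:
        return 1.0 if r == sink_radius else 0.0
    arg = (r - sink_radius) / math.sqrt(4.0 * diffusivity * t)
    return (sink_radius / r) * math.erfc(arg)

def eq11_as_printed(r, t, sink_radius, diffusivity):
    """Right-hand side of Eq. (11) as printed: 1 - (R/r) erfc(...)."""
    return 1.0 - capture_probability(r, t, sink_radius, diffusivity)

if __name__ == "__main__":
    R, r, D = 1.0, 2.0, 1.0  # hypothetical values in arbitrary units
    for t in (0.01, 0.1, 1.0, 10.0, 1.0e6):
        print(t, capture_probability(r, t, R, D), eq11_as_printed(r, t, R, D))
```

The capture probability rises monotonically with the time available and saturates at $R/r$, so a diffuser whose lifetime is cut short by competing sinks or recombination has less chance of reaching any one particular sink.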
For high sink concentrations, or high recombination, the point-defect lifetime is shortened, and the annihilation probability becomes very large [11]. Indeed, the effect of sink competition can also be seen in Eq. (5), in which the reaction constant α is expressed as a function of the effective lifetime τeff , which depends on the total reaction constant, to which α itself contributes. As the total sink strength increases due to the increased strength of competing sinks, the effective lifetime decreases, resulting in an increase of the strength of the particular sink under consideration. The effect of overlapping diffusion fields can be seen from the difference between sink strengths derived from various boundary conditions, such as the effective medium, the cellular model, etc. Goesele [9] has discussed this topic in some detail, to which the interested reader is referred.
2. Effects of Interactions and Applied Fields: The Atomistic Picture
The kinetics of reactions among molecules depends on the forces acting on them, which may come from their mutual interaction or from an external field. In earlier studies (see Ref. [20] for a review), the averaged rate of arrival of point-defects at a sink (i.e., microstructure component) was considered within the continuum theory of drift diffusion, by solving boundary-value problems involving equations of the type

$$\frac{\partial C}{\partial t} = \nabla \cdot \left[D_0\left(\nabla C + \beta C \nabla E\right)\right], \qquad (12)$$
where $C$ is the concentration of the point-defects per unit volume, $D_0$ the diffusivity tensor of the ideal crystal, $E$ is the potential energy of the defects in the applied force field (external plus internal), and $\beta$ is the reciprocal of the product of the Boltzmann constant and the absolute temperature. This approach has provided a useful model for determining the reaction constant of a mobile defect with a sink that has a strain field associated with it, e.g., dislocations. The continuum theory behind Eq. (12) neglects the atomistic nature of the migration of the defects and omits important crystalline effects [21], because: (i) the effect of the force field on the symmetry of the diffusivity tensor is absent; (ii) the configuration of the point-defect with which the interaction energy is evaluated is not clear; and (iii) the symmetry of the elementary-jump mechanism has no effect. Using a kinetic theoretic treatment of lattice jumps, in terms of the atomic jump vectors and jump barriers, Eq. (12) can be rewritten with the diffusivity tensor $D$ given by

$$D_{ij} = \frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h}) \exp\left\{-\beta\left[E^s(\hat{h}) - E^e_{\rm eff}\right]\right\}. \qquad (13)$$
Here the summation is carried over all nearest-neighbor jump directions $\hat{h}$ (the hat denotes the unit vector in the direction of $h$), and $h_i$ is the $i$th component of the position vector of the nearest neighbor to which a jump occurs. $E^s(\hat{h})$ is the interaction energy of the point-defect in its saddle-point configuration with the external field, which in general depends on the jump direction $\hat{h}$. $\lambda^0_{\rm eff}(\hat{h})$ is an effective jump frequency in the $\hat{h}$ direction of the defect in the unstressed crystal, obtained by averaging over different non-equivalent configurations of the point-defect, before and after the jump in the $\hat{h}$ direction. At the same time, the drift potential $E$ in Eq. (12) is replaced by an effective $E^e_{\rm eff}$, obtained by averaging over ground-state configurations of the point-defect [see Eq. (36) of Ref. [21]]. Comparison of Eq. (13) with the ideal diffusivity tensor used in Eq. (12),

$$(D_0)_{ij} = \frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h}), \qquad (14)$$
shows that the atomistic theory reduces to the continuum theory if there is no difference between the drift potential evaluated with the point-defect at the saddle-point configuration and that at the ground-state configurations. In other words, the atomistic effects are produced mostly by the difference between the ground-state and saddle-point configurations. The latter in general varies with the jump direction, while the former does not. However, in the case of slightly distorted cubic crystals, Dederichs and Schroeder [21] showed that the effect of $E^e_{\rm eff}$ only enters through the boundary conditions. In this case, writing the drift potentials $E^e$ and $E^s$, for which we have dropped the label eff, as a sum of two contributions from, respectively, the internal (i.e., short-range, subscript i) and external (e.g., uniform, subscript x) strain fields, i.e., $E^e = E^e_i + E^e_x$ and $E^s = E^s_i + E^s_x$, it can be shown that Eq. (12) can be rewritten with $D_0$ replaced by $D_x$, and $E$ by $E^s_i$. $D_x$ is the same as Eq. (13) expressed only in terms of the external component $E_x$ of the total drift potential. Expanding the exponential in Eq. (13), and retaining terms linear in the external stress, we obtain

$$(D_x)_{ij} \sim (D_0)_{ij} + d_{ijkl}\,\sigma_{kl}, \qquad (15)$$
where $d_{ijkl}$ is the elastodiffusion tensor [21], and $\sigma_{kl}$ the applied stress field. Equation (15) shows that the crystal lattice introduces two distinct effects. Firstly, a short-range interaction produces a drift field defined by the point-defect configuration at the saddle-point rather than at equilibrium. Secondly, the lattice distortion introduced by an external stress changes the symmetry of the diffusion field of the point-defect throughout the crystal, and hence its reaction kinetics.
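To make Eqs. (13)-(15) concrete, the following sketch assembles the diffusivity tensor from a list of nearest-neighbor jumps. It is an illustration under stated assumptions rather than the method of Ref. [21]: a simple cubic jump geometry is used, the jump frequency, jump length, temperature and direction-dependent saddle-point energy are hypothetical, and the ground-state energy is set to zero.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

def diffusivity_tensor(jumps, temperature, ground_energy=0.0):
    """Eq. (13): D_ij = 1/2 sum_h h_i h_j lambda0(h) exp(-beta [E^s(h) - E^e])."""
    beta = 1.0 / (K_B * temperature)
    tensor = [[0.0] * 3 for _ in range(3)]
    for h, lam0, e_saddle in jumps:
        weight = 0.5 * lam0 * math.exp(-beta * (e_saddle - ground_energy))
        for i in range(3):
            for j in range(3):
                tensor[i][j] += weight * h[i] * h[j]
    return tensor

def cubic_jumps(jump_length, lam0, saddle_energy):
    """Six <100> jumps; saddle_energy(h_hat) returns E^s in eV (assumed model)."""
    jumps = []
    for axis in range(3):
        for sign in (1.0, -1.0):
            h_hat = [0.0, 0.0, 0.0]
            h_hat[axis] = sign
            h = tuple(jump_length * c for c in h_hat)
            jumps.append((h, lam0, saddle_energy(tuple(h_hat))))
    return jumps

A = 3.0e-10   # jump length, m (hypothetical)
LAM0 = 1.0e9  # jump frequency per direction, 1/s (hypothetical)
T = 600.0     # temperature, K (hypothetical)

# Unstressed crystal: E^s = 0 for every jump, which reproduces Eq. (14).
D0 = diffusivity_tensor(cubic_jumps(A, LAM0, lambda h_hat: 0.0), T)
# Assumed external field lowering the saddle-point energy of jumps along x.
DX = diffusivity_tensor(cubic_jumps(A, LAM0, lambda h_hat: -0.01 * h_hat[0] ** 2), T)

if __name__ == "__main__":
    print("D0 diagonal:", [D0[i][i] for i in range(3)])
    print("Dx diagonal:", [DX[i][i] for i in range(3)])
```

In the unstressed case the tensor is isotropic with $D = a^2\lambda^0$ for this geometry. Once the saddle-point energy depends on the jump direction, $D_{xx}$ differs from $D_{yy}$ and $D_{zz}$, which is the stress-induced change of symmetry expressed by Eq. (15).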
The elastic interaction potential $V(r)$ of a point-defect at a point $r$ in an external field $\zeta_{ij}(r)$ can be expressed in terms of the atomistic properties of the point-defect [22]:

$$V(r) = -P_{ij}\,\zeta_{ij}(r) - \tfrac{1}{2}\,\alpha_{ijkl}\,\zeta_{ij}(r)\,\zeta_{kl}(r) \qquad (16)$$

where

$$\zeta_{ij}(r) = e_{ij}(r) + \varepsilon_{ij}(r). \qquad (17)$$
Repeated indices imply summation. Here, $\varepsilon_{ij}(r)$ is the strain field associated with the microstructure component (short-ranged), and $e_{ij}(r)$ is the strain field caused by an externally applied stress (long-ranged) at the defect position $r$. $P_{ij}$ is the elastic dipole tensor of the point-defect, which describes its elastic strain field away from the defect centre. $\alpha_{ijkl}$ is the elastic polarizability of the point-defect and describes the modification of the point-defect strain field caused by the total applied strain field $\zeta_{ij}(r)$. These atomistic quantities can be obtained via computer modeling [23]. In the continuum theory, in which the point-defect is modeled as a spherical inhomogeneous elastic inclusion, the interaction arising from $P_{ij}$ corresponds to the first-order size effect and that arising from $\alpha_{ijkl}$ to the second-order inhomogeneity interaction [24], and there is no distinction between the equilibrium and saddle-point configurations. Within the atomistic theory, however, both $P_{ij}$ and $\alpha_{ijkl}$ refer to the saddle-point configuration of the point-defect. To simplify the notation, we drop the superscript used to distinguish between the two different configurations. The elastic interaction of point-defects with a stress field modifies the potential barriers to possible atomic jumps and, subsequently, their reaction constants with a specific microstructure component. Vacancies and interstitials have different dipole tensors $P_{ij}$ and polarizabilities $\alpha_{ijkl}$, which interact differently with the external field and produce different reaction constants with the same microstructure component. This is the origin of the so-called dislocation bias [25], as we have discussed. In earlier models, the microstructure component is usually represented as a sink with a surface geometry and boundary conditions that describe how the reaction proceeds.
The interaction between the defect and the microstructure component is considered to be the size–effect interaction −pV, where V is the isotropic volume strain of the point-defect, and p is the hydrostatic stress field of the microstructure component. In the presence of an external applied stress, the second-order inhomogeneity interaction couples the strain fields of the sink and the external stress through the cross-term contained in the quadratic strain term. The stress dependence of the reaction constant then produces a stress-induced preferred absorption (SIPA) effect, which is often used to explain deformation due to creep during particle bombardment [26].
In the atomistic picture, the reaction rate depends on the renormalized diffusivity [27] that can be rewritten using Eq. (16) as

$$\tilde{D}_{ij} = \frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h}) \exp\left\{\beta\left[P^s_{kl}(\hat{h})\,\zeta_{kl} + \tfrac{1}{2}\,\alpha^s_{klmn}(\hat{h})\,\zeta_{kl}\zeta_{mn}\right]\right\}, \qquad (18)$$

where the index $s$ denotes a quantity evaluated at the saddle-point configuration in a stress-free crystal. If one uses the approximation that the point-defect is a centre of isotropic dilation or contraction, then only the hydrostatic component of the applied strain field $\zeta_{kl}(r)$ contributes to the interaction $\zeta_{kl}P^s_{kl}$. As a result, this interaction, and hence the diffusivity tensor, would be independent of the orientation of the external stress relative to the dislocation. However, when the anisotropy of the point-defect configuration (shape) is taken into account, the total interaction energy must include a contribution due to the shear component of the applied field $\zeta_{kl}(r)$. Using Eq. (17), the renormalized diffusivity tensor in Eq. (18) can be expanded as

$$\begin{aligned}
\tilde{D}_{ij} = \tilde{D}^0_{ij}
&+ \beta e_{kl}\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h})\, P^s_{kl}(\hat{h})
 + \beta \varepsilon_{kl}\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h})\, P^s_{kl}(\hat{h}) \\
&+ \beta^2 e_{kl}\varepsilon_{mn}\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h})\, P^s_{kl}(\hat{h})\, P^s_{mn}(\hat{h}) \\
&+ \beta e_{kl}\varepsilon_{mn}\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h})\, \alpha^s_{klmn}(\hat{h})
 + O(e_{kl}e_{mn};\, \varepsilon_{kl}\varepsilon_{mn})
\end{aligned} \qquad (19)$$
The first term is the renormalized diffusivity tensor of the defect in the ideal crystal, i.e., in the absence of an applied stress. The second term has the form $\tilde{d}_{ijkl}\,e_{kl}$, where $\tilde{d}_{ijkl}$ is the renormalized elasto-diffusion tensor defined by

$$\tilde{d}_{ijkl} = \beta\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h})\, P^s_{kl}(\hat{h}) \qquad (20)$$
This term is responsible for the global diffusional anisotropy introduced by the application of an external stress, i.e., external to the crystal, as discussed earlier in this section. The third term, which does not contain $e_{kl}$, is responsible for the dislocation bias due to the size and shape effects [28]. The fourth and fifth terms couple the strain fields of the external stress and that of the sink; they cause a dependence of the bias of the sink on the orientation of the external stress and hence a SIPA-type effect. The fourth term arises from the non-linear dependence of the diffusivity on the interaction energy and is a result of the discrete lattice theory. The fifth term comes from the point-defect polarizability and is responsible for the conventional SIPA mechanism [26]. The effects of both the fourth and fifth terms are second-order, being proportional to the product of two strains.
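The origin of the five terms can be made explicit with a short expansion of the exponential in Eq. (18). The sketch below assumes that the strains are small enough to truncate the series at second order and that $\alpha^s_{klmn}$ is symmetric under interchange of the index pairs $(kl)$ and $(mn)$; neither assumption is spelled out in the text, and the shorthand $X$ is introduced here only for convenience.

```latex
\begin{align*}
X &\equiv \beta\left[P^{s}_{kl}\,\zeta_{kl}
      + \tfrac{1}{2}\,\alpha^{s}_{klmn}\,\zeta_{kl}\zeta_{mn}\right],
   \qquad \zeta_{kl} = e_{kl} + \varepsilon_{kl}, \\
e^{X} &\simeq 1 + X + \tfrac{1}{2}X^{2} \\
      &= 1 + \beta P^{s}_{kl}\,e_{kl} + \beta P^{s}_{kl}\,\varepsilon_{kl}
         + \beta^{2} P^{s}_{kl}P^{s}_{mn}\,e_{kl}\varepsilon_{mn}
         + \beta\,\alpha^{s}_{klmn}\,e_{kl}\varepsilon_{mn}
         + O(e_{kl}e_{mn};\ \varepsilon_{kl}\varepsilon_{mn}).
\end{align*}
```

The cross term in $\beta^2$ comes from $\tfrac{1}{2}X^2$, and the cross term in $\alpha^s_{klmn}$ comes from the quadratic part of $X$. Multiplying by $\tfrac{1}{2} h_i h_j \lambda^0_{\rm eff}(\hat{h})$ and summing over $\hat{h}$ reproduces the first through fifth terms of Eq. (19) in order.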
2.1. Implications to the Reaction Kinetics of One-Dimensional Diffusers
It is obvious from Eq. (18) that when small jump barriers are involved, as is typical of the crowdion motion of single interstitials or clusters, $\lambda^0_{\rm eff}$ is close to the ideal lattice frequency, and the elastic interaction may dictate the migratory properties of the defect. Thus, recent MD computer simulation results of Wirth et al. [29] show that clusters of 19 and 37 interstitials in “Fe” have intrinsic migration energies of 0.023 and 0.052 eV, respectively. In comparison, using the infinitesimal loop approximation, the interaction energy of a circular planar interstitial cluster of area $\delta A$ in a stress field $\sigma_{ij}$ is given by $\delta A\,\hat{n}_i b_j \sigma_{ij}$, where $\hat{n}_i$ is the unit plane-normal vector [30]. For a uniaxial stress of 100 MPa acting along $b$, the interaction energies of the clusters, in the form of prismatic loops, are over 0.1 and 0.2 eV, respectively. Larger clusters will have even larger interaction energies, in proportion to their defect content. In such cases, the reaction kinetics between the defect clusters and the microstructure component may be determined not so much by their intrinsic properties, but rather by their interaction with the stress fields associated with other crystal defects. Trapped at the local minima of the interaction energy, they may continue to evolve in response to the net influx of vacancies or interstitials, similar to immobile clusters. Or, if the interaction experienced by the migrating defect is repulsive and causes the direction of motion to change, its migration may proceed via a three-dimensional percolation mode [31]. Indeed, recent MD simulations show that interstitial clusters in alloys tend to diffuse three-dimensionally, instead of one-dimensionally as in pure, single-component crystalline materials [32, 33].
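The quoted interaction energies can be reproduced with a back-of-the-envelope estimate. The sketch below assumes a prismatic loop whose plane normal is parallel to its Burgers vector, so that $\delta A\,\hat{n}_i b_j \sigma_{ij}$ reduces to $b\,\delta A\,\sigma$, and it uses $b\,\delta A = N\Omega$ for a cluster of $N$ interstitials with atomic volume $\Omega$. The bcc iron lattice parameter of 0.2866 nm is an assumed input, and the function name is hypothetical.

```python
EV = 1.602176634e-19        # joules per electron volt
A_FE = 2.866e-10            # assumed bcc Fe lattice parameter, m
OMEGA_FE = A_FE ** 3 / 2.0  # atomic volume, two atoms per bcc cell, m^3

def loop_interaction_energy_ev(n_interstitials, stress_pa, omega=OMEGA_FE):
    """Magnitude of dA * n_i * b_j * sigma_ij for n parallel to b and
    uniaxial stress along b, using b * dA = N * Omega (assumption)."""
    return n_interstitials * omega * stress_pa / EV

STRESS = 100.0e6  # 100 MPa, as in the text
MIGRATION_EV = {19: 0.023, 37: 0.052}  # values quoted from Ref. [29]

if __name__ == "__main__":
    for n, e_mig in MIGRATION_EV.items():
        e_int = loop_interaction_energy_ev(n, STRESS)
        print(f"N = {n}: interaction {e_int:.3f} eV, migration {e_mig:.3f} eV")
```

Under these assumptions the estimates come out near 0.14 and 0.27 eV, several times the intrinsic migration energies, which is consistent with the statement that the elastic interaction can dominate the migration of such clusters.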
Dudarev, Semenov and Woo [31] estimated that, in practical terms, the impurities generated via radioactive transmutation are already sufficient to reduce the one-dimensional diffusing range down to the sub-micron range, and suggested that the general importance of one-dimensional diffusion kinetics in explaining features of microstructure on scales beyond the sub-micron range should not be overemphasized.
2.2. Stress-induced Preferred Absorption (SIPA) due to Elastodiffusion
It may seem that the fourth and fifth terms of Eq. (19) are the only ones that can produce a dependence of the reaction constant on the orientation of the microstructure component with respect to the external stress, thus producing a SIPA effect. However, according to Eq. (4), the reaction constant of a geometrically anisotropic reaction volume in an anisotropic diffusion field would also depend on the relative orientations of the two entities. In the present case, through the operation of the elasto-diffusion term (i.e., the second term on the RHS of Eq. (15) or (19)), an external stress produces an anisotropy in an otherwise isotropic diffusion field (or changes the anisotropy of an intrinsically anisotropic one). This anisotropy can also cause the reaction constant to depend on the geometric orientation of the microstructure component with respect to the external stress, thereby producing a SIPA effect. It is important to note that this is a first-order effect, being proportional to the external stress, in contrast to the second-order effects represented by the fourth and fifth terms in Eq. (19). An important feature of this term is its line-direction dependence, causing dislocations with different line directions to have different biases under the action of an external stress. As a result, under non-equilibrium conditions, the application of a stress will cause edge dislocations to climb with different velocities, according to their line directions. If these dislocations also have different Burgers vectors, atoms will be deposited on, or removed from, various crystallographic planes at different rates, thus producing a time-dependent deformation, i.e., creep and stress relaxation [27]. The drift-diffusion problem has been solved analytically for a straight edge dislocation [28] and an infinitesimal edge dislocation loop [34]. In the presence of an external shear stress, the reaction constant has a much stronger dependence (by an order of magnitude) on the line direction than on the Burgers vector direction, which can be traced completely to the external-stress-induced anisotropy of diffusion. The effects of the second-order terms, i.e., the fourth and fifth terms in Eq. (19), are indeed negligible compared with those of the first-order one (i.e., the second term).
3. Damage by Cascade Producing Irradiation and Production Bias
The energy transferred during a high-energy recoil event, such as that caused by fast fission or fusion neutrons, causes a large number of atomic displacements in a crystalline solid in a very short time ($\sim 10^{-12}$ s). The high concentration of displacement damage produced, and the large energy deposited, by the primary knock-on atom (PKA) in the small cascade volume give rise to two effects. Firstly, extensive annealing occurs during the cooling-down phase, allowing only a small fraction of the initial displacements to survive as individual vacancies and interstitials. Secondly, a significant fraction of the remaining interstitials and vacancies form clusters. These “primary clusters” are also segregated in space, such that the primary vacancy clusters (PVCs) are formed near the cascade core and the primary interstitial clusters (PICs) are formed near the cascade periphery. Early investigations of the structure of cascades and sub-cascades concentrated on the intra-cascade clustering of vacancies. Using a diffusion-reaction formulation to account for clustering, Woo et al. [35] calculated the recombination and clustering of interstitials in a cascade and discovered that a significant fraction of the interstitials produced in a cascade may be immobilized in the form of clusters. Subsequently, numerous molecular dynamics studies of cascades confirmed the intra-cascade formation of interstitial clusters (see related articles, this volume). More importantly, the fraction of vacancies in the PVCs is not exactly the same as that of interstitials in the PICs. Trinkaus et al. [36] reviewed the experimental evidence, and concluded that all observations thus far are consistent with the premise of interstitial cluster formation. Thus, the available evidence suggests that under cascade damage conditions, a substantial fraction of surviving interstitials and vacancies are produced in the forms of PICs and PVCs, in addition to the Frenkel pairs. Up to the peak swelling temperature, the PICs are thermally stable because of their large binding energy, and unlike the PVCs, which are generally immobile, the larger PICs (containing more than 10 interstitials) usually collapse into platelets forming dislocation loops, which may be glissile or sessile. The glissile ones are one-dimensional diffusers with a very small jump barrier, and can reach and react with the sinks via long-range migration of the cluster as a whole. However, as discussed earlier in this article, the one-dimensionally diffusing PICs can easily get trapped in the local fields of other crystal defects, or change their direction of motion when repelled or released from a trapped state. In realistic materials, the direction change may occur sufficiently frequently between their creation and annihilation that their mean free paths are much shorter than the sink separation, and the reaction kinetics of the mobile PICs are effectively those of three-dimensional diffusers [31].
As a simplifying assumption, the mobile PICs may reasonably be treated simply as one component of the interstitial flux in their reaction with sinks.
3.1. Modeling Irradiation Damage Under Cascade Conditions and Production Bias
In view of the specific features discussed in the foregoing, it is important that the characteristics of the damage production and annihilation be represented accurately in the modeling of irradiation damage under cascade conditions. Specifically, the extensive intra-cascade recombination, the continuous generation and accumulation of the PICs and PVCs must be adequately accounted for. This requires that the evolution of the PICs and PVCs, the kinetics of their annihilation by the extended defects (e.g., dislocations), and their functions as both sources and sinks of the free defects, be incorporated as an integral and self-consistent part of any irradiation damage theory involving cascades.
As the irradiation damage production process becomes increasingly better understood, irradiation damage modeling has progressed from the standard rate theory (SRT) model [25] to the BEK model [37] to the production bias model (PBM) [38, 39]. The strengths and weaknesses of the models, in terms of their ability and consistency in the comprehensive description of the effects of temperature, dose rate and particle type on available experimental observations in swelling, creep, growth, microstructure evolution, radiation-enhanced diffusion (RED) and radiation-induced segregation (RIS), have been analyzed and reviewed by Woo et al. [3]. Interested readers are referred to this article for a critical overview and comparison of these models. In the following, we give a brief introduction to the production bias model, and then concentrate on its further development in the last several years. That a significant fraction of point-defects are retained in the form of immobile clusters immediately suggests that this portion may not participate in the conventional segregation of interstitials and vacancies via preferential attraction of single interstitials to dislocations. Instead, it is now realized that at irradiation temperatures above annealing Stage V, vacancies would evaporate from the PVCs due to thermal dissociation, and a large fraction of them would enter the medium and contribute to the global vacancy supersaturation. The PICs, on the other hand, are expected to remain thermally stable, at least up to peak-swelling temperatures [22]. Woo and Singh [38, 39] noticed this large asymmetry between the effective production efficiencies of mobile vacancies and interstitials, and found that the resulting “production bias” can provide a large driving force for microstructure evolution. This has led to the introduction of the production bias concept. In general terms, intra-cascade clusters can be both sources and sinks of point-defects.
At low temperatures the PVCs and the immobile portion of the PICs (IPICs) are predominantly sinks, and at high temperatures, emission due to thermal dissociation makes them effectively sources. The difference between the PVCs and IPICs in their capacity as point-defect sources varies with temperature, from which a net point-defect flux to sinks to drive microstructure evolution can be derived, just as from the dislocation bias. However, it is important to realize that this bias does not originate from the reaction kinetics of the point-defect with the sink, but from the difference between the effective production rates of the two types of freely migrating defects. That is, it should not be confused with a sink bias such as the dislocation bias. It is also important to note here that the interstitial clusters considered in the production bias model (PBM) are the IPICs, and the mobile interstitial clusters are effectively considered as part of the collection of the three-dimensionally migrating interstitials annihilated at the sinks [38–40]. The PBM was initially developed to consider steady-state void swelling. The high swelling rate and the sharp temperature dependence in the peak swelling regime were found to be caused, not by the dislocation bias, but by the additional supply of free vacancies from the thermal dissociation of PVCs. The model revealed the natural occurrence of the following characteristics of void swelling under cascade damage conditions, which are consistent with experimental observations. There are two sharply separated temperature regimes: a low swelling rate at lower temperatures and a high swelling rate at the peak swelling temperature. The swelling mechanism in the high swelling rate regime is dominated by the production bias, whereas in the low swelling rate regime it is determined by the dislocation bias. The transition from the low temperature (dislocation bias) regime to the high temperature (production bias) regime is abrupt. The steepness of the temperature dependence in this transition regime is consistent with a relatively high activation energy (∼3 eV), nearly equal to the activation energy for self-diffusion. The swelling rate at all temperatures increases with the amount of interstitial clustering. Similarly, the temperature dependence of irradiation growth in zirconium [41] also showed the existence of two sharply separated temperature regimes: a low growth rate regime at lower temperatures and a high growth rate peak at higher temperatures. The possible connection between this growth behavior and the large excess vacancy supersaturation created by the production bias was investigated, and production bias was found to be a plausible explanation [3]. It is worthwhile noting that the steep temperature dependence of the steady-state swelling rate observed under cascade damage conditions is a cascade effect, and cannot be explained within the standard rate theory (SRT).
In cases where point-defects are generated homogeneously in the form of Frenkel pairs, the SRT is applicable, and the temperature dependence at low temperatures is determined by the kinetics of vacancy-interstitial recombination, which becomes important because of the high point-defect concentrations that usually prevail under these conditions. Since the activation energy for the recombination-controlled process is half of the vacancy migration energy (∼0.7 eV in steels), the swelling rate would decrease only slowly with decreasing irradiation temperature [3].
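The contrast between the two temperature dependences can be illustrated with a simple Arrhenius comparison. The sketch below is only indicative: it assumes that both swelling rates scale as $\exp(-E/k_BT)$, it reads the quoted ∼3 eV and ∼0.7 eV as the effective activation energies of the production-bias and recombination-controlled regimes respectively, and the temperature interval is hypothetical.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

def arrhenius_ratio(e_act_ev, t_low, t_high):
    """Factor by which exp(-E / k_B T) grows when T rises from t_low to t_high."""
    return math.exp(e_act_ev / K_B * (1.0 / t_low - 1.0 / t_high))

E_PRODUCTION_BIAS = 3.0  # eV, transition regime quoted in the text
E_RECOMBINATION = 0.7    # eV, assumed reading of the value quoted for steels
T_LOW, T_HIGH = 650.0, 700.0  # K, hypothetical 50 K interval

ratio_pbm = arrhenius_ratio(E_PRODUCTION_BIAS, T_LOW, T_HIGH)
ratio_srt = arrhenius_ratio(E_RECOMBINATION, T_LOW, T_HIGH)

if __name__ == "__main__":
    print(f"3.0 eV process: rate changes by a factor of {ratio_pbm:.1f}")
    print(f"0.7 eV process: rate changes by a factor of {ratio_srt:.1f}")
```

Over the same 50 K interval the 3 eV process changes by more than an order of magnitude, while the 0.7 eV process changes by a factor of two to three, which is the sense in which the recombination-controlled swelling rate decreases only slowly with decreasing temperature.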
3.2. Difficult Issues Facing the Production-bias Model
Whilst the PBM gives a very good description of the behavior of void swelling, it does not offer a consistent explanation for the evolution of interstitial clusters and the dislocation structure. Indeed, in the temperature range just above the annealing stage V, i.e., the peak-swelling regime, the flux of freely migrating vacancies to all sinks is, on average, much higher than that of the interstitials, due to the dissociation of PVCs that are thermally less stable than the PICs. It is not obvious how interstitial loops can nucleate and grow, how the swelling strain is realized, and how the PICs are removed, so that they do not accumulate and suppress all driving forces for microstructure evolution. Another issue is connected with the experimental observation that in well-annealed metals with a dislocation density of $\sim 10^{11}$ m$^{-2}$, neutron irradiation at the peak swelling temperatures yields a high swelling rate of ∼1%/dpa [42, 43] at doses less than $10^{-2}$ dpa. At the same time, a heterogeneous and segregated microstructure forms by self-organization. Dislocation bias has not been able to explain the observed swelling. Although the swelling rate predicted by PBM using the mean-field approximation is much higher, it is still far too small to explain the measured values. The development of heterogeneous void swelling observed near grain boundaries [44, 45] in both pure metals and concentrated alloys is another challenge to PBM. Attempts to explain the formation of the ordered microstructure within the framework of SRT using the concept of dislocation bias have failed. Trinkaus et al. [46] suggested that this may also be caused by the long-range transport by the one-dimensional random walk of small interstitial clusters along the close-packed atomic directions, often observed in MD simulations (see related articles in this volume). This suggestion was subsequently investigated in further detail by Dudarev [47]. The predicted microstructure varied significantly according to the dimensionality assumed for the diffusion of the PICs, and the agreement of the calculated swelling profile with experiments was found to improve if the diffusion was assumed to be one-dimensional. Indeed, Singh (1999) speculated that this offered direct evidence of one-dimensional diffusion kinetics of the mobile PICs generated under cascade damage conditions. However, the recent discovery of this phenomenon in concentrated alloys [49] weakened this speculation considerably.
Indeed, as explained earlier in this article, pure one-dimensional migration over distances large compared with the sink separations, without interruption, is hard to justify in the presence of a high concentration of trapping or deflection centers. When the Burgers vector changes produced by such interruptions are sufficiently frequent, the reaction kinetics becomes effectively three-dimensional. The one-dimensional kinetics argument is further weakened by the fact that the capture probability of a one-dimensional random walker by sinks is much smaller than that of a three-dimensional one, so that long-range one-dimensional transport of clusters is basically inconsistent with the formation of a heterogeneous microstructure in a volume with low dislocation and cluster densities. Above all, if a large fraction of the interstitials generated by irradiation is removed from the bulk to the grain boundaries, it would be difficult for interstitial loops to nucleate and grow. This is particularly true at elevated temperatures, when there is a net vacancy flux because of vacancy emission from the thermally unstable vacancy clusters. Without the nucleation
and growth of interstitial loops, the microstructure evolution at high doses is difficult to understand. Experimental observations of void evolution during irradiation are also inconsistent with the reaction kinetics of one-dimensional diffusers (see review by Woo [6]). The capture probability of a one-dimensionally diffusing cluster by a void is proportional to the square of the void radius, whereas for free vacancies this probability is only linearly proportional to the void radius. If a significant portion of the PICs is able to diffuse one-dimensionally over distances of several microns, as is assumed, then void swelling must saturate when the voids become large enough. Indeed, the swelling rate in copper calculated on the assumption of one-dimensionally diffusing clusters is reduced severalfold as the voids grow over the dose range from less than 0.01 to 1 dpa. Experimentally, however, from a dose of 5 dpa up to doses of more than 100 dpa, voids in copper exhibit very robust swelling rates of about 0.5% per dpa in the temperature range 370–430 °C, and there is no sign of swelling saturation. In addition, the interaction of voids with interstitial clusters migrating one-dimensionally along the close-packed directions should promote the formation of a void lattice at a sufficiently large irradiation dose, because voids aligned along these directions have the most favorable spatial positions for growth [15]. Void-lattice formation has not been observed in copper either. The apparent difficulty in understanding large-scale heterogeneous void swelling without one-dimensional diffusion was considered recently by Dudarev et al. [31], who showed that this phenomenon can be explained according to the PBM, within the framework of three-dimensional diffusion-reaction kinetics of defects, if one assumes a heterogeneous dislocation structure that recognizes the denudation of dislocations next to the grain boundary.
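The scaling argument in the preceding paragraph can be made concrete with a few lines of Python. This is only an illustrative sketch: it uses the two proportionalities quoted in the text (capture of one-dimensionally migrating clusters scaling as the square of the void radius, capture of free vacancies scaling linearly with it) and omits all prefactors. The function name `capture_scaling` and the reference radius are assumptions introduced here and do not come from the cited works.

```python
def capture_scaling(radius_nm, ref_radius_nm=1.0):
    """Relative capture strengths of a single void, normalised to a reference radius.

    Assumed scalings (from the text, prefactors omitted):
      free vacancies (three-dimensional walkers)  ~ R
      one-dimensionally migrating clusters        ~ R**2
    Returns (vacancy, cluster_1d, cluster_1d / vacancy).
    """
    x = radius_nm / ref_radius_nm
    vacancy = x           # linear in the void radius
    cluster_1d = x ** 2   # quadratic in the void radius
    return vacancy, cluster_1d, cluster_1d / vacancy


if __name__ == "__main__":
    for r in (1.0, 2.0, 5.0, 10.0):
        v, c, ratio = capture_scaling(r)
        print(f"R = {r:5.1f} nm   vacancy ~ {v:6.1f}   1-D cluster ~ {c:6.1f}   ratio = {ratio:5.1f}")
```

In this sketch the ratio of interstitial-cluster capture to vacancy capture grows linearly with the void radius, which is why a model dominated by one-dimensional cluster transport implies that swelling should eventually saturate.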
4.
Beyond the Mean-field Theory: Stochastic Effects
The issues encountered in the foregoing are of fundamental importance to irradiation damage modeling, which, thus far in this article, has adopted the spatially and temporally averaged picture, i.e., the mean-field approximation. A more realistic description of the microstructure evolution under neutron and heavy-ion irradiation must also recognize the strongly stochastic nature of this problem, which derives from the discrete nature of the crystal lattice. This is particularly true when considering the evolution of the small interstitial clusters and the nucleation of interstitial loops and voids, and in situations where a random microstructure self-organizes into a spatially ordered structure. Indeed, under cascade-damage conditions, point-defects and their clusters are produced randomly in time and space, and in discrete packages. The statistical nature of diffusion jumps and cascade initiation introduces fluctuations in
the point-defect arrival rates at the sinks. In processes that involve only a small number of point-defects, such as the evolution of small point-defect clusters during nucleation events, it is intuitively clear that the fluctuations are important. Thus, an interstitial cluster that has been annihilated by a wave of vacancies cannot be revived by interstitials that arrive afterwards, even though the interstitial wave may be much bigger. Within the mean-field picture, on the other hand, the interstitial cluster would have survived and grown, resulting in a largely overestimated number density of interstitial clusters, which may completely distort the behavior of the irradiated system. To deal with the problem of temporal fluctuations and spatial variations in defect production and microstructure development, many authors formulate their problems using more advanced kinetic theories such as the master equation or the Fokker–Planck equation. Monte Carlo simulation techniques are also sometimes used on problems for which a limited scope in time, space and the number of sink types is not a serious restriction. Most calculations performed before the PBM took into account only the stochastic fluctuations due to the randomness of the point-defect jumps, and not those due to the randomness of the location, time and size of cascade initiation. The cascade diffusion theory of Mansur et al. [50], which took into account the space and time variation of cascade initiation, but not the randomness due to the migratory jumps of the point-defects, nor the variation among the defect contents of different cascades, is an exception. Nevertheless, the results of this work suffer from a flaw in their statistical treatment [51]. To properly resolve the difficult issues faced by the PBM, there is no doubt that the effects of stochastic fluctuations have to be rigorously explored.
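The point that an annihilated cluster cannot be revived, even when the average flux favors growth, can be illustrated with a toy Monte Carlo experiment. The sketch below is not the method of any of the cited works: it assumes a single cluster that receives one point-defect per step, an interstitial (+1) with probability `p_interstitial` and a vacancy (−1) otherwise, with size zero treated as absorbing. All names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def survives(n0, p_interstitial, steps, rng):
    """Follow one interstitial cluster of initial size n0 for a number of arrivals.

    Each arriving defect is an interstitial (+1) with probability
    p_interstitial, otherwise a vacancy (-1). Size 0 is absorbing.
    """
    n = n0
    for _ in range(steps):
        n += 1 if rng.random() < p_interstitial else -1
        if n == 0:
            return False  # annihilated; later interstitials cannot revive it
    return True


def survival_fraction(n0=4, p_interstitial=0.55, steps=500, trials=2000, seed=1):
    """Fraction of independent clusters still present after `steps` arrivals."""
    rng = random.Random(seed)
    alive = sum(survives(n0, p_interstitial, steps, rng) for _ in range(trials))
    return alive / trials


if __name__ == "__main__":
    p = 0.55
    # Mean-field view: the average drift per step is 2p - 1 > 0, so every
    # cluster is predicted to grow and none is lost.
    print("mean-field drift per step:", 2 * p - 1)
    # Stochastic view: a sizeable fraction is annihilated early on.
    print("stochastic survival fraction:", survival_fraction(p_interstitial=p))
```

Although the mean drift is positive for every cluster in this toy model, a large share of the clusters is lost to early fluctuations, so a mean-field count of surviving clusters would be an overestimate, in line with the argument above.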
An attempt to take on this challenging task was made by Semenov and Woo, who considered these issues in a series of papers published in the period between 1993 and 2003. This work is complex, but has been partially reviewed by Woo [6]. To avoid repetition, the reader is referred to this review and the references therein for the statistical analysis of the point-defect production, transport and annihilation, under cascade damage conditions, based on which the evolution equation is formulated. The following concentrates on the application of the stochastic theory to microstructure nucleation, and the development of the heterogeneous microstructure due to the loss of stability of the homogeneous one.
4.1.
The Evolution Equations with Cascade Effects
The evolution equations of the small clusters are of central importance to the proper formulation of the PBM. Starting from the statistical description of the arrival at sinks of vacancies and interstitials from randomly initiated cascades, Semenov and Woo [52] derived the full kinetic equation for the
distribution function of the net number of vacancies accumulated in a sink. Various levels of approximation applied to this equation result in different forms that can be identified with various equations used in the literature. Thus, when all direct effects of stochastic fluctuations, spatial or temporal, are ignored, the kinetic equation reduces to the conventional rate equations. If only the probabilistic nature of cascade initiation is neglected, the kinetic equation reduces to the conventional master equation, frequently used to describe microstructure evolution under continuous irradiation. When the stochastic process can be approximated by a Markov process, i.e., when the fluctuations can be assumed to be delta-correlated, statistical cumulants of order higher than two (i.e., k > 2) can be neglected, and the familiar Fokker–Planck equation is obtained, which takes into account both the migratory jump-induced and the cascade-induced fluctuations. In this case the probability distribution functions of the stochastic variables concerned are approximated by the appropriate Gaussian distributions. Based on the general kinetic equation, Semenov and Woo [52] analyzed the relative importance of the cascade-induced fluctuations in comparison with the migratory jump-induced fluctuations, and concluded that cascade-induced fluctuations play a much more important role than previously realized. For example, when they are absent, the conventional master equation gives a description qualitatively similar to the mean-field approximation. The total cluster density is typically much too high, and the interstitial content of the matrix is much too low, resulting in a swelling rate that is significantly underpredicted. Other theoretical calculations [53] and analyses of the experimental data on radiation-enhanced diffusion [54] also came to the same conclusion.
In contrast, by properly taking into account the cascade effects, the Fokker–Planck equation approach gives a more reasonable picture. Indeed, the neglect of cascade-induced fluctuations produces a large increase in the cluster density, causing a seven-fold drop of the swelling rate in the case of steels in the peak-swelling regime, from about 1%/NRT dpa to 0.15%/NRT dpa [55]. Taking into account the cascade-induced fluctuations at the initial stages of irradiation leads to an order-of-magnitude reduction in the total sink strength. As expected, inclusion of the fluctuations due to random cascade initiation is also important for a proper description of the nucleation of interstitial loops and voids, as we shall see in the following.
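For orientation, a Fokker–Planck equation of the kind referred to in this section can be written in the following generic textbook form. The notation is assumed here for illustration and is not taken from Refs. [52, 55]: f(n, t) is the distribution of sinks holding a net number n of vacancies, V(n) is the mean drift (net growth rate), and D(n) is the size-space diffusivity. If the fluctuation sources are taken to be statistically independent, D(n) may be written as the sum of a jump-induced part and a cascade-induced part.

```latex
\frac{\partial f(n,t)}{\partial t}
  = -\frac{\partial}{\partial n}\bigl[V(n)\,f(n,t)\bigr]
    + \frac{\partial^{2}}{\partial n^{2}}\bigl[D(n)\,f(n,t)\bigr],
\qquad
D(n) = D_{\mathrm{jump}}(n) + D_{\mathrm{casc}}(n).
```

In this schematic form, dropping the second-derivative term leaves a drift-only (mean-field) description, while setting the assumed cascade part of D(n) to zero retains only the jump-induced fluctuations.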
4.2.
Nucleation of Voids and Dislocation Loops During Cascade Damage
The conventional approach to modeling void nucleation under irradiation is based on the classical description of the formation of small precipitates in a supersaturated solution, in which small thermally unstable new-phase
embryos continuously form and redissolve in the supersaturated solution, but some can grow beyond the critical size via stochastic fluctuations. Beyond the critical size, the nuclei of the new phase become thermally stable and, on average, can grow directly from the supersaturated solution, without the help of stochastic fluctuations. At this stage, the nucleation process is considered to be complete. In this model, nucleation cannot occur within the mean-field theory. In earlier models, only the statistical fluctuations produced by random point-defect jumps were considered, and the dislocation bias was the only driving force for the evolution of the damage microstructure. Using the Fokker–Planck equation to account for stochastic effects, Semenov and Woo [56, 57] applied the classical nucleation model to both voids and interstitial loops, also including contributions from the random initiation of cascades and the emission of vacancies from voids. In the classical nucleation model, void nucleation essentially constitutes the growth of small thermally unstable void embryos to the critical void size, which can only occur via the stochastic fluctuations of the point-defect fluxes received by the void embryo. Three sources of fluctuations of the point-defect fluxes have been included: the diffusive jumps, the cascade initiation, and the vacancy emission from the void. At elevated temperatures and when the sink density is low, the fluctuation of vacancy emission from voids is the dominant factor. The effect of the cascade fluctuations is important only when the total sink strength for point-defects is high, e.g., > 10¹⁵ m⁻². Application of the model to void nucleation in neutron-irradiated annealed pure copper and molybdenum at elevated temperatures shows reasonable agreement with experimental observation. The nucleation of interstitial loops can be treated along similar lines.
As mentioned earlier in this article, in the temperature range just above the annealing stage V (i.e., the peak swelling regime) the dissociation of primary vacancy clusters (PVCs) produces a net flux of freely migrating vacancies to all sinks. Despite the net vacancy flux they receive, steady growth of faulted loops may be achievable through the absorption of smaller interstitial clusters and loops by coalescence. Indeed, the numerical calculation of Semenov and Woo [55] showed that the absorption of small interstitial clusters and loops could provide both the positive growth rate of the larger loops, and the sufficiently high climb rate of network dislocations, to produce a swelling rate in agreement with experimentally observed values. However, the probability of finding a neighboring cluster, with which it can combine, diminishes as the loop size decreases, and vanishes for the smallest immobile interstitial clusters. Thus, this mechanism can only account for the growth of sufficiently large interstitial loops. The smaller loops (or clusters) can only grow through stochastic fluctuations, similar to the case of sub-critical voids. From the foregoing description, the resemblance between the nucleation of voids and Frank loops at elevated temperatures is clear. Both vacancy and
interstitial clusters are directly produced in collision cascades, and critical sizes exist for both. On average, both the sub-critical void embryos and the sub-critical loop embryos shrink during the nucleation process, and nucleation can only be accomplished via stochastic fluctuations. Thus, the nucleation of both voids and interstitial loops from primary clusters can be treated within the same framework of the classical theory of nucleation. Indeed, Semenov and Woo [58] derived an analytic expression for the nucleation probability, applicable to both voids and Frank loops at elevated temperatures. Based on the classical nucleation model, Semenov and Woo [57] showed that, despite receiving a net vacancy flux due to dissociating vacancy clusters at elevated temperatures, a small fraction of the primary interstitial clusters may still grow to the critical size via stochastic fluctuations. The probability that this occurs increases exponentially with a reduction in the mean loop shrinkage rate and/or an increase in the strength of the stochastic fluctuations. The contribution from the cascade-induced fluctuations increases the nucleation probability by several orders of magnitude. The rate of interstitial loop nucleation calculated from the derived nucleation probability is sufficiently high to account for the experimentally observed number densities of interstitial loops at a dose of one NRT dpa. The continuous regeneration of network dislocations in the present theory produces a swelling rate that agrees very well with the experimental value, which is on the order of 1%/NRT dpa.
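The exponential sensitivity of the nucleation probability to the mean shrinkage rate and to the fluctuation strength can be seen in a deliberately simplified first-passage calculation. The sketch below is not the analytic expression of Ref. [58]; it assumes a size-independent shrinkage rate and a size-independent size-space diffusivity, for which the probability of reaching the critical size before vanishing follows from the standard drift-diffusion result with two absorbing boundaries. The function name and the numerical values are hypothetical.

```python
import math


def nucleation_probability(n0, n_crit, shrink_rate, diffusivity):
    """Probability that an embryo of size n0 reaches n_crit before vanishing.

    Assumes the constant-coefficient Fokker-Planck form
        df/dt = d(shrink_rate * f)/dn + diffusivity * d2f/dn2
    with absorbing boundaries at n = 0 and n = n_crit (a textbook
    simplification, not the size-dependent rates of the cited works).
    """
    a = shrink_rate / diffusivity
    if a == 0.0:
        return n0 / n_crit  # pure diffusion: linear in the starting size
    # expm1 keeps precision when a * n0 is small
    return math.expm1(a * n0) / math.expm1(a * n_crit)


if __name__ == "__main__":
    n0, n_crit, shrink = 4, 50, 1.0
    for d in (5.0, 10.0, 50.0):
        p = nucleation_probability(n0, n_crit, shrink, d)
        print(f"diffusivity = {d:5.1f}   nucleation probability = {p:.3e}")
```

In this toy setting a tenfold increase in the fluctuation strength raises the nucleation probability by orders of magnitude, which mirrors qualitatively the effect attributed above to the cascade-induced fluctuations.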
4.3.
System Instability and Heterogeneous Microstructure Development
Many of the contentious issues facing the PBM arise in cases in which the microstructure is heterogeneous. In this regard, one must realize that spatial homogeneity is an integral part of the description of a system within the mean-field approximation. A system with a heterogeneous microstructure is basically inconsistent with the mean-field approximation of PBM. Indeed, the self-organization of a homogeneous structure of higher entropy into an ordered structure with lower entropy is a strong indication of the instability of the former. Thus, a solution based on the spatial homogeneity assumption may exist in the mean-field approximation, but may not be stable when the statistical nature of the system is explicitly taken into account. The average capture probability of a point-defect generated in a volume V in the neighborhood of a particular sink can be calculated. It can be verified that only about 20% of the point-defects created inside the characteristic sink volume are annihilated at that sink. Since the average steady-state flux of point-defects to a sink is equal to the rate of generation of such point-defects in the characteristic sink volume, there must then be a continuous exchange of
point-defects between neighboring volumes. As a result, concentration fluctuations in V do not cancel each other completely. This exchange of point-defects between neighboring regions gives rise to the classical V⁻¹ dependence of the variance of point-defect concentrations. The random cascade initiation produces additional variations. The assumption of spatial homogeneity requires detailed balancing to be observed, which cannot be assumed a priori according to the foregoing description. In most cases, nevertheless, the magnitude of the variance over a meaningful volume is small and bounded, i.e., stable with respect to small perturbations, and the error of the assumption is negligible. Thermodynamically, the drive to keep the entropy production to a minimum tends to maintain the stability of the spatial homogeneity of the irradiated system [59]. However, in a far-from-equilibrium situation, there are cases in which the spatially homogeneous system is only conditionally stable, and small initial deviations from detailed balancing will grow beyond all bounds if the stability conditions are not met [60]. The assumption of spatial homogeneity is used implicitly in most calculations in the study of irradiation damage. An interesting case in which this assumption may break down can be found in the cascade-irradiation-induced microstructure evolution of a fully annealed metal at low dose. In the absence of dislocations, both the absorption of primary clusters by dislocations and the effects of dislocation bias are insignificant. Assuming spatial homogeneity, the temporal evolution of the microstructure has been considered using both the mean-field approximation and the Fokker–Planck equation approach [6]. At low doses and for temperatures at which the vacancy clusters are thermally unstable, vacancy accumulation will take place at the voids. Interstitials will accumulate in the PICs that are continuously produced in the cascades.
Assuming detailed balancing of the point-defect fluxes, the solution of the kinetic equations at low doses must satisfy matter conservation, which requires the local equality of the void swelling rate and the rate of interstitial accumulation in PICs. In this scenario, void growth will continue until the PICs become the dominant sink and act predominantly as recombination centres. Statistically, we have seen that the number density of PICs must vary, and higher void swelling can be expected in regions of lower PIC density. However, the a priori assumption of spatial homogeneity does not allow this to happen. When the spatially homogeneous solution is unstable, the relation between the local swelling rate and the rate of interstitial accumulation in PICs ceases to hold, and the local concentration of point-defects over a sizable region may deviate drastically from the global average. The description of the evolution of the local microstructure under the assumption of spatial homogeneity then breaks down. Noting that the actual microstructure observed in fully annealed pure copper at low doses is not spatially homogeneous, but is heterogeneous and
segregated, Semenov and Woo [61] analyzed the stability conditions of the spatially homogeneous solution of a system of primary interstitial clusters and small voids. They treated the PICs reduced below a minimum size as mobile three-dimensional random walkers. With the constraint of spatial homogeneity on the point-defect concentrations removed and replaced by the diffusion-based matter conservation equation, the spatially homogeneous solution was found to be only conditionally stable. When the homogeneous void growth rate becomes sufficiently low, due to the increase of either the void concentration or the average void size under irradiation, the system becomes unstable. At the onset of the instability, the microstructure starts to evolve heterogeneously. Since there is a wide and continuous spectrum of growing spatial modes, the developing structure is not spatially periodic. The instability develops with increasing spatial scales, in agreement with experimental observation. It also follows from this investigation that the characteristic scale of the spatial heterogeneity increases significantly with temperature, from a few microns at 525 K to tens of microns at 625 K. Physically, void growth and nucleation in the spatially homogeneous stage lead to the accumulation of small clusters and the reduction of the homogeneous void growth rate. At the same time, small inhomogeneous deviations of the void size change the rate of vacancy emission from the voids, and this feeds back to produce further enhancements in the variation of the void sizes. When the void growth rate becomes sufficiently low during the homogeneous stage of the evolution, any increase in the amplitude of inhomogeneous variations of void sizes cannot be damped out by the net vacancy flux into the voids.
This produces an unstable increase in the variation of void sizes in different regions, so that voids may become sub-critical and shrink away in some regions, while in other regions their size may grow beyond the saturation value allowed by the corresponding homogeneous solution. Both the void swelling rate and the interstitial accumulation rate in clusters become location dependent due to the development of heterogeneity. This means that the shrinkage rate of the interstitial clusters becomes location dependent as well. Consequently, the flux of small mobile interstitial clusters between adjacent spatial regions does not observe detailed balancing as in the homogeneous case, due to the difference in the cluster shrinkage rates in different regions. This is consistent with an earlier result of Semenov and Woo [60], that the outflow of small mobile clusters from a volume V with a size of 10–20 average cavity spacings is sufficient to totally balance the net vacancy flux in an adjacent region with a characteristic width on the order of 0.1 µm. In this earlier work, the possibility of instability of the spatially homogeneous solution was explored, and the physical nature of the instability was studied. It was found that the escape of the mobilized clusters from a finite volume, if not counterbalanced by an equal amount of mobile PICs from the neighboring volumes, leads to the accumulation of vacancies and the enhancement of void swelling in
this volume. At the same time, in the adjacent volumes the influx of interstitials in small mobile PICs neutralizes the net vacancy flux towards primary interstitial clusters. This prevents the clusters from shrinkage, thus reducing the escape probability of PICs, and enhancing the accumulation of interstitials (in clusters) in these regions. The entire process gives rise to a positive feedback system that leads to instability.
5.
Summary and Outlook
This article presents an overview of recent advances in the modeling of irradiation-damage accumulation, recognizing the limitation of the continuum theory of point-defect migration and the weakness of a mean-field approach in treating the kinetics of reactions involving small clusters. Focusing on the improved insight provided by the atomistic picture of the crystal lattice and the statistical considerations of the reaction kinetics, we trace the progress from the standard rate theory model to the production bias model, and from the mean-field theory to the stochastic theory, taking into account progressively more realistic features of the irradiation-damage process, with an increasing degree of sophistication. Consideration of reaction kinetics in a diffusive medium from an atomistic point of view leads to the discovery of a powerful sink bias arising from the diffusional anisotropy difference (DAD) between the interstitial and vacancy type defects, which adds a new dimension to the understanding of the behavior of irradiated crystals. Within the atomistic picture, it is clear that the DAD effects depend on fundamental properties of the defects at both the ground state and the saddle points, such as the direction-dependent jump distances, the jump barriers, and the configurations in terms of the corresponding dipole tensors. With advances in computational hardware and software, the dynamic and static properties of such defects can be obtained readily via atomistic simulation. Further work in this direction will yield information that may go a long way towards the understanding of the complex behavior of metals and alloys under irradiation, particularly for cases in which either the defect or the host crystal has non-cubic symmetry.
Thus, the behavior of one-dimensional diffusers in the strain field of impurities and small dislocation loops, or near a grain boundary, should be investigated in the context of trapping and detrapping, recombination and coalescence. Such information is of fundamental importance to irradiation-damage modeling. The intrinsic and external-field-induced anisotropy of diffusion of vacancies, interstitials and their clusters in hexagonal metals of different c/a ratios should also be obtained for an understanding of the systematic irradiation damage behavior of these metals. In the context of DAD, we also consider the effects of an externally applied stress on the point-defect kinetics in irradiated metals. The most
important effect that emerges in this regard, within the atomistic picture, arises from the change of the symmetry of the point-defect diffusional field in response to an externally applied stress. Such a change introduces diffusional anisotropy (elasto-diffusion) in an isotropic diffusing species, or changes the anisotropy of an anisotropic diffusing species. In both cases the reaction constants between point-defects and sinks become a function of the sink orientation with respect to the principal directions of the applied stress. Thus, the bias differential is changed among dislocations with different line directions, or among grain boundaries with different surface normals. This has a profound effect on the development and evolution of the microstructure and the associated macroscopic dimensional changes. Being a first-order effect, this stress-induced diffusional anisotropy is likely to play a major role in void swelling and irradiation creep mechanisms and in the coupling between them. In this context, the effect of stress on the nucleation of an anisotropic dislocation structure and of voids should be considered in irradiation deformation studies, within a proper nucleation theory that takes full account of the stochastic effects. It is important to recognize the strong effects arising from the stochastic nature of the reactions between point-defects and small clusters, and to take into account the intrinsic statistical variation of concentrations and size distributions. One must consider the statistical nature of point-defect production, transport, and annihilation at sinks. Of special importance, the subtle effects of fluctuations must be taken into account in considering the behavior of small clusters, which are one of the most important, yet most obscure, components of the microstructure. A cluster that has been annihilated cannot be revived afterwards, independent of whether the time-averaged point-defect flux dictates that it must always grow or shrink.
Indeed, the great majority of small clusters will shrink away in this manner, leaving only a very small proportion of survivors, of which some will grow much faster than others. Coarsening, whether in the distribution of voids or of interstitial loops, is one of the most important stochastic effects that result. Consideration of this effect is crucial in nucleation models, and in models in which small clusters form an essential component of the sinks. An additional important point that must also be appreciated is that, relative to the diffusive jump-induced fluctuations usually considered in most calculations, the cascade-induced fluctuations have a much larger effect. As a result, more recent calculations found that stochastic effects play a much more important role than previously thought, that is, before the real character of cascade damage was appreciated quantitatively via the establishment of the production bias model. Calculations involving the solution of the Fokker–Planck equations, however, are complex. To facilitate easy application, a satisfactory way of including these effects, within a reasonable approximation, in a simple model such as the rate theory is desirable, but has yet to be accomplished.
Another important issue that is often overlooked is that spatial homogeneity is an integral part of a system describable within the mean-field approximation. A heterogeneous microstructure is basically inconsistent with the mean-field approximation. The self-organization of a homogeneous structure of higher entropy, preferred by near-equilibrium thermodynamics, into an ordered structure with lower entropy is a strong indication of the instability of the former. Thus, when the statistical nature of the system is explicitly taken into account, the stability of a spatially homogeneous solution cannot be taken for granted, but has to be established. Otherwise, flawed conclusions could result.
Acknowledgment This project was supported by grants from the Research Grants Council of the Hong Kong Special Administrative Region (PolyU 5177/02E, 5167/01E, 5173/01E).
References
[1] M.J. Norgett, M.T. Robinson, and I.M. Torrens, ASTM Standards E 521–83, 1983.
[2] W. Schilling and H. Ullmaier, Mater. Sci. Technol., 10B, 179, 1994.
[3] C.H. Woo, B.N. Singh, and A.A. Semenov, J. Nucl. Mater., 239, 7, 1996.
[4] R. Bullough, Proceedings Conference on Dislocations and Properties of Real Materials, Royal Society, London, The Institute of Metals: London, p. 382, 1985.
[5] N.M. Ghoniem, Phys. Rev. B, 39, 11810, 1989.
[6] C.H. Woo, J. Computer-Aided Mater. Des., 6, 247, 1999.
[7] M. von Smoluchowski, Z. Phys. Chem., 92, 129, 1917.
[8] U. Goesele and A. Seeger, Philos. Mag., 14, 177, 1976.
[9] U.M. Gösele, Prog. React. Kin., 13, 63, 1984.
[10] C.H. Woo, J. Nucl. Mater., 159, 237, 1988.
[11] U. Goesele, J. Nucl. Mater., 78, 83, 1978.
[12] C.H. Woo and U. Goesele, J. Nucl. Mater., 119, 119, 1983.
[13] C.H. Woo, Radiation Effects and Defects in Solids, 144, 145, 1998.
[14] C.H. Woo, J. Nucl. Mater., 276, 90, 2000.
[15] C.H. Woo and W. Frank, J. Nucl. Mater., 137, 7, 1985.
[16] C.H. Woo, H. Huang, and W.J. Zhu, Appl. Phys. A, 76, 101, 2003.
[17] M. Wen, C.H. Woo, and H. Huang, J. Computer-Aided Mater. Des., 7, 97, 2000.
[18] R. Polya, Math. Annalen, 84, 149, 1926.
[19] H.M. Simpson and A. Sosin, Radiat. Eff., 3, 1, 1970.
[20] R. Bullough, D.V. Wells, J.R. Willis, and M.H. Wood, Dislocation Modeling of Physical Systems, Pergamon Press, New York, p. 116, 1980.
[21] P.H. Dederichs and K. Schroeder, Phys. Rev. B, 17, 2524, 1978.
[22] H. Ullmaier and W. Schilling, Physics of Modern Materials, International Atomic Energy Agency, Vienna, 301, 1980.
[23] M.P. Puls and C.H. Woo, J. Nucl. Mater., 139, 48, 1986.
[24] A.H. Cottrell, Report on Conference on the Strength of Solids, The Physical Society, London, 1948.
[25] A.D. Brailsford and R. Bullough, J. Nucl. Mater., 44, 121, 1972.
[26] R. Bullough and J.R. Willis, Philos. Mag., 31, 855, 1975.
[27] C.H. Woo, J. Nucl. Mater., 120, 55, 1984.
[28] B.C. Skinner and C.H. Woo, Phys. Rev. B, 30, 30384, 1984.
[29] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, J. Nucl. Mater., 276, 33, 2000.
[30] F. Kroupa, Philos. Mag., 7, 783, 1962.
[31] S.L. Dudarev, A.A. Semenov, and C.H. Woo, Phys. Rev. B, 67, 094103, 2003 and Phys. Rev. B, 70, 094115, 2004.
[32] J. Marian, B.D. Wirth, J.M. Perlado, G.R. Odette, and T. Diaz de la Rubia, Phys. Rev. B, 64, 094303, 2001.
[33] J. Marian, B.D. Wirth, A. Caro, B. Sadigh, G.R. Odette, J.M. Perlado, and T. Diaz de la Rubia, Phys. Rev. B, 65, 144102, 2002.
[34] C.H. Woo and E.J. Savino, J. Nucl. Mater., 116, 17, 1983.
[35] C.H. Woo, B.N. Singh, and H.L. Heinisch, J. Nucl. Mater., 174, 190, 1990.
[36] H. Trinkaus, V. Naundorf, B.N. Singh, and C.H. Woo, J. Nucl. Mater., 210, 244, 1994.
[37] R. Bullough, B.L. Eyre, and K. Krishan, Proc. R. Soc. A, 346, 81, 1975.
[38] C.H. Woo and B.N. Singh, Phys. Stat. Sol. (b), 159, 609, 1990.
[39] C.H. Woo and B.N. Singh, Phil. Mag. A, 65, 889, 1992.
[40] B.N. Singh and A.J.E. Foreman, Phil. Mag. A, 66, 975, 1992.
[41] R.P. Tucker, V. Fidleris, and R.B. Adamson, ASTM STP 804, 427, 1984.
[42] B.N. Singh, T. Leffers, and A. Horsewell, Phil. Mag. A, 53, 233, 1986.
[43] T. Leffers, B.N. Singh, A.V. Volobuyev, and V.V. Gann, Phil. Mag. A, 53, 243, 1986.
[44] C.W. Chen and R.W. Buttry, Radiat. Eff., 56, 219, 1981.
[45] B.N. Singh, T. Leffers, W.V. Green, and S.L. Green, J. Nucl. Mater., 105, 1, 1982.
[46] H. Trinkaus, B.N. Singh, and A.J.E. Foreman, J. Nucl. Mater., 206, 200, 1993.
[47] S.L. Dudarev, Phys. Rev. B, 62, 9325, 2000.
[48] B.N. Singh, Radiat. Eff. Defects Solids, 148, 383, 1999.
[49] S. Zinkle and B.N. Singh, J. Nucl. Mater., 283–287, 306, 2000.
[50] L.K. Mansur, A.D. Brailsford, and W.A. Coghlan, Acta Metall., 33, 1407, 1985.
[51] A.A. Semenov and C.H. Woo, J. Nucl. Mater., 233–237, 1045, 1996.
[52] A.A. Semenov and C.H. Woo, Appl. Phys. A, 69, 445, 1999.
[53] H. Wiedersich, J. Nucl. Mater., 205, 40, 1993.
[54] H. Trinkaus, B.N. Singh, and C.H. Woo, J. Nucl. Mater., 212–215, 18, 1994.
[55] A.A. Semenov and C.H. Woo, Appl. Phys. A, 67, 193, 1998.
[56] A.A. Semenov and C.H. Woo, Phys. Rev. B, 66, 024118, 2002.
[57] A.A. Semenov and C.H. Woo, Philos. Mag., 83, 3765, 2003.
[58] A.A. Semenov and C.H. Woo, J. Nucl. Mater., 323, 192, 2003.
[59] G. Nicolis and I. Prigogine, Self-organization in Nonequilibrium Systems, John Wiley & Sons, Inc, New York, 1977.
[60] A.A. Semenov and C.H. Woo, Appl. Phys. A, 73, 371, 2001.
[61] A.A. Semenov and C.H. Woo, Appl. Phys. A, 74, 639, 2002a.
2.28 CASCADE MODELING
Jean-Paul Crocombette
CEA Saclay, DEN-SRMP, 91191 Gif/Yvette cedex, France
Cascade modeling deals with the effects of the impact of a high-velocity particle on a solid. These simulations are of primary interest to the nuclear engineering community, where they are major tools for analyzing the behavior of materials subjected to internal or external irradiation. The simulation tools were originally designed by this community, which still uses them. They are also of interest for implantation studies in microelectronics, for sputtering and, more generally, for surface modification studies.
1. Introduction
When an energetic particle penetrates a solid, it loses its energy through a series of elastic nuclear collisions and through excitations of the electronic system. The latter dominate in the high-energy (MeV) range, whereas the former are most important at lower energies (below a few tens or hundreds of keV). Elastic collisions set target atoms into motion, which can in turn displace neighboring atoms, thereby creating a displacement cascade. Similar cascades appear when a radioactive atom inserted in the solid decays. Because not all displaced atoms can return to their original or equivalent sites, a cascade creates vacancies and self-interstitials and mixes the atomic structure. The accumulation of such defects under irradiation eventually modifies the microstructure and properties of the material. A general review of damage in irradiated materials can be found in Ref. [1]. Atomistic simulations are essential tools for analyzing cascades, as they provide a description of the cascade processes and a detailed view of the primary state of damage. Here we focus on the atomic description of the cascade; the microstructure and property changes under irradiation are addressed in another section of this book.
S. Yip (ed.), Handbook of Materials Modeling, 987–998. © 2005 Springer. Printed in the Netherlands.
Cascade descriptions start with the definition of the primary knock-on atom (PKA), the first atom set into motion. Depending on the situation, it can be the target atom initially struck by the irradiating particle, the incident ion itself, or the recoil nucleus created by radioactive decay. After a collision with the PKA, a target atom is displaced from its original position if its kinetic energy is larger than a threshold displacement energy (TDE), which depends on the element, the material and the crystallographic direction of the impulse given to the target atom. TDE values are of the order of 20–70 eV. Cascades occur when the kinetic energies of the target atoms are large enough to displace further atoms. Simple mechanics shows that this is not the case for electron irradiation: because of their small mass, electrons can transfer at most a few tens of eV to target atoms. Electron irradiation thus creates only isolated Frenkel pairs. In contrast, neutron or ion irradiation causes PKAs to recoil with energies of tens of keV, thus leading to displacement cascades.
A cascade can be decomposed into three successive phases. Once the PKA has been set into motion, a series of atomic displacements takes place through collisions (ballistic phase). After less than ∼0.2 ps, the energies of the recoil atoms fall below the threshold displacement energy and the ballistic phase ends. However, because of the atomic motion, a local region of high temperature exists in the material, creating the conditions of a thermal spike (thermal phase) in which the material is locally in a liquid-like state. After a few picoseconds the spike dissipates and the so-called primary state of damage is reached. Simulations have shown that the primary state of damage depends strongly on the way the structure of the material reacts during the thermal spike. In some materials the crystalline structure rebuilds rapidly, leaving only point defects after the cascade.
This is the case in pure metals such as Ni and Fe, where cascades create vacancies and interstitials. Some ionic compounds (e.g., UO2) react in the same way. In metallic alloys, antisites are also produced, leading to an ion mixing effect. In contrast, in silicon and other covalent or ionocovalent materials (SiO2, zircon), cascades do not create isolated defects but amorphous pockets of material that do not recrystallize after the thermal phase. The thermal phase is followed by the diffusive phase, during which the long-term evolution of the material takes place through thermally activated phenomena. Depending on the material, the competition between the accumulation of damage and diffusive recovery may lead to dramatic changes such as the complete amorphization of the material as the crystalline structure eventually collapses, leading to the so-called metamict state.
Displacement cascade simulations aim to reproduce the ballistic and thermal phases and to describe the primary state of damage. Two main simulation methodologies exist. The binary collision approximation (BCA) describes the ballistic phase and gives fast and reliable results for a global picture of energy loss, damage geometry and ion implantation range. It is used in situations where good statistics are needed. Molecular dynamics (MD) simulations are much more computationally demanding but give a more precise description of the material, as they describe both the ballistic and the thermal phases.
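The contrast drawn in the introduction between electron and neutron irradiation can be checked with two-body kinematics. The sketch below uses the standard relativistic expression for the maximum energy transferred in a head-on elastic collision; the choice of an iron target, the 1 MeV projectile energy and the rounded rest energies are assumptions made only for illustration.

```python
"""Maximum energy transferred to a target atom in a head-on elastic collision."""

# Rest energies in MeV (rounded values, used here as illustrative inputs).
ELECTRON_MC2 = 0.510999
NEUTRON_MC2 = 939.565
AMU_MC2 = 931.494  # rest energy of one atomic mass unit


def max_transfer_mev(e_kin_mev, proj_mc2, target_mass_amu):
    """Relativistic two-body kinematics for a head-on elastic collision.

    T_max = 2 M c^2 E (E + 2 m c^2) / [(m c^2 + M c^2)^2 + 2 M c^2 E]
    which reduces to 4 m M E / (m + M)^2 when E << m c^2.
    """
    big_m = target_mass_amu * AMU_MC2
    num = 2.0 * big_m * e_kin_mev * (e_kin_mev + 2.0 * proj_mc2)
    den = (proj_mc2 + big_m) ** 2 + 2.0 * big_m * e_kin_mev
    return num / den


FE_MASS_AMU = 55.845  # iron target, chosen only as an example

# Both projectiles carry 1 MeV of kinetic energy.
t_electron_ev = max_transfer_mev(1.0, ELECTRON_MC2, FE_MASS_AMU) * 1.0e6
t_neutron_ev = max_transfer_mev(1.0, NEUTRON_MC2, FE_MASS_AMU) * 1.0e6

print(f"1 MeV electron on Fe: T_max = {t_electron_ev:.0f} eV")
print(f"1 MeV neutron  on Fe: T_max = {t_neutron_ev / 1e3:.0f} keV")
```

With these inputs the electron transfers at most a few tens of eV, comparable to the TDE, whereas the neutron can transfer tens of keV, which is enough to start a cascade.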
2. Binary Collision Approximation
In the BCA, particles are assumed to move along straight trajectories between successive two-body (binary) collisions. Moving atoms collide only with particles at rest. The BCA is at the heart of the Kinchin and Pease (KP) expression, which gives the number of defects produced by a collision of the PKA with a secondary atom as a function of the PKA energy: $\nu(T) = T/(2E_d)$, where $T$ is the kinetic energy of the PKA and $E_d$ is the TDE of the target atoms. Norgett, Robinson and Torrens (NRT) [2] obtained a modified form, which is widely used to quantify irradiation damage. By integration, these formulas give the total number of defects in the cascade as a function of the initial PKA energy.
Many simulation codes are based on the BCA. In these codes, the only energy dealt with is the kinetic energy of the moving particles. At each encounter, a fraction of the kinetic energy of the incoming particle is transmitted to the target particle. After such a collision, the target particle is set into motion only if its kinetic energy exceeds its TDE; otherwise it remains motionless. Some energy is subtracted from its initial kinetic energy to account for the binding energy of the particle. Conversely, the incoming particle stops if its kinetic energy falls below its TDE. The simulation is initiated by giving an initial impulse to the PKA and stops when no moving particles are left. The BCA therefore describes only the ballistic phase.
The amount of energy transmitted between colliding particles is determined from cross-section calculations using a conservative central potential. Different kinds of pair potentials may be used. A common choice is the Ziegler–Biersack–Littmark (ZBL) potential [3], which assumes a universal form for interatomic interactions. Electronic excitations are modelled through a supplementary energy loss applied at each collision.
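The KP expression above can be illustrated numerically. The sketch below evaluates the KP count, the NRT count in its usually quoted form (0.8 T_dam / 2E_d, which is not spelled out in the text and is taken here as an assumption), and a toy Monte Carlo of the hard-sphere, equal-mass picture behind KP, in which each collision shares the energy uniformly between the two atoms. The value E_d = 40 eV and the 10 keV PKA energy are illustrative, and the PKA energy is used directly as the damage energy, which neglects electronic losses.

```python
import random


def kinchin_pease(t_ev, e_d_ev):
    """Kinchin-Pease defect count for a PKA of kinetic energy t_ev (no cutoff)."""
    if t_ev < e_d_ev:
        return 0.0
    if t_ev < 2.0 * e_d_ev:
        return 1.0
    return t_ev / (2.0 * e_d_ev)


def nrt(t_dam_ev, e_d_ev):
    """NRT defect count; t_dam_ev is the damage energy available for displacements."""
    if t_dam_ev < e_d_ev:
        return 0.0
    if t_dam_ev < 2.0 * e_d_ev / 0.8:
        return 1.0
    return 0.8 * t_dam_ev / (2.0 * e_d_ev)


def toy_cascade(t_ev, e_d_ev, rng):
    """One random cascade under the Kinchin-Pease hard-sphere assumptions."""
    displaced = 0
    stack = [t_ev]  # energies of atoms still to be processed
    while stack:
        energy = stack.pop()
        if energy < e_d_ev:
            continue  # too slow to create a stable displacement
        if energy < 2.0 * e_d_ev:
            displaced += 1  # exactly one displacement, no further branching
            continue
        transferred = rng.uniform(0.0, energy)  # hard-sphere energy sharing
        stack.append(transferred)
        stack.append(energy - transferred)
    return displaced


rng = random.Random(0)
n_runs = 2000
mc_mean = sum(toy_cascade(10_000.0, 40.0, rng) for _ in range(n_runs)) / n_runs
print("KP :", kinchin_pease(10_000.0, 40.0))
print("NRT:", nrt(10_000.0, 40.0))
print("toy Monte Carlo mean:", mc_mean)
```

Averaged over many cascades, the toy model fluctuates around the KP value. This is expected, since T/(2E_d) solves the integral equation that the branching rule implies.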
BCA simulations show that, for high-energy PKAs, the cascade can be divided into largely disconnected subcascades. The first mechanism of subcascade formation is that, for projectiles with energies greater than a few tens of keV, the mean free path between successive energetic collisions is long. Since most secondary recoil atoms have energies much lower than that of the PKA, their subsequent collisions take place close to their original sites. The main cascade is therefore made of a string of localized defective zones separated by areas containing few defects. Second, the rare high-energy recoils create subcascades of their own that are disconnected from the main one. Finally, at lower energies, the crystal structure plays an important role in subcascade formation, as a moving atom can be steered by atomic rows through a crystal channel. This last mechanism cannot take place in materials where no such channels exist, for example low-symmetry crystals or glasses.
The two major BCA codes are SRIM [3, 4] and MARLOWE [5]. They differ in the level of description of the atomic structure of the material. MARLOWE is well suited to crystals, as it includes a description of the atomic structure. SRIM determines successive collision targets randomly, and the only structural information it uses is the composition and density of the material under study. BCA codes are very fast: a cascade simulation with SRIM takes about one second on an ordinary computer. It is therefore possible to run many simulations to obtain a statistically relevant picture of the damage caused by a given kind of irradiation. These codes are mainly used by experimentalists to assess quickly the expected results of an irradiation, such as the penetration depth of implanted particles. A rough estimate of the structure of the cascade track and of the number of defects created is also obtained.
3. Threshold Displacement Energy Calculations with Molecular Dynamics
Within MD, the TDE can easily be calculated for each ion type by giving an atom a series of impulses of increasing kinetic energy in various directions and following the subsequent atomic motion. Threshold displacement energy calculations are therefore nothing but simulations of low-energy cascades. In each direction, for energies lower than the TDE, the knocked-on atom returns to its original position after some atomic displacements, leaving the crystal unperturbed. In this case, the atom has not left the region of instability surrounding its vacant site, called the spontaneous recombination volume (SRV). Beyond the threshold energy, the knocked-on atom leaves the SRV and does not return to its original site, so that at least one Frenkel pair remains at the end of the simulation. Technically, the simulated time should be long enough to allow for spontaneous recombination but as short as possible to save computational time and prevent diffusive recombination. A simulated time of around 0.5 ps seems reasonable. Threshold displacement energy calculations are also of interest because they often exhibit behaviors present in higher-energy cascades. For instance, in crystals where atoms are aligned in straight rows, replacement collision sequences (RCS) may take place: a series of atoms is displaced along a crystalline row, leading to a well-separated interstitial–vacancy pair.
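The search procedure described above (impulses of increasing energy in a given direction, each followed by a short MD run) might be organized as in the sketch below. The MD run itself is not implemented: `creates_stable_defect` is a hypothetical callback standing for one roughly 0.5 ps simulation followed by a defect check. The bisection refinement is an addition that assumes defect creation is monotonic in energy near the threshold, which real simulations do not always satisfy. The threshold values in `mock_md` are invented for the demonstration.

```python
def find_tde(creates_stable_defect, direction, e_start=5.0, e_step=5.0,
             e_max=200.0, tol=0.5):
    """Bracket the threshold by an upward energy scan, then refine by bisection.

    creates_stable_defect(energy_ev, direction) -> bool is supplied by the
    caller. Returns the estimated threshold in eV, or None if no energy up
    to e_max produces a stable defect.
    """
    low, high = 0.0, e_start
    while not creates_stable_defect(high, direction):
        low, high = high, high + e_step
        if high > e_max:
            return None
    # Here no defect was found at `low`, and one was found at `high`.
    while high - low > tol:
        mid = 0.5 * (low + high)
        if creates_stable_defect(mid, direction):
            high = mid
        else:
            low = mid
    return high


def mock_md(energy_ev, direction):
    """Stand-in for an MD run; the direction-dependent thresholds are invented."""
    thresholds = {"<100>": 22.0, "<110>": 37.0}
    return energy_ev >= thresholds[direction]


for d in ("<100>", "<110>"):
    print(d, find_tde(mock_md, d))
```

Because the TDE depends on direction, the function would in practice be called over a grid of directions, and either the minimum or an angular average reported, depending on the intended use.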
A difficulty common to cascade and TDE MD simulations concerns the interatomic potential. Common potentials are designed to fit (low-energy) equilibrium or close-to-equilibrium properties, whereas cascades involve high-energy configurations and very short interatomic distances. For such small distances one has to turn to specific potentials devoted to the high energies involved, such as the ones used in BCA codes. Two kinds of potentials thus have to be connected. One common way to do this is to connect smoothly (by means of a high-order polynomial) the high-energy, short-distance potential to the pair repulsion that exists in all forms of low-energy potentials, and to switch off at small distances the higher-order terms (three-body or embedding parts) that may appear in the equilibrium potentials. Unfortunately, this connection takes place in a sensitive range of interatomic repulsion, namely between 10 and 100 eV, which is precisely the range of the TDE. The calculated values of the TDE are therefore highly dependent on this connection. In the uncommon cases where experimental values of the TDE are available, they should be used to design the connection properly. Threshold displacement energies, like all quantitative results of cascade modelling, depend strongly on the details of the potentials used.
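The connection between the two kinds of potentials can be illustrated with the ZBL universal potential and a placeholder equilibrium pair term. The sketch below does not reproduce the polynomial join described above; it swaps in a simpler technique, a quintic switching function that blends the two potentials over a window [r1, r2]. The Morse parameters, the window limits and the choice of Fe–Fe (Z = 26) are invented for illustration, and no many-body term is included.

```python
import math

COULOMB_EV_ANG = 14.399645  # e^2 / (4 pi eps0) in eV * Angstrom
BOHR_ANG = 0.529177         # Bohr radius in Angstrom


def zbl(r, z1, z2):
    """ZBL universal screened Coulomb potential in eV (r in Angstrom)."""
    a = 0.8854 * BOHR_ANG / (z1 ** 0.23 + z2 ** 0.23)  # screening length
    x = r / a
    phi = (0.1818 * math.exp(-3.2 * x) + 0.5099 * math.exp(-0.9423 * x)
           + 0.2802 * math.exp(-0.4029 * x) + 0.02817 * math.exp(-0.2016 * x))
    return z1 * z2 * COULOMB_EV_ANG / r * phi


def morse(r, depth=0.4, alpha=1.4, r0=2.8):
    """Placeholder equilibrium pair term; the parameters are invented."""
    return depth * (math.exp(-2.0 * alpha * (r - r0))
                    - 2.0 * math.exp(-alpha * (r - r0)))


def switch(r, r1, r2):
    """Quintic smoothstep: 1 below r1, 0 above r2, flat to second order at both ends."""
    if r <= r1:
        return 1.0
    if r >= r2:
        return 0.0
    t = (r - r1) / (r2 - r1)
    return 1.0 - t ** 3 * (10.0 - 15.0 * t + 6.0 * t * t)


def joined(r, z1, z2, r1=1.0, r2=2.0):
    """ZBL at short range, equilibrium pair term at long range, blended in between."""
    s = switch(r, r1, r2)
    return s * zbl(r, z1, z2) + (1.0 - s) * morse(r)


for r in (0.5, 1.0, 1.5, 2.0, 2.5):
    print(f"r = {r:.1f} A   V = {joined(r, 26, 26):10.3f} eV")
```

With these inputs the blending window covers pair energies from roughly a hundred eV down to a few eV, which is the sensitive range mentioned above. Shifting r1 and r2 changes the repulsion in exactly the region that controls the TDE.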
4. Methodology of Cascade Simulations with Molecular Dynamics
In contrast to BCA models, which treat the cascade as a succession of independent two-body encounters, MD simulations integrate the classical equations of motion for all atoms simultaneously. The price to pay is a much heavier computational requirement than for BCA codes. In return, MD gives a much more precise picture of the material, as it describes both the ballistic and the thermal phases. MD simulations deal with atoms or ions, however, and thus focus on the elastic losses. Inelastic losses due to electronic excitations are not explicitly considered. In metals, they play little role in the energy range considered in cascade simulations (tens of keV) and are conveniently accounted for by a friction-like force acting on the moving atoms. For insulators, such approaches are clearly insufficient, as electronic excitations are long lived and can lead to the formation of specific defects. No satisfactory formalism exists at present to deal with them.
After the initial impulse has been given to the PKA, one simply follows the movements of all the atoms inside the simulation box until a metastable state is reached on the time scale of MD simulations, that is, picoseconds. The long-term evolution of the material is out of reach for MD and should be studied with other tools (see other sections of this book). Because of the computational cost of MD simulations, the size of the simulation box and the number of time steps are limited, and so are the PKA energies. The box should be large enough to easily accommodate the cascade. There is no standard rule on this point, but common practice is that the projected range of the projectile should be less than one-fourth of the box length and that the number of atoms in the box should be greater than 25 times the energy of the projectile in eV (500 000 atoms for 20 keV). Such large boxes are especially needed when channeling is expected, as it may create damage far from the PKA track. A proper cascade simulation should last for at least a few picoseconds, which amounts to a few tens of thousands of time steps (see below). Fortunately, thanks to the increase in computer power, the sizes and times that can be simulated have steadily increased. It is now possible, with supercomputers, to simulate irradiation events with primary energies close to those expected in nuclear reactors or α decays, that is, tens of keV. Even where it is not possible to reproduce the real energy of the PKA (in implantation studies, for instance), the division into subcascades evidenced by BCA models for high-energy PKAs justifies the simulation of lower-energy cascades.
Because of the large kinetic energies involved, the velocities of the ions can reach quite high values at the beginning of the cascade. To ensure proper conservation of energy, it is necessary to use a time step as small as $10^{-5}$ ps to discretize the trajectories of the ions. After some $10^{-2}$ ps, the maximum atomic velocity starts to decrease and the time step of the simulation can be progressively increased. This may be done routinely by adjusting the time step so that the maximum atomic displacement between two consecutive time steps remains smaller than some distance (e.g., 0.05 Å). It is also possible to use multiple-time-step algorithms that apply different time steps in different areas of the box, depending on the local velocities of the atoms.
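The two rules of thumb above (about 25 atoms per eV of PKA energy, and a time step limited by a maximum displacement per step) translate into a few lines of arithmetic. In the sketch below the 1 fs ceiling `dt_max` and the iron mass are assumptions chosen for illustration; the 0.05 Å displacement limit and the 25 atoms/eV factor come from the text.

```python
import math

AMU_ANG2_PS2_IN_EV = 1.0364269e-4  # 1 amu * (Angstrom/ps)^2 expressed in eV


def min_atoms(pka_energy_ev, atoms_per_ev=25):
    """Rule of thumb from the text: at least 25 atoms per eV of PKA energy."""
    return int(atoms_per_ev * pka_energy_ev)


def speed_ang_per_ps(e_kin_ev, mass_amu):
    """Non-relativistic speed (Angstrom/ps) of an atom with kinetic energy e_kin_ev."""
    return math.sqrt(2.0 * e_kin_ev / (mass_amu * AMU_ANG2_PS2_IN_EV))


def adaptive_dt(v_max, d_max=0.05, dt_max=1.0e-3):
    """Largest time step (ps) keeping the fastest atom below d_max Angstrom per step.

    dt_max is an assumed ceiling (1 fs) for the late, slow part of the run.
    """
    if v_max <= 0.0:
        return dt_max
    return min(dt_max, d_max / v_max)


pka_ev = 20_000.0  # 20 keV PKA
v0 = speed_ang_per_ps(pka_ev, 55.845)  # iron PKA, example only
print("minimum number of atoms:", min_atoms(pka_ev))
print(f"initial PKA speed: {v0:.0f} A/ps")
print(f"initial time step: {adaptive_dt(v0):.2e} ps")
```

For a 20 keV iron PKA, the displacement criterion alone gives an initial time step of the order of $10^{-5}$ ps, in line with the value quoted above. In an MD loop, `adaptive_dt` would be re-evaluated every few steps from the current maximum atomic speed.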
Cascade simulations should be performed in the pseudo (see below) NVE ensemble, that is, at constant volume and energy. However tempting, constant-pressure or constant-temperature algorithms should not be used. These algorithms are built to describe properties at thermodynamic equilibrium, whereas there is no such thing as equilibrium during a cascade. Using them leads to clearly spurious and unphysical behavior. For instance, a large and sudden increase in temperature takes place in the core of the cascade, creating the thermal spike. Applying a global constant-temperature algorithm (such as Nosé–Hoover) then cools the atoms at the periphery of the cascade to almost zero temperature in an attempt to keep the average temperature of the box constant.
Nevertheless, one should take care of the temperature and pressure waves that propagate from the cascade core to the rest of the simulation box. First, one should use large enough boxes: the size of the simulation box should be carefully chosen with respect to the energy of the cascade that is modeled. Because the box is finite, whatever the boundary conditions, the heat and pressure waves created by the cascade eventually return to the cascade area. The box should be large enough for the ballistic and thermal phases to be completed before the waves return. Second, an approximate way to deal with these waves has been designed. It consists in initiating the cascade in the centre of the simulation box and damping the thermal wave by controlling the temperature of the outer layer of the box, which models the thermal bath constituted in reality by the surrounding crystal. This damping can be performed either by simple rescaling of the atomic velocities or through some Lagrangian formalism. The heat wave of the cascade is then partially absorbed at the border of the simulation box. Proper handling of the pressure wave is less common, but a generalized Langevin-type approach exists [6].
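The simplest of the damping schemes mentioned above, rescaling the velocities of the atoms in the outer layer of the box, can be sketched as follows. The data layout (plain lists, a cubic box with coordinates in [0, L]) and the function names are assumptions for illustration; interior atoms are left untouched so that the cascade core evolves at constant energy.

```python
import math

KB_EV_PER_K = 8.617333e-5          # Boltzmann constant in eV/K
AMU_ANG2_PS2_IN_EV = 1.0364269e-4  # 1 amu * (Angstrom/ps)^2 expressed in eV


def in_border(pos, box_length, thickness):
    """True if the atom lies within `thickness` of any face of the cubic box."""
    return any(x < thickness or x > box_length - thickness for x in pos)


def temperature(velocities, masses):
    """Kinetic temperature (K) of a group of atoms, 3 degrees of freedom each."""
    twice_ke = sum(m * sum(c * c for c in v) for v, m in zip(velocities, masses))
    return twice_ke * AMU_ANG2_PS2_IN_EV / (3.0 * len(velocities) * KB_EV_PER_K)


def rescale_border(positions, velocities, masses, box_length, thickness, t_target):
    """Rescale border-atom velocities in place so that the border sits at t_target."""
    idx = [i for i, p in enumerate(positions)
           if in_border(p, box_length, thickness)]
    if not idx:
        return
    t_now = temperature([velocities[i] for i in idx], [masses[i] for i in idx])
    if t_now == 0.0:
        return
    scale = math.sqrt(t_target / t_now)
    for i in idx:
        velocities[i] = [scale * c for c in velocities[i]]


# Three atoms in a 10 Angstrom box: two in the 1 Angstrom border, one inside.
positions = [[0.5, 5.0, 5.0], [5.0, 5.0, 5.0], [9.8, 5.0, 5.0]]
velocities = [[3.0, 0.0, 0.0], [7.0, 1.0, 0.0], [0.0, 4.0, 0.0]]  # Angstrom/ps
masses = [55.845] * 3                                             # amu
rescale_border(positions, velocities, masses, 10.0, 1.0, 300.0)
print(velocities)
```

Such a routine would typically be called every few time steps. Abrupt rescaling reflects part of the incoming wave, which is one reason smoother schemes are also used.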
5. Results of Cascade Simulations with Molecular Dynamics
Results of MD simulations vary from one material to another; see Nordlund et al. [7] for a comparison between different materials. A common feature is that the cascade region undergoes local melting and that this melting has a large influence on the primary state of damage. Many studies have characterized this molten zone in terms of volume, temperature, density, structural characteristics, cooling rate and duration. A difficulty lies in the somewhat approximate definition that has to be used for the cascade zone.
The most important result of an MD cascade simulation is the set of atomic positions at its end. It is worth taking the time to look at atomic configurations, as much can be learned by visual inspection; special care should therefore be given to the visualization of atomic positions. The final atomic structure shows what kind of defects the cascade has created, and one can easily see whether it has produced point defects or an amorphous area. To analyze the primary state of damage beyond visual inspection of all atomic positions, one has to design convenient analysis tools on a case-by-case basis, the idea being to extract from the complete set of atomic positions the atoms that are in a specific state of disorder.
For materials in which the crystalline structure is restored during the thermal phase, a cascade creates ion mixing and some point defects. For these materials, MD simulations have shown that the KP and NRT models overestimate the number of defects. The ratio between MD results and NRT predictions decreases to a limit of one third for energies of the order of 10 keV. This reduced defect production efficiency is attributed to fast defect recombination during the thermal phase. At the end of the cascade, one can define and plot vacancies, interstitials, antisites and simple replacements, as in Fig. 2, which shows the primary state of damage in Ni3Al after a 30 keV cascade [8]. This image exemplifies behaviors that appear in many materials.
Figure 1. Morphology of a zircon (ZrSiO4) crystal after a 5 keV cascade [9]; Si (light gray), O (gray), Zr (dark gray).
One can see that the cascade is divided into subcascades by channeling (see the lines in the figure). Replacement collision sequences are also visible as small tails that point out of the main damaged zone. Quite naturally, vacancies and interstitials are, on the whole, located in the centre and at the periphery of the cascade area, respectively. Other MD simulations in metals have shown that the primary state of damage includes clusters of self-interstitials, which exhibit fast (possibly athermal) diffusion.
In contrast to metals and alloys, for which recrystallization is almost complete, in materials subject to amorphization (silicon, zircon, zirconolite, etc.) pockets of amorphous material may remain at the end of the cascade. The possible difference between the structure of such amorphous domains and the glassy structure obtained by fast quenching is still under debate. Zircon is an example of such a direct amorphization process around the PKA track. The best way to visualize this behavior is to show all atomic positions, as in Fig. 1 [9]. The amorphous area in the centre appears clearly. Once defined, the cascade area can be analyzed in terms of radial or angular distribution functions, which can be compared with experiments such as neutron diffraction or extended x-ray absorption fine structure (EXAFS) performed on irradiated materials.
Figure 2. Visualization of the result of a 30 keV cascade in Ni3Al [8]. Replaced atoms and antisites are represented by small white spheres. Light and dark gray spheres represent vacancies and interstitials, respectively. The lines join two channeled atoms to their original positions.
Global quantitative figures can also be extracted from the final configurations, such as the number of defects created and the energy stored in the structure at the end of the cascade. These figures allow comparisons with the predictions of other models or with experiments. They can in turn be fed into long-time-scale models of the global evolution of the material under irradiation.
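For materials that recrystallize, one widely used way to count the point defects discussed in this section is a Wigner–Seitz-type occupancy analysis. The text does not name a specific method, so this is offered only as one possible choice. Each atom of the final configuration is assigned to the nearest site of the undamaged reference lattice: empty sites are counted as vacancies and multiply occupied sites as interstitials. The brute-force sketch below assumes a cubic periodic box. It is quadratic in the number of atoms, so a real analysis would use cell lists or an existing analysis package.

```python
def wigner_seitz_defects(reference_sites, atoms, box_length):
    """Return (vacant_sites, multiply_occupied_sites) as lists of site indices.

    Each atom is assigned to its nearest reference site, using the
    minimum-image convention in a cubic periodic box of side box_length.
    """
    def dist2(a, b):
        total = 0.0
        for x, y in zip(a, b):
            d = x - y
            d -= box_length * round(d / box_length)  # minimum image
            total += d * d
        return total

    occupancy = [0] * len(reference_sites)
    for atom in atoms:
        nearest = min(range(len(reference_sites)),
                      key=lambda i: dist2(atom, reference_sites[i]))
        occupancy[nearest] += 1
    vacancies = [i for i, n in enumerate(occupancy) if n == 0]
    interstitials = [i for i, n in enumerate(occupancy) if n > 1]
    return vacancies, interstitials


# Toy reference crystal: 2 x 2 x 2 simple cubic lattice, spacing 2 Angstrom.
sites = [[2.0 * i, 2.0 * j, 2.0 * k]
         for i in range(2) for j in range(2) for k in range(2)]
atoms = [list(s) for s in sites]
atoms[0] = [2.6, 2.6, 2.6]  # atom 0 has moved next to the site at (2, 2, 2)

vac, inter = wigner_seitz_defects(sites, atoms, box_length=4.0)
print("vacant sites:", vac)
print("doubly occupied sites:", inter)
```

The two lists give the number of Frenkel pairs directly. The approach is not meaningful inside amorphous pockets, where no reference lattice applies.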
6. Summary
Binary collision approximation and MD simulations have proved highly informative about the unfolding of cascades and the damage they produce. At a low computational cost, the BCA gives an overall picture of the damage. Molecular dynamics simulations give reliable qualitative information. Quantitative predictions are also possible, although they depend more heavily on the choice of the empirical potential. Limitations remain, however. On the one hand, BCA approaches suffer from their rough approximations; on the other, even with supercomputers most experimental irradiation conditions are out of reach for MD simulations. Some attempts have been made to chain the BCA and MD approaches in order to push up the limits in PKA energy. Other developments link MD cascade simulations with kinetic Monte Carlo approaches to access the long-term evolution of the material under irradiation. These developments are promising but still far from routine, as they depend strongly on the system under study. Last but not least, for insulators, electronic losses, which are known experimentally to be important, are completely neglected by present-day simulations.
References
[1] R. Averback and T. Diaz de la Rubia, “Displacement damage in irradiated metals and semiconductors,” Sol. Stat. Phys., 51, 281, 1998.
[2] M.J. Norgett, M.T. Robinson, and I.M. Torrens, “A proposed method of calculating displacement dose rates,” Nucl. Eng. Design, 33, 50–54, 1975.
[3] J.F. Ziegler, J.P. Biersack, and U. Littmark, The Stopping and Range of Ions in Solids, New York, Pergamon, 1985.
[4] J.F. Ziegler, SRIM www.srim.org, 2003.
[5] M.T. Robinson, “MARLOWE www.ssd.ornl.gov/Programs/Marlowe/guide/index.htm,” 2003.
[6] M. Moseler, J. Nordiek, and H. Haberland, “Reduction of the reflected pressure wave in the molecular-dynamics simulation of energetic particle-solid collisions,” Phys. Rev. B, 56(23), 15439–15445, 1997.
[7] K. Nordlund, et al., “Defect production in collision cascades in elemental semiconductors and fcc metals,” Phys. Rev. B, 57, 7556–7570, 1998.
[8] N.V. Doan and R. Vascon, “Displacement cascades in metals and ordered alloys. Molecular dynamics simulations,” Nucl. Instrum. Meth. B, 135(1–4), 207–213, 1998.
[9] J.P. Crocombette and D. Ghaleb, “Molecular dynamics modeling of irradiation damage in pure and uranium doped zircon,” J. of Nucl. Mater., 295, 167–178, 2001.
2.29 RADIATION EFFECTS IN FISSION AND FUSION REACTORS
G. Robert Odette¹ and Brian D. Wirth²
¹ Department of Mechanical Engineering and Department of Materials, University of California, Santa Barbara, CA, USA
² Department of Nuclear Engineering, University of California, Berkeley, CA, USA
Since the prediction of “Wigner disease” [1] and the subsequent observation of anisotropic growth of the graphite used in the Chicago Pile, the effects of radiation on materials have been an important technological concern. The broad field of radiation effects impacts many critical advanced technologies, ranging from semiconductor processing to severe materials degradation in nuclear reactor environments. Radiation effects also occur in many natural environments, ranging from deep space to the Earth’s crust. As selected examples that involve many cross-cutting basic phenomena and illustrate the broader impacts of radiation exposure on materials, this article focuses on modeling microstructural changes in iron-based ferritic alloys under high-energy neutron irradiation relevant to light water fission reactor pressure vessels. We also touch briefly on radiation effects in structural alloys for fusion reactor first wall and blanket structures; in this case the focus is on modeling the evolution of self-interstitial atom clusters and dislocation loops. Since even the narrower topic of structural materials for nuclear energy applications encompasses a vast literature dating from 1942, the references included in this article are primarily limited to these two subjects and are presented as examples rather than as a comprehensive bibliography. The interested reader is referred to the proceedings of continuing symposia series sponsored by several organizations,∗ several monographs [2–4] and key journals (e.g., Journal of Nuclear Materials, Radiation Effects and Defects in Solids).
∗ Meetings and symposia series of interest include the American Society for Testing and Materials Special Topical Meetings on Radiation Effects on Materials, the International Conference on Fusion Reactor Materials, the Symposia series on Microstructural Processes in Irradiated Materials sponsored by the Materials Research Society and The Minerals, Metals and Materials Society (TMS), and the International Symposia series on Environmental Degradation of Materials in Light Water Reactors.
S. Yip (ed.), Handbook of Materials Modeling, 999–1037. © 2005 Springer. Printed in the Netherlands.
The underlying physics controlling neutron radiation damage, and its attendant consequences for material properties, is inherently hierarchical and multiscale. Pertinent length and time scales controlling radiation effects range from neutron collision-reactions on the scale of the nucleus to the size and service lifetimes of structural components, spanning factors in excess of $10^{14}$ (length) and $10^{22}$ (time) [5–10]. Radiation effects are also inherently “multi-physics”. Numerous basic nuclear, atomic and solid-state physics processes are linked to complex nano- and microstructural evolution in multi-constituent, multi-phase engineering materials through non-equilibrium thermodynamics and accelerated kinetics, leading to structure–property and property–property relations described by micro- and macromechanics models [5, 7, 11]. The governing processes involve enormous numbers of degrees of freedom, and critical outcomes often depend on small differences between large competing effects. For example, void swelling results from a small bias in vacancy versus self-interstitial atom fluxes to different sinks [5, 12].
The fundamental objective of multi-scale, multi-physics (MSMP) radiation effects modeling is to predict quantitatively the generation, transport, fate and consequences of all defect species created, and solutes transported, by irradiation. The practical aim of modeling is to provide improved predictions of material (component) performance and lifetime by relating time-dependent property changes to the combination of governing material and irradiation variables. Physical models provide a framework for synthesizing experimental information, ranging from laboratory-based mechanism studies to real-world surveillance data. Models can thus be used to extrapolate more reliably beyond an often limited and imperfect database [7, 13–16].
1. Irradiation Effects in Ferritic–Bainitic and Ferritic–Martensitic Alloys
An example of the successful application of the multi-scale modeling concept is the improved prediction of irradiation embrittlement of reactor pressure vessel (RPV) steels [13, 14]. Western RPVs are fabricated from quenched and tempered C–Mn–Si–Mo–Ni low-alloy steels and operate at around 300 °C. These ferritic–bainitic alloys contain coarse-scale Fe(Mn)3C and smaller Mo2C carbides, with dislocation densities of about $2 \times 10^{14}$/m$^2$. As summarized in Table 1, RPVs accumulate fast neutron fluences from about 1 to $10 \times 10^{23}$ n/m$^2$ over a 40–60 year service life, corresponding to a maximum damage dose of less than 0.15 displacements per atom (dpa) [8, 13, 17]. Even this relatively low dose is sufficient to produce embrittlement, characterized by upward shifts in the transition temperature ($\Delta T$) between the more brittle cleavage and the more ductile microvoid coalescence fracture regimes.
Table 1. Summary of RPV steel and LAMS composition and irradiation conditions
| | Reactor pressure vessel (RPV) steels | Low activation martensitic steels (LAMS) |
| Composition (weight %) | Fe–C(0.05–0.2%)–Mn(0.7–1.6%)–Mo(0.4–0.6%)–Ni(0.2–1.4%)–Si(0.2–0.6%)–Cr(0.05–0.5%)–Cu(0.05–0.4%)–P(0.005–0.025%) | Fe–Cr(8–12%)–W(1–2%)–Ta(0.05–0.5%)–V(0.05–0.3%)–C(0.1%) |
| Microstructures | Ferritic–bainitic quenched and tempered forgings and stress-relieved submerged arc welds; coarse scale Fe(Mn)₃C and Mo₂C carbides | Normalized and tempered ferritic–martensitic steels; Ta and V alloyed carbides, Cr-rich α′ and Fe₂W Laves phases |
| Dislocation densities (m⁻²) | ≈ 2 × 10¹⁴ | 0.5–10 × 10¹⁴ (depending on tempering conditions) |
| Irradiation conditions: | | |
| - Temperature (°C) | ≈ 290 | 300–550 |
| - Dose rate (dpa/s) | ≈ 0.5 × 10⁻¹⁰ – 10⁻¹¹ | ≈ 0.5 × 10⁻⁶ – 10⁻⁶ |
| - Target dose (dpa) | <≈ 0.05 | 150 |
| - Gas generation (appm He, H) | minimal | 1500 appm He, 6000 appm H |
| - Neutron flux (E > 1 MeV) | 1.1 × 10¹⁵ n/(m² s) (≈ 20% of total) | 1.5 × 10¹⁸ n/(m² s) (≈ 40% of total) |
| - Neutron flux (E > 10 MeV) | - | 8.8 × 10¹⁷ n/(m² s) (≈ 20% of total) |
| - Mean PRA energy | ≈ 15 keV | ≈ 50 keV |
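The dose rates and target doses in Table 1 can be cross-checked with simple arithmetic. The sketch below assumes a constant dose rate and continuous operation (no outages), which is an idealization; the function names are illustrative and not taken from the source.

```python
# Rough consistency check of the Table 1 dose rates against service life.
# Assumes a constant dose rate and uninterrupted operation (an idealization).

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def accumulated_dose(dose_rate_dpa_per_s, years):
    """Dose (dpa) accumulated at a constant dose rate over the given time."""
    return dose_rate_dpa_per_s * years * SECONDS_PER_YEAR


def years_to_reach(target_dpa, dose_rate_dpa_per_s):
    """Time (years) needed to reach a target dose at a constant dose rate."""
    return target_dpa / dose_rate_dpa_per_s / SECONDS_PER_YEAR


# RPV steels: 1e-11 to 0.5e-10 dpa/s over a 40-60 year service life
rpv_low = accumulated_dose(1.0e-11, 40)
rpv_high = accumulated_dose(0.5e-10, 60)

# LAMS: 150 dpa target at 0.5e-6 to 1e-6 dpa/s
lams_fast = years_to_reach(150.0, 1.0e-6)
lams_slow = years_to_reach(150.0, 0.5e-6)

print(f"RPV end-of-life dose: {rpv_low:.3f} to {rpv_high:.3f} dpa")
print(f"LAMS time to 150 dpa: {lams_fast:.1f} to {lams_slow:.1f} years")
```

Under these assumptions the RPV range stays below the 0.15 dpa maximum quoted in the text, while the LAMS target dose is reached within roughly a decade of full-power exposure.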
Radiation effects in fission and fusion reactors 1001
G.R. Odette and B.D. Wirth
Embrittlement is primarily the result of irradiation hardening, reflected in increases in yield stress (Δσy), and the resulting ΔT can reach values of 300 °C or more. Embrittlement is controlled by a complex combination of variables [7, 8, 13, 18], including the neutron flux, fluence and spectrum, the irradiation temperature (irradiation variables), and the alloy’s starting microstructure and composition (material variables). Important compositional variables (all compositions given in weight %) include Cu (0.02–0.4%), Ni (0.2–2%) and Mn (0.3–1.9%), while P (0.005–0.040%) and Si (0.2–0.7%) play a secondary role. The primary hardening and embrittling features are a high concentration of nm-scale coherent Cu-rich precipitates (CRPs) [19]. The CRPs are alloyed with Mn, Ni, Si and P [5, 7, 13, 18, 20, 21]. In alloys containing high quantities of Mn and Ni, the CRPs give way to Mn–Ni rich precipitates (MNPs). The MNPs are also alloyed with Cu and Si, but MNPs can form in Cu-free steels [22]. In low Cu steels, the main hardening features are vacancy–cluster solute (Mn, Ni, Si) complexes, MNPs and alloy phosphide phases [5, 7, 17, 20]. The mechanisms of RPV steel embrittlement are summarized below.
Normalized and tempered low activation martensitic steels (LAMS) are the leading candidates for use in fusion first wall and breeding blanket reactor structures [23–26]. These alloys will experience much larger doses in service, in the range of 100–200 dpa, accompanied by high concentrations of solid and gaseous transmutation products, including insoluble He (1000–2000 atomic parts per million, appm) and reactive H (4000–8000 appm). Fusion reactor components fabricated from LAMS will operate at service temperatures ranging from about 300 to 550 °C. As summarized in Table 1, the hard fusion neutron spectrum, with a large (≈ 20%) component of 14 MeV neutrons from the D–T reactions, produces much higher levels of He and H from (n, α) and (n, p) threshold reactions.
Predictive irradiation effects models must treat a large number of performance-sustaining properties that may be degraded, including the yield strength and strain hardening constitutive laws, various types of “ductility,” fatigue crack growth rates, fracture toughness, irradiation and thermal creep rates, void swelling, creep-rupture time and strain, thermo-mechanical fatigue stress and strain limits, creep-crack growth rates, creep–fatigue interactions, environmentally assisted cracking and bulk corrosion–oxidation compatibility [23–26].
LAMS typically contain ≈ 8% Cr plus 1–2% W and ≈ 0.1% C, along with smaller quantities of carbide-forming micro-alloying elements like V and Ta and small to modest concentrations of Mn and Si [11, 26]. Depending on the irradiation temperature and dose, major phases include a variety of alloyed carbides, Cr-rich α′ and Fe₂W Laves phases. LAMS microstructures are composed of moderately high dislocation densities (≈ 2 × 10¹⁴ /m²) and dislocation sub-structures inside martensitic laths, forming small groups of lath packets within the prior austenitic grains. These coarse scale (>0.05 µm) structures and phases formed during processing are generally stable at
low-to-intermediate irradiation temperatures and doses. In the regime below about 400 °C, the dominant features induced by irradiation are dislocation loops, gas bubbles and voids, and in some cases, fine scale precipitates like α′ [27, 28]. These fine-scale features cause hardening, loss of uniform tensile strain capacity, flow localization and embrittlement [23, 29]. The effects of hardening may be amplified by high levels of H and He that, at sufficient concentrations, may lead to grain boundary decohesion and a brittle intergranular fracture mode up to very high temperatures, potentially reaching 500 °C or more [11, 30]. Irradiation creep occurs over the entire range of service temperatures and is the dominant source of dimensional instability at low-to-intermediate temperatures [28]. The number densities of the features decrease with increasing irradiation temperature [27]. In the range of 400–500 °C, high He levels may result in enhanced swelling associated with the biased vacancy flux driven transformation of stably growing bubbles to unstably growing voids [31]. Above about 450 °C (or perhaps a lower limit with increasing dose) the coarser microstructures and phases become increasingly unstable and tend to recover and coarsen, while precipitation of grain boundary Laves phases occurs [32]. These evolutions can lead to both softening and non-hardening embrittlement [11, 33]. Irradiation and applied stresses may accelerate, and lower the temperature range of, the time–temperature C-curves describing these transformations by radiation enhanced diffusion or other mechanisms. However, LAMS do not generally show large irradiation “driven” effects on microstructural stability, or phenomena such as severe solute segregation.
The main concern at higher temperatures (above 500 °C) is the accumulation of He on grain boundaries, which may be accompanied by severe reductions in creep rupture times and strains due to stress driven, creep controlled growth of creep cavities that nucleate on bubbles [34]. The high sink density in LAMS is believed to offer some degree of grain boundary protection [35], but this has not been verified under fusion relevant conditions.
A major challenge to predicting the performance of materials in fusion environments is the absence of a high-energy, high-dose neutron source [25, 36]. However, fission reactors can be used to study the effects of high levels of helium by combinations of spectral tailoring and doping with isotopes of elements with high (n, α) cross sections (B and Ni) and, more recently, in situ α-implantation of LAMS from thin adjoining layers rich in these elements [37]. Information from these experiments will be used to develop, calibrate and validate multi-scale models of the effects of damage accumulation, including the transport, fate and consequences of high levels of helium. Since the wide range of irradiation effects is controlled by a large combination of material and irradiation variables, purely empirical characterization of material properties under irradiation is impossible, and predictions will always involve significant interpolation and extrapolation. Therefore,
development of experimentally calibrated and validated MSMP models that are based on a physical understanding of the underlying processes and how they interact is a practical necessity.
2. MSMP Models of Radiation Effects
Figure 1 illustrates the hierarchy of processes that must be integrated into MSMP models of property changes in fission and fusion environments. Since the focus here is on modeling microstructure evolution under irradiation, we begin by simply noting that hierarchical modeling of irradiation effects on materials performance [7, 11, 13] involves linking sub-models relating: (a) changes in the microstructure to local structure-sensitive deformation and fracture properties, like changes in yield stress (Δσy); and (b) combinations of the fundamental local constitutive properties to more complex continuum engineering properties, like shifts in the temperature indexing a specified fracture
Figure 1. Illustration of the length and time scales (and inherent feedback) involved in the multiscale processes responsible for microstructural changes in irradiated materials. The processes are described in more detail in the text.
toughness (ΔT). The relationship between microstructure and local constitutive behavior has traditionally been addressed with phenomenological models based on dislocation theory [13, 38, 39]. Corresponding models pertinent to local fracture properties have been based on micro-mechanics theories [11, 40, 41]. More recently, however, direct simulations based on MD and dislocation dynamics (DD) have been used to characterize many details of the structure–property relation, such as dislocation–obstacle interaction mechanics [42–45], the motion of single and multiple dislocations through arrays of obstacles of varying strength [13, 46, 47] and the evolution of dislocation structures [48–50]. However, a detailed review of these topics is beyond the scope of this article.
2.1. Primary Recoil Atoms: Neutron Scattering and Reactions
Radiation damage begins with the creation of energetic primary recoil atoms (PRA) through high-energy neutron–nuclear interactions and concurrent production of He, H and solid transmutants. For a given material, PRA production is characterized by a differential cross-section K(E, T), determined by the kinematics of the nuclear interactions, which describes the probability that a neutron of energy E produces a recoil of energy T. The nuclear cross-sections and kinematics models needed to compute PRA spectra and gas production reactions are incorporated in codes such as SPECTER [51]. As illustrated in Fig. 2, the PRAs in a fusion first wall spectrum have a high-energy component, peaking at ≈ 500 keV, with a mean energy of 50 keV. This compares to a mean T ≈ 15 keV for the quarter-thickness location in the RPV of a pressurized water reactor [52].
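As a point of orientation for the recoil energies quoted above, the largest energy that elastic scattering can transfer to a target nucleus follows from two-body kinematics. The sketch below is not the K(E, T) model used in SPECTER; it assumes purely elastic, non-relativistic scattering and approximates nuclear masses by mass numbers, so inelastic and reaction channels are ignored.

```python
# Kinematic upper bound on the recoil energy T for elastic neutron scattering.
# Assumptions: non-relativistic two-body collision, masses taken as mass numbers.


def max_recoil_energy(neutron_energy, target_mass_number, projectile_mass_number=1.0):
    """Maximum recoil energy (same units as neutron_energy) in a head-on elastic collision."""
    m, big_m = projectile_mass_number, target_mass_number
    return 4.0 * m * big_m / (m + big_m) ** 2 * neutron_energy


t_max_fusion = max_recoil_energy(14.0, 56.0)   # 14 MeV D-T neutron on Fe-56, in MeV
t_max_fission = max_recoil_energy(1.0, 56.0)   # 1 MeV neutron on Fe-56, in MeV

print(f"T_max for 14 MeV neutrons on Fe: {t_max_fusion:.3f} MeV")
print(f"T_max for  1 MeV neutrons on Fe: {1000.0 * t_max_fission:.1f} keV")
```

Only about 7% of the neutron energy can be passed to an iron recoil in a single elastic collision, which is why the elastic part of the PRA spectrum for 14 MeV neutrons extends to roughly 1 MeV.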
2.2. Primary Defect Production in Displacement Cascades
The PRAs quickly lose kinetic energy through a branching chain of atomic displacement collisions, as well as non-displacing interactions with electrons, generating a high temperature displacement cascade containing large concentrations of vacancy and self-interstitial atom (SIA) defects [53, 54]. The formation, initial cooling and relaxation of displacement cascades, including cluster formation and spontaneous recombination of vacancies and SIA, occur over very short times of ≈ 100 ps within regions less than approximately 50 nm in diameter [55–59].
The standard radiation damage dose unit is the number of displacements per atom (dpa). The computed dpa dose does not account for cascade recombination or defect reconfigurations in cascades. The dpa is essentially the total
Figure 2. Normalized PRA energy spectra for four prototypic irradiation environments, including a fusion reactor (FFTF mid-core Be first wall) and the reactor pressure vessel of a fission reactor (PWR 1/4-T RPV) [52]. [Log–log plot of normalized PRA spectrum versus PRA energy (MeV); curve labels: ITER Be first wall, HFIR PTP, PWR 1/4-T RPV.]
kinetic energy deposited in atomic recoils that is not lost to electrons. While the dpa has been empirically successful as a dose unit [60], more physical measures of damage production are needed in MSMP models. This requires modeling the structure and dynamics of cascades. Conceptually, replacement sequences transport the SIA to the periphery of the cascade, leaving a vacancy-rich core. The SIA are very mobile and a large fraction of them rapidly form SIA clusters. Over short time scales the vacancies are relatively immobile, but also form some small clusters as well as precursors to larger nanovoids that continue to evolve at longer times. The structure of cascades and the number of primary defects produced by a PRA of energy T, νd(T), have been extensively studied by molecular dynamics (MD) simulations [55–59] and binary collision approximations [61], and are discussed in a companion article by J.-P. Crocombette. Libraries of MD cascades in iron and other elements show that the number of primary defects is statistically distributed around a mean νd(T). The primary defects that are typically considered include the total displacements, the net residual defects escaping initial recombination and the number of small vacancy and
SIA clusters grouped in different size bins. The residual defect fraction derived from MD simulations is shown as a function of T in Fig. 3 [58]. The MD predictions are consistent with low temperature experiments and show that for T > 1 keV, only roughly 1/3 of the total computed displacements survive initial recombination [57, 58]. Thus the spectral averaged defect production cross section σd and production rate Rd are given by
$$\sigma_d = \frac{1}{\phi_t}\int_E \int_T \phi(E)\,K(E,T)\,\nu_d(T)\,dT\,dE \tag{1}$$
$$R_d = \phi_t\,\sigma_d \tag{2.a}$$
$$\phi_t = \int_E \phi(E)\,dE \tag{2.b}$$
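Equations (1), (2.a) and (2.b) amount to a double quadrature over neutron and recoil energy. The sketch below evaluates them with the trapezoid rule for a deliberately simple toy case; the flat flux, the uniform recoil kernel, the 3 × 10⁻²⁸ m² scattering cross-section, the 40 eV threshold energy and the linear (NRT-like) defect count scaled by a 1/3 survival fraction are all illustrative assumptions, not data from this article.

```python
# Toy evaluation of Eqs. (1), (2.a) and (2.b) with the trapezoid rule.
# Every number below is an illustrative assumption, not data from the text.

LAMBDA = 4.0 * 56.0 / 57.0**2   # max fraction of E passed to an Fe recoil (elastic)
SIGMA_S = 3.0e-28               # assumed scattering cross-section, m^2
E_D = 40.0                      # assumed displacement threshold energy, eV
SURVIVAL = 1.0 / 3.0            # assumed fraction surviving in-cascade recombination
PHI_0 = 1.0e12                  # assumed flat flux per unit energy, n/(m^2 s eV)


def linspace(a, b, n):
    """n evenly spaced points from a to b inclusive."""
    return [a + (b - a) * (i / (n - 1)) for i in range(n)]


def trapezoid(ys, xs):
    """Trapezoid-rule integral of samples ys over abscissae xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def phi(E):
    """Neutron flux spectrum phi(E); flat in this toy case."""
    return PHI_0


def kernel(E, T):
    """K(E, T) for recoils spread uniformly over 0 <= T <= LAMBDA * E.

    The integration limits below enforce that range."""
    return SIGMA_S / (LAMBDA * E)


def nu_d(T):
    """Surviving defects per PRA of energy T (eV); linear, thresholds ignored."""
    return SURVIVAL * 0.8 * T / (2.0 * E_D)


def spectral_average(energies, n_recoil=101):
    """Return (sigma_d, phi_t) for the given neutron energy grid."""
    inner = []
    for E in energies:
        ts = linspace(0.0, LAMBDA * E, n_recoil)
        inner.append(phi(E) * trapezoid([kernel(E, T) * nu_d(T) for T in ts], ts))
    phi_t = trapezoid([phi(E) for E in energies], energies)   # Eq. (2.b)
    sigma_d = trapezoid(inner, energies) / phi_t              # Eq. (1)
    return sigma_d, phi_t


energies = linspace(1.0e5, 2.0e6, 191)   # neutron energies, eV
sigma_d, phi_t = spectral_average(energies)
rate_d = phi_t * sigma_d                  # Eq. (2.a), defects per atom per second
print(f"sigma_d = {sigma_d:.3e} m^2, phi_t = {phi_t:.3e} n/(m^2 s), R_d = {rate_d:.3e} /s")
```

In practice φ(E), K(E, T) and νd(T) would come from reactor dosimetry, nuclear data processing and MD cascade libraries, respectively; the toy functions are chosen only because the integral then has a closed form against which the quadrature can be checked.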
While some residual questions remain, such as the effects of alloying, pre-existing microstructures and electron–phonon coupling on cascade evolution, the primary defect production cross sections needed in MSMP models are relatively well established. However, the important physics of cascade aging over much longer periods of time has only recently been addressed. This stage of
Figure 3. Fraction of point defects surviving in-cascade recombination as a function of PRA energy in MD simulations [58]. [Plot of point defect survival fraction (per NRT) versus MD cascade energy (keV); average values and standard errors at 100 K, 600 K and 900 K.]
modeling requires understanding of defect transport and interaction dynamics. Since the pertinent time scales are very different, the SIA and vacancy evolutions can often be treated sequentially.
2.3. SIA Cluster Properties
Extensive MD simulations have been used to study the properties of SIA and SIA clusters using Finnis–Sinclair and EAM-type interatomic potentials† [62–66]. The SIA ground state in Fe is predicted to be a <110> split-dumbbell, in agreement with experiment [62, 67]. As illustrated in Fig. 4, the dumbbells rotate into the <111> split dumbbell–crowdion configuration with a low activation energy of about 0.18 eV. The crowdions undergo essentially athermal diffusion with activation energies < 0.05 eV, until rotating back into the <110> split-dumbbell orientation with a small activation barrier. Thus the overall SIA diffusion obtained from MD simulations is 3D at higher temperatures, with an effective activation energy of about 0.13 eV [65]. However, the simulations are sensitive to the approach used to treat interatomic interaction energies and forces. As summarized in Table 2, recent ab initio calculations confirm that the <110> dumbbell is the lowest SIA energy configuration, but the corresponding energy of the <111> SIA is higher by 0.7 eV [68, 69]. This is a rather dramatic difference compared to the EAM-type simulations. The ab initio results indicate an activation energy of 0.34 eV for a rotation and translation jump of the <110> split-dumbbell, which involves a similar but not identical jump process to that described by the EAM potentials. The difference may arise because standard EAM-type potentials do not treat directional, multi-electron band and magnetic effects. However, the sensitivity of the ab initio results to the selection of pseudopotential, basis set and k-point sampling remains to be fully understood, as does the effect of image interactions from small periodic supercells. Thus, the migration mechanism and associated activation energy of the self-interstitial atom in Fe remain a subject of ongoing research and scientific debate. Further, the ab initio simulations predict that di-SIA with <110> orientations are strongly bound and have a lower energy than <111> di-SIA [69]. This is at variance with the MD EAM-type potential results that predict ground state <111> configurations for all SIA clusters with sizes n ≥ 2 [62–64]. The EAM simulations reveal that larger SIA clusters form prismatic a/2<111> dislocation loops that undergo rapid 1D diffusion on their glide prism, at least
† The Finnis–Sinclair [70] and embedded atom method (EAM) [71]–type potentials have very similar functional forms. In this article, EAM-type potentials will be used as a general reference to the results obtained in bcc Fe alloys by semi-empirical Finnis-Sinclair and EAM potentials.
Figure 4. Illustration of the migration process of a single self-interstitial atom predicted by MD simulations. The lowest energy (0 K) configuration is a <110> split-dumbbell. Migration occurs as a result of rotation into <111> split-dumbbell configurations with 1D translation in the <111> direction through the <111> crowdion saddle point. [Plot of formation energy (eV) versus migration path coordinate; configurations labeled <110>, <***>, <111> and <111> crowdion; energy levels marked at 4.87, 4.99, 5.01 and 5.05 eV.]
Table 2. Summary of SIA formation energies (in eV) obtained from semi-empirical Finnis–Sinclair and EAM potentials, and recent ab initio results [62, 63, 65, 69]
| | Finnis–Sinclair (a) | EAM (b) | Finnis–Sinclair (c) | Ab initio (d) |
| E_f, <110> | 4.76 | 4.33 | 4.87 | 3.64 |
| E_f, <111> | 4.87 | | 4.99 | 4.34 |
| E_f, <111> crowdion | 4.91 | | 5.01 | 4.34 |
(a) [60]; (b) [61]; (c) [63]; (d) [67]
in pure iron [64, 72, 73]. While the 1D diffusing SIA loops show strong correlations between individual sequences of jumps, the 1D diffusion process can be described by a diffusion coefficient with an activation energy < 0.1 eV and a weakly size (n) dependent pre-exponential factor (≈ 1/n^(2/3)) on the order of 0.5–1 × 10⁻⁶ m²/s [66, 73, 74]. Generally equivalent behavior is predicted in other crystal structures as well, and perfect prismatic vacancy loops are even found to be mobile in MD-EAM simulations [75]. The high 1D mobility of SIA cluster-loops has a profound effect on the kinetics and nature of long-term evolution of the overall microstructure under irradiation. For example, this mechanism helps explain the apparent absence of observable dislocation loops in RPV steels irradiated to intermediate doses of 0.05 dpa. At higher doses, the sink bias between vacancies undergoing 3D diffusion
and 1D migrating SIA cluster-loops may enhance phenomena like void swelling [76]. It is clear that at some size the a/2<111> SIA cluster configuration will have the lowest energy. However, what this size is, and the size-dependent mobility of the SIA clusters, are important unresolved issues. Many other details remain to be resolved as well. For example, recent MD studies have indicated SIA cluster trapping by interstitial C (>0.6 eV) and He (>1.0 eV) [77, 78]. In contrast, the EAM MD simulations indicate that oversized substitutional Cu has little effect on SIA and SIA cluster-loop mobility [65, 66]. Other important issues are how SIA interact with other defects, including each other (discussed below), precipitate interfaces, as well as dislocation and grain boundary sinks. However, at high temperatures of interest (>500 K) at least some of these details can probably be safely ignored in microstructural evolution models. For example, as long as their mobility is much larger than that of vacancies, the precise value of the diffusion coefficient of SIA and SIA clusters is not important to overall defect balances. Hence, the focus of modeling should be on determining critical mechanisms like the dimensionality of SIA and SIA cluster diffusion and its consequences, trapping, the ability of mixed-dumbbell SIA to transport solutes (hence, to drive chemical segregation to sinks), the effective sink efficiencies and strengths for SIA clusters, and the reactions and effective reaction rates between all mobile defect species.
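The activation energies and pre-exponential scaling quoted in this section can be turned into order-of-magnitude numbers with an Arrhenius expression. The sketch below is illustrative only: the 0.05 eV cluster activation energy and the 10⁻⁶ m²/s prefactor are assumed values chosen within the ranges stated in the text, and the comparison of the 0.18 eV (EAM) and 0.34 eV (ab initio) barriers assumes equal attempt frequencies.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant, eV/K


def arrhenius(prefactor, activation_energy_ev, temperature_k):
    """prefactor * exp(-E_a / (k_B T))."""
    return prefactor * math.exp(-activation_energy_ev / (K_B * temperature_k))


def sia_cluster_diffusivity(n, temperature_k, d0=1.0e-6, e_a=0.05):
    """1D diffusivity (m^2/s) of an n-SIA cluster.

    d0 (m^2/s) and e_a (eV) are assumed values inside the ranges quoted in
    the text; the prefactor scales as n**(-2/3)."""
    return arrhenius(d0 * n ** (-2.0 / 3.0), e_a, temperature_k)


T_RPV = 563.0  # K, about 290 C

# Size dependence of the 1D cluster diffusivity
for n in (1, 8, 27, 64):
    print(f"n = {n:3d}: D = {sia_cluster_diffusivity(n, T_RPV):.2e} m^2/s")

# Relative jump frequency implied by the two single-SIA barriers
ratio = arrhenius(1.0, 0.34, T_RPV) / arrhenius(1.0, 0.18, T_RPV)
print(f"rate(0.34 eV) / rate(0.18 eV) at {T_RPV:.0f} K = {ratio:.3f}")
```

Even the higher ab initio barrier leaves single SIA far more mobile than vacancies at these temperatures, which is consistent with the point made above that the precise SIA diffusivity matters less than the dimensionality of the motion.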
2.4. Cascade Aging and Delayed Defect Production
Returning to the issue of cascade aging, based on any reasonable assumed diffusion parameters, SIA and SIA clusters quickly (< µs) leave the cascade region, unless they are strongly trapped or recombine with cascade vacancies. Recombination during the post-cooling stage has been studied by object kinetic Monte Carlo (KMC) methods that transport the SIA and SIA clusters within and away from the cascade, while the vacancies remain immobile [63, 79–81]. Unlike the initial cascade cooling stage, additional recombination during this phase increases with increasing PRA energy T in the range >10 keV [63, 80], reaching values up to about 45% of the initial cascade defects for the highest cascade energies (50 and 100 keV). Table 3 shows the results of such simulations [82]. This energy dependence is primarily due to SIA and SIA clusters that escape one sub-cascade and then recombine with vacancies in the spatially correlated (nearby) sub-cascades. This mechanism has not yet been accounted for in energy-dependent defect production cross-sections, but may have important implications in developing physically based damage production models for fusion versus fission neutron spectra. The cascade cores continue to evolve (age) over much longer times by spatially and time correlated short-range vacancy and coupled solute
Table 3. Summary of additional recombination during initial stages of cascade aging (t < 10⁻⁶ s) at 290 °C. The table provides the average number of Frenkel pairs formed in the cascade [58] and the average number of surviving vacancies. The vacancy–self-interstitial recombination radius was the lattice parameter, a0
| PRA energy | Average number of Frenkel pairs produced in cascade (MD simulation) | Average number of surviving vacancies (KMC simulation) |
| 500 eV | 4.2 | 3.0 (71%) |
| 1 keV | 6.4 | 4.7 (73%) |
| 2 keV | 9.4 | 6.1 (65%) |
| 5 keV | 22.0 | 13.2 (60%) |
| 10 keV | 33.9 | 20.2 (60%) |
| 20 keV | 59.3 | 38.2 (64%) |
| 40 keV | 131 | 77.5 (59%) |
| 50 keV | 168.3 | 90.9 (54%) |
| 100 keV | 332.3 | 180.1 (54%) |
diffusion. This period (which can be described as delayed defect production) has been extensively studied using kinetic lattice Monte Carlo (KLMC) and object KMC techniques [80, 81, 83–85]. Both techniques track the real time associated with the cascade aging processes. The KLMC simulations show that cascade aging produces vacancy clustering, cluster migration, cluster coalescence and ultimately vacancy cluster dissolution. Interactions between vacancies and solutes (such as Cu) enhance the formation of vacancy–cluster complexes that are also mobile and often grow by cluster coalescence [83, 84]. Ultimately, most of the primary vacancies leave the cascade region, but some may form the nuclei for larger clusters that continue to grow by long-range vacancy and solute diffusion. The time scale for cascade aging depends on the irradiation temperature; it overlaps with long-range diffusion processes below around 300 °C.
As described in more detail in the companion article on Monte Carlo by G. Gilmer, the KLMC simulations require the interaction and migration activation energies for vacancies and solutes. The Boltzmann-weighted MC exchange probabilities depend on the local vacancy environment. The simplest possible approach uses pair bond models to compute lattice site energies. Most simulations have assumed that the vacancy–solute activation barrier is offset from (increased over) that for Fe by half the difference in the lattice site energies. EAM potentials and MD have been used to derive ground state and activation energies in the Fe–Cu system [85–87]. Ultimately, relaxed ab initio simulations could be used to derive the many-bodied lattice site and activation energies for vacancy–solute (e.g., Cu, Mn, Ni, Si, Cr, He, . . . )–solvent (Fe) configurations, somewhat akin to the cluster variational approach used to model alloy phases [88].
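The barrier rule described above, combined with a residence-time algorithm, is enough to sketch one KLMC step for a single vacancy. The code below is a minimal illustration rather than the scheme used in the cited simulations: the 0.7 eV reference barrier, the 10¹³ s⁻¹ attempt frequency and the list of site-energy changes are hypothetical, and the site-energy differences are supplied directly instead of being computed from a pair bond model.

```python
import math
import random

K_B = 8.617333e-5  # Boltzmann constant, eV/K


def exchange_rate(delta_e, temperature_k, e_ref=0.7, attempt_freq=1.0e13):
    """Rate (1/s) of a vacancy-atom exchange that changes the energy by delta_e (eV).

    Barrier = reference barrier plus half the site-energy difference.
    e_ref and attempt_freq are hypothetical values."""
    barrier = e_ref + 0.5 * delta_e
    return attempt_freq * math.exp(-barrier / (K_B * temperature_k))


def kmc_step(delta_es, temperature_k, rng):
    """Pick one exchange with probability proportional to its rate and
    return (index of chosen exchange, real-time increment in seconds)."""
    rates = [exchange_rate(de, temperature_k) for de in delta_es]
    total = sum(rates)
    pick = rng.random() * total
    chosen = len(rates) - 1          # fallback guards against round-off
    cumulative = 0.0
    for i, rate in enumerate(rates):
        cumulative += rate
        if pick < cumulative:
            chosen = i
            break
    dt = -math.log(1.0 - rng.random()) / total   # residence-time clock
    return chosen, dt


rng = random.Random(0)
# Hypothetical energy changes (eV) for exchanges with the 8 bcc nearest neighbors
delta_es = [0.0] * 6 + [-0.1, 0.1]
chosen, dt = kmc_step(delta_es, 563.0, rng)
print(f"chosen neighbor: {chosen}, time increment: {dt:.3e} s")
```

Because forward and reverse exchanges share the same reference barrier, the ratio of their rates is exp(−ΔE/kT), so this rule satisfies detailed balance by construction.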
Several workers have used nearest neighbor and EAM potentials to examine the effect of the local many solute atom environment on the activation energy for vacancy exchanges and used this information in KLMC to simulate precipitation of coherent clusters in systems with a
single vacancy, showing that adding such detail results in large changes in the kinetic decomposition paths [89–92].
An example of a KLMC simulation of cascade aging is shown in Fig. 5 for a 50 keV cascade in an Fe–0.3% Cu alloy at 300 °C. This figure shows the vacancy–Cu cluster evolution starting at 1 ns and ending at more than 10⁶ s. In order to simulate the enormous range of times in the KLMC simulations, special rescaling-annealing algorithms were developed [83]. The red (dark) circles show the positions of vacancies; green (light) circles show the positions of those Cu atoms that are nearest neighbors of (clustered with) one or more Cu atoms or vacancies. By 20 ms, 14 vacancy–Cu complexes have formed containing 80% of the initial vacancies, while 20% of the residual vacancies have left the cascade region. The cluster-complexes are thermodynamically unstable and dissolve by vacancy emission at rates that depend on the irradiation temperature as well as their size and composition. However, small cluster-complexes are also very mobile and between 20 ms and >10⁵ s, diffusion–coalescence processes eventually lead to the formation of just one or two larger cluster complexes. Small migrating complexes also getter additional Cu, which increases the cluster binding energy, and thereby decreases the vacancy emission
Figure 5. Kinetic Monte Carlo simulation results of the vacancy–Cu evolution at (a) 1 ns, (b) 9 ms, (c) 20 ms, (d) 5 s, (e) 135 s, and (f) 1.35 × 10⁶ s following the production of a 50 keV displacement cascade in an Fe–0.3% Cu alloy. Red (dark) circles show the positions of vacancies, green (light) circles the clustered Cu atoms. [Scale bar: 5 nm.]
rates from clusters. The largest nanovoids contain up to several tens of vacancies, with some Cu segregated to their surfaces. Cluster diffusion–coalescence processes compete with dissolution by vacancy emission, but the rates of both processes decrease rapidly with increasing cluster size and Cu content. Eventually, the single or few large clusters fully dissolve, and in this case the last vacancy leaves the cascade region at 1.35 × 10⁶ s. Notably, some high energy cascade simulations have shown cascade lifetimes approaching 10⁹ s. Small Cu clusters are left in the wake of the emission of cluster complex vacancies during the various stages of cascade aging.
The preceding discussion dealt with the birth-to-death cycle for isolated cascades. However, as in the example given above, when the time scale of cascade aging becomes comparable to long-range diffusion processes and new overlapping cascade production, these processes must be accounted for. For example, the more stable vacancy–solute cluster complexes and residual solute clusters will continue to grow by long-range diffusion of vacancies and solutes, respectively. Assuming a fusion reactor flux of ≈ 10¹⁹ n/(m² s) and a cascade production cross-section of 2 × 10⁻²⁸ m²/atom, cascades will overlap within a 5 nm cascade core dimension approximately once every ≈ 10⁴ s. KLMC simulations show that overlap of the vacancy-rich cores of cascades results in more numerous and smaller vacancy–Cu clusters and less escape of isolated vacancies. Figure 6 shows a comparison of vacancy–Cu clusters formed in an Fe–0.3% Cu alloy at 290 °C and a dose of approximately 0.4 mdpa, for
Figure 6. Comparison of vacancy–Cu clusters formed at about 0.4 mdpa, with new defect (cascade) introduction at (a) 10⁻¹² dpa/s and (b) 10⁻⁹ dpa/s. [Legend: vacancy, ‘clustered’ Cu; scale bar: 5 nm.]
additional damage (cascade) introduction at rates of about 10⁻¹² versus 10⁻⁹ dpa/s. At the lower dose rate, the time between the arrival of new overlapping cascades is long enough that the remnants of previous cascades are nearly completely dissolved. In contrast, at the higher dose rate, the vacancy–Cu complex remnants act as sinks for newly created cascade vacancies, thereby reducing the fraction of vacancies that escape the cascade region. The nonlinear interactions result in smaller but more numerous vacancy–Cu complexes. These simulations overestimate the vacancy survival and clustering under cascade overlap conditions, since they do not account for recombination due to the SIA and SIA clusters in the new overlapping cascades. Thus, object KMC simulations of the cascade overlap recombination phase will be used to refine these simulations in the near future.
Cu is a surrogate for other solutes such as Mn, Ni and Si in RPV alloys that also bind to vacancies and form cluster complexes. Simulations show that a higher total active solute concentration (1 to > 2%) leads to smaller, but more numerous, vacancy cluster–solute complexes with somewhat shorter lifetimes compared to the larger cluster complexes formed in more dilute alloys. These results are consistent with positron annihilation lifetime studies of complex steels and Fe–Mn–Cu model alloys versus simple Fe–Cu binaries [93]. Over longer times the residual solute clusters continue to evolve and are the likely source of loosely aggregated Mn, Ni and Si matrix features. The so-called matrix features are primarily responsible for hardening in low and no Cu RPV steels. Further, these features are likely nucleation sites for well-formed Mn–Ni–Si rich phases that grow due to long-range diffusion of these elements when present in sufficiently high concentrations. These so-called late blooming phases are discussed in Section 2.5.
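The cascade overlap interval quoted earlier for a fusion flux, and the exposure times behind the two dose rates compared in Fig. 6, follow from short calculations. The sketch below assumes bcc Fe with a lattice parameter of 0.287 nm and interprets the "5 nm cascade core dimension" as the radius of a sphere; both are assumptions, and taking 5 nm as a diameter instead would lengthen the interval by a factor of eight.

```python
import math

A0 = 0.287e-9                  # assumed bcc Fe lattice parameter, m
ATOM_DENSITY = 2.0 / A0**3     # two atoms per bcc unit cell, atoms/m^3


def overlap_interval(flux, cascade_cross_section, core_radius):
    """Mean time (s) between cascades initiated inside a sphere of given radius.

    flux in n/(m^2 s), cross-section in m^2 per atom, radius in m."""
    volume = 4.0 / 3.0 * math.pi * core_radius**3
    atoms_in_core = ATOM_DENSITY * volume
    cascade_rate = flux * cascade_cross_section * atoms_in_core   # events per second
    return 1.0 / cascade_rate


interval = overlap_interval(1.0e19, 2.0e-28, 5.0e-9)
print(f"mean time between overlapping cascades: {interval:.2e} s")

# Exposure time needed to reach 0.4 mdpa at the two dose rates of Fig. 6
DOSE = 0.4e-3  # dpa
for dose_rate in (1.0e-12, 1.0e-9):
    print(f"{dose_rate:.0e} dpa/s -> {DOSE / dose_rate:.1e} s to reach 0.4 mdpa")
```

With these assumptions the interval comes out near 10⁴ s, in line with the estimate in the text, and the two dose rates differ by three orders of magnitude in the time available for cascade remnants to dissolve before the same dose is reached.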
The time-scale overlap of cascade aging, where spatially and time correlated processes are important, with both long range diffusion and multiple local cascade events presents a significant modeling challenge. Further, a good method for coupling these atomistic results to mean field cluster dynamics diffusion reaction simulations is not yet in hand. One possible approach is to use delayed defect production cross sections based on an analysis of extensive libraries of aged cascades, including the effects of cascade overlap. However, even with a good database, this approach may be cumbersome to implement and lead to ambiguities that are difficult to resolve. MC methods provide a more direct approach to simulate larger volumes and longer times. Certainly, the rapid increases in computing power and parallel software, coupled with methods such as domain decomposition, will make MC the method of choice for such simulations in the future. These include both object and event based MC, described in the companion article by George Gilmer and in the literature [94]. The effects of misfit strain and long-range strain field interactions present a particular challenge. We recently have begun applying a fast multipole based MC technique to efficiently treat such interactions.
3. Long-Term Microstructural Evolution
As described in the previous sections, primary cascade production processes are very rapid. But, depending on the irradiation temperature, subsequent cascade aging processes may occur over long periods of time. In general, however, long-term microstructural evolution takes place primarily by coupled long-range diffusion of defects and solutes. While a detailed discussion is beyond the scope and space available in this article, the most important processes can be briefly summarized as including:
• Annihilation of mobile defect species at sinks, including dislocations and grain boundaries. The sink strengths generally are different for SIA/SIA clusters and vacancies. Such sink bias can arise from strain field diffusion-drift interactions, differences in local defect annihilation processes and one- versus three-dimensional SIA cluster diffusion.
• Clustering of insoluble He (produced by (n, α) reactions) to form gas bubbles that can act as nucleation sites for both voids and grain boundary creep cavities. Bubbles are stable in the sense that they grow only with the addition of gas atoms. However, a sink bias driven excess flux of vacancies relative to SIAs transforms bubbles that have grown beyond a critical size, r*, into unstably growing voids or creep cavities.
• Void swelling and network dislocation climb, annihilation and production from loop unfaulting, leading to evolved dislocation substructures, again due to bias driven imbalances between SIA and vacancy fluxes.
• Driven non-equilibrium chemical radiation induced segregation (RIS) or desegregation due to coupling of solutes to persistent defect fluxes to fixed sinks.
• Long-range diffusional aggregation of solutes forming a wide range of equilibrium and non-equilibrium precipitate phases due to radiation-enhanced diffusion (RED) in lower temperature regimes, which are normally kinetically inaccessible under thermal aging conditions.
The long-term microstructural evolution and consequences are governed by the mechanisms and microstructures controlling the transport and fate of defects coupled to He, solutes and impurities. These processes, in turn, depend on a large number of atomic scale processes, such as: He diffusion, trapping and emission from features in the matrix, on dislocations and on grain boundaries; He interactions with other mobile defects; and the properties of small He-vacancy clusters. These atomic scale mechanisms and pertinent parameters can be evaluated by various MC techniques, EAM-MD simulations, and in principle ab initio methods coupled to diffusion reaction cluster dynamics models. The final sections describe two examples of modeling microstructural evolution relevant to (i) RPV embrittlement, and (ii) dislocation loop evolution at intermediate dose and temperature.
G.R. Odette and B.D. Wirth
4. Nanoscale Precipitation in Irradiated RPV Steels
As noted previously, irradiation embrittlement of RPV steel has been most commonly characterized by shifts in a Charpy transition temperature ($\Delta T$) marking a specified energy index (41 J). The $\Delta T$ is due to the corresponding irradiation hardening, usually represented by increases in the yield stress ($\Delta\sigma_y$) produced by ultrafine nm-scale precipitates and defect cluster complexes that evolve under irradiation. Micromechanical models are consistent with the empirical observation that $\Delta T \approx C_c \Delta\sigma_y$, where $C_c \approx 0.65 \pm 0.15$ °C/MPa. The $\Delta\sigma_y$ can be related to the size distribution, number density and the dislocation obstacle strengths of the mix of hardening features [13]. Models of microstructural evolution have been described in a series of publications [5, 7, 8, 13, 14, 18–21, 95] and are not repeated in the following paragraphs except as necessary.
It has been common to divide the modeling of nm-scale features into those associated with Cu, which remains highly supersaturated following typical RPV heat treatments, and those that evolve in both Cu-bearing (Cu > ≈ 0.075%) and Cu-free (Cu < ≈ 0.075%) steels. In Cu-bearing steels, radiation enhanced diffusion greatly accelerates Cu clustering and the formation of coherent (bcc) Cu-rich transition phase precipitates (CRPs) alloyed with Mn, Ni and smaller quantities of other elements. Based on the assumption that radiation enhanced diffusion controlled Cu clustering is the rate-controlling step, the CRP kinetics can be treated with mean field cluster dynamics (CD) models of the time evolution of the number density of clusters, $N_j$, containing $j = 2, \ldots, n_{max}$ Cu atoms, as
\[ \frac{dN_j}{dt} = \alpha_{j+1} N_{j+1} + \beta_{j-1} N_{j-1} - (\alpha_j + \beta_j) N_j, \qquad j = 3, \ldots, n_{max} - 1 \qquad (3) \]
Here $\alpha_j$ and $\beta_j$ are the Cu emission and impingement rates, respectively. Slightly different forms of Eq. (3) are needed for $N_1$, $N_2$ and $N_{n_{max}}$ to complete the set of $n_{max}$ coupled ODEs. These equations can be numerically integrated to compute all the $N_j(t)$.
Since they are not particularly stiff, and since large sets of equations can readily be integrated (note $n_{max}$ = 10 000 corresponds to a CRP $r_{max}$ ≈ 3 nm), there is no computational barrier to full CD simulations of Cu clustering. The physics is subsumed into the coefficients, $\alpha$ and $\beta$. Assuming simple diffusion controlled kinetics, pure Cu precipitates and the capillary approximation,
\[ \alpha_j \approx 4\pi r_j D^*_{Cu} \frac{X_{ce}}{V_a} \exp\left(\frac{2\gamma_{pm} V_a}{r_j kT}\right) \qquad (4.a) \]
\[ \beta_j \approx 4\pi r_j D^*_{Cu} \frac{X_{cm}}{V_a} \qquad (4.b) \]
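As a minimal sketch of how Eqs. (3) and (4) fit together, the Python fragment below evaluates the emission and impingement rates and the right-hand side of Eq. (3) for the interior cluster sizes. The function names, the spherical cluster radius $r_j = (3jV_a/4\pi)^{1/3}$ and the values chosen for $X_{ce}$ and $X_{cm}$ are assumptions made here for illustration only; the modified balance equations for $N_1$, $N_2$ and $N_{n_{max}}$ are not given in the text and are therefore left out.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def cluster_radius(j, v_a):
    """Radius (m) of a sphere holding j atoms of volume v_a (assumed geometry)."""
    return (3.0 * j * v_a / (4.0 * math.pi)) ** (1.0 / 3.0)


def emission_rate(j, d_cu, x_ce, gamma_pm, v_a, temp_k):
    """Eq. (4.a): capillary-corrected Cu emission rate alpha_j (1/s)."""
    r_j = cluster_radius(j, v_a)
    capillary = math.exp(2.0 * gamma_pm * v_a / (r_j * K_B * temp_k))
    return 4.0 * math.pi * r_j * d_cu * (x_ce / v_a) * capillary


def impingement_rate(j, d_cu, x_cm, v_a):
    """Eq. (4.b): diffusion-limited Cu impingement rate beta_j (1/s)."""
    r_j = cluster_radius(j, v_a)
    return 4.0 * math.pi * r_j * d_cu * x_cm / v_a


def cd_rhs(n, alpha, beta):
    """Eq. (3): dN_j/dt for the interior sizes j = 3 .. n_max - 1.

    n, alpha and beta are lists indexed by cluster size j (index 0 unused).
    Entries for j = 1, 2 and n_max stay at zero because their modified
    balance equations are not specified in the text.
    """
    n_max = len(n) - 1
    dndt = [0.0] * len(n)
    for j in range(3, n_max):
        dndt[j] = (alpha[j + 1] * n[j + 1] + beta[j - 1] * n[j - 1]
                   - (alpha[j] + beta[j]) * n[j])
    return dndt


# gamma_pm and V_a follow Table 4; 563 K corresponds to 290 C.
V_A, GAMMA_PM, TEMP_K = 1.17e-29, 0.4, 563.0
# D*_Cu is the nominal value quoted in the text; X_CE and X_CM are hypothetical.
D_CU_STAR, X_CE, X_CM = 1.0e-22, 1.0e-5, 1.0e-3
```

With these illustrative inputs the smallest clusters emit Cu faster than they capture it, while clusters near $n_{max}$ = 10 000 capture faster than they emit, which is the capillary effect that separates sub-critical from growing precipitates in this kind of model.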
Here $X_{ce}$ is the equilibrium fraction of Cu in the ferrite matrix in equilibrium with a bcc phase, $X_{cm}$ is the remaining dissolved Cu fraction, $\gamma_{pm}$ is the effective CRP–matrix interface energy, and $V_a$ is the atomic volume of Cu. These parameters can be determined from ab initio or EAM potential based simulations, and can be measured or estimated from experiment. Typical values used in the models are given in Table 4. $D^*_{Cu}$ is the radiation enhanced diffusion coefficient that must be modeled separately. Within the assumptions of the model, $D^*_{Cu}$ simply sets the time, or $\phi t$-dpa dose, scale for precipitation. Integration of Eq. (3) predicts the time/fluence-dependent evolution of the pure Cu CRPs, $N(\phi t, r_p)$.
The enhanced diffusion of Cu under irradiation is primarily due to the corresponding excess concentration of vacancies. $D^*_{Cu}$ can be estimated using a standard steady-state rate theory (SRT) model [12] as
\[ D^*_{Cu} \approx K(\phi, T, S_t, \ldots)\,\phi + D_{Cu} \qquad (5) \]
Here $K$ is the RED factor that depends on the total defect sink strength $S_t$, as well as solutes and other features that trap vacancies, promoting vacancy–SIA recombination, and $D_{Cu}$ is the thermal Cu diffusion coefficient. Both $K$ and $D_{Cu}$ depend on the interactions between vacancies and Cu and the corresponding vacancy jump frequencies in the vicinity of Cu. The RED factor can be modeled by SRT as follows. Assuming steady state vacancy ($X_v$) and SIA ($X_i$) concentrations (atomic fractions) and ignoring bias effects and vacancy trapping, the defect balance can be expressed as
\[ G_v - D_v X_v S_{tv} - X_v X_i (D_v + D_i) R = 0 \qquad (6.a) \]
\[ G_i - D_i X_i S_{ti} - X_v X_i (D_v + D_i) R = 0 \qquad (6.b) \]
Here $G_v = G_i = G = \sigma_v \phi$ are the vacancy and SIA generation rates and $\sigma_v$ is the vacancy production cross-section, $S_{tv} = S_{ti}$ is the total sink strength, $D_v$ and $D_i$ are the vacancy and SIA diffusion coefficients and $R = 4\pi r_r / V_a$ is the vacancy–SIA recombination parameter, where $V_a$ is the atomic volume and $r_r$ is the recombination radius.

Table 4. Nominal values used in the mean field cluster dynamics calculations of radiation enhanced copper diffusion and copper precipitation kinetics
Parameter                                          Value
Burger's vector ($b$)                              0.248 nm
Vacancy production cross-section ($\sigma_v$)      0.6 × 10^-25 m^2
Recombination & trapping radius ($r_v$)            0.57 nm
CRP-matrix interface energy ($\gamma_{pm}$)        0.4 J/m^2
Atomic volume ($V_a$)                              1.17 × 10^-29 m^3
Vacancy diffusion pre-factor ($D_{v,0}$)           5 × 10^-5 m^2/s
Vacancy migration energy ($E_{m,0}$)               125 kJ/mol
Trap-vacancy binding energy ($H_b$)                30 kJ/mol
Total sink strength ($S_t$)                        2 × 10^14 m^-2
Trap concentration ($X_t$)                         0.03

If recombination is ignored,
\[ D_v X_v = D_i X_i = \frac{G}{S_t} \approx \frac{D^*_{sd}}{f_c} \qquad (7) \]
where $D^*_{sd}$ is the radiation enhanced solvent (Fe) self-diffusion coefficient and $f_c$ is the self-diffusion correlation factor (≈1). Considering recombination and assuming $D_v \ll D_i$,
\[ D_v X_v = \frac{f_t(T, \phi, S_t)\, G}{S_t} \qquad (8.a) \]
\[ f_t = \frac{2}{\eta}\left[(1 + \eta)^{1/2} - 1\right] \qquad (8.b) \]
\[ \eta = \frac{16\pi r_r G}{V_a D_v S_t^2} \qquad (8.c) \]
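The following sketch evaluates Eqs. (8.b) and (8.c) with the Table 4 parameters. It is illustrative only: the function names are hypothetical, the Arrhenius form $D_v = D_{v,0}\exp(-E_m/RT)$ is assumed from the pre-factor and migration energy listed in Table 4, and the flux of $10^{16}$ n/(m² s) is an arbitrary example value.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def vacancy_diffusivity(d_v0, e_m, temp_k):
    """Assumed Arrhenius form D_v = D_v0 * exp(-E_m / RT), in m^2/s."""
    return d_v0 * math.exp(-e_m / (R_GAS * temp_k))


def recombination_eta(g, d_v, s_t, r_r, v_a):
    """Eq. (8.c): dimensionless recombination parameter eta."""
    return 16.0 * math.pi * r_r * g / (v_a * d_v * s_t ** 2)


def surviving_fraction(eta):
    """Eq. (8.b), written as 2 / (1 + sqrt(1 + eta)).

    This is algebraically identical to (2/eta) * (sqrt(1 + eta) - 1)
    but avoids the loss of precision of that form as eta -> 0.
    """
    return 2.0 / (1.0 + math.sqrt(1.0 + eta))


# Table 4 parameters; temperature and flux are example inputs.
TEMP_K = 563.0                 # 290 C
FLUX = 1.0e16                  # n/(m^2 s), hypothetical example
G = 0.6e-25 * FLUX             # generation rate G = sigma_v * phi
D_V = vacancy_diffusivity(5.0e-5, 125.0e3, TEMP_K)
ETA = recombination_eta(G, D_V, 2.0e14, 0.57e-9, 1.17e-29)
F_T = surviving_fraction(ETA)
```

The rewritten form of Eq. (8.b) is a design choice for numerical robustness in the sink dominated limit, where $\eta$ is small and $f_t$ approaches 1.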
Here $f_t$ is the fraction of vacancies that survive recombination with SIA and reach sinks. However, recombination is greatly enhanced if vacancies are strongly bound to a high concentration ($X_t$) of solute trapping sites. Assuming that a solute trap is limited to one bound vacancy and that a small fraction of traps are occupied ($X_t \gg X_{tv}$),
\[ G + \frac{X_{tv}}{\tau_t} - D_v X_v \left[ S_t + \frac{4\pi (r_t X_t + r_r X_v)}{V_a} \right] = 0 \qquad (9.a) \]
\[ \frac{D_v X_v\, 4\pi r_t X_t}{V_a} - \frac{X_{tv}}{\tau_t} - \frac{D_v X_v\, 4\pi r_t X_{tv}}{V_a} = 0 \qquad (9.b) \]
\[ \tau_t \approx \frac{b^2}{D_v \exp(-H_b / RT)} \qquad (9.c) \]
Here, $r_t$ is the trap capture radius, $\tau_t$ is the average trapping time, $H_b$ is the trap–vacancy binding energy and $b$ (= 0.248 nm) is the atomic spacing. Solute vacancy binding energies are typically in the range of about 5–30 kJ/mol [96]. However, the effective $H_b$ may be even higher. Equation (9) can be solved for the $\phi$ corresponding to a specified $f_t(\phi, T_i, S_t, X_t, H_b)$ as
\[ \phi(f_t) = \frac{S_t}{\sigma_v}\left(\frac{1}{f_t} - 1\right)\left[\frac{4\pi r_t \tau_t}{V_a}\left(\frac{4\pi r_t X_t f_t}{V_a S_t} + f_t - 1\right) + \frac{4\pi r_t f_t}{D_v S_t V_a}\right]^{-1} \qquad (9.d) \]
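As a hedged numerical illustration of the trapping model, the sketch below solves the steady-state balances directly rather than using a closed form. It reads Eq. (9.a) as a free-vacancy balance (generation plus detrapping, minus losses to sinks, traps and recombination) and Eq. (9.b) as the trapped-vacancy balance; that reading, the function names, the Arrhenius $D_v$ and the example flux are assumptions of this sketch. The unknown is $y = D_v X_v$, found by bisection, and $f_t = y S_t / G$.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def trapping_time(b, d_v, h_b, temp_k):
    """Eq. (9.c): average trapping time tau_t (s)."""
    return b * b / (d_v * math.exp(-h_b / (R_GAS * temp_k)))


def vacancy_balance(y, g, d_v, s_t, x_t, r_t, r_r, v_a, tau_t):
    """Residual of the free-vacancy balance with y = D_v * X_v.

    X_tv is eliminated using the trapped-vacancy balance:
    X_tv * (1/tau_t + a_t * y) = a_t * X_t * y, with a_t = 4 pi r_t / V_a.
    """
    a_t = 4.0 * math.pi * r_t / v_a
    x_tv = a_t * x_t * y / (1.0 / tau_t + a_t * y)
    x_v = y / d_v
    loss = y * (s_t + 4.0 * math.pi * (r_t * x_t + r_r * x_v) / v_a)
    return g + x_tv / tau_t - loss


def surviving_fraction_with_traps(g, d_v, s_t, x_t, r_t, r_r, v_a, tau_t):
    """Bisection for y in [0, G/S_t]; the residual decreases monotonically."""
    lo, hi = 0.0, g / s_t
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if vacancy_balance(mid, g, d_v, s_t, x_t, r_t, r_r, v_a, tau_t) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * s_t / g


# Table 4 parameters; temperature and flux are example inputs.
TEMP_K, S_T, V_A, RADIUS = 563.0, 2.0e14, 1.17e-29, 0.57e-9
D_V = 5.0e-5 * math.exp(-125.0e3 / (R_GAS * TEMP_K))  # assumed Arrhenius D_v
G = 0.6e-25 * 1.0e16                                  # sigma_v * phi
TAU_T = trapping_time(0.248e-9, D_V, 30.0e3, TEMP_K)
F_NO_TRAPS = surviving_fraction_with_traps(G, D_V, S_T, 0.0, RADIUS, RADIUS, V_A, TAU_T)
F_TRAPS = surviving_fraction_with_traps(G, D_V, S_T, 0.03, RADIUS, RADIUS, V_A, TAU_T)
```

With the trap concentration set to zero the solver reduces to the closed form of Eq. (8.b), which gives a simple consistency check; with traps present the surviving fraction is lower at the same flux.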
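For dilute alloys, the simplest way to model $D^*_{Cu}$ is in terms of the radiation enhanced self-diffusion coefficient, $D^*_{sd}$, as
\[ D^*_{Cu} \approx D^*_{sd}\left[\frac{D_{Cu}}{D_{sd}}\right] \approx \left[\frac{G f_t}{S_t} + D_{sd}\right]\left[\frac{D_{Cu}}{D_{sd}}\right] \qquad (10) \]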
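A short sketch of Eq. (10), combined with the recombination factor of Eq. (8.b), is given below. The function names are hypothetical. Because $\eta$ in Eq. (8.c) is proportional to $\phi$ at fixed temperature and sink strength, the helper `flux_exponent` estimates the local exponent $p$ in $f_t \propto \phi^{-p}$ by a centered finite difference in $\ln\eta$; this is an illustration of the limiting behavior of Eq. (8.b) only and does not include trapping or cascade cluster sinks.

```python
import math


def surviving_fraction(eta):
    """Eq. (8.b) in the numerically stable form 2 / (1 + sqrt(1 + eta))."""
    return 2.0 / (1.0 + math.sqrt(1.0 + eta))


def d_cu_star(g, f_t, s_t, d_sd, dcu_over_dsd):
    """Eq. (10): radiation enhanced Cu diffusion coefficient (m^2/s)."""
    return (g * f_t / s_t + d_sd) * dcu_over_dsd


def flux_exponent(eta, step=1.0e-3):
    """Local p in f_t ~ phi**(-p), using eta proportional to phi."""
    upper = surviving_fraction(eta * (1.0 + step))
    lower = surviving_fraction(eta * (1.0 - step))
    d_ln_f = math.log(upper) - math.log(lower)
    d_ln_eta = math.log(1.0 + step) - math.log(1.0 - step)
    return -d_ln_f / d_ln_eta
```

For small $\eta$ the exponent tends to 0, and for large $\eta$ it tends to 1/2, since Eq. (8.b) then reduces to $f_t \approx 2/\sqrt{\eta}$.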
Here $D_{sd}$ is the thermal self-diffusion coefficient. The total Cu precipitation is proportional to $D^*_{Cu} t = f(\phi t)$. Note that there are two sources of dose rate ($\phi$) effects in this formulation. If, as is typically the case, $G_v f_t / S_t \gg D_{sd}$, then $D^*_{Cu} t = \sigma_v \phi t f_t / S_t$ and any dose rate dependence of precipitation at a given $\phi t$ is contained in the $f_t$ recombination term. At low dose rates in the sink dominated regime, $f_t \approx 1$ and $D^*_{Cu} t$ is independent of dose rate. At higher dose rates, in the recombination dominated regime, $f_t$ scales as ≈ $1/\sqrt{\phi}$. This means that the $\phi t$ needed to produce a given amount of precipitation increases with increasing dose rate. At still higher dose rates, transient cascade vacancy clusters become the dominant defect sink and $f_t$ scales as ≈ $1/\phi$; in this case the precipitation depends on time, $t$, but not $\phi t$. This is also the case at very low dose rates, in the thermal diffusion dominated regime, where $D_{sd} \gg G f_t / S_t$. More generally $f_t$ varies continuously with dose rate, scaling as $\phi^{-p}$ where $p$ varies between 0 and 1. Figure 7a shows $f_t$ in the sink and recombination dominated regimes as a function of $\phi$ and $T_i$ for the base model parameters given in Table 4. Figure 7b shows the $\phi$ corresponding to $f_t = 0.5$ versus $H_b$ for various $X_t$ and $S_t$.

[Figure 7a: $f_t$ (0.001 to 10, logarithmic) versus $\phi$ [n/(m² s)] ($10^{13}$ to $10^{21}$, logarithmic), with curves for 270 °C, 290 °C and 320 °C.]
Figure 7a. The fraction of vacancies that escape recombination and reach sinks, as a function of irradiation temperature and neutron flux.

The main advantage of framing RED in the form of Eq. (10) is that all the key atomic scale diffusion processes (that depend on the various vacancy–solute interaction energies, and the corresponding jump frequencies) are lumped in the $[D_{Cu}/D_{sd}]$ term. Experimental estimates of $[D_{Cu}/D_{sd}]$ are available at high temperatures, but they must be extrapolated to ≈ 300 °C pertinent to RED Cu precipitation. Notably, however, the extrapolated $[D_{Cu}/D_{sd}]$ ratio is much less sensitive to various uncertainties than either $D_{Cu}$ or $D_{sd}$. Further,
the jump frequencies that govern $[D_{Cu}/D_{sd}]$ can be estimated from atomistic calculations based on MD and EAM type potentials; or even, in principle, using ab initio methods. The jump frequencies can be used in analytical models of $[D_{Cu}/D_{sd}]$, including the effects of alloy composition. Alternatively, the jump frequencies can be used in KLMC simulations to extract the diffusion coefficients. Such a formulation provides a good example of an effective way to bridge the gap between atomistic simulations and other types of models. Estimates of $[D_{Cu}/D_{sd}]$ based on fits to precipitation and hardening data in typical RPV steels suggest values of ≈ 50 at around 300 °C for the nominal values of $\sigma_v$ and $S_t$ in Table 4; these are within an order of magnitude of the atomistic estimates [18].

[Figure 7b: $\phi$ at $f_t = 0.5$ [n/(m² s)] ($10^{13}$ to $10^{20}$, logarithmic) versus $H_b$ (kJ/mol, 0 to 40), with curves for $S_t = 2 \times 10^{14}$ m$^{-2}$ and $S_t = 2 \times 10^{15}$ m$^{-2}$ and for trap concentrations of 0.005 and 0.03.]
Figure 7b. The flux at which the recombination fraction equals 50%, as a function of trap concentration, binding enthalpy and total sink density.

Figure 8 shows the results of a cluster dynamics Cu precipitation simulation for a Fe–0.4%Cu alloy irradiated at 290 °C. The results of the CD model are expressed in terms of the CRP number density ($j > 3$), $N_p$, average radius, $\langle r_p \rangle$, and volume fraction $f_p$ as a function of $\phi t$, for a nominal $D^*_{Cu} = 10^{-22}$ m²/s. Note, in principle, the CD models need not involve any adjustable parameters. The model shows overlapping stages of nucleation, growth and coarsening. Overall the predictions are in good semi-quantitative agreement with experimental observations. However, the simple CD models based on assuming diffusion controlled kinetics and a constant $D^*_{Cu}$ modestly differ in some details from experimental observations for both thermal and RED precipitation. Possible reasons for the disparity include (i) uncertainties with extrapolating
thermodynamic and capillary-type concepts to the atomic scale, (ii) complex, non-uniform precipitate structures at the atomic scale, (iii) precipitates alloyed with Mn, Ni, Si, P, and even consisting of Mn–Ni rich phases at high alloy Mn and Ni concentrations (promoted by lower T and alloy Cu), (iv) excess free energy contributions from misfit coherency strains (or strain gradients), (v) a continuum range of inter-related features, from vacancy cluster-solute complexes to CRPs and MNPs, (vi) complex correlated diffusion processes associated with strong vacancy-solute interactions in semi-dilute alloys, and (vii) evolution of the RED coefficient with defect sink and vacancy trap evolution. Space does not permit a full discussion of these issues, but a brief discussion of the thermodynamic and LMC treatment of precipitate composition and chemical structure will be presented.

[Figure 8: $N_p$ (m$^{-3}$), $\langle r_p \rangle$ (nm) and $f_p$ (%) on logarithmic axes versus $\phi t$ (n/m²) from $10^{21}$ to $10^{24}$.]
Figure 8. CD model prediction of the nucleation, growth and coarsening evolution of Cu precipitate number density ($N_p$), mean radius ($\langle r_p \rangle$) and volume fraction ($f_p$) in an Fe–0.4% Cu alloy irradiated at 290 °C.

Mean field thermodynamics can model the average composition of the precipitates as a function of the alloy composition. This requires evaluating the chemical potential ($\mu_i$) and corresponding activity ($a_i = \exp[(\mu_i - G_i^o)/RT]$) of each species (Cu, Mn, Ni, Fe, . . . ) in both the matrix (m) and precipitate (p) phases, where $G_i^o$ is the free energy of pure element $i$. All constituents are allowed to flow to ($a_{im} > a_{ip}$) or from ($a_{im} < a_{ip}$) the precipitate until quasi-equilibrium is established at the appropriate level of solute partitioning. For example, a significant amount of Cu in solution has a very high activity ($a_{cm} \gg 1$) compared with that in a pure Cu precipitate ($a_{cp} = 1$). Thus the matrix Cu must decrease to very low values to reach the condition $a_{cm} = a_{cp} \approx 1$. Evaluations of $\mu_i$ are based on the standard definition:
\[ \mu_i = \left(\frac{\partial G_t}{\partial n_i}\right)_{T,\, n_{j,k}} \qquad (11) \]
Here, $G_t$ is the total free energy of the precipitate or matrix mixture. Evaluating the $\mu_i$ requires modeling the corresponding molar free energy ($G$) of the mixture as a function of temperature and composition. The $G(X_A, X_B, \ldots, T)$ can be determined from regular or sub-regular solution models with empirical excess free energy ($G_{ex}$), enthalpy ($H_{ex}$) and entropy ($S_{ex}$) of mixing ($G_{ex} = H_{ex} - TS_{ex}$) and lattice change energy terms ($G_{st}$) taken from compilations such as CALPHAD [95] (www.calphad.org), in addition to the ideal solution terms. For the precipitate phase,
\[ G = \sum_i X_i \left[G_{o,i} + G_{st,i} + RT \ln X_i\right] + H_{ex} - T S_{ex} + 4\pi r_p^2 \gamma_{pm} \qquad (12) \]
$G_{o,i}$, $n_i$ and $X_i$ are the free energy, number of moles and mole fraction of the $i$'th element. The $H_{ex}$ derives from differences between bonding energies between like (e.g., Fe–Fe, Cu–Cu) and unlike (e.g., Fe–Cu) atoms. The binary interaction between A and B atoms is typically given by a sub-regular solution model (e.g., www.calphad.org) as
\[ H_{ex} = X_A X_B \left[X_A L_A(T) + X_B L_B(T)\right] \qquad (13) \]
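To make the link between Eqs. (11) and (13) concrete, the sketch below builds a binary molar free energy from the ideal and sub-regular terms and obtains the chemical potential of component A from the tangent intercept, $\mu_A = G + (1 - X_A)\, dG/dX_A$, which is the binary form of Eq. (11). It is a simplified illustration: the $S_{ex}$, $G_{st}$ and interface energy terms are omitted, the function names are hypothetical, and the interaction parameters passed in are placeholders rather than assessed CALPHAD values.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def excess_enthalpy(x_a, l_a, l_b):
    """Eq. (13): sub-regular H_ex for a binary A-B mixture (J/mol)."""
    x_b = 1.0 - x_a
    return x_a * x_b * (x_a * l_a + x_b * l_b)


def molar_free_energy(x_a, g_a, g_b, l_a, l_b, temp_k):
    """Reference + ideal mixing + H_ex; S_ex, G_st and interface terms omitted."""
    x_b = 1.0 - x_a
    ideal = R_GAS * temp_k * (x_a * math.log(x_a) + x_b * math.log(x_b))
    return x_a * g_a + x_b * g_b + ideal + excess_enthalpy(x_a, l_a, l_b)


def chemical_potential_a(x_a, g_a, g_b, l_a, l_b, temp_k, dx=1.0e-6):
    """mu_A from the tangent intercept, with a centered finite difference."""
    g_plus = molar_free_energy(x_a + dx, g_a, g_b, l_a, l_b, temp_k)
    g_minus = molar_free_energy(x_a - dx, g_a, g_b, l_a, l_b, temp_k)
    slope = (g_plus - g_minus) / (2.0 * dx)
    g_here = molar_free_energy(x_a, g_a, g_b, l_a, l_b, temp_k)
    return g_here + (1.0 - x_a) * slope


def activity_a(x_a, g_a, g_b, l_a, l_b, temp_k):
    """a_A = exp[(mu_A - G_A^o) / RT], with G_A^o = g_a."""
    mu_a = chemical_potential_a(x_a, g_a, g_b, l_a, l_b, temp_k)
    return math.exp((mu_a - g_a) / (R_GAS * temp_k))
```

When `l_a` equals `l_b` the model reduces to a regular solution, for which the chemical potential has the known closed form $\mu_A = G_A^o + RT \ln X_A + L X_B^2$; a positive interaction parameter therefore raises the activity of a dilute solute above its mole fraction.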
Here the $L_A$ and $L_B$ are tabulated polynomial functions of $T$ (over specified ranges) for various crystal structures; they are most often derived by fits to experimental binary phase diagrams. Analogous empirical analytic expressions exist for $S_{ex}$. For a regular solution, $L_A = L_B$ (a constant, independent of temperature) and $S_{ex} = 0$ [97]. The $G$ evaluations can be extended to a larger number of constituents by summing the contributions from the binary (e.g., Cu–Mn) and higher order (e.g., Cu–Mn–Ni–..) interaction terms; however, generally, only the binary interaction terms are available. A further limitation is that there may not be information for the appropriate crystal structure, as for the bcc binary Cu–Ni phase.
The free energy contributions of the composition dependent precipitate–matrix interface energy ($\gamma_{pm}$) to $\mu_i$ must also be considered for nm-scale precipitates. The chemical energy contribution to $\gamma_{pm}$ for a coherent interface can be approximated in terms of a regular solution pair bonding model [5, 95]
\[ \gamma_{pm} = \frac{2}{3} \sum_i \frac{H_{si}\, z_b}{A_i\, z} (X_{ip} - X_{im})^2 \qquad (14) \]
Here, $H_{si}$ is the heat of solution for solute $i$, $A_i$ is the area per atom in the interface, $z_b$ is the number of bonds across the interface (≈2) and $z$ is the atomic coordination (= 8). The factor 2/3 is an adjustment to account for the observation that the simple pair bond model for $\gamma_{pm}$ is typically about 50% higher than better experimental and theoretical estimates. The model predicts $\gamma_{pm}$ ≈ 0.4 J/m² for a pure Cu precipitate. Theoretical estimates of $\gamma_{pm}$ can be obtained from MD simulations using Fe–Cu EAM potentials, or ab initio calculations. However, the main advantage of the simple pair bonding model is that it can be readily extended to interfaces between phases with multiple components (e.g., Fe, Cu, Ni, Mn, . . . ). Since the $H_s$ is much lower for Mn and Ni than Cu, these elements are more enriched at small precipitate sizes than for the corresponding case of bulk phases. The lower $\gamma_{pm}$ and higher Mn and Ni solute concentrations are predicted to promote the nucleation and growth of a higher number density of precipitates, consistent with observation; P, and perhaps Si, also appear to play a similar role. The bulk phase boundaries can be determined by setting $\gamma_{pm} = 0$. Note the corresponding composition dependent coherency strain energy should also be considered, but this generally smaller effective contribution has not been included in the models to date.
The thermodynamic models predict the existence of Mn–Ni phases even in Cu-free steels, as well as MNPs in Cu-bearing alloys. Note these Mn–Ni rich phases are favored below ≈350 °C, where normal thermal aging kinetics is so slow that MNPs would not be observed experimentally. However, once nucleated, RED would result in large volume fractions of the corresponding MNPs, leading to severe embrittlement. Nucleation calculations indicate that Cu is very effective in promoting (catalyzing) MNP formation due to its high super-saturation, even at relatively small concentrations. In steels with Cu > 0.05–0.1%, Cu readily clusters along with Mn and Ni. In Cu-bearing alloys without Ni, the thermodynamic models predict that Mn will be enriched in the precipitates to $X_{Mn}$ = 0.1–0.2. However, Ni strongly interacts with Mn; hence, when present in the steel, Ni is enriched in the precipitates as well. The models predict $X_{Ni}/X_{Mn}$ ratios between approximately 0.5 and 1, increasing with alloy Ni (and Mn) concentrations. In medium to high Cu alloys, the $X_{Cu}/X_{Mn}$ ratios are approximately 3 to 1 depending on the Ni content. The larger volume fractions and higher number densities of the CRPs and MNPs result in much larger hardening that increases rapidly with increasing Ni and Mn. For example, the peak $\Delta\sigma_y$ in alloys with 0.4Cu and 1.6Mn is about 60 MPa with 0.0Ni, compared to 270 MPa with 1.6Ni [18]. Thus the thermodynamic models rationalize the strong synergistic effect between Cu, Ni and Mn in irradiation hardening and embrittlement. The model-based predictions of MNPs in high Ni and Mn Cu-bearing steels have been confirmed in numerous subsequent experiments. Experimental confirmation of the thermodynamic model for Cu-bearing alloys includes the effects of thermal annealing at temperatures up to 450 °C and above, which is predicted to reduce significantly the Mn and Ni contents of precipitates in Cu-bearing alloys [13, 98]. Among other limitations, the mean field thermodynamic model cannot accurately treat the detailed chemical and crystallographic structure of the precipitates.
For example, it is expected that Mn and Ni would segregate to the outside of polyhedral precipitates with (100) and (110) facets, thus lowering $\gamma_{pm}$ and the total precipitate interface energy. Further, the strong bonding interactions would be expected to produce some degree of ordering in the Mn and Ni rich regions. The actual precipitates would have a range of lowest energy configurations as modified by entropic effects. LMC methods can be applied to predict these structures. Ideally this would involve the use of rigorous many-body interaction models or at least semi-empirical EAM type potentials. However, since such information is generally not available, regular solution pair bond energy ($\varepsilon_{ij}$) models have been derived based on thermodynamic data ([21, 95] and references 20–25 therein). The $\varepsilon_{ij}$ can be estimated [97] as
\[ \varepsilon_{ij} \approx \frac{G_{ex}(X_i, T)}{N_a z} + \frac{\varepsilon_{ii} + \varepsilon_{jj}}{2} \qquad (15) \]
Here $G_{ex}(X_i, T)$ is the excess molar free energy of a specified mixture of $i$ and $j$, $z = 8$ is the atomic coordination, $N_a$ is Avogadro's number and the $\varepsilon_{ii}$ and $\varepsilon_{jj}$ are like bond energies determined from the pure element cohesive energies. The $G_{ex}(X_i, T)$ are evaluated for prototypic precipitate and matrix compositions from thermodynamic data in the literature (e.g., references 20–25 in Ref. [95]). They also contain terms, as needed, for transformation to the bcc structure. For example, $G_{ex}(X_i, T)$ data is not available for the bcc phase of Cu–Ni, so $2G_{ex}(X_i, T)/3$ obtained for the fcc phase is used to approximate the effects of lower coordination. Other modest adjustments to obtain estimates of $\varepsilon_{ij}$ in Fe–Cu–Ni–Mn–Si alloys are discussed elsewhere [95]. The total energy ($E$) of a particular configuration of atoms is simply the sum of all the like and unlike bond energies. Starting with a random solid solution, the Kawasaki LMC algorithm exchanges atomic positions with a Boltzmann weighted probability ($P$) as
\[ P = \exp\left(-\frac{\Delta E}{kT}\right), \quad \text{if } \Delta E > 0 \qquad (16.a) \]
\[ P = 1, \quad \text{if } \Delta E \le 0 \qquad (16.b) \]
Here $\Delta E$ is the energy difference before and after the exchange. The algorithm randomly picks atoms for possible exchanges for a large number of sweeps until $E$ fluctuates around a constant free energy minimum, reflecting the ensemble of precipitate configurations at a given temperature. This MC approach is essentially a regular thermodynamic solution model cast in atomistic form and thus should produce results (e.g., phase boundaries) that are generally similar to the mean field predictions. However, within the approximations of the simple pair bond model, it can provide additional atomic level detail on the chemical and crystallographic structure of nm-scale precipitates. Some results are illustrated in Fig. 9. Figure 9a shows a typical snapshot for a partially ordered precipitate in an Fe–0.24%Cu, 0.59% Ni, 1.5% Mn, 1.0% Si alloy at 290 °C with a Cu-rich core surrounded by a Ni–Si–Mn rich shell. The simulation is remarkably consistent with both atom probe and SANS measurements on an irradiated RPV weld with this composition [21, 95].
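The exchange rule of Eq. (16) can be sketched for a binary Fe–Cu system as follows. This is a minimal illustration, not the authors' code: it uses a small periodic bcc lattice built from two interpenetrating cubic sublattices, restricts exchanges to nearest-neighbor pairs, and uses hypothetical bond energies (only the unlike-bond excess matters for an exchange when the coordination is fixed). All function names and numerical values are assumptions of this sketch.

```python
import math
import random


def neighbors(site, size):
    """Eight nearest neighbors on a periodic bcc lattice.

    A site is (x, y, z, s): s = 0 for cube corners, s = 1 for body centers.
    """
    x, y, z, s = site
    offsets = (-1, 0) if s == 0 else (0, 1)
    return [((x + dx) % size, (y + dy) % size, (z + dz) % size, 1 - s)
            for dx in offsets for dy in offsets for dz in offsets]


def make_lattice(size, x_cu, rng):
    """Random solid solution with Cu atomic fraction x_cu."""
    sites = [(x, y, z, s) for x in range(size) for y in range(size)
             for z in range(size) for s in (0, 1)]
    cu_sites = set(rng.sample(sites, round(x_cu * len(sites))))
    return {site: ("Cu" if site in cu_sites else "Fe") for site in sites}


def site_energy(lattice, site, size, eps):
    """Sum of the pair bond energies between a site and its neighbors."""
    kind = lattice[site]
    return sum(eps[frozenset((kind, lattice[nb]))] for nb in neighbors(site, size))


def total_energy(lattice, size, eps):
    """Total energy E; the factor 0.5 corrects for double-counted bonds."""
    return 0.5 * sum(site_energy(lattice, s, size, eps) for s in lattice)


def acceptance(delta_e, kt):
    """Eq. (16): P = 1 if dE <= 0, else exp(-dE / kT)."""
    return 1.0 if delta_e <= 0.0 else math.exp(-delta_e / kt)


def kawasaki_sweep(lattice, size, eps, kt, rng):
    """One sweep: as many attempted neighbor exchanges as there are sites."""
    sites = list(lattice)
    for _ in range(len(sites)):
        a = rng.choice(sites)
        b = rng.choice(neighbors(a, size))
        if lattice[a] == lattice[b]:
            continue  # exchanging like atoms changes nothing
        before = site_energy(lattice, a, size, eps) + site_energy(lattice, b, size, eps)
        lattice[a], lattice[b] = lattice[b], lattice[a]
        after = site_energy(lattice, a, size, eps) + site_energy(lattice, b, size, eps)
        if rng.random() >= acceptance(after - before, kt):
            lattice[a], lattice[b] = lattice[b], lattice[a]  # rejected: undo


# Hypothetical bond energies in eV; kT corresponds to about 563 K.
EPS = {frozenset(("Fe",)): 0.0, frozenset(("Cu",)): 0.0,
       frozenset(("Fe", "Cu")): 0.05}
KT = 0.0485
```

Because exchanges conserve the number of each species, the alloy composition is fixed throughout the run, and with a positive unlike-bond excess the total energy falls as Cu atoms cluster. A production calculation would use the multicomponent $\varepsilon_{ij}$ set and far larger lattices than the toy sizes suitable for this sketch.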
Figures 9b–e show the range of predicted typical precipitate structures for other alloy compositions and T, including the structure of MNPs. Since they may be slow to nucleate, the MNPs in Cu-free (and very low Cu) steels were dubbed potential “late blooming phases” that could produce severe and unexpectedly rapid embrittlement above a high incubation dose ($\phi t$ or dpa). The predicted formation of large volume fractions of MNPs in very low Cu and Cu-free steels [5, 13, 20, 21], and the corresponding high levels of hardening, have only recently been confirmed by a variety of characterization methods [22]. This excellent example of modeling leading experiment may have profound implications for the extended life of RPVs. Integrated experiments and refined models will be critical to further map the T, $\phi$, $\phi t$, Cu, Ni, Mn regimes where MNPs may be important and to assess the possible role of other solutes, like Si, and phases as well.
[Figure 9: panels (a)–(e); legend entries Cu, Ni, Mn and Si; scale bars of 2, 3 and 5 nm.]
Figure 9. MC predictions of the atomic structure of CRP/MNPs. Bulk alloy compositions and temperatures for the simulations were (a) Fe–0.24% Cu–0.59% Ni–1.5% Mn–1.0% Si at 283 °C, (b) 0.26% Cu, 0% Ni, 1.2% Mn at 260 °C, (c) 0.26% Cu, 0.75% Ni, 1.2% Mn at 260 °C, (d) 0.26% Cu, 1.2% Ni, 1.2% Mn at 260 °C, and (e) 0.13% Cu, 0.75% Ni, 1.2% Mn at 290 °C.
5. Dislocation Loop Evolution in Ferritic Alloys
TEM examination of LAMS intended for fusion first wall and blanket application does not reveal any visible damage following low dose, intermediate temperature irradiation (<0.05 dpa at 300 °C). However, as the irradiation dose increases above ∼0.05 dpa, a significant population of dislocation loops, primarily of self-interstitial type, is experimentally observed with $b = a\langle 100\rangle$ and $b = a/2\langle 111\rangle$. The distribution of loop Burger's vectors observed ranges from almost equal proportions to predominantly $a\langle 100\rangle$, rather than the expected and lowest energy $b = a/2\langle 111\rangle$. While this result has been known for nearly 40 years [99–101], a self-consistent mechanism to explain the presence of $\langle 100\rangle$ loops in ferritic alloys was not established until recently [102].
MD-EAM simulations show that self-interstitials and small clusters up to tetra-interstitials diffuse three dimensionally, with intrinsic activation energies of only a few tenths of an eV [62–65]. As previously discussed, recent ab initio results raise questions about whether these SIA clusters are $a/2\langle 110\rangle$ or $a/2\langle 111\rangle$ type [68, 69]. Larger ($n \gtrsim 5$–$10$) self-interstitial cluster $a/2\langle 111\rangle$ dislocation loops migrate by quasi-1D diffusion along their glide prism, with activation energies less than 0.1 eV [64, 72]. The 1D migration of $a/2\langle 111\rangle$ clusters is reasonably consistent with the ab initio results, which indicate very small energy differences between $\langle 111\rangle$ dumbbell and $\langle 111\rangle$ crowdion configurations [68, 69]. But the size at which SIA clusters transform from $\langle 110\rangle$ to $\langle 111\rangle$ orientations is an issue, as is solute and impurity trapping. At damage levels relevant to fusion conditions, $a\langle 100\rangle$ dislocation loops are an important, but relatively unexplained, part of the irradiation-induced microstructure.
Two mechanisms have been proposed to explain the formation and growth of $\langle 100\rangle$ loops in α-Fe [102, 103]. The Eyre and Bullough mechanism [103] assumes that SIA clusters of $a/2\langle 110\rangle$ orientation (Burger's vector) form during irradiation and, upon reaching a critical size, shear into a more energetically preferred configuration with a Burger's vector of $a\langle 100\rangle$ or $a/2\langle 111\rangle$. However, the Eyre–Bullough model [103] does not explain why the $a\langle 100\rangle$ loops form in preference to the lower energy $a/2\langle 111\rangle$, since they involve nearly equivalent shear transformations. Further, $a/2\langle 110\rangle$ SIA clusters contain a stacking fault in the body centered cubic Fe structure. Such stacking faults have not been observed experimentally, nor are they anticipated, owing to the very high stacking fault energies. Recently, it has been proposed that intersections between loops could lead to $a\langle 100\rangle$ loop formation [102]. Experiments performed in the early 1960s [104] clearly established that hexagonal dislocation networks composed of $a/2\langle 111\rangle$ and $a\langle 100\rangle$ dislocation segments form in Fe. It was recognized that $a\langle 100\rangle$ loops could form as a result of the reaction [99]:
\[ \frac{a}{2}[111] + \frac{a}{2}[1\bar{1}\bar{1}] \rightarrow a[100] \qquad (17) \]
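The vector bookkeeping in Eq. (17) is easy to verify with exact rational arithmetic, as in the sketch below. The comparison of squared Burgers vector magnitudes uses Frank's $b^2$ energy criterion, a standard continuum estimate that is brought in here for illustration and is not part of the original discussion; the names are hypothetical and lengths are in units of the lattice parameter $a$.

```python
from fractions import Fraction


def add(b1, b2):
    """Component-wise sum of two Burgers vectors."""
    return tuple(p + q for p, q in zip(b1, b2))


def norm_sq(b):
    """Squared magnitude, in units of a^2."""
    return sum(c * c for c in b)


HALF = Fraction(1, 2)
B1 = (HALF, HALF, HALF)      # a/2 [1 1 1]
B2 = (HALF, -HALF, -HALF)    # a/2 [1 -1 -1]
B_SUM = add(B1, B2)          # left-hand side of Eq. (17)

# Frank's criterion (assumption of this sketch): a junction reaction is
# elastically favorable when b1^2 + b2^2 exceeds the product's b^2.
REACTANT_B2 = norm_sq(B1) + norm_sq(B2)
PRODUCT_B2 = norm_sq(B_SUM)
```

The sum is exactly $a[100]$, and the reactants carry $3a^2/2$ in total against $a^2$ for the product, so the junction reaction lowers the elastic energy by this estimate even though a single $a\langle 100\rangle$ loop has a larger $b^2$ than a single $a/2\langle 111\rangle$ loop.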
However, Masters discounted this possibility since $a/2\langle 111\rangle$ loops were not observed [99]. As discussed previously, MD-EAM simulations show such loops form directly in displacement cascades [58] and are highly mobile due to 1D migration on their $\langle 111\rangle$ glide cylinder [64, 72]. As expected from continuum elasticity theory, MD-EAM simulations show that loops with $a\langle 100\rangle$ Burger's vector have a higher self-energy than $a/2\langle 111\rangle$. However, recent MS calculations using a Finnis–Sinclair potential for Fe reveal a much smaller difference in energy than expected [102], raising the possible existence of metastable $a\langle 100\rangle$ loops. As shown in Fig. 10, MD simulations of interactions (collisions) between SIA dislocation loops reveal that junctions of $a\langle 100\rangle$ type do form in α-Fe consistent with Eq. (17). The necessary conditions for $\langle 100\rangle$ junction formation by Eq. (17) are that both interacting loops are larger than ≈ 20 SIAs and are approximately the same size [102]. When these conditions are not met, the smaller cluster always rotates into the $\langle 111\rangle$ orientation of the larger cluster [78, 102]. These junctions are thermally (meta-) stable and can propagate across the loop through a complicated two-step mechanism described by Marian and co-workers [66].

Figure 10. Sequence of MD snapshots at (a) 0, (b) 120 and (c) 430 ps, of the interaction of two $a/2\langle 111\rangle$ loops with Burgers vectors appropriate to Eq. (17) at 1000 K. The loop on the left side of the image is a perfect, hexagonal 37-SIA cluster, while the one on the right is a 34-SIA jogged hexagonal loop. After forming a $\langle 100\rangle$ junction following the loop collision, the junction expands throughout the resulting loop.

MD simulations also reveal a mechanism for $a\langle 100\rangle$ clusters to grow to TEM observable sizes. Although potentially glissile, $a\langle 100\rangle$ loops have a very large activation energy for glide, computed to be >2.5 eV, and are effectively sessile. Notably, MD simulations of the interaction between $a/2\langle 111\rangle\{110\}$ and $a\langle 100\rangle\{100\}$ loops reveal rotation of the smaller $a/2\langle 111\rangle$ cluster to join the larger $a\langle 100\rangle$ loop. Thus, immobile $a\langle 100\rangle$ loops are a biased sink for absorption of both mobile SIAs and $a/2\langle 111\rangle$ loops. Figure 11 shows an MD simulation in which a 19-SIA $a/2\langle 111\rangle$ cluster is absorbed by a 50-SIA $a\langle 100\rangle$ square loop. Even though the lowest energy configuration is a 69-SIA $a/2\langle 111\rangle$ loop, the system follows the path favored by the lattice dynamics as:
\[ \langle 100\rangle + \tfrac{1}{2}\langle 111\rangle \rightarrow \langle 211\rangle \rightarrow \langle 100\rangle \qquad (18) \]

Figure 11. Sequence of MD snapshots at (a) 0.0, (b) 1.5, (c) 2.2 and (d) 3.5 ps, of the absorption of a hexagonal, 19-SIA a/2[111](110) cluster by a square, 50-SIA a[100](100) loop according to equation (18) at 100 K. Interstitials displayed in white are those belonging to the a/2[111] cluster that have rotated to an a[100] configuration.
This reaction involves rotation of individual $\langle 111\rangle$-oriented interstitials (in the presence of $\langle 100\rangle$ SIAs) into an intermediate metastable $\langle 211\rangle$ configuration that rapidly rotates into the $\langle 100\rangle$ orientation [102]. This description provides a plausible mechanism for the formation and growth of $a\langle 100\rangle$ dislocation loops in LAMS, although additional research is required to quantify the loop density evolution with irradiation conditions, and validate the formation mechanism.
6. Outlook
The effect of irradiation on materials is a classic example of an inherently multiscale problem involving multiple physical phenomena, and it impacts a wide range of technologies. While much is known about the hierarchical processes that govern irradiation effects, the investigation of controlling mechanisms, refinement of key sub-models and extension of the modeling approaches to treat multi-constituent alloys remain active research areas. While they are neither perfect, nor fully based on first principles, physical sub-models for the majority of key MSMP processes mediating irradiation effects in fission (RPV embrittlement) and to a lesser extent fusion reactors are now available [7, 13, 14]. Indeed, models have often led experimental observations of key embrittlement phenomena. Examples include the dominant role of RED-copper precipitation in embrittlement of RPV steels [19], the composition and structure of CRPs [20, 21, 95] and the existence of late-blooming MNPs [13, 18, 20]. More generally, existing RPV embrittlement models rationalize almost all observed embrittlement trends, including those that are counterintuitive and complex, such as seemingly contradictory effects of neutron flux [13, 105].
The integration of available sub-models into a comprehensive MSMP model for RPV embrittlement, in what is called a virtual test reactor (VTR), is being carried out in the REVE project. REVE, which stands for REactor for Virtual Experiments and means “dream” in French, is an international collaboration between a large number of institutions in Europe, the United States and Japan [106]. REVE has been led by Professor Jean-Claude Van Duysen, and Stephanie Jumel has led the code integration effort. The first integrated code, the RPV-1 simulator, which inputs key embrittlement variables and outputs the net corresponding yield stress increase, was recently released and is currently being calibrated and validated with large experimental databases [107].
RPV-1 links five codes and two databases contained in three modules that can be run separately. The linked codes consist of models of PRA production (SPECMIN) and sub-cascade formation (INCAS), a rate theory defect-Cu solute conservation code (MF-VISC) to simulate clustering and nanofeature evolution, a non-equilibrium thermodynamic code (DIFFG)
provided by Odette and co-workers [5], and a Foreman and Makin type model (DUPAIR) to simulate the shear stress required for dislocation penetration through a slip plane of obstacles. The component codes of RPV-1 are informed by databases of cascade structure, ms cascade aging and the strengths of individual obstacles. RPV-1 includes a user-friendly Python interface and visualization package [107]. Progress on RPV-1 led to a new program to develop RPV-2, aimed at improving the sub-model codes and physics in RPV-1 and extending the hardening model to treat changes in fracture toughness, as well as INTERN-1, a VTR devised to simulate irradiation effects in stainless steels. These new developments are being carried out in a large effort (the PERFECT project) supported by the European Commission in the 6th Framework Program. The REVE project has also been expanded in Europe to model stress corrosion cracking in Zr–Nb alloys for fuel cladding in the ongoing SIRENA project. The broader field of radiation effects is more complex and challenging, and involves more phenomena and properties (e.g., radiation-induced segregation, non-equilibrium phase evolution and microstructure instabilities, and their impact on properties ranging from creep rupture to fatigue crack growth). However, over the longer term, all of these issues can be dealt with in an MSMP framework. Implementation of a fully integrated MSMP model has substantial advantages. These include a direct and rigorous accounting of defect balances and solute redistribution, better treatment of highly coupled processes, such as vacancy trapping and solute RED or RIS, and the inclusion of effects related to evolving sink and trapping microstructures.
Further, an integrated model provides a convenient framework for testing and evaluating the impact of alternative and improved sub-models and a convenient tool for interpreting and analyzing data ranging from nanoscale characterization studies to quantitative statistical fits to engineering data. Steady progress will entail building a knowledge base that is far more accessible and useful (e.g., for design of new materials) than traditional approaches. As an example, a new initiative to simulate the transport, fate and consequences of He in LAMS and advanced high temperature steels has been initiated as a collaboration between the University of California, Santa Barbara, University of California, Berkeley and the Pacific Northwest National Laboratory. The simulations will encompass irradiation conditions pertinent to current experiments in fission reactors and fusion first wall and blanket structures. Similar activities are underway in Europe for simulating fusion materials performance [108]. Finally, we note that the role of advanced computational materials in the development of advanced fission and fusion energy systems was the topic of an international workshop in the spring of 2004 sponsored by the DOE Office of Science and the DOE office of Nuclear Energy and Sciences. In their report [109], a distinguished international panel of experts endorsed a balanced
computational modeling and experimental validation approach to meeting the enormous and indeed unprecedented challenges of developing and predicting the performance of materials in the critical new sources of energy that will serve mankind for millennia.
Acknowledgments
The authors express their appreciation to the large number of people who have contributed to this work. In particular, we thank Drs Gene Lucas, Takuya Yamamoto, Rick Kurtz, Roger Stoller, Steve Zinkle and Randy Nanstad for many helpful discussions. Finally, we gratefully acknowledge the financial support of the US Nuclear Regulatory Commission under contracts #04-94049 and 04-01-064, the Office of Fusion Energy Sciences, US Department of Energy under Grant DE-FG02-04ER54275 at UCSB, and the Office of Fusion Energy Sciences, US Department of Energy under Grant DE-FG02-04ER54750 at UCB.
References
[1] E.P. Wigner, Report for Month Ending December 15, 1942, Physics Division. US Atomic Energy Commission Report CP-387, University of Chicago, 1942. [2] D.R. Olander, Fundamental Aspects of Nuclear Reactor Fuel Elements. U.S. DOE, 1976. [3] J. Gittus, Irradiation Effects in Crystalline Solids. Applied Science Pub. Ltd, London, United Kingdom, 1978. [4] J.T.A. Roberts, Structural Materials in Nuclear Power Systems. Plenum Press, New York, 1981. [5] G.R. Odette, Neutron Irradiation Effects in Reactor Pressure Vessel Steels and Weldments. International Atomic Energy Agency, Vienna, IAEA IWG-LMNPP-98/3, 438, 1998. [6] B.N. Singh, “Impacts of damage production and accumulation on materials performance in irradiation environments,” J. Nucl. Mater., 258–263, 18, 1998. [7] G.R. Odette, B.D. Wirth, D.J. Bacon, and N.M. Ghoniem, “Multiscale-multiphysics modeling of radiation-damaged materials: embrittlement of pressure vessel steels,” MRS Bull., 26, 176, 2001. [8] G.R. Odette, Nuclear Reactors: Pressure Vessel Steels. Encyclopedia of Materials: Science and Technology, Elsevier Science Ltd., Amsterdam, 2001. [9] B.N. Singh, N.M. Ghoniem, and H. Trinkaus, “Experiment-based modeling of hardening and localized plasticity in metals irradiated under cascade damage conditions,” J. Nucl. Mater., 307–311, 159, 2002. [10] D.J. Bacon and Y.N. Osetsky, “Multiscale modeling of radiation damage in metals: from defect generation to material properties,” Mater. Sci. Eng. A, 365, 46, 2004.
[11] G.R. Odette, T. Yamamoto, H.J. Rathbun, M.Y. He, M.L. Hribernik, and J.W. Rensman, “Cleavage fracture and irradiation embrittlement of fusion reactor alloys: mechanisms, multiscale models, toughness measurements and implications to structural integrity assessment,” J. Nucl. Mater., 323, 313, 2003. [12] A.D. Brailsford and R. Bullough, “The rate theory of swelling due to void growth in irradiated metals,” J. Nucl. Mater., 44, 121, 1972. [13] G.R. Odette and G.E. Lucas, “Recent progress in understanding reactor pressure vessel steel embrittlement,” Rad. Effects Defects Solids, 144, 189, 1998. [14] S. Jumel, C. Domain, J. Ruste, J.-C. Van Duysen, C. Becquart, A. Legris, P. Pareige, A. Barbu, E. Van Walle, R. Chaouadi, M. Hou, G.R. Odette, R.E. Stoller, and B.D. Wirth, J. Test. Eval., 30, 37, 2002. [15] E.D. Eason, J.E. Wright, and G.R. Odette, Improved Embrittlement Correlations for Reactor Pressure Vessel Steels. NUREG/CR-6551, 1998. [16] T.J. Williams and D. Ellis, Effects of Radiation on Materials: 20th International Symposium, ASTM STP 1405, S.T. Rosinski et al. (eds.), American Society for Testing and Materials, West Conshohocken, PA, p. 8, 2001. [17] G.R. Odette and G.E. Lucas, “Embrittlement of nuclear reactor pressure vessels,” J. Metals, 53, 18, 2001. [18] G.R. Odette, T. Yamamoto, and D. Klingensmith, “On the effect of dose rate on irradiation hardening of RPV steels,” Phil. Mag., in press, 2005. [19] G.R. Odette, “On the dominant mechanism of irradiation embrittlement of reactor pressure vessel steels,” Scripta Met., 17, 1183, 1983. [20] G.R. Odette, “Radiation induced microstructural evolution in reactor pressure vessel steels,” Mater. Res. Soc. Symp. Proc., 373, 137, 1995. [21] G.R. Odette and B.D. Wirth, “A computational microscopy study of nanostructural evolution in irradiated pressure vessel steels,” J. Nucl. Mater., 251, 157, 1997. [22] G.R. Odette, M.K. Miller, K.F. Russell, and B.D. 
Wirth, “Precipitation in neutron irradiated copper free RPV steels,” J. Nucl. Mater., submitted, 2004. [23] S.J. Zinkle and N.M. Ghoniem, “Operating temperature windows for fusion reactor structural materials,” Fusion Eng. Des., 51–52, 55, 2000. [24] K. Ehrlich, “Materials research towards a fusion reactor,” Fusion Eng. Des., 56–57, 71, 2001. [25] E.E. Bloom, S.J. Zinkle, and F.W. Wiffen, “Materials to deliver the promise of fusion power – progress and challenges,” J. Nucl. Mater., 329–333, 12, 2004. [26] S. Jitsukawa, A. Kimura, A. Kohyama, R.L. Klueh, A.A. Tavassoli, B. van der Schaaf, G.R. Odette, J.W. Rensman, M. Victoria, and C. Petersen, “Recent results of the reduced activation ferritic/martensitic steel development,” J. Nucl. Mater., 329– 333, 39, 2004. [27] A. Kimura, M. Narui, and H. Kayano, “Effects of alloying elements on the postirradiation microstructure of 9-percent Cr 2-percent W low activation martensitic steel,” J. Nucl. Mater., 191, 879, 1992. [28] F.A. Garner, M.B. Toloczko and B.H. Sencer, “Comparison of swelling and irradiation creep behavior of FCC-austenitic and BCC-ferritic/martensitic alloys at high neutron exposure,” J. Nucl. Mater., 276, 123, 2000. [29] N. Hashimoto, S.J. Zinkle, R.L. Klueh, A.F. Rowcliffe, and K. Shiba, “Deformation mechanisms in ferritic/martensitic steels irradiated in HFIR,” Mater. Res. Soc. Proc., 650, R1.10.1, 2001. [30] G.R. Odette, T. Yamamoto, and H. Kishimoto, “An analysis of the effects of helium on fast fracture and embrittlement of 8Cr tempered martensitic steels,” Fusion Materials Semi-Annual Progress Report, DOE/ER-0313/35, 80, 2003.
[31] G.R. Odette, “On mechanisms controlling swelling in ferritic and martensitic alloys,” J. Nucl. Mater., 155–157, 921, 1988. [32] B. van der Schaaf, D.S. Gelles, S. Jitsukawa, A. Kimura, R.L. Klueh, A. Moslang, and G.R. Odette, “Progress and critical issues of reduced activation ferritic/martensitic steel development,” J. Nucl. Mater., 283–287, 52, 2000. [33] T. Yamamoto, G.R. Odette, H. Kishimoto, and J.W. Rensman, “Compilation and preliminary analysis of an irradiation hardening and embrittlement database for 8Cr martensitic steels,” Fusion Materials Semi-Annual Progress Report, DOE/ER0313/35, 100, 2003. [34] H. Trinkaus and H. Ullmaier, “High temperature embrittlement of metals due to helium: is the lifetime dominated by cavity growth or crack growth?” J. Nucl. Mater., 212–215, 303, 1994. [35] A. Kimura, R. Kasada, K. Morishita, R. Sugano, A. Hasegawa, K. Abe, T. Yamamoto, H. Matsui, N. Yoshida, B.D. Wirth, and T. Diaz de la Rubia, “High resistance to helium embrittlement in reduced activation martensitic steels,” J. Nucl. Mater., 307–311, 521, 2002. [36] E.E. Bloom, “The challenge of developing structural materials for fusion power systems,” J. Nucl. Mater., 258–263, 7, 1998. [37] G.R. Odette and T. Yamamoto, “A Helium injector concept for irradiating fusion reactor materials at representative He/dpa ratios,” Fusion Materials Semi-Annual Progress Report, DOE/ER-0313/37, 2005. [38] G.E. Lucas, “The evolution of mechanical property change in irradiated austenitic steels,” J. Nucl. Mater., 206, 287, 1993. [39] B.N. Singh, A.J.E. Foreman, and H. Trinkaus, “Radiation hardening revisited: role of intracascade clustering,” J. Nucl. Mater., 249, 103, 1997. [40] R.O. Ritchie, J.F. Knott, and J.R. Rice, “On the relationship between critical tensile stress and fracture toughness in mild steel,” J. Mech. Phys. Solids, 21m, 395, 1973. [41] G.R. Odette and M.Y. He, “A cleavage toughness master curve model,” J. Nucl. Mater., 283–287, 120, 2000. [42] D. Rodney and G. 
Martin, “Dislocation pinning by glissile interstitial loops in a nickel crystal: a molecular-dynamics study,” Phys. Rev. B, 61, 8714, 2000. [43] Y.N. Osetsky and D.J. Bacon, “An atomic-level model for studying the dynamics of edge dislocations in metals,” Model. Simul. Mater. Sci. Eng., 11, 427, 2003. [44] D. Rodney, “Molecular dynamics simulation of screw dislocations interacting with interstitial frank loops in a model FCC crystal,” Acta Mater., 52, 607, 2004. [45] B.D. Wirth, V.V. Bulatov and T. Diaz de la Rubia, J. Eng. Mater. Tech., 124, 329, 2002. [46] A.J.E. Foreman and M.J. Makin, “Dislocation movement through random arrays of obstacles,” Can. J. Phys., 45, 511, 1967. [47] Y. Xiang, D.J. Srolovitz, L.-T. Cheng, and E. Weinan, “Level set simulations of dislocation-particle bypass mechanisms,” Acta Mater., 52, 1745, 2004. [48] V.V. Bulatov, “Current developments and trends in dislocation dynamics,” J. Computer-Aid. Mater. Des., 9, 133, 2002. [49] T.A. Khraishi, H.M. Zbib, T.D. De La Rubia, and M. Victoria, “Localized deformation and hardening in irradiated metals: three-dimensional discrete dislocation dynamics simulations,” Metal. Mater. Trans. B, 33B, 285, 2002. [50] X. Han, N.M. Ghoniem, and Z. Wang, “Parametric dislocation dynamics of anisotropic crystals,” Phil. Mag., 83, 3705, 2003. [51] L.R. Greenwood and R.K. Smither, SPECTER: Neutron Damage Calculations for Materials Irradiations, ANL/FPP-TM-197, 1985.
[52] R.E. Stoller and L.R. Greenwood, “Subcascade formation in displacement cascade simulations: implications for fusion reactor materials,” J. Nucl. Mater., 271–272, 57, 1999. [53] J.A. Brinkman, J. Appl. Phys., 25, 961, 1954. [54] A. Seeger, Proceedings of the Second UN International Conference on Peaceful Uses of Atomic Energy, Geneva, vol. 6, United Nations, New York, 20, 1958. [55] A.F. Calder and D.J. Bacon, “A molecular dynamics study of displacement cascades in alpha-iron,” J. Nucl. Mater., 207, 25, 1993. [56] R.E. Stoller, G.R. Odette, and B.D. Wirth, “Primary damage formation in BCC iron,” J. Nucl. Mater., 251, 49, 1997. [57] R.S. Averback and T. Diaz de la Rubia, “Displacement damage in irradiated metals and semi-conductors,” Solid State Phys., 51, 281, 1998. [58] R.E. Stoller, “The role of cascade energy and temperature in primary defect formation in iron,” J. Nucl. Mater., 276, 22, 2000. [59] C.S. Becquart, A. Souidi, and M. Hou, “Relation between the interaction potential, replacement collision sequences, and collision cascade expansion in iron,” Phys. Rev. B, 66, 134104, 2002. [60] R.E. Stoller and G.R. Odette, “Recommendations on damage exposure units for ferritic steel embrittlement correlations,” J. Nucl. Mater., 186, 203, 1992. [61] S. Jumel and J.C. Van-Duysen, “INCAS: an analytical model to describe displacement cascades,” J. Nucl. Mater., 328, 151, 2004a. [62] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, “Energetics of formation and migration of self-interstitials and self-interstitial clusters in α-iron,” J. Nucl. Mater., 244, 185, 1997. [63] N. Soneda and T. Diaz de la Rubia, “Defect production, annealing kinetics and damage evolution in a-Fe: an atomic-scale computer simulation,” Phil. Mag., A, 78, 995, 1998. [64] Y.N. Osetsky, D.J. Bacon, A. Serra, B.N. Singh, and S.I.Y. Golubov, “Stability and mobility of defect clusters and dislocation loops in metals,” J. Nucl. Mater., 276, 65, 2000. [65] J. Marian, B.D. Wirth, J.M. 
Perlado, G.R. Odette, and T. Diaz de la Rubia, “Dynamics of self-interstitial migration in Fe–Cu alloys,” Phys. Rev. B, 64, 094303, 2001. [66] J. Marian, B.D. Wirth, A. Caro, B. Sadigh, G.R. Odette, J.M. Perlado, and T. Diaz de la Rubia, “Dynamics of self-interstitial cluster migration in pure α-Fe and Fe–Cu alloys,” Phys. Rev. B, 65, 144102, 2002. [67] P. Ehrhart, K.H. Robrock, and H.R. Schober, In: R.A. Johnson and A.N. Orlov (eds.), Physics of Radiation Effects in Crystals, Elsevier, Amsterdam, Netherlands, 63, 1986. [68] C. Domain and C.S. Becquart, “Ab initio calculations of defects in Fe and dilute Fe–Cu alloys,” Phys. Rev. B, 65, 024103, 2002. [69] C.-C. Fu, F. Willaime, and P. Ordejon, “Stability and mobility of mono- and di-interstitials in a-Fe,” Phys. Rev. Lett., 92, 175503, 2004. [70] M.W. Finnis and J.E. Sinclair, “A simple empirical N-body potential for transition metals,” Phil. Mag. A, 50, 45, 1984. [71] M. Daw and M. Baskes, “Embedded-atom method: derivation and application to impurities, surfaces and other defects in metals,” Phys. Rev. B, 29, 6443, 1984. [72] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, “Dislocation loop structure, energy and mobility of self-interstitial clusters, in BCC iron,” J. Nucl. Mater., 276, 33, 2000.
[73] N. Soneda and T. Diaz de la Rubia, “Migration kinetics of the self-interstitial atom and its clusters in bcc Fe,” Phil. Mag. A, 81, 331, 2001. [74] Y.N. Osetsky, D.J. Bacon, A. Serra, B.N. Singh, and S.I. Golubov, “One-dimensional atomic transport by clusters of self-interstitial atoms in iron and copper,” Phil. Mag., 83, 61, 2003. [75] Y.N. Osetsky, D.J. Bacon, and A. Serra, “Atomistic simulation of mobile defect clusters in metals,” Mater. Res. Soc. Symp., 540, 649, 1999. [76] H. Trinkaus, B.N. Singh, and S.I. Golubov, “Progress in modelling the microstructural evolution in metals under cascade damage conditions,” J. Nucl. Mater., 283– 287, 89, 2000. [77] Y.N. Osetsky, personal communication, 2004. [78] B.D. Wirth, G.R. Odette, J. Marian, L. Ventelon, J.A. Young-Vandersall, and L.A. Zepeda-Ruiz, “Multiscale modeling of radiation damage in Fe-based alloys in the fusion environment,” J. Nucl. Mater., 329–333, 103, 2004. [79] H.L. Heinisch and B.N. Singh, “Stochastic annealing simulation of intracascade defect interactions,” J. Nucl. Mater., 251, 77, 1997. [80] B.D. Wirth, G.R. Odette, and R.E. Stoller, “Recent progress toward an integrated multiscale–multiphysics model of reactor pressure vessel embrittlement,” MRS Soc. Symp. Proc., 677, AA5.2, 2001. [81] C. Domain, C.S. Becquart, and L. Malerba, “Simulation of radiation damage in Fe alloys: an object kinetic Monte Carlo approach,” J. Nucl. Mater., 335, 121, 2004. [82] B.K.P. Chang and B.D. Wirth, “Monte Carlo simulation of point defect recombination during the initial stages of cascade aging in Fe,” J. Nucl. Mater., in preparation, 2005. [83] B.D. Wirth and G.R. Odette, “Kinetic lattice Monte Carlo simulations of cascade aging in iron and dilute iron–copper alloys,” MRS Soc. Symp. Proc., 540, 637, 1999. [84] C. Domain, C.S. Becquart, and J.C. Van-Duysen, “Kinetic Monte Carlo simulations of FeCu alloys,” MRS Soc. Symp. Proc., 540, 643, 1999. [85] N. Soneda, S. Ishino, A. Takahashi, and K. 
Dohi, “Modeling the microstructural evolution in bcc-Fe during irradiation using kinetic Monte Carlo computer simulation,” J. Nucl. Mater., 323, 169, 2003. [86] B.D. Wirth and G.R. Odette, MRS Soc. Symp. Proc., 540, 637, 1999. [87] C. Domain, C.S. Becquart, J.C. Van Duysen, MRS Soc. Symp. Proc., 540, 643, 1999. [88] C. Buzano and M. Pretti, “Cluster variation approach to the Ising square lattice with two- and four-spin interactions,” Phys. Rev. B, 56, 636, 1997. [89] F. Soisson, A. Barbu, and G. Martin, “Monte Carlo simulations of copper precipitation in dilute iron–copper alloys during thermal ageing and under electron irradiation,” Acta Mater., 44, 3789, 1996. [90] M. Athenes, P. Bellon, and G. Martin, “Identification of novel diffusion cycles in B2 ordered phases by Monte Carlo simulation,” Phil. Mag. A, 76, 565, 1997. [91] S. Delage, B. Legrand, F. Soisson, and A. Saul, “Dissolution modes of Fe/Cu and Cu/Fe deposits,” Phys. Rev. B, 58, 15810, 1998. [92] T.T. Rautiainen and A.P. Sutton, “Influence of the atomic diffusion mechanism on morphologies, kinetics, and the mechanisms of coarsening during phase separation,” Phys. Rev. B, 59, 13681, 1999. [93] B.D. Wirth, G.R. Odette, P. Asoka-Kumar, R.H. Howell, and P.A. Sterne, “Characterization of nanostructural features in irradiated reactor pressure vessel model alloys,” In: G.S. Was (ed.), Proceedings of the 10th International Symposium on Environmental Degradation of Materials in Light Water Reactors, National Association of Corrosion Engineers, 2002.
[94] J. Dalla Torre, J.L. Bocquet, N.V. Doan, and E. Adam, “Jerk, an event-based Kinetic Monte Carlo model to predict microstructure evolution of materials under irradiation,” Phil. Mag., in press, 2004. [95] C.-L. Liu, G.R. Odette, B.D. Wirth, and G.E. Lucas, “A LMC simulation of nanophase compositions and structures in irradiated pressure vessel Fe–Cu–Ni–Mn– Si steels,” Mater. Sci. Eng. A, 238, 202, 1997. [96] A. Moslang, E. Albert, E. Recknagel, A. Weidinger, and P. Moser, “Interaction of vacancies with impurities in iron,” Hyperfine Interact., 15, 409, 1983. [97] D.A. Porter and K.E. Easterling, Phase Transformations in Metals and Alloys, Van Nostrand Reinhold, Thetford, Great Britain, 1986. [98] E.D. Eason, J.E. Wright, G.R. Odette, and E. Mader, Models for Embrittlement Recovery Due to Annealing of Reactor Pressure Vessel Steels, NUREG/CR-6327, 1995. [99] B.C. Masters “Dislocation loops in irradiated iron,” Phil. Mag., 11, 881, 1965. [100] B.L. Eyre and A.F. Bartlett, “An electron microscope study of neutron irradiation damage in alpha-iron,” Phil. Mag., 11, 261, 1965. [101] A.C. Nicol, M.L. Jenkins, and M.A. Kirk, “Matrix damage in iron,” Mater. Res. Soc. Symp., 650, R1.3, 2001. [102] J. Marian, B.D. Wirth, and J.M. Perlado, “On the mechanism of formation and growth of 100 interstitial loops in ferritic materials,” Phys. Rev. Lett., 88, 255507, 2002. [103] B.L. Eyre and R. Bullough “On the formation of interstitial loops in b.c.c. metals,” Phil. Mag., 12, 31, 1965. [104] W. Carrington, K.F. Hale, and D. McLean, “Arrangement of dislocations in iron,” Proc. R. Soc. Lond. A, 259, 203, 1960. [105] G.R. Odette, E.V. Mader, G.E. Lucas, W.J. Phythian, and C.A. English, “The Effect of Flux on the Irradiation Hardening of Pressure Vessel Steels,” In: A.S. Kumar, D.S. Gelles, R.K. Nanstad, and E.A. Little (eds.), Effects of Radiation on Materials: 16th International Symposium, ASTM-STP-1175, American Society for Testing and Materials, Philadelphia, PA, 373, 1993. 
[106] S. Jumel, C. Domain, J. Ruste, J.C. Van-Duysen, C. Becquart, A. Legris, P. Pareige, A. Barbu, E. Van Walle, R. Chaouadi, M. Hou, G.R. Odette, R.E. Stoller, and B.D. Wirth, “Simulation of Irradiation Effects in Reactor Pressure Vessel Steels: the reactor for Virtual Experiments (REVE) Project,” J. Test. Eval., 30, 37, 2002. [107] S. Jumel and J.C. Van-Duysen, “RPV-1: a first virtual reactor to simulate irradiation effects in light water reactor pressure vessel steels,” J. Nucl. Mater., submitted for publication, 2005. [108] M. Victoria and G. Martin, personal communication, 2004. [109] R.E. Stoller, et al., DOE Workshop on Advanced Computational Materials Science: Application to Fusion and Generation IV Fission Reactors, Washington, D.C., 31 March–2 April 2004, ORNL/TM-2004/132, 2004.
2.30 TEXTURE EVOLUTION DURING THIN FILM DEPOSITION
Hanchen Huang
Department of Mechanical, Aerospace and Nuclear Engineering, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180-3590, USA
The modeling of materials processing intrinsically spans multiple scales, in terms of both space and time. The modeling of thin film deposition, together with the accompanying texture evolution, spans 15 orders of magnitude in time, from the fundamental atomic vibration period of about 10^-13 s to a deposition duration of about 10^2 s. This section describes challenging issues, critically presents existing approaches, and offers an outlook on future developments in the modeling of thin film texture evolution.
1. Thin Film Deposition and Texture Evolution
Human beings have processed thin films for thousands of years, aiming to improve the quality of life. Cars with advanced electronics improve the quality of life, and so do artistic paintings. As a specific example, thin films are applied as metal conductors in integrated circuits (ICs). The performance of these thin films depends on their texture, which refers to the alignment of grain orientations. If most Cu grains in a thin film have their ⟨111⟩ direction along the surface normal, the film has an out-of-plane texture of ⟨111⟩. When this texture dominates, the metal conductors in ICs last longer, and therefore the lifetime of computers and other IC-based equipment increases. In addition to its technological importance, texture evolution is also scientifically challenging to materials scientists and solid-state physicists.

To model the texture evolution during thin film deposition, let us first examine the relevant processes. Although deposition techniques vary, the addition of atoms or molecules to an evolving solid surface is common to all of them. Different deposition techniques differ primarily in their sources of atoms or molecules. To facilitate the presentation, let us take physical vapor deposition as the prototype deposition technique; see Powell and Rossnagel [1] for a comprehensive review of this technique.

S. Yip (ed.), Handbook of Materials Modeling, 1039–1049. © 2005 Springer. Printed in the Netherlands.

The texture evolution is the result of complex atomic activities across 15 orders of magnitude in time. Three distinct time scales are identifiable in the texture evolution process. The first time scale characterizes the initial incorporation of atoms on the surface of the film or substrate, as shown in Fig. 1(a). The atoms may come from various sources, such as sputtered targets or evaporated filaments. These atoms carry kinetic energy, and their binding with the surface leads to additional energy release. These energies, together with the momenta of the atoms, cause local atoms to rearrange. The time scale of initial atomic incorporation is dictated by the intrinsic atomic vibration period, about 10^-13 s. The next time scale characterizes the atomic diffusion or mass transport for clustering, as shown in Fig. 1(b). Atoms with less than perfect coordination, such as an atom having fewer than 12 nearest neighbors in close-packed Cu, tend to diffuse. The time scale of atomic diffusion depends on the activation energy and the local temperature, and varies over a wide range. It is reasonable to associate a diffusion event with nanoseconds. As clusters grow and merge, a polycrystalline thin film forms (Fig. 1(c)). The third time scale characterizes the motion of grain boundaries in the film, and is on the order of seconds. As an estimate, a grain boundary may migrate 10–100 nm over the entire deposition period, say 100 s. That is, the migration of one atomic layer takes about one second. The entire texture evolution process, from initial atomic incorporation to completion of deposition, spans 15 orders of magnitude in time, from 10^-13 to 10^2 s. It is worth mentioning the spatial scale, in addition to the time scale, for completeness.
In contrast to the 15 orders of magnitude in time scale, the spatial scale spans only a few orders of magnitude. The smallest spatial scale, the atomic size, is a fraction of a nanometer. The typical grain size or thickness of thin films is on the order of 100–1000 nm. The narrow span of 10^-1 to 10^3 nm justifies the emphasis on time scale in this section.
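The scale estimates above are simple enough to check by hand, but a short script makes the arithmetic explicit. The following is a minimal sketch, not part of the original chapter: the function names are hypothetical, and the atomic layer spacing of 0.2 nm is an assumed, typical value used only to turn the quoted migration distances into a time per layer.

```python
import math

# Characteristic values quoted in the text.
VIBRATION_PERIOD_S = 1e-13   # intrinsic atomic vibration period
DEPOSITION_TIME_S = 1e2      # duration of a deposition run
ATOMIC_SIZE_NM = 1e-1        # smallest spatial scale
FILM_SCALE_NM = 1e3          # upper end of grain size / film thickness


def orders_of_magnitude(small, large):
    """Number of decades separating two positive quantities."""
    return math.log10(large / small)


def seconds_per_layer(migration_nm, duration_s, layer_spacing_nm=0.2):
    """Time for a grain boundary to sweep one atomic layer.

    layer_spacing_nm is an assumption (a typical interplanar spacing),
    not a value given in the text.
    """
    layers = migration_nm / layer_spacing_nm
    return duration_s / layers


print("decades in time :", orders_of_magnitude(VIBRATION_PERIOD_S, DEPOSITION_TIME_S))
print("decades in space:", orders_of_magnitude(ATOMIC_SIZE_NM, FILM_SCALE_NM))
for distance_nm in (10.0, 100.0):
    print(distance_nm, "nm in 100 s ->", seconds_per_layer(distance_nm, 100.0), "s per layer")
```

With the assumed spacing, migration of 10 to 100 nm in 100 s corresponds to roughly 2 s to 0.2 s per atomic layer, which is consistent with the order-of-magnitude statement of about one second per layer.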
Figure 1. Texture evolution during physical vapor deposition of thin films, starting from a substrate (a), to islands or grain nuclei (b), and to polycrystalline thin film (c). The spheres represent atoms, and the lines delineate grains.
2. Models of Texture Evolution
The large span of time scales, as discussed in the previous part, poses the biggest challenge to any model of texture evolution. The texture evolution lasts over the entire period of the deposition process, about 10^2 s. On the other hand, the atomic processes that dictate the evolution occur over the intrinsic time scale of 10^-13 s. It is impossible for a brute-force model to span the entire 15 orders of magnitude in time scale. Various models have emerged with increasing degree of rigor, as knowledge and computational power have built up. In terms of the time scale, three modeling approaches exist. The first approach focuses on the macroscopic time scale and ignores the details of atomic vibration and atomic clustering. The second focuses on the details of atomic vibration and atomic clustering, by artificially speeding up the deposition process. The third incorporates the details of atomic vibration into atomic diffusion and atomic clustering in an effective manner over the macroscopic time scale. The following presentation elaborates on each of the three modeling approaches.

The first modeling approach represents a polycrystalline thin film as a continuum [2, 3] and neglects the details of atomic motion, as shown in Fig. 2. The interior of each grain is a continuum, and the grain boundaries define the size and shape of each grain. Within this continuum approach, a series of meshing points, together with interpolation between them, fully represents the grain boundaries. Positions and velocities of the meshing points characterize the motion of grain boundaries. Each meshing point advances with a velocity according to the local driving force. One of the most common driving forces is the grain boundary curvature. A larger curvature corresponds to a more curved grain boundary, and a larger grain boundary area. For a given energy per unit area of grain boundary, the total energy goes up with the grain boundary curvature. Energy minimization drives the reduction of the curvature, and thereby the motion of grain boundaries.

Figure 2. Schematic of continuum model of texture evolution, from initial (a) to final (b) texture.

Relevant to texture evolution, the driving forces also include film surface energy, strain energy, and film-substrate interface energy. In a strain-free Cu polycrystalline thin film on an amorphous substrate, the minimization of surface and grain boundary energies drives the texture evolution. The minimization of the surface energy favors grains having {111} surfaces, leading to the ⟨111⟩ texture. The minimization of grain boundary energy favors grain boundaries of smaller curvature, if differences among grain boundaries in energy per unit area are ignored. This leads to grain coarsening. Meshing points at grain boundaries, under these two driving forces, migrate toward non-⟨111⟩ grains or toward smaller grains. The speed of the migration depends on two physical quantities: the driving force and the migration barrier. The driving force determines the direction of migration, or the sign of the velocity. However, its effect on the magnitude of the velocity (the speed) is approximately linear and limited. The speed is an exponential function of the migration barrier in Arrhenius form. The grain coarsening is demonstrated in Figs. 2(a) and (b).

One extension of this continuum approach is the inclusion of deposition, in addition to the annealing process [4]. The other extension is the incorporation of atomistic mechanisms, such as grain rotation, in modeling the texture evolution [5]. The continuum approach is capable of tracking texture evolution on the laboratory time scale of seconds. This advantage comes at the expense of neglecting the details of atomic vibration and atomic clustering. Another limitation is the need for an initial grain distribution, such as the one shown in Fig. 2(a).
Before proceeding to the second modeling approach, it is worth noting that the Potts model [2] may be considered as within the continuum approach. Although each grain appears in the form of discrete blocks of material, the principles of grain boundary motion in the Potts model are similar to those in the continuum approach.

In contrast to the continuum approach, the second modeling approach is atomistic and based on the molecular dynamics method; Li discusses this method in detail in this handbook. Thin films and the corresponding substrates consist of atoms. The atomic positions and their relative arrangements naturally outline the texture of thin films. When atoms are packed in crystalline order, they represent the interior of a grain of a specific orientation. Meanwhile, atoms in noncrystalline order belong to grain boundaries. The motion of grain boundaries, and thereby the texture evolution, is a natural result of atomic activities. Each atom interacts with its neighbors according to a prescribed interatomic potential. The force on the atom determines its acceleration, and hence its dynamics, according to Newton’s second law. In principle, one may start from an amorphous substrate and track the grain nucleation and texture evolution during deposition. However, the grain nucleation on amorphous substrates
is difficult to model; this issue will be elaborated in the Outlook part. Instead, one generally has to start from bi-crystalline or polycrystalline substrates. The molecular dynamics based approach allows atomistic studies of texture competition as a function of various deposition parameters, such as the kinetic energy of incoming atoms [6]. The details of atomic vibration and atomic clustering are natural outputs of the molecular dynamics based approach, but they come at a price. To track atomic vibrations, the numerical time step is usually 10⁻¹⁵ s. For millions of numerical steps, the total simulated time is only on the order of nanoseconds. Consequently, the deposition rate has to be on the order of several atomic layers per nanosecond, or ∼1 m/s, which is nine orders of magnitude higher than realistic deposition rates. Usually, one has to compensate for this artificially high deposition rate with a high temperature in order to ensure enough atomic diffusion. The hyper molecular dynamics method [7] extends the time scale by several orders of magnitude, and therefore helps reduce the deposition rate. This method is most effective when kinetic processes are not too complex, or when the potential energy surfaces of atomic migration are simple. During thin film deposition, it is common for surface atoms to form complex configurations, rendering hyper molecular dynamics less effective. So far, the molecular dynamics simulations of texture evolution remain in two dimensions. Extension to three dimensions is becoming feasible with the ever-increasing computational capacity. In addition to the computational constraints, physical approximations deserve full appreciation as well. The molecular dynamics method does not explicitly treat electrons and relies on an interatomic potential to effectively represent the electronic effects.
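The time-scale argument above is simple arithmetic, and a short Python sketch makes the orders of magnitude explicit. The time step and the figure of several atomic layers per nanosecond come from the text; the step count, the layer thickness (0.2 nm), the choice of five layers per nanosecond, and the laboratory rate (about 1 nm/s) are illustrative assumptions.

```python
import math

time_step_s = 1e-15            # MD time step quoted in the text
n_steps = 3_000_000            # "millions of steps" (illustrative choice)
simulated_time_s = time_step_s * n_steps

layer_thickness_m = 0.2e-9     # assumed thickness of one atomic layer
layers_per_ns = 5              # "several atomic layers per nanosecond" (assumed)
md_rate_m_per_s = layers_per_ns * layer_thickness_m / 1e-9

lab_rate_m_per_s = 1e-9        # assumed laboratory deposition rate, ~1 nm/s
orders_of_magnitude = math.log10(md_rate_m_per_s / lab_rate_m_per_s)

print("simulated time (s):", simulated_time_s)
print("MD deposition rate (m/s):", md_rate_m_per_s)
print("orders of magnitude above the assumed laboratory rate:", orders_of_magnitude)
```

Under these assumptions the simulated time is a few nanoseconds and the deposition rate is about 1 m/s, roughly nine orders of magnitude above the assumed laboratory rate.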
The effective treatment by an interatomic potential assumes that electrons redistribute in a particular fashion according to a given function of atomic configurations, and that the redistribution is instantaneous (the Born–Oppenheimer approximation). In general, the effective treatment leads to correct crystal structures, but may fail in quantitative predictions of atomic energetics and their effects. The third modeling approach is atomistic and based on the Monte Carlo method; Gilmer discusses this method in detail in this handbook. In the Monte Carlo method, atomic motions are governed not by Newton’s equations but by atomic energetics and the corresponding Boltzmann factor. One variation of this approach leads to results similar to those of the molecular dynamics method. According to this variation, atoms may occupy any point in space. Mapping the continuous space takes much computational effort. In two dimensions, this variation is realizable using effective particles [8]. Each effective particle represents a cluster of atoms. The particles move around in continuous space during deposition. Their motion on the surface corresponds to the effective diffusion of adatoms. Domains of various orientations are formed during two-dimensional simulations (Fig. 3). These domains may look like grains, but they are not grains. Extension of this method to three dimensions results in prohibitive computational cost [9].
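The role of the Boltzmann factor in the continuous-space variation can be illustrated with a minimal Python sketch of a Metropolis trial move in two dimensions. This is a generic textbook scheme, not the effective-particle model of [8]; the Lennard-Jones pair interaction, its parameters, and all function names are assumptions introduced only for illustration.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def pair_energy(r, epsilon=0.1, sigma=1.0):
    """Lennard-Jones pair energy (a placeholder interaction, assumed)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def particle_energy(i, positions):
    """Sum of pair energies between particle i and all other particles."""
    xi, yi = positions[i]
    total = 0.0
    for j, (xj, yj) in enumerate(positions):
        if j != i:
            total += pair_energy(math.hypot(xi - xj, yi - yj))
    return total


def metropolis_step(positions, temperature_k, max_move=0.1, rng=random):
    """Try one random displacement; accept it with the Boltzmann factor."""
    i = rng.randrange(len(positions))
    old = positions[i]
    e_old = particle_energy(i, positions)
    positions[i] = (old[0] + rng.uniform(-max_move, max_move),
                    old[1] + rng.uniform(-max_move, max_move))
    d_e = particle_energy(i, positions) - e_old
    if d_e <= 0.0 or rng.random() < math.exp(-d_e / (K_B * temperature_k)):
        return True           # move accepted
    positions[i] = old        # move rejected: restore the old position
    return False


rng = random.Random(1)
spacing = 2 ** (1 / 6)        # pair-energy minimum for sigma = 1
positions = [(i * spacing, j * spacing) for i in range(3) for j in range(3)]
accepted = sum(metropolis_step(positions, 300.0, rng=rng) for _ in range(200))
print(accepted, "of 200 trial moves accepted")
```

Because every trial move requires energies evaluated at arbitrary positions, the cost grows quickly with system size, which is consistent with the computational burden noted above for continuous-space simulations.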
Figure 3. Schematic of domain formation in two dimensions.
Figure 4. Schematic of texture evolution in the Monte Carlo based approach, starting from an amorphous substrate (a), to grain nuclei (b), and to a polycrystalline thin film (c).
The other variation of the Monte Carlo based approach employs lattices. As in other Monte Carlo methods, atomic energetics govern the atomic motion according to the Boltzmann factor. These energetics come from classical molecular dynamics simulations, ab initio calculations, and experimental measurements. The core of this variation is the lattice kinetic Monte Carlo method. According to this method, atoms occupy only lattice sites, and each lattice represents one grain of a specific orientation. As shown in Fig. 4(a), an amorphous substrate consists of atoms in different lattices (indicated by different gray scales). Starting from this substrate, an incoming atom may choose to align with any of the substrate atoms with which it becomes a nearest neighbor. At the same time, atoms may also diffuse around and form grain nuclei (Fig. 4(b)). As more atoms attach to the nuclei, they grow and impinge, forming a polycrystalline thin film (Fig. 4(c)). This lattice kinetic Monte Carlo based approach enables simulations of thin film deposition over long time scales, up to 10² s or more. At the same time, by taking inputs from detailed molecular dynamics and ab initio studies of atomic vibrations, this method also effectively accounts for atomic motions on finer time scales. Further, the atomic energetics can be represented more accurately than in molecular dynamics simulations, because they may come from ab initio calculations and
experiments. The studies of atomic energetics are a continuing endeavor of knowledge accumulation. They include both the determination of conventional atomic energetics such as surface diffusion barriers and the identification of novel atomic mechanisms [10, 11]. These advantages of better energetics representation and longer simulation times are realizable if multiple lattices can be used in the Monte Carlo method. In contrast to the studies of atomic energetics, the use of multiple lattices is much more challenging. The challenges are twofold. First, an atom of one grain and atoms of other grains must not occupy the same spatial site. Avoiding multiple occupancy is possible but numerically intensive, because of the necessity of examining all grains for each atom. Second, the direct use of multiple lattices costs too much computer memory. Should one directly use 1000 lattices to represent 1000 grains, the number of atoms that a computer is capable of simulating will be reduced by a factor of 1000. For a single lattice, a present-day computer is capable of simulating a billion atoms, or films with a linear dimension of ∼250 nm. The direct use of multiple lattices will reduce this dimension to only 25 nm. Parallel computations are not effective in increasing this dimension, because of the predominantly integer operations [12]. In view of these two challenges, it is advantageous to represent multiple lattices by a single lattice. The single lattice serves as a reference in space. A lattice of arbitrary orientation can be transformed to the reference lattice through three independent rotations. A one-to-one relationship between sites of this lattice and the reference lattice exists, since both lattices have the same site density. In other words, a lattice of arbitrary orientation can be mapped onto the reference lattice. In the single reference lattice, one readily knows whether a site is occupied.
Therefore, the mapping solves the first problem, the possible multiple occupancy. However, the direct mapping in three dimensions results in the same requirement of computer memory as the direct use of multiple lattices. The alternative to the direct three-dimensional mapping is the use of three consecutive two-dimensional mappings. In the three-dimensional mapping of N lattices each having linear dimension L, the number of integers stored will be on the order of NL³. On the other hand, the three consecutive two-dimensional mappings require memory storage of only order 3NL². For a linear dimension of L = 250 nm (or 1000 atomic diameters), the use of three consecutive two-dimensional mappings reduces the memory requirement by a factor of about 300. Two variations of this mapping concept have been implemented. The first implementation incorporates multiple lattices, and is in two dimensions [13]. This implementation enables studies of competition among multiple textures in two-dimensional space. The second implementation incorporates only two lattices, and is in three dimensions [14]. This implementation is based on the mapping of face-centered-cubic {111} plane sites onto {100} plane sites. As a result, this implementation enables the simulations of two out-of-plane textures, ⟨111⟩ and ⟨100⟩, using a single ⟨100⟩ lattice. The full implementation of multiple lattices in three dimensions is being prepared for publication, and will be elaborated in the Outlook part.
Before closing this part, it is worthwhile to appreciate the multiscale (in addition to the polycrystalline) nature of the lattice kinetic Monte Carlo based approach. The span of multiple time scales is realized through the representation of atomic energetics in the lattice kinetic Monte Carlo based approach. The classical molecular dynamics simulations and ab initio calculations, together with experimental measurements, provide reliable atomic energetics and mechanisms of motion. In the lattice kinetic Monte Carlo based approach, the energetics and the mechanisms are parameterized as a function of atomic coordination. Although potential energies of individual atoms are not well defined, a parametric representation is meaningful in terms of the total energy of a simulated thin film. The nonlinear parameterization of potential energy as a function of the atomic coordination ensures the reproduction of surface defect formation energies. This reproduction, with respect to the molecular dynamics predictions in this case (Fig. 5), is essential to physical faceting during thin film deposition [15]. As to the atomic mechanisms of motion, multiple Monte Carlo jumps are used to represent diffusion jumps over steps and facets.
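The memory estimates quoted earlier in this part follow from simple counting, which the Python sketch below reproduces. It only restates the scaling given in the text (about N·L³ integers for a direct three-dimensional mapping, about 3·N·L² for three consecutive two-dimensional mappings, and a thousandfold loss of atoms when 1000 lattices are stored directly); the function names are hypothetical.

```python
def integers_direct_3d(n_lattices, linear_size):
    """Storage for a direct three-dimensional mapping, of order N * L**3."""
    return n_lattices * linear_size ** 3


def integers_three_2d(n_lattices, linear_size):
    """Storage for three consecutive two-dimensional mappings, of order 3 * N * L**2."""
    return 3 * n_lattices * linear_size ** 2


N = 1000   # number of lattices (grains)
L = 1000   # linear dimension in atomic diameters (about 250 nm)
reduction = integers_direct_3d(N, L) / integers_three_2d(N, L)
print("memory reduction factor (equals L / 3):", reduction)

# Direct use of 1000 lattices: the affordable number of atoms per lattice
# drops by 1000, so the linear dimension drops by 1000 ** (1/3), about 10.
single_lattice_nm = 250.0
multi_lattice_nm = single_lattice_nm / 1000 ** (1 / 3)
print("linear dimension with 1000 lattices stored directly (nm):", multi_lattice_nm)
```

The reduction factor is L/3, about 333 for L = 1000, which corresponds to the figure of roughly 300 quoted in the text.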
Figure 5. Nonlinear parameterization of atomic potential energy vs. coordination (open circles and solid line, labeled MC-EAM) in the Monte Carlo model. The molecular dynamics predictions (solid diamonds, labeled MD-EAM) are included for comparison.
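The lattice kinetic Monte Carlo idea, in which coordination-dependent energetics set the rates of atomic jumps, can be sketched in a few lines of Python. The example uses a one-dimensional periodic lattice and the standard residence-time algorithm; the linear barrier law, the attempt frequency, and all names are assumptions made for illustration and do not reproduce the parameterization of Fig. 5 or the simulator of [15].

```python
import math
import random

K_B = 8.617333262e-5        # Boltzmann constant in eV/K
ATTEMPT_FREQUENCY = 1e13    # attempts per second (assumed)


def hop_rate(n_neighbors, temperature_k, base_barrier=0.5, bond_energy=0.2):
    """Arrhenius rate with a barrier that grows with coordination (assumed law)."""
    barrier = base_barrier + n_neighbors * bond_energy
    return ATTEMPT_FREQUENCY * math.exp(-barrier / (K_B * temperature_k))


def list_events(occupied, n_sites, temperature_k):
    """All allowed hops (from_site, to_site, rate) on a periodic 1-D lattice."""
    events = []
    for site in sorted(occupied):
        left, right = (site - 1) % n_sites, (site + 1) % n_sites
        n_neighbors = (left in occupied) + (right in occupied)
        rate = hop_rate(n_neighbors, temperature_k)
        for target in (left, right):
            if target not in occupied:
                events.append((site, target, rate))
    return events


def kmc_step(occupied, n_sites, temperature_k, rng=random):
    """Execute one hop chosen in proportion to its rate; return the time increment."""
    events = list_events(occupied, n_sites, temperature_k)
    total_rate = sum(rate for _, _, rate in events)
    if total_rate == 0.0:
        return 0.0
    pick = rng.random() * total_rate
    cumulative = 0.0
    for site, target, rate in events:
        cumulative += rate
        if pick < cumulative:
            break
    occupied.remove(site)
    occupied.add(target)
    return -math.log(1.0 - rng.random()) / total_rate


rng = random.Random(0)
occupied = {0, 1, 2, 10}    # three clustered atoms and one isolated adatom
elapsed = 0.0
for _ in range(1000):
    elapsed += kmc_step(occupied, n_sites=20, temperature_k=600.0, rng=rng)
print(sorted(occupied), elapsed)
```

Because time advances by the inverse of the total rate rather than by a vibrational period, slow events carry the clock forward in large increments, which is how lattice kinetic Monte Carlo reaches time scales inaccessible to molecular dynamics.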
Among the three approaches – the continuum, the molecular dynamics based, and the Monte Carlo based approaches – the lattice kinetic Monte Carlo based approach looks the most promising. In particular, it enables the simulations of texture evolution at the atomic level, under realistic deposition rates, without assuming initial grain distributions. Certainly, this approach is far from complete and suffers from several drawbacks. First, the implementation has been realized only for multiple lattices in two dimensions or for two lattices in three dimensions. Second, the mechanism of grain nucleation on amorphous substrates remains largely unclear, and there is no available method to study such mechanisms. Third, this approach is incapable of simulating thin films of 1000 nm or larger in linear dimensions. Finally, strain effects are intrinsically missing in this lattice kinetic Monte Carlo based approach.
3. Outlook
Since the lattice kinetic Monte Carlo based approach looks the most promising, it will be the focus of this outlook of future developments. The first development is the full implementation of mapping multiple lattices onto one reference lattice in three dimensions. This will be realized through three consecutive two-dimensional mappings. The previous implementation of one such mapping in two dimensions indicates the feasibility. Once completed, the full implementation will enable simulations of multiple texture competition in three dimensions at the atomic level and under realistic deposition rates. The second development is the atomistic study of grain nucleation on amorphous substrates. The necessary condition of this study is a generic amorphous substrate. Although intermetallic glasses may serve as amorphous substrates, their surface roughness and local crystallinity are not controllable. In laboratory experiments over large substrate areas, such uncontrollability is not an issue. However, in atomistic simulations, substrate areas are small, and variations in surface roughness and local crystallinity overshadow the underlying nucleation principles. Therefore, one aspect of this development is the design of a generic amorphous substrate with controllable roughness and crystallinity. In parallel, the other aspect is the formulation of an analysis technique to characterize the grain nucleation on amorphous substrates. This technique allows one to define whether a cluster of atoms is crystalline. These two aspects of the development will result in a clearer understanding of the mechanisms of grain nucleation on various amorphous substrates. The third development is bridging approaches of different length scales. At present, the available computer memory of a single processor is capable of treating thin films of 500 nm × 500 nm in horizontal dimensions and 25 nm in thickness. To model thin films of larger dimensions, this approach needs to
be bridged with grain continuum models, such as PLENTE [16]. Efforts have been made in this direction, but a seamless bridging is yet to be accomplished. Finally, the fourth development is the incorporation of strain effects. The use of lattices is necessary to simulate deposition processes lasting seconds. At the same time, the use of lattices intrinsically excludes strain. Fortunately, energy may represent strain effects in the form of strain energy. The incorporation of strain effects requires a combined use of the lattice kinetic Monte Carlo based approach and continuum analyses. At any moment of texture evolution, the strain distribution will be determined from a continuum analysis, based on the grain continuum model. The strain and corresponding strain energy serve as input to the subsequent Monte Carlo simulations of texture evolution. Once the texture evolves, the strain distribution can be analyzed again. This iteration will enable the effective incorporation of strain effects in simulations of texture evolution. The first development – the full implementation of mapping multiple lattices – has been completed [17].
References
[1] R.A. Powell and S. Rossnagel, Thin Films: PVD for Microelectronics, Academic Press, New York, 1999.
[2] G. Grest, M. Anderson, D. Srolovitz, and A. Rollett, “Abnormal grain growth in three dimensions,” Scripta Metall. Mater., 24, 661–665, 1990.
[3] D. Walton, H. Frost, and C. Thompson, “Development of near-bamboo and bamboo microstructures in thin film strips,” Appl. Phys. Lett., 61, 40–42, 1992.
[4] Paritosh, D.J. Srolovitz, C.C. Battaile, X. Li, and J.E. Butler, “Simulation of faceted film growth in two dimensions: microstructure, morphology and texture,” Acta Mater., 47, 2269–2281, 1999.
[5] D. Moldovan, D. Wolf, and S.R. Phillpot, “Linking atomistic and mesoscale simulations of nanocrystalline materials: quantitative validation for the case of grain growth,” Philos. Mag., 83, 3643–3659, 2003.
[6] L. Dong and D. Srolovitz, “Texture development mechanisms in ion beam assisted deposition,” J. Appl. Phys., 84, 5261–5269, 1998.
[7] A. Voter, “Hyperdynamics: accelerated molecular dynamics of infrequent events,” Phys. Rev. Lett., 78, 3908–3911, 1997.
[8] M.J. Brett, S.K. Dew, and T. Smy, Thin Films: Modeling of Film Deposition for Microelectronic Applications, S. Rossnagel, ed., Academic Press, New York, 1996.
[9] F. Baumann and G.H. Gilmer, “3D modeling of sputter and reflow processes for interconnect metals,” IEDM Technical Digest, 89, 1995.
[10] S.J. Liu, H. Huang, and C.H. Woo, “Schwoebel–Ehrlich barrier: from two to three dimensions,” Appl. Phys. Lett., 80, 3295–3297, 2002.
[11] M.G. Lagally and Z.Y. Zhang, “Materials science - Thin-film Cliffhanger,” Nature, 417, 907–910, 2002.
[12] J.W. Shu, Q. Lu, W.O. Wong, and H. Huang, “Parallelization strategies for Monte Carlo simulations of thin film deposition,” Comput. Phys. Commun., 144, 34–45, 2002.
[13] H. Huang and G.H. Gilmer, “Multi-lattice Monte Carlo model of thin films,” J. Comput. Aided Mater. Des., 6, 117–127, 1999.
[14] G.H. Gilmer, H. Huang, T. Diaz de la Rubia, J.D. Torre, and F. Baumann, “Lattice Monte Carlo models of thin film deposition,” Thin Solid Films, 365, 189–200, 2000.
[15] H. Huang, G.H. Gilmer, and T. Diaz de la Rubia, “An atomistic simulator for thin film deposition in three dimensions,” J. Appl. Phys., 84, 3636–3649, 1998.
[16] M.O. Bloomfield, D.F. Richards, and T.S. Cale, “A computational framework for modelling grain-structure evolution in three dimensions,” Philos. Mag., 83, 3549–3568, 2003.
[17] H. Huang and L.G. Zhou, “Atomistic simulator of polycrystalline thin film deposition in three dimensions,” J. Comput. Aided Mater. Des., in press, 2005.
2.31 ATOMISTIC VISUALIZATION
Ju Li
Department of Materials Science and Engineering, Ohio State University, Columbus, Ohio, USA
Visualization plays a critical role in materials modeling. This is particularly true for atomistic modeling, in which there are a large number of discrete degrees of freedom (DOF): the positions of the atoms. Atomic resolution is therefore the defining feature of atomistic visualization. This, however, does not exclude the possibility of going up in scale (visualizing coarse-grained continuum fields) or going down in scale (visualizing the electronic structure around a particular atom or cluster of atoms in a configuration) if the need arises. The discrete DOF in an atomistic simulation do not necessarily satisfy any smoothness condition, unlike continuum fields. For example, the reconstructed atomic structure of a dislocation core in Si is not likely to be describable by a formula or a series expansion. However, this does not mean that there is no order in these DOF. Atomic-level order is ubiquitous in materials, even in amorphous or disordered materials, even in liquids. Finding this order, quantifying it, and then representing it in the best light are the tasks of atomistic visualization. Atomistic visualization is not merely a software engineering problem; it is also inherently a physics and mechanics problem. To appreciate the importance of atomistic visualization, one must recognize that in a setup like a large-scale molecular dynamics (MD) simulation, it is not infrequent that the DOF self-organize in ways that the investigator would not have expected before the simulation is carried out. Thus, a main function of atomistic simulation is discovering new structures, new kinetic pathways and micro-mechanisms, with atomic resolution. Even though these discoveries often need to be taken with a grain of salt due to the present accuracy of empirical interatomic potentials, large-scale simulation is nonetheless a unique and tremendously powerful tool for identifying key structures and processes.
Once a structure or a process is clearly described and understood, it often can be isolated and modeled with a much smaller number of atoms at the first-principles level, allowing one to eventually select the most probable structure or process out of a catalog of possible low-energy structures or processes. This surveying mission of large-scale simulation would be impossible without efficient visualization, for the amount of data from a large-scale simulation is truly enormous.
This contribution is organized as follows. First, a brief survey of the present state of the art in atomistic visualization is given, covering both tool development and work done using the tools. Special emphasis is put on public-domain visualization tools that the author is familiar with. Then, the design philosophy behind the free atomistic configuration viewer AtomEye is analyzed. Finally, a recently developed characterization of local atomic structure called the central symmetry parameter is explained.
S. Yip (ed.), Handbook of Materials Modeling, 1051–1068. © 2005 Springer. Printed in the Netherlands.
1. A Brief Survey of Molecular Visualization
At the time this article is written, the state of the art in atomistic visualization can be experienced in a movie that Farid Abraham et al. (IBM) made for a one-billion atom MD simulation of work-hardening, with two notched dislocation sources [1]. The MD simulation was performed for 200 000 time steps on the 12-teraflop, 4096-node ASCI White supercomputer at LLNL, for four days of wall clock time, and generated 25 terabytes of raw data. The data were compressed with 30× efficiency to less than 1 terabyte, which would still take about 10 hard drives (weighing ∼1.2 lb each) to store. The movie was made in the post-processing stage by Mark Duchaineau, a computer scientist (LLNL). It has a resolution of 640×480, a file size of 66 MB, and lasts 46 s. In terms of file size, the movie is less than 1/1000 of 1% of the raw data. Watching the movie takes only about 1/100 of 1% of the time it takes the fastest computer in the world to run the simulation. Yet, one gets a very good overview of what went on in the simulation, which entails dislocation nucleation, interaction and dynamics, by just watching the movie. Thus, a main purpose of visualization is condensation of information. A crucial trick that enables such a high rate of data condensation is selective representation of atoms. That is, one only renders “interesting” atoms near defects in the atomistic configuration, in this case dislocations and cracks. The “uninteresting” atoms which have bulk order are not rendered and do not cover up the field of view. Here, the “interesting” atoms are determined by a local energy criterion. Later in the article, we are going to illustrate alternative methods of distinguishing “interesting” atoms using some geometrical criteria without knowing the particular interatomic potential. As a side note, it was observed personally that the above movie never fails to captivate the audience in seminars and lectures, whether they are experts or not.
Thus, aside from sifting and compressing information, atomistic visualization also lowers the barrier of entry for accessing the information.
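The selective representation described above amounts to a simple filter over per-atom quantities. The Python sketch below assumes that a potential energy is available for each atom and keeps only atoms whose energy deviates from the bulk value by more than a tolerance; the function name, the energies, and the tolerance are hypothetical and are not taken from the simulation of [1].

```python
def select_interesting(energies, bulk_energy, tolerance):
    """Indices of atoms whose energy deviates from the bulk value by more than tolerance."""
    return [i for i, e in enumerate(energies) if abs(e - bulk_energy) > tolerance]


# Hypothetical per-atom potential energies in eV; most atoms are bulk-like.
energies = [-3.54, -3.54, -3.21, -3.53, -2.90, -3.54]
shown = select_interesting(energies, bulk_energy=-3.54, tolerance=0.05)

print("atoms to render:", shown)
print("fraction rendered:", len(shown) / len(energies))
```

In a large configuration containing a few defects, the rendered fraction is typically small, which is what keeps the field of view uncluttered and the rendered data set manageable.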
Century-old methods of scientific visualization such as graphing/charting are as important as ever (they achieve an even higher information compression rate). But the new kinds of visualization that come with the information age, in the forms of snapshots, movies/animations, and interactive navigation, greatly complement and enhance the traditional methods. Top-quality atomistic visualization such as that above [1, 2] still requires the expertise of dedicated computer science professionals. It may also require specialized hardware such as an Immersadesk or CAVE system [3, 4]. However, for day-to-day research, there is an array of visualization software available on personal computers. Commercial modeling packages such as Materials Studio, CAChe, ChemOffice, HyperChem, Spartan, etc. come with powerful visualization front ends, which usually include graphical user interface (GUI)-driven atomic configuration builders as well. There is also more specialized crystallographic software such as CrystalMaker. But here we are going to focus on free software, or freeware, that is accessible to everyone. Molscript by Kraulis [5] and Rasmol [6] are two pioneering freeware programs that have had tremendous impact on visualization, beyond the field of molecular biology from which they originated. According to the Institute for Scientific Information (ISI), from 1991 to 2004 the Molscript paper [5] has been cited more than 10 000 times, making it one of the most cited papers in science. Molscript takes an input file, which specifies the 3-D coordinates of biomolecules and the desired graphics state (such as viewpoint), and renders them into publication-quality schematics in vector image formats like PostScript, which can be directly inserted into typesetting programs such as LaTeX. Later, the photorealistic rasterization program Raster3D [7, 8] and the charge-density isosurface plotting program CONSCRIPT [9] were developed to work in unison with Molscript.
Similar to many present-day raytracing programs, Molscript, Raster3D and CONSCRIPT run on the command line and are noninteractive. So, while the quality of the configuration snapshots is excellent, they are less suitable as configuration navigation and surveying tools. Rasmol, on the other hand, is designed with navigation in mind. One is able to rotate the configuration and change the rendering state interactively. The Rasmol source code, which has been freely available since the early 1990s, implements advanced features such as shared memory extension for local display, a scripting input interface, and various fast rendering techniques, and has advanced the knowledge base for developing molecular visualization freeware. Other macromolecule visualization tools with similar functions include the Swiss-PdbViewer (Deep View) [10, 11] and MOLMOL [12]. It should be pointed out that there are many detailed differences between molecular visualization of soft matter, specifically proteins, and atomistic visualization of hard matter. For example, in modeling deformation of solids, one can often use the perfect crystal as a reference state. This means, in a visualization scheme, collective modes or defects can often be identified by comparing
with crystalline order atom by atom. Configuration changes in hard matter such as defect nucleation and mobility are often accompanied by the breaking and reformation of stiff, nearest-neighbor covalent or metallic bonds. In proteins, there is no crystalline reference state, and conformation changes are usually accomplished by the breaking and reformation of softer, non-nearest-neighbor bonds like hydrogen bonds. And while the concepts of local strain and stress are still useful in proteins [13], quantification/visualization poses perhaps a greater challenge. On the other hand, there are well-recognized local orders in proteins such as α-helices, β-sheets, turns and loops, that do not have direct analogies in hard matter, and require special representations such as ribbons/thick tubes, arrows, and lines/thin tubes. Historically, the Protein Data Bank (PDB) configuration file format [14] and the Research Collaboratory for Structural Bioinformatics (RCSB) molecular structure database have been a major driving force behind promoting molecular visualization and standardization. No such standards yet exist in materials modeling. However, there are several good reasons not to use the PDB format to save one’s configurations and for information exchange in atomistic modeling of hard matter:
• Precision. The PDB format has a fixed precision of 0.001 Å for storing the atomic coordinates. While this is probably sufficient for proteins, which one usually models at around T = 300 K in solution so that there is plenty of indeterminate thermal noise anyway, it is often not precise enough for hard matter.
• Extensibility. Since PDB adopts a fixed-line format, there is no standard and supported way to add new properties. For instance, there is no standard option to store atomic velocities.
• Support for periodic boundary condition (PBC).
It is very difficult to coax the PDB format to robustly and consistently store atomic configurations satisfying PBC, because the atomic coordinates are saved in direct Cartesian x, y, z coordinates rather than dimensionless reduced coordinates [15]. In order to effect an affine transformation on the supercell, for instance, one needs to modify all atomic coordinates explicitly in PDB, rather than just modifying the 3 × 3 H-matrix [15].
An extensible, arbitrary-precision configuration file format (CFG) and its supporting viewer AtomEye [16] are introduced in the next section; they provide full support for PBC and are ideally suited for large-scale MD simulations.
We now turn to another area, quantum chemistry, which has also had a profound influence on atomistic visualization. One deals with a smaller number of atoms in one configuration, usually no more than a few hundred at present, but scalar fields such as orbital wavefunctions need to be represented in addition to the molecular conformation. The pioneering freeware in this field is Molden [17], which renders the orbital wavefunctions, charge-density and electrostatic potential of
molecules, as well as their relaxation dynamics, vibrational normal modes and reaction pathways. It works well interactively, but also gives good quality vector graphics output for 2-D contours and 3-D isosurfaces. Another freeware program with similar functionality is gOpenMol. An excellent freeware program for visualizing electronic structure in crystals is XCrySDen [18, 19]. One can store the crystal structure plus an arbitrary number of scalar fields defined on a regular grid under PBC in the so-called XSF format, which can be visualized, rotated and numerically manipulated interactively. Isosurfaces and cut-plane contours of the scalar fields can be rendered with a variety of colormap, transparency, and specularity options. Both the onscreen display and the snapshots have outstanding quality, and the controls are highly responsive. XCrySDen also has some tools for analyzing reciprocal-space properties such as interactive selection of k-paths in the Brillouin zone for band-structure plots, and visualization of the Fermi surface. Presently, the most powerful and versatile freeware for visualizing molecular dynamics simulation trajectories is perhaps VMD [20]. It is based on OpenGL, with graphical user interfaces, but also a command line with full scripting capabilities. There is even a special syntax for choosing subsets of atoms for display (including boolean operators, regular expressions, etc.). Trajectories can be played back, analyzed and easily converted to movies. Stereoscopic display is fully supported. VMD can also display volumetric data sets, including electron density maps, electron orbitals, potential maps, and various types of user-generated volumetric data. They can be rendered using “VolumeSlice” or “Isosurface” representations, each of which provides several geometric rendering styles for viewing the data, varying isolevels, slice plane position, etc.
1-D, 2-D, and 3-D textures can be applied onto molecular and volumetric data representations to convey various types of information. VMD also provides the ability to render molecular scenes using external programs such as ray-tracing programs. This feature can be used to attain higher image quality than is possible using the built-in OpenGL rendering features. There are also many special features for analyzing large biomolecular systems. Compared to VMD, freeware such as AViz [21] and AtomEye [16], which are dedicated to atomistic visualization of nonbiological systems, are more lightweight. A good idea for beginners is to install and try all three programs. The design philosophy behind AtomEye [16] is introduced in the next section. Aside from the specialized tools introduced above, there are general visualization packages such as OpenDX and VTK, that are programmable and extremely powerful. The Python interface of VTK, for instance, has been incorporated into the Atomic Simulation Environment (ASE), an open-source distribution of Python scripts [22] that can wrap around several ab initio and molecular mechanics engines (Dacapo, SIESTA, MMTK, etc.). The commercial software package MATLAB is also a very good environment for data visualization. Freeware alternatives in this respect include Gnuplot, Grace, Octave, and Scilab.
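Returning briefly to the point made earlier about periodic boundary conditions, the advantage of storing reduced coordinates together with a 3 × 3 H-matrix can be seen in a short Python sketch. It assumes the row-vector convention x = sH, with the rows of H being the supercell edge vectors; the helper names and the numerical values are hypothetical, and this is not the CFG file format itself.

```python
import math


def wrap(s):
    """Fold reduced coordinates back into the unit cell (periodic boundary condition)."""
    return tuple(c - math.floor(c) for c in s)


def reduced_to_cartesian(s, h):
    """Cartesian position x = s H, with the rows of H as supercell edge vectors (assumed)."""
    return tuple(sum(s[k] * h[k][j] for k in range(3)) for j in range(3))


def matmul3(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# A cubic supercell with 10 Angstrom edges (illustrative).
h = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
s = wrap((0.25, 0.5, 1.2))           # the last component is folded to about 0.2
x = reduced_to_cartesian(s, h)

# A simple shear is applied by changing H only; s is left untouched.
shear = [[1.0, 0.0, 0.0], [0.1, 1.0, 0.0], [0.0, 0.0, 1.0]]
h_sheared = matmul3(h, shear)
x_sheared = reduced_to_cartesian(s, h_sheared)

print("before shear:", x)
print("after shear: ", x_sheared)
```

With Cartesian storage, the same shear would require rewriting every atomic position, which is the inconvenience attributed above to the PDB format.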
2. Design of an Efficient Atomistic Configuration Viewer
AtomEye [16] is a lightweight and memory-efficient atomistic configuration viewer, which nonetheless achieves high quality in the limited number of things that it can do. It is based on the observation that when visualizing MD simulation results, most often only the spheres and cylinders, representing the atoms and bonds, need to be drawn in massive quantities. Therefore, special subroutines were developed to render the spheres and cylinders as graphics primitives, rather than as composites of polygons. This, combined with area-weighted anti-aliasing [23], greatly enhances AtomEye’s graphics quality. One can also produce snapshots (in PNG, JPEG or EPS file formats) of a configuration in the desired graphics state at arbitrary resolutions (like 2560×2560) that are greater than the monitor display resolution, to obtain publication-quality figures (Figs. 1–6). Making a movie is straightforward with a set of sequentially named configuration files. AtomEye is an easy-to-use configuration navigator with full support for PBC. The user can move the view frustum anywhere inside the atomic configuration (see Figs. 2, 4). This is done by defining an anchor point, which can be the position of an atom, the center of a bond, or the center of mass of the entire configuration. Dragging the mouse up or down with the right mouse button
Figure 1. A strand of DNA, visualized in AtomEye.
Atomistic visualization
1057
Figure 2. Inside a chiral single-walled carbon nanotube.
Figure 3. Dislocation emission in a two-dimensional bubble raft under a spherical indentor [24]. The color encoding of atoms is by the auxiliary property of local atomistic von Mises stress invariant.
pressed pulls the viewpoint away or closer from the anchor. Rotation is always done such that the anchor position is invariant in the field of view. At beginning, the anchor is taken to be the center of mass. This allows for global view of the configuration by rotating with mouse or with arrow keys (see below). When one right-clicks on an atom or a bond, the anchor is transferred to that particular atom or bond. So if one is interested in a closer view of a
1058
J. Li
Figure 4. A vacancy defect in silicon. Three-fold coordinated atoms are colored green, while 4-fold coordinated atoms are colored silver.
Figure 5. Cu nanocrystal configuration consisting of 424 601 atoms. Atom coloring is by coordination number.
Atomistic visualization
1059
Figure 6. Central symmetry color encoding showing intrinsic stacking faults bounded by partial dislocations in indentation of Cu.
particular atomic local environment, one right-clicks on an atom or bond and then drags the mouse down without releasing the right mouse button. To pull away, simply right-click on a vacuum region and drag the mouse up without releasing the right mouse button. One can always recover the center of mass anchor by pressing key “w”. Rotation by mouse movement is accomplished with the following concepts: there is a glass sphere about half the viewport size hinged at the center of the viewport. The configuration is “frozen” in the glass sphere and corotates with it. After the rotation, there is a compensating translation if necessary, to fix the anchor in the viewport. To rotate, one imagines putting a finger on the glass sphere surface and move the fingertip, which is done by left-clicking in the window and dragging the mouse without releasing the left button. The remainder of the viewport comprises of a flat glass surface parallel to the viewport, left-clicking and dragging which causes the configuration to rotate clockwise and counterclockwise. By pressing the arrow keys ←, →, ↑, ↓, and shift +↑, ↓, the configuration can also be rotated along three orthogonal axes. The rate of rotation is governed by the socalled gearbox value, that actually controls all rates of changes, which can be varied by pressing the numeric keys 0–9. One can always recover the initial view frustum orientation with x, y, z perfectly aligned, by pressing key “u”.
At this point we need to explain the design of the CFG configuration file format that AtomEye supports. (Though there is elementary support for the PDB file format [14], PDB is not recommended. See the last section.) In the CFG file, one always assumes that the configuration is under PBC, with a parallelepiped supercell defined by its three edge vectors (not necessarily orthogonal to each other). The reason for enforcing the PBC requirement is that, while it is quite easy to express a cluster configuration as a PBC configuration by putting a large enough PBC box around it, thereby separating the periodic images by vacuum, it is not so easy the other way around. To define a PBC configuration, a minimum of 3N + 9 real numbers needs to be supplied, where N is the number of atoms. First, one must specify a 3 × 3 matrix,
\[ \mathbf{H} = \begin{pmatrix} H_{11} & H_{12} & H_{13} \\ H_{21} & H_{22} & H_{23} \\ H_{31} & H_{32} & H_{33} \end{pmatrix}, \tag{1} \]
in the unit of angstrom (Å), which specifies the supercell size and shape. AtomEye uses a row-based vector notation. That is, the first row of the $\mathbf{H}$ matrix corresponds to the first edge (or basis) vector $\mathbf{h}_1$ of the supercell, and similarly for $\mathbf{h}_2$ and $\mathbf{h}_3$:
\[ \mathbf{h}_1 \equiv (H_{11}\; H_{12}\; H_{13}), \quad \mathbf{h}_2 \equiv (H_{21}\; H_{22}\; H_{23}), \quad \mathbf{h}_3 \equiv (H_{31}\; H_{32}\; H_{33}). \tag{2} \]
So, for instance, $H_{23}$ is the z-component of the second edge vector of the supercell (in Å). It is recommended that $\mathbf{h}_1$, $\mathbf{h}_2$, $\mathbf{h}_3$ constitute a right-handed system, that is,
\[ (\mathbf{h}_1 \times \mathbf{h}_2) \cdot \mathbf{h}_3 = \det(\mathbf{H}) > 0, \tag{3} \]
but it is not required. The atom positions are specified in the CFG file by the so-called reduced coordinates $\{\mathbf{s}_i\}$ instead of the Cartesian coordinates $\{\mathbf{x}_i\}$. Here i runs from 1 to N (in the program it actually runs from 0 to N − 1), and both $\mathbf{s}_i$ and $\mathbf{x}_i$ are 1 × 3 row vectors,
\[ \mathbf{s}_i \equiv (s_{i1}\; s_{i2}\; s_{i3}), \quad \mathbf{x}_i \equiv (x_i\; y_i\; z_i). \tag{4} \]
$s_{i1}$, $s_{i2}$, $s_{i3}$ are called reduced coordinates since
1. They are dimensionless, unlike $x_i$, $y_i$, $z_i$, which are in Å.
2. They are all between 0 and 1:
\[ 0 \le s_{i1} < 1, \quad 0 \le s_{i2} < 1, \quad 0 \le s_{i3} < 1. \tag{5} \]
$\mathbf{x}_i$ and $\mathbf{s}_i$ are related by the matrix-vector product
\[ \mathbf{x}_i = \mathbf{s}_i \mathbf{H} = s_{i1}\mathbf{h}_1 + s_{i2}\mathbf{h}_2 + s_{i3}\mathbf{h}_3. \tag{6} \]
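As an illustration of Eq. (6), the following minimal sketch converts reduced coordinates to Cartesian coordinates using the row-vector convention described above. The function name `reduced_to_cartesian` and the plain nested-list representation of the matrix are assumptions made for this example; they are not part of AtomEye. The supercell used is the FCC Cu primitive cell with lattice constant 3.615 Å.

```python
def reduced_to_cartesian(s, H):
    """Return x = s H for a 1x3 row vector s and a 3x3 matrix H.

    The rows of H are the supercell edge vectors h1, h2, h3 (in Angstrom),
    so x = s1*h1 + s2*h2 + s3*h3, as in Eq. (6).
    """
    return tuple(sum(s[a] * H[a][k] for a in range(3)) for k in range(3))


# FCC Cu primitive cell, lattice constant 3.615 Angstrom (rows are h1, h2, h3).
H = [[1.8075, 1.8075, 0.0],
     [1.8075, 0.0, 1.8075],
     [0.0, 1.8075, 1.8075]]

# The point at the center of the parallelepiped, s = (1/2, 1/2, 1/2).
x_center = reduced_to_cartesian((0.5, 0.5, 0.5), H)
print(x_center)
```

With s = (1, 0, 0) the function simply returns the first edge vector, which is a quick way to check that the row-based convention has been followed.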
Since $\mathbf{h}_1$, $\mathbf{h}_2$, and $\mathbf{h}_3$ are the three edges of the parallelepiped supercell, it is seen that any point inside the supercell corresponds to $s_{i1}, s_{i2}, s_{i3} \in [0, 1)$, and vice versa. Any image atom outside the supercell can be expressed as $(s_{i1} + l,\; s_{i2} + m,\; s_{i3} + n)$, in which l, m, n are all integers, and is separated from the original atom $\mathbf{x}_i$ by the Cartesian displacement $l\mathbf{h}_1 + m\mathbf{h}_2 + n\mathbf{h}_3$. Knowing $\mathbf{x}_i$ and $\mathbf{H}$, one can also invert Eq. (6) to get $\mathbf{s}_i$:
\[ \mathbf{s}_i = \mathbf{x}_i \mathbf{H}^{-1}. \tag{7} \]
If any of $s_{i1}, s_{i2}, s_{i3} \notin [0, 1)$, the atom is outside of the supercell (i.e., it is an image atom) and needs to be mapped back to the original supercell by
\[ s_{i\alpha} \rightarrow s_{i\alpha} - \lfloor s_{i\alpha} \rfloor, \quad \alpha = 1, 2, 3, \tag{8} \]
where $\lfloor \cdot \rfloor$ is the floor function, returning the largest integer not greater than the argument. The reciprocal vectors of the supercell $\mathbf{g}_1$, $\mathbf{g}_2$, and $\mathbf{g}_3$ are the first, second and third row vectors of the matrix
\[ \mathbf{G} \equiv 2\pi (\mathbf{H}^{-1})^{T}, \tag{9} \]
and satisfy the fundamental relations
\[ \mathbf{g}_\alpha \mathbf{h}_\beta^{T} = 2\pi \delta_{\alpha\beta}, \quad \alpha, \beta \in 1 \cdots 3. \tag{10} \]
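Equations (7) to (10) can be checked numerically with a short sketch such as the one below. All function names (`inverse_3x3`, `wrap_reduced`, `cartesian_to_reduced`, `reciprocal_vectors`) are hypothetical helpers written for this illustration, and the guard against a wrapped value of exactly 1.0 is an added precaution against floating-point round-off rather than something stated in the text.

```python
import math


def inverse_3x3(H):
    """Invert a 3x3 matrix with the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = H
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0.0:
        raise ValueError("H is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def wrap_reduced(s):
    """Eq. (8): map each reduced coordinate into [0, 1) with the floor function."""
    wrapped = []
    for v in s:
        w = v - math.floor(v)
        wrapped.append(0.0 if w >= 1.0 else w)  # round-off guard (an assumption)
    return tuple(wrapped)


def cartesian_to_reduced(x, H):
    """Eq. (7): s = x H^-1, followed by the wrap of Eq. (8)."""
    Hinv = inverse_3x3(H)
    return wrap_reduced([sum(x[a] * Hinv[a][k] for a in range(3)) for k in range(3)])


def reciprocal_vectors(H):
    """Eq. (9): the rows of G = 2*pi*(H^-1)^T are g1, g2, g3."""
    Hinv = inverse_3x3(H)
    return [[2.0 * math.pi * Hinv[k][a] for k in range(3)] for a in range(3)]


# FCC Cu primitive cell (Angstrom); rows are h1, h2, h3.
H = [[1.8075, 1.8075, 0.0],
     [1.8075, 0.0, 1.8075],
     [0.0, 1.8075, 1.8075]]

# An image atom whose unwrapped reduced coordinates are (1.25, -0.5, 0.75).
x_image = (1.355625, 3.615, 0.451875)
s_wrapped = cartesian_to_reduced(x_image, H)

# Check Eq. (10): g_alpha . h_beta = 2*pi*delta_alpha_beta.
G = reciprocal_vectors(H)
gh = [[sum(G[a][k] * H[b][k] for k in range(3)) for b in range(3)] for a in range(3)]
print(s_wrapped)
print(gh)
```

The adjugate formula is used only to keep the example free of external dependencies; any linear-algebra library would serve equally well.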
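Since $\mathbf{g}_1$ is normal to the plane spanned by $\mathbf{h}_2$ and $\mathbf{h}_3$, $\mathbf{g}_2$ is normal to the plane spanned by $\mathbf{h}_1$ and $\mathbf{h}_3$, and $\mathbf{g}_3$ is normal to the plane spanned by $\mathbf{h}_1$ and $\mathbf{h}_2$, it is easy to see that the thicknesses of the supercell perpendicular to the three sets of planes are
\[ d_1 = \frac{2\pi}{|\mathbf{g}_1|}, \quad d_2 = \frac{2\pi}{|\mathbf{g}_2|}, \quad d_3 = \frac{2\pi}{|\mathbf{g}_3|}, \tag{11} \]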
respectively. It can be shown that a sphere of radius R can fit into one supercell (without touching any of the six faces) if and only if
\[ 2R < \min(d_1, d_2, d_3). \tag{12} \]
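A minimal sketch of Eqs. (11) and (12) follows. The helper names `supercell_thicknesses` and `sphere_fits` are assumptions introduced here for illustration. For the FCC primitive cell the three thicknesses should all equal the {111} interplanar spacing a/√3, which provides a convenient check.

```python
import math


def inverse_3x3(H):
    """Invert a 3x3 matrix with the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = H
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0.0:
        raise ValueError("H is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def supercell_thicknesses(H):
    """Eq. (11): d_alpha = 2*pi / |g_alpha|, with g_alpha the rows of G (Eq. 9)."""
    Hinv = inverse_3x3(H)
    # Row alpha of G is 2*pi times column alpha of H^-1.
    G = [[2.0 * math.pi * Hinv[k][a] for k in range(3)] for a in range(3)]
    return tuple(2.0 * math.pi / math.sqrt(sum(c * c for c in g)) for g in G)


def sphere_fits(R, H):
    """Eq. (12): a sphere of radius R fits in the supercell iff 2R < min(d1, d2, d3)."""
    return 2.0 * R < min(supercell_thicknesses(H))


# FCC Cu primitive cell (Angstrom); rows are h1, h2, h3.
H_fcc = [[1.8075, 1.8075, 0.0],
         [1.8075, 0.0, 1.8075],
         [0.0, 1.8075, 1.8075]]

d = supercell_thicknesses(H_fcc)
print(d)                        # each close to 3.615 / sqrt(3)
print(sphere_fits(1.0, H_fcc))  # 2.0 is below the thickness
print(sphere_fits(1.1, H_fcc))  # 2.2 exceeds the thickness
```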
The above is an important relation, since it tells us whether a great simplification in treating image interactions can be made. To appreciate this, suppose two atoms interact, or consider each other neighbors, whenever their distance is less than $r_c$. Given the contents of the supercell, the physical system it represents is an infinite lattice composed of infinitely tiled replicas of the original supercell. In principle, to determine how many neighbors an atom $\mathbf{x}_i$ in the original supercell has, one needs to go over all atoms in nearby supercells. It is then possible that both $\mathbf{x}_j + l\mathbf{h}_1 + m\mathbf{h}_2 + n\mathbf{h}_3$ and $\mathbf{x}_j + l'\mathbf{h}_1 + m'\mathbf{h}_2 + n'\mathbf{h}_3$ are neighbors of $\mathbf{x}_i$, which is called multiple counting. There is nothing wrong with multiple counting, but this possibility makes the program more complicated and less efficient. So a natural question is, under what conditions is multiple counting guaranteed not to occur, so that one only has single counting? In other words, when would any two atoms i and j in the original supercell have at most one interaction even when all images of j are taken into account? To figure this out, suppose $\mathbf{x}_i$ is at the center of the original parallelepiped, $\mathbf{s}_i = \left(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2}\right)$. It is then seen that if and only if
\[ 2 r_c < \min(d_1, d_2, d_3), \tag{13} \]
can single counting be guaranteed for atom i, and all possible neighbors are within the original supercell. One then realizes that this criterion does not depend on where atom i is. One can always define a shifted supercell (possibly containing some image atoms) with atom i at its center, which has a one-to-one mapping with atoms 1 ··· N in the original supercell. So long as Eq. (13) is satisfied, one only needs to loop over atoms 1 ··· N once to find all the neighbors of i, according to the formulas
\[ \mathbf{s}_{ij} \equiv (s_{ij1}\; s_{ij2}\; s_{ij3}), \quad s_{ij\alpha} = s_{i\alpha} - s_{j\alpha} - \left\lfloor s_{i\alpha} - s_{j\alpha} + \tfrac{1}{2} \right\rfloor, \quad \alpha = 1, 2, 3, \tag{14} \]
\[ \mathbf{x}_{ij} \equiv \mathbf{s}_{ij} \mathbf{H}, \quad r_{ij} \equiv |\mathbf{x}_{ij}|. \tag{15} \]
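Equations (14) and (15) translate almost line by line into code. The sketch below is only an illustration: the function name `minimum_image_separation` is hypothetical, and the result is the true nearest-image separation only for pairs closer than $r_c$ in a supercell that satisfies condition (13).

```python
import math


def minimum_image_separation(si, sj, H):
    """Eqs. (14)-(15): return (x_ij, r_ij) for reduced coordinates si and sj.

    H is the 3x3 supercell matrix whose rows are h1, h2, h3.
    """
    sij = []
    for a in range(3):
        ds = si[a] - sj[a]
        # Subtracting floor(ds + 1/2) puts each component in [-1/2, 1/2).
        sij.append(ds - math.floor(ds + 0.5))
    xij = tuple(sum(sij[a] * H[a][k] for a in range(3)) for k in range(3))
    rij = math.sqrt(sum(c * c for c in xij))
    return xij, rij


# Cubic supercell with 10 Angstrom edges; two atoms on opposite sides of a face.
H_cubic = [[10.0, 0.0, 0.0],
           [0.0, 10.0, 0.0],
           [0.0, 0.0, 10.0]]
x_ij, r_ij = minimum_image_separation((0.05, 0.5, 0.5), (0.95, 0.5, 0.5), H_cubic)
print(x_ij, r_ij)   # the separation is about 1 Angstrom through the boundary, not 9
```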
In the engines of AtomEye, condition (13) is assumed to hold, which is mostly the case for configurations involved in empirical potential simulations. However, configurations from ab initio calculations often do not satisfy (13). So, when AtomEye loads a configuration that does not satisfy (13), the configuration is automatically replicated in the necessary direction(s) until condition (13) is satisfied. In the CFG file, the $\mathbf{H}$ matrix can be specified flexibly according to the following formula
\[ \mathbf{H} = A\, \mathbf{H}_0 \sqrt{\mathbf{I} + 2\boldsymbol{\eta}}\; \mathbf{T}, \tag{16} \]
where A, $\boldsymbol{\eta}$ and $\mathbf{T}$ are optional parameters, and $\mathbf{I}$ is the 3 × 3 identity matrix. A is a scalar that sets the basic length scale of the configuration in Å, and its default value is one. $\boldsymbol{\eta}$ is a desired Lagrangian strain, which is a 3 × 3 symmetric matrix, and $\sqrt{\mathbf{I} + 2\boldsymbol{\eta}}$ is the affine transformation matrix that achieves $\boldsymbol{\eta}$ without rotation (see Chap 2.3); by default, $\boldsymbol{\eta} = \mathbf{0}$, the zero matrix. Finally, $\mathbf{T}$ is an affine transformation matrix, which can contain a rotational component; by default, $\mathbf{T} = \mathbf{I}$. When A, $\boldsymbol{\eta}$ and $\mathbf{T}$ all take their default values, $\mathbf{H} = \mathbf{H}_0$. So if one does not care about scaling and affine transformations, one can just specify $\mathbf{H}$ by directly specifying $\mathbf{H}_0$ in Å, like
\[ \mathbf{H}_0 = \begin{pmatrix} 1.8075 & 1.8075 & 0 \\ 1.8075 & 0 & 1.8075 \\ 0 & 1.8075 & 1.8075 \end{pmatrix}, \tag{17} \]
for an FCC Cu crystal primitive cell with equilibrium lattice constant 3.615 Å. However, it is perhaps better to set
\[ \mathbf{H}_0 = \begin{pmatrix} 0.5 & 0.5 & 0 \\ 0.5 & 0 & 0.5 \\ 0 & 0.5 & 0.5 \end{pmatrix}, \tag{18} \]
but set A = 3.615. This way, if we want to create a series of configurations with varying lattice parameters, we only need to change one number in the CFG file. The optional $\boldsymbol{\eta}$ and $\mathbf{T}$ matrices are provided for the same reason: if we want to deform the entire configuration, we only need to change one or a few parameters in the CFG file. For instance,
\[ \mathbf{T} = \begin{pmatrix} 1 & 0 & 0 \\ 0.5 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{19} \]
means effecting a simple shear such that $\mathbf{e}_y \rightarrow \mathbf{e}_y + 0.5\,\mathbf{e}_x$ with $\mathbf{e}_x$ and $\mathbf{e}_z$ unchanged.
The main data block comes after various required and optional declarations. There are N lines in the data block, one line for each atom. The first three entries on each line are the $s_{i1}$, $s_{i2}$, $s_{i3}$ of that atom. Depending on the declaration, there may or may not be three numbers following them that contain the velocity information. The CFG file is extensible in the sense that there is a supported way for the user to store extra atomic properties in the CFG file. For example, one may wish to store the instantaneous force on each atom computed from an ab initio calculation, along with the positions. To do this, one can declare the existence of three auxiliary properties
auxiliary[0] = f_x [eV/Å]
auxiliary[1] = f_y [eV/Å]
auxiliary[2] = f_z [eV/Å],
which provide indexing (starting from zero), property name, and unit information. One can then append the auxiliary property data at the end of the line for each atom. AtomEye can be used to query atom by atom and to graphically visualize these auxiliary properties with various threshold and colormap options (see Fig. 3).
The CFG file (with recommended suffix “.cfg”) is meant to be readable and editable by people, so it is in plain ASCII format. One can add comments after “#”, which is also a way to store nonstandard information. All data values can be specified to as many significant digits as the user deems necessary. To compensate for the large size, the user may compress the CFG file using gzip (recommended suffix “.cfg.gz”) or bzip2 (recommended suffix “.cfg.bz2”). AtomEye can directly load the compressed files, using an automatic recognition and decompression scheme. To simplify operations, one CFG file should store one atomistic configuration only. A sequence of configurations should be named like “mdrun00001.cfg.gz”, “mdrun00002.cfg.gz”, “mdrun00003.cfg.gz”, . . . , with the starting identifier “mdrun” arbitrary. AtomEye can recognize the above file name patterns automatically to determine a file group, with browsing forward, backward and loop-back capabilities. This greatly facilitates inspecting MD trajectories.
AtomEye presently has three built-in functions to characterize the local atomic environment:
1. Coordination number $\{k_i\}$. This counts the total number of first-nearest neighbors each atom i has in a configuration (self excluded). It is of course a fuzzy concept, especially in liquids, where the sharp shell structure of the crystal reference is largely smeared out. Procedurally, AtomEye defines a default atomic radius value $R_u$ for each element species u. An atom i of species u considers an atom j of species v its first-nearest neighbor if their distance $r_{ij}$ (see Eq. (15)) is less than $r_{c,uv} \equiv R_u + R_v$. By the definition above, and by common sense, this relationship is reciprocal; that is, if atom i considers atom j its first-nearest neighbor, then atom j also considers atom i its first-nearest neighbor. The choice of the $R_u$ default value is based on the following considerations. The first is Slater’s empirical atomic radius tabulation based on the equilibrium bond lengths in over 1200 ionic, metallic, and covalent crystals and molecules [25]. The second is that in order for the procedure to be maximally resistant to thermal noise at low T for the ground state perfect crystal, the cutoff $r_{c,uu} = 2R_u$ should be set approximately halfway between the first and the second atomic shells of the T = 0 perfect crystal.
(In liquids a similar choice would be to set $2R_u$ to the location of the minimum between the first and second maxima in the radial distribution function g(r).) This default $r_{c,uv}$ value, however, does not always work well in practice, and the user can change it. In AtomEye, $\{k_i\}$ is used as a versatile characterization of atomic defects. Point defects (Fig. 4), dislocations, grain boundaries (Fig. 5), etc. will often change the coordination number of atoms in their cores, thereby allowing their conformation to be visualized. Often, to see the defects, one also needs to render the “uninteresting” atoms invisible. Here, the uninteresting atoms are identified as those whose $k_i$ remains unchanged from the reference crystal value, such as 12 in an FCC crystal. Ctrl+shift+right-click on them will make them invisible.
2. Central symmetry parameter $\{c_i\}$. There are some important defects in crystals, such as stacking faults and twin boundaries, which do not change the coordination number of atoms. But they can be identified by evaluating the degree of inversion symmetry breaking around each atom. This is explained in the next section. An example is shown in Fig. 6, where intrinsic stacking faults bounded by Shockley partial dislocations in FCC crystals are visualized.
3. Local von Mises shear strain invariant $\{\eta_i\}$. A reference-state-free measure of the local atomic shear strain invariant has been derived for high-symmetry crystals [26].
Furthermore, the user is free to devise his/her own local environment characterization scheme and save the result as an auxiliary property (see Fig. 3) to visualize it later on. One may also define a “color patch” file that accompanies a CFG file to explicitly control how the atoms should be rendered. AtomEye provides a suite of tools to survey and interrogate the configuration. One can find out about atomic properties (auxiliaries included), bond length, bond angle, surface normal, and dihedral angle by right-clicking on the atoms. One may define a large number of simultaneous cutting planes, and shift the configuration under PBC to expose the most interesting features. Finally, one can put down color markings on the atoms in one configuration and trace their diffusive or displacive motion in the ensuing configurations, for example during deformation.
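The coordination-number criterion of item 1 above can be sketched in a few lines. This is an illustrative O(N²) loop, not AtomEye's actual implementation; the function name `coordination_numbers`, the dictionary of radii, and the radius chosen for Cu (a quarter of the sum of the first- and second-shell distances, so that the Cu-Cu cutoff sits halfway between the two shells) are all assumptions made for this example. The supercell is assumed to satisfy condition (13).

```python
import math


def coordination_numbers(s, species, H, radius):
    """Count first-nearest neighbors using the cutoff r_c,uv = R_u + R_v.

    s       : list of reduced coordinates, one 3-tuple per atom
    species : list of species labels, one per atom
    H       : 3x3 supercell matrix (rows h1, h2, h3), assumed to satisfy Eq. (13)
    radius  : dict mapping a species label to its radius R_u
    """
    n_atoms = len(s)
    k = [0] * n_atoms
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            sij = [s[i][a] - s[j][a] for a in range(3)]
            sij = [d - math.floor(d + 0.5) for d in sij]                        # Eq. (14)
            xij = [sum(sij[a] * H[a][c] for a in range(3)) for c in range(3)]
            rij = math.sqrt(sum(c * c for c in xij))                            # Eq. (15)
            if rij < radius[species[i]] + radius[species[j]]:
                k[i] += 1  # the relationship is reciprocal,
                k[j] += 1  # so both atoms gain a neighbor
    return k


# Perfect FCC Cu: a 3 x 3 x 3 block of conventional cubic cells (108 atoms).
a, n = 3.615, 3
basis = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
s = [((b[0] + ix) / n, (b[1] + iy) / n, (b[2] + iz) / n)
     for ix in range(n) for iy in range(n) for iz in range(n) for b in basis]
H = [[n * a, 0.0, 0.0], [0.0, n * a, 0.0], [0.0, 0.0, n * a]]

# Hypothetical radius: 2 * R_cu lies halfway between the first shell (a / sqrt(2))
# and the second shell (a).
R_cu = 0.25 * (a / math.sqrt(2.0) + a)
coord = coordination_numbers(s, ["Cu"] * len(s), H, {"Cu": R_cu})
print(set(coord))   # every atom in the perfect crystal has the same coordination
```

For large configurations one would replace the double loop with a cell list or neighbor list; the pairwise test itself stays the same.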
3. Central Symmetry Parameter
The central symmetry parameter $\{c_i\}$, i = 1 ··· N, is used to characterize the degree of inversion symmetry breaking in each atom’s local environment. It is especially useful for visualizing planar faults in FCC and BCC crystals [27]. We illustrate here how it is done. Define the integer constant M to be the maximum number of neighbors for the computation of $\{c_i\}$. For an FCC lattice, we may want to use M = 12. For a BCC lattice, we may want to use M = 8. The computer of course does not know whether the configuration is FCC- or BCC-based, so by default it is going to use
\[ M_{\mathrm{default}} \equiv \left\lfloor \frac{N_{\mathrm{most}}}{2} \right\rfloor \times 2, \tag{20} \]
where $N_{\mathrm{most}}$ is the most popular coordination number in the set $\{N_i\}$, i = 1 ··· N, of the configuration. The user is able to override the default. But in any case, M must be even, as we will be counting pairs of atoms. Now for each atom i ∈ 1 ··· N, define
\[ \tilde{m}_i \equiv \min(M, N_i). \tag{21} \]
If $\tilde{m}_i = 0$, $c_i \equiv 0$, since an isolated atom should have perfect inversion symmetry. If $\tilde{m}_i = 1$, $c_i \equiv 1$, since a coordination-1 atom has no inversion image to compare with, so in a sense its inversion symmetry is the most broken. For $\tilde{m}_i \ge 2$, define
\[ m_i \equiv \left\lfloor \frac{\tilde{m}_i}{2} \right\rfloor \times 2, \tag{22} \]
and we use the following procedure to determine $c_i$.
1. Sort the j = 1 ··· $N_i$ neighbors of atom i according to their distances $|\mathbf{d}_j|$ to atom i in ascending order. Pick the smallest $m_i$-set.
2. Take the closest neighbor $\mathbf{d}_1$. Search, among the other $m_i - 1$ neighbors, for the one that minimizes
\[ \tilde{D}_j \equiv |\mathbf{d}_1 + \mathbf{d}_j|^2, \tag{23} \]
and let us define
\[ j' \equiv \arg\min_{j = 2..m_i} \tilde{D}_j, \quad D_1 \equiv \tilde{D}_{j'}. \tag{24} \]
3. Throw atoms 1 and j′ out of the set, and look for the closest neighbor in the remaining set. Then repeat Step 2 until the set is empty. We then have obtained $D_1, D_2, \ldots, D_{m_i/2}$. Define
\[ c_i \equiv \frac{\sum_{k=1}^{m_i/2} D_k}{2 \sum_{j=1}^{m_i} |\mathbf{d}_j|^2}. \tag{25} \]
Equation (25) is dimensionless. In the case of $m_i = 2$, supposing the two neighbors are independently and randomly oriented, it is easy to show that the mathematical expectation is
\[ E[c_i] = \frac{1}{2}. \tag{26} \]
On the other hand, we can prove that
\[ \max_{\{\mathbf{d}_j\}} c_i = 1, \tag{27} \]
so this matches the definition of $c_i \equiv 1$ at $\tilde{m}_i = 1$. But when $m_i > 2$,
\[ E[c_i] < \frac{1}{2}, \tag{28} \]
because of the minimization process. For instance, at the intrinsic stacking fault in the FCC lattice ABC|BCA, there is a loss of inversion symmetry in the two layers C|B, and $c_i$ is
\[ c_i = \frac{3 \times 0 + 3 \times \left( d\, \frac{\sqrt{3}}{2} \times \frac{1}{3} \times 2 \right)^2}{2 \times 12 d^2} = \frac{1}{24} \approx 0.0417, \tag{29} \]
assuming perfect stacking.
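The pairing procedure of Steps 1 to 3 and Eq. (25) can be sketched as follows. The function name `central_symmetry` and its interface (a list of neighbor separation vectors for a single atom, plus M) are assumptions for this illustration and do not reproduce AtomEye's internal code. The examples exercise the limiting cases discussed in the text: an inversion-symmetric FCC first shell, and the $m_i = 2$ cases.

```python
def central_symmetry(neighbors, M=12):
    """Central symmetry parameter c_i of one atom.

    neighbors : list of separation vectors d_j (3-tuples) from the atom to its neighbors
    M         : maximum number of neighbors used (should be even)
    """
    m_tilde = min(M, len(neighbors))                        # Eq. (21)
    if m_tilde == 0:
        return 0.0                                          # isolated atom
    if m_tilde == 1:
        return 1.0                                          # no inversion image at all
    m = (m_tilde // 2) * 2                                  # Eq. (22)

    def norm2(v):
        return sum(c * c for c in v)

    # Step 1: keep the m closest neighbors, in ascending order of distance.
    remaining = sorted(neighbors, key=norm2)[:m]
    denominator = 2.0 * sum(norm2(d) for d in remaining)
    numerator = 0.0
    while remaining:
        d1 = remaining.pop(0)                               # Step 2: closest one left
        costs = [norm2([d1[a] + dj[a] for a in range(3)])   # Eq. (23)
                 for dj in remaining]
        best = min(range(len(costs)), key=costs.__getitem__)  # Eq. (24)
        numerator += costs[best]
        remaining.pop(best)                                 # Step 3: discard the pair
    return numerator / denominator                          # Eq. (25)


# First neighbor shell of a perfect FCC lattice (in units of a/2): 12 vectors.
fcc_shell = ([(x, y, 0) for x in (1, -1) for y in (1, -1)]
             + [(x, 0, z) for x in (1, -1) for z in (1, -1)]
             + [(0, y, z) for y in (1, -1) for z in (1, -1)])

c_fcc = central_symmetry(fcc_shell)                            # perfect inversion symmetry
c_same = central_symmetry([(1, 0, 0), (1, 0, 0)], M=2)         # two coincident neighbors
c_opposite = central_symmetry([(1, 0, 0), (-1, 0, 0)], M=2)    # exact inversion pair
c_perp = central_symmetry([(1, 0, 0), (0, 1, 0)], M=2)         # perpendicular pair
print(c_fcc, c_same, c_opposite, c_perp)
```

The coincident pair reaches the maximum of Eq. (27), while the perpendicular pair reproduces the value 1/2 that Eq. (26) gives as the orientation average.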
The good thing about expression (25) is that, according to the Lindemann/Gilvarry rule [28], a crystal melts when the atomic vibrational amplitudes reach about 12% of the nearest neighbor distance, so $c_i$ for a perfect crystal should be < 0.01 even at finite temperature. Therefore, it is not very difficult to separate thermal noise from a true stacking fault by thresholding.
4. Outlook
Visualization of modeling results plays the same role as microscopy in experiments: one relies on it to extract useful information from the often staggering amount of data. Good pictures and animations grab the audience’s attention in classrooms and seminars, and user-friendly visualization tools allow them to really interact with the numerical models. Atomistic visualization will become more widespread as suitable techniques are developed and software tools are refined. In the future, we expect distributed visualization of large data sets, like distributed number-crunching on Beowulf clusters and grid computers, will become more prevalent. In this paradigm, the display node takes care of assembling the scenes and user input, while multiple nodes on a fast network perform data readout and render the scenes in the background, in real-time navigations.
References
[1] F.F. Abraham, R. Walkup, H.J. Gao, M. Duchaineau, T.D. De la Rubia, and M. Seager, “Simulating materials failure by using up to one billion atoms and the world’s fastest computer: work-hardening,” Proc. Natl. Acad. Sci. USA, 99, 5783–5787, 2002.
[2] P. Vashishta, R.K. Kalia, and A. Nakano, “Multimillion atom molecular dynamics simulations of nanostructures on parallel computers,” J. Nanopart. Res., 5, 119–135, 2003.
[3] S. Xu, J. Li, C. Li, and F. Chan, “Immersive visualisation of nano-indentation simulation of Cu,” In: H. Lee and K. Kumar (eds.), Recent Advances in Computational Science and Engineering, World Scientific, Singapore. Proceedings of the International Conference on Scientific and Engineering Computation (IC-SEC) ISBN: 1-8609, 2002.
[4] A. Sharma, A. Nakano, R.K. Kalia, P. Vashishta, S. Kodiyalam, P. Miller, W. Zhao, X.L. Liu, T.J. Campbell, and A. Haas, “Immersive and interactive exploration of billion-atom systems,” Presence-Teleoper. Virtual Env., 12, 85–95, 2003.
[5] P.J. Kraulis, “Molscript - a program to produce both detailed and schematic plots of protein structures,” J. Appl. Crystallogr., 24, 946–950, 1991.
[6] R.A. Sayle and E.J. Milner-White, “Rasmol – biomolecular graphics for all,” Trends Biochem. Sci., 20, 374–376, 1995.
[7] E.A. Merritt and M.E.P. Murphy, “Raster3D photorealistic molecular graphics,” Acta Crystallogr. Sect. D-Biol. Crystallogr., 50, 869–873, 1994.
[8] E.A. Merritt and D.J. Bacon, “Raster3D: photorealistic molecular graphics,” Methods Enzymol., 277, 505–524, 1997.
[9] M.C. Lawrence and P. Bourke, “CONSCRIPT: a program for generating electron density isosurfaces for presentation in protein crystallography,” J. Appl. Crystallogr., 33, 990–991, 2000.
[10] N. Guex and M.C. Peitsch, “SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling,” Electrophoresis, 18, 2714–2723, 1997.
[11] N. Guex, A. Diemand, and M.C. Peitsch, “Protein modelling for all,” Trends Biochem. Sci., 24, 364–367, 1999.
[12] R. Koradi, M. Billeter, and K. Wuthrich, “MOLMOL: A program for display and analysis of macromolecular structures,” J. Mol. Graph., 14, 51–55, 1996.
[13] O. Miyashita, J.N. Onuchic, and P.G. Wolynes, “Nonlinear elasticity, proteinquakes, and the energy landscapes of functional transitions in proteins,” Proc. Natl. Acad. Sci. USA, 100, 12570–12575, 2003.
[14] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, and P.E. Bourne, “The protein data bank,” Nucleic Acids Res., 28, 235–242, 2000.
[15] M. Parrinello and A. Rahman, “Polymorphic transitions in single-crystals – a new molecular dynamics method,” J. Appl. Phys., 52, 7182–7190, 1981.
[16] J. Li, “AtomEye: an efficient atomistic configuration viewer,” Model. Simul. Mater. Sci. Engrg., 11, 173–177, 2003.
[17] G. Schaftenaar and J.H. Noordik, “Molden: a pre- and post-processing program for molecular and electronic structures,” J. Comput.-Aided Mol. Des., 14, 123–134, 2000.
[18] A. Kokalj, “XCrySDen – a new program for displaying crystalline structures and electron densities,” J. Mol. Graph., 17, 176, 1999.
[19] A. Kokalj, “Computer graphics and graphical user interfaces as tools in simulations of matter at the atomic scale,” Comput. Mater. Sci., 28, 155–168, 2003.
[20] W. Humphrey, A. Dalke, and K. Schulten, “VMD: visual molecular dynamics,” J. Mol. Graph., 14, 33–38, 1996.
[21] J. Adler, A. Hashibon, N. Schreiber, A. Sorkin, S. Sorkin, and G. Wagner, “Visualization of MD and MC simulations for atomistic modeling,” Comput. Phys. Commun., 147, 665–669, 2002.
[22] S.R. Bahn and K.W. Jacobsen, “An object-oriented scripting interface to a legacy electronic structure code,” Comput. Sci. Engrg., 4, 56–66, 2002.
[23] J.D. Foley, A. van Dam, S.K. Feiner, and J.F. Hughes, Computer Graphics: Principles and Practice in C, 2nd edn., Addison-Wesley, Reading, 1995.
[24] J. Li, K.J. Van Vliet, T. Zhu, S. Yip, and S. Suresh, “Atomistic mechanisms governing elastic limit and incipient plasticity in crystals,” Nature, 418, 307–310, 2002.
[25] J. Slater, J. Chem. Phys., 39, 3199, 1964.
[26] J. Li, To be published, 2004.
[27] C.L. Kelchner, S.J. Plimpton, and J.C. Hamilton, “Dislocation nucleation and defect structure during surface indentation,” Phys. Rev. B, 58, 11085–11088, 1998.
[28] J. Gilvarry, “The Lindemann and Grüneisen laws,” Phys. Rev., 102, 308, 1956.
3.1 MESOSCALE/MACROSCALE COMPUTATIONAL METHODS M.F. Horstemeyer Mississippi State University, Mississippi State, MS, USA
The central idea of this chapter is the multiscale aspect of the constitutive relations for plasticity, damage/fracture, and fatigue. In continuum mechanics, the constitutive relations are required to complete the set of governing equations in concert with the conservation equations of mass, momentum, and energy. The constitutive equations essentially distinguish the material in the modeling framework. The particular constitutive relations focused on in this chapter relate to thermodynamically dissipating materials with the perspective of mesoscale and macroscale analyses. Mesoscale analyses typically start at the scale of the grain or crystal, whereas macroscale analyses start at the polycrystalline level. Mesoscale analyses at times focus on just activities within a single crystal. In a sense mesoscale analyses are ascribed as discrete methods but can address polycrystalline materials as averaging schemes over the single crystals are performed. Hence, mesoscale analyses employ continuum mechanics as well. Several types of mesoscale analyses can be performed. In this chapter, dislocation dynamics occurring at the scale of the grain is presented, which is a discrete method. Also, crystal plasticity, which is a mix of discrete and continuum concepts, starts at the grain scale and can be used for polycrystalline analysis with the use of averaging schemes. The macroscale analysis in this chapter is discussed in the context of internal state variables, which are rooted in continuum level thermodynamics. Because the mesoscale dislocation dynamics and crystal plasticity formulations require a large capacity of computing power, they are not generally used to solve large scale boundary value problems of structural components or systems. It is the macroscale internal state variable continuum theory that is often employed in solving practical engineering problems. 
S. Yip (ed.), Handbook of Materials Modeling, 1071–1075. © 2005 Springer. Printed in the Netherlands.
The use of the mesoscale dislocation dynamics and crystal plasticity formulations arises in material analysis studies, which in turn can play a role in the macroscale
efforts. We should also note that particular discrete details of dislocation nucleation, motion, and interaction can be quite clearly captured in the dislocation dynamics formulation. As such, the degrees of freedom required to solve an engineering problem are prohibitive. Now the crystal plasticity formulation employs discrete crystals but treats the dislocation effects in a phenomenological manner. Hence, it loses some of the details but can capture a larger scale boundary value problem. The macroscale internal state variable captures even less detail than the crystal plasticity formulation but can address even larger scale boundary value problems. When considering the multiscale modeling and simulation methods described in this book, one can consider that the ab initio and atomistic method simulation results can be imported into the dislocation dynamics formulation. The results from the dislocation dynamics formulation can be imported into the crystal plasticity formulation. And the crystal plasticity results can be imported into the macroscale internal state variable formulation. This bridging of the length scales is a relatively recent area of research in the areas of plasticity, damage/fracture, and fatigue. This chapter breaks down the different plasticity formulations related to the different length scales. The damage/fracture and fatigue portions can be included in the plasticity formulations at each of the scales as well. However, the continuum damage mechanics, fracture mechanics, and fatigue formulations have had more applicability at the macroscale and structural scale and much less at the mesoscale. The mesoscale couplings of damage, fracture, and fatigue with dislocation dynamics and crystal plasticity are certainly promising areas of research. Before proceeding further into the depths of each of the formulations, it is pertinent to discuss the context of the models within the context of continuum mechanics. 
The formulations considered in this chapter each focus, for the most part, on the kinematics, kinetics, and thermodynamics related to each scale and type of formulation (the exception is dislocation dynamics). The constitutive model embedded within the governing conservation equations is sometimes referred to as a law, but it is generally not one in the strict scientific sense. Constitutive models essentially represent the constitution of the material. Just as the constitution of the United States dictates the internal response upon an external stimulus, so the constitution of a material will yield a stress upon an externally applied strain. The reader should know that several postulates of continuum theory are part of a constitutive theory. Classically, these assumptions (essentially postulates) are the following: frame indifference, physical admissibility, material memory, equipresence, and local action. Perhaps the most important of these assumptions is physical admissibility. In fact, the assumption of local action has been questioned recently when dealing with various length scales, and some constitutive equations have assumed nonlocal notions.
In order to address the physical admissibility assumption, let us summarize various length scales that arise in plasticity and damage/fracture analysis. At a low level, the lattice parameter is key since atomic rearrangement is intimately related to various types of dislocations. Nabarro developed a relationship for dislocations that related stress to the inverse of a length scale parameter, the Burgers vector [1]. Nabarro [2] also showed that the diffusion rate is inversely proportional to the grain size, another length scale parameter. Hall [3] and Petch [4] related the work hardening rate to the grain size. Ashby [5] found that the dislocation density increased with decreasing second phase particle size. Frank [6, 7] and Read [8] showed that dislocation bowing is a function of spacing distance and size. Hughes and Hanson [9] discovered that geometrically necessary boundary spacing decreases with increasing strain. Recently, Horstemeyer et al. [10, 11] found that the yield stress is a function of the volume per surface area, a different type of length scale parameter. Recent experimental studies have revealed that material properties change as a function of size. For example, Fleck et al. [12] have shown that in torsion of thin polycrystalline copper wires the normalized yield shear strength increases by a factor of three as the wire diameter is decreased from 100 µm to 12.5 µm. Stolken and Evans [13] observed a substantial increase in hardening during the bending of ultra thin beams. In micro-indentation and nano-indentation tests [14–19], the measured indentation hardness increased by a factor of two as the depth of indentation decreased from 10 µm to 1 µm. Lloyd [20] investigated an aluminum-silicon matrix reinforced by silicon carbide particles. He observed a significant increase in strength when the particle diameter was reduced from 16 µm to 7.5 µm while holding the particle volume fraction fixed at 15%. Hughes et al.
[21] investigated deformation induced from frictional loading and found that the stresses near the surface were much greater than those predicted by the local macroscale continuum theory, i.e., a length scale dependence was observed. Elssner et al. [22] measured both the macroscopic fracture toughness and the atomic work associated with the separation of an interface between two dissimilar single crystals. The interface (crack tip) between the two materials remained sharp, even though the materials were ductile and contained a large number of dislocations. The stress level necessary to produce atomic decohesion of a sharp interface is on the order of ten times the yield stress, while local theories predict that the maximum achievable stress at a crack tip is no larger than four to five times the yield stress. In continuum mechanics, length scales have been a part of some of the original premises. Euler in the 1700s related buckling to a column length. In the 1800s, Cauchy related the stress state to the radius of a cylinder. In the 1900s, Bridgman showed for many metal alloys that the notch radius changes the stress state of the material. In terms of damage/fracture, Griffith [23] found a
relation between the crack length and the stress intensity factor. Fairly recently, McClintock [24] determined the void growth rates as a function of the void size. Void/crack nucleation was related to various aspects of the second phase particle size distribution by Gangalee and Gurland [25]. Horstemeyer et al. [26] determined the nearest neighbor distance as a length scale parameter for void coalescence modeling. It is clear that whether damage mechanics or fracture mechanics is employed, the length scale of interest is important. Although the computational methods described in this chapter address different length scales of analysis, and hence require differing levels of discreteness or continuity, all of the methods have some points of commonality. They are focused on crystalline metals, without being restricted to fcc, bcc, or hcp lattice structures. A common theme is the treatment of energy requirements and dissipative mechanisms within the relevant scale. Each scale of analysis has focused upon various strain rate and temperature effects as fundamental conditions. Continuum mechanics can be thought of as a branch of applied mathematics. The formulations comprise rigorous mathematical restrictions, so standard tensor notation is used throughout the text. The tensor notation needs some explanation as well. An underscore indicates a first rank tensor (vector) for a lower case letter and a second rank tensor for a capital letter, i.e., $\underline{v}$ and $\underline{F}$, respectively. A global Cartesian coordinate system is assumed, so no distinction is made between the contravariant and covariant components. A first and a second rank tensor in the Einsteinian indicial notation are given by $v_i$ and $F_{ij}$, respectively. For the implementation of the constitutive model into the finite element codes, the tensors are denoted in bold face type. The summation convention over repeated indices is implied, for example, $\sigma_{ii} = \sigma_{11} + \sigma_{22} + \sigma_{33}$.
In general, for any tensor variable x, x˚ represents the corotational derivative. The tensorial dyadic product is denoted by ⊗, for example, a ⊗ a is a second rank tensor. The rest of this chapter is outlined by the following sections which give detailed descriptions of mesoscale/macroscale continuum formulations: dislocation dynamics, crystal plasticity, internal state variable theory, ductile fracture, continuum damage mechanics, microstructure sensitive computational fatigue, and a final perspective on modeling at these scales.
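The index notation described above can be made concrete with a short sketch. The helper names `trace` and `dyad` and the numerical values are illustrative assumptions, not part of the chapter; the code only spells out the summation convention $\sigma_{ii}$ and the dyadic product $a \otimes a$ in a Cartesian frame.

```python
# Illustrative helpers for the Cartesian index notation; names are hypothetical.

def trace(sigma):
    """Summation convention: sigma_ii = sigma_11 + sigma_22 + sigma_33."""
    return sum(sigma[i][i] for i in range(3))


def dyad(a, b):
    """Dyadic product (a (x) b)_ij = a_i * b_j, a second rank tensor."""
    return [[a[i] * b[j] for j in range(3)] for i in range(3)]


# Arbitrary example values (assumed, for illustration only)
sigma = [[100.0, 20.0, 0.0],
         [20.0, 50.0, 0.0],
         [0.0, 0.0, -30.0]]
a = [1.0, 2.0, 3.0]

sigma_ii = trace(sigma)   # sum over the repeated index i
aa = dyad(a, a)           # second rank tensor built from the vector a
```

Note that the trace of $a \otimes a$ is the scalar product $a_i a_i$, which is one quick consistency check on an implementation of this kind.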
References
[1] J.M. Burgers, Proc. Kon. Ned. Akad. Wetenschap., 42, 293, 1939.
[2] F. Nabarro, Adv. in Phys., 1, 269, 1952.
[3] E.O. Hall, Proc. Phys. Soc. B, 64, 747, 1951.
[4] N.J. Petch, J. Iron Steel I., 174, 25, 1953.
[5] M. Ashby, Strengthening Methods in Crystals, A. Kelly and R.B. Nicholson (eds.), 137, 1971.
[6] F.C. Frank, Disc. Faraday Soc., 5, 48, 1949.
[7] F.C. Frank, Phil. Mag., 42, 809, 1951.
[8] W.T. Read, Dislocations in Crystals, McGraw-Hill, New York, 1953.
[9] D.A. Hughes and N. Hansen, “High angle boundaries and orientation distributions at large strains,” Scripta Metallurgica et Materialia, Vol. 33, No. 2, Jul. 15, pp. 315–321, 1995.
[10] M.F. Horstemeyer and M.I. Baskes, “Atomistic finite deformation simulations: a discussion on length scale effects in relation to mechanical stresses,” J. Eng. Matls. Techn. Trans. ASME, 121, pp. 114–119, 1998.
[11] M.F. Horstemeyer, S.J. Plimpton, and M.I. Baskes, “Size scale and strain rate effects on yield and plasticity of metals,” Acta Mater., 49, 4363–4374, 2001.
[12] N.A. Fleck, G.M. Muller, M.F. Ashby, and J.W. Hutchinson, “Strain gradient plasticity – theory and experiment,” Acta Met., 42, (2), 475–487, 1994.
[13] J.S. Stolken and A.G. Evans, “A microbend test method for measuring the plasticity length scale,” Acta Mater., 46, n 14 5109, 1998.
[14] W.D. Nix, “Mechanical properties of thin films,” Metall. Trans., 20A, 2217–2245, 1989.
[15] M.S. De Guzman, G. Newbauer, P. Flinn, and W.D. Nix, “The role of indentation depth on the measured hardness of materials,” Materials Research Symposium Proceedings, 308, 613–618, 1993.
[16] N.A. Stelmashenko, N.A. Walls, L.M. Brown, and Y.V. Milman, “Microindentation on W and Mo oriented single crystal: an STM Study,” Acta. Metall. Mater., 41, 2855–2865, 1993.
[17] Q. Ma and D.R. Clarke, “Size dependent hardness in silver single crystals,” J. Mater. Research, 10, 853–863, 1995.
[18] W.J. Poole, M.F. Ashby, and N.A. Fleck, “Microhardness of annealed and work hardened copper polycrystals,” Scripta Metall. Mater., 34, 559–564, 1996.
[19] K.W. McElhaney, J.J. Vlassak, and W.D. Nix, “Determination of indentor tip geometry and indentation contact area for depth-sensing indentation experiments,” J. Mater. Res., 13, 1300–1306, 1998.
[20] D.J. Lloyd, “Particle reinforced aluminum and magnesium matrix composites,” Int. Mater. Rev., 39, 1–23, 1994.
[21] D.A. Hughes, D.B. Dawson, J.S. Korellis, and L.I. Weingarten, “Near surface microstructures developing under large sliding loads,” J. Matls. Engineering Performance, 3, 459–475, 1994.
[22] G. Elssner, D. Korn, and M. Ruehle, “The influence of interface impurities on fracture energy of UHV diffusion bonded metal-ceramic bicrystals,” Scripta Metall. Mater., 31, 1037–1042, 1994.
[23] A.A. Griffith, “The phenomena of rupture and flow in solids,” Phil. Trans. Roy. Soc. London, Series A, 221, 163–198, 1920.
[24] F.A. McClintock, “A criterion for ductile fracture by growth of holes,” J. Appl. Mechanics, 35, 363, 1968.
[25] A. Gangalee and J. Gurland, “On the fracture of silicon particles in aluminum–silicon alloys,” Trans. Metall. Soc. of AIME, 239, 269–272, 1967.
[26] M.F. Horstemeyer, M.M. Matalanis, A.M. Sieber, and M.L. Botos, “Micromechanical finite element calculations of temperature and void configuration effects on void growth and coalescence,” Int. J. Plasticity, 16, 2000.
3.2 PERSPECTIVE ON CONTINUUM MODELING OF MESOSCALE/ MACROSCALE PHENOMENA D.J. Bammann Sandia National Laboratories, Livermore, CA, USA
S. Yip (ed.), Handbook of Materials Modeling, 1077–1096. © 2005 Springer. Printed in the Netherlands.
The attempt to model or predict the inelastic response or permanent deformation and failure observed in metals dates back over 180 years. Various descriptions of the post elastic response of metals have been proposed from the fields of physics, materials science (metallurgy), engineering, mechanics, and applied mathematics. The communication between these fields has improved, and many of the modeling efforts today involve concepts from most or all of these fields. Early engineering descriptions of post yield response treated the material as perfectly plastic: the material continues to deform with zero additional increase in load. These models became the basis of the mathematical theory of plasticity and were extended to account for hardening, unloading, and directional hardening. In contradistinction, rheological models treated the finite deformation of a solid as similar to the deformation of a viscous fluid. In many cases of large deformation, rheological models have provided both adequate and accurate information about the deformed shape of a metal during many manufacturing processes. The treatment of geometric defects in solid bodies originated within the mathematical theory of elasticity, where the dislocation was introduced as an incompatible “cut” in a continuum body. This resulted in a very large body of literature devoted to the linear elastic study of dislocations, dislocation structures, and their interactions, and has provided essential information in the understanding of the “state” of a deformed material. Later it was recognized that this mathematical description was consistent with the defect in a crystal responsible for inelastic deformation. Following this, many dislocation models were developed that explained macroscopically observed phenomena such as work hardening, rate sensitivity, temperature dependence, and load path dependent response in crystalline solids. In the 1950s, the understanding of defects in deformed bodies was explored through
incompatibility theory, introducing a precise mathematical structure to describe the deformation of solid bodies. While these theories dealt with the internal elastic strains associated with defects in a deformed body, the finite deformation kinematics resulting from the development of these theories formed the basis for many finite deformation models of plasticity. Meanwhile, direct links were established between incompatibility theory and the modern differential geometry approach employed by physicists to describe gravitation theory and twisted and curved spaces. With the introduction of faster and larger computers, crystal plasticity models became very important tools in the design of many large strain manufacturing processes, such as rolling. These approaches illustrated the importance of describing the underlying crystalline structure of metals in predicting evolving anisotropy and texture. These theories greatly enhanced the understanding of the effects of the deformation on the integrity of the final manufactured parts and resulted in significant improvements in metal forming processes. Phenomenological models were developed introducing plastic spin as a necessary link between crystal plasticity models and conventional engineering plasticity theories. Simultaneously with these developments, efforts were advanced to predict the self-organization of crystals deformed to extremely large strains into cell or wall like structures. In these cases, the original crystal nature of the solid becomes less dominant than the properties of the smaller cell structure in determining the mechanical response to further loading. Dislocation theories, reaction-diffusion theories and other models involving higher order spatial gradients of strain or state have been proposed to model these self-organization processes.
At very large deformations the schematic representation for a series of misoriented cells with alternating polarity to satisfy mesoscale angular momentum balance is identical to the Bénard instability resulting from convection between two plates of unequal temperature. Descriptions of large strain localization often contain descriptors such as rotational instability, previously associated solely with fields such as turbulence. This brief introduction is intended to illustrate how descriptions of the inelastic response of crystalline materials are extremely diverse and in some cases seemingly unrelated. As a result, this perspective cannot begin to cover all aspects of either crystal deformation or associated models. Instead, an attempt to relate some of the approaches of the various articles in this chapter is presented. The concept of a dislocation has been utilized in modeling the internal state and mechanical response of crystals spanning orders of magnitude of length scales from atomistic to macroscopic. The approaches to modeling inelastic response at different length scales are equally varied and include molecular dynamics (MD) simulations, discrete dislocation simulations, and phenomenological continuum theories, including crystal plasticity and theories considering the average motion of groups or densities of dislocations.
In addition to the increasing importance of surface effects at smaller length scales due to the increased surface to volume ratio, theories at smaller length scales are also characterized by an increased number of degrees of freedom. An approach to encompass many of these features will be considered, beginning with a very brief overview of some of the important early developments in this field.
1. Historical Overview
The oldest description of plasticity is that proposed by Coulomb [1] and Tresca (1868) and amended by Saint-Venant [2, 3]. It is based upon the assumption that plastic flow initiates when the maximum shear stress reaches a critical value. This describes a hexagonal prism in principal stress space with the axis of the prism having equal inclinations to all of the coordinate axes. If the stress state lies inside the surface, the material responds elastically, whether loading or unloading. The experimental works of Bauschinger (1886) [4] were critical in enhancing the understanding of the inelastic response of metals in terms of unloading. An alternate criterion for the initiation of plastic flow was proposed independently by Huber [5] and von Mises [6], in which plastic flow is initiated when the distortional energy reaches a critical value. This theory was later redefined by Hencky [7]. The Tresca surface in stress space is inscribed within the von Mises ellipse as shown in Fig. 1, as both theories predict a volume preserving deformation.
Figure 1. A plane in principal stress space (axes S1 and S2) depicting the Tresca yield surface inscribed within the von Mises yield surface.
The mathematical treatment of defects in a body began in the early twentieth century with Volterra (1907), who studied the elastic stress field around displacement discontinuities in a continuous medium. Volterra introduced six
fundamental defects through a thick walled cylinder, as shown in Fig. 2, with a cut extending the axial length of the cylinder. Three of the discontinuities were introduced by a translational displacement of one side of the cut radially, circumferentially, and axially, and represent two types of edge dislocation and a screw dislocation, respectively. The other three defects, introduced by rotating or twisting the opposite faces of the cut as seen in Fig. 2, are called rotational dislocations or disclinations.
Figure 2. (a) The original cut in the cylindrical tube considered by Volterra, (b) edge dislocation created by translating one face of the cut surface radially inward, (c) edge dislocation created by translating the faces of the cut apart, (d) screw dislocation created by slipping one face of the cut axially with respect to the other, (e), (f) and (g) rotational dislocations or disclinations created by rotating or twisting the faces of the cut surface.
Parallel with advances in macroscopic descriptions of inelastic deformation and Volterra’s elasticity solutions for individual defects, x-ray diffraction techniques were utilized to develop a better understanding of the crystallographic nature of metals. Frenkel (1926) calculated the theoretical shear strength of a crystal and determined that it greatly exceeded experimentally observed results. To account for this striking difference, Taylor [8], Orowan [9], and Polanyi [10] independently postulated the existence of dislocations as a mechanism for crystal deformation at stress levels far below the theoretical shear strength. Orowan [11] considered the mean rather than the individual aspects of dislocation motion in an attempt to describe macroscopic flow. He postulated that the rate of plastic deformation, $\dot{\varepsilon}^{p}$, was determined by the number of mobile dislocations per unit area and their rate of propagation,

$$\dot{\varepsilon}^{p} = \rho_m b v \tag{1}$$
where $\rho_m$, $v$ and $b$ are the average mobile dislocation density, speed and Burgers vector, respectively. Johnston and Gilman [12] experimentally measured the average dislocation velocity as a function of stress in lithium fluoride crystals and proposed an empirical relationship to describe their results. They also assumed that the rate of increase of mobile dislocations is proportional to the flux of mobile dislocations and the rate of immobilization proportional to the square root of the mobile density. When coupled with an empirical expression for dislocation velocity, and using Orowan’s relationship with the assumption that all dislocations are mobile, the yield phenomena in LiF crystals were accurately predicted. This success led to the widespread adoption of what Argon [13] calls the “dilute solution” approach to dislocation motion. In this approach dislocations are assumed to move in a quasi-viscous manner under the action of an applied stress. The velocity of the motion is determined by the lattice resistance or friction. Interactions between dislocations are assumed negligible or are accounted for by an effective stress (applied stress minus back stress). Hardening, instead of being related to a rate mechanism, is defined in terms of the effective stress. Along these lines, Webster (1966) assumed that the time rate of change of dislocation density was due to multiplication and immobilization processes in a manner analogous to Gilman. Substituting an empirical relationship for the dislocation velocity, in which he assumed an exponential dependence upon the applied stress, resulted in

$$\dot{\rho} = n + \alpha\rho - \beta\rho^{2} \tag{2}$$
where $n$, $\alpha$ and $\beta$ are independent of dislocation velocity and are functions of stress and temperature. For conditions of constant stress and temperature, Webster accurately predicted stage I and II creep in brass and aluminum oxide crystals. Following Taylor [8], strain hardening is described by assuming the flow stress consists of an athermal component that is proportional to the shear modulus $\mu$ and a thermal component dependent upon strain rate and temperature. Taylor assumed a random distribution of dislocations in a network with average spacing between dislocations given by $\lambda$, which by geometry is inversely proportional to the square root of the dislocation density. The stress any dislocation experiences due to the forces exerted on it by its neighbors is given as

$$\tau = \frac{\mu b}{2\pi\lambda} = \frac{\mu b}{2\pi}\sqrt{\rho} \tag{3}$$

The strain rate and temperature dependence of the plastic flow arises as dislocations are able to surmount these local obstacles through thermal fluctuations. Kocks (1975) has labeled the stress field associated with local obstacles the mechanical threshold stress and utilized this formulation in the development of a macroscopic model of plastic flow. Mecking and Kocks [14] have also proposed a more physically based internal state variable evolution equation for the dislocation density based upon dislocation storage-recovery in the Taylor lattice. They proposed that in an increment of plastic strain, dislocations are stored at a rate inversely proportional to the mean free path $l$, and recover at a rate proportional to the density of dislocations. In a Taylor lattice the mean free path is inversely proportional to the square root of the dislocation density, therefore

$$\frac{d\rho}{d\varepsilon^{p}} = \frac{c_1}{l} - c_2\rho = c_1\sqrt{\rho} - c_2\rho. \tag{4}$$
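As a minimal numerical sketch of the storage-recovery balance in Eq. 4 and the Taylor relation in Eq. 3, the following integrates the dislocation density with a simple forward-Euler scheme. The function names, the integration scheme, and all parameter values ($c_1$, $c_2$, $\mu$, $b$, $\rho_0$) are assumptions chosen only for illustration; they are not taken from Mecking and Kocks [14].

```python
import math


def integrate_kocks_mecking(rho0, c1, c2, d_eps, n_steps):
    """Forward-Euler integration of d(rho)/d(eps_p) = c1*sqrt(rho) - c2*rho (Eq. 4)."""
    rho = rho0
    history = [rho]
    for _ in range(n_steps):
        rho += d_eps * (c1 * math.sqrt(rho) - c2 * rho)
        history.append(rho)
    return history


def taylor_stress(rho, mu, b):
    """Taylor relation tau = mu*b*sqrt(rho)/(2*pi) (Eq. 3)."""
    return mu * b * math.sqrt(rho) / (2.0 * math.pi)


# Hypothetical order-of-magnitude values, assumed for illustration only
mu = 45.0e9      # shear modulus, Pa
b = 2.5e-10      # Burgers vector magnitude, m
c1 = 1.0e9       # storage coefficient, 1/m
c2 = 20.0        # recovery coefficient, dimensionless
rho0 = 1.0e12    # initial dislocation density, 1/m^2

history = integrate_kocks_mecking(rho0, c1, c2, d_eps=1.0e-4, n_steps=20000)

# Setting the right-hand side of Eq. 4 to zero gives the saturation density
rho_sat = (c1 / c2) ** 2
```

Setting the right-hand side of Eq. 4 to zero shows that storage and recovery balance at $\rho_{sat} = (c_1/c_2)^2$, so the flow stress computed from Eq. 3 saturates rather than growing without bound.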
This internal state variable evolution equation for dislocation density (or similar forms) has been introduced to describe hardening in many phenomenological plasticity models. It was not until dislocations in crystalline slip were mathematically described that the significance of Volterra’s solutions became apparent. Taylor investigated the properties of straight dislocation lines in an elastic continuum; Burgers [15] studied the properties of curved dislocation lines using an analogy with vortex lines in hydrodynamics; and Peach and Koehler [16] derived the configurational force on a dislocation in an arbitrary stress field. These, coupled with the studies of Peierls [17], Nabarro [18] and others, began the study of the elastic interaction of defects with the crystal lattice, which ultimately resulted in the computational models of discrete dislocations of Zbib (1992), Ghoniem [19], Van der Giessen and Needleman [20] and many others. These models yield solutions to complex boundary value problems, enhancing our understanding of defect interaction, and of equal importance provide insight into mesoscale modeling efforts, such as boundary conditions in strain gradient theories. One of the simplest representations of a dislocation line is through the concept of a Burgers circuit, an atom-by-atom closed circuit drawn in a crystal. Consider a Burgers circuit around the extra plane of atoms (edge dislocation) as depicted in Figs. 3 and 4. The closure failure resulting from the extra plane of atoms is termed the Burgers vector and is a measure of the presence of the dislocation. The Burgers circuit also provides a means of distinction between statistically stored dislocations (SSDs) and geometrically necessary dislocations (GNDs). Ashby postulated that GNDs naturally occurred during plastic flow in crystals to ensure overall compatibility of the total deformation.
For example, GNDs are created to prevent gaps or overlaps as crystals rotate with respect to each other during polycrystalline deformation. Other examples of GNDs occur during indentation, at precipitate particles, or at other dislocation pileups that form during deformation. SSDs occur under homogeneous deformation in positive and negative pairs, as in the network considered by Taylor (1940) or Mecking and Kocks [14] in their models of hardening. These types of dislocations are responsible for most of the hardening observed during deformation of crystals, but result in a compatible deformation, as evidenced by the lack of closure failure and zero Burgers vector (Fig. 5). Notice that this distinction depends upon the size of the representative volume element or Burgers circuit considered; if the size is chosen small enough, all dislocations are geometrically necessary.
Figure 3. Burgers circuit for a discrete geometrically necessary dislocation.
Figure 4. Continuum Burgers circuit around a dislocation line with tangent t and normal n.
Figure 5. Burgers circuit for statistically stored dislocations resulting in zero closure failure or incompatibility.
The similarity to Volterra’s concept of a dislocation is easily seen by constructing a Burgers circuit in a continuum. A dislocation line in a continuum is simply a cut, a displaced surface, or a line of singularities. Therefore, integrating the displacement around any closed circuit $C$ that encompasses the dislocation line yields a closure failure, the Burgers vector:

$$\oint_{C} d\mathbf{u} = -\mathbf{b} \tag{5}$$
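The closure failure in Eq. 5 can be checked numerically. The sketch below assumes the classical displacement field of a straight screw dislocation along the z-axis, $u_z = b\theta/2\pi$, which is not written out in the text, and sums $du_z$ around a circular circuit. The function name `closure_failure` and its parameters are hypothetical. The sign of the result depends on the sense in which the circuit is traversed, which is a matter of convention.

```python
import math


def closure_failure(b, center=(0.0, 0.0), radius=1.0, n_segments=3600, sense=1):
    """Sum du_z around a circular circuit for a screw dislocation on the z-axis.

    The displacement u_z = b*theta/(2*pi) is multivalued, so its smooth gradient
    grad u_z = b/(2*pi) * (-y, x)/r^2 is integrated along the path instead.
    sense=+1 traverses the circuit counter-clockwise, sense=-1 clockwise.
    """
    x0, y0 = center
    d_theta = sense * 2.0 * math.pi / n_segments
    total = 0.0
    for k in range(n_segments):
        theta = (k + 0.5) * d_theta              # midpoint of the segment
        x = x0 + radius * math.cos(theta)
        y = y0 + radius * math.sin(theta)
        dx = -radius * math.sin(theta) * d_theta  # path increment
        dy = radius * math.cos(theta) * d_theta
        r2 = x * x + y * y
        dudx = -b * y / (2.0 * math.pi * r2)
        dudy = b * x / (2.0 * math.pi * r2)
        total += dudx * dx + dudy * dy
    return total


b = 2.5e-10  # assumed Burgers vector magnitude, m

enclosing = closure_failure(b)                      # circuit around the line
reversed_sense = closure_failure(b, sense=-1)       # same circuit, opposite sense
outside = closure_failure(b, center=(3.0, 0.0))     # circuit missing the line
```

A circuit that encloses the dislocation line returns a closure failure of magnitude $b$ regardless of its radius, while a circuit that does not enclose the line closes exactly, which is the continuum counterpart of the atom-by-atom construction in Fig. 3.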
By considering a Burgers circuit in a plane normal to each axis, Nye [21] was able to construct a tensor that represented the dislocation lines piercing a volume and therefore contained information about the total Burgers vector at a continuum point. Nye related this dislocation density tensor to the curvature, which describes the rotation of the lattice, and examined the distribution of dislocations that resulted during bending. The finite deformation equivalent of this was developed in an attempt to solve the elasticity problem of the internal stress field in an unloaded (but previously loaded) body. Bilby et al. (1957) and Kröner [22] independently proposed that the deformation gradient be multiplicatively decomposed into elastic and plastic parts (Fig. 6),

$$\mathbf{F} = \mathbf{F}_e \mathbf{F}_p \tag{6}$$
$\mathbf{F}_p$ represents the plastic deformation from the prior loading, while $\mathbf{F}_e$ represents the elastic deformation in the unloaded body resulting from the presence of dislocations. The natural configuration is defined by unloading through $\mathbf{F}_e^{-1}$, but in general does not represent a compatible deformation state. Denoting reference configuration variables by upper case, current configuration variables by lower case, and quantities in the natural configuration by an over tilde, a line segment is mapped from the reference to the current configuration by

$$d\mathbf{x} = \mathbf{F}\,d\mathbf{X} \tag{7}$$

Figure 6. Multiplicative decomposition of the deformation gradient into elastic and plastic parts.
Integrability conditions for a compatible deformation require that

$$\oint_{C} d\mathbf{x} = \oint_{C_0} \mathbf{F}\,d\mathbf{X} = 0 \quad \text{or} \quad \oint_{C_0} d\mathbf{X} = \oint_{C} \mathbf{F}^{-1}\,d\mathbf{x} = 0, \tag{8}$$
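where $C_0$ is any closed path surrounding an area $A_0$ with surface normal $\mathbf{N}$ in the reference configuration, and $C$, $A$ and $\mathbf{n}$ are the corresponding path, area and normal in the current configuration. Using Stokes’ theorem, Eq. 8 can be rewritten as integrals over the areas as

$$\oint_{C_0} \mathbf{F}\,d\mathbf{X} = -\int_{A_0} (\mathrm{Curl}\,\mathbf{F})\,\mathbf{N}\,dA = 0 \quad \text{and} \quad \oint_{C} \mathbf{F}^{-1}\,d\mathbf{x} = -\int_{A} (\mathrm{Curl}\,\mathbf{F}^{-1})\,\mathbf{n}\,da = 0. \tag{9}$$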
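This leads to the local form of compatibility as

$$\mathrm{Curl}\,\mathbf{F} = \mathbf{0} \quad \text{and} \quad \mathrm{Curl}\,\mathbf{F}^{-1} = \mathbf{0}, \tag{10}$$

where

$$\mathrm{Curl}\,(\bullet) = \nabla_X (\bullet) \cdot \boldsymbol{\varepsilon} \tag{11}$$

is taken with respect to the reference configuration and $\boldsymbol{\varepsilon}$ is the reference configuration alternator tensor. In Cartesian coordinates, Eq. 11 takes the form

$$\mathrm{Curl}\,\mathbf{F} = F_{ik,l}\,\varepsilon_{klj}\,\mathbf{e}_i \otimes \mathbf{e}_j \tag{12}$$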
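The local compatibility condition of Eq. 10 can be illustrated with the Cartesian form in Eq. 12. The sketch below evaluates $F_{ik,l}\,\varepsilon_{klj}$ by central differences for two assumed fields: one that is the gradient of a smooth mapping, and one that is not the gradient of any mapping. The function names, the example fields, and the finite-difference step are hypothetical choices made only for illustration.

```python
import math


def levi_civita(i, j, k):
    """Alternator (permutation symbol) eps_ijk for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) / 2


def curl_F(F_func, X, h=1.0e-5):
    """(Curl F)_ij = F_ik,l * eps_klj (Eq. 12), with F_ik,l from central differences."""
    grad = []  # grad[l][i][k] = dF_ik / dX_l
    for l in range(3):
        Xp, Xm = list(X), list(X)
        Xp[l] += h
        Xm[l] -= h
        Fp, Fm = F_func(Xp), F_func(Xm)
        grad.append([[(Fp[i][k] - Fm[i][k]) / (2.0 * h) for k in range(3)]
                     for i in range(3)])
    return [[sum(grad[l][i][k] * levi_civita(k, l, j)
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]


def F_compatible(X):
    """Gradient of the assumed mapping x = (X1 + 0.1*X2^2, X2 + 0.05*sin(X1), X3)."""
    return [[1.0, 0.2 * X[1], 0.0],
            [0.05 * math.cos(X[0]), 1.0, 0.0],
            [0.0, 0.0, 1.0]]


def F_incompatible(X):
    """Assumed field with F_12 = 0.1*X1, which is not the gradient of any mapping."""
    return [[1.0, 0.1 * X[0], 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]


X = [0.3, -0.7, 1.2]
curl_ok = curl_F(F_compatible, X)
curl_bad = curl_F(F_incompatible, X)
```

For the second field the only nonzero term is $F_{12,1}\,\varepsilon_{213} = -0.1$, so the (1,3) component of the curl is nonzero and the field cannot be integrated to a single-valued deformation. This is the situation the multiplicative parts $\mathbf{F}_e$ and $\mathbf{F}_p$ are in individually, even when their product is compatible.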
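Substituting Eq. 6 into Eq. 10 and mapping to the natural configuration, it follows that compatibility is satisfied if the elastic dislocation density tensor is equal to the negative of the plastic dislocation density tensor, both in the intermediate configuration, or

$$\det(\mathbf{F}_p)\,\bar{\boldsymbol{\alpha}}_p^{T} = -\bar{\mathbf{A}}_e^{T} \tag{13}$$

where

$$\bar{\boldsymbol{\alpha}}_p = \mathbf{F}_p\,\mathrm{Curl}\,\mathbf{F}_p, \qquad \bar{\mathbf{A}}_e = \mathrm{Curl}\,(\mathbf{F}_e)\,\mathbf{F}_e^{-T} \tag{14}$$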
These quantities sum to give the total dislocation density tensor in the intermediate configuration. Therefore, the total dislocation density is merely the sum of plastic and elastic dislocation density tensors – just like strains. And just like strains, these tensors can be mapped to any configuration where a similar relation holds. Lardner showed that the vanishing of this total dislocation density tensor is equivalent to the vanishing of the net dislocation density tensor or vanishing of excesses of dislocations of the same sign (GND). Therefore, the presence of geometrically necessary dislocations results in an incompatible
natural configuration, and the elastic deformation gradient produces a rotation to restore compatibility to the total deformation. These equations were obtained previously by Teodosiu [23], Werne (1976) and recently by Steinmann [24], while the small strain formulation of these equations was first obtained by Kröner [22]. Bilby et al. [25] and Kondo (1952) also obtained the dislocation density tensor using concepts from tensor calculus. In this approach, the dislocation density tensor arises because the Cartan torsion tensor does not vanish, and therefore the intermediate configuration is not a Euclidean space.
There has been a resurgence of interest in regularizing or adding a mathematical length scale to these continuum models of deformation. The motivation for this comes from several sources. From the point of view of numerical solutions, it is well established that in the post-bifurcation regime of a solution (initiation of either strain localization or damage associated with material softening) the system of differential equations changes type, generating an ill-posed problem. In codes modeling hyperbolic systems, the differential equations become elliptic in the post-bifurcation regime, but the boundary conditions are still prescribed for hyperbolic systems. Similarly, in static codes, elliptic systems transform into hyperbolic systems, but the associated boundary conditions are still prescribed for the original elliptic system. This incongruence results in pathological mesh dependence; in other words, the solution does not converge as the mesh is refined. This problem can be resolved by regularizing or adding a mathematical length scale to the continuum, either in the form of spatial gradients in the constitutive model or by some numerical construction. For example, let us consider the solution of plane strain extension of a block of material using the Gurson damage model as described in a previous section of this chapter [26].
As the mesh size is continually reduced, the damage localizes into a smaller concentration associated with the smallest element (Figs. 7 and 8). The associated load displacement curve is also reduced with reducing mesh size. However, the addition of a Laplacian of effective plastic strain to the yield function results in convergence to a finite damage band whose width is proportional to the constant introduced in the strain gradient term.
Figure 7. The contours of localized damage continue to decrease to a finer width as the mesh is refined (8×32, 16×64, 24×96, and 32×128 meshes). No convergence.
Another motivation for models containing a mathematical length scale results from the attempt to solve boundary value problems at extremely small length scales. Recent experimental studies have revealed that material properties change as a function of size. Fleck et al. [27] presented experimental data on the torsion of thin wires, which showed the breakdown of local theory at very small wire diameters. At these small diameters, the flow stress increased with decreasing radius of the torsion specimen. Similar effects have been observed in problems of microindentation, where the flow stress increases with decreasing indentor size (Ma and Clarke 1995 and Stelmashenko et al. 1993). For example, in mechanical tests on small specimens, when the specimen dimensions reached a critically small size, the yield strength began to increase sharply with further decrease in specimen dimension. This dimension is generally smaller than the reasonable applicability of local continuum theory, but still much larger than that required for tractable solutions of the problem with atomistic methods. This problem can also be resolved by the aforementioned introduction of a length scale using spatial gradients. Finally, as the demands of design require the incorporation of more of the underlying physics, it is becoming important to develop a means to bridge the length scales from the macroscopic continuum to the atomistic levels. One approach to achieve this is generalization, or the addition of more kinematic degrees of freedom to the continuum. This approach results in the development of a crystal plasticity model with a physical length scale. This can be accomplished by incorporating GNDs as an internal state variable or by choosing other higher order gradients of strain as state variables. Hence, the resulting model of the crystal includes a natural, physical length scale.
Figure 8. The contours of damage converge to a finite band associated with the length scale introduced by the spatial gradient (8×32, 16×64, 24×96, and 32×128 meshes).
2. Internal State Variable Model of Gradient Crystal Plasticity
Many macroscopic models of plasticity have been developed that incorporate SSDs and an appropriate evolution equation to describe experimentally observed hardening. An incomplete list includes Teodosiu [23], Rice [28], Bammann (1984), Miller [29], Chaboche (1967), Hart [30], Kratochvil and Dillon [31], Perzyna (1964), Krieg et al. [32] and Bodner and Partom [33]. Teodosiu [23] was first to embed these concepts within the framework of the thermodynamics of internal state variables [34]. These types of models have been implemented into finite element codes and used to solve a wide range of boundary value problems extending over regimes from creep to the shock regime and cryogenic temperatures to melt. A complete overview of this internal state variable approach is presented earlier in this chapter. The use of internal state variable theory permits the formal introduction of the physics
associated with dislocation, void and crack mechanics, including both the stored energy and dissipation associated with these defects. As stated previously, formal internal state variable theory provides the format to include statistically stored dislocations and geometrically necessary dislocations. Since GNDs introduce an incompatibility into the plastic deformation, their inclusion introduces a length scale that removes the problem of post-bifurcation nonuniqueness and allows the specimen size dependence observed at very small length scales to be described. Theories of this type have been proposed by Teodosiu [23], Acharya and Bassani (1996), Acharya and Beaudoin (1999), Bammann (2000), Cermelli and Gurtin (2000), Gurtin [35], and Svendsen (2001), among others. As a simplified example of this approach, neglect temperature, in which case the mechanical version of the second law of thermodynamics simply states that the rate of change of internal energy cannot exceed the rate of work done on the body,

$$\dot{\psi} \le \boldsymbol{\sigma} \cdot \mathbf{d}^{p}, \tag{15}$$

where $\boldsymbol{\sigma}$ is the Cauchy stress and $\mathbf{d}^{p}$ the plastic stretching or strain rate. Now assume that the Helmholtz free energy depends upon the intermediate configuration elastic strain $\bar{\mathbf{E}}_e$, the elastic lattice strain associated with a network of statistically stored dislocations $\varepsilon_{ss}$, and a strain like measure of GNDs $\boldsymbol{\kappa}_e$,

$$\boldsymbol{\kappa}_e = l_1 J_e\,\mathbf{F}_e^{-1}\left(\mathrm{Curl}\,\mathbf{F}_e^{-1}\right)^{T}, \qquad \varepsilon_{ss} = b\sqrt{\rho_{ss}} \tag{16}$$
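This is a natural incorporation of GNDs from compatibility theory [20] and the theory of Taylor from dislocation mechanics. Therefore, if we assume that the free energy takes the form $\psi(\bar{\mathbf{E}}_e, \varepsilon_{ss}, \boldsymbol{\kappa}_e)$, substitute into the mechanical version of the second law, equate like terms, and make the standard assumptions of independence commonly utilized in internal state variable theory, we get the following:

\[ \bar{\boldsymbol{\sigma}} = \frac{\partial \psi(\bar{\mathbf{E}}_e, \varepsilon_{ss}, \boldsymbol{\kappa}_e)}{\partial \bar{\mathbf{E}}_e}, \qquad \hat{\tau} = \frac{\partial \psi(\bar{\mathbf{E}}_e, \varepsilon_{ss}, \boldsymbol{\kappa}_e)}{\partial \varepsilon_{ss}}, \qquad \boldsymbol{\chi} = \frac{\partial \psi(\bar{\mathbf{E}}_e, \varepsilon_{ss}, \boldsymbol{\kappa}_e)}{\partial \boldsymbol{\kappa}_e} \tag{17} \]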
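and the dissipation reduces to

\[ \bar{\boldsymbol{\sigma}} \cdot \mathbf{d}^p - \hat{\tau}\, \dot{\varepsilon}_{ss} - \boldsymbol{\chi} \cdot \dot{\boldsymbol{\kappa}}_e \ge 0 \tag{18} \]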
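Now assuming that the free energy is quadratic in all the elastic strain-like variables,

\[ \rho \psi(\bar{\mathbf{E}}_e, \varepsilon_{ss}, \boldsymbol{\kappa}_e) = \tfrac{1}{2} \mu\, \bar{\mathbf{E}}_e^{\,2} + \tfrac{1}{2} K\, \mathrm{tr}^2(\bar{\mathbf{E}}_e) + c_\tau \mu\, \varepsilon_{ss}^{2} + c_2 \mu\, |\boldsymbol{\kappa}_e|^2 \tag{19} \]

Then, with constant factors absorbed into the moduli and coefficients,

\[ \hat{\tau} = c_\tau \mu\, \varepsilon_{ss} = c_\tau \mu b \sqrt{\rho_{ss}}, \qquad \bar{\boldsymbol{\sigma}} = \lambda\, \mathrm{tr}(\bar{\mathbf{E}}_e)\, \mathbf{I} + 2 \mu \bar{\mathbf{E}}_e, \qquad \boldsymbol{\chi} = c_2 \mu\, \boldsymbol{\kappa}_e \tag{20} \]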
Notice that the Taylor model of internal strength is naturally recovered. Up to this point, this model is similar to a mechanical version of the one proposed by Teodosiu. To complete the theory, evolution equations are required for the SSD and GND densities. In addition to these evolution equations, Teodosiu required the div $\boldsymbol{\kappa}_e$ since the divergence of the curl of a field must vanish. In effect, this introduced a mathematical length scale into the model. With the advent of crystal plasticity, this is unnecessary. The expression for the plastic velocity gradient is given as

\[ \mathbf{L}^p = \sum_i \dot{\gamma}^{(i)}\, \mathbf{s}^{(i)} \otimes \mathbf{n}^{(i)} \tag{21} \]

where $\mathbf{s}^{(i)}$ and $\mathbf{n}^{(i)}$ are the slip direction and slip plane normal, respectively. The magnitude of slip along a particular system is usually chosen as a power law function of the applied shear stress on that system and the slip resistance, $\hat{\tau}$:

\[ \dot{\gamma}^{(i)} = \dot{\gamma}_0^{(i)} \left( \frac{\bar{\boldsymbol{\sigma}}\, \mathbf{n}^{(i)} \cdot \mathbf{s}^{(i)}}{\hat{\tau}} \right) \tag{22} \]
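As an illustration of Eqs. (21) and (22), the following minimal sketch evaluates the resolved shear stress on each slip system, converts it to a slip rate, and sums the contributions into the plastic velocity gradient. The function names, the reference rate `gamma0_dot` and the rate exponent `m` are assumptions introduced here for illustration; Eq. (22) as printed shows no exponent, which corresponds to `m = 1`.

```python
import numpy as np


def _unit(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)


def resolved_shear_stress(sigma, s, n):
    """Resolved shear stress (sigma n) . s on a slip system."""
    return float(_unit(s) @ (np.asarray(sigma, dtype=float) @ _unit(n)))


def slip_rate(sigma, s, n, tau_hat, gamma0_dot=1.0e-3, m=1.0):
    """Slip rate on one system.

    m is a hypothetical rate exponent; m = 1 gives the ratio form of Eq. (22).
    The sign of the resolved shear stress is kept so that slip can reverse.
    """
    ratio = resolved_shear_stress(sigma, s, n) / tau_hat
    return gamma0_dot * np.sign(ratio) * abs(ratio) ** m


def plastic_velocity_gradient(sigma, systems, tau_hat, gamma0_dot=1.0e-3, m=1.0):
    """Sum of gamma_dot * (s outer n) over the slip systems, as in Eq. (21)."""
    Lp = np.zeros((3, 3))
    for s, n in systems:
        rate = slip_rate(sigma, s, n, tau_hat, gamma0_dot, m)
        Lp += rate * np.outer(_unit(s), _unit(n))
    return Lp


# Example: a pure shear stress acting on a single slip system.
sigma = np.array([[0.0, 50.0, 0.0],
                  [50.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
systems = [((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
Lp = plastic_velocity_gradient(sigma, systems, tau_hat=100.0)
```

Because each slip direction is orthogonal to its plane normal, the resulting velocity gradient is traceless, which reflects the volume-preserving character of slip.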
The Kocks–Mecking model for the evolution of the statistically stored dislocations is modified to account for a mean free path associated with the GNDs (Acharya and Beaudoin (1999), Bammann (2000)) such that,
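\[ \dot{\rho}_{ss}^{(i)} = \left( \frac{c_1}{L_s^{(i)}} + \frac{c_2}{L_g^{(i)}} \right) \dot{\gamma}^{(i)} - c_3\, \rho_{ss}^{(i)}\, \dot{\gamma}^{(i)} \tag{23} \]

where, as before, the mean free path $L_s^{(i)}$ is inversely proportional to the square root of the density of SSDs and the mean free path for the GNDs is given as

\[ L_g^{(i)} = \left( \boldsymbol{\kappa}_e \mathbf{n}^{(i)} \cdot \mathbf{s}^{(i)} \right)^{-1} \tag{24} \]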
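The interplay of Eqs. (20), (23) and (24) can be sketched numerically. The example below integrates Eq. (23) in the accumulated slip with an explicit Euler scheme, once without and once with a GND contribution, and then evaluates the Taylor strength of Eq. (20). Several choices are assumptions made only for illustration: the proportionality constant in $L_s = 1/\sqrt{\rho_{ss}}$ is set to one, the absolute value is used in Eq. (24) so that the mean free path is positive, the constants are nondimensional placeholders, and the function names are hypothetical.

```python
import math
import numpy as np


def inverse_gnd_mean_free_path(kappa_e, s, n):
    """1 / L_g = |(kappa_e n) . s|, the reciprocal of Eq. (24) (absolute value assumed)."""
    kappa_e = np.asarray(kappa_e, dtype=float)
    s = np.asarray(s, dtype=float)
    n = np.asarray(n, dtype=float)
    return abs(float(s @ (kappa_e @ n)))


def evolve_ssd_density(rho0, gamma_total, inv_Lg, c1, c2, c3, steps=20000):
    """Explicit Euler integration of Eq. (23) in the slip gamma (gamma_dot > 0)."""
    rho = rho0
    dgamma = gamma_total / steps
    for _ in range(steps):
        inv_Ls = math.sqrt(rho)  # assumed: L_s = 1 / sqrt(rho_ss)
        rho += (c1 * inv_Ls + c2 * inv_Lg - c3 * rho) * dgamma
    return rho


def taylor_strength(rho_ss, mu, b, c_tau):
    """Internal strength tau_hat = c_tau * mu * b * sqrt(rho_ss), from Eq. (20)."""
    return c_tau * mu * b * math.sqrt(rho_ss)


# Nondimensional placeholder constants (assumptions, not fitted values).
s = np.array([1.0, 0.0, 0.0])
n = np.array([0.0, 1.0, 0.0])
kappa_e = 3.0 * np.outer(s, n)          # gives 1 / L_g = 3 on this system
inv_Lg = inverse_gnd_mean_free_path(kappa_e, s, n)

rho_no_gnd = evolve_ssd_density(1.0, 10.0, 0.0, c1=10.0, c2=1.0, c3=2.0)
rho_gnd = evolve_ssd_density(1.0, 10.0, inv_Lg, c1=10.0, c2=1.0, c3=2.0)
```

Without GNDs the density saturates at $(c_1/c_3)^2$, where storage and recovery balance; the additional GND mean free path raises the saturation density and therefore the Taylor strength.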
The evolution for the curl of the inverse of the elastic deformation gradient can be solved for explicitly since the plastic spin is known. This is a very important step in the ability to accurately predict GNDs which develop during deformation – no phenomenological evolution equation is required! And, the geometrically necessary dislocations that are generated during a deformation are precisely calculated to within the accuracy of the crystal plasticity model. This model comprises information assembled from many different fields. Beginning with local continuum mechanics, a multiplicative decomposition of the deformation gradient was introduced that implicitly introduced more degrees of freedom into the model and required an additional equation over classical small strain plasticity – the plastic spin. More degrees of
freedom were introduced through internal state variables that were chosen to represent real physical entities – statistically stored dislocations and geometrically necessary dislocations. Only elastic strains or strain like quantities appear in the free energy since the elastic distortions of the atoms associated with either external loading or internal defects are quantities that actually cause changes in the free energy. The Taylor model of internal strength is recovered in the model and the Mecking and Kocks evolution equation for SSDs is required because of the extra degree of freedom associated with the internal state variable. The internal state variable for GNDs draws upon compatibility theory and really goes back to the original Volterra concept of a dislocation. Since the GNDs are inherently tied to the rotations that occur during deformation, the expression for the plastic spin is sufficient to specify κe . The resulting model predicts observed sample size effects as illustrated in changing the dimensions of a block in simple shear (Fig. 9). In addition, as long as the mesh size is smaller than the introduced length scale, convergence to a solution results in the post bifurcation regime. Issues yet to be addressed include appropriate microforce balance laws for the conjugate thermodynamic
forces associated with the internal state variables and the extra boundary conditions associated with the extra spatial gradient in the model. This will require micromechanical analysis such as discrete dislocation simulations and micromechanics of configurational forces. The importance of the plastic spin in local finite deformation plasticity was independently recognized by Loret (1983) and Dafalias (1983), who characterized the plastic spin using representation theorems for isotropic tensor functions. Other models were later developed based upon single slip by Bammann and Aifantis (1987) and double slip by Prantil et al. (1993), Zbib [36] and Van der Giessen [37]. An extension of the gradient crystal plasticity theory such as the one discussed above has been proposed by Regueiro et al. [38]. Alternatively, Clayton and McDowell (2002) directly calculated the incompatibilities associated with crystal slip and microcracks at the meso level, resulting in a homogenized estimate of an incompatible deformation gradient at the macroscale level. Other approaches include higher order gradients of strain as state variables in an attempt to accomplish the same effects, but they are too numerous to list here.

Figure 9. Model prediction of specimen size effect in simple shear.
3. Summary
Local theory treats a body as a “continuum” of particles or points, the only geometrical property being that of position. A closer look at materials reveals a complex microstructure of grains, subgrains, shear bands and other topological features of the distribution of mass that are not taken into account by classical local theories. If the observer is far enough removed from a grain, he will see only a point. But a theory that strips away all of the geometrical properties of a grain except for the position of its center of mass will certainly fail to explain the more complex aspects of its mechanical response. In addition, the finite element implementation of this theory is incapable of dealing with boundary value problems where instabilities such as strain localization or fracture initiation result in a bifurcation in the solution. At the onset of these instabilities, the system of differential equations loses ellipticity and the problem becomes mathematically ill posed, resulting in pathological mesh dependency of the solutions. Overcoming these difficulties requires a multi-field approach. The continuum must be embedded with more degrees of freedom in an attempt to capture the large number of degrees of freedom associated with the physics occurring at very small length scales. This can be accomplished by increasing the degrees of freedom in the kinematics, by introducing internal state variables, or by introducing higher order spatial gradients of variables. But in each case, an additional degree of freedom requires additional information to complete the system. For example, the multiplicative decomposition of the deformation gradient into elastic and plastic parts results in the
need for an expression for the plastic spin. This is where the physics from smaller length scales must be embedded. Similarly, the extra degrees of freedom associated with internal state variables require equations that describe the temporal and possibly spatial evolution of the variables. A physical, predictive theory requires that this information come from appropriate micromechanical models, including any additional boundary conditions. This approach allows a true bridging of length scales and, with increasing computational capabilities, more detailed and physically descriptive models will be developed.
References
[1] C.A. Coulomb, Théorie des machines simples, Paris, 1821.
[2] B.d. St. Venant, Compt. Rend., 70, 1870; B.d. St. Venant, Compt. Rend., 74, 1009, 1872.
[3] J. Bauschinger, Mitt. Mech. Tech. Lab. München, Vol. 13, No. 1, 1886.
[4] M.T. Huber, Czasopismo Techniczne, Lwów, 1904.
[5] R.v. Mises, Nachrichten Ges. d. Wiss. Goettingen, 582, 1913.
[6] H. Hencky, Z. angew. Math. Mech., 5, 116, 1925.
[7] G.I. Taylor, Proceedings of the Royal Society A, 145, 362, 1934.
[8] E. Orowan, Z. Phys., 84, 634, 1934.
[9] M. Polanyi, Z. Phys., 660, 1934.
[10] E. Orowan, Proceedings of the Physical Society of London, 52, 8, 1940.
[11] W.G. Johnston and J.J. Gilman, J. Appl. Phys., 30, 129, 1959.
[12] A.S. Argon, Mat. Sci. Eng., 3, 24, 1968/1969.
[13] M.F. Ashby, “The deformation of plastically non-homogeneous materials,” Phil. Mag., 21, 399–424, 1970.
[14] H. Mecking and U.F. Kocks, “Kinetics of flow and strain-hardening,” Acta Metall., 29, 1865–1875, 1981.
[15] J.M. Burgers, “Some considerations on the fields of stress connected with dislocations in a regular crystal lattice I,” Proceedings of the Koninklijke Nederlandse Akademie van Wetenschappen, 42, 293–324, 1939.
[16] M. Peach and J.S. Koehler, “The forces exerted on dislocations and the stress fields produced by them,” Physical Review, 80, 436–439, 1950.
[17] R.E. Peierls, P. Phys. Soc. Lond., 52, 34, 1940.
[18] F.R.N. Nabarro, “Dislocations in a simple cubic lattice,” Proceedings of the Physical Society of London, V. 59, 332, 256–272, 1947.
[19] N.M. Ghoniem and R.J. Amodeo, “Computer simulation of dislocation pattern formation,” Sol. Stat. Phenom., 3&4, 379–406, 1988.
[20] E. Van der Giessen and A. Needleman, “Discrete dislocation plasticity: a simple planar model,” Mater. Sci. Eng., 18, 41, 1995.
[21] J.F. Nye, “Some geometrical relations in dislocated crystals,” Acta Metall., 1, 153–162, 1953.
[22] E. Kröner, “Allgemeine Kontinuumstheorie der Versetzungen und Eigenspannungen,” Arch. Rat. Mech. Anal., 4, 273–334, 1960.
[23] C. Teodosiu, “A dynamic theory of dislocations and its application to the theory of elastic-plastic continuum,” In: Fundamental Aspects of Dislocation Theory, NBS Special Publication 317, U.S. Government Printing Office, Gaithersburg, MD, 1969.
[24] P. Steinmann, “Views on multiplicative elastoplasticity and the continuum theory of dislocations,” Int. J. Eng. Sci., 34, 1717–1735, 1996.
[25] B.A. Bilby and E. Smith, “Continuous distributions of dislocations III,” Proceedings of the Royal Society of London, A 232, 481–505, 1956.
[26] S. Ramaswamy and N. Aravas, “Finite element implementation of gradient plasticity models Part I: Gradient-dependent yield functions,” Comput. Meth. Appl. Mech. Engrg., 163, 11–32, 1998.
[27] N.A. Fleck, G.M. Muller, M.F. Ashby, and J.W. Hutchinson, “Strain gradient plasticity: theory and experiments,” Acta Metall. Mater., 42, 475–487, 1994.
[28] J.R. Rice, “Inelastic constitutive relations for solids: an internal-variable theory and its application to metal plasticity,” J. Mech. Phys. Solids, 19, 433–455, 1971.
[29] A.K. Miller, “An inelastic constitutive equation for monotonic, cyclic and creep deformation; part I, equations development and analytic procedures, part 2, application to type 304 stainless steel,” J. Eng. Mater. Tech., 98H, 97–113, 1976.
[30] E.W. Hart, “Constitutive relations for the nonelastic deformation of metals,” ASME J. Eng. Mater. Tech., 98, 193–202, 1976.
[31] J. Kratochvil and O.W. Dillon, Jr., “Thermodynamics of elastic-plastic materials as a theory with internal state variables,” J. Appl. Phys., 40, 3207–3218, 1969.
[32] R.D. Krieg, J.C. Swearengen, et al., “A physically based internal variable model for rate dependent plasticity,” Inelastic Behavior of Pressure Vessel and Piping Components ASME/CSME, PVP-PB-028, 15–36, 1978.
[33] S. Bodner and Y. Partom, “A large deformation elastic visco-plastic analysis of thick walled spherical shells,” J. Eng. Mater. Tech., 115, 358–364, 1972.
[34] B.D. Coleman and M.E. Gurtin, J. Chem. Phys., 47, 597–613, 1967.
[35] M.E. Gurtin, “A gradient theory of single-crystal viscoplasticity that accounts for geometrically necessary dislocations,” J. Mech. Phys. Solids, 50, 5–32, 2002.
[36] H.M. Zbib, “On the mechanics of large inelastic deformations: kinematics and constitutive modeling,” Acta Mech., 96, 119–138, 1993.
[37] E. Van der Giessen, “Micromechanical and thermodynamic aspects of the plastic spin,” Int. J. Plast., 7, 365–386, 1991.
[38] R.A. Regueiro, D.J. Bammann, E.B. Marin, and K. Garikipati, “A nonlocal phenomenological anisotropic finite deformation plasticity model accounting for dislocation defects,” J. Eng. Mat. Tech., 124, 380–387, 2002.
[39] A. Acharya and J.L. Bassani, “Incompatible lattice deformations and crystal plasticity,” In: N. Ghoniem (ed.), Plastic and Fracture Instabilities in Materials, AMD vol. 200/MD vol. 57, ASME, N.Y., pp. 75–80, 1995.
[40] A. Acharya and J.L. Bassani, “Lattice incompatibility and a gradient theory of crystal plasticity,” J. Mech. Phys. Solids, 48, 1565–1595, 2000.
[41] D.J. Bammann, “A model of crystal plasticity containing a natural length scale,” Mat. Sci. Eng., A309-310, 406–410, 2001.
[42] B.A. Bilby, R. Bullough, and E. Smith, “Continuous distributions of dislocations: a new application of the methods of non-Riemannian geometry,” Proceedings of the Royal Society of London, A 231, 263–273, 1955.
[43] G.C. Butler and D.L. McDowell, “Polycrystal constraint and grain subdivision,” Int. J. Plast., 14, 703–717, 1998.
[44] P. Cermelli and M.E. Gurtin, “On the characterization of geometrically necessary dislocations in finite plasticity,” J. Mech. Phys. Solids, 49, 1539–1568, 2001.
[45] J.L. Chaboche, “Viscoplastic relations for the nonelastic deformation of metals,” Bulletin de l’Academie des Sciences Techniques, 25, 33–42, 1977.
[46] J.D. Clayton and D.L. McDowell, “A multiscale multiplicative decomposition for elastoplasticity of polycrystals,” Int. J. Plast., in press, 2002.
[47] E. Cosserat and F. Cosserat, “Sur la mécanique générale,” Comptes Rendus de l’Académie des Sciences Paris, 145, 1139–1142, 1907.
[48] E. Cosserat and F. Cosserat, Théorie des Corps Déformables, Hermann, Paris, 1909.
[49] Y.F. Dafalias, “The plastic spin,” J. Appl. Mech., 107, 865–871, 1985.
[50] I. Demir, J.P. Hirth, and H.M. Zbib, “The Somigliana ring dislocation,” J. Elast., 28, 223–246, 1992.
[51] N.A. Fleck and J.W. Hutchinson, “A phenomenological theory for strain gradient effects in plasticity,” J. Mech. Phys. Solids, 41, 1825–1857, 1993.
[52] J. Frenkel, Zeit. Phys., 37, 572, 1926.
[53] J.J. Gilman, Proceedings of 5th U.S. National Congress Appl. Mech. ASME, 1966.
[54] J.J. Gilman, Micromechanics of Flow in Solids, New York, McGraw-Hill, 1969.
[55] M.F. Horstemeyer and D.L. McDowell, “Modeling effects of dislocation substructure in polycrystal elastoviscoplasticity,” Mech. Mater., 27, 145–163, 1998.
[56] D.A. Hughes, Q. Liu, D.C. Chrzan, and N. Hansen, “Scaling of microstructural parameters: misorientations of deformation induced boundaries,” Acta Mat., 45, 105–112, 1997.
[57] E.H. Lee and D.T. Liu, “Elastic-plastic theory with application to plane-wave analysis,” J. Appl. Phys., 38, 19–27, 1967.
[58] R.v. Mises, Z. angew. Math. Mech., 8, 161, 1928.
[59] Missing
[60] P. Perzyna, “The constitutive equations for work-hardening and rate sensitive plastic materials,” Proc. Vibr. Probl., 4, 281–290, 1963.
[61] B. Svendsen, “Continuum thermodynamic models for crystal plasticity including the effects of geometrically-necessary dislocations,” J. Mech. Phys. Solids, 50, 1297–1329, 2002.
3.3 DISLOCATION DYNAMICS
H.M. Zbib¹ and T.A. Khraishi²
¹Washington State University, Pullman, WA, USA
²University of New Mexico, Albuquerque, NM, USA
S. Yip (ed.), Handbook of Materials Modeling, 1097–1114. © 2005 Springer. Printed in the Netherlands.

Crystalline materials are usually far from being perfect and may contain various forms of defects, such as vacancies, interstitials and impurity atoms (point defects), dislocations (line defects), grain boundaries, heterogeneous interfaces and microcracks (planar defects), chemically heterogeneous precipitates, twins and other strain-inducing phase transformations (volume defects). Indeed, these defects determine to a large extent the strength and mechanical behavior of the crystal. Most often, dislocations define plastic yield and flow behavior, either as the dominant plasticity carriers or through their interactions with the other strain-producing defects.

A dislocation can be easily understood by considering that a crystal can deform irreversibly by slip, i.e., shifting or sliding along one of its atomic planes. If the slip displacement is equal to a lattice vector, the material across the slip plane will preserve its lattice structure and the change of shape will become permanent. However, rather than simultaneous sliding of two half-crystals, slip displacement proceeds sequentially, starting from one crystal surface and propagating along the slip plane until it reaches the other surface. The boundary between the slipped and still unslipped crystal is a dislocation, and its motion is equivalent to slip propagation. In this picture, crystal plasticity by slip is a net result of the motion of a large number of dislocation lines in response to the applied stress. It is interesting to note that this picture of deformation by slip in crystalline materials was first reported in the nineteenth century [1, 2], when it was observed that deformation of metals proceeded by the formation of slip bands on the surface of the specimen. The interpretation of these results was obscure since metals were not viewed as crystalline at that time.

Over the past seven decades, experimental and theoretical developments have firmly established the principal role of dislocation mechanisms in defining material strength. It is now understood that macroscopic properties of crystalline materials are derivable, at least in principle, from the behavior of their constituent defects. However, this fundamental understanding has not been translated into a continuum theory of crystal plasticity based on dislocation mechanisms. The major difficulty in developing such a theory is the multiplicity and complexity of the mechanisms of dislocation motion and interactions, which make it impossible to develop a quantitative analytical approach. The problem is further complicated by the need to trace the spatiotemporal evolution of a very large number of interacting dislocations over very long periods of time, as required for the calculation of plastic response in a representative volume element. Such practical intractability of the dislocation-based approaches, on one hand, and the developing needs of material engineering at the nano and micro length scales, on the other, have created the current situation in which equations of crystal plasticity used for continuum modeling are phenomenological and somewhat disconnected from the degrees of freedom related to the underlying dislocation behavior.

Bridging the gap between dislocation physics and continuum crystal plasticity has become possible with the advancement in computational technology with larger and faster computers. To this end, over the past two decades various discrete dislocation dynamics models have been developed. The early discrete dislocation models were two-dimensional (2D) and comprised periodic cells containing multiple dislocations whose behavior was governed by a set of simplified rules [3–8]. Although these simulations served as a useful conceptual framework, they were limited to 2D and, consequently, could not directly account for such important features in dislocation dynamics as slip geometry, line tension effects, multiplication, certain dislocation intersections and cross-slip, all of which are crucial for the formation of dislocation patterns.
In the 1990s, development of new computational approaches of dislocation dynamics (DD) in three-dimensional (3D) space generated hope for a principal breakthrough in our current understanding of dislocation mechanisms and their connection to crystal plasticity [9–12]. In these new models, dislocation motion and interactions with other defects, particles and surfaces are explicitly considered. However, complications with respect to dislocation multiplications, self-interactions and interactions with other defects, and keeping track of complex mechanisms and reactions have provided a new set of challenges for developing efficient computational algorithms. The DD analysis and its computer simulation modeling devised by many researchers [4, 12–16] have been advanced significantly over the past decade. This progress has been further magnified by the idea of coupling DD with continuum mechanics in computational algorithms such as finite element codes. This coupling may pave the way to better understanding of the local response of materials at the nano and micro scales and globally at the macroscale [17], increasing the potential for future applications of this method in material, mechanical, structural and process engineering analyses. In the following, the
principles of DD analysis will be presented, followed by the procedure for the measurement of local quantities such as plastic distortion and internal stresses. The incorporation of the DD technique into 3D plastic continuum mechanics-based finite element modeling will then be described. Finally, examples are provided to illustrate the applicability of this powerful technique in material engineering analysis.
1. Theoretical Fundamentals
In order to better describe the mathematical and numerical aspects of the DD methodology, we will first identify the basic geometric conditions and kinetics that control the dynamics of dislocations. This will be followed by a discussion of the dislocation equation of motion, the elastic interaction equations, and the discretization of these equations for numerical implementation.
1.1. Kinematics and Geometric Aspects
A dislocation is a line defect in an otherwise perfect crystal described by its line sense vector ξ and Burgers vector b. The Burgers vector has two distinct components: edge, perpendicular to its line sense vector, and screw, parallel to its line sense vector. Under loading, dislocations glide and propagate on slip planes, causing deformation and change of shape. When the local line direction becomes parallel to the Burgers vector, i.e., screw character, the dislocation may propagate into other slip planes. This switching of the slip plane, which makes the motion of dislocations 3D, is better known as cross slip and is an important recovery mechanism to be dealt with in DD. In addition to glide and cross slip, dislocations can also climb in a non-conservative 3D motion by absorbing and/or emitting intrinsic point defects, vacancies, and interstitials. Some of these mechanisms become important at high load levels or temperatures when point defects become more mobile. In summary, the 3D dislocation dynamics accounts for the following geometric aspects:
• Dislocation topology; 3D geometry, Burgers vector and line sense.
• Identification of all possible slip planes for each dislocation.
• Changes in the dislocation topology when part of it cross-slips and/or climbs to another plane.
• Multiplication and annihilation of dislocation segments.
• Formation of complex connections and intersections such as junctions, jogs, and branching of the dislocation in multiple directions.
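The decomposition of the Burgers vector into screw and edge components can be written in a few lines. The sketch below projects b onto the unit line sense vector to obtain the screw part and takes the remainder as the edge part; the function names and the angular tolerance used to flag screw character (the condition under which cross slip becomes possible) are assumptions made for illustration.

```python
import numpy as np


def burgers_components(b, xi):
    """Split b into a screw part (parallel to xi) and an edge part (perpendicular to xi)."""
    b = np.asarray(b, dtype=float)
    t = np.asarray(xi, dtype=float)
    t = t / np.linalg.norm(t)
    b_screw = (b @ t) * t
    b_edge = b - b_screw
    return b_screw, b_edge


def character_angle_deg(b, xi):
    """Angle between b and the line direction: 0 for screw, 90 for edge."""
    b = np.asarray(b, dtype=float)
    t = np.asarray(xi, dtype=float)
    cos_angle = abs(b @ t) / (np.linalg.norm(b) * np.linalg.norm(t))
    return float(np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0))))


def is_screw(b, xi, tol_deg=1.0):
    """True when the local character is screw to within tol_deg (hypothetical tolerance)."""
    return character_angle_deg(b, xi) <= tol_deg


b_screw, b_edge = burgers_components([1.0, 1.0, 0.0], [2.0, 0.0, 0.0])
```

In a DD code a test of this kind would typically be applied segment by segment, since the character varies along a curved dislocation line.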
1.2. Kinetics and Interaction Forces
The dynamics of the dislocation is governed by a “Newtonian” equation of motion, consisting of an inertia term, a damping term, and a driving force arising from short-range and long-range interactions. Since the strain field of the dislocation varies as the inverse of the distance from the dislocation core, dislocations interact among themselves over long distances. As the dislocation moves, it has to overcome internal drag and local barriers such as the Peierls stresses (i.e., lattice friction). The dislocation may encounter local obstacles such as stacking fault tetrahedra, defect clusters, and vacancies that interact with the dislocation at short ranges and affect its local dynamics. Furthermore, the internal strain field of randomly distributed local obstacles gives rise to stochastic perturbations to the encountered dislocations, as compared with deterministic forces such as the applied load. This stochastic stress field also contributes to the spatial dislocation patterning in the later deformation stages. Therefore, the strain field of local obstacles adds spatially irregular uncorrelated noise to the equation of motion. In addition to the random strain fields of dislocations or local obstacles, thermal fluctuations also provide a stochastic source in dislocation dynamics. Dislocations also interact with free surfaces, cracks, and interfaces, giving rise to what are termed image stresses or forces. In summary, the dislocation may encounter the following set of forces:
• Drag force, $B\mathbf{v}$, where $B$ is the drag coefficient and $\mathbf{v}$ is the dislocation velocity.
• Peierls stress $\mathbf{F}_{\text{Peierls}}$.
• Force due to externally applied loads, $\mathbf{F}_{\text{external}}$.
• Dislocation-dislocation interaction force $\mathbf{F}_{D}$.
• Dislocation self-force $\mathbf{F}_{\text{self}}$.
• Dislocation-obstacle interaction force $\mathbf{F}_{\text{obstacle}}$.
• Image force $\mathbf{F}_{\text{image}}$.
• Osmotic force $\mathbf{F}_{\text{osmotic}}$, which results from non-conservative motion of the dislocation (climb) and the accompanying absorption or emission of intrinsic point defects.
• Thermal force $\mathbf{F}_{\text{thermal}}$ arising from thermal fluctuations.

The DD approach attempts to incorporate all of the aforementioned kinematic and kinetic aspects into a computationally tractable framework. In the numerical implementation, three-dimensional curved dislocations are treated as a set of connected segments as illustrated in Fig. 1. It is possible to represent smooth dislocations with any desired degree of realism, provided that the discretization resolution is taken high enough for accuracy (limited by the size of the dislocation core radius $r_0$, typically the size of one Burgers vector $b$). In such a representation, the dynamics of dislocation lines is reduced to
the dynamics of discrete degrees of freedom of the dislocation nodes connecting the dislocation segments.

Figure 1. Discretization of dislocation loops and curves into nodes, segments and collocation points.
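The segment representation described above can be sketched for the simplest case of a planar circular loop. The example below places nodes on the loop, connects consecutive nodes into straight segments, and returns the unit tangent (local line sense) and length of each segment. The function names and the choice of a loop in the x-y plane are assumptions made for illustration.

```python
import numpy as np


def discretize_loop(radius, n_nodes, center=(0.0, 0.0, 0.0)):
    """Place n_nodes on a circular loop in the x-y plane and connect them into segments."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_nodes, endpoint=False)
    nodes = np.asarray(center, dtype=float) + radius * np.column_stack(
        (np.cos(theta), np.sin(theta), np.zeros(n_nodes))
    )
    # Segment j connects node j to node j + 1; the last segment closes the loop.
    segments = [(j, (j + 1) % n_nodes) for j in range(n_nodes)]
    return nodes, segments


def segment_tangents_and_lengths(nodes, segments):
    """Unit tangent (line sense) and length of every straight segment."""
    start = np.array([nodes[i] for i, _ in segments])
    end = np.array([nodes[j] for _, j in segments])
    vectors = end - start
    lengths = np.linalg.norm(vectors, axis=1)
    tangents = vectors / lengths[:, None]
    return tangents, lengths


nodes, segments = discretize_loop(radius=100.0, n_nodes=64)
tangents, lengths = segment_tangents_and_lengths(nodes, segments)
```

With 64 nodes the polygon length already agrees with the circumference of the loop to better than 0.1%, and refining the discretization improves the agreement further, subject to the core-radius limit noted above.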
1.3. Dislocation Equation of Motion
The velocity v of a dislocation segment s is governed by a first order differential equation consisting of an inertia term, a drag term and a driving force vector [18–20], such that
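\[ m_s \dot{\mathbf{v}} + \frac{1}{M_s(T, p)}\, \mathbf{v} = \mathbf{F}_s, \quad \text{with} \quad m_s = \frac{1}{v} \frac{dW}{dv} \tag*{$(1)_1$} \]

\[ \mathbf{F}_s = \mathbf{F}_{\text{Peierls}} + \mathbf{F}_{D} + \mathbf{F}_{\text{self}} + \mathbf{F}_{\text{external}} + \mathbf{F}_{\text{obstacle}} + \mathbf{F}_{\text{image}} + \mathbf{F}_{\text{osmotic}} + \mathbf{F}_{\text{thermal}} \tag*{$(1)_2$} \]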
In the above equation, the subscript $s$ stands for the segment, $m_s$ is defined as the effective dislocation segment mass density, $M_s$ is the dislocation mobility, which could depend both on the temperature $T$ and the pressure $p$, and $W$ is the total energy per unit length of a moving dislocation (elastic energy plus kinetic energy). As implied by $(1)_2$, the glide force vector $\mathbf{F}_s$ per unit length arises from a variety of sources described in the previous section. The following relations for the mass per unit dislocation length have been suggested [19] for screw, $(m_s)_{\text{screw}}$, and edge, $(m_s)_{\text{edge}}$, dislocations when moving at a high speed:

\[ \begin{aligned} (m_s)_{\text{screw}} &= \frac{W_0}{v^2} \left( -\gamma^{-1} + \gamma^{-3} \right) \\ (m_s)_{\text{edge}} &= \frac{W_0 C^2}{v^4} \left( -16\gamma_l - 40\gamma_l^{-1} + 8\gamma_l^{-3} + 14\gamma + 50\gamma^{-1} - 22\gamma^{-3} + 6\gamma^{-5} \right) \end{aligned} \tag{2} \]

where $\gamma_l = (1 - v^2/C_l^2)^{1/2}$, $\gamma = (1 - v^2/C^2)^{1/2}$, $C_l$ is the longitudinal sound velocity, $C$ is the transverse sound velocity, $\nu$ is Poisson’s ratio, $W_0 = (Gb^2/4\pi) \ln(R/r_0)$ is the rest energy for the screw per unit length, and $G$ is the shear modulus. The value of $R$ is typically equal to the size of the dislocation cell (about $1000b$) or, in the case of one dislocation, is the shortest distance from the dislocation to the free surface [21]. In the non-relativistic regime, when the dislocation velocity is small compared to the speed of sound, the above reduce to the familiar expression $m = \beta \rho b^2 \ln(R/r_0)$, where $\beta$ is a constant dependent on the type of the dislocation, and $\rho$ is the mass density.
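The velocity dependence in Eq. (2) can be explored with a short numerical sketch. The functions below evaluate the screw and edge masses per unit length directly from Eq. (2); the function names and the numerical values in the example are assumptions chosen only for illustration. A series expansion of the brackets for small $v$ shows that the leading terms cancel, so the screw mass tends to $W_0/C^2$ and the edge mass to $(2W_0/C^2)(1 + C^4/C_l^4)$, which the example checks at low speed.

```python
import math


def rest_energy_screw(G, b, R, r0):
    """W0 = (G b^2 / 4 pi) ln(R / r0), the rest energy per unit length of a screw."""
    return G * b ** 2 / (4.0 * math.pi) * math.log(R / r0)


def mass_screw(v, W0, C):
    """Effective mass per unit length of a screw dislocation, Eq. (2)."""
    g = math.sqrt(1.0 - v ** 2 / C ** 2)
    return W0 / v ** 2 * (-1.0 / g + g ** -3)


def mass_edge(v, W0, C, Cl):
    """Effective mass per unit length of an edge dislocation, Eq. (2)."""
    g = math.sqrt(1.0 - v ** 2 / C ** 2)
    gl = math.sqrt(1.0 - v ** 2 / Cl ** 2)
    bracket = (-16.0 * gl - 40.0 / gl + 8.0 * gl ** -3
               + 14.0 * g + 50.0 / g - 22.0 * g ** -3 + 6.0 * g ** -5)
    return W0 * C ** 2 / v ** 4 * bracket


# Illustrative values (assumptions): sound speeds in m/s, W0 normalized to 1.
C, Cl, W0 = 3000.0, 6000.0, 1.0
m_screw_slow = mass_screw(0.01 * C, W0, C)
m_screw_fast = mass_screw(0.9 * C, W0, C)
m_edge_slow = mass_edge(0.05 * C, W0, C, Cl)
```

Evaluating the edge expression at very small velocities is numerically delicate, because the bracket is a small difference of order-one terms divided by $v^4$; in practice a low-speed series form would be preferable there.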
1.3.1. Dislocation Mobility Function

The reliability of the numerical simulation depends critically on the accuracy of the dislocation drag coefficient $B$ $(= 1/M)$, which is material dependent. There are a number of phenomenological relations for the dislocation glide velocity $v_g$ [22, 23], including relations of power law forms and forms with an activation term in an exponential or as the argument of a sinh form. Often, however [23, 24], the simple power law form is adopted for expedience, e.g., $v_g = v_s (\tau_e/\tau_s)^m$, resulting in a nonlinear dependence of $M$ on the stress. In a number of cases of pure phonon/electron damping control or of glide over the Peierls barrier, a constant mobility (with $m = 1$) predicts the results very well. This linear form has been theoretically predicted for a number of cases, as discussed by Hirth and Lothe [21]. Mechanisms to explain dislocation drag have been studied for a long time and the drag coefficients have been estimated in numerous experimental and theoretical works by atomistic simulations or quantum mechanical calculations (see, e.g., the review by Al’shitz [25]). The determination of each of the two components (phonon and electron drag) that constitute the drag coefficient for a specific material is not trivial, and various simplifications have been made, e.g., the Debye model neglects Van Hove singularities in the phonon spectrum [26], isotropic approximation of deformation potentials, and so on. Also, the values are sensitive to various parameters such as the mean free path or core radius. Nevertheless, in typical metals, the phonon drag $B_{ph}$ ranges from 30 to 80 µPa s at room temperature and is less than 0.1 µPa s at very low temperatures around 10 K, while the electron drag $B_e$ is a few µPa s and is expected to be temperature independent. Under strong magnetic fields at low temperature, macroscopic dislocation behavior can be highly sensitive to orientation relative to the field within accuracy of 1% [27]. Except for special cases such as deformation under high strain rate, weak dependences of drag on dislocation velocity are usually neglected. Examples of the temperature dependence of each component of the drag coefficient can be found for the case of an edge dislocation in copper [28] or in molybdenum [29]. Generally, however, the dislocation mobility could be, among other things, a function of the angle between the Burgers vector and the dislocation line sense, i.e., dislocation character, especially at low temperatures. For example, Wasserbäch [30] observed that at low deformation temperatures (77–195 K) the dislocation structure in Ta single crystals consisted of primary and secondary screw dislocations and of tangles of dislocations of mixed characters, while at high temperatures (295–470 K) the behavior was similar to that of fcc crystals. Mason and MacDonald [31] measured the mobility of dislocations of an unidentified type in Nb as $4.2 \times 10^{4}$ (Pa s)$^{-1}$ near room temperature. A smaller value of $3.3 \times 10^{3}$ (Pa s)$^{-1}$ was obtained by Urabe and Weertman [32] for the mobility of edge dislocations in Fe. The mobility for screw dislocations in Fe was found to be about a factor of two smaller than that of edge dislocations near room temperature.
A theoretical model to explain this large difference in behavior is given in Hirth and Lothe [21] and is based on the observation that in bcc materials the screw dislocation has a rather complex three-dimensional core structure, resulting in a high Peierls stress, leading to a relatively low mobility for screw dislocations while the mobility of mixed dislocations is higher.
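To make the role of the mobility concrete, the sketch below evaluates the power law glide velocity quoted above and, for the constant-mobility case, integrates the scalar form of Eq. (1) for a single segment until the velocity relaxes to its terminal value $F/B = MF$. The backward Euler scheme, the function names and all numerical values are assumptions chosen for illustration; only the drag coefficient is taken from the room-temperature phonon drag range quoted in the text.

```python
def glide_velocity_power_law(tau_e, tau_s, v_s, m):
    """Phenomenological glide velocity v_g = v_s * (tau_e / tau_s)**m."""
    return v_s * (tau_e / tau_s) ** m


def relax_velocity(force, drag_B, mass, dt, n_steps, v0=0.0):
    """Backward Euler integration of m dv/dt + B v = F (scalar form of Eq. (1))."""
    v = v0
    for _ in range(n_steps):
        # Implicit update: m (v_new - v) / dt + B v_new = F
        v = (mass * v + dt * force) / (mass + dt * drag_B)
    return v


# Illustrative values (assumptions): 10 MPa resolved stress, b = 0.25 nm.
B = 50.0e-6              # drag coefficient in Pa s, within the quoted phonon drag range
force = 10.0e6 * 2.5e-10 # force per unit length, tau * b, in N/m
mass = 5.0e-16           # effective mass per unit length in kg/m (order of rho * b^2)
v_terminal = relax_velocity(force, B, mass, dt=1.0e-12, n_steps=200)
```

The relaxation time $m/B$ in this example is of the order of $10^{-11}$ s, which indicates why the inertia term matters mainly at very high strain rates, while at lower rates the velocity is effectively set by the mobility alone.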
1.3.2. Dislocation Collisions
When two dislocations collide, their response is dominated by their mutual interactions and becomes much less sensitive to the long-range elastic stress associated with external loads, boundary conditions, and all other dislocations present in the system. Depending on the shapes of the colliding dislocations, their approach trajectories and their Burgers vectors, two dislocations may form a dipole, react to annihilate, combine to form a junction, or intersect and form a jog. In the DD analysis, the dynamics of two colliding dislocations is determined by the mutual interaction force acting between them. If the two dislocation segments are parallel (on the same plane and/or on intersecting planes) and have the same Burgers vector with opposite sign, they annihilate when the distance between them is equal to the core size. Otherwise, the colliding dislocations align themselves to form a dipole, a jog or a junction, depending on their relative position. A comprehensive review of short-range interaction rules can be found in Rhee, Zbib et al. [33].
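The annihilation rule stated above can be sketched as a simple geometric test. This is a minimal illustration only: the function name, the tolerance, and the use of "distance not larger than the core size" as the trigger are assumptions, and the sketch does not decide between dipole, jog and junction, for which the rules of [33] would be needed.

```python
import math


def cross(a, c):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * c[2] - a[2] * c[1],
            a[2] * c[0] - a[0] * c[2],
            a[0] * c[1] - a[1] * c[0])


def collision_outcome(b1, xi1, b2, xi2, distance, core_size, tol=1e-9):
    """Classify a close encounter of two straight segments.

    b1, b2   : Burgers vectors
    xi1, xi2 : unit line-sense vectors
    Returns "annihilation" or "dipole, jog or junction".
    """
    parallel = math.sqrt(sum(c * c for c in cross(xi1, xi2))) <= tol
    # (b, xi) and (-b, -xi) describe the same dislocation, so express the
    # second segment with the same line sense as the first before comparing.
    if sum(p * q for p, q in zip(xi1, xi2)) < 0.0:
        b2 = tuple(-c for c in b2)
    opposite = all(abs(p + q) <= tol for p, q in zip(b1, b2))
    if parallel and opposite and distance <= core_size:
        return "annihilation"
    return "dipole, jog or junction"


print(collision_outcome((1, 0, 0), (0, 0, 1), (-1, 0, 0), (0, 0, 1), 0.5, 1.0))
```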
1.3.3. Discretization of Dislocation Equation of Motion
Equation (1) applies to every infinitesimal length along the dislocation line. In order to solve this equation for any arbitrary shape, the dislocation curve may be discretized into a set of dislocation segments as illustrated in Fig. 1. Then the velocity vector field over each segment may be assumed to be linear and, therefore, the problem is reduced to finding the velocity of the nodes connecting these segments. There are many numerical techniques to solve such a problem. Consider, for example, a straight dislocation segment s bounded by two nodes j and j + 1 as depicted in Fig. 1. Then within the finite element formulation [34], the velocity vector field is assumed to be linear over the dislocation segment length. This linear vector field v can be expressed in terms of the velocities of the nodes such that $\mathbf{v} = [N^D]^T V^D$, where $V^D$ is the nodal velocity vector and $[N^D]$ is the linear shape function vector [34]. Upon using the Galerkin method, Eq. (1) for each segment can be reduced to a set of six equations for the two discrete nodes (each node has three degrees of freedom). The result can be written in the following matrix-vector form:

$$[M^D]\,\dot{V}^D + [C^D]\,V^D = F^D \tag{3}$$

where $[M^D] = m_s \int [N^D][N^D]^T\,dl$ is the dislocation segment 6 × 6 mass matrix, $[C^D] = (1/M_s)\int [N^D][N^D]^T\,dl$ is the dislocation segment 6 × 6 damping matrix, and $F^D = \int [N^D]\,F_s\,dl$ is the 6 × 1 nodal force vector. Then, following the standard element assemblage procedure, one obtains a discrete system of equations, which can be cast in terms of a global dislocation mass matrix, a global dislocation damping matrix, and a global dislocation force vector. In the case of one dislocation loop and with ordered numbering of the nodes around the loop, it can be easily shown that the global matrices are banded with half-bandwidth equal to one. However, when the system contains many loops that interact among themselves and new nodes are generated and/or annihilated continuously, the numbering of the nodes becomes random and the matrices become unbanded. To simplify the computational effort, one can employ the lumped mass and damping matrix method. In this method, the mass matrix $[M^D]$ and damping matrix $[C^D]$ become diagonal matrices (half-bandwidth equal to zero), and therefore the only coupling between the equations is through the nodal force vector $F^D$. The computation of each component of the force vector is described below.
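The element matrices of Eq. (3) and their lumped counterparts can be sketched for a single two-node segment as follows. This is an illustration under stated assumptions: the shape functions are taken as the usual linear ones, lumping is done by row summation (one common choice, which the text does not specify), and all function names are hypothetical.

```python
def consistent_matrix(coeff, length):
    """Return coeff * integral(N N^T dl) as a 6x6 nested list.

    For linear shape functions on a two-node segment,
    integral(N_a N_c dl) = length/3 if a == c, else length/6.
    Use coeff = m_s for the mass matrix and coeff = 1/M_s for damping.
    """
    matrix = [[0.0] * 6 for _ in range(6)]
    for a in range(2):          # node index of the row
        for c in range(2):      # node index of the column
            weight = coeff * length * (2.0 if a == c else 1.0) / 6.0
            for i in range(3):  # the three velocity components decouple
                matrix[3 * a + i][3 * c + i] = weight
    return matrix


def lump(matrix):
    """Row-sum lumping: collapse each row onto its diagonal entry."""
    return [sum(row) for row in matrix]


def nodal_forces(force_per_length, length):
    """6x1 nodal force vector for a force per unit length that is uniform
    along the segment: each node receives half of the total force."""
    half = [0.5 * length * f for f in force_per_length]
    return half + half


# Assumed sample values, in arbitrary consistent units.
mass_diag = lump(consistent_matrix(coeff=2.0, length=3.0))
print(mass_diag)
print(nodal_forces((1.0, 0.0, 0.0), 3.0))
```

With the lumped form, each nodal degree of freedom obeys an uncoupled scalar equation, which is why the coupling between nodes remains only in the force vector.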
1.4. The Dislocation Stress and Force Fields
The stress induced by any arbitrary dislocation loop at an arbitrary field point P can be computed by the Peach–Koehler integral equation given in Hirth and Lothe [21]. This integral equation, in turn, can be evaluated numerically over many loops of dislocations by discretizing each loop into a series of line segments. We denote:
• $N_l$ = total number of dislocation loops,
• $N_s^{(l)}$ = number of segments of loop l,
• $N_n^{(l)}$ = number of nodes associated with the segments of loop l, i.e., $N_n^{(l)} = N_s^{(l)} + 1$,
• $N_s$ = total number of segments, i.e., $N_s^{(l)}$ summed over all loops l,
• $N_n$ = total number of nodes, i.e., $N_n^{(l)}$ summed over all loops l,
• $l_s$ = length of segment s,
• r = distance from point P to the segment s.
Then the discretized form of the Peach–Koehler integral equation for the stress at any arbitrary field point P becomes

$$\sigma^{d}_{ij}(P) = \sum_{l=1}^{N_l}\sum_{s=1}^{N_s^{(l)}} \left\{ -\frac{G}{8\pi}\int_{l_s} b_p\,\epsilon_{mpi}\,\frac{\partial}{\partial x'_m}\nabla'^2 R\;dx'_j \;-\; \frac{G}{8\pi}\int_{l_s} b_p\,\epsilon_{mpj}\,\frac{\partial}{\partial x'_m}\nabla'^2 R\;dx'_i \;-\; \frac{G}{4\pi(1-\nu)}\int_{l_s} b_p\,\epsilon_{mpk}\left(\frac{\partial^3 R}{\partial x'_m\,\partial x'_i\,\partial x'_j} - \delta_{ij}\,\frac{\partial}{\partial x'_m}\nabla'^2 R\right)dx'_k \right\} \tag{4}$$
where $\epsilon_{ijk}$ is the permutation symbol, and R is the magnitude of $\mathbf{R} = \mathbf{r} - \mathbf{r}'$ (with $\mathbf{r}$ being the position vector of point P and $\mathbf{r}'$ the position vector of a differential line segment of the dislocation loop or curve). The integral over each segment can be carried out explicitly using the linear element approximation. The exact solution of Eq. (4) for a straight dislocation segment can be found in DeWit [35] and Hirth and Lothe [21]. However, evaluation of the above integral requires careful consideration, as the integrand becomes singular when point P coincides with one of the nodes of the segment over which the integration is taken, i.e., self-segment integration. Thus,
• If P is not part of the segment s, there is no singularity since R ≠ 0 and the ordinary integration procedure may be performed.
• If P coincides with a node of the segment s over which the integration is carried out, special treatment is required due to the singular nature of the stress field as R → 0. Here, the regularization scheme developed by Zbib and co-workers has been employed.
In general, the dislocation stresses can be decomposed into the following form:

$$\sigma^{d}(P) = \sum_{s=1}^{N_s-2} \sigma^{d(s)} + \sigma^{d(P+)} + \sigma^{d(P-)} \tag{5}$$

where $\sigma^{d(s)}$ is the contribution to the stress at point P from a segment s, and $\sigma^{d(P+)}$, $\sigma^{d(P-)}$ are the contributions to the stress from the two segments that share a node coinciding with P, which will be further discussed below. Once the dislocation stress field is computed, the forces on each dislocation segment can be calculated by summing the stresses along the length of the segment. The stresses are categorized into those coming from the dislocations, as formulated above, those from any other externally applied stresses plus the internal friction (if any), and the stresses induced by any other defects or micro-constituents. A model for the osmotic force $F_{osmotic}$ is given in Raabe [36], and its inclusion in the total force is straightforward since it is a deterministic force. However, the treatment of the thermal force $F_{thermal}$ is not trivial since this force is stochastic in nature, requiring special consideration and an algorithm leading to what is called stochastic dislocation dynamics (SDD), as developed by Hiratani and Zbib [37]. Therefore, the force acting on each segment can be written as:
$$F_s = \left( \sum_{m=1}^{N_s} \sigma^{d(m)} + \sigma^{a(s)} + \tau_s \right) \cdot b_s \times \xi_s = F_s^{d} + F_s^{a} + F_{thermal} \tag{6}$$

where $\sigma^{d(m)}$ is the contribution to the stresses along segment s from another segment m (dislocation-dislocation interaction), $\sigma^{a(s)}$ is the sum of all externally applied stresses, internal friction (if any) and the stresses induced by any other defects, and $\tau_s$ is the thermal stress; $F_s^{d}$, $F_s^{a}$ and $F_{thermal}$ are the corresponding total Peach–Koehler (PK) forces.
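The structure of Eq. (6), a total stress contracted with the Burgers vector and crossed with the line direction, can be illustrated with a few lines of code. This is a minimal sketch: the helper names are hypothetical, the stress is supplied as a plain 3 × 3 nested list, and the sample values (an edge dislocation along z under a pure shear of 10 MPa) are assumptions chosen so that the result is easy to check by hand.

```python
def mat_vec(sigma, b):
    """Contract a 3x3 stress tensor with a 3-vector: (sigma . b)_i = sigma_ij b_j."""
    return [sum(sigma[i][j] * b[j] for j in range(3)) for i in range(3)]


def cross(a, c):
    """Cross product of two 3-vectors."""
    return [a[1] * c[2] - a[2] * c[1],
            a[2] * c[0] - a[0] * c[2],
            a[0] * c[1] - a[1] * c[0]]


def add_stresses(*tensors):
    """Sum any number of 3x3 stress contributions (dislocation, applied, thermal)."""
    return [[sum(t[i][j] for t in tensors) for j in range(3)] for i in range(3)]


def peach_koehler_force(sigma, b, xi):
    """Force per unit length F = (sigma . b) x xi on a segment with line direction xi."""
    return cross(mat_vec(sigma, b), xi)


# Assumed example: edge dislocation along z, Burgers vector along x, shear sigma_xy.
tau, b_mag = 10e6, 0.256e-9          # Pa, m
applied = [[0.0, tau, 0.0], [tau, 0.0, 0.0], [0.0, 0.0, 0.0]]
zero = [[0.0] * 3 for _ in range(3)]
force = peach_koehler_force(add_stresses(applied, zero), [b_mag, 0.0, 0.0], [0.0, 0.0, 1.0])
print(force)
```

For this configuration the force points along the glide direction x with magnitude τb, as expected for glide under a resolved shear stress.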
Using Eq. (5), the force $F_s^{d}$ can also be decomposed into two parts, one arising from all other dislocation segments and one from the self-segment, which is better known as the self-force, that is,

$$F_s^{d} = \sum_{m=1}^{N_s-2} F_s^{d(m)} + F_s^{d(\mathrm{self})} \tag{7}$$

where $F_s^{d(m)}$ and $F_s^{d(\mathrm{self})}$ are, respectively, the contribution to the force on segment s from segment m and the self-force. In order to evaluate the self-force, a special numerical treatment as given by Zbib, Rhee et al. [12] and Zbib and Diaz de la Rubia [17] should be used, in which exact expressions for the self-force are given. This approximation works well in terms of accuracy and numerical convergence for segment lengths as small as 20b. For finer segments, however, one can use a more accurate approximation as suggested by Scattergood and Bacon [38]. Another treatment has been given by Gavazza and Barnett [39] and used in the recent work of Ghoniem and Sun [40].

The direct computation of the dislocation forces discussed above requires the use of a very fine mesh, especially when dealing with problems involving dislocation-defect interaction. As a rule, to capture the effect of very small defects, the dislocation segment size must be comparable to the size of the defect. Alternatively, one can use dislocation segments that are large compared to the smallest defect size, provided that the force interaction is computed at many points (Gauss points) along the segment length. In this case, the self-force of segment s would be evaluated first. Then the force contribution from other dislocations and defects is calculated by computing the stresses at several Gauss points along the length of the segment. The summation as in Eq. (6) would then follow according to:

$$F_s = F_s^{\mathrm{self}} + \left[ \sum_{m=1}^{N_s-2} \frac{1}{n_g} \sum_{g=1}^{n_g} \sigma^{d(m)}(p_g) + \cdots \right] \cdot b_s \times \xi_s \tag{8}$$

where $p_g$ is the Gauss point g and $n_g$ is the number of Gauss points along segment s. The number of Gauss points depends on the length of the segment. As a rule, the shortest distance between two Gauss points should be larger than or equal to $2r_0$, i.e., twice the core size.
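The point-averaging in Eq. (8) can be sketched as follows. The sketch makes two assumptions that are not in the text: the sampling points are placed at evenly spaced midpoints along the segment (standing in for the Gauss points, consistent with the equal weights 1/n_g), and their number is taken as the largest integer for which the spacing is at least 2r_0. The function names and the sample stress field are hypothetical.

```python
import math


def sampling_points(p0, p1, core_radius):
    """Evenly spaced points on the segment p0-p1 with spacing >= 2 * core_radius."""
    length = math.dist(p0, p1)
    n_g = max(1, int(length // (2.0 * core_radius)))
    return [tuple(a + (g + 0.5) / n_g * (c - a) for a, c in zip(p0, p1))
            for g in range(n_g)]


def average_stress(stress_at, points):
    """Equal-weight average (1/n_g) * sum_g sigma(p_g) of a 3x3 stress field."""
    n_g = len(points)
    total = [[0.0] * 3 for _ in range(3)]
    for p in points:
        s = stress_at(p)
        for i in range(3):
            for j in range(3):
                total[i][j] += s[i][j] / n_g
    return total


def linear_field(p):
    """Hypothetical stress field that varies linearly along x (for illustration)."""
    return [[p[0], 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]


points = sampling_points((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), core_radius=1.0)
print(len(points), average_stress(linear_field, points)[0][0])
```

The averaged stress would then be contracted with $b_s$ and crossed with $\xi_s$, and the self-force added separately, as Eq. (8) indicates.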
1.5. The Stochastic Force and Cross-slip
Thermal fluctuations can arise from dissipation mechanisms due to collisions of dislocations with surrounding particles, such as phonons or electrons. Rapid collisions and momentum transfers result in random forces on dislocations. These stochastic collisions, in turn, can be regarded as time-independent noise of thermal forces acting on the dislocations. Suppose the exertion of thermal forces follows a Gaussian distribution. Then, thermal fluctuations most likely result in very small net forces due to mutual cancellations. However, they sometimes become large and may cause diffusive dislocation motion or thermal activation events such as overcoming obstacle barriers. Therefore, the DD simulation model should account not only for deterministic effects but also for stochastic forces, leading to a model called “stochastic discrete dislocation dynamics” (SDD) [41]. The procedure is to include the stochastic force $F_{thermal}$ in the DD model by computing the magnitude of the stress pulse ($\tau_s$) using a Monte Carlo type analysis.
Table 1. The stress pulse peak height for various combinations of parameters, Δt = 50 fs

T (K)    1/M (µPa s)    τh (MPa) (l = 5b)    τh (MPa) (l = 10b)
0        2              11.5                 8.11
50       5              40.6                 28.7
100      10             81.1                 57.4
300      30             256                  181
Based on the assumption of a Gaussian process, the thermal stress pulse has zero mean and no correlation [36, 42] between any two different times. This leads to the average peak height given as [43, 44]

$$\sigma_s = \sqrt{\frac{2kT}{M_s\,b^2\,l\,\Delta t}} \tag{9}$$
where k denotes the Boltzmann constant, T the absolute temperature of the system, b the magnitude of the Burgers vector, Δt the time step, and l the dislocation segment length. Some values of the peak height are shown in Table 1 for typical combinations of parameters [41]. Here, Δt is chosen to be 50 fs, roughly the inverse of the Debye frequency. Numerical implementation includes an algorithm in which the stochastic components are evaluated at each time step, with strengths that are correlated and sampled from a bivariate Gaussian distribution [45]*.

* Here, Hiratani and Zbib generate stress pulses as $\tau_s = \sigma_s\sqrt{-2\ln r_1}\,\cos(2\pi r_2)$, where $r_1$ and $r_2$ are uniform random numbers between zero and unity [45].

With the inclusion of stochastic forces in the DD analysis, one can treat cross-slip (a thermally activated process) in a direct manner, since the duration of the waiting time and thermal agitations are naturally included in the stochastic process. For example, for cross-slip in fcc metals one can develop a model based on the Escaig–Friedel (EF) mechanism, where cross-slip of a screw dislocation segment may be initiated by an immediate dissociation and expansion of Shockley partials. This EF mechanism has been observed to have a lower activation energy than the Schoeck–Seeger mechanism, where double super kinks are formed on the cross-slip plane (this model is used for cross-slip in bcc [33]). In the EF mechanism, the activation enthalpy ΔG depends on the interval of the Shockley partials (d) and the resolved shear stress on the initial glide plane (σ) (see, e.g., the MD simulations of Rasmussen and Jacobs [46] and Rao, Parthasarathy et al. [47]). The constriction interval L is also dependent on σ. For example, for the case of copper, the activation energy for cross-slip can be computed using an empirical formula fitted to the MD results of Rao, Parthasarathy et al. [47]. Figure 2 depicts ΔG(σ) for the case of copper, where the value of the activation free energy is 1.2 eV and the stacking fault energy is 0.045 J/m². This activation energy for stress-assisted cross-slip is entered as input data into the DD code. Usually, within the DD code, dislocations are represented as perfect dislocations, while a pair of parallel Shockley partials is introduced in the case of screw dislocation segments only for the stress calculation. Then a Monte Carlo type procedure is used to select either the initial plane or the cross-slip plane according to the activation enthalpy [33]. For simplicity, one can set the regime of the barrier with an area of L × d and a strength of ΔG/(Ld). The virtual Shockley partials move according to the Langevin forces in addition to the systematic forces according to Eq. (13), until the partials overcome the barrier and the interval decreases to the core distance. The implementation of this model captures the anisotropic response of the cross-slip activation process to the loading direction, and accounts for the time duration (waiting time) during the cross-slip event, both of which have been missing in former DD simulations.

[Figure 2: plot of ΔG/ΔGc (vertical axis, 0.5 to 1) versus σ/µ (horizontal axis, logarithmic, 10⁻⁵ to 10⁻²).]
Figure 2. The normalized activation enthalpy for copper as a function of the normalized resolved shear stress on the glide plane. ΔGc and µ denote the activation free energy and the shear modulus, respectively (From Hiratani and Zbib, 2003).
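The stress-pulse construction of Eq. (9) and its footnoted sampling rule can be sketched as follows. This is an illustration rather than the authors' implementation: the function names are hypothetical, the Burgers vector magnitude b = 0.256 nm is an assumed copper-like value, and the pulses are drawn independently, so the correlated bivariate sampling mentioned in the text is not reproduced.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant in J/K


def pulse_peak(temperature, inverse_mobility, b, seg_length, dt):
    """Average peak height sigma_s of Eq. (9), in Pa.

    inverse_mobility is 1/M_s in Pa s, as tabulated in Table 1.
    """
    return math.sqrt(2.0 * K_B * temperature * inverse_mobility
                     / (b * b * seg_length * dt))


def sample_pulse(sigma_s, rng):
    """One stress pulse tau_s = sigma_s * sqrt(-2 ln r1) * cos(2 pi r2)."""
    r1 = 1.0 - rng.random()   # shifted to (0, 1] so that log(r1) is finite
    r2 = rng.random()
    return sigma_s * math.sqrt(-2.0 * math.log(r1)) * math.cos(2.0 * math.pi * r2)


b = 0.256e-9                                   # m, assumed value
sigma_s = pulse_peak(100.0, 10e-6, b, 5 * b, 50e-15)
rng = random.Random(0)                         # fixed seed for reproducibility
pulses = [sample_pulse(sigma_s, rng) for _ in range(20000)]
print(f"sigma_s = {sigma_s / 1e6:.1f} MPa, sample mean = {sum(pulses) / len(pulses) / 1e6:.2f} MPa")
```

With the assumed b, the peak height for T = 100 K, 1/M = 10 µPa s and l = 5b comes out close to the corresponding entry of Table 1, and the sampled pulses have a mean near zero, as the Gaussian assumption requires.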
1.6. Modifications for Long-Range Interactions: The Super-Dislocation Principle
Inclusion of the interaction among all the dislocation loops present in a large body is computationally expensive, since the number of computations per step would be proportional to $N_s^2$, where $N_s$ is the number of dislocation segments. A numerical compromise technique termed the super-dislocation method, which is based on the multipolar expansion method [7, 12, 33], reduces the order of computation to $N_s \log N_s$ with high accuracy. In this approach, the dislocations far away from the point of interest are grouped together into a set of equivalent monopoles and dipoles. In the numerical implementation of the DD model, one would divide the 3D computational domain into sub-domains, and the dislocations in each sub-domain (if there are any) are grouped together in terms of monopoles, dipoles, etc. (depending on the desired accuracy), and the far stress field is then computed.
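The grouping step of the super-dislocation idea can be sketched as a binning of segments into cubic sub-domains. This is only a hedged illustration of the bookkeeping: the lowest-order moment of each cell is assumed here to be the sum of $b \otimes l$ over its segments, the function names are hypothetical, and the far-field stress evaluation from these moments is not shown.

```python
import math
from collections import defaultdict


def cell_index(point, cell_size):
    """Integer index of the cubic sub-domain that contains the point."""
    return tuple(math.floor(c / cell_size) for c in point)


def group_into_cells(segments, cell_size):
    """Accumulate an assumed lowest-order moment, sum of b_i * l_j, per sub-domain.

    segments : iterable of (midpoint, burgers_vector, line_vector) tuples,
               where line_vector is the segment length times its direction.
    Returns a dict mapping cell index -> 3x3 nested list.
    """
    moments = defaultdict(lambda: [[0.0] * 3 for _ in range(3)])
    for midpoint, burgers, line in segments:
        m = moments[cell_index(midpoint, cell_size)]
        for i in range(3):
            for j in range(3):
                m[i][j] += burgers[i] * line[j]
    return dict(moments)


# Assumed sample data: a dipole-like pair in one cell and a lone segment in another.
segments = [
    ((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), (0.0, 0.0, 2.0)),
    ((0.6, 0.2, 0.1), (-1.0, 0.0, 0.0), (0.0, 0.0, 2.0)),
    ((3.5, 0.5, 0.5), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
]
for cell, moment in group_into_cells(segments, cell_size=1.0).items():
    print(cell, moment[0][2])
```

A cell whose segments cancel at this order, as the first one does, would contribute to the far field only through higher-order terms such as dipoles.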
1.7. Evaluation of Plastic Strains
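The motion of each dislocation segment gives rise to plastic distortion, which is related to the macroscopic plastic strain rate tensor $\dot{\varepsilon}^p$ and the plastic spin tensor $W^p$ through the relations

$$\dot{\varepsilon}^p = \sum_{s=1}^{N_s} \frac{l_s\,v_{gs}}{2V}\,\left(n_s \otimes b_s + b_s \otimes n_s\right) \tag{10}_1$$

$$W^p = \sum_{s=1}^{N_s} \frac{l_s\,v_{gs}}{2V}\,\left(n_s \otimes b_s - b_s \otimes n_s\right) \tag{10}_2$$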
where $n_s$ is a unit normal to the slip plane, $v_{gs}$ is the magnitude of the glide velocity of segment s, and V is the volume of the representative volume element (RVE). The above relations provide the most rigorous connection between the dislocation motion (the fundamental mechanism of plastic deformation in crystalline materials) and the macroscopic plastic strain, with its dependence on strength and applied stress being explicitly embedded in the calculation of the velocity of each dislocation. Length scale effects are explicitly included in the calculation through long-range interactions. Another microstructure quantity, the dislocation density tensor α, can also be calculated according to

$$\alpha = \sum_{s=1}^{N_s} \frac{l_s}{V}\; b_s \otimes \xi_s \tag{11}$$

This quantity provides a direct measure for the net Burgers vector that gives rise to strain gradient relief (bending of crystal) [48].
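Equations (10) and (11) are plain sums over segments and can be evaluated directly. The sketch below assumes each segment is supplied as a tuple (length, glide speed, slip-plane normal, Burgers vector, line direction); that data layout and the function names are hypothetical, and the single-segment input is chosen only so that the result can be checked by hand.

```python
def outer(a, c):
    """Dyadic product (a ⊗ c)_ij = a_i c_j of two 3-vectors."""
    return [[a[i] * c[j] for j in range(3)] for i in range(3)]


def plastic_measures(segments, volume):
    """Return (strain_rate, spin, density_tensor) following Eqs. (10) and (11).

    segments : iterable of (l_s, v_gs, n_s, b_s, xi_s) tuples
    volume   : volume V of the representative volume element
    """
    strain_rate = [[0.0] * 3 for _ in range(3)]
    spin = [[0.0] * 3 for _ in range(3)]
    density = [[0.0] * 3 for _ in range(3)]
    for length, speed, normal, burgers, line_dir in segments:
        nb, bn = outer(normal, burgers), outer(burgers, normal)
        bxi = outer(burgers, line_dir)
        factor = length * speed / (2.0 * volume)
        for i in range(3):
            for j in range(3):
                strain_rate[i][j] += factor * (nb[i][j] + bn[i][j])
                spin[i][j] += factor * (nb[i][j] - bn[i][j])
                density[i][j] += length / volume * bxi[i][j]
    return strain_rate, spin, density


# Assumed single edge segment: b along x, slip-plane normal along y, line along z.
one_segment = [(1.0, 2.0, (0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))]
eps_dot, w_p, alpha = plastic_measures(one_segment, volume=4.0)
print(eps_dot[0][1], w_p[1][0], alpha[0][2])
```

The strain rate is symmetric and the spin antisymmetric by construction, which offers a quick consistency check on any implementation.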
1.8. The DD Numerical Solution: An Implicit–Explicit Integration Scheme
An implicit algorithm to solve the equation of motion (3) with a backward integration scheme may be used, yielding the recurrence equation

$$v^{t+\delta t}\left(1 + \frac{\Delta t}{m_s M_s}\right) = v^{t} + \frac{\Delta t}{m_s}\,F_s^{t+\delta t} \tag{12}$$
This integration scheme is unconditionally stable for any time step size. However, the DD time step is determined by two factors: (i) the shortest flight distance for short-range interactions, and (ii) the time step used in the dynamic finite element modeling to be described later. This scheme is adopted since the time step in the DD analysis (for high strain rates) is of the same order of magnitude as the time step required for a stable explicit finite element (FE) dynamic analysis. Thus, in order to ensure convergence and a stable solution, the critical time $t_c$ and the time step for both the DD and the FE analyses ought to be $t_c = l_c/C_l$ and $\Delta t = t_c/20$, respectively, where $l_c$ is the characteristic length scale, which is the shortest dimension in the finite element mesh.

In summary, the system of equations given above summarizes the basic ingredients that a DD simulation model should include. There are a number of variations in the manner in which the dislocation curves may be discretized, for example zero order elements (pure screw and pure edge), first order elements (piecewise linear segments with mixed character), or higher order nonlinear elements, but this is purely a numerical issue. Nonetheless, the DD model should have the minimum number of parameters and, ideally, all of them should be basic physical and material parameters and not phenomenological ones, for the DD result to be predictive. The DD model described above has the following set of physical and material parameters:
• Burgers vectors,
• elastic properties,
• core size (equal to one Burgers vector),
• thermal conductivity and specific heat,
• mass density,
• stacking fault energy, and
• dislocation mobility.
Also, there are two numerical parameters: the segment length (the minimum segment length cannot be less than three times the core size) and the time step (as discussed in conjunction with Eq. (12)), but both are fixed to ensure convergence of the result. In the above list, it is emphasized that in general the dislocation mobility is an intrinsic material property that reflects the local drag mechanisms as discussed above. One can use an “effective” mobility that accounts for additional drag from dislocation-point defect interaction and thermal activation processes if the defects/obstacles are not explicitly represented in the DD simulations. However, there is no reason not to include these effects explicitly in the DD simulations (as done in the model described above), i.e., dislocation-defect interaction, stochastic processes and inertia effects, which actually permits the prediction of the “effective” mobility from the DD analysis [37, 44].
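The backward-integration recurrence of Eq. (12) and the time-step rule quoted with it can be sketched for a single scalar degree of freedom as follows. This is a simplified illustration: the force at the new time level is treated as known, whereas in a full DD step it depends on the updated configuration, and the function names and sample values are assumptions.

```python
def implicit_velocity_update(v_old, force_new, mass, mobility, dt):
    """Solve Eq. (12) for the velocity at the new time level.

    v_new * (1 + dt / (mass * mobility)) = v_old + dt * force_new / mass
    """
    return (v_old + dt * force_new / mass) / (1.0 + dt / (mass * mobility))


def dd_time_step(shortest_mesh_length, wave_speed, safety_divisor=20.0):
    """Time step dt = t_c / 20 with critical time t_c = l_c / C_l."""
    return shortest_mesh_length / wave_speed / safety_divisor


# Assumed values in arbitrary consistent units: constant force, start from rest.
mass, mobility, force = 1.0, 2.0, 3.0
v = 0.0
for _ in range(200):
    v = implicit_velocity_update(v, force, mass, mobility, dt=0.1)
print(v, mobility * force)

# A very large step remains bounded and lands directly on the drag-limited value.
print(implicit_velocity_update(0.0, force, mass, mobility, dt=1.0e6))
```

Under a constant force the recurrence relaxes to the drag-limited velocity $M_s F_s$ for any step size, which is the practical meaning of unconditional stability here; the step size is then set by accuracy and by the coupling to the FE analysis.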
References
[1] O. Mügge, Neues Jahrb. Min., 13, 1883.
[2] J.A. Ewing and W. Rosenhain, “The crystalline structure of metals,” Phil. Trans. Roy. Soc. A, 193, 353–375, 1899.
[3] J. Lepinoux and L.P. Kubin, “The dynamic organization of dislocation structures: A simulation,” Scripta Metall., 21, 833–838, 1987.
[4] N.M. Ghoniem and R.J. Amodeo, “Computer simulation of dislocation pattern formation,” Sol. Stat. Phenom., 3 & 4, 379–406, 1988.
[5] I. Groma and G.S. Pawley, “Role of the secondary slip system in a computer simulation model of the plastic behavior of single crystals,” Mater. Sci. Engrg. A, 164, 306–311, 1993.
[6] E. Van der Giessen and A. Needleman, “Discrete dislocation plasticity: A simple planar model,” Mater. Sci. Eng., 3, 689–735, 1995.
[7] H.Y. Wang and R. LeSar, “O(N) Algorithm for dislocation dynamics,” Phil. Mag. A, 71, 149–164, 1995.
[8] K.C. Le and H. Stumpf, “A model of elasticplastic bodies with continuously distributed dislocations,” Int. J. Plasticity, 12, 611–628, 1996.
[9] L.P. Kubin and G. Canova, “The modelling of dislocation patterns,” Scripta Metall., 27, 957–962, 1992.
[10] G. Canova, Y. Brechet, L.P. Kubin, B. Devincre, V. Pontikis, and M. Condat, “3D simulation of dislocation motion on a lattice: Application to the yield surface of single crystals,” Microstructures and Physical Properties, J. Rabiet (ed.), CH-Transtech, 1993.
[11] J.P. Hirth, M. Rhee, and H.M. Zbib, “Modeling of deformation by a 3D simulation of multipole, curved dislocations,” J. Computer-Aided Materials Design, 3, 164–166, 1996.
[12] H.M. Zbib, M. Rhee, and J.P. Hirth, “3D simulation of curved dislocations: discretization and long range interactions,” Advances in Engineering Plasticity and its Applications, T. Abe and T. Tsuta (eds.), Pergamon, NY, 15–20, 1996.
[13] G.R. Canova, Y. Brechet, and L.P. Kubin, “3D Dislocation simulation of plastic instabilities by work softening in alloys,” In: S.I. Anderson et al. (eds.), Modelling of Plastic Deformation and Its Engineering Applications, Riso National Laboratory, Roskilde, Denmark, 1992.
[14] L.P. Kubin, “Dislocation patterning during multiple slip of FCC Crystals,” Phys. Stat. Sol. (a), 135, 433–443, 1993.
[15] K.W. Schwarz and J. Tersoff, “Interaction of threading and misfit dislocations in a strained epitaxial layer,” Appl. Phys. Lett., 69(9), 1220, 1996.
[16] H.M. Zbib, M. Rhee, and J.P. Hirth, “On plastic deformation and the dynamics of 3D dislocations,” Int. J. Mech. Sci., 40, 113–127, 1998.
[17] H.M. Zbib and T. Diaz de la Rubia, “A multiscale model of plasticity,” Int. J. Plasticity, 18(9), 1133–1163, 2002.
[18] J.P. Hirth, “Injection of dislocations into strained multilayer structures,” Semiconductors and Semimetals, Academic Press, 37, 267–292, 1992.
[19] J.P. Hirth, H.M. Zbib, and J. Lothe, “Forces on high velocity dislocations,” Modeling & Simulations in Maters. Sci. & Enger., 6, 165–169, 1998.
[20] H. Huang, N. Ghoniem, T. Diaz de la Rubia, H.M. Rhee, Z. and J.P. Hirth, “Development of physical rules for short range interactions in BCC Crystals,” ASME-JEMT, 121, 143–150, 1999.
[21] J.P. Hirth and J. Lothe, “Theory of dislocations,” New York, Wiley, 1982.
[22] U.F. Kocks, A.S. Argon, and M.F. Ashby, “Thermodynamics and kinetics of slip,” Oxford, Pergamon Press, 1975.
[23] R. Sandstrom, “Subgrain growth occurring by boundary migration,” Acta Metall., 25, 905–911, 1977.
[24] W.G. Johnston and J.J. Gilman, “Dislocation velocities, dislocation densities, and plastic flow in Lithium Fluoride Crystals,” J. Appl. Phys., 30, 129–144, 1959.
[25] V.I. Al’shitz, “The phonon-dislocation interaction and its role in dislocation dragging and thermal resistivity,” Elastic Strain and Dislocation Mobility, V.L. Indenbom and J. Lothe, Elsevier Science Publishers B.V, Chapter 11, 1992.
[26] N.W. Ashcroft and N.D. Mermin, Solid State Physics, Saunders College, 1976.
[27] T.J. McKrell and J.M. Galligan, “Instantaneous dislocation velocity in iron at low temperature,” Scripta Materialia, 42, 79–82, 2000.
[28] M. Hiratani and E.M. Nadgorny, “Combined model of dislocation motion with thermally activated and drag-dependent stages,” Acta Mat., 40, 4337–4346, 2001.
[29] C. Jinpeng, V.V. Bulatov, and S. Yip, “Molecular dynamics study of edge dislocation motion in a bcc metal,” J. Comput. Mat., 6, 165–173, 1999.
[30] W. Wasserbäch, “Plastic deformation and dislocation arrangement of Nb-34 at.% Ta Alloy Crystals,” Phil. Mag. A, 53, 335–356, 1986.
[31] W. Mason and D. MacDonald, “Damping of dislocations in Niobium by phonon viscosity,” J. Appl. Phys., 42, 1836, 1971.
[32] N. Urabe and J. Weertman, “Dislocation mobility in potassium and iron single crystals,” Mater. Sci. Engng., 18, 41, 1975.
[33] M. Rhee, H.M. Zbib, J.P. Hirth, H. Huang, and T.D. de la Rubia, “Models for long/short range interactions in 3D dislocation simulation,” Modeling & Simulations in Maters. Sci. & Enger., 6, 467–492, 1998.
[34] K.J. Bathe, “Finite element procedures in engineering analysis,” New Jersey, Prentice-Hall, 1982.
[35] R. DeWit, “The continuum theory of stationary dislocations,” Solid State Phys., 10, 249–292, 1960.
[36] D. Raabe, “Introduction of a hybrid model for the discrete 3D simulation of dislocation dynamics,” Comput. Mater. Sci., 11, 1–15, 1998.
[37] M. Hiratani and H.M. Zbib, “Stochastic dislocation dynamics for dislocation-defects interaction,” J. Enger. Mater. Tech., 124, 335–341, 2002.
[38] R.O. Scattergood and D.J. Bacon, “The Orowan mechanism in anisotropic crystal,” The Philosophical Magazine, 31, 179–198, 1975.
[39] S.D. Gavazza and D.M. Barnett, “The self-force on a planar dislocation loop in an anisotropic linear-elastic medium,” J. Mech. Phys. Solids, 24, 171–185, 1976.
[40] N.M. Ghoniem and L. Sun, “A fast sum method for the elastic field of 3D dislocation ensembles,” Phys. Rev. B, 60, 128–140, 1999.
[41] M. Hiratani and H.M. Zbib, “On dislocation-defect interaction and patterning: Stochastic discrete dislocation dynamics,” J. Nuc. Enger., in press, 2003.
[42] D. Ronnpagel, T. Streit, and T. Pretorius, “Including thermal activation in simulation calculation of dislocation glide,” Phys. Stat. Sol., 135, 445–454, 1993.
[43] T.J. Koppenaal and D. Kuhlmann-Wilsdorf, “The effect of prestressing on the strength of neutron-irradiated copper single crystals,” Appl. Phys. Lett., 4, 59, 1964.
[44] M. Hiratani, H.M. Zbib, and M.A. Khaleel, “Modeling of thermally activated dislocation glide and plastic flow through local obstacles,” Int. J. Plasticity, 19, 1271–1296, 2003.
[45] M.P. Allen and D.J. Tildesley, “Computer simulation of liquids,” Oxford Science Publications, 1987.
[46] T. Rasmussen and K.W. Jacobs, “Simulations of atomic structure, energetics, and cross slip of screw dislocations in copper,” Phys. Rev. B, 56(6), 2977, 1997.
[47] S. Rao, T.A. Parthasarathy, and C. Woodward, “Atomistic simulation of cross-slip processes in model fcc structures,” Phil. Mag. A, 79, 1167, 1999.
[48] K. Shizawa and H.M. Zbib, “Thermodynamical theory of strain gradient elastoplasticity with dislocation density: Part I – Fundamentals,” Int. J. Plasticity, 15, 899–938, 1999.
3.4 DISCRETE DISLOCATION PLASTICITY
E. Van der Giessen¹ and A. Needleman²
¹ University of Groningen, Groningen, The Netherlands
² Brown University, Providence, RI, USA
Plastic deformation of crystalline solids is of both scientific and technological interest. Over a wide temperature range, the principal mechanism of plastic deformation in crystalline solids involves the glide of large numbers of dislocations. As a consequence, since the 1930s, when dislocations were identified as carriers of plastic deformation in crystalline solids, there has been considerable interest in elucidating the physics of individual dislocations and of dislocation structures. Major effort has also been devoted to developing tools to solve boundary value problems based on phenomenological continuum descriptions in order to predict the plastic deformations that result in structures and components from some imposed loading. Since the 1980s these two approaches have grown toward each other, driven by, for instance, miniaturization and the need for more accurate models in engineering design. The approaches meet at a scale where the collective behavior of individual dislocations controls phenomena. This encounter, together with continuously increasing computing power, has fostered the development of an approach where boundary value problems are solved with plastic flow modeled in terms of the collective motion of discrete dislocations represented as line defects in a linear elastic continuum [1, 2]. This is the field of discrete dislocation plasticity. A dislocation is a line defect in a crystalline solid which bounds the region on a plane where the material above and below are shifted relative to each other. This shift is termed the slip and the key geometric ingredient of discrete dislocation plasticity is the Burgers vector that characterizes the magnitude and direction of the slip. As a consequence of slip the displacement field is not continuous. The associated stress, strain and rotation fields are continuous except on the dislocation line where they are singular. 
The state near the dislocation line, the dislocation core region, is not accurately represented by linear elasticity theory. However, atomistic simulations have shown that the linear elastic fields give an excellent description of the displacement fields beyond 8–10 Burgers vectors from the core, so that stress and deformation are also described well by the linear fields.

A discrete dislocation model of plastic flow entails the simulation of the evolution of the dislocation structure in response to a prescribed loading. The history dependence of plastic deformation is thus contained in the history of the dislocation structure. The physical mechanisms that underlie phenomena such as dislocation glide, annihilation, cross slip, etc. are governed by core-level atomic-scale events, and their governing properties are supplied in the form of constitutive rules. In this section, we outline discrete dislocation plasticity, giving a perspective on key assumptions, capabilities and limitations.

S. Yip (ed.), Handbook of Materials Modeling, 1115–1131. © 2005 Springer. Printed in the Netherlands.
1. Discrete Dislocation Dynamics
The aim is to determine the quasi-static evolution of the deformation and stress states for a dislocated solid subject to some prescribed loading history. This is done in an incremental manner in time. At a given instant, the stress state and dislocation structure are presumed known. An increment of loading is prescribed, and (i) the updated deformation and stress state, and (ii) the change in the dislocation structure need to be computed. The dislocations are represented as line singularities in a linear elastic solid. The long range interaction between dislocations is determined directly from elasticity theory, but constitutive rules are required for dislocation motion, dislocation nucleation, dislocation annihilation and, possibly, other short range interactions. Each time step involves three main computational stages: (i) determining the driving force for dislocation motion; (ii) determining the rate of change of the dislocation structure, which involves the motion of dislocations, the generation of new dislocations, their mutual annihilation, and their possible pinning at obstacles; and (iii) determining the stress and strain state for the updated dislocation arrangement. The key idea for determining the stress and deformation state of the solid given the current dislocation structure is superposition. The equilibrium stress and strain fields associated with the individual dislocations are singular, but they are known analytically [1, 2]. For a body with specified boundary conditions, the actual stress and deformation fields can be written as the sum of the singular fields associated with the individual dislocations and a nonsingular image field that enforces the boundary conditions. 
The advantage of this superposition is that while standard numerical methods for elasticity problems such as finite element, finite difference or boundary element methods cannot accurately represent the strongly singular individual dislocation fields, they can accurately resolve the image fields.
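The superposition idea described above can be illustrated in two dimensions with the classical infinite-medium stress field of an edge dislocation in an isotropic solid. The sketch sums the singular fields of several dislocations at a boundary point and subtracts the resulting traction from the prescribed one, which gives the boundary data that the image problem would have to satisfy. The function names, the restriction to edge dislocations with Burgers vector along x, and the sample values are assumptions made for illustration.

```python
import math


def edge_stress(x, y, b, mu, nu):
    """(sxx, syy, sxy) at (x, y) for an edge dislocation at the origin with
    Burgers vector b along x in an infinite isotropic medium (shear modulus mu,
    Poisson ratio nu). Singular at the origin, so keep field points off the core."""
    d = mu * b / (2.0 * math.pi * (1.0 - nu))
    r2 = x * x + y * y
    sxx = -d * y * (3.0 * x * x + y * y) / r2 ** 2
    syy = d * y * (x * x - y * y) / r2 ** 2
    sxy = d * x * (x * x - y * y) / r2 ** 2
    return sxx, syy, sxy


def tilde_stress(point, dislocations, mu, nu):
    """Sum of the singular fields of all dislocations, each given as (xd, yd, b)."""
    total = [0.0, 0.0, 0.0]
    for xd, yd, b in dislocations:
        for k, s in enumerate(edge_stress(point[0] - xd, point[1] - yd, b, mu, nu)):
            total[k] += s
    return tuple(total)


def image_traction(point, normal, prescribed, dislocations, mu, nu):
    """Traction to impose on the image field: prescribed minus dislocation traction."""
    sxx, syy, sxy = tilde_stress(point, dislocations, mu, nu)
    t_tilde = (sxx * normal[0] + sxy * normal[1], sxy * normal[0] + syy * normal[1])
    return (prescribed[0] - t_tilde[0], prescribed[1] - t_tilde[1])


# Assumed example: one dislocation at the origin, traction-free boundary point at (1, 0).
print(image_traction((1.0, 0.0), (1.0, 0.0), (0.0, 0.0), [(0.0, 0.0, 1.0)], mu=1.0, nu=0.25))
```

Because the singular part is handled analytically, a standard elasticity solver would only need to resolve the smooth field driven by boundary data of this kind.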
Discrete dislocation plasticity
The governing equations to be satisfied at time $t$ are:

• Equilibrium,

$$\frac{\partial \sigma_{ij}}{\partial x_j} = 0, \tag{1}$$

together with $\sigma_{ij} = \sigma_{ji}$.

• The constitutive relation,

$$\sigma_{ij} = L_{ijkl}\,\epsilon_{kl}, \tag{2}$$

where $L_{ijkl}$ are the components of the tensor of elastic moduli.

• The strain-displacement relation,

$$\epsilon_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right). \tag{3}$$

For a dislocated solid, the strain field does not satisfy compatibility, i.e.,

$$\oint_C u_{i,j}\, ds \neq 0, \tag{4}$$

since the displacement field is not a continuous single-valued function.

• Boundary conditions, i.e., either prescribed displacements $U_i^0$ or prescribed tractions $\sigma_{ij} n_j = T_i^0$ on the boundary with outward unit normal $n_i$.

The total displacement, $u_i$, strain, $\epsilon_{ij}$, and stress, $\sigma_{ij}$, fields are written as

$$u_i = \tilde{u}_i + \hat{u}_i, \qquad \epsilon_{ij} = \tilde{\epsilon}_{ij} + \hat{\epsilon}_{ij}, \qquad \sigma_{ij} = \tilde{\sigma}_{ij} + \hat{\sigma}_{ij} \quad \text{in } V, \tag{5}$$
respectively. The (˜) fields are the superposition of the fields of the individual dislocations, in their current configuration, i.e.,

$$\tilde{u}_i = \sum_I u_i^I, \qquad \tilde{\epsilon}_{ij} = \sum_I \epsilon_{ij}^I, \qquad \tilde{\sigma}_{ij} = \sum_I \sigma_{ij}^I \qquad (I = 1, \ldots, N), \tag{6}$$
where $(\ )^I$ denotes the singular field associated with an individual dislocation, $N$ being the number of dislocations in the current configuration. The (˜) fields give rise to tractions $\tilde{T}_i$ and displacements $\tilde{U}_i$ on the boundary of the body. The (ˆ) fields represent the image fields that correct for the actual boundary conditions on $S$. The governing equations for the (ˆ) fields are

$$\frac{\partial \hat{\sigma}_{ij}}{\partial x_j} = 0, \tag{7}$$

$$\hat{\sigma}_{ij} = L_{ijkl}\,\hat{\epsilon}_{kl}, \qquad \hat{\epsilon}_{ij} = \frac{1}{2}\left(\frac{\partial \hat{u}_i}{\partial x_j} + \frac{\partial \hat{u}_j}{\partial x_i}\right), \tag{8}$$

$$\hat{\sigma}_{ij} n_j = \hat{T}_i = T_i^0 - \tilde{T}_i \ \ \text{on } S_T, \qquad \hat{u}_i = \hat{U}_i = U_i^0 - \tilde{U}_i \ \ \text{on } S_u. \tag{9}$$
Here, $S_T$ is the portion of the boundary on which tractions are prescribed and $S_u$ is the portion of the boundary on which displacements are prescribed, as illustrated in Fig. 1. A key point is that the (ˆ) fields are smooth, so that Eqs. (7)–(9) constitute a conventional linear elastic boundary value problem that can be solved by a standard numerical method for linear elasticity. To date, only the finite element method has been used for this purpose, but other methods are also suitable; boundary element methods, for example, may have advantages for three-dimensional problems.

The driving force for dislocation evolution is the Peach–Koehler force, which is the configurational force associated with a change in dislocation position. With $\Pi$ denoting the potential energy, the Peach–Koehler force $\mathbf{f}^I$ on dislocation $I$ is given by

$$\delta\Pi = -\sum_I \int_{L^I} \mathbf{f}^I \cdot \delta\mathbf{s}^I \, dl, \tag{10}$$
where $L^I$ denotes dislocation line $I$ and $\delta\mathbf{s}^I$ is the change in its position. With $\mathbf{t}^I$ a unit vector tangent to dislocation line $I$ and $\mathbf{m}^I$ a unit vector normal to its glide plane, the local glide direction is $\mathbf{t}^I \times \mathbf{m}^I$ and the component of the Peach–Koehler force in the glide direction, $f^I$, is

$$f^I = m_i^I \left( \hat{\sigma}_{ij} + \sum_{J \neq I} \sigma_{ij}^J \right) b_j^I. \tag{11}$$
Figure 1. Decomposition into the problem of interacting dislocations in an infinite solid, the (˜) fields, and the complementary problem for the finite body without dislocations, the (ˆ) or image fields.
Here, $b_j^I$ are the components of the Burgers vector of dislocation $I$. Note that the value of $f^I$ does not depend on any specification of core properties. This is because the Peach–Koehler force is calculated for a translation of the dislocation. An actual dislocation motion will, in general, involve a change in dislocation shape and thus a change in dislocation line length. The change in line length is accounted for through a constitutive rule and is referred to as the line tension. Dislocations can change glide planes (cross slip) and climb (motion off a glide plane), particularly at temperatures that are a significant fraction of the melting temperature, but attention here is confined to glide.

Any effect of geometry changes is neglected in the formulation described above. Large deformations occur inside dislocation cores and these are not modeled by the linear elastic description of dislocations. However, outside dislocation cores, finite-deformation effects can come into play once significant slip has occurred. In particular, there are effects of lattice reorientation on dislocation glide and of geometry changes on the momentum balance. It is known from continuum slip crystal plasticity that lattice reorientation can have a significant effect on the overall response. Effects of geometry changes arise in two contexts: (i) overall shape change, as in the reduction of cross-sectional area in a plastically deformed tensile bar, and (ii) the formation of surface slip steps and the resulting stress concentration that occurs there. Effects of overall shape changes also occur in continuum plasticity, but the possible formation of slip steps is an additional feature of discrete dislocation plasticity. A finite deformation discrete dislocation plasticity framework has been presented by Deshpande et al. [3]. Here, we will confine attention to the formulation with geometry changes neglected.
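The glide component of the Peach–Koehler force in Eq. (11) is a double contraction of the total stress at the dislocation with the slip plane normal and the Burgers vector. A minimal NumPy sketch, assuming the image stress and the stresses of the other dislocations have already been evaluated at the position of dislocation $I$ (the function name `peach_koehler_glide` and the numerical values are illustrative assumptions, not taken from the chapter):

```python
import numpy as np


def peach_koehler_glide(m, b_vec, sigma_hat, sigma_others=()):
    """Glide force per unit length, f = m_i (sigma_hat_ij + sum_J sigma^J_ij) b_j."""
    m = np.asarray(m, dtype=float)
    b_vec = np.asarray(b_vec, dtype=float)
    sigma = np.array(sigma_hat, dtype=float)        # copy of the image stress
    for s in sigma_others:                           # add the fields of all J != I
        sigma = sigma + np.asarray(s, dtype=float)
    return float(m @ sigma @ b_vec)


# Illustrative values: 50 MPa image shear stress, -10 MPa back stress
# from other dislocations, slip plane normal e2, Burgers vector b*e1.
tau, b = 50.0e6, 0.25e-9
shear = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
f = peach_koehler_glide(m=[0.0, 1.0, 0.0], b_vec=[b, 0.0, 0.0],
                        sigma_hat=tau * shear,
                        sigma_others=[-10.0e6 * shear])
```

For this configuration the result reduces to the Burgers vector magnitude times the resolved shear stress, here (50 − 10) MPa × 0.25 nm.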
2.
Three-Dimensional Dislocation Dynamics

The geometry of a dislocation is governed by a number of variables:
• the slip plane, denoted by its unit normal vector $\mathbf{m}$;
• the dislocation line, described as a parameterized line on this plane with a local tangent vector $\mathbf{t}$;
• the Burgers vector $\mathbf{b}$.

There are a few special parts of a generic loop, namely

$$\text{edge:} \quad \mathbf{b} \cdot \mathbf{t} = 0; \tag{12}$$

$$\text{screw:} \quad \mathbf{b} \cdot \mathbf{t} = \pm b, \tag{13}$$

$b$ being the length of $\mathbf{b}$: $b = |\mathbf{b}|$. Edge and screw dislocations are the central notions in two-dimensional studies, as discussed in a subsequent section.
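The conditions in Eqs. (12) and (13) translate directly into a test on the projection of the Burgers vector onto the line tangent. The short Python sketch below classifies a segment accordingly; the function name `segment_character` and the tolerance are assumptions made for illustration.

```python
import numpy as np


def segment_character(b_vec, t_vec, tol=1e-9):
    """Classify a dislocation segment from |b . t| / b with t normalized."""
    b_vec = np.asarray(b_vec, dtype=float)
    t_vec = np.asarray(t_vec, dtype=float)
    t_unit = t_vec / np.linalg.norm(t_vec)
    projection = abs(b_vec @ t_unit) / np.linalg.norm(b_vec)   # lies in [0, 1]
    if projection < tol:
        return "edge"      # b . t = 0
    if abs(projection - 1.0) < tol:
        return "screw"     # b . t = +b or -b
    return "mixed"         # 0 < |b . t| < b
```

For example, `segment_character([1, 0, 0], [0, 1, 0])` gives an edge segment, while a tangent along `[-2, 0, 0]` gives a screw segment regardless of the length or sense of the tangent.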
E. Van der Giessen and A. Needleman
The first step in discrete dislocation plasticity in three dimensions is the description of the individual dislocations. Most methods currently in use involve discretization of each dislocation. These schemes range from a screw–edge representation (e.g., [4]), through a representation with straight segments (e.g., [5–7]), to a spline representation [8]. The representation of dislocation loops by straight segments, as illustrated in Fig. 2, implies that each segment has in general a mixed nature, $0 \le |\mathbf{b} \cdot \mathbf{t}| \le b$. The advantage of this discretization within the superposition framework is that the fields of straight segments are known exactly for a linear elastic isotropic medium. The expressions for the stress fields of individual segments in infinite space are given by Hirth and Lothe [1], while the corresponding displacement fields can be found in [9].

The topology of the discretized loop illustrated in Fig. 2 is at any instant characterized by the set $\mathbf{x}_A$ of positions of the nodes $A = 1, \ldots, N$. Assuming glide motion only, the velocity of any node, $\mathbf{v}_A$, can be written as $\mathbf{v}_A = v_A\, \mathbf{t} \times \mathbf{m} = v_A\, \mathbf{s}$. The velocity at any point $\mathbf{x}(l)$ is obtained by linear interpolation between the nodal velocities $v_A$. Assuming over-damped motion along the entire dislocation loop, the velocity $v(l)$ can be related to the local Peach–Koehler force $\mathbf{F}(l)$ projected onto $\mathbf{s}$ via the drag relationship

$$F(l) = D\, v(l)$$

with $F = \mathbf{F} \cdot \mathbf{s}$.

Figure 2. Description of a dislocation loop in its glide plane; m is the normal to the glide plane; the orientation of the loop is determined by the local tangent vector t and the Burgers vector b; s is defined as t × m. A loop is confined to its glide plane.

Treating the discretized dislocation loop through a one-dimensional finite element discretization, the dynamics of the loop can be formulated through the set of equations [6]

$$F_A = \sum_{B=1}^{N} K_{AB}\, v_B \qquad (A = 1, \ldots, N)$$
with $K_{AB}$ a “stiffness” matrix that is determined by the loop geometry and the chosen shape functions, and is linear in the drag coefficient $D$. Once the nodal Peach–Koehler forces $F_A$ have been calculated, the nodal velocities are obtained by solving this set of equations. The formulation can be extended to handle sliding nodes in order to treat dislocation junctions and dislocation segments leaving the crystal via a free surface.

The computation of the nodal Peach–Koehler force $F_A$ requires care when it comes to the self-interaction, i.e., the contribution of the segments belonging to the same dislocation. In order to eliminate the singular contributions from the ends of the two adjacent segments, Brown’s scheme can be used, see [6, 7]. Nevertheless, high-order Gaussian integration is generally needed to obtain convergence with a loop discretization that is not excessively fine.

Several issues require attention when integrating the motion of a dislocation loop in time; they have to do with the continuous change of local curvature. Weygand et al. [6] have suggested (i) a two-level time stepping approach that minimizes the $N^2$ problem of interaction calculations and (ii) an adaptive re-discretization scheme for the dislocation. There is probably still much room to improve these numerical procedures in order to reduce the number of calculations while retaining accuracy. In particular, multipole methods [10, 11] can considerably reduce the computational time for evaluating dislocation interactions.

Experience with the superposition approach to boundary-value problems in three dimensions has so far revealed that the numerics are more demanding than one might expect from two-dimensional applications. First of all, higher-order finite elements seem necessary; 20-node brick elements with eight-point Gaussian integration are likely to be the minimum requirement. Even then, Weygand et al.
[6] found that at least one to two elements are needed between the dislocation and a free surface in order for the calculated image forces to converge. Moreover, sufficiently many integration points per surface element are needed to compute the nodal forces from the long-range traction fields $\tilde{T}_i$.

The evolution of the dislocation structure may lead to events where nodal points and part of the corresponding dislocation segments leave the material. These events need to be detected when dislocation nodes are moved, and proper constraints must be applied to the resulting surface nodes. To facilitate this detection, the surface of the sample is approximated by a triangular mesh in [6]. When part of a dislocation glides out of the crystal, the dislocation cannot be treated as being open but has to be closed through virtual segments outside the crystal. This ensures that the analytic expressions for the stress and displacement fields remain valid and that the step produced on the surface is captured through the analytic displacement field. The error introduced by closing the loop outside the crystal is corrected by the (ˆ)-solution. The shape of the virtual dislocation part is in principle irrelevant, but care needs to be taken that the strong attractive image force on the remaining dislocation from the free surface is resolved to sufficient accuracy. A judicious choice of this shape can aid the accuracy of the calculation of the dislocation–surface interaction within the finite element context. Weygand et al. [6] have proposed a procedure where the first two outer segments (after a surface node) are put into positions which correspond to a “mirror image” of the last two inner segments before the surface node, as shown in Fig. 3. This idea is inspired by the notion of image dislocations [1] for plane surfaces and dislocation lines parallel to that surface; for the general situation of curved dislocations on glide planes that are not orthogonal to the free surface the approach is only approximate.

Figure 3. The pseudo-mirror construction to mimic the attractive interaction: (a) node leaves the sample; (b) surface nodes are introduced and a mirror construction is created. The view shows the projection onto the glide plane.
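Returning to the nodal drag relation for a discretized loop, in which nodal forces and nodal velocities are linked through a matrix that depends on the loop geometry and is linear in the drag coefficient, the following Python sketch shows one way such a system could be assembled and solved. It is a sketch under stated assumptions rather than the scheme of [6]: linear shape functions on straight segments are assumed, the matrix is taken as the drag coefficient times the integral of products of shape functions, the force per unit length is taken as uniform, and the names `assemble_drag_matrix` and `nodal_forces_uniform` are hypothetical.

```python
import numpy as np


def assemble_drag_matrix(nodes, drag):
    """Assemble K_AB = D * integral(N_A N_B dl) for a closed polygonal loop."""
    nodes = np.asarray(nodes, dtype=float)
    n = len(nodes)
    K = np.zeros((n, n))
    for a in range(n):
        b = (a + 1) % n                      # segment a -> b; the last one closes the loop
        length = np.linalg.norm(nodes[b] - nodes[a])
        # Element matrix of a two-node segment with linear interpolation
        Ke = drag * length / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        for i, p in enumerate((a, b)):
            for j, q in enumerate((a, b)):
                K[p, q] += Ke[i, j]
    return K


def nodal_forces_uniform(nodes, force_per_length):
    """Consistent nodal forces F_A for a uniform glide force per unit length."""
    nodes = np.asarray(nodes, dtype=float)
    n = len(nodes)
    F = np.zeros(n)
    for a in range(n):
        b = (a + 1) % n
        length = np.linalg.norm(nodes[b] - nodes[a])
        F[a] += 0.5 * force_per_length * length
        F[b] += 0.5 * force_per_length * length
    return F


# Irregular five-node loop in its glide plane (arbitrary units)
loop = [(0.0, 0.0), (2.0, 0.0), (3.0, 1.5), (1.0, 3.0), (-0.5, 1.0)]
D, F0 = 1.0e-4, 2.0e-2
K = assemble_drag_matrix(loop, D)
F = nodal_forces_uniform(loop, F0)
v = np.linalg.solve(K, F)                    # nodal glide velocities v_B
```

With a uniform force per unit length, every node should recover the continuum result $v = F/D$ irrespective of the segment lengths, which provides a simple consistency check on the assembly.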
3.
Two-Dimensional Dislocation Dynamics
The computational complexity of discrete dislocation dynamics is substantially reduced by restricting attention to two-dimensional (2D) plane strain situations. The advantage of a 2D formulation is that complex boundary value problems can be solved with realistic dislocation densities with relatively modest computing resources. A disadvantage is that the range of phenomena that can be modeled is limited by the restricted physics of two-dimensional dislocation interactions.
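Within the constraint of plane strain, the dislocations are restricted to being edge dislocations (screw dislocations are consistent with anti-plane shear deformations). For an elastically isotropic solid with shear modulus $\mu$ and Poisson’s ratio $\nu$, the stress and displacement fields at $(x_1, x_2)$ for a dislocation with Burgers vector $b^I \mathbf{e}_1$ at $(X_1, X_2)$ are:

$$\sigma_{11}^I(x_1, x_2) = -\frac{\mu b^I}{2\pi(1-\nu)} \frac{\Delta x_2 \left[ 3(\Delta x_1)^2 + (\Delta x_2)^2 \right]}{\left[ (\Delta x_1)^2 + (\Delta x_2)^2 \right]^2} \tag{14}$$

$$\sigma_{22}^I(x_1, x_2) = \frac{\mu b^I}{2\pi(1-\nu)} \frac{\Delta x_2 \left[ (\Delta x_1)^2 - (\Delta x_2)^2 \right]}{\left[ (\Delta x_1)^2 + (\Delta x_2)^2 \right]^2} \tag{15}$$

$$\sigma_{12}^I(x_1, x_2) = \frac{\mu b^I}{2\pi(1-\nu)} \frac{\Delta x_1 \left[ (\Delta x_1)^2 - (\Delta x_2)^2 \right]}{\left[ (\Delta x_1)^2 + (\Delta x_2)^2 \right]^2} \tag{16}$$

$$u_1^I(x_1, x_2) = \frac{b^I}{2\pi(1-\nu)} \left[ \frac{1}{2} \frac{\Delta x_1\, \Delta x_2}{(\Delta x_1)^2 + (\Delta x_2)^2} - (1-\nu) \tan^{-1} \frac{\Delta x_1}{\Delta x_2} \right] \tag{17}$$

$$u_2^I(x_1, x_2) = \frac{b^I}{2\pi(1-\nu)} \left[ \frac{1}{2} \frac{(\Delta x_2)^2}{(\Delta x_1)^2 + (\Delta x_2)^2} - \frac{1}{4}(1-2\nu) \ln \frac{(\Delta x_1)^2 + (\Delta x_2)^2}{(b^I)^2} \right] \tag{18}$$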
where $\Delta x_i = x_i - X_i$.

It can be computationally useful to take advantage of the fact that the superposition in Eq. (5) is not unique. As long as the (˜) fields incorporate the appropriate singularities, Eqs. (14)–(18) can be extended to include any convenient non-singular fields. In particular, in circumstances where there is a traction-free surface, such as a crack surface, the gradients that the numerically computed (ˆ) fields need to resolve can be reduced by using the dislocation fields for a half-space. These fields are most simply expressed in terms of a complex stress function $\varphi$ (the dislocation index $(\ )^I$ is omitted from $\varphi$ for clarity). With the traction-free surface being the $x_1$-axis, with $\theta$ the angle between the Burgers vector and the $x_1$-axis and the dislocation position being $(x_1, h)$, the stress and displacement fields are given by

$$\tilde{\sigma}_{22}^I - i\,\tilde{\sigma}_{12}^I = \varphi'(z) - \varphi'(\bar{z}) + (z - \bar{z})\,\overline{\varphi''(z)}, \tag{19}$$

$$\tilde{\sigma}_{11}^I + i\,\tilde{\sigma}_{12}^I = \varphi'(z) + \varphi'(\bar{z}) + 2\,\overline{\varphi'(z)} - (z - \bar{z})\,\overline{\varphi''(z)}, \tag{20}$$

where $z = x_1 + i x_2$, a prime denotes differentiation and an overbar denotes the complex conjugate. The displacement components are given through

$$2\mu\,(\tilde{u}_1^I + i\,\tilde{u}_2^I) = (3 - 4\nu)\,\varphi(z) + \varphi(\bar{z}) - (z - \bar{z})\,\overline{\varphi'(z)} \tag{21}$$
with

$$\varphi(z) = \frac{\mu\, i}{4\pi(1-\nu)} \left[ \bar{b}^I \left\{ \ln\left[ -m(ih - z) \right] - \ln\left[ \bar{m}(ih + z) \right] \right\} + \frac{2 b^I h}{z - ih} \right] \tag{22}$$

with $m$ defined by $b^I = |b^I|\, m = |b^I|(\cos\theta + i \sin\theta)$.

In addition to accounting for traction-free surfaces in the (˜) fields, it can be convenient to use analytical fields for infinite arrays of dislocations in the case of periodic boundary conditions. Expressions for walls (dislocations stacked normal to the Burgers vector) and carpets (rows of dislocations parallel to the Burgers vector) in infinite space can be found in the literature, as can expressions for carpets of dislocations in a half space. Such solutions are periodic in one direction and decay exponentially in the perpendicular direction. The latter eliminates the development of artificial patterning of dislocations that arises when using individual dislocations with their $1/r$ decay and a finite cut-off radius for dislocation-dislocation interactions.

A variety of two-dimensional analyses have been carried out so far in which the magnitude of the Burgers vector is $b$ for all dislocations and the following set of simple constitutive rules is used:

• Dislocation nucleation: Dislocation dipoles are nucleated by simulating Frank–Read sources. In 2D this is implemented through point sources that nucleate a dislocation dipole when the Peach–Koehler force at source site $I$,*

$$f^I = m_i^I \left( \hat{\sigma}_{ij} + \sum_J \sigma_{ij}^J \right) b_j = m_i^I \sigma_{ij} b_j, \tag{23}$$

equals or exceeds $b\tau_{nuc}$ during a period of time $t_{nuc}$, where $b$ is the Burgers vector magnitude and $\tau_{nuc}$ and $t_{nuc}$ are parameters specified for each source. In Eq. (23) the superscript $I$ pertains to the source while $\sum_J \sigma_{ij}^J$ gives the stress at the source site from the individual dislocation fields. The distance $L_{nuc}$ between the generated dislocations is taken to be given by

$$L_{nuc} = \frac{\mu}{2\pi(1-\nu)} \frac{b}{\tau_{nuc}}. \tag{24}$$
• Dislocation glide: The magnitude of the glide velocity $v^I$ of dislocation $I$ is given by

$$B v^I = f^I - b\tau_P \tag{25}$$

with $B$ the drag coefficient and $\tau_P$ the Peierls stress.

* Note that the magnitude of $f^I$ in Eq. (23) is equal to $b$ times the local resolved shear stress.
• Dislocation annihilation: Annihilation of two dislocations with opposite signed Burgers vectors occurs when they come within a critical annihilation distance $L_e$ of each other.
• Dislocation obstacles: Obstacles to dislocation motion are modeled as fixed points on a slip plane. Pinned dislocations can only pass the obstacles when their Peach–Koehler force exceeds a specified value $b\tau_{obs}$.

Within the framework of these constitutive rules, the sources and obstacles are specified initially and do not evolve with deformation.

Two-dimensional simulations have been carried out that allow for strains of several percent and realistic dislocation densities, even in complex boundary value problems. However, the range of phenomena that can be modeled using the 2D framework is limited by the restricted physics of 2D dislocation interactions. For example, while the natural formation of dipoles at the intersection of slip planes emerges in 2D analyses, the formation of three-dimensional (3D) junctions, which can be much stronger, is not accounted for. As a consequence, 2D analyses of plane strain tension using the constitutive rules described above exhibit non-hardening behavior, i.e., after some initial transient, plastic flow occurs at a more or less constant stress. Hardening can occur, but only when geometrically necessary dislocations are present.

Recently, Benzerga et al. [12] have proposed dislocation constitutive rules for 2D analyses that model 3D dislocation mechanisms including dynamic junction formation, with some of the junctions serving as dislocation sources and some purely as obstacles. In this manner, the dislocation source density evolves with deformation, which is key for a realistic description of hardening.
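As a rough illustration of the simple constitutive rules for nucleation, glide and annihilation listed above, the following Python sketch evaluates Eq. (24), Eq. (25) and the annihilation criterion for a single slip plane. The function names and the parameter values in the usage lines are illustrative assumptions. In particular, the treatment of the sign in `glide_velocity` (motion along the sign of the Peach–Koehler force, and no motion while the force magnitude is below $b\tau_P$) is an interpretation made for this sketch.

```python
import math


def nucleation_distance(mu, nu, b, tau_nuc):
    """Eq. (24): spacing of a freshly nucleated dipole."""
    return mu * b / (2.0 * math.pi * (1.0 - nu) * tau_nuc)


def glide_velocity(f, b, tau_peierls, drag):
    """Eq. (25) for the speed, signed along the Peach-Koehler force.

    Assumption of this sketch: the dislocation stays put while
    |f| <= b * tau_peierls.
    """
    excess = abs(f) - b * tau_peierls
    if excess <= 0.0:
        return 0.0
    return math.copysign(excess / drag, f)


def should_annihilate(x_a, sign_a, x_b, sign_b, l_e):
    """Opposite-signed dislocations on one slip plane within L_e annihilate."""
    return sign_a * sign_b < 0 and abs(x_a - x_b) <= l_e


# Illustrative (assumed) parameter values in SI units
mu, nu, b = 26.3e9, 0.33, 0.25e-9
L_nuc = nucleation_distance(mu, nu, b, tau_nuc=50.0e6)
v = glide_velocity(f=50.0e6 * b, b=b, tau_peierls=0.0, drag=1.0e-4)
```

Note that the nucleation distance in Eq. (24) is the spacing at which the attraction between the two dislocations of the dipole is balanced by a resolved shear stress equal to $\tau_{nuc}$, so a weaker source emits a wider dipole.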
The physical background for the rules proposed by Benzerga et al. is given in [12]; here we just summarize them:

• Junction formation: The formation of a junction is taken to occur when two dislocations gliding on two intersecting slip planes approach within a specified distance $d^*$ from the intersection point of the slip plane traces, regardless of the sign of the dislocations. The intersection point is identified with the junction location and the two dislocations forming the junction are immobile until the junction is broken. When a junction forms, there is a probability $p$ that it acts as a potential anchoring point for a Frank–Read source and a probability $(1 - p)$ that it acts as an obstacle.

• Dynamic obstacles: Dislocations that approach the junction are kept at a distance greater than or equal to $d^*$ from the junction location. A junction $I$ is destroyed if the Peach–Koehler force acting on either dislocation comprising the junction attains or exceeds the breaking force $\tau_{brk}^I b$ with

$$\tau_{brk}^I = \beta_{brk} \frac{\mu b}{S^I}. \tag{26}$$

Here, $S^I$ is the distance to the nearest junction in either of the two intersecting planes, $b$ is the magnitude of the Burgers vector of the dislocation making up the junction and $\beta_{brk}$ is a parameter giving the strength of the junction.

• Source operation: A dislocation dipole is nucleated at source $I$ when the value of the Peach–Koehler force at the junction exceeds $\tau_{nuc}^I b$ for a time $t_{nuc}^I$, where

$$\tau_{nuc}^I = \beta_{nuc} \frac{\mu b}{S^I} \tag{27}$$
with $\beta_{nuc}$ giving the source strength and $S^I$ the distance to the nearest junction on the slip plane. In evaluating $S^I$ all junctions are considered regardless of whether they are anchoring points or obstacles. The time $t_{nuc}^I$ is given by

$$t_{nuc}^I = \gamma \frac{S^I}{|\tau^I|\, b} \tag{28}$$

where $\tau^I$ is the resolved shear stress at the junction location and $\gamma$ depends on the drag coefficient $B$ and on $\tau^I / \tau_{nuc}^I$. For nucleation of an isolated loop,

$$L_{nuc}^I = \kappa S^I \tag{29}$$

where $\kappa > 1$. However, the emitted dipole is not allowed to pass through a dislocation near the source. As a consequence, the size of the emitted loop is $S^I \le L_{nuc}^I < \kappa S^I$.

• Line tension: The energy cost associated with loop expansion is modeled through a configurational force of magnitude $L^I b$ pointing from one dislocation in a dipole toward the other. The magnitude of $L^I$ is

$$L^I = -\alpha \frac{\mu |b|}{S_d^I} \tag{30}$$
where $\alpha$ is a proportionality factor and $S_d^I$ is the algebraic distance between the two dislocations comprising the dipole, so that the sign of $L^I$ depends on the sign of $S_d^I$. The line tension is then included in Eq. (25) by adding $L^I b$ as a driving force to the right-hand side.

• Interaction of moving dislocations with junctions: An anchoring point can be destroyed by annihilation of one of the dislocations forming the junction. On the other hand, an obstacle can be destroyed either by annihilation or by the local stress exceeding the obstacle strength. In order to analyze the consequences of these two mechanisms, two options have been considered: (i) junction destruction can occur only when a critical stress is reached so that, as a consequence, only obstacles can be destroyed; and (ii) annihilation is possible, in which case both obstacles and
anchoring points can be destroyed. In option (i), when a dislocation of opposite sign comes close to an obstacle it is pinned at a distance $d^*$ from the obstacle, while when a dislocation of opposite sign comes close to an anchoring point the gliding dislocation is free to oscillate around the anchoring point.

Calculations using these constitutive rules also use the constitutive rules for dislocation motion, Eq. (25), and dislocation annihilation. In addition, initial static sources and obstacles can be specified. Although initial results are encouraging [12], it remains to be seen how much of 3D dislocation physics can actually be incorporated in a 2D formulation.

Computing the change in the dislocation structure in each time increment involves: (i) computing the motion of existing dislocations; (ii) checking for interactions with the static obstacles and with existing dynamic junctions; (iii) checking for dislocation annihilation; (iv) determining if any dislocations have exited at a free surface; (v) determining if any dislocations pinned at static obstacles have broken away; (vi) checking for the destruction of the dynamic junctions; (vii) checking for the creation of new dynamic junctions; (viii) checking for nucleation at the static and dynamic sources.

Since only edge dislocations are present in the 2D analyses and since nucleation involves the production of dipoles, nucleation does not change the total Burgers vector during the deformation history. The net Burgers vector in the body can only change when dislocations exit the body, leaving a step on the surface. Since edge dislocations correspond to addition or subtraction of a half-plane of atoms, conservation of total Burgers vector reflects conservation of mass. It is worth mentioning that the constitutive relations used for dislocation nucleation pertain to nucleation from Frank–Read sources, where the main issue is one of propagating a loop to its stable size.
Criteria for other nucleation processes, for example from surface steps or grain boundaries (which can also act as dislocation sinks), remain to be developed. Dislocation dynamics is chaotic [13]. It seems that the chaotic behavior has relatively little effect on the predicted stress-strain response under monotonic loading, where the variations in dislocation position tend to average out, but possibly more effect on fracture predictions, where local values of stress and deformation can matter. However, the implications of this chaotic behavior remain to be fully explored.
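The (˜) stress field used throughout the two-dimensional analyses is the sum of the infinite-medium edge dislocation fields of Eqs. (14)–(16). The following NumPy sketch evaluates that sum and checks by central differences that the result satisfies the equilibrium equation, Eq. (1), away from the dislocation cores. The function names, the nondimensional parameter values and the sign of the shear component (taken positive, as for the classical edge dislocation solution of [1], which is also what equilibrium with Eqs. (14) and (15) requires) are assumptions of this sketch.

```python
import numpy as np


def edge_stress(x1, x2, dislocations, mu, nu):
    """Sum the infinite-medium edge fields over (X1, X2, b) tuples."""
    s11 = s22 = s12 = 0.0
    for X1, X2, b in dislocations:
        dx1, dx2 = x1 - X1, x2 - X2
        r2 = dx1**2 + dx2**2
        c = mu * b / (2.0 * np.pi * (1.0 - nu))
        s11 += -c * dx2 * (3.0 * dx1**2 + dx2**2) / r2**2
        s22 += c * dx2 * (dx1**2 - dx2**2) / r2**2
        s12 += c * dx1 * (dx1**2 - dx2**2) / r2**2
    return s11, s22, s12


def equilibrium_residual(x1, x2, dislocations, mu, nu, h=1e-5):
    """Central-difference estimate of d(sigma_ij)/dx_j at a field point."""
    def s(a, b):
        return np.array(edge_stress(a, b, dislocations, mu, nu))
    d1 = (s(x1 + h, x2) - s(x1 - h, x2)) / (2 * h)   # d/dx1 of (s11, s22, s12)
    d2 = (s(x1, x2 + h) - s(x1, x2 - h)) / (2 * h)   # d/dx2 of (s11, s22, s12)
    return d1[0] + d2[2], d1[2] + d2[1]


# Dipole on the slip plane x2 = 0, nondimensional units (mu = b = 1)
dipole = [(0.0, 0.0, 1.0), (2.0, 0.0, -1.0)]
res1, res2 = equilibrium_residual(0.8, 0.6, dipole, mu=1.0, nu=0.3)
```

On the slip plane ahead of a single dislocation the normal stresses vanish and only the shear stress remains, decaying as $1/r$; this is the component that enters the Peach–Koehler force on other dislocations on the same plane.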
4.
Example
Experiments have shown that stress evolution in films with a thickness on the order of micrometers is size dependent. This effect cannot be resolved by classical continuum theories since they lack a material length scale. The method presented above is illustrated by considering a 2D plane strain model
of a thin film bonded to an elastic substrate, as analyzed by Nicola et al. [14]. The film of thickness $h$ is considered to be a single crystal and perfectly bonded to a half-infinite substrate, see Fig. 4. The single crystal contains three slip systems with slip plane orientations $\phi^{(1)} = 0^\circ$, $\phi^{(2)} = 60^\circ$ and $\phi^{(3)} = 120^\circ$, which resembles an fcc crystal with the (110) plane coinciding with the $x_1$-$x_2$ plane of deformation. The elastic properties of the film are assumed to be isotropic and the same as those of the substrate. Stress is caused by the mismatch in the coefficients of thermal expansion and arises from cooling from the stress-free state. This is taken into account by subtracting the thermal stress $3E\,\Delta\alpha\,\Delta T/(1 - 2\nu)$ due to a temperature difference $\Delta T$ from the left-hand side of (8), where $E = 2(1 + \nu)\mu$ is Young’s modulus and $\Delta\alpha$ is the difference between the coefficient of linear thermal expansion of the film, $\alpha_f$, and that of the substrate, $\alpha_s$. Note that the thermal part of the problem is taken care of through the (ˆ) fields. The film is infinitely long in the $x_1$ direction but is treated as being periodic with cell width $w$. The (˜) fields are constructed from the periodic fields of a dislocation and all its replicas at mutual distance $w$. The traction-free condition of the film surface $x_2 = h$ is accounted for by the (ˆ) fields. The interface between film and substrate is treated here as being impenetrable to dislocations (by putting very strong obstacles at the ends of the slip planes).

Simulations start from a stress-free and dislocation-free configuration. The film contains a random distribution of 60 sources/µm². The nucleation strength $\tau_{nuc}$ of each source is drawn at random from a Gaussian distribution with an average of 25 MPa and a standard deviation of 5 MPa. A dislocation dipole is generated from the source when the resolved shear stress at the source exceeds the nucleation strength for a given time $t_{nuc} = 10$ ns.
There are no obstacles, and neither junction formation nor line tension is accounted for.
Figure 4. Geometry of the film-substrate problem. A unit cell of width w is analyzed and the height of the substrate is taken large enough to represent a half space.
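The initial source population described for this example (a random distribution of 60 sources/µm² with Gaussian nucleation strengths of mean 25 MPa and standard deviation 5 MPa) could be generated along the following lines. This is a minimal sketch: the function name `sample_sources`, the uniform placement of sources over the cell, the random assignment of each source to one of the three slip systems, and the cell width of 2 µm used in the usage line are all assumptions made for illustration and are not specified in the text.

```python
import numpy as np


def sample_sources(width_um, thickness_um, density_per_um2=60.0,
                   tau_mean_mpa=25.0, tau_std_mpa=5.0,
                   slip_angles_deg=(0.0, 60.0, 120.0), seed=0):
    """Draw source positions, slip systems and nucleation strengths."""
    rng = np.random.default_rng(seed)
    n = int(round(density_per_um2 * width_um * thickness_um))
    return {
        "x1": rng.uniform(0.0, width_um, n),          # position along the film
        "x2": rng.uniform(0.0, thickness_um, n),      # height above the interface
        "phi_deg": rng.choice(slip_angles_deg, size=n),
        "tau_nuc_mpa": rng.normal(tau_mean_mpa, tau_std_mpa, n),
    }


# Film of thickness 0.5 um in an (assumed) periodic cell of width 2 um
sources = sample_sources(width_um=2.0, thickness_um=0.5)
```

Because the number of sources scales with the film thickness at fixed density, the thinnest films contain few sources per cell, so the weakest available source in a given realization can differ noticeably from the mean.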
Figure 5 shows how the dislocation distribution evolves from the initially dislocation- and stress-free state during cooling in a film with $h = 0.5$ µm from $T = 600$ K. After cooling by roughly 25 K, the first dislocation dipoles are generated inside the hitherto uniform elastic stress field. One dislocation moves toward the impenetrable interface where it gets stopped, while the other exits the film at the free surface. As cooling proceeds, more and more dislocations are generated and pile up against the interface. This causes the formation of a boundary layer of relatively high stress just above the interface. The thickness of the boundary layer turns out to be more or less independent of film thickness. This gives rise to a size effect: thinner films are harder, as shown in Fig. 6. The stress-temperature curves are serrated as a consequence of the discrete nucleation events. The straight-line fits demonstrate that hardening is approximately linear with the constitutive rules adopted in this simulation. The kink in the stress-temperature curve after ∼70 K for the $h = 0.25$ µm film is caused by the limited availability of sources in such thin films [14]. Quite generally, at small size scales limited source availability can significantly affect the evolution of plastic deformation.

Figure 5. Evolution of the dislocation distribution inside the film during cooling by: (a) 100 K; (b) 150 K; (c) 200 K. In (c) the distribution of the stress $\sigma_{11}$ parallel to the film is superimposed, also showing the top 0.5 µm of the substrate.

Figure 6. Average stress in the film, $\langle\sigma_{11}\rangle_f$, versus temperature $T$ [K] for three film thicknesses ($h$ = 0.25 µm, 0.5 µm and 1 µm).
References

[1] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, 1982.
[2] F.R.N. Nabarro, Theory of Crystal Dislocations, Oxford Univ. Press, Oxford, 1967.
[3] V.S. Deshpande, A. Needleman, and E. Van der Giessen, “Finite strain discrete dislocation plasticity,” J. Mech. Phys. Solids, 51, 2057–2083, 2003.
[4] L.P. Kubin, G. Canova, M. Condat, B. Devincre, V. Pontikis, and Y. Bréchet, “Dislocation microstructures and plastic flow: a 3D simulation,” Solid State Phenomena, 23-24, 455–472, 1992.
[5] H.M. Zbib, M. Rhee, and J.P. Hirth, “On plastic deformation and the dynamics of 3D dislocations,” Int. J. Mech. Sci., 40, 113–127, 1998.
[6] D. Weygand, L.H. Friedman, E. Van der Giessen, and A. Needleman, “Aspects of boundary-value problem solutions with three-dimensional dislocation dynamics,” Modelling Simul. Mater. Sci. Eng., 10, 437–468, 2002.
[7] K.W. Schwarz, “Simulation of dislocations on the mesoscopic scale. I. Methods and examples,” J. Appl. Phys., 85, 108–119, 1999.
[8] N.M. Ghoniem and L.Z. Sun, “Fast-sum method for the elastic field of three-dimensional dislocation ensembles,” Phys. Rev. B, 60, 128–140, 1999.
[9] D.M. Barnett, “The displacement field of a triangular dislocation loop,” Phil. Mag. A, 51, 383–387, 1985.
[10] G.J. Rodin, “Towards rapid evaluation of the elastic interactions among three-dimensional dislocations,” Phil. Mag. Lett., 77, 187–190, 1998.
[11] R. LeSar and J.M. Rickman, “Multipole expansion of dislocation interactions: application to discrete dislocations,” Phys. Rev. B, 65, 144110, 2002.
[12] A.A. Benzerga, Y. Bréchet, A. Needleman, and E. Van der Giessen, “Incorporating three-dimensional mechanisms into dislocation dynamics,” Modelling Simul. Mater. Sci. Eng., 12, 159–196, 2004.
[13] V.S. Deshpande, A. Needleman, and E. Van der Giessen, “Dislocation dynamics is chaotic,” Scripta Mat., 45, 1047–1053, 2001.
[14] L. Nicola, E. Van der Giessen, and A. Needleman, “Discrete dislocation analysis of size effects in thin films,” J. Appl. Phys., 93, 5920–5928, 2003.
3.5 CRYSTAL PLASTICITY

M.F. Horstemeyer¹, G.P. Potirniche¹, and E.B. Marin²
¹Mississippi State University, Mississippi State, MS, USA
²Sandia National Laboratories, Livermore, CA, USA
Besides dislocation dynamics, crystal plasticity can also be considered a mesoscale formulation, since the details of the equations start at the scale of the crystal or grain. In this section, classical crystal plasticity formulations, kinematics, kinetics, and polycrystalline averaging methods are discussed. Continuum slip polycrystal plasticity models have become quite popular in recent years as a tool to study deformation and texture behavior of metals during processing [1] and shear localization [2, 3]. The basic elements of the theory comprise (i) kinetics related to slip system hardening laws to reflect intragranular work hardening, including self and latent hardening components [4], (ii) kinematics in which the concept of the plastic spin plays an important role, and (iii) intergranular constraint laws to govern interactions among crystals or grains. The theory is commonly acknowledged for providing realistic prediction/correlation of texture development and stress-strain behavior at large strains, as it joins continuum theory with discretized crystal activity.

Different authors have developed or recommended various forms of the basic elements of polycrystal plasticity theory that address specific applications. Some have developed formulations at what is called the intermediate stress configuration [5, 6]. Others have focused on current configuration formulations [3, 7]. The texture and stress-strain responses are essentially the same. Most have ignored elasticity effects [7], while others include them in their formulations [3]. Again, the results are the same. Where differences do arise is in the assumptions related to the kinetics of slip.

Inelastic deformation has historically been attributed to dislocation glide on slip planes, otherwise known as crystallographic slip.
Taylor and Elam [8] were the first to determine the relationship between the orientation of one crystallographic slip axis and the tensile test axis, which were not necessarily coincident. They conjectured that perhaps two slip systems were involved.

S. Yip (ed.), Handbook of Materials Modeling, 1133–1149. © 2005 Springer. Printed in the Netherlands.
Schmid [9] determined that the magnitude of crystallographic slip on the glide planes was related to the resolved shear stress. The next major historical work was that of Taylor [10], who founded the "principle of minimum shears." This principle disregarded elastic strains and assumed that only five independent slip systems were necessary to describe three-dimensional polycrystalline behavior. Using Taylor's assumption, Bishop and Hill [11] determined the three-dimensional stress state resulting from all the slip possibilities in a face-centered cubic (FCC) lattice, which has twelve slip systems (three possible ⟨110⟩ slip directions on each of four {111} planes). Books by Havner [12] and Kocks et al. [4] review the history and the pertinent issues related to the kinetics, kinematics, and intergranular constraints of crystal plasticity.

We now turn to the kinematics of crystal plasticity. The deformation gradient is often assumed to decompose multiplicatively into elastic and plastic parts, after Lee and Liu [13],
$$F = F^{e} F^{p}, \tag{1}$$
so correspondingly the velocity gradient is given by
$$L = \dot{F} F^{-1} = \left(\dot{F}^{e} F^{p} + F^{e} \dot{F}^{p}\right) F^{p-1} F^{e-1} = \dot{F}^{e} F^{e-1} + F^{e} \dot{F}^{p} F^{p-1} F^{e-1}, \tag{2}$$
where $L^{e} = \dot{F}^{e} F^{e-1}$ and $L^{p} = \dot{F}^{p} F^{p-1}$. The plastic velocity gradient corresponding to crystallographic slip is given by
$$L^{p} = \sum_{i=1}^{N} \dot{\gamma}_{i}\, s_{0}^{i} \otimes m_{0}^{i}, \tag{3}$$
where $N$ is the number of slip systems (twelve for an FCC lattice), $\dot{\gamma}_{i}$ is the plastic slip rate on the ith slip system, and $s_{0}^{i}$ and $m_{0}^{i}$ are the slip direction vector and the unit normal vector to the slip plane, respectively. Because $s_{0}^{i}$ and $m_{0}^{i}$ are fixed in space according to the classical assumption of Taylor (material flows through the lattice), the so-called intermediate configuration is specified; hence, material plastically flows from the reference to the intermediate configuration. After plastic deformation, the lattice deforms and rotates with $F^{e}$, which is defined by the polar decomposition
$$F^{e} = R^{e} U^{e}, \tag{4}$$
where $R^{e}$ is the proper orthogonal rotation tensor and $U^{e}$ is the right elastic stretch tensor. In general, $R^{e}$ comprises the rotation from both elastic deformation and rigid body rotation. As a result, the plastic velocity gradient in the current configuration is given by
$$\hat{L}^{p} = \sum_{i=1}^{N} \dot{\gamma}_{i}\, s^{i} \otimes m^{i}, \tag{5}$$
giving the velocity gradient as
$$L = L^{e} + R^{e} U^{e} R^{eT}\, \hat{L}^{p}\, R^{e} U^{e-1} R^{eT}, \tag{6}$$
since $R^{eT} = R^{e-1}$ for proper orthogonal $R^{e}$. Infinitesimal elastic strains are typically assumed; hence, the right elastic stretch is given by
$$U^{e} \cong I + Y, \tag{7}$$
in which higher order terms are neglected, where $I$ is the identity tensor and $Y$ is the infinitesimal perturbation of the elastic stretch. The inverse of the right elastic stretch is given by
$$U^{e-1} = I - Y + O(Y^{2}) \cong I - Y. \tag{8}$$
Substituting (7) and (8) into (6), we get
$$L = L^{e} + \hat{L}^{p} + R^{e} Y R^{eT} \hat{L}^{p} - \hat{L}^{p} R^{e} Y R^{eT}. \tag{9}$$
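For reference, the step from Eq. (6) to Eq. (9) can be written out term by term. The expansion below is a worked sketch that uses only the definitions in Eqs. (6)–(8) and keeps terms up to first order in $Y$:

```latex
\begin{aligned}
R^{e} U^{e} R^{eT}\,\hat{L}^{p}\,R^{e} U^{e-1} R^{eT}
  &\cong R^{e}(I + Y)R^{eT}\,\hat{L}^{p}\,R^{e}(I - Y)R^{eT} \\
  &= \left(I + R^{e} Y R^{eT}\right)\hat{L}^{p}\left(I - R^{e} Y R^{eT}\right) \\
  &= \hat{L}^{p} + R^{e} Y R^{eT}\hat{L}^{p} - \hat{L}^{p} R^{e} Y R^{eT} + O(Y^{2}).
\end{aligned}
```

The two first-order terms carry opposite signs, which is what later produces the commutator-like pairs in the expressions for $D$ and $W$.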
The Green elastic strain with respect to the intermediate configuration is given by
$$E = \tfrac{1}{2}\left(U^{e2} - I\right), \tag{10}$$
so the second rank Cauchy stress tensor, $\sigma$, in the current configuration can be related to the intermediate configuration stress, $\hat{\sigma}$, according to
$$R^{e}\, \hat{\sigma}(E)\, R^{eT} = \sigma(F^{e}). \tag{11}$$
As a consequence, the elastic rotation due to elastic deformation may be neglected, and $R^{e}$ essentially represents a rigid rotation. The general constitutive form can be determined at the intermediate (stress free) configuration through a hyperelastic law as
$$\hat{\sigma}(\hat{E}) = C\, \hat{E}, \tag{12}$$
where the elastic stiffness tensor, $C$, is invariant for a given crystal in the intermediate configuration. The intermediate configuration is aligned with the crystalline axes. $\hat{\sigma}$ is the second Piola–Kirchhoff stress in the intermediate configuration, and $\hat{E}$ is the conjugate Green elastic strain. For cubic orthotropy, the single crystal elastic moduli are formed on axes of cubic symmetry ([100], [010], and [001] axes). By defining
$$\begin{aligned} C_{1} &= C_{1111} = C_{2222} = C_{3333}, \\ C_{2} &= C_{1122} = C_{2233} = C_{1133}, \\ C_{3} &= C_{1212} = C_{1313} = C_{2323}, \end{aligned} \tag{13}$$
the components are formed on the Cartesian axes coincident with the [100], [010], and [001] axes, with all other $C_{ijkl}$ equal to zero. The stress in the current configuration is related to the second Piola–Kirchhoff stress by
$$\sigma = \frac{1}{J}\, F^{e}\, \hat{\sigma}\, F^{eT}, \tag{14}$$
where $J = \det F^{e}$.
Now the Zener anisotropy factor as related to the crystal axes (not the specimen axes) is given by
$$Z = \frac{2 C_{3}}{C_{1} - C_{2}}. \tag{15}$$
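As a quick numerical illustration of Eq. (15), the sketch below evaluates the Zener factor for one anisotropic and one isotropic set of constants. The copper values are commonly quoted room-temperature constants and are an assumption here, not data from this chapter; the function name `zener_ratio` is likewise only illustrative.

```python
def zener_ratio(c1: float, c2: float, c3: float) -> float:
    """Zener anisotropy factor Z = 2*C3 / (C1 - C2), Eq. (15)."""
    return 2.0 * c3 / (c1 - c2)


# Assumed textbook constants for copper in GPa (not taken from this chapter):
# C1 = C1111, C2 = C1122, C3 = C1212.
Z_CU = zener_ratio(168.4, 121.4, 75.4)

# An isotropic solid satisfies C3 = (C1 - C2) / 2, which gives Z = 1.
Z_ISO = zener_ratio(200.0, 100.0, 50.0)

print(f"Z(Cu) = {Z_CU:.2f}, Z(isotropic) = {Z_ISO:.2f}")
```

With these assumed inputs the copper value comes out slightly above 3, while the isotropic set returns exactly 1.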
When $Z = 1$, the elastic properties are isotropic; for copper, however, $Z > 3$. In finite inelastic deformation, grains rotate and tend to align themselves toward a texture pole. We now incorporate the kinematic equations into the constitutive equations. By virtue of Eq. (7), the Green elastic strain can be written
$$E = Y + O(Y^{2}), \tag{16}$$
and the inverted elastic stiffness tensor can be defined as
$$B^{*} = C^{*-1}. \tag{17}$$
By combining (11), (16), and (17), we may write
$$B^{*} \bullet \sigma = R^{e} Y R^{eT}, \tag{18}$$
to be used later. The velocity gradient can be decomposed into its symmetric and antisymmetric parts as
$$L = D + W. \tag{19}$$
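By using (7), (10), (18), and (19), the symmetric and antisymmetric parts of the velocity gradient in the current configuration can be identified as
$$D = D^{e} + \hat{D}^{p} + (B^{*} \bullet \sigma)\,\hat{W}^{p} - \hat{W}^{p}\,(B^{*} \bullet \sigma), \tag{20}$$
$$W = W^{e} + \hat{W}^{p} + (B^{*} \bullet \sigma)\,\hat{D}^{p} - \hat{D}^{p}\,(B^{*} \bullet \sigma), \tag{21}$$
when neglecting the higher order terms. Here $W^{e} = \dot{R}^{e} R^{eT}$. We can gain insight into the interpretation of the current configuration quantities $\hat{D}^{p}$ and $\hat{W}^{p}$ by rearranging Eqs. (20) and (21) as
$$\hat{D}^{p} = D - D^{e} - (B^{*} \bullet \sigma)\,\hat{W}^{p} + \hat{W}^{p}\,(B^{*} \bullet \sigma) \tag{22}$$
and
$$\hat{W}^{p} = W - W^{e} - (B^{*} \bullet \sigma)\,\hat{D}^{p} + \hat{D}^{p}\,(B^{*} \bullet \sigma). \tag{23}$$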
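To make the plastic rate of deformation and plastic spin concrete, the following minimal sketch builds the plastic velocity gradient of Eq. (5) from slip rates and splits it as in Eq. (19). The single active slip system, the slip rate value, and all function names are illustrative assumptions.

```python
import math


def outer(a, b):
    """Dyadic product a (x) b of two 3-vectors."""
    return [[a[i] * b[j] for j in range(3)] for i in range(3)]


def plastic_velocity_gradient(slip_rates, s_vecs, m_vecs):
    """Sum of gamma_dot_i * s_i (x) m_i over the given slip systems, Eq. (5)."""
    L = [[0.0] * 3 for _ in range(3)]
    for rate, s, m in zip(slip_rates, s_vecs, m_vecs):
        sm = outer(s, m)
        for i in range(3):
            for j in range(3):
                L[i][j] += rate * sm[i][j]
    return L


def sym_skew(L):
    """Split L into its symmetric part D and antisymmetric part W, Eq. (19)."""
    D = [[0.5 * (L[i][j] + L[j][i]) for j in range(3)] for i in range(3)]
    W = [[0.5 * (L[i][j] - L[j][i]) for j in range(3)] for i in range(3)]
    return D, W


# One FCC slip system, [1 -1 0] on (1 1 1), with an assumed slip rate of 1e-3 /s.
r2, r3 = math.sqrt(2.0), math.sqrt(3.0)
s1 = [1.0 / r2, -1.0 / r2, 0.0]
m1 = [1.0 / r3, 1.0 / r3, 1.0 / r3]

Lp = plastic_velocity_gradient([1.0e-3], [s1], [m1])
Dp, Wp = sym_skew(Lp)
trace_Dp = Dp[0][0] + Dp[1][1] + Dp[2][2]
print("trace of Dp:", trace_Dp)
print("plastic spin component W12:", Wp[0][1])
```

Because the slip direction lies in the slip plane, the trace of the plastic rate of deformation vanishes (plastic incompressibility), while the plastic spin is generally nonzero even for a single system.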
In many macroscale plasticity formulations, the plastic rate of deformation and plastic spin are prescribed. What distinguishes this crystal plasticity formulation from macroscale internal state variable theory is that these quantities fall out naturally within the formulation. It is instructive to observe the rate forms of the crystal plasticity equations. The material time derivative of the Cauchy stress in the current configuration is given by differentiating Eq. (14) as
$$\dot{\sigma} = \dot{R}^{e}\, \hat{\sigma}(E)\, R^{eT} + R^{e}\, \dot{\hat{\sigma}}(E)\, R^{eT} + R^{e}\, \hat{\sigma}(E)\, \dot{R}^{eT}. \tag{24}$$
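From the co-rotational stress rate in the current configuration, $\dot{\hat{\sigma}}$ is given by
$$\dot{\hat{\sigma}} = C \bullet \dot{E}, \tag{25}$$
where $\dot{E} = U^{e} \dot{U}^{e} \cong \dot{Y}$ neglecting higher order terms, and the elastic part of the velocity gradient is given by
$$D^{e} = R^{e}\, \dot{Y}\, R^{eT}, \qquad \Omega^{e} = \dot{R}^{e} R^{eT}. \tag{26}$$
By combining (24)–(26), the Cauchy stress rate becomes
$$\dot{\sigma} = \Omega^{e} \sigma - \sigma \Omega^{e} + C^{*} \bullet D^{e}. \tag{27}$$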
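The stress rate that co-rotates with the crystal lattice, which spins with $W^{e}$, is a Jaumann-type form given by
$$\overset{\circ}{\sigma} = C^{*} \bullet D^{e} = \dot{\sigma} - \Omega^{e} \sigma + \sigma \Omega^{e}, \tag{28}$$
where
$$\Omega^{e} = W^{e}. \tag{29}$$
Combining (24)–(29), the stress rate becomes
$$\dot{\sigma} = W^{e} \sigma - \sigma W^{e} + C^{*} \bullet \left(D - \hat{D}^{p}\right) + C^{*} \bullet \left[\hat{W}^{p}\,(B^{*} \bullet \sigma) - (B^{*} \bullet \sigma)\,\hat{W}^{p}\right], \tag{30}$$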
since $D^{e} = D - \hat{D}^{p}$ plus spin terms. The next important aspect of crystal plasticity is to add kinetic relations to the aforementioned kinematic and constitutive relations. A common viscoplastic flow rule, employed by Hutchinson [14] for isotropic hardening and modified by Horstemeyer et al. [15] to include kinematic hardening, is given by
$$\dot{\gamma}_{i} = \dot{\gamma}_{0}\, \mathrm{sgn}(\tau_{i} - \alpha_{i}) \left| \frac{\tau_{i} - \alpha_{i}}{g_{i}} \right|^{M}, \tag{31}$$
where the plastic slip rate on the ith slip system, $\dot{\gamma}_{i}$, is a function of a fixed reference strain rate, $\dot{\gamma}_{0}$, the reference shear strength, $g_{i}$, the resolved shear stress on the slip system, $\tau_{i}$, the rate sensitivity exponent for the material, $M$, and an internal state variable representing kinematic hardening effects resulting from backstress at the slip system level, $\alpha_{i}$. The isotropic hardening evolution law for the internal hardening state variable, $g_{i}$, on the ith slip system is given by
$$\dot{g}_{i} = \sum_{j=1}^{12} h_{ij}\, |\dot{\gamma}_{j}|, \qquad i = 1, \ldots, 12, \tag{32}$$
where $h_{ij}$ are the hardening (or plastic) moduli. The self-hardening components arise when $i = j$ and the latent hardening components arise when $i \neq j$. The increase or decrease of flow stress on a secondary slip system due to crystallographic slip on an active slip system is referred to as latent hardening. Taylor and Elam [8], based on experimental evidence on aluminum crystals, observed that when latent hardening equals self hardening, an isotropic response exists. Kocks et al. [4] reviewed the behavior of several materials under different loading conditions and surmised that an intersecting slip system induces higher stresses in the well-developed flow stress regime. The latent hardening ratio, which is the ratio of hardening on the secondary system to that on the primary system, ranges from 1.0 to 1.4 for the form used by Hutchinson [14] and Peirce et al. [2], sometimes called the PAN rule, where 1.0 corresponds to Taylor hardening. However, texture and conventional latent hardening effects cannot, in general, account for all sources of anisotropy. In essence, latent hardening models have focused on dislocation-dislocation interactions, but in reality latent hardening arises from dislocation-substructure interactions as well. In the latter case, an evolving latent hardening ratio would be necessary. Although potentially important, an evolving latent hardening ratio has yet to be established. A simple form of the hardening moduli [3] employing the PAN rule is given by
$$h_{ij} = F(\gamma) \left[ \delta_{ij} + \mathrm{lhr}\, (1 - \delta_{ij}) \right], \tag{33}$$
where $F(\gamma)$ is a function of the cumulative shear on all slip systems, $\gamma = \int \sum_{j} |\dot{\gamma}_{j}|\, dt$, and lhr is the latent hardening ratio. Other latent hardening forms have been proposed and might be fruitful to consider in such parameter studies; Eq. (33) cannot distinguish between acute and obtuse cross-slips in reversed quasi-static loading conditions. Havner [12] employed a two-parameter rule to examine latent hardening effects, showing that the contribution of incremental slip from self hardening equals that of the latent system. Other issues regarding latent hardening include differences that have been observed from one latent system to the next. In FCC Cu and Al single crystals, slip systems in which dislocations can form
sessile junctions appear to exhibit primary latent hardening. Secondary latent hardening is associated with systems for which dislocations form glissile junctions or Hirth locks with those of the active slip systems. Also not considered is the influence of the stacking fault energy; the lower the stacking fault energy, the higher the latent hardening. Models to date only empirically fit constants to the latent hardening equation and physical motivation is often lacking. Finally, although the latent hardening ratio seems to be independent of temperature, alloy type, and strain rate [4], it does change during deformation, saturating at a strain on the order of unity. The slip system hardening coefficient, F(γ ), has been emphasized by different researchers attempting to model various aspects of dislocation interaction. One example is the Rashid and Nemat-Nasser [3] hardening rule given by F(γ ) =
h 0,
0 ≤ γ ≤ γ0
h0 , γ0 ≤ γ 1 + (γ − γ0 )
,
(34)
where h 0 , , and γ0 are material constants. Another example is a modified hardening-recovery equation [15], given by
$$F(\gamma) = h_{0} - R\, g(\gamma), \tag{35}$$
where $R$ is a material constant. Other forms can be adopted here, but the motivation should be based upon hardening and recovery reflecting dislocation initiation, motion, and interaction. Kinematic hardening at the grain level is used to model the dislocation substructure contribution to the directional dislocation resistance. Kinematic hardening at the level of the slip system has been rather widely employed to describe strengthening due to heterogeneous dislocation substructure and attendant Bauschinger effects. The substructural internal variable evolves at the level of the grain according to
$$\dot{\alpha}_{i} = C_{rate} \left( C_{sat}\, \dot{\gamma}_{i} - \alpha_{i}\, |\dot{\gamma}_{i}| \right), \tag{36}$$
where $C_{rate}$ controls the rate of evolution and $C_{sat}$ is the saturation level of the backstress; both were chosen to fit the experimental data. The substructural hardening internal state variable reflects dislocation interactions within the grain and follows the internal state variable constraint that the rate must be governed by a differential equation in which the plastic rate of deformation appears. It is well known that a certain degree of kinematic hardening (Bauschinger effect) is introduced by virtue of the orientation dependence of grains and compatibility requirements among them in crystal plasticity theory. However, this is a highly transient effect that occurs over small cumulative plastic strain following a strain reversal. More persistent Bauschinger
effects arise from prescription of kinematic hardening at the scale of individual grains (slip systems), affecting slip system flow rules. Reversed loading experiments on single crystals of both precipitate-strengthened and pure metals exhibit kinematic hardening due to heterogeneous inelastic flow. Precipitates offer a clear source of the behavior in the former. Dislocation substructures induce these effects in the latter, where the backstress is induced by the collective effects of interactions with dislocation structures at higher scales.

The final topic of discussion pertinent to crystal plasticity is the averaging of the polycrystal from the single crystal starting point. The bridging of the mesoscale and macroscale is governed by the intergranular constraint formulation, which injects anisotropy through another bridge of length scales besides the micro-meso link illustrated in Eq. (36). A crystal-to-aggregate averaging theorem that kinematically constrains all of the crystals in the same manner is based on the work of Taylor [10]. Another limit is to assume the same remote stress applied to each crystal [16]. A third form of polycrystalline constraint used in a crystal plasticity context is what is called the relaxed constraints method, of which various forms exist. Essentially, they start with the remote strain applied to all the crystals according to the Taylor constraint and then relax towards the Sachs constraint. Terms such as self-consistent, relaxed constraints, and modified constraints have been used to describe this type of constraint. The idea is that the single crystal is assumed to be an inclusion embedded in a matrix that possesses the aggregate properties of effective stress-strain behavior. One example uses the elastic modulus to represent the aggregate, with each crystal's strain tensor perturbed from the polycrystal average according to
$$\sigma_{ij} - \sigma_{ij}^{ave} = \frac{2\mu \left( 1 - \dfrac{2(4 - \nu)}{15(1 - \nu)} \right)}{1 + 3\mu \left( \varepsilon^{p}_{eff} / \sigma^{eff} \right)} \left( \varepsilon_{ij}^{ave} - \varepsilon_{ij} \right), \tag{37}$$
where the volume averaged stresses and strains over all the grains are given by
$$\sigma_{ij}^{ave} = \frac{1}{N} \sum_{k=1}^{N} (\sigma_{ij})_{k}, \qquad \varepsilon_{ij}^{ave} = \frac{1}{N} \sum_{k=1}^{N} (\varepsilon_{ij})_{k}, \tag{38}$$
and $(\sigma_{ij})_{k}$ and $(\varepsilon_{ij})_{k}$ are the stress and strain on the kth grain. Here, $N$ is the number of grains, $\mu$ is the polycrystalline shear modulus, $\nu$ is the polycrystalline elastic Poisson's ratio, and
$$\sigma^{eff} = \left\{ \tfrac{1}{2} \left[ \left(\sigma_{11}^{ave} - \sigma_{22}^{ave}\right)^{2} + \left(\sigma_{22}^{ave} - \sigma_{33}^{ave}\right)^{2} + \left(\sigma_{33}^{ave} - \sigma_{11}^{ave}\right)^{2} + 6 \left( \left(\sigma_{12}^{ave}\right)^{2} + \left(\sigma_{23}^{ave}\right)^{2} + \left(\sigma_{13}^{ave}\right)^{2} \right) \right] \right\}^{1/2}. \tag{39}$$
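The effective plastic strain is given by
$$\varepsilon^{p}_{eff} = \left\{ \tfrac{2}{9} \left[ \left(\varepsilon_{11}^{p\,ave} - \varepsilon_{22}^{p\,ave}\right)^{2} + \left(\varepsilon_{22}^{p\,ave} - \varepsilon_{33}^{p\,ave}\right)^{2} + \left(\varepsilon_{33}^{p\,ave} - \varepsilon_{11}^{p\,ave}\right)^{2} \right] + \tfrac{4}{3} \left[ \left(\varepsilon_{12}^{p\,ave}\right)^{2} + \left(\varepsilon_{23}^{p\,ave}\right)^{2} + \left(\varepsilon_{13}^{p\,ave}\right)^{2} \right] \right\}^{1/2}. \tag{40}$$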
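The averaging and effective measures of Eqs. (38)–(40) can be sketched in a few lines. The two-grain aggregate, the numerical values, and the helper names below are illustrative assumptions chosen so that the results are easy to check by hand.

```python
import math


def volume_average(tensors):
    """Eq. (38): arithmetic mean of a list of 3x3 tensors, one per grain."""
    n = len(tensors)
    return [[sum(t[i][j] for t in tensors) / n for j in range(3)] for i in range(3)]


def _normal_and_shear(t):
    """Sum of squared normal differences and sum of squared shear components."""
    normal = ((t[0][0] - t[1][1]) ** 2 + (t[1][1] - t[2][2]) ** 2
              + (t[2][2] - t[0][0]) ** 2)
    shear = t[0][1] ** 2 + t[1][2] ** 2 + t[0][2] ** 2
    return normal, shear


def effective_stress(s):
    """Eq. (39) applied to an averaged stress tensor."""
    normal, shear = _normal_and_shear(s)
    return math.sqrt(0.5 * (normal + 6.0 * shear))


def effective_plastic_strain(e):
    """Eq. (40) applied to an averaged plastic strain tensor."""
    normal, shear = _normal_and_shear(e)
    return math.sqrt(2.0 / 9.0 * normal + 4.0 / 3.0 * shear)


def uniaxial(axial, lateral):
    """Diagonal tensor with one axial and two equal lateral components."""
    return [[axial, 0.0, 0.0], [0.0, lateral, 0.0], [0.0, 0.0, lateral]]


# Assumed two-grain aggregate: uniaxial stresses (MPa) and isochoric plastic strains.
stress_ave = volume_average([uniaxial(100.0, 0.0), uniaxial(140.0, 0.0)])
strain_ave = volume_average([uniaxial(0.01, -0.005), uniaxial(0.03, -0.015)])
sigma_eff = effective_stress(stress_ave)
eps_eff = effective_plastic_strain(strain_ave)
print(sigma_eff, eps_eff)
```

For a uniaxial state both measures reduce to the axial component, which is a convenient sanity check on the prefactors 1/2, 2/9, and 4/3.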
Models with relaxed constraints have been used to provide understanding of length scale issues such as grain shape changes, predicting a more accurate texture response than that obtained by using the Taylor "full" constraint. As grains become flat or elongated as deformation proceeds, the average number of operative slip systems decreases. For example, in rolling, as the grain shape changes from equiaxed to elongated, anisotropy is not induced for the cube {100}⟨001⟩, Goss {110}⟨001⟩, and brass {110}⟨112⟩ textures, but the copper {112}⟨111⟩ and S {123}⟨634⟩ textures will be affected. These five main texture components are characterized by recrystallization (cube and Goss components) and by rolling (brass, copper, Goss, and S). Something generally not considered in modeling that would affect all five texture components is the contribution of the substructural geometrically necessary boundary (GNB) evolution to the textural evolution. To introduce the substructural GNB effect on the grain shape change and slip system activity, the microheterogeneity internal state variable from Eq. (36), arising from the noncrystallographic microheterogeneity evolution, is admitted to modify Eq. (31). The proposed deformation-induced anisotropy internal state variable intergranular constraint relation is given by
$$\sigma_{ij} - \sigma_{ij}^{ave} = \frac{\mu}{C_{1}\, \hat{\alpha}^{eff}} \left( \varepsilon_{ij}^{ave} - \varepsilon_{ij} \right), \tag{41}$$
where
$$\hat{\alpha}^{eff} = \left\{ \tfrac{1}{2} \left[ \left(\hat{\alpha}_{11}^{ave} - \hat{\alpha}_{22}^{ave}\right)^{2} + \left(\hat{\alpha}_{22}^{ave} - \hat{\alpha}_{33}^{ave}\right)^{2} + \left(\hat{\alpha}_{33}^{ave} - \hat{\alpha}_{11}^{ave}\right)^{2} + 6 \left( \left(\hat{\alpha}_{12}^{ave}\right)^{2} + \left(\hat{\alpha}_{23}^{ave}\right)^{2} + \left(\hat{\alpha}_{13}^{ave}\right)^{2} \right) \right] \right\}^{1/2} \tag{42}$$
and, by analogy with Eq. (38), $\hat{\alpha}_{ij}^{ave} = \frac{1}{N} \sum_{k=1}^{N} (\hat{\alpha}_{ij})_{k}$. In Eq. (41), $C_{1}$ is a constant that governs the intergranular constraint effect. The value of $C_{1}$ will vary depending on the crystal lattice type (FCC or BCC) and the number of material phases. The mathematical form for $\mu / (C_{1} \hat{\alpha}^{eff})$ decreases exponentially as deformation proceeds, analogous to the decay of the mean free path between dislocation substructures in the mesh length theory. In essence, it is a length scale parameter introduced from the lower scale within the grain that affects the intergranular constraint of the polycrystal. Equation (41) seeks to express intergranular constraint in terms of the evolving magnitude of the grain level microheterogeneity internal state
variable, which responds in a transient manner to any abrupt change of loading path, reflecting in some manner the formation of dislocation substructures and grain subdivision processes. It is noted that $\hat{\alpha}$ is a long range transient, in general. The justification for the use of $\hat{\alpha}$ in the intergranular constraint relation is evident when one considers the role of geometrically necessary boundaries and grain subdivision in accommodating deformation. Since geometrically necessary dislocations are generated predominantly for the purpose of strain accommodation between adjacent grains, the formation of geometrically necessary boundaries serves as an intragranular source of relief for intergranular constraint stresses. The fact that $\hat{\alpha}$ also enters into the flow rule reflects the influence of these intragranular structures on deformation-induced anisotropic strengthening. Hence, both intergranular hardening and intragranular constraint aspects of geometrically necessary boundary formation are addressed.

Now that we have discussed the theoretical aspects of crystal plasticity in terms of the kinematics, kinetics, and polycrystalline averaging methods, we turn to the implementation of the model in a numerical setting.
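Before moving to the implementation, the kinetic relations can be collected in a short sketch: the flow rule of Eq. (31), the PAN-type hardening moduli of Eq. (33), and one time step of the evolution laws of Eqs. (32) and (36). The forward Euler update, the use of absolute slip rates in the hardening and recovery terms, and every parameter value are assumptions made for illustration only; the chapter itself goes on to describe implicit integration.

```python
import math


def slip_rate(tau, alpha, g, rate0, M):
    """Power-law flow rule, Eq. (31)."""
    x = tau - alpha
    return rate0 * math.copysign(1.0, x) * abs(x / g) ** M


def hardening_moduli(F_gamma, lhr, n):
    """PAN-type moduli, Eq. (33): F on the diagonal, lhr * F off the diagonal."""
    return [[F_gamma * (1.0 if i == j else lhr) for j in range(n)] for i in range(n)]


def explicit_update(g, alpha, rates, h, c_rate, c_sat, dt):
    """One forward Euler step of Eqs. (32) and (36).

    Assumption: absolute slip rates drive isotropic hardening and dynamic
    recovery, so both remain well behaved under reversed slip.
    """
    n = len(g)
    g_new = [g[i] + dt * sum(h[i][j] * abs(rates[j]) for j in range(n))
             for i in range(n)]
    alpha_new = [alpha[i] + dt * c_rate * (c_sat * rates[i] - alpha[i] * abs(rates[i]))
                 for i in range(n)]
    return g_new, alpha_new


# Two slip systems with assumed values (stresses in MPa, rates in 1/s).
rates = [slip_rate(tau, 0.0, 50.0, 1.0e-3, 20.0) for tau in (60.0, -50.0)]
h = hardening_moduli(100.0, 1.4, 2)
g_new, alpha_new = explicit_update([50.0, 50.0], [0.0, 0.0], rates, h,
                                   c_rate=10.0, c_sat=5.0, dt=0.1)
print(rates, g_new, alpha_new)
```

With a rate sensitivity exponent of 20, a 20% overstress on the first system raises its slip rate by more than an order of magnitude, which is the stiffness that motivates implicit integration in practice.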
1. Crystal Plasticity Implementation
The numerical implementation of the above-described theory can differ depending on the rate dependency of the plasticity theory considered. While rate dependent crystal plasticity considers plastic deformation occurring simultaneously on all slip systems according to flow rules such as Eq. (31), rate independent numerical schemes are confronted with a few problems concerning the activity of plastic slip on crystallographic slip systems in order to accommodate the required remote deformation. In this implementation, the plastic slip (deformation) is assumed to occur on the slip systems of the crystalline lattice. Each slip system can be fully defined by the set of vectors $s_{0}^{i}$ and $m_{0}^{i}$. The mutually perpendicular vectors $s_{0}^{i}$ and $m_{0}^{i}$ take values according to the type of crystalline lattice (cubic, hexagonal, etc.) and its orientation with respect to the coordinate axes. For example, a face-centered cubic (FCC) lattice has twelve slip systems. When the crystallographic directions [1 0 0], [0 1 0], and [0 0 1] are aligned with the global coordinate axes x, y, and z, respectively, the twelve slip systems are defined by the sets of vectors $s_{0}^{i}$ and $m_{0}^{i}$ given in Table 1. If the lattice is rotated with respect to the global coordinate system, then the components of the vectors $s_{0}^{i}$ and $m_{0}^{i}$ must be rotated accordingly with the rotation matrix $R$:
$$s^{i} = R \cdot s_{0}^{i}, \qquad m^{i} = R \cdot m_{0}^{i}. \tag{43}$$
Table 1. Summary of slip and normal direction vectors for FCC metal

ith slip system    s_0^i          m_0^i
 1                 [ 1 -1  0]     ( 1  1  1)
 2                 [-1  0  1]     ( 1  1  1)
 3                 [ 0  1 -1]     ( 1  1  1)
 4                 [ 1  0  1]     (-1  1  1)
 5                 [-1 -1  0]     (-1  1  1)
 6                 [ 0  1 -1]     (-1  1  1)
 7                 [-1  0  1]     ( 1 -1  1)
 8                 [ 0 -1 -1]     ( 1 -1  1)
 9                 [ 1  1  0]     ( 1 -1  1)
10                 [-1  1  0]     (-1 -1  1)
11                 [ 1  0  1]     (-1 -1  1)
12                 [ 0 -1 -1]     (-1 -1  1)
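The entries of Table 1 are easy to check numerically. The sketch below stores the twelve systems, confirms that each slip direction lies in its slip plane, and evaluates the resolved shear stress for unit tension along z, which is then the Schmid factor. The container names and the choice of loading axis are illustrative assumptions.

```python
import math

# Table 1 entries (crystal axes aligned with the global x, y, z axes).
SLIP_DIRECTIONS = [
    (1, -1, 0), (-1, 0, 1), (0, 1, -1),
    (1, 0, 1), (-1, -1, 0), (0, 1, -1),
    (-1, 0, 1), (0, -1, -1), (1, 1, 0),
    (-1, 1, 0), (1, 0, 1), (0, -1, -1),
]
PLANE_NORMALS = ([(1, 1, 1)] * 3 + [(-1, 1, 1)] * 3
                 + [(1, -1, 1)] * 3 + [(-1, -1, 1)] * 3)


def unit(v):
    """Normalize a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def resolved_shear_stress(sigma, s, m):
    """tau = sigma : P, with P the symmetric part of s (x) m."""
    return sum(sigma[i][j] * 0.5 * (s[i] * m[j] + m[i] * s[j])
               for i in range(3) for j in range(3))


# Every slip direction must be perpendicular to its plane normal.
dot_products = [sum(a * b for a, b in zip(s, m))
                for s, m in zip(SLIP_DIRECTIONS, PLANE_NORMALS)]

# Unit uniaxial tension along z: tau is then the Schmid factor.
sigma_z = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
schmid_factors = [resolved_shear_stress(sigma_z, unit(s), unit(m))
                  for s, m in zip(SLIP_DIRECTIONS, PLANE_NORMALS)]
loaded = [i + 1 for i, f in enumerate(schmid_factors) if abs(f) > 1e-12]
print("systems with nonzero Schmid factor:", loaded)
```

For this high-symmetry loading axis, eight of the twelve systems share the same Schmid factor magnitude of 1/√6, which illustrates why the choice of active systems is ambiguous in rate independent schemes.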
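The fourth-order elasticity tensor $C_{0}$, which defines the elastic response of the crystalline lattice under stress, should also be transformed using the same rotation matrix:
$$C = R \cdot R \cdot C_{0} \cdot R^{T} \cdot R^{T}, \tag{44}$$
that is, $C_{ijkl} = R_{im} R_{jn} R_{kp} R_{lq} (C_{0})_{mnpq}$ in index notation.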
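The transformation in Eq. (44) can be sketched directly in index form. The nested-list tensor layout, the helper names, and the copper constants (commonly quoted values in GPa) are assumptions for illustration; a production code would typically use an array library instead.

```python
import math


def cubic_stiffness(c1, c2, c3):
    """Fourth-order cubic stiffness on the cube axes as C[i][j][k][l]."""
    C = [[[[0.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(3)]
    for i in range(3):
        for j in range(3):
            if i == j:
                C[i][i][i][i] = c1
            else:
                C[i][i][j][j] = c2
                C[i][j][i][j] = C[i][j][j][i] = c3
    return C


def rotate_stiffness(C0, R):
    """C_ijkl = R_im R_jn R_kp R_lq C0_mnpq, the index form of Eq. (44)."""
    idx = range(3)
    return [[[[sum(R[i][m] * R[j][n] * R[k][p] * R[l][q] * C0[m][n][p][q]
                   for m in idx for n in idx for p in idx for q in idx)
               for l in idx] for k in idx] for j in idx] for i in idx]


def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


C0 = cubic_stiffness(168.4, 121.4, 75.4)          # assumed copper values, GPa
C90 = rotate_stiffness(C0, rot_z(math.pi / 2.0))  # a cube symmetry operation
C45 = rotate_stiffness(C0, rot_z(math.pi / 4.0))  # not a symmetry operation
print(C0[0][0][0][0], C90[0][0][0][0], C45[0][0][0][0])
```

A 90 degree rotation about a cube axis leaves the tensor unchanged, while a 45 degree rotation raises the 1111 component to (C1 + C2 + 2 C3)/2, which reflects the elastic anisotropy of the crystal.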
There are many representations of the crystal orientation. One of the most common is the Roe convention, which uses three Euler angles ψ, θ, and φ. The rotation matrix in the Roe convention is written as [4]:
$$R = \begin{bmatrix} \cos\psi \cos\theta \cos\phi - \sin\psi \sin\phi & -\cos\psi \cos\theta \sin\phi - \sin\psi \cos\phi & \cos\psi \sin\theta \\ \sin\psi \cos\theta \cos\phi + \cos\psi \sin\phi & -\sin\psi \cos\theta \sin\phi + \cos\psi \cos\phi & \sin\psi \sin\theta \\ -\sin\theta \cos\phi & \sin\theta \sin\phi & \cos\theta \end{bmatrix} \tag{45}$$
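A rotation matrix built from Euler angles should be proper orthogonal, which gives a simple check on any implementation. The sketch below assumes the matrix is the product of rotations about z, y, and z by ψ, θ, and φ, so that its third column is (cos ψ sin θ, sin ψ sin θ, cos θ); the function names and test angles are illustrative.

```python
import math


def roe_rotation(psi, theta, phi):
    """Rotation matrix from three Euler angles (radians), z-y-z sequence."""
    cps, sps = math.cos(psi), math.sin(psi)
    cth, sth = math.cos(theta), math.sin(theta)
    cph, sph = math.cos(phi), math.sin(phi)
    return [
        [cps * cth * cph - sps * sph, -cps * cth * sph - sps * cph, cps * sth],
        [sps * cth * cph + cps * sph, -sps * cth * sph + cps * cph, sps * sth],
        [-sth * cph, sth * sph, cth],
    ]


def times_transpose(R):
    """Return R . R^T, which equals the identity for an orthogonal matrix."""
    return [[sum(R[i][k] * R[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(A):
    """Determinant of a 3x3 matrix."""
    (a, b, c), (d, e, f), (g, h, k) = A
    return a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)


R = roe_rotation(math.radians(30.0), math.radians(45.0), math.radians(60.0))
RRt = times_transpose(R)
print("det R =", det3(R))
```

The same function can be used with Eq. (43) to rotate the slip vectors of Table 1 into the global frame for an arbitrarily oriented grain.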
1.1. Rate Independent Numerical Integration Algorithm
Rate independent integration algorithms for the crystal plasticity constitutive equations must deal with a few issues that are avoided in rate dependent plasticity. In rate independent crystal plasticity, several decisions must be made during incremental loading: which slip systems are active, what plastic slip increments are needed to accommodate the remote deformation, and how the set of active slip systems is selected [5]. Numerous integration schemes for rate independent crystal plasticity have been put forth [5, 6, 10, 17, 18].
In the following paragraphs we briefly introduce the classical rate independent approach of Anand and Kothari [5]. Consider a load step from time $t$ to $t + \Delta t$. Assume all the variables are known at time $t$, that is, the Cauchy stress $\sigma_{t}$, the plastic deformation gradient $F^{p}_{t}$, the total deformation gradient $F_{t}$, and the slip system orientations $s^{i}_{t}$ and $m^{i}_{t}$. The total deformation gradient at time $t + \Delta t$, $F_{t+\Delta t}$, is also known.

1. Calculate the elastic deformation gradient from the total and the plastic deformation gradients, assuming the step is fully elastic:
$$F^{e}_{t+\Delta t} = F_{t+\Delta t} \cdot \left(F^{p}_{t}\right)^{-1} \tag{46}$$

2. Using the elastic deformation gradient, update the normal and tangential vectors defining the slip systems:
$$s^{i}_{t+\Delta t} = F^{e}_{t+\Delta t} \cdot s^{i}_{t}, \qquad m^{i}_{t+\Delta t} = m^{i}_{t} \cdot \left(F^{e}_{t+\Delta t}\right)^{-1} \tag{47}$$

3. Calculate the elastic Green strain tensor from the elastic deformation gradient:
$$E^{e}_{t+\Delta t} = \tfrac{1}{2} \left[ \left(F^{e}_{t+\Delta t}\right)^{T} \cdot F^{e}_{t+\Delta t} - I \right] \tag{48}$$
where $I$ represents the identity matrix and $A^{T}$ is the transpose of any matrix $A$.

4. Compute the trial Cauchy stress by tensorial multiplication of the elastic stiffness tensor and the trial elastic Green strain tensor:
$$\sigma_{t+\Delta t} = C : E^{e}_{t+\Delta t} \tag{49}$$

5. Compute the resolved shear stress on each slip system from the trial stress:
$$\tau^{i}_{t+\Delta t} = \sigma_{t+\Delta t} : P^{i}_{t+\Delta t} \tag{50}$$
where $P^{i}_{t+\Delta t}$ is the Schmid tensor defined as:
$$P^{i}_{t+\Delta t} = \tfrac{1}{2} \left( s^{i}_{t+\Delta t} \otimes m^{i}_{t+\Delta t} + m^{i}_{t+\Delta t} \otimes s^{i}_{t+\Delta t} \right) \tag{51}$$

7. Check the yield criterion. If $\tau^{i}_{t+\Delta t} < g^{i}_{t}$ for all $i = 1, \ldots, n$, then the load step $[t, t + \Delta t]$ is fully elastic, and the algorithm exits. Otherwise, continue with the next step.

8. Define the set of potentially active slip systems as those systems for which the yield functions are greater than or equal to zero:
$$A_{pot} = \left\{ i \;\middle|\; \tau^{i}_{t+\Delta t} - g^{i}_{t} \geq 0 \right\} \tag{52}$$

9. For the systems considered potentially active, calculate the plastic slip increments from the consistency condition at time $t + \Delta t$:
$$\sum_{j \in A_{pot}} \Delta\gamma_{j} \left( P^{i}_{t+\Delta t} : C^{e} : P^{j}_{t+\Delta t} + h_{ij} \right) = \tau^{i}_{t+\Delta t} - g^{i}_{t} - \gamma^{i}_{t} \tag{53}$$
In the above system of equations, the unknown quantities are the slip increments $\Delta\gamma_{i}$ for the potentially active slip systems, and $\gamma^{i}_{t}$ represents the accumulated plastic slip at time $t$. By solving the above system of equations, one obtains the slip increments for the potentially active slip systems. For all other slip systems, the increments in plastic slip are zero. Depending on the functions $h_{ij}$, the above system of equations can be solved directly or by using a classical Newton–Raphson procedure.

10. The second decision to be made concerns the choice of the set of active slip systems. If some of the plastic slip increments $\Delta\gamma_{i}$ found previously are negative ($\Delta\gamma_{i} < 0$ for some $i \in A_{pot}$), then these slip systems are inactive. Consequently, these slip systems are dropped from the set $A_{pot}$ and the procedure returns to Step 9.

11. After finding the slip increments for the slip systems active at Step 5, the reference shear strengths are recalculated for each slip system, and the inactive slip systems are monitored by calculating their respective yield functions and performing Step 7 again. If nonzero yield functions are found for some of the inactive slip systems, those systems are included in the set of potentially active slip systems $A_{pot}$ and the slip increments are recalculated by going back to Step 9.

12. Update the state variables.

If this numerical algorithm is implemented in an implicit finite element code, then an elasto-plastic stiffness matrix must be computed and passed to the code. For examples of how to compute a consistent stiffness matrix, see Kalidindi [19] and Miehe and Schroder [20].
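The elastic trial part of the procedure (Steps 1 through 8) can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of [5]: the crystal axes coincide with the global axes so the cubic stiffness can be applied component-wise, the constants and strengths are assumed values in MPa, and the active-set iteration of Steps 9 to 11 is omitted.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def inverse(A):
    """Inverse of a 3x3 matrix by the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, k) = A
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    adj = [[e * k - f * h, c * h - b * k, b * f - c * e],
           [f * g - d * k, a * k - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def cubic_stress(E, c1, c2, c3):
    """sigma = C : E for cubic symmetry with crystal axes on the global axes."""
    tr = E[0][0] + E[1][1] + E[2][2]
    return [[(c2 * tr + (c1 - c2) * E[i][i]) if i == j else 2.0 * c3 * E[i][j]
             for j in range(3)] for i in range(3)]


def trial_step(F_new, Fp_old, slip_dirs, normals, g, c1, c2, c3):
    """Steps 1-8: trial stress, resolved shear stresses, potentially active set."""
    Fe = matmul(F_new, inverse(Fp_old))                              # Eq. (46)
    Fe_inv = inverse(Fe)
    FtF = matmul(transpose(Fe), Fe)
    E = [[0.5 * (FtF[i][j] - (1.0 if i == j else 0.0)) for j in range(3)]
         for i in range(3)]                                          # Eq. (48)
    sigma = cubic_stress(E, c1, c2, c3)                              # Eq. (49)
    tau = []
    for s0, m0 in zip(slip_dirs, normals):
        s = [sum(Fe[i][k] * s0[k] for k in range(3)) for i in range(3)]      # Eq. (47)
        m = [sum(m0[k] * Fe_inv[k][i] for k in range(3)) for i in range(3)]
        tau.append(sum(sigma[i][j] * 0.5 * (s[i] * m[j] + m[i] * s[j])
                       for i in range(3) for j in range(3)))         # Eqs. (50)-(51)
    active = [i for i, t in enumerate(tau) if t - g[i] >= 0.0]       # Eq. (52)
    return sigma, tau, active


# FCC systems of Table 1, normalized; indices below are zero-based.
DIRS = [(1, -1, 0), (-1, 0, 1), (0, 1, -1), (1, 0, 1), (-1, -1, 0), (0, 1, -1),
        (-1, 0, 1), (0, -1, -1), (1, 1, 0), (-1, 1, 0), (1, 0, 1), (0, -1, -1)]
NORMS = [(1, 1, 1)] * 3 + [(-1, 1, 1)] * 3 + [(1, -1, 1)] * 3 + [(-1, -1, 1)] * 3
slip_dirs = [[c / math.sqrt(2.0) for c in s] for s in DIRS]
normals = [[c / math.sqrt(3.0) for c in m] for m in NORMS]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
F_new = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.002]]  # assumed 0.2% stretch along z
sigma, tau, active = trial_step(F_new, identity, slip_dirs, normals,
                                [16.0] * 12, 168.4e3, 121.4e3, 75.4e3)
print("potentially active systems (zero-based):", active)
```

Only systems with a positive yield function enter the potentially active set here; in a full implementation, the linear system of Eq. (53) would then be solved on this set and the set revised as described in Steps 10 and 11.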
1.2. Rate Dependent Numerical Integration Algorithm
Rate dependent integration algorithms for the crystal plasticity constitutive equations avoid the complications related to selecting the active set of slip systems by calculating plastic slip increments on all slip systems according, most commonly, to a power law flow rule. In the following paragraphs we present a fully implicit integration algorithm proposed by Cuitino and Ortiz [6] for rate dependent crystal plasticity. As in the case of rate independent plasticity, we consider a substep from time $t$ to $t + \Delta t$. At time $t$, all the variables are known, such as the accumulated plastic slip and the isotropic and kinematic hardening on the slip systems. The incremental procedure assumes the deformation gradients at the beginning and the end of the time step, $F(t)$ and $F(t + \Delta t)$, respectively, are also known. An implicit
integration scheme for the differential equations representing the constitutive response of the lattice is briefly described in the following steps, characteristic of finite element implementations [21]. An implicit integration algorithm is required to avoid numerical instabilities due to the power law flow rule.

1. In an implicit integration algorithm, the plastic slip rates are updated based on the values of the parameters in the flow rule at the end of the time step $t + \Delta t$:
$$\dot{\gamma}_{i} = \dot{\gamma}_{0}\, \mathrm{sgn}\!\left( \tau^{i}_{t+\Delta t} - \alpha^{i}_{t+\Delta t} \right) \left| \frac{\tau^{i}_{t+\Delta t} - \alpha^{i}_{t+\Delta t}}{g^{i}_{t+\Delta t}} \right|^{M} \tag{54}$$
The problem that arises here is that none of the quantities in the above equation are known at time $t + \Delta t$, so they must be calculated using an implicit numerical algorithm. Based on the unknown plastic shear rates, all the quantities in the above equation can be calculated. The variation of the reference shear resistance $g_{i}$ and the back stress $\alpha_{i}$ with respect to the plastic shear rates $\dot{\gamma}_{i}$ was explained in the previous section. The resolved shear stress is also a function of the $\dot{\gamma}_{i}$, as can be observed from the following steps.

2. The plastic deformation gradient increment can be calculated by solving the differential system of equations
$$\dot{F}^{p} = L \cdot F^{p}, \tag{55}$$
which has the solution
$$F^{p}_{t+\Delta t} = \exp\left( L\, \Delta t \right) \cdot F^{p}_{t}, \tag{56}$$
where $L$ is the velocity gradient in the intermediate configuration, a function of the $\dot{\gamma}_{i}$.

3. Based on the computed plastic deformation gradient, the elastic deformation gradient is:
$$F^{e}_{t+\Delta t} = F_{t+\Delta t} \cdot \left( F^{p}_{t+\Delta t} \right)^{-1} \tag{57}$$

4. Using the elastic deformation gradient computed previously, update the normal and tangential vectors defining the slip systems:
$$s^{i}_{t+\Delta t} = F^{e}_{t+\Delta t} \cdot s^{i}_{t}, \qquad m^{i}_{t+\Delta t} = m^{i}_{t} \cdot \left( F^{e}_{t+\Delta t} \right)^{-1} \tag{58}$$

5. Calculate the elastic Green strain tensor from the elastic deformation gradient:
$$E^{e}_{t+\Delta t} = \tfrac{1}{2} \left[ \left( F^{e}_{t+\Delta t} \right)^{T} \cdot F^{e}_{t+\Delta t} - I \right] \tag{59}$$
where $I$ represents the identity matrix and $A^{T}$ is the transpose of any matrix $A$.
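6. Compute the Cauchy stress from the second Piola–Kirchhoff stress:
$$\sigma_{t+\Delta t} = \frac{1}{\det F^{e}_{t+\Delta t}}\, F^{e}_{t+\Delta t} \cdot \left( C^{el} : E^{e}_{t+\Delta t} \right) \cdot \left( F^{e}_{t+\Delta t} \right)^{T} \tag{60}$$

7. Using the Cauchy stress and the direction cosines of the slip systems, compute the resolved shear stress on each slip system:
$$\tau^{i}_{t+\Delta t} = \sigma_{t+\Delta t} : \tfrac{1}{2} \left( s^{i}_{t+\Delta t} \otimes m^{i}_{t+\Delta t} + m^{i}_{t+\Delta t} \otimes s^{i}_{t+\Delta t} \right) \tag{61}$$

8. To calculate the plastic shear rates, the flow rule of Eq. (54) is inverted and a Newton–Raphson procedure is applied to calculate $\dot{\gamma}_{i}$:
$$f_{i} = \tau^{i}_{t+\Delta t} - \alpha^{i}_{t+\Delta t} - g^{i}_{t+\Delta t} \left( \frac{|\dot{\gamma}_{i}|}{\dot{\gamma}_{0}} \right)^{1/M} \mathrm{sgn}(\dot{\gamma}_{i}) = 0 \tag{62}$$
Applying the Newton–Raphson method to solve this system of equations gives
$$- \sum_{j=1}^{N} \frac{\partial f_{i}}{\partial \dot{\gamma}_{j}}\, \Delta\dot{\gamma}_{j} = f_{i}. \tag{63}$$
The above linear system of equations is solved for the unknowns $\Delta\dot{\gamma}_{i}$. The matrix $(\partial f_{i} / \partial \dot{\gamma}_{j})$ is called the Jacobian of the system of equations, and it can be calculated by differentiating Eq. (62), which defines the functions $f_{i}$:
$$\frac{\partial f_{i}}{\partial \dot{\gamma}_{j}} = \frac{\partial \tau^{i}_{t+\Delta t}}{\partial \dot{\gamma}_{j}} - \frac{\partial \alpha^{i}_{t+\Delta t}}{\partial \dot{\gamma}_{j}} - \frac{\partial g^{i}_{t+\Delta t}}{\partial \dot{\gamma}_{j}} \left( \frac{|\dot{\gamma}_{i}|}{\dot{\gamma}_{0}} \right)^{1/M} \mathrm{sgn}(\dot{\gamma}_{i}) - g^{i}_{t+\Delta t}\, \frac{\partial}{\partial \dot{\gamma}_{j}} \left[ \left( \frac{|\dot{\gamma}_{i}|}{\dot{\gamma}_{0}} \right)^{1/M} \right] \mathrm{sgn}(\dot{\gamma}_{i}) \tag{64}$$

9. The plastic shear rates are updated after the increments $\Delta\dot{\gamma}_{i}$ are calculated:
$$\dot{\gamma}_{i} = \dot{\gamma}_{i} + \Delta\dot{\gamma}_{i} \tag{65}$$

10. Check whether convergence is achieved by computing a convergence criterion in the form of an error. For additional details see [21]. An example of a convergence criterion is:
$$\sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( f_{i}\, \frac{\dot{\gamma}_{i}}{\dot{\gamma}_{max}} \right)^{2} } \leq \text{Tolerance}, \tag{66}$$
where
$$\dot{\gamma}_{max} = \max \left\{ \dot{\gamma}_{i} \mid i = 1, 2, \ldots, N \right\}, \tag{67}$$
$N$ is the total number of slip systems (twelve for an FCC lattice), and Tolerance is a small number (very often chosen as 1E−8). The functions $f_{i}$ are defined at Step 8.

11. Recompute all variables needed for calculating the convergence criterion at Step 10, and check for convergence. If the convergence criterion is fulfilled, exit the Newton–Raphson procedure; otherwise perform a new Newton–Raphson step. Due to the high nonlinearity of the power laws, especially for very large values of $M$, the Newton–Raphson procedure may diverge and fail to achieve a solution. In this situation, the procedure must be combined with correction procedures, such as a line search [22].

12. Return to the finite element program or to the next step in the algorithm.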
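The Newton–Raphson iteration of Steps 8 through 11 can be illustrated on a deliberately reduced problem. The sketch below assumes a single slip system with zero backstress and fixed strength, and it replaces the full kinematic update of Steps 2 to 7 by an assumed linear relaxation of the resolved shear stress, τ = τ_trial − kΔt·γ̇. The backtracking loop is a simple stand-in for the line search mentioned in Step 11; all names and values are hypothetical.

```python
import math


def residual(rate, tau_trial, k_dt, g, rate0, M):
    """f of Eq. (62) for one system with zero backstress and fixed g."""
    tau = tau_trial - k_dt * rate  # assumed linear elastic relaxation
    return tau - g * (abs(rate) / rate0) ** (1.0 / M) * math.copysign(1.0, rate)


def residual_slope(rate, k_dt, g, rate0, M):
    """df/d(rate); the magnitude is floored to avoid the singular slope at zero."""
    r = max(abs(rate), 1e-30)
    return -k_dt - g / (M * rate0) * (r / rate0) ** (1.0 / M - 1.0)


def solve_slip_rate(tau_trial, k_dt, g, rate0, M, tol=1e-8, max_iter=50):
    """Newton-Raphson with step halving whenever the residual does not decrease."""
    rate = math.copysign(rate0, tau_trial)  # start at the reference rate
    f = residual(rate, tau_trial, k_dt, g, rate0, M)
    for _ in range(max_iter):
        if abs(f) <= tol:
            return rate
        step = -f / residual_slope(rate, k_dt, g, rate0, M)
        scale = 1.0
        while True:
            trial = rate + scale * step
            f_trial = residual(trial, tau_trial, k_dt, g, rate0, M)
            if abs(f_trial) < abs(f) or scale < 1e-10:
                break
            scale *= 0.5  # backtrack: shorten the Newton step
        rate, f = trial, f_trial
    raise RuntimeError("Newton-Raphson did not converge")


# Assumed values: stresses in MPa, k_dt in MPa*s, rates in 1/s.
PARAMS = dict(tau_trial=60.0, k_dt=1000.0, g=50.0, rate0=1.0e-3, M=20.0)
rate_solution = solve_slip_rate(**PARAMS)
final_residual = residual(rate_solution, **PARAMS)
print("converged slip rate:", rate_solution)
```

In the full multi-system problem, the scalar slope becomes the Jacobian of Eq. (64) and each iteration requires solving the linear system of Eq. (63), but the structure of the loop (residual, linearized step, safeguard, convergence check) is the same.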
References
[1] P.R. Dawson, "On modeling of mechanical property changes during flat rolling of aluminum," Int. J. Solids Structures, 23(7), 947–968, 1987.
[2] D. Peirce, R.J. Asaro, and A. Needleman, "An analysis of nonuniform and localized deformation in ductile single crystals," Acta Metall., 30, 1087–1119, 1982.
[3] M.M. Rashid and S. Nemat-Nasser, "A constitutive algorithm for rate dependent crystal plasticity," Computer Methods in Applied Mechanics and Engineering, 94, 201–228, 1990.
[4] U.F. Kocks, C.N. Tomé, and H.R. Wenk, Texture and Anisotropy: Preferred Orientations in Polycrystals and Their Effect on Materials Properties, Cambridge University Press, 1998.
[5] L. Anand and M. Kothari, "Computational procedure for rate-independent crystal plasticity," Journal of the Mechanics and Physics of Solids, 44(4), 525–558, 1996.
[6] A.M. Cuitino and M. Ortiz, "Computational modelling of single crystals," Modelling Simul. Mater. Sci. Eng., 1, 225–263, 1992.
[7] P.R. Dawson and E.B. Marin, "Computational mechanics for metal deformation processes using polycrystal plasticity," Advances in Applied Mechanics, 34, 78–171, 1998.
[8] G.I. Taylor and C.F. Elam, "The distortion of an aluminum crystal during a tensile test," Proc. Royal Soc. London, A102, 643–667, 1923.
[9] E. Schmid, "Ueber die Schubverfestigung von Einkristallen bei plastischer Deformation," Z. Physik, 40, 54–74, 1926.
[10] G.I. Taylor, "Plastic strain in metals," J. Inst. Metals, 62, 307, 1938.
[11] J.F.W. Bishop and R. Hill, "A theoretical derivation of the plastic properties of a polycrystalline face-centered metal," Phil. Mag. Ser., 7(42), 1298–1307, 1951.
[12] K.S. Havner, Finite Plastic Deformation of Crystalline Solids, Cambridge University Press, 1992.
[13] E.H. Lee and D.T. Liu, "Finite strain elastic-plastic theory with application to plane-wave analysis," J. Appl. Phys., 38, 391–408, 1967.
[14] J.W. Hutchinson, "Bounds and self-consistent estimates for creep of polycrystalline materials," Proc. R. Soc. Lond. A, 348, 101–127, 1976.
[15] M.F. Horstemeyer and D.L. McDowell, "Modeling effects of dislocation substructure in polycrystal elastoviscoplasticity," Mech. Matls., 27, 145–163, 1998.
[16] G.Z. Sachs, Verein Deut. Ing., 72, 734, 1928.
[17] R.I. Borja and J.R. Wren, "Discrete micromechanics of elastoplastic crystals," International Journal for Numerical Methods in Engineering, 36, 3815–3840, 1993.
[18] J. Schroder and C. Miehe, "Aspects of computational rate independent crystal plasticity," Computational Materials Science, 9, 168–176, 1997.
[19] S.R. Kalidindi, "Polycrystal plasticity: constitutive modeling and deformation processing," PhD Thesis, MIT, Cambridge, MA, 1992.
[20] C. Miehe and J. Schroder, "A comparative study of stress update algorithms for rate-independent and rate-dependent crystal plasticity," Int. J. Num. Met. in Eng., 50, 273–298, 2001.
[21] R.D. McGinty, "Multiscale representation of polycrystalline inelasticity," PhD Thesis, Georgia Institute of Technology, Atlanta, GA, 2001.
[22] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling, Numerical Recipes, The Art of Scientific Computing, Cambridge University Press, 1986.
3.6 INTERNAL STATE VARIABLE THEORY
D.L. McDowell
Georgia Institute of Technology, Atlanta, GA, USA
Many practical problems of interest deal with irreversible, path dependent aspects of material behavior, such as hysteresis due to plastic deformation or phase transition, fatigue and fracture, or diffusive rearrangement. Some of these processes occur so slowly and so near equilibrium that attendant models forego description of nonequilibrium aspects of dissipation (e.g., grain growth). On the other hand, some irreversible behaviors such as thermally activated dislocation glide can occur farther from equilibrium with a spectrum of relaxation times. The fact that quasi-stable, nonequilibrium configurations of defects can exist in lattices at multiple length scales, combined with the long range nature of interaction forces, presents an enormous challenge to the utility of high fidelity, high degree of freedom (DoF) dynamical models that employ atomistic or molecular modeling methods. For example, analyses of simple crystal structures using molecular dynamics have now reached scales on the order of microns, but are limited to rather idealized systems such as pure metals and to small time durations of the order of nanoseconds. High fidelity analyses of generation, motion and interaction of line defects in lattices based on discrete dislocation dynamics, making use of interactions based on linear elastic solutions, cover somewhat higher length scales and longer time scales, but are also limited in considering realistic multiphase, hierarchical microstructures. Crystal plasticity, likewise, cannot be used for large scale finite element simulations, for example, crash simulations of a vehicle into a barrier. As experimental capabilities have become highly automated with increasingly accurate resolution, computer control and digital data acquisition, and as in situ characterization capabilities have improved, many important phenomenological aspects of the nonlinear, irreversible behavior of materials have become much better understood.
Concurrently, computing capabilities are now sufficient to make feasible the integration of more complex constitutive relations (e.g., stress–strain behavior as a function of temperature, strain, strain rate, and initial material condition) in obtaining approximate solutions to initial-boundary value problems via finite element or finite difference schemes. To add to this confluence of technologies, mesoscale methods have resulted in advances in understanding and description of the effects of various microstructure features such as grains, reinforcement phases, inclusions, crystal structures, and so forth on deformation and failure processes in various classes of materials. This brief overview considers Internal State Variable (ISV) constitutive theory, which offers a rather robust framework for incorporating irreversible, path dependent behavior, informed by experiments, computational materials science, and mesoscale micromechanics methods.
S. Yip (ed.), Handbook of Materials Modeling, 1151–1169. © 2005 Springer. Printed in the Netherlands.
1. Internal Variables and the Notion of Constrained Equilibrium States
For simplicity, we employ small strain measure $\varepsilon$ with conjugate stress $\sigma$. Absolute temperature is designated by $T$. ISV constitutive theory was initially developed based on the notion that a general nonequilibrium, irreversible process can be treated as a sequence of constrained equilibrium states (cf. [1–4]). See Muschik [5] for an excellent overview. Suppose that for a sequence of nonequilibrium states we locally define entropy and temperature as per their usual equilibrium state functions of $(\varepsilon, T, \xi_k)$, such that the Helmholtz free energy function is written as $\psi = \psi(\varepsilon, T, \xi_k) = u - T\eta$. The vector of internal state variables, $\xi_k$, represents effects of evolving material microstructure on the change of free energy. In this way, we extend the equilibrium state space to nonequilibrium processes via augmentation of the dependence of thermodynamic state functions on nonequilibrium variables $\xi_k$. The primary assumption underlying the notion of accompanying constrained equilibrium states to a given nonequilibrium path is that rates of “forcing” variables $(\varepsilon, T)$ are either sufficiently slow relative to the characteristic relaxation rates of viscous, thermally activated barrier bypass or diffusion rates associated with irreversibility such that relaxation occurs to near equilibrium before the next increment is taken, or they are sufficiently fast relative to these characteristic relaxation rates such that viscous rearrangement processes have little time to occur [6]. In certain cases, such as high strain rate deformation of materials, the rates of dynamic rearrangement due to application of forces and relaxation are of similar order and the concept of a local constrained equilibrium state breaks down, as can be the case for other high frequency, short wavelength phenomena. In practice, it is common to assume a relatively small set of internal state variables that represent the pertinent physical processes associated with inelastic deformation.
Clearly, if the microstructure does not undergo irreversible change, then the body is at most nonlinearly thermoelastic. Virtually any aspect
of microstructure that can undergo irreversible rearrangement during changes of (ε, T ) is a candidate for description by an internal state variable, e.g., dislocation density and arrangement, voids, cracks, slippage of fibers relative to matrices, particle cracking or debonding, phase changes, lattice orientation, and so on. It is presumed that a representative volume element (RVE) of material is considered, such that properties determined from boundary information over the scale of the RVE will not change with further increases of volume size or translation of its boundaries for uniform macrofields (cf. Krajcinovic 1996; Nemat-Nasser and Hori 1993). This definition of statistically homogeneous behavior requires a suitably large RVE relative to the size and spacing (correlation length) of heterogeneities in terms of both microstructure and microstructure change, depending on the property of interest. On occasion, particularly when conducting analyses with resolution on the same order as the microstructure, this requirement conflicts with the notion of behavior at an infinitesimal point; the theory must be extended to address nonlocal behavior in a finite neighborhood of a point, where length scale effects are an obvious manifestation of this extension. However, nonlocal theories are problematic in that their spatial dependence of the constitutive equations interacts with spatial dependence of the other governing field equations in ways that are beyond the capabilities of most traditional numerical codes. Hence, the local assumption is often invoked as a reasonable approximation. Care must be taken to determine parameters from experiments for which specimens are suitably large to validate the underlying RVE assumptions, including the absence of microstructure rearrangement localization effects (e.g., front of a phase transformation or shear banding) across the entire specimen. 
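The RVE requirement, that an apparent property stops changing as the sampled volume grows, can be illustrated with a deliberately simple numerical experiment. The sketch below is an assumption-laden toy, not a model from this chapter: a one-dimensional bar of randomly assigned soft and stiff cells loaded in series, with hypothetical stiffness values and phase fraction. It reports how the scatter of the apparent stiffness shrinks with window size.

```python
import random
import statistics

def apparent_stiffness(n_cells, rng, k_soft=1.0, k_stiff=10.0, frac_stiff=0.3):
    """Series (harmonic-mean) stiffness of a bar of n_cells random two-phase cells."""
    compliance = 0.0
    for _ in range(n_cells):
        k = k_stiff if rng.random() < frac_stiff else k_soft
        compliance += 1.0 / k
    return n_cells / compliance

def scatter(n_cells, n_samples=200, seed=0):
    """Mean and standard deviation of apparent stiffness over many realizations."""
    rng = random.Random(seed)
    values = [apparent_stiffness(n_cells, rng) for _ in range(n_samples)]
    return statistics.mean(values), statistics.pstdev(values)

# Window sizes below the RVE show large realization-to-realization scatter.
results = {n: scatter(n) for n in (10, 100, 1000)}
```

In this toy the scatter falls roughly as the inverse square root of the number of cells, which is one practical way to judge whether a window is "suitably large" for a given property. Properties controlled by the tails of a distribution would be expected to converge more slowly.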
Under these assumptions, we may view $\psi = \psi(\varepsilon, T, \xi_i)$ as a local equation of state, with the path history dependence of inelastic behavior (nonuniqueness of succession of constrained equilibrium states) embedded in the evolution of $\xi_i$. This approach lends itself well to incorporation of known relations for evolution of microstructure. We assign the evolution equations $\dot{\xi}_i = g_i(\sigma, T, \xi_j) = \hat{g}_i(\sigma, T, f_j)$, $i, j = 1, 2, \ldots, n$, as a crucial part of the constitutive formulation. Here, $\dot{\xi}_i$ are considered as thermodynamic fluxes and are “strain-rate-like” quantities, and $f_i$ are conjugate thermodynamic (driving) forces. The nonnegative intrinsic entropy production rate (dissipation) per unit volume is given by
$$\rho\gamma_{loc} = \rho\dot{\eta} - \frac{\rho\Upsilon}{T} + \frac{1}{T}\nabla\cdot q = \rho\dot{\eta} - \frac{1}{T}\left(\rho\Upsilon - \nabla\cdot q\right) \ge 0 \tag{1}$$
where $\rho$ is mass density, $q$ is the heat flux vector, $\Upsilon$ is the internal rate of heat supply per unit mass extrinsic to dissipation associated with microstructure rearrangement processes (e.g., radiation), and $\eta$ is the entropy per unit mass. The inequality $\rho\gamma_{loc} \ge 0$ is a strong form of the local form of the 2nd Law of Thermodynamics (i.e., the Clausius–Duhem (C–D) inequality) which appends an inequality associated with heat conduction.
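The local form of the 1st law of thermodynamics, the energy equation, is written as
$$\rho\Upsilon - \nabla\cdot q = \rho\dot{u} - \sigma : \dot{\varepsilon} \tag{2}$$
where $u$ is the internal energy per unit mass. From the primal definition of the Helmholtz free energy, $\psi = u - T\eta \rightarrow \dot{\psi} = \dot{u} - T\dot{\eta} - \eta\dot{T}$. Expanding state function $\psi$ as a total differential, $\dot{\psi} = (\partial\psi/\partial\varepsilon) : \dot{\varepsilon} + (\partial\psi/\partial T)\,\dot{T} + \sum_{i=1}^{n} (\partial\psi/\partial\xi_i) * \dot{\xi}_i$, and substituting into Eq. (2) gives
$$\left(\rho\frac{\partial\psi}{\partial\varepsilon} - \sigma\right) : \dot{\varepsilon} + \left(\rho\frac{\partial\psi}{\partial T} + \rho\eta\right)\dot{T} + \rho T\dot{\eta} - \rho\Upsilon + \nabla\cdot q + \rho\sum_{i=1}^{n}\frac{\partial\psi}{\partial\xi_i} * \dot{\xi}_i = 0. \tag{3}$$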
Each term in parentheses must vanish independently since ε and T are independent variables, leading to the relations
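$$\sigma = \rho\frac{\partial\psi}{\partial\varepsilon}, \qquad \eta = -\frac{\partial\psi}{\partial T}, \qquad -\frac{\rho}{T}\sum_{i=1}^{n}\frac{\partial\psi}{\partial\xi_i} * \dot{\xi}_i = \rho\dot{\eta} - \frac{1}{T}\left(\rho\Upsilon - \nabla\cdot q\right). \tag{4}$$
Making use of the C–D inequality in Eq. (1), the last of Eqs. (4) reduces to
$$\rho\gamma_{loc} = -\frac{\rho}{T}\sum_{i=1}^{n}\frac{\partial\psi}{\partial\xi_i} * \dot{\xi}_i = \frac{1}{T}\sum_{i=1}^{n} f_i * \dot{\xi}_i \ge 0, \tag{5}$$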
where $f_i = -\rho(\partial\psi/\partial\xi_i)$ is the thermodynamic force conjugate to displacement $\xi_i$. Operator “*” denotes an appropriate scalar product for the Euclidean space of components of tensor $\xi_i$ since the various internal variables can be of arbitrary (and different) rank. We may regard the succession of free energy states through which the material evolves as driven by the global minimization of free energy of the RVE. We draw several important points from Eq. (5): (a) the C–D inequality is equivalent to reduction of free energy with respect to irreversible change of microstructure, (b) the free energy release rate is a generalization of the energy release rate concept in fracture mechanics (cf. [7]), and (c) we may admit certain rearrangement processes that increase free energy so long as there are other irreversible processes that dissipate more energy than is stored. This last point helps to explain why ISVs are necessary to model, in a manner consistent with the 2nd Law, certain processes such as negative creep rate in unloaded metals subjected to tensile prestrain at elevated temperatures (cf., [8]) or contraction of muscle tissue under application of a tensile force. It is noted as well that in the absence of irreversible internal structure rearrangement, $\psi = \psi(\varepsilon, T)$ suffices for a mechanical description of reversible elastic behavior; it is unnecessary to introduce ISVs in this case unless they provide a description of configurational entropy changes as a function of a sequence of reversible elastic states of microstructure, for example, network models of long chain molecules in elastomers.
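The relations of Eqs. (4) and (5) can be exercised numerically on a minimal example. The sketch below assumes, purely for illustration, a one-dimensional isothermal material with a single scalar ISV, a quadratic free energy, and a linear force–flux law; the constants `RHO`, `E`, `H`, `ETA_V` and the function names are hypothetical and do not come from the text. It evaluates the stress and the thermodynamic force by differentiating the free energy and records the product of force and flux, which is $T\rho\gamma_{loc}$ in the notation of Eq. (5).

```python
RHO = 1.0      # mass density (illustrative units)
E = 200.0      # elastic modulus (assumed)
H = 20.0       # modulus governing energy stored with the ISV (assumed)
ETA_V = 50.0   # viscosity in the assumed linear force-flux law

def psi(eps, xi):
    """Assumed Helmholtz free energy per unit mass (isothermal, 1D)."""
    return (0.5 * E * (eps - xi) ** 2 + 0.5 * H * xi ** 2) / RHO

def derivative(fun, x, h=1e-6):
    """Central finite difference of a scalar function."""
    return (fun(x + h) - fun(x - h)) / (2.0 * h)

def stress(eps, xi):
    """sigma = rho * d(psi)/d(eps), first of Eqs. (4)."""
    return RHO * derivative(lambda e: psi(e, xi), eps)

def force(eps, xi):
    """f = -rho * d(psi)/d(xi), the force conjugate to xi."""
    return -RHO * derivative(lambda x: psi(eps, x), xi)

def simulate(eps_rate=1.0, dt=1e-3, n_steps=2000):
    """Explicit time stepping; returns (eps, sigma, f, f * xi_dot) per step."""
    eps, xi, history = 0.0, 0.0, []
    for _ in range(n_steps):
        eps += eps_rate * dt
        f = force(eps, xi)
        xi_dot = f / ETA_V              # assumed linear force-flux relation
        history.append((eps, stress(eps, xi), f, f * xi_dot))
        xi += xi_dot * dt
    return history

history = simulate()
```

Because the flux is taken proportional to its conjugate force with a positive coefficient, the recorded dissipation `f * xi_dot` is nonnegative at every step, which is the content of the inequality in Eq. (5) for this toy model. Other force–flux laws would need to be checked against the same inequality.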
2. Measurement and Interpretation of ISVs
At least two aspects of this formulation relegate the internal state variables to the status of “hidden” variables: (i) the local form of equations is written as a set of differential equations with no spatial dependence and therefore does not couple with the governing field balance equations of mass, momentum and energy on the spatial manifold; (ii) the complete set of ISVs in general cannot be directly measured by virtue of the concession of many-body problems to a low order description (i.e., thermodynamical, rather than dynamical, description). As mentioned previously, (i) can be relaxed if warranted by the need to consider nonlocal action (cf., [9]) by introducing spatial dependence of the ISV evolution equations either via nonlocal integral forms (cf., [10]) or in terms of gradient approximations (cf., [11, 12]). In fact, this step is essential if it is desired to allow the microstructure to evolve and spatially “self-organize” in response to changes in both initial and applied boundary conditions [13]. In many cases, however, such effects of heterogeneity are encompassed within the local ISV description in situations where the behavior at each point can be suitably characterized as statistically representative of a volume element. It is possible to account for spatial gradients of microstructure within an RVE without introducing spatial derivatives in the evolution equations for the $\dot{\xi}_i$ by considering distributions of gradients of microstructural features such as dislocations or cracks within a statistically homogeneous infinitesimal neighborhood of a point (cf., [11, 12, 14]). However, requirements of statistical homogeneity often become more difficult to meet in such cases with regard to properties that depend on distributions of microstructure such as fracture resistance and ductility, in comparison to properties that depend in weak fashion on distribution, such as conductivity and elastic stiffness.
Figure 1. Homogenization of heterogeneous microstructure into equivalent homogeneous continuum at the scale of an RVE.
To understand the implications of point (ii) above, we must consider that adoption of an RVE is intimately related to the replacement of an explicit field of microstructure heterogeneities by an “equivalent” homogeneous continuum. This idea of an equivalent homogeneous material element that replaces the heterogeneous material at each point is central to continuum ISV theory, as illustrated in Fig. 1. In Fig. 1, the homogenization mapping $H(\psi^*(\varepsilon^*, T^*, x^*)) \rightarrow \psi(\varepsilon, T, \xi_i)_{RVE}$ is assumed to hold at the scale of the RVE, such that local field variables $(\varepsilon^*, T^*)$ that vary within this volume are mapped to RVE averages, which are related to RVE boundary information, with necessary augmentation by the ISVs to achieve the free energy equivalence. The resulting RVE level free energy function serves as the potential for statistically representative stress–strain, entropy–temperature and generalized ISV force–displacement relations for internal defects or heterogeneities, as outlined in the preceding. In mesoscale micromechanics methods of elastic heterogeneous media with defects (cf., [15]), homogenization is also achieved at the RVE level, but with the goal of volume averaging of stress, strain and the free energy (typically expressed in isothermal case as strain energy density), explicitly taking into account heterogeneities within the RVE. Accordingly, the aims of such mesoscale methods are essentially captured within the foregoing ISV framework, but there are some important distinctions. Often, the degree of idealization necessary to perform rigorous volume averaging in initial-boundary value problems to obtain RVE level potentials sacrifices too many of the essential features of the heterogeneity fields. For example, Fig. 2 shows three different idealizations of a polycrystal with distributed defects (e.g., dislocations, voids, etc.) within grains. In the first case, appearing at the far left, we explicitly consider both scales of grains and intragranular defects or subgrain structures. In the second case, shown in the middle, ISVs are introduced only to represent heterogeneous phenomena within grains, and only grain-scale heterogeneity is explicitly addressed. At the far right, the fully homogenized description replaces effects of all heterogeneities with ISVs. The number of DoF, N, associated with each of the descriptions shown in Fig. 2, which includes computational DoF, decreases from left to right as we smear heterogeneities and incorporate them in ISVs.
Figure 2. Criteria of choice for ISV models: (left) explicit model of grains and subgrain defects/heterogeneities, (middle) implicit ISV model for subgrain regions combined with explicit grain level modeling, and (right) statistically homogeneous RVE that implicitly encompasses all scales shown at left. Panel labels, left to right: $\Psi(\varepsilon, T; \xi_{I})$, $\Psi(\varepsilon, T; \xi_{II})$, $\Psi(\varepsilon, T; \xi_{III})$, with $N(\xi_{I}) > N(\xi_{II}) > N(\xi_{III})$.
It may be feasible to achieve a homogenized description (far right) by solving initial-boundary value problems either analytically or numerically with the second idealization, but it is often intractable to model realistic distributions of heterogeneities at multiple length scales. The number of DoF becomes simply overwhelming for numerical solutions at length scales well above the RVE if individual defects are considered. Moreover, specification of initial conditions becomes problematic, as this must meet certain requirements on quasi-minimum free energy to be realistic, and most assuredly these initial conditions are not unique. We further comment on (ii) above. In systems with large populations of defects or other sites associated with microstructure rearrangement processes per unit volume, statistical mechanics appeals to ensemble averaging. In so doing, we relinquish any attempt to explicitly model the “dynamical state” of defect interactions. Examples of such fully dynamical models include molecular dynamics, explicit models for interacting deformable systems of particles, and discrete dislocation dynamics theories. The number of DoF for these models climbs dramatically with volume of material considered. In contrast, ISV models of local type address only the reduced thermodynamical state and the solution addresses balance of linear and angular momenta through the governing field equations at the equivalent material (i.e., continuum) level rather than between individual defects or heterogeneities within the RVE. In the process of discarding the explicit representation, however, we lose the capability to model the state of the material in terms of a direct mapping between microstructure and properties. For example, in molecular dynamics we track the position and velocity of particles as a function of time, thereby permitting application of Newton’s laws to achieve a complete dynamical description of the body at arbitrary length scales above the atomic level; lacking the fully dynamic description, the positions of atoms, particles or other heterogeneities, as measured periodically under the microscope, are insufficient to fully characterize the change of free energy with microstructure evolution (cf., [16]). For thermally activated microstructure rearrangement, energy barriers are often quite nonuniformly distributed through the microstructure. Hence, we cannot rigorously link the evolution of ISVs of a thermodynamic description to measurable low order geometric parameters of an evolving microstructure, but must rely on guidance from measured kinetics or computational methods to build force–flux relations that approximate the behavior of the actual dynamic, many-body problem. Historically, force–flux relations have been inferred from laboratory measurements in important applications such as distributed cavity growth in high temperature creep and dislocation creep or plasticity of polycrystalline metals.
3. Comments on Local ISV Framework
The foregoing framework provides little detail regarding the prescription of kinetics of evolution processes other than the need to accord with the C–D inequality; this must be specified via the constitutive relations for fluxes $\dot{\xi}_k$. We are at liberty to base these relations on physical observations, mesoscale micromechanics methods, other computational materials science simulations, or a heuristic of maximal rate of dissipation [17]. Maximal dissipation, combined with the assumption of a convex dissipation potential $\Omega = \hat{\Omega}(\sigma, T, f_j)$ in the space of conjugate forces and temperature, gives rise to a generalized normality structure for the fluxes with respect to this potential, i.e., $\dot{\xi}_j = \partial\Omega/\partial f_j$ [3, 7]. Rice [21] has shown that this requires that each thermodynamic force depends only on its conjugate generalized flux. While this result is appealing and has been used for many processes such as dislocation plasticity (resolved shear stress vs. slip system shearing rate relations), it is emphasized that the heuristic of maximal dissipation is a constitutive assumption and therefore is not on the same grounds as the 2nd Law inequality. Ziegler [17] makes the point that for multiple mechanisms of dissipation, the viability of maximal rate of dissipation as a governing principle for communal response at the RVE scale depends on whether the mechanisms are coupled or decoupled, and whether force–flux relations are linear or not. For many processes, however, it is a powerful and useful heuristic. If not for the assumption of accompanying constrained equilibrium states, the adoption of the equilibrium form of the thermodynamic relations along a nonequilibrium process would be inessential. In fact, even if we do not invoke equilibrium state functions to model nonequilibrium behaviors under locally constrained equilibrium, we may still specify evolution equations for the $\dot{\xi}_j$ as statistical ensemble averages, but must remove the modifier “state” from the ISV label; we then consider the $\xi_j$ simply as internal variables. For highly nonequilibrium processes, Extended Irreversible Thermodynamics (EIT) approaches have been developed (cf., [18]) that include dependence of nonequilibrium state functions on fluxes of energy, mass, momentum and/or higher moments of velocity; as such, the EIT approach is fully nonlocal and removes undesirable characteristics of infinite speed of propagation of signals associated with otherwise parabolic governing equations absent such flux terms.
Another important restriction on the RVE is that, in addition to the use of the free energy as a potential in Eq. (4), the force–flux relations $\dot{\xi}_i = \hat{g}_i(\sigma, T, f_j)$ must hold at the RVE level. In fact, these relations are most fundamental to the ISV approach as they do not necessarily rely on the assumption of constrained equilibrium for thermodynamic state functions. As these relations are often quite sensitive to the tails of the distributions (e.g., distribution of largest heterogeneities), the RVE that corresponds to statistically representative force–flux relations may be significantly larger than that corresponding to the use of free energy to serve as a statistically representative potential. In some cases, it may be on the order of the size of the structural element being modeled, and the premise of local ISV theory is inapplicable. This definition of statistically homogeneous behavior based on force–flux relations requires a suitably large RVE relative to the size and spacing (correlation length) of heterogeneities in terms of both microstructure and microstructure change. The assumption of self-similar development of damage is sometimes made for sake of convenience in extending mesoscale methods solutions for elastic stiffness as a function of damage intensity and distribution, by assuming that distributions of defects will evolve uniformly and neglecting differential changes of higher moments of the distributions due to coalescence phenomena. In general, the kinetics of defect field growth must be based on either physical measurements or more sophisticated, material-dependent simulations of damage growth.
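The way a convex dissipation potential with normality delivers the sign required by the C–D inequality, as discussed earlier in this section, can be made explicit in a short worked step. This is a sketch under added assumptions that the text does not state: the potential, written here as $\Omega$, is differentiable, convex in the forces $f_j$ at fixed $(\sigma, T)$, and attains its minimum at $f_j = 0$.

```latex
% Convexity gives, for any f:
%   \Omega(0) \ge \Omega(f) + \sum_j \frac{\partial\Omega}{\partial f_j} * (0 - f_j)
\begin{align}
\dot{\xi}_j &= \frac{\partial \Omega}{\partial f_j}, \\
\sum_{j=1}^{n} f_j * \dot{\xi}_j
  &= \sum_{j=1}^{n} f_j * \frac{\partial \Omega}{\partial f_j}
  \;\ge\; \Omega(f) - \Omega(0) \;\ge\; 0 .
\end{align}
```

Under these assumptions the product of forces and fluxes, and hence $\rho\gamma_{loc}$ in Eq. (5), is automatically nonnegative. Without convexity, or with a minimum located away from $f_j = 0$, the sign would have to be checked case by case.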
4. Examples of ISV Models
We may identify an inexhaustive list of well known types of constitutive models belonging to the ISV classification (cf., [7, 19, 20]):
• Creep damage of metals in the form of voids, leading to the broad field of continuum damage mechanics
• Void nucleation, growth and coalescence in metals under tensile loading
• Matrix microcracking and fiber fracture in metal- and ceramic-matrix composites
• Dislocation plasticity in single crystals or polycrystals.
In all of these applications the common feature is that of distributed defects or other entities associated with microstructure rearrangement. In most cases, ISV models are based on a combination of experiments and guidance from available micromechanical or materials science-based models.
When ISV constitutive relations are based on macroscopic experiments, it is common for the modeler to introduce a relatively small set (typically 1–3) of different types of ISVs that reflect major features of the irreversible behavior. Evolution equations are framed to describe limited experimental information; since the description is nonunique, a plethora of models can be introduced to model the same range of behaviors. Usually, as the number of different types of experiments and test conditions expands, candidate models are eliminated due to lack of generality or physical inconsistency. As described later, critical experiments can be designed to test various forms of evolution equations $\dot{\xi}_j$. Effectively, the material itself is used as a direct indicator of homogenized behavior, under the assumptions (which ideally should be checked) that boundary conditions and specimen geometry/size have no influence. The model is interpolative in nature. The ISVs are likely to have little relation to measurable features of microstructure or perhaps a single mechanism in view of the foregoing discussion concerning thermodynamic representation of low DoF models. Since this class of ISV model is historically prominent, ISV models are often more generally viewed as primarily phenomenological in nature. A second type of ISV model, however, is a hybrid combination of computational and/or analytical micromechanics methods and motivation from experimental results. A good example is polycrystal plasticity, in which discrete grains and slip systems are addressed computationally, with the slip system level constitutive equations for dislocation glide kinetics and work hardening based on phenomenological laws. This type of model uses relatively large numbers of ISVs for hardening parameters that implicitly address dislocation interactions, as well as explicit ISVs for slip system orientation and grain size/shape.
Integrated numerically over a periodic RVE, they provide the necessary information regarding RVE-level thermodynamics and kinetics. Such an approach is much more predictive and robust than macroscopic plasticity, since it explicitly addresses evolution of crystallographic texture and models both anisotropic elasticity and plasticity. It is clear from this example that mesoscale micromechanics methods adopt a thermodynamic representation as assumed in ISV theory. Experiments are needed to specify initial conditions on orientation and misorientation distributions of grains, in addition to hardening laws. Yet a third type of ISV model has been introduced in which analytical or computational mesoscale methods are used to derive the form of the free energy function and evolution equations of ISVs. Then, the role of experiments is to inform the identification of parameters. This differs from ISV models based on experiments in that model idealization has been introduced to facilitate modeling at the RVE level. A good example is the derivation of models for elastic energy and energy release rate due to matrix microcracking, matrix/fiber slippage and fiber damage in viscoelastic matrix composites (cf., [20]). Experiments are needed to determine strengths of phases and interface properties.
The last two examples illustrate the close relation between mesoscale micromechanics methods and ISV theory. Increasing emphasis on modeling either prior to, or concurrent with, experiments as we progress through these examples should not be confused with increased accuracy of the approach. Clearly, within a given range of conditions, approaches that make more use of experiments will likely provide more realistic results. On the other hand, if it is desired to probe structure–property relations for large ranges of microstructure, then experiments are likely too time-consuming and costly, and hybrid models emphasizing computational micromechanics methods with ISV models to represent fine scale phenomena are good candidates. We may regard these as modeling options of a multi-resolution type, and as elements of a hierarchical, multiscale modeling strategy. When modeling moves from the realm of explicit dynamic analyses of particle or defect interactions to mesoscopic or macroscopic continuum formulations at higher length scales, such multiscale strategies are perhaps best facilitated by using thermodynamics and kinetics to make the linkage, whereby statements of thermodynamics and dissipation made by models with different DoF on overlapping spatial domains and time scales should be held to some measure of equivalence. Moreover, experiments suggest that the mix of contributions of various deformation mechanisms can be treated with a heuristic such as maximal rate of dissipation, which facilitates modeling of multiple deformation mechanisms. In any case, physical consistency is essential. This includes consistency with experimental observations as well as accordance with all the other governing equations and boundary conditions. For example, although much attention has been devoted in the literature to the constraints dictated by the C–D inequality in Eq. 
(1) on the form and parameters of constitutive equations for irreversible processes, many models can be eliminated because they simply predict the incorrect magnitude of dissipation in spite of the proper sign.
5. Application: Metal Viscoplasticity
Viscoplastic behavior of metals is among the more common applications of ISV theory. There are at least five length scales at which dislocation plasticity may be addressed, as shown in Fig. 3. From left to right, these scales are atomistic (molecular dynamics), collections of dislocations (discrete dislocation theory), sub-grain dislocation substructures (continuously distributed dislocation theory), grain (crystal plasticity theory) and macroscale (classical kinematic-isotropic hardening theory with a macroscopic flow potential). Solid solution or precipitate strengthened metallic alloys, multi-phase alloys and whisker, particulate or fiber-reinforced metal matrix composites each exhibit additional length scales intermediate to those depicted in Fig. 3, which affect the hardening and flow behaviors as well as the interaction and growth of defects such as microvoids. Of course, we do not explicitly consider time scales here, but dynamical atomistic approaches and dynamic discrete dislocation mechanics are limited to short time scales, the latter to a lesser degree, while the others are amenable to longer time scales and fall under the framework of the ISV theory outlined earlier.
Figure 3. Window of resolution for dislocation plasticity/viscoplasticity, including the typical minimum explicit length scale of resolution at each window size of observation. Minimum length scale L, left to right: atomistic, O(10^−10 m); discrete dislocations, O(10^−8 m); dislocation patterns, O(10^−7 m); polycrystal plasticity, O(10^−5 m); macroscale plasticity, O(10^−3 m).
Moving from left to right in Fig. 3, we may roughly describe the variables necessary to specify the “state” for a representative window as decreasing in number in accordance with a shift from characterizing the fully dynamical state (positions and momentum of atoms or defects) to the thermodynamical state of the system (microstructure attributes such as dislocation density, dislocation patterns, grain size, orientation distribution, etc.). Rice [21] addressed metal plasticity due to dislocation slip in the context of ISV theory, employing the notion of a sequence of constrained equilibrium states. By this time, the phenomenological laws of crystalline metal plasticity were rather firmly established, as reviewed later by Asaro [22]. A physically-motivated additive decomposition of strain rate into thermoelastic and inelastic parts has been introduced, assuming small strain for simplicity in presentation, as
$$\dot{\varepsilon} = \dot{\varepsilon}^{e} + \dot{\varepsilon}^{in}, \qquad \sigma = \rho\frac{\partial\psi}{\partial\varepsilon} = \rho\frac{\partial\psi}{\partial\varepsilon^{e}}, \qquad \psi = \hat{\hat{\psi}}(\varepsilon^{e}, T, \xi_i) = \psi_e(\varepsilon^{e}, T) + \psi_{in}(\xi_i, T) \tag{6}$$
Note that the complete set of variables necessary to define the state includes the elastic strain tensor (reversible lattice deformation) in addition to $T$ and $\xi_i$. The free energy is split into a thermoelastic part associated with applied elastic strain, $\psi_e$, and a part associated with the internal state variables, $\psi_{in}$, the latter representing, for example, stored elastic energy in the lattice. The inelastic strain rate tensor is given by the variation of strain with respect to microstructure evolution, i.e., $\dot{\varepsilon}^{in} \equiv \sum_{i=1}^{n} (\partial \varepsilon / \partial \xi_i) * \dot{\xi}_i$. Hence, $\dot{\varepsilon}^{in} = 0$ if and only if $\dot{\xi}_i = 0$, $i = 1, 2, \ldots, n$. Accordingly, the inelastic strain tensor should not strictly be considered as an ISV in its own right, since it merely provides a geometric compilation of effects of irreversible rearrangements $\dot{\xi}_i$ on $\dot{\varepsilon}$. For a single crystal [22], the inelastic strain rate is given by $\dot{\varepsilon}^{in} = \sum_{\alpha=1}^{N} \dot{\gamma}^{(\alpha)} \left( s^{(\alpha)} \otimes m^{(\alpha)} \right)_{sym}$, where $N$ is the number of slip systems, each with respective slip direction and slip plane unit normal vectors $s^{(\alpha)}$ and $m^{(\alpha)}$. For the elastic–plastic decomposition in Eq. (6) the C–D inequality is written as
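$$\frac{1}{T}\,\sigma : \dot{\varepsilon}^{in} + \sum_{i=1}^{n} f_i * \dot{\xi}_i = \frac{1}{T}\,\sigma : \sum_{i=1}^{n} \frac{\partial \varepsilon}{\partial \xi_i} * \dot{\xi}_i - \frac{\rho}{T} \sum_{i=1}^{n} \frac{\partial \psi_{in}}{\partial \xi_i} * \dot{\xi}_i \ge 0$$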
(7)

The term $\sigma : \dot{\varepsilon}^{in}$ in Eq. (7) represents the unrecoverable rate of external work, absent body forces, at the scale of the RVE ($\sigma$ and $\dot{\varepsilon}^{in}$ are RVE averages), while the term $(\rho/T) \sum_{i=1}^{n} (\partial \psi_{in}/\partial \xi_i) * \dot{\xi}_i = (\rho/T) \sum_{i=1}^{n} (\partial \psi/\partial \xi_i) * \dot{\xi}_i$ reflects the increase of free energy associated with storage of dislocations within the microstructure (stored energy of cold work). There is no stored energy in a purely dissipative process, and hence no change of $\psi_{in}$. For a single crystal undergoing dislocation glide, the first term is just the sum of resolved shear stresses multiplied by the associated shearing rates on each slip system, i.e., $\sum_{\alpha=1}^{N} \tau^{(\alpha)} \dot{\gamma}^{(\alpha)}$. For polycrystalline metals at large strains, the energy storage in the second term of Eq. (7) is typically less than 10% of $\sigma : \dot{\varepsilon}^{in}$, but it plays a significant role at small strains during loading reversals and for abrupt changes in loading direction (identified with effects of slip system back stress and/or evolving threshold stress, cf. [7]). The works of numerous investigators have amplified these ideas and offered alternative sets of internal state variables that describe slip system hardening relations as well as slip system (crystallographic) orientation for purposes of tracking texture evolution in polycrystals. In crystal plasticity, the $\xi_i$ can include the set of slip system shears $\gamma^{(\alpha)}$ (cf. [21]), and $\partial \varepsilon/\partial \xi_i$ can be assessed directly since $\varepsilon = \varepsilon^{e} + \sum_{\alpha=1}^{N} \gamma^{(\alpha)} \left( s^{(\alpha)} \otimes m^{(\alpha)} \right)_{sym}$ in the small strain case. The inelastic strain rate tensor is prescribed in terms of kinetics of the shearing rates, which in turn depend on thermodynamic conjugate resolved shear stresses. If a convex dissipation potential is constructed [3, 7] as an additive sum of such potentials for independent dissipative mechanisms (see also [17]), then a generalized normality structure emerges for both the inelastic strain rate and the fluxes $\dot{\xi}_i$ with respect to this summed potential.
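As a minimal numerical sketch of the single-crystal relations above, the following Python fragment assembles the inelastic strain rate from slip system shearing rates and checks that the work rate $\sigma : \dot{\varepsilon}^{in}$ equals the sum of resolved shear stresses times shearing rates. The two slip systems, the uniaxial stress state, and the shearing rates are illustrative assumptions rather than values from the text, and all function names are hypothetical.

```python
import math

def outer(a, b):
    """Dyadic product of two 3-vectors, returned as a 3x3 nested list."""
    return [[ai * bj for bj in b] for ai in a]

def sym(A):
    """Symmetric part of a 3x3 tensor."""
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(3)] for i in range(3)]

def ddot(A, B):
    """Double contraction A : B."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

def schmid_tensor(s, m):
    """Symmetric part of the dyad of slip direction s and slip plane normal m."""
    return sym(outer(s, m))

def inelastic_strain_rate(gamma_dots, systems):
    """Sum over slip systems of shearing rate times Schmid tensor (small strain)."""
    eps_in = [[0.0] * 3 for _ in range(3)]
    for gd, (s, m) in zip(gamma_dots, systems):
        P = schmid_tensor(s, m)
        for i in range(3):
            for j in range(3):
                eps_in[i][j] += gd * P[i][j]
    return eps_in

def resolved_shear_stress(sigma, s, m):
    """Resolved shear stress on one slip system: sigma : (s (x) m)_sym."""
    return ddot(sigma, schmid_tensor(s, m))

# Assumed example: two slip systems sharing a (111) plane, uniaxial stress along z.
r2, r3 = math.sqrt(2.0), math.sqrt(3.0)
m111 = [1.0 / r3, 1.0 / r3, 1.0 / r3]
systems = [([1.0 / r2, -1.0 / r2, 0.0], m111),
           ([0.0, 1.0 / r2, -1.0 / r2], m111)]
gamma_dots = [1.0e-3, -2.0e-3]          # assumed shearing rates (1/s)
sigma = [[0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 100.0]]             # assumed stress (MPa)

eps_in = inelastic_strain_rate(gamma_dots, systems)
work_rate = ddot(sigma, eps_in)
slip_power = sum(resolved_shear_stress(sigma, s, m) * gd
                 for gd, (s, m) in zip(gamma_dots, systems))
print(work_rate, slip_power)
```

Because each slip direction is orthogonal to its slip plane normal, the resulting strain rate is traceless, which reflects the isochoric character of slip.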
For a given external work rate the assertion of maximal dissipation for the collective RVE level behavior would be interpreted as minimum elastic energy storage rate associated with heterogeneous deformation within the RVE, similar to the principle of minimum elastic energy for elastic-perfectly plastic boundary value problems discussed in Nadai [23]. It is worth mentioning that one can apply the variational principle of virtual velocities (PVV), based on the working rate of microstructural rearrangements (dislocation glide, diffusive rearrangements, relative grain boundary motions, etc.) that can be identified as
kinematically admissible with an applied strain field over the RVE, to solve the micromechanical boundary value problem with coupled effects of multiple inelastic deformation and diffusion mechanisms via mesoscale methods (cf. applications to diffusion-driven void growth with viscous dislocation creep in metals by Needleman and Rice [24] and grain growth by Moldovan et al. [25]). In this case we invoke maximal dissipation for each mechanism, but collective behavior is described by an upper bound derived from PVV using a kinematically admissible field; accordingly, we seek the solution that exhibits minimum dissipation from among all conceivable upper bound solutions for the conjugate driving forces. It may be shown that the RVE inelastic strain rate is normal to a potential constructed by summing the potentials of the various micromechanisms, each of which is governed by a convex dissipation potential that encloses the origin; this structure is consistent with the maximal rate of dissipation. Hence, from among all solutions that maximize dissipation within the constitutive framework we choose the one with minimum dissipation to obtain the approximate solution nearest the exact solution to the boundary value problem. This minimum work approach traces back to Nadai [23]. Such methods are useful for modeling ensembles of mechanisms that contribute to overall inelastic strain rate. Upper (kinematically admissible) and lower (statically admissible) bounds can be developed as per usual applications of the PVV (cf. [24]). Cleri et al. [26] advocate a multiscale modeling framework that connects atomistics to mesoscale continuum models using this kind of generalized PVV; it is noted that viscous forces are assumed to dominate inertial forces in the micromechanical and ISV descriptions, although this is not always the case in atomistics.
Accordingly, there is a link between concepts of solving micromechanical boundary value problems at the scale of an RVE and the adoption of an ISV framework at this scale. Mesoscale micromechanics solutions to boundary value problems using PVV as just outlined rely on being able to link the thermodynamic fluxes to the applied strain field in a kinematically admissible sense – compatibility and consistency with imposed boundary velocities; as we have seen, in the ISV approach this may not be possible with hidden variables that reflect contributions from distributed, heterogeneous dissipative mechanisms over a wide range of length scales within the RVE. Multi-resolution numerical simulations based on PVV (e.g., displacement-based finite element method) that pass boundary information from sub-regions to higher length scales offer one way to approach this problem. It is a highly fertile area for future research. It is appropriate to close this paper with a pertinent example of experimental inference of ISV evolution from metal plasticity. At the macroscale, we may introduce as a minimum set of internal variables the back stress and an isotropic hardening variable. They are “hidden” in the sense that their effect can be ascertained or inferred from the evolutionary behavior. The back stress
is used to reflect directional effects of pre-strain on the kinetics of plastic flow (so-called Bauschinger effects). These effects are manifested as dependence of reverse yielding on pre-strain in the forward direction, the occurrence of negative creep rate after unloading to zero stress following a tensile pre-strain, and behavior under abrupt changes in direction of loading paths. In terms of Eq. (7), the back stress introduces effects of storage of elastic energy due to heterogeneity of the microstructure. Physically we can suggest several origins of back stress in polycrystalline metals [8]:
• differential yielding among grains with hard and soft orientation, which changes with texture evolution and sub-grain formation
• pile-ups of dislocations against hard boundaries such as grain or phase boundaries
• differential resistance to slip in forward and reverse directions in certain systems by virtue of irreversible pinning mechanisms in the presence of planar dislocation structures in second phases or in Taylor lattices
• distribution of short range barriers to thermally-activated dislocation motion and anelastic bowing of dislocations.
These mechanisms result from hierarchical behavior of a spectrum of dislocation trapping and escape events associated with heterogeneity. The back stress itself can be decomposed [27] into multiple components that reflect various mechanisms. Since it does not represent volume averaged internal stresses but instead characteristics of differential yielding or barrier bypass events in time and space within a heterogeneous microstructure, it cannot generally be inferred, for example, based on changes of average lattice spacing from equilibrium values. Back stress is inherently a multiscale modeling concept. The kinematic hardening variable is not the average of a self-equilibrated internal stress field, but is rather related to the peak athermal barrier resistance to thermally-activated motion of dislocations past obstacles.
This example clarifies that mean field volume averaging procedures commonly employed in mesoscale methods of composite materials to determine effective properties are less useful in determining ISVs that reflect average effects of distributed microstructure rearrangements on changes of properties. It might be more appropriate to classify the volume averaging procedures for evolving microstructure as activation volume averaging, with the focus on rate limiting events rather than mean fields. In view of the preceding discussion, back stress can be inferred from experiments or from results of multiscale simulations accounting for a range of potential deformation mechanisms. While the simulation route is certainly a worthy goal, it has not yet been fully developed to account for the various mechanisms cited above, whereas the experimental approach is rather well-established and instructive. The asymmetry of forward and reverse yield strength following reversal of stress, termed the Bauschinger effect, has long
been identified as a principal feature of cyclic deformation response of ductile metals. Otherwise known as kinematic hardening, this response has been attributed to the nonuniformity of inelastic flow (dislocation motion) at the microscale. For more details, the reader is referred to the review of cyclic plasticity and viscoplasticity by Chaboche [28]. We consider here the shift and expansion/contraction of a simple, initially isotropic yield surface for type 304 stainless steel, introducing the set $\xi_i$ composed of a single tensorial kinematic hardening variable, $\alpha$, and a scalar isotropic hardening variable, $R$. The variable $R$ reflects storage of elastic energy associated with overall dislocation density accumulation and reduction of mean free path. A simple, classical model of yielding for an initially isotropic, plastically incompressible, rate-independent polycrystal employs a uniaxial normalized $J_2$ yield surface of the form
$$f = \frac{3}{2}\left(\sigma'_{ij} - \alpha_{ij}\right)\left(\sigma'_{ij} - \alpha_{ij}\right) - R^{2}$$
(8)
where $\sigma'_{ij} = \sigma_{ij} - (\sigma_{mm}/3)\,\delta_{ij}$ is the deviatoric stress, $\alpha_{ij}$ is the deviatoric back stress tensor and $R$ is the radius of the yield surface. The flow rule, for $f = \dot{f} = 0$, is written as
$$\dot{\varepsilon}^{p}_{ij} = \frac{1}{h}\left(\dot{\sigma}_{kl} N_{kl}\right) N_{ij}$$
(9)
where $N_{ij} = \left(\sigma'_{ij} - \alpha_{ij}\right) / \left\| \sigma'_{kl} - \alpha_{kl} \right\|$ is the unit normal vector to $f$ in deviatoric stress space at the current stress point on the yield surface, and the inelastic strain rate is the plastic strain rate, i.e., $\dot{\varepsilon}^{in}_{ij} = \dot{\varepsilon}^{p}_{ij}$. Within this framework, the form of evolution of the back stress can be probed by experiments that involve continuous axial-torsional strain-controlled, nonproportional deformation of tubular specimens along cyclically stable paths. Under these conditions, it is assumed that $R$ has saturated or stabilized and only back stress evolves. Assuming the form in Eq. (8) holds for the yield surface, Fig. 4 compares the performance of linear and nonlinear kinematic hardening laws for 304 stainless steel at room temperature [29], i.e.,
$$\dot{\alpha}_{ij} = H\,\dot{p}\,N_{ij}$$
(10)
$$\dot{\alpha}_{ij} = C\left(b\,N_{ij} - \alpha_{ij}\right)\dot{p}$$
(11)
where $\dot{p} = \left(\dot{\varepsilon}^{in}_{kl}\,\dot{\varepsilon}^{in}_{kl}\right)^{1/2}$, and $H$, $C$ and $b$ are scalars/constants. In Fig. 4, the back stress path is plotted in the subspace of axial back stress, $\alpha_1 = \alpha_{11}$, vs. shear back stress, $\alpha_3 = \sqrt{3}\,\alpha_{12}$, for cyclically stable response, as estimated by backward extrapolation from the stress point along the inelastic strain rate direction determined by numerical differentiation of data in the vicinity of each point along the stress path; the backward-extrapolated path is selected that provides the smoothest evolution through the cycle, i.e., minimizing jump discontinuities near points of abrupt change of direction of loading path. Arrows in the direction predicted by each of the back stress evolution laws in Eqs. (10) and (11) are drawn with tails along this curve. Clearly, the nonlinear form given in Eq. (11) provides a much more accurate description than that in Eq. (10) since the direction of the rate predicted by Eq. (11) is nearly tangent to the measured (inferred) back stress path through the cycle. In contrast, the Prager rule in Eq. (10) is inaccurate since the arrows are far from tangency to the path. Indeed, the Armstrong–Frederick form [30] of kinematic hardening in Eq. (11) has been frequently used in the last twenty years to provide accurate simulations of cyclic behavior under nonproportional loading. Further enhancements of this simple form have addressed more accurate unloading–reloading behavior as well.

Figure 4. Predicted directions of back stress rate (arrows) based on the Prager kinematic hardening law (top) and Armstrong–Frederick nonlinear kinematic hardening law (bottom), superimposed on trajectory of back stress (solid lines) as inferred from measurements under a cyclically stable nonproportional straining path for type 304 stainless steel [29].
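The difference between the two hardening laws in Eqs. (10) and (11) can be illustrated with a short Python sketch. It integrates both laws with an explicit Euler scheme along a proportional path (fixed unit normal $N$) and then evaluates the predicted back stress rate just after an abrupt 90-degree change of $N$. The constants $H$, $C$, $b$, the step size, and the two-component representation of the deviatoric tensors in an orthonormal basis are illustrative assumptions, not data for 304 stainless steel, and the function names are hypothetical.

```python
import math

def prager_rate(N, alpha, p_dot, H):
    """Eq. (10): linear (Prager) rule; the rate always points along N."""
    return [H * p_dot * n for n in N]

def armstrong_frederick_rate(N, alpha, p_dot, C, b):
    """Eq. (11): hardening along N plus a recall term proportional to alpha."""
    return [C * (b * n - a) * p_dot for n, a in zip(N, alpha)]

def integrate(rate, directions, dp, **constants):
    """Explicit Euler integration in p; directions holds the unit normal per step."""
    alpha = [0.0] * len(directions[0])
    for N in directions:
        d_alpha = rate(N, alpha, 1.0, **constants)  # p_dot = 1, so dt = dp
        alpha = [a + dp * da for a, da in zip(alpha, d_alpha)]
    return alpha

# Assumed illustrative constants (stress units arbitrary).
H, C, b = 1000.0, 50.0, 100.0
dp, steps = 1.0e-5, 10000
p_total = dp * steps
fixed = [[1.0, 0.0]] * steps            # proportional path: N held fixed

alpha_prager = integrate(prager_rate, fixed, dp, H=H)
alpha_af = integrate(armstrong_frederick_rate, fixed, dp, C=C, b=b)
exact_af = b * (1.0 - math.exp(-C * p_total))   # closed form for fixed N

# Predicted rate directions just after an abrupt 90-degree change of N.
N_new = [0.0, 1.0]
rate_prager = prager_rate(N_new, alpha_prager, 1.0, H)
rate_af = armstrong_frederick_rate(N_new, alpha_af, 1.0, C, b)
print(alpha_prager, alpha_af, exact_af)
print(rate_prager, rate_af)
```

For fixed $N$ the Prager back stress grows without bound, while the Armstrong–Frederick back stress saturates at magnitude $b$. After the direction change, the Prager rate points purely along the new normal, whereas the recall term in Eq. (11) also drives the existing back stress component toward zero, so the rate turns relative to $N$.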
References
[1] B.D. Coleman and M.E. Gurtin, “Thermodynamics with internal state variables,” J. Chem. Phys., 47, 597–613, 1967.
[2] J. Kestin and J.R. Rice, “Paradoxes in the application of thermodynamics to strained solids,” In: E.B. Stuart, B. Gal Or, and A.J. Brainard (eds.), A Critical Review of Thermodynamics, Mono-Book Corp., Baltimore, pp. 275–298, 1970.
[3] P. Germain, Q.S. Nguyen, and P. Suquet, “Continuum thermodynamics,” J. Appl. Mech., Trans. ASME, 50, 1010, 1983.
[4] J. Kestin, “Local equilibrium formalism applied to mechanics of solids,” Int. J. Sol. Struct., 29(14–15), 1827–1836, 1992.
[5] W. Muschik, “Fundamentals of nonequilibrium thermodynamics,” In: W. Muschik (ed.), Non-Equilibrium Thermodynamics with Applications to Solids, CISM Courses and Lectures No. 336, International Centre for Mechanical Sciences, Springer-Verlag, New York, pp. 1–63, 1993.
[6] J. Bataille and J. Kestin, “L’interprétation physique de la thermodynamique rationnelle,” J. de Mécanique, 14, 365–384, 1975.
[7] J. Lemaitre and J.L. Chaboche, Mechanics of Solid Materials, Cambridge University Press, Cambridge, 1990.
[8] D.L. McDowell, “Multiaxial effects in metallic materials,” Symposium on Durability and Damage Tolerance, ASME AD-Vol. 43, ASME Winter Annual Meeting, Chicago, IL, Nov. 6–11, pp. 213–267, 1994.
[9] Z.P. Bazant and E.-P. Chen, “Scaling of structural failure,” Appl. Mech. Rev., 50(10), 593–627, 1997.
[10] A.C. Eringen, “Non-local polar elastic continua,” Int. J. Engrg. Sci., 10, 1–16, 1972.
[11] H.M. Zbib and E.C. Aifantis, “On the gradient-dependent theory of plasticity and shear banding,” Acta Mech., 92, 209–225, 1992.
[12] H.M. Zbib and E.C. Aifantis, “Size effects and length scales in gradient plasticity and dislocation dynamics,” Scripta Mater., 48, 155–160, 2003.
[13] E.C. Aifantis, “Pattern formation in plasticity,” Int. J. Engrg. Sci., 33(15), 2161–2178, 1995.
[14] T.E. Lacy, D.L. McDowell, and R. Talreja, “Gradient concepts for evolution of damage,” Mech. Mater., 31, 831–860, 1999.
[15] S. Nemat-Nasser and M. Hori, Micromechanics: Overall Properties of Heterogeneous Materials, North-Holland, Amsterdam, 1993.
[16] M. Zhou and D.L. McDowell, “Equivalent continuum for dynamically deforming atomistic particle systems,” Phil. Mag. A, 82(13), 2547–2574, 2002.
[17] H. Ziegler, “An Introduction to Thermomechanics,” In: E. Becker, B. Budiansky, W.T. Koiter, and H.A. Lauwerier (eds.), North Holland Series in Applied Mathematics and Mechanics, 2nd edn., vol. 21, North Holland, Amsterdam, New York, 1983.
[18] G. Lebon, “Fundamentals of nonequilibrium thermodynamics,” In: W. Muschik (ed.), Non-Equilibrium Thermodynamics with Applications to Solids, CISM Courses and Lectures No. 336, International Centre for Mechanical Sciences, Springer-Verlag, New York, pp. 139–204, 1993.
[19] D. Krajcinovic, Damage Mechanics, Elsevier, Amsterdam, 1996.
[20] R.S. Kumar and R. Talreja, “A continuum damage model for linear viscoelastic composite materials,” Mech. Mater., 35(3–6), 463–480, 2003.
[21] J.R. Rice, “Inelastic constitutive relations for solids: an internal variable theory and its application to metal plasticity,” J. Mech. Phys. Sol., 19, 433–455, 1971.
[22] R.J. Asaro, “Crystal plasticity,” J. Appl. Mech., Trans. ASME, 50, 921–934, 1983.
[23] A. Nadai, Theory of Flow and Fracture of Solids, McGraw-Hill, New York, 1963.
[24] A. Needleman and J.R. Rice, “Plastic creep flow effects in the diffusive cavitation of grain boundaries,” Acta Met., 28(10), 1315–1332, 1980.
[25] D. Moldovan, D. Wolf, S.R. Phillpot, and A.J. Haslam, “Role of grain rotation during grain growth in a columnar microstructure by mesoscale simulation,” Acta Mater., 50, 3397–3414, 2002.
[26] F. Cleri, G. D’Agostino, A. Satta, and L. Colombo, “Microstructure evolution from the atomic scale up,” Comput. Mater. Sci., 24, 21–27, 2002.
[27] J.C. Moosbrugger and D.L. McDowell, “A rate dependent bounding surface model with a generalized image point for cyclic nonproportional viscoplasticity,” J. Mech. Phys. Sol., 38(5), 627–656, 1990.
[28] J.-L. Chaboche, “Constitutive equations for cyclic plasticity and cyclic viscoplasticity,” Int. J. Plasticity, 5(3), 247–302, 1989.
[29] D.L. McDowell, “An experimental study of the structure of constitutive equations for nonproportional cyclic plasticity,” J. Engrg. Mater. Techn., Trans. ASME, 107, 307–315, 1985.
[30] P.J. Armstrong and C.O. Frederick, “A mathematical representation of the multiaxial Bauschinger effect,” CEGB Report RD/B/N731, Berkeley Nuclear Laboratories, 1966.
3.7 DUCTILE FRACTURE M. Zikry North Carolina State University, Raleigh, NC, USA
Up to this point, we have focused upon mesoscale and macroscale formulations for plasticity. The ISV method in the previous section applies to ductile fracture as well. It is generally called continuum damage mechanics (CDM). Before we proceed to discuss CDM, it is worthwhile discussing physical mechanisms and models for ductile fracture. Therefore, in this section, we examine another dissipative mechanism, namely that of ductile fracture. Ductile failure or rupture is generally preceded by extensive plastic deformation, and it may occur either due to geometrical instabilities associated with specimen dimensions or due to the nucleation, growth, and coalescence of microscopic voids that initiate and propagate at inclusions, second phase particles, or grain boundaries. Since ductile materials are generally used due to their toughness, it is essential to understand how failure initiates and evolves, so that the inherent relatively high toughness can be used for design.
1. Essential Concepts
In ductile crystalline materials, such as aluminum, copper, and steels, fracture initiation can be triggered by plasticity. In simple tensile tests, ductile homogeneous specimens without pre-existing flaws and with geometrical dimensions on the order of centimeters fail after the specimen undergoes extensive necking and plastic deformation in the necked region. As the specimen necks, the stress unloads as a function of strain as illustrated in Fig. 1. Inelastic or plastic deformation occurs on specific crystalline or glide planes, which are oriented at 45° from the loading axis. Specimen failure can occur as a chisel point or as a ductile cup-cone fracture (Fig. 1). Specimen necking is a geometrical instability that is usually related to the specimen’s aspect ratio, and it is directly related to the accumulation of plastic deformation and failure. Hence, ductile failure is generally distinguished from brittle failure by the extensive inelastic deformation that precedes failure and generally higher deformation energies.

S. Yip (ed.), Handbook of Materials Modeling, 1171–1181. © 2005 Springer. Printed in the Netherlands.

Figure 1. Stress unloading as a function of strain with chisel-point fracture (geometrical instability) and cup-cone fracture. Axes: stress vs. strain.

When these ductile fracture surfaces are observed and analyzed on axial metallographic sections, at a scale generally on the order of micrometers, the fracture surface appears dimpled (Fig. 2). Several factors have been shown to influence ductile failure: stress triaxiality, void volume fraction, effective plastic strain, interparticle spacing, and void initiation, growth, and coalescence. Stress triaxiality is the ratio of the mean stress to the equivalent stress. The mean or hydrostatic stress, $\sigma_m$, is the average of the normal stresses,
$$\sigma_m = \frac{\sigma_1 + \sigma_2 + \sigma_3}{3},$$
(1)
and the equivalent stress $\sigma_{eq}$ is a scalar measure of the stress used in the multiaxial Von Mises yield criterion, and is given as
$$\sigma_{eq} = \frac{1}{\sqrt{2}}\left[(\sigma_1 - \sigma_2)^2 + (\sigma_1 - \sigma_3)^2 + (\sigma_3 - \sigma_2)^2\right]^{1/2},$$
(2)
where $\sigma_1$, $\sigma_2$, and $\sigma_3$ are the principal normal stresses. Since ductility decreases with increasing stress triaxiality for most metals [1], this is one measure that has been used to characterize ductile failure.

Figure 2. Dimpled failure surface associated with ductile fracture in low-carbon steel.

The resulting fracture surface is generally a complex path of coalesced voids running through the material [2]. The failure path for coalescence can develop quickly, on a local scale, in regions that otherwise may appear globally uniform. The dimpled surfaces associated with ductile fracture can occur sequentially through void nucleation, growth, and coalescence as follows:
• a stage of void nucleation that can occur by internal cracking of second phase particles or inclusions or by decohesion at particle–matrix interfaces;
• a stage of void growth up to a critical void size and intervoid spacing ratio;
• a stage of void coalescence that links the voids in a final path of rupture and subsequent fracture.
These three essential stages are briefly described below with some of the approaches that have been used to address each stage.
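Before turning to these stages, the stress triaxiality defined through Eqs. (1) and (2) can be evaluated with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the stress values in the usage lines are assumed for illustration only.

```python
import math

def mean_stress(s1, s2, s3):
    """Eq. (1): mean (hydrostatic) stress from the principal stresses."""
    return (s1 + s2 + s3) / 3.0

def equivalent_stress(s1, s2, s3):
    """Eq. (2): Von Mises equivalent stress from the principal stresses."""
    return math.sqrt(((s1 - s2) ** 2 + (s1 - s3) ** 2 + (s3 - s2) ** 2) / 2.0)

def stress_triaxiality(s1, s2, s3):
    """Ratio of mean stress to equivalent stress."""
    seq = equivalent_stress(s1, s2, s3)
    if seq == 0.0:
        raise ValueError("triaxiality is undefined for a purely hydrostatic state")
    return mean_stress(s1, s2, s3) / seq

# Assumed illustrative stress states (MPa).
uniaxial = stress_triaxiality(300.0, 0.0, 0.0)
equibiaxial = stress_triaxiality(300.0, 300.0, 0.0)
pure_shear = stress_triaxiality(100.0, -100.0, 0.0)
print(uniaxial, equibiaxial, pure_shear)
```

Uniaxial tension gives a triaxiality of 1/3, equibiaxial tension 2/3, and pure shear zero, so states with a larger hydrostatic tension component rank higher on this measure.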
1.1. Void Nucleation
Microvoid or cavity nucleation at inclusions or second phase particles can proceed in several distinct ways:
• by the fracture of hard non-metallic inclusions. The resulting microcracks can lead to ductile fracture if the stress conditions are not energetically favorable for the propagation of cleavage cracks;
• by the separation of hard or soft particles from a material matrix at a particle–matrix interface by interfacial decohesion. This decohesion occurs when the dissipated energy exceeds the energy required for the formation of new surfaces and when a local stress exceeds a critical quantity related to the interfacial strength between the particle and the matrix. Argon et al. [3] proposed one of the first expressions for the determination of this critical stress for equiaxed particles as
$$k\,\sigma_{eq} + \sigma_m + C\,\frac{l_0}{R_0} = \sigma_c,$$
(3)
where $\sigma_m$ is the mean stress, $\sigma_{eq}$ is the equivalent Von Mises stress, $k$ is a geometrical factor, $l_0$ is the distance between particles, $R_0$ is the particle radius, $C$ is a numerical coefficient, and $\sigma_c$ is the critical stress. Argon and his collaborators were able to show that particle interaction does not occur for particle volume fractions of less than 1% and for particle volume fractions of 10% or greater, and that the onset of decohesion occurs at the beginning of yielding.
• by nucleation at grain boundaries (triple points), where high stresses can arise from misorientations. However, because the elastic mismatches of particles or inclusions embedded within a ductile matrix are greater than those due to crystal misorientations, the dominant source of void nucleation is second phase particles or inclusions.
1.2. Void Growth
In the earliest approaches used to analyze void growth, void interactions were generally ignored. Typical of these approaches for void geometrical changes is the one developed by McClintock [4], who analyzed the behavior of a periodic array of circular cylindrical voids, with initial radii $R_0$, in a material subjected to plane strain tensile deformation (in the axial z direction) with a Von Mises criterion, an associated flow rule, and a power law hardening formulation. He obtained the following expression for the rate of change of the current radius, $R$, as
$$\frac{dR}{R} = \frac{\sqrt{3}}{2}\,\frac{n}{n-1}\,d\varepsilon_{eq}\,\sinh\!\left[\frac{\sqrt{3}\,(\sigma_x + \sigma_y)}{2\,\sigma_{eq}}\,\frac{n}{n-1}\right] + \frac{d\varepsilon_x + d\varepsilon_y}{2},$$
(4)
where $\varepsilon_x$ and $\varepsilon_y$ are the transverse strains, $\varepsilon_{eq}$ is the equivalent strain, $n$ is the strain-hardening coefficient, and $\sigma_x$ and $\sigma_y$ are the transverse stresses. McClintock [5] obtained a similar expression for a cylinder subjected to triaxial loading conditions. Rice and Tracey [6] also obtained an expression for the rate of change of the radius of a spherical void subjected to a tensile deformation with a remote hydrostatic stress and linear strain hardening as
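$$\frac{dR}{R} = (1 + G)\,\frac{d\varepsilon_1 + d\varepsilon_2 + d\varepsilon_3}{2} + \sqrt{\frac{2}{3}\left(d\varepsilon_1^{2} + d\varepsilon_2^{2} + d\varepsilon_3^{2}\right)}\;D\,R_0$$
(5)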
where R0 is the initial radius of the sphere, and D and G are constants that depend on the stress state and the strain hardening of the material. Both of these Eqs. (4)–(5) can be integrated by invoking incompressibility, and expressions can then be obtained for radial displacements.
1.3. Void Coalescence
The models developed by McClintock and by Rice and Tracey were for single cylindrical and spherical voids; they neither took void interactions into account nor considered the prediction of ultimate failure. Hence, a separate failure criterion had to be applied to characterize microvoid coalescence. Some of the earliest models to address this were developed by Berg [7] and Gurson [8]. Numerous researchers have extensively applied variations of the Gurson model. In this model, it is assumed that a porous solid material can be approximated as a homogeneous spherical body, or matrix, with a spherical cavity at its center. The effects of voids appear indirectly through their influence on the global flow behavior. The main difference between the Gurson model and classical plasticity is that the Gurson model has a dependence on hydrostatic stress, while classical plasticity formulations do not. Gurson [8] developed the following potential function,
$$\Phi = \frac{\sigma_{eq}^{2}}{\sigma_0^{2}} + 2 f \cosh\!\left(\frac{3\,\sigma_m}{2\,\sigma_0}\right) - \left(1 + f^{2}\right) = 0,$$
(6)
where $\sigma_0$ is the yield strength of the matrix, $\sigma_{eq}$ is the Von Mises equivalent stress, $\sigma_m$ is the mean hydrostatic stress, and $f$ is the void volume fraction. When $f$ is zero, this potential function reduces to the classical Von Mises yield surface with isotropic hardening. When $f$ is one, the yield surface shrinks to a point and the material stress carrying capacity vanishes. However, Eq. (6) greatly overestimates ductility and failure strains, because it does not account for void interactions. Tvergaard [9] attempted to correct this by introducing two parameters, $q_1$ and $q_2$. He developed these parameters through axisymmetric finite-element analyses of a circular cylindrical unit cell. Based on this, he obtained the following modified yield condition,
$$\Phi = \frac{\sigma_{eq}^{2}}{\sigma_0^{2}} + 2\,q_1 f \cosh\!\left(\frac{3\,q_2\,\sigma_m}{2\,\sigma_0}\right) - \left(1 + q_1^{2} f^{2}\right) = 0,$$
(7)
which is identical to the Gurson model if the void fraction is multiplied by a factor and if the hydrostatic tension is increased. The values proposed by Tvergaard are $q_1 = 1.5$ and $q_2 = 1.00$. Hence, this modification has the effect of amplifying the hydrostatic stress at all strain levels. Although this model properly describes the overall behavior, it does not account for the rapid drop in stress carrying capacity occurring before final rupture. Thus, Tvergaard and Needleman [10] suggested replacing $f$ in the previous equations with the function $f^{*}$ so that
$$f^{*} = f, \qquad f \le f_c,$$
(8a)
$$f^{*} = f_c + \frac{f_u^{*} - f_c}{f_f - f_c}\,(f - f_c), \qquad f \ge f_c,$$
(8b)
where the first expression is used when $f$ is less than or equal to the critical value $f_c$ and the second expression is used when $f$ is greater than $f_c$. Here $f_f$ is the void volume fraction at which the stress carrying capacity vanishes, so that $f_u^{*} = 1/q_1$. As $f$ approaches $f_f$, $f^{*}$ approaches $f_u^{*}$ and the material loses all stress carrying capacity. Values for $f_c$ and $f_f$ were determined numerically from different unit cell computations performed by Tvergaard [9]. He determined that $f_c$ varies linearly between 0.04 and 0.12 for an initial void volume fraction varying between 0 and 0.06. It has been postulated by several investigators (see, for example, Ref. [11]) that the microvoid volume fraction increases partly because of the growth of existing microvoids and partly because of the nucleation of new microvoids. This is assumed to occur as
$$df = df_{growth} + df_{nucleation},$$
(9)
where, from the incompressibility of the matrix, the growth term can be given in terms of the dilatational strain increment $d\varepsilon_{ii}$ as
$$df_{growth} = (1 - f)\,d\varepsilon_{ii}.$$
(10)
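The modified yield condition of Eq. (7), the effective porosity of Eqs. (8a) and (8b), and the growth term of Eq. (10) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the values of $f_c$, $f_f$, and the stresses are assumed, and the $f^2$ term is taken to be scaled by $q_1^2$, which is the choice consistent with $f_u^{*} = 1/q_1$.

```python
import math

def f_star(f, f_c, f_f, q1=1.5):
    """Eqs. (8a)-(8b): effective void volume fraction with accelerated growth."""
    f_u = 1.0 / q1                       # value at which all capacity is lost
    if f <= f_c:
        return f
    return f_c + (f_u - f_c) / (f_f - f_c) * (f - f_c)

def gtn_yield(sigma_eq, sigma_m, sigma_0, f, q1=1.5, q2=1.0):
    """Eq. (7): value of the modified Gurson yield function (zero at yield)."""
    return ((sigma_eq / sigma_0) ** 2
            + 2.0 * q1 * f * math.cosh(1.5 * q2 * sigma_m / sigma_0)
            - (1.0 + (q1 * f) ** 2))

def void_growth_increment(f, d_eps_kk):
    """Eq. (10): growth of existing voids for an incompressible matrix."""
    return (1.0 - f) * d_eps_kk

# Assumed illustrative values (not taken from the text).
sigma_0 = 250.0            # matrix yield strength (MPa)
f_c, f_f = 0.10, 0.25      # critical and final void volume fractions

dense = gtn_yield(sigma_0, 0.0, sigma_0, 0.0)             # f = 0: Von Mises limit
no_pressure = gtn_yield(200.0, 0.0, sigma_0, 0.05)
with_pressure = gtn_yield(200.0, 100.0, sigma_0, 0.05)    # added hydrostatic tension
failed = gtn_yield(0.0, 0.0, sigma_0, f_star(f_f, f_c, f_f))
print(dense, no_pressure, with_pressure, failed)
```

With `q1 = q2 = 1` and `f_star` bypassed, `gtn_yield` reduces to the original Gurson function of Eq. (6). The last evaluation shows that at $f = f_f$ the yield condition is met at zero stress, i.e., the stress carrying capacity has vanished.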
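For strain-controlled nucleation, the nucleation term [12] can be given by
$$df_{nucleation} = \frac{f_n}{s_n \sqrt{2\pi}} \exp\!\left[-\frac{1}{2}\left(\frac{\varepsilon_{eq} - \varepsilon_N}{\sigma_N}\right)^{2}\right] d\varepsilon_{eq},$$
(11)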
where f n is the volume fraction of particles on which voids are formed, εN is the mean critical plastic strain, σN is the stress corresponding to that critical strain, and sn is a standard deviation. A similar expression for stress-controlled nucleation can be obtained by replacing the strain terms with the corresponding stress terms, such as the mean stress and the equivalent stress. These modifications of Gurson’s formulation have generally resulted in accurate predictions of the softening response of rate-independent materials
with periodic distributions (cf. [11]). However, accurate failure strains based on Gurson’s constitutive formulation for rate-dependent materials with random void distributions have generally not been obtained. As noted by Magnusen et al. [13, 14], deformation and failure modes that are inherently related to a statistical variation of properties cannot accurately be modeled by a single porosity variable. Furthermore, in Gurson’s constitutive formulation, no distinction is made for the nature of the void distribution in the material. Experiments by Magnusen et al. [13] and Becker and Smelser [15] have shown that a material with a random array of voids has less ductility and lower strength than specimens having a periodic array of voids. However, if a random and a periodic arrangement of voids had the same number of voids per unit volume, the porosity parameter in Gurson’s formulation would still be the same for both distributions. Gurson’s formulation also makes no distinction with respect to void size or volume: a population of small voids and a single void of equal total volume would have the same porosity. Furthermore, most of these investigations have used unit cells (similar to the mesoscale RVEs discussed in the previous section on ISVs) to represent the overall aggregate mechanical response. In unit cell computations, symmetry is usually exploited, so that only a symmetric portion of the void surface is analyzed. Interaction effects from nearby voids are usually modeled by imposing a kinematic or static boundary condition on a surface of the unit cell. However, there are several shortcomings associated with this approach. One difficulty is that all unit cell boundaries have to remain straight since it is assumed that all voids deform uniformly. This precludes explicitly accounting for specimen instabilities, such as necking at the free boundary.
A second shortcoming is that if only one symmetric portion of a void is used, the non-symmetric and irregular geometries associated with void deformation and coalescence cannot be modeled accurately. Experimental studies (cf. [14]) have shown that the spacing between voids is also a critical factor in void coalescence. In unit cell computations, the effect of this spacing on void interaction has generally not been accounted for accurately enough for the computations to correlate with experimental results (cf. [11]). Furthermore, as noted by Tvergaard [11], the range of parameters (q1 and q2) used to determine final fracture is strongly dependent on the initial porosity. These parameters are usually obtained for specific loading conditions and geometries, and they may have to be determined anew for different computations.
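The insensitivity of a single porosity variable to void size and arrangement can be illustrated numerically. The following is a minimal sketch and not part of Gurson’s formulation itself: the unit-cell volume, the void radii, and the helper names `sphere_volume` and `porosity` are assumptions chosen for illustration, and the voids are idealized as non-overlapping spheres.

```python
import math


def sphere_volume(radius):
    """Volume of a spherical void of the given radius."""
    return 4.0 / 3.0 * math.pi * radius ** 3


def porosity(void_radii, cell_volume):
    """Void volume fraction: total void volume divided by the cell volume.

    Void positions do not enter this measure at all, so periodic and
    random arrangements of the same voids give the same value.
    """
    return sum(sphere_volume(r) for r in void_radii) / cell_volume


# Hypothetical unit cell of volume 1.0 (arbitrary length units cubed).
cell_volume = 1.0

# Case 1: a single void of radius 0.2.
single_void = [0.2]

# Case 2: eight voids of half the radius, hence the same total volume.
many_voids = [0.1] * 8

f_single = porosity(single_void, cell_volume)
f_many = porosity(many_voids, cell_volume)
print(f_single, f_many)
```

Both cases give the same porosity to within rounding, even though the spacing and size of the voids, which the experiments cited above identify as important, are very different.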
1.4. Current Investigations
When ductile solids are deformed sufficiently far into the inelastic regime, a smoothly varying deformation can give way to one involving highly localized
deformation in the form of shear bands. This form of shear-strain localization can occur in rocks, polymers, granular materials, and structural metals. Shear-strain localization is of considerable practical importance in metal forming, high-speed machining, and ballistic impact. The physical mechanisms that can trigger localization are a function of loading rate. At quasi-static loading rates, geometrical softening (crystal rotation) may be the triggering mechanism, while for dynamic loading conditions, thermal softening coupled with geometrical softening may be the triggering mechanism. The use of smooth yield surfaces with increasing strain hardening rates, such as a von Mises surface, to model shear-band formation would not yield localized deformations or stress behavior beyond the initial bifurcation point. Numerous phenomenological models have therefore been developed to analyze shear-strain localization and post-bifurcation behavior. These models are essentially J2 (second invariant of deviatoric stress) theories with a corner or vertex on the yield surface. They were developed [16] to incorporate the vertex formation on yield surfaces exhibited by physical models of polycrystalline aggregates. The yield surface corners are formed by the intersection of several smooth yield surfaces, where each potential or smooth yield function can be cast in terms of the strain rate and work hardening. An angle can be obtained [16] in terms of moduli from deformation theory and linear elasticity in a way that ensures that the surface is convex. These models have been used with some success to model shear-band localization (Tvergaard [17]). As analyses of shear-band localization indicate, finite-strain plasticity addresses processes involving large plastic deformations that lead to failure.
Macroscale continuum theories, such as the yield vertex models, may not directly address the actual physical micromechanisms that result in permanent inelastic strains, since there is a certain degree of arbitrariness in the formulations. Furthermore, deformation-based theories may not be suitable for non-proportional loading conditions. These issues have been addressed within the context of crystal plasticity, where the deformation kinematics can be related to the atomic structure of the crystalline lattice.
2. Research Trends Pertaining to Ductile Fracture
It should be obvious from the previous sections that ductile fracture is inherently related to material heterogeneities. These heterogeneities can occur due to voids, inclusions, grain-boundaries, subgrains, or localized bands. Hence, current and future research endeavors pertaining to ductile fracture that are necessary to advance this area would include research to understand and control heterogeneous behavior at different scales in ductile materials and structures. Obviously, there are myriad research avenues to address these
issues. Due to the length constraints of this chapter, only three broad topical areas will be briefly outlined. Several references are relevant to future research trends related to this topic (cf. a review by McDowell [18]).
3. Coupling of Large Strain Plasticity with New Failure Models
The notion here is to understand material behavior at scales ranging from the nano to the macro in such a manner as to be able to design new failure-resistant materials and systems. Current efforts in this area include the coupling of immobile and mobile dislocation densities to three-dimensional crystal plasticity formulations [19], the development of computational models accounting for grain-boundary effects and dislocation-density evolution effects on fracture void growth and coalescence [20, 21], and the effects of discrete dislocations [22] on material response. Others include distinct multi-scale aspects focusing on void nucleation [23] and temperature effects on void coalescence [24].
3.1. Hierarchical Modeling
Hierarchical modeling involves the coupling of different computational methods, such as finite-element methods at the continuum level with atomistics and molecular dynamics, to understand and control material behavior and physical mechanisms [25], so that appropriate computational tools can be used at the relevant spatial and temporal scales. This will require efficient parallel computations based on new algorithms and new constitutive models that incorporate dominant material features and failure mechanisms relevant to each physical scale.
3.2. Progressive Failure Surface Creation and Propagation
This area concerns the development of new models that result in the creation of failure surfaces and their progression. Most failure models are based on continuum models with pre-existing defects, such as voids and cracks. The ability to develop predictive tools based on the initiation and growth of defects at different scales, which are not dependent on a specific computational approach, is a gap that, if addressed, can result in new and revolutionary engineering applications and devices.
4. Summary
This chapter has been devoted mainly to an introduction to ductile failure at the macroscopic scale. It has been seen that as more complex problems, such as shear-strain localization, are investigated, new models based on crystal plasticity have been invaluable in providing new physical insights. Current and future research investigations are focused on accounting for failure initiation and growth at scales ranging from the nano to the macro. If these investigations are successful, then new materials and structures can be designed and tailored as needed from the lowest scales up. However, this can only be achieved if modeling efforts succeed in developing physically based, validated tools that can be used for material and system design. It should also be underscored that even though models at the micro to the nano levels are invaluable, these approaches have to be linked to continuum-level models that are more attuned to system and structural design.
References
[1] J.W. Hancock and A.C. Mackenzie, “On the mechanisms of ductile failure in high strength steels subjected to multi-axial stress states,” J. Mech. Phys. Solids, 24, 147–169, 1976.
[2] S.H. Goods and L.M. Brown, “The nucleation of cavities by plastic deformation,” Acta Metall., 27, 1–15, 1979.
[3] A.S. Argon, J. Im, and R. Safoglu, “Cavity formation from inclusions in ductile fracture,” Met. Trans., 6A, 825–837, 1975.
[4] F.A. McClintock, “A criterion for ductile fracture by the growth of holes,” J. Appl. Mech., 35, 363–371, 1968.
[5] F.A. McClintock, “Plasticity aspects of fracture,” In: H. Liebowitz (ed.), Fracture: An Advanced Treatise, vol. 3, Academic Press, New York, pp. 47–225, 1971.
[6] J.R. Rice and D.M. Tracey, “On the ductile enlargement of voids in triaxial stress fields,” J. Mech. Phys. Solids, 17, 201–217, 1969.
[7] C.A. Berg, “Plastic dilation and void interaction,” In: M.F. Kanninen (ed.), Inelastic Behaviour of Solids, vol. 3, McGraw Hill, New York, pp. 171–210, 1970.
[8] A.L. Gurson, “Continuum theory of ductile rupture by void nucleation and growth: part I. yield criteria and flow rules for porous ductile media,” J. Eng. Mater. Technol., 99, 2–15, 1977.
[9] V. Tvergaard, “On localization in ductile materials containing spherical voids,” Int. J. Fract., 18, 237–252, 1982.
[10] V. Tvergaard and A. Needleman, “Analysis of the cup-cone fracture in a round tensile bar,” Acta Metall., 32, 157–169, 1984.
[11] V. Tvergaard, “Material failure by void growth to coalescence,” Adv. Appl. Mech., 27, 83–151, 1990.
[12] C.C. Chu and A. Needleman, “Void nucleation effects in biaxially stretched sheets,” J. Eng. Mater. Technol., 102, 249–256, 1980.
[13] P.E. Magnusen, E.M. Dubensky, and D.A. Koss, “The effect of void arrays on void linking during ductile fracture,” Acta Metall., 36, 1503–1509, 1988.
[14] P.E. Magnusen, D.J. Srolovitz, and D.A. Koss, “A simulation of void linking during ductile microvoid fracture,” Acta Metall., 38, 1013–1022, 1990.
[15] R. Becker and R.E. Smelser, “Simulation of strain localization and fracture between holes in an aluminum sheet,” J. Mech. Phys. Solids, 42, 773–796, 1994.
[16] J. Christoffersen and J.W. Hutchinson, “A class of phenomenological corner theories of plasticity,” J. Mech. Phys. Solids, 27, 465–487, 1979.
[17] V. Tvergaard (Tech. Univ. of Denmark, Lyngby), Int. J. Fract., 17(4), 389–407, 1981.
[18] D.L. McDowell, “Modeling and experiments in plasticity,” Int. J. Solids Struct., 37, 293–309, 2000.
[19] Kameda and M.A. Zikry, “Three dimensional high strain rate failure evolution and triple junction grain-boundary effects in intermetallics,” Mech. Mater., 28, 93–102, 1998.
[20] W.M. Ashmawi and M.A. Zikry, “Prediction of grain-boundary interfacial effects and mechanisms in crystalline systems,” J. Eng. Mater. Technol., 124, 88–95, 2002.
[21] W.M. Ashmawi and M.A. Zikry, “Void morphology and grain-boundary effects in crystalline materials,” Mater. Sci. Eng. A, 343, 126–142, 2003.
[22] H.M. Zbib, M. Rhee, and J.P. Hirth, “On plastic deformation and the dynamics of 3D dislocations,” Int. J. Mech. Sci., 40(2–3), 113–127, 1997.
[23] M.F. Horstemeyer, M. Negrete, and S. Ramaswamy, “Using a micromechanical finite element parametric study to motivate a phenomenological macroscale model for void/crack nucleation in aluminum with a hard second phase,” Mech. Mater., 35, 675–687, 2003.
[24] M.F. Horstemeyer and S. Ramaswamy, “On factors affecting localization and void growth in ductile metals: a parametric study,” Int. J. Damage Mech., 9, 6–28, 2000.
[25] E.B. Tadmor, M. Ortiz, and R. Phillips, “Quasicontinuum analysis of defects in solids,” Phil. Mag. A, 73(6), 1529–1551, 1996.
3.8 CONTINUUM DAMAGE MECHANICS G.Z. Voyiadjis Louisiana State University, Baton Rouge, LA, USA
Continuum Damage Mechanics (CDM) can be thought of as a subset of ISV theory as described earlier. Introduced by Kachanov [1] and modified somewhat by Rabotnov [2], it has now reached a stage at which practical engineering problems can be solved. In contrast to fracture mechanics, which considers the process of initiation and growth of microcracks as a discontinuous phenomenon, continuum damage mechanics uses a continuous variable, φ, which is related to the density of these defects, to describe the deterioration of the material before the initiation of macrocracks. Based on the damage variable φ, constitutive equations of evolution are developed to predict the initiation of macrocracks for different types of phenomena. Lemaitre [3] and Chaboche [4] used it to solve different types of fatigue problems. Leckie and Hayhurst [5], Hult [6], and Lemaitre and Chaboche [7] used it to solve creep and creep-fatigue interaction problems. It was also used by Lemaitre for ductile plastic fracture [8–10] and for a number of other applications [11]. The damage variable, based on the effective stress concept, represents average material degradation, which reflects the various types of damage at the mesoscale level, such as nucleation and growth of voids, cavities, microcracks, and other microscopic defects. For the case of isotropic damage, the damage variable is scalar and the evolution equations are easy to handle. Lemaitre [11] argued that the assumption of isotropic damage is sufficient to give accurate predictions of the load-carrying capacity and the number of cycles or the time to local failure in structural components. An extension of this theory is the incorporation of anisotropic damage and plasticity, which have been confirmed experimentally [12–15] even when the virgin material is isotropic. This has prompted several researchers to investigate the general case of anisotropic damage.
S. Yip (ed.), Handbook of Materials Modeling, 1183–1192. © 2005 Springer. Printed in the Netherlands.
The theory of anisotropic damage mechanics was developed by Sidoroff [16], Cordebois and Sidoroff [17], and Cordebois [18], and later used by Lee et al. [15], Chow and Wang [13, 14, 19], and Voyiadjis and Park [20] to solve simple ductile fracture problems. Prior to these developments, Krajcinovic and Fonseka [21], Murakami and Ohno [22], Murakami [23], and Krajcinovic [24] investigated brittle and creep fracture using appropriate anisotropic damage models. Although these models are based on a sound physical background, they lack rigorous mathematical justification and mechanical consistency. Consequently, more work is needed to develop a more involved theory capable of producing results that can be used for practical applications [21–25]. In the general case of anisotropic damage, the damage variable has been shown to be tensorial in nature [22–26]. This damage tensor was shown to be an irreducible even-rank tensor [27]. Several other basic properties of the damage tensor have been outlined by Betten [28, 29] in a rigorous mathematical treatment using the theory of tensor functions. Lemaitre [30] summarized the work done in the last fifteen years to describe crack behavior using the theory of continuum damage mechanics. Also, Lemaitre and Dufailly [31] described eight different experimental methods (direct and indirect) to measure damage according to the effective stress concept [32]. Chaboche [33–35] described different definitions of the damage variable based on indirect measurement procedures. Examples of these are damage variables based on the remaining life, the microstructure, and several physical parameters such as density change, resistivity change, acoustic emissions, the change in fatigue limit, and the change in mechanical behavior through the concept of effective stress.
1. Damage in Metals Due to Uniaxial Loads
We now turn to the various assumptions and the equivalence principle for CDM. This is followed by the derivation of the damage evolution equations. Finally, a new section is added on the separation of damage due to cracks and voids in metals. All the theory and derivations are based on the uniaxial tension test. Therefore, isotropic damage is assumed and all the equations employ scalar variables.
2. Principles of Continuum Damage Mechanics
The limitations of classical fracture mechanics have been outlined recently by Lemaitre [30]. Parameters like the J-Integral and crack opening displacement (COD) are difficult to use in cases of large strain plasticity, time-dependent behavior, crack evolution for non-proportional loading, and delamination of composites. Murakami [36] indicated that proper understanding and corresponding mechanical description of damage progression arising from internal
defects is vital. A systematic approach to these problems of distributed defects can be provided by CDM. The fundamental notion of this theory is to represent the damage state of materials characterized by distributed cavities in terms of appropriate internal state variables and rate equations. Lemaitre [11] indicated that damage in metals is mainly the process of the initiation and growth of microcracks and cavities. At the mesoscale, the phenomenon is discontinuous. At the macroscale, however, the damage variable smears out the response in a continuous smooth fashion and is written in terms of stress or strain. This function can still predict the initiation and growth of damage, but in a macroscopic sense. These constitutive equations have been formulated in the framework of thermodynamics and identified for many phenomena: dissipation and low-cycle fatigue in metals [3], coupling between damage and creep [5, 6], high-cycle fatigue [4], creep-fatigue interaction [7], and ductile plastic damage [8, 37, 38]. In CDM, a crack is considered to be a zone (process zone) of high gradients of rigidity and strength that has reached critical damage conditions. Thus, a major advantage of CDM is that it utilizes a local approach and introduces a continuous damage variable in the process zone, while classical fracture mechanics uses more global concepts like the J-Integral and COD. Kachanov [1] introduced the idea of damage in the framework of continuum mechanics. In a damaged body, consider a volume element at the macroscale that is of a size large enough to contain many defects and small enough to be considered as a material point of a continuum. For the case of isotropic damage and using the concept of effective stress (because of its suitability for continuum mechanics), the damage variable $\phi$ is defined as a scalar in the following manner,
$$\phi = \frac{A - \bar{A}}{A}, \tag{1}$$
where $\bar{A}$ is the effective (net) resisting area corresponding to the damaged area $A$. The effective area $\bar{A}$ is obtained from $A$ by removing the surface intersections of the microcracks and cavities and correcting for the micro-stress concentrations in the vicinity of discontinuities and for the interactions between closed defects. The expression given in Eq. (1) implies that $\phi = 0$ corresponds to the undamaged state, and $\phi = \phi_{cr}$ is a critical value which corresponds to the rupture of the element in two parts. According to Lemaitre [11], the critical value of the damage variable lies in the range $0.2 \le \phi_{cr} \le 0.8$ for metals. In general, the theoretical value of $\phi$ should lie in the range $0 \le \phi \le 1$. Equation (1) can be rewritten in a more suitable form as follows,
$$\bar{A} = (1 - \phi)\,A. \tag{2}$$
The cross-sectional areas $A$ and $\bar{A}$ are shown in Fig. 1 on a cylindrical material element in the damaged and effective states, respectively.
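[Figure 1: a cylindrical element under the tensile force $T = \sigma A$ in the damaged state (area $A$, $0 \le \phi \le 1$) and under the same force $T = \bar{\sigma}\bar{A}$ in the equivalent fictitious undamaged state (area $\bar{A}$, $\phi = 0$).]
Figure 1. Isotropic damage in uniaxial tension (concept of effective stress).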
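Equations (1) and (2) can be exercised with a short numerical sketch. The function names and the area values below are hypothetical, chosen only to illustrate the definitions; they are not taken from the chapter.

```python
def damage_from_areas(area, net_area):
    """Scalar damage phi = (A - A_bar) / A, Eq. (1)."""
    if area <= 0.0:
        raise ValueError("area must be positive")
    if not 0.0 <= net_area <= area:
        raise ValueError("net area must lie between 0 and the damaged area")
    return (area - net_area) / area


def effective_area(area, phi):
    """Effective (net) resisting area A_bar = (1 - phi) * A, Eq. (2)."""
    return (1.0 - phi) * area


A = 100.0      # hypothetical cross-sectional area of the damaged element, mm^2
A_bar = 70.0   # hypothetical net resisting area, mm^2

phi = damage_from_areas(A, A_bar)
print("phi =", phi)
print("A_bar recovered from Eq. (2) =", effective_area(A, phi))
```

With these values the element has lost 30% of its load-carrying area, which falls inside the range 0.2 to 0.8 quoted above for the critical damage of metals.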
3. Assumptions and the Equivalence Hypothesis
Stress, energy, or strain equivalence can be used in CDM. When the hypothesis of strain equivalence [9, 11] is not assumed, the effective resisting area $\bar{A}$ can be calculated through mathematical homogenization techniques [39], but the shape and size of the defects must be known, which is somewhat difficult, even with an electron microscope. To avoid this difficulty, the hypothesis of strain equivalence is made [40]. This hypothesis states that “every strain behavior of a damaged material is represented by constitutive equations of the undamaged material in the potential of which the stress is simply replaced by the effective stress.” The effective stress $\bar{\sigma}$ is defined as the stress in the effective (undamaged) state. Considering Fig. 1, the effective stress $\bar{\sigma}$ can be obtained from Eq. (2) by equating the force $T = \sigma A$ acting on the damaged area $A$ with the force $T = \bar{\sigma}\bar{A}$ acting on the hypothetical undamaged area $\bar{A}$, i.e.,
$$\sigma A = \bar{\sigma}\bar{A}, \tag{3}$$
where $\sigma$ is the Cauchy stress acting on the damaged area $A$. From Eqs. (2) and (3), we can obtain the following expression for the effective Cauchy stress $\bar{\sigma}$,
$$\bar{\sigma} = \frac{\sigma}{1 - \phi}. \tag{4}$$
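A brief sketch of Eq. (4) shows how quickly the effective stress grows with damage. The stress level and damage values are hypothetical, and `effective_stress` is an illustrative helper name rather than anything defined in the chapter.

```python
def effective_stress(sigma, phi):
    """Effective Cauchy stress sigma_bar = sigma / (1 - phi), Eq. (4)."""
    if not 0.0 <= phi < 1.0:
        raise ValueError("phi must satisfy 0 <= phi < 1")
    return sigma / (1.0 - phi)


sigma = 200.0  # hypothetical Cauchy stress on the damaged area, MPa
A = 100.0      # hypothetical damaged area, mm^2

for phi in (0.0, 0.2, 0.5, 0.8):
    sigma_bar = effective_stress(sigma, phi)
    A_bar = (1.0 - phi) * A                  # Eq. (2)
    # Force balance of Eq. (3): both products are the same force T.
    print(phi, sigma_bar, sigma * A, sigma_bar * A_bar)
```

The last two printed columns agree for every damage level, which is the force balance of Eq. (3) from which Eq. (4) follows.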
The effective stress $\bar{\sigma}$ can be considered as a fictitious stress acting on an undamaged equivalent (fictitious) area $\bar{A}$ (net resisting area). For the uniaxial tension case shown in Fig. 1, the constitutive relation in Hooke’s law of linear elasticity is given by
$$\sigma = E\varepsilon, \tag{5}$$
where $\varepsilon$ is the strain and $E$ is the modulus of elasticity (Young’s modulus). The same linear elastic constitutive relation applies to the effective (undamaged) state, i.e.,
$$\bar{\sigma} = \bar{E}\bar{\varepsilon}, \tag{6}$$
where $\bar{\varepsilon}$ and $\bar{E}$ are the effective counterparts of $\varepsilon$ and $E$, respectively. Next, we will derive the necessary transformation equations between the damaged and the hypothetical undamaged states of the material. In the derivation, the following assumptions are incorporated: (1) the elastic deformations are small (infinitesimal) compared with the plastic deformations (finite), and (2) there exists an elastic strain energy scalar function $U$. This function is assumed based on the linear relation between the Cauchy stress $\sigma$ and the engineering elastic strain $\varepsilon$ given by Eq. (5). The elastic strain energy function $U$ is defined by
$$U = \tfrac{1}{2}\sigma\varepsilon. \tag{7}$$
It is clear from Eqs. (5) and (7) that $\sigma = dU/d\varepsilon$ and $\varepsilon = dU/d\sigma$. Sidoroff [16] proposed the hypothesis of elastic energy equivalence. This latter hypothesis assumes that “the elastic energy for a damaged material is equivalent in form to that of the undamaged (effective) material except that the stress is replaced by the effective stress in the energy formulation.” Thus, according to this hypothesis, the elastic strain energy $U = \tfrac{1}{2}\sigma\varepsilon$ is equated to the effective elastic strain energy $\bar{U} = \tfrac{1}{2}\bar{\sigma}\bar{\varepsilon}$ as follows
$$\tfrac{1}{2}\sigma\varepsilon = \tfrac{1}{2}\bar{\sigma}\bar{\varepsilon}. \tag{8}$$
Substituting Eq. (4) into Eq. (8) and simplifying, we obtain the following relation between the strain $\varepsilon$ and the effective strain $\bar{\varepsilon}$
$$\bar{\varepsilon} = (1 - \phi)\,\varepsilon. \tag{9}$$
Continuing further, we substitute Eqs. (4) and (9) into Eq. (6), simplify the result and compare it with Eq. (5) to obtain
$$E = \bar{E}\,(1 - \phi)^2. \tag{10}$$
Equation (10) represents the transformation law for the modulus of elasticity. It is clear now that Young’s modulus for the damaged material depends on the value of the damage variable $\phi$. Solving Eq. (10) for $\phi$, one obtains
$$\phi = 1 - \sqrt{\frac{E}{\bar{E}}}. \tag{11}$$
Once the values of $E$ and $\bar{E}$ are measured experimentally, one can use Eq. (11) to obtain values of the damage variable $\phi$. It should be noted that the value of $\bar{E}$ is constant for the effective (undamaged) material.
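The use of Eq. (11) to extract damage from measured moduli can be sketched as follows. The modulus values are hypothetical, not experimental data, and the function names are illustrative; the loop also checks Eqs. (6) and (8) numerically for an arbitrary strain level.

```python
import math


def damaged_modulus(E_bar, phi):
    """Modulus of the damaged material, E = E_bar * (1 - phi)**2, Eq. (10)."""
    return E_bar * (1.0 - phi) ** 2


def damage_from_modulus(E, E_bar):
    """Damage variable phi = 1 - sqrt(E / E_bar), Eq. (11)."""
    if E_bar <= 0.0 or not 0.0 <= E <= E_bar:
        raise ValueError("need E_bar > 0 and 0 <= E <= E_bar")
    return 1.0 - math.sqrt(E / E_bar)


E_bar = 200.0e3                                    # hypothetical undamaged modulus, MPa
measured_E = [200.0e3, 180.0e3, 162.0e3, 128.0e3]  # hypothetical damaged moduli, MPa
eps = 1.0e-3                                       # arbitrary elastic strain level

for E in measured_E:
    phi = damage_from_modulus(E, E_bar)
    sigma = E * eps                    # Eq. (5)
    sigma_bar = sigma / (1.0 - phi)    # Eq. (4)
    eps_bar = (1.0 - phi) * eps        # Eq. (9)
    # Elastic energy equivalence, Eq. (8), and effective Hooke's law, Eq. (6).
    assert math.isclose(sigma * eps, sigma_bar * eps_bar)
    assert math.isclose(sigma_bar, E_bar * eps_bar)
    print(E, round(phi, 4))
```

Because of the square in Eq. (10), a 36% drop in modulus corresponds to a damage of only 0.2 under the energy equivalence hypothesis.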
4. Damage Evolution
There are several approaches in the literature on the topic of damage evolution and the proper form of the kinetic equation of the damage variable. Kachanov [32] proposed an evolution equation of damage based on a power law with two independent material constants. However, the resulting kinetic equation for the damage variable evolution is complicated and difficult to solve. Therefore, a more rational approach based on energy considerations is presented here. The approach will depend on the introduction of a damage strengthening criterion in terms of a scalar function $g$, and a generalized thermodynamic force that corresponds to the damage variable $\phi$ [9, 15]. Substituting Eqs. (6) and (9) into the right-hand side of Eq. (8), we obtain the elastic strain energy $U$ in the damaged state of the material as follows,
$$U = \tfrac{1}{2}\bar{E}\,(1 - \phi)^2\,\varepsilon^2, \tag{12}$$
in which $\bar{E}$ is constant. Therefore, the incremental elastic strain energy $dU$ is obtained by differentiating Eq. (12),
$$dU = \bar{E}\,(1 - \phi)^2\,\varepsilon\,d\varepsilon - \bar{E}\,(1 - \phi)\,\varepsilon^2\,d\phi. \tag{13}$$
The generalized thermodynamic force $y$ associated with the damage variable $\phi$ is thus defined by
$$y \equiv \frac{\partial U}{\partial \phi} = -\bar{E}\,(1 - \phi)\,\varepsilon^2. \tag{14}$$
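Equation (14) can be checked against Eq. (12) by numerical differentiation. This is a minimal sketch with hypothetical material values; the function names are illustrative only.

```python
def strain_energy(E_bar, phi, eps):
    """Elastic strain energy U = 0.5 * E_bar * (1 - phi)**2 * eps**2, Eq. (12)."""
    return 0.5 * E_bar * (1.0 - phi) ** 2 * eps ** 2


def damage_force(E_bar, phi, eps):
    """Generalized force y = dU/dphi = -E_bar * (1 - phi) * eps**2, Eq. (14)."""
    return -E_bar * (1.0 - phi) * eps ** 2


# Hypothetical state: modulus in MPa, dimensionless damage and strain.
E_bar, phi, eps = 200.0e3, 0.15, 2.0e-3

# Central finite difference of U with respect to phi at fixed strain.
h = 1.0e-6
y_numeric = (strain_energy(E_bar, phi + h, eps)
             - strain_energy(E_bar, phi - h, eps)) / (2.0 * h)
y_exact = damage_force(E_bar, phi, eps)
print(y_exact, y_numeric)
```

The force is non-positive for any admissible state, since increasing damage at fixed strain lowers the stored elastic energy.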
Let $g(y, L)$ be the damage function (criterion) as proposed by Lee et al. [15], where $L \equiv L(l)$ is a damage strengthening parameter which is a function of the “overall” damage parameter $l$. For this problem, the scalar function $g$ takes the following form
$$g(y, L) = \tfrac{1}{2}y^2 - L(l) \equiv 0. \tag{15}$$
The damage strengthening criterion defined by Eq. (15) is similar to the von Mises yield criterion in the theory of plasticity. In order to derive a normality rule for the evolution of damage, we first start with the power of dissipation $\Pi$, which is given by
$$\Pi = -y\,d\phi - L\,dl, \tag{16}$$
where the “d” in front of a variable indicates the incremental quantity of the variable. The problem is to extremize $\Pi$ subject to the condition $g = 0$. Using the mathematical theory of functions of several variables, we introduce the Lagrange multiplier $d\lambda$ and form the objective function $\Psi(y, L)$ such that
$$\Psi = \Pi - d\lambda \cdot g. \tag{17}$$
The problem now reduces to extremizing the function $\Psi$. For this purpose, the two necessary conditions are $\partial\Psi/\partial y = 0$ and $\partial\Psi/\partial L = 0$. Using these conditions, along with Eqs. (16) and (17), one obtains the following
$$d\phi = -d\lambda\,\frac{\partial g}{\partial y}, \tag{18a}$$
$$dl = -d\lambda\,\frac{\partial g}{\partial L}. \tag{18b}$$
Substituting for $g$ from Eq. (15) into Eq. (18b), one concludes directly that $d\lambda = dl$. Substituting this into Eq. (18a), along with Eq. (15), we obtain
$$d\phi = -d\lambda \cdot y. \tag{19}$$
In order to solve the differential Eq. (19), we must first find an expression for the Lagrange multiplier $d\lambda$. This can be done by invoking the consistency condition $dg = 0$. Applying this condition to Eq. (15), we obtain
$$\frac{\partial g}{\partial y}\,dy + \frac{\partial g}{\partial L}\,dL = 0. \tag{20}$$
Substituting for $\partial g/\partial y$ and $\partial g/\partial L$ from Eq. (15) and for $dL = dl\,(\partial L/\partial l)$, from the chain rule of differentiation, and solving for $dl$, we obtain
$$dl = d\lambda = \frac{y\,dy}{\partial L/\partial l}. \tag{21}$$
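The step from Eq. (20) to Eq. (21) can be written out explicitly. The lines below are only a worked restatement of the substitution described in the text, using $\partial g/\partial y = y$ and $\partial g/\partial L = -1$ from Eq. (15):

```latex
\begin{aligned}
\frac{\partial g}{\partial y}\,dy + \frac{\partial g}{\partial L}\,dL
  &= y\,dy - dL
   = y\,dy - \frac{\partial L}{\partial l}\,dl = 0, \\
\Rightarrow\quad dl &= \frac{y\,dy}{\partial L/\partial l},
\qquad\text{and } d\lambda = dl \text{ from Eq. (18b).}
\end{aligned}
```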
Substituting the above expression of $d\lambda$ into Eq. (19), we obtain the kinetic (evolution) equation of damage,
$$\frac{\partial L}{\partial l}\,d\phi = -y^2\,dy \tag{22}$$
with the initial condition that $\phi = 0$ when $y = 0$. The solution of Eq. (22) depends on the form of the function $L(l)$. For simplicity, we may consider a linear function of the form $L(l) = cl + d$, where $c$ and $d$ are constants. The equivalent damage strengthening parameter can be analogously expressed as $\sqrt{dl \cdot dl}$ or simply $dl$, thereby giving a linear function in $l$ as discussed above. Substituting this into Eq. (22) and integrating, we obtain the following relation
between the damage variable $\phi$ and its associated generalized thermodynamic force $y$,
$$\phi = -\frac{y^3}{3c}. \tag{23}$$
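The above relation is shown graphically in Fig. 2, where it is clear that $\phi$ is a monotonic function of $y$; since $y \le 0$ by Eq. (14), $\phi$ increases with the magnitude of $y$ when $c > 0$. Next, we investigate the strain-damage relationship. Differentiating the expression of $y$ in Eq. (14), we obtain
$$dy = \bar{E}\,\varepsilon\,[\varepsilon\,d\phi - 2\,d\varepsilon\,(1 - \phi)]. \tag{24}$$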
Substituting the expressions of $y$ and $dy$ of Eqs. (14) and (24), respectively, into Eq. (22), we obtain the strain-damage differential equation [38]
$$\frac{\partial L}{\partial l}\,d\phi = \bar{E}^3\,\varepsilon^5\,(1 - \phi)^2\,[2\,d\varepsilon\,(1 - \phi) - \varepsilon\,d\phi]. \tag{25}$$
The above differential equation can be solved easily by the simple change of variables $x = \varepsilon^2(1 - \phi)$ and noting that the expression on the right-hand side of Eq. (25) is nothing but $\bar{E}^3 x^2\,dx$. Performing the integration with the initial condition that $\phi = 0$ when $\varepsilon = 0$ along with the linear expression of $L(l)$, we obtain
$$\frac{\phi}{(1 - \phi)^3} = \frac{\bar{E}^3}{3c}\,\varepsilon^6. \tag{26}$$
[Figure 2: plot labeled “Cubic function” with axes $y_1$ and $\phi_1$.]
Figure 2. Relation between the overall damage variable $\phi_1$ and its associated generalized force $y_1$.
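The cubic relation of Eq. (23) can be checked by integrating the kinetic Eq. (22) numerically for the linear law $L(l) = cl + d$, for which $\partial L/\partial l = c$. This is a minimal sketch: the value of `c`, the end value of `y`, and the midpoint-rule integrator are assumptions made for illustration.

```python
def integrate_damage(y_end, c, steps=10_000):
    """Integrate c * dphi = -y**2 * dy, Eq. (22), from y = 0 (phi = 0) to y_end.

    Uses the midpoint rule on a uniform grid in y.
    """
    dy = y_end / steps
    phi = 0.0
    for i in range(steps):
        y_mid = (i + 0.5) * dy
        phi += -(y_mid ** 2) * dy / c
    return phi


def damage_closed_form(y, c):
    """Closed-form solution phi = -y**3 / (3 c), Eq. (23)."""
    return -y ** 3 / (3.0 * c)


c = 1.0        # hypothetical slope of L(l) = c*l + d
y_end = -0.9   # hypothetical force value; y <= 0 by Eq. (14)

print(integrate_damage(y_end, c), damage_closed_form(y_end, c))
```

The constant `d` of the linear law does not appear because only the slope $\partial L/\partial l$ enters Eq. (22).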
One should note that an initial condition involving an initial damage variable $\phi^{\circ}$ could have been used, i.e., $\phi = \phi^{\circ}$ when $\varepsilon = 0$. In addition, the strain-damage relation of Eq. (26) could easily have been obtained by substituting the expression of $y$ of Eq. (14) directly into Eq. (23). However, it is preferable to derive it directly from the strain-damage differential Eq. (25) without the use of the generalized thermodynamic force $y$.
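Because Eq. (26) is implicit in $\phi$, a numerical root finder is needed to obtain the damage at a given strain. The sketch below uses bisection and then confirms the remark above that Eqs. (14) and (23) give the same result. The values of `E_bar` and `c` are hypothetical, and the choice of bisection is an assumption of this example, not a method prescribed in the chapter.

```python
def damage_from_strain(eps, E_bar, c, tol=1e-12):
    """Solve phi / (1 - phi)**3 = E_bar**3 * eps**6 / (3 c), Eq. (26), for phi."""
    rhs = E_bar ** 3 * eps ** 6 / (3.0 * c)
    lo, hi = 0.0, 1.0
    # h(phi) = phi - rhs * (1 - phi)**3 increases monotonically on [0, 1],
    # with h(0) <= 0 and h(1) = 1, so bisection brackets a single root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - rhs * (1.0 - mid) ** 3 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def damage_force(E_bar, phi, eps):
    """Generalized force y = -E_bar * (1 - phi) * eps**2, Eq. (14)."""
    return -E_bar * (1.0 - phi) * eps ** 2


E_bar = 200.0e3  # hypothetical undamaged modulus, MPa
c = 1.0e4        # hypothetical slope of L(l), units of MPa^3

for eps in (0.0, 0.005, 0.01, 0.02):
    phi = damage_from_strain(eps, E_bar, c)
    y = damage_force(E_bar, phi, eps)
    phi_from_y = -y ** 3 / (3.0 * c)   # Eq. (23)
    print(eps, round(phi, 6), round(phi_from_y, 6))
```

The two damage columns coincide, and the damage rises steeply with strain because of the sixth power in Eq. (26).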
References
[1] L.M. Kachanov, “On the creep fracture time,” Izv. Akad. Nauk USSR Otd. Tekh., 8, 26–31 (in Russian), 1958.
[2] Y.N. Rabotnov, “Creep problems of structural members,” North-Holland, Amsterdam, 1969.
[3] J. Lemaitre, “Evaluation of dissipation and damage in metals submitted to dynamic loading,” ICM 1, Kyoto, Japan, 1971.
[4] J.L. Chaboche, “Une Loi Différentielle d’Endommagement de Fatigue avec Cumulation Non-Linéaire,” Rev. Française Mécanique, No. 50–51 (in French), 1974.
[5] F.A. Leckie and D. Hayhurst, “Creep rupture of structures,” Proceedings of the Royal Society, London, A340, 323–347, 1974.
[6] J. Hult, “Creep in continua and structures,” In: Zeman and Ziegler (eds.), Topics in Applied Continuum Mechanics, 137, Springer, NY, 1974.
[7] J. Lemaitre and J.L. Chaboche, “A nonlinear model of creep fatigue cumulation and interaction,” Proceedings of IUTAM, Symposium on Mechanics of Viscoelastic Media and Bodies, 291–301, 1975.
[8] J. Lemaitre and J. Dufailly, “Modélisation et Identification de l’Endommagement Plastique de Métaux,” 3ème Congrès Français de Mécanique, Grenoble, 1977.
[9] J. Lemaitre, “A continuous damage mechanics model for ductile fracture,” J. Eng. Mater. Technol., 107, 83–89, 1985.
[10] G.Z. Voyiadjis, “Model of inelastic behavior coupled to damage,” In: J. Lemaitre (ed.), Handbook of Materials Behavior Models, Chapter 9, Section 9.4, Academic Press, New York, 814–820, 2001.
[11] J. Lemaitre, “How to use damage mechanics,” Nuc. Eng. Des., 80, 233–245, 1984.
[12] D.R. Hayhurst, “Creep rupture under multiaxial states of stress,” J. Mech. Phys. Solids, 20, 381–390, 1972.
[13] C.L. Chow and J. Wang, “An anisotropic theory of continuum damage mechanics for ductile fracture,” Eng. Fract. Mech., 27, 547–558, 1987.
[14] C.L. Chow and J. Wang, “An anisotropic theory of elasticity for continuum damage mechanics,” Int. J. Fract., 33, 3–16, 1987.
[15] H. Lee, K. Peng, and J. Wang, “An anisotropic damage criterion for deformation instability and its application to forming limit analysis of metal plates,” Eng. Fract. Mech., 21, 1031–1054, 1985.
[16] F. Sidoroff, “Description of anisotropic damage application to elasticity,” In: IUTAM Colloquium on Physical Nonlinearities in Structural Analysis, pp. 237–244, Springer-Verlag, Berlin, 1981.
[17] J.P. Cordebois and F. Sidoroff, “Damage induced elastic anisotropy,” Colloque Euromech, 115, Villard de Lans, 1979.
[18] J.P. Cordebois, “Critères d’Instabilité Plastique et Endommagement Ductile en Grandes Déformations,” Thèse de Doctorat, Présentée à l’Université Pierre et Marie Curie, 1983.
[19] C.L. Chow and J. Wang, “Ductile fracture characterization with an anisotropic continuum damage theory,” Eng. Fract. Mech., 30, 547–563, 1988.
[20] G.Z. Voyiadjis and T. Park, “Anisotropic damage for the characterization of the onset of macrocrack initiation in metals,” Int. J. Damage Mech., 5(1), 68–92, 1996.
[21] D. Krajcinovic and G.U. Fonseka, “The continuum damage theory for brittle materials,” J. Appl. Mech., 48, 809–824, 1981.
[22] S. Murakami and N. Ohno, “A continuum theory of creep and creep damage,” In: Proceedings of Third IUTAM Symposium on Creep in Structures, pp. 422–444, Springer, Berlin, 1981.
[23] S. Murakami, “Notion of continuum damage mechanics and its application to anisotropic creep damage theory,” J. Eng. Mater. Technol., 105, 99–105, 1983.
[24] D. Krajcinovic, “Constitutive equations for damaging materials,” J. Appl. Mech., 50, 355–360, 1983.
[25] E. Krempl, “On the identification problem in materials deformation modeling,” Euromech, 147, on Damage Mechanics, Cachan, France, 1981.
[26] F.A. Leckie and E.T. Onat, “Tensorial nature of damage measuring internal variables,” In: IUTAM Colloquium on Physical Nonlinearities in Structural Analysis, 140–155, Springer-Verlag, Berlin, 1981.
[27] E.T. Onat and F.A. Leckie, “Representation of mechanical behavior in the presence of changing internal structure,” J. Appl. Mech., 55, 1–10, 1988.
[28] J. Betten, “Damage tensors in continuum mechanics,” J. Mécanique Théorique et Appliquée, 2, 13–32, Presented at Euromech Colloquium 147 on Damage Mechanics, Paris-VI, Cachan, 22 September, 1981.
[29] J. Betten, “Applications of tensor functions to the formulation of constitutive equations involving damage and initial anisotropy,” Eng. Fract. Mech., 25, 573–584, 1986.
[30] J. Lemaitre, “Local approach of fracture,” Eng. Fract. Mech., 25(5/6), 523–537, 1986.
[31] J. Lemaitre and J. Dufailly, “Damage measurements,” Eng. Fract. Mech., 28(5/6), 643–661, 1987.
[32] L.M. Kachanov, “Introduction to continuum damage mechanics,” Martinus Nijhoff Publishers, The Netherlands, 1986.
[33] J.L. Chaboche, “Continuum damage mechanics: present state and future trends,” International Seminar on Local Approach of Fracture, Moret-sur-Loing, France, 1986.
[34] J.L. Chaboche, “Continuum damage mechanics: part I – general concepts,” J. Appl. Mech., 55, 59–64, 1988a.
[35] J.L. Chaboche, “Continuum damage mechanics: part II – damage growth, crack initiation and crack growth,” J. Appl. Mech., 55, 65–72, 1988b.
[36] S. Murakami, “Mechanical modeling of material damage,” J. Appl. Mech., 55, 280–286, 1988.
[37] G.Z. Voyiadjis and P.I. Kattan, “A plasticity-damage theory for large deformation of solids, part I: theoretical formulation,” Int. J. Eng. Sci., 30(9), 1089–1108, 1992.
[38] G.Z. Voyiadjis and P.I. Kattan, “Advances in damage mechanics: metals and metal matrix composites,” 542 p., Elsevier, Oxford, 1999.
[39] P. Suquet, “Plasticité et Homogénéisation,” Thèse d’État, Université Paris 6, 1982.
[40] J. Lemaitre and J.L. Chaboche, “Aspect Phénoménologique de la Rupture par Endommagement,” J. Méc. Appl., 2, 317–365, 1978.
3.9 MICROSTRUCTURE-SENSITIVE COMPUTATIONAL FATIGUE ANALYSIS
D.L. McDowell
Georgia Institute of Technology, Atlanta, GA, USA
Previous sections focused on plasticity and damage formation and evolution under monotonic loading. Under cyclic loading, fatigue failure is a significant consideration. Historically, simple macroscopic fatigue correlations have proven quite useful in estimating fatigue crack initiation life of metallic components, based on measured or calculated stresses and strains at notches in components. The application of stress-based criteria for high cycle fatigue (HCF) or plastic strain-based criteria for low cycle fatigue (LCF) is typically based on transfer of results from tests on relatively small scale notched and unnotched laboratory specimens to larger components (cf. [1]). At the macroscale, fatigue of ductile materials has many common characteristics among alloy systems, leading to the utility of strain-life criteria. At the level of microstructure, fatigue is a complex, cycle-dependent process that differs in detail from one alloy system to the next. Often it is desired to understand mean fatigue resistance and scatter in fatigue as a function of microstructure in order to tailor microstructure to improve component level fatigue resistance. To this end, extension of fatigue analysis methods to microstructures is necessary as a means to augment experiments and reduce the number required.

The process of fatigue failure under constant amplitude loading typically includes several stages:
a. cyclic plastic deformation, with formation of stable cyclic dislocation substructures that control intensity of notch root cyclic plasticity;
b. formation of crack embryos at the interface of regions of intensely localized shear (so-called persistent slip bands) and surrounding matrix, typically referred to as “nucleation”;
c. sharpening of the crack front and onset of propagation, assisted by slip irreversibility and damage mechanisms ahead of the tip;
S. Yip (ed.), Handbook of Materials Modeling, 1193–1214. © 2005 Springer. Printed in the Netherlands.
d. propagation beyond regions of stress concentration influenced by the notch root(s); and
e. propagation to specimen/component failure.

Cyclic plastic deformation behavior depends on a number of factors, including, among others:
• Slip planarity (affected by stacking fault energy and solid solution strengthening);
• Sizes and arrangement of multiple phases;
• Dendrite cell size (DCS), otherwise known as secondary dendrite arm spacing, and interdendritic morphology in cast alloys;
• Grain size, orientation and misorientation distribution, and morphology in wrought alloys; and
• Larger incoherent precipitates or inclusions.

The reader is referred to an excellent review of microstructural fatigue mechanisms in the monograph by Suresh [2]. Historically, much work has focused on observations of crack nucleation (incipient crack formation) under LCF loading conditions; this topic has received attention of recent modeling efforts [3, 4]. While there is much fundamental work to be done to understand fatigue crack formation in broad classes of engineering metals, much progress has been made in developing engineering methods and guidelines for fatigue life estimation and design. The resulting idealizations based on pure metals, both experimentally and theoretically, tend to support the empirical Coffin–Manson power law relation (cf. [1, 2]) for number of cycles to crack nucleation, i.e.,

$$\frac{\Delta\varepsilon^{p}}{2} = A\,(2N_{\mathrm{nuc}})^{c} \qquad (1)$$
where $\Delta\varepsilon^{p}$ is the range of applied plastic strain over some representative volume of material, and A and c are constants. Indeed such studies are indicative of a still-developing science base for understanding fatigue processes at increasingly fine scales and levels of detail. In practical alloy systems, however, the nucleation regime is often either bypassed or coupled with debonding or fracture of interfaces between inclusions and matrix or at grain boundaries, or crack formation at existing surface scratches, machining marks, or at near-surface pores or inclusions. The problem then focuses on the formation of small cracks at micronotches that subsequently propagate as microstructurally small cracks. Eventually, these cracks grow until they are sufficiently long compared to microstructure scales to facilitate the assumption of propagation in a homogeneous material; for many alloy systems under low amplitude HCF conditions, such cracks must be 500–1000 µm long.
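As a minimal illustration of how Eq. (1) is used in practice, the sketch below inverts the Coffin–Manson relation for the number of cycles to nucleation. The function name and the values of A and c in the usage line are hypothetical placeholders chosen only for illustration; they are not fitted constants for any alloy discussed here.

```python
def cycles_to_nucleation(plastic_strain_range, A, c):
    """Invert Eq. (1): (delta_eps_p / 2) = A * (2 * N_nuc) ** c."""
    if plastic_strain_range <= 0.0:
        raise ValueError("plastic strain range must be positive")
    amplitude = 0.5 * plastic_strain_range      # delta_eps_p / 2
    reversals = (amplitude / A) ** (1.0 / c)    # 2 * N_nuc
    return 0.5 * reversals


# Hypothetical constants, for illustration only (not taken from the text).
n_nuc = cycles_to_nucleation(plastic_strain_range=0.01, A=0.5, c=-0.5)
```

With these placeholder values the plastic strain amplitude is 0.005 and the estimate is roughly 5000 cycles; because c is negative, halving the strain range increases the estimated life.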
In fatigue life estimation for structural components, notches play a central role in affecting concentration of local stress and cyclic plastic strain. A rather conventional state-of-the-art engineering approach to fatigue life estimation for components is summarized as follows [1]:
I. Conduct laboratory experiments or canvass fatigue data on the alloy system of interest to determine constants in strain-life relation.
II. Develop load history profile for mission/duty cycles.
III. Perform notch analysis either based on Neuber analysis or finite element analysis, and account for any notch size effects.
IV. Apply strain-life relations to estimate fatigue crack initiation life, including effects of multiaxial stress–strain states as appropriate (cf. [5, 6]) based on some cumulative damage analysis to estimate a micronotch root crack of transition crack length (cf. [7, 8]) or some characteristic dimension on the order of the notch root radius [9].
V. Apply LEFM-based crack propagation analysis to estimate crack propagation life, accounting for crack closure and load history effects as appropriate.

In this algorithm, the definition of fatigue crack initiation typically corresponds to a crack of length 500 µm to 1 mm in unnotched specimens used to determine constants in strain-life relations. An alternative approach used for fatigue-critical components is to ignore the initiation life and consider only propagation (so-called defect-tolerant approach), assuming that cracks of certain lengths pre-exist due to processing and/or handling. Then the goal is to determine the remaining propagation life for fatigue cracks based on measured initial crack size distributions or backward extrapolated values inferred from propagation relations.
This approach is more realistic for the LCF regime (shorter lives) in which crack propagation life dominates the number of cycles required for formation of small cracks at the notch root; however, it may be too conservative for most applications in the HCF regime since formation of small cracks and propagation within the microstructure may consume a significant fraction of the total fatigue lifetime.
1. Crack Formation in Microstructure-based Fatigue at Microstructural Notches
It is well known that formation of fatigue cracks in metals is dominantly related to the cyclic plastic shear strain (slip) range within the microstructure, as well as the local stress or strain state (cf. [10]). The magnitude of cyclic plastic strain within a heterogeneous microstructure varies spatially. Moreover, certain aspects of the microstructure dominate fatigue failure processes by virtue of lower resistance to crack formation, often associated with
enhanced localization of slip. In practical wrought alloy systems, it is often the case that second phase inclusions or impurities control fatigue resistance of a given microstructure, and there are significant effects of grain size, hardness, and other basic characteristics. In cast alloys, large particles and pores either within or near interdendritic boundaries often control fatigue resistance, with additional influence of secondary dendrite arm spacing, grain size, and distributed microporosity. To account for variation in fatigue behavior with regard to variation of microstructure in actual materials, it is useful to conduct deterministic analyses of a range of crack-starting defects or inclusions. Then, using the measured distributions of such inclusions it is possible to quantify the statistical distribution of fatigue responses, within the premises of certain simplified modeling assumptions. Such an approach seems much more satisfying for purposes of microstructure selection or design aimed at improving mean fatigue resistance as well as quantifying variability. In many industrial applications, minimum life design methods are employed for fatigue critical components, so prediction of scatter in fatigue is relevant.

For purposes of illustration of a computational-based microstructure-sensitive approach, we focus on effects of second phase inclusions. A range of microstructural parameters, applied loading conditions and material properties affect the local cyclic plastic strain localization in the neighborhood of an inclusion. Relevant parameters include:
• Elastic stiffness and strength of inclusions,
• Elastic and inelastic properties of the matrix,
• Geometric attributes of inclusions – sizes, shapes,
• Spatial distribution of the inclusions – nearest neighbor distance of large inclusions, including correlation of position with respect to grain boundaries and free surfaces,
• Crystallographic orientation of the grain in which the inclusion lies as well as misorientation with neighboring grain(s),
• Integrity of inclusion and matrix–inclusion interface (e.g., perfectly bonded, partially debonded, fully debonded or cracked inclusion), and
• Presence of denuded zones around inclusions or grain/dendrite boundaries.

In addition to these microstructural parameters, loading parameters such as the amplitude of the applied strain, the load ratio, and multi-axiality can each have a significant effect on the number of cycles necessary to form a crack at an inclusion. Sensitivity to microstructure features and loading parameters can be explored using computational methods that consider relevant length scales and mechanisms of crack formation and small crack propagation within the microstructure. Steps involved in an extension of the notch root approach to
estimation of fatigue resistance based on micronotch (e.g., inclusion) analysis can be suggested as follows:
I. Identify controlling microstructure features for crack formation and early growth.
II. Conduct numerical analyses (e.g., finite element) of cyclic loading of various microstructure/notch geometries for representative loading cases.
III. Build transfer functions for each inclusion/micronotch type (including interactions between closely spaced inclusions or with free surfaces) between macroscopic loading and average micronotch root cyclic plastic strain as a function of applied strain amplitude, mean stress or strain, and variable amplitude loading, as appropriate.
IV. Apply microstructure-scale crack formation/incubation relations of LCF type based on simple Coffin–Manson forms to model crack formation corresponding to a transition crack length that is suitable for application of crack propagation analysis.
V. Apply microstructurally and possibly physically small crack propagation relations for crack growth to a length considered representative of “crack initiation”, typically 500 µm to 1 mm.
VI. Calibrate constants of Coffin–Manson and small crack propagation relations to results for experimentally characterized microstructure(s) and then use these constants to predict results for other microstructures.
2. Application to Cast Al Alloys
An example of the foregoing microstructure-sensitive fatigue analysis scheme is presented next for inclusions in cast Al alloys (cf. [11]). The method can also be applied to fatigue crack formation at inclusions (or other comparable heterogeneities) in wrought alloys. Cast A356-T6 is dominantly an Al–Si alloy with a dendrite cell size ranging from 30 to 90 µm and eutectic, interdendritic regions decorated with Si particles in an Al-rich matrix. There is a hierarchy of scales of inclusions that affect total fatigue life, ranging from distributed gas microporosity and Si particles with diameters on the order of 3–15 µm, to high levels of microporosity with maximum pore diameters of about 60 µm to several hundred µm, to large shrinkage pores with diameters greater than several hundred µm, to large oxides introduced during casting, typically ranging in size from several hundred µm to millimeters. The larger inclusions among this hierarchy are increasingly detrimental to fatigue life for a given loading condition. To model their effect on fatigue resistance, it is necessary to treat the scales of inclusions and their features in a distinct manner, and to
consider their interactions through propagation/coalescence relations. In this way, the joint probability of finding inclusions from given populations within fatigue-critical highly stressed regions of components can be considered in a microstructure-sensitive fatigue design methodology.

Figure 1 schematically illustrates three distinct regions of the constant amplitude, completely reversed uniaxial strain- and stress-life plots for a low porosity cast A356-T6 alloy. In this plot, the length scale “D” pertains to the diameter of a typical Si particle or small gas pore within or near an interdendritic region. The length scale $\ell$ pertains to the size of the plastic zone at the notch root, defined as the scale over which the local plastic shear strain meets or exceeds some specified level, e.g., 0.01%. Effectively, $\ell$ gives a length scale over which local maximum cyclic plastic strain concentration is “substantial,” defined in an arbitrary but consistently applied manner. Here, we restrict the values of $\ell$ to lie in the range $0 \le \ell \le D$, such that we regard the case $\ell \to D$ as a limiting case of unconstrained plasticity associated with macroscopic yielding and macroscopic LCF. With increasing stress amplitude, $\ell > D$ and ultimately extensive plasticity sets in; this regime is of little practical interest for most applications. With the Si particle spacing being on the order of particle diameter in the eutectic regions and $\ell/D < 0.3$, the local plasticity at cracked/debonded particles or gas pores is confined to the vicinity of the inclusion and does not interact strongly with neighboring inclusions. We term this regime constrained microplasticity.

Interestingly, for pores or debonded Si particles, the value of $\ell/D = 0.3$ approximately corresponds to the macroscopic cyclic yield strength of A356-T6, so we can identify the transition from microplasticity to macroscopic plasticity (appearance of hysteresis in the remote or global cyclic stress–strain response) with that from constrained plasticity to unconstrained plasticity within the microstructure. We may regard $\ell/D = 0.3$ as a percolation limit for microplasticity through the microstructure. The regime of limit plasticity sets in as $\ell/D$ approaches unity, i.e., the plasticity becomes extensive at the macroscopic scale. Constrained microplasticity exists below the macroscopic yield point and leads to the formation and growth of small fatigue cracks. The applied uniaxial strain amplitude for the yield point of A356-T6 is approximately $\varepsilon_a \approx 0.0023$, which corresponds to the percolation limit for microplasticity within the eutectic regions ($\ell/D \approx 0.3$) and is the pertinent definition of the demarcation between LCF and HCF. At applied strain amplitudes well above the percolation limit, a condition of limit plasticity is reached for which the macroscopic average plastic strain amplitude approaches the order of the local plastic strain amplitude within the microstructure. Both eventually exceed the remote applied elastic strain amplitude at a remote applied strain amplitude of about 0.008. This distinction from the conventional definition of the HCF–LCF transition in wrought alloys, which assumes equality of elastic and plastic strain ranges $\Delta\varepsilon^{e} = \Delta\varepsilon^{p}$,
[Figure 1: schematic of the plastic zone of size ℓ at an inclusion of diameter D; plot of ℓ/D versus εa with the ℓ/D = 0.3 transition marked; log εa versus log NT and log σa versus log NT life curves, each divided into regimes A (limit plasticity, ℓ/D ≈ 1, LCF), B (unconstrained microplasticity), and C (constrained microplasticity, ℓ/D < 0.3, HCF).]

Figure 1. Regimes characterizing cyclic microplasticity at Si particles and casting pores [11]. (A) Elastic–plastic fracture mechanics propagation-dominated extensive remote plasticity, (B) LCF transition regime, (C) incubation-dominated HCF regime. Here, D is the inclusion diameter and ℓ is the characteristic size of the micronotch root cyclic plastic zone.
is significant. For A356-T6, the HCF region according to our definition is beyond about $5 \times 10^{4}$ cycles, whereas according to the conventional definition it is beyond only 100 cycles for this alloy. It is likely that high strength wrought alloys with fine scales of heterogeneous microstructure are subject to similar categorization, as their conventional transition fatigue lives are often on the order of tens or hundreds of cycles.
3. Crack Formation/Incubation Relations
The cyclic plastic shear strain range is central to the evaluation of LCF potency of inclusions. In local finite element analyses, the minimum mesh size serves as a lower bound for the domain over which the plasticity is averaged. The maximum micronotch plastic shear strain range is mesh-sensitive (it increases with a decrease in element size), and it is therefore necessary to introduce a non-local volume averaging procedure over integration points in the mesh to effectively remove mesh dependence. Moreover, such a procedure is entirely physically justified since we are interested in assessing the scale of the intense cyclic plastic zone relative to the micronotch dimension. The non-local average plastic shear strain associated with the θ-plane is calculated by averaging the plastic shear strain on the θ-plane over the area A of the micronotch root region, i.e.,

$$\gamma_{\theta}^{p*} = \frac{1}{A}\int_{A} \gamma_{\theta}^{p}\, dA \qquad (2)$$

The non-local cyclic plastic shear strain range for each plane is calculated using this expression based on the range over the third cycle of the simulation. The maximum of the range of $\gamma_{\theta}^{p*}$ among all planes is taken to be the non-local maximum cyclic plastic shear strain range, i.e.,

$$\beta \equiv \frac{\Delta\gamma_{\max}^{p*}}{2} = \frac{1}{2}\,\max_{\theta}\,\Delta\gamma_{\theta}^{p*} \qquad (3)$$
In 3D simulations, a volume V would be used instead of the in-plane area A in the averaging process of Eqs. (2) and (3). Similar studies were carried out to examine the effects of nearest neighbor distance and proximity to the free surface [12]. In these calculations, the matrix elastoplasticity is correlated to experimental cyclic stress–strain behavior of Al–1%Si specimens (eutectic composition) tested at room temperature and a frequency on the order of 1–10 Hz. Nonlinear kinematic hardening J2 plasticity theory was used to describe the cyclic plasticity of the Al-rich matrix (cf. [13]).
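The averaging in Eqs. (2) and (3) can be sketched as a post-processing step on integration-point output. The sketch below is only an illustration under stated assumptions: the function names and data layout are hypothetical, the integral in Eq. (2) is replaced by an area-weighted sum, the plastic shear strain on a plane is obtained from a standard 2D tensor transformation (the text does not say how it is resolved), and the set of candidate planes is sampled at a fixed number of angles.

```python
import math


def plane_shear_strain(eps_xx, eps_yy, eps_xy, theta):
    """Engineering shear strain on the plane whose normal is at angle theta
    to the x-axis (2D transformation; eps_xy is the tensor component)."""
    return ((eps_yy - eps_xx) * math.sin(2.0 * theta)
            + 2.0 * eps_xy * math.cos(2.0 * theta))


def area_average(values, areas):
    """Discrete form of Eq. (2): (1/A) * sum(value_i * dA_i)."""
    total_area = sum(areas)
    return sum(v * dA for v, dA in zip(values, areas)) / total_area


def nonlocal_beta(history, areas, n_planes=36):
    """Discrete form of Eq. (3). history[t][i] holds the plastic strain
    components (eps_xx, eps_yy, eps_xy) at integration point i and time
    step t, with the steps taken over one stabilized cycle."""
    max_range = 0.0
    for k in range(n_planes):
        theta = math.pi * k / n_planes
        averaged = [
            area_average([plane_shear_strain(*point, theta) for point in step],
                         areas)
            for step in history
        ]
        max_range = max(max_range, max(averaged) - min(averaged))
    return 0.5 * max_range


# Toy data (made-up numbers): two integration points cycled in pure shear.
areas = [1.0, 3.0]
history = [
    [(0.0, 0.0, 0.002), (0.0, 0.0, 0.001)],
    [(0.0, 0.0, -0.002), (0.0, 0.0, -0.001)],
]
beta = nonlocal_beta(history, areas)
```

Averaging before taking the range, rather than after, is what makes the measure insensitive to a single highly strained integration point; refining the mesh changes the individual values but not the area-weighted mean over a fixed region.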
The lower bound on length scale of the area A for averaging to find β is related to the minimum slip length over which fatigue cracks might nucleate due to classical PSB formation (Venkataraman et al. [3]). This lower bound may range from about 300 to 1000 nm and effectively establishes the minimum finite element mesh size for the calculation of cyclic plastic shear strain within the matrix to be used in assessment of fatigue crack formation. As will be explained later, the size of an incubated crack at the root of an inclusion will be assumed to be proportional to inclusion diameter, following the concept of transition crack length from fracture mechanics, and therefore the averaging in Eqs. (2) and (3) is performed similarly over a comparable area, proportional to the square of the inclusion diameter. The non-local maximum cyclic plastic shear strain amplitude $\Delta\gamma_{\max}^{p*}/2$ is used to estimate the number of cycles to form and propagate (i.e., incubate) a crack with length on the order of the domain of influence of the micronotch root, a fraction of inclusion size. Following [11], a notch root Coffin–Manson law is applied for this purpose, i.e.,

$$\beta = C_{\mathrm{inc}}\, N_{\mathrm{inc}}^{\alpha} \qquad (4)$$
where α and $C_{\mathrm{inc}}$ are material dependent parameters. The formulation of a non-local Coffin–Manson law at the microstructure-scale is consistent with energetic arguments based on slip irreversibility [3], as discussed in connection with Eq. (2). Both α and $C_{\mathrm{inc}}$ are obtained from experimental data while β is obtained computationally. Load ratio dependence is explicitly embedded in $C_{\mathrm{inc}}$, which differs, in general, from the fatigue ductility coefficient $\varepsilon'_f$ in the traditional application of the Coffin–Manson relations to correlate fatigue crack initiation lifetime (cf. [14]); $\varepsilon'_f$ includes effects of substantial small crack propagation through the heterogeneous microstructure (often up to 1 mm), whereas $C_{\mathrm{inc}}$ pertains only to formation at an individual micronotch.

In a real microstructure, inclusions of various sizes and aspect ratios are present. Furthermore, some of the inclusions might be clustered and hence their interaction may affect the non-local maximum plastic shear strain amplitude. Inclusions are also found near the free surface, which promotes shear localization and can lead to premature fatigue crack formation. In addition, the applied mean stress or R-ratio can affect the intensity of cyclic plastic shear strain at the notch root, especially when the inclusions are partially debonded. This is due to the contact interaction between the inclusion and the surrounding matrix. We do not know a priori which of these scenarios would be most critical. Hence parametric studies are conducted to determine the nonlocal maximum plastic shear strain amplitude β as a function of these parameters. Then, information regarding inclusion populations in actual materials can be processed through these relations to assess probabilities of failure based on distribution functions for inclusion types, number densities, sizes, shapes, proximities, etc.
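For reference, Eq. (4) can be solved for the incubation life once β has been computed and α and $C_{\mathrm{inc}}$ have been calibrated. The rearrangement below is only algebra on Eq. (4) and assumes β > 0 and $C_{\mathrm{inc}}$ > 0:

```latex
N_{\mathrm{inc}} = \left( \frac{\beta}{C_{\mathrm{inc}}} \right)^{1/\alpha}
```

With a negative exponent α, as is usual for Coffin–Manson forms, a larger non-local shear strain amplitude β gives a shorter incubation life.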
4. Micronotch Analyses in Cast A356-T6
A hierarchical treatment of five inclusion types was addressed in McDowell et al. [11] for cast Al–Si alloy A356-T6, spanning the range of length scales relative to the secondary dendrite arm spacing or dendrite cell size (DCS) listed below, in order of ascending severity:

Type  Inclusion
A     Distributed microporosity and Si particles; no significant pores or oxides
B     High levels of microporosity; no large pores or oxides (length scale < 3DCS, which is about 60–300 µm)
C     Large pores (length scale > 3DCS)
D     Large pores within one pore diameter of the free surface; no large oxides (length scale > 3DCS)
E     Large folded oxides (length scale > 3DCS)
The hierarchical approach to fatigue modeling of cast alloys permits bypass of certain crack growth regimes associated with lower length scales if the cracks incubate at larger defects. The total fatigue life is modeled as the sum of numbers of cycles spent in several consecutive stages as follows:

$$N_T = N_{\mathrm{inc}} + N_{\mathrm{MSC}} + N_{\mathrm{PSC}} + N_{\mathrm{LC}} = N_{\mathrm{inc}} + N_{\mathrm{MSC/PSC}} + N_{\mathrm{LC}} \qquad (5)$$
where $N_{\mathrm{inc}}$ is the number of cycles to incubate (nucleation plus small crack growth through the region of notch root influence) a crack at the micronotch root with initial length, $a_i$, on the order of 1/2 the maximum Si particle diameter, $\hat{D}_{\mathrm{part}}$, or pore size, $\hat{D}_{\mathrm{pore}}$. Here, $N_{\mathrm{MSC}}$ is the number of cycles required for propagation of a microstructurally small crack (MSC) with length $a_i < a < k\,\mathrm{DCS}$, where k is a non-dimensional factor which represents a saturation limit at which the 3D crack front encounters a network of Si particles; typically k is in the range of 3–5. Further, $N_{\mathrm{PSC}}$ is the number of cycles required for propagation of a physically small crack (PSC) during the transition from microstructurally small crack status to that of a dominant, long crack. The long crack propagates according to LEFM with an associated number of cycles $N_{\mathrm{LC}}$. For this alloy, the DCS is typically on the order of 20–100 µm, and the PSC regime may conservatively extend up to 300–800 µm. For practical purposes, in view of experimental data on this class of alloy, we aggregate $N_{\mathrm{MSC}} + N_{\mathrm{PSC}}$ into the single term $N_{\mathrm{MSC/PSC}}$. Finite element simulations of cyclic plastic deformation at debonded Si inclusions, which is established as the worst-case scenario for localization of the nonlocal cyclic plastic shear strain [14, 15], are used to fit relations between β in Eq. (4) and the applied von Mises uniaxial equivalent strain amplitude, $\bar{\varepsilon}_a$, and R-ratio based on maximum principal stress ($R = \sigma_1|_{\min} / \sigma_1|_{\max}$). The
average maximum plastic shear strain at the micronotch root (Eqs. (2)–(3)) is calculated over 5% of the inclusion area. Computational micromechanics studies were conducted over a substantial range of inclusion geometries (pores and Si particles) and size distributions to determine the notch root average value of the maximum local cyclic plastic shear strain amplitude, $\beta = \Delta\gamma_{\max}^{p*}/2$, as a function of the applied strain amplitude [15, 16]. Figure 2 shows examples for uniaxial loading. Particle-matrix contact during loading reversal plays a key role in cyclic plastic strain localization in such problems. The micronotch Coffin–Manson law for incubation life in Eq. (4) is augmented by the relations:

$$\frac{\ell}{D} = \frac{\langle \bar{\varepsilon}_a - 0.0006 \rangle}{0.00567} \quad \text{for } \frac{\ell}{D} \le 0.3; \qquad \frac{\ell}{D} = 1 - 0.7\left(\frac{0.0023}{\bar{\varepsilon}_a}\right)^{1/r} \quad \text{for } 0.3 < \frac{\ell}{D} \le 1 \qquad (6)$$

$$\beta = \frac{\Delta\gamma_{\max}^{p*}}{2} = (0.1666 + 0.0266\,R)\,\left[\,100\,\{\bar{\varepsilon}_a - 0.00025\,(1 - R)\}\,\right]^{2.45} \times (1 + z\,\zeta) \qquad (7)$$

$$C_{\mathrm{inc}} = C_n + \frac{1}{0.7}\left\langle \frac{\ell}{D} - 0.3 \right\rangle (C_m - C_n) = C_n + z\,(C_m - C_n) \qquad (8)$$

$$C_n = 0.24\,(1 - \langle R \rangle) \qquad (9)$$
In these equations, D is the maximum Si particle diameter or pore size at a given scale within the preceding hierarchy, $D = \hat{D}_{\mathrm{part}}$ or $\hat{D}_{\mathrm{pore}}$. All hierarchical scales are pursued in analysis and minimum total life is taken among these. The exponent r in Eq. (6) controls the shape factor for the transition to limit plasticity, and r = 0.1 is selected to provide a rapid transition into the limit plasticity regime as observed in finite element calculations. Incubation life $N_{\mathrm{inc}}$ rapidly becomes an insignificant fraction of the total fatigue life above the percolation limit for microplasticity where extensive shear localization dominates the eutectic regions. For $\ell/D \ge 0.3$, finite element simulations show that β rapidly saturates to a level well above its value at the percolation limit (on the order of 2% plastic strain in the interdendritic regions). In Eqs. (8)–(9), $C_n$ is the coefficient for nucleation and small crack growth at inclusions in the HCF regime (constrained microplasticity), and $C_m$ is the Coffin–Manson coefficient for incubation in the limit plasticity regime (macroscopic LCF), obtained from the dendrite cell Al–1%Si material. The Macaulay bracket function is defined by $\langle f \rangle = f$ if $f \ge 0$ and $\langle f \rangle = 0$ for negative f. The matrix fatigue ductility coefficient is estimated as $C_m = 0.03$, based on LCF experiments on Al–1%Si specimens at lives below $5 \times 10^{3}$ cycles. Dependence of $C_n$ on R reflects an effective decrease in matrix fatigue ductility at higher positive R-ratios due to plastic strain localization; the localized plastic
Figure 2. Correlation of Eq. (6) with the finite element computational results of Ref. [15] for nonlocal $\Delta\gamma_{\max}^{p*}/2$ (in %) versus far-field total strain amplitude, $\varepsilon_a = \Delta\varepsilon/2$ (in %), for debonded Si particles (upper curves) and for cracked Si particles (lowest curve). An area of $A = 0.0625\,D^{2}$ was used in averaging the cyclic plastic strain in 2D calculations, where D is particle diameter.
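The fitted relations of Eqs. (6)–(9) can be chained with Eq. (4) to estimate incubation life from the applied strain amplitude and R-ratio. The sketch below is a hedged illustration, not the authors' implementation: the function names are hypothetical, the values α = −0.5, ζ = 9, C_m = 0.03, and r = 0.1 are the ones quoted in the text for debonded particles in A356-T6, and clipping the strain term of Eq. (7) at zero is an added assumption so that the power is defined at very low amplitudes. It also assumes R < 1.

```python
import math

ALPHA = -0.5   # exponent in Eq. (4), quoted for the Al-1%Si matrix
ZETA = 9.0     # eutectic strain intensification, debonded particles
C_M = 0.03     # matrix fatigue ductility coefficient


def macaulay(x):
    """Macaulay bracket: x if x >= 0, else 0."""
    return x if x > 0.0 else 0.0


def plastic_zone_ratio(eps_a, r=0.1):
    """Eq. (6): l/D as a function of the applied strain amplitude."""
    ratio = macaulay(eps_a - 0.0006) / 0.00567
    if ratio <= 0.3:
        return ratio
    return 1.0 - 0.7 * (0.0023 / eps_a) ** (1.0 / r)


def incubation_life(eps_a, R):
    """Chain Eqs. (6)-(9), then invert Eq. (4) for N_inc."""
    z = macaulay(plastic_zone_ratio(eps_a) - 0.3) / 0.7
    # Assumption (not in the text): clip the strain term at zero.
    strain_term = 100.0 * macaulay(eps_a - 0.00025 * (1.0 - R))
    beta = (0.1666 + 0.0266 * R) * strain_term ** 2.45 * (1.0 + z * ZETA)
    c_n = 0.24 * (1.0 - macaulay(R))          # Eq. (9)
    c_inc = c_n + z * (C_M - c_n)             # Eq. (8)
    if beta <= 0.0:
        return math.inf                        # no cyclic microplasticity
    return (beta / c_inc) ** (1.0 / ALPHA)


n_hcf = incubation_life(eps_a=0.002, R=-1.0)   # below the percolation limit
n_lcf = incubation_life(eps_a=0.003, R=-1.0)   # above the percolation limit
```

For completely reversed loading this sketch gives an incubation life on the order of 10^4 cycles at a strain amplitude of 0.002 and only a few cycles at 0.003, which reflects the rapid loss of incubation life above the microplasticity percolation limit.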
strain level increases with R-ratio [15]. Furthermore, ratcheting or progressive plastic deformation of the notch root plastic shear strain is also evident in the calculations of Gall et al. [15] and is known to degrade fatigue ductility. The
exponent α in Eq. (4) pertains to the eutectic Al-rich matrix, and is estimated from LCF tests on Al–1%Si as α = −0.5. The localization multiplier $z = \langle \ell/D - 0.3 \rangle / 0.7$ is non-zero only above the microplasticity percolation limit, and rapidly transitions to unity as interdendritic plastic shear strain localization sets in just above the microplasticity percolation limit. Beyond this point, the incubation process is negligible ($N_{\mathrm{inc}}$ only a few cycles) due to the severe levels of strain localization between particles or pores in and around interdendritic regions. Multiplier ζ represents eutectic strain intensification in the LCF regime. For debonded particles, the value ζ = 9 is estimated based on finite element results for the R = −1 case as $\ell/D \to 1$.

Once nucleated, small cracks (typically on the order of microns) must then propagate through an enclave with a significant gradient of cyclic stress and plastic strain away from the inclusion, typically losing driving force as they grow away from the micronotch root. If the driving force remains above threshold, a crack effectively leaves behind the influence of the notch and behaves as a crack with a physical length that includes the inclusion diameter (cf. [8, 17]). The transition crack length is typically 10–15% of the inclusion diameter [7]. The MSC/PSC small crack propagation relation is given by

$$\left(\frac{da}{dN}\right)_{\mathrm{MSC/PSC}} = G\,(\Delta\mathrm{CTD} - \Delta\mathrm{CTD}_{th}), \qquad (10)$$
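where ΔCTD is the cyclic crack tip displacement range, and G is a constant for a given microstructure, typically less than unity (cf. [18]). We assign the threshold value $\Delta\mathrm{CTD}_{th} = 2.86 \times 10^{-10}$ m = b, where b is the Burgers vector for pure fcc Al. This value is just slightly above the minimum cyclic crack growth advance per cycle measured for squeeze cast Al–Si alloys [19, 20]. We adopt the specific form

$$\Delta\mathrm{CTD} = f(\bar{\varphi})\, C_{II} \left(\frac{\mathrm{DCS}}{\mathrm{DCS}_o}\right) \left[\frac{U\,\Delta\hat{\sigma}}{S_u}\right]^{n} a + C_{I} \left(\frac{\mathrm{DCS}}{\mathrm{DCS}_o}\right) \left(\left.\frac{\Delta\gamma_{\max}^{p}}{2}\right|_{\mathrm{macro}}\right)^{2} \qquad (11)$$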
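To make Eqs. (10) and (11) concrete, the sketch below evaluates the MSC/PSC growth rate in the HCF regime, where the text notes that the second (plastic strain) term of Eq. (11) is negligible, so only the first term is implemented. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the default constants G, C_II, and n are the A356-T6 values quoted later in this section, growth is assumed to arrest when ΔCTD falls below threshold, and the stress level in the usage lines is a made-up example.

```python
import math

CTD_TH = 2.86e-10   # threshold crack tip displacement range, m


def delta_ctd_hcf(a, stress_term, dcs_ratio=1.0, porosity_factor=1.0,
                  c_ii=1.88e-3, n=4.8):
    """First term of Eq. (11). a is the crack length in metres;
    stress_term = U * delta_sigma_hat / S_u (dimensionless);
    dcs_ratio = DCS / DCS_o; porosity_factor = f(phi_bar)."""
    return porosity_factor * c_ii * dcs_ratio * stress_term ** n * a


def msc_growth_rate(a, stress_term, G=0.32, **kwargs):
    """Eq. (10) in m/cycle. Assumption: zero growth below threshold."""
    return G * max(delta_ctd_hcf(a, stress_term, **kwargs) - CTD_TH, 0.0)


def cycles_to_grow(a_start, a_end, stress_term, steps=10000, **kwargs):
    """Integrate dN = da / (da/dN) with the midpoint rule.
    Returns math.inf if the crack is arrested anywhere in the interval."""
    da = (a_end - a_start) / steps
    cycles = 0.0
    for i in range(steps):
        rate = msc_growth_rate(a_start + (i + 0.5) * da, stress_term, **kwargs)
        if rate <= 0.0:
            return math.inf
        cycles += da / rate
    return cycles


# Made-up example: R = -1 so U = 1/(1 - R) = 0.5, equivalent stress range
# 200 MPa, ultimate strength 310 MPa as quoted for horizontally cast plate.
stress_term = 0.5 * 200.0 / 310.0
n_msc = cycles_to_grow(100e-6, 300e-6, stress_term)
```

Because the first term of Eq. (11) is linear in crack length, the growth rate in this regime rises with a, and a crack that is too short for its ΔCTD to exceed the threshold does not grow at all under the arrest assumption used here.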
The first term in Eq. (11) is based on the correlations of Ref. [20] for cracks in low porosity squeeze cast Al–Si alloys in the MSC/PSC regime under HCF loading conditions, with an additional influence of average void volume fraction (porosity) $\bar{\varphi}$ via the function $f(\bar{\varphi})$ to be discussed later. Coefficient $C_{II}$ is intended to apply to the MSC and PSC regimes for crack lengths ranging from a few microns to the millimeter range (cf. [20, 21]). The second term is added to describe elastic–plastic crack propagation in the limit plasticity regime, with $C_{I}$ as the leading coefficient; da/dN is essentially independent of the crack length in this regime, with the maximum plastic macroscopic
shear strain, $\left.\Delta\gamma_{\max}^{p}/2\right|_{\mathrm{macro}}$, as the driving force. This second term is negligible in the HCF regime as defined by the percolation limit for microplasticity. In Eq. (11), $\Delta\hat{\sigma} = 2\theta\,\bar{\sigma}_a + (1 - \theta)\,\Delta\sigma_1$ is the range of the uniaxial equivalent stress, which is a linear combination of the von Mises uniaxial effective stress amplitude $\bar{\sigma}_a = \sqrt{\tfrac{3}{2}\,(\Delta\sigma'_{ij}/2)\,(\Delta\sigma'_{ij}/2)}$ (primes denote deviatoric components) and the range of the maximum principal stress, $\Delta\sigma_1$; θ is a constant factor (0 ≤ θ ≤ 1) introduced by Hayhurst et al. [22] to model combined stress state effects (θ ≈ 0.4 based on torsional fatigue experiments). The factor U addresses mean stress effects on propagation, which are influenced strongly by interdendritic particle interactions ahead of and in the wake of the crack; U = 1/(1 − R) for R < 0, and U = 1 for R ≥ 0, where stress ratio R is based on the maximum principal stress. U = 0 if the peak principal stress in the cycle is compressive. This form for U is consistent with finite element calculations [23] and results of Ref. [24] for particle-reinforced systems. The driving force $U\Delta\hat{\sigma}$ is normalized by ultimate strength $S_u$ in Eq. (11). We assign a dependence of the eutectic matrix fatigue ductility in the HCF regime on the average porosity, $\bar{\varphi}$, as a scaling parameter to correlate with microporosity, i.e.,
$$f(\bar{\phi}) = 1 + \omega\left\{1 - \exp\left(-\frac{\bar{\phi}}{2\phi_{th}}\right)\right\}, \qquad \phi_{th} \approx 10^{-4}. \tag{12}$$
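The scalar ingredients defined so far (the equivalent stress range, the mean stress factor U, and the porosity function of Eq. (12)) can be evaluated with a few lines of code. The following is a minimal sketch only; the function names are hypothetical, the default θ = 0.4 and $\phi_{th} = 10^{-4}$ are the values quoted in the text, and ω = 2 is the value suggested by the fatigue life data discussed with Eq. (12).

```python
import math


def equivalent_stress_range(sigma_bar_a, delta_sigma_1, theta=0.4):
    """Range of the uniaxial equivalent stress:
    2*theta*sigma_bar_a + (1 - theta)*delta_sigma_1."""
    return 2.0 * theta * sigma_bar_a + (1.0 - theta) * delta_sigma_1


def mean_stress_factor(R, peak_principal_stress):
    """U = 1/(1 - R) for R < 0, U = 1 for R >= 0,
    and U = 0 if the peak principal stress in the cycle is compressive."""
    if peak_principal_stress < 0.0:
        return 0.0
    return 1.0 / (1.0 - R) if R < 0.0 else 1.0


def porosity_factor(phi_bar, omega=2.0, phi_th=1.0e-4):
    """Eq. (12): f = 1 + omega * (1 - exp(-phi_bar / (2 * phi_th)))."""
    return 1.0 + omega * (1.0 - math.exp(-phi_bar / (2.0 * phi_th)))


# Completely reversed uniaxial loading with stress amplitude 100 MPa:
# von Mises amplitude = 100 MPa, principal stress range = 200 MPa.
d_sigma_hat = equivalent_stress_range(100.0, 200.0)
U = mean_stress_factor(-1.0, 100.0)
print(d_sigma_hat, U, U * d_sigma_hat)
print(porosity_factor(0.0), porosity_factor(1.5e-3))
```

For completely reversed uniaxial loading the equivalent stress range reduces to the applied stress range and U = 1/2, so the driving force $U\Delta\hat{\sigma}$ equals the stress amplitude. The porosity function equals 1 at zero porosity and saturates toward 1 + ω once the porosity is several times $\phi_{th}$.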
This accounts for the effect of microporosity in decreasing matrix ductility. The factor of two to three reduction in fatigue life observed for higher microporosity levels relative to low microporosity cast specimens suggests a value of ω ≈ 2 (cf. [20]). For two different low porosity squeeze cast alloys in the HCF regime, Shiozawa et al. [20] measured the combined coefficient $G\,C_{II} = 3.11\times10^{-4}$ m/cycle for a reference dendrite cell size of $DCS_o = 30$ µm; in this case, the microporosity is very low, i.e., $f(\bar{\phi}) \approx 1$. For cast A356-T6, we take G = 0.32 and the other non-dimensional constants that result from data correlation give $C_I = 0.31$, $C_{II} = 1.88\times10^{-3}$, n = 4.8 (as in Ref. [20]), and ω = 2. The reference DCS value in Eq. (11) is taken as $DCS_o = 30$ µm, corresponding to a horizontally cast plate. For horizontally cast plate, $S_u = 310$ MPa. The exponent n = 4.8 is also reasonably close to the exponent on stress range of the da/dN versus ΔK relation for the A356-T6 alloy in the long crack regime, and is supported by limited finite element calculations of the ΔCTD versus applied stress for cycling in the HCF regime [23]. The mechanically long crack growth relation is given by
$$\left(\frac{da}{dN}\right)_{LC} = A_p\left[\left(\Delta K_{eff}\right)^{M} - \left(\Delta K_{eff,th}\right)^{M}\right]. \tag{13}$$
For A356-T6, $M \approx 4.2$ and $A_p \approx 1.5\times10^{-11}$ m (MPa$\sqrt{\mathrm{m}}$)$^{-4.2}$/cycle. The intrinsic threshold is given by $\Delta K_{eff,th} \approx 1.3$ MPa$\sqrt{\mathrm{m}}$ for A356-T6, as
determined from experiments at very high stress ratios. The effective stress intensity factor range is defined by $\Delta K_{eff} = K_{\max} - K_{op}$ if $K_{\min} < K_{op}$, and $\Delta K_{eff} = K_{\max} - K_{\min}$ if $K_{\min} \ge K_{op}$, where the opening stress intensity factor level is given by (Couper and Griffiths 1990) $K_{op} = 3.4 + 3.8R^{2}$ for $R > 0$, $K_{op} = 3.4\,(1 + R)$ for $0 \ge R \ge -1$, and $K_{op} = 0$ for $R < -1$. Following incubation, we select between the MSC/PSC and LC growth laws as the crack extends by considering the maximum of either of the two respective rates, i.e.,
$$\frac{da}{dN} = \max\left[\left(\frac{da}{dN}\right)_{MSC/PSC},\ \left(\frac{da}{dN}\right)_{LC}\right]. \tag{14}$$
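As an illustration of how Eqs. (13) and (14) fit together, the sketch below evaluates the opening level, the effective stress intensity factor range, and the long crack growth rate. It is an assumption-laden sketch rather than the implementation used for the results reported here: the function names are hypothetical, all stress intensity factors are taken to be in MPa√m (the unit quoted for the threshold), R is taken as $K_{\min}/K_{\max}$ with $K_{\max} > 0$, and the rate is set to zero at or below the intrinsic threshold, which Eq. (13) does not state explicitly.

```python
def opening_sif(R):
    """Opening stress intensity factor level K_op (assumed MPa*sqrt(m))."""
    if R > 0.0:
        return 3.4 + 3.8 * R ** 2
    if R >= -1.0:
        return 3.4 * (1.0 + R)
    return 0.0


def effective_sif_range(k_max, k_min):
    """Delta K_eff = K_max - K_op if K_min < K_op, else K_max - K_min."""
    R = k_min / k_max  # assumption: stress ratio from K_min / K_max, K_max > 0
    k_op = opening_sif(R)
    return k_max - k_op if k_min < k_op else k_max - k_min


def lc_growth_rate(k_max, k_min, A_p=1.5e-11, M=4.2, dk_th=1.3):
    """Eq. (13) in m/cycle with the A356-T6 constants quoted in the text.

    Assumption: no growth when Delta K_eff is at or below the threshold.
    """
    dk = effective_sif_range(k_max, k_min)
    if dk <= dk_th:
        return 0.0
    return A_p * (dk ** M - dk_th ** M)


def growth_rate(rate_msc_psc, rate_lc):
    """Eq. (14): the larger of the MSC/PSC and LC rates governs."""
    return max(rate_msc_psc, rate_lc)


print(opening_sif(0.5), effective_sif_range(10.0, 8.0))
print(lc_growth_rate(10.0, 8.0), lc_growth_rate(1.0, 0.0))
```

The guard on the threshold also prevents a fractional power of a negative number when $K_{\max}$ lies below the opening level.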
Use of the ΔK-based growth relation is subject to the constraint that the requirements for validity of the homogeneous LEFM approach to model fatigue crack growth in the heterogeneous cast alloy are met, i.e.,
$$a > 30\,DCS\left(\frac{S_y}{\Delta\sigma_{eff}}\right)^{2} \tag{15}$$
where $\Delta\sigma_{eff} = \sigma_{\max} - \sigma_{op}$, and the opening stress accords with $K_{op}$. This criterion corresponds to a cyclic plastic zone enclave at the crack tip on the order of the DCS. For Si particles, the initial crack size is given by
$$a_i = \frac{\hat{D}_{part}}{2} + \frac{\tilde{\ell}}{4} = 0.5625\,\hat{D}_{part} \quad \text{where} \quad \tilde{\ell} \approx \frac{\hat{D}_{part}}{4}. \tag{16}$$
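For pores with diameter less than 3DCS, the initial crack size is
$$a_i = \frac{\hat{D}_{pore}}{2} + \frac{\tilde{\ell}}{4} = 0.5625\,\hat{D}_{pore} \quad \text{where} \quad \tilde{\ell} \approx \frac{\hat{D}_{pore}}{4}. \tag{17}$$
For pores with diameter greater than 3DCS, the initial crack size is
$$a_i = \frac{\hat{D}_{pore}}{2} + \frac{1}{2}\left(\frac{DCS}{2}\right) = \frac{\hat{D}_{pore}}{2} + \frac{DCS}{4}, \tag{18}$$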
and the factor β is amplified by $\hat{D}_{pore}/(3\,DCS)$ to account for loss of constraint on slip with increase of pore size relative to the dendrite cell size. The case of localization of cyclic plastic strain for large pores near free surfaces was characterized as well and considered as a separate case. For large oxides of length $\hat{D}_{oxide}$, it is assumed that the incubation relations are bypassed completely, with an initial crack size for propagation given by $a_i = \hat{D}_{oxide}/2$. The final crack length is specified either as $a_f = a_f|_{dominant} = 1$ mm for a dominant crack in the HCF regime C of Fig. 1 (only maximum inclusion size dominates), or an effective crack length $\tilde{a}_f$ ($\tilde{a}_f < a_f$) in the LCF regime that accounts for multi-site incubation of cracks at largest inclusions of different populations, followed by dilute (non-interactive) growth, and impingement
coalescence. Coalescence phenomena for distributed crack nucleation are considered only for the LCF regime, i.e., $\ell/D \to 1$ (regimes A or B in Fig. 1). We set the final crack length in the propagation analysis as $a_f = a_f|_{dominant} + z\,(\tilde{a}_f - a_f|_{dominant})$ to account for the transition from dominant crack failure in HCF to multi-site crack formation and coalescence in LCF. For example, the approximate recursion relation
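$$\tilde{a}_f \approx \frac{\hat{D}_{pore}}{2} + \frac{(0.685 - 0.04\,\xi_1)(\xi_1 + 1)}{2}\,\frac{\hat{D}_{part} + \delta_{part}}{2} \times \left[\sum_{i=1}^{n}(\xi_1)^{2i-1} + \left(\frac{\delta_{pore}}{2\delta_{part}} - n\right)(\xi_1)^{2(n+1)-1}\right] \tag{19}$$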
applies to the reduced effective crack length at failure due to a system of largest Si particles residing within a field of monosize gas or shrinkage pores. Here, $n = \mathrm{INT}\left[\delta_{pore}/(2\delta_{part})\right]$ and $\xi_1 = \hat{D}_{pore}/(\hat{D}_{pore} + \hat{D}_{part})$. The average spacing between the largest (fractured or debonded) Si particles is given by $\delta_{part}$, and $\delta_{pore}$ is the average nearest neighbor spacing between pores of a given mean diameter $\bar{D}_{pore}$. Finally, we sum the various components of lifetime to arrive at the total number of cycles to produce a crack of 1 mm length or to reach $\tilde{a}_f$, $N_T = N_{inc} + N_{MSC/PSC} + N_{LC}$. Figures 3 and 4, respectively, show the predicted uniaxial remote stress amplitude and strain amplitude versus $N_T$ curves for completely
Figure 3. Model predictions: remote stress amplitude, Δσ/2 (MPa), versus total fatigue life, $N_T$, for completely reversed uniaxial loading for a range of inclusion types and sizes (logarithmic axes).
Figure 4. Variation of completely reversed, uniaxial applied strain amplitude (Δε/2) versus total life ($N_T$) behavior as a function of inclusion type and size for cast A356-T6 Al, including coalescence effects in the LCF regime (logarithmic axes). In the coalescence propagation analysis, parameters were assigned based on experimental work: $\bar{\phi} = 1.5\times10^{-3}$, $\bar{D}_{pore} = 50$ µm, $\hat{D}_{part} = 12$ µm, and DCS = 30 µm.
reversed loading (R = −1) as a function of inclusion type considered. For squeeze cast alloys (12 µm evenly spaced fractured Si particles in Figs. 3 and 4), the porosity is minimized by application of hydrostatic pressure during solidification and the maximum fatigue resistance is obtained in the HCF regime. However, coalescence phenomena render this microstructure less resistant to LCF, as shown in Fig. 4. The key point is that once such a multi-mechanism, multi-scale model of fatigue is established and its mean behavior correlated to in situ matrix fatigue behavior along with one or two well-characterized inclusion populations such as in the squeeze cast condition, it is robust in predicting the variability of fatigue resistance associated with a complete range of microstructural features, including dendrite cell size, eutectic Si particles, both gas and shrinkage pores, and oxides. Combining such a tool with component level stress analysis, it is foreseen that variations of microstructure can be achieved through design of process conditions (e.g., solidification rate and pressure) to tailor material fatigue resistance in critical locations of components rather than costly, indiscriminate control of processing on the entire part. Moreover, minimum weight component design can be undertaken considering additional constraints regarding the level of fatigue resistance of the microstructure. Many exciting possibilities exist in this regard.
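Returning to the initial crack sizes that start the propagation analysis, the rules of Eqs. (16)–(18) and the oxide rule can be collected in a short sketch. The function names are hypothetical, any consistent length unit may be used, and the treatment of a pore diameter exactly equal to 3DCS (assigned here to Eq. (18)) is an assumption, since the text only distinguishes "less than" and "greater than".

```python
def initial_crack_size_particle(d_part):
    """Eq. (16): a_i = D/2 + l/4 with l ~ D/4, i.e. 0.5625 * D."""
    return d_part / 2.0 + (d_part / 4.0) / 4.0


def initial_crack_size_pore(d_pore, dcs):
    """Eq. (17) for pores smaller than 3*DCS, Eq. (18) otherwise."""
    if d_pore < 3.0 * dcs:
        return d_pore / 2.0 + (d_pore / 4.0) / 4.0
    # Assumption: equality d_pore == 3*DCS is handled by Eq. (18).
    return d_pore / 2.0 + dcs / 4.0


def initial_crack_size_oxide(d_oxide):
    """Large oxides bypass incubation: a_i = D_oxide / 2."""
    return d_oxide / 2.0


# Values in micrometers, using the parameters quoted in the Fig. 4 caption.
print(initial_crack_size_particle(12.0))    # 12 um Si particle
print(initial_crack_size_pore(50.0, 30.0))  # 50 um pore, DCS = 30 um
print(initial_crack_size_pore(200.0, 30.0)) # large pore, above 3*DCS
```

With these inputs the 12 µm particle gives an initial crack of 6.75 µm, the 50 µm pore gives 28.125 µm, and a 200 µm pore gives 107.5 µm.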
5. Cyclic Shakedown and Ratcheting in Fatigue
The foregoing model has focused on reversed cyclic plasticity as a driving force for formation of fatigue cracks. Cyclic plastic strain behavior is generally decomposed into three regimes: elastic shakedown, reversed cyclic plasticity, and plastic ratcheting, as shown in Fig. 5 (cf. [25]). Elastic shakedown is defined as the stress or strain level below which there is a cessation of cyclic plasticity. In other words, the condition of elastic shakedown is obtained when plastic deformation occurs during the early cycles but the steady state behavior is fully elastic due to the build-up of residual stresses. Reversed cyclic plasticity is the condition in which the material experiences reversed plastic straining during cycling with no net accumulation of plastic deformation; reversed cyclic plasticity is sometimes referred to as plastic shakedown. Plastic ratcheting describes the condition in which the material accumulates a net directional plastic strain during each cycle. The ratcheting plastic strain increment per cycle is defined as
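$$\left(\Delta\varepsilon^{p}_{ij}\right)_{ratch} = \left.\varepsilon^{p}_{ij}\right|_{\text{end of the cycle}} - \left.\varepsilon^{p}_{ij}\right|_{\text{beginning of the cycle}} \tag{20}$$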
The reversed cyclic plastic strain range is given by
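$$\left(\Delta\varepsilon^{p}_{ij}\right)_{cyc} = \left(\Delta\varepsilon^{p}_{ij}\right)_{\max\ \text{over the cycle}} - \left(\Delta\varepsilon^{p}_{ij}\right)_{ratch} \tag{21}$$
Figure 5. Steady state responses of plastic strain behavior during the cycle, shown as stress versus strain with $\Delta\varepsilon^{p}_{cyc}$ and $\Delta\varepsilon^{p}_{ratch}$ indicated: (a) elastic shakedown, (b) reversed cyclic plasticity, and (c) plastic ratcheting.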
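The effective reversed cyclic plastic strain range and ratcheting plastic strain increment are defined as follows:
$$\Delta\varepsilon^{p}_{cyc,eff} = \sqrt{\frac{2}{3}\left(\Delta\varepsilon^{p}_{ij}\right)_{cyc}\left(\Delta\varepsilon^{p}_{ij}\right)_{cyc}} \tag{22}$$
$$\Delta\varepsilon^{p}_{ratch,eff} = \sqrt{\frac{2}{3}\left(\Delta\varepsilon^{p}_{ij}\right)_{ratch}\left(\Delta\varepsilon^{p}_{ij}\right)_{ratch}} \tag{23}$$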
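The measures of Eqs. (20)–(23) can be computed from a sampled plastic strain history over one cycle, as in the sketch below. This is an illustrative sketch, not the procedure used in the cited computations. In particular, the "maximum range over the cycle" in Eq. (21) is interpreted here as the difference between the two sampled states that are farthest apart in the effective-strain norm, taken as later state minus earlier state; that interpretation, and all function names, are assumptions.

```python
import itertools
import math


def subtract(a, b):
    """Component-wise difference of two 3x3 tensors stored as nested lists."""
    return [[a[i][j] - b[i][j] for j in range(3)] for i in range(3)]


def effective(d):
    """Effective measure sqrt(2/3 * d_ij * d_ij), as in Eqs. (22) and (23)."""
    return math.sqrt(2.0 / 3.0 * sum(d[i][j] ** 2 for i in range(3) for j in range(3)))


def ratchet_increment(history):
    """Eq. (20): plastic strain at the end minus at the beginning of the cycle."""
    return subtract(history[-1], history[0])


def max_range(history):
    """Assumed reading of the maximum range over the cycle (see lead-in)."""
    pairs = itertools.combinations(range(len(history)), 2)
    i, j = max(pairs, key=lambda p: effective(subtract(history[p[1]], history[p[0]])))
    return subtract(history[j], history[i])


def cyclic_range(history):
    """Eq. (21): maximum range over the cycle minus the ratchet increment."""
    return subtract(max_range(history), ratchet_increment(history))


def uniaxial_state(e):
    """Incompressible uniaxial plastic strain state with axial component e."""
    return [[e, 0.0, 0.0], [0.0, -e / 2.0, 0.0], [0.0, 0.0, -e / 2.0]]


# One cycle: axial plastic strain goes 0 -> 0.002 -> 0.0005 (net ratchet 0.0005).
cycle = [uniaxial_state(0.0), uniaxial_state(0.002), uniaxial_state(0.0005)]
print(effective(ratchet_increment(cycle)), effective(cyclic_range(cycle)))
```

For an incompressible uniaxial state the effective measure reduces to the magnitude of the axial component, so the example yields an effective ratchet increment of 0.0005 and an effective reversed range of 0.0015.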
It is assumed that elastic shakedown occurs when the following conditions are satisfied:
$$\varepsilon^{p}_{ij} \neq 0 \quad \text{and} \quad \Delta\varepsilon^{p}_{cyc,eff}\ \text{and}\ \Delta\varepsilon^{p}_{ratch,eff} \le (\text{small factor})\,\varepsilon_y \tag{24}$$
where the small factor ($\ll 1$) scales shakedown relative to the cyclic yield strain $\varepsilon_y$. In other words, when elastic shakedown occurs, both reversed cyclic plastic strain and plastic ratchet strain amplitudes are considered to be zero. Shakedown and ratcheting maps expressed in terms of strain distributions or as a function of loading parameters can be quite valuable for interpreting the role of microstructure in fatigue resistance. The interested reader is referred to Refs. [26–28], which describe such maps constructed for fretting fatigue of Ti–6Al–4V based on computational multiphase crystal plasticity, where the surface boundary layer thickness is on the order of, and therefore intimately related to, microstructure scales. Ratcheting is a very important mechanism for this class of surface contact problems. It should be considered that classical to-and-fro slip is not responsible for all crack formation and propagation mechanisms at the microstructure scale. Progressive pileup of dislocations in slip bands (Zener mechanism) that impinge on grain or phase boundaries, or at oxidized inclusion interfaces, can lead to formation and propagation of small cracks in the microstructure. In fretting fatigue, for example, progressive plastic deformation of surface layers has been shown to contribute significantly to formation and early growth [27] of cracks on the order of grain size under ostensibly HCF conditions. An appropriate measure of plastic strain to reflect this sort of driving force is the ratchet strain. The averaging procedure in Eq. (3) for non-local $\beta = \Delta\gamma^{p*}_{\max}/2$ can also be applied to a non-local measure of the increment of cyclic rate of ratchet strain accumulation, $\Delta\gamma^{p*}_{\max,ratch}$, or its cumulative value.
A microfracture criterion can be introduced for crack incubation in such cases, e.g., $N_{inc} = f\left(\Delta\gamma^{p*}_{\max,ratch}\right)$, or a Mohr–Coulomb form [6] $N_{inc} = g\left(\Delta\gamma^{p*}_{\max,ratch} + h_{\sigma}\,\sigma^{*}_{n,\max}\right)$, where $\sigma^{*}_{n,\max}$ is the maximum tensile normal stress to the plane of crack formation within the same region over which the averaging is performed to define $\Delta\gamma^{p*}_{\max,ratch}$.
6. Summary
A hierarchical approach for microstructure-sensitive fatigue analysis based on computational micromechanics is outlined. Each aspect of the relation of microstructure to fatigue damage is deterministic, framed to explicitly incorporate microstructure features. Such an approach can predict variability of fatigue life with respect to variation of any particular microstructure feature. Microstructure features for the cast Al alloy example described here include dendrite cell size, maximum Si particle size, maximum pore size, maximum oxide size, proximity to the free surface (for large pores), and average porosity level. If the probability distributions of these features are specified based on quantitative metallography, for example, then the probability distributions for the fatigue life can be computed directly from the model. The foregoing methodology is an extension of existing, straightforward practice of estimation of fatigue life of notched components to microstructural notches (Socie et al., 1984). It requires analyses of notch root behavior for various microstructural features (relations between remote loading conditions and behavior in notch root regions), as well as introduction of appropriate small fatigue crack growth relations for a given microstructure. The extension of this methodology to other characteristic wrought and cast microstructures is straightforward, although it requires an investment of effort to sort out mechanisms for crack formation and propagation, as well as appropriate properties. In principle, relations for broad classes of cast alloys should be similar, as should those for wrought alloys. We also recognize several additional directions of future research in computational fatigue models that can contribute to this type of approach:
• Cohesive zone interface separation elements for inclusion-matrix and grain boundary interfaces.
• Discrete dislocation simulations for understanding crack tip behavior and behavior of dislocations at very small micronotches for which the scale of cyclic plastic zone size is on the order of dislocation spacing.
• Nonlocal relations for dislocation substructure formation and its relation to cyclic plastic deformation.
• Adaptive remeshing and mesh-free approaches for propagation of cracks within microstructures to assist in establishing appropriate small crack growth relations for each characteristic type of microstructure.
References
[1] J.A. Bannantine, J.J. Comer, and J.L. Handrock, Fundamentals of Metal Fatigue Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1990.
[2] S. Suresh, Fatigue of Materials, Cambridge University Press, 2nd edition, Cambridge, UK, 1998.
[3] G. Venkataraman, Y.W. Chung, and T. Mura, “Application of minimum energy formalism in a multiple slip band model for fatigue-II. Crack nucleation and derivation of a generalised Coffin-Manson law,” Acta Met. Mater., 39(11), 2631–2638, 1991.
[4] E.A. Repetto and M. Ortiz, “A micromechanical model of cyclic deformation and fatigue-crack nucleation in f.c.c. single crystals,” Acta Mater., 45(6), 2577–2595, 1997.
[5] D.F. Socie, “Critical plane approaches for multiaxial fatigue damage assessment,” In: Advances in Multiaxial Fatigue, ASTM STP 1191, D.L. McDowell and R. Ellis (eds.), ASTM, Philadelphia, 7–36, 1993.
[6] D.L. McDowell, “Multiaxial fatigue strength,” ASM Handbook, vol. 19 on Fatigue and Fracture, ASM International, 263–273, 1996a.
[7] R.A. Smith and K.J. Miller, “Fatigue cracks at notches,” Int. J. Mech. Sci., 19, 11–22, 1977.
[8] N.E. Dowling, “Fatigue at notches and the local strain and fracture mechanics approaches,” In: C.W. Smith (ed.), Fracture Mechanics, ASTM STP 677, ASTM, Philadelphia, 247–273, 1979.
[9] D.F. Socie, N.E. Dowling, and P. Kurath, “Fatigue life estimation of notched members,” In: Fracture Mechanics: 15th Symp., ASTM STP 833, R.J. Sanford (ed.), ASTM, Philadelphia, 284–299, 1984.
[10] D.L. McDowell, “Basic issues in the mechanics of high cycle metal fatigue,” Int. J. Fracture, 80, 103–145, 1996b.
[11] D.L. McDowell, K. Gall, M.F. Horstemeyer, and J. Fan, “Microstructure-based fatigue modeling of cast A356-T6 alloy,” Eng. Frac. Mech., 70, 49–80, 2003.
[12] J. Fan, D.L. McDowell, M.F. Horstemeyer, and K. Gall, “Cyclic plasticity at pores and inclusions in cast Al–Si alloys,” Eng. Frac. Mech., 70(10), 1281–1302, 2003.
[13] D.L. McDowell, “Multiaxial effects in metallic materials,” Symp. on Durability and Damage Tolerance, ASME AD-Vol. 43, ASME Winter Annual Meeting, Chicago, IL, Nov. 6–11, 213–267, 1994.
[14] K. Gall, N. Yang, M. Horstemeyer, D.L. McDowell, and J. Fan, “The influence of modified intermetallics and Si particles on fatigue crack paths in a cast A356 Al alloy,” Fatigue Fract. Engng. Mater. Struct., 23(2), 159–172, 2000a.
[15] K. Gall, M.F. Horstemeyer, B.W. Degner, D.L. McDowell, and J. Fan, “On the driving force for fatigue crack formation from inclusions and voids in a cast A356 aluminum alloy,” Int. J. Fract., 108, 207–233, 2001.
[16] K. Gall, M. Horstemeyer, D.L. McDowell, and J. Fan, “Finite element analysis of the stress distributions near damaged Si particle clusters in cast Al–Si alloys,” Mech. Mater., 32(5), 277–301, 2000b.
[17] J.C. Ting and F.V. Lawrence, Jr., “Modeling the long-life fatigue behavior of a cast aluminum alloy,” Fatigue Fract. Engng. Mater. Struct., 16(6), 631–647, 1993.
[18] C.-H. Goh, D.L. McDowell, and R.W. Neu, “Characteristics of plastic deformation field in polycrystalline fretting contacts,” Int. J. Fatigue, 25(9–11), 1047–1058, 2003b.
[19] A. Plumtree and S. Schafer, “Initiation and short crack behaviour in aluminum alloy castings,” In: The Behaviour of Short Fatigue Cracks, EGF Pub. 1, K.J. Miller and E.R. de los Rios (eds.), Mech. Engineering Publications, London, 215–227, 1986.
[20] K. Shiozawa, Y. Tohda, and S.-M. Sun, “Crack initiation and small fatigue crack growth behaviour of squeeze-cast Al-Si aluminum alloys,” Fatigue Fract. Engng. Mater. Struct., 20(2), 237–247, 1997.
[21] S. Gungor and L. Edwards, “Effect of surface texture on fatigue life in a squeeze-cast 6082 aluminum alloy,” Fatigue Fract. Engng. Mater. Struct., 16(4), 391–403, 1993.
[22] D.R. Hayhurst, F.A. Leckie, and D.L. McDowell, “Damage growth under nonproportional loading,” ASTM STP 853, ASTM, Philadelphia, 688–699, 1985.
[23] J. Fan, D.L. McDowell, M.F. Horstemeyer, and K. Gall, “Computational micromechanics analysis of cyclic crack-tip behavior for microstructurally small cracks in dual-phase Al–Si alloys,” Eng. Frac. Mech., 68, 1687–1706, 2001.
[24] M.J. Couper, A.E. Neeson, and J.R. Griffiths, “Casting defects and the fatigue behavior of an aluminum casting alloy,” Fatigue Fract. Engng. Mater. Struct., 13(3), 213–227, 1990.
[25] J.M. Ambrico and M.R. Begley, “Plasticity in fretting contact,” J. Mech. Phys. Solids, 48(11), 2391–2417, 2000.
[26] C.-H. Goh, J.M. Wallace, R.W. Neu, and D.L. McDowell, “Polycrystal plasticity simulations of fretting fatigue,” Int. J. Fatigue, 23, S423–S435, 2001.
[27] C.-H. Goh, R.W. Neu, and D.L. McDowell, “Crystallographic plasticity in fretting of Ti–6Al–4V,” Int. J. Plasticity, 19(10), 1627–1650, 2003a.
[28] C.-H. Goh, D.L. McDowell, and R.W. Neu, “Characteristics of plastic deformation field in polycrystalline fretting contacts,” Int. J. Fatigue, in press, 2003b.
4.1 OVERVIEW OF CHAPTER 4: MATHEMATICAL METHODS
Martin Z. Bazant¹ and Dimitrios Maroudas²
¹ Massachusetts Institute of Technology, Cambridge, MA, USA
² University of Massachusetts, Amherst, MA, USA
Mathematics is the language of science. In some sense, therefore, this entire Handbook is devoted to “mathematical methods” of materials modeling. What distinguishes the articles in this chapter is the degree of mathematical intensity or sophistication, as well as contributions from the applied mathematics community. Building on its traditional strengths in fluid mechanics, nonlinear dynamics, and numerical methods, applied mathematics has been steadily moving into materials science. The result is a fresh perspective on a wide range of materials problems, including many from other chapters of the Handbook. This chapter serves to highlight some major themes from current research, such as disordered materials, interfacial dynamics, and multiscale modeling, to give a taste of the subject. The chapter has been structured into three thematic sections. The first one (Articles 4.2–4.5) is devoted to theoretical descriptions of bulk phases of materials that are under extreme conditions of deformation and/or are characterized by heterogeneities in their microstructure or the loss of structural order. The second section (Articles 4.6–4.10) addresses problems of interfacial dynamics and morphological evolution or microstructural evolution of multiphase systems mediated by the dynamics of the boundaries between different phases, that have been challenging the fields of materials science and fluid dynamics for many decades. The third and final section (Articles 4.11–4.15) is devoted to mathematical developments in the multiscale modeling of complex systems, which is a promising and powerful computational means toward analysis and predictive modeling of realistic, technologically important materials and their processing and function.
S. Yip (ed.), Handbook of Materials Modeling, 1217–1222. © 2005 Springer. Printed in the Netherlands.
1. Bulk Phases of Highly Deformed, Heterogeneous, or Disordered Materials
The articles of this section (Articles 4.2–4.5) present fundamental principles of theoretical analysis and address challenging problems of structural response to mechanical loading (or other external fields) of bulk material phases. The materials are either perfectly crystalline but subjected to elastic deformations that bring the crystal to its ideal-strength limit, or have complex microstructure such as in composite systems or random heterogeneous materials, or are fully disordered such as in amorphous solids. In these cases, the intensity of the loading conditions and/or the structural complexity of the bulk material phase introduce serious challenges in the theoretical analysis and its computational implementation. For example, conventional perturbation analyses about a linearly elastic continuum are not sufficient to rigorously address the fundamental theoretical problems and additional or more sophisticated mathematical tools are required. Article 4.2 by Milstein investigates theoretically the structural response of a perfect crystal under load for elastic deformations in the vicinity of the crystalline material’s “theoretical strength”. The principles of elastic stability analysis are reviewed and crystal stability criteria are derived under various loading modes. Both lattice–statics calculations and isostress molecular-dynamics simulations are used to explore the behavior of the crystal at and beyond the onset of the elastic instability and elucidate the roles of crystal symmetry and mode of loading, as well as the atomic-scale dynamical mechanisms that may lead to structural transformation (phase change) or failure of the crystal beyond the instability limit. Amorphous solids provide a promising starting point for theoretical analysis toward fundamentally understanding general classes of deformation behavior. 
In Article 4.3 by Falk, Langer, and Pechenik, two questions that are fundamental in the development of amorphous plasticity theory are addressed, namely how hardening-to-flow transitions occur under applied stress and how microstructural dynamics can be incorporated into macroscopic constitutive theories. A theoretical framework is developed that includes the two-state dynamics associated with shear transformation zones (or flow defects). Some predictions of the resulting model are given for the mechanical response of metallic glasses and amorphous polymers and the need to understand better the thermodynamics of nonequilibrium systems is emphasized. Article 4.4 by Sornette reviews the statistical physics of the rupture of heterogeneous materials, such as composite systems, the failure of which is of utmost importance to a broad range of technological applications. The theoretical challenge in the field arises from the complex interplay between heterogeneities and modes of damage, as well as a hierarchy of static and dynamic
characteristic scales; a common property of the heterogeneous systems of interest is the presence of large-scale inhomogeneities that limit the use of homogenization theories. The many-body nature of the rupture problem is highlighted and the need for a truly interdisciplinary approach to attack the problem is emphasized. Article 4.5 by Torquato addresses the theoretical prediction of random heterogeneous materials properties through the use of statistical correlation functions to describe the dependence on microstructure of effective material properties and the development of methods to estimate the corresponding functionals of microstructural information. A unified theoretical approach is outlined based on the canonical n-point correlation function and the analysis focuses on static (or approximately static) two-phase heterogeneous materials. The effective properties are used in averaged constitutive equations to close the appropriate homogenized governing partial differential equations (PDEs) that describe, for small-length-scale heterogeneities, physical processes occurring in heterogeneous materials.
2. Interfacial Dynamics and Morphological & Microstructural Evolution
Materials science is increasingly focusing on the detailed dynamics of microstructures out of equilibrium, to better understand and optimize the macroscopic behavior of complex materials. Such problems are often beyond the reach of atomistic modeling and typically require continuum approaches, which have long been the domain of applied mathematics. The major difficulty is to describe the moving free boundary between different phases, which can be (or become) quite complicated. Even when the governing equations in each phase are linear, the mathematical problem for interfacial dynamics is generally nonlinear and nonlocal. This presents challenges for both numerical and analytical modeling, which are addressed by the articles of this section (Articles 4.6–4.10). A wide variety of numerical methods have been developed, which mostly fall into two classes, Eulerian and Lagrangian. The former represent moving boundaries as level sets of higher-dimensional functions defined on the same fixed mesh as the bulk fields, which naturally allows for topological changes and complicated microstructures. The Phase Field Method for solidification is a well-known example from materials science and is discussed in Chapter 7 of this Handbook. From applied mathematics, the Level-Set Method, presented in Article 4.6 by Sethian, has been used in diverse problems from shock dynamics to image recognition and is now a standard tool to simulate etching and deposition processes in semiconductor micro-fabrication. The method also
is used widely in other areas of materials modeling, such as in thin-film growth (see, e.g., Article 7.15 by Caflisch and Ratsch in Chapter 7 of this Handbook). In contrast, Lagrangian methods explicitly track each moving boundary with a separate data structure. Examples include front-tracking methods for shock waves and immersed-boundary methods for cardiac fibrillations. The Lagrangian approach is particularly useful when the boundary has its own physical properties, separate from the bulk, as in many soft condensed matter systems (see, e.g., articles in Chapter 9 of this Handbook). For example, in simulations of complex fluids containing elastic solid filaments, as discussed in Article 4.7 by Shelley and Tornberg, boundary integral methods can eliminate the need to explicitly describe the bulk fluid phase (e.g., using the methods of Chapter 8 of this Handbook). Due to the complexity of interfacial dynamics, analytical methods are usually restricted to special situations, but, when available, they offer valuable insights. For continuum models based on PDEs, a crucial role is played by exact similarity solutions, in which the independent variables only appear in special power–law combinations, usually due to a separation of length and/or time scales. For example, continuum descriptions with similarity solutions are being developed for modeling crystal surface morphological evolution governed by surface diffusion, as discussed in Article 4.8 by Stone and Margetis. Similarity solutions also are essential to describe singularities in free-surface fluid flows, such as the break-up and coalescence of fluid drops, as discussed in Article 4.9 by Eggers, which are difficult to capture with conventional numerical methods. 
In two dimensions, conformal-mapping methods from complex analysis allow elegant formulations of interfacial dynamics problems, convenient for analytical and numerical solutions, without any special similarity assumptions, as discussed in Article 4.10 by Bazant and Crowdy. Continuous conformal-map dynamics is a mature subject for viscous fingering and other fluid instabilities, but it is being extended to new problems in materials microstructure, such as viscous sintering, electromigration-driven void dynamics in metals, pore evolution in elastic solids, and solidification in fluid flows. The recent development of stochastic conformal-map dynamics has also been a major breakthrough in the study of diffusion-limited aggregation and other fractal-growth phenomena.
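To make the Eulerian representation described earlier in this section concrete, the following sketch advances a circular interface, stored as the zero level set of a function φ on a fixed grid, outward at constant normal speed F by solving $\phi_t + F\,|\nabla\phi| = 0$. It is a minimal illustration using a standard first-order upwind discretization for F > 0, not the production algorithms presented in Article 4.6; the grid size, time step, and function name are assumptions chosen for the example.

```python
import math


def evolve_circle(n=41, r0=0.3, speed=1.0, dt=0.02, steps=10):
    """Grow a circle of radius r0 at normal speed `speed` on [-1, 1]^2.

    Returns the radius recovered from the area of the region phi < 0.
    """
    h = 2.0 / (n - 1)
    xs = [-1.0 + i * h for i in range(n)]
    # Signed distance to the initial circle: negative inside, positive outside.
    phi = [[math.hypot(x, y) - r0 for y in xs] for x in xs]
    for _ in range(steps):
        new = [row[:] for row in phi]  # boundary values are left unchanged
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                dxm = (phi[i][j] - phi[i - 1][j]) / h  # backward difference in x
                dxp = (phi[i + 1][j] - phi[i][j]) / h  # forward difference in x
                dym = (phi[i][j] - phi[i][j - 1]) / h
                dyp = (phi[i][j + 1] - phi[i][j]) / h
                # Upwind gradient magnitude for an outward-moving front (speed > 0).
                grad = math.sqrt(max(dxm, 0.0) ** 2 + min(dxp, 0.0) ** 2
                                 + max(dym, 0.0) ** 2 + min(dyp, 0.0) ** 2)
                new[i][j] = phi[i][j] - dt * speed * grad
        phi = new
    area = h * h * sum(1 for row in phi for v in row if v < 0.0)
    return math.sqrt(area / math.pi)


print(evolve_circle())  # should be close to r0 + speed * dt * steps = 0.5
```

Because the interface is never tracked explicitly, the same update would handle merging or splitting fronts without special treatment, which is the practical appeal of the level-set viewpoint.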
3. Multiscale Modeling of Complex Systems
Multiscale modeling methods are becoming significant tools in materials modeling, as well as a broad range of areas in scientific and engineering research. Over the past decade, multiscale modeling has emerged as a
powerful, integrated computational approach for understanding, analyzing and quantitatively predicting the behavior of realistic complex systems. The aim of multiscale modeling is to link fine-scale phenomena with macroscopic response exhibited over coarse scales by establishing rigorous links over widely different theoretical formalisms and computational methods; the terms fine and coarse are not uniquely defined, but vary over different complex systems. Although the core capabilities of multiscale modeling include mature methods of quantum mechanics, statistical mechanics, and continuum mechanics, the rigorous coupling of these methods to produce satisfactory, ultimately predictive models for the accurate description of complex systems remains a very serious challenge. The articles of this section (Articles 4.11–4.15) address this very challenge. In current modeling, the best available descriptions of a complex system exist at a fine (atomistic or microscopic) scale while the modeling tasks need to address a much coarser, macroscopic scale. Article 4.11 by Kevrekidis, Gear, and Hummer gives an overview of their novel development of a mathematically based, computational enabling technology that allows for performing macroscopic tasks by acting directly on the microscopic models. This “equation-free” approach circumvents the need for deriving accurate macroscopic equations starting from the corresponding microscopic descriptions. An ensemble of short, appropriately initialized, fine-scale computer simulations is used to estimate time derivatives, functions, and functional derivatives, which are then used for system-level modeling through matrix-free numerical analysis and systems-theory tools. The approach has the potential to bridge elegantly microscopic simulation with macroscopic modeling of complex systems.
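One simple way to picture the use of short fine-scale bursts to estimate time derivatives is a projective forward Euler step: run the fine-scale simulator briefly, estimate the slope of the coarse variable from the end of the burst, and extrapolate over a much larger step. The sketch below is only a toy illustration under strong assumptions. The "fine-scale simulator" is a hypothetical stand-in (small explicit steps of dx/dt = −x), a single deterministic burst replaces the ensemble of simulations, and all names and step sizes are chosen for the example rather than taken from Article 4.11.

```python
import math


def fine_step(x, dt_fine):
    """Hypothetical fine-scale simulator: one small explicit step of dx/dt = -x."""
    return x - dt_fine * x


def coarse_projective_step(x, dt_fine=1.0e-3, burst=10, dt_project=0.04):
    """Run a short fine-scale burst, estimate dx/dt, then extrapolate."""
    prev = x
    for _ in range(burst):
        prev, x = x, fine_step(x, dt_fine)
    slope = (x - prev) / dt_fine  # time derivative estimated from the burst
    return x + dt_project * slope  # projective (extrapolation) step


def integrate(x0, n_coarse=20):
    """Each coarse step spans burst*dt_fine + dt_project = 0.05 time units."""
    x = x0
    for _ in range(n_coarse):
        x = coarse_projective_step(x)
    return x


# Twenty coarse steps reach t = 1; compare with the exact solution exp(-1).
print(integrate(1.0), math.exp(-1.0))
```

In this example only one fifth of the simulated time is spent in the fine-scale model, yet the coarse variable stays close to the exact decay. No macroscopic equation is ever written down; the slope comes solely from the burst.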
Modeling mesoscopic inhomogeneities arising due to thermal fluctuations and complex interactions between microscopic mechanisms requires efficient description of length and time scales much larger than those captured by conventional molecular/microscopic models and simulations. Article 4.12 by Katsoulakis and Vlachos provides an overview of the key ingredients in the derivation of a mathematical framework for coarse graining of stochastic processes; these involve a coarse grid selection, as well as the derivation through a stochastic closure of a coarse stochastic model for a reduced (compared to the underlying microscopic description) number of observables, leading to the development of coarse-grained Monte Carlo algorithms. The approach is demonstrated focusing on simple Ising-type models and the coarse-graining errors are estimated using information theory methods. Multiscale modeling aims at developing numerical tools of accuracy comparable to that of microscopic models and efficiency comparable to that of macroscopic models, by properly coupling the microscopic with the macroscopic models. Article 4.13 by E and Li reviews some of these strategies that have been developed for multiscale modeling of crystalline solids, focusing on
the coupling between molecular dynamics and continuum mechanics and, in particular, on concurrent coupling methods for linking different scales “on the fly”. These modeling methods are classified into energy-based and dynamics-based formulations. Specific methods discussed include the quasi-continuum method, macro atomistic ab initio dynamics, coarse-grained molecular dynamics, and the heterogeneous multiscale method. The need for rigorous multiscale modeling of solids beyond single crystals with isolated defects is emphasized. In multiscale modeling, a natural question is the development of a computational method that captures small-scale effects on the large scales using a coarse grid without the requirement to resolve all the small-scale features. Article 4.14 by Hou illustrates some of the key issues in designing multiscale computational methods for fluid flows, using as examples incompressible flow and two-phase flow in heterogeneous porous media. Emphasis is placed on a multiscale finite-element method by constructing local basis functions that capture small-scale information within each element and bring it to the large scales through the coupling of the global stiffness matrix. The need to localize the subgrid small-scale problems by properly implementing microscopic boundary conditions for the local basis functions is highlighted, and methodology to accomplish this is discussed for both diffusion-dominated and convection-dominated transport problems. Future directions in addressing the need to carry out multiscale analysis that accounts for long-range interactions of small scales are also discussed. Engineering analysis requires the prediction of selected “outputs” relevant to component/system performance as a function of “inputs”, i.e., system parameters that serve to identify a particular realization of the component/system.
Article 4.15 by Cuong, Veroy, and Patera addresses modeling of components or systems in service or in operation, where typical computational tasks include robust parameter estimation and adaptive design, i.e., inverse problems and optimization problems, respectively. Their certified real-time approach to solve parameterized PDEs considers both approximation and computation opportunities and is based on rapidly, uniformly convergent reduced-basis approximations and associated rigorous and sharp error bounds. Examples to demonstrate the approach include Helmholtz elasticity and natural convection. These methods are appropriate for many classes of materials behavior and processing problems.
4.2 ELASTIC STABILITY CRITERIA AND STRUCTURAL BIFURCATIONS IN CRYSTALS UNDER LOAD
Frederick Milstein
Mechanical Engineering and Materials Depts., University of California, Santa Barbara, CA, USA
S. Yip (ed.), Handbook of Materials Modeling, 1223–1279. © 2005 Springer. Printed in the Netherlands.
What happens when a crystalline material is deformed elastically to the point where it loses structural stability? Under what circumstances will it lose stability? Why are these questions important? The stress required to cause elastic instability is often considered to be the ultimate “theoretical strength” of a crystalline material, which is an inherently intriguing concept, in and of itself. The “theoretical strength” plays important roles in understanding and/or describing practical phenomena, e.g., it forms a basis for calculating the efficiency of grinding processes and it affects the stress distribution near the tip of a crack and thus influences whether a material will exhibit brittle or ductile behavior. From another viewpoint, structural phase change rather than loss of strength is the presumed outcome of elastic instability. New crystalline or amorphous structures that form under mechanical stress may remain elastically stable after the stress is released, and so may continue to exist indefinitely, even if not in the thermodynamic equilibrium state at zero stress. (An example of an elastically stable structure that is also not in the thermodynamic equilibrium state is the extremely hard, tetragonal crystalline form of iron–carbon alloy referred to as martensitic steel; such structures are sometimes called metastable.) Additionally, as noted by Hill [1], “Single crystals free from lattice imperfections are used increasingly as microstructural components. Perfect crystals are capable of elastic strains well beyond what can properly be treated as infinitesimal. Their response to general loading is virtually unknown and is doubtless complex, so experimentation will have to be conducted within some plausible theoretical framework”. In this context, Milstein and Chantasiriwan [2] observed “Atomistic model computations can shed light on these
complexities, particularly when comprehensive comparisons are made among different metals, crystal structures, and loading directions. Such comparisons can also serve to distinguish between finite strain responses that are sensitive to specific details of atomic binding and those dependent mainly on just crystal symmetries and the general nature of interatomic forces, i.e., attractive between atoms at relatively large interatomic spacing and repulsive between close, neighboring atoms”. Since Hill’s observation, almost 30 years ago, lattice model computations have yielded numerous insights into the large strain, non-linear, elastic response of crystals, although their general response to loading is still largely unknown, especially as it concerns the nature of atomic mechanisms at and beyond the onset of instability. Questions posed at the start of this article may be further elaborated as follows. How is elastic stability under load to be assessed theoretically? What are the roles of crystal symmetry and mode of loading? (For example, a face centered cubic (fcc) crystal with a uniaxial compressive load applied in a [100] direction (i.e., parallel to an edge of a unit cubic crystallographic cell) will respond differently than a body centered cubic (bcc) crystal with a uniaxial tensile load applied in a [111] direction (i.e., parallel to a body diagonal of the unit cell).) If the purported stability limit coincides with a bifurcation point, what are the allowed eigendeformations at the immediate onset of instability? At and after the initiation of a bifurcation, are the atomic mechanisms homogeneous (i.e., with the crystal deforming uniformly, in the manner of a predicted homogeneous eigenmode) or inhomogeneous (e.g., with the formation of domains or the shuffling of crystallographic planes)? Does post-bifurcation behavior lead to failure (loss of load carrying capacity) or to phase change without loss of strength? 
How are instability processes influenced by thermal activation? A goal of this article is to suggest some definitive, as well as some tentative, answers to the above questions, from within a framework that is both analytical and computational. The article is structured as follows. First, the principles of stability analysis for ideal crystals under load are reviewed; the methodology presented is complete to second order in both the internal energy of the crystal (expressed in terms of the crystal’s second order elastic moduli) and the external work. Then examples are given of various lattice statics (LS) calculations that are intended to provide illustrations of the manner in which particular crystal symmetries and modes of loading yield a rich and diverse range of mechanical and geometrical responses, prior to, at, and after the onset of instability. Next, the potential role of higher order elastic moduli at a point of bifurcation or branching of the crystal is discussed. The final topic is the behavior of crystals under stress in isostress molecular dynamics (IMD) simulations carried out in the methodology proposed by Parrinello and Rahman [3]. At each stage of this presentation, comments are made regarding the “outlook” and needs for future work.
1. Principles of Stability Analysis of Ideal Crystals
In pioneering work, Born [4] introduced the concept of theoretical strength as an elastic instability phenomenon; the first attempt to carry out calculations of the range of elastic stability of an ideal crystal subjected to uniaxial load was made by Born and Furth [5]. According to Born, a crystal under homogeneous deformation may be treated as a conservative dynamical system with six degrees of freedom; stability, in the ordinary Lagrangian sense, is then to be assessed along conventional lines. In Born’s formulation, however, external work contributions are not explicitly and fully included in the total potential energy. As a result, Born’s criterion for elastic stability amounts to equating the range of elastic stability with the domain of convexity of internal strain energy, which, as first noted by Hill [1], is not coordinate invariant. Consequences of adopting this approach, together with developments of rigorous, coordinate invariant, elastic stability criteria for crystals under load, were presented by Hill and Milstein [6] and Milstein and Hill [7, 8], and are reviewed briefly here. If homogeneous strains of a crystal lattice are described by some set of “generalized coordinates” $q_r$ ($r = 1, \ldots, 6$) that together specify the geometry of the deformed crystallographic cell, then work-conjugate “generalized forces” $p_r$ in a configuration $q_r$ may be defined via the differential form

$$dE = p_r\,dq_r \tag{1}$$

(summation convention, $r = 1, \ldots, 6$), and “generalized moduli” $c_{rs}$ via

$$dp_r = c_{rs}\,dq_s \tag{2}$$

with

$$c_{rs} = \frac{\partial^2 E}{\partial q_r\,\partial q_s}, \tag{3}$$

where $E$ is the elastic strain energy per unit reference volume (e.g., per unit crystallographic cell). The incremental change in strain energy $\delta E$ resulting from incremental changes in the cell’s geometry $\delta q_r$ is then

$$\delta E = p_r\,\delta q_r + \tfrac{1}{2}\,c_{rs}\,\delta q_r\,\delta q_s, \tag{4}$$

correct to second order in the $q_r$. Various coordinate sets $q_r$ have been employed in practice; e.g., the Green variables were always adopted by the Born school; Macmillan and Kelly [9] employed elements of the stretch tensor, and in his earlier work, Milstein [10] used the edges of the crystallographic cell and their included angles; these have been termed G-, S-, and M-variables, respectively.
Now, consider the crystal to be in a current, homogeneously deformed, state $q_r$ under generalized forces $p_r$ and let the crystal undergo any small,
arbitrary, additional deformation of the chosen set $q_r$ specified by the set $\delta q_r$. Elastic stability of the crystal then signifies that the combined incremental potential energy of the crystal and its external loading (i.e., the sum of the incremental elastic strain energy $\delta E$ and external work $\delta W$) is positive for all possible, arbitrary, incremental variations $\delta q_r$. The increment $\delta W$ of external work must therefore also be specified objectively to second order in the $q_r$; i.e.,

$$\delta W = p_r\,\delta q_r + \tfrac{1}{2}\,k_{rs}\,\delta q_r\,\delta q_s, \tag{5}$$

where the coefficients $k_{rs}$ depend on the test configuration and the choice of variables $q_r$. The algebraic expression of the stability criterion, $\delta E - \delta W > 0$, then becomes

$$(c_{rs} - k_{rs})\,\delta q_r\,\delta q_s > 0 \tag{6}$$

for arbitrary $\delta q_r$ when not all $\delta q_r = 0$. Inequality (6) may be contrasted with the Born criterion, i.e.,

$$c_{rs}\,\delta q_r\,\delta q_s > 0, \tag{7}$$

which neglects the explicit inclusion of the second order work terms. Inequality (7), which thus equates elastic stability with the positive definiteness of the matrix of elastic moduli $c_{rs}$, is equivalent to the assertion that $\delta E > p_r\,\delta q_r$ to second order. The lack of general coordinate invariance of inequality (7) is demonstrated briefly below (see Ref. [6] for further details). If $q_r$ and $q_r^*$ represent two distinct choices of geometric coordinates, all variables appearing in relations (1)–(6) may be rewritten with asterisks when reckoned to the set $q_r^*$, and by invariance of the energy per unit mass of the crystal,

$$\frac{p_u^*\,dq_u^*}{\rho^*} = \frac{p_r\,dq_r}{\rho}, \tag{8}$$

where $\rho^*$ and $\rho$ are the masses (or equivalently, the number of atoms) in the reference cells. The conjugate variables then transform according to

$$\frac{\rho}{\rho^*}\,p_u^* = \frac{\partial q_r}{\partial q_u^*}\,p_r, \tag{9}$$

from which

$$\frac{\rho}{\rho^*}\,dp_u^* = \frac{\partial q_r}{\partial q_u^*}\,dp_r + \frac{\partial^2 q_r}{\partial q_u^*\,\partial q_v^*}\,p_r\,dq_v^*. \tag{10}$$

Next, substitute (2) and its asterisked analog into (10) and compare coefficients of the independent $dq_v^*$, which yields the transformation formulae for the moduli,

$$\frac{\rho}{\rho^*}\,c_{uv}^* = \frac{\partial q_r}{\partial q_u^*}\,\frac{\partial q_s}{\partial q_v^*}\,c_{rs} + \frac{\partial^2 q_r}{\partial q_u^*\,\partial q_v^*}\,p_r, \tag{11}$$
from which it follows that

$$\frac{\rho}{\rho^*}\,c_{uv}^*\,\delta q_u^*\,\delta q_v^* - c_{rs}\,\delta q_r\,\delta q_s = p_r\,\frac{\partial^2 q_r}{\partial q_u^*\,\partial q_v^*}\,\delta q_u^*\,\delta q_v^*. \tag{12}$$

The right hand side of (12) does not in general vanish, which thus demonstrates the lack of coordinate invariance of the Born criterion. Invariance of $\delta W/\rho$ also requires that the symmetrized $k_{rs}$ transform according to

$$\frac{\rho}{\rho^*}\,k_{uv}^* = \frac{\partial q_r}{\partial q_u^*}\,\frac{\partial q_s}{\partial q_v^*}\,k_{rs} + \frac{\partial^2 q_r}{\partial q_u^*\,\partial q_v^*}\,p_r, \tag{13}$$

in analogy with (11). Combining (11) and (13) then yields

$$\frac{1}{\rho^*}\,(c_{uv}^* - k_{uv}^*)\,\delta q_u^*\,\delta q_v^* = \frac{1}{\rho}\,(c_{rs} - k_{rs})\,\delta q_r\,\delta q_s, \tag{14}$$

which thereby demonstrates the coordinate invariance of the stability criterion (6). Relation (6), of course, reduces to relation (7) in the absence of applied load.
Consider next the topic of bifurcations of an initially stable crystal on a primary path under a prescribed mode of loading at the “critical stage” where the quadratic form of relation (6) first passes from positive definite to semidefinite, i.e., at the instant at which the stability criterion (6) is first violated. At this stage the homogeneous equations

$$\delta p_r - k_{rs}\,\delta q_s = 0 \tag{15}$$

necessarily have at least one eigensolution that causes the quadratic form to vanish; these equations are also necessarily coordinate invariant [6]. However, since branching of a primary path under a prescribed mode of loading is associated with loss of stability, it follows that the location of the presumed branch point on the primary path is likewise not coordinate invariant in general when the criterion for its inception is stationarity of the conjugate forces during some virtual increment of deformation; i.e., by analogy with (15), the corresponding eigenequations associated with the Born criterion,

$$\delta p_r = c_{rs}\,\delta q_s = 0, \tag{16}$$

are not coordinate invariant in general. (For example, under [100] uniaxial loading of a cubic crystal ($p_1 \neq 0$ in general, all other $p_r = 0$, $q_1 \neq q_2 = q_3$, and cell edges remain perpendicular on the primary path), $p_1$ achieves a maximum or minimum value on the primary path coincident with the extremum in the axial load $l_1$ if the $q_r$ are the S- or M-variables whereas $p_1$ and $l_1/\lambda_1$ reach
extrema simultaneously if qr represents the G-variables, where the axial stretch λ1 is the length of any fiber coaxial with the [100] direction divided by its length in the reference state.) The above considerations naturally evoke a number of practical questions. First, how does one deal with the coefficients krs in practice? These coefficients may be readily obtained for certain special cases, such as the well-defined, technically uncomplicated, loading environment provided by a uniform hydrostatic pressure that remains constant during any departure of the crystal’s geometry from equilibrium [6–8]. However, more generally, the precise determination of appropriate krs values presents a particularly challenging problem. As noted by Hill and Milstein [6], “the loading in laboratory experiments is usually frame dependent and the work is affected also by rotation of the specimen. On the intrinsic view, the loads ‘follow’ the material during any disturbance; they may, in addition, be deformation sensitive and so become different in kind from those in a state of equilibrium whose stability is under test.” What, then, might be the additional consequences and implications of dropping the krs terms and reverting to the original Born criterion (i.e., in addition to the issue of coordinate invariance, as already discussed)? More specifically, (i) are the limits of “stability”, as judged from relation (7), strongly or weakly dependent on the choice of geometric variables qr , (ii) what are the mechanical implications of the Born concept of instability, and (iii) are there some exceptional bifurcations on some paths that are essentially coordinate invariant according to this criterion? 
With regard to part (iii) of the above question, only one such coordinate invariant eigenstate has yet been identified; it occurs on a path of [100] uniaxial loading of an initially cubic crystal; this has been called the “c22 = c23 ” invariant eigenstate and, as is discussed later in this article, this state plays a particularly important role in both the [100] and [110] uniaxial loading behavior of cubic crystals. With regard to (ii), Hill and Milstein [6] did provide a notional mechanical interpretation of the Born criterion. This is summarized, as follows, in a paper on the [111] loading of cubic crystals [11, p. 4289]. Although not explicitly stated by Born, it is implicit to “Born’s view. . . . [that the loading] environment is notional, since the implied work input during any δqr is pr δqr correct to second order. This means that the loading must be imagined to ‘follow’ the deforming crystal servo-mechanically so as to hold fixed the values of pr , regardless of changes in shape or orientation (for instance, with the Green variables, the [111] load must be maintained along the Bravais cell diagonal and proportional to its length). To that extent, Born’s criterion could perhaps be said to characterize an ‘intrinsic’ strength that reflects a property of the material alone. The fact is, however, that [inequality (7)] is not coordinate invariant, but depends on the particular choice of variables and on the reference configuration”.
With regard to part (i) of the question posed on the prior page, it has been clearly demonstrated from a series of lattice statics based calculations of the domains of stability of cubic crystals under constant hydrostatic pressure that the ranges of “stability”, as judged by relation (7), are highly sensitive to the choice of qr (within the group of G-, S-, and M-variables), and they diverge significantly from the domains based on the rigorous criterion (6) [7, 8]. Furthermore, a meaningful physical interpretation of the notional Born criterion in a constant hydrostatic environment is lacking. On the other hand, when the same three sets of variables were used in LS computations to determine the ranges of “Born stability” of initially cubic crystals under uniaxial loadings coincident with principal symmetry directions, the typical result was a fairly small dependence on the choice of variables ([11]; Fang and Milstein, to be published; Chantasiriwan and Milstein, to be published); the exception was bcc metals under compression, as is discussed in the next section of this article. In addition, uniaxial IMD loading simulations have yielded instabilities in close proximity to the LS Born instabilities (Zhao, Maroudas, and Milstein, to be published). These results (which are discussed in following sections of this article) suggest that, while the Born criterion is inadequate, both philosophically and quantitatively, for assessment of stability under a constant hydrostatic environment, it can be efficacious for specific uniaxial loadings. Whether the Born criterion reasonably predicts the onset of instability under other modes of loading (e.g., shear or biaxial) and whether it has a strong or weak dependence on reasonable choices of the geometric variables remains to be investigated by means of IMD simulations, combined with LS computations based on diverse measures of lattice strain as generalized coordinates. 
With regard to the latter consideration, Hill [1] noted that, in principle, one could use components of various other measures of strain as generalized coordinates, and he considered any tensor coaxial with the principal fibers and having principal values $e(\lambda_1)$, $e(\lambda_2)$, $e(\lambda_3)$, where $\lambda_1$, $\lambda_2$, $\lambda_3$ are the principal stretches; $e(\lambda)$ can be any smooth monotone function that yields agreement with the classical infinitesimal strain when deformation is first order (i.e., $e(1) = 0$ and $e'(1) = 1$). Examples of $e(\lambda)$ are $\lambda - 1$, $\ln\lambda$, and $\tfrac{1}{2}(\lambda^2 - 1)$, the last of which generates the components of Green’s measure of strain.
It can be instructive to illustrate the concepts presented above by way of concrete examples. For this purpose, let us consider cubic crystals subjected to three different modes of applied load, viz. hydrostatic pressure, [100] uniaxial loading, and [111] uniaxial loading. Although both lattice statics and isostress molecular dynamics simulations have been carried out for each of these three cases, as is discussed in subsequent sections of this article, the discussion that follows in this section is independent of any specific model of atomic binding; the only assumption about atomic binding is that the path dependent internal energy $E$ and its derivatives with respect to the $q_r$ are calculable.
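The requirement that $E$ and its derivatives with respect to the $q_r$ be calculable can be illustrated with a small numerical sketch. The Python fragment below estimates the generalized forces and moduli of relations (1)–(3) by central differences. Everything in it is an assumption made for illustration: the function names, the step size, and in particular the polynomial `toy_energy`, which stands in for a real lattice model and does not represent any material.

```python
import numpy as np

def forces_and_moduli(energy, q, h=1e-3):
    """Central-difference estimates of p_r = dE/dq_r and c_rs = d2E/dq_r dq_s."""
    q = np.asarray(q, dtype=float)
    n = q.size
    steps = h * np.eye(n)              # row r is a step of size h along q_r
    p = np.zeros(n)
    c = np.zeros((n, n))
    for r in range(n):
        p[r] = (energy(q + steps[r]) - energy(q - steps[r])) / (2.0 * h)
        for s in range(r, n):
            c[r, s] = (energy(q + steps[r] + steps[s])
                       - energy(q + steps[r] - steps[s])
                       - energy(q - steps[r] + steps[s])
                       + energy(q - steps[r] - steps[s])) / (4.0 * h * h)
            c[s, r] = c[r, s]          # the moduli matrix is symmetric
    return p, c

# Hypothetical toy energy: quadratic part plus a weak cubic anharmonicity.
# The numbers are arbitrary and chosen only so the exact answer is known.
A = np.diag([4.0, 4.0, 4.0, 1.0, 1.0, 1.0])
A[:3, :3] += 1.0 - np.eye(3)           # off-diagonal couplings among q_1..q_3
B = 0.1

def toy_energy(q):
    q = np.asarray(q, dtype=float)
    return 0.5 * q @ A @ q + B * np.sum(q ** 3)

q0 = np.array([0.10, 0.05, -0.02, 0.03, 0.00, 0.01])
p_fd, c_fd = forces_and_moduli(toy_energy, q0)

# Analytic derivatives of the toy energy, for comparison.
p_exact = A @ q0 + 3.0 * B * q0 ** 2
c_exact = A + np.diag(6.0 * B * q0)
```

In a lattice statics setting the callable passed as `energy` would evaluate the model's strain energy per reference cell at the deformed geometry; analytic derivatives, where available, are generally preferable to finite differences.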
A cubic crystal under hydrostatic pressure remains cubic on a primary path, and thus has three independent elastic moduli $c_{11}$, $c_{12}$, and $c_{44}$. More fundamental, however, are the moduli $\kappa$, $\mu$, and $\mu'$ defined by

$$d\sigma_{11} + d\sigma_{22} + d\sigma_{33} = 3\kappa\,(\varepsilon_{11} + \varepsilon_{22} + \varepsilon_{33}), \tag{17}$$

$$d\sigma_{11} - d\sigma_{22} = 2\mu\,(\varepsilon_{11} - \varepsilon_{22}), \tag{18}$$

$$d\sigma_{12} = 2\mu'\,\varepsilon_{12}, \tag{19}$$

where the Cauchy stress is $\sigma_{ij}$, the Eulerian strain rate is $\varepsilon_{ij}$, and the $d$ preceding the $\sigma_{ij}$ denotes derivatives of components on cubic axes (or indeed on any rotating frame, when the current stress $\sigma_{ij} = -P\delta_{ij}$). Thus, $\kappa$ is the bulk modulus and $\mu$ and $\mu'$ are the shear moduli in the relation between the cubic-axes components of the Cauchy stress increment and the rotationless strain increment (evaluated relative to the current configuration under pressure $P$). Milstein and Hill [7] employed the principles of bifurcation analyses of general materials in the determination of stability criteria for cubic crystals subjected to hydrostatic loading. The analyses are carried out in a manner equivalent to Hill (Ref. [12] Chapter III, Section C2) but without recourse to the general mathematical apparatus for handling follower-loadings. Milstein and Hill’s treatment of crystal stability is rigorous and complete; i.e., (a) the loading environment is fully specified, to sufficient order and in both its active and passive modes, and (b) the potential energy of the system as a whole is examined in all the nearby, possibly inhomogeneous, configurations allowed by the kinematic constraints, if any. Under a hydrostatic pressure that does not vary during any departure from a considered configuration of equilibrium, elastic stability is guaranteed if

$$\kappa\,(\varepsilon_{11} + \varepsilon_{22} + \varepsilon_{33})^2 + \tfrac{2}{3}\,\mu\,[(\varepsilon_{11} - \varepsilon_{22})^2 + (\varepsilon_{22} - \varepsilon_{33})^2 + (\varepsilon_{33} - \varepsilon_{11})^2] + 4\mu'\,(\varepsilon_{12}^2 + \varepsilon_{23}^2 + \varepsilon_{31}^2) > 0. \tag{20}$$

Since the three terms are independently variable, the necessary and sufficient conditions for stability are the simultaneous satisfaction of the inequalities

$$\kappa(P) > 0, \qquad \mu(P) > 0, \qquad \text{and} \qquad \mu'(P) > 0. \tag{21}$$
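Conditions (21) lend themselves to a simple scan along a pressure path. The sketch below, in Python, reports the first tabulated pressure at which any of $\kappa$, $\mu$, $\mu'$ ceases to be positive. The function names are hypothetical, and the linear pressure dependences used to exercise the code are arbitrary placeholders, not results for any crystal or potential.

```python
import numpy as np

def hydrostatic_stability(kappa, mu, mu_prime):
    """Check inequalities (21) at one pressure; return (is_stable, failed moduli)."""
    failed = [name for name, value in
              (("kappa", kappa), ("mu", mu), ("mu_prime", mu_prime))
              if value <= 0.0]
    return not failed, failed

def first_instability(pressures, kappa, mu, mu_prime):
    """Return (P, failed moduli) at the first tabulated violation of (21), else None."""
    for P, k, m, mp in zip(pressures, kappa, mu, mu_prime):
        stable, failed = hydrostatic_stability(k, m, mp)
        if not stable:
            return float(P), failed
    return None

# Placeholder moduli-versus-pressure curves (arbitrary units, illustration only).
pressures = np.linspace(0.0, 20.0, 201)
kappa = 10.0 + 4.0 * pressures
mu = 2.0 + 0.5 * pressures
mu_prime = 3.0 - 0.2 * pressures       # crosses zero near P = 15

result = first_instability(pressures, kappa, mu, mu_prime)
```

With tabulated moduli from an actual calculation, the resolution of the pressure grid limits how closely the critical pressure is located; a root finder on the vanishing modulus would refine it.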
Milstein and Hill [7, 8] identified the primary eigenstates and corresponding eigensolutions $\eta_{ij}$ associated with loss of stability on a fundamental path at a pressure $P = Q$ as follows.
(i) $\kappa(Q) = 0$, $\mu(Q) > 0$, $\mu'(Q) > 0$ with eigensolutions $\eta_{11} = \eta_{22} = \eta_{33} \neq 0$; $\eta_{12} = \eta_{23} = \eta_{31} = 0$ (the eigenmode is necessarily homogeneous and purely volumetric, coincident with $dP/dV = 0$, where $V$ is the volume).
(ii) $\mu(Q) = 0$, $\kappa(Q) > 0$, $\mu'(Q) > 0$ with solutions such that $\eta_{11} + \eta_{22} + \eta_{33} = 0$; $\eta_{12} = \eta_{23} = \eta_{31} = 0$ (the uniform eigenmodes make the lattice orthorhombic, or possibly tetragonal, without varying the cell volume).
(iii) $\mu'(Q) = 0$, $\kappa(Q) > 0$, $\mu(Q) > 0$ with solutions such that $\eta_{11} = \eta_{22} = \eta_{33} = 0$; any ratios $\eta_{12} : \eta_{23} : \eta_{31}$ (the uniform eigenmodes distort the lattice without varying the lengths of the cell edges).
Explicit connections between relations (6) and (21) are obtained as follows. For a cubic crystal under hydrostatic pressure,

$$c_{rs}\,\delta q_r\,\delta q_s = \tfrac{1}{3}(c_{11} + 2c_{12})(\delta q_1 + \delta q_2 + \delta q_3)^2 + \tfrac{1}{3}(c_{11} - c_{12})[(\delta q_1 - \delta q_2)^2 + (\delta q_2 - \delta q_3)^2 + (\delta q_3 - \delta q_1)^2] + c_{44}[(\delta q_4)^2 + (\delta q_5)^2 + (\delta q_6)^2]. \tag{22}$$
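Identity (22) is easy to confirm numerically. The following Python sketch builds the 6×6 moduli matrix of a cubic crystal and compares the direct quadratic form with the three-term decomposition for a random increment. The function names and the sample moduli are illustrative assumptions only.

```python
import numpy as np

def cubic_moduli_matrix(c11, c12, c44):
    """6x6 moduli matrix with cubic symmetry in two-index notation."""
    c = np.zeros((6, 6))
    c[:3, :3] = c12
    c[np.arange(3), np.arange(3)] = c11
    c[np.arange(3, 6), np.arange(3, 6)] = c44
    return c

def quadratic_form_direct(c, dq):
    """c_rs dq_r dq_s evaluated as a matrix product."""
    return float(dq @ c @ dq)

def quadratic_form_decomposed(c11, c12, c44, dq):
    """Right-hand side of relation (22): volumetric, deviatoric, and shear parts."""
    d1, d2, d3, d4, d5, d6 = dq
    volumetric = (c11 + 2.0 * c12) * (d1 + d2 + d3) ** 2 / 3.0
    deviatoric = (c11 - c12) * ((d1 - d2) ** 2 + (d2 - d3) ** 2
                                + (d3 - d1) ** 2) / 3.0
    shear = c44 * (d4 ** 2 + d5 ** 2 + d6 ** 2)
    return volumetric + deviatoric + shear

rng = np.random.default_rng(0)
dq = rng.normal(size=6)                 # arbitrary increment
c11, c12, c44 = 2.5, 1.3, 0.8           # arbitrary sample values
lhs = quadratic_form_direct(cubic_moduli_matrix(c11, c12, c44), dq)
rhs = quadratic_form_decomposed(c11, c12, c44, dq)
```

Because the three squared groups can be varied independently, the sign of each coefficient controls definiteness, which is the step that leads from (22) to conditions of the form (23).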
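The form $k_{rs}\,\delta q_r\,\delta q_s$ can be expanded similarly, so the stability criterion (6) becomes

$$c_{11} + 2c_{12} > k_{11} + 2k_{12}, \qquad c_{11} - c_{12} > k_{11} - k_{12}, \qquad \text{and} \qquad c_{44} > k_{44}. \tag{23}$$

The relations between the moduli $c_{rs}$ and $\kappa$, $\mu$, and $\mu'$ for a cubic crystal under pressure [7] are

$$\frac{e'^2 c_{11}}{\lambda} = \kappa + \frac{4\mu}{3} + \frac{\lambda e''}{e'}\,P, \qquad \frac{e'^2 c_{12}}{\lambda} = \kappa - \frac{2}{3}\mu - P, \qquad \text{and} \qquad \frac{e'^2 c_{44}}{\lambda} = \mu' + \frac{1}{2}P\left(1 + \frac{\lambda e''}{e'}\right). \tag{24}$$

If $e(\lambda) = \tfrac{1}{2}(\lambda^2 - 1)$ (i.e., the Green measure of strain), relations (21) and (24) yield the stability criteria

$$\frac{3\kappa}{\lambda} = c_{11} + 2c_{12} + \frac{P}{\lambda} > 0, \qquad \frac{2\mu}{\lambda} = c_{11} - c_{12} - \frac{2P}{\lambda} > 0, \qquad \text{and} \qquad \frac{\mu'}{\lambda} = c_{44} - \frac{P}{\lambda} > 0, \tag{25}$$

from which $k_{11} = P/\lambda$, $k_{12} = -(P/\lambda)$, and $k_{44} = P/\lambda$ in the Green measure of strain. If $e(\lambda) = \lambda - 1$ (which generates the stretch measure),

$$3\kappa\lambda = c_{11} + 2c_{12} + 2P\lambda > 0, \qquad 2\mu\lambda = c_{11} - c_{12} - P\lambda > 0, \qquad \text{and} \qquad \mu'\lambda = c_{44} - \tfrac{1}{2}P\lambda > 0, \tag{26}$$

so, if $c_{rs}$ represents the S-moduli, $k_{11} = 0$, $k_{12} = -P\lambda$, and $k_{44} = P\lambda/2$. Finally, if the $q_r$ are the edges of the cubic cell and their included angles,

$$3\kappa\lambda = c_{11} + 2c_{12} + 2P\lambda > 0, \qquad 2\mu\lambda = c_{11} - c_{12} - P\lambda > 0, \qquad \text{and} \qquad \mu'\lambda^3 = c_{44} - P\lambda^3 > 0, \tag{27}$$

so in the M-variables, $k_{11} = 0$, $k_{12} = -P\lambda$, and $k_{44} = P\lambda^3$.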
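The practical meaning of relations (23) and (25)–(27) can be seen by evaluating both criteria in each set of variables for one and the same physical state. The Python sketch below does this under stated assumptions: the function names are hypothetical, and the state ($\kappa$, $\mu$, $\mu'$, $P$, $\lambda$) is an arbitrary set of numbers, chosen so that conditions (21) hold, with a negative $P$ (hydrostatic tension). It is not data for any material.

```python
def cubic_moduli(kappa, mu, mu_p, P, lam, variables):
    """c11, c12, c44 in G-, S-, or M-variables, solved from relations (25)-(27)."""
    if variables == "G":                       # Green variables, relations (25)
        total = (3.0 * kappa - P) / lam        # c11 + 2 c12
        diff = (2.0 * mu + 2.0 * P) / lam      # c11 - c12
        c44 = (mu_p + P) / lam
    elif variables == "S":                     # stretch variables, relations (26)
        total = (3.0 * kappa - 2.0 * P) * lam
        diff = (2.0 * mu + P) * lam
        c44 = (mu_p + 0.5 * P) * lam
    elif variables == "M":                     # cell edges and angles, relations (27)
        total = (3.0 * kappa - 2.0 * P) * lam
        diff = (2.0 * mu + P) * lam
        c44 = (mu_p + P) * lam ** 3
    else:
        raise ValueError("variables must be 'G', 'S', or 'M'")
    return (total + 2.0 * diff) / 3.0, (total - diff) / 3.0, c44

def work_coefficients(P, lam, variables):
    """k11, k12, k44 as quoted after relations (25), (26), and (27)."""
    return {"G": (P / lam, -P / lam, P / lam),
            "S": (0.0, -P * lam, 0.5 * P * lam),
            "M": (0.0, -P * lam, P * lam ** 3)}[variables]

def born_stable(c):
    """Criterion (7) for a cubic crystal, using the decomposition (22)."""
    c11, c12, c44 = c
    return c11 + 2.0 * c12 > 0.0 and c11 - c12 > 0.0 and c44 > 0.0

def rigorous_stable(c, k):
    """Criterion (6) for a cubic crystal under pressure, i.e., inequalities (23)."""
    c11, c12, c44 = c
    k11, k12, k44 = k
    return (c11 + 2.0 * c12 > k11 + 2.0 * k12
            and c11 - c12 > k11 - k12 and c44 > k44)

# Arbitrary illustrative state satisfying (21).
kappa, mu, mu_p, P, lam = 1.0, 0.8, 0.7, -1.0, 1.1
born, rigorous = {}, {}
for v in ("G", "S", "M"):
    c = cubic_moduli(kappa, mu, mu_p, P, lam, v)
    born[v] = born_stable(c)
    rigorous[v] = rigorous_stable(c, work_coefficients(P, lam, v))
```

For this sample state the rigorous verdict is the same in all three variable sets, as coordinate invariance requires, whereas the Born verdict changes with the choice of variables.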
Consider next the [100] loading of an initially cubic crystal. Under uniaxial load ($p_1 \neq 0$, all other $p_r = 0$), the crystal becomes tetragonal on a primary path ($q_1 \neq q_2 = q_3$; cell edges remain perpendicular) with six independent moduli $c_{rs}$, viz. $c_{11}$, $c_{12} = c_{13}$, $c_{22} = c_{33}$, $c_{23}$, $c_{44}$, and $c_{55} = c_{66}$ (all other $c_{rs} = 0$, and $c_{rs} = c_{sr}$, of course). The differential relations (2) that govern an arbitrary differential disturbance are then

$$dp_1 = c_{11}\,dq_1 + c_{12}(dq_2 + dq_3), \qquad dp_2 = c_{12}\,dq_1 + c_{22}\,dq_2 + c_{23}\,dq_3, \qquad \text{and} \qquad dp_3 = c_{12}\,dq_1 + c_{23}\,dq_2 + c_{22}\,dq_3, \tag{28}$$

with

$$dp_4 = c_{44}\,dq_4, \qquad dp_5 = c_{55}\,dq_5, \qquad dp_6 = c_{55}\,dq_6. \tag{29}$$

If the load were to remain uniaxial, i.e., $dp_1 \neq 0$, all other $dp_r = 0$, the general solution to (28) and (29) gives the coordinate increments $(dq_1, \ldots, dq_6)$ on the primary path, i.e.,

$$dp_1\,(c_{22} + c_{23},\; -c_{12},\; -c_{12},\; 0,\; 0,\; 0)\,[c_{11}(c_{22} + c_{23}) - 2c_{12}^2]^{-1}. \tag{30}$$
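Expression (30) can be cross-checked by solving relations (28) and (29) as a linear system. The Python sketch below does so for an arbitrary set of tetragonal moduli; the function names and the numerical values are assumptions for illustration.

```python
import numpy as np

def tetragonal_moduli(c11, c12, c22, c23, c44, c55):
    """6x6 moduli matrix on the tetragonal primary path of [100] loading."""
    c = np.zeros((6, 6))
    c[:3, :3] = [[c11, c12, c12],
                 [c12, c22, c23],
                 [c12, c23, c22]]
    c[3, 3] = c44
    c[4, 4] = c[5, 5] = c55
    return c

def primary_path_increment(c, dp1):
    """Solve dp_r = c_rs dq_s with dp = (dp1, 0, 0, 0, 0, 0)."""
    rhs = np.zeros(6)
    rhs[0] = dp1
    return np.linalg.solve(c, rhs)

def primary_path_increment_closed_form(c11, c12, c22, c23, dp1):
    """Closed-form increments of expression (30)."""
    denom = c11 * (c22 + c23) - 2.0 * c12 ** 2
    return dp1 * np.array([c22 + c23, -c12, -c12, 0.0, 0.0, 0.0]) / denom

# Arbitrary sample moduli (illustration only).
c11, c12, c22, c23, c44, c55 = 5.0, 2.0, 6.0, 1.5, 1.0, 1.2
dq_numeric = primary_path_increment(
    tetragonal_moduli(c11, c12, c22, c23, c44, c55), dp1=0.01)
dq_closed = primary_path_increment_closed_form(c11, c12, c22, c23, dp1=0.01)
```

The linear solve fails, and the closed form diverges, where the moduli matrix becomes singular; on this path that is where one of the factors of the determinant vanishes.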
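The quadratic form $c_{rs}\,\delta q_r\,\delta q_s$ may be reduced to the sum of independent squares

$$\frac{[c_{11}\,\delta q_1 + c_{12}(\delta q_2 + \delta q_3)]^2}{c_{11}} + \frac{1}{2}\left(c_{22} + c_{23} - \frac{2c_{12}^2}{c_{11}}\right)(\delta q_2 + \delta q_3)^2 + \frac{1}{2}(c_{22} - c_{23})(\delta q_2 - \delta q_3)^2 + c_{44}\,\delta q_4^2 + c_{55}\,(\delta q_5^2 + \delta q_6^2), \tag{31}$$

and the determinant of the moduli matrix factors as [6]

$$\det(c_{rs}) = (c_{22} - c_{23})\,[c_{11}(c_{22} + c_{23}) - 2c_{12}^2]\,c_{44}\,c_{55}^2. \tag{32}$$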
Thus, the necessary and sufficient conditions for Born stability according to relation (7) are seen to be

$$c_{11} > 0, \qquad c_{22} + c_{23} - \frac{2c_{12}^2}{c_{11}} > 0, \qquad c_{22} - c_{23} > 0, \tag{33}$$

together with

$$c_{44} > 0, \qquad c_{55} > 0. \tag{34}$$
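The determinant (32) can vanish when, and only when, at least one factor does, and each vanishing factor is associated with a particular type of eigensolution:

$$\begin{aligned}
&(2c_{12},\, -c_{11},\, -c_{11},\, 0,\, 0,\, 0), && \text{when } c_{22} + c_{23} = 2c_{12}^2/c_{11},\\
&(0,\, 1,\, -1,\, 0,\, 0,\, 0), && \text{when } c_{22} - c_{23} = 0,\\
&(0,\, 0,\, 0,\, 1,\, 0,\, 0), && \text{when } c_{44} = 0,\\
&(0,\, 0,\, 0,\, 0,\, 1,\, 0) \text{ and } (0,\, 0,\, 0,\, 0,\, 0,\, 1), && \text{when } c_{55} = 0.
\end{aligned} \tag{35}$$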
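Each eigensolution in (35) should be annihilated by the moduli matrix when its associated factor vanishes, in accordance with the eigenequations (16). The Python sketch below checks this for four hand-picked sets of moduli, one per vanishing factor. The function names, the tolerance, and the sample values are illustrative assumptions.

```python
import numpy as np

def tetragonal_moduli(c11, c12, c22, c23, c44, c55):
    """6x6 moduli matrix on the tetragonal primary path of [100] loading."""
    c = np.zeros((6, 6))
    c[:3, :3] = [[c11, c12, c12],
                 [c12, c22, c23],
                 [c12, c23, c22]]
    c[3, 3] = c44
    c[4, 4] = c[5, 5] = c55
    return c

def born_eigensolutions(c11, c12, c22, c23, c44, c55, tol=1e-12):
    """Eigensolutions (35) for whichever factors of det(c_rs) vanish.

    The first condition is tested in the multiplied-out form
    c11 (c22 + c23) - 2 c12^2 = 0, which assumes c11 is nonzero.
    """
    modes = []
    if abs(c11 * (c22 + c23) - 2.0 * c12 ** 2) < tol:
        modes.append(np.array([2.0 * c12, -c11, -c11, 0.0, 0.0, 0.0]))
    if abs(c22 - c23) < tol:
        modes.append(np.array([0.0, 1.0, -1.0, 0.0, 0.0, 0.0]))
    if abs(c44) < tol:
        modes.append(np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0]))
    if abs(c55) < tol:
        modes.append(np.array([0.0, 0.0, 0.0, 0.0, 1.0, 0.0]))
        modes.append(np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0]))
    return modes

# One arbitrary example per vanishing factor (illustration only).
cases = {
    "c22 + c23 = 2 c12^2 / c11": (4.0, 2.0, 1.5, 0.5, 1.0, 1.0),
    "c22 = c23": (4.0, 1.0, 2.0, 2.0, 1.0, 1.0),
    "c44 = 0": (4.0, 1.0, 3.0, 1.0, 0.0, 1.0),
    "c55 = 0": (4.0, 1.0, 3.0, 1.0, 1.0, 0.0),
}
residuals = {}
for label, moduli in cases.items():
    c = tetragonal_moduli(*moduli)
    residuals[label] = [float(np.linalg.norm(c @ v))
                        for v in born_eigensolutions(*moduli)]
```

A zero residual confirms that the corresponding increment leaves all $\delta p_r$ stationary, which is the sense in which these modes mark Born eigenstates.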
Born and Furth [5] employed an alternative method of judging convexity of internal energy based on the requirement that all principal minors of the matrix of moduli crs must be positive, according to a standard theorem in algebra. This approach, however, did not reveal the eigensolutions (35), and curiously, Born and Furth arrived at six “necessary and sufficient conditions for the stable equilibrium of the lattice”, five of which are equivalent to (33) and (34), in addition to the condition c22 > 0 (note however that relations (33) also imply c22 > 0). Next, we may ask, how useful is the notional Born concept, as expressed by relations (33)–(35), in judging the stability and bifurcation response of initially cubic crystals under [100] uniaxial loading? From expression (30), we see that the first kind of eigenstate in (35) occurs where the variable p1 becomes stationary, so the associated Young’s modulus vanishes. If this eigenstate were to terminate a notional stability range expressed in the M- or S-variables, it would occur at a maximum or minimum of the applied load l1 and therefore would make reasonable physical sense as a stability limit if the loading apparatus were to attempt to apply a constant uniaxial tensile load in excess of the maximum value of l1 (or compressive load exceeding the minimum l1 ) on the primary path, and transverse strains were unimpeded. (As indicated earlier, in the G-variables, this kind of eigenstate occurs where l1 /λ1 becomes stationary, and thus is found on the primary path before the maximum value of l1 is reached under tension and after a minimum of l1 in compression.) The location of the second eigenstate on the primary path is independent of the choice of qr , and thus if stability is indeed terminated at the invariant c22 =c23 eigenstate, the notional criterion (7) and the stability criterion (6) coincide. 
(The invariance simply requires that $q_1$ is coaxial with the uniaxial load $l_1$, which is of course coaxial with the unique axis of the tetragonal crystal, and $q_2$ and $q_3$ are coaxial with the transverse tetragonal axes.) A rigorous proof of this result is given by Hill and Milstein [6, p. 3093]. For physical insight, we note that the eigensolution at this state has all $dp_r = 0$, with $dq_2 = -dq_3$, all other $dq_r = 0$; if $q_r$ ($r = 1, 2, 3$) represents the edges of the tetragonal cell, then $p_r$ ($r = 1, 2, 3$) are the axial loads $l_r$ (i.e., the M-variables). Thus the eigendeformation at the branch point takes the crystal structure from the primary tetragonal path to a secondary or orthorhombic branch ($q_1 \neq q_2 \neq q_3$; cell edges remain orthogonal), with the uniaxial load remaining dead during the differential eigendeformation. On the secondary path, at the branch point, the generalized Poisson ratios $dq_2/dq_1$ and $dq_3/dq_1$ are infinite and of opposite algebraic sign and the first order expression for $dp_1/dq_1$ (i.e., expressed in terms of the second order moduli $c_{rs}$ alone) is indeterminate. In fact, owing to the highly singular nature of the secondary path at the branch point, the correct expression for the variation of axial load with axial stretch on the secondary path at the point of bifurcation (expressed in terms of elastic moduli on the primary path at this point) must include third and fourth order moduli, $c_{rst}$ and $c_{rstu}$. This is discussed further in the section of this article concerned with the role of higher order moduli.
The second and third types of eigenstates in (35) are somewhat analogous; i.e., upon rotation of the 2- and 3-axes by 45° about the 1-axis, a new set of axes is obtained on which the crystal maintains tetragonal symmetry. (For example, if on the original, unrotated, set of axes the crystallographic cell is described as body centered tetragonal, on the rotated axes the cell appears as face centered tetragonal.) As a result, the $c_{22} = c_{23}$ eigenstate, reckoned to the unrotated set of axes, occurs at the exact same point on the primary path as the “$c_{44} = 0$” eigenstate, reckoned to the rotated axes, and vice versa. It thus follows that the $c_{22} = c_{23}$ and $c_{44} = 0$ eigenstates have the same invariance, and the eigendeformation $\delta q_4 \neq 0$, all other $\delta q_r = 0$, on one set of tetragonal axes is identical to $\delta q_2 = -\delta q_3$, all other $\delta q_r = 0$, on the other (rotated) set of tetragonal axes. In view of the above discussion, we have clear, physically meaningful interpretations of each of the first three eigenstates in (35).
Physical clarity in each of these three cases is enhanced by the condition that, during the eigendeformation, the load $l_1$ remains uniaxial, parallel to the 1-axis, which in turn remains perpendicular to the 23-face of the crystallographic cell. Physical interpretation of the “$c_{55} = 0$” eigenstate is more problematic, owing to the characteristic shearing mode of the associated eigendeformation. That is, under this eigendeformation, if the load were to remain parallel to the 1-axis, it would cease to be perpendicular to the 23-face, while if it were to remain normal to the 23-face, it would no longer be aligned with the crystallographic 1-axis. This eigenstate is also not coordinate invariant; e.g., if $c_{55}$ and $c_{55}^*$ represent the Green and stretch moduli, respectively,

$$4c_{55}^* = (\lambda_1 + \lambda_2)^2\,c_{55} + l_1; \tag{36}$$

thus, the occurrence of a $c_{55}^* = 0$ eigenstate is preceded by a $c_{55} = 0$ eigenstate in a tensile region ($l_1 > 0$); the order of appearances is reversed under compression.
Next consider the notional stability criteria and associated bifurcation response for cubic crystals under [111] uniaxial loading, following the exposition of Ref. [11]. Under [111] loading, the primary path is axisymmetric;
Elastic stability criteria and structural bifurcations
select a set of rectangular axes with the 3-axis in the loading direction and the 1- and 2-axes arbitrarily transverse. Consider any fourth-rank tensor of moduli, however defined, with components crs expressed in the usual 2-index notation. Crystal symmetry reduces the number of independent moduli to six, which, together with their interrelationships, are c11 = c22, c33, c44 = c55, c12, c13 = c23, and c14 = −c24 = c56, with c66 = (1/2)(c11 − c12) and c15 = −c25 = −c46 = c14 sin 3φ/cos 3φ, where φ is the angle between the 1-axis and a line of nearest neighbors in the (111) plane; adopt φ = 0° (or any integer multiple of π/3), which thus reduces c15, c25, and c46 to zero. The method of stability analysis for [111] uniaxial loading is similar to that employed for [100] loading, and again is more direct and insightful than a simple evaluation of the principal minors of the crs matrix. With the symmetries described in the prior paragraph, the quadratic form (6) can be arranged as
$$\frac{[c_{33}\,\delta q_3 + c_{13}(\delta q_1 + \delta q_2)]^2}{c_{33}} + \frac{[c_{44}\,\delta q_4 + c_{14}(\delta q_1 - \delta q_2)]^2}{c_{44}} + \frac{1}{2}\left[c_{11} + c_{12} - \frac{2c_{13}^2}{c_{33}}\right](\delta q_1 + \delta q_2)^2 + \frac{1}{2}\left[c_{11} - c_{12} - \frac{2c_{14}^2}{c_{44}}\right]\left[(\delta q_1 - \delta q_2)^2 + \delta q_6^2\right] + \frac{(c_{44}\,\delta q_5 + c_{14}\,\delta q_6)^2}{c_{44}} \tag{37}$$
by successively “completing the square” in the variables taken in a sequence appropriate to the symmetries. Thus, in the manner of expression (31) for [100] loading, expression (37) makes “self-evident” the necessary and sufficient conditions for positive definiteness of the quadratic form crs δqr δqs under [111] loading, i.e.,
$$(c_{11} + c_{12})\,c_{33} > 2c_{13}^2 \quad \text{and} \quad (c_{11} - c_{12})\,c_{44} > 2c_{14}^2 \tag{38}$$
with
$$c_{33} > 0 \quad \text{and} \quad c_{44} > 0. \tag{39}$$
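As a numerical cross-check of conditions (38) and (39), the following sketch assembles the 6 × 6 matrix crs from the six independent moduli listed above (with φ = 0) and compares the inequalities with a direct eigenvalue test of positive definiteness. The function names and the numerical values of the moduli are hypothetical and serve only as an illustration.

```python
import numpy as np


def c_matrix_111(c11, c12, c13, c14, c33, c44):
    """Symmetric 6x6 matrix of moduli for [111] loading with phi = 0.

    Uses c22 = c11, c55 = c44, c23 = c13, c24 = -c14, c56 = c14 and
    c66 = (c11 - c12)/2, as stated in the text.
    """
    c = np.zeros((6, 6))
    c[0, 0] = c[1, 1] = c11
    c[2, 2] = c33
    c[3, 3] = c[4, 4] = c44
    c[5, 5] = 0.5 * (c11 - c12)
    c[0, 1] = c12
    c[0, 2] = c[1, 2] = c13
    c[0, 3] = c14
    c[1, 3] = -c14
    c[4, 5] = c14
    return c + np.triu(c, 1).T  # mirror the strict upper triangle


def notionally_stable_111(c11, c12, c13, c14, c33, c44):
    """Inequalities (38) and (39)."""
    return ((c11 + c12) * c33 > 2.0 * c13**2
            and (c11 - c12) * c44 > 2.0 * c14**2
            and c33 > 0.0
            and c44 > 0.0)


def positive_definite(matrix):
    """Direct test: all eigenvalues of the symmetric matrix are positive."""
    return bool(np.linalg.eigvalsh(matrix).min() > 0.0)


# Hypothetical moduli (arbitrary units): one stable set, one unstable set.
stable = dict(c11=200.0, c12=100.0, c13=80.0, c14=20.0, c33=250.0, c44=60.0)
unstable = dict(stable, c14=60.0)  # violates the second inequality in (38)

for moduli in (stable, unstable):
    print(notionally_stable_111(**moduli),
          positive_definite(c_matrix_111(**moduli)))
```

For both sets of moduli the two tests agree, which is what the completed-square form (37) implies.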
At the termination of a range of notional stability, quadratic form (37) becomes positive semi-definite at a primary eigenstate, and thus can be made zero by some critical disturbance δqr, called a primary eigenmode (or primary eigendeformation) that satisfies equations (16). This semi-definiteness occurs at the first violation of either of the inequalities (38), which, in themselves, preclude an earlier violation of either of the inequalities (39). Likewise, factorization of the determinant of the matrix crs yields
$$\det(c_{rs}) = \tfrac{1}{2}\left[(c_{11} + c_{12})\,c_{33} - 2c_{13}^2\right]\left[(c_{11} - c_{12})\,c_{44} - 2c_{14}^2\right]^2, \tag{40}$$
wherein vanishing of a factor is associated with semi-definiteness of the corresponding notional stability criterion in (38). The first factor vanishes at an
F. Milstein
extremum of p3 on the primary path, and the eigenmode is that of the axisymmetric path itself; i.e., a first order increment δqr that does not vary from the primary path is governed by
$$\delta p_3 = \left[c_{33} - \frac{2c_{13}^2}{c_{11} + c_{12}}\right]\delta q_3, \tag{41}$$
with
$$\delta q_1 = \delta q_2 = \frac{-\delta q_3\, c_{13}}{c_{11} + c_{12}}, \tag{42}$$
all other δqr = 0. When δp3 vanishes, the eigenmode becomes
$$\delta q_1 = \delta q_2 = \frac{-\delta q_3\, c_{33}}{2c_{13}}. \tag{43}$$
Vanishing of the second factor, since double, is associated with a pair of independent eigenmodes:
$$\frac{\delta q_1}{c_{44}} = \frac{-\delta q_2}{c_{44}} = \frac{-\delta q_4}{2c_{14}}, \tag{44}$$
$$\frac{\delta q_5}{c_{14}} = \frac{-\delta q_6}{c_{44}}, \tag{45}$$
all other δqr = 0 in both cases. Unlike the case of [100] loading, none of the [111] loading eigenstates is invariant with respect to choice of strain variables, which thereby attaches a special significance to atomic based simulations of the [111] loading response. Lattice statics studies can determine which eigenstate is primary and whether its location on the primary path is sensitive to the choice of variables. Isostress molecular dynamics simulations can determine whether stability is indeed lost in proximity of a primary eigenstate, the nature of the atomic mechanisms at and after the initiation of instability, whether these mechanisms are in accord with the primary eigenmodes identified above, and whether instability leads to failure or phase change. Other writers on elastic stability have approached the subject from their own unique perspectives. Wang et al. [13] developed criteria that they tested with molecular dynamics simulations and found good agreement between theory and simulation results for a cubic crystal loaded in tension, both hydrostatically and along a cube edge. Their criteria for failure under pressure are identical to those of Hill and Milstein, and failure under volumetric expansion was associated with the vanishing of the bulk modulus κ. Moreover, under [100] tensile loading, their crystal failed in association with the c22 = c23 eigenstate described above, which, of course, is invariant with respect to choice of
coordinates and thus is invariant within the formulation of any suitable theory. Morris et al. [14] sought stability criteria suitable for “tedious” ab initio computations and provided analyses for systems that maintain fixed boundaries; they propose that this condition yields an upper limit to the theoretical strength of a crystal, although, as they indicate, instabilities that result from deformations orthogonal to the chosen deformation may be missed.
2. Large Strain Mechanical Response
In this section, the mechanical behavior of crystals under large elastic strains is explored through various avenues, which include analyses of specific atomic based LS computations, considerations of crystal symmetry and of the essential nature of interatomic forces (i.e., repulsive and attractive, respectively, at small and large interatomic spacing), and examinations of available experimental evidence. Lattice statics model computations of primary paths, bifurcation points, and secondary paths that branch from the primary paths under homogeneous eigendeformations are discussed within the framework of the stability analyses presented in the prior section. Lattice statics computational results, together with stability theory, provide the bases for understanding inhomogeneous branching observed in IMD simulations, as is discussed in the final section of this article. Lattice statics simulations based upon empirical and semi-empirical atomic models (i.e., pair potentials, embedded atom methods, and quantum mechanically based pseudopotentials) are suitable for our present purpose, which is to elucidate a broad range of qualitative and semi-quantitative phenomena, rather than to delve into more complex ab initio models that currently are unsuitable for use in large scale IMD simulations. Here, we consider first the topic of cubic crystals under hydrostatic pressure, after which, uniaxial and shear loading responses are examined. Apparently the first model computations of the bulk and shear moduli of cubic crystals under hydrostatic pressure, defined by Eqs. (17)–(19), are those of Milstein and Hill [7, 8, 15, 16]. They computed these moduli for the entire family of Morse function fcc, bcc, and simple cubic (sc) crystals under hydrostatic compression and expansion; they also determined the domains of stability according to relations (21) and identified the eigenmodes at the domain limits. 
In a Morse model crystal, the interaction energy φ(r) between any two atoms separated by a distance r in the crystal is
$$\phi(r) = D\{\exp[-2\alpha(r - r_0)] - 2\exp[-\alpha(r - r_0)]\}; \tag{46}$$
the internal energy E is then obtained by summing over a sufficient number of pairwise interactions to obtain convergence, and the pressure and elastic moduli are computed from lattice summations containing derivatives of φ(r).
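The lattice summation described here can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' code: it sums the Morse function (46) over all atoms inside a cube of neighboring cells (a simple cutoff chosen for brevity), and the function names, the parameter values, and the cutoff `n_cells` are hypothetical.

```python
import itertools
import numpy as np

# Atom positions in the conventional cubic cell, in units of the cube edge a.
BASES = {
    "sc": [(0.0, 0.0, 0.0)],
    "bcc": [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)],
    "fcc": [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)],
}


def morse(r, D, alpha, r0):
    """Morse pair energy, Eq. (46)."""
    x = np.exp(-alpha * (r - r0))
    return D * (x * x - 2.0 * x)


def neighbor_distances(structure, a, n_cells=4):
    """Distances from the atom at the origin to all atoms in a cube of cells."""
    basis = np.array(BASES[structure])
    rng = range(-n_cells, n_cells + 1)
    cells = np.array(list(itertools.product(rng, rng, rng)), dtype=float)
    positions = (cells[:, None, :] + basis[None, :, :]).reshape(-1, 3) * a
    r = np.linalg.norm(positions, axis=1)
    return r[r > 1e-12]  # drop the origin atom itself


def energy_per_atom(structure, a, D, alpha, r0, n_cells=4):
    """Half the sum of pair energies seen by one atom."""
    r = neighbor_distances(structure, a, n_cells)
    return 0.5 * np.sum(morse(r, D, alpha, r0))


# Hypothetical parameters: D = 1, r0 = 1, log(beta) = alpha * r0 = 6,
# and an fcc cube edge chosen so that nearest neighbors sit at r0.
D, alpha, r0 = 1.0, 6.0, 1.0
a_fcc = np.sqrt(2.0) * r0
e_coarse = energy_per_atom("fcc", a_fcc, D, alpha, r0, n_cells=4)
e_fine = energy_per_atom("fcc", a_fcc, D, alpha, r0, n_cells=6)
print(e_coarse, e_fine)
```

Comparing the two cutoffs gives a simple convergence check; shorter range potentials (larger log β) converge with fewer cells, and longer range ones need more.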
The empirical parameters D, α, and r0 are usually determined by fitting the model crystal to experimental data; the parameter β ≡ exp(αr0) is an effective potential range indicator; larger values of β yield shorter range, steeper functions φ(r). Values of log β (natural logarithm) calculated from experimental values of elastic moduli and atomic volumes range from about 3 to 8 [17]. Milstein and Hill [7, 8] also computed the Born-notional ranges of stability of the complete family of Morse function cubic crystals, for various coordinate choices, according to relation (7), which by (22) or (23), is seen to be equivalent to c11 + 2c12 > 0, c11 − c12 > 0, and c44 > 0. Their results clearly demonstrated quantitatively the failure of the Born criterion to describe adequately the ranges of stability of crystals in a constant hydrostatic environment. For example, Fig. 1 compares the domains of classical stability (indicated as “exact”) with the notional domains of Born stability for the particular case in which the geometric coordinates are the components of Green’s strain. In the Morse model, each of these domains depends uniquely upon the parameter β; in fact, all dimensionless elastic properties of a Morse model crystal depend uniquely upon β. Among the divergences between the exact and notional criteria are (i) the notional G-stable range of sc crystals exists only in compression, whereas the exact criteria show that the sc crystal is stable only in hydrostatic tension (IMD simulations of Ref. [18] also verified that the sc Morse model crystals are stable in a range of hydrostatic expansion, and lose stability where predicted by the LS computations), (ii) the range of classical stability of fcc crystals increases monotonically as log β decreases whereas the corresponding G-stable range “peaks” near log β = 6, and (iii) the classical range of stability is much smaller than the G-stable range for bcc crystals with large values of log β, and vice versa for small values of log β. The “universal map” of classical stability domains of the Morse function cubic crystals has provided an effective basis for studying bcc to hexagonal close packed (hcp) transformation mechanisms in IMD simulations, as will be discussed in the final section of this article. The map was determined by computing the loci of states, λ = Λ, at which κ, µ, and µ′ vanish. Subscripts F, B, and S identify the moduli for the fcc, bcc, and sc lattices, respectively. Since the dimensionless elastic properties of each crystal structure depend uniquely upon β, the values of λ at which the moduli vanish likewise depend solely upon β and crystal structure. Figure 2 shows the dependencies of the Λ-values upon log β; the shear modulus µB of bcc crystals exhibited two zeros, ΛµB(L) and ΛµB(R), where “L” and “R” designate the left- and right-hand zeros, respectively. In this model, the fcc crystals are seen to remain stable under arbitrary hydrostatic compression and to lose stability in tension when κ = 0. Stability of the bcc crystals is terminated when κ = 0 or when µ = 0, depending on whether log β is less than or greater than about 3.91, respectively, and stability of these crystals is lost under tension or compression, depending on whether log β is less than or greater than 4.517, respectively.
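The following sketch indicates how a stretch at which κ vanishes might be located for a Morse model fcc crystal. It rests on assumptions not confirmed by the text shown here: κ is taken to be the ordinary bulk modulus −V dP/dV with P = −dE/dV (Eqs. (17)–(19) are not reproduced in this section), the lattice sum is truncated to a cube of cells, and all function names and the bisection helper are hypothetical.

```python
import itertools
import numpy as np

FCC_BASIS = np.array([(0.0, 0.0, 0.0), (0.0, 0.5, 0.5),
                      (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)])


def reduced_distances(n_cells=5):
    """fcc neighbor distances in units of the cube edge a."""
    rng = range(-n_cells, n_cells + 1)
    cells = np.array(list(itertools.product(rng, rng, rng)), dtype=float)
    pos = (cells[:, None, :] + FCC_BASIS[None, :, :]).reshape(-1, 3)
    s = np.linalg.norm(pos, axis=1)
    return s[s > 1e-12]


def energy_derivatives(a, s, alpha, r0, D=1.0):
    """dE/da and d2E/da2 of the Morse energy per atom, E = (1/2) sum phi(a*s)."""
    x = np.exp(-alpha * (a * s - r0))
    dphi = 2.0 * alpha * D * (x - x * x)
    d2phi = 2.0 * alpha**2 * D * (2.0 * x * x - x)
    return 0.5 * np.sum(s * dphi), 0.5 * np.sum(s * s * d2phi)


def bulk_modulus(a, s, alpha, r0):
    """kappa = -V dP/dV with V = a**3/4 per atom (assumed definition)."""
    e1, e2 = energy_derivatives(a, s, alpha, r0)
    return 4.0 / (9.0 * a) * (e2 - 2.0 * e1 / a)


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; f(lo) and f(hi) must have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def kappa_zero_stretch(log_beta, r0=1.0, n_cells=5):
    """Stretch, relative to the zero-pressure state, at which kappa vanishes."""
    alpha = log_beta / r0
    s = reduced_distances(n_cells)

    def dE(a):
        return energy_derivatives(a, s, alpha, r0)[0]

    a_hi = np.sqrt(2.0) * r0      # nearest neighbors at r0, so dE/da > 0
    a_lo = a_hi
    while dE(a_lo) > 0:           # step inward until the net force is repulsive
        a_lo *= 0.98
    a0 = bisect(dE, a_lo, a_hi)   # zero-pressure cube edge
    a_k = bisect(lambda a: bulk_modulus(a, s, alpha, r0), a0, 2.0 * a0)
    return a_k / a0


lam_kappa = kappa_zero_stretch(6.0)
print(lam_kappa)
```

The root lies in hydrostatic tension (stretch greater than unity), in keeping with the statement that the fcc crystals lose stability in tension when κ = 0. The upper bracket 2a0 assumes a potential that is not extremely long ranged, and the sketch makes no attempt to reproduce the published curves quantitatively.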
[Figure 1 graphic: (a) curves FCC EXACT and FCC G-STABLE; (b) curves SC EXACT, SC G-STABLE, BCC EXACT, and BCC G-STABLE; vertical axes: stretch λ; horizontal axes: log β.]
Figure 1. Domains of classical stability in a hydrostatic environment, and of convexity of the strain energy relative to the Green variables, for the Morse family of (a) fcc crystals and (b) bcc and sc crystals. The lattices are classically stable according to (21) in the regions indicated as “EXACT,” and notionally stable according to (7), with the Green variables, in the regions indicated as “G-STABLE”. The stretch λ is the length of any fiber in the crystal in its current state divided by its length in the absence of pressure P; λ > 1 indicates hydrostatic tension (P < 0) while λ < 1 signifies compression; the potential range indicator is log β = αr0. From Ref. [8].
[Figure 2 graphic: panels (a) and (b) show the curves Λκ, ΛµF, Λµ′F, ΛµB(L), ΛµB(R), Λµ′B, ΛµS, and Λµ′S; vertical axes: Λκ, Λµ, and Λµ′; horizontal axes: log β.]
Figure 2. Values of stretch ΛαA at which the bulk modulus (α = κ) and shear moduli (α = µ, µ′) of Morse model fcc, bcc, and sc (A = F, B, S) crystals vanish, as functions of the potential range indicator log β; µB and µ′S are positive above ΛµB(R) and Λµ′S, respectively; all other moduli are positive at stretches below their corresponding Λ-curve: (a) over a wide range of values and (b) enlarged view, over a more limited range. From Ref. [7].
Following the work of Milstein and Hill [7, 8, 15, 16], Rasky and Milstein [19] derived analytic formulae for computing the elastic moduli of cubic metals described by quantum mechanically based pseudopotential models, under axial loadings, and formulated specific pseudopotential models appropriate for the alkali metals, based on the Heine–Abarenkov local model potential and the Taylor approximation for electron correlation and exchange. With but two adjustable parameters, the model was shown to provide very good agreement with nine experimentally determined properties (i.e., the binding energies, atomic volumes, elastic moduli κ, µ, and µ′, first derivatives of the three moduli with respect to pressure, and second derivatives of κ; the second derivatives of the shear moduli were also computed, but experimental data were lacking); excellent agreement between theoretical and experimental pressure–volume relations was also obtained. Subsequently, Milstein and Rasky [20] employed the pseudopotential model to compute the bulk and shear moduli of the bcc and fcc configurations of each alkali metal over extensive ranges of hydrostatic compression and expansion. The alkali metals are known experimentally to exhibit seemingly diverse behaviors [21]. For example, at low temperatures, the heavier metals Cs, Rb, and K are bcc while Na and Li are in close-packed structures that are similar to fcc with periodic stacking faults; such close-packed structures evidently differ little in energy from the fcc phase. Indeed, cold working of Li below 75 K produces fcc. Under pressure, Cs, Rb, and K undergo bcc to fcc transitions, with the transition pressure greatest for K and least for Cs; also, experimentally, Na transforms from a close-packed structure to bcc at a relatively low pressure, and the bcc and close-packed structures coexist over a large range of pressure.
From a theoretical, computational viewpoint, Milstein and Rasky [20] showed that the bcc to fcc transformations in the heavier alkali metals are associated with the vanishing of the shear modulus µB and the simultaneous growth of the shear modulus µF, from negative (or “weakly positive”) to “strongly positive”. For Na, however, both the bcc and fcc structures exhibit elastic stability over wide ranges of compression in the region of transition between the bcc and close-packed structures, in accord with the experimentally observed “sluggishness” in this transition. Apparently these were the first computations of shear moduli over wide ranges of compression based on a theoretical model more sophisticated than pair potential models, and the first wherein initially unstressed, stable, cubic crystals become unstable under compression as a result of a shear modulus passing from positive to negative. The usual explanation for the bcc to fcc transitions in the heavier alkali metals is that, at high pressure, the valence electrons shift from primarily an s-character to a d-character, and the s–d transfer causes the structural transitions. The work of Milstein and Rasky [20], however, shows that these structural transitions may occur as a natural consequence of lattice instabilities, without the necessity of invoking the s–d electron transfer mechanism as the
driving force for structural change. Indeed, the s–d electron transfer may be an effect, rather than a cause, of the instabilities. Figure 3 shows results of computations of the shear moduli µB and µF, the bulk moduli κ, and the difference in Gibbs energy between the phases for Na and Rb as examples. For all of the alkali metals, the moduli µ′B and µ′F (not shown in Fig. 3) are positive throughout the compression range (µ′B for Li at very high pressures is an exception). In LS computations, the Gibbs energy and the enthalpy E + PV are identical, of course, since the atomic positions are “frozen”. Additionally, the variations of pressure with stretch for the fcc and bcc phases are found to be almost identical; thus the difference in Gibbs energy between the two phases at a given pressure may be represented on a plot where the stretch λ is the independent variable, as shown in Fig. 3, and the Gibbs energy differences are essentially the differences in binding energy per atom, ΔE, since the pressure–volume products of the two phases are almost identical at a given pressure or volume. An examination of the interplay between the Gibbs energy difference and the shear moduli variations in Fig. 3 illustrates the critical role of elastic stability in phase transformation theory. At λ = Λa and at λ = Λb, where both phases have the same Gibbs energy and are equally favored thermodynamically, the values of µB and µF are very close. Where µB is substantially greater than µF, ΔE is strongly negative, thereby favoring the bcc structure; and vice versa where µF is considerably larger than µB. For further insight, assume that a stable bcc crystal is compressed to where its Gibbs energy just exceeds that of the fcc crystal (e.g., to a state “just beyond” point Λb in Fig. 3).
At this state, the bcc crystal is no longer thermodynamically favored, but it is still elastically stable, so an enthalpy barrier exists that resists transformation from the bcc state on any and all transformation paths. The barrier may be overcome and the transformation may proceed along some particular path (or paths) under the influence of some finite disturbance. In the absence of sufficient disturbance, the bcc state may continue to exist indefinitely. With further increase of pressure, as the stable bcc crystal approaches the state where µB = 0 (λ = ΛµB(M)), the disturbance required to cause phase transformation diminishes; at λ = ΛµB(M), the barrier for transformation vanishes on some unique transformation path (or paths), and then an infinitesimal disturbance would trigger the transformation. Such lattice “disturbances” may include thermally activated atomic vibrations, free surfaces, and internal defects that act as stress raisers. Isostress molecular dynamics studies of thermal activation of phase transitions and fractures associated with elastic instabilities have been carried out and are underway, as is discussed later in this article. Future work should also include IMD simulations of the behavior of more realistic crystal models (i.e., those containing internal defects and/or free surfaces) as an incipient instability is approached. In passing through the series of alkali metals, from Cs to Li, the states where λ = Λa, λ = Λb, and λ = ΛµB(M) were found to occur at progressively lower
Figure 3. Gibbs energy difference and the elastic moduli that control stability of the bcc and fcc alkali metals in hydrostatic loading; in these figures, the Λ’s terminate stability ranges; the subscripts L and M indicate “left-hand” and “mid-range” zeros, respectively, in the shear moduli functions (1 GPa = 10^10 dyn/cm^2). (a) Na and (b) Rb. From Ref. [20].
values of stretch and higher pressures; i.e., through this series, the curves µ(λ) and ΔE(λ) are shifted toward the region of higher compression. It is of particular interest to note that the states “λ = Λa”, where the Gibbs energy difference vanishes, occur in regions of hydrostatic tension for K, Rb, and Cs and in compression for Na and Li. Thus, although both the bcc and fcc phases of all five alkali metals are elastically stable at zero pressure, the thermodynamically preferred zero-pressure structures (i.e., the structures with the lower Gibbs energy) are indicated to be bcc for K, Rb, and Cs and fcc for Li and Na, in good agreement with experiment (i.e., as mentioned earlier, the low-temperature phases of K, Rb, and Cs are indeed bcc while Li and Na are close packed, similar to “faulted fcc”). From a theoretical viewpoint, the apparently divergent behaviors among the lighter and heavier alkali metals, as well as the increasing bcc to fcc transformation pressures that occur through the series Cs–K, are thus seen simply to be a result of subtle shifts of the oscillatory moduli and Gibbs energy functions. It is noteworthy (and perhaps contrary to one’s intuition) that increasing compression stabilizes the alkalis’ bcc structure, which is considered a more “open” structure than fcc; this increasing stabilization of bcc occurs over large ranges of tension and compression (e.g., where the slope of ΔE(λ) is positive in Fig. 3). Increasing pressure also stabilizes the bcc structure in the Morse model, as is seen from Figs. 1 and 2. Further comparisons among the pseudopotential and Morse models of the alkali metals reveal similar pressure–volume relations (see Fig. 4), the condition µ′ > µ for both the bcc and fcc structures over wide ranges of compression and expansion, more than one zero in the µB-functions, and similar low pressure µB(λ)/µ′B(λ) ratios that increase initially with increasing compression (see Fig. 5).
Thus, although the pseudopotential model of the alkalis is both rigorous and highly complex when compared with the Morse model (as, e.g., is evident from a comparison of the lattice summations required for moduli calculations in Ref. [19, Eqs. A5–A37] with those in Ref. [17, Eqs. 4.1–4.4]), the two models do have a number of important features in common. Although the Morse model represents a considerable simplification of the state of interatomic interactions in most real crystals, it has been found useful for exploring qualitative and semi-quantitative phenomena, since (i) it incorporates the essential nature of interatomic interactions, (ii) it often yields good agreement with more sophisticated model computations and with experimental evidence, particularly for uniaxial loadings (as is discussed later in this section), and (iii) it is sufficiently mathematically tractable to enable lattice instabilities and post-bifurcation behavior to be studied in IMD simulations of large (realistic size) supercells. One inadequacy of the Morse model, however, is its inability to replicate relatively large values of shear moduli ratios µB/µ′B, as are found among the bcc transition metals. Chantasiriwan and Milstein [22] recognized a need for atomic models that accurately reproduce the elastic
Figure 4. Comparison of pseudopotential and Morse model computations of the pressures and bulk moduli of the bcc alkali metals in compression; κ(1) is the bulk modulus at zero pressure. From Ref. [19].
behavior of bcc transition metals and yet are suitable for use in large-supercell molecular dynamics simulations. This consideration led them to formulate embedded atom method (EAM) models for a number of cubic metals, including the bcc transition metals Fe, Mo, and Nb, that identically reproduce experimental values of the three second-order and six third-order elastic moduli. Thus, the initial linear (harmonic) and non-linear (anharmonic) mechanical responses of the models under arbitrary loading are dictated by experiment. In addition, (i) the pressure–volume relations of the metals are accurately modeled, (ii) the models yield good quality phonon spectra, (iii) the relative energetics between the bcc and fcc structures yields the correct low temperature, zero-stress, phase, and for Fe, K, Na, and Rb, experimentally observed phase
Figure 5. Comparison of pseudopotential and Morse model computations of the shear modulus ratio µB/µ′B of the alkali metals in compression. From Ref. [19].
transitions are indicated (Cs was not modeled), (iv) the energy of the crystal and its derivatives are represented by convenient analytical forms, and (v) the lattice summations converge rapidly (generally after third- or fourth-nearest neighbor interactions), so applications are not computationally intensive (the necessity of including at least third-nearest-neighbor interactions in the EAM formulation was noted by Chantasiriwan and Milstein [23]). Chantasiriwan and Milstein (to be published) also used their EAM models to compute the pressure dependencies of the moduli κ, µ, and µ′ of cubic crystals in LS simulations. Their LS work is intended to serve as a prelude to EAM IMD studies of lattice stability, to be carried out in due course. As an example of their EAM LS computational results, Fig. 6 shows the Gibbs energies G and the differences in Gibbs energies ΔG of the bcc, fcc, and body centered tetragonal (bct) phases of Fe under hydrostatic pressure (the bct structure will be discussed later in this section) and Fig. 7 shows the pressure
Figure 6. Influence of hydrostatic pressure on the Gibbs energy per atom, G, of the bcc, fcc, and bct structures, and the Gibbs energy differences, ΔG, for the EAM model of Fe. From Chantasiriwan and Milstein, to be published.
dependences of the moduli κ, µ, and µ′ of the bcc and fcc structures. Although the EAM model still represents a considerable simplification of atomic bonding in Fe, it does yield reasonable agreement with experiment, as well as valuable insights. Experimentally, Fe is known to be bcc at atmospheric pressure and temperatures T below 1173 K; depending on the temperature, the application of pressure induces a bcc to fcc transformation (757 K
Figure 7. Influence of hydrostatic pressure on the bulk modulus κ and shear moduli µ and µ′ of the bcc and fcc structures of the EAM model of Fe. From Chantasiriwan and Milstein, to be published.
the fcc structure, with the Gibbs energy difference vanishing at about 23 GPa, after which the fcc structure has the lower Gibbs energy. The fcc phase is found to be unstable (µF < 0) up to pressures of about 8 GPa, where µF turns positive with increasing pressure. The modulus µB initially increases with increasing pressure, thus initially further stabilizing the bcc structure; however, with further increases of pressure, µB is diminished, and it eventually turns negative at about 50 GPa, causing loss of stability of the bcc structure. (A second zero in µB is found at about 120 GPa.) The initial increase in µB is “mandated” by the experimental values of the second- and third-order elastic moduli of Fe that are “built into” the model; the subsequent decrease in µB is a result that is “extracted” from the model. As µB decreases, the enthalpy barrier that acts to prevent loss of the bcc structure likewise decreases, while the driving force for phase transformation (i.e., the Gibbs energy difference) increases. The experimentally observed sluggishness of the bcc to hcp transition can be understood from the continued elastic stability of the bcc phase (µB > 0) at pressures beyond that at which the Gibbs energy difference between the phases vanishes. As will be discussed later in this article, a phase transformation path associated with a vanishing or diminishing shear modulus µB may take the bcc
crystal into either an fcc or an hcp structure, depending on whether branching from the primary (cubic) path is homogeneous or inhomogeneous. The pseudopotential, EAM, and Morse model computational results all show that the bcc structure is “susceptible” to µ = 0 instabilities, i.e., to eigenstates of type (ii) (see after Eq. (21)); the associated homogeneous eigenmode takes the crystal structure from cubic to orthorhombic. Homogeneous branching at a µ = 0 eigenstate was first studied by Milstein et al. [24], and subsequently by Chantasiriwan and Milstein (to be published). Milstein et al. [24] began by conducting a thorough computational search of the orthorhombic states in the neighborhood of the µ = 0 eigenstates in the alkali metal pseudopotential models in order to locate all states in which the externally applied stresses and internal stresses remain in equilibrium (not necessarily a stable equilibrium) and hydrostatic after branching. (This was accomplished by calculating the internal states of stress while varying, independently, the three axial stretches λ1 , λ2 , and λ3 .) It was then found that the only non-cubic crystallographic states in the neighborhood of the µ = 0 eigenstates that satisfied the hydrostatic condition were on a unique secondary path on which the crystal remained tetragonal; furthermore, the same hydrostatic tetragonal path branched from all of the µB = 0 and µF = 0 eigenstates, thereby linking the primary bcc and fcc paths. 
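A tetragonal path can link the bcc and fcc structures because an fcc lattice is itself a body centered tetragonal lattice whose unique edge is 2^{1/2} times the other two. The sketch below checks this geometric fact by counting atoms in the first two neighbor shells; the function names and the cell count `n_cells` are hypothetical.

```python
import itertools
import numpy as np


def bct_points(a1, a2, n_cells=3):
    """Body centered tetragonal lattice points around the origin.

    a1 is the unique edge (along axis 1); a2 is the edge along axes 2 and 3.
    """
    rng = range(-n_cells, n_cells + 1)
    cells = np.array(list(itertools.product(rng, rng, rng)), dtype=float)
    points = np.vstack([cells, cells + 0.5])  # corners and body centers
    return points * np.array([a1, a2, a2])


def shells(points, n_shells=2):
    """(distance, number of atoms) for the first few neighbor shells."""
    r = np.linalg.norm(points, axis=1)
    r = np.round(r[r > 1e-9], 8)              # drop the origin, merge equal radii
    dist, count = np.unique(r, return_counts=True)
    return [(float(d), int(c)) for d, c in zip(dist[:n_shells], count[:n_shells])]


bcc_shells = shells(bct_points(1.0, 1.0))            # a1 = a2: bcc
fcc_shells = shells(bct_points(np.sqrt(2.0), 1.0))   # a1 = sqrt(2)*a2: fcc
print(bcc_shells)
print(fcc_shells)
```

With equal edges the shell populations are 8 and 6, as for bcc; with a1 = 2^{1/2} a2 they become 12 and 6, with the second shell at 2^{1/2} times the nearest neighbor distance, as for fcc.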
(One way to envision the process is to assume that we may servo-mechanically control the lattice parameters of the cubic crystals; any incremental departure from the cubic path causes the path to become non-hydrostatic, except for unique departures carried out in the neighborhood of the µ = 0 eigenstates; at a µ = 0 eigenstate, we may deform the crystal along the unique secondary tetragonal path while the internal stresses remain hydrostatic; if we “start from” the bcc structure at a µB = 0 eigenstate, we will “end up” at the fcc structure at a µF = 0 eigenstate; the secondary path thus provides a hydrostatic, homogeneous, phase transformation path between the two cubic structures.) Figure 8 shows an example of homogeneous branching of cubic crystals under hydrostatic pressure for the case of the pseudopotential model of Rb. In this figure, all stretches are referenced to the unstressed bcc structure; i.e., λi = aib/aob (i = 1, 2, 3) where the lattice parameters of the body centered crystal at any stage are a1b, a2b, and a3b, and the lattice parameter of the unstressed bcc crystal is aob. On a primary bcc path, the stretches are always λ1 = λ2 = λ3. The fcc structure can also be described as bct, with body centered (bc) lattice parameters a2b = a3b = a1b/2^{1/2}; thus the stretches at any stage on the primary fcc path vary as λ2 = λ3 = λ1/2^{1/2}. Successive computational stages on the primary paths are readily specified by the incrementation δλ1 = δλ2 = δλ3 for bcc and δλ2 = δλ3 = δλ1/2^{1/2} for fcc; this ensures continued hydrostatic loading. Once cubic symmetry is broken at the branch point, however, hydrostatic states are no longer “automatically” located in such a simple manner; for a given increment δλ1 on the secondary path, the values of δλ2 and δλ3 were iteratively
Figure 8. Branching behavior under hydrostatic pressure in the pseudopotential model of Rb; the shear moduli µ of the cubic structures vanish at the states designated by Λ’s, at which points the secondary path branches from the primary cubic paths; on the secondary path, the crystal structure is body centered tetragonal (or equivalently, face centered tetragonal) and the state of internal stress is hydrostatic (the primary paths are of course also hydrostatic): (a) crystal geometry, (b) binding energy per atom, and (c) pressure. From Ref. [24].
varied independently to reach hydrostatic states (i.e., to where the stresses in the three principal directions became equal). It is to be emphasized that the only states at which the crystal is able to depart from cubic symmetry homogeneously and still remain hydrostatic are those at a shear modulus instability, and for the µ = 0 type of instability, the only homogeneous branching observed computationally is cubic to tetragonal. Furthermore, when higher order branching theory was employed, as is discussed in the next section of this article, it was proven ([24] and Chantasiriwan and Milstein (to be published)) that, for an initially cubic crystal, the only allowed symmetry-breaking, homogeneous, branching geometries are cubic to tetragonal, when µ vanishes, and cubic to trigonal, when µ′ vanishes, in agreement with computational results. Next we consider the response of cubic crystals under uniaxial loadings coincident with the principal symmetry axes. Experimentally it is known that the general elastic response of a metal, in both the linear and non-linear ranges, depends upon the “subgroup” to which the metal belongs. For example, three distinct subgroups are (i) fcc metals in general, (ii) the bcc β-brasses and alkali metals, and (iii) the bcc transition metals.
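The linear elastic quantities that distinguish these subgroups, namely the Young moduli and Poisson ratios for loading along the principal symmetry axes, follow from the three cubic elastic constants through standard compliance formulas of linear cubic elasticity. The sketch below is illustrative only: the function names are hypothetical, and the constants used for Cu (in GPa) are approximate room temperature handbook values introduced here as an assumption, not data from the article.

```python
import numpy as np


def cubic_compliances(c11, c12, c44):
    """Compliances s11, s12, s44 of an unstressed cubic crystal."""
    denom = (c11 - c12) * (c11 + 2.0 * c12)
    return (c11 + c12) / denom, -c12 / denom, 1.0 / c44


def _unit(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)


def young_modulus(load_dir, s11, s12, s44):
    """Initial ratio of stress to strain for uniaxial load along load_dir."""
    n = _unit(load_dir)
    aniso = s11 - s12 - 0.5 * s44
    mix = n[0]**2 * n[1]**2 + n[1]**2 * n[2]**2 + n[2]**2 * n[0]**2
    return 1.0 / (s11 - 2.0 * aniso * mix)


def poisson_ratio(load_dir, trans_dir, s11, s12, s44):
    """Negative ratio of transverse strain (along trans_dir) to axial strain.

    trans_dir must be perpendicular to load_dir.
    """
    n, m = _unit(load_dir), _unit(trans_dir)
    aniso = s11 - s12 - 0.5 * s44
    s12_rotated = s12 + aniso * np.sum(n**2 * m**2)
    return -s12_rotated * young_modulus(load_dir, s11, s12, s44)


# Approximate constants for Cu in GPa (assumed values, for illustration only).
s = cubic_compliances(c11=168.4, c12=121.4, c44=75.4)

E100 = young_modulus([1, 0, 0], *s)
E110 = young_modulus([1, 1, 0], *s)
E111 = young_modulus([1, 1, 1], *s)
nu100 = poisson_ratio([1, 0, 0], [0, 1, 0], *s)
nu111 = poisson_ratio([1, 1, 1], [1, -1, 0], *s)
nu110_001 = poisson_ratio([1, 1, 0], [0, 0, 1], *s)
nu110_1m10 = poisson_ratio([1, 1, 0], [1, -1, 0], *s)
print(E100, E110, E111)
print(nu110_001, nu100, nu111, nu110_1m10)
```

With these constants the moduli increase from [100] to [110] to [111], and the Poisson ratio for [110] loading is negative in the transverse [1-10] direction, which is the pattern reported for most fcc metals.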
For fcc metals, the Young moduli and Poisson ratios, respectively, are ordered according to $E_{111} > E_{110} > E_{100}$ (this ordering also implies $\mu/\mu' < 1$) and $\nu_{110}^{001} > \nu_{100} > \nu_{111} > 0 > \nu_{110}^{1\bar{1}0}$, where $E_{hkl}$ is the initial ratio of stress to strain for uniaxial loading in the $[hkl]$ crystallographic direction and $\nu_{hkl}^{h'k'l'}$ is the negative of the initial ratio of transverse strain in the $[h'k'l']$ direction to axial strain in the $[hkl]$ direction under $[hkl]$ uniaxial loading (under [100] and [111] uniaxial loadings, the transverse strain is isotropic, so the superscripts are unnecessary); fcc metals also exhibit upwardly concave stress–strain relations in [100] loading, but downward concavity in [110] and [111] loading, with the magnitude of the nonlinearity greatest for the case of [110] loading and least for [111] loading [25, 26]. (As exceptions to the above “rules,” Al, Pt, and Ir lack the negative Poisson ratio in [110] loading; the $\mu/\mu'$ ratios of these metals, while smaller than unity, are considerably larger than the $\mu/\mu'$ ratios of other fcc metals (e.g., Ni, Pd, Cu, Ag, Au, Pb, Ce, and Th); and the [100] stress–strain curve of Al is concave downward.) The elastic response of the bcc alkali metals and the β-brasses is also characterized by the orderings $\mu' > \mu$ (typically, $\mu/\mu'$ is of the order 0.1), $E_{111} > E_{110} > E_{100}$, and $\nu_{110}^{001} > \nu_{100} > \nu_{111} > 0 > \nu_{110}^{1\bar{1}0}$; but the uniaxial loading curves are concave upward in [110] loading and concave downward in [100] and [111] loading, with relatively large curvatures for [100] and [110] loadings (Milstein and Marschall, 1988, 1992). Among the six bcc transition metals V, Nb, Ta, Cr, Mo, and W, with the exception of Ta, the linear elastic trends are “reversed” [26]; i.e., $E_{111} \le E_{110} \le E_{100}$ (which implies $\mu' \le \mu$) and $0 \le \nu_{110}^{001} \le \nu_{100} \le \nu_{111} \le \nu_{110}^{1\bar{1}0}$ (the equalities apply to W only). The nearest neighbor atoms in a bcc crystal lie along the [111] direction, so
one might expect $E_{111}$ to be greatest, as it is in the group (ii) bcc metals, particularly when compressive loading is considered; the apparent “anomaly” in the bcc transition metals is evidently caused by localized directional bonding effects, which are known to be significant owing to d-orbital electron interactions. Body centered cubic Fe is found to be intermediate to the two groups of bcc metals; its moduli are ordered as in the group (ii) metals, but its shear modulus ratio $\mu/\mu'$ is about 3 or 4 times that of the alkali metals and β-brasses. The experimentally observed moduli orderings, “upward concavities”, and negative Poisson ratios of the group (i) and group (ii) metals are, in fact, a direct consequence of the existence of multiple stress zeros on the stress–strain curves and of bifurcation phenomena that have been revealed in LS computations of large strain uniaxial loading curves; the multiple stress zeros and bifurcation phenomena, in turn, are a consequence of crystal symmetries and general characteristics of atomic bonding. As a result, LS computations based on even relatively simple atomic models, such as the Morse model, can be edifying; such models are also useful for exploring qualitative and semi-quantitative behavior of the group (i) and (ii) metals in IMD simulations. Figure 9 illustrates crystal symmetry under [100] uniaxial loading. Figure 9a shows a lattice that may be considered as initially unstressed bcc; under [100] uniaxial load, on the primary path, the body centered lattice parameters become $a_1 \ne a_{2b} = a_{3b}$; the left hand portion of this figure also shows a bold-lined crystallographic cell that is initially face centered tetragonal (fct)
Figure 9. Two ways of viewing initially cubic crystals under [100] and [110] uniaxial loads: (a) face centered cells (bold lines) in a body centered lattice structure and (b) body centered cells (bold lines) in a face centered lattice structure. From Ref. [32].
and remains fct under uniaxial load, with lattice parameters $a_1 \ne a_{2f} = a_{3f}$, in general, on the primary path. Analogously, Fig. 9b shows four crystallographic cells of a lattice that may be considered unstressed fcc initially, becoming fct, with $a_1 \ne a_{2f} = a_{3f}$, under load on the primary path. The left-hand bold-lined cell in Fig. 9b is bct, with $a_1 \ne a_{2b} = a_{3b}$, in general. Figure 9 thus illustrates the fact that the primary paths of [100] loading of bcc and fcc crystals are identical, with the crystal residing in the fcc state when $a_{2b} = a_{3b} = a_1/2^{1/2}$ and in the bcc state at $a_{2f} = a_{3f} = 2^{1/2} a_1$. General considerations of crystal symmetries and atomic bonding then require the primary path of uniaxial [100] loading to contain three states of zero stress, including a “special” unstressed tetragonal state, as demonstrated by Milstein [27]. That is, since the stress in a cubic state is hydrostatic, a uniaxial load must vanish in cubic states. The existence of two zeros on the primary path implies a third, since the load $l_1$ must be compressive (i.e., negative) when the lattice parameter $a_1$ is arbitrarily small, and $l_1$ must be tensile (positive) when $a_1$ is sufficiently large. When subjected to hydrostatic pressure, the initially unstressed tetragonal crystal may eventually reach a state where it becomes body centered or face centered cubic; at such states, the shear modulus $\mu$ of the cubic crystal necessarily vanishes; this occurrence is seen in Fig. 8 at the states where $\mu = 0$; additionally, the pressure variation of the difference in the Gibbs energies of the cubic and tetragonal structures becomes stationary at these states, as is seen in Fig. 6. Numerous LS model computations of the [100] loading response of cubic crystals have been carried out for various atomic models, including pair potentials, pseudopotentials, EAM models, and ab initio methods. General features common to such computations are illustrated in Fig.
10, which shows the [100] uniaxial compressive loading response of initially unstressed Morse model fcc crystals. Paths connecting bcc and fcc states via a tetragonal lattice distortion are called Bain transformation paths. The paths in Fig. 10 comprise a special class of Bain transformations, in that they connect two unstressed cubic states on a path of minimum energy barrier (that reaches its maximum value at the intermediate unstressed state). On other Bain paths (e.g., those under constant volume, constant transverse stretch, or other constraints), the energy barrier for transformation between the fcc and bcc structures is greater; see Ref. [28] for further discussion of this topic. In Fig. 10, the crystals are unstressed fcc at $\lambda_1 = \lambda_2 = \lambda_3 = 1$; they pass through the unstressed tetragonal states T; they are unstressed bcc at the remaining zeros (that occur at $\lambda_2/\lambda_1 = 2^{1/2}$). The bcc structures also all occur in the neighborhood of $\lambda_1 = 0.79$; this occurrence is readily understood from the consideration that the bcc and fcc volumes per atom are generally nearly the same. That is, if the bcc and fcc atomic volumes were identical, $\lambda_1 \lambda_2^2$ would be unity in the bcc state, which then would occur at $\lambda_1 = (1/2)^{1/3} = 0.794$. The volume per atom in the unstressed tetragonal states T may vary widely when compared with its value in the unstressed cubic states, and if the unstressed fcc structure is stable
Figure 10. Mechanical responses of Morse model fcc and bcc crystals on the [100] uniaxial loading paths connecting the unstressed fcc and bcc structures; the stretch λ1 in this figure is the current value of the lattice parameter a1 divided by its value in the fcc state; the stress σ1 is the true stress (axial load divided by current transverse area) which is normalized by dividing by the corresponding value of the initial Young’s modulus specific to [100] loading. From Ref. [17].
(as it is for the complete family of Morse model crystals), the states T may occur either at the central or at the “left-hand” zero on the primary path; the structure at the central zero is necessarily unstable, owing to its falling load characteristic and its associated occurrence at a local energy maximum on the primary path. As mentioned earlier, unstressed Morse model bcc crystals are stable only for log β < 4.517; if log β = 4.517, the unstressed bcc and tetragonal states coincide, and the primary loading path is tangent to the abscissa at that state; the corresponding structure lacks stability owing to its vanishing Young’s modulus. With reference to Fig. 10, the experimentally verified concavities of the
[100] stress–strain curves of stable fcc crystals (upward) and stable bcc crystals (downward) are readily understood to be a consequence of crystal symmetries on the primary [100] loading path. Figure 10 also shows the locations of the invariant $c_{22} = c_{23}$ eigenstates reckoned to the axes of the bct crystallographic cell. For initially stable Morse model crystals, these eigenstates are always primary eigenstates found in a region of compression. If the unstressed bcc crystal is unstable (i.e., log β > 4.517), these $c_{22} = c_{23}$ states are embedded in the unstable region between the bcc structure and the local stress minimum on the primary path. When reckoned to the fct cell axes, the Morse model $c_{22} = c_{23}$ eigenstates are likewise always primary, but are found in a region of tension (not shown in Fig. 10) between the fcc structure and the maximum value of the load $l_1$. Stationarity of the generalized force $p_1$ also coincides with primary eigenstates for the bcc crystal in [100] tension (where $p_1$ achieves a local maximum) and for the fcc structure under [100] compression (where $p_1$ is at a local minimum). A wide range of models that are more sophisticated than the Morse model also yield $c_{22} = c_{23}$ eigenstates in a tensile range between a stable fcc state and the maximum tensile load and in a compressive range for stable bcc crystals. For example, Fig. 11 shows pseudopotential model computations of the variations of $c_{22}$, $c_{23}$, and $c_{44}$ on a primary [100] loading path. In this figure, the unstressed bcc, bct, and fcc states are denoted by B, T, and F, respectively. For the moduli reckoned to the fct crystallographic axes (Fig. 11a), $c_{22}$ is seen to decrease and $c_{23}$ to increase with increasing axial stretch, whereas the opposite occurs when these moduli are reckoned to the bct axes (Fig. 11b). A comparison of the a and b parts of Fig.
11 also points up the fact that the $c_{22} = c_{23}$ eigenstate reckoned on the fct axes and the $c_{44} = 0$ eigenstate on the bct axes coincide, and vice versa, regardless of atomic model. (If the calculations in this figure were to have been made with a pair potential model, the $c_{44} = 0$ eigenstates would also have coincided with $c_{23} = 0$.) From the experimental viewpoint, variations of the moduli $c_{22}$ and $c_{23}$ on the primary [100] path may be calculated from measured second and third order elastic moduli; such calculations invariably show that, for fcc crystals, $c_{22}$ decreases and $c_{23}$ increases with increasing tension, while for bcc crystals, $c_{22}$ decreases and $c_{23}$ increases with increasing compression, in agreement with atomic based lattice model computations. Thus there is an apparent invariance associated with the locations of these eigenstates on the primary [100] paths. The unstressed tetragonal state T is necessarily “to the left” of a stable fcc state on a primary [100] loading path. Conversely, if state T is “to the right” of the fcc state on this path, the centrally located fcc structure is unstable. In the EAM model of Chantasiriwan and Milstein, fcc Fe is unstable at zero pressure owing to the negative modulus $\mu$ (see Fig. 7); accordingly, the fcc state occurs at the central zero on the [100] loading path. However, at a hydrostatic pressure of about 8 GPa, the shear modulus $\mu$ of the fcc structure in the EAM model
Figure 11. Variation of elastic moduli on the primary [100] loading path of Rb in the pseudopotential model; the upper abscissa scale is the current value of the lattice parameter $a_1$ divided by its value in the unstressed fcc state F; the lower abscissa scale is the current value of $a_1$ divided by its value in the unstressed bcc state B; the moduli are reckoned to (a) the face centered tetragonal axes and (b) the body centered tetragonal axes. From Ref. [32].
passes from negative to positive, at which state the fcc and hydrostatic bct structures coincide. This behavior is observed in Fig. 12, which shows EAM model computations of Bain transformation paths occurring under constant transverse compressive stress, σ2 = σ3 = −P. (In this figure the uppermost curve, at P =0, is a “true” [100] uniaxial loading path.) Three hydrostatic states are found on each path in Fig. 12, i.e., the bcc states (shown as “diamonds”), the fcc states (squares), and the hydrostatic bct states (circles). The central, unstable, hydrostatic state is occupied by the fcc structure at pressures below about 8 GPa, by the bct structure at pressures between about 8 and 50 GPa, and by the bcc structure at pressures over approximately 50 GPa. With reference to Fig. 7, it is seen in Fig. 12 that, on the Bain paths of constant transverse stress, the cubic states “exchange positions” with the hydrostatic bct state at pressures causing the respective shear moduli µ of the cubic crystal to vanish. Cubic crystals under [111] uniaxial loading exhibit load versus axial stretch responses that are qualitatively similar to the [100] responses discussed above; although, for the [111] loading case, the ordering of the three unstressed states on the primary path is invariant. That is, symmetries on the primary [111] loading path require the unstressed bcc, sc, and fcc structures to occur successively
Figure 12. Simulation of [100] axial loading of the EAM model of Fe in a constant hydrostatic environment. On each path, initially the crystal may be presumed to be in one of the cubic states, which is necessarily hydrostatic, under a pressure $P$. An additional axial stress $\sigma_1$ is then applied, causing the path dependent state of stress in general to be $\sigma_1 \ne \sigma_2 = \sigma_3 = -P$ = constant. The path then traverses three states that are under the same hydrostatic pressure $P$, i.e., bcc (indicated by a “diamond”), fcc (a circle), and the hydrostatic bct structure (a square). Pressures $P$ are in the range showing “cross over” of (a) hydrostatic bct and fcc structures at the states $\mu_F = 0$ and (b) hydrostatic bct and bcc structures at the states $\mu_B = 0$. (1 Mbar = 100 GPa). From Chantasiriwan and Milstein, to be published.
at the states where the ratios of axial to transverse stretch are themselves in the ratio 1:2:4, as demonstrated by Milstein et al. [11]. The inherent instability of the unstressed sc structure can be understood from its location at the central zero on the [111] primary path. Similar behavior (i.e., three stress zeros and two energy minima) was found in the [0001] loading response of hcp crystals, thus suggesting the possible existence of a new crystal structure analogous to bcc, but with two atoms in the primitive basis [29]. The uniaxial [111] and the [0001] cases have fundamental similarities and differences; both have a uniaxial load applied normal to similar hexagonal planes of atoms, but with differing stacking orders of the planes (i.e., ABCABC... for [111] and ABAB... for [0001], in the usual notation); although crystal symmetry requires the existence of the three zeros in the [111] loading case, no such requirement is evident for the case of [0001] loading of hcp crystals.
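The symmetry-fixed stretch ratios quoted for the [111] and [100] paths follow from lattice geometry alone. The sketch below is illustrative and not from the article: it treats each cubic structure as a rhombohedral lattice with three primitive vectors of equal length and common angle $\alpha$ (bcc: $\cos\alpha = -1/3$; sc: $\cos\alpha = 0$; fcc: $\cos\alpha = 1/2$) and compares the length of the [111] body diagonal with a transverse lattice vector. For comparison it also evaluates the equal-volume estimate of the axial stretch at the bcc state on the [100] path referenced to fcc. All names are assumptions made for the example.

```python
import math

def axial_to_transverse(cos_alpha):
    """Ratio |a1 + a2 + a3| / |a1 - a2| for three primitive vectors of equal
    length with common mutual angle alpha: the axial ([111]) repeat length
    over a transverse repeat length."""
    return math.sqrt((3.0 + 6.0 * cos_alpha) / (2.0 - 2.0 * cos_alpha))

RATIO_BCC = axial_to_transverse(-1.0 / 3.0)
RATIO_SC = axial_to_transverse(0.0)
RATIO_FCC = axial_to_transverse(0.5)

# Relative to bcc, the sc and fcc states sit at axial/transverse ratios 2 and 4.
SC_OVER_BCC = RATIO_SC / RATIO_BCC
FCC_OVER_BCC = RATIO_FCC / RATIO_BCC

# [100] path referenced to fcc: the bcc state has lambda2 / lambda1 = sqrt(2);
# if the atomic volume were unchanged, lambda1 * lambda2**2 = 1, so
# lambda1 = (1/2)**(1/3).
LAMBDA1_BCC_EQUAL_VOLUME = 0.5 ** (1.0 / 3.0)
```

Because only the axial-to-transverse ratio is fixed by symmetry, the volume at each unstressed state remains model dependent; the equal-volume value of the axial stretch is an estimate, not a constraint.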
The first LS model computations of the full [111] uniaxial loading path are those of Ref. [11], based on a Morse model of fcc Ni. Subsequently, Milstein and Chantasiriwan [2] and Chantasiriwan and Milstein (to be published) computed the [111] loading behavior of their EAM metal models. As an example of computational results, Fig. 13 shows the [111] loading response of the EAM Fe model discussed earlier. Note that, although the fcc form of Fe appears at the central zero on the [100] path owing to its negative µ-value, it appears at the right-hand zero on the [111] path, as required by crystal symmetry. These behaviors also point up the important fact that a crystal structure may lie at an energy minimum on a particular path, while it is at an energy maximum and hence unstable with respect to deformation along another path. Unstressed sc
Figure 13. Mechanical response of Fe to [111] uniaxial loading in the EAM formulation. The stretch $\lambda_3$ (coaxial with the [111] direction) and transverse stretch $\lambda_2$ are referenced to the unstressed bcc state B; the crystal necessarily becomes unstressed sc (S) and fcc (F) at values of $\lambda_3/\lambda_2 = 2$ and 4, respectively [11]. Variation, with axial stretch, of (a) the (isotropic) transverse stretch and percent change of volume (from the volume $V_B$ of the bcc structure), (b) axial load $L_3$, axial true stress $\sigma_3$, and internal energy $E$, and (c) elastic moduli $C_{33}$ and $C_{44}$ and expressions $R_1 = (C_{11} + C_{12})C_{33} - 2C_{13}^2$ and $R_2 = (C_{11} - C_{12})C_{44} - 2C_{14}^2$ (see relations (38) and (39)); the moduli here are those defined by (49), with the modification $C_{14} = (a_1/V)\,\partial^2 E/\partial a_1 \partial a_4$ ($a_4$ is the angle between the 2- and 3-axis), which yield the same domains of notional stability as the M-moduli. From Chantasiriwan and Milstein, to be published.
crystals have a more “open” structure than the unstressed bcc, fcc, or bct crystals; for example, for the Fe model, the atomic volume of the unstressed bct structure is about 1% greater than that of the bcc and fcc structures, whereas, as seen in Fig. 13a, the volume per atom of the sc crystal is more than 10% greater than that of the bcc and fcc crystals. Accordingly, the energy and stress barriers for transitions between the bcc and fcc states on the [111] loading path are naturally greater than the corresponding barriers on the [100] path, which “explains” the experimental results, for both the bcc and fcc structures, that E 111 > E 100 (this condition, in turn, requires the orderings E 111 > E 110 > E 100 , based on general rules of anisotropic elasticity), that the nonlinearities in [111] loading are smaller than in [100] loading, and that the stress–strain responses under [111] loading are concave downward. (The downward concavity of the stress–strain curves of a stable bcc crystal in both [111] and [100] loading is a natural consequence of the structure’s position at the left hand zero on the paths, where there is no required stress zero residing to “its left”, in compression; by comparison, the symmetry-imposed stress-zero that lies “to the left” of a stable fcc crystal on both the [100] and [111] loading paths imposes an inflection point on the paths in the neighborhood of the fcc state; the relatively small stress barrier for the fcc to bcc transition on the [100] path “pushes” the inflection point into the tensile region so the [100] loading response is concave up; the much larger compressive stress barrier for the fcc to bcc transition in [111] loading enables the inflection point to reside in the region of compression, thereby yielding an initial stress-strain curve that is concave down.) 
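The moduli and Poisson ratio orderings discussed above can be reproduced from the three second-order cubic elastic constants using the standard direction-dependent compliance formulas of linear anisotropic elasticity. The following sketch is illustrative only. The Cu stiffnesses (168.4, 121.4, and 75.4 GPa) are commonly quoted room-temperature values used here as assumed input, not data from this article, and the function names are hypothetical.

```python
import math

def compliances(c11, c12, c44):
    """Cubic compliances (S11, S12, S44) from stiffnesses (C11, C12, C44)."""
    det = (c11 - c12) * (c11 + 2.0 * c12)
    return (c11 + c12) / det, -c12 / det, 1.0 / c44

def _unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def young(s, axis):
    """Young's modulus for uniaxial loading along `axis` (cube-axis components)."""
    s11, s12, s44 = s
    l = _unit(axis)
    gamma = l[0]**2 * l[1]**2 + l[1]**2 * l[2]**2 + l[2]**2 * l[0]**2
    return 1.0 / (s11 - 2.0 * (s11 - s12 - 0.5 * s44) * gamma)

def poisson(s, axis, transverse):
    """Negative ratio of strain along `transverse` to strain along `axis`
    under uniaxial loading along `axis`; the two directions must be orthogonal."""
    s11, s12, s44 = s
    l, m = _unit(axis), _unit(transverse)
    k = sum(li**2 * mi**2 for li, mi in zip(l, m))
    s12_rotated = s12 + (s11 - s12 - 0.5 * s44) * k
    return -s12_rotated * young(s, axis)

# Assumed input: approximate room-temperature stiffnesses of Cu in GPa.
S_CU = compliances(168.4, 121.4, 75.4)

E100 = young(S_CU, (1, 0, 0))
E110 = young(S_CU, (1, 1, 0))
E111 = young(S_CU, (1, 1, 1))

NU_100 = poisson(S_CU, (1, 0, 0), (0, 1, 0))
NU_111 = poisson(S_CU, (1, 1, 1), (1, -1, 0))
NU_110_001 = poisson(S_CU, (1, 1, 0), (0, 0, 1))
NU_110_1M10 = poisson(S_CU, (1, 1, 0), (1, -1, 0))  # transverse direction [1 -1 0]
```

With these inputs the Young moduli increase from [100] to [110] to [111], and the Poisson ratio for [110] loading is large and positive along [001] but negative along the other transverse direction, which is the fcc pattern described in the text. This is a small-strain check only; it says nothing about the large-strain paths.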
The notional ranges of stability under [111] loading of all of the initially stable fcc metals that were studied with the EAM model (Li, Na, K, Rb, Cu, Ag, Au, Al, and Ni), as well as of the Morse Ni model, in both tension and compression, were found to be terminated at violations of the second inequality in (38), prior to the respective maximum and minimum loads and, as mentioned earlier, within the group of G-, S-, and M-variables, the notional stability limits were fairly insensitive to the choice of variables. Similarly, for all of the bcc EAM model metals (Li, Na, K, Rb, Fe, Mo, and Nb) that were studied in [111] tensile loading, the notional stability limits were relatively insensitive to choice of strain variables and, with the exception of Mo, notional stability was terminated at violations of the second inequality in (38). In [111] compression, however, some of the bcc metals tended to remain notionally stable up to very large stress magnitudes (when compared with the maximum tensile stresses on the loading paths), where divergences among the diverse notional stability criteria may be expected. Among the bcc EAM models subjected to high [111] compressive stresses, primary eigenstates associated with violations of both the second and the first of the inequalities in (38) were found; since the latter type of notional instability is associated with a stationary value of $p_1$, this result is likely model-dependent. The stresses at
the limits of the notional stability domains in [111] bcc compression were, in some cases, found to be strongly dependent on the choice of strain variables; this result implies that the Born criterion is unlikely to be of general value for assessing stability of bcc crystals in [111] compressive loading. Crystal symmetry on the primary path of [110] loading of an initially unstressed fcc crystal is body centered orthorhombic (bco), while symmetry on the primary path of [110] loading of an initially unstressed bcc crystal is face centered orthorhombic (fco). Behavior on the primary [110] paths of bcc and fcc crystals can best be understood in terms of the secondary orthorhombic branch paths that emanate from the primary [100] path at the $c_{22} = c_{23}$ eigenstates, since, as shown by Milstein and co-workers, the primary [110] uniaxial loading paths are identical to these secondary orthorhombic paths, and the path branchings profoundly influence the mechanical responses on the [110] paths. Crystal symmetry under [110] loading can be understood with reference to Fig. 9 also. For this purpose, assume that the bold-lined face centered (fc) cell in the right-hand portion of Fig. 9a is initially unstressed fcc; [110] loading of the fcc crystal is then accomplished by applying a uniaxial load as illustrated in the figure; the initially bc cells in Fig. 9a then become bco. Analogously, if the bold-lined bc cell in the right-hand portion of Fig. 9b is initially unstressed bcc, [110] uniaxial loading of an initially unstressed bcc crystal is indicated, and the fc cells in Fig. 9b become fco. Now, let us “return” the bc cells in Fig. 9a to their unstressed bcc state (or the fc cells in Fig. 9b to their unstressed fcc state) and apply a compressive (or tensile) [100] uniaxial load in the direction shown.
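The face centered and body centered tetragonal cells share the loading axis and differ by a 45° rotation about it, which is why an equality of the two transverse moduli in one cell appears as a vanishing transverse shear modulus in the other. The sketch below illustrates this with an ordinary fourth-rank tensor rotation. It assumes the moduli transform as components of a tensor with the usual Voigt symmetries (the article's moduli are defined through its own strain measures, so this is an idealization), and the numerical values and function names are hypothetical.

```python
import math
from itertools import product

# Voigt index for each pair of tensor indices (axis 1 of the text is index 0).
VOIGT = {(0, 0): 0, (1, 1): 1, (2, 2): 2,
         (1, 2): 3, (2, 1): 3, (0, 2): 4, (2, 0): 4, (0, 1): 5, (1, 0): 5}

def tetragonal_voigt(c11, c12, c22, c23, c44, c55):
    """6x6 Voigt matrix of a tetragonal crystal whose unique axis is axis 1."""
    v = [[0.0] * 6 for _ in range(6)]
    v[0][0] = c11
    v[1][1] = v[2][2] = c22
    v[0][1] = v[1][0] = v[0][2] = v[2][0] = c12
    v[1][2] = v[2][1] = c23
    v[3][3] = c44
    v[4][4] = v[5][5] = c55
    return v

def rotated(v, angle, i, j, k, l):
    """Component (i, j, k, l) of the modulus tensor after rotating the
    reference axes by `angle` about axis 1 (index 0)."""
    c, s = math.cos(angle), math.sin(angle)
    rot = [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]
    return sum(rot[i][p] * rot[j][q] * rot[k][r] * rot[l][t]
               * v[VOIGT[p, q]][VOIGT[r, t]]
               for p, q, r, t in product(range(3), repeat=4))

QUARTER = math.pi / 4.0

# Hypothetical moduli in arbitrary units.
V_GENERAL = tetragonal_voigt(200.0, 90.0, 150.0, 110.0, 60.0, 70.0)
C22_ROT = rotated(V_GENERAL, QUARTER, 1, 1, 1, 1)
C23_ROT = rotated(V_GENERAL, QUARTER, 1, 1, 2, 2)
C44_ROT = rotated(V_GENERAL, QUARTER, 1, 2, 1, 2)

# A state with c22 = c23 in one cell has c44 = 0 in the rotated cell.
V_EIGEN = tetragonal_voigt(200.0, 90.0, 120.0, 120.0, 60.0, 70.0)
C44_ROT_EIGEN = rotated(V_EIGEN, QUARTER, 1, 2, 1, 2)
```

In the rotated cell the shear modulus equals half the difference of the original transverse moduli, and the difference of the rotated transverse moduli equals twice the original shear modulus, so each condition maps onto the other under the change of cell.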
On the primary [100] path, under compression, a1 decreases and a2b = a3b increase (or under tension, a1 increases and a2f = a3f decrease); at the point of bifurcation, to first order on the secondary path, δa1 = 0, δa2 = −δa3 , so a2 and a3 vary much faster than a1 , at least initially, on the secondary paths. Without loss of generality, we may assume that a2b decreases and a3b increases when a1 decreases (or that a2f increases and a3f decreases when a1 increases); then a2b (or a2f ) will again become equal to a1 ; when this occurs, the load in the two-direction, which by definition is zero, must equal the load in the one-direction, so the crystal passes through an unstressed state on the orthorhombic secondary path; one unstressed state, however, implies a second, since the load must be compressive (or tensile) as a1 decreases further (or as a1 continues to increase). One of these stress zeros on each secondary path is naturally cubic, oriented as shown by the right-hand bold lined cells in Fig. 9. Branching of a crystal structure under uniaxial load was first observed computationally by Milstein and Huang [30] in their study of [110] loading of the fcc Morse model Ni crystal; they computed the path dependent axial and transverse strains, axial load and stress, energy, and elastic moduli and found the [110] bco path to branch from the primary tetragonal [100] loading path, under dead load, at the c22 = c23 eigenstate (with the moduli defined relative
to the bc crystallographic axes). For the Morse Ni model, log β = 6.288; thus the branch point was embedded in the unstable, compressive, region of the [100] path (the location of the branch point may be induced from Fig. 10). The [110] path also contained the unstressed tetragonal state that is found on the primary [100] path, differently oriented. Milstein and Farber [31] employed similar LS model computations to study the analogous, but distinct, branching from the [100] tetragonal path to the secondary bco path that occurs at the point where c22 = c23 when the moduli are reckoned to the fct structure; there the branch point terminated a stable tensile region of the primary [100] path of an initially stable fcc crystal subjected to [100] tensile loading. Again, the secondary path passed through the same unstressed tetragonal state found on the primary [100] path, differently oriented. Milstein and Farber employed general theoretical arguments, supported by their computations, to show that the unique bco branch path (that remains under strict uniaxial load) takes the crystal into the unstressed bcc structure, which is oriented with the uniaxial load coincident with the [110] direction of the bcc crystal. In a review article, Milstein [17] presented a unified description of [100] and [110] uniaxial loading of bcc and fcc crystals that incorporated the branchings from the [100] path (considered as primary) to both [110] paths (considered as the secondary branch paths). Subsequently, the generalized behavior proposed by Milstein, based mainly on symmetry arguments, was verified in both pseudopotential [32] and EAM (Chantasiriwan and Milstein, to be published) model LS computations. Examples of these computational results are shown in Fig. 14 (for the Rb pseudopotential model) and Fig. 15 (for the Fe EAM model). In Figs. 
14 and 15, the primary [100] path is represented by solid lines and the secondary [110] paths by dashed lines; the locations of the unstressed fcc, tetragonal, and bcc states are indicated by F, T, and B, respectively; the axial stretch $\lambda_1$ is referenced to the unstressed bcc structure on the primary path; and the superscripts b and f indicate variables reckoned, respectively, to the bc and fc crystallographic axes (e.g., the transverse stretches $\lambda_2^b$ and $\lambda_3^b$ on the left hand branch paths are computed on the bco axes). In Fig. 15, the moduli are referenced to the bc crystal axes, and thus the right hand branch point occurs where $c_{44} = 0$. Figs. 14a and 15a make evident the basis for the experimentally observed negative Poisson ratio $\nu_{110}^{1\bar{1}0}$, which occurs naturally as a result of the bifurcation characteristic; i.e., that $a_2^b$ decreases when $a_1$ decreases and that $a_2^f$ increases when $a_1$ increases; the large, positive Poisson ratios $\nu_{110}^{001}$ are readily explained by the relatively rapid variations of $a_3^b$ and $a_3^f$ on the secondary branches. The experimentally observed stress–strain concavities on the [110] paths are also readily understood from the stress–strain responses on the branch paths in Figs. 14 and 15; i.e., the right hand branch of [110] loading of the bcc crystal must “turn up” under compression, and the left hand branch of [110] loading of the fcc crystal must “turn down” under tension, in order that these paths “meet up” with the primary paths at the
Figure 14. Branching behavior under uniaxial loading of the pseudopotential model of Rb; the unstressed bcc, fcc, and tetragonal states are indicated by B, F, and T, respectively. Crystal structure is tetragonal on the primary path (solid line), which corresponds to [100] uniaxial loading of the bcc and fcc structures, and orthorhombic on the secondary branch paths (broken lines). The left-hand (body centered orthorhombic) branch contains the unstressed fcc structure with its [110] axis parallel to the uniaxial load and the right-hand (face centered orthorhombic) branch contains the unstressed bcc structure with its [110] axis parallel to the uniaxial load. The secondary paths are thus identical to the [110] loading paths of the cubic crystals: (a) transverse stretch, (b) variation of atomic volume (referenced to the atomic volume in state B), (c) internal energy, and (d) true stress. From Ref. [32].
Figure 15. Branching between the [100] (solid lines) and the [110] (broken lines) uniaxial loading paths for the EAM model of Fe (crystal structures on the respective paths are as described in the caption to Fig. 14). The only stable structure at zero stress in the Fe model is bcc (B); note the falling load characteristic and corresponding local maximum of energy that renders the unstressed tetragonal state (T) unstable with respect to deformation on the right-hand branch path: (a) transverse stretch, (b) internal energy, (c) true stress. From Chantasiriwan and Milstein, to be published.
invariant c22 = c23 eigenstates. Thus, also, these [110] paths exhibit relatively large nonlinearities. Figures 16a–c compare the uniaxial loading responses of the EAM fcc metals Na, Cu, and Al and the bcc metals Na and Mo in tension on the three principal symmetry directions, and thereby explicitly demonstrate the profound influence of the crystal symmetries discussed above on uniaxial loading. The normalized maximum axial tensile stresses are relatively small and occur at relatively small values of axial stretch for bcc metals in [100] loading; this result helps to explain {100} cleavage in bcc metals, as noted by Morris et al. [14]. The fcc metals also exhibit relatively small maximum tensile stresses in [110] loading, although the greater number of slip systems available in the fcc structure evidently precludes cleavage in this mode of loading; the theoretical result does, however, suggest the possibility of cleavage of fcc crystals at low temperatures in [110] loading, where the slip systems may be less active. The large differences between the [100] and [110] loading responses of bcc and fcc crystals that are seen in Figs. 16a and b are absent in the [111] loading responses in Fig. 16c. Since the theoretical tensile [111] loading paths of bcc metals pass through unstressed sc structures that reside at significantly greater energies than the unstressed fcc, bct, or bcc structures, the stress barriers for a bcc to sc transition are generally quite large, and as noted by Milstein and Chantasiriwan [2], these stress barriers are close to the theoretical maximum stresses reached on the [110] tensile loading paths of bcc crystals, which, in turn, are governed by the inherent strength of the atomic bonds, rather than by crystallographic transformations among unstressed states or bifurcation phenomena. 
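As a rough illustration of what “inherent strength of the atomic bonds” means in the simplest setting, the sketch below evaluates the maximum tensile force that a single pair bond can sustain for the standard Morse form $\phi(r) = D\{\exp[-2\alpha(r - r_0)] - 2\exp[-\alpha(r - r_0)]\}$. This is a single-bond calculation, not a lattice-statics computation of a loading path, and the parameter values are hypothetical. For this form the force peaks at $r = r_0 + \ln 2/\alpha$ with magnitude $D\alpha/2$, which the numerical scan reproduces.

```python
import math

def morse_force(r, depth, alpha, r0):
    """Tensile force d(phi)/dr needed to hold a Morse bond at separation r,
    for phi(r) = depth * (exp(-2*alpha*(r - r0)) - 2*exp(-alpha*(r - r0)))."""
    x = r - r0
    return 2.0 * alpha * depth * (math.exp(-alpha * x) - math.exp(-2.0 * alpha * x))

def peak_force(depth, alpha, r0, steps=20000):
    """Scan separations from r0 to r0 + 5/alpha and return
    (separation at the peak, peak force)."""
    best_r, best_f = r0, 0.0
    for i in range(steps + 1):
        r = r0 + 5.0 * i / (steps * alpha)
        f = morse_force(r, depth, alpha, r0)
        if f > best_f:
            best_r, best_f = r, f
    return best_r, best_f

# Hypothetical parameters in arbitrary units.
DEPTH, ALPHA, R0 = 1.0, 2.0, 1.0
R_PEAK, F_PEAK = peak_force(DEPTH, ALPHA, R0)

# Closed-form values for comparison.
R_PEAK_EXACT = R0 + math.log(2.0) / ALPHA
F_PEAK_EXACT = 0.5 * DEPTH * ALPHA
```

In a crystal the corresponding limit depends on the lattice sums and on the transverse relaxation along the loading path, so the single-bond value is only an order-of-magnitude guide.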
The major influence of crystal symmetries (and the general nature of atomic bonding) on uniaxial loadings of the fcc metals and the group (ii) bcc metals enables relatively simple models of atomic bonding to provide reasonable representations of the uniaxial loading responses; for example, see Fig. 17, which compares the Morse and EAM model mechanical responses of fcc Cu in tension and compression. While the Morse model incorporates only two empirical second-order elastic moduli, the EAM formulation incorporates all three second-order and all six third-order empirical elastic moduli; nevertheless, the two models exhibit similar qualitative and semi-quantitative mechanical responses. Thus, even simple models of atomic bonding can be useful in explorations of qualitative and semi-quantitative behavior in IMD simulations, as is discussed later in this article. As a final example of the rich and diverse phenomena exhibited by crystals under large elastic deformation, Fig. 18 shows the shear stress versus shearing angle θF of fcc crystals loaded in the mode of shearing depicted in Fig. 19. Here the shearing load is constrained to remain parallel to the (100) and (010) planes in the [100] and [010] directions, which takes the initially fcc structure into a bco configuration on the primary path; no other load acts on the crystal. Figure 18 displays the unexpected result that the decrease in the shearing
Elastic stability criteria and structural bifurcations
Figure 16. Normalized axial stress (axial stress divided by the initial Young’s modulus appropriate to the particular metal and loading direction) versus axial stretch for the fcc metals Na, Cu, and Al and the bcc metals Na and Mo under unconstrained uniaxial loadings in the three principal symmetry directions: (a) [100] loading, (b) [110] loading, (c) [111] loading. From Ref. [2].
F. Milstein
Figure 17. Comparison of the mechanical responses of the Morse model of Cu (solid lines containing data points) and the EAM model of Cu (broken lines) in the three principal loading directions. The ordinate is uniaxial load in the [hkl] direction per unit reference area and the abscissa is axial stretch in the [hkl] direction. From Ref. [2].
angle θF (between the [100] and [010] directions), as well as all other lattice parameters, could be varied continuously only until the decrease reached the critical value of about 7.6° (i.e., θF ≈ 82.4°), after which continued decrease in θF required a first-order transition (i.e., a discontinuity in all lattice parameters), since a continuous path taking θF beyond about 82.4° does not exist. (There does exist a continuous path taking θF back to 90°, transforming the original fcc structure into bcc, and passing through the unstressed bct structure along the way.) These results, as well as the “bc shearing analog” (i.e., where the shear forces are maintained parallel to the faces of the bc crystallographic axes and the crystal becomes fco on the primary path), are discussed in greater detail by Milstein [17].
3. Role of Higher Order Moduli at Points of Bifurcation
Criteria for the elastic stability of a crystal subjected to a prescribed mode of loading may, in principle, be formulated in terms of inequalities among strain dependent second order elastic moduli and generalized forces, as is discussed earlier in this article. Loss of stability on the primary path is associated with possible bifurcation, leading to a secondary path on which the crystal may undergo phase transformation or failure. However, by analogy with branching
Figure 18. Theoretical shear stress versus shearing angle for generalized Morse model fcc Ni crystals loaded in the shear mode illustrated in Fig. 19; m = 2 corresponds to the Morse model. From Ref. [37].
theory for discrete mechanical systems, the starting direction of a secondary path is decided by a higher order specification of the post-critical loading program [6]. The onset of post-critical phenomena may determine the full transformation path (in the case of phase change) or the mode of failure (if load-carrying capacity is lost) or, indeed, whether the material response to instability is phase change or failure. While second order moduli are fundamental to the evaluation of stability domains, higher order moduli at the branch point are central to understanding post-bifurcation behavior. Although the development of higher order crystal stability – bifurcation theory
Figure 19. Schematic illustration of the shear mode of loading employed in the computations of Fig. 18.
evidently is essential to a full understanding of the topic of material response to load, with but a few notable exceptions ([24, 33, 34] and Chantasiriwan and Milstein, to be published), relatively little work apparently has been done in this area. Here we examine two examples of post-bifurcation behavior, analyzed in terms of higher order elastic moduli on the primary paths at the branch points, i.e., (i) homogeneous branchings of initially cubic crystals at states where a shear modulus vanishes and (ii) the tetragonal to orthorhombic branching at the c22 = c23 eigenstates. As mentioned in the previous section, the only path found computationally that branched from a cubic structure at a µ = 0 eigenstate under continuing hydrostatic conditions was a unique tetragonal path. This behavior may be understood from second order expansions (third-order in the moduli) of the axial stress increments δσi , i = 1, 2, 3, that are constrained to remain equal on the secondary path. That is, following the development of Chantasiriwan and Milstein (to be published), the axial stress increments on an arbitrary orthorhombic cell may be expressed as
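\[
\delta\sigma_i = \frac{\partial \sigma_i}{\partial a_j}\,\delta a_j + \frac{1}{2}\,\frac{\partial^2 \sigma_i}{\partial a_j\,\partial a_k}\,\delta a_j\,\delta a_k , \tag{47}
\]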
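where the summations are over repeated indices (i, j, k = 1, 2, 3). Equations (47) may be expanded as
\[
\begin{aligned}
\delta\sigma_i ={}& C_{ii}\,\delta q_i + (C_{ij}+P)\,\delta q_j + (C_{ik}+P)\,\delta q_k + \tfrac{1}{2}C_{iii}\,\delta q_i^2 \\
& + (-C_{ii}+C_{iij})\,\delta q_i\,\delta q_j + (-C_{ii}+C_{iik})\,\delta q_i\,\delta q_k + \tfrac{1}{2}(-2P-2C_{ij}+C_{ijj})\,\delta q_j^2 \\
& + (-P-C_{ij}-C_{ik}+C_{ijk})\,\delta q_j\,\delta q_k + \tfrac{1}{2}(-2P-2C_{ik}+C_{ikk})\,\delta q_k^2 ,
\end{aligned} \tag{48}
\]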
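where no summation convention is implied, i ≠ j ≠ k, and δqi = δai/ai. The moduli in (48) are based on Rasky and Milstein’s 1986 formulation, i.e.,
\[
C_{ij} = \frac{a_i a_j}{V}\,\frac{\partial^2 E}{\partial a_i\,\partial a_j}
\quad\text{and}\quad
C_{ijk} = \frac{a_i a_j a_k}{V}\,\frac{\partial^3 E}{\partial a_i\,\partial a_j\,\partial a_k} . \tag{49}
\]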
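As an illustration of definition (49), the second-order moduli can be estimated by central finite differences of an energy function of the three cell edges. The following is only a minimal sketch: the function name `second_order_moduli`, the step size, and the polynomial `toy_energy` are hypothetical and stand in for a real lattice-summation energy; they are not taken from the cited work.

```python
from itertools import product


def second_order_moduli(energy, a, h=1e-4):
    """Estimate C_ij = (a_i a_j / V) d2E/da_i da_j, Eq. (49), by central
    finite differences of a user-supplied energy(a1, a2, a3)."""
    volume = a[0] * a[1] * a[2]

    def shifted(i, si, j, sj):
        # Energy with edge i moved by si*h and edge j moved by sj*h.
        p = list(a)
        p[i] += si * h
        p[j] += sj * h
        return energy(*p)

    C = [[0.0] * 3 for _ in range(3)]
    for i, j in product(range(3), repeat=2):
        # Four-point stencil; for i == j it reduces to a second
        # difference with step 2h.
        d2 = (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
              - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h * h)
        C[i][j] = a[i] * a[j] / volume * d2
    return C


def toy_energy(a1, a2, a3):
    # Hypothetical polynomial energy, chosen only so that the exact
    # derivatives are easy to write down for checking.
    return a1**2 * a2 + a2**2 * a3 + a3**2 * a1


C = second_order_moduli(toy_energy, (1.0, 1.1, 0.9))
```

The third-order moduli in (49) could be obtained in the same way with a third-difference stencil, at the cost of greater sensitivity to the step size.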
Next, in Eqs. (48), incorporate the moduli symmetries of a cubic crystal (C111 = C222 = C333 and C112 = C122 = C113 = C133 = C223 = C233, along with the usual symmetries among the Cij and with respect to interchange of indices), and set
\[
\delta\sigma_1 - \delta\sigma_2 = 0 \quad\text{and}\quad \delta\sigma_1 - \delta\sigma_3 = 0 , \tag{50}
\]
conditions that ensure that the internal state of stress remains hydrostatic as branching occurs. The first of Eqs. (50) becomes
\[
\begin{aligned}
\delta\sigma_1 - \delta\sigma_2 ={}& (C_{11} - C_{12} - P)(\delta q_1 - \delta q_2) \\
& + (-C_{11} + C_{112} + P + 2C_{12} - C_{123})(\delta q_1\,\delta q_3 - \delta q_2\,\delta q_3) \\
& + \tfrac{1}{2}(C_{111} + 2P + 2C_{12} - C_{112})(\delta q_1^2 - \delta q_2^2) = 0 ,
\end{aligned} \tag{51}
\]
with the analogous expression for δσ1 − δσ3 = 0. One obvious solution to the pair of equations is δq1 = δq2 = δq3, i.e., the primary path; however, if C11 − C12 − P = 0 (which is equivalent to µ = 0), a second solution is possible, viz. δq1 = δq2 ≠ δq3, which results in the crystal branching from cubic to tetragonal, with the lattice parameters on the branch path initially varying according to
\[
\frac{\delta q_1}{\delta q_3} = \frac{C_{112} - C_{111} - 2C_{11}}{C_{111} + C_{112} - 2C_{123} + 2C_{11} + 2C_{12}} . \tag{52}
\]
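The branching solution can be checked numerically. The sketch below evaluates the expansion (48) under cubic symmetry, imposes C11 − C12 − P = 0, and confirms that increments with δq1 = δq2 and the ratio (52) keep the stress increments equal, whereas a different ratio does not. The numerical moduli are arbitrary illustrative values, not data for any real material, and the function names are hypothetical.

```python
def stress_increment(i, dq, C11, C12, C111, C112, C123, P):
    """Second-order expansion (48) of dsigma_i with cubic-symmetry moduli."""
    j, k = [n for n in range(3) if n != i]
    qi, qj, qk = dq[i], dq[j], dq[k]
    return (C11 * qi + (C12 + P) * (qj + qk)
            + 0.5 * C111 * qi**2
            + (C112 - C11) * qi * (qj + qk)
            + 0.5 * (C112 - 2 * C12 - 2 * P) * (qj**2 + qk**2)
            + (C123 - 2 * C12 - P) * qj * qk)


def branch_ratio(C11, C12, C111, C112, C123):
    """Ratio dq1/dq3 on the tetragonal branch, Eq. (52)."""
    return (C112 - C111 - 2 * C11) / (C111 + C112 - 2 * C123 + 2 * C11 + 2 * C12)


def hydrostatic_residual(ratio, eps, moduli, P):
    """Largest difference between axial stress increments for
    dq1 = dq2 = ratio * eps and dq3 = eps."""
    dq = (ratio * eps, ratio * eps, eps)
    s = [stress_increment(i, dq, P=P, **moduli) for i in range(3)]
    return max(abs(s[0] - s[1]), abs(s[0] - s[2]))


# Arbitrary illustrative moduli (assumed values, in arbitrary units).
moduli = dict(C11=1.0, C12=0.6, C111=-8.0, C112=-3.0, C123=-1.0)
P = moduli["C11"] - moduli["C12"]          # the mu = 0 eigenstate
r = branch_ratio(**moduli)

residual_on_branch = hydrostatic_residual(r, 1e-3, moduli, P)
residual_off_branch = hydrostatic_residual(r + 0.1, 1e-3, moduli, P)
```

With these inputs `residual_on_branch` is zero to rounding error, while `residual_off_branch` is of second order in the increment, which is what singles out the ratio (52).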
Chantasiriwan and Milstein similarly analyzed homogeneous branching at the µ′ = 0 eigenstate and found the branch path to have trigonal symmetry. The cubic to tetragonal branch path that emanates from the µ = 0 eigenstate is therefore the only secondary path that can occur homogeneously if the internal state of stress is maintained hydrostatic and, as illustrated in Fig. 8, the path has been found to connect primary bcc and fcc paths at their respective µ = 0 eigenstates. This mode of branching is equivalent to the uniform shearing of (110) planes of the cubic crystal in the [1̄10] direction; the secondary path thereby reveals a mechanism for transformations between bcc and fcc structures by uniform shearing under hydrostatic pressure. By contrast, when shearing of (110) planes in a bcc crystal occurs inhomogeneously, wherein alternate (110) planes shear in [1̄10] and [11̄0] directions relative to each other (as was observed in the IMD simulations of Zhao et al. [18]), the transformation is bcc to hcp. Experimentally, pressure-induced phase transformations between bcc and fcc structures have been observed in the metals K, Rb, Cs, Ca, Sr, Tl, and Fe, while transformations between bcc and hcp structures under pressure have been observed in Be, Mg, Ba, Tl, Ti, Zr, and Fe. Questions
such as “What factors may cause shearing to occur either homogeneously or inhomogeneously in a bulk sample of a cubic crystal under pressure at a shear modulus instability?” and “Are these factors linked to the elastic moduli (particularly the higher order terms) of the crystal at the branch point?” appear ripe for further investigation. Hill [33] developed general theory for higher order constitutive branching in elastic materials, and discussed branching from the primary tetragonal to the secondary orthorhombic path at the c22 = c23 eigenstate as a special case. As a note of possible “historical interest”, some years ago, Hill and Milstein jointly investigated the mathematical and numerical character of the secondary path at the branch point with an aim of developing an expression for the variation of load, dl1 /dλ1 , on the secondary path, as a function of the elastic moduli on the tetragonal path. Initially, a third-order (in the moduli) theoretical formula failed to satisfy the computational results, leading us to question the computations. (Two independent calculations of dl1 /dλ1 , carried out as follows, gave divergent results: (i) higher-order moduli calculated from Morse model lattice summations were put into the theoretical formula for dl1 /dλ1 and (ii) the ratio δl1 /δλ1 was computed from finite differences on the branch path.) Subsequently, the following “fourth-order” formula derived by Hill gave “exact” agreement.
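\[
\frac{dl_1}{d\lambda_1} = c_{11} - \frac{c_{12}^2}{c_{22}} - \frac{\left[2c_{22}(c_{123} - c_{122}) + c_{12}(c_{222} - c_{223})\right]^2 / c_{22}}{(2/3)\,c_{22}\,(c_{2222} - 4c_{2223} + 3c_{2233}) - (c_{222} - c_{223})^2} , \tag{53}
\]
where the moduli in (53) are
\[
c_{ij} = \frac{\partial^2 E}{\partial\lambda_i\,\partial\lambda_j} = \frac{\partial l_i}{\partial\lambda_j} , \qquad
c_{ijk} = \frac{\partial^2 l_i}{\partial\lambda_j\,\partial\lambda_k} , \qquad
c_{ijkl} = \frac{\partial^3 l_i}{\partial\lambda_j\,\partial\lambda_k\,\partial\lambda_l} . \tag{54}
\]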
Equation (53) was derived by expanding the incremental change δli of the load li to third order in the δλi (fourth order in the moduli),
\[
\delta l_i = c_{ij}\,\delta\lambda_j + \tfrac{1}{2}\,c_{ijk}\,\delta\lambda_j\,\delta\lambda_k + \tfrac{1}{6}\,c_{ijkl}\,\delta\lambda_j\,\delta\lambda_k\,\delta\lambda_l \qquad (i = 1, 2, 3), \tag{55}
\]
incorporating the symmetries of the tetragonal structure, substituting the following series expansions for the δλi in terms of a parameter t that approaches zero as the bifurcation point is approached,
\[
\delta\lambda_1 = \beta_1 t^2 , \qquad \delta\lambda_2 = \gamma t + \beta_2 t^2 , \qquad \delta\lambda_3 = -\gamma t + \beta_3 t^2 , \tag{56}
\]
setting
\[
\delta l_1 = \alpha_1 t^2 , \qquad \delta l_2 = \delta l_3 = 0 , \tag{57}
\]
and solving for α1/β1 by grouping terms of like order in t. The higher order moduli (i, j, k, l = 1, 2, 3) on the tetragonal path are c111, c112, c122, c123, c222, c223, c1111, c1112, c1122, c1123, c1222, c1223, c2222, c2223, c2233, as well as those moduli formed by interchange of the indices 2 and 3 among distinct moduli and by interchange of the order of indices within a given modulus (e.g., c1223 = c1332 = c2123 = c3132 = ...). The inclusion of fourth order moduli in the expansion (55) is necessary owing to the highly singular character of the bifurcation at the c22 = c23 eigenstate. The quantity (c11 − c12²/c22) is the slope dl1/dλ1 of the tetragonal path at the branch point, which must be positive if the c22 = c23 eigenstate terminates stability; consequently, the slope of the branch path at the point of bifurcation will be negative (and hence the secondary path will necessarily be unstable at the point of bifurcation) if the expression in (53) that contains the higher order moduli is positive and greater than (c11 − c12²/c22). For all Morse model crystals and for the pseudopotential model crystals that have been investigated, the orthorhombic paths branch with negative slope; among the EAM models, however, both negative and positive sloped path branchings were observed. For example, LS simulations of [100] loading of the EAM model of Ni exhibited positive-sloped branching on the right-hand (fco) branch path, as is seen in Fig. 20. This behavior raises the interesting question of whether or not the tetragonal crystal actually becomes unstable before, at, or after the location of the c22 = c23 eigenstate on the primary path. Note that, before the branch point is reached on the primary path in Fig. 20, there exists a range over which the secondary fco path is at a lower internal energy and lower axial stress; crystal instability modes for this form of bifurcation diagram have yet to be investigated in IMD simulations.
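The order-by-order procedure described above, using (55) to (57), can be reproduced numerically as a consistency check on formula (53). The sketch below is one reading of that procedure and is not quoted from Hill's derivation: it assumes c22 = c23 at the eigenstate, the tetragonal 2↔3 symmetries listed in the text, and β2 = β3 (which follows from the t³ terms of δl2 and δl3 when c222 ≠ c223). The numerical moduli are arbitrary illustrative values, and the function names are hypothetical.

```python
def slope_from_formula(c11, c12, c22, c122, c123, c222, c223,
                       c2222, c2223, c2233):
    """Right-hand side of Eq. (53)."""
    num = (2 * c22 * (c123 - c122) + c12 * (c222 - c223)) ** 2 / c22
    den = ((2 / 3) * c22 * (c2222 - 4 * c2223 + 3 * c2233)
           - (c222 - c223) ** 2)
    return c11 - c12 ** 2 / c22 - num / den


def slope_from_expansion(c11, c12, c22, c122, c123, c222, c223,
                         c2222, c2223, c2233):
    """alpha1/beta1 obtained by grouping powers of t in (55)-(57)."""
    a = c222 - c223
    b = c122 - c123
    D = c2222 - 4 * c2223 + 3 * c2233
    # Unknowns: x = beta/beta1 (beta = beta2 = beta3), y = gamma^2/beta1.
    #   t^2 terms of dl2 = 0:   2*c22*x + (a/2)*y = -c12
    #   t^3 terms of dl2 = 0:   a*x     + (D/6)*y = -b
    det = 2 * c22 * (D / 6) - (a / 2) * a
    x = (-c12 * (D / 6) + (a / 2) * b) / det      # Cramer's rule
    y = (-2 * c22 * b + a * c12) / det
    # t^2 terms of dl1: alpha1 = c11*beta1 + 2*c12*beta + b*gamma^2
    return c11 + 2 * c12 * x + b * y


# Arbitrary illustrative moduli (assumed values, in arbitrary units).
moduli = dict(c11=2.0, c12=1.2, c22=1.5, c122=-4.0, c123=-2.5,
              c222=-9.0, c223=-3.0, c2222=40.0, c2223=10.0, c2233=6.0)

slope_53 = slope_from_formula(**moduli)
slope_series = slope_from_expansion(**moduli)
```

The two values agree to rounding error for these inputs; agreement is algebraic, so it does not depend on the particular numbers chosen, provided the denominators are nonzero.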
Since the load l1 passes through zeros at the cubic states on the secondary orthorhombic paths, the magnitude and algebraic sign of dl1/dλ1 are also governed largely by crystal symmetry and the location of the branch point on the secondary path. That is, because the atomic volumes of a given substance in the unstressed bcc and fcc states are approximately equal, the cubic states occur on their respective secondary paths at approximately the same values of λ1, regardless of atomic bonding characteristics. (For example, the unstressed bcc state on the right-hand (fco) branch occurs at about a 12% strain relative to the fcc state on the primary path because crystal symmetry dictates that λ1 = λ2 = 2^(1/2) λ3 at the bcc state on the fco branch, and if the atomic volumes of the bcc and fcc structures were identical, λ1 λ2 λ3 would be unity in both states, so the bcc state on the fco branch would then occur at λ1 = 2^(1/6) = 1.122.) If the point of bifurcation on the fco branch occurs where λ1 is greater than about 1.12, as it does for the EAM model of Ni, this branch emanates with dl1/dλ1 > 0. Analogous symmetry conditions apply to the left-hand (bco) branch.
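The arithmetic behind the 12% figure in the parenthetical example can be written out explicitly, under the same idealization of equal atomic volumes in the unstressed bcc and fcc states:

```latex
\[
\lambda_1 = \lambda_2 = \sqrt{2}\,\lambda_3 ,
\qquad
\lambda_1 \lambda_2 \lambda_3 = 1
\;\Longrightarrow\;
\frac{\lambda_1^{3}}{\sqrt{2}} = 1
\;\Longrightarrow\;
\lambda_1 = 2^{1/6} \approx 1.122 .
\]
```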
Figure 20. Mechanical response of the EAM model of Ni, exhibiting a positive-sloped (dl1/dλ1 > 0), secondary, face centered orthorhombic, branch path at the c22 = c23 eigenstate (where the face centered orthorhombic [110] uniaxial loading path of the bcc structure branches from the tetragonal [100] loading path of the fcc structure): (a) internal energy, (b) true stress, (c) change of volume relative to the volume of the unstressed fcc structure, VF, (d) transverse stretch. From Ref. [34].
4. Instability and Bifurcations in Isostress Molecular Dynamics Simulations
Molecular dynamics simulations, particularly when carried out under isostress conditions, add “new dimensions” to the studies of elastic stability, bifurcation, and post-bifurcation behavior, in that inhomogeneous branching and temperature effects may be readily investigated in a natural, unconstrained manner. Available numerical “tools” include the isostress ansatz Lagrangian of Parrinello and Rahman [3], canonical fluctuation formulas for computing stress- and temperature-dependent elastic moduli [35], and visualization techniques for determining instability mechanisms (i.e., for viewing the evolutions
of atomic configurations during the course of an instability). Here we illustrate the results of some applications of these computational methodologies in studies of stress-induced instabilities in Morse model crystals that have been thoroughly studied previously in lattice statics simulations. We begin with the IMD study by Zhao et al. [34] of thermally activated, inhomogeneous, shear modulus instabilities in bcc crystals under hydrostatic, isothermal conditions. In LS simulations, bcc Morse model crystals lose stability under pressure (P > 0) on the curve µB(L) shown in Fig. 2, where the shear modulus µB passes from positive to negative with decreasing pressure (increasing λ), and sc Morse model crystals are stable only in hydrostatic tension above the µS- and below the κ-curves. In the IMD simulations at temperatures approaching 0 K, the bcc and sc structures were also indefinitely stable in the ranges predicted by the LS computations, and the bcc and sc structures lost stability via inhomogeneous bifurcations that occurred in association with the vanishing of their respective shear moduli, µB and µS. These simulations demonstrated explicitly the applicability of Hill and Milstein’s stability criteria, relations (21), to cases where stability also is lost under inhomogeneous eigenmodes. The bcc crystals were observed to lose stability via the shearing of (11̄0) planes, in alternate [1̄1̄0] and [110] directions (which is an inhomogeneous eigenmode consistent with the µB = 0 eigenstate) and, post-bifurcation, the crystals were observed directly to transform to hcp crystals, wherein the sheared (11̄0) bcc planes became the (0001) planes of the new hcp crystal. This shearing mechanism for the bcc – hcp transformation was proposed much earlier by Burgers [36], based on analogous hexagonal geometry between the (110) planes in bcc crystals and (0001) planes in hcp crystals; the study by Zhao et al.
[34] demonstrated that the mechanism is “triggered” by loss of elastic stability of the bcc structure in conjunction with a vanishing or diminishing shear modulus µ. The influence of temperature on the transition is particularly interesting. Figure 21 shows the variation of µ with pressure and stretch at the temperatures T = 10⁻⁵, 1, and 10 K (data points) and in the LS simulations (solid lines) for a particular Morse model bcc crystal (log β = 4.54). It is seen that, at the indicated temperatures, there is very little difference in the µ-values at a given pressure or stretch. Figure 21 also shows the critical pressures, above which the bcc structure remained stable indefinitely (and below which instability occurred), indicated by arrows at (a), (b), and (c) for the temperatures T = 10⁻⁵, 1, and 10 K, respectively. (In order to avoid extremely long times to transformation, the critical states in Fig. 21 were determined in adiabatic simulations, although the crystals remained essentially isothermal until initiation of the instability.) With increasing temperature, the transformation is able to occur at increasing pressures and µ-values, owing to the effect of thermal activation; i.e., increased thermal agitation enables the larger enthalpy barriers that occur at greater µ-values, and hence at greater pressures, to be surmounted. Figure 22 shows the variation of enthalpy change
Figure 21. Shear modulus µ of a Morse model bcc crystal as a function of (1) pressure P and (2) stretch λ. Critical values of pressure and stretch, as determined in constant pressure IMD simulations, are indicated by arrows at (a) 10⁻⁵ K, (b) 1 K, and (c) 10 K. (Values of µ were also computed in “supercritical states”; i.e., by employing the fluctuation formulas prior to an incipient instability.) From Ref. [34].
during the instability of the bcc Morse model crystal with log β = 4.54, at T = 1 K, as an example of the influence of pressure on enthalpy barrier during instabilities that occurred during isothermal IMD simulations. Since the enthalpy barrier vanishes at the µ = 0 eigenstate, and apparently it decreases rapidly as this state is approached from within an initially stable region, we see in Fig. 21 that, at very low temperatures (i.e., where the atomic positions are usually considered as essentially “frozen”), there exists a remarkably strong influence of temperature upon the critical transformation pressure. Next consider IMD simulations of two Morse model fcc crystals, each subjected to two modes of uniaxial loading, i.e., [100] and [111]; the models employ values of log β = 3.864 and 6.288, and reproduce experimental values of the elastic moduli c11 and c12 and atomic volumes of unstressed Cu and Ni, respectively. Figures 23 and 24 show the variations of the moduli combinations that determine the stability ranges according to relations (33) and (34) (in [100] loading) and (38) in [111] loading, as computed by fluctuation formulas at various temperatures, and in the lattice statics computations
Figure 22. Enthalpy change ΔH vs. degree of transformation ξ under isothermal (temperature T = 1 K), isobaric conditions in molecular dynamics simulations of bcc to hcp transitions; ξ is a geometric, path dependent parameter that varies from 0 (in the bcc state) to 1 (in the hcp state); in the main figure and in the upper left inset, pressure P = 2.43 GPa; for curves (1), (2), (3), and (4) in the upper right inset, P = 0, 0.63, 1.26, and 2.43 GPa, respectively; the solid curves are polynomial least-squares fits to the IMD results. From Ref. [34].
(Ref. [39] and Zhao, Maroudas, and Milstein, to be published). The LS computations of the moduli combinations (which are based on the Green variables) yield good agreement with the corresponding moduli combinations computed by fluctuation methods at 1 K in the molecular dynamics simulations, and loss of stability in the IMD simulations at 1 K, under both the [100] and the [111] modes, corresponds well with the violations of the stability criteria. In particular, in [100] compressive loading, loss of stability is associated with stationarity of the generalized force, p1 , which occurs near a load or stress minimum, and in [100] tension, stability is lost at the invariant c22 = c23 eigenstate. In [111] loading of both Morse model crystals, in both tension and compression, stability is terminated in conjunction with violation of the second of relations (38), in agreement with the earlier LS work although the IMD [111] Cu simulations lost compressive stability earlier than might be expected. Increasing temperature is seen to induce instabilities at smaller stress magnitudes owing to the effects of thermal activation, as is discussed above for the case of hydrostatic simulations. In the range of temperatures depicted in Figs. 23 and 24, increasing temperature also generally causes the instabilities to occur at
Figure 23. Mechanical response of Morse models of fcc Ni and Cu under uniaxial [100] tension and compression. Critical values of axial stretch λ1 and stress σ1 , as determined in IMD simulations, are indicated by arrows at (a) for temperature T = 1 K and at (b) for T = 300 K. The crystals remained stable indefinitely at stress magnitudes below the indicated critical stress magnitudes and lost stability at greater stress magnitudes. From Ref. [39] and Zhao, Maroudas, and Milstein, to be published.
Figure 24. Mechanical response of Morse models of fcc Ni and Cu under uniaxial [111] tension and compression. Expressions B = (c11 + c12)c33 − 2c13² and A = (c11 − c12)c44 − 2c14² (see relations (38)). Critical values of axial stretch λ1 and stress σ1, as determined in IMD simulations, are indicated by arrows at (a) for temperature T = 1 K, at (b) for T = 300 K, and, for Ni, at (c) for T = 500 K. The crystals remained stable indefinitely at stress magnitudes below the indicated critical stress magnitudes and lost stability at greater stress magnitudes. From Zhao, Maroudas and Milstein, to be published.
smaller stretch magnitudes, although Cu in [100] compression is an exception, evidently owing to the “softening” of the crystal at higher temperatures. The modes of instabilities in [100] tension are particularly interesting. In both cases the bifurcation starts with the second of the eigendeformations listed in (35), although branching occurs in domains, rather than homogeneously; in alternating domains, δλ2 > 0, δλ3 < 0, and δλ2 < 0, δλ3 > 0; in the Ni model, wherein the instability occurs at a relatively large axial stretch, the instability leads to fracture; in the Cu model, wherein the instability occurs at a smaller axial stretch, the domains are able to rotate in opposite directions, leading to a phase change that exhibits a remarkable atomic pattern formation (Ref. [39]). Whether the modes of instability in [111] loading correspond to the eigendeformations (44) and/or (45) remains to be determined.
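The thermal-activation argument used throughout this section (higher temperature allows larger enthalpy barriers to be surmounted, so instability sets in farther from the zero-modulus eigenstate) can be made semi-quantitative with a simple estimate. The sketch below assumes an Arrhenius escape rate, which is a standard transition-state picture but is not stated in the text; the attempt frequency and observation window are hypothetical round numbers, not parameters of the cited simulations.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def max_surmountable_barrier(temperature_k, attempt_rate_hz=1.0e13,
                             window_s=1.0e-9):
    """Barrier height (eV) for which the assumed Arrhenius rate
    attempt_rate * exp(-barrier / (k_B T)) gives one crossing, on
    average, within the observation window."""
    return K_B_EV_PER_K * temperature_k * math.log(attempt_rate_hz * window_s)


# Temperatures used for the hydrostatic bcc simulations in the text.
for temperature in (1e-5, 1.0, 10.0):
    barrier = max_surmountable_barrier(temperature)
    print(f"T = {temperature:g} K: barrier up to about {barrier:.3e} eV")
```

Under this assumption the surmountable barrier scales linearly with temperature, so even a change from 1 K to 10 K raises it tenfold; combined with a barrier that falls steeply as the eigenstate is approached, this is consistent with the strong low-temperature sensitivity of the critical pressure noted above.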
References
[1] R. Hill, Math. Proc. Camb. Phil. Soc., 77, 225, 1975.
[2] F. Milstein and S. Chantasiriwan, Phys. Rev. B, 58, 6006, 1998.
[3] M. Parrinello and A. Rahman, J. Appl. Phys., 52, 7182, 1981.
[4] M. Born, Proc. Camb. Phil. Soc., 36, 160, 1940.
[5] M. Born and R. Furth, Proc. Camb. Phil. Soc., 36, 454, 1940.
[6] R. Hill and F. Milstein, Phys. Rev. B, 15, 3087, 1977.
[7] F. Milstein and R. Hill, J. Mech. Phys. Solids, 27, 255, 1979.
[8] F. Milstein and R. Hill, Phys. Rev. Lett., 43, 1141, 1979.
[9] N.H. Macmillan and A. Kelly, Proc. R. Soc. London, A, 330, 291, see also p. 309, 1972.
[10] F. Milstein, Phys. Rev. B, 3, 1130, 1971.
[11] F. Milstein, R. Hill, and K. Huang, Phys. Rev. B, 21, 4282, 1980.
[12] R. Hill, Adv. Appl. Mech., 18, 1, 1978.
[13] J. Wang, S. Yip, S.R. Phillpot, and D. Wolf, Phys. Rev. Lett., 71, 4182, 1993.
[14] J.W. Morris, Jr., C.R. Krenn, D. Roundy, and M.L. Cohen, “Elastic stability and the limit of strength,” In: P.E. Turchi and A. Gonis (eds.), Phase Transformations and Evolution in Materials, Warrendale, PA, TMS, pp. 187–207, 2000.
[15] F. Milstein and R. Hill, J. Mech. Phys. Solids, 25, 457, 1977.
[16] F. Milstein and R. Hill, J. Mech. Phys. Solids, 26, 213, 1978.
[17] F. Milstein, “Crystal elasticity,” In: H.G. Hopkins and M.J. Sewell (eds.), Mechanics of Solids, Pergamon Press, Oxford and New York, pp. 417–451, 1982.
[18] J. Zhao, D. Maroudas, and F. Milstein, Phys. Rev. B, 62, 13799, 2000.
[19] D.J. Rasky and F. Milstein, Phys. Rev. B, 33, 2765, 1986.
[20] F. Milstein and D.J. Rasky, Phys. Rev. B, 54, 7016, 1996.
[21] D.A. Young, Phase Diagrams of the Elements, University of California Press, Berkeley, 1991.
[22] S. Chantasiriwan and F. Milstein, Phys. Rev. B, 58, 5996, 1998.
[23] S. Chantasiriwan and F. Milstein, Phys. Rev. B, 48, 14080, 1996.
[24] F. Milstein, H.E. Fang, X.Y. Gong, and D.J. Rasky, Solid State Commun., 99, 807, 1996.
[25] F. Milstein and D.J. Rasky, Phil. Mag. A, 45, 49, 1982.
[26] F. Milstein and J. Marschall, Acta Metall. Mater., 40, 1229, 1992.
[27] F. Milstein, Solid State Commun., 34, 653, 1980.
[28] F. Milstein, H.E. Fang, and J. Marschall, Phil. Mag. A, 70, 621, 1994.
[29] F. Milstein, Y.C. Tang, K. Huang, and R. Hsu, Phil. Mag. A, 48, 871, 1983.
[30] F. Milstein and K. Huang, Phys. Rev. B, 18, 2529, 1978.
[31] F. Milstein and B. Farber, Phys. Rev. Lett., 44, 277, 1980.
[32] F. Milstein, J. Marschall, and H.E. Fang, Phys. Rev. Lett., 74, 2977, 1995.
[33] R. Hill, Math. Proc. Camb. Phil. Soc., 92, 167, 1982.
[34] J. Zhao, S. Chantasiriwan, D. Maroudas, and F. Milstein, “Atomistic simulations of the mechanical response and modes of failure in metals at finite strain,” In: Proceedings of the Tenth International Conference on Fracture (Honolulu, Hawaii), Elsevier, Amsterdam, contribution IFC10 0575OR, 2001.
[35] J.R. Ray, Comput. Phys. Rep., 8, 109, 1988.
[36] W.G. Burgers, Physica (Amsterdam), 1, 561, 1935.
[37] K. Huang, F. Milstein, and J.A. Baldwin, Jr., Phys. Rev. B, 10, 3635, 1974.
[38] F. Milstein and J. Marschall, Phil. Mag. A, 58, 365, 1988.
[39] F. Milstein, J. Zhao, and D. Maroudas, Phys. Rev. B, 70, 184102, 2004.
4.3 TOWARD A SHEAR-TRANSFORMATION-ZONE THEORY OF AMORPHOUS PLASTICITY
Michael L. Falk¹, James S. Langer², and Leonid Pechenik²
¹ University of Michigan, Ann Arbor, MI, USA
² University of California, Santa Barbara, CA, USA
Our understanding of how solid objects bend and break seems poised for major progress because of recent developments in computational capabilities, in experimental techniques, especially high resolution microscopy, and in basic theoretical understanding of non-equilibrium phenomena. In this chapter, we describe some ideas in the theory of amorphous plasticity that have been inspired and enabled by these developments. We focus on two fundamental questions: (i) “How do materials undergo transitions from hardening to flow under applied stresses?” and (ii) “How best can the relevant aspects of microstructural dynamics be incorporated into constitutive theories?”
Perhaps the most important goal for modern solid mechanics is a general equation, or a set of equations, that can play the same role for plastically deforming materials that the Navier–Stokes equation plays for fluids and Hooke’s law plays for elastic solids. While most real fluids and real elastic solids exhibit important deviations from Newtonian or Hookean behavior, the solutions to these equations provide a firm basis for our understanding of phenomena ranging from flow in a pipe to the stress field around a crack. These two paradigms have been so powerful that, when describing a deforming solid, scientists and engineers tend to use equations that mimic one or the other. Constitutive laws are developed either for low stresses and low temperatures by prescribing functional relations between stress and strain, or else they are developed for high stresses and high temperatures in the form of equations that relate stress to strain rate. What is lacking is a more widely applicable but comparably tractable constitutive relation that can capture both kinds of behavior. Specifically, we need an equation that can describe the exchange of stability that accompanies the transition from hardening to flow.
S. Yip (ed.), Handbook of Materials Modeling, 1281–1312. © 2005 Springer. Printed in the Netherlands.
The existence of such an equation would allow us to deal quantitatively with situations in which hardening and flow coexist and, together, determine the performance of
M.L. Falk et al.
real materials. Ultimately, such a constitutive description should tell us what actually happens when a material is apparently making a transition between brittle and ductile failure. In this chapter we discuss the development of such a constitutive law for deformation of amorphous solids. Where possible, we will describe the predictions of this model in the context of the experimental and theoretical literature on metallic glasses and amorphous polymers. A generally applicable constitutive theory of plasticity of the kind that we think is needed must necessarily be couched in terms of macroscopic quantities – stresses, strains, and coarse-grained internal state variables – and yet it must be based on atomistic concepts and must be consistent with fundamental symmetries and conservation laws. These principles have led us to move in some directions that differ from work that has been done in this field in recent decades. We shall point out briefly several places where earlier theories seem to omit essential physical mechanisms or even violate basic physical principles. We shall also show how we have used the laws of thermodynamics to constrain the form of our equations of motion. The way in which we relate phenomenology to atomistic mechanisms seems especially important if we are successfully to predict both the mechanical response and structural evolution of real materials. Much current work along these lines focuses on dislocation structures. However, while dislocations are crucially important for understanding the behavior of crystals, we believe they are not necessarily the most promising place to begin investigating the general relationship between structure and deformation. This is because the dislocation structures that govern deformation in crystals are composed of large numbers of extended one-dimensional defects. 
We propose that amorphous solids, being the most liquid-like of the solid materials, provide a more nearly ideal place to begin an investigation that will eventually provide useful guidance for understanding general classes of deformation behavior.
1. Phenomenology from Experiment and Simulation
At the time of the birth of modern theories of plasticity in the mid 20th century there were severe limitations on the data available to guide the development of such a theory [1]. The data that were available did not detail rate or history dependences of plastic deformation or carefully consider temperature dependences. Additionally, very limited information was available on the microstructural changes that accompanied deformation. Today a large number of experimental techniques have been devised to investigate these aspects of materials response in more detail. Not only have the quality and quantity of the available data improved, but the qualitative nature of the measurements has broadened. Foremost among these advances is the ability to analyze microstructural changes on a wide range of spatial scales. It is important to note that these
Toward a shear-transformation-zone theory
experimental advances in analyzing microstructure have, for the most part, been limited to crystalline materials. The study of amorphous solid microstructure has been much slower to advance; but recently new methods have begun to emerge, including fluctuation microscopy [2] and quantitative techniques in high resolution electron microscopy [3, 4], that have revealed important structure even in materials with no immediately discernible crystal structure. In the absence of a standard set of methods for microstructural and atomic-scale characterization, advances in understanding glasses have relied heavily on an entirely new means of investigating materials properties: atomic-scale simulation. The important difference between these techniques and continuum computational methods is that properties can emerge from these methods that were not a priori included in the model. For example, dislocations and fracture can be observed in atomistic simulations of solids for which only the initial crystal structure and interatomic bonding potential are supplied as input. Thus these simulation techniques can be used to observe the consequences of assumptions regarding bonding and structure for materials behavior on larger scales. In the context of amorphous materials, investigations of this nature actually pre-date the advent of computer simulations. Some of the earliest “atomic-scale” simulations of amorphous deformation were bubble raft experiments that were used as an analog for atomic scale structure in a metallic glass [5]. These and subsequent investigations were crucial for providing a qualitative picture of the nature and behavior of the microscopic structures that mediate shear deformation in disordered matter. These regions are typically referred to as “shear transformation zones” (STZs) or “flow defects”. Figure 1 shows an image of the result of a two-dimensional molecular dynamics simulation of 20 000 atoms having undergone a small amount of shear.
The darker regions show where the atoms have undergone the most rearrangement. Figure 2 shows a close-up image of one such region. By examining many regions of this type, a number of assertions can be made regarding their behavior, essentially providing a phenomenology of the defect dynamics of the amorphous solid:
• STZs are orientational in nature: they transform preferentially under a particular applied stress.
• STZs are two-state systems: after transforming once they can reverse their transformation, but they cannot continuously undergo repeated transformations in the same direction.
• STZ transformations may be thermally activated or mechanically driven.
These three assertions will form the basis for the theoretical analysis to follow. In addition to dwelling on the micromechanics of deformation, it is important to consider some aspects of the macroscopically observed material behavior that our theory must explain. In this chapter we will consider what we
M.L. Falk et al.
Figure 1. A 2D simulation of a binary Lennard–Jones glass composed of 20 000 atoms undergoing less than 1% shear. Dark areas are local regions that have rearranged plastically.
believe to be the most important features of homogeneous deformation in these materials. In particular we wish to model both the shear softening behavior and the shear thinning (sometimes called super-plasticity) that are commonly observed in both metallic glasses and amorphous polymers. Both of these will be discussed in more detail below. We will not consider the formation of shear bands, crazes or other aspects of localization. While strain localization is an extremely important failure mode in amorphous polymers and metallic glasses, understanding these phenomena requires first a thorough understanding of the physics of homogeneous deformation. Thus, we expect that a physically complete theory of homogeneous deformation will also predict the subsequent instabilities that develop into inhomogeneities. Figure 3 shows typical stress–strain behavior near the glass transition temperature in a Zr-based metallic glass, as measured by Lu et al. [6].
Figure 2. A closeup of a shear transformation zone in the sheared Lennard–Jones glass shown in Fig. 1 before and after undergoing a local rearrangement.
Figure 3. Stress–strain behavior of a Zr41.2Ti13.8Cu12.5Ni10Be22.5 bulk metallic glass measured for a range of strain rates and temperatures [6].
Figure 4 shows similar curves from an amorphous polymer system studied by Hasan and Boyce [7, 8]. A prominent peak in the transient loading curve is observed in the limit of high strain rates, low temperatures and long annealing times. As we shall see in the thermal STZ theory presented below, such behavior can be considered to arise from a dynamic lag between the application of a driving stress and the time needed for the glassy microstructure to adjust to its steady-state STZ density. Figure 5a shows viscosity versus strain rate for the same metallic glass system shown in Fig. 3, taken from the same work. At low strain rates there exists a well defined Newtonian regime, but at high strain rates the viscosity decreases with increasing strain rate. The scaling behavior shown in Fig. 5b was first observed by Kato et al. [9], and is quite commonly observed in complex fluid and polymer rheology [10].
Figure 4. Stress–strain behavior of annealed and quenched polystyrene at temperature 296 K and strain rate −0.001 s−1 [7]. Axes: true stress (MPa, 0 to 100) versus true strain (0 to 0.3); curves: annealed and quenched.
Simulation has also served a useful purpose in studying the mechanical and rheological properties of amorphous metals and polymers via ersatz experiments analogous to those performed in the laboratory. It is important to note that simulation techniques often impose a much different set of restrictions than are imposed by experimental techniques. For example, while experimental loading may be limited to near quasi-static conditions and moderate strain rates, molecular dynamics simulations are typically restricted to very high strain rates. Although this constraint limits our ability to compare simulation directly to experiment, it provides a distinct window on another, possibly important, regime of the material systems’ physical response. Thus simulations can be used to access ultra-high quench rates and ultra-high strain rates, to perform repeated simulations on absolutely identical samples, and to prevent failure modes by imposing strict boundary conditions, all of which are difficult or impossible to do experimentally.
Figure 5. Rheological behavior of a Zr41.2Ti13.8Cu12.5Ni10Be22.5 bulk metallic glass measured for a range of temperatures [6].
Figure 6. Transient stress strain behavior of a 3D binary Lennard–Jones glass prepared by quenching from five different initial liquid temperatures [11].
An example of simulations of this type is the set of stress–strain curves shown in Fig. 6, taken from work by Albano et al. [11]. These represent glass samples that have been produced by instantaneously quenching from a number of different equilibrium liquid temperatures. The strain rates of the tests are in the range of 22% per nanosecond and the temperatures are approximately 40%
Figure 7. Rheological behavior of a molecular dynamics simulation of a confined polymeric system [12].
of the glass transition temperature. Again we see the very generic shear softening behavior that arises in the glasses that were quenched from the lower-temperature liquids. Figure 7, from Ref. [12], shows shear thinning behavior similar to that observed in Fig. 5a, but in this case for a simulation of a confined thin polymer film. Clearly both the shear softening and shear thinning phenomena are quite generic and are observed across materials systems and in both simulation and experiment.
2. Theoretical Background: Viscoplastic Constitutive Theories
A considerable number of constitutive theories have been proposed to describe deformation in amorphous solids. It will not be possible to address all of these theories here. We will, however, consider the most prominent theories describing the mechanical behavior of amorphous metals and polymers. We consider these theories in the light of our statements at the beginning of this chapter. That is to say that, while our goal is to learn as much as we can from these previous developments, ultimately we will only be satisfied with a theory that naturally describes the transition from hardening to flow and accomplishes this in such a way that the terms are clearly physical in their
assumptions. In these respects the theories we discuss here, while insightful, appear to us to be incomplete and, in some cases, physically unrealistic. The particular theories we will use as points of comparison are the “flow defect” theory as originally developed by Spaepen [13] and more recently extended by Sietsma and coworkers (De Hey et al. [14]) in the context of metallic glasses, and the constitutive models for amorphous polymers developed by Hasan et al. [15] and Hasan and Boyce [8]. All of these theories have a number of features in common. They begin from a stress-assisted thermal activation formalism originated by Eyring [16] (see also Ref. [17]), which was applied to polycrystalline plasticity by Kocks et al. [18] and to amorphous plasticity by Argon [19]. In this formalism, the plastic strain rate is proportional to the number density of deformable sites in the system, $n$, times the net rate of transitions that promote shear:
\[ \dot{\epsilon}^{pl} = \frac{n \nu\, \mathbf{s}}{\bar{s}} \left[ \exp\left( -\frac{\Delta G(\bar{s})}{k_B T} \right) - \exp\left( -\frac{\Delta G(-\bar{s})}{k_B T} \right) \right]. \tag{1} \]
In this equation $\nu$ is a molecular vibration frequency; $k_B$ is Boltzmann’s constant; $T$ is the temperature; $\mathbf{s}$ is the deviatoric part of the applied stress, and $\bar{s}$ is its magnitude. Here $\Delta G(\bar{s})$ is an activation barrier for transitions that promote shear in the direction of the applied deviatoric stress $\mathbf{s}$, while $\Delta G(-\bar{s})$ is the activation barrier for transformations that promote shear in the reverse direction. Note that each activation barrier is a function only of the magnitude of the applied deviatoric stress. It is also important to note that this formalism proposed by Eyring includes only one scalar population $n$ because it was first conceived in the context of a “theory of holes” that attributed viscous rearrangement to the motion of vacancy-like objects in a liquid. Eyring further made the assumption that, to a good approximation, the stress dependence of these activation barriers is linear, i.e., $\Delta G(\bar{s}) = \Delta G_0 - \Omega \bar{s}/2$ with $\Omega$ an activation volume, leading to the simplified form:
\[ \dot{\epsilon}^{pl} = \frac{2 n \nu\, \mathbf{s}}{\bar{s}} \exp\left( -\frac{\Delta G_0}{k_B T} \right) \sinh\left( \frac{\Omega \bar{s}}{2 k_B T} \right). \tag{2} \]
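The step from Eq. (1) to Eq. (2) can be checked numerically. The following Python sketch evaluates both forms for a scalar stress. It assumes that the linear barrier carries an explicit activation volume (called `omega` here); every parameter value is an arbitrary dimensionless placeholder, not a fitted material constant.

```python
import math

def eyring_rate_exponentials(s, n=1.0, nu=1.0, dG0=5.0, omega=1.0, kT=1.0):
    """Eq. (1) for a scalar stress s, using the assumed linear barrier
    dG(+s_bar) = dG0 - omega*s_bar/2 and dG(-s_bar) = dG0 + omega*s_bar/2."""
    s_bar = abs(s)
    if s_bar == 0.0:
        return 0.0
    direction = s / s_bar  # scalar stand-in for the factor s / s_bar
    forward = math.exp(-(dG0 - omega * s_bar / 2.0) / kT)
    backward = math.exp(-(dG0 + omega * s_bar / 2.0) / kT)
    return n * nu * direction * (forward - backward)

def eyring_rate_sinh(s, n=1.0, nu=1.0, dG0=5.0, omega=1.0, kT=1.0):
    """Eq. (2): the same rate collapsed into a single sinh."""
    s_bar = abs(s)
    if s_bar == 0.0:
        return 0.0
    return (2.0 * n * nu * (s / s_bar) * math.exp(-dG0 / kT)
            * math.sinh(omega * s_bar / (2.0 * kT)))

# The two columns of rates should agree to rounding error.
for s in (-2.0, -0.3, 0.3, 2.0):
    print(s, eyring_rate_exponentials(s), eyring_rate_sinh(s))
```

In this form any nonzero stress produces a nonzero strain rate, so a single population $n$ leaves no room for a jammed state at finite stress.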
All the theories that follow attempt to describe the deformation dynamics by postulating equations of motion for $n$ and, in some cases, for the distribution of $n$ as a function of activation energy. Both De Hey et al. [14] and Hasan et al. [15] chose equations of motion of the general form
\[ \dot{n} = \dot{n}_T(n, T) + \dot{n}_\epsilon(n, \dot{\epsilon}^{pl}), \tag{3} \]
where $\dot{n}_T$ models the dynamics of thermal relaxation and $\dot{n}_\epsilon$ models the production of defects induced by plastic strain. Hasan et al. [15] exclusively model $\dot{n}_\epsilon$ and neglect the effects of aging arising from $\dot{n}_T$. Duine et al. [20] consider several possible forms for $\dot{n}_T$ and claim that the empirical data
obtained by differential scanning calorimetry (DSC) of metallic glasses is best described by
\[ \dot{n}_T = -k_r\, n\,(n - n_{eq}), \tag{4} \]
where $k_r$ is a thermally activated rate factor and $n_{eq}$ is the thermal equilibrium density of flow defects in the absence of shear. It is important to note that this thermodynamic analysis of DSC data requires the additional assumptions of the free volume theory originated by Turnbull and Cohen [21] and further developed by Spaepen [13], which begins by assuming that the defect density is related to the free volume by an equation of the form
\[ n \propto \exp\left( -\frac{V^*}{v_f} \right), \tag{5} \]
where $v_f$ is the free volume and $V^*$ is a molecular volume. Van den Beukel and Sietsma [22] further assume that the rate of change of enthalpy is directly proportional to the rate of change of $v_f$. This is a critical assumption of the free volume model and, as far as we are aware, has not been directly compared to any independent measure of $v_f$. We will address these assumptions and some thermodynamic arguments for the necessity of a $v_f$-like parameter toward the end of the chapter. There is considerable disagreement among the authors of these models regarding the form of $\dot{n}_\epsilon$ in Eq. (3). In the context of polymers, Hasan et al. [15] initially chose the form
\[ \dot{n}_\epsilon = -\frac{\dot{\epsilon}^{pl}}{\tau_p}\,(n - n_\infty), \tag{6} \]
where $\tau_p$ is a proportionality constant and $n_\infty$ is the steady-state defect density under shear at low temperature. In a later investigation, however, Hasan and Boyce [8] developed a more complex model in which the whole energy distribution, rather than just the total number of defects, shifts during deformation. In spirit, this model is similar to the free volume models mentioned in the previous paragraph, although in detail it is quite different. As previously mentioned, we will address the reasons one would include such extra degrees of freedom in the latter parts of this chapter, but here we simply describe some of the most notable features of the model. In particular, in this model $n \approx \exp(-a/kT)$ and
\[ \dot{a} = -k_p\,(a - a_{eq})\,\exp[-\xi \exp(-\xi \epsilon^{pl})], \tag{7} \]
where $k_p$ is a transition rate, and $\xi$ describes the sharpness with which the dynamics of $a$ “turn on” during plastic flow. An additional parameter $S$ is introduced that models the development of stresses that favor the reverse transformation during loading. Similarly, in the context of metallic glass deformation, Spaepen [13] and De Hey et al. [14] propose dynamics of $v_f$ which indirectly alter the defect
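density $n$ via Eq. (5). The form proposed in Ref. [14], that $\dot{v}_f = c\,\dot{\epsilon}^{pl}$, is the simpler of the two. In that model this free volume creation term is applied in conjunction with Eq. (4) above to produce an equation of motion for $v_f$ that models both shear-induced disordering and thermal relaxation. Spaepen [13] proposes a form which, with $\Omega$ denoting an activation volume, reads
\[ \dot{v}_f = \frac{V^* k_B T}{v_f M} \left[ \coth\left( \frac{\Omega \bar{s}}{2 k_B T} \right) - \left( 1 + \frac{M v_f}{n_D k_B T} \right) \mathrm{csch}\left( \frac{\Omega \bar{s}}{2 k_B T} \right) \right] \dot{\epsilon}^{pl}, \tag{8} \]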
where $M$ is a modulus. Note that in Eq. (8) we have factored out $\dot{\epsilon}^{pl}$ to make this form comparable to the other equations discussed above. At large values of $\bar{s}/k_B T$ and non-zero $T$, this expression reduces to a form close to that proposed by De Hey et al. [14], $d(v_f^2)/dt \propto \dot{\epsilon}^{pl}$. However, it behaves unphysically in the limit of vanishingly small $T$ and non-zero $\bar{s}$, where the steady-state value of $v_f$ is predicted to become infinite. Despite this difficulty, this equation of motion has been used by Huang et al. [23] and Steif et al. [24] in attempts to model shear band formation.
There are several aspects of these models that we find unsatisfactory. One prime example is the appearance of the plastic strain $\epsilon^{pl}$ in Eq. (7), which illustrates a problem that pervades much of plasticity theory. Fundamental principles imply that the plastic strain cannot serve as an internal state variable because it has no structural meaning outside of the ideally elastic regime. Materials undergoing large-scale irreversible deformations necessarily lose their memories of earlier configurations; thus displacements measured from much earlier reference states cannot meaningfully characterize current states of these systems. For similar reasons, couching plasticity theories in a Lagrangian framework in terms of strains measured relative to an arbitrarily determined initial reference state seems to be an unprofitable mathematical strategy. The importance of expressing such relations in terms of rate equations that are most easily described in Eulerian coordinates has long been recognized; for example, see Ref. [25]. Nevertheless, Lagrangian formulations in which the current plastic strain is given undue weight as a state-determining measure continue to be advocated in review articles [26] and used in advanced mathematical treatments of plasticity [27].
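The low-temperature divergence attributed to Eq. (8) above can be illustrated with a short Python sketch. It assumes that the stress enters the hyperbolic functions through a factor `omega * s / (2 kT)`, with `omega` an activation volume; the values of `M`, `n_D` and `omega` are arbitrary placeholders. Setting the bracket of Eq. (8) to zero gives cosh(x) − 1 = M v_f / (n_D k_B T), which the code solves for v_f.

```python
import math

def eq8_bracket(v_f, s, kT, M=1.0, n_D=1.0, omega=1.0):
    """Bracketed factor of Eq. (8): coth(x) - (1 + M v_f/(n_D kT)) csch(x)."""
    x = omega * s / (2.0 * kT)
    return 1.0 / math.tanh(x) - (1.0 + M * v_f / (n_D * kT)) / math.sinh(x)

def steady_state_v_f(s, kT, M=1.0, n_D=1.0, omega=1.0):
    """Free volume at which the bracket vanishes (placeholder units)."""
    x = omega * s / (2.0 * kT)
    return n_D * kT * (math.cosh(x) - 1.0) / M

# At fixed stress the steady-state free volume grows as kT is lowered.
for kT in (1.0, 0.5, 0.25, 0.1):
    print(kT, steady_state_v_f(1.0, kT))
```

In the same sketch the bracket tends to 1 when the stress term is large compared with k_B T at fixed v_f, which is the regime in which Eq. (8) approaches the simpler strain-rate-proportional form.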
Another serious problem in the above equations is that they use functional forms that cannot be made consistent with symmetry laws without violating analyticity requirements. For example, in Eq. (6) and the evolution equations of Refs. [13, 14], the rate of change of a scalar denoting the internal state of the system, $n$ or $v_f$, is proportional to the plastic strain rate, which is a tensor. Thus, although the production rate of defects due to shear should be positive regardless of the direction of the applied stress, when the sign of the shear is reversed in these equations, the defect production rate also changes sign. The authors most likely intended to use the magnitude of the plastic strain rate; but this also would be unsatisfactory because the absolute value is a nonanalytic
function that is unlikely to arise from any first-principles analysis of molecular mechanisms. Beyond these details, there are issues that arise directly from the forms of Eqs. (1) and (3), which underlie nearly all of these theories. In particular, the fact that $n$ is the prefactor of both the forward and backward transition rates reflects the fact that no attempt is being made to differentiate between the populations of the zones that undergo these two distinct transformations. Consequently, despite the fact that Eq. (1) ostensibly describes the forward and backward transitions of a two-state system, there is no way for these equations to describe the fact that once a region transforms it is not available for further transformation. This decoupling of the strain rate dynamics from the population dynamics implies that there is some fast relaxation mechanism (faster than any other rate introduced explicitly in the theory) which causes zones instantaneously to lose their memory of prior transformations. As a result, none of the above constitutive laws has the possibility of describing a transition from jammed to flowing behavior at a well defined yield stress.
In what follows, we describe a theoretical framework that we have developed in an attempt to correct the problems mentioned above and to include the two-state dynamics associated with STZs. We find that we can apply simple thermodynamically motivated physical arguments to derive the form of our equivalent of the term $\dot{n}_\epsilon$ in Eq. (3). We will accomplish this initially in the context of a low temperature theory that strictly applies only when thermally activated transitions are negligible, and we will then describe more briefly the ways in which thermal effects can be introduced into such a theory. The introduction of thermal effects is necessary to explain experimental data related to shear softening and shear thinning as described above.
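The sign problem raised above for Eq. (6) is easy to see numerically. The Python sketch below integrates Eq. (6) with a constant scalar strain rate of either sign; the values of `n_inf`, `tau_p`, the time step and the initial density are hypothetical and chosen only for illustration.

```python
def n_dot_eq6(n, strain_rate, n_inf=1.0, tau_p=0.1):
    """Right-hand side of Eq. (6) for a scalar strain rate."""
    return -(strain_rate / tau_p) * (n - n_inf)

def evolve(n0, strain_rate, dt=1e-3, steps=500, use_magnitude=False):
    """Forward-Euler integration of Eq. (6) at constant strain rate."""
    rate = abs(strain_rate) if use_magnitude else strain_rate
    n = n0
    for _ in range(steps):
        n += dt * n_dot_eq6(n, rate)
    return n

print(evolve(0.5, +1.0))                      # relaxes toward n_inf
print(evolve(0.5, -1.0))                      # runs away from n_inf
print(evolve(0.5, -1.0, use_magnitude=True))  # same as the forward case
```

Reversing the shear direction turns relaxation into runaway growth and drives the density negative, while taking the magnitude restores the relaxation at the cost of the nonanalytic absolute value noted in the text.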
We will finish by discussing how this theory points toward the need for a better understanding of the thermodynamics of out-of-equilibrium systems and, in particular, how it may imply a relationship between the concepts of free volume and “effective temperature.” We believe that refining such concepts is a necessary next step toward constructing constitutive laws for plasticity with less emphasis on pure phenomenology and more on the statistical mechanical consequences of well defined atomic-scale mechanisms of deformation.
3. The Low-temperature STZ Theory
We start by presenting the fundamental form of a constitutive relationship that exhibits a transition from jammed (viscoelastic) to flowing (viscoplastic) behavior. The theory will be introduced in the low-temperature limit, the limit in which the state of the system only changes when the strain rate is non-zero. In the derivation that follows we will take the STZ picture more literally than perhaps is necessary. We will assume that the material is riddled with local
regions of approximately identical size that can undergo local shear rearrangements, that these regions are initially randomly oriented, and that they behave essentially as two-state systems. Importantly, we must also assume that these STZs can be created and destroyed due to deformation of the surrounding medium. The STZ picture implies a continuum of orientations of zones in the material. That is to say that, at any point, the distribution of zones could be written as a function $\zeta(\hat{e}_\theta)$ that denotes the number of zones oriented in the direction $\hat{e}_\theta$, the unit vector in the direction with angular coordinate(s) $\theta$. Because STZs contain orientational (director) information rather than directional (vector) information, $\zeta$ is defined on the unit sphere (unit circle in 2D) so that $\zeta(\hat{e}_\theta) = \zeta(-\hat{e}_\theta)$. Because such a general field theory would be difficult to construct, we expand this orientational density field in terms of its first and second moments at each point. We write
\[ n_{tot}(\vec{r}) = \frac{1}{2} \int d\theta\, \zeta(\vec{r}, \theta), \tag{9} \]
\[ n_{ij}(\vec{r}) = \frac{1}{2} \int d\theta\, \zeta(\vec{r}, \theta)\, d_{ij}. \tag{10} \]
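For a 2D system the two moments in Eqs. (9) and (10) can be evaluated by simple quadrature. The Python sketch below uses the tensor $d_{ij} = 2\hat{e}_i \hat{e}_j - \delta_{ij}$ of Eq. (11) and a hypothetical orientational distribution `example_zeta`, chosen only because its moments are known in closed form (n_tot = π and n_xx = 0.2π).

```python
import math

def stz_moments_2d(zeta, n_points=720):
    """Midpoint-rule evaluation of Eqs. (9)-(10) on the unit circle.
    zeta(theta) is expected to satisfy zeta(theta) == zeta(theta + pi)."""
    dtheta = 2.0 * math.pi / n_points
    n_tot = n_xx = n_xy = n_yy = 0.0
    for k in range(n_points):
        theta = (k + 0.5) * dtheta
        ex, ey = math.cos(theta), math.sin(theta)
        weight = 0.5 * zeta(theta) * dtheta
        n_tot += weight
        n_xx += weight * (2.0 * ex * ex - 1.0)  # d_xx = 2 e_x e_x - 1
        n_xy += weight * (2.0 * ex * ey)        # d_xy = 2 e_x e_y
        n_yy += weight * (2.0 * ey * ey - 1.0)  # d_yy = 2 e_y e_y - 1
    return n_tot, n_xx, n_xy, n_yy

def example_zeta(theta):
    # Hypothetical distribution with a mild excess of zones along x.
    return 1.0 + 0.4 * math.cos(2.0 * theta)

print(stz_moments_2d(example_zeta))
```

The tensor moment comes out traceless (n_xx = −n_yy), and its xx component measures the excess of zones elongated along x over those elongated along y.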
Here we replace the spatially varying orientational function $\zeta$ with a spatially varying scalar field $n_{tot}$ corresponding to the number density of STZs, and a spatially varying traceless tensor field $n_{ij}$. By definition
\[ d_{ij} \equiv 2\hat{e}_i \hat{e}_j - \delta_{ij}, \tag{11} \]
and the integrals in Eqs. (9) and (10) are over the unit circle in 2D or the unit sphere in 3D. Rather than delve into the full tensor version of the STZ theory here (see Ref. [28] for details), it is more instructive to derive the constitutive equations by first specializing to the case of a 2D body (surface or film) loaded in pure shear with the principal axes of the loading aligned along the $xx$ and $yy$ directions. Furthermore, we will assume that all the STZ’s are aligned along the same principal axes. Without loss of generality, therefore, we let the deviatoric stress be diagonal along the $x$, $y$ axes; specifically, let $s_{xx} = -s_{yy} = s$ and $s_{xy} = 0$. Then choose the “+” zones to be oriented (elongated) along the $x$-axis and the “–” zones along the $y$-axis, and denote the population density of zones oriented in the “+/–” directions by the symbol $n_\pm$. The resulting equations are easily motivated in this context and a generalization to the tensorial 3D form of the equations is straightforward. We begin by expressing the plastic shear strain rate in terms of the rates of STZ transformations using a kinetic formalism that deviates from Eq. (1). This
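equation explicitly includes the effect of the relative populations of “+” and “–” zones on the plastic strain rate:
\[ \dot{\epsilon}^{pl} \equiv \dot{\epsilon}^{pl}_{xx} = -\dot{\epsilon}^{pl}_{yy} = \frac{\lambda}{\tau_0} \left( R(-s)\, n_- - R(+s)\, n_+ \right). \tag{12} \]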
Here $\lambda$ is a material-specific parameter with the dimensions of volume (or area in a strictly 2D model) which must be roughly equal to the size of an STZ, that is, a few atoms in size. The quantity in parentheses in Eq. (12) is the net rate per unit area at which STZ’s are transforming from “−” to “+” orientations. Here, $R(s)/\tau_0$ and $R(-s)/\tau_0$ are the rates for “+” to “−” and “−” to “+” transitions, respectively. The instantaneous local densities of “+” and “−” zones are denoted by $n_+$ and $n_-$, respectively. For simplicity, we write these rates as explicit functions of only the deviatoric stress $s$, although they depend implicitly on the temperature and pressure and perhaps other quantities as well. Note that the inclusion of separate densities for the “+” and “−” regions immediately distinguishes this approach from that in Eq. (1). The equation of motion for the populations $n_\pm$ generally must be a master equation of the form:
\[ \tau_0 \dot{n}_\pm = R(\mp s)\, n_\mp - R(\pm s)\, n_\pm + \Gamma(s, n_+, n_-) \left( \frac{n_\infty}{2} - n_\pm \right). \tag{13} \]
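The structure of Eq. (13) can be explored with a minimal Python sketch. For illustration only, the rate factors and Γ are held constant here (in the theory Γ depends on the state and the stress), and all numerical values are placeholders.

```python
def master_rhs(n_plus, n_minus, R_plus, R_minus, gamma, n_inf):
    """Right-hand side of Eq. (13): tau0 * (dn_plus/dt, dn_minus/dt).
    R_plus = R(+s) is the "+" to "-" rate factor, R_minus = R(-s) the reverse."""
    flips = R_minus * n_minus - R_plus * n_plus  # net "-" to "+" flips
    d_plus = flips + gamma * (0.5 * n_inf - n_plus)
    d_minus = -flips + gamma * (0.5 * n_inf - n_minus)
    return d_plus, d_minus

def evolve_populations(n_plus, n_minus, R_plus, R_minus, gamma,
                       n_inf=1.0, dt=1e-3, steps=20000):
    """Forward-Euler integration; time is measured in units of tau0."""
    for _ in range(steps):
        d_plus, d_minus = master_rhs(n_plus, n_minus, R_plus, R_minus,
                                     gamma, n_inf)
        n_plus += dt * d_plus
        n_minus += dt * d_minus
    return n_plus, n_minus

# With gamma = 0 the total population is conserved; with gamma > 0 it
# relaxes toward n_inf.
print(evolve_populations(0.1, 0.3, R_plus=0.5, R_minus=1.5, gamma=0.0))
print(evolve_populations(0.1, 0.3, R_plus=0.5, R_minus=1.5, gamma=0.5))
```

The flip terms only exchange zones between the two orientations, so any change in the total number of zones comes from the term proportional to Γ.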
The first two terms on the right-hand side are the stress-driven transition rates introduced in the preceding paragraph. They describe volume-conserving, pure-shear deformations which preserve the total population of the STZ’s. These terms have no equivalent in Eq. (3) above. The last two terms in parentheses, proportional to $\Gamma$, describe creation and annihilation of STZ’s. In the low-temperature theory, $\Gamma$ is nonzero only when the plastic strain rate is nonzero; the molecular rearrangements required for creating or annihilating STZ’s cannot occur spontaneously, that is, in the absence of external driving forces. The assumption in Eq. (13) that the annihilation and creation rates are both proportional to the same function $\Gamma$ has profound implications in this theory. Among those implications is the requirement that $n_\infty$ be a strain-rate independent constant. Note that $n_\infty$ is the total density of zones generated in a system that is undergoing steady plastic deformation. It is not the same as the equilibrium density at nonzero temperature and zero strain rate, which ordinarily is said to go rapidly to zero as the temperature decreases below the glass transition. On the other hand, $n_\infty$ is a property of low-temperature materials at non-zero strain rates. The form in which we have cast Eq. (13) is consistent with our assumption that, under steady-state conditions with unidirectional shear stress, the number of events in which the molecules rearrange themselves is not proportional to the time but to the strain. That picture seems intuitively reasonable. If
the system requires a certain number of STZ-like rearrangements in order to achieve some deformation, then it should not matter (within limits) how fast that deformation takes place. The picture breaks down, of course, when there are competing rearrangement mechanisms. For example, the density of STZ’s becomes strain-rate dependent when we introduce thermal fluctuations. We also expect that the picture may fail in polymeric glasses or polycrystalline solids, where more complex components may introduce extra length and time scales. We shall use the energetic arguments introduced in Ref. [29] to determine the factor $\Gamma$ in Eq. (13), but first we must discuss the state variables and specific forms for the transition rates. We define dimensionless state variables by writing
\[ \Lambda \equiv \frac{n_+ + n_-}{n_\infty}, \qquad \Delta \equiv \frac{n_+ - n_-}{n_\infty}. \tag{14} \]
In a more general treatment [28–31], $\Lambda$ remains a scalar density, but $\Delta$ becomes a traceless symmetric tensor with the same transformation properties as the deviatoric stress. In this way they are closely related to $n_{tot}$ and $n_{ij}$ introduced in Eqs. (9)–(10). We also define:
\[ \mathcal{S} \equiv \frac{1}{2} \left( R(-s) - R(+s) \right), \qquad \mathcal{C} \equiv \frac{1}{2} \left( R(-s) + R(+s) \right), \qquad \mathcal{T} \equiv \frac{\mathcal{S}}{\mathcal{C}}. \tag{15} \]
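Then the STZ equations of motion become:
\[ \tau_0 \dot{\epsilon}^{pl} = \epsilon_0\, \mathcal{C}(s) \left( \Lambda \mathcal{T}(s) - \Delta \right); \tag{16} \]
\[ \tau_0 \dot{\Delta} = 2\, \mathcal{C}(s) \left( \Lambda \mathcal{T}(s) - \Delta \right) - \Gamma(s, \Lambda, \Delta)\, \Delta; \tag{17} \]
and
\[ \tau_0 \dot{\Lambda} = \Gamma(s, \Lambda, \Delta)\,(1 - \Lambda). \tag{18} \]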
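The change of variables that leads from the master equation (13) to the state-variable equations can be verified numerically. The Python sketch below computes the rates of the two dimensionless state variables once directly from the populations and once from the closed-form expressions (sum and difference of the two populations divided by n_∞, with the symmetric and antisymmetric rate combinations of Eq. (15)). All numerical inputs are arbitrary test values, and Γ is treated as a given number.

```python
import math

def rates_from_populations(n_plus, n_minus, R_plus, R_minus, gamma, n_inf):
    """tau0 * (dLambda/dt, dDelta/dt) obtained directly from Eq. (13)."""
    flips = R_minus * n_minus - R_plus * n_plus
    d_plus = flips + gamma * (0.5 * n_inf - n_plus)
    d_minus = -flips + gamma * (0.5 * n_inf - n_minus)
    return (d_plus + d_minus) / n_inf, (d_plus - d_minus) / n_inf

def rates_from_state(lam, delta, R_plus, R_minus, gamma):
    """tau0 * (dLambda/dt, dDelta/dt) from the forms of Eqs. (17)-(18)."""
    S = 0.5 * (R_minus - R_plus)
    C = 0.5 * (R_minus + R_plus)
    T = S / C
    return gamma * (1.0 - lam), 2.0 * C * (lam * T - delta) - gamma * delta

n_plus, n_minus, n_inf = 0.3, 0.5, 1.2       # arbitrary test populations
R_plus, R_minus, gamma = 0.7, 1.9, 0.4       # arbitrary test rates
lam = (n_plus + n_minus) / n_inf
delta = (n_plus - n_minus) / n_inf
print(rates_from_populations(n_plus, n_minus, R_plus, R_minus, gamma, n_inf))
print(rates_from_state(lam, delta, R_plus, R_minus, gamma))
```

Agreement of the two printed pairs confirms that the factor 2C(ΛT − Δ) is just the net flip rate of Eq. (13) rewritten in the new variables.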
Here, $\epsilon_0 \equiv \lambda n_\infty$ is roughly the fraction of the total volume of the low-temperature system in steady-state flow that is covered by the STZ’s. This is a material-specific quantity. If $\epsilon_0$ is small, then the disorder induced in the system by deformation is small. Conversely, if $\epsilon_0$ is large, then the STZ-like defects cover the system and the material in some sense “melts” under persistent straining. Throughout this paper, we shall use only what we call the “quasi-linear” version of these equations [30]. That is, we note that $\mathcal{T}(s)$ and $\mathcal{C}(s)$ are, respectively, antisymmetric and symmetric dimensionless functions of $s$, and write:
\[ \mathcal{T}(s) \cong \frac{s}{s_y} \equiv \tilde{s}; \qquad \mathcal{C}(s) \cong 1, \tag{19} \]
where $s_y$ will turn out to be the yield stress. The choice $\mathcal{C}(s) \cong 1$ is, in effect, our definition of the time constant $\tau_0$. As pointed out in earlier papers [29, 30], this quasilinear approximation has important shortcomings. Neglecting the stress dependence of $\mathcal{C}(s)$ means that we overestimate the amount of plastic deformation that occurs at small stresses and therefore also overestimate the rate at which orientational memory disappears in unloaded systems. Moreover, the quasilinear approximation is too simplistic to be related directly to atomic mechanisms, a point that we shall comment upon further. Nevertheless, the quasilinear theory has the great advantage that it is mathematically tractable and easy to interpret. It will serve to illustrate the main points that we wish to make in this paper, but aspects of the nonlinearities associated with $\mathcal{C}$ and $\mathcal{T}$ will need to be reintroduced before we shall be able to understand fully the non-equilibrium behavior of amorphous solids. Equations (16)–(18) now become:
\[ \tau_0 \dot{\epsilon}^{pl} = \epsilon_0 \left( \Lambda \tilde{s} - \Delta \right); \tag{20} \]
\[ \tau_0 \dot{\Delta} = 2 \left( \Lambda \tilde{s} - \Delta \right) - \Gamma(\tilde{s}, \Lambda, \Delta)\, \Delta; \tag{21} \]
and
\[ \tau_0 \dot{\Lambda} = \Gamma(\tilde{s}, \Lambda, \Delta)\,(1 - \Lambda). \tag{22} \]
The quantity $\Gamma n_\infty / \tau_0$ is the STZ creation rate. We can derive an expression for that rate by using the energy-balance argument introduced in Ref. [29]. As before, we start by writing the first law of thermodynamics in the form:
\[ 2 \dot{\epsilon}^{pl} s = \frac{2 \epsilon_0 s_y}{\tau_0} \left( \Lambda \tilde{s} - \Delta \right) \tilde{s} = \epsilon_0 s_y \frac{d}{dt} \psi(\Lambda, \Delta) + Q(\tilde{s}, \Lambda, \Delta). \tag{23} \]
The left-hand side of Eq. (23) is the rate at which plastic work is done by the applied stress $s = s_y \tilde{s}$. On the right-hand side, $\epsilon_0 s_y \psi$ is the state-dependent recoverable internal energy, and $Q$ is the dissipation rate. So long as the STZ’s remain uncoupled from the heat bath, $Q$ must be positive in order for the system to satisfy the second law of thermodynamics, that is, for the work done in going around a closed cycle in the space of variables $s$, $\Lambda$, and $\Delta$ to be non-negative. As argued in Ref. [29], the simplest and most natural choice for $\Gamma$ (and, so far as we can tell, the only one that produces a sensible theory) is that it be the energy dissipation rate per STZ. That is,
\[ Q(\tilde{s}, \Lambda, \Delta) = \frac{\epsilon_0 s_y}{\tau_0}\, \Lambda\, \Gamma(\tilde{s}, \Lambda, \Delta). \tag{24} \]
With this hypothesis, we can use Eqs. (21) and (22) to write Eq. (23) in the form
\[ 2 \left( \Lambda \tilde{s} - \Delta \right) \tilde{s} = \frac{\partial \psi}{\partial \Lambda}\, \Gamma\, (1 - \Lambda) + \frac{\partial \psi}{\partial \Delta} \left( 2 (\Lambda \tilde{s} - \Delta) - \Gamma \Delta \right) + \Lambda \Gamma. \tag{25} \]
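Then, solving for $\Gamma$, we find:
\[ \Gamma = \frac{2 (\Lambda \tilde{s} - \Delta)(\tilde{s} - \partial\psi/\partial\Delta)}{\Lambda + (1 - \Lambda)(\partial\psi/\partial\Lambda) - \Delta\,(\partial\psi/\partial\Delta)}. \tag{26} \]
To assure that $\Gamma$ remains non-negative for all $\tilde{s}$, we must let
\[ \frac{\partial \psi}{\partial \Delta} = \frac{\Delta}{\Lambda}, \tag{27} \]
so that the numerator becomes $2 \Lambda (\tilde{s} - \Delta/\Lambda)^2$. Then (see Ref. [29]), we choose
\[ \psi(\Lambda, \Delta) = \frac{\Lambda}{2} \left( 1 + \frac{\Delta^2}{\Lambda^2} \right), \tag{28} \]
so that
\[ \Gamma(\tilde{s}, \Lambda, \Delta) = \frac{4 \Lambda (\Lambda \tilde{s} - \Delta)^2}{(1 + \Lambda)(\Lambda^2 - \Delta^2)}. \tag{29} \]
This result has the physically appealing feature that it diverges when $\Delta^2$ approaches its upper limit $\Lambda^2$, thus enforcing a natural boundary for dynamical trajectories in the space of the state variables $\Lambda$ and $\Delta$. It is convenient at this point to replace the variable $\Delta$ by $m = \Delta/\Lambda$, so that the equations of motion become:
\[ \tau_0 \dot{\epsilon}^{pl} = \epsilon_0 \Lambda (\tilde{s} - m); \tag{30} \]
\[ \tau_0 \dot{m} = 2 (\tilde{s} - m) \left( 1 - \frac{2 m (\tilde{s} - m)}{(1 + \Lambda)(1 - m^2)} \right); \tag{31} \]
and
\[ \tau_0 \frac{\dot{\Lambda}}{\Lambda} = \frac{4 (\tilde{s} - m)^2}{1 - m^2}\, \frac{1 - \Lambda}{1 + \Lambda}. \tag{32} \]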
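As a consistency check on the energy-balance argument of Eqs. (25)–(29), the Python sketch below inserts the chosen ψ and the resulting Γ back into Eq. (25) and evaluates the residual at a few arbitrary state points. The partial derivatives of ψ are taken by central finite differences, so the residual is expected to vanish only to within the finite-difference error.

```python
def psi(lam, delta):
    """Recoverable energy function of Eq. (28)."""
    return 0.5 * lam * (1.0 + (delta / lam) ** 2)

def gamma_eq29(s, lam, delta):
    """Dissipation-rate factor of Eq. (29); s is the reduced stress."""
    return (4.0 * lam * (lam * s - delta) ** 2
            / ((1.0 + lam) * (lam ** 2 - delta ** 2)))

def energy_balance_residual(s, lam, delta, h=1e-6):
    """Left-hand side minus right-hand side of Eq. (25)."""
    dpsi_dlam = (psi(lam + h, delta) - psi(lam - h, delta)) / (2.0 * h)
    dpsi_ddelta = (psi(lam, delta + h) - psi(lam, delta - h)) / (2.0 * h)
    g = gamma_eq29(s, lam, delta)
    lhs = 2.0 * (lam * s - delta) * s
    rhs = (dpsi_dlam * g * (1.0 - lam)
           + dpsi_ddelta * (2.0 * (lam * s - delta) - g * delta)
           + lam * g)
    return lhs - rhs

for point in ((0.5, 0.8, 0.2), (1.5, 0.6, -0.3), (1.0, 1.0, 0.5)):
    print(point, gamma_eq29(*point), energy_balance_residual(*point))
```

The same substitution, carried out with m = Δ/Λ, is what produces the factor multiplying m in Eq. (31).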
At the stable fixed point of Eq. (32), $\Lambda = 1$, Eq. (31) becomes
\[ \tau_0 \dot{m} = \frac{2 (\tilde{s} - m)(1 - \tilde{s} m)}{1 - m^2}, \tag{33} \]
which exhibits what we believe to be the single most important consequence of the two-state STZ dynamics, that is, the occurrence of a yield stress that was missing in earlier theories. Note that Eq. (33) has two steady-state solutions: a jammed state with $\dot{\epsilon}^{pl} = 0$ for $m = \tilde{s}$ and a flowing state with nonzero $\dot{\epsilon}^{pl}$ for $m = 1/\tilde{s}$. A simple analysis indicates that the first of these states is dynamically stable for $\tilde{s} < 1$ and the second for $\tilde{s} > 1$. Thus, the system exhibits an exchange of dynamical stability at the yield stress $\tilde{s} = 1$. The steady-state flow obeys a Bingham-like law:
\[ \dot{\epsilon}^{pl} = 0, \quad \tilde{s} < 1; \qquad \dot{\epsilon}^{pl} = \frac{\epsilon_0}{\tau_0} \left( \tilde{s} - \frac{1}{\tilde{s}} \right), \quad \tilde{s} > 1. \tag{34} \]
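The exchange of stability at the yield stress can be reproduced by integrating Eqs. (31) and (32) directly. The following Python sketch uses forward Euler at constant reduced stress, with time in units of τ0 and strain rate in units of ε0/τ0; the initial values, time step and run length are arbitrary choices made for illustration.

```python
def integrate_stz(s, m=0.0, lam=0.5, dt=1e-3, t_end=20.0):
    """Integrate Eqs. (31)-(32) at constant reduced stress s.
    Returns (m, lam, tau0 * strain_rate / eps0) at the end of the run."""
    for _ in range(int(t_end / dt)):
        gamma_over_lam = 4.0 * (s - m) ** 2 / ((1.0 + lam) * (1.0 - m * m))
        m_dot = 2.0 * (s - m) - m * gamma_over_lam        # Eq. (31)
        lam_dot = lam * gamma_over_lam * (1.0 - lam)      # Eq. (32)
        m += dt * m_dot
        lam += dt * lam_dot
    return m, lam, lam * (s - m)                          # Eq. (30)

print(integrate_stz(0.5))  # below yield: m approaches s, the rate vanishes
print(integrate_stz(2.0))  # above yield: m approaches 1/s, lam approaches 1
```

Below the yield stress the run ends jammed at a value of Λ that depends on the loading history, while above it the strain rate settles to the Bingham-like value of Eq. (34).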
4. High-temperature STZ Theory
We return now to Eq. (13), the low-temperature master equation for the STZ population densities $n_\pm$, and ask what changes need to be made in order to incorporate thermal effects at temperatures near the glass transition. One obvious possibility is to modify the rate factors $R(\pm s)$ to include thermal activation across energy barriers. This is not difficult, but the extra analysis required is not relevant to the main points that we wish to make here and therefore will be omitted. The more important thermal effects are those that are completely missing in Eq. (13), specifically, the thermally assisted relaxation (i.e., aging) of the STZ variables that can occur spontaneously in the absence of external driving or plastic strain rate. There are two ways in which relaxation must occur. First, thermal fluctuations ought to act much like deformation-induced disorder in causing the $n_\pm$ to relax toward their steady-state values $n_\infty/2$. Second, there should be some annealing mechanism that causes the total STZ population to decrease. Both of these mechanisms involve dilations and contractions of the kind associated with creation and annihilation of STZ’s; thus, again for simplicity, we assume that there is just a single relaxation rate, denoted $\rho(T)/\tau_0$, that characterizes them. That rate may have the Vogel-Fulcher or Cohen and Grest [32] super-Arrhenius form, rapidly becoming extremely small as the temperature falls below the glass temperature. Specifically, $\rho(T)$ might have the form
\[ \rho(T) = \rho_0 \exp\left( -\frac{V^{dil}}{v_f(T)} \right), \tag{35} \]
where $\rho_0$ is a dimensionless prefactor, $V^{dil}$ is the activation volume required to nucleate a dilational rearrangement, and $v_f(T)$ is often identified as the free volume. We shall not make the latter assumption because we expect the free volume to depend on the history of plastic deformation, not just on the current temperature.
In our work described here, we have evaluated $\rho(T)$ directly from measurements of the linear Newtonian viscosity, and have not used explicit formulas such as Eq. (35). By proposing that critical slowing down near the glass transition arises due to a divergence in the time scale associated with a rate process, we are departing from the formalism that has been developed in the context of conventional free volume theory. In the latter theory, it is typically assumed that all rates are essentially Arrhenius while the defect population that mediates deformation is strongly suppressed below the glass temperature. We have chosen to ascribe the change in dynamics to a diverging time scale rather than the defect population, because it seems more likely that interesting temperature dependencies will arise by considering the paths connecting two low energy states rather than by considering the energies of the configurations themselves.
Having made these assumptions, there are at least two ways forward that we can imagine. The first continues to describe the dynamics of the material in terms of a progression of defect–defect interactions, one that must be extended to include more complex diffusion-mediated effects at higher temperatures. This is the path that we have taken in recent investigations and have described in Ref. [33]. A second, more speculative path is based on the assumption that, during plastic deformation, the slow configurational degrees of freedom fall out of thermal equilibrium with the heat bath and can be described by an effective temperature, $T_{eff}$. The assumption of such a model would be that the configurational energy and density fluctuations are described by a Boltzmann distribution with this effective temperature, so that STZs having a characteristic formation energy of $E_z$ would then be found with a probability proportional to $\exp(-E_z / k_B T_{eff})$. Concepts of this kind have been presented recently in work by Ono et al. [34], Cugliandolo et al. [35], Sollich et al. [36], and Berthier and Barrat [37], and also are related to the ideas of Ref. [38]. Note that $T_{eff}$ may play a role similar to the free volume. It is an intensive variable, roughly analogous to $v_f(T)$ in Eq. (35), but it cannot be simply a function of the bath temperature $T$. It should, however, approach $T$ at sufficiently high $T$ or for fixed nonzero $T$ at sufficiently small rates of deformation. In what follows, we shall describe briefly both of these paths of investigation. Further details can be found in Ref. [33].
5. Defect Dynamics
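To begin, we consider a defect-based theory and discuss its consequences. In this model our proposed form for the modified master equation is:
\[ \tau_0\,\dot n_\pm = R(\mp s)\,n_\mp - R(\pm s)\,n_\pm + \bigl(\Gamma(s,\Lambda,\Delta) + \rho(T)\bigr)\left(\frac{n_\infty}{2} - n_\pm\right) - \kappa\,\rho(T)\,\frac{n_+ + n_-}{n_\infty}\,n_\pm . \tag{36} \]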
The first and second appearances of ρ(T) on the right-hand side of Eq. (36) correspond, respectively, to its two roles described above: relaxation of the populations $n_\pm$ toward $n_\infty/2$ and annealing of the STZ population as a whole. The second of these terms, the quadratic form with a dimensionless multiplicative constant κ, is a bimolecular mechanism that has been discussed extensively in Refs. [39–41]. After repeating the energy-balance analysis of Eqs. (23)–(29), we find that the equations of motion analogous to Eqs. (31) and (32) are
\[ \tau_0\,\dot m = 2(\tilde s - m) - m\,\tilde\Gamma(\tilde s,\Lambda,m,T), \tag{37} \]
and
\[ \tau_0\,\frac{\dot\Lambda}{\Lambda} = \tilde\Gamma(\tilde s,\Lambda,m,T)\,(1-\Lambda) - \kappa\,\rho(T)\,\Lambda, \tag{38} \]
where
\[ \tilde\Gamma(\tilde s,\Lambda,m,T) = \frac{4(\tilde s - m)^2 + 2\rho(T) + \kappa\,\rho(T)\,\Lambda\,(1+m^2)}{(1+\Lambda)(1-m^2)} . \tag{39} \]
The expression for the plastic strain rate, given in Eq. (30), remains unchanged. We emphasize that this theory contains only three adjustable parameters: $\varepsilon_0$, $\tau_0$, and κ. Using the above equations in the low-stress limit, we find the Newtonian viscosity to be
\[ \eta_N \equiv \lim_{\dot\varepsilon^{\rm pl}\to 0} \frac{s}{\dot\varepsilon^{\rm pl}} = \frac{2\,s_y\,\tau_0}{\varepsilon_0\,\rho(T)}, \tag{40} \]
which confirms our expectation that, in this theory, it is the rate function ρ(T) that governs viscous relaxation. Because we know the yield stress $s_y$, we can use Eq. (40) along with the experimentally measured values of $\eta_N$ to determine the ratio $\varepsilon_0/\tau_0$, a characteristic strain rate, up to a multiplicative scale factor for ρ(T). The scale factor ρ(T) is obtained at each temperature from the Newtonian viscosities in the low strain-rate limit as shown in Table 1. With these constraints on the parameters, we obtain values for $\varepsilon_0/\tau_0$, the values of ρ(T), and κ by fitting the steady-state data for stress as a function of strain rate in the non-Newtonian regime. To fit non-steady-state data, for example, stress–strain curves measured at constant strain rate, we have only one remaining adjustable parameter, $\varepsilon_0$. The resulting analysis reproduces the experimental results of Refs. [6, 9] with remarkable accuracy, exhibiting the interesting scaling behavior that these experiments have revealed and providing quantitative evidence in favor of the general features of our theoretical framework.

Table 1. Experimental data for viscosity taken from Ref. [6], and values of ρ used in the present calculations

Temperature (K)    Viscosity (Pa s)    ρ
573                4.00 × 10^14        1.07 × 10^−8
593                4.03 × 10^13        1.06 × 10^−7
603                8.99 × 10^12        4.77 × 10^−7
613                4.03 × 10^12        1.06 × 10^−6
623                7.29 × 10^11        5.88 × 10^−6
643                4.27 × 10^10        1.01 × 10^−4
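Equation (40) implies that the product $\eta_N\,\rho(T)$ should equal the same constant, $2 s_y \tau_0/\varepsilon_0$, at every temperature, which gives a quick consistency check on Table 1. The following is a minimal sketch of that check, not part of the original analysis. It assumes that the tabulated viscosities are of order 10^10 to 10^14 Pa s (the range of the viscosity axis in Fig. 12); the variable names are illustrative.

```python
# Consistency check suggested by Eq. (40): eta_N * rho(T) should be the same
# constant, 2 * s_y * tau0 / eps0, at every temperature.
# Values transcribed from Table 1 (assumption: viscosities in Pa s, positive exponents).
table1 = [
    # (T in K, eta_N in Pa s, rho)
    (573, 4.00e14, 1.07e-8),
    (593, 4.03e13, 1.06e-7),
    (603, 8.99e12, 4.77e-7),
    (613, 4.03e12, 1.06e-6),
    (623, 7.29e11, 5.88e-6),
    (643, 4.27e10, 1.01e-4),
]

products = [eta * rho for _, eta, rho in table1]
mean_product = sum(products) / len(products)
max_rel_dev = max(abs(p - mean_product) / mean_product for p in products)

for (temp, _, _), p in zip(table1, products):
    print(f"T = {temp} K: eta_N * rho = {p:.3e} Pa s")
print(f"mean = {mean_product:.3e} Pa s, max relative deviation = {max_rel_dev:.2%}")
```

Under these assumptions the product is constant to within about one percent across all six temperatures, as expected if ρ(T) was fixed directly by the measured Newtonian viscosity.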
To illustrate the principal results of our analysis, we first follow the lead of Refs. [6, 9] by looking for scaling in the steady-state behavior of our system. To be specific about what we mean here, we show in Figs. 8 and 9 theoretically computed sets of stress–strain curves for various temperatures and fixed strain rates. As we shall explain shortly, these figures are to be compared with Fig. 3, which is taken from Ref. [6]. (See also Fig. 1 in Ref. [14] and Fig. 1 in Ref. [9].)

Figure 8. Theoretical curves of tensile stress versus strain for the bulk metallic glass Zr41.2Ti13.8Cu12.5Ni10Be22.5 at several different temperatures as shown. The strain rate is $\dot\varepsilon^{\rm total} = 1 \times 10^{-1}\ {\rm s}^{-1}$. For clarity, the curves have been displaced by constant increments along the strain axis.

Figure 9. Theoretical curves of tensile stress versus strain for the bulk metallic glass Zr41.2Ti13.8Cu12.5Ni10Be22.5 at several different strain rates as shown. The temperature is T = 643 K. For clarity, all but the first of these curves have been displaced by the same amount along the strain axis.

A general feature of these curves is that, when the strain rate is held constant, the stress rises through a maximum, decreases as the material softens, and then reaches a steady-state value. We shall discuss the initial transients later in this section, but look first at the late-stage, steady-state behavior. We compute the steady-state flow stress as a function of the strain rate by solving Eqs. (37) and (38) with $\dot m = \dot\Lambda = 0$. Then, as in Refs. [6, 9], we plot $\tilde s = s/s_y$ as a function of $\eta_N\,\dot\varepsilon^{\rm pl}$ for eight different values of the relaxation rate ρ(T) corresponding to the eight different temperatures for which data are reported in Ref. [6]. The results are shown in Fig. 10. As discovered by Kato et al. [9], all of these curves lie on top of one another for stresses $\tilde s < 1$ but, in our case, they diverge from each other in the flowing regime, $\tilde s > 1$, where the Bingham-like behavior shown in Eq. (34) sets in. Figure 11a contains the same theoretical curves as those shown in Fig. 10, but plotted there as tensile stress versus scaled strain rate, and compared with experimental data taken from Fig. 9a of Ref. [6], which is the same data shown here in Fig. 5. The same theoretical functions and data points are replotted in Fig. 11b to show the normalized viscosity, $\eta/\eta_N$, as a function of the scaled strain rate. The latter figure is directly comparable to Ref. [6], Fig. 9b. Note that the range of strain rates shown in Fig. 11 corresponds to the range of the experimental data and is substantially smaller than that shown in Fig. 10. The theoretical curve that lies above the rest at high strain rates is for T = 683 K,
the highest of the temperatures reported in Ref. [6]. The data points at that temperature all lie at scaled strain rates that are too small to test this predicted breakdown of the scaling law.

Figure 10. Scaling behavior in the STZ theory: shear stress $\tilde s$ as a function of strain rate scaled by $\eta_N$. This graph is plotted for the same set of temperatures as shown in Fig. 11a, but for a larger range of strain rates.

Figure 11. Tensile stress (a) and viscosity (b) as functions of scaled strain rate $\eta_N\,\dot\varepsilon^{\rm total}$. The data points for the bulk metallic glass Zr41.2Ti13.8Cu12.5Ni10Be22.5 are taken from Ref. [6], Fig. 9a and b. The solid curves are theoretical results computed for the same set of temperatures as shown.

Our Fig. 12a shows individual theoretical and experimental curves of tensile stress as a function of (unscaled) strain rate for different temperatures. Here, the experimental data are those from Fig. 5. These curves are replotted in Fig. 12b to show (unscaled) viscosity as a function of strain rate, analogous to Fig. 5 reprinted from Ref. [6]. Our main conclusion from this steady-state analysis is that we are observing a transition from thermally assisted creep to viscoplastic flow in the neighborhood of the dynamic yield stress. At low stresses and strain rates, the linear response relation contains only the factor $\eta_N \propto 1/\rho(T)$, thus we obtain
the simple scaling. Near the yield stress, however, our theoretical strain rate increases by several orders of magnitude for small increments of stress, and the experimental behavior tracks this trend accurately. This behavior resembles super-plasticity. Interestingly, the theoretical scaling persists through the “super-plastic” region and does not break down until true viscoplastic flow begins.

Figure 12. Tensile stress (a) and viscosity (b) as functions of strain rate for different temperatures as shown. The data points are for the bulk metallic glass Zr41.2Ti13.8Cu12.5Ni10Be22.5 as reported by Lu et al. [6], Figs. 7 and 8. The solid lines are theoretical curves. [Panel (a): steady-state stress (MPa); panel (b): viscosity (Pa s).]

So far, we have examined only steady-state behavior. We turn next to stress–strain curves obtained in constant strain-rate experiments such as those shown in our Figs. 8 and 9 and in Fig. 3. To plot these curves, we solve
\[ \frac{\dot{\tilde s}}{2\tilde\mu} = \dot\varepsilon^{\rm total} - \frac{\varepsilon_0}{\tau_0}\,\Lambda\,(\tilde s - m), \tag{41} \]
along with Eqs. (37) and (38) to compute $\tilde s$ as a function of the total strain. Here, $\tilde\mu$ is the ratio of the shear modulus to the yield stress, which we know from experiment. Our Figs. 8 and 9 are drawn so as to be directly comparable to Fig. 3; that is, we use the same strain rates and temperatures. As mentioned above, we have chosen the parameter $\varepsilon_0$ to optimize our fits to these curves. The one other parameter that we need for solving Eqs. (37), (38), and (41) is the initial value of Λ. For this purpose, we have chosen the steady-state solution of Eq. (38) at zero driving force, that is, the smallest value of Λ that can be achieved by annealing. In all cases, the agreement between theory and experiments seems satisfactory. The peak heights and positions for fixed strain rate $\dot\varepsilon^{\rm total} = 0.1\ {\rm s}^{-1}$ and varying temperatures in Fig. 8, and for fixed temperature T = 643 K and varying strain rates in Fig. 9, are within about ten percent of their experimental values. The experimental curves for low temperatures and large strain rates end where the samples break; the dashed lines in our figures indicate our theoretical extensions of those parts of the curves for which no experimental data are available. The one systematic discrepancy is that our initial theoretical slopes are smaller than the experimental ones. This is primarily an artifact of our quasi-linear theory, where plastic deformation sets in at unphysically small stresses. Until we incorporate the fully nonlinear transition rates into our calculations, we do not believe that it will be meaningful to try to improve our fits to these transient stress–strain curves by further adjusting parameters such as κ or $\varepsilon_0$. Finally, in Fig. 13, we use the material parameters deduced above for the system studied in Lu et al. [6] to plot stress–strain curves for different initial values of Λ, say $\Lambda_0$, all at temperature T = 643 K and $\dot\varepsilon^{\rm total} = 3.2 \times 10^{-2}\ {\rm s}^{-1}$.
The different $\Lambda_0$'s correspond to different initial states of disorder produced by varying the annealing times and temperatures. Presumably, annealing for longer times at lower temperatures produces smaller values of $\Lambda_0$, but it seems difficult to make quantitative estimates of this effect. These curves may be compared qualitatively with those shown in Fig. 4 and in Ref. [14], Fig. 9, where larger initial densities of STZ's produce larger plastic responses and correspondingly smaller overshoots during the early stages of deformation.
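The constant strain-rate calculation described above amounts to integrating three coupled ordinary differential equations for $\tilde s$, $m$ and Λ. The following is a minimal sketch of such an integration, assuming the forms of Eqs. (37)–(39) and (41) that are written out in the code comments, with time measured in units of $\tau_0$. The parameter values are hypothetical and chosen only so that the example runs quickly; they are not the fitted values for the metallic glass, and the function names are illustrative.

```python
import math

def gamma_tilde(s, lam, m, rho, kappa):
    """Eq. (39) in the form assumed here:
    [4(s-m)^2 + 2 rho + kappa rho Lambda (1+m^2)] / [(1+Lambda)(1-m^2)]."""
    num = 4.0 * (s - m) ** 2 + 2.0 * rho + kappa * rho * lam * (1.0 + m * m)
    return num / ((1.0 + lam) * (1.0 - m * m))

def rhs(state, q, rho, kappa, eps0, mu):
    """Time derivatives of (s, m, Lambda); time is in units of tau0.
    q = tau0 * (total strain rate) is the dimensionless imposed strain rate."""
    s, m, lam = state
    g = gamma_tilde(s, lam, m, rho, kappa)
    ds = 2.0 * mu * (q - eps0 * lam * (s - m))          # Eq. (41)
    dm = 2.0 * (s - m) - m * g                          # Eq. (37)
    dlam = lam * (g * (1.0 - lam) - kappa * rho * lam)  # Eq. (38)
    return (ds, dm, dlam)

def rk4_step(state, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * kx for x, kx in zip(state, k))
    k1 = rhs(state, *params)
    k2 = rhs(shifted(k1, 0.5 * dt), *params)
    k3 = rhs(shifted(k2, 0.5 * dt), *params)
    k4 = rhs(shifted(k3, dt), *params)
    return tuple(x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def annealed_lambda(kappa):
    """Undriven steady state of Eq. (38): positive root of kappa*L^2 + L - 1 = 0."""
    return (-1.0 + math.sqrt(1.0 + 4.0 * kappa)) / (2.0 * kappa)

def stress_strain(q, rho, kappa, eps0, mu, strain_max, dt=1e-3):
    """Integrate from the fully annealed, unloaded state up to strain_max."""
    state = (0.0, 0.0, annealed_lambda(kappa))
    n_steps = int(round(strain_max / (q * dt)))
    curve = [(0.0, 0.0)]
    for i in range(1, n_steps + 1):
        state = rk4_step(state, dt, q, rho, kappa, eps0, mu)
        curve.append((i * dt * q, state[0]))  # (total strain, s / s_y)
    return curve, state

# Hypothetical parameters, for illustration only.
curve, final_state = stress_strain(q=0.05, rho=0.01, kappa=1.0,
                                   eps0=1.0, mu=10.0, strain_max=1.0)
peak = max(s for _, s in curve)
print(f"peak s/s_y = {peak:.3f}, final s/s_y = {final_state[0]:.3f}")
```

The initial condition uses the undriven steady state of Eq. (38), mirroring the choice described in the text; starting from a larger initial Λ would correspond to a less thoroughly annealed sample.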
6. Outline of an Effective Temperature Theory
Although the defect-dynamics theory described above seems remarkably successful, it does have some significant shortcomings. In the first place, it has no way of predicting the results of calorimetric measurements. A related difficulty is that a more complete theory should predict some relationship between the two steady-state defect densities denoted here by $n_{\rm eq}(T)$ (the thermal equilibrium density in the absence of shear) and $n_\infty$ (the low-temperature density in the presence of a persistent deformation rate). Presumably, these two quantities should approach each other in the limit of high temperature or small deformation rate, but they do not behave in that way here. These and other considerations lead us to believe that there may be a more fundamentally correct way of describing these intrinsically nonequilibrium phenomena.

Figure 13. Tensile stress as a function of strain for several different values of $\Lambda_0$. Curves are plotted for $\dot\varepsilon^{\rm total} = 3.2 \times 10^{-2}\ {\rm s}^{-1}$ at T = 643 K.

We now outline some features of a theoretical approach that we believe shows promise of solving the above problems. We assume that the STZ density is determined by the distribution of very slow spatial fluctuations of the configurational energies in our system, and we characterize these fluctuations by an intensive temperature-like variable that we call χ. If we assume a specific heat of order $k_B$ per molecule, then χ is the ratio of the effective thermal energy $k_B T_{\rm eff}$ to a characteristic STZ energy $E_Z$. χ is thus a dimensionless intensive quantity that plays something like the role that the free volume did in earlier theories. In particular, our STZ density is proportional to $\exp(-1/\chi) = \exp(-E_Z/k_B T_{\rm eff})$. The difference between the theory presented here and previous free volume theories is that, although $T_{\rm eff}$ may become equal to the bath temperature T under some circumstances, it is not just a function of T but, rather, depends on the deformation-induced internal state of disorder of the system. In fact, we may sometimes refer to $T_{\rm eff}$ as the “disorder temperature.”
Another way in which our effective temperature approach differs from traditional free volume models is again the attribution of the super-Arrhenius behavior to the rates rather than the STZ density. That is to say that, unlike Refs. [14, 20], our χ does not converge in the limit of zero shear rate to a temperature dependent equilibrium value determined by the Vogel–Fulcher or Cohen–Grest forms for $\upsilon_f(T)$, but rather converges simply to $k_B T/E_Z$. The rate of this convergence, however, slows dramatically below the glass temperature, so that χ falls out of thermal equilibrium in quenched systems. In order not to confuse the defect based and disorder temperature based approaches, we eschew the addition of new defect interaction mechanisms, and instead assume that all STZ interactions are captured by the dissipation rate Γ and the thermally induced transition rate ρ. The convergence of the STZ density to the value prescribed by our effective temperature is then accomplished by invoking detailed balance so that the ratio of the creation and annihilation rates is consistent with the effective temperature of the material. Thus our new master equation for the STZ densities, analogous to Eq. (36), takes the form:
\[ \tau_0\,\dot n_\pm = R(\mp s)\,n_\mp - R(\pm s)\,n_\pm + (\Gamma + \rho)\left(\frac{n_\infty}{2}\,e^{-1/\chi} - n_\pm\right). \tag{42} \]
From here, we can derive χ-dependent equations of motion for Λ and $m = \Delta/\Lambda$ analogous to Eqs. (30), (37) and (38). These equations simplify if we eliminate Λ by assuming that the dimensionless STZ density is always in equilibrium with the effective temperature, so that $\Lambda = \exp(-1/\chi)$. Then we find
\[ \tau_0\,\dot\varepsilon^{\rm pl} = \varepsilon_0\,e^{-1/\chi}\,(\tilde s - m); \tag{43} \]
and
\[ \tau_0\,\dot m = \frac{2(\tilde s - m)(1 - m\tilde s) - m\rho}{1 - m^2} . \tag{44} \]
We now must construct an equation of motion for χ that describes the way in which the slow configurational degrees of freedom fall out of equilibrium with the heat bath. In the absence of loading, but for extremely long times, we expect that the system will gradually converge to the thermalized values of χ and the STZ density. However, we must also consider what value χ approaches when we take $\dot\varepsilon^{\rm pl} \to 0$ after taking $T \to 0$. At present we have no way to determine this value or its rate dependence, but we appeal to our earlier discussion to argue that χ must approach some steady state value, say $\chi_\infty \equiv k_B T_\infty/E_Z$, in this limit. Then we can construct an equation of motion for χ by assuming that this effective temperature determines the internal energy of all of the configurational degrees of freedom (not just those associated with STZ's) via a temperature independent specific heat. Therefore the dynamics of this effective temperature must obey an equation of the form:
\[ C_D\,\dot T_{\rm eff} = Q\left(1 - \frac{T_{\rm eff}}{T_\infty}\right) + \varepsilon_0\,K(\chi)\,\frac{\rho(T)}{\tau_0}\,k_B\,(T - T_{\rm eff}). \tag{45} \]
On the left-hand side, $C_D$ is a specific heat. The first term on the right-hand side describes how the energy dissipated during plastic deformation, at a rate Q, is absorbed by the configurational degrees of freedom and drives $T_{\rm eff}$ to $T_\infty$. The second term in Eq. (45) describes how $T_{\rm eff} \to T$ in the absence of external driving, and does so at a rate proportional to the dilational fluctuation rate $\rho/\tau_0$, which becomes very small at low temperatures. $K(\chi)$ is a cooling coefficient with dimensions of inverse volume, defined with a factor $\varepsilon_0$ purely for convenience. The χ dependence of K arises because K must be proportional to the density of sites at which these cooling events can occur, which must depend on the state of disorder. We can convert Eq. (45) into an equation for χ by our association of the energy dissipated during plastic deformation with Γ:
\[ Q(s,\Lambda,\Delta) = \frac{\tilde\mu\,\varepsilon_0}{\tau_0}\,e^{-1/\chi}\,\Gamma(s,\Lambda,m). \tag{46} \]
The function Γ (not Γ + ρ) is
\[ \Gamma(s,m,\chi) = \frac{2(\tilde s - m)^2 + \rho\,m^2}{1 - m^2} . \tag{47} \]
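Equations (44) and (47) can be cross-checked against each other. If one assumes that, with Λ held at $e^{-1/\chi}$, the master equation (42) leads to the relaxation form $\tau_0\,\dot m = 2(\tilde s - m) - m(\Gamma + \rho)$, then substituting Eq. (47) should reproduce the right-hand side of Eq. (44) identically. The sketch below verifies this numerically on a small grid of sample points; the relaxation form is an assumption stated here for the purpose of the check, and the function names and the default $\varepsilon_0 = 1$ are illustrative.

```python
import math

def gamma(s, m, rho):
    """Eq. (47): Gamma = [2(s-m)^2 + rho m^2] / (1 - m^2)."""
    return (2.0 * (s - m) ** 2 + rho * m * m) / (1.0 - m * m)

def m_dot_eq44(s, m, rho):
    """Right-hand side of Eq. (44), i.e. tau0 * dm/dt."""
    return (2.0 * (s - m) * (1.0 - m * s) - m * rho) / (1.0 - m * m)

def m_dot_relaxation(s, m, rho):
    """Assumed relaxation form: tau0 * dm/dt = 2(s - m) - m (Gamma + rho)."""
    return 2.0 * (s - m) - m * (gamma(s, m, rho) + rho)

def plastic_rate(s, m, chi, eps0=1.0):
    """Eq. (43): tau0 * (plastic strain rate) = eps0 exp(-1/chi) (s - m)."""
    return eps0 * math.exp(-1.0 / chi) * (s - m)

# Sample points with |m| < 1 (s is the stress in units of the yield stress).
samples = [(s, m, rho)
           for s in (0.0, 0.3, 0.9, 1.5)
           for m in (-0.5, 0.0, 0.4, 0.9)
           for rho in (0.0, 1e-3, 0.2)]
max_diff = max(abs(m_dot_eq44(*p) - m_dot_relaxation(*p)) for p in samples)
print(f"largest difference over {len(samples)} sample points: {max_diff:.2e}")
```

The two expressions agree to rounding error, and `gamma(0.0, 0.0, rho)` vanishes for any ρ, so no energy is dissipated in the undriven state with m = 0.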
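Note that the term proportional to ρ in Γ disappears in an undriven system because $m \to 0$ in that case. Our final equation for $\dot\chi$ is
\[ \frac{\tau_0\,\tilde c}{\varepsilon_0}\,\dot\chi = e^{-1/\chi}\,\Gamma(s,\Lambda,m)\,(\chi_\infty - \chi) + \tilde K(\chi)\,\rho(T)\left(\frac{T}{T_Z} - \chi\right). \tag{48} \]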
Here $\tilde c$ and $\tilde K$ are the dimensionless specific heat and cooling coefficients, respectively, and $T_Z = E_Z/k_B$. It is useful at this point to exhibit the expression for the Newtonian viscosity that emerges from these equations of motion:
\[ \eta_N = \lim_{\dot\varepsilon^{\rm pl}\to 0} \frac{\bar\mu\,s}{\dot\varepsilon^{\rm pl}} \approx \frac{2\,s_y\,\tau_0}{\varepsilon_0\,\rho(T)}\,\exp(T_Z/T). \tag{49} \]
Thus we recover approximately the same form as we did for our previous defect-dynamics-based model, Eq. (40), except for the fact that here the temperature dependence appears in both the rate constant ρ and the thermal equilibrium STZ density, $\exp(-1/\chi) \approx \exp(-T_Z/T)$.
7. Outlook
The investigations we have discussed here lead us to believe that the low temperature STZ theory captures one of the primary salient features of the experimental data: a transition from a jammed state to a flowing state at a well defined yield stress. This underlying dynamical transition is observed at high temperatures as a transition from thermally activated creep flow to mechanically driven superplastic flow. Hence the material shear thins. Questions remain as to the precise way thermally activated transitions should be introduced in such a model. In our initial investigations we have proposed two such methods that consider very different physical pictures of the thermodynamics of deformation. Our preliminary opinion is that the effective temperature model may do a better job of capturing the underlying physics than the defect dynamics picture. However, only careful theoretical development and experimental tests will resolve this issue. Both of these models are mere frameworks that will have to be improved in specific respects if we are to develop one into a yet more quantitative, predictive description of plastic deformation in amorphous solids. We conclude this chapter by identifying three directions for the next phases of these investigations.

Fully non-linear, temperature dependent transition rates. When we examine the quasi-linear STZ theory in the context of a theory that includes thermal fluctuations, we see that it is a special case in which the shear rearrangements are not being modeled as realistically as the dilations or contractions. To see this in more detail, go back to our original, fully non-linear version [42] of the low-temperature rate factors R(s):
\[ R(s) = \frac{1}{\tau_0}\exp\left[-\frac{V^{\rm shear}(s)}{\upsilon_f}\right]; \qquad V^{\rm shear}(s) = V_0^{\rm shear}\,e^{-s/\bar\mu}, \tag{50} \]
where $V^{\rm shear}(s)$ is the activation volume required to nucleate a shear transformation. Our idea here was that, at temperatures well below the glass temperature, the transitions between STZ states are not thermally activated but, rather, are controlled entropically. That is, the rate factors are determined by the number of paths that the molecules within a zone can follow in moving around each other while going from one state to the other. The exponential factor in Eq. (50) is an approximation for a weighted measure of that number of paths. Its s-dependence means that greater weight must be given to paths moving in the direction of the stress than opposite to it. The exponential form of $V^{\rm shear}(s)$ is the simplest non-negative function that becomes arbitrarily small at large s and introduces just one new parameter, the effective STZ stiffness $\bar\mu$. The quasi-linear version of the theory corresponds (roughly) to the limit of small s and small values of $V_0^{\rm shear}/\upsilon_f$.
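The behavior of the rate factor in Eq. (50) is easy to explore numerically. The sketch below evaluates R(s) for a few values of the ratio $V_0^{\rm shear}/\upsilon_f$ and reports an illustrative measure of the stress bias, $(R(s) - R(-s))/(R(s) + R(-s))$. This bias is introduced here only for illustration and is not claimed to be the exact function used in the chapter; the function names and numerical values are likewise hypothetical.

```python
import math

def shear_activation_volume(s, v0, mu_bar):
    """V_shear(s) = V0 * exp(-s / mu_bar), second part of Eq. (50)."""
    return v0 * math.exp(-s / mu_bar)

def rate_factor(s, v0_over_vf, mu_bar, tau0=1.0):
    """R(s) = (1/tau0) * exp(-V_shear(s) / v_f), first part of Eq. (50).
    Passing V0/v_f as the prefactor gives V_shear(s)/v_f directly."""
    return math.exp(-shear_activation_volume(s, v0_over_vf, mu_bar)) / tau0

def bias(s, v0_over_vf, mu_bar):
    """Illustrative stress bias: (R(s) - R(-s)) / (R(s) + R(-s))."""
    forward = rate_factor(s, v0_over_vf, mu_bar)
    backward = rate_factor(-s, v0_over_vf, mu_bar)
    return (forward - backward) / (forward + backward)

# Hypothetical values of V0/v_f, from the quasi-linear regime to the
# strongly suppressed low-temperature regime; stresses in units of mu_bar.
for ratio in (0.5, 5.0, 20.0):
    r0 = rate_factor(0.0, ratio, mu_bar=1.0)
    b = bias(0.5, ratio, mu_bar=1.0)
    print(f"V0/v_f = {ratio:5.1f}: tau0*R(0) = {r0:.3e}, bias at s = 0.5 mu_bar = {b:.3f}")
```

At zero stress the rate is simply $\exp(-V_0^{\rm shear}/\upsilon_f)/\tau_0$, so a large ratio suppresses transitions at small stress, while the bias toward forward transitions grows with the ratio at fixed stress.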
Comparison of Eq. (50) with Eq. (35) indicates that the natural way to include thermal effects in R(s) is simply to let $\upsilon_f$ have the T-dependent Cohen–Grest form. This means that, at low T, the ratio $V_0^{\rm shear}/\upsilon_f(T)$ becomes very large, which, in turn, implies that the functions C(s) and T(s) introduced in Eq. (15) become strongly stress dependent, and the quasi-linear approximations made in Eq. (19) are no longer valid. Importantly, C(s) becomes very small for small s, so that plastic deformation is strongly suppressed at stresses appreciably below the yield stress. The strong stress dependence of C(s) and T(s) should be especially apparent in transient behavior of the kind shown in Figs. 3, 4, 8, 9 and 13. Here, the initial response to loading at small stress will be almost entirely elastic, and plastic deformation will begin only later in the process. We shall have to use the fully non-linear theory when undertaking more detailed comparisons with these kinds of experimental results.

Shear localization. All of the analysis in this paper pertains to spatially homogeneous systems. In order to make closer contact with experiments, we shall have to understand why and when these systems become unstable against shear banding and inhomogeneous failure modes, especially fracture. One mechanism for shear localization that we have not mentioned in this presentation is the elastic interaction between STZ's studied in Ref. [43]. As shown in that paper, an STZ-like event generates a quadrupolar stress field that induces other nearby events along preferred spatial directions and suppresses events elsewhere. The result is a tendency toward shear localization that should be interesting to examine in the context of this more general version of the STZ theory. A second mechanism that seems likely to play a role in shear localization is already built into our equations of motion when we write them in terms of spatially varying fields. From Eqs. (39) and (38), we see that the STZ density grows most rapidly, within limits, in regions where Λ already is large. This feedback effect, perhaps coupled to the effect of elastic interactions mentioned above, is our best guess at present about how shear banding will emerge in the STZ theory.

Effective temperature and the interpretation of χ, especially in granular materials and related systems. Apart from our investigations, there is increasing evidence that an effective temperature can be associated with systems like sheared foams or granular materials [34–37]. In those non-molecular systems, the usual kinetic temperature is zero because the constituents have very large masses, but an effective temperature determined by response–fluctuation relations goes to a non-zero limit when the deformation rate becomes arbitrarily small. In our present molecular system, there is a true kinetic temperature, but well below the glass transition that temperature is so small that thermally assisted molecular rearrangements are effectively frozen out. During irreversible processes such as plastic deformation, therefore, the way in which our slow, configurational degrees of freedom characterized by χ fall out of equilibrium with the fast, thermal (vibrational) degrees of freedom should resemble the behavior of the non-molecular systems. It should be an important and interesting project to see how or whether the STZ concepts can be extended to the latter kinds of systems.
References
[1] R. Hill, The Mathematical Theory of Plasticity, Clarendon Press, Oxford, UK, 1950.
[2] M. Treacy and J. Gibson, Ultramicroscopy, 52, 31, 1993.
[3] J. Li, Z. Wang, and T. Hufnagel, Phys. Rev. B, 65, 144201, 2002.
[4] W. Jiang and M. Atzmon, Acta Mater., 51, 4095, 2003.
[5] A. Argon and L. Shi, “Simulation of plastic flow and distributed shear relaxations in metallic glasses by means of the amorphous Bragg bubble raft,” In: V. Vitek (ed.), Amorphous Materials: Modeling of Structure and Properties, The Metallurgical Society of AIME, pp. 279–303, 1982.
[6] J. Lu, G. Ravichandran, and W. Johnson, Acta Mater., 51, 3429, 2003.
[7] O. Hasan and M. Boyce, Polymer, 34, 5085, 1993.
[8] O. Hasan and M. Boyce, Polym. Eng. Sci., 35, 331, 1995.
[9] H. Kato, Y. Kawamura, A. Inoue, and H. Chen, Appl. Phys. Lett., 73, 3665, 1998.
[10] R.G. Larson, The Structure and Rheology of Complex Fluids, Oxford University Press, Oxford, 1999.
[11] F. Albano, N. Lacevic, M. Falk, and S. Glotzer, Mater. Sci. Eng. A., 2004.
[12] P. Thompson, G. Grest, and M. Robbins, Phys. Rev. Lett., 68, 3448–3451, 1992.
[13] F. Spaepen, Acta Metall., 25, 407, 1977.
[14] P. De Hey, J. Sietsma, and A. Van Den Beukel, Acta Mater., 46, 5873, 1998.
[15] O. Hasan, M. Boyce, X. Li, and S. Berko, J. Polym. Sci.: Part B, 31, 185, 1993.
[16] H. Eyring, J. Chem. Phys., 4, 283, 1936.
[17] A. Krausz and H. Eyring, Deformation Kinetics, John Wiley and Sons, New York, 1975.
[18] U. Kocks, A. Argon, and M. Ashby, Progress in Materials Science, vol. 19, Pergamon Press, Oxford, UK, 1975.
[19] A. Argon, Phil. Mag., 28, 839, 1973.
[20] P. Duine, J. Sietsma, and A. van den Beukel, Acta Metall. Mater., 40, 743, 1992.
[21] D. Turnbull and M. Cohen, J. Chem. Phys., 52, 3038, 1970.
[22] A. van den Beukel and J. Sietsma, Acta Metall. Mater., 38, 383, 1990.
[23] R. Huang, Z. Suo, J. Prevost, and W. Nix, J. Mech. Phys. Solids, 50, 1011, 2002.
[24] P.S. Steif, F. Spaepen, and J.W. Hutchinson, Acta Metall., 30, 447, 1982.
[25] R.M. McMeeking and J.R. Rice, Int. J. Solids Struct., 11, 601, 1975.
[26] P.M. Naghdi, Z. Angew. Math. Phys., 41, 315, 1990.
[27] L. Anand and M.E. Gurtin, Int. J. Solids Struct., 40, 1465–1487, 2003.
[28] L. Pechenik, Cond-mat/0305516, 2003.
[29] J. Langer and L. Pechenik, Phys. Rev. E, 68, 061507, 2003.
[30] M. Falk and J. Langer, M.R.S. Bull., 25, 40, 2000.
[31] L. Eastgate, J. Langer, and L. Pechenik, Phys. Rev. Lett., 90, 045506, 2003.
[32] M. Cohen and G. Grest, Phys. Rev. B, 20, 1077, 1979.
[33] M. Falk, J. Langer, and L. Pechenik, Submitted, 2003.
[34] I. Ono, C. O’Hern, D. Durian, S. Langer, and A. Liu, Phys. Rev. Lett., 89, 095703, 2002.
[35] L. Cugliandolo, J. Kurchan, and L. Peliti, Phys. Rev. E, 55, 3898, 1997.
[36] P. Sollich, F. Lequeux, P. Hebraud, and M. Cates, Phys. Rev. Lett., 78, 2020, 1997.
[37] L. Berthier and J.-L. Barrat, Phys. Rev. Lett., 89, 095702, 2002.
[38] A. Mehta and S. Edwards, Phys. A, 157, 1091, 1990.
[39] A. Taub and F. Spaepen, Acta Metall., 28, 1781, 1980.
[40] A. Greer and F. Spaepen, Ann. N.Y. Acad. Sci., 371, 218, 1981.
[41] S. Tsao and F. Spaepen, Acta Metall., 33, 881, 1985.
[42] M. Falk and J. Langer, Phys. Rev. E, 57, 7192, 1998.
[43] J. Langer, Phys. Rev. E, 64, 011504, 2001.
4.4 STATISTICAL PHYSICS OF RUPTURE IN HETEROGENEOUS MEDIA
Didier Sornette
Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, California, USA and CNRS and Université des Sciences, Nice, France
S. Yip (ed.), Handbook of Materials Modeling, 1313–1331. © 2005 Springer. Printed in the Netherlands.

The damage and fracture of materials are technologically of enormous interest due to their economic and human cost. They cover a wide range of phenomena like cracking of glass, aging of concrete, the failure of fiber networks in the formation of paper and the breaking of a metal bar subject to an external load. Failure of composite systems is of utmost importance in the naval, aeronautics and space industries [1]. By the term composite, we refer to materials with heterogeneous microscopic structures and also to assemblages of macroscopic elements forming a super-structure. Chemical and nuclear plants suffer from cracking due to corrosion either of chemical or radioactive origin, aided by thermal and/or mechanical stress. Despite the large amount of experimental data and the considerable effort that has been undertaken by material scientists [2], many questions about fracture have not been answered yet. There is no comprehensive understanding of rupture phenomena but only a partial classification in restricted and relatively simple situations. This lack of fundamental understanding is reflected in the absence of reliable prediction methods for rupture based on a suitable monitoring of the stressed system. Not only is there a lack of non-empirical understanding of the reliability of a system, but the empirical laws themselves often have limited value. The difficulties stem from the complex interplay between heterogeneities and modes of damage and the possible existence of a hierarchy of characteristic scales (static and dynamic).

Many material ruptures occur by a “one crack” mechanism and a lot of effort is being devoted to the understanding, detection and prevention of the nucleation of the crack [3, 4]. Exceptions to the “one crack” rupture mechanism are heterogeneous materials such as fiber composites, rocks, concrete under compression, ice, tough ceramics and materials with large distributed residual stresses. The common property shared by these systems is the existence of large inhomogeneities that often limit the use of homogenization or effective medium theories for the elastic and, more generally, the mechanical properties. In these systems, failure may occur as the culmination of progressive damage involving complex interactions between multiple defects and growing microcracks. In addition, other relaxation, creep, ductile, or plastic behaviors, possibly coupled with corrosion effects, may come into play. Many important practical applications involve the coupling between mechanical and chemical effects with the competition between several characteristic time scales. Application of stress may act as a catalyst of chemical reactions [5] or, reciprocally, chemical reactions may lead to bond weakening [6] and thus promote failure. A dramatic example is the aging of present-day aircraft due to repeated loading in a corrosive environment [7]. The interaction between multiple defects and the existence of several characteristic scales present a considerable challenge to the modeling and prediction of rupture. These are the systems and problems on which the interdisciplinary marriage with statistical physics has brought new fruitful ideas that we now briefly present.
1. Creep Rupture
There are many different conditions under which a material can rupture: constant strain rate, or stress, or stress rate, or more complex strain/stress histories (involving also other control parameters such as temperature, water content, chemical activity, etc.). The situation in which a stress is imposed is very frequent in mechanics (constant weight) and leads to the phenomenon of creep (also known as “static fatigue”). A stress step leads in general to a strain response and other observable changes such as acoustic emissions (see for a review, Ref. [8]). Understanding damage and rupture of a material subjected to a constant stress is thus a good starting point. For industrial applications, creep experiments are not always practical because they require adjusting the stress to subcritical levels such that one does not wait too long before interesting processes (including eventually rupture) are monitored. Accelerated tests, which yield information more quickly, include step-stress and ramp-stress loading [9]. As we said, time-dependent deformation of a material subjected to a constant stress level is known as creep. In creep, the stress is below the mechanical strength of the material, so that the rupture does not occur upon application of the load. It is by waiting a sufficiently long time that the cumulative strain may finally end in catastrophic rupture. Creep is all the more important, the larger the applied stress and the higher the temperature. The time to creep rupture is found in a large variety of materials to be controlled by the stress sign and magnitude, temperature and microstructure.
Creep is often divided into three regimes: (i) the primary creep regime corresponds to a decay of the strain rate following the application of the constant stress, which can often be described by the so-called Andrade’s law (a power-law decay with time); (ii) the secondary regime describes an (often very long) crossover, characterized by an approximately constant strain rate, towards (iii) the tertiary creep regime, in which the strain rate accelerates up to rupture. Andrade’s law for the strain rate is similar to the power-law relaxation of the aftershock seismic activity triggered by the stress change induced by a previous earthquake, known as Omori’s law [10]. In creep experiments, Omori’s law describes the decay of the rate of acoustic emissions in the primary regime. Creep experiments are thus interesting both because they constitute standard mechanical tests of long-time properties of structures and because of the power laws reminiscent of the critical behavior of complex self-organizing systems that have become popular paradigms, as discussed below. Studies of creep rupture phenomena have been performed through direct experiments [11–14] and different models [15–25]. While much work has been devoted to homogeneous materials such as metals and ceramics, many recent studies are concerned with heterogeneous materials such as composites and rocks [11–14]. Knowledge of the failure properties of composite materials is of great importance because of the increasing number of applications for composites in engineering structures. The long-term behavior of these materials, especially polymer matrix composites, is a critical issue for many modern engineering applications such as aerospace, biomedical, and civil engineering infrastructure. The primary concern in the long-term performance of composite materials is to obtain critical engineering properties that extend over the projected lifetime of the structure.
Viscoelastic creep and creep-rupture behaviors are among the critical properties needed to assess long-term performance of polymer-based composite materials. The knowledge of these critical properties is also required to design material microstructures which can be used to construct highly reliable components. For heterogeneous materials, the underlying microscopic failure mechanism of creep rupture is very complex depending on several characteristics of the specific types of materials. Beyond the development of analytical and numerical models, which predict the damage history in terms of the specific parameters of the constituents, another approach is to study the similarity of creep rupture with phase transitions phenomena as summarized here. This approach tackles the large range of scales involved in the damage evolution by using coarse-grained models describing the mechanism of creep, damage and precursory rupture by averaging over the microscopic degrees of freedom to retain only a few major ingredients that are thought to be the most relevant. By comparing the predictions of a hierarchy of models, from simple to elaborate, it is then possible to assess what are the relevant ingredients.
A recent experimental work on heterogeneous structural materials, conducted at GEMPPM, INSA Lyon, illustrates this approach [26]. Figure 1 shows a rapid and continuous decrease of the strain rate de/dt in the primary creep regime, which can be described by Andrade’s law (Omori’s law for the acoustic emissions)

$$\frac{de}{dt} \sim \frac{1}{t^{p}}, \tag{1}$$

with an exponent p smaller than or equal to one. A quasi-constant strain rate (steady-state or secondary creep) is observed over an important part of the total creep time and is then followed by an increasing creep rate (tertiary creep regime) culminating in fracture. Creep strains at fracture are large, with values from a few percent up to 40% for such composite samples. The acceleration of the strain rate before failure is well fitted by a power-law singularity

$$\frac{de}{dt} \sim \frac{1}{(t_c - t)^{p'}}, \tag{2}$$

with an exponent p′ smaller than or equal to one. The critical time tc determined from the fit of the data with expression (2) is generally close to the observed failure time (within a few seconds). The same temporal evolution is generally obtained for the acoustic emission activity as for the strain rate. The same patterns are obtained when plotting the acoustic emission event rate or the rate of acoustic emission energy. There are much larger fluctuations for the energy rate than for the event rate, due to the large distribution of acoustic emission energies, but the crossover time between primary creep and tertiary creep, and the values of p and p′, are similar for the acoustic emission event rate and the acoustic emission energy rate. This suggests that the amplitude distribution does not depend on time, a conclusion which is verified experimentally. How can one rationalize all these observations?

Figure 1. Creep strain rate for a Sheet Molding Compound (SMC) composite consisting of a combination of polyester resin, calcium carbonate filler, thermoplastic additive and randomly oriented short glass reinforcing fibres. The creep experiment was performed at a stress of 48 MPa and a temperature T = 100 °C, below the glass transition, at the GEMPPM, INSA Lyon, Villeurbanne, France. The stress was increased progressively and reached a constant value after about 17 s. Left panel: full history of the strain rate on a linear time scale; middle panel: time is shown on a logarithmic scale to test for the existence of a power-law relaxation regime (fitted exponent p = 0.99); right panel: the strain rate is shown as a function of the time to rupture tc − t on a logarithmic scale, so that a time-to-failure power law (2) appears as a straight line (fitted exponent p′ = 0.80). Reproduced from Ref. [26].
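As a hedged illustration of how the exponents in expressions (1) and (2) can be estimated, the sketch below fits a straight line to log(rate) versus log(time) by ordinary least squares. The data are synthetic and generated from exact power laws; the function name `fit_power_law_exponent`, the prefactors, the exponents 0.9 and 0.8, and the value of tc are illustrative assumptions, not values taken from Ref. [26]. For the tertiary regime the fit assumes that tc is already known.

```python
import math

def fit_power_law_exponent(x, y):
    """Fit y ~ prefactor * x**(-exponent) by least squares in log-log space.

    Returns (exponent, prefactor). Hypothetical helper, for illustration only.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)

# Synthetic primary-creep data obeying expression (1) with p = 0.9 (assumed).
t_primary = [10.0 * 1.5 ** k for k in range(15)]
rate_primary = [2e-3 * t ** -0.9 for t in t_primary]
p, _ = fit_power_law_exponent(t_primary, rate_primary)

# Synthetic tertiary-creep data obeying expression (2) with p' = 0.8 (assumed).
tc = 3.0e4                                         # assumed known rupture time (s)
time_to_failure = [1.0e4 / 1.5 ** k for k in range(12)]   # values of tc - t
rate_tertiary = [5e-2 * tau ** -0.8 for tau in time_to_failure]
p_prime, _ = fit_power_law_exponent(time_to_failure, rate_tertiary)

print(f"p = {p:.3f}, p' = {p_prime:.3f}")
```

With real data the rupture time is not known in advance, so tc would have to be treated as an additional fit parameter, for instance by scanning candidate values and keeping the one that gives the straightest log-log plot.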
2. The Role of Heterogeneities and Disorder
First, we need to define more precisely what is meant by “heterogeneity” or “disorder”. Disorder may describe the existence of a distribution (say Weibull-like) of the strengths of the material elements and/or of their elastic properties, as well as the presence of internal surfaces such as fiber-matrix interfaces, voids and microcracks (or internal microdefects). In this sense, a kevlar-matrix or carbon-matrix composite would behave more like a heterogeneous system than a homogeneous matrix. There is no unique way of defining the amplitude of disorder, since the classification depends on how the mechanics and physics respond to the heterogeneity. It can, in fact, be shown from a theorem of Von Neumann and Morgenstern [27] that the existence of possible correlations in the disorder prevents the existence of a unique absolute measure of disorder amplitude. In other words, the measure of disorder is relative to the problem. In practice, it can usually be quantified by some measure of the contrast between the material and strength properties of the components of the system, weighted by their relative concentrations and their scales. When disorder is uncorrelated in space, a reasonable measure of its amplitude is the width or standard deviation (when it exists) of its distribution. The correlation length of the disorder and the characteristic sizes and their distribution are also important variables, as they control the length scales that are relevant for the stress heterogeneity. A consequence is the size/volume effect, which is a very important practical subject. As already mentioned, the key parameter controlling the nature of damage and rupture is the degree and nature of disorder. This was considered early on by Mogi [28], who showed experimentally on a variety of materials that the larger the disorder, the stronger and more useful are the precursors to rupture. For a long time, the Japanese research effort for earthquake prediction and risk assessment was based on this very idea [29].
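For spatially uncorrelated, Weibull-distributed strengths, the width of the distribution can be made concrete with a short sketch. It uses the coefficient of variation (standard deviation divided by mean) as one possible dimensionless measure of disorder amplitude; this choice, the function names and the sample sizes are assumptions made for illustration, not a prescription from the text. For a Weibull law the coefficient of variation depends only on the shape parameter (the Weibull modulus m), and it decreases as m increases.

```python
import math
import random
import statistics

def weibull_cv(m):
    """Exact coefficient of variation of a Weibull law with shape (modulus) m."""
    g1 = math.gamma(1.0 + 1.0 / m)
    g2 = math.gamma(1.0 + 2.0 / m)
    return math.sqrt(g2 / g1 ** 2 - 1.0)

def sample_cv(m, scale=1.0, n=5000, seed=0):
    """Empirical coefficient of variation of n uncorrelated Weibull strengths."""
    rng = random.Random(seed)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    strengths = [rng.weibullvariate(scale, m) for _ in range(n)]
    return statistics.pstdev(strengths) / statistics.mean(strengths)

for m in (1.0, 2.0, 5.0, 20.0):
    print(f"m = {m:5.1f}  exact CV = {weibull_cv(m):.3f}  sampled CV = {sample_cv(m):.3f}")
```

A small modulus corresponds to a broad strength distribution (strong disorder), a large modulus to a nearly homogeneous material. This single number ignores spatial correlations, which, as stated above, prevent any unique absolute measure of disorder.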
A quantification of this idea of the role of heterogeneities on the nature of rupture has been obtained with a two-dimensional spring-block model with stress transfer over a limited range and with the democratic fiber bundle model [30]. These models do not claim realism but attempt rather to capture the interplay of heterogeneity and of the stress transfer mechanism. It was found that heterogeneity plays the role of a relevant field (in the language of the statistical physics of critical phase transitions): systems with limited stress amplification exhibit a tri-critical transition [31], from a Griffith-type abrupt rupture (first-order)
regime to a progressive damage (critical) regime as the disorder increases. In the two-dimensional spring-block model of surface fracture [30], the stress can be released by spring breaks and block slips. This spring-block model may represent schematically the experimental situation where a balloon covered with paint or dry resin is progressively inflated. An industrial application may be for instance a metallic tank with carbon or kevlar fibers impregnated in a resin matrix wrapped up around it which is slowly pressurized [32]. As a consequence, it elastically deforms, transferring tensile stress to the overlayer. Slipping (called fiber-matrix delamination) and cracking can thus occur in the overlayer. In Ref. [30], this process is modeled by an array of blocks which represents the overlayer on a coarse grained scale in contact with a surface with solid friction contact. The solid friction will limit stress amplification. The fact that the disorder is so relevant as to create the analog of a tri-critical behavior can be traced back to the existence of solid friction on the blocks which ensures that the elastic forces in the springs are carried over a bounded distance (equal to the size of a slipping “avalanche”) during the stress transfer induced by block motions. There are similarities between this model and models of quasi-periodic matrix cracking in fibrous composites and of fragmentation of fibers in the so-called “single-filament-composite” test. This last model has been extensively developed and extended in a global and local load-sharing framework [33–35]. In the presence of long-range elasticity, disorder is found to be always relevant leading to a critical rupture. However, the disorder controls the width of the critical region [36]. The smaller it is, the smaller will be the critical region, which may become too small to play any role in practice. 
This has been confirmed by simulations of the “thermal fuse model” described below [37]: the damage rate on the approach to failure for different disorder can be rescaled onto a universal master curve.
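The contrast between abrupt and progressive rupture can be illustrated with a toy version of the democratic fiber bundle model. The sketch below is not the model of Ref. [30]; it assumes force-controlled quasi-static loading, equal load sharing among intact fibers, and rupture thresholds drawn uniformly on [lower, 1], where `lower` is a hypothetical knob that sets the amount of disorder. With sorted thresholds, the bundle with k broken fibers can carry a total force of at most x_k (N − k), so global failure occurs at the maximum of this quantity and the position of the maximum gives the damage accumulated before the final runaway.

```python
import random

def dfbm_damage_before_collapse(thresholds):
    """Equal-load-sharing bundle under slowly increasing total force.

    Returns (fraction of fibers broken before global failure,
             bundle strength per initial fiber).
    """
    xs = sorted(thresholds)
    n = len(xs)
    best_k, best_force = 0, xs[0] * n
    for k, x in enumerate(xs):
        # Force needed to break fiber k once the k weaker fibers have failed.
        force = x * (n - k)
        if force > best_force:
            best_k, best_force = k, force
    return best_k / n, best_force / n

def uniform_thresholds(lower, n=10000, seed=1):
    """Thresholds uniform on [lower, 1]; small `lower` means strong disorder."""
    rng = random.Random(seed)
    return [rng.uniform(lower, 1.0) for _ in range(n)]

narrow = dfbm_damage_before_collapse(uniform_thresholds(0.8))  # weak disorder
broad = dfbm_damage_before_collapse(uniform_thresholds(0.0))   # strong disorder
print("weak disorder:   damage before collapse = %.4f, strength = %.3f" % narrow)
print("strong disorder: damage before collapse = %.4f, strength = %.3f" % broad)
```

In this simplified setting a narrow threshold distribution fails almost as soon as the weakest fiber breaks, whereas a broad distribution accumulates damage in a large fraction of the fibers before collapse. This is only a mean-field analog of the transition from abrupt to progressive rupture described above.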
3. Qualitative Physical Scenario: From Diffuse Damage to Global Failure
The following qualitative physical picture for the progressive damage of a heterogeneous system leading to global failure has emerged from a large variety of theoretical, numerical, and experimental works (see for instance Refs. [38–41]). First, single isolated defects and microcracks nucleate; then, with the increase of load or time of loading, they both grow and multiply, leading to an increase of the density of defects per unit volume. As a consequence, defects begin to merge until a “critical density” is reached. Uncorrelated percolation [42] provides a starting modeling point valid in the limit of very large disorder [43]. For realistic systems, long-range correlations transported by the
stress field around defects and cracks make the problem much more subtle. Time dependence is expected to be a crucial aspect in the process of correlation building in these processes. As the damage increases, a new “phase” appears, where microcracks begin to merge leading to screening and other cooperative effects. Finally, the main fracture is formed causing global failure. The nature of this global failure may be abrupt (“first-order”) or “critical” depending on the strength of heterogeneity as well as load transfer and stress relaxation mechanisms. In the “critical” case, the failure of composite systems may often be viewed, in simple intuitive terms, as the result of a “correlated percolation process.” However, the challenge is to describe the transition from damage and corrosion processes at a microscopic level to macroscopic failure.
4. Scaling and Critical Point
Motivated by the multi-scale nature of ruptures in heterogeneous systems and by analogies with the percolation model [42], statistical physicists suggested in the mid-1980s that rupture of sufficiently heterogeneous media would exhibit some universal properties, in a way maybe similar to critical phase transitions [43–45]. The idea was to build on the knowledge accumulated in statistical physics on the so-called N-body problem and cooperative effects in order to describe multiple interactions between defects. However, most of the models were drastically simplified and essentially all of them quasistatic with rather unrealistic loading rules [46, 47]. Suggestive scaling laws, including multifractality, were found to describe size effects and damage properties [46, 48], but the relevance to real materials was not convincingly demonstrated, with a few exceptions (e.g., percolation theory to explain the experimentally based Coffin-Manson law of low cycle fatigue [49] or the Portevin-Le Chatelier effect in dilute alloys [50]). With numerical simulations and perturbation expansions, Hansen, Hinrichsen and Roux [48] (see also Ref. [46]) have used this class of quasi-static rupture models (with short-range as well as long-range interactions) to classify three possible rupture regimes, as a function of the concentrations of weak versus strong elements in the system. Specifically, the distribution p(x) of rupture thresholds x of elements of the discretized systems was parameterized as follows: $p(x) \sim x^{\phi_0 - 1}$ for $x \to 0$ and $p(x) \sim x^{-(1+\phi_\infty)}$ for $x \to +\infty$. Then, the three regimes depend on the relative value of the exponents $\phi_0$ and $\phi_\infty$ compared with two critical values $\phi_0^c$ and $\phi_\infty^c$. The “weak disorder” regime occurs for $\phi_0 > \phi_0^c$ (few weak elements) and $\phi_\infty > \phi_\infty^c$ (few strong elements) and boils down essentially to the nucleation of a “one-crack” run-away.
For $\phi_0 \le \phi_0^c$ (many weak elements) and $\phi_\infty > \phi_\infty^c$ (few strong elements), the rupture is controlled by the weak elements, with important size effects. The damage is diffuse but becomes structured at large scales. For $\phi_0 > \phi_0^c$ (few weak
elements) and $\phi_\infty \le \phi_\infty^c$ (many strong elements), the rupture is controlled by the strong elements: the final damage is diffuse and the density of broken elements goes to a non-vanishing constant. This third case is very similar to the percolation models of rupture: Roux et al. [49] have indeed shown that percolation is retrieved in the limit of very large disorder. Beyond quasi-static models, the “thermal fuse model” of Sornette and Vanneste [37] was the first one with a realistic dynamical evolution law for the damage field. It was initially formulated in the framework of electric breakdown: when subjected to a given current, all fuses in a network heat up due to a generalized Joule effect (with exponent b); in the presence of heterogeneity in the conductances of the fuses, one of them will eventually break down first, when its temperature reaches the melting threshold. Its current is then immediately redistributed among the remaining fuses according to Kirchhoff’s laws. The model was later reformulated by showing that it is exactly equivalent to a (scalar) antiplane mechanical model of rupture with elastic interaction, in which the temperature becomes a local damage variable [50]. This model accounts for space-dependent elastic and rupture properties, has a realistic loading (constant stress applied at the beginning of the simulation, for instance) and produces growing interacting microcracks with an organization which is a function of the damage-stress law. This model is thus a statistical generalization, with quenched disorder, of homogenization theories of damage [51, 52]. In a creep experiment (constant applied stress), the total rate of damage in the late stage of evolution, as measured for instance by the elastic energy released per unit time dE/dt, is found on average to increase as a power law similar to expression (2),
$$\frac{dE}{dt} \sim \frac{1}{(t_c - t)^{\alpha}}, \tag{3}$$
of the time-to-failure tc − t in the later stage. This behavior reproduces the tertiary creep regime culminating in the global rupture at tc. In this model, rupture is found to occur as the culmination of the progressive nucleation, growth and fusion of microcracks, leading to a fractal network. Interestingly, the critical exponents (such as α > 0) are non-universal and vary as a function of the damage law (exponent b). This model has since been found to describe correctly the experiments on the electric breakdown of insulator-conducting composites [53]. Another application of the thermal fuse model is the damage by electromigration of polycrystalline metal films [54]. See also Ref. [50] for relations with dendrite and front propagation. The concept that rupture in heterogeneous materials is a genuine critical point, in the sense of phase transitions in statistical physics, was first articulated by Anifrani et al. [32], based on experiments on industrial composite structures. In this framework, the power law (3) is interpreted as analogous to a diverging susceptibility in critical phase transitions. It was found that
the critical behavior may correspond to an acceleration of the rate of energy release or to a deceleration, depending on the nature and range of the stress transfer mechanism and on the loading procedure. Symmetry arguments as well as the concept of a hierarchical cascade of damage events led in addition to the theoretical suggestion, verified experimentally, that the power law behavior (3) of the time-to-failure analysis should be corrected for the presence of log-periodic modulations [32]. This “log-periodicity” can be shown to be the signature of a hierarchy of characteristic scales in the rupture process. This hierarchy can be generated dynamically by a cascade of sub-harmonic bifurcations [55]. These log-periodic corrections to scaling amount mathematically to taking the critical exponent complex, α = α′ + iα″, where i² = −1 [56]. This has led to the development of a powerful predictive scheme ([57] and see below). The critical rupture concept can be seen as a non-trivial generalization of the dimensional analysis based on the Buckingham theorem and the asymptotic matching method proposed by Bažant [58, 59] to model size effects in complex materials, in the same way that Barenblatt’s [60] second-order similitude generalizes the naive similitude of first order (or simple analytical behavior) of standard dimensional analysis, or in the same way that the non-analytical behavior characterizing critical phase transitions generalizes the mean-field behavior of Landau-Ginzburg theory. Acharyya and Chakrabarti [61, 62] have shown how to define a “breakdown susceptibility” during the progressive damage of model systems subjected to local short-duration impulses, and how the breakdown point can then be located in advance by extrapolating this breakdown susceptibility.
Numerical simulations on two-dimensional heterogeneous systems of elastic-brittle elements have confirmed that, near the global failure point, the cumulative elastic energy released during fracturing of heterogeneous solids with long-range elastic interactions follows a power law with log-periodic corrections to the leading term [63]. The presence of log-periodic correction to scaling in the elastic energy released has also been demonstrated numerically for the thermal fuse model [64] using a novel averaging procedure, called the “canonical ensemble averaging”. This averaging technique accounts for the dependence of the critical rupture time tc on the specific disorder realization of each sample. A recent experimental study of rupture of fiber-glass composites has also confirmed the critical scenario [65]. A systematic analysis of industrial pressure tanks brought to rupture has also confirmed the critical rupture concept and the presence of significant log-periodic structures, that are useful for prediction [66]. Through a series of computer and laboratory simulations and table-top experiments, Chakrabarti and Benguigui [67] have presented a useful synthesis of basic modeling principles borrowing from statistical physics putting in perspective three case studies: electrical failures like fuse and dielectric breakdown, mechanical fractures, and earthquakes. Their work also emphasizes the critical rupture concept [61, 62, 68, 69].
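To make the log-periodic correction concrete: writing the complex critical exponent as α′ + iα″ (real and imaginary parts), the real part of (tc − t) raised to the power −(α′ + iα″) equals (tc − t)^(−α′) cos(α″ ln(tc − t)). The sketch below implements a commonly used first-order form, a leading power law multiplied by a small cosine in ln(tc − t). The amplitude `a`, relative correction `c` and phase `psi` are hypothetical fit parameters introduced for illustration; they are not values from the text.

```python
import math

def log_periodic_rate(tau, alpha_re, alpha_im, a=1.0, c=0.1, psi=0.0):
    """Power law in tau = tc - t with a log-periodic correction.

    rate = a * tau**(-alpha_re) * (1 + c * cos(alpha_im * ln(tau) + psi))
    """
    return a * tau ** (-alpha_re) * (1.0 + c * math.cos(alpha_im * math.log(tau) + psi))

def preferred_scaling_ratio(alpha_im):
    """Ratio lam such that tau -> lam * tau leaves the cosine term unchanged."""
    return math.exp(2.0 * math.pi / alpha_im)

alpha_re, alpha_im = 0.8, 6.0            # illustrative values only
lam = preferred_scaling_ratio(alpha_im)
tau = 250.0
lhs = log_periodic_rate(lam * tau, alpha_re, alpha_im)
rhs = lam ** (-alpha_re) * log_periodic_rate(tau, alpha_re, alpha_im)
print(f"lambda = {lam:.3f}; rate(lambda*tau) = {lhs:.6e}; lambda^-alpha' * rate(tau) = {rhs:.6e}")
```

The check at the end shows the discrete scale invariance behind the hierarchy of characteristic scales: the function is self-similar only under rescaling of tc − t by integer powers of λ = exp(2π/α″), rather than under arbitrary rescaling as for a pure power law.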
Let us also mention the work of Ramanathan and Fisher [70]: using analytical calculations and numerical simulations, they compare the nature of the onset of the motion of a single crack in a heterogeneous material when neglecting or taking into account the dynamical wave stress transfer mechanism. In the quasistatic limit with instantaneous stress transfer, the crack front is found to undergo a dynamic critical phenomenon, with a second-order-like transition from a pinned to a moving phase as the applied load is increased through a critical value. Real elastic waves lead to overshoots in the stresses above their eventual static value when one part of the crack front moves forward. Simplified models of these stress overshoots showed an apparent jump in the velocity of the crack front directly to a nonzero value. In finite systems, the velocity also shows hysteretic behavior as a function of the loading. These results suggest a first-order-like transition [70].
5. Creep Rupture: Models
Let us come back to the experiments shown in Fig. 1. There are many models, at the interface between standard mechanical approaches and statistical physics, which attempt to capture these observations. Vujosevic and Krajcinovic [71], Turcotte, Newman and Shcherbakov [23], Shcherbakov and Turcotte [24] and Pradhan and Chakrabarti [21, 22] used systems of elements or fibers within a probabilistic framework (corresponding to so-called annealed or thermal disorder) with a hazard rate function controlling the probability of rupture for a given fiber as a function of the stress applied to that fiber. Turcotte, Newman and Shcherbakov [23] obtained a finite-time singularity of the strain rate before failure in fiber bundle models by postulating a power-law dependence of the hazard rate on the stress applied to each fiber. Shcherbakov and Turcotte [24] studied the same model and recovered a power-law singularity of the strain rate for systems subjected to constant or increasing stresses, with an exponent p′ = 4/3 larger than the experimental results. Using energy conservation and the requirement of non-negative entropy change, Lyakhovsky, Ben-Zion and Agnon [72] derived an evolution equation for the density of microcracks similar to that of Turcotte, Newman and Shcherbakov [23] for a fiber bundle model. Ben-Zion and Lyakhovsky [73] derived analytically the existence of power laws describing the time-dependent increase of the singular strain and the accelerated energy release in the tertiary regime, using the continuum-based damage approach of Lyakhovsky, Ben-Zion and Agnon [72]. Sammis and Sornette [74] give an exhaustive review of the mechanisms giving rise to the power-law tertiary regime, with application to earthquakes. Vujosevic and Krajcinovic [71] also found a power-law acceleration in
two-dimensional simulations of elements and in a mean-field democratic load sharing model, using a stochastic hazard rate, but they did not obtain Andrade’s law in the primary creep regime. Shcherbakov and Turcotte [24] were able to obtain Andrade’s law only in the situation of a system subjected to a constant applied strain (stable regime). But then, they did not have a global rupture and they did not obtain the critical power law preceding rupture. Thus, the models described above do not reproduce at the same time Andrade’s law for the primary regime and a power-law singularity before failure. Miguel et al. [15] reproduced Andrade’s law with p ≈ 2/3 in a numerical model of interacting dislocations, but their model does not reproduce the tertiary creep regime (no global failure). Several creep models consider the democratic fiber bundle model (DFBM) with thermally activated failures of fibers. Pradhan and Chakrabarti [21, 22] considered the DFBM and added a probability of failure per unit time for each fiber which depends on the amplitude of a thermal noise and on the applied stress. They computed the failure time as a function of the applied stress and noise level, but they did not discuss the temporal evolution of the strain rate. Ciliberto, Guarino and Scorretti [16] and Politi, Ciliberto and Scorretti [20] considered the DFBM in which a random fluctuating force is added on each fiber to mimic the effect of thermal fluctuations. Ciliberto, Guarino and Scorretti [16] showed that this simple model predicts a characteristic rupture time given by an Arrhenius law with an effective temperature renormalized (amplified) by the quenched disorder in the distribution of rupture thresholds. Saichev and Sornette [25] showed that this model predicts Andrade’s law as well as a power-law time-to-failure for the rate of fiber rupture with p = p′ = 1, with logarithmic corrections (which may give apparent exponents p and p′ smaller than one).
A few other models reproduce both a power-law relaxation in the primary creep and a finite time singularity in the tertiary regime. Main [19] reproduced a power-law relaxation (Andrade’s law) followed by a power-law singularity of the strain rate before failure by superposing two processes of subcritical crack growth, with different parameters. A first mechanism with negative feedback dominates in the primary creep and the other mechanism with positive feedback gives the power-law singularity close to failure. Lockner [14] gave an empirical expression for the strain rate as a function of the applied stress in rocks, which reproduces, among other properties, Andrade’s law with p = 1 in the primary regime and a finite-time singularity leading to rupture. Kun et al. [17] and Hidalgo, Kun and Herrmann [18] studied numerically and analytically a model of visco-elastic fibers, with deterministic dynamics and quenched disorder. They considered different ranges of interaction between fibers (local or democratic load sharing). Kun et al. [17] derived the condition for global failure in the system and the evolution of the failure time as a function of the applied stress in the unstable regime, and analysed the
statistics of inter-event times in numerical simulations of the model. Hidalgo, Kun and Herrmann [18] derived analytically the expression for the strain rate as a function of time. This model reproduces a power-law singularity of the strain rate before failure with p′ = 1/2 in the case of a uniform distribution of strengths, but is not able to explain Andrade’s law for the primary creep. This model gives a power-law decay of the strain rate in the primary creep regime only if the stress is at the critical point, and then with an exponent p = 1/2 smaller than the experimental values. Nechad et al. [26] developed a variant of this model in which a composite system is viewed as made of a large set of representative elements (RE), each representative element comprising many fibers with their interstitial matrix. Each RE is endowed with a visco-elasto-plastic rheology with parameters which may be different from one element to another. The parameters characterizing each RE are frozen and do not evolve with time (so-called quenched disorder). Specifically, each RE is modeled as an Eyring dashpot in parallel with a linear spring. The Eyring rheology is standard for fiber composites [12]. It consists, at the microscopic level, in adapting to the matrix rheology the theory of reaction rates describing processes activated by crossing potential barriers. With these sole ingredients, the model recovers the three regimes (primary, secondary and tertiary) with exponents p = 1 (defined in expression (1)) and p′ = 1 (defined in expression (2)). These solutions for the primary and tertiary regimes are basically of the same form, with p = p′ = 1, as in the Langevin-type model solved by Saichev and Sornette [25]; this may not be surprising since the Eyring rheology describes, at the microscopic level, processes activated by crossing potential barriers, which are explicitly accounted for in the thermal fluctuation force model [25].
The key ingredients leading to these results are the broad (power law) distribution of rupture thresholds and the nonlinear Eyring rheology in a Kelvin element. Nechad et al.’s [26] model is a macroscopic deterministic effective description of the experiments. In contrast, the modeling strategy of Ciliberto, Guarino and Scorretti [16], of Politi, Ciliberto and Scorretti [20] and of Saichev and Sornette [25] emphasizes the interplay between microscopic thermal fluctuations and frozen heterogeneity. Qualitatively, Nechad et al.’s [26] model is similar to a deterministic macroscopic Fokker–Planck description while the thermal models of Ciliberto, Guarino and Scorretti [16], of Politi, Ciliberto and Scorretti [20] and of Saichev and Sornette [25] are reminiscent of stochastic Langevin models. It is well-known in statistical physics that Fokker–Planck equations and Langevin equations are exactly equivalent for systems at equilibrium and just constitute two different descriptions of the same processes, and their correspondence is associated with the general fluctuation-dissipation theorem. Similarly, the encompassing of both the Andrade relaxation law in the primary creep regime and of the time-to-failure power law singularity in the tertiary regime by Nechad et al.’s [26] model and by the thermal model solved in [25] suggests a deep connection between these two levels of description for creep and damage processes.
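As a worked illustration of how a power-law hazard rate produces a finite-time singularity in a democratic fiber bundle, consider the following mean-field sketch. It is a simplified calculation under stated assumptions, not a reproduction of the derivations in Refs. [23, 24]: the total force is held constant, the N(t) intact fibers out of N_0 share it equally, and each fiber fails at a rate ν(σ) = ν_0 (σ/σ_0)^ρ. The symbols ν_0, σ_0 and ρ are introduced here for illustration only.

```latex
% Equal load sharing at constant total force: \sigma(t) = \sigma_0 N_0 / N(t)
\begin{align}
\frac{dN}{dt} &= -\nu_0 \left(\frac{\sigma}{\sigma_0}\right)^{\rho} N
               = -\nu_0\, N_0^{\rho}\, N^{1-\rho}, \\
N(t) &= N_0 \left(1 - \frac{t}{t_c}\right)^{1/\rho},
        \qquad t_c = \frac{1}{\rho\,\nu_0}, \\
\sigma(t) &= \sigma_0 \left(1 - \frac{t}{t_c}\right)^{-1/\rho},
        \qquad \frac{d\sigma}{dt} \propto (t_c - t)^{-(1 + 1/\rho)} .
\end{align}
```

If the fibers are linearly elastic, the strain is proportional to the stress per fiber, so the strain rate diverges as a power law of tc − t with exponent 1 + 1/ρ. In this sketch the exponent exceeds one for any finite ρ, and the decay of the strain rate characteristic of primary creep is absent, since the damage rate only accelerates.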
6. Toward Rupture Prediction
Figure 2. Relation between the time tm of the minimum of the strain rate (transition time, horizontal axis, in s) and the rupture time tc (vertical axis, in s), on logarithmic axes, for all samples investigated in [26]. The legend distinguishes the [±62], [90/35] and SMC samples and gives the fit tc = 1.58 tm + 16.

There is a huge variability of the failure time from one sample to another, for the same applied stress, as shown in Fig. 2. This implies that one cannot predict the time to failure of a sample using an empirical relation between the applied stress and the time of failure. There is however another approach suggested by Fig. 2, as proposed by Nechad et al. [26]. The figure shows the correlation between the transition time tm (minimum of the strain rate) and the rupture time tc ≈ tend, and shows that tm is about 2/3 of the rupture time tc. This suggests a way to predict the failure time from the observation of the strain rate during the primary and secondary creep regimes, before the acceleration of the damage during the tertiary creep regime leading to the rupture of the sample. As soon as a clear minimum is observed, the value of tm can be measured and that of tc deduced from the relationship shown in Fig. 2. However, there are some cases where the minimum is not well defined, the first (smoothed) minimum being followed by a second similar one. In this case,
the application of the relationship shown in Fig. 2 would lead to a pessimistic prediction for the lifetime of the composite. The observation that the failure time is correlated with the p-value and the duration of the primary creep suggests that either a single mechanism is responsible both for the decrease of the strain rate during primary creep and for the acceleration of the damage during the tertiary creep or, if the mechanisms are different, the damage that occurs in the primary regime nevertheless impacts its subsequent evolution in the secondary and tertiary regimes, and therefore tc. In contrast, using a fit of the acoustic emission activity by a power law to estimate tc according to formula (3) works only in the tertiary regime and thus does not exploit the information contained in the deformation and in the acoustic emissions of the primary and secondary regimes, which cover 2/3 to 3/4 of the whole history. In practice, one needs at least one order of magnitude in the time tc − t to estimate accurately tc and p′, which means that, if the power-law acceleration regime starts immediately when the stress is applied (no primary creep), one cannot predict the rupture time using a fit of the damage rate by Eq. (3) before 90% of the failure time. If, as observed in the experiments of Nechad et al. [26], the tertiary creep regime starts only at about 63% of tc, then one cannot predict the rupture time using a fit of the damage rate before 96% of the failure time. This limitation was the motivation for the development of formulas that interpolate between the primary and tertiary regimes beyond the pure power law (3), using log-periodic corrections to scaling [32, 66, 75–78]. In particular, Anifrani et al. 
[32] have introduced a method based on log-periodic correction to the critical power law which has been used extensively by the European Aerospace company A´erospatiale (now EADS) on pressure tanks made of kevlar-matrix and carbon-matrix composites embarked on the European Ariane 4 and 5 rockets. In a nutshell, the method consists in this application in recording acoustic emissions under constant stress rate and the acoustic emission energy as a function of stress is fitted by the above log-periodic critical theory. One of the parameters is the time of failure and the fit thus provides a “prediction” when the sample is not brought to failure in the first test [77]. The results indicate that a precision of a few percent in the determination of the stress at rupture is typically obtained using acoustic emission recorded about 20% below the stress at rupture. This has warranted the selection of this non-destructive evaluation technique as the routine qualifying procedure in the industrial fabrication process. This methodology and these experimental results have been guided by the theoretical research over the years using the critical rupture concept discussed above. In particular, there is now a better understanding of the conditions, the mathematical properties and physical mechanisms on the basis of log-periodic structures [55, 56, 79–81]). Another noteworthy approach already mentioned above for the prediction of rupture, which is inspired by statistical physics,
is the “breakdown susceptibility” introduced by Acharyya and Chakrabarti [61, 62]. It requires monitoring the response of the system when it is subjected to local short-duration impulses whose nature depends on the problem (stress, strain, temperature, electromagnetic field, etc.). In summary, starting with the initial flurry of interest from the statistical physics community in problems of material rupture, a new awareness of the many-body nature of the rupture problem has blossomed. There is now a growing understanding in both communities of the need for an interdisciplinary approach that improves on the reductionist approach of both fields by tackling at the same time the difficult modeling of specific properties of the microscopic structures and of their interactions leading to collective effects. Independently of the type of material for a given application, this approach will be crucial in making progress on the optimization of the lifetime of materials (“durability”) and on the determination of the remaining lifetime of materials in use (“remaining potential”).
References [1] T. Reichhardt, “Rocket failure leads to grounding of small US satellites,” Nature (London), 384, 99–99, 1996. [2] H. Liebowitz (ed.), Fracture, New York, Academic Press, vols. I–VII, 1984. [3] J. Fineberg and M. Marder, “Instability in dynamic fracture,” Phys. Rep., 313, 2–108, 1999. [4] E. Bouchaud, “The morphology of fracture surfaces: a tool for understanding crack propagation in complex materials,” Surf. Rev. Lett., 10, 797–814, 2003. [5] J.J. Gilman, “Mechanochemistry,” Science, 274, 65–65, 1996. [6] A.R.C. Westwood, J.S. Ahearn, and J.J. Mills, “Developments in the theory and application of chemomechanical effects,” Colloid Surfaces, 2, 1, 1981. [7] National Research Council, Aging of U.S. Air Force Aircraft, Final Report from the Committee on Aging of U.S. Air Force Aircraft, National Materials Advisory Board Commission on Engineering and Technical Systems, Publication NMAB-4882, National Academy Press, Washington, D.C., 1997. [8] R. El Guerjouma, J.C. Baboux, D. Ducret, N. Godin, P. Guy, S. Huguet, Y. Jayet, and T. Monnier, “Non-destructive evaluation of damage and failure of fiber reinforced polymer composites using ultrasonic waves and acoustic emission,” Adv. Engrg. Mater., 8, 601–608, 2001. [9] W. Nelson, “Accelerated testing: statistical models, test plans and data analyses,” John Wiley & Sons, Inc., New York, 1990. [10] F. Omori, “On the aftershocks of earthquakes,” J. Coll. Sci. Imp. Univ. Tokyo, 7, 111, 1894. [11] A. Agbossou, I. Cohen, and D. Muller, “Effects of interphase and impact strain rates on tensile off-axis behaviour of unidirectional glass fibre composite: experimental results,” Engrg. Fract. Mech., 52 (5), 923–935, 1995. [12] J.Y. Liu, R.J. Ross, “Energy criterion for fatigue strength of wood structural members,” J. Engrg. Mater. Technol., 118(3), 375–378, 1996.
[13] A. Guarino, S. Ciliberto, A. Garcimartin, M. Zei, and R. Scorretti, “Failure time and critical behaviour of fracture precursors in heterogeneous materials,” Eur. Phys. J. B, 26 (2), 141–151, 2002. [14] D.A. Lockner, “A generalized law for brittle deformation of Westerly granite,” J. Geophys. Res., 103(B3), 5107–5123, 1998. [15] M.C. Miguel, A. Vespignani, M. Zaiser, and S. Zapperi, “Dislocation jamming and Andrade creep,” Phys. Rev. Lett., 89(16), 165501, 2002. [16] S. Ciliberto, A. Guarino, and R. Scorretti, “The effect of disorder on the fracture nucleation process,” Physica D, 158, 83–104, 2001. [17] F. Kun, Y. Moreno, R.C. Hidalgo, and H.J. Herrmann, “Creep rupture has two universality classes,” Europhys. Lett., 63(3), 347–353, 2003. [18] R.C. Hidalgo, F. Kun, and H.J. Herrmann, “Creep rupture of viscoelastic fiber bundles,” Phys. Rev. E, 65(3), 032502/1-4, 2002. [19] I.G. Main, “A damage mechanics model for power-law creep and earthquake aftershock and foreshock sequences,” Geophys. J. Int., 142(1), 151–161, 2000. [20] A. Politi, S. Ciliberto, and R. Scorretti, “Failure time in the fiber-bundle model with thermal noise and disorder,” Phys. Rev. E, 66(2), 026107/1-6, 2002. [21] S. Pradhan and B.K. Chakrabarti, “Failure due to fatigue in fiber bundles and solids,” Phys. Rev. E, 67, 046124, 2003a. [22] S. Pradhan and B.K. Chakrabarti, “Failure properties of fiber bundle models,” Int. J. Mod. Phys. B, 17( 29), 5565–5581, 2003b. [23] D.L. Turcotte, W.I. Newman, and R. Shcherbakov, “Micro and macroscopic models of rock fracture,” Geophys. J. Int., 152(3), 718–728, 2003. [24] R. Shcherbakov and D.L. Turcotte “Damage and self-similarity in fracture,” Theoretical Appl. Fract. Mech., 39(3), 245–258, 2003. [25] A. Saichev and D. Sornette, Andrade, “Omori and time-to-failure laws from thermal noise in material rupture,” Phys. Rev. E, 71(1), 2005 (preprint http://arXiv.org/abs/ cond-mat/0311493). [26] H. Nechad, A. Helmstetter, R. El Guerjouma, and D. 
Sornette, “Andrade and critical time-to-failure laws in fibre-matrix composites: experiments and model,” in press, J. Mech. Phys. Solids, http://arXiv.org/abs/cond-mat/0404035, 2005. [27] J. von Neumann and O. Morgenstern, “Theory of games and economic behavior,” Princetown University Press, 1947. [28] K. Mogi, “Some features of recent seismic activity in and near Japan: activity before and after great earthquakes,” Bull. Eq. Res. Inst. Tokyo Univ., 47, 395–417, 1969. [29] K. Mogi, “Earthquake prediction research in Japan,” J. Phys. Earth, 43, 533–561, 1995. [30] J.V. Andersen, D. Sornette, and K.-T. Leung, “Tri-critical behavior in rupture induced by disorder,” Phys. Rev. Lett., 78, 2140–2143, 1997. [31] A. Aharony, “Tricritical phenomena,” Lect. Notes Phys., 186, 209, 1983. [32] J.-C. Anifrani, C. Le Floc’h, D. Sornette, and B. Souillard, “Universal log-periodic correction to renormalization group scaling for rupture stress prediction from acoustic emissions,” J. Phys. I France, 5, 631–638, 1995. [33] W.A. Curtin, “Exact theory of fibre fragmentation in a single-filament composite,” J. Mater. Sci., 26, 5239–5253, 1991. [34] W.A. Curtin, “Size scaling of strength in heterogeneous materials,” Phys. Rev. Lett., 80, 1445–1448, 1998. [35] M. Ibnabdeljalil and W.A. Curtin, “Strength and reliability of fiber-reinforced composites: localized load-sharing and associated size effects,” Int. J. Sol. Struct., 34, 2649–2668, 1997.
[36] D. Sornette and J.V. Andersen, “Scaling with respect to disorder in time-to-failure,” Eur. Phys. J. B, 1, 353–357, 1998. [37] D. Sornette and C. Vanneste, “Dynamics and memory effects in rupture of thermal fuse networks,” Phys. Rev. Lett., 68, 612–615, 1992. [38] X.-L. Lei, K. Kusunose, M.V.M.S. Rao, O. Nishizawa, and T. Satoh, “Quasi-static fault growth and cracking in homogenous brittle rock under triaxial compression using acoustic emission monitoring,” J. Geophys. Res., 105, 6127–6139, 1999. [39] X.-L. Lei, K. Kusunose, O. Nishizawa, A. Cho, and T. Satoh, “On the spatiotemporal distribution of acoustic emissions in two granitic rocks under triaxial compression: the role of pre-existing cracks,” Geophys. Res. Lett., 27, 1997–2000, 2000. [40] C.A. Tang, H. Liu, P.K.K. Lee, Y. Tsui, and L.G. Tham, “Numerical studies of the infuence of microstructure on rock failure in uniaxial compression – Part I: effect of heterogeneity,” Int. J. Rock Mech. Mining Sci., 37, 555–569, 2000a. [41] C.A. Tang, H. Liu, P.K.K. Lee, Y. Tsui, and L.G. Tham, “Numerical studies of the infuence of microstructure on rock failure in uniaxial compression – Part II: constraint, slenderness and size effect,” Int. J. Rock Mech. Mining Sci., 37, 571–583, 2000b. [42] D. Stauffer and A. Aharony, “Percolation theory, Taylor and Francis, London,” 1992. [43] L. de Arcangelis, S. Redner, and H.J. Herrmann, “A random fuse model for breaking processes,” J. Physique Lett., 46, L585–590, 1985. [44] P.M. Duxbury, P.D. Beale, and P.L. Leath, “Size effects of electrical breakdown in quenched random media,” Phys. Rev. Lett., 57, 1052–1055, 1986. [45] A. Gilabert, C. Vanneste, D. Sornette, and E. Guyon, “The random fuse network as a model of rupture in a disordered medium,” J. Phys. France, 48, 763–770, 1987. [46] H.J. Herrmann and S. Roux (eds.), Statistical Models for the Fracture of Disordered media, Elsevier, Amsterdam, 1990. [47] P. 
Meakin, “Models for material failure and deformation,” Science, 252(5003), 226– 234, 1991. [48] A. Hansen, E. Hinrichsen, and S. Roux, “Scale-invariant disorder in fracture and related breakdown phenomena,” Phys. Rev. B, 43, 665–678, 1991. [49] Y. Br´echet, T. Magnin, and D. Sornette, “The Coffin–Manson law as a consequence of the statistical nature of the LCF surface damage,” Acta Metall., 40, 2281–2287, 1992. [50] M.S. Bharathi and G. Ananthakrishna, “Chaotic and power law states in the PortevinLe Chatelier effect,” Europhys. Lett., 60, 234–240, 2002; Correction, Ibid, 61, 430, 2003. [51] J.L. Chaboche, “A continuum damage theory with anisotropic and unilateral damage,” Rech. Aerospatiale, 2, 139, 1995. [52] J.F. Maire and J.L. Chaboche, “A new formulation of continuum damage mechanics (CDM) for composite materials,” Aerospace Sci. Technol., 1, 247–257, 1997. [53] L. Lamaign`ere, F. Carmona, and D. Sornette, “Experimental realization of critical thermal fuse rupture,” Phys. Rev. Lett., 77, 2738–2741, 1996. [54] R.M. Bradley and K. Wu, “Dynamic fuse model for electromigration failure of polycrystalline metal films,” Phys. Rev. E, 50, R631–R634, 1994. [55] Y. Huang, G. Ouillon, H. Saleur, and D. Sornette, “Spontaneous generation of discrete scale invariance in growth models,” Phys. Rev. E, 55, 6433–6447, 1997. [56] D. Sornette, “Discrete scale invariance and complex dimensions,” Phys. Rep., 297, 239–270, 1998.
[57] C. Le Floc’h and D. Sornette, “Predictive acoustic emission: application on helium high pressure tanks,” Pr´ediction des e´ v`enements catastrophiques: une nouvelle approche pour le controle de sant´e structurale, Instrumentation Mesure Metrologie, published by Hermes Science, RS Series, I2M, vol. 3(1–2), 89–97 (in french), 2003. [58] Z.P. Bazant, “Scaling of quasibrittle fracture: asymptotic analysis,” Int. J. Fract., 83, 19–40, 1997a. [59] Z.P. Bazant, “Scaling of quasibrittle fracture: hypotheses of invasive and lacunar fractality, their critique and Weibull connection,” Int. J. Fract., 83, 41–65, 1997b. [60] G.I. Barenblatt, Dimensional Analysis, Gordon and Breach, New York, 1987. [61] M. Acharyya and B.K. Chakrabarti, “Response of random dielectric composites and earthquake models to pulses – prediction possibilities,” Physica A, 224, 254–266, 1996a. [62] M. Acharyya and B.K. Chakrabarti, “Growth of breakdown susceptibility in random composites and the stick-slip model of earthquakes – prediction of dielectric breakdown and other catastrophes,” Phys. Rev. A, 53, 140–147; “Correction,” Phys. Rev. A, 54, 2174–2175, 1996b. [63] M. Sahimi and S. Arbabi, “Scaling laws for fracture of heterogeneous materials and rock,” Phys. Rev. Lett., 77, 3689–3692, 1996. [64] A. Johansen and D. Sornette, “Evidence of discrete scale invariance by canonical averaging,” Int. J. Mod. Phys. C, 9, 433–447, 1998. [65] A. Garcimartin, A. Guarino, L. Bellon, and S. Ciliberto, “Statistical properties of fracture precursors,” Phys. Rev. Lett., 79, 3202–3205, 1997. [66] A. Johansen and D. Sornette, “Critical ruptures,” Eur. Phys. J. B, 18, 163–181, 2000. [67] B.K. Chakrabarti and L.G. Benguigui, “Statistical physics of fracture and breakdown in disordered systems,” Clarendon Press, Oxford, 1997. [68] R. Banerjee and B.K. Chakrabarti, “Critical fatigue behaviour in brittle glasses,” B. Mater. Sci., 24(2), 161–164, 2001. [69] S. Pradhan and B.K. 
Chakrabarti, “Precursors of catastrophe in the Bak-TangWiesenfeld, Manna, and random-fiber-bundle models of failure,” Phys. Rev. E, 016113, 2002. [70] S. Ramanathan and D.S. Fisher, “Onset of propagation of planar cracks in heterogenous media,” Phys. Rev. B, 58, 6026–6046, 1998. [71] M. Vujosevic and D. Krajcinovic, “Creep rupture of polymers – a statistical model,” Int. J. Solids Struct., 34(9), 1105–1122, 1997. [72] V. Lyakhovsky, Y. Benzion, and A. Agnon, “Distributed damage, faulting and friction,” J. Geophys. Res. (Solid Earth), 102(B12), 27635–27649, 1997. [73] Y. Ben-Zion and V. Lyakhovsky, “Accelerated seismic release and related aspects of seismicity patterns on earthquake faults,” Pure Appl. Geophys., 159(10), 2385–2412, 2002. [74] S.G. Sammis and D. Sornette, “Positive feedback, memory and the predictability of earthquakes,” Proc. Nat. Acad. Sci. USA, 99(Supp. 1), 2501–2508, 2002. [75] S. Gluzman, J.V. Andersen, and D. Sornette, “Functional renormalization prediction of rupture,” Comput. Seismology, 32, 122–137, 2001. [76] A. Moura and V.I. Yukalov, “Self-similar extrapolation for the law of acoustic emission before failure of heterogeneous materials,” Int. J. Fract., 118(3), 63–68, 2002. [77] J. Gauthier, C. Le Floc’h, and D. Sornette, “Predictability of catastrophic events; a new approach for structural health monitoring predictive acoustic emission application on helium high pressure tanks,” In: D. Balageas (ed.), Proceedings of the first European workshop Structural Health Monitoring, ONERA, pp. 926–930, http://arXiv.org/abs/cond-mat/0210418, 2002.
[78] V.I. Yukalov, A. Moura, and H. Nechad, “Self-similar law of energy release before materials fracture,” J. Mech. Phys. Solids, 52, 453–465, 2004. [79] D. Sornette, “Predictability of catastrophic events: material rupture, earthquakes, turbulence, financial crashes and human birth,” Proc. Natl. Acad. Sci. USA, 99 (Supp. 1), 2522–2529, 2002. [80] K. Ide and D. Sornette, “Oscillatory finite-time singularities in finance, population and rupture,” Physica A, 307(1–2), 63–106, 2002. [81] W.-X. Zhou and D. Sornette, “Generalized q-analysis of log-periodicity: applications to critical ruptures,” Phys. Rev. E, 046111, 6604 N4 PT2:U129–U136, 2002. [82] D. Sornette and C. Vanneste, “Dendrites and fronts in a model of dynamical rupture with damage,” Phys. Rev. E, 50, 4327–4345, 1994. [83] S. Roux, A. Hansen, H. Herrmann, and E. Guyon, “Rupture of heterogeneous media in the limit of infinite disorder,” J. Stat. Phys., 52, 237–244, 1988.
4.5 THEORY OF RANDOM HETEROGENEOUS MATERIALS
S. Torquato
Department of Chemistry, PRISM, and Program in Applied & Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
1. Introduction
The theoretical prediction of the transport, electromagnetic, and mechanical properties of heterogeneous materials has a long and venerable history, attracting the attention of some of the luminaries of science, including Maxwell [1], Rayleigh [2], and Einstein [3]. Since the early work on the physical properties of heterogeneous materials, there has been an explosion in the literature on this subject [4–9] because of the rich and challenging fundamental problems it offers and its manifest technological importance. A heterogeneous material is composed of domains of different materials (phases), such as a composite, or the same material in different states, such as a polycrystal [8]. It is assumed that the “microscopic” length scale is much larger than the molecular dimensions but much smaller than the characteristic length of the macroscopic sample. In such circumstances, the heterogeneous material can be viewed as a continuum on the microscopic scale, and macroscopic or effective properties can be ascribed to it (see Fig. 1). Heterogeneous materials abound in synthetic products and nature. Synthetic examples include aligned and chopped fiber composites, particulate composites, powders, interpenetrating multiphase composites, cellular solids, colloids, gels, foams, phase-separated metallic alloys, microemulsions, block copolymers, and fluidized beds. Some examples of natural heterogeneous materials are granular media, soils, polycrystals, sandstone, wood, bone, lungs, blood, animal and plant tissue, cell aggregates and tumors. The physical phenomena of interest occur on “microscopic” length scales that span from tens of nanometers in the case of gels to meters in the case of geological media. Structure on this “microscopic” scale is generically referred to as microstructure.
S. Yip (ed.), Handbook of Materials Modeling, 1333–1357. © 2005 Springer. Printed in the Netherlands.
Figure 1. Left panel: A schematic of a random two-phase material shown as white and gray regions with general phase properties $K_1$ and $K_2$ and phase volume fractions $\phi_1$ and $\phi_2$. Here $L$ and $\ell$ represent the macroscopic and microscopic length scales, respectively. Right panel: When $L$ is much bigger than $\ell$, the heterogeneous material can be treated as a homogeneous material with effective property $K_e$.
Figure 2. Examples of random heterogeneous materials [8]. Left panel: A colloidal system of hard spheres of two different sizes. Right panel: A Fontainebleau sandstone. Scale bars in the original panels: 5 µm and 125 µm.
In many instances, the microstructures can be characterized only statistically, and therefore such materials are referred to as random heterogeneous materials. There is a vast family of random microstructures that are possible, ranging from dispersions with varying degrees of clustering to complex interpenetrating connected multiphase media, including porous media. Figure 2 shows examples of synthetic and natural random heterogeneous materials. The first example shows a scanning electron micrograph of a colloidal system of hard spheres of two different sizes, primarily composed of boron carbide (black
regions) and aluminum (white regions). The second example shows a planar section through a Fontainebleau sandstone obtained via X-ray microtomography. This imaging technique enables one to obtain full three-dimensional renderings of the microstructure, revealing that the void or pore phase (white region) is actually connected across the sample. Four different classes of problems are summarized in Table 1, and we will focus on the following four steady-state (time-independent) effective properties associated with these classes:
1. Effective conductivity tensor, $\sigma_e$
2. Effective stiffness (elastic) tensor, $C_e$
3. Mean survival time, $\tau$
4. Fluid permeability tensor, $k$
Knowledge of these effective properties is required for a host of applications in engineering, physics, geology, materials science, and biology [8]. Depending on the physical context, each phase can be either solid, fluid or void. The quantity $\sigma_e$ represents either the electrical or the thermal conductivity tensor, which are mathematically equivalent properties. It is the proportionality constant between the average of the local electric current (heat flux) and the average of the local electric field (temperature gradient) in the composite. This averaged relation is Ohm’s law or Fourier’s law (for the composite) in the electrical or thermal problems, respectively. For reasons of mathematical analogy, the determination of the effective conductivity translates immediately into
Table 1. The four different classes of steady-state effective media problems considered here. $F \propto K_e \cdot G$, where $K_e$ is the general effective property, $G$ is the average (or applied) generalized gradient or intensity field, and $F$ is the average generalized flux field. Class A and B problems share many common features and hence may be attacked using similar techniques. Class C and D problems are similarly related to one another [8].
Class | General effective property ($K_e$) | Average (or applied) generalized intensity ($G$) | Average generalized flux ($F$)
A | Thermal conductivity | Temperature gradient | Heat flux
A | Electrical conductivity | Electric field | Electric current
A | Dielectric constant | Electric field | Electric displacement
A | Magnetic permeability | Magnetic field | Magnetic induction
A | Diffusion coefficient | Concentration gradient | Mass flux
B | Elastic moduli | Strain field | Stress field
B | Viscosity | Strain rate field | Stress field
C | Survival time | Species production rate | Concentration field
C | NMR survival time | NMR production rate | Magnetization density
D | Fluid permeability | Applied pressure gradient | Velocity field
D | Sedimentation rate | Force | Mobility
equivalent results for the effective dielectric constant, magnetic permeability, or diffusion coefficient. Therefore, we refer to all of these problems as class A problems, as described in Table 1.
The effective stiffness (elastic) tensor $C_e$ is one of the most basic mechanical properties of a heterogeneous material. The quantity $C_e$ is the proportionality constant between the average stress and average strain. This relation is the averaged Hooke’s law for the composite.
Considerable attention has been devoted to instances in which the heterogeneous medium consists of a pore region in which diffusion (and bulk reaction) occurs and a “trap” region whose interface can absorb the diffusing species via a surface reaction. A key parameter in such processes is the mean survival time $\tau$, which gives the average lifetime of the diffusing species before it gets trapped. Often it is useful to introduce its inverse, called the trapping constant $\gamma \propto \tau^{-1}$, which is proportional to the trapping rate.
A key macroscopic property for describing slow viscous flow through porous media is the fluid permeability tensor $k$. The quantity $k$ is the proportionality constant between the average fluid velocity and applied pressure gradient in the porous medium. This relation is Darcy’s law for the porous medium.
Given the phase properties $K_1, K_2, \ldots, K_M$ and phase volume fractions $\phi_1, \phi_2, \ldots, \phi_M$ of a heterogeneous material with $M$ phases, how are its effective properties mathematically defined? It will be shown below that the effective properties of the heterogeneous material are determined by averages of local fields derived from the appropriate governing continuum-field theories (partial differential equations) for the problem of concern. Specifically, any of the aforementioned effective properties, which we denote generally by $K_e$, is defined by a linear relationship between an average of a generalized local flux $F$ and an average of a generalized local (or applied) intensity $G$, i.e.,
$$F \propto K_e \cdot G. \tag{1}$$
For the conduction, elasticity, trapping, and flow problems, the average generalized flux F represents the average local electric current (heat flux), stress, concentration, and velocity fields, respectively, and the average generalized intensity G represents the average local electric field (or temperature gradient), strain, production rate, and applied pressure gradient, respectively. Table 1 summarizes the average local (or applied) field quantities that determine the steady-state effective properties for all four problem classes. The similarities and differences between these classes are described fully by Torquato [8]. The effective properties of a heterogeneous material depend on the phase properties and microstructural information, including the phase volume fractions, which represent the simplest level of information. It is important to emphasize that the effective properties are generally not simple relations
(mixture rules) involving the phase volume fractions. This suggests that the complex interactions between the phases result in a dependence of the effective properties on nontrivial details of the microstructure. To illustrate this point, we consider the 50–50 two-phase system shown in the left panel of Fig. 3. It consists of a disconnected inclusion phase and a connected matrix phase. Let the gray “phase” be highly conducting (or stiff) compared to the white “phase”. The right panel shows a composite with exactly the same microstructure but with the phases interchanged. Which of the two composites has the higher effective conductivity (or stiffness)? Clearly, the one depicted in the right panel has the higher effective property, since the connected phase here is the more conducting (or stiffer) phase. Thus, even though both composites have the same volume fraction, their effective properties will be dramatically different, implying that the effective properties depend on microstructural information beyond that contained in the volume fractions. To summarize, for a random heterogeneous material consisting of $M$ phases, the general effective property $K_e$ is the following function:
$$K_e = f(K_1, K_2, \ldots, K_M;\ \phi_1, \phi_2, \ldots, \phi_M;\ \Omega), \tag{2}$$
where $\Omega$ indicates functionals of higher-order microstructural information. The mathematical form that this microstructural information takes is that of statistical correlation functions. A central aim of the theory of random heterogeneous materials is the development of methods to estimate the functional in (2) and therefore the relevant statistical correlation functions.
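The point that volume fractions alone do not fix the effective property can be made concrete with a short numerical sketch. It is not taken from the text: it assumes two-dimensional conduction, hypothetical phase conductivities of 1 (white) and 100 (gray), and uses the Hashin–Shtrikman coated-inclusion formula as a stand-in for each panel of Fig. 3, a formula that is exact only for special coated-disk assemblages. The function names are illustrative.

```python
def wiener_bounds(k1, k2, phi1):
    """Harmonic and arithmetic averages of the phase properties (lower, upper).

    These bound the effective conductivity for any microstructure.
    """
    phi2 = 1.0 - phi1
    harmonic = 1.0 / (phi1 / k1 + phi2 / k2)
    arithmetic = phi1 * k1 + phi2 * k2
    return harmonic, arithmetic


def coated_inclusion_estimate(k_matrix, k_incl, phi_incl, d):
    """Hashin-Shtrikman formula for inclusions in a connected matrix, d dimensions.

    Assumption: used here only as an idealized model of "disconnected
    inclusions in a connected matrix"; it is not the microstructure of Fig. 3.
    """
    phi_matrix = 1.0 - phi_incl
    return k_matrix + phi_incl / (
        1.0 / (k_incl - k_matrix) + phi_matrix / (d * k_matrix)
    )


K_WHITE, K_GRAY = 1.0, 100.0   # hypothetical phase conductivities
PHI = 0.5                      # 50-50 mixture

lower, upper = wiener_bounds(K_WHITE, K_GRAY, PHI)
# Left panel: gray (conducting) inclusions in a white matrix.
left_panel = coated_inclusion_estimate(K_WHITE, K_GRAY, PHI, d=2)
# Right panel: phases interchanged, so the gray phase is the connected matrix.
right_panel = coated_inclusion_estimate(K_GRAY, K_WHITE, PHI, d=2)

print(f"bounds: [{lower:.3f}, {upper:.3f}]")
print(f"left panel: {left_panel:.3f}, right panel: {right_panel:.3f}")
```

Under these assumptions both values lie between the harmonic and arithmetic averages, yet they differ by more than a factor of ten at the same volume fraction, which is the behavior the two panels are meant to illustrate.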
Figure 3. Left panel: 50–50 mixture consisting of a disconnected inclusion phase and a connected matrix phase. The gray phase is highly conducting (or stiff) relative to the white phase. Right panel: The same microstructure except the phases are interchanged [8].
2. Microstructural Correlation Functions
The diverse effective properties that we are concerned with here rigorously lead to a wide variety of microstructural descriptors, generically referred to as microstructural correlation functions, which are defined below. The reader is referred to the book by Torquato [8] for a detailed discussion of these microstructural correlation functions. We will assume that the microstructures are static or can be approximated as static, and therefore any realization $\omega$ of the random material will be taken to be independent of time. In particular, we will focus on two-phase heterogeneous materials. Each realization $\omega$ of the two-phase random medium comes from some probability space and occupies some subset $\mathcal{V}$ of $d$-dimensional Euclidean space, i.e., $\mathcal{V} \subset \mathbb{R}^d$. The region of space $\mathcal{V} \subset \mathbb{R}^d$ of volume $V$ is partitioned into two disjoint random sets or phases: phase 1, a region $\mathcal{V}_1(\omega)$ of volume fraction $\phi_1$, and phase 2, a region $\mathcal{V}_2(\omega)$ of volume fraction $\phi_2$. Let $\partial\mathcal{V}(\omega)$ denote the surface or interface between $\mathcal{V}_1(\omega)$ and $\mathcal{V}_2(\omega)$. For a given realization $\omega$, the indicator function $I^{(i)}(x; \omega)$ for phase $i$ for $x \in \mathcal{V}$ is a random variable defined by
$$I^{(i)}(x; \omega) = \begin{cases} 1, & \text{if } x \in \mathcal{V}_i(\omega), \\ 0, & \text{otherwise}, \end{cases} \tag{3}$$
for $i = 1, 2$. The indicator function $M(x; \omega)$ for the interface is defined as
$$M(x; \omega) = |\nabla I^{(1)}(x; \omega)| = |\nabla I^{(2)}(x; \omega)|, \tag{4}$$
and therefore is a generalized function that is nonzero when x is on the interface. Depending on the physical context, phase i can be a solid, fluid, or void characterized by some general tensor property. Henceforth, we will drop ω from the notation.
2.1. n-Point Probability Functions
The so-called n-point probability function for phase $i$, $S_n^{(i)}$, is the expectation of the product $I^{(i)}(x_1) I^{(i)}(x_2) \cdots I^{(i)}(x_n)$, i.e.,
$$S_n^{(i)}(x_1, x_2, \ldots, x_n) \equiv \langle I^{(i)}(x_1) I^{(i)}(x_2) \cdots I^{(i)}(x_n) \rangle, \tag{5}$$
where angular brackets denote the expectation (ensemble average).
This quantity can be interpreted as the probability that $n$ points at positions $x_1, x_2, \ldots, x_n$ are found in phase $i$. For statistically homogeneous media, the $S_n^{(i)}$ are translationally invariant and therefore depend only on the relative positions of the $n$ points. In particular, $S_1^{(i)}(x_1)$ is just the constant volume fraction $\phi_i$ of phase $i$. If the random medium is also statistically isotropic, the $S_n^{(i)}$ depend only on the distances between the $n$ points. Henceforth, we will define $S_n \equiv S_n^{(i)}$. The two-point or autocorrelation function $S_2(r) \equiv S_2^{(1)}(r)$ for statistically homogeneous media can be obtained by randomly tossing line segments of length $r \equiv |\mathbf{r}|$ with a specified orientation and counting the fraction of times the end points fall in phase 1 (see Fig. 4). For an isotropic porous solid, this two-point function can also be obtained experimentally via scattering of radiation [8].
Figure 4. A schematic depicting events that contribute to lower-order functions for random media of arbitrary microstructure [8]. Shown is the two-point probability function $S_2 \equiv S_2^{(1)}$ for phase 1 (white region), surface–void and surface–surface functions $F_{sv}$ and $F_{ss}$, lineal-path function $L \equiv L^{(1)}$, and the pore-size density function $P$.
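The segment-tossing recipe for the two-point function can be sketched on a digitized medium. The example below is a minimal illustration rather than the procedure of [8]: it assumes a hypothetical square pixel grid in which each pixel is independently assigned to phase 1 with probability 0.6, and it enumerates every placement of a horizontal segment (with periodic wrapping) instead of tossing segments at random. For such an uncorrelated medium one expects $S_2(0) = \phi_1$ and $S_2(r) \approx \phi_1^2$ for $r \geq 1$.

```python
import random


def make_medium(n, phi1, seed=0):
    """Hypothetical test medium: each pixel is phase 1 with probability phi1."""
    rng = random.Random(seed)
    return [[1 if rng.random() < phi1 else 0 for _ in range(n)] for _ in range(n)]


def s2_axis(ind, r):
    """Estimate S2(r) for phase 1 along the x-axis of a square periodic grid.

    ind[y][x] is the phase-1 indicator (1 in phase 1, 0 otherwise). The
    estimate is the fraction of segment placements whose two end points
    both fall in phase 1.
    """
    n = len(ind)
    hits = sum(row[x] * row[(x + r) % n] for row in ind for x in range(n))
    return hits / (n * n)


medium = make_medium(200, 0.6, seed=1)
phi1 = s2_axis(medium, 0)                      # S2(0) equals the volume fraction
s2_values = [s2_axis(medium, r) for r in range(6)]

for r, value in enumerate(s2_values):
    print(f"S2({r}) = {value:.4f}")
```

Averaging over several orientations, rather than a single axis, would be the natural extension for a statistically isotropic medium.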
2.2. Surface Correlation Functions
Surface correlation functions contain information about the random interface $\partial\mathcal{V}$ and are of basic importance in the trapping and flow problems. In this context, we will let phase 1 denote the fluid or “void” phase, and phase 2 the “solid” phase. The simplest surface correlation function is the specific surface $s(x)$ (interface area per unit volume) at point $x$, which is a one-point correlation function for statistically inhomogeneous media, i.e.,
$$s(x) = \langle M(x) \rangle, \tag{6}$$
where $M(x)$ is the interface indicator function given by (4). Two-point surface correlation functions for statistically inhomogeneous media are defined by
$$F_{sv}(x_1, x_2) = \langle M(x_1) I(x_2) \rangle, \tag{7}$$
$$F_{ss}(x_1, x_2) = \langle M(x_1) M(x_2) \rangle, \tag{8}$$
where I(x) ≡ I (1)(x) is the indicator function for the void phase. These functions are called the surface–void and surface–surface correlation functions, respectively. Higher-order surface correlation functions can also be defined [8].
2.3. Lineal Measures
For statistically isotropic media, the lineal-path function $L^{(i)}(z)$ gives the probability that a line segment of length $z$ lies wholly in phase $i$ when randomly thrown into the sample. Figure 4 shows an event that contributes to the lineal-path function. The function $L^{(i)}(z)$ is related to the chord-length probability density function $p^{(i)}(z)$ via the formula
$$p^{(i)}(z) = \frac{\ell_C^{(i)}}{\phi_i} \frac{d^2 L^{(i)}(z)}{dz^2}, \tag{9}$$
where $\ell_C^{(i)}$ is the mean chord length for phase $i$, i.e., the first moment of $p^{(i)}(z)$, defined by $\ell_C^{(i)} = \int_0^\infty z\, p^{(i)}(z)\, dz$. Chords are all of the line segments between intersections of an infinitely long line with the two-phase interface. For statistically isotropic media, the quantity $p^{(i)}(z)\,dz$ is the probability of finding a chord of length between $z$ and $z + dz$ in phase $i$.
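A digitized analogue of the lineal-path function and of the chords can be sketched as follows. This is an illustration under stated assumptions, not the method of [8]: the medium is a small hypothetical pixel array chosen so that the numbers can be checked by hand, segments are laid only along rows, a segment of "length" z is taken to cover z + 1 consecutive pixels, and runs that touch the edge of the array are counted as chords even though they are truncated.

```python
def lineal_path(ind, z):
    """Fraction of horizontal segments covering z + 1 pixels that lie wholly in phase 1.

    Only placements that fit inside a row are counted (no periodic wrapping).
    """
    n_cols = len(ind[0])
    total = hits = 0
    for row in ind:
        for x in range(n_cols - z):
            total += 1
            hits += all(row[x:x + z + 1])
    return hits / total


def chord_lengths(ind):
    """Lengths, in pixels, of maximal horizontal runs of phase-1 pixels."""
    chords = []
    for row in ind:
        run = 0
        for value in row:
            if value:
                run += 1
            elif run:
                chords.append(run)
                run = 0
        if run:                      # run that ends at the edge of the array
            chords.append(run)
    return chords


# Hypothetical 2 x 4 sample: 1 marks phase 1, 0 marks phase 2.
sample = [[1, 1, 0, 1],
          [0, 1, 1, 1]]

L_values = [lineal_path(sample, z) for z in range(3)]
chords = chord_lengths(sample)
mean_chord = sum(chords) / len(chords)

print("L(0), L(1), L(2) =", L_values)
print("chords =", chords, "mean chord length =", mean_chord)
```

Here the value at z = 0 is simply the phase-1 pixel fraction, consistent with a zero-length segment lying in phase 1 with probability equal to the volume fraction.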
2.4. Pore-Size Functions
The pore-size probability density function $P(\delta)$ (also referred to as the pore-size “distribution” function) first arose to characterize the void or “pore” space in porous media. For simplicity, we will define $P(\delta)$ for phase 1, keeping in mind that it is equally well defined for phase 2. The quantity $P(\delta)\,d\delta$ is defined as the probability that a randomly chosen point in $\mathcal{V}_1(\omega)$ lies at a distance between $\delta$ and $\delta + d\delta$ from the nearest point on the pore–solid interface.
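The definition of the pore-size density suggests a direct Monte Carlo estimate: sample points uniformly, keep those that fall in the pore phase, and histogram their distances to the interface. The sketch below assumes, purely for illustration, a two-dimensional medium whose solid phase is a union of equal disks with hypothetical centers and radius; the distance from a pore point to the interface is then the smallest center distance minus the radius. The walls of the sampling box are not treated as interface, and all names and parameter values are illustrative.

```python
import math
import random


def pore_distance(point, centers, radius):
    """Distance from `point` to the nearest disk surface, or None if inside a disk."""
    gap = min(math.dist(point, c) for c in centers) - radius
    return gap if gap > 0.0 else None


def pore_size_density(centers, radius, box, n_samples, n_bins, delta_max, seed=0):
    """Histogram estimate of the pore-size density from points sampled in the pore phase."""
    rng = random.Random(seed)
    width = delta_max / n_bins
    counts = [0] * n_bins
    kept = 0
    for _ in range(n_samples):
        p = (rng.uniform(0.0, box), rng.uniform(0.0, box))
        d = pore_distance(p, centers, radius)
        if d is None:                # sample landed in the solid phase
            continue
        kept += 1
        if d < delta_max:
            counts[int(d / width)] += 1
    if kept == 0:
        raise ValueError("no sample point fell in the pore phase")
    # Normalize so that the histogram integrates to the fraction of kept
    # points with distance below delta_max.
    return [c / (kept * width) for c in counts]


# Hypothetical configuration: 20 possibly overlapping disks in a 10 x 10 box.
BOX, RADIUS, N_BINS, DELTA_MAX = 10.0, 0.5, 50, 15.0
_rng = random.Random(3)
centers = [(_rng.uniform(0.0, BOX), _rng.uniform(0.0, BOX)) for _ in range(20)]

density = pore_size_density(centers, RADIUS, BOX, 20000, N_BINS, DELTA_MAX, seed=1)
bin_width = DELTA_MAX / N_BINS
print("integral of estimated density:", sum(density) * bin_width)
```

Because `DELTA_MAX` exceeds the largest distance possible in the box, every pore point is binned and the estimated density integrates to one.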
2.5. Two-Point Cluster Function
Perhaps the most promising two-point descriptor identified to date is the two-point cluster function C2(i)(x1, x2) [8]. The quantity C2(i)(x1, x2) gives the probability of finding two points at x1 and x2 in the same cluster of phase i. The formation of very large “clusters” of a phase in a heterogeneous material (on the order of the system size) can have a dramatic influence on its macroscopic properties. A cluster of phase i is defined as the part of phase i that can be reached from a point in phase i without passing through phase j ≠ i. A critical point, known as the percolation threshold, is reached when a sample-spanning cluster first appears. Thus, C2(i) is the analogue of the two-point probability
function S2(i) , but unlike its predecessor, it contains nontrivial topological “connectedness” information. Indeed, it is a useful signature of clustering in the system since it becomes longer ranged as the percolation threshold is approached from below. The measurement of C2(i) for a three-dimensional material sample cannot be made from a two-dimensional cross-section of the material, since it is an intrinsically three-dimensional microstructural function. The remaining challenge is to be able to incorporate C2(i) into a theory to predict macroscopic properties for a wide range of conditions, even near the threshold.
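To make the distinction between the two-point probability function and the two-point cluster function concrete, here is a minimal Python sketch for a small two-dimensional digitized medium. It labels the clusters of one phase by breadth-first search and then counts pixel pairs separated by r pixels along a row. The function names, the use of 4-connectivity, and the row-only sampling are assumptions of this example; a real three-dimensional sample would need three-dimensional labeling, as the text emphasizes.

```python
from collections import deque


def label_clusters(image, phase):
    """Give every pixel of `phase` a positive cluster label (0 elsewhere)."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for i in range(height):
        for j in range(width):
            if image[i][j] != phase or labels[i][j]:
                continue
            next_label += 1
            labels[i][j] = next_label
            queue = deque([(i, j)])
            while queue:  # breadth-first flood fill with 4-connectivity
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if (0 <= u < height and 0 <= v < width
                            and image[u][v] == phase and not labels[u][v]):
                        labels[u][v] = next_label
                        queue.append((u, v))
    return labels


def two_point_functions(image, phase, r):
    """Return estimates (S2, C2) for a displacement of r pixels along a row."""
    labels = label_clusters(image, phase)
    height, width = len(image), len(image[0])
    pairs = height * (width - r)
    same_phase = same_cluster = 0
    for i in range(height):
        for j in range(width - r):
            first, second = labels[i][j], labels[i][j + r]
            if first and second:
                same_phase += 1          # both end points in the phase
                if first == second:
                    same_cluster += 1    # and in the same cluster
    return same_phase / pairs, same_cluster / pairs


if __name__ == "__main__":
    sample = [
        [1, 1, 0, 1, 1],
        [1, 0, 0, 0, 1],
        [1, 1, 0, 1, 1],
    ]
    print(two_point_functions(sample, phase=1, r=3))
```

In this toy sample the two pieces of phase 1 are disconnected, so the C2 estimate vanishes at r = 3 even though the S2 estimate does not.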
2.6. Nearest-Neighbor Functions
All of the aforementioned statistical descriptors are defined for disordered materials of arbitrary microstructure. In the special case of random media composed of particles (phase 2) distributed randomly throughout another material (phase 1) or simple atomic systems, there is a variety of natural morphological descriptors. For simplicity, consider systems of identical spherical particles of diameter D (or radius R = D/2) at number density ρ. The “particle” nearest-neighbor probability density function HP (r) characterizes the probability of finding the nearest neighbor at some given distance from a reference particle. A different nearest-neighbor function, HV , referred to as the “void” nearest-neighbor probability density function, characterizes the probability of finding a nearest-neighbor particle center at a given distance from an arbitrary point in the system. Other descriptors of particle systems are described by Torquato [8].
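The two nearest-neighbor densities can be estimated from a list of particle centers by histogramming distances. The sketch below is only illustrative: the function names, the cubic sampling box without periodic boundaries, and the simple fixed-width histogram are assumptions of this example rather than anything prescribed in the text.

```python
import math
import random


def particle_nn_distances(centers):
    """Distance from each particle center to its nearest other center."""
    return [
        min(math.dist(c, other) for j, other in enumerate(centers) if j != i)
        for i, c in enumerate(centers)
    ]


def void_nn_distances(centers, box_length, n_samples, seed=0):
    """Distance from uniformly random points in the box to the nearest center."""
    rng = random.Random(seed)
    dim = len(centers[0])
    distances = []
    for _ in range(n_samples):
        point = tuple(rng.uniform(0.0, box_length) for _ in range(dim))
        distances.append(min(math.dist(point, c) for c in centers))
    return distances


def density_histogram(distances, bin_width, n_bins):
    """Fixed-width histogram normalized as a probability density per bin."""
    counts = [0] * n_bins
    for dist in distances:
        k = int(dist / bin_width)
        if k < n_bins:
            counts[k] += 1
    return [c / (len(distances) * bin_width) for c in counts]


if __name__ == "__main__":
    centers = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
    print(particle_nn_distances(centers))  # samples for an H_P estimate
    void = void_nn_distances(centers, box_length=5.0, n_samples=1000)
    print(density_histogram(void, bin_width=0.5, n_bins=10))  # H_V estimate
```

Histogramming the output of `particle_nn_distances` gives an estimate of H_P, and histogramming the output of `void_nn_distances` gives an estimate of H_V; edge effects are ignored here.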
3. Unified Theoretical Approach
The previous section described some of the different types of statistical correlation functions that have arisen in rigorous structure–property relations [8]. Until recently, application of such structure–property relations was virtually nonexistent because of the difficulty involved in ascertaining the correlation functions. Are these different functions related to one another? Can one write down a single expression that contains complete statistical information and thereby compute any specific correlation function? The key quantity that enables one to answer these two queries in the affirmative is the canonical n-point correlation function Hn [8, 10].
3.1. Canonical n-Point Correlation Function
For simplicity, we will begin by considering a classical, closed system of N interacting identical spherical particles of radius R in volume V . Any ensemble
of many-particle systems is completely spatially characterized classically by the n-particle probability density function ρn(r^n). The quantity ρn(r^n)dr^n is proportional to the probability of finding any subset of n particles with configuration r^n in volume element dr^n. In general, ρn(r^n)dr^n depends on the N-particle potential Φ_N(r^N) and the particular dynamical process involved to create the system. In many instances, the total potential energy (in the absence of external fields) is well approximated by pairwise additivity. The key idea employed by Torquato [10] to define and derive series representations of the canonical n-point correlation function Hn is the available space and available surface to the ith “test” particle of radius b_i that is inserted into the system of spheres of radius R. Letting a_i = b_i + R and y_ij = |x_i − r_j|, one representation of the Hn is given by

$$ H_n(\mathbf{x}^m; \mathbf{x}^{p-m}; \mathbf{r}^q) = \sum_{s=0}^{\infty} (-1)^s (-1)^m \frac{\partial}{\partial a_1} \cdots \frac{\partial}{\partial a_m} G_n^{(s)}(\mathbf{x}^p; \mathbf{r}^q), \qquad n = p + q, \tag{10} $$

where

$$ G_n^{(s)}(\mathbf{x}^p; \mathbf{r}^q) = \frac{1}{s!} \prod_{l=1}^{q} \prod_{k=1}^{p} e(y_{kl}; a_k) \int \rho_{q+s}(\mathbf{r}^{q+s}) \prod_{j=q+1}^{q+s} \left\{ 1 - \prod_{i=1}^{p} \left[ 1 - m(y_{ij}; a_i) \right] \right\} d\mathbf{r}_j, \tag{11} $$
and e(r; a) = 1 − m(r; a) equals zero if r < a and unity otherwise. Importantly, all of the aforementioned microstructural correlation functions are special limits of the Hn and thus one can, in principle, compute each of them for this class of model microstructures given the ρn . Representations of the Hn for dispersions of spheres with a size distribution have also been obtained. The formal results described above can be extended to statistically anisotropic models consisting of inclusions whose configuration is fully specified by center-of-mass coordinates (e.g., oriented ellipsoids, cubes, cylinders, etc.) as well as to statistically anisotropic laminates. The formalism also extends to nonparticulate systems, such as cell or lattice models. The reader is referred to Ref. [8] for a discussion of these extensions.
3.2. Model Pair Potentials
For a system of noninteracting particles, Φ_N = 0. Insofar as statistical thermodynamics is concerned, this is the trivial case of an ideal gas. However, this is a nontrivial model of a heterogeneous material, since the lack of spatial correlation implies that the particles may overlap to form complex clusters,
as shown in Fig. 5. At low sphere densities, the particle phase is a dispersed, disconnected phase, but above a critical value, called the percolation threshold, the particle phase becomes connected. For d = 2 and 3, this threshold occurs at a sphere volume fraction of about 0.68 and 0.29, respectively [8]. We will refer to this model as overlapping spheres. Interpenetrable-sphere systems in general are useful models of consolidated media, such as sandstones and other rocks, and sintered materials. In the hard-sphere pair potential, the particles do not interact for interparticle separation distances greater than the sphere diameter D but experience an infinite repulsive force for distances less than or equal to D. Hard-sphere systems have received considerable attention, since they serve as a useful model for a number of physical systems, such as simple liquids, glasses, colloidal dispersions, fiber-reinforced composites, particulate composites, and granular media [8, 11–13]. The hard-sphere potential approximates well the structure of dense-particle systems with more complicated potentials because short-range repulsion between the particles is the primary factor in determining their spatial arrangement. Unlike the case of overlapping spheres, the determination of ρn for general ensembles of hard spheres is nontrivial. For hard-sphere systems the impenetrability constraint does not uniquely specify the statistical ensemble [8]. The hard-sphere system can be in thermal equilibrium or in one of the infinitely many nonequilibrium states, such as the random sequential addition (or adsorption) (RSA) process (see Fig. 6). The latter is produced by randomly, irreversibly, and sequentially placing nonoverlapping objects into a volume. For identical d-dimensional RSA spheres, the filling process terminates at the saturation limit, which is substantially lower than the maximum density for random hard spheres in equilibrium. 
Denoting the maximum sphere volume fraction by φ2max , it turns out that for identical hard spheres in an RSA process in the thermodynamic limit,
Figure 5. Two-dimensional overlapping-particle systems at a low density with at most three particle overlaps (left panel) and at a high density above the percolation threshold (right panel) [8].
Figure 6. “Snapshot” of an equilibrium system of hard particles (left panel, labeled “Equilibrium”) and a realization of hard particles assembled according to an RSA process (right panel, labeled “RSA”, in which the particles are “frozen” or “parked”) [8]. In the former, the particles are free to sample the configuration space subject to the impenetrability of the other particles, but in the latter, the particles are frozen at their initial positions.
φ2max ≈ 0.75, 0.55, and 0.38 for d = 1, 2, and 3, respectively. In contrast, for identical disordered hard spheres in equilibrium, φ2max is exactly unity for d = 1, and for d = 2 and 3, φ2max ≈ 0.83 and 0.64, respectively. Interestingly, these maximum densities for equilibrium hard spheres apparently correspond to a special singular disordered state commonly referred to as the random close packing (RCP) state [14]. However, it has recently been shown [15] that the venerable concept of the random close packed (RCP) state is mathematically ill-defined. The RCP density is thought to be the highest density that a “random” packing of particles can attain. The term close packed implies maximal coordination throughout the system, i.e., an ordered lattice, which clearly is in conflict with the notion of randomness. The exact proportion of these two competing effects is not well-defined, and therein lies the problem. Finally, the notion of randomness was never quantified, nor even clearly defined. Torquato et al. [15] demonstrated the ill-defined nature of the RCP state by introducing precise definitions for jamming [16] and scalar order metrics, and analyzing computer-generated sphere packings. They showed that since one can achieve packings with arbitrarily small increases in packing fraction at the expense of small increases in order, the notion of RCP as the highest possible density that a random sphere packing can attain is not well-defined. To replace this idea, they introduced a new concept: the maximally random jammed (MRJ) state, which can be defined precisely once a scalar order metric is chosen. This lays the mathematical groundwork for studying randomness in packings of particles and initiates the search for the MRJ state in a quantitative way not possible before.
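The RSA saturation value quoted above for d = 1 can be reproduced with a few lines of Python. The sketch below does not simulate rejected placement attempts; it uses an equivalent gap-splitting formulation, which relies on the fact that in one dimension the first rod accepted into a gap is uniformly distributed over the admissible positions and the two resulting gaps then fill independently. The function name, the unit rod length, and the line length used are assumptions of this example.

```python
import random


def rsa_1d_coverage(line_length, seed=0):
    """Saturation coverage of unit-length rods placed by RSA on a line."""
    rng = random.Random(seed)
    gaps = [line_length]  # empty intervals that may still accept a rod
    rods = 0
    while gaps:
        gap = gaps.pop()
        if gap < 1.0:
            continue  # too short for another rod: stays empty forever
        left = rng.uniform(0.0, gap - 1.0)  # empty space left of the new rod
        rods += 1
        gaps.append(left)                   # remaining gap on the left
        gaps.append(gap - 1.0 - left)       # remaining gap on the right
    return rods / line_length


if __name__ == "__main__":
    print(rsa_1d_coverage(10000.0, seed=1))
```

For a long line the printed coverage should be close to the value of about 0.75 stated in the text; the exact digits depend on the seed and on the finite line length.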
In a recent study [17], a comprehensive set of candidate jammed states of identical hard spheres and a variety of different order metrics have been used to estimate the MRJ packing fraction to be 0.637 ± 0.0015. Figure 7 shows a configuration near the MRJ state. The determination of the MRJ states for systems of spheres with a polydispersity in size [18] and for systems of
Figure 7. A realization of a random packing of 500 identical spheres near the maximally random jammed state.
ellipsoids [19] are intriguing and challenging problems. Recently, it was shown that ellipsoids can randomly pack more densely than spheres in the MRJ state and nearly as densely as the densest (crystal) packing of spheres [19].

Interpenetrable-sphere models enable one to define systems that are intermediate between overlapping spheres and impenetrable spheres, thereby enabling one to vary the degree of connectivity of the particle phase. A popular interpenetrable-sphere model is the penetrable-concentric-shell model or, more colloquially, the cherry-pit model. When 0 ≤ λ ≤ 1, each sphere of diameter D may be thought of as being composed of an impenetrable core of diameter λD encompassed by a perfectly penetrable concentric shell of thickness (1 − λ)D/2. By varying the impenetrability parameter λ between 0 and 1, one can continuously pass between fully penetrable (overlapping) spheres and totally impenetrable spheres, respectively.

Well-known models that incorporate attractive interactions include the square-well potential and the Lennard–Jones potential [11]. A special limit of the square-well potential that reduces attractive interactions to a delta function at contact is referred to as the “sticky” hard-sphere potential. This potential provides a simple means of modeling aggregation processes in particle systems.
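The geometric constraint of the cherry-pit model reduces to a single distance test: two spheres may interpenetrate as long as their impenetrable cores of diameter λD do not overlap. The helper below is a hypothetical illustration of that test (the name `is_admissible` and the list-of-centers interface are assumptions of this example).

```python
import math
from itertools import combinations


def is_admissible(centers, diameter, lam):
    """True if no two impenetrable cores (diameter lam * D) overlap."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("the impenetrability parameter must lie in [0, 1]")
    core_diameter = lam * diameter
    # Cores overlap when the center-to-center distance is below lam * D.
    return all(
        math.dist(a, b) >= core_diameter for a, b in combinations(centers, 2)
    )


if __name__ == "__main__":
    pair = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]  # centers half a diameter apart
    for lam in (0.0, 0.4, 0.6, 1.0):
        print(lam, is_admissible(pair, diameter=1.0, lam=lam))
```

With λ = 0 every configuration is accepted (overlapping spheres), and with λ = 1 the test is the usual hard-sphere nonoverlap condition.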
3.3. Illustrative Calculations
Given the series representation of the canonical n-point correlation function Hn , one can compute (using statistical–mechanical techniques) specific statistical descriptors as special limiting cases as outlined above. Here we report two illustrative calculations of correlation functions. The reader is referred to Ref. [8] for computational details.
Figure 8 shows the matrix two-point probability function S2 for three-dimensional overlapping spheres (phase 2) at a sphere volume fraction φ2 = 0.5. Included in the figure is the corresponding plot of S2 for equilibrium hard (totally impenetrable) spheres, which, unlike overlapping spheres, exhibits short-range order. Figure 9 shows the two-point cluster function C2 for sticky hard spheres. The key point is that C2 becomes longer ranged as the percolation threshold φ2c = 0.297 is approached from below. We note that there is a variety of techniques available to extract microstructural functions from digitized 2D and 3D representations of heterogeneous materials [8].

Figure 8. The matrix two-point probability function S2(r) versus the dimensionless distance r/D for two models (hard spheres and overlapping spheres) of isotropic distributions of spheres of diameter D = 2R at a sphere volume fraction φ2 = 0.5 [8].

Figure 9. The two-point cluster function C2(r) versus the dimensionless distance r/D for sticky hard spheres of diameter D at several values of the sphere volume fraction φ2 (φ2 = 0.1, 0.2, and 0.297). Here τ = 0.35 is the dimensionless “stickiness” parameter and φ2c = 0.297 [8].
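For overlapping spheres the matrix two-point function of Fig. 8 has a simple closed form, which the following Python sketch evaluates. It relies on two standard results for spheres whose centers are uncorrelated (Poisson distributed) that are not written out in this section: φ1 = exp(−ρ v1) and S2(r) = exp(−ρ v2(r)), where v1 is the volume of one sphere and v2(r) is the union volume of two spheres whose centers are a distance r apart. The function names are assumptions of this example.

```python
import math


def union_volume_two_spheres(r, radius):
    """Union volume of two spheres of equal radius with center separation r."""
    v1 = 4.0 * math.pi * radius ** 3 / 3.0
    if r >= 2.0 * radius:
        return 2.0 * v1  # the spheres do not intersect
    x = r / radius
    return v1 * (1.0 + 0.75 * x - x ** 3 / 16.0)


def matrix_s2_overlapping(r, radius, phi2):
    """Matrix-phase S2(r) for overlapping spheres at sphere volume fraction phi2."""
    v1 = 4.0 * math.pi * radius ** 3 / 3.0
    rho = -math.log(1.0 - phi2) / v1  # number density from phi1 = exp(-rho v1)
    return math.exp(-rho * union_volume_two_spheres(r, radius))


if __name__ == "__main__":
    diameter = 1.0
    for r_over_d in (0.0, 0.25, 0.5, 0.75, 1.0, 2.0):
        s2 = matrix_s2_overlapping(r_over_d * diameter, diameter / 2.0, 0.5)
        print(f"r/D = {r_over_d:4.2f}   S2 = {s2:.4f}")
```

The curve starts at S2(0) = φ1 and decays monotonically to φ1² at r = D, with no oscillations, consistent with the absence of short-range order noted for overlapping spheres.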
4. Homogenization Theory
Homogenization theory is concerned with finding the appropriate homogenized (or averaged, or macroscopic) governing partial differential equations describing physical processes occurring in heterogeneous materials when the length scale of the heterogeneities tends to zero. In such instances, it is desired that the effects of the microstructure reside wholly in the macroscopic or effective properties via certain weighted averages of the microstructure. In its simplest form, the method is based on the consideration of two length scales: the macroscopic scale L, characterizing the extent of the system, and the microscopic scale ℓ, associated with the heterogeneities. Moreover, it is supposed that some external field is applied that varies on a characteristic length scale Λ. The limit of interest for purposes of homogenization is L ≥ Λ ≫ ℓ. Therefore, there is a small parameter ε = ℓ/L associated with rapid fluctuations in the microstructure or local property. Accordingly, the field quantities (e.g., temperature field, electric field, stress field, concentration field, velocity field) depend on two variables: a global or slow variable x and a local or fast variable y = x/ε. The slowly varying parts of the fields are imposed by the source, the boundary conditions, or the initial conditions, while the rapidly varying parts are imposed by the local property or microstructure. These variations are schematically shown in Fig. 10. Under these conditions, a complete analysis of the problem involves three steps:

1. One first sets out to find the form of the homogenized differential equations, valid on length scales O(Λ), by performing an asymptotic expansion of the field quantities in terms of the global and local variables.
2. Next, one must determine the effective properties that arise in the averaged equations as a function of the microstructure.
3. Finally, one must solve the homogenized equations under appropriate boundary or initial conditions.

Figure 10. A schematic depiction of the slow, O(Λ), and rapid, O(ℓ), parts of the field.

Two-scale homogenization theory enables one to show that the effective conductivity tensor σe, effective stiffness tensor Ce, mean survival time τ, and fluid permeability tensor k of the random heterogeneous material in the limit ε → 0 are determined by ensemble averages of local fields that satisfy the appropriate conservation equations, i.e., governing partial differential equations. The reader interested in these derivations is referred to the books by Torquato [8] and Milton [7].
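A one-dimensional toy problem shows what the second step looks like in the simplest possible setting. For steady conduction across a stack of layers, the flux is the same in every layer and the potential drops add, so the stack behaves like resistors in series. The Python sketch below (function names and the unit sample length are assumptions of this example) computes the resulting effective conductivity for a periodic two-phase laminate and compares it with the harmonic mean of the phase conductivities.

```python
def series_conductivity(layers):
    """Effective conductivity of layers crossed in series.

    layers : list of (thickness, conductivity) pairs
    """
    total_thickness = sum(h for h, _ in layers)
    total_resistance = sum(h / sigma for h, sigma in layers)
    return total_thickness / total_resistance


def periodic_laminate(sigma1, sigma2, phi2, n_periods, length=1.0):
    """Layer list for n_periods repetitions of a two-phase unit cell."""
    cell = length / n_periods  # plays the role of the microscopic scale
    return [((1.0 - phi2) * cell, sigma1), (phi2 * cell, sigma2)] * n_periods


if __name__ == "__main__":
    sigma1, sigma2, phi2 = 1.0, 10.0, 0.5
    harmonic = 1.0 / ((1.0 - phi2) / sigma1 + phi2 / sigma2)
    for n_periods in (1, 10, 1000):
        layers = periodic_laminate(sigma1, sigma2, phi2, n_periods)
        print(n_periods, series_conductivity(layers), harmonic)
```

In this special case the effective conductivity equals the harmonic mean for any number of periods, so the microstructure enters only through the volume fraction; in higher dimensions and for random media the effective property depends on much more detailed microstructural information.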
4.1. Conduction Problem
Consider the steady-state transport or displacement of a conservable quantity associated with any of the class A problems that are summarized in Table 1. To fix ideas, we will speak in the language of electrical or thermal conduction, keeping in mind that the results of this section apply as well to the determination of the effective dielectric constant, magnetic permeability, and diffusion coefficient. Each realization ω of the random heterogeneous material that occupies the space V is composed of two phases (phases 1 and 2) having constant conductivity tensors σ1 and σ2 .
4.1.1. Local differential equation
Let J(x) denote the local electric (thermal) current or flux at position x, and let E(x) denote the local field intensity. Under steady-state conditions with no source terms, J is solenoidal and E is irrotational:

$$ \nabla \cdot \mathbf{J}(\mathbf{x}) = 0 \quad \text{in } V, \tag{12} $$

$$ \nabla \times \mathbf{E}(\mathbf{x}) = 0 \quad \text{in } V, \tag{13} $$

for each realization of the ensemble. The latter condition implies the existence of a potential field T, i.e.,

$$ \mathbf{E} = -\nabla T. \tag{14} $$
Thus, E and T represent the electric field (negative of the temperature gradient) and electric potential (temperature) in the electrical (thermal) problem, respectively. We also specify the potential T on the boundary of V.
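4.1.2. Local constitutive relation
The fields J and E are linked by assuming a linear constitutive relation, i.e.,

$$ \mathbf{J}(\mathbf{x}) = \boldsymbol{\sigma}(\mathbf{x}) \cdot \mathbf{E}(\mathbf{x}) \quad \text{in } V, \tag{15} $$

where the local conductivity tensor can be expressed as

$$ \boldsymbol{\sigma}(\mathbf{x}) = \boldsymbol{\sigma}_1 I^{(1)}(\mathbf{x}) + \boldsymbol{\sigma}_2 I^{(2)}(\mathbf{x}), \tag{16} $$

and I^(i)(x) is the indicator function for phase i given by (3).

4.1.3. Averaged constitutive relation
The following ensemble-averaged constitutive relation defines the symmetric second-order effective conductivity tensor:

$$ \langle \mathbf{J}(\mathbf{x}) \rangle = \boldsymbol{\sigma}_e \cdot \langle \mathbf{E}(\mathbf{x}) \rangle. \tag{17} $$

4.2. Elasticity Problem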
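4.2.1. Local differential equations
Let τ(x) and ε(x) denote, respectively, the symmetric local stress and strain tensors at position x. Under steady state without sources, τ(x) and ε(x) obey the elastostatic equations for each realization of the ensemble:

$$ \nabla \cdot \boldsymbol{\tau} = 0 \quad \text{in } V, \tag{18} $$

$$ \nabla \times [\nabla \times \boldsymbol{\varepsilon}]^{T} = 0 \quad \text{in } V. \tag{19} $$

The latter condition implies the existence of a displacement field u, i.e.,

$$ \boldsymbol{\varepsilon}(\mathbf{x}) = \tfrac{1}{2} \left[ \nabla \mathbf{u}(\mathbf{x}) + \nabla \mathbf{u}(\mathbf{x})^{T} \right]. \tag{20} $$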
The superscript T denotes the transpose operation. We also specify the displacement u on the boundary of V.
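4.2.2. Local constitutive relation
We assume that the fields τ and ε are related via a linear constitutive relation, i.e.,

$$ \boldsymbol{\tau}(\mathbf{x}) = \mathbf{C}(\mathbf{x}) : \boldsymbol{\varepsilon}(\mathbf{x}) \quad \text{in } V, \tag{21} $$

where

$$ \mathbf{C}(\mathbf{x}) = \mathbf{C}_1 I^{(1)}(\mathbf{x}) + \mathbf{C}_2 I^{(2)}(\mathbf{x}) \tag{22} $$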
is the local stiffness tensor. Relation (21) is the generalization of Hooke’s law. Here the symbol : denotes the contraction with respect to two indices.
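4.2.3. Averaged constitutive relation
The following ensemble-averaged constitutive relation defines the symmetric fourth-order effective stiffness tensor:

$$ \langle \boldsymbol{\tau}(\mathbf{x}) \rangle = \mathbf{C}_e : \langle \boldsymbol{\varepsilon}(\mathbf{x}) \rangle. \tag{23} $$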
For elastically isotropic media, C e is expressible in terms of two independent effective scalar parameters; for example, the effective bulk modulus K e and the effective shear modulus G e .
4.3. Trapping Problem
Consider the problem of diffusion and reaction among partially absorbing “traps” in each realization ω of the random medium. Let V1 be the region in which diffusion occurs (i.e., trap-free, or pore, region) and let V2 be the trap region. It is important to emphasize that, unlike the previous two problems of conduction and elasticity, there is no local constitutive relation for the mean survival time τ or, equivalently, the trapping constant γ = (τ φ1 D)−1 , where D is the diffusion coefficient. This is also true for the fluid permeability, as will be described below. Thus, the trapping and flow problems (classes C and D problems) are fundamentally different from the conduction and elasticity problems (classes A and B problems); see Table 1.
4.3.1. Local differential equation
The generation rate per unit trap-free volume of a diffusing species is G(x). The scaled concentration field of the reactants u(x) exterior to the partially absorbing traps under steady-state conditions is governed by the Poisson equation:

$$ \Delta u = -1 \quad \text{in } V_1, \tag{24} $$

with the boundary condition at the pore–trap interface given by

$$ D \frac{\partial u}{\partial n} + \kappa u = 0 \quad \text{on } \partial V, \tag{25} $$

where Δ is the Laplacian operator, κ is the surface rate constant, and n is the unit outward normal from the pore space. For infinite surface reaction (κ = ∞), the traps are perfect absorbers, and u = 0 (diffusion-controlled limit). For vanishing surface reaction (κ = 0), the traps are perfect reflectors, and ∂u/∂n = 0 (reaction-controlled limit).
4.3.2. Averaged constitutive relation
The trapping constant γ is defined by the following averaged constitutive relation:

$$ G(\mathbf{x}) = \gamma D\, C(\mathbf{x}), \tag{26} $$

where C(x) is the average concentration field and

$$ \gamma^{-1} = \langle u \rangle = \tau \phi_1 D. \tag{27} $$

4.4. Flow Problem
For each realization ω of the random porous medium, let V1 be the region through which the viscous incompressible fluid with viscosity µ flows (i.e., pore, or void, region) and let V2 be the solid region.
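4.4.1. Local differential equations
The fluid motion satisfies the tensor Stokes equations

$$ \Delta \mathbf{w} = \nabla \boldsymbol{\pi} - \mathbf{I} \quad \text{in } V_1, \tag{28} $$

$$ \nabla \cdot \mathbf{w} = 0 \quad \text{in } V_1, \tag{29} $$

$$ \mathbf{w} = 0 \quad \text{on } \partial V, \tag{30} $$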
where I is the second-order unit tensor. In these equations, the scaled tensor velocity field wi j is the j th component of the velocity due to a unit pressure gradient in the ith direction, and π j is the j th component of the associated scaled pressure.
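4.4.2. Averaged constitutive relation
The fluid permeability tensor k is defined by Darcy’s law:

$$ \mathbf{U}(\mathbf{x}) = -\frac{\mathbf{k}}{\mu} \cdot \nabla p_0(\mathbf{x}), \tag{31} $$

where U(x) is the average velocity field and

$$ \mathbf{k} = \langle \mathbf{w} \rangle. \tag{32} $$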
5. Variational Principles and Bounds
Due to the complexity of the microstructure, there are relatively few situations in which one can evaluate the effective properties of heterogeneous materials exactly. Such rare results are nonetheless quite valuable as benchmarks to test theories and computer simulations [7, 8]. The difficulty in obtaining exact predictions of the effective properties is due to the fact that they generally depend on functionals involving an infinite set of statistical correlation functions that characterize the microstructure [7, 8]. Such complete information is usually not available. Hence, approximation and rigorous bounding techniques have been devised to estimate the effective properties [4, 6–9]. In the absence of exact results for the effective properties, any rigorous statement about them must be in the form of rigorous bounds. Bounds are useful because: (1) they rigorously incorporate nontrivial information about the microstructure via statistical correlation functions and consequently serve as a guide in identifying appropriate statistical descriptors; (2) as successively more microstructural information is included, the bounds become progressively narrower; (3) one of the bounds can provide a relatively sharp estimate of the property for a wide range of conditions, even when the reciprocal bound diverges from it; (4) they are usually exact under certain conditions and can be used to find extremal microstructures; (5) they can be utilized to test the merits of a theory or computer experiment; and (6) they provide a unified framework to study a variety of different effective properties.
5.1. Variational Principles
Different methods exist to derive bounds on effective properties [7], but only variational principles (minimum energy principles) are available to derive bounds on all four different effective properties considered in this article [8]. To illustrate the basic idea, we state the minimum energy principles for the conduction problem without proof.
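5.1.1. Minimum potential energy
Let A_U be the class of trial intensity fields Ê defined by the set

$$ A_U = \{ \text{ergodic } \hat{\mathbf{E}};\ \nabla \times \hat{\mathbf{E}} = 0,\ \langle \hat{\mathbf{E}} \rangle = \langle \mathbf{E} \rangle \}. \tag{33} $$

The actual average energy is bounded from above by the trial average energy, i.e.,

$$ \tfrac{1}{2} \langle \mathbf{E} \rangle \cdot \boldsymbol{\sigma}_e \cdot \langle \mathbf{E} \rangle \le \tfrac{1}{2} \langle \hat{\mathbf{E}} \cdot \boldsymbol{\sigma} \cdot \hat{\mathbf{E}} \rangle \quad \forall\, \hat{\mathbf{E}} \in A_U, \tag{34} $$

where E is curl-free. This principle leads to an upper bound on the effective conductivity.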
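5.1.2. Minimum complementary energy
Let A_L be the class of trial flux fields Ĵ defined by the set

$$ A_L = \{ \text{ergodic } \hat{\mathbf{J}};\ \nabla \cdot \hat{\mathbf{J}} = 0,\ \langle \hat{\mathbf{J}} \rangle = \langle \mathbf{J} \rangle \}. \tag{35} $$

Again, the actual average energy is bounded from above by the trial average energy, i.e.,

$$ \tfrac{1}{2} \langle \mathbf{J} \rangle \cdot \boldsymbol{\sigma}_e^{-1} \cdot \langle \mathbf{J} \rangle \le \tfrac{1}{2} \langle \hat{\mathbf{J}} \cdot \boldsymbol{\sigma}^{-1} \cdot \hat{\mathbf{J}} \rangle \quad \forall\, \hat{\mathbf{J}} \in A_L, \tag{36} $$

where J is solenoidal. This principle leads to a lower bound on the effective conductivity.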
5.2. Variational Bounds
To obtain bounds from the variational principles, one must construct specific trial fields. The simplest trial fields that satisfy the admissibility conditions are constant vectors. In particular, the constant fields ⟨E⟩ and ⟨J⟩ lead to the simple bounds

$$ \langle \boldsymbol{\sigma}^{-1} \rangle^{-1} \le \boldsymbol{\sigma}_e \le \langle \boldsymbol{\sigma} \rangle, \tag{37} $$

where for any local property a, ⟨a⟩ ≡ φ1 a1 + φ2 a2. Thus, the effective conductivity tensor is bounded from above by the arithmetic mean of the phase conductivities and from below by the harmonic mean of the phase conductivities. We refer to these results as one-point bounds, since they involve information up to the level of the volume fraction, which is a one-point correlation function.

The one-point bounds (37) are generally far apart from one another. In order to improve upon them (i.e., to obtain higher-order bounds), one must construct nonuniform trial fields that better reflect the field interactions between the phases. One can systematically generate n-point bounds (bounds involving information up to the level of the relevant n-point correlation function) in this manner [7, 8]. Perhaps the most well-known two-point bounds are the optimal Hashin–Shtrikman bounds on the effective conductivity [7, 8, 20]. The reader is referred to the book of Torquato [8] for a comprehensive discussion of the derivation of n-point bounds on all four effective properties considered in this article.
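To give a feel for how far apart the one-point bounds (37) are and how much the two-point bounds tighten them, the following Python sketch evaluates both for scalar phase conductivities. The Hashin–Shtrikman expressions are not written out in this article; the code uses their standard form for macroscopically isotropic two-phase composites in d dimensions, and the function names and the restriction to isotropic (scalar) phases are assumptions of this example.

```python
def one_point_bounds(sigma1, sigma2, phi2):
    """Harmonic-mean lower bound and arithmetic-mean upper bound, cf. (37)."""
    phi1 = 1.0 - phi2
    harmonic = 1.0 / (phi1 / sigma1 + phi2 / sigma2)
    arithmetic = phi1 * sigma1 + phi2 * sigma2
    return harmonic, arithmetic


def hashin_shtrikman_bounds(sigma1, sigma2, phi2, d=3):
    """Two-point Hashin-Shtrikman bounds for an isotropic two-phase composite."""
    phi1 = 1.0 - phi2

    def bound(sigma_ref, sigma_other, phi_ref, phi_other):
        # sigma_ref is the phase that plays the role of the connected matrix.
        if sigma_ref == sigma_other:
            return sigma_ref
        return sigma_ref + phi_other / (
            1.0 / (sigma_other - sigma_ref) + phi_ref / (d * sigma_ref)
        )

    b1 = bound(sigma1, sigma2, phi1, phi2)
    b2 = bound(sigma2, sigma1, phi2, phi1)
    return min(b1, b2), max(b1, b2)


if __name__ == "__main__":
    sigma1, sigma2, phi2 = 1.0, 10.0, 0.5
    print("one-point bounds:", one_point_bounds(sigma1, sigma2, phi2))
    print("two-point bounds:", hashin_shtrikman_bounds(sigma1, sigma2, phi2))
```

For the conductivity contrast and volume fraction in the usage example, the two-point bounds lie strictly inside the one-point bounds, which illustrates the progressive narrowing described in Section 5.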
6. Evaluation of Property Bounds
Various two-, three-, and four-point bounds on effective properties have been evaluated for a number of different model microstructures. These evaluations require knowledge of the relevant n-point correlation functions, which were discussed in Section 3. To illustrate the predictive capability of bounds, we graphically depict them in Figs. 11 and 12 for each effective property for simple models of distributions of identical spheres in a matrix or fluid phase. All of these results are taken from Ref. [8]. Figure 11 compares two- and three-point bounds to corresponding simulation data and a three-point approximation for the effective conductivity of superconducting hard spheres (σ2 /σ1 = ∞) in equilibrium. Although the corresponding upper bounds diverge to infinity, the three-point lower bound provides a good estimate of σe because the particles do not cluster. It is noteworthy that the three-point approximation is quite accurate. The same figure includes a comparison of bounds to experimental data of the effective shear modulus G e for random equilibrium arrays of identical glass spheres in an epoxy matrix. The three-point bounds provide significant improvement over the two-point bounds, and the data lie closer to the three-point lower bound. Figure 12 compares a two-point interfacial surface upper bound and pore-size lower bound to simulation data for the scaled mean survival time τ D/R 2 versus trap volume fraction φ2 for three-dimensional fully overlapping spherical traps of radius R in the case of perfectly absorbing traps (κ ∗ = κ R/D = ∞). This figure also provides a comparison of bounds on the scaled inverse fluid permeability (resistance) ks /k for equilibrium arrays of identical
Figure 11. Left panel: Two- and three-point estimates of the scaled conductivity σe/σ1 versus particle volume fraction φ2 at σ2/σ1 = ∞ for random arrays of identical hard spheres in equilibrium; the two-point lower bound, three-point lower bound, and three-point approximation are shown together with simulation data [8]. Right panel: Comparison of two- and three-point bounds on the scaled shear modulus Ge/G1 versus particle volume fraction φ2 to experimental data for random equilibrium arrays of identical glass spheres in an epoxy matrix (G2/G1 = 28.5, G1/K1 = 0.228, G2/K1 = 0.66) [8].

Figure 12. Left panel: Comparison of upper and lower bounds (solid curves) to simulation data for the scaled mean survival time τD/R² versus trap volume fraction φ2 for fully overlapping spherical traps of radius R in the case of perfect absorption (κ* = κR/D = ∞) [8]. Right panel: Comparison of bounds and estimates (two-point bound, three-point bound, cross-property bound, and Kozeny equation) on the scaled inverse fluid permeability (resistance) ks/k versus sphere volume fraction φ2 for equilibrium arrays of identical nonoverlapping spheres [8].
three-dimensional nonoverlapping spheres of radius R versus sphere volume fraction φ2, where ks = 2R²/(9φ2). Included in this figure are a cross-property bound and the empirical Kozeny relation, the latter of which provides a rough estimate of experimental data.
7. Summary
An effective property of a random heterogeneous material is a functional of the relevant local fields weighted with certain correlation functions that statistically characterize the structure. Generally, the type of correlation function involved depends on the specific physical problem that one studies. However, for certain classes of materials, it has been shown that all of the apparently different types of correlation functions can be obtained from a canonical function Hn and, consequently, can be shown to be related to one another. Thus, seemingly different effective properties are indeed related to one another via so-called cross-property relations [8]. Such a unified approach to studying effective properties of random heterogeneous materials is both natural and very powerful. In general, an infinite amount of microstructural information is required to determine an effective property exactly. In practice, therefore, lower-order n-point estimates (approximations or bounds) of the effective properties have been derived that often provide accurate predictions. Nonetheless, many challenges remain. For example, a systematic means of incorporating into structure–property relations important topological information, such as connectedness of the phases, in a nontrivial manner has not been accomplished to date.
There were many important theoretical topics that we were not able to cover in this article. Some of these topics include: (1) percolation theory [8, 9, 21, 22]; (2) image analysis of microstructures [8]; (3) exact solutions for effective properties [7, 8]; (4) analytical approximations for effective properties [7–9]; (5) property optimization [6, 7]; and (6) cross-property relations [7, 8], which link different effective properties.
References [1] J.C. Maxwell, Treatise on Electricity and Magnetism, Clarendon Press, Oxford, 1873. [2] L. Rayleigh, “On the influence of obstacles arranged in a rectangular order upon the properties of medium,” Phil. Mag., 34, 481–502, 1892. [3] A. Einstein, “Eine neue bestimmung der molek¨uldimensionen,” Ann. Phys., 19, 289– 306, 1906. [4] R.M. Christensen, Mechanics of Composite Materials, Wiley, New York, 1979. [5] V.V. Jikov, S.M. Kozlov, and O.A. Olenik, Homogenization of Differential Operators and Integral Functionals, Springer-Verlag, Berlin, 1994. [6] A.V. Cherkaev, Variational Methods for Structural Optimization, Springer-Verlag, New York, 2000. [7] G.W. Milton, The Theory of Composites, Cambridge University Press, Cambridge, England, 2002. [8] S. Torquato, Random Heterogeneous Materials: Microstructure and Macroscopic Properties, Springer-Verlag, New York, 2002. [9] M. Sahimi, Heterogeneous Materials I: Linear Transport and Optical Properties, Springer-Verlag, New York, 2003. [10] S. Torquato, “Microstructure characterization and bulk properties of disordered two-phase media,” J. Stat. Phys., 45, 843–873, 1986. [11] J.P. Hansen and I.R. McDonald, Theory of Simple Liquids, Academic Press, New York, 1986. [12] R. Zallen, The Physics of Amorphous Solids, Wiley, New York, 1983. [13] W.B. Russel, D.A. Saville, and W.R. Schowalter, Colloidal Dispersions, Cambridge University Press, Cambridge, England, 1989. [14] J.D. Bernal, “The geometry of the structure of liquids,” In: T. J. Hughel (ed.), Liquids: Structure, Properties, Solid Interactions, Elsevier, New York, pp. 25–50, 1965. [15] S. Torquato, T.M. Truskett, and P.G. Debenedetti, “Is random close packing of spheres well defined?,” Phys. Rev. Lett., 84, 2064–2067, 2000. [16] S. Torquato and F.H. Stillinger, “Multiplicity of generation, selection, and classification procedures for jammed hard-particle packings,” J. Phys. Chem. B, 105, 11849– 11853, 2001. [17] A.R. Kansal, S. Torquato, and F.H. 
Stillinger, “Diversity of order and densities in jammed hard-particle packings,” Phys. Rev. E, 66, 041109, 1–8, 2002a. [18] A.R. Kansal, S. Torquato, and F.H. Stillinger, “Computer generation of dense polydisperse sphere packing,” JCP, 117, 8212–8218, 2002b. [19] A. Donev, I. Cisse, D. Sachs, E.A. Variano, F.H. Stillinger, R. Connelly, S. Torquato, and P.M. Chaikin, Improving the density of jammed disordered packings using ellipsoids, Science, 303, 990–993, 2004.
[20] Z. Hashin and S. Shtrikman, “A variational approach to the theory of the effective magnetic permeability of multiphase materials,” J. Appl. Phys., 33, 3125–3131, 1962. [21] G. Grimmet, Percolation, Springer-Verlag, New York, 1989. [22] D. Stauffer and A. Aharony, Introduction to Percolation Theory, Taylor and Francis, London, 1992.
4.6 MODERN INTERFACE METHODS FOR SEMICONDUCTOR PROCESS SIMULATION J.A. Sethian Department of Mathematics, University of California, Berkeley, CA, USA
The manufacture of semiconductor devices may include dozens of process steps, all delicately choreographed to produce a functioning, reliable, and efficient device. These steps, such as photolithography, etching, and deposition, act to shape and mold the device, replete with various metals, insulators, and interconnects. As one might guess, a trial-and-error approach to determining a repeatable and reliable recipe is expensive. Numerical simulations that capture the essential details of these processes have a valuable role to play. Understanding and predicting how devices can be effectively built is complicated by rapid advances in the manufacturing process. Smaller and smaller devices have now moved far away from continuum equations for transport, etching, and deposition, and now rely on such discrete effects as individual atom surface physics. Adequate equations and models for these phenomena are themselves subjects of great debate. Until recently, these modeling difficulties were matched by numerical difficulties associated with accurately tracking the evolution of material boundaries and interfaces under complex physics. Some of the difficulties included trying to capture the formation of sharp corners, topological changes that lead to void formation and shadow zones, the delicate dependence of profile evolution on interface geometry, and the sheer programming difficulty of accurately representing and capturing three-dimensional motion. These computational obstacles masked the unresolved modeling issues inherent in the move towards smaller and smaller process scales. In recent years, there has been significant advancement in numerical methods to track propagating interfaces evolving in complex situations. These techniques rely on an Eulerian, implicit embedding view of an interface discussed below; briefly, the interface is embedded as a particular level set of a higher-dimensional function, and it is that latter function that does all the work. Fortunately, this exchange is well worth the trade: topological changes, numerical robustness, high accuracy, and straightforward programming come at little cost. The resulting numerical techniques, known as level set methods and narrow band level set methods, have been incorporated into a wide collection of TCAD and semiconductor process simulation codes.
S. Yip (ed.), Handbook of Materials Modeling, 1359–1369. © 2005 Springer. Printed in the Netherlands.
1. Physical Effects and Background
The goal of numerical simulations in microfabrication is to model the process by which silicon devices are manufactured. Here, we briefly summarize some of the physical processes. This material is discussed in more detail in “Evolution and Implementation of Level Set Methods” [1], and much of this description is taken from that source. First, a single crystal ingot of silicon is extracted from molten pure silicon. This silicon ingot is then sliced into several hundred thin wafers, each of which is then polished to a smooth finish. A thin layer of crystalline silicon is then oxidized, a light-sensitive “photoresist” is applied, and the wafer is then covered with a pattern mask that shields part of the photoresist. This pattern mask contains the layout of the circuit itself. Under exposure to light or an electron beam, the exposed photoresist polymerizes and hardens, leaving an unexposed material that is then etched away in a dry etch process, revealing a bare silicon dioxide layer. Ionized impurity atoms such as boron, phosphorus, and argon are then implanted into the pattern of the exposed silicon wafer, and silicon dioxide is deposited at reduced pressure in a plasma discharge from gas mixtures at a low temperature. Finally, thin films such as aluminum are deposited by processes such as plasma sputtering, and contacts to the electrical components and component interconnections are established. The result is a device that carries the desired electrical properties. These processes produce considerable changes in the surface profile as it undergoes various effects of etching and deposition. This problem is known as the “surface topography problem” in microfabrication and is controlled by a large collection of physical events.
Our central concerns are etching and deposition processes, in which the rate of change of the surface profile depends on such factors as the visibility of the etching/deposition source from each point of the evolving profile, reemission of particles, surface diffusion along the front, complex flux laws that produce faceting, shocks and rarefactions, material-dependent discontinuous etch rates, and masking profiles. The underlying physics and chemistry that contribute to the motion of the interface profile are very much areas of active research. Nonetheless, once empirical models are formulated, the problem ultimately becomes one of tracking an interface moving under a given speed function. This problem occurs in a wide collection of physical phenomena, including ocean waves, fluid mixing, combustion, crystal growth and dendritic solidification, and secondary oil recovery. Interestingly, many non-physical problems can also be cast in this setting, including problems in shape segmentation, optimal structural design, and inverse reconstruction.
2. Algorithmic Requirements for Tracking Interfaces
Abstractly, the goal is to devise numerical algorithms to track moving interfaces. For the moment, imagine an interface moving in a two-dimensional domain; for simplicity, we can think of a curve moving in the plane, and at each point of the curve and at any time, we can query a function that gives us the speed of that point on the curve. For further simplicity, we shall call this speed function F, whose arguments may include the position (x, y), the time t, and a collection of other factors, including the shape of the interface and related physics on and off the interface. We seek a representation of this “interface” which allows us to update its position by repeatedly querying this function. Possibly the most straightforward such discretization is to simply consider the curve as a collection of linked marker nodes, whose positions at any time reveal the interface. At any time, one can query the speed function F to find the velocity of each node, and then update the node position to obtain the new location of the front. In Fig. 1, we show the idea of a discrete marker representation. On the left, a circle expanding with unit speed in the normal direction is discretized into a set of linked markers. However, the middle figure shows the first difficulty: for shapes with sharp corners, the normal direction is not defined; furthermore, for non-convex shapes, the moving markers may overlap. The figure on the right shows a greater difficulty: two evolving fronts may change topology as they intersect. The correct solution is the expanding envelope of the two shapes; however, it is difficult to determine which markers rightly belong on the new front. While algorithmic solutions to these problems exist, they become more complex and challenging in representing three-dimensional evolutions. In semiconductor process modeling, these effects are of great importance: fluxes and visibilities delicately depend on accurate calculation of normals, topological changes, and construction of an intact front.
Figure 1. Marker particle interface discretization. Panels: marker discretization (left), collision of normals (middle), topological change (right).
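The marker representation described above can be illustrated with a minimal sketch for the benign case in the left panel of Fig. 1, a circle expanding with unit speed. The function name `advance_markers`, the marker count, and the time step are illustrative assumptions and are not taken from the text; the sketch makes no attempt to handle corners or topological change, which are exactly the cases where this approach runs into trouble.

```python
import numpy as np

def advance_markers(points, speed, dt):
    """Move linked markers on a closed, counter-clockwise curve one step along the normal."""
    # Central difference over the two neighbouring markers approximates the tangent.
    tangent = np.roll(points, -1, axis=0) - np.roll(points, 1, axis=0)
    tangent /= np.linalg.norm(tangent, axis=1, keepdims=True)
    # Rotating the tangent by -90 degrees gives the outward normal of a counter-clockwise curve.
    normal = np.column_stack((tangent[:, 1], -tangent[:, 0]))
    # Query the speed function F at each marker and move the marker along its normal.
    return points + dt * speed(points)[:, None] * normal

# Unit circle discretized into 200 linked markers (illustrative resolution).
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
markers = np.column_stack((np.cos(theta), np.sin(theta)))

def unit_speed(p):
    """Hypothetical speed function F = 1 everywhere on the front."""
    return np.ones(len(p))

dt, steps = 0.01, 50
for _ in range(steps):
    markers = advance_markers(markers, unit_speed, dt)

radii = np.linalg.norm(markers, axis=1)
print(radii.min(), radii.max())
```

For this smooth convex front every marker simply moves radially, so all markers should remain on a circle whose radius grows by `dt` per step. A shape with a sharp corner or two fronts that collide would require additional logic to delete or re-link markers.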
3. Level Set Methods
Level set methods, introduced by Osher and Sethian [2] (“Fronts Propagating with Curvature-Dependent Speeds”), were devised to accurately track interfaces evolving under a variety of complex speed laws in two and three dimensions. They rely in part on the theory of curve and surface evolution given by Sethian (“Curvature and the Evolution of Fronts”) [3] and on the link between front propagation and hyperbolic conservation laws given by Sethian [4] (“Numerical Methods for Propagating Fronts”). They recast interface motion as a time-dependent Eulerian initial value partial differential equation, and rely on viscosity solutions to the appropriate differential equations to update the position of the front, using an interface velocity that is derived from the relevant physics both on and off the interface. These viscosity solutions are obtained by exploiting schemes from the numerical solution of hyperbolic conservation laws. Level set methods are specifically designed for problems involving topological change, dependence on curvature, formation of singularities, and a host of other issues that often appear in interface propagation techniques. The fundamental idea is easily explained. For simplicity, we focus on a one-dimensional front propagating in two space dimensions. Let Γ(s, t = 0) be a simple closed curve lying in the plane; here, s parameterizes the curve and t = 0 corresponds to the initial time. We embed this curve as the zero level set φ(x, y, t = 0) = 0 of a function φ from R² × [0, ∞) to R. Thus, in R³, φ(x, y, t = 0) is a surface whose zero level set in the xy-plane corresponds to the initial position of the front Γ(s, t = 0). Suppose this curve is propagating in its normal direction with a given speed F. We then seek an initial value partial differential equation for the evolution of the level set function φ(x, y, t) such that at any given time, the zero level set φ(x, y, t) = 0 corresponds to the new position of the front. In Fig. 2, we show an initial level set function for a circle centered at the origin. We now have two goals: first, to determine a good way of choosing the initial level set function φ(x, y, t = 0), and second, to determine an initial value PDE which transports this level set function such that the zero level set always corresponds to the evolving front. For an initial choice, one straightforward option is the signed distance function, namely
$$\phi(x, y, t = 0) = \pm d(x, y), \tag{1}$$
Figure 2. Initial front and initial level set function.
where d(x, y) is the distance from the point (x, y) to the given initial curve Γ, and the sign is chosen as positive (negative) if (x, y) is outside (inside) Γ. Finally, to derive the initial value PDE, a straightforward application of the multidimensional chain rule [1, 2] produces the initial value PDE
$$\phi_t + F|\nabla\phi| = 0. \tag{2}$$
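The chain-rule step can be sketched as follows; the trajectory notation $\mathbf{x}(t)$ is introduced here for illustration and does not appear in the text. Let $\mathbf{x}(t)$ be the path of a point that stays on the front, so that $\phi(\mathbf{x}(t), t) = 0$ for all $t$. Differentiating in time gives

$$\phi_t + \nabla\phi \cdot \mathbf{x}'(t) = 0.$$

Motion in the normal direction with speed $F$ means $\mathbf{x}'(t) \cdot \mathbf{n} = F$ with $\mathbf{n} = \nabla\phi / |\nabla\phi|$, so that

$$\nabla\phi \cdot \mathbf{x}'(t) = |\nabla\phi| \, \mathbf{n} \cdot \mathbf{x}'(t) = F|\nabla\phi|,$$

and substituting into the first relation recovers Eq. (2).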
This is the level set equation. The advantages of this approach are immediately apparent:
• Topological changes occur automatically, since the level set function φ itself is always a graph, regardless of the connectedness of the zero level set corresponding to the interface.
• The determination of the normal and curvature, which may be required to accurately evaluate the speed function F, is easily approximated through derivative operators applied to the level set function, that is,
$$\mathbf{n} = \frac{\nabla\phi}{|\nabla\phi|} \quad\text{and}\quad \kappa = \nabla \cdot \frac{\nabla\phi}{|\nabla\phi|}.$$
• The formulation is unchanged in higher dimensions, leading to a straightforward perspective on complex interface motion.
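As an illustration of the second point, the following minimal sketch builds the signed distance function of Eq. (1) for a unit circle and approximates the normal and curvature with central differences. The grid size, the use of `numpy.gradient`, and the small regularization constant are assumptions made for this example only.

```python
import numpy as np

# Uniform grid on [-2, 2] x [-2, 2]; the resolution is an illustrative choice.
n = 201
x = np.linspace(-2.0, 2.0, n)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")

# Signed distance to a circle of radius R0: negative inside, positive outside, as in Eq. (1).
R0 = 1.0
phi = np.sqrt(X**2 + Y**2) - R0

# Normal n = grad(phi) / |grad(phi)|, using central differences in the interior.
phi_x, phi_y = np.gradient(phi, h)
grad_norm = np.sqrt(phi_x**2 + phi_y**2) + 1e-12   # guard against division by zero at the center
nx, ny = phi_x / grad_norm, phi_y / grad_norm

# Curvature kappa = div(n).
kappa = np.gradient(nx, h, axis=0) + np.gradient(ny, h, axis=1)

# Grid points within one cell of the front; the exact curvature there is close to 1 / R0.
near_front = np.abs(phi) < h
print(kappa[near_front].min(), kappa[near_front].max())
```

With this sign convention the normal points outward and the curvature of a circle is positive. The same three lines of differencing apply unchanged to any level set function on the grid, which is the practical content of the remark above.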
4. Numerical Approximations to the Level Set Equation
Equation (2) formulates interface propagation as an initial value PDE in which the speed F in the normal direction is considered as supplied. Of course, in realistic physical situations, the speed function F depends on a host of influences, including front geometry, parabolic and elliptic equations on either side of the interface with subtle boundary conditions, and additional factors. Nonetheless, from an operator splitting perspective, once both the position of the front (that is, the level set function φ) and its normal speed F are known at the beginning of a time step, the next job is to update the level set function itself. The subtlety in providing an accurate and robust discretization of Eq. (2) stems from the fact that the interface (and hence the associated level set function) need not be differentiable. Hence, care must be taken in evaluating the gradient operator |∇φ|. Osher and Sethian [2] introduced a high order upwind finite difference scheme for Hamilton–Jacobi equations of this form. The idea (see Ref. [4] for motivation) is to move the considerable numerical technology built for hyperbolic conservation laws to Hamilton–Jacobi equations, which in a very rough sense can be thought of as their integrated form. By doing so, viscosity solutions are automatically selected which accurately capture the evolution of corners by considering the limiting process of vanishing viscosity. One of the simplest forms for such a scheme (see Ref. [2]) is given by
$$\phi_{ijk}^{n+1} = \phi_{ijk}^{n} - \Delta t \left[ \max(F_{ijk}, 0)\,\nabla^{+} + \min(F_{ijk}, 0)\,\nabla^{-} \right], \tag{3}$$
where
$$\nabla^{+} = \left[ \max(D_{ijk}^{-x}, 0)^2 + \min(D_{ijk}^{+x}, 0)^2 + \max(D_{ijk}^{-y}, 0)^2 + \min(D_{ijk}^{+y}, 0)^2 + \max(D_{ijk}^{-z}, 0)^2 + \min(D_{ijk}^{+z}, 0)^2 \right]^{1/2},$$
$$\nabla^{-} = \left[ \max(D_{ijk}^{+x}, 0)^2 + \min(D_{ijk}^{-x}, 0)^2 + \max(D_{ijk}^{+y}, 0)^2 + \min(D_{ijk}^{-y}, 0)^2 + \max(D_{ijk}^{+z}, 0)^2 + \min(D_{ijk}^{-z}, 0)^2 \right]^{1/2}.$$
Here, we have used standard finite difference notation; for example, D^{-x}_{ijk} and D^{+x}_{ijk} denote the backward and forward difference approximations to ∂φ/∂x at grid point (i, j, k). Higher order schemes are available [2]. In this formulation, the entire level set function is updated on a computational mesh throughout the physical domain. This means that all the level sets are updated, not just the zero level set corresponding to the interface itself. The central advance that made these methods computationally competitive with other techniques came through the use of an adaptive “Narrow Band Level Set Method”, introduced by Adalsteinsson and Sethian [5] (“A Fast Level Set Method for Propagating Interfaces”); the idea of doing so was first discussed in Ref. [6]. In this approach, one works only in a neighborhood (or “narrow band”) of the zero level set, repeatedly rebuilding the signed distance function in an updated narrow band once the front has moved towards the edge of the domain.
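A minimal two-dimensional sketch of the first-order upwind update, Eq. (3) with the z terms dropped, is given below. The function name `upwind_step`, the edge padding used at the domain boundary, the grid, and the time step are assumptions made for illustration; the full mesh is updated at every step, so this is not a narrow band implementation.

```python
import numpy as np

def upwind_step(phi, F, h, dt):
    """One first-order upwind step for phi_t + F |grad phi| = 0 on a uniform 2D grid."""
    # Edge padding copies boundary values, so one-sided differences vanish at the boundary.
    p = np.pad(phi, 1, mode="edge")
    c = p[1:-1, 1:-1]
    Dmx = (c - p[:-2, 1:-1]) / h    # backward difference in x
    Dpx = (p[2:, 1:-1] - c) / h     # forward difference in x
    Dmy = (c - p[1:-1, :-2]) / h    # backward difference in y
    Dpy = (p[1:-1, 2:] - c) / h     # forward difference in y
    grad_plus = np.sqrt(np.maximum(Dmx, 0)**2 + np.minimum(Dpx, 0)**2
                        + np.maximum(Dmy, 0)**2 + np.minimum(Dpy, 0)**2)
    grad_minus = np.sqrt(np.maximum(Dpx, 0)**2 + np.minimum(Dmx, 0)**2
                         + np.maximum(Dpy, 0)**2 + np.minimum(Dmy, 0)**2)
    return phi - dt * (np.maximum(F, 0) * grad_plus + np.minimum(F, 0) * grad_minus)

# Circle of radius 0.5 expanding with unit normal speed (illustrative test problem).
n = 101
x = np.linspace(-2.0, 2.0, n)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")
phi = np.sqrt(X**2 + Y**2) - 0.5
F = np.ones_like(phi)

dt, steps = 0.25 * h, 50            # final time t = steps * dt = 0.5
for _ in range(steps):
    phi = upwind_step(phi, F, h, dt)

# Locate the front on the positive x-axis by linear interpolation of the zero crossing.
line = phi[n // 2:, n // 2]
k = int(np.argmax(line > 0))
r_front = x[n // 2 + k - 1] + h * (-line[k - 1]) / (line[k] - line[k - 1])
print(r_front)
```

The exact front at the final time is a circle of radius 1, so `r_front` should be close to 1 up to the first-order discretization error. Because F is positive everywhere in this example only the ∇⁺ branch is active; the ∇⁻ branch is exercised by an etching-type problem with negative F.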
Figure 3. Grid points in dark area are members of narrow band.
There is one final issue. In most physical problems, the interface speed is defined on the front itself, yet the above level set formulation requires that a speed function F be defined throughout the narrow band/computational grid. Thus, a mechanism is required to extend the velocity from the front itself throughout the computational region. There are a variety of ways to do so; one is to solve an associated PDE for this extended velocity based on boundary values on the front itself. A particularly fast way, introduced by Adalsteinsson and Sethian (“The Fast Construction of Extension Velocities in Level Set Methods” [7]), comes from using a variant of the fast marching method [8]. For details on narrow band level set methods, upwind schemes for interface evolution, reinitialization techniques and the construction of extension velocities, and general aspects of level set methods for interface propagation, see Refs. [1, 2, 5, 7, 8] (Fig. 3).
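To make the idea of a narrow band and an extended speed concrete, the sketch below marks the grid points within a few cells of the front and assigns to each of them the speed of the closest point on the front. This brute-force closest-point search is a stand-in chosen for brevity; it is not the fast marching construction of Ref. [7]. The band width, the sampled circular front, and the speed law `front_speed` are hypothetical choices for the example.

```python
import numpy as np

# Grid and signed distance function for a unit circle (illustrative setup).
n = 101
x = np.linspace(-2.0, 2.0, n)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")
phi = np.sqrt(X**2 + Y**2) - 1.0

# Narrow band: grid points within three cells of the zero level set.
band = np.abs(phi) < 3.0 * h
band_pts = np.column_stack((X[band], Y[band]))

# Front samples and a speed known only on the front (hypothetical speed law).
theta = np.linspace(0.0, 2.0 * np.pi, 720, endpoint=False)
front_pts = np.column_stack((np.cos(theta), np.sin(theta)))
front_speed = 1.0 + 0.5 * np.cos(theta)

def extend_speed(points, front_pts, front_speed):
    """Give each point the speed of its nearest front sample (brute-force search)."""
    d2 = ((points[:, None, :] - front_pts[None, :, :])**2).sum(axis=2)
    return front_speed[np.argmin(d2, axis=1)]

# Extended speed is stored only inside the band; values outside are never used.
F_ext = np.zeros_like(phi)
F_ext[band] = extend_speed(band_pts, front_pts, front_speed)
print(band.sum(), "of", phi.size, "grid points lie in the narrow band")
```

For the circle, the closest front point lies along the same ray from the origin, so the extended speed is constant along normals to the front. In a production code the front would be located from φ itself rather than from a known parameterization, and the quadratic-cost search would be replaced by a faster construction.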
5. Application to TCAD: Etching and Deposition in Semiconductor Processing
Narrow band level set methods for semiconductor etching and deposition simulations were introduced by Adalsteinsson and Sethian in a series of papers [9–12]. Details about computing the effects of visibility, reemission and redeposition, complex flux laws, and material-dependent etch rates may be found there. Here, we review a few applications, comparing simulations with experiment and analyzing various aspects of surface thin film physics. All the simulations are performed using TERRAIN∗, a commercial version of these techniques built by Technology Modeling Associates and specifically designed for process simulation.
Figure 4. Ion-milling: experiment (top) vs. simulation (bottom).
5.1. Ion Milling
We begin with a comparison with experiment of an ion-milling process. Figure 4 shows an experiment on the top and a simulation at the bottom. We note that both the simulation and the experiment show the crossing non-convex curves on top of the structures, the sharp points, and the sloping sides.
5.2. Plasma-enhanced Chemical Vapor Deposition
Next, we compare plasma-enhanced chemical vapor deposition (PECVD) simulations with a series of experiments. First, two smaller structure calculations are used to verify the ability to match experiment. Figures 5 and 6 show these results. Figures 7 and 8 show more simulations for more complex structures.
5.3. SRAM Simulations
Finally, we show SRAM comparisons between experiment and simulations for large structures (Fig. 9). We show the original layout together with the actual pattern printed through photolithography, followed by the sequential processing steps.
∗ We thank Juan Rey, Brian Li, and Jiangwei Li for providing these results.
Figure 5. PECVD, small-scale structure: experiment (left) vs. simulation (right).
Figure 6. PECVD, small-scale structure: experiment (left) vs. simulation (right).
Figure 7. PECVD: experiment (top) vs. simulation (bottom).
Figure 8. PECVD: experiment (top) vs. simulation (bottom).
Figure 9. SRAM simulation: experiment and simulation. Panels: original layout; printed layout; simulation steps one through four.
6. Discussion and Outlook
The use of Eulerian partial-differential-equations-based techniques for tracking moving interfaces carries some intrinsic advantages: topological changes are handled cleanly, geometric properties such as normals and curvature are accurately evaluated, and formulations carry forward to three dimensions with little change. In semiconductor simulations, some of the more challenging issues with this approach include material assignment, complexities at triple points, maintenance of sharp corners and fast evaluations of fluxes. Ultimately, the goal is to couple these simulations with diffusion solvers, stress analysis, and additional physics to make full process simulators. This work is currently underway.
References [1] J. Sethian, Level Set Methods and Fast Marching Methods, 2nd edn. Cambridge University Press, Cambridge, 1999. [2] S. Osher and J. Sethian, “Fronts propagating with curvature-dependent speeds: algorithms based on Hamilton–Jacobi formulations,” J. Comput. Phys., 79, 12–49, 1988. [3] J. Sethian, “Curvature and the evolution of fronts,” Commun. Math. Phys., 101, 489–499, 1985. [4] J. Sethian, “Numerical methods for propagating fronts,” In: P. Concus and R. Finn (eds.), Variational Methods for Free Surface Interfaces, Springer-Verlag, New York, pp. 66–80, 1987. [5] D. Adalsteinsson and J. Sethian, “A fast level set method for propagating interfaces,” J. Comput. Phys., 118, 269–277, 1995. [6] D. Chopp, “Computing minimal surfaces via level set curvature flow,” J. Comput. Phys., 106, 77–91, 1993. [7] D. Adalsteinsson and J. Sethian, “The fast construction of extension velocities in level set methods,” J. Comput. Phys., 148, 2–22, 1999. [8] J. Sethian, “Fast marching methods,” SIAM Rev., 41, 2–22, 1999. [9] D. Adalsteinsson and J. Sethian, “A unified level set approach to etching, deposition and lithography. I. Algorithms and two-dimensional simulations,” J. Comput. Phys., 120, 128–144, 1995. [10] D. Adalsteinsson and J. Sethian, “A unified level set approach to etching, deposition and lithography. II. Three-dimensional simulations,” J. Comput. Phys., 122, 348–366, 1995. [11] D. Adalsteinsson and J. Sethian, “A unified level set approach to etching, deposition and lithography. III. Complex simulations and multiple effects,” J. Comput. Phys., 138, 193–223, 1997. [12] J. Sethian and D. Adalsteinsson, “An overview of level set methods for etching, deposition, and lithography development,” IEEE Trans. Semiconductor Devices, 10, 167–184, 1996.
4.7 COMPUTING MICROSTRUCTURAL DYNAMICS FOR COMPLEX FLUIDS Michael J. Shelley and Anna-Karin Tornberg Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
1. Introduction
Complex liquids such as polymeric or fiber suspensions are rich in interesting phenomena and, as an area of scientific inquiry, sit astride materials science and fluid dynamics. Materials science enters because such fluids are a form of composite material with micro-structure, while fluid dynamics enters because complex liquids are, well, fluidic. Computation plays a large role in their study not only at the continuum level, which is not a finished business anyway, but also at the microscopic level where the microstructure can have non-trivial dynamics in response to simple forcing flows.
2. Background
Flows in nature and engineering often acquire their interesting aspects through the presence of immersed elastic objects and their interaction with the fluid. Fish, tree leaves, flagella, and rigid polymers all come to mind. A very important special case is when the elastic bodies are microscopic and filamentary. For example, flexible fibers make up the micro-structure of suspensions that show strongly non-Newtonian bulk behavior, such as elasticity, shear-thinning, and normal stresses in shear flow [13, 23]; micro-organisms utilize for locomotion the anisotropic drag properties of their long flexible flagella [7]; rigid biopolymers such as actin and tubulin mediate many intracellular division and transport processes, and dynamic rheological measurements are used to probe their biophysical properties [11]. The dynamics of flexible filaments are also relevant to understanding soft materials: liquid crystal phase transitions lead to the study of “soft” growing filaments in a smectic-A phase [26, 33, 34], while solutions of wormlike micelles have very complicated and interesting macroscopic behavior [6, 19, 23], perhaps due to their susceptibility to shear-induced breakage. In all these problems, the filaments have large aspect ratios (length over radius), ranging from order 10 to a thousand for natural to synthetic fibers, and up to many thousands in biological settings. Clearly we are considering examples where the inertia of both fluid and filament can be neglected, i.e., very low Reynolds number flows, for which the fluid dynamics is described by the Stokes equations. The Stokes equations are linear, and time enters only as a parameter, leading to the celebrated reversibility of Stokes flow [2]. However, this reversibility is broken by surface forces such as those induced by bending rigidity, and simple forcing flows can lead to very non-trivial dynamics. Here, we will also neglect the complicating effects of thermal fluctuations of a surrounding solvent, and focus on the cleaner case where both the surrounding fluid and the rod can be described by classical continuum mechanics.
As a simple case, consider a plane shear flow. A straight rod placed in this flow will rotate and translate with the fluid. However, as Fig. 1 shows, the dynamics can be very different when the rod is flexible. As the strength of the shear flow increases relative to the bending rigidity of the filament, there is a sharp bifurcation beyond which the filament is unstable to buckling [3, 37] and small shape perturbations can grow into substantial bending of the filament. This stores elastic energy in the fiber which can later be released back to the system as the fiber is extended, and is related to the anomalous stresses that elastic fluids can develop, such as normal stress differences that push apart bounding walls in linear shear experiments [23]. The dynamics of the first normal stress difference (N1, the one responsible for the above effect) is also plotted in Fig. 1, as is the normal stress difference for a straight filament (dashed line). The first normal stress difference is zero in the absence of the filament, and is zero in temporal mean for a rigid filament. For a filament that bends, the symmetry of the first normal stress difference that holds for a straight filament is broken, and the integrated normal stress difference now yields a positive net contribution [3, 37].
Thus, the dynamics shows a surprising richness even for a single filament, and for suspensions there is much that is still not well understood. It is worth noting that while experiments capture such sharp changes in fluidic response, continuum theories generally do not. Indeed, here the change lies in the degrees of freedom available to the microscopic fiber. Ultimately, one would like to have a macroscopic model for such suspensions, and eliminate the need for computer simulations to resolve the micro-structure. Indeed, a greater understanding of these flows is needed in order to develop such models, and this requires both experiments and numerical simulations. While in experiments it is natural to study suspensions of fibers, it is difficult to isolate one or a few fibers and understand their individual dynamics. Numerical simulations offer the ability to obtain data for each fiber in great detail. The main challenge for a numerical method lies in its ability to include many filaments in the simulation, at a reasonable cost, while maintaining accuracy. Indeed, since the details of the micro-structural dynamics are important to the collective behavior of the system, a relatively large number of filaments is likely needed before computed average quantities can be considered representative.
Figure 1. Upper figure: Substantial buckling occurs for a flexible filament in a plane background shear flow, $\mathbf{U}_0 = (\dot{\gamma} y, 0, 0)$ (snapshots at t = 0, 48.64, 49.664, and 50.688). Lower figure: The first normal stress difference plotted as a function of time. For comparison, the dashed line indicates the first normal stress difference for a straight filament. From Tornberg and Shelley [37], with permission.
S. Yip (ed.), Handbook of Materials Modeling, 1371–1388. © 2005 Springer. Printed in the Netherlands.
2.1. Numerical Approaches
Given the scales of the problem (many filaments, slenderness, complicated individual dynamics), several approximate methods have been developed. One such class is the so-called bead models. This class of models contains a wide variety of approximations, with the common feature being that a flexible fiber is modeled as a chain of linked rigid bodies. For example, Yamamoto and Matsouka [40] modeled their flexible filaments as chains of N spherical beads. Interactions between the spheres within a fiber are taken into account through the use of a mobility matrix, but to reduce the cost of the computations, interactions with spheres belonging to other fibers are not included. Forces from inter-fiber interactions are instead computed using a lubrication approximation, ignoring far field interactions. Ross and Klingenberg [31] modeled each fiber as a number of linked prolate spheroids. In their study of sheared suspensions of fibers, they neglected hydrodynamic interactions, i.e., the fact that the fibers have an effect on the fluid flow, and added only a short range repulsive force to keep fibers from overlapping. Very similar approximations have been made by Ning and Melrose [25] and Switzer and Klingenberg [36], who instead use cylinders as building blocks for each fiber. In Ref. [36], in order to reduce the computational cost, only five cylinders were used to model each fiber, thereby restricting the possible modes of bending of the fiber. Joung et al. [21] developed a method for slightly flexible fibers, modeled as chains of spherical beads. In their formulation they include the forces due to interactions between beads within each fiber, as well as between different fibers. However, they do not solve this full system to obtain the bead velocities to update the positions of the beads. Instead, they first determine the updated end-to-end vector of each fiber, followed by a force and moment calculation to adjust the positions of the individual beads. For the computations of external moments, the fiber is assumed rigid, thereby limiting the validity to cases of only slightly bent fibers. The immersed boundary method [28] has also been applied to this class of problems. In this method, an elastic boundary is discretized with connected Lagrangian markers, and its relative displacements by fluid motion are used to calculate the boundary’s elastic responses. These elastic forces are then distributed onto a background grid covering the computational domain, and used as forces acting upon the fluid, thereby modifying the surrounding fluid flow. For example, Stockie [35] used an immersed boundary method (at moderate Reynolds number) to simulate a single “filament” (modeled as an infinitesimally thin elastic boundary) buckling in a two-dimensional linear shear flow. To add a physical width to the fiber, a fiber structure must be constructed from a bundle of intertwined immersed elastic boundaries.
Lim and Peskin [24] used such a construction to study the so-called whirling instability [38] of one fiber at low Reynolds number. While this method has the advantage that flows at finite Reynolds numbers can be simulated, being fundamentally grid based, it would be very difficult to use this method to simulate a large number of high aspect ratio filaments. As a different starting point, we have recently developed a numerical approach based on a formulation of the problem where we make explicit use both of the Stokes equations, and of the slenderness of the fibers [37]. The key points are that for Stokes flow, boundary integral methods can be employed to reduce the three-dimensional dynamics to the dynamics of the two-dimensional filament surfaces [30], and by using slender-body asymptotics, this can be further reduced to the dynamics of the one-dimensional filament centerlines. The resulting integral equations capture the non-local interaction of the fiber with itself, as well as with any other structures within the fluid, such as other fibers. Indeed, the dynamics shown in Fig. 1 are simulated using this formulation, though only for a single filament interacting with a background shear flow.
We have applied this method to simulate a moderate number of high aspect ratio, very flexible, filaments in a three-dimensional fluid [37]. Indeed, these are the first simulations to reach such a regime of both high flexibility and aspect ratio. Still, much improvement remains to be done, especially in the area of near-interactions of the filaments where lubrication forces become important, as well as improving the efficiency of computing the non-local filament– filament interactions. There are important and fascinating applications for such a set of numerical tools. This includes answering very basic questions concerning the development of non-Newtonian stresses in elastic fluids; shedding light on observations of micro-fluidic mixing through a form of low Reynolds “turbulence” induced by elastic response of the fluid; studying the transitions to and nature of collective dynamics in swimming bacteria. Some of these applications lead to special cases of the dynamics that allow the simulations to be performed at very low cost.
3. Mathematical Formulation
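Let $\Omega$ denote the fluid domain in $\mathbb{R}^3$, external to the filament. Consider a Newtonian fluid of viscosity $\mu$, with velocity field $\mathbf{u}(\mathbf{x})$ and pressure $p(\mathbf{x})$, where $\mathbf{x} = (x, y, z) \in \mathbb{R}^3$. Assuming that fluid inertia is negligible, $\mathbf{u}$ and $p$ satisfy the Stokes equations: $\nabla p - \mu \Delta \mathbf{u} = 0$ and $\nabla \cdot \mathbf{u} = 0$ in $\Omega$. Let $\Gamma$ denote the surface of the filament and $\mathbf{u}_\Gamma$ its surface velocity. We impose the no-slip condition on $\Gamma$ and require that $\mathbf{u}(\mathbf{x}) \to \mathbf{U}_0(\mathbf{x})$ as $\|\mathbf{x}\| \to \infty$, where the background velocity $\mathbf{U}_0(\mathbf{x})$ is also a solution to the Stokes equations. Hence
\[ \mathbf{u} = \mathbf{u}_\Gamma \ \text{on } \Gamma, \qquad \mathbf{u} \to \mathbf{U}_0 \ \text{for } \|\mathbf{x}\| \to \infty. \]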
In the case of several filaments this can be generalized by considering the union of all filament surfaces, and imposing no-slip conditions thereon. A full boundary integral formulation for this problem would yield integral equations on the surfaces of the filaments relating surface stress and surface velocity [30]. For long, slender filaments, such a formulation would be very expensive to solve numerically. Instead we use the filament slenderness to reduce the integral equations to the filament centerlines.
3.1. Non-local Slender Body Approximation
Consider a slender filament; that is, $\varepsilon = a/L \ll 1$, where $a$ is the filament radius and $L$ is its length. A non-local slender body approximation can be derived by placing fundamental solutions to the Stokes equations (Stokeslets and doublets) on the filament centerline, then applying the technique of matched asymptotics to derive the approximate equation. Such an approximation was derived by Keller and Rubinow in 1976 [22]. Their derivation yields an integral equation with a modified Stokeslet kernel on the filament centerline and relates the filament forces to the velocity of the centerline. Johnson [20] added a more detailed analysis and a modified formulation that included accurate treatment of the filament’s free ends, yielding an equation that is asymptotically accurate to $O(\varepsilon^2 \log \varepsilon)$ if the filament ends are tapered. Götz [14] gives an integral expression for the fluid velocity $\mathbf{U}(\mathbf{x})$ at any point $\mathbf{x}$ outside the filament. If there are multiple filaments, their contributions simply add due to the superposition principle of Stokes flow. Denote the filaments by $\Gamma_l$, $l = 1, \ldots, M$. Let the centerline of each filament be parameterized by arclength $s \in [0, L]$, where $L$ is the non-dimensional length of the filament, and let $\mathbf{x}_l(s, t)$ be the coordinates of the filament centerline. In the cases we consider, the arclength $s$ is the material parameter for the filament, so that it is independent of $t$ (i.e., the filament is assumed inextensible). We assume that each filament exerts a force per unit length, $\mathbf{f}_l(s, t)$, upon the fluid. For filament $\Gamma_l$, we have
\[ \bar{\mu}\left(\frac{\partial \mathbf{x}_l}{\partial t}(s,t) - \mathbf{U}_0(\mathbf{x}_l, t)\right) = -\Lambda_l[\mathbf{f}_l](s) - K_\delta[\mathbf{f}_l](s) - \sum_{k=1,\,k \neq l}^{M} \mathbf{V}_k(\mathbf{x}_l(s)), \tag{1} \]
where the sum is over the contributions from all other filaments to the velocity of filament $l$, and $\mathbf{U}_0(\mathbf{x}, t)$ is the background velocity, if any. Assuming the background flow to be a shear flow of strength $\dot{\gamma}$, the problem has been made non-dimensional by using a typical filament length $\tilde{L}$, flow time-scale $\dot{\gamma}^{-1}$, and the force $F = E/\tilde{L}^2$, where $E$ is the rigidity of the fiber. The non-dimensional parameters are the effective viscosity $\bar{\mu} = 8\pi\mu\dot{\gamma}\tilde{L}^2/(E/\tilde{L}^2)$, representing a ratio between characteristic fluid drag and the filament elastic force, and the asymptotic parameter $c = \log(\varepsilon^2 e)$, where the radius of the filament is $r(s) = 2\varepsilon\sqrt{s(L - s)}$ [20]. The local operator $\Lambda_l$ is given by
\[ \Lambda_l[\mathbf{f}](s) = \left[-c\,(\mathbf{I} + \hat{\mathbf{s}}_l\hat{\mathbf{s}}_l(s)) + 2\,(\mathbf{I} - \hat{\mathbf{s}}_l\hat{\mathbf{s}}_l(s))\right]\mathbf{f}(s), \tag{2} \]
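and the integral operator $K_{l,\delta}[\mathbf{f}](s)$ (written $K_\delta$ where the filament index is suppressed) by
\[ K_{l,\delta}[\mathbf{f}](s) = \int_{\Gamma_l} \left( \frac{\mathbf{I} + \hat{\mathbf{R}}(s,s')\hat{\mathbf{R}}(s,s')}{\sqrt{|\mathbf{R}(s,s')|^2 + \delta(s)^2}}\,\mathbf{f}(s') - \frac{\mathbf{I} + \hat{\mathbf{s}}_l(s)\hat{\mathbf{s}}_l(s)}{\sqrt{|s - s'|^2 + \delta(s)^2}}\,\mathbf{f}(s) \right) ds'. \tag{3} \]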
Here, $\mathbf{R}(s, s') = \mathbf{x}(s) - \mathbf{x}(s')$, and $\hat{\mathbf{R}}\hat{\mathbf{R}}$ and $\hat{\mathbf{s}}\hat{\mathbf{s}}$ are dyadic products, with $\hat{\mathbf{R}}$ the normalized $\mathbf{R}$ vector and $\hat{\mathbf{s}}$ the unit tangent vector. Note that these two operators depend on the shape of the filament (given by $\mathbf{x}_l(s, t)$). In the original slender-body formulations [14, 20, 22], the regularization parameter $\delta$ in Eq. (3) is zero. An analysis of the straight filament case shows that these original slender-body formulations are not suitable for numerical computations, due to high wave number instabilities at length-scales not accurately described by slender-body theory [34, 37]. As we discuss in Ref. [37], the regularization introduced can remove this instability while retaining the same asymptotic accuracy as the original formulation of Johnson. The Stokeslet and doublet contributions from the other filaments are given by
\[ \mathbf{V}_k(\bar{\mathbf{x}}) = \int_{\Gamma_k} \frac{\mathbf{I} + \hat{\mathbf{R}}_k(s')\hat{\mathbf{R}}_k(s')}{|\mathbf{R}_k(s')|}\,\mathbf{f}_k(s')\,ds' + \frac{\varepsilon^2}{2} \int_{\Gamma_k} \frac{\mathbf{I} - 3\hat{\mathbf{R}}_k(s')\hat{\mathbf{R}}_k(s')}{|\mathbf{R}_k(s')|^3}\,\mathbf{f}_k(s')\,ds', \tag{4} \]
where $\mathbf{R}_k(s') = \bar{\mathbf{x}} - \mathbf{x}_k(s')$.
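As a small illustration of the local operator in Eq. (2), the sketch below applies it pointwise to a force density. It is not taken from Ref. [37]; the function name `local_operator` and the array layout (one row per grid point) are assumptions made for this example only. Because $\hat{\mathbf{s}}\hat{\mathbf{s}}\,\mathbf{f}$ is just the tangential projection of $\mathbf{f}$, no 3×3 matrices need to be formed.

```python
import numpy as np

def local_operator(f, s_hat, eps):
    """Apply Lambda[f] = [-c (I + s s) + 2 (I - s s)] f at each grid point.

    f     : (n, 3) array, force per unit length at n points (hypothetical layout)
    s_hat : (n, 3) array of unit tangent vectors
    eps   : slenderness ratio a / L
    """
    c = np.log(eps**2 * np.e)                                  # c = log(eps^2 e)
    f_par = np.sum(s_hat * f, axis=1, keepdims=True) * s_hat   # (s s) f
    return -c * (f + f_par) + 2.0 * (f - f_par)

# A tangential force is scaled by -2c, a normal force by (2 - c).
eps = 1e-2
c = np.log(eps**2 * np.e)
tangent = np.array([[1.0, 0.0, 0.0]])
out_parallel = local_operator(np.array([[1.0, 0.0, 0.0]]), tangent, eps)
out_normal = local_operator(np.array([[0.0, 1.0, 0.0]]), tangent, eps)
```

For a slender filament $c$ is negative, so both scale factors are positive, and their ratio $-2c/(2 - c)$ approaches 2 as $\varepsilon \to 0$.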
3.2. Force Definition
The integral equation (1) relates the velocity of filament $l$ to the forces acting upon the filament, as well as to the forces acting upon the other filaments. Here we assume that filament forces are described by Euler–Bernoulli elasticity [32], and for a filament given by $\mathbf{x}(s)$ the non-dimensional force (per unit length) is given by
\[ \mathbf{f}(s) = -(T(s)\,\mathbf{x}_s)_s + \mathbf{x}_{ssss}, \tag{5} \]
where derivatives with respect to arclength are denoted by a subscript $s$. The first term in Eq. (5) is the filament tensile force, with $T$ the tension, that resists compression and extension. The second term represents bending forces. Twist elasticity is neglected [12]. The ends of the filament are considered “free,” that is, no forces or moments are exerted upon them, so that $\mathbf{x}_{ss}|_{s=0,1} = \mathbf{x}_{sss}|_{s=0,1} = 0$ and $T|_{s=0,1} = 0$.
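The force in Eq. (5) can be evaluated on a uniform arclength grid with second-order differences. The following is a minimal sketch under stated assumptions, not the discretization of Ref. [37]: the name `filament_force` is hypothetical, the tension term uses `numpy.gradient`, and the bending term uses the centered five-point stencil only at interior nodes. The two nodes nearest each end are left as NaN, since they would need one-sided stencils or the free-end conditions.

```python
import numpy as np

def filament_force(x, T, h):
    """Evaluate f = -(T x_s)_s + x_ssss on a uniform grid of spacing h.

    x : (N+1, 3) centerline coordinates, T : (N+1,) line tension.
    Returns an (N+1, 3) array; rows 0, 1, N-1, N are NaN (no centered stencil).
    """
    x_s = np.gradient(x, h, axis=0, edge_order=2)
    tension = -np.gradient(T[:, None] * x_s, h, axis=0, edge_order=2)
    bending = np.full_like(x, np.nan)
    bending[2:-2] = (x[4:] - 4.0 * x[3:-1] + 6.0 * x[2:-2]
                     - 4.0 * x[1:-3] + x[:-4]) / h**4
    return tension + bending

N = 50
h = 1.0 / N
s = np.linspace(0.0, 1.0, N + 1)
zeros = np.zeros_like(s)

# Straight filament with tension T = s(1 - s): f = -T_s x_s = (2s - 1, 0, 0).
f_straight = filament_force(np.stack([s, zeros, zeros], axis=1), s * (1.0 - s), h)

# Pure bending check with T = 0 and x = (s, s^4/24, 0): x_ssss = (0, 1, 0).
f_bend = filament_force(np.stack([s, s**4 / 24.0, zeros], axis=1), zeros, h)
```

The second curve is not inextensible; it is used only because the five-point stencil differentiates a quartic exactly, which makes the check unambiguous.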
3.3. Completing the Formulation
Now, consider the assumption of inextensibility. This determines the line tension $T$. Since the filament is inextensible, $s$ remains a material parameter, and thus $s$ and $t$ derivatives can always be interchanged. Writing $\partial_t \mathbf{x}(s, t) = \mathbf{U}(s, t)$, we have
\[ \partial_t(\mathbf{x}_s \cdot \mathbf{x}_s) = 0 \;\Rightarrow\; \mathbf{x}_s \cdot \mathbf{x}_{ts} = \mathbf{x}_s \cdot \mathbf{U}_s = 0. \tag{6} \]
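This condition can be combined with Eq. (1) to derive an integro-differential equation for the line tension. For the line tension $T_l(s)$ of filament $l$, $l = 1, \ldots, M$, this equation is of the form
\[ L_s[T_l, \mathbf{x}_l] = J[\mathbf{x}_l, \mathbf{U}_0] - \sum_{k=1,\,k \neq l}^{M} (\mathbf{x}_l)_s \cdot \frac{\partial}{\partial s}\mathbf{V}_k(\mathbf{x}_l(s)), \tag{7} \]
where
\[ L_s[T, \mathbf{x}] = 2c\,T_{ss} + (2 - c)\,(\mathbf{x}_{ss} \cdot \mathbf{x}_{ss})\,T - \mathbf{x}_s \cdot \frac{\partial}{\partial s} K_\delta[(T\mathbf{x}_s)_s], \]
\[ J[\mathbf{x}, \mathbf{U}_0] = \bar{\mu}\,\mathbf{x}_s \cdot \frac{\partial \mathbf{U}_0}{\partial s} + (2 - 7c)\,(\mathbf{x}_{ss} \cdot \mathbf{x}_{ssss}) - 6c\,(\mathbf{x}_{sss} \cdot \mathbf{x}_{sss}) - \mathbf{x}_s \cdot \frac{\partial}{\partial s} K_\delta[\mathbf{x}_{ssss}] - \bar{\mu}\beta\,(1 - \mathbf{x}_s \cdot \mathbf{x}_s). \tag{8} \]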
Equation (7) is solved together with the boundary condition $T = 0$ at $s = 0, 1$. Here we have simplified the expressions for $L_s$ and $J$ using a ladder of differential identities, derived by successive differentiations of $\mathbf{x}_s \cdot \mathbf{x}_s = 1$ [37]. The line tension $T(s)$ acts as a Lagrange multiplier, constraining the motion of the filament to obey the inextensibility condition. However, the equation for $T(s)$ was derived assuming that the filament was of exactly the correct length, and hence that $\mathbf{x}_s \cdot \mathbf{x}_s = 1$ for all $s$. If a small length error is present, this error will not be corrected. On the contrary, the computed line tension can, depending on the configuration, even act so as to increase this error. We stabilize this constraint by replacing the inextensibility condition in (6) with $\tfrac{1}{2}\,\partial_t(\mathbf{x}_s \cdot \mathbf{x}_s) = \mathbf{x}_s \cdot \mathbf{x}_{ts} = \bar{\mu}\beta\,(1 - \mathbf{x}_s \cdot \mathbf{x}_s)$, which is equivalent to the original condition when $\mathbf{x}_s \cdot \mathbf{x}_s = 1$, and which acts to dynamically remove length errors if they are present ($\beta$ is the penalization parameter, typically set to be of order $O(10)$). In summary, once the tension is found for each filament $\Gamma_l$, their velocities are completely specified and the system can be stepped forward. Above, we seem to have suggested that each filament tension can be found independently of the others. This is not so; through its dependence on the filament forces, the right-hand side of Eq. (7) depends upon the tensions of all the other filaments $\Gamma_k$, $k \neq l$.
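A short worked step may help to see why the stabilized condition removes length errors. Taking the condition exactly as written above and defining the local length error $e(s,t) = 1 - \mathbf{x}_s \cdot \mathbf{x}_s$ (a symbol introduced here only for illustration), one finds at the time-continuous level

```latex
\[
\partial_t e \;=\; -\,\partial_t(\mathbf{x}_s \cdot \mathbf{x}_s)
\;=\; -2\bar{\mu}\beta\,(1 - \mathbf{x}_s \cdot \mathbf{x}_s)
\;=\; -2\bar{\mu}\beta\, e
\quad\Longrightarrow\quad
e(s,t) \;=\; e(s,0)\, e^{-2\bar{\mu}\beta t}.
\]
```

Under this assumption any initial length error decays exponentially at each $s$, whereas with the original condition (6) one has $\partial_t e = 0$ and the error simply persists.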
4. Numerical Methods
Thus, our system of flexible fibers is evolved by a large coupled set of integro-differential equations. There are several interesting aspects to the construction of accurate and efficient methods for its numerical solution, including methods of implicit time-stepping, and the evaluation of nearly singular integrals (which is usually more difficult than the evaluation of simply singular ones).
4.1. Temporal Discretization
An explicit treatment of all terms in the time-dependent equation (1) would yield a very strict fourth-order stability constraint upon the time-step $\Delta t$ (i.e., $\Delta t$ proportional to the fourth power of the spatial grid size). This arises basically from the large number of derivatives in the bending term of the force. To avoid this, we treat all occurrences of $\mathbf{x}_{ssss}$ implicitly, and combine this with a second-order backward differentiation formula [1]. Schematically, we write
\[ \mathbf{x}_t = \mathbf{F}(\mathbf{x}, \mathbf{x}_{ssss}) + \mathbf{G}(\mathbf{x}), \tag{9} \]
where $\mathbf{x}(s, t)$ are the coordinates of filament number $l$, and where the dependence on $\mathbf{U}_0$ and $\mathbf{x}_k$, $k \neq l$, is not explicitly described. Here, $\mathbf{x}_{ssss}$ is treated implicitly, and all other terms are treated explicitly. We approximate this decomposition by
\[ \frac{1}{2\Delta t}\left(3\mathbf{x}^{n+1} - 4\mathbf{x}^{n} + \mathbf{x}^{n-1}\right) = \mathbf{F}(2\mathbf{x}^{n} - \mathbf{x}^{n-1}, \mathbf{x}^{n+1}_{ssss}) + 2\mathbf{G}(\mathbf{x}^{n}) - \mathbf{G}(\mathbf{x}^{n-1}), \tag{10} \]
where $t_n = n\Delta t$. We find that this scheme yields only a first-order constraint on $\Delta t$ (i.e., proportional to the spatial grid size). The dynamics of multiple filaments are coupled to each other through the summation in Eq. (1). We treat this coupling term explicitly, that is, as part of $\mathbf{G}(\mathbf{x})$ in Eq. (10). In the resulting linear system for $\mathbf{x}_l^{n+1}(s)$, $l = 1, \ldots, M$, the contribution from the other filaments will therefore be in the right hand side, and so the big system decouples into separate linear systems for $\mathbf{x}_l^{n+1}(s)$, $l = 1, \ldots, M$. In doing this, we are essentially using the fact that the interaction terms are smoothing operators, and hence any high-order terms that they contain do not contribute to high-order stability constraints on the time-step [18]. The equation for the line tensions $T_l(s)$, $l = 1, \ldots, M$, is given in (7). This is a system of coupled integro-differential equations for the corresponding line tensions that must be solved at every time-step. To avoid solving one very large linear system for the line tensions on all the filaments, we introduce a fixed point iteration, in which we use the newest available updates of the tensions $T_k$ ($k \neq l$) when computing $T_l(s)$. We find that this fixed point iteration typically converges rapidly, which again relies on the fact that the interaction terms are smoothing operators on the tensions.
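The structure of Eq. (10) can be tried out on a much simpler model problem. The sketch below is an illustration under stated assumptions, not the filament solver of Ref. [37]: it advances the scalar equation $u_t = -u_{ssss} + g(u)$ on a periodic grid, so that the implicit part is linear with constant coefficients, and all names (`fourth_derivative_matrix`, `semi_implicit_bdf2`) are hypothetical. The fourth derivative is taken at the new time level and the remaining term is extrapolated as $2g(u^n) - g(u^{n-1})$.

```python
import numpy as np

def fourth_derivative_matrix(n, h):
    """Second-order centered approximation of d^4/ds^4 on a periodic grid."""
    D4 = np.zeros((n, n))
    for offset, weight in zip((-2, -1, 0, 1, 2), (1.0, -4.0, 6.0, -4.0, 1.0)):
        D4 += weight * np.roll(np.eye(n), offset, axis=1)
    return D4 / h**4

def semi_implicit_bdf2(u_prev, u_curr, g, D4, dt, nsteps):
    """Advance u_t = -D4 u + g(u): BDF2 in time, D4 implicit, g extrapolated."""
    A = 1.5 * np.eye(len(u_curr)) + dt * D4
    for _ in range(nsteps):
        rhs = 2.0 * u_curr - 0.5 * u_prev + dt * (2.0 * g(u_curr) - g(u_prev))
        u_prev, u_curr = u_curr, np.linalg.solve(A, rhs)
    return u_curr

n = 64
h = 2.0 * np.pi / n
s = h * np.arange(n)
dt = h                                   # time-step comparable to the grid size
D4 = fourth_derivative_matrix(n, h)
# Decay rate of sin(s) under the discrete fourth derivative.
lam = (2.0 * np.cos(2.0 * h) - 8.0 * np.cos(h) + 6.0) / h**4
u0 = np.sin(s)
u1 = np.exp(-lam * dt) * u0              # exact value after one step, to start BDF2
nsteps = 20
u = semi_implicit_bdf2(u0, u1, lambda v: np.zeros_like(v), D4, dt, nsteps)
exact = np.exp(-lam * (nsteps + 1) * dt) * u0
max_error = np.max(np.abs(u - exact))
```

With $g = 0$ the run stays bounded and tracks the exact decay closely even though $\Delta t = h$, a step several thousand times larger than the $O(h^4)$ limit that a fully explicit scheme would face on this grid.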
4.2. Spatial Discretization
The filament centerlines are discretized uniformly in arclength $s$, with $N$ intervals of step size $h = 1/N$. The discrete points are denoted $s_j = jh$, $j = 0, \ldots, N$, and the values $f_j = f(s_j)$. Second-order divided differences are used to approximate spatial derivatives. $D_p$ denotes divided difference operators such that $D_p f_j$ approximates $f^{(p)}(s_j)$ to an $O(h^2)$ error. Standard centered operators are used whenever possible, but at boundaries skew operators are applied. For the integral operator $K$ in Eq. (3), both terms in the integrand are singular at $s' = s$ for $\delta = 0$, and the integral is only well defined for the difference of these two terms. For the regularized operator, the terms are still nearly singular, and the numerical scheme must be designed with care to accurately treat the difference of these terms. To do this, we subtract off a term from the first part of the integral, and add the same term to the second part, and write the integral operator (3) as
\[ K_\delta[\mathbf{g}](s) = \int_0^1 \frac{\mathbf{G}(s, s')\,\mathbf{g}(s')}{\sqrt{(s - s')^2 + \delta(s)^2}}\,ds' + (\mathbf{I} + \hat{\mathbf{s}}\hat{\mathbf{s}}) \int_0^1 \frac{\mathbf{g}(s') - \mathbf{g}(s)}{\sqrt{(s - s')^2 + \delta(s)^2}}\,ds', \tag{11} \]
where $\mathbf{G}(s, s')$ is given by
\[ \mathbf{G}(s, s') = \frac{\sqrt{(s - s')^2 + \delta(s)^2}}{\sqrt{|\mathbf{R}|^2 + \delta(s)^2}}\,(\mathbf{I} + \hat{\mathbf{R}}\hat{\mathbf{R}}) - (\mathbf{I} + \hat{\mathbf{s}}\hat{\mathbf{s}}). \tag{12} \]
We then treat each part separately, by approximating the argument to the operator, as well as $\mathbf{G}(s, s')$, by piecewise polynomials. These are all smooth, well-behaved functions. In the end, we need to evaluate integrals of the form
\[ \int_{s_j}^{s_{j+1}} \frac{(s' - s_j)^p}{\sqrt{|s - s'|^2 + \delta(s)^2}}\,ds' = \int_0^h \frac{\alpha^p}{\sqrt{\alpha^2 + b\alpha + c + \delta(s)^2}}\,d\alpha, \]
where $b = 2(s_j - s)$ and $c = (s_j - s)^2$, and $p = 0, \ldots, 4$. These integrals have analytical formulas, becoming somewhat lengthy as $p$ increases. By evaluating these integrals analytically, the rapidly changing part where $s'$ is close to $s$ can be treated exactly. In the line tension equation (7), terms like $\mathbf{x}_s \cdot \partial/\partial s\,(K_\delta[\mathbf{g}])$ appear. These differentiated integral terms are approximated to second order by
\[ \frac{\partial}{\partial s} K_\delta[\mathbf{g}](s)\Big|_{s = s_j} \approx \frac{1}{h}\left[K_\delta[\mathbf{g}](s_{j+1/2}) - K_\delta[\mathbf{g}](s_{j-1/2})\right]. \tag{13} \]
This compact centered approximation of the derivative is important to achieve a stable numerical approximation of the line tension equation.
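For the two lowest orders, the analytical formulas for the panel integrals $\int_0^h \alpha^p/\sqrt{\alpha^2 + b\alpha + c + \delta^2}\,d\alpha$ are short enough to write out, since $\alpha^2 + b\alpha + c = (\alpha + d)^2$ with $d = s_j - s$. The sketch below is illustrative only: the function names are hypothetical, $\delta$ is taken as a constant, and only $p = 0$ and $p = 1$ are implemented. A composite Gauss–Legendre rule serves as an independent reference.

```python
import numpy as np

def kernel_moment(p, d, h, delta):
    """Closed form of int_0^h a^p / sqrt((a + d)^2 + delta^2) da for p = 0, 1.

    d = s_j - s is the offset of the panel start from the evaluation point.
    """
    i0 = np.arcsinh((h + d) / delta) - np.arcsinh(d / delta)
    if p == 0:
        return i0
    if p == 1:
        # a = (a + d) - d; the first piece integrates to a difference of roots.
        return np.hypot(h + d, delta) - np.hypot(d, delta) - d * i0
    raise NotImplementedError("only p = 0 and p = 1 are written out in this sketch")

def kernel_moment_quadrature(p, d, h, delta, panels=2000, order=8):
    """Reference value from composite Gauss-Legendre quadrature on [0, h]."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    edges = np.linspace(0.0, h, panels + 1)
    half = 0.5 * np.diff(edges)[:, None]
    mid = 0.5 * (edges[:-1] + edges[1:])[:, None]
    a = mid + half * nodes[None, :]
    values = a**p / np.sqrt((a + d)**2 + delta**2)
    return float(np.sum(half * weights[None, :] * values))

# Evaluation point inside the panel, where the integrand is sharply peaked.
h, delta, d = 0.01, 1e-3, -0.004
exact0, exact1 = kernel_moment(0, d, h, delta), kernel_moment(1, d, h, delta)
ref0 = kernel_moment_quadrature(0, d, h, delta)
ref1 = kernel_moment_quadrature(1, d, h, delta)
```

The reference rule needs thousands of sub-panels to resolve the peak of width $\delta$, which is the practical reason for preferring the closed forms inside a solver.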
5. The Microstructural Dynamics of Suspensions
This suite of numerical methods is presently at the leading edge for simulating flows of high aspect ratio, very flexible filaments, and forms the basis for approaching a set of very interesting scientific problems. Figures 2 and 3 show a simulation of 25 filaments of equal and unit length evolving and interacting in a background oscillatory shear flow. Such dynamic background flows are used to extract such quantities as storage and loss moduli of elastic fluids [23]. Here, periodic boundary conditions are imposed in the streamwise ($x$) direction, with period twice the filament length, and with viscosity $\bar{\mu} = 1.5 \times 10^5$ and aspect ratio $\varepsilon = 10^{-2}$. We have used $N = 100$ points to discretize each filament, and evolve with time-step $\Delta t = 0.0128$. The background shear flow is given by $\mathbf{U}_0 = (\sin(2\pi\omega t)\,y, 0, 0)$, where $\omega = (2000\,\Delta t)^{-1}$, so that one period is 2000 time-steps. The simulation is run for five periods.
Figure 2. Filament configurations at t = 0.0 and 32. The velocity profile of the background shear flow is indicated at the bottom. From Tornberg and Shelley [37], with permission.
Figure 3. As in Fig. 2, but at t = 38 and 49. From Tornberg and Shelley [37], with permission.
As seen in the left plot of Fig. 2, all the filaments are initially straight. In simulations of a single filament, a small perturbation must be introduced to excite buckling. In this case, the non-local filament interactions are sufficient for this purpose. The right plot of Fig. 2 shows the filament configuration at $t = 32$, while Fig. 3 shows the configurations at $t = 38$ and 49. These last three plots all lie within the second period of the background shearing. In each plot, the tension $T$ in each filament is used to color its surface, and the effects of compression and extension are clear; in particular, bending is typically associated with compression. Between $t = 32$ and 38 the background shear has slowed down and changed direction, and is slowly picking up again. This induces the strong bending of many filaments seen at $t = 49$. Figure 4 shows the evolution of the total elastic energy, $E_{el} = \int ds\, \mathbf{X}_{ss}^2$, of the filament system, as well as $N_1$, the first normal stress difference. As with the single filament simulations in a uniform shear [3, 37], we find a net positive first normal stress difference, though here with a contribution on each period of the forcing. However, examination of the elastic energy makes it particularly clear that the system is not in an “equilibrium” with averaged dynamics the same on each half-period. This suggests either that the simulations have not yet run long enough to remove dependencies on the initial configuration, or that the number of filaments is yet too small to get a good distribution of positions and orientations, or both. Indeed, one of the goals of our project is to make our methods much more efficient so that a larger number of filaments can be simulated, as is discussed below.
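The timing of the snapshots relative to the oscillatory forcing follows from the quoted parameters by simple arithmetic. The sketch below only restates that arithmetic; the helper names are hypothetical and nothing here is taken from the simulation code itself.

```python
import math

dt = 0.0128                       # time-step quoted in the text
steps_per_period = 2000
period = steps_per_period * dt    # one forcing period, 1 / omega
omega = 1.0 / period
total_time = 5 * period           # the run covers five periods

def period_index(t):
    """1-based index of the forcing period that contains time t."""
    return int(t // period) + 1

def shear_amplitude(t):
    """Prefactor sin(2 pi omega t) of the background shear U0 = (sin(...) y, 0, 0)."""
    return math.sin(2.0 * math.pi * omega * t)

snapshot_periods = [period_index(t) for t in (32.0, 38.0, 49.0)]
```

With these values one period lasts 25.6 time units and the run ends at $t = 128$. The snapshots at $t = 32$, 38 and 49 all fall in the second period, and $t = 32$ coincides with a maximum of the shear amplitude.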
If we were able to perform much larger simulations, and to compute quantities of interest such as filament concentration and orientations, elastic energy, viscosity of the suspension, and normal stress differences, the impact on the development of macroscopic models would be substantial. Also, within this type of simulation one can study the details of filament interactions directly,
Figure 4. The upper plot shows the evolution of the elastic energy εel for the 25 filament simulation. The lower plot shows the evolution of the first normal stress difference N1 . From Tornberg and Shelley [37], with permission.
and relate macroscopic stress development directly to the geometrical configuration of the filaments, and to their own internal stresses. We also intend to investigate dependencies on parameters such as bending rigidity, fiber slenderness, fiber concentration, and responses to prototypical background flows, from shearing to extensional. Numerical simulations also constitute a great complement to laboratory experiments, since parameters and initial settings are easily varied and exactly controlled.
5.1. Outlook
5.1.1. Numerical methods
Simulating denser suspensions, while maintaining accuracy, is a real numerical challenge. There are two primary issues: the treatment of near-range filament–filament interactions, and the ability to simulate a suspension including a large number of filaments. On the first, as two filaments come within very close proximity of each other – on the order of a filament radius – lubrication forces become important and these are not well captured by slender-body theory. Shelley and Ueda [34] avoided the issue by reformulating the slender-body dynamics so that close approach induces a nearly singular response that kept the filaments separated. In their simulations of dense suspensions of straight
rods, Butler and Shaqfeh [5] used a version of slender-body theory, and modeled lubrication forces by approximations of flow between two rigid rods (though their slender-body approximation was incomplete in describing the self-induced dynamics of the fiber). A more general strategy would interpolate between slender-body theory to describe regions away from close approach, and a full, but local, boundary integral formulation describing the local flow field in the region of close approach between two filaments. New fast summation strategies are also being developed for computing the filament–filament interactions. Very recently, Zorin and his collaborators at the Courant Institute have developed highly efficient “kernel-free” fast multipole methods whose application here would yield a cost of $O(MN)$ [4, 10, 29]. They already have a production code for fast Stokeslet summation, and we are collaborating with them on applying this code to our problem.
5.1.2. Other dynamical problems in fiber-like flows
To some extent, the slender-body formulation that we have developed follows the earlier work of Shelley and Ueda [34] (though with considerable improvement and generalization). There, they were specifically interested in modeling the filamentary dynamics of a phase transition observed in an undercooled smectic-A liquid crystal sample, and which was posited as the substrate upon which new industrial fibers and materials might be synthesized [26]. The left panel of Fig. 5 shows a snap-shot from the phase transition experiments of [26], in which a microscopic filament was demonstrated to grow exponentially in time. Here the pattern grows rapidly outwards, becoming “space-filling”, because the liquid crystal sample is confined to a narrow gap between two microscope coverslips, thus constraining the dynamics. The right panel shows the result of a numerical simulation of Shelley and Ueda [34] of this process. The system is of an elastic filament in a Stokesian fluid, whose axial tension is determined by the constraint of specified length growth. This growth leads to a buckling instability with a critical length-scale, and the hydrodynamical interactions of the filament with its disparate parts – mediated through a non-local slender-body theory – lead to the resulting dynamical patterns. E and Palffy-Muhoray [9] have also elucidated many of the thermodynamical aspects of the phase transition. A recently discovered and very interesting phenomenon involving fluids with elastic response is that of “elastic turbulence”, which has important applications to fluid mixing in small devices at low Reynolds numbers. In experiments, Groisman and Steinberg [16] have shown that by adding a small amount of high-molecular-weight polymer to a viscous fluid, the flow in a cylindrical cup with a rotating top plate can become very irregular at low Reynolds numbers, and such that the fluid motion is excited over a wide
[Figure 5: panels labeled t = 1.05, t = 1.65, t = 2.25, and t = 2.85; both axes run from −4 to 4.]
Figure 5. Left: The “space-filling” pattern made by a thin filament of smectic-A material (whose molecular layers are presumably in a hedge-hog arrangement about the centerline) as isotropic material permeates into the filament, causing it to grow in length, and buckle. From Ref. [26], shown with permission. Right: A simulation of this growth process, based upon a non-local slender-body theory. Here the non-local hydrodynamical interactions cause the pattern to push outwards as the filament length grows exponentially in time, as in experiment. From Ref. [34], shown with permission.
range of spatial and temporal scales. The flow shows many of the features of developed turbulence. In Ref. [17], the same authors performed an experiment where they studied the mixing of very viscous liquids in a curved channel as polymers were added to the liquids. They found that at sufficiently high flow rates, the combination of elastic response with curved channel walls leads to elastic instabilities and an irregular flow with strongly enhanced mixing. Roughly speaking, the elastic response of the fluid supplies the requisite nonlinearity and extra time-scales to create very complicated flow patterns, even at very low Reynolds number. Very recently, Groisman et al. have demonstrated how the non-linearities of elastic fluids can be used to create micro-fluidic logical devices [15]. Relevant to these systems is also the lawful inclusion of thermal fluctuations in such continuum-based models. There has been some recent work in this direction. Montesi, Pasquali, and Wiggins (private communication) have incorporated a model of such fluctuations in a local drag model of an elastic filament in a shear flow. Many problems in biological settings involve elastic filaments immersed in fluids. The flagella utilized by micro-organisms for locomotion [7] are very slender and flexible. Each flagellum is attached at one end to the body, and is actively driven to perform a swimming stroke. The flexibility of the flagella is very important in reducing the drag in the backstroke. The internal structure of a cross-section of cilia and flagella is not symmetric, making it easier for the cilia/flagella to
bend in one direction than the other, hence providing a preferred direction for the cilia/flagella to bend. It would be interesting to investigate models such as ours where the filaments have preferred bending directions. Related to this are the dynamics of active suspensions, such as bacterial ones, where the microstructural elements are self-propelling, i.e., swimming and eating bugs [27]. These active elements react to concentration gradients (say, of oxygen), but are also carried along in the macroscopic fluid motions induced by all of the other locomoting bacteria. The resulting bacterial flow structures can be very complicated, and quite “turbulent” in appearance. It has been shown that the self-propelled flows of dense bacterial suspensions can be dominated by vortices and jets, each of which is made up of many individual bacteria ([39]; also, unpublished observations of Dombrowski et al. [8]). This is motivation for developing simulational methods for many-particle suspensions of self-locomoting straight fibers. This could be done with flexible fibers of high rigidity, but by exploiting the fact that the fibers are straight, the computational cost can be made far lower. Here, the approach is to determine a filament force that contains a motive part, as well as a part that constantly readjusts so as to be consistent with the filament being rigid. This yields a mathematical description with similarity to that of Butler and Shaqfeh [5] for simulating suspension dynamics of rigid rods.
References
[1] K.E. Atkinson, An Introduction to Numerical Analysis, Wiley, New York, 1989.
[2] G.K. Batchelor, An Introduction to Fluid Dynamics, Cambridge University Press, Cambridge, 1967.
[3] L. Becker and M. Shelley, “The instability of elastic filaments in shear flow yields first normal stress differences,” Phys. Rev. Lett., 87, 198301, 2001.
[4] G. Biros, L. Ying, and D. Zorin, “A kernel-independent adaptive fast multipole method in two and three dimensions,” J. Comput. Phys., 196, 591–626, 2004.
[5] J.E. Butler and E.S.G. Shaqfeh, “Dynamic simulation of the inhomogeneous sedimentation of rigid fibers,” J. Fluid Mech., 468, 205–237, 2002.
[6] M.E. Cates and S.J. Candau, “Statics and dynamics of worm-like surfactant micelles,” J. Phys. Condens. Matter, 2, 6869–6892, 1990.
[7] S. Childress, Mechanics of Swimming and Flying, Cambridge University Press, Cambridge, 1981.
[8] C. Dombrowski, L. Cisneros, S. Chatkaew, R. Goldstein, and J. Kessler, “Self-concentration and large-scale coherence in bacterial dynamics,” Preprint, 2003.
[9] W. E and P. Palffy-Muhoray, “Dynamics of filaments during the isotropic-smectic A phase transition,” J. Nonlinear Sci., 9, 417–437, 1999.
[10] Z. Gimbutas and V. Rokhlin, “A generalized fast multipole method for nonoscillatory kernels,” SIAM J. Sci. Comput., 24, 796–817, 2002.
[11] T. Gisler and D.A. Weitz, “Scaling of the microrheology of semidilute F-Actin solutions,” Phys. Rev. Lett., 82, 1606–1609, 1999.
[12] R. Goldstein, T. Powers, and C. Wiggins, “Viscous nonlinear dynamics of twist and writhe,” Phys. Rev. Lett., 80, 5232, 1998.
[13] S. Goto, H. Nagazono, and H. Kato, “Polymer solutions, 1. Mechanical properties,” Rheol. Acta, 25, 119–129, 1986.
[14] T. Götz, Interactions of Fibers and Flow: Asymptotics, Theory and Numerics, PhD thesis, University of Kaiserslautern, Germany, 2000.
[15] A. Groisman, M. Enzelberger, and S. Quake, “Microfluidic memory and control devices,” Science, 300, 955–958, 2003.
[16] A. Groisman and V. Steinberg, “Elastic turbulence in a polymer solution flow,” Nature, 405, 53, 2000.
[17] A. Groisman and V. Steinberg, “Efficient mixing at low Reynolds numbers using polymer additives,” Nature, 410, 905, 2001.
[18] T. Hou, J. Lowengrub, and M. Shelley, “Long-time evolution of vortex sheets with surface tension,” Phys. Fluids, 9, 1933, 1997.
[19] A. Jayaraman and A. Belmonte, “Oscillations of a solid sphere falling through a wormlike micellar fluid,” Phys. Rev. E, 065301, 2003.
[20] R.E. Johnson, “An improved slender-body theory for Stokes flow,” J. Fluid Mech., 99, 411–431, 1980.
[21] C.G. Joung, N. Phan-Thien, and X. Fan, “Direct simulation of flexible fibers,” J. Non-Newtonian Fluid Mech., 99, 1–36, 2001.
[22] J. Keller and S. Rubinow, “Slender-body theory for slow viscous flow,” J. Fluid Mech., 75, 705–714, 1976.
[23] R.G. Larson, The Structure and Rheology of Complex Fluids, Oxford University Press, Oxford, 1998.
[24] S. Lim and C.S. Peskin, “Simulations of the whirling instability by the immersed boundary method,” SIAM J. Sci. Comput., 25, 2066–2083, 2004.
[25] Z. Ning and J.R. Melrose, “A numerical model for simulating mechanical behavior of flexible filaments,” J. Chem. Phys., 111, 10717–10726, 1999.
[26] P. Palffy-Muhoray, B. Bergersen, H. Lin, R. Meyer, and Z. Racz, “Filaments in liquid crystals: structure and dynamics,” In: S. Kai (ed.), Pattern Formation in Complex Dissipative Systems, World Scientific, Singapore, 1991.
[27] T. Pedley and J. Kessler, “Hydrodynamic phenomena in suspensions of swimming microorganisms,” Annu. Rev. Fluid Mech., 24, 313, 1992.
[28] C.S. Peskin, “The immersed boundary method,” Acta Numer., 11, 479–517, 2002.
[29] J. Phillips and J. White, “A precorrected-FFT method for electrostatic analysis of complicated 3-D structures,” IEEE Trans. Comput.-Aid. Des. Integrat. Circuits Syst., 16, 1059–1072, 1997.
[30] C. Pozrikidis, Boundary Integral and Singularity Methods for Linearized Viscous Flow, Cambridge University Press, Cambridge, 1992.
[31] R.F. Ross and D.J. Klingenberg, “Dynamic simulation of flexible fibers,” J. Chem. Phys., 106, 2949–2960, 1997.
[32] L. Segel, Mathematics Applied to Continuum Mechanics, MacMillan, New York, 1977.
[33] M. Shelley and T. Ueda, “The nonlocal dynamics of stretching, buckling filaments,” In: D. Papageorgiou and Y. Renardy (eds.), Multi-Fluid Flows and Instabilities, AMS-SIAM, Philadelphia, 1996.
[34] M.J. Shelley and T. Ueda, “The Stokesian hydrodynamics of flexing, stretching filaments,” Physica D, 146, 221–245, 2000.
[35] J.M. Stockie, “Simulating the motion of flexible pulp fibres using the immersed boundary method,” J. Comput. Phys., 147, 147–165, 1998.
[36] L.H. Switzer and D.J. Klingenberg, “Rheology of sheared flexible fiber suspensions via fiber-level simulations,” J. Rheol., 47, 759–778, 2003.
[37] A.K. Tornberg and M.J. Shelley, “Simulating the dynamics and interactions of flexible fibers in Stokes flow,” J. Comput. Phys., 196, 8–40, 2004.
[38] C. Wolgemuth, T. Powers, and R. Goldstein, “Twirling and whirling: viscous dynamics of rotating elastic filaments,” Phys. Rev. Lett., 84, 1623, 2000.
[39] X.-L. Wu and A. Libchaber, “Quasi-two-dimensional bacterial bath,” Phys. Rev. Lett., 84, 3017, 2000.
[40] S. Yamamoto and T. Matsuoka, “Dynamic simulations of fiber suspensions in shear flow,” J. Chem. Phys., 102, 2254–2260, 1995.
4.8 CONTINUUM DESCRIPTIONS OF CRYSTAL SURFACE EVOLUTION
Howard A. Stone¹ and Dionisios Margetis²
¹ Division of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
² Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
1. Morphological Evolution of Crystalline Materials
It is well known that liquid surfaces evolve in shape due to the effect of surface tension, which drives configurations towards lower energy. The breakup of an initially cylindrical fluid thread into spherical droplets, first quantified experimentally by Plateau and analyzed by Rayleigh, is a popular and illustrative example. Solid surfaces, in particular surfaces of crystals, also evolve according to the analogous principle of minimizing their surface energy. The evolution in this case, however, is more complicated to describe physically and mathematically than the analogous phenomena for fluid interfaces, because there is a richer variety of competing mechanisms that are available for the solid to change its shape. In addition, a solid supports strain, which leads to the surface energy depending on the slope of the crystal surface. In this article we summarize the basic physical ideas that underlie crystal surface evolution, introduce continuum descriptions in terms of continuum thermodynamics and partial differential equations (PDEs), and provide solutions to some analytically tractable prototypical problems.
A strong motivation for studying how crystal surfaces evolve is the need to better understand and harness properties of solid structures and electronic devices at the nanoscale. In most experimental, and technologically relevant, situations, such structures are not in thermodynamic equilibrium, and decay with a lifetime that varies appreciably with the temperature T and, most importantly, scales as an integer power of the feature size; thus, smaller structures decay faster. Hence, there is a need for the quantitative understanding of the factors that affect surface evolution, such as formation and growth of islands and atom clusters. Other problems are related to crystal dissolution, the effects of catalysts, and surface functionalization (e.g., using physical chemistry techniques).
S. Yip (ed.), Handbook of Materials Modeling, 1389–1401. © 2005 Springer. Printed in the Netherlands.
1.1. Mechanisms of Surface Evolution of Crystalline Materials
There are at least four primary mechanisms for solid surfaces to evolve:
(i) Evaporation–condensation processes, whereby atoms leave the surface or deposit on the surface from above. These processes are driven by differences between the chemical potential of the surface and the adjacent bulk phases (solid or vapor).
(ii) Surface diffusion, whereby movable atoms or point defects (“adatoms”) perform random walks (Brownian motion) along the surface.
(iii) Strain-driven rearrangements of atoms in the bulk of the material.
(iv) Atomic motion driven by external electric fields, a phenomenon referred to as electromigration.
Here we focus on mechanisms (i) and (ii).

These mechanisms, and especially their effects on the macroscopic surface features as well as their quantitative description, depend on the temperature and the surface orientation. There are two distinct temperature regimes that mark different macroscopic behaviors of surfaces both at and away from equilibrium; these regimes are separated by the orientation-dependent roughening transition temperature $T_R$. For any fixed $T$, continuously curved portions of the surface are characterized by a roughening transition temperature $T_R < T$, whereas macroscopically flat regions of the surface, known as “facets,” have $T_R > T$ [1]. Below $T_R$ the surface consists of distinct steps bounding terraces whose size can vary from a few nanometers up to a few microns, as shown in Fig. 1a, which also illustrates kinks, clusters of atoms and voids. Raising the temperature above $T_R$ causes the terraces to shrink as steps are spontaneously created everywhere, and the surface appears to be “rough,” as shown in Fig. 1b. Accordingly, below $T_R$ the evolution is caused by the lateral motion of the atomic steps and is in principle more difficult to describe physically and mathematically. Moreover, the detailed description of these processes is affected by kinetics at the step edges, especially those of extremal steps of opposite sign [2]. The physical picture described above implies that the energetics of the solid surface are different above and below $T_R$, which can be a few hundred kelvin below the melting temperature for solids. This description of surface evolution clearly has important differences from the case of liquid–liquid interfaces.

Figure 1. (a) An STM (scanning tunneling microscopy) image of a stepped Si(001) surface, which illustrates kinks, voids, atom clusters, a step and a terrace. (Ref. B.S. Swartzentruber’s website, Sandia National Laboratories). (b) The contrast between the shape of crystal surfaces above and below the roughening transition. For $T < T_R$ the equilibrium crystal shape has stable facets while for $T > T_R$ the surface is continuously rounded with no facets present (Ref. Fig. 7 from Ref. [3]). (c) The notation used for keeping track of the position of different steps, with positions $x_n(t)$ and $x_{n+1}(t)$ along $x$.

Continuum descriptions of crystal surface evolution
1.2. Theoretical Descriptions of Surface Evolution
The aim of most theoretical studies is to describe the surface morphology at mesoscopic and macroscopic length scales by taking into account the motion of atoms or steps at smaller length scales. Historically, there have been two different theoretical approaches: (i) Approaches based on continuum thermodynamics and principles of continuum mechanics such as mass conservation, which lead to diffusion-like PDEs or variational principles (e.g., for a recent variational approach see Ref. [4]). (ii) Simulations of individual atoms or step motion by solving a large number of coupled equations; for example, the wandering of an individual step is studied by taking into account the local or nonlocal interactions with adjacent steps. This second approach often succeeds in providing detailed information about the surface morphology by accounting for motions over a wide range of length scales. Nevertheless, the merits of the first approach include its relative simplicity: it often enables analytical solutions and, therefore, allows for quantitative predictions for experiments. It is worth mentioning that the differential equations that arise in this continuum approach vary in their form and in the properties of their solutions, and are often unfamiliar to researchers. We take the first approach in the main body of this article.

There are many different types of problems that have emerged in the theoretical and experimental studies of morphological surface evolution, as determined mostly by the geometry and dimensionality of the surface configurations both above and below $T_R$. We mention here only four types of such problems. In particular, there have been studies of: (a) the relaxation or flattening of a surface with long-wavelength features, an example being a periodic corrugation with an initial sine profile in one or two rectilinear coordinates, (b) the relaxation of a surface morphology with an initial localized “bump,” or structure of finite extent, (c) the evolution of the surface where it is intersected by a grain boundary, which is commonly referred to as grooving, and (d) the evolution of surfaces of revolution in three dimensions such as cones and cylinders (e.g., wires). Analytical solutions for representative versions of some of these problems are summarized in this article.
1.3. Step-flow Models
The basic description of atomistic processes in the framework of step kinetics that underlies surface evolution was given by Burton et al. [5] and is referred to as BCF theory; for an overview see Ref. [3]. Figure 1c shows a cross-section of a 1D step configuration along $x$ with the position of the $n$th step denoted by $x_n(t)$, where the $n$th terrace is the region $x_n < x < x_{n+1}$. The starting point for step-flow models is the conservation of mass, which relates $x_n$ to the adatom surface current (atoms/time), $J_n(x)$, on the $n$th terrace by

$$\dot{x}_n(t) = \frac{\Omega}{a}\,[J_{n-1}(x_n,t) - J_n(x_n,t)], \qquad \dot{x}_n(t) \equiv \frac{dx_n}{dt}, \tag{1}$$

where $\Omega$ is the atomic area and $a$ is the step height. The surface current is $J_n = -D_s(\partial c_n/\partial x)$, where $c_n = c_n(x,t)$ is the adatom concentration and $D_s$ is the diffusivity. The concentration $c_n(x,t)$ satisfies the diffusion equation, $D_s(\partial^2 c_n/\partial x^2) = \partial c_n/\partial t \approx 0$, where the time derivative is negligible in the quasistatic approximation. Thus, $c_n$ on each terrace is $c_n(x,t) = A_n(t)x + B_n(t)$, where the time $t$ enters implicitly through the boundary conditions. The requisite boundary conditions describe the attachment and detachment of atoms at the step edges,

$$-J_n(x_n,t) = k\,[c_n(x_n,t) - c_n^{eq}], \qquad J_n(x_{n+1},t) = k\,[c_n(x_{n+1},t) - c_{n+1}^{eq}], \tag{2}$$

where $k$ is the attachment–detachment rate coefficient and the superscript “eq” denotes the equilibrium atom density at the step edge. Hence, $A_n$ and $B_n$ can be determined in terms of $c_n^{eq}$, which is related to the step chemical potential, $\mu_n$, by

$$c_n^{eq} = c^{eq}\exp\left(\frac{\mu_n}{k_B T}\right) \approx c^{eq}\left(1 + \frac{\mu_n}{k_B T}\right), \tag{3}$$

where we have also indicated the limit $|\mu_n| \ll k_B T$ [6]. Finally, $\mu_n$ is related to other step positions via the step interaction potential. In particular, for nearest-neighbor interactions described by the potential $V(x_n, x_{n+1})$, the step chemical potential is $\mu_n = \partial[V(x_n,x_{n+1}) + V(x_{n-1},x_n)]/\partial x_n$. Hence, (1)–(3) define a system of coupled ODEs for the step positions, which can be solved numerically with given initial conditions $x_n(0)$ to determine the evolution of a stepped surface.
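To make the structure of the ODE system (1)–(3) concrete, the following is a minimal sketch and not the authors' implementation. It assumes a periodic train of steps, a repulsive interaction $V = g/(x_{n+1}-x_n)^2$, the linearized form of (3), nondimensional parameter values chosen only for illustration, and explicit Euler time stepping. The function names and defaults are hypothetical. Solving (2) for the linear profile $c_n = A_n x + B_n$ gives the terrace current $J_n = -D_s k\,(c_{n+1}^{eq} - c_n^{eq})/(2D_s + k\,w_n)$ with $w_n = x_{n+1} - x_n$, which is what the code uses.

```python
import math

def step_velocities(x, period, D=1.0, k=1.0, omega_over_a=1.0,
                    c_eq=1.0, g_over_kT=0.1):
    """Right-hand side of Eq. (1) for a periodic train of steps.

    Assumptions (illustrative only): V = g / (x_{n+1} - x_n)^2 and the
    linearized form of Eq. (3); all parameters are nondimensional.
    """
    n = len(x)
    # Terrace widths w_n = x_{n+1} - x_n, wrapping around the period.
    w = [(x[(i + 1) % n] - x[i]) % period for i in range(n)]
    # Step chemical potential in units of k_B T:
    # mu_n = d[V(x_n, x_{n+1}) + V(x_{n-1}, x_n)] / dx_n.
    mu = [2.0 * g_over_kT * (w[i] ** -3 - w[i - 1] ** -3) for i in range(n)]
    ceq = [c_eq * (1.0 + m) for m in mu]
    # Quasistatic terrace current obtained from c_n = A_n x + B_n and Eq. (2).
    J = [-D * k * (ceq[(i + 1) % n] - ceq[i]) / (2.0 * D + k * w[i])
         for i in range(n)]
    return [omega_over_a * (J[i - 1] - J[i]) for i in range(n)]

def relax(x0, period, dt=0.02, n_steps=2000, **params):
    """Explicit Euler integration of the step positions."""
    x = list(x0)
    for _ in range(n_steps):
        v = step_velocities(x, period, **params)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def width_spread(x, period):
    """Difference between the widest and the narrowest terrace."""
    n = len(x)
    w = [(x[(i + 1) % n] - x[i]) % period for i in range(n)]
    return max(w) - min(w)

# Eight steps with unit mean spacing and a sinusoidal perturbation.
N = 8
x_init = [i + 0.2 * math.sin(2.0 * math.pi * i / N) for i in range(N)]
x_final = relax(x_init, period=float(N))
```

With repulsive step interactions the terrace widths relax toward uniform spacing, so `width_spread(x_final, 8.0)` should be well below `width_spread(x_init, 8.0)`, while the sum of the step positions is conserved by construction.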
2. Governing Equations for Continuum Descriptions
A basic ingredient of the continuum equations for surface evolution both above and below roughening is the chemical potential $\mu$ [7]. We restrict the majority of our discussion to the analysis of configurations in one independent space dimension, where the height profile is denoted $h = h(x,t)$ and $h_x \equiv \partial h/\partial x$. The surface thermodynamics can be described in terms of either of two energies, the surface free energy per projected area of the high-symmetry plane, $G(h_x)$, or the perhaps more familiar surface free energy per area of the surface, $\gamma(\phi)$, where $\phi$ is the surface orientation, $\tan\phi = h_x$. The two energies are related by $G = \gamma\sqrt{1 + h_x^2}$. As above, we denote by $\Omega$ the atomic area. The chemical potential $\mu$ is related to $G$ by $\mu - \mu_0 = -\Omega\,(\partial/\partial x)(\partial G/\partial h_x)$, as shown using a variational principle [8] in the Appendix. It then follows by elementary calculus that

$$\mu - \mu_0 = -\Omega\,\frac{\partial}{\partial x}\frac{\partial G}{\partial h_x} = -\Omega\,h_{xx}\,\frac{\partial^2 G}{\partial h_x^2} = -\Omega\,h_{xx}\cos^2\phi\,\frac{\partial}{\partial\phi}\left[\cos^2\phi\,\frac{\partial}{\partial\phi}\left(\frac{\gamma(\phi)}{\cos\phi}\right)\right] = -\Omega\left(\gamma + \frac{d^2\gamma}{d\phi^2}\right)\underbrace{\frac{h_{xx}}{(1+h_x^2)^{3/2}}}_{\kappa}, \tag{4}$$

where in the last step we used $\cos\phi = (1 + h_x^2)^{-1/2}$. Also, $\kappa$ denotes the curvature of the surface. The term $d^2\gamma/d\phi^2 \equiv \gamma_{\phi\phi}$ is not present when considering surface evolution of liquids. In the more general, 2D setting [8], we have $G = G(h_x, h_y)$ and use $\mu - \mu_0 = -\Omega_v\,((\partial/\partial x)(\partial G/\partial h_x) + (\partial/\partial y)(\partial G/\partial h_y))$, where $\Omega_v$ is the atomic volume. As a result, the chemical potential is $\mu - \mu_0 = -\Omega_v(\gamma + \gamma_{\phi_1\phi_1})\kappa_1 - \Omega_v(\gamma + \gamma_{\phi_2\phi_2})\kappa_2$, where $\kappa_1$ and $\kappa_2$ are the principal curvatures, and $\phi_1$ and $\phi_2$ are the corresponding angles (surface orientations) along the normals to these principal curvatures.
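The identity used in the last step of (4), $\partial^2 G/\partial h_x^2 = (\gamma + \gamma_{\phi\phi})/(1+h_x^2)^{3/2}$, can be checked numerically. The sketch below does so with a central finite difference. The anisotropic surface energy $\gamma(\phi) = 1 + 0.05\cos 4\phi$ is a hypothetical choice made only for illustration; it is not taken from the text.

```python
import math

def gamma(phi):
    # Hypothetical smooth anisotropic surface energy (illustration only).
    return 1.0 + 0.05 * math.cos(4.0 * phi)

def gamma_pp(phi):
    # Second derivative of the hypothetical gamma above with respect to phi.
    return -0.8 * math.cos(4.0 * phi)

def G(p):
    # Energy per projected area, G = gamma * sqrt(1 + h_x^2), with p = h_x.
    return gamma(math.atan(p)) * math.sqrt(1.0 + p * p)

def d2G_numeric(p, eps=1e-4):
    # Central finite difference for d^2 G / d h_x^2.
    return (G(p + eps) - 2.0 * G(p) + G(p - eps)) / (eps * eps)

def d2G_closed(p):
    # Closed form implied by Eq. (4): (gamma + gamma'') / (1 + p^2)^(3/2).
    phi = math.atan(p)
    return (gamma(phi) + gamma_pp(phi)) / (1.0 + p * p) ** 1.5

slopes = [-1.0, -0.3, 0.0, 0.5, 2.0]
differences = [abs(d2G_numeric(p) - d2G_closed(p)) for p in slopes]
```

For an isotropic $\gamma$ the closed form reduces to $\gamma/(1+h_x^2)^{3/2}$, which is the liquid-like case mentioned above.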
2.1. Surface Evolution by Evaporation–Condensation Above and Below $T_R$
Perhaps the simplest case of surface dynamics is when the evolution occurs by displacement of atoms by evaporation from, or condensation on, the surface. The driving force for movement of the atoms is then the difference of chemical potentials between the surface and the vapor. Thus, with $v_n$ denoting the speed at which the surface is displaced in the normal direction,

$$v_n = -\zeta(\mu - \mu_0), \tag{5}$$

where $\zeta > 0$ is the product of a surface mobility [9] and the inverse of the step height. It is necessary to distinguish two cases, $T > T_R$ and $T < T_R$, since below roughening the existence of steps and facets produces differences in the form of the relation between surface energy ($\gamma$) and surface orientation ($h_x$). In the classical case of evolution above roughening, $T > T_R$, $\zeta$ and $\gamma$ are analytic in $h_x$. For the special case of constant properties, $\zeta_0$ and $\gamma_0$, Eq. (5) simplifies. Since $v_n = h_t/\sqrt{1 + h_x^2}$ and the curvature is $\kappa = h_{xx}/(1 + h_x^2)^{3/2}$, (4) and (5) lead to

$$h_t = \zeta_0\Omega\gamma_0\,\frac{h_{xx}}{1 + h_x^2} \qquad (T > T_R). \tag{6}$$

In the small-slope limit, $|h_x| \ll 1$, we simply have the familiar linear diffusion equation. Some examples, for both the linear and nonlinear equations, are given below. On the other hand, below the roughening transition, $T < T_R$, the mobility may depend on the surface orientation; for example, $\zeta = k_0|h_x|^\alpha$ with $\alpha = 0$ or 1 is common. Further, it is usual to consider small slopes and define the height function $h(x,t)$ relative to a crystallographic plane. In this case, $\gamma + \gamma_{\phi\phi} = \tilde\gamma|h_x|^\beta$, where $\beta = 1$ when the dominant physical effect at the nanoscale is that of step–step elastic interactions that decay inversely proportional to the square of the step distance [10]; equivalently, $G = g_0 + g_1|h_x| + \tfrac{1}{3}g_3|h_x|^3$, where $2g_3 = \tilde\gamma$. For $\alpha = 1$ and $\beta = 1$ in particular, the surface evolves according to the nonlinear equation

$$h_t = k_0\Omega\tilde\gamma\,h_x^2\,h_{xx}. \tag{7}$$

A general discussion of the evaporation–condensation dynamics below roughening is given by Spohn [9]. Again, some examples are provided below.
2.2. Surface Evolution by Surface Diffusion Above and Below $T_R$
As above, the surface evolves in the normal direction at a speed $v_n$, now owing to variations in the flux of atoms along the surface. It is straightforward to give the development in two independent space dimensions here [11]. If we let $\mathbf{j}$ denote the surface flux, i.e., the number of atoms crossing, per unit time, a unit length of a contour lying in the surface, and $\Omega_v$ the atomic volume as above, then mass conservation requires

$$v_n + \Omega_v\nabla_s\cdot\mathbf{j} = 0. \tag{8}$$

For systems out of, but close to, equilibrium the surface flux $\mathbf{j}$ is proportional to the gradient of the surface chemical potential (or energy) for an atom. The corresponding thermodynamic force on the atom is $-\nabla_s\mu$, and the flux of atoms then follows from a form of a Stokes–Einstein argument: $\mathbf{j} = -(D_s c_s/k_B T)\nabla_s\mu$, where $D_s$ is the surface diffusivity, $c_s$ is the adatom concentration (number/area; adatoms are those atoms free to diffuse at any time along the surface), $k_B$ is Boltzmann’s constant and $T$ is the absolute temperature. Assuming all material parameters are constants, the surface evolves according to

$$v_n = \frac{D_s c_s\Omega_v}{k_B T}\,\nabla_s^2\mu. \tag{9}$$

Above the roughening transition, the chemical potential change, $\mu - \mu_0$, is proportional to the surface curvature. Hence, (9) yields a fourth-order nonlinear PDE for the height $h$. In one dimension the PDE is

$$h_t = -\frac{D_s c_s\Omega^2\gamma_0}{k_B T}\,\sqrt{1 + h_x^2}\;\frac{\partial^2}{\partial x^2}\left[\frac{h_{xx}}{(1 + h_x^2)^{3/2}}\right]. \tag{10}$$

For small slopes, this equation is linearized to $h_t = -(D_s c_s\Omega^2\gamma_0/k_B T)\,h_{xxxx}$. On the other hand, below the roughening transition, the surface energy depends on the surface orientation. Taking $\gamma + \gamma_{\phi\phi} = \tilde\gamma|h_x|$ in one dimension for small surface slopes, we obtain the nonlinear PDE

$$h_t = -\frac{D_s c_s\Omega^2\tilde\gamma}{k_B T}\,(|h_x|\,h_{xx})_{xx}. \tag{11}$$

Some solutions of these equations for surface-diffusion-driven evolution above and below the roughening temperature are given below.
3. Solutions to Prototypical Problems: Surface Evolution by Evaporation–Condensation Processes, $T > T_R$
In these last sections we tersely summarize a number of problems that have been treated analytically, including both the familiar linear second- and fourth-order diffusion equations and the more intricate nonlinear equations. We treat in sequence evaporation–condensation and surface diffusion processes, first for conditions above, and then for conditions below, the roughening transition. We begin with evaporation–condensation dynamics. Recall that (6) reduces to the diffusion equation for $|h_x| \ll 1$, so that $h_t = \zeta_0\Omega\gamma_0 h_{xx}$.

Relaxation of periodically corrugated surfaces [12]. For an initial periodic profile with wavelength $\lambda$, $h(x,0) = \sum_{n=1}^{\infty} A_n\sin(2n\pi x/\lambda)$, the diffusion equation is solved by applying a Fourier series in the form $h(x,t) = \sum_{n=1}^{\infty} a_n(t)\sin(2n\pi x/\lambda)$, where the coefficients $a_n(t)$ satisfy the ODE $\dot a_n + \zeta_0\Omega\gamma_0(2\pi n/\lambda)^2 a_n = 0$ and the initial condition $a_n(0) = A_n$. The pure sinusoid $h(x,0) = A\sin(2\pi x/\lambda)$ corresponds to $A_1 = A$ and $A_n = 0$ for $n \ge 2$. Hence, the complete solution is

$$h(x,t) = \sum_{n=1}^{\infty} A_n\,e^{-\zeta_0\Omega\gamma_0\left(\frac{2\pi n}{\lambda}\right)^2 t}\,\sin\left(\frac{2n\pi x}{\lambda}\right). \tag{12}$$

For sufficiently long times, which corresponds to $t \gg \lambda^2/(4\pi^2\zeta_0\Omega\gamma_0)$, Eq. (12) simplifies to $h(x,t) \sim A_1 e^{-\zeta_0\Omega\gamma_0(4\pi^2/\lambda^2)t}\sin(2\pi x/\lambda)$, which is exact at all times for the pure sinusoid; thus, the lifetime of the periodic profile is proportional to $\lambda^2$.

Decay of a localized mound of atoms. Again, we restrict ourselves to the small-slope approximation. For an initial bump, $h(x,0) = f(x)$ where $f(x)$ is of finite extent, and the condition $h \to 0$ sufficiently fast as $|x| \to \infty$, $h(x,t)$ is determined analytically by applying the Laplace transform in $t$, $\tilde h(x,s) = \int_0^\infty dt\,h(x,t)\,e^{-st}$. In particular, for $f(x) = \delta(x)$, we find $\tilde h^\delta(x,s) = e^{-\sqrt{s/D}\,|x|}/(2\sqrt{Ds})$ with $D \equiv \zeta_0\Omega\gamma_0$, whose inversion gives the fundamental solution

$$h^\delta(x,t) = \frac{1}{\sqrt{4\pi\zeta_0\Omega\gamma_0 t}}\,e^{-\frac{x^2}{4\zeta_0\Omega\gamma_0 t}}. \tag{13}$$

Notice that this solution has the similarity form $t^{-1/2}H(\eta)$, where $\eta$ is the similarity variable $x/\sqrt{4\zeta_0\Omega\gamma_0 t}$. The solution for an arbitrary initial bump is obtained by superposition,

$$h(x,t) = \int_{-\infty}^{\infty} dx'\,h^\delta(x - x', t)\,f(x'). \tag{14}$$
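As an illustration of the superposition formula (14), the sketch below evaluates it by midpoint quadrature for a box-shaped initial bump and compares the result with the closed form in terms of error functions. The box profile, the parameter values and the function names are assumptions made for illustration; `D` stands for the combination $\zeta_0\Omega\gamma_0$.

```python
import math

def kernel(x, t, D):
    """Fundamental solution (13) of h_t = D h_xx, with D = zeta0*Omega*gamma0."""
    return math.exp(-x * x / (4.0 * D * t)) / math.sqrt(4.0 * math.pi * D * t)

def evolve_box(x, t, D, half_width=1.0, height=1.0, n=4000):
    """Eq. (14) by midpoint quadrature for f = height on |x'| < half_width."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        xp = -half_width + (i + 0.5) * dx
        total += kernel(x - xp, t, D) * height * dx
    return total

def box_exact(x, t, D, half_width=1.0, height=1.0):
    """Closed form of the same integral in terms of error functions."""
    s = math.sqrt(4.0 * D * t)
    return 0.5 * height * (math.erf((x + half_width) / s)
                           - math.erf((x - half_width) / s))

samples = [(0.0, 0.1), (0.5, 1.0), (2.0, 5.0)]
errors = [abs(evolve_box(x, t, 1.0) - box_exact(x, t, 1.0)) for x, t in samples]

# At long times the peak approaches M / sqrt(4 pi D t), where M = 2 is the
# area under the box: the bump forgets its shape and keeps only its mass.
peak_ratio = box_exact(0.0, 1000.0, 1.0) * math.sqrt(4.0 * math.pi * 1000.0) / 2.0
```

The long-time ratio `peak_ratio` tends to 1, which is the $t^{-1/2}$ similarity behavior noted above.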
Grooving at a grain boundary [13, 14]. Here we consider the evolution of a groove that forms at a grain boundary of an otherwise flat surface. It is thus necessary to solve the nonlinear equation (6) subject to the condition $h_x(0,t) = -(\cos\theta/\sin\theta)$, where $\theta$ is half the dihedral angle formed at the groove. This problem admits a similarity solution of the form $h(x,t) = (2\zeta_0\Omega\gamma_0 t)^{1/2}\,H\big(x/(2\zeta_0\Omega\gamma_0 t)^{1/2}\big)$, where $H(\eta)$ satisfies the ODE

$$(H - \eta H_\eta)\,(1 + H_\eta^2) = H_{\eta\eta}. \tag{15}$$

Thus, the groove deepens at a rate proportional to $t^{1/2}$. A numerical solution of (15) is in principle necessary and is straightforward to obtain by the usual shooting procedure of guessing $H(0)$ with a given $H_\eta(0)$ until $H(\eta \to \infty) \to 0$. For the special case of small surface slopes, $|H_\eta| \ll 1$, (15) can be linearized, and the resulting solution is $H(\eta) = -(\cos\theta/\sin\theta)\big(\eta\,\mathrm{erfc}(\eta/\sqrt{2}) - \sqrt{2/\pi}\,e^{-\eta^2/2}\big)$.
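The shooting procedure described above can be sketched as follows. This is one possible realization under stated assumptions, not a prescribed algorithm: a fixed-step fourth-order Runge–Kutta integrator, a finite cutoff `eta_max` standing in for $\eta \to \infty$, bisection on $H(0)$, and small-to-moderate slopes $m = \cos\theta/\sin\theta$ so that the trial solutions stay bounded. The function names and defaults are hypothetical.

```python
import math

def integrate(H0, slope0, eta_max=6.0, n=1200):
    """RK4 integration of H'' = (H - eta H')(1 + H'^2) from eta = 0."""
    def rhs(eta, H, P):
        return P, (H - eta * P) * (1.0 + P * P)
    d = eta_max / n
    eta, H, P = 0.0, H0, slope0
    for _ in range(n):
        k1 = rhs(eta, H, P)
        k2 = rhs(eta + 0.5 * d, H + 0.5 * d * k1[0], P + 0.5 * d * k1[1])
        k3 = rhs(eta + 0.5 * d, H + 0.5 * d * k2[0], P + 0.5 * d * k2[1])
        k4 = rhs(eta + d, H + d * k3[0], P + d * k3[1])
        H += d * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        P += d * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        eta += d
    return H  # value of H at eta_max

def shoot(m, iters=60):
    """Bisect on H(0) so that H(eta_max) is ~0, given H'(0) = -m."""
    lo, hi = 0.0, 2.0 * m
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if integrate(mid, -m) > 0.0:
            hi = mid   # far field grows positive: H(0) guessed too large
        else:
            lo = mid   # far field grows negative: H(0) guessed too small
    return 0.5 * (lo + hi)

# Small-slope check: the linearized solution gives H(0) = m * sqrt(2/pi).
m_small = 0.01
H0_small = shoot(m_small)
H0_linear = m_small * math.sqrt(2.0 / math.pi)

# A moderately steep groove, where the nonlinear factor in (15) matters.
H0_moderate = shoot(0.2)
```

The bisection works because a wrong guess for $H(0)$ leaves a linearly growing component in the far field, whose sign tells which way to correct the guess.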
4. Surface Evolution by Surface-Diffusion Processes, $T > T_R$
In the small-slope approximation, $|h_x| \ll 1$, Eq. (10) reads $h_t = -B\,h_{xxxx}$, where $B = (D_s c_s\Omega^2\gamma_0/k_B T) > 0$ is a material parameter with dimension (length)$^4$/time.

Decay of periodic surface modulations. For an initial periodic profile with wavelength $\lambda$, $h(x,0) = \sum_{n=1}^{\infty} A_n\sin(2n\pi x/\lambda)$, the linearized form of (10) is solved again by applying Fourier series; the coefficients $a_n(t)$ satisfy the ODE $\dot a_n + B(2\pi n/\lambda)^4 a_n = 0$ and the initial condition $a_n(0) = A_n$, with $A_1 = A$ and $A_n = 0$ for $n \ge 2$ in the case of the pure sinusoid $A\sin(2\pi x/\lambda)$. Hence, the complete solution is

$$h(x,t) = \sum_{n=1}^{\infty} A_n\,e^{-B\left(\frac{2\pi n}{\lambda}\right)^4 t}\,\sin\left(\frac{2n\pi x}{\lambda}\right). \tag{16}$$

For sufficiently long times, $t \gg (2\pi)^{-4}\lambda^4/B$, this solution is approximated by $h(x,t) \sim A_1 e^{-B(2\pi/\lambda)^4 t}\sin(2\pi x/\lambda)$; thus, the lifetime of the periodic profile is proportional to $\lambda^4$. This scaling with size should be contrasted with the case of evaporation–condensation, for which the lifetime is proportional to $\lambda^2$.

Decay of a localized mound of atoms. In some circumstances there are initial conditions that correspond to a mound of material on an otherwise flat surface. The system proceeds to lower its energy by flattening and so it is of interest to quantify this decay process. For an initial bump, $h(x,0) = f(x)$, and the condition $h \to 0$ sufficiently fast as $|x| \to \infty$, $h(x,t)$ is again determined analytically by applying the Laplace transform of $h(x,t)$ in $t$. In particular, for $f(x) = \delta(x)$, we find $\tilde h^\delta(x,s) = 2^{-1}B^{-1/4}s^{-3/4}\,e^{-s^{1/4}B^{-1/4}|x|/\sqrt{2}}\,\sin\big(s^{1/4}2^{-1/2}B^{-1/4}|x| + \pi/4\big)$, whose inversion gives the real solution

$$h^\delta(x,t) = \frac{1}{2\pi i}\,\frac{1}{2(Bt)^{1/4}}\int_{-i\infty}^{i\infty} d\sigma\,\sigma^{-3/4}\,e^{\sigma - \eta\sigma^{1/4}}\,\sin\left(\eta\sigma^{1/4} + \frac{\pi}{4}\right), \tag{17}$$

where $\eta = |x|/(4Bt)^{1/4}$. The solution has the similarity form $t^{-1/4}H(\eta)$, which, for long times, could have been recognized immediately. The solution for an arbitrary bump is given by (14), with $h^\delta$ taken from (17); for sufficiently long times this solution also obtains a similarity structure. It is inferred that for long times the bump has a lifetime proportional to the fourth power of its linear size.
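The $t^{-1/4}$ similarity scaling and the contrast between the $\lambda^2$ and $\lambda^4$ lifetimes can be checked with a short numerical sketch. Instead of the Laplace inversion (17), the code below uses the equivalent Fourier representation $h^\delta(x,t) = \pi^{-1}\int_0^\infty \cos(kx)\,e^{-Bk^4 t}\,dk$ of the solution of $h_t = -B\,h_{xxxx}$ with $h(x,0) = \delta(x)$. This is a substitute technique chosen for simplicity; the function names, the quadrature rule and the cutoff are assumptions.

```python
import math

def kernel4(x, t, B, k_max_scaled=4.0, n=4000):
    """Fundamental solution of h_t = -B h_xxxx via a Fourier cosine integral."""
    scale = (B * t) ** -0.25        # natural wavenumber scale (B t)^(-1/4)
    dk = k_max_scaled * scale / n   # integrand is ~exp(-256) at the cutoff
    total = 0.0
    for i in range(n):
        k = (i + 0.5) * dk          # midpoint rule
        total += math.cos(k * x) * math.exp(-B * t * k ** 4) * dk
    return total / math.pi

def lifetime_evaporation(lam, D):
    """e-folding time of the fundamental mode in (12), D = zeta0*Omega*gamma0."""
    return lam ** 2 / (4.0 * math.pi ** 2 * D)

def lifetime_diffusion(lam, B):
    """e-folding time of the fundamental mode in (16)."""
    return lam ** 4 / (16.0 * math.pi ** 4 * B)

# Peak height: int_0^inf exp(-a k^4) dk = Gamma(1/4) / (4 a^(1/4)).
peak_numeric = kernel4(0.0, 1.0, 1.0)
peak_exact = math.gamma(0.25) / (4.0 * math.pi)

# t^(-1/4) scaling: a 16-fold increase in time halves the peak.
peak_ratio = kernel4(0.0, 16.0, 1.0) / kernel4(0.0, 1.0, 1.0)
```

Doubling the wavelength multiplies `lifetime_evaporation` by 4 and `lifetime_diffusion` by 16, which is the size scaling contrasted in the text.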
5. Surface Evolution by Evaporation–Condensation Processes, $T < T_R$
Decay of a localized mound of atoms in one space dimension. Here we consider Eq. (7). The material parameter $k_0\Omega\tilde\gamma$ has the units of diffusivity, (length)$^2$/time. If we consider an arbitrary initial distribution of atoms confined to a region $|x| \le X(t)$, then global mass conservation requires $2\int_0^{X(t)} h(x,t)\,dx = M = \text{constant}$. For this problem there is a similarity solution that describes the long-time behavior of a bump, and a wide class of initial distributions is expected to evolve to the profile predicted by the similarity solution for times $t \gg \ell^2/(k_0\Omega\tilde\gamma)$, where $\ell$ is a length scale characteristic of the initial distribution. The similarity solution has the form

$$h(x,t) = \left(\frac{M^4}{96\,k_0\Omega\tilde\gamma}\right)^{1/6} t^{-1/6}\,H(\eta), \qquad \text{where } \eta = \frac{x}{(3k_0\Omega\tilde\gamma M^2/2)^{1/6}\,t^{1/6}}, \tag{18}$$

and the function $H(\eta)$ thus satisfies the second-order ODE $-(H\eta)_\eta = H_\eta^2 H_{\eta\eta}$. Conservation of the total mass becomes $\int_0^{\eta_e} H(\eta)\,d\eta = 1$, where $X_e(t) = \eta_e\,(3k_0\Omega\tilde\gamma M^2/2)^{1/6}\,t^{1/6}$ is the finite extent of the evolving surface; the constant $\eta_e$ remains to be determined. The ODE for $H(\eta)$ can be integrated twice; using the symmetry condition $H_\eta(0) = 0$ along with the definition of the leading edge $\eta_e$ as $H(\eta_e) = 0$, we obtain

$$H(\eta) = \left(\frac{3}{8}\right)^{1/2}\eta_e^2\left[1 - \left(\frac{\eta}{\eta_e}\right)^{4/3}\right]^{3/2}, \tag{19}$$

which is the form given by Spohn [9]. The parameter $\eta_e$ is determined from total mass conservation; we find

$$\eta_e^3\left(\frac{3}{8}\right)^{1/2}\int_0^1 (1 - \eta^{4/3})^{3/2}\,d\eta = 1 \quad\text{or}\quad \eta_e = \left(\frac{25}{6\pi}\right)^{1/6}\left[\frac{\Gamma(1/4)}{\Gamma(3/4)}\right]^{1/3} \approx 1.50, \tag{20}$$

where $\Gamma(s)$ is the Gamma function.

Decay of an axisymmetric bump in two dimensions [9]. For axisymmetric shapes, $h = h(r,t)$, where $r = \sqrt{x^2 + y^2}$. We take [10]

$$G = g_0 + g_1|\nabla h| + \tfrac{1}{3}g_3|\nabla h|^3, \tag{21}$$

and $\mu - \mu_0 = -\Omega_v\,((\partial/\partial x)(\partial G/\partial h_x) + (\partial/\partial y)(\partial G/\partial h_y))$. The resulting PDE for small slopes, $|\nabla h| \ll 1$, and $h_r < 0$ follows from (5) and (21) with $\zeta = k_0|\nabla h|$ to be

$$h_t = A\,\frac{h_r}{r}\left[1 + \frac{g_3}{g_1}\frac{\partial}{\partial r}\left(r h_r^2\right)\right], \qquad \text{where } A = \Omega_v k_0 g_1, \tag{22}$$

with the initial condition $h(r,0) = \mathcal{H}(r)$. Neglecting the $g_3/g_1$ term in the PDE and applying the method of characteristics, we obtain $h(r,t) \approx \mathcal{H}\big(\sqrt{2At + r^2}\big)$. This solution describes how the initial bump shrinks to zero at long times, while corrections due to the $g_3/g_1$ term then are relatively small and can be obtained via simple iterations.
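Returning to the one-dimensional similarity profile (19)–(20), the closed-form value of $\eta_e$ can be cross-checked by evaluating the mass-conservation integral numerically. The sketch below uses a plain midpoint rule; the function name and the number of quadrature points are arbitrary choices made for illustration.

```python
import math

def mass_integral(n=20000):
    """Midpoint rule for int_0^1 (1 - s^(4/3))^(3/2) ds."""
    ds = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        total += (1.0 - s ** (4.0 / 3.0)) ** 1.5
    return total * ds

# Mass condition in (20): eta_e^3 * sqrt(3/8) * integral = 1.
eta_e_numeric = (math.sqrt(8.0 / 3.0) / mass_integral()) ** (1.0 / 3.0)

# Closed form from evaluating the integral as a Beta function.
eta_e_closed = ((25.0 / (6.0 * math.pi)) ** (1.0 / 6.0)
                * (math.gamma(0.25) / math.gamma(0.75)) ** (1.0 / 3.0))

def H(eta, eta_e):
    """Similarity profile (19)."""
    return math.sqrt(3.0 / 8.0) * eta_e ** 2 * (1.0 - (eta / eta_e) ** (4.0 / 3.0)) ** 1.5

def H_eta(eta, eta_e):
    """Derivative of (19) with respect to eta."""
    u = (eta / eta_e) ** (4.0 / 3.0)
    return (-2.0 * math.sqrt(3.0 / 8.0) * eta_e ** (2.0 / 3.0)
            * eta ** (1.0 / 3.0) * math.sqrt(1.0 - u))

# First integral of the ODE with H_eta(0) = 0: H_eta^3 = -3 * eta * H.
residual = H_eta(0.7, eta_e_closed) ** 3 + 3.0 * 0.7 * H(0.7, eta_e_closed)
```

The last line checks that (19) satisfies the once-integrated form of the ODE, $H_\eta^3 = -3\eta H$, at an arbitrary interior point.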
6. Surface Evolution by Surface-Diffusion Processes, $T < T_R$
Evolution of a periodic profile in one dimension. Kinetic simulations [15, 16] based on a step-flow model with elastic step–step interactions have indicated that the height of periodic profiles in one dimension may evolve as $h(x,t) = \Phi(x)\Psi(t)$, i.e., $h$ has a separable form. From the continuum viewpoint the surface evolution can be described by (11), $h_t = -B\,(|h_x|h_{xx})_{xx}$, where $B = (D_s c_s\Omega^2\tilde\gamma/k_B T)$. Assuming $h_x > 0$, $\Psi$ and $\Phi$ thus satisfy $-\dot\Psi/\Psi^2 = C = \text{const.} > 0$ and $B\,(\Phi_x\Phi_{xx})_{xx} = C\,\Phi(x)$. Hence, $\Psi(t) = (Ct + K)^{-1} \approx C^{-1}t^{-1}$ for long times, while the ODE for $\Phi$ can only be solved numerically. The set of boundary conditions that would yield a unique solution to this PDE is a topic of discussion in the literature (e.g., Ref. [2]).

Evolution of an axisymmetric shape in two dimensions [17]. Here we consider the surface-diffusion-driven change in shape of an initially conical surface (see Fig. 2a). Using (21) and the equation for $\mu - \mu_0$ in terms of $G$ along with (9), we obtain a PDE for $h(x,y,t)$ in two dimensions. For axisymmetric shapes, $h = h(r,t)$, with a growing facet of radius $w(t)$, as shown in Fig. 2a, the PDE for the slope profile $F = -h_r$ is

$$\frac{\partial F}{\partial t} = \frac{3B}{r^4} - B\,\frac{g_3}{g_1}\,\frac{\partial}{\partial r}\nabla^2\left[\frac{1}{r}\frac{\partial}{\partial r}\left(rF^2\right)\right], \tag{23}$$

where $B = (D_s c_s\Omega_v^2 g_1/k_B T)$. This equation can be studied using a combination of free-boundary ideas (the facet width $w(t)$ changes in time) and boundary-layer ideas (there is a region of rapid variation associated with the highest-derivative term in Eq. (23)). For $g_3/g_1 \ll 1$, singular perturbation theory suggests that the solution $F$ varies rapidly inside a boundary layer of width $\delta_b$ near the facet.
Figure 2. (a) Schematic of an axisymmetric shape, $z = h(r,t)$, with a facet of radius $w(t)$ and an indication of the step structure on the atomic scale. (b) Surface slope profiles (scaled slope $f_0(\eta)$ versus scaled radial coordinate $\eta$) as a function of a similarity variable. The different profiles, labeled a to e, correspond to different values of $g_3/g_1$ as described in Ref. [17].
Taking $F \approx a(t)\,f_0(\eta)$ for long times, where $\eta = [r - w(t)]/\delta_b$, we obtain $\delta_b = O\big((g_3/g_1)^{1/3}\big)$ and a universal ODE for $f_0$, $(f_0^2)''' = f_0 - 1$. This equation can only be solved numerically assuming slope continuity, $f_0(0) = 0$. Solutions are obtained by the routine shooting procedure of starting with $f_0(\eta^*) \approx c_1(\eta^*)^{1/2} + c_3(\eta^*)^{3/2}$ for $\eta^* \ll 1$ and finding the coefficients $c_1$ and $c_3$ so that $f_0(\eta \to \infty) = 1$, as dictated by asymptotic matching at $\eta = \infty$ with the “outer solution” for $g_3/g_1 = 0$. Different numerical solutions of the ODE are shown in Fig. 2b. There is excellent agreement (not shown here) between the theoretical predictions and the results from kinetic simulations [6].
7. Outlook
The development of continuum descriptions for the time evolution of the shape of crystalline materials leads to a number of different partial differential equations. The distinction of the driving forces for surface evolution above and below the roughening temperature is significant and it is only in fairly recent years that attention has focussed on the below roughening case. The use of step-flow models, and the understanding gained from these systems, is also important for probing kinetic, and other, features of the basic continuum models. Further advances and comparison of these ideas with experiment will lead to progress in future years.
Appendix A

Here we derive the first line of equation (4), which relates the chemical potential $\mu$ to the surface energy parameter $G(h_x)$. The total surface free energy in 1D is $G_t = \int dx\,G(h_x)$. Taking the first variation with respect to $h$ of $G_t - \Omega^{-1}\int dx\,\tilde\lambda_h h$ to be zero for $h$ fixed at the endpoints, where $\tilde\lambda_h$ is the change of the chemical potential, we find

$$0 = \int dx\left[\frac{\partial G}{\partial h_x}\,\delta h_x - \Omega^{-1}\tilde\lambda_h\,\delta h\right] = -\int dx\left[\frac{\partial}{\partial x}\frac{\partial G}{\partial h_x} + \Omega^{-1}\tilde\lambda_h\right]\delta h. \tag{A1}$$

By definition of the chemical potential, $\mu - \mu_0 = \tilde\lambda_h$, and the starting point in equation (4) is obtained.
Acknowledgments

We thank M.Z. Bazant, D. Kandel, R.V. Kohn, R.R. Rosales, V. Shenoy, and Z. Suo for helpful conversations, E.D. Williams for her kind permission to reproduce a figure from (Jeong and Williams, 1999), and M.J. Aziz for his constant support, encouragement and valuable explanations.
References

[1] P. Nozières, “Shape and growth of crystals,” In: C. Godreche (ed.), Solids Far from Equilibrium, Cambridge University Press, Cambridge, pp. 1–154, 1992.
[2] A. Chame, S. Rousset, H.P. Bonzel, and J. Villain, “Slow dynamics of stepped surfaces,” Bulgarian Chem. Commun., 29, 398–434, 1996/97.
[3] H.-C. Jeong and E.D. Williams, “Steps on surfaces: experiment and theory,” Surf. Sci. Rep., 34, 171–294, 1999.
[4] V.B. Shenoy and L.B. Freund, “A continuum description of the energetics and evolution of stepped surfaces in strained nanostructures,” J. Mech. Phys. Solids, 50, 1817–1841, 2002.
[5] W.K. Burton, N. Cabrera, and F.C. Frank, “The growth of crystals and the equilibrium structure of their surfaces,” Phil. Trans. R. Soc. London Ser. A, 243, 299–358, 1951.
[6] N. Israeli and D. Kandel, “Profile of a decaying crystalline cone,” Phys. Rev. B, 60, 5946–5962, 1999.
[7] C. Herring, “Surface tension as a motivation for sintering,” In: W.E. Kingston (ed.), The Physics of Powder Metallurgy, McGraw-Hill, New York, pp. 143–179, 1951.
[8] W.W. Mullins, “Capillarity-induced surface morphologies,” Interface Sci., 9, 9–20, 2001.
[9] H. Spohn, “Surface dynamics below the roughening transition,” J. Phys. I, France, 3, 69–81, 1993.
[10] H.P. Bonzel, “Equilibrium crystal shapes: towards absolute energies,” Prog. Surf. Sci., 67, 45–57, 2001.
[11] F.A. Nichols and W.W. Mullins, “Morphological changes of a surface of revolution due to capillarity-induced surface diffusion,” J. Appl. Phys., 36, 1826–1835, 1965.
[12] W.W. Mullins, “Flattening of a nearly plane solid surface due to capillarity,” J. Appl. Phys., 30, 77–83, 1959.
[13] W.W. Mullins, “Theory of thermal grooving,” J. Appl. Phys., 28, 333–339, 1957.
[14] Z. Suo, “Motions of microscopic surfaces in materials,” Adv. Appl. Mech., 33, 193–294, 1997.
[15] M. Ozdemir and A. Zangwill, “Morphological equilibration of a corrugated crystalline surface,” Phys. Rev. B, 42, 5013–5024, 1990.
[16] N. Israeli and D. Kandel, “Decay of one-dimensional surface modulations,” Phys. Rev. B, 62, 13707–13717, 2000.
[17] D. Margetis, M.J. Aziz, and H.A. Stone, “Continuum description of profile scaling in nanostructure decay,” Phys. Rev. B, 69, art. 041404(R), 2004.
4.9 BREAKUP AND COALESCENCE OF FREE SURFACE FLOWS Jens Eggers School of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, UK
The picture of a dolphin, jumping out of the water in the New England Aquarium in Boston (Fig. 1), gives a very good idea of the challenges involved in the description of free-surface flows. In a complex series of events, which is still not well understood, water swept up by the dolphin breaks up into thousands of small drops. A more detailed idea of what happens close to the point of breakup is given in Fig. 2, which shows a drop of water falling from a faucet. Once an elongated neck has formed, surface energy is minimized by locally reducing its radius, and a drop separates at a point. Once the neck is broken, it rapidly snaps back, forming capillary waves on its surface. In the last picture on the right, the neck has been severed on the other end as well. Thus in a single dripping event, two drops have actually formed, and the smaller “satellite” drop will subsequently break up to form even smaller drops. This gives a good idea of the complexity of just a single breakup event, driven by surface tension.

The complementary event of drop coalescence is illustrated by Fig. 3, which shows two drops which have been made to touch at a point. Surface tension drives a motion that makes the drops coalesce, since the combined drop has a lower surface energy. The initial motion is so rapid that it is hardly resolved by the camera, and it results in quite a complicated sequence of capillary waves, drop oscillations, etc.

Clearly changes in topology brought about by breakup or coalescence are the most dramatic events in the evolution of a free surface, characterized by a very rapid and complex motion of the surface (cf. Figs. 2 and 3). In fact, it is not a priori clear whether continuum equations are able to describe topology changes, since somewhere in between flow features develop which are of molecular size. So apart from predicting the actual motion near the singular point, the aim of the theory is to explain how one topology can be transformed into the other in a unique way. The spatial and temporal resolution of any numerical simulation is limited, so a thorough understanding of the singularity is needed. Once the rapid motion near the singularity can be described theoretically, the numerical evolution can be matched onto it. In addition, the theoretical description of singularities will explain some universal flow features, attributable to breakup or coalescence of drops.

Figure 1. Dolphin in the New England Aquarium in Boston; photograph by Harold Edgerton.

S. Yip (ed.), Handbook of Materials Modeling, 1403–1416. © 2005 Springer. Printed in the Netherlands.
Figure 2. A sequence of photographs showing a drop of water falling from a pipette D = 5.2 mm in diameter (photograph by H. Peregrine, see Ref. [2]). The superimposed black lines are the result of a simulation of the one-dimensional equations (1) and (2).
Figure 3. A sequence of images ($\Delta t = 1$ ms) of two mercury drops brought into contact at the point indicated by an arrow [4].
1. Non-linear Dynamics of Drop Formation
To obtain insight into the non-linear dynamics close to breakup one has to solve a notoriously difficult problem: the Navier–Stokes equation within a domain that is changing in time. The motion of the interface is dictated by the fluid motion itself, as the interface is convected passively by the fluid motion at the interface. The motion of the interface has to be computed with great accuracy, because the fluid motion is driven by surface tension, resulting in a Laplace pressure proportional to the mean curvature of the interface. Since the driving is proportional to pressure gradients, acceleration of a fluid element is effectively determined by third derivatives of the surface shape. The numerical difficulties inherent in this coupling of fluid motion and its driving force are discussed thoroughly by Scardovelli and Zaleski [1], giving an overview of available numerical methods.
To obtain greater insight into drop breakup, it is necessary to reduce the non-linear dynamics associated with it to its essentials. The idea is that near the point where the neck radius goes to zero, the fluid motion is directed primarily in a direction parallel to the axis. This allows the problem to be reduced to an equation for the average velocity in the axial direction alone. Alternatively, and more or less equivalently, the velocity field can be expanded in the radial coordinate. If a typical radial length scale is smaller than the corresponding axial one, usually signaled by the interface slope being less than one, the leading-order coefficient for the velocity suffices, as discussed in detail by Eggers [2]. The result is a system of equations for the local radius $h(z,t)$ of the fluid neck, and the average velocity $v(z,t)$ in the axial direction. All other terms are of higher order in $h$ or the radial variable $r$. For a liquid with kinematic viscosity $\nu$, density $\rho$, and surface tension $\gamma$ (neglecting the effect of the outer gas), the result of the calculation is:

$$\partial_t h^2 + \partial_z(v h^2) = 0, \tag{1}$$

$$\underbrace{\partial_t v + v\,\partial_z v}_{\text{inertia}} = \underbrace{-\frac{\gamma}{\rho}\,\partial_z\left(\frac{1}{R_1} + \frac{1}{R_2}\right)}_{\text{surface tension}} + \underbrace{3\nu\,\frac{\partial_z(h^2\,\partial_z v)}{h^2}}_{\text{viscosity}}. \tag{2}$$
The simplification achieved by (1) and (2) is enormous. Firstly, the dimension of the problem has been reduced by one (the radial variable has been eliminated). Secondly, the moving boundary is described explicitly by h(z, t). Equation (1) expresses the conservation of mass: it is written as a conservation equation for the volume $h^2\,dz$ of a slice of fluid. Equation (2) is a balance of forces acting on a fluid element, and thus very similar in structure to the original Navier–Stokes equation [3]. The l.h.s. of (2) corresponds to inertial forces, driven by surface tension and viscous forces on the right. As is to be expected from Laplace's formula, surface tension forces are proportional to the mean curvature, which for a body of rotation is

$$\frac{1}{R_1} + \frac{1}{R_2} = \frac{1}{h\sqrt{1 + (\partial_z h)^2}} - \frac{\partial_{zz} h}{\left(1 + (\partial_z h)^2\right)^{3/2}}. \tag{3}$$
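As a quick check on the curvature expression (3), the following minimal Python sketch evaluates it for a sphere and for a cylinder. The function names and the sample radius are illustrative choices, not part of the original text. For a sphere of radius R the result should be the constant 2/R at every axial position, which is why a spherical drop produces no pressure gradient and can be an equilibrium solution of (1) and (2).

```python
import math

def mean_curvature(h, hz, hzz):
    """Sum of principal curvatures 1/R1 + 1/R2 of a body of rotation, Eq. (3).

    h, hz, hzz are the local radius and its first and second z-derivatives.
    """
    s = 1.0 + hz * hz
    return 1.0 / (h * math.sqrt(s)) - hzz / s**1.5

def sphere_profile(z, R):
    """Return h, dh/dz, d2h/dz2 for a sphere of radius R centred at z = 0."""
    h = math.sqrt(R * R - z * z)
    hz = -z / h
    hzz = -R * R / h**3
    return h, hz, hzz

R = 2.0  # illustrative radius
sphere_kappas = [mean_curvature(*sphere_profile(z, R)) for z in (-1.5, 0.0, 0.7, 1.9)]
cylinder_kappa = mean_curvature(0.5, 0.0, 0.0)  # slender-thread limit: 1/h

print(sphere_kappas, cylinder_kappa)
```

The cylinder case recovers the leading-order expression 1/h that the radial expansion alone would give.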
Strictly speaking, the radial expansion implied by (1) and (2) would have required us to replace the mean curvature by the leading-order expression 1/ h(z, t) alone. This is indeed sufficient to describe the neighborhood of the pinch point, but the applicability of the equations is greatly enhanced by including the full curvature, because the equations then include a spherical drop among their equilibrium solutions. The remarkable power of the system (1) and (2) in describing a real break-up event is illustrated by Fig. 2. The sequence of experimental pictures shows a drop of water falling from a pipette 5.2 mm in diameter. The drop is shown at the moment of the first bifurcation (first picture), after which the
fluid neck recoils from the drop (second picture). Shortly afterward the neck pinches on its other end (third picture), thereby forming a satellite drop. Such satellite drops are seen to be a direct consequence of the long neck that forms at pinch-off, which in turn reflects the profile being extremely asymmetric around the pinch-point: on one side, the profile asymptotes to the drop; on the other side it is very flat and forms a slender neck. It is evident from Fig. 2 that the one-dimensional approximation works extremely well in describing breakup and the formation of satellite drops. This includes regions near the drop, where the profile is actually quite steep, so the expansion underlying (1) and (2) is formally not valid. A careful assessment of the quality of one-dimensional approximations, achieved through comparison with accurate numerical simulations of the full Navier–Stokes equation, is given in Ref. [5]. As discussed in the introduction, it is not clear how to pass from the first panel in Fig. 2 to the second on the basis of (1) and (2); rather, some "surgery" was necessary. Namely, when the minimum neck radius was just $10^{-4}$ times the original radius, the neck was cut and spherical caps were placed on either side. Below we will justify this procedure on the basis of a more detailed understanding of the dynamics at the pinch-point.
2. Similarity Solutions
We now turn to the immediate neighborhood of the pinch-point, where separation occurs. Since the evolution takes place on length and time scales much smaller than any externally applied scales, such as the diameter D of the capillary in Fig. 2, the motion should be properly measured in some intrinsic units of the fluid. The only such units of length and time that can be formed from the fluid parameters are

$$\ell_\nu = \frac{\nu^2 \rho}{\gamma} \quad \text{and} \quad t_\nu = \frac{\nu^3 \rho^2}{\gamma^2}. \tag{4}$$
As expected intuitively, length and time scales increase with viscosity, which can vary greatly between different fluids. For water, the viscous length scale $\ell_\nu$ is just 10 nm, far below anything visible on the scale of the photographs in Figs. 2 and 3. For glycerol, on the other hand, $\ell_\nu$ is on the order of centimeters, and the asymptotics described below is easily observable. Since our description is local, we have to represent the motion in a local coordinate system. The only reasonable choice for its origin is the point $z_0$ and time $t_0$ where the singularity occurs. Making space and time dimensionless using $\ell_\nu$ and $t_\nu$, we introduce

$$z' = \frac{z - z_0}{\ell_\nu} \quad \text{and} \quad t' = \frac{t_0 - t}{t_\nu}. \tag{5}$$
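To make the intrinsic scales concrete, the short Python sketch below evaluates the viscous length and time, ν²ρ/γ and ν³ρ²/γ², for two fluids. The material parameters are approximate room-temperature values assumed here for illustration; they are not taken from the text, and the function name is hypothetical.

```python
def viscous_scales(nu, rho, gamma):
    """Intrinsic viscous length and time scales of a fluid (SI units).

    nu: kinematic viscosity [m^2/s], rho: density [kg/m^3],
    gamma: surface tension [N/m].
    """
    l_nu = nu**2 * rho / gamma
    t_nu = nu**3 * rho**2 / gamma**2
    return l_nu, t_nu

# Approximate room-temperature properties (assumed values, for illustration only).
water = dict(nu=1.0e-6, rho=1.0e3, gamma=7.2e-2)
glycerol = dict(nu=1.1e-3, rho=1.26e3, gamma=6.3e-2)

for name, fluid in (("water", water), ("glycerol", glycerol)):
    l_nu, t_nu = viscous_scales(**fluid)
    print(f"{name}: l_nu = {l_nu:.3g} m, t_nu = {t_nu:.3g} s")
```

With these inputs the length scale comes out at roughly ten nanometers for water and a few centimeters for glycerol, in line with the orders of magnitude quoted above.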
Now representing the spatial profile as well as the velocity field in these coordinates,

$$h(z, t) = \ell_\nu\, H(z', t'), \qquad v(z, t) = \frac{\ell_\nu}{t_\nu}\, V(z', t'), \tag{6}$$
we expect the new functions H and V to represent properties of the singularity alone. In particular, we hope that they are universal, independent of both the initial conditions and the material parameters of the fluid. Since no external scales are thus expected to come into play in the description of $H(z', t')$ and $V(z', t')$, these profiles should be invariant under a change of scale. This means both the height and the velocity profile should be self-similar:

$$H(z', t') = t'\,\phi\!\left(\frac{z'}{t'^{1/2}}\right), \qquad V(z', t') = t'^{-1/2}\,\psi\!\left(\frac{z'}{t'^{1/2}}\right). \tag{7}$$
The meaning of (7) is that the shape of the profiles does not change as a function of time; only the radial and the axial scales are adjusted as $t'$ goes to zero. The exponents implicit in (7) were computed from the requirement that all terms in the equations balance, i.e., that inertial, surface tension, and viscous forces are of the same order close to the singularity. Two things are noteworthy about the exponents. First, the neck radius shrinks linearly with $t'$ as the singularity is approached, while the corresponding axial scale only shrinks like $t'^{1/2}$. This implies that the profile is asymptotically slender, so the assumptions underlying the derivation of (1) and (2) are justified. Second, the exponent of the velocity is negative, so the motion is increasingly fast close to the singularity. This is not unexpected, since ever stronger surface tension forces are driving increasingly small fluid necks. Once the neck reaches microscopic size, of course, the description in terms of a velocity field becomes meaningless, so there is no danger of truly infinite velocities looming here. Finally, once the self-similar form (7) is re-introduced into the equations of motion (1) and (2), one obtains a set of ordinary differential equations for the similarity profiles φ(ξ) and ψ(ξ) alone, where $\xi = z'/t'^{1/2}$. A more thorough analysis of the structure of the equations reveals [2] that they have only one universal solution, once proper boundary conditions are imposed at ξ = ±∞. These are derived from the condition that matching must be possible to the macroscopic profiles farther away, which evolve on much longer time scales than the self-similar solution itself. A remarkable consequence of this universality is that the minimum neck radius, at a given time away from the point of breakup, is a quantity that is independent of the initial radius [2]:

$$h_{\min} = 0.03\,\frac{\gamma}{\nu\rho}\,(t_0 - t). \tag{8}$$
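The balance argument that fixes the exponents can be written out in a few lines. The following is a sketch, assuming power-law scalings with exponents α and β introduced here for illustration (they are not notation from the text): a radial scale $h \sim t'^{\alpha}$, an axial scale $z - z_0 \sim t'^{\beta}$, and hence, from kinematics, a velocity $v \sim t'^{\beta - 1}$ in units of $\ell_\nu$ and $t_\nu$.

```latex
\begin{align*}
\text{inertia:}\quad & \partial_t v \sim t'^{\,\beta-2}, \\
\text{surface tension:}\quad & \partial_z\!\left(\frac{1}{h}\right) \sim t'^{\,-\alpha-\beta}, \\
\text{viscosity:}\quad & \frac{\partial_z(\partial_z v\,h^2)}{h^2} \sim t'^{\,(\beta-1)-2\beta} = t'^{\,-\beta-1}.
\end{align*}
% Requiring all three terms of Eq. (2) to be of the same order:
\begin{equation*}
\beta - 2 = -\alpha - \beta = -\beta - 1
\quad\Longrightarrow\quad
\beta = \tfrac{1}{2}, \qquad \alpha = 1 .
\end{equation*}
```

Under these assumptions the neck radius scales like $t'$, the axial extent like $t'^{1/2}$, and the velocity like $t'^{-1/2}$. Both terms of the mass balance (1) then scale like $t'^{2\alpha - 1}$, so (1) imposes no further condition on the exponents.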
To look at a comparison between theory and experiment in more detail, Fig. 4 shows three successive images of a jet of glycerol pinching off to form a drop (a small section of which is seen on the right). Once the temporal distance from the singularity is known (from experiment), the profile can be predicted without adjustable parameters (dark continuous lines). The only difference between the three sets of lines is that the axes have been rescaled by the factor implied by (7). The universality of the solution described by (7) of course implies that it holds equally well for the pinching of the drop of water shown in Fig. 2 as it does for the glycerol jet of Fig. 4. The reason this common feature is not immediately apparent is that $\ell_\nu$ is extremely small for water, so one would have to observe the neighborhood of the point of breakup in Fig. 2 under extreme magnification. This means that only on a very small scale will all three forces contributing to (2) come into play. For the parts of the evolution where the minimum radius $h_{\min}$ is much larger than $\ell_\nu$, one can neglect viscosity, so that Fig. 2 is effectively described by inviscid dynamics. Thus to understand the appearance of drop pinch-off on a given scale D (such as the nozzle diameter), one has to take into account the phenomenon of
Figure 4. A sequence of interface profiles of a jet of glycerol close to the point of breakup (the center of the drop being formed is seen as a bright spot in the top picture). The experimental images correspond to t0 − t = 350, 298, and 46 µs (from top to bottom). Corresponding analytical solutions based on (7) are superimposed. (Experiment by T. Kowalewski, see Ref. [2]).
cross-over: if initially $D \gg \ell_\nu$, the dynamics is characterized by a balance of inertial and surface tension forces. As $h_{\min}$ reaches $\ell_\nu$, the dynamics changes toward an inertial–surface tension–viscous balance. If on the other hand $D \ll \ell_\nu$ initially, inertia cannot play a significant role: the dynamics is dominated by viscosity and surface tension. In the course of this evolution, however, inertia becomes increasingly important and finally catches up with the other two. As a result, the same universal solution as before is finally observed. To each of the new balances described above corresponds a new similarity solution, distinct from (7) [2]. For example, the inertia–surface tension balance leads to a minimum neck radius that behaves like

$$h_{\min} = 0.7\left(\frac{\gamma}{\rho}\right)^{1/3} (t_0 - t)^{2/3}, \tag{9}$$

while the viscous–surface tension balance corresponds to

$$h_{\min} = 0.06\,\frac{\gamma}{\nu\rho}\,(t_0 - t). \tag{10}$$
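The three scaling laws for the minimum neck radius can be compared with a small Python sketch. The function names and the water parameters are illustrative assumptions. In the inviscid law the power 1/3 on γ/ρ is the one required by dimensional analysis, which is also what makes the law read 0.7 t'^(2/3) in units of the viscous length and time.

```python
import math

def h_min_universal(tau, nu, rho, gamma):
    """Inertial-viscous-capillary regime, Eq. (8); tau = t0 - t."""
    return 0.03 * gamma / (nu * rho) * tau

def h_min_inviscid(tau, rho, gamma):
    """Inertia-surface tension balance, Eq. (9)."""
    return 0.7 * (gamma / rho) ** (1.0 / 3.0) * tau ** (2.0 / 3.0)

def h_min_viscous(tau, nu, rho, gamma):
    """Viscous-surface tension balance, Eq. (10)."""
    return 0.06 * gamma / (nu * rho) * tau

def crossover_time(nu, rho, gamma):
    """Time before breakup at which the inviscid law reaches the viscous length."""
    l_nu = nu**2 * rho / gamma
    return (l_nu / 0.7) ** 1.5 * math.sqrt(rho / gamma)

# Approximate values for water (assumed, SI units).
nu, rho, gamma = 1.0e-6, 1.0e3, 7.2e-2
l_nu = nu**2 * rho / gamma
t_nu = nu**3 * rho**2 / gamma**2
tau_c = crossover_time(nu, rho, gamma)

print("crossover time / t_nu:", tau_c / t_nu)
print("inviscid h_min at crossover / l_nu:", h_min_inviscid(tau_c, rho, gamma) / l_nu)
```

For water the crossover happens at a time before breakup comparable to the viscous time itself, which is far too short to resolve in photographs like those of Fig. 2.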
The spatial structure of the corresponding similarity solutions largely explains the macroscopic appearance of high and low viscosity fluids, respectively. The axial and radial scales of the inviscid solution (9) both behave like $(t_0 - t)^{2/3}$, thus leading to a neck that is cone-shaped, consistent with Fig. 2. For its computation, the lubrication Eqs. (1) and (2) are inadequate; rather, the full equations for inviscid, irrotational flow have to be solved. The reason is that the interface profile turns over, so that the tip of the cone-shaped neck is actually inside the drop. These predictions of similarity theory have been confirmed by both experiment and full numerical simulations of the Navier–Stokes equations by Chen et al. [6]. Very viscous fluids, on the other hand, tend to form very elongated threads. This is reflected by the fact that the typical axial scale of the viscous solution (10) behaves like $(t_0 - t)^{0.175} \gg (t_0 - t)$. Interestingly, the exponent β = 0.175 . . . is an irrational number coming from the solution of a transcendental equation [2]. This is an example of self-similarity of the second kind, in the classification of [7]. The striking difference in the behavior of high and low viscosity fluids is represented schematically in Fig. 5. It would be beyond the scope of this brief overview to mention all the recent developments in the study of drop pinch-off, some of which are discussed in Ref. [8]. To name some examples, the presence of an outer fluid significantly alters pinching, leading to new types of similarity solutions, with important applications for the physics of mixing. For extremely small jets of the size of nanometers, thermal fluctuations have to be taken into account, which significantly alter the dynamics. This has been found using molecular dynamics simulations of a jet 6 nm in diameter. However, even on much larger
Figure 5. A graphical representation of some of the scaling regimes that can be observed in droplet pinching, depending on the viscosity of the liquid. For high viscosity (Re small) threads form as a drop falls from a pipette of diameter D. In the opposite case of low viscosity (Re large) the pinching neck is conical. As the neck radius goes to zero, however, one always ends up with the same universal scaling regime. (Photographs by X.D. Shi and S. Nagel, see Ref. [2]). [Labels in the diagram: $Re = \sqrt{D/\ell_\nu}$; axis $-\log(h_{\min})$; regimes $h_{\min} = 0.06\,\ell_\nu t'$, $h_{\min} = 0.7\,\ell_\nu t'^{2/3}$, and $h_{\min} = 0.03\,\ell_\nu t'$.]
scales small perturbations to the observed similarity solutions can be important. In fact, the threads shown in Fig. 4 are quite sensitive to perturbations, and a careful examination of the last panel shows (unfortunately obscured by the drawn lines) the growth of disturbances on the thread [2]. In other words, the question of what resolution (experimental or numerical) is necessary near the point of breakup depends very much on what one is interested in: a lot of detail may be buried within a pinching event, which may or may not be important for a given question. If one is trying to describe topological transitions numerically, one will always have to renounce the description of the dynamics below some cut-off length. It is therefore important to understand the mechanisms which guarantee the uniqueness of the continuation across the singularity. The key is again
the universality of similarity solutions we already found in the approach to the singularity. A new set of similarity solutions can be found after breakup: one exists on either side of the pinch-point. This is illustrated in Fig. 6, which shows some typical predictions of the similarity theory. On one side one sees the rapid retraction of a very thin needle; on the other, a small protrusion is left initially on the drop, which quickly heals to form a smooth drop. The difference from the similarity solution before breakup lies of course in the boundary condition at the retracting tip. A closer analysis reveals [2] that the similarity solutions after breakup depend on the boundary conditions for both the height and the velocity as one moves away to infinity. The crucial condition that guarantees unique continuation is that both profiles to the left and right of the point of breakup have to coincide with the solutions before breakup as one moves to infinity. Information between the solutions before and after breakup is thus passed on solely on the basis of the far-field
Figure 6. The breakup of a mixture of glycerol in four parts of ethanol, as calculated from similarity solutions before and after breakup. Part (a) shows three profiles before breakup, in time distances of 46 µs, corresponding to |t'| = 1, 0.55, and 0.1. In part (b) the same is shown after breakup. [Panel labels: (a) before breakup, $t - t_0$ = −114 µs, −63 µs, −11 µs; (b) after breakup, $t - t_0$ = 11 µs, 63 µs, 114 µs; scale bar 72 µm.]
behavior. Whatever microscopic physics determines the actual breakup event is irrelevant to the continuation.
3. Coalescence
As we have seen above, the understanding of drop breakup is aided greatly by the universality of the observed solutions. One is able to almost completely disregard the free-surface motion away from the point of breakup. The main difficulty in finding a unifying picture for drop coalescence lies in the fact that one cannot disregard the drop motion that leads to the meeting of the two drops, resulting in a number of problems. Firstly, the motion produced by the purely geometrical overlap between two approaching drops is comparable to or faster than the motion generated by surface tension. Hence the velocity of approach must come into play when describing the dynamics of coalescence. Secondly, the fluid caught between two approaching drops cannot be ignored, even if its viscosity is very small. The reason is that a very thin lubrication film between two drops will still produce an appreciable pressure, which deforms the drops prior to their meeting at a point. Thirdly, the mechanism that leads to the first small-scale union between the drops is not well understood. In particular, the presence of surfactants on the surface produces barriers that have to be overcome, which can significantly delay reconnection, as shown by Amarouchene et al. [9]. We will therefore focus on the simplest case of a vanishing speed of approach, in which case the ensuing dynamics is determined by the fluid parameters and the radius R of the drops alone. If the two drops do not have equal radius, the one with the smaller radius will play the dominant role and effectively replace R. Since the motion starts from rest, it will initially be slow, so the driving by surface tension is counteracted by viscosity alone. This behavior will persist until the radius $r_m$ of the liquid bridge connecting the two drops has reached $\ell_\nu$, after which it crosses over to one where only inertia matters and viscosity drops out of the problem.
Finally, for $r_m \approx R$, the initially local motion in the bridge between the drops evolves to a global motion involving all of the fluid. The central idea in investigating the dynamics of coalescence is of course that for $r_m \ll R$ the motion is self-similar, dominated by the local behavior close to the meniscus where the two drops meet. At the meniscus the curvature is extremely high, and thus leads to a driving that is confined to a ring-shaped region, whose radius $r_m$ is expanding. Turning first to the initial stage of viscous motion, the problem is thus one of a line force moving through an infinite medium. The force per unit length of the line is 2γ, and one has to compute the speed that results from it. It is one of the characteristic features of Stokes flow that to obtain a finite answer, logarithmic corrections come into play, for
which an upper and a lower cut-off is needed. The upper cut-off evidently is the radius of the drop itself; the width of the meniscus serves as the lower cut-off. The result of the calculation [10] is (η = νρ being the viscosity):

$$r_m(t) \sim -\frac{(\alpha - 1)}{2\pi}\,\frac{\gamma}{\eta}\, t \,\ln\!\left(\frac{\gamma t}{R\eta}\right), \tag{11}$$
where the width of the meniscus is assumed to scale like $r_m^{\alpha}$. Interestingly, the value of α, which determines the prefactor in (11), depends on the presence of an outer fluid between the drops. If no outer fluid is present, the correct value is α = 3, which can in fact be deduced from an exact solution (due to Hopper) to the two-dimensional analogue of the problem under study [10]. A closeup of this extremely sharp meniscus is shown in panel (a) of Fig. 7. Even a small amount of interstitial fluid, however, changes the situation considerably. Owing to the fact that the gap between the drops is exceedingly narrow, it is quite hard to push any fluid away from the advancing meniscus. Instead, the interstitial fluid is collected in a pouch at the meniscus (see Fig. 7), whose width is now much larger, so that one finds α = 3/2 [10]. If the drop fluid is very viscous ($\ell_\nu > R$), this is all that can be said from the point of view of asymptotics. Among other things, we have established that the motion is described by a well-defined asymptotic solution. Hence after a very short time, details of the microscopic mechanisms leading to coalescence have been "forgotten". In a numerical simulation, "surgery" done on a sufficiently small scale will meet a similar fate, and one soon ends up following the unique physical solution. Finally, if $\ell_\nu \ll R$, there is a region where the motion is almost inviscid. From a balance of surface tension forces with inertial forces at the meniscus one deduces [10] that

$$r_m \propto \left(\frac{\gamma R}{\rho}\right)^{1/4} t^{1/2}. \tag{12}$$
This behavior has been confirmed by recent numerical simulations of Ref. [11]. However, there is an unexpected complication: as the meniscus retracts, capillary waves grow ahead of it, whose amplitude finally equals the width of the channel. Thus the two sides of the drops touch, and a toroidal void is enclosed. This process repeats itself, leaving behind a self-similar succession of voids. In summary, one can often obtain analytical solutions to the equations of motion near a singularity, explaining some universal features of breakup and coalescence events. This is important for estimating errors introduced by a given numerical procedure used to describe topological transitions. Matching numerics to known analytical solutions can lead to considerable savings in numerical effort.
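The two coalescence laws (11) and (12) can be put side by side in a short Python sketch. The function names are hypothetical. The text gives (12) only as a proportionality, so the order-one `prefactor` argument below is an assumption and is left as a free parameter.

```python
import math

def bridge_radius_viscous(t, gamma, eta, R, alpha=3.0):
    """Early-time viscous growth of the bridge radius, Eq. (11).

    alpha = 3 corresponds to no outer fluid, alpha = 3/2 to an outer fluid.
    Only meaningful while gamma * t / (R * eta) < 1, where the logarithm
    is negative and the radius therefore positive.
    """
    x = gamma * t / (R * eta)
    if not 0.0 < x < 1.0:
        raise ValueError("Eq. (11) requires 0 < gamma*t/(R*eta) < 1")
    return -(alpha - 1.0) / (2.0 * math.pi) * (gamma / eta) * t * math.log(x)

def bridge_radius_inviscid(t, gamma, rho, R, prefactor=1.0):
    """Inviscid growth law, Eq. (12); `prefactor` is an assumed O(1) constant."""
    return prefactor * (gamma * R / rho) ** 0.25 * math.sqrt(t)

# Illustrative (assumed) parameters: a viscous drop of radius 1 mm.
gamma, eta, R = 6.3e-2, 1.4, 1.0e-3
t = 1.0e-4
r_no_outer = bridge_radius_viscous(t, gamma, eta, R, alpha=3.0)
r_outer = bridge_radius_viscous(t, gamma, eta, R, alpha=1.5)
print(r_no_outer, r_outer, r_no_outer / r_outer)
```

The ratio of the two viscous results is (3 − 1)/(3/2 − 1) = 4, which is the factor by which the presence of an outer fluid slows the early growth of the bridge according to (11).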
Figure 7. A closeup of the point of contact during coalescence of two identical drops for the two cases of no outer fluid, (a), and two fluids of equal viscosity, ((b) and (c)); each panel plots y against x. Part (a) is Hopper's solution (no outer fluid) for $r_m/R = 10^{-3}$, $10^{-2.5}$, $10^{-2}$, and $10^{-1.5}$. Part (b) is a numerical simulation of the case where the inner and outer viscosities are the same, showing fluid that collects in a bubble at the meniscus. Note that the two axes are scaled differently, so the bubble is almost circular. For large values of $r_m$, as shown in (c), the fluid finally escapes from the bubble.
References

[1] R. Scardovelli and S. Zaleski, "Direct numerical simulation of free-surface and interfacial flow," Annu. Rev. Fluid Mech., 31, 567–603, 1999.
[2] J. Eggers, "Non-linear dynamics and breakup of free-surface flows," Rev. Mod. Phys., 69, 865–929, 1997.
[3] L.D. Landau and E.M. Lifshitz, Fluid Mechanics, Pergamon, Oxford, 1984.
[4] A. Menchaca-Rocha et al., "Coalescence of liquid drops by surface tension," Phys. Rev. E, 63, 046309, 1–5, 2001.
[5] B. Ambravaneswaran, E.D. Wilkes, and O.A. Basaran, "Drop formation from a capillary tube: comparison of one-dimensional and two-dimensional analyses and occurrence of satellite drops," Phys. Fluids, 14, 2606–2621, 2002.
[6] A.U. Chen, P.K. Notz, and O.A. Basaran, "Computational and experimental analysis of pinch-off and scaling," Phys. Rev. Lett., 88, 174501, 1–4, 2002.
[7] G.I. Barenblatt, Scaling, Self-Similarity, and Intermediate Asymptotics, Cambridge, 1996.
[8] S.P. Lin, Breakup of Liquid Sheets and Jets, Cambridge, 2003.
[9] Y. Amarouchene, G. Cristobal, and H. Kellay, "Noncoalescing drops," Phys. Rev. Lett., 87, 206104, 1–4, 2002.
[10] J. Eggers, J.R. Lister, and H.A. Stone, "Coalescence of liquid drops," J. Fluid Mech., 401, 293–310, 1999.
[11] L. Duchemin, J. Eggers, and C. Josserand, "Inviscid coalescence of drops," J. Fluid Mech., 487, 167–178, 2003.
4.10 CONFORMAL MAPPING METHODS FOR INTERFACIAL DYNAMICS
Martin Z. Bazant (1) and Darren Crowdy (2)
(1) Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
(2) Department of Mathematics, Imperial College, London, UK
S. Yip (ed.), Handbook of Materials Modeling, 1417–1451. © 2005 Springer. Printed in the Netherlands.

Microstructural evolution is typically beyond the reach of mathematical analysis, but in two dimensions certain problems become tractable by complex analysis. Via the analogy between the geometry of the plane and the algebra of complex numbers, moving free boundary problems may be elegantly formulated in terms of conformal maps. For over half a century, conformal mapping has been applied to continuous interfacial dynamics, primarily in models of viscous fingering and solidification. Current developments in materials science include models of void electro-migration in metals, brittle fracture, and viscous sintering. Recently, conformal-map dynamics has also been formulated for stochastic problems, such as diffusion-limited aggregation and dielectric breakdown, which has re-invigorated the subject of fractal pattern formation. Although restricted to relatively simple models, conformal-map dynamics offers unique advantages over other numerical methods discussed in this chapter (such as the Level–Set Method) and in Chapter 9 (such as the phase field method). By absorbing all geometrical complexity into a time-dependent conformal map, it is possible to transform a moving free boundary problem to a simple, static domain, such as a circle or square, which obviates the need for front tracking. Conformal mapping also allows the exact representation of very complicated domains, which are not easily discretized, even by the most sophisticated adaptive meshes. Above all, however, conformal mapping offers analytical insights for otherwise intractable problems. After reviewing some elementary concepts from complex analysis in Section 1, we consider the classical application of conformal mapping methods to continuous-time interfacial free boundary problems in Section 2. This includes cases where the governing field equation is harmonic, biharmonic, or in a more general conformally invariant class. In Section 3, we discuss the
recent use of random, iterated conformal maps to describe analogous discrete-time phenomena of fractal growth. Although most of our examples involve planar domains, we note in Section 4 that interfacial dynamics can also be formulated on curved surfaces in terms of more general conformal maps, such as stereographic projections. We conclude in Section 5 with some open questions and an outlook for future research.
1. Analytic Functions and Conformal Maps
We begin by reviewing some basic concepts from complex analysis found in textbooks such as Churchill and Brown [1]. For a fresh geometrical perspective, see Needham [2]. A general function of a complex variable depends on the real and imaginary parts, x and y, or, equivalently, on the linear combinations, $z = x + iy$ and $\bar{z} = x - iy$. In contrast, an analytic function, which is differentiable in some domain, can be written simply as $w = u + iv = f(z)$. The condition, $\partial f/\partial \bar{z} = 0$, is equivalent to the Cauchy–Riemann equations,

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}, \tag{1}$$
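which follow from the existence of a unique derivative,

$$f' = \frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} = \frac{\partial f}{\partial (iy)} = \frac{\partial v}{\partial y} - i\frac{\partial u}{\partial y}, \tag{2}$$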
whether taken in the real or imaginary direction. Geometrically, analytic functions correspond to special mappings of the complex plane. In the vicinity of any point where the derivative is nonzero, $f'(z) \neq 0$, the mapping is locally linear, $dw = f'(z)\,dz$. Therefore, an infinitesimal vector, dz, centered at z is transformed into another infinitesimal vector, dw, centered at w = f(z) by a simple complex multiplication. Recalling Euler's formula, $(r_1 e^{i\theta_1})(r_2 e^{i\theta_2}) = (r_1 r_2)\,e^{i(\theta_1 + \theta_2)}$, this means that the mapping causes a local stretch by $|f'(z)|$ and local rotation by $\arg f'(z)$, regardless of the orientation of dz. As a result, an analytic function with a nonzero derivative describes a conformal mapping of the plane, which preserves the angle between any pair of intersecting curves. Intuitively, a conformal mapping smoothly warps one domain into another with no local distortion. Conformal mapping provides a very convenient representation of free boundary problems. The Riemann Mapping Theorem guarantees the existence of a unique conformal mapping between any two simply connected domains, but the challenge is to derive its dynamics for a given problem. The only constraint is that the conformal mapping be univalent, or one-to-one, so that physical fields remain single-valued in the evolving domain.
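The angle-preserving property can be illustrated numerically. The sketch below maps two short vectors at a point through an analytic function and through a non-analytic one, and compares the angle between the images with the original angle. The particular maps, the base point, and the function names are arbitrary choices made here for illustration.

```python
import cmath

def analytic_map(z):
    """An example analytic function, f(z) = z^2 (arbitrary choice)."""
    return z * z

def non_analytic_map(z):
    """A map that depends on conj(z), so it is not analytic."""
    return z + 0.5 * z.conjugate()

def image_angle(fn, z0, d1, d2, eps=1e-6):
    """Angle between the images of two short vectors eps*d1, eps*d2 at z0."""
    w1 = fn(z0 + eps * d1) - fn(z0)
    w2 = fn(z0 + eps * d2) - fn(z0)
    return cmath.phase(w2 / w1)

z0 = 1.0 + 0.5j                 # base point, where f'(z0) = 2*z0 is nonzero
d1 = cmath.exp(0.3j)            # two unit directions separated by 0.8 rad
d2 = cmath.exp(1.1j)

angle_before = cmath.phase(d2 / d1)
angle_analytic = image_angle(analytic_map, z0, d1, d2)
angle_other = image_angle(non_analytic_map, z0, d1, d2)

# Local stretch factor, to be compared with |f'(z0)| = |2*z0|.
stretch = abs(analytic_map(z0 + 1e-6 * d1) - analytic_map(z0)) / 1e-6

print(angle_before, angle_analytic, angle_other, stretch, abs(2 * z0))
```

The analytic map reproduces the original angle up to finite-difference error and stretches both vectors by the same factor, whereas the map involving the complex conjugate changes the angle.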
2. Continuous Interfacial Dynamics

2.1. Harmonic Fields
Most applications of conformal mapping involve harmonic functions, which are solutions to Laplace's equation,

$$\nabla^2 \phi = 0. \tag{3}$$
From Eq. (1), it is easy to show that the real and imaginary parts of an analytic function are harmonic, but the converse is also true: every harmonic function is the real part of an analytic function, $\phi = \mathrm{Re}\,\Phi$, the complex potential. This connection easily produces new solutions to Laplace's equation in different geometries. Suppose that we know the solution, $\phi(w) = \mathrm{Re}\,\Phi(w)$, in a simply connected domain in the w-plane, $\Omega_w$, which can be reached by conformal mapping, $w = f(z, t)$, from another, possibly time-dependent domain in the z-plane, $\Omega_z(t)$. A solution in $\Omega_z(t)$ is then given by

$$\phi(z, t) = \mathrm{Re}\,\Phi(w) = \mathrm{Re}\,\Phi(f(z, t)) \tag{4}$$
because $\Phi(f(z))$ is also analytic, with a harmonic real part. The only caveat is that the boundary conditions be invariant under the mapping, which holds for Dirichlet (φ = constant) or Neumann ($\hat{n} \cdot \nabla\phi = 0$) conditions. Most other boundary conditions invalidate Eq. (4) and thus complicate the analysis. The complex potential is also convenient for calculating the gradient of a harmonic function. Using Eqs. (1) and (2), we have

$$\nabla_z \phi = \frac{\partial \phi}{\partial x} + i\frac{\partial \phi}{\partial y} = \overline{\Phi'}, \tag{5}$$

where $\nabla_z$ is the complex gradient operator, representing the vector gradient, ∇, in the z-plane.
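The relation between the complex gradient and the conjugate of the derivative of the complex potential can be checked with finite differences. In the following Python sketch the potential is taken to be the cube of z, an arbitrary example chosen here, and all function names are illustrative.

```python
def Phi(z):
    """Example complex potential (an arbitrary analytic function)."""
    return z**3

def dPhi(z):
    """Its exact complex derivative."""
    return 3 * z**2

def phi(x, y):
    """Harmonic function phi = Re Phi."""
    return Phi(complex(x, y)).real

def complex_gradient(x, y, h=1e-5):
    """d(phi)/dx + i d(phi)/dy by central differences."""
    dphi_dx = (phi(x + h, y) - phi(x - h, y)) / (2 * h)
    dphi_dy = (phi(x, y + h) - phi(x, y - h)) / (2 * h)
    return complex(dphi_dx, dphi_dy)

def laplacian(x, y, h=1e-3):
    """Five-point finite-difference Laplacian of phi."""
    return (phi(x + h, y) + phi(x - h, y) + phi(x, y + h) + phi(x, y - h)
            - 4 * phi(x, y)) / h**2

x, y = 0.7, -0.4
grad_numeric = complex_gradient(x, y)
grad_exact = dPhi(complex(x, y)).conjugate()

print(grad_numeric, grad_exact, laplacian(x, y))
```

The numerical gradient agrees with the conjugate of the exact derivative, and the discrete Laplacian vanishes to within rounding error, as expected for the real part of an analytic function.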
2.1.1. Viscous fingering and solidification

The classical application of conformal-map dynamics is to Laplacian growth, where a free boundary, $B_z(t)$, moves with a (normal) velocity,

$$v = \frac{dz}{dt} \propto \nabla\phi, \tag{6}$$

proportional to the gradient of a harmonic function, φ, which vanishes on the boundary [3]. Conformal mapping for Laplacian growth was introduced independently by Polubarinova-Kochina and Galin in 1945 in the context of ground-water flow, where φ is the pressure field and $u = (k/\eta)\nabla\phi$ is the velocity of the fluid of viscosity η in a porous medium of permeability k, according
to Darcy's law. Laplace's equation follows from incompressibility, ∇ · u = 0. The free boundary represents an interface with a less viscous, immiscible fluid at constant pressure, which is being forced into the more viscous fluid. In physics, Laplacian growth is viewed as a fundamental model for pattern formation. It also describes viscous fingering in Hele–Shaw cells, where a bubble of fluid, such as air, displaces a more viscous fluid, such as oil, in the narrow gap between parallel flat plates. In that case, the depth-averaged velocity satisfies Darcy's law in two dimensions. Laplacian growth also describes dendritic solidification in the limit of low undercooling, where φ is the temperature in the melt [4]. To illustrate the derivation of conformal-map dynamics, let us consider viscous fingering in a channel with impenetrable walls, as shown in Fig. 1(a). The viscous fluid domain, $\Omega_z(t)$, lies in a periodic horizontal strip, to the right of the free boundary, $B_z(t)$, where uniform flow of velocity, U, is assumed far ahead of the interface. It is convenient to solve for the conformal map, z = g(w, t), to this domain from a half strip, Re w > 0, where the pressure is simply linear, φ = Re Uw/µ. We also switch to dimensionless variables, where length is scaled to a characteristic size of the initial condition, L, pressure to UL/µ, and time to L/U. Since $\nabla_w \phi = 1$ in the half strip, the pressure gradient at a point, z = g(w, t), on the physical interface is easily obtained from Eq. (30):

$$\nabla_z \phi = \overline{\frac{\partial f}{\partial z}} = \overline{\left(\frac{\partial g}{\partial w}\right)}^{\,-1} \tag{7}$$
Figure 1. Exact solutions for Laplacian growth, a simple model of viscous fingering: (a) a Saffman–Taylor finger translating down an infinite channel, showing iso-pressure curves (dashed) and streamlines (solid) in the viscous fluid, and (b) the evolution of a perturbed circular bubble leading to cusp singularities in finite time. (Courtesy of Jaehyuk Choi.)
where w = f(z, t) is the inverse mapping (which exists as long as the mapping remains univalent). Now consider a Lagrangian marker, z(t), on the interface, whose pre-image, w(t), lies on the imaginary axis in the w-plane. Using the chain rule and Eq. (7), the kinematic condition, Eq. (6), becomes

$$\frac{dz}{dt} = \frac{\partial g}{\partial t} + \frac{\partial g}{\partial w}\frac{dw}{dt} = \overline{\left(\frac{\partial g}{\partial w}\right)}^{\,-1}. \tag{8}$$
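Multiplying by $\overline{\partial g/\partial w} \neq 0$, this becomes

$$\overline{\frac{\partial g}{\partial w}}\,\frac{\partial g}{\partial t} + \left|\frac{\partial g}{\partial w}\right|^2 \frac{dw}{dt} = 1. \tag{9}$$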
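Since the pre-image moves along the imaginary axis, Re(dw/dt) = 0, we arrive at the Polubarinova–Galin equation for the conformal map:

$$\mathrm{Re}\left(\overline{\frac{\partial g}{\partial w}}\,\frac{\partial g}{\partial t}\right) = 1, \quad \text{for } \mathrm{Re}\, w = 0. \tag{10}$$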
From the solution to Eq. (10), the pressure is given by φ = Re f(z, t). Note that the interfacial dynamics is nonlinear, even though the quasi-steady equations for φ are linear. The best-known solutions are the Saffman–Taylor fingers,

$$g(w, t) = \frac{t}{\lambda} + w + 2(1 - \lambda)\log(1 + e^{-w}), \tag{11}$$

which translate at a constant velocity, $\lambda^{-1}$, without changing their shape [5]. Note that (11) is a solution to the fingering problem for all choices of the parameter λ. This parameter specifies the finger width and can be chosen arbitrarily in the solution (11). In experiments, however, it is found that the viscous fingers that form are well fit by a Saffman–Taylor finger filling precisely half of the channel, that is, with λ = 1/2, as shown in Fig. 1(a). Why this happens is a basic problem in pattern selection, which has been the focus of much debate in the literature over the last 25 years. To understand this problem, note that the viscous finger solutions (11) do not include any of the effects of surface tension on the interface between the two fluids. The intriguing pattern selection of the λ = 1/2 finger has been attributed to a singular perturbation effect of small surface tension. Surface tension, γ, is a significant complication because it is described by a non-conformally-invariant boundary condition,

$$\phi = \gamma\kappa, \quad \text{for } z \in B_z(t) \tag{12}$$
where κ is the local interfacial curvature, entering via the Young–Laplace pressure. Small surface tension can be treated analytically as a singular perturbation to gain insights into pattern selection [6, 7]. Since surface tension
M.Z. Bazant and D. Crowdy
effects are only significant at points of high curvature κ in the interface, and given that the finger in Fig. 1(a) is very smooth with no such points of high curvature, it is surprising that surface tension acts to select the finger width. Indeed, the viscous fingering problem has been shown to be full of surprises [8]. In a radial geometry, the univalent mapping is from the exterior of the unit circle, |w| = 1, to the exterior of a finite bubble penetrating an infinite viscous liquid. Bensimon and Shraiman [9] introduced a pole dynamics formulation, where the map is expressed in terms of its zeros and poles, which must lie inside the unit circle to preserve univalency. They showed that Laplacian growth in this geometry is ill-posed, in the sense that cusp-like singularities occur in finite time (as a zero hits the unit circle) for a broad class of initial conditions, as illustrated in Fig. 1(b). (See Howison [3] for a simple, general proof due to Hohlov.) This initiated a large body of work on how Laplacian growth is “regularized” by surface tension or other effects in real systems. Despite the analytical complications introduced by surface tension, several exact steady solutions with non-zero surface tension are known [10, 11]. Surface tension can also be incorporated into numerical simulations based on the same conformal-mapping formalism [12], which show how cusps are avoided by the formation of new fingers [13]. For example, consider a threefold perturbation of a circular bubble, whose exact dynamics without surface tension is shown in Fig. 1(b). With surface tension included, the evolution is very similar until the cusps begin to form, at which point the tips bulge outward and split into new fingers, as shown in Fig. 2. This process repeats itself to produce a complicated fractal pattern [14], which curiously resembles the diffusion-limited particle aggregates discussed below in Section 3.
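The finite-time cusp of Fig. 1(b) can be made concrete with the simplest threefold ansatz. The sketch below assumes the radial form of the Polubarinova–Galin equation with unit flux, Re(conj(w ∂g/∂w) ∂g/∂t) = 1 on |w| = 1 (that is, Eq. (37) with σ = 1), and the illustrative map g = a(t) w + b(t) w^(-2) with real a, b > 0. Both the normalization and the function names are assumptions made for this example. Substituting the ansatz gives b = c a^2 with c constant and a a'(1 − 4c^2 a^2) = 1, so the zero of ∂g/∂w reaches the unit circle when a = 1/(2c).

```python
import math

def cusp_time(a0, b0):
    """Time at which g = a w + b / w**2 loses univalence, starting from (a0, b0).

    Assumes 0 < 2*b0 < a0, so that the initial map is univalent on |w| >= 1.
    """
    c = b0 / a0**2                    # conserved ratio b / a**2
    a_c = 1.0 / (2.0 * c)             # zero of dg/dw hits |w| = 1 when a = 2 b
    # Integrate a a' (1 - 4 c^2 a^2) = 1 from a0 to a_c.
    return 0.5 * (a_c**2 - a0**2) - c**2 * (a_c**4 - a0**4)

def bubble_area(a, b):
    """Area enclosed by the image of |w| = 1 under g = a w + b / w**2."""
    return math.pi * (a**2 - 2.0 * b**2)

if __name__ == "__main__":
    a0, b0 = 1.0, 0.1
    t_c = cusp_time(a0, b0)
    a_c = a0**2 / (2.0 * b0)
    b_c = b0 * (a_c / a0) ** 2
    growth = bubble_area(a_c, b_c) - bubble_area(a0, b0)
    print("cusp time:", t_c)
    print("area growth / (2 pi t_c):", growth / (2.0 * math.pi * t_c))
```

The second printed ratio checks that the bubble area grows at the constant rate 2π implied by the assumed unit flux, up to the moment the three cusps form.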
2.1.2. Density-driven instabilities in fluids
An important class of problems in fluid mechanics involves the nonlinear dynamics of an interface between two immiscible fluids of different densities. In the presence of gravity, there are some familiar cases. Deep-water waves involve finite disturbances (such as steady “Stokes waves”) in the interface between lighter fluid (air) over a heavier fluid (water). With an inverted density gradient, the Rayleigh–Taylor instability develops when a heavier fluid lies above a lighter fluid, leading to large plumes of the former sinking into the latter. Tanveer [15] has used conformal mapping to analyze the Rayleigh–Taylor instability and has provided evidence to associate the formation of plumes with the approach of various conformal mapping singularities to the unit circle. A related problem is the Richtmyer–Meshkov instability, which occurs when a shock wave passes through an interface between fluids of different
Figure 2. Numerical simulation of viscous fingering, starting from a three-fold perturbation of a circular bubble. The only difference with the Laplacian-growth dynamics in Fig. 1(b) is the inclusion of surface tension, which prevents the formation of cusp singularities. (Courtesy of Michael Siegel.)
densities. Behind the shock, narrow fingers of the heavier fluid penetrate into the lighter fluid. The shock wave usually passes so quickly that compressibility only affects the onset of the instability, while the nonlinear evolution occurs much faster than the development of viscous effects. Therefore, it is reasonable to assume a potential flow in each fluid region, with randomly perturbed initial velocities. Although real plumes roll up in three dimensions and bubbles can form, conformal mapping in two dimensions still provides some insights, with direct relevance for shock tubes of high aspect ratio. A simple conformal-mapping analysis is possible for the case of a large density contrast, where the lighter fluid is assumed to be at uniform pressure. The Richtmyer–Meshkov instability (zero-gravity limit) is then similar to the Saffman–Taylor instability, except that the total volume of each fluid is fixed. A periodic interface in the y direction, analogous to the channel geometry in Fig. 1, can be described by the univalent mapping, z = g(w, t), from the
interior of the unit circle in the mathematical w-plane to the interior of the heavy-fluid finger in the physical z-plane. Zakharov [16] introduced a Hamiltonian formulation of the interfacial dynamics in terms of this conformal map, taking into account kinetic and potential energy, but not surface tension. One way to derive equations of motion is to expand the map in a Taylor series,
$$g(w, t) = \log w + \sum_{n=0}^{\infty} a_n(t)\, w^n, \quad |w| < 1. \tag{13}$$
(The log w term first maps the disk to a periodic half strip.) On the unit circle, $w = e^{i\theta}$, the pre-image of the interface, this is simply a complex Fourier series. The Taylor coefficients, $a_n(t)$, act as generalized coordinates describing n-fold shape perturbations within each period, and their time derivatives, $\dot{a}_n(t)$, act as velocities or momenta. Unfortunately, truncating the Taylor series results in a poor description of strongly nonlinear dynamics because the conformal map begins to vary wildly near the unit circle. An alternate approach used by Yoshikawa and Balk [17] is to expand in terms resembling Saffman–Taylor fingers,
$$g(w, t) = \log w + b(t) - \sum_{n=1}^{N} b_n(t) \log(1 - \lambda_n(t)\, w), \tag{14}$$
which can be viewed as a re-summation of the Taylor series in Eq. (13). As shown in Fig. 3, exact solutions exist with only a finite number of terms in the finger expansion, as long as the new generalized coordinates, $\lambda_n(t)$, stay inside the unit disk, $|\lambda_n| < 1$. This example illustrates the importance of the choice of shape functions in the expansion of the conformal map, e.g., $w^n$ vs. $\log(1 - \lambda_n w)$.
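The contrast between the two sets of shape functions can be seen on a single term. Since log(1 − λw) = −Σ_{n≥1} (λw)^n / n, a finger-like term of Eq. (14) with λ close to the unit circle needs many Taylor modes of Eq. (13) before it is resolved at |w| = 1. The following sketch is illustrative only; the helper name `taylor_log_error` and the sample values are assumptions.

```python
import cmath

def taylor_log_error(lam, w, n_terms):
    """Absolute error of the n_terms-mode Taylor truncation of log(1 - lam*w)."""
    exact = cmath.log(1.0 - lam * w)
    partial = -sum((lam * w) ** n / n for n in range(1, n_terms + 1))
    return abs(exact - partial)

if __name__ == "__main__":
    for lam in (0.5, 0.9, 0.99):
        # Evaluate at w = 1, the point of the unit circle closest to the singularity at w = 1/lam.
        print(lam, taylor_log_error(lam, 1.0, 20))
```

With twenty modes the truncation is accurate for λ = 0.5 but not for λ = 0.99, whereas a single logarithmic shape function represents the same term exactly.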
2.1.3. Void electro-migration in metals
Micron-scale interconnects in modern integrated circuits, typically made of aluminum, sustain enormous currents and high temperatures. The intense electron wind drives solid-state mass diffusion, especially along dislocations and grain boundaries, where voids also nucleate and grow. In the narrowest and most fragile interconnects, grain boundaries are often well separated enough that isolated voids migrate in a fairly homogeneous environment due to surface diffusion, driven by the electron wind. Voids tend to deform into slits, which propagate across the interconnect, causing it to sever. A theory of void electro-migration is thus important for predicting reliability. In the simplest two-dimensional model [18], a cylindrical void is modeled as a deformable, insulating inclusion in a conducting matrix. Outside the void,
Figure 3. Conformal-map dynamics for the strongly nonlinear regime of the Richtmyer–Meshkov instability [17]. (Courtesy of Toshio Yoshikawa and Alexander Balk.)
the electrostatic potential, φ, satisfies Laplace’s equation, which invites the use of conformal mapping. The electric field, $\mathbf{E} = -\nabla\phi$, is taken to be uniform far away and wraps around the void surface, due to a Neumann boundary condition, $\hat{n} \cdot \mathbf{E} = 0$. The difference with Laplacian growth lies in the kinematic condition, which is considerably more complicated. In place of Eq. (6), the normal velocity of the void surface is given by the surface divergence of the surface current, j, which takes the dimensionless form,
$$\hat{n} \cdot \mathbf{v} = \frac{\partial j}{\partial s} = \chi\, \frac{\partial^2 \phi}{\partial s^2} + \frac{\partial^2 \kappa}{\partial s^2}, \tag{15}$$
where s is the local arc-length coordinate and χ is a dimensionless parameter comparing surface currents due to the electron wind force (first term) and due to gradients in surface tension (second term). This moving free boundary problem somewhat resembles the viscous fingering problem with surface tension, and it admits analogous finger solutions, albeit of width 2/3, not 1/2 [19]. To describe the evolution of a singly connected void, we consider the conformal map, z = g(w, t), from the exterior of the unit circle to the exterior of
the void. As long as the map remains univalent, it has a Laurent series of the form,
$$g(w, t) = A_1(t)\, w + A_0(t) + \sum_{n=1}^{\infty} A_{-n}(t)\, w^{-n}, \quad \text{for } |w| > 1, \tag{16}$$
where the Laurent coefficients, $A_n(t)$, are now the generalized coordinates. As in the case of viscous fingering [3], a hierarchy of nonlinear ordinary differential equations (ODEs) for these coordinates can be derived. For void electromigration, Wang et al. [18] start from a variational principle accounting for surface tension and surface diffusion, using a Galerkin procedure. They truncate the expansion after 17 coefficients, so their numerical method breaks down if the void deforms significantly, e.g., into a curved slit. Nevertheless, as shown in Fig. 4(a), the numerical method is able to capture essential features of the early stages of strongly nonlinear dynamics. In the same regime, it is also possible to incorporate anisotropic surface tension or surface mobility. The latter involves multiplying the surface current by a factor $(1 + g_d \cos m\alpha)$, where α is the surface orientation in the physical z-plane, given at $z = g(e^{i\theta}, t)$ by
$$\alpha = \theta + \arg \frac{\partial g}{\partial w}(e^{i\theta}, t). \tag{17}$$
Some results are shown in Fig. 4(b), where the void develops dynamical facets.
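Equation (17) is straightforward to evaluate from a truncated Laurent series of the form (16). The sketch below is illustrative: the coefficient layout `[A1, A0, A_-1, A_-2, ...]`, the function names, and the sample ellipse are assumptions made here and are not taken from Wang et al. [18]. For an exterior map, w ∂g/∂w points along the outward normal, so the angle computed this way is that of the outward normal to the void surface.

```python
import cmath
import math

def laurent_map_derivative(w, coeffs):
    """dg/dw for g(w) = A1*w + A0 + sum_{n>=1} A_{-n} w**(-n).

    coeffs is the (hypothetical) list [A1, A0, A_-1, A_-2, ...].
    """
    dg = complex(coeffs[0])
    for n, c in enumerate(coeffs[2:], start=1):
        dg += -n * c * w ** (-n - 1)
    return dg

def orientation(theta, coeffs):
    """Surface orientation alpha = theta + arg(dg/dw) at w = exp(i*theta), Eq. (17)."""
    w = cmath.exp(1j * theta)
    return theta + cmath.phase(laurent_map_derivative(w, coeffs))

def mobility_factor(theta, coeffs, g_d, m):
    """Anisotropy factor 1 + g_d*cos(m*alpha) multiplying the surface current."""
    return 1.0 + g_d * math.cos(m * orientation(theta, coeffs))

if __name__ == "__main__":
    ellipse = [1.0, 0.0, 0.3]          # g = w + 0.3/w: semi-axes 1.3 and 0.7
    for k in range(8):
        theta = 2.0 * math.pi * k / 8
        print(round(theta, 3), round(orientation(theta, ellipse), 3))
```

For the sample ellipse the result can be compared with the elementary normal direction (0.7 cos θ, 1.3 sin θ), which gives the same angle.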
Figure 4. Numerical conformal-mapping simulations of the electromigration of voids in aluminum interconnects [18]. (a) A small shape perturbation of a cylindrical void decaying (above) or deforming into a curved slit (below), depending on a dimensionless group, χ, comparing the electron wind to surface-tension gradients. (b) A void evolving with anisotropic surface diffusivity (χ = 100, gd = 100, m = 3). (Courtesy of Zhigang Suo.)
2.1.4. Quadrature domains
We end this section by commenting on some of the mathematics underlying the existence of exact solutions to continuous-time Laplacian-growth problems. Significantly, much of this mathematics carries over to problems in which the governing field equation is not necessarily harmonic, as will be seen in the following section. The steadily-translating finger solution (11) of Saffman and Taylor turns out to be but one of an increasingly large number of known exact solutions to the standard Hele–Shaw problem. Saffman [20] himself identified a class of unsteady finger-like solutions. These solutions were later generalized by Howison [21] to solutions involving multiple fingers exhibiting such phenomena as tip-splitting, where a single finger splits into two (or more) fingers. It is even possible to find exact solutions to the more realistic case where there is a second interface further down the channel [22], which must always be the case in any experiment. Besides finger-like solutions, which are characterized by time-evolving conformal mappings having logarithmic branch-point singularities, other exact solutions, where the conformal mappings are rational functions with time-evolving poles and zeros, were first identified by Polubarinova–Kochina and Galin in 1945. Richardson [23] later rediscovered the latter solutions while simultaneously presenting important new theoretical connections between the Hele–Shaw problem and a class of planar domains known as quadrature domains. The simplest example of a quadrature domain is a circular disc D of radius r centered at the origin, which satisfies the identity
$$\iint_D h(z)\, dx\, dy = \pi r^2 h(0), \tag{18}$$
where h(z) is any function analytic in the disc (and integrable over it). Equation (18), which is known as a quadrature identity since it holds for any analytic function h(z), is simply a statement of the well-known mean-value theorem of complex analysis [24]. A more general domain D, satisfying a generalized quadrature identity of the form
$$\iint_D h(z)\, dx\, dy = \sum_{k=1}^{N} \sum_{j=0}^{n_k - 1} c_{jk}\, h^{(j)}(z_k) \tag{19}$$
is known as a quadrature domain. Here, $\{z_k \in \mathbb{C}\}$ is a set of points inside D and $h^{(j)}(z)$ denotes the jth derivative of h(z). If one makes the choice $h(z) = z^n$ in (19), the resulting integral quantities have become known as the Richardson moments of the domain. Richardson showed that the dynamics of the Hele–Shaw problem is such as to preserve quadrature domains. That is, if the initial fluid domain in a Hele–Shaw cell is a quadrature domain at time
t = 0, it remains a quadrature domain at later times (so long as the solution does not break down). This result is highly significant and provides a link with many other areas of mathematics including potential theory, the notion of balayage, algebraic curves, Schwarz functions and Cauchy transforms. Richardson [25] discusses many of these connections while Varchenko and Etingof [26] provide a more general overview of the various mathematical implications of Richardson’s result. Shapiro [27] gives more general background on quadrature domain theory. It is a well-known result in the theory of quadrature domains [27] that simply-connected quadrature domains can be parameterized by rational function conformal mappings from a unit circle. Given Richardson’s result on the preservation of quadrature domains, this explains why Polubarinova–Kochina and Galin were able to find time-evolving rational function conformal mapping solutions to the Hele–Shaw problem. It also underlies the pole dynamics results of Bensimon and Shraiman [9]. But Richardson’s result is not restricted to simply-connected domains; multiply-connected quadrature domains are also preserved by the dynamics. Physically this corresponds to time-evolving fluid domains containing multiple bubbles of air. Indeed, motivated by such matters, recent research has focused on the analytical construction of multiply-connected quadrature domains using conformal mapping ideas [28, 29]. In the higher-connected case, the conformal mappings are no longer simply rational functions but are given by conformal maps that are automorphic functions (or, meromorphic functions on compact Riemann surfaces). The important point here is that understanding the physical problem from the more general perspective of quadrature domain theory has led the way to the unveiling of more sophisticated classes of exact conformal mapping solutions.
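The simplest quadrature identity, Eq. (18), is easy to test numerically. The sketch below integrates a sample analytic function over a disc with a midpoint rule in polar coordinates and compares the result with π r² h(0). The function name `disc_integral`, the grid sizes, and the test functions are assumptions chosen for illustration.

```python
import cmath
import math

def disc_integral(h, r, n_r=200, n_theta=256):
    """Midpoint-rule approximation of the area integral of h(z) over |z| < r."""
    d_rho = r / n_r
    d_theta = 2.0 * math.pi / n_theta
    total = 0.0 + 0.0j
    for i in range(n_r):
        rho = (i + 0.5) * d_rho
        ring = sum(h(rho * cmath.exp(1j * (k + 0.5) * d_theta)) for k in range(n_theta))
        total += ring * rho * d_rho * d_theta      # area element rho d_rho d_theta
    return total

if __name__ == "__main__":
    r = 1.5
    analytic = lambda z: cmath.exp(z) + z**3        # h(0) = 1
    print(disc_integral(analytic, r), math.pi * r**2)
    # A non-analytic integrand such as |z|^2 does not obey the identity (h(0) = 0 here).
    print(disc_integral(lambda z: abs(z) ** 2, r))
```

The second call is included only to show that analyticity of h is essential: the integral of |z|² over the disc is π r⁴/2, not zero.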
2.2. Bi-Harmonic Fields
Although not as well known as conformal mapping involving harmonic functions, there is also a substantial literature on complex-variable methods to solve the bi-harmonic equation,
$$\nabla^2 \nabla^2 \psi = 0, \tag{20}$$
which arises in two-dimensional elasticity [30] and fluid mechanics [31]. Unlike harmonic functions, which can be expressed in terms of a single analytic function (the complex potential), bi-harmonic functions can be expressed in terms of two analytic functions, f(z) and g(z), in Goursat form [24]:
$$\psi(z, \bar{z}) = \mathrm{Im}\,[\bar{z} f(z) + g(z)]. \tag{21}$$
Note that ψ is no longer just the imaginary part of an analytic function g(z) but also contains the imaginary part of the non-analytic component $\bar{z} f(z)$.
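The Goursat representation (21) can be checked with finite differences. The sketch below takes the sample choices f(z) = e^z and g(z) = z³ (assumptions made purely for illustration), and evaluates the five-point Laplacian and the standard 13-point bi-harmonic stencil of the resulting ψ. The Laplacian is non-zero, and equals 4 Im f'(z) for this representation, while the bi-Laplacian vanishes up to discretization error.

```python
import cmath

def psi(x, y):
    """Goursat form psi = Im[conj(z) f(z) + g(z)] with sample f = exp(z), g = z**3."""
    z = complex(x, y)
    return (z.conjugate() * cmath.exp(z) + z**3).imag

def laplacian(u, x, y, h):
    """Five-point finite-difference Laplacian of u at (x, y)."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4.0 * u(x, y)) / h**2

def bilaplacian(u, x, y, h):
    """13-point finite-difference approximation of the bi-harmonic operator."""
    s = 20.0 * u(x, y)
    s -= 8.0 * (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h))
    s += 2.0 * (u(x + h, y + h) + u(x + h, y - h) + u(x - h, y + h) + u(x - h, y - h))
    s += u(x + 2 * h, y) + u(x - 2 * h, y) + u(x, y + 2 * h) + u(x, y - 2 * h)
    return s / h**4

if __name__ == "__main__":
    x0, y0, h = 0.3, 0.7, 0.05
    print("Laplacian   :", laplacian(psi, x0, y0, h))
    print("4 Im f'(z)  :", 4.0 * cmath.exp(complex(x0, y0)).imag)
    print("bi-Laplacian:", bilaplacian(psi, x0, y0, h))
```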
A difficulty with bi-harmonic problems is that the typical boundary conditions (see below) are not conformally invariant, so conformal mapping does not usually generate new solutions by simply a change of variables, as in Eq. (4). Nevertheless, the Goursat form of the solution, Eq. (21), is a major simplification, which enables analytical progress.
2.2.1. Viscous sintering
Sintering describes a process by which a granular compact of particles (e.g., metal or glass) is raised to a sufficiently high temperature that the individual particles become mobile and release surface energy in such a way as to produce inter-particulate bonds. At the start of a sinter process, any two particles that are initially touching develop a thin “neck” which, as time evolves, grows in size to form a more developed bond. In compacts in which the packing is such that particles have more than one touching neighbor, as the necks grow in size, the sinter body densifies and any enclosed pores between particles tend to close up. The macroscopic material properties of the compact at the end of the sinter process depend heavily on the degree of densification. In industrial applications, it is crucial to be able to obtain accurate and reliable estimates of the time taken for pores to close (or reduce to a sufficiently small size) within any given initial sinter body in order that industrial sinter times are optimized without compromising the macroscopic properties of the final densified sinter body. The sintering material is modeled as a region D(t) of very viscous, incompressible fluid, in which the velocity field,
$$\mathbf{u} = (u, v) = (\psi_y, -\psi_x), \tag{22}$$
is given by the curl of an out-of-plane vector, whose magnitude is a stream function, ψ(x, y, t), which satisfies the bi-harmonic equation [31]. On the boundary ∂D(t), the tangential stress must vanish and the normal stress must be balanced by the uniform surface tension effect, i.e.,
$$-p\, n_i + 2\mu\, e_{ij} n_j = T \kappa\, n_i, \tag{23}$$
where p is the fluid pressure, μ is the viscosity, T is the surface tension parameter, κ is the boundary curvature, $n_i$ denotes components of the outward normal n to ∂D(t) and $e_{ij}$ is the usual fluid rate-of-strain tensor. The boundary is time-evolved in a quasi-steady fashion with a normal velocity, $V_n$, determined by the same kinematic condition, $V_n = \mathbf{u} \cdot \mathbf{n}$, as in viscous fingering. In terms of the Goursat functions in (21) – which are now generally time-evolving – the stress condition (23) takes the form (with primes denoting z-derivatives and $\bar{f}(\bar{z}, t) \equiv \overline{f(z, t)}$ the conjugate function)
$$f(z, t) + z\, \bar{f}'(\bar{z}, t) + \bar{g}'(\bar{z}, t) = -\frac{i}{2}\, z_s, \tag{24}$$
where again s denotes arc length. Once f(z, t) has been determined from (24), the kinematic condition
$$\mathrm{Im}[z_t\, \bar{z}_s] = \mathrm{Im}[-2 f(z, t)\, \bar{z}_s] - \frac{1}{2} \tag{25}$$
is used to time-advance the interface. A significant contribution was made by Hopper [32] who showed, using complex variable methods based on the decomposition (21), that the problem for the surface-tension driven coalescence of two equal circular blobs of viscous fluid can be reduced to the evolution of a rational function conformal map, from a unit w-circle, of the form
$$g(w, t) = \frac{R(t)\, w}{w^2 - a^2(t)}. \tag{26}$$
The two time-evolving parameters R(t) and a(t) satisfy two coupled nonlinear ODEs. Figure 5 shows a sequence of shapes of the two coalescing blobs computed using Hopper’s solution. At large times, the configuration equilibrates to a single circular blob. While Hopper’s coalescence problem provides insight into the growth of the inter-particle neck region, there are no pores in this configuration and it is natural to ask whether more general exact solutions exist. Crowdy [33] reappraised the viscous sintering problem and showed, in direct analogy with Richardson’s result on Hele–Shaw flows, that the dynamics of the sintering problem is also such as to preserve quadrature domains. As in the Hele–Shaw problem, this perspective paved the way for the identification of new exact solutions, generalizing (26), for the evolution of doubly-connected fluid regions. Figure 6 shows the shrinkage of a pore enclosed by a typical “unit” in a doubly-connected square packing of touching near-circular blobs of viscous fluid. This calculation employs a conformal mapping to the doubly-connected fluid region (which is no longer a rational function but a more general automorphic function) derived by Crowdy [34] and, in the same spirit as Hopper’s solution (26), requires only the integration of three coupled nonlinear ODEs. The fluid regions shown in Fig. 6 are all doubly-connected quadrature domains. Richardson [35] has also considered similar Stokes flow problems using a different conformal mapping approach.
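The family of shapes generated by Eq. (26) can be explored without the ODEs for R(t) and a(t), which are not reproduced here. The sketch below uses only area conservation, which follows from incompressibility, to fix R for a given real pole parameter a > 1, assuming the map takes the interior of the unit disk to the fluid region. The closed form for R comes from the Taylor coefficients of Eq. (26) and the area formula π Σ n |c_n|²; the function names and the default area 2π (two blobs of unit radius) are assumptions for illustration.

```python
import cmath
import math

def hopper_R(a, area=2.0 * math.pi):
    """R for which the image of the unit disk under R w / (w^2 - a^2) has the given area (a > 1)."""
    return math.sqrt(area / math.pi) * (a**4 - 1.0) / math.sqrt(a**4 + 1.0)

def hopper_boundary(a, n=2000, area=2.0 * math.pi):
    """Interface points z = g(exp(i*theta)) for pole parameter a > 1."""
    R = hopper_R(a, area)
    pts = []
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        pts.append(R * w / (w * w - a * a))
    return pts

def polygon_area(pts):
    """Shoelace area of a closed polygon given as complex vertices."""
    s = 0.0
    for p, q in zip(pts, pts[1:] + pts[:1]):
        s += p.real * q.imag - q.real * p.imag
    return abs(s) / 2.0

def neck_and_length(a, area=2.0 * math.pi):
    """Half-width of the neck, |g(i)|, and half-length of the pair, |g(1)|."""
    R = hopper_R(a, area)
    return R / (a * a + 1.0), R / (a * a - 1.0)

if __name__ == "__main__":
    for a in (1.001, 1.05, 1.2, 2.0, 10.0):
        neck, length = neck_and_length(a)
        print(a, round(neck, 4), round(length, 4), round(polygon_area(hopper_boundary(a)), 4))
```

As a approaches 1 the neck closes and the half-length tends to 2 (two touching unit circles), while for large a both lengths tend to √2, the radius of the single circle with the same area.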
Figure 5. Evolution of the solution of Hopper [32] for the coalescence of two equal blobs of fluid under the effects of surface tension.
Figure 6. The coalescence of fluid blobs and collapse of cylindrical pores in a model of viscous sintering. This sequence of images shows an analytical solution by Crowdy [34] using complex-variable methods.
2.2.2. Pores in elastic solids
Solid elasticity in two dimensions is also governed by a bi-harmonic function, the Airy stress function [30]. Therefore, the stress tensor, $\sigma_{ij}$, and the displacement field, $u_i$, may be expressed in terms of two analytic functions, f(z) and g(z):
$$\frac{\sigma_{22} + \sigma_{11}}{2} = f'(z) + \overline{f'(z)}, \tag{27}$$
$$\frac{\sigma_{22} - \sigma_{11}}{2} + i\sigma_{12} = \bar{z} f''(z) + g'(z), \tag{28}$$
$$\frac{Y}{1 + \nu}\,(u_1 + i u_2) = \kappa f(z) - z\, \overline{f'(z)} - \overline{g(z)}, \tag{29}$$
where Y is Young’s modulus, ν is Poisson’s ratio, and κ = (3 − ν)/(1 + ν) for plane stress and κ = 3 − 4ν for plane strain. As with bubbles in viscous flow, the use of Goursat functions allows conformal mapping to be applied to bi-harmonic free boundary problems in elastic solids, without solving explicitly for bulk stresses and strains. For example, Wang and Suo [36] have simulated the dynamics of a singly-connected pore by surface diffusion in an infinite stressed elastic solid. As in the case of void electromigration described above, they solve nonlinear ODEs for the Laurent coefficients of the conformal map from the exterior of the unit disk, Eq. (16). Under uniaxial tension, there is a competition between surface tension, which prefers a circular shape, and the applied stress, which drives elongation and eventually fracture in the transverse direction. The numerical
method, based on the truncated Laurent expansion, is able to capture the transition from stable elliptical shapes at small applied stress to the unstable growth of transverse protrusions at large applied stress, although naturally it breaks down when cusps resembling crack tips begin to form.
2.3. Non-Harmonic Conformally Invariant Fields
The vast majority of applications of conformal mapping fall into one of the two classes above, involving harmonic or bi-harmonic functions, where the connections with analytic functions, Eqs. (4) and (21), are cleverly exploited. It turns out, however, that conformal mapping can be applied just as easily to a broad class of problems involving non-harmonic fields, recently discovered by Bazant [37]. Of course, in planar geometry, the conformal map itself is described by an analytic function, but the fields need not be, as long as they transform in a simple way under conformal mapping. The most convenient fields satisfy conformally invariant partial differential equations (PDEs), whose forms are unaltered by a conformal change of variables. It is straightforward to transform PDEs under a conformal mapping of the plane, w = f(z), by expressing them in terms of the complex gradient operator introduced above,
$$\nabla_z = \frac{\partial}{\partial x} + i\,\frac{\partial}{\partial y} = 2\,\frac{\partial}{\partial \bar{z}}, \tag{30}$$
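which we have related to the $\bar{z}$ partial derivative using the Cauchy–Riemann equations, Eq. (1). In this form, it is clear that $\nabla_z f = 0$ if and only if f(z) is analytic, in which case $\overline{\nabla}_z f = 2f'$. Using the chain rule, we also obtain the transformation rule for the gradient,
$$\overline{\nabla}_z = f'\, \overline{\nabla}_w, \quad \text{or equivalently} \quad \nabla_z = \overline{f'}\, \nabla_w. \tag{31}$$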
To apply this formalism, we write Laplace’s equation in the form,
$$\nabla_z^2 \phi = \mathrm{Re}\, \nabla_z \overline{\nabla}_z \phi = \nabla_z \overline{\nabla}_z \phi = 0, \tag{32}$$
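which assumes that mixed partial derivatives can be taken in either order. (Note that $a \cdot b = \mathrm{Re}\, \bar{a} b$.) The conformal invariance of Laplace’s equation, $\nabla_w \overline{\nabla}_w \phi = 0$, then follows from a simple calculation,
$$\nabla_z \overline{\nabla}_z = (\nabla_z f')\, \overline{\nabla}_w + |f'|^2\, \nabla_w \overline{\nabla}_w = |f'|^2\, \nabla_w \overline{\nabla}_w, \tag{33}$$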
where $\nabla_z f' = 0$ because $f'$ is also analytic. As a result of conformal invariance, any harmonic function in the w-plane, φ(w), remains harmonic in the
z-plane, φ(f(z)), after the simple substitution, w = f(z). We came to the same conclusion above in Eq. (4), using the connection between harmonic and analytic functions, but the argument here is more general and also applies to other PDEs. The bi-harmonic equation is not conformally invariant, but some other equations – and systems of equations – are. The key observation is that any “product of two gradients” transforms in the same way under conformal mapping, not only the Laplacian, $\nabla \cdot \nabla \phi$, but also the term, $\nabla \phi_1 \cdot \nabla \phi_2 = \mathrm{Re}\, \overline{(\nabla \phi_1)}\, \nabla \phi_2$, which involves two real functions, $\phi_1$ and $\phi_2$:
$$\mathrm{Re}\, \overline{(\nabla_z \phi_1)}\, \nabla_z \phi_2 = |f'|^2\, \mathrm{Re}\, \overline{(\nabla_w \phi_1)}\, \nabla_w \phi_2. \tag{34}$$
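The transformation rule (34) can be verified numerically for fields that are not harmonic. In the sketch below the map f(z) = z², the fields φ₁ = (Re w)² and φ₂ = (Re w)(Im w), and all function names are assumptions chosen for illustration; gradients are taken by central differences.

```python
def grad(phi, z, h=1e-5):
    """Complex gradient phi_x + i*phi_y of a real field phi at the point z."""
    gx = (phi(z + h) - phi(z - h)) / (2.0 * h)
    gy = (phi(z + 1j * h) - phi(z - 1j * h)) / (2.0 * h)
    return complex(gx, gy)

def dot(a, b):
    """Planar dot product of two vectors stored as complex numbers: Re(conj(a) * b)."""
    return (a.conjugate() * b).real

def f(z):
    return z * z                      # sample conformal map w = f(z)

def f_prime(z):
    return 2.0 * z

def phi1(w):
    return w.real ** 2                # non-harmonic sample field

def phi2(w):
    return w.real * w.imag

def invariance_check(z0):
    """Return both sides of Eq. (34) at the point z0."""
    lhs = dot(grad(lambda z: phi1(f(z)), z0), grad(lambda z: phi2(f(z)), z0))
    w0 = f(z0)
    rhs = abs(f_prime(z0)) ** 2 * dot(grad(phi1, w0), grad(phi2, w0))
    return lhs, rhs

if __name__ == "__main__":
    print(invariance_check(complex(0.8, 0.5)))
```

The two returned numbers agree to within the finite-difference error, even though neither field satisfies Laplace's equation.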
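(Todd Squires has since noted that the operator, $\nabla \phi_1 \times \nabla \phi_2 = \mathrm{Im}\, \overline{(\nabla \phi_1)}\, \nabla \phi_2$, also transforms in the same way.) These observations imply the conformal invariance of a broad class of systems of nonlinear PDEs:
$$\sum_{i=1}^{N} \left( a_i\, \nabla^2 \phi_i + \sum_{j=i}^{N} a_{ij}\, \nabla \phi_i \cdot \nabla \phi_j + \sum_{j=i+1}^{N} b_{ij}\, \nabla \phi_i \times \nabla \phi_j \right) = 0, \tag{35}$$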
where the coefficients $a_i(\phi)$, $a_{ij}(\phi)$, and $b_{ij}(\phi)$ may be nonlinear functions of the unknowns, $\phi = (\phi_1, \phi_2, \ldots, \phi_N)$, but not of the independent variables or any derivatives of the unknowns. The general solutions to these equations are not harmonic and thus depend on both z and $\bar{z}$. Nevertheless, conformal mapping works in precisely the same way: A solution, $\phi(w, \bar{w})$, can be mapped to another solution, $\phi(f(z), \overline{f(z)})$, by a simple substitution, w = f(z). This allows the conformal mapping techniques above (and below) to be extended to new kinds of moving free boundary problems.
2.3.1. Transport-limited growth phenomena
For physical applications, the conformally invariant class, Eq. (35), includes the important set of steady conservation laws for gradient-driven flux densities,
$$\frac{\partial c_i}{\partial t} = \nabla \cdot \mathbf{F}_i = 0, \quad \mathbf{F}_i = c_i \mathbf{u}_i - D_i(c_i)\, \nabla c_i, \quad \mathbf{u}_i \propto \nabla \phi, \tag{36}$$
where $\{c_i\}$ are scalar fields, such as chemical concentrations or temperature, $\{D_i(c_i)\}$ are nonlinear diffusivities, $\{\mathbf{u}_i\}$ are irrotational vector fields causing advection, and φ is a potential [37]. Physical examples include advection-diffusion, where φ is the harmonic velocity potential, and electrochemical transport, where φ is the non-harmonic electrostatic potential, determined implicitly by electro-neutrality.
By modifying the classical methods described above for Laplacian growth, conformal-map dynamics can thus be formulated for more general, transport-limited growth phenomena [38]. The natural generalization of the kinematic condition, Eq. (6), is that the free boundary moves in proportion to one of the gradient-driven fluxes with velocity, $\mathbf{v} \propto \mathbf{F}_1$. For the growth of a finite filament, driven by prescribed fluxes and/or concentrations at infinity, one obtains a generalization of the Polubarinova–Galin equation for the conformal map, z = g(w, t), from the exterior of the unit disk to the exterior of the growing object,
$$\mathrm{Re}\left(\overline{w\, g'}\, g_t\right) = \sigma(w, t) \quad \text{on } |w| = 1, \tag{37}$$
where σ(w, t) is the non-constant, time-dependent normal flux, $\hat{n} \cdot \mathbf{F}_1$, on the unit circle in the mathematical plane.
2.3.2. Solidification in a fluid flow
A special case of the conformally invariant Eq. (35) has been known for almost a century: steady advection-diffusion of a scalar field, c, in a potential flow, u. The dimensionless PDEs are
$$\mathrm{Pe}\, \mathbf{u} \cdot \nabla c = \nabla^2 c, \quad \mathbf{u} = \nabla \phi, \quad \nabla^2 \phi = 0, \tag{38}$$
where we have introduced the Péclet number, Pe = UL/D, in terms of a characteristic length, L, velocity, U, and diffusivity, D. In 1905, Boussinesq showed that Eq. (38) takes a simpler form in streamline coordinates, (φ, ψ), where $\Phi = \phi + i\psi$ is the complex velocity potential:
$$\mathrm{Pe}\, \frac{\partial c}{\partial \phi} = \frac{\partial^2 c}{\partial \phi^2} + \frac{\partial^2 c}{\partial \psi^2} \tag{39}$$
because advection (the left hand side) is directed only along streamlines, while diffusion (the right hand side) also occurs in the transverse direction, along isopotential lines. From the general perspective above, we recognize this as the conformal mapping of an invariant system of PDEs of the form (36) to the complex $\Phi$ plane, where the flow is uniform and any obstacles in the flow are mapped to horizontal slits. Streamline coordinates form the basis for Maksimov’s method for interfacial growth by steady advection-diffusion in a background potential flow, which has been applied to freezing in ground-water flow and vapor deposition on textile fibers [4, 39]. The growing interface is a streamline held at a fixed concentration (or temperature) relative to the flowing bulk fluid at infinity. This is arguably the simplest growth model with two competing transport processes, and yet open questions remain about the nonlinear dynamics, even without surface tension.
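Boussinesq's transformation can be illustrated with a one-line solution. In streamline coordinates, c = exp(Pe φ) satisfies Eq. (39); mapping back with a sample potential flow should then give a solution of Eq. (38) in the physical plane. The sketch below assumes the corner flow with complex potential z², so φ = x² − y², and checks Eq. (38) by finite differences. The flow, the value of Pe, and the function names are assumptions chosen for illustration.

```python
import math

PE = 2.0  # sample Peclet number

def phi(x, y):
    """Velocity potential of the corner flow, Re(z**2)."""
    return x * x - y * y

def conc(x, y):
    """c = exp(Pe*phi): a solution of Eq. (39) pulled back to the physical plane."""
    return math.exp(PE * phi(x, y))

def advection_and_diffusion(x, y, h=1e-3):
    """Return (Pe u.grad c, laplacian c) at (x, y) using central differences."""
    ux = (phi(x + h, y) - phi(x - h, y)) / (2.0 * h)
    uy = (phi(x, y + h) - phi(x, y - h)) / (2.0 * h)
    cx = (conc(x + h, y) - conc(x - h, y)) / (2.0 * h)
    cy = (conc(x, y + h) - conc(x, y - h)) / (2.0 * h)
    lap = (conc(x + h, y) + conc(x - h, y) + conc(x, y + h) + conc(x, y - h)
           - 4.0 * conc(x, y)) / h**2
    return PE * (ux * cx + uy * cy), lap

if __name__ == "__main__":
    print(advection_and_diffusion(0.4, 0.3))
```

The two returned values agree to discretization accuracy, even though the flow in the physical plane is far from uniform.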
Figure 7. The exact self-similar solution, Eq. (40), for continuous advection-diffusion-limited growth in a uniform background potential flow (yellow streamlines) at the dynamical fixed point (Pe = ∞). The concentration field (color contour plot) is shown for Pe = 100. (Courtesy of Jaehyuk Choi.)
The normal flux distribution to a finite absorber in a uniform background flow, σ(w, t) in Eq. (37), is well known, but rather complicated [40], so it is replaced by asymptotic approximations for analytical work, such as $\sigma \sim 2\sqrt{\mathrm{Pe}/\pi}\, \sin(\theta/2)$ as Pe → ∞, which is the fixed point of the dynamics. In this important limit, Choi et al. [41] have found an exact similarity solution,
$$g(w, t) = A_1(t) \sqrt{w(w - 1)}, \quad A_1(t) = t^{2/3}, \tag{40}$$
to Eq. (37) with $\sigma(e^{i\theta}, t) = \sqrt{A_1(t)}\, \sin(\theta/2)$ (since Pe(t) ∝ $A_1(t)$ for a fixed background flow). As shown in Fig. 7, this corresponds to a constant shape, whose linear size grows like $t^{2/3}$, with a 90° cusp at the rear stagnation point, where a branch point of $\sqrt{w(w - 1)}$ lies on the unit circle. For any finite Pe(t), however, the cusp is smoothed, and the map remains univalent, although other singularities may form. Curiously, when mapped to the channel geometry with log z, the solution (40) becomes a Saffman–Taylor finger of width λ = 3/4.
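The similarity solution (40) can be substituted directly into Eq. (37). The sketch below evaluates both sides on the unit circle, writing the square root as w√(1 − 1/w) so that the map behaves like A₁ w at infinity; that choice of branch and the function names are assumptions made here for illustration.

```python
import cmath
import math

def A1(t):
    return t ** (2.0 / 3.0)

def dA1_dt(t):
    return (2.0 / 3.0) * t ** (-1.0 / 3.0)

def similarity_residual(theta, t):
    """Re(conj(w g_w) g_t) - sqrt(A1) sin(theta/2) at w = exp(i*theta), 0 < theta < 2*pi."""
    w = cmath.exp(1j * theta)
    s = w * cmath.sqrt(1.0 - 1.0 / w)          # branch of sqrt(w (w - 1)) with s ~ w at infinity
    g_w = A1(t) * (2.0 * w - 1.0) / (2.0 * s)  # derivative of A1 * s with respect to w
    g_t = dA1_dt(t) * s
    lhs = ((w * g_w).conjugate() * g_t).real
    rhs = math.sqrt(A1(t)) * math.sin(theta / 2.0)
    return lhs - rhs

if __name__ == "__main__":
    for theta in (0.5, 2.0, 4.0, 6.0):
        print(theta, similarity_residual(theta, 1.7))
```

The residual vanishes to rounding error for every sampled angle and time, which reflects the identity Re(conj(w g_w) g_t) = (3/2) A₁ A₁' sin(θ/2) for this map.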
3. Stochastic Interfacial Dynamics
The continuous dynamics of conformal maps is a mature subject, but much attention is now focusing on analogous problems with discrete, stochastic dynamics. The essential change is in the kinematic condition: The expression for the interfacial velocity, e.g., Eq. (6), is re-interpreted as the probability
density (per unit arc length) for advancing the interface with a discrete “bump”, e.g., to model a depositing particle. Continuous conformal-map dynamics is then replaced by rules for constructing and composing the bumps. This method of iterated conformal maps was introduced by Hastings and Levitov [42] in the context of Laplacian growth. Stochastic Laplacian growth has been discussed since the early 1980s, but Hastings and Levitov [42] first showed how to implement it with conformal mapping. They proposed the following family of bump functions,
$$f_{\lambda,\theta}(w) = e^{i\theta} f_\lambda\!\left(e^{-i\theta} w\right), \quad |w| \geq 1, \tag{41}$$
$$f_\lambda(w) = w^{1-a} \left\{ \frac{(1 + \lambda)(w + 1)}{2w} \left[ w + 1 + \sqrt{w^2 + 1 - 2w\, \frac{1 - \lambda}{1 + \lambda}} \right] - 1 \right\}^{a}, \tag{42}$$
as elementary univalent mappings of the exterior of the unit disk used to advance the interface (0 < a ≤ 1). The function, $f_{\lambda,\theta}(w)$, places a bump of (approximate) area, λ, on the unit circle, centered at angle, θ. Compared to analytic functions of the unit disk, the Hastings–Levitov function (42) generates a much more localized perturbation, focused on the region between two branch points, leaving the rest of the unit circle unaltered [43]. For a = 1, the map produces a strike, which is a line segment of length $\sqrt{\lambda}$ emanating normal to the circle. For a = 1/2, the map is an invertible composition of simple linear, Möbius and Joukowski transformations, which inserts a semi-circular bump on the unit circle. As shown in Fig. 8, this yields a good description of
Figure 8. Simulation of the aggregation of (a) 4 and (b) 10 000 particles using the Hastings– Levitov algorithm (a = 1/2). Color contours show the quasi-steady concentration (or probability) field for mobile particles arriving from infinity, and purple curves indicate lines of diffusive flux (or probability current). (Courtesy of Jaehyuk Choi and Benny Davidovitch.)
aggregating particles, although other choices, like a = 2/3, have also been considered [43]. Quantifying the effect of the bump shape remains a basic open question. Once the bump function is chosen, the conformal map, z = gn (w), from the exterior of the unit disk to the evolving domain with n bumps is constructed by iteration,
$$ g_n(w) = g_{n-1}\!\left(f_{\lambda_n,\theta_n}(w)\right), \tag{43} $$
starting from the initial interface, given by $g_0(w)$. All of the physics is contained in the sequence of bump parameters, $\{(\lambda_n, \theta_n)\}$, which can be generated in different ways (in the w-plane) to model a variety of physical processes (in the z-plane). As shown in Fig. 8(b), the interface often develops a very complicated, fractal structure, which is given, quite remarkably, by an exact mapping of the unit circle. The great advantage of stochastic conformal mapping over atomistic or Monte Carlo simulation of interfacial growth lies in its mathematical insight. For example, given the sequence $\{(\lambda_n, \theta_n)\}$ from a simulation of some physical growth process, the Laurent coefficients, $A_k(n)$, of the conformal map, $g_n(w)$, as defined in Eq. (16), can be calculated analytically. For the bump function (42), Davidovitch et al. [43] provide a hierarchy of recursion relations, yielding formulae such as
$$ A_1(n) = \prod_{m=1}^{n} (1+\lambda_m)^{a}, \tag{44} $$
and explain how to interpret the Laurent coefficients. For example, A1 is the conformal radius of the cluster, a convenient measure of its linear extent. It is also the radius of a grounded disk with the same capacitance (with respect to infinity) as the cluster. The Koebe “1/4 theorem” on univalent functions [44] ensures that the cluster (image of the unit disk) is always contained in a disk of radius 4A1 . The next Laurent coefficient, A0 , is the center of a uniformly charged disk, which would have the same asymptotic electric field as the cluster (if also charged). Similarly, higher Laurent coefficients encode higher multipole moments of the cluster. Mapping the unit circle with a truncated Laurent expansion defines the web, which wraps around the growing tips and exhibits a sort of turbulent dynamics, endlessly forming and smoothing cusp-like protrusions [42, 45]. The stochastic dynamics, however, does not suffer from finite-time singularities because the iterated map, by construction, remains univalent. In some sense, discreteness plays the role of surface tension, as another regularization of ill-posed continuum models like Laplacian growth.
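The construction in Eqs. (41)–(44) can be checked numerically. The following is a minimal sketch, not the authors' implementation. The function names, the choice $g_0(w) = w$, and the bump parameters are illustrative assumptions, and Eq. (42) is regrouped as $w\,(\cdot/w)^a$ so that principal branches of the root and power can be used for $|w| \ge 1$.

```python
import numpy as np

def bump(w, lam, theta=0.0, a=0.5):
    """Bump map f_{lam,theta}(w) of Eqs. (41)-(42), evaluated for |w| >= 1."""
    w = np.asarray(w, dtype=complex)
    u = w * np.exp(-1j * theta)                  # rotate the bump centre to angle 0
    c = (1.0 - lam) / (1.0 + lam)
    # u*sqrt(1 + 1/u^2 - 2c/u) is sqrt(u^2 + 1 - 2cu) on the branch ~ u at infinity
    root = u * np.sqrt(1.0 + 1.0 / u**2 - 2.0 * c / u)
    strike = (1.0 + lam) * (u + 1.0) / (2.0 * u) * (u + 1.0 + root) - 1.0
    # u^(1-a) * strike^a regrouped as u * (strike/u)^a (assumed branch choice)
    return np.exp(1j * theta) * u * (strike / u) ** a

def iterated_map(w, lams, thetas, a=0.5):
    """g_n(w) of Eq. (43) with g_0(w) = w (assumed): the newest bump acts first."""
    z = np.asarray(w, dtype=complex)
    for lam, theta in zip(reversed(lams), reversed(thetas)):
        z = bump(z, lam, theta, a)
    return z

def conformal_radius(lams, a=0.5):
    """A_1(n) from Eq. (44)."""
    return np.prod((1.0 + np.asarray(lams)) ** a)

# Illustrative (arbitrary) bump parameters, not taken from any physical model
rng = np.random.default_rng(0)
lams = rng.uniform(0.01, 0.05, size=20)
thetas = rng.uniform(0.0, 2.0 * np.pi, size=20)
w_far = 1.0e6                                    # g_n(w)/w -> A_1 as |w| -> infinity
print(iterated_map(w_far, lams, thetas) / w_far, conformal_radius(lams))
```

Far from the cluster the ratio $g_n(w)/w$ should approach the product in Eq. (44), and points of the unit circle outside the two branch points should remain on the unit circle.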
3.1.
Diffusion-Limited Aggregation (DLA)
The stochastic analog of Laplacian growth is the DLA model of Witten and Sander [46], illustrated in Fig. 8, in which particles perform random walks one-by-one from infinity until they stick irreversibly to a cluster, which grows from a seed at the origin. DLA and its variants (see below) provide simple models for many fractal patterns in nature, such as colloidal aggregates, dendritic electro-deposits, snowflakes, lightning strikes, mineral deposits, and surface patterns in ion-beam microscopy [14]. In spite of decades of research, however, DLA still presents theoretical mysteries, which are just beginning to unravel [47]. The Hastings–Levitov algorithm for DLA prescribes the bump parameters, $\{(\lambda_n, \theta_n)\}$, as follows. As in Laplacian growth, the harmonic function for the concentration (or probability density) of the next random walker approaching an n-particle cluster is simply,
$$ \phi_n(z) = A\,\mathrm{Re}\,\log g_n^{-1}(z), \tag{45} $$
according to Eq. (4), since $\phi(w) = A\,\mathrm{Re}\,\log w = A \log|w|$ is the (trivial) solution to Laplace’s equation in the mathematical w-plane with $\phi = 0$ on the unit disk with a circularly symmetric flux density, A, prescribed at infinity. Using the transformation rule, Eq. (31), we then find that the evolving harmonic measure, $p_n(z)|dz|$, for the nth growth event corresponds to a uniform probability measure, $P_n(\theta)\,d\theta$, for angles, $\theta_n$, on the unit circle, $w = e^{i\theta}$:
$$ p_n(z)\,|dz| = |\nabla_z \phi|\,|dz| = \left|\frac{\nabla_w \phi}{g'_{n-1}}\right| |g'_{n-1}\,dw| = |\nabla_w \phi|\,|dw| = \frac{d\theta}{2\pi} = P_n(\theta)\,d\theta, \tag{46} $$
where we set $A = 1/(2\pi)$ for normalization, which implicitly sets the time scale. The conformal invariance of the harmonic measure is well known in mathematics, but the surprising result of Hastings and Levitov [42] is that all the complexity of DLA is slaved to a sequence of independent, uniform random variables. The complexity resides in the bump area, $\lambda_n$, which depends nontrivially on current cluster geometry and thus on the entire history of random angles, $\{\theta_m \mid m \le n\}$. For DLA, the bump area in the mathematical w-plane should be chosen such that it has a fixed value, $\lambda_0$, in the physical z-plane, equal to the aggregating particle area. As long as the new bump is sufficiently small, it is natural to try to correct only for the Jacobian factor
$$ J_n(w) = |g_n'(w)|^2 \tag{47} $$
of the previous conformal map at the center of the new bump,
$$ \lambda_n = \frac{\lambda_0}{J_{n-1}(e^{i\theta_n})}, \tag{48} $$
although it is not clear a priori that such a local approximation is valid. Note at least that $|g_n'| \to \infty$, and thus $\lambda_n \to 0$, as the cluster grows, so this has a chance of working. Numerical simulations with the Hastings–Levitov algorithm do indeed produce nearly constant bump areas, as in Fig. 8. Nevertheless, much larger “particles”, which fill deep fjords in the cluster, occasionally occur where the map varies too wildly, as shown in Fig. 9(a). It is possible (but somewhat unsatisfying) to reject particles outside an “area acceptance window” to produce rather realistic DLA clusters, as shown in Fig. 9(b). It seems that the rejected large bumps are so rare that they do not much influence statistical scaling properties of the clusters [48], although this issue is by no means rigorously resolved.
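The growth rule of Eqs. (46)–(48) can be sketched in a few lines. This is an illustration under stated assumptions, not a production algorithm. The seed is taken as the unit disk ($g_0(w) = w$), all function names are hypothetical, and the derivative $g'_{n-1}$ is estimated by a one-sided radial difference quotient with step `h`, whereas a careful implementation would more likely use the chain rule. The bump map is the same regrouped form of Eq. (42) used for $|w| \ge 1$.

```python
import numpy as np

def bump(w, lam, theta, a=0.5):
    """Bump map of Eqs. (41)-(42) for |w| >= 1 (regrouped as u*(strike/u)^a)."""
    u = np.asarray(w, dtype=complex) * np.exp(-1j * theta)
    c = (1.0 - lam) / (1.0 + lam)
    root = u * np.sqrt(1.0 + 1.0 / u**2 - 2.0 * c / u)
    strike = (1.0 + lam) * (u + 1.0) / (2.0 * u) * (u + 1.0 + root) - 1.0
    return np.exp(1j * theta) * u * (strike / u) ** a

def g(w, lams, thetas, a=0.5):
    """Iterated map of Eq. (43) with g_0(w) = w (assumed seed: the unit disk)."""
    z = np.asarray(w, dtype=complex)
    for lam, theta in zip(reversed(lams), reversed(thetas)):
        z = bump(z, lam, theta, a)
    return z

def grow_dla(n, lam0=1e-2, a=0.5, h=1e-6, seed=0):
    """Generate bump parameters {(lam_n, theta_n)} for n growth events."""
    rng = np.random.default_rng(seed)
    lams, thetas = [], []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * np.pi)    # uniform angle measure, Eq. (46)
        w = np.exp(1j * theta)
        # one-sided radial difference quotient, staying in |w| >= 1 (assumption)
        dg = (g(w * (1.0 + h), lams, thetas, a) - g(w, lams, thetas, a)) / (w * h)
        jac = abs(dg) ** 2                       # Jacobian factor, Eq. (47)
        lams.append(lam0 / jac)                  # local area correction, Eq. (48)
        thetas.append(theta)
    return np.array(lams), np.array(thetas)

lams, thetas = grow_dla(40)
boundary = g(np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 2000)), lams, thetas)
print(lams.min(), lams.max(), np.abs(boundary).max())
```

Each growth event re-evaluates the whole composition, which is the origin of the $O(n^2)$ cost; no area acceptance window is applied in this sketch.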
3.2.
Fractal Geometry
Fractal patterns abound in nature, and DLA provides the most common way to understand them [14]. The fractal scaling of DLA has been debated for decades, but conformal dynamics is shedding new light on the problem. Simulations show that the conformal radius (44) exhibits fractal scaling, $A_1(n) \propto n^{1/D_f}$, where the fractal dimension, $D_f = 1.71$, agrees with the accepted value from Monte Carlo (random walk) simulations of DLA, although the prefactor seems to depend on the bump function [43]. A perturbative renormalization-group analysis of the conformal dynamics by Hastings [45] gives a similar result, $D_f = 2 - 1/2 + 1/5 = 1.7$. The multifractal spectrum of the harmonic measure has also been studied [49, 50]. Perhaps the most basic question is whether DLA clusters are truly fractal – statistically self-similar and free of any length scale. This long-standing question requires accurate statistics and very large simulations, to erase the surprisingly long memory of the initial conditions. Conformal dynamics provides exact formulae for cluster moments, but simulations are limited to at most $10^5$ particles by poor $O(n^2)$ scaling, caused by the history-dependent Jacobian in Eq. (48). In contrast, efficient random-walk simulations can aggregate many millions of particles. Therefore, Somfai et al. [51] developed a hybrid method relying only upon the existence of the conformal map, but not the Hastings–Levitov algorithm to construct it. Large clusters are grown by Monte Carlo simulation, and approximate Laurent coefficients are computed, purely for their morphological information, as follows. For a given cluster of size N, M random walkers are launched
Figure 9. Simulations of fractal aggregates by Stepanov and Levitov [48]: (a) Superimposed time series of the boundary, showing the aggregation of particles, represented by iterated conformal maps; (b) a larger simulation with a particle-area acceptance window; (c) the result of anisotropic growth probability with square symmetry; (d) square-anisotropic growth with noise control via flat particles; (e) triangular-anisotropic growth with noise control; (f) isotropic growth with noise control, which resembles radial viscous fingering. (Courtesy of Leonid Levitov.)
from far away, and the positions, $z_m$, where they would first touch the cluster, are recorded. If the conformal map, $z = g_n(e^{i\theta})$, were known, the points $z_m$ would correspond to M angles $\theta_m$ on the unit circle. Since these must sample a uniform distribution, one assumes $\theta_m = 2\pi m/M$ for large M. From Eq. (16),
the Laurent coefficients are simply the Fourier coefficients of the discretely sampled function, $z_m = \sum_k A_k e^{i\theta_m k}$. Using this method, all Laurent coefficients appear to scale with the same fractal dimension,
$$ |A_k(n)|^2 \propto n^{2/D_f}, \tag{49} $$
although the first few coefficients cross over extremely slowly to the asymptotic scaling.
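The Fourier step of the hybrid method, and the power-law fit behind Eq. (49), can be illustrated with synthetic data. This is a sketch under several assumptions: the Laurent series of Eq. (16) is taken to have the form $g(w) = \sum_k A_k w^k$, the hit positions are assumed to be already ordered along the interface, and the function names and test values are hypothetical.

```python
import numpy as np

def laurent_from_hits(z_hits, kmax=1, kmin=-3):
    """Estimate A_k from ordered hits, assuming angles theta_m = 2*pi*m/M."""
    z = np.asarray(z_hits, dtype=complex)
    M = z.size
    # numpy's forward FFT is sum_m z_m exp(-2*pi*i*k*m/M), so dividing by M
    # gives A_k = (1/M) * sum_m z_m exp(-i*k*theta_m)
    coeffs = np.fft.fft(z) / M
    return {k: coeffs[k % M] for k in range(kmax, kmin - 1, -1)}

def fractal_dimension(n, A1):
    """Fit A_1(n) ~ n^(1/D_f) on a log-log scale and return D_f."""
    slope, _ = np.polyfit(np.log(n), np.log(A1), 1)
    return 1.0 / slope

# Synthetic check: a known three-term map sampled at equally spaced angles
M = 256
theta = 2.0 * np.pi * np.arange(M) / M
z_hits = 2.0 * np.exp(1j * theta) + (0.3 + 0.1j) + 0.5 * np.exp(-1j * theta)
A = laurent_from_hits(z_hits)

# Synthetic power law with the exponent quoted in the text
n = np.array([1e3, 1e4, 1e5])
D_f = fractal_dimension(n, 2.0 * n ** (1.0 / 1.71))
print(A[1], A[0], A[-1], D_f)
```

With real random-walk data the equal-spacing assumption holds only statistically, so the recovered coefficients carry sampling noise that decreases as M grows.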
3.3.
Snowflakes and Viscous Fingers
In conventional Monte Carlo simulations, many variants of DLA have been proposed to model real patterns found in nature [14]. For example, clusters closely resembling snowflakes can be grown by a combination of noise control (requiring multiple hits before attachment) and anisotropy (on a lattice). Conformal dynamics offers the same flexibility, as shown in Fig. 9, while allowing anisotropy and noise to be controlled independently [48]. Anisotropy can be introduced in the growth probability with a weight factor, $1 + c \cos m\alpha_n$, where $\alpha_n$ is the surface orientation angle in the physical plane given by Eq. (17), or by simply rejecting angles outside some tolerance from the desired symmetry directions. Noise can be controlled by flattening the aspect ratio of the bumps. Without anisotropy, this produces smooth fluid-like patterns (Fig. 9(f)), reminiscent of viscous fingers (Fig. 2). The possible relation between DLA and viscous fingering is a tantalizing open question in pattern formation. Many authors have argued that the regularization of finite-time singularities in Laplacian growth by discreteness is somehow analogous to surface tension. Indeed, the average DLA cluster in a channel, grown by conformal mapping, is similar (but not identical) to a Saffman–Taylor finger of width 1/2 [52], and the instantaneous expected growth rate of a cluster can be related to the Polubarinova–Galin (or “Shraiman–Bensimon”) equation [42]. Conformal dynamics with many bumps grown simultaneously suggests that Laplacian growth and DLA are in different universality classes, due to the basic difference of layer-by-layer vs. one-by-one growth, respectively [53]. Another multiple-bump algorithm with complete surface coverage, however, seems to yield the opposite conclusion [54].
3.4.
Dielectric Breakdown
In their original paper, Hastings and Levitov [42] allowed for the size of the bump in the physical plane to vary with an exponent, $\alpha$, by replacing $J_{n-1}$ with $(J_{n-1})^{\alpha/2}$ in Eq. (48). In DLA ($\alpha = 2$), the bump size is roughly constant, but for $0 < \alpha < 2$ the bump size grows with the local gradient of the Laplacian field. This is a simple model for dielectric breakdown, where the stochastic growth of an electric discharge penetrating a material is nonlinearly enhanced by the local electric field. One could use strikes ($a = 1$) rather than bumps ($a = 1/2$) to better reproduce the string-like branched patterns seen in laboratory experiments [14] and more familiar lightning strikes. The model displays a “stable-to-turbulent” phase transition: The relative surface roughness decreases with time for $0 \le \alpha < 1$ and grows for $\alpha > 1$. The original Dielectric Breakdown Model (DBM) of Niemeyer et al. [55] has a more complicated conformal-dynamics representation. As usual, the growth is driven by the gradient of a harmonic function, $\phi$ (the electrostatic potential) on an iso-potential surface (the discharge region). Unlike the $\alpha$-model above, however, DBM growth events are assumed to have constant size, so the bump size in the mathematical plane is still chosen according to Eq. (48). The difference lies in the growth measure, which does not obey Eq. (46). Instead, the generalized harmonic measure in the physical z-plane is given by
$$ p(z) \propto |\nabla_z \phi|^{\eta}, \tag{50} $$
where $\eta$ is an exponent interpolating between the Eden model ($\eta = 0$), DLA ($\eta = 1$), and nonlinear dielectric breakdown ($\eta > 1$). For $\eta \ne 1$, the fortuitous cancellation in Eq. (46) does not occur. Instead, a similar calculation using Eq. (45) yields a non-uniform probability measure for the nth angle on the unit circle in the mathematical plane,
$$ P_n(\theta_n) = |g'_{n-1}(e^{i\theta_n})|^{1-\eta}, \tag{51} $$
which is complicated and depends on the entire history of the simulation. Nevertheless, conformal mapping can be applied fruitfully to DBM, because the benefit of not having to solve Laplace’s equation around the cluster outweighs the difficulty of sampling the angle measure. Surmounting the latter with a Monte Carlo algorithm, Hastings [56] has performed DBM simulations of $10^4$ growth events, an order of magnitude beyond standard methods that solve Laplace’s equation on a lattice. The results, illustrated in Fig. 10, support the theoretical conjecture that DBM clusters become one-dimensional, and thus non-fractal, for $\eta \ge 4$. Using the conformal-mapping formalism, efforts are also underway to develop a unified scaling theory of the $\eta$-model for the growth probability from DBM combined with the $\alpha$-model above for the bump size [50].
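To make the sampling problem in Eq. (51) concrete, the sketch below draws angles from the unnormalized measure by tabulating it on a grid and inverting the cumulative distribution. This is a simple stand-in technique chosen for illustration, not the Monte Carlo algorithm of Hastings [56]. The toy map $g(w) = w + 0.3/w$ (an ellipse with tips on the real axis), the function name, and the grid size are all assumptions.

```python
import numpy as np

def sample_dbm_angles(gprime, eta, n_samples, n_grid=4096, seed=0):
    """Draw angles with density proportional to |g'(e^{i theta})|^(1-eta)."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(0.0, 2.0 * np.pi, n_grid + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    # unnormalized angle measure of Eq. (51), evaluated at cell midpoints
    density = np.abs(gprime(np.exp(1j * mid))) ** (1.0 - eta)
    cdf = np.concatenate(([0.0], np.cumsum(density)))
    cdf /= cdf[-1]
    # invert the piecewise-linear CDF at uniform random numbers
    return np.interp(rng.random(n_samples), cdf, edges)

def toy_gprime(w):
    """Derivative of the toy map g(w) = w + 0.3/w (illustrative assumption)."""
    return 1.0 - 0.3 / w**2

angles_dla = sample_dbm_angles(toy_gprime, eta=1.0, n_samples=20000)
angles_dbm = sample_dbm_angles(toy_gprime, eta=3.0, n_samples=20000)
# fraction of weight near the tips (theta = 0, pi), measured by <cos^2 theta>
print(np.mean(np.cos(angles_dla) ** 2), np.mean(np.cos(angles_dbm) ** 2))
```

For $\eta = 1$ the angles are uniform, while for $\eta > 1$ they concentrate where $|g'|$ is small, i.e. at the tips of the toy ellipse. In a real simulation the derivative has much sharper structure, so a fixed grid would need to be far finer or adaptive.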
Figure 10. Conformal-mapping simulations by Hastings [56] of the Dielectric Breakdown Model with (a) η = 2 and (b) η = 3.5. (Courtesy of Matt Hastings.)
3.5.
Brittle Fracture
Modeling the stochastic dynamics of fracture is a daunting problem, especially in heterogeneous materials [14, 57]. The basic equations and boundary conditions are still the subject of debate, and even the simplest models are difficult to solve. In two dimensions, stochastic conformal mapping provides an elegant, new alternative to discrete-lattice and finite-element models. In brittle fracture, the bulk material is assumed to obey Lamé’s equation of linear elasticity,
$$ \rho\,\frac{\partial^2 \mathbf{u}}{\partial t^2} = (\lambda + \mu)\,\nabla(\nabla \cdot \mathbf{u}) + \mu\,\nabla^2 \mathbf{u}, \tag{52} $$
where $\mathbf{u}$ is the displacement field, $\rho$ is the density, and $\mu$ and $\lambda$ are Lamé’s constants. For conformal mapping, it is crucial to assume (i) two-dimensional symmetry of the fracture pattern and (ii) quasi-steady elasticity, which sets the left hand side to zero to obtain equations of the type described above. For Mode III fracture, where a constant out-of-plane shear stress is applied at infinity, we have $\nabla \cdot \mathbf{u} = 0$, so the steady Lamé equation reduces to Laplace’s equation for the out-of-plane displacement, $\nabla^2 u_z = 0$, which allows the use of complex potentials. For Modes I and II, where a uniaxial, in-plane tensile stress is applied at infinity, the steady Lamé equation must be solved. As discussed above, this is equivalent to the bi-harmonic equation for the Airy stress function, which allows the use of Goursat functions. For all three modes, the method of iterated conformal maps can be adapted to produce fracture patterns for a variety of physical assumptions about crack dynamics [58]. For Modes I and II fracture, these models provide the first examples of stochastic bi-harmonic growth, which have interesting differences with stochastic Laplacian growth for Mode III fracture. The Hastings–Levitov formalism is used with constant-size bumps, as in DLA, to represent the fracture process zone, where elasticity does not apply. The growth measure is a function of the excess tangential stress, beyond a critical yield stress, $\sigma_c$, characterizing the local strength of the material. Quenched disorder is easily included by making $\sigma_c$ a random variable. In spite of its many assumptions, the method provides analytical insights, while obviating the need to solve Eq. (52) during fracture dynamics, so it merits further study.
3.6.
Advection-Diffusion-Limited Aggregation
Non-local fractal growth models typically involve a single bulk field driving the dynamics, such as the particle concentration in DLA, the electric field in DBM, or the strain field in brittle fracture, and as a result these models tend to yield statistically similar structures, apart from the effect of boundary conditions. Pattern formation in nature, however, is often fueled by multiple transport processes, such as diffusion, electromigration, and/or advection in a fluid flow. The effect of such dynamical competition on growth morphology is an open question, which would be difficult to address with lattice-based or finite-element methods, since many large fractal clusters must be generated to fully explore the space and time dependence. Once again, conformal mapping provides a convenient means to formulate stochastic analogs of the non-Laplacian transport-limited growth models from Section 2.3 (in two dimensions). It is straightforward to adapt the Hastings–Levitov algorithm to construct stochastic dynamics driven by bulk fields satisfying the conformally invariant system of Eq. (35). A class of such models has recently been formulated by Bazant et al. [38]. Perhaps the simplest case involving two transport processes, illustrated in Fig. 11, is Advection-Diffusion-Limited Aggregation (ADLA), or “DLA in a flow”. Imagine a fluid carrying a dilute concentration of sticky particles flowing past a sticky object, which begins to collect a fractal aggregate. As the cluster grows, it causes the fluid to flow around it and changes the concentration field, which in turn alters the growth probability measure. Assuming a quasi-steady potential flow with a uniform speed far from the cluster, the dimensionless transport problem is
$$ \mathrm{Pe}_0\,\nabla\phi \cdot \nabla c = \nabla^2 c, \quad \nabla^2 \phi = 0, \qquad z \in \Omega_z(t), \tag{53} $$
$$ c = 0, \quad \hat{n} \cdot \nabla\phi = 0, \quad \sigma = \hat{n} \cdot \nabla c, \qquad z \in \partial\Omega_z(t), \tag{54} $$
$$ c \to 1, \quad \nabla\phi \to \hat{x}, \qquad |z| \to \infty, \tag{55} $$
Figure 11. A simulation of Advection-Diffusion-Limited Aggregation from Bazant et al. [38]. In each row, the growth probabilities in the physical z-plane (on the right) are obtained by solving advection-diffusion in a potential flow past an absorbing cylinder in the mathematical w-plane (on the left), with the same time-dependent Péclet number.
where $\mathrm{Pe}_0$ is the initial Péclet number and $\sigma$ is the diffusive flux to the surface, which drives the growth. The transport problem is solved in the mathematical w-plane, where it corresponds to a uniform potential flow of concentrated fluid past an absorbing circular cylinder. The normal diffusive flux on the cylinder, $\sigma(\theta, \mathrm{Pe})$, can be obtained from a tabulated numerical solution or an accurate analytical approximation [40]. Because the boundary condition on $\phi$ at infinity is not conformally invariant, the flow in the w-plane has a time-dependent Péclet number, $\mathrm{Pe}(t) = A_1(t)\,\mathrm{Pe}_0$, which grows with the conformal radius of the cluster. As a result, the probability of the nth growth event is given by a time-dependent, non-uniform measure for the angle on the unit circle,
$$ P_n(\theta_n) = \frac{\beta \tau_n}{\lambda_0}\,\sigma\!\left(e^{i\theta_n}, A_1(t_{n-1})\right), \tag{56} $$
where $\beta$ is a constant setting the mean growth rate. The waiting time between growth events is an exponential random variable with mean, $\tau_n$, given by the current integrated flux to the object,
$$ \frac{\lambda_0}{\beta \tau_n} = \int_0^{2\pi} \sigma\!\left(e^{i\theta}, A_1(t_{n-1})\right) d\theta. \tag{57} $$
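As a short consistency check, added here and not part of the original derivation, integrating the angle measure of Eq. (56) over the unit circle and substituting the mean waiting time from Eq. (57) confirms that the measure is normalized:

```latex
\int_0^{2\pi} P_n(\theta)\,d\theta
  = \frac{\beta\tau_n}{\lambda_0}
    \int_0^{2\pi} \sigma\!\left(e^{i\theta}, A_1(t_{n-1})\right) d\theta
  = \frac{\beta\tau_n}{\lambda_0}\cdot\frac{\lambda_0}{\beta\tau_n}
  = 1 .
```

In other words, $\tau_n$ plays the role of the normalization constant of the flux profile, so a larger integrated flux means a shorter mean waiting time.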
Unlike DLA, the aggregation speeds up as the cluster grows, due to a larger cross section to catch new particles in the flow. As shown in Fig. 11, the model displays a universal dynamical crossover from DLA (the unstable fixed point) to an advection-dominated stable fixed point, since Pe(t) → ∞. Remarkably, the fractal dimension remains constant during the transition, equal to the value for DLA, in spite of dramatic changes in the growth rate and morphology (as indicated by higher Laurent coefficients). Moreover, the shape of the “average” ADLA cluster in the high-Pe regime of Fig. 11 is quite similar (but not identical) to the exact solution, Eq. (40), for the analogous continuous problem in Fig. 7. Much remains to be done to understand these kinds of models and apply them to materials problems.
4.
Curved Surfaces
Entov and Etingof [59] considered the generalized problem of Hele–Shaw flows in a non-planar cell having non-zero curvature. In such problems, the velocity of the viscous flow is still the (surface) gradient of a potential, φ, but this function is now a solution of the so-called Laplace–Beltrami equation on the curved surface. The Riemann mapping theorem extends to curved surfaces and says that any simply-connected smooth surface is conformally equivalent to the unit disk, the complex plane, or the Riemann sphere. A common example is the well-known stereographic projection of the surface of a sphere to the (compactified) complex plane. Under a conformal mapping, solutions of the Laplace–Beltrami equation map to solutions to Laplace’s equation, and this combination of facts led Entov and Etingof [59] to identify classes of explicit solutions to the continuous Hele–Shaw problem in a variety of non-planar cells. With very similar intent, Parisio et al. [60] have recently considered the evolution of Saffman–Taylor fingers on the surface of a sphere. By now, the reader may realize that most of the methods already considered in this article are, in principle, amenable to generalization to curved surfaces,
which can be reached by conformal mapping of the plane. For example, Fig. 12 shows a simulation of a DLA cluster growing on the surface of a sphere, using a generalized Hastings–Levitov algorithm, which takes surface curvature into account. The key modification is to multiply the Jacobian in Eq. (47) by the Jacobian of the stereographic projection, $1 + |z/R|^2$, where R is the radius of the sphere. It should also be clear that any continuous or discrete growth model driven by a conformally-invariant bulk field, such as ADLA, can be simulated on general curved surfaces by means of appropriate conformal projection to a complex plane. The reason is that the system of Eq. (35) is invariant under any conformal mapping, to a flat or curved surface, because each term transforms like the Laplacian, $\nabla^2 \phi \to J\,\nabla^2 \phi$, where J is the Jacobian. The purpose of studying these models is not only to understand growth on a particular ideal shape, such as a sphere, but more generally to explore the effect of local surface curvature on pattern formation. For example, this could help interpret mineral deposit patterns in rough geological fracture surfaces, which form by the diffusion and advection of oxygen in slowly flowing water.
Figure 12. Conformal-mapping simulation of DLA on a sphere. Particles diffuse one by one from the North Pole and aggregate on a seed at the South Pole. (Courtesy of Jaehyuk Choi, Martin Bazant, and Darren Crowdy.)
5.
Outlook
Although conformal mapping has been with us for centuries, new developments with applications continue to the present day. This appears to be the first pedagogical review of stochastic conformal-mapping methods for interfacial dynamics, which also covers the latest progress in continuum methods. Hopefully, this will encourage the further exchange of ideas (and people) between the two fields. Our focus has also been on materials problems, which provide many opportunities to apply and extend conformal mapping. Building on specific open questions scattered throughout the text, we close with a general outlook on directions for future research. A basic question for both stochastic and continuum methods is the effect of geometrical constraints, such as walls or curved surfaces, on interfacial dynamics. Most work to date has been for either radial or channel geometries, but it would be interesting to describe finite viscous fingers or DLA clusters growing near walls of various shapes, as is often the case in materials applications. The extension of conformal-map dynamics to multiply connected domains is another mathematically challenging area, which has received some attention recently but seems ripe for further development. Understanding the exact solution structure of Laplacian-growth problems using the mathematical abstraction of quadrature domain theory holds great potential, especially given that mathematicians have already begun to explore the extent to which the various mathematical concepts extend to higher-dimensions [27]. Describing multiply connected domains could pave the way for new mathematical theories of evolving material microstructures. Topology is the main difference between an isolated bubble and a dense sintering compact. Microstructural evolution in elastic solids may be an even more interesting, and challenging, direction for conformal-mapping methods. 
From a mathematical point of view, much remains to be done to place stochastic conformal-mapping methods for interfacial dynamics on more rigorous ground. This has recently been achieved in the simpler case of Stochastic Loewner evolution (SLE), which has a similar history to the interfacial problems discussed here [61]. Oded Schramm introduced SLE in 2000 as a stochastic version of the continuous Loewner evolution from univalent function theory, which grows a one-dimensional random filament from a disk or half plane. This important development in pure mathematics came a few years after the pioneering DLA papers of Hastings and Levitov in physics. A notable difference is that SLE has a rigorous mathematical theory based on stochastic calculus, which has enabled new proofs on the properties of percolation clusters and self-avoiding random walks (in two dimensions, of course). One hopes that someday DLA, DBM, ADLA, and other fractal-growth models will also be placed on such a rigorous footing.
Returning to materials applications, it seems there are many new problems to be considered using conformal mapping. Relatively little work has been done so far on void electromigration, viscous sintering, solid pore evolution, brittle fracture, electrodeposition, and solidification in fluid flows. The reader is encouraged to explore these and other problems using a powerful mathematical tool, which deserves more attention in materials science.
References
[1] R.V. Churchill and J.W. Brown, Complex Variables and Applications, 5th edn., McGraw-Hill, New York, 1990.
[2] T. Needham, Visual Complex Analysis, Clarendon Press, Oxford, 1997.
[3] S.D. Howison, “Complex variable methods in Hele–Shaw moving boundary problems,” Eur. J. Appl. Math., 3, 209–224, 1992.
[4] L.M. Cummings, Y.E. Hohlov, S.D. Howison, and K. Kornev, “Two-dimensional solidification and melting in potential flows,” J. Fluid Mech., 378, 1–18, 1999.
[5] P.G. Saffman and G.I. Taylor, “The penetration of a fluid into a porous medium or Hele–Shaw cell containing a more viscous liquid,” Proceedings of the Royal Society, London A, 245, 312–329, 1958.
[6] M. Kruskal and H. Segur, “Asymptotics beyond all orders in a model of crystal growth,” Stud. Appl. Math., 85, 129, 1991.
[7] S. Tanveer, “Evolution of Hele–Shaw interface for small surface tension,” Philosophical Transactions of the Royal Society of London A, 343, 155–204, 1993a.
[8] S. Tanveer, “Surprises in viscous fingering,” J. Fluid Mech., 409, 273–308, 2000.
[9] B. Shraiman and D. Bensimon, “Singularities in non-local interface dynamics,” Phys. Rev. A, 30, 2840–2842, 1984.
[10] L.P. Kadanoff, “Exact solutions for the Saffman–Taylor problem with surface tension,” Phys. Rev. Lett., 65, 2986–2988, 1990.
[11] D. Crowdy, “Hele–Shaw flows and water waves,” J. Fluid Mech., 409, 223–242, 2000.
[12] J.W. McLean and P.G. Saffman, “The effect of surface tension on the shape of fingers in the Hele–Shaw cell,” J. Fluid Mech., 102, 455, 1981.
[13] W.-S. Dai, L.P. Kadanoff, and S.-M. Zhou, “Interface dynamics and the motion of complex singularities,” Phys. Rev. A, 43, 6672–6682, 1991.
[14] A. Bunde and S. Havlin (eds.), Fractals and Disordered Systems, 2nd edn., Springer, New York, 1996.
[15] S. Tanveer, “Singularities in the classical Rayleigh–Taylor flow: formation and subsequent motion,” Proceedings of the Royal Society, A, 441, 501–525, 1993b.
[16] V.E. Zakharov, “Stability of periodic waves of finite amplitude on the surface of deep fluid,” J. Appl. Mech. Tech. Phys., 2, 190, 1968.
[17] T. Yoshikawa and A.M. Balk, “The growth of fingers and bubbles in the strongly nonlinear regime of the Richtmyer–Meshkov instability,” Phys. Lett. A, 251, 184–190, 1999.
[18] W. Wang, Z. Suo, and T.-H. Hao, “A simulation of electromigration-induced transgranular slits,” J. Appl. Phys., 79, 2394–2403, 1996.
[19] M. Ben Amar, “Void electromigration as a moving free-boundary value problem,” Physica D, 134, 275–286, 1999.
[20] P. Saffman, “Exact solutions for the growth of fingers from a flat interface between two fluids in a porous medium,” Q. J. Mech. Appl. Math., 12, 146–150, 1959.
[21] S. Howison, “Fingering in Hele–Shaw cells,” J. Fluid Mech., 12, 439–453, 1986.
[22] D. Crowdy and S. Tanveer, “The effect of finiteness in the Saffman–Taylor viscous fingering problem,” J. Stat. Phys., 114, 1501–1536, 2004.
[23] S. Richardson, “Hele–Shaw flows with a free boundary produced by the injection of fluid into a narrow channel,” J. Fluid Mech., 56, 609–618, 1981.
[24] G. Carrier, M. Krook, and C. Pearson, Functions of a Complex Variable, McGraw-Hill, New York, 1966.
[25] S. Richardson, “Hele–Shaw flows with time-dependent free boundaries involving injection through slits,” Stud. Appl. Math., 87, 175–194, 1992.
[26] A. Varchenko and P. Etingof, Why the Boundary of a Round Drop Becomes a Curve of Order Four, University Lecture Series, AMS, Providence, 1992.
[27] H. Shapiro, The Schwarz Function and its Generalization to Higher Dimensions, Wiley, New York, 1992.
[28] S. Richardson, “Hele–Shaw flows with time-dependent free boundaries involving a multiply-connected fluid region,” Eur. J. Appl. Math., 12, 571–599, 2001.
[29] D. Crowdy and J. Marshall, “Constructing multiply-connected quadrature domains,” SIAM J. Appl. Math., 64, 1334–1359, 2004.
[30] N. Muskhelishvili, Some Basic Problems of the Mathematical Theory of Elasticity, Noordhoff, Groningen, Holland, 1953.
[31] G.K. Batchelor, An Introduction to Fluid Dynamics, Cambridge University Press, 1967.
[32] R. Hopper, “Plane Stokes flow driven by capillarity on a free surface,” J. Fluid Mech., 213, 349–375, 1990.
[33] D. Crowdy, “A note on viscous sintering and quadrature identities,” Eur. J. Appl. Math., 10, 623–634, 1999.
[34] D.G. Crowdy, “Viscous sintering of unimodal and bimodal cylindrical packings with shrinking pores,” Eur. J. Appl. Math., 14, 421–445, 2003.
[35] S. Richardson, “Plane Stokes flow with time-dependent free boundaries in which the fluid occupies a doubly-connected region,” Eur. J. Appl. Math., 11, 249–269, 2000.
[36] W. Wang and Z. Suo, “Shape change of a pore in a stressed solid via surface diffusion motivated by surface and elastic energy variations,” J. Mech. Phys. Solids, 45, 709–729, 1997.
[37] M.Z. Bazant, “Conformal mapping of some non-harmonic functions in transport theory,” Proceedings of the Royal Society, A, 460, 1433, 2004.
[38] M.Z. Bazant, J. Choi, and B. Davidovitch, “Dynamics of conformal maps for a class of non-Laplacian growth phenomena,” Phys. Rev. Lett., 91, 045503, 2003.
[39] K. Kornev and G. Mukhamadullina, “Mathematical theory of freezing for flow in porous media,” Proceedings of the Royal Society, London A, 447, 281–297, 1994.
[40] J. Choi, D. Margetis, T.M. Squires, and M.Z. Bazant, “Steady advection-diffusion to finite absorbers in two-dimensional potential flows,” J. Fluid Mech., 2004b.
[41] J. Choi, B. Davidovitch, and M.Z. Bazant, “Crossover and scaling of advection-diffusion-limited aggregation,” In preparation, 2004a.
[42] M.B. Hastings and L.S. Levitov, “Laplacian growth as one-dimensional turbulence,” Physica D, 116, 244–252, 1998.
[43] B. Davidovitch, H.G.E. Hentschel, Z. Olami, I. Procaccia, L.M. Sander, and E. Somfai, “Diffusion-limited aggregation and iterated conformal maps,” Phys. Rev. E, 59, 1368–1378, 1999.
[44] P.L. Duren, Univalent Functions, Springer-Verlag, New York, 1983.
Conformal mapping methods for interfacial dynamics
1451
[45] M.B. Hastings, “Renormalization theory of stochastic growth,” Phys. Rev. E, 55, 135, 1997. [46] T.A. Witten and L.M. Sander, “Diffusion-limited aggregation: a kinetic critical phenomenon,” Phys. Rev. Lett., 47, 1400–1403, 1981. [47] T.C. Halsey, “Diffusion-limited aggregation: a model for pattern formation,” Phys. Today, 53, 36, 2000. [48] M.G. Stepanov and L.S. Levitov, “Laplacian growth with separately controlled noise and anisotropy,” Phys. Rev. E, 63, 061102, 2001. [49] M.H. Jensen, A. Levermann, J. Mathiesen, and I. Procaccia, “Multifractal structure of the harmonic measure of diffusion-limited aggregates,” Phys. Rev. E, 65, 046109, 2002. [50] R.C. Ball and E. Somfai, “Theory of diffusion controlled growth,” Phys. Rev. Lett., 89, 133503, 2002. [51] E. Somfai, L.M. Sander, and R.C. Ball, “Scaling and crossovers in diffusion limited aggregation,” Phys. Rev. Lett., 83, 5523, 1999. [52] E. Somfai, R.C. Ball, J.P. DeVita, and L.M. Sander, “Diffusion-limited aggregation in channel geometry,” Phys. Rev. E, 68, 020401, 2003. [53] F. Barra, B. Davidovitch, and I. Procaccia, “Iterated conformal dynamics and Laplacian growth,” Phys. Rev. E, 65, 046144, 2002a. [54] A. Levermann and I. Procaccia, “Algorithm for parallel laplacian growth by iterated conformal maps,” Phys. Rev. E, 69, 031401, 2004. [55] L. Niemeyer, L. Pietronero, and H.J. Wiesmann, “Fractal dimension of dielectric breakdown,” Phys. Rev. Lett., 52, 1033–1036, 1984. [56] M.B. Hastings, “Fractal to nonfractal phase transition in the dielectric breakdown model,” Phys. Rev. Lett., 87, 175502, 2001. [57] H.J. Hermann and S. Roux (eds.), Statistical Models for the Fracture of Disordered Media, North-Holland, Amsterdam, 1990. [58] F. Barra, A. Levermann, and I. Procaccia, “Quasistatic brittle fracture in inhomogeneous media and iterated conformal maps,” Phys. Rev. E, 66, 066122, 2002b. [59] V.M. Entov and P.I. Etingof, “Bubble contraction in Hele–Shaw cells,” Quart. J. Mech. Appl. Math., 507–535, 1991. [60] F. 
Parisio, F. Moreas, J.A. Miranda, and M. Widom, “Saffman–Taylor problem on a sphere,” Phys. Rev. E, 63, 036307, 2001. [61] W. Kager and B. Nienhuis, “A guide to stochastic loewner evolution and its applications,” J. Stat. Phys., 115, 1149–1229, 2004.
4.11 EQUATION-FREE MODELING FOR COMPLEX SYSTEMS
Ioannis G. Kevrekidis¹, C. William Gear¹, and Gerhard Hummer²
¹ Princeton University, Princeton, NJ, USA
² National Institutes of Health, Bethesda, MD, USA
A persistent feature of many complex systems is the emergence of macroscopic, coherent behavior from the interactions of microscopic “agents” – molecules, cells, individuals in a population – among themselves and with their environment. The implication is that macroscopic rules (a description of the system at a coarse-grained, high-level) can somehow be deduced from microscopic ones (a description at a much finer level). For laminar Newtonian fluid mechanics, a successful coarse-grained description (the Navier–Stokes equations) was known on a phenomenological basis long before its approximate derivation from kinetic theory [1]. Today we must frequently study systems for which the physics can be modeled at a microscopic, fine scale; yet it is practically impossible to explicitly derive a good macroscopic description from the microscopic rules. Hence, we look to the computer to explore the macroscopic behavior based on the microscopic description. It is difficult to define complexity in a precise, useful way. At the same time it pervades current modeling in engineering science, in the life and physical sciences, and beyond them (e.g., in economics) (see, e.g., Refs. [2, 3]). We may not typically think of a laminar Newtonian flow as complex, even though it involves interactions of enormous numbers of fluid molecules with themselves and with the boundaries of the flow. Such problems are considered simple because we have a good model, describing the behavior of the system at the level we need for practical purposes. If we are interested in pressure drops and flow rates over humanly relevant space/time scales, we do not need to know where each and every molecule is, or its individual velocity, at a given instant in time. Similarly, if a stirred chemical reactor can be modeled adequately, for design purposes, by a few ordinary differential equations (ODEs), the immense complexity of molecular interactions involved in flow, reaction and mixing in it goes unnoticed. 
The system is classified as simple, because a simple model of the behavior is adequate for practical purposes. This suggests that the scale of the observer, and the practical goals of the modeling, are crucial in classifying a system, its models, or its behavior as complex – or as simple.
S. Yip (ed.), Handbook of Materials Modeling, 1453–1475. © 2005 Springer. Printed in the Netherlands.
Macroscopic models of reaction and transport processes in our textbooks come in the form of conservation laws (species, mass, momentum, energy) closed through constitutive equations (reaction rates as a function of concentration, viscous stresses as functionals of velocity gradients). These models are written directly at the scale (alternatively, at the level of complexity) at which we are interested in practically modeling the system behavior. Because we observe the system at the level of concentrations or velocity fields, we sometimes forget that what is really evolving during an experiment are distributions of colliding and reacting molecules. We know, from experience with particular classes of problems, that it is possible to write predictive deterministic laws for the behavior observed at the level of concentrations or velocity fields – laws that are predictive over space and time scales relevant to engineering practice. Knowing the right level of observation at which we can be practically predictive, we attempt to write closed evolution equations for the system at this level. The closures may be based on experiment (e.g., through engineering correlations) or on mathematical modeling and approximation of what happens at more microscopic scales (e.g., the Chapman–Enskog expansion).
In many problems of current modeling practice, ranging from materials science to ecology, and from engineering to computational chemistry, the physics are known at the microscopic/individual level, and the closures required to translate them to high-level, coarse-grained, macroscopic descriptions are not available. Sometimes we do not even know at what level of observation one can be practically predictive.
Severe computational limitations arise in trying to bridge, through direct computer simulation, the enormous gap between the scale of the available description and the macroscopic, “system” scale at which the questions of interest are asked and the practical answers are required (see, e.g., Refs. [4, 5]). These computational limitations are a major stumbling block in current complex system modeling. Our objective is to describe a computational approach for dealing with any complex, multi-scale system whose collective, coarse-grained behavior is simple when we know in principle how to model such systems at a very fine scale (e.g., through molecular dynamics). We assume that we do not know how to write good simple model equations at the right coarse-grained, macroscopic scale for their collective, coarse-grained behavior. We will argue that, in many cases, the derivation of macroscopic equations can be circumvented; that by using short bursts of appropriately initialized microscopic simulation one can effectively solve the macroscopic equations without ever writing them down. A direct bridge can be built between microscopic simulation (e.g., kinetic Monte Carlo, agent-based modeling) and traditional continuum numerical
analysis. It is possible to enable microscopic simulators to directly perform macroscopic, systems level tasks. The main idea is to consider the microscopic, fine-scale simulator as a (computational) experiment that one can set up, initialize, and run at will. The results of such appropriately designed, initialized and executed brief computational experiments allow us to estimate the same information that a macroscopic model would allow us to evaluate from explicit formulas.
The heart of the approach can be conveyed through a simple example (see Fig. 1). Consider a single, autonomous ODE,
\[ \frac{dc}{dt} = f(c). \tag{1} \]
Think of it as a model for the dynamics of a reactant concentration in a stirred reactor. Equations like this embody “practical determinism” as discussed above: given a finite amount of information (the state at the present time, $c(t = 0)$) we can predict the state at a future time. Consider how this is done on the computer using – for illustration – the simplest numerical integration scheme, forward Euler:
\[ c_{n+1} \equiv c([n+1]\tau) = c_n + \tau f(c_n). \tag{2} \]
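As a minimal sketch of Eq. (2), the Python fragment below treats the right-hand side as a black-box callable. The function name `forward_euler` and the example right-hand side `decay` (first-order decay, f(c) = −c) are illustrative assumptions, not part of the original text.

```python
from typing import Callable, List


def forward_euler(f: Callable[[float], float], c0: float,
                  tau: float, n_steps: int) -> List[float]:
    """Integrate dc/dt = f(c) with the forward Euler scheme of Eq. (2).

    The integrator only ever *calls* f; it never needs a closed formula.
    """
    trajectory = [c0]
    c = c0
    for _ in range(n_steps):
        c = c + tau * f(c)      # c_{n+1} = c_n + tau * f(c_n)
        trajectory.append(c)
    return trajectory


def decay(c: float) -> float:
    """Hypothetical right-hand side used only for illustration."""
    return -c


traj = forward_euler(decay, c0=1.0, tau=0.1, n_steps=10)
```

Nothing in `forward_euler` depends on how `f` is evaluated; a table lookup or a subsidiary simulation could be passed in its place.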
Starting with the initial condition, $c_0$, we go to the equation and evaluate $f(c_0)$, the time derivative, or slope of the trajectory c(t); we use this value to make a prediction of the state of the system at the next time step, $c_1$. We then repeat the process: go to the equation with $c_1$ to evaluate $f(c_1)$ and use the Euler scheme to predict $c_2$; and so on. Forgetting for the moment accuracy and adaptive step size selection, consider how the equation is used: given the state we evaluate the time-derivative; and then, using mathematics (in particular, Taylor series and smoothness to create a local linear model of the process in time) we make a prediction of the state at the next time step. A numerical integration code will “ping” a sub-routine with the current state as input, and will obtain as output the time-derivative at this state. The code will then process this value, and use local Taylor series in order to make a prediction of the next state (the next value of c at which to call the sub-routine evaluating the function f).
Figure 1. (a) Forward Euler numerical integration, used (b) as a template for projective integration using the results of short experiments. (c) Fixed-point iteration for a timestepper.
Three simple things are important to notice. First, the task at hand (numerical integration) does not need a closed formula for f(c) – it only needs f(c) evaluated at a particular sequence of values $c_n$. Whether the sub-routine evaluates f(c) from a single-line formula, uses a table lookup, or solves a large subsidiary problem, from the point of view of the integration code it is the same thing. Second, the sequence of values $c_n$ at which we need the time-derivative evaluated is not known a priori. It is generated as the task progresses, from processing results of previous function evaluations through the Euler formula. We know that protocols exist for designing experiments to
accomplish tasks such as parameter estimation [6]. In the same spirit, we can think of the Euler method, and of explicit numerical integrators in general, as protocols for specifying where to perform function evaluations based on the task we want to accomplish (computation of a temporal trajectory). Lastly, the form of the protocol (the Euler method here) is based on mathematics, particularly on smoothness and Taylor series. The trajectory is locally approximated as a linear function of time; the coefficients of this function are obtained from the model using function evaluations.
Suppose now that we do not have the equation, but we have the experiment itself: we can fill up the stirred reactor with reactant at concentration $c_0$, run for some time, and record the time series of c(t). Using the results of a short run (over, say, 1 min) we can now estimate the slope, dc/dt at t = 0, and predict (using the Euler method) where the concentration will be in, say, 10 min. Now, instead of waiting for 9 min for the reactor to get there, we stop the experiment and immediately start a new one: reinitialize the reactor at the predicted concentration; run for one more minute, and use forward Euler to predict what the concentration will be 20 min down the line. We are substituting short, appropriately initialized experiments, and estimation based on the experimental results, for the function evaluations that the sub-routine with the closed form f(c) would return. We are in effect doing forward Euler again; but the coefficients of the local linear model are obtained using experimentation “on demand” [7] rather than function evaluations of an a priori available model.
Many elements of this example are contrived; for example, the assumption that an Euler prediction with a 10 min step is reasonably accurate.
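The reactor thought experiment can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: `run_experiment` is a hypothetical stand-in for the physical reactor (first-order decay with an assumed rate constant `K`), and the integrator is allowed to see nothing but its input and output.

```python
import math

K = 0.01  # assumed rate constant (1/min), hidden inside the "experiment"


def run_experiment(c_init: float, minutes: float) -> float:
    """Hypothetical stand-in for the reactor: concentration after `minutes`."""
    return c_init * math.exp(-K * minutes)


def projective_euler(c0: float, burst: float, jump: float, n_jumps: int):
    """Forward Euler driven by short experiments instead of a formula for f."""
    t, c = 0.0, c0
    history = [(t, c)]
    for _ in range(n_jumps):
        c_short = run_experiment(c, burst)   # short, freshly initialized run
        slope = (c_short - c) / burst        # estimate of dc/dt at the start
        c = c + jump * slope                 # Euler prediction `jump` minutes ahead
        t += jump
        history.append((t, c))
    return history


# 1 min bursts, 10 min Euler steps, two steps (predictions at 10 and 20 min)
history = projective_euler(c0=1.0, burst=1.0, jump=10.0, n_jumps=2)
```

With these assumed numbers the prediction at 20 min stays within about one percent of the exact value exp(−0.2), which is the sense in which the 10 min Euler step is "reasonably accurate" for slow dynamics.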
It may also appear laughable that, instead of waiting nine more minutes for the reactor to get to the predicted concentration, we will initialize a fresh experiment at that concentration. It will probably take much more than 9 min to start a new experiment; there will be startup transients, and noise in the measurements. The point, however, remains: it is possible to do forward Euler integration using short bursts of appropriately initialized experiments if it is easy to initialize such experiments at will. An “outer” process (design of the next experiment, setting it up, measuring its results, processing them to design a new experiment) is wrapped around an “inner” process (the experiment). The outer wrapper is motivated by the task that we wish to perform (here, longtime integration) and is based on traditional, continuum numerical analysis. The inner layer is the process itself. It is clear that systems theory components (data acquisition and filtering, model identification, [8]) are vital in forming the connection between the outer layer and the inner layer (the task we want to accomplish and the system itself). Now we complete the argument: suppose that the inner layer is not a laboratory experiment, but a computational one, with a model at a different, much finer level of description (for the sake of the discussion, a lattice kinetic
Monte Carlo, kMC, model of the reaction). Instead of running the kMC model for long times, and observing the evolution of the concentration, we can exploit the procedure described above, perform only short bursts of appropriately initialized microscopic simulation, and use their results to evolve the macroscopic behavior over hopefully much longer time scales. It is much easier to initialize a code at will – a computational experiment – as opposed to initializing a new laboratory experiment. Many new issues arise, notably noise, in the form of fluctuations, from the microscopic solver. The conceptual point, however, remains: even if we do not have the right macroscopic equation for the concentration, we can still perform its numerical integration without obtaining it in closed form. The skeleton of the wrapper (the integration algorithm) is the same one we would use if we had the macroscopic equation; but now function evaluations are substituted by short computational experiments with the microscopic simulator, whose results are appropriately processed for local macroscopic identification and estimation. If a large separation of time-scales exists between microscopic dynamics (here, the time we need to run kinetic Monte Carlo to estimate dc/dt) and the macroscopic evolution of the concentration, this procedure may be significantly more economical than direct simulation. Passing information between the microscopic and macroscopic scales at the beginning and the end of each computational experiment is a vitally important issue. It is accomplished through a lifting operator (macro- to micro-) and a restriction operator (micro- to macro-) as discussed below (see [9, 10] and references therein). Detailed, fine-level dynamics are typically given in terms of microscopically/stochastically evolving distributions of interacting “agents” (molecules, cells); the evolution rules could be molecular dynamics (classical, or Car–Parrinello [11]), MC or kMC, Brownian dynamics, etc. 
The macroscopic dynamics are described by closed evolution equations, typically ordinary (for macroscopically lumped systems) or partial differential/integrodifferential equations. The dependent variables in these equations are frequently a few, lower order moments of the evolving distributions (such as concentration, the zeroth moment). The proposed computational methodology consists of the following basic elements:
(a) Choose the statistics of interest for describing the long-term behavior of the system and an appropriate representation for them. For example, in a gas simulation at the particle level, the statistics would probably be density and momentum (zeroth and first moment of the particle distribution over velocities) and we might choose to discretize them in a computational domain via finite elements. We call this the macroscopic description, u. These choices suggest possible restriction operators, M, from the microscopic-level description, U, to the macroscopic description: u = MU;
(b) Choose an appropriate lifting operator, µ, from the macroscopic description, u, to one or more consistent microscopic descriptions, U. For example, in a gas simulation using pressure, etc. as the macroscopic-level variables, µ could make random particle assignments consistent with the macroscopic statistics. The operators should satisfy Mµ = I, i.e., lifting from the macroscopic to the microscopic and then restricting (projecting) down again should have no effect, except roundoff;
(c) Start with a macroscopic condition (e.g., concentration profile) $u(t_0)$;
(d) Transform it through lifting to one – or more – fine, consistent microscopic realizations $U(t_0) = \mu u(t_0)$;
(e) Evolve each realization using the microscopic simulator for the desired short macroscopic time T, generating the values $U(t_1)$ where $t_1 = t_0 + T$;
(f) Obtain the restriction(s) $u(t_1) = M U(t_1)$ (and average over them).
This constitutes the coarse time-stepper, or coarse time-T map. If this map is accurate enough, we showed above how to use it in a two-tier procedure to perform Coarse Projective Integration [12–14]:
• repeat steps (e)–(f) over several time steps, obtaining several $U(t_i)$ as well as their restrictions $u(t_i) = M U(t_i)$, i = 1, 2, . . . , k + 1;
• use the chord connecting these successive time-stepper output points to estimate the derivative – the “right-hand-side” of the equations we do not have;
• use this derivative in another, outer integrator scheme (such as forward Euler) to produce estimates of the macroscopic state much later in time, $u(t_{k+1+M})$;
• go back to step (d).
The lifting step (creating microscopic distributions conditioned on a few of their lower moments, going back to Ehrenfest, [15]) is clearly not unique, and sometimes quite non-trivial: consider for example creating a distribution of particles on a lattice that has a prescribed average as well as a prescribed pair probability.
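Steps (a) to (f) and the projective loop can be sketched for a toy problem. Everything specific here is an assumption made for illustration: the microscopic rule (`micro_step`, independent particles that react with a fixed rate), the choice of the unreacted fraction as the single coarse variable, and the names `lift`, `restrict`, and `coarse_projective_euler`. Lifting is trivial in this toy because the particles are indistinguishable and uncorrelated; the non-uniqueness discussed in the text does not show up.

```python
import math
import random


def lift(u, n_particles):
    """Lifting (macro -> micro): a particle state whose unreacted fraction is u."""
    n_alive = min(n_particles, max(0, round(u * n_particles)))
    return [True] * n_alive + [False] * (n_particles - n_alive)


def restrict(particles):
    """Restriction (micro -> macro): zeroth moment, the unreacted fraction."""
    return sum(particles) / len(particles)


def micro_step(particles, dt, rate, rng):
    """Assumed microscopic rule: each unreacted particle reacts independently
    with probability 1 - exp(-rate * dt) during a time dt."""
    p = 1.0 - math.exp(-rate * dt)
    return [alive and rng.random() >= p for alive in particles]


def coarse_projective_euler(u0, dt, n_inner, jump, n_outer,
                            n_particles=20000, rate=1.0, seed=0):
    rng = random.Random(seed)
    t, u = 0.0, u0
    out = [(t, u)]
    for _ in range(n_outer):
        particles = lift(u, n_particles)              # step (d)
        u_prev = u_last = u
        for _ in range(n_inner):                      # steps (e)-(f), repeated
            particles = micro_step(particles, dt, rate, rng)
            u_prev, u_last = u_last, restrict(particles)
        slope = (u_last - u_prev) / dt                # chord estimate of du/dt
        u = u_last + jump * slope                     # outer forward Euler jump
        t += n_inner * dt + jump
        out.append((t, u))
    return out


trajectory = coarse_projective_euler(u0=1.0, dt=0.05, n_inner=3,
                                     jump=0.2, n_outer=5)
```

In this sketch the microscopic simulator runs for 0.15 time units out of every 0.35, and the chord slope is taken from the last two inner steps only; a production scheme would average over several lifted realizations to reduce the noise in the slope estimate.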
A preparatory step (e.g., through simulated annealing) may be required to arrange the particles on the lattice consistently with the prescribed constraints. Through such appropriate preparation, one can even lift prescribed pair-correlation functions to consistent particle assemblies. Constrained dynamics algorithms, like SHAKE [16] can also be thought of as lifting procedures; see also Ref. [17].
Figure 2. Schematic illustrations of (a) coarse projective integration; (b) patch dynamics; and (c) coarse-timestepper-based bifurcation computations (see text).
An important point made in Fig. 2a is that an initial simulation interval must elapse before estimating the time-derivative of the macroscopic variables from the microscopic simulation. In the microscopic dynamics, every particle evolves while interacting with other particles, and all the moments of the distribution evolve in a coupled manner. It is therefore remarkable that practically predictive models are usually written in terms of only a few moments
of these evolving distributions. This is only possible because the remaining, higher-order moments quickly become functionals of the few, lower order, slow, “master” moments – our observation variables. This occurs over timescales that are short compared to the macroscopic observation time-scales. In this separation of time-scales (and concomitant space scales) lies the essential reduction step underpinning effective simplicity and practical determinism. The idea is that the long-term observable dynamics of the system evolve on a low-dimensional, strongly attracting, slow manifold in moments space; this is, effectively, a quasi-steady state approximation [18]. This manifold is parameterized by our observation variables (typically the lower distribution moments, like concentration) in terms of which we write macroscopic equations. The expected values of the remaining moments can be written as an (unspecified) function of the coarse variables; that is the graph of the manifold. A good example is the law of Newtonian viscosity: when one starts a molecular simulation, the stresses are not instantaneously proportional to velocity gradients – but for Newtonian fluids they become so within a few collision times, i.e., over times much shorter than the macroscopic observation times over which the Navier–Stokes equations become valid approximations. The coarse variables are therefore observation variables. If the fine-scale simulation, conditioned on values of the observation variables, is initialized “off manifold”, it only takes a fast (possibly constrained) initial transient to approach a neighborhood of this manifold. Through the restriction operator, we observe the dynamics on the hyperplane spanned by our chosen observation variables. After the system quickly relaxes to the manifold, we estimate the time-derivative of the observation variables, and use it in the projective integration scheme. 
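A two-variable toy system gives a feel for this fast relaxation. It is purely an illustrative assumption, not a model from the text: `x` plays the role of a slow observation variable, `y` that of a fast "higher moment" that is attracted to the curve y = x² on a time scale `eps`.

```python
def simulate(x, y, eps, dt, n_steps):
    """Explicit Euler for dx/dt = -x, dy/dt = -(y - x**2) / eps.

    Returns the final state and the history of the distance |y - x**2|
    from the (approximate) slow manifold y = x**2.
    """
    gaps = [abs(y - x * x)]
    for _ in range(n_steps):
        # Both updates use the old (x, y): tuple assignment evaluates first.
        x, y = x + dt * (-x), y + dt * (-(y - x * x) / eps)
        gaps.append(abs(y - x * x))
    return x, y, gaps


# Initialize far "off manifold" (y = 5 while x**2 = 1) and run for 10 * eps.
x_end, y_end, gaps = simulate(x=1.0, y=5.0, eps=0.01, dt=0.0005, n_steps=200)
```

After a time of only ten `eps` the slow variable has barely moved, while the distance from the manifold has dropped from 4 to a small residual of order `eps`; only after such a transient does a slope estimated from `x` reflect the slow dynamics.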
The dynamics of the full system will then, after lifting and a short integration, spontaneously establish (by bringing us to the manifold) the missing closure: the effect of the full description on the observed dynamics. A direct conceptual analogy arises here with center manifolds in dynamical systems (parameterized using eigenvectors of the linearization at a steady state, see, e.g., Ref. [19]) or inertial manifolds for dissipative PDEs (parameterized using eigenfunctions of a linear dissipative operator, [20, 21]). Normal forms and (approximate) inertial forms are thus analogous to our macroscopic equations for the coarse observation variables. Low order moments have traditionally been the observation variables of choice in our textbooks. In principle, however, any set of variables that parameterizes this low-dimensional slow manifold can be used as observation variables with the appropriate lifting and restriction operators. Using more observation variables than necessary reduces computational efficiency; it is analogous to using a finer mesh than necessary for the accuracy required in solving a problem. Intelligently chosen order parameters usually provide a much more parsimonious basis set on which to observe the dynamics and apply our computational framework. There is a clear analogy here with
empirical eigenfunctions [22] used for model reduction in the discretization of dissipative PDEs. The detection of good observables, capable of efficiently parameterizing this manifold, through statistical analysis of simulation results, is a crucial enabling technology for our computational framework. Using data mining techniques (e.g., see Refs. [23–25]) to find such observables can be thought of as the “variable-free” component of the equation-free modeling approach.
In coarse projective integration we exploit the smoothness in time of the unavailable macroscopic equation in order to project (jump) to the future. In the case of macroscopically (spatially or otherwise) distributed systems, one can exploit smoothness of the unavailable macroscopic equation in space in order to perform the microscopic simulations only over small, but appropriately coupled, computational boxes (“teeth”). This is illustrated in Fig. 2b:
(a) Coarse variable selection (same as above, but now the variable u(x) depends on “coarse space” x. We have chosen for simplicity to consider only one space dimension.)
(b) Choice of lifting operator (same as above, but now we lift entire profiles of u(x, t) to profiles of U(y, t), where y is microscopic space corresponding to the macroscopic space x. This lifting involves therefore not only the variables, but the space descriptions too. The basic idea is that a coarse point in x corresponds to an interval, a “box” or “tooth”, in y.)
(c) Prescribe a macroscopic initial profile $u(x, t_0)$ – the “coarse field”. In particular, consider the values $u_i(t_0)$ at a number of macro-mesh points; the macroscopic profile arises from interpolation of these values of the coarse-field.
(d) Lift the “mesh points” $x_i$ and the values $u_i(t_0)$ to profiles $U_i(y_i, t_0)$, in microscopic domains (“teeth”) $y_i$ corresponding to the coarse-mesh points $x_i$. These profiles should be conditioned on the values $u_i$, and it is a good idea that they are also conditioned on certain boundary conditions motivated by the coarse-field (e.g., be consistent with coarse slopes at the boundaries of the teeth that are computed from the coarse-field).
(e) Evolve the microscopic dynamics in each of these boxes for a short time T based on the microscopic description, and through ensembles that enforce the coarsely inspired boundary conditions (see, e.g., Ref. [26]) – and thus generate $U_i(y_i, t_1)$, where $t_1 = t_0 + T$.
(f) Obtain the restriction from each patch to coarse variables $u_i(t_1) = M U_i(y_i, t_1)$.
(g) Interpolate between these to obtain the new coarse-field $u(x, t_1)$.
Up to this point, we have the gaptooth scheme: a scheme that computes in small domains (the “teeth”) which communicate over the gaps between them
through “coarse-field motivated” boundary conditions. We can now proceed by combining the gaptooth scheme with projective integration ideas to
(h) Repeat the process (lift within the teeth, compute new boundary conditions, evolve microscopically, restrict to macroscopic variables and interpolate) for a few steps, and then
(i) Project coarse-fields “long” into the future. For a projective forward Euler this would involve using the chord between two successive coarse-fields to estimate the right-hand-side of the unavailable coarse equation, and then an Euler “projection” of the coarse-field long into the future.
(j) Repeat the entire procedure starting with the lifting (d) above.
This leads to patch dynamics: a computational framework in which simulations using the microscopic description over short times and small computational domains (“patches” in space-time) can be used to advance the macroscopic dynamics over long times and large computational domains [10, 27–29]. Initializing microscopic computations conditioned on macroscopic variables is an important component of coarse projective integration; similarly, imposing macroscopically motivated boundary conditions on microscopic computations is an important element of gaptooth and patch dynamics.
The methods we discussed can, under appropriate conditions, drastically accelerate the direct simulation of the coarse-grained, macroscopic behavior of certain complex multi-scale systems. Direct simulation, however, is but the simplest computational task one can perform with a system model. It corresponds, in some sense, to physical experimentation: we set parameter values and initial conditions, let the system evolve on the computer and observe its behavior, just like performing a laboratory experiment. Depending on what we want to learn about the system, there exist much more interesting and efficient ways of using the model and the computer.
Consider for example the location of steady states; fixed point algorithms, like the Newton–Raphson method, are a much more efficient way of finding steady states than direct integration (given a good initial guess). Such fixed point algorithms can locate both stable and unstable steady states (the latter would be extremely difficult or impossible to find with direct simulation). “The Jacobian of the solution is a treasure trove, not only for continuation, but also for analyzing stability of solutions, for detecting bifurcations of solution families, and for computing asymptotic estimates of the effects, on any solution, of small changes in parameters, boundary conditions and boundary shape” [30]. Beyond stability and sensitivity analysis, having the steady states and using Taylor series in their neighborhood (Jacobians, Hessians) one can design stabilizing controllers and observers, solve optimization problems, etc. There is a vast arsenal of algorithms (and codes implementing them) for the computer-aided analysis of system models, going much beyond direct simulation. Yet these algorithms
are applicable to macroscopic equations: ODEs, Differential Algebraic Equations (DAEs), PDEs/PDAEs and their discretizations. Smoothness and Taylor series expansions (derivatives with respect to time, Fréchet derivatives, partial derivatives with respect to parameters) are vital in formulating and implementing most of these algorithms. When the model comes in the form of microscopic/stochastic simulators at a much finer scale – without a closed formula for the equation, i.e., without a “right-hand side” for the time-derivative – this arsenal of continuum numerical tools appears useless. Fortunately, the same coarse timestepping idea we used to accelerate direct simulation of an effectively simple multi-scale system can be used to enable its coarse-grained computer-assisted analysis even without explicit macroscopic equations.
To illustrate this, we return to our simple scalar example in Fig. 1. We are given a black box timestepper for this equation: a code which, initialized with $c_n$ ($t = n\tau$), integrates the equation for time τ and returns the result $c_{n+1} = c(t = [n+1]\tau)$. We use the notation $c_{n+1} = \Phi_\tau(c_n)$. If the task at hand is to find a steady state for the equation, this can be accomplished by calling the timestepper repeatedly (integrate forward in time) until the result does not change any more. Indeed a steady state of the equation is a fixed point for the timestepper, $c^* = \Phi_\tau(c^*)$. Yet this iteration will only find stable steady states, and the rate of convergence to them depends on the physical dynamics of the problem, becoming increasingly slow close to transition boundaries. The method of choice for finding a steady state (given a good initial guess) would be a Newton–Raphson iteration, which would converge quadratically to non-singular steady states:
\[ \left.\frac{df}{dc}\right|_{c^{(n)}} \left( c^{(n+1)} - c^{(n)} \right) = -f\left(c^{(n)}\right). \]
Can we trick an integration code (the timestepper) into becoming a fixed point solver? In other words, if we do not have the equation for f(c), but can computationally evaluate the timestepper, can we still do Newton for the steady state? The answer is illustrated in Fig. 1c: we use the computationally evaluated timestepper to solve the fixed point problem $G(c) \equiv c - \Phi_\tau(c) = 0$. Calling the timestepper for an initial condition $c^{(n)}$ gives us $\Phi_\tau(c^{(n)})$ and the residual, $G(c^{(n)})$. Lacking a formula to compute the linearization, we call the timestepper with a nearby initial condition, $c^{(n)} + \varepsilon$. This gives us $\Phi_\tau(c^{(n)} + \varepsilon)$, and the difference $\Phi_\tau(c^{(n)} + \varepsilon) - \Phi_\tau(c^{(n)})$ (using Taylor series) is approximately $\frac{d\Phi_\tau}{dc} \cdot \varepsilon$. This estimate of the action of the Jacobian can then be used in a secant method to compute the next iterate $c^{(n+1)}$ of the steady-state search. Notice again the crucial issue of being able to initialize a simulator at will; after $c^{(n+1)}$ is estimated from
Equation-free modeling for complex systems
the nearby integrations and the secant procedure, we can immediately call the timestepper with initial condition $c^{(n+1)}$ and iterate the process. We have not done much more than estimate derivatives through differencing. Yet forward integration can now be used (through a computational superstructure, a “wrapper” that implements what we just described in words) to converge to unstable steady states, and eventually to compute bifurcation diagrams. We have enabled a simulation code to perform a task (fixed point computation) for which it had not been designed [31].

This procedure may initially appear hopeless in higher dimensions (e.g., for the large sets of ODEs arising in PDE discretizations). Fortunately, recent developments in large-scale computational linear algebra (the so-called matrix-free solvers and eigensolvers) address precisely this point. Integrating with two nearby initial conditions (m-vectors, differing by the m-vector $\varepsilon$) and taking the difference of the timestepper results provides an estimate of $D\Phi \cdot \varepsilon$, the product of the $m \times m$ Jacobian matrix of the timestepper (which is not available in closed form) with the known m-vector $\varepsilon$. Matrix-free iterative algorithms (for example Newton–Krylov/GMRES methods based on the timestepper) can then be used to solve for the steady state (e.g., Refs. [32, 33]). Matrix-free eigensolvers (e.g., subspace iteration methods based on the timestepper) can be used to estimate the part of the spectrum of the linearization close to the imaginary axis, which is relevant for stability and bifurcation computations of the unavailable equation [34]. We see once more that the quantities necessary for computer-aided analysis (residuals, action of Jacobians) can be estimated by appropriately designed short calls to the timestepper and subsequent post-processing of the results, even if the equation is not available in closed form.
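The wrapper described in words above can be sketched in a few lines for the scalar case. This is an illustrative sketch under stated assumptions, not the authors' code: the function `timestepper` is a hypothetical stand-in for a black-box simulator (internally it integrates $dc/dt = c - c^3$ with a Runge–Kutta scheme, but the wrapper never looks inside), and the names `coarse_newton`, `eps` and `tol` are ours.

```python
import math

def timestepper(c, tau=0.5, n_sub=50):
    """Black-box stand-in for Phi_tau: integrates dc/dt = c - c**3 for time tau (RK4)."""
    h = tau / n_sub
    rhs = lambda u: u - u**3
    for _ in range(n_sub):
        k1 = rhs(c)
        k2 = rhs(c + 0.5 * h * k1)
        k3 = rhs(c + 0.5 * h * k2)
        k4 = rhs(c + h * k3)
        c += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return c

def coarse_newton(phi, c, eps=1e-6, tol=1e-10, max_iter=30):
    """Solve G(c) = c - phi(c) = 0 using only calls to the timestepper phi."""
    for _ in range(max_iter):
        G = c - phi(c)                          # residual from one call
        dphi = (phi(c + eps) - phi(c)) / eps    # derivative from a nearby call
        step = -G / (1.0 - dphi)                # Newton/secant update for G
        c += step
        if abs(step) < tol:
            break
    # The derivative of phi at the fixed point is the multiplier of the
    # linearization over time tau; a value above 1 signals instability.
    multiplier = (phi(c + eps) - phi(c)) / eps
    return c, multiplier

c_star, multiplier = coarse_newton(timestepper, 0.3)

# Plain forward integration from the same initial guess, for comparison.
c_forward = 0.3
for _ in range(40):
    c_forward = timestepper(c_forward)

print(c_star, multiplier, c_forward)
```

Under these assumptions the wrapper converges to the unstable steady state at $c = 0$ and reports a multiplier close to $e^{\tau}$, whereas repeated calls to the same timestepper drift away to the stable state at $c = 1$. In higher dimensions the scalar division would be replaced by a matrix-free linear solve, as the text describes.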
Remarkably, and completely independently of complex/multi-scale computations, these software wrappers have the potential to enable legacy integration codes (large-scale, industrial dynamic simulators) to perform tasks such as stability/bifurcation and operability analysis, controller design and optimization. Our inspiration comes from precisely such a wrapper: the Recursive Projection Method of Ref. [35], which turns a class of large-scale direct simulators (even slightly unstable ones) into convergent fixed point solvers. Clearly, the same type of computational superstructure can turn coarse timesteppers (lifting from macroscopic to consistent microscopic initial conditions, evolving with the fine-scale code, and restricting back to macroscopic variables) into coarse fixed point algorithms, and, with appropriate augmentation, coarse bifurcation algorithms (Fig. 2c). Coarse residuals and the action of coarse slow Jacobians and Hessians can be estimated in a matrix-free context by systematic, judicious calls to the coarse timestepper. Coarse equation solvers and coarse eigensolvers can thus be implemented; many aspects of the computer-assisted analysis of the unavailable macroscopic equation can be
I.G. Kevrekidis et al.
performed without the equation. Motivated by the connection to matrix-free numerical analysis methods, we call the timestepper and coarse-timestepper based computer-assisted analysis equation-free computation [10].

The scope of the approach is very general. Coarse projective integration and coarse bifurcation computations have been used to accelerate lattice kinetic Monte Carlo (kMC) simulations of catalytic surface reactions [36–39]; biased random walk kMC models of E. coli chemotaxis [40]; kinetic theory-based, interacting particle simulations of hydrodynamic equations [28]; Brownian dynamics simulations of nematic liquid crystals [41]; lattice Boltzmann-BGK (LB-BGK) simulations of multi-phase, bubbly flows [31]; molecular dynamics simulations of the folding of a peptide fragment [42]; individual-based kMC models of evolving diseases such as influenza [43]; kMC models of dislocation movement in a lattice containing diffusing impurities [44]; molecular dynamics simulations of granular flows; and more. For some spatially distributed problems, this involved gap-tooth and patch dynamics versions of the coarse timestepper. As more experience is accumulated and the methods develop further, more problems may become accessible to equation-free computer-aided analysis.

Beyond simulation and stability/continuation computations, equation-free computation has been used to perform tasks such as linear stabilizing controller design for kMC, LB-BGK as well as Brownian dynamics simulators [41, 45, 46]; case studies of coarse optimization [47] as well as coarse feedback linearization for kMC simulators [48, 49] have been performed; additional tasks like coarse reverse integration backward in time [50] and coarse dynamic renormalization [10, 51] for the equation-free computation of self-similar solutions are also possible.
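The projective integration mentioned above can be illustrated with a hedged sketch in the spirit of the projective methods of Ref. [12]: a few short steps of an inner integrator damp the fast transients, the slow time derivative is then estimated from the last two states, and the solution is extrapolated over a much longer interval. The stiff two-variable system, the parameter `EPS`, and the step counts `k` and `M` below are illustrative assumptions; in a genuinely equation-free computation the inner steps would be bursts of the microscopic simulator acting on lifted coarse variables.

```python
import math

EPS = 0.01  # assumed time-scale separation parameter (hypothetical)

def inner_step(state, h):
    """One explicit Euler step of x' = -x (slow), y' = -(y - x)/EPS (fast)."""
    x, y = state
    return (x + h * (-x), y + h * (-(y - x) / EPS))

def projective_step(state, h, k, M):
    """Take k short inner steps, then extrapolate over M*h from the last two states."""
    for _ in range(k - 1):
        state = inner_step(state, h)
    prev = state
    state = inner_step(state, h)
    slope = tuple((s - p) / h for s, p in zip(state, prev))   # estimated d(state)/dt
    return tuple(s + M * h * d for s, d in zip(state, slope))

state = (1.0, 0.0)              # deliberately started off the slow manifold
h, k, M, n_outer = 0.005, 5, 10, 20
for _ in range(n_outer):
    state = projective_step(state, h, k, M)

x_final, y_final = state
inner_steps_used = n_outer * k          # 100 inner steps
t_final = n_outer * (k + M) * h         # simulated time, about 1.5
print(x_final, math.exp(-t_final), inner_steps_used)
```

With these choices the slow variable tracks the exact decay $e^{-t}$ to within about one percent of its initial value while using one third of the inner steps that direct integration at step `h` would need; the fast variable relaxes onto the slow manifold ($y \approx x$) despite the large extrapolation steps.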
Wrappers for legacy codes have been designed (RPM has been wrapped around gPROMS to accelerate rapid pressure swing adsorption computations, and coarse integration of an unavailable envelope equation has also been used for this purpose [52]). Other problems can also be approached through the same basic scheme, including problems which we believe could be modeled by effective medium equations (such as flow in porous media, or reaction-diffusion over microcomposite catalysts). Here again, short bursts of detailed medium simulation can be used to estimate the timestepper of the effective medium equation without deriving this equation explicitly [53]. Similarly, the solution of effective continuum equations for spatially discrete problems (such as lattices of coupled neurons) can be attempted in an equation-free framework [54].

Most of the discussion so far was formulated in a deterministic context; yet many complex systems of interest are well described by stochastic models. Every outcome of computations with such models is in principle different; noise destroys determinism at the level of a single experiment. Determinism is often restored, however, at a different level of observation: when one considers the distribution of the outcomes of several realizations. One can be deterministic (i.e., write predictive equations) about the expectation of a sufficiently
large ensemble of experiments; possibly about the expectation and standard deviation of such an ensemble. Once again, higher order moments of a probability distribution (whose evolution is governed by a Fokker–Planck-type equation) get quickly slaved to lower order moments, and one can be practically predictive if one looks at an appropriately coarse-grained level. While, for example, we cannot know the fate of an individual after a year, we can be practically predictive about the evolution of a few basic statistics of the population of a country. For the right observables, the coarse timestepper is then constructed by simulating a large enough ensemble of realizations of the stochastic problem.

An important category of problems can be approximated by dynamics on low-dimensional free-energy surfaces, parametrized by a few well-chosen coarse variables (reaction coordinates). In the statistical mechanics of molecular systems the ability to be “practically predictive” with just a few meaningful reaction coordinates is intimately connected with separation of time scales. Formally, such coordinates could be defined with the help of the leading eigenfunctions of a Frobenius–Perron operator for the detailed problem [55]; yet this is practically unachievable. Instead, physical intuition, experience and data analysis are often used to suggest collective coordinates which hopefully provide dynamically relevant measures of the progress of a reaction. Projecting the full dynamics on such well-chosen reaction coordinates will then retain the macroscopically relevant features of the dynamics with only simplified representations of noise and memory [56, 57].
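The ensemble-based coarse timestepper described above (lift, evolve many realizations, restrict) can be sketched as follows. Everything specific here is an assumption made for illustration: the microscopic model is a hypothetical scalar stochastic differential equation $dx = -x\,dt + \sigma\,dW$, the coarse observable is the ensemble mean, and the lifting simply places every realization at the prescribed mean.

```python
import math
import random

def micro_step(x, dt, sigma, rng):
    """One Euler-Maruyama step of the hypothetical SDE dx = -x dt + sigma dW."""
    return x - x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)

def lift(mean, n_copies):
    """Lifting: build an ensemble consistent with the coarse variable.
    Simplest possible choice (an assumption): every copy starts at the mean."""
    return [mean] * n_copies

def restrict(ensemble):
    """Restriction: map the ensemble back to the coarse variable (its mean)."""
    return sum(ensemble) / len(ensemble)

def coarse_timestepper(mean, tau=0.5, dt=0.01, sigma=0.5, n_copies=4000, seed=0):
    """Lift, evolve every realization for time tau, restrict."""
    rng = random.Random(seed)
    ensemble = lift(mean, n_copies)
    for _ in range(int(round(tau / dt))):
        ensemble = [micro_step(x, dt, sigma, rng) for x in ensemble]
    return restrict(ensemble)

m1 = coarse_timestepper(1.0)
print(m1, math.exp(-0.5))   # the mean of this SDE obeys dm/dt = -m
```

Individual realizations differ from run to run, but the restricted mean follows the deterministic law $m(\tau) = m(0)\,e^{-\tau}$ up to sampling error of order $1/\sqrt{N}$, which is the sense in which determinism is restored at the coarse level. The resulting function has the same call signature as a deterministic timestepper, so it can be passed to fixed point or projective integration wrappers.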
Short bursts of appropriately initialized molecular dynamics can again be used to estimate on demand the drift and the noise terms of effective Langevin or Fokker–Planck equations in these variables [58]; to find minima and saddles; to solve optimal path problems, and to construct approximate propagators for the density on this surface, without deriving or writing this effective equation in closed form. In our discussion we have endeavored to outline the new possibilities opened by such an equation-free framework. These possibilities are accompanied by many theoretical and practical difficulties. Some of these issues arise in algorithms of continuum numerical analysis themselves (stepsize selection in numerical integration, mesh-size selection in spatial discretizations, error monitoring and control in matrix-free iterative methods); some are particular to complex/multi-scale timesteppers (consistent initialization through lifting; estimation and filtering involved in restriction operators; imposition of macroscopically inspired boundary conditions); some arise from the coupling (choice of good observation variables). We will mention one special feature here. Adaptive step size selection is often performed by doing the computation with different step sizes and estimating the error a posteriori; similarly, adaptive mesh selection is based on computations performed at different mesh-sizes to estimate the error. To adaptively determine the level of coarse-graining at which we can be practically predictive, the coarse timestepper can be computed by
conditioning the microscopic simulation at different observation levels, i.e., with different numbers of coarse variables (e.g., surface coverages only, vs. surface coverages and pair probabilities for lattice simulations of surface reactions). Matrix-free, timestepper-based eigensolvers can then be used to estimate the slow eigenvalues and corresponding eigenvectors for the timestepper, which should be tangent to the slow manifold (embodying the missing closure). Gaps in this spectrum, and the components of the corresponding eigenvectors can be used to probe the number and nature of coarse variables that should be used to observe the system dynamics (i.e., to locally parametrize the manifold). Handshaking between microscopic solvers and macroscopic continuum numerical analysis consists mainly of subjects traditionally studied in systems theory. System identification based on the results of computational experimentation with the fine-scale model is the most important component. Separation of time-scales underpins the low-dimensionality of the macroscopic dynamics. The dynamics of the hierarchy of distribution moments constitute a singularly perturbed system, and brief simulation is used to “cure off-manifold initial conditions” by bringing them back onto the manifold, healing the errors we commit when lifting. The dynamics themselves establish the missing closure; we can think of this as a “closure on demand” approach. Adaptive tabulation [59] can be used to economize in the design of experiments, and the importance of data assimilation/statistical analysis tools to identify non-linear correlations has already been stressed. The use of observer theory (e.g., [60, 61]) and realization balancing (e.g., Refs. [62, 63]) arises naturally: the microscopic system dynamics are observed on the macroscopic variables, but are realized through the microscopic simulator. 
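The use of timestepper-based eigensolvers to detect spectral gaps can be sketched on a toy problem. The following is an illustration under stated assumptions rather than a production eigensolver: `timestepper` is a hypothetical two-variable simulator with one slow and one fast direction, its steady state at the origin is assumed known, and the two-vector orthogonal iteration below relies on the linearization being (nearly) symmetric, which holds for this example only.

```python
import math

def timestepper(u, tau=0.1, n_sub=200):
    """Stand-in black box: Euler-integrates x' = -x + y**2, y' = -50*y + x**2 for time tau."""
    x, y = u
    h = tau / n_sub
    for _ in range(n_sub):
        x, y = x + h * (-x + y * y), y + h * (-50.0 * y + x * x)
    return [x, y]

def jac_action(phi, u_star, v, eps=1e-6):
    """Matrix-free estimate of (D phi) v from two nearby timestepper calls."""
    base = phi(u_star)
    pert = phi([u + eps * w for u, w in zip(u_star, v)])
    return [(p - b) / eps for p, b in zip(pert, base)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def leading_multipliers(phi, u_star, n_iter=50):
    """Two-vector orthogonal (subspace) iteration using only Jacobian actions."""
    v1, v2 = normalize([1.0, 1.0]), normalize([1.0, -1.0])
    for _ in range(n_iter):
        w1 = jac_action(phi, u_star, v1)
        w2 = jac_action(phi, u_star, v2)
        v1 = normalize(w1)
        proj = dot(w2, v1)
        v2 = normalize([b - proj * a for a, b in zip(v1, w2)])  # Gram-Schmidt
    mu1 = dot(v1, jac_action(phi, u_star, v1))   # Rayleigh quotients
    mu2 = dot(v2, jac_action(phi, u_star, v2))
    return mu1, mu2

mu1, mu2 = leading_multipliers(timestepper, [0.0, 0.0])
print(mu1, mu2, math.log(mu1) / 0.1, math.log(mu2) / 0.1)
```

Here the two multipliers differ by more than two orders of magnitude; converted to rates via $\lambda = \ln(\mu)/\tau$ they are roughly $-1$ and $-50$. A gap of this kind is the signal, in the sense of the paragraph above, that a single coarse variable suffices to parametrize the slow dynamics locally.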
Techniques for filtering [64] and variance reduction [65] will play an important role in determining how useful equation-free computations will ultimately be [66]. Timestepper-based methods are, in effect, alternative ensembles for performing microscopic (molecular dynamics, kMC, Brownian dynamics) simulations. These ensembles, however, are motivated by macroscopic numerical analysis, rather than statistical mechanical considerations. We are currently exploring the applicability of these “numerical analysis motivated” ensembles in accelerating equilibrium computations (grand canonical MC computations of micelle formation, [67, 68]). It is particularly interesting to consider ensembles motivated by the augmented systems arising in multi-parameter continuation. In such ensembles, like the pathostat [48, 49] based on pseudoarclength continuation, both the variables and the operating parameters themselves evolve, so that the system traces both stable and unstable parts of bifurcation diagrams. An increasing number of experimental systems appears in the literature for which finely spatially distributed actuation authority – coupled with sensing – is available; photosensitive chemical reactions addressed through a
digital projector [69], laser-addressable catalytic reactions [70] and interfacial flows [71], colloidal particles manipulated through optical tweezers [72] or electric fields [73] are some such examples. When experiments can be initialized at will, the timestepper methods we discussed here can be applied to laboratory, rather than computational, experiments. Continuum numerical methods will then become experimental design protocols, tuned to the task we wish to perform. This way, mathematics might be performed directly on the physical system, and not on the (approximate) equations modeling it.

Many of the mathematical and computational tools combined in this exposition (e.g., system identification, or inertial manifold theory) are well established; we borrowed them, in our synthesis with tools developed in our group, as necessary. Innovative multi-scale/multi-level techniques proposed over the last decade include the quasi-continuum methods of Phillips and coworkers [74, 75]; the optimal prediction methods of Chorin and coworkers [76, 77]; the coupling of continuum fields with stochastic evolution in the work of Oettinger and coworkers [78, 79]; the kinetic-theory-based solvers proposed by Xu and Prendergast [80, 81]; the modification of equation-free computation in the context of conservation laws by E and Engquist [82]; and the lattice coarse graining by Katsoulakis et al. [83] (see the review by Givon et al. [84] and the discussion in Ref. [10]). In the context of molecular dynamics simulations, the idea of using multiple, and possibly coupled, replica runs to search conformation space (for systems with unmodified or artificially modified energy surfaces) forms the basis of approaches such as conformational flooding [85], parallel replica MD [86], SWARM-MD [87], coarse extended Lagrangian dynamics [88, 89], and simple averaging over multiple trajectories [90, 91].
It is fitting to close this perspective by citing from a 1980 article entitled “Computer-aided analysis of nonlinear problems in transport phenomena” by Brown, Scriven and Silliman [30]: “The nonlinear partial differential equations of mass, momentum, energy, species and charge transport, especially in two and three dimensions, can be solved in terms of functions of limited differentiability – no more than the physics warrants – rather than the analytical functions of classical analysis. ... Organizing the polynomials in the so-called finite element basis functions facilitates generating and analyzing solutions by large, fast computers employing modern matrix techniques”. These sentences celebrate the transition from analytical solutions (of explicitly available equations) to computer-assisted solutions. The solutions are not analytically available for our class of complex/multiscale problems either; but now the equations themselves are not available, and they are solved in a computer-assisted fashion using appropriate computational experiments at a different level of system description. The similarity of the list of important elements is remarkable: the right basis functions, dictated by the physics (discretizations of the right coarse observation variables); large, fast computers (now
massively parallel clusters, each CPU computing one realization of trajectories for the same “coarse” initial condition); and modern matrix techniques (now matrix-free iterative linear algebra). The approach bridges traditional numerical analysis, computational experimentation with the microscopic simulator, and systems theory; its most vital element is the simple fact that a code can be initialized at will. If one has good macroscopic equations, one should use them. But when these equations are not available in closed form (and such cases arise with increasing frequency in contemporary modeling) the equation-free computational enabling technology we outlined here may hold the key to the engineering of effectively simple systems.
Acknowledgments

This work was partially supported over the years by AFOSR, through an NSF/ITR grant, DARPA and Princeton University. A somewhat shortened version of this article has appeared as a Perspective in the July 2004 issue of the AIChE Journal.
References

[1] S. Chapman and T.G. Cowling, The Mathematical Theory of Non-Uniform Gases, 2nd edn., Cambridge University Press, Cambridge, 1952, 1939.
[2] J.M. Ottino, “Complex systems,” AIChE Journal, 49(2), 292, 2003.
[3] M.E. Csete and J. Doyle, “Reverse engineering of biological complexity,” Science, 295, 1664, 2002.
[4] D. Maroudas, “Multiscale modeling of hard materials: challenges and opportunities for chemical engineering,” AIChE J., 46, 878, 2002.
[5] G. Lu and E. Kaxiras, An overview of multiscale simulations of materials: cond-mat/0401073 preprint at arXiv.org, 2004.
[6] G.E.P. Box, W. Hunter, and J.S. Hunter, Statistics for Experimenters: An Introduction to Design, Data Analysis and Model Building, Wiley, New York, 1978.
[7] G. Cybenko, “Just in time learning and estimation,” In: Identification, Adaptation and Learning: the Science of Learning Models from Data, NATO ASI Series, F153, Springer, Berlin, 423, 1996.
[8] L. Ljung, System Identification: Theory for the User, 2nd edn., Prentice Hall, New York, 1999.
[9] K. Theodoropoulos, Y.-H. Qian, and I.G. Kevrekidis, “Coarse stability and bifurcation analysis using timesteppers: a reaction diffusion example,” Proc. Natl Acad. Sci., 97(18), 9840, 2000.
[10] I.G. Kevrekidis, C.W. Gear, J.M. Hyman, P.G. Kevrekidis, O. Runborg, and K. Theodoropoulos, “Equation-free coarse-grained multiscale computation: enabling microscopic simulators to perform system-level tasks,” Commun. Math. Sci., 1(4), 715–762, original version can be obtained as physics/0209043 at arXiv.org, 2003.
[11] R. Car and M. Parrinello, “Unified approach for molecular dynamics and density functional theory,” Phys. Rev. Lett., 55, 2471, 1985.
[12] C.W. Gear and I.G. Kevrekidis, “Projective methods for stiff differential equations: problems with gaps in their eigenvalue spectrum,” SIAM J. Sci. Comp., 24(4), 1091, original NEC Technical Report NECI-TR 2001-029, Apr. 2001, 2003.
[13] C.W. Gear, “Projective integration methods for distributions,” NEC Technical Report NECI TR 2001-130, Nov. 2001, 2001.
[14] C.W. Gear, I.G. Kevrekidis, and K. Theodoropoulos, “Coarse integration/bifurcation analysis via microscopic simulators: micro-Galerkin methods,” Comp. Chem. Eng., 26, 941, original NEC Technical Report NECI TR 2001-106, Oct. 2001, 2002.
[15] P. Ehrenfest and T. Ehrenfest, In: Enzyklopaedie der Mathematischen Wissenschaften (1911), reprinted in P. Ehrenfest, Collected Scientific Papers, North Holland, Amsterdam, 1959.
[16] J.P. Ryckaert, G. Ciccotti, and H. Berendsen, “Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes,” J. Comp. Phys., 23, 327, 1977.
[17] C.W. Gear, T.J. Kaper, I.G. Kevrekidis, and A. Zagaris, “Projecting on a slow manifold: singularly perturbed systems and legacy codes,” submitted to SIADS, can be found as physics/0405074 at arXiv.org, 2004.
[18] M. Bodenstein, “Eine Theorie der photochemischen Reaktionsgeschwindigkeiten,” Z. Phys. Chem., 85, 329, 1913.
[19] J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields, Springer Verlag (Appl. Math. Sci. vol. 42), New York, 1983.
[20] P. Constantin, C. Foias, B. Nicolaenko, and R. Temam, Integral Manifolds and Inertial Manifolds for Dissipative Partial Differential Equations, Springer Verlag, New York, 1988.
[21] R. Temam, Infinite Dimensional Dynamical Systems in Mechanics and Physics, Springer Verlag, New York, 1998.
[22] P. Holmes, J.L. Lumley, and G. Berkooz, Turbulence, Coherent Structures, Dynamical Systems and Symmetry, Cambridge University Press, Cambridge, 1998.
[23] I.T. Jolliffe, Principal Component Analysis, Springer Verlag, New York, 1986.
[24] A.J. Smola, O.L. Mangasarian, and B. Schoelkopf, “Sparse kernel feature analysis,” Data Mining Institute Technical Report 99–04, University of Wisconsin, Madison, 1999.
[25] R.R. Coifman, S. Lafon, A. Lee, M. Maggioni, F. Warner, and S. Zucker, “Geometric diffusions as a tool for harmonic analysis and structure definition of data,” Proc. Natl. Acad. Sci. USA, submitted, 2004.
[26] J. Li, D. Liao, and S. Yip, “Imposing field boundary conditions in MD simulations of fluids: optimal particle controller and buffer zone feedback,” Mat. Res. Soc. Symp. Proc., 538, 473, 1998.
[27] I.G. Kevrekidis, “Coarse bifurcation studies of alternative microscopic/hybrid simulators,” Plenary Lecture, CAST Division, AIChE annual meeting, Los Angeles, can be found at http://arnold.princeton.edu/~yannis, 2000.
[28] C.W. Gear, J. Li, and I.G. Kevrekidis, “The gaptooth method in particle simulations,” Phys. Lett. A, 316, 190–195, 2003.
[29] G. Samaey, I.G. Kevrekidis, and D. Roose, “The gap-tooth scheme for homogenization problems,” SIAM MMS, in press, 2005.
[30] R.A. Brown, L.E. Scriven, and W.J. Silliman, “Computer-aided analysis of nonlinear problems in transport phenomena,” In: P.J. Holmes (ed.), New Approaches to Nonlinear Problems in Dynamics, SIAM Publications, Philadelphia, p. 298, 1980.
[31] K. Theodoropoulos, Sankaranarayanan, S. Sundaresan, and I.G. Kevrekidis, “Coarse bifurcation studies of bubble flow lattice Boltzmann simulations,” Chem. Eng. Sci., 59, 2357, can be obtained as nlin.PS/0111040 at arXiv.org, 2004.
[32] C.T. Kelley, Iterative Methods for Solving Linear and Nonlinear Equations, SIAM Publications, Philadelphia, 1995.
[33] Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd edn., SIAM Publications, Philadelphia, 2003.
[34] R.B. Lehoucq, D.C. Sorensen, and C. Yang, ARPACK Users’ Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods, SIAM Publications, Philadelphia, 1998.
[35] G.M. Shroff and H.B. Keller, “Stabilization of unstable procedures: a recursive projection method,” SIAM J. Numer. Anal., 30, 1099, 1993.
[36] A. Makeev, D. Maroudas, and I.G. Kevrekidis, “Coarse stability and bifurcation analysis using stochastic simulators: kinetic Monte Carlo examples,” J. Chem. Phys., 116, 10083, 2002.
[37] A.G. Makeev, D. Maroudas, A.Z. Panagiotopoulos, and I.G. Kevrekidis, “Coarse bifurcation analysis of kinetic Monte Carlo simulations: a lattice gas model with lateral interactions,” J. Chem. Phys., 117(18), 8229, 2002.
[38] A.G. Makeev and I.G. Kevrekidis, “Equation-free multiscale computations for a lattice-gas model: coarse-grained bifurcation analysis of the NO+CO reaction on Pt(100),” Chem. Eng. Sci., 59, 1733, 2004.
[39] R. Rico-Martinez, C.W. Gear, and I.G. Kevrekidis, “Coarse projective KMC integration: forward/reverse initial and boundary value problems,” J. Comp. Phys., 196, 474, 2004.
[40] S. Setayeshgar, C.W. Gear, H.G. Othmer, and I.G. Kevrekidis, “Application of coarse integration to bacterial chemotaxis,” SIAM MMS, accepted, can be found as physics/0308040 at arXiv.org, 2004.
[41] C. Siettos, M.D. Graham, and I.G. Kevrekidis, “Coarse Brownian dynamics for nematic liquid crystals: bifurcation, projective integration and control via stochastic simulation,” J. Chem. Phys., 118(22), 10149, can be obtained as cond-mat/0211455 at arXiv.org, 2003.
[42] G. Hummer and I.G. Kevrekidis, “Coarse molecular dynamics of a peptide fragment: free energy, kinetics and long time dynamics computations,” J. Chem. Phys., 118(23), 10762, 2003.
[43] J. Cisternas, C.W. Gear, S. Levin, and I.G. Kevrekidis, “Equation-free modeling of evolving diseases: coarse-grained computations with individual-based models,” Proc. R. Soc. London, 460, 27621, can be found as nlin.AO/0310011 at arXiv.org, 2004.
[44] M. Haataja, D. Srolovitz, and I.G. Kevrekidis, “Apparent hysteresis in a driven system with self-organized drag,” Phys. Rev. Lett., 92(16), 160603, also cond-mat/0310460 at arXiv.org, 2004.
[45] C.I. Siettos, A. Armaou, A.G. Makeev, and I.G. Kevrekidis, “Microscopic/stochastic timesteppers and coarse control: a kinetic Monte Carlo example,” AIChE J., 49(7), 1922, nlin.CG/0207017 at arXiv.org, 2003.
[46] A. Armaou, C.I. Siettos, and I.G. Kevrekidis, “Time-steppers and coarse control of microscopic distributed processes,” Int. J. Robust Nonlinear Control, 14, 89, 2004.
[47] A. Armaou and I.G. Kevrekidis, “Optimal switching policies using coarse timesteppers,” Proceedings of the 2003 CDC Conference, Hawaii, can be obtained as nlin.CG/0309024 at arXiv.org, 2003.
[48] C.I. Siettos, N. Kazantzis, and I.G. Kevrekidis, “Coarse feedback linearization using timesteppers,” submitted to Int. J. Bifurcations and Chaos, 2004.
[49] C.I. Siettos, D. Maroudas, and I.G. Kevrekidis, “Coarse bifurcation diagrams via microscopic simulators: a state-feedback control-based approach,” Int. J. Bif. Chaos, 14(1), 207, 2004.
[50] C.W. Gear and I.G. Kevrekidis, “Computing in the past with forward integration,” Phys. Lett. A, 321, 335, 2004.
[51] L. Chen, P.G. Debenedetti, C.W. Gear, and I.G. Kevrekidis, “From molecular dynamics to coarse self-similar solutions: a simple example using equation-free computation,” J. Non-Newtonian Fluid Mech., 120, 215, 2004.
[52] C.I. Siettos, C.C. Pantelides, and I.G. Kevrekidis, “Enabling dynamic process simulators to perform alternative tasks: a time-stepper based toolkit for computer-aided analysis,” Ind. Eng. Chem. Res., 42(26), 6795, 2003.
[53] O. Runborg, I.G. Kevrekidis, and K. Theodoropoulos, “Effective stability and bifurcation analysis: a time stepper based approach,” Nonlinearity, 15, 491, 2002.
[54] J. Moeller, O. Runborg, P.G. Kevrekidis, K. Lust, and I.G. Kevrekidis, “Effective equations for discrete systems: a time stepper based approach,” in press, Int. J. Bifurcations and Chaos, 2005.
[55] C. Schuette, A. Fischer, W. Huisinga, and P. Deuflhard, “A direct approach to conformational dynamics based on hybrid Monte Carlo,” J. Comp. Phys., 151, 146, 1999.
[56] R. Zwanzig, Nonequilibrium Statistical Mechanics, Oxford University Press, New York, 2001.
[57] P. Haenggi, P. Talkner, and M. Borkovec, “Reaction-rate theory: 50 years after Kramers,” Rev. Mod. Phys., 62(2), 251, 1990.
[58] R. Kupferman and A. Stuart, “Fitting SDE models to nonlinear Kac-Zwanzig heat bath models,” Phys. D, in press, 2005.
[59] S. Pope, “Computationally efficient implementation of combustion chemistry using in situ adaptive tabulation,” Comb. Theory Model., 1, 41, also Beam Technologies Inc, ISAT-CK Users’ Guide (Release 1.0), 1998. Beam Technologies Inc., Ithaca, NY, 1997.
[60] D.G. Luenberger, “Observing the state of a linear system,” IEEE Trans. Military Electronics, 8, 74, 1964.
[61] A.J. Krener, Nonlinear observers in control systems, robotics and automation. In: H. Unbehauen (ed.), Encyclopedia of Life Support Systems (EOLSS), Eolss Publishers, Oxford, 2003.
[62] B.C. Moore, “Principal component analysis in linear systems: controllability, observability and model reduction,” IEEE Trans. Automatic Control, 26(1), 17, 1981.
[63] S. Lall, J.E. Marsden, and S. Glavaski, “A subspace approach to balanced truncation for model reduction of nonlinear control systems,” Int. J. Robust Nonlinear Control, 12, 519, 2002.
[64] R.E. Kalman and R.S. Bucy, “New results in linear filtering and prediction theory,” Trans. ASME, Part D, J. Basic Eng., 83, 95, 1961.
[65] M. Melchior and H.C. Oettinger, “Variance reduced simulations of stochastic differential equations,” J. Chem. Phys., 103(21), 9506, 1995.
[66] J. Li, P.G. Kevrekidis, C.W. Gear, and I.G. Kevrekidis, “Deciding the nature of the coarse equation through microscopic simulation,” SIAM MMS, 1(3), 391, 2003.
[67] D. Kopelevich, A.Z. Panagiotopoulos, and I.G. Kevrekidis, “Coarse grained computations for a micellar system,” in press, 2005.
[68] D. Kopelevich, A.Z. Panagiotopoulos, and I.G. Kevrekidis, “Coarse kinetic approach to rare events: application to micelle formation,” in press, J. Chem. Phys., 2005.
[69] T. Sakurai, E. Mihaliuk, F. Chirila, and K. Showalter, “Design and control of wave propagation patterns in excitable media,” Science, 296, 2009, 2002.
[70] J. Wolff, A.G. Papathanasiou, I.G. Kevrekidis, H.H. Rotermund, and G. Ertl, “Spatiotemporal addressing of surface activity,” Science, 294, 134, 2001.
[71] D. Semwogerere and M.F. Schatz, “Evolution of hexagonal patterns from controlled initial conditions in a Benard-Marangoni convection experiment,” Phys. Rev. Lett., 88, 054501, 2002.
[72] D.G. Grier, “A revolution in optical manipulation,” Nature, 424, 810, 2003.
[73] W.D. Ristenpart, I.A. Aksay, and D.A. Saville, “Electrically guided assembly of planar superlattices in binary colloidal suspensions,” Phys. Rev. Lett., 90, 12, 2003.
[74] R. Phillips, Crystals, Defects and Microstructures, Cambridge University Press, Cambridge, 2001.
[75] M. Ortiz and R. Phillips, “Nanomechanics of defects in solids,” Adv. Appl. Mech., 36, 1, 1999.
[76] A. Chorin, A. Kast, and R. Kupferman, “Optimal prediction for underresolved dynamics,” Proc. Natl Acad. Sci. USA, 95, 4094, 1998.
[77] A. Chorin, O. Hald, and R. Kupferman, “Optimal prediction and the Mori–Zwanzig representation of irreversible processes,” Proc. Natl Acad. Sci. USA, 97, 2968, 2000.
[78] H.C. Oettinger, Stochastic Processes in Polymeric Fluids, Springer Verlag, New York, 1996.
[79] M. Laso and H.-C. Oettinger, “Calculation of viscoelastic flow using molecular models: the CONNFFESSIT approach,” JNNFM, 47, 1, 1993.
[80] K. Xu and K. Prendergast, “Numerical Navier–Stokes from gas kinetic theory,” J. Comp. Phys., 114, 9, 1994.
[81] K. Xu, “A gas-kinetic BGK scheme for the Navier–Stokes equations and its connection with artificial dissipation and the Godunov method,” J. Comp. Phys., 171, 289, 2001.
[82] W. E and B. Engquist, “The heterogeneous multiscale methods,” Commun. Math. Sci., 1, 87, 2003.
[83] M.A. Katsoulakis, A.J. Majda, and D.G. Vlachos, “Coarse grained stochastic processes for microscopic lattice systems,” Proc. Natl. Acad. Sci. USA, 100(3), 782, 2003.
[84] D. Givon, R. Kupferman, and A. Stuart, “Extracting macroscopic dynamics: model problems and algorithms,” submitted to Nonlinearity, can be obtained as Warwick Preprint 11/2003, http://www.maths.warwick.ac.uk/~stuart/extract.pdf, 2003.
[85] H. Grubmueller, “Predicting slow structural transitions in macromolecular systems: conformational flooding,” Phys. Rev. E, 52(3), 2893, 1995.
[86] A.F. Voter, “Parallel replica method for dynamics of infrequent events,” Phys. Rev. B, 57(22), R13985, 1998.
[87] T. Huber and W.F. van Gunsteren, “SWARM-MD: searching conformational space by cooperative molecular dynamics,” J. Phys. Chem. A, 102(29), 5937, 1998.
[88] M. Iannuzzi, A. Laio, and M. Parrinello, “Efficient exploration of reactive potential energy surfaces using Car-Parrinello molecular dynamics,” Phys. Rev. Lett., 90(23), 238302, 2003.
[89] A. Laio and M. Parrinello, “Escaping free energy minima,” Proc. Natl Acad. Sci. USA, 99(20), 12562, 2002.
[90] I.C. Yeh and G. Hummer, “Peptide loop-closure kinetics from microsecond molecular dynamics simulations in explicit solvent,” JACS, 124(23), 6563, 2002.
[91] C.D. Snow, N. Nguyen, V.S. Pande, and M. Gruebele, “Absolute comparison of simulated and experimental protein folding,” Nature, 420(6911), 102, 2002.
4.12 MATHEMATICAL STRATEGIES FOR THE COARSE-GRAINING OF MICROSCOPIC MODELS
Markos A. Katsoulakis¹ and Dionisios G. Vlachos²
¹ Department of Mathematics and Statistics, University of Massachusetts - Amherst, Amherst, MA 01002, USA
² Department of Chemical Engineering, Center for Catalytic Science and Technology, University of Delaware, Newark, DE 19716, USA
1. Introduction
Spatial inhomogeneity at some small length scale is the rule rather than the exception in most physicochemical processes, ranging from advanced materials synthesis to catalysis, self-assembly, atmospheric science, and molecular biology. These inhomogeneities arise from thermal fluctuations and from complex interactions between the microscopic mechanisms underlying conservation laws. While nanometer-scale inhomogeneity and its corresponding ensemble-average behavior can be studied via molecular simulation, such as molecular dynamics (MD) and Monte Carlo (MC) techniques, mesoscale inhomogeneity is beyond the reach of available molecular models and simulations. Mesoscopic inhomogeneities are encountered in self-assembly, pattern formation on surfaces and in solution, standing and traveling waves, as well as in systems exposed to an external field that varies spatially over micrometer to centimeter length scales. It is this class of problems that requires “large scale” mesoscopic or coarse-grained molecular models, and it is where the developments described here are applicable. It is desirable that such mesoscopic or coarse-grained models meet the following needs:

• They are derived from microscopic models, so as to retain microscopic mechanisms and interactions and enable a truly first-principles multiscale approach;
• They reach large length and time scales, which are currently unattainable by microscopic molecular models;
• They give the correct statistical mechanics limits;
• They describe equilibrium as well as dynamic properties accurately;
• They retain the correct noise of molecular models, to ensure that phenomena such as nucleation, phase transitions, and pattern formation are properly modeled at larger scales;
• They are amenable to mathematical analysis, in order to assess the errors introduced during coarse-graining and to enable optimized coarse-graining strategies to be developed.

Toward these goals, recent work in Refs. [1–3] focused on developing a novel stochastic modeling and computational framework capable of describing efficiently much larger length and time scales than conventional microscopic models and simulations. Here, we did not directly attempt to speed up microscopic simulation algorithms such as MD or MC. Instead, our perspective was to derive a hierarchy of new coarse-grained stochastic models, referred to as Coarse-Grained MC (CGMC), ordered by the magnitude of space/time scales. This new set of models involves a reduced set of observables compared to the original microscopic models, while incorporating microscopic details and noise, as well as the interaction of the unresolved degrees of freedom. The outline of this approach can be summarized in the following heuristic steps:

1. Coarse-grid selection. We select a computational grid (lattice) $\mathcal{L}_c$ (see Fig. 1a), which will be referred to as the “coarse grid”. The microscopic processes describe much smaller scales by explicitly simulating atoms or molecules (“particles”) and are defined at the subgrid level: for example, in Ref. [1] they are defined on a “microscopic” grid $\mathcal{L}$ (see Fig. 1b and Section 2 below).

2. Coarse-grained Monte Carlo methods. Using the microscopic stochastic model as a starting point, we derive, by carrying out a “stochastic closure”, a coarser stochastic model for a reduced number of observables, set on $\mathcal{L}_c$ (see Fig. 1a). These new stochastic processes define, in essence, coarse-grained MC algorithms which, rather than describing the dynamics of a single microscopic particle as conventional MC does, model the evolution of a coarse observable on $\mathcal{L}_c$.

Figure 1. Coarse and fine grids (lattices) with adsorption/desorption and surface diffusion. (a) Coarse lattice $\mathcal{L}_c$ with cells 1, 2, ..., m. (b) Fine lattice $\mathcal{L}$ with sites 1, 2, ..., q, on which adsorption, desorption, and diffusion take place.

The CGMC models span a hierarchy of length scales, from the microscopic to the mesoscopic, and involve Markovian birth–death and generalized exclusion processes. A key feature of our coarse-graining procedure is that the full hierarchy of derived stochastic dynamics satisfies detailed balance relations; as a result, it yields random fluctuation mechanisms that are not only self-consistent but also consistent with the underlying microscopic fluctuations.

To demonstrate the basic ideas, we consider as our microscopic model an Ising-type system. This class of stochastic processes is employed in the modeling of adsorption, desorption, reaction, and diffusion of interacting chemical species on surfaces or through nanopores of materials, in numerous areas such as catalysis and microporous materials, growth of materials, biological molecules, and magnetism. The fundamental principle on which this type of modeling is based is the following: when the binding of species on a surface or within a pore is relatively strong, these physical processes can be described as jump (hopping) processes from one site to another or to the gas phase (Fig. 1b), with a transition probability that can be calculated, to varying degrees of rigor, from even smaller scales using quantum mechanical calculations and/or transition state theory, or from detailed experiments; see for instance Ref. [4].

S. Yip (ed.), Handbook of Materials Modeling, 1477–1490. © 2005 Springer. Printed in the Netherlands.
2. Microscopic Lattice Models
Ising-type systems are set on a periodic lattice $\mathcal{L}$, which is a discretization of the interval $I = [0, 1]$. We divide $I$ into $N$ (micro)cells and consider the microscopic grid $\mathcal{L} = \frac{1}{N}\mathbb{Z} \cap I$ in Fig. 1b. Throughout this discussion we concentrate on one-dimensional models; however, our results extend easily (and perform better) in higher dimensions. At each lattice site $x \in \mathcal{L}$ the order parameter $\sigma(x)$ takes the values 0 and 1, describing vacant and occupied sites, respectively. The energy $H$ of the system, evaluated at the configuration $\sigma = \{\sigma(x) : x \in \mathcal{L}\}$, is given by the Hamiltonian

\[ H(\sigma) = -\frac{1}{2}\sum_{x\in\mathcal{L}}\sum_{y\neq x} J(x-y)\,\sigma(x)\sigma(y) + \sum_{x\in\mathcal{L}} h\,\sigma(x), \tag{1} \]

where $h = h(x)$, $x \in \mathcal{L}$, is the external field and $J$ is the inter-particle potential. Equilibrium states of the Ising model are described by the Gibbs states at a prescribed temperature $T$,

\[ \mu_{\mathcal{L},\beta}(d\sigma) = Z_{\mathcal{L}}^{-1}\exp\bigl(-\beta H(\sigma)\bigr)\,P_N(d\sigma), \]

where $\beta = 1/kT$, $k$ is the Boltzmann constant, and $Z_{\mathcal{L}}$ is the partition function. Furthermore, the product Bernoulli distribution $P_N(\sigma)$ with mean 1/2 is the prior distribution on $\mathcal{L}$.

The inter-particle potentials $J$ account for interactions between occupied sites. We consider symmetric potentials with finite-range interactions, where the integer $L$ denotes the total number of interacting neighboring sites of a given point on $\mathcal{L}$. The interaction potential can be written as

\[ J(x-y) = \frac{1}{L}\,V\!\left(\frac{N}{L}(x-y)\right), \qquad x, y \in \mathcal{L}, \tag{2} \]

where $V(r) = V(-r)$ and $V(r) = 0$ for $|r| \ge 1$, accounting for possible finite-range interactions. Note that for $V$ summable, the choice of the scaling factor $1/L$ in (2) implies the summability of the potential $J$, even when $N, L \to \infty$. An additional condition required in order to obtain error estimates for the coarse-graining procedure is that $V$ is smooth away from 0 and $\int_{\mathbb{R}} |\partial_r V(r)|\,dr < \infty$.

The interaction potentials can be derived either from quantum mechanics calculations (e.g., RKKY interactions in micromagnetics [5]) or experimentally. Sometimes potentials involve only nearest neighbors, since further interactions can be neglected, in which case we obtain the classical Ising model. However, in many applications interactions are significant over a large but finite number of neighbors (see for instance the experimental results in Ref. [6]), or even involve true long-range interactions such as electrostatics or the RKKY-type exchange energies mentioned earlier.

The dynamics of Ising-type models considered in the literature consist of order parameter flips and/or exchanges that correspond to different physical processes. More specifically, a flip at the site $x \in \mathcal{L}$ is a spontaneous change in the order parameter (1 is converted to 0 and vice versa), while a spin exchange between the neighboring sites $x, y \in \mathcal{L}$ is a spontaneous exchange of the order parameters at the two locations. For instance, a spin flip can model the desorption of a particle from a surface described by the lattice to the gas phase above and, conversely, the adsorption of a particle from the gas phase to the surface; see Fig. 1b. Such a model has also been proposed recently in the atmospheric sciences literature for describing certain unresolved features of tropical convection [7, 8].
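As a concrete illustration of Eqs. (1) and (2), the following minimal sketch evaluates the Hamiltonian on a small periodic lattice. It is not taken from Refs. [1–3]: the helper names `make_J` and `hamiltonian`, the constant field `h`, the site indexing (site i sits at x = i/N, so that N(x − y) = i − j), and the box-shaped choice of V are all assumptions made for the example.

```python
def make_J(N, L, V):
    """Pair potential of Eq. (2) on a periodic lattice with N sites.

    Sites are indexed 0..N-1 (site i sits at x = i/N), so N*(x - y) = i - j.
    """
    def J(i, j):
        d = abs(i - j) % N
        d = min(d, N - d)              # shortest periodic distance, in lattice units
        return V(d / L) / L
    return J


def hamiltonian(sigma, J, h):
    """Energy of Eq. (1) for a constant external field h."""
    N = len(sigma)
    pair = sum(J(i, j) * sigma[i] * sigma[j]
               for i in range(N) for j in range(N) if i != j)
    return -0.5 * pair + h * sum(sigma)


# Hypothetical example potential: V(r) = 1 for |r| < 1 and 0 otherwise.
V = lambda r: 1.0 if abs(r) < 1 else 0.0
J = make_J(N=8, L=2, V=V)
sigma = [1, 1, 0, 0, 0, 0, 0, 0]       # two adjacent occupied sites
print(hamiltonian(sigma, J, h=0.0))    # -0.5
```

With L = 2 only nearest neighbors interact, each with strength 1/L = 0.5, so the single occupied pair contributes −0.5 to the energy.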
Spin exchanges, on the other hand, describe the diffusion of particles on a lattice; in this case the presence of interactions typically gives rise to non-Fickian macroscopic behavior [9–11]. These mechanisms are set up as follows: if $\sigma$ is the configuration prior to a flip at $x$, then we denote the configuration after the flip by $\sigma^x$. When the configuration is $\sigma$, a flip occurs at $x$ with a rate $c(x, \sigma)$, i.e., the order parameter at $x$ changes during the time interval $[t, t + \Delta t]$ with probability $c(x, \sigma)\Delta t$. The resulting stochastic process $\{\sigma_t\}_{t \ge 0}$ is defined as a continuous-time jump Markov process with generator defined in terms of the rate $c(x, \sigma)$ [12]. The imposed condition of detailed balance implies that the dynamics leave the Gibbs measure invariant, and is equivalent to

\[ c(x, \sigma)\exp\bigl(-\beta H(\sigma)\bigr) = c(x, \sigma^x)\exp\bigl(-\beta H(\sigma^x)\bigr). \]

The simplest type of dynamics satisfying the detailed balance condition is the Metropolis-type dynamics [13], where the energy barrier for desorption or diffusion depends only on the energy difference between the initial and final states. This type of dynamics is usually employed in MC relaxational algorithms for sampling from the equilibrium canonical Gibbs measure. However, in the context of physicochemical applications involving non-equilibrium evolution of interacting chemical species on surfaces or through nanopores of materials, it is more appropriate to consider dynamics where the activation energy of desorption or diffusion is the energy barrier a species has to overcome in jumping from one lattice site to another or to the gas phase. This type of dynamics is called Arrhenius dynamics and can be derived from MD or transition state theory calculations (see for instance Ref. [4]), to varying degrees of rigor and approximation. The fundamental idea here is that when the binding of species on a surface or within a pore is relatively strong, desorption and diffusion can be modeled as a hopping process from one site to another or to the gas phase, with a transition probability that depends on the potential energy surface. The Arrhenius rate for the adsorption/desorption mechanism is

\[ c(x, \sigma) = d_0\,(1 - \sigma(x)) + d_0\,\sigma(x)\exp\bigl[-\beta U(x, \sigma)\bigr], \tag{3} \]

where

\[ U(x, \sigma) = \sum_{z \neq x,\ z \in \mathcal{L}} J(x - z)\,\sigma(z) - h(x) \]

is the total energy contribution from the interactions of the particle located at the site $x \in \mathcal{L}$ with the other particles, as well as from the external field $h$. Typically, an additional term corresponding to the energy associated with the surface binding of the particle at $x$ can also be included in the external field $h$ in $U$. Finally, $d_0$ is a rate constant that mathematically can be chosen arbitrarily but physically is related to the pre-exponential of the microscopic processes. Similarly we can define an Arrhenius mechanism for diffusion; in both cases the dynamics satisfy detailed balance.
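The detailed balance claim for the Arrhenius rate (3) can be checked numerically. The sketch below is only an illustration under assumed parameter values (N, L, β, h, d0 and the box-shaped V are hypothetical); it compares both sides of the detailed balance relation for every site of a random configuration.

```python
import math
import random

N, L, beta, h, d0 = 12, 3, 1.5, 0.2, 1.0   # hypothetical parameter values

def V(r):
    return 1.0 if abs(r) < 1 else 0.0       # assumed example potential

def J(i, j):
    d = abs(i - j) % N
    d = min(d, N - d)                       # periodic distance in lattice units
    return V(d / L) / L                     # Eq. (2)

def H(sigma):
    """Hamiltonian of Eq. (1) with a constant field h."""
    pair = sum(J(i, j) * sigma[i] * sigma[j]
               for i in range(N) for j in range(N) if i != j)
    return -0.5 * pair + h * sum(sigma)

def U(x, sigma):
    """Interaction energy felt at site x, as defined below Eq. (3)."""
    return sum(J(x, z) * sigma[z] for z in range(N) if z != x) - h

def rate(x, sigma):
    """Arrhenius adsorption/desorption rate of Eq. (3)."""
    return d0 * (1 - sigma[x]) + d0 * sigma[x] * math.exp(-beta * U(x, sigma))

def detailed_balance_gap(x, sigma):
    """Relative mismatch between the two sides of the detailed balance relation."""
    flipped = list(sigma)
    flipped[x] = 1 - flipped[x]
    lhs = rate(x, sigma) * math.exp(-beta * H(sigma))
    rhs = rate(x, flipped) * math.exp(-beta * H(flipped))
    return abs(lhs - rhs) / max(lhs, rhs)

rng = random.Random(0)
sigma = [rng.randint(0, 1) for _ in range(N)]
worst = max(detailed_balance_gap(x, sigma) for x in range(N))
print(f"largest relative detailed-balance violation: {worst:.2e}")
```

The mismatch should be at the level of floating-point rounding, because removing a particle at x changes H by exactly U(x, σ).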
3. Coarse-grained Stochastic Processes and CGMC Algorithms
First we construct the coarse grid $\mathcal{L}_c$ by dividing $I = [0, 1]$ into $m$ equal-size coarse cells (see Fig. 1a); in turn, each coarse cell is subdivided into $q$ (micro)cells. Hence $I$ is divided into $N = mq$ cells and $\mathcal{L} = \frac{1}{mq}\mathbb{Z} \cap I$ is the microscopic lattice in Fig. 1b. Each coarse cell is denoted by $D_k$, $k = 1, \dots, m$, and the coarse lattice corresponding to the coarse cell partition (Fig. 1a) is defined as $\mathcal{L}_c = \frac{1}{m}\mathbb{Z} \cap I$. We consider the integers $k = 1, \dots, m$ as the unscaled lattice points of $\mathcal{L}_c$; the coarse-grained stochastic processes defined below are set on $\mathcal{L}_c$, while the Ising model is set on the microscopic lattice $\mathcal{L}$. Next we define a coarse-grained observable on the coarse lattice $\mathcal{L}_c$. One such intuitive choice, motivated by renormalization theory [14], is the average over each coarse cell $D_k$, written here in unnormalized form as the number of particles in the cell:

\[ F(\sigma_t)(k) := \sum_{y \in D_k} \sigma_t(y), \qquad k = 1, \dots, m. \tag{4} \]

Although $F(\sigma_t)$ is not a Markov process, our goal here is to derive a Markov process $\eta_t$, defined on the coarse lattice $\mathcal{L}_c$, that approximates the true microscopic average $F(\sigma_t)$. Computationally, this new process $\eta$ is advantageous over the underlying microscopic $\sigma$, since it has a substantially smaller state space and can be simulated much more efficiently. We next derive, by a direct calculation from the microscopic stochastic process, the exact coarse-grained rates of adsorption and desorption for the microscopic average $F(\sigma_t)$ in the coarse cell $D_k$; these rates are, respectively,

\[ \bar{c}_a(k) := \sum_{x \in D_k} c(x, \sigma)\,(1 - \sigma(x)), \qquad \bar{c}_d(k) := \sum_{x \in D_k} c(x, \sigma)\,\sigma(x). \tag{5} \]

In the case of Arrhenius diffusion, the exact jump rate from cell $D_k$ to $D_l$ of the microscopic average (4) is given by

\[ \bar{c}_{\mathrm{diff}}(k \to l) := \sum_{x \in D_k,\ y \in D_l} c(x, y, \sigma)\,\sigma(x)\,(1 - \sigma(y)). \tag{6} \]
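The projection (4) and the exact coarse rates (5) are straightforward to compute for a given microscopic configuration. The sketch below uses assumed sizes and parameters (m, q, L, β, h, d0 and a nearest-neighbor box potential are illustrative choices, not values from the chapter) and shows why a closure is needed: two configurations with the same coarse observable can have different exact desorption rates.

```python
import math

m, q = 4, 3                      # hypothetical sizes: m coarse cells of q sites each
N, L = m * q, 2
beta, h, d0 = 1.0, 0.0, 1.0      # illustrative parameter values

def J(i, j):
    d = abs(i - j) % N
    d = min(d, N - d)                         # periodic distance in lattice units
    return (1.0 if 0 < d < L else 0.0) / L    # Eq. (2) with V = 1 on |r| < 1

def rate(x, sigma):
    """Arrhenius adsorption/desorption rate of Eq. (3)."""
    U = sum(J(x, z) * sigma[z] for z in range(N) if z != x) - h
    return d0 * (1 - sigma[x]) + d0 * sigma[x] * math.exp(-beta * U)

def project(sigma):
    """Coarse observable of Eq. (4): number of particles in each cell D_k."""
    return [sum(sigma[k * q:(k + 1) * q]) for k in range(m)]

def exact_coarse_rates(sigma):
    """Exact coarse adsorption and desorption rates of Eq. (5)."""
    ca, cd = [], []
    for k in range(m):
        cell = range(k * q, (k + 1) * q)
        ca.append(sum(rate(x, sigma) * (1 - sigma[x]) for x in cell))
        cd.append(sum(rate(x, sigma) * sigma[x] for x in cell))
    return ca, cd

sigma_a = [1, 0, 1,  0, 0, 0,  1, 1, 1,  0, 1, 0]
sigma_b = [0, 1, 1,  0, 0, 0,  1, 1, 1,  0, 1, 0]   # same cell counts, different layout
print(project(sigma_a), project(sigma_b))           # [2, 0, 3, 1] [2, 0, 3, 1]
print(exact_coarse_rates(sigma_a)[1][0], exact_coarse_rates(sigma_b)[1][0])
```

In this example the exact adsorption rate depends on σ only through F(σ), whereas the exact desorption rate in the first cell differs between the two layouts, so it cannot be written exactly as a function of the coarse variable alone.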
The main goal here is to express these exact coarse-grained rates, up to a controlled error, as functions of the “mesoscopic” random variable $F(\sigma)$ rather than the microscopic $\sigma$. This step yields a Markov process that approximates, in a probability metric, the microscopic average (4). We refer to this procedure as a closure, in analogy to closure arguments in kinetic theory and to the derivation of coarse-grained deterministic PDE from interacting particle systems as hydrodynamic limits [12]. However, here we carry out a stochastic closure that retains the fluctuations of the microscopic system. We demonstrate these arguments only in the case of Arrhenius dynamics; full details, including other dynamics, can be found in Refs. [1–3].

For the adsorption/desorption case we define the coarse-grained birth–death Markov process $\eta = \{\eta(k) : k \in \mathcal{L}_c\}$ approximating (4), where the random variable $\eta(k) \in \{0, 1, \dots, q\}$ counts the number of particles in each coarse cell $D_k$. Using the rate calculations above, we obtain the update rates with which the value $\eta(k) \approx F(\sigma)(k)$ is increased by 1 (adsorption of a single particle in the coarse cell $D_k$) and decreased by 1 (desorption in $D_k$), respectively:

\[ c_a(k, \eta) = d_0\,[q - \eta(k)], \qquad c_d(k, \eta) = d_0\,\eta(k)\exp\bigl[-\beta \bar{U}(k)\bigr], \tag{7} \]

where

\[ \bar{U}(l) = \sum_{k \in \mathcal{L}_c,\ k \neq l} \bar{J}(l, k)\,\eta(k) + \bar{J}(0, 0)\,(\eta(l) - 1) - \bar{h}(l). \]

As we show in Ref. [1], this new rate can be obtained from (5) with an error of order $O(q/L)$ when replacing $F(\sigma) \approx \eta$. Finally, the coarse-grained potential $\bar{J}$ is defined by including the average of all contributions of pairwise microscopic interactions between coarse cells and within the same coarse cell,

\[ \bar{J}(k, l) = m^2 \iint_{D_l \times D_k} J(r - s)\,dr\,ds, \tag{8} \]

where the area of $D_l \times D_k$ is equal to $1/m^2$. The coarse-grained external field $\bar{h}$ is defined accordingly. Wavelets with vanishing moments can also be used in the construction of the coarse-grained potential [11, 15].

Similarly, in the Arrhenius diffusion case we obtain [3] the new rate

\[ c_{\mathrm{diff}}(k \to l, \eta) = \frac{1}{q}\,\eta(k)\,(q - \eta(l))\exp\bigl[-\beta\,(U_0 + \bar{U}(k, \eta))\bigr], \tag{9} \]

describing the migration of a particle from the coarse cell $D_k$ to the cell $D_l$ if $k, l$ are nearest neighbors, and $c_{\mathrm{diff}}(k \to l, \eta) = 0$ otherwise; the generator for the Markov process $\eta_t$ is defined analogously. A crucial step in obtaining (9) from (6), which is special to the diffusion case, is the approximation of the local function $\sigma(x)(1 - \sigma(y))$ in (6) as a function of the coarse-grained variable $\eta$. This step is trivial for the spin flip dynamics, since the local functions in (5) are linear. Here we make the closure assumption that the particles are at local equilibrium inside each coarse cell $D_k$; we can thus replace $\sigma(x)$ by $q^{-1}\eta(k)$ (and, respectively, $\sigma(y)$ by $q^{-1}\eta(l)$). This substitution somewhat parallels the “Replacement Lemma” in the interacting particle systems literature, which is necessary to obtain deterministic PDE as hydrodynamic limits: relative entropy estimates describing local equilibration of interacting particles allow one to approximately rewrite local functions as functions of the coarse-grained variables; see Ref. [16]. This analogy becomes precise in Section 6, in the discussion of the relative entropy error estimate (18) between the microscopic process $\sigma$ and the coarse-grained process $\eta$.

The invariant measure for the coarse-grained process $\{\eta_t\}_{t \ge 0}$ is a canonical Gibbs measure related to the original microscopic dynamics $\{\sigma_t\}_{t \ge 0}$:

\[ \mu_{m,q,\beta}(d\eta) = \frac{1}{Z_{m,q,\beta}}\exp\bigl(-\beta \bar{H}(\eta)\bigr)\,P_{m,q}(d\eta), \tag{10} \]

where the product binomial distribution $P_{m,q}(\eta)$ is the prior distribution arising from the microscopic prior by including $q$ independent sites. Furthermore, $\bar{H}$ is the coarse-grained Hamiltonian derived from the microscopic $H$,

\[ \bar{H}(\eta) = -\frac{1}{2}\sum_{l \in \mathcal{L}_c}\ \sum_{k \in \mathcal{L}_c,\ k \neq l} \bar{J}(k, l)\,\eta(k)\eta(l) - \frac{\bar{J}(0, 0)}{2}\sum_{l \in \mathcal{L}_c} \eta(l)\,(\eta(l) - 1) + \sum_{k \in \mathcal{L}_c} \bar{h}\,\eta(k). \tag{11} \]

The same-cell interaction term $\eta(l)(\eta(l) - 1)$ yields the global mean field theory when the coarse-graining is performed beyond the interaction range $L$, while at the other extreme, $q = 1$, it is consistent with the Ising case. As a result, we obtain a complete hierarchy of MC models, termed coarse-grained MC, spanning from the Ising model ($q = 1$) to the mean field statistical mechanics limit; the latter does not include detailed interactions but does include noise, unlike the usual ODE mean field theories. Finally, it can easily be shown, both in the adsorption/desorption and in the diffusion case, that the condition of detailed balance for $\eta$ with respect to the measure $\mu_{m,q,\beta}$ holds. Thus, combined mechanisms of diffusion, adsorption, and desorption, which typically coexist in physical systems [17], can be modeled and simulated consistently for every coarse-graining level $q$. Detailed balance guarantees the proper inclusion of fluctuations in the coarse-grained model as they arise from the microscopics. This is justified in part by the form of the prior in (10), it is tested numerically in Refs. [1, 3], and it is proved rigorously by the loss of information estimate (18) below.
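The coarse-grained quantities of this section can be assembled in a few lines. The sketch below is a hedged illustration, not the implementation of Refs. [1–3]: the integral in (8) is replaced by a plain lattice average over pairs of sites (an assumption of this example, as is the triangular V and all parameter values), and the detailed balance of the rates (7) with respect to the Gibbs measure (10) is verified numerically.

```python
import math

m, q, L = 6, 4, 6                 # hypothetical sizes: N = m*q = 24 micro-sites
N = m * q
beta, h, d0 = 1.0, 0.1, 1.0       # illustrative parameter values

def V(r):
    return 1.0 - abs(r) if abs(r) < 1 else 0.0   # assumed example potential

def J(d):
    d = abs(d) % N
    d = min(d, N - d)                            # periodic distance in lattice units
    return V(d / L) / L                          # Eq. (2)

def cell(k):
    return range(k * q, (k + 1) * q)

# Lattice stand-in for Eq. (8): average of J over pairs of sites in cells k and l.
Jbar = [[sum(J(x - y) for x in cell(k) for y in cell(l)) / q**2 if k != l else 0.0
         for l in range(m)] for k in range(m)]
# Same-cell coupling Jbar(0,0): average over ordered pairs x != y inside one cell.
J00 = (sum(J(x - y) for x in cell(0) for y in cell(0) if x != y) / (q * (q - 1))
       if q > 1 else 0.0)

def Ubar(k, eta):
    """Coarse-grained interaction energy entering Eq. (7)."""
    return (sum(Jbar[k][l] * eta[l] for l in range(m) if l != k)
            + J00 * (eta[k] - 1) - h)

def c_ads(k, eta):
    return d0 * (q - eta[k])

def c_des(k, eta):
    return d0 * eta[k] * math.exp(-beta * Ubar(k, eta))

def Hbar(eta):
    """Coarse-grained Hamiltonian of Eq. (11) with a constant field."""
    pair = sum(Jbar[k][l] * eta[k] * eta[l]
               for k in range(m) for l in range(m) if k != l)
    same = sum(e * (e - 1) for e in eta)
    return -0.5 * pair - 0.5 * J00 * same + h * sum(eta)

def weight(eta):
    """Unnormalized Gibbs weight of Eq. (10); constant 2**(-q) prior factors cancel."""
    return math.exp(-beta * Hbar(eta)) * math.prod(math.comb(q, e) for e in eta)

eta = [0, 1, 2, 3, 4, 2]
gaps = []
for k in range(m):
    if eta[k] < q:                      # adsorption is possible in cell k
        up = list(eta)
        up[k] += 1
        lhs = c_ads(k, eta) * weight(eta)
        rhs = c_des(k, up) * weight(up)
        gaps.append(abs(lhs - rhs) / max(lhs, rhs))
print(max(gaps))
```

The check relies only on the symmetry of the coarse potential, so it should pass for any symmetric choice of the coarse couplings; the quality of the approximation to the microscopic system is a separate question, addressed by estimate (18).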
4. Coarse-grained Monte Carlo Algorithms
The implementation of coarse-grained MC (CGMC), based on (7) and (9), is essentially identical to that of microscopic MC [18], with a few differences. First, the inter-particle potential $J$ is coarse-grained at the beginning of a simulation to represent interactions between particles within each cell (a feature absent in microscopic MC) as well as interactions with neighboring cells. Second, the order parameter is still an integer but varies between zero and $q$, instead of between zero and one as is typical for microscopic MC. Otherwise, microscopic and coarse-grained algorithms are basically the same.

Finally, we should comment on the significant computational savings resulting from coarse-graining. For CGMC, the CPU time in a kinetic MC simulation with global update, i.e., searching the entire lattice to identify the chosen site, scales approximately as $O(m^3)$ vs. $O(N^3)$ for a conventional MC algorithm. In addition, coarse-grained potentials $\bar{J}$ are compressed through the wavelet expansion (4), and thus additional savings are made in the calculation of energetics.

Overall, in the case of adsorption/desorption processes, the CPU time for the same real time can decrease with increasing $q$ approximately as $O(1/q^2)$. For example, even a very modest 10-fold reduction in the number of sites ($q = 10$) reduces the CPU time by a factor of $10^2$, yielding a significant enhancement in performance. Thus, while microscopic MC simulations are impractical on a single processor for macroscopic systems at the millimeter length scale or larger, the computational savings of CGMC make it a suitable tool for capturing large-scale features while retaining microscopic information on intermolecular forces and particle fluctuations. CGMC can capture mesoscale morphological features by incorporating the noise correctly, as well as by simulating large length scales. For instance, we refer to the standing wave example for adsorption/desorption computed by CGMC in Ref. [2]; in this case we employed an exact analytic solution for the average coverage as a rigorous benchmark for the CGMC computations.

A striking difference between simulations of diffusion and of adsorption/desorption processes is that in the case of diffusion we also have coarse-graining in time, by a factor $q^2$. This is intuitively clear if one considers the additional space covered by a single coarse-grained jump, which would take $q$ microscopic jumps. We refer to Ref. [3] for theory and simulations justifying and demonstrating precisely this coarse-graining in time. In turn, this approach helps alleviate the hydrodynamic slowdown effect in conservative MC and results in additional CPU savings. Overall, for long-range potentials, CPU savings of up to $q^4$ occur for continuous-time KMC simulation.
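To make the algorithmic description concrete, here is a minimal sketch of a continuous-time (rejection-free) kinetic MC loop driven by the coarse rates (7), with a global search over all cells at every step. The function name `cgmc_adsorption_desorption`, its signature, and the test parameters are hypothetical, and the coarse couplings are passed in as precomputed inputs rather than derived from a microscopic potential.

```python
import math
import random

def cgmc_adsorption_desorption(m, q, Jbar, J00, beta, h, d0, t_end, seed=0):
    """Simulate the birth-death process with rates (7) up to time t_end.

    Jbar is an m-by-m list of coarse couplings and J00 the same-cell coupling.
    Starts from an empty lattice and returns the final occupation numbers eta.
    """
    rng = random.Random(seed)
    eta = [0] * m
    t = 0.0
    while True:
        rates = []
        for k in range(m):
            U = (sum(Jbar[k][l] * eta[l] for l in range(m) if l != k)
                 + J00 * (eta[k] - 1) - h)
            rates.append(d0 * (q - eta[k]))                   # adsorption: eta(k) += 1
            rates.append(d0 * eta[k] * math.exp(-beta * U))   # desorption: eta(k) -= 1
        total = sum(rates)
        # Exponentially distributed waiting time until the next event.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            return eta
        # Choose one event with probability proportional to its rate.
        idx = rng.choices(range(len(rates)), weights=rates)[0]
        if rates[idx] == 0.0:
            continue        # guards against a floating-point edge case in the selection
        k, kind = divmod(idx, 2)
        eta[k] += 1 if kind == 0 else -1

# Non-interacting test case: the coverage should relax toward 1/2.
m, q = 10, 100
zero = [[0.0] * m for _ in range(m)]
eta = cgmc_adsorption_desorption(m, q, zero, 0.0, beta=1.0, h=0.0, d0=1.0, t_end=4.0)
print(sum(eta) / (m * q))
```

The state has m integers in {0, ..., q} instead of N = mq binary variables, which is where the savings discussed above come from; a production code would also update only the rates affected by each event instead of recomputing all of them.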
5. Connections to Stochastic Mesoscopic Models and Their Simulation
In this section we discuss connections of CGMC with coarse-grained models involving stochastic PDE (SPDE), derived mainly in the physics and more recently in the mathematics communities. These approaches involve a heuristic, and in some cases a rigorous, passage to the infinite lattice limit in averaged quantities such as (4). Then, under suitable conditions, random fluctuations in the microscopic average (4) are suppressed, in analogy to the Law of Large Numbers, but are accounted for as corrections, similarly to the Central Limit Theorem. In the end, the limit of (4) is expected to solve an SPDE. A classical example of such an SPDE is the stochastic Cahn–Hilliard–Cook model [19], which takes the abstract form

\[ c_t - \nabla \cdot \left\{ \mu[c]\,\nabla \frac{\delta E[c]}{\delta c} \right\} - \frac{1}{\sqrt{N}}\,\nabla \cdot \left\{ \sqrt{2\mu[c]}\,\dot{W} \right\} = 0, \tag{12} \]

where $\dot{W} = (\dot{W}_1(x, t), \dots, \dot{W}_d(x, t))$ is a space/time white noise and $\delta E[c]/\delta c$ is the variational derivative of the free energy

\[ E[c] = D\int |\nabla c|^2\,dy + \beta h \int c(y)\,dy + \int F(c(y))\,dy. \tag{13} \]

Here $F(c)$ is a double-well potential and $\mu[c]$ is the mobility of the system. In the case of Cahn–Hilliard–Cook models the mobility is typically $\mu[c] = 1$ or $\mu[c] = c(1 - c)$. In Ref. [10] we derived a stochastic PDE of the type (12) as a mesoscopic theory for the diffusion of molecules interacting with a long-range potential, by studying the asymptotics of (4) under the microscopic dynamics as the number of interacting neighbors $L \to \infty$. The free energy in this case is

\[ E[c] = -\frac{\beta}{2}\iint V(y - y')\,c(y)\,c(y')\,dy\,dy' + \beta h \int c(y)\,dy + \int r(c(y))\,dy, \tag{14} \]

where $r(c) = c \log c + (1 - c)\log(1 - c)$, and the mobility depends explicitly on the choice of microscopic dynamics:

\[ \mu[c] = \begin{cases} \beta c(1 - c), & \text{Metropolis-type}, \\ \beta c(1 - c)\exp(-\beta V * c), & \text{Arrhenius}, \end{cases} \tag{15} \]

where $*$ denotes the convolution of two functions. Here the derivation of the noise is not based on a central limit theorem-type of scaling, which would linearize (12) and would not account for the expected hysteresis and metastability. Instead, the noise term is “designed” so that (a) (12) satisfies, as expected, a fluctuation–dissipation relation and (b) it yields the same large deviation functional and rare events as the microscopic spin exchange process. We refer to Ref. [20] for an overview of mesoscopic PDE-based theories for both diffusion and adsorption/desorption processes.

The connection of CGMC with SPDE such as (12) can be readily seen even with an equilibrium calculation: formally, the Gibbs states associated with this Langevin-type stochastic equation are given by the free energy $E[c]$. On the other hand, in Ref. [1] we derived an asymptotic formula for the coarse-grained Gibbs measure (10) as $q \to \infty$:

\[ \mu_{m,q,\beta}(\eta_0) = \frac{1}{Z_{m,q,\beta}}\exp\bigl\{ -qm\,\bigl(E_{m,q}(\eta_0) + o_q(1)\bigr) \bigr\}, \tag{16} \]

where

\[ E_{m,q}[C] = -\frac{\beta}{2m\bar{L}}\sum_{l}\sum_{k} \bar{V}(k, l)\,C_k C_l + \frac{\beta h}{m}\sum_{k \in \mathcal{L}_c} C_k + \frac{1}{m}\sum_{k \in \mathcal{L}_c} r(C_k), \tag{17} \]

$\bar{J} = \frac{1}{L}\bar{V}$, and $\bar{L} = L/q$ is the coarse-grained potential length of $\bar{J}$; we also define the average coverage at $k \in \mathcal{L}_c$, $C_k = \lambda_k/q$, where $\eta_0 = (\lambda_1, \lambda_2, \dots, \lambda_m)$, $0 \le \lambda_i \le q$, and $r(c) = c \log c + (1 - c)\log(1 - c)$. It is now clear that when the coarse-grained potential $\bar{V}$ is long ranged, (17) is merely a discrete version of the free energy (14). On the other hand, if $\bar{V}$ is a nearest-neighbor potential, then (17) yields a discrete version of the Ginzburg–Landau energy (13). In passing, we remark that (16) also implies that for large $q$ and fixed $m$, the most probable equilibrium configurations of the coarse-grained process $\eta_t$ are given by the minimizers of the discrete free energy (17).

A notable advantage of the CGMC methods over numerically solving Cahn–Hilliard–Cook type equations is the explicit connection to the microscopic system. While the connection with the underlying microscopic system is clear for the stochastic mesoscopic equations (12), (15), their derivation from the microscopics is valid for $L \gg 1$, which is not a strict requirement for our coarse-grained systems, as the estimate (18) demonstrates. From a mathematical perspective, due to the singular nature of the noise term, such SPDEs are expected to have at best distributional solutions in dimensions higher than 1. As a result, although direct simulation of (12) with the mobility (15) may have the advantage that PDE-based spectral methods can be used to overcome the hydrodynamic slowdown of MC algorithms (see Ref. [21]), it requires careful handling of the highly singular noise term so that the scheme satisfies the detailed balance condition. For detailed adsorption/desorption mechanisms, it is not even clear what the stochastic mesoscopic analogue of (12) is that still satisfies detailed balance. On the other hand, CGMC includes fluctuations consistently with the detailed balance principle, allowing for the mesoscopic modeling of multiple simultaneous mechanisms, such as particle diffusion, adsorption, desorption, and reaction, while always properly including stochastic fluctuations.
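The entropy term in (17) comes from the binomial prior in (10): by Stirling's formula, (1/q) log of the binomial coefficient "q choose λ" approaches −r(λ/q) as q grows. The short sketch below evaluates (17) and illustrates this limit numerically; the function names are hypothetical, the double sum in (17) is taken over all pairs (k, l) supplied in the matrix `Vbar`, and the parameter values are arbitrary.

```python
import math

def r(c):
    """r(c) = c log c + (1 - c) log(1 - c), with the convention 0 log 0 = 0."""
    return sum(p * math.log(p) for p in (c, 1.0 - c) if p > 0.0)

def discrete_free_energy(C, Vbar, Lbar, beta, h):
    """Discrete free energy of Eq. (17) for coverages C[k] = eta(k)/q."""
    m = len(C)
    pair = sum(Vbar[k][l] * C[k] * C[l] for k in range(m) for l in range(m))
    return (-beta / (2.0 * m * Lbar) * pair
            + beta * h / m * sum(C)
            + sum(r(c) for c in C) / m)

def log_binomial_per_site(q, lam):
    """(1/q) log C(q, lam), computed stably with the log-gamma function."""
    return (math.lgamma(q + 1) - math.lgamma(lam + 1)
            - math.lgamma(q - lam + 1)) / q

# The mismatch with -r(c) shrinks as the coarse-graining level q grows (c = 0.3).
for q in (10, 100, 10000):
    lam = 3 * q // 10
    print(q, log_binomial_per_site(q, lam) + r(lam / q))
```

For a non-interacting system with no field and uniform coverage 1/2, `discrete_free_energy` reduces to r(1/2) = −log 2, the minimum of the entropy term.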
6. The Numerical Analysis of CGMC: An Information Theory Approach
In this section we discuss the error analysis between microscopic models and CGMC in a more traditional numerical analysis sense. The error here represents the loss of information in the transition from the microscopic probability measure to the coarse-grained one. Such relative entropy estimates give a first mathematical rationale for the parameter regimes (e.g., degree of coarse-graining) for which CGMC is expected to give errors within a certain tolerance.

In Refs. [1, 3] we demonstrated, rigorously and computationally, that coarse-grained and microscopic processes share the same asymptotic mean behavior, i.e., that averages of the microscopic and coarse-grained processes solve the same mesoscopic deterministic PDE in the long-range interaction limit $L \to \infty$. In addition to comparing the asymptotic mean behavior of coarse-grained and microscopic systems, we would like to understand how well, and in what regimes, CGMC captures the fluctuations of the microscopic system. As a first step in this direction, in the numerical simulations of Ref. [2] we observed almost pathwise agreement between CGMC and microscopic MC simulations in the adsorption/desorption case when the level of coarse-graining $q$ was substantially smaller than $L$, e.g., $q/L \approx 0.25$ and $L = 40$ (we note that in two dimensions, potentials with interactions just three lattice units long have $L$ of about 30). These simulations suggested that, in order to address questions beyond the agreement in average behavior, we need a comparison of the entire probability measures of the microscopic and coarse-grained processes.

Our principal idea in this direction is to obtain a quantitative measure of the loss of information during coarse-graining from finer to coarser scales: we consider the exact coarse-graining of the microscopic Gibbs measure, $\mu_{\mathcal{L},\beta} \circ F(\eta) := \mu_{\mathcal{L},\beta}(\{\sigma : F(\sigma) = \eta\})$, where $F$ is the projection operator from fine to coarse variables (4), and compare it to the Gibbs measure in CGMC (10). The relative entropy between the two measures provides a first quantitative estimate of the loss of information during the coarse-graining process from finer to coarser scales [3]:

\[ R\bigl(\mu_{m,q,\beta} \,\big|\, \mu_{\mathcal{L},\beta} \circ F\bigr) := N^{-1}\sum_{\eta} \log\left( \frac{\mu_{m,q,\beta}(\eta)}{\mu_{\mathcal{L},\beta}(\{\sigma : F(\sigma) = \eta\})} \right)\mu_{m,q,\beta}(\eta) = O\!\left(\frac{q}{L}\right). \tag{18} \]

Notice that the estimate (18) is on the specific entropy, which is the relative entropy normalized by the size $N$ of the microscopic system; the loss of information, however small in each coarse cell, grows linearly with size as we take into account a growing number of cells. Relation (18) gives some initial mathematical intuition, at least at equilibrium, on how to rationally design a “good” CGMC algorithm, i.e., how to select the extent of coarse-graining $q$, given a potential $J$ with a total number of interacting neighbors $L$ and a desired accuracy. In fact, (18) is essentially a numerical analysis estimate between the exact solution of the microscopic system $\sigma$ and the approximating CGMC $\eta$. Such estimates for the solution of a PDE and a corresponding finite element approximation are usually done in an $L^p$ or Sobolev norm. Here the relative entropy provides the analogue of a norm, without strictly being one. Furthermore, due to the Pinsker inequality [22], the estimate (18) implies an estimate on the total variation norm of the probability measures.
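For very small systems, the specific relative entropy in (18) can be evaluated by brute force, which gives a feel for how the loss of information depends on q. The sketch below enumerates all microscopic configurations, so it is limited to N of order 10; the lattice average used for the coarse potential (in place of the integral in (8)), the triangular V, and all parameter values are assumptions of this example.

```python
import itertools
import math

def specific_relative_entropy(m, q, L, beta, h, V):
    """Brute-force evaluation of the left-hand side of (18) for N = m*q sites."""
    N = m * q
    def J(d):
        d = abs(d) % N
        d = min(d, N - d)                       # periodic distance in lattice units
        return V(d / L) / L                     # Eq. (2)
    Jmat = [[J(i - j) for j in range(N)] for i in range(N)]
    cells = [range(k * q, (k + 1) * q) for k in range(m)]
    # Lattice stand-ins for Eq. (8): averages of J over pairs of sites.
    Jbar = [[sum(Jmat[x][y] for x in cells[k] for y in cells[l]) / q**2
             for l in range(m)] for k in range(m)]
    J00 = (sum(Jmat[x][y] for x in cells[0] for y in cells[0] if x != y)
           / (q * (q - 1))) if q > 1 else 0.0

    def H(s):                                   # microscopic Hamiltonian, Eq. (1)
        pair = sum(Jmat[i][j] * s[i] * s[j]
                   for i in range(N) for j in range(N) if i != j)
        return -0.5 * pair + h * sum(s)

    def Hbar(eta):                              # coarse-grained Hamiltonian, Eq. (11)
        pair = sum(Jbar[k][l] * eta[k] * eta[l]
                   for k in range(m) for l in range(m) if k != l)
        return -0.5 * pair - 0.5 * J00 * sum(e * (e - 1) for e in eta) + h * sum(eta)

    # Exact coarse-graining of the microscopic Gibbs measure (uniform prior).
    exact = {}
    for s in itertools.product((0, 1), repeat=N):
        eta = tuple(sum(s[x] for x in c) for c in cells)
        exact[eta] = exact.get(eta, 0.0) + math.exp(-beta * H(s))
    Z = sum(exact.values())

    # CGMC Gibbs measure of Eq. (10), with the product binomial prior.
    cg = {eta: math.exp(-beta * Hbar(eta)) * math.prod(math.comb(q, e) for e in eta)
          for eta in exact}
    Zc = sum(cg.values())

    return sum((w / Zc) * math.log((w / Zc) / (exact[eta] / Z))
               for eta, w in cg.items()) / N

V = lambda r: 1.0 - abs(r) if abs(r) < 1 else 0.0   # assumed example potential
for m, q in ((10, 1), (5, 2), (2, 5)):
    print(q, specific_relative_entropy(m, q, L=4, beta=2.0, h=0.0, V=V))
```

At q = 1 the coarse model coincides with the microscopic one, so the relative entropy vanishes up to rounding; for q > 1 the printed values indicate how much information the closure discards for this particular potential and temperature.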
7. Conclusions
Here we provided an overview of the first steps taken in deriving a mathematically founded framework for coarse-graining of stochastic processes and associated kinetic Monte Carlo simulations. We have shown that coarsegrained models and simulations can reach larger scales while retaining information about the microscopic mechanisms and interaction potentials and the correct noise. Information theory methods have been introduced to assess the errors (loos of information) during coarse-graining. We believe that these tools will be essential to providing strategies for optimized coarse-graining designs. Concluding, we remark that while our focus has been on simple Ising type of models, the concepts introduced here can be extended to more complex systems. One such application to atmospheric sciences arises in Ref. [8], where CGMC models, coupled with the macroscopic fluid and thermodynamic equations, are used to parametrize underresolved (subgrid) features of tropical convection. Furthermore, in recent years there is a great interest in the polymer science and biology literature in coarse-graining atomistic models of polymer chains; we refer to the review article on coarse-graining by Muller-Plathe Ref. [22], for further discussion. In this context, coarse-graining is typically achieved by collecting a number of atoms (on the order of 10–20) in a polymer chain into a “super-atom” and semi-empirically/analytically fit parameters to a known potential type U¯ , e.g., Lennard–Jones, to derive the coarse-grained potential for the super-atoms. Other coarse-graining techniques in the polymer science literature including the bond fluctuation model and its variants share the perspective of the CGMC: an atomistic chain model is mapped on a lattice, where a super-atom occupies a lattice cell (similarly to the coarse-cells Dk in Section 2). All these coarse-grained models have to varying degrees the drawback that they rely on parameterized coarse potentials. 
Hence they need to be re-parameterized at different conditions (e.g., temperature, density, composition) [23]. Furthermore, since they are not directly derived from the atomistic dynamics, it is not clear whether they reproduce transport and dynamic properties such as melt viscosities. We hope that our methods can eventually provide a new mathematical framework for these approaches and a more systematic – if not completely mathematical – way to construct coarse-grained dynamics and potentials for such complex systems.
References
[1] M.A. Katsoulakis, A.J. Majda, and D.G. Vlachos, J. Comp. Phys., 186, 250, 2003.
[2] M.A. Katsoulakis, A.J. Majda, and D.G. Vlachos, Proc. Natl. Acad. Sci. USA, 100, 782, 2003.
[3] M.A. Katsoulakis and D.G. Vlachos, J. Chem. Phys., 112, 18, 2003.
[4] S.M. Auerbach, Int. Rev. Phys. Chem., 19, 155, 2000.
[5] R.C. O’Handley, Modern Magnetic Materials: Principles and Applications, Wiley, New York, 2000.
[6] S. Renisch, R. Schuster, J. Wintterlin, and G. Ertl, Phys. Rev. Lett., 82, 3839, 1999.
[7] A.J. Majda and B. Khouider, Proc. Natl. Acad. Sci. USA, 99, 1123, 2002.
[8] B. Khouider, A.J. Majda, and M.A. Katsoulakis, Proc. Natl. Acad. Sci. USA, 100, 11941, 2003.
[9] G. Giacomin and J.L. Lebowitz, J. Stat. Phys., 87, 37, 1997.
[10] D.G. Vlachos and M.A. Katsoulakis, Phys. Rev. Lett., 85, 3898, 2000.
[11] R. Lam, T. Basak, D.G. Vlachos, and M.A. Katsoulakis, J. Chem. Phys., 115, 11278, 2001.
[12] C. Kipnis and C. Landim, Scaling Limits of Interacting Particle Systems, Springer, New York, 1999.
[13] B. Gidas, Topics in Contemporary Probability and its Applications, J. Laurie Snell (ed.), CRC Press, Boca Raton, 1995.
[14] N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group, vol. 85, Addison-Wesley, New York, 1992.
[15] A.E. Ismail, G.C. Rutledge, and G. Stephanopoulos, J. Chem. Phys., 118, 4414, 2003.
[16] H.T. Yau, Lett. Math. Phys., 22, 63, 1991.
[17] M. Hildebrand and A.S. Mikhailov, J. Phys. Chem., 100, 19089, 1996.
[18] D.P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, London, 2000.
[19] H.E. Cook, Acta Metall., 18, 297, 1970.
[20] M.A. Katsoulakis and D.G. Vlachos, IMA Vol. Math. Appl., 136, 179, 2003.
[21] D.J. Horntrop, M.A. Katsoulakis, and D.G. Vlachos, J. Comp. Phys., 173, 361, 2001.
[22] T.M. Cover and J.A. Thomas, Elements of Information Theory, Wiley, New York, 1991.
[23] F. Muller-Plathe, Chem. Phys. Chem., 3, 754, 2002.
[24] G. Beylkin, R. Coifman, and V. Rokhlin, Commun. Pure Appl. Math., 44, 141, 1991.
[25] M. Hildebrand, A.S. Mikhailov, and G. Ertl, Phys. Rev. E, 58, 5483, 1998.
[26] M. Seul and D. Andelman, Science, 267, 476, 1995.
[27] A.F. Voter and J.D. Doll, J. Chem. Phys., 82, 80, 1985.
4.13 MULTISCALE MODELING OF CRYSTALLINE SOLIDS
Weinan E and Xiantao Li
Program in Applied and Computational Mathematics, Princeton University
1. Introduction
Multiscale modeling and computation has recently become one of the most active research areas in applied science. With rapidly growing computing power, we are increasingly capable of modeling the details of physical processes. Nevertheless, we still face the challenge that the phenomena of interest are often the result of strong interaction between multiple spatial and temporal scales, and that the physical processes are described by radically different models at different scales. The mechanical behavior of solids is a typical example that exhibits such a multiscale character. At the fundamental level, everything about the solid can be attributed to the electronic structure, which obeys the Schrödinger equation. Atomic interactions and crystal structures can be described at the atomistic scale using molecular dynamics. Mechanical properties at the scale of the material are often modeled using continuum mechanics, in which one speaks of stresses and strains. In between there are various levels of mesoscales where one deals with defects such as grain boundaries, dislocation dynamics, and dislocation bundles. What makes the problem challenging is that these different scales are often strongly coupled with each other. Continuum models usually offer an efficient way of studying material properties. But they suffer from inadequate accuracy and from the lack of microstructural information that tells us the microscopic mechanisms for why the material responds in the way it does. Atomistic models, on the other hand, allow us to probe the detailed crystalline and defect structure. However, the length and time scales of interest are often far beyond what a full atomistic computation can reach. This is where multiscale modeling comes into play. The idea is that by coupling microscopic models such as molecular dynamics (MD) with macroscopic models such as continuum mechanics, one might be able to develop numerical tools that have an accuracy comparable to that of the microscopic model and an efficiency comparable to that of the macroscopic model.
In this article, we review some of the strategies that have been proposed for this purpose. We focus on the coupling between molecular dynamics and continuum mechanics, although some of the strategies can be formulated in a more general setting. In addition, for simplicity we concentrate on concurrent coupling methods that link different scales “on the fly”. Broadly speaking, concurrent coupling methods can be divided into two main categories: those based on energetic formulations and those based on dynamic formulations. We discuss them separately.
S. Yip (ed.), Handbook of Materials Modeling, 1491–1506. © 2005 Springer. Printed in the Netherlands.
2. Energy-based Methods
At the atomistic scale, the deformation of the solid is described by the (displaced) positions of the atoms that make up the solid. At zero temperature, the positions of the atoms are obtained by minimizing the total energy of the system, which consists of the potential energy due to the interaction of the atoms and the energy due to applied forces:
\[ E_{\mathrm{tot}} = E(x_1, \ldots, x_N) - \sum_j f(x_j). \tag{1} \]
Here $x_j$ denotes the displaced position of the $j$th atom. We use $x_j^0$ to denote its reference position, which is taken to be the equilibrium position, and $u_j = x_j - x_j^0$ is the displacement of the $j$th atom. At the continuum level, the deformation of the solid is described by the displacement field $u$, which also minimizes the total energy of the system, consisting of the elastic energy caused by the deformation and the energy due to external forces:
\[ \int_{\Omega} \bigl( \varepsilon(\nabla u) - f_{\mathrm{ex}} \cdot u \bigr)\, dx. \tag{2} \]
Here $\varepsilon$ is the strain energy density. Numerically, this problem is solved by finite element methods on an appropriate triangulation $\{\Omega_\alpha\}$ of the domain that defines the solid. In both cases, dynamics can be generated using Hamilton’s equations for the corresponding energy functional. Clearly the continuum approach is more efficient once we know the strain energy density. The conventional approach in continuum mechanics is to model this empirically using a combination of experimental data and analytical reasoning. The recently developed multiscale approach, on the other hand, aims at computing the strain energy directly from the atomistic model. Next we discuss several methods that have been developed for this purpose. To begin with, let $Q$ be an appropriately defined operator that maps the microscopic configuration $\{u_j\}$ of the atoms to the macroscopic displacement field $u$. Then consistency between (1) and (2) implies that the strain energy should be given in terms of the atomistic model by
\[ e[u] = \min_{Q\{u_j\} = u} E_{\mathrm{tot}}. \tag{3} \]
However, this formula is impractical for numerical purposes since the number of atoms involved is often too large, and one has to come up with appropriate approximation procedures.
2.1. QC – Quasicontinuum Method
One remarkably successful approach is the quasicontinuum (QC) method [1, 2]. QC is a way of simulating the macroscale nonlinear deformation of crystalline solids using molecular mechanics. It consists of three main components.
• A finite element method on an adaptively generated mesh, which is automatically refined to the atomistic level near defects. Away from the defects, the mesh is coarsened to reflect the slow variation of the displacement field.
• A kinematic constraint by which a subset of atoms, called representative atoms, is selected. The deformation of the other atoms is expressed in terms of the deformation of the representative atoms. This reduces the number of degrees of freedom in the problem.
• A summation rule that computes an approximation to the total energy of the system by visiting only a small subset of the atoms. A simple example of a summation rule is the Cauchy–Born rule, which computes the local energy by assuming that the deformation is locally uniform.
We now discuss these components in some detail. Ideally, in order to calculate the total energy, one needs to visit all the atoms in the domain:
\[ E_{\mathrm{tot}} = \sum_{i=1}^{N} E_i(x_1, x_2, \ldots, x_N). \tag{4} \]
Here $E_i$ is the energy contribution from site $x_i$. The analytical form of $E_i$ depends on the empirical potential model in use. In practice, the computation of $E_i$ involves only neighboring atoms. In the region where the displacement field is smooth, keeping track of each individual atom is unnecessary. After selecting some representative atoms (repatoms), the displacement of the rest of the atoms can be approximated via linear interpolation,
\[ u_j = \sum_{\alpha=1}^{N_{\mathrm{rep}}} S_\alpha(x_j^0)\, u_\alpha, \]
where the subscript $\alpha$ identifies the representative atoms, $S_\alpha$ is an appropriate weight function, and $N_{\mathrm{rep}}$ is the number of repatoms involved. This step reduces the number of degrees of freedom. But to compute the total energy, in principle we still need to visit every atom. To reduce the computational complexity involved in computing the total energy, several summation rules have been introduced. The simplest of these is to assume that the deformation gradient $A = \partial x / \partial x^0$ is uniform within each element, namely that the Cauchy–Born rule holds. The strain energy in the element $\Omega_k$ can then be approximately written as $\varepsilon(A_k)\,|\Omega_k|$ in terms of the strain energy density $\varepsilon(A)$. With these approximations, the evaluation of the total energy is reduced to a summation over the finite elements,
\[ E_{\mathrm{tot}} \approx \sum_{k=1}^{N_e} \varepsilon(A_k)\, |\Omega_k|, \tag{5} \]
where $N_e$ is the number of elements. This formulation is called the local version of QC. The advantage of local QC is the great reduction in the number of degrees of freedom, since $N_{\mathrm{rep}} \ll N$. In the presence of defects, the deformation tends to be non-smooth, and the approximation made in local QC becomes inaccurate. A nonlocal version of QC has been developed, which proposes to compute the energy with the following ansatz:
\[ E_{\mathrm{tot}} \sim \sum_{\alpha=1}^{N_{\mathrm{rep}}} n_\alpha E_\alpha(u_\alpha). \tag{6} \]
Here the weights $\{n_\alpha\}$ are related to the atom density. The energy $E_\alpha$ of each repatom is computed by visiting its neighboring atoms, whose positions are generated using the local deformation. Near defects such as cracks or dislocations, the finite element mesh is also refined to the atomic scale to reflect the local deformation more accurately. Practical implementations usually combine the local and nonlocal versions of the method, and a criterion has been suggested to identify the local and nonlocal regions so that the whole procedure can be applied adaptively. Another version of QC, which is based on force calculations, has been put forward in Ref. [3]. This method generates clusters around the repatoms and performs the force calculation using the atoms within the clusters; see Fig. 1.
Figure 1. Schematic illustration of QC (courtesy of M. Ortiz). Only atoms in the small cluster need to be visited during the computation.
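The interpolation step and the local summation rule (5) can be sketched for a one-dimensional chain. This is only a minimal illustration under stated assumptions, not the QC implementation of Refs. [1, 2]: the nearest-neighbour Lennard–Jones potential, the function names, and the choice of repatoms are all hypothetical.

```python
def lennard_jones(r, eps=1.0, sigma=1.0):
    """Pair energy phi(r); an assumed stand-in for the empirical potential."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def interpolate_displacements(n_atoms, rep_idx, u_rep):
    """Piecewise-linear interpolation of repatom displacements to all atoms."""
    u = [0.0] * n_atoms
    for p, q, up, uq in zip(rep_idx, rep_idx[1:], u_rep, u_rep[1:]):
        for i in range(p, q + 1):
            u[i] = up + (uq - up) * (i - p) / (q - p)
    return u


def full_energy(u, a0):
    """Visit every nearest-neighbour bond, in the spirit of Eq. (4)."""
    return sum(lennard_jones(a0 + u[i + 1] - u[i]) for i in range(len(u) - 1))


def local_qc_energy(rep_idx, u_rep, a0):
    """Element sum of Eq. (5) with a Cauchy-Born energy density."""
    total = 0.0
    for p, q, up, uq in zip(rep_idx, rep_idx[1:], u_rep, u_rep[1:]):
        length = (q - p) * a0                       # |Omega_k| in the reference state
        stretch = 1.0 + (uq - up) / length          # deformation gradient A_k
        density = lennard_jones(stretch * a0) / a0  # epsilon(A_k) per reference length
        total += density * length
    return total


a0 = 2.0 ** (1.0 / 6.0)           # equilibrium spacing of the LJ pair for sigma = 1
rep_idx = [0, 4, 10, 20]          # 4 repatoms chosen among 21 atoms
u_rep = [0.0, 0.02, 0.03, -0.01]  # repatom displacements
u_all = interpolate_displacements(21, rep_idx, u_rep)
e_full = full_energy(u_all, a0)
e_qc = local_qc_energy(rep_idx, u_rep, a0)
print(e_full, e_qc)
```

In this special case (one dimension, nearest-neighbour interactions, repatoms located on lattice sites) the element sum reproduces the full sum over the interpolated configuration up to rounding, because every bond lies inside a single uniformly stretched element. With longer-ranged potentials or non-smooth deformation the two generally differ, which is the situation the nonlocal version is meant to address.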
QC has been successfully applied to a number of problems,* including dislocation structure, nanoindentation, crack propagation, and deformation twinning. The use of local QC to control the far-field region, and thus to create a continuum environment for material defects, has become more and more popular. In its simplest form, QC ignores atomic vibrations and thus entropic effects. This restricts QC to static problems at zero temperature. Dynamics of defects can be studied in a quasistatic setting. Finite temperature can be incorporated perturbatively [2, 4].
* For recent developments and source code, see http://www.qcmethod.com.
2.2. MAAD – Macro Atomistic Ab initio Dynamics
MAAD (Macro Atomistic Ab initio Dynamics) was proposed in Refs. [5, 6] to simulate crack propagation in silicon. The computational domain is decomposed into three parts: a continuum region away from the crack tip, where the linear elasticity model is solved using a finite element method; an atomistic region near the crack tip, in which molecular dynamics
\[ m_j \ddot{x}_j = -\nabla_{x_j} V, \qquad j = 1, 2, \ldots, N_{\mathrm{atom}}, \tag{7} \]
with the Stillinger–Weber potential is used; and a quantum mechanical region at the crack tip, where the tight binding (TB) model is used to model bond breaking. This is done by writing the Hamiltonian in the form
\[ H_{\mathrm{tot}} = H_{\mathrm{FE}} + H_{\mathrm{MD}} + H_{\mathrm{FE/MD}} + H_{\mathrm{TB}} + H_{\mathrm{MD/TB}}, \tag{8} \]
which represents the energy contributions from the different regions and the interfaces between them. For brevity we explain only the calculation of the first three terms. In the finite element (FE) region, the variables are the displacement field $u$, and the expression for the Hamiltonian is standard:
\[ H_{\mathrm{FE}} = \frac{1}{2} \sum_{k=1}^{N_e} \left( u_k^T K u_k + \dot{u}_k^T M \dot{u}_k \right). \tag{9} \]
Here $K$ and $M$ are the stiffness and mass matrices. The stiffness matrix can be obtained from the harmonic approximation of the interatomic potential. In the case of finite (but constant) temperature, these parameters are adjusted accordingly to be consistent with the atomistic system in the MD region. The Hamiltonian in the MD region is simply the total energy:
\[ H_{\mathrm{MD}} = \frac{1}{2} \sum_{i=1}^{N_{\mathrm{atom}}} m_i \dot{u}_i^2 + V, \tag{10} \]
where $u_i$ is the displacement of the $i$th atom and $V$ is the total potential energy in the MD region. A key ingredient in this procedure is a handshaking scheme at the continuum/MD (and MD/TB) interface. Specifically, near the continuum/MD interface the finite elements are refined all the way to the atomistic level so that their vertices coincide with the reference atomistic positions at the interface. The handshaking Hamiltonian $H_{\mathrm{FE/MD}}$ accounts for the interaction across the interface. The energy is computed from the continuum side and from the MD side using, respectively, the formulas (9) and (10), with a weight of one half for each. The continuum region and the atomistic region are then evolved simultaneously in time. Energy transport across the interface has been ignored. By refining the finite element mesh to the atomistic scale at the interface, MAAD also avoids the issue of phonon reflection that we discuss at the end of this article.
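As an illustration of how the MD part of such a scheme is advanced in time, the following sketch integrates Eq. (7) with the velocity Verlet method and monitors the Hamiltonian (10). It is a toy example under loud assumptions: the potential is a harmonic nearest-neighbour chain rather than the Stillinger–Weber potential used in MAAD, and the integrator and all names are illustrative choices, not taken from Refs. [5, 6].

```python
def potential(u, k=1.0):
    """Assumed harmonic chain: V = (k/2) * sum (u[i+1] - u[i])^2."""
    return 0.5 * k * sum((b - a) ** 2 for a, b in zip(u, u[1:]))


def forces(u, k=1.0):
    """Force on each atom, -dV/du_i, for the harmonic chain with free ends."""
    f = [0.0] * len(u)
    for i in range(len(u) - 1):
        tension = k * (u[i + 1] - u[i])  # pull exerted by bond (i, i+1)
        f[i] += tension
        f[i + 1] -= tension
    return f


def hamiltonian(u, v, m=1.0, k=1.0):
    """Kinetic plus potential energy, as in Eq. (10) with equal masses."""
    return 0.5 * m * sum(vi * vi for vi in v) + potential(u, k)


def velocity_verlet(u, v, dt, n_steps, m=1.0, k=1.0):
    """Integrate m * d2u/dt2 = -dV/du (Eq. (7)) for n_steps time steps."""
    u, v = list(u), list(v)
    f = forces(u, k)
    for _ in range(n_steps):
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]  # half kick
        u = [ui + dt * vi for ui, vi in zip(u, v)]            # drift
        f = forces(u, k)
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]  # half kick
    return u, v


u0 = [0.0, 0.1, 0.0, -0.1, 0.0]
v0 = [0.0] * 5
h0 = hamiltonian(u0, v0)
u1, v1 = velocity_verlet(u0, v0, dt=0.01, n_steps=1000)
h1 = hamiltonian(u1, v1)
print(h0, h1)
```

The energy is conserved only approximately, with an error that shrinks as the time step is reduced; the total momentum is conserved up to rounding because the bond forces cancel in pairs.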
2.3. CGMD – Coarse-Grained Molecular Dynamics
Coarse-grained molecular dynamics is a systematic procedure for deriving the effective Hamiltonian for a set of coarse-grained variables from the microscopic Hamiltonian [7]. Starting from a microscopic Hamiltonian $H_{\mathrm{MD}}$ defined on the phase space and defining coarse-grained variables by
\[ u_\mu = \sum_j f_{j\mu} u_j, \qquad p_\mu = \sum_j f_{j\mu} p_j, \tag{11} \]
where $f_{j\mu}$ are appropriate weights, the effective Hamiltonian for the coarse-grained variables is obtained from
\[ E(u_\mu, p_\mu) = \frac{1}{Z} \int H_{\mathrm{MD}}\, e^{-H_{\mathrm{MD}}/k_B T}\, \Delta \, dx_j\, dp_j, \tag{12} \]
where
\[ \Delta = \prod_\mu \delta\Bigl( u_\mu - \sum_k f_{k\mu} u_k \Bigr)\, \delta\Bigl( p_\mu - \sum_k f_{k\mu} p_k \Bigr), \]
$Z$ is a normalization constant, and $T$ is the temperature. Consistency with the coarse-grained variables is ensured through the presence of the delta functions, similar to the imposition of the kinematic constraint in QC. Equation (12) plays the role of (3) at finite temperature, with $Q$ defined via (11). The basic assumption in this formalism is that the small scale component is at equilibrium given the coarse-grained variables. Strictly speaking, this is only true if the relaxation times associated with the small scales are much shorter than those of the coarse-grained variables. In general the coarse-grained energy in (12) is still difficult to compute. It has been computed for the case of a harmonic potential in Ref. [7] and more generally in Ref. [8].
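The mapping (11) from atomistic to coarse-grained variables is just a weighted sum, which the following sketch makes concrete. The weights used here (hat functions centered on coarse nodes, normalized to sum to one) are an assumption made for illustration and are not claimed to be the weights of Ref. [7]; the function names are hypothetical.

```python
def hat_weights(atom_x, node_x):
    """weights[mu][j]: assumed normalised hat function of node mu at atom j."""
    weights = []
    for mu, xc in enumerate(node_x):
        left = node_x[mu - 1] if mu > 0 else None
        right = node_x[mu + 1] if mu + 1 < len(node_x) else None
        row = []
        for x in atom_x:
            if left is not None and left <= x <= xc:
                w = (x - left) / (xc - left)       # rising side of the hat
            elif right is not None and xc <= x <= right:
                w = (right - x) / (right - xc)     # falling side of the hat
            else:
                w = 0.0
            row.append(w)
        total = sum(row)
        weights.append([w / total for w in row])   # each row sums to one
    return weights


def coarse_grain(values, weights):
    """Apply Eq. (11): q_mu = sum_j f_{j mu} q_j, for q = u or q = p."""
    return [sum(w * q for w, q in zip(row, values)) for row in weights]


atom_x = [float(i) for i in range(9)]   # 9 atoms on a unit lattice
node_x = [0.0, 4.0, 8.0]                # 3 coarse nodes
weights = hat_weights(atom_x, node_x)
u_atoms = [0.3] * 9                     # uniform displacement
u_mu = coarse_grain(u_atoms, weights)
print(u_mu)
```

With normalized weights a uniform atomistic displacement maps to the same uniform coarse displacement, which is a basic consistency requirement on any choice of $f_{j\mu}$.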
3. Dynamics-based Method
So far we have discussed energy-based methods. In these methods, the key is to obtain a multiscale representation of the total energy of the system. In QC, this is done via the representative atoms and the summation rule. In MAAD, this is done by handshaking the atomistic and continuum regions through a gradual matching of the grids. In CGMD, this is done via thermodynamically integrating out the contribution of the small scales. Hamilton’s equations are applied to the reduced Hamiltonian in order to model dynamics.
An alternative approach is to model dynamics directly. Equilibrium states are obtained as steady states of the dynamics. This is essential if energy transport is coupled with the dynamics. At the present time, this approach is much less developed than the energy-based approaches discussed earlier. So far the only general strategy seems to be that of Li and E [9], which is based on the framework of the heterogeneous multiscale method (HMM) developed by E and Engquist [10]. This is discussed next. We also discuss a related topic, namely how to impose matching conditions at the atomistic–continuum interface.
3.1. Heterogeneous Multiscale Method
In order to develop a general multiscale methodology that can handle both dynamics and finite temperature effects, Li and E [9] relied on the framework of the heterogeneous multiscale method (HMM), which has been used for designing multiscale methods for several different applications including fluids.† There are two major components in HMM: the selection of a macroscale solver, and the estimation of the needed macroscale data using the microscale solver. In general the macroscale solver should be chosen to maximize the efficiency in resolving the macroscale behavior of the system and to minimize the complexity of coupling with the microscale model. In the context of solids, our starting point for both the macroscale and microscale models is the set of universal conservation laws of mass, momentum and energy in Lagrangian coordinates:
\[ \partial_t A - \nabla_{x^0} v = 0, \qquad \rho_0\, \partial_t v + \nabla_{x^0} \cdot \sigma = 0, \qquad \rho_0\, \partial_t e + \nabla_{x^0} \cdot j = 0. \tag{13} \]
Here $A$, $v$, and $e$ are the deformation gradient, velocity, and total energy per particle, respectively, and $\rho_0$ is the density. At the macroscale level, e.g., in continuum mechanics, $\sigma$ is the first Piola–Kirchhoff stress tensor and $j$ is the energy flux. The first equation in (13) is merely a compatibility statement. The second and third equations express conservation of momentum and energy, respectively. After combining with proper constitutive relations, these equations can be used to model nonlinear elasticity, thermoelasticity, and even plasticity.
† For other applications of HMM, visit http://www.math.Princeton.edu/multiscale.
At the microscopic level, i.e., in molecular dynamics, these conservation laws continue to hold, with the stress and energy flux given in terms of the atomistic variables by
\[ \begin{aligned} \tilde{\sigma}(x^0, t) &= \frac{1}{2} \sum_{i \neq j} f\bigl(x_i(t) - x_j(t)\bigr) \otimes \bigl(x_i^0 - x_j^0\bigr) \int_0^1 \delta\bigl(x^0 - (x_j^0 + \lambda (x_i^0 - x_j^0))\bigr)\, d\lambda, \\ \tilde{j}(x^0, t) &= \frac{1}{4} \sum_{i \neq j} \bigl(v_i(t) + v_j(t)\bigr) \cdot f\bigl(x_j - x_i\bigr)\, \bigl(x_i^0 - x_j^0\bigr) \int_0^1 \delta\bigl(x^0 - (x_j^0 + \lambda (x_i^0 - x_j^0))\bigr)\, d\lambda. \end{aligned} \tag{14} \]
Here, for simplicity, we have provided these expressions only for the case when the atomistic potential is a pair potential, $V = \frac{1}{2} \sum_{i \neq j} \phi\bigl(x_i(t) - x_j(t)\bigr)$ and $f = -\nabla \phi$. It is well known that pair potentials are quite inadequate for modeling solids, but one can find the formulas for more general potentials in Ref. [9]. These conservation laws suggest a new coupling strategy in the HMM framework at the level of fluxes: the macroscopic variables can be used as constraints for the atomistic system, and the needed constitutive data, the fluxes, can be obtained from the atomistic model via ensemble or time averaging after the microscale system equilibrates. This is the method proposed in Ref. [9]. Compared with QC or CGMD, HMM is more of a top-down approach in that it starts with an incomplete macroscale model and uses the microscale model as a supplement to provide the missing data, the fluxes. In QC or CGMD, one starts with a full atomistic description with all the physical details. A coarse-graining procedure is then applied to remove the unnecessary data in order to arrive at a coarse-grained model. We next describe the details of the HMM procedure.
3.1.1. Macroscale solver
Since the macroscale model is a conservation law, the macroscale solver is a method for conservation laws. Although there are plenty of methods available for conservation laws, e.g., Ref. [11], many of them involve the computation of the Jacobian of the flux functions, and this dramatically increases the computational complexity in a coupled multiscale method when the continuum equation is not explicitly known. An exception is the central scheme of Lax–Friedrichs type, such as Ref. [12], which is formulated over a staggered grid. As it turns out, this method can be easily coupled with molecular dynamics simulations.
We first write the conservation laws in the generic form,
\[ u_t + f_x = 0. \tag{15} \]
We confine our discussion to one-dimensional continuum models, since the extension to higher dimensions is straightforward. A (macro) staggered grid is laid out as in Fig. 2. The first-order central scheme represents the solution by piecewise constants, which are the average values over each cell:
\[ u_k^n = \frac{1}{\Delta x} \int_{x_{k-1/2}}^{x_{k+1/2}} u(x, t^n)\, dx. \]
Integration over $[x_k, x_{k+1}] \times [t^n, t^{n+1})$ leads to the following scheme,
\[ u_{k+1/2}^{n+1} = \frac{u_k^n + u_{k+1}^n}{2} - \frac{\Delta t}{\Delta x} \bigl( f_{k+1}^n - f_k^n \bigr), \tag{16} \]
Figure 2. A schematic illustration of the numerical procedure for one macro time step: starting from piecewise constant solutions $\{u_k^n\}$, one integrates (15) in time and over the cell $[x_k, x_{k+1}]$. The time step $\Delta t$ is chosen in such a way that the waves coming from $x_{k+1/2}$ will not reach $x_k$, and thus for $t \in [t^n, t^{n+1})$, $u(x_k, t) = u_k^n$. To obtain the local flux, we perform an MD simulation using $u_k^n$ as constraints. The needed flux is then extracted from the MD fluxes via time averaging.
where
\[ f_k^n = \frac{1}{\Delta t} \int_{t^n}^{t^{n+1}} f(x_k, t)\, dt. \]
This is then approximated by numerical quadrature such as the mid-point formula. A simple choice is $f_k^n \approx f(x_k, t^n)$. The stability of such a scheme, which usually manifests itself in the form of a constraint on the size of $\Delta t$, can be appreciated by considering the adiabatic case $f = f(u)$: if we choose the time step $\Delta t$ small enough, the waves generated at the cell interfaces $\{x_{k+1/2}\}$ will not arrive at the grid points $\{x_k\}$, and therefore the solution as well as the fluxes at the grid points will not change until the next time step. With this specific choice of the macro-solver, the HMM procedure is illustrated schematically in Fig. 2. At each macro time step, the scheme (16) requires as input the fluxes at the grid points $x_k$ to complete the time integration. These flux values are obtained by performing local MD simulations that are consistent with the local macroscale state $(A, v, e)$. Equation (13) is then integrated to the next time step using (16).
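One macro step of the staggered scheme (16), with the simple choice $f_k^n \approx f(x_k, t^n)$, can be sketched as follows. The callback `flux_at_node` is a hypothetical placeholder for the flux estimate that HMM would obtain from a local MD simulation; here it is replaced by an analytic flux for scalar linear advection, which is an assumption made purely so that the example is self-contained.

```python
def staggered_lf_step(u, flux_at_node, dt, dx):
    """One step of Eq. (16): values u[k] at x_k -> values at x_{k+1/2}.

    flux_at_node(u_k) stands in for the locally estimated flux f_k^n;
    in HMM this would come from a constrained MD simulation.
    """
    f = [flux_at_node(uk) for uk in u]
    lam = dt / dx
    return [
        0.5 * (u[k] + u[k + 1]) - lam * (f[k + 1] - f[k])
        for k in range(len(u) - 1)
    ]


def advection_flux(u, c=1.0):
    """Assumed analytic flux f(u) = c * u (scalar linear advection)."""
    return c * u


# A step profile advected to the right with speed c = 1.
# dt / dx = 0.5 keeps waves from the interfaces away from the grid points.
u_old = [1.0, 1.0, 0.0, 0.0]
u_new = staggered_lf_step(u_old, advection_flux, dt=0.5, dx=1.0)
print(u_new)
```

The returned list lives on the staggered points $x_{k+1/2}$ and is one entry shorter than the input; a second application of the same function returns values on grid points of the original type. No Jacobian of the flux is needed, which is what makes this solver convenient when the flux is only available through microscale simulation.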
3.1.2. Reconstruction
Next we discuss how to set up the atomistic simulation to estimate the local fluxes. The first step is to reconstruct initial MD configurations that are consistent with the local macro state variables $(A, v, e)$. The shape of the MD cell, and hence the new basis, is set up from the local deformation tensor. For example, if the undeformed cell has basis $E$, then the new basis is $\tilde{E} = AE$. Assuming that the deformation is uniform within the cell, the new basis then determines the displacement of each atom. From the atomic positions we can compute the potential energy. After subtracting the potential energy and the kinetic energy associated with the mean velocity from the total energy $e$, we obtain the temperature by assuming that the remaining energy is due to thermal fluctuations. Using the mean velocity and the temperature, we initialize the velocities of the atoms from the Maxwell distribution.
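The reconstruction steps above can be sketched for a one-dimensional periodic chain. This is a minimal illustration under explicit assumptions that are not taken from Ref. [9]: the pair energy is harmonic, all of the remaining energy is assigned to kinetic energy with $\frac{1}{2} k_B T$ per atom, and the sampled velocities are shifted and rescaled so that the constraints hold exactly. All names are hypothetical.

```python
import math
import random


def reconstruct_chain(n, a0, A, v_mean, e_total, pair_energy,
                      mass=1.0, kB=1.0, seed=0):
    """Positions and velocities of a 1D chain consistent with (A, v, e)."""
    rng = random.Random(seed)
    spacing = A * a0                      # uniform deformation of the lattice
    x = [i * spacing for i in range(n)]
    e_pot = pair_energy(spacing)          # one bond per atom in a periodic chain
    e_thermal = e_total - e_pot - 0.5 * mass * v_mean ** 2
    if e_thermal < 0.0:
        raise ValueError("total energy is below that of the cold deformed lattice")
    # Assumption: the remaining energy is kinetic, (1/2) kB T per atom in 1D.
    temperature = 2.0 * e_thermal / kB
    sigma = math.sqrt(kB * temperature / mass)
    dv = [rng.gauss(0.0, sigma) for _ in range(n)]   # Maxwell (Gaussian) sample
    mean = sum(dv) / n
    dv = [d - mean for d in dv]                      # remove sampling drift
    ke = 0.5 * mass * sum(d * d for d in dv) / n
    scale = math.sqrt(e_thermal / ke) if ke > 0.0 else 0.0
    v = [v_mean + scale * d for d in dv]             # match e exactly
    return x, v, temperature


def harmonic(r, k=4.0, r0=1.0):
    """Assumed pair energy phi(r) = (k/2) (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2


x, v, T = reconstruct_chain(n=200, a0=1.0, A=1.05, v_mean=0.2,
                            e_total=0.1, pair_energy=harmonic)
e_check = harmonic(1.05) + 0.5 * sum(vi * vi for vi in v) / len(v)
print(T, sum(v) / len(v), e_check)
```

In a real simulation the system would still have to be equilibrated after this initialization, since part of the kinetic energy assigned here is exchanged with potential energy once the atoms start to move.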
3.1.3. Boundary conditions
Of central importance is the boundary condition imposed on the microscopic system in order to guarantee consistency with the local macroscale variables. In the case when the system is homogeneous, the most convenient boundary condition is the periodic boundary condition. The MD cell is first deformed according to the deformation gradient $A$. Then the cell is periodically extended to the whole space.
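The two operations described here, deforming the cell and extending it periodically, can be sketched in two dimensions. The convention that the columns of the basis matrix are the cell vectors, and the helper names, are assumptions made for this illustration.

```python
import math


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def deformed_basis(A, E):
    """New cell basis A E; the columns of E are the undeformed cell vectors."""
    return matmul2(A, E)


def wrap(point, basis):
    """Map a point to its periodic image inside the cell spanned by basis."""
    (a, b), (c, d) = basis
    det = a * d - b * c
    # Fractional coordinates s = basis^{-1} point
    s0 = (d * point[0] - b * point[1]) / det
    s1 = (-c * point[0] + a * point[1]) / det
    s0 -= math.floor(s0)                  # reduce to [0, 1)
    s1 -= math.floor(s1)
    return (a * s0 + b * s1, c * s0 + d * s1)


E = ((4.0, 0.0), (0.0, 4.0))              # undeformed square cell
A = ((1.1, 0.2), (0.0, 1.0))              # stretch plus simple shear
cell = deformed_basis(A, E)
image = wrap((4.9, 0.5), cell)            # one cell vector away from (0.5, 0.5)
print(cell, image)
```

Working in fractional coordinates keeps the wrapping independent of the cell shape, so the same routine applies for any invertible deformation gradient.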
3.1.4. Estimating the data
The needed macroscale fluxes are estimated from the MD results by time averaging. To reduce the transient effects, we use a kernel that puts less weight on the transient period, e.g.,
\[ \langle A \rangle_K = \lim_{t \to +\infty} \frac{1}{t} \int_0^t K\Bigl(1 - \frac{s}{t}\Bigr) A(s)\, ds, \qquad K(\theta) = 1 - \cos(2\pi\theta). \tag{17} \]
Experience suggests that using this kernel substantially improves the quality of the data compared with straightforward averaging.
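The effect of the kernel can be checked on a synthetic signal. The sketch below evaluates the finite-time version of (17) with the midpoint rule and compares it with a plain time average; the test signal (a steady value of 1 plus an exponentially decaying transient) is an assumption chosen for illustration, not MD data.

```python
import math


def kernel(theta):
    """K(theta) = 1 - cos(2 pi theta), as in Eq. (17)."""
    return 1.0 - math.cos(2.0 * math.pi * theta)


def kernel_average(samples, dt):
    """Midpoint-rule estimate of (1/t) int_0^t K(1 - s/t) A(s) ds.

    samples[i] is A evaluated at s = (i + 0.5) * dt, and t = len(samples) * dt.
    """
    t = len(samples) * dt
    total = sum(
        kernel(1.0 - (i + 0.5) * dt / t) * a for i, a in enumerate(samples)
    )
    return total * dt / t


def plain_average(samples):
    return sum(samples) / len(samples)


dt = 0.01
signal = [1.0 + math.exp(-(i + 0.5) * dt) for i in range(1000)]  # transient at start
k_avg = kernel_average(signal, dt)
p_avg = plain_average(signal)
print(k_avg, p_avg)
```

Because the kernel integrates to one over $[0, 1]$, a constant signal is reproduced, while the vanishing weight near $s = 0$ suppresses the initial transient; on this signal the kernel average lies noticeably closer to the steady value than the plain average does.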
3.1.5. Dealing with defects
In the presence of defects, QC and MAAD refine the grid to the atomic level to account for the defect energy. This procedure is seamless but can become rather complicated when simulating dynamics. HMM instead suggests keeping the macro-grid (which might be locally refined) in the entire computational domain but performing a model refinement locally near the defects. Away from the defects, the fluxes are computed using the procedure described before, or, if an empirical model is accurate enough, one can simply compute the fluxes using the empirical model. Near the defects there are two cases to consider, depending on whether there is scale separation between the local relaxation time around the defects and the time scale for the dynamics of the defects.
In the absence of such a time scale separation, the molecular dynamics simulation around the defects has to be kept for all times. This imposes a limitation on the time scales that can be accessed using such a procedure. But if the atomistic relaxation times are very long, there is really little one can do other than follow the history of the atomistic features near the defects. Macroscale fluxes can still be computed from the microscale fluxes via time averaging. In this case, since the atomistic region near the defect is necessarily macroscopically inhomogeneous, the atomistic boundary conditions need to be modified. Li and E [9] propose using a biased Andersen thermostat in a border region that takes into account both the local mean velocity and the local temperature. Finally, the overall deformation is controlled by fixing the outermost atoms.
In the case when there is time scale separation, this procedure can be much simplified. In this case one can build the defect dynamics directly into the macro-solver, and the atomistic simulations can be localized in space and time to predict the velocity of the defects and the stress near the defects. Such a defect tracking procedure is implemented for twin boundary dynamics in Ref. [9].
3.1.6. Atomistic–continuum interface condition
One issue that has received a great deal of attention is the matching condition at the atomistic–continuum interface. In a coupled MD-continuum calculation, the MD region is meant to be very small but is inevitably at finite temperature. The phonons generated in the MD region need to be propagated out in order to keep the fluctuations in the MD region under control. This is achieved by imposing appropriate boundary conditions at the atomistic–continuum interface that limit phonon reflection. The first attempt at deriving such boundary conditions is found in Ref. [12]. Cai et al. suggested obtaining the exact linear response functions at the interface by precomputing. This strategy is in principle exact under the harmonic approximation. But it is often too expensive, since the linear response functions (which are simply Green’s functions) are quite nonlocal. When the MD region changes as a result of defect dynamics, these functions have to be computed again. Further work along this line was done later by Wagner et al., Ref. [13]. To achieve an optimal balance between efficiency and accuracy, a local method was formulated by E and Huang [14, 15] with the idea of minimizing phonon reflection, given a pre-determined stencil for the boundary condition. To explain the optimal local matching conditions, we consider the one-dimensional case where the continuum model is the simple wave equation,
\[ \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \]
and its discrete form,
\[ \frac{u_j^{n+1} - 2u_j^n + u_j^{n-1}}{\Delta t^2} = u_{j+1}^n - 2u_j^n + u_{j-1}^n, \qquad j \geq 1. \tag{18} \]
These equations can be obtained by linearizing (7). For simplicity we consider the case when the atomistic region is the semi-infinite domain defined by $x > 0$ and $j = 0$ is the boundary. To prescribe the boundary condition we express $u_0^n$ as
\[ u_0^n = \sum_{k, j \geq 0} a_{k,j}\, u_j^{n-k}, \qquad a_{0,0} = 0. \]
We start with a pre-determined set $S$ of index pairs $\{k, j\}$ outside of which we set $a_{k,j} = 0$. The set $S$ is the stencil that we choose. Choosing the right $S$ is a crucial step in this procedure. A large $S$ leads to an increase in the complexity of the algorithm, but a small $S$ may not be enough for the purpose of suppressing phonon reflection. Once $S$ is selected, the $\{a_{k,j}\}$ are chosen to minimize the total reflection in an appropriate norm. The reflection coefficient, or more generally the reflection matrix, can be obtained by looking for solutions of the form
\[ u_j^n = e^{i(n\omega\Delta t + j\xi)} + R(\xi)\, e^{i(n\omega\Delta t - j\xi)}. \]
Using (18) and the boundary condition, we obtain
\[ R(\xi) = - \frac{\sum_{k,j} a_{k,j}\, e^{i(j\xi - k\omega\Delta t)} - 1}{\sum_{k,j} a_{k,j}\, e^{-i(j\xi + k\omega\Delta t)} - 1}, \tag{19} \]
where $\omega = \omega(\xi)$ satisfies the dispersion relation
\[ \frac{1}{\Delta t} \sin\frac{\omega \Delta t}{2} = \sin\frac{\xi}{2}. \]
A similar calculation can be done for general crystal structures, in which case the phonon spectrum may consist of several branches. Having $R(\xi)$, the $a_{k,j}$ can be obtained by minimizing the total phonon reflection,
\[ \min \int_0^{\pi} W(\xi)\, |R(\xi)|^2\, d\xi, \]
with an appropriately chosen weight function $W$. In addition, constraints are needed at $\xi = 0$ in the form of $R(0) = 0$, $R'(0) = 0$, ..., to ensure accuracy at large scales. As an example, if one uses only the terms $a_{1,0}$ and $a_{1,1}$, and $W = 1$ with $R(0) = 0$ at the boundary, one has
\[ u_0^n = (1 - \Delta t)\, u_0^{n-1} + \Delta t\, u_1^{n-1}. \tag{20} \]
If instead one keeps the terms $\{a_{j,k},\ j \leq 3,\ k \leq 2\}$, the minimization leads to the following coefficients:
\[ (a_{j,k}) = \begin{pmatrix} 1.95264 & -0.074207 & -0.014903 \\ -0.95406 & 0.074904 & 0.015621 \end{pmatrix}. \]
In order to get better performance at high wave number, more coefficients (larger $S$) have to be included. The method has been applied to dislocation dynamics in the Frenkel–Kontorova model and to friction between rough crystal surfaces. It has shown promise in suppressing phonon reflection.
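The behavior of the two-term boundary condition (20) can be examined numerically. The sketch below evaluates the reflection coefficient obtained by substituting the incident-plus-reflected plane wave into the boundary condition $u_0^n = \sum a_{k,j} u_j^{n-k}$, with $\omega(\xi)$ taken from the dispersion relation of (18). The dictionary layout for the coefficients, the function names, and the value $\Delta t = 0.5$ are illustrative assumptions.

```python
import cmath
import math


def omega(xi, dt):
    """Frequency from (1/dt) sin(omega dt / 2) = sin(xi / 2); needs dt <= 1."""
    return 2.0 / dt * math.asin(dt * math.sin(0.5 * xi))


def reflection(xi, dt, coeffs):
    """Reflection coefficient R(xi) for boundary coefficients a_{k,j}.

    coeffs maps (k, j) -> a_{k,j}; k counts time levels back, j lattice sites in.
    """
    w = omega(xi, dt)
    num = sum(a * cmath.exp(1j * (j * xi - k * w * dt))
              for (k, j), a in coeffs.items()) - 1.0
    den = sum(a * cmath.exp(-1j * (j * xi + k * w * dt))
              for (k, j), a in coeffs.items()) - 1.0
    return -num / den


dt = 0.5
two_term = {(1, 0): 1.0 - dt, (1, 1): dt}   # coefficients of Eq. (20)

r_long = abs(reflection(1.0e-3, dt, two_term))     # long waves
r_short = abs(reflection(math.pi, dt, two_term))   # zone-boundary waves
print(r_long, r_short)
```

For this small stencil the reflection is negligible for long waves but becomes total at the zone boundary, which illustrates why a larger stencil is needed for good performance at high wave number. The coefficient is not evaluated at exactly $\xi = 0$, where both numerator and denominator vanish.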
4. Summary
We have based our presentation on dividing multiscale methods into energy-based and dynamics-based methods. From the viewpoint of coarse-graining, there are also two different sets of ideas. The first set of ideas, used in QC, CGMD and HMM, is to pre-define a set of coarse-grained variables. By expressing the microscopic model in terms of the coarse-grained variables, one finds a relationship that expresses the macroscale data in terms of the microscopic quantities. In QC, this relationship is (3). In CGMD, this relationship is (12). In HMM, this relationship is (14). This relationship is the starting point of the micro-macro coupling. The second set of ideas, used in MAAD and in E and Huang [14], is to divide the computational domain into macro and micro regions. Separate models are used in the different regions, and an explicit matching is used to bridge the two regions. Most existing work on multiscale modeling of solids deals with single crystals with isolated defects. Going beyond single crystals requires substantial work. Dealing with polycrystals with grain boundaries, and with plasticity involving many interacting dislocations, seems to require new ideas in coupling.
References
[1] E.B. Tadmor, M. Ortiz, and R. Phillips, “Quasicontinuum analysis of defects in crystals,” Phil. Mag. A, 73, 1529, 1996.
[2] R.E. Miller and E.B. Tadmor, “The quasicontinuum method: overview, applications and current directions,” J. Comput.-Aided Mater. Des., in press, 2003.
[3] J. Knap and M. Ortiz, “An analysis of the quasicontinuum method,” J. Mech. Phys. Solids, 49, 1899, 2001.
[4] V. Shenoy and R. Phillips, “Finite temperature quasicontinuum methods,” Mat. Res. Soc. Symp. Proc., 538, 465, 1999.
[5] F.F. Abraham, J.Q. Broughton, N. Bernstein, and E. Kaxiras, “Spanning the continuum to quantum length scales in a dynamic simulation of brittle fracture,” Europhys. Lett., 44(6), 783, 1998.
[6] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, “Concurrent coupling of length scales: methodology and application,” Phys. Rev. B, 60(4), 2391, 1999.
[7] R.E. Rudd and J.Q. Broughton, “Coarse-grained molecular dynamics and the atomic limit of finite element,” Phys. Rev. B, 58(10), R5893, 1998.
[8] R.E. Rudd and J.Q. Broughton, Unpublished, 2000.
[9] X.T. Li and W. E, “Multiscale modeling of solids,” Preprint, 2003.
[10] W. E and B. Engquist, “The heterogeneous multi-scale methods,” Comm. Math. Sci., 1(1), 87, 2002.
[11] E. Godlewski and P.A. Raviart, Numerical Approximation of Hyperbolic Systems of Conservation Laws, Springer-Verlag, New York, 1996.
[12] H. Nessyahu and E. Tadmor, “Nonoscillatory central differencing for hyperbolic conservation laws,” J. Comp. Phys., 87(2), 408, 1990.
[13] G.J. Wagner, G.K. Eduard, and W.K. Liu, “Molecular dynamics boundary conditions for regular crystal lattice,” Preprint, 2003.
[14] W. E and Z. Huang, “Matching conditions in atomistic-continuum modeling of material,” Phys. Rev. Lett., 87(13), 135501, 2001.
[15] W. E and Z. Huang, “A dynamic atomistic-continuum method for the simulation of crystalline material,” J. Comp. Phys., 182, 234, 2002.
4.14 MULTISCALE COMPUTATION OF FLUID FLOW IN HETEROGENEOUS MEDIA
Thomas Y. Hou
California Institute of Technology, Pasadena, CA, USA
There are many interesting physical problems that have multiscale solutions. These problems range from composite materials to wave propagation in random media, flow and transport through heterogeneous porous media, and turbulent flow. Computing these multiple scale solutions accurately presents a major challenge due to the wide range of scales in the solution. It is very expensive to resolve all the small scale features on a fine grid by direct numerical simulations. A natural question is whether it is possible to develop a multiscale computational method that captures the effect of small scales on the large scales using a coarse grid, but does not require resolving all the small scale features. Such a multiscale method can offer significant computational savings.
We use the immiscible two-phase flow in heterogeneous porous media and incompressible flow as examples to illustrate some key issues in designing multiscale computational methods for fluid flows. Two-phase flows have many applications in oil reservoir simulations and environmental science problems. Through the use of sophisticated geological and geostatistical modeling tools, engineers and geologists can now generate highly detailed, three-dimensional representations of reservoir properties. Such models can be particularly important for reservoir management, as fine scale details in formation properties, such as thin, high permeability layers or thin shale barriers, can dominate reservoir behavior. The direct use of these highly resolved models for reservoir simulation is not generally feasible because their fine level of detail (tens of millions of grid blocks) places prohibitive demands on computational resources.
Therefore, the ability to coarsen these highly resolved geologic models to levels of detail appropriate for reservoir simulation (tens of thousands of grid blocks), while maintaining the integrity of the model for the purpose of flow simulation (i.e., avoiding the loss of important details), is clearly needed.
S. Yip (ed.), Handbook of Materials Modeling, 1507–1528. © 2005 Springer. Printed in the Netherlands.
In recent years, we have introduced a multiscale finite element method (MsFEM) for solving partial differential equations with multiscale solutions [1–4]. This method has been demonstrated to be effective in upscaling two-phase flows in heterogeneous porous media. The main idea of this approach is to construct local multiscale finite element base functions that capture the small scale information within each element. The small scale information is then brought to the large scales through the coupling of the global stiffness matrix. Thus, the effect of small scales on the large scales is captured correctly. In our method, the base functions are constructed by solving the governing equation locally within each coarse grid element. The local construction of the multiscale base functions offers several computational advantages, such as parallel computing and local adaptivity in computing the base functions. These advantages can be exploited in upscaling a fine grid model.
One of the central issues in many multiscale methods is to localize the subgrid small scale problems. In the context of the multiscale finite element method, it is the question of how to design proper microscopic boundary conditions for the local base functions. A naive choice of microscopic boundary conditions can lead to large errors. The nature of the numerical errors due to improperly chosen local boundary conditions depends on the type of the governing equation for the underlying physical problem. For elliptic or diffusion dominated problems, the effect of the numerical boundary layers is strongly localized. For convection dominated transport, the errors caused by improper microscopic boundary conditions can propagate over long distances and pollute the large scale physical solution. Below we will discuss multiscale methods for these two types of problems in some detail.
1. Formulation and Background
The flow and transport problems in porous media are considered in a hierarchical level of approximation. At the microscale, the solute transport is governed by the convection–diffusion equation in a homogeneous fluid. However, for porous media, it is very difficult to obtain full information about the pore structure. A certain averaging procedure has to be carried out, and the porous medium becomes a continuum with certain macroscopic properties, such as the porosity and permeability. With modern geostatistical techniques, one can routinely generate a fine grid model as large as tens of millions of grid blocks. As a first step, one has to upscale the fine grid model to a coarse grid model consisting of tens of thousands of coarse grid blocks, while still preserving the integrity of the original fine grid model. Once the coarse grid model is obtained, it can be used many times with different boundary conditions or source distributions for the purpose of model validation and oil field management. This could reduce the computational cost significantly.
We consider a heterogeneous system which represents two-phase immiscible flow. Our interest is in the effect of permeability heterogeneity on two-phase flow. Therefore, we neglect the effect of compressibility and capillary pressure, and consider porosity to be constant. This system can be described by writing Darcy's law for each phase (all quantities are dimensionless)
\[ v_j = -\frac{k_{rj}(S)}{\mu_j}\, K \nabla p, \tag{1} \]
where $v_j$ is the Darcy velocity of phase $j$ ($j = o, w$; oil, water), $p$ is the pressure, $S$ is the water saturation, $K$ is the permeability tensor, $k_{rj}$ is the relative permeability of phase $j$, and $\mu_j$ is the viscosity of phase $j$. Darcy's law for each phase, coupled with mass conservation, can be manipulated to give the pressure and saturation equations
\[ \nabla\cdot(\lambda(S) K \nabla p) = 0, \tag{2} \]
\[ \frac{\partial S}{\partial t} + u\cdot\nabla f(S) = 0, \tag{3} \]
which can be solved subject to some appropriate initial and boundary conditions. The parameters in the above equations are given by
\[ \lambda = \frac{k_{rw}(S)}{\mu_w} + \frac{k_{ro}(S)}{\mu_o}, \tag{4} \]
\[ f(S) = \frac{k_{rw}(S)/\mu_w}{k_{rw}(S)/\mu_w + k_{ro}(S)/\mu_o}, \tag{5} \]
\[ u = v_w + v_o = -\lambda(S) K \nabla p. \tag{6} \]
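As a rough illustration of (4) and (5), the total mobility and the fractional flow function can be evaluated once relative permeability curves are chosen. The sketch below assumes quadratic curves, k_rw(S) = S^2 and k_ro(S) = (1 - S)^2, and a viscosity ratio of 5. These choices and all function names are assumptions made here for illustration; the text does not specify them.

```python
import numpy as np

# Assumed (hypothetical) model choices, not taken from the text.
MU_W = 1.0   # water viscosity (dimensionless)
MU_O = 5.0   # oil viscosity (dimensionless)

def k_rw(S):
    """Assumed quadratic relative permeability of water."""
    return S ** 2

def k_ro(S):
    """Assumed quadratic relative permeability of oil."""
    return (1.0 - S) ** 2

def total_mobility(S):
    """lambda(S) as in Eq. (4)."""
    return k_rw(S) / MU_W + k_ro(S) / MU_O

def fractional_flow(S):
    """f(S) as in Eq. (5): water mobility divided by total mobility."""
    return (k_rw(S) / MU_W) / total_mobility(S)

S = np.linspace(0.0, 1.0, 11)
lam = total_mobility(S)
f = fractional_flow(S)
```

With these assumed curves, f rises monotonically from 0 at S = 0 to 1 at S = 1, which is the typical S-shaped flux function that makes the saturation equation (3) nonlinear.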
Typically, the permeability tensor $K$ in an oil reservoir model contains many scales, or a continuous spectrum of scales, that are not separable. The variation in the permeability tensor is also very large, with the ratio between the maximum and minimum permeability being as large as $10^6$. This means that the flow velocity can be very large near certain fast flow channels. To avoid the time-stepping restriction associated with an explicit method, a fully implicit time discretization is usually employed for the saturation equation. Moreover, the geometry of the computational domain is quite complicated. All these complications make it difficult to apply standard fast iterative methods such as the multigrid method to solve the large scale elliptic equation for pressure. In fact, solving the elliptic problem seems to consume most of the computational time in practice. Thus developing an efficient multiscale adaptive method for solving the elliptic problem becomes essential in oil reservoir simulations.
2. Multiscale Finite Element Method
We first focus on developing an effective multiscale finite element method for solving the elliptic (pressure) equation with highly oscillating coefficients. We consider the following elliptic problem
\[ L_\varepsilon u := -\nabla\cdot(a^\varepsilon(x)\nabla u) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \tag{7} \]
where $a^\varepsilon(x) = (a_{ij}(x))$ is a positive definite matrix, $\Omega$ is the physical domain, and $\partial\Omega$ denotes the boundary of $\Omega$. This model equation represents a common difficulty shared by several physical problems. For flow in porous media, it is the pressure equation through Darcy's law, and the coefficient $a^\varepsilon$ represents the permeability tensor. For composite materials, it is the steady heat conduction equation, and the coefficient $a^\varepsilon$ represents the thermal conductivity. The variational problem of (7) is to seek $u \in H_0^1(\Omega)$ such that
\[ a(u, v) = f(v), \quad \forall v \in H_0^1(\Omega), \tag{8} \]
where
\[ a(u, v) = \int_\Omega a_{ij}\,\frac{\partial v}{\partial x_i}\frac{\partial u}{\partial x_j}\,dx \quad\text{and}\quad f(v) = \int_\Omega f v\,dx. \]
We have used the Einstein summation convention in the above formula. The Sobolev space $H_0^1(\Omega)$ consists of all functions whose $m$th derivatives ($m = 0, 1$) are $L^2$ integrable over $\Omega$ and which vanish on the boundary of $\Omega$. A finite element method is obtained by restricting the weak formulation (8) to a finite dimensional subspace of $H_0^1(\Omega)$. For $0 < h \le 1$, let $\mathcal{K}_h$ be a partition of $\Omega$ by a collection of triangular elements $K$ with diameter $\le h$. In each element $K \in \mathcal{K}_h$, we define a set of nodal basis functions $\{\phi_K^i,\ i = 1, \ldots, d\}$, with $d$ being the number of nodes of the element. The subscript $K$ will be omitted when bases in one element are considered. In our multiscale finite element method, the base function $\phi^i$ is constructed by solving the homogeneous equation over each coarse grid element:
\[ L_\varepsilon \phi^i = 0 \ \text{in } K \in \mathcal{K}_h. \tag{9} \]
Let $x_j$ ($j = 1, \ldots, d$) be the nodal points of $K$. As usual, we require $\phi^i(x_j) = \delta_{ij}$, where $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \ne j$. One needs to specify the boundary condition of $\phi^i$ to make (9) a well-posed problem. The simplest choice of the boundary condition for $\phi^i$ is the linear boundary condition. For now, we assume that the base functions are continuous across the boundaries of the elements, so that the finite element solution space $V^h$, which is spanned by the multiscale bases $\phi_K^i$, is a subspace of $H_0^1(\Omega)$, i.e.,
\[ V^h = \mathrm{span}\{\phi_K^i : i = 1, \ldots, d;\ K \in \mathcal{K}_h\} \subset H_0^1(\Omega). \]
Except for special cases when the coefficient $a_{ij}$ has periodic structure or is separable in the space variables, we in general need to compute the multiscale bases numerically using a subgrid mesh. The multiscale finite element method is to find the approximate solution of (8) in $V^h$, i.e., to find $u^h \in V^h$ such that
\[ a(u^h, v) = f(v), \quad \forall v \in V^h. \tag{10} \]
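A one-dimensional analogue may help make the construction concrete. In 1D the local problem (9) can be integrated in closed form: the base function on an element is the normalized running integral of 1/a, and the element stiffness is the reciprocal of the integral of 1/a over the element. The sketch below is an illustration written for this purpose, not the author's implementation. The coefficient, the right-hand side, the mesh sizes, and all function names are assumptions. It compares the multiscale bases with standard linear bases on the same coarse mesh against a fine-grid quadrature of the exact solution.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule (written out to avoid NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def cumulative_trapezoid(y, x):
    """Running trapezoidal integral, starting from zero at x[0]."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))

def galerkin_1d(a, f, n_coarse, n_sub, multiscale):
    """Nodal values of the Galerkin solution of -(a u')' = f, u(0) = u(1) = 0."""
    nodes = np.linspace(0.0, 1.0, n_coarse + 1)
    A = np.zeros((n_coarse + 1, n_coarse + 1))
    b = np.zeros(n_coarse + 1)
    for k in range(n_coarse):
        x = np.linspace(nodes[k], nodes[k + 1], n_sub + 1)  # subgrid inside element K
        if multiscale:
            # Solve (a phi')' = 0 in K with phi(x_k) = 0, phi(x_{k+1}) = 1.
            cum = cumulative_trapezoid(1.0 / a(x), x)
            phi_right = cum / cum[-1]
            stiff = 1.0 / cum[-1]                # int_K a (phi')^2 dx
        else:
            h = x[-1] - x[0]
            phi_right = (x - x[0]) / h           # standard linear base function
            stiff = trapezoid(a(x), x) / h ** 2  # int_K a (phi')^2 dx
        A[k:k + 2, k:k + 2] += stiff * np.array([[1.0, -1.0], [-1.0, 1.0]])
        b[k] += trapezoid(f(x) * (1.0 - phi_right), x)
        b[k + 1] += trapezoid(f(x) * phi_right, x)
    u = np.zeros(n_coarse + 1)
    u[1:-1] = np.linalg.solve(A[1:-1, 1:-1], b[1:-1])  # homogeneous Dirichlet data
    return nodes, u

def reference_1d(a, f, n_fine):
    """Fine-grid quadrature of the exact solution u(x) = int_0^x (C - F)/a ds."""
    x = np.linspace(0.0, 1.0, n_fine + 1)
    F = cumulative_trapezoid(f(x), x)
    C = trapezoid(F / a(x), x) / trapezoid(1.0 / a(x), x)
    return x, cumulative_trapezoid((C - F) / a(x), x)

eps = 0.01                                   # assumed small scale
a = lambda x: 1.0 / (2.0 + 1.8 * np.sin(2.0 * np.pi * x / eps))
f = lambda x: np.ones_like(x)

n_coarse, n_sub = 8, 400
nodes, u_ms = galerkin_1d(a, f, n_coarse, n_sub, multiscale=True)
_, u_lin = galerkin_1d(a, f, n_coarse, n_sub, multiscale=False)
x_ref, u_ref = reference_1d(a, f, n_coarse * n_sub)
u_exact_nodes = u_ref[::n_sub]               # reference values at the coarse nodes

err_ms = np.max(np.abs(u_ms - u_exact_nodes))
err_lin = np.max(np.abs(u_lin - u_exact_nodes))
print(f"max nodal error: MsFEM {err_ms:.2e}, linear FEM {err_lin:.2e}")
```

In 1D the multiscale bases reproduce the exact solution at the coarse nodes up to quadrature error, whereas linear bases on the same coarse mesh effectively use the arithmetic rather than the harmonic average of the coefficient. The two-dimensional situation discussed in the text is harder because the boundary condition of the base functions is no longer determined by the nodal values alone.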
In the case when $a^\varepsilon(x) = a(x, x/\varepsilon)$ with $a(x, y)$ being periodic in $y$, we have proved that the multiscale finite element method gives a convergence result uniform in $\varepsilon$ as $\varepsilon$ tends to zero [2]. Moreover, the rate of convergence in the energy norm is of the form $O(h + \varepsilon + (\varepsilon/h)^{1/2})$. We remark that the idea of using base functions governed by the differential equations has been used in the finite element community; see, e.g., [5]. The multiscale finite element method presented here is also similar in spirit to the residual-free bubble finite element method [6] and the variational multiscale method [7].
3. The Over-sampling Technique
The choice of boundary conditions in defining the multiscale bases plays a crucial role in approximating the multiscale solution. Intuitively, the boundary condition for the multiscale base function should reflect the multiscale oscillation of the solution $u$ across the boundary of the coarse grid element. To gain insight, we first consider the special case of periodic microstructures, i.e., $a^\varepsilon(x) = a(x, x/\varepsilon)$, with $a(x, y)$ being periodic in $y$. Using standard homogenization theory [8], we can perform a multiscale expansion for the base function $\phi^\varepsilon$ as follows ($y = x/\varepsilon$):
\[ \phi^\varepsilon = \phi_0(x) + \varepsilon\phi_1(x, y) + \varepsilon\theta^\varepsilon(x) + O(\varepsilon^2), \]
where $\phi_0$ is the effective solution and $\phi_1$ is the first order corrector. The boundary corrector $\theta^\varepsilon$ is chosen so that the boundary condition of $\phi^\varepsilon$ on $\partial K$ is exactly satisfied by the first three terms in the expansion. By solving a periodic cell problem for $\chi^j$,
\[ \nabla_y\cdot a(x, y)\nabla_y\chi^j = \frac{\partial}{\partial y_i} a_{ij}(x, y), \tag{11} \]
with zero mean, we can express the first order corrector $\phi_1$ as $\phi_1(x, y) = -\chi^j\,\partial\phi_0/\partial x_j$. The boundary corrector $\theta^\varepsilon$ then satisfies
\[ \nabla_x\cdot a(x, x/\varepsilon)\nabla_x\theta^\varepsilon = 0 \ \text{in } K \]
with boundary condition
\[ \theta^\varepsilon|_{\partial K} = \phi_1(x, x/\varepsilon)|_{\partial K}. \]
The oscillatory boundary condition of $\theta^\varepsilon$ induces a numerical boundary layer, which leads to the so-called resonance error [1]. To avoid this resonance error, we need to incorporate the multidimensional oscillatory information through the cell problem into our boundary condition for $\phi^\varepsilon$. If we set $\phi^\varepsilon|_{\partial K} = (\phi_0 + \varepsilon\phi_1(x, x/\varepsilon))|_{\partial K}$, then the boundary condition for $\theta^\varepsilon|_{\partial K}$ becomes identically zero. Therefore, we have $\theta^\varepsilon \equiv 0$. In this case, we have an analytic expression for the multiscale base functions $\phi^\varepsilon$ as follows:
\[ \phi^\varepsilon = \phi_0(x) + \varepsilon\phi_1(x, x/\varepsilon), \tag{12} \]
where $\phi_1(x, y) = -\chi^j(x, y)\,\partial\phi_0/\partial x_j$, $\chi^j$ is the solution of the cell problem (11), and $\phi_0$ can be chosen as the standard linear finite element base. This set of multiscale bases avoids the boundary layer effect completely. The analytic form of the multiscale base function also gives a more efficient way to construct the multiscale base functions. Numerical experiments by Andrew Westhead demonstrate a clear first order convergence of this method without suffering from resonance error. For more details, see www.ama.caltech.edu/~westhead/MSFEM.
However, for problems that do not have scale separation and periodic microstructure, we cannot use this approach to compute the multiscale base functions in general. Motivated by our convergence analysis, we propose an over-sampling method to overcome the difficulty due to scale resonance [1]. The idea is quite simple and easy to implement. Since the boundary layer in the first order corrector is thin, $O(\varepsilon)$, we can first construct intermediate sample bases in a domain with size larger than $h + \varepsilon$. Here, $h$ is the coarse grid mesh size and $\varepsilon$ is the small scale in the solution. From these intermediate sample bases, we can construct the multiscale bases over the computational element, using only the interior information of the sample bases restricted to the computational element. Specifically, let $\psi^j$ be the base functions satisfying the homogeneous elliptic equation in the larger sample domain $S \supset K$. We then form the actual base $\phi^i$ by linear combination of the $\psi^j$:
\[ \phi^i = \sum_{j=1}^{d} c_{ij}\,\psi^j. \]
The coefficients $c_{ij}$ are determined by the condition $\phi^i(x_j) = \delta_{ij}$. The corresponding $\theta^\varepsilon$ for $\phi^i$ are now free of boundary layers. By doing this, we can reduce the influence of the boundary layer in the larger sample domain on the base functions significantly. As a consequence, we obtain an improved rate of convergence [1, 3].
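The determination of the coefficients amounts to one small linear solve per element. The sketch below assumes the sample bases have already been computed on the larger domain S and evaluated at the d nodes of K. The array `psi_at_nodes` (entry [j, k] holds the value of the j-th sample base at node k) contains made-up numbers purely for illustration, and the function name is hypothetical.

```python
import numpy as np

def oversampling_coefficients(psi_at_nodes):
    """Return C with C[i, j] = c_ij such that sum_j c_ij psi^j(x_k) = delta_ik.

    psi_at_nodes[j, k] is the value of the sample base psi^j at node x_k of K.
    """
    d = psi_at_nodes.shape[0]
    # C @ Psi = I is equivalent to Psi.T @ C.T = I; solve rather than invert.
    return np.linalg.solve(psi_at_nodes.T, np.eye(d)).T

# Hypothetical nodal values for a triangular element (d = 3), illustration only.
psi_at_nodes = np.array([
    [0.80, 0.15, 0.10],
    [0.12, 0.75, 0.14],
    [0.08, 0.10, 0.76],
])
C = oversampling_coefficients(psi_at_nodes)
phi_at_nodes = C @ psi_at_nodes   # phi^i(x_k); should reproduce the identity matrix
```

The same coefficients are then applied to the sample bases restricted to K, so only the interior part of each sample base, away from the boundary layer near the edge of S, enters the final base function.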
4. Convergence and Accuracy
To assess the accuracy of our multiscale method, we compare MsFEM with a traditional linear finite element method (LFEM for short) using a subgrid mesh, $h_s = h/M$. The multiscale bases are computed using the same subgrid mesh. Note that MsFEM only captures the solution at the coarse grid $h$, while LFEM tries to resolve the solution at the fine grid $h_s$. Our extensive numerical experiments demonstrate that the accuracy of MsFEM on the coarse grid $h$ is comparable to that of the corresponding well-resolved LFEM calculation evaluated at the same coarse grid. In some cases, MsFEM gives even more accurate results than LFEM. First, we demonstrate the convergence in the case when the coefficient has scale separation and periodic structure. In Table 1, we present the result for
\[ a(x/\varepsilon) = \frac{2 + P\sin(2\pi x_1/\varepsilon)}{2 + P\cos(2\pi x_2/\varepsilon)} + \frac{2 + \sin(2\pi x_2/\varepsilon)}{2 + P\sin(2\pi x_1/\varepsilon)} \quad (P = 1.8), \tag{13} \]
\[ f(x) = -1 \quad\text{and}\quad u|_{\partial\Omega} = 0, \tag{14} \]
where $\Omega = [0, 1] \times [0, 1]$. We denote by $N$ the number of coarse grid points along each dimension, i.e., $N = 1/h$. The convergence of three different methods is compared for fixed $\varepsilon/h = 0.64$, where “L” indicates that the linear boundary condition is imposed on the multiscale base functions, “os” indicates the use of over-sampling, and LFEM stands for linear FEM. We see clearly the scale resonance in the results of MsFEM-L and the (almost) first-order convergence (i.e., no resonance) in MsFEM-os-L. Moreover, the errors of MsFEM-os-L are smaller than those of LFEM obtained on the fine grid.
Next, we illustrate the convergence of the multiscale finite element method when the coefficient is random and has neither scale separation nor periodic structure. In Fig. 1, we show the results for a log-normally distributed $a^\varepsilon$. In this case, the effect of scale resonance shows clearly for MsFEM-L, i.e., the error increases as $h$ approaches $\varepsilon$. Here $\varepsilon \sim 0.004$ roughly equals the correlation length. Even the use of oscillatory boundary conditions (MsFEM-O), which are obtained by solving a reduced 1D problem along the edge of the element, does not help much in this case. On the other hand, MsFEM with over-sampling agrees very well with the well-resolved calculation. We have also applied the multiscale finite element method to study wave propagation in random media and singularly perturbed convection-dominated diffusion problems. For more details, see Refs. [9, 10].

Table 1. Convergence for periodic case

  N     ε       MsFEM-L             MsFEM-os-L          LFEM
                ||E||_l2   Rate     ||E||_l2   Rate     MN     ||E||_l2
  16    0.04    3.54e-4             7.78e-5             256    1.34e-4
  32    0.02    3.90e-4    -0.14    3.83e-5    1.02     512    1.34e-4
  64    0.01    4.04e-4    -0.05    1.97e-5    0.96     1024   1.34e-4
  128   0.005   4.10e-4    -0.02    1.03e-5    0.94     2048   1.34e-4
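The “Rate” columns of Table 1 are consistent with the base-2 logarithm of the ratio of successive errors, since N doubles from row to row. The short check below recomputes them from the tabulated errors; the variable and function names are chosen here for illustration.

```python
import math

# Errors from Table 1 (periodic case) for N = 16, 32, 64, 128.
errors = {
    "MsFEM-L":    [3.54e-4, 3.90e-4, 4.04e-4, 4.10e-4],
    "MsFEM-os-L": [7.78e-5, 3.83e-5, 1.97e-5, 1.03e-5],
}

def observed_rates(errs):
    """log2 of the error ratio between successive grids (N doubles each time)."""
    return [math.log2(e_prev / e_next) for e_prev, e_next in zip(errs, errs[1:])]

rates = {name: [round(r, 2) for r in observed_rates(errs)]
         for name, errs in errors.items()}
print(rates)
# {'MsFEM-L': [-0.14, -0.05, -0.02], 'MsFEM-os-L': [1.02, 0.96, 0.94]}
```

A rate near 1 means the error halves when the mesh is refined by a factor of two with ε/h held fixed; the slightly negative rates for MsFEM-L reflect the resonance error, which does not decrease under such refinement.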
Figure 1. The $l^2$-norm error of the solutions using various schemes for a log-normally distributed permeability field. [Plot of the $l^2$-norm error (about 5e-4 to 1e-2) against $N$ (32 to 512) for LFEM, MsFEM-O, MsFEM-L, and MsFEM-os-L.]
5. Recovery of Small Scale Solution from Coarse Grid Solution
To solve the transport equation in the two-phase flows, we need to compute the velocity field from the elliptic equation for pressure, i.e., $u = -\lambda(S) K \nabla p$. In some applications involving isotropic media, the cell-averaged velocity is sufficient, as shown by some computations using the local upscaling methods [11]. However, for anisotropic media, especially layered ones (Fig. 2), the velocity in some thin channels can be much higher than the cell average, and these channels often have dominant effects on the transport solutions. In this case, the information about fine scale velocity becomes vitally important. Therefore, an important question for all upscaling methods is how to take those fast-flow channels into account. For MsFEM, the fine scale velocity can be easily recovered from the multiscale base functions, which provide interpolations from the coarse $h$-grid to the fine $h_s$-grid.
To illustrate that we can recover the fine grid velocity field from the coarse grid pressure calculation, we use the layered medium plotted in Fig. 2. We compare the horizontal velocity fields obtained by two methods. In Fig. 3a, we plot the horizontal velocity field obtained by a fine grid ($N = 1024$) calculation. In Fig. 3b, we plot the same horizontal velocity field obtained from the coarse grid ($N = 64$) pressure calculation, using the multiscale finite element bases to interpolate the fine grid velocity field. We can see that the recovered velocity field captures the layer structure in the fine grid velocity field very well.

Figure 2. A random porosity field with layered structure.

Figure 3. (a) Fine grid horizontal velocity field, N = 1024. (b) Recovered horizontal velocity field from the coarse grid calculation (N = 64) using multiscale bases.

Further, we use the recovered fine grid velocity field to compute the saturation in time. In Fig. 4a, we plot the saturation at $t = 0.06$ obtained by the fine grid calculation. Figure 4b shows the corresponding saturation obtained using the recovered velocity field from the coarse grid calculation. Most of the detailed fine scale fingering structures in the well-resolved saturation are captured very well by the corresponding calculation using the recovered velocity field from the coarse grid pressure calculation. The agreement is quite striking.

Figure 4. (a) Fine grid saturation at t = 0.06, N = 1024. (b) Saturation computed using the recovered velocity field from the coarse grid calculation (N = 64) using multiscale bases.

We also check the fractional flow curves obtained by the two calculations. The fractional flow of the red fluid, defined as $F = \int S_{\mathrm{red}}\, u_1\,dy \big/ \int u_1\,dy$ ($S$ being the saturation, $u_1$ being the horizontal velocity component), at the right boundary is shown in Fig. 5. The top pair of curves are the solutions of the transport problem using the cell-averaged velocity obtained from a well-resolved solution and from MsFEM; the bottom pair are solutions using the well-resolved fine scale velocity and the recovered fine scale velocity from the MsFEM calculation. Two conclusions can be drawn from the comparisons. First, the cell-averaged velocity may lead to a large error in the solution of the transport equation. Second, both the recovered fine scale velocity and the cell-averaged velocity obtained from MsFEM give faithful reproductions of the respective direct numerical solutions.

Figure 5. Variation of fractional flow with time. DNS: well-resolved direct numerical solution using LFEM (N = 512). MsFEM: over-sampling is used (N = 64, M = 8). [Curves: DNS (fine), MsFEM (recovered), DNS (averaged), MsFEM (coarse); axes: fractional flow versus time.]

We remark that a finite volume version of the multiscale finite element method has been developed by Jenny et al. [12]. They also found that by updating the multiscale bases adaptively in space and time, they can approximate the well-resolved solution accurately. The percentage of the multiscale bases that need to be updated is small, only a few percent of the total number of bases [13]. In some sense, the multiscale finite element method also offers an efficient approach to capture the fine scale details using only a small fraction of the computational time required for a direct numerical simulation using a fine grid.
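The fractional flow F used above is a flux-weighted average of the saturation along the outflow boundary, which is why fast channels dominate it. The sketch below evaluates F with the trapezoidal rule for made-up boundary data: a fast channel carrying fully saturated fluid next to slow, unsaturated flow. The data and the function name are assumptions for illustration only.

```python
import numpy as np

def fractional_flow_at_boundary(S, u1, y):
    """F = (int S u1 dy) / (int u1 dy) along the outflow boundary (trapezoidal rule)."""
    w = np.diff(y)
    flux = S * u1
    num = np.sum(0.5 * (flux[1:] + flux[:-1]) * w)
    den = np.sum(0.5 * (u1[1:] + u1[:-1]) * w)
    return num / den

# Made-up boundary data: a fast channel in the middle third carries S = 1,
# while the slow remainder of the boundary carries S = 0.
y = np.linspace(0.0, 1.0, 101)
in_channel = (y > 0.335) & (y < 0.665)
u1 = np.where(in_channel, 10.0, 1.0)
S = np.where(in_channel, 1.0, 0.0)

F = fractional_flow_at_boundary(S, u1, y)
mean_S = np.sum(0.5 * (S[1:] + S[:-1]) * np.diff(y))   # plain area average of S
```

In this synthetic case F is much larger than the area average of S, because the channel carries most of the flux. This is the same mechanism by which a cell-averaged velocity, which smears out the channel, can misrepresent the fractional flow curve.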
6. Scale-up of Two-phase Flows
The multiscale finite element method has been used in conjunction with some moment closure models to obtain an upscaled method for two-phase flows. In many oil reservoir applications, the capillary pressure effect is so small that it is neglected in practice. Upscaling a convection dominated transport is difficult due to the nonlocal memory effect [14]. Here we use the upscaling method proposed in [15] to design an overall coarse grid model for the transport equation. In its simplest form, neglecting the effect of gravity, compressibility, and capillary pressure, and considering constant porosity and unit mobility, the governing equations for the flow transport in highly heterogeneous porous media can be described by the following partial differential equations
\[ \nabla\cdot(K(x)\nabla p) = 0, \tag{15} \]
\[ \frac{\partial S}{\partial t} + u\cdot\nabla S = 0, \tag{16} \]
where $p$ is the pressure, $S$ is the water saturation, $K(x) = (K_{ij}(x))$ is the relative permeability tensor, and $u = -K(x)\nabla p$ is the Darcy velocity. The work of Efendiev et al. [15] for upscaling the saturation equation involves a moment closure argument. The velocity and the saturation are separated into a local mean quantity and a small scale perturbation with zero mean. For example, the Darcy velocity is expressed as $u = u_0 + u'$ in (16), where $u_0$ is the average of the velocity $u$ over each coarse element and $u'$ is the deviation of the fine scale velocity from its coarse scale average. If one ignores the third order terms containing the fluctuations of velocity and saturation, one can obtain an average equation for the saturation $S$ as follows:
\[ \frac{\partial S}{\partial t} + u_0\cdot\nabla S = \frac{\partial}{\partial x_i}\left(D_{ij}(x, t)\,\frac{\partial S}{\partial x_j}\right), \tag{17} \]
where the diffusion coefficients $D_{ij}(x, t)$ are defined by
\[ D_{ii}(x, t) = \langle |u_i'(x)| \rangle\, L_i^0(x, t), \qquad D_{ij}(x, t) = 0 \ \text{for } i \ne j, \]
where $\langle |u_i'(x)| \rangle$ stands for the average of $|u_i'(x)|$ over each coarse element. The function $L_i^0(x, t)$ is the length of the coarse grid streamline in the $x_i$ direction which starts at time $t$ at point $x$, i.e.,
\[ L_i^0(x, t) = \int_0^t y_i(s)\,ds, \]
where $y(s)$ is the solution of the following system of ODEs
\[ \frac{dy(s)}{ds} = u_0(y(s)), \qquad y(t) = x. \]
Note that the hyperbolic equation (16) is now replaced by a convection–diffusion equation. One should note that the induced diffusion term is history dependent. In some sense, it captures the nonlocal history dependent memory effect described by Tartar in the simple shear flow problem [14]. The multiscale finite element method can be readily combined with the above upscaling model for the saturation equation. The local fine grid velocity $u'$ can be reconstructed from the multiscale finite element bases. We perform a coarse grid computation of the above algorithm on the coarse 64 × 64 mesh using a mixed multiscale finite element method [4]. The fractional flow curve using the above algorithm is depicted in Fig. 6. It gives excellent agreement with the “exact” fractional flow curve, which is obtained using a fine 1024 × 1024 mesh.

Figure 6. The accuracy of the coarse grid algorithm. The solid line is the well-resolved fractional flow curve. The dash-dotted line is the fractional flow curve using the above coarse grid algorithm. [Axes: F(t) versus t.]

Upscaling the two-phase flow is more difficult due to the dynamic coupling between the pressure and the saturation. One important observation is that the fluctuation in saturation is relatively small away from the oil/water interface. In this region, the multiscale bases are essentially the same as those generated by the corresponding one-phase flow (i.e., $\lambda = 1$). These base functions are time independent. In practice, we can design an adaptive strategy to update the multiscale bases in space and time. The percentage of multiscale bases that need to be updated is relatively small (a few percent of the total number of bases) [13]. The base functions that need to be updated are mostly near the interface separating the oil from the water. For those coarse grid cells far from the interface, there is little dynamic change in mobility. The upscaling of the saturation equation based on the moment closure argument can be generalized to the two-phase flow, with the enhanced diffusivity depending on the local small scale velocity field [15]. As we mentioned before, the fluctuation of the velocity field $u'$ can be accurately recovered from the coarse grid computation by using local multiscale bases.
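The splitting of a fine scale velocity into a coarse-block mean and a zero-mean fluctuation, which underlies the moment closure argument of this section, can be sketched as follows. The field below is synthetic (log-normal random numbers standing in for one velocity component), and the function name and block size are assumptions made for illustration. The routine also returns the block average of the absolute fluctuation, which is the factor multiplying the streamline length in the definition of the diagonal diffusion coefficients.

```python
import numpy as np

def split_mean_fluctuation(u_fine, block):
    """Split a fine-grid field into coarse-block means and zero-mean fluctuations.

    u_fine has shape (n, n) with n divisible by `block`. Returns the coarse
    block averages, the fine-grid fluctuation u' = u - u0, and the block
    averages of |u'|.
    """
    n = u_fine.shape[0]
    nc = n // block
    blocks = u_fine.reshape(nc, block, nc, block)
    u0_coarse = blocks.mean(axis=(1, 3))                   # one value per coarse block
    u0 = np.repeat(np.repeat(u0_coarse, block, axis=0), block, axis=1)
    u_prime = u_fine - u0
    abs_fluct = np.abs(u_prime).reshape(nc, block, nc, block).mean(axis=(1, 3))
    return u0_coarse, u_prime, abs_fluct

rng = np.random.default_rng(0)
u1_fine = np.exp(rng.normal(size=(64, 64)))   # synthetic stand-in for a velocity component
u0_coarse, u_prime, abs_fluct = split_mean_fluctuation(u1_fine, block=8)

# By construction the fluctuation averages to zero over every coarse block.
block_means_of_fluct = u_prime.reshape(8, 8, 8, 8).mean(axis=(1, 3))
```

In an actual computation the fine scale velocity would come from the multiscale bases rather than from random numbers; the decomposition step itself is unchanged.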
7. Multiscale Analysis for Incompressible Flow
The upscaling of the nonlinear transport equation in two-phase flows shares some of the common difficulties in deriving the effective equations for incompressible flow at high Reynolds number. The understanding of scale interactions for 3D incompressible flow has been a major challenge. For high Reynolds number flow, the degrees of freedom are so high that it is almost impossible to resolve all small scales by direct numerical simulations. Deriving an effective equation for the large scale solution is very useful in engineering applications, see e.g., [16, 17]. In deriving a large eddy simulation model, one usually needs to make certain closure assumptions. The accuracy of such closure models is hard to measure a priori. It varies from application to application. For many engineering applications, it is desirable to design a subgrid-based large scale model in a systematic way so that we can measure and control the modeling error. However, the strong nonlinear interaction of small scales and the lack of scale separation make it difficult to derive an effective equation.
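We consider the incompressible Navier–Stokes equation
\[ \mathbf{u}^\varepsilon_t + (\mathbf{u}^\varepsilon\cdot\nabla)\mathbf{u}^\varepsilon = -\nabla p^\varepsilon + \nu\Delta\mathbf{u}^\varepsilon, \tag{18} \]
\[ \nabla\cdot\mathbf{u}^\varepsilon = 0, \tag{19} \]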
with multiscale initial data $\mathbf{u}^\varepsilon(\mathbf{x}, 0) = \mathbf{u}_0(\mathbf{x})$. Here $\mathbf{u}^\varepsilon(t, \mathbf{x})$ and $p^\varepsilon(t, \mathbf{x})$ are the velocity and pressure, respectively, and $\nu$ is the viscosity. We use boldface letters to denote vector variables. For the time being, we do not consider the effect of the boundary and assume that the solution is periodic with period $2\pi$ in each dimension. For incompressible flow at high Reynolds number, small scales are generated dynamically through nonlinear interactions. In general, there is no scale separation in the solution. However, by decomposing the physical solution into a low frequency component and a high frequency component, we can formally express the solution as the sum of a large scale solution and a small scale component. This decomposition can be carried out easily in Fourier space. Further, by rearranging the order of summation in the Fourier transformation, we can express the initial condition in the following form
\[ \mathbf{u}^\varepsilon(\mathbf{x}, 0) = \mathbf{U}(\mathbf{x}) + \mathbf{W}\left(\mathbf{x}, \frac{\mathbf{x}}{\varepsilon}\right), \]
where $\mathbf{W}(\mathbf{x}, \mathbf{y})$ is periodic in $\mathbf{y}$ and has mean zero. Here $\varepsilon$ represents the cut-off wavelength in the solution, above which the solution is resolvable and below which the solution is unresolvable. We call this a reparameterization technique. The question of interest is how to derive a homogenized equation for the averaged velocity field for small but finite $\varepsilon$. If the viscosity coefficient $\nu$ is of order one, then it can be shown that the high frequency oscillations will be damped out quickly in $O(\varepsilon)$ time. Even with $\nu = O(\varepsilon)$, the cell viscosity will be of order one and the oscillatory component of the velocity field is of order $O(\varepsilon)$. In order for the oscillatory component of the velocity field to persist in time, we need to have $\nu = O(\varepsilon^2)$. In this case, the cell viscosity is zero to leading order. Since we are interested in convection dominated transport, we set $\nu = 0$ and consider only the incompressible Euler equation. The homogenization of the Euler equation with oscillating data was first studied by McLaughlin–Papanicolaou–Pironneau (MPP for short) [18]. In Ref. [18], MPP made an important assumption that the small scale oscillation is convected by the mean flow. Based on this assumption, they made the following multiscale expansion for velocity and pressure
\[ \mathbf{u}^\varepsilon(t, \mathbf{x}) = \mathbf{u}(t, \mathbf{x}) + \mathbf{w}\left(t, \mathbf{x}, \frac{t}{\varepsilon}, \frac{\boldsymbol{\theta}(t, \mathbf{x})}{\varepsilon}\right) + \varepsilon\,\mathbf{u}_1\left(t, \mathbf{x}, \frac{t}{\varepsilon}, \frac{\boldsymbol{\theta}(t, \mathbf{x})}{\varepsilon}\right) + \cdots, \]
\[ p^\varepsilon(t, \mathbf{x}) = p(t, \mathbf{x}) + q\left(t, \mathbf{x}, \frac{t}{\varepsilon}, \frac{\boldsymbol{\theta}(t, \mathbf{x})}{\varepsilon}\right) + \varepsilon\,p_1\left(t, \mathbf{x}, \frac{t}{\varepsilon}, \frac{\boldsymbol{\theta}(t, \mathbf{x})}{\varepsilon}\right) + \cdots, \]
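where $\mathbf{w}(t, \mathbf{x}, \tau, \mathbf{y})$, $\mathbf{u}_1(t, \mathbf{x}, \tau, \mathbf{y})$, $q$, and $p_1$ are assumed to be periodic in both $\mathbf{y}$ and $\tau$, and the phase $\boldsymbol{\theta}$ is convected by the mean velocity field $\mathbf{u}$:
$$ \frac{\partial \boldsymbol{\theta}}{\partial t} + \mathbf{u} \cdot \nabla_x \boldsymbol{\theta} = 0, \quad \boldsymbol{\theta}(0, \mathbf{x}) = \mathbf{x}. \tag{20} $$
By substituting the above multiscale expansions into the Euler equation and equating coefficients of the same order, MPP obtained a homogenized equation for $(\mathbf{u}, p)$, and a periodic cell problem for $(\mathbf{w}(t, \mathbf{x}, \tau, \mathbf{y}), q(t, \mathbf{x}, \tau, \mathbf{y}))$. On the other hand, it is not clear whether the resulting cell problem for $\mathbf{w}$ and $q$ has a unique solution that is periodic in both $\mathbf{y}$ and $\tau$. Additional assumptions were imposed on the solution of the cell problem in order to derive a variant of the $k$–$\epsilon$ model. Understanding how the small scale solution is propagated dynamically is clearly very important in deriving the homogenized equation. Motivated by the work of MPP, we have recently developed a multiscale analysis for the incompressible Euler equation with multiscale solutions [19, 20]. Our study shows that the small scale oscillations are convected by the full oscillatory velocity field, not just the mean velocity:
$$ \frac{\partial \boldsymbol{\theta}^\epsilon}{\partial t} + \mathbf{u}^\epsilon \cdot \nabla_x \boldsymbol{\theta}^\epsilon = 0, \quad \boldsymbol{\theta}^\epsilon(0, \mathbf{x}) = \mathbf{x}. \tag{21} $$
This is clear for the 2D Euler equation since the vorticity, $\omega^\epsilon$, is conserved along the characteristics,
$$ \omega^\epsilon(t, \mathbf{x}) = \omega_0\left(\boldsymbol{\theta}^\epsilon(t, \mathbf{x}), \frac{\boldsymbol{\theta}^\epsilon(t, \mathbf{x})}{\epsilon}\right), $$
where $\omega_0(\mathbf{x}, \mathbf{x}/\epsilon)$ is the initial vorticity, which is of order $O(1/\epsilon)$. A similar conclusion can be drawn for the 3D Euler equation. Now the multiscale structure of $\boldsymbol{\theta}^\epsilon(t, \mathbf{x})$ is coupled to the multiscale structure of $\mathbf{u}^\epsilon$. In some sense, we embed multiscale structure within multiscale expansions. It is quite a challenge to unfold the multiscale solution structure. A naive multiscale expansion for $\boldsymbol{\theta}^\epsilon$ may lead to the generation of an infinite number of scales. Motivated by the above analysis, we look for multiscale expansions of the velocity field and the pressure of the following form
$$ \mathbf{u}^\epsilon(t, \mathbf{x}) = \mathbf{u}(t, \mathbf{x}) + \mathbf{w}(t, \boldsymbol{\theta}(t, \mathbf{x}), \tau, \mathbf{y}) + \epsilon\, \mathbf{u}^{(1)}(t, \boldsymbol{\theta}(t, \mathbf{x}), \tau, \mathbf{y}) + \cdots, \tag{22} $$
$$ p^\epsilon(t, \mathbf{x}) = p(t, \mathbf{x}) + q(t, \boldsymbol{\theta}(t, \mathbf{x}), \tau, \mathbf{y}) + \epsilon\, p^{(1)}(t, \boldsymbol{\theta}(t, \mathbf{x}), \tau, \mathbf{y}) + \cdots, \tag{23} $$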
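where $\tau = t/\epsilon$ and $\mathbf{y} = \boldsymbol{\theta}(t, \mathbf{x})/\epsilon$. We assume that $\mathbf{w}$ and $q$ have zero mean with respect to $\mathbf{y}$. The phase function $\boldsymbol{\theta}^\epsilon$ is defined in (21) and it has the following multiscale expansion:
$$ \boldsymbol{\theta}^\epsilon = \boldsymbol{\theta}(t, \mathbf{x}) + \epsilon\, \boldsymbol{\theta}^{(1)}\left(t, \boldsymbol{\theta}(t, \mathbf{x}), \tau, \frac{\boldsymbol{\theta}(t, \mathbf{x})}{\epsilon}\right) + \cdots. \tag{24} $$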
This particular form of multiscale expansion was suggested by a corresponding Lagrangian multiscale analysis [19]. If one tried to expand $\boldsymbol{\theta}^\epsilon$ naively as a function of $\mathbf{x}/\epsilon$ and $t/\epsilon$, one would find that an infinite number of scales is generated at $t > 0$ and would not be able to obtain a well-posed cell problem. Expanding the Jacobian matrix, we get $\nabla_x \boldsymbol{\theta}^\epsilon = B^{(0)} + \epsilon B^{(1)} + \cdots$. Substituting the expansion into the Euler equation and matching the terms of the same order, we obtain the following homogenized equation
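$$ \partial_t \mathbf{u} + \mathbf{u} \cdot \nabla_x \mathbf{u} + \nabla_x \cdot \langle \mathbf{w}\mathbf{w} \rangle = -\nabla_x p, \quad \mathbf{u}|_{t=0} = \mathbf{U}(\mathbf{x}), \tag{25} $$
$$ \nabla_x \cdot \mathbf{u} = 0, \tag{26} $$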
where $\langle \mathbf{w}\mathbf{w} \rangle$ stands for the space-time average in $(\mathbf{y}, \tau)$, and $\mathbf{w}\mathbf{w}$ stands for a matrix whose entry at the $i$th row and $j$th column is $w_i w_j$. The equation for $\mathbf{w}$ is given by
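$$ \partial_\tau \mathbf{w} + B^{(0)} \nabla_y q = 0, \quad \tau > 0; \qquad (B^{(0)} \nabla_y) \cdot \mathbf{w} = 0, \tag{27} $$
$$ \mathbf{w}|_{\tau=0} = \mathbf{W}(\mathbf{x}, \mathbf{y}), \quad t = 0. \tag{28} $$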
Moreover, we can derive the evolution equations for θ and θ(1) as follows
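$$ \partial_t \boldsymbol{\theta} + (\mathbf{u} \cdot \nabla_x) \boldsymbol{\theta} = 0, \quad \boldsymbol{\theta}|_{t=0} = \mathbf{x}, \tag{29} $$
$$ \partial_\tau \boldsymbol{\theta}^{(1)} + (\mathbf{w} \cdot \nabla_x) \boldsymbol{\theta} = 0, \quad \boldsymbol{\theta}^{(1)}|_{\tau=0} = 0. \tag{30} $$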
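From $\boldsymbol{\theta}$ and $\boldsymbol{\theta}^{(1)}$, we can compute the Jacobian matrix $B^{(0)}$ as follows:
$$ B^{(0)} = (I - D_y \boldsymbol{\theta}^{(1)})^{-1} \nabla_x \boldsymbol{\theta}. \tag{31} $$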
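As a minimal sketch of how (31) might be evaluated at a single point, the snippet below solves the linear system $(I - D_y \boldsymbol{\theta}^{(1)}) B^{(0)} = \nabla_x \boldsymbol{\theta}$ instead of forming the inverse explicitly. The function name and the sample 2 × 2 matrices are hypothetical and are not taken from the computations reported here.

```python
import numpy as np

def leading_order_jacobian(grad_x_theta, dy_theta1):
    """Return B0 = (I - D_y theta1)^{-1} grad_x theta at one point, as in (31)."""
    d = grad_x_theta.shape[0]
    # Solve (I - D_y theta1) B0 = grad_x theta rather than inverting the matrix.
    return np.linalg.solve(np.eye(d) - dy_theta1, grad_x_theta)

# Hypothetical sample values for a 2D flow (illustration only).
grad_x_theta = np.array([[1.0, 0.2], [-0.1, 0.9]])
dy_theta1 = np.array([[0.05, -0.02], [0.03, 0.01]])
B0 = leading_order_jacobian(grad_x_theta, dy_theta1)
```

When $D_y \boldsymbol{\theta}^{(1)}$ vanishes, the routine simply returns $\nabla_x \boldsymbol{\theta}$, which is a convenient sanity check.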
To check the convergence of our multiscale analysis, we compare the computational result obtained by solving the homogenized equation with that obtained by a well-resolved direct numerical simulation (DNS). Further, we use the first two terms in the multiscale expansion for the velocity field to reconstruct the fine grid velocity field. The initial velocity field is generated in Fourier space by imposing a power-law decay on the velocity field with a random phase perturbation in each Fourier mode. For this initial condition, we choose $\epsilon = 0.05$. In Fig. 7a, we plot the initial horizontal velocity field on the fine mesh. The corresponding coarse grid velocity field is plotted in Fig. 7b. As we see in the spectrum plot in Fig. 9, there is no scale separation in the solution. We compare the computation obtained by the homogenized equation with that obtained by DNS at $t = 0.5$ in Fig. 8. We use spectral interpolation to reconstruct the fine grid velocity field as a sum of the homogenized solution $\mathbf{u}$ and the cell velocity field $\mathbf{w}$. We can see that the reconstructed velocity field (plotted only on the coarse grid) captures very well the fine grid velocity field obtained by DNS using a 512 × 512 grid. We also compare the accuracy
[Figure 7. Horizontal velocity fields at t = 0. (a) u + w on the fine grid; (b) u + w on the coarse grid.]
[Figure 8. Horizontal velocity fields at t = 0.5. (a) u + w (DNS, fine grid); (b) u + w (interpolated on coarse grid).]
in the Fourier space, which is given in Fig. 9b. The agreement between the well-resolved solution and the reconstructed solution from the homogenized equation is excellent in both low frequencies and high frequencies. Further, we compare the mean velocity field obtained by the homogenized equation with that obtained by direct simulation using a low pass filter. The results are plotted in Figs. 10 and 11, respectively. We can see that the agreement between the two calculations is very good up to t = 1.0. Similar results are obtained for longer time calculations. The above multiscale analysis can be generalized to problems with general multiscale initial data without scale separation and periodic structure. This can be done by using the reparameterization technique in the Fourier space, which we described earlier for the initial velocity. This reparameterization technique
[Figure 9. Spectrum of velocity fields at t = 0 and t = 0.5 respectively. (a) Spectrum of velocity at t = 0; (b) spectrum of velocity at t = 0.5. Legend entries: DNS (512 × 512) and U + W (512 × 512).]
[Figure 10. Mean velocity fields at t = 1.0. (a) Mean flow u (DNS, fine grid), filter k = 0.01; (b) mean flow u (coarse grid), filter k = 0.01.]
can be used repeatedly in time. The dynamic reparameterization also accounts for the dynamic interactions between the large and small scales. The difficulty associated with finding the local microscopic boundary condition can be overcome. Preliminary computational results show that the multiscale method can accurately capture the large scale solution and the spectral property of the small scale solution for relatively long-time computations. Our ultimate goal is to use the multiscale analysis to design an effective coarse grid model that can accurately capture the large scale behavior but with a computational cost comparable to the traditional large eddy simulation
[Figure 11. Cross-section of the mean velocity fields at t = 1.0. Panels (a) and (b) show cross-sections of the mean flow u for two filter scales. Legend entries: t = 1.0 DNS, t = 1.0 two-scale, t = 0.]
(LES) models [16, 17]. To achieve this, we need to take into account the special structures in the fully mixed flow, such as homogeneity and possible local self-similarity of the flow in the interior of the domain. When the flow is fully mixed, we expect that the Reynolds stress term, i.e., $\langle \mathbf{w}\mathbf{w} \rangle$, reaches a statistical equilibrium relatively fast. As a consequence, we may need to solve the cell problem in $\tau$ for only a small number of time steps after updating the effective velocity in one coarse grid time step. Moreover, for homogeneous flow we need not solve the cell problem on every coarse grid cell. It should be sufficient to solve one or a few representative cell problems for fully mixed flow and use these representative cell solutions to compute the Reynolds stress term in the homogenized velocity equation. If this can be achieved, it would lead to a significant computational saving.
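The Reynolds stress term can be illustrated with a short sketch. Assuming samples of a cell velocity field on a uniform grid in $(\tau, \mathbf{y})$, the code below forms the space-time average of $w_i w_j$. The array layout, the function name and the sample field are assumptions made for this illustration only; the sample field is not a solution of the cell problem (27)–(28).

```python
import numpy as np

def reynolds_stress(w):
    """Average w_i w_j over all leading axes (tau, y1, y2); w has shape (..., d)."""
    d = w.shape[-1]
    flat = w.reshape(-1, d)               # one row per (tau, y) sample
    return flat.T @ flat / flat.shape[0]  # d x d matrix of averages <w_i w_j>

# Hypothetical cell field on a uniform (tau, y1, y2) grid: zero mean in y,
# divergence free in y, and steady in tau.
n_tau, n_y = 8, 32
tau = np.arange(n_tau) / n_tau
y = 2 * np.pi * np.arange(n_y) / n_y
_, Y1, Y2 = np.meshgrid(tau, y, y, indexing="ij")
w = np.stack([np.cos(Y2), np.cos(Y1)], axis=-1)
R = reynolds_stress(w)
```

If one representative cell problem is reused for many coarse cells, as suggested above, a matrix like `R` would be computed once per representative cell rather than once per coarse cell.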
8. Discussions
Multiscale methods offer several advantages over direct numerical simulations on a fine grid. First, the multiscale bases are very local. This makes it very easy to implement the method in parallel computing. Also the memory requirement is less stringent compared with direct numerical simulations since the base functions can be computed locally and independently. Secondly, we can use an effective adaptive strategy to update the multiscale bases only in the region that is needed. Thirdly, the multiscale methods offer an effective tool in deriving upscaled equations. In oil reservoir simulations, it is often the
case that multiple simulations of the same reservoir model must be carried out in order to validate the fine grid reservoir model. After the upscaled model has been obtained, it can be used repeatedly with different boundary conditions and source distributions for management purposes. In this case, the cost of computing the multiscale base functions is just an overhead. If one can coarsen the fine grid by a factor of 10 in each dimension, the computational saving of the upscaled model over the original fine model could be as large as a factor of 10 000 (three space dimensions plus time). It remains a great challenge to develop a systematic multiscale analysis to upscale convection-dominated transport in heterogeneous media. While the upscaled saturation equation based on a perturbation argument and moment closure approximation is simple and easy to implement, it is hard to estimate its modeling error as the fluctuations in velocity or saturation are not small in practice. New multiscale analysis needs to be developed to account for the long-range interaction of small scales (the memory effect). Recently, we have developed a novel multiscale analysis for the convection-dominated transport equation [2]. The analysis is based on a delicate multiscale analysis of the transport equation. The multiscale analysis for two-phase flows is not as complicated as that for the incompressible Euler equation. There is no need to introduce a multiscale phase function here, and the fast variable, $\mathbf{y} = \mathbf{x}/\epsilon$, which characterizes the small scale solution, enters only as a parameter. This makes it easier for us to generalize the analysis to problems which do not have scale separation. We remark that there are other approaches to multiscale problems, see e.g., [22–27].
Some of these methods assume that the media have periodic microstructures or scale separation, and exploit these properties in their multiscale methods, while others use wavelet approximations, renormalization group techniques, and variational methods.
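The factor of 10 000 quoted above is simple arithmetic, sketched below under the assumption that the time step is coarsened by the same factor as the grid spacing (as it would be under a CFL-type restriction). The function name is hypothetical.

```python
def upscaling_speedup(coarsening_factor, space_dims=3):
    """Cost ratio fine/coarse when every space dimension and the time step
    are coarsened by the same factor (an assumption of this sketch)."""
    return coarsening_factor ** (space_dims + 1)

speedup = upscaling_speedup(10)  # three space dimensions plus time
```

This estimate ignores the one-time cost of computing the multiscale base functions, which the text treats as overhead.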
9. Outlook
Looking forward, the main challenge in developing multiscale methods seems to be the lack of analytical tools for studying nonlinear dynamic problems that are convection-dominated and whose solutions do not have scale separation or periodic microstructures. For convection-dominated transport problems that do not have scale separation, it is very difficult to construct local multiscale base functions as we did for the elliptic or diffusion-dominated problems. Incorrect local microscopic boundary conditions for the local multiscale base functions can lead to order one errors that propagate downstream and create fluid dynamic instabilities. Systematic multiscale analysis needs to be carried out to account for the long-range interaction of small scales.
To bridge the gap between the classical homogenization theory where scale separation is required and those practical applications where we do not have scale separation, we need to develop a new type of multiscale analysis. The new multiscale analysis should not require a large separation of scales. By using the dynamic reparameterization technique, we can always divide a multiscale solution into a large scale component and a small scale component. Interaction of the large scales and small scales can be effectively modeled by using a two-scale analysis for a short time increment. Then we use the reparameterization technique to decompose the solution again into a large scale component and a small scale component. Thus interaction of large and small scale solution occurs iteratively at every small time increment. Over a long time, we can account for interactions of all scales. We are currently pursuing this approach with the hope to develop a systematic multiscale analysis for incompressible flow at high Reynolds number.
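The reparameterization step can be sketched in one space dimension. The code below assumes one particular way of rearranging the Fourier sum: each integer wavenumber is written as $k = E k_l + k_s$ with $E = 1/\epsilon$ an integer, modes with $k_l = 0$ are collected into $U(x)$, and the remaining modes into $W(x, y)$ with $y = x/\epsilon$. The splitting convention, function name and test signal are assumptions of this sketch and may differ from the implementation behind the results reported above.

```python
import numpy as np

def reparameterize(u, E):
    """Split samples of a 2*pi-periodic function into U(x) and W(x, y), y = E*x.

    Each integer wavenumber is written as k = E*kl + ks (an assumed convention);
    modes with kl == 0 form U, the rest form W, which has zero mean in y.
    """
    n = u.size
    uhat = np.fft.fft(u) / n                        # Fourier coefficients
    k = np.rint(np.fft.fftfreq(n) * n).astype(int)  # integer wavenumbers
    kl = np.rint(k / E).astype(int)                 # index attached to the fast variable y
    ks = k - E * kl                                 # slow remainder, |ks| <= E/2

    def U(xq):
        m = kl == 0
        return np.real(np.exp(1j * np.outer(xq, k[m])) @ uhat[m])

    def W(xq, yq):
        m = kl != 0
        phase = np.outer(xq, ks[m]) + np.outer(yq, kl[m])
        return np.real(np.exp(1j * phase) @ uhat[m])

    return U, W

n, E = 256, 16                                      # E plays the role of 1/epsilon
x = 2 * np.pi * np.arange(n) / n
u = np.sin(x) + 0.5 * np.cos(3 * x) + 0.1 * np.sin(40 * x) + 0.05 * np.cos(37 * x)
U, W = reparameterize(u, E)
recon = U(x) + W(x, E * x)                          # evaluate W on y = x / epsilon
```

In a time-stepping loop, a split of this kind would be reapplied after each short two-scale update, which is the iterative use of the technique described above.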
References
[1] T.Y. Hou and X. Wu, “A multiscale finite element method for elliptic problems in composite materials and porous media,” J. Comput. Phys., 134, 169–189, 1997.
[2] T.Y. Hou, X. Wu, and Z. Cai, “Convergence of a multiscale finite element method for elliptic problems with rapidly oscillating coefficients,” Math. Comput., 68, 913–943, 1999.
[3] Y.R. Efendiev, T.Y. Hou, and X. Wu, “Convergence of a nonconforming multiscale finite element method,” SIAM J. Numer. Anal., 37, 888–910, 2000.
[4] Z. Chen and T. Hou, “A mixed finite element method for elliptic problems with rapidly oscillating coefficients,” Math. Comput., 72, 541–576, 2002.
[5] I. Babuska, G. Caloz, and E. Osborn, “Special finite element methods for a class of second order elliptic problems with rough coefficients,” SIAM J. Numer. Anal., 31, 945–981, 1994.
[6] F. Brezzi and A. Russo, “Choosing bubbles for advection-diffusion problems,” Math. Models Methods Appl. Sci., 4, 571–587, 1994.
[7] T.J.R. Hughes, “Multiscale phenomena: Green’s functions, the Dirichlet-to-Neumann formulation, subgrid scale models, bubbles and the origins of stabilized methods,” Comput. Methods Appl. Mech. Engrg., 127, 387–401, 1995.
[8] A. Bensoussan, J.L. Lions, and G. Papanicolaou, Asymptotic Analysis for Periodic Structures, 1st edn., North-Holland, Amsterdam, 1978.
[9] T.Y. Hou, “Multiscale modeling and computation of incompressible flow,” In: J.M. Hill and R. Moore (eds.), Applied Mathematics Entering the 21st Century, Invited Talks from the ICIAM 2003 Congress, SIAM, Philadelphia, pp. 177–209, 2004.
[10] P. Park and T.Y. Hou, “Multiscale numerical methods for singularly-perturbed convection–diffusion equations,” Int. J. Comput. Meth., 1(1), 17–65, 2004.
[11] L.J. Durlofsky, “Numerical calculation of equivalent grid block permeability tensors for heterogeneous porous media,” Water Resour. Res., 27, 699–708, 1991.
[12] P. Jenny, S.H. Lee, and H. Tchelepi, “Multi-scale finite volume method for elliptic problems in subsurface flow simulation,” J. Comput. Phys., 187, 47–67, 2003.
[13] P. Jenny, S.H. Lee, and H. Tchelepi, “Adaptive multi-scale finite volume method for multi-phase flow and transport in porous media,” Multiscale Model. Simul., 3, 50–64, 2005.
[14] L. Tartar, “Nonlocal effects induced by homogenization,” In: F. Colombini (ed.), PDE and Calculus of Variations, Birkhäuser, Boston, pp. 925–938, 1989.
[15] Y.R. Efendiev, L.J. Durlofsky, and S.H. Lee, “Modeling of subgrid effects in coarse-scale simulations of transport in heterogeneous porous media,” Water Resour. Res., 36, 2031–2041, 2000.
[16] J. Smagorinsky, “General circulation experiments with the primitive equations,” Mon. Weather Rev., 91, 99–164, 1963.
[17] M. Germano, U. Piomelli, P. Moin, and W. Cabot, “A dynamic subgrid-scale eddy viscosity model,” Phys. Fluids A, 3, 1760–1765, 1991.
[18] D.W. McLaughlin, G.C. Papanicolaou, and O. Pironneau, “Convection of microstructure and related problems,” SIAM J. Appl. Math., 45, 780–797, 1985.
[19] T.Y. Hou, D. Yang, and K. Wang, “Homogenization of incompressible Euler equations,” J. Comput. Math., 22(2), 220–229, 2004.
[20] T.Y. Hou, D. Yang, and H. Ran, “Multiscale analysis in the Lagrangian formulation for the 2-D incompressible Euler equation,” Discr. Continuous Dynam. Sys., 12, to appear, 2005.
[21] T.Y. Hou, A. Westhead, and D. Yang, “Multiscale analysis and computation for two-phase flows in strongly heterogeneous porous media,” (in preparation), 2005.
[22] M. Dorobantu and B. Engquist, “Wavelet-based numerical homogenization,” SIAM J. Numer. Anal., 35, 540–559, 1998.
[23] T. Wallstrom, S. Hou, M.A. Christie, L.J. Durlofsky, and D. Sharp, “Accurate scale up of two-phase flow using renormalization and nonuniform coarsening,” Comput. Geosci., 3, 69–87, 1999.
[24] T. Arbogast, “Numerical subgrid upscaling of two-phase flow in porous media,” In: Z. Chen (ed.), Numerical Treatment of Multiphase Flows in Porous Media, Springer, Berlin, pp. 35–49, 2000.
[25] A. Matache, I. Babuska, and C. Schwab, “Generalized p-FEM in homogenization,” Numer. Math., 86, 319–375, 2000.
[26] L.Q. Cao, J.Z. Cui, and D.C. Zhu, “Multiscale asymptotic analysis and numerical simulation for the second order Helmholtz equations with rapidly oscillating coefficients over general convex domains,” SIAM J. Numer. Anal., 40, 543–577, 2002.
[27] W. E and B. Engquist, “The heterogeneous multi-scale methods,” Comm. Math. Sci., 1, 87–133, 2003.
4.15 CERTIFIED REAL-TIME SOLUTION OF PARAMETRIZED PARTIAL DIFFERENTIAL EQUATIONS
Nguyen Ngoc Cuong, Karen Veroy, and Anthony T. Patera
Massachusetts Institute of Technology, Cambridge, MA, USA
1. Introduction
Engineering analysis requires the prediction of (say, a single) selected “output” $s^e$ relevant to ultimate component and system performance:* typical outputs include energies and forces, critical stresses or strains, flowrates or pressure drops, and various local and global measures of concentration, temperature, and flux. These outputs are functions of system parameters, or “inputs”, $\mu$, that serve to identify a particular realization or configuration of the component or system: these inputs typically reflect geometry, properties, and boundary conditions and loads; we shall assume that $\mu$ is a $P$-vector (or $P$-tuple) of parameters in a prescribed closed input domain $\mathcal{D} \subset \mathbb{R}^P$. The input–output relationship $s^e(\mu) : \mathcal{D} \to \mathbb{R}$ thus encapsulates the behavior relevant to the desired engineering context. In many important cases, the input–output function $s^e(\mu)$ is best articulated as a (say) linear functional $\ell$ of a field variable $u^e(\mu)$. The field variable, in turn, satisfies a $\mu$-parametrized partial differential equation (PDE) that describes the underlying physics: for given $\mu \in \mathcal{D}$, $u^e(\mu) \in X^e$ is the solution of
$$ g(u^e(\mu), v; \mu) = 0, \quad \forall\, v \in X^e, \tag{1} $$
where $g$ is the weak form of the relevant partial differential equation† and $X^e$ is an appropriate Hilbert space defined over the physical domain $\Omega \subset \mathbb{R}^d$. Note in the linear case, $g(w, v; \mu) \equiv a(w, v; \mu) - f(v)$, where $a(\cdot, \cdot; \mu)$ and $f$ are continuous bilinear and linear forms, respectively; for any given $\mu \in \mathcal{D}$, $u^e(\mu) \in X^e$ now satisfies
$$ a(u^e(\mu), v; \mu) = f(v), \quad \forall\, v \in X^e \quad \text{(linear)}. \tag{2} $$
Relevant system behavior is thus described by an implicit “input–output” relationship
$$ s^e(\mu) = \ell(u^e(\mu)), \tag{3} $$
evaluation of which necessitates solution of the partial differential equation (1) or (2). Many problems in materials and materials processing can be formulated as particular instantiations of the abstraction (1) and (3) or perhaps (2) and (3). Typical field variables and associated second-order elliptic partial differential equations include temperature and steady conduction–Poisson; displacement and equilibrium or Helmholtz elasticity; {velocity, temperature} and steady Boussinesq incompressible Navier–Stokes; wavefunction and stationary Schrödinger via (say) Hartree–Fock approximation. The latter two equations are nonlinear, while the former two equations are linear; in subsequent sections we shall provide detailed examples of both nonlinear and linear problems.

* Here superscript “e” shall refer to “exact.” We shall later introduce a “truth approximation” which will bear no superscript.
† We shall restrict our attention in this paper to second-order elliptic partial differential equations; see Outlook for a brief discussion of parabolic problems.

S. Yip (ed.), Handbook of Materials Modeling, 1529–1564. © 2005 Springer. Printed in the Netherlands.

Our particular interest – or certainly the best way to motivate our approach – is “deployed” systems: components or processes that are in service, in operation, or in the field. For example, in the materials and materials processing context, we may be interested in assessment, evolution, and accommodation of a crack in a critical component of an in-service jet engine; in real-time characterization and optimization of the heat treatment protocol for a turbine disk; or in online thermal “control” of Bridgman semiconductor crystal growth. Typical computational tasks include robust parameter estimation (inverse problems) and adaptive design (optimization problems): in the former – for example, assessment of current crack length or in-process heat transfer coefficient – we must deduce inputs $\mu$ representing system characteristics based on outputs $s^e(\mu)$ reflecting measured observables; in the latter – for example, prescription of allowable load or best thermal environment – we must deduce inputs $\mu$ representing “control” variables based on outputs $s^e(\mu)$ reflecting current process objectives.
Both of these demanding activities must support an action in the presence of continually evolving environmental and mission parameters. The computational requirements on the forward problem are thus formidable: the evaluation must be real-time, since the action must be immediate; and the evaluation must be certified – endowed with a rigorous error bound – since the action must be safe and feasible. For example, in our aerospace crack example, we must predict in the field – without recourse to a lengthy computational investigation – the load that the potentially damaged structure
can unambiguously safely carry. Similarly, in our materials processing examples, we must predict in operation – in response to deduced environmental variation – temperature boundary conditions that will preserve the desired material properties. Classical approaches such as the finite element method cannot typically satisfy these requirements. In the finite element method, we first introduce a piecewise-polynomial “truth” approximation subspace $X$ ($\subset X^e$) of dimension $\mathcal{N}$. The “truth” finite element approximation is then found by (say) Galerkin projection: given $\mu \in \mathcal{D}$,
$$ s(\mu) = \ell(u(\mu)), \tag{4} $$
where $u(\mu) \in X$ satisfies
$$ g(u(\mu), v; \mu) = 0, \quad \forall\, v \in X, \tag{5} $$
or, in the linear case $g(w, v; \mu) \equiv a(w, v; \mu) - f(v)$,
$$ a(u(\mu), v; \mu) = f(v), \quad \forall\, v \in X \quad \text{(linear)}. \tag{6} $$
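To make (4) and (6) concrete, the following sketch computes a “truth” solution for a hypothetical model problem that is not taken from this chapter: steady conduction on $(0, 1)$ with unit source, homogeneous Dirichlet conditions, conductivity 1 on the left half and $\mu$ on the right half, discretized with piecewise-linear finite elements. The output is taken to be $\ell = f$. All function names are illustrative.

```python
import numpy as np

def assemble_affine(n=200):
    """P1 stiffness matrices for the left and right halves of (0, 1), plus the load."""
    h = 1.0 / n
    full = [np.zeros((n + 1, n + 1)) for _ in range(2)]
    ke = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h  # element stiffness, unit conductivity
    for e in range(n):
        q = 0 if (e + 0.5) * h < 0.5 else 1        # subdomain containing element e
        full[q][e:e + 2, e:e + 2] += ke
    inner = slice(1, n)                            # drop the two Dirichlet nodes
    return [A[inner, inner] for A in full], np.full(n - 1, h)

def truth_solve(mu, Aq, f):
    """Solve a(u, v; mu) = f(v) in the truth space and return (u, s = l(u)) with l = f."""
    A = Aq[0] + mu * Aq[1]                         # conductivity 1 (left), mu (right)
    u = np.linalg.solve(A, f)
    return u, f @ u

Aq, f = assemble_affine()
u_truth, s_truth = truth_solve(1.0, Aq, f)
```

For $\mu = 1$ the exact solution is $x(1 - x)/2$ and the output is $1/12$, which gives a simple check on the discretization. Every new $\mu$ requires a solve of size $\mathcal{N}$ (here 199, in practice far larger), which is the cost that the discussion below seeks to avoid.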
We assume that (5) and (6) are well-posed; we articulate the associated hypotheses more precisely in the context of a posteriori error estimation. We shall assume – hence the appellation “truth” – that $X$ is sufficiently rich that $u(\mu)$ (respectively, $s(\mu)$) is sufficiently close to $u^e(\mu)$ (respectively, $s^e(\mu)$) for all $\mu$ in the parameter domain $\mathcal{D}$. Unfortunately, for any reasonable error tolerance, the dimension $\mathcal{N}$ needed to satisfy this condition – even with the application of appropriate (parameter-dependent) adaptive mesh refinement strategies – is typically extremely large, and in particular much too large to provide real-time response in the deployed context. Deployed systems thus present no shortage of unique computational challenges; however, they also provide many unique computational opportunities – opportunities that must be exploited.

We first consider the “approximation opportunity.” The critical observation is that, although the field variable $u^e(\mu)$ generally belongs to the infinite-dimensional space $X^e$ associated with the underlying partial differential equation, in fact $u^e(\mu)$ resides on a very low-dimensional manifold $\mathcal{M}^e \equiv \{u^e(\mu) \mid \mu \in \mathcal{D}\}$ induced by the parametric dependence; for example, for a single parameter, $\mu \in \mathcal{D} \subset \mathbb{R}^{P=1}$, $u^e(\mu)$ will describe a one-dimensional filament that winds through $X^e$. Furthermore, the field variable $u^e(\mu)$ will typically be extremely regular in $\mu$ – the parametrically induced manifold $\mathcal{M}^e$ is very smooth – even when the field variable enjoys only limited regularity with respect to the spatial coordinate $x \in \Omega$.* In the finite element method, the approximation space $X$ is much too general – $X$ can approximate many functions that do not reside on $\mathcal{M}^e$ – and hence much too expensive. This observation presents a clear opportunity: we can effect significant dimension reduction in state space if we restrict attention to $\mathcal{M}^e$; the field variable can then be adequately approximated by a space of dimension $N \ll \mathcal{N}$. However, since manipulation of even one “point” on $\mathcal{M}^e$ is expensive, we must identify further structure.

* The smoothness in $\mu$ may be deduced from the equations for the sensitivity derivatives; the stability and continuity properties of the partial differential operator are crucial.

We thus next consider the “computational opportunities”; here there are two critical observations. The first observation derives from the mathematical formulation: very often, the parameter dependence of the partial differential equation can be expressed as the sum of $Q$ products of (known, easily evaluated) parameter-dependent functions and parameter-independent continuous forms; we shall denote this structure as “affine” parameter dependence. In our linear case, (2), affine parameter dependence reduces to
$$ a(w, v; \mu) = \sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(w, v), \tag{7} $$
for $\Theta^q : \mathcal{D} \to \mathbb{R}$ and $a^q : X \times X \to \mathbb{R}$, $1 \le q \le Q$. The second observation derives from our context: rapid deployed response perforce places a predominant emphasis on very low marginal cost – we must minimize the additional effort associated with each new evaluation $\mu \to s(\mu)$ “in the field.” These two observations present a clear opportunity: we can exploit the underlying affine parametric structure (7) to design effective offline–online computational procedures which willingly accept greatly increased initial preprocessing – offline, pre-deployed – expense in exchange for greatly reduced marginal – online, deployed – “in service” cost.*

The two essential components of our approach are (i) rapidly, uniformly (over $\mathcal{D}$) convergent reduced-basis (RB) approximations, and (ii) associated rigorous and sharp a posteriori error bounds. Both components exploit affine parametric structure and offline–online computational decompositions to provide extremely rapid deployed response – real-time prediction and associated error estimation. We next describe these essential ingredients.

* Clearly, low marginal cost implies low asymptotic average cost; our methods are thus also relevant to (non real-time) many-query multi-optimization studies – and, in fact, to any situation characterized by extensive exploration of parameter space.

2. Reduced-Basis Method

2.1. Approximation

The reduced-basis method was introduced in the late 1970s in the context of nonlinear structural analysis [1, 2] and subsequently abstracted, analyzed, and extended to a much larger class of parametrized PDEs [3, 4] – including the incompressible Navier–Stokes equations [5–7] relevant to many materials processing applications. The RB method explicitly recognizes and exploits the dimension reduction afforded by the low-dimensional and smooth parametrically induced solution manifold. We note that the RB approximation is constructed not as an approximation to the exact solution, $u^e(\mu)$, but rather as an approximation to the (finite element) truth approximation, $u(\mu)$. As already discussed, $\mathcal{N}$, the dimension of $X$, will be very large; our RB formulation and associated error estimation procedures must be stable and (online) efficient as $\mathcal{N} \to \infty$. We shall consider in this section the linear case, $g(w, v; \mu) \equiv a(w, v; \mu) - f(v)$, in which $s(\mu)$ and $u(\mu)$ are given by (4) and (6), respectively; recall that $a$ is bilinear and $f$, $\ell$ are linear. We shall consider a “primal–dual” formulation particularly well-suited to good approximation and error characterization of the output; towards this end, we introduce a dual, or adjoint, problem: given $\mu \in \mathcal{D}$, $\psi(\mu) \in X$ satisfies
$$ a(v, \psi(\mu); \mu) = -\ell(v), \quad \forall\, v \in X. \tag{8} $$
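In matrix form, and with the (assumed) convention that the truth matrix has entries $A_{ij} = a(\varphi_j, \varphi_i)$ for basis functions $\varphi_i$ of $X$, the primal problem (6) reads $A u = f$ and the dual problem (8) reads $A^T \psi = -\ell$. The sketch below uses a small random nonsymmetric but coercive matrix as a stand-in for a truth discretization; it is an illustration only, with hypothetical data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
R = rng.standard_normal((n, n))
# Hypothetical truth matrix: symmetric positive definite part plus a skew part,
# so that v^T A v > 0 for v != 0 although A is not symmetric.
A = M @ M.T + n * np.eye(n) + 0.5 * (R - R.T)
f = rng.standard_normal(n)                # load vector, entries f(phi_i)
l = rng.standard_normal(n)                # output vector, entries l(phi_i)

u = np.linalg.solve(A, f)                 # primal: a(u, v) = f(v) for all v
psi = np.linalg.solve(A.T, -l)            # dual:   a(v, psi) = -l(v) for all v
s_primal = l @ u                          # output evaluated from the primal solution
s_dual = -(f @ psi)                       # same output evaluated from the dual solution

# Symmetric matrix with l = f: the dual solution is minus the primal one.
A_sym = 0.5 * (A + A.T)
u_c = np.linalg.solve(A_sym, f)
psi_c = np.linalg.solve(A_sym.T, -f)
```

The identity `s_primal == s_dual` (up to rounding) follows from testing (6) with $v = \psi$ and (8) with $v = u$.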
Note that if $a$ is symmetric and $\ell = f$, which we shall denote “compliance,” $\psi(\mu) = -u(\mu)$. In the “Lagrangian” [4] RB approach, the field variable $u(\mu)$ is approximated by (typically) Galerkin projection onto a space spanned by solutions of the governing PDE at $N$ selected points in parameter space. For the primal problem, (6), we introduce nested parameter samples $S_N \equiv \{\mu_1^{pr} \in \mathcal{D}, \ldots, \mu_N^{pr} \in \mathcal{D}\}$ and associated nested RB approximation subspaces $W_N \equiv \mathrm{span}\{\zeta_n \equiv u(\mu_n^{pr}),\ 1 \le n \le N\}$ for $1 \le N \le N_{max}$; similarly, for the dual problem (8), we define corresponding samples $S^{du}_{N^{du}} \equiv \{\mu_1^{du} \in \mathcal{D}, \ldots, \mu^{du}_{N^{du}} \in \mathcal{D}\}$ and RB approximation spaces $W^{du}_{N^{du}} \equiv \mathrm{span}\{\zeta_n^{du} \equiv \psi(\mu_n^{du}),\ 1 \le n \le N^{du}\}$ for $1 \le N^{du} \le N^{du}_{max}$.* (Procedures for selection of good samples $S_N$, $S^{du}_{N^{du}}$ and hence spaces $W_N$, $W^{du}_{N^{du}}$ will be discussed in subsequent sections.) Our RB approximation is thus: given $\mu \in \mathcal{D}$,
$$ s_N(\mu) = \ell(u_N(\mu)) + g(u_N(\mu), \psi_{N^{du}}(\mu); \mu), \tag{9} $$
where $u_N(\mu) \in W_N$ and $\psi_{N^{du}}(\mu) \in W^{du}_{N^{du}}$ satisfy
$$ a(u_N(\mu), v; \mu) = f(v), \quad \forall\, v \in W_N, \tag{10} $$
and
$$ a(v, \psi_{N^{du}}(\mu); \mu) = -\ell(v), \quad \forall\, v \in W^{du}_{N^{du}}, \tag{11} $$
respectively.

* In actual practice, the primal and dual bases should be orthogonalized with respect to the inner product associated with the Hilbert space $X$, $(\cdot, \cdot)_X$; the algebraic systems then inherit the “conditioning” properties of the underlying partial differential equation.

We emphasize that we are interested in global approximations that are uniformly valid over a finite parameter domain $\mathcal{D}$. We note that, in the compliance case – $a$ symmetric and $\ell = f$ such that $\psi(\mu) = -u(\mu)$ – we may simply take $N^{du} = N$, $S^{du}_N = S_N$, $W^{du}_N = W_N$, and hence $\psi_N(\mu) = -u_N(\mu)$. In practice, in such a case we need never actually form the dual problem – we simply identify $\psi_N(\mu) = -u_N(\mu)$ – with a corresponding 50% reduction in computational effort. Typically [8, 9], and in some very special cases provably [10], $u_N(\mu)$, $\psi_N(\mu)$, and $s_N(\mu)$ converge to $u(\mu)$, $\psi(\mu)$, and $s(\mu)$ uniformly and extremely rapidly – thanks to the smoothness in $\mu$ – and thus we may achieve the desired accuracy for $N, N^{du} \ll \mathcal{N}$. The critical ingredients of the a priori theory are (i) the optimality properties of Galerkin projection,* and (ii) the good approximation properties of $W_N$ (respectively, $W^{du}_{N^{du}}$) for the manifold $\mathcal{M} \equiv \{u(\mu) \mid \mu \in \mathcal{D}\}$ (respectively, $\mathcal{M}^{du} \equiv \{\psi(\mu) \mid \mu \in \mathcal{D}\}$).
2.2.
Offline–Online Computational Procedure
Even though $N$, $N^{du}$ may be small, the elements of (say) $W_N$ are in some sense “large”: $\zeta_n \equiv u(\mu^{pr}_n)$ will be represented in terms of $\mathcal{N} \gg N$ truth finite element basis functions. To eliminate the $\mathcal{N}$-contamination of the deployed performance, we must consider offline–online computational procedures [7–9, 11]. For our purposes here, we continue to assume that our PDE is linear, (6), and furthermore exactly affine, (7), for some modest $Q$. In future sections we shall consider a nonlinear example as well as the possibility of nonaffine operators. To begin, we expand our reduced-basis approximation as
$$u_N(\mu) = \sum_{j=1}^{N} u_{N\,j}(\mu)\,\zeta_j, \qquad \psi_{N^{du}}(\mu) = \sum_{j=1}^{N^{du}} \psi_{N^{du}\,j}(\mu)\,\zeta^{du}_j. \tag{12}$$
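It then follows from (9) and (12) that the reduced-basis output can be expressed as
$$s_N(\mu) = \sum_{j=1}^{N} u_{N\,j}(\mu)\,\ell(\zeta_j) - \sum_{j=1}^{N^{du}} \psi_{N^{du}\,j}(\mu)\, f(\zeta^{du}_j) + \sum_{j=1}^{N} \sum_{j'=1}^{N^{du}} \sum_{q=1}^{Q} u_{N\,j}(\mu)\,\psi_{N^{du}\,j'}(\mu)\,\Theta^q(\mu)\, a^q(\zeta_j, \zeta^{du}_{j'}), \tag{13}$$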
* Galerkin optimality relies on stability of the discrete equations. The latter is only assured for coercive
problems; for noncoercive problems, Petrov–Galerkin methods may thus be preferred [12].
where the coefficients $u_{N\,j}(\mu)$, $1 \le j \le N$, and $\psi_{N^{du}\,j}(\mu)$, $1 \le j \le N^{du}$, satisfy the $N \times N$ and $N^{du} \times N^{du}$ linear algebraic systems
$$\sum_{j=1}^{N} \Big\{ \sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(\zeta_j, \zeta_i) \Big\}\, u_{N\,j}(\mu) = f(\zeta_i), \quad 1 \le i \le N, \tag{14}$$
$$\sum_{j=1}^{N^{du}} \Big\{ \sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(\zeta^{du}_i, \zeta^{du}_j) \Big\}\, \psi_{N^{du}\,j}(\mu) = -\ell(\zeta^{du}_i), \quad 1 \le i \le N^{du}. \tag{15}$$
The offline–online decomposition is now clear. For simplicity below we assume that $N^{du} = N$. In the offline stage – performed once – we first solve for the $\zeta_i$, $\zeta^{du}_i$, $1 \le i \le N$; we then form and store $\ell(\zeta_i)$, $f(\zeta_i)$, $\ell(\zeta^{du}_i)$, and $f(\zeta^{du}_i)$, $1 \le i \le N$, and $a^q(\zeta_j, \zeta_i)$, $a^q(\zeta^{du}_i, \zeta^{du}_j)$, $1 \le i, j \le N$, $1 \le q \le Q$, and $a^q(\zeta_i, \zeta^{du}_j)$, $1 \le i, j \le N$, $1 \le q \le Q$.∗ Note all quantities computed in the offline stage are independent of the parameter $\mu$. In the online stage – performed many times, for each new value of $\mu$ “in the field” – we first assemble and subsequently invert the $N \times N$ “stiffness matrices” $\sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(\zeta_j, \zeta_i)$ of (14) and $\sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(\zeta^{du}_i, \zeta^{du}_j)$ of (15) – this yields the $u_{N\,j}(\mu)$, $\psi_{N^{du}\,j}(\mu)$, $1 \le j \le N$; we next perform the summation (13) – this yields the $s_N(\mu)$. The operation count for the online stage is, respectively, $O(QN^2)$ and $O(N^3)$ to assemble (recall that the $a^q(\zeta_j, \zeta_i)$, $1 \le i, j \le N$, $1 \le q \le Q$, are pre-stored) and invert the stiffness matrices, and $O(N) + O(QN^2)$ to evaluate the output (recall that the $\ell(\zeta_j)$ are pre-stored); note that the RB stiffness matrix is, in general, full. The essential point is that the online complexity is independent of $\mathcal{N}$, the dimension of the underlying truth finite element approximation space. Since $N, N^{du} \ll \mathcal{N}$, we expect – and often realize – significant, orders-of-magnitude computational economies relative to classical discretization approaches.
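The online stage described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `rb_online` and the array layout (one stored $N \times N$ array per affine term $q$, with $N^{du} = N$) are hypothetical choices made here for illustration.

```python
import numpy as np

def rb_online(theta, A_pr, A_du, A_pd, f_pr, l_pr, f_du, l_du):
    """Online stage for one parameter value (hypothetical helper, N_du = N).

    theta      : (Q,)      values Theta^q(mu)
    A_pr       : (Q, N, N) A_pr[q, i, j] = a^q(zeta_j, zeta_i)
    A_du       : (Q, N, N) A_du[q, i, j] = a^q(zeta_i^du, zeta_j^du)
    A_pd       : (Q, N, N) A_pd[q, j, k] = a^q(zeta_j, zeta_k^du)
    f_pr, l_pr : (N,)      f(zeta_i), ell(zeta_i)
    f_du, l_du : (N,)      f(zeta_i^du), ell(zeta_i^du)
    """
    theta = np.asarray(theta, dtype=float)
    # O(Q N^2): assemble the (generally full) reduced matrices from stored data.
    K_pr = np.tensordot(theta, A_pr, axes=1)
    K_du = np.tensordot(theta, A_du, axes=1)
    K_pd = np.tensordot(theta, A_pd, axes=1)
    # O(N^3): dense solves of the primal system (14) and dual system (15).
    u = np.linalg.solve(K_pr, f_pr)
    psi = np.linalg.solve(K_du, -l_du)
    # Output summation (13).
    s = l_pr @ u - f_du @ psi + u @ K_pd @ psi
    return u, psi, s
```

Nothing in this routine touches an array whose size depends on the truth dimension $\mathcal{N}$; all such work is assumed to have been done once, offline, when the stored arrays were formed.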
3. A Posteriori Error Estimation
3.1. Motivation
* In actual practice, in the offline stage we consider $N = N_{\max}$ and $N^{du} = N^{du}_{\max}$; then, in the online stage, we extract the necessary subvectors and submatrices.
A posteriori error estimation procedures are very well developed for classical approximations of, and solution procedures for, (say) partial differential equations [13–15] and algebraic systems [16]. However, until quite recently,
there has been essentially no way to rigorously, quantitatively, sharply, and efficiently assess the accuracy of RB approximations. As a result, for any given new $\mu$, the RB (say, primal) solution $u_N(\mu)$ typically raises many more questions than it answers. Is there even a solution $u(\mu)$ near $u_N(\mu)$? This question is particularly crucial in the nonlinear context – for which in general we are guaranteed neither existence nor uniqueness. Is $|s(\mu) - s_N(\mu)| \le \epsilon_{\mathrm{tol}}$, where $\epsilon_{\mathrm{tol}}$ is the maximum acceptable error? Is a crucial feasibility condition $s(\mu) \le C$ (say, in a constrained optimization exercise) satisfied – not just for the RB approximation, $s_N(\mu)$, but also for the “true” output, $s(\mu)$? If these questions cannot be affirmatively answered, we may propose the wrong – and unsafe or infeasible – action in the deployed context. A fourth question is also important: Is $N$ too large, $|s(\mu) - s_N(\mu)| \ll \epsilon_{\mathrm{tol}}$, with an associated steep ($N^3$) penalty on computational efficiency? An overly conservative approximation may jeopardize the real-time response and associated action – with corresponding detriment to the deployed systems. We may also consider the approximation properties and efficiency of the (say, primal) parameter samples and associated RB approximation spaces, $S_N$ and $W_N$, $1 \le N \le N_{\max}$. Do we satisfy our global “acceptable error level” condition, $|s(\mu) - s_N(\mu)| \le \epsilon_{\mathrm{tol}}$, $\forall \mu \in D$, for (close to) the smallest possible value of $N$? And a related question: For our given tolerance $\epsilon_{\mathrm{tol}}$, are the RB stiffness matrices (or, in the nonlinear case, Newton Jacobians) as well-conditioned as possible – given that by construction $W_N$ will be increasingly colinear with increasing $N$? If the answers are not affirmative, then our RB approximations are more expensive (and unstable) than necessary – and perhaps too expensive to provide real-time response.
In short, the pre-asymptotic and essentially ad hoc or empirical nature of reduced-basis discretizations, the strongly superlinear scaling (with $N$, $N^{du}$) of the reduced-basis online complexity, and the particular needs of deployed real-time systems virtually demand rigorous a posteriori error estimators. Absent such certification, we must either err on the side of computational pessimism – and compromise real-time response – or err on the side of computational optimism – and risk sub-optimal, infeasible, or potentially unsafe decisions. In Refs. [8, 9, 17, 18], we introduce a family of rigorous error estimators for reduced-basis approximation of a wide class of partial differential equations (see also Ref. [19] for an alternative approach). As in almost all error estimation contexts, the enabling (trivial) observation is that, whereas a 100% error in the field variable $u(\mu)$ or output $s(\mu)$ is clearly unacceptable, a 100% or even larger (conservative) error in the error is tolerable and not at all useless; we may thus pursue “relaxations” of the equation governing the error and residual that would be bootless for the original equation governing the field variable $u(\mu)$. We now present further details for the particular case of elliptic linear problems with exact affine parameter dependence (7): the truth solution satisfies
(4), (6), and (8), and the corresponding reduced-basis approximation satisfies (9)–(11). (In subsequent sections we shall consider the extension to nonlinear problems through a detailed example; we shall also briefly discuss nonaffine problems.)
3.2. Error Bounds
We shall need several preliminary definitions. To begin, we denote the inner product and norm associated with our Hilbert space $X$ as $(w, v)_X$ and $\|v\|_X = \sqrt{(v, v)_X}$, respectively; we further define the dual norm (of any bounded linear functional $h$) as
$$\|h\|_{X'} \equiv \sup_{v \in X} \frac{h(v)}{\|v\|_X}. \tag{16}$$
We recall that we restrict our attention here to second-order elliptic partial differential equations: thus, for a scalar problem (such as heat conduction), $H^1_0(\Omega) \subset X^e \subset H^1(\Omega)$, where $H^1(\Omega)$ (respectively, $H^1_0(\Omega)$) is the usual space of derivative-square-integrable functions (respectively, derivative-square-integrable functions that vanish on $\partial\Omega$, the boundary of $\Omega$) [20]. A typical choice for $(\cdot, \cdot)_X$ is
$$(w, v)_X = \int_\Omega \nabla w \cdot \nabla v + w v, \tag{17}$$
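which is simply the standard $H^1(\Omega)$ inner product. We next introduce [12, 18] the operator $T^\mu : X \to X$ such that, for any $w$ in $X$, $(T^\mu w, v)_X = a(w, v; \mu)$, $\forall\, v \in X$. We then define
$$\sigma(w; \mu) \equiv \frac{\|T^\mu w\|_X}{\|w\|_X},$$
and note that
$$\beta(\mu) \equiv \inf_{w \in X} \sup_{v \in X} \frac{a(w, v; \mu)}{\|w\|_X \|v\|_X} = \inf_{w \in X} \sigma(w; \mu), \tag{18}$$
$$\gamma(\mu) \equiv \sup_{w \in X} \sup_{v \in X} \frac{a(w, v; \mu)}{\|w\|_X \|v\|_X} = \sup_{w \in X} \sigma(w; \mu); \tag{19}$$
we also recall that
$$\beta(\mu)\, \|w\|_X\, \|T^\mu w\|_X \le a(w, T^\mu w; \mu), \quad \forall\, w \in X. \tag{20}$$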
Here $\beta(\mu)$ is the Babuška “inf–sup” stability constant – the minimum singular value associated with our differential operator (and transpose operator) – and $\gamma(\mu)$ is the standard continuity constant. We suppose that $\gamma(\mu)$ is bounded $\forall\, \mu \in D$, and that $\beta(\mu) \ge \beta_0 > 0$, $\forall\, \mu \in D$. We note that for a symmetric, coercive bilinear form, $\beta(\mu) = \alpha_c(\mu)$, where
$$\alpha_c(\mu) \equiv \inf_{w \in X} \frac{a(w, w; \mu)}{\|w\|_X^2},$$
is the standard coercivity constant. Given our reduced-basis primal solution $u_N(\mu)$, it is readily derived that the error $e(\mu) \equiv u(\mu) - u_N(\mu) \in X$ satisfies
$$a(e(\mu), v; \mu) = -g(u_N(\mu), v; \mu), \quad \forall\, v \in X, \tag{21}$$
where $-g(u_N(\mu), v; \mu) \equiv f(v) - a(u_N(\mu), v; \mu)$ (in this linear case) is the familiar residual. It then follows from (16), (20), and (21) that
$$\|e(\mu)\|_X \le \frac{\varepsilon_N(\mu)}{\beta(\mu)}, \quad \text{where} \quad \varepsilon_N(\mu) \equiv \|g(u_N(\mu), \cdot\,; \mu)\|_{X'}, \tag{22}$$
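is the dual norm of the residual. We now assume that we are privy to a nonnegative lower bound for the inf–sup parameter, $\tilde\beta(\mu)$, such that $\beta(\mu) \ge \tilde\beta(\mu) \ge \epsilon_\beta\, \beta(\mu)$, $\forall\, \mu \in D$, where $\epsilon_\beta \in\, ]0, 1[$. We then introduce our “energy” error bound
$$\Delta_N(\mu) \equiv \frac{\varepsilon_N(\mu)}{\tilde\beta(\mu)}, \tag{23}$$
the effectivity of which is defined as
$$\eta_N(\mu) \equiv \frac{\Delta_N(\mu)}{\|e(\mu)\|_X}.$$
It is readily proven [9, 18] that, for any $N$, $1 \le N \le N_{\max}$,
$$1 \le \eta_N(\mu) \le \frac{\gamma(\mu)}{\tilde\beta(\mu)}, \quad \forall\, \mu \in D. \tag{24}$$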
From the left inequality, we deduce that $\|e(\mu)\|_X \le \Delta_N(\mu)$, $\forall\, \mu \in D$, and hence that $\Delta_N(\mu)$ is a rigorous upper bound for the true error∗ measured in the $\|\cdot\|_X$ norm – this provides certification: feasibility and “safety” are guaranteed. From the right inequality, we deduce that $\Delta_N(\mu)$ overestimates the true error by at most $\gamma(\mu)/\tilde\beta(\mu)$,∗ independent of $N$ – this relates to efficiency: an overly conservative error bound will be manifested in an unnecessarily large $N$ and unduly expensive RB approximation, or (even worse) an overly conservative or expensive decision or action “in the field.”
* Note, however, that these error bounds are relative to our underlying “truth” approximation, $u(\mu) \in X$, not to the exact solution, $u^e(\mu) \in X^e$.
We now turn to error bounds for the output of interest. To begin, we note that the dual satisfies an “energy” error bound very similar to the primal result: for $1 \le N^{du} \le N^{du}_{\max}$,
$$\|\psi(\mu) - \psi_{N^{du}}(\mu)\|_X \le \Delta^{du}_{N^{du}}(\mu), \quad \forall\, \mu \in D;$$
here $\Delta^{du}_{N^{du}}(\mu) \equiv \varepsilon^{du}_{N^{du}}(\mu)/\tilde\beta(\mu)$, and $\varepsilon^{du}_{N^{du}}(\mu) = \|-\ell(\cdot) - a(\cdot, \psi_{N^{du}}(\mu); \mu)\|_{X'}$ is the dual norm of the dual residual. It then follows† that
$$|s(\mu) - s_N(\mu)| \le \Delta^s_N(\mu), \quad \forall\, \mu \in D, \tag{25}$$
where
$$\Delta^s_N(\mu) \equiv \varepsilon_N(\mu)\, \Delta^{du}_{N^{du}}(\mu). \tag{26}$$
It is critical to note that $\Delta^s_N(\mu) = \tilde\beta(\mu)\, \Delta_N(\mu)\, \Delta^{du}_{N^{du}}(\mu)$: the output error (and output error bound) vanishes as the product of the primal and dual error (bounds), and hence much more rapidly than either the primal or dual error. From the perspective of computational efficiency, a good choice is $\varepsilon_N(\mu) \approx \varepsilon^{du}_{N^{du}}(\mu)$; the latter also (roughly) ensures that the bound (25), (26) will be quite sharp. In the compliance case, $a$ symmetric and $\ell = f$, we immediately obtain $\Delta^{du}_{N^{du}}(\mu) = \Delta_N(\mu)$, and hence (25) obtains for
$$\Delta^s_N(\mu) \equiv \frac{\varepsilon_N^2(\mu)}{\tilde\beta(\mu)}, \quad \forall\, \mu \in D \ \text{(compliance)}; \tag{27}$$
here, we obtain the “square” effect even without (explicit) introduction of the dual problem. For $a$ coercive, further improvements are possible [9]. The real challenge in a posteriori error estimation is not the presentation of these rather classical results, but rather the development of efficient computational approaches for the evaluation of the necessary constituents. In our particular deployed context, “efficient” translates to “online complexity independent of $\mathcal{N}$,” and “necessary constituents” translates to “dual norm of the primal residual, $\varepsilon_N(\mu) \equiv \|g(u_N(\mu), \cdot\,; \mu)\|_{X'}$, dual norm of the dual residual, $\varepsilon^{du}_{N^{du}}(\mu) \equiv \|-\ell(\cdot) - a(\cdot, \psi_{N^{du}}(\mu); \mu)\|_{X'}$, and lower bound for the inf–sup constant, $\tilde\beta(\mu)$.” We now turn to these issues.
* The upper bound on the effectivity can be large. In many cases, this effectivity bound is in fact quite pessimistic; in many other cases, the effectivity (bound) may be improved by judicious choice of (multipoint) inner product $(\cdot, \cdot)_X$ – in effect, a “bound conditioner” [21].
† The proof is simple: $|s(\mu) - s_N(\mu)| = |\ell(e) - g(u_N(\mu), \psi_N(\mu); \mu)| = |-a(e(\mu), \psi(\mu); \mu) - g(u_N(\mu), \psi_N(\mu); \mu)| = |g(u_N(\mu), \psi(\mu) - \psi_N(\mu); \mu)| \le \varepsilon_N(\mu)\, \Delta^{du}_N(\mu)$.
3.3. Offline–Online Computational Procedures
3.3.1. The dual norm of the residual
We consider only the primal residual; the dual residual admits a similar treatment. To begin, we note from standard duality arguments that
$$\varepsilon_N(\mu) \equiv \|g(u_N(\mu), \cdot\,; \mu)\|_{X'} = \|\hat e(\mu)\|_X, \tag{28}$$
where $\hat e(\mu) \in X$ satisfies
$$(\hat e(\mu), v)_X = -g(u_N(\mu), v; \mu), \quad \forall\, v \in X. \tag{29}$$
We next observe from our reduced-basis representation (12) and affine assumption (7) that $-g(u_N(\mu), v; \mu)$ may be expressed as
$$-g(u_N(\mu), v; \mu) = f(v) - \sum_{q=1}^{Q} \sum_{n=1}^{N} \Theta^q(\mu)\, u_{N\,n}(\mu)\, a^q(\zeta_n, v), \quad \forall\, v \in X. \tag{30}$$
It thus follows from (29) and (30) that $\hat e(\mu) \in X$ satisfies
$$(\hat e(\mu), v)_X = f(v) - \sum_{q=1}^{Q} \sum_{n=1}^{N} \Theta^q(\mu)\, u_{N\,n}(\mu)\, a^q(\zeta_n, v), \quad \forall\, v \in X. \tag{31}$$
The critical observation [8, 9] is that the right-hand side of (31) is a sum of products of parameter-dependent functions and parameter-independent linear functionals. In particular, it follows from linear superposition that we may write $\hat e(\mu) \in X$ as
$$\hat e(\mu) = C + \sum_{q=1}^{Q} \sum_{n=1}^{N} \Theta^q(\mu)\, u_{N\,n}(\mu)\, L^q_n,$$
for $C \in X$ satisfying $(C, v)_X = f(v)$, $\forall\, v \in X$, and $L^q_n \in X$ satisfying $(L^q_n, v)_X = -a^q(\zeta_n, v)$, $\forall\, v \in X$, $1 \le n \le N$, $1 \le q \le Q$; note from (17) that the latter are simple parameter-independent (scalar or vector) Poisson, or Poisson-like, problems. It thus follows that
$$\|\hat e(\mu)\|_X^2 = (C, C)_X + \sum_{q=1}^{Q} \sum_{n=1}^{N} \Theta^q(\mu)\, u_{N\,n}(\mu) \Big\{ 2\,(C, L^q_n)_X + \sum_{q'=1}^{Q} \sum_{n'=1}^{N} \Theta^{q'}(\mu)\, u_{N\,n'}(\mu)\, (L^q_n, L^{q'}_{n'})_X \Big\}. \tag{32}$$
The expression (32) – which we relate to the requisite dual norm of the residual through (28) – is the sum of products of parameter-dependent (simple, known) functions and parameter-independent inner products. The offline–online decomposition is now clear. In the offline stage – performed once – we first solve for $C$ and $L^q_n$, $1 \le n \le N$, $1 \le q \le Q$; we then evaluate and save the relevant parameter-independent inner products $(C, C)_X$, $(C, L^q_n)_X$, $(L^q_n, L^{q'}_{n'})_X$, $1 \le n, n' \le N$, $1 \le q, q' \le Q$. Note that all quantities computed in the offline stage are independent of the parameter $\mu$. In the online stage – performed many times, for each new value of $\mu$ “in the field” – we simply evaluate the sum (32) in terms of the $\Theta^q(\mu)$, $u_{N\,n}(\mu)$ and the precalculated and stored (parameter-independent) $(\cdot, \cdot)_X$ inner products. The operation count for the online stage is only $O(Q^2 N^2)$ – again, the essential point is that the online complexity is independent of $\mathcal{N}$, the dimension of the underlying truth finite element approximation space. We further note that, unless $Q$ is quite large, the online cost associated with the calculation of the dual norm of the residual is commensurate with the online cost associated with the calculation of $s_N(\mu)$.
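The decomposition of (32) can be illustrated with a small finite-dimensional sketch. The assumptions here are ours and not from the text: the truth space is identified with $\mathbb{R}^n$, the $X$ inner product is represented by a symmetric positive-definite matrix `X`, each form is taken as $a^q(w, v) = w^T K_q v$, and the function names are hypothetical.

```python
import numpy as np

def residual_offline(X, Kq, f, Z):
    """Offline: Riesz representers C, L^q_n and their X inner products.

    X  : (n, n) SPD matrix of the X inner product (assumed representation)
    Kq : list of Q (n, n) matrices with a^q(w, v) = w^T K_q v
    f  : (n,)   load vector, f(v) = f^T v
    Z  : (n, N) reduced basis, columns zeta_n
    """
    Q, N = len(Kq), Z.shape[1]
    C = np.linalg.solve(X, f)                              # (C, v)_X = f(v)
    # (L^q_n, v)_X = -a^q(zeta_n, v): one solve with X per (q, n) pair
    L = np.stack([np.linalg.solve(X, -K.T @ Z) for K in Kq])   # (Q, n, N)
    L = L.transpose(0, 2, 1).reshape(Q * N, -1)            # rows ordered (q, n)
    return C @ X @ C, L @ X @ C, L @ X @ L.T

def residual_dual_norm(theta, uN, CC, CL, LL):
    """Online: evaluate (32) in O(Q^2 N^2) from the stored inner products."""
    c = np.outer(theta, uN).ravel()        # Theta^q(mu) u_{N n}(mu), (q, n) order
    eps2 = CC + 2.0 * (c @ CL) + c @ LL @ c
    return np.sqrt(max(eps2, 0.0))         # guard against round-off below zero
```

Because (32) is a difference of terms that can nearly cancel when the residual is small, the square root is taken of a value clipped at zero; this guard is a choice made for this sketch.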
3.3.2. Lower bound for the inf–sup parameter
Obviously, from the definition (18), we may readily obtain by a variety of techniques effective upper bounds for $\beta(\mu)$; however, lower bounds are much more difficult to construct. We do note that in the case of symmetric coercive operators we can often determine $\tilde\beta(\mu)$ ($\le \beta(\mu) = \alpha_c(\mu)$, $\forall\, \mu \in D$) “by inspection.” For example, if we verify $\Theta^q(\mu) > 0$, $\forall\, \mu \in D$, and $a^q(v, v) \ge 0$, $\forall\, v \in X$, $1 \le q \le Q$, then we may choose [8, 21] for our coercivity lower bound
$$\tilde\beta(\mu) = \Big( \min_{q \in \{1, \dots, Q\}} \frac{\Theta^q(\mu)}{\Theta^q(\bar\mu)} \Big)\, \alpha_c(\bar\mu), \tag{33}$$
for some $\bar\mu \in D$. Unfortunately, these hypotheses are rather restrictive, and hence more complicated (and offline-expensive) recipes must often be pursued [17, 18]. We consider here a construction which is valid for general noncoercive operators (and thus also relevant in the nonlinear context [22]); for simplicity, we assume our problem remains well-posed over a convex parameter set that includes $D$. To begin, given $\bar\mu \in D$ and $t = (t_{(1)}, \dots, t_{(P)}) \in \mathbb{R}^P$ – note $t_{(j)}$ is the value of the $j$th component of $t$ – we introduce the bilinear form
$$T(w, v; t; \bar\mu) = (T^{\bar\mu} w, T^{\bar\mu} v)_X + \sum_{p=1}^{P} t_{(p)} \Big\{ \sum_{q=1}^{Q} \frac{\partial \Theta^q}{\partial \mu_{(p)}}(\bar\mu) \big[ a^q(w, T^{\bar\mu} v) + a^q(v, T^{\bar\mu} w) \big] \Big\} \tag{34}$$
and associated Rayleigh quotient
$$F(t; \bar\mu) = \min_{v \in X} \frac{T(v, v; t; \bar\mu)}{\|v\|_X^2}; \tag{35}$$
it is readily demonstrated that $F(t; \bar\mu)$ is concave in $t$ [24], and hence $D^{\bar\mu} \equiv \{\mu \in \mathbb{R}^P \mid F(\mu - \bar\mu; \bar\mu) \ge 0\}$ is perforce convex. We next introduce semi-norms $|\cdot|_q : X \to \mathbb{R}_{+,0}$ such that
$$|a^q(w, v)| \le \Gamma_q\, |w|_q\, |v|_q, \quad \forall\, w, v \in X,\ 1 \le q \le Q, \qquad C_X = \sup_{w \in X} \frac{\sum_{q=1}^{Q} |w|_q^2}{\|w\|_X^2}, \tag{36}$$
for positive parameter-independent constants $\Gamma_q$, $1 \le q \le Q$, and $C_X$; it is often the case that $\Theta^1(\mu) = \text{Constant}$, in which case the $q = 1$ contribution to the sum in (34) and (36) may be discarded. (Note that $C_X$ is typically independent of $Q$, since the $a^q$ are often associated with non-overlapping subdomains of $\Omega$.) Finally, we define
$$\Phi(\mu; \bar\mu) \equiv C_X \max_{q \in \{1, \dots, Q\}} \Big( \Gamma_q\, \Big| \Theta^q(\mu) - \Theta^q(\bar\mu) - \sum_{p=1}^{P} (\mu - \bar\mu)_{(p)} \frac{\partial \Theta^q}{\partial \mu_{(p)}}(\bar\mu) \Big| \Big), \tag{37}$$
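for $\mu \equiv (\mu_{(1)}, \dots, \mu_{(P)}) \in \mathbb{R}^P$. We now introduce points $\bar\mu_j$ and associated polytopes $\mathcal{P}^{\bar\mu_j} \subset D^{\bar\mu_j}$, $1 \le j \le J$, such that
$$D \subset \bigcup_{j=1}^{J} \mathcal{P}^{\bar\mu_j}, \tag{38}$$
$$\min_{\nu \in V^{\bar\mu_j}} F(\nu - \bar\mu_j; \bar\mu_j) - \max_{\mu \in \mathcal{P}^{\bar\mu_j}} \Phi(\mu; \bar\mu_j) \ge \epsilon_\beta\, \beta(\bar\mu_j), \quad 1 \le j \le J. \tag{39}$$
Here $V^{\bar\mu_j}$ is the set of vertices associated with the polytope $\mathcal{P}^{\bar\mu_j}$ – for example, $\mathcal{P}^{\bar\mu_j}$ may be a simplex with $|V^{\bar\mu_j}| = P + 1$ vertices; and $\epsilon_\beta \in\, ]0, 1[$ is a prescribed accuracy constant. Our lower bound is then given by
$$\tilde\beta(\mu) = \max_{j \in \{1, \dots, J\} \,\mid\, \mu \in \mathcal{P}^{\bar\mu_j}} \epsilon_\beta\, \beta(\bar\mu_j). \tag{40}$$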
In fact, $\tilde\beta(\mu)$ of (40) may not strictly honor our condition $\tilde\beta(\mu) > \epsilon_\beta\, \beta(\mu)$; however, as the latter relates to accuracy, approximate satisfaction suffices. (Recall that $\tilde\beta(\mu)$ appears in the denominator of our error bound; hence, even a relative inf–sup discrepancy of 80%, $\epsilon_\beta \approx 1/5$, is acceptable.) It can be easily demonstrated that $\beta(\mu) \ge \tilde\beta(\mu) \ge \epsilon_\beta\, \beta_0 > 0$, $\forall\, \mu \in D$, which thus ensures well-posed and rigorous error bounds.
We now turn to the offline–online decomposition. The offline stage comprises two parts: the generation of a set of points and polytopes–vertices, $\bar\mu_j$ and $\mathcal{P}^{\bar\mu_j}$, $V^{\bar\mu_j}$, $1 \le j \le J$; and the verification that (38) (trivial) and (39) (nontrivial) are indeed satisfied. We focus on verification; generation – quite involved – is described in detail in [23]. To verify (39), the essential observation is that the expensive terms – “truth” eigenproblems associated with $F$, (35), and $\beta$, (18) – are limited to a finite set of vertices,
$$J + \sum_{j=1}^{J} |V^{\bar\mu_j}|,$$
in total; only for the extremely inexpensive – and typically algebraically very simple – $\Phi(\mu; \bar\mu_j)$ terms must we consider minimization over the polytopes. The online stage (40) is very simple: a search/look-up table, with complexity logarithmic in $J$ and polynomial in $P$.
We close by remarking on the properties of $F(\mu - \bar\mu; \bar\mu)$ that play an important role. First, $F(\mu - \bar\mu; \bar\mu) \le \beta^2(\mu)$, $\forall\, \mu \in D^{\bar\mu}$ (say, for the case in which $\Theta^q(\mu) = \mu_{(q)}$, $1 \le q \le Q = P$): this ensures the lower bound result. Second, $F(t; \bar\mu)$ is concave in $t$ (note that in general $\beta(\mu)$ is neither (quasi-) concave nor (quasi-) convex in $\mu$ [24]): this ensures a tractable offline computation. Third, $F(\mu - \bar\mu; \bar\mu)$ is “tangent”∗ to $\beta(\mu)$ at $\mu = \bar\mu$ – the cruder estimate $\Phi(\mu; \bar\mu)$ is a second-order correction: this controls the growth of $J$ (for example, relative to simpler continuity bounds [17]).
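For the simpler symmetric coercive case, the “by inspection” bound (33) is easy to illustrate. The sketch below is only an illustration under our own assumptions: matrices stand in for the forms $a^q$ and the $X$ inner product, and both function names are hypothetical. The bound holds because, for $\Theta^q > 0$ and $a^q(v, v) \ge 0$, $a(v, v; \mu) \ge \min_q [\Theta^q(\mu)/\Theta^q(\bar\mu)]\, a(v, v; \bar\mu)$.

```python
import numpy as np

def coercivity_constant(A, X):
    """alpha_c = min over v of (v^T A v)/(v^T X v), for symmetric A and SPD X."""
    Lc = np.linalg.cholesky(X)                 # X = Lc Lc^T
    S = np.linalg.solve(Lc, A)                 # Lc^{-1} A
    S = np.linalg.solve(Lc, S.T).T             # Lc^{-1} A Lc^{-T}
    return np.linalg.eigvalsh(0.5 * (S + S.T)).min()

def min_theta_lower_bound(theta_mu, theta_ref, alpha_ref):
    """Bound (33): min_q Theta^q(mu)/Theta^q(mu_bar) times alpha_c(mu_bar)."""
    theta_mu = np.asarray(theta_mu, dtype=float)
    theta_ref = np.asarray(theta_ref, dtype=float)
    if np.any(theta_mu <= 0.0) or np.any(theta_ref <= 0.0):
        raise ValueError("bound (33) assumes Theta^q > 0")
    return float(np.min(theta_mu / theta_ref)) * alpha_ref
```

In this setting the expensive eigenproblem for $\alpha_c(\bar\mu)$ is solved once offline, and the online evaluation reduces to $Q$ divisions and a minimum.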
3.4. Sample Construction: A Greedy Algorithm
Our error estimation procedures also allow us to pursue more rational constructions of our parameter samples $S_N$, $S^{du}_{N^{du}}$ (and hence spaces $W_N$, $W^{du}_{N^{du}}$) [18]. We consider here only the primal problem – in which our error criterion is $\|u(\mu) - u_N(\mu)\|_X \equiv \|e(\mu)\|_X \le \epsilon_{\mathrm{tol}}$; similar approaches may be developed for the dual – $\|\psi(\mu) - \psi_{N^{du}}(\mu)\|_X \le \epsilon^{du}_{\mathrm{tol}}$, and hence the output – $|s(\mu) - s_N(\mu)| \le \epsilon^{s}_{\mathrm{tol}}$. We denote the smallest primal error tolerance anticipated as $\epsilon_{\mathrm{tol,min}}$ – this must be determined a priori offline; we then permit $\epsilon_{\mathrm{tol}} \in [\epsilon_{\mathrm{tol,min}}, \infty[$ to be specified online. We also introduce $\Xi_F \in D^{n_F}$, a very fine random sample over the parameter domain $D$ of size $n_F \gg 1$.
* To make this third property rigorous we must in general consider non-smooth analysis and also possibly a continuous spectrum as $\mathcal{N} \to \infty$.
We first consider the offline stage. We assume that we are given a sample $S_N$, and hence space $W_N$ and associated reduced-basis approximation (procedure to determine) $u_N(\mu)$, $\forall\, \mu \in D$. We then calculate $\mu^*_N = \arg\max_{\mu \in \Xi_F} \Delta_N(\mu)$ – $\Delta_N(\mu)$ is our “online” error bound (23) that, in the limit of $n_F \to \infty$ queries, may be evaluated (on average) in $O(N^2 Q^2)$ operations; we next append $\mu^*_N$ to $S_N$ to form $S_{N+1}$, and hence $W_{N+1}$. We now continue this process until $N = N_{\max}$ such that $\Delta^*_{N_{\max}} \le \epsilon_{\mathrm{tol,min}}$, where $\Delta^*_N \equiv \Delta_N(\mu^*_N)$, $1 \le N \le N_{\max}$.
In the online stage, given any desired $\epsilon_{\mathrm{tol}} \in [\epsilon_{\mathrm{tol,min}}, \infty[$ and any new value of $\mu \in D$ “in the field”, we first choose $N$ from a pre-tabulated array such that $\Delta^*_N \equiv \Delta_N(\mu^*_N) \le \epsilon_{\mathrm{tol}}$. We next calculate $u_N(\mu)$ and $\Delta_N(\mu)$, and then verify that – and if necessary, subsequently increase $N$ such that – the condition $\Delta_N(\mu) \le \epsilon_{\mathrm{tol}}$ is indeed satisfied. (We should not and do not rely on the finite sample $\Xi_F$ for either rigor or sharpness.)
The crucial point is that $\Delta_N(\mu)$ is an accurate and “online-inexpensive” – $O(1)$ effectivity and $\mathcal{N}$-independent asymptotic complexity – surrogate for the true (very-expensive-to-calculate) error $\|u(\mu) - u_N(\mu)\|_X$. This surrogate permits us to (i) offline – here we exploit low average cost – perform a much more exhaustive ($n_F \gg 1$) and, hence, meaningful search for the best samples $S_N$ and, hence, most rapidly uniformly convergent spaces $W_N$,∗ and (ii) online – here we exploit low marginal cost – determine the smallest $N$, and hence, the most efficient approximation, for which we rigorously achieve the desired accuracy.
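The offline greedy loop can be sketched generically as follows. This is a schematic illustration, not the authors' code: `greedy_sample`, `build_space`, and `error_bound` are hypothetical names, and the caller is assumed to supply an inexpensive error bound (for instance $\Delta_N(\mu)$ of (23)) and a routine that builds the reduced space from a sample.

```python
import numpy as np

def greedy_sample(train_set, error_bound, build_space, mu0, tol_min, N_max):
    """Greedy sample construction (sketch; all argument names are hypothetical).

    train_set   : sequence of parameter values (the fine sample over D)
    error_bound : callable(space, mu) -> error bound for that reduced space
    build_space : callable(list of mu) -> reduced space object
    mu0         : initial sample point
    """
    sample = [mu0]
    history = []                          # max bound over train_set, per N
    while True:
        space = build_space(sample)
        bounds = np.array([error_bound(space, mu) for mu in train_set])
        k = int(np.argmax(bounds))        # parameter with the worst bound
        history.append(float(bounds[k]))
        if bounds[k] <= tol_min or len(sample) >= N_max:
            return sample, history
        sample.append(train_set[k])       # enrich the sample where bound is largest
```

The returned `history` plays the role of the pre-tabulated array of maximum bounds from which, online, the smallest adequate $N$ can be chosen for a requested tolerance.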
4. A Linear Example: Helmholtz-Elasticity
4.1. Problem Description
We consider a two-dimensional thin plate with a horizontal crack at the (say) interface of two lamina: the (original) domain $\Omega^o(b, L) \subset \mathbb{R}^2$, shown in Fig. 1, is defined as $[0, 2] \times [0, 1] \setminus C^o$, where $C^o \equiv \{x_1 \in [b - L/2, b + L/2],\ x_2 = 1/2\}$ defines the idealized crack. The left surface of the plate is secured; the top and bottom boundaries are stress-free; and the right boundary is subjected to a vertical oscillatory uniform traction at frequency $\omega$. We model the plate as plane-stress linear isotropic elastic with (scaled) density unity, Young’s modulus unity, and Poisson ratio 0.25; the latter determine the (parameter-independent) constitutive tensor $E_{ijk\ell}$. Our $P = 3$ input is $\mu \equiv (\mu_{(1)}, \mu_{(2)}, \mu_{(3)}) \equiv (\omega^2, b, L)$; our output is the (oscillatory) amplitude of the average vertical displacement on the right edge of the plate.
* We may in fact view our offline sampling process as a (greedy, parameter space, “$L^\infty(D)$”) variant of the POD economization procedure [25] in which – thanks to $\Delta_N(\mu)$ – we need never construct the “rejected” snapshots.
Figure 1. (Original) domain for the Helmholtz elasticity example.
The governing equation for the displacement $u^o(x^o; \mu) \in X^o(\mu)$ is therefore $a^o(u^o(\mu), v; \mu) = f^o(v)$, $\forall\, v \in X^o(\mu)$, where $X^o(\mu)$ is a quadratic finite element truth approximation subspace (of dimension $\mathcal{N} = 14{,}662$) of $X^e(\mu) \equiv \{v \in (H^1(\Omega^o(b, L)))^2 \mid v|_{x_1^o = 0} = 0\}$; here
$$a^o(w, v; \mu) \equiv \int_{\Omega^o(b, L)} w_{i,j}\, E_{ijk\ell}\, v_{k,\ell} - \omega^2\, w_i v_i,$$
($v_{i,j}$ denotes $\partial v_i / \partial x_j$ and repeated physical indices imply summation), and $f^o(v) \equiv \int_{x_1^o = 2} v_2$. The crack surface is hence modeled extremely simplistically – as a stress-free boundary. The output $s^o(\mu)$ is given by $s^o(\mu) = \ell^o(u^o(\mu))$, where $\ell^o(v) = f^o(v)$; we are thus “in compliance”.
We now map $\Omega^o(b, L)$ via a continuous piecewise-affine transformation to a fixed domain $\Omega$. This new problem can now be cast precisely in the desired abstract form, in which $\Omega$, $X$, and $(w, v)_X$ are independent of the parameter $\mu$: as required, all parameter dependence now enters through the bilinear and linear forms; in particular, our affine assumption (7) applies for $Q = 10$. In the Appendix we summarize the $\Theta^q(\mu)$, $a^q(w, v)$, $1 \le q \le Q$; the bound conditioner $(\cdot, \cdot)_X$; and the resulting continuity constants $\Gamma_q$ and semi-norms $|\cdot|_q$, $1 \le q \le Q$, and norm equivalence parameter $C_X$.
The (undamped, nonradiating) Helmholtz equation exhibits resonances. Our techniques can treat near resonances, as well as large frequency ranges, quite well [18, 23]. For our illustrative purposes here, we choose the parameter domain $D\ (\subset \mathbb{R}^{P=3}) \equiv (\omega^2 \in [3.2, 4.8]) \times (b \in [0.9, 1.1]) \times (L \in [0.15, 0.25])$; $D$ contains no resonances – $\beta(\mu) \ge \beta_0 > 0$, $\forall\, \mu \in D$ – however, $\omega^2 = 3.2$ and $4.8$ are close to corresponding natural frequencies, and hence the problem is distinctly noncoercive.
4.2. Numerical Results
Figure 2. $\beta^2(\mu)$ and $F(\mu - \bar\mu; \bar\mu)$ for $\bar\mu = (4, 1, 0.2)$ as a function of $(b, L)$; $\omega^2 = 4.0$.
We first consider the inf–sup lower bound construction. We show in Fig. 2 $\beta^2(\mu)$ and $F(\mu - \bar\mu; \bar\mu)$ for $\bar\mu = \bar\mu_1 = (4.0, 1.0, 0.2)$; for purposes of presentation we keep $\mu_{(1)}$ ($= \omega^2 = 4.0$) fixed and vary $\mu_{(2)}$ ($= b$) and $\mu_{(3)}$ ($= L$). We observe that (in this particular case, even without $\Phi(\mu; \bar\mu)$), $F(\mu - \bar\mu; \bar\mu)$ is a lower bound for $\beta^2(\mu)$; that $F(\mu - \bar\mu; \bar\mu)$ is concave; and that $F(\mu - \bar\mu; \bar\mu)$ is tangent to $\beta^2(\mu)$ at $\mu = \bar\mu$. Thanks to the latter, we can cover $D$ (for $\epsilon_\beta = 0.2$) such that (38) and (39) are satisfied with only $J = 84$ polytopes; in this particular case the $\mathcal{P}^{\bar\mu_j}$, $1 \le j \le J$, are hexahedrons such that $|V^{\bar\mu_j}| = 8$, $1 \le j \le J$.
Armed with the inf–sup lower bound, we can now pursue the adaptive sampling strategy described in the previous section. We recall that our problem is compliant, and hence we need only consider the primal variable (and subsequently set $\psi_{N^{du} = N}(\mu) = -u_N(\mu)$ and $\varepsilon^{du}_{N^{du} = N}(\mu) = \varepsilon_N(\mu)$). For $\epsilon_{\mathrm{tol,min}} = 10^{-3}$ and $n_F = 729$ we obtain $N_{\max} = 32$ such that $\Delta^*_{N_{\max}} \equiv \Delta_{N_{\max}}(\mu^{pr}_{N_{\max}}) = 9.03 \times 10^{-4}$.
We present in Table 1 $\Delta_{N,\max,\mathrm{rel}}$, $\eta_{N,\mathrm{ave}}$, $\Delta^s_{N,\max,\mathrm{rel}}$, and $\eta^s_{N,\mathrm{ave}}$ as a function of $N$. Here $\Delta_{N,\max,\mathrm{rel}}$ is the maximum over $\Xi_{\mathrm{Test}}$ of $\Delta_N(\mu)/\|u_{N_{\max}}\|_{\max}$, $\eta_{N,\mathrm{ave}}$ is the average over $\Xi_{\mathrm{Test}}$ of $\Delta_N(\mu)/\|u(\mu) - u_N(\mu)\|_X$, $\Delta^s_{N,\max,\mathrm{rel}}$ is the maximum over $\Xi_{\mathrm{Test}}$ of $\Delta^s_N(\mu)/|s_{N_{\max}}|_{\max}$, and $\eta^s_{N,\mathrm{ave}}$ is the average over $\Xi_{\mathrm{Test}}$ of $\Delta^s_N(\mu)/|s(\mu) - s_N(\mu)|$. Here $\Xi_{\mathrm{Test}} \in (D^I)^{343}$ is a random parameter sample of size 343; $\|u_{N_{\max}}\|_{\max} \equiv \max_{\mu \in \Xi_{\mathrm{Test}}} \|u_{N_{\max}}(\mu)\|_X = 2.0775$ and $|s_{N_{\max}}|_{\max} \equiv \max_{\mu \in \Xi_{\mathrm{Test}}} |s_{N_{\max}}(\mu)| = 0.089966$; and $\Delta_N(\mu)$ and $\Delta^s_N(\mu)$ are given by (23) and (27), respectively. We observe that the RB approximation – in particular, for the output – converges very rapidly, and that our rigorous error bounds are in fact quite sharp. The effectivities are not quite $O(1)$ primarily due to the relatively crude inf–sup lower bound; but note that, thanks to the rapid convergence of the RB approximation, $O(10)$ effectivities do not significantly affect efficiency – the induced increase in RB dimension $N$ is quite modest.
We turn now to computational effort. For (say) $N = 24$ and any given $\mu$ (say, $(4.0, 1.0, 0.2)$) – for which the error in the reduced-basis output $s_N(\mu)$
Table 1. Numerical results for Helmholtz elasticity
N     Δ_{N,max,rel}     η_{N,ave}     Δ^s_{N,max,rel}     η^s_{N,ave}
12    1.54 × 10^-1      13.41         3.31 × 10^-2        15.93
16    3.40 × 10^-2      12.24         2.13 × 10^-3        14.86
20    1.58 × 10^-2      13.22         4.50 × 10^-4        15.44
24    5.91 × 10^-3      12.56         4.81 × 10^-5        14.45
28    2.42 × 10^-3      12.44         9.98 × 10^-6        14.53
relative to the truth approximation $s(\mu)$ is certifiably less than $\Delta^s_N(\mu)$ ($= 4.94 \times 10^{-7}$) – the Online Time (marginal cost) to compute both $s_N(\mu)$ and $\Delta^s_N(\mu)$ is less than 0.0030 times the Total Time to directly calculate the truth result $s(\mu) = \ell(u(\mu))$. The savings will be even larger for problems with more complex geometry and solution structure, in particular in three space dimensions. As desired, we achieve efficiency due to (i) our choice of sample, (ii) our rigorous stopping criterion $\Delta^s_N(\mu)$, and (iii) our affine parameter dependence and associated offline–online computational procedures; and we achieve rigorous certainty – the reduced-basis predictions may serve in “deployed” decision processes with complete confidence (or at least with the same confidence as the underlying physical model and associated truth finite element approximation). The true merit of the approach is best illustrated in the deployed–real-time context of parameter identification (crack assessment) and adaptive mission optimization (load maximization); see Ref. [24] for an example.
5. A Nonlinear Example: Natural Convection
Obviously nonlinear equations do not admit the same degree of generality as linear equations. We thus present our approach to nonlinear equations for a particular quadratically nonlinear elliptic problem: the steady Boussinesq incompressible Navier–Stokes equations. This example permits us to identify the key new computational and theoretical ingredients; then, in Outlook, we contemplate more general (higher-order) nonlinearities.
5.1. Problem Description
We consider Prandtl number Pr = 0.7 Boussinesq natural convection in a square cavity $(x_1, x_2) \in \Omega \equiv [0, 1] \times [0, 1]$; the Pr = 0 limit is described in greater detail in [22, 26]. The governing equations for the velocity $U = (U_1, U_2)$, pressure $p$, and temperature $\theta$ are the (coupled) incompressible steady Navier–Stokes and thermal convection–diffusion equations. Our single parameter ($P = 1$) is the Grashof number, $\mu \equiv \mathrm{Gr}$, which is the ratio of the buoyancy forces (induced by the temperature field) to the momentum dissipation mechanisms; we consider $\mathrm{Gr} \in D \equiv [1.0, 1.0 \times 10^4]$. This flow is a model problem for Bridgman growth of semi-conductor crystals; future work shall address geometric (angle, aspect ratio) and Pr variation, and higher Gr – all of which are important in actual materials processing applications.
In terms of the general mathematical formulation, (5), $u(\mu) \equiv (U_1, U_2, p, \theta, \lambda)(\mu)$, where $\lambda$ is a Lagrange multiplier associated with the pressure zero-mean condition. Our solution $u(\mu)$ resides in the space $X \equiv X_U \times X_p \times X_\theta \times \mathbb{R}$, where $X_U \subset (H^1_0(\Omega))^2$, $X_p \subset L^2(\Omega)$ (respectively, $X_\theta \subset \{v \in H^1(\Omega) \mid v|_{x_1 = 0} = 0\}$) is a classical $P_2$–$P_1$ Taylor–Hood Stokes (respectively, $P_2$ scalar) finite element approximation subspace [5]; $X$ is of dimension $\mathcal{N} = 2869$. We associate to $X$ the inner product and norm
$$(w, v)_X = \int_\Omega \Big( \frac{\partial W_i}{\partial x_j} \frac{\partial V_i}{\partial x_j} + W_i V_i + r q + \frac{\partial \chi}{\partial x_i} \frac{\partial \phi}{\partial x_i} + \chi \phi \Big) + \kappa \alpha$$
and $\|w\|_X = \sqrt{(w, w)_X}$, respectively, where $w = (W_1, W_2, r, \chi, \kappa)$ and $v = (V_1, V_2, q, \phi, \alpha)$. The strong (or distributional) form of the governing equations is then
$$\sqrt{\mathrm{Gr}}\; u_j \frac{\partial u_i}{\partial x_j} = -\sqrt{\mathrm{Gr}}\, \frac{\partial p}{\partial x_i} + \sqrt{\mathrm{Gr}}\; \theta\, \delta_{i2} + \frac{\partial^2 u_i}{\partial x_j \partial x_j}, \quad i = 1, 2,$$
$$\frac{\partial u_i}{\partial x_i} = \lambda,$$
$$\sqrt{\mathrm{Gr}}\; \mathrm{Pr}\; u_j \frac{\partial \theta}{\partial x_j} = \frac{\partial^2 \theta}{\partial x_j \partial x_j},$$
with boundary–normalization conditions $u|_{\partial\Omega} = 0$ on the velocity, $\int_\Omega p = 0$ on the pressure, and $\partial\theta/\partial n|_{\Gamma_1} = 1$, $\theta|_{\Gamma_0} = 0$, $\partial\theta/\partial n|_{\Gamma_s} = 0$ on the temperature; the flow is thus driven by the flux imposed on $\Gamma_1$. Here $\delta_{ij}$ is the Kronecker delta, $\partial\Omega$ is the boundary of $\Omega$, and $\Gamma_0 = \{x_1 = 0, x_2 \in [0, 1]\}$ (left side), $\Gamma_1 = \{x_1 = 1, x_2 \in [0, 1]\}$ (right side), and $\Gamma_s = \{x_1 \in\, ]0, 1[\,, x_2 = 0\} \cup \{x_1 \in\, ]0, 1[\,, x_2 = 1\}$ (top and bottom). It is readily derived that $\lambda = 0$; however, we retain this term as a computationally convenient and stable fashion by which to impose the zero-mean pressure condition on the truth finite element solution. Our output of interest is the average temperature over $\Gamma_1$: $s(\mathrm{Gr}) = \ell(u(\mathrm{Gr}))$, where
$$\ell(v = (V_1, V_2, q, \phi, \alpha)) \equiv \int_{\Gamma_1} \phi; \tag{41}$$
note that $s^{-1}(\mathrm{Gr})$ is the traditional “Nusselt number”.
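The weak form of our partial differential equations is then given by (5), where
$$g(w, v; \mathrm{Gr}) \equiv a_0(w, v; \mathrm{Gr}) + \tfrac{1}{2}\, a_1(w, w, v; \mathrm{Gr}) - f(v), \tag{42}$$
$$a_0(w^1, v; \mathrm{Gr}) \equiv \int_\Omega \frac{\partial W^1_i}{\partial x_j} \frac{\partial V_i}{\partial x_j} + \int_\Omega \frac{\partial \chi^1}{\partial x_i} \frac{\partial \phi}{\partial x_i} - \int_\Omega \frac{\partial W^1_i}{\partial x_i}\, q + \kappa^1 \int_\Omega q + \alpha \int_\Omega r^1 + \sqrt{\mathrm{Gr}}\, \Big( -\int_\Omega \chi^1 V_2 - \int_\Omega r^1 \frac{\partial V_i}{\partial x_i} \Big), \tag{43}$$
$$a_1(w^1, w^2, v; \mathrm{Gr}) \equiv \sqrt{\mathrm{Gr}}\, \Big( -\int_\Omega \big( W^1_j W^2_i + W^2_j W^1_i \big) \frac{\partial V_i}{\partial x_j} + \mathrm{Pr} \int_\Omega \Big( W^2_j \frac{\partial \chi^1}{\partial x_j} + W^1_j \frac{\partial \chi^2}{\partial x_j} \Big) \phi \Big), \tag{44}$$
$$f(v) \equiv \int_{\Gamma_1} \phi; \tag{45}$$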
here $w^1 = (W^1_1, W^1_2, r^1, \chi^1, \kappa^1)$, $w^2 = (W^2_1, W^2_2, r^2, \chi^2, \kappa^2)$, and $v = (V_1, V_2, q, \phi, \alpha)$. Note that, even though $\ell = f$, we are not in “compliance” as $g$ is not bilinear, symmetric; however, we are “close” to compliance, and thus might anticipate rapid output convergence. We next observe that $a_0(w^1, v; \mathrm{Gr})$ and $a_1(w^1, w^2, v; \mathrm{Gr})$ satisfy (a nonlinear version of) our assumption of affine parameter dependence (7). In particular, we may write
$$a_0(w^1, v; \mathrm{Gr}) = \sum_{q=1}^{Q_0} \Theta_0^q(\mathrm{Gr})\, a_0^q(w^1, v), \tag{46}$$
$$a_1(w^1, w^2, v; \mathrm{Gr}) = \sum_{q=1}^{Q_1} \Theta_1^q(\mathrm{Gr})\, a_1^q(w^1, w^2, v), \tag{47}$$
√ 1 2 = 2 and Q = 1. In particular, (Gr) = 1, (Gr) = Gr, and 11 (Gr) = for Q 0 1 0 0 √ Gr; the corresponding parameter-independent bilinear and trilinear forms should be clear from (43) and (44). We shall exploit (46) and (47) in our offline–online decomposition. We define the derivative (about z ∈ X ) bilinear form dg(·, ·; z; Gr) : X × X → R as dg(w, v; z; Gr) ≡ a0 (w, v; Gr) + a1 (w, z, v; Gr)
which clearly inherits the affine structure (46) and (47) of $g$; we note that, for our simple quadratic nonlinearity, $g(z + w, v; Gr) = g(z, v; Gr) + dg(w, v; z; Gr) + \tfrac{1}{2}\, a_1(w, w, v; Gr)$. We then associate to $dg(\cdot, \cdot; z; Gr)$ our Babuška inf–sup and continuity “constants”

$$\beta(z; Gr) \equiv \inf_{w \in X}\, \sup_{v \in X}\, \frac{dg(w, v; z; Gr)}{\|w\|_X\, \|v\|_X}, \qquad \gamma(z; Gr) \equiv \sup_{w \in X}\, \sup_{v \in X}\, \frac{dg(w, v; z; Gr)}{\|w\|_X\, \|v\|_X},$$
respectively; these constants now depend on the state $z$ about which we linearize. We shall confirm a posteriori that a solution to our problem does indeed exist for all $Gr$ in the chosen $D$; we can further demonstrate [22] that the manifold $\{u(Gr) \mid Gr \in D\}$ upon which we focus is a nonsingular (isolated) solution branch,* and thus $\beta(u(Gr)) \ge \beta_0 > 0$, $\forall\, Gr \in D$. We can also verify $\gamma(z; Gr) \le 2\sqrt{Gr}\,\bigl(1 + \rho_U(\rho_U + Pr\,\rho_\theta)\, \|z\|_X\bigr)$, where

$$\rho_U \equiv \sup_{V \in X^U} \frac{\|V\|_{L^4(\Omega)}}{\|V\|_{X^U}}, \qquad \rho_\theta \equiv \sup_{\phi \in X^\theta} \frac{\|\phi\|_{L^4(\Omega)}}{\|\phi\|_{H^1(\Omega)}} \tag{48}$$

are Sobolev embedding constants [27, 28]; for $V \in X^U$, $\|V\|_{L^n(\Omega)} \equiv \bigl(\int_\Omega (V_i V_i)^{n/2}\bigr)^{1/n}$, $1 \le n < \infty$, $(W, V)_{X^U} \equiv \int_\Omega (\partial W_i/\partial x_j)(\partial V_i/\partial x_j) + W_i V_i$, and $\|V\|_{X^U} \equiv (V, V)_{X^U}^{1/2}$.

We present in Fig. 3(a) a plot of $s(Gr)$; as expected, for low $Gr$ we obtain the conduction solution, $s(Gr) = 1$; at higher $Gr$, the larger buoyancy terms create more vigorous flows and hence more effective heat transfer. We show in Fig. 3(b) the velocity and temperature distribution at $Gr = 10^4$; we observe the familiar “S”-shaped natural convection profile.
5.2. Reduced-Basis Approximation
For simplicity of exposition we shall not address here the adjoint in the nonlinear (approximation or error estimation) context [22], and we shall thus only consider RB treatment of the primal problem, (5) and (42). Our RB (Galerkin)
* We note that our truth approximation is div-stable in the sense that the “Brezzi” inf–sup parameter, $\beta_{Br}$, is bounded from below (independent of $\mathcal{N}$):

$$\beta_{Br} \equiv \inf_{\{q \in X^p \,\mid\, \int_\Omega q = 0\}}\; \sup_{V \in X^U}\; \frac{\int_\Omega q\, (\partial V_i/\partial x_i)}{\|V\|_{X^U}\, \|q\|_{L^2(\Omega)}} > 0;$$

this is a necessary condition for “Babuška” inf–sup stability of the linearized operator $dg(\cdot, \cdot; z; Gr)$.
Figure 3. (a) Inverse Nusselt number $s(Gr)$ as a function of $Gr$; and (b) velocity and temperature field for $Gr = 10^4$.
approximation is thus: for given $Gr \in D$, evaluate $s_N(Gr) = \ell(u_N(Gr))$, where $u_N(Gr) \equiv (U_N, p_N, \theta_N, \lambda_N)(Gr) \in W_N \equiv W_N^U \times W_N^p \times W_N^\theta \times W_N^\lambda$ satisfies

$$g(u_N(Gr), v; Gr) = 0, \qquad \forall\, v \in W_N,$$

for $\ell$ and $g$ defined in (41) and (42)–(45). There are two new ingredients: correct choice of $W_N$ to ensure div-stability; and efficient offline–online treatment of the nonlinearity.

We first address $W_N$. To begin, we assume that $N = 4m$ for $m$ a positive integer, and we introduce a sequence of nested parameter samples $S_N^{pr} \equiv \{\mu_1^{pr} \in D, \ldots, \mu_{N/4}^{pr} \in D\}$ in terms of which we may then define the components of $W_N$. It is simplest to start with $W_N^p \equiv \mathrm{span}\{p(\mu_n^{pr}),\ 1 \le n \le N/4,\ \text{and } \bar p\}$, where $\bar p = 1$ is the constant function; we then choose $W_N^U \equiv \mathrm{span}\{U(\mu_n^{pr}),\ S p(\mu_n^{pr}),\ 1 \le n \le N/4\}$, where for $q \in L^2(\Omega)$, $Sq \in X^U$ satisfies

$$(Sq, V)_{X^U} = \int_\Omega \frac{\partial V_i}{\partial x_i}\, q, \qquad \forall\, V \in X^U;$$

we next define $W_N^\theta \equiv \mathrm{span}\{\theta(\mu_n^{pr}),\ 1 \le n \le N/4\}$; and, finally, $W_N^\lambda \equiv \mathbb{R}$. Note that $W_N^U$ must be chosen such that the RB approximation satisfies the Brezzi div-stability condition; for our problem, the domain and hence the span of the supremizers do not depend on the parameter, and therefore the choice of $W_N^U$ is simple – the more general case is addressed in [29]. We observe that $\dim(W_N^U) = N/2$, $\dim(W_N^p) = (N/4) + 1$, $\dim(W_N^\theta) = N/4$, and $\dim(W_N^\lambda) = 1$, and hence $\dim(W_N) = N + 2$.*
* In fact, we can explicitly eliminate (the zero coefficient of) $\bar p$ and $\lambda_N$ ($= 0$) from our RB discrete equations, and thus the effective dimension of $W_N$ is $N$. In the RB context, for which each member $p(\mu_n^{pr})$ of $W_N^p$ is explicitly zero-mean, the services of the Lagrange multiplier are no longer required.
For our nonlinear problem, the essential computational kernel is the inner Newton update: given a $k$th iterate $u_N^k(Gr)$, the Newton increment $\delta u_N^k(Gr)$ satisfies $dg(\delta u_N^k(Gr), v; u_N^k(Gr); Gr) = -g(u_N^k(Gr), v; Gr)$, $\forall\, v \in W_N$. If we now expand $u_N^k(Gr) = \sum_{n=1}^{N} u_{N\,n}^k(Gr)\, \zeta_n$ – where $W_N = \mathrm{span}\{\zeta_n,\ 1 \le n \le N\}$ – and $\delta u_N^k(Gr) = \sum_{j=1}^{N} \delta u_{N\,j}^k(Gr)\, \zeta_j$, we obtain [17] the linear set of equations

$$\sum_{j=1}^{N} \left\{ \sum_{q=1}^{Q_0} \Theta_0^q(Gr)\, a_0^q(\zeta_j, \zeta_i) + \sum_{n=1}^{N} \sum_{q=1}^{Q_1} \Theta_1^q(Gr)\, u_{N\,n}^k(Gr)\, a_1^q(\zeta_j, \zeta_n, \zeta_i) \right\} \delta u_{N\,j}^k(Gr) = -g(u_N^k(Gr), \zeta_i; Gr), \qquad 1 \le i \le N,$$

where (from (42))

$$-g(u_N^k(Gr), \zeta_i; Gr) = f(\zeta_i) - \sum_{j=1}^{N} \left\{ \sum_{q=1}^{Q_0} \Theta_0^q(Gr)\, a_0^q(\zeta_j, \zeta_i) + \frac{1}{2} \sum_{n=1}^{N} \sum_{q=1}^{Q_1} u_{N\,n}^k(Gr)\, \Theta_1^q(Gr)\, a_1^q(\zeta_j, \zeta_n, \zeta_i) \right\} u_{N\,j}^k(Gr)$$
is the residual for $v = \zeta_i$. We can now directly apply the offline–online procedure [7–9] described earlier for linear problems, except now we must perform summations both “over the affine parameter dependence” and “over the reduced-basis coefficients” (of the current Newton iterate about which we linearize).* The operation count for the predominant Newton update component of the online stage is then – per Newton iteration – $O(N^3)$ to assemble the residual, $-g(u_N^k(Gr), \zeta_i; Gr)$, $1 \le i \le N$, and $O(N^3)$ to assemble and invert the $N \times N$ Jacobian. The essential point is that the online complexity is independent of $\mathcal{N}$, the dimension of the truth finite element space, thanks to offline generation and storage of the requisite parameter-independent quantities (for example, $a_1^q(\zeta_j, \zeta_n, \zeta_i)$). For this particular nonlinear problem, there is relatively little additional cost associated with the nonlinearity. However, our success depends crucially on the low-order polynomial nature of our nonlinearity: in general, standard Galerkin procedures will yield $N^{n+1}$ complexity for an $n$th-order ($n \ge 2$) polynomial nonlinearity. Although symmetries can be invoked to modestly improve the scaling with $N$ and $n$ [18], in any event new approaches will be required for nonpolynomial nonlinearities; we discuss these new procedures for efficient treatment of general nonaffine and nonlinear operators in Outlook.

* In essence – we shall see this again in the error estimation context – our quadratic nonlinearity effectively introduces $N$ additional “parameter-dependent functions” and “parameter-independent forms” associated with the coefficients of our field-variable expansion and our trilinear form, respectively; however, these new parameter contributions are correlated in ways that we can gainfully exploit.
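The online Newton update described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the array names `A0`, `A1`, `F`, the function `newton_online`, and the small random test data are all hypothetical, and the trilinear array is assumed symmetric in its first two arguments (as $a_1$ is in (44)). The offline stage is assumed to have stored $a_0^q(\zeta_j, \zeta_i)$, $a_1^q(\zeta_j, \zeta_n, \zeta_i)$, and $f(\zeta_i)$.

```python
import numpy as np

def newton_online(theta0, theta1, A0, A1, F, u0, tol=1e-12, max_iter=50):
    """Newton iteration for g(u_N, zeta_i) = 0 in the reduced space.

    theta0: (Q0,) values Theta_0^q(Gr); theta1: (Q1,) values Theta_1^q(Gr)
    A0[q, j, i]    = a_0^q(zeta_j, zeta_i)            (stored offline)
    A1[q, j, n, i] = a_1^q(zeta_j, zeta_n, zeta_i)    (stored offline,
                     assumed symmetric in j and n)
    F[i]           = f(zeta_i)
    """
    u = u0.copy()
    # Sum "over the affine parameter dependence".
    a0 = np.einsum("q,qji->ji", theta0, A0)
    a1 = np.einsum("q,qjni->jni", theta1, A1)
    for _ in range(max_iter):
        # Sum "over the reduced-basis coefficients": O(N^3) per iteration.
        lin = np.einsum("n,jni->ji", u, a1)       # a_1(zeta_j, u, zeta_i)
        residual = F - (a0 + 0.5 * lin).T @ u     # -g(u, zeta_i)
        if np.linalg.norm(residual) < tol:
            break
        jacobian = (a0 + lin).T                   # dg(zeta_j, zeta_i; u)
        u = u + np.linalg.solve(jacobian, residual)
    return u

# Hypothetical toy data (not from the chapter): N = 3, Q0 = 2, Q1 = 1.
rng = np.random.default_rng(0)
N = 3
A0 = np.stack([np.eye(N), np.zeros((N, N))])
T = 0.001 * rng.uniform(-1.0, 1.0, size=(1, N, N, N))
A1 = 0.5 * (T + T.transpose(0, 2, 1, 3))          # symmetrize in (j, n)
F = np.ones(N)
gr = 100.0
theta0 = np.array([1.0, np.sqrt(gr)])
theta1 = np.array([np.sqrt(gr)])
u = newton_online(theta0, theta1, A0, A1, F, np.zeros(N))
```

Nothing in the loop touches an array whose size depends on the truth dimension, which is the point of the offline–online split; the cubic cost comes from contracting the stored trilinear array with the current iterate.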
5.3. A Posteriori Error Estimation
The motivation for rigorous a posteriori error estimation is even more self-evident in the case of nonlinear problems. Fortunately, there is a rich mathematical foundation upon which to build the necessary computational structure. We first introduce the former; we then describe the latter. For simplicity, we develop here error bounds only for the primal energy norm, $\|u(\mu) - u_N(\mu)\|_X$; we can also develop error bounds for the output – however, good effectivities will require consideration of the dual [22].
5.3.1. Error bounds

We require some slight modifications to our earlier (linear) preliminaries. In particular, we introduce $T_N^\mu : X \to X$ such that, for any $w \in X$, $(T_N^\mu w, v)_X = dg(w, v; u_N(\mu); \mu)$, $\forall\, v \in X$; we then define $\sigma_N(w; \mu) \equiv \|T_N^\mu w\|_X / \|w\|_X$. Our inf–sup and continuity constants – now linearized about the reduced-basis solution – can then be expressed as $\beta_N(\mu) \equiv \beta(u_N(\mu); \mu) = \inf_{w \in X} \sigma_N(w; \mu)$, and $\gamma_N(\mu) \equiv \gamma(u_N(\mu); \mu) = \sup_{w \in X} \sigma_N(w; \mu)$, respectively; as before, we shall need a nonnegative lower bound for the inf–sup parameter, $\tilde\beta_N(\mu)$, such that $\beta_N(\mu) \ge \tilde\beta_N(\mu) \ge 0$, $\forall\, \mu \in D$.

As in the linear case, the dual norm of the residual, $\varepsilon_N(\mu)$ of (22), shall play a central role; the (negative of the) residual for our current nonlinear problem is given by (42) for $w = u_N(\mu)$. We also introduce a new combination of parameters $\tau_N(\mu) \equiv 2\rho(\mu)\,\varepsilon_N(\mu)/\tilde\beta_N^2(\mu)$, where $\rho(\mu) = 2\sqrt{Gr}\,\rho_U(\rho_U + Pr\,\rho_\theta)$ depends on the Sobolev embedding constants $\rho_U$ and $\rho_\theta$ of (48); in essence, $\tau_N(\mu)$ is an appropriately “nondimensionalized” measure of the residual. Finally, we define $N^*(\mu)$ such that $\tau_N(\mu) < 1$ for $N \ge N^*(\mu)$; we require $N^*(\mu) \le N_{\max}$, $\forall\, \mu \in D$. (The latter is a condition on $N_{\max}$ that reflects both the convergence rate of the RB approximation and the quality of our inf–sup lower bound.) We recall that $\mu \equiv Gr \in D \equiv [1.0, 1.0 \times 10^4]$. Our error bound is then expressed, for any $\mu \in D$ and $N \ge N^*(\mu)$, as

$$\Delta_N(\mu) = \frac{\tilde\beta_N(\mu)}{\rho(\mu)} \left(1 - \sqrt{1 - \tau_N(\mu)}\right). \tag{49}$$
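The main result can be very simply stated: if $N \ge N^*(\mu)$, there exists a unique solution $u(\mu)$ to (5) in the open ball

$$B\!\left(u_N(\mu), \frac{\tilde\beta_N(\mu)}{\rho(\mu)}\right) \equiv \left\{ z \in X \;\Big|\; \|z - u_N(\mu)\|_X < \frac{\tilde\beta_N(\mu)}{\rho(\mu)} \right\}; \tag{50}$$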
furthermore,

$$\|u(\mu) - u_N(\mu)\|_X \le \Delta_N(\mu). \tag{51}$$
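The evaluation of (49) is simple enough to write down directly. The following is a minimal sketch, assuming the dual norm of the residual and the inf–sup lower bound have already been computed; the function names `brr_error_bound` and `rho_of_gr` are hypothetical, the Prandtl number is left as an argument, and the default Sobolev constants are the offline values quoted later in this section. The bound is written as $\tau_N/(1 + \sqrt{1 - \tau_N})$, which is algebraically identical to $1 - \sqrt{1 - \tau_N}$ but avoids cancellation when $\tau_N$ is very small.

```python
import math

def rho_of_gr(gr, pr, rho_u=0.6008, rho_theta=0.2788):
    """rho(mu) = 2 sqrt(Gr) rho_U (rho_U + Pr rho_theta), with mu = Gr."""
    return 2.0 * math.sqrt(gr) * rho_u * (rho_u + pr * rho_theta)

def brr_error_bound(eps_n, beta_lb, rho):
    """Return (tau_N, Delta_N); Delta_N is None when the bound is not operative.

    eps_n   : dual norm of the residual, eps_N(mu)
    beta_lb : inf-sup lower bound, beta~_N(mu)
    rho     : nonlinearity measure, rho(mu)
    """
    if beta_lb <= 0.0:
        return math.inf, None
    tau = 2.0 * rho * eps_n / beta_lb**2
    if tau >= 1.0:
        return tau, None          # N < N*(mu): no certified bound available
    # Same value as (beta_lb / rho) * (1 - sqrt(1 - tau)), stable for small tau.
    delta = (beta_lb / rho) * tau / (1.0 + math.sqrt(1.0 - tau))
    return tau, delta

# Illustrative numbers only (not taken from the chapter).
tau, delta = brr_error_bound(eps_n=1e-6, beta_lb=0.1, rho=50.0)
```

For these illustrative inputs $\tau_N = 0.01$, and the returned bound is within a fraction of a percent of $\varepsilon_N/\tilde\beta_N$, the value one would obtain for the linearized problem.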
The proof, given in Ref. [22], is a slight specialization of a general abstract result [30, 31] that in turn derives from the Brezzi–Rappaz–Raviart (BRR) framework for the analysis of variational approximations of nonlinear partial differential equations [32]; the central ingredient is the construction of an appropriate contraction mapping which then forms the foundation for a standard fixed-point argument. On the basis of the main proposition (50) and (51) we can further prove several important corollaries related to the well-posedness of the truth approximation (5), and – similar to the linear result (24) – the effectivity of our error bound (49) [22]. We note that, as $\varepsilon_N(\mu) \to 0$, we shall certainly satisfy $N \ge N^*(\mu)$; furthermore the upper bound to the true error, $\Delta_N(\mu)$ of (49), is asymptotic to $\varepsilon_N(\mu)/\tilde\beta_N(\mu)$. We may derive these limits directly and rigorously from (49) and (51), or more heuristically from the equation for the error $e(\mu) \equiv u(\mu) - u_N(\mu)$,

$$dg(e(\mu), v; u_N(\mu); \mu) = -g(u_N(\mu), v; \mu) - \tfrac{1}{2}\, a_1(e(\mu), e(\mu), v; \mu). \tag{52}$$

We conclude that the nonlinear case shares much in common with the limiting linear case. However, there are also important differences: even for $\tau_N(\mu) < 1$, we must (in general) admit the possibility of other solutions to (5) – solutions outside $B(u_N(\mu), \tilde\beta_N(\mu)/\rho(\mu))$ – that are not near $u_N(\mu)$; and for $\tau_N(\mu) \ge 1$, we cannot even be assured that there is indeed any solution $u(\mu)$ near $u_N(\mu)$. This conclusion is not surprising: for “noncoercive” nonlinear problems the error equation (52) may in general admit no or several solutions; we can only be certain that a small (isolated) solution exists, (50) and (51), if the residual is sufficiently small. The theory informs us that the appropriate measure of the residual is $\tau_N(\mu)$, which reflects both the stability of the operator ($\tilde\beta_N(\mu)$) and the strength of the nonlinearity ($\rho(\mu)$).
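The asymptotic statement above can be checked by a short Taylor expansion. The following worked step uses only the definition of $\tau_N(\mu)$ and the bound (49); it is offered as a reader's aid rather than as part of the proof in Ref. [22].

```latex
\begin{align*}
1 - \sqrt{1 - \tau_N}
  &= \tfrac{1}{2}\,\tau_N + \tfrac{1}{8}\,\tau_N^2 + O(\tau_N^3), \\
\Delta_N(\mu)
  &= \frac{\tilde\beta_N(\mu)}{\rho(\mu)}
     \left( \tfrac{1}{2}\,\tau_N + \tfrac{1}{8}\,\tau_N^2 + O(\tau_N^3) \right)
   = \frac{\tilde\beta_N(\mu)}{\rho(\mu)} \cdot
     \frac{\rho(\mu)\,\varepsilon_N(\mu)}{\tilde\beta_N^2(\mu)}
     \left( 1 + \tfrac{1}{4}\,\tau_N + O(\tau_N^2) \right) \\
  &= \frac{\varepsilon_N(\mu)}{\tilde\beta_N(\mu)}
     \left( 1 + \tfrac{1}{4}\,\tau_N(\mu) + O(\tau_N^2(\mu)) \right).
\end{align*}
```

The Sobolev constants thus enter the bound only through the correction factor in $\tau_N(\mu)$, which tends to unity as the residual vanishes.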
As in the linear case, the real computational challenge is the development of efficient procedures for the calculation of the necessary a posteriori quantities:* the dual norm of the residual, $\varepsilon_N(\mu)$; the inf–sup lower bound, $\tilde\beta_N(\mu)$; and – new to our nonlinear problem – the Sobolev constants, $\rho_U$ and $\rho_\theta$. We now turn to these considerations.

* Typically, the BRR framework provides a nonquantitative a priori or a posteriori justification of asymptotic convergence. In our context, there is a unique opportunity to render the BRR theory completely predictive: actual a posteriori error estimators that are quantitative, rigorous, sharp, and (online) inexpensive.
5.3.2. Offline–online computational procedures

The dual norm of the residual. Fortunately, the duality relation of the linear case, (29), still applies – $g(w, v; \mu)$ of (42) is nonlinear in $w$, but of course linear in $v$. For our nonlinear problem, the negative of the residual, (42), for $w = u_N(\mu)$, may be expressed in terms of the reduced-basis expansion (12) as

$$-g(u_N(\mu), v; \mu) = f(v) - \sum_{n=1}^{N} u_{N\,n}(\mu) \left\{ \sum_{q=1}^{Q_0} \Theta_0^q(\mu)\, a_0^q(\zeta_n, v) + \frac{1}{2} \sum_{q'=1}^{Q_1} \sum_{n'=1}^{N} \Theta_1^{q'}(\mu)\, u_{N\,n'}(\mu)\, a_1^{q'}(\zeta_n, \zeta_{n'}, v) \right\}, \tag{53}$$
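where we recall that $\mu \equiv Gr$. If we insert (53) in (29) and apply linear superposition, we obtain

$$\hat e(\mu) = \mathcal{C} + \sum_{n=1}^{N} u_{N\,n}(\mu) \left\{ \sum_{q=1}^{Q_0} \Theta_0^q(\mu)\, \mathcal{L}_n^q + \sum_{q'=1}^{Q_1} \sum_{n'=1}^{N} \Theta_1^{q'}(\mu)\, u_{N\,n'}(\mu)\, \mathcal{Q}_{n n'}^{q'} \right\},$$

where $\mathcal{C} \in X$ satisfies $(\mathcal{C}, v)_X = f(v)$, $\forall\, v \in X$; $\mathcal{L}_n^q \in X$ satisfies $(\mathcal{L}_n^q, v)_X = -a_0^q(\zeta_n, v)$, $\forall\, v \in X$, $1 \le n \le N$, $1 \le q \le Q_0$; and $\mathcal{Q}_{n n'}^q \in X$ satisfies $(\mathcal{Q}_{n n'}^q, v)_X = -a_1^q(\zeta_n, \zeta_{n'}, v)/2$, $\forall\, v \in X$, $1 \le n, n' \le N$, $1 \le q \le Q_1$; the latter are again simple (vector) Poisson problems. It thus follows that [22]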
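$$\begin{aligned}
\|\hat e(\mu)\|_X^2 ={}& (\mathcal{C}, \mathcal{C})_X + 2 \sum_{n=1}^{N} u_{N\,n}(\mu) \left\{ \sum_{q=1}^{Q_0} \Theta_0^q(\mu)\, (\mathcal{C}, \mathcal{L}_n^q)_X + \sum_{n'=1}^{N} u_{N\,n'}(\mu) \sum_{q=1}^{Q_1} \Theta_1^q(\mu)\, (\mathcal{C}, \mathcal{Q}_{n n'}^q)_X \right\} \\
&+ \sum_{n=1}^{N} \sum_{n'=1}^{N} u_{N\,n}(\mu)\, u_{N\,n'}(\mu) \Biggl\{ \sum_{q=1}^{Q_0} \sum_{q'=1}^{Q_0} \Theta_0^q(\mu)\, \Theta_0^{q'}(\mu)\, (\mathcal{L}_n^q, \mathcal{L}_{n'}^{q'})_X \\
&\qquad + 2 \sum_{n''=1}^{N} u_{N\,n''}(\mu) \sum_{q=1}^{Q_0} \sum_{q'=1}^{Q_1} \Theta_0^q(\mu)\, \Theta_1^{q'}(\mu)\, (\mathcal{L}_n^q, \mathcal{Q}_{n' n''}^{q'})_X \\
&\qquad + \sum_{n''=1}^{N} \sum_{n'''=1}^{N} u_{N\,n''}(\mu)\, u_{N\,n'''}(\mu) \sum_{q=1}^{Q_1} \sum_{q'=1}^{Q_1} \Theta_1^q(\mu)\, \Theta_1^{q'}(\mu)\, (\mathcal{Q}_{n n''}^q, \mathcal{Q}_{n' n'''}^{q'})_X \Biggr\}
\end{aligned}$$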
from which we can directly calculate the requisite dual norm of the residual through (28). We can now readily adapt the offline–online procedure developed in the linear case; however, our summation “over the affine dependence” now involves a double summation “over the reduced-basis coefficients”. The operation count for the online stage is thus (to leading order) $O(Q_1^2 N^4)$; the essential point is that the online complexity is again independent of $\mathcal{N}$ – thanks to offline generation and storage of the requisite parameter-independent inner products (for example, $(\mathcal{Q}_{n n'}^q, \mathcal{Q}_{n'' n'''}^{q'})_X$, $1 \le n, n', n'', n''' \le N$, $1 \le q, q' \le Q_1$). Although the $N^4$ online scaling is certainly less than pleasant, the error bound is calculated only once – at the termination of the Newton iteration – and hence in actual practice the additional online cost attributable to the residual dual norm computation is in fact not too large. However, the quartic scaling with $N$ is again a memento mori that, for higher order (than quadratic) nonlinearities, standard Galerkin procedures are not viable; we discuss the alternatives further in Outlook.

Lower bound for the inf–sup parameter. Our procedure for the linear case can be readily adopted: we need “only” incorporate the $N$ additional parameter-dependent “coefficient functions” – in fact, the RB coefficients – that appear in the linearized-about-$u_N(\mu)$ derivative operator. Hence, for our nonlinear problem, the bilinear form T of (34) and Rayleigh quotient F of (35) now contain sensitivity derivatives of these additional “coefficient functions”; furthermore, the $(\mu, \bar\mu)$ function of (37) – our second-order remainder term – now includes the deviation of the RB coefficients from linear parameter dependence. Further details are provided in Ref. [22] (for $Pr = 0$) for the case in which $W_N \equiv W_N^U$ is divergence-free.

Sobolev continuity constant. We present here the procedure for calculation of $\rho_U$; the procedure for $\rho_\theta$ is similar. We first note [27, 28] that $\rho_U = (1/\hat\delta_{\min})^{1/2}$, where $(\hat\delta, \hat\xi) \in (\mathbb{R}_+, X^U)$ satisfies

$$(\hat\xi, V)_{X^U} = \hat\delta \int_\Omega \hat\xi_j\, \hat\xi_j\, \hat\xi_i\, V_i, \quad \forall\, V \in X^U, \qquad \|\hat\xi\|_{L^4(\Omega)}^4 = 1,$$

and $(\hat\delta_{\min}, \hat\xi_{\min})$ denotes the ground state. To solve this eigenproblem, and in particular to ensure that we realize the ground state, we pursue a homotopy procedure. Towards that end, we introduce a parameter $h \in [0, 1]$ (and associated small increment $\Delta h$) and look for $(\delta(h), \xi(h)) \in (\mathbb{R}_+, X^U)$ that satisfies

$$(\xi(h), V)_{X^U} = \delta(h) \left[ h \int_\Omega \xi_j(h)\, \xi_j(h)\, \xi_i(h)\, V_i + (1 - h) \int_\Omega \xi_i(h)\, V_i \right], \quad \forall\, V \in X^U,$$

$$h\, \|\xi\|_{L^4(\Omega)}^4 + (1 - h)\, \|\xi\|_{L^2(\Omega)}^2 = 1; \tag{54}$$

$(\delta_{\min}(h), \xi_{\min}(h))$ denotes the ground state. We observe that $(\delta_{\min}(1), \xi_{\min}(1)) = (\hat\delta_{\min}, \hat\xi_{\min})$; and that $(\delta_{\min}(0), \xi_{\min}(0))$ is the lowest eigenpair of the standard
(vector) Laplacian “linear” eigenproblem. Our homotopy procedure is simple: we first set $h_{old} = 0$ and find $(\delta_{\min}(0), \xi_{\min}(0))$ by standard techniques; then, until $h_{new} = 1$, we set $h_{new} \leftarrow h_{old} + \Delta h$, solve (54) for $(\delta_{\min}(h_{new}), \xi_{\min}(h_{new}))$ by Newton iteration initialized to $(\delta_{\min}(h_{old}), \xi_{\min}(h_{old}))$, and update $h_{old} \leftarrow h_{new}$. For our domain, we find (offline) $\rho_U = 0.6008$, $\rho_\theta = 0.2788$; since $\rho_U$ and $\rho_\theta$ are parameter-independent, no online computation is required.
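The homotopy loop just described can be illustrated on a much simpler problem. The sketch below is an assumption-laden analogue, not the computation reported here: it treats a scalar function on the unit interval with zero boundary values, discretized by second-order finite differences with lumped quadrature, so the numerical value it returns is not $\rho_U$ for the cavity. The function names `build_1d` and `ground_state_by_homotopy` are hypothetical.

```python
import numpy as np

def build_1d(n):
    """Gram matrix K ~ integral of (xi' V' + xi V) and quadrature weights w
    for n interior finite-difference nodes on ]0, 1[ (an illustrative choice)."""
    dx = 1.0 / (n + 1)
    K = (np.diag(np.full(n, 2.0 / dx + dx))
         - np.diag(np.full(n - 1, 1.0 / dx), 1)
         - np.diag(np.full(n - 1, 1.0 / dx), -1))
    return K, np.full(n, dx)

def ground_state_by_homotopy(K, w, dh=0.1, newton_iters=30, tol=1e-10):
    """Follow the ground state of the discrete analogue of (54) from h=0 to h=1."""
    n = len(w)
    # h = 0: lowest eigenpair of the linear problem K xi = delta W xi
    # (uniform weights assumed, so W = w[0] * I).
    evals, evecs = np.linalg.eigh(K / w[0])
    delta = evals[0]
    xi = evecs[:, 0]
    xi = xi * np.sign(xi.sum()) / np.sqrt(w @ xi**2)
    for h in np.linspace(dh, 1.0, int(round(1.0 / dh))):
        for _ in range(newton_iters):   # Newton, started from the previous h
            rhs = w * (h * xi**3 + (1.0 - h) * xi)
            constraint = h * (w @ xi**4) + (1.0 - h) * (w @ xi**2) - 1.0
            F = np.concatenate([K @ xi - delta * rhs, [constraint]])
            if np.linalg.norm(F) < tol:
                break
            J = np.zeros((n + 1, n + 1))
            J[:n, :n] = K - delta * np.diag(w * (3.0 * h * xi**2 + (1.0 - h)))
            J[:n, n] = -rhs
            J[n, :n] = w * (4.0 * h * xi**3 + 2.0 * (1.0 - h) * xi)
            step = np.linalg.solve(J, -F)
            xi, delta = xi + step[:n], delta + step[n]
    return delta, xi

K, w = build_1d(40)
delta_min, xi_min = ground_state_by_homotopy(K, w)
rho = delta_min ** -0.5   # discrete analogue of rho = (1 / delta_min)^(1/2)
```

Starting each Newton solve from the previous ground state is what keeps the iteration on the lowest branch; a cold start at $h = 1$ could converge to an excited state and would then underestimate the embedding constant.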
5.3.3. Sample construction

The greedy algorithm developed in the linear case requires some modification in the nonlinear context. The first issue is that, to evaluate our error bound $\Delta_N(\mu)$, we must appeal to our inf–sup lower bound; however, in the nonlinear case, this inf–sup lower bound, $\tilde\beta_N(\mu)$, is defined with respect to the linearized state $u_{N_{\max}}(\mu)$ [22]. In short, to determine the “next” sample point $\mu_{N+1}$ we must already know $S_{N_{\max}}$ – and hence $\mu_{N+1}$. To avoid this circular reference during the offline sample generation process, we replace our inf–sup lower bound with a crude (for example, piecewise constant over $D$) approximation to $\beta(u(\mu))$; once the samples are constructed, we revert to our rigorous (and now calculable) lower bound, $\tilde\beta_N(\mu)$. The second issue is that, in the nonlinear context, our error bound is not operative until $\tau_N(\mu) < 1$; hence, the greedy procedure must first select on $\arg\max_{\mu \in F} \tau_N(\mu)$ – until $\tau_N(\mu) < 1$ over $D$ – and only subsequently select on $\arg\max_{\mu \in F} \Delta_N(\mu)$ [Prud’homme, private communication]. The resulting sample will ensure not only rapid convergence to the exact solution, but also rapid convergence to a certifiably accurate solution.
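The two-stage selection rule at the end of this paragraph can be sketched as follows. This is a minimal illustration only: the function name `next_greedy_sample` is hypothetical, and the callables `tau` and `bound` stand in for evaluations of $\tau_N(\mu)$ and $\Delta_N(\mu)$ with the current reduced-basis space (here replaced by toy formulas that have no physical meaning).

```python
def next_greedy_sample(candidates, tau, bound):
    """Pick the next sample point from a finite candidate list.

    While tau_N >= 1 somewhere, the error bound is not operative, so select
    the candidate with the largest tau_N; afterwards select on the bound.
    Returns (mu, criterion), where criterion is "tau" or "bound".
    """
    taus = {mu: tau(mu) for mu in candidates}
    worst = max(taus, key=taus.get)
    if taus[worst] >= 1.0:
        return worst, "tau"
    return max(candidates, key=bound), "bound"

# Toy stand-ins (assumptions for illustration).
grid = [1.0, 10.0, 100.0]
early = next_greedy_sample(grid, tau=lambda mu: mu / 50.0,
                           bound=lambda mu: -abs(mu - 10.0))
late = next_greedy_sample(grid, tau=lambda mu: mu / 1000.0,
                          bound=lambda mu: -abs(mu - 10.0))
```

In the first call one candidate has $\tau_N \ge 1$, so it is selected regardless of the bound; in the second all candidates are certified and the selection falls back to the largest error bound.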
5.4. Numerical Results
We present in Table 2 $\|u(\tilde\mu_N) - u_N(\tilde\mu_N)\|_X / \|u(\tilde\mu_N)\|_X$, $\Delta_{N,rel}(\tilde\mu_N) \equiv \Delta_N(\tilde\mu_N)/\|u_N(\tilde\mu_N)\|_X$, and $\eta_N(\tilde\mu_N) \equiv \Delta_N(\tilde\mu_N)/\|e(\tilde\mu_N)\|_X$ for $8 \le N \le N_{\max} = 40$; here

$$\tilde\mu_N \equiv \arg\max_{\mu \in \mathrm{Test}} \frac{\|u(\mu) - u_N(\mu)\|_X}{\|u(\mu)\|_X},$$

and Test is a random parameter grid of size $n_{\mathrm{Test}} = 500$. We observe very rapid convergence of $u_N(\mu)$ to $u(\mu)$ over $D$ (more precisely, Test) – our samples $S_N$ are optimally constructed to provide uniform convergence. The output error decreases even more rapidly: $\max_{\mu \in \mathrm{Test}} |s(\mu) - s_N(\mu)|/s(\mu) = 1.34 \times 10^{-1}$, $2.80 \times 10^{-4}$, and $9.79 \times 10^{-7}$ for $N = 8$, 16, and 24, respectively; this “superconvergence” is a vestige of near compliance. As regards a posteriori error estimation, we observe that $N^*(\tilde\mu_N) = 24$
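Table 2. Convergence and effectivity results for the natural convection problem; the “*” signifies that $N^*(\tilde\mu_N) > N$, which in turn indicates that $\tau_N(\tilde\mu_N) \ge 1$

| $N$ | $\|u(\tilde\mu_N) - u_N(\tilde\mu_N)\|_X / \|u(\tilde\mu_N)\|_X$ | $\Delta_{N,rel}(\tilde\mu_N)$ | $\eta_N(\tilde\mu_N)$ |
|-----|-----------------------|-----------------------|-------|
| 8   | $3.28 \times 10^{-1}$ | *                     | *     |
| 16  | $1.45 \times 10^{-2}$ | *                     | *     |
| 24  | $1.80 \times 10^{-4}$ | $7.47 \times 10^{-4}$ | 4.15  |
| 32  | $8.05 \times 10^{-7}$ | $7.60 \times 10^{-6}$ | 9.44  |
| 40  | $4.60 \times 10^{-8}$ | $8.69 \times 10^{-7}$ | 18.93 |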
is relatively small – we can (respectively, cannot) provide a definitive error bound for $N \ge 24$ (respectively, $N < 24$); more generally, we find that $N^*(\mu) \le 24$, $\forall\, \mu \in D$. We note that the effectivities are quite good* – in fact, considerably better than the worst-case predictions of our effectivity corollary. (The higher effectivity at $N = 40$ is undoubtedly due to round-off in the online summation.) The results of Table 2 are based on an inf–sup lower bound construction with $J = 28$ elements: points $\bar\mu_j$ and polytopes (here segments) $P^{\bar\mu_j}$, $1 \le j \le J$. The accuracy of the resulting lower bound is reflected in the modest $N^*(\mu)$ and the good effectivities reported in Table 2. Most of the points $\bar\mu_j$ are clustered at larger $Gr$, as might be expected.

Finally, we note that the total online computational time on a Pentium M 1.6 GHz processor to predict $u_N(Gr)$, $s_N(Gr)$, and $\Delta_N(Gr)$ to a relative accuracy (in the energy norm) of $10^{-3}$ is – $\forall\, Gr \in D$ – 300 ms; this should be compared to 50 s for direct finite element calculation of the truth solution, $u(Gr)$, $s(Gr)$. We achieve computational savings of $O(100)$: $N$ is very small thanks to (i) the good convergence properties of $S_N$ and hence $W_N$, and (ii) the rigorous and sharp stopping criterion provided by $\Delta_N(Gr)$; and the marginal computational complexity to evaluate $s_N(Gr)$ and $\Delta_N(Gr)$ depends only on $N$ and not on $\mathcal{N}$ – thanks to the offline–online decomposition. The computational savings will be even more significant for more complex problems, particularly in three spatial dimensions; it is critical to recall that we realize these savings without compromising rigorous certainty.†

* It is perhaps surprising that the BRR theory – not really designed for quantitative service – yields such sharp results. However, it is important to note that, as $\varepsilon_N(\mu) \to 0$, $\Delta_N(\mu) \sim \varepsilon_N(\mu)/\tilde\beta_N(\mu)$, and thus the more pessimistic bounds (in particular $\rho$) are absent – except in $\tau_N(\mu)$.

† We admit that the extension of our results to much larger $Gr$ is not without difficulty. The more complex flow structures and the stronger nonlinearity will degrade the convergence rate and a posteriori error bounds – and increase $N$ and $J$; and (inevitable) limit points and bifurcations will require special precautions.
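As a quick consistency check on the figures quoted in this section, the effectivities and the speed-up can be recomputed from the tabulated values. This is simple arithmetic on the numbers of Table 2 and the timings above; the variable names are illustrative, and the ratio of the two relative quantities only approximates $\eta_N$ because the bound is normalized by $\|u_N\|_X$ and the error by $\|u\|_X$.

```python
# Values copied from Table 2 (N = 24, 32, 40, where the bound is operative).
rel_error = {24: 1.80e-4, 32: 8.05e-7, 40: 4.60e-8}
rel_bound = {24: 7.47e-4, 32: 7.60e-6, 40: 8.69e-7}
reported_eta = {24: 4.15, 32: 9.44, 40: 18.93}

# Effectivity ~ (relative bound) / (relative error) when ||u_N|| ~ ||u||.
eta = {n: rel_bound[n] / rel_error[n] for n in rel_error}

# Online versus truth finite element timings quoted in the text.
online_seconds = 0.300
truth_seconds = 50.0
speedup = truth_seconds / online_seconds

for n in sorted(eta):
    print(f"N = {n}: eta ~ {eta[n]:.2f} (reported {reported_eta[n]})")
print(f"speed-up ~ {speedup:.0f}x")
```

The recomputed ratios agree with the tabulated effectivities to within the rounding of the inputs, and the speed-up is consistent with the quoted $O(100)$.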
6. Outlook
We address here some of the more obvious questions that arise in reviewing the current state of affairs. As a first question: How many parameters $P$ can we consider – for $P$ how large are our techniques still viable? It is undeniably the case that ultimately we should anticipate exponential scaling (of both $N$ and certainly $J$) as $P$ increases, with a concomitant unacceptable increase certainly in offline but also perhaps in online computational effort. Fortunately, for smaller $P$, the growth in $N$ is rather modest, as (good) sampling procedures will automatically identify the more interesting regions of parameter space. Unfortunately, the growth in $J$ is more problematic: we shall require more efficient construction and verification procedures for our inf–sup lower bound samples. In any event, treatment of hundreds (or even many tens) of truly independent parameters by the global methods described in this chapter is clearly not practicable; in such cases, more local approaches must be pursued.*

A second question: How can we efficiently treat problems with non-affine parameter dependence and (more than quadratic) state-space nonlinearity? Both these issues are satisfactorily addressed by a new “empirical interpolation” approach [33]. In this approach, we replace a general nonaffine nonlinear function of the parameter $\mu$, spatial coordinate $x$, and field variable $u(x; \mu)$, $\mathcal{H}(u; x; \mu)$, by a collateral RB expansion: in particular, we approximate $\mathcal{H}(u_N(x; \mu); x; \mu)$ – as required in our RB projection for $u_N(\mu)$ – by $\mathcal{H}_M(x; \mu) = \sum_{m=1}^{M} d_m(\mu)\, \xi_m(x)$. The critical ingredients of the approach are (i) a “good” collateral RB sample, $S_M^{\mathcal{H}} = \{\mu_1^{\mathcal{H}}, \ldots, \mu_M^{\mathcal{H}}\}$, and approximation space, $\mathrm{span}\{\xi_m = \mathcal{H}(u(\mu_m^{\mathcal{H}}); x; \mu_m^{\mathcal{H}}),\ 1 \le m \le M\}$, (ii) a stable and inexpensive interpolation procedure by which to determine (online) the $d_m(\mu)$, $1 \le m \le M$, and (iii) effective a posteriori error bounds with which to quantify the effect of the newly introduced truncation. It is perhaps only in the latter that the technique is somewhat disappointing: the error estimators – though quite sharp and very efficient – are completely (provably) rigorous upper bounds only in certain restricted situations.

Finally, a third question, again related to generality: What class of PDEs can be treated? In addition to the elliptic equations discussed in this paper, parabolic equations can also be addressed satisfactorily from both the approximation and error estimation points of view [24, 34, 35]:† much of the elliptic technology directly applies, except that time now appears as an additional parameter; this parabolic framework can be viewed as an extension of time-domain model reduction procedures [19, 25, 36]. Unfortunately, treatment of hyperbolic problems does not look promising: although RB methods can perform quite well anecdotally, in general the underlying smoothness (in parameter $\mu$) and stability will no longer obtain; as a result, both the approximation properties and error estimators will suffer.

We close by noting that the offline aspects of the approaches described are both complicated and computationally expensive. The former can be at least partially addressed by appropriate software and architectures [37]; however, the latter will in any event remain. It follows that these techniques will really only be viable in situations in which there is truly an imperative for real-time certified response: a real premium on (i) greatly reduced marginal cost (or asymptotic average cost), and (ii) rigorous characterization of certainty; or equivalently, a very high (opportunity) cost associated with (i) slow response – long latency times, and (ii) incorrect (or unsafe) decisions or actions. There are many classes of materials and materials processing problems and contexts for which the methods are appropriate; and certainly there are many classes of materials and materials processing problems and contexts for which more classical techniques remain distinctly preferred.

* We do note that at least some problems with ostensibly many parameters in fact involve highly coupled or correlated parameters: certain classes of shape optimization certainly fall into this category. In these situations, global progress can be made.

† To date we have experience with only stable parabolic systems such as the heat equation; unstable systems present considerable difficulty, in particular if long-time solutions are desired.
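Returning to the second question above, the interpolation step (ii) of the empirical interpolation approach can be illustrated with a small sketch. It is a simplified variant written under stated assumptions, not the algorithm of Ref. [33] in full: the snapshots are taken in a fixed given order rather than chosen greedily over the parameter, the function is sampled on a one-dimensional grid, and the names `eim_basis` and `eim_coefficients` and the test function $e^{-\mu x}$ are hypothetical.

```python
import numpy as np

def eim_basis(snapshots):
    """Build interpolation functions and points from snapshot columns.

    snapshots: (n_x, M) array; column m holds one snapshot on a grid.
    Returns (Q, idx): Q is (n_x, M), idx holds M distinct grid indices, and
    Q[idx, :] is lower triangular with unit diagonal (up to round-off).
    Assumes the snapshots are linearly independent.
    """
    n_x, M = snapshots.shape
    Q = np.zeros((n_x, M))
    idx = []
    for m in range(M):
        r = snapshots[:, m].copy()
        if m > 0:
            # Subtract the interpolant of the new snapshot on the current points.
            coeff = np.linalg.solve(Q[idx, :m], r[idx])
            r -= Q[:, :m] @ coeff
        i = int(np.argmax(np.abs(r)))   # next interpolation point
        Q[:, m] = r / r[i]
        idx.append(i)
    return Q, np.array(idx)

def eim_coefficients(Q, idx, values_at_points):
    """Online step: M function values in, M expansion coefficients out."""
    return np.linalg.solve(Q[idx, :], values_at_points)

# Illustrative nonaffine function exp(-mu x) on [0, 1] (an assumption).
x = np.linspace(0.0, 1.0, 101)
mus = [0.5, 1.0, 2.0, 4.0, 8.0]
S = np.column_stack([np.exp(-mu * x) for mu in mus])
Q, idx = eim_basis(S)

g = np.exp(-3.0 * x)                        # a parameter value not in the sample
d = eim_coefficients(Q, idx, g[idx])        # needs g only at the M points
g_M = Q @ d                                 # approximation on the whole grid
```

The online cost is one small triangular solve per parameter value, and the function itself is evaluated only at the $M$ interpolation points; this is what restores an offline–online split when the parameter dependence is not affine.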
Appendix A
Helmholtz Elasticity Example

We first define a reference domain corresponding to the geometry $b = b_r = 1$ and $L = L_r = 0.2$. We then map $\Omega^o(b, L) \to \Omega \equiv \Omega^o(b_r, L_r)$ by a continuous piecewise-affine (in fact, piecewise-dilation-in-$x_1$) transformation. We define three subdomains, $\Omega_1 \equiv\, ]0, b_r - L_r/2[\, \times\, ]0, 1[$, $\Omega_2 \equiv\, ]b_r - L_r/2, b_r + L_r/2[\, \times\, ]0, 1[$, $\Omega_3 \equiv\, ]b_r + L_r/2, 2[\, \times\, ]0, 1[$, such that $\bar\Omega = \bar\Omega_1 \cup \bar\Omega_2 \cup \bar\Omega_3$.

We may then express the resulting bilinear form $a(w, v; \mu)$ as an affine sum (7) for $Q = 10$; the particular $\Theta^q(\mu)$, $a^q(w, v)$, $1 \le q \le 10$, are as shown in Table 3. (Recall that $w = (w_1, w_2)$ and $v = (v_1, v_2)$.) The constitutive constants in Table 3 are given by

$$c_{11} = \frac{1}{1 - \nu^2}, \qquad c_{22} = c_{11}, \qquad c_{12} = \frac{\nu}{1 - \nu^2}, \qquad c_{66} = \frac{1}{2(1 + \nu)},$$

where $\nu = 0.25$ is the Poisson ratio (and the normalized Young’s modulus is unity); recall that we consider plane stress and a linear isotropic solid. We now define our inner product-cum-bound conditioner as

$$(w, v)_X \equiv \int_\Omega c_{11}\, \frac{\partial v_1}{\partial x_1}\frac{\partial w_1}{\partial x_1} + c_{22}\, \frac{\partial v_2}{\partial x_2}\frac{\partial w_2}{\partial x_2} + c_{66}\, \frac{\partial v_2}{\partial x_1}\frac{\partial w_2}{\partial x_1} + c_{66}\, \frac{\partial v_1}{\partial x_2}\frac{\partial w_1}{\partial x_2} + w_1 v_1 + w_2 v_2 = \sum_{q=2}^{Q} a^q(w, v);$$
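Table 3. Parametric functions $\Theta^q(\mu)$ and parameter-independent bilinear forms $a^q(w, v)$ for the two-dimensional crack problem

| $q$ | $\Theta^q(\mu)$ | $a^q(w, v)$ |
|-----|-----------------|-------------|
| 1 | $1$ | $c_{12} \int_\Omega \left( \frac{\partial v_1}{\partial x_1}\frac{\partial w_2}{\partial x_2} + \frac{\partial v_2}{\partial x_2}\frac{\partial w_1}{\partial x_1} \right) + c_{66} \int_\Omega \left( \frac{\partial v_1}{\partial x_2}\frac{\partial w_2}{\partial x_1} + \frac{\partial v_2}{\partial x_1}\frac{\partial w_1}{\partial x_2} \right)$ |
| 2 | $\frac{b_r - L_r/2}{b - L/2}$ | $c_{11} \int_{\Omega_1} \frac{\partial v_1}{\partial x_1}\frac{\partial w_1}{\partial x_1} + c_{66} \int_{\Omega_1} \frac{\partial v_2}{\partial x_1}\frac{\partial w_2}{\partial x_1}$ |
| 3 | $\frac{L_r}{L}$ | $c_{11} \int_{\Omega_2} \frac{\partial v_1}{\partial x_1}\frac{\partial w_1}{\partial x_1} + c_{66} \int_{\Omega_2} \frac{\partial v_2}{\partial x_1}\frac{\partial w_2}{\partial x_1}$ |
| 4 | $\frac{2 - b_r - L_r/2}{2 - b - L/2}$ | $c_{11} \int_{\Omega_3} \frac{\partial v_1}{\partial x_1}\frac{\partial w_1}{\partial x_1} + c_{66} \int_{\Omega_3} \frac{\partial v_2}{\partial x_1}\frac{\partial w_2}{\partial x_1}$ |
| 5 | $\frac{b - L/2}{b_r - L_r/2}$ | $c_{22} \int_{\Omega_1} \frac{\partial v_2}{\partial x_2}\frac{\partial w_2}{\partial x_2} + c_{66} \int_{\Omega_1} \frac{\partial v_1}{\partial x_2}\frac{\partial w_1}{\partial x_2}$ |
| 6 | $\frac{L}{L_r}$ | $c_{22} \int_{\Omega_2} \frac{\partial v_2}{\partial x_2}\frac{\partial w_2}{\partial x_2} + c_{66} \int_{\Omega_2} \frac{\partial v_1}{\partial x_2}\frac{\partial w_1}{\partial x_2}$ |
| 7 | $\frac{2 - b - L/2}{2 - b_r - L_r/2}$ | $c_{22} \int_{\Omega_3} \frac{\partial v_2}{\partial x_2}\frac{\partial w_2}{\partial x_2} + c_{66} \int_{\Omega_3} \frac{\partial v_1}{\partial x_2}\frac{\partial w_1}{\partial x_2}$ |
| 8 | $-\omega^2\, \frac{b - L/2}{b_r - L_r/2}$ | $\int_{\Omega_1} w_1 v_1 + w_2 v_2$ |
| 9 | $-\omega^2\, \frac{L}{L_r}$ | $\int_{\Omega_2} w_1 v_1 + w_2 v_2$ |
| 10 | $-\omega^2\, \frac{2 - b - L/2}{2 - b_r - L_r/2}$ | $\int_{\Omega_3} w_1 v_1 + w_2 v_2$ |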
thanks to the Dirichlet conditions at x1 = 0 (and also the wi v i term), (·, ·) X is appropriately coercive. We now observe that (µ) = 1 ( 1 = 0) and we can thus disregard the q = 1 term in our continuity bounds. We may then choose |v|2q = a q (v, v), 2 ≤ q ≤ Q, since the a q (·, ·) are positive semi-definite; it thus follows from the Cauchy–Schwarz inequality that q = 1, 2 ≤ q ≤ Q; furthermore, from (36), we directly obtain C X = 1.
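The parametric functions of Table 3 and the constitutive constants above are easy to tabulate. The sketch below is a reading aid under stated assumptions: the function names `plane_stress_constants` and `crack_thetas` are hypothetical, the parameter is taken to be $(b, L, \omega^2)$, and the ten entries follow the order $q = 1, \ldots, 10$ of Table 3 as read here.

```python
def plane_stress_constants(nu=0.25):
    """Return (c11, c22, c12, c66) for unit Young's modulus, plane stress."""
    c11 = 1.0 / (1.0 - nu**2)
    c12 = nu / (1.0 - nu**2)
    c66 = 1.0 / (2.0 * (1.0 + nu))
    return c11, c11, c12, c66

def crack_thetas(b, L, omega2, b_r=1.0, L_r=0.2):
    """Return [Theta^1, ..., Theta^10] for geometry (b, L) and omega^2."""
    # Dilation factors (actual width / reference width) of the three strips.
    s1 = (b - L / 2.0) / (b_r - L_r / 2.0)
    s2 = L / L_r
    s3 = (2.0 - b - L / 2.0) / (2.0 - b_r - L_r / 2.0)
    return [1.0,
            1.0 / s1, 1.0 / s2, 1.0 / s3,          # x1-derivative terms
            s1, s2, s3,                             # x2-derivative terms
            -omega2 * s1, -omega2 * s2, -omega2 * s3]  # inertia terms

c11, c22, c12, c66 = plane_stress_constants()
thetas_ref = crack_thetas(b=1.0, L=0.2, omega2=4.0)   # reference geometry
```

At the reference geometry every dilation factor is unity, so the stiffness coefficients reduce to one and the inertia coefficients to $-\omega^2$; the constants also satisfy the isotropy relation $c_{11} - c_{12} = 2 c_{66}$.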
Acknowledgments

We would like to thank Professor Yvon Maday of University Paris VI for his many invaluable contributions to this work. We would also like to thank Dr Christophe Prud’homme of EPFL, Mr Martin Grepl of MIT, Mr Gianluigi Rozza of EPFL, and Professor Liu Gui-Rong of NUS for many helpful recommendations. This work was supported by DARPA and AFOSR under Grant F49620-03-1-0356, DARPA/GEAE and AFOSR under Grant F49620-03-10439, and the Singapore-MIT Alliance.
References

[1] B.O. Almroth, P. Stern, and F.A. Brogan, “Automatic choice of global shape functions in structural analysis,” AIAA J., 16, 525–528, 1978.
[2] A.K. Noor and J.M. Peters, “Reduced basis technique for nonlinear analysis of structures,” AIAA J., 18, 455–462, 1980.
[3] J.P. Fink and W.C. Rheinboldt, “On the error behavior of the reduced basis technique for nonlinear finite element approximations,” Z. Angew. Math. Mech., 63, 21–28, 1983.
[4] T.A. Porsching, “Estimation of the error in the reduced basis method solution of nonlinear equations,” Math. Comput., 45, 487–496, 1985.
[5] M.D. Gunzburger, Finite Element Methods for Viscous Incompressible Flows: A Guide to Theory, Practice, and Algorithms, Academic Press, Boston, 1989.
[6] J.S. Peterson, “The reduced basis method for incompressible viscous flow calculations,” SIAM J. Sci. Stat. Comput., 10, 777–786, 1989.
[7] K. Ito and S.S. Ravindran, “A reduced-order method for simulation and control of fluid flows,” Journal of Computational Physics, 143, 403–425, 1998.
[8] L. Machiels, Y. Maday, I.B. Oliveira, A.T. Patera, and D. Rovas, “Output bounds for reduced-basis approximations of symmetric positive definite eigenvalue problems,” C. R. Acad. Sci. Paris, Série I, 331, 153–158, 2000.
[9] C. Prud’homme, D. Rovas, K. Veroy, Y. Maday, A.T. Patera, and G. Turinici, “Reliable real-time solution of parametrized partial differential equations: Reduced-basis output bound methods,” J. Fluids Eng., 124, 70–80, 2002.
[10] Y. Maday, A.T. Patera, and G. Turinici, “Global a priori convergence theory for reduced-basis approximation of single-parameter symmetric coercive elliptic partial differential equations,” C. R. Acad. Sci. Paris, Série I, 335, 289–294, 2002.
[11] E. Balmes, “Parametric families of reduced finite element models: Theory and applications,” Mech. Syst. Signal Process., 10, 381–394, 1996.
[12] Y. Maday, A.T. Patera, and D.V. Rovas, “A blackbox reduced-basis output bound method for noncoercive linear problems,” In: D. Cioranescu and J. Lions (eds.), Nonlinear Partial Differential Equations and Their Applications, Collège de France Seminar Volume XIV, Elsevier Science B.V, pp. 533–569, 2002.
[13] R. Becker and R. Rannacher, “Weighted a posteriori error control in finite element methods,” ENUMATH 95 Proceedings World Science Publications, Singapore, 1997.
[14] M. Paraschivoiu and A.T. Patera, “A hierarchical duality approach to bounds for the outputs of partial differential equations,” Comp. Meth. Appl. Mech. Eng., 158, 389–407, 1998.
[15] M. Ainsworth and J.T. Oden, A Posteriori Error Estimation in Finite Element Analysis, Pure and Applied Mathematics, Wiley-Interscience, New York, 2000.
[16] J.W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
[17] K. Veroy, C. Prud’homme, and A.T. Patera, “Reduced-basis approximation of the viscous Burgers equation: Rigorous a posteriori error bounds,” C. R. Acad. Sci. Paris, Série I, 337, 619–624, 2003.
[18] K. Veroy, C. Prud’homme, D.V. Rovas, and A.T. Patera, “A posteriori error bounds for reduced-basis approximation of parametrized noncoercive and nonlinear elliptic partial differential equations (AIAA Paper 2003-3847),” Proceedings of the 16th AIAA Computational Fluid Dynamics Conference, 2003.
[19] M. Meyer and H.G. Matthies, “Efficient model reduction in non-linear dynamics using the Karhunen–Loève expansion and dual-weighted-residual methods,” Comput. Mech., 31, 179–191, 2003.
[20] A. Quarteroni and A. Valli, Numerical Approximation of Partial Differential Equations, 2nd edn., Springer, 1997.
[21] K. Veroy, D. Rovas, and A.T. Patera, “A posteriori error estimation for reduced-basis approximation of parametrized elliptic coercive partial differential equations: “Convex inverse” bound conditioners,” Control, Optim. Calculus Var., 8, 1007–1028, Special Volume: A tribute to J.-L. Lions, 2002.
[22] K. Veroy and A.T. Patera, “Certified real-time solution of the parametrized steady incompressible Navier–Stokes equations; Rigorous reduced-basis a posteriori error bounds,” Submitted to International Journal for Numerical Methods in Fluids (Special Issue: Proceedings for 2004 ICFD Conference on Numerical Methods for Fluid Dynamics, Oxford), 2004.
[23] N.C. Nguyen, Reduced-Basis Approximation and A Posteriori Error Bounds for Nonaffine and Nonlinear Partial Differential Equations: Application to Inverse Analysis, PhD Thesis, Singapore-MIT Alliance, National University of Singapore, In progress, 2005.
[24] M.A. Grepl, N.C. Nguyen, K. Veroy, A.T. Patera, and G.R. Liu, “Certified rapid solution of parametrized partial differential equations for real-time applications,” Proceedings of the 2nd Sandia Workshop of PDE-Constrained Optimization: Towards Real-Time and On-Line PDE-Constrained Optimization, SIAM Computational Science and Engineering Book Series, Submitted, 2004.
[25] L. Sirovich, “Turbulence and the dynamics of coherent structures, Part 1: Coherent structures,” Q. Appl. Math., 45, 561–571, 1987.
[26] B. Roux (ed.), Numerical Simulation of Oscillatory Convection in Low-Pr Fluids: A GAMM Workshop, vol. 27 of Notes on Numerical Fluids Mechanics, Vieweg, 1990.
[27] N. Trudinger, “On imbedding into Orlicz spaces and some applications,” J. Math. Mech., 17, 473–484, 1967.
[28] G. Talenti, “Best constant in Sobolev inequality,” Ann. Mat. Pura Appl., 110, 353–372, 1976.
[29] G. Rozza, “Proceedings of the Third M.I.T. Conference on Computational Fluid and Solid Mechanics,” June 14–17, 2005. In: K. Bathe (ed.), Computational Fluid and Solid Mechanics, Elsevier, Submitted, 2005.
[30] G. Caloz and J. Rappaz, “Numerical analysis for nonlinear and bifurcation problems,” In: P. Ciarlet and J. Lions (eds.), Handbook of Numerical Analysis, vol. V, Techniques of Scientific Computing (Part 2), Elsevier Science B.V, pp. 487–637, 1997.
[31] K. Ito and S.S. Ravindran, “A reduced basis method for control problems governed by PDEs,” In: W. Desch, F. Kappel, and K. Kunisch (eds.), Control and Estimation of Distributed Parameter Systems, Birkhäuser, pp. 153–168, 1998.
[32] F. Brezzi, J. Rappaz, and P. Raviart, “Finite dimensional approximation of nonlinear problems. Part I: Branches of nonsingular solutions,” Numerische Mathematik, 36, 1–25, 1980.
[33] M. Barrault, N.C. Nguyen, Y. Maday, and A.T. Patera, “An “empirical interpolation” method: application to efficient reduced-basis discretization of partial differential equations,” C. R. Acad. Sci. Paris, Série I, 339, 667–672, 2004.
[34] D. Rovas, Reduced-Basis Output Bound Methods for Parametrized Partial Differential Equations, PhD Thesis, Massachusetts Institute of Technology, Cambridge, MA, 2002.
[35] M.A. Grepl and A.T. Patera, “A posteriori error bounds for reduced-basis approximations of parametrized parabolic partial differential equations,” M2AN Math. Model. Numer. Anal., To appear, 2005.
[36] Z.J. Bai, “Krylov subspace techniques for reduced-order modeling of large-scale dynamical systems,” Appl. Numer. Math., 43, 9–44, 2002.
[37] C. Prud’homme, D.V. Rovas, K. Veroy, and A.T. Patera, “A mathematical and computational framework for reliable real-time solution of parametrized partial differential equations,” M2AN Math. Model. Numer. Anal., 36, 747–771, 2002.