Physics of Stochastic Processes: How Randomness Acts in Time
Reinhard Mahnke, Jevgenijs Kaupužs, and Ihor Lubashevsky
The Authors
Dr. Reinhard Mahnke, University of Rostock, Institute of Physics, Rostock, Germany
Dr. Jevgenijs Kaupužs, University of Latvia, Mathematical and Computer Science, Riga, Latvia
Prof. Ihor Lubashevsky, Russian Academy of Sciences, Prokhorov General Physics Institute, Moscow, Russia
All books published by Wiley-VCH are carefully produced. Nevertheless, authors, editors, and publisher do not warrant the information contained in these books, including this book, to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate. Library of Congress Card No.: applied for British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library. Bibliographic information published by the Deutsche Nationalbibliothek Die Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de. 2009 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim All rights reserved (including those of translation into other languages). No part of this book may be reproduced in any form – by photoprinting, microfilm, or any other means – nor transmitted or translated into a machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law. Composition Laserwords Private Ltd., Chennai, India Printing betz-druck GmbH, Darmstadt Bookbinding Litges & Dopf GmbH, Heppenheim Printed in the Federal Republic of Germany Printed on acid-free paper ISBN: 978-3-527-40840-5
Contents
Preface
Part I Basic Mathematical Description
1 Fundamental Concepts
1.1 Wiener Process, Adapted Processes and Quadratic Variation
1.2 The Space of Square Integrable Random Variables
1.3 The Ito Integral and the Ito Formula
1.4 The Kolmogorov Differential Equation and the Fokker–Planck Equation
1.5 Special Diffusion Processes
1.6 Exercises
2 Multidimensional Approach
2.1 Bounded Multidimensional Region
2.2 From Chapman–Kolmogorov Equation to Fokker–Planck Description
2.2.1 The Backward Fokker–Planck Equation
2.2.2 Boundary Singularities
2.2.3 The Forward Fokker–Planck Equation
2.2.4 Boundary Relations
2.3 Different Types of Boundaries
2.4 Equivalent Lattice Representation of Random Walks Near the Boundary
2.4.1 Diffusion Tensor Representations
2.4.2 Equivalent Lattice Random Walks
2.4.3 Properties of the Boundary Layer
2.5 Expression for Boundary Singularities
2.6 Derivation of Singular Boundary Scaling Properties
2.6.1 Moments of the Walker Distribution and the Generating Function
2.6.2 Master Equation for Lattice Random Walks and its General Solution
2.6.3 Limit of Multiple-Step Random Walks on Small Time Scales
2.6.4 Continuum Limit and a Boundary Model
2.7 Boundary Condition for the Backward Fokker–Planck Equation
2.8 Boundary Condition for the Forward Fokker–Planck Equation
2.9 Concluding Remarks
2.10 Exercises
Part II Physics of Stochastic Processes
3 The Master Equation
3.1 Markovian Stochastic Processes
3.2 The Master Equation
3.3 One-Step Processes in Finite Systems
3.4 The First-Passage Time Problem
3.5 The Poisson Process in Closed and Open Systems
3.6 The Two-Level System
3.7 The Three-Level System
3.8 Exercises
4 The Fokker–Planck Equation
4.1 General Fokker–Planck Equations
4.2 Bounded Drift–Diffusion in One Dimension
4.3 The Escape Problem and its Solution
4.4 Derivation of the Fokker–Planck Equation
4.5 Fokker–Planck Dynamics in Finite State Space
4.6 Fokker–Planck Dynamics with Coordinate-Dependent Diffusion Coefficient
4.7 Alternative Method of Solving the Fokker–Planck Equation
4.8 Exercises
5 The Langevin Equation
5.1 A System of Many Brownian Particles
5.2 A Traditional View of the Langevin Equation
5.3 Additive White Noise
5.4 Spectral Analysis
5.5 Brownian Motion in Three-Dimensional Velocity Space
5.6 Stochastic Differential Equations
5.7 The Standard Wiener Process
5.8 Arithmetic Brownian Motion
5.9 Geometric Brownian Motion
5.10 Exercises
Part III Applications
6 One-Dimensional Diffusion
6.1 Random Walk on a Line and Diffusion: Main Results
6.2 A Drunken Sailor as Random Walker
6.3 Diffusion with Natural Boundaries
6.4 Diffusion in a Finite Interval with Mixed Boundaries
6.5 The Mirror Method and Time Lag
6.6 Maximum Value Distribution
6.7 Summary of Results for Diffusion in a Finite Interval
6.7.1 Reflected Diffusion
6.7.2 Diffusion in a Semi-Open System
6.7.3 Diffusion in an Open System
6.8 Exercises
7 Bounded Drift–Diffusion Motion
7.1 Drift–Diffusion Equation with Natural Boundaries
7.2 Drift–Diffusion Problem with Absorbing and Reflecting Boundaries
7.3 Dimensionless Drift–Diffusion Equation
7.4 Solution in Terms of Orthogonal Eigenfunctions
7.5 First-Passage Time Probability Density
7.6 Cumulative Breakdown Probability
7.7 The Limiting Case for Large Positive Values of the Control Parameter
7.8 A Brief Survey of the Exact Solution
7.8.1 Probability Density
7.8.2 Outflow Probability Density
7.8.3 First Moment of the Outflow Probability Density
7.8.4 Second Moment of the Outflow Probability Density
7.8.5 Outflow Probability
7.9 Relationship to the Sturm–Liouville Theory
7.10 Alternative Method by the Backward Fokker–Planck Equation
7.11 Roots of the Transcendental Equation
7.12 Exercises
8 The Ornstein–Uhlenbeck Process
8.1 Definitions and Properties
8.2 The Ornstein–Uhlenbeck Process and its Solution
8.3 The Ornstein–Uhlenbeck Process with Linear Potential
8.4 The Exponential Ornstein–Uhlenbeck Process
8.5 Outlook on Econophysics
8.6 Exercises
9 Nucleation in Supersaturated Vapors
9.1 Dynamics of First-Order Phase Transitions in Finite Systems
9.2 Condensation of Supersaturated Vapor
9.3 The General Multi-Droplet Scenario
9.4 Detailed Balance and Free Energy
9.5 Relaxation to the Free Energy Minimum
9.6 Chemical Potentials
9.7 Exercises
10 Vehicular Traffic
10.1 The Car-Following Theory
10.2 The Optimal Velocity Model and its Langevin Approach
10.3 Traffic Jam Formation on a Circular Road
10.4 Metastability Near Phase Transitions in Traffic Flow
10.5 Car Cluster Formation as First-Order Phase Transition
10.6 Thermodynamics of Traffic Flow
10.7 Exercises
11 Noise-Induced Phase Transitions
11.1 Equilibrium and Nonequilibrium Phase Transitions
11.2 Types of Stochastic Differential Equations
11.3 Transformation of Random Variables
11.4 Forms of the Fokker–Planck Equation
11.5 The Verhulst Model of Third Order
11.6 The Genetic Model
11.7 Noise-Induced Instability in Geometric Brownian Motion
11.8 System Dynamics with Stagnation
11.9 Oscillator with Dynamical Traps
11.10 Dynamics with Traps in a Chain of Oscillators
11.11 Self-Freezing Model for Multi-Lane Traffic
11.12 Exercises
12 Many-Particle Systems
12.1 Hopping Models with Zero-Range Interaction
12.2 The Zero-Range Model of Traffic Flow
12.3 Transition Rates and Phase Separation
12.4 Metastability
12.5 Monte Carlo Simulations of the Hopping Model
12.6 Fundamental Diagram of the Zero-Range Model
12.7 Polarization Kinetics in Ferroelectrics with Fluctuations
12.8 Exercises
Epilog
References
Index
Preface
A wide variety of systems in nature can be regarded as many-particle ensembles with extremely intricate dynamics of their elements. Numerous examples are known in physics, e.g. gases, fluids, superfluids, electrons and ions in conductors, semiconductors, plasma, nuclear matter in neutron stars, etc. Such macroscopic systems are typically formed of 10^23–10^28 particles, with essentially erratic motion, so a description of the individual elements is really hopeless. However, it is not actually necessary for practical tasks because on the macroscopic level we are dealing only with cumulative effects expressed in macroscopic variables. At this level, details of the individual particle motion are averaged – only the mean characteristics are essential for a description of the system dynamics. The deviation of an individual particle from the mean behavior can then be taken into account, if necessary, in terms of random fluctuations characterized again by some mean parameters. It should be noted that many systems of a nonphysical nature, e.g. fish swarms and bird flocks, vehicle ensembles, pedestrians or stock markets can be regarded (leaving aside social aspects of their behavior) as ensembles of interacting particles.
There are several approaches to tackling many-particle systems. Dealing with a physical object whose dynamics is based on the Newtonian or Schrödinger equation, it is possible to start from the microscopic description and directly write down the corresponding governing equations. Then a rather small part of the system comprising, e.g. one, two, or three particles should be singled out and considered individually. The effect of the other elements on this selected part is taken into account on the average.
Roughly speaking, it is in just this way that the notion of a thermal heat bath is introduced – a small part of the system under consideration is singled out and its interaction with the neighboring particles is simulated in terms of stochastic energy exchange with a certain reservoir characterized by some temperature. This approach is the most rigorous and, as a result, the most difficult way of constructing a bridge between the microscopic description dealing with individual particles (atoms, molecules, etc.) and the mesoscopic continuum fields, e.g. density, temperature, and pressure. Typically this bridge is implemented in the form of a partial differential equation or a system of such equations governing the distribution function of the particle or the collection of particles. We should point out that the many-particle ensembles governed by the laws of classical Newtonian mechanics exhibit chaotic dynamics rather than stochastic dynamics. The term chaotic refers to systems whose evolution from the initial conditions is rigorously determined by nonrandom dynamics. Given the initial conditions, the dynamics of such a deterministic system is formally predictable, so it is not stochastic in a rigorous sense. However, if the system trajectories are located inside a given bounded region and are nonperiodic, then their temporal and spatial structure is highly intricate and in fact looks like that of stochastic random paths. Moreover, these trajectories pass all the standard tests for randomness so, for practical purposes, they can be regarded as stochastic. This observation is actually one of the ways to justify introducing a thermal heat bath characterized by stochastic energy exchange (between the small part of the system under consideration and the surrounding particles treated as a random reservoir at a particular temperature). The same comments concerning the relationship between chaos and stochasticity should be addressed in time series analysis. Without knowledge of the origin it is practically impossible to distinguish between chaotic and stochastic behavior.
Another way to treat many-particle systems is to construct a collection of microscopic equations governing, e.g. the dynamics of individual particles where, for a given particle, the influence of the other particles is described in terms of both systematic and random forces. The notion of random forces again enables one to derive the corresponding partial differential equations for the distribution function of particles. Indeed the random forces should be introduced in such a way that these governing equations for the distribution function coincide with those obtained via the approach of the previous paragraph.
As far as social, ecological, and economic systems are concerned, postulating the appropriate form of the random forces seems to be the only way to construct a mathematical description. This is due to such systems being open. Moreover the behavior of their elements is so intricate and multifactorial that a closed mathematical description is likely to be impossible.
The latter approach is precisely the main topic of this book. It is based on probability theory or, more specifically, on the notion of stochastic processes and the relevant mathematical constructions, which are the subject matter of Chapters 1 and 2 (see the layout of the book shown at the end of this preface). On the microscopic level stochastic trajectories of the system motion are the basic elements of the probabilistic description. It is assumed that different stochastic realizations of the random force are independent and also that the motion of particles does not have long-time memory.
The notion of stochastic trajectories has a long history, possibly going back to the scientific poem De Rerum Natura (On the Nature of Things, circa 60 BC) by Titus Lucretius Carus. Although very little is known about the Roman philosopher, it seems he described the random motion of dust particles in air. In 1785 Jan Ingenhousz observed the irregular motion of coal dust particles on the surface of alcohol. Then, in 1827, the British botanist Robert Brown also discovered random highly erratic motion of pollen particles floating in water under the microscope. Since that time this phenomenon has been called Brownian motion. The generalization of the observed phenomena gave rise to the notion of random walks where the walker dynamics is governed by both regular and stochastic forces.
The first person who proposed a mathematical model for Brownian motion appears to be Thorvald N. Thiele in 1880. This was followed independently by Louis Bachelier in 1900 in his PhD thesis Théorie de la Spéculation devoted to a stochastic analysis of the stock and option markets. He worked out mathematically the idea that the stock market prices are essentially sums of independent, bounded random changes. The results put forward by Bachelier led to a flash of interest in stochastic processes and corresponding probabilistic approaches. However, it was Albert Einstein’s independent research into the problem in his 1905 paper that brought the solution to the attention of physicists (see, e.g. Brownian motion – Wikipedia, The Free Encyclopedia, 23 October 2007).
The qualitative explanation of Brownian motion as a kinetic phenomenon was put forward by several authors. As mentioned above, random forces can be added to the dynamical laws, an idea proposed for the first time by the French physicist Paul Langevin. (This resulted in a new mathematical field now known as stochastic differential equations.) The appropriate partial differential equations for the distribution function could then be derived based on the Langevin equation. It is possible to develop the probabilistic description of a stochastic process in the opposite way – the equations governing the distribution function are postulated and the appropriate Langevin equation is constructed in order to give these equations.
This idea was implemented for the first time by Albert Einstein deriving the diffusion equation for Brownian particles in his famous paper Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen published in Annalen der Physik (1905). The equation for diffusive motion was then developed by Adriaan Fokker (1914) and later more completely and generally by Max Planck (1918), leading to the transport equation now known as the Fokker–Planck equation.
There are also approaches to describing random processes in discrete phase spaces based on ordinary differential equations (e.g. the probability balance law known as the master equation). If a stochastic process develops in discrete space and time the cellular automata models can be used, which form a distinct branch of the theory of stochastic processes. These problems and their mutual interrelationship are considered in Chapters 3–5 which adopt one of the main assumptions in the theory of stochastic processes, the Markovian approximation. According to this approximation, the displacement of a wandering particle on mesoscopic scales can be considered as the result of many small independent identically distributed steps. This reasoning is very close to what is now called a Kramers–Moyal expansion and has been used to derive the Fokker–Planck equation.
To elucidate the main notions of stochastic processes, Chapters 6 to 8 consider in detail some rather simple examples of discrete random walks and continuous Brownian motion. In particular, they touch on the problem of reaching a boundary for the first time. This problem plays an essential role in many physical phenomena such as escaping from a potential well, anomalous diffusion in fractal media, heat diffusion in living tissue, etc.
As mentioned above, the notion of stochastic processes can form the initial mathematical description for objects of a nonphysical nature, e.g. social, ecological, and economic systems. This is a novel branch of science where only the first steps have been taken. It turns out that, in spite of their nonphysical nature, the cooperative phenomena in such systems (for example, self-organization processes in congested traffic or motion of pedestrians and social animals) exhibit a wide variety of properties commonly met in physical systems (for example in gas–liquid phase transitions, spinodal decomposition in solid solutions, ferromagnetic transitions, etc.). So, in some sense, the stochastic description of many-particle ensembles with strong interaction between their elements is of a more general nature than the basic laws of the corresponding mechanical systems.
These questions are considered in Chapters 9 and 10 dealing with the aggregation of particles out of an initially homogeneous situation. This phenomenon is well known in physics, as well as in other branches of the natural sciences and engineering. The formation of bound states as an aggregation process is due to self-organization. The formation of car clusters (jams) at overcritical densities in traffic flow is an analogous phenomenon in the sense that cars can be considered as (strongly asymmetrically) interacting particles. The development of traffic jams in vehicular flow is an everyday example of the occurrence of nucleation and aggregation in a system of many point-like cars. Traffic jams are a typical signature of the complex behavior of the many-car system.
The master equation approach to stochastic processes can be applied to describe the car-cluster formation on a road in partial analogy to droplet formation in a supersaturated vapor. This jamming transition is very similar to conventional phase transitions appearing in the study of critical phenomena. Traffic-like collective movements are observed at almost all levels of biological systems. We study the energy balance of motorized particles in a many-car system. New dynamical features, such as steady state motion with energy flux, also appear. This phenomenon is also observed in a system of active Brownian particles with energy take-up and energy dissipation.
The last two Chapters 11 and 12 are devoted to some modern applications in the physics of stochastic processes. First, we consider nonequilibrium phase transitions induced by noise or caused by dynamical traps. Probably, the former type of transition can only be described using the Langevin equation with multiplicative noise, that is, stochastic equations for which the intensity of the random forces depends on the system state. During the last few decades it has been demonstrated that the behavior of such systems can be rather complex; in particular, the appearance of new states can be induced by noise as its intensity increases and attains certain critical values. The second type of phase transition seems to be a commonly encountered phenomenon in systems, for example, congested traffic flow, where the human factor is essential. Such transitions are due to the existence of some regions in the corresponding phase space where the system dynamics is stagnated. Following the notions introduced in the theory of Hamiltonian dynamics with complex behavior, these regions are called dynamical traps.
Finally, we turn to the kinetics of many-particle systems. The zero-range process, introduced in 1970 by Frank Spitzer as a system of interacting random walks, serves as a generic model in which rigorous large-scale description of the dynamics for arbitrary initial densities is possible in terms of a hydrodynamic equation for the coarse-grained particle density. It allows one to derive a criterion for phase separation in one-dimensional driven systems of interacting particles, e.g. in traffic flow, as well as to describe nontrivial features of stochastic dynamics like metastability.
Nowadays another aspect which should be taken into account is non-Gaussian behavior; that is, long-tail distributions which are observed in stock market data as well as in transportation theory. In this sense, applied sciences such as sociology and econophysics, biophysics and engineering, consider extreme events in nature and society and deal with effects (like material rupture) which can be investigated only by the probabilistic approach.
In concluding this preface, we would like to underline the spirit in which this book is intended. Here we are in agreement with other authors of books on random processes; in particular, A. J. Chorin and O. H. Hald in Stochastic Tools in Mathematics and Science state: ‘When you asked alumni graduates from universities in Europe and US moving into nonacademic jobs in society and industry what they actually need in their business, you found that most of them did stochastic things like time series analysis, data processing etc., but that had never appeared in detail in university courses’. So the general aim of the present book is to provide stochastic tools for the multidisciplinary understanding of random events and to illustrate them with many beautiful applications in different disciplines ranging from econophysics to sociology.
The central problem under consideration in this book is thus the theoretical modeling of complex systems, that is, many-particle systems with nondeterministic behavior. In contrast to the established classical deterministic approach based on trajectories, we develop and investigate probabilistic dynamics using stochastic tools, such as stochastic differential equations, Fokker–Planck and master equations, to obtain the probability density distribution. The stochastic technique provides an exact and more understandable background to describe complex systems.
The authors have been working for years on the problems to which this monograph is devoted. Nevertheless, the book is also the result of longstanding scientific cooperation with a number of colleagues from all over the world. The authors thank Werner Ebeling, Rudolf Friedrich, Vilnis Frishfelds, Namik Gusein-Zade, Peter Hänggi, Rosemary Harris, Andreas Heuer, Dirk Helbing, Alexander Ignatov, Andris Jakovičs, Holger Kantz, Boris Kerner, Reinhart Kühne, Kai Nagel, Holger Nobach, Gerd Röpke, Yuri and Michael Romanovsky, Anri Rukhadze, Andreas Schadschneider, Michael Schreckenberg, Gunter M. Schütz, Lutz Schimansky-Geier, Yuki Sugiyama, Steffen Trimper, Peter Wagner and Hans Weber, for fruitful discussions. Special thanks are due to Friedrich Liese from the Institute of Mathematics at Rostock University for delivering a joint lecture series on Stochastic Processes from the mathematical (F. Liese) as well as physical (R. Mahnke) points of view and for preparing Chapter 1 of this book – Fundamental Concepts.
The contents of this book took shape over several years, based on research and lectures performed at different locations. One of the recent lecture presentations took place in the summer term of 2007 at Rostock University. The authors have benefited from the contributions of a number of students. We would like to express our gratitude to the active participants, Michael Brüdgam, Matthias Florian, Peter Grünwald, Hannes Hartmann, Julia Hinkel, Bastian Holst, Thomas Kiesel, Susanne Killiches, Knut Klingbeil, Christof Liebe, Daniel Münzner, Ralf Remer, Elisabeth Schöne, Philipp Sperling, Marten Tolk, Andris Voitkans, Norman Wilken, and Mathias Winkel, together with many other students, PhD students and co-workers. Finally, we would like to acknowledge Andrey Ushakov, a student from Moscow Technical University of Radiophysics, Engineering and Automation, who has contributed to Section 3.7 – Three-Level System.
The authors acknowledge support from the Deutsche Forschungsgemeinschaft via grant MA 1508/8.
Rostock, Riga, Moscow, October 2008
Reinhard Mahnke, Jevgenijs Kaupužs, Ihor Lubashevsky
Layout of the book
Mathematics: Chapter 1 Fundamentals + Chapter 2 Multidimensionality
Physics: Chapter 3 Master Equation; Chapter 4 Fokker–Planck Equation; Chapter 5 Langevin Equation
Applications I: Chapter 6 Diffusion in Coordinate Space; Chapter 7 Drift–Diffusion in Coordinate Space; Chapter 8 2D Diffusion in Velocity and Coordinate Space
Applications II: Chapter 9 Droplet Condensation in Vapors; Chapter 10 Vehicular Nucleation on Roads
Applications III: Chapter 11 Induced Phase Transitions + Chapter 12 Kinetics of Many-Particle Systems
Part I Basic Mathematical Description
1 Fundamental Concepts
1.1 Wiener Process, Adapted Processes and Quadratic Variation
Stochastic processes represent a fundamental concept used to model the development of a physical or nonphysical system in time. It has turned out that the apparatus of stochastic processes is powerful enough to be applied to many other fields, such as economics, finance, engineering, transportation, biology and medicine.
To start with, we recall that a random variable $X$ is a mapping $X : \Omega \to \mathbb{R}$ that assigns a real value to each elementary event $\omega \in \Omega$. The concrete value $X(\omega)$ is called a realization. It is the value we observe after the experiment has been done. To create a mathematical machine we suppose that a probability space $(\Omega, \mathcal{F}, P)$ is given. $\Omega$ is the set of all elementary events and $\mathcal{F}$ is the family of events we are interested in. It contains the set of all elementary events and is assumed to be closed with respect to forming the complement and countable intersections and unions of events from this collection of events. Such families of sets or events are called $\sigma$-algebras. The character $\sigma$ indicates that even the union or intersection of countably many sets belongs to $\mathcal{F}$ as well.
For mathematical reasons we have to assume that ‘events generated by $X$’, i.e. sets of the type $\{\omega : X(\omega) \in I\}$, where $I$ is an open or closed or semi-open interval, are really events; i.e. such sets are assumed also to belong to $\mathcal{F}$. Unfortunately the collection of all intervals of the real line is not closed with respect to the operation of union. The smallest collection of subsets of the real line that is a $\sigma$-algebra and contains all intervals is called the $\sigma$-algebra of Borel sets and will be denoted by $\mathcal{B}$. It turns out that we have not only $\{\omega : X(\omega) \in I\} \in \mathcal{F}$ for any interval $I$ but even $\{\omega : X(\omega) \in B\} \in \mathcal{F}$ for every Borel set $B$. This fact is referred to as the $\mathcal{F}$-measurability of $X$. It turns out that for any random variable $X$ and any continuous or monotone function $g$ the function $Y(\omega) = g(X(\omega))$ is again a random variable.
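The notions above can be made concrete on a finite probability space. The following Python sketch is only an illustration under stated assumptions: the fair die, the names `OMEGA`, `F_PARITY`, `parity`, `roll` and `g_of_parity`, and the choice $g(x) = 2x - 1$ are all hypothetical and do not come from the text. On a finite space, closure under complements and pairwise unions already gives a $\sigma$-algebra, and measurability can be checked by listing the events generated by the random variable.

```python
from fractions import Fraction
from itertools import chain, combinations

# Toy probability space (an assumption for illustration): one roll of a fair die.
OMEGA = frozenset(range(1, 7))
EVENS = frozenset({2, 4, 6})
# A small sigma-algebra that only distinguishes even rolls from odd ones.
F_PARITY = {frozenset(), EVENS, OMEGA - EVENS, OMEGA}


def subsets(items):
    """Return all subsets of a finite collection as frozensets."""
    items = list(items)
    combos = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in combos]


def is_sigma_algebra(family, space):
    """Finite check: contains the space, closed under complements and unions.

    Closure under intersections then follows from De Morgan's laws.
    """
    if space not in family:
        return False
    return all(space - a in family for a in family) and all(
        a | b in family for a in family for b in family
    )


def preimage(X, B, space):
    """The event {omega : X(omega) in B} generated by X."""
    return frozenset(w for w in space if X(w) in B)


def is_measurable(X, family, space):
    """X is measurable if every event generated by X belongs to the family."""
    values = {X(w) for w in space}
    return all(preimage(X, B, space) in family for B in subsets(values))


def prob(event):
    """Uniform probability measure on the die."""
    return Fraction(len(event), len(OMEGA))


def parity(w):
    """Random variable X: 1 for an even roll, 0 for an odd roll."""
    return 1 if w % 2 == 0 else 0


def roll(w):
    """The roll itself; it resolves more detail than F_PARITY contains."""
    return w


def g_of_parity(w):
    """Y = g(X) with the continuous function g(x) = 2x - 1."""
    return 2 * parity(w) - 1


print(is_sigma_algebra(F_PARITY, OMEGA))            # closure properties of F_PARITY
print(is_measurable(parity, F_PARITY, OMEGA))       # X generates only events in F_PARITY
print(is_measurable(roll, F_PARITY, OMEGA))         # {roll = 1} is not an event in F_PARITY
print(is_measurable(g_of_parity, F_PARITY, OMEGA))  # g(X) stays measurable
print(prob(preimage(parity, {1}, OMEGA)))           # P(X = 1)
```

In this toy setting the roll itself fails to be measurable with respect to the coarse family because the set of rolls equal to 1 is not one of its four events, whereas the composition of the parity variable with $g$ generates exactly the same events as the parity variable and is therefore again a random variable.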
This statement remains true even if we replace $g$ by a function from a larger class of functions, called the family of all measurable functions, to which not only the continuous functions but also the pointwise limits of continuous functions belong. This class of functions is closed with respect to ‘almost all’ standard manipulations with functions, such as linear combinations and products and finally forming new functions by plugging one function into another function.
The probability measure $P$ is defined on $\mathcal{F}$ and it assigns to each event $A \in \mathcal{F}$ a number $P(A)$ called the probability of $A$. The mapping $A \mapsto P(A)$ satisfies the axioms of probability theory, i.e. $P$ is a non-negative $\sigma$-additive set function on $\mathcal{F}$ with $P(\Omega) = 1$. We assume that the reader is familiar with probability theory at an introductory course level and in the following we use basic concepts and results without giving additional motivation or explanation.
Random variables or random vectors are useful concepts to model the random outcome of an experiment. But we have to include the additional variable ‘time’ when we are going to study random effects which change over time.
Definition 1.1 By a stochastic process we mean a family of random variables $(X_t)_{t \ge 0}$ which are defined on the probability space $(\Omega, \mathcal{F}, P)$.
By definition $X_t$ is in fact a function of two variables $X_t(\omega)$. For fixed $t$ this function of $\omega$ is a random variable. Otherwise, if we fix $\omega$ then we call the function of $t$ defined by $t \mapsto X_t(\omega)$ a realization or a path. This means that the realization of a stochastic process is a function. Therefore stochastic processes are sometimes referred to as random functions. We call a stochastic process continuous if all realizations are continuous functions.
For the construction of a stochastic process, that is, of a suitable probability space, one needs the so-called finite dimensional distributions which are the distributions of random vectors $(X_{t_1}, \ldots, X_{t_n})$, where $t_1 < t_2 < \cdots < t_n$ is any fixed selection. For details of the construction we refer to Øksendal [175]. A fundamental idea of modeling experiments with several random outcomes in both probability theory and mathematical statistics is to start with independent random variables and to create a model by choosing suitable functions of these independent random variables.
This fact explains why, in the area of stochastic processes, the processes with independent increments play an exceptional role. This, in combination with the fundamental role of the normal distribution in probability theory, makes clear the importance of the so-called Wiener process, which will now be defined.

Definition 1.2 A stochastic process $(W_t)_{t\ge 0}$ is called a standard Wiener process or (briefly) Wiener process if:
1) $W_0 = 0$,
2) $(W_t)_{t\ge 0}$ has independent increments, i.e. $W_{t_n} - W_{t_{n-1}}, \dots, W_{t_2} - W_{t_1}, W_{t_1}$ are independent for $t_1 < t_2 < \dots < t_n$,
3) for all $0 \le s < t$, $W_t - W_s$ has a normal distribution with expectation $E(W_t - W_s) = 0$ and variance $V(W_t - W_s) = t - s$,
4) all paths of $(W_t)_{t\ge 0}$ are continuous.

The Wiener process is also called Brownian motion. This process is named after the biologist Robert Brown whose research dates back to the 1820s. The mathematical theory began with Louis Bachelier (Théorie de la Spéculation, 1900) and was continued by Albert Einstein (Eine neue Bestimmung der Moleküldimensionen, 1905). Norbert Wiener (1923) was the first to create a firm mathematical basis for Brownian motion.

To study properties of the paths of the Wiener process we use the quadratic variation as a measure of the smoothness of a function.

Definition 1.3 Let $f : [0, T] \to \mathbb{R}$ be a real function and $z_n : 0 = t_{0,n} < t_{1,n} < \dots < t_{n,n} = T$ a sequence of partitions with
$$\delta(z_n) := \max_{0 \le i \le n-1} (t_{i+1,n} - t_{i,n}) \to 0 \quad \text{as } n \to \infty.$$
If
$$\lim_{n\to\infty} \sum_{i=0}^{n-1} \big(f(t_{i+1,n}) - f(t_{i,n})\big)^2$$
exists and is independent of the concrete sequence of partitions then this limit is called the quadratic variation of $f$ and will be denoted by $[f]_T$.

We show that the quadratic variation of a continuously differentiable function is zero.

Lemma 1.1 If $f$ is differentiable in $[0, T]$ and the derivative $f'(t)$ is continuous then $[f]_T = 0$.

Proof. Put $C = \sup_{0\le t\le T} |f'(t)|$. Then $|f(t) - f(s)| \le C|t - s|$ and
$$\sum_{i=0}^{n-1} \big(f(t_{i+1,n}) - f(t_{i,n})\big)^2 \le C^2 \sum_{i=0}^{n-1} (t_{i+1,n} - t_{i,n})^2 \le C^2\, \delta(z_n)\, T \to 0 \quad (n \to \infty).$$
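As a numerical illustration of Lemma 1.1, the following sketch sums the squared increments of a smooth function over finer and finer uniform partitions and compares the result with the bound $C^2 \delta(z_n) T$ from the proof. The choice $f = \sin$ (so that $C = 1$), the value $T = 2$ and the helper name `quadratic_variation_sum` are assumptions made only for this example.

```python
import math

def quadratic_variation_sum(f, T, n):
    """Sum of squared increments of f over the uniform partition of [0, T] with n cells."""
    h = T / n
    return sum((f((i + 1) * h) - f(i * h)) ** 2 for i in range(n))

T = 2.0
C = 1.0  # sup |f'| on [0, T] for the assumed example f = sin
for n in (10, 100, 1000):
    s = quadratic_variation_sum(math.sin, T, n)
    bound = C ** 2 * (T / n) * T  # C^2 * delta(z_n) * T, as in the proof of Lemma 1.1
    print(n, s, bound)
```

The printed sums shrink roughly in proportion to the mesh size $T/n$, in line with $[f]_T = 0$.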
If $(X_t)_{0\le t\le T}$ is a stochastic process then the quadratic variation $[X]_T$ is a random variable such that, for any sequence of partitions $z_n$ with $\delta(z_n) \to 0$, it holds for $n \to \infty$ that
$$\sum_{i=0}^{n-1} \big(X_{t_{i+1,n}} - X_{t_{i,n}}\big)^2 \to^{P} [X]_T,$$
where $\to^{P}$ is the symbol for stochastic convergence. Whether the quadratic variation of a stochastic process exists depends on the concrete structure of this process and has to be checked in each concrete situation. It is often more useful to deal with convergence in mean square instead of stochastic convergence. The relation between the two concepts is provided by the well-known Chebyshev inequality, which states that, for any random variables $Z_n$, $Z$ and any $\varepsilon > 0$,
$$P(|Z_n - Z| > \varepsilon) \le \frac{1}{\varepsilon^2}\, E(Z_n - Z)^2.$$
Hence the mean square convergence $E(Z_n - Z)^2 \to 0$ of $Z_n$ to $Z$ implies the stochastic convergence $P(|Z_n - Z| > \varepsilon) \to 0$ of $Z_n$ to $Z$. Now we are going to calculate the quadratic variation of a Wiener process. To this end we need a well-known fact. If $V$ has a normal distribution with expectation $\mu$ and variance $\sigma^2$ then
$$EV = \mu, \qquad V(V) = E(V - \mu)^2 = \sigma^2, \qquad E(V - \mu)^3 = 0, \qquad E(V - \mu)^4 = 3\sigma^4.$$
If $\mu = 0$ then
$$E(V^2 - \sigma^2)^2 = E(V^4 - 2\sigma^2 V^2 + \sigma^4) = 3\sigma^4 - \sigma^4 = 2\sigma^4. \tag{1.1}$$
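The identity (1.1) can be checked by a small Monte Carlo experiment. The sketch below is only a plausibility check under assumed settings (the value $\sigma = 1.5$, the sample size and the seed are arbitrary choices, and `fourth_moment_identity` is a hypothetical helper name); the estimate fluctuates around the exact value $2\sigma^4$.

```python
import random

def fourth_moment_identity(sigma, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E(V^2 - sigma^2)^2 for V ~ N(0, sigma^2), and the exact value 2 sigma^4."""
    rng = random.Random(seed)
    s2 = sigma ** 2
    total = 0.0
    for _ in range(n_samples):
        v = rng.gauss(0.0, sigma)
        total += (v * v - s2) ** 2
    return total / n_samples, 2.0 * sigma ** 4

estimate, exact = fourth_moment_identity(1.5)
print(estimate, exact)
```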
Theorem 1.1 If $(W_t)_{0\le t\le T}$ is a Wiener process then the quadratic variation is $[W]_T = T$.

Proof. Let $z_n$ be a sequence of partitions of $[0, T]$ with $\delta(z_n) \to 0$ and put
$$Z_n = \sum_{i=0}^{n-1} \big(W_{t_{i+1,n}} - W_{t_{i,n}}\big)^2.$$
From the definition of the Wiener process we get that $E(W_{t_{i+1,n}} - W_{t_{i,n}})^2 = t_{i+1,n} - t_{i,n}$, so that $EZ_n = T$. As the variance of a sum of independent random variables is just the sum of the variances, we get from the independent increments
$$\begin{aligned}
E(Z_n - T)^2 &= E\Big(\sum_{i=0}^{n-1} \big[(W_{t_{i+1,n}} - W_{t_{i,n}})^2 - (t_{i+1,n} - t_{i,n})\big]\Big)^2 = V(Z_n) \\
&= \sum_{i=0}^{n-1} V\big((W_{t_{i+1,n}} - W_{t_{i,n}})^2\big) \\
&= \sum_{i=0}^{n-1} E\big((W_{t_{i+1,n}} - W_{t_{i,n}})^2 - (t_{i+1,n} - t_{i,n})\big)^2 \\
&= 2 \sum_{i=0}^{n-1} (t_{i+1,n} - t_{i,n})^2 \le 2\,\delta(z_n)\,T \to 0,
\end{aligned}$$
where for the last equality we have used (1.1).

The statement $[W]_T = T$ is remarkable from different points of view. The exceptional fact is that the quadratic variation of this special stochastic process $(W_t)_{0\le t\le T}$ is a degenerate random variable: it is the deterministic value $T$. This value is non-zero. Therefore we may conclude from Lemma 1.1 that the paths of
Figure 1.1 Collection of realizations $X_t$ of the special stochastic process $(W_t)_{0\le t\le T}$ named after Norbert Wiener (horizontal axis: $t$; vertical axis: $X_t$).
a Wiener process cannot be continuously differentiable, as otherwise the quadratic variation would have to be zero. The fact that the quadratic variation is non-zero implies that the absolute value of an increment $W_t - W_s$ cannot be proportional to $t - s$. From here we may conclude that the paths of a Wiener process are continuous but not differentiable and therefore strongly fluctuating. The illustrative picture (see Figure 1.1) of simulated realizations of a Wiener process underlines this statement.

One of the main problems in the theory of stochastic processes is to find mathematical models that describe the evolution of a system in time and can especially be used to predict, of course not without error, the values in the future with the help of information about the process collected from the past. Here and in the sequel by 'the collected information' we mean the family of all events observable up to time $t$. This collection of events will be denoted by $\mathcal{F}_t$, where we suppose that $\mathcal{F}_t$ is a σ-algebra. It is clear that $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$ for $s \le t$. Such families of σ-algebras are referred to as a filtration and will be denoted by $(\mathcal{F}_t)_{t\ge 0}$. Each stochastic process $(X_t)_{t\ge 0}$ generates a filtration by the requirement that $\mathcal{F}_t$ is the smallest σ-algebra that contains all events $\{X_s \in I\}$, where $I$ is any interval and $0 \le s \le t$. This filtration will be denoted by $\sigma((X_s)_{0\le s\le t})$. We call any stochastic process $(Y_t)_{t\ge 0}$ adapted to the filtration $(\mathcal{F}_t)_{t\ge 0}$ (for short, $\mathcal{F}_t$-adapted) if all events that may be constructed from the process up to time $t$ belong to the class of observable events, i.e. already belong to $\mathcal{F}_t$. The formal mathematical condition is $\sigma((Y_s)_{0\le s\le t}) \subseteq \mathcal{F}_t$ for every $t \ge 0$.

If for any fixed $t$ and any random variable $Z$ all events $\{Z \in I\}$, $I \subseteq \mathbb{R}$, belong to $\mathcal{F}_t = \sigma((X_s)_{0\le s\le t})$ and it holds that $EZ^2 < \infty$, then there are $X_{t_{1,n}}, \dots, X_{t_{m_n,n}}$ with $t_{i,n} \le t$ and (measurable) functions $f_n(X_{t_{1,n}}, \dots, X_{t_{m_n,n}})$ such that
$$E\big(Z - f_n(X_{t_{1,n}}, \dots, X_{t_{m_n,n}})\big)^2 \to 0.$$
We omit the proof which would require additional results from measure theory. We denote by Pt (X) the class of all such random variables. Pt (X) may be considered as the past of the process (Xt )t≥0 .
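Returning briefly to Theorem 1.1 and Figure 1.1, the following sketch simulates a Wiener path on a uniform grid from independent normal increments (Definition 1.2) and evaluates the sum of squared increments, which should be close to $T$ for a fine partition. The grid sizes, the seed and the helper names `wiener_path` and `squared_increment_sum` are assumptions of this illustration, not part of the text.

```python
import math
import random

def wiener_path(T, n, rng):
    """Values W_0, W_h, ..., W_T on a uniform grid, built from independent N(0, h) increments."""
    h = T / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(h)))
    return w

def squared_increment_sum(path):
    """Sum of squared increments of a sampled path (the quantity Z_n in the proof of Theorem 1.1)."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))

rng = random.Random(1)
T = 3.0
for n in (100, 10_000):
    print(n, squared_increment_sum(wiener_path(T, n, rng)))
```

For a single path the sum is random, but its variance $2\sum_i (t_{i+1,n} - t_{i,n})^2$ shrinks with the mesh size, which is why the values settle near $T = 3$.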
Example 1.1 Let $(W_t)_{t\ge 0}$ be a Wiener process and $\mathcal{F}_t = \sigma((W_s)_{0\le s\le t})$. The following processes are $\mathcal{F}_t$-adapted: $X_t = W_t^2$, $X_t = W_{0.5 t}^2 + W_t^4$, $X_t = W_t^4 / (1 + W_{0.1 t}^2)$. The process $W_{t+1}$ is not $\mathcal{F}_t$-adapted.

We fix the interval $[0, T]$, set $\mathcal{F}_t = \sigma((W_s)_{0\le s\le t})$ and denote by $E_t(W) \subseteq P_t(W)$ the collection of all elementary $\mathcal{F}_t$-adapted processes, that is, of all processes that may be written as
$$Y_t = \sum_{i=0}^{n-1} X_{t_i}\, I_{[t_i, t_{i+1})}(t), \qquad X_{t_i} \in P_{t_i}(W), \tag{1.2}$$
where $0 = t_0 < t_1 < \dots < t_n$ and
$$I_{[a,b)}(t) = \begin{cases} 1 & \text{if } a \le t < b, \\ 0 & \text{otherwise.} \end{cases}$$
The $\mathcal{F}_t$-adaptedness of the process $Y_t$ follows from the fact that only random variables $X_{t_i}$ with $t_i \le t$ appear in the sum. The process $Y_t$ is piecewise constant: it has the value $X_{t_i}$ in $[t_i, t_{i+1})$ and jumps at $t_i$ by the height $\Delta Y_{t_i} = X_{t_i} - X_{t_{i-1}}$.
1.2 The Space of Square Integrable Random Variables
By $H_2$ we denote the space of all random variables $X$ with $EX^2 < \infty$. Here and in the sequel we identify random variables $X$ and $Y$ that take on different values only with probability zero, i.e. $P(X \ne Y) = 0$. Set
$$\langle X, Y \rangle := E(XY).$$
It is not hard to see that $\langle X, Y \rangle$ satisfies all conditions that are imposed on a scalar product, i.e. $\langle X, Y \rangle$ is symmetric in $X$ and $Y$, it is linear in both $X$ and $Y$, and it holds that $\langle X, X \rangle \ge 0$, where equality is satisfied if and only if $X = 0$. The norm of a random variable $X$ is given by $\|X\| = \sqrt{EX^2}$, and the distance of $X$ and $Y$ is the norm of $X - Y$. Recall that a sequence of random variables $X_n$ is said to be convergent in mean square to $X$ if $\lim_{n\to\infty} E(X_n - X)^2 = 0$. Hence this type of convergence is nothing other than the norm convergence $\lim_{n\to\infty} \|X_n - X\| = 0$. A sequence of random variables $\{X_n\}$ is said to be a Cauchy sequence if
$$\lim_{n,m\to\infty} \|X_n - X_m\| = 0.$$
For a proof of the following theorem we refer to Øksendal [175].

Theorem 1.2 To each Cauchy sequence $X_n \in H_2$ there is some $X \in H_2$ with
$$\lim_{n\to\infty} \|X_n - X\| = 0,$$
i.e. the space is complete.

It is clear that $H_2$ is a linear space. As we have already equipped $H_2$ with a scalar product we get, together with the completeness, that $H_2$ is a Hilbert space. This fact allows us to apply methods from Hilbert space theory to problems of probability theory. A subset $T \subseteq H_2$ is called closed if every limit $X$ of a sequence $X_n \in T$ belongs to $T$ again. If $L \subseteq H_2$ is a closed linear subspace of $H_2$ then there is some element in $L$ that best approximates $X$.

Theorem 1.3 If $L \subseteq H_2$ is a closed linear subspace of $H_2$, then to each $X \in H_2$ there is a random variable in $L$, denoted by $\Pi_L X \in L$ and called the projection of $X$ on $L$, such that
$$\inf_{Y \in L} \|X - Y\| = \|X - \Pi_L X\|.$$
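Proof. Let $Y_n \in L$ be a minimizing sequence, i.e.
$$\lim_{n\to\infty} \|X - Y_n\| = \inf_{Y \in L} \|X - Y\|.$$
Then, for any sequence of indices $m_n \to \infty$, $Y_{m_n}$ is a minimizing sequence again. Because
$$\Big\|X - \tfrac{1}{2}(Y_n + Y_{m_n})\Big\| \le \tfrac{1}{2}\|X - Y_n\| + \tfrac{1}{2}\|X - Y_{m_n}\|,$$
$\tfrac{1}{2}(Y_n + Y_{m_n})$ is also a minimizing sequence. Then
$$\lim_{n\to\infty} \Big( \tfrac{1}{2}\|X - Y_n\|^2 + \tfrac{1}{2}\|X - Y_{m_n}\|^2 - \Big\|X - \tfrac{1}{2}(Y_n + Y_{m_n})\Big\|^2 \Big) = 0.$$
For any random variables $U$, $V$ it holds that
$$\tfrac{1}{2}\|U\|^2 + \tfrac{1}{2}\|V\|^2 - \Big\|\tfrac{1}{2}(U + V)\Big\|^2 = E\Big(\tfrac{1}{2}U^2 + \tfrac{1}{2}V^2 - \tfrac{1}{4}(U + V)^2\Big) = \tfrac{1}{4} E(U - V)^2 = \tfrac{1}{4}\|U - V\|^2.$$
Putting $U = X - Y_n$, $V = X - Y_{m_n}$ we arrive at
$$\tfrac{1}{2}\|X - Y_n\|^2 + \tfrac{1}{2}\|X - Y_{m_n}\|^2 - \Big\|X - \tfrac{1}{2}(Y_n + Y_{m_n})\Big\|^2 = \tfrac{1}{4}\|Y_n - Y_{m_n}\|^2 \to 0.$$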
As $m_n$ was an arbitrary sequence we see that $Y_n$ is a Cauchy sequence and converges, by the completeness of $H_2$, to some random variable $\Pi_L X$ that belongs to $L$ since $L$ is closed by assumption.

Without going into detail we note that the projection $\Pi_L X$ is uniquely determined in the sense that, for every $Z \in L$ which also provides a best approximation, it holds that
$$P(\Pi_L X \ne Z) = 0. \tag{1.3}$$
The projection $\Pi_L X$ can also be characterized with the help of conditions imposed on the error $X - \Pi_L X$.

Corollary 1.1 It holds that $Y = \Pi_L X$ if and only if $Y \in L$ and $Y - X \perp L$, i.e.
$$\langle Y - X, Z \rangle = 0 \quad \text{for every } Z \in L. \tag{1.4}$$

Proof. 1. Assume $Y = \Pi_L X$. Then $Y \in L$ by the definition of the projection. For $Z \in L$ we consider
$$g(t) = \|(X - Y) - tZ\|^2 = \|X - Y\|^2 + t^2 \|Z\|^2 + 2t \langle Y - X, Z \rangle.$$
By the definition of $\Pi_L X$ the function $g(t)$ attains its minimum at $t = 0$. Hence $g'(0) = 2\langle Y - X, Z \rangle = 0$, which implies $\langle Y - X, Z \rangle = 0$.
2. If $Y \in L$ satisfies (1.4) then for every $U \in L$
$$\|X - U\|^2 = \|X - Y\|^2 + 2\langle X - Y, Y - U \rangle + \|Y - U\|^2.$$
As $Z = Y - U \in L$ we see that the middle term vanishes. Hence the right-hand side is minimal if and only if $U = Y$.

The simplest prediction of a random variable $X$ is a constant value. Which value $a$ is the best one? It is easy to see that the function $\varphi(a) = E(X - a)^2$ attains its minimum at $a_0 = EX$. Consequently, if $L$ consists of constant random variables only, then $\Pi_L X = EX$. This is the reason why, for any closed linear subspace, we call the projection $\Pi_L X$ the conditional expectation given $L$. In this case we tacitly assume that all constant random variables are contained in $L$. As $L$ is a linear space this is equivalent to the fact that $Z_0 \equiv 1 \in L$. If this condition is satisfied then we write
$$E(X|L) := \Pi_L X.$$
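The claim that the best constant predictor is the expectation can be illustrated with a tiny finite example. In the sketch below the five equally likely outcomes of $X$ are hypothetical numbers chosen for illustration; the code evaluates $\varphi(a) = E(X - a)^2$ at and around $a_0 = EX$.

```python
def mean_squared_error(xs, a):
    """phi(a) = E(X - a)^2 for a random variable X that takes the values xs with equal probability."""
    return sum((x - a) ** 2 for x in xs) / len(xs)

xs = [0.5, 1.0, 1.5, 4.0, 8.0]   # hypothetical, equally likely outcomes of X
ex = sum(xs) / len(xs)           # EX
for a in (ex - 1.0, ex - 0.1, ex, ex + 0.1, ex + 1.0):
    print(a, mean_squared_error(xs, a))
```

The printed values follow $\varphi(a) = V(X) + (a - EX)^2$, so moving away from $EX$ in either direction increases the error.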
Choosing Z = 1 in (1.4) we get the following.
Conclusion 1.1 (Iterated expectation) It holds that
$$E(E(X|L)) = EX. \tag{1.5}$$
The relation (1.4) provides the orthogonal decomposition
$$X = \Pi_L X + (X - \Pi_L X). \tag{1.6}$$
Here $\Pi_L X$ belongs to the subspace $L$ whereas the error $X - \Pi_L X$ is perpendicular to $L$. Corollary 1.1 implies that the projection operator $\Pi_L$ is linear, i.e. $\Pi_L(a_1 X_1 + a_2 X_2) = a_1 \Pi_L(X_1) + a_2 \Pi_L(X_2)$. The relation (1.6) implies $\|\Pi_L X\| \le \|X\|$. This inequality yields, in conjunction with the linearity, that $\Pi_L X$ depends continuously on $X$. Indeed, $X_n \to X$ implies
$$\|\Pi_L X_n - \Pi_L X\| = \|\Pi_L(X_n - X)\| \le \|X_n - X\| \to 0. \tag{1.7}$$
Now we collect other properties of the conditional expectation that will be used in the sequel.

Lemma 1.2 If $L$ is a closed linear subspace of $H_2$ that contains the constant variables and $V$ is a random variable such that $UV \in L$ for every $U \in L$ then
$$E(VX|L) = V\,E(X|L).$$

Proof. The assumption $VU \in L$ and (1.4) imply
$$0 = \langle X - E(X|L), VU \rangle = E\big((XV - V\,E(X|L))\,U\big) = \langle XV - V\,E(X|L), U \rangle.$$
The application of Corollary 1.1 completes the proof.

The multiple application of the conditional expectation corresponds to the iterated application of projections.

Lemma 1.3 If $L_1$, $L_2$ are closed linear subspaces of $H_2$ that contain the constant variables and $L_1 \subseteq L_2$ then
$$E\big(E(X|L_2)\,\big|\,L_1\big) = E(X|L_1).$$

Proof. Set $R = E(X|L_1)$ and $S = E(X|L_2)$. Then by Corollary 1.1
$$\langle X - R, U \rangle = 0 \quad \text{for every } U \in L_1, \qquad \langle X - S, U \rangle = 0 \quad \text{for every } U \in L_2.$$
The assumption $L_1 \subseteq L_2$ gives
$$\langle S - R, U \rangle = 0 \quad \text{for every } U \in L_1.$$
Corollary 1.1 completes the proof.

Next we study the relation between the independence of random variables and the conditional expectation.

Lemma 1.4 If $X$ is independent of every $Z \in L$ then
$$E(X|L) = EX.$$

Proof. The required independence implies $E(XZ) = (EX)(EZ)$, i.e. $E((X - EX)Z) = 0$. The statement follows from Corollary 1.1 and the fact that the constant random variable $EX$ belongs to $L$.

We say that $L$ is generated by the random variables $X_1, \dots, X_n$ if $L$ consists of all possible functions (not necessarily linear) $h(X_1, \dots, X_n)$ such that $Eh^2(X_1, \dots, X_n) < \infty$. Then we write $L = G(X_1, \dots, X_n)$. Suppose the vector $(Y, X_1, \dots, X_n)$ has the joint density $f(y, x_1, \dots, x_n)$. Then
$$g(x_1, \dots, x_n) = \int f(y, x_1, \dots, x_n)\,dy \tag{1.8}$$
is the marginal density of $(X_1, \dots, X_n)$ and
$$f(y|x_1, \dots, x_n) = \frac{f(y, x_1, \dots, x_n)}{g(x_1, \dots, x_n)} \tag{1.9}$$
is called the conditional density of $Y$ given $X_1 = x_1, \dots, X_n = x_n$.

Theorem 1.4 Let $\gamma$ be any function with $E\gamma^2(Y) < \infty$ and $f(y|x_1, \dots, x_n)$ be the conditional density of $Y$ given $X_1 = x_1, \dots, X_n = x_n$. Then
$$E(\gamma(Y)|G(X_1, \dots, X_n)) = \psi(X_1, \dots, X_n),$$
where $\psi$ is the so-called regression function that is given by
$$\psi(x_1, \dots, x_n) = \int_{-\infty}^{+\infty} \gamma(t)\, f(t|x_1, \dots, x_n)\,dt. \tag{1.10}$$
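Before turning to the proof, a discrete analogue may help to see what Theorem 1.4 asserts. In the sketch below the joint density is replaced by a small table of joint probabilities (hypothetical numbers, with $n = 1$ and $\gamma(y) = y$), the integral in (1.10) becomes a sum, and exact rational arithmetic is used to check the orthogonality condition (1.4) for one particular function of $X$. The names `joint`, `psi` and `expect` are assumptions of this illustration.

```python
from fractions import Fraction as F

# Hypothetical joint probabilities P(Y = y, X = x); keys are (y, x). Discrete stand-in for f(y, x).
joint = {
    (0, 0): F(1, 8), (1, 0): F(3, 8),
    (0, 1): F(1, 4), (2, 1): F(1, 4),
}

def marginal_x(x):
    """Discrete analogue of the marginal density (1.8)."""
    return sum(p for (y, xx), p in joint.items() if xx == x)

def psi(x):
    """Discrete analogue of the regression function (1.10) with gamma(y) = y."""
    return sum(y * p for (y, xx), p in joint.items() if xx == x) / marginal_x(x)

def expect(h):
    """E h(Y, X) under the joint distribution."""
    return sum(h(y, x) * p for (y, x), p in joint.items())

print(psi(0), psi(1))
# Orthogonality (1.4): the error Y - psi(X) is orthogonal to functions of X, here h(X) = X.
print(expect(lambda y, x: (y - psi(x)) * x))
# Mean squared errors of psi(X) and of the competing predictor phi(X) = X.
print(expect(lambda y, x: (y - psi(x)) ** 2), expect(lambda y, x: (y - x) ** 2))
```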
Proof. As $G(X_1, \dots, X_n)$ consists of all functions $\varphi(X_1, \dots, X_n)$ it suffices to show that
$$E(\gamma(Y) - \psi(X_1, \dots, X_n))^2 \le E(\gamma(Y) - \varphi(X_1, \dots, X_n))^2.$$
It holds that
$$\begin{aligned}
E(\gamma(Y) - \varphi(X_1, \dots, X_n))^2
&= \int \cdots \int (\gamma(y) - \varphi(x_1, \dots, x_n))^2 f(y, x_1, \dots, x_n)\,dy\,dx_1 \cdots dx_n \\
&= \int \cdots \int (\gamma(y) - \psi)^2 f(y, x_1, \dots, x_n)\,dy\,dx_1 \cdots dx_n \\
&\quad + 2 \int \cdots \int (\gamma(y) - \psi)(\psi - \varphi) f(y, x_1, \dots, x_n)\,dy\,dx_1 \cdots dx_n \\
&\quad + \int \cdots \int (\varphi - \psi)^2 f(y, x_1, \dots, x_n)\,dy\,dx_1 \cdots dx_n.
\end{aligned}$$
To calculate the middle term we note that $\varphi - \psi$ does not depend on $y$. Hence
$$\begin{aligned}
&\int \cdots \int (\gamma(y) - \psi)(\psi - \varphi) f(y, x_1, \dots, x_n)\,dy\,dx_1 \cdots dx_n \\
&\quad = \int \cdots \int (\psi - \varphi) \Big[ \int (\gamma(y) - \psi) f(y|x_1, \dots, x_n)\,dy \Big] g(x_1, \dots, x_n)\,dx_1 \cdots dx_n = 0
\end{aligned}$$
because of (1.10). Hence
$$E(\gamma(Y) - \varphi(X_1, \dots, X_n))^2 = E(\gamma(Y) - \psi(X_1, \dots, X_n))^2 + E(\varphi(X_1, \dots, X_n) - \psi(X_1, \dots, X_n))^2.$$
The right-hand side becomes minimal if and only if $\varphi(X_1, \dots, X_n) - \psi(X_1, \dots, X_n) = 0$, which proves the statement.

Let $(X_t)_{t\ge 0}$ be a stochastic process such that all finite dimensional distributions of $X_{t_1}, \dots, X_{t_n}$ have a density that we will denote by $f_{t_1, \dots, t_n}(x_1, \dots, x_n)$, where $t_1 < t_2 < \dots < t_n$. By
$$f_{t_n|t_1, \dots, t_{n-1}}(x_n|x_1, \dots, x_{n-1}) = \frac{f_{t_1, \dots, t_n}(x_1, \dots, x_n)}{f_{t_1, \dots, t_{n-1}}(x_1, \dots, x_{n-1})} \tag{1.11}$$
we denote the conditional density of $X_{t_n}$ given $X_{t_1} = x_1, \dots, X_{t_{n-1}} = x_{n-1}$. We call a stochastic process a Markov process if the conditional density depends only on the value of the process at the last moment of the past, i.e.
$$f_{t_n|t_1, \dots, t_{n-1}}(x_n|x_1, \dots, x_{n-1}) = f_{t_n|t_{n-1}}(x_n|x_{n-1}). \tag{1.12}$$
If $(X_t)_{t\ge 0}$ is a Markov process, then by Theorem 1.4, for every $t_1 < t_2 < \dots < t_n = t$ and $h > 0$,
$$E(\gamma(X_{t+h})|G(X_{t_1}, \dots, X_{t_n})) = E(\gamma(X_{t+h})|G(X_t)). \tag{1.13}$$
Conversely, if the last condition holds for every $\gamma$ then
$$\int \gamma(x_n) f_{t_n|t_{n-1}}(x_n|x_{n-1})\,dx_n = \int \gamma(x_n) f_{t_n|t_1, \dots, t_{n-1}}(x_n|x_1, \dots, x_{n-1})\,dx_n. \tag{1.14}$$
As $\gamma$ is arbitrary the relation (1.14) yields $f_{t_n|t_{n-1}}(x_n|x_{n-1}) = f_{t_n|t_1, \dots, t_{n-1}}(x_n|x_1, \dots, x_{n-1})$.

Recall that $P_t(X)$ is the smallest closed subspace of $H_2$ that contains all subspaces $G(X_{t_1}, \dots, X_{t_m})$, where $t_1 < t_2 < \dots < t_m \le t$. This means that $P_t(X)$ consists of all random variables that are either functions of random variables from the past or a limit of such random variables. Hence by the continuity of the scalar product
$$\gamma(X_{t+h}) - E(\gamma(X_{t+h})|G(X_t)) \perp Z, \qquad Z \in P_t(X),$$
if and only if
$$\gamma(X_{t+h}) - E(\gamma(X_{t+h})|G(X_t)) \perp Z, \qquad Z \in G(X_{t_1}, \dots, X_{t_n}),$$
for any $t_1 < t_2 < \dots < t_n \le t$. As $E(\gamma(X_{t+h})|G(X_t)) \in P_t(X)$, we get from Corollary 1.1 the following theorem.

Theorem 1.5 A stochastic process $(X_t)_{t\ge 0}$ is a Markov process if and only if
$$E(\gamma(X_{t+h})|P_t(X)) = E(\gamma(X_{t+h})|G(X_t))$$
for every function $\gamma$ with $E\gamma^2(X_{t+h}) < \infty$. This condition is equivalent to (1.13) for any $t_1 < t_2 < \dots < t_n = t$.

Now we present a general construction scheme for Markov processes.

Theorem 1.6 Let $(X_t)_{t\ge 0}$ be a stochastic process and $V(x, t, h)$, for $t, h > 0$, $x \in \mathbb{R}$, a family of random variables such that:
1) $V(x, t, h)$ is independent of every $Z \in P_t(X)$ for every $t, h > 0$, $x \in \mathbb{R}$,
2) $X_{t+h} = V(X_t, t, h)$.
Then $(X_t)_{t\ge 0}$ is a Markov process.

Proof. Assume $E\gamma^2(X_{t+h}) < \infty$ and fix $t_1 < \dots < t_n = t$. Let $(\Omega, \mathcal{F}, P)$ be the basic probability space. For fixed $t, h > 0$ the random variable $\gamma(V(x, t, h))$ is a function of $x$ and $\omega$, say $\Phi(x, \omega)$. Without proof we use the fact that each such function can be approximated by linear combinations of products of functions $v(x)V(\omega)$ in the sense that, for suitably chosen $v_{i,n}$ and $V_{i,n}$ that are independent of every $Z \in P_t(X)$,
$$E\Big( \sum_{i=1}^{n} v_{i,n}(X_t) V_{i,n} - \gamma(V(X_t, t, h)) \Big)^2 \to 0.$$
In view of Theorem 1.5 and Corollary 1.1 we have to show that
$$E\big[ \big( \gamma(X_{t+h}) - E(\gamma(X_{t+h})|G(X_{t_1}, \dots, X_{t_n})) \big) Z \big] = 0$$
for every $Z \in G(X_{t_1}, \dots, X_{t_n})$. Due to the continuity of the projection, see (1.7), it suffices to show that
$$E\Big[ \Big( \sum_{i=1}^{n} v_{i,n}(X_t) V_{i,n} - E\Big( \sum_{i=1}^{n} v_{i,n}(X_t) V_{i,n} \,\Big|\, G(X_{t_1}, \dots, X_{t_n}) \Big) \Big) Z \Big] = 0. \tag{1.15}$$
To this end we note that $v_{i,n}(X_t) \in G(X_{t_1}, \dots, X_{t_n})$. Hence by Lemma 1.2
$$E(v_{i,n}(X_t) V_{i,n}|G(X_{t_1}, \dots, X_{t_n})) = v_{i,n}(X_t)\, E(V_{i,n}|G(X_{t_1}, \dots, X_{t_n})).$$
Lemma 1.4 and the independence of $V_{i,n}$ of all $X_{t_1}, \dots, X_{t_n}$ imply
$$E(V_{i,n}|G(X_{t_1}, \dots, X_{t_n})) = E(V_{i,n}).$$
This yields
$$E\big( Z\, E(v_{i,n}(X_t) V_{i,n}|G(X_{t_1}, \dots, X_{t_n})) \big) = E(Z v_{i,n}(X_t))\, E(V_{i,n}). \tag{1.16}$$
On the other hand, $V_{i,n}$ is independent of $X_{t_1}, \dots, X_{t_n}$ and therefore independent of $Z v_{i,n}(X_t)$. This yields
$$E(V_{i,n}\, v_{i,n}(X_t)\, Z) = E(Z v_{i,n}(X_t))\, E(V_{i,n}). \tag{1.17}$$
The relations (1.16) and (1.17) imply (1.15) and thus the statement.
1.3 The Ito Integral and the Ito Formula
The aim of this section is to introduce and study the concept of the Ito integral. In contrast to the classical Riemann integral, the values of the function to be integrated are not weighted according to the lengths of the intervals of the chosen partition. Instead we weight these values by increments of a Wiener process. A first idea could be to set
$$\int_a^b X_s\,dW_s := \int_a^b X_s W'_s\,ds. \tag{1.18}$$
But we know from the discussion after Theorem 1.1 that the derivative $W'_s$ does not exist, which excludes this method. Ito succeeded in constructing an integral of the above type by starting, as a first step, with elementary processes and, in a second step, by extending the integral to a larger class of processes. Recall that by (1.2) every elementary adapted process $X \in E(W)$ can be written as
$$X_t = \sum_{i=0}^{n-1} X_{t_i}\, I_{[t_i, t_{i+1})}(t), \qquad X_{t_i} \in P_{t_i}(W).$$
We set
$$\int_0^T X_s\,dW_s := \sum_{i=0}^{n-1} X_{t_i}\,(W_{t_{i+1}} - W_{t_i}).$$
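A first immediate property of this integral concept is its linearity, i.e.
$$\int_0^T \big(c_1 X_s^{(1)} + c_2 X_s^{(2)}\big)\,dW_s = c_1 \int_0^T X_s^{(1)}\,dW_s + c_2 \int_0^T X_s^{(2)}\,dW_s.$$
Another property that makes Hilbert space arguments applicable is the so-called isometry property.

Theorem 1.7 If $X^{(1)}, X^{(2)} \in E(W)$ then
$$\Big\langle \int_0^T X_s^{(1)}\,dW_s, \int_0^T X_s^{(2)}\,dW_s \Big\rangle = \int_0^T \big\langle X_s^{(1)}, X_s^{(2)} \big\rangle\,ds. \tag{1.19}$$

Proof. Passing to a joint refinement if necessary, we see that the two elementary processes $X_s^{(1)}$ and $X_s^{(2)}$ can be represented on the same partition. Hence
$$X_t^{(j)} = \sum_{i=0}^{n-1} X_{t_i}^{(j)}\, I_{[t_i, t_{i+1})}(t), \qquad j = 1, 2,$$
with some $X_{t_i}^{(j)} \in P_{t_i}(W)$. Then
$$\Big\langle \int_0^T X_s^{(1)}\,dW_s, \int_0^T X_s^{(2)}\,dW_s \Big\rangle = \sum_{i,j=0}^{n-1} E\big( X_{t_i}^{(1)} X_{t_j}^{(2)} (W_{t_{i+1}} - W_{t_i})(W_{t_{j+1}} - W_{t_j}) \big).$$
Let $i \ne j$ and for example $t_i > t_j$. The independence of the increments implies that $W_{t_{i+1}} - W_{t_i}$ and $X_{t_i}^{(1)} X_{t_j}^{(2)} (W_{t_{j+1}} - W_{t_j})$ are independent. Consequently $E(W_{t_{i+1}} - W_{t_i}) = 0$ implies that the mixed terms vanish. This yields
$$\Big\langle \int_0^T X_s^{(1)}\,dW_s, \int_0^T X_s^{(2)}\,dW_s \Big\rangle = \sum_{i=0}^{n-1} E\big( X_{t_i}^{(1)} X_{t_i}^{(2)} (W_{t_{i+1}} - W_{t_i})^2 \big).$$
Because of $X_{t_i}^{(1)} X_{t_i}^{(2)} \in P_{t_i}(W)$ this random variable from the past is independent of $(W_{t_{i+1}} - W_{t_i})^2$, which implies
$$E\big[ (W_{t_{i+1}} - W_{t_i})^2 X_{t_i}^{(1)} X_{t_i}^{(2)} \big] = E\big[ (W_{t_{i+1}} - W_{t_i})^2 \big]\, E\big[ X_{t_i}^{(1)} X_{t_i}^{(2)} \big] = (t_{i+1} - t_i)\, E\big( X_{t_i}^{(1)} X_{t_i}^{(2)} \big).$$
Hence
$$\Big\langle \int_0^T X_s^{(1)}\,dW_s, \int_0^T X_s^{(2)}\,dW_s \Big\rangle = \sum_{i=0}^{n-1} E\big( X_{t_i}^{(1)} X_{t_i}^{(2)} \big)(t_{i+1} - t_i) = \int_0^T \big\langle X_s^{(1)}, X_s^{(2)} \big\rangle\,ds.$$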
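The definition of the integral for elementary processes and the isometry (1.19) can be explored with a small Monte Carlo sketch. As an assumed example we take $X^{(1)} = X^{(2)}$ with $X_{t_i} = W_{t_i}$ on a uniform partition of $[0, 1]$, for which the right-hand side of (1.19) equals $\sum_i t_i (t_{i+1} - t_i)$. The partition size, the number of paths, the seed and the helper names are arbitrary choices of this illustration.

```python
import math
import random

def elementary_ito_integral(n, T, rng):
    """Sum of X_{t_i} (W_{t_{i+1}} - W_{t_i}) for the assumed elementary integrand X_{t_i} = W_{t_i}."""
    h = T / n
    w, total = 0.0, 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))
        total += w * dw      # the integrand uses only the value at the left endpoint t_i
        w += dw
    return total

def monte_carlo_moments(n=20, T=1.0, n_paths=20_000, seed=0):
    """Sample mean and sample second moment of the elementary Ito integral."""
    rng = random.Random(seed)
    samples = [elementary_ito_integral(n, T, rng) for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    second_moment = sum(s * s for s in samples) / n_paths
    return mean, second_moment

n, T = 20, 1.0
h = T / n
# Right-hand side of (1.19): sum of E(W_{t_i}^2) * (t_{i+1} - t_i) = sum of t_i * h.
isometry_rhs = sum(i * h * h for i in range(n))
mean, second_moment = monte_carlo_moments(n, T)
print(mean, second_moment, isometry_rhs)
```

The sample mean should be close to zero and the sample second moment close to `isometry_rhs`, up to Monte Carlo error.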
We denote by $L_2(W)$ the set of all $P_t(W)$-adapted processes $X$ with
$$\int_0^T EX_t^2\,dt < \infty.$$
In the sequel we use the fact that every $X \in L_2(W)$ can be approximated by elementary processes $X^{(n)} \in E(W)$ in the sense that
$$\lim_{n\to\infty} \int_0^T E\big(X_t - X_t^{(n)}\big)^2\,dt = 0. \tag{1.20}$$
We refer to Øksendal [175] for a proof. The relation (1.20) provides
$$\lim_{n,m\to\infty} \int_0^T E\big(X_t^{(n)} - X_t^{(m)}\big)^2\,dt = 0,$$
which, together with the isometry property (1.19), leads to
$$\lim_{n,m\to\infty} E\Big( \int_0^T X_t^{(n)}\,dW_t - \int_0^T X_t^{(m)}\,dW_t \Big)^2 = \lim_{n,m\to\infty} \int_0^T E\big(X_t^{(n)} - X_t^{(m)}\big)^2\,dt = 0.$$
This means that the sequence of random variables $\int_0^T X_t^{(n)}\,dW_t$ is a Cauchy sequence and therefore converges to a random variable that will be denoted by
$$\int_0^T X_t\,dW_t.$$
This random variable is independent of the choice of the approximating sequence $X_t^{(n)}$ and is called the Ito integral. The continuity of the scalar product shows that the above isometry property is still valid for the larger class of processes $X \in L_2(W)$.

Theorem 1.8 If $X, Y \in L_2(W)$ then
$$\int_0^T (aX_t + bY_t)\,dW_t = a \int_0^T X_t\,dW_t + b \int_0^T Y_t\,dW_t,$$
$$\Big\langle \int_0^T X_t\,dW_t, \int_0^T Y_t\,dW_t \Big\rangle = \int_0^T \langle X_t, Y_t \rangle\,dt.$$

Letting the upper bound in the integral be variable we may introduce the new stochastic process $\int_0^t X_s\,dW_s$, which has been constructed exclusively with the help of random variables from $P_t(W)$. Thus we see that the new process
$$Y_t = \int_0^t X_s\,dW_s \tag{1.21}$$
again belongs to $L_2(W)$. This process has an important projection property.
Theorem 1.9 If $t_1 < t_2$ then $Y_t$ in (1.21) satisfies
$$E(Y_{t_2}|P_{t_1}(W)) = Y_{t_1}, \tag{1.22}$$
$$EY_{t_1} = EY_{t_2} = 0. \tag{1.23}$$

Proof. By the linearity of the Ito integral and the continuity of the projection we have to prove the statement only for elementary processes of the type $X_t = Z\, I_{[a,b)}(t)$ where $Z \in P_a(W)$. Then
$$Y_t = \int_0^t X_s\,dW_s = Z\,(W_{b \wedge t} - W_{a \wedge t}),$$
where $b \wedge t = \min(b, t)$. This shows that $Y_t$ does not depend on $t$ for $t < a$ and $t > b$. Hence we have only to consider the case $a \le t_1 < t_2 \le b$. Then $Y_{t_2} - Y_{t_1} = Z(W_{t_2} - W_{t_1})$ and
$$E(Y_{t_2} - Y_{t_1}|P_{t_1}(W)) = E(Z(W_{t_2} - W_{t_1})|P_{t_1}(W)).$$
As $Z \in P_a(W) \subseteq P_{t_1}(W)$ we may apply Lemma 1.2 and take $Z$ out of the conditional expectation,
$$E(Z(W_{t_2} - W_{t_1})|P_{t_1}(W)) = Z\,E((W_{t_2} - W_{t_1})|P_{t_1}(W)).$$
The independence of $W_{t_2} - W_{t_1}$ and the random variables from $P_{t_1}(W)$ together with Lemma 1.4 yield
$$E((W_{t_2} - W_{t_1})|P_{t_1}(W)) = 0$$
and therefore
$$E(Y_{t_2} - Y_{t_1}|P_{t_1}(W)) = 0.$$
Because of $Y_{t_1} \in P_{t_1}(W)$ we obtain $E(Y_{t_2}|P_{t_1}(W)) = Y_{t_1}$, which is the first statement. The relation (1.5) implies $EY_{t_2} = EY_{t_1}$ for every $0 \le t_1 \le t_2$. As $Y_0 = 0$ we get (1.23).

Stochastic processes that satisfy (1.22) are called martingales in probability theory. Now we introduce a class of processes that turns out to be useful in order to model the evolution of a time-dependent phenomenon. A stochastic process $X$ is called an Ito process if
$$X_t = X_0 + \int_0^t A_s\,ds + \int_0^t B_s\,dW_s, \tag{1.24}$$
where $A, B \in L_2(W)$. It is not hard to show that the quadratic variation of $\int_0^t A_s\,ds$ is zero, so that $\int_0^t A_s\,ds$ is a smooth part of $X_t$ that plays the role of a drift. The second component $\int_0^t B_s\,dW_s$ is irregular as the quadratic variation is
$$[X]_t = \int_0^t B_s^2\,ds, \tag{1.25}$$
which can be easily shown and does not vanish. We also write
$$dX_s = A_s\,ds + B_s\,dW_s \tag{1.26}$$
instead of (1.24). Ito processes admit the following interpretation. For fixed $h > 0$ the increment $X_{t+h} - X_t$ is approximately given by
$$X_{t+h} - X_t \approx A_t h + B_t (W_{t+h} - W_t). \tag{1.27}$$
The first term $A_t h$ is a drift with a slope which is governed by values from the past. The factors in the product $B_t (W_{t+h} - W_t)$ are independent, where $W_{t+h} - W_t$ is normally distributed with expectation zero and variance $h$. If the values in the past are fixed then $B_t (W_{t+h} - W_t)$ has the variance $B_t^2 h$. This means that $B_s\,dW_s$ is a diffusion term. Diffusion processes are special Ito processes. They are characterized by the fact that the drift coefficient $A_t$ as well as the diffusion coefficient $B_t$ depend only on the last state of the process. This means that
$$A_t = a(t, X_t) \quad \text{and} \quad B_t = b(t, X_t),$$
with some $a(t, x)$ and $b(t, x)$. Hence
$$X_t = X_0 + \int_0^t a(s, X_s)\,ds + \int_0^t b(s, X_s)\,dW_s. \tag{1.28}$$
This is an integral equation for $X_t$, which can formally be written as a differential equation, often used as a basic equation of motion in physics and named after Langevin,
$$\dot{X}_t = a(t, X_t) + b(t, X_t)\,\dot{W}_t. \tag{1.29}$$
The problem is that $\dot{W}_t \equiv dW_t/dt$ does not exist, as we have already pointed out by showing that the paths of $W_t$ are not differentiable. The representation (1.28) raises the question for which $a$, $b$ the integral equation has a solution and under which conditions this solution is unique. In the sense of an initial value problem the value $X_0$ has to be fixed. Necessary and sufficient conditions that guarantee the existence and uniqueness of a solution of this initial value problem can be found in many books, e.g. [30, 57, 91, 104, 175]. Often the starting point $X_0$ is a deterministic value, say $x_0$. To indicate the dependence on $x_0$ we denote the corresponding process by $X_{t,x_0}$. Hence
$$X_{t,x_0} = x_0 + \int_0^t a(s, X_{s,x_0})\,ds + \int_0^t b(s, X_{s,x_0})\,dW_s,$$
and
$$X_{t+h,x_0} - X_{t,x_0} = \int_t^{t+h} a(s, X_{s,x_0})\,ds + \int_t^{t+h} b(s, X_{s,x_0})\,dW_s. \tag{1.30}$$
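The increment relations (1.27) and (1.30) suggest a simple way to generate approximate paths of a diffusion process: freeze the coefficients over each small time step. This is the Euler–Maruyama scheme, a standard discretization that is not discussed in the text; the sketch below is a minimal version under assumed settings. The function name `euler_maruyama`, the coefficients $a(t, x) = -x$, $b(t, x) = 0.5$ and the step count are illustrative choices only.

```python
import math
import random

def euler_maruyama(a, b, x0, T, n, rng):
    """Approximate a path of dX = a(t, X) dt + b(t, X) dW on a uniform grid with n steps."""
    h = T / n
    x, t = x0, 0.0
    path = [x0]
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))     # W_{t+h} - W_t is N(0, h)
        x = x + a(t, x) * h + b(t, x) * dw    # increment (1.27) with A_t = a(t, X_t), B_t = b(t, X_t)
        t += h
        path.append(x)
    return path

# Hypothetical coefficients, chosen only for illustration: linear drift toward 0, constant diffusion.
path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5, x0=1.0, T=1.0, n=1000, rng=random.Random(0))
print(path[-1])
```

With $b \equiv 0$ the scheme reduces to the explicit Euler method for the ordinary differential equation $\dot{x} = a(t, x)$, which gives a quick sanity check of the implementation.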
For every fixed $x$ the random variable
$$V(x, t, h) = x + \int_t^{t+h} a(s, X_{s,x})\,ds + \int_t^{t+h} b(s, X_{s,x})\,dW_s$$
is independent of the random variables from $P_t(W)$. If (1.30) has a unique solution then $X_{t+h,x_0} = V(X_{t,x_0}, t, h)$. From Theorem 1.6 we get the Markov property.

Theorem 1.10 If the equation
$$X_{t,x} = x + \int_s^t a(\tau, X_{\tau,x})\,d\tau + \int_s^t b(\tau, X_{\tau,x})\,dW_\tau$$
has a unique solution for every $x$ and $s$ then the process starting at $x_0$, defined as the solution of
$$X_{t,x_0} = x_0 + \int_0^t a(s, X_{s,x_0})\,ds + \int_0^t b(s, X_{s,x_0})\,dW_s,$$
is a Markov process. It is called homogeneous if $a$ and $b$ are independent of $s$, hence
$$X_{t,x_0} = x_0 + \int_0^t a(X_{s,x_0})\,ds + \int_0^t b(X_{s,x_0})\,dW_s.$$

The class of Ito processes is closed with respect to the application of smooth functions, i.e. $u(t, X_t)$ is again an Ito process whose drift and diffusion coefficients can be given explicitly.

Theorem 1.11 (Ito formula) Suppose $A, B \in L_2(W)$ and assume $dX_s = A_s\,ds + B_s\,dW_s$. If $u : [0, \infty) \times \mathbb{R} \to \mathbb{R}$ is twice continuously differentiable then
$$du(t, X_t) = \frac{\partial u}{\partial t}(t, X_t)\,dt + \frac{\partial u}{\partial x}(t, X_t)\,dX_t + \frac{1}{2} \frac{\partial^2 u}{\partial x^2}(t, X_t) \cdot (dX_t)^2, \tag{1.31}$$
where $(dX_t)^2 = dX_t \cdot dX_t$ is to be calculated according to the following rules:
$$dt \cdot dt = dt \cdot dW_t = dW_t \cdot dt = 0, \tag{1.32}$$
$$dW_t \cdot dW_t = dt. \tag{1.33}$$
Proof . We give only a sketch of the proof. Further details can be found in Øksendal [175] or many other textbooks on stochastic differential equations such as Chorin and Held [30] and Karatzas and Shreve [91].
Suppose $z_n = \{t_{0,n}, \dots, t_{n,n}\}$, $t_{0,n} = 0$, $t_{n,n} = t$, is a sequence of partitions of $[0, t]$ with $\delta(z_n) \to 0$. Then
$$u(t, X_t) - u(0, X_0) = \sum_{l=1}^{n} \big[ u(t_{l,n}, X_{t_{l,n}}) - u(t_{l-1,n}, X_{t_{l-1,n}}) \big],$$
and by the Taylor expansion around the left endpoint
$$\begin{aligned}
u(t_{l,n}, X_{t_{l,n}}) - u(t_{l-1,n}, X_{t_{l-1,n}})
&= \frac{\partial u}{\partial t}(t_{l-1,n}, X_{t_{l-1,n}})(t_{l,n} - t_{l-1,n}) \\
&\quad + \frac{\partial u}{\partial x}(t_{l-1,n}, X_{t_{l-1,n}})(X_{t_{l,n}} - X_{t_{l-1,n}}) \\
&\quad + \frac{1}{2} \frac{\partial^2 u}{\partial x^2}(t_{l-1,n}, X_{t_{l-1,n}})(X_{t_{l,n}} - X_{t_{l-1,n}})^2 + R_{l,n},
\end{aligned}$$
where
$$\sum_{l=1}^{n} R_{l,n} \to^{P} 0$$
can be shown. The sum of the first terms of the above decomposition can be shown to tend to
$$\int_0^t \frac{\partial u}{\partial t}(s, X_s)\,ds$$
as $n \to \infty$. Similarly, by $dX_s = A_s\,ds + B_s\,dW_s$ the sum of the second terms tends to
$$\int_0^t \frac{\partial u}{\partial x}(s, X_s) A_s\,ds + \int_0^t \frac{\partial u}{\partial x}(s, X_s) B_s\,dW_s.$$
Using $dX_s = A_s\,ds + B_s\,dW_s$ again we see that the sum of the third terms consists of three parts. The first one is
$$\sum_{l=1}^{n} \frac{1}{2} \frac{\partial^2 u}{\partial x^2}(t_{l-1,n}, X_{t_{l-1,n}})\, A_{t_{l-1,n}}^2 (t_{l,n} - t_{l-1,n})^2. \tag{1.34}$$
Assuming, for simplicity, the boundedness of $\frac{\partial^2 u(t,x)}{\partial x^2} A_t^2$, this sum does not exceed
$$c \sum_{l=1}^{n} (t_{l,n} - t_{l-1,n})^2 \le c\,\delta(z_n) \cdot t \to 0$$
as $\delta(z_n) = \max_{1\le l\le n} |t_{l,n} - t_{l-1,n}| \to 0$. Hence (1.34) tends stochastically to zero. The second part is the mixed term
$$\sum_{l=1}^{n} \frac{\partial^2 u}{\partial x^2}(t_{l-1,n}, X_{t_{l-1,n}})\, A_{t_{l-1,n}} B_{t_{l-1,n}} (t_{l,n} - t_{l-1,n})(W_{t_{l,n}} - W_{t_{l-1,n}}).$$
If $\frac{\partial^2 u}{\partial x^2}(t_{l-1,n}, X_{t_{l-1,n}})\, A_{t_{l-1,n}} B_{t_{l-1,n}}$ is bounded then the expectation of the absolute value can be estimated by
$$c \sum_{l=1}^{n} (t_{l,n} - t_{l-1,n})\, E|W_{t_{l,n}} - W_{t_{l-1,n}}|.$$
Using the inequality $E|Z| \le (EZ^2)^{1/2}$, valid for any random variable $Z$, we get the bound
$$c \sum_{l=1}^{n} (t_{l,n} - t_{l-1,n})(t_{l,n} - t_{l-1,n})^{1/2} \to 0,$$
where we used $\max_{1\le l\le n} |t_{l,n} - t_{l-1,n}| \to 0$ again. The sum over the third parts
$$\sum_{l=1}^{n} \frac{1}{2} \frac{\partial^2 u}{\partial x^2}(t_{l-1,n}, X_{t_{l-1,n}})\, B_{t_{l-1,n}}^2 (W_{t_{l,n}} - W_{t_{l-1,n}})^2$$
does not disappear. By arguments similar to those used while studying the quadratic variation of the Wiener process one can show that the last sum tends to
$$\frac{1}{2} \int_0^t \frac{\partial^2 u}{\partial x^2}(s, X_s) B_s^2\,ds,$$
which completes the sketch of the proof.

We now consider special cases. Suppose $X_t$ is a diffusion process already defined by (1.28),
$$X_t = X_0 + \int_0^t a(s, X_s)\,ds + \int_0^t b(s, X_s)\,dW_s. \tag{1.35}$$
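The transformation rules (1.32) and (1.33) give
$$\begin{aligned}
u(t, X_t) = u(0, X_0) &+ \int_0^t \Big[ \frac{\partial u(s, X_s)}{\partial s} + \frac{\partial u(s, X_s)}{\partial x} a(s, X_s) + \frac{1}{2} \frac{\partial^2 u(s, X_s)}{\partial x^2} b^2(s, X_s) \Big]\,ds \\
&+ \int_0^t \frac{\partial u(s, X_s)}{\partial x} b(s, X_s)\,dW_s.
\end{aligned} \tag{1.36}$$
If $u$ depends only on $x$ then
$$u(X_t) = u(X_0) + \int_0^t \Big[ u'(X_s) a(s, X_s) + \frac{1}{2} u''(X_s) b^2(s, X_s) \Big]\,ds + \int_0^t u'(X_s) b(s, X_s)\,dW_s.$$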
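For the simplest case $X_t = W_t$ (that is, $a = 0$, $b = 1$) and $u(x) = x^2$, the last formula reads $W_t^2 = t + 2\int_0^t W_s\,dW_s$. The sketch below checks a discrete counterpart of this statement on one simulated path: the telescoping identity $W_T^2 = 2\sum_i W_{t_i}\,\Delta W_i + \sum_i (\Delta W_i)^2$ holds exactly, and the sum of squared increments is close to $T$, which is the content of rule (1.33). The step count, the seed and the helper name `ito_check` are assumptions of this illustration.

```python
import math
import random

def ito_check(T=1.0, n=100_000, seed=0):
    """Return W_T, the left-endpoint sum of W dW, and the sum of (dW)^2 for one simulated path."""
    rng = random.Random(seed)
    h = T / n
    w = 0.0
    ito_sum = 0.0   # approximates the Ito integral of W_s dW_s over [0, T]
    qv_sum = 0.0    # sum of (dW)^2, which rule (1.33) replaces by dt
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))
        ito_sum += w * dw
        qv_sum += dw * dw
        w += dw
    return w, ito_sum, qv_sum

w_T, ito_sum, qv_sum = ito_check()
print(w_T ** 2, 2.0 * ito_sum + 1.0)   # both sides of W_T^2 = 2 * int W dW + T with T = 1
print(qv_sum)                          # close to T = 1
```

Replacing the left-endpoint values `w` by right-endpoint values in `ito_sum` would change the limit, which is why the Ito integral insists on integrands taken from the past.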
Corollary 1.2 If $X_t$ is a solution of $dX_t = a(t, X_t)\,dt + b(t, X_t)\,dW_t$ then
$$E(u(X_t) - u(X_0)) = E \int_0^t \Big[ u'(X_s) a(s, X_s) + \frac{1}{2} u''(X_s) b^2(s, X_s) \Big]\,ds. \tag{1.37}$$

Proof. Theorem 1.9 shows that
$$E \int_0^t \frac{\partial u(X_s)}{\partial x} b(s, X_s)\,dW_s$$
is independent of $t$ and is therefore zero, as the expression vanishes for $t = 0$.

To conclude this section we note that the diffusion process $X_t$ in (1.28) reduces to the Wiener process in the special case $a = 0$, $b = 1$. But in the general case one may replace the probability measure $P$ by another distribution $Q$ (Girsanov transformation) such that the process $X_t$ becomes a Wiener process with respect to $Q$.
1.4 The Kolmogorov Differential Equation and the Fokker–Planck Equation
We consider the diffusion process defined by the stochastic differential equation
$$dX_t = a(X_t)\,dt + b(X_t)\,dW_t. \tag{1.38}$$
We know from Theorem 1.10 that this process is a Markov process. As both $a$ and $b$ do not depend on $t$ the process is homogeneous. Let $f(t, x, y)$ be the family of transition densities, i.e. $f(t, x, \cdot)$ is the conditional density of $X_t$ given $X_0 = x$. If the process $X_{t,x}$ starts at $t = 0$ in $x$ then $f(t, x, \cdot)$ is the probability density of $X_{t,x}$. This family of densities satisfies the Chapman–Kolmogorov equation
$$f(s + t, x, y) = \int f(s, x, z) f(t, z, y)\,dz, \qquad 0 \le s, t. \tag{1.39}$$
Let $C_b$ be the space of all bounded and measurable functions on $\mathbb{R}$ and denote by $C_0^2$ the space of all twice continuously differentiable functions that vanish outside of some finite interval that may depend on the concrete function under consideration. For $u \in C_b$ we set
$$(T_t u)(x) = \int u(y) f(t, x, y)\,dy = Eu(X_{t,x}).$$
It is easy to see that $T_t u \in C_b$. The Chapman–Kolmogorov equation implies the semigroup property, that is, $T_t T_s = T_{s+t}$. Putting $X_0 = x$ in Corollary 1.2 we get, for any $u \in C_0^2$,
$$(T_t u)(x) = u(x) + E \int_0^t \Big[ u'(X_{s,x}) a(X_{s,x}) + \frac{1}{2} u''(X_{s,x}) b^2(X_{s,x}) \Big]\,ds \tag{1.40}$$
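and therefore
$$\frac{(T_h u)(x) - u(x)}{h} = E\, \frac{1}{h} \int_0^h \Big[ u'(X_{s,x}) a(X_{s,x}) + \frac{1}{2} u''(X_{s,x}) b^2(X_{s,x}) \Big]\,ds.$$
Each diffusion process can be shown to be continuous. Hence $\lim_{s \downarrow 0} X_{s,x} = x$ and
$$\lim_{h \downarrow 0} \frac{1}{h} \int_0^h \Big[ u'(X_{s,x}) a(X_{s,x}) + \frac{1}{2} u''(X_{s,x}) b^2(X_{s,x}) \Big]\,ds = a(x) \frac{\partial u(x)}{\partial x} + \frac{1}{2} b^2(x) \frac{\partial^2 u(x)}{\partial x^2} = (Au)(x),$$
where $A$ is the differential operator
$$A = a(x) \frac{\partial}{\partial x} + \frac{1}{2} b^2(x) \frac{\partial^2}{\partial x^2}. \tag{1.41}$$
This differential operator is the infinitesimal operator of the semigroup in the sense that
$$(Au)(x) = \lim_{h \downarrow 0} \frac{(T_h u)(x) - u(x)}{h}.$$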
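To get a feeling for the operator (1.41), the following sketch applies it numerically by central differences. The coefficients $a(x) = -x$, $b(x) = 1$ and the test function $u(x) = x^2$ are assumptions of this example, for which $(Au)(x) = -2x^2 + 1$ by direct differentiation. Note that this $u$ does not vanish outside a finite interval, so only the local action of $A$ is illustrated here, not the semigroup limit.

```python
def generator(a, b, u, x, eps=1e-4):
    """Apply A = a(x) d/dx + (1/2) b(x)^2 d^2/dx^2 to u at the point x using central differences."""
    du = (u(x + eps) - u(x - eps)) / (2.0 * eps)
    d2u = (u(x + eps) - 2.0 * u(x) + u(x - eps)) / eps ** 2
    return a(x) * du + 0.5 * b(x) ** 2 * d2u

# Hypothetical example: a(x) = -x, b(x) = 1, u(x) = x^2, so that (Au)(x) = -2 x^2 + 1.
a = lambda x: -x
b = lambda x: 1.0
u = lambda x: x * x
for x in (0.0, 0.5, 2.0):
    print(x, generator(a, b, u, x), -2.0 * x * x + 1.0)
```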
Let $I$ be the identity operator. Then we obtain from the semigroup property (1.40) that
\[ \lim_{h \downarrow 0} \frac{T_{t+h} u - T_t u}{h} = \lim_{h \downarrow 0} T_t\, \frac{(T_h - I)u}{h} = T_t A u. \tag{1.42} \]
Similarly,
\[ \lim_{h \downarrow 0} \frac{T_{t+h} u - T_t u}{h} = \lim_{h \downarrow 0} \frac{(T_h - I)}{h}\, T_t u = A T_t u. \]
Thus we have obtained the following result.

Theorem 1.12 If $X_{t,x}$ is the solution of $dX_{t,x} = a(X_{t,x})\, dt + b(X_{t,x})\, dW_t$, $X_{0,x} = x$, and $u \in C_0^2$, then
\[ u(t, x) = (T_t u)(x) = \mathrm{E}\, u(X_{t,x}) = \int u(y)\, f(t, x, y)\, dy \tag{1.43} \]
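satisfies the Kolmogorov forward equation
\[ \frac{\partial u(t, x)}{\partial t} = (T_t A u)(x) = \int \left[ a(y)\, \frac{\partial u(y)}{\partial y} + \frac{1}{2}\, b^2(y)\, \frac{\partial^2 u(y)}{\partial y^2} \right] f(t, x, y)\, dy \tag{1.44} \]
and the Kolmogorov backward equation
\[ \frac{\partial u(t, x)}{\partial t} = A (T_t u)(x) = a(x)\, \frac{\partial u(t, x)}{\partial x} + \frac{1}{2}\, b^2(x)\, \frac{\partial^2 u(t, x)}{\partial x^2}. \tag{1.45} \]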
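Proof. The statement (1.44) follows from (1.42). Similarly, (1.45) follows from (1.43).

Now we establish differential equations for the transition densities. To this end we apply integration by parts. If $u, v \in C_0^2$, and both $a$ and $b$ are twice continuously differentiable, then
\[ \int a(x)\, \frac{du(x)}{dx}\, v(x)\, dx = - \int \frac{d\big(a(x) v(x)\big)}{dx}\, u(x)\, dx, \]
\[ \int b^2(x)\, \frac{d^2 u(x)}{dx^2}\, v(x)\, dx = \int \frac{d^2\big(b^2(x) v(x)\big)}{dx^2}\, u(x)\, dx. \]
The application to (1.44) yields
\[ \frac{\partial u(t, x)}{\partial t} = \int \left[ a(y)\, \frac{\partial u(y)}{\partial y} + \frac{1}{2}\, b^2(y)\, \frac{\partial^2 u(y)}{\partial y^2} \right] f(t, x, y)\, dy = \int \left[ - \frac{\partial\big(a(y) f(t, x, y)\big)}{\partial y} + \frac{1}{2}\, \frac{\partial^2\big(b^2(y) f(t, x, y)\big)}{\partial y^2} \right] u(y)\, dy. \]
On the other hand,
\[ \frac{\partial u(t, x)}{\partial t} = \frac{\partial}{\partial t} \int u(y)\, f(t, x, y)\, dy = \int u(y)\, \frac{\partial}{\partial t} f(t, x, y)\, dy. \tag{1.46} \]
Hence for every $u \in C_0^2$
\[ \int \left[ \frac{\partial}{\partial t} f(t, x, y) + \frac{\partial\big(a(y) f(t, x, y)\big)}{\partial y} - \frac{1}{2}\, \frac{\partial^2\big(b^2(y) f(t, x, y)\big)}{\partial y^2} \right] u(y)\, dy = 0. \tag{1.47} \]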
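Let $u \in C_0^2$ be any probability density with support in $[0, 1]$, e.g. $u(t) = c\, t^2 (1 - t)^2$ for $0 \le t \le 1$, where $c$ is determined by
\[ \int_0^1 u(t)\, dt = 1. \]
Put for every fixed $z$
\[ u_n(t) = n\, u\big(n(t - z)\big). \tag{1.48} \]
For large $n$ the function $u_n$ is concentrated around $z$. When $\psi$ is twice continuously differentiable we get
\[ \int \psi(t)\, u_n(t)\, dt = \int \psi(t)\, n\, u\big(n(t - z)\big)\, dt = \int \psi\!\left(z + \frac{s}{n}\right) u(s)\, ds \;\to\; \int \psi(z)\, u(s)\, ds = \psi(z). \]
The application of this statement to (1.47) yields the so-called forward Fokker–Planck equation
\[ \frac{\partial}{\partial t} f(t, x, z) = - \frac{\partial\big(a(z) f(t, x, z)\big)}{\partial z} + \frac{1}{2}\, \frac{\partial^2\big(b^2(z) f(t, x, z)\big)}{\partial z^2}. \tag{1.49} \]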
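Similarly, the relation (1.45) yields
\[ \frac{\partial u(t, x)}{\partial t} = a(x)\, \frac{\partial u(t, x)}{\partial x} + \frac{1}{2}\, b^2(x)\, \frac{\partial^2 u(t, x)}{\partial x^2} = a(x)\, \frac{\partial}{\partial x} \int u(y)\, f(t, x, y)\, dy + \frac{1}{2}\, b^2(x)\, \frac{\partial^2}{\partial x^2} \int u(y)\, f(t, x, y)\, dy \]
\[ = \int u(y) \left[ a(x)\, \frac{\partial f(t, x, y)}{\partial x} + \frac{1}{2}\, b^2(x)\, \frac{\partial^2 f(t, x, y)}{\partial x^2} \right] dy. \]
Because of (1.46) we arrive at
\[ \int u(y) \left[ \frac{\partial}{\partial t} f(t, x, y) - a(x)\, \frac{\partial f(t, x, y)}{\partial x} - \frac{1}{2}\, b^2(x)\, \frac{\partial^2 f(t, x, y)}{\partial x^2} \right] dy = 0. \]
Again by plugging in $u_n$ from (1.48) and by letting $n \to \infty$ we obtain
\[ \frac{\partial}{\partial t} f(t, x, y) = a(x)\, \frac{\partial f(t, x, y)}{\partial x} + \frac{1}{2}\, b^2(x)\, \frac{\partial^2 f(t, x, y)}{\partial x^2}, \tag{1.50} \]
which is called the backward Fokker–Planck equation.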
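The forward equation (1.49) and the backward equation (1.50) can be checked numerically in a simple special case. The following minimal sketch assumes constant coefficients a and b (Brownian motion with drift), for which the transition density is Gaussian; the function names, the parameter values and the step size h are illustrative choices and do not come from the text.

```python
import math


def transition_density(t, x, y, a=0.3, b=0.8):
    """Transition density of X_t = x + a*t + b*W_t (constant a and b, assumed here)."""
    var = b * b * t
    return math.exp(-((y - x - a * t) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def forward_residual(t, x, y, a=0.3, b=0.8, h=1e-3):
    """Central-difference residual of (1.49), taken in the forward variable y."""
    f = lambda tt, yy: transition_density(tt, x, yy, a, b)
    df_dt = (f(t + h, y) - f(t - h, y)) / (2.0 * h)
    df_dy = (f(t, y + h) - f(t, y - h)) / (2.0 * h)
    d2f_dy2 = (f(t, y + h) - 2.0 * f(t, y) + f(t, y - h)) / (h * h)
    # For constant a, b: d(a f)/dy = a df/dy and d^2(b^2 f)/dy^2 = b^2 d^2f/dy^2.
    return df_dt + a * df_dy - 0.5 * b * b * d2f_dy2


def backward_residual(t, x, y, a=0.3, b=0.8, h=1e-3):
    """Central-difference residual of (1.50), taken in the backward variable x."""
    f = lambda tt, xx: transition_density(tt, xx, y, a, b)
    df_dt = (f(t + h, x) - f(t - h, x)) / (2.0 * h)
    df_dx = (f(t, x + h) - f(t, x - h)) / (2.0 * h)
    d2f_dx2 = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / (h * h)
    return df_dt - a * df_dx - 0.5 * b * b * d2f_dx2


if __name__ == "__main__":
    # Both residuals should be small compared with the individual terms.
    print(forward_residual(1.0, 0.0, 0.5), backward_residual(1.0, 0.0, 0.5))
```

Both residuals vanish only up to the finite-difference error, so the check is a plausibility test rather than a proof.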
1.5 Special Diffusion Processes
This section is aimed at presenting special examples of diffusion processes and studying the relation between them.

Example 1.2 If $X_t = W_t$ is the Wiener process then $a = 0$ and $b = 1$ in the stochastic differential equation (1.38). Since $f(t, x, \cdot)$ is the density of $x + W_t$, the family of transition densities is given by $f(t, x, y) = \varphi_{0,t}(y - x)$, where $\varphi_{\mu,\sigma^2}$ is the density of the normal distribution with parameters $\mu$ and $\sigma^2$. We see from (1.41) that the infinitesimal operator $A$ is given by
\[ A = \frac{1}{2}\, \frac{\partial^2}{\partial x^2}. \]
Putting
\[ u(t, x) = \int u(y)\, f(t, x, y)\, dy \]
the Kolmogorov backward equation (1.45) reads
\[ \frac{\partial u(t, x)}{\partial t} = \frac{1}{2}\, \frac{\partial^2 u(t, x)}{\partial x^2}. \]
In physics this type of equation is called the heat equation (or the pure diffusion equation, i.e. diffusion without drift). The Fokker–Planck equation has the same form
\[ \frac{\partial f(t, x, y)}{\partial t} = \frac{1}{2}\, \frac{\partial^2 f(t, x, y)}{\partial y^2}. \]
Of course, the above differential equation could also have been obtained directly using the fact that the transition density is, in view of $X_{t,x} = x + W_t$, given by
\[ f(t, x, y) = \varphi_{0,t}(y - x) = \frac{1}{\sqrt{2\pi t}} \exp\left\{ - \frac{(y - x)^2}{2t} \right\}. \]

Example 1.3 The Ornstein–Uhlenbeck process is defined to be a solution of the following stochastic differential equation
\[ dX_t = \mu X_t\, dt + \sigma\, dW_t. \]
To solve this equation we apply the Ito formula to the process $X_t \exp\{-\mu t\}$, i.e. we choose $u(t, x) = x \exp\{-\mu t\}$. The formula for $du(t, X_t)$ in Theorem 1.11 (Ito formula) gives
\[ d\big(X_t \exp\{-\mu t\}\big) = \frac{\partial u(t, X_t)}{\partial t}\, dt + \frac{\partial u(t, X_t)}{\partial x}\, dX_t + \frac{1}{2}\, \frac{\partial^2 u(t, X_t)}{\partial x^2}\, (dX_t)^2 \]
\[ = -\mu X_t \exp\{-\mu t\}\, dt + \exp\{-\mu t\}\, dX_t = \exp\{-\mu t\}\, \sigma\, dW_t. \]
Hence
\[ \exp\{-\mu t\}\, X_t - X_0 = \sigma \int_0^t \exp\{-\mu s\}\, dW_s, \]
\[ X_t = X_0 \exp\{\mu t\} + \sigma \int_0^t \exp\{\mu (t - s)\}\, dW_s. \]
The infinitesimal operator reads
\[ A = \mu x\, \frac{\partial}{\partial x} + \frac{1}{2}\, \sigma^2\, \frac{\partial^2}{\partial x^2}. \]
From the definition of the Ito integral one easily concludes that for any nonrandom function $h$ the random variable
\[ \int_0^t h(s)\, dW_s \]
has a normal distribution with expectation zero and variance $\int_0^t h^2(s)\, ds$. This means that the distribution of $\sigma \int_0^t \exp\{\mu (t - s)\}\, dW_s$ is a normal distribution with expectation zero and a variance given by
\[ \sigma^2 \exp\{2\mu t\} \int_0^t \exp\{-2\mu s\}\, ds = - \frac{\sigma^2}{2\mu} \exp\{2\mu t\} \big[ \exp\{-2\mu t\} - 1 \big] = - \frac{\sigma^2}{2\mu} \big[ 1 - \exp\{2\mu t\} \big] \;\longrightarrow\; - \frac{\sigma^2}{2\mu} \quad \text{for } t \to \infty \]
if $\mu < 0$. In this case $X_0 \exp\{\mu t\}$ tends to zero. Hence for $\mu < 0$ the one-dimensional marginal distribution of $X_t$ tends to a normal distribution with expectation zero and variance $-\sigma^2 / (2\mu)$. One can show that this distribution, when used as an initial distribution of $X_0$, turns the Ornstein–Uhlenbeck process into a stationary process.

Example 1.4 We consider the geometric Brownian motion that is defined by $Y_t = \exp\{\mu t + \sigma W_t\}$. Put $X_t = \mu t + \sigma W_t$. We use the Ito formula in Theorem 1.11 with $u(x) = \exp\{x\}$. Hence by (1.32) and (1.33)
\[ dY_t = \frac{\partial u(t, X_t)}{\partial t}\, dt + \frac{\partial u(t, X_t)}{\partial x}\, dX_t + \frac{1}{2}\, \frac{\partial^2 u(t, X_t)}{\partial x^2}\, (dX_t)^2 = u(X_t)\, dX_t + \frac{1}{2}\, u(X_t)\, (\mu\, dt + \sigma\, dW_t)^2 \]
\[ = Y_t\, dX_t + \frac{\sigma^2}{2}\, Y_t\, dt = Y_t \left( \mu + \frac{\sigma^2}{2} \right) dt + \sigma Y_t\, dW_t. \]
In particular, for µ = −σ2 /2 we get dYt = σYt dWt . Hence we see from (1.23) that Yt has the constant expectation EY0 = 1. These special cases of diffusion processes considered in the last three examples will be discussed in more detail in Chapter 6 (Wiener process or Brownian motion from Example 1.2), in Chapter 8 (Ornstein–Uhlenbeck process from Example 1.3) and in Chapter 11 as well as in Section 5.9 (geometric Brownian motion from Example 1.4).
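The closed-form statements of Examples 1.3 and 1.4 are easy to cross-check numerically. The sketch below compares the Ornstein–Uhlenbeck variance formula with a midpoint-rule evaluation of the integral of h²(s), and evaluates the mean of the geometric Brownian motion from the log-normal formula E exp{µt + σW_t} = exp{(µ + σ²/2)t}. The function names and parameter values are illustrative assumptions, not part of the text.

```python
import math


def ou_variance(t, mu, sigma):
    """Closed form -(sigma^2 / (2 mu)) * (1 - exp(2 mu t)) from Example 1.3."""
    return -(sigma ** 2) / (2.0 * mu) * (1.0 - math.exp(2.0 * mu * t))


def ou_variance_quadrature(t, mu, sigma, n=20000):
    """Midpoint rule for int_0^t h(s)^2 ds with h(s) = sigma * exp(mu * (t - s))."""
    ds = t / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        total += (sigma * math.exp(mu * (t - s))) ** 2
    return total * ds


def gbm_mean(t, mu, sigma):
    """E[exp(mu t + sigma W_t)] for W_t ~ N(0, t), using the log-normal mean."""
    return math.exp((mu + 0.5 * sigma ** 2) * t)


if __name__ == "__main__":
    mu, sigma = -0.5, 1.5
    print(ou_variance(2.0, mu, sigma), ou_variance_quadrature(2.0, mu, sigma))
    print(ou_variance(50.0, mu, sigma), -(sigma ** 2) / (2.0 * mu))  # long-time limit
    print(gbm_mean(3.0, -0.5 * sigma ** 2, sigma))  # constant expectation for mu = -sigma^2/2
```

For µ < 0 the variance saturates at −σ²/(2µ), and for µ = −σ²/2 the mean of Y_t stays at its initial value, in line with the two examples.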
1.6 Exercises
E 1.1 Ito diffusion Write a computer program using the Euler discretization algorithm of the Ito stochastic differential equation (1.38) to study special cases of Ito diffusion such as the Wiener process, Brownian motion with constant drift, and especially geometric Brownian motion (see Examples 1.2–1.4 in Section 1.5). Start with a simulation of the Wiener process $dX_t = dW_t$ using a discrete time interval $\Delta t$ and normally distributed random numbers $Z \sim N(0, 1)$ generated by the Box–Muller and/or the polar method. Check the known properties of the Wiener process by considering the Wiener difference $\Delta W_t = W_{t+\Delta t} - W_t$ over the time step $\Delta t$ in the limit $\Delta t \to 0$.

E 1.2 Brownian paths in higher dimensions Study Brownian paths (or Wiener trails) in higher dimensions $\mathbb{R}^n$ ($n \ge 2$) and show that the n-dimensional Brownian motion is isotropic by doing simulations of Brownian paths in $\mathbb{R}^2$.

E 1.3 Hausdorff dimension The Hausdorff dimension and the box-counting dimension of a Brownian trail in $\mathbb{R}^n$ ($n \ge 2$) are both equal to 2. Try to find the Hausdorff and box dimension for a graph (realization) of Brownian motion in $\mathbb{R}^1$ (one-dimensional case).

E 1.4 Stochastic process with constant drift and diffusion Find the Fokker–Planck equation for the stochastic process that satisfies the stochastic differential equation $dX_t = -a\, dt + b\, dW_t$, where $a$ and $b$ are constants and $dW_t = W_{t+dt} - W_t$ is the increment of a Wiener process (also called white noise).

E 1.5 Stochastic Ornstein–Uhlenbeck process Consider Example 1.3 in Section 1.5 (Ornstein–Uhlenbeck process) in more detail and find the solution of the corresponding Fokker–Planck equation related to $du_t = -\mu\, u_t\, dt + \sigma\, dW_t$ with non-negative constants $\mu$, $\sigma$ and the given initial condition $u_{t=0} = u_0$. Show that the probability density $p(u, t)$ becomes stationary and that the so-called fluctuation–dissipation relation holds.
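As a possible starting point for Exercise E 1.1, the following sketch implements the Euler discretization X_{n+1} = X_n + a(X_n)Δt + b(X_n)√Δt Z_n with Box–Muller random numbers. It is one way to set up the simulation, not a reference solution; the function names, the seed and the sample sizes are assumptions made for illustration.

```python
import math
import random


def box_muller(rng):
    """Return one N(0, 1) sample from two uniform random numbers (Box-Muller method)."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def euler_path(a, b, x0, dt, n_steps, rng):
    """Euler discretization of dX = a(X) dt + b(X) dW, returned as a list of values."""
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = math.sqrt(dt) * box_muller(rng)  # Wiener increment over one time step
        x = x + a(x) * dt + b(x) * dw
        path.append(x)
    return path


def wiener_endpoint_stats(n_paths=2000, n_steps=100, dt=0.01, seed=1):
    """Sample mean and variance of W_T with T = n_steps * dt, estimated from n_paths paths."""
    rng = random.Random(seed)
    ends = [
        euler_path(lambda x: 0.0, lambda x: 1.0, 0.0, dt, n_steps, rng)[-1]
        for _ in range(n_paths)
    ]
    mean = sum(ends) / n_paths
    var = sum((e - mean) ** 2 for e in ends) / (n_paths - 1)
    return mean, var


if __name__ == "__main__":
    # For the Wiener process the mean should be close to 0 and the variance close to T.
    print(wiener_endpoint_stats())
```

Geometric Brownian motion follows from the same routine by passing, for instance, `a = lambda y: (mu + 0.5 * sigma**2) * y` and `b = lambda y: sigma * y`, with `mu` and `sigma` chosen by the reader.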
2 Multidimensional Approach
2.1 Bounded Multidimensional Region
As it is known already from Chapter 1 and many textbooks, see e.g. [55,193], Markovian stochastic processes are completely determined by their conditional probabilities which obey the Chapman–Kolmogorov equation. The Kramers–Moyal expansion can be used to determine the Fokker–Planck equation by specifying the drift vector and diffusion matrix based on the assumption of vanishing higher order Kramers–Moyal coefficients. Usually, the Fokker–Planck equation is derived implicitly assuming that the phase space of the stochastic variables under consideration extends to infinity, so that so-called natural boundary conditions can be applied. If stochastic processes in a finite region of phase space are considered, boundary conditions are introduced a posteriori based on apparent physical arguments leading to the notion of a reflecting barrier, characterized by a vanishing normal component of the probability current, an absorbing barrier, where the probability distribution has to vanish, and boundary conditions at a discontinuity, where probability distributions and the normal components of the probability current have to be continuous. No attempts, so far, have been made to derive the Fokker–Planck equation simultaneously with appropriate boundary conditions from the Chapman–Kolmogorov equation. It is quite evident that boundaries can strongly influence the stochastic motion of a particle in various ways depending on the microscopic interactions. As an example we mention a boundary formed by a fast diffusion layer. In such a thin layer, particles are able to diffuse in the directions tangential to the boundary on a fast time scale, whereas in the bulk the particle behavior should be accurately described by the Fokker–Planck equation. The theoretical treatment of the particle diffusion requires the formulation of consistent boundary conditions which match the internal Fokker–Planck behavior to the stochastic properties of the boundary layer. 
So it would be desirable to have a technique for deriving the boundary conditions, referring directly to the way in which the regional boundaries affect the stochastic processes. In this respect, we note the unified formulation of the Fokker–Planck equation for stochastic hybrid systems by Julien Bect [17, 18] devoted to a general
description of random processes near boundaries causing deterministic jumps. Boundary conditions for the Fokker–Planck equation which describes coupled transport of photons and electrons are derived in [2]. A series of papers [162, 171, 224] dealing with boundary conditions for the advection–diffusion problem combines the Boltzmann and Fokker–Planck equations and their numerical implementation, and [207] develops diffusion models for molecular transport across membranes via ion channels and wider pores in terms of random walks affected by boundaries with complex properties. In addition, [70] actually constructs the absorbing boundary as a limiting transition of an infinite space with halfspaces having substantially different properties and [242] implements boundary conditions for the Wiener processes in path integrals. Papers [56] and [87] develop a rather sophisticated moment technique for tackling the Fokker–Planck equation with mixed boundary conditions based on a special moment truncation scheme. In this chapter we extend the method of deriving the Fokker–Planck equation from the Chapman–Kolmogorov equation in such a way that simultaneously consistent boundary conditions can be formulated. Our approach is based on the introduction of physical models for the stochastic behavior close to the boundary. We demonstrate that boundaries break the symmetry of the random forces leading to boundary singularities in the Kramers–Moyal expansion. The cancellation of these singularities yields the appropriate boundary conditions. We explicitly derive the boundary conditions for a reflecting or absorbing barrier and describe the general procedure for the derivation of the boundary conditions for the case of a fast diffusion layer. It should be noted that a similar anomalous effect of the regional boundaries on random processes was analyzed in [117, 178] and [182] by numerical implementation of the Wiener processes near the boundaries. 
In addition, paper [163] applies the concept of symmetry breakdown, caused however by external fields, to construct a generalized master equation for classical and anomalous diffusion processes. In principle the present approach can be extended to anomalous transport phenomena, e.g. sub- and super-diffusion, which are modeled by fractional diffusion operators. It is well known that the formulation of boundary conditions for these processes is still a challenging problem although several approaches have been developed [9, 109, 136, 219]. The procedure outlined here might be helpful in formulating appropriate boundary conditions for these more complicated processes.

The main subject of the chapter is to derive the boundary conditions for the Fokker–Planck equations, both forward and backward ones, directly from the Chapman–Kolmogorov equation. The Fokker–Planck equations themselves will also be obtained: first, because this makes the subject more complete and, second, because it demonstrates the relationship between the basic elements in constructing a differential description of a Markovian process for internal points and near the boundaries. To do this an M-dimensional region with boundaries is considered. The boundaries are assumed, in addition, to be able to absorb particles or to give rise to fast surface transport. It is demonstrated that the boundaries break down the symmetry of random walks in their vicinity, leading to boundary singularities in the corresponding kinetic coefficients. Eliminating these singularities we get the desired boundary conditions. As required, the boundary condition for the forward Fokker–Planck equation satisfies mass conservation.
2.2 From Chapman–Kolmogorov Equation to Fokker–Planck Description
We consider the stochastic dynamics of a Markovian system represented as a point $\mathbf{r}$ belonging to a certain domain $Q$ in the Euclidean M-dimensional space $\mathbb{R}^M$. The domain $Q$ is assumed to be bounded by a smooth hypersurface $\Upsilon$. When the detailed information about possible trajectories $\{\mathbf{r}(t)\}$ of the system motion is of minor importance, the conditional probability, also called the Green function,
\[ G(\mathbf{r}, t | \mathbf{r}_0, t_0) := P\{\mathbf{r}_0, t_0 \Rightarrow \mathbf{r}, t\} \]
gives us the complete description of the system evolution. By definition, the Green function is the probability density of finding the system at the point $\mathbf{r}$ at time $t$ provided it was located at the point $\mathbf{r}_0$ at the initial time $t_0$. Since Markovian systems have no memory, the Green function $G(\mathbf{r}, t | \mathbf{r}_0, t_0)$ obeys the integral Chapman–Kolmogorov equation that represents the transition of the system from the initial point $\mathbf{r}_0$ to the terminal one $\mathbf{r}$ within the time interval $(t_0, t)$ as a complex step via an intermediate point $\mathbf{r}_* \in Q$ at a certain fixed moment of time $t_*$ with succeeding summation over all possible positions of the intermediate point (see, e.g., [55])
\[ G(\mathbf{r}, t | \mathbf{r}_0, t_0) = \iiint_Q d\mathbf{r}_*\, G(\mathbf{r}, t | \mathbf{r}_*, t_*)\, G(\mathbf{r}_*, t_* | \mathbf{r}_0, t_0). \tag{2.1} \]
The time $t_*$ may be chosen arbitrarily between the initial and terminal time moments, $t_* \in [t_0, t]$. Figure 2.1 depicts this equation.

Figure 2.1 Diagram of the Chapman–Kolmogorov equation. The transition from $(\mathbf{r}_0, t_0)$ to $(\mathbf{r}, t)$, described by $G(\mathbf{r}, t | \mathbf{r}_0, t_0)$, is composed of the steps $G(\mathbf{r}_*, t_* | \mathbf{r}_0, t_0)$ and $G(\mathbf{r}, t | \mathbf{r}_*, t_*)$ with summation over the intermediate point $\mathbf{r}_*$. The arrows illustrate the limiting cases $t_* \to t_0 + 0$ and $t_* \to t - 0$, matching the backward and forward Fokker–Planck equations.

The domain boundary $\Upsilon$ will be regarded as a physical object, and so some individual properties are ascribed to it. In particular, the boundary itself can affect the system, for example, by trapping it. So the symbol of the triple integral is used in (2.1) to underline this feature and, where appropriate, it should be read as
\[ \iiint_Q d\mathbf{r} \ldots = \int_{Q^+} d\mathbf{r} \ldots + \oint_{\Upsilon} d\mathbf{s} \ldots + \oint_{\Upsilon_{\mathrm{tr}}} d\mathbf{s} \ldots \]
where the symbol $Q^+$ denotes the internal points of the domain $Q$. The boundary $\Upsilon$ is split from the medium bulk because it can differ essentially from the medium bulk in its properties, and the boundary traps $\Upsilon_{\mathrm{tr}}$ are singled out and treated individually for the same reasons. To simplify the notation a similar rule
\[ \iint_Q d\mathbf{r} \ldots = \int_{Q^+} d\mathbf{r} \ldots + \oint_{\Upsilon} d\mathbf{s} \ldots \]
is also adopted. The integrals are split in order to treat the motion of the system inside the internal points $Q^+$, its possible anomalous transport along the boundary $\Upsilon$, and the trap effect, individually. Also, according to the probability definition, the equality
\[ \iiint_Q d\mathbf{r}\, G(\mathbf{r}, t | \mathbf{r}_0, t_0) = 1 \tag{2.2} \]
holds when the integration runs over all possible states of the system including the boundary traps $\Upsilon_{\mathrm{tr}}$. In the following a general model for the medium boundary will be studied. Here we pay attention only to the fact that the boundary traps have to be treated individually, because the system after being trapped cannot leave the boundary and remains in the trap forever. As a result, if the point $\mathbf{r}_0$ belongs to a trap, then for any internal point $\mathbf{r}$ of the domain $Q$ the Green function is equal to zero,
\[ G(\mathbf{r}, t | \mathbf{r}_0, t_0) = 0 \quad \text{for } \mathbf{r}_0 \in \Upsilon_{\mathrm{tr}}, \; \mathbf{r} \in Q^+. \]
Later, the Green function $G(\mathbf{r}, t | \mathbf{r}_0, t_0)$ for the internal initial and terminal points $\mathbf{r}_0, \mathbf{r} \in Q^+$ will be considered. Therefore, the general Chapman–Kolmogorov equation (2.1) can be reduced by eliminating the integration over the traps, so becoming
\[ G(\mathbf{r}, t | \mathbf{r}_0, t_0) = \iint_Q d\mathbf{r}_*\, G(\mathbf{r}, t | \mathbf{r}_*, t_*)\, G(\mathbf{r}_*, t_* | \mathbf{r}_0, t_0). \tag{2.3} \]
In (2.3) this elimination is pointed out by the absence of one integral matching the traps, cf. the general formulation (2.1) of the Chapman–Kolmogorov equation. Within the given integration rule the equality matching identity (2.2) is violated and we have
\[ \iint_Q d\mathbf{r}\, G(\mathbf{r}, t | \mathbf{r}_0, t_0) = 1 - \oint_{\Upsilon_{\mathrm{tr}}} d\mathbf{s}_{\mathrm{tr}}\, G(\mathbf{s}_{\mathrm{tr}}, t | \mathbf{r}_0, t_0) < 1, \tag{2.4} \]
where the symbol $\mathbf{s}_{\mathrm{tr}}$ stands for the boundary trap located at the point $\mathbf{s} \in \Upsilon$.
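The composition rule expressed by the Chapman–Kolmogorov equation can be illustrated in the simplest setting. The sketch below assumes one-dimensional free diffusion with a constant diffusion coefficient D, no boundaries and no traps, so that the Green function is Gaussian with variance 2D(t − t0); the function names, the integration window and the grid size are illustrative assumptions.

```python
import math


def green(x, t, x0, t0, D=0.5):
    """Free-space Green function of 1D diffusion with constant D (no boundary, no traps)."""
    var = 2.0 * D * (t - t0)
    return math.exp(-((x - x0) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def chapman_kolmogorov_rhs(x, t, x0, t0, t_star, D=0.5, half_width=12.0, n=4000):
    """Midpoint-rule integral over the intermediate point x_star at the time t_star."""
    dx = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x_star = x0 - half_width + (k + 0.5) * dx
        total += green(x, t, x_star, t_star, D) * green(x_star, t_star, x0, t0, D)
    return total * dx


if __name__ == "__main__":
    lhs = green(0.7, 1.0, 0.0, 0.0)
    # The result should not depend on the choice of the intermediate time.
    for t_star in (0.1, 0.4, 0.9):
        print(t_star, lhs, chapman_kolmogorov_rhs(0.7, 1.0, 0.0, 0.0, t_star))
```

The freedom in choosing the intermediate time, visible here as the independence of the result from `t_star`, is what the derivation exploits when the intermediate time is moved towards the initial or the terminal time.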
In order to obtain the Fokker–Planck equations, two additional assumptions must be adopted. The former is the short time confinement, meaning that on small time scales the system cannot jump over long distances, or in terms of the Green function, its first and second moments converge and
\[ \lim_{t \to t_0 + 0} \iint_Q d\mathbf{r}\, G(\mathbf{r}, t | \mathbf{r}_0, t_0)\, |\mathbf{r} - \mathbf{r}_0|^p = 0, \qquad p = 1, 2. \tag{2.5} \]
The latter is the medium local homogeneity; in other words, the medium where the Markovian process develops, i.e. the domain $Q$, should be endowed with characteristics being actually some smooth fields determined inside $Q^+$ or at $\Upsilon$ individually. As a result, the Green function $G(\mathbf{r}, t | \mathbf{r}_0, t_0)$ has to be smooth with respect to all its arguments for $t > t_0$ and $\mathbf{r}, \mathbf{r}_0 \in Q^+$. Because the intermediate time $t_*$ entering the Chapman–Kolmogorov equation is any fixed value between the initial and terminal time moments, $t_0 < t_* < t$, there is a freedom to choose it for specific purposes. In particular, the passage to one of the limits $t_* \to t_0 + 0$ or $t_* \to t - 0$ gives rise to either the backward or forward Fokker–Planck equation, respectively (see Figure 2.1).

2.2.1 The Backward Fokker–Planck Equation
To implement the limit $t_* \to t_0 + 0$ let us choose an arbitrary small time scale $\tau$ and consider the Chapman–Kolmogorov equation for $t_* = t_0 + \tau$ and an internal point $\mathbf{r}_0$. Then, according to the adopted assumptions, the first multiplier $G(\mathbf{r}, t | \mathbf{r}_*, t_*)$ on the right-hand side of (2.3) is a smooth function of both arguments $\mathbf{r}_*$ and $t_*$, whereas the second one, $G(\mathbf{r}_*, t_* | \mathbf{r}_0, t_0)$, exhibits strong variations on small spatial scales. So we can expand the function $G(\mathbf{r}, t | \mathbf{r}_0 + \mathbf{R}, t_0 + \tau)$ in the Taylor series with respect to the variables $\tau$ and $\mathbf{R} = \mathbf{r}_* - \mathbf{r}_0$. The required accuracy is the first order in the time step $\tau$ and the second order in $\mathbf{R}$ because the characteristic spatial displacement of the system during time $\tau$ is of order $\tau^{1/2}$. Within this accuracy it is
\[ G(\mathbf{r}, t | \mathbf{r}_0 + \mathbf{R}, t_0 + \tau) = G(\mathbf{r}, t | \mathbf{r}_0, t_0) + \tau\, \frac{\partial G(\mathbf{r}, t | \mathbf{r}_0, t_0)}{\partial t_0} + \sum_{i=1}^{M} R^i\, \nabla_i^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0) + \frac{1}{2} \sum_{i,j=1}^{M} R^i R^j\, \nabla_i^0 \nabla_j^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0), \tag{2.6} \]
where the operator $\nabla_i^0 = \partial / \partial x_0^i$ acts only on the argument $\mathbf{r}_0$ of the Green function. The substitution of expansion (2.6) into the Chapman–Kolmogorov equation (2.3)
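reduces it to the following
\[ -\tau\, \frac{\partial G(\mathbf{r}, t | \mathbf{r}_0, t_0)}{\partial t_0} = -R(\mathbf{r}_0, t_0, \tau)\, G(\mathbf{r}, t | \mathbf{r}_0, t_0) + \sum_{i=1}^{M} U^i(\mathbf{r}_0, t_0, \tau)\, \nabla_i^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0) + \sum_{i,j=1}^{M} L^{ij}(\mathbf{r}_0, t_0, \tau)\, \nabla_i^0 \nabla_j^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0), \tag{2.7} \]
where the quantities
\[ R(\mathbf{r}_0, t_0, \tau) = 1 - \iint_Q d\mathbf{R}\, G(\mathbf{r}_0 + \mathbf{R}, t_0 + \tau | \mathbf{r}_0, t_0), \tag{2.8} \]
\[ U^i(\mathbf{r}_0, t_0, \tau) = \iint_Q d\mathbf{R}\, R^i\, G(\mathbf{r}_0 + \mathbf{R}, t_0 + \tau | \mathbf{r}_0, t_0), \tag{2.9} \]
\[ L^{ij}(\mathbf{r}_0, t_0, \tau) = \frac{1}{2} \iint_Q d\mathbf{R}\, R^i R^j\, G(\mathbf{r}_0 + \mathbf{R}, t_0 + \tau | \mathbf{r}_0, t_0) \tag{2.10} \]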
have been introduced. Also, the first term on the right-hand side of (2.7) has been assumed to be small and to tend to zero as $\tau \to 0$, which is justified by the results to be obtained. For an internal point $\mathbf{r}_0$, which is thus separated from the boundary $\Upsilon$ by a finite distance, the time step $\tau$ can be chosen so small that it is possible to construct a neighborhood of the point $\mathbf{r}_0$ with the following properties. First, the deviation of the Green function $G(\mathbf{r}_0 + \mathbf{R}, t_0 + \tau | \mathbf{r}_0, t_0)$ from zero outside this neighborhood is negligible due to the first assumption of short time confinement. Second, inside it the medium can be regarded as the homogeneous space $\mathbb{R}^M$ by virtue of the second assumption on the local homogeneity. In this case, essentially replicating the proof of the Law of Large Numbers using the generating function notion (see, e.g. [55]), it is possible to demonstrate that quantities (2.9) and (2.10) scale linearly with $\tau$. The difference of quantity (2.8) from zero is negligible. Therefore, for internal points, we can introduce the drift velocity $v^i(\mathbf{r}, t)$ and the diffusion tensor $D^{ij}(\mathbf{r}, t)$ by the expressions
\[ v^i(\mathbf{r}, t) = \lim_{\tau \to +0} \frac{1}{\tau} \int_{Q^+} d\mathbf{R}\, R^i\, G(\mathbf{r} + \mathbf{R}, t + \tau | \mathbf{r}, t), \tag{2.11} \]
\[ D^{ij}(\mathbf{r}, t) = \lim_{\tau \to +0} \frac{1}{2\tau} \int_{Q^+} d\mathbf{R}\, R^i R^j\, G(\mathbf{r} + \mathbf{R}, t + \tau | \mathbf{r}, t). \tag{2.12} \]
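Definitions (2.11) and (2.12) can be made concrete with a short numerical experiment. The sketch below assumes a one-dimensional Gaussian short-time propagator with mean vτ and variance 2Dτ, as one would expect far from any boundary, and evaluates the first and second moments by the midpoint rule. The kernel, the parameter values and all function names are assumptions made for illustration.

```python
import math


def short_time_kernel(R, tau, v=0.4, D=0.25):
    """Assumed Gaussian short-time propagator: mean v*tau, variance 2*D*tau."""
    var = 2.0 * D * tau
    return math.exp(-((R - v * tau) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def kinetic_coefficients(tau, v=0.4, D=0.25, n=20000):
    """Return (U / tau, L / tau), the finite-tau counterparts of (2.11) and (2.12)."""
    sigma = math.sqrt(2.0 * D * tau)
    half_width = 10.0 * sigma + abs(v) * tau  # window that contains essentially all the weight
    dR = 2.0 * half_width / n
    U = 0.0
    L = 0.0
    for k in range(n):
        R = -half_width + (k + 0.5) * dR
        G = short_time_kernel(R, tau, v, D)
        U += R * G * dR              # first moment, cf. (2.9)
        L += 0.5 * R * R * G * dR    # half of the second moment, cf. (2.10)
    return U / tau, L / tau


if __name__ == "__main__":
    for tau in (1e-2, 1e-3, 1e-4):
        # The two ratios approach the drift v and the diffusion coefficient D as tau -> 0.
        print(tau, kinetic_coefficients(tau))
```

For this kernel the second ratio equals D + v²τ/2 exactly, so the correction to D disappears linearly in τ, which is the regular scaling attributed to internal points.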
Then, for the internal points, the division of (2.7) by $\tau$ and the succeeding passage to the limit $\tau \to +0$ yields the backward Fokker–Planck equation
\[ - \frac{\partial G(\mathbf{r}, t | \mathbf{r}_0, t_0)}{\partial t_0} = L_{\mathrm{FPB}} \big\{ G(\mathbf{r}, t | \mathbf{r}_0, t_0) \big\}, \tag{2.13} \]
where the backward Fokker–Planck operator is
\[ L_{\mathrm{FPB}} := \sum_{i,j=1}^{M} D^{ij}(\mathbf{r}_0, t_0, \tau)\, \nabla_i^0 \nabla_j^0 + \sum_{i=1}^{M} v^i(\mathbf{r}_0, t_0, \tau)\, \nabla_i^0. \tag{2.14} \]
We note that the backward Fokker–Planck operator acts on the second spatial argument of the Green function $G(\mathbf{r}, t | \mathbf{r}_0, t_0)$. This Fokker–Planck equation should be supplemented with the initial condition and the boundary condition. By construction, at the initial time $t_0$ the system was located at the internal point $\mathbf{r}_0$, so the initial condition just writes the Green function in the form of the Dirac δ-function
\[ G(\mathbf{r}, t | \mathbf{r}_0, t_0) \big|_{t = t_0} = \delta(\mathbf{r} - \mathbf{r}_0). \tag{2.15} \]
The boundary condition interrelates the values of the Green function and its derivatives at the internal points adjacent to the domain boundary $\Upsilon$, that is, values obtained by the continuation $\mathbf{r}_0 \to \mathbf{s}$ from some internal point $\mathbf{r}_0 \in Q^+$ to a boundary point $\mathbf{s} \in \Upsilon$.

2.2.2 Boundary Singularities
The direct implementation of the passage to the boundary points, however, causes a certain problem. Expansion (2.7) exhibits irregular behavior within the joint passage to the limits $\tau \to +0$ and $\mathbf{r}_0 \to \mathbf{s}$. When the former, $\tau \to +0$, precedes the latter, $\mathbf{r}_0 \to \mathbf{s}$, no boundary conditions are found at all. In the opposite order, i.e. when the passage $\mathbf{r}_0 \to \mathbf{s}$ is performed first, the kinetic coefficients (2.8)–(2.10) change the scaling type; now they vary with time $\tau$ as $\sqrt{\tau}$ at the leading order. The fact is that a path of a Markovian system is not smooth, and its characteristic variations on small time scales of order $\tau$ are proportional to $\sqrt{\tau}$. For the internal points of the domain $Q$ the path deviations in opposite directions are equiprobable within an accuracy of $\sqrt{\tau}$. As a result the coefficient $U^i(\mathbf{r}, t, \tau)$ becomes a linear function of the argument $\tau$. In some sense the given anomaly in the Markovian dynamics is hidden at the internal points and reflected only in the linear $\tau$-dependence of the second-order moments $L^{ij}(\mathbf{r}_0, t_0, \tau)$ of the Green function $G(\mathbf{r}, t | \mathbf{r}_0, t_0)$. The medium boundary $\Upsilon$ breaks down this symmetry because, in particular, it prevents the system from reaching the points on the opposite side. Since the system displacement does not essentially change in amplitude, the terms $U^i(\mathbf{r}, t, \tau)$ acquire a square-root dependence on the argument $\tau$. In a certain sense the medium boundary reveals this anomaly (Figure 2.2). The succeeding division of expansion (2.7) by $\tau$ gives rise to singularities of the type $\tau^{-1/2}$, which will be referred to as boundary singularities.

Figure 2.2 The effect of the boundary impermeability on the Markovian system motion. Schematic illustration. Panels: symmetry of random walks at internal points; symmetry breakdown of random walks near the medium boundary $\Upsilon$.

The medium boundary can affect the system dynamics in a more complex way; here, however, we currently confine our speculations only to the effect of its impermeability. The boundary $\Upsilon$ restricts the system motion only in the normal direction. It is quite natural to expect that the boundary singularities quantified in terms of diverging components $U^i(\mathbf{s}, t, \tau)/\tau$ will form a vector object $\mathbf{b}$ that is determined by the mutual effect of two factors. The former is the spatial orientation of the medium boundary $\Upsilon$ described by its unit normal $\mathbf{n}$. The latter is the spatial pattern of random Langevin forces governing the stochastic motion of the given system and is characterized by the diffusion tensor $D^{ij}(\mathbf{r}, t)$. Within a scalar cofactor we have only one possibility to construct the vector $\mathbf{b} = \{b^i\}$ using the two objects,
\[ b^i = \sum_{j=1}^{M} D^{ij} n_j \tag{2.16} \]
or, in the vector form
\[ \mathbf{b} = \mathsf{D} \cdot \mathbf{n}. \tag{2.17} \]
The validity of this construction will be justified in this chapter and $\mathbf{b}$ will be referred to as the vector of boundary singularities. To be rigorous it should be noted that, in the general case, the correct expression for the vector of boundary singularities should use the operator $D^i_{\ j}$ obtained from the diffusion tensor $D^{ij}$ by lowering one of its indices, namely, $b^i = \sum_j D^i_{\ j}\, n^j$ (these details are discussed in Section 2.4.1). However, when dealing with orthonormal bases, as is the case at the initial stage of the current consideration, the tensors $D^{ij}$ and $D^i_{\ j}$ coincide with each other in the component magnitudes. So, in order not to overload the reader's perception and the mathematical constructions, expressions similar to (2.16) will be used where appropriate.
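The change of scaling of the first moment near an impermeable boundary can be seen in a toy model. The sketch below assumes one-dimensional diffusion with constant D on the half-line x ≥ 0 and a reflecting wall at x = 0, for which the method of images gives the propagator explicitly. This model and all names in the code are illustrative assumptions, not the general construction of the chapter.

```python
import math


def reflected_kernel(x, tau, x0, D=0.5):
    """Method-of-images propagator on x >= 0 with a reflecting wall at x = 0 (assumed model)."""
    var = 2.0 * D * tau
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    direct = math.exp(-((x - x0) ** 2) / (2.0 * var))
    image = math.exp(-((x + x0) ** 2) / (2.0 * var))
    return norm * (direct + image)


def mean_displacement(tau, x0, D=0.5, n=20000):
    """First moment U(x0, tau) = int_0^inf (x - x0) G dx, evaluated by the midpoint rule."""
    upper = x0 + 12.0 * math.sqrt(2.0 * D * tau)  # beyond this point the kernel is negligible
    dx = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * dx
        total += (x - x0) * reflected_kernel(x, tau, x0, D)
    return total * dx


if __name__ == "__main__":
    for tau in (1e-2, 1e-3, 1e-4):
        at_wall = mean_displacement(tau, 0.0)
        inside = mean_displacement(tau, 1.0)
        # At the wall U / sqrt(tau) stays constant; at the internal point U is negligible.
        print(tau, at_wall / math.sqrt(tau), inside)
```

In this model the first moment at the wall equals √(4Dτ/π), so U/τ diverges as τ^{-1/2}, whereas at an internal point without drift it is negligible for small τ.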
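The notion of the boundary singularity vector immediately enables us to write the desired boundary condition when the medium boundary just confines the system motion. In this case the first and third terms on the right-hand side of expansion (2.7) are absent and the singularity caused by the sequence of transitions $\mathbf{r}_0 \to \mathbf{s}$ and then $\tau \to 0$ takes the form
\[ \frac{1}{\sqrt{\tau}} \sum_{i=1}^{M} b^i(\mathbf{s})\, \nabla_i^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0) \Big|_{\mathbf{r}_0 \to \mathbf{s}} \;\to\; \frac{1}{\sqrt{\tau}} \sum_{i,j=1}^{M} D^{ij}(\mathbf{s}, t_0)\, n_j(\mathbf{s})\, \nabla_i^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0) \Big|_{\mathbf{r}_0 \to \mathbf{s}}. \]
Naturally, for the internal point $\mathbf{r}$ the Green function $G(\mathbf{r}, t | \mathbf{r}_0, t_0)$ cannot exhibit any singularity. Therefore the cofactor of the singularity $\tau^{-1/2}$ must be equal to zero, that is
\[ \sum_{i,j=1}^{M} D^{ij}(\mathbf{s}, t_0)\, n_j(\mathbf{s})\, \nabla_i^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0) \Big|_{\mathbf{r}_0 \to \mathbf{s}} = 0. \]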
This is the well-known expression for the boundary condition of the backward Fokker–Planck equation, which is typically obtained in another way, treating the Green function as the concentration of particles spreading over the medium from the point $\mathbf{r}_0$ (see, e.g. [55]). This chapter is devoted to deriving the boundary conditions for the Fokker–Planck equation based on the notion of the boundary singularities. A more general situation will be studied and, in particular, these qualitative speculations will also be justified. The present analysis has demonstrated that the boundary condition for the backward Fokker–Planck equation stems from the vanishing of the boundary singularity terms in expansion (2.7), that is, for $\mathbf{r}_0$ in a thin layer adjacent to $\Upsilon$,
\[ - {}^{*}\!R(\mathbf{r}_0, t_0, \tau)\, G(\mathbf{r}, t | \mathbf{r}_0, t_0) + \sum_{i=1}^{M} {}^{*}U^i(\mathbf{r}_0, t_0, \tau)\, \nabla_i^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0) + \sum_{i,j=1}^{M} {}^{*}L^{ij}(\mathbf{r}_0, t_0, \tau)\, \nabla_i^0 \nabla_j^0 G(\mathbf{r}, t | \mathbf{r}_0, t_0) = 0, \tag{2.18} \]
where the symbol $*$ labels the components of the corresponding kinetic coefficients scaling as $\tau^{1/2}$. It should be pointed out that in (2.18) the argument $\mathbf{r}_0$ is an arbitrary point of a thin layer $\Upsilon_\tau$ adjacent to the boundary $\Upsilon$. When $\tau \to 0$ its thickness also tends to zero (as $\tau^{1/2}$). However, before passing to the limit $\tau \to 0$ the layer $\Upsilon_\tau$ remains volumetric. Now let us discuss similar problems with respect to the forward Fokker–Planck equation, which matches the other possible passage to the limit in the Chapman–Kolmogorov equation (2.3).
2.2.3 The Forward Fokker–Planck Equation
The Chapman–Kolmogorov equation (2.3) also allows for the limit where the intermediate point tends to the terminal one, i.e. $t_* = t - \tau$ with $\tau \to +0$. In this case the former cofactor $G(\mathbf{r}, t | \mathbf{r}_*, t - \tau)$ on the right-hand side of (2.3) exhibits strong variations on small spatial scales, whereas the latter one, $G(\mathbf{r}_*, t - \tau | \mathbf{r}_0, t_0)$, becomes a smooth function of the argument $\mathbf{r}_*$. Now, however, directly applying an expansion similar to that used in deriving the backward Fokker–Planck equation is not appropriate. The fact is that in this way the integration runs over the initial point $\mathbf{r}_*$ of the Green function $G(\mathbf{r}, t | \mathbf{r}_*, t - \tau)$ and the coefficients appearing, similar to quantities (2.8)–(2.10), have another meaning. In particular, an integral similar to (2.4) can essentially deviate from unity. To overcome this problem the Pontryagin technique is applied [197]. It is quite similar to the Kramers–Moyal approach (see, e.g. [193]) but is more suitable for tackling the boundary singularity. As the first step, let us consider some arbitrary smooth function $\phi(\mathbf{r})$ determined in the domain $Q$ and integrate both sides of the Chapman–Kolmogorov equation (2.3) with the weight $\phi(\mathbf{r})$. In this way we get
\[ \iint_Q d\mathbf{r}\, \phi(\mathbf{r})\, G(\mathbf{r}, t_* + \tau | \mathbf{r}_0, t_0) = \iint_Q \iint_Q d\mathbf{r}\, d\mathbf{r}_*\, \phi(\mathbf{r})\, G(\mathbf{r}, t_* + \tau | \mathbf{r}_*, t_*)\, G(\mathbf{r}_*, t_* | \mathbf{r}_0, t_0). \tag{2.19} \]
For a rather small time scale $\tau$ the Green function $G(\mathbf{r}, t_* + \tau | \mathbf{r}_*, t_*)$ is located practically within some small neighborhood of the point $\mathbf{r}_*$. In this way the function $\phi(\mathbf{r})$ can be expanded in the Taylor series near the point $\mathbf{r}_*$ with respect to the variable $\mathbf{R} = \mathbf{r} - \mathbf{r}_*$
\[ \phi(\mathbf{r}) = \phi(\mathbf{r}_*) + \sum_{i=1}^{M} R^i\, \nabla_i^* \phi(\mathbf{r}_*) + \frac{1}{2} \sum_{i,j=1}^{M} R^i R^j\, \nabla_i^* \nabla_j^* \phi(\mathbf{r}_*). \tag{2.20} \]
Also, since the Green function $G(\mathbf{r}, t_* + \tau | \mathbf{r}_0, t_0)$ depends smoothly on $\tau$, the expansion
\[ G(\mathbf{r}, t_* + \tau | \mathbf{r}_0, t_0) = G(\mathbf{r}, t_* | \mathbf{r}_0, t_0) + \tau\, \frac{\partial G(\mathbf{r}, t_* | \mathbf{r}_0, t_0)}{\partial t_*} \tag{2.21} \]
is also justified for a small value of $\tau$. Then the substitution of the last two expressions into (2.19) with succeeding integration over $\mathbf{R}$ and the replacement of the dummy variable $\mathbf{r}_*$ by $\mathbf{r}$ as well as $t_*$
2.2 From Chapman–Kolmogorov Equation to Fokker–Planck Description ϒ
Layer of ϒt boundary singularities
n δ∝ τ
+
φ(x)
x
0
φ(x)
x
0
Figure 2.3 Structure of integral (2.22) and division of the region Q into the layer of boundary singularities and internal points with regular behavior of the kinetic coefficients.
by t, yields
∂G(r, t|r0 , t0 ) = drφ(r) τ ∂t Q
+
Q
dr φ(r) −R(r, t, τ) G(r, t|r0 , t0 )
M i=1
+
M
! ∇i φ(r) Ui (r, t, τ) G(r, t|r0 , t0 ) " ! ∇i ∇j φ(r) L (r, t, τ) G(r, t|r0 , t0 ) . ij
i,j=1
(2.22) Here the coefficients R(r, t, τ), Ui (r, t, τ) and Lij (r, t, τ) again exhibit anomalous behavior within a narrow layer ϒτ adjacent to the medium boundary ϒ (Figure 2.3). As should be expected and in accordance with results to be obtained, the thickness of this layer scales with time τ as τ1/2 . These coefficients themselves also scale as τ1/2 . As a result the corresponding part of integral (2.22) scales as τ. So, after dividing both sides of (2.22) by τ with the following passage to the limit τ → 0, the contribution to (2.22) caused by integration over this layer remains finite. Therefore, to analyze the properties of the integral relation (2.22) the domain Q is split into this layer of boundary singularities and an internal part. After passage to the limit τ → 0 this division allows us to treat the boundary ϒ and the internal points Q+ individually. Keeping the aforementioned in mind, the integral expression (2.22) is represented as a sum of two terms, the integral over the layer ϒτ denoted by the formal symbol of surface integral and the integral over the internal part Q+ of the domain Q
$$\int_Q d\mathbf{r}\,\ldots = \oint_{\Upsilon_\tau} d\mathbf{r}\,\ldots + \int_{Q_+} d\mathbf{r}\,\ldots \tag{2.23}$$
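Let us consider the second term first. Inside the region Q+ the kinetic coefficients Ui (r, t, τ) and Lij (r, t, τ) behave in a regular way, i.e. they scale as τ according to (2.11) and (2.12), whereas the term R(r, t, τ) vanishes. So, dividing the corresponding part of the integral relation (2.22) by τ and passing to the limit τ → 0 we have

$$\int_{Q_+} d\mathbf{r}\,\phi(\mathbf{r})\,\frac{\partial G(\mathbf{r}, t|\mathbf{r}_0, t_0)}{\partial t} = \int_{Q_+} d\mathbf{r}\,\Bigg\{ \sum_{i=1}^{M} \nabla_i\phi(\mathbf{r})\,\big[v^i(\mathbf{r}, t, \tau)\,G(\mathbf{r}, t|\mathbf{r}_0, t_0)\big] + \sum_{i,j=1}^{M} \nabla_i\nabla_j\phi(\mathbf{r})\,\big[D^{ij}(\mathbf{r}, t, \tau)\,G(\mathbf{r}, t|\mathbf{r}_0, t_0)\big] \Bigg\}. \tag{2.24}$$

Using the Gauss divergence theorem this integral in turn is split into surface and volume parts

$$\int_{Q_+} d\mathbf{r}\,\ldots = \oint_{\Upsilon} d\mathbf{s}\,\ldots + \int_{Q_+} d\mathbf{r}\,\ldots\,. \tag{2.25}$$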
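The volume integral has the form

$$\int_{Q_+} d\mathbf{r}\,\phi(\mathbf{r})\,\frac{\partial G(\mathbf{r}, t|\mathbf{r}_0, t_0)}{\partial t} = \int_{Q_+} d\mathbf{r}\,\phi(\mathbf{r})\,\Bigg\{ -\sum_{i=1}^{M} \nabla_i\big[v^i(\mathbf{r}, t, \tau)\,G(\mathbf{r}, t|\mathbf{r}_0, t_0)\big] + \sum_{i,j=1}^{M} \nabla_i\nabla_j\big[D^{ij}(\mathbf{r}, t, \tau)\,G(\mathbf{r}, t|\mathbf{r}_0, t_0)\big] \Bigg\}. \tag{2.26}$$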
The latter equality immediately gives rise to the forward Fokker–Planck equation. Indeed, at this stage φ(r) is an arbitrary smooth function and no additional constraint is imposed on it at the internal points of the domain Q. So, applying local variations of φ(r) at an arbitrary internal point r (Figure 2.3), we see that the left- and right-hand sides of (2.26) must be equal to each other at every point r ∈ Q+ individually. This gives the forward Fokker–Planck equation

$$\frac{\partial G(\mathbf{r}, t|\mathbf{r}_0, t_0)}{\partial t} = \hat{L}_{\mathrm{FPF}}\,G(\mathbf{r}, t|\mathbf{r}_0, t_0) \tag{2.27}$$

with the forward Fokker–Planck operator

$$\hat{L}_{\mathrm{FPF}}\{\diamond\} := \sum_{i=1}^{M} \nabla_i \Bigg\{ \sum_{j=1}^{M} \nabla_j\big[D^{ij}(\mathbf{r}, t, \tau)\,\diamond\big] - v^i(\mathbf{r}, t, \tau)\,\diamond \Bigg\}. \tag{2.28}$$
Here the symbol ♦ stands for a function on which the operator acts. It should be also pointed out that the Fokker–Planck operator acts on the first spatial argument of the Green function.
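The forward Fokker–Planck equation can also be written in the conservation form

$$\frac{\partial G(\mathbf{r}, t|\mathbf{r}_0, t_0)}{\partial t} + \sum_{i=1}^{M} \nabla_i \hat{J}^i\{G(\mathbf{r}, t|\mathbf{r}_0, t_0)\} = 0, \tag{2.29}$$

with the probability flux operator $\hat{\mathbf{J}} = \{\hat{J}^i\}_{i=1}^{M}$

$$\hat{J}^i\{\diamond\} := -\sum_{j=1}^{M} \nabla_j\big[D^{ij}(\mathbf{r}, t, \tau)\,\diamond\big] + v^i(\mathbf{r}, t, \tau)\,\diamond\,. \tag{2.30}$$

The forward Fokker–Planck equation is naturally supplemented by the same initial condition (2.15).

2.2.4 Boundary Relations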
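Splits (2.23) and (2.25) give rise to two additional terms. The former one is related to the first split and is the integral over the layer ϒτ of boundary singularities

$$\oint_{\Upsilon_\tau} d\mathbf{r}\,G(\mathbf{s}, t|\mathbf{r}_0, t_0)\,\Bigg\{ \sum_{i=1}^{M} \nabla_i\phi(\mathbf{s})\;{}^{*}U^i(\mathbf{r}, t, \tau) - \phi(\mathbf{s})\,R(\mathbf{r}, t, \tau) + \sum_{i,j=1}^{M} \nabla_i\nabla_j\phi(\mathbf{s})\;{}^{*}L^{ij}(\mathbf{r}, t, \tau) \Bigg\}. \tag{2.31}$$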
Here the use of the symbols dr and r in the singular components of the kinetic coefficients indicates that, before passing to the limit τ → 0, we should consider the layer ϒτ to be volumetric. The Green function G(r, t|r0 , t0 ) as well as the test function φ(r) and its derivatives exhibit minor variations across the layer ϒτ, so their argument r has been replaced by the corresponding nearest point s lying on the boundary ϒ. The latter term is related to the part of expression (2.24) that remains after integration using the divergence theorem. It can be written in the form

$$\oint_{\Upsilon} d\mathbf{s} \sum_{i,j=1}^{M} \nabla_j\phi(\mathbf{s})\,n_i(\mathbf{s})\,\big[D^{ij}(\mathbf{s}, t, \tau)\,G(\mathbf{s}, t|\mathbf{r}_0, t_0)\big] = -\oint_{\Upsilon} d\mathbf{s}\,\phi(\mathbf{s}) \sum_{i=1}^{M} n_i(\mathbf{s})\,\hat{J}^i\{G(\mathbf{s}, t|\mathbf{r}_0, t_0)\}, \tag{2.32}$$
where n(s) = {ni (s)} is the unit normal to the boundary ϒ at point s directed into the domain Q. Leaping ahead, we note that the appropriate choice of the boundary values of the test function φ(s) and its derivatives has to fulfill equality (2.31). Then at the next step it gives rise to the required boundary condition for the forward Fokker–Planck equation. Let us demonstrate this for the impermeable boundary using the notion of the boundary singularity vector b. Namely, we again assume that, for an internal point r located in the vicinity of a boundary point s, i.e. r ≈ s,

$$U^i(\mathbf{r}, t, \tau) \propto b^i(\mathbf{s}) = \sum_{j} D^{ij}(\mathbf{s}, t)\,n_j(\mathbf{s}). \tag{2.33}$$

In this case only the first term in equality (2.31) remains and it is fulfilled when

$$\sum_{i,j=1}^{M} D^{ij}(\mathbf{s}, t)\,n_j(\mathbf{s})\,\nabla_i\phi(\mathbf{s}) = 0. \tag{2.34}$$

Equality (2.34) just relates the boundary values of the test function φ(s) with its derivative along the boundary normal n(s). So for an arbitrary smooth function φϒ (s) determined at the boundary ϒ it is possible to construct the appropriate function φ(r) determined in the domain Q and meeting equality (2.34) (see Figure 2.3). So in the given case the left-hand side and, thus, the right-hand side of expression (2.32) become zero. Since the integral on the right-hand side of (2.32) contains an arbitrary function φ(s) determined at the boundary ϒ, the equality

$$\sum_{i=1}^{M} n_i(\mathbf{s})\,\hat{J}^i\{G(\mathbf{s}, t|\mathbf{r}_0, t_0)\} = 0 \tag{2.35}$$
holds for every point of the boundary ϒ individually. Expression (2.35) means that the probability flux in the direction normal to the boundary ϒ is equal to zero, which actually reflects its impermeability. However, to derive the boundary conditions for the Fokker–Planck equations, more sophisticated constructions are necessary. Also, in order to take into account other possible properties of the medium boundary its model must be specified.
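The role of the conservation form (2.29) together with the zero-flux condition (2.35) can be illustrated numerically. The following is only a minimal sketch, assuming a one-dimensional interval with constant drift v and diffusion coefficient D; the finite-volume discretization, the function names and all parameter values are illustrative assumptions and are not taken from the text. Setting the flux to zero on the two boundary faces is the discrete counterpart of (2.35), and the total probability then stays constant up to rounding.

```python
def fokker_planck_step(p, v, D, dx, dt):
    """Advance the cell-averaged density p by one explicit step of
    dp/dt + dJ/dx = 0 with J = v*p - D*dp/dx (constant v and D)."""
    n = len(p)
    # flux[k] is the probability flux through the face between cells k-1 and k.
    # flux[0] and flux[n] are the boundary faces; leaving them at zero
    # enforces the impermeable-boundary condition J = 0.
    flux = [0.0] * (n + 1)
    for k in range(1, n):
        p_face = 0.5 * (p[k - 1] + p[k])      # centered estimate of p on the face
        dp_dx = (p[k] - p[k - 1]) / dx
        flux[k] = v * p_face - D * dp_dx
    return [p[k] - dt * (flux[k + 1] - flux[k]) / dx for k in range(n)]


def total_probability(p, dx):
    """Integral of the density over the whole interval."""
    return sum(p) * dx


# Illustrative parameters (assumptions, not from the text).
n, dx = 50, 0.02
v, D = -0.5, 0.1        # negative drift pushes the walker towards the wall at x = 0
dt = 0.001              # respects the explicit stability bound dt <= dx**2 / (2*D)

p = [0.0] * n
p[n // 2] = 1.0 / dx    # unit probability concentrated in a single cell

for _ in range(2000):
    p = fokker_planck_step(p, v, D, dx, dt)

print("total probability:", total_probability(p, dx))
```

Because every interior face flux is added to one cell and subtracted from its neighbor, the sum over cells telescopes to the two boundary fluxes, which is exactly the mechanism behind the statement that (2.35) expresses impermeability.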
2.3 Different Types of Boundaries
In this section, to be specific, we consider three typical examples of medium boundaries. They are (a) the impermeable boundary, (b) the boundary absorbing particles, and (c) the boundary with a thin adjacent layer characterized by extremely high values of the kinetic coefficients, the fast diffusion boundary (see Figure 2.4). The first type matches a medium whose boundary properties are similar to its bulk properties; the boundary points differ from the internal ones only by the absence of medium points on one side. As a result a random walker hopping over the medium points just cannot pass through the boundary. The second type is similar to the first one except for the fact that the walker can be trapped at the boundary and will not return again to the medium. In this case the corresponding boundary conditions are typically used in describing the first passage time problem or diffusion in solids with fixed boundary values of impurity concentration Cs (see, e.g. [55]). Generally the boundary absorption is described by the rate σCs , where σ is a certain kinetic coefficient. The third type of boundary is very common, for example, in polycrystals or nanoparticle agglomerates. The grain boundaries contain a huge number of defects and as a result the diffusion coefficient inside the grain boundaries can exceed its value in the crystal bulk by many orders of magnitude. Therefore, impurity propagation in polycrystals is governed mainly by grain boundary diffusion (for a review see [20] and references therein). In terms of random walks the effect of the fast diffusion layer is reduced to extremely long spatial jumps made by a walker inside it. It is natural to characterize such a boundary layer by its thickness λ, of about the atomic spacing, and the ratio (≫ 1) of the diffusion coefficients inside the boundary layer and in the regular crystal lattice.

Figure 2.4 Three types of boundaries under consideration: (a) impermeable boundary, (b) absorbing boundary, (c) boundary with fast diffusion layer of thickness λ.
2.4 Equivalent Lattice Representation of Random Walks Near the Boundary
The derivation of both the forward and backward Fokker–Planck equations requires the calculation of three quantities R(r, t, τ), Ui (r, t, τ) and Lij (r, t, τ) specified by expressions (2.8)–(2.10). They are the moments of the system displacement R during the time τ treated as an arbitrarily small value. In order to obtain the desired boundary conditions these quantities should be found in the vicinity of the medium boundary ϒ or, more precisely, in its neighborhood ϒτ of thickness about (Dτ)^{1/2}, where D is the characteristic value of the diffusion tensor components. To study the boundary effects it suffices to consider quite a small region wherein the medium and its boundary have practically homogeneous properties and, in addition, the boundary geometry is approximated well by some hyperplane. In this region the system motion will be imitated by random walks on a lattice constructed as follows.
First, the elementary steps of the random walks on it are characterized by a time τa such that

$$\tau_a \ll \tau \tag{2.36}$$

and the arrangement of the lattice nodes, i.e. their spacings {ai } and the spatial orientation, should again give us the same diffusion tensor D as well as the drift field v for the internal points on time scales τa ≪ t ≲ τ. The individual hops of a random walker between the neighboring nodes actually represent a collection of mutually independent Langevin forces governing the random system motion in the given continuum. Second, the boundary ϒ is represented as a layer of nodes ϒ0 between which the walker can migrate via elementary hops. In other words, the aforementioned collection of mutually independent Langevin forces has to contain components acting along the boundary ϒ and one component moving the walker towards or away from ϒ. Other characteristics of this effective lattice may be chosen for the sake of convenience. At the final stage we should pass to the limit τa → 0, returning to the continuous description.

2.4.1 Diffusion Tensor Representations
In order to construct the required lattice let us consider Markovian random walks {r(t)} in the M-dimensional Euclidean half-space $\mathbb{R}^M_+$ made of vectors r = {x^1 , x^2 , . . . , x^M } such that

$$\mathbf{r}\cdot\mathbf{n} := \sum_{i=1}^{M} x^i n_i \ge 0,$$

where n = {n_1 , n_2 , . . . , n_M } is a certain unit vector. The boundary of $\mathbb{R}^M_+$, that is, the hyperplane ϒ = {r · n = 0} perpendicular to the vector n is, in its turn, the Euclidean space $\mathbb{R}^{M-1}$ of dimension M − 1. The half-space $\mathbb{R}^M_+$ and, correspondingly, the hyperplane ϒ are assumed to be homogeneous. The latter means that the local properties of the random walks under consideration have to be independent of position in space; naturally the boundary and internal points are not equivalent. In particular, the diffusion tensor D and drift vector v are the same at all the internal points of the half-space $\mathbb{R}^M_+$. In this case the components of the drift vector and diffusion tensor are determined by the expressions (cf. (2.8)–(2.10))

$$v^i = \frac{1}{\tau}\,\big\langle \delta X^i(t, \tau) \big\rangle, \tag{2.37}$$

$$D^{ij} = \frac{1}{2\tau}\,\Big\langle \big[\delta X^i(t, \tau) - v^i\tau\big]\big[\delta X^j(t, \tau) - v^j\tau\big] \Big\rangle. \tag{2.38}$$
Here the random variable δX^i (t, τ) := x^i (t + τ) − x^i (t) and r = {x^i } is an arbitrary internal point. The observation time interval τ should be chosen to be small enough so that the length scale (Dτ)^{1/2} is much less than the distance between the point r and the boundary ϒ, i.e. Dτ ≪ (r · n)^2, and the angular brackets ⟨. . .⟩ stand for the average over all the random trajectories passing through the point r at time t. It should be noted that, due to the space homogeneity, the passage to the limit τ → 0 can be omitted, whereas it is necessary in the general case. In the following, nonorthogonal bases will be used. So, keeping in mind the tensor notation (see, e.g. [138]), the upper and lower indices will be distinguished. In these terms {x^i }, or just x^i , is a vector, whereas the collection of the basis vectors e_i is a covector. According to definitions (2.38) and (2.37) the objects D^{ij} and v^i are contravariant tensors. In addition, if the basis e has the form e = eϒ ⊕ e_M , where eϒ is the basis of the hyperplane ϒ and the vector e_M does not lie within it, then Greek letters will label the tensor indices corresponding to the hyperplane ϒ to make this fact easier to see. In order to deal with the diffusion tensor in a nonorthogonal basis e = {e_i } the metric tensor is also necessary. It is defined as

$$g_{ij} := (\mathbf{e}_i \cdot \mathbf{e}_j) \tag{2.39}$$

and is the kernel of the scalar product of two vectors r and r′, namely,

$$(\mathbf{r}\cdot\mathbf{r}') := \sum_{i,j=1}^{M} g_{ij}\,x^i x'^j. \tag{2.40}$$
For an orthonormal basis the metric tensor g_{ij} = δ_{ij} , where δ_{ij} is the Kronecker delta. The metric tensor g_{ij} defines the conversion of contravariant tensors into covariant ones, in particular,

$$D^{i\,\cdot}_{\cdot\, j} = \sum_{k=1}^{M} D^{ik} g_{kj}, \qquad D^{\cdot\, j}_{i\,\cdot} = \sum_{k=1}^{M} g_{ik} D^{kj}, \tag{2.41}$$

as well as

$$D_{ij} = \sum_{k,p=1}^{M} g_{ik}\,g_{jp}\,D^{kp}. \tag{2.42}$$

Because the diffusion tensor D^{ij} as well as the metric tensor g_{ij} are symmetric, the tensor D_{ij} is also symmetric, whereas the two mixed tensors in (2.41) are identical and are therefore denoted further as D^i_j . The tensor D^i_j can be regarded as a certain operator $\hat{D}$ in the space $\mathbb{R}^M$ and the tensor D_{ij} specifies a quadratic form

$$\mathbf{r}\cdot\hat{D}\mathbf{r} = \sum_{i,j,k=1}^{M} g_{ij}\,x^i D^j_k\,x^k = \sum_{i,j=1}^{M} D_{ij}\,x^i x^j. \tag{2.43}$$
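The quadratic form (2.43) is positive-definite. To demonstrate this, a random variable

$$\delta L = \sum_{p=1}^{M} \big[\delta X^p(t, \tau) - v^p\tau\big](\mathbf{e}_p\cdot\boldsymbol{\ell}) = \sum_{p,i=1}^{M} \big[\delta X^p(t, \tau) - v^p\tau\big]\,g_{pi}\,\ell^i$$

is considered, where $\boldsymbol{\ell} = \sum_{i=1}^{M} \mathbf{e}_i \ell^i$ is an arbitrary vector in the space $\mathbb{R}^M$ and the metric tensor definition (2.39) has been taken into account. From this we have a chain of equalities

$$0 < \big\langle [\delta L]^2 \big\rangle = \sum_{p,p',i,i'=1}^{M} g_{pi}\,g_{p'i'}\,\ell^i \ell^{i'}\,\Big\langle \big[\delta X^p(t, \tau) - v^p\tau\big]\big[\delta X^{p'}(t, \tau) - v^{p'}\tau\big] \Big\rangle = \sum_{p,p',i,i'=1}^{M} 2\tau\,D^{pp'} g_{pi}\,g_{p'i'}\,\ell^i \ell^{i'} = \sum_{i,i'=1}^{M} 2\tau\,D_{ii'}\,\ell^i \ell^{i'} = \sum_{i,i'=1}^{M} 2\tau\,D^{ii'}\,\ell_i \ell_{i'}.$$

So, for an arbitrary vector ℓ^i and covector ℓ_i the inequalities

$$\sum_{i,j=1}^{M} D_{ij}\,\ell^i \ell^j > 0, \qquad \sum_{i,j=1}^{M} D^{ij}\,\ell_i \ell_j > 0 \tag{2.44}$$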
hold. The covector and vector representations of the same object are related as $\ell_i = \sum_{j=1}^{M} g_{ij}\ell^j$; within orthonormal bases they are identical. Since the tensor D_{ij} is symmetric and the quadratic form (2.43) is positive definite, all the eigenvalues of the operator $\hat{D}$ are real positive quantities and its eigenvectors form a basis in the space $\mathbb{R}^M$ which can be chosen to be orthonormal (see, e.g., [53]). In this basis the diffusion tensor takes the diagonal form. Therefore, the corresponding eigenvectors and eigenvalues specify the directions and intensity of the mutually independent Langevin forces governing random walks in the medium under consideration. Unfortunately, in the general case where all the eigenvalues are nondegenerate, this basis is unique, so it cannot be used in constructing the desired lattice in the vicinity of the medium boundary ϒ because one could find a situation where none of the basis vectors is parallel to the hyperplane ϒ. In order to overcome this problem we will construct a special nonorthogonal basis relying on the following statement.

Proposition 2.1 Let $\mathbb{R}^M_+$ = {r · n > 0} be a homogeneous half-space bounded by the hyperplane ϒ = {r · n = 0} and e = {e_1 , e_2 , . . . , e_M } be a fixed arbitrary basis of $\mathbb{R}^M$. In this basis the components of the diffusion tensor {D^{ij} } as well as the metric tensor {g_{ij} } are given. Then there is a basis b = bϒ ⊕ b_M with the following properties. First, it is composed of a certain orthonormal basis bϒ of the hyperplane ϒ and a unit vector b_M not belonging to ϒ that is determined by the expression

$$\mathbf{b}_M = \frac{1}{\omega} \sum_{i,j=1}^{M} \mathbf{e}_i\,D^{ij} n_j. \tag{2.45}$$

Here, according to the construction of the half-space $\mathbb{R}^M_+$, n = {n_1 , n_2 , . . . , n_M } is the unit vector normal to the hyperplane ϒ and the normalization factor

$$\omega = \Bigg[ \sum_{i,j,p,k=1}^{M} g_{ij}\,D^i_p\,D^j_k\,n^p n^k \Bigg]^{1/2}. \tag{2.46}$$

Second, in the basis b the diffusion tensor takes the diagonal form

$$D^{ij} = \begin{pmatrix} D_1 & 0 & \cdots & 0 \\ 0 & D_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & D_M \end{pmatrix}, \tag{2.47}$$

where all its diagonal components are positive quantities, {D_i > 0}, with the value D_M being given by the expression

$$D_M = \omega^2 \Bigg[ \sum_{i,j=1}^{M} D^{ij} n_i n_j \Bigg]^{-1}. \tag{2.48}$$
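Third, let, in addition, the initial basis be of the form e = eϒ ⊕ n, where $e_\Upsilon = \{\mathbf{e}_\alpha\}_{1}^{M-1}$ is a certain basis of the hyperplane ϒ, and let $\hat{U}_\Upsilon = \|u^\alpha_\beta\|$ be the transformation of the hyperplane ϒ mapping the basis bϒ onto the basis eϒ , that is, bϒ → eϒ . By mapping b_M → n the transformation $\hat{U}_\Upsilon$ is complemented to a certain transformation $\hat{U}$ of the whole space $\mathbb{R}^M$, namely, if r is an arbitrary vector of the space $\mathbb{R}^M$ with the coordinates specified by its expansion over the bases e and b:

$$\mathbf{r} = \sum_{\gamma=1}^{M-1} \mathbf{e}_\gamma x^\gamma + \mathbf{n}\,x^M \equiv \sum_{\gamma=1}^{M-1} \mathbf{b}_\gamma \zeta^\gamma + \mathbf{b}_M \zeta^M, \tag{2.49}$$

then its coordinates are related by the expressions

$$\zeta^\alpha = \sum_{\gamma=1}^{M-1} u^\alpha_\gamma \Big[ x^\gamma - \frac{1}{D^{MM}}\,D^{\gamma M} x^M \Big], \tag{2.50a}$$

$$\zeta^M = \frac{\omega}{D^{MM}}\,x^M, \tag{2.50b}$$

and for the inverse transformation

$$x^\alpha = \sum_{\gamma=1}^{M-1} \breve{u}^\alpha_\gamma\,\zeta^\gamma + \frac{1}{\omega}\,D^{\alpha M} \zeta^M, \tag{2.50c}$$

$$x^M = \frac{D^{MM}}{\omega}\,\zeta^M. \tag{2.50d}$$

Here $\hat{U}_\Upsilon^{-1} = \|\breve{u}^\alpha_\beta\|$ is the operator inverse to the operator $\hat{U}_\Upsilon$, that is, meeting the identity $\sum_{\gamma=1}^{M-1} \breve{u}^\alpha_\gamma u^\gamma_\beta = \delta^\alpha_\beta$. Besides, the equality

$$\sum_{\gamma=1}^{M-1} \breve{u}^\alpha_\gamma\,\breve{u}^\beta_\gamma\,D_\gamma = D^{\alpha\beta} - \frac{1}{D^{MM}}\,D^{\alpha M} D^{\beta M} \tag{2.51}$$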
holds.

The proof of this proposition requires just formal mathematical manipulation. For the first step, the initial basis e of the half-space $\mathbb{R}^M_+$ is assumed to comprise a certain basis eϒ = {e_1 , e_2 , . . . , e_{M−1} } of the hyperplane ϒ and its unit normal n directed into $\mathbb{R}^M_+$, that is, e = eϒ ⊕ n. Then the results to be obtained will be represented in invariant form where appropriate, enabling us to write the general expressions. The diffusion tensor D^{ij} is assumed to be determined beforehand in the initial basis. As before, Greek letters will be used to label tensor indices corresponding to the hyperplane ϒ. Let us consider a new basis b = bϒ ⊕ b_M of the same structure except for the last vector b_M ; it need not be normal to the hyperplane ϒ. A one-to-one map between the two bases, e ⇔ b, determines a linear transformation $\hat{U}$ of the space $\mathbb{R}^M$ mapping, in particular, the hyperplane ϒ onto itself. This transformation $\hat{U} = \|U^i_j\|$ is specified by the relationship between the basis vectors

$$\mathbf{e}_\alpha = \sum_{\beta=1}^{M-1} \mathbf{b}_\beta\,u^\beta_\alpha; \qquad \mathbf{n} = \sum_{\alpha=1}^{M-1} \mathbf{b}_\alpha\,\omega^\alpha + \mathbf{b}_M\,\omega^M. \tag{2.52}$$

Here the tensor $u^\alpha_\beta$ represents an operator $\hat{U}_\Upsilon$ acting in the hyperplane ϒ whereas the tensor $\omega^\alpha$ (in ϒ) and the coefficient $\omega^M \neq 0$ complement it to the operator $\hat{U}$, namely,

$$U^\alpha_\beta = u^\alpha_\beta, \qquad U^\alpha_M = \omega^\alpha, \tag{2.53a}$$

$$U^M_\beta = 0, \qquad U^M_M = \omega^M. \tag{2.53b}$$
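According to the rule of tensor transformations (see, e.g. [138]) in the basis b the diffusion matrix has the components

$$\tilde{D}^{\alpha\beta} = \sum_{\gamma,\gamma'=1}^{M-1} u^\alpha_\gamma\,u^\beta_{\gamma'}\,D^{\gamma\gamma'} + \omega^\alpha\omega^\beta D^{MM} + \sum_{\gamma=1}^{M-1} \big[\omega^\alpha u^\beta_\gamma + \omega^\beta u^\alpha_\gamma\big] D^{\gamma M}, \tag{2.54a}$$

$$\tilde{D}^{\alpha M} = \omega^M \Bigg[ \sum_{\gamma=1}^{M-1} u^\alpha_\gamma\,D^{\gamma M} + \omega^\alpha D^{MM} \Bigg], \tag{2.54b}$$

$$\tilde{D}^{MM} = \big(\omega^M\big)^2 D^{MM}. \tag{2.54c}$$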
Correspondingly, an arbitrary vector x^i = {x^α , x^M } is converted as

$$\tilde{x}^\alpha = \sum_{\gamma=1}^{M-1} u^\alpha_\gamma\,x^\gamma + \omega^\alpha x^M, \tag{2.55a}$$

$$\tilde{x}^M = \omega^M x^M. \tag{2.55b}$$
Currently there are no restrictions imposed on the basis b (except for its general structure). Now let us choose a specific version of the tensor $\omega^\alpha$ that eliminates the off-diagonal elements $\tilde{D}^{\alpha M}$ of the diffusion tensor in the basis b. By virtue of (2.54b) it is

$$\omega^\alpha = -\frac{1}{D^{MM}} \sum_{\gamma=1}^{M-1} u^\alpha_\gamma\,D^{\gamma M}. \tag{2.56}$$
Here, division by D^{MM} is possible because, according to definition (2.38), the diagonal elements of the diffusion tensor are positive, in particular, D^{MM} > 0 except for the case where the system motion along the direction n is rigorously deterministic. However, setting D^{MM} → +0 the latter case can also be allowed for. The substitution of (2.56) into (2.54a) yields

$$\tilde{D}^{\alpha\beta} = \sum_{\gamma,\gamma'=1}^{M-1} u^\alpha_\gamma\,u^\beta_{\gamma'}\,\mathcal{D}^{\gamma\gamma'}, \tag{2.57}$$

where the object

$$\mathcal{D}^{\alpha\beta} = D^{\alpha\beta} - \frac{1}{D^{MM}}\,D^{\alpha M} D^{\beta M} \tag{2.58}$$
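is a tensor within the hyperplane ϒ because, up to now, the collections of vectors eϒ and bϒ are general bases of this hyperplane. The tensor $\mathcal{D}^{\alpha\beta}$ is symmetric and positive definite. The latter property stems directly from inequality (2.44) written for an arbitrary covector $\ell_\alpha$ of the hyperplane ϒ with the component

$$\ell_M = -\frac{1}{D^{MM}} \sum_{\gamma=1}^{M-1} D^{M\gamma}\,\ell_\gamma, \tag{2.59}$$

namely,

$$\sum_{i,j=1}^{M} D^{ij}\,\ell_i \ell_j = \sum_{\alpha,\beta=1}^{M-1} \mathcal{D}^{\alpha\beta}\,\ell_\alpha \ell_\beta > 0. \tag{2.60}$$

Therefore, the basis bϒ of the hyperplane ϒ can be chosen to be an orthonormal one wherein the tensor $\mathcal{D}^{\alpha\beta}$ takes the diagonal form with the diagonal components being positive values, so $\mathcal{D}^{\alpha\beta} = \mathcal{D}^\beta_\alpha = \mathcal{D}_{\alpha\beta} = D_\alpha\delta_{\alpha\beta}$ [53]. This basis bϒ is made up of the eigenvectors of the operator $\hat{\mathcal{D}} := \|\mathcal{D}^\beta_\alpha\|$ whose eigenvalues are {D_α }. For example, in the initial basis eϒ the tensor $\mathcal{D}^\beta_\alpha$ is related to the tensor $\mathcal{D}^{\alpha\beta}$ by the expression

$$\mathcal{D}^\beta_\alpha = \sum_{\gamma=1}^{M-1} g_{\alpha\gamma}\,\mathcal{D}^{\gamma\beta}, \qquad \text{where} \quad g_{\alpha\beta} := (\mathbf{e}_\alpha\cdot\mathbf{e}_\beta)$$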
is the metric tensor of the hyperplane ϒ. The choice of the given basis bϒ specifies the transformation matrix $u^\alpha_\beta$ which, together with expression (2.56), gives us the vector b_M and the corresponding component D_M of the diffusion tensor. Namely, first substituting (2.56) into the latter equality of (2.52) and taking into account the former one, we write

$$\mathbf{b}_M\,\omega^M = \mathbf{n} + \frac{1}{D^{MM}} \sum_{\alpha,\gamma=1}^{M-1} \mathbf{b}_\alpha\,u^\alpha_\gamma\,D^{\gamma M} = \mathbf{n} + \frac{1}{D^{MM}} \sum_{\gamma=1}^{M-1} \mathbf{e}_\gamma\,D^{\gamma M}. \tag{2.61}$$
In the invariant form this expression can be rewritten as

$$\mathbf{b}_M = \frac{1}{\omega} \sum_{i,j=1}^{M} \mathbf{e}_i\,D^{ij}\,(\mathbf{e}_j\cdot\mathbf{n}), \tag{2.62}$$

where the normalization factor ω

$$\omega = \Bigg[ \sum_{i,j,k,p=1}^{M} D^{ik} D^{jp}\,(\mathbf{e}_i\cdot\mathbf{e}_j)(\mathbf{e}_k\cdot\mathbf{n})(\mathbf{e}_p\cdot\mathbf{n}) \Bigg]^{1/2} \tag{2.63}$$

is due to the vector b_M being of unit length. Since the obtained expressions (2.62) and (2.63) are of the tensor form and are scalar in this sense (they do not contain free indices) they hold within any basis, thus proving (2.45) and (2.46). Second, according to (2.61) and (2.62) the coefficient $\omega^M = \omega/D^{MM}$. So expressions (2.54c) and (2.63) give us the diffusion tensor component D_M related to the vector b_M in the basis b

$$D_M = \omega^2 \Bigg[ \sum_{i,j=1}^{M} D^{ij}\,(\mathbf{e}_i\cdot\mathbf{n})(\mathbf{e}_j\cdot\mathbf{n}) \Bigg]^{-1} \tag{2.64}$$

and written in the invariant form. Formula (2.48) is therefore proved. In addition, expressions (2.55) and (2.56) together with the equality $\omega = \omega^M D^{MM}$ immediately lead to (2.50a) and (2.50b). Finally, we need the transformation $\hat{U}_\Upsilon^{-1} = \|\breve{u}^\alpha_\beta\|$ of the hyperplane ϒ that is inverse to the transformation $\hat{U}_\Upsilon = \|u^\alpha_\gamma\|$; its components obey the equality

$$\sum_{\gamma=1}^{M-1} \breve{u}^\alpha_\gamma\,u^\gamma_\beta = \delta^\alpha_\beta. \tag{2.65}$$
This inverse exists because the transformation $\hat{U}_\Upsilon$ is a one-to-one map of the bases eϒ and bϒ . Then the inversion of equalities (2.55) with $\omega^\alpha$ given by expression (2.56) yields (2.50c) and (2.50d). Inverting (2.57) and taking into account that the tensor $\tilde{D}^{\alpha\beta}$ has the diagonal form $D_\alpha\delta_{\alpha\beta}$ in the orthonormal basis bϒ , we directly obtain

$$\sum_{\gamma=1}^{M-1} \breve{u}^\alpha_\gamma\,\breve{u}^\beta_\gamma\,D_\gamma = \mathcal{D}^{\alpha\beta}$$

which, together with (2.58), gives rise to formula (2.51). The Proposition is thus proved.

The following comments on Proposition 2.1 should be made. First, it is worthwhile to note that the basis vector b_M constructed by (2.45) is actually the vector b of boundary singularities (2.16) normalized to unity. Second, for an initial basis e of the general form, (2.51) prompts us to introduce the surface diffusion tensor

$$\mathcal{D}^{ij} := D^{ij} - \frac{\displaystyle\sum_{p,k=1}^{M} D^i_p\,D^j_k\,n^p n^k}{\displaystyle\sum_{p,k=1}^{M} D_{pk}\,n^k n^p} \tag{2.66}$$
which describes the system random motion along the hyperplane ϒ. Indeed, in a basis bϒ ⊕ n the components of this tensor belonging to the hyperplane ϒ coincide with the ones given by (2.51) and are equal to zero when one of its indices matches the vector n. Third, when the initial basis e is orthonormal the expressions of Proposition 2.1 can be simplified. Indeed, in this case the metric tensor g_{ij} := (e_i · e_j ) = δ_{ij} is the unit matrix and it is possible not to distinguish between the upper and lower tensor indices; in particular, all the components $D^{ij} = D^i_j = D_{ij}$ are identical. If, in addition, the initial basis has the form e = eϒ ⊕ n, expressions (2.45)–(2.48) become

$$\mathbf{b}_M = \frac{1}{\omega} \Bigg[ \sum_{\gamma=1}^{M-1} \mathbf{e}_\gamma\,D_{\gamma M} + \mathbf{n}\,D_{MM} \Bigg], \tag{2.67}$$

where the coefficient ω in (2.46) is

$$\omega = \Bigg[ \sum_{\gamma=1}^{M-1} D_{\gamma M}^2 + D_{MM}^2 \Bigg]^{1/2} \tag{2.68}$$

and the value of D_M in (2.48) is

$$D_M = \frac{1}{D_{MM}} \sum_{\gamma=1}^{M-1} D_{\gamma M}^2 + D_{MM}. \tag{2.69}$$

Also, the inverse transformation matrix $\breve{u}^\alpha_\beta$ coincides with the direct transformation matrix transposed, so $\breve{u}^\alpha_\beta = u^\beta_\alpha$.
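For an orthonormal initial basis e = eϒ ⊕ n the quantities in (2.67)–(2.69) and the surface tensor (2.58) can be evaluated directly. The sketch below is a hedged numerical check rather than part of the proof; the function name `boundary_basis` and the sample matrix are assumptions introduced only for illustration. It verifies that b_M has unit length and that the original diffusion matrix is recovered as the surface tensor (embedded in the hyperplane) plus D_M b_M b_Mᵀ, which is what diagonality of the diffusion tensor in the basis b = bϒ ⊕ b_M implies once the surface tensor is diagonalized within ϒ.

```python
import math


def boundary_basis(D):
    """For a symmetric positive-definite M x M diffusion matrix D given in an
    orthonormal basis whose last vector is the boundary normal n, return
    omega (2.68), the unit vector b_M (2.67), D_M (2.69) and the
    (M-1) x (M-1) surface tensor (2.58)."""
    M = len(D)
    last = M - 1
    D_MM = D[last][last]
    # omega**2 is the sum of squares of the last column, D_MM**2 included.
    omega = math.sqrt(sum(D[g][last] ** 2 for g in range(M)))
    b_M = [D[g][last] / omega for g in range(M)]
    D_M = omega ** 2 / D_MM
    surface = [[D[a][b] - D[a][last] * D[b][last] / D_MM for b in range(last)]
               for a in range(last)]
    return omega, b_M, D_M, surface


# Illustrative symmetric positive-definite matrix (an assumption, M = 3).
D = [[2.0, 0.3, 0.5],
     [0.3, 1.5, -0.4],
     [0.5, -0.4, 1.0]]

omega, b_M, D_M, surface = boundary_basis(D)

# Rebuild D from the surface tensor and the single off-plane direction b_M.
rebuilt = [[(surface[i][j] if i < 2 and j < 2 else 0.0) + D_M * b_M[i] * b_M[j]
            for j in range(3)] for i in range(3)]

print("omega =", omega, " D_M =", D_M)
print("b_M   =", b_M)
```

Diagonalizing the returned surface tensor, for instance with any symmetric eigenvalue routine, would then supply the orthonormal vectors bϒ and the values D_α ; that step is omitted here to keep the sketch short.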
Proposition 2.1 prompts us to use the basis b = {b_i } in describing random walks in the half-space $\mathbb{R}^M_+$. For its internal points the continuous random walks are represented as a collection of mutually independent one-dimensional Markovian processes {ζ^i (t)}

$$\mathbf{r}(t) = \mathbf{b}_i\,\zeta^i(t) = \mathbf{b}_i \int_0^t dt'\,\xi^i(t'), \tag{2.70}$$

where the Langevin random forces {ξ^i (t)} meet the correlations

$$\big\langle \xi^i(t) \big\rangle = v^i, \tag{2.71}$$

$$\big\langle \xi^i(t)\,\xi^{i'}(t') \big\rangle = 2 D_i\,\delta_{ii'}\,\delta(t - t'), \tag{2.72}$$

and {v^i } are the components of the drift velocity v = b_i v^i in the basis b. As could be shown directly, these random forces lead to expressions (2.37) and (2.38).

2.4.2 Equivalent Lattice Random Walks
Figure 2.5 The lattice random walks imitating a continuous Markovian process in the half-space $\mathbb{R}^3_+$. Here ϒ is the boundary of $\mathbb{R}^3_+$, the axes x^1 , x^2 are chosen to be directed along the vectors b_1 , b_2 of the basis bϒ , the axis x^3 is normal to the plane ϒ, whereas the basis vector b_3 is not normal to it, in the general case. The values a_1 , a_2 , a_3 are the lattice spacings, ϒ0 , ϒ1 , ϒ2 , ϒ3 are the successive layers of nodes, and gray arrows show possible hops to the nearest neighbors.

The desired lattice is constructed as follows (see also Figure 2.5 for illustration). For the first step a set of nodes {aϒ } is fixed on the boundary ϒ such that

$$\mathbf{a}_\Upsilon(\mathbf{n}_\Upsilon) = \sum_{\alpha=1}^{M-1} \mathbf{b}_\alpha\,a_\alpha\,n_\alpha, \tag{2.73}$$
where nϒ = {n_1 , n_2 , . . . , n_{M−1} } is a collection of integers taking values in Z and the lattice spacings a_α are chosen to be equal to

$$a_\alpha = \sqrt{2\tau_a M D_\alpha}. \tag{2.74a}$$

Here τa is any small time scale meeting inequality (2.36) and being the time step of lattice random walks; a walker hops to one of the nearest neighbors in time τa . Such jumps are illustrated by gray arrows in Figure 2.5. These nodes are regarded as the boundary layer ϒ0 of the lattice to be constructed. Then the layer ϒ0 as a whole is shifted inwards in the region $\mathbb{R}^M_+$ by the vector a_M b_M , where

$$a_M = \sqrt{2\tau_a M D_M}. \tag{2.74b}$$

Then this new layer ϒ1 in turn is shifted by the same vector, giving rise to the next layer ϒ2 of nodes, and so on. In this way we construct the system of layers {ϒk } making up the desired lattice, and random walks on this lattice will imitate the continuous process in the half-space $\mathbb{R}^M_+$. Let us now specify the probability of hops from an internal node n to one of its nearest neighbors n′ along a basis vector b_i by the expression

$$P_{\mathbf{n}\mathbf{n}'} = \frac{1}{2M} + \frac{\tau_a}{2a_i}\,v^i\chi_i. \tag{2.75}$$

Here the random value χ_i = ±1 takes into account the possibility of jumps along the vector b_i or in the opposite direction. The sequence of such hops with time step τa represents equivalently the continuous process quite far from the boundary ϒ. Indeed, due to the law of large numbers (see, e.g. [47]) two Markovian processes are identical if, on quite a small time scale, both of them lead to the same mean and mean-square values of the system displacement. By virtue of (2.75) one hop of the walker is characterized by the following mean values of its displacement δr = b_i δζ^i

$$\sum_{\mathbf{n}'} P_{\mathbf{n}\mathbf{n}'}\,\delta\zeta^i_{\mathbf{n}\mathbf{n}'} = \tau_a v^i, \tag{2.76}$$

$$\sum_{\mathbf{n}'} P_{\mathbf{n}\mathbf{n}'}\,\delta\zeta^i_{\mathbf{n}\mathbf{n}'}\,\delta\zeta^j_{\mathbf{n}\mathbf{n}'} = 2\tau_a D_i\,\delta_{ij}, \tag{2.77}$$

where the sums run over all the nearest neighbors n′ of the node n. According to (2.37) and (2.38) the same mean values of the system displacement during the time interval τa are given by the continuous random process. Rigorously speaking, the latter mean value and the one corresponding to the continuous random process are not identical, but their difference (b_i · b_j )v^i v^j τa^2 is of the second order in the time scale τa , whereas the leading terms are of the first order. So by choosing the time scale τa to be sufficiently small, we can make this difference negligible.
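The one-hop moments (2.76) and (2.77) follow from (2.74) and (2.75) by direct summation over the 2M nearest neighbors, and this is easy to check numerically. The following sketch assumes the diffusion tensor is already diagonal in the basis b; the function names and the sample values of D_i , v^i and τa are illustrative assumptions only.

```python
import math


def lattice_hops(D_diag, v, tau_a):
    """Return the lattice spacings (2.74) and the list of hops
    (axis i, direction chi, probability) given by (2.75) for an internal node.
    D_diag[i] and v[i] are the components D_i and v^i in the basis b."""
    M = len(D_diag)
    spacings = [math.sqrt(2.0 * tau_a * M * D_i) for D_i in D_diag]
    hops = []
    for i in range(M):
        for chi in (+1, -1):
            prob = 1.0 / (2 * M) + tau_a * v[i] * chi / (2.0 * spacings[i])
            hops.append((i, chi, prob))
    return spacings, hops


def hop_moments(D_diag, v, tau_a):
    """First and second moments of the displacement in one hop."""
    M = len(D_diag)
    spacings, hops = lattice_hops(D_diag, v, tau_a)
    mean = [0.0] * M
    second = [[0.0] * M for _ in range(M)]
    for i, chi, prob in hops:
        step = chi * spacings[i]
        mean[i] += prob * step
        # A hop moves the walker along a single basis vector,
        # so only diagonal second moments receive contributions.
        second[i][i] += prob * step * step
    return mean, second


# Illustrative values (assumptions): M = 3.
D_diag = [1.0, 0.5, 2.0]
v = [0.3, -0.2, 0.1]
tau_a = 1.0e-3

spacings, hops = lattice_hops(D_diag, v, tau_a)
mean, second = hop_moments(D_diag, v, tau_a)

for i in range(3):
    print(i, mean[i], tau_a * v[i], second[i][i], 2 * tau_a * D_diag[i])
```

The expression (2.75) is a valid probability only when τa |v^i | / a_i does not exceed 1/M, which holds automatically for sufficiently small τa because a_i scales as τa^{1/2}.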
2.4.3 Properties of the Boundary Layer
In order to describe the boundary effects on random walks, special properties should be ascribed to the nodes of the boundary layer ϒ0 . It is worthwhile to note that this is the place where the model of the medium boundary appears for the first time. We keep in mind the boundary types discussed in Section 2.3. First, each boundary node is regarded as a unit of two elements, the lattice node itself and a trap. If a walker jumps to a trap it will never return to the lattice nodes. The introduction of traps mimics the absorption effect of medium boundaries. Second, possible fast diffusion inside a thin layer adjacent to medium boundaries is imitated in terms of multiple steps over the boundary nodes during the time interval τa . These constructions are illustrated in Figure 2.6. For the walker located at a certain boundary node the probabilities of hopping to the internal neighboring node, P_l , or being trapped, P_tr , are specified as

$$P_l = \frac{1 - \sigma_a}{M}, \qquad P_{tr} = \frac{\sigma_a}{M}, \tag{2.78}$$

where the coefficient σa quantifies the trapping (absorption) effect. Leaping ahead, we note that the coefficient σa can be assumed to be a small value because its magnitude σa → 0 as τa → 0 within the collection of lattices leading to the equivalent description of the random walks on time scales τa ≪ t ≲ τ. These probabilities have been chosen so that the probability of walker motion along the direction of the basis vector b_M is equal to the same value as for the internal points,

$$P_l + P_{tr} = \frac{1}{M}. \tag{2.79}$$

Figure 2.6 Characteristic properties of random walks in the boundary layer ϒ0 . The left inset illustrates possible hops from the boundary layer, including the hop from a node to its trap. The main fragment illustrates the walker jumps inside the boundary layer ϒ0 , which can be complex and comprise many elementary hops. The latter feature imitates possible fast diffusion inside a certain thin layer adjacent to crystal boundaries.

Therefore the probability of the walker being initially at a boundary node nϒ and making a jump within the boundary layer ϒ0 is

$$P_\Upsilon = \frac{M - 1}{M}. \tag{2.80}$$
First, let us consider the case where such jumps are the elementary hops to one, n′ϒ , of the nearest neighboring nodes in ϒ0 . Then, following construction (2.75) its conditional probability is written as

$$P^{(1)}_{\mathbf{n}_\Upsilon \mathbf{n}'_\Upsilon} = \frac{1}{2(M - 1)} + \frac{M}{(M - 1)}\,\frac{\tau_a}{2a_\alpha}\,v^\alpha_\Upsilon\,\chi_\alpha. \tag{2.81}$$
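Here, as before, the value χ_α = ±1 is ascribed to the walker hop along the basis vector b_α or in the opposite direction, and $v^\alpha_\Upsilon$ are the components of the drift velocity inside the boundary layer in the basis bϒ . It should be noted that regular drift inside the boundary layer and the medium can be different in nature, which is allowed for by the index ϒ at the boundary components of the drift velocity. The adopted expression (2.81), as it must, obeys equalities similar to expressions (2.76), (2.77), namely, for the displacement δrϒ = b_α δζ^α along the boundary ϒ

$$P_\Upsilon \sum_{\mathbf{m}\in\Upsilon} P^{(1)}_{\mathbf{n}_\Upsilon \mathbf{m}}\,\delta\zeta^\alpha_{\mathbf{n}_\Upsilon \mathbf{m}} = \tau_a v^\alpha_\Upsilon, \tag{2.82}$$

$$P_\Upsilon \sum_{\mathbf{m}\in\Upsilon} P^{(1)}_{\mathbf{n}_\Upsilon \mathbf{m}}\,\delta\zeta^\alpha_{\mathbf{n}_\Upsilon \mathbf{m}}\,\delta\zeta^\beta_{\mathbf{n}_\Upsilon \mathbf{m}} = 2\tau_a D_\alpha\,\delta_{\alpha\beta}. \tag{2.83}$$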
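The fast diffusion inside the boundary layer ϒ is imitated by complex jumps made up of g successive elementary hops within the time τa . In this case the walker can reach not only the nearest neighboring nodes, but also relatively distant ones. The conditional probability of such a g-fold jump from node nϒ to node n′ϒ is given by the expression

$$P^{(g)}_{\mathbf{n}_\Upsilon \mathbf{n}'_\Upsilon} = \sum_{\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_{g-1} \in \Upsilon} P^{(1)}_{\mathbf{n}_\Upsilon \mathbf{m}_1} \times P^{(1)}_{\mathbf{m}_1 \mathbf{m}_2} \times \cdots \times P^{(1)}_{\mathbf{m}_{g-2} \mathbf{m}_{g-1}} \times P^{(1)}_{\mathbf{m}_{g-1} \mathbf{n}'_\Upsilon}. \tag{2.84}$$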
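By virtue of (2.82) and (2.83) the probability function $P^{(g)}_{\mathbf{n}_\Upsilon \mathbf{n}'_\Upsilon}$ of g-fold jumps gives the following values for the first and second moments of the walker displacement $\delta\mathbf{r}_\Upsilon = \sum_{\alpha=1}^{M-1} \mathbf{b}_\alpha\,\delta\zeta^\alpha$ in the layer ϒ0

$$P_\Upsilon \sum_{\mathbf{m}\in\Upsilon} P^{(g)}_{\mathbf{n}_\Upsilon \mathbf{m}}\,\delta\zeta^\alpha_{\mathbf{n}_\Upsilon \mathbf{m}} = g\tau_a v^\alpha_\Upsilon, \tag{2.85}$$

$$P_\Upsilon \sum_{\mathbf{m}\in\Upsilon} P^{(g)}_{\mathbf{n}_\Upsilon \mathbf{m}}\,\delta\zeta^\alpha_{\mathbf{n}_\Upsilon \mathbf{m}}\,\delta\zeta^\beta_{\mathbf{n}_\Upsilon \mathbf{m}} = 2(g\tau_a) D_\alpha\,\delta_{\alpha\beta} + (g\tau_a)^2\,\frac{M}{(M - 1)}\,v^\alpha_\Upsilon v^\beta_\Upsilon. \tag{2.86}$$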
In expression (2.86) we again have ignored terms of order gτa^2 because the displacement of a walker along the boundary ϒ caused by its migration inside the layer ϒ0 is considerable only for g ≫ 1, as will be seen further. In the latter case the conditional probability (2.84) of transition from the node nϒ to the node

$$\mathbf{n}'_\Upsilon = \mathbf{n}_\Upsilon + \sum_{\alpha} \mathbf{b}_\alpha m_\alpha \qquad (m_\alpha \text{ are integers}) \tag{2.87}$$

can be approximated by the Gaussian distribution

$$P^{(g)}_{\mathbf{n}_\Upsilon \mathbf{n}'_\Upsilon} = \Bigg[ \frac{M - 1}{2\pi g} \Bigg]^{\frac{M-1}{2}} \exp\Bigg\{ -\frac{(M - 1)}{2g} \sum_{\alpha=1}^{M-1} \Bigg[ m_\alpha - \frac{gM}{(M - 1)}\,\frac{\tau_a v^\alpha_\Upsilon}{a_\alpha} \Bigg]^2 \Bigg\} \tag{2.88}$$

by virtue of the law of large numbers and expressions (2.74a), (2.85), (2.86). The desired lattice random walk imitating the continuous Markovian process in the vicinity of the medium boundary ϒ is therefore constructed.
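The moments (2.85) and (2.86) of the g-fold jump can be checked by carrying out the convolution (2.84) explicitly. The sketch below is restricted, as an assumption made for brevity, to M = 2, so that the boundary layer is a one-dimensional chain of nodes and (2.81) reduces to hop probabilities 1/2 ± τa vϒ / a; the function name and the parameter values are illustrative. It confirms that the first moment equals gτa vϒ and that the exact second moment differs from (2.86) only by the neglected term of order gτa^2.

```python
import math


def g_fold_distribution(p_plus, g):
    """Distribution of the net displacement m (in units of the spacing) after
    g elementary hops on a one-dimensional chain: the convolution (2.84)
    written out for M = 2."""
    p_minus = 1.0 - p_plus
    dist = {0: 1.0}
    for _ in range(g):
        new = {}
        for m, prob in dist.items():
            new[m + 1] = new.get(m + 1, 0.0) + prob * p_plus
            new[m - 1] = new.get(m - 1, 0.0) + prob * p_minus
        dist = new
    return dist


# Illustrative values (assumptions, not from the text).
M = 2
tau_a, D_1, v_b, g = 1.0e-4, 1.0, 0.5, 400

a_1 = math.sqrt(2.0 * tau_a * M * D_1)                     # spacing (2.74a)
p_plus = 1.0 / (2 * (M - 1)) + M / (M - 1) * tau_a * v_b / (2.0 * a_1)   # (2.81), chi = +1
P_boundary = (M - 1) / M                                   # (2.80)

dist = g_fold_distribution(p_plus, g)

first = P_boundary * sum(prob * m * a_1 for m, prob in dist.items())
second = P_boundary * sum(prob * (m * a_1) ** 2 for m, prob in dist.items())

first_expected = g * tau_a * v_b                                        # (2.85)
second_approx = 2 * g * tau_a * D_1 + (g * tau_a) ** 2 * M / (M - 1) * v_b ** 2   # (2.86)
neglected = g * tau_a ** 2 * M / (M - 1) * v_b ** 2        # term of order g * tau_a**2

print(first, first_expected)
print(second, second_approx, second_approx - second, neglected)
```

The pointwise comparison with the Gaussian form (2.88) is deliberately left out of this sketch: on a one-dimensional chain a g-hop walk visits only nodes of the same parity as g, so such a comparison would need additional care.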
2.5 Expression for Boundary Singularities
As discussed in Section 2.2.2, the medium boundary ϒ breaks down the symmetry of random walks in its vicinity, which is reflected in the anomalous behavior of quantities (2.8)–(2.10) near the boundary ϒ. To quantify this effect it is necessary to calculate the given integrals near the boundary ϒ for any small time interval τ. Quantities (2.8)–(2.10) comprise two types of terms which differ in scaling with respect to τ: regular components proportional to τ and anomalous ones scaling as $\sqrt{\tau}$. In the present section only the latter terms are under consideration. When deriving the Fokker–Planck equations their division by τ gives rise to the singularity τ^{−1/2}. Their cofactors quantify the influence of the boundary on Markovian processes and by setting them equal to zero we can relate the boundary values of the Green function G(r, t|r0 , t0 ) to the physical properties of the medium boundaries. Assuming the time scale τ to be sufficiently small, the medium in a certain neighborhood Qs of a boundary point s ∈ ϒ is treated as a homogeneous continuum with time independent characteristics and the corresponding fragment of the boundary ϒ is approximated by a hyperplane. In this case it is natural to choose the coordinate system related to a basis e = eϒ ⊕ n, which, in particular, reduces the number of the Green function arguments,

$$G(\mathbf{r}, x_0^M|\tau) := G\big(\mathbf{r}, t_0 + \tau\,\big|\,\{\mathbf{0}_\Upsilon, x_0^M\}, t_0\big).$$

The origin of the coordinate system is located at the hyperplane ϒ such that the vector $\mathbf{r}_0 = \{\mathbf{0}_\Upsilon, x_0^M\}$ can have only one component $x_0^M$ determining the distance between the point r0 and the hyperplane ϒ. Then, using the general definitions (2.8)–(2.10) of the quantities R(r, t, τ), Ui (r, t, τ), and Lij (r, t, τ), the anomalous properties of random walks near the boundary ϒ are quantified by their singular components ${}^{*}U^i(\tau, x^M)$ and ${}^{*}L^{ij}(\tau, x^M)$ scaling as $\sqrt{\tau}$. The symbol ∗ is not applied to R(τ, x^M ) because it possesses no regular component at all. In other words the desired quantities are determined by the following means

$$\int_{Q_s} d\tilde{\mathbf{r}}\,G(\tilde{\mathbf{r}}, x^M|\tau) = 1 - R(\tau, x^M), \tag{2.89}$$

$$\int_{Q_s} d\tilde{\mathbf{r}}\,\delta\tilde{x}^i\,G(\tilde{\mathbf{r}}, x^M|\tau) = {}^{*}U^i(\tau, x^M) + O(\tau), \tag{2.90}$$

$$\frac{1}{2} \int_{Q_s} d\tilde{\mathbf{r}}\,\delta\tilde{x}^i\,\delta\tilde{x}^j\,G(\tilde{\mathbf{r}}, x^M|\tau) = {}^{*}L^{ij}(\tau, x^M) + O(\tau), \tag{2.91}$$
where $\delta\tilde{x}^{\alpha} = \tilde{x}^{\alpha}$ and $\delta\tilde{x}^{M} = \tilde{x}^{M} - x^{M}$. In order to calculate these boundary singularities we first fix the value $\tau$ and introduce a new time scale $\tau_a \ll \tau$. Then the lattice constructed in Section 2.4 and random walks on it are used to calculate the desired quantities. There are two advantages in using these lattice random walks. First, the choice of the basis $b = b_\Upsilon \oplus b_M$ enables us to simulate the continuous Markovian process as independent random walks along the directions parallel to the hyperplane $\Upsilon$ and along the vector $b_M$. Second, it becomes possible to ascribe special features to the nodes of the boundary layer and in this way to simulate some physical properties of the medium boundary. In particular, the boundary can either absorb a random walker or cause it to migrate extremely quickly along the boundary within a thin layer. Finally, to restore the continuous description the limit $\tau_a \to 0$ is used. The implementation of this approach is again based purely on mathematical manipulations with the probability function for lattice random walks, so only the final results are stated here and the reader is referred to Section 2.6 for the proof.

Proposition 2.2 Let us consider a Markovian system in a homogeneous half-space $\mathbb{R}^{M}_{+}$ bounded by a hyperplane $\Upsilon$ and endowed with the basis $b = b_\Upsilon \oplus b_M$ described in Proposition 2.1,

\[ r = \sum_{\gamma=1}^{M-1} b_\gamma \zeta^{\gamma} + b_M \zeta^{M}. \]
The hyperplane $\Upsilon$, treated as a physical boundary, can absorb the system as well as force it to migrate quickly along $\Upsilon$. The diffusion tensor $D^{ij}$ as well as the drift velocity $v^{i}$ at the internal points and $v_\Upsilon^{i}$ at the boundary $\Upsilon$ are assumed to be determined in the basis $b$. It should be noted that the boundary drift velocity $v_\Upsilon^{i}$ is the velocity at which the system would have moved outside the boundary if it had been affected by the same forces. The continuous motion of the Markovian system is imitated by random walks on the lattice constructed in Section 2.4 with time step $\tau_a$. Finally, the limit $\tau_a \to 0$ is applied.

Then, first, the boundary absorption and fast transport can be characterized by two kinetic coefficients called the surface absorption rate $\sigma$ and the surface diffusion length $l_\Upsilon$, ascribed directly to the boundary $\Upsilon$ itself, implying that these quantities are independent of the discretization time $\tau_a$.

Second, random walks near the hyperplane $\Upsilon$ exhibit anomalous properties reflected in the following singular means scaling with the time $\tau$ as $\sqrt{\tau}$:

\[ R_b(\tau, \zeta^M) = D_{MM}^{-1/2}\, \sigma \cdot K(\tau, \zeta^M), \tag{2.92} \]

\[ {}^{*}U_b^{M}(\tau, \zeta^M) = D_{MM}^{-1/2}\, \omega \cdot K(\tau, \zeta^M), \tag{2.93} \]

\[ {}^{*}U_b^{\alpha}(\tau, \zeta^M) = D_{MM}^{-1/2}\, l_\Upsilon v_\Upsilon^{\alpha} \cdot K(\tau, \zeta^M), \tag{2.94} \]

\[ {}^{*}L_b^{\alpha\beta}(\tau, \zeta^M) = D_{MM}^{-1/2}\, l_\Upsilon D_\alpha \delta_{\alpha\beta} \cdot K(\tau, \zeta^M). \tag{2.95} \]
Here the label $b$ denotes that the basis $b$ is used, $\zeta^M$ is the distance between the point $r$ and the hyperplane $\Upsilon$ measured along the vector $b_M$, and the function $K(\tau, \zeta^M)$ is specified by the integral

\[ K(\tau, \zeta^M) = \sqrt{\frac{\tau}{\pi}} \int_0^1 \frac{dz}{\sqrt{z}} \exp\left[ -\frac{(\zeta^M)^2}{4 D_M \tau z} \right]. \tag{2.96} \]

In order to represent these boundary singularities in the initial basis $e$, Proposition 2.1 is again applied. The initial basis has been assumed to be of the form $e = e_\Upsilon \oplus n$ with the unit normal $n$ to the boundary $\Upsilon$ directed into the medium. Let $\hat{U}_\Upsilon^{-1} = \breve{u}^{\alpha}_{\beta}$ be the operator mapping the boundary basis $e_\Upsilon$ onto the basis $b_\Upsilon$. Then the transition from the coordinates $\{\zeta^\alpha\}, \zeta^M$ of a vector $r$ in the basis $b$ to its coordinates $\{x^\alpha\}, x^M$ in the basis $e$ is specified by (2.50c) and (2.50d) using the tensor $\breve{u}^{\alpha}_{\beta}$ and the diffusion tensor $D^{ij}$ determined in the initial basis $e$. In vector form these coordinates are related by (2.49). The quantities ${}^{*}U_b^{i}(\tau, \zeta^M)$ and ${}^{*}L_b^{ij}(\tau, \zeta^M)$ are obtained by averaging variations of the coordinates $\zeta^i$. So they are a contravariant vector and tensor, respectively, with the latter being proportional to the diffusion tensor written in the basis $b$ and reduced to the hyperplane $\Upsilon$, namely, the tensor $D_\alpha \delta_{\alpha\beta}$. The value $R_b(\tau, \zeta^M)$ is naturally a scalar. Whence it follows directly that

\[ R(\tau, x^M) = D_{MM}^{-1/2}\, \sigma \cdot K(\tau, x^M), \tag{2.97} \]

\[ {}^{*}U^{i}(\tau, x^M) = D_{MM}^{-1/2} \left[ D^{iM} + l_\Upsilon v_\Upsilon^{i} \right] \cdot K(\tau, x^M), \tag{2.98} \]

\[ {}^{*}L^{\alpha\beta}(\tau, x^M) = D_{MM}^{-1/2}\, l_\Upsilon D^{\alpha\beta} \cdot K(\tau, x^M). \tag{2.99} \]

Here the coordinates $x^M$ and $\zeta^M$ are inter-related by (2.50d) and the boundary diffusion tensor $D^{\alpha\beta}$ is specified by (2.66). Formula (2.98) can also be rewritten in the vector form

\[ {}^{*}U(\tau, x^M) = D_{MM}^{-1/2} \left[ b + l_\Upsilon v_\Upsilon \right] K(\tau, x^M), \tag{2.100} \]

where the vector $b$ of boundary singularities is given by (2.16).
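The function $K(\tau, \zeta^M)$ of (2.96) carries the whole $\sqrt{\tau}$ dependence, so it is useful to be able to evaluate it. The following is a minimal sketch, not part of the original derivation: it evaluates (2.96) by a midpoint rule after the substitution $z = u^2$ and compares the result with the closed form $2\sqrt{\tau/\pi}\, e^{-x^2} - (\zeta^M/\sqrt{D_M})\, \mathrm{erfc}(x)$, $x = \zeta^M / (2\sqrt{D_M \tau})$, which follows from the standard inverse Laplace transform of $s^{-3/2} e^{-k\sqrt{s}}$. The function names and parameter values are illustrative assumptions.

```python
import math

def K_quadrature(tau, zeta, D_M, n=2000):
    """Evaluate (2.96): sqrt(tau/pi) * int_0^1 dz/sqrt(z) * exp(-zeta^2 / (4 D_M tau z))."""
    c = zeta ** 2 / (4.0 * D_M * tau)
    # The substitution z = u^2 removes the 1/sqrt(z) singularity:
    # the integral becomes int_0^1 2 * exp(-c / u^2) du.
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h          # midpoints never hit u = 0
        total += 2.0 * math.exp(-c / (u * u))
    return math.sqrt(tau / math.pi) * total * h

def K_closed_form(tau, zeta, D_M):
    """Closed form of the same integral in terms of the complementary error function."""
    x = zeta / (2.0 * math.sqrt(D_M * tau))
    return (2.0 * math.sqrt(tau / math.pi) * math.exp(-x * x)
            - zeta / math.sqrt(D_M) * math.erfc(x))
```

At the boundary itself ($\zeta^M = 0$) both expressions reduce to $2\sqrt{\tau/\pi}$, and the value decays rapidly once $\zeta^M$ exceeds a few multiples of $\sqrt{D_M \tau}$, in line with the layer thickness scaling as $\tau^{1/2}$.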
2.6 Derivation of Singular Boundary Scaling Properties
The homogeneous half-space $\mathbb{R}^{M}_{+}$ bounded by the hyperplane $\Upsilon$ is under consideration, and the lattice described in Section 2.4 is constructed. It is made up of the node layers $\{\Upsilon_i\}$ parallel to the hyperplane $\Upsilon$ with the interplane spacing vector $a_M b_M$. The node arrangement within each layer $\Upsilon_i$ is determined by the vectors of the hyperplane basis $b_\Upsilon$ with spacings $\{a_\alpha\}$. In other words, the nodes of this lattice are the points

\[ r_{\mathbf{n}} = \sum_{\alpha=1}^{M-1} n^{\alpha} (a_\alpha b_\alpha) + n\, a_M b_M, \]

where $\mathbf{n}$ is the collection of numbers $\{n_\Upsilon, n\} = \{\{n^\alpha\}, n\}$ taking any integer value, $n^\alpha = 0, \pm 1, \pm 2, \ldots$ for $\alpha = 1, \ldots, M-1$, except for the last one, which takes only non-negative values $n = 0, 1, 2, \ldots$ In particular, the points $\{r_{\mathbf{n}}\}$ with $n = 0$ form the boundary layer $\Upsilon_0$. The Markovian process in the half-space $\mathbb{R}^{M}_{+}$ is simulated by random walks on this lattice with the hop probabilities given in Section 2.4.

To find the desired boundary singularities we analyze the evolution of the walker distribution over the given lattice, that is, the dynamics of the probability $P_{t,\mathbf{n}}$ of finding the walker at node $\mathbf{n}$ after $t$ hops. Here $t$ is the time measured in jump numbers, that is, in units of the hop duration $\tau_a$. At the initial time $t = 0$ the walker is assumed to be located at a certain internal node $\mathbf{n}_0$. Without loss of generality all the components of the index $\mathbf{n}_0$ can be set equal to zero except for the last one: $\mathbf{n}_0 = \{0, 0, \ldots, 0, n_0\}$.

2.6.1 Moments of the Walker Distribution and the Generating Function
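The main purpose here is to find the zeroth-, first-, and second-order moments of the distribution function $P_{t,\mathbf{n}}$. The zeroth moment quantifies the trapping effect, whereas the first and second ones characterize the walker propagation in space. Namely, the following quantities

\[ R_a(t, n_0) = 1 - \sum_{n=0}^{\infty} \sum_{n_\Upsilon} P_{t,\{n_\Upsilon, n\}}, \tag{2.101} \]

\[ U_a^{i}(t, n_0) = \sum_{n=0}^{\infty} \sum_{n_\Upsilon} \left( n^{i} - n_0^{i} \right) P_{t,\{n_\Upsilon, n\}}, \tag{2.102} \]

\[ L_a^{ij}(t, n_0) = \frac{1}{2} \sum_{n=0}^{\infty} \sum_{n_\Upsilon} \left( n^{i} - n_0^{i} \right) \left( n^{j} - n_0^{j} \right) P_{t,\{n_\Upsilon, n\}} \tag{2.103} \]

have to be calculated. Here the index $i$ is used as a general symbol for one of the indices $\{\alpha\}, M$. In order to do this, the generating function and its analog written for the boundary nodes only,

\[ G(s, p, k_\Upsilon) = \sum_{t=0}^{\infty} \sum_{n_\Upsilon} \sum_{n=0}^{\infty} e^{-st - p(n - n_0) + i(k_\Upsilon \cdot n_\Upsilon)}\, P_{t,\{n_\Upsilon, n\}}, \tag{2.104} \]

\[ g(s, k_\Upsilon) = \sum_{t=0}^{\infty} \sum_{n_\Upsilon} e^{-st + i(k_\Upsilon \cdot n_\Upsilon)}\, P_{t,\{n_\Upsilon, 0\}} = \lim_{p \to \infty} e^{-p n_0}\, G(s, p, k_\Upsilon) \tag{2.105} \]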
are introduced, where the complex arguments $s$, $p$ have non-negative real parts, $\mathrm{Re}\, s, \mathrm{Re}\, p \geq 0$. It should be noted that the traps are not included in these sums. The discrete Laplace transforms of the desired functions (2.101)–(2.103) are directly related to the generating function. Indeed,

\[ R_a(s, n_0) = \sum_{t=0}^{\infty} e^{-st} R_a(t, n_0) = \frac{1}{1 - e^{-s}} - G(s, 0, 0), \tag{2.106} \]

\[ U_a^{i}(s, n_0) = \sum_{t=0}^{\infty} e^{-st}\, U_a^{i}(t, n_0) = \nabla^{i} G(s, p, k_\Upsilon) \Big|_{p, k_\Upsilon = 0}, \tag{2.107} \]

\[ L_a^{ij}(s, n_0) = \sum_{t=0}^{\infty} e^{-st}\, L_a^{ij}(t, n_0) = \frac{1}{2} \nabla^{i} \nabla^{j} G(s, p, k_\Upsilon) \Big|_{p, k_\Upsilon = 0}, \tag{2.108} \]

where the operator $\nabla^{i}$ is $\nabla^{\alpha} = -i\, \partial_{k_\alpha}$ if the index $i = \alpha$ is one of the indices of the hyperplane $\Upsilon$ and $\nabla^{M} = -\partial_p$ for the index $i = M$.

2.6.2 Master Equation for Lattice Random Walks and its General Solution
To find the generating function for the discrete random walks under consideration the corresponding master equation is applied. For an internal node $\mathbf{n} = \{n_\Upsilon, n\}$ with $n \geq 2$ it takes the form

\[ P_{t+1,\mathbf{n}} = {\sum_{\mathbf{m}}}' P_{t,\mathbf{m}}\, P_{\mathbf{m}\mathbf{n}}. \tag{2.109} \]

Here the prime on the sum denotes that the index $\mathbf{m}$ runs over all the nearest neighbors of the given node $\mathbf{n}$ and, according to expression (2.75), the corresponding hop probabilities can be represented as

\[ P_{\mathbf{m}\mathbf{n}} = \frac{1 + \epsilon_i \chi_i}{2M}, \tag{2.110} \]

where $\epsilon_i = \tau_a v^{i} M / a_i$ are small quantities scaling with $\tau_a$ as $\epsilon_i \propto \tau_a^{1/2}$, and the value $\chi_i = \pm 1$ stands for hops along the basis vector $b_i$ or in the opposite direction, that is, the hop to the node with $m^{i} = n^{i} \pm 1$ and $m^{j} = n^{j}$ for $j \neq i$. For the nodes of the layer $\Upsilon_1$ the master equation becomes

\[ P_{t+1,\mathbf{n}} = {\sum_{\mathbf{m}}}' P_{t,\mathbf{m}}\, P_{\mathbf{m}\mathbf{n}} + P_{t,\mathbf{n}_b}\, P_l. \tag{2.111} \]

Here the prime on the sum has the same meaning, except that only internal neighboring nodes are taken into account; $\{\mathbf{n}_b, \mathbf{n}\}$ is the pair of nodes belonging to the boundary layer $\Upsilon_0$ and the adjacent internal layer $\Upsilon_1$ that are related to each other via walker hops, and the hop probability $P_l$ is determined by (2.78). The walker distribution function $P_{t,\mathbf{n}_b}$ in the boundary layer obeys the equation

\[ P_{t+1,\mathbf{n}_b} = \sum_{\mathbf{m}_b \in \Upsilon_0} P_{t,\mathbf{m}_b}\, P_\Upsilon\, P^{(g)}_{\mathbf{m}_b \mathbf{n}_b} + P_{t,\mathbf{n}}\, P_{\mathbf{n}\mathbf{n}_b}. \tag{2.112} \]
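We recall that the jumps inside the boundary layer can be complex and comprise, individually, $g$ elementary hops. In this case the multihop probability $P^{(g)}_{\mathbf{m}_b \mathbf{n}_b}$ is determined by formula (2.84). The one-hop probability along the basis vector $b_\alpha$, provided the walker remains in the boundary layer $\Upsilon_0$, is

\[ P^{(1)}_{\mathbf{m}_b \mathbf{n}_b} = \frac{1 + \epsilon^{\Upsilon}_{\alpha} \chi_\alpha}{2(M - 1)}, \tag{2.113} \]

where $\epsilon^{\Upsilon}_{\alpha} = \tau_a v_\Upsilon^{\alpha} M / a_\alpha$ is again a small parameter scaling as $\epsilon^{\Upsilon}_{\alpha} \propto \tau_a^{1/2}$. The values $\epsilon^{\Upsilon}_{\alpha}$ quantify the asymmetry of hops in the boundary layer $\Upsilon_0$. In particular, these complex jumps are characterized by the means

\[ \langle n^{\alpha} \rangle_\Upsilon = \sum_{\mathbf{n}_b \in \Upsilon_0} n^{\alpha}\, P^{(g)}_{0 \mathbf{n}_b} = \frac{g}{M - 1}\, \epsilon^{\Upsilon}_{\alpha}, \tag{2.114} \]

\[ \langle n^{\alpha} n^{\beta} \rangle_\Upsilon = \sum_{\mathbf{n}_b \in \Upsilon_0} n^{\alpha} n^{\beta}\, P^{(g)}_{0 \mathbf{n}_b} = \frac{g}{M - 1}\, \delta_{\alpha\beta} + \frac{g(g - 1)}{(M - 1)^2}\, \epsilon^{\Upsilon}_{\alpha} \epsilon^{\Upsilon}_{\beta}. \tag{2.115} \]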
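The means (2.114) and (2.115) can be checked by brute force. The sketch below is an illustration under stated assumptions rather than the construction of formula (2.84), which is not reproduced in this section: it assumes that a complex jump is simply the composition of $g$ independent elementary hops, each drawn from the one-hop law (2.113), and it builds the resulting distribution by repeated convolution. The function name `multihop_moments` and the list `eps` (standing for the asymmetries $\epsilon^{\Upsilon}_{\alpha}$, with `len(eps)` playing the role of $M - 1$) are hypothetical.

```python
def multihop_moments(g, eps):
    """Exact first and second moments of a jump composed of g elementary hops.

    eps[a] is the hop asymmetry along in-layer direction a; d = len(eps) = M - 1.
    Assumes the g hops are independent and each follows the one-hop law (2.113).
    """
    d = len(eps)
    # One-hop distribution: choose a direction with probability 1/d,
    # then step +1 or -1 with probabilities (1 +/- eps[a]) / 2.
    one_hop = {}
    for a in range(d):
        for chi in (+1, -1):
            step = tuple(chi if b == a else 0 for b in range(d))
            one_hop[step] = (1.0 + eps[a] * chi) / (2.0 * d)
    # g-fold convolution of the one-hop distribution, starting from the origin.
    dist = {tuple([0] * d): 1.0}
    for _ in range(g):
        new = {}
        for pos, p in dist.items():
            for step, q in one_hop.items():
                nxt = tuple(x + y for x, y in zip(pos, step))
                new[nxt] = new.get(nxt, 0.0) + p * q
        dist = new
    mean = [sum(p * pos[a] for pos, p in dist.items()) for a in range(d)]
    second = [[sum(p * pos[a] * pos[b] for pos, p in dist.items())
               for b in range(d)] for a in range(d)]
    return mean, second
```

Under this independence assumption the first moment grows linearly in $g$, while the second moment picks up the $g(g-1)$ cross term from pairs of distinct hops, which is exactly the structure of (2.115).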
Finally, the master equation for the traps is

\[ P^{(tr)}_{t+1,\mathbf{n}_b} = P^{(tr)}_{t,\mathbf{n}_b} + P_{t,\mathbf{n}_b}\, P_{tr}. \tag{2.116} \]

The hop probabilities $P_l$, $P_{tr}$ are given by expressions (2.78), and the kinetic coefficients of walker jumps inside the boundary layer $\Upsilon_0$ are specified by expressions (2.80), (2.81), and (2.84). At the initial time the walker distribution meets the condition

\[ P_{t=0,\mathbf{n}} = \delta_{\mathbf{n}\mathbf{n}_0}. \tag{2.117} \]
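To make the system (2.109)–(2.117) concrete, the following sketch propagates it in the simplest setting. It is a one-dimensional reduction ($M = 1$) chosen here for illustration only: there is no in-layer motion, an internal node hops by $\pm 1$ with probabilities $(1 \pm \epsilon)/2$, and the boundary node $n = 0$ is assumed to trap the walker with probability $\sigma_a$ and otherwise return it to layer 1. The actual probabilities $P_l$ and $P_{tr}$ are fixed by (2.78), which is not reproduced in this section, so this choice is an assumption, as are the function names.

```python
def evolve_half_line(n0, steps, eps, sigma_a):
    """Propagate the assumed M = 1 analogue of (2.109)-(2.117).

    Returns (P, trapped), where P[n] is the occupation of node n after `steps` hops.
    """
    size = n0 + steps + 2              # the walker cannot get beyond n0 + steps
    P = [0.0] * size
    P[n0] = 1.0                        # initial condition (2.117)
    trapped = 0.0
    up, down = (1.0 + eps) / 2.0, (1.0 - eps) / 2.0
    for _ in range(steps):
        new = [0.0] * size
        # Boundary node: trapped with probability sigma_a, otherwise hop to layer 1.
        trapped += sigma_a * P[0]
        new[1] += (1.0 - sigma_a) * P[0]
        # Internal nodes: biased nearest-neighbor hops.
        for n in range(1, size - 1):
            new[n + 1] += up * P[n]
            new[n - 1] += down * P[n]
        P = new
    return P, trapped

def moments(P, n0):
    """Lattice moments in the sense of (2.101)-(2.103), traps excluded from the sums."""
    R = 1.0 - sum(P)
    U = sum((n - n0) * p for n, p in enumerate(P))
    L = 0.5 * sum((n - n0) ** 2 * p for n, p in enumerate(P))
    return R, U, L
```

Because the traps are excluded from the sums, $R_a$ coincides with the accumulated trapped probability, and far from the boundary the moments reduce to those of a free biased walk.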
To solve this system of equations we substitute (2.109), (2.111), and (2.112) into definition (2.104) of the generating function $G(s, p, k_\Upsilon)$ and, after subsequent mathematical manipulations, get the following equation (see the comments about its derivation just after formula (2.120)):

\[ \left[ e^{s} - \Phi(p, k_\Upsilon) \right] G(s, p, k_\Upsilon) = e^{s} - e^{p n_0} \left[ \Phi(p, k_\Upsilon) - \phi(p, k_\Upsilon) \right] g(s, k_\Upsilon), \tag{2.118} \]

relating the generating functions $G(s, p, k_\Upsilon)$ and $g(s, k_\Upsilon)$ to each other. Here the following functions

\[ \Phi(p, k_\Upsilon) = \frac{1}{M} \left[ \cosh p - \epsilon_M \sinh p \right] + \frac{1}{M} \sum_{\alpha=1}^{M-1} \left[ \cos k_\alpha + i \epsilon_\alpha \sin k_\alpha \right], \tag{2.119} \]

\[ \phi(p, k_\Upsilon) = \frac{(1 - \sigma_a)}{M}\, e^{-p} + \frac{(M - 1)}{M} \sum_{n_\Upsilon} \exp\left\{ i (k_\Upsilon \cdot n_\Upsilon) \right\} P^{(g)}_{0 n_\Upsilon} \tag{2.120} \]

have been constructed in deriving (2.118). The key parts in deriving (2.118) are outlined below. The shift $t \to t + 1$ in (2.104) leads to the line

\[ G(s, p, k_\Upsilon) = e^{-s}\, \widetilde{G}(s, p, k_\Upsilon) + 1, \tag{2.121} \]

where

\[ \widetilde{G}(s, p, k_\Upsilon) = \sum_{t=0}^{\infty} \sum_{n_\Upsilon} \sum_{n=0}^{\infty} e^{-st - p(n - n_0) + i(k_\Upsilon \cdot n_\Upsilon)}\, P_{t+1,\{n_\Upsilon, n\}} \tag{2.122} \]

and the initial condition (2.117) has been taken into account. Equations (2.109), (2.111), and (2.112), relating two successive steps of random walks, are substituted into the latter expression. As a result the terms in sums (2.109)–(2.117) matching the interlayer hops split it into two parts,

\[ \widetilde{G}(s, p, k_\Upsilon) \Rightarrow \Phi_1(p)\, G(s, p, k_\Upsilon) + e^{p n_0} \left[ \phi_1(p) - \Phi_1(p) \right] g(s, k_\Upsilon), \]

with the latter summand caused by the fact that the boundary nodes have different properties from the internal ones. In turn, the components of sums (2.109)–(2.117) describing transitions between a given node $\mathbf{n}$ and the nodes of the same layer also split the term $\widetilde{G}(s, p, k_\Upsilon)$ into two parts,

\[ \widetilde{G}(s, p, k_\Upsilon) \Rightarrow \Phi_2(k_\Upsilon)\, G(s, p, k_\Upsilon) + e^{p n_0} \left[ \phi_2(k_\Upsilon) - \Phi_2(k_\Upsilon) \right] g(s, k_\Upsilon), \]

where the latter summand is due to fast diffusion in the boundary layer. The combination of the two last lines gives (2.118) with $\Phi(p, k_\Upsilon) = \Phi_1(p) + \Phi_2(k_\Upsilon)$ and $\phi(p, k_\Upsilon) = \phi_1(p) + \phi_2(k_\Upsilon)$.
The generating function $G(s, p, k_\Upsilon)$ has no singularities in the region $\mathrm{Re}\, s, \mathrm{Re}\, p > 0$. Therefore the left-hand side of (2.118) is equal to zero when $e^{s} - \Phi(p, k_\Upsilon) = 0$. Resolving the latter equality with respect to the variable $p$ we obtain a function $p = \omega(s, k_\Upsilon)$ defined by the equation

\[ \Phi\bigl( \omega(s, k_\Upsilon), k_\Upsilon \bigr) = e^{s}, \tag{2.123} \]

which specifies the locus in the space $\{s, p, k_\Upsilon\}$ where the right-hand side of (2.118) also has to be equal to zero. The latter enables us immediately to write the boundary generating function in the form

\[ g(s, k_\Upsilon) = \frac{\exp\left[ -\omega(s, k_\Upsilon)\, n_0 \right]}{1 - e^{-s}\, \phi\bigl( \omega(s, k_\Upsilon), k_\Upsilon \bigr)}. \tag{2.124} \]

Expressions (2.118) and (2.124) actually solve the problem, giving us the following expression for the generating function:

\[ G(s, p, k_\Upsilon) = \frac{1}{1 - e^{-s}} + \frac{1}{e^{s} - \Phi(p, k_\Upsilon)} \left\{ \frac{\Phi(p, k_\Upsilon) - 1}{1 - e^{-s}} + e^{-[\omega(s, k_\Upsilon) - p] n_0}\, \frac{\phi(p, k_\Upsilon) - \Phi(p, k_\Upsilon)}{1 - e^{-s}\, \phi\bigl( \omega(s, k_\Upsilon), k_\Upsilon \bigr)} \right\}, \tag{2.125} \]

where the first summand is the image of the delta function $P_{t,\mathbf{n}} = \delta_{\mathbf{n}\mathbf{n}_0}$, which contributes to none of the quantities (2.101)–(2.103), the second term is due to random walks over the internal nodes, and the last one is caused by boundary effects. Formula (2.125) specifies the desired generating function in the general form.

2.6.3 Limit of Multiple-Step Random Walks on Small Time Scales
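As a numerical sanity check of the closed form (2.124) obtained in Section 2.6.2, one can compare it with a direct summation of $e^{-st} P_{t,0}$. The sketch below does this for an assumed one-dimensional reduction ($M = 1$, no in-layer motion), in which $\Phi(p) = \cosh p - \epsilon \sinh p$ and $\phi(p) = (1 - \sigma_a) e^{-p}$; the boundary rule (trap with probability $\sigma_a$, otherwise hop to layer 1) is an illustrative assumption, since (2.78) is not reproduced here. In this case (2.123) is a quadratic equation for $y = e^{\omega}$.

```python
import math

def g_closed_form(s, n0, eps, sigma_a):
    """Evaluate (2.124) for the assumed one-dimensional case M = 1."""
    # Phi(omega) = e^s  <=>  (1 - eps) y^2 - 2 e^s y + (1 + eps) = 0 with y = e^omega.
    # The root with y > 1 corresponds to Re(omega) > 0.
    y = (math.exp(s) + math.sqrt(math.exp(2.0 * s) - (1.0 - eps * eps))) / (1.0 - eps)
    # phi(omega) = (1 - sigma_a) * exp(-omega) = (1 - sigma_a) / y
    return y ** (-n0) / (1.0 - math.exp(-s) * (1.0 - sigma_a) / y)

def g_direct(s, n0, eps, sigma_a, steps):
    """Sum e^{-s t} P_{t,0} over t by iterating the master equation directly."""
    size = n0 + steps + 2
    P = [0.0] * size
    P[n0] = 1.0
    up, down = (1.0 + eps) / 2.0, (1.0 - eps) / 2.0
    total = 0.0
    for t in range(steps):
        total += math.exp(-s * t) * P[0]
        new = [0.0] * size
        new[1] += (1.0 - sigma_a) * P[0]   # boundary node; the trapped part leaves the sums
        for n in range(1, size - 1):
            new[n + 1] += up * P[n]
            new[n - 1] += down * P[n]
        P = new
    return total
```

The truncation of the time sum is harmless as long as `s * steps` is large, because the neglected tail is bounded by a geometric series in $e^{-s}$.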
In order to find the Laplace transforms (2.106)–(2.108) it suffices to expand the generating function $G(s, p, k_\Upsilon)$ into a Taylor series with respect to the arguments $p$ and $k_\Upsilon$, cutting off the series at the second-order terms. However, in the case under consideration there are additional assumptions that substantially simplify the derivation of the desired results. First, only random walks with many steps are of interest because the hop duration $\tau_a$ has been chosen to be much less than the observation time interval $\tau$ of the analyzed Markovian process, $\tau_a \ll \tau$. This means that the inequality $s \ll 1$ holds. Second, the time interval $\tau$ is regarded as arbitrarily small. So only the components of moments (2.101)–(2.103) that scale as $\tau^{d}$ with an exponent $d$ not exceeding unity ($d \leq 1$) are to be taken into account. With respect to the generating function $G(s, p, k_\Upsilon)$ the latter assumption is converted to the statement that all the components of $G$ and of its derivatives calculated at the point $\{k_\Upsilon = 0, p = 0\}$ that scale with the argument $s$ as $s^{-d}$ with an exponent $d$ exceeding two ($d > 2$) can be ignored.

At the point $\{k_\Upsilon = 0, p = 0\}$, according to the definitions (2.119), (2.120), the functions $\Phi(p, k_\Upsilon)$ and $\phi(p, k_\Upsilon)$ take the values

\[ \Phi(0, 0) = 1 \quad \text{and} \quad \phi(0, 0) = 1 - \frac{\sigma_a}{M}, \tag{2.126} \]

where the coefficient $\sigma_a$ is considered to be a small parameter, which is justified in the limit $\tau_a \to 0$ as will be seen below. Therefore, under the adopted assumptions, (2.125) for the generating function can be rewritten as

\[ G(s, p, k_\Upsilon) = \frac{1}{s} + \frac{\Phi(p, k_\Upsilon) - 1}{s^2} + e^{-\omega(s, 0) n_0}\, \frac{\phi(p, k_\Upsilon) - \Phi(p, k_\Upsilon)}{s \left[ s + 1 - \phi\bigl( \omega(s, 0), 0 \bigr) \right]}. \tag{2.127} \]
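The expansions of the functions $\Phi(p, k_\Upsilon)$ and $\phi(p, k_\Upsilon)$ with respect to $p$ and $k_\Upsilon$ to the required order are

\[ \Phi(p, k_\Upsilon) = 1 - \frac{\epsilon_M\, p}{M} + \frac{p^2}{2M} + \frac{1}{M} \sum_{\alpha=1}^{M-1} \left[ i \epsilon_\alpha k_\alpha - \frac{1}{2} k_\alpha^2 \right] \tag{2.128} \]

and

\[ \phi(p, k_\Upsilon) = 1 - \frac{\sigma_a}{M} - \frac{p}{M} + \frac{p^2}{2M} + \frac{i g}{M} \sum_{\alpha=1}^{M-1} \epsilon^{\Upsilon}_{\alpha} k_\alpha - \frac{g}{2M} \sum_{\alpha,\beta=1}^{M-1} k_\alpha k_\beta \left[ \delta_{\alpha\beta} + \frac{g - 1}{M - 1}\, \epsilon^{\Upsilon}_{\alpha} \epsilon^{\Upsilon}_{\beta} \right]. \tag{2.129} \]

In deriving expression (2.129), equations (2.114) and (2.115) have been used. The substitution of the generating function written in the form (2.127) with approximations (2.128) and (2.129) into relations (2.106)–(2.108) yields

\[ R_a(s, n_0) = \frac{\sigma_a}{M}\, K_a(s, n_0), \tag{2.130} \]

\[ U_a^{M}(s, n_0) = \frac{1}{M}\, K_a(s, n_0) + \frac{\epsilon_M}{M s^2}, \tag{2.131} \]

\[ U_a^{\alpha}(s, n_0) = \frac{(g - 1)\, \epsilon^{\Upsilon}_{\alpha}}{M}\, K_a(s, n_0) + \frac{\epsilon_\alpha}{M s^2}, \tag{2.132} \]

\[ L_a^{\alpha\beta}(s, n_0) = \frac{g}{2M} \left[ \delta_{\alpha\beta} + \frac{(g - 1)}{M - 1}\, \epsilon^{\Upsilon}_{\alpha} \epsilon^{\Upsilon}_{\beta} \right] K_a(s, n_0) + \frac{\delta_{\alpha\beta}}{2 M s^2}, \tag{2.133} \]

\[ L_a^{MM}(s, n_0) = \frac{1}{2 M s^2}, \tag{2.134} \]

where the mean $L_a^{\alpha M}(s, n_0)$ is equal to zero. Here the function $K_a(s, n_0)$ is defined by the expression

\[ K_a(s, n_0) = \frac{\exp\left[ -\omega(s, 0)\, n_0 \right]}{s \left[ s + 1 - \phi\bigl( \omega(s, 0), 0 \bigr) \right]} \tag{2.135} \]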
and we have ignored some insignificant terms where appropriate. Previously we measured the time $t$ in units of the hop duration $\tau_a$ and the spatial coordinates $\{\zeta^{i}\}$ in units of the lattice spacings $\{a_i\}$ within the frame $b$. Now let us return to the initial units and deal with the corresponding spatial correlations. To do this, first, (2.131)–(2.134) should be multiplied by the spacings $a_M$ and $a_\alpha$, or their products $a_\alpha a_\beta$ and $a_M^2$, respectively. Second, the dimensionless Laplace argument $s$ has to be replaced by the product $s\tau_a$, because previously, when applying the discrete Laplace transformation, the replacement

\[ st \to s\tau_a \cdot \frac{t}{\tau_a} \]

was used implicitly. Third, for the further conversion of the discrete Laplace transformation into a continuous one by means of the replacement

\[ \tau_a \sum_{t/\tau_a = 0}^{\infty} (\ldots) \to \int_0^{\infty} dt\, (\ldots) \]

all the functions (2.130)–(2.134) must be multiplied by the time scale $\tau_a$. Looking ahead, we note that the absorption coefficient $\sigma_a$ has to scale with $\tau_a$ as $\sigma_a \propto \sqrt{\tau_a}$. As noted before, the coefficients $\{\epsilon_i\}$ also behave in this way. Therefore the observation time interval $\tau$ can be chosen to be so small that the solution of (2.123) becomes

\[ \omega(s\tau_a, 0) = \sqrt{2 M s \tau_a} \tag{2.136} \]

and function (2.135) matches a continuous Laplace transform

\[ \tau_a K_a(s\tau_a, n_0) = \sqrt{\frac{M}{2\tau_a}}\, K(s, \zeta_0^{M}) \tag{2.137} \]

given by the expression

\[ K(s, \zeta_0^{M}) = s^{-3/2} \exp\left[ -\zeta_0^{M} \sqrt{\frac{s}{D_M}} \right], \tag{2.138} \]

with $\zeta_0^{M} = a_M n_0$ being the distance from the node of the walker's initial position to the medium boundary $\Upsilon$ along the vector $b_M$.

Indeed, first, if we ignore the second term on the right-hand side of expansion (2.128), the solution of (2.123) for $s\tau_a \ll 1$ and $k_\Upsilon = 0$ is of the form (2.136). This is justified when $\omega \gg \epsilon_M$, which is equivalent to the condition $s \gg v_M^2 / D_M$ or $\tau \ll D_M / v_M^2$. Second, according to expansion (2.129) the denominator in expression (2.135) at the leading order is

\[ s\tau_a + 1 - \phi\bigl( \omega(s\tau_a, 0), 0 \bigr) = \frac{\omega(s\tau_a, 0)}{M} \]

provided $\omega(s\tau_a, 0) \gg \sigma_a$. Because $\sigma_a \sim \sqrt{\varepsilon \tau_a}$, where $\varepsilon$ is some constant, the latter inequality reduces to $s \gg \varepsilon$ and $\tau \ll 1/\varepsilon$. Since the time interval is arbitrarily small, the two inequalities can be adopted beforehand. So formulas (2.136) and (2.137) follow immediately for the spacing $a_M$ given by expression (2.74b).

2.6.4 Continuum Limit and a Boundary Model
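The small-$s$ approximation (2.136) from Section 2.6.3 can be probed numerically. The sketch below is illustrative only: it solves $\Phi(\omega, 0) = e^{s}$ exactly, using the form of $\Phi$ at $k_\Upsilon = 0$ from (2.119), and compares the root with $\sqrt{2Ms}$. Here `s` denotes the dimensionless product $s\tau_a$, and the function names are hypothetical.

```python
import math

def omega_exact(s, M, eps_M):
    """Solve Phi(omega, 0) = e^s, i.e. (cosh p - eps_M sinh p)/M + (M - 1)/M = e^s."""
    c = M * (math.exp(s) - 1.0) + 1.0      # cosh p - eps_M * sinh p = c
    # With y = e^p this is (1 - eps_M) y^2 - 2 c y + (1 + eps_M) = 0; take the root y > 1.
    y = (c + math.sqrt(c * c - (1.0 - eps_M ** 2))) / (1.0 - eps_M)
    return math.log(y)

def omega_approx(s, M):
    """Leading-order solution (2.136)."""
    return math.sqrt(2.0 * M * s)
```

For a drift parameter that is small compared with $\omega$ the two values agree closely, whereas for $\epsilon_M$ comparable to or larger than $\sqrt{2Ms}$ the exact root is closer to $\epsilon_M + \sqrt{\epsilon_M^2 + 2Ms}$, which illustrates why the condition $\omega \gg \epsilon_M$ is needed.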
To get the final results we analyze the obtained expressions in the limit $\tau_a \to 0$. The probability distribution $P_{t,\mathbf{n}}$ of the lattice random walks can be treated as a discrete implementation of the Green function $G(r, r_0, t)$ giving the probability density of finding a walker at the point $r$ at time $t$ provided it was initially at the point $r_0$. Using the Green function $G(r, r_0, t)$ the means under consideration are written as the following moments

\[ R(t, \zeta_0) = 1 - \int_{\mathbb{R}^{M}_{+}} dr\, G(r, r_0, t), \tag{2.139} \]

\[ U_b^{i}(t, \zeta_0) = \int_{\mathbb{R}^{M}_{+}} dr\, (\zeta^{i} - \zeta_0^{i})\, G(r, r_0, t), \tag{2.140} \]

\[ L_b^{ij}(t, \zeta_0) = \frac{1}{2} \int_{\mathbb{R}^{M}_{+}} dr\, (\zeta^{i} - \zeta_0^{i}) (\zeta^{j} - \zeta_0^{j})\, G(r, r_0, t), \tag{2.141} \]
and their Laplace transforms can be obtained from the quantities (2.130)–(2.134) in the manner described in the previous subsection. As a result we have

\[ R(s, \zeta_0) = D_{MM}^{-1/2}\, \sigma\, K(s, \zeta_0), \tag{2.142} \]

\[ U_b^{M}(s, \zeta_0) = D_{MM}^{-1/2}\, \omega\, K(s, \zeta_0) + \frac{v^{M}}{s^2}, \tag{2.143} \]

\[ U_b^{\alpha}(s, \zeta_0) = D_{MM}^{-1/2}\, l_\Upsilon v_\Upsilon^{\alpha}\, K(s, \zeta_0) + \frac{v^{\alpha}}{s^2}, \tag{2.144} \]

\[ L_b^{\alpha\beta}(s, \zeta_0) = \left[ D_{MM}^{-1/2}\, l_\Upsilon D_\alpha\, K(s, \zeta_0) + \frac{D_\alpha}{s^2} \right] \delta_{\alpha\beta}, \tag{2.145} \]

\[ L_b^{MM}(s, \zeta_0) = \frac{D_M}{s^2}; \tag{2.146} \]

the component $L_b^{\alpha M}(s, \zeta_0)$ is equal to zero. Here the following characteristics of the medium boundary, treated as an infinitely thin layer $\Upsilon$,

\[ \sigma := \sigma_a \sqrt{\frac{D_{MM}}{2 M \tau_a}}, \qquad l_\Upsilon := g \sqrt{\frac{M D_{MM} \tau_a}{2}} \tag{2.147} \]

have been introduced and expression (2.48) has been used. It should be noted that, according to (2.147), the number $g$ of elementary hops forming the long-distance jumps of walkers in the boundary layer $\Upsilon_0$ has to grow with $\tau_a$ as $\tau_a^{-1/2}$ in order to retain the effect of boundary fast transport in the limit $\tau_a \to 0$. As a result, the second term in the square brackets of (2.133) scales as $\sqrt{\tau_a}$ because, in turn, the coefficients $\{\epsilon^{\Upsilon}_{\alpha}\}$ vary with $\tau_a$ as $\sqrt{\tau_a}$. Therefore it vanishes in the limit $\tau_a \to 0$ and the symmetry of the second moments caused by the boundary fast diffusion is restored. The equality (see, e.g. [60])

\[ \int_0^{\infty} \frac{dt}{\sqrt{\pi t}} \exp\left[ -\frac{\zeta_0^2}{4 D_M t} - st \right] = \frac{1}{\sqrt{s}} \exp\left[ -\zeta_0 \sqrt{\frac{s}{D_M}} \right] \tag{2.148} \]

and the Laplace transform of integrals enable us to represent the inverse Laplace transform $K(t, \zeta_0)$ of function (2.138) in the integral form

\[ K(t, \zeta_0) = \sqrt{\frac{t}{\pi}} \int_0^1 \frac{dz}{\sqrt{z}} \exp\left[ -\frac{\zeta_0^2}{4 D_M t z} \right]. \tag{2.149} \]

Expression (2.149) together with formulas (2.142)–(2.146) proves Proposition 2.2.
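The Laplace-transform identity (2.148), on which the last step rests, is easy to confirm numerically. The following sketch is an illustration with arbitrarily chosen parameter values: it evaluates the left-hand side by a midpoint rule after the substitution $t = u^2$, which turns the integrand into the smooth function $(2/\sqrt{\pi}) \exp(-a/u^2 - s u^2)$ with $a = \zeta_0^2 / (4 D_M)$, and compares it with the right-hand side.

```python
import math

def lhs_2_148(s, zeta0, D_M, n=20000, cutoff=40.0):
    """Midpoint-rule value of int_0^inf dt/sqrt(pi t) * exp(-zeta0^2/(4 D_M t) - s t)."""
    a = zeta0 ** 2 / (4.0 * D_M)
    # With t = u^2 the integrand is (2/sqrt(pi)) * exp(-a/u^2 - s u^2);
    # it is negligible once s * u^2 exceeds `cutoff`.
    u_max = math.sqrt(cutoff / s)
    h = u_max / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += math.exp(-a / (u * u) - s * u * u)
    return 2.0 / math.sqrt(math.pi) * total * h

def rhs_2_148(s, zeta0, D_M):
    """Right-hand side of (2.148)."""
    return math.exp(-zeta0 * math.sqrt(s / D_M)) / math.sqrt(s)
```

Dividing the right-hand side by $s$ gives $K(s, \zeta_0)$ of (2.138), which in the time domain corresponds to integrating the kernel from $0$ to $t$, the step that leads to (2.149).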
2.7 Boundary Condition for the Backward Fokker–Planck Equation
The expressions (2.97)–(2.99) lead directly to the final results. First, they relate the singular kinetic coefficients to the diffusion tensor and the physical characteristics of the medium boundary. Second, they simplify the problem of canceling the singularities inside a thin layer $\Upsilon_\tau$ adjacent to the boundary $\Upsilon$, a layer which remains volumetric until the passage to the limit $\tau \to 0$ is implemented. Indeed, since all the terms of (2.97)–(2.99) depend on the coordinate $x^M$ in the normal direction via the same function $K(\tau, x^M)$, the singularities will be canceled at all the points of the layer $\Upsilon_\tau$ if they are canceled at the boundary $\Upsilon$. Also, the structure of the function $K(\tau, x^M)$, namely expression (2.96), justifies the previously adopted assumption that the characteristic thickness of the layer $\Upsilon_\tau$ scales with time as $\tau^{1/2}$.

As shown in Section 2.2.2, the boundary singularities that appear in the expansion of the Chapman–Kolmogorov equation leading to the backward Fokker–Planck equation will vanish if equality (2.18) holds. To make the results more transparent, let us first consider a sufficiently small neighborhood of a point $s$ belonging to the boundary $\Upsilon$, wherein the boundary is effectively a hyperplane, and choose the basis $e_\Upsilon \oplus n$ composed of its hyperplane basis $e_\Upsilon(s)$ and the unit normal $n(s)$ directed into the domain $Q$. Then, substituting expressions (2.97)–(2.99) into formula (2.18), we can immediately conclude that at the boundary point $s \in \Upsilon$ the Green function $G(r, t|r_0, t_0)$, with respect to the latter pair of its arguments with $r_0 \to s$, has to meet the condition

\[ \sum_{i=1}^{M} D^{iM}(s, t_0)\, \nabla_i^{s} G(r, t|s, t_0) = \sigma(s, t_0)\, G(r, t|s, t_0) - l_\Upsilon(s, t_0) \left[ \sum_{\alpha=1}^{M-1} v_\Upsilon^{\alpha}(s, t_0)\, \nabla_\alpha^{s} G(r, t|s, t_0) + \sum_{\alpha,\beta=1}^{M-1} D^{\alpha\beta}(s, t_0)\, \nabla_\alpha^{s} \nabla_\beta^{s} G(r, t|s, t_0) \right]. \tag{2.150} \]
We note that the two last terms in expression (2.150) describe the effective motion of the system inside the boundary $\Upsilon$ and have the form of the backward Fokker–Planck operator (2.14) with the diffusion tensor $D^{\alpha\beta}$ and drift velocity $v_\Upsilon^{\alpha}$ whose action is confined to the boundary $\Upsilon$. In order to rewrite this expression for an orthonormal basis of general orientation we make use of the definition of the boundary singularity vector $b(s, t)$, expression (2.16), and take into account (2.66) for the surface diffusion tensor. We then introduce the backward Fokker–Planck operator acting only within the hyperplane $\Upsilon$,

\[ \mathrm{FP}_B(s, t_0) \{\diamond\} = l_\Upsilon(s, t_0) \left[ \sum_{i,j=1}^{M} D^{ij}(s, t_0)\, \nabla_i^{s} \nabla_j^{s} \diamond + \sum_{i=1}^{M} v_\Upsilon^{i}(s, t_0)\, \nabla_i^{s} \diamond \right], \tag{2.151} \]

where, as before, the symbol $\diamond$ stands for the function on which this operator acts. Then in the vector invariant form the boundary condition for the backward Fokker–Planck equation is written as

\[ b(s, t_0) \cdot \nabla^{s} G(r, t|s, t_0) = \sigma(s, t_0)\, G(r, t|s, t_0) - \mathrm{FP}_B(s, t_0) \{ G(r, t|s, t_0) \}, \tag{2.152} \]

which is the desired formula.

In deriving expression (2.150) the boundary $\Upsilon$ was treated as a hyperplane, so the Euclidean space of dimension $(M - 1)$ and its local basis $e_\Upsilon$ were used. To write it in the general form, underlining the fact that the operator $\mathrm{FP}_B$ acts in this hyperplane, only the tensor notion of covariant derivatives is needed (see, e.g. [138]). In these terms the action of the operator $\mathrm{FP}_B$ on the Green function taken at the boundary $\Upsilon$ can be rewritten as

\[ \mathrm{FP}_B(s, t_0) \{ G(r, t|s, t_0) \} = l_\Upsilon(s, t_0) \left[ \sum_{\alpha=1}^{M-1} v_\Upsilon^{\alpha}(s, t_0)\, G(r, t|s, t_0)_{;\alpha} + \sum_{\alpha,\beta=1}^{M-1} D^{\alpha\beta}(s, t_0)\, G(r, t|s, t_0)_{;\alpha\beta} \right]. \tag{2.153} \]

In the given case it is simply another form of the corresponding term in expression (2.150). However, for a nonplanar boundary, (2.153) holds allowing for the boundary curvature, whereas expression (2.151) loses the curvature effect. The analysis of this effect goes far beyond the scope of the present chapter, so here we will just ignore it.
2.8 Boundary Condition for the Forward Fokker–Planck Equation
The boundary conditions are obtained in a similar way. First, we note that the integrand of expression (2.31) is similar to the boundary relation (2.18) up to the replacement of the test function $\phi(r)$ by the Green function $G(r, t|r_0, t_0)$ and the action of the operators on the argument $r$ instead of $r_0$. This analogy and the boundary condition (2.152) for the backward Fokker–Planck equation enable us to reduce equality (2.31) to the following:

\[ b(s, t) \cdot \nabla^{s} \phi(s) = \sigma(s, t)\, \phi(s) - \mathrm{FP}_B(s, t) \{ \phi(s) \} \tag{2.154} \]

for an arbitrary boundary point $s \in \Upsilon$. Since the boundary part of the backward Fokker–Planck operator acts only within the boundary $\Upsilon$, only the left-hand side of expression (2.154) contains the first derivative of the test function $\phi(s)$ in the direction normal to the boundary $\Upsilon$ at the point $s$. All the other terms are either the boundary value of the function $\phi(s)$ itself or its derivatives along the hyperplane $\Upsilon$. This justifies the previously adopted statement that, in the vicinity of $\Upsilon$, the test function $\phi(r)$ can have any boundary value $\phi(s)$. Then, noting that the left-hand side of the condition (2.32) is just the combination $b(s, t) \cdot \nabla^{s} G(s, t|r_0, t_0)$, the last relation converts expression (2.32) into

\[ \int_\Upsilon ds\, \phi(s) \sum_{i=1}^{M} \nu_i(s)\, J^{i} \{ G(s, t|r_0, t_0) \} = -\int_\Upsilon ds\, \phi(s)\, \sigma(s, t)\, G(s, t|r_0, t_0) + \int_\Upsilon ds\, \mathrm{FP}_B(s, t) \{ \phi(s) \}\, G(s, t|r_0, t_0). \tag{2.155} \]

Using the divergence integral theorem for surfaces, the last term in (2.155) is reduced to the form

\[ \int_\Upsilon ds\, \mathrm{FP}_B(s, t) \{ \phi(s) \}\, G(s, t|r_0, t_0) = \int_\Upsilon ds\, \phi(s)\, \mathrm{FP}_F(s, t) \{ G(s, t|r_0, t_0) \}. \]

Here $\mathrm{FP}_F$ is the boundary forward Fokker–Planck operator

\[ \mathrm{FP}_F(s, t) \{\diamond\} = \sum_{i=1}^{M} \nabla_i^{s} \left\{ \sum_{j=1}^{M} \nabla_j^{s} \left[ l_\Upsilon(s, t)\, D^{ij}(s, t)\, \diamond \right] - l_\Upsilon(s, t)\, v_\Upsilon^{i}(s, t)\, \diamond \right\}, \tag{2.156} \]
where again the symbol $\diamond$ stands for the function on which the operator acts. Since the test function $\phi(s)$ takes arbitrary values at the boundary $\Upsilon$, equality (2.155) must hold pointwise on the boundary $\Upsilon$, so

\[ n(s) \cdot J \{ G(s, t|r_0, t_0) \} = -\sigma(s, t)\, G(s, t|r_0, t_0) + \mathrm{FP}_F(s, t) \{ G(s, t|r_0, t_0) \}, \tag{2.157} \]

which is the desired boundary condition for the forward Fokker–Planck equation. As should be the case, the boundary condition (2.157) can be interpreted in terms of mass conservation: the component of the walker flux normal to the boundary $\Upsilon$ is determined by the surface rate of walker absorption and the rate of fast surface transport withdrawing the walkers from the given boundary point.
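As an illustration, consider how (2.152) and (2.157) reduce in the simplest geometry. The following worked special case is a sketch under stated assumptions, not part of the original derivation: the medium is the half-line $x \geq 0$ with boundary point $x = 0$ and inward normal $n = +1$, there is no surface transport ($l_\Upsilon = 0$, so both boundary operators vanish), the vector $b$ reduces to the single component $D$, and the probability flux is taken in the form $J\{G\} = vG - \partial_x (DG)$, which is the form consistent with the one-dimensional forward equation (2.158) of the exercises.

```latex
% Backward condition (2.152) with l_\Upsilon = 0 (assumed 1D reduction):
D(0, t_0)\, \frac{\partial G(x, t | x_0, t_0)}{\partial x_0} \bigg|_{x_0 = 0}
  = \sigma(0, t_0)\, G(x, t | 0, t_0).

% Forward condition (2.157) with l_\Upsilon = 0 and J\{G\} = vG - \partial_x(DG):
\left[ \frac{\partial (D G)}{\partial x} - v G \right]_{x = 0}
  = \sigma(0, t)\, G(0, t | x_0, t_0).

% Mass balance: integrating \partial_t G = -\partial_x J over x \ge 0,
% with J \to 0 as x \to \infty,
\frac{d}{dt} \int_0^{\infty} G(x, t | x_0, t_0)\, dx
  = J \{ G \} \big|_{x = 0}
  = -\sigma(0, t)\, G(0, t | x_0, t_0) \le 0.
```

For $\sigma = 0$ both conditions become the reflecting (zero-flux) conditions, and for $\sigma > 0$ the total probability decreases at the surface absorption rate, in agreement with the mass-conservation interpretation given above.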
2.9 Concluding Remarks
In this chapter, a technique for deriving the boundary conditions for the Fokker–Planck equations based on the Chapman–Kolmogorov integral equation has been developed. The purpose of the chapter (see also [128]) is summarized in Figure 2.7. Interest in this problem is partly due to the following. It is well known that the Fokker–Planck equations, both forward and backward, stem directly from the Chapman–Kolmogorov equation under two additional assumptions: the short time confinement of the corresponding Markovian process and the local homogeneity of the medium. There are quite rigorous methods of deriving them from the integral Chapman–Kolmogorov equation based on expanding the latter on short time scales in the possible limits.

[Figure 2.7: flowchart with the boxes "Integral Chapman–Kolmogorov equation (CKE)", "Expansion of CKE in two limits for intermediate time $t_* \to t_0 + 0$ or $t_* \to t - 0$", "Forward and backward Fokker–Planck equations (FPE)", "Standard approach: analogy between mass conservation law and forward FPE", and "Boundary conditions for forward FPE, backward FPE".]
Figure 2.7 Illustration of the main purpose of this chapter concerning derivation of boundary conditions for the forward as well as the backward Fokker–Planck equation.

By contrast, the corresponding boundary conditions are typically postulated by appealing to the physical meaning of the probability flux and the analogy between the forward Fokker–Planck equation and the mass conservation law. However, such simple arguments can fail when dealing with more complex Markovian processes like sub- or super-diffusion, for which the Fokker–Planck equations with fractional derivatives form the governing equations. In this case it would be appropriate to have a formal technique giving rise to the boundary conditions starting from the general description. However, until now, constructing such a technique has been a challenging problem. This was also the case with respect to the normal Markovian processes in continua.

This chapter has demonstrated how to do this for the normal Markovian processes. The key point is the fact that the medium boundary breaks the symmetry of random walks near it. As a result, the coefficients in the corresponding expansion series of the Chapman–Kolmogorov equation are endowed with anomalous features called the boundary singularities. Namely, on short time scales they behave as $(\delta t)^{-1/2}$. Since the probability distribution on macroscopic scales cannot contain such singularities, the corresponding cofactors in the expressions for the boundary singularities should be set equal to zero, leading to the required boundary conditions. In this way we have shown that the boundary conditions of the Fokker–Planck equations also follow directly from the Chapman–Kolmogorov equation supplemented by some rather general assumptions about the properties of the medium boundary. As must be the case, the boundary conditions obtained in this way satisfy mass conservation.
2.10 Exercises
E 2.1 Unbounded lattice random walk
Consider lattice random walks on a regular unbounded chain of nodes. Use the Law of Large Numbers and demonstrate that the probability of finding a walker at a given node has the Gaussian form when the number of walker steps tends to infinity.

E 2.2 Diagonal diffusion tensor I
Find the orthonormal coordinate system where the following two-dimensional diffusion tensor

\[ \{ D_{11} = 2, \quad D_{22} = 2, \quad D_{12} = D_{21} = 1 \} \]

has the diagonal form, and calculate its components.

E 2.3 Diagonal diffusion tensor II
For the half-space $\mathbb{R}^{2}_{+} = \{x_1, x_2 \,|\, x_1 \geq 0\}$, find the coordinate system (not orthogonal) where the diffusion tensor given in Exercise 2.2 has the diagonal form and one of the basis vectors is parallel to the boundary $x_1 = 0$.
E 2.4 Diffusion with one impermeable boundary
Let us consider the half-space $\mathbb{R}_{+} = \{x \,|\, x \geq 0\}$ with an impermeable (reflecting) boundary at $x = 0$ and unbiased random walks characterized by the constant diffusion coefficient $D$. Using the image method, show that the Green function $G(x, x_0|t)$ meets the boundary condition $\partial G / \partial x = 0$ at $x = 0$.

E 2.5 Drift–diffusion with one impermeable boundary
Find the stationary distribution $G_{st}(x) := \lim_{t \to \infty} G(x, x_0|t)$ of random walks in the half-space $\mathbb{R}_{+} = \{x \,|\, x \geq 0\}$ with the impermeable (reflecting) boundary at $x = 0$ which are characterized by a constant diffusion coefficient $D$ and a constant drift velocity $-v$ ($v > 0$).

E 2.6 Forward and backward Fokker–Planck equation
Let the Green function $G(x, t|x_0, t_0)$ be introduced as the solution of the forward Fokker–Planck equation for $t > t_0$ and $x, x_0 \in (x_1, x_2)$

\[ \frac{\partial G}{\partial t} = \frac{\partial}{\partial x} \left\{ \frac{\partial \left[ D(x, t)\, G \right]}{\partial x} - v(x, t)\, G \right\} \tag{2.158} \]

subject to the following initial and boundary conditions

\[ G \big|_{t = t_0} = \delta(x - x_0), \tag{2.159} \]

\[ \left\{ \frac{\partial [D(x, t)\, G]}{\partial x} - v(x, t)\, G \right\} \bigg|_{x = x_1} = \left\{ \frac{\partial [D(x, t)\, G]}{\partial x} - v(x, t)\, G \right\} \bigg|_{x = x_2} = 0. \tag{2.160} \]

Demonstrate directly that this Green function $G(x, t|x_0, t_0)$ also obeys the backward Fokker–Planck equation for $t_0 < t$ and $x, x_0 \in (x_1, x_2)$

\[ -\frac{\partial G}{\partial t_0} = D(x_0, t_0)\, \frac{\partial^2 G}{\partial x_0^2} + v(x_0, t_0)\, \frac{\partial G}{\partial x_0} \tag{2.161} \]

subject to the initial and boundary conditions

\[ G \big|_{t_0 = t} = \delta(x - x_0), \tag{2.162} \]

\[ \frac{\partial G}{\partial x_0} \bigg|_{x_0 = x_1} = \frac{\partial G}{\partial x_0} \bigg|_{x_0 = x_2} = 0. \tag{2.163} \]

Prove the same statement going in the opposite direction, that is, from the backward to the forward Fokker–Planck equation.
Part II Physics of Stochastic Processes
3 The Master Equation
3.1 Markovian Stochastic Processes
Stochastic processes occur in many physical descriptions of nature. Historically, the motion of a heavy particle in a fluid of light molecules was the first to be observed. The path of such a Brownian particle consists of stochastic displacements due to random collisions. This motion was studied by the Scottish botanist Robert Brown (1773–1858). In 1828 he discovered that the microscopically small particles into which the pollen of plants decays in an aqueous solution are in permanent irregular motion. This type of stochastic process is called Brownian motion and can be interpreted as a discrete random walk or as continuous diffusion. This topic is considered in textbooks on Statistical Physics [38, 85, 194, 206, 213] and also in many books and monographs on stochastic processes [6, 25, 55, 84, 121, 156, 160, 172, 201, 211, 234].

The intuitive way to describe irregular motion completely as a stochastic process is to measure the values $x_1, x_2, \ldots, x_n, \ldots$ at times $t_1, t_2, \ldots, t_n, \ldots$ of a time-dependent random variable $x(t)$ and to assume that a set of joint probability densities (JPD)
$$ p_n(x_1, t_1; x_2, t_2; \ldots; x_n, t_n), \qquad n = 1, 2, \ldots \tag{3.1} $$
exists. The same can be done by introducing a set of conditional probability densities (CPD)
$$ p_n(x_n, t_n \,|\, x_{n-1}, t_{n-1}; \ldots; x_1, t_1), \qquad n = 2, 3, \ldots \tag{3.2} $$
denoting that, at time $t_n$, the value $x_n$ can be found if at previous times $t_{n-1}, \ldots, t_1$ the respective values $x_{n-1}, \ldots, x_1$ were present. The relationship between JPD and CPD is given by
$$ p_{n+1}(x_1, t_1; \ldots; x_{n+1}, t_{n+1}) = p_{n+1}(x_{n+1}, t_{n+1} \,|\, x_n, t_n; \ldots; x_1, t_1)\; p_n(x_1, t_1; \ldots; x_n, t_n). \tag{3.3} $$
This stochastic description in terms of macroscopic variables will be called mesoscopic. Why? Typical systems encountered in everyday life like gases, liquids, solids, biological organisms and human or technical objects consist of about $10^{23}$ interacting units. The macroscopic properties of matter are usually the result of the collective behavior of a large number of atoms and molecules acting under the laws of quantum mechanics. To understand and control these collective macroscopic phenomena, a complete knowledge based upon the known fundamental laws of microscopic physics is useless, because the problem of interacting particles is far beyond the capabilities of the largest current, and probably future, computers. The understanding of complex macroscopic systems consisting of many basic particles (of the order of atomic sizes: $10^{-10}$ m) requires the formulation of new concepts. One method is a stochastic description taking into account statistical behavior. Since the macroscopic features are averages over time of a large number of microscopic interactions, a stochastic description links the microscopic and the macroscopic approaches together to give probabilistic results.

Monographs (recommended for physicists and engineers) devoted to stochastic concepts are mainly written as advanced courses on Statistical Physics, like that by Josef Honerkamp [85], and on Statistical Thermodynamics by Werner Ebeling and Igor M. Sokolov [38], or as well-known textbooks on Stochastic Processes, see e.g. [6] by Vadim S. Anishchenko et al., [55] by Crispin W. Gardiner, [84] by Josef Honerkamp and [234] by N. G. van Kampen.

Speaking about a stochastic process from the physical point of view, we always refer to stochastic variables (random events) changing in time. A realization of a stochastic process is a trajectory $x(t)$ as a function of time. Here we introduce a hierarchy of probability distributions
$$ p_n(x_1, t_1; x_2, t_2; \ldots; x_n, t_n)\, dx_1\, dx_2 \ldots dx_n, \qquad n = 1, 2, \ldots, \tag{3.4} $$
where $p_1(x_1, t_1)\,dx_1$ is known as a time-dependent probability of first order to measure the value $x_1$ (precisely, the value within $[x_1, x_1 + dx_1]$) at time $t_1$, $p_2(x_1, t_1; x_2, t_2)\,dx_1\,dx_2$ is the corresponding probability of second order, and so on up to the higher-order joint distributions $p_n(x_1, t_1; \ldots; x_n, t_n)\, dx_1\, dx_2 \ldots dx_n$ to find, for the stochastic variable, the value $x_1$ at time $t_1$, the value $x_2$ at time $t_2$ and so on. Only the knowledge of such an infinite hierarchy of joint probability densities $p_n(x_1, t_1; \ldots; x_n, t_n)$ (expression (3.1)) with $n = 1, 2, \ldots$ gives us the overall description of the stochastic process.

A stochastic process without any dynamics (like throwing a coin or any game of chance) is called a temporally uncorrelated process. It holds that
$$ p_2(x_1, t_1; x_2, t_2) = p_1(x_1, t_1)\, p_1(x_2, t_2), \tag{3.5} $$
if random variables at different times are mutually independent. This means that each realization of a random number at time $t_2$ does not depend on the previous time $t_1$, that is, the correlation at different times $t_1 \ne t_2$ is zero. Such a stochastic process, where the function $p_1(x_1, t_1) \equiv p_1(x)$ is the density of a normal distribution, is called Gaussian white noise. Gaussian white noise with its rapidly varying, highly irregular trajectory is an idealization of a realistic fluctuating quantity. Due to the factorization of all higher-order joint probability densities, the knowledge of the normalized distribution $p_1(x_1, t_1)$ describes the process completely.
Now we introduce dynamics via correlations between two different moments in time. This basic assumption enables us to define the Markov process, also called the Markovian process, completely by two quantities, namely the first-order probability density $p_1(x_1, t_1)$ and the second-order probability density $p_2(x_1, t_1; x_2, t_2)$, or equivalently by the probability density $p_1(x_1, t_1)$ and the conditional probability $p_2(x_2, t_2 \,|\, x_1, t_1)$ of finding the value $x_2$ at time $t_2$, given that its value at a previous time $t_1$ ($t_1 < t_2$) is $x_1$. In contrast to the uncorrelated processes (3.5) discussed before, Markov processes are characterized by the following temporal relationship
$$ p_2(x_1, t_1; x_2, t_2) = p_2(x_2, t_2 \,|\, x_1, t_1)\, p_1(x_1, t_1). \tag{3.6} $$
The Markov property
$$ p_n(x_n, t_n \,|\, x_{n-1}, t_{n-1}; \ldots; x_1, t_1) = p_2(x_n, t_n \,|\, x_{n-1}, t_{n-1}) \tag{3.7} $$
enables us to calculate all higher-order joint probabilities $p_n$ for $n > 2$. To determine the fundamental equation of stochastic processes of Markov type we start with the third-order distribution ($t_1 < t_2 < t_3$)
$$ p_3(x_1, t_1; x_2, t_2; x_3, t_3) = p_3(x_3, t_3 \,|\, x_2, t_2; x_1, t_1)\, p_2(x_1, t_1; x_2, t_2) = p_2(x_3, t_3 \,|\, x_2, t_2)\, p_2(x_2, t_2 \,|\, x_1, t_1)\, p_1(x_1, t_1) \tag{3.8} $$
and integrate this identity over $x_2$ and then divide both sides by $p_1(x_1, t_1)$. We get the following result for the conditional probabilities defining a Markov process
$$ p_2(x_3, t_3 \,|\, x_1, t_1) = \int p_2(x_3, t_3 \,|\, x_2, t_2)\, p_2(x_2, t_2 \,|\, x_1, t_1)\, dx_2, \tag{3.9} $$
called the Chapman–Kolmogorov equation.

As already stated, the Markov process is uniquely determined by the distribution $p_1(x, t)$ at time $t$ and the conditional probability $p_2(x', t' \,|\, x, t)$, also called the transition probability from $x$ at $t$ to $x'$ at a later time $t'$, which determine the whole hierarchy $p_n$ ($n \ge 3$) by the Markov property (3.7). These two functions cannot be chosen arbitrarily; they have to fulfill two consistency conditions, namely the Chapman–Kolmogorov equation (3.9)
$$ p_2(x', t' \,|\, x, t) = \int p_2(x', t' \,|\, x'', t'')\, p_2(x'', t'' \,|\, x, t)\, dx'', \tag{3.10} $$
and the Markov relationship (3.6)
$$ p_1(x', t') = \int p_2(x', t' \,|\, x, t)\, p_1(x, t)\, dx, \tag{3.11} $$
together with the normalization condition
$$ \int p_1(x, t)\, dx = 1. \tag{3.12} $$
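The consistency conditions can be checked numerically in a discrete analogue. The following is a minimal sketch, assuming a hypothetical three-state system observed at equidistant times; the matrix `T1` and the helper names are illustrative and not taken from the text. Multi-step transition probabilities are matrix products, so the Chapman–Kolmogorov equation states that the three-step result does not depend on where the intermediate time is placed.

```python
# Discrete analogue of the Chapman-Kolmogorov equation (3.9)/(3.10).
# T[i][j] is the probability of finding state i one time step after state j
# (each column sums to one), mirroring p2(x', t' | x, t).

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def propagate(t, p):
    """Apply the transition matrix t to the distribution p, cf. (3.11)."""
    return [sum(t[i][j] * p[j] for j in range(len(p))) for i in range(len(t))]


# Hypothetical one-step transition probabilities (assumption for illustration).
T1 = [[0.8, 0.3, 0.0],
      [0.2, 0.5, 0.4],
      [0.0, 0.2, 0.6]]

T2 = matmul(T1, T1)           # two-step transition probabilities
T3_early = matmul(T2, T1)     # three steps, intermediate time after step 1
T3_late = matmul(T1, T2)      # three steps, intermediate time after step 2

# Chapman-Kolmogorov: the result must not depend on the intermediate time.
ck_error = max(abs(T3_early[i][j] - T3_late[i][j])
               for i in range(3) for j in range(3))

p_start = [1.0, 0.0, 0.0]                 # initial distribution p1
p_later = propagate(T3_early, p_start)    # distribution three steps later
norm = sum(p_later)                       # normalization, cf. (3.12)
```

Here `ck_error` vanishes up to rounding and `norm` stays equal to one, which is the discrete counterpart of (3.10) to (3.12).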
The history in a Markov process, given by (3.7), is very short: only one time interval from $t$ to $t'$ is involved. If the trajectory has reached $x$ at time $t$, the past is forgotten, and it moves toward $x'$ at $t'$ with a probability depending on $x, t$ and $x', t'$ only. The entire information relevant for the future is thus contained in the present. A Markov process is a stochastic process for which the future depends on the past and the present only through the present. It has no memory [201].

In an ordinary case where the space of states $x$ is locally homogeneous it makes sense to transform the Chapman–Kolmogorov equation (3.9) into an equivalent differential equation in the short-time limit $t' = t + \tau$ with small $\tau$ tending to zero. The short-time behavior of the transition probability $p_2(\cdot \,|\, \cdot)$ should be written as a series expansion with respect to the time interval $\tau$ in the form
$$ p_2(x, t + \tau \,|\, x', t) = \big[1 - w(x, t)\,\tau\big]\, \delta(x - x') + \tau\, w(x, x', t) + O(\tau^2). \tag{3.13} $$
The new quantity $w(x, x', t) \ge 0$ is the transition rate, the probability per unit time, for a jump from $x'$ to $x \ne x'$ at time $t$. This transition rate $w$ multiplied by the time step $\tau$ gives the second term in the series expansion, describing transitions from another state $x'$ to $x$. The first term (with the delta function) is the probability that no transition takes place during the time interval $\tau$. Based on the normalization condition
$$ \int p_2(x, t + \tau \,|\, x', t)\, dx = 1 \tag{3.14} $$
it follows that
$$ w(x, t) = \int w(x', x, t)\, dx'. \tag{3.15} $$
The ansatz (3.13) implies that a realization of the random variable after any time interval τ retains the same value with a certain probability or attains a different value with the complementary probability. A typical trajectory x(t) consists of straight lines x(t) = constant, interrupted by jumps. An illustration is presented in Figure 3.1.
Figure 3.1 Sketch of the time evolution of a stochastic one-dimensional variable x(t) (axes: x versus t, with x0 marked on the x axis). The stochastic trajectory consists of pieces of deterministic motion interrupted by jumps.
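From the Chapman–Kolmogorov equation (3.9) together with (3.13) we get
$$ \begin{aligned} p_2(x, t + \tau \,|\, x'', t'') &= \int p_2(x, t + \tau \,|\, x', t)\, p_2(x', t \,|\, x'', t'')\, dx' \\ &= \int \big[1 - w(x, t)\,\tau\big]\, \delta(x - x')\, p_2(x', t \,|\, x'', t'')\, dx' \\ &\quad + \int \tau\, w(x, x', t)\, p_2(x', t \,|\, x'', t'')\, dx' + O(\tau^2). \end{aligned} \tag{3.16} $$
With (3.15) and after taking the short-time limit $\tau \to 0$ one obtains the following differential equation
$$ \frac{\partial}{\partial t}\, p_2(x, t \,|\, x'', t'') = \int w(x, x', t)\, p_2(x', t \,|\, x'', t'')\, dx' - \int w(x', x, t)\, p_2(x, t \,|\, x'', t'')\, dx'. \tag{3.17} $$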
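The step from (3.16) to (3.17) can be spelled out as follows. This is a sketch of the intermediate algebra under the ansatz (3.13), with the conditioning arguments written as $x'', t''$. Carrying out the integral over the delta function in (3.16) and multiplying (3.15) by $p_2(x, t \,|\, x'', t'')$ gives

```latex
\begin{align*}
p_2(x, t+\tau \,|\, x'', t'') - p_2(x, t \,|\, x'', t'')
  &= -\tau\, w(x,t)\, p_2(x, t \,|\, x'', t'')
     + \tau \int w(x, x', t)\, p_2(x', t \,|\, x'', t'')\, dx' + O(\tau^2), \\
w(x,t)\, p_2(x, t \,|\, x'', t'')
  &= \int w(x', x, t)\, p_2(x, t \,|\, x'', t'')\, dx' .
\end{align*}
```

Dividing the first line by $\tau$, letting $\tau \to 0$, and substituting the second line for the loss term yields the gain and loss integrals of (3.17).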
In order to rewrite the derived equation in a well-known form using physical concepts, we multiply it by $p_1(x'', t'')$ and integrate over $x''$. This gives the differential formulation of the Chapman–Kolmogorov equation
$$ \frac{\partial}{\partial t}\, p_1(x, t) = \int w(x, x', t)\, p_1(x', t)\, dx' - \int w(x', x, t)\, p_1(x, t)\, dx' \tag{3.18} $$
called the master equation in the (physical) literature. The name 'master equation' for the above probability balance equation is used in the sense that this differential expression is a general, fundamental or basic equation. For a process which is homogeneous in time the transition rates $w(x, x', t)$ are independent of time $t$ and therefore $w(x, x', t) = w(x, x')$. The short-time transition rates $w$ have to be known from the physical context, often as an intuitive ansatz, or have to be formulated based on a reasonable hypothesis or approximation. One of these is Fermi's golden rule originating from microscopic quantum theory [194]. With known transition rates $w$ and a given initial distribution $p_1(x, t = 0)$, the master equation (3.18) determines the evolution of the probability $p_1$ for all later times.

The master equation can be written in different ways. Besides the continuous formulation with one variable $x$, the generalization to the multidimensional as well as the discrete case is obvious. Replacing $p_1(x, t)$ by the high-dimensional probability vector $P(\mathbf{x}, t) \equiv P(x_1, x_2, \ldots, x_n, t)$ we may write the master equation in the discrete form (with summation instead of integration) as
$$ \frac{\partial}{\partial t}\, P(\mathbf{x}, t) = \sum_{\mathbf{x}' \ne \mathbf{x}} \big[ w(\mathbf{x}, \mathbf{x}')\, P(\mathbf{x}', t) - w(\mathbf{x}', \mathbf{x})\, P(\mathbf{x}, t) \big]. \tag{3.19} $$
Generalizations of the master equation have been developed by Honerkamp and Breuer [85], Montroll and West [165] and others. To perform stochastic simulations of complex systems like piecewise deterministic Markov processes, a stochastic formulation of fluid dynamics or reaction–diffusion equations, the so-called many-body or multivariate master equation has been introduced. When describing quantum random systems, the master equation is usually called the Pauli master equation.
3.2 The Master Equation
The basic equation of stochastic Markov processes, called the master equation or explicitly the forward master equation, is usually written as a gain–loss equation (3.18) for the probabilities $p(x, t)$ in the form
$$ \frac{\partial p(x, t)}{\partial t} = \int \big[ w(x, x')\, p(x', t) - w(x', x)\, p(x, t) \big]\, dx'. \tag{3.20} $$
This very general equation can be interpreted as a local balance for the probability densities, which have to fulfill the global normalization condition
$$ \int p(x, t)\, dx = 1 \tag{3.21} $$
at each time moment $t$, including the initial distribution $p(x, t = 0)$. The linear master equation (3.20) with known transition rates per unit time $w(x, x')$ is a so-called Markov evolution equation showing the relaxation from a chosen starting distribution $p(x, t = 0)$ to some final probability distribution $p(x, t \to \infty)$. The linearity of the master equation is based on the assumption that the underlying dynamics is Markovian. The transition probabilities $w$ do not depend on the history of reaching a state, so that the transition rates per unit time are indeed constants for a given temperature or total energy.

If the state space of the stochastic variable is discrete, often consisting of natural numbers within a finite range $0 \le n \le N$, the master equation for the time evolution of the probabilities $p(n, t)$ is written as
$$ \frac{dp(n, t)}{dt} = \sum_{n' \ne n} \big[ w(n, n')\, p(n', t) - w(n', n)\, p(n, t) \big], \tag{3.22} $$
where $w(n', n) \ge 0$ are rate constants for transitions from $n$ to another state $n' \ne n$. This set of equations governing the time evolution of $p(n, t)$ from the beginning at $t = 0$ to the long-time limit $t \to \infty$ has to be solved together with the initial probabilities $p(n, t = 0)$ ($n = 0, 1, 2, \ldots, N$) and the boundary conditions at $n = 0$ and $n = N$. The meaning of both terms is clear. The first (positive) term is the inflow current to state $n$ due to transitions from other states $n'$, and the second (negative) term is the outflow current due to opposite transitions from $n$ to $n'$.

Now let us define stationarity, sometimes called steady state, as a time-independent distribution $p^{\mathrm{st}}(n)$ by the condition $dp(n, t)/dt\,|_{p = p^{\mathrm{st}}} = 0$. Therefore the stationary master equation is given by
$$ 0 = \sum_{n' \ne n} \big[ w(n, n')\, p^{\mathrm{st}}(n') - w(n', n)\, p^{\mathrm{st}}(n) \big]. \tag{3.23} $$
This equation states the obvious fact that, in the stationary or steady-state regime, the sum of all transitions into any state $n$ must be balanced by the sum of all transitions from $n$ into other states $n'$. Based on the properties of the transition rates per unit time, the probabilities $p(n, t)$ tend in the long-time limit to the uniquely defined stationary distribution $p^{\mathrm{st}}(n)$, for which a constant probability flow is possible in open systems. This fundamental property of the master equation may be stated as
$$ \lim_{t \to \infty} p(n, t) = p^{\mathrm{st}}(n). \tag{3.24} $$
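The relaxation (3.24) can be illustrated by integrating (3.22) numerically. The following is a minimal sketch using the explicit Euler method with a fixed step, assuming a hypothetical three-state system; the rate table `w`, the step size and the function names are illustrative choices, not part of the text.

```python
# Explicit Euler integration of the discrete master equation (3.22).
# w[n][m] is the rate for a transition from state m to state n (m != n).

def master_rhs(w, p):
    """Gain minus loss for every state n, cf. (3.22)."""
    size = len(p)
    rhs = []
    for n in range(size):
        gain = sum(w[n][m] * p[m] for m in range(size) if m != n)
        loss = sum(w[m][n] for m in range(size) if m != n) * p[n]
        rhs.append(gain - loss)
    return rhs


def integrate(w, p0, dt, steps):
    """Advance the distribution p0 by `steps` Euler steps of size dt."""
    p = list(p0)
    for _ in range(steps):
        rhs = master_rhs(w, p)
        p = [pn + dt * r for pn, r in zip(p, rhs)]
    return p


# Hypothetical rates in arbitrary inverse time units (assumption).
w = [[0.0, 2.0, 1.0],
     [1.0, 0.0, 3.0],
     [0.5, 1.0, 0.0]]

# Two different initial distributions, both integrated up to t = 50.
p_a = integrate(w, [1.0, 0.0, 0.0], dt=0.01, steps=5000)
p_b = integrate(w, [0.0, 0.0, 1.0], dt=0.01, steps=5000)

# Residual of the stationary master equation (3.23) for the final state.
residual = max(abs(r) for r in master_rhs(w, p_a))
```

Both runs end at the same distribution, the residual of (3.23) is negligible, and the sum of the probabilities stays equal to one because every gain term of one state is a loss term of another.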
Now we consider the question of equilibrium in a system without external exchange. The condition of equilibrium in closed isolated systems is much stronger than the former condition of stationarity (3.23). Here we require, as an additional constraint, a balance between each pair of states $n$ and $n'$ separately. This so-called detailed balance relation is written for the equilibrium distribution $p^{\mathrm{eq}}(n)$ as
$$ 0 = w(n, n')\, p^{\mathrm{eq}}(n') - w(n', n)\, p^{\mathrm{eq}}(n). \tag{3.25} $$
It always holds for one-step processes in one-dimensional systems with closed boundaries, which are considered below. Of course, each equilibrium state is by definition also stationary. If the initial probability vector $p(n, t = 0)$ is strongly nonequilibrium, many probabilities $p(n, t)$ change rapidly as soon as the evolution starts (short-time regime), and then relax more slowly towards equilibrium (long-time behavior). The final state, called thermodynamic equilibrium, is reached in the limit $t \to \infty$.

Using linear algebra we want to solve the master equation analytically by an expansion in eigenfunctions. This method gives us a general solution for the time-dependent probability vector $p(n, t)$ expressed by eigenvectors and eigenvalues. In a first step we write the master equation, a set of coupled linear differential equations (3.22), in a compact matrix form
$$ \frac{d}{dt}\, P(t) = W\, P(t), \tag{3.26} $$
with a probability vector $P(t) = \{p(n, t) \,|\, n = 0, \ldots, N\}$ and an indecomposable asymmetric transition matrix $W = \{W(n, n') \,|\, n, n' = 0, \ldots, N\}$. The elements of the matrix are given by
$$ W(n, n') = w(n, n') - \delta_{n, n'} \sum_{m \ne n} w(m, n) \tag{3.27} $$
and obey the following two properties
$$ W(n, n') \ge 0 \quad \text{for } n \ne n', \tag{3.28} $$
$$ \sum_{n} W(n, n') = 0 \quad \text{for each } n'. \tag{3.29} $$
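The construction (3.27) and the properties (3.28) and (3.29) can be made concrete for a small system. The following is a minimal sketch, assuming a hypothetical one-step chain with three states; the rate values and the names `generator`, `p_eq` and `W_p_eq` are illustrative. The equilibrium distribution is built from the detailed balance relation (3.25) and then inserted into the matrix form of the stationary master equation.

```python
# Transition matrix W of (3.27) for a small system with detailed balance.
# w[n][m] is the rate for a jump from state m to state n (zero on the diagonal).

def generator(w):
    """Build W(n, n') from the rates w[n][n'] according to (3.27)."""
    size = len(w)
    W = [[w[n][m] if n != m else 0.0 for m in range(size)]
         for n in range(size)]
    for n in range(size):
        W[n][n] = -sum(w[m][n] for m in range(size) if m != n)
    return W


# Hypothetical one-step chain with states n = 0, 1, 2 (assumption).
w = [[0.0, 3.0, 0.0],
     [1.0, 0.0, 0.5],
     [0.0, 2.0, 0.0]]
W = generator(w)

# Property (3.29): every column of W sums to zero.
column_sums = [sum(W[n][m] for n in range(3)) for m in range(3)]

# Equilibrium distribution from detailed balance (3.25), then normalized.
ratio_10 = w[1][0] / w[0][1]          # p_eq(1) / p_eq(0)
ratio_21 = w[2][1] / w[1][2]          # p_eq(2) / p_eq(1)
weights = [1.0, ratio_10, ratio_10 * ratio_21]
total = sum(weights)
p_eq = [x / total for x in weights]

# The right-hand side of (3.26) vanishes for p_eq.
W_p_eq = [sum(W[n][m] * p_eq[m] for m in range(3)) for n in range(3)]
```

The vanishing column sums express conservation of total probability, and `W_p_eq` being zero shows that the equilibrium distribution is left unchanged by the evolution (3.26).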
As we know from matrix theory [53], there are a number of consequences based on both properties. In particular, the transition matrix $W$ has a single zero eigenvalue whose eigenvector is the equilibrium probability distribution. In general, the other eigenvalues can be complex, and they always have a negative real part. In our special case where the detailed balance (3.25) holds, all eigenvalues are real, as discussed further on. The solution $P(t)$ of the master equation (3.26) with given initial vector $P(0)$ may be written formally as
$$ P(t) = \exp(W t)\, P(0), \tag{3.30} $$
where $\exp(W t) = \sum_{m=0}^{\infty} (W t)^m / m!$, but this does not help us to find $P(t)$ explicitly. The familiar method is to make $W$ symmetric and thereby diagonalizable and then to construct the solution as a superposition of eigenvectors $u_\lambda$ related to (zero or negative) eigenvalues $\lambda$ in the form
$$ P(t) = \sum_{\lambda} c_\lambda\, u_\lambda\, e^{\lambda t} \tag{3.31} $$
with, until now, unknown coefficients $c_\lambda$. Using the condition of detailed balance (3.25) we transform the matrix $W = \{W(n, n')\}$ to a new symmetric transition matrix $\widetilde{W} = \{\widetilde{W}(n, n')\}$ with elements given by
$$ \widetilde{W}(n, n') \overset{\mathrm{def}}{=} W(n, n')\, \sqrt{\frac{p^{\mathrm{eq}}(n')}{p^{\mathrm{eq}}(n)}} = \widetilde{W}(n', n). \tag{3.32} $$
Both matrices $W$ and $\widetilde{W}$ have the same eigenvalues $\lambda_i$. Due to the symmetry of the matrix $\widetilde{W}$, all eigenvalues are real. They may be labeled in order of decreasing algebraic values, so that $\lambda_0 = 0$ and $\lambda_i < 0$ for $1 \le i \le N$. Denoting the normalized eigenvectors by $u_i$ and $\tilde{u}_i$ respectively, defined by the eigenvalue equations
$$ \sum_{n'} W(n, n')\, u_i(n') = \lambda_i\, u_i(n); \qquad W u_i = \lambda_i u_i \tag{3.33} $$
$$ \sum_{n'} \widetilde{W}(n, n')\, \tilde{u}_i(n') = \lambda_i\, \tilde{u}_i(n); \qquad \widetilde{W} \tilde{u}_i = \lambda_i \tilde{u}_i \tag{3.34} $$
and related to each other by the transformation $u_i(n) = \sqrt{p^{\mathrm{eq}}(n)}\; \tilde{u}_i(n)$, we are ready to construct the time-dependent solution of the fundamental master equation (3.26). According to the superposition formula (3.31), where the coefficients $c_\lambda$ are calculated from the initial condition $p(n, 0)$ at $t = 0$, the solution is then
$$ p(n, t) = \sqrt{p^{\mathrm{eq}}(n)}\; \sum_{i=0}^{N} \tilde{u}_i(n)\, e^{\lambda_i t} \left[ \sum_{m=0}^{N} \tilde{u}_i(m)\, \frac{p(m, 0)}{\sqrt{p^{\mathrm{eq}}(m)}} \right], \tag{3.35} $$
or
$$ p(n, t) = \sum_{i=0}^{N} u_i(n)\, e^{\lambda_i t} \left[ \sum_{m=0}^{N} u_i(m)\, \frac{p(m, 0)}{p^{\mathrm{eq}}(m)} \right]. \tag{3.36} $$
This solution plays a very important role in the stochastic description of Markov processes and can be found in different notations (e.g. as an integral representation) in many textbooks, see e.g. [84, 234]. As time increases to infinity ($t \to \infty$) only the term $i = 0$ in the solution survives and the probabilities tend to equilibrium, $P(t) \to P^{\mathrm{eq}}$, as can be seen by writing the solution as
$$ p(n, t) = p^{\mathrm{eq}}(n) + \sum_{i=1}^{N} u_i(n)\, e^{\lambda_i t} \left[ \sum_{m=0}^{N} u_i(m)\, \frac{p(m, 0)}{p^{\mathrm{eq}}(m)} \right]. \tag{3.37} $$
In the long-time limit all remaining modes $c_\lambda u_\lambda e^{\lambda t}$ decay exponentially. In the short-time regime, due to combinations of modes with different signs, transient states may grow and subsequently shrink as the probability current flows from the initial distribution $P(0)$ to the equilibrium $P^{\mathrm{eq}}$ via intermediate distributions $P(t)$ [164].

Master equation dynamics can be studied either by solving the basic equation analytically or with numerical methods, or by simulating the stochastic process as a large number of subsequent jumps from state to state with the given transition rates. Both methods have different advantages and disadvantages. One important point is the choice of the appropriate time interval, called the numerical integration step or, in simulation techniques, the waiting time. The step size required for a given accuracy is usually smaller when the time $t$ is closer to zero, and can be enlarged as $t$ grows. Therefore, a numerical algorithm with an adaptive step size should be used. Detailed information about algorithms used to generate a trajectory of a stochastic process described by a master equation can be found in textbooks by Honerkamp [84, 85] or others [140, 149].
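Both routes can be compared on the smallest possible example. The following is a minimal sketch, assuming a hypothetical two-state system with rate `a` for the jump 0 → 1 and rate `b` for the jump 1 → 0; these names, the seed and the number of runs are illustrative. For this system the mode expansion consists of the equilibrium term and a single decaying mode with eigenvalue −(a + b). The simulation draws exponentially distributed waiting times, a standard technique for jump processes with constant rates that is not spelled out in the text.

```python
# Two-state master equation: analytic mode solution versus jump simulation.
import math
import random


def simulate_state(rate_up, rate_down, t_end, rng):
    """Return the state (0 or 1) at time t_end of one trajectory started in 0.

    The waiting time in each state is exponentially distributed with the
    exit rate of that state; after each waiting time the state flips.
    """
    state, t = 0, 0.0
    while True:
        rate = rate_up if state == 0 else rate_down
        t += rng.expovariate(rate)
        if t > t_end:
            return state
        state = 1 - state


def occupation_exact(rate_up, rate_down, t):
    """p(1, t) for p(1, 0) = 0: equilibrium value plus one decaying mode."""
    total = rate_up + rate_down
    return rate_up / total * (1.0 - math.exp(-total * t))


rng = random.Random(12345)                   # fixed seed for reproducibility
a, b, t_end, runs = 1.0, 2.0, 0.5, 20000     # hypothetical parameters
hits = sum(simulate_state(a, b, t_end, rng) for _ in range(runs))
estimate = hits / runs                       # simulated p(1, t_end)
exact = occupation_exact(a, b, t_end)        # analytic p(1, t_end)
```

The simulated fraction agrees with the analytic value within the statistical error, which scales as the inverse square root of the number of runs.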
3.3 One-Step Processes in Finite Systems
We speak of a one-dimensional stochastic process if the state space is characterized by one variable only. Often this discrete variable is a particle number $n \ge 0$ describing the number of molecules in a box or the size of an aggregate. In chemical physics such aggregation phenomena, like the formation and/or decay of clusters, are of great interest. Examples are the formation of a crystal or glass when cooling a liquid, or the condensation of a droplet out of a supersaturated vapor. To determine the relaxation dynamics of clusters of size $n$ we take a particularly simple Markov process with transitions between neighboring states $n$ and $n' = n \pm 1$. This situation is called a one-step process. In biophysics, if the variable $n$ represents the number of living individuals of a particular species, the one-step process is often called a birth-and-death process and is used to investigate problems in population dynamics. The random walk with displacements to the left and right by one step is well known in physics [195], often plays a role as an introductory example, and has recently been revisited and applied to new fields like econophysics [119, 180, 206, 235]. The detailed balance relation (3.25) can be proven for the one-step process, so that in our case the analysis of Section 3.2 applies in full.
Figure 3.2 Illustration of a one-step process showing the up and down or forward and backward transition probabilities w+(n−1), w+(n), w−(n) and w−(n+1) between the neighboring states n−1, n and n+1.
Setting the transition rates $w(n, n-1) = w_+(n-1)$, $w(n, n+1) = w_-(n+1)$, and therefore also $w(n+1, n) = w_+(n)$, $w(n-1, n) = w_-(n)$ (see Figure 3.2), the forward master equation (3.22) reads
$$ \frac{dp(n, t)}{dt} = w_+(n-1)\, p(n-1, t) + w_-(n+1)\, p(n+1, t) - \big[ w_+(n) + w_-(n) \big]\, p(n, t). \tag{3.38} $$
In general, the forward and backward transition rates $w_+(n)$, $w_-(n)$ are nonlinear functions of the random variable $n$; the physical dimension of $w_\pm$ is one over time (s$^{-1}$). The master equation is always linear in the unknown probabilities $p(n, t)$ of being in state $n$ at time $t$; the nonlinearity refers only to the transition coefficients. The equation has to be completed by boundary conditions.

Further on we will pay attention to particles as aggregates in a closed box or vehicular jams on a circular road. Therefore, in finite systems, the range of the discrete variable $n$ is bounded between 0 and $N$ ($n = 0, 1, 2, \ldots, N$). The general one-step master equation (3.38) is valid for $n = 1, 2, \ldots, N-1$, but meaningless at the boundaries $n = 0$ and $n = N$. Therefore, we have to add two boundary equations as closure conditions
$$ \frac{dp(0, t)}{dt} = w_-(1)\, p(1, t) - w_+(0)\, p(0, t), \tag{3.39} $$
$$ \frac{dp(N, t)}{dt} = w_+(N-1)\, p(N-1, t) - w_-(N)\, p(N, t). \tag{3.40} $$
To solve the set of equations we rewrite (3.38) as a balance equation
$$ \frac{dp(n, t)}{dt} = J(n+1, t) - J(n, t) \tag{3.41} $$
with the probability current defined by
$$ J(n, t) = w_-(n)\, p(n, t) - w_+(n-1)\, p(n-1, t). \tag{3.42} $$
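The balance form can be checked directly against the gain–loss form. The following is a minimal sketch, assuming hypothetical rate tables `w_plus[n]` and `w_minus[n]` for n = 0, …, N with N = 4; setting the currents J(0) and J(N + 1) to zero is the assumption that reproduces the closure conditions (3.39) and (3.40).

```python
# One-step master equation written with probability currents, (3.41)-(3.42).

def currents(w_plus, w_minus, p):
    """Probability currents J(n) for n = 0, ..., N + 1 following (3.42).

    J(0) and J(N + 1) are set to zero, which reproduces the closure
    equations (3.39) and (3.40) of a finite system with closed boundaries.
    """
    big_n = len(p) - 1
    j = [0.0] * (big_n + 2)
    for n in range(1, big_n + 1):
        j[n] = w_minus[n] * p[n] - w_plus[n - 1] * p[n - 1]
    return j


def one_step_rhs(w_plus, w_minus, p):
    """dp(n)/dt = J(n + 1) - J(n), cf. (3.41)."""
    j = currents(w_plus, w_minus, p)
    return [j[n + 1] - j[n] for n in range(len(p))]


# Hypothetical rates for N = 4 (assumption); w_plus[4] and w_minus[0] are unused.
w_plus = [1.0, 0.8, 0.6, 0.4, 0.0]
w_minus = [0.0, 0.5, 1.0, 1.5, 2.0]
p = [0.2, 0.2, 0.2, 0.2, 0.2]

rhs = one_step_rhs(w_plus, w_minus, p)

# Interior state n = 2 evaluated from the gain-loss form (3.38).
direct = (w_plus[1] * p[1] + w_minus[3] * p[3]
          - (w_plus[2] + w_minus[2]) * p[2])
```

The sum of all entries of `rhs` vanishes because the currents telescope, so the total probability is conserved, and `rhs[2]` coincides with the value `direct` obtained from (3.38).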
In the stationary regime, remember (3.23), all flows (3.42) have to be independent of $n$ and therefore equal to a constant current of probability: $J(n+1) = J(n) = J$. In open systems the stationary solution is no longer unique; it depends on the current $J$. In finite systems with $n = 0, 1, 2, \ldots, N$ one finds a situation with zero flux $J = 0$, which corresponds to the steady state with a detailed balance relationship similar to (3.25). Therefore, the stationary distribution $p^{\mathrm{st}}(n)$ fulfills the recurrence relation
$$ p^{\mathrm{st}}(n) = \frac{w_+(n-1)}{w_-(n)}\; p^{\mathrm{st}}(n-1). \tag{3.43} $$
By applying the iteration successively we get the relation
$$ p^{\mathrm{st}}(n) = p^{\mathrm{st}}(0) \prod_{m=1}^{n} \frac{w_+(m-1)}{w_-(m)}, \tag{3.44} $$
which determines all probabilities $p^{\mathrm{st}}(n)$ ($n = 1, 2, \ldots, N$) in terms of the first unknown one, $p^{\mathrm{st}}(0)$. Taking into account the normalization condition
$$ \sum_{n=0}^{N} p^{\mathrm{st}}(n) = 1 \quad \text{or} \quad p^{\mathrm{st}}(0) + \sum_{n=1}^{N} p^{\mathrm{st}}(n) = 1 \tag{3.45} $$
the stationary probability distribution $p^{\mathrm{st}}(n)$ in finite systems is finally written as
$$ p^{\mathrm{st}}(n) = \begin{cases} \dfrac{\prod_{m=1}^{n} \dfrac{w_+(m-1)}{w_-(m)}}{1 + \sum_{k=1}^{N} \prod_{m=1}^{k} \dfrac{w_+(m-1)}{w_-(m)}}, & n = 1, 2, \ldots, N, \\[3ex] \dfrac{1}{1 + \sum_{k=1}^{N} \prod_{m=1}^{k} \dfrac{w_+(m-1)}{w_-(m)}}, & n = 0. \end{cases} \tag{3.46} $$
It is often convenient to write the stationary solution (3.44) in the exponential form
$$ p^{\mathrm{st}}(n) = p^{\mathrm{st}}(0)\, \exp\{-\Phi(n)\}, \tag{3.47} $$
where, in analogy to physical systems, the function
$$ \Phi(n) = \sum_{m=1}^{n} \ln \frac{w_-(m)}{w_+(m-1)} \tag{3.48} $$
is called the potential. An example of a double-well potential $\Phi(n)$ and the corresponding bistable stationary probability distribution is shown in Figure 3.3. As we can see, the minimum of the potential corresponds to the probability maximum and vice versa.

Figure 3.3 An example of a double-well potential Φ(n) (a) and the corresponding bistable probability distribution Pst(n) (b) depending on the stochastic variable n, plotted for 0 ≤ n ≤ 150.

The obtained result (3.46) based on the zero-flux relationship (3.43) is a unique solution for the stationary probability distribution in finite systems with closed boundaries. For an isolated system, the stationary solution of the master equation $p^{\mathrm{st}}$ is identical with the thermodynamic equilibrium $p^{\mathrm{eq}}$, where the detailed balance holds which, for one-step processes, reads
$$ w_-(n)\, p^{\mathrm{eq}}(n) = w_+(n-1)\, p^{\mathrm{eq}}(n-1). \tag{3.49} $$
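The product formula and the potential are straightforward to evaluate. The following is a minimal sketch, assuming hypothetical rates with a constant forward rate and a backward rate proportional to n on the range 0 ≤ n ≤ 10; the function names `stationary` and `potential` are illustrative.

```python
# Stationary distribution (3.44)-(3.46) and potential (3.48) of a closed
# one-step process with rates w_plus[n], w_minus[n] for n = 0, ..., N.
import math


def stationary(w_plus, w_minus):
    """Normalized stationary distribution from the recurrence (3.43)."""
    weights = [1.0]
    for n in range(1, len(w_plus)):
        weights.append(weights[-1] * w_plus[n - 1] / w_minus[n])
    total = sum(weights)
    return [x / total for x in weights]


def potential(w_plus, w_minus):
    """Potential Phi(n) of (3.48), with Phi(0) = 0."""
    phi = [0.0]
    for n in range(1, len(w_plus)):
        phi.append(phi[-1] + math.log(w_minus[n] / w_plus[n - 1]))
    return phi


# Hypothetical rates (assumption): constant growth, decay proportional to n.
N = 10
w_plus = [2.0] * (N + 1)
w_minus = [0.5 * n for n in range(N + 1)]

p_st = stationary(w_plus, w_minus)
phi = potential(w_plus, w_minus)

# Largest violation of the detailed balance relation (3.49).
balance_error = max(abs(w_minus[n] * p_st[n] - w_plus[n - 1] * p_st[n - 1])
                    for n in range(1, N + 1))
```

With these rates the result is a truncated Poisson-like distribution; `balance_error` vanishes up to rounding, and `p_st[n]` equals `p_st[0] * exp(-phi[n])` as stated in (3.47).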
The condition of detailed balance states a physical principle. If the distribution $p^{\mathrm{eq}}$ is known from equilibrium statistical mechanics and if one of the transition rates is also known (e.g. $w_+$ by a reasonable ansatz), then (3.49) provides an opportunity to formulate the opposite transition rate $w_-$ in a consistent way. By this procedure the nonequilibrium behavior is adequately described by a sequence of (quasi-)equilibrium states. The relaxation from any initial nonequilibrium distribution always tends to the known final equilibrium. In physical systems the equilibrium distribution is usually represented in an exponential form, see e.g. [167, 213],
$$ P^{\mathrm{eq}}(n) \propto \exp\{-\Omega(n)/(k_B T)\} \tag{3.50} $$
where $\Omega(n)$ is the thermodynamic potential depending on the stochastic variable $n$, $k_B$ is the Boltzmann constant, and $T$ is the temperature. Equation (3.50) is comparable with (3.47) where $\Phi(n) = \Omega(n)/(k_B T)$.

3.4 The First-Passage Time Problem
In many applications it is important to know the mean time during which the system finds its stable state by overcoming a potential barrier (cf. Figure 3.3) due to stochastic fluctuations. It is closely related to breakdown phenomena [26, 216]. In particular, in traffic engineering one speaks about the traffic breakdown probability, which is the transition rate as the inverse of the average breakdown time during which a spontaneous jamming (clustering) of cars appears in an initially homogeneous metastable traffic flow [110–112]. In a more general formulation, it is called the well-known first-passage problem. The problem is to find the average time during which a stochastic system reaches, for the first time, some given state if started from another state. This time is called the mean first-passage time.

For a mathematical formulation of the problem we first derive the backward master equation. Our starting point is the Chapman–Kolmogorov equation (3.10), written for discrete variables
$$ p(n, t \,|\, n', t') = \sum_{n''} p(n, t \,|\, n'', t'')\, p(n'', t'' \,|\, n', t'). \tag{3.51} $$
Here $p(n, t \,|\, n', t')$ represents the conditional probability that the system is in state $n$ at time $t$ if it was in state $n'$ at time moment $t'$, where $t' < t'' < t$. By setting $t'' = t' + \Delta t$ and taking into account the normalization condition (3.14)
$$ \sum_{n''} p(n'', t' + \Delta t \,|\, n', t') = 1, \tag{3.52} $$
using
$$ p(n, t \,|\, n', t' + \Delta t) = \sum_{n''} p(n, t \,|\, n', t' + \Delta t)\, p(n'', t' + \Delta t \,|\, n', t') \tag{3.53} $$
we obtain
$$ p(n, t \,|\, n', t' + \Delta t) - p(n, t \,|\, n', t') = \sum_{n''} p(n'', t' + \Delta t \,|\, n', t') \big[ p(n, t \,|\, n', t' + \Delta t) - p(n, t \,|\, n'', t' + \Delta t) \big]. \tag{3.54} $$
Dividing both sides of (3.54) by $\Delta t$ and taking the limit $\Delta t \to 0$, we arrive at the backward master equation
$$ \frac{\partial p(n, t \,|\, n', t')}{\partial t'} = \sum_{n''} w(n'', n', t') \big[ p(n, t \,|\, n', t') - p(n, t \,|\, n'', t') \big] \tag{3.55} $$
describing the evolution of the probabilities $p(n, t \,|\, n', t')$ with respect to the initial time $t'$. Here
$$ w(n'', n', t') = \lim_{\Delta t \to 0} \frac{p(n'', t' + \Delta t \,|\, n', t')}{\Delta t} \tag{3.56} $$
is the transition rate from state $n'$ to state $n''$ at time moment $t'$. An appropriate initial condition for (3.55) is
$$ p(n, t = 0 \,|\, n', t' = 0) = \delta_{n, n'} \tag{3.57} $$
stating that the system cannot be in two different states simultaneously. For one-step processes, assuming no explicit time dependence of the transition rates $w_+(n) = w(n+1, n, t)$ and $w_-(n) = w(n-1, n, t)$, the backward master equation (3.55) becomes
$$ \frac{\partial p(n, t \,|\, n', t')}{\partial t'} = w_+(n') \big[ p(n, t \,|\, n', t') - p(n, t \,|\, n'+1, t') \big] + w_-(n') \big[ p(n, t \,|\, n', t') - p(n, t \,|\, n'-1, t') \big]. \tag{3.58} $$
To study the first-passage problem, the backward master equation should be supplemented by suitable boundary conditions. Let us assume that the value of the stochastic variable $n$ belongs to the interval $a \le n \le b$ at the initial time $t = 0$. We consider a reflecting boundary at $n = a$, that is, $w_-(a) = 0$, which means that the system can never reach states with values $n < a$, and an absorbing boundary at $n = b$, that is, $w_-(b+1) = 0$, which means that the system never returns to $n \in [a; b]$ once it has left this interval. It is often convenient to associate $n$ with the position of a randomly walking particle, assuming that the particle is absorbed at $n = b + 1$. The question is: how long does it take, on average, until this absorption takes place? The quantity we have to calculate is the breakdown rate as the inverse of the mean first-passage time $T(n)$, starting from a certain position $n$ inside the interval $[a; b]$, to stick at $b + 1$.

Obviously, the reflecting boundary condition $w_-(a) = 0$ in (3.58) can be formally replaced by
$$ p(n, t \,|\, a - 1, t') = p(n, t \,|\, a, t'). \tag{3.59} $$
The absorbing boundary condition for the backward master equation can be written as
$$ p(n, t \,|\, b + 1, t') = 0, \tag{3.60} $$
which states that the transition from state $n' = b + 1$ to states $n \le b$ is forbidden. The probability $G(n, t)$ that at time $t$ the system still has not left the interval $[a; b]$ is given by
$$ G(n, t) = \sum_{n'=a}^{b} p(n', t \,|\, n, 0). \tag{3.61} $$
The function $G(n, t)$ obeys the equation
$$ -\frac{\partial}{\partial t}\, G(n, t) = w_+(n) \big[ G(n, t) - G(n+1, t) \big] + w_-(n) \big[ G(n, t) - G(n-1, t) \big] \tag{3.62} $$
and the boundary conditions
$$ G(a - 1, t) = G(a, t) \tag{3.63} $$
$$ G(b + 1, t) = 0, \tag{3.64} $$
as follows from (3.58)–(3.60) and according to (3.61), in view of the fact that the probability $p(n, t \,|\, n', t')$ is a function of the time difference $t - t'$, which means that the derivative with respect to $t'$ is the negative derivative with respect to $t$. According to the definition of $G(n, t)$ in (3.61), the probability of absorption within an infinitesimal time interval $[t; t + dt]$ is $G(n, t) - G(n, t + dt) = -(\partial G/\partial t)\, dt$, which means that the mean first-passage time $T(n)$ is
$$ T(n) = -\int_0^\infty t\, \frac{\partial G(n, t)}{\partial t}\, dt = \int_0^\infty G(n, t)\, dt, \tag{3.65} $$
where the latter identity is due to integration by parts. Taking into account (3.65), we obtain the equation for the mean first-passage time by integrating (3.62) and (3.63) to (3.64) over time. This yields the desired equation
$$ 1 = w_+(n) \big[ T(n) - T(n+1) \big] + w_-(n) \big[ T(n) - T(n-1) \big] \tag{3.66} $$
with the boundary conditions
$$ T(a - 1) = T(a) \tag{3.67} $$
$$ T(b + 1) = 0. \tag{3.68} $$
To solve (3.66), it is convenient to rewrite it in new variables as

$$ w_+(n) \, \phi(n) \left[ S(n) - S(n - 1) \right] = -1, \tag{3.69} $$

where φ(a) = 1 and

$$ \phi(n) = \prod_{m=a+1}^{n} \frac{w_-(m)}{w_+(m)}, \tag{3.70} $$

$$ S(n) = \frac{T(n + 1) - T(n)}{\phi(n)}. \tag{3.71} $$
Equation (3.70) holds for n ∈ [a + 1; b] and (3.71) is valid for n ∈ [a; b] with S(a − 1) = 0. From this we find immediately

$$ \phi(k) S(k) = -\phi(k) \sum_{m=a}^{k} \left[ w_+(m) \phi(m) \right]^{-1} = T(k + 1) - T(k). \tag{3.72} $$

Summing (3.72) from k = n to k = b and taking account of the boundary condition (3.68) yields the solution

$$ T(n) = \sum_{k=n}^{b} \phi(k) \sum_{m=a}^{k} \left[ w_+(m) \phi(m) \right]^{-1}. \tag{3.73} $$
It is convenient to express the solution (3.73) in terms of the stationary probability distribution pst(n) given by (3.44). It finally yields

$$ T(n) = \sum_{k=n}^{b} \left[ w_+(k) \, p^{\mathrm{st}}(k) \right]^{-1} \sum_{m=a}^{k} p^{\mathrm{st}}(m), \tag{3.74} $$

which allows us to calculate analytically the mean first-passage time T(n) to reach state b + 1 for the first time when starting at position n, taking into account the forward rate w+ and the stationary distribution pst defined by (3.44). The mean breakdown rate is given by 1/T(n).

An example of the mean first-passage time, corresponding to the stationary probability distribution in Figure 3.3, is shown in Figure 3.4. The mean first-passage time increases rapidly with the boundary value b up to b ≈ 15 due to the necessity to climb up the first hill of the potential in Figure 3.3. Then the increase becomes slower and the ln T vs b curve almost has a plateau from b ≈ 30 to b ≈ 85. This corresponds to the decreasing part of the potential, where the system can pass these states easily in a relatively short time. The mean first-passage time again increases dramatically for larger b values due to the growth of the potential, and in our example it reaches the value of $2.4 \times 10^6$ dimensionless time units at b = 150. This means that the states with large values of the state variable n will practically never be reached. For comparison, we have $T = 1.05 \times 10^3$ time units at b = 50.

Figure 3.4 Logarithm of the mean first-passage time, ln T, as a function of the state variable b. On average the system needs time T to move from the initial state at n = 0 to the absorbing state at n = b + 1, corresponding to the stationary probability distribution in Figure 3.3.
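The double sum (3.73) is straightforward to evaluate numerically. The following Python sketch is an illustration only: the function name and the unit rates in the example are assumptions made here, not values taken from Figure 3.3. It builds φ(n) from (3.70) and accumulates T(n) backwards from the boundary condition T(b + 1) = 0.

```python
def mean_first_passage_time(w_plus, w_minus, a, b):
    """Evaluate T(n) for n = a..b+1 from the double sum (3.73).

    w_plus(n) and w_minus(n) are callables returning the one-step rates;
    w_minus(n) must be positive for a < n <= b so that phi(n) > 0.
    """
    # phi(n) from (3.70), with phi(a) = 1
    phi = {a: 1.0}
    for n in range(a + 1, b + 1):
        phi[n] = phi[n - 1] * w_minus(n) / w_plus(n)

    # summand[k] = phi(k) * sum_{m=a}^{k} 1 / (w_plus(m) * phi(m))
    summand = {}
    inner = 0.0
    for k in range(a, b + 1):
        inner += 1.0 / (w_plus(k) * phi[k])
        summand[k] = phi[k] * inner

    # T(n) = sum_{k=n}^{b} summand[k], built backwards from T(b + 1) = 0
    T = {b + 1: 0.0}
    for n in range(b, a - 1, -1):
        T[n] = T[n + 1] + summand[n]
    return T


# Symmetric walk with unit rates on 0..4, where T(n) = sum_{k=n}^{4} (k + 1)
T = mean_first_passage_time(lambda n: 1.0, lambda n: 1.0, a=0, b=4)
print(T[0], T[4])
```

For the symmetric walk φ(n) = 1, so the inner sum in (3.73) is simply k − a + 1, which makes the result easy to check by hand.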
3.5 The Poisson Process in Closed and Open Systems
Until now we have considered Markov processes in a rather general framework without precisely defining the states of the system or the rates for the transitions between these states. The particular case where the states are characterized by a single particle number n and the rates by a one-step backward transition w−(n) only is called a decay process. A schematic realization of such a stochastic process is shown in Figure 3.5, illustrating the dissolution or shrinkage of a bound state of n members.

Figure 3.5 Sketch of the realization of a stochastic decay process of Poisson type with shrinking particle number n starting from n = n0 at t = 0.

In a first step we present an example of traffic flow considered as a Markov process. We want to investigate the dissolution of a queue of cars standing in front of traffic lights. When the lights switch to green, the first car starts to move. After a certain time interval (waiting time τ = constant > 0) the next vehicle accelerates to pass the stop line, and so on. In our model we consider the decay of traffic congestion without taking into account any influence of external factors, like ramps or intersections, on the driver's behavior. The stochastic variable n(t) is the number of cars which are bound in the jam at time t. A queue or platoon of n vehicles is also called a car cluster of size n in agreement with the concept of aggregation [141, 144, 203] and traffic flow [146–151]. When the initial jam size is finite, given by the value n(t = 0) = n0, as shown in Figure 3.5, the trajectory n(t) = n0, n0 − 1, ..., 2, 1, 0 consists of unit jumps at random times. The jam starting with size n0 becomes smaller and smaller and dissolves completely. Figure 3.6 shows three different stochastic trajectories to illustrate car cluster dissolution. This one-step stochastic process is a pure death process, sometimes called a Poisson process. Defining p(n, t) as the probability of finding a jam of size n at time t, the master equation for the dissolution process reads

$$ \frac{\partial}{\partial t} p(n, t) = w_-(n + 1) \, p(n + 1, t) - w_-(n) \, p(n, t), \tag{3.75} $$

where the decay rate per unit time is assumed to be

$$ w(n', n) = w(n - 1, n) \equiv w_-(n) = \frac{1}{\tau}. \tag{3.76} $$
In this approximation the experimentally known waiting time constant τ is a given control parameter in our escape model. It is the reaction time of a driver, usually about 1.5 or 2 seconds, to escape from the jam when the road in front of his car becomes free. Therefore, the transition rate (3.76) is a constant w− = 1/τ independent of the jam size n.
Figure 3.6 Three different stochastic trajectories showing the dissolution of a car queue with the initial length (size) n0 = 50, that is, the cluster size n vs dimensionless time t/τ. The theoretical average value is shown by a smooth solid line.
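Trajectories of this kind can be generated with a few lines of code. The sketch below is a minimal illustration and not the authors' simulation program: it assumes that a constant rate w− = 1/τ means exponentially distributed waiting times with mean τ between successive departures, and the function name, seed and number of runs are arbitrary choices made here.

```python
import random


def simulate_dissolution(n0, tau, rng):
    """Return the (time, cluster size) points of one stochastic trajectory."""
    t, n = 0.0, n0
    path = [(t, n)]
    while n > 0:
        # constant rate w_- = 1/tau: exponential waiting time with mean tau
        t += rng.expovariate(1.0 / tau)
        n -= 1
        path.append((t, n))
    return path


rng = random.Random(1)  # fixed seed, chosen arbitrarily for reproducibility
runs = [simulate_dissolution(50, 1.0, rng) for _ in range(2000)]

# The average total dissolution time is the sum of n0 mean waiting times
mean_end = sum(path[-1][0] for path in runs) / len(runs)
print(mean_end)
```

With n0 = 50 and τ = 1 the printed average should lie close to 50, the sum of fifty mean waiting times.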
For the described process of jam shrinkage (n0 ≥ n ≥ 0), starting with cluster size n = n0 and ending with n = 0, we thus obtain the following master equation including boundary conditions (compare (3.38)–(3.40))

$$ \frac{\partial}{\partial t} p(n_0, t) = -\frac{1}{\tau} \, p(n_0, t), \tag{3.77} $$

$$ \frac{\partial}{\partial t} p(n, t) = \frac{1}{\tau} \left[ p(n + 1, t) - p(n, t) \right], \qquad n_0 - 1 \ge n > 0, \tag{3.78} $$

$$ \frac{\partial}{\partial t} p(0, t) = \frac{1}{\tau} \, p(1, t) \tag{3.79} $$
and the initial probability distribution $p(n, t = 0) = \delta_{n,n_0}$. The Kronecker delta means that at the beginning the vehicular queue consists of exactly n0 cars. In order to find the explicit expression for the probability distribution p(n, t) we have to solve the set of equations (3.77)–(3.79). This can be done analytically, starting with the first equation, which gives the exponential decay p(n0, t) = exp(−t/τ), inserting this solution into the next equation for p(n0 − 1, t), solving it, and continuing iteratively down to p(0, t). The general solution for the probability p(n, t) of observing a car cluster of size n at time t is

$$ p(n, t) = \frac{(t/\tau)^{n_0 - n}}{(n_0 - n)!} \, e^{-t/\tau}, \qquad 0 < n \le n_0, \tag{3.80} $$

$$ p(0, t) = 1 - \sum_{m=0}^{n_0 - 1} \frac{(t/\tau)^m}{m!} \, e^{-t/\tau}. \tag{3.81} $$
As already mentioned (3.45), the probabilities are always normalized to unity, which can be proven by inserting (3.80) and (3.81) into the sum $\sum_{n=0}^{n_0} p(n, t)$, which gives one. The time evolution of the probability p(n, t) has been calculated from (3.80) and (3.81) for an initial queue length n0 = 50. The result is shown in Figure 3.7 and compared to numerical Monte Carlo simulation experiments. The average or expectation value ⟨n⟩ of the cluster size n is given by

$$ \langle n \rangle (t) \equiv \sum_{n=0}^{n_0} n \, p(n, t) = \sum_{n=1}^{n_0} n \, p(n, t) \tag{3.82} $$

and can be calculated using the known probabilities (3.80) to get the exact result

$$ \langle n \rangle (t) = n_0 \, Q(n_0 - 1, t) - \frac{t}{\tau} \, Q(n_0 - 2, t), \tag{3.83} $$

where Q(n, t) is an abbreviation called the Poisson term

$$ Q(n, t) := e^{-t/\tau} \sum_{m=0}^{n} \frac{(t/\tau)^m}{m!}. \tag{3.84} $$
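The variance or second central moment ⟨(Δn)²⟩(t), which measures the fluctuations, is given by

$$ \langle (\Delta n)^2 \rangle = \langle (n - \langle n \rangle)^2 \rangle = \langle n^2 \rangle - \langle n \rangle^2 \tag{3.85} $$

and can also be calculated as follows:

$$ \langle (\Delta n)^2 \rangle (t) = n_0 \left[ n_0 \, Q(n_0 - 1, t) - \frac{2t}{\tau} \, Q(n_0 - 2, t) \right] \left[ 1 - Q(n_0 - 1, t) \right] + \left( \frac{t}{\tau} \right)^2 \left[ Q(n_0 - 3, t) - Q^2(n_0 - 2, t) \right] + \frac{t}{\tau} \, Q(n_0 - 2, t). \tag{3.86} $$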
In the approximation where we set Q(n, t) (3.84) to one, which is justified as long as t/τ stays well below n0, the mean value (3.83) reduces to a function decreasing linearly in time,

$$ \langle n \rangle (t) \approx n_0 - \frac{t}{\tau}, \tag{3.87} $$

whereas the variance (3.86) reduces to a linearly increasing behavior

$$ \langle (\Delta n)^2 \rangle (t) \approx \frac{t}{\tau}. \tag{3.88} $$

In the case of the linear mean value approximation (3.87) the time required for the jam to dissolve totally is given by

$$ t_{\mathrm{end}} = n_0 \tau. \tag{3.89} $$
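The closed-form moments can be cross-checked against direct sums over the distribution. The following Python sketch is only an illustration; the function names are chosen here and the dimensionless time x = t/τ is used as the argument. It evaluates (3.83) and (3.86) and compares them with the sums over (3.80) and (3.81) as well as with the linear approximations (3.87) and (3.88).

```python
import math


def poisson_term(n, x):
    """Q(n, t) of (3.84) with x = t / tau; an empty sum (n < 0) gives zero."""
    return math.exp(-x) * sum(x**m / math.factorial(m) for m in range(n + 1))


def p(n, x, n0):
    """Probabilities (3.80) and (3.81) with x = t / tau."""
    if n == 0:
        return 1.0 - poisson_term(n0 - 1, x)
    return x**(n0 - n) / math.factorial(n0 - n) * math.exp(-x)


def mean_exact(x, n0):
    """Mean value (3.83)."""
    return n0 * poisson_term(n0 - 1, x) - x * poisson_term(n0 - 2, x)


def variance_exact(x, n0):
    """Variance (3.86)."""
    q1, q2, q3 = (poisson_term(n0 - k, x) for k in (1, 2, 3))
    return (n0 * (n0 * q1 - 2 * x * q2) * (1 - q1)
            + x**2 * (q3 - q2**2) + x * q2)


n0 = 50
for x in (5.0, 25.0, 55.0):
    mean_sum = sum(n * p(n, x, n0) for n in range(n0 + 1))
    var_sum = sum(n * n * p(n, x, n0) for n in range(n0 + 1)) - mean_sum**2
    # exact formula, direct sum, and linear approximation side by side
    print(x, mean_exact(x, n0), mean_sum, n0 - x)
    print(x, variance_exact(x, n0), var_sum, x)
```

At t/τ = 55 the linear approximation (3.87) already gives a negative mean, which illustrates why it cannot describe the final stage of the dissolution.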
The exact result (3.83) and the linear approximation (3.87) for the mean value as functions of time are shown in Figure 3.8 by solid and dashed lines, respectively. Figure 3.9 shows the same plots for the variance (3.86) and its linearization (3.88). Equations (3.87) and (3.88), however, do not describe the final stage of dissolution of any finite car cluster. In this case, taking the limit t → ∞ in the time-dependent results (3.80) and (3.81), we have

$$ \lim_{t \to \infty} p(n, t) = \delta_{n,0}. \tag{3.90} $$

Figure 3.7 Probability distribution P(n, t) at three different times (from the top to the bottom) t/τ = 5, 25, and 55 with the initial condition $P(n, 0) = \delta_{n,50}$. Solid lines show the analytical solution, triangles indicate Monte Carlo results obtained by simulation of 5000 stochastic trajectories. Note that the diagrams have different scales along the probability axis.

Figure 3.8 The mean value ⟨n⟩ of the cluster size depending on the dimensionless time t/τ. The initial size of the cluster is n0 = 50. The exact result is shown by a solid line and the linear approximation by a dashed line.

Figure 3.9 The variance ⟨(Δn)²⟩ depending on the dimensionless time t/τ. The initial size of the cluster is n0 = 50. The exact result is shown by a solid line and the linear approximation by a dashed line.
If we do not consider the final stage of dissolution of a large cluster, that is, if t is considerably smaller than tend (3.89), then the probability p(0, t) for the cluster to be completely dissolved is very small. This allows us to obtain the correct results for n > 0 by the following alternative method.
Let us define the generating function G(z, t) by

$$ G(z, t) := \sum_{n} z^n \, p(n, t). \tag{3.91} $$

In the situation considered here, the term p(0, t) in this sum is negligible, so that the summation may start from n = 1 instead of n = 0. The initial condition corresponding to $p(n, 0) = \delta_{n,n_0}$ is represented by

$$ G(z, 0) = z^{n_0}. \tag{3.92} $$
The equation for the generating function is obtained by multiplying both sides of the master equation (3.78) by $z^n$ and then summing over n. This yields

$$ \frac{\partial}{\partial t} G(z, t) = \frac{1}{\tau} \left( \frac{1}{z} - 1 \right) G(z, t). \tag{3.93} $$

The solution of the differential equation (3.93) satisfying the initial condition (3.92) is given by

$$ G(z, t) = z^{n_0} \exp \left[ \frac{t}{\tau} \left( \frac{1}{z} - 1 \right) \right]. \tag{3.94} $$

The previous result (3.80) for p(n, t) at n ≥ 1 is obtained from this equation by comparing it with (3.91) after expanding the exponential in powers of 1/z. Starting from (3.94) in the form

$$ G(z, t) = z^{n_0} \, e^{-t/\tau} \exp \left( \frac{t}{\tau} \frac{1}{z} \right) \tag{3.95} $$

the power series is written as follows:

$$ G(z, t) = \sum_{n} z^n \, p(n, t) = z^{n_0} \, e^{-t/\tau} \sum_{m} \frac{1}{m!} \left( \frac{t}{\tau z} \right)^m \tag{3.96} $$

$$ = e^{-t/\tau} \sum_{m} \frac{1}{m!} \left( \frac{t}{\tau} \right)^m z^{n_0 - m} \tag{3.97} $$

$$ = e^{-t/\tau} \sum_{n} \frac{1}{(n_0 - n)!} \left( \frac{t}{\tau} \right)^{n_0 - n} z^n, \tag{3.98} $$

and therefore, by comparing terms of the same order in z, we get the Poisson distribution (3.80)

$$ p(n, t) = \frac{(t/\tau)^{n_0 - n}}{(n_0 - n)!} \, e^{-t/\tau}. \tag{3.99} $$
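As a quick numerical check of this result, one can compare the closed form (3.94) with the sum of $z^n p(n, t)$ over n ≥ 1. The sketch below uses illustrative values chosen here (n0 = 50, t/τ = 5, z = 0.9), for which t is far below t_end, so the neglected terms with n ≤ 0 should be vanishingly small.

```python
import math


def p(n, x, n0):
    """Poisson form (3.99) with x = t / tau, used here for n >= 1."""
    return x**(n0 - n) / math.factorial(n0 - n) * math.exp(-x)


def G_closed(z, x, n0):
    """Closed form (3.94) of the generating function."""
    return z**n0 * math.exp(x * (1.0 / z - 1.0))


n0, x, z = 50, 5.0, 0.9  # illustrative values with t well below t_end
G_series = sum(z**n * p(n, x, n0) for n in range(1, n0 + 1))
print(G_series, G_closed(z, x, n0))
```

The two numbers should agree to within rounding error, which reflects the fact that the generating function method is accurate as long as p(0, t) is negligible.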
In order to depict the process of jam shrinkage and to illustrate the developed formalism for the probabilistic description we refer to the graphics Figures 3.6–3.9 shown previously. These are based on numerical calculations simulating the stochastic process n(t) (stochastic trajectories) and illustrating the time-dependent probability distributions p(n, t) and related quantities.
The simple model discussed above can be improved to describe the dissolution of a vehicle queue at a signalized road intersection taking into account the car dynamics of the starting behavior when the red traffic light is switched to green. The quantity we are interested in is a modified detachment probability (3.76) which now depends on the cluster size n. For a long queue the detachment rate w− (n) has a constant value 1/τ consistent with (3.76). However, due to the time spent in acceleration of the first cars and movement towards the stop line, the detachment rate changes for smaller queues.
3.6 The Two-Level System
In physics one often meets systems which are characterized by two distinct maxima of the probability distribution in the space of all possible states. Such bistable systems can be described by just two states or two levels + and −, corresponding to these two maxima. The Ising model of a single spin, fluctuating in an external field, can be considered as a toy example. The spin has two states: spin up or along the field direction (+ level), and spin down or opposite to the field direction (− level). It can flip from one state to another and vice versa with certain probabilities per time unit or transition rate, which depend on temperature and interaction strength with the external field. The Ising model has had an enormous impact on modern physics in general and statistical physics in particular, but also on other areas of science, including biology and neuroscience, economics and sociology among others [5, 33, 161, 220]. The Ising model of interacting spins is in fact a nontrivial many-particle system, although in the mean-field approximation it still fits into the simple picture described above. In this case the spin fluctuates in the mean field which is a superposition of the external field and the mean field created by the surrounding spins. In the vicinity of the phase transition or critical point, where the spontaneous ordering of spins occurs in zero external field, the spin system exhibits huge correlated fluctuations, the so-called critical fluctuations, which are not properly captured by the mean-field description. In this case the critical behavior of the Ising model is nontrivial and common for a certain class of models and real systems in accordance with the universality hypothesis known in the theory of critical phenomena [5, 137, 253]. 
Although exact solutions of one-dimensional and two-dimensional Ising models are well known [15, 161, 177], the critical behavior in the three-dimensional case has been a challenging problem up to the present, and several approaches including renormalization group and numerical methods (see [181] for a review) and, more recently, an alternative analytical method [94] have been proposed. Returning to the two-level description, also called the Markovian dichotomic system, as a toy example with two states, the dynamics of the system is given by transition rates w−+ from + to − and w+− from − to + . The states are characterized by the probabilities p+ (t) and p− (t) that the system at time t occupies the
state + and −, respectively. The probabilities obey the normalization condition p+ + p− = 1. A basic assumption in describing the dynamics is that the switching between the two states is a Markov process. This means that the system has no memory, that is, the transition rate from one given state to another is a property of this state and does not depend on the history of how this state was reached. The time evolution of p+(t) and p−(t) is given by the master equations

$$ \frac{d p_+}{dt} = -w_{-+} \, p_+ + w_{+-} \, p_-, \tag{3.100} $$

$$ \frac{d p_-}{dt} = +w_{-+} \, p_+ - w_{+-} \, p_- \tag{3.101} $$

together with the initial condition specifying the values of the probabilities p+(0) and p−(0) at t = 0. The solution of these differential equations is

$$ p_+(t) = \frac{w_{+-}}{w_{+-} + w_{-+}} + \left[ p_+(0) - \frac{w_{+-}}{w_{+-} + w_{-+}} \right] e^{-(w_{+-} + w_{-+}) t}, \tag{3.102} $$

$$ p_-(t) = \frac{w_{-+}}{w_{+-} + w_{-+}} + \left[ p_-(0) - \frac{w_{-+}}{w_{+-} + w_{-+}} \right] e^{-(w_{+-} + w_{-+}) t}. \tag{3.103} $$
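Before deriving this solution, it can be checked numerically. The following sketch is an illustration with arbitrarily chosen rates; the variable names w_pm and w_mp stand for w+− and w−+ and are introduced here. It compares the closed form (3.102)–(3.103) with a classical fourth-order Runge-Kutta integration of (3.100), using p− = 1 − p+.

```python
import math


def two_level_exact(t, w_pm, w_mp, p_plus0):
    """Closed-form solution (3.102)-(3.103); w_pm = w_{+-}, w_mp = w_{-+}."""
    s = w_pm + w_mp
    p_plus = w_pm / s + (p_plus0 - w_pm / s) * math.exp(-s * t)
    return p_plus, 1.0 - p_plus


def two_level_rk4(t_end, w_pm, w_mp, p_plus0, steps=1000):
    """Integrate (3.100) with p_- = 1 - p_+ by classical Runge-Kutta steps."""
    def rhs(p_plus):
        return -w_mp * p_plus + w_pm * (1.0 - p_plus)

    h = t_end / steps
    p_plus = p_plus0
    for _ in range(steps):
        k1 = rhs(p_plus)
        k2 = rhs(p_plus + 0.5 * h * k1)
        k3 = rhs(p_plus + 0.5 * h * k2)
        k4 = rhs(p_plus + h * k3)
        p_plus += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return p_plus, 1.0 - p_plus


# Arbitrary illustrative rates and initial condition p_+(0) = 1
print(two_level_exact(2.0, 0.7, 0.3, 1.0))
print(two_level_rk4(2.0, 0.7, 0.3, 1.0))
```

Both calls should print nearly identical pairs (p+, p−).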
In the following we will show two different ways of obtaining the solution. The first one is the direct integration method. Inserting the normalization relation p− = 1 − p+ into the first equation of motion, we obtain an inhomogeneous differential equation

$$ \frac{d p_+}{dt} + (w_{+-} + w_{-+}) \, p_+ = w_{+-}, \tag{3.104} $$

which is solved in the standard way. First we consider the homogeneous case

$$ \frac{d p_+}{dt} + (w_{+-} + w_{-+}) \, p_+ = 0 \tag{3.105} $$

and find its solution

$$ p_+^{\mathrm{hom}}(t) = C \, e^{-(w_{+-} + w_{-+}) t}. \tag{3.106} $$

Next we search for a particular solution by variation of the constant according to the ansatz

$$ p_+^{\mathrm{par}}(t) = C(t) \, e^{-(w_{+-} + w_{-+}) t}. \tag{3.107} $$

By inserting this into the inhomogeneous differential equation we obtain

$$ C(t) = w_{+-} \int_0^t e^{+(w_{+-} + w_{-+}) s} \, ds = \frac{w_{+-}}{w_{+-} + w_{-+}} \left[ e^{+(w_{+-} + w_{-+}) t} - 1 \right]. \tag{3.108} $$

Therefore

$$ p_+^{\mathrm{par}}(t) = \frac{w_{+-}}{w_{+-} + w_{-+}} \left[ 1 - e^{-(w_{+-} + w_{-+}) t} \right] \tag{3.109} $$

holds and the solution reads

$$ p_+(t) = p_+^{\mathrm{par}}(t) + p_+^{\mathrm{hom}}(t) = \frac{w_{+-}}{w_{+-} + w_{-+}} \left[ 1 - e^{-(w_{+-} + w_{-+}) t} \right] + C \, e^{-(w_{+-} + w_{-+}) t}. \tag{3.110} $$

The integration constant C is calculated from the initial condition

$$ p_+(t = 0) = C = p_+(0), \tag{3.111} $$
which finally yields the known result.

The solution can also be obtained using another method, the so-called diagonalization by eigenstates. It is possible to solve the equations of motion starting from the form

$$ \frac{d}{dt} \begin{pmatrix} p_+(t) \\ p_-(t) \end{pmatrix} = \begin{pmatrix} -w_{-+} & w_{+-} \\ w_{-+} & -w_{+-} \end{pmatrix} \begin{pmatrix} p_+(t) \\ p_-(t) \end{pmatrix}, \tag{3.112} $$

or, compare (3.26),

$$ \frac{d}{dt} P(t) = W P(t), \tag{3.113} $$

where W is the transition matrix and P(t) is the time-dependent state. Taking into account the initial condition P(t = 0) = P(0), the formal solution reads

$$ P(t) = \exp(W t) \, P(0) = U(t) \, P(0). \tag{3.114} $$

The general solution P(t), see (3.31), with eigenvalues λi and eigenstates ui is given by

$$ P(t) = \sum_i c_i \, u_i \, e^{\lambda_i t}, \tag{3.115} $$

where the coefficients ci are constants calculated from the initial conditions. In our case i has two values: i = 0, 1. The first step consists of the determination of the eigenvalues from

$$ | W - \lambda E | = \begin{vmatrix} -w_{-+} - \lambda & w_{+-} \\ w_{-+} & -w_{+-} - \lambda \end{vmatrix} = 0. \tag{3.116} $$

The calculation

$$ (w_{-+} + \lambda)(w_{+-} + \lambda) - w_{+-} w_{-+} = 0, \tag{3.117} $$

$$ \lambda \left( \lambda + (w_{+-} + w_{-+}) \right) = 0 \tag{3.118} $$
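yields the eigenvalues λ0 = 0 and λ1 = −(w+− + w−+). The term with zero eigenvalue λ0 in (3.115) represents the stationary solution reached asymptotically as t → ∞. In this case the stationary solution is also the equilibrium one, since detailed balance (3.25) holds, that is, the condition dp+(t)/dt = dp−(t)/dt = 0 implies that the probability flux from state + to state − is balanced by the opposite flux. The other term with the negative eigenvalue λ1 describes the relaxation to this stationary solution. In the second step we calculate the eigenstates, $W u_i = \lambda_i u_i$, from

$$ \begin{pmatrix} -w_{-+} & w_{+-} \\ w_{-+} & -w_{+-} \end{pmatrix} \begin{pmatrix} u_+^i \\ u_-^i \end{pmatrix} = \lambda_i \begin{pmatrix} u_+^i \\ u_-^i \end{pmatrix}. \tag{3.119} $$

For i = 0 we have

$$ \begin{pmatrix} -w_{-+} & w_{+-} \\ w_{-+} & -w_{+-} \end{pmatrix} \begin{pmatrix} u_+^0 \\ u_-^0 \end{pmatrix} = 0 \begin{pmatrix} u_+^0 \\ u_-^0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{3.120} $$

so that the eigenstate u0 can be written as

$$ u_+^0 = \frac{w_{+-}}{w_{-+}} \, u_-^0. \tag{3.121} $$

For i = 1 we have

$$ \begin{pmatrix} -w_{-+} & w_{+-} \\ w_{-+} & -w_{+-} \end{pmatrix} \begin{pmatrix} u_+^1 \\ u_-^1 \end{pmatrix} = -(w_{+-} + w_{-+}) \begin{pmatrix} u_+^1 \\ u_-^1 \end{pmatrix} \tag{3.122} $$

and the eigenstate u1 can be represented as

$$ u_+^1 = -u_-^1. \tag{3.123} $$

Putting these eigenstates into the general solution

$$ P(t) = c_0 \, u_0 \, e^{\lambda_0 t} + c_1 \, u_1 \, e^{\lambda_1 t} \tag{3.124} $$

we get

$$ P(t) = \begin{pmatrix} p_+(t) \\ p_-(t) \end{pmatrix} = c_0 \begin{pmatrix} w_{+-} / w_{-+} \\ 1 \end{pmatrix} + c_1 \begin{pmatrix} -1 \\ 1 \end{pmatrix} e^{-(w_{+-} + w_{-+}) t}. \tag{3.125} $$

From the initial condition

$$ P(t = 0) = c_0 \, u_0 + c_1 \, u_1 = P(0) \tag{3.126} $$

we obtain

$$ P(0) = \begin{pmatrix} p_+(0) \\ p_-(0) \end{pmatrix} = c_0 \begin{pmatrix} w_{+-} / w_{-+} \\ 1 \end{pmatrix} + c_1 \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \tag{3.127} $$

Using the initial condition and normalization we finally obtain the previously unknown coefficients

$$ c_0 = \frac{w_{-+}}{w_{+-} + w_{-+}}, \tag{3.128} $$

$$ c_1 = p_-(0) - c_0 = p_-(0) - \frac{w_{-+}}{w_{+-} + w_{-+}} = -p_+(0) + \frac{w_{+-}}{w_{+-} + w_{-+}} \tag{3.129} $$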
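to get the known result. According to (3.114), the solution can be represented by the time evolution matrix U as P(t) = U(t, t0) P(t0) with

$$ U(t, t_0) = e^{W (t - t_0)}. \tag{3.130} $$

In our two-state system, setting t0 = 0, we have

$$ \begin{pmatrix} p_+(t) \\ p_-(t) \end{pmatrix} = \begin{pmatrix} U_{11} & U_{12} \\ U_{21} & U_{22} \end{pmatrix} \begin{pmatrix} p_+(0) \\ p_-(0) \end{pmatrix}. \tag{3.131} $$

Adding up both equations and applying the normalization condition, we obtain

$$ U = \begin{pmatrix} U_{11} & U_{12} \\ U_{21} & U_{22} \end{pmatrix} = \begin{pmatrix} U_{11} & 1 - U_{22} \\ 1 - U_{11} & U_{22} \end{pmatrix} \tag{3.132} $$

with two unknown matrix elements U11 and U22. Using the already known results for p+(t) and p−(t), after some mathematical manipulation we get

$$ U_{11} = \frac{w_{+-}}{w_{+-} + w_{-+}} + \frac{w_{-+}}{w_{+-} + w_{-+}} \, e^{-(w_{+-} + w_{-+}) t}, \tag{3.133} $$

$$ U_{12} = \frac{w_{+-}}{w_{+-} + w_{-+}} \left[ 1 - e^{-(w_{+-} + w_{-+}) t} \right], \tag{3.134} $$

$$ U_{21} = \frac{w_{-+}}{w_{+-} + w_{-+}} \left[ 1 - e^{-(w_{+-} + w_{-+}) t} \right], \tag{3.135} $$

$$ U_{22} = \frac{w_{-+}}{w_{+-} + w_{-+}} + \frac{w_{+-}}{w_{+-} + w_{-+}} \, e^{-(w_{+-} + w_{-+}) t}. \tag{3.136} $$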
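The time evolution matrix has a few properties that are easy to verify numerically: it reduces to the unit matrix at t = 0, each of its columns sums to one, and U(t + s) = U(t) U(s), as expected for a matrix exponential. The sketch below is an illustration with arbitrary rates and helper names chosen here. It builds U from the closed-form solution (3.102)–(3.103), with the off-diagonal elements fixed by the column normalization of (3.132).

```python
import math


def evolution_matrix(t, w_pm, w_mp):
    """U(t, 0) for the two-level system; w_pm = w_{+-}, w_mp = w_{-+}."""
    s = w_pm + w_mp
    e = math.exp(-s * t)
    u11 = w_pm / s + w_mp / s * e
    u22 = w_mp / s + w_pm / s * e
    # off-diagonal elements follow from column normalization, see (3.132)
    return [[u11, 1.0 - u22], [1.0 - u11, u22]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


U1 = evolution_matrix(0.4, 0.7, 0.3)
U2 = evolution_matrix(1.1, 0.7, 0.3)
print(matmul(U2, U1))                    # evolution over 0.4, then over 1.1
print(evolution_matrix(1.5, 0.7, 0.3))   # evolution over 1.5 in one step
```

The two printed matrices should coincide up to rounding.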
Up to now we have not specified the transition rates w+− and w−+. They define a particular model. In physical systems like, e.g., the Ising model, the + and − states are characterized by certain energies. Since energy is defined only up to an arbitrary additive constant, we may set the lowest energy equal to zero. Then one state (say +) has energy 0 and the other state (−) has an energy value ε > 0 called the activation energy. The transition rates triggered by the heat bath with temperature T are given by the following Arrhenius ansatz

$$ w_{-+} = \nu \exp \left( -\frac{\beta \varepsilon}{2} \right) \quad \text{(hill up)}, \tag{3.137} $$

$$ w_{+-} = \nu \exp \left( +\frac{\beta \varepsilon}{2} \right) \quad \text{(hill down)}, \tag{3.138} $$

where β = 1/(kB T) is the inverse temperature parameter, kB is the Boltzmann constant, and ν is a constant flip rate. Independent of the initial values p+(0), p−(0) the system reaches the long-time limit p+(t → ∞), p−(t → ∞) (see Figure 3.10) given by

$$ p_+(t \to \infty) = \frac{w_{+-}}{w_{+-} + w_{-+}} = \frac{1}{1 + e^{-\beta \varepsilon}}, \tag{3.139} $$

$$ p_-(t \to \infty) = \frac{w_{-+}}{w_{+-} + w_{-+}} = \frac{1}{1 + e^{+\beta \varepsilon}}. \tag{3.140} $$
This distribution is shown as a function of temperature in Figure 3.11. An extension of the simple dichotomic spin-flip process coupled to two heat reservoirs is analyzed by Steffen Trimper [228]. While one flip process is triggered by a heat bath at temperature T, the inverse transition is activated by a heat bath at a different temperature T′. The stationary solution of the master equation leads to a generalized Fermi distribution with an effective temperature Te given by the harmonic average of T and T′.

Figure 3.10 The solution of the master equation, p+(t) and p−(t) vs time t [s], for different values of temperature T (T = 100 K and T = 2000 K). The results are shown for the following set of parameters: p+(t = 0) = 1, p−(t = 0) = 0, $\nu = 0.5\ \mathrm{s^{-1}}$, $\varepsilon = 10^{-21}$ Nm, $k_B = 1.3806503 \times 10^{-23}\ \mathrm{m^2\,kg\,s^{-2}\,K^{-1}}$.

Figure 3.11 Long-time behavior of p+ and p− as a function of temperature T [K]. Energy difference $\varepsilon = 10^{-21}$ Nm as before.
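The long-time limit (3.139) can be evaluated directly for the parameters quoted in the caption of Figure 3.10. The sketch below is an illustration; the constant and function names are chosen here. It computes the Arrhenius rates (3.137)–(3.138) and confirms that the ratio w+−/(w+− + w−+) coincides with the Fermi-like expression 1/(1 + exp(−βε)).

```python
import math

K_B = 1.3806503e-23   # Boltzmann constant in J/K, value quoted for Figure 3.10
EPS = 1.0e-21         # activation energy in J (= Nm)
NU = 0.5              # flip rate in 1/s


def arrhenius_rates(temperature):
    """Return (w_mp, w_pm) = (w_{-+}, w_{+-}) from (3.137)-(3.138)."""
    beta_eps = EPS / (K_B * temperature)
    return NU * math.exp(-beta_eps / 2.0), NU * math.exp(+beta_eps / 2.0)


def stationary_p_plus(temperature):
    """Long-time occupation of the + level, first expression in (3.139)."""
    w_mp, w_pm = arrhenius_rates(temperature)
    return w_pm / (w_pm + w_mp)


for T in (100.0, 2000.0):
    fermi = 1.0 / (1.0 + math.exp(-EPS / (K_B * T)))  # second form in (3.139)
    print(T, stationary_p_plus(T), fermi)
```

At the higher temperature the occupation of the + level moves towards 1/2, since thermal energy then dominates over the energy difference ε.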
3.7 The Three-Level System
The foregoing consideration of the two-level system can be generalized to a system of three states: 1, 2 and 3. In this case the probability distribution is described by the three-component vector

$$ P(t) = \begin{pmatrix} p_1(t) \\ p_2(t) \\ p_3(t) \end{pmatrix} \tag{3.141} $$

obeying the master equation (cf. (3.26))

$$ \frac{d}{dt} P(t) = W P(t), \tag{3.142} $$

where the transition matrix W is given by

$$ W = \begin{pmatrix} -(w_{21} + w_{31}) & w_{12} & w_{13} \\ w_{21} & -(w_{12} + w_{32}) & w_{23} \\ w_{31} & w_{32} & -(w_{13} + w_{23}) \end{pmatrix}. \tag{3.143} $$
As distinct from the two-level case, the stationary solution of the three-level system is not necessarily the equilibrium one. It depends on the specific values of the transition rates wij , and only in a special case is the detailed balance satisfied. In the stationary state the probability flux between two states is constant, but not necessarily zero. For example, if w12 = w23 = w31 = 0 holds, then from state 1 it is possible to go only to state 2; from state 2 to state 3; and from state 3 to state 1 with transition rates w21 , w32 , and w13 , respectively. The stationary solution then corresponds to a constant circular flux. Before searching for the general solution we consider some particular cases. A simple situation is when all wij = 0 except w23 and w32 . In this case only transitions between states 2 and 3 take place, whereas state 1 is isolated, as illustrated in the following schematic picture
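[Diagram, Case I: isolated state. State 1 has no connections; states 2 and 3 are linked by transitions in both directions.]

The solution reads

$$ p_1(t) = p_1(0), \tag{3.144} $$

$$ p_2(t) = \frac{w_{23}}{w_{32} + w_{23}} \left( 1 - p_1(0) \right) + \left[ p_2(0) - \frac{w_{23}}{w_{32} + w_{23}} \left( 1 - p_1(0) \right) \right] e^{-(w_{32} + w_{23}) t}, \tag{3.145} $$

$$ p_3(t) = \frac{w_{32}}{w_{32} + w_{23}} \left( 1 - p_1(0) \right) + \left[ p_3(0) - \frac{w_{32}}{w_{32} + w_{23}} \left( 1 - p_1(0) \right) \right] e^{-(w_{32} + w_{23}) t}. \tag{3.146} $$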
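At p1(0) = 0 the probabilities p2(t) and p3(t) obey the solution for the two-level system (3.102)–(3.103). A simple solution also exists in the case where all transition rates are equal, wij = w, as shown in the following picture

[Diagram, Case II: symmetry (equal fluxes). States 1, 2, and 3 are pairwise linked by transitions in both directions.]

In this case we have

$$ p_1(t) = \frac{1}{3} + \left[ p_1(0) - \frac{1}{3} \right] e^{-3wt}, \tag{3.147} $$

$$ p_2(t) = \frac{1}{3} + \left[ p_2(0) - \frac{1}{3} \right] e^{-3wt}, \tag{3.148} $$

$$ p_3(t) = \frac{1}{3} + \left[ p_3(0) - \frac{1}{3} \right] e^{-3wt}. \tag{3.149} $$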
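Finally, as a particular situation we consider the already mentioned totally asymmetric case w12 = w23 = w31 = 0, represented as

[Diagram, Case III: asymmetry (circular flux). States 1, 2, and 3 are connected in a single direction of circulation, 1 → 2 → 3 → 1.]

with constant stationary flux. The solution reads

$$ P(t) = P^{\mathrm{st}} + C_1 P_1 \, e^{\lambda_1 t} + C_2 P_2 \, e^{\lambda_2 t} \tag{3.150} $$

with

$$ \lambda_{1,2} = \frac{-A \pm \sqrt{A^2 - 4B}}{2}, \tag{3.151} $$

where

$$ A = w_{21} + w_{32} + w_{13}, \tag{3.152} $$

$$ B = w_{32} w_{13} + w_{21} w_{13} + w_{21} w_{32} \tag{3.153} $$

are constants,

$$ P^{\mathrm{st}} = \frac{1}{B} \begin{pmatrix} w_{32} w_{13} \\ w_{21} w_{13} \\ w_{21} w_{32} \end{pmatrix} \tag{3.154} $$

is the stationary solution, and

$$ P_{1,2} = \begin{pmatrix} w_{13} \\ -w_{13} - w_{21} - \lambda_{1,2} \\ w_{21} + \lambda_{1,2} \end{pmatrix} \tag{3.155} $$

are the eigenvectors representing the time-dependent part of the solution with the weight coefficients

$$ C_1 = \frac{1}{\lambda_2 - \lambda_1} \left[ \frac{w_{21} + \lambda_2}{w_{13}} \, p_1(0) - p_3(0) - \lambda_2 \frac{w_{32}}{B} \right], \tag{3.156} $$

$$ C_2 = \frac{1}{\lambda_1 - \lambda_2} \left[ \frac{w_{21} + \lambda_1}{w_{13}} \, p_1(0) - p_3(0) - \lambda_1 \frac{w_{32}}{B} \right] \tag{3.157} $$

found from the initial condition. As an example, the solution for $w_{21} = w_{13} = 1\ \mathrm{s^{-1}}$ and $w_{32} = 5\ \mathrm{s^{-1}}$ with the initial condition p1(0) = 1, p2(0) = p3(0) = 0 is shown in Figure 3.12. In the general case the solution can be written in the form of (3.150),

$$ P(t) = P^{\mathrm{st}} + C_1 P_1 \, e^{\lambda_1 t} + C_2 P_2 \, e^{\lambda_2 t}, \tag{3.158} $$

together with the eigenvalues in the form of (3.151),

$$ \lambda_{1,2} = \frac{-A \pm \sqrt{A^2 - 4B}}{2}, \tag{3.159} $$

where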
A = w12 + w13 + w21 + w23 + w31 + w32
(3.160)
B = a+b+c
(3.161)
a = w12 w23 + w13 w32 + w12 w13
(3.162)
b = w13 w21 + w23 w31 + w21 w23
(3.163)
c = w31 w32 + w21 w32 + w12 w31
(3.164)
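and

$$ P^{\mathrm{st}} = \frac{1}{B} \begin{pmatrix} a \\ b \\ c \end{pmatrix}, \tag{3.165} $$

$$ P_{1,2} = \begin{pmatrix} w_{13} - w_{12} \\ -w_{13} - w_{21} - w_{31} - \lambda_{1,2} \\ w_{12} + w_{21} + w_{31} + \lambda_{1,2} \end{pmatrix}, \tag{3.166} $$

$$ C_1 = \frac{1}{\lambda_2 - \lambda_1} \left[ \left( p_1(0) - \frac{a}{B} \right) \frac{w_{12} + w_{21} + w_{31} + \lambda_2}{w_{13} - w_{12}} - p_3(0) + \frac{c}{B} \right], \tag{3.167} $$

$$ C_2 = \frac{1}{\lambda_1 - \lambda_2} \left[ \left( p_1(0) - \frac{a}{B} \right) \frac{w_{12} + w_{21} + w_{31} + \lambda_1}{w_{13} - w_{12}} - p_3(0) + \frac{c}{B} \right]. \tag{3.168} $$

Figure 3.12 The solution of the master equation P(t), with components p1(t), p2(t), and p3(t) vs time t [s], for a three-level system with totally asymmetric transition rates w12 = w23 = w31 = 0, $w_{21} = w_{13} = 1\ \mathrm{s^{-1}}$, and $w_{32} = 5\ \mathrm{s^{-1}}$. The stationary values 5/11 and 1/11 are marked on the probability axis.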
The quantity Pst (3.165) is the stationary probability distribution. It is the eigenvector of the matrix W which corresponds to the zero eigenvalue λst = 0. The two other eigenvectors P1 and P2 correspond to the eigenvalues λ1 and λ2, respectively. By definition, the eigenvalues and eigenvectors are determined by the equation

$$ W \cdot P = \lambda P. \tag{3.169} $$

This leads to the equation for the eigenvalues

$$ \det \left[ W - \lambda E \right] = 0, \tag{3.170} $$

where E is the unit matrix with components Eij = δij.
Note that the condition λ(P1 + P2 + P3) = 0 is obtained for the components Pi of the eigenvector by summing up all three equations represented in the vectorial form (3.169). This means that P1 + P2 + P3 = 0 holds for the eigenvectors with λ ≠ 0. According to (3.170), the system of linear equations for Pi is degenerate, that is, there are no more than two independent equations. A unique solution is obtained by using the probability normalization condition $P_1^{(\mathrm{st})} + P_2^{(\mathrm{st})} + P_3^{(\mathrm{st})} = 1$, as well as the initial condition to find the constants C1 and C2. Equation (3.170) is of the form

$$ \begin{vmatrix} -(w_{21} + w_{31} + \lambda) & w_{12} & w_{13} \\ w_{21} & -(w_{12} + w_{32} + \lambda) & w_{23} \\ w_{31} & w_{32} & -(w_{13} + w_{23} + \lambda) \end{vmatrix} = 0. \tag{3.171} $$
Calculating the determinant (3.171) we obtain

$$ \lambda \left[ \lambda^2 + \lambda (w_{12} + w_{13} + w_{21} + w_{23} + w_{31} + w_{32}) + (w_{12} w_{13} + w_{13} w_{21} + w_{12} w_{23} + w_{21} w_{23} + w_{12} w_{31} + w_{23} w_{31} + w_{13} w_{32} + w_{21} w_{32} + w_{31} w_{32}) \right] = 0. \tag{3.172} $$

The first root of equation (3.172) is λst = 0. It describes the stationary distribution

$$ \frac{d P^{\mathrm{st}}}{dt} = 0. \tag{3.173} $$

The other roots satisfy the equation

$$ \lambda^2 + A \lambda + B = 0, \tag{3.174} $$

where the constants A and B are given by (3.160) and (3.161), respectively. We have three different cases. In the first case, when the inequality A² > 4B is met, both eigenvalues (λ1, λ2) are different real negative numbers. In the second case, when A² = 4B, these eigenvalues are equal to each other, λ1 = λ2 = −A/2. In the third case, when A² < 4B, the eigenvalues (λ1, λ2) are complex numbers $\lambda_1 = (-A + i \sqrt{|D|})/2$ and $\lambda_2 = (-A - i \sqrt{|D|})/2$, where D = A² − 4B. In this case the system dynamics can exhibit damped oscillations

$$ P(t) = P^{\mathrm{st}} + e^{-At/2} \left[ B_1 \sin \left( \sqrt{|D|} \, t / 2 \right) + B_2 \cos \left( \sqrt{|D|} \, t / 2 \right) \right], \tag{3.175} $$

where the constants B1 and B2 are determined from the initial condition. The discriminant D = A² − 4B of the polynomial (3.174) is given by the expression

$$ D = w_{12}^2 + w_{21}^2 + w_{13}^2 + w_{31}^2 + w_{23}^2 + w_{32}^2 + 2 \left( w_{12} w_{21} + w_{13} w_{31} + w_{23} w_{32} + w_{21} w_{31} + w_{12} w_{32} + w_{13} w_{23} \right) - 2 \left( w_{12} w_{23} + w_{13} w_{32} + w_{12} w_{13} + w_{13} w_{21} + w_{23} w_{31} + w_{21} w_{23} + w_{31} w_{32} + w_{21} w_{32} + w_{12} w_{31} \right). \tag{3.176} $$
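The three cases can be explored numerically. The following sketch is an illustration; the function names and the dictionary layout w[(i, j)] for the rate wij of the transition j → i are conventions chosen here. It evaluates A, B, the discriminant D, the two nonzero eigenvalues from (3.159), and the stationary distribution (3.165).

```python
import cmath


def three_level_spectrum(w):
    """w[(i, j)] is the rate w_ij of the transition j -> i, i != j in 1..3."""
    A = sum(w.values())                                              # (3.160)
    a = w[1, 2] * w[2, 3] + w[1, 3] * w[3, 2] + w[1, 2] * w[1, 3]    # (3.162)
    b = w[1, 3] * w[2, 1] + w[2, 3] * w[3, 1] + w[2, 1] * w[2, 3]    # (3.163)
    c = w[3, 1] * w[3, 2] + w[2, 1] * w[3, 2] + w[1, 2] * w[3, 1]    # (3.164)
    B = a + b + c                                                    # (3.161)
    D = A * A - 4.0 * B
    root = cmath.sqrt(D)          # purely imaginary when D < 0
    lam1, lam2 = (-A + root) / 2.0, (-A - root) / 2.0                # (3.159)
    p_st = (a / B, b / B, c / B)                                     # (3.165)
    return lam1, lam2, D, p_st


def rates(w21, w32, w13, others=0.0):
    """Build the rate dictionary; all rates except the three named ones
    are set to the common value `others`."""
    w = {(i, j): others for i in (1, 2, 3) for j in (1, 2, 3) if i != j}
    w[2, 1], w[3, 2], w[1, 3] = w21, w32, w13
    return w


print(three_level_spectrum(rates(1.0, 5.0, 1.0)))  # rates of Figure 3.12
print(three_level_spectrum(rates(1.0, 1.0, 1.0)))  # equal circular rates
```

For the rates of Figure 3.12 the discriminant is positive and the relaxation is purely exponential, whereas three equal circular rates give D < 0 and hence complex eigenvalues, that is, damped oscillations of the form (3.175).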
The stationary solution, defined by (3.173) and given by (3.165), can be found by a more elegant method developed by Gustav Kirchhoff. It is based on some elements of graph theory, as explained in [64, 204]. The Markovian system under consideration is represented by a graph G consisting of vertices and edges. Vertices correspond to system states, whereas edges connect all vertices i and j for which at least one of the transition rates wij and wji is nonzero. For nonzero transition rates wij, the graph G of the three-level system is represented as

[Diagram: graph G of the three-level system, a triangle with vertices 1, 2, and 3 and an edge between each pair of vertices.]
In the following we will assume that the graph G is connected. In principle, it can also consist of unconnected parts. Then Kirchhoff's method can be applied to each part separately. An example is the already considered case with an isolated state where the problem, in fact, reduces to the solution of a two-level system represented by two vertices connected by one edge. The only peculiarity is that the stationary probabilities have to be normalized in such a way that the total probability for the isolated subsystem is that given by the initial condition. The stationary solution is represented by subgraphs of G, called maximal trees. A maximal tree T{G} is a connected subgraph of G such that: (i) all edges of T{G} are edges of G; (ii) T{G} contains all vertices of G; and (iii) T{G} contains no circuits (cyclic sequences of edges). It is easy to see that one has to drop a certain minimum number of edges of G to exclude circuits and obtain maximal trees. Maximal trees of the graph G for the three-level system are

[Diagram: the three maximal trees of G, each obtained from the triangle with vertices 1, 2, and 3 by dropping one of its three edges.]
In order to construct the stationary solution of the master equation we need directed maximal trees. For a given state i and tree T the directed tree Tj is obtained from T by directing all its edges towards the vertex i. The superposition of such trees makes up the stationary solution of the master equation. In the case of the three-level system it is represented as

[Diagram (3.177): each of K1, K2, and K3 is drawn as the sum of the three maximal trees with all edges directed towards vertex 1, 2, and 3, respectively.] (3.177)
Here an arrow drawn from vertex j to vertex i stands for the transition from the state j to the state i with rate wij. We ascribe to each directed maximal tree Tj a weight Ki{Tj} equal to the product of all transition rates wij corresponding to its edges with the appropriate directions. The value Ki is defined as the sum of Ki{Tj} running over all the maximal directed trees leading to state i:

$$ K_i = \sum_{\text{all } T_j} K_i\{T_j\}. \tag{3.178} $$
According to (3.177), for our three-level system we have K1 = w12 w23 + w13 w32 + w12 w13 ,
(3.179)
K2 = w21 w13 + w23 w31 + w21 w23 ,
(3.180)
K3 = w31 w32 + w32 w21 + w31 w12 .
(3.181)
The Kirchhoff formula for the stationary probability distribution $p_i^{\mathrm{st}}$ of an N-level system, represented by a connected graph, is given by

$$ p_i^{\mathrm{st}} = \frac{K_i}{\sum_{j=1}^{N} K_j}. \tag{3.182} $$
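For small systems the tree sums can be generated by brute force. The sketch below is only an illustration and not an efficient algorithm; the function names and the dictionary layout w[(i, j)] for the rate of the transition j → i are assumptions made here. For every root state it lets each other state choose one outgoing edge and keeps those choices in which all states reach the root, which are exactly the maximal trees directed towards that root.

```python
from itertools import product


def kirchhoff_weights(w, states):
    """Sum the weights of all maximal trees directed towards each state.

    w[(i, j)] is the rate of the transition j -> i; missing pairs count as zero.
    """
    K = {}
    for root in states:
        others = [s for s in states if s != root]
        total = 0.0
        # every non-root state picks exactly one outgoing edge (its successor)
        for choice in product(states, repeat=len(others)):
            succ = dict(zip(others, choice))
            if any(s == succ[s] for s in others):
                continue
            # keep the choice only if every state reaches the root (no circuit)
            ok = True
            for s in others:
                seen, cur = set(), s
                while cur != root and cur not in seen:
                    seen.add(cur)
                    cur = succ[cur]
                ok = ok and cur == root
            if ok:
                weight = 1.0
                for s in others:
                    weight *= w.get((succ[s], s), 0.0)
                total += weight
        K[root] = total
    return K


def stationary_distribution(w, states):
    """Kirchhoff formula (3.182)."""
    K = kirchhoff_weights(w, states)
    norm = sum(K.values())
    return {i: K[i] / norm for i in states}


# Totally asymmetric example: w21 = w13 = 1, w32 = 5, all other rates zero
w = {(2, 1): 1.0, (3, 2): 5.0, (1, 3): 1.0}
print(stationary_distribution(w, [1, 2, 3]))
```

For three states the enumeration reproduces the three products in each of (3.179)–(3.181). The number of candidate choices grows as N^(N−1), so this direct enumeration is only practical for a handful of states.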
We can easily see that this general formula is consistent with (3.165) in the case of the three-level system (N = 3), where Ki are given by (3.179)–(3.181). In the following we consider the three-level system thermodynamically. For thermodynamic systems detailed balance typically holds. This means that, for an arbitrarily chosen pair of states i, j, the stationary probability flux between them is equal to zero,

[Diagram: states i and j connected by two opposite arrows carrying the fluxes $w_{ji} p_i^{\mathrm{eq}}$ and $w_{ij} p_j^{\mathrm{eq}}$.]

or in mathematical terms

$$ J_{ij} := w_{ji} \, p_i^{\mathrm{eq}} - w_{ij} \, p_j^{\mathrm{eq}} = 0. \tag{3.183} $$

Following Section 3.2 we use the symbol $p_i^{\mathrm{eq}}$ to designate the stationary distribution function $p_i^{\mathrm{st}}$ for systems with detailed balance. As is well known, for such a system the energy Hi specifies the equilibrium distribution via the Boltzmann formula

$$ p_i^{\mathrm{eq}} = \frac{1}{Z} \exp\{-\beta H_i\}. \tag{3.184} $$

Let us fix some state, for example, state 1. Then for any state i the detailed balance reads

$$ w_{i1} \, p_1^{\mathrm{eq}} = w_{1i} \, p_i^{\mathrm{eq}}. \tag{3.185} $$

By virtue of (3.184) expression (3.185) is rewritten as

$$ \frac{w_{1i}}{w_{i1}} = \exp\{-\beta (H_1 - H_i)\}, \tag{3.186} $$

and thereby

$$ H_i = H_1 + k_B T \left[ \ln w_{1i} - \ln w_{i1} \right]. \tag{3.187} $$
Expression (3.187) actually enables us to construct the energy $H_i$ using the given transition rates $w_{ij}$. If $H_1$ is known we can calculate any $H_i$. In particular, for the three-level system this construction of the energies $H_i$ is actually based on the following maximal tree

[Diagram: the maximal tree with the edges {12} and {13}.]
So, on the one hand, the energy $H_i$ and correspondingly the distribution of a thermodynamic system can be constructed using one maximal tree only. On the other hand, the Kirchhoff diagram technique deals with all the maximal trees, which are actually independent. In order to resolve this seeming contradiction and to propose an approach for constructing the energy $H$ without referring to a certain fixed maximal tree, we consider the three-level system as an example. The stationary probability flux, for example, along the edge {12} can be calculated directly by substituting Kirchhoff's formula (3.182) into (3.183) with the replacement $p^{\mathrm{st}} \to p^{\mathrm{eq}}$. Up to the positive normalization factor $1/(K_1 + K_2 + K_3)$ this yields

$$J_{12}^{\mathrm{eq}} = w_{21}\, p_1^{\mathrm{eq}} - w_{12}\, p_2^{\mathrm{eq}} \propto w_{21} w_{32} w_{13} - w_{12} w_{23} w_{31} = 0. \tag{3.188}$$
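The fluxes along {13} and {23} have the same value equal to zero. Expression (3.188) provides the condition

$$w_{21} w_{32} w_{13} = w_{12} w_{23} w_{31} \tag{3.189}$$

or, using Kirchhoff diagrams, (3.189) is represented as

[Diagram (3.190): the cycle through the states 1, 2, 3 directed one way, minus the same cycle directed the opposite way, equals zero.] (3.190)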
The following equalities result immediately from (3.189)

$$k_1 := \frac{K_2\{T_j\}}{K_1\{T_j\}} = \frac{w_{21}}{w_{12}} = \frac{w_{23} w_{31}}{w_{32} w_{13}}, \tag{3.191}$$

$$k_2 := \frac{K_3\{T_j\}}{K_1\{T_j\}} = \frac{w_{31}}{w_{13}} = \frac{w_{21} w_{32}}{w_{12} w_{23}}. \tag{3.192}$$
Therefore, to construct the equilibrium distribution only one column in (3.177) may be taken into account. The choice of any number of columns in (3.177) gives the same result

$$p_1^{\mathrm{eq}} = \frac{1}{1 + k_1 + k_2}, \qquad p_2^{\mathrm{eq}} = \frac{k_1}{1 + k_1 + k_2}, \qquad p_3^{\mathrm{eq}} = \frac{k_2}{1 + k_1 + k_2}. \tag{3.193}$$
So we have demonstrated how the energy can be introduced by means of Kirchhoff's diagrams. By definition, the energy of a state $i$ within a tree $T_j$ is written as

$$H_i\{T_j\} = -k_B T \ln K_i\{T_j\}, \tag{3.194}$$

for example, for state 1 expression (3.194) reads

$$H_1\{T_1\} = -k_B T \left[ \ln w_{12} + \ln w_{13} \right]. \tag{3.195}$$
For a system with detailed balance the energy $H_i\{T_j\}$ possesses the following property. The difference between the energies of an arbitrarily chosen pair of states {i, j} within the same tree $T_k$ is a constant

$$H_i\{T_k\} - H_j\{T_k\} = \text{const}. \tag{3.196}$$

By way of an example, we consider the following two trees directed to state 1 and, similarly, two trees directed to state 2

[Diagram: the tree $T_1$ with the edges {12} and {13} and the tree $T_2$ with the edges {13} and {23}, each drawn once directed to state 1 and once directed to state 2.]
Using expression (3.194) the corresponding energies are

$$\begin{aligned}
H_1\{T_1\} &= -k_B T \left[ \ln w_{12} + \ln w_{13} \right], \\
H_2\{T_1\} &= -k_B T \left[ \ln w_{21} + \ln w_{13} \right], \\
H_1\{T_2\} &= -k_B T \left[ \ln w_{32} + \ln w_{13} \right], \\
H_2\{T_2\} &= -k_B T \left[ \ln w_{31} + \ln w_{23} \right].
\end{aligned} \tag{3.197}$$
The differences between the energies are written as

$$H_1\{T_1\} - H_2\{T_1\} = -k_B T \left[ \ln w_{12} - \ln w_{21} \right], \tag{3.198}$$

$$H_1\{T_2\} - H_2\{T_2\} = -k_B T \left[ \ln w_{32} + \ln w_{13} - \ln w_{31} - \ln w_{23} \right]. \tag{3.199}$$

Expressions (3.198) and (3.199) are equal to each other provided the condition (3.189) is fulfilled and, thus,

$$\ln (w_{32} w_{13} w_{21}) = \ln (w_{31} w_{23} w_{12}). \tag{3.200}$$
Now let us construct the energy $H_i$ of a state $i$ as

$$H_i = \frac{1}{3} \sum_{k=1}^{3} H_i\{T_k\}. \tag{3.201}$$
Then the expression for $H_i\{T_k\}$ can be rewritten as

$$H_i\{T_k\} = H_i + \Delta H\{T_k\}. \tag{3.202}$$

In (3.202) the former term depends only on the state $i$. In the general case the latter one should depend on the state $i$ as well as on the given tree $T_k$. However, due to the detailed balance it turns out to depend only on the tree $T_k$. In fact, for a fixed tree $T_k$ we can write

$$\Delta H_i\{T_k\} - \Delta H_j\{T_k\} = H_i\{T_k\} - H_j\{T_k\} - \frac{1}{3} \left[ \sum_{j'=1}^{3} H_i\{T_{j'}\} - \sum_{j'=1}^{3} H_j\{T_{j'}\} \right] = 0. \tag{3.203}$$
By virtue of (3.196) and using (3.202), expression (3.182) can be rewritten ($\beta = 1/(k_B T)$) as

$$p_i^{\mathrm{eq}} = \frac{\exp\{-\beta H_i\}}{\sum_{j=1}^{3} \exp\{-\beta H_j\}}, \tag{3.204}$$

which has the form of the Boltzmann distribution. We note that these considerations also hold for multilevel systems with detailed balance.
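The tree-averaged energy (3.201) can be checked numerically. The sketch below builds rates that satisfy detailed balance from prescribed energies, evaluates the tree weights $K_i\{T_k\}$ of the three maximal trees, and compares the resulting distribution (3.204) with the Boltzmann distribution of the prescribed energies. The symmetric prefactors `nu`, the rate ansatz w_ij = ν_ij exp{−β(H_i − H_j)/2} and all function names are assumptions chosen only for illustration; any rates obeying (3.186) would serve equally well.

```python
import math


def rates_from_energies(H, beta=1.0, nu=None):
    """Hypothetical helper: rates w[i][j] (jump j -> i) obeying detailed balance.

    Uses w_ij = nu_ij * exp(-beta * (H_i - H_j) / 2) with symmetric nu_ij,
    so that w_ij / w_ji = exp(-beta * (H_i - H_j)) as required by (3.186).
    """
    if nu is None:
        nu = {(1, 2): 1.0, (1, 3): 0.5, (2, 3): 2.0}  # arbitrary prefactors
    w = [[0.0] * 4 for _ in range(4)]  # row/column 0 unused
    for (i, j), a in nu.items():
        w[i][j] = a * math.exp(-0.5 * beta * (H[i] - H[j]))
        w[j][i] = a * math.exp(-0.5 * beta * (H[j] - H[i]))
    return w


def tree_weights(w):
    """K_i{T_k} for the maximal trees with edges {12,23}, {13,23}, {12,13}."""
    return {
        1: [w[1][2] * w[2][3], w[1][3] * w[3][2], w[1][2] * w[1][3]],
        2: [w[2][1] * w[2][3], w[2][3] * w[3][1], w[2][1] * w[1][3]],
        3: [w[3][2] * w[2][1], w[3][1] * w[3][2], w[3][1] * w[1][2]],
    }


def tree_averaged_energy(w, beta=1.0):
    """H_i from (3.194) and (3.201): mean of -k_B T ln K_i{T_k} over the trees."""
    K = tree_weights(w)
    return {i: -sum(math.log(k) for k in K[i]) / (3.0 * beta) for i in K}


def boltzmann(H, beta=1.0):
    """Normalized Boltzmann weights exp(-beta H_i) / Z."""
    Z = sum(math.exp(-beta * h) for h in H.values())
    return {i: math.exp(-beta * h) / Z for i, h in H.items()}
```

The energies returned by `tree_averaged_energy` differ from the prescribed ones only by a state-independent constant, which cancels in the normalized distribution.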
3.8 Exercises
E 3.1 Radioactive decay
By analogy with the Poisson process considered in Section 3.5, solve the master equation (3.75) for the radioactive decay process with $w_-(n) = \frac{1}{\tau(n)} = \frac{1}{\tau_0} \cdot n$. With a given initial condition $p(n, t = 0) = \delta_{n,n_0}$ find the analytical time-dependent solution $p(n, t)$ for $n \ge 0$. As in Section 3.5, calculate the first two moments and the variance of the probability distribution.
E 3.2 Ehrenfest urn model
Consider the Ehrenfest model of particle diffusion between two boxes. In 1907 Paul and Tatiana Ehrenfest first formulated their stochastic urn model [39] to discuss the H-theorem investigated by Ludwig Boltzmann. There are N particles in total. Let $n$ be the number of particles in the first (or left) box. Each of them can go over to the second (or right) box with transition rate $d_{12}$. Similarly, any one of the $N - n$ particles in the second box can go over to the first one with the transition rate $d_{21}$. The probability $p(n, t)$ of finding the system in the state with $n$ particles in the first box at time $t$ is given by the one-step master equation (3.38) with the transition rates

$$w_+(n) = d_{21} (N - n), \tag{3.205}$$

$$w_-(n) = d_{12}\, n. \tag{3.206}$$

One task is to find the stationary as well as the time-dependent solutions of this master equation with the initial condition $p(n, t = 0) = \delta_{n,n_0}$. Hint: use the generating function

$$F(s_1, s_2, t) = \sum_{n_i} s_1^{n_1} s_2^{n_2}\, p(n_1, n_2, t), \qquad |s_i| < 1, \tag{3.207}$$
where in our case $p(n_1, n_2, t) \equiv p(n_1, N - n_1, t) \equiv p(n, t)$. Another task is to obtain the differential equations for the first and the second moments, that is $\langle n \rangle(t)$ and $\langle n^2 \rangle(t)$, of the probability distribution function $p(n, t)$ and solve them. Calculate the variance and estimate its dependence on the number of particles N at $t \to \infty$ (the equilibrium value). Hint: use the master equation and the definition $\langle n^k \rangle = \sum_n n^k\, p(n, t)$.

E 3.3 Schlögl model
Consider the model of the bistable Schlögl reaction (named after Friedrich Schlögl from Aachen)

$$A + 2X \underset{k_1'}{\overset{k_1}{\rightleftharpoons}} 3X; \qquad X \underset{k_2'}{\overset{k_2}{\rightleftharpoons}} F,$$

where A is the raw substance and F is the final product, both having constant concentrations, whereas X is the intermediate substance of interest with varying concentration C. Its dependence on time T is given by the equation

$$\frac{dC}{dT} = k_1 A C^2 - k_1' C^3 - k_2 C + k_2' F. \tag{3.208}$$

The concentration is C = N/V, where N is the number of molecules and V is the volume of the system. By introducing elementary volume and time units, $V_0$ and $t_0 = V_0^2 / k_1'$, (3.208) is written in dimensionless variables $X = V_0 C$ and $t = T/t_0$ as

$$\frac{dX}{dt} = -X^3 + a X^2 - b X + c, \tag{3.209}$$

where $a = k_1 A V_0 / k_1'$, $b = k_2 V_0^2 / k_1'$, and $c = k_2' F V_0^3 / k_1'$ are positive dimensionless control parameters. Using the substitution $x = X - a/3$, (3.209) is transformed into the cubic normal form without the quadratic term,

$$\frac{dx}{dt} = -x^3 + \beta x + \gamma \tag{3.210}$$

with two control parameters $\beta = \frac{1}{3} a^2 - b$ and $\gamma = \frac{2}{27} a^3 - \frac{1}{3} a b + c$. Depending on the parameter γ, (3.210) has either one or three stationary solutions. One task is to find the stationary solutions of (3.210) as functions of the parameter γ and analyze their stability. Another task is to construct the master equation for the corresponding stochastic model, where p(N, t) is the probability of finding the system in a state with N molecules of substance X at time t. Hint: consider a finite volume V, where C = N/V; estimate the transition rates using the mean-field concentration product ansatz multiplied with the reaction constant.
E 3.4 Stochastic Brusselator
Start a new project called the stochastic Brusselator (named after the city of Brussels, where this model was first discussed by Ilya Prigogine et al.). The model deals with certain idealized autocatalytic reactions. In general, these are chemical reactions in which at least one of the products is also a reactant. We consider the following example

$$R_1 \overset{k_1}{\longrightarrow} X; \qquad R_2 + X \overset{k_2}{\longrightarrow} Y + F_1; \qquad 2X + Y \overset{k_3}{\longrightarrow} 3X; \qquad X \overset{k_4}{\longrightarrow} F_2,$$

where $k_1$, $k_2$, $k_3$, $k_4$ are reaction constants; $R_1$ and $R_2$ are the raw substances; $F_1$ and $F_2$ are final products of the reaction; whereas X and Y are intermediate substances which are of particular interest. Let us denote the concentrations of $R_1$ and $R_2$ by $r_1$ and $r_2$, and those of X and Y by $x$ and $y$. In the following it is assumed that the concentrations of the raw substances are so large that their depletion during a considered time of reaction can be neglected, so $r_1$ and $r_2$ are constants. We are thus interested only in the temporal variation of the concentrations $x$ and $y$, which are described by the system of two coupled differential equations

$$\frac{dx}{dt} = k_1 r_1 - k_2 r_2 x + k_3 x^2 y - k_4 x, \tag{3.211}$$

$$\frac{dy}{dt} = k_2 r_2 x - k_3 x^2 y. \tag{3.212}$$
One task is to find the fixed-point stationary solution of (3.211)–(3.212), to determine the region of its stability depending on the parameters of the model, and to analyze numerically the solution in the region where the fixed point is unstable. Another task is to construct the master equation for the corresponding stochastic model, where p(Nx , Ny , t) is the probability of finding the system in a state with Nx molecules of substance X and Ny molecules of substance Y at time t. Hint: consider a finite volume V, where x = Nx /V and y = Ny /V; estimate the transition rates using the mean-field concentration product ansatz multiplied with the reaction constant.
4 The Fokker–Planck Equation
4.1 General Fokker–Planck Equations
One of the fundamental dynamical expressions for Markovian processes is the Fokker–Planck equation [49, 193] in its forward and backward notation. The basic quantity describing the probabilistic evolution for the path from the initial value $r_0 = r(t_0)$ to position $r(t)$ for all times $t \ge t_0$ including $t \to \infty$ is the conditional probability density $p(r, t \mid r_0, t_0)$, also called the Green function. Usually the boundary conditions which have to be formulated due to the context of the given problem are essential for the analytical representation of the Fokker–Planck solution which is related to the Sturm–Liouville problem. The multidimensional forward Fokker–Planck equation (using ∇, known as Nabla notation) consists of drift contributions (given by vector $v = \{v_i \mid i = 1, \dots, N\}$) as well as diffusion terms (given by matrix $D = \{d_{ij} \mid i, j = 1, \dots, N\}$)

$$\frac{\partial p(r, t \mid r_0, t_0)}{\partial t} = -\sum_{i=1}^{N} \nabla_i \left[ v_i(r, t)\, p(r, t \mid r_0, t_0) \right] + \sum_{i=1}^{N} \sum_{j=1}^{N} \nabla_i \nabla_j \left[ d_{ij}(r, t)\, p(r, t \mid r_0, t_0) \right] \tag{4.1}$$

and can be written as a typical continuity equation

$$\frac{\partial p(r, t \mid r_0, t_0)}{\partial t} + \sum_{i=1}^{N} \nabla_i J_i\{p(r, t \mid r_0, t_0)\} = 0 \tag{4.2}$$

with the probability flux

$$J_i = v_i(r, t)\, p(r, t \mid r_0, t_0) - \sum_{j=1}^{N} \nabla_j \left[ d_{ij}(r, t)\, p(r, t \mid r_0, t_0) \right]. \tag{4.3}$$
The continuity equation (4.2) has to be completed by boundary conditions indicating special properties of the flux (4.3) and/or the conditional probability density p(r, t | r0 , t0 ) like reflecting and absorbing barriers.
The corresponding multidimensional backward Fokker–Planck equation reads ($\nabla^0$ means Nabla derivative with respect to $r_0$)

$$-\frac{\partial p(r, t \mid r_0, t_0)}{\partial t_0} = \sum_{i=1}^{N} v_i(r_0, t_0)\, \nabla_i^0\, p(r, t \mid r_0, t_0) + \sum_{i=1}^{N} \sum_{j=1}^{N} d_{ij}(r_0, t_0)\, \nabla_i^0 \nabla_j^0\, p(r, t \mid r_0, t_0). \tag{4.4}$$
The initial condition in both cases is p(r, t = t0 | r0 , t0 ) = p(r, t | r0 , t0 = t) = δ(r − r0 ).
(4.5)
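In the following we rewrite the conditional probability $p(r, t \mid r_0, t_0)$ for times $t > t_0$ as $p(\{x\}, t \mid \{y\}, s)$ and define the forward and the adjoint backward Fokker–Planck operators $L_F$ and $L_B$, respectively. The general Fokker–Planck equations for N variables then read

$$\frac{\partial p(\{x\}, t \mid \{y\}, s)}{\partial t} = L_F(\{x\}, t)\, p(\{x\}, t \mid \{y\}, s) \quad \text{with} \quad L_F(\{x\}, t) = -\sum_{i=1}^{N} \frac{\partial}{\partial x_i} v_i(\{x\}, t) + \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{\partial^2}{\partial x_i \partial x_j} d_{ij}(\{x\}, t) \tag{4.6}$$

in agreement with (4.1), and

$$\frac{\partial p(\{x\}, t \mid \{y\}, s)}{\partial s} = -L_B(\{y\}, s)\, p(\{x\}, t \mid \{y\}, s) \quad \text{with} \quad L_B(\{y\}, s) = \sum_{i=1}^{N} v_i(\{y\}, s) \frac{\partial}{\partial y_i} + \sum_{i=1}^{N} \sum_{j=1}^{N} d_{ij}(\{y\}, s) \frac{\partial^2}{\partial y_i \partial y_j} \tag{4.7}$$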
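in agreement with (4.4). Finally we consider the most frequently investigated one-dimensional time-homogeneous situation with a finite state space I. The forward Fokker–Planck equation (4.6) reduces to

$$\frac{\partial p(x, t \mid y, s)}{\partial t} = L_F(x)\, p(x, t \mid y, s) \quad \text{with} \quad L_F(x) = -\frac{\partial}{\partial x} v(x) + \frac{\partial^2}{\partial x^2} d(x), \tag{4.8}$$

whereas the backward equation (4.7) is

$$\frac{\partial p(x, t \mid y, s)}{\partial s} = -L_B(y)\, p(x, t \mid y, s) \quad \text{with} \quad L_B(y) = v(y) \frac{\partial}{\partial y} + d(y) \frac{\partial^2}{\partial y^2}. \tag{4.9}$$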
Both operators $L_F$, $L_B$ are acting on functions in the interval $I = [a, b]$ subject to appropriate boundary conditions. To investigate the relationship between both operators we consider the following second-order differential operator $L$ acting on function $u(x)$

$$L u(x) \equiv a_0(x) \frac{d^2}{dx^2} u(x) + a_1(x) \frac{d}{dx} u(x) + a_2(x) u(x). \tag{4.10}$$

Then the adjoint operator $L^\dagger$ defined by

$$\int dx\, v(x)\, L u(x) = \int dx\, u(x)\, L^\dagger v(x) \tag{4.11}$$

becomes

$$L^\dagger u(x) \equiv \frac{d^2}{dx^2} \left[ a_0(x) u(x) \right] - \frac{d}{dx} \left[ a_1(x) u(x) \right] + a_2(x) u(x). \tag{4.12}$$

To find the condition that the operator will be self-adjoint, so that $L^\dagger = L$, we figure out from (4.12)

$$L^\dagger u(x) = a_0(x) \frac{d^2}{dx^2} u(x) + \left[ 2 \frac{d a_0(x)}{dx} - a_1(x) \right] \frac{d}{dx} u(x) + \left[ \frac{d^2 a_0(x)}{dx^2} - \frac{d a_1(x)}{dx} + a_2(x) \right] u(x) \tag{4.13}$$

the required condition

$$\frac{d a_0(x)}{dx} = a_1(x) \tag{4.14}$$

that gives

$$\begin{aligned}
L u(x) = L^\dagger u(x) &= a_0(x) \frac{d^2}{dx^2} u(x) + \frac{d a_0(x)}{dx} \frac{d}{dx} u(x) + a_2(x) u(x) \\
&= \frac{d^2}{dx^2} \left[ a_0(x) u(x) \right] - \frac{d}{dx} \left[ \frac{d a_0(x)}{dx} u(x) \right] + a_2(x) u(x) \\
&= \frac{d}{dx} \left[ a_0(x) \frac{d}{dx} u(x) \right] + a_2(x) u(x).
\end{aligned} \tag{4.15}$$

This shows the equivalence between the forward and backward Fokker–Planck equation.
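The step from the definition (4.11) to the explicit form (4.12) is a double integration by parts. The following worked lines are a sketch of that step, under the assumption that the boundary terms vanish for the functions $u$, $v$ admitted by the boundary conditions on $[a, b]$.

```latex
\begin{aligned}
\int_a^b dx\, v\, a_0 u''
  &= \Big[ a_0 v\, u' - (a_0 v)'\, u \Big]_a^b + \int_a^b dx\, u\, (a_0 v)'' , \\
\int_a^b dx\, v\, a_1 u'
  &= \Big[ a_1 v\, u \Big]_a^b - \int_a^b dx\, u\, (a_1 v)' , \\
\int_a^b dx\, v\, L u
  &= \int_a^b dx\, u \left[ (a_0 v)'' - (a_1 v)' + a_2 v \right]
   + \Big[ a_0 v\, u' - (a_0 v)'\, u + a_1 v\, u \Big]_a^b .
\end{aligned}
```

With vanishing boundary terms the bracket under the last integral is exactly $L^\dagger v$ of (4.12). Comparing $L_F$ of (4.8) with $L_B$ of (4.9) in the same way, with $a_0 = d$, $a_1 = v$ and $a_2 = 0$, shows that $L_F$ has the form of $L_B^\dagger$.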
4.2 Bounded Drift–Diffusion in One Dimension
Now we are going to consider the one-dimensional drift–diffusion problem in a finite interval with a reflecting (left at position a) and an absorbing (right at position b) boundary. The forward Fokker–Planck equation reads

$$\frac{\partial p(x, t \mid x_0, t_0)}{\partial t} = L_F\, p(x, t \mid x_0, t_0) \equiv \frac{\partial}{\partial x} \left[ -v(x) + \frac{\partial}{\partial x} d(x) \right] p(x, t \mid x_0, t_0) \tag{4.16}$$

with the initial condition (usually the delta distribution)

$$p(x, t = t_0 \mid x_0, t_0) = p_0(x_0) = \delta(x - x_0) \tag{4.17}$$

including the boundary conditions at the left

$$j(x = a, t) \equiv \left[ v(x) - \frac{\partial}{\partial x} d(x) \right] p(x, t \mid x_0, t_0) \Big|_{x=a} = 0 \tag{4.18}$$

and right border

$$p(x, t \mid x_0, t_0) \Big|_{x=b} = 0. \tag{4.19}$$
In order to discuss the same physical problem by different means we consider, in analogy to the forward dynamics, the appropriate backward Fokker–Planck equation, given by

$$-\frac{\partial p(x, t \mid x_0, t_0)}{\partial t_0} = L_B\, p(x, t \mid x_0, t_0) \equiv \left[ v(x_0) + d(x_0) \frac{\partial}{\partial x_0} \right] \frac{\partial}{\partial x_0} p(x, t \mid x_0, t_0) \tag{4.20}$$

with initial condition (once again the delta distribution)

$$p(x, t \mid x_0, t_0 = t) = p_0(x) = \delta(x - x_0) \tag{4.21}$$

including the reflecting boundary at the left

$$\frac{\partial}{\partial x_0} p(x, t \mid x_0, t_0) \Big|_{x_0=a} = 0 \tag{4.22}$$

and absorbing at the right border

$$p(x, t \mid x_0, t_0) \Big|_{x_0=b} = 0. \tag{4.23}$$
Since we obtain a unique solution of the conditional probability density $p(x, t \mid x_0, t_0)$ by two different ways (forward or backward dynamics, see Figure 4.1) we have to remember the Markov property of the process based on the Chapman–Kolmogorov equation

$$p(x, t \mid x_0, t_0) = \int_a^b dy\, p(x, t \mid y, s)\, p(y, s \mid x_0, t_0). \tag{4.24}$$

After differentiation of both sides with respect to time $s$ in the first step we get (following [84])

$$0 = \frac{\partial}{\partial s} \int_a^b dy\, p(x, t \mid y, s)\, p(y, s \mid x_0, t_0) \tag{4.25}$$

$$= \int_a^b dy\, p(x, t \mid y, s) \frac{\partial}{\partial s} p(y, s \mid x_0, t_0) + \int_a^b dy\, p(y, s \mid x_0, t_0) \frac{\partial}{\partial s} p(x, t \mid y, s). \tag{4.26}$$
Figure 4.1 Schematic view of forward (a) and backward (b) dynamics showing three different stochastic trajectories x(t) in finite state space a ≤ x ≤ b.
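Inserting the corresponding Fokker–Planck operators $L_F$ and $L_B$

$$0 = \int_a^b dy\, p(x, t \mid y, s)\, L_F(y)\, p(y, s \mid x_0, t_0) - \int_a^b dy\, p(y, s \mid x_0, t_0)\, L_B(y)\, p(x, t \mid y, s) \tag{4.27}$$

we get

$$0 = \int_a^b dy\, p(x, t \mid y, s) \frac{\partial}{\partial y} \left[ -v(y) + \frac{\partial}{\partial y} d(y) \right] p(y, s \mid x_0, t_0) - \int_a^b dy\, p(y, s \mid x_0, t_0) \left[ v(y) + d(y) \frac{\partial}{\partial y} \right] \frac{\partial}{\partial y} p(x, t \mid y, s). \tag{4.28}$$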
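Doing several mathematical manipulations using $u v'' - v u'' = (u v' - v u')'$

$$\begin{aligned}
0 &= -\int_a^b dy\, \frac{\partial}{\partial y} \left[ v(y)\, p(x, t \mid y, s)\, p(y, s \mid x_0, t_0) \right] \\
&\quad + \int_a^b dy\, p(x, t \mid y, s) \frac{\partial^2}{\partial y^2} \left[ d(y)\, p(y, s \mid x_0, t_0) \right] - \int_a^b dy\, p(y, s \mid x_0, t_0)\, d(y) \frac{\partial^2}{\partial y^2} p(x, t \mid y, s) \\
&= -\int_a^b dy\, \frac{\partial}{\partial y} \left[ v(y)\, p(x, t \mid y, s)\, p(y, s \mid x_0, t_0) \right] \\
&\quad + \int_a^b dy\, \frac{\partial}{\partial y} \left\{ p(x, t \mid y, s) \frac{\partial}{\partial y} \left[ d(y)\, p(y, s \mid x_0, t_0) \right] - d(y)\, p(y, s \mid x_0, t_0) \frac{\partial}{\partial y} p(x, t \mid y, s) \right\} \\
&= \int_a^b dy\, \frac{\partial}{\partial y} \left\{ p(x, t \mid y, s) \left[ -v(y)\, p(y, s \mid x_0, t_0) + \frac{\partial}{\partial y} \big( d(y)\, p(y, s \mid x_0, t_0) \big) \right] \right\} \\
&\quad - \int_a^b dy\, \frac{\partial}{\partial y} \left\{ d(y)\, p(y, s \mid x_0, t_0) \frac{\partial}{\partial y} p(x, t \mid y, s) \right\}
\end{aligned} \tag{4.29}$$

after integration we arrive at

$$\left\{ -p(x, t \mid y, s) \left[ v(y)\, p(y, s \mid x_0, t_0) - \frac{\partial}{\partial y} \big( d(y)\, p(y, s \mid x_0, t_0) \big) \right] \right\}_{y=a}^{y=b} = \left\{ d(y)\, p(y, s \mid x_0, t_0) \frac{\partial}{\partial y} p(x, t \mid y, s) \right\}_{y=a}^{y=b}. \tag{4.30}$$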
Taking into account the forward Fokker–Planck equation with reflecting left boundary

$$j(y = a, t) = v(y = a)\, p(y = a, s \mid x_0, t_0) - \frac{\partial}{\partial y} \left[ d(y)\, p(y, s \mid x_0, t_0) \right] \Big|_{y=a} = 0 \tag{4.31}$$

and absorbing right boundary

$$p(y = b, s \mid x_0, t_0) = 0 \tag{4.32}$$

we get from (4.30)

$$p(x, t \mid y = b, s)\, d(y = b) \frac{\partial}{\partial y} p(y, s \mid x_0, t_0) \Big|_{y=b} = -p(y = a, s \mid x_0, t_0)\, d(y = a) \frac{\partial}{\partial y} p(x, t \mid y, s) \Big|_{y=a} = 0 \tag{4.33}$$

the corresponding conditions for the backward Fokker–Planck equation, e.g. at the left reflecting wall

$$\frac{\partial}{\partial y} p(x, t \mid y, s) \Big|_{y=a} = 0 \tag{4.34}$$

and at the right absorbing border

$$p(x, t \mid y = b, s) = 0. \tag{4.35}$$
Similar calculations can be done for other boundaries such as two reflecting or two open (absorbing) ones. To summarize up to this point, we state that both Fokker–Planck dynamics with the corresponding boundary conditions are equivalent and give the same results. In the following we consider the typical one-dimensional exit problem of a Brownian particle (drift–diffusion dynamics) from a bounded domain, whose boundary is usually reflecting, except for an absorbing window. In higher-dimensional stochastic dynamics, the average lifetime or the mean first passage time (MFPT) increases indefinitely as the absorbing part of the otherwise reflecting surface shrinks to zero. This so-called narrow escape problem becomes important for different kinds of channels such as capillary outflow. Important investigations by A. Singer et al. [217] derive the leading order term in the expansion of the mean first-passage time $\langle t \rangle$ of a Brownian particle with diffusion coefficient $D$ escaping from a general domain of volume $|V|$ to an elliptical hole of large semi-axis $a$ that is much smaller than $|V|^{1/3}$

$$\langle t \rangle \sim \frac{|V|}{2 \pi a D} K(e), \tag{4.36}$$

where $e$ is the eccentricity of the ellipse, and $K(x)$ is the complete elliptic integral of the first kind. In the special case of a small circular hole of radius $a$ the result

$$\langle t \rangle \sim \frac{|V|}{4 a D} \tag{4.37}$$

was already known to Baron Rayleigh, who published it in his famous book The Theory of Sound (1877, first edition). In [217] the Rayleigh formula has been extended by derivation of the second-order term and the error estimate for a ball of radius $R$ with a circular hole (radius $a$) in the boundary

$$\langle t \rangle = \frac{|V|}{4 a D} \left[ 1 + \frac{a}{R} \log \frac{R}{a} + O\!\left( \frac{a}{R} \right) \right]. \tag{4.38}$$
4.3 The Escape Problem and its Solution
Define a new quantity $G(t, x_0)$ based on the solution $p(x, t \mid x_0, t_0)$ as the probability of finding $x(t)$, starting from $x_0$ at initial time $t_0$, still in the finite interval $a \le x < b$ by

$$G(t, x_0) = \int_a^b p(x, t \mid x_0, t_0)\, dx, \tag{4.39}$$

which is related to the probability current density $P(t, x_0)$ of leaving the system for the first time at the right border $x = b$

$$P(t, x_0) = -\frac{\partial}{\partial t} G(t, x_0). \tag{4.40}$$
This equation is a global conservation law stating that $x(t)$ is either in the interval $a \le x < b$ (volume) or is passing the absorbing boundary $x = b$ (surface). The outflow function $P(t, x_0)\, dt$ is the first-passage time probability that $x(t)$ passes the value $x = b$ (right boundary) for the first time in time interval $(t, t + dt)$ after starting from $x(t = t_0) = x_0$. In a paper published in 1951 by Siegert [215] the so-called first passage time problem has been treated analytically for the case when the reflecting boundary is going to infinity to become a natural one, that is, with $a \to -\infty$. Defining all moments by

$$\langle t^n \rangle (x_0 \to b) = \int_0^\infty t^n P(t, x_0)\, dt \tag{4.41}$$

the zeroth moment is obviously normalization. The first moment is called the mean first passage time (MFPT) whereas the inverse is known as the escape or breakdown rate. The mean first-passage time tells us how long it takes on average to move from $x_0 \in [a, b)$ inside the interval to the open boundary at $b$ taking into account the reflecting wall at $a$. Using the above definition we get for the first moment

$$\langle t^1 \rangle (x_0 \to b) = \int_0^\infty t\, P(t, x_0)\, dt = -\int_0^\infty t\, \frac{\partial}{\partial t} G(t, x_0)\, dt = \int_0^\infty G(t, x_0)\, dt \tag{4.42}$$

and also higher moments ($n \ge 1$) with similar integration by parts

$$\langle t^n \rangle (x_0 \to b) = n \int_0^\infty t^{n-1} G(t, x_0)\, dt. \tag{4.43}$$
Now we are looking for an alternative way to calculate the moments (4.41) directly to avoid the knowledge of the basic function $p(x, t \mid x_0, t_0)$ as a solution of the Fokker–Planck dynamics. Starting with the backward Fokker–Planck equation and exchanging the time derivative from $t_0$ to $t$

$$-\frac{\partial p(x, t \mid x_0, t_0)}{\partial t_0} = L_B\, p(x, t \mid x_0, t_0) = \frac{\partial p(x, t \mid x_0, t_0)}{\partial t} \tag{4.44}$$

and integrating over $x$ we get a partial differential equation for $G(t, x_0)$

$$\frac{\partial}{\partial t} G(t, x_0) = \left[ v(x_0) \frac{\partial}{\partial x_0} + d(x_0) \frac{\partial^2}{\partial x_0^2} \right] G(t, x_0) \tag{4.45}$$

together with $G(t = t_0, x_0) = 1$ and

$$\frac{\partial}{\partial x_0} G(t, x_0) \Big|_{x_0=a} = 0; \qquad G(t, x_0) \Big|_{x_0=b} = 0. \tag{4.46}$$

Either this equation can be solved to get the moments from $G(t, x_0)$ or we can derive a hierarchic set of equations for the moments. After multiplication of (4.45) by $n t^{n-1}$ and integration over time $t$ from $t_0 = 0$ to infinity we get

$$\left[ v(x_0) \frac{\partial}{\partial x_0} + d(x_0) \frac{\partial^2}{\partial x_0^2} \right] \int_0^\infty n t^{n-1} G(t, x_0)\, dt = \int_0^\infty n t^{n-1} \frac{\partial G(t, x_0)}{\partial t}\, dt = -n(n-1) \int_0^\infty t^{n-2} G(t, x_0)\, dt. \tag{4.47}$$

The result is for all $n \ge 1$

$$\left[ v(x_0) \frac{d}{dx_0} + d(x_0) \frac{d^2}{dx_0^2} \right] \langle t^n \rangle (x_0 \to b) = -n\, \langle t^{n-1} \rangle (x_0 \to b) \tag{4.48}$$

with closure condition $\langle t^0 \rangle (x_0 \to b) = 1$ and

$$\frac{d}{dx_0} \langle t^n \rangle (x_0 \to b) \Big|_{x_0=a} = 0; \qquad \langle t^n \rangle (x_0 \to b) \Big|_{x_0=b} = 0. \tag{4.49}$$
To get the mean first-passage time (MFPT, $n = 1$) the following equation has to be solved

$$\left[ v(x_0) \frac{d}{dx_0} + d(x_0) \frac{d^2}{dx_0^2} \right] \langle t^1 \rangle (x_0 \to b) = -1 \tag{4.50}$$

which can be done directly in simple cases. In the following we consider the so-called drift–diffusion motion with constant drift $v(x_0) = v$ as well as constant diffusion $d(x_0) = D$. The first moment $\langle t^1 \rangle (x_0 \to b)$ (mean first passage time) has to be calculated from the inhomogeneous differential equation

$$\left[ \frac{d^2}{dx_0^2} + \frac{v}{D} \frac{d}{dx_0} \right] \langle t^1 \rangle (x_0 \to b) = -\frac{1}{D} \tag{4.51}$$

as superposition of the homogeneous and the particular solution

$$\langle t^1 \rangle (x_0 \to b) = \langle t^1 \rangle_{\mathrm{hom}} + \langle t^1 \rangle_{\mathrm{par}} = C_1 + C_2\, e^{-\frac{v}{D} x_0} - \frac{1}{v} x_0. \tag{4.52}$$

Taking into account the boundary conditions (4.49) the previously unknown coefficients $C_1$ and $C_2$ become

$$C_1 = \frac{b}{v} + \frac{D}{v^2}\, e^{-\frac{v}{D}(b - a)}; \qquad C_2 = -\frac{D}{v^2}\, e^{\frac{v}{D} a} \tag{4.53}$$

which finally gives the following solution

$$\langle t^1 \rangle (x_0 \to b) = \frac{b - x_0}{v} + \frac{D}{v^2} \left[ e^{-\frac{v}{D}(b - a)} - e^{-\frac{v}{D}(x_0 - a)} \right] \tag{4.54}$$

including the special situation of starting farthest away, at the left wall (put $x_0$ at the left boundary $a$)

$$\langle t^1 \rangle (x_0 = a \to b) = \frac{b - a}{v} + \frac{D}{v^2} \left[ e^{-\frac{v}{D}(b - a)} - 1 \right]. \tag{4.55}$$

The special case $v = 0$ indicates pure diffusion which follows as the limit from above

$$\lim_{v \to 0} \langle t^1 \rangle (x_0 = a \to b) = \frac{1}{2} \frac{(b - a)^2}{D}. \tag{4.56}$$
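The closed form (4.54) can be cross-checked against a direct numerical solution of the boundary value problem (4.50) with (4.49). The sketch below does this with central finite differences and the Thomas algorithm for the resulting tridiagonal system. The function names, the grid size and the use of `math.expm1` to reduce cancellation at small drift are choices made here for illustration; the general-$x_0$ form of the $v \to 0$ limit used in `mfpt_exact` is the pure-diffusion solution of (4.51), which reduces to (4.56) at $x_0 = a$.

```python
import math


def mfpt_exact(x0, a, b, v, D):
    """Mean first-passage time (4.54) for constant drift v and diffusion D."""
    L = b - a
    k = v / D
    if abs(k * L) < 1e-8:
        # Pure-diffusion limit v -> 0; equals (b - a)^2 / (2 D) at x0 = a, cf. (4.56).
        return (b - x0) * (b + x0 - 2.0 * a) / (2.0 * D)
    # expm1(z) = exp(z) - 1; the two "-1" cancel in the difference.
    return (b - x0) / v + (D / v**2) * (math.expm1(-k * L) - math.expm1(-k * (x0 - a)))


def mfpt_finite_difference(a, b, v, D, n=400):
    """Solve D t'' + v t' = -1 with t'(a) = 0 and t(b) = 0 on n intervals.

    Returns a list t of length n + 1 with t[i] approximating the MFPT
    for a start at x_i = a + i * (b - a) / n.
    """
    h = (b - a) / n
    c = D / h**2
    lower = [0.0] + [c - v / (2.0 * h)] * (n - 1)
    diag = [-2.0 * c] * n
    # Row 0 uses a mirror (ghost) point t_{-1} = t_1 for the reflecting wall.
    upper = [2.0 * c] + [c + v / (2.0 * h)] * (n - 1)
    rhs = [-1.0] * n
    # Thomas algorithm: forward elimination ...
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # ... and back substitution; t[n] = 0 is the absorbing boundary.
    t = [0.0] * (n + 1)
    t[n - 1] = rhs[n - 1] / diag[n - 1]
    for i in range(n - 2, -1, -1):
        t[i] = (rhs[i] - upper[i] * t[i + 1]) / diag[i]
    return t
```

For example, with $a = 0$, $b = 1$ and $v = D = 1$ the value (4.55) equals $e^{-1}$, and the first entry of `mfpt_finite_difference(0.0, 1.0, 1.0, 1.0)` should agree with it up to the discretization error of the scheme.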
To make expressions easier in order to draw the solution (4.54) as a function of the drift parameter we introduce the dimensionless quantities

$$y = \frac{x - a}{b - a}; \qquad T = \frac{D}{(b - a)^2}\, t; \qquad \Omega = \frac{v}{D} (b - a) \tag{4.57}$$

in order to get, instead of $\langle t^1 \rangle (x_0 \to b)$ (4.54), the dimensionless first-passage time $\langle T^1 \rangle (y_0 \to 1)$ including the limiting case $\Omega = 0$ as

$$\langle T^1 \rangle (y_0 \to 1)(\Omega) =
\begin{cases}
\dfrac{1}{\Omega} (1 - y_0) + \dfrac{1}{\Omega^2} \left( e^{-\Omega} - e^{-\Omega y_0} \right), & \Omega \ne 0, \\[2ex]
\dfrac{1}{2} \left( 1 - y_0^2 \right), & \Omega = 0.
\end{cases} \tag{4.58}$$

The results are shown in Figure 4.2 for the case $y_0 = 0$. The full line gives the complete solution (4.58) including the diffusion result 1/2 for $\Omega = 0$. If the drift is directed to the right ($\Omega > 0$) the system reaches the absorbing border very quickly with $\langle T^1 \rangle (0 \to 1) \to 0$ at $\Omega \to +\infty$. In the opposite case (negative drift) the probability of reaching the right border is shrinking and the mean first passage time increases very quickly with $\langle T^1 \rangle (0 \to 1) \to \infty$ at $\Omega \to -\infty$. The second term in (4.58), $e^{-\Omega} / \Omega^2$ (dotted curve), is a good approximation in the latter extreme case. It fails around $\Omega \approx 0$, where this term diverges. The dashed straight line indicates the value 1/2 which is correct for pure diffusion only ($\Omega = 0$).

Figure 4.2 Dimensionless mean first-passage time $\langle T^1 \rangle (y_0 \to 1)$ (4.58) with initial position $y_0 = 0$ (at the left reflecting wall) depending on the drift force $\Omega$ (full curve). The dotted curve shows the approximation by $\Omega^{-2} e^{-\Omega}$; the dashed line shows the correct value 1/2 valid for $\Omega = 0$ (pure diffusion) only.
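The behaviour described for Figure 4.2 can be reproduced with a few lines of Python. The sketch below evaluates (4.58) and the large-negative-drift approximation $e^{-\Omega}/\Omega^2$; the function names are illustrative, and the first-order correction used for very small $|\Omega|$ is a Taylor expansion of (4.58) added here only to avoid numerical cancellation.

```python
import math


def T1(omega, y0=0.0):
    """Dimensionless mean first-passage time (4.58) from y0 to the border at 1."""
    if abs(omega) < 1e-4:
        # Taylor expansion of (4.58) around omega = 0 (first two terms).
        return 0.5 * (1.0 - y0**2) - omega * (1.0 - y0**3) / 6.0
    # expm1(z) = exp(z) - 1 keeps the difference of exponentials accurate.
    diff = math.expm1(-omega) - math.expm1(-omega * y0)
    return (1.0 - y0) / omega + diff / omega**2


def T1_strong_negative_drift(omega):
    """Approximation exp(-omega) / omega^2 (dotted curve in Figure 4.2)."""
    return math.exp(-omega) / omega**2


# Sample a few drift values for the start at the reflecting wall, y0 = 0.
samples = [(om, T1(om)) for om in (-10.0, -5.0, 0.0, 5.0, 10.0)]
```

At $\Omega = -10$ the approximation agrees with the full expression to better than one percent, whereas for positive drift the leading behaviour is $1/\Omega - 1/\Omega^2$.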
4.4 Derivation of the Fokker–Planck Equation
The master equation and also the Fokker–Planck equation are both useful for describing the time development of the probability density function $p(x, t)$ for a continuous variable $x$. In the following we want to discuss the one-dimensional case in detail. The Fokker–Planck equation follows from the master equation (3.20)

$$\frac{\partial p(x, t)}{\partial t} = \int_{-\infty}^{+\infty} \left[ w(x, x', t)\, p(x', t) - w(x', x, t)\, p(x, t) \right] dx' \tag{4.59}$$

due to the Kramers–Moyal expansion where only the first two leading terms are retained. As distinct from (3.20), here we allow a more general case, that is where the transition frequencies depend on time $t$. The derivation can be found in many textbooks, see e.g. [55]. By introducing the quantity $f(y, x, t) = w(x + y, x, t)$, the master equation (4.59) can be written as

$$\frac{\partial p(x, t)}{\partial t} = \int_{-\infty}^{+\infty} \left[ f(y, x - y, t)\, p(x - y, t) - f(y, x, t)\, p(x, t) \right] dy. \tag{4.60}$$

It is assumed that $f(y, x - y, t)$ is a smooth function with respect to $y$. The basic idea is to expand the quantity $f(y, x - y, t)\, p(x - y, t)$ in a Taylor series around $y = 0$, which yields the Kramers–Moyal expansion

$$\frac{\partial p(x, t)}{\partial t} = \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \frac{\partial^n}{\partial x^n} \left[ \alpha_n(x, t)\, p(x, t) \right], \tag{4.61}$$

where

$$\alpha_n(x, t) = \int_{-\infty}^{+\infty} y^n f(y, x, t)\, dy = \int_{-\infty}^{+\infty} (x' - x)^n\, w(x', x, t)\, dx' \tag{4.62}$$

are the nth-order moments of the transition frequencies $w(x', x, t)$. Retaining only the first two expansion terms in (4.61) one obtains the well known Fokker–Planck equation in forward notation

$$\frac{\partial p(x, t)}{\partial t} = -\frac{\partial}{\partial x} \left[ \alpha_1(x, t)\, p(x, t) \right] + \frac{1}{2} \frac{\partial^2}{\partial x^2} \left[ \alpha_2(x, t)\, p(x, t) \right]. \tag{4.63}$$
The first term in (4.63) is called the drift term and the second one is called the diffusion or fluctuation term. This is due to the analogy with a drift–diffusion equation where the first derivative describes the drift of the probability profile without changing its form, whereas the second one describes the pure diffusion effect. In fact, (4.63) is a drift–diffusion equation for the probability $p(x, t)$. The diffusion or spreading of the probability distribution profile occurs due to the stochastic fluctuations, therefore the second term in (4.63) is also called the fluctuation term. More explicitly, (4.63) is called the forward Fokker–Planck equation to distinguish it from the backward Fokker–Planck equation which describes the evolution of the conditional probability $p(x, t \mid x', t')$ with respect to the initial time $t'$. Like the backward master equation considered in Section 3.4, the backward Fokker–Planck equation is useful for studying the first-passage problem, which has already been presented in the previous section. The derivation in both cases is similar. Namely, we can write (3.54) for the continuous variable $x$

$$p(x, t \mid x', t' + \Delta t) - p(x, t \mid x', t') = \int p(x'', t' + \Delta t \mid x', t') \left[ p(x, t \mid x', t' + \Delta t) - p(x, t \mid x'', t' + \Delta t) \right] dx'' \tag{4.64}$$

and take the limit $\Delta t \to 0$. In the same approximation which has been used to obtain (4.63), the probability $p(x, t \mid x'', t')$ can be expanded in a Taylor series around $x'' = x'$ retaining the first two terms only. It yields the backward Fokker–Planck equation

$$\frac{\partial p(x, t \mid x', t')}{\partial t'} = -\alpha_1(x', t') \frac{\partial p(x, t \mid x', t')}{\partial x'} - \frac{1}{2} \alpha_2(x', t') \frac{\partial^2 p(x, t \mid x', t')}{\partial x'^2}, \tag{4.65}$$

where the drift coefficient $\alpha_1(x', t')$ and the diffusion coefficient $\alpha_2(x', t')$ are the moments of the transition frequency (4.62) also entering the forward Fokker–Planck equation. The connection to the stochastic differential equation calculus is given by the following Langevin equation

$$dx(t) = \alpha_1(x, t)\, dt + \sqrt{\alpha_2(x, t)}\, dW(t). \tag{4.66}$$

The term $dW(t) = W(t + dt) - W(t)$ is called the Wiener increment (noise term) with the properties $\langle dW(t) \rangle = 0$ and $\langle dW(t)^2 \rangle = dt$.
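The Langevin picture offers an independent check of the escape-time result of Section 4.3. The following sketch integrates (4.66) with the Euler–Maruyama scheme for constant coefficients $\alpha_1 = v$ and $\alpha_2 = 2D$, reflects the trajectory at the left wall $a$, stops it at the absorbing border $b$, and averages the stopping times. The function name, the step size, the number of paths and the mirror-reflection rule are assumptions of this illustration; the finite time step is known to overestimate the mean first-passage time slightly, so only rough agreement with (4.54) should be expected.

```python
import math
import random


def simulate_mfpt(v, D, a, b, x0, dt=1e-3, n_paths=1000, seed=1):
    """Monte Carlo estimate of the mean first-passage time from x0 to b.

    Euler-Maruyama steps x += v*dt + sqrt(2*D*dt)*xi with xi ~ N(0, 1),
    mirror reflection at x = a, absorption as soon as x >= b.
    """
    rng = random.Random(seed)      # fixed seed for reproducibility
    sigma = math.sqrt(2.0 * D * dt)
    total_time = 0.0
    for _ in range(n_paths):
        x = x0
        t = 0.0
        while x < b:
            x += v * dt + sigma * rng.gauss(0.0, 1.0)
            if x < a:              # reflecting wall: mirror the overshoot
                x = 2.0 * a - x
            t += dt
        total_time += t
    return total_time / n_paths
```

For $a = 0$, $b = 1$, $v = D = 1$ and a start at the reflecting wall, the estimate can be compared with the exact value $e^{-1} \approx 0.368$ following from (4.55).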
4.5 Fokker–Planck Dynamics in Finite State Space
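We consider the general one-dimensional forward Fokker–Planck dynamics (4.63) with time-homogeneous coefficients $D_1(x)$ ($= \alpha_1$; drift) and $D_2(x)$ ($= \alpha_2/2$; diffusion)

$$\frac{\partial p(x, t)}{\partial t} = -\frac{\partial}{\partial x} \left[ D_1(x)\, p(x, t) \right] + \frac{\partial^2}{\partial x^2} \left[ D_2(x)\, p(x, t) \right] \tag{4.67}$$

on a finite state space interval $a < x < b$. The properties of the boundaries (closed at $x = a$ and open at $x = b$) are fixed. In the following comparison, presented in Table 4.1, we show the similarities between forward and backward dynamics in detail and indicate that both methods of solving the problem are equivalent and therefore the solutions are identical. Coming back to the basic equation (4.67), in the first step we change the general diffusion term given by function $D_2(x)$ to an expression with a constant value $D$ using the transformation

$$p(x, t)\, dx = w(\tilde{x}, t)\, d\tilde{x} \quad \text{with} \quad \tilde{x} = \tilde{x}(x). \tag{4.68}$$

Since we have $p(x, t) = w(\tilde{x}, t)\, d\tilde{x}/dx \equiv \tilde{x}'\, w(\tilde{x}, t)$ the time derivative becomes

$$\frac{\partial p(x, t)}{\partial t} = \frac{\partial}{\partial t} \left[ \tilde{x}'\, w \right] = \tilde{x}' \frac{\partial w}{\partial t}, \tag{4.69}$$

whereas the coordinate derivative changes to

$$\frac{\partial p(x, t)}{\partial x} = \frac{\partial}{\partial x} \left[ \tilde{x}'\, w \right] = \tilde{x}''\, w + \tilde{x}' \frac{\partial w}{\partial x} = \tilde{x}''\, w + \left( \tilde{x}' \right)^2 \frac{\partial w}{\partial \tilde{x}}. \tag{4.70}$$

According to (4.67) we rewrite the Fokker–Planck equation as follows

$$\frac{\partial p(x, t)}{\partial t} = -\frac{\partial}{\partial x} \left[ D_1(x)\, p(x, t) - \frac{\partial}{\partial x} \big( D_2(x)\, p(x, t) \big) \right] \tag{4.71}$$

$$= -\frac{\partial}{\partial x} \left[ \big( D_1(x) - D_2'(x) \big)\, p(x, t) - D_2(x) \frac{\partial p(x, t)}{\partial x} \right] \tag{4.72}$$

and exchange the derivatives by (4.69) and (4.70) to get the transformed Fokker–Planck equation

$$\frac{\partial w}{\partial t} = \frac{\partial}{\partial \tilde{x}} \left[ \left( -D_1 \tilde{x}' + D_2' \tilde{x}' + D_2 \tilde{x}'' \right) w + D_2 \left( \tilde{x}' \right)^2 \frac{\partial w}{\partial \tilde{x}} \right]. \tag{4.73}$$

Here we set

$$\tilde{D}_2(\tilde{x}) = D_2(x) \left( \tilde{x}' \right)^2 = D = \text{const.} \tag{4.74}$$

and obtain the previously unknown coordinate transformation $\tilde{x} = \tilde{x}(x)$ as

$$\tilde{x}(x) = \int^{x} \sqrt{\frac{D}{D_2(x')}}\, dx' \tag{4.75}$$

including a constant which has to be determined according to the scaled boundaries $\tilde{x}(x = a) = \tilde{a}$ and $\tilde{x}(x = b) = \tilde{b}$. Defining the new drift coefficient by

$$-\tilde{D}_1(\tilde{x}) = -D_1(x(\tilde{x}))\, \tilde{x}' + D_2'(x)\, \tilde{x}' + D_2\, \tilde{x}'' \tag{4.76}$$

and taking into account $\tilde{D}_2' = D_2' \left( \tilde{x}' \right)^2 + 2 D_2\, \tilde{x}'\, \tilde{x}'' = 0$, finally the transformed Fokker–Planck equation reads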
Table 4.1 Comparison between forward and backward Fokker–Planck dynamics.

| | Forward dynamics $p(x, t \mid x_0, t_0)$ | Backward dynamics $p(x, t \mid x_0, t_0)$ |
|---|---|---|
| Equation | Forward Fokker–Planck equation: $\frac{\partial}{\partial t} p = -\frac{\partial}{\partial x} [D_1 p] + \frac{\partial^2}{\partial x^2} [D_2 p]$ | Backward Fokker–Planck equation: $-\frac{\partial}{\partial t_0} p = \frac{\partial}{\partial t} p = D_1 \frac{\partial}{\partial x_0} p + D_2 \frac{\partial^2}{\partial x_0^2} p$ |
| Initial condition | $p(x, t = t_0 \mid x_0, t_0) = \delta(x - x_0)$ | $p(x, t \mid x_0, t_0 = t) = \delta(x - x_0)$ |
| Closed boundary | Reflecting at $x = a$: $j(x = a, t) = \left[ D_1 p - \frac{\partial}{\partial x} (D_2 p) \right]_{x=a} = 0$ | Reflecting at $x_0 = a$: $\frac{\partial}{\partial x_0} p \,\big\vert_{x_0=a} = 0$ |
| Open boundary | Absorbing at $x = b$: $p(x = b, t \mid x_0, t_0) = 0$ | Absorbing at $x_0 = b$: $p(x, t \mid x_0 = b, t_0) = 0$ |
| Solution | $p = p_F(x, t \mid x_0, t_0)$, identical: $p = p_F = p_B$ | $p = p_B(x, t \mid x_0, t_0)$, identical: $p = p_F = p_B$ |
| Survival probability | $G(t, x_0) = \int_a^b p(x, t \mid x_0, t_0)\, dx$ | $G(t, x_0) = \int_a^b p(x, t \mid x_0, t_0)\, dx$, or directly from $\frac{\partial}{\partial t} G = D_1 \frac{\partial}{\partial x_0} G + D_2 \frac{\partial^2}{\partial x_0^2} G$ with $G(t = t_0, x_0) = 1$ and $\frac{\partial G}{\partial x_0} \big\vert_{x_0=a} = 0$; $G(t, x_0 = b) = 0$ |
| Outflow probability density | $P(t, x_0 \to b) = j(x = b, t)$ | $P(t, x_0 \to b) = -\frac{d}{dt} G(t, x_0)$ |
| Cumulative outflow from $t_0 = 0$ to $t_{\mathrm{obs}}$ | $W(t \le t_{\mathrm{obs}}, x_0 \to b) = \int_0^{t_{\mathrm{obs}}} P(t, x_0 \to b)\, dt$ | $W(t \le t_{\mathrm{obs}}, x_0 \to b) = 1 - G(t_{\mathrm{obs}}, x_0)$ |
| Moments | $\langle t^n \rangle (x_0 \to b) = \int_0^\infty t^n P(t, x_0 \to b)\, dt$ | $\langle t^n \rangle (x_0 \to b) = n \int_0^\infty t^{n-1} G(t, x_0)\, dt$ |
| First moment (MFPT with $x_0 = a$) | $\langle t^1 \rangle (a \to b) = \int_0^\infty t\, P(t, a \to b)\, dt$ | $\langle t^1 \rangle (a \to b) = \int_0^\infty G(t, x_0 = a)\, dt$, or directly from $D_2 \frac{d^2 \langle t^1 \rangle}{dx_0^2} + D_1 \frac{d \langle t^1 \rangle}{dx_0} = -1$ with $\frac{d \langle t^1 \rangle}{dx_0} \big\vert_{x_0=a} = 0$ and $\langle t^1 \rangle \big\vert_{x_0=b} = 0$ |
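$$\frac{\partial w(\tilde{x}, t)}{\partial t} = -\frac{\partial}{\partial \tilde{x}} \left[ \tilde{D}_1(\tilde{x})\, w(\tilde{x}, t) \right] + D \frac{\partial^2 w(\tilde{x}, t)}{\partial \tilde{x}^2} \tag{4.77}$$

with

$$\tilde{D}_1(\tilde{x}) = D_1(x(\tilde{x})) \frac{d\tilde{x}}{dx} + D_2(x(\tilde{x})) \frac{d^2 \tilde{x}}{dx^2}. \tag{4.78}$$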
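In the next step we introduce dimensionless variables, coordinate $y \in [0, 1]$ and time $T$, via

$$ y = y(\tilde{x}) = \frac{\tilde{x} - \tilde{a}}{\tilde{b} - \tilde{a}} \quad \text{and} \quad T = T(t) = \frac{D}{(\tilde{b} - \tilde{a})^2}\, t \tag{4.79} $$

together with a new probability density $P(y, T)$ by

$$ w(\tilde{x}, t)\, d\tilde{x} = P(y, T)\, dy. \tag{4.80} $$

The resulting Fokker–Planck equation is

$$ \frac{\partial P(y, T)}{\partial T} = -\frac{\partial}{\partial y} \left[ A(y) P(y, T) \right] + \frac{\partial^2 P(y, T)}{\partial y^2} \tag{4.81} $$

with drift function $A(y) = \dfrac{\tilde{b} - \tilde{a}}{D}\, \tilde{D}_1(y)$. In the next step we transform this equation to a differential equation of the Sturm–Liouville type (without first derivative) by using the substitution

$$ P(y, T) = e^{-\Phi(y)/2}\, Q(y, T). \tag{4.82} $$

According to (4.82), the terms in the Fokker–Planck equation (4.81) are transformed as follows:

$$ \frac{\partial P}{\partial y} = e^{-\Phi(y)/2} \left[ -\frac{1}{2} \Phi' Q + \frac{\partial Q}{\partial y} \right], \tag{4.83} $$

$$ \frac{\partial^2 P}{\partial y^2} = e^{-\Phi(y)/2} \left[ \frac{\partial^2 Q}{\partial y^2} - \Phi' \frac{\partial Q}{\partial y} + \frac{1}{4} \Phi'^2 Q - \frac{1}{2} \Phi'' Q \right], \tag{4.84} $$

$$ \frac{\partial (AP)}{\partial y} = e^{-\Phi(y)/2} \left[ -\frac{1}{2} A \Phi' Q + A' Q + A \frac{\partial Q}{\partial y} \right]. \tag{4.85} $$

Inserting these relations into (4.81) yields

$$ \frac{\partial Q(y, T)}{\partial T} = \left[ \frac{1}{2} A \Phi' - A' + \frac{1}{4} \Phi'^2 - \frac{1}{2} \Phi'' \right] Q - (A + \Phi') \frac{\partial Q}{\partial y} + \frac{\partial^2 Q}{\partial y^2}. \tag{4.86} $$

In order to cancel the term with the first derivative, one has to choose $\Phi' = -A$ or $\Phi(y) = -\int^y A(y')\, dy'$, which finally leads to the equation of the desired form

$$ \frac{\partial Q}{\partial T} = -A_1(y)\, Q + \frac{\partial^2 Q}{\partial y^2} \tag{4.87} $$

with $A_1(y) = \frac{1}{4} A^2 + \frac{1}{2} A'$. Now we separate the variables by setting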
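$$ Q(y, T) = e^{-\lambda T} \psi(y) \tag{4.88} $$

which leads to the equation

$$ \frac{d^2 \psi}{dy^2} + (\lambda - A_1(y))\, \psi = 0 \tag{4.89} $$

for the $y$-dependent function $\psi(y)$, which can be written also as

$$ L\psi + \lambda \psi = 0, \tag{4.90} $$

where

$$ L = \frac{d^2}{dy^2} - A_1(y) \tag{4.91} $$

is the Sturm–Liouville operator. Equation (4.90) has to be completed with appropriate boundary conditions

$$ \alpha_1 \psi + \beta_1 \frac{d\psi}{dy} = 0 : \quad y = a, \tag{4.92} $$

$$ \alpha_2 \psi + \beta_2 \frac{d\psi}{dy} = 0 : \quad y = b, \tag{4.93} $$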
where α1 , β1 , α2 , and β2 are constants. Equation (4.89) together with the boundary conditions (4.92) and (4.93) represent the Sturm–Liouville problem. Nontrivial particular solutions, which are called eigenfunctions, are possible only at certain values of λ which are called eigenvalues of the problem. The eigenvalues and eigenfunctions of the Sturm–Liouville problem depend on the specific form of A1 (y) and boundary conditions. However, the eigenvalues are always real if A(y) is a real function, since the Sturm–Liouville operator L is self-adjoint. This means that for any two functions g(y) and u(y), which fulfill the boundary conditions (4.92) and (4.93), we have (g, Lu) = (Lg, u),
(4.94)
where (·, ·) denotes the scalar product. For scalar functions this is defined by an integral
$$ (g, u) = \int_a^b g(y')\, u(y')\, dy'. \tag{4.95} $$

Hence, the condition (4.94) for the operator (4.91) becomes

$$ (g, Lu) - (Lg, u) = \int_a^b \left( g \frac{d^2 u}{dy^2} - u \frac{d^2 g}{dy^2} \right) dy = \int_a^b \frac{d}{dy} \left( g \frac{du}{dy} - u \frac{dg}{dy} \right) dy = \left[ g \frac{du}{dy} - u \frac{dg}{dy} \right]_a^b = 0. \tag{4.96} $$

It is obviously fulfilled according to (4.92) and (4.93), which proves that the Sturm–Liouville operator is self-adjoint. If $A(y)$ is real, then $L = L^*$ holds, and therefore the complex conjugate of equation (4.90) reads

$$ L\psi^* + \lambda^* \psi^* = 0. \tag{4.97} $$
By setting g = ψ∗ and u = ψ in (4.94) we obtain (ψ∗ , Lψ) = (Lψ∗ , ψ).
(4.98)
According to (4.90) and (4.97) it reduces to λ∗ (ψ∗ , ψ) = λ(ψ∗ , ψ)
(4.99)
or λ∗ = λ, which means that the eigenvalue λ is real, as we have mentioned already.

4.6 Fokker–Planck Dynamics with Coordinate-Dependent Diffusion Coefficient
Now we are going to investigate a simple example: the evolution of a particle concentration profile $c(x, t)$ with a diffusion coefficient $D$ that depends linearly on the coordinate $x$ as $D(x) = D_0 (1 + gx)$. This task was first solved by Martin [159], but we follow the pathway outlined in Section 4.5 to obtain the solution for the probability density $p(x, t)$. Our result, obtained in a different way from that of Martin for zero initial value, is also a generalization to arbitrary initial conditions. The corresponding diffusion equation reads

$$ \frac{\partial p(x, t)}{\partial t} = \frac{\partial}{\partial x} \left[ D(x) \frac{\partial p(x, t)}{\partial x} \right] = D_0 g\, \frac{\partial p(x, t)}{\partial x} + D_0 (1 + gx)\, \frac{\partial^2 p(x, t)}{\partial x^2}. \tag{4.100} $$
This equation is completed by the reflecting boundary condition at $x = -1/g$ (vanishing flux) and the natural boundary condition at $x = \infty$ (vanishing flux and concentration) as well as the initial condition $p(x, t = 0) = \delta(x - x_0)$. In the following we will show how this equation is treated according to the scheme described in the previous section. First we rewrite it in the form (4.67), that is,

$$ \frac{\partial p(x, t)}{\partial t} = -\frac{\partial}{\partial x} \left[ D_1(x)\, p(x, t) \right] + \frac{\partial^2}{\partial x^2} \left[ D_2(x)\, p(x, t) \right], \tag{4.101} $$
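where $D_1(x) = D_0 g$ and $D_2(x) = D_0 (1 + gx)$. Further on, we make the coordinate transformation (4.75)

$$ \tilde{x}(x) = \int^x \sqrt{\frac{D_0}{D_2(x')}}\, dx' = \int^x \frac{dx'}{\sqrt{1 + gx'}} = \frac{2}{g} \sqrt{1 + gx} \tag{4.102} $$

and the transformation $p(x, t) = w(\tilde{x}, t) \cdot d\tilde{x}/dx = w(\tilde{x}, t) \cdot 2/(g\tilde{x})$ to the density function $w(\tilde{x}, t)$ in the new coordinate space in order to obtain the equation with constant diffusion coefficient

$$ \frac{\partial w(\tilde{x}, t)}{\partial t} = -\frac{\partial}{\partial \tilde{x}} \left[ \frac{D_0}{\tilde{x}}\, w(\tilde{x}, t) \right] + D_0\, \frac{\partial^2 w(\tilde{x}, t)}{\partial \tilde{x}^2} \tag{4.103} $$

in accordance with (4.77) and (4.78). The boundary values of the new coordinate $\tilde{x}$ are $\tilde{a} = \tilde{x}(x = -1/g) = 0$ and $\tilde{b} = \infty$. The constant $D_0$ can be removed by changing the variables $T = t D_0$ and $P(\tilde{x}, T) = w(\tilde{x}, t)$. This yields an equation of the form (4.81) with $A(\tilde{x}) = 1/\tilde{x}$:

$$ \frac{\partial P(\tilde{x}, T)}{\partial T} = -\frac{\partial}{\partial \tilde{x}} \left[ \frac{1}{\tilde{x}}\, P(\tilde{x}, T) \right] + \frac{\partial^2 P(\tilde{x}, T)}{\partial \tilde{x}^2}. \tag{4.104} $$

In the next step we remove the first derivative by changing the variable as

$$ P(\tilde{x}, T) = e^{-\Phi(\tilde{x})/2}\, Q(\tilde{x}, T), \tag{4.105} $$

where

$$ \Phi(\tilde{x}) = -\int^{\tilde{x}} \frac{dy}{y} = -\ln \tilde{x} \tag{4.106} $$

holds for $\tilde{x} > 0$. This leads to an equation of the form (4.87)

$$ \frac{\partial Q}{\partial T} = \frac{1}{(2\tilde{x})^2}\, Q + \frac{\partial^2 Q}{\partial \tilde{x}^2}, \tag{4.107} $$

where the coefficient at $Q$ is set to $-A_1(\tilde{x}) = -\frac{1}{4} A^2(\tilde{x}) - \frac{1}{2}\, dA(\tilde{x})/d\tilde{x} = 1/(2\tilde{x})^2$ according to $A(\tilde{x}) = 1/\tilde{x}$. The boundary conditions are obtained by transforming into the new variables the condition of zero flux,

$$ j = D_0 g\, p(x, t) - \frac{\partial}{\partial x} \left[ D_0 (1 + gx)\, p(x, t) \right] = 0, \tag{4.108} $$

at the boundary $x = -1/g$. The resulting condition is that

$$ \frac{1}{\sqrt{\tilde{x}}} \left[ \frac{1}{2} Q(\tilde{x}, T) - \tilde{x}\, \frac{\partial Q(\tilde{x}, T)}{\partial \tilde{x}} \right] \tag{4.109} $$

must vanish at $\tilde{x} \to 0$ as well as at $\tilde{x} \to \infty$. The separation of variables (cf. (4.88))

$$ Q(\tilde{x}, T) = e^{-\lambda^2 T}\, \Psi(\tilde{x}) \tag{4.110} $$

leads finally to the Sturm–Liouville problem

$$ \frac{d^2 \Psi}{d\tilde{x}^2} + \left[ \lambda^2 + \frac{1}{(2\tilde{x})^2} \right] \Psi = 0 \tag{4.111} $$

with the boundary condition at $\tilde{x} = 0$

$$ \frac{1}{2} \Psi(\tilde{x}) = \left. \tilde{x}\, \frac{\partial \Psi(\tilde{x})}{\partial \tilde{x}} \right|_{\tilde{x} = 0}, \tag{4.112} $$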
following from the vanishing of (4.109), and $\lim_{\tilde{x} \to \infty} \Psi(\tilde{x}) = 0$ according to the physical meaning of the natural boundary condition. The condition (4.112) corresponds to $\Psi(0) = 0$, as will be seen from the following solution. In this case the boundary conditions take the standard form (4.92)–(4.93). To solve our diffusion problem, it is more suitable to return to the form (4.104) containing the first derivative with respect to the coordinate $\tilde{x}$, since we then obtain the known Bessel equation instead of (4.111). The boundary (vanishing flux) condition at $\tilde{x} = 0$ then reads

$$ P(\tilde{x}, T) = \left. \tilde{x}\, \frac{\partial P(\tilde{x}, T)}{\partial \tilde{x}} \right|_{\tilde{x} = 0}, \tag{4.113} $$

and the initial condition $p(x, t = 0) = \delta(x - x_0)$ is

$$ \frac{2 P(\tilde{x}, T = 0)}{g \tilde{x}} = \delta\!\left( \frac{g}{4} \left( \tilde{x}^2 - \tilde{x}_0^2 \right) \right) \tag{4.114} $$

or

$$ P(\tilde{x}, T = 0) = \frac{\tilde{x}}{\tilde{x}_0}\, \delta(\tilde{x} - \tilde{x}_0). \tag{4.115} $$

Here we have used the relation $\delta(f(x)) = \dfrac{1}{|f'(x_0)|}\, \delta(x - x_0)$ with $x_0$ as a root of $f(x)$. To solve this new problem, we now try the separation ansatz $P(\tilde{x}, T) = \chi(T)\, \psi(\tilde{x})$. Inserting it into (4.104) and dividing by $\chi(T)\, \psi(\tilde{x})$ we find

$$ \frac{1}{\chi(T)} \frac{\partial \chi(T)}{\partial T} = \frac{1}{\tilde{x}^2} - \frac{1}{\tilde{x}\, \psi(\tilde{x})} \frac{\partial \psi(\tilde{x})}{\partial \tilde{x}} + \frac{1}{\psi(\tilde{x})} \frac{\partial^2 \psi(\tilde{x})}{\partial \tilde{x}^2}. \tag{4.116} $$

Since the two sides depend on different variables, both have to be constant. Choosing $-\lambda^2$ as the separation constant we get a simple relation for the temporal evolution

$$ \frac{\partial \chi(T)}{\partial T} = -\lambda^2 \chi(T) \tag{4.117} $$

having a solution

$$ \chi(T) = e^{-\lambda^2 T}. \tag{4.118} $$

The general solution contains an arbitrary normalization constant, which is not important here and has been set to one. The spatial solution has to satisfy a more complicated differential equation

$$ \tilde{x}^2\, \frac{\partial^2 \psi(\tilde{x})}{\partial \tilde{x}^2} - \tilde{x}\, \frac{\partial \psi(\tilde{x})}{\partial \tilde{x}} + \left( 1 + \lambda^2 \tilde{x}^2 \right) \psi(\tilde{x}) = 0. \tag{4.119} $$

This equation is quite similar to the Bessel equation

$$ x^2 y''(x) + x\, y'(x) + (x^2 - n^2)\, y(x) = 0. \tag{4.120} $$

We can transform our equation into this form by a change of variables:
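$$ y = \lambda \tilde{x}, \qquad \psi(\tilde{x}) = y\, \phi(y). \tag{4.121} $$

Now we compute

$$ \left( 1 + \lambda^2 \tilde{x}^2 \right) \psi(\tilde{x}) = y\, \phi(y) + y^3 \phi(y), $$

$$ \tilde{x}\, \frac{\partial \psi(\tilde{x})}{\partial \tilde{x}} = y\, \frac{\partial}{\partial y} \left[ y\, \phi(y) \right] = y\, \phi(y) + y^2\, \frac{\partial \phi(y)}{\partial y}, $$

$$ \tilde{x}^2\, \frac{\partial^2 \psi(\tilde{x})}{\partial \tilde{x}^2} = y^2\, \frac{\partial^2}{\partial y^2} \left[ y\, \phi(y) \right] = 2 y^2\, \frac{\partial \phi(y)}{\partial y} + y^3\, \frac{\partial^2 \phi(y)}{\partial y^2}. $$

By inserting this into (4.119) and dividing by $y$ we obtain the Bessel equation

$$ y^2\, \frac{\partial^2 \phi(y)}{\partial y^2} + y\, \frac{\partial \phi(y)}{\partial y} + y^2 \phi(y) = 0. \tag{4.122} $$

The solution is given by a linear combination of Bessel functions $J_n(y)$ and $Y_n(y)$ of zeroth order $n = 0$:

$$ \phi(y) = \frac{1}{\lambda} \left[ c_1 J_0(y) + c_2 Y_0(y) \right]. \tag{4.123} $$

Written in terms of $\tilde{x}$ and $\psi(\tilde{x})$, this reads as

$$ \psi(\tilde{x}) = c_1 \tilde{x}\, J_0(\lambda \tilde{x}) + c_2 \tilde{x}\, Y_0(\lambda \tilde{x}). \tag{4.124} $$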
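As a numerical sanity check of (4.119) and (4.124), the following Python sketch evaluates the left-hand side of (4.119) for the regular branch $\psi(\tilde{x}) = \tilde{x} J_0(\lambda \tilde{x})$ using central finite differences. The helper names (`bessel_j0`, `psi`, `residual`), the step size and the sample points are illustrative assumptions and not part of the derivation; $J_0$ is taken from its power series, which is adequate only for moderate arguments.

```python
def bessel_j0(z, terms=60):
    """Power series J0(z) = sum_k (-1)^k (z/2)^(2k) / (k!)^2 (moderate |z| only)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(z / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def psi(x, lam):
    """Candidate spatial solution psi(x) = x J0(lam x), i.e. (4.124) with c1 = 1, c2 = 0."""
    return x * bessel_j0(lam * x)


def residual(x, lam, h=1e-3):
    """Left-hand side of (4.119), derivatives approximated by central differences."""
    d1 = (psi(x + h, lam) - psi(x - h, lam)) / (2.0 * h)
    d2 = (psi(x + h, lam) - 2.0 * psi(x, lam) + psi(x - h, lam)) / h ** 2
    return x ** 2 * d2 - x * d1 + (1.0 + lam ** 2 * x ** 2) * psi(x, lam)


# Largest residual over a few (lambda, x) pairs; it should be at the level
# of the finite-difference error, i.e. far below the size of the single terms.
worst = max(abs(residual(x, lam))
            for lam in (0.5, 1.0, 2.0)
            for x in (0.5, 1.0, 2.0, 3.0))
print("largest residual of (4.119):", worst)
```

The residual stays small compared with the individual terms of (4.119), which are of order one at these points.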
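This solution has to fit our boundary condition

$$ P(\tilde{x}, T) = \left. \tilde{x}\, \frac{\partial P(\tilde{x}, T)}{\partial \tilde{x}} \right|_{\tilde{x} = 0} \quad \Rightarrow \quad \psi(\tilde{x}) = \left. \tilde{x}\, \frac{\partial \psi(\tilde{x})}{\partial \tilde{x}} \right|_{\tilde{x} = 0}. \tag{4.125} $$

Inserting our solution into this equation we obtain

$$ 0 = \left. \tilde{x} \left[ c_1 J_0'(\lambda \tilde{x}) + c_2 Y_0'(\lambda \tilde{x}) \right] \right|_{\tilde{x} \to 0}, \tag{4.126} $$

where the prime denotes the derivative with respect to $\tilde{x}$. It can be shown that the derivatives of the Bessel functions can be approximated in the limit $\tilde{x} \to 0$ by

$$ \lim_{\lambda \tilde{x} \to 0} J_0'(\lambda \tilde{x}) = 0, \qquad \lim_{\lambda \tilde{x} \to 0} Y_0'(\lambda \tilde{x}) = \frac{2}{\pi \tilde{x}}, \tag{4.127} $$

therefore we find that the constant $c_2$ is zero. In conclusion, the particular solution for each $\lambda$ reads

$$ P_\lambda(\tilde{x}, T) = \chi_\lambda(T)\, \psi_\lambda(\tilde{x}) = c_\lambda\, e^{-\lambda^2 T}\, \tilde{x}\, J_0(\lambda \tilde{x}). \tag{4.128} $$

The general solution can be written as a linear combination of the solutions for different $\lambda$. Since we have a continuous set of eigenvalues $\lambda$, we have to formulate the linear combination as an integral

$$ P(\tilde{x}, T) = \int_0^\infty d\lambda\, C(\lambda)\, e^{-\lambda^2 T}\, \tilde{x}\, J_0(\lambda \tilde{x}), \tag{4.129} $$

where the weight function $C(\lambda)$ is determined by the initial condition (4.115). Consequently, in our example we have to solve the equation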
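$$ \frac{\tilde{x}}{\tilde{x}_0}\, \delta(\tilde{x} - \tilde{x}_0) = \int_0^\infty d\lambda\, C(\lambda)\, \tilde{x}\, J_0(\lambda \tilde{x}). \tag{4.130} $$

For this purpose we intend to use the orthogonality of the Bessel functions. It can be shown that the Bessel functions are orthogonal when we consider the following scalar product

$$ (f, g) = \int_0^\infty f(x)\, g(x)\, x\, dx. \tag{4.131} $$

Thus, the corresponding relation reads

$$ \left( J_0(\lambda x), J_0(\kappa x) \right) = \int_0^\infty J_0(\lambda x)\, J_0(\kappa x)\, x\, dx = \frac{\delta(\lambda - \kappa)}{\lambda}. \tag{4.132} $$

Therefore, we multiply both sides of (4.130) by $\kappa J_0(\kappa \tilde{x})$ and integrate over $\tilde{x}$, which yields

$$ \kappa \int_0^\infty J_0(\kappa \tilde{x})\, \frac{\tilde{x}}{\tilde{x}_0}\, \delta(\tilde{x} - \tilde{x}_0)\, d\tilde{x} = \kappa \int_0^\infty d\tilde{x}\, J_0(\kappa \tilde{x}) \int_0^\infty d\lambda\, C(\lambda)\, \tilde{x}\, J_0(\lambda \tilde{x}) = \int_0^\infty d\lambda\, C(\lambda) \int_0^\infty \kappa\, J_0(\kappa \tilde{x})\, J_0(\lambda \tilde{x})\, \tilde{x}\, d\tilde{x} $$

and we get

$$ \kappa J_0(\kappa \tilde{x}_0)\, \frac{\tilde{x}_0}{\tilde{x}_0} = \int_0^\infty d\lambda\, C(\lambda)\, \delta(\lambda - \kappa), \qquad C(\kappa) = \kappa\, J_0(\kappa \tilde{x}_0). \tag{4.133} $$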
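According to (4.133) and (4.129) and using an integral taken from [60], the final solution reads

$$ P(\tilde{x}, T) = \int_0^\infty d\lambda \left[ \lambda J_0(\lambda \tilde{x}_0) \right] \tilde{x}\, J_0(\lambda \tilde{x})\, e^{-\lambda^2 T} = \tilde{x} \int_0^\infty d\lambda\, \lambda\, J_0(\lambda \tilde{x}_0)\, J_0(\lambda \tilde{x})\, e^{-\lambda^2 T} = \tilde{x}\, \frac{1}{2T} \exp\!\left( -\frac{\tilde{x}^2 + \tilde{x}_0^2}{4T} \right) I_0\!\left( \frac{\tilde{x}_0 \tilde{x}}{2T} \right), \tag{4.134} $$

where $I_0$ is the zeroth-order modified Bessel function (the Bessel function of imaginary argument), given by the general relation $I_n(x) = i^{-n} J_n(ix)$, or for $n = 0$

$$ I_0(x) = J_0(ix) = \sum_{k=0}^\infty \frac{(x/2)^{2k}}{(k!)^2} = 1 + \left( \frac{x}{2} \right)^2 + \cdots. \tag{4.135} $$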
Figure 4.3 Plot of the probability density profile $p(x, t)$ (vertical axis) versus $x$ (horizontal axis) for different time moments ($t = 0.01$, $0.1$, $0.3$, $1$, $5$), starting from an initial delta-like distribution and showing the evolution of the distribution (4.136). Parameters: $x_0 = 0$, $D_0 = 1\ \mathrm{m^2\, s^{-1}}$, $g = 1\ \mathrm{m^{-1}}$.

Replacing $\tilde{x} = \frac{2}{g} \sqrt{1 + gx}$, $T = D_0 t$ and $p(x, t) = P(\tilde{x}, T) \cdot 2/(g\tilde{x})$, we find the concentration $c$ in the variables $x$, $t$ as the final result

$$ p(x, t) = \frac{1}{g D_0 t} \exp\!\left( -\frac{2 + g(x + x_0)}{g^2 D_0 t} \right) I_0\!\left( \frac{2 \sqrt{(1 + gx)(1 + g x_0)}}{g^2 D_0 t} \right). \tag{4.136} $$
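The closed form (4.136) can be evaluated directly. The following Python sketch, given under stated assumptions rather than as the authors' procedure, implements $I_0$ through the series (4.135) and checks by trapezoidal quadrature that the density integrates to one. The function names, the grid, the cut-off `x_max` and the parameter values ($g = D_0 = 1$, $x_0 = 0$, as in Figure 4.3) are illustrative choices; for very small $t$ the plain series would overflow and a scaled Bessel routine would be needed.

```python
import math


def bessel_i0(z):
    """Modified Bessel function I0 summed from the series (4.135)."""
    total, term, k = 0.0, 1.0, 0
    while term > 1e-17 * total:
        total += term
        k += 1
        term *= (z / 2.0) ** 2 / (k * k)
    return total


def density(x, t, x0=0.0, g=1.0, d0=1.0):
    """Probability density (4.136), defined for x >= -1/g."""
    y = max(1.0 + g * x, 0.0)      # guard against rounding at the boundary
    y0 = 1.0 + g * x0
    scale = g * g * d0 * t
    return (math.exp(-(y + y0) / scale)
            * bessel_i0(2.0 * math.sqrt(y * y0) / scale) / (g * d0 * t))


def total_probability(t, x_max=30.0, n=3000):
    """Trapezoidal estimate of the integral of p(x, t) over [-1, x_max] for g = 1."""
    a = -1.0
    h = (x_max - a) / n
    inner = sum(density(a + i * h, t) for i in range(1, n))
    return h * (0.5 * (density(a, t) + density(x_max, t)) + inner)


for t in (0.1, 0.3, 1.0):
    print("t =", t, " integral of p =", total_probability(t))
```

The integral should stay close to one for each time, reflecting the reflecting boundary at $x = -1/g$.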
The resulting distribution is in agreement with the one found in [159] for the special case of zero initial value $x_0$. Its temporal development is shown in Figure 4.3. For a well-founded discussion we want to know the properties of the moments and therefore we have to compute the first moments of the concentration

$$ \langle x^n \rangle = \int_{-1/g}^\infty x^n\, p(x, t)\, dx. \tag{4.137} $$

For simplicity we perform a linear transformation $y = 1 + gx$ such that the lower boundary equals zero. Inserting our solution for the probability density, we find the following integral

$$ \langle x^n \rangle = \frac{1}{g D_0 t} \exp\!\left( -\frac{1 + g x_0}{g^2 D_0 t} \right) \int_0^\infty \frac{dy}{g}\, \frac{(y - 1)^n}{g^n} \exp\!\left( -\frac{y}{g^2 D_0 t} \right) I_0\!\left( \frac{2 \sqrt{1 + g x_0}}{g^2 D_0 t} \sqrt{y} \right). \tag{4.138} $$
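These integrals can be solved analytically with the help of appropriate integral tables. In [60] we can find the following formula

$$ \int_0^\infty x^{\mu - \frac{1}{2}}\, e^{-\alpha x}\, I_{2\nu}(2\beta \sqrt{x})\, dx = \frac{\Gamma\!\left( \mu + \nu + \frac{1}{2} \right)}{\Gamma(2\nu + 1)}\, \frac{e^{\beta^2/(2\alpha)}}{\beta\, \alpha^\mu}\, M_{-\mu, \nu}\!\left( \frac{\beta^2}{\alpha} \right) \quad \text{for } \mu + \nu + \frac{1}{2} > 0. \tag{4.139} $$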
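Since we want to compute the zeroth, first, and second moment, we are interested in the parameters $\mu = \frac{1}{2}, \frac{3}{2}, \frac{5}{2}$ and $\nu = 0$. The functions $M_{-\mu, \nu}$ and the corresponding values of the gamma function in these cases read:

$$ M_{-\frac{1}{2}, 0}(z) = \sqrt{z}\, e^{\frac{1}{2} z}, \qquad \Gamma(1) = 1, \tag{4.140} $$

$$ M_{-\frac{3}{2}, 0}(z) = \sqrt{z}\, (1 + z)\, e^{\frac{1}{2} z}, \qquad \Gamma(2) = 1, \tag{4.141} $$

$$ M_{-\frac{5}{2}, 0}(z) = \frac{1}{2} \sqrt{z}\, (2 + 4z + z^2)\, e^{\frac{1}{2} z}, \qquad \Gamma(3) = 2. \tag{4.142} $$

Therefore we can use the following formulas

$$ \int_0^\infty e^{-\alpha x}\, I_0(2\beta \sqrt{x})\, dx = \frac{1}{\alpha}\, e^{\beta^2/\alpha}, \tag{4.143} $$

$$ \int_0^\infty x\, e^{-\alpha x}\, I_0(2\beta \sqrt{x})\, dx = \frac{1}{\alpha^2} \left( 1 + \frac{\beta^2}{\alpha} \right) e^{\beta^2/\alpha}, \tag{4.144} $$

$$ \int_0^\infty x^2\, e^{-\alpha x}\, I_0(2\beta \sqrt{x})\, dx = \frac{1}{\alpha^3} \left( 2 + 4 \frac{\beta^2}{\alpha} + \frac{\beta^4}{\alpha^2} \right) e^{\beta^2/\alpha}. \tag{4.145} $$

In our calculations the parameters $\alpha$ and $\beta$ are given by

$$ \alpha = \frac{1}{g^2 D_0 t}, \qquad \beta = \frac{\sqrt{1 + g x_0}}{g^2 D_0 t}. \tag{4.146} $$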
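The zeroth moment follows from the normalization of the distribution. Namely, the integral of the probability density over the total space gives us one:

$$ \langle x^0 \rangle = \int_{-1/g}^\infty p(x, t)\, dx = \frac{1}{g D_0 t}\, e^{-\frac{1 + g x_0}{g^2 D_0 t}} \int_0^\infty \frac{dy}{g}\, e^{-\alpha y}\, I_0(2\beta \sqrt{y}) = \frac{1}{g^2 D_0 t}\, e^{-\frac{1 + g x_0}{g^2 D_0 t}}\, \frac{1}{\alpha}\, e^{\beta^2/\alpha} = 1. \tag{4.147} $$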
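The first moment describes the mean value of the distribution

$$ \langle x \rangle = \int_{-1/g}^\infty x\, p(x, t)\, dx = \frac{1}{g D_0 t}\, e^{-\frac{1 + g x_0}{g^2 D_0 t}} \int_0^\infty \frac{dy}{g}\, \frac{y - 1}{g}\, e^{-\alpha y}\, I_0(2\beta \sqrt{y}) $$

$$ = \frac{1}{g^3 D_0 t}\, e^{-\frac{1 + g x_0}{g^2 D_0 t}} \int_0^\infty y\, e^{-\alpha y}\, I_0(2\beta \sqrt{y})\, dy - \frac{1}{g} \langle x^0 \rangle = \frac{1}{g^3 D_0 t}\, e^{-\frac{1 + g x_0}{g^2 D_0 t}}\, \frac{1}{\alpha^2} \left( 1 + \frac{\beta^2}{\alpha} \right) e^{\beta^2/\alpha} - \frac{1}{g} = g D_0 t + x_0. \tag{4.148} $$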
Figure 4.4 The mean value $\langle x \rangle$ (dashed line) and the standard deviation $\sigma$ (solid line) of the distribution $p(x, t)$ as functions of time $t$. The dotted line shows the linear asymptotics of $\sigma$. Parameters: $x_0 = 0$, $D_0 = 1\ \mathrm{m^2\, s^{-1}}$, $g = 1\ \mathrm{m^{-1}}$.
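Hence, the mean value drifts with a constant velocity $v_{drift} = g D_0$ toward the region with a higher diffusion coefficient. The second moment can be computed analogously

$$ \langle x^2 \rangle = \int_{-1/g}^\infty x^2\, p(x, t)\, dx = \frac{1}{g D_0 t}\, e^{-\frac{1 + g x_0}{g^2 D_0 t}} \int_0^\infty \frac{dy}{g}\, \frac{(y - 1)^2}{g^2}\, e^{-\alpha y}\, I_0(2\beta \sqrt{y}) $$

$$ = \frac{1}{g^4 D_0 t}\, e^{-\frac{1 + g x_0}{g^2 D_0 t}} \int_0^\infty y^2\, e^{-\alpha y}\, I_0(2\beta \sqrt{y})\, dy - \frac{2}{g} \left( \langle x \rangle + \frac{1}{g} \right) + \frac{1}{g^2} \langle x^0 \rangle $$

$$ = \frac{1}{g^4 D_0 t}\, e^{-\frac{1 + g x_0}{g^2 D_0 t}}\, \frac{1}{\alpha^3} \left( 2 + 4 \frac{\beta^2}{\alpha} + \frac{\beta^4}{\alpha^2} \right) e^{\beta^2/\alpha} - \frac{2}{g^2} \left( g^2 D_0 t + g x_0 + 1 \right) + \frac{1}{g^2} $$

$$ = 2 D_0^2 g^2 t^2 + 2 D_0 t\, (1 + 2 g x_0) + x_0^2. \tag{4.149} $$

Thus, the variance is given by

$$ \sigma^2 = \langle x^2 \rangle - \langle x \rangle^2 = D_0^2 g^2 t^2 + 2 D_0 t\, (1 + g x_0). \tag{4.150} $$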
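The moment formulas (4.148) and (4.150) can be cross-checked against a direct numerical integration of (4.136). The Python sketch below is only an illustration under assumed parameter values ($g = D_0 = 1$, $x_0 = 0.5$, $t = 0.5$); the function names, the grid size and the integration cut-off `span` are hypothetical choices, and $I_0$ is again summed from the series (4.135).

```python
import math


def bessel_i0(z):
    """Modified Bessel function I0 summed from the series (4.135)."""
    total, term, k = 0.0, 1.0, 0
    while term > 1e-17 * total:
        total += term
        k += 1
        term *= (z / 2.0) ** 2 / (k * k)
    return total


def density(x, t, x0, g, d0):
    """Probability density (4.136), defined for x >= -1/g."""
    y = max(1.0 + g * x, 0.0)      # guard against rounding at the boundary
    y0 = 1.0 + g * x0
    scale = g * g * d0 * t
    return (math.exp(-(y + y0) / scale)
            * bessel_i0(2.0 * math.sqrt(y * y0) / scale) / (g * d0 * t))


def moments(t, x0, g, d0, span=40.0, n=4000):
    """Return (norm, mean, variance) of (4.136) from trapezoidal quadrature."""
    a = -1.0 / g
    h = span / n
    m0 = m1 = m2 = 0.0
    for i in range(n + 1):
        x = a + i * h
        weight = 0.5 if i in (0, n) else 1.0   # trapezoidal end-point weights
        p = weight * density(x, t, x0, g, d0)
        m0 += p
        m1 += x * p
        m2 += x * x * p
    m0, m1, m2 = m0 * h, m1 * h, m2 * h
    return m0, m1, m2 - m1 ** 2


t, x0, g, d0 = 0.5, 0.5, 1.0, 1.0
norm, mean, var = moments(t, x0, g, d0)
mean_exact = g * d0 * t + x0                                  # (4.148)
var_exact = (d0 * g * t) ** 2 + 2.0 * d0 * t * (1.0 + g * x0)   # (4.150)
print("norm", norm)
print("mean", mean, "vs", mean_exact)
print("variance", var, "vs", var_exact)
```

For these parameters the analytic values are $\langle x \rangle = 1$ and $\sigma^2 = 1.75$, and the quadrature should reproduce them up to the discretization error.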
Hence we find that, for small times, the standard deviation $\sigma$ grows as the square root of time, $\sigma \approx \sqrt{2 D_0 (1 + g x_0)\, t}$, whereas for large times the growth is linear, $\sigma \approx g D_0 t + x_0 + 1/g = \langle x \rangle + 1/g$. The time dependence of the first moment $\langle x \rangle$ and of $\sigma$ is illustrated in Figure 4.4.

4.7 Alternative Method of Solving the Fokker–Planck Equation

Coming back to the diffusion equation (4.100), we finally want to present an alternative method of solving the Fokker–Planck equation via the Laplace transformation to get the long-time behavior as an approximate solution. Applying the linear transformation of the coordinate $y = 1 + gx$ and scaling the time as $T = D_0 g^2 t$, the probability density transforms by $P(y, T)\, dy = p(x, t)\, dx$ and we end up with the following problem, given by the Fokker–Planck equation for $y \in (0, \infty)$
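$$ \frac{\partial P(y, T)}{\partial T} = \frac{\partial}{\partial y} \left[ y\, \frac{\partial P(y, T)}{\partial y} \right] \tag{4.151} $$

subject to the boundary conditions

$$ \left. y\, \partial_y P \right|_{y \to +0} = 0, \qquad \left. P \right|_{y \to \infty} \to 0 \tag{4.152} $$

and the initial delta distribution at time $T = 0$

$$ \left. P \right|_{T = 0} = \delta(y - y_0), \qquad y_0 > 0. \tag{4.153} $$

To solve it we make use of the Laplace transformation with respect to time and space. The space–time Laplace transform of $P$ is defined by

$$ F(p, s) = \mathcal{L}\{P\} := \int_0^\infty dy \int_0^\infty dT\, P(y, T) \exp(-sT - py). \tag{4.154} $$

Applying this Laplace transformation, the Fokker–Planck equation becomes

$$ sF = -p\, \frac{\partial}{\partial p} (pF) + e^{-p y_0}. \tag{4.155} $$

To obtain this result we have used the following identities

$$ \mathcal{L}\left\{ \frac{\partial P}{\partial T} \right\} = sF - \exp(-p y_0), \tag{4.156} $$

$$ \mathcal{L}\left\{ \frac{\partial}{\partial y} \left[ y\, \frac{\partial P}{\partial y} \right] \right\} = -p\, \frac{\partial}{\partial p} (pF). \tag{4.157} $$

The solution of (4.155) is found in the form

$$ F(p, s) = \frac{A(p)}{p} \exp\!\left( \frac{s}{p} \right), $$

whence we obtain the equation for $A(p)$

$$ A' = \frac{1}{p} \exp\!\left( -\frac{s}{p} - p y_0 \right). $$

Since $A(0) = 0$ holds, we have

$$ A(p) = \int_0^p \frac{dq}{q} \exp\!\left( -\frac{s}{q} - q y_0 \right), $$

and the desired solution is

$$ F(p, s) = \frac{1}{p} \int_0^p \frac{dq}{q} \exp\!\left[ s \left( \frac{1}{p} - \frac{1}{q} \right) - q y_0 \right]. \tag{4.158} $$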
Changing the variable

$$ q = p \left( 1 + \frac{p}{s}\, y \right)^{-1} $$

the integral (4.158) is converted to

$$ F(p, s) = \int_0^\infty \frac{dy}{s + py} \exp\!\left( -y - y_0\, \frac{sp}{s + yp} \right). \tag{4.159} $$

The inverse transformation from the Laplace space $F(p, s)$ to the original variables $P(y, T)$ becomes easy in the long-time limit. As time increases, the influence of the initial system position $y_0$ becomes negligible. This limit is described by expression (4.159) where we set $y_0 = 0$. In this case formula (4.159) reads

$$ F(p, s) = \int_0^\infty dy\, \frac{\exp(-y)}{s + py} = \int_0^\infty dy \int_0^\infty dT\, P(y, T) \exp(-sT - py). \tag{4.160} $$

Taking into account the identity

$$ \int_0^\infty dt\, \exp(-st - \xi t) = \frac{1}{s + \xi} \tag{4.161} $$

from (4.160) we have

$$ \int_0^\infty dy\, P(y, T) \exp(-py) = \int_0^\infty dy\, \exp\!\left[ -y (1 + pT) \right] = \frac{1}{1 + pT} = \frac{1}{T}\, \frac{1}{T^{-1} + p}. \tag{4.162} $$

By using again (4.161), the distribution function can be written as

$$ P(y, T) = \frac{1}{T} \exp\!\left( -\frac{y}{T} \right). \tag{4.163} $$

Going back to the original variables $p(x, t)$, we have

$$ p(x, t) = \frac{1}{D_0 g t} \exp\!\left( -\frac{1 + gx}{D_0 g^2 t} \right). \tag{4.164} $$
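The relation between the long-time form (4.164) and the exact density (4.136) can be made concrete with a few numbers. The Python sketch below is an illustration under assumed values ($g = D_0 = 1$, evaluation point $x = 1$); the function names are hypothetical and $I_0$ is summed from the series (4.135). It prints the ratio of the two expressions for an initial position away from the boundary at short and long times, and for an initial position placed on the boundary $x_0 = -1/g$.

```python
import math


def bessel_i0(z):
    """Modified Bessel function I0 summed from the series (4.135)."""
    total, term, k = 0.0, 1.0, 0
    while term > 1e-17 * total:
        total += term
        k += 1
        term *= (z / 2.0) ** 2 / (k * k)
    return total


def exact_density(x, t, x0, g=1.0, d0=1.0):
    """Exact solution (4.136)."""
    y, y0 = 1.0 + g * x, 1.0 + g * x0
    scale = g * g * d0 * t
    return (math.exp(-(y + y0) / scale)
            * bessel_i0(2.0 * math.sqrt(y * y0) / scale) / (g * d0 * t))


def long_time_density(x, t, g=1.0, d0=1.0):
    """Long-time approximation (4.164), independent of x0."""
    return math.exp(-(1.0 + g * x) / (d0 * g * g * t)) / (d0 * g * t)


def ratio(x, t, x0):
    """Exact density divided by its long-time approximation."""
    return exact_density(x, t, x0) / long_time_density(x, t)


print("x0 = 0,  t = 1    :", ratio(1.0, 1.0, 0.0))     # clearly different from 1
print("x0 = 0,  t = 1000 :", ratio(1.0, 1.0e3, 0.0))   # close to 1
print("x0 = -1, t = 1    :", ratio(1.0, 1.0, -1.0))    # start on the boundary
```

For a start on the boundary the argument of $I_0$ vanishes and the two expressions agree at all times; otherwise the ratio approaches one only as $t$ grows.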
It should be noted that expressions (4.136) and (4.164) coincide according to (4.135) for $1 + g x_0 = 0$, as should be the case for the initial position of the delta distribution located at the boundary $x = x_0 = -1/g$.

4.8 Exercises
E 4.1 Fokker–Planck equation and Maxwell distribution Consider the Fokker–Planck equation (4.67) in the one-dimensional velocity space of particles, where the variable x represents the velocity v, that is x → v. Here we assume D1 (v) = −γv and D2 = γkB T/m, where γ is the friction coefficient, kB is the Boltzmann constant, T is the temperature, and m is the mass of a particle. The task is to obtain the stationary solution and compare it with the Maxwell distribution.
E 4.2 Hermitean Fokker–Planck operator Transform the Fokker–Planck equation for the velocity distribution function $p(v, t)$ considered in the previous exercise into the equation for $W(v, t) = e^{\phi(v)/2}\, p(v, t)$. Find the function $\phi(v)$ for which the resulting Fokker–Planck operator is Hermitean.

E 4.3 Fokker–Planck equation into the Schrödinger equation Transform the Fokker–Planck equation (4.67) into the form

$$ \frac{\partial p}{\partial t} = L_{FP}\, p(x, t) \quad \text{with} \quad L_{FP} = D\, \frac{\partial^2}{\partial x^2} - V(x) \tag{4.165} $$

and then further into the Schrödinger equation, for a particle in the potential $V(x)$, by introducing the imaginary Schrödinger time $t_{Schrödinger} = -it$. Find the relation between the diffusion coefficient $D$ in (4.165) and the particle mass in the Schrödinger equation.

E 4.4 Fokker–Planck equation with harmonic potential Consider the Fokker–Planck equation (4.165) for the harmonic potential $V(x) = (1/2)\, m \omega_0^2 x^2$. Transform it into the Schrödinger equation and find the eigenvalues and the eigenfunctions of the Fokker–Planck operator.
5 The Langevin Equation
5.1 A System of Many Brownian Particles
Stochastic differential equations, which in physics are often called Langevin equations, are a tool for modeling quasi-continuous diffusion processes. The range of their applications is very wide, including physical problems like the motion of Brownian particles, the movement of grains in granular media, the motion of vehicles in traffic flow or of animals in a biological system, problems of economics and financial mathematics, the description of chemical reactions, and numerous optimization problems. This chapter is devoted to the Langevin equation, first published by Paul Langevin in 1908, which describes the Brownian motion of passive (force-free) or active particles [6, 34, 38, 64, 211]. Active or self-driven motion of motorized particles (also called Brownian agents) is based on the assumption that the friction coefficient of the ordinary Langevin equation is velocity dependent, representing the take-up of energy from the environment and its conversion into kinetic energy. Before discussing this in more detail in this section, we first introduce the Langevin equation as follows. A general Langevin equation in the multidimensional case with the state vector $q = (q_1, q_2, \ldots, q_n)$, the deterministic force $F = (F_1, F_2, \ldots, F_n)$, and the Langevin force as an additive white noise $\hat{\xi} = (\hat{\xi}_1, \hat{\xi}_2, \ldots, \hat{\xi}_n)$ reads [64]

$$ \frac{dq_i}{dt} = F_i(q) + \hat{\xi}_i(t) \tag{5.1} $$

where

$$ \langle \hat{\xi}_i(t) \rangle = 0, \tag{5.2} $$

$$ \langle \hat{\xi}_i(t)\, \hat{\xi}_j(t') \rangle = Q_{ij}\, \delta(t - t'). \tag{5.3} $$
The corresponding Fokker–Planck equation for the probability density function p(q1 , q2 , . . . , qn , t) ≡ p(q, t) is (cf. (4.1))
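$$ \frac{\partial p(q, t)}{\partial t} = -\nabla_q \{ F\, p(q, t) \} + \frac{1}{2} \sum_{ij} Q_{ij}\, \frac{\partial^2 p(q, t)}{\partial q_i\, \partial q_j}, \tag{5.4} $$

where

$$ \nabla_q \{ F\, p(q, t) \} = \sum_i \frac{\partial}{\partial q_i} \{ F_i\, p(q, t) \} \tag{5.5} $$

denotes the divergence in the $n$-dimensional state space. The Fokker–Planck equation (5.4) can be written as a continuity equation (cf. (4.2))

$$ \frac{\partial p(q, t)}{\partial t} + \nabla_q J(q, t) = 0, \tag{5.6} $$

where $J(q, t)$ is the probability flux vector with components

$$ J_i(q, t) = F_i\, p(q, t) - \frac{1}{2} \sum_j Q_{ij}\, \frac{\partial p(q, t)}{\partial q_j}. \tag{5.7} $$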
In the case where the Langevin force has only diagonal terms, so that $Q_{ij} = 2 D_i \delta_{ij}$ holds, the Langevin equation can be represented as

$$ \frac{dq_i}{dt} = F_i(q) + \sqrt{2 D_i}\, \xi_i(t), \tag{5.8} $$

where $\xi_i(t) = \hat{\xi}_i(t) / \sqrt{2 D_i}$ is the standard Gaussian white noise with the property

$$ \langle \xi_i(t)\, \xi_j(t') \rangle = \delta_{ij}\, \delta(t - t'), \tag{5.9} $$

and $D_i$ has the meaning of the diffusion coefficient for the variable $q_i$. This equation has many applications in physics, chemistry, engineering and social sciences [6, 34, 38, 64, 211]. In particular, the system of $N$ Brownian particles moving in $d$-dimensional coordinate space of a given volume $V$ is described by the Langevin equations

$$ \frac{d\mathbf{r}_\alpha}{dt} = \mathbf{v}_\alpha, \qquad \alpha = 1, 2, \ldots, N \tag{5.10} $$

$$ \frac{d\mathbf{v}_\alpha}{dt} = -\gamma_\alpha \mathbf{v}_\alpha - \frac{1}{m_\alpha} \nabla_\alpha U(\mathbf{r}_1, \ldots, \mathbf{r}_N) + \sqrt{2 B_\alpha}\, \boldsymbol{\xi}_\alpha(t), \tag{5.11} $$

which correspond to (5.8). In this case the state vector has $2dN$ components comprising $d$ components of the coordinate vector $\mathbf{r}_\alpha$ and $d$ components of the velocity vector $\mathbf{v}_\alpha$ of each particle with mass $m_\alpha$. The particles are numbered by the index $\alpha$. In this case $\boldsymbol{\xi}_\alpha(t)$ also is a $d$-dimensional vector, whereas $\gamma_\alpha$ and $B_\alpha$ are scalar quantities: the friction coefficient and the diffusion coefficient in the velocity space. The noise has only diagonal terms, which in our notation means

$$ \langle \boldsymbol{\xi}_\alpha(t) \rangle = 0, \tag{5.12} $$

$$ \langle \xi_{\alpha, i}(t)\, \xi_{\beta, j}(t') \rangle = \delta_{\alpha\beta}\, \delta_{ij}\, \delta(t - t'), \tag{5.13} $$
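where the indices $\alpha$ and $\beta$ refer to particles, whereas $i$ and $j$ (from 1 to $d$) refer to the vector components of a given particle. The operator $\nabla_\alpha$ represents the gradient in the space of $\mathbf{r}_\alpha$. It gives the force $\mathbf{F}_\alpha$ acting on the particle number $\alpha$,

$$ -\nabla_\alpha U(\mathbf{r}_1, \ldots, \mathbf{r}_N) = \mathbf{F}_\alpha(\mathbf{r}_1, \ldots, \mathbf{r}_N). \tag{5.14} $$

In this notation the system of equations (5.10)–(5.11) becomes

$$ \frac{d\mathbf{r}_\alpha}{dt} = \mathbf{v}_\alpha, \qquad \alpha = 1, 2, \ldots, N \tag{5.15} $$

$$ \frac{d\mathbf{v}_\alpha}{dt} = -\gamma_\alpha \mathbf{v}_\alpha + \frac{1}{m_\alpha} \mathbf{F}_\alpha(\mathbf{r}_1, \ldots, \mathbf{r}_N) + \sqrt{2 B_\alpha}\, \boldsymbol{\xi}_\alpha(t). \tag{5.16} $$

The corresponding Fokker–Planck equation for the probability density function $p_N = p_N(\mathbf{r}_1, \mathbf{v}_1, \ldots, \mathbf{r}_N, \mathbf{v}_N, t)$, consistent with the particular case of (5.4) where the double sum over $i$ and $j$ contains only the diagonal terms, reads

$$ \frac{\partial p_N}{\partial t} = -\sum_\alpha \frac{\partial}{\partial \mathbf{r}_\alpha} \left( \mathbf{v}_\alpha\, p_N \right) - \sum_\alpha \frac{\partial}{\partial \mathbf{v}_\alpha} \left( \frac{1}{m_\alpha} \mathbf{F}_\alpha\, p_N \right) + \sum_\alpha \frac{\partial}{\partial \mathbf{v}_\alpha} \left( \gamma_\alpha \mathbf{v}_\alpha\, p_N \right) + \sum_\alpha B_\alpha\, \frac{\partial^2 p_N}{\partial \mathbf{v}_\alpha^2}. \tag{5.17} $$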
Here $\partial/\partial \mathbf{r}_\alpha$ and $\partial/\partial \mathbf{v}_\alpha$ are the nabla or the divergence operators acting in the space of $\mathbf{r}_\alpha$ and $\mathbf{v}_\alpha$, respectively. By analogy, the second-order derivative in the last sum represents the Laplace operator. In the thermodynamic equilibrium, which corresponds to the stationary solution $\partial p_N / \partial t = 0$ of (5.17), the probability density is consistent with the Boltzmann–Gibbs distribution

$$ p_N = \frac{1}{Z(T)}\, e^{-H_N(\mathbf{r}_1, \mathbf{v}_1, \ldots, \mathbf{r}_N, \mathbf{v}_N)/(k_B T)}. \tag{5.18} $$

Here $H \equiv H_N$ is the Hamiltonian of the system and $Z(T)$ is the partition function, which depends on the thermal energy $k_B T$, where $k_B$ is the Boltzmann constant and $T$ is the temperature. The Hamiltonian is given by

$$ H = \sum_{\alpha = 1}^N m_\alpha\, \frac{\mathbf{v}_\alpha^2}{2} + U(\mathbf{r}_1, \ldots, \mathbf{r}_N). \tag{5.19} $$
Hence, the diffusion coefficient in the velocity space $B_\alpha$ should obey the Einstein relation, also called the fluctuation-dissipation relationship

$$ B_\alpha = \frac{\gamma_\alpha k_B T}{m_\alpha}. \tag{5.20} $$
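The content of (5.20) can be illustrated with a short simulation. The Python sketch below integrates the velocity equation (5.16) for a single free particle in one dimension ($F_\alpha = 0$) with the Euler–Maruyama scheme, which is a standard discretization chosen here for illustration and not a method prescribed by the text. With $B = \gamma k_B T / m$ the stationary mean square velocity should approach the equipartition value $k_B T / m$. All names and parameter values (time step, number of particles, seed) are assumptions made for this example.

```python
import math
import random


def stationary_v2(gamma=1.0, k_t=1.0, mass=2.0, dt=0.01,
                  steps=500, particles=2000, seed=1):
    """Euler-Maruyama estimate of <v^2> for dv/dt = -gamma v + sqrt(2B) xi(t)."""
    rng = random.Random(seed)
    b = gamma * k_t / mass              # Einstein relation (5.20)
    kick = math.sqrt(2.0 * b * dt)      # white noise integrated over one step
    total = 0.0
    for _ in range(particles):
        v = 0.0                         # every particle starts at rest
        for _ in range(steps):
            v += -gamma * v * dt + kick * rng.gauss(0.0, 1.0)
        total += v * v
    return total / particles


estimate = stationary_v2()
print("simulated <v^2>:", estimate, " equipartition k_B T / m:", 1.0 / 2.0)
```

The estimate carries a statistical error of a few percent for this sample size, plus a small bias of order $\gamma\, dt$ from the time discretization.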
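In the following we consider a system of Brownian particles with the potential

$$ U(\mathbf{r}_1, \ldots, \mathbf{r}_N) = \sum_{\alpha < \beta} m_\alpha m_\beta\, u\!\left( |\mathbf{r}_\alpha - \mathbf{r}_\beta| \right). \tag{5.21} $$

Here $u(r)$ is the potential of interaction between two particles. In particular, the case with gravitational potential in two dimensions has been widely studied by P. H. Chavanis in [27]. In the overdamped limit $\gamma_\alpha \to +\infty$, the inertia of the particles in (5.11) can be neglected [27], which means that we can solve this equation with respect to $\mathbf{v}_\alpha$ by setting $d\mathbf{v}_\alpha / dt = 0$ and substitute the result into (5.10). It yields

$$ \frac{d\mathbf{r}_\alpha}{dt} = -\mu_\alpha \nabla_\alpha U(\mathbf{r}_1, \ldots, \mathbf{r}_N) + \sqrt{2 D_\alpha}\, \boldsymbol{\xi}_\alpha(t), \tag{5.22} $$

where

$$ \mu_\alpha = \frac{1}{\gamma_\alpha m_\alpha} \tag{5.23} $$

is the mobility and

$$ D_\alpha = \frac{B_\alpha}{\gamma_\alpha^2} \tag{5.24} $$

is the spatial diffusion coefficient of the particle number $\alpha$. From (5.23), (5.24), and (5.20) we obtain the well-known Einstein relation between the spatial diffusion coefficient and mobility, that is,

$$ D_\alpha = \mu_\alpha k_B T. \tag{5.25} $$

The Fokker–Planck equation corresponding to (5.22)

$$ \frac{\partial p_N}{\partial t} = \sum_{\alpha = 1}^N \frac{\partial}{\partial \mathbf{r}_\alpha} \left[ D_\alpha\, \frac{\partial p_N}{\partial \mathbf{r}_\alpha} + \mu_\alpha\, p_N\, \frac{\partial U}{\partial \mathbf{r}_\alpha} \right] \tag{5.26} $$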
describes the time evolution of the probability density function $p_N = p_N(\mathbf{r}_1, \ldots, \mathbf{r}_N, t)$ in the $(Nd)$-dimensional coordinate space of $N$ particles. Systems with long-range interaction, such as the $N$-star model in astrophysics, show a particularly interesting behavior [27]. They have a complex thermodynamics and exhibit phase transitions between 'gaseous' and 'clustered' states. In particular, the model of self-gravitating Brownian particles is of interest [27]. In this model, the particles interact gravitationally, but they also experience a friction force and a stochastic force. The latter mimics a coupling with a thermal bath of nongravitational origin. This is a dissipative system, therefore a description by the canonical ensemble with given temperature is appropriate. The dynamics of such a model is described by the set of $N$ coupled Langevin equations which we have already discussed, with the gravitational potential

$$ u(r) = -\frac{1}{d - 2}\, \frac{G}{r^{d - 2}} \quad \text{for } d \neq 2, \tag{5.27} $$

$$ u(r) = G \ln r \quad \text{for } d = 2, \tag{5.28} $$

where $G$ is the gravitational constant. The two-dimensional case $d = 2$ is special. An interesting and generally difficult problem consists in determining the effective diffusion coefficient $D(T)$ of a particle of the system. For the self-gravitating gas of Brownian particles in two dimensions it can be calculated exactly [27],

$$ D(T) = \frac{k_B T}{\gamma m} \left( 1 - \frac{T_c}{T} \right), \tag{5.29} $$

where $T_c$ is the critical temperature

$$ k_B T_c = (N - 1)\, \frac{G m^2}{4}. \tag{5.30} $$
For $T \gg T_c$ the self-gravity becomes negligible, and we recover the Einstein relation (5.25). This corresponds to a diffusive motion of particles which is slightly modified by self-gravity. The diffusion coefficient vanishes at $T = T_c$ and becomes negative at $T < T_c$. The latter implies the collapse. In that case the system forms a Dirac peak containing the whole mass in a finite time. Interacting random walkers, described by coupled Langevin equations, are also studied in soft-matter physics in order to compute the transport properties of interacting particles. Examples are supercooled liquids and colloids in solution. In these examples, however, the interaction potential is short-range and the system is homogeneous at equilibrium. In other applications, one considers systems with short-range next-neighbor interactions, such as one-dimensional ring chains with Toda or Morse potential, but with velocity-dependent friction coefficient $\gamma(v)$ [38, 211]. The system of coupled Langevin equations ($i = 1, 2, \ldots, N$) then reads

$$ \frac{dx_i}{dt} = v_i, \tag{5.31} $$

$$ m\, \frac{dv_i}{dt} = -\frac{\partial U}{\partial x_i} - m \gamma(v_i)\, v_i + \sqrt{2B}\, \xi_i(t). \tag{5.32} $$

In this case the $N$ particles are numbered by the index $i$ and the periodic boundary conditions $x_{i+N} = x_i + L$ are assumed, where $L$ is the length of the chain. The derivative $-\partial U / \partial x_i$ of the system's potential energy $U$ represents the interaction force, whereas $-m \gamma(v_i)\, v_i$ represents the friction force acting on the $i$th particle. In the particular case studied in [28] we have

$$ \gamma(v) = \gamma_0 + \gamma_1(v), \tag{5.33} $$

where the constant part $\gamma_0$ describes the viscous friction between the particle and the surrounding heat bath, whereas the nonlinear coefficient

$$ \gamma_1(v) = -\frac{q}{\kappa + v^2} < 0 \tag{5.34} $$

is introduced to model the active Brownian particles. A positive parameter $q > 0$ describes the energy flux from the external reservoir into the depots carried by the particles. The parameter $\kappa$ is connected to the internal dissipation and the conversion of the energy. Introducing a new parameter

$$ \mu = \frac{q}{\gamma_0} - \kappa, \tag{5.35} $$

Equation (5.33) together with (5.34) can be written as

$$ \gamma(v) = \gamma_0 \left( 1 - \frac{\kappa + \mu}{\kappa + v^2} \right) = \gamma_0\, \frac{v^2 - \mu}{\kappa + v^2}. \tag{5.36} $$
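The equivalence of the two forms of the friction coefficient, and the sign change at $v^2 = \mu$, can be seen from a few numbers. The Python sketch below is an illustration only; the parameter values $\gamma_0 = 1$, $q = 2$, $\kappa = 1$ (which give $\mu = 1$) and the function names are assumptions chosen for this example.

```python
def friction(v, gamma0=1.0, q=2.0, kappa=1.0):
    """gamma(v) = gamma0 + gamma1(v), following (5.33) and (5.34)."""
    return gamma0 - q / (kappa + v * v)


def friction_mu_form(v, gamma0=1.0, q=2.0, kappa=1.0):
    """The same coefficient written as in (5.36), with mu taken from (5.35)."""
    mu = q / gamma0 - kappa
    return gamma0 * (v * v - mu) / (kappa + v * v)


# With these values mu = 1: negative friction (pumping) for v^2 < 1,
# zero friction at |v| = 1 and ordinary damping for v^2 > 1.
for v in (0.0, 0.5, 1.0, 2.0):
    print(v, friction(v), friction_mu_form(v))
```

Both functions return the same values, which are negative for $v^2 < \mu$ and positive for $v^2 > \mu$.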
Here $\mu$ plays the role of a bifurcation parameter, since $\gamma(v) = 0$ holds if $v = \pm\sqrt{\mu}$. For $\mu < 0$ the friction coefficient $\gamma(v)$ is always positive and thus leads to damping of the particle motion. At $\mu > 0$, the friction coefficient becomes negative for small velocities $v^2 < \mu$, generating so-called active motion. The interaction potential (or potential energy) is given by

$$ U = \sum_i u_i(\Delta x_i), \tag{5.37} $$

where $\Delta x_i = x_{i+1} - x_i$ are the distances between nearest neighbors. In the following we assume that the pair interaction potential is given by $u_i(\Delta x_i) = u(\Delta x_i)$ for all $i$. A particular example is the Morse potential

$$ u(\Delta x) = \frac{a}{2b} \left( e^{-b(\Delta x - \sigma)} - 1 \right)^2 - \frac{a}{2b}, \tag{5.38} $$

which is qualitatively similar to the well-known Lennard–Jones potential

$$ u(\Delta x) = \frac{a}{2b} \left[ \left( \frac{\Delta x}{\sigma} \right)^{-12} - 2 \left( \frac{\Delta x}{\sigma} \right)^{-6} \right]. \tag{5.39} $$
The parameters are chosen such that the minimum $-a/(2b)$ of the potentials (5.38) and (5.39) is located at $\Delta x = \sigma$. The Morse potential can be considered as a generalization of the Toda potential

$$ u(\Delta x) = \frac{a}{b} \left( e^{-b(\Delta x - \sigma)} - 1 \right) + a (\Delta x - \sigma) - \frac{a}{2b}. \tag{5.40} $$

The latter is widely appreciated, since it yields an exactly solvable nonlinear model. The models with short-range interaction and negative velocity-dependent friction coefficient describe active particles and show interesting features like clustering. There exists a critical density interval, where both equidistant and cluster configurations correspond to local minima of the potential energy [28]. Soliton-like oscillations are observed in the Toda chains with negative friction [37]. A conservative chain of $N$ elements possesses $N + 1$ different modes of oscillation. They differ from one another in shape, amplitude, frequency and phase shift between the oscillations of neighboring particles. In the active chain each of these modes may generate the corresponding attractor [38]. Various interesting examples based on Langevin dynamics are considered in [212, 226] such as stochastic thermodynamics and nonequilibrium steady states.
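The statement about the common minimum of the pair potentials can be checked numerically. The following Python sketch evaluates the Morse (5.38), Lennard–Jones (5.39) and Toda (5.40) potentials for the illustrative values $a = 1$, $b = 2$, $\sigma = 1$ (these numbers and the function names are assumptions made for this example) and compares the value at $\Delta x = \sigma$ with the values slightly to the left and right.

```python
import math

A, B, SIGMA = 1.0, 2.0, 1.0   # illustrative values of a, b and sigma


def morse(dx):
    """Morse potential (5.38)."""
    return A / (2 * B) * (math.exp(-B * (dx - SIGMA)) - 1.0) ** 2 - A / (2 * B)


def lennard_jones(dx):
    """Lennard-Jones potential in the form (5.39)."""
    return A / (2 * B) * ((dx / SIGMA) ** -12 - 2.0 * (dx / SIGMA) ** -6)


def toda(dx):
    """Toda potential (5.40)."""
    return (A / B * (math.exp(-B * (dx - SIGMA)) - 1.0)
            + A * (dx - SIGMA) - A / (2 * B))


for name, u in (("Morse", morse), ("Lennard-Jones", lennard_jones), ("Toda", toda)):
    print(name, u(SIGMA - 0.05), u(SIGMA), u(SIGMA + 0.05))
```

Each potential takes the value $-a/(2b) = -0.25$ at $\Delta x = \sigma$ and is larger on both sides, as expected for a minimum.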
5.2 A Traditional View of the Langevin Equation
The Langevin equation describes the dynamics of a system in the presence of an interaction with the environment. For simplicity we consider here the one-dimensional case, where the state of the system is characterized by a scalar quantity $x(t)$ which depends on time $t$. The time evolution is described by the Langevin equation

$$ \frac{dx}{dt} = f(x) + \psi(x)\, \xi(t) \tag{5.41} $$

together with the initial condition

$$ x(t = 0) = x_0. \tag{5.42} $$
Here the dynamics of the system itself is given by the deterministic force f (x), whereas the interaction with the environment is represented by the stochastic or Langevin force ψ(x)ξ(t), where ψ(x) is the noise intensity. If the latter is constant then the Langevin force represents an additive noise. The intensity ψ(x) may depend on x in general. In this case we deal with the so-called multiplicative noise. In the classical case, ξ(t) is the Gaussian white noise, representing random and normally distributed fluctuations, which are completely uncorrelated for different times. It is important to notice, however, that another kind of noise ξ(t) may also be of interest. For example, the Markovian dichotomous noise represents a stochastic process of switching between two discrete values. This type of noise is frequently used for modeling various phenomena in biology, physics, and chemistry. States of the dichotomous process can be associated, e.g. with two different levels of external stimuli, the presence or absence of an external perturbation, etc. It is interesting to mention that a combination of dichotomous and white noise can lead to a bimodal probability distribution even in a system with a single-well potential [35] φ(x) = αx2 or linear force f (x) = −dφ/dx. Thus, the noise can significantly change the behavior of a system. In this sense we can speak about noise-induced phase transitions. The latter topic will be discussed in detail in Chapter 11. Here we give some general statements only and recommend the monograph by Werner Horsthemke and Rene Lefever [86] to check for further information concerning noise-induced transitions. First we notice that the phase transition point is shifted depending on the noise intensity, which is a usual phenomenon. It is a general feature of nonlinear systems subject to multiplicative external noise. The shift in the bifurcation diagram is not too surprising when one thinks about it. 
The external noise can induce even deeper and far less intuitive modifications in the macroscopic behavior of nonlinear systems [86]. Nonequilibrium systems are, by their very nature, closely dependent on their environment. A question therefore arises as to how the nonequilibrium and the environmental randomness
interact. Can this interaction lead to drastic changes in the macroscopic behavior of the system even outside the neighborhood of a deterministic instability point? In other words, is it possible that the external noise modifies bifurcation diagrams in a more profound way than by just a shift in the parameter space? The basic question can be formulated as follows: do nonlinear systems always adjust their macroscopic behavior to the average properties of the environment, or can one find situations in which the system responds to the randomness of the environment in a certain more active way displaying, for instance, behavior forbidden under deterministic external conditions? The answer to these questions is indeed positive. It turns out that even extremely rapid totally random external noise can considerably alter the macroscopic behavior of nonlinear systems. It can induce new transition phenomena, known as noise-induced phase transitions, which are quite unexpected from the usual phenomenological description. In the following we turn back to the traditional approach with white noise added to the deterministic drift.
5.3 Additive White Noise
Historically, the Langevin equation was designed to describe Brownian motion, assuming $\psi(x) = \sigma$ in (5.41) to be a constant. This is the usual case of the Langevin equation with additive noise,
$$\frac{dx}{dt} = f(x) + \sigma\,\xi(t). \tag{5.43}$$
In general, ξ(t) is a randomly fluctuating quantity. Traditionally it is white noise, which has the following properties
$$\langle \xi(t) \rangle = 0, \tag{5.44}$$
$$\langle \xi(t)\,\xi(t') \rangle = \delta(t - t'). \tag{5.45}$$
Equation (5.43) can be formulated as a stochastic differential equation with the initial condition (5.42). In the conventional form used in the mathematical literature it reads
$$dx(t) = f(x(t))\,dt + \sigma\,dW(t); \qquad x(t = 0) = x_0, \tag{5.46}$$
where W(t) is the standard Wiener process with the following properties
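$$\langle W(t) \rangle = 0, \tag{5.47}$$
$$\langle W(t)\,W(t') \rangle = \min(t, t'). \tag{5.48}$$
For the increments of the Wiener process $dW(t) = W(t + dt) - W(t)$ at $dt \to 0$ we have
$$\langle dW(t) \rangle = 0, \tag{5.49}$$
$$\langle dW(t)\,dW(t') \rangle = \begin{cases} dt, & t = t' \\ 0, & t \neq t'. \end{cases} \tag{5.50}$$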
The formal relation between the Wiener process and the Langevin force is given by
$$\xi(t) = \frac{dW(t)}{dt} \iff W(t) = \int_0^t \xi(s)\,ds. \tag{5.51}$$
Stochastic differential equations and this formal relation will be discussed in more detail in Section 5.6. Here we would like to mention that the formal solution of (5.46) is
$$x(t) = x_0 + \int_0^t f(x(s))\,ds + \sigma W(t). \tag{5.52}$$
This, however, is only a different formulation of the problem, obtained by rewriting the stochastic differential equation (5.46) as the integral equation (5.52). Since the right-hand side of (5.52) contains the unknown function $x(s)$, it cannot serve as a solution in practical applications. The probability density $p(x, t)$ for the variable $x$ at time $t$ is given by the following Fokker–Planck equation, which corresponds to (5.43) or (5.46), respectively,
$$\frac{\partial}{\partial t} p(x, t) = -\frac{\partial}{\partial x}\left[ f(x)\,p(x, t) \right] + \frac{\sigma^2}{2} \frac{\partial^2 p(x, t)}{\partial x^2} \tag{5.53}$$
with the initial condition
$$p(x, t = 0) = \delta(x - x_0). \tag{5.54}$$
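A single realization of (5.46) can be generated numerically by replacing $dt$ with a small step and drawing $dW$ as a Gaussian number with zero mean and variance $dt$, in line with (5.49) and (5.50). The following is a minimal sketch of this Euler–Maruyama scheme, not a method taken from the text. The linear force $f(x) = -kx$, the helper name `euler_maruyama`, and all parameter values are assumptions chosen for illustration; for this force the stationary variance is $\sigma^2/(2k)$, which serves as a rough check.

```python
import math
import random


def euler_maruyama(f, sigma, x0, dt, n_steps, rng):
    """Integrate dx = f(x) dt + sigma dW with the Euler-Maruyama scheme."""
    x = x0
    path = [x]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # increment with <dW> = 0, <dW^2> = dt
        x += f(x) * dt + sigma * dW
        path.append(x)
    return path


# Illustrative choice (an assumption): linear force f(x) = -k x
k, sigma = 1.0, 0.5
rng = random.Random(1)
path = euler_maruyama(lambda x: -k * x, sigma, x0=1.0, dt=0.01,
                      n_steps=200_000, rng=rng)

tail = path[1000:]  # discard the initial relaxation from x0
mean = sum(tail) / len(tail)
var = sum((x - mean) ** 2 for x in tail) / len(tail)
print(f"time-averaged variance {var:.4f}, sigma^2/(2k) = {sigma**2 / (2 * k):.4f}")
```

The estimate carries both a statistical error and a small discretization bias of order $dt$, so only approximate agreement should be expected.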
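The averages over the ensemble of stochastic realizations, like the mean value $\langle x(t) \rangle$ and the correlation function $\langle x(t)x(t') \rangle$, can be expressed in terms of the probability distribution functions as
$$\langle x(t) \rangle = \int_{-\infty}^{\infty} x\,p(x, t)\,dx, \tag{5.55}$$
$$\langle x(t)\,x(t') \rangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x\,y\,p(x, t; y, t')\,dx\,dy. \tag{5.56}$$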
Here $p(x, t; y, t')$ is the joint probability density for the two times introduced in Section 3.1. Returning to the Langevin equation (5.43), let us first consider the dynamics without fluctuations, which is given by the equation with $\sigma = 0$,
$$\frac{dx}{dt} = f(x). \tag{5.57}$$
The force can be represented as
$$f(x) = -\frac{d\phi(x)}{dx}, \tag{5.58}$$
where $\phi(x)$ is the potential. A simple classical example is the double-well potential
$$\phi(x) = \frac{\alpha}{2} x^2 + \frac{\beta}{4} x^4, \tag{5.59}$$
where $\beta > 0$. It has one minimum if $\alpha > 0$ and two minima if $\alpha < 0$. The corresponding force is
$$f(x) = -\alpha x - \beta x^3. \tag{5.60}$$
The stationary solutions of (5.57) are the roots of the equation $f(x) = 0$, that is, the extremum points of the potential $\phi(x)$. They are given by
$$x(\alpha + \beta x^2) = 0. \tag{5.61}$$
One root is always $x_0 = 0$. At $\alpha \ge 0$ this is the only real solution. At $\alpha < 0$, two other real solutions $x_{1,2} = \pm\sqrt{-\alpha/\beta}$ appear, corresponding to the two minima of the potential. The solution $x_0 = 0$ corresponds to the only minimum of the potential at $\alpha > 0$, which changes into a maximum at $\alpha < 0$. A minimum of $\phi(x)$ always corresponds to a stable solution and a maximum to an unstable solution of (5.57), as follows from a stability analysis considering small deviations from the extremum point. These solutions, depending on the parameter $\alpha$, represent the so-called supercritical bifurcation diagram shown in Figure 5.1. It is called supercritical since the stable branches merge continuously at the bifurcation point $\alpha = 0$. A bifurcation diagram of another kind emerges for the potential
$$\phi(x) = \frac{\alpha}{2} x^2 + \frac{\beta}{4} x^4 + \frac{\gamma}{6} x^6 \tag{5.62}$$
with $\beta < 0$ and $\gamma > 0$. It corresponds to
$$f(x) = -\alpha x - \beta x^3 - \gamma x^5. \tag{5.63}$$
In this case the equation $f(x) = 0$ has five roots, some of which may be complex. One solution is $x_0 = 0$. The other four roots are given by
$$x_{1,2,3,4} = \pm\sqrt{ -\frac{\beta}{2\gamma} \pm \sqrt{ \left( \frac{\beta}{2\gamma} \right)^2 - \frac{\alpha}{\gamma} } }. \tag{5.64}$$
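The real roots of (5.64) and their stability can be checked with a few lines of code. The sketch below is only an illustration: the function name `stationary_points` is hypothetical, and the stability criterion used is the sign of $f'(x)$ (negative at a minimum of the potential), with the marginal case $f'(x) = 0$ simply labeled unstable.

```python
import math


def stationary_points(alpha, beta, gamma):
    """Real roots of f(x) = -alpha x - beta x^3 - gamma x^5 with their stability."""
    def fprime(x):
        return -alpha - 3 * beta * x**2 - 5 * gamma * x**4

    roots = [0.0]
    disc = (beta / (2 * gamma)) ** 2 - alpha / gamma  # inner square root in (5.64)
    if disc >= 0:
        for inner_sign in (+1, -1):
            x2 = -beta / (2 * gamma) + inner_sign * math.sqrt(disc)  # value of x^2
            if x2 > 0:
                roots += [math.sqrt(x2), -math.sqrt(x2)]
    # f'(x) < 0 means a minimum of the potential, i.e. a stable solution
    return sorted((x, "stable" if fprime(x) < 0 else "unstable") for x in roots)


for alpha in (0.5, 0.15, -0.5):
    print(alpha, stationary_points(alpha, beta=-1.0, gamma=1.0))
```

With $\beta = -1$ and $\gamma = 1$ the three values of $\alpha$ probe the regimes $\alpha > \beta^2/(4\gamma)$, $0 < \alpha < \beta^2/(4\gamma)$, and $\alpha < 0$.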
Figure 5.1 The supercritical bifurcation diagram ($x$ versus $\alpha$) for the potential (5.59) at $\beta = 1$. Stable solutions of (5.57) depending on the parameter $\alpha$ are shown by solid lines and the unstable one by a dashed line.
Only the real solutions have a physical meaning. Also, the solutions corresponding to the minima of the potential are stable, whereas those representing the maxima are unstable. At $\alpha > \beta^2/(4\gamma)$ the only real solution is $x_0 = 0$. All five solutions are real within $0 \le \alpha \le \beta^2/(4\gamma)$. Three of them, including $x_0 = 0$, are stable and correspond to three minima of $\phi(x)$. The other two roots represent two local maxima in between. At $\alpha = 0$, the minimum at $x = 0$ transforms into a maximum and the two other maxima disappear. Thus, at $\alpha < 0$ there are two stable solutions and one unstable solution $x_0 = 0$. The corresponding so-called subcritical bifurcation diagram is shown in Figure 5.2. As distinct from the supercritical bifurcation diagram in Figure 5.1, here the stable nonzero branches start at certain nonzero $x$ values at $\alpha = \beta^2/(4\gamma)$, where the $x_0 = 0$ branch is still stable. Therefore, the system cannot switch to these nonzero branches if the initial $x$ value is near zero. In the deterministic dynamics, when $\alpha$ is decreased, this switching first happens, with a jump, only at $\alpha = 0$. If $\alpha$ is increased, starting from negative values, then a jump from one of the nonzero stable solutions to the zero solution occurs at $\alpha = \beta^2/(4\gamma) > 0$. In other words, hysteresis is observed.

Figure 5.2 The subcritical bifurcation diagram ($x$ versus $\alpha$) for the potential (5.62) at $\beta = -1$ and $\gamma = 1$. Stable solutions of (5.57) depending on the parameter $\alpha$ are shown by solid lines, and the unstable ones by dashed lines.

The behavior of the dynamical system in the case of supercritical as well as subcritical bifurcation is essentially changed by the noise included in the Langevin equation (5.43). Due to this noise, the system with potential (5.59) can randomly switch between the two stable states $x_{1,2} = \pm\sqrt{-\alpha/\beta}$ at $\alpha < 0$, which is never possible in the deterministic dynamics. Similarly, in the system with potential (5.62), the noise enables switching between three stable states within $0 \le \alpha \le \beta^2/(4\gamma)$, or between the two stable branches of the bifurcation diagram in Figure 5.2 at $\alpha < 0$. Considering an ensemble of different stochastic realizations of the process $\xi(t)$, the Langevin equation (5.43) allows one to calculate the probability density $p(x, t)$ of finding the value $x$ at time $t$. The stationary probability density $p_{\mathrm{st}}(x) = \lim_{t \to \infty} p(x, t)$ is given by the stationary solution of the corresponding Fokker–Planck equation (5.53), so
$$p_{\mathrm{st}}(x) = \frac{e^{-2\phi(x)/\sigma^2}}{\int_{-\infty}^{\infty} e^{-2\phi(x)/\sigma^2}\,dx}. \tag{5.65}$$
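The stationary density (5.65) is easy to evaluate on a grid. The following sketch does this for the double-well potential (5.59) and locates the maximum of $p_{\mathrm{st}}(x)$, which should sit at $|x| = \sqrt{-\alpha/\beta}$. The noise strength $\sigma = 0.5$, the finite integration range, and the function name `stationary_density` are assumptions made for the example only.

```python
import math


def stationary_density(phi, sigma, x_min=-4.0, x_max=4.0, n=4001):
    """Evaluate p_st(x) of (5.65) on a grid, normalized by the trapezoidal rule."""
    dx = (x_max - x_min) / (n - 1)
    xs = [x_min + i * dx for i in range(n)]
    w = [math.exp(-2.0 * phi(x) / sigma**2) for x in xs]
    norm = dx * (sum(w) - 0.5 * (w[0] + w[-1]))
    return xs, [wi / norm for wi in w]


alpha, beta, sigma = -0.5, 1.0, 0.5  # sigma is an illustrative choice
xs, p = stationary_density(lambda x: alpha / 2 * x**2 + beta / 4 * x**4, sigma)

i_max = max(range(len(xs)), key=lambda i: p[i])
print(f"maximum of p_st at |x| = {abs(xs[i_max]):.3f}, "
      f"sqrt(-alpha/beta) = {math.sqrt(-alpha / beta):.3f}")
```

The finite range $[-4, 4]$ replaces the infinite integration limits; this is harmless here because the weight $e^{-2\phi/\sigma^2}$ is vanishingly small at the boundaries.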
The stable branches of the bifurcation diagrams in Figures 5.1 and 5.2 correspond to maxima of the stationary probability distribution function $p_{\mathrm{st}}(x)$, whereas the unstable branches correspond to its local minima. This can be seen in Figures 5.3 and 5.4, where the stationary probability density is shown for different values of $\alpha$, corresponding to the bifurcation diagrams in Figures 5.1 and 5.2, respectively. Because of the symmetry of the considered potentials, the probability distribution function is symmetric with respect to $x = 0$ in these examples.

Figure 5.3 The stationary probability density $p_{\mathrm{st}}(x)$ for the potential (5.59) with $\beta = 1$ at $\alpha = 0.5$ (dotted curve), $\alpha = 0$ (dashed curve), and $\alpha = -0.5$ (solid curve).
Figure 5.4 The stationary probability density $p_{\mathrm{st}}(x)$ for the potential (5.62) with $\beta = -1$ and $\gamma = 1$ at $\alpha = 0.5$ (dotted curve), $\alpha = 0.15$ (dot-dashed curve), $\alpha = 0$ (dashed curve), and $\alpha = -0.5$ (solid curve).
5.4 Spectral Analysis
Fourier or spectral analysis is a powerful tool used to analyze the solution of the Langevin equation [174, 194]. As an example, here we apply spectral analysis to the time evolution of a vector $v(t)$. Specifically, $v(t)$ can be the velocity of a Brownian particle moving in three-dimensional space. Its Fourier representation as an infinite sum is
$$v(t) = v(t + T) = \sum_{k=-\infty}^{\infty} a_k \cos\left( \frac{2\pi k t}{T} \right) + \sum_{k=-\infty}^{\infty} b_k \sin\left( \frac{2\pi k t}{T} \right). \tag{5.66}$$
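Here $a_k$ and $b_k$ are the Fourier coefficients given by
$$a_k = \frac{1}{T} \int_0^T v(t) \cos\left( \frac{2\pi k t}{T} \right) dt, \tag{5.67}$$
$$b_k = \frac{1}{T} \int_0^T v(t) \sin\left( \frac{2\pi k t}{T} \right) dt. \tag{5.68}$$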
Using the well-known Euler formulas
$$e^{ix} = \cos(x) + i \sin(x), \tag{5.69}$$
and
$$\cos(x) = \frac{e^{ix} + e^{-ix}}{2}, \qquad \sin(x) = \frac{e^{ix} - e^{-ix}}{2i}, \tag{5.70}$$
the transformation (5.66) can be represented by complex Fourier amplitudes $\tilde{v}(\omega)$ as follows
$$v(t) = v(t + T) = \frac{1}{T} \sum_{\omega} \tilde{v}(\omega)\,e^{i\omega t}, \tag{5.71}$$
where the summation runs over a set of discrete frequencies $\omega = 2\pi k/T$ with $k = 0, \pm 1, \pm 2, \ldots$. The inverse transformation reads
$$\tilde{v}(\omega) = \int_0^T v(t)\,e^{-i\omega t}\,dt. \tag{5.72}$$
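The pair (5.71) and (5.72) can be verified numerically for a sampled periodic signal. In the sketch below, which is an illustration under stated assumptions rather than part of the original derivation, the integral (5.72) is replaced by a Riemann sum over $N$ equidistant samples and the sum (5.71) is truncated to $|k| \le N/2 - 1$. The test signal, the period $T = 2$, and the function names `amplitude` and `reconstruct` are hypothetical choices.

```python
import cmath
import math

T, N = 2.0, 64  # period and number of samples (illustrative values)
dt = T / N
ts = [n * dt for n in range(N)]
# scalar test signal with a constant part and two harmonics
v = [1.0 + 2.0 * math.cos(2 * math.pi * t / T)
     + 0.5 * math.sin(2 * math.pi * 3 * t / T) for t in ts]


def amplitude(k):
    """Riemann-sum approximation of (5.72) at omega = 2 pi k / T."""
    omega = 2 * math.pi * k / T
    return dt * sum(vn * cmath.exp(-1j * omega * t) for vn, t in zip(v, ts))


def reconstruct(t, k_max=N // 2 - 1):
    """Truncated version of the sum (5.71) over |k| <= k_max."""
    total = sum(amplitude(k) * cmath.exp(2j * math.pi * k * t / T)
                for k in range(-k_max, k_max + 1))
    return (total / T).real


print("constant part:", amplitude(0) / T)
print("max reconstruction error:",
      max(abs(reconstruct(t) - vt) for t, vt in zip(ts, v)))
```

For a signal containing only a few low harmonics, as here, the Riemann sum reproduces the amplitudes up to rounding errors, so the reconstruction error at the sample points is at the level of machine precision.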
Equation (5.71) can be viewed as an expansion in the basis of orthogonal wave functions $e^{i\omega t}$, which satisfy the periodic boundary conditions and have the orthogonality property
$$\frac{1}{T} \int_0^T e^{i\omega t}\,e^{-i\omega' t}\,dt = \delta_{\omega,\omega'}. \tag{5.73}$$
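Note that the term $\omega = 0$ in (5.71) represents the constant, that is, the time-independent contribution. If, in general, $\lim_{T \to \infty} \langle v(t) \rangle$ is a constant, it is just $\tilde{v}(0)/T$. In this case we have
$$v(t) - \langle v \rangle = \frac{1}{T} \sum_{\omega \neq 0} \tilde{v}(\omega)\,e^{i\omega t}. \tag{5.74}$$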
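If the period $T$ tends to infinity ($T \to \infty$), the discrete sum over $\omega \neq 0$ may be replaced by an integral. This substitution in (5.74) yields
$$v(t) - \langle v \rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{v}(\omega)\,e^{i\omega t}\,d\omega. \tag{5.75}$$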
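For any given time difference $t' - t = \Delta t$, the correlation function in the stationary regime at $T \to \infty$ is given by the components $v_i$ of the velocity vector $v$ as
$$\varphi_{ij}(\Delta t) = \left\langle \left( v_i(t) - \langle v_i \rangle \right) \left( v_j(t + \Delta t) - \langle v_j \rangle \right) \right\rangle \tag{5.76}$$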
and depends only on the absolute value of $\Delta t$. Due to the periodic boundary conditions the correlation function (5.76) can be defined within $\Delta t \in [-T/2, T/2]$. The spectral density is then given by the Fourier transformation
$$S_{jl}(\omega) = \lim_{T \to \infty} \int_{-T/2}^{T/2} \varphi_{jl}(\Delta t)\,e^{-i\omega \Delta t}\,d(\Delta t). \tag{5.77}$$
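The inverse transformation reads
$$\varphi_{jl}(\Delta t) = \lim_{T \to \infty} \frac{1}{T} \sum_{\omega} S_{jl}(\omega)\,e^{i\omega \Delta t} = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\,S_{jl}(\omega)\,e^{i\omega \Delta t}. \tag{5.78}$$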
The spectral density (5.77) can be expressed in terms of $\tilde{v}_i(\omega)$, where $\tilde{v}_i(\omega)$ are the components of the vector $\tilde{v}(\omega)$. Since $S_{jl}(\omega)$ is independent of time, a formal averaging over time $t$ in (5.77) does not change the result. Thus we can write
$$S_{jl}(\omega) = \lim_{T \to \infty} \frac{1}{T} \int_0^T \int_{-T/2}^{T/2} \varphi_{jl}(\Delta t)\,e^{-i\omega \Delta t}\,dt\,d(\Delta t). \tag{5.79}$$
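Inserting (5.76) and (5.74), we obtain
$$S_{jl}(\omega) = \lim_{T \to \infty} \frac{1}{T^3} \sum_{\omega_1 \neq 0} \sum_{\omega_2 \neq 0} \left\langle \tilde{v}_j(\omega_1)\,\tilde{v}_l(\omega_2) \right\rangle \int_0^T \int_{-T/2}^{T/2} e^{i(\omega_1 + \omega_2)t}\,e^{i(\omega_2 - \omega)\Delta t}\,dt\,d(\Delta t). \tag{5.80}$$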
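Taking into account the orthogonality property (5.73), only the terms with $\omega_2 = \omega$ and $\omega_1 = -\omega$ remain after the integration. Thus, for $\omega \neq 0$, we have
$$S_{jl}(\omega) = \lim_{T \to \infty} \frac{1}{T} \left\langle \tilde{v}_j(-\omega)\,\tilde{v}_l(\omega) \right\rangle. \tag{5.81}$$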
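Since the velocity $v$ is real, we have $\tilde{v}_j(-\omega) = \tilde{v}_j^*(\omega)$ according to (5.71), and therefore
$$S_{jj}(\omega) = \lim_{T \to \infty} \frac{1}{T} \left\langle \tilde{v}_j^*(\omega)\,\tilde{v}_j(\omega) \right\rangle = \lim_{T \to \infty} \frac{1}{T} \left\langle |\tilde{v}_j(\omega)|^2 \right\rangle. \tag{5.82}$$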
The relation between the spectral density given by (5.82) and (5.77) (at $j = l$) is known as the Wiener–Khinchin theorem. As a simple example, where the spectral density can be easily calculated, we consider an exponentially decaying correlation function
$$\varphi(\Delta t) = A\,e^{-|\Delta t|/\tau}. \tag{5.83}$$
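In this case the correlation function is a scalar quantity corresponding to a one-dimensional motion, therefore the indices are omitted. The spectral density calculated from (5.77) is
$$\begin{aligned} S(\omega) &= \int_{-\infty}^{\infty} \varphi(\Delta t)\,e^{-i\omega \Delta t}\,d(\Delta t) = \int_{-\infty}^{\infty} A\,e^{-|\Delta t|/\tau}\,e^{-i\omega \Delta t}\,d(\Delta t) \\ &= A \int_{-\infty}^{0} e^{(1 - i\omega\tau)\Delta t/\tau}\,d(\Delta t) + A \int_0^{\infty} e^{-(1 + i\omega\tau)\Delta t/\tau}\,d(\Delta t) \\ &= A \left[ \frac{\tau}{1 - i\omega\tau} + \frac{\tau}{1 + i\omega\tau} \right] = \frac{2A\tau}{1 + (\omega\tau)^2}. \end{aligned} \tag{5.84}$$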
In the limit τ → 0 and A → ∞ at a constant Aτ, the correlation function (5.83) is proportional to the delta function and corresponds to the white noise. According to (5.84), the Fourier spectrum of the white noise, obtained in this limit, is independent of the frequency ω. In other words, like the white light, it contains the whole uniform spectrum of frequencies. At a finite value of the parameter τ, which can be interpreted as a correlation time, the Fourier spectrum has a smooth cut-off at ω ≈ 1/τ. This spectrum corresponds to colored noise. For 1/f noise in vehicular traffic, see [251].
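The Lorentzian (5.84) can be cross-checked by integrating (5.77) numerically for the exponential correlation function (5.83). The sketch below is an illustration only: the values $A = 2$ and $\tau = 0.5$, the truncation of the integral at $\pm 50\tau$, and the function name `spectral_density` are assumptions, and the trapezoidal rule is just one convenient quadrature.

```python
import math

A, tau = 2.0, 0.5  # illustrative amplitude and correlation time


def spectral_density(omega, A, tau, half_width=50.0, n=100_001):
    """Trapezoidal estimate of (5.77) for phi(dt) = A exp(-|dt|/tau)."""
    L = half_width * tau  # the integrand is negligible beyond +-L
    h = 2 * L / (n - 1)
    total = 0.0
    for i in range(n):
        s = -L + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        # the imaginary part cancels by symmetry, so only the cosine is kept
        total += weight * A * math.exp(-abs(s) / tau) * math.cos(omega * s)
    return h * total


for omega in (0.0, 1.0, 5.0):
    exact = 2 * A * tau / (1 + (omega * tau) ** 2)
    print(f"omega = {omega}: numeric {spectral_density(omega, A, tau):.6f}, "
          f"formula (5.84) {exact:.6f}")
```

Reducing $\tau$ at fixed $A\tau$ flattens the spectrum over a wider frequency range, which is the white-noise limit described above.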
5.5 Brownian Motion in Three-Dimensional Velocity Space
Here we will show how the techniques of correlation functions and Fourier analysis, introduced in Section 5.4, are applied to a specific problem of Brownian motion. Consider first the deterministic motion of a Brownian particle with initial velocity $v(t = 0) = v_0$ in a medium (liquid) with friction. Here the velocity is a three-dimensional vector. Its time evolution is described by the equation
$$\frac{dv(t)}{dt} = -\gamma\,v(t), \tag{5.85}$$
where $\gamma$ is the friction coefficient. The solution reads
$$v(t) = v_0\,e^{-\gamma t}. \tag{5.86}$$
Thus, in this simple model, the particle reduces its velocity asymptotically to zero due to friction. This equation, however, does not completely describe the motion of a particle in a liquid. One needs to take into account the randomness caused by stochastic collisions with liquid molecules, which never allow the velocity to relax to zero. This effect is described by the Langevin equation
$$\frac{dv(t)}{dt} = -\gamma\,v(t) + \sqrt{2B}\,\xi(t), \tag{5.87}$$
where (5.85) is completed by a stochastic (Langevin) force $\sqrt{2B}\,\xi(t)$. Here $B$ is the diffusion coefficient in the velocity space and $\xi(t)$ is a three-dimensional vector with components $\xi_i(t)$, representing a stochastic process. The actual Brownian motion in the space of velocity $v$ and coordinate $x$ is known as the Ornstein–Uhlenbeck process. For the case where the velocity and the coordinate are scalar (one-dimensional) quantities this process will be discussed, based on the Fokker–Planck equation, in Chapter 8. The stochastic force should have the following properties.

1. Each component of the stochastic force has zero mean value
$$\langle \xi_i(t) \rangle_{v_0} = 0, \tag{5.88}$$
where the symbol $\langle \cdot \rangle_{v_0}$ indicates that only those stochastic realizations are considered for which $v(t = 0) = v_0$. This means that the stochastic force has no influence on the averaged motion.

2. The Langevin force is a Gaussian stochastic process, which means that all higher order correlation functions reduce to the two-time correlation function $\langle \xi_i(t_1)\xi_j(t_2) \rangle_{v_0}$ according to
$$\langle \xi(t_1)\xi(t_2) \cdots \xi(t_{2n}) \rangle_{v_0} = \sum_{\text{all pairings}} \langle \xi(t_i)\xi(t_j) \rangle_{v_0} \cdots \langle \xi(t_k)\xi(t_l) \rangle_{v_0}. \tag{5.89}$$
Like the first moment (5.88), all odd-order moments are zero.

3. The correlation function $\langle \xi_i(t)\xi_j(t') \rangle_{v_0}$ is $\delta$-correlated in time
$$\langle \xi_i(t)\,\xi_j(t') \rangle_{v_0} = \delta_{ij}\,\delta(t - t'). \tag{5.90}$$
Also, this formula implies that different components are uncorrelated or statistically independent.

4. The stochastic process for the velocity $v(t)$ of the Brownian particle is statistically independent of the stochastic force $\sqrt{2B}\,\xi(t')$ for $t' > t$, so that $v(t)$ at a given time is independent of the stochastic force in the future:
$$\langle v(t)\,\xi(t') \rangle_{v_0} = 0 \quad \text{for } t' > t. \tag{5.91}$$
The velocity $v(t)$, naturally, will be affected by $\xi(t')$ at $t' < t$. In the following, we consider two different ways to get the solution of the Langevin equation (5.87): by direct integration and by harmonic analysis. Direct integration yields a formal solution for each specific realization of the stochastic process $\xi(t)$,
$$v(t) = v_0\,e^{-\gamma t} + e^{-\gamma t} \int_0^t e^{\gamma t'} \sqrt{2B}\,\xi(t')\,dt', \tag{5.92}$$
as can be verified by inserting (5.92) into (5.87). This solution allows us to calculate moments of the velocity distribution for the ensemble of all stochastic realizations with given initial velocity $v_0$. The first moment is
$$\langle v(t) \rangle_{v_0} = v_0\,e^{-\gamma t} + e^{-\gamma t} \int_0^t e^{\gamma t'} \sqrt{2B}\,\langle \xi(t') \rangle_{v_0}\,dt'. \tag{5.93}$$
The last term vanishes, since the Langevin force has zero mean value, as discussed above. Thus we have
$$\langle v(t) \rangle_{v_0} = v_0\,e^{-\gamma t}. \tag{5.94}$$
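The correlation function $\langle v(t)v(t') \rangle_{v_0}$ for velocities at different times can also be calculated in this way. Alternatively, the correlation function can be defined for deviations from the mean values as $\langle (v(t) - \langle v(t) \rangle)(v(t') - \langle v(t') \rangle) \rangle_{v_0}$ by analogy with (5.76). Both definitions are equivalent for long times, where the mean velocity $\langle v(t) \rangle_{v_0}$ tends to zero. For definiteness we assume that $t' > t$ holds. Then for any velocity component we have
$$\begin{aligned} \langle v_i(t)\,v_i(t') \rangle_{v_0} &= v_{i,0}^2\,e^{-\gamma(t' + t)} + 2B\,e^{-\gamma(t' + t)} \int_0^t \int_0^{t'} e^{\gamma(s' + s)} \langle \xi_i(s)\,\xi_i(s') \rangle\,ds\,ds' \\ &= v_{i,0}^2\,e^{-\gamma(t' + t)} + 2B\,e^{-\gamma(t' + t)} \int_0^t e^{\gamma(s + s)}\,ds \\ &= v_{i,0}^2\,e^{-\gamma(t' + t)} + \frac{B}{\gamma} \left[ e^{-\gamma(t' - t)} - e^{-\gamma(t' + t)} \right]. \end{aligned} \tag{5.95}$$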
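By using the definition of the scalar product, the correlation function $\langle v(t) \cdot v(t') \rangle_{v_0}$ is easily calculated from (5.95) as
$$\langle v(t) \cdot v(t') \rangle_{v_0} = \sum_i \langle v_i(t)\,v_i(t') \rangle_{v_0}. \tag{5.96}$$
The second moment for each velocity component is obtained from (5.95) by setting $t' = t$, so
$$\langle v_i^2(t) \rangle_{v_0} = v_{i,0}^2\,e^{-2\gamma t} + \frac{B}{\gamma} \left( 1 - e^{-2\gamma t} \right). \tag{5.97}$$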
As an illustrative example, the mean value for one of the velocity components $v_i = v_x$ and the variance $\langle v_x^2 \rangle_{v_0} - \langle v_x \rangle_{v_0}^2$, calculated from (5.94) and (5.97), are shown in Figure 5.5. The theoretical curve for the mean value (dashed line) is compared with one specific realization of the process (fluctuating curve). The variance (solid line) shows the magnitude of the stochastic fluctuations.

Figure 5.5 A stochastic trajectory (fluctuating curve) and the mean value (dashed curve) of the velocity component $v_x$, measured in $\mathrm{m\,s^{-1}}$, depending on the time $t$ measured in seconds. The initial condition is $v_{x,0} = 5\ \mathrm{m\,s^{-1}}$ and the values of the parameters are $\gamma = 1\ \mathrm{s^{-1}}$ and $B = 2\ \mathrm{m^2\,s^{-3}}$. The solid curve shows the temporal behavior of the variance $\langle v_x^2 \rangle_{v_0} - \langle v_x \rangle_{v_0}^2$.

Apart from the mean values, the probability density $p(v_x, v_y, v_z, t)$ in the three-dimensional velocity space is also of interest. Taking into account that the velocity components in (5.87) are not coupled, their probability distributions are independent, and we have
$$p(v_x, v_y, v_z, t) = p(v_x, t)\,p(v_y, t)\,p(v_z, t), \tag{5.98}$$
where $p(v_x, t)$, $p(v_y, t)$, and $p(v_z, t)$ are the probability densities for one component. The latter can be calculated by solving the corresponding Fokker–Planck equation for the one-dimensional problem, considered in detail in Chapter 8. Here we only report the result (cf. (8.34))
$$p(v_i, t) = \frac{1}{\sqrt{2\pi\sigma^2(t)}} \exp\left[ -\frac{\left( v_i - v_{i,0} \exp[-\gamma t] \right)^2}{2\sigma^2(t)} \right], \tag{5.99}$$
where $i = x, y, z$ denotes the $i$th component of vector $v$ and
$$\sigma^2(t) = \langle v_i^2 \rangle - \langle v_i \rangle^2 = \frac{B}{\gamma} \left( 1 - \exp[-2\gamma t] \right) \tag{5.100}$$
is the variance consistent with (5.94) and (5.97). For large times $t$ the initial state (velocity $v_0$) is forgotten and the final equilibrium state is given by
$$\lim_{t \to \infty} \langle v_i^2(t) \rangle_{v_0} = B/\gamma. \tag{5.101}$$
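The mean (5.94) and the variance (5.100) can be compared with an ensemble of simulated trajectories. The sketch below integrates one component of (5.87) with a simple Euler–Maruyama step, which is a choice made here for illustration and not a scheme prescribed by the text. The parameter values are those quoted for Figure 5.5; the step size, the ensemble size, and the random seed are assumptions.

```python
import math
import random

gamma, B, v0 = 1.0, 2.0, 5.0  # parameters quoted for Figure 5.5
dt, n_steps, n_paths = 0.01, 200, 2000  # illustrative numerical settings
rng = random.Random(0)

noise_amp = math.sqrt(2 * B * dt)  # sqrt(2B) dW with <dW^2> = dt
v = [v0] * n_paths
for _ in range(n_steps):
    # Euler-Maruyama step for one velocity component of (5.87)
    v = [vi - gamma * vi * dt + noise_amp * rng.gauss(0.0, 1.0) for vi in v]

t = n_steps * dt
mean = sum(v) / n_paths
var = sum((vi - mean) ** 2 for vi in v) / n_paths
print(f"t = {t}: mean {mean:.3f} vs (5.94) {v0 * math.exp(-gamma * t):.3f}")
print(f"t = {t}: variance {var:.3f} vs (5.100) "
      f"{B / gamma * (1 - math.exp(-2 * gamma * t)):.3f}")
```

Agreement is only approximate, since the ensemble is finite and the Euler step introduces an error of order $dt$.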
On the other hand, it is well known that
$$\langle v_i^2 \rangle = \frac{k_B T}{m} \tag{5.102}$$
holds in the equilibrium of a classical system. Comparing (5.101) and (5.102) we arrive at the relation
$$\frac{B}{\gamma} = \frac{k_B T}{m} \tag{5.103}$$
known as the Einstein formula. It relates the macroscopic quantity (friction coefficient) $\gamma$, which describes the dissipation of the momentum, to the microscopic quantity (diffusion coefficient) $B$, which describes the stochastic force. The Langevin equation (5.87) can also be solved by the Fourier transformation method. According to (5.71), we have
$$\frac{dv(t)}{dt} = \frac{1}{T} \sum_{\omega} \tilde{v}(\omega)\,\frac{d}{dt} e^{i\omega t} = \frac{1}{T} \sum_{\omega} i\omega\,\tilde{v}(\omega)\,e^{i\omega t}. \tag{5.104}$$
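This, in fact, is the Fourier expansion of $dv(t)/dt$ with the coefficients $i\omega\tilde{v}(\omega)$. The latter thus represent the Fourier transform of $dv(t)/dt$. Hence, (5.87) in Fourier representation reads
$$(i\omega + \gamma)\,\tilde{v}(\omega) = \sqrt{2B}\,\tilde{\xi}(\omega), \tag{5.105}$$
where $\tilde{\xi}(\omega)$ is the Fourier transform of the noise $\xi(t)$. This is a simple algebraic equation yielding
$$\tilde{v}(\omega) = \frac{\sqrt{2B}\,\tilde{\xi}(\omega)}{i\omega + \gamma}. \tag{5.106}$$
By analogy with $S_{jl}(\omega)$ given by (5.77), we will now consider the spectral density for the velocity $S_{v_j v_l}(\omega) \equiv S_{jl}(\omega)$, as well as for the noise $S_{\xi_j \xi_l}(\omega)$. These spectral densities are the Fourier transforms of the corresponding correlation functions for the velocity (5.76) and for the noise (5.90). Following (5.82), the diagonal terms (in the stationary regime at $T \to \infty$) are given by $S_{v_j v_j}(\omega) = \langle |\tilde{v}_j(\omega)|^2 \rangle$ and $S_{\xi_j \xi_j}(\omega) = \langle |\tilde{\xi}_j(\omega)|^2 \rangle$. The nondiagonal case is consistent with (5.81), that is, $S_{v_j v_l}(\omega) = \langle \tilde{v}_j(-\omega)\tilde{v}_l(\omega) \rangle$ and $S_{\xi_j \xi_l}(\omega) = \langle \tilde{\xi}_j(-\omega)\tilde{\xi}_l(\omega) \rangle$. The spectral density for the coordinate, defined as $S_{x_j x_l}(\omega) = \langle \tilde{x}_j(-\omega)\tilde{x}_l(\omega) \rangle$, can also be calculated. In the latter case, however, the corresponding correlation function $\langle (x_j(t) - \langle x_j \rangle)(x_l(t + \Delta t) - \langle x_l \rangle) \rangle$ diverges in the long-time limit at $j = l$. The spectral density of the noise term follows from (5.90) as
$$\begin{aligned} S_{\xi_j \xi_l}(\omega) = \langle \tilde{\xi}_j(-\omega)\,\tilde{\xi}_l(\omega) \rangle &= \int_{-\infty}^{\infty} \langle \xi_j(t)\,\xi_l(t + \Delta t) \rangle\,e^{-i\omega \Delta t}\,d(\Delta t) \\ &= \int_{-\infty}^{\infty} \delta_{jl}\,\delta(\Delta t)\,e^{-i\omega \Delta t}\,d(\Delta t) = \delta_{jl}. \end{aligned} \tag{5.107}$$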
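The spectral density of the velocity can easily be calculated from (5.106) and (5.107),
$$S_{v_j v_l}(\omega) = \langle \tilde{v}_j(-\omega)\,\tilde{v}_l(\omega) \rangle = \frac{2B\,\langle \tilde{\xi}_j(-\omega)\,\tilde{\xi}_l(\omega) \rangle}{\omega^2 + \gamma^2} = \frac{2B\,\delta_{jl}}{\omega^2 + \gamma^2}. \tag{5.108}$$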
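The spectral density of the coordinate is calculated using the relation $dx(t)/dt = v(t)$. According to
$$x(t) = \frac{1}{T} \sum_{\omega} \tilde{x}(\omega)\,e^{i\omega t}, \tag{5.109}$$
$$\frac{dx(t)}{dt} = \frac{1}{T} \sum_{\omega} i\omega\,\tilde{x}(\omega)\,e^{i\omega t}, \tag{5.110}$$
the Fourier transform of $x(t)$ is $\tilde{x}(\omega)$, whereas that of $dx(t)/dt$ is $i\omega\tilde{x}(\omega)$. The latter is equal to $\tilde{v}(\omega)$, consistent with $dx(t)/dt = v(t)$, that is,
$$\tilde{v}(\omega) = i\omega\,\tilde{x}(\omega). \tag{5.111}$$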
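Using this relation, we obtain
$$S_{x_j x_l}(\omega) = \langle \tilde{x}_j(-\omega)\,\tilde{x}_l(\omega) \rangle = \frac{1}{\omega^2} \langle \tilde{v}_j(-\omega)\,\tilde{v}_l(\omega) \rangle = \frac{1}{\omega^2}\,\frac{2B\,\delta_{jl}}{\omega^2 + \gamma^2}. \tag{5.112}$$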
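The second moment of a given velocity component can be calculated from (5.78). Taking into account the definition of the correlation function (5.76), its Fourier representation (5.78), and the fact that the mean velocity is zero at $t \to \infty$, (5.108) yields in the large-time limit
$$\langle v_j^2 \rangle = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\,S_{v_j v_j}(\omega) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{B\,d\omega}{\omega^2 + \gamma^2} = \frac{B}{\gamma}, \tag{5.113}$$
as well as
$$\lim_{t \to \infty} \langle v_j(t)\,v_j(t + \Delta t) \rangle = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\,S_{v_j v_j}(\omega)\,e^{i\omega \Delta t} = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{B\,e^{i\omega \Delta t}}{\omega^2 + \gamma^2}\,d\omega = \frac{B}{\gamma}\,e^{-\gamma|\Delta t|} = \langle v_j^2 \rangle\,e^{-\gamma|\Delta t|}. \tag{5.114}$$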
The result (5.113) is the same as (5.101), obtained earlier by the method of direct integration of the Langevin equation. The integral in (5.114) is calculated by choosing a suitable contour in the complex $\omega$ plane and using the residue theorem. For $\Delta t > 0$ we need only to calculate the residue at $\omega = i\gamma$, whereas for $\Delta t < 0$ we need the residue at $\omega = -i\gamma$. Equation (5.114) is consistent with the relation between the exponential correlation function and its spectral density given by (5.84). The correlation function (5.114) corresponds to the long-time limit $t \to \infty$ at a given positive $\Delta t = t' - t$ in the formula (5.95) obtained by the method of direct integration for arbitrary $t$. The decay of the correlation function $\langle v_j(t)v_j(t + \Delta t) \rangle$ is illustrated in Figure 5.6, comparing analytical and simulation results. The latter are obtained by time-averaging over one stochastic trajectory, which is equivalent to ensemble-averaging.

The diffusion coefficient is related to the spectral density of the stochastic force $\sqrt{2B}\,\xi(t)$ via
$$\left\langle \left| \sqrt{2B}\,\tilde{\xi}(\omega) \right|^2 \right\rangle = 2B, \tag{5.115}$$
as follows from (5.107) at $j = l$. The Einstein formula (5.103) thus can be written as
$$\left\langle \left| \sqrt{2B}\,\tilde{\xi}(\omega) \right|^2 \right\rangle = 2\gamma k_B T/m. \tag{5.116}$$
This equation represents a relation between the friction coefficient and the spectral density of the noise source. Hence, it is a special form of the fluctuation-dissipation theorem.

Figure 5.6 The correlation function $\langle v_x(t)v_x(t + \Delta t) \rangle$ (at $t \to \infty$) for the velocity component $v_i = v_x$, measured in $\mathrm{m\,s^{-1}}$, depending on the time difference $\Delta t$ given in seconds at $\gamma = 1\ \mathrm{s^{-1}}$ and $B = 2\ \mathrm{m^2\,s^{-3}}$. The solid curve shows the analytical result (5.114). The dashed and dotted curves represent estimates obtained by a time-averaging over $t \in [100\ \mathrm{s}, 2000\ \mathrm{s}]$ and $t \in [100\ \mathrm{s}, 9000\ \mathrm{s}]$, respectively, for one stochastic trajectory.
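A time-averaged estimate of the kind shown in Figure 5.6 can be sketched as follows. This is not the authors' simulation code. It uses the fact, consistent with (5.99) and (5.100), that given $v(t)$ the value $v(t + \delta t)$ is Gaussian with mean $v(t)e^{-\gamma \delta t}$ and variance $(B/\gamma)(1 - e^{-2\gamma \delta t})$, so a trajectory can be sampled without time-discretization error. The step $\delta t = 0.05$, the trajectory length, the seed, and the function name `correlation` are assumptions.

```python
import math
import random

gamma, B = 1.0, 2.0  # parameters quoted for Figure 5.6
dt, n_steps = 0.05, 200_000  # illustrative sampling step and trajectory length
rng = random.Random(2)

decay = math.exp(-gamma * dt)
kick = math.sqrt(B / gamma * (1.0 - decay**2))  # standard deviation from (5.100)

v = [0.0]
for _ in range(n_steps):
    v.append(v[-1] * decay + kick * rng.gauss(0.0, 1.0))
v = v[2000:]  # discard the transient (t < 100)


def correlation(lag_steps):
    """Time average of v(t) v(t + lag) along the single trajectory."""
    n = len(v) - lag_steps
    return sum(v[i] * v[i + lag_steps] for i in range(n)) / n


for lag in (0, 20, 40):
    delta_t = lag * dt
    print(f"lag {delta_t:.1f}: estimate {correlation(lag):.3f}, "
          f"(5.114) gives {B / gamma * math.exp(-gamma * delta_t):.3f}")
```

The scatter of the estimate decreases with the length of the averaging window, which is the effect visible when comparing the two averaging intervals in Figure 5.6.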
5.6 Stochastic Differential Equations
As a starting point we consider the one-dimensional stochastic differential equation for the variable $x$, which can be, e.g., the coordinate of a particle performing a random walk in one dimension. The motion of the particle is described by the stochastic differential equation (SDE)
$$dx(t) = a(x)\,dt + b_\eta(x)\,dW(t), \tag{5.117}$$
where $a(x)$ and $b_\eta(x)$ are given functions of $x$, $dx(t) = x(t + dt) - x(t)$ is the increment of $x$ in the time interval from $t$ to $t + dt$, whereas $dW(t) = W(t + dt) - W(t)$ is the increment of the standard Wiener process having the properties $\langle dW(t) \rangle = 0$ and $\langle (dW(t))^2 \rangle = dt$. Later on, the Wiener process will be discussed in detail. Equation (5.117), written in the form (5.41)
$$\frac{dx}{dt} = a(x) + b_\eta(x)\,\xi(t), \tag{5.118}$$
is known as the Langevin equation [61, 234]. The Langevin force, formally $\xi(t) = dW(t)/dt$, has to be understood as a fluctuating quantity having the Gaussian distribution
$$p(\xi(t)) = \sqrt{\frac{dt}{2\pi}} \exp\left( -\frac{dt}{2}\,\xi^2 \right) \tag{5.119}$$
with $\langle \xi(t) \rangle = 0$ and $\langle \xi(t)\xi(t') \rangle = \delta(t - t')$. According to the formal substitution $\xi(t) = dW(t)/dt$, the variance diverges as $\langle \xi(t)^2 \rangle = 1/dt \to \infty$ at $dt \to 0$. The substitution $dW(t) = \xi(t)\,dt$, however, is strictly speaking incorrect: it represents only a formal way of writing and has no rigorous mathematical meaning, since stochastic trajectories are not differentiable. An important peculiarity of the stochastic differential equation (5.117) and of the Langevin equation (5.118) is that their solution essentially depends on how the coefficient $b_\eta(x)$ of the noise term is defined. Namely, it is important whether this coefficient is determined at $x = x(t)$, at $x = x(t + dt)$, or at $x$ at some intermediate time moment. The parameter $\eta$ is introduced to distinguish between these cases. Different possibilities can be chosen according to
$$b_\eta(x) = b(x(t + \eta\,dt)). \tag{5.120}$$
The case $\eta = 0$, when the coefficient $b$ is determined at the left border of the integration interval $[t, t + dt]$, is called the Ito stochastic process. In the case of the Stratonovich process, where $\eta = 1/2$, it is determined in the middle of the interval. Finally, if $b_\eta(x)$ is determined at the right border $t + dt$, which corresponds to $\eta = 1$, then we are dealing with the Hänggi–Klimontovich process. Alternatively, one can define the coefficient $b_\eta(x)$ as
$$b_\eta(x) = b\left( (1 - \eta')\,x(t) + \eta'\,x(t + dt) \right). \tag{5.121}$$
The two definitions (5.120) and (5.121) are identical at $\eta = \eta' = 0$ and $\eta = \eta' = 1$. For an arbitrary stochastic trajectory $x(t)$, however, $\eta$ and $\eta'$ generally differ if $0 < \eta < 1$. The solution of the stochastic differential equation of Ito type (Ito-SDE)
$$dx(t) = a[x(t)]\,dt + b[x(t)]\,dW(t) \tag{5.122}$$
is represented by the Ito stochastic integral
$$x(t) = x(t_0) + \int_{t_0}^{t} a[x(t')]\,dt' + \int_{t_0}^{t} b[x(t')]\,dW(t'). \tag{5.123}$$
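Equation (5.122) thus has a unique solution (5.123) which is a Markov process. The probability density $p(x, t)$ of finding the particle at a position $x$ at time $t$ is given by the Fokker–Planck equation with general $\eta$
$$\frac{\partial}{\partial t} p = \frac{\partial}{\partial x} \left[ -a(x)\,p + \frac{1}{2}\,b(x)^{2\eta}\,\frac{\partial}{\partial x} \left( b(x)^{2(1 - \eta)}\,p \right) \right]. \tag{5.124}$$
The stationary solution of (5.124) reads
$$p_{\mathrm{st}}(x) = \frac{C}{b(x)^{2(1 - \eta)}} \exp\left( 2 \int^{x} \frac{a(y)}{b(y)^2}\,dy \right) \tag{5.125}$$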
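with integration constant $C$ given by the normalization condition $\int p_{\mathrm{st}}(x)\,dx = 1$. In the case of Ito stochastic calculus (integration at the left border), the stochastic differential equation (5.122) in typical notation is written as
$$dx(t) = a(x)\,dt + b(x)\,dW(t), \tag{5.126}$$
and the corresponding Fokker–Planck equation reads
$$\frac{\partial p}{\partial t} = \frac{\partial}{\partial x} \left[ -a(x)\,p + \frac{1}{2} \frac{\partial}{\partial x} \left( b(x)^2\,p \right) \right] \tag{5.127}$$
$$\phantom{\frac{\partial p}{\partial t}} = -\frac{\partial}{\partial x} \left[ a(x)\,p \right] + \frac{1}{2} \frac{\partial^2}{\partial x^2} \left[ b(x)^2\,p \right]. \tag{5.128}$$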
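To distinguish it from the Ito-SDE, the Stratonovich-SDE (integration in the middle) in this notation is written using the special symbol $\circ$,
$$dx(t) = a(x)\,dt + b(x) \circ dW(t). \tag{5.129}$$
The corresponding Fokker–Planck equation is
$$\frac{\partial p}{\partial t} = \frac{\partial}{\partial x} \left[ -a(x)\,p + \frac{1}{2}\,b(x)\,\frac{\partial}{\partial x} \left( b(x)\,p \right) \right]. \tag{5.130}$$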
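The practical difference between the two interpretations shows up as soon as the noise is multiplicative. The sketch below compares them for the illustrative choice $a(x) = 0$, $b(x) = x$, which is an assumption made for this example. The Ito equation is integrated with an Euler step that evaluates $b$ at the left border, and the Stratonovich equation with a Heun (predictor-corrector) step that averages $b$ over both borders, a scheme commonly used for Stratonovich equations. According to (5.128) the Ito mean stays at $x_0$, whereas (5.130) implies an extra drift $b\,b'/2 = x/2$, so the Stratonovich mean grows as $x_0 e^{t/2}$.

```python
import math
import random

rng = random.Random(3)
dt, n_steps, n_paths = 0.02, 50, 20_000  # illustrative numerical settings


def b(x):
    """Illustrative multiplicative noise amplitude; the drift a(x) is zero."""
    return x


sum_ito = sum_strat = 0.0
for _ in range(n_paths):
    x_ito = x_str = 1.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        x_ito += b(x_ito) * dW  # b taken at the left border (eta = 0)
        x_pred = x_str + b(x_str) * dW  # predictor for x(t + dt)
        x_str += 0.5 * (b(x_str) + b(x_pred)) * dW  # b averaged over both borders
    sum_ito += x_ito
    sum_strat += x_str

t_end = n_steps * dt
mean_ito, mean_strat = sum_ito / n_paths, sum_strat / n_paths
print(f"Ito mean at t = {t_end}: {mean_ito:.3f} (expected 1)")
print(f"Stratonovich mean at t = {t_end}: {mean_strat:.3f} "
      f"(expected {math.exp(t_end / 2):.3f})")
```

Both schemes use the same increments $dW$, so the difference in the means comes entirely from where $b(x)$ is evaluated within each step.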
One has to take into account that deviations from the usual differentiation rules take place at $\eta \neq 1/2$. This is important when making a transformation of variable $y = g(x) \iff x = g^{-1}(y)$. The transformed Langevin equation then reads
$$\frac{dy}{dt} = \tilde{a}(y) + \tilde{b}_\eta(y)\,\xi(t) \tag{5.131}$$
or
$$dy = \tilde{a}(y)\,dt + \tilde{b}_\eta(y)\,dW(t) \tag{5.132}$$
with coefficients
$$\tilde{a}(y) = g'(x)\,a(x) + \left( \frac{1}{2} - \eta \right) g''(x)\,b(x)^2, \tag{5.133}$$
$$\tilde{b}(y) = g'(x)\,b(x), \tag{5.134}$$
where $g' = dg/dx$ and $g'' = d^2g/dx^2$. In the following we want to consider some special cases in detail, starting with a driftless process.

5.7 The Standard Wiener Process
The stochastic equation of motion driven by Gaussian white noise (compare Example 1.2 in Section 1.5) reads
$$dx(t) = dW(t). \tag{5.135}$$
As pointed out in Chapter 1, (5.135) cannot be written as $dx(t) = \xi(t)\,dt$, since $dW(t)/dt \neq \xi(t)$, as already explained in connection with (5.117)–(5.119). The solution of (5.135) is determined by the initial condition $x(t = 0) = x_0$ for the state variable $x$ and by the properties of the standard Wiener process $W(t)$, which are:
1. $W(t = 0) \equiv W_0 = 0$ almost surely;
2. $W(t)$ is a process with independent increments $dW(t) = W(t + dt) - W(t)$;
3. $W(t) - W(s)$ is normally distributed $N(0, t - s)$ with mean 0 and variance $t - s$ ($0 \le s < t$).
Integrating (5.135) we obtain
$$x(t) = x_0 + W(t). \tag{5.136}$$
The moments of the variable $x(t)$ are
$$\begin{aligned} \langle x(t) \rangle &= x_0 & &\text{since } \langle W(t) \rangle = 0, \\ \langle x(t)^2 \rangle &= x_0^2 + t & &\text{since } \langle W(t)^2 \rangle = t. \end{aligned} \tag{5.137}$$
The variance is linearly increasing in time
$$\langle x(t)^2 \rangle - \langle x(t) \rangle^2 = t, \tag{5.138}$$
and the covariance or correlation function for different times (as an extension of the variance) is given by
$$\langle x(t)\,x(t') \rangle - \langle x(t) \rangle \langle x(t') \rangle = \min\{t, t'\}. \tag{5.139}$$
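Both (5.138) and (5.139) can be checked by building Wiener trajectories from independent Gaussian increments, as implied by properties 2 and 3 above. The sketch below is a minimal illustration: the step size, the number of paths, the two times $s = 0.5$ and $t = 1.5$, and the seed are assumptions. Since $x(t) = x_0 + W(t)$, the covariance of $x$ equals that of $W$, so $x_0 = 0$ is used.

```python
import math
import random

rng = random.Random(4)
dt, n_paths = 0.01, 5000  # illustrative step size and ensemble size
i_s, i_t = 50, 150  # step indices corresponding to s = 0.5 and t = 1.5

sum_ws_wt = sum_wt2 = 0.0
for _ in range(n_paths):
    w = 0.0
    w_s = 0.0
    for step in range(1, i_t + 1):
        w += rng.gauss(0.0, 1.0) * math.sqrt(dt)  # increment with variance dt
        if step == i_s:
            w_s = w  # remember W(s)
    sum_ws_wt += w_s * w
    sum_wt2 += w * w

cov = sum_ws_wt / n_paths  # estimate of <W(s) W(t)>
var_t = sum_wt2 / n_paths  # estimate of <W(t)^2>
print(f"<W(s)W(t)> = {cov:.3f}, min(s, t) = {i_s * dt}")
print(f"<W(t)^2>   = {var_t:.3f}, t = {i_t * dt}")
```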
The Wiener process $W(t)$, which is the solution of (5.135) with $x_0 = 0$, is a Gaussian process normally distributed as $N(0, t)$, thus
$$P(a \le W(t) \le b) = \frac{1}{\sqrt{2\pi t}} \int_a^b e^{-\frac{x^2}{2t}}\,dx. \tag{5.140}$$
According to the definition of the Wiener process, its increments are independent. This means that each following step is independent of the previous history. In other words, every Wiener process is a Markov process. In precise mathematical terms, almost every trajectory of the Wiener process is differentiable almost nowhere [25, 61, 201]. Because $(W(t + h) - W(t))/\sqrt{h}$ is normally distributed as $N(0, 1)$, it follows that
$$P\left( |W(t + h) - W(t)| \le a \right) = P\left( \frac{|W(t + h) - W(t)|}{\sqrt{h}} \le \frac{a}{\sqrt{h}} \right) = \frac{1}{\sqrt{2\pi}} \int_{-a/\sqrt{h}}^{a/\sqrt{h}} e^{-x^2/2}\,dx \tag{5.141}$$
$$\le \frac{1}{\sqrt{2\pi}}\,\frac{2a}{\sqrt{h}} = \frac{2a}{\sqrt{2\pi h}}. \tag{5.142}$$
The so-called strong law of large numbers is stated as
$$\frac{W(t)}{t} \to 0 \quad \text{for} \quad t \to \infty. \tag{5.143}$$
The Wiener process has a scaling property which states that if $W(t)$ is a Wiener process, then the time-scaled process $\widetilde{W}(t)$ defined by
$$\widetilde{W}(t) = t\,W(1/t) \quad \text{with} \quad \widetilde{W}(0) = 0 \tag{5.144}$$
is also a Wiener process. Often the Wiener process with a finite stopping time $T$ is discussed. The most fundamental one is the first passage time $T(b)$, which is defined by
$$T(b) = \min\{t \ge 0;\ x(t) = b\} \tag{5.145}$$
as the minimal time at which the trajectory, starting from $x_0$, hits the value $b$ for the first time. We shall first obtain the probability density function of $T(b)$ by a heuristic argument, called the reflection principle, using a shadow path [91]. Let us denote by $P[T(b) < t]$ the probability that for a given time moment $t$ the relation $T(b) < t$ holds. It is the probability that the value $b$ has been reached within the time interval $[0, t]$. Let $P[T(b) < t, x(t) > b]$ be the probability that the two conditions $T(b) < t$ and $x(t) > b$ are satisfied simultaneously at time moment $t$. Further on, we have
$$P[x(t) > b] = P[T(b) < t, x(t) > b] + P[T(b) > t, x(t) > b] = P[T(b) < t, x(t) > b] \tag{5.146}$$
169
5 The Langevin Equation
for the probability P[x(t) > b] that x(t) exceeds b at time moment t, since P[T(b) > t, x(t) > b] = 0 holds according to the definition of T(b). The reflection principle tells us that, after reaching the border x = b, each trajectory of Wiener process has a shadow path which is a reflection with respect to the line x(t) = b. Due to the symmetry of random walk, the probability for a given set of stochastic trajectories passing x = b is the same as for the corresponding set of shadow trajectories. This shadow symmetry is reflected in the relation P[T(b) < t, x(t) > b] = P[T(b) < t, x(t) < b]
(5.147)
According to (5.146) and (5.147) we have P[T(b) < t, x(t) > b] = P[x(t) > b]
(5.148)
P[T(b) < t, x(t) < b] = P[x(t) > b]
(5.149)
Summing these two equations, we obtain P[T(b) < t] = 2P[x(t) > b].
(5.150)
Using the known result (5.140), leads to the relation . 2 ∞ −z2 /2 e dz P[T(b) < t] = π bt−1/2
(5.151)
for the cumulative probability P[T(b) < t] that a trajectory has reached the border x = b before the time moment t when starting at x0 = 0. For the general value of x0 < b one has to replace b with b − x0 . The probability density P(T; b) of the first passage time of the Wiener process starting at x0 = 0 (see Figure 5.7) is obtained from (5.151) via differentiation with respect to time t,
$$P(T; b) = \frac{b}{\sqrt{2\pi T^3}} \exp\left(-\frac{b^2}{2T}\right). \tag{5.152}$$

Figure 5.7 The first-passage time probability density $P(t; b)$ (vertical axis $P$, horizontal axis $t$) for a standard Wiener process starting at $x_0 = 0$ with absorbing boundary $b = 1$ (solid line). For comparison, the dashed curve shows the decreasing prefactor ($\sim t^{-3/2}$) of the exponential function.
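As a quick numerical cross-check of (5.150)–(5.152), the following Python sketch evaluates the cumulative probability (5.151), which can be written with the complementary error function as erfc(b/√(2t)), and compares its numerical time derivative with the density (5.152). The function names and the finite-difference step h are illustrative choices and do not come from the text.

```python
import math


def first_passage_cdf(t, b=1.0):
    """P[T(b) < t] = 2 P[W(t) > b], cf. (5.150) and (5.151), via erfc."""
    return math.erfc(b / math.sqrt(2.0 * t))


def first_passage_pdf(t, b=1.0):
    """First-passage time density (5.152)."""
    return b / math.sqrt(2.0 * math.pi * t ** 3) * math.exp(-b * b / (2.0 * t))


def numerical_derivative(f, t, h=1e-5):
    """Central finite difference of f at t (h is an assumed step size)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


for t in (0.5, 1.0, 2.0, 5.0):
    # The two printed columns should agree to many digits.
    print(t, first_passage_pdf(t), numerical_derivative(first_passage_cdf, t))
```

Setting the derivative of (5.152) to zero shows that the density has its maximum at T = b²/3, which the functions above can be used to confirm numerically.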
The mean first passage time $\langle T\rangle$ in this case is infinite, that is, the integral $\int_0^\infty T\,P(T; b)\,dT$ diverges, because the integrand decays only as $T^{-1/2}$ for large $T$.

There are special cases which are often discussed in the literature, see e.g. [25, 61, 91, 201]:
1. The Wiener process on a half-line $[0, \infty)$. One has to specify what happens when the origin is reached.
   (a) Absorbing boundary at $x = 0$.
   (b) Instantaneous reflection at $x = 0$.
2. The Wiener process on a finite interval $[a, b]$.
   (a) Reflection at both endpoints (doubly reflected Wiener process).
   (b) Absorption at both endpoints.
   (c) Mixed boundaries.

In the following we shall briefly discuss the numerical realization of stochastic trajectories. The analytical properties of the Wiener process help us to calculate the increment $\Delta W(t)$ numerically from standard normally distributed random numbers $Z \sim N(0, 1)$ via

$$\Delta W(t) = Z \sqrt{\Delta t}. \tag{5.153}$$

We would like to mention three different algorithms used to obtain normally distributed random numbers $Z$. All of them generate $Z \sim N(0, 1)$ from uniformly distributed random numbers $U_i \in (0, 1)$. The transformation by the Polar method has the following steps:
1. Generate two uniformly distributed random numbers $U_1$ and $U_2$.
2. Define $V_i = 2U_i - 1$.
3. Check the condition $W = V_1^2 + V_2^2 < 1$.
4. If 'yes' create $Z = V_1 \sqrt{-2 \ln(W)/W}$.
5. If 'no' generate new random numbers (go to 1) and check this condition again.

For the same realization the Box–Muller method proposes the necessary transformation after the generation of $U_1, U_2 \in (0, 1)$ by one step only

$$Z = \sqrt{-2 \ln U_1}\, \cos(2\pi U_2). \tag{5.154}$$

The difference between both described methods is insignificant and connected only with the simulation time. The third strategy used to calculate the variable $Z$ is based on theoretical explanations. It is well-known that the uniform distribution in the interval $[0, 1]$

$$p_{\mathrm{uniform}}(x) = \begin{cases} 0 & : \ x < 0 \\ 1 & : \ 0 < x < 1 \\ 0 & : \ x > 1 \end{cases} \tag{5.155}$$
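has the mean value

$$\mu = \int_{-\infty}^{+\infty} x\, p_{\mathrm{uniform}}(x)\,dx = \int_0^1 x\,dx = \frac{1}{2} \tag{5.156}$$

and the variance

$$\sigma^2 = \int_{-\infty}^{+\infty} (x - \mu)^2\, p_{\mathrm{uniform}}(x)\,dx = \int_0^1 \left(x - \frac{1}{2}\right)^2 dx = \frac{1}{12}. \tag{5.157}$$

Therefore, the new variable

$$Z = \sum_{i=1}^{12} \left(U_i - \frac{1}{2}\right) \tag{5.158}$$

is a sum of twelve independent terms, each with zero mean and variance 1/12, and by the central limit theorem it has to be approximately normally distributed with parameters

$$\langle Z\rangle = 0 \quad \text{and} \quad \langle Z^2\rangle - \langle Z\rangle^2 = 1. \tag{5.159}$$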
This method does not require the calculation of any function such as logarithm, square root, or cosine. However, it is necessary to generate twelve uniformly distributed random numbers to calculate $Z$ for one time step $[t, t + \Delta t]$. The Box–Muller and Polar methods need only two random numbers for this (in the Polar method, two per attempt). As an example, ten numerically simulated stochastic trajectories of the Wiener process starting at $x_0 = 2$ are shown in Figure 5.8. Here we show also the mean value $\langle x(t)\rangle$ and the variance $\langle x(t)^2\rangle - \langle x(t)\rangle^2$, approximately evaluated from the ensemble of ten trajectories, and compare them with the theoretical results (5.137) and (5.138).
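The three transformations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the use of `random.Random` as the source of uniform numbers, and the helper `sample_moments` are choices made here, not part of the original text.

```python
import math
import random


def normal_polar(rng):
    """Polar method: reject points outside the unit disc, then transform."""
    while True:
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        w = v1 * v1 + v2 * v2
        if 0.0 < w < 1.0:  # w = 0 is excluded to avoid log(0)
            return v1 * math.sqrt(-2.0 * math.log(w) / w)


def normal_box_muller(rng):
    """Box-Muller transformation (5.154)."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is finite
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def normal_sum_of_twelve(rng):
    """Approximate normal variable (5.158): sum of twelve (U_i - 1/2)."""
    return sum(rng.random() for _ in range(12)) - 6.0


def sample_moments(generator, n, seed=0):
    """Sample mean and variance of n numbers drawn with the given generator."""
    rng = random.Random(seed)
    zs = [generator(rng) for _ in range(n)]
    mean = sum(zs) / n
    var = sum((z - mean) ** 2 for z in zs) / n
    return mean, var


for gen in (normal_polar, normal_box_muller, normal_sum_of_twelve):
    print(gen.__name__, sample_moments(gen, 20000))
```

For all three generators the sample mean should be close to 0 and the sample variance close to 1, in line with (5.159). Note that the sum-of-twelve variable is bounded by ±6, so its tails are only approximately Gaussian.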
Figure 5.8 Sample of ten stochastic trajectories starting at $x_0 = 2$ (dotted zig-zag lines), the mean value, and the variance of the Wiener process depending on time $t$ (vertical axis $x(t)$, horizontal axis $t$). The theoretical mean value is depicted by the dashed line, and the variance by the solid straight line. The corresponding numerical estimates are shown as bold curves.
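A numerical experiment of the kind shown in Figure 5.8 can be sketched as below. The code builds trajectories from the increments (5.153) and estimates the ensemble mean and variance at a chosen time step. The function names, the seed, and the use of `random.gauss` as the source of $Z \sim N(0, 1)$ are assumptions made for this illustration.

```python
import math
import random


def simulate_wiener_paths(n_paths, n_steps, dt, x0=0.0, seed=1):
    """Return n_paths trajectories x(0), x(dt), ..., x(n_steps * dt)."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    paths = []
    for _ in range(n_paths):
        x = x0
        path = [x]
        for _ in range(n_steps):
            x += rng.gauss(0.0, 1.0) * sqrt_dt  # increment (5.153)
            path.append(x)
        paths.append(path)
    return paths


def ensemble_statistics(paths, step):
    """Ensemble mean and variance of x at the given time step index."""
    values = [path[step] for path in paths]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


# Ten trajectories as in Figure 5.8; estimates from so few paths are rough.
paths = simulate_wiener_paths(n_paths=10, n_steps=100, dt=0.1, x0=2.0)
for step in (0, 50, 100):
    print(step * 0.1, ensemble_statistics(paths, step))
```

With only ten trajectories the estimates fluctuate strongly; increasing `n_paths` brings the mean closer to $x_0$ and the variance closer to $t$.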
5.8 Arithmetic Brownian Motion
The arithmetic Brownian motion is defined by a constant drift together with the white noise already known from the Wiener process

$$dx(t) = a\,dt + b\,dW(t) \tag{5.160}$$

together with the initial conditions $x(t = 0) = x_0$ and $W(t = 0) = 0$. Simple integration gives the solution

$$x(t) = x_0 + a t + b\,W(t). \tag{5.161}$$

The probability distribution function thus is Gaussian, that is:

$$p(x, t) = \frac{1}{\sqrt{2\pi b^2 t}} \exp\left(-\frac{(x - x_0 - at)^2}{2 b^2 t}\right). \tag{5.162}$$

It is a solution of the Fokker–Planck equation (5.124) with $\eta = 0$

$$\frac{\partial}{\partial t} p(x, t) = -a \frac{\partial}{\partial x} p(x, t) + \frac{b^2}{2} \frac{\partial^2}{\partial x^2} p(x, t). \tag{5.163}$$

This equation will be discussed in some detail in Chapters 6 and 7. From (5.162) we directly obtain the first two moments

$$\langle x(t)\rangle = x_0 + at \tag{5.164}$$

$$\langle x(t)^2\rangle = \langle x(t)\rangle^2 + b^2 t. \tag{5.165}$$
An example of numerical simulation including ten stochastic trajectories with evaluation of the mean and variance is illustrated in Figure 5.9.
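That the Gaussian (5.162) indeed solves the Fokker–Planck equation (5.163) can be checked numerically. The sketch below evaluates the residual of (5.163) with central finite differences; the default parameters are those quoted for Figure 5.9 ($x_0 = 5$, $a = 1$, $b = 2$), while the function names and the step size h are assumptions of this illustration.

```python
import math


def abm_density(x, t, x0=5.0, a=1.0, b=2.0):
    """Gaussian density (5.162) of the arithmetic Brownian motion."""
    var = b * b * t
    return math.exp(-(x - x0 - a * t) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fokker_planck_residual(x, t, x0=5.0, a=1.0, b=2.0, h=1e-3):
    """dp/dt + a dp/dx - (b^2/2) d2p/dx2, approximated by central differences."""
    def p(xx, tt):
        return abm_density(xx, tt, x0, a, b)

    dp_dt = (p(x, t + h) - p(x, t - h)) / (2.0 * h)
    dp_dx = (p(x + h, t) - p(x - h, t)) / (2.0 * h)
    d2p_dx2 = (p(x + h, t) - 2.0 * p(x, t) + p(x - h, t)) / (h * h)
    return dp_dt + a * dp_dx - 0.5 * b * b * d2p_dx2


for x, t in ((5.0, 1.0), (7.0, 1.0), (3.0, 2.0), (10.0, 4.0)):
    # The residual should vanish up to the discretization error.
    print(x, t, fokker_planck_residual(x, t))
```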
5.9 Geometric Brownian Motion
Figure 5.9 Sample of ten stochastic trajectories starting at $x_0 = 5$ (dotted random zig-zag lines), the mean value, and the variance of the arithmetic Brownian motion with parameters $a = 1$ and $b = 2$ depending on time $t$ (vertical axis $x(t)$, horizontal axis $t$). The theoretical mean value is depicted by a dashed line, and the variance by the solid straight line. The corresponding numerical estimates are shown as solid curves.

The stochastic differential equation with linear drift and a multiplicative noise term is called geometric Brownian motion. It has wide applicability in financial modeling and is given in Ito notation by

$$dx(t) = a\,x(t)\,dt + b\,x(t)\,dW(t) \tag{5.166}$$

with typical initial conditions $x(t = 0) = x_0 > 0$ and $W(t = 0) = 0$. It is a special case of the Ito-SDE (5.126) with $a(x) = ax$ and $b(x) = bx$. In the following we will use the transformation

$$y(x) = \ln\frac{x}{x_0}, \tag{5.167}$$

where $x_0 = x(t = 0)$. According to (5.132) we have

$$dy(t) = \tilde{a}(y)\,dt + \tilde{b}(y)\,dW \tag{5.168}$$
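with

$$\tilde{a}(y) = \frac{dy}{dx}\,a(x) + \frac{1}{2}\frac{d^2 y}{dx^2}\,b(x)^2 \tag{5.169}$$

$$\tilde{b}(y) = \frac{dy}{dx}\,b(x). \tag{5.170}$$

Here

$$y' \equiv \frac{dy}{dx} = \frac{x_0}{x}\,\frac{1}{x_0} = \frac{1}{x}; \qquad y'' = -\frac{1}{x^2},$$

and hence we have $\tilde{a}(y) = a - b^2/2$ and $\tilde{b}(y) = b$. Consequently, the stochastic differential equation for the transformed variable $y(t)$ reads

$$dy(t) = d\ln x(t) = \left(a - \frac{b^2}{2}\right) dt + b\,dW(t). \tag{5.171}$$

Integrating both sides, we obtain the solution

$$x(t) = x_0 \exp\left[\left(a - \frac{b^2}{2}\right) t + b\,W(t)\right]. \tag{5.172}$$

The probability density for the variable $y(t)$ is given by the Gaussian distribution

$$\tilde{p}(y, t) = \frac{1}{\sqrt{2\pi b^2 t}} \exp\left(-\frac{\left[y - (a - b^2/2)\,t\right]^2}{2 b^2 t}\right). \tag{5.173}$$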
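The probability distribution $p(x, t)$ for the original variable is easily calculated according to

$$p(x, t) = \tilde{p}(y, t)\,\frac{dy}{dx} = \frac{1}{x}\,\tilde{p}(\ln[x/x_0], t). \tag{5.174}$$

It yields the log-normal distribution

$$p(x, t) = \frac{1}{x}\,\frac{1}{\sqrt{2\pi b^2 t}} \exp\left(-\frac{\left[\ln[x/x_0] - (a - b^2/2)\,t\right]^2}{2 b^2 t}\right). \tag{5.175}$$

This distribution is a solution of the following Fokker–Planck equation

$$\frac{\partial}{\partial t} p(x, t) = -\frac{\partial}{\partial x}\left[a x\, p(x, t)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[b^2 x^2\, p(x, t)\right] = \left(b^2 - a\right) p + \left(2b^2 - a\right) x\,\frac{\partial p}{\partial x} + \frac{1}{2}(bx)^2\,\frac{\partial^2 p}{\partial x^2}. \tag{5.176}$$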
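The mean value (first moment) of (5.175) is calculated as follows

$$\begin{aligned}
\langle x(t)\rangle &= \int_0^\infty x\,p(x, t)\,dx = \int_0^\infty x\,\tilde{p}(y, t)\,\frac{dy}{dx}\,dx = x_0 \int_{-\infty}^{\infty} e^{y}\,\tilde{p}(y, t)\,dy \\
&= \frac{x_0}{\sqrt{2\pi b^2 t}} \int_{-\infty}^{\infty} dy\, \exp\left(y - \frac{\left[y - (a - b^2/2)\,t\right]^2}{2 b^2 t}\right) \\
&= \frac{x_0\,e^{at}}{\sqrt{2\pi b^2 t}} \int_{-\infty}^{\infty} dy\, \exp\left(-\frac{\left[y - (a + b^2/2)\,t\right]^2}{2 b^2 t}\right) \\
&= x_0\,e^{at}\,\frac{1}{\sqrt{2\pi b^2 t}} \int_{-\infty}^{\infty} \exp\left(-\frac{z^2}{2 b^2 t}\right) dz = x_0\,e^{at}.
\end{aligned} \tag{5.177}$$

It increases exponentially

$$\langle x(t)\rangle = x_0\,e^{at}. \tag{5.178}$$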
The mean square value (second moment) is calculated in a similar way

$$\begin{aligned}
\langle x(t)^2\rangle &= \int_0^\infty x^2\,p(x, t)\,dx = \int_0^\infty x^2\,\tilde{p}(y, t)\,\frac{dy}{dx}\,dx = x_0^2 \int_{-\infty}^{\infty} e^{2y}\,\tilde{p}(y, t)\,dy \\
&= \frac{x_0^2}{\sqrt{2\pi b^2 t}} \int_{-\infty}^{\infty} dy\, \exp\left(2y - \frac{\left[y - (a - b^2/2)\,t\right]^2}{2 b^2 t}\right) \\
&= \frac{x_0^2\,e^{(2a + b^2)t}}{\sqrt{2\pi b^2 t}} \int_{-\infty}^{\infty} dy\, \exp\left(-\frac{\left[y - (a + 3b^2/2)\,t\right]^2}{2 b^2 t}\right) \\
&= \frac{x_0^2\,e^{(2a + b^2)t}}{\sqrt{2\pi b^2 t}} \int_{-\infty}^{\infty} \exp\left(-\frac{z^2}{2 b^2 t}\right) dz = x_0^2\,e^{(2a + b^2)t}.
\end{aligned} \tag{5.179}$$

Thus we have

$$\langle x(t)^2\rangle = x_0^2\,e^{2(a + b^2/2)\,t} \tag{5.180}$$

giving the variance

$$\langle x(t)^2\rangle - \langle x(t)\rangle^2 = x_0^2\,e^{2at}\left(e^{b^2 t} - 1\right). \tag{5.181}$$

Figure 5.10 Sample of ten stochastic trajectories starting at $x_0 = 1$ (dotted random zig-zag lines), the mean value, and the variance of the geometric Brownian motion with parameters $a = 1$ and $b = 0.5$ depending on time $t$ (vertical axis $x(t)$, horizontal axis $t$). The theoretical mean value is depicted by a dashed line, and the variance by the solid straight line. The corresponding numerical estimates are shown as solid curves.
Similarly, as in the case of the arithmetic Brownian motion, we have shown a numerical realization of ten stochastic trajectories with the evaluation of the mean and variance in Figure 5.10. In this case, at $a = 1 > 0$, typical stochastic trajectories show exponential growth in time. This agrees with the formulas (5.178) and (5.181).
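The moments (5.178) and (5.181) can be compared with a Monte Carlo estimate. The sketch below samples the exact solution (5.172) at a fixed time, using $W(t) = \sqrt{t}\,Z$ with $Z \sim N(0, 1)$, and uses the parameters of Figure 5.10 ($x_0 = 1$, $a = 1$, $b = 0.5$) as defaults. The function names, sample size, and seed are assumptions of this illustration.

```python
import math
import random


def sample_gbm(t, x0=1.0, a=1.0, b=0.5, n_samples=50000, seed=7):
    """Draw x(t) from the exact solution (5.172) with W(t) = sqrt(t) * Z."""
    rng = random.Random(seed)
    drift = (a - 0.5 * b * b) * t
    scale = b * math.sqrt(t)
    return [x0 * math.exp(drift + scale * rng.gauss(0.0, 1.0))
            for _ in range(n_samples)]


def gbm_mean(t, x0=1.0, a=1.0):
    """Theoretical mean value (5.178)."""
    return x0 * math.exp(a * t)


def gbm_variance(t, x0=1.0, a=1.0, b=0.5):
    """Theoretical variance (5.181)."""
    return x0 ** 2 * math.exp(2.0 * a * t) * (math.exp(b * b * t) - 1.0)


xs = sample_gbm(1.0)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print("mean:", mean, "theory:", gbm_mean(1.0))
print("variance:", var, "theory:", gbm_variance(1.0))
```

Sampling the exact solution avoids any time-discretization error; a step-by-step integration of (5.166) would be needed only if whole trajectories are of interest.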
5.10 Exercises
E 5.1 Brownian particle
Verify that the function $v(t)$, given by (5.92), is the solution of the Langevin equation (5.87) for the velocity of a Brownian particle; first, by performing the integration in (5.87); and second, by inserting (5.92) into (5.87).

E 5.2 Noisy harmonic oscillator
Consider an oscillator with random frequency modulation. The position $x$ and velocity $v$ of the oscillator are described by the complex variable $z = x + iv$. This obeys the Langevin equation of motion [160]

$$\frac{dz}{dt} = i\,(\omega + \varepsilon\,\xi(t))\,z, \tag{5.182}$$

where $\xi(t)$ is the Gaussian white noise with amplitude given by the parameter $\varepsilon$. Find the analytical solution in the absence of noise at $\varepsilon = 0$. Analyze the behavior of the oscillator depending on the noise amplitude and compare it with the solution without noise.

E 5.3 Geometric Brownian motion
Study the geometric Brownian motion (5.166) as the special case of an Ito process in more detail. Fix the initial condition $x(t = 0) = x_0 > 0$, but change the control parameters $a \ge 0$ and $b \ge 0$ to distinguish three cases: (i) $a < b^2/2$; (ii) $a = b^2/2$; (iii) $a > b^2/2$. Plot the analytical solution, given by the probability density distribution (5.175), for the three cases at different times and compare with your simulation results based on the numerical realizations of a sample of stochastic trajectories, see Figure 5.10.

E 5.4 Financial market
Create a project about econophysics. Study important models for buyers and sellers as market participants. Follow the spirit of 'Stochastic Processes. From Physics to Finance' by Wolfgang Paul and Jörg Baschnagel [180] to find out about modeling the financial market. Discuss the Black–Scholes equation mathematically and compare it with the geometric Brownian motion approach. Follow Appendix F of [180] to transform the Fokker–Planck equation (5.176) into an ordinary diffusion equation to finally obtain the result (5.175) using inverse transformations.
Part III Applications
6 One-Dimensional Diffusion
6.1 Random Walk on a Line and Diffusion: Main Results
In an earlier chapter we have discussed in some detail the Brownian motion of a particle as a stochastic process. Another closely related problem is the random walk. In distinction to the Brownian motion, where the randomness appears as a continuous Wiener process, the random walk proceeds by discrete steps. It is described by the diffusion equation in the continuum limit. As will be seen from the following examples, the rules that control the random walk are very simple. However, as often occurs in physical models, the consequences of simple rules are far from elementary.

Random walks are interesting in themselves, as they provide a basis for the understanding of a wide range of phenomena and require the use of many mathematical techniques to solve the related problems [180, 195]. Random walks introduce us to the concept of scale invariance (as they look the same on all scales) and universality. The latter means that the general features of the statistical behavior are independent of the microscopic details.

The concept of the random walk, also called drunkard's walk, was introduced into science by Karl Pearson in a letter to Nature in 1905 [180]:

"A man starts from a point 0 and walks l yards in a straight line: he then turns through any angle whatever and walks another l yards in a straight line. He repeats this process n times. I require the probability that after these n stretches he is at a distance between r and r + δr from the starting point 0."

The drunkard takes a series of steps of equal length away from the last point but each at a random angle. The solution to this problem was provided in the same volume of Nature by Lord Rayleigh.

The random walk on a line is much simpler. The positions are spaced regularly along a line. The walker has two possibilities: either one step to the right (+1) with probability $p$ or one step to the left (−1) with probability $q = 1 - p$. The symmetric case (pure diffusion) means $p = q = 1/2$.
Table 6.1 Random walk experiment with m = 10 stochastic realizations (j = 1, 2, . . . , 10) each consisting of n = 10 steps (i = 1, 2, . . . , 10). The tabulated numbers are the series of displacements $\Delta_i = \pm 1$, the position $s = \sum_i \Delta_i$ after n = 10 steps, and the squared deviation $(s - \langle s\rangle)^2$ for each realization of the random walk.

  j \ i     1    2    3    4    5    6    7    8    9   10      s    (s − ⟨s⟩)²
  ------------------------------------------------------------------------------
   1        1   −1    1   −1    1    1   −1    1    1    1      4        9
   2       −1   −1    1    1   −1    1    1    1   −1   −1      0        1
   3        1    1   −1    1   −1   −1    1    1    1    1      4        9
   4       −1    1    1    1   −1   −1   −1   −1   −1   −1     −4       25
   5        1    1   −1   −1   −1   −1   −1   −1    1   −1     −4       25
   6       −1    1   −1    1    1    1   −1    1    1   −1      2        1
   7       −1    1    1   −1   −1   −1    1    1    1    1      2        1
   8        1    1   −1   −1    1   −1   −1    1   −1    1      0        1
   9        1    1   −1   −1    1    1    1    1   −1    1      4        9
  10       −1    1    1    1   −1    1   −1    1   −1    1      2        1
After n steps, the position of the random walker is given by

$$s(n) = \sum_{i=1}^{n} \Delta_i \quad \text{with} \quad \Delta_i = \pm 1. \tag{6.1}$$
A series of fluctuating values s(n) is obtained when the random walk is repeated many times. Interesting quantities are the averages over an ensemble of m different realizations. As an example, the results of m = 10 such random walk realizations, each consisting of n = 10 steps, are collected in Table 6.1. For an infinitely large ensemble m → ∞, the probability theory allows us to calculate the probability distribution for s(n), as well as the mean values. For the symmetric case p = q = 1/2 we have zero mean value
$$\langle s(n)\rangle = 0, \tag{6.2}$$

since $s(n)$ and $-s(n)$ are equally likely. The root-mean-square deviation

$$\sigma(n) = \sqrt{\langle s(n)^2\rangle - \langle s(n)\rangle^2} = \sqrt{n} \tag{6.3}$$

characterizes the average deviation amplitude from the mean value.
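The sample statistics of the ten realizations in Table 6.1 can be recomputed from the tabulated final positions s. The short Python sketch below uses only the s column of the table; the variable names are illustrative.

```python
import math

# Final positions s after n = 10 steps for realizations j = 1, ..., 10 (Table 6.1).
s_values = [4, 0, 4, -4, -4, 2, 2, 0, 4, 2]
m = len(s_values)
n = 10

mean_s = sum(s_values) / m                                # sample mean <s(10)>
squared_deviations = [(s - mean_s) ** 2 for s in s_values]
variance = sum(squared_deviations) / m                    # mean squared deviation
sigma = math.sqrt(variance)                               # root-mean-square deviation

print("sample mean:", mean_s)
print("sample sigma:", sigma)
print("theoretical sigma (6.3):", math.sqrt(n))
print("expected error of the mean:", math.sqrt(n) / math.sqrt(m))
```

The squared deviations produced here should coincide with the last column of Table 6.1.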
In the experiment reflected in Table 6.1 we have $\langle s(10)\rangle = 1$ and $\sigma(10) = \sqrt{8.2} \approx 2.86$. Taking into account that our ensemble consists only of m = 10 realizations, the agreement with the theoretical values $\langle s(10)\rangle = 0$ and $\sigma(10) = \sqrt{10} \approx 3.16$ is good. In particular, the expected error for the $\langle s(n)\rangle$ value is about $\sigma(n)/\sqrt{m} = 1$, as in our experiment. However, ten realizations are still insufficient to evaluate the probability distribution $P(s, n)$ reasonably well, where $P(s, n)$ is the probability of the deviation $s$ after $n$ steps. For this we need more realizations. For illustration, the probability distributions estimated from 10 and 100 realizations (crosses and circles, respectively) are shown in Figure 6.1 and compared with the theoretical result (dotted curve).

Figure 6.1 The probability distribution $P(s, n)$ (vertical axis) for the position $s$ (horizontal axis) after n = 10 steps of a symmetric random walk (dotted curve) in comparison with the empirical results for 10 (crosses) and 100 (circles) realizations.

The well known demonstration experiment for the random walk on a line is provided by the Galton board. Here balls are dropped through a triangular array of nails. Every time a ball hits a nail it has a probability $q$ of falling to the left of the nail and a probability $p$ of falling to the right of the nail. For the usual symmetrical Galton board we have $p = q = 1/2$. The theoretical results given by (6.2)–(6.3), as well as the probability distribution over the positions of the random walker, follow easily from the properties of the Markov chain describing the process. Namely, the probability $P(m, n + 1)$ that the walker is at position $m$ after $n + 1$ steps is given by the set of probabilities $P(m, n)$ after $n$ steps in accordance with the master equation

$$P(m, n + 1) = p\,P(m - 1, n) + q\,P(m + 1, n). \tag{6.4}$$
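The solution of (6.4) is the binomial distribution

$$P(m, n) = \frac{n!}{\left[(n + m)/2\right]!\,\left[(n - m)/2\right]!}\; p^{(n+m)/2}\, q^{(n-m)/2}. \tag{6.5}$$

The first moment of this probability distribution is

$$\langle m(n)\rangle = \sum_{m=-n}^{n} m\,P(m, n) = 2n\left(p - \frac{1}{2}\right) \tag{6.6}$$

and the second moment is

$$\langle m^2(n)\rangle = \sum_{m=-n}^{n} m^2\,P(m, n) = 4npq + 4n^2\left(p - \frac{1}{2}\right)^2. \tag{6.7}$$

Hence, the root-mean-square is given by

$$\sigma(n) = \sqrt{\langle (m - \langle m\rangle)^2\rangle} = \sqrt{\langle m^2\rangle - \langle m\rangle^2} = \sqrt{4npq}, \tag{6.8}$$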
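The statement that the binomial distribution (6.5) solves the master equation (6.4) can be verified by direct iteration. The following Python sketch starts from $P(0, 0) = 1$, applies (6.4) step by step, and compares the result and its first two moments with (6.5)–(6.7). Function names and the chosen values n = 10, p = 0.3 are illustrative assumptions.

```python
from math import comb


def evolve_master_equation(n_steps, p):
    """Iterate (6.4) starting from P(0, 0) = 1; returns {m: P(m, n_steps)}."""
    q = 1.0 - p
    prob = {0: 1.0}
    for _ in range(n_steps):
        new_prob = {}
        for m, weight in prob.items():
            # Probability flows from m to m + 1 (weight p) and to m - 1 (weight q),
            # which is equivalent to P(m, n+1) = p P(m-1, n) + q P(m+1, n).
            new_prob[m + 1] = new_prob.get(m + 1, 0.0) + p * weight
            new_prob[m - 1] = new_prob.get(m - 1, 0.0) + q * weight
        prob = new_prob
    return prob


def binomial_solution(m, n, p):
    """Closed-form solution (6.5); zero if m is unreachable after n steps."""
    if abs(m) > n or (n + m) % 2 != 0:
        return 0.0
    k = (n + m) // 2
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


n, p = 10, 0.3
prob = evolve_master_equation(n, p)
first_moment = sum(m * w for m, w in prob.items())
second_moment = sum(m * m * w for m, w in prob.items())
print("first moment:", first_moment, "theory (6.6):", 2 * n * (p - 0.5))
print("second moment:", second_moment,
      "theory (6.7):", 4 * n * p * (1 - p) + 4 * n ** 2 * (p - 0.5) ** 2)
```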
and the relative width (error)

$$\frac{\sigma}{\langle m\rangle} = \frac{\sqrt{4np(1 - p)}}{2n(p - 1/2)} = \sqrt{\frac{p(1 - p)}{(p - 1/2)^2}}\;\frac{1}{\sqrt{n}} \sim n^{-1/2} \tag{6.9}$$
tends to zero when n goes to infinity. The particular symmetric case considered before is recovered at $p = q = 1/2$, taking into account that $s(n) \equiv m$. Up to this point we have considered the discrete random walk on an infinite line, in other words, the case with natural boundary conditions. In general, one can consider the random walk in a finite interval with different boundary conditions on the left and right. These can be reflecting or absorbing. In the first case the random walker is not allowed to cross the boundary, whereas in the second case the walker is absorbed when reaching it (i.e. the random walk is then terminated). Random walks on a multidimensional hypercubic lattice with periodic boundary conditions are treated in [180] when studying the Pólya problem of recurrence. This is the problem of finding the probability that a random walker will ever return to its starting point. In 1921 George Pólya showed that walks occurring in one or two dimensions return to their starting point with absolute certainty (recurrent walks), but in higher dimensions the walker has a nonzero probability of never revisiting its starting point (transient walks) [195]. The discrete random walk serves as a toy model for diffusive motion. In a certain continuum limit of infinitely small steps, the probability distribution over coordinates of the random walker is described by the diffusion (or in a more general case drift–diffusion) equation with corresponding boundary conditions. In the following section we will show how this scheme works in the simplest case of the symmetric random walk with natural boundary conditions.
6.2 A Drunken Sailor as Random Walker
To provide an illustrative example of a simple stochastic process with the application of the continuum approximation we refer to the problem of the well known one-dimensional random walk with equal probability of making a step to the right (forward) or to the left (backward). The experimental equipment used to get the probabilities $P(m, n)$ of finding the particle at position $m$ after $n$ steps, when starting at $m = 0$, is the Galton board. After a series of $n$ steps of equal length the particle could be found at any of the following points

$$m = \{-n, -n + 1, \ldots, -1, 0, +1, \ldots, n - 1, n\}. \tag{6.10}$$

Position $m$ consists of $k$ steps in one direction (success) and $n - k$ in the opposite direction (failure)

$$m = k - (n - k) = 2k - n. \tag{6.11}$$
For the $k$ successes we get

$$k = \frac{1}{2}\,(n + m). \tag{6.12}$$

Starting with the well known binomial distribution for discrete probabilities

$$P(m, n) \equiv B(k, n) = \binom{n}{k}\, p^k (1 - p)^{n-k} \tag{6.13}$$

we reduce this to the symmetric case ($p = 1/2$)

$$P(m, n) = \frac{n!}{k!\,(n - k)!} \left(\frac{1}{2}\right)^n = \frac{n!}{\left[(n + m)/2\right]!\,\left[(n - m)/2\right]!} \left(\frac{1}{2}\right)^n. \tag{6.14}$$
Further on we introduce the (still discrete) coordinate $x_m = a\,m$ and time $t_n = \tau\,n$, where $a$ is the hopping distance (a length unit) and $\tau$ is the time step (a time unit) and rewrite the binomial distribution (6.14) as

$$P(x_m, t_n) = \binom{t_n/\tau}{t_n/(2\tau) + x_m/(2a)} \left(\frac{1}{2}\right)^{t_n/\tau} = \frac{(t_n/\tau)!}{\left[t_n/(2\tau) + x_m/(2a)\right]!\,\left[t_n/(2\tau) - x_m/(2a)\right]!} \left(\frac{1}{2}\right)^{t_n/\tau}. \tag{6.15}$$

We are interested in the limit where the total number of steps $n$ as well as the number of steps in one direction $k$ is large. In this case we can apply the well known Stirling formula

$$n! \simeq e^{-n}\, n^n \sqrt{2\pi n} \tag{6.16}$$

or

$$\ln n! \simeq n \ln n - n + \frac{1}{2} \ln(2\pi n) \tag{6.17}$$

to evaluate the factorials in (6.14). This yields

$$P(m, n) \simeq \exp\left[n \ln n - k \ln k - (n - k) \ln(n - k) - n \ln 2\right] \times n^{1/2} \left[2\pi k (n - k)\right]^{-1/2}. \tag{6.18}$$

According to our substitutions $x_m = a\,m$ and $t_n = \tau\,n$ made in (6.15) we have

$$k = \frac{t_n}{2\tau}\,(1 + \delta), \tag{6.19}$$

$$n - k = \frac{t_n}{2\tau}\,(1 - \delta), \tag{6.20}$$

where

$$\delta = \frac{m}{n} = \frac{x_m}{a}\,\frac{\tau}{t_n}. \tag{6.21}$$
This is a property of the symmetrical random walk, following on from (6.14), that $\delta$ is a small quantity for relevant stochastic realizations at $n \to \infty$, since typically the deviations $m$ from the origin $m = 0$ are of order $\sqrt{n}$. According to this, we can insert (6.19)–(6.21) into (6.18) and make an expansion in the Taylor series of $\delta$. Retaining only the terms up to the second order in $\delta$ we obtain

$$P(m, n) \equiv P(x_m, t_n) \simeq 2\,(2\pi t_n)^{-1/2}\, \tau^{1/2} \left(1 - \delta^2\right)^{-1/2} \exp\left(-\frac{x_m^2\, \tau}{2 a^2 t_n}\right). \tag{6.22}$$

The term $\delta^2$ in the prefactor is vanishingly small and can be omitted. Now we introduce a new parameter

$$D = \frac{1}{2}\,\frac{a^2}{\tau}, \tag{6.23}$$

called the diffusion coefficient, by considering the continuum limit where length unit $a$ and time unit $\tau$ both tend to zero in such a way that $D$ remains constant. In this case a physically interesting quantity is the probability density $p(x, t)$, that is, the probability of finding a particle within $[x, x + dx]$ divided by the interval length $dx$. Taking into account that the positions $x_m$ which can be reached after $n$ steps form a discrete grid with step size $2a$ [cf. (6.11)], any interval $dx$ includes $dx/(2a)$ points of this grid at $a \to 0$. Hence, for any small enough $dx$ the probability of finding a particle in this interval is $P(x_m, t_n)\,dx/(2a) = p(x, t)\,dx$ with $x = x_m$ and $t = t_n$, where the probability density is given by

$$p(x, t) = \frac{P(x_m, t_n)}{2a} \simeq (4\pi t)^{-1/2} \left(\frac{2\tau}{a^2}\right)^{1/2} \exp\left(-\frac{x^2 \tau}{2 a^2 t}\right). \tag{6.24}$$

Taking into account the definition (6.23), we finally obtain the Gaussian distribution

$$p(x, t) = \frac{1}{\sqrt{4\pi D t}} \exp\left(-\frac{x^2}{4Dt}\right) \tag{6.25}$$

in the considered limit where $a \to 0$ at a constant diffusion coefficient $D$.
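The quality of the continuum approximation can be judged by comparing the exact binomial probability (6.14), divided by the grid spacing 2a as in (6.24), with the Gaussian density (6.25). The sketch below does this for n = 1000 steps with the illustrative choice a = τ = 1, so that D = 1/2 by (6.23). Function names and parameter values are assumptions made here.

```python
from math import comb, exp, pi, sqrt


def walk_probability(m, n):
    """Exact probability (6.14) of the symmetric walk; zero for unreachable m."""
    if abs(m) > n or (n + m) % 2 != 0:
        return 0.0
    return comb(n, (n + m) // 2) / 2 ** n


def density_from_walk(m, n, a=1.0):
    """Probability density P(x_m, t_n) / (2a), cf. (6.24)."""
    return walk_probability(m, n) / (2.0 * a)


def gaussian_density(x, t, D):
    """Continuum limit (6.25)."""
    return exp(-x * x / (4.0 * D * t)) / sqrt(4.0 * pi * D * t)


a, tau = 1.0, 1.0
D = 0.5 * a * a / tau  # diffusion coefficient (6.23)
n = 1000
for m in (0, 20, 40, 60):
    exact = density_from_walk(m, n, a)
    approx = gaussian_density(m * a, n * tau, D)
    print(m, exact, approx, abs(exact - approx) / exact)
```

The relative deviation is small near the origin and grows in the tails, where $\delta = m/n$ is no longer negligible.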
6.3 Diffusion with Natural Boundaries
The dynamics of the probability density $p(x, t)$ for a one-dimensional random walk (Wiener process, Brownian motion) is given by the one-dimensional diffusion equation

$$\frac{\partial p(x, t)}{\partial t} = D\,\frac{\partial^2 p(x, t)}{\partial x^2}. \tag{6.26}$$

This deals with a nonlocal problem in coordinate space. To obtain a certain solution, the diffusion equation (6.26) has to be completed by initial and boundary conditions. We consider the initial condition $p(x, t = 0) = \delta(x - x_0)$ given by the delta function (a sharp peak at $x = x_0$), which physically means that the random walk starts at $x = x_0$, as well as the natural boundary conditions $\lim_{x \to \pm\infty} p(x, t) = 0$. Since we have a closed system, the probability density function $p(x, t)$ obeys the normalization condition (or global probability conservation law)

$$\int_{-\infty}^{\infty} p(x, t)\,dx = 1, \tag{6.27}$$

as well as the continuity equation (local conservation)

$$\frac{\partial p(x, t)}{\partial t} + \frac{\partial}{\partial x}\, j_{\mathrm{diff}}(x, t) = 0. \tag{6.28}$$
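The diffusion equation follows from the continuity equation combined with Fick's first law for the diffusion flow:

Continuity equation

$$\frac{\partial p(x, t)}{\partial t} + \frac{\partial}{\partial x}\, j_{\mathrm{diff}}(x, t) = 0 \tag{6.29}$$

Fick's first law (diffusion flow)

$$j_{\mathrm{diff}}(x, t) = -D\,\frac{\partial p(x, t)}{\partial x}. \tag{6.30}$$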
By inserting (6.30) into (6.29) we arrive at the diffusion equation (6.26). The solution of the diffusion equation with the initial condition as a delta-peak at $x = x_0$ and natural boundary conditions is the Gaussian (normal) distribution

$$p(x, t) = \frac{1}{\sqrt{4\pi D t}} \exp\left(-\frac{(x - x_0)^2}{4Dt}\right), \tag{6.31}$$

which is an extension of (6.25) to an arbitrary starting point $x_0$ of the random walk. It is symmetric with respect to $x_0$. Figure 6.2 illustrates the behavior of $p(x, t)$ for different observation times.

Figure 6.2 The solution of the diffusion equation (6.26) showing times t = 0.05 s, 1.5 s, 3 s, 5 s (from left to right). Three-dimensional view of the Gaussian distribution $p(x, t)$ (6.31) in m$^{-1}$ (plotted over $x$ in m and $t$ in s) with initial condition $x_0 = 0$ and diffusion constant $D = 1$ m$^2$ s$^{-1}$.

A generalization for an arbitrary initial distribution $p(x, t = 0) = p_0(x)$ is possible due to the superposition of probability densities created by different sources (initial distributions), which is a property of our diffusion equation due to its linearity. Hence, for $p_0(x) \equiv \int_{-\infty}^{\infty} p_0(y)\,\delta(x - y)\,dy$ the solution is an integral over the normal distributions corresponding to $p(x, t = 0) = \delta(x - y)$ weighted by the source intensity $p_0(y)$, so

$$p(x, t) = \frac{1}{\sqrt{4\pi D t}} \int_{-\infty}^{+\infty} p_0(y) \exp\left(-\frac{(x - y)^2}{4Dt}\right) dy. \tag{6.32}$$

For $p_0(x) = \delta(x - x_0)$ we recover the known Gaussian distribution (6.31). Now we show how the solution of the diffusion equation can be obtained by a one-dimensional Fourier transformation to $\tilde{p}(k, t)$ (transformation to the inverse space by a generating function) which is defined by

$$p(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{ikx}\,\tilde{p}(k, t)\,dk, \tag{6.33}$$

$$\tilde{p}(k, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-ikx}\,p(x, t)\,dx, \tag{6.34}$$
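where $k$ is the wave number. The left-hand side of the diffusion equation (6.26) is transformed as

$$\frac{\partial}{\partial t}\,p(x, t) = \frac{\partial}{\partial t}\,\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, e^{ikx}\,\tilde{p}(k, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, e^{ikx}\,\frac{\partial \tilde{p}(k, t)}{\partial t}, \tag{6.35}$$

whereas the right-hand side, starting with

$$\frac{\partial}{\partial x}\,p(x, t) = \frac{\partial}{\partial x}\,\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, e^{ikx}\,\tilde{p}(k, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (ik)\,e^{ikx}\,\tilde{p}(k, t)\,dk, \tag{6.36}$$

becomes

$$\frac{\partial^2}{\partial x^2}\,p(x, t) = \frac{\partial}{\partial x}\,\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (ik)\,e^{ikx}\,\tilde{p}(k, t)\,dk = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (ik)^2\,e^{ikx}\,\tilde{p}(k, t)\,dk = -\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} k^2\,e^{ikx}\,\tilde{p}(k, t)\,dk. \tag{6.37}$$

Hence the transformed equation reads

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, e^{ikx}\,\frac{\partial \tilde{p}(k, t)}{\partial t} = -\frac{D}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, k^2\,e^{ikx}\,\tilde{p}(k, t) \tag{6.38}$$

and the integrands must be equal

$$\frac{\partial \tilde{p}(k, t)}{\partial t} = -k^2 D\,\tilde{p}(k, t), \tag{6.39}$$

which leads to a local problem in the k-space. An elementary integration yields the solution

$$\tilde{p}(k, t) = \tilde{p}(k, t = 0)\,e^{-k^2 D t} \tag{6.40}$$

in the form of an exponentially decaying kth Fourier mode. The initial condition transforms as

$$\tilde{p}(k, t = 0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx}\,p(x, t = 0)\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx}\,\delta(x - x_0)\,dx = \frac{1}{\sqrt{2\pi}}\,e^{-ikx_0}, \tag{6.41}$$

so that the solution in the Fourier space (spectral representation) is

$$\tilde{p}(k, t) = \frac{1}{\sqrt{2\pi}}\,e^{-ikx_0}\,e^{-k^2 D t}. \tag{6.42}$$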
Now we make the inverse transformation to the coordinate space

$$\begin{aligned}
p(x, t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, e^{ikx}\, \underbrace{\frac{1}{\sqrt{2\pi}}\,e^{-ikx_0}\,e^{-k^2 D t}}_{\tilde{p}(k, t)} \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ik(x - x_0) - k^2 D t} \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, \exp\left\{-Dt\left[k^2 - \frac{ik(x - x_0)}{Dt}\right]\right\} \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, \exp\left\{-Dt\left[\left(k - \frac{i(x - x_0)}{2Dt}\right)^2 + \left(\frac{x - x_0}{2Dt}\right)^2\right]\right\}.
\end{aligned} \tag{6.43}$$

Further on we change the variable $z = k - i(x - x_0)/(2Dt)$; $dz = dk$. According to this we have to make the integration in the complex plane along the line $\mathrm{Im}\,z = c$, which is parallel to the real axis although shifted by $c = -(x - x_0)/(2Dt)$. The integral does not depend on $c$ and therefore we can shift the integration path to the real axis. This can be proven by considering a closed integration contour including the real axis, the line $\mathrm{Im}\,z = c$, and additional lines closing the contour at $\mathrm{Re}\,z = \pm\infty$. Since the integrand function is analytic inside the enclosed part of the complex plane, the integral over the whole contour is zero; the integrals over the connecting parts at infinity are also zero, and hence the integral over $\mathrm{Im}\,z = c$ from $\mathrm{Re}\,z = +\infty$ to $\mathrm{Re}\,z = -\infty$ is compensated by the integral over the real axis from $-\infty$ to $\infty$. The integral changes its sign when the integration direction is reversed and, therefore, both integrals are equal if taken from $-\infty$ to $+\infty$. Thus we have

$$p(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dz\, e^{-Dtz^2} \cdot e^{-Dt\,\frac{(x - x_0)^2}{4D^2 t^2}} = \frac{1}{2\pi}\,e^{-\frac{(x - x_0)^2}{4Dt}} \int_{-\infty}^{\infty} dz\, e^{-Dtz^2}. \tag{6.44}$$

By using the known formula

$$I = \int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx = \sqrt{\frac{\pi}{\alpha}} \tag{6.45}$$
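with $\alpha = Dt$, to calculate the last integral in (6.44), we again obtain the solution as a Gaussian distribution

$$p(x, t) = \frac{1}{\sqrt{4\pi D t}} \exp\left(-\frac{(x - x_0)^2}{4Dt}\right). \tag{6.46}$$

The Gaussian integral (6.45) (often also called the Poisson integral) can be calculated as follows. Consider the square of this integral

$$I^2 = \left(\int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx\right)^2 = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{-\alpha(x^2 + y^2)} \tag{6.47}$$

and then make a transformation to polar coordinates by

$$x = r\cos\varphi, \quad y = r\sin\varphi, \qquad r = \sqrt{x^2 + y^2}, \quad \varphi = \arctan(y/x) \tag{6.48}$$

$$dx\,dy = r\,dr\,d\varphi. \tag{6.49}$$

The factor $r$ in the latter equality represents the determinant of the transformation Jacobian

$$J = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial y}{\partial r} \\[2mm] \dfrac{\partial x}{\partial \varphi} & \dfrac{\partial y}{\partial \varphi} \end{vmatrix} = \begin{vmatrix} \cos\varphi & \sin\varphi \\ -r\sin\varphi & r\cos\varphi \end{vmatrix} = r. \tag{6.50}$$

Thus we have

$$I^2 = \int_0^{2\pi} d\varphi \int_0^{\infty} dr\; r\,e^{-\alpha r^2} = 2\pi \int_0^{\infty} r\,e^{-\alpha r^2}\,dr = \pi \int_0^{\infty} e^{-\alpha z}\,dz, \tag{6.51}$$

where $z = r^2$. By substituting

$$\alpha z = u; \qquad du = \alpha\,dz \tag{6.52}$$

the integral becomes

$$I^2 = \frac{\pi}{\alpha} \int_0^{\infty} e^{-u}\,du = \frac{\pi}{\alpha}\left[-e^{-u}\right]_0^{\infty} = \frac{\pi}{\alpha}, \tag{6.53}$$

and hence (cf. (6.45))

$$I = \sqrt{\frac{\pi}{\alpha}}. \tag{6.54}$$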
Now let us calculate the moments of the density distribution function $p(x, t)$ (6.31). The moment of nth order is given by

$$\langle x^n(t)\rangle = \int_{-\infty}^{\infty} dx\; x^n\,p(x, t). \tag{6.55}$$

The moment of zeroth order has to be one, as it represents the normalization integral

$$\langle x^0\rangle \equiv 1 = \int_{-\infty}^{\infty} p(x, t)\,dx. \tag{6.56}$$

To check it we insert the solution (6.31)

$$\langle x^0\rangle = \int_{-\infty}^{\infty} \frac{1}{\sqrt{4\pi D t}}\,e^{-\frac{(x - x_0)^2}{4Dt}}\,dx. \tag{6.57}$$

This integral is further transformed by substituting

$$x - x_0 = z; \qquad dx = dz \tag{6.58}$$

and denoting $a^2 = 1/(4Dt)$. It yields

$$\langle x^0\rangle = \frac{a}{\sqrt{\pi}}\; 2 \int_0^{\infty} e^{-a^2 z^2}\,dz = \frac{a}{\sqrt{\pi}}\; 2\,\frac{\sqrt{\pi}}{2a} = 1. \tag{6.59}$$

The first moment of the probability distribution function represents the mean value
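$$\langle x^1\rangle = \int_{-\infty}^{\infty} x\,p(x, t)\,dx. \tag{6.60}$$

In our case it has to be equal to $x_0$, since the distribution is symmetric with respect to this point. We check it by calculation

$$\langle x^1\rangle = \frac{1}{\sqrt{4\pi D t}} \int_{-\infty}^{\infty} x\,e^{-\frac{(x - x_0)^2}{4Dt}}\,dx = \frac{1}{\sqrt{4\pi D t}} \int_{-\infty}^{\infty} (x_0 + z)\,e^{-\frac{z^2}{4Dt}}\,dz \tag{6.61}$$

$$= x_0\,\underbrace{\frac{1}{\sqrt{4\pi D t}} \int_{-\infty}^{\infty} e^{-\frac{z^2}{4Dt}}\,dz}_{\langle x^0\rangle = 1} + \underbrace{\frac{1}{\sqrt{4\pi D t}} \int_{-\infty}^{\infty} z\,e^{-\frac{z^2}{4Dt}}\,dz}_{= 0,\ \text{symmetry around}\ z = x - x_0 = 0} = x_0 + 0 = x_0. \tag{6.62}$$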
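The mean value $\langle x\rangle = x_0$ does not change in time, it keeps the initial value. The second moment

$$\langle x^2\rangle = \int_{-\infty}^{\infty} x^2\,p(x, t)\,dx \tag{6.63}$$

is related to the standard deviation $\sigma$ via

$$\langle (x - \langle x\rangle)^2\rangle = \langle (x - x_0)^2\rangle \equiv \sigma^2. \tag{6.64}$$

To calculate the second moment we use the identity

$$\int_{-\infty}^{\infty} x^2\,e^{-\alpha x^2}\,dx = -\frac{d}{d\alpha} \int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx. \tag{6.65}$$

Denoting again $\alpha = 1/(4Dt)$, we have

$$\langle x^2\rangle = \int_{-\infty}^{\infty} x^2\,\frac{1}{\sqrt{4\pi D t}}\,e^{-\frac{(x - x_0)^2}{4Dt}}\,dx \tag{6.66}$$

$$= \frac{1}{\sqrt{4\pi D t}} \int_{-\infty}^{\infty} (y + x_0)^2\,e^{-\alpha y^2}\,dy \tag{6.67}$$

$$= \frac{1}{\sqrt{4\pi D t}} \int_{-\infty}^{\infty} \left(y^2 + 2x_0 y + x_0^2\right) e^{-\alpha y^2}\,dy \tag{6.68}$$

$$= x_0^2 + 0 + \frac{1}{\sqrt{4\pi D t}} \int_{-\infty}^{\infty} y^2\,e^{-\alpha y^2}\,dy \tag{6.69}$$

$$= x_0^2 + \frac{1}{\sqrt{4\pi D t}} \left(-\frac{d}{d\alpha} \int_{-\infty}^{\infty} e^{-\alpha y^2}\,dy\right). \tag{6.70}$$

In the second line we have substituted $x - x_0 = y$, $dx = dy$. Using (6.45) further calculation yields

$$\langle x^2\rangle = x_0^2 - \frac{1}{\sqrt{4\pi D t}}\,\frac{d}{d\alpha}\sqrt{\frac{\pi}{\alpha}} = x_0^2 - \frac{\sqrt{\pi}}{\sqrt{4\pi D t}} \left(-\frac{1}{2}\right) \alpha^{-3/2} \tag{6.71}$$

$$= x_0^2 + \frac{\sqrt{\pi}}{2\sqrt{4\pi D t}} \left(\frac{1}{4Dt}\right)^{-3/2} = x_0^2 + 2Dt \tag{6.72}$$

which finally gives

$$\sigma^2 \equiv \langle (x - \langle x\rangle)^2\rangle = 2Dt \tag{6.73}$$

$$\sigma = \sqrt{2Dt} \sim \sqrt{t}. \tag{6.74}$$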
The latter equality means that the width of the probability distribution increases with time as the square root of time t.
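The growth law $\sigma^2 = 2Dt$ can also be observed in a direct numerical solution of the diffusion equation (6.26). The sketch below uses a simple explicit finite-difference scheme (forward in time, centered in space), which is a technique chosen here for illustration and not taken from the text. It starts from a lattice approximation of the delta peak at $x_0 = 0$ and evaluates the zeroth moment, the mean, and the variance. Grid spacing, time step, and function names are assumptions; the scheme requires $D\,\Delta t/\Delta x^2 \le 1/2$ for stability.

```python
import math


def ftcs_step(p, r):
    """One explicit step of (6.26) with r = D * dt / dx**2; end points stay fixed."""
    new_p = p[:]
    for i in range(1, len(p) - 1):
        new_p[i] = p[i] + r * (p[i + 1] - 2.0 * p[i] + p[i - 1])
    return new_p


def solve_diffusion(D=1.0, dx=0.1, r=0.25, n_steps=200, half_width=200):
    """Evolve a lattice delta peak at x0 = 0; returns grid, density, elapsed time."""
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for r > 1/2")
    dt = r * dx * dx / D
    xs = [(i - half_width) * dx for i in range(2 * half_width + 1)]
    p = [0.0] * len(xs)
    p[half_width] = 1.0 / dx  # unit probability concentrated in one cell
    for _ in range(n_steps):
        p = ftcs_step(p, r)
    return xs, p, n_steps * dt


def moments(xs, p, dx):
    """Zeroth moment, mean value, and variance of the discretized density."""
    norm = sum(p) * dx
    mean = sum(x * w for x, w in zip(xs, p)) * dx / norm
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, p)) * dx / norm
    return norm, mean, var


D, dx = 1.0, 0.1
xs, p, t = solve_diffusion(D=D, dx=dx)
norm, mean, var = moments(xs, p, dx)
print("t =", t, "norm =", norm, "mean =", mean)
print("variance =", var, "theory 2Dt =", 2.0 * D * t)
print("peak =", max(p), "theory (6.31) =", 1.0 / math.sqrt(4.0 * math.pi * D * t))
```

With $r \le 1/2$ each update is itself a random walk step on the grid (stay with probability $1 - 2r$, hop left or right with probability $r$ each), which is why the variance of the discrete solution grows by $2D\,\Delta t$ per step as long as the boundaries are not reached.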
6.4 Diffusion in a Finite Interval with Mixed Boundaries
Here we consider an example of the initial and boundary value diffusion problem in a finite interval with one reflecting and one absorbing boundary. We calculate the breakdown probability density P(t, b), which is defined as the probability per time unit of reaching the absorbing boundary [82]. The problem is described by the following set of equations:
1. The equation of motion (dynamics)
$$\frac{\partial p(x,t)}{\partial t} = D \frac{\partial^2 p(x,t)}{\partial x^2}. \tag{6.75}$$
2. The initial condition (delta function)
$$p(x, t = 0) = \delta(x - x_0). \tag{6.76}$$
3. The reflecting boundary condition at x = a (left border)
$$\left. \frac{\partial p(x,t)}{\partial x} \right|_{x=a} = 0. \tag{6.77}$$
4. The absorbing boundary condition at x = b (right border)
$$p(x = b, t) = 0. \tag{6.78}$$
For convenience we make a transformation to a new variable y = x − a. The transformed equations read as follows:
1. The equation of motion (dynamics)
$$\frac{\partial p(y,t)}{\partial t} = D \frac{\partial^2 p(y,t)}{\partial y^2}. \tag{6.79}$$
2. The initial condition (delta function)
$$p(y, t = 0) = \delta(y - y_0). \tag{6.80}$$
3. The reflecting boundary condition at y = 0 (left border)
$$\left. \frac{\partial p(y,t)}{\partial y} \right|_{y=0} = 0. \tag{6.81}$$
4. The absorbing boundary condition at y = b − a (right border)
$$p(y = b - a, t) = 0. \tag{6.82}$$
To solve the problem, we first make a separation ansatz $p(y,t) = \chi(t) f(y)$, which yields
$$\frac{1}{\chi(t)} \frac{d\chi(t)}{dt} = D \frac{1}{f(y)} \frac{d^2 f(y)}{dy^2}. \tag{6.83}$$
Both sides should be equal to a constant, called $-\lambda$. Integration of the left-hand side gives an exponential decay function
$$\chi(t) = \chi_0 \exp(-\lambda t) \tag{6.84}$$
with $\chi(t = 0) = \chi_0 = 1$. Introducing the notion of wave number k given by
$$k^2 = \frac{\lambda}{D} \tag{6.85}$$
and integrating the right-hand side of (6.83) we obtain the wave equation
$$\frac{d^2 f(y)}{dy^2} + k^2 f(y) = 0. \tag{6.86}$$
Its general solution is
$$f(y) = A \sin(ky) + B \cos(ky). \tag{6.87}$$
This solution (6.87) contains three unknown parameters k (or λ), A, and B. The two (left and right) boundary conditions thus allow us to determine particular solutions of (6.86) up to unknown prefactors, which further can be uniquely determined by constructing a time-dependent solution which fulfills the initial condition.
The left boundary condition
$$\left. \frac{df(y)}{dy} \right|_{y=0} = 0 \tag{6.88}$$
gives
$$A k \cos(k \cdot 0) - B k \sin(k \cdot 0) = 0 \tag{6.89}$$
from which we get $A = 0$. The right boundary condition
$$f(y = b - a) = 0 \tag{6.90}$$
gives us
$$A \sin(k(b - a)) + B \cos(k(b - a)) = 0. \tag{6.91}$$
Taking into account that $A = 0$ and looking for a nontrivial solution with $B \neq 0$ we obtain the conditional equation
$$\cos(k(b - a)) = 0. \tag{6.92}$$
This yields discrete solutions for the wave numbers k and eigenvalues $\lambda = Dk^2$:
$$k_m = \frac{\pi}{b - a} \left( \frac{1}{2} + m \right), \tag{6.93}$$
$$\lambda_m = \frac{D\pi^2}{(b - a)^2} \left( \frac{1}{2} + m \right)^2 \tag{6.94}$$
with non-negative integer numbers m = 0, 1, 2, . . . (see Table 6.2). A particular time-dependent solution, which fulfills the boundary conditions and corresponds to the eigenvalue $\lambda_m$, is thus the eigenfunction $p_m(y,t)$ given by
$$p_m(y,t) = B_m e^{-\lambda_m t} \cos(k_m y). \tag{6.95}$$

Table 6.2 Wave numbers $k_m$ (in m⁻¹) and eigenvalues $\lambda_m$ (in s⁻¹) for m = 0, 1, . . . , 5 using the parameter set a = 0, b = 1 m, D = 1 m² s⁻¹.

m    k_m        λ_m
0    1.5708     2.4674
1    4.7124     22.2066
2    7.8539     61.6850
3    10.9956    120.9026
4    14.1372    199.8595
5    17.2787    298.5555
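The entries of Table 6.2 follow directly from the formulas for the wave numbers and eigenvalues. A short sketch, using the parameter set of the table (a = 0, b = 1 m, D = 1 m² s⁻¹); the function name is a hypothetical helper introduced here:

```python
import math

def wave_numbers_and_eigenvalues(a, b, D, count):
    """Return rows (m, k_m, lambda_m) with k_m = pi (m + 1/2) / (b - a)
    and lambda_m = D k_m**2 for m = 0, ..., count - 1."""
    rows = []
    for m in range(count):
        k_m = math.pi * (m + 0.5) / (b - a)
        rows.append((m, k_m, D * k_m ** 2))
    return rows

for m, k_m, lam_m in wave_numbers_and_eigenvalues(a=0.0, b=1.0, D=1.0, count=6):
    print(f"{m:d}  {k_m:8.4f}  {lam_m:9.4f}")
```

Because the eigenvalues grow quadratically with m, the higher modes decay quickly and only a few terms matter at late times.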
The complete solution of the problem is found as a superposition of these eigenfunctions
$$p(y,t) = \sum_{m=0}^{\infty} p_m(y,t) = \sum_{m=0}^{\infty} B_m e^{-\lambda_m t} \cos(k_m y), \tag{6.96}$$
where the weight parameters $B_m$ are obtained from the initial condition
$$p(y, t = 0) = \sum_{m=0}^{\infty} B_m \cos(k_m y) = \delta(y - y_0). \tag{6.97}$$
To get them we first multiply (6.97) by $\cos(k_n y)$, and integrate over y from 0 to b − a:
$$\sum_{m=0}^{\infty} B_m \int_0^{b-a} dy \, \cos(k_m y) \cos(k_n y) = \int_0^{b-a} dy \, \delta(y - y_0) \cos(k_n y). \tag{6.98}$$
The integral on the left-hand side can be easily calculated using the orthogonality of the eigenfunctions, which can also be verified by direct calculation
$$\int_0^{b-a} dy \, \cos(k_n y) \cos(k_m y) = \frac{1}{2} \int_0^{b-a} dy \, \left[ \cos((k_n + k_m) y) + \cos((k_n - k_m) y) \right] \tag{6.99}$$
$$= \frac{1}{2} (b - a) \, \delta_{mn}, \tag{6.100}$$
where the relations
$$k_n + k_m = \frac{\pi}{b - a} (1 + n + m), \tag{6.101}$$
$$k_n - k_m = \frac{\pi}{b - a} (n - m) \tag{6.102}$$
following from (6.94), as well as the well known limit
$$\lim_{x \to 0} \frac{\sin x}{x} = 1 \tag{6.103}$$
have been used. Hence, (6.98) reduces to
$$\sum_{m=0}^{\infty} B_m \frac{b - a}{2} \delta_{mn} = \int_0^{b-a} dy \, \delta(y - y_0) \cos(k_n y), \tag{6.104}$$
which yields
$$B_m = \frac{2}{b - a} \cos(k_m y_0). \tag{6.105}$$
Figure 6.3 Eigenfunctions $f_m(x)$ (6.106) in m^(−1/2) for m = 0, 1, 2, 3 with a = 0, b = 1 m and diffusion constant D = 1 m² s⁻¹ (horizontal axis: x in m).
Figure 6.3 shows the normalized eigenfunctions
$$f_m(x) = \sqrt{\frac{2}{b - a}} \cos(k_m (x - a)) \tag{6.106}$$
for the first four values of m. According to this, the solution reads
$$p(y,t) = \frac{2}{b - a} \sum_{m=0}^{\infty} e^{-\lambda_m t} \cos(k_m y_0) \cos(k_m y). \tag{6.107}$$
The inverse transformation from y to x gives us the final probability distribution
$$p(x,t) = \frac{2}{b - a} \sum_{m=0}^{\infty} e^{-\lambda_m t} \cos(k_m (x_0 - a)) \cos(k_m (x - a)). \tag{6.108}$$
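The series (6.108) is straightforward to evaluate numerically once it is truncated. The sketch below assumes a cut-off of 200 terms and the parameters a = 0, b = 1, x0 = 0.5, D = 1; both the cut-off and the function name are choices made here for illustration.

```python
import numpy as np

def p_series(x, t, x0=0.5, a=0.0, b=1.0, D=1.0, n_terms=200):
    """Truncated eigenfunction expansion (6.108) evaluated at the points x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    m = np.arange(n_terms)
    k = np.pi * (m + 0.5) / (b - a)                            # wave numbers k_m
    weights = np.exp(-D * k ** 2 * t) * np.cos(k * (x0 - a))   # shape (n_terms,)
    modes = np.cos(np.outer(x - a, k))                         # shape (len(x), n_terms)
    return 2.0 / (b - a) * modes @ weights

x = np.linspace(0.0, 1.0, 1001)
for t in (0.01, 0.05, 0.1, 0.5):
    p = p_series(x, t)
    remaining = float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(x)))
    print(f"t = {t}: p(b, t) = {p[-1]:.2e}, probability still inside = {remaining:.4f}")
```

The value at x = b stays at zero (absorbing boundary), and the probability remaining in the interval decreases with time. At very small t more terms are needed, since the high modes decay slowly there.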
Figure 6.4 shows the probability density distribution p(x, t) (6.108), which tends to zero with increasing time. The first-passage time distribution (breakdown probability density) follows from the balance condition
$$P(t, x = b) = -\frac{d}{dt} \int_a^b p(x,t) \, dx. \tag{6.109}$$
By inserting the solution (6.108) into the right-hand side of this equation, we obtain
$$P(t, b) = \frac{2}{b - a} \sum_{m=0}^{\infty} \lambda_m e^{-\lambda_m t} \cos(k_m (x_0 - a)) \int_a^b \cos(k_m (x - a)) \, dx$$
$$= \frac{2}{b - a} \sum_{m=0}^{\infty} \frac{\lambda_m}{k_m} e^{-\lambda_m t} \cos(k_m (x_0 - a)) \sin(k_m (b - a))$$
$$= \frac{2\pi D}{(b - a)^2} \sum_{m=0}^{\infty} (-1)^m \left( \frac{1}{2} + m \right) e^{-D k_m^2 t} \cos(k_m (x_0 - a)). \tag{6.110}$$

Figure 6.4 Time-dependent solution p(x, t) (6.108) in m⁻¹ of the diffusion equation in the finite interval x ∈ [0, 1] with reflecting boundary at a = 0 and absorbing boundary at b = 1 m, shown for t = 0.01 s, 0.05 s, 0.1 s and 0.5 s. Parameters are: initial condition x0 = 0.5 m and diffusion constant D = 1 m² s⁻¹.

The result fulfills the normalization condition
$$\int_0^{\infty} P(t, b) \, dt = 1. \tag{6.111}$$
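The normalization (6.111) can be checked by integrating the truncated series (6.110) over time. This is a sketch under stated assumptions: 400 terms, a logarithmic time grid from 10⁻³ to 20, and the parameters of Figure 6.4 (a = 0, b = 1, x0 = 0.5, D = 1); none of these numerical settings come from the text.

```python
import numpy as np

def first_passage_density(t, x0=0.5, a=0.0, b=1.0, D=1.0, n_terms=400):
    """Truncated series (6.110) for the outflow density P(t, b)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    m = np.arange(n_terms)
    k = np.pi * (m + 0.5) / (b - a)
    coeff = (-1.0) ** m * (m + 0.5) * np.cos(k * (x0 - a))
    decay = np.exp(-D * np.outer(t, k ** 2))       # shape (len(t), n_terms)
    return 2.0 * np.pi * D / (b - a) ** 2 * decay @ coeff

# The density is negligible before t = 1e-3 for these parameters,
# so the integral can start there instead of at t = 0.
t = np.geomspace(1e-3, 20.0, 4000)
P = first_passage_density(t)
total = float(np.sum(0.5 * (P[1:] + P[:-1]) * np.diff(t)))
print(f"integral of P(t, b) over time = {total:.6f}")
```

The logarithmic grid resolves the steep rise at small t without wasting points on the slow exponential tail.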
The first-passage time or breakdown probability density distribution P(t, x = b), indicating how long it takes for the system to reach the absorbing boundary at x = b for the first time, is presented in Figure 6.5 for the same set of parameters as in Figure 6.4. The right plot of Figure 6.5 is a scaled picture showing the time lag which will be explained later.

Figure 6.5 The first-passage time probability density distribution P(t, x = b) in s⁻¹ with b = 1 m, initial condition x0 = 0.5 m and diffusion constant D = 1 m² s⁻¹. The right curve shows the time lag in detail.

Figure 6.6 The first-passage time probability density distribution P(t, x = b) in s⁻¹ when a → −∞ (6.112) with b = 2 m, initial condition x0 = 0 and diffusion constant D = 1 m² s⁻¹. The inset is a scaled picture of the time lag.

Finally, we present a result that is well known in the mathematical literature. If we move the left boundary very far away (limiting case: a → −∞), we obtain from the infinite sum (6.110) the well known formula (see Figure 6.6)
$$P(t, b) = \frac{b - x_0}{\sqrt{4\pi D t^3}} \exp\left( -\frac{(b - x_0)^2}{4Dt} \right). \tag{6.112}$$
The following suggests a way to get this result. The first-passage time distribution P(t, b) for the diffusion problem in a finite interval with reflecting boundary at x = a, absorbing boundary at x = b, and the initial probability distribution δ(x − x0) is given by (6.110) together with wave numbers (6.94) written as
$$k_m (b - a) = \pi \left( m + \tfrac{1}{2} \right). \tag{6.113}$$
Taking into account (6.113), we make the following replacements
$$\cos(k_m [x_0 - a]) = \cos(k_m [b - a] - k_m [b - x_0]) = \cos\left( \pi \left( m + \tfrac{1}{2} \right) - k_m s \right), \tag{6.114}$$
where s = b − x0. By virtue of the well known trigonometric relationship $\cos(\alpha \pm \beta) = \cos\alpha \cos\beta \mp \sin\alpha \sin\beta$, it further transforms to
$$\cos(k_m [x_0 - a]) = \cos\left( \pi \left( m + \tfrac{1}{2} \right) \right) \cos(k_m s) + \sin\left( \pi \left( m + \tfrac{1}{2} \right) \right) \sin(k_m s) = (-1)^m \sin(k_m s). \tag{6.115}$$
By inserting this into (6.110) we obtain
$$P(t, b) = 2D \sum_{m=0}^{\infty} \frac{\pi (m + 1/2)}{(b - a)^2} e^{-D k_m^2 t} \sin(k_m s) = 2D \sum_{m=0}^{\infty} \frac{k_m}{b - a} e^{-D k_m^2 t} \sin(k_m s). \tag{6.116}$$
Now we use the identity
$$\frac{\partial}{\partial s} \cos(k_m s) = -k_m \sin(k_m s), \tag{6.117}$$
and the relation for the wave number increment
$$\Delta k_m = k_{m+1} - k_m = \frac{\pi}{b - a} \tag{6.118}$$
to transform (6.116) into
$$P(t, b) = -\frac{2D}{\pi} \frac{\partial}{\partial s} \sum_{m=0}^{\infty} e^{-D k_m^2 t} \cos(k_m s) \, \Delta k_m. \tag{6.119}$$
In the limit where the left boundary tends to minus infinity, which corresponds to a → −∞ or b − a → ∞, the sum is replaced with the integral, which yields
$$P(t, b) = -\frac{2D}{\pi} \frac{\partial}{\partial s} \int_0^{\infty} e^{-D k^2 t} \cos(ks) \, dk. \tag{6.120}$$
This integral is calculated by applying the known formula
$$\int_0^{\infty} e^{-u^2 x^2} \cos(sx) \, dx = \frac{\sqrt{\pi}}{2u} e^{-s^2/(4u^2)}, \quad u > 0, \tag{6.121}$$
which in our case ($u = \sqrt{Dt}$) yields
$$P(t, b) = -\frac{D}{\sqrt{\pi D t}} \frac{\partial}{\partial s} e^{-\frac{s^2}{4Dt}} = \frac{s}{\sqrt{4\pi D t^3}} e^{-\frac{s^2}{4Dt}}. \tag{6.122}$$
Returning to the original notations, we finally recover the known expression (6.112) for the first-passage time distribution in the semi-infinite interval.
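The limit can be illustrated numerically by evaluating the finite-interval series (6.110) for increasingly distant left boundaries and comparing with the closed form (6.112). A minimal sketch, assuming x0 = 0, b = 2, D = 1 and a cut-off of 4000 terms (all illustrative choices):

```python
import numpy as np

def fpt_series(t, x0, a, b, D, n_terms):
    """Truncated series (6.110): reflecting boundary at a, absorbing boundary at b."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    m = np.arange(n_terms)
    k = np.pi * (m + 0.5) / (b - a)
    coeff = (-1.0) ** m * (m + 0.5) * np.cos(k * (x0 - a))
    return 2.0 * np.pi * D / (b - a) ** 2 * np.exp(-D * np.outer(t, k ** 2)) @ coeff

def fpt_semi_infinite(t, x0, b, D):
    """Closed form (6.112) for the left boundary moved to minus infinity."""
    s = b - x0
    return s / np.sqrt(4.0 * np.pi * D * t ** 3) * np.exp(-s ** 2 / (4.0 * D * t))

t = np.array([0.2, 0.5, 1.0, 2.0])
exact = fpt_semi_infinite(t, x0=0.0, b=2.0, D=1.0)
for a in (-2.0, -10.0, -50.0):
    approx = fpt_series(t, x0=0.0, a=a, b=2.0, D=1.0, n_terms=4000)
    print(f"a = {a:6.1f}: max deviation from (6.112) = {np.max(np.abs(approx - exact)):.3e}")
```

The deviation should shrink rapidly as a moves to the left, because on these time scales the walker hardly ever reaches a distant reflecting boundary.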
6.5 The Mirror Method and Time Lag
Alternatively, the solution of the diffusion equation in shifted coordinates y = x − a, together with the initial condition p(y, t = 0) = δ(y − y0) and different boundary conditions, can be obtained as a superposition of solutions
$$p_{y_0}(y,t) = \frac{1}{\sqrt{4\pi Dt}} \exp\left( -\frac{(y - y_0)^2}{4Dt} \right) \tag{6.123}$$
for natural boundary conditions with different positions y0 of the delta-peaks at the initial time moment t = 0. Any such superposition satisfies the diffusion equation (6.79) due to its linearity. The only problem is to fulfill the initial and boundary conditions.

For simplicity, let us consider first the case where there is a reflecting boundary at y = 0 and we are looking for the solution within y ∈ [0, ∞), so the other boundary is located at +∞. Formally we can extend the y interval from −∞ to ∞ and make a mirror construction. Obviously, the superposition as the sum of $p_{y_0}(y,t)$ and $p_{-y_0}(y,t)$ fulfills the initial condition for the interval y ∈ [0, ∞), as well as the reflecting boundary condition ∂p(y, t)/∂y = 0 at y = 0 due to the mirror symmetry around this point (see Figure 6.7(a)). Hence
$$p(y,t) = p_{y_0}(y,t) + p_{-y_0}(y,t) \tag{6.124}$$
is the solution of the problem with one reflecting boundary at y = 0. Similarly (see Figure 6.7(b)), we can construct a solution which is antisymmetric with respect to y = b − a, that is,
$$p(y,t) = p_{y_0}(y,t) - p_{2(b-a)-y_0}(y,t). \tag{6.125}$$
This is the solution of the problem with a single absorbing boundary at y = b − a (the other boundary located infinitely far away), since it obeys the condition p(y, t) = 0 at y = b − a.

Figure 6.7 Construction of the solution for y ∈ [0, +∞) with reflecting boundary at y = 0 (a) and absorbing boundary at y = 0 (b). Circles (full lines) are the final results corresponding to (6.124) and (6.125), respectively.

The problem with two boundaries located at a finite distance b − a from each other is more complicated, since an infinite number of periodically placed peaks are necessary to satisfy the specific symmetry and/or antisymmetry conditions. In this case (Figure 6.8) we have
$$p(y,t) = \sum_{m=-\infty}^{\infty} (-1)^m \left[ p_{y_0 + 2m(b-a)}(y,t) + p_{-y_0 + 2m(b-a)}(y,t) \right] \tag{6.126}$$
for our problem with the reflecting boundary at y = 0 and the absorbing one at y = b − a. For two reflecting boundaries (Figure 6.9) the solution is
$$p(y,t) = \sum_{m=-\infty}^{\infty} \left[ p_{y_0 + 2m(b-a)}(y,t) + p_{-y_0 + 2m(b-a)}(y,t) \right] \tag{6.127}$$
and for two absorbing boundaries (Figure 6.10) it reads
$$p(y,t) = \sum_{m=-\infty}^{\infty} \left[ p_{y_0 + 2m(b-a)}(y,t) - p_{-y_0 + 2m(b-a)}(y,t) \right]. \tag{6.128}$$
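The mirror sum for the mixed problem (6.126) and the eigenfunction expansion (6.107) describe the same density, which can be confirmed numerically. The sketch below truncates both sums; the numbers of images and modes, the interval length L = b − a = 1 and y0 = 0.5 are assumptions made here for illustration.

```python
import numpy as np

def free_gaussian(y, t, center, D):
    """Free solution (6.123) with the initial delta peak at `center`."""
    return np.exp(-(y - center) ** 2 / (4.0 * D * t)) / np.sqrt(4.0 * np.pi * D * t)

def p_mirror_ra(y, t, y0, L, D, n_images=20):
    """Mirror sum (6.126): reflecting wall at y = 0, absorbing wall at y = L."""
    total = np.zeros_like(y, dtype=float)
    for m in range(-n_images, n_images + 1):
        sign = -1.0 if m % 2 else 1.0          # (-1)**m, also valid for negative m
        total += sign * (free_gaussian(y, t, y0 + 2 * m * L, D)
                         + free_gaussian(y, t, -y0 + 2 * m * L, D))
    return total

def p_eigen_ra(y, t, y0, L, D, n_terms=400):
    """Eigenfunction expansion (6.107) for the same boundary conditions."""
    m = np.arange(n_terms)
    k = np.pi * (m + 0.5) / L
    weights = np.exp(-D * k ** 2 * t) * np.cos(k * y0)
    return 2.0 / L * np.cos(np.outer(y, k)) @ weights

y = np.linspace(0.0, 1.0, 201)
for t in (0.01, 0.1, 1.0):
    diff = np.max(np.abs(p_mirror_ra(y, t, 0.5, 1.0, 1.0) - p_eigen_ra(y, t, 0.5, 1.0, 1.0)))
    print(f"t = {t}: max difference between the two representations = {diff:.2e}")
```

In practice the mirror sum converges fastest at short times (few images contribute), while the eigenfunction expansion converges fastest at long times (few modes survive).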
Figure 6.8 Construction of solution for finite interval y ∈ [0, b − a) with a = 0 for the mixed boundary problem, that is, reflecting boundary at y = 0 and absorbing boundary at y = b. The bold curve is the final solution (6.126).

Figure 6.9 Schematic picture of the mirror method for constructing the solution of the Fokker–Planck equation with two reflecting boundary conditions at y = 0 and y = b − a (a = 0). The bold curve is the final solution (6.127).

Figure 6.10 Schematic picture of the mirror method for constructing the solution of the Fokker–Planck equation with two absorbing boundary conditions at y = 0 and y = b − a (a = 0). The bold curve is the final solution (6.128).

The first-passage time distribution (cf. (6.109))
$$P(t, y = b - a) = -\frac{d}{dt} \int_0^{b-a} p(y,t) \, dy \tag{6.129}$$
for our problem with the absorbing boundary at x = b or y = b − a, in the limit where the other boundary is moved infinitely far away (b − a → ∞ at a finite b − x0 or b − a − y0), can be calculated easily by using the solution (6.125). In this case it is useful to rewrite the probability outflow (6.129) as
$$P(t, y = b - a) = -D \left. \frac{\partial p(y,t)}{\partial y} \right|_{y=b-a} \tag{6.130}$$
obtained by inserting the time derivative of p(y, t) from (6.79) into (6.129). Substituting the solution (6.125) with (6.123) into (6.130) we obtain
$$P(t, y = b - a) = \frac{b - a - y_0}{\sqrt{4\pi D t^3}} \exp\left( -\frac{(b - a - y_0)^2}{4Dt} \right). \tag{6.131}$$
The transformation to the original coordinate x = y + a leads finally to
$$P(t, x = b) = \frac{b - x_0}{\sqrt{4\pi D t^3}} \exp\left( -\frac{(b - x_0)^2}{4Dt} \right), \tag{6.132}$$
already stated as the result (6.112).

A remarkable property of the breakdown probability distribution (6.132) is its diverging mean value, which is related to the fact that stochastic trajectories are allowed to move arbitrarily far away from the origin in the negative direction, requiring an unlimited amount of time to return and reach the absorbing boundary. Another interesting property is the presence of a certain time lag $t_{lag}$, which means that the distribution function P(t, b) is very small and tends to zero extremely fast for $t < t_{lag}$. This implies that the breakdown is very rarely (or practically never) observed before $t = t_{lag}$, as typically one needs some minimal time to cover the distance from the origin x = x0 to the absorbing boundary x = b. The value of $t_{lag}$ can be defined by constructing a tangent at the inflection point of the P(t, b) plot and looking for the point $t = t_{lag}$ where this linear approximation gives zero. Figure 6.11 shows the graphical interpretation of this time lag construction. The function P(t, b) has a maximum at
$$t = t_{max} = \frac{2}{3} \frac{(b - x_0)^2}{4D} \tag{6.133}$$
and two inflection points
$$t_{1,2} = t_{max} \left( 1 \mp \sqrt{\frac{2}{5}} \right). \tag{6.134}$$
Figure 6.11 Construction of time lag $t_{lag}$ of the first-passage time probability density P(t, x = b) in s⁻¹ for limit case of a → −∞ with x0 = 0 and b = 2 m (see inside plot in Figure 6.6). Time lag $t_{lag}$ (6.136) is calculated as the intersection point of the t-axis and the tangent (dotted line) to P(t, x = b) at the inflection point $(t_{infl}, P(t_{infl}, x = b))$ (6.134).
The smallest value $t_1 = t_{infl}$ is meaningful in our tangent construction. Thus, the time lag is given by the equation
$$\left( t_{infl} - t_{lag} \right) \left. \frac{dP}{dt} \right|_{t = t_{infl}} = P(t_{infl}, b) \tag{6.135}$$
from which we find
$$t_{lag} = \frac{1}{3} \left( 7 - 2\sqrt{10} \right) t_{max} \approx 3.75 \times 10^{-2} \, \frac{(b - x_0)^2}{D}. \tag{6.136}$$
Finally, this time lag is calculated for the example which has been presented in Figure 6.11. In this case we get a numerical value of tlag = 0.15 s for parameters x0 = 0, b = 2 m and D = 1 m2 s−1 . The maximum of P(t, b) corresponds to the point (0.667 s; 0.231 s−1 ). There are two inflection points of the function P(t, b) but we confine our consideration to the inflection point located on the left-hand side of tmax , that is (0.245 s; 0.078 s−1 ).
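The numbers quoted for this example can be reproduced from the closed form of P(t, b) for the semi-infinite interval and the tangent construction. In the sketch below s stands for the distance b − x0; the function names are hypothetical helpers, and the derivative is written out analytically from the closed form.

```python
import math

def first_passage_density(t, s, D):
    """First-passage density for the semi-infinite interval, with s = b - x0."""
    return s / math.sqrt(4.0 * math.pi * D * t ** 3) * math.exp(-s ** 2 / (4.0 * D * t))

def density_slope(t, s, D):
    """Time derivative dP/dt = P(t) * (s**2 / (4 D t**2) - 3 / (2 t))."""
    return first_passage_density(t, s, D) * (s ** 2 / (4.0 * D * t ** 2) - 1.5 / t)

def time_lag(s, D):
    """Return (t_max, t_infl, t_lag) from the tangent at the left inflection point."""
    t_max = s ** 2 / (6.0 * D)                         # maximum of P(t, b)
    t_infl = t_max * (1.0 - math.sqrt(2.0 / 5.0))      # left inflection point
    t_lag = t_infl - first_passage_density(t_infl, s, D) / density_slope(t_infl, s, D)
    return t_max, t_infl, t_lag

t_max, t_infl, t_lag = time_lag(s=2.0, D=1.0)
print(f"t_max  = {t_max:.3f} s, P = {first_passage_density(t_max, 2.0, 1.0):.3f} 1/s")
print(f"t_infl = {t_infl:.3f} s, P = {first_passage_density(t_infl, 2.0, 1.0):.3f} 1/s")
print(f"t_lag  = {t_lag:.3f} s")
```

The tangent construction and the closed expression for the time lag in terms of the maximum position should agree to rounding error.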
6.6 Maximum Value Distribution
Consider a one-dimensional random walk (called a diffusion process) with natural boundary conditions starting at point x0. We require the distribution of maximum values of coordinate x for an ensemble of stochastic trajectories running within a given time interval from zero to t. Let $P_{max}(x|x_0, t)$ be the probability density in x space of the maximum distribution. Note that x ≥ x0 always holds, and any trajectory wandering only in the region x ≤ x0 has the maximum at x = x0. Let us denote by $P_{max}(x = x_0|x_0, t)$ the probability of the latter scenario. The probability $P_{max}(x \le b|x_0, t)$ that the maximum of a trajectory does not exceed a certain value b reads
$$P_{max}(x \le b|x_0, t) = P_{max}(x = x_0|x_0, t) + \int_{x_0}^{b} P_{max}(x|x_0, t) \, dx. \tag{6.137}$$
In the following it will be shown that the probability $P_{max}(x = x_0|x_0, t)$ vanishes in our case of stochastic trajectories. In fact, $P_{max}(x \le b|x_0, t)$ is the probability that x belongs to the interval (−∞, b] at the time moment t. Thus, we have
$$\int_{-\infty}^{b} p(x, t|x_0, b) \, dx = \int_{x_0}^{b} P_{max}(x|x_0, t) \, dx, \tag{6.138}$$
where $p(x, t|x_0, b)$ is the probability distribution density over x at a time t provided that the initial distribution is δ(x − x0) and the absorbing boundary is located at x = b. It can be related to the first-passage problem in a semi-infinite interval x ∈ (−∞, b] with absorbing boundary at x = b. According to (6.138), we have
$$\int_{x_0}^{b} P_{max}(x|x_0, t) \, dx = 1 - \int_0^{t} P(t'|x_0, b) \, dt', \tag{6.139}$$
where $P(t'|x_0, b)$ is the first-passage time (t′) distribution density to reach the absorbing boundary. Taking the derivative with respect to b in (6.138), we obtain
$$P_{max}(x = b|x_0, t) = \frac{\partial}{\partial b} \int_{-\infty}^{b} p(x, t|x_0, b) \, dx. \tag{6.140}$$
Using (6.139), the derivation yields
$$P_{max}(x = b|x_0, t) = -\frac{\partial}{\partial b} \int_0^{t} P(t'|x_0, b) \, dt' = -\int_0^{t} \frac{\partial}{\partial b} P(t'|x_0, b) \, dt'. \tag{6.141}$$
For pure diffusion with the above given initial and boundary conditions we recall the solution
$$p(x, t|x_0, b) = \frac{1}{\sqrt{4\pi Dt}} \left[ \exp\left( -\frac{(x - x_0)^2}{4Dt} \right) - \exp\left( -\frac{(x - 2b + x_0)^2}{4Dt} \right) \right] \tag{6.142}$$
obtained by the mirror construction method. The maximum distribution can, in principle, be calculated from the first-passage time distribution (6.112)
$$P(t|x_0, b) = \frac{b - x_0}{\sqrt{4\pi D t^3}} \exp\left( -\frac{(b - x_0)^2}{4Dt} \right) \tag{6.143}$$
according to (6.141), although in our case it is more easily obtained from (6.140):
$$P_{max}(x = b|x_0, t) = \frac{\partial}{\partial b} \int_{-\infty}^{b} p(x, t|x_0, b) \, dx = p(b, t|x_0, b) + \int_{-\infty}^{b} \frac{\partial}{\partial b} p(x, t|x_0, b) \, dx. \tag{6.144}$$
The first term on the right-hand side of (6.144) is zero according to the absorbing boundary condition at x = b. By inserting (6.142) into the integral, we obtain
$$P_{max}(x = b|x_0, t) = -\frac{1}{\sqrt{4\pi (Dt)^3}} \int_{-\infty}^{b} (x - 2b + x_0) \exp\left( -\frac{(x - 2b + x_0)^2}{4Dt} \right) dx. \tag{6.145}$$
This integral can be easily calculated using the substitution $z = (x - 2b + x_0)^2$, which yields the result in the form of the Gaussian distribution
$$P_{max}(x|x_0, t) = \frac{1}{\sqrt{\pi Dt}} \exp\left( -\frac{(x - x_0)^2}{4Dt} \right). \tag{6.146}$$
This fulfills the normalization condition
$$\int_{x_0}^{\infty} P_{max}(x|x_0, t) \, dx = 1. \tag{6.147}$$
Since the total probability is one, it means that there is no special additional contribution due to the trajectories located in the region x ≤ x0 in the whole time interval, so $P_{max}(x = x_0|x_0, t) = 0$. The latter probability can be calculated directly as
$$P_{max}(x = x_0|x_0, t) = \lim_{b \searrow x_0} \int_{-\infty}^{b} p(x, t|x_0, b) \, dx = 0 \tag{6.148}$$
in accordance with the solution (6.142). It can be understood in such a way that any trajectory of the diffusion process, which starts at x = x0, with probability one, reaches a value x > x0 which is infinitely close to the origin. However, there is a continuum of trajectories which wander in the region x < x0 after some small fluctuations around the origin x = x0. For all these trajectories, the maximum is located near x0, which explains the maximum of the probability distribution (6.146) at x = x0. In Figure 6.12 the theoretical formula (6.146) is compared to a numerical simulation on a lattice.

Figure 6.12 The probability distribution of maximal x values of the random walk depending on the normalized coordinate $x/\langle x^2 \rangle^{1/2}$ with $\langle x^2 \rangle = N$, where N = 10⁴ is the number of discrete steps of unit length made by the random walker. The solid line shows the theoretical curve (6.146) for 2Dt = N, whereas the simulation results obtained from a sample of 10⁶ trajectories are represented by circles.

The formula (6.146) is well known in the mathematical literature, see e.g. [91], p. 95, where it is obtained from the probability density distribution over the variables a and b (where a ≤ b and b ≥ 0)
$$P_0(a, b) = \frac{2(2b - a)}{\sqrt{2\pi (2Dt)^3}} \exp\left( -\frac{(2b - a)^2}{4Dt} \right) \tag{6.149}$$
to have the maximum at x = b and the end-coordinate at x = a for stochastic trajectories starting at x0 = 0 and running within the time interval from zero to t. The maximum distribution (6.146) is obtained by integrating the distribution (6.149) over all the possible a values from −∞ to b.
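A lattice simulation of the kind shown in Figure 6.12 can be sketched in a few lines. The example below is a reduced version under stated assumptions: 1000 unit steps and 5000 walks instead of the 10⁴ steps and 10⁶ trajectories of the figure, a fixed random seed, and the identification 2Dt = N. It compares the sample mean of the maxima with the mean 2√(Dt/π) implied by the Gaussian form (6.146); a small offset is expected because of the lattice discreteness.

```python
import numpy as np

def simulate_maxima(n_steps, n_walks, seed=0):
    """Maxima of unit-step lattice walks started at x0 = 0.
    The starting point counts as visited, so every maximum is >= 0."""
    rng = np.random.default_rng(seed)
    steps = rng.choice(np.array([-1, 1], dtype=np.int32), size=(n_walks, n_steps))
    paths = np.cumsum(steps, axis=1, dtype=np.int32)
    return np.maximum(paths.max(axis=1), 0)

def p_max(x, t, D, x0=0.0):
    """Theoretical maximum value density (6.146), valid for x >= x0."""
    return np.exp(-(x - x0) ** 2 / (4.0 * D * t)) / np.sqrt(np.pi * D * t)

n_steps, n_walks = 1000, 5000
maxima = simulate_maxima(n_steps, n_walks)

# Identify 2 D t with the number of steps N, as in Figure 6.12.
D, t = 0.5, float(n_steps)
predicted_mean = 2.0 * np.sqrt(D * t / np.pi)   # first moment of (6.146) for x0 = 0
print(f"simulated mean maximum = {maxima.mean():.2f}")
print(f"predicted mean maximum = {predicted_mean:.2f}")
```

A histogram of `maxima` divided by the square root of N can be compared with `p_max` in the same way as the circles and the solid line in Figure 6.12.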
6.7 Summary of Results for Diffusion in a Finite Interval
Summarizing the results of the probability density p(x, t) for one-dimensional diffusion in a finite interval a ≤ x ≤ b we distinguish between three different cases related to the properties of the boundaries.

6.7.1 Reflected Diffusion

Diffusion within two reflecting barriers (RR) at x = a and x = b:

Mirror method:
$$p_{RR}(x,t) = \sum_{m=-\infty}^{\infty} \left[ p_{x_0 - a + 2m(b-a)}(x,t) + p_{-(x_0 - a) + 2m(b-a)}(x,t) \right] \tag{6.150}$$
with
$$p_{x_0}(x,t) = \frac{1}{\sqrt{4\pi Dt}} \exp\left( -\frac{(x - x_0)^2}{4Dt} \right). \tag{6.151}$$
Eigenfunction expansion:
$$p_{RR}(x,t) = \frac{1}{b - a} \sum_{m=-\infty}^{\infty} e^{-D k_m^2 t} \cos(k_m (x_0 - a)) \cos(k_m (x - a)) \tag{6.152}$$
or
$$p_{RR}(x,t) = \frac{1}{b - a} + \frac{2}{b - a} \sum_{m=1}^{\infty} e^{-D k_m^2 t} \cos(k_m (x_0 - a)) \cos(k_m (x - a)) \tag{6.153}$$
with
$$k_m = \frac{\pi}{b - a} m, \qquad \lambda_m = D k_m^2 = \frac{D\pi^2}{(b - a)^2} m^2. \tag{6.154}$$
Reflected diffusion takes place in a closed system with a normalized survival function
$$G_{RR}(t, x_0) = \int_a^b p_{RR}(x,t) \, dx = 1 \quad \text{or} \quad \frac{d}{dt} G(t, x_0) = 0 \tag{6.155}$$
without outflow at any border: P(t, x = a) = 0 and P(t, x = b) = 0.
6.7.2 Diffusion in a Semi-Open System
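Reflecting barrier (left at x = a) and absorbing border (right at x = b) (RA):

Mirror method:
$$p_{RA}(x,t) = \sum_{m=-\infty}^{\infty} (-1)^m \left[ p_{x_0 - a + 2m(b-a)}(x,t) + p_{-(x_0 - a) + 2m(b-a)}(x,t) \right] \tag{6.156}$$
with
$$p_{x_0}(x,t) = \frac{1}{\sqrt{4\pi Dt}} \exp\left( -\frac{(x - x_0)^2}{4Dt} \right). \tag{6.157}$$
Eigenfunction expansion:
$$p_{RA}(x,t) = \frac{1}{b - a} \sum_{m=-\infty}^{\infty} e^{-D k_m^2 t} \cos(k_m (x_0 - a)) \cos(k_m (x - a)) \tag{6.158}$$
or
$$p_{RA}(x,t) = \frac{2}{b - a} \sum_{m=0}^{\infty} e^{-D k_m^2 t} \cos(k_m (x_0 - a)) \cos(k_m (x - a)) \tag{6.159}$$
with
$$k_m = \frac{\pi}{b - a} \left( \frac{1}{2} + m \right), \qquad \lambda_m = D k_m^2 = \frac{D\pi^2}{(b - a)^2} \left( \frac{1}{2} + m \right)^2. \tag{6.160}$$
Survival function in a semi-open system:
$$G_{RA}(t, x_0) = \int_a^b p_{RA}(x,t) \, dx = \frac{2}{b - a} \sum_{m=0}^{\infty} e^{-D k_m^2 t} \cos(k_m (x_0 - a)) \frac{(-1)^m}{k_m}. \tag{6.161}$$
Outflow density at the right border:
$$P(t, x = b) = -\frac{d}{dt} G(t, x_0) = j(x = b, t) = -D \left. \frac{\partial p(x,t)}{\partial x} \right|_{x=b}$$
$$= \frac{2}{b - a} \sum_{m=0}^{\infty} \frac{\lambda_m}{k_m} e^{-\lambda_m t} \cos(k_m (x_0 - a)) \sin(k_m (b - a))$$
$$= \frac{2\pi D}{(b - a)^2} \sum_{m=0}^{\infty} (-1)^m \left( \frac{1}{2} + m \right) e^{-D k_m^2 t} \cos(k_m (x_0 - a)). \tag{6.162}$$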
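Mean first passage time:
$$\langle t \rangle (x_0 \to b) = \frac{1}{2} \frac{(b - x_0)(b - a + x_0 - a)}{D} \tag{6.163}$$
including the special case
$$\langle t \rangle (x_0 = a \to b) = \frac{1}{2} \frac{(b - a)^2}{D}. \tag{6.164}$$
Second moment:
$$\langle t^2 \rangle (x_0 \to b) = \frac{5}{12} \frac{(b - a)^4}{D^2} + \frac{1}{12} \frac{(x_0 - a)^2}{D^2} \left[ (x_0 - a)^2 - 6 (b - a)^2 \right] \tag{6.165}$$
with variance, given here through its square root (the standard deviation),
$$\sqrt{\langle t^2 \rangle (x_0 = a \to b) - \langle t \rangle (x_0 = a \to b)^2} = \frac{1}{\sqrt{6}} \frac{(b - a)^2}{D}. \tag{6.166}$$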
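The closed forms for the first two moments of the first-passage time can be cross-checked against the series for the outflow density (6.162). Each term of that series is proportional to λ_m exp(−λ_m t), so its first and second time moments are 1/λ_m and 2/λ_m². The sketch below sums these contributions; the cut-off of 2000 terms and the function names are choices made here, not part of the text.

```python
import numpy as np

def fpt_moments_series(x0, a, b, D, n_terms=2000):
    """First and second moments of the first-passage time, summed term by term
    from the series for the outflow density at the absorbing boundary."""
    m = np.arange(n_terms)
    k = np.pi * (m + 0.5) / (b - a)
    lam = D * k ** 2
    base = 2.0 / (b - a) * (-1.0) ** m * np.cos(k * (x0 - a)) / k
    t1 = float(np.sum(base / lam))              # integral of t * lam * exp(-lam t) is 1 / lam
    t2 = float(np.sum(2.0 * base / lam ** 2))   # integral of t**2 * lam * exp(-lam t) is 2 / lam**2
    return t1, t2

def fpt_moments_closed(x0, a, b, D):
    """Closed forms for the mean first passage time and the second moment."""
    L, u = b - a, x0 - a
    t1 = (L ** 2 - u ** 2) / (2.0 * D)
    t2 = 5.0 * L ** 4 / (12.0 * D ** 2) + u ** 2 * (u ** 2 - 6.0 * L ** 2) / (12.0 * D ** 2)
    return t1, t2

for x0 in (0.0, 0.5, 0.9):
    print(x0, fpt_moments_series(x0, 0.0, 1.0, 1.0), fpt_moments_closed(x0, 0.0, 1.0, 1.0))
```

For x0 = a the difference between the second moment and the squared mean equals (b − a)⁴/(6D²), whose square root is the standard deviation of the first-passage time.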
6.7.3 Diffusion in an Open System
Diffusion within two absorbing boundaries (AA):

Mirror method:
$$p_{AA}(x,t) = \sum_{m=-\infty}^{\infty} \left[ p_{x_0 - a + 2m(b-a)}(x,t) - p_{-(x_0 - a) + 2m(b-a)}(x,t) \right] \tag{6.167}$$
with
$$p_{x_0}(x,t) = \frac{1}{\sqrt{4\pi Dt}} \exp\left( -\frac{(x - x_0)^2}{4Dt} \right). \tag{6.168}$$
Eigenfunction expansion:
$$p_{AA}(x,t) = \frac{1}{b - a} \sum_{m=-\infty}^{\infty} e^{-D k_m^2 t} \sin(k_m (x_0 - a)) \sin(k_m (x - a)) \tag{6.169}$$
or
$$p_{AA}(x,t) = \frac{2}{b - a} \sum_{m=1}^{\infty} e^{-D k_m^2 t} \sin(k_m (x_0 - a)) \sin(k_m (x - a)) \tag{6.170}$$
with
$$k_m = \frac{\pi}{b - a} m, \qquad \lambda_m = D k_m^2 = \frac{D\pi^2}{(b - a)^2} m^2. \tag{6.171}$$
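Survival function in an open system (obtained by integrating (6.170) term by term over x from a to b, with $\cos(k_m (b - a)) = (-1)^m$):
$$G_{AA}(t, x_0) = \int_a^b p_{AA}(x,t) \, dx = \frac{2}{b - a} \sum_{m=1}^{\infty} e^{-D k_m^2 t} \sin(k_m (x_0 - a)) \frac{1 + (-1)^{m+1}}{k_m}. \tag{6.172}$$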
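Outflow density at left border:
$$P(t, x = a) = -j(x = a, t) = +D \left. \frac{\partial p(x,t)}{\partial x} \right|_{x=a} = \frac{2\pi D}{(b - a)^2} \sum_{m=1}^{\infty} e^{-D k_m^2 t} \, m \sin(k_m (x_0 - a)). \tag{6.173}$$
Outflow density at right border:
$$P(t, x = b) = j(x = b, t) = -D \left. \frac{\partial p(x,t)}{\partial x} \right|_{x=b} = \frac{2\pi D}{(b - a)^2} \sum_{m=1}^{\infty} e^{-D k_m^2 t} (-1)^{m+1} m \sin(k_m (x_0 - a)). \tag{6.174}$$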
6.8 Exercises
E 6.1 Random walk with right border
Consider a discrete random walk on a line starting at the origin m = 0. The walker can move with equal probability p = q = 1/2 one unit to the left or to the right at each time step. The motion is unbounded from the left (natural boundary condition at m → −∞), and there is an absorbing boundary at m = m_b. Find the probability p(m_b, n) that the walker will be absorbed at m = m_b after n time steps. Compare the obtained result with the simple model having natural boundary conditions.

E 6.2 Galton board
Sir Francis Galton studied stochastic motion by discrete probabilistic jumps on an experimental setup which is now called the Galton board. As for a drunken sailor, it mimics the outcome of a large number of independent experiments with balls on triangular arrangements of nails as scatterers. The general task is to analyze the stochastic dynamics of an asymmetrical Galton board including the continuum limit. Divide the task into three parts.
Start with the historical background and learn from Galton's life about the motivation of Sir Francis Galton to study human heredity and his interest in qualitative statistics caused by Charles Darwin's 'Origin of Species'.
The Galton board is a discrete version of Brownian motion. Study the time evolution of the probability of finding the ball (as a point particle) after the nth collision in a certain bin m. Having in mind the Markov property, solve this Markov chain dynamics using elementary left (p) and right (q = 1 − p) jump probabilities. Consider the general case p ≠ q and discuss two special geometries: pure symmetry p = q = 1/2, and total asymmetry p = 0.
Francis Galton had already made the point that, for large n or a large number of independent experiments, the probability of the outcome approaches the normal distribution. Derive this Gaussian normal distribution as a continuum limit out of the binomial distribution. Introduce continuous spatial and time variables as well as microscopic parameters for drift and diffusion in order to get the corresponding Fokker–Planck equation.

E 6.3 Brownian motion
This motion, named after Robert Brown, shows the stochastic displacement of a particle. Brown was a botanist and he did not realize that the motion he saw in 1827 was associated with collisions on a molecular scale. It took over 75 years before Albert Einstein recognized the connection between Brownian motion and the physical process called diffusion.
Study Einstein's concept of Brownian motion to derive the well known diffusion equation (6.26) by reading the original paper 'Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen' [40] in Annalen der Physik 1905, pp. 549–60. The solution of the diffusion equation is known as a Gaussian distribution. Show its profile and discuss the properties, especially the second moment called the mean-square displacement.

E 6.4 Levy walks
The mathematician Paul Levy examined random walks with self-similar dynamics and power-law scaling.
Prepare an overview talk with handout about these so-called Levy walks (or Levy flights) and show the relationship with the traditional random walk or ordinary diffusion studied previously.
7 Bounded Drift–Diffusion Motion
7.1 Drift–Diffusion Equation with Natural Boundaries
Let us consider the following Fokker–Planck equation
$$\frac{\partial p(x,t)}{\partial t} = -v \frac{\partial p(x,t)}{\partial x} + D \frac{\partial^2 p(x,t)}{\partial x^2} \tag{7.1}$$
with constant (positive or negative) drift v and diffusion D > 0 in the interval −∞ < x < +∞ called natural boundaries, together with the initial condition as a delta function p(x, t = 0) = δ(x − x0). We would like to find an analytical solution p(x, t) of (7.1) by the linear transformation
$$P(y, T) \, dy = p(x, t) \, dx \tag{7.2}$$
with
$$y = y(x, t) = x - vt, \tag{7.3}$$
$$T = T(x, t) = Dt. \tag{7.4}$$
According to (7.2) we have
$$p(x, t) = P(y, T) \frac{dy}{dx} = P(y, T) \cdot 1 = P(y, T). \tag{7.5}$$
The time derivative on the left-hand side of (7.1) transforms to
$$\frac{\partial p(x,t)}{\partial t} = \frac{\partial P(y,T)}{\partial t} = \frac{\partial P(y,T)}{\partial T} \frac{\partial T}{\partial t} + \frac{\partial P(y,T)}{\partial y} \frac{\partial y}{\partial t}, \tag{7.6}$$
whereas the coordinate derivatives on the right-hand side are
$$\frac{\partial p(x,t)}{\partial x} = \frac{\partial P(y,T)}{\partial x} = \frac{\partial P(y,T)}{\partial y} \frac{\partial y}{\partial x} \tag{7.7}$$
and
$$\frac{\partial^2 p(x,t)}{\partial x^2} = \frac{\partial^2 P(y,T)}{\partial x^2} = \frac{\partial}{\partial y} \left( \frac{\partial P(y,T)}{\partial y} \frac{\partial y}{\partial x} \right) \frac{\partial y}{\partial x}. \tag{7.8}$$
According to (7.3) and (7.4) we have ∂y/∂x = 1, ∂y/∂t = −v, and ∂T/∂t = D. By inserting these relations into (7.6)–(7.8) and then all into (7.1) we obtain the well known partial differential equation called the diffusion law (cf. (6.26))
$$\frac{\partial P(y,T)}{\partial T} = \frac{\partial^2 P(y,T)}{\partial y^2} \tag{7.9}$$
with vanishing distribution P(y, T) at infinite boundaries and initial condition P(y, T = 0) = δ(y − y0). The Gaussian solution of (7.9) reads
$$P(y, T) = \frac{1}{\sqrt{4\pi T}} e^{-\frac{(y - y_0)^2}{4T}}, \tag{7.10}$$
and after inverse transformation to original variables we get
$$p(x, t) = \frac{1}{\sqrt{4\pi Dt}} e^{-\frac{(x - x_0 - vt)^2}{4Dt}}. \tag{7.11}$$
Figure 7.1 illustrates the dynamics given by the Gaussian moving and widening profile (7.11). The mean value (first moment) changes linearly in time
$$\langle x(t) \rangle = x_0 + vt, \tag{7.12}$$
as does the variance (second moment minus the squared first moment)
$$\langle x^2(t) \rangle - \langle x(t) \rangle^2 = 2Dt. \tag{7.13}$$
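That the moving Gaussian (7.11) solves the drift–diffusion equation (7.1) can be checked numerically with central finite differences. The sketch below uses the parameters of Figure 7.1 (v = 1, D = 1, x0 = 0.5); the step sizes h and dt and the function names are assumptions chosen here.

```python
import numpy as np

def p_drift(x, t, x0=0.5, v=1.0, D=1.0):
    """Moving and widening Gaussian profile (7.11)."""
    return np.exp(-(x - x0 - v * t) ** 2 / (4.0 * D * t)) / np.sqrt(4.0 * np.pi * D * t)

def fokker_planck_residual(x, t, v=1.0, D=1.0, h=1e-3, dt=1e-4):
    """Residual dp/dt + v dp/dx - D d2p/dx2, estimated by central differences."""
    dp_dt = (p_drift(x, t + dt, v=v, D=D) - p_drift(x, t - dt, v=v, D=D)) / (2.0 * dt)
    dp_dx = (p_drift(x + h, t, v=v, D=D) - p_drift(x - h, t, v=v, D=D)) / (2.0 * h)
    d2p_dx2 = (p_drift(x + h, t, v=v, D=D) - 2.0 * p_drift(x, t, v=v, D=D)
               + p_drift(x - h, t, v=v, D=D)) / h ** 2
    return dp_dt + v * dp_dx - D * d2p_dx2

x = np.linspace(-4.0, 8.0, 241)
for t in (1.0, 3.0):
    residual = np.max(np.abs(fokker_planck_residual(x, t)))
    print(f"t = {t}: max |residual| = {residual:.2e}")
```

The residual is limited only by the discretization error of the finite differences, so it should be many orders of magnitude smaller than the individual terms of the equation.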
Figure 7.1 Evolution of the probability density p(x, t) of drift–diffusion dynamics for different time moments (t = 0.1 s, 1.0 s and 3.0 s). Parameters: v = 1 m s⁻¹; D = 1 m² s⁻¹; x0 = 0.5 m.
7.2 Drift–Diffusion Problem with Absorbing and Reflecting Boundaries
Let us consider the initial boundary-value problem (shown schematically in Figure 7.2) with constant diffusion coefficient D and constant drift coefficient v. Our task is to calculate analytically the probability density p(x, t) of finding the system in state x (exactly in the small interval [x; x + dx]) at time t. The state space is defined as closed on the left-hand side and open on the right-hand side. Due to these properties we introduce boundary conditions which determine the behavior of the solution. Another important quantity is the outflow (or breakdown) probability density at the right border which is found from the solution of the Fokker–Planck equation using the balance equation [22, 82, 83, 185]. Applications for the calculated outflow probability are the many-car systems used to define and describe traffic breakdown on roads, depending stochastically on the vehicular density [111, 112, 151]. The dynamics of p(x, t) is given by the drift–diffusion equation, as well as the initial and boundary conditions:
1. The drift–diffusion equation (Fokker–Planck dynamics, see (7.1))
$$\frac{\partial p(x,t)}{\partial t} = -v \frac{\partial p(x,t)}{\partial x} + D \frac{\partial^2 p(x,t)}{\partial x^2}. \tag{7.14}$$
2. The initial condition (delta function)
$$p(x, t = 0) = \delta(x - x_0). \tag{7.15}$$
3. The reflecting boundary at x = a (flux j vanishes at left border)
$$j(x = a, t) = v \, p(x = a, t) - D \left. \frac{\partial p(x,t)}{\partial x} \right|_{x=a} = 0. \tag{7.16}$$
4. The absorbing boundary at x = b (probability density p vanishes at the right border)
$$p(x = b, t) = 0. \tag{7.17}$$

Figure 7.2 Schematic picture of the boundary-value problem showing the probability density p(x, t) in the finite interval a ≤ x ≤ b, with the reflecting boundary at x = a, the initial condition at x = x0 and the absorbing boundary at x = b.

The balance equation in our open system, see Figure 7.2,
$$\frac{\partial}{\partial t} \int_a^b p(x,t) \, dx + P(t, x = b) = 0 \tag{7.18}$$
relates the probability that the system is still in a state x ∈ [a, b] with the probability flux P(t, x = b) out of this interval at the right absorbing boundary x = b at time t (exactly in the interval [t; t + dt]).
7.3 Dimensionless Drift–Diffusion Equation
It is convenient to formulate the drift–diffusion problem in dimensionless variables. For this purpose we define a new variable $0 \le y \le 1$ instead of $a \le x \le b$ by
\[ y = \frac{x-a}{b-a}, \tag{7.19} \]
a new time $T$ by
\[ T = \frac{D}{(b-a)^2}\,t, \tag{7.20} \]
one dimensionless control parameter $\Omega$ (the scaled drift $v$, which may have positive, zero, or negative values)
\[ \Omega = \frac{v}{D}\,(b-a), \tag{7.21} \]
a new probability density $P(y, T)$ by
\[ P(y,T)\,dy = p(x,t)\,dx \tag{7.22} \]
and therefore
\[ P(y,T) = (b-a)\,p(x,t). \tag{7.23} \]
As a result, (7.14)–(7.17) can be rewritten as follows:
1. The drift–diffusion equation (Fokker–Planck dynamics)
\[ \frac{\partial P(y,T)}{\partial T} = -\Omega\,\frac{\partial P(y,T)}{\partial y} + \frac{\partial^2 P(y,T)}{\partial y^2}. \tag{7.24} \]
2. The initial condition (delta function)
\[ P(y, T=0) = \delta(y - y_0). \tag{7.25} \]
3. The reflecting boundary at $y = 0$ (flux $J$ vanishes at the left border)
\[ J(y=0,T) = \Omega\,P(y=0,T) - \left.\frac{\partial P(y,T)}{\partial y}\right|_{y=0} = 0. \tag{7.26} \]
4. The absorbing boundary at $y = 1$ (probability density $P$ vanishes at the right border)
\[ P(y=1,T) = 0. \tag{7.27} \]
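The change of variables (7.19)–(7.23) can be sketched in a few lines of Python. This is only an illustration: the names `ScaledProblem`, `to_dimensionless` and `density_to_original` are hypothetical helpers introduced here, and the numerical values in the usage line are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaledProblem:
    """Dimensionless position y, time T and drift parameter omega (hypothetical container)."""
    y: float
    T: float
    omega: float


def to_dimensionless(x, t, v, D, a, b):
    """Apply (7.19)-(7.21) to a point (x, t) of the original problem."""
    if b <= a or D <= 0.0:
        raise ValueError("need b > a and D > 0")
    length = b - a
    return ScaledProblem(
        y=(x - a) / length,      # (7.19)
        T=D * t / length ** 2,   # (7.20)
        omega=v * length / D,    # (7.21)
    )


def density_to_original(P, a, b):
    """Invert (7.23): p(x, t) = P(y, T) / (b - a)."""
    return P / (b - a)


# Arbitrary illustrative values, not taken from the text.
scaled = to_dimensionless(x=3.0, t=2.0, v=0.5, D=2.0, a=1.0, b=5.0)
print(scaled)
```

Note that the three physical parameters $v$, $D$ and $b - a$ enter the scaled problem only through the single combination $\Omega$.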
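Our aim is to calculate the first-passage time distribution function P(t) as the probability flux out of the system, given in dimensionless variables by
\[ P(T, y=1) = -\frac{\partial}{\partial T}\,G(T) \quad\text{with}\quad G(T) = \int_0^1 P(y,T)\,dy. \tag{7.28} \]
To switch over to cumulative probability functions we define
\[ W(T \le T_{\mathrm{obs}}) = \int_0^{T_{\mathrm{obs}}} P(T, y=1)\,dT \tag{7.29} \]
and, using (7.28),
\[ W(T \le T_{\mathrm{obs}}) = -\int_0^{T_{\mathrm{obs}}} \frac{\partial G(T)}{\partial T}\,dT = -G(T_{\mathrm{obs}}) + G(0), \tag{7.30} \]
calculate, with $G(0) = 1$ for the normalized initial condition (7.25), the following relationship
\[ W(T \le T_{\mathrm{obs}}) = 1 - G(T_{\mathrm{obs}}). \tag{7.31} \]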
Both (7.28) (differential form) and (7.31) (integral form) express the balance between the probability $G(T)$ that is still inside the system at time $T$ and either the flux $P(T, y = 1)$ passing through the boundary at that moment or the amount of probability $W(T \le T_{\mathrm{obs}})$ that has already left the system during the time interval $0 \le T \le T_{\mathrm{obs}}$. By analogy with the statistics of the durability of technical components we may call the probability distribution $G(T_{\mathrm{obs}})$ the survival or life-time function, indicating that the equipment or particle is still alive at the observation time $T_{\mathrm{obs}}$. The breakdown function $W(T \le T_{\mathrm{obs}})$ gives the probability distribution for the components which have already broken down between the beginning and the observation time $T_{\mathrm{obs}}$. In this context, the cumulative life-time distribution $G(T_{\mathrm{obs}})$ is often estimated on the basis of experiments of limited duration [26].

7.4 Solution in Terms of Orthogonal Eigenfunctions
To find the solution of the well-defined drift–diffusion problem, we first take the dimensionless form (7.24)–(7.27) and transform to a new function $Q$ by
\[ Q(y,T) = e^{-\frac{\Omega}{2}y}\,P(y,T). \tag{7.32} \]
This results in a dynamics without a first derivative, called the reduced Fokker–Planck equation
\[ \frac{\partial Q(y,T)}{\partial T} = -\frac{\Omega^2}{4}\,Q(y,T) + \frac{\partial^2 Q(y,T)}{\partial y^2}. \tag{7.33} \]
According to (7.32) the initial condition is transformed to
\[ Q(y, T=0) = e^{-\frac{\Omega}{2}y_0}\,P(y, T=0), \tag{7.34} \]
whereas the reflecting boundary condition at $y = 0$ becomes
\[ \frac{\Omega}{2}\,Q(y=0,T) - \left.\frac{\partial Q(y,T)}{\partial y}\right|_{y=0} = 0, \tag{7.35} \]
and the absorbing boundary condition at $y = 1$ now reads
\[ Q(y=1,T) = 0. \tag{7.36} \]
The solution of the reduced equation (7.33) can be found by the method of separation of variables. Making a separation ansatz $Q(y, T) = \chi(T)\psi(y)$, we obtain
\[ \frac{1}{\chi(T)}\frac{d\chi(T)}{dT} = -\frac{\Omega^2}{4} + \frac{1}{\psi(y)}\frac{d^2\psi(y)}{dy^2}. \tag{7.37} \]
Both sides should be equal to a constant. This constant is denoted by $-\lambda$, where $\lambda$ has the meaning of an eigenvalue. The eigenvalue $\lambda$ should be real and non-negative. Integration of the left-hand side gives the exponential decay
\[ \chi(T) = \chi_0 \exp\{-\lambda T\} \tag{7.38} \]
with $\chi(T = 0) = \chi_0$, where the constant $\chi_0$ can be set equal to 1. Let us now define the dimensionless wave number $k$ by $k^2 = \lambda$. The right-hand side of (7.37) then transforms into the following wave equation
\[ \frac{d^2\psi(y)}{dy^2} + \left(k^2 - \frac{\Omega^2}{4}\right)\psi(y) = 0. \tag{7.39} \]
Further on, we introduce a modified wave number $\tilde{k}$ by $\tilde{k}^2 = k^2 - \Omega^2/4$. Note that $\tilde{k} = +\sqrt{k^2 - \Omega^2/4}$ may be complex (either purely real or purely imaginary). First we consider the case where $\tilde{k}$ is real. A suitable complex ansatz for the solution of the wave equation (7.39) reads
\[ \psi(y) = C^{*}\exp\{+i\tilde{k}y\} + C\exp\{-i\tilde{k}y\} \tag{7.40} \]
with complex coefficients $C = A/2 + iB/2$ and $C^{*} = A/2 - iB/2$ chosen in such a way as to ensure a real solution
\[ \psi(y) = A\cos(\tilde{k}y) + B\sin(\tilde{k}y). \tag{7.41} \]
The two boundary conditions (7.35) and (7.36) can be used to determine the modified wave number $\tilde{k}$ and the ratio $A/B$. The particular solutions are eigenfunctions $\psi_m(y)$, which form a complete set of orthogonal functions. As the third condition, we require that these eigenfunctions are normalized
\[ \int_0^1 \psi_m^2(y)\,dy = 1. \tag{7.42} \]
In this case all three parameters $\tilde{k}$, $A$, and $B$ are defined. The condition for the left boundary (7.35) reads
\[ \frac{\Omega}{2}\,\psi(y=0) - \left.\frac{d\psi(y)}{dy}\right|_{y=0} = 0. \tag{7.43} \]
After substitution of (7.40) this reduces to
\[ \frac{\Omega}{2}\,(C^{*} + C) = i\tilde{k}\,(C^{*} - C) \tag{7.44} \]
or
\[ \frac{\Omega}{2}\,A = \tilde{k}B. \tag{7.45} \]
The condition for the right boundary (7.36)
\[ \psi(y=1) = 0 \tag{7.46} \]
gives us
\[ C^{*}\exp\{+i\tilde{k}\} + C\exp\{-i\tilde{k}\} = 0 \tag{7.47} \]
or
\[ A\cos\tilde{k} + B\sin\tilde{k} = 0. \tag{7.48} \]
By putting both equalities (7.45) and (7.48) together and looking for a nontrivial solution, we arrive at a transcendental equation
\[ i\,\frac{\Omega}{2}\left(\exp\{+i\tilde{k}\} - \exp\{-i\tilde{k}\}\right) = \tilde{k}\left(\exp\{+i\tilde{k}\} + \exp\{-i\tilde{k}\}\right) \tag{7.49} \]
or
\[ \frac{\Omega}{2}\sin\tilde{k} + \tilde{k}\cos\tilde{k} = 0, \tag{7.50} \]
and, respectively,
\[ \tan\tilde{k} = -\frac{2}{\Omega}\,\tilde{k}, \tag{7.51} \]
which gives the spectrum of values $\tilde{k}_m$ with $m = 0, 1, 2, \ldots$ (numbered in such a way that $0 < \tilde{k}_0 < \tilde{k}_1 < \tilde{k}_2 < \ldots$) and the discrete eigenvalues $\lambda_m > 0$. Due to (7.41) and (7.48), the eigenfunctions can be written as
\[ \psi_m(y) = R_m\left[\cos(\tilde{k}_m y)\sin\tilde{k}_m - \cos\tilde{k}_m\,\sin(\tilde{k}_m y)\right], \tag{7.52} \]
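The transcendental equation (7.50) has no closed-form roots, so in practice the wave numbers are obtained numerically. The following is a minimal sketch using plain bisection from the standard library only; the function names (`bisect`, `k_tilde`, `kappa_0`, `eigenvalue`) and the bracketing intervals are choices made here for illustration, not something prescribed by the text. The brackets rely on the observation that the left-hand side of (7.50) changes sign between $m\pi$ and $(m+1)\pi$.

```python
import math


def bisect(f, lo, hi, steps=100):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def k_tilde(omega, m):
    """Real root of (7.50) in (m*pi, (m+1)*pi); m = 0 requires omega > -2."""
    if m == 0 and omega <= -2.0:
        raise ValueError("no real ground-state root for omega <= -2; use kappa_0")

    def f(k):
        return 0.5 * omega * math.sin(k) + k * math.cos(k)

    # For m = 0 start slightly above zero to skip the trivial root k = 0.
    lo = m * math.pi if m > 0 else 1e-9
    return bisect(f, lo, (m + 1) * math.pi)


def kappa_0(omega):
    """Root of the hyperbolic equation (7.58), bracketed in (0, |omega|/2)."""
    if omega >= -2.0:
        raise ValueError("kappa_0 exists only for omega < -2")

    def h(kappa):
        return 0.5 * omega * math.sinh(kappa) + kappa * math.cosh(kappa)

    return bisect(h, 1e-9, 0.5 * abs(omega))


def eigenvalue(omega, m):
    """lambda_m = k_m^2 + omega^2/4, or -kappa_0^2 + omega^2/4 for the ground state at omega < -2."""
    if m == 0 and omega == -2.0:
        return 1.0
    if m == 0 and omega < -2.0:
        return -kappa_0(omega) ** 2 + 0.25 * omega ** 2
    return k_tilde(omega, m) ** 2 + 0.25 * omega ** 2
```

For example, `k_tilde(0.0, 0)` reproduces $\pi/2$, the driftless ground-state wave number, and `kappa_0(-5.0)` can be compared with the corresponding entry of Table 7.1.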
where $R_m = A_m/\sin\tilde{k}_m = -B_m/\cos\tilde{k}_m$. Taking into account the identity $\sin(\alpha - \beta) = \sin\alpha\cos\beta - \cos\alpha\sin\beta$, (7.52) reduces to
\[ \psi_m(y) = R_m \sin\left[\tilde{k}_m(1-y)\right]. \tag{7.53} \]
The normalization constant $R_m$ is found by inserting (7.53) into (7.42). Calculation of the normalization integral by using the transcendental equation (7.50) gives us
\[ R_m^2\int_0^1 \sin^2\left[\tilde{k}_m(1-y)\right]dy = R_m^2\left[\frac{1}{2} - \frac{\sin 2\tilde{k}_m}{4\tilde{k}_m}\right] = \frac{R_m^2}{2}\left[1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2 + \Omega^2/4}\right] = 1, \tag{7.54} \]
and hence (7.53) becomes
\[ \psi_m(y) = \sqrt{\frac{2}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2 + \Omega^2/4}}}\;\sin\left[\tilde{k}_m(1-y)\right] \tag{7.55} \]
or
\[ \psi_m(y) = \sqrt{\frac{2}{1 + \frac{\Omega}{2}\,\frac{1}{k_m^2}}}\;\sin\left[\sqrt{k_m^2 - \Omega^2/4}\,(1-y)\right]. \tag{7.56} \]
This calculation refers to the case $\Omega > -2$, where all wave numbers $k_m$ or $\tilde{k}_m = \sqrt{k_m^2 - \Omega^2/4}$ are real and positive. However, the smallest or ground-state wave number $\tilde{k}_0$ vanishes when $\Omega$ tends to $-2$ from above, and no continuation of this solution exists on the real axis for $\Omega < -2$. A purely imaginary solution $\tilde{k}_0 = i\kappa_0$ appears instead, where $\kappa_0$ is real, see Figure 7.3. In this case (for $\Omega < -2$) a real ground-state eigenfunction $\psi_0(y)$ can be found in the form (7.40) with $C = A/2 + B/2$ and $C^{*} = A/2 - B/2$, so
\[ \psi_0(y) = A\cosh(\kappa_0 y) + B\sinh(\kappa_0 y). \tag{7.57} \]
The transcendental equation for the wave number $\tilde{k}_0 = i\kappa_0$ can be written as the following equation for $\kappa_0$
\[ \frac{\Omega}{2}\sinh(\kappa_0) + \kappa_0\cosh(\kappa_0) = 0. \tag{7.58} \]
As compared to the previous case $\Omega > -2$, trigonometric functions are replaced by the corresponding hyperbolic ones. Similar calculations as before yield
\[ \psi_0(y) = \sqrt{-\frac{2}{1 + \frac{\Omega}{2}\,\frac{1}{-\kappa_0^2 + \Omega^2/4}}}\;\sinh\left[\kappa_0(1-y)\right]. \tag{7.59} \]
Note that $\kappa_0 = -i\tilde{k}_0$ is the imaginary part of $\tilde{k}_0$ and $\kappa_0^2 = -\tilde{k}_0^2$. As regards the other solutions of (7.50), called excited states, which are those for $\tilde{k}_m$ with $m > 0$, nothing special happens at $\Omega = -2$, so that these wave numbers are always real. The situation for the ground state $m = 0$ at different values of the dimensionless drift parameter $\Omega$ is summarized in Table 7.1, which presents the solutions $\kappa_0$ of the transcendental equation (7.58) together with $\lambda_0 = -\kappa_0^2 + \Omega^2/4$ and the solutions $\tilde{k}_0$ of (7.50) together with the eigenvalues $\lambda_0 = \tilde{k}_0^2 + \Omega^2/4$. Table 7.2 shows the behavior of the lowest wave numbers $\tilde{k}_m$ with $m = 0, 1, \ldots, 5$. The results are plotted in Figure 7.4.

Figure 7.3 The wave number $\tilde{k}_0$ ($\Omega \ge -2$) respectively $\kappa_0$ ($\Omega \le -2$) and eigenvalue $\lambda_0$ for the ground state $m = 0$. The thin straight line shows the approximation $\kappa_0 \approx -\Omega/2$ valid for large negative $\Omega < -5$.

Table 7.1 The ground-state wave number $\kappa_0$ (for $\Omega \le -2$) and $\tilde{k}_0$ (for $\Omega \ge -2$) and eigenvalue $\lambda_0$ depending on the dimensionless drift parameter $\Omega$.

   Ω       κ0       λ0     |     Ω       k̃0       λ0
 −9.00    4.499    0.010   |   −2.00    0.000     1.000
 −8.50    4.248    0.015   |   −1.50    0.845     1.276
 −8.00    3.997    0.021   |   −1.00    1.165     1.608
 −7.50    3.745    0.031   |   −0.50    1.393     2.004
 −7.00    3.493    0.045   |    0.00    1.571     2.468
 −6.50    3.240    0.064   |    0.50    1.715     3.005
 −6.00    2.984    0.091   |    1.00    1.836     3.623
 −5.50    2.726    0.128   |    1.50    1.939     4.325
 −5.00    2.464    0.178   |    2.00    2.028     5.116
 −4.50    2.195    0.245   |    2.50    2.106     5.999
 −4.00    1.915    0.333   |    3.00    2.174     6.979
 −3.50    1.617    0.446   |    3.50    2.235     8.058
 −3.00    1.288    0.591   |    4.00    2.288     9.239
 −2.50    0.888    0.774   |    4.50    2.337    10.525
 −2.00    0.000    1.000   |    5.00    2.381    11.917
Table 7.2 The wave numbers $\tilde{k}_m$ ($m = 0, 1, \ldots, 5$) depending on the dimensionless drift parameter $\Omega$. For $\Omega < -2$ the entry in the row $m = 0$ is $\kappa_0$ (compare Table 7.1).

   Ω     −10.0     −5.0     −2.0     −1.0      0.0      1.0      2.0      5.0     10.0
 m = 0    4.999    2.464    0.000    1.165    1.571    1.836    2.028    2.381    2.653
 m = 1    3.790    4.172    4.493    4.604    4.712    4.816    4.913    5.163    5.454
 m = 2    7.250    7.533    7.725    7.789    7.854    7.917    7.979    8.151    8.391
 m = 3   10.553   10.767   10.904   10.949   10.995   11.040   11.085   11.214   11.408
 m = 4   13.789   13.959   14.066   14.101   14.137   14.172   14.207   14.310   14.469
 m = 5   16.992   17.133   17.220   17.249   17.279   17.308   17.336   17.421   17.556

Figure 7.4 The parameter dependence of wave numbers $\tilde{k}_m(\Omega)$ (a) and eigenvalues $\lambda_m(\Omega)$ (b) for ground state $m = 0$ and excited states $m = 1, 2, 3$.
Figure 7.5 The eigenfunction $\psi_0(y)$ for different values of control parameter $\Omega$ (curves for $\Omega = -5$, $-2.5$, $0$, and $3$).
In general (for arbitrary $\Omega$), the eigenfunctions are orthogonal and normalized:
\[ \int_0^1 \psi_l(y)\,\psi_m(y)\,dy = 0, \qquad m \ne l, \tag{7.60} \]
\[ \int_0^1 \psi_m(y)\,\psi_m(y)\,dy = 1, \qquad m = l. \tag{7.61} \]
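The orthonormality relations (7.60) and (7.61) can be checked numerically. The sketch below is restricted to $\Omega > -2$, where all eigenfunctions have the trigonometric form (7.55); it uses bisection for the roots of (7.50) and Simpson's rule for the integrals. All function names are hypothetical helpers introduced for this illustration.

```python
import math


def wave_number(omega, m, steps=100):
    """Root of (omega/2) sin k + k cos k = 0 in (m*pi, (m+1)*pi), for omega > -2."""
    def f(k):
        return 0.5 * omega * math.sin(k) + k * math.cos(k)

    lo = m * math.pi if m > 0 else 1e-9   # skip the trivial root k = 0
    hi = (m + 1) * math.pi
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def eigenfunction(omega, m):
    """Return psi_m as a callable, normalized according to (7.55)."""
    k = wave_number(omega, m)
    lam = k * k + 0.25 * omega * omega
    norm = math.sqrt(2.0 / (1.0 + 0.5 * omega / lam))
    return lambda y: norm * math.sin(k * (1.0 - y))


def simpson(g, n=2000):
    """Composite Simpson rule on [0, 1] with an even number n of subintervals."""
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def overlap(omega, l, m):
    """Numerical value of the integral of psi_l * psi_m over [0, 1]."""
    psi_l, psi_m = eigenfunction(omega, l), eigenfunction(omega, m)
    return simpson(lambda y: psi_l(y) * psi_m(y))
```

Calling `overlap(3.0, 0, 0)` should return a value close to one and `overlap(3.0, 0, 2)` a value close to zero, up to the quadrature error.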
Figure 7.5 shows the ground eigenstate ($m = 0$) for different parameter values $\Omega$, whereas Figure 7.6 gives a collection of eigenstate functions ($m = 0, 1, \ldots, 5$) for $\Omega = -5.0$ and $\Omega = 3.0$.

Figure 7.6 The eigenfunctions $\psi_m(y)$ for $m = 0, 1, 2, 3$ and for $\Omega = -5.0$ (a) and $\Omega = 3.0$ (b).

In the following, explicit formulas (where $\psi_m(y)$ is specified) are written for the case $\Omega > -2$. In order to construct the time-dependent solution for $Q(y, T)$ which fulfills the initial condition, we consider the superposition of all particular solutions with different eigenvalues $\lambda_m$
\[ Q(y,T) = \sum_{m=0}^{\infty} C_m\,e^{-\lambda_m T}\,\psi_m(y). \tag{7.62} \]
By inserting the initial condition
\[ P(y, T=0) = e^{\frac{\Omega}{2}y}\,Q(y, T=0) = \delta(y - y_0) \tag{7.63} \]
into (7.62) we obtain
\[ \sum_{m=0}^{\infty} C_m\,\psi_m(y) = e^{-\frac{\Omega}{2}y}\,\delta(y - y_0). \tag{7.64} \]
Now we expand the right-hand side of this equation by using the basis of orthonormal eigenfunctions (7.55) and identify $C_m$ with the corresponding coefficient at $\psi_m$, so
\[ C_m = \int_0^1 e^{-\frac{\Omega}{2}y}\,\delta(y - y_0)\,\psi_m(y)\,dy = e^{-\frac{\Omega}{2}y_0}\,\psi_m(y_0). \tag{7.65} \]
This allows us to write the solution for $P(y, T)$ as
\[ P(y,T) = e^{\frac{\Omega}{2}(y-y_0)}\sum_{m=0}^{\infty} e^{-\lambda_m T}\,\psi_m(y_0)\,\psi_m(y), \tag{7.66} \]
or, more specifically,
\[ P(y,T) = 2\,e^{\frac{\Omega}{2}(y-y_0)}\sum_{m=0}^{\infty} \frac{e^{-\left(\tilde{k}_m^2 + \Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2 + \Omega^2/4}}\;\sin\left[\tilde{k}_m(1-y_0)\right]\sin\left[\tilde{k}_m(1-y)\right]. \tag{7.67} \]
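To make the series (7.67) concrete, the following sketch evaluates a truncated sum for $\Omega > -2$. The truncation at 200 terms and the helper names are assumptions made here. As a plausibility check one can compare, at short times and away from both boundaries, with the free drift–diffusion Gaussian of mean $y_0 + \Omega T$ and variance $2T$, which the bounded solution should approach as long as the boundaries are not yet felt.

```python
import math


def wave_number(omega, m, steps=100):
    """Root of (omega/2) sin k + k cos k = 0 in (m*pi, (m+1)*pi), for omega > -2."""
    def f(k):
        return 0.5 * omega * math.sin(k) + k * math.cos(k)

    lo = m * math.pi if m > 0 else 1e-9   # skip the trivial root k = 0
    hi = (m + 1) * math.pi
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def density(y, T, omega, y0, terms=200):
    """Truncated series (7.67) for P(y, T); intended for omega > -2 and T > 0."""
    total = 0.0
    for m in range(terms):
        k = wave_number(omega, m)
        lam = k * k + 0.25 * omega * omega          # eigenvalue lambda_m
        weight = 1.0 / (1.0 + 0.5 * omega / lam)    # R_m^2 / 2, see (7.54)
        total += (math.exp(-lam * T) * weight
                  * math.sin(k * (1.0 - y0)) * math.sin(k * (1.0 - y)))
    return 2.0 * math.exp(0.5 * omega * (y - y0)) * total


def free_gaussian(y, T, omega, y0):
    """Unbounded drift-diffusion solution, used only as a short-time reference."""
    return math.exp(-(y - y0 - omega * T) ** 2 / (4.0 * T)) / math.sqrt(4.0 * math.pi * T)
```

The number of terms needed grows as $T$ decreases, because the factor $e^{-\lambda_m T}$ suppresses high modes only once $\lambda_m T \gg 1$.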
Figure 7.7 illustrates the time evolution of the probability density (7.67) for different values of the parameter $\Omega$.

Figure 7.7 The solution of the drift–diffusion Fokker–Planck equation with initial condition $y_0 = 0.5$ for different values of the control parameter $\Omega$: $\Omega = -5.0$ (a); $\Omega = -2.5$ (b); $\Omega = 0.1$ (c); $\Omega = 3.0$ (d). Each panel shows $P(y, T)$ versus $y$ at $T = 0.01$, $0.05$, $0.1$, and $0.5$.
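Finally, we make the inverse transformation to original variables as follows:
\[ p(x,t) = \frac{1}{b-a}\,P(y,T), \tag{7.68} \]
\[ \frac{\Omega}{2}(y - y_0) = \frac{v}{2D}(b-a)\left[\frac{x-a}{b-a} - \frac{x_0-a}{b-a}\right] = \frac{v}{2D}(x - x_0), \tag{7.69} \]
\[ \tilde{k}_m(1-y) = \tilde{k}_m\left[\frac{b-a}{b-a} - \frac{x-a}{b-a}\right] = \frac{\tilde{k}_m}{b-a}\,(b-x), \tag{7.70} \]
\[ \left(\tilde{k}_m^2 + \frac{\Omega^2}{4}\right)T = \left[\tilde{k}_m^2 + \frac{v^2}{4D^2}(b-a)^2\right]\frac{Dt}{(b-a)^2} = \left[\left(\frac{\tilde{k}_m}{b-a}\right)^2 + \left(\frac{v}{2D}\right)^2\right]Dt, \tag{7.71} \]
\[ 1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2 + \Omega^2/4} = 1 + \frac{v}{2D}(b-a)\,\frac{1}{\tilde{k}_m^2 + \left(\frac{v}{2D}\right)^2(b-a)^2} = 1 + \frac{v}{2D(b-a)}\,\frac{1}{\left(\frac{\tilde{k}_m}{b-a}\right)^2 + \left(\frac{v}{2D}\right)^2}. \tag{7.72} \]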
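Hence, the solution in original variables reads
\[ p(x,t) = \frac{2}{b-a}\,e^{\frac{v}{2D}(x-x_0)} \sum_{m=0}^{\infty} \frac{\exp\left\{-\left[\left(\frac{\tilde{k}_m}{b-a}\right)^2 + \left(\frac{v}{2D}\right)^2\right]Dt\right\}}{1 + \frac{v}{2D(b-a)}\,\frac{1}{\left(\frac{\tilde{k}_m}{b-a}\right)^2 + \left(\frac{v}{2D}\right)^2}}\; \sin\left[\frac{\tilde{k}_m}{b-a}(b-x_0)\right]\sin\left[\frac{\tilde{k}_m}{b-a}(b-x)\right], \tag{7.73} \]
where the prefactor $2/(b-a)$ follows from (7.68) and the values $\tilde{k}_m$ are solutions of the transcendental equation (7.50)
\[ \frac{1}{2}\,\frac{v}{D}\,(b-a)\sin\tilde{k}_m + \tilde{k}_m\cos\tilde{k}_m = 0. \tag{7.74} \]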
7.5 First-Passage Time Probability Density
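Let us recall the balance equation (7.18) for open systems. Using dimensionless variables, the quantity $P(T, y = 1)\,dT$, given by (7.28), is the probability that the absorbing boundary at $y = 1$ is reached for the first time within the time interval $[T, T + dT]$ (since it is forbidden to return and then reach it once again). Hence, $P(T, y = 1)$ is the first-passage time probability density, sometimes called the breakdown probability density. First we calculate the integral
\[
\begin{aligned}
G(T) &= \int_0^1 P(y,T)\,dy \\
&= 2\int_0^1 dy\; e^{\frac{\Omega}{2}(y-y_0)} \sum_{m=0}^{\infty} \frac{e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\; \sin\left[\tilde{k}_m(1-y_0)\right]\sin\left[\tilde{k}_m(1-y)\right] \\
&= 2\,e^{-\frac{\Omega}{2}y_0} \sum_{m=0}^{\infty} \frac{e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\; \sin\left[\tilde{k}_m(1-y_0)\right] \int_0^1 e^{\frac{\Omega}{2}y}\sin\left[\tilde{k}_m(1-y)\right]dy.
\end{aligned} \tag{7.75}
\]
The latter integral is transformed, with the substitution $z = 1 - y$, as follows
\[ I_m = \int_0^1 e^{\frac{\Omega}{2}y}\sin\left[\tilde{k}_m(1-y)\right]dy = e^{\frac{\Omega}{2}}\int_0^1 e^{-\frac{\Omega}{2}z}\sin\left(\tilde{k}_m z\right)dz. \tag{7.76} \]
By using the known integration formula
\[ \int e^{ax}\sin(bx)\,dx = \frac{e^{ax}}{a^2+b^2}\left[a\sin(bx) - b\cos(bx)\right], \tag{7.77} \]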
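where in our case $a = -\Omega/2$ and $b = \tilde{k}_m$, and taking into account the transcendental equation (7.50), we obtain
\[ I_m = e^{\frac{\Omega}{2}}\,\frac{\tilde{k}_m}{\tilde{k}_m^2 + \Omega^2/4}. \tag{7.78} \]
Hence, the integral (7.75) is
\[ G(T) = 2\,e^{\frac{\Omega}{2}(1-y_0)} \sum_{m=0}^{\infty} \frac{e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\; \frac{\tilde{k}_m}{\tilde{k}_m^2+\Omega^2/4}\; \sin\left[\tilde{k}_m(1-y_0)\right]. \tag{7.79} \]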
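By means of (7.79) the first-passage time probability density (7.28) can be calculated easily:
\[
\begin{aligned}
P(T, y=1) &= -\frac{\partial}{\partial T}\int_0^1 P(y,T)\,dy \\
&= -2\,e^{\frac{\Omega}{2}(1-y_0)} \sum_{m=0}^{\infty} \frac{1}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\; \frac{\tilde{k}_m \sin\left[\tilde{k}_m(1-y_0)\right]}{\tilde{k}_m^2+\Omega^2/4}\; \frac{\partial}{\partial T}\,e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T} \\
&= 2\,e^{\frac{\Omega}{2}(1-y_0)} \sum_{m=0}^{\infty} \frac{e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\; \tilde{k}_m \sin\left[\tilde{k}_m(1-y_0)\right].
\end{aligned} \tag{7.80}
\]
The outflow distribution $P(T, y = 1)$ is shown in Figure 7.8 (with different values of the dimensionless drift $\Omega$) as well as in Figure 7.9 (with different values of the initial condition $y_0$). Making the inverse transformation to original variables, we obtain
\[ P(T, y=1)\,dT = P(t, x=b)\,dt, \tag{7.81} \]
\[ P(t, x=b) = P(T, y=1)\,\frac{dT}{dt} = P(T, y=1)\,\frac{D}{(b-a)^2}, \tag{7.82} \]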
and
\[ P(t, x=b) = \frac{2D}{b-a}\,e^{\frac{v}{2D}(b-x_0)} \sum_{m=0}^{\infty} \frac{\exp\left\{-\left[\left(\frac{\tilde{k}_m}{b-a}\right)^2 + \left(\frac{v}{2D}\right)^2\right]Dt\right\}}{1 + \frac{v}{2D(b-a)}\,\frac{1}{\left(\frac{\tilde{k}_m}{b-a}\right)^2 + \left(\frac{v}{2D}\right)^2}}\; \frac{\tilde{k}_m}{b-a}\,\sin\left[\frac{\tilde{k}_m}{b-a}(b-x_0)\right]. \tag{7.83} \]

Figure 7.8 The first-passage time probability density distribution $P(T, y = 1)$ for $\Omega < -2$ (a, curves for $\Omega = -2.5$, $-5$, $-8$) and $\Omega > -2$ (b, curves for $\Omega = -1.5$, $0.1$, $3$).

Figure 7.9 Short-time behavior of first-passage time probability density distribution $P(T, y = 1)$ for different initial conditions $0 \le y_0 \le 1$ (curves for $y_0 = 0.0$, $0.3$, $0.5$, $0.6$) showing time lag.
7.6 Cumulative Breakdown Probability
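The probability that the absorbing boundary $y = 1$ is reached within a certain observation time interval $0 \le T \le T_{\mathrm{obs}}$ is given by the cumulative (breakdown) probability
\[ W(\Omega, T \le T_{\mathrm{obs}}) = \int_0^{T_{\mathrm{obs}}} P(T, y=1)\,dT \tag{7.84} \]
with (7.80)
\[ P(T, y=1) = 2\,e^{\frac{\Omega}{2}(1-y_0)} \sum_{m=0}^{\infty} \frac{e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\; \tilde{k}_m \sin\left[\tilde{k}_m(1-y_0)\right]. \tag{7.85} \]
For $T_{\mathrm{obs}} \to \infty$ we have $W \to 1$. Exchanging the order of summation and integration we can integrate (7.84) term by term. In particular, we calculate
\[ I = \int_0^{T_{\mathrm{obs}}} e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}\,dT = \left.\frac{-1}{\tilde{k}_m^2+\Omega^2/4}\,e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}\right|_0^{T_{\mathrm{obs}}} = \frac{1}{\tilde{k}_m^2+\Omega^2/4}\left[1 - e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T_{\mathrm{obs}}}\right]. \tag{7.86} \]
Hence we obtain
\[
\begin{aligned}
W(\Omega, T_{\mathrm{obs}}) &= 2\,e^{\frac{\Omega}{2}(1-y_0)} \sum_{m=0}^{\infty} \frac{1 - e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T_{\mathrm{obs}}}}{\tilde{k}_m^2+\Omega^2/4}\; \frac{1}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\; \tilde{k}_m \sin\left[\tilde{k}_m(1-y_0)\right] \\
&= 2\,e^{\frac{\Omega}{2}(1-y_0)} \sum_{m=0}^{\infty} \frac{1 - e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T_{\mathrm{obs}}}}{\tilde{k}_m^2+\Omega^2/4+\Omega/2}\; \tilde{k}_m \sin\left[\tilde{k}_m(1-y_0)\right],
\end{aligned} \tag{7.87}
\]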
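displayed in Figure 7.10 as a function of the observation time $T_{\mathrm{obs}}$ (a) and of the parameter $\Omega$ (b). Carrying out the inverse transformation to the original variables, with the observation time $t_{\mathrm{obs}}$ measured in original units, we get the same (breakdown) probability $w$ as a function of all the parameters $v$, $D$, $a$, $b$, and the initial condition $x_0$, so
\[ w\left(t_{\mathrm{obs}}; v, D, a, b, x_0\right) = 2\,e^{\frac{v}{2D}(b-x_0)} \sum_{m=0}^{\infty} \frac{1 - \exp\left\{-\left[\left(\frac{\tilde{k}_m}{b-a}\right)^2 + \left(\frac{v}{2D}\right)^2\right]D\,t_{\mathrm{obs}}\right\}}{\tilde{k}_m^2 + \left(\frac{v}{2D}\right)^2(b-a)^2 + \frac{v}{2D}(b-a)}\; \tilde{k}_m \sin\left[\frac{\tilde{k}_m}{b-a}(b-x_0)\right]. \tag{7.88} \]
In the following we want to consider an approximation for large values of the control parameter $\Omega$.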
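The cumulative breakdown probability can be evaluated numerically along the following lines. This is a sketch for $\Omega > -2$ under stated assumptions: instead of summing (7.87) directly, whose terms decay only slowly with $m$, it evaluates the survival function $G(T)$ from (7.79), whose terms decay exponentially for $T > 0$, and then uses the relation (7.31). The function names and the truncation at 200 terms are choices made here.

```python
import math


def wave_number(omega, m, steps=100):
    """Root of (omega/2) sin k + k cos k = 0 in (m*pi, (m+1)*pi), for omega > -2."""
    def f(k):
        return 0.5 * omega * math.sin(k) + k * math.cos(k)

    lo = m * math.pi if m > 0 else 1e-9   # skip the trivial root k = 0
    hi = (m + 1) * math.pi
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def survival(T, omega, y0, terms=200):
    """Survival function G(T) from the series (7.79); for omega > -2 and T > 0."""
    total = 0.0
    for m in range(terms):
        k = wave_number(omega, m)
        lam = k * k + 0.25 * omega * omega
        weight = 1.0 / (1.0 + 0.5 * omega / lam)
        total += math.exp(-lam * T) * weight * (k / lam) * math.sin(k * (1.0 - y0))
    return 2.0 * math.exp(0.5 * omega * (1.0 - y0)) * total


def breakdown_probability(T_obs, omega, y0, terms=200):
    """Cumulative probability W = 1 - G(T_obs), relation (7.31)."""
    return 1.0 - survival(T_obs, omega, y0, terms)
```

For $\Omega = 0$ and $y_0 = 0$ the wave numbers are $\tilde{k}_m = (2m+1)\pi/2$, so `survival` reduces to the classical Fourier series of driftless diffusion between a reflecting and an absorbing wall, which provides a convenient cross-check.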
7.7 The Limiting Case for Large Positive Values of the Control Parameter
Consider the parameter limit $\Omega \to +\infty$, which corresponds either to large positive drift $v$ and/or large interval $b - a$, or to a small diffusion coefficient $D$. In this case, for a given $m$, the solution of the transcendental equation can be found in the form $\tilde{k}_m = \pi(m + 1) - \varepsilon_m$, where $\varepsilon_m$ is small and positive. From the periodicity property we obtain
\[ \cos\tilde{k}_m = \cos(\pi(m+1) - \varepsilon_m) = -(-1)^m\cos(\varepsilon_m) = -(-1)^m + O\left(\varepsilon_m^2\right), \]
\[ \sin\tilde{k}_m = \sin(\pi(m+1) - \varepsilon_m) = (-1)^m\sin(\varepsilon_m) = (-1)^m\varepsilon_m + O\left(\varepsilon_m^3\right). \]
By inserting this into the transcendental equation (7.50), we obtain
\[ \varepsilon_m = \frac{2}{\Omega}\,\pi(m+1) + O\left(\Omega^{-2}\right), \tag{7.89} \]
\[ \sin(\tilde{k}_m) = (-1)^m\,\frac{2}{\Omega}\,\pi(m+1) + O\left(\Omega^{-2}\right). \tag{7.90} \]

Figure 7.10 The probability $W(\Omega, T_{\mathrm{obs}})$ (7.84), (7.87) as a function of the observation time $T_{\mathrm{obs}}$ with fixed $\Omega$ (a, curves for $\Omega = -1$, $0$, $3$) and vice versa (b, curves for $T_{\mathrm{obs}} = 0.1$, $0.3$, $0.5$).
In this approximation the normalization integral for large $\Omega$ and the initial condition $y_0 \to 0$ can be written as
\[
\begin{aligned}
I &= \int_0^{\infty} P(T, y=1)\,dT = 2\,e^{\Omega/2}\sum_{m=0}^{\infty} \frac{\tilde{k}_m \sin\tilde{k}_m}{\lambda_m + \Omega/2} \\
&\simeq e^{\Omega/2}\sum_{m=1}^{\infty} \frac{-4}{\Omega}\,\frac{(-1)^m(\pi m)^2}{\pi^2 m^2 + \Omega^2/4} = e^{\Omega/2}\sum_{m=-\infty}^{\infty} \frac{-2}{\Omega}\,\frac{(-1)^m(\pi m)^2}{\pi^2 m^2 + \Omega^2/4}.
\end{aligned} \tag{7.91}
\]
Further on we set $(-1)^m = e^{i\pi m}$ and, in a continuum approximation, replace the sum by the integral
\[ I \simeq e^{\Omega/2}\int_{-\infty}^{\infty} \frac{-2}{\Omega}\,\frac{e^{i\pi m}(\pi m)^2}{\pi^2 m^2 + \Omega^2/4}\,dm. \tag{7.92} \]
Now we choose an integration contour in the complex plane, closing it in the upper half-plane ($\mathrm{Im}\,m > 0$) at infinity, where $|e^{i\pi m}|$ is exponentially small. According to the residue theorem, this yields
\[ I = 2\pi i\sum_i \mathrm{Res}(m_i) = 2\pi i\,\mathrm{Res}(m_0), \tag{7.93} \]
where $m_0 = i\Omega/(2\pi)$ is the location of the pole in the upper half-plane, found as a root of the equation $\pi^2 m^2 + \Omega^2/4 = 0$. According to the well-known rule for a simple pole, the residue is calculated by setting $m = m_0$ in the numerator of the integrand in (7.92) and replacing the denominator with its derivative at $m = m_0$. This gives the desired result $I = 1$, that is, the considered approximation gives the correct normalization of the outflow probability density $P(T, y = 1)$ at the right boundary. The probability distribution function $P(y, T)$ given by (7.67) can also be calculated in such a continuum approximation. In this case the increment in the wave numbers is
\[ \Delta\tilde{k}_m = \tilde{k}_{m+1} - \tilde{k}_m = \pi + \varepsilon_m - \varepsilon_{m+1} \simeq \pi\left(1 - \frac{2}{\Omega}\right) \simeq \frac{\pi}{1 + 2/\Omega}. \tag{7.94} \]
Note that, in this approximation, for $\Omega \to \infty$ the normalization constant $R_m$ in (7.54) is related to the increment $\Delta\tilde{k}_m$ via
\[ R_m^2 = \frac{2}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}} \simeq \frac{2}{1 + 2/\Omega} \simeq \frac{2}{\pi}\,\Delta\tilde{k}_m. \tag{7.95} \]
Hence, (7.67) for the probability density can be written as
\[
\begin{aligned}
P(y,T) &= e^{\frac{\Omega}{2}(y-y_0)}\sum_{m=0}^{\infty} R_m^2\,e^{-\lambda_m T}\sin\left[\tilde{k}_m(1-y_0)\right]\sin\left[\tilde{k}_m(1-y)\right] \\
&\simeq \frac{2}{\pi}\,e^{\frac{\Omega}{2}(y-y_0)}\sum_{m=0}^{\infty} e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}\sin\left[\tilde{k}_m(1-y_0)\right]\sin\left[\tilde{k}_m(1-y)\right]\Delta\tilde{k}_m.
\end{aligned} \tag{7.96}
\]
In the continuum approximation we replace the sum by the integral
\[
\begin{aligned}
P(y,T) &\simeq \frac{2}{\pi}\,e^{\frac{\Omega}{2}(y-y_0)}\int_0^{\infty} e^{-\left(\tilde{k}^2+\Omega^2/4\right)T}\sin\left[\tilde{k}(1-y_0)\right]\sin\left[\tilde{k}(1-y)\right]d\tilde{k} \\
&= \frac{1}{\pi}\,e^{\frac{\Omega}{2}(y-y_0)}\int_0^{\infty} e^{-\left(\tilde{k}^2+\Omega^2/4\right)T}\left\{\cos\left[\tilde{k}(y-y_0)\right] - \cos\left[\tilde{k}(2-y-y_0)\right]\right\}d\tilde{k}.
\end{aligned} \tag{7.97}
\]
In the latter transformation we have used the trigonometric identity $\sin\alpha\sin\beta = \frac{1}{2}\left[\cos(\alpha-\beta) - \cos(\alpha+\beta)\right]$. The resulting known integrals yield
\[ P(y,T) \simeq \frac{1}{\sqrt{4\pi T}}\;e^{\frac{\Omega}{2}(y-y_0) - \frac{\Omega^2}{4}T}\left[e^{-\frac{(y-y_0)^2}{4T}} - e^{-\frac{(2-y-y_0)^2}{4T}}\right]. \tag{7.98} \]
The comparison between the exact distribution (7.67) and the approximation (7.98) is shown in Figure 7.11. For short enough times, $4T \ll (2 - y - y_0)^2$, the second term is very small. Neglecting this term, (7.98) reduces to the known exact solution for natural boundary conditions.

Figure 7.11 Comparison of the probability density $P(y, T)$ in drift–diffusion dynamics with finite boundaries for two times ($T = 0.05$ and $T = 0.2$). The parameter value is $\Omega = 3.0$; the initial condition is $y_0 = 0.5$. The solid lines represent the exact result (7.67); dotted lines show the approximation (7.98).

Based on (7.98), it is easy to calculate the probability flux
\[ J(y,T) = \Omega\,P(y,T) - \frac{\partial}{\partial y}P(y,T) \tag{7.99} \]
and the first-passage time distribution $P(T) = J(y = 1, T)$, which takes a particularly simple form
\[ P(T) = \frac{1-y_0}{\sqrt{4\pi T^3}}\;e^{-\frac{(1-y_0-\Omega T)^2}{4T}}. \tag{7.100} \]
The cumulative breakdown probability (7.84) is then
\[ W(\Omega, T \le T_{\mathrm{obs}}) = \int_0^{T_{\mathrm{obs}}} \frac{1-y_0}{\sqrt{4\pi T^3}}\;e^{-\frac{(1-y_0-\Omega T)^2}{4T}}\,dT. \tag{7.101} \]
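A quick numerical check of the approximate first-passage density (7.100) is to confirm that it is normalized for positive drift. The sketch below integrates it with Simpson's rule over a finite window; the upper limit of 10 and the number of subintervals are assumptions chosen so that the neglected tail is negligible for the parameter values used, and the function names are hypothetical.

```python
import math


def fpt_density_large_omega(T, omega, y0):
    """Approximate first-passage time density (7.100); set to zero for T <= 0."""
    if T <= 0.0:
        return 0.0
    distance = 1.0 - y0
    return (distance / math.sqrt(4.0 * math.pi * T ** 3)
            * math.exp(-(distance - omega * T) ** 2 / (4.0 * T)))


def simpson(g, lo, hi, n=40000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(lo + i * h)
    return total * h / 3.0


def cumulative_large_omega(T_obs, omega, y0):
    """Approximate cumulative breakdown probability (7.101)."""
    return simpson(lambda T: fpt_density_large_omega(T, omega, y0), 0.0, T_obs)
```

With $\Omega = 3$ and $y_0 = 0.5$, `cumulative_large_omega(10.0, 3.0, 0.5)` should be very close to one, in line with the normalization $I = 1$ obtained from the residue calculation.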
7.8 A Brief Survey of the Exact Solution
Finally, we want to summarize the exact results written in dimensionless form.
7.8.1 Probability Density
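The general solution of the initial-boundary-value Fokker–Planck equation (7.24) reads (7.67)
\[ P(y,T) = e^{\frac{\Omega}{2}(y-y_0)}\sum_{m=0}^{\infty} e^{-\lambda_m T}\,\psi_m(y_0)\,\psi_m(y) \tag{7.102} \]
with eigenfunctions (7.55) and (7.59) of the ground state ($m = 0$)
\[
\psi_0(y) =
\begin{cases}
\sqrt{\dfrac{2}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_0^2+\Omega^2/4}}}\;\sin\left[\tilde{k}_0(1-y)\right], & \Omega > -2 \\[2ex]
\sqrt{3}\,(1-y), & \Omega = -2 \\[1ex]
\sqrt{-\dfrac{2}{1 + \frac{\Omega}{2}\,\frac{1}{-\kappa_0^2+\Omega^2/4}}}\;\sinh\left[\kappa_0(1-y)\right], & \Omega < -2
\end{cases} \tag{7.103}
\]
and all other eigenfunctions (7.55)
\[ \psi_m(y) = \sqrt{\frac{2}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}}\;\sin\left[\tilde{k}_m(1-y)\right], \qquad m = 1, 2, \ldots. \tag{7.104} \]
The eigenvalue of the ground state ($m = 0$) is given by
\[
\lambda_0 =
\begin{cases}
\tilde{k}_0^2 + \Omega^2/4, & \Omega > -2 \\
1, & \Omega = -2 \\
-\kappa_0^2 + \Omega^2/4, & \Omega < -2
\end{cases} \tag{7.105}
\]
and all others are
\[ \lambda_m = \tilde{k}_m^2 + \Omega^2/4, \qquad m = 1, 2, \ldots, \tag{7.106} \]
where the wave numbers are calculated from the transcendental equation (7.51)
\[ \tilde{k}_0:\quad \tan\tilde{k}_0 = -\frac{2}{\Omega}\,\tilde{k}_0, \qquad \Omega > -2, \tag{7.107} \]
\[ \kappa_0:\quad \tanh\kappa_0 = -\frac{2}{\Omega}\,\kappa_0, \qquad \Omega < -2, \tag{7.108} \]
\[ \tilde{k}_m:\quad \tan\tilde{k}_m = -\frac{2}{\Omega}\,\tilde{k}_m, \qquad m = 1, 2, \ldots. \tag{7.109} \]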
7.8.2 Outflow Probability Density
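The first-passage time probability density distribution $P$ depending on $\Omega$ reads as follows:
1. $\Omega > -2$ (see (7.80))
\[ P(T, y=1) = 2\,e^{\frac{\Omega}{2}(1-y_0)}\sum_{m=0}^{\infty} \frac{e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\;\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]. \tag{7.110} \]
2. $\Omega = -2$
\[ P(T, y=1) = e^{-(1-y_0)}\left\{3(1-y_0)\,e^{-T} + 2\sum_{m=1}^{\infty} \frac{e^{-\left(\tilde{k}_m^2+1\right)T}}{1 - \frac{1}{\tilde{k}_m^2+1}}\;\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]\right\}. \tag{7.111} \]
3. $\Omega < -2$
\[ P(T, y=1) = 2\,e^{\frac{\Omega}{2}(1-y_0)}\left\{-\frac{e^{-\left(-\kappa_0^2+\Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{-\kappa_0^2+\Omega^2/4}}\;\kappa_0\sinh\left[\kappa_0(1-y_0)\right] + \sum_{m=1}^{\infty} \frac{e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T}}{1 + \frac{\Omega}{2}\,\frac{1}{\tilde{k}_m^2+\Omega^2/4}}\;\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]\right\}. \tag{7.112} \]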
7.8.3 First Moment of the Outflow Probability Density
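The mean first-passage time (or average time of breakdown) $T_1$ corresponding to the breakdown probability density distribution $P$ is
\[ T_1 = \int_0^{\infty} T\,P(T, y=1)\,dT = -\int_0^{\infty} T\,\frac{\partial}{\partial T}G(T)\,dT \tag{7.113} \]
or, after integration by parts,
\[ T_1 = \int_0^{\infty} G(T)\,dT \quad\text{with}\quad G(T) = \int_0^1 P(y,T)\,dy. \tag{7.114} \]
The mean value $T_1$ depending on the control parameter $\Omega$ reads:
1. $\Omega > -2$
\[ T_1 = 2\,e^{\frac{\Omega}{2}(1-y_0)}\sum_{m=0}^{\infty} \frac{\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]}{\left(\tilde{k}_m^2+\Omega^2/4+\Omega/2\right)\left(\tilde{k}_m^2+\Omega^2/4\right)}. \tag{7.115} \]
2. $\Omega = -2$, where $\tilde{k}_m^2 + \Omega^2/4 + \Omega/2 = \tilde{k}_m^2$ and $\tilde{k}_m^2 + \Omega^2/4 = \tilde{k}_m^2 + 1$,
\[ T_1 = e^{-(1-y_0)}\left\{3(1-y_0) + 2\sum_{m=1}^{\infty} \frac{\sin\left[\tilde{k}_m(1-y_0)\right]}{\tilde{k}_m\left(\tilde{k}_m^2+1\right)}\right\}. \tag{7.116} \]
3. $\Omega < -2$
\[ T_1 = 2\,e^{\frac{\Omega}{2}(1-y_0)}\left\{-\frac{\kappa_0\sinh\left[\kappa_0(1-y_0)\right]}{\left(-\kappa_0^2+\Omega^2/4+\Omega/2\right)\left(-\kappa_0^2+\Omega^2/4\right)} + \sum_{m=1}^{\infty} \frac{\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]}{\left(\tilde{k}_m^2+\Omega^2/4+\Omega/2\right)\left(\tilde{k}_m^2+\Omega^2/4\right)}\right\}. \tag{7.117} \]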
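The series (7.115) for the mean first-passage time can be cross-checked against a closed form. The sketch below is restricted to $\Omega > -2$. The closed-form expression is not taken from the text above: it is the solution of the ordinary differential equation $T_1'' + \Omega T_1' = -1$ with $T_1'(0) = 0$ and $T_1(1) = 0$, which is assumed here as the standard backward-equation result for the mean first-passage time. All function names and the truncation at 400 terms are likewise assumptions of this illustration.

```python
import math


def wave_number(omega, m, steps=100):
    """Root of (omega/2) sin k + k cos k = 0 in (m*pi, (m+1)*pi), for omega > -2."""
    def f(k):
        return 0.5 * omega * math.sin(k) + k * math.cos(k)

    lo = m * math.pi if m > 0 else 1e-9   # skip the trivial root k = 0
    hi = (m + 1) * math.pi
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def mean_fpt_series(omega, y0, terms=400):
    """Mean first-passage time from the series (7.115), for omega > -2."""
    total = 0.0
    for m in range(terms):
        k = wave_number(omega, m)
        lam = k * k + 0.25 * omega * omega
        total += k * math.sin(k * (1.0 - y0)) / ((lam + 0.5 * omega) * lam)
    return 2.0 * math.exp(0.5 * omega * (1.0 - y0)) * total


def mean_fpt_closed_form(omega, y0):
    """Assumed reference: solution of T'' + omega T' = -1, T'(0) = 0, T(1) = 0."""
    if abs(omega) < 1e-8:
        return 0.5 * (1.0 - y0 * y0)      # driftless limit
    return (1.0 - y0) / omega + (math.exp(-omega) - math.exp(-omega * y0)) / omega ** 2
```

For $\Omega = 0$ and $y_0 = 0$ both functions return values close to 0.5, the driftless result quoted for Figure 7.12.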
The analytical expressions for the mean breakdown time $T_1$ depending on the control parameter $\Omega$ at fixed initial value $y_0$ are summarized in Figure 7.12. The straight line $T_1 = 0.5$ shows the result for $\Omega = 0$ (driftless diffusion).

Figure 7.12 Mean first-passage time $T_1(y_0 = 0)$ as a function of the drift parameter $\Omega$.

7.8.4 Second Moment of the Outflow Probability Density

By definition, the second moment is
\[ T_2 = \int_0^{\infty} T^2\,P(T, y=1)\,dT \tag{7.118} \]
or
\[ T_2 = 2\int_0^{\infty} T\,G(T)\,dT \quad\text{with}\quad G(T) = \int_0^1 P(y,T)\,dy. \tag{7.119} \]
The second moment $T_2$ depending on the control parameter $\Omega$ reads:
1. $\Omega > -2$
\[ T_2 = 4\,e^{\frac{\Omega}{2}(1-y_0)}\sum_{m=0}^{\infty} \frac{\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]}{\left(\tilde{k}_m^2+\Omega^2/4+\Omega/2\right)\left(\tilde{k}_m^2+\Omega^2/4\right)^2}. \tag{7.120} \]
2. $\Omega = -2$
\[ T_2 = 2\,e^{-(1-y_0)}\left\{3(1-y_0) + 2\sum_{m=1}^{\infty} \frac{\sin\left[\tilde{k}_m(1-y_0)\right]}{\tilde{k}_m\left(\tilde{k}_m^2+1\right)^2}\right\}. \tag{7.121} \]
3. $\Omega < -2$
\[ T_2 = 4\,e^{\frac{\Omega}{2}(1-y_0)}\left\{-\frac{\kappa_0\sinh\left[\kappa_0(1-y_0)\right]}{\left(-\kappa_0^2+\Omega^2/4+\Omega/2\right)\left(-\kappa_0^2+\Omega^2/4\right)^2} + \sum_{m=1}^{\infty} \frac{\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]}{\left(\tilde{k}_m^2+\Omega^2/4+\Omega/2\right)\left(\tilde{k}_m^2+\Omega^2/4\right)^2}\right\}. \tag{7.122} \]
7.8.5 Outflow Probability
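The cumulative probability $W$ depending on $\Omega$ is given by:
1. $\Omega > -2$ (see (7.87))
\[ W(\Omega, T_{\mathrm{obs}}) = 2\,e^{\frac{\Omega}{2}(1-y_0)}\sum_{m=0}^{\infty} \frac{1 - e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T_{\mathrm{obs}}}}{\tilde{k}_m^2+\Omega^2/4+\Omega/2}\;\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]. \tag{7.123} \]
2. $\Omega = -2$
\[ W(\Omega, T_{\mathrm{obs}}) = e^{-(1-y_0)}\left\{3\left(1 - e^{-T_{\mathrm{obs}}}\right)(1-y_0) + 2\sum_{m=1}^{\infty} \frac{1 - e^{-\left(\tilde{k}_m^2+1\right)T_{\mathrm{obs}}}}{\tilde{k}_m}\;\sin\left[\tilde{k}_m(1-y_0)\right]\right\}. \tag{7.124} \]
3. $\Omega < -2$
\[ W(\Omega, T_{\mathrm{obs}}) = 2\,e^{\frac{\Omega}{2}(1-y_0)}\left\{-\frac{1 - e^{-\left(-\kappa_0^2+\Omega^2/4\right)T_{\mathrm{obs}}}}{-\kappa_0^2+\Omega^2/4+\Omega/2}\;\kappa_0\sinh\left[\kappa_0(1-y_0)\right] + \sum_{m=1}^{\infty} \frac{1 - e^{-\left(\tilde{k}_m^2+\Omega^2/4\right)T_{\mathrm{obs}}}}{\tilde{k}_m^2+\Omega^2/4+\Omega/2}\;\tilde{k}_m\sin\left[\tilde{k}_m(1-y_0)\right]\right\}. \tag{7.125} \]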
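Going back to the original variables $t = t_{\mathrm{obs}}$ including the parameters $v$, $D$, $a$, $b$, $x_0$ and defining the modified wave numbers (in units of m$^{-1}$) as $\hat{k}_m = \tilde{k}_m/(b - a)$ and $\hat{\kappa}_0 = \kappa_0/(b - a)$, the cumulative (breakdown) probability $w(t = t_{\mathrm{obs}}; v, D, a, b, x_0)$ has the form:
1. $\dfrac{v(b-a)}{D} > -2$ (see (7.88))
\[ w = \frac{2}{b-a}\,e^{\frac{v}{2D}(b-x_0)}\sum_{m=0}^{\infty} \frac{1 - \exp\left\{-D\left[\hat{k}_m^2 + \left(\frac{v}{2D}\right)^2\right]t_{\mathrm{obs}}\right\}}{\hat{k}_m^2 + \left(\frac{v}{2D}\right)^2 + \frac{v}{2D(b-a)}}\;\hat{k}_m\sin\left[\hat{k}_m(b-x_0)\right]. \tag{7.126} \]
2. $\dfrac{v(b-a)}{D} = -2$
\[ w = e^{-\frac{b-x_0}{b-a}}\left\{3\left(1 - e^{-\frac{D\,t_{\mathrm{obs}}}{(b-a)^2}}\right)\frac{b-x_0}{b-a} + \frac{2}{b-a}\sum_{m=1}^{\infty} \frac{1 - \exp\left\{-D\left[\hat{k}_m^2 + \frac{1}{(b-a)^2}\right]t_{\mathrm{obs}}\right\}}{\hat{k}_m}\;\sin\left[\hat{k}_m(b-x_0)\right]\right\}. \tag{7.127} \]
3. $\dfrac{v(b-a)}{D} < -2$
\[
\begin{aligned}
w = \frac{2}{b-a}\,e^{\frac{v}{2D}(b-x_0)}\Bigg\{&-\frac{1 - \exp\left\{-D\left[-\hat{\kappa}_0^2 + \left(\frac{v}{2D}\right)^2\right]t_{\mathrm{obs}}\right\}}{-\hat{\kappa}_0^2 + \left(\frac{v}{2D}\right)^2 + \frac{v}{2D(b-a)}}\;\hat{\kappa}_0\sinh\left[\hat{\kappa}_0(b-x_0)\right] \\
&+ \sum_{m=1}^{\infty} \frac{1 - \exp\left\{-D\left[\hat{k}_m^2 + \left(\frac{v}{2D}\right)^2\right]t_{\mathrm{obs}}\right\}}{\hat{k}_m^2 + \left(\frac{v}{2D}\right)^2 + \frac{v}{2D(b-a)}}\;\hat{k}_m\sin\left[\hat{k}_m(b-x_0)\right]\Bigg\}
\end{aligned} \tag{7.128}
\]
with the wave numbers determined by, compare (7.107)–(7.109),
\[ \hat{k}_0:\quad \tan\left[\hat{k}_0(b-a)\right] = -\frac{2D}{v}\,\hat{k}_0, \qquad \frac{v(b-a)}{D} > -2, \tag{7.129} \]
\[ \hat{\kappa}_0:\quad \tanh\left[\hat{\kappa}_0(b-a)\right] = -\frac{2D}{v}\,\hat{\kappa}_0, \qquad \frac{v(b-a)}{D} < -2, \tag{7.130} \]
\[ \hat{k}_m:\quad \tan\left[\hat{k}_m(b-a)\right] = -\frac{2D}{v}\,\hat{k}_m, \qquad m = 1, 2, \ldots. \tag{7.131} \]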
7.9 Relationship to the Sturm–Liouville Theory
The particular drift–diffusion problem over a finite interval with reflecting (left) and absorbing (right) boundaries belongs to the following general mathematical theory named after Jacques Charles François Sturm (1803–1855) and Joseph Liouville (1809–1882). The classical Sturm–Liouville theory [249] considers a real second-order linear differential equation of the form
\[ -\frac{d}{dx}\left[p(x)\frac{d\psi}{dx}\right] + q(x)\,\psi = \lambda\,w(x)\,\psi \tag{7.132} \]
together with boundary conditions at the ends of the interval $[a, b]$ given by
\[ \alpha_1\,\psi(x=a) + \alpha_2\left.\frac{d\psi}{dx}\right|_{x=a} = 0, \tag{7.133} \]
\[ \beta_1\,\psi(x=b) + \beta_2\left.\frac{d\psi}{dx}\right|_{x=b} = 0. \tag{7.134} \]
The particular functions $p(x)$, $q(x)$, $w(x)$ are real and continuous on the finite interval $[a, b]$ together with specified values at the boundaries. The aim of the Sturm–Liouville problem is to find the values of $\lambda$ (called eigenvalues $\lambda_n$) for which there are nontrivial solutions of the differential equation (7.132) satisfying the boundary conditions (7.133) and (7.134). The corresponding solutions (for such $\lambda_n$) are called eigenfunctions $\psi_n(x)$ of the problem. Defining the Sturm–Liouville differential operator over the unit interval $[0, 1]$ by
\[ L\psi = -\frac{d}{dx}\left[p(x)\frac{d\psi}{dx}\right] + q(x)\,\psi \tag{7.135} \]
and making the weight $w(x)$ equal to unity ($w = 1$), the general equation (7.132) can be represented as an eigenvalue problem
\[ L\psi = \lambda\psi \tag{7.136} \]
with boundary conditions (7.133) ($a = 0$) and (7.134) ($b = 1$) written as
\[ B_0\psi = 0, \qquad B_1\psi = 0. \tag{7.137} \]
Assuming a differentiable positive function $p(x) > 0$, the Sturm–Liouville operator is called regular and it is self-adjoint, that is, it fulfills
\[ \int_0^1 L\psi_1\cdot\psi_2\,dx = \int_0^1 \psi_1\cdot L\psi_2\,dx. \tag{7.138} \]
The eigenvalues of such a self-adjoint operator are real and can be ordered as $\lambda_0 < \lambda_1 < \ldots < \lambda_n < \ldots \to \infty$; in the problem considered here they are non-negative. The corresponding eigenfunctions $\psi_n(x)$ have exactly $n$ zeros in $(0, 1)$ and form an orthonormal set
\[ \int_0^1 \psi_n(x)\,\psi_m(x)\,dx = \delta_{mn}. \tag{7.139} \]
The eigenvalues $\lambda_n$ of the classical Sturm–Liouville problem (7.132) with positive function $p(x) > 0$ as well as positive weight function $w(x) > 0$, together with separated boundary conditions (7.133) and (7.134), can be calculated by the following expression
\[ \lambda_n\int_a^b \psi_n(x)^2\,w(x)\,dx = \int_a^b \left[p(x)\left(\frac{d\psi_n(x)}{dx}\right)^2 + q(x)\,\psi_n(x)^2\right]dx - \left[p(x)\,\psi_n(x)\,\frac{d\psi_n(x)}{dx}\right]_a^b. \tag{7.140} \]
The eigenfunctions are mutually orthogonal ($m \ne n$) and usually normalized ($m = n$) in accordance with the equation
\[ \int_a^b \psi_n(x)\,\psi_m(x)\,w(x)\,dx = \delta_{mn}, \tag{7.141} \]
known as the orthogonality relation (similar to (7.139)). Coming back to the original drift–diffusion problem written in dimensionless variables over the unit interval $0 \le y \le 1$ and recalling (7.37), the separation constant $\lambda$ appears in the following differential equation
\[ -\frac{d^2\psi(y)}{dy^2} + \frac{\Omega^2}{4}\,\psi(y) = \lambda\,\psi(y), \tag{7.142} \]
which can be related to the regular Sturm–Liouville eigenvalue problem via $p(y) = 1 > 0$, $w(y) = 1 > 0$ and $q(y) = \Omega^2/4$. The boundary conditions given by (7.43) and (7.46) can be expressed as
\[ \frac{\Omega}{2}\cdot\psi(y=0) + (-1)\cdot\left.\frac{d\psi}{dy}\right|_{y=0} = 0, \tag{7.143} \]
\[ 1\cdot\psi(y=1) + 0\cdot\left.\frac{d\psi}{dy}\right|_{y=1} = 0, \tag{7.144} \]
in agreement with (7.133) and (7.134).
The previously unknown separation constant $\lambda$ has a spectrum of real positive eigenvalues which can be calculated using (7.140) from
\[ \lambda_n = \int_0^1 \left[\left(\frac{d\psi_n(y)}{dy}\right)^2 + \frac{\Omega^2}{4}\,\psi_n(y)^2\right]dy - \left[\psi_n(y)\,\frac{d\psi_n(y)}{dy}\right]_0^1 \tag{7.145} \]
taking into account the orthogonality relation (7.141)
\[ \int_0^1 \psi_n(y)\,\psi_m(y)\,dy = \delta_{mn} \quad\text{for}\quad w(y) = 1. \tag{7.146} \]
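The expression (7.145) can be tested numerically against the eigenvalues $\lambda_n = \tilde{k}_n^2 + \Omega^2/4$ found earlier. The following sketch does this for $\Omega > -2$, using the normalized eigenfunctions (7.55), their analytical derivative, and Simpson's rule; the helper names and the grid size are assumptions of this illustration.

```python
import math


def wave_number(omega, n, steps=100):
    """Root of (omega/2) sin k + k cos k = 0 in (n*pi, (n+1)*pi), for omega > -2."""
    def f(k):
        return 0.5 * omega * math.sin(k) + k * math.cos(k)

    lo = n * math.pi if n > 0 else 1e-9   # skip the trivial root k = 0
    hi = (n + 1) * math.pi
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def simpson(g, n=4000):
    """Composite Simpson rule on [0, 1] with an even number n of subintervals."""
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def rayleigh_quotient(omega, n):
    """Return (right-hand side of (7.145), k_n^2 + omega^2/4) for comparison."""
    k = wave_number(omega, n)
    lam = k * k + 0.25 * omega * omega
    norm = math.sqrt(2.0 / (1.0 + 0.5 * omega / lam))

    def psi(y):
        return norm * math.sin(k * (1.0 - y))

    def dpsi(y):
        return -norm * k * math.cos(k * (1.0 - y))

    integral = simpson(lambda y: dpsi(y) ** 2 + 0.25 * omega * omega * psi(y) ** 2)
    boundary = psi(1.0) * dpsi(1.0) - psi(0.0) * dpsi(0.0)
    return integral - boundary, lam
```

The two returned numbers should agree up to the quadrature error; for $\Omega = 3$ and $n = 0$ both can also be compared with the value of $\lambda_0$ listed in Table 7.1.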
In this way, the first-passage time problem can be solved on the basis of the Sturm–Liouville theory. An alternative approach will be considered in the following section.

7.10 Alternative Method by the Backward Fokker–Planck Equation

Here we treat the first-passage time problem based on the backward Fokker–Planck equation. Random walks inside the one-dimensional domain $(0, 1)$ are considered, assuming its boundaries $y = 0$ and $y = 1$ to be reflecting and absorbing, respectively. In addition, the random walks are assumed to undergo drift with constant rate $\Omega$; as previously, positive values of $\Omega$ denote drift directed from left to right. The backward Fokker–Planck equation for the distribution function $P(y, y_0, T)$ is of the form
\[ \frac{\partial P(y,y_0,T)}{\partial T} = \Omega\,\frac{\partial P(y,y_0,T)}{\partial y_0} + \frac{\partial^2 P(y,y_0,T)}{\partial y_0^2} \tag{7.147} \]
subject to the boundary conditions
\[ \left.\frac{\partial P(y,y_0,T)}{\partial y_0}\right|_{y_0=0} = 0, \tag{7.148} \]
\[ P(y, y_0=1, T) = 0, \tag{7.149} \]
and the initial condition
\[ P(y, y_0, T=0) = \delta(y - y_0). \tag{7.150} \]
Here, as before, y0 is the starting position of a random walker. The probability that the walker will leave the interval (0, 1) some time is equal to unity. So the probability that the walker is located inside this interval up to the current moment T is equal to the probability ,of getting to the boundary y = 1 for the + first time in the future. Therefore, if P T, y0 is the probability density of getting to the boundary y = 1 for the first time at the moment T then the following relation
∞
1 , + + , P T , y0 dT = P y, y0 , T dy (7.151a) T
0
7.10 Alternative Method by the Backward Fokker–Planck Equation
or, which is actually the same as (7.28),
1 , + + , d P T, y0 = − P y, y0 , T dy dT 0
(7.151b)
has to hold. It should be noted that the symbol P(T, y = 1) was used previously to denote the first-passage time probability and the argument y0 was omitted because it plays the role of constant parameter. In this description the variable y0 is shown explicitly and the rather formal symbol y = 1, denoting the position of the absorbing boundary, is omitted for brevity. The variable y plays the role of the parameter in (7.147) and the operators ∂T , ∂y0 are commutative. Therefore, first integrating (7.147) with respect to y over the interval (0, 1) and then acting by operator ∂+T on ,both sides, we immediately obtain the governing equation for the function P T, y0 ∂P ∂2P ∂P = 2 + . ∂T ∂y0 ∂y0
(7.152)
, + Since the boundary y = 0 is reflecting, the probability distribution P y, y0 , T does not exhibit any anomalous behavior as y0 → 0. Therefore, applying again the same method+ of conversion from the probability distribution to the first-passage , probability P T, y0 we conclude that the boundary condition (7.148) also holds for + , the function P T, y0 , so , + ∂P T, y0 = 0. (7.153) ∂y0 y0 =0
At the initial time T = 0 the probability of reaching the boundary y = 1 from any internal point $y_0 \in [0, 1)$ is equal to zero, so
$$P(T=0, y_0) = 0 \quad \text{for } y_0 < 1 \tag{7.154}$$
is the initial condition for $P(T, y_0)$.
At the absorbing boundary $y_0 = 1$ the probability density $P(T, y_0)$ exhibits a singularity. Indeed, on the one hand, if the walker was born exactly at the point $y_0 = 1$, then it will be absorbed immediately and, thus,
$$P(T, y_0 = 1) = 0 \quad \text{for } T > 0. \tag{7.155}$$
On the other hand, for a point $y_0 < 1$ located arbitrarily near this boundary, equality (7.154) holds, so
$$\lim_{y_0 \to 1-0} P(T, y_0) = 0. \tag{7.156}$$
In addition, the characteristic time interval during which such a walker will be absorbed at the boundary $y_0 = 1$ tends to zero as $y_0 \to 1$. The latter feature is equivalent to the equality
$$\lim_{\tau\to+0}\,\lim_{y_0\to 1-0}\int_0^\tau P(T, y_0)\,dT = 1. \tag{7.157}$$
7 Bounded Drift–Diffusion Motion
In the general case we can only state that the walker definitely leaves the interval (0, 1) at some time, so for an arbitrary point $y_0$ we have
$$\int_0^\infty P(T, y_0)\,dT = 1. \tag{7.158}$$
To tackle this singularity let us convert from the function $P(T, y_0)$ to its Laplace transform
$$L(s, y_0) = \int_0^\infty e^{-sT} P(T, y_0)\,dT. \tag{7.159}$$
Performing the Laplace transformation on both sides of (7.152) and integrating the left-hand side by parts, we obtain
$$sL = \Omega\,\frac{\partial L}{\partial y_0} + \frac{\partial^2 L}{\partial y_0^2} + P(T=0, y_0). \tag{7.160}$$
So, by virtue of (7.154), the governing equation for the Laplace transform $L(s, y_0)$ has the form
$$sL = \Omega\,\frac{\partial L}{\partial y_0} + \frac{\partial^2 L}{\partial y_0^2}. \tag{7.161}$$
Applying the operator $\partial_{y_0}$ to expression (7.159) we see that the boundary condition (7.153) also holds for the given Laplace transform. As far as the other boundary is concerned, only an infinitely narrow neighborhood of T = 0 contributes to the integral (7.159) in this case and, thus, the exponential cofactor exp(−sT) can be omitted when $y_0 \to 1$, reducing (7.159) to (7.158). It turns out that (7.161) must be accompanied by the following boundary conditions
$$\left.\frac{\partial L(s, y_0)}{\partial y_0}\right|_{y_0=0} = 0 \tag{7.162}$$
and
$$L(s, y_0 = 1) = 1. \tag{7.163}$$
Let us analyze in detail the first-passage time properties in the situation under consideration. As a first step, the general solution of (7.161) is written in the form
$$L(s, y_0) = A_+ e^{y_0\kappa_+} + A_- e^{y_0\kappa_-}, \tag{7.164}$$
where $A_\pm$ are constants and $\kappa_\pm$ are the values that should be found by substituting (7.164) into (7.161). In this way we conclude that the values $\kappa_\pm$ are the roots of the equation
$$\kappa^2 + \Omega\kappa - s = 0, \tag{7.165}$$
so
$$\kappa_\pm = -\frac{\Omega}{2} \pm \sqrt{\frac{1}{4}\Omega^2 + s}. \tag{7.166}$$
The boundary conditions (7.162)–(7.163) in this case become
$$\left.\frac{\partial L(s, y_0)}{\partial y_0}\right|_{y_0=0} = \kappa_+ A_+ + \kappa_- A_- = 0, \tag{7.167}$$
$$L(s, y_0 = 1) = A_+ e^{\kappa_+} + A_- e^{\kappa_-} = 1, \tag{7.168}$$
specifying the constants $A_\pm$, namely,
$$A_+ = -\frac{\kappa_-}{\kappa_+ e^{\kappa_-} - \kappa_- e^{\kappa_+}}, \tag{7.169a}$$
$$A_- = \frac{\kappa_+}{\kappa_+ e^{\kappa_-} - \kappa_- e^{\kappa_+}}. \tag{7.169b}$$
The substitution of (7.166) and (7.169) into (7.164) yields the desired expression
$$L(s, y_0) = e^{-\frac{1}{2}\Omega(y_0-1)}\,\frac{k\cosh(y_0 k) + \frac{1}{2}\Omega\sinh(y_0 k)}{k\cosh(k) + \frac{1}{2}\Omega\sinh(k)}, \tag{7.170}$$
where the symbol k denotes the expression
$$k = k(s) := \sqrt{\frac{1}{4}\Omega^2 + s}. \tag{7.171}$$
In order to find the inverse image $P(T, y_0)$ of (7.170) we make use of the inverse Laplace transform (γ is any positive number)
$$P(T, y_0) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} e^{sT} L(s, y_0)\,ds. \tag{7.172}$$
Function (7.170) is analytic everywhere on the complex plane s = Re s + i Im s except for the roots of its denominator, because it contains only even powers of the variable $\sqrt{\frac{1}{4}\Omega^2 + s}$. The set of zeros $\{s_n\}$ of the denominator, as will be clear later, is made up of isolated points of the real axis (Re s) to the left of the origin s = 0, as shown in Figure 7.13. Therefore, residue theory is used to calculate integral (7.172). Let us consider a loop $C_s$ of radius R on the complex plane that contains a fragment of the line (γ − i∞, γ + i∞) (see Figure 7.13) and analyze the integral
$$I(T, y_0) := \frac{1}{2\pi i}\oint_{C_s} e^{sT} L(s, y_0)\,ds \tag{7.173}$$
along it. According to the residue technique developed for functions of a complex variable, its value is equal to the sum running over all the residues of the integrand which are located in the domain $Q_s$ bounded by the loop $C_s$:
$$I(T, y_0) = \sum_{s_n \in Q_s} \mathrm{Res}\left[e^{s_n T} L(s_n, y_0)\right]. \tag{7.174}$$
Figure 7.13 Technique for calculating integral (7.172). (Complex s plane with axes Re s and Im s, the loop $C_s$ with its circular fragment $C_R$ of radius R, the line Re s = γ, and the singularities of L(s, y0) on the negative real axis.)

We recall that, by definition, the residue $\mathrm{Res}[f(s_p)]$ of a function f(s) of complex variable s is the coefficient $c_{-1}$ of its expansion around a singularity point $s_p$ (a pole of the function f(s)):
$$f(s) = \ldots + \frac{c_{-2}}{(s-s_p)^2} + \frac{c_{-1}}{(s-s_p)} + c_0 + c_1(s-s_p) + c_2(s-s_p)^2 + \ldots$$
In particular, let a function f(s) be of the form
$$f(s) = \frac{P(s)}{Q(s)}, \tag{7.175}$$
and exhibit a singularity of the first order at the point $s_p$, so $c_{-1} \neq 0$ whereas all the other coefficients $c_{-2}, c_{-3}, \ldots = 0$, and $P(s_p) \neq 0$. Then
$$\mathrm{Res}\left[f(s_p)\right] = P(s_p)\left[\left.\frac{dQ(s)}{ds}\right|_{s=s_p}\right]^{-1}. \tag{7.176}$$
If we are able to find the integral along the circular fragment $C_R$ of $C_s$ then expression (7.174) immediately gives us the inverse Laplace transform (7.172). The value of this part of the integral can be estimated in the following way. We note that for $y_0 < 1$ the function $L(s, y_0)$ tends to zero exponentially as the argument s goes to infinity along the half-lines $s = -re^{\pm i\theta}$ where $0 < \theta < \pi$, namely
$$L(s, y_0) \propto \exp\left[-(1-y_0)\sqrt{r}\,\sin\tfrac{1}{2}\theta\right]. \tag{7.177}$$
Therefore the integral along $C_R$, except for an arbitrarily small-angle neighborhood of the real axis (Re s), tends to zero as R → ∞. The zeros of the denominator are practically equidistant far from the origin. We can choose an arbitrarily large circular fragment passing between these singularities, so the Laplace transform on $C_R$ is bounded in the given neighborhood, where the exponential cofactor $e^{sT} \sim e^{-RT} \to 0$ decreases exponentially. Summarizing the aforesaid argument we conclude that the integral vanishes:
$$\int_{C_R} e^{sT} L(s, y_0)\,ds \to 0 \quad \text{as } R \to \infty. \tag{7.178}$$
Whence it follows that the desired inverse Laplace transform can be represented as the series
$$P(T, y_0) = \sum_n \mathrm{Res}\left[e^{sT} L(s, y_0), s_n\right]. \tag{7.179}$$
As follows from (7.170), the set of poles $\{s_n\}$ is determined by the equation
$$k\cosh k + \frac{\Omega}{2}\sinh k = 0, \tag{7.180}$$
where k = k(s) is given by (7.171). Equation (7.180) will be analyzed in detail in Section 7.11. Applying the results of this analysis, it is possible to assert that all the poles are located at Im s = 0 and Re s < 0, and comprise the points
$$s_n = -\frac{1}{4}\Omega^2 - \tilde k_n^2, \quad n = 1, 2, 3, \ldots \tag{7.181}$$
and an additional point, either
$$s_* = -\frac{1}{4}\Omega^2 - \tilde k_*^2 \quad \text{for } -2 < \Omega < 0 \tag{7.181a}$$
or
$$s_* = -\frac{1}{4}\Omega^2 + \tilde k_*^2 \quad \text{for } \Omega < -2, \tag{7.181b}$$
depending on the value of the drift rate $\Omega$. Here the set $\{\tilde k_n\}$ of real numbers as well as the value $\tilde k_*$ are the roots of the equation
$$k\cos k + \frac{\Omega}{2}\sin k = 0, \tag{7.182}$$
with the latter value only meeting it for $-2 < \Omega < 0$, whereas for $\Omega < -2$ the value $\tilde k_*$ is a root of the equation
$$k\cosh k + \frac{\Omega}{2}\sinh k = 0. \tag{7.183}$$
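The roots $\tilde k_n$ of (7.182) have no closed form, but they can be bracketed and refined numerically. The following is a minimal sketch, not the authors' procedure: it assumes plain bisection on the intervals $((n-\tfrac12)\pi, (n+\tfrac12)\pi)$ given in Section 7.11, computes only the series $\tilde k_n$ (not the extra root $\tilde k_*$), and the names `char_fn`, `bisect` and `trig_roots` as well as the value Ω = 2 are illustrative choices.

```python
import math

def char_fn(k, omega):
    """Left-hand side of (7.182): k*cos(k) + (omega/2)*sin(k)."""
    return k * math.cos(k) + 0.5 * omega * math.sin(k)

def bisect(f, lo, hi, n_iter=80):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def trig_roots(omega, n_max):
    """Roots k_n (n = 1..n_max) of (7.182), one per interval ((n-1/2)pi, (n+1/2)pi)."""
    if omega == 0.0:
        raise ValueError("this bracketing assumes a nonzero drift rate")
    return [bisect(lambda k: char_fn(k, omega),
                   (n - 0.5) * math.pi, (n + 0.5) * math.pi)
            for n in range(1, n_max + 1)]

OMEGA = 2.0                                        # illustrative drift rate
ROOTS = trig_roots(OMEGA, 5)
POLES = [-0.25 * OMEGA ** 2 - k * k for k in ROOTS]  # s_n from (7.181)
```

At the interval ends the cosine vanishes, so the function takes the values $\pm\Omega/2$ with opposite signs, which is what makes the bracketing work for any nonzero Ω.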
Figure 7.14 shows the solutions $\tilde k$ of (7.182) or (7.183) constructed geometrically as the intersection of the trigonometric function tan k or the hyperbolic tangent tanh k with the linear function $-(2/\Omega)k$ for different values of the control parameter $\Omega$. It should be noted that the zero-value roots of (7.182) and (7.183) do not contribute to the set of poles because the numerator of the ratio (7.170) is also equal to zero in this case. In order to calculate the residues of function (7.170) using formula (7.176) we will first calculate the derivative of its denominator
$$D(s) := \sqrt{\frac{1}{4}\Omega^2 + s}\,\cosh\sqrt{\frac{1}{4}\Omega^2 + s} + \frac{1}{2}\Omega\sinh\sqrt{\frac{1}{4}\Omega^2 + s}$$
at the given points. Taking into account equality (7.180) we have
$$\left.\frac{dD(s)}{ds}\right|_{s_n} = \frac{\frac{1}{2}\Omega - s_n}{\Omega\sqrt{\frac{1}{4}\Omega^2 + s_n}}\,\cosh\sqrt{\frac{1}{4}\Omega^2 + s_n}.$$

Figure 7.14 Set of solutions $\tilde k$ obtained from (7.182) as the intersection between tan k and the straight line $-(2/\Omega)k$ and those of (7.183) as the intersection between tanh k and the same straight line (bottom right figure). (The panels plot tan(k) or tanh(k) and −2k/Ω against the ratio k/π for Ω = 2.0, 0.7, −0.7, −1.0, −1.1, −1.5, −2.5 and −5.0.)
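The numerator of ratio (7.170) at the poles is equal to
$$N(s_n) := \sqrt{\frac{1}{4}\Omega^2 + s_n}\;\frac{\sinh\left[(1-y_0)\sqrt{\frac{1}{4}\Omega^2 + s_n}\right]}{\sinh\sqrt{\frac{1}{4}\Omega^2 + s_n}}.$$
In these terms the desired probability density $P(T, y_0)$ is specified by the sum
$$P(T, y_0) = \sum_n e^{-\frac{1}{2}\Omega(y_0-1) + s_n T}\,\frac{N(s_n)}{\left.dD(s)/ds\right|_{s_n}}. \tag{7.184}$$
Using the expressions obtained and again the equality (7.180) this sum can be rewritten as
$$\begin{aligned}
P(T, y_0) &= 2e^{-\frac{1}{2}\Omega(y_0-1)}\sum_n \frac{\Omega\left(\frac{1}{4}\Omega^2 + s_n\right)e^{s_n T}}{\frac{1}{2}\Omega - s_n}\cdot\frac{\sinh\left[(1-y_0)\sqrt{\frac{1}{4}\Omega^2 + s_n}\right]}{\sinh\left[2\sqrt{\frac{1}{4}\Omega^2 + s_n}\right]} \\
&= 2e^{\frac{1}{2}\Omega(1-y_0)}\Bigg\{\sum_{n=1}^{\infty}\frac{e^{-(\tilde k_n^2 + \Omega^2/4)T}}{1 + \frac{\Omega}{2}\frac{1}{\tilde k_n^2 + \Omega^2/4}}\,\tilde k_n\sin\left[(1-y_0)\tilde k_n\right] \\
&\qquad + \left[\frac{e^{-(\tilde k_*^2 + \Omega^2/4)T}}{1 + \frac{\Omega}{2}\frac{1}{\tilde k_*^2 + \Omega^2/4}}\,\tilde k_*\sin\left[(1-y_0)\tilde k_*\right]\right]_{-2<\Omega<0} \\
&\qquad - \left[\frac{e^{-(-\tilde k_*^2 + \Omega^2/4)T}}{1 + \frac{\Omega}{2}\frac{1}{-\tilde k_*^2 + \Omega^2/4}}\,\tilde k_*\sinh\left[(1-y_0)\tilde k_*\right]\right]_{\Omega<-2}\Bigg\},
\end{aligned} \tag{7.185}$$
where the bracketed terms are present only in the indicated ranges of $\Omega$.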
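The series (7.185) can be evaluated directly once the roots of (7.182) are known. The sketch below is an illustration under stated assumptions, not part of the original derivation: it takes Ω > 0, so that neither bracketed extra term is present, uses the illustrative values Ω = 2 and $y_0 = 0.5$, and finds the roots by bisection. As a consistency check it integrates the series term by term over T (each exponential contributes the inverse of its decay rate) and compares the result with the normalization (7.158). All function names are hypothetical.

```python
import math

def trig_roots(omega, n_max, n_bisect=80):
    """Roots of k*cos(k) + (omega/2)*sin(k) = 0, one per ((n-1/2)pi, (n+1/2)pi)."""
    def f(k):
        return k * math.cos(k) + 0.5 * omega * math.sin(k)
    roots = []
    for n in range(1, n_max + 1):
        lo, hi = (n - 0.5) * math.pi, (n + 0.5) * math.pi
        f_lo = f(lo)
        for _ in range(n_bisect):
            mid = 0.5 * (lo + hi)
            f_mid = f(mid)
            if f_lo * f_mid <= 0.0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        roots.append(0.5 * (lo + hi))
    return roots

def series_terms(y0, omega, roots):
    """(decay rate, amplitude) pairs of the trigonometric terms in (7.185)."""
    pref = 2.0 * math.exp(0.5 * omega * (1.0 - y0))
    terms = []
    for k in roots:
        lam = k * k + 0.25 * omega * omega          # decay rate of this term
        amp = pref * k * math.sin((1.0 - y0) * k) / (1.0 + 0.5 * omega / lam)
        terms.append((lam, amp))
    return terms

def density(T, terms):
    """First-passage time density P(T, y0) as a truncated sum."""
    return sum(amp * math.exp(-lam * T) for lam, amp in terms)

def total_probability(terms):
    """Term-by-term integral over T from 0 to infinity; should be close to 1."""
    return sum(amp / lam for lam, amp in terms)

OMEGA, Y0 = 2.0, 0.5      # illustrative values with omega > 0 (no extra root)
TERMS = series_terms(Y0, OMEGA, trig_roots(OMEGA, 2000))
NORMALIZATION = total_probability(TERMS)
```

The integrated series converges only slowly (its terms decay like $1/\tilde k_n$ with alternating signs), which is why a few thousand roots are kept; the density itself at any fixed T > 0 needs only the first few terms.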
The second form of (7.185) follows from the first after several steps of arithmetical manipulation, also taking into account (7.182) and (7.183). Expression (7.185) is the desired formula for the probability density of reaching the absorbing boundary for the first time at moment T. It exactly coincides with expressions (7.110), (7.112) and contains (7.111) as the limiting case $\Omega \to -2$, where the summation index is related by n = m + 1. Let us consider the case corresponding to a significant potential barrier separating the terminal points of the diffusion interval, that is, the limit of large negative values $\Omega < 0$ and $|\Omega| \gg 1$. For such $\Omega$ the value $\tilde k_*$ is given by the root of (7.183) and its magnitude is much larger than unity. Approximating, we get $\tilde k_* \approx -\Omega/2 = |\Omega|/2$ because, in this case, $\tanh\tilde k_* \approx 1$ (see Figure 7.14). The straight line in Figure 7.3 shows the same root $\kappa_0 \equiv \tilde k_* = \Omega/2$ in its zeroth approximation. In order to obtain the deviation of $\tilde k_*$ from the zeroth estimate, (7.183) can be solved by iteration. So, let us rewrite (7.183) as
$$k = -\frac{\Omega}{2}\tanh k = -\frac{\Omega}{2}\,\frac{1 - e^{-2k}}{1 + e^{-2k}} = -\frac{\Omega}{2}\left[1 - 2e^{-2k} + 2e^{-4k} - 2e^{-6k} + \ldots\right]. \tag{7.186}$$
Obviously the zeroth approximation with respect to the small parameter exp(−2k) yields the same estimate and the first iteration gives the desired value
$$\tilde k_* \approx -\frac{\Omega}{2}\left[1 - 2e^{\Omega}\right]. \tag{7.187}$$
All the terms in expression (7.185) containing the time-dependent cofactors $\exp\left[-(\tilde k_n^2 + \Omega^2/4)T\right]$ become negligible on scales $T \gg 4/\Omega^2$. The last one is the exception because, by virtue of (7.187),
$$\Omega^2/4 - \tilde k_*^2 = \Omega^2 e^{\Omega} - \Omega^2 e^{2\Omega} \approx \Omega^2 e^{\Omega}, \tag{7.188}$$
its cofactor is of the form
$$\exp\left[-(-\tilde k_*^2 + \Omega^2/4)T\right] \approx \exp\left[-\Omega^2 e^{\Omega}T\right] \tag{7.189}$$
and on time scales $T < e^{|\Omega|}/\Omega^2$ the corresponding term keeps its initial value at T = 0. So as time T goes beyond a rather narrow initial interval, $T \gg 4/\Omega^2$, only the last term contributes to the function $P(T, y_0)$. In this case, taking into account estimate (7.187), expression (7.185) can be calculated as
$$P(T, y_0) = -2e^{\frac{1}{2}\Omega(1-y_0)}\,\frac{e^{-(-\tilde k_*^2 + \Omega^2/4)T}}{1 + \frac{\Omega}{2}\frac{1}{-\tilde k_*^2 + \Omega^2/4}}\,\tilde k_*\sinh\left[(1-y_0)\tilde k_*\right]. \tag{7.190}$$
In this expression the value $\Omega^2/4 - \tilde k_*^2$ should be estimated using the second term in (7.187), which gives (7.188), because otherwise the result is equal to zero. In the cases where the value $\tilde k_*$ enters individually it is possible to use the leading term only, that is, to set $\tilde k_* = -\Omega/2$. In this way, also using the definition of the hyperbolic sine, $\sinh z = (e^z - e^{-z})/2$, (7.190) can be represented as
$$P(T, y_0) = \left[1 - e^{(1-y_0)\Omega}\right]\Omega^2 e^{\Omega}\exp\left[-\Omega^2 e^{\Omega}T\right]. \tag{7.191}$$
It should be noted that (7.191) matches the standard formulas for the escape rate from a potential well. In fact, for $1 - y_0 \gg 1/|\Omega|$, (7.191) considered as a function of time T has the form
$$P(T, y_0) = \frac{1}{\langle T\rangle}\exp\left[-\frac{T}{\langle T\rangle}\right] \tag{7.192}$$
with the time scale
$$\langle T\rangle = \frac{1}{\Omega^2}\exp(-\Omega). \tag{7.193}$$
It is exactly the time distribution for escaping from a potential well, and (7.193) is the characteristic dependence of the mean lifetime in the potential well on its height. Finally, the different approximations are displayed together with the exact solution in Figure 7.15. In the long-time behavior, apart from the influence of the initial delta peak at $y_0$, all curves agree well.
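The quality of the estimates (7.187), (7.188) and (7.193) can be checked numerically. The sketch below is an illustration, not the book's procedure: it solves (7.183) by the fixed-point iteration $k \leftarrow -(\Omega/2)\tanh k$ suggested by (7.186) and compares the result with the approximations for Ω = −5, the value used in Figure 7.15. The function name `hyperbolic_root` and the number of iterations are assumptions.

```python
import math

def hyperbolic_root(omega, n_iter=100):
    """Nonzero root of k*cosh(k) + (omega/2)*sinh(k) = 0 for omega < -2,
    obtained by iterating k <- -(omega/2)*tanh(k) as in (7.186)."""
    if omega >= -2.0:
        raise ValueError("a nonzero real root exists only for omega < -2")
    k = -0.5 * omega                      # zeroth approximation |omega|/2
    for _ in range(n_iter):
        k = -0.5 * omega * math.tanh(k)
    return k

OMEGA = -5.0
K_STAR = hyperbolic_root(OMEGA)
K_APPROX = -0.5 * OMEGA * (1.0 - 2.0 * math.exp(OMEGA))   # first iteration (7.187)

RATE_EXACT = 0.25 * OMEGA ** 2 - K_STAR ** 2              # decay rate of the slow term
RATE_APPROX = OMEGA ** 2 * math.exp(OMEGA)                # estimate (7.188)
MEAN_LIFETIME = math.exp(-OMEGA) / OMEGA ** 2             # time scale (7.193)
```

The iteration converges quickly here because the slope of $-(\Omega/2)\tanh k$ at the root is much smaller than one when $|\Omega| \gg 1$. For Ω = −5 the approximate and iterated decay rates differ by a few percent, which gives a feeling for how far the curves in Figure 7.15 can be expected to agree.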
Figure 7.15 Behavior of the first-passage time probability density distribution P(T, y = 1) ≡ P(T, y0) for different approximation levels. Parameters are Ω = −5.0 (scaled negative drift coefficient) and y0 = 0.5 (initial condition). The exact solution (7.185) is given by the full line, the approximation (7.192) by the dotted and (7.191) by the dashed line.
7.11 Roots of the Transcendental Equation
For the complex number z = ζ + iξ the following identities
$$\sinh z = \sinh\zeta\cos\xi + i\cosh\zeta\sin\xi,$$
$$\cosh z = \cosh\zeta\cos\xi + i\sinh\zeta\sin\xi$$
hold; therefore, the equation z cosh z = β sinh z becomes
$$(\zeta + i\xi)\left(\cosh\zeta\cos\xi + i\sinh\zeta\sin\xi\right) = \beta\left(\sinh\zeta\cos\xi + i\cosh\zeta\sin\xi\right).$$
Then, separating the real and imaginary parts, we get
$$\zeta\cosh\zeta\cos\xi - \xi\sinh\zeta\sin\xi = \beta\sinh\zeta\cos\xi, \tag{7.194a}$$
$$\zeta\sinh\zeta\sin\xi + \xi\cosh\zeta\cos\xi = \beta\cosh\zeta\sin\xi. \tag{7.194b}$$
If ζ = 0, then the system of equations (7.194) reduces to the equality imposed on ξ:
$$\xi\cos\xi = \beta\sin\xi. \tag{7.195}$$
In turn, for ξ = 0 the coupled equations (7.194) reduce to the equality imposed on ζ:
$$\zeta\cosh\zeta = \beta\sinh\zeta. \tag{7.196}$$
We will demonstrate that system (7.194) does not admit other solutions. Let ζ ≠ 0 and ξ ≠ 0. In this case we also have cos ξ ≠ 0 and sin ξ ≠ 0 because, otherwise, either (7.194a) or (7.194b) cannot hold. This enables us to divide (7.194a) and (7.194b) by sinh ζ cos ξ and cosh ζ sin ξ, respectively, reducing (7.194) to
$$\zeta\,\frac{\cosh\zeta}{\sinh\zeta} - \xi\,\frac{\sin\xi}{\cos\xi} = \beta, \tag{7.197}$$
$$\zeta\,\frac{\sinh\zeta}{\cosh\zeta} + \xi\,\frac{\cos\xi}{\sin\xi} = \beta, \tag{7.198}$$
thus,
$$\zeta\,\frac{\cosh\zeta}{\sinh\zeta} - \xi\,\frac{\sin\xi}{\cos\xi} = \zeta\,\frac{\sinh\zeta}{\cosh\zeta} + \xi\,\frac{\cos\xi}{\sin\xi} \tag{7.199}$$
or
$$\frac{\sinh 2\zeta}{2\zeta} = \frac{\sin 2\xi}{2\xi}. \tag{7.200}$$
Since |sin ξ/ξ| < 1 and sinh ζ/ζ > 1 for ζ, ξ ≠ 0, there are no solutions with ζ ≠ 0 and ξ ≠ 0 simultaneously. Summarizing the aforesaid, we conclude that the equation z cosh z = β sinh z admits only solutions meeting one of the following conditions
$$\zeta\cosh\zeta = \beta\sinh\zeta, \quad \xi = 0, \tag{7.201a}$$
or
$$\xi\cos\xi = \beta\sin\xi, \quad \zeta = 0. \tag{7.201b}$$
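Whence it follows that, first, the collection of solutions contains the series
$$\zeta_0 = 0,\ \xi_0 = 0; \qquad \zeta_n = 0,\ \xi_n \in \left(-\frac{\pi}{2} + n\pi,\ \frac{\pi}{2} + n\pi\right) \ \text{for } n = 1, 2, 3, \ldots; \qquad \zeta_n = 0,\ \xi_n = -\xi_{-n} \ \text{for } n = -1, -2, -3, \ldots \tag{7.202}$$
Second, for β > 0 and β ≠ 1 there is an additional pair of roots $\{\zeta_*, \xi_*\}$ and $\{-\zeta_*, -\xi_*\}$ such that
$$\zeta_* = 0, \quad \xi_* \in \left(0, \frac{\pi}{2}\right) \quad \text{for } \beta < 1, \tag{7.203a}$$
$$\zeta_* \in (0, \beta), \quad \xi_* = 0 \quad \text{for } \beta > 1. \tag{7.203b}$$
In particular, when β → 1 the value $\xi_* \to 0$ or $\zeta_* \to 0$.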
In the limiting case β ≫ 1, the first terms of the root series for $n \gtrsim 1$ are
$$\xi_n \approx n\pi \tag{7.204}$$
and the root of the second type is
$$\zeta_* \approx \beta\left[1 - 2\exp(-2\beta)\right]. \tag{7.205}$$
7.12 Exercises
E 7.1 Fokker–Planck dynamics with linear potential
Recall the general one-dimensional Fokker–Planck equation (5.53)
$$\frac{\partial p(x, t)}{\partial t} = -\frac{\partial}{\partial x}\left[f(x)\,p(x, t)\right] + \frac{\sigma^2}{2}\,\frac{\partial^2 p(x, t)}{\partial x^2},$$
which corresponds to the stochastic differential equation of Langevin type with white noise (5.46)
$$dx(t) = f(x(t))\,dt + \sigma\,dW(t).$$
Introduce the diffusion coefficient D = σ²/2 and the potential function V(x) via $V(x) = -\int f(x)\,dx$ to get, as a starting point, the following Fokker–Planck partial differential equation
$$\frac{\partial p(x, t)}{\partial t} = \frac{\partial}{\partial x}\left[\frac{dV(x)}{dx}\,p(x, t) + D\,\frac{\partial p(x, t)}{\partial x}\right] \tag{7.206}$$
together with the delta-like initial condition p(x, t = 0) = δ(x − x0). Investigate the dynamics (7.206) with the linear potential V(x) = −αx, where the constant drift coefficient α may range from minus to plus infinity, as shown in Figure 7.16.

Figure 7.16 Sketch of the linear potential V(x) = −αx within the finite interval a ≤ x ≤ b for two different parameter values α > 0 (solid line) and α < 0 (broken line). The initial condition is marked by x0.
Find the solution of the drift–diffusion problem in a finite interval a ≤ x ≤ b with well defined boundary conditions. Take mixed boundaries, meaning closed on the left-hand side x = a and open on the right-hand side x = b. Discuss the first-passage time problem (7.28). Follow the analytical method outlined in Section 7.5 and compare it with results presented by Fox and Choi [29, 48] and Linetsky [125, 126]. Is the Péclet number, known from Redner [185], related to the dimensionless drift parameter (7.21)?

E 7.2 Fokker–Planck dynamics with V-shaped potential
Find the solution of the drift–diffusion problem for a particle moving in a V-shaped potential given by V(x) = γ|x| (γ ≥ 0). The dynamics is described by the Fokker–Planck equation (7.206) with natural boundary conditions −∞ ≤ x ≤ +∞ and initial value x0. Hint: use the analogy with the solution of (7.1).

E 7.3 First-passage time problem with absorbing boundary
Find the solution of the drift–diffusion problem given by the Fokker–Planck equation (7.206) with the already known potential V(x) = γ|x| in the case of a natural boundary at x → −∞ and an absorbing one at x = b.

E 7.4 First-passage time problem with mixed boundaries
Find the solution of the drift–diffusion problem given by the Fokker–Planck equation (7.206) with the potential V(x) = γ|x| in the case of a reflecting boundary at x = a and an absorbing one at x = b. Compare the results with the previous task.
8 The Ornstein–Uhlenbeck Process
8.1 Definitions and Properties
In Chapter 5 we discussed the Brownian motion in the velocity space based on the Langevin equation. The corresponding stochastic process is known as the Ornstein–Uhlenbeck process. The mathematical model developed, with some modifications, also has other applications, e.g. in finance [119]. Here we consider a more general case where the mean value of the stochastic variable relaxes to a nonzero value. In the following we will study the process in the space of two variables corresponding to the velocity and the coordinate of the Brownian particle. However, at first we restrict our analysis to motion in one dimension. As distinct from Chapter 5, here we will consider the probability density functions in detail by solving the Fokker–Planck rather than the Langevin equation. From a mathematical point of view [7, 25, 201] the Ornstein–Uhlenbeck process is defined as a mean-reverting process given by the following stochastic differential equation
$$dx(t) = \left(a - c\,x(t)\right)dt + b\,dW(t) \tag{8.1}$$
together with initial conditions x(t = 0) = x0 and W(t = 0) = 0. All control parameters a, b and c are non-negative. The special case with a = 0 is a process for which the mean value tends to zero. In general, it tends to a nonzero value a/c in the long-time limit t → ∞. If we set c = 0, the arithmetic Brownian motion (cf. Section 5.8) remains. By using the identity
$$d\left(\phi\,x(t)\right) = \phi(t)\left(c\,x(t)\,dt + dx(t)\right), \tag{8.2}$$
where φ(t) = exp{c t} acts as an integrating factor, (8.1) can be written as
$$d\left(\phi\,x(t)\right) = \phi(t)\left(a\,dt + b\,dW(t)\right). \tag{8.3}$$
This is a stochastic differential equation with respect to φ x(t). The formal solution, obtained by integrating both sides of (8.3), reads
$$x(t) = \frac{a}{c}\left[1 - \exp\{-ct\}\right] + x_0\exp\{-ct\} + b\int_0^t \exp\{-c(t-s)\}\,dW(s). \tag{8.4}$$
The first and the second moments are
$$\langle x(t)\rangle = \frac{a}{c}\left[1 - \exp\{-ct\}\right] + x_0\exp\{-ct\}, \tag{8.5}$$
$$\langle x(t)^2\rangle = \langle x(t)\rangle^2 + \frac{b^2}{2c}\left(1 - \exp\{-2ct\}\right). \tag{8.6}$$

Figure 8.1 A stochastic trajectory of the Ornstein–Uhlenbeck process (8.1) with the initial condition x0 = −2 for parameter values a = 3, b = 2 and c = 1. The theoretical mean value and the variance are shown by dashed and solid curves, respectively.
As an illustrative example, one stochastic trajectory of the mean-reverting process is shown in Figure 8.1 and compared with the theoretical mean value. The variance $\langle x(t)^2\rangle - \langle x(t)\rangle^2$, calculated from (8.6) and (8.5), indicates the amplitude of the stochastic fluctuations. The random real-valued function x(t) (Figure 8.1 shows one realization) is defined on the set of outcomes of a random experiment as a solution of the stochastic differential equation (8.1). Details of stochastic modeling including numerical solution techniques can be found in [4] and [104]. Over the years, the Ornstein–Uhlenbeck process [231] has come to play an important role in several branches of natural sciences related, in some form, to Brownian motion and diffusion [13, 103, 154, 186].
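The moments (8.5) and (8.6) can be compared with a direct simulation of (8.1). The following is a minimal sketch that assumes the Euler–Maruyama discretization (one common choice among the numerical techniques of the cited literature, not necessarily the one used for Figure 8.1) with the parameter values of Figure 8.1. The step size, the number of paths, the seed and all function names are illustrative assumptions.

```python
import math
import random

def ou_mean(t, x0, a, c):
    """Theoretical mean value (8.5)."""
    return a / c * (1.0 - math.exp(-c * t)) + x0 * math.exp(-c * t)

def ou_variance(t, b, c):
    """Theoretical variance, i.e. the difference of (8.6) and the square of (8.5)."""
    return b * b / (2.0 * c) * (1.0 - math.exp(-2.0 * c * t))

def simulate_ou(x0, a, b, c, t_end, dt, n_paths, seed=0):
    """Euler-Maruyama paths of dx = (a - c x) dt + b dW; returns the values at t_end."""
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    sqrt_dt = math.sqrt(dt)
    finals = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            # Wiener increment dW ~ sqrt(dt) * N(0, 1)
            x += (a - c * x) * dt + b * sqrt_dt * rng.gauss(0.0, 1.0)
        finals.append(x)
    return finals

X0, A, B, C = -2.0, 3.0, 2.0, 1.0          # values quoted in Figure 8.1
T_END = 2.0
FINALS = simulate_ou(X0, A, B, C, T_END, dt=0.01, n_paths=2000)
SAMPLE_MEAN = sum(FINALS) / len(FINALS)
SAMPLE_VAR = sum((x - SAMPLE_MEAN) ** 2 for x in FINALS) / (len(FINALS) - 1)
```

With these settings the sample mean and variance should scatter around `ou_mean(T_END, X0, A, C)` and `ou_variance(T_END, B, C)` within the statistical error of a few percent; a smaller step or more paths tightens the agreement.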
8.2 The Ornstein–Uhlenbeck Process and its Solution
We consider the Ornstein–Uhlenbeck process in the space of coordinate x and velocity v [231]. In the Langevin formalism it is defined by the following system of stochastic differential equations [180, 193]
$$dx = v\,dt, \tag{8.7}$$
$$dv = -\gamma\,v\,dt + \sqrt{2B}\,dW(t), \tag{8.8}$$
where W(t) is the Wiener process. The ensemble of stochastic trajectories given by these equations is characterized by the probability distribution function p(x, v, t) which obeys the Fokker–Planck equation
$$\frac{\partial}{\partial t}p = -\frac{\partial}{\partial x}[v\,p] + \frac{\partial}{\partial v}[\gamma\,v\,p] + \frac{\partial^2}{\partial v^2}[B\,p]. \tag{8.9}$$
This is a particular case of (5.4) with a two-dimensional state vector (x, v). We set
$$p(x, v, t = 0) = \delta(x - x_0)\,\delta(v - v_0) \tag{8.10}$$
as the initial condition, which means that the process starts at the position x = x0 with given velocity v0. Our aim is to solve the above Fokker–Planck equation (8.9) in agreement with (8.10) to get the probability density p = p(x, v, t) analytically. Here we will give a sketch of the approach worked out by Ralf Remer [189]. Since the equation contains only the first derivative with respect to the coordinate x, it is helpful to make the Fourier transformation relative to this variable. It is defined as
$$p(x, v, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} dk_x\,\exp[i k_x x]\;p(k_x, v, t). \tag{8.11}$$
In this way the Fokker–Planck equation (8.9) is transformed to
$$\frac{\partial}{\partial t}p = -i k_x v\,p + \frac{\partial}{\partial v}[\gamma\,v\,p] + \frac{\partial^2}{\partial v^2}[B\,p]. \tag{8.12}$$
The resulting equation (8.12) for p = p(k_x, v, t) contains the second derivative with respect to v, whereas the velocity v itself is contained only in the first power. Therefore we can easily make the Fourier transformation with respect also to this variable, so
$$p(k_x, v, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} dk_v\,\exp[i k_v v]\;\tilde p(k_x, k_v, t), \tag{8.13}$$
$$v\,p(k_x, v, t) = -\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} dk_v\,i\left(\frac{\partial}{\partial k_v}\exp[i k_v v]\right)\tilde p(k_x, k_v, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} dk_v\,i\exp[i k_v v]\,\frac{\partial}{\partial k_v}\tilde p(k_x, k_v, t). \tag{8.14}$$
Hence we arrive at the following equation for $\tilde p = \tilde p(k_x, k_v, t)$ in the Fourier space of x and v
$$\frac{\partial}{\partial t}\tilde p = [k_x - \gamma k_v]\,\frac{\partial}{\partial k_v}\tilde p - B k_v^2\,\tilde p \tag{8.15}$$
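or
$$\frac{\partial}{\partial t}\tilde p + [\gamma k_v - k_x]\,\frac{\partial}{\partial k_v}\tilde p = -B k_v^2\,\tilde p. \tag{8.16}$$
This equation represents a Cauchy problem which can be solved by the method of characteristics. For this purpose we introduce a time-dependent quantity $\bar k_v = \bar k_v(\tau)$ which depends on the intrinsic time τ ∈ [0, t] and obeys the boundary condition
$$\bar k_v(t) = k_v. \tag{8.17}$$
Then we consider the equation
$$\frac{\partial}{\partial t}\tilde p + [\gamma \bar k_v - k_x]\,\frac{\partial}{\partial \bar k_v}\tilde p = -B \bar k_v^2\,\tilde p \tag{8.18}$$
which is obtained from (8.16) by replacing $k_v$ with $\bar k_v$. Due to the boundary condition (8.17), the solution of (8.18) agrees with that of (8.16) at τ = t. The function $\bar k_v(\tau)$ can be chosen such that it satisfies the equation
$$\frac{d\bar k_v}{d\tau} = \gamma \bar k_v - k_x. \tag{8.19}$$
In this case we have
$$[\gamma \bar k_v - k_x]\,\frac{\partial}{\partial \bar k_v}\tilde p = \frac{\partial \tilde p}{\partial \bar k_v}\,\frac{d\bar k_v}{d\tau} \tag{8.20}$$
and hence we obtain an important relation
$$\frac{d\tilde p}{d\tau} = \frac{\partial \tilde p}{\partial \tau} + \frac{\partial \tilde p}{\partial \bar k_v}\,\frac{d\bar k_v}{d\tau} = \frac{\partial \tilde p}{\partial \tau} + [\gamma \bar k_v - k_x]\,\frac{\partial}{\partial \bar k_v}\tilde p = -B \bar k_v^2\,\tilde p \tag{8.21}$$
for the total derivative of $\tilde p(\tau)$. Thus we have an ordinary differential equation
$$\frac{d}{d\tau}\tilde p = -B \bar k_v^2\,\tilde p, \tag{8.22}$$
the solution of which gives us $\tilde p(k_x, k_v, t)$ at τ = t. Equation (8.19) can be solved with respect to the variable $\bar k_v$ by the method of variation of constants. It yields
$$\bar k_v(\tau) = \left(k_0 - \frac{k_x}{\gamma}\right)\exp[\gamma\tau] + \frac{k_x}{\gamma}. \tag{8.23}$$
Taking into account the boundary condition (8.17), we obtain the complete solution
$$\bar k_v(\tau) = \left(k_v - \frac{k_x}{\gamma}\right)\exp[\gamma(\tau - t)] + \frac{k_x}{\gamma}. \tag{8.24}$$
This equation is used to solve the differential equation (8.22) as follows
$$\tilde p = \tilde p_0\exp[-\Phi(t)], \qquad \Phi(t) = B\int_0^t \bar k_v^2(\tau)\,d\tau, \tag{8.25}$$
$$\Phi(t) = B\left[\frac{A^2}{2\gamma}\left(\exp[2\gamma t] - 1\right) + C^2 t + 2\,\frac{AC}{\gamma}\left(\exp[\gamma t] - 1\right)\right], \tag{8.26}$$
$$A = \left(k_v - \frac{k_x}{\gamma}\right)\exp[-\gamma t], \qquad C = \frac{k_x}{\gamma}.$$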
The initial condition for the probability density distribution in the Fourier space reads
$$\tilde p(k_x, k_v, t = 0) = \frac{1}{2\pi}\exp[-i k_x x_0]\exp[-i k_v v_0]. \tag{8.27}$$
By inserting the initial condition in the solution for $\tilde p$ we replace $k_v$ with the equivalent term $\bar k_v(0)$. Then the solution for the probability distribution in the Fourier space becomes
$$\tilde p(k_x, k_v, t) = \frac{1}{2\pi}\exp[-i k_x x_0]\exp[-i v_0 (A + C)]\exp[-\Phi(t)], \tag{8.28}$$
$$\Phi(t) = B\left[\frac{A^2}{2\gamma}\left(\exp[2\gamma t] - 1\right) + C^2 t + 2\,\frac{AC}{\gamma}\left(\exp[\gamma t] - 1\right)\right], \tag{8.29}$$
$$A = \left(k_v - \frac{k_x}{\gamma}\right)\exp[-\gamma t], \qquad C = \frac{k_x}{\gamma}.$$
In order to obtain the one-dimensional probability density distribution in the space of velocities v, we use the following relation
$$p_v(v, t) = \int dx\,p(x, v, t) = \sqrt{2\pi}\,p(0, v, t) = \int_{-\infty}^{+\infty} dk_v\,\exp[i k_v v]\,\tilde p(0, k_v, t) = \sqrt{2\pi}\,F[\tilde p(0, k_v, t)](v, t). \tag{8.30}$$
The probability distribution for the velocity v at time t is thus given by the inverse transformation of the solution in the Fourier space at $k_x = 0$. According to this, we calculate
$$p_v(v, t) = \sqrt{2\pi}\,F[\tilde p(0, k_v, t)](v, t),$$
$$\tilde p(0, k_v, t) = \frac{1}{2\pi}\exp\left[-i k_v v_0\exp[-\gamma t]\right]\exp[-\phi(t)\,k_v^2],$$
$$\phi(t) = \frac{B}{2\gamma}\left(1 - \exp[-2\gamma t]\right), \tag{8.31}$$
$$F\left[\exp[-\phi(t)\,k_v^2]\right](v, t) = \frac{1}{\sqrt{2\phi(t)}}\exp\left[-\frac{1}{2}\,\frac{v^2}{2\phi(t)}\right], \tag{8.32}$$
$$F\left[f(k_v, t)\right](v, t) = f(v, t), \qquad F\left[\exp[i k_v b]\,f(k_v, t)\right](v, t) = f(v + b, t). \tag{8.33}$$
It leads to the complete solution in v which reads
$$p_v(v, t) = \frac{1}{\sqrt{2\pi\sigma_v^2(t)}}\exp\left[-\frac{1}{2}\,\frac{(v - v_0\exp[-\gamma t])^2}{\sigma_v^2(t)}\right], \qquad \sigma_v^2(t) = \frac{B}{\gamma}\left(1 - \exp[-2\gamma t]\right). \tag{8.34}$$
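The Gaussian (8.34) is straightforward to evaluate. The short sketch below is an illustration with hypothetical function names: it checks the normalization by the trapezoidal rule and shows that for large t the mean decays to zero and the variance approaches B/γ. The parameter values are those quoted for Figure 8.2.

```python
import math

def sigma_v2(t, B, gamma):
    """Velocity variance from (8.34)."""
    return B / gamma * (1.0 - math.exp(-2.0 * gamma * t))

def p_v(v, t, v0, B, gamma):
    """Velocity density (8.34) for t > 0."""
    var = sigma_v2(t, B, gamma)
    mean = v0 * math.exp(-gamma * t)
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def stationary_p_v(v, B, gamma):
    """Long-time limit of (8.34): zero mean, variance B / gamma."""
    var = B / gamma
    return math.exp(-0.5 * v * v / var) / math.sqrt(2.0 * math.pi * var)

def trapezoid(f, lo, hi, n=4000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * (0.5 * f(lo) + sum(f(lo + i * h) for i in range(1, n)) + 0.5 * f(hi))

B, GAMMA, V0 = 0.5, 1.0, 1.0          # values quoted in Figure 8.2
NORM = trapezoid(lambda v: p_v(v, 1.0, V0, B, GAMMA), -10.0, 10.0)
```

Evaluating `p_v` at t = 0.5, 1 and 5 on a velocity grid gives the kind of relaxing profile described for Figure 8.2.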
Figure 8.2 The probability density distribution $p_v(v, t)$ given by (8.34) at three different times (t = 0.5 s, t = 1 s and t = 5 s) for the parameter values B = 0.5 m² s⁻³ and γ = 1 s⁻¹. The initial condition is v0 = 1 m s⁻¹.
The probability density distribution for velocity v at different times t is shown in Figure 8.2. The dotted line indicates our initial condition $p_v(v, t = 0) = \delta(v - v_0)$ with v0 = 1 m s⁻¹ (the initial velocity). The dashed line corresponds to the mean value of the velocity in the limit t → ∞. The relaxation of the distribution to the stationary (equilibrium) one can be clearly seen. Considering the long-time limit t → ∞ in (8.34), we obtain the well known Maxwell distribution [85, 193]
$$p_v(v, t) = \frac{1}{\sqrt{2\pi\,k_B T/m}}\exp\left[-\frac{1}{2}\,\frac{m v^2}{k_B T}\right], \tag{8.35}$$
$$\frac{B}{\gamma} = \frac{k_B T}{m} \tag{8.36}$$
with temperature T, mass of particles m, and the Boltzmann constant $k_B$. The diffusion coefficient in the velocity space B characterizes the fluctuation strength, whereas γ is the friction coefficient, which is related to the energy dissipation. As we can see, the ratio of these two quantities is proportional to the temperature and hence to the thermal energy $k_B T$. Relation (8.36), which has already appeared in Chapter 5 (cf. (5.103)), is known as Einstein's formula and represents a form of the fluctuation-dissipation theorem. In the same way we can calculate the probability distribution over the coordinate x
$$p_x(x, t) = \sqrt{2\pi}\,F[\tilde p(k_x, 0, t)],$$
$$\tilde p(k_x, 0, t) = \frac{1}{2\pi}\exp[-i k_x x_0]\exp\left[-i k_x v_0\left(1 - \exp[-\gamma t]\right)/\gamma\right]\exp[-\omega(t)\,k_x^2], \tag{8.37}$$
$$\omega(t) = -\frac{3B}{2\gamma^3} - \frac{B}{2\gamma^3}\exp[-2\gamma t] + \frac{B}{\gamma^2}\,t + 2\,\frac{B}{\gamma^3}\exp[-\gamma t].$$
It gives us the solution
$$p_x(x, t) = \frac{1}{\sqrt{2\pi\sigma_x^2(t)}}\exp\left[-\frac{1}{2}\,\frac{(x - \mu_x(t))^2}{\sigma_x^2(t)}\right],$$
$$\mu_x(t) = x_0 + \frac{v_0}{\gamma}\left(1 - \exp[-\gamma t]\right), \tag{8.38}$$
$$\sigma_x^2(t) = \frac{2B}{\gamma^2}\,t - 3\,\frac{B}{\gamma^3} + 4\,\frac{B}{\gamma^3}\exp[-\gamma t] - \frac{B}{\gamma^3}\exp[-2\gamma t].$$
The probability density distribution over the coordinate x at different times is presented in Figure 8.3. The dotted line again shows our initial condition $p_x(x, t = 0) = \delta(x - x_0)$ with x0 = 0. The dashed line indicates the mean value of the coordinate in the long-time limit t → ∞. The broadening of the distribution with time can be clearly seen. Considering the variance in more detail, we find the following relation
$$\sigma_x^2 \sim \frac{2B}{\gamma^2}\,t = 2Dt \tag{8.39}$$
for long times t → ∞. This linear growth of the variance, shown in Figure 8.4, had already been discovered by Albert Einstein [40].
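The crossover of the coordinate variance in (8.38) to the linear law (8.39) is easy to inspect numerically. The sketch below is only an illustration with a hypothetical function name; it uses the parameter values quoted for Figures 8.3 and 8.4 and estimates the late-time slope from a finite difference.

```python
import math

def sigma_x2(t, B, gamma):
    """Coordinate variance from (8.38)."""
    return (2.0 * B * t / gamma ** 2
            - 3.0 * B / gamma ** 3
            + 4.0 * B / gamma ** 3 * math.exp(-gamma * t)
            - B / gamma ** 3 * math.exp(-2.0 * gamma * t))

B, GAMMA = 0.5, 1.0                    # values quoted in Figures 8.3 and 8.4
D = B / GAMMA ** 2                     # diffusion coefficient entering (8.39)

# Late-time slope of the variance, to be compared with 2*D from (8.39).
LATE_SLOPE = sigma_x2(51.0, B, GAMMA) - sigma_x2(50.0, B, GAMMA)
```

Note that the exact variance stays below the straight line 2Dt by the constant offset 3B/γ³ at long times, which is why only the slope, not the value, is compared with (8.39).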
Figure 8.3 The probability density distribution $p_x(x, t)$ given by (8.38) at three different times (t = 0.5 s, t = 1 s and t = 5 s) for parameter values B = 0.5 m² s⁻³ and γ = 1 s⁻¹. The initial condition is x0 = 0 m and v0 = 1 m s⁻¹.

Figure 8.4 Time dependence of the variance $\sigma_x^2$ given by (8.38). The dashed line shows the asymptotic slope D = B/γ² for large times (see (8.39)). The values of the parameters are B = 0.5 m² s⁻³ and γ = 1 s⁻¹.
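In order to make the inverse transformation of the solution in Fourier space, we modify (8.28) as follows
$$\tilde p(k_x, k_v, t) = \frac{1}{2\pi}\exp[-i k_x \mu_x]\exp[-i k_v \mu_v]\exp\left[-\sigma_x^2 k_x^2 - \sigma_v^2 k_v^2 - \sigma_{xv}^2 k_v k_x\right], \tag{8.40}$$
$$\mu_x = x_0 + \frac{v_0}{\gamma}\left(1 - \exp[-\gamma t]\right),$$
$$\mu_v = v_0\exp[-\gamma t],$$
$$\sigma_x^2 = \frac{B}{2\gamma^3}\left(1 - \exp[-2\gamma t]\right) + \frac{B}{\gamma^2}\,t - 2\,\frac{B}{\gamma^3}\left(1 - \exp[-\gamma t]\right),$$
$$\sigma_v^2 = \frac{B}{2\gamma}\left(1 - \exp[-2\gamma t]\right),$$
$$\sigma_{xv}^2 = -\frac{B}{\gamma^2}\left(1 - \exp[-2\gamma t]\right) + 2\,\frac{B}{\gamma^2}\left(1 - \exp[-\gamma t]\right).$$
In the following we make the inverse transformation with respect to the variable $k_v$. For this purpose we rewrite (8.40) in a slightly modified form:
$$\tilde p(k_x, k_v, t) = \frac{1}{2\pi}\exp[-i k_x \mu_x]\exp[-i k_v \tilde\mu_v]\exp\left[-\sigma_x^2 k_x^2 - \sigma_v^2 k_v^2\right], \tag{8.41}$$
$$\tilde\mu_v = \mu_v - i\,\sigma_{xv}^2\,k_x.$$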
By using the inverse Fourier transformation given in (8.33), we obtain the probability distribution p(k_x, v, t) which reads
$$\begin{aligned}
p &= \frac{1}{2\pi}\,\frac{1}{\sqrt{2\sigma_v^2}}\exp[-i k_x \mu_x]\exp[-\sigma_x^2 k_x^2]\exp\left[-\frac{1}{2}\,\frac{\left(v - (\mu_v - i\,\sigma_{xv}^2 k_x)\right)^2}{2\sigma_v^2}\right] \\
&= \frac{1}{2\pi}\,\frac{1}{\sqrt{2\sigma_v^2}}\exp[-i k_x \tilde\mu_x]\exp[-\tilde\sigma_x^2 k_x^2]\exp\left[-\frac{1}{2}\,\frac{(v - \mu_v)^2}{2\sigma_v^2}\right],
\end{aligned} \tag{8.42}$$
$$\tilde\mu_x = \mu_x + (v - \mu_v)\,\sigma_{xv}^2/(2\sigma_v^2), \qquad \tilde\sigma_x^2 = \sigma_x^2 - \sigma_{xv}^4/(4\sigma_v^2).$$
The resulting probability density distribution p can be easily transformed back to the original distribution p(x, v, t) which we wanted to calculate, so
$$p(x, v, t) = \frac{1}{\sqrt{2\pi\,2\sigma_v^2}}\exp\left[-\frac{1}{2}\,\frac{(v - \mu_v)^2}{2\sigma_v^2}\right]\frac{1}{\sqrt{2\pi\,2\tilde\sigma_x^2}}\exp\left[-\frac{1}{2}\,\frac{(x - \tilde\mu_x)^2}{2\tilde\sigma_x^2}\right]. \tag{8.43}$$
The probability density p(x, v, t) is given in units of s m⁻². We have thus calculated the probability distribution from which we can determine the probability of finding a particle within any small coordinate interval [x, x + dx] and velocity interval [v, v + dv] at moment t. The probability density distribution obtained is presented in Figure 8.5. The broadening over the coordinate axis with increasing time, as well as the stationary profile over the velocity axis, can be easily recognized. Furthermore, it is easy to see that a simple multiplication of the one-dimensional distributions for the coordinate x and velocity v does not reproduce their mutual dependence.
8.3 The Ornstein–Uhlenbeck Process with Linear Potential
In the previous section we have treated the well known Ornstein–Uhlenbeck process and have obtained the time-dependent solution for the distribution of the probability density. The process considered there takes place in a spatial region with constant potential. Now we would like to consider the same process in the presence of a potential which is linear in x. It is defined by

$$
U(x) = -m \gamma \theta x.
\tag{8.44}
$$

By means of the known relation between force F and potential U

$$
F(x, t) = -\frac{\partial}{\partial x} U(x, t)
\tag{8.45}
$$

the equations of motion in the Langevin formalism can be written as

$$
dx = v\, dt,
\tag{8.46}
$$

$$
dv = \gamma (\theta - v)\, dt + \sqrt{2B}\, dW(t).
\tag{8.47}
$$

As distinct from the standard case we now have an additional term $\gamma \theta\, dt$. The corresponding Fokker–Planck equation for the probability density p(x, v, t) reads

$$
\frac{\partial}{\partial t} p = -\frac{\partial}{\partial x}(v\, p) - \frac{\partial}{\partial v}(\gamma \theta\, p) + \frac{\partial}{\partial v}(\gamma v\, p) + \frac{\partial^2}{\partial v^2}(B\, p).
\tag{8.48}
$$

Figure 8.5 The probability density distribution p(x, v, t) given by (8.43) at four different times t = 0.5 s (a), t = 1 s (b), t = 5 s (c), and t = 50 s (d). The values of the parameters are $B = 0.5\ \mathrm{m^2\,s^{-3}}$ and $\gamma = 1\ \mathrm{s^{-1}}$. The initial conditions are $x_0 = 0$ m and $v_0 = 1\ \mathrm{m\,s^{-1}}$.
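The Langevin equations (8.46) and (8.47) can also be integrated numerically. The following is a minimal sketch using the Euler–Maruyama scheme; the function name `simulate_ou_linear`, the step size, the ensemble size and the parameter values are illustrative assumptions, not part of the text. Averaging (8.47) gives $d\mu_v/dt = \gamma(\theta - \mu_v)$, so the sample mean of v should relax from $v_0$ towards $\theta$.

```python
import math
import random


def simulate_ou_linear(x0, v0, gamma, theta, B, t_end, dt=0.01, n_paths=2000, seed=1):
    """Euler-Maruyama integration of dx = v dt, dv = gamma*(theta - v) dt + sqrt(2B) dW.

    Returns the lists of final positions and final velocities of all paths.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    noise_amp = math.sqrt(2.0 * B * dt)  # standard deviation of sqrt(2B) dW over one step
    xs, vs = [], []
    for _ in range(n_paths):
        x, v = x0, v0
        for _ in range(n_steps):
            x += v * dt  # explicit Euler step for the coordinate
            v += gamma * (theta - v) * dt + noise_amp * rng.gauss(0.0, 1.0)
        xs.append(x)
        vs.append(v)
    return xs, vs


if __name__ == "__main__":
    xs, vs = simulate_ou_linear(x0=0.0, v0=1.0, gamma=1.0, theta=-1.0, B=0.5, t_end=2.0)
    mean_v = sum(vs) / len(vs)
    # Solution of d(mu_v)/dt = gamma*(theta - mu_v) with mu_v(0) = v0 = 1, theta = -1
    expected = math.exp(-2.0) - (1.0 - math.exp(-2.0))
    print(f"sample mean of v: {mean_v:.3f}, deterministic mean: {expected:.3f}")
```

The sample mean agrees with the deterministic value only within the statistical error of the ensemble and the discretization error of the scheme; both shrink with more paths and a smaller step.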
Applying the Fourier transformation twice as in the previous section, with $\hat{p} = \hat{p}(k_x, v, t)$ and $\tilde{p} = \tilde{p}(k_x, k_v, t)$, we obtain the following equation

$$
\frac{\partial}{\partial t} \tilde{p} + [\gamma k_v - k_x]\, \frac{\partial}{\partial k_v} \tilde{p} = -[B k_v^2 + i \gamma \theta k_v]\, \tilde{p}
\tag{8.49}
$$

which has to be solved. It is easy to see that the system of characteristic equations changes only with respect to $\tilde{p}$ and not with respect to $k_v$. Hence, the solution for $k_v(\tau)$ can be taken from (8.24). The solution for $\tilde{p}$ is calculated as follows:

$$
\frac{d}{d\tau} \tilde{p} = -[B k_v^2 + i \gamma \theta k_v]\, \tilde{p}
\quad \text{gives} \quad
\tilde{p} = \tilde{p}_0 \exp[-\Phi(t)],
$$

$$
\Phi(t) = \int_0^t d\tau \left[ B k_v(\tau)^2 + i \gamma \theta k_v(\tau) \right],
\tag{8.50}
$$

$$
\Phi(t) = \frac{B A^2}{2\gamma} (\exp[2\gamma t] - 1) + B C^2 t + 2\, \frac{B A C}{\gamma} (\exp[\gamma t] - 1) + i \theta A (\exp[\gamma t] - 1) + i \gamma \theta C t,
$$

$$
A = \left( k_v - \frac{k_x}{\gamma} \right) \exp[-\gamma t], \qquad C = \frac{k_x}{\gamma}.
$$

The initial conditions are the same as in the previous section. Therefore we also have the same constant $\tilde{p}_0$. In the following, we transform the above solution for the probability density distribution in the Fourier space into a form similar to (8.40), that is,

$$
\tilde{p}(k_x, k_v, t) = \frac{1}{2\pi} \exp[-i k_x \mu_x] \exp[-i k_v \mu_v] \exp[-\sigma_x^2 k_x^2 - \sigma_v^2 k_v^2 - \sigma_{xv}^2 k_v k_x],
$$

$$
\begin{aligned}
\mu_x &= x_0 + \left( \frac{v_0}{\gamma} - \frac{\theta}{\gamma} \right) (1 - \exp[-\gamma t]) + \theta t, \\
\mu_v &= v_0 \exp[-\gamma t] + \theta\, (1 - \exp[-\gamma t]), \\
\sigma_x^2 &= \frac{B}{2\gamma^3} (1 - \exp[-2\gamma t]) + \frac{B}{\gamma^2}\, t - 2\, \frac{B}{\gamma^3} (1 - \exp[-\gamma t]), \\
\sigma_v^2 &= \frac{B}{2\gamma} (1 - \exp[-2\gamma t]), \\
\sigma_{xv}^2 &= -\frac{B}{\gamma^2} (1 - \exp[-2\gamma t]) + 2\, \frac{B}{\gamma^2} (1 - \exp[-\gamma t]).
\end{aligned}
\tag{8.51}
$$
In this way, for the Ornstein–Uhlenbeck process with linear potential we obtain a solution similar to the previous one (for constant or zero potential) with only slightly modified parameters:

$$
p(x, v, t) = \frac{1}{\sqrt{2\pi\, 2\sigma_v^2}} \exp\left[-\frac{1}{2}\, \frac{(v - \mu_v)^2}{2\sigma_v^2}\right] \frac{1}{\sqrt{2\pi\, 2\bar{\sigma}_x^2}} \exp\left[-\frac{1}{2}\, \frac{(x - \bar{\mu}_x)^2}{2\bar{\sigma}_x^2}\right],
\tag{8.52}
$$

$$
\bar{\mu}_x = \mu_x + (v - \mu_v)\, \sigma_{xv}^2 / (2\sigma_v^2), \qquad
\bar{\sigma}_x^2 = \sigma_x^2 - \sigma_{xv}^4 / (4\sigma_v^2).
$$

The resulting probability density distribution p(x, v, t) is presented in Figure 8.6. The drift and broadening over the coordinate x with increasing time and also the stationary profile over the velocity axis v (now with a nonzero mean value) can be easily recognized.
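The parameters (8.51) and the density (8.52) are straightforward to evaluate numerically. The sketch below is one possible implementation; the function names are hypothetical, and it keeps the convention of this chapter in which the Gaussian variances are $2\sigma^2$.

```python
import math


def ou_linear_moments(t, x0, v0, gamma, theta, B):
    """Parameters (8.51). In this convention the Gaussian variances are 2*sigma^2."""
    e1 = 1.0 - math.exp(-gamma * t)
    e2 = 1.0 - math.exp(-2.0 * gamma * t)
    mu_x = x0 + (v0 - theta) / gamma * e1 + theta * t
    mu_v = v0 * math.exp(-gamma * t) + theta * e1
    sx2 = B / (2.0 * gamma**3) * e2 + B / gamma**2 * t - 2.0 * B / gamma**3 * e1
    sv2 = B / (2.0 * gamma) * e2
    sxv2 = -B / gamma**2 * e2 + 2.0 * B / gamma**2 * e1
    return mu_x, mu_v, sx2, sv2, sxv2


def gaussian(z, mean, var):
    """Normalized Gaussian density with the given mean and variance."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def joint_density(x, v, t, x0, v0, gamma, theta, B):
    """Joint density p(x, v, t) of (8.52); requires t > 0."""
    mu_x, mu_v, sx2, sv2, sxv2 = ou_linear_moments(t, x0, v0, gamma, theta, B)
    mu_x_bar = mu_x + (v - mu_v) * sxv2 / (2.0 * sv2)
    sx2_bar = sx2 - sxv2**2 / (4.0 * sv2)
    return gaussian(v, mu_v, 2.0 * sv2) * gaussian(x, mu_x_bar, 2.0 * sx2_bar)
```

With the parameter values quoted for Figure 8.6, `ou_linear_moments(50.0, 0.0, 1.0, 1.0, -1.0, 0.5)` gives a mean coordinate of about −48 m, far outside the plotted window, which is consistent with the very small density values in the last panel of that figure.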
Figure 8.6 The probability density distribution p(x, v, t) given by (8.52) at four different times t = 0.5 s (a), t = 1 s (b), t = 5 s (c), and t = 50 s (d). The values of the parameters are θ = −1 m s−1 , B = 0.5 m2 s−3 , and γ = 1 s−1 . The initial conditions are x0 = 0 m and v0 = 1 m s−1 .
Comparing the solutions for the Ornstein–Uhlenbeck process with constant potential (previous section) and with the linear potential (see (8.52)), we find the same analytical form of a double Gaussian distribution. The variances (and the correlation between the coordinate x and the velocity v) are the same in both cases. However, the mean values $\mu_v$ and $\mu_x$ are changed by the potential. The velocity v now relaxes to the mean value $\theta$ for large times $t \to \infty$. In this case we also have a deterministic drift $\theta t$ in the coordinate x in addition to the influence of the initial condition, which enters via the difference term $(v_0 - \theta)/\gamma$. By means of the general solution (8.52) one can again calculate the one-dimensional probability density distributions for any one of the variables. For the velocity v we obtain

$$
p_v(v, t) = \frac{1}{\sqrt{2\pi\, 2\sigma_v^2}} \exp\left[-\frac{1}{2}\, \frac{(v - \mu_v)^2}{2\sigma_v^2}\right],
\qquad
\mu_v = v_0 \exp[-\gamma t] + \theta\, (1 - \exp[-\gamma t]),
\qquad
\sigma_v^2 = \frac{B}{2\gamma} (1 - \exp[-2\gamma t]).
\tag{8.53}
$$
Figure 8.7 Probability density distribution $p_v(v, t)$ given by (8.53) for three different times (t = 0.5 s, t = 1 s, and t = 5 s). The values of the parameters are $\theta = -1\ \mathrm{m\,s^{-1}}$, $B = 0.5\ \mathrm{m^2\,s^{-3}}$, and $\gamma = 1\ \mathrm{s^{-1}}$. The initial condition is $v_0 = 1\ \mathrm{m\,s^{-1}}$.
The time-dependent solution for the velocity v is presented in Figure 8.7. We again find that the probability density distribution corresponds to the Gaussian distribution. Due to the linear potential in coordinate space, the mean value of the stationary probability distribution is shifted by $\theta$. The variance (the mean squared fluctuation of the velocity), however, is not influenced by the linear potential. Now we calculate the one-dimensional distribution over the coordinate x by means of the probability density p(x, v, t). The time-dependent solution reads

$$
p_x(x, t) = \frac{1}{\sqrt{2\pi\, 2\sigma_x^2}} \exp\left[-\frac{1}{2}\, \frac{(x - \mu_x)^2}{2\sigma_x^2}\right],
\tag{8.54}
$$

$$
\mu_x = x_0 + \left( \frac{v_0}{\gamma} - \frac{\theta}{\gamma} \right) (1 - \exp[-\gamma t]) + \theta t,
$$

$$
\sigma_x^2 = \frac{B}{2\gamma^3} (1 - \exp[-2\gamma t]) + \frac{B}{\gamma^2}\, t - 2\, \frac{B}{\gamma^3} (1 - \exp[-\gamma t]).
$$
The distribution is shown in Figure 8.8. This function also corresponds to the Gaussian distribution. The variance (the mean squared fluctuation in the coordinate space) is not influenced by the linear potential. While the mean velocity tends to the value $\theta$, the time dependence of the mean value of the coordinate x is dominated by the linear term $\theta t$ in the long-time limit $t \to \infty$. The Ornstein–Uhlenbeck process with the linear potential differs from the standard one only in the mean values of the double Gaussian distribution. Therefore, we have shown in Figure 8.9 the mean value of the velocity $\mu_v$ depending on the mean value of the coordinate $\mu_x$ for times between 0 and 5 s. This relation is always linear for the standard Ornstein–Uhlenbeck process and has a fixed starting point (corresponding to the given initial condition) $\mu_x = x_0$, $\mu_v = v_0$ and a fixed end point $\mu_x = x_0 + v_0/\gamma$, $\mu_v = 0$ for $t \to \infty$. On the contrary, the relation between both mean values is nonlinear for $\theta \neq 0$ in the case of the Ornstein–Uhlenbeck process with linear potential. The starting point in both cases is the same because of the same initial condition. The end point at $t \to \infty$ is $\mu_x \to \pm\infty$ (according to the sign of $\theta$) and $\mu_v = \theta$ for the linear potential. This results in the nonlinear relation between both mean values.

Figure 8.8 The probability density distribution $p_x(x, t)$ given by (8.54) at three different times (t = 0.5 s, t = 1 s, and t = 5 s). The values of the parameters are $\theta = -1\ \mathrm{m\,s^{-1}}$, $B = 0.5\ \mathrm{m^2\,s^{-3}}$, and $\gamma = 1\ \mathrm{s^{-1}}$. The initial conditions are $x_0 = 0$ m and $v_0 = 1\ \mathrm{m\,s^{-1}}$.

Figure 8.9 The mean value of the velocity $\mu_v$ versus the mean value of the coordinate $\mu_x$ for times t from 0 s to 5 s (the times t = 0 s, 0.5 s, 1 s, and 5 s are marked). OUP is the abbreviation for the Ornstein–Uhlenbeck process. The values of the parameters are $\theta = -1\ \mathrm{m\,s^{-1}}$, $B = 0.5\ \mathrm{m^2\,s^{-3}}$, and $\gamma = 1\ \mathrm{s^{-1}}$. The initial conditions are $x_0 = 0$ m and $v_0 = 1\ \mathrm{m\,s^{-1}}$.
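The relation between the two mean values can be made explicit by eliminating the time t from the expressions for $\mu_x$ and $\mu_v$ in (8.51). The following short derivation is a sketch under the assumption $\theta \neq v_0$, so that the logarithm is defined. From the expression for $\mu_v$ one finds

$$
\exp[-\gamma t] = \frac{\mu_v - \theta}{v_0 - \theta},
\qquad
(v_0 - \theta)\,(1 - \exp[-\gamma t]) = v_0 - \mu_v,
\qquad
t = -\frac{1}{\gamma} \ln \frac{\mu_v - \theta}{v_0 - \theta},
$$

and inserting this into the expression for $\mu_x$ gives

$$
\mu_x = x_0 + \frac{v_0 - \mu_v}{\gamma} - \frac{\theta}{\gamma} \ln \frac{\mu_v - \theta}{v_0 - \theta}.
$$

For $\theta = 0$ the logarithmic term is absent and the straight line $\mu_x = x_0 + (v_0 - \mu_v)/\gamma$ of the standard process remains. For $\theta \neq 0$ the logarithm diverges as $\mu_v \to \theta$, which corresponds to the unbounded drift of $\mu_x$.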
8.4 The Exponential Ornstein–Uhlenbeck Process
Here we consider the Ornstein–Uhlenbeck process with a potential of exponential form and derive the equation of motion in the space of Laplace and Fourier transformed variables. We consider the following system of equations in the Langevin formalism

$$
dx = \theta \exp[a v]\, dt,
\tag{8.55}
$$

$$
dv = -\gamma v\, dt + \sqrt{2B}\, dW(t).
\tag{8.56}
$$
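A single trajectory of (8.55) and (8.56) can be generated with the Euler–Maruyama scheme. The sketch below is illustrative only: the function name, the step size and the parameter values are assumptions, and $\theta > 0$ is assumed so that the drift of x is positive.

```python
import math
import random


def simulate_exp_ou(x0, v0, theta, a, gamma, B, t_end, dt=0.001, seed=7):
    """Euler-Maruyama path of dx = theta*exp(a*v) dt, dv = -gamma*v dt + sqrt(2B) dW.

    Returns the list of positions x(0), x(dt), ..., x(t_end).
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    noise_amp = math.sqrt(2.0 * B * dt)  # standard deviation of sqrt(2B) dW over one step
    x, v = x0, v0
    path = [x]
    for _ in range(int(round(t_end / dt))):
        x += theta * math.exp(a * v) * dt  # increment is positive whenever theta > 0
        v += -gamma * v * dt + noise_amp * rng.gauss(0.0, 1.0)
        path.append(x)
    return path
```

Because $\exp[a v] > 0$ for every v, each increment of x has the sign of $\theta$, whatever the noise realization in v.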
Since the dependence on the velocity v is exponential, the drift $\theta \exp[a v]$ never changes sign. For $\theta > 0$ the coordinate x can therefore only increase with time, and it takes only positive values if we start with a positive one. The Fokker–Planck equation for the probability density p = p(x, v, t) reads

$$
\frac{\partial}{\partial t} p = -\frac{\partial}{\partial x} \left( \theta \exp[a v]\, p \right) + \frac{\partial}{\partial v} (\gamma v\, p) + \frac{\partial^2}{\partial v^2} (B\, p).
\tag{8.57}
$$

We set

$$
p(x, v, t = 0) = \delta(x - x_0)\, \delta(v - v_0),
\tag{8.58}
$$

$$
p(0, v, t) = 0 \quad \text{for } x_0 > 0
\tag{8.59}
$$

as the initial and boundary conditions. In this case it is reasonable to consider the Laplace rather than the Fourier transformation with respect to the coordinate x, since x takes only positive values. The Laplace transformation is defined by

$$
\hat{p}(k_x, v, t) = \int_0^{+\infty} dx\, \exp[-k_x x]\, p(x, v, t).
\tag{8.60}
$$

Hence we obtain for $\hat{p} = \hat{p}(k_x, v, t)$

$$
\frac{\partial}{\partial t} \hat{p} = -\theta \exp[a v]\, k_x\, \hat{p} + \frac{\partial}{\partial v} (\gamma v\, \hat{p}) + \frac{\partial^2}{\partial v^2} (B\, \hat{p}).
\tag{8.61}
$$

We also make the Fourier transformation $\tilde{p} = \tilde{p}(k_x, k_v, t)$ with respect to the velocity v

$$
\hat{p}(k_x, v, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dk_v\, \exp[i k_v v]\, \tilde{p}(k_x, k_v, t),
\tag{8.62}
$$

$$
\begin{aligned}
\exp[a v]\, \hat{p}(k_x, v, t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty - i a}^{+\infty - i a} dk_v\, \exp[i k_v v]\, \tilde{p}(k_x, k_v + i a, t) \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dk_v\, \exp[i k_v v]\, \tilde{p}(k_x, k_v + i a, t).
\end{aligned}
\tag{8.63}
$$

Here we have assumed that $\tilde{p}(k_x, k_v, t)$ vanishes at $\mathrm{Re}\, k_v \to \pm\infty$ and is analytic within $-a \le \mathrm{Im}\, k_v \le 0$ in the complex plane of $k_v$, which allows us to shift the integration path in (8.63) to the real axis. This leads to the following equation of motion

$$
\frac{\partial}{\partial t} \tilde{p}(k_x, k_v, t) = -\theta k_x\, \tilde{p}(k_x, k_v + i a, t) - \gamma k_v \frac{\partial}{\partial k_v} \tilde{p}(k_x, k_v, t) - B k_v^2\, \tilde{p}(k_x, k_v, t).
\tag{8.64}
$$

An essential difference from the equations we obtained earlier for the cases with constant and linear potentials is that (8.64) contains $\tilde{p}$ with shifted argument $k_v + i a$. This equation therefore cannot be solved analytically in the same way as previously.

8.5 Outlook on Econophysics
Over the last few years, research activities in the field of quantitative finance have greatly increased [119,180]. The main topics of interest, especially for practitioners, are risk management and derivatives. But in order to handle risk optimally it is necessary to know some basic facts about stock price dynamics, at least the basic characteristics of the dynamics (chaotic or stochastic, Gaussian-like or not). This is why we want to consider stock price dynamics as a subject in itself [189]. The stock price development is described as a stochastic process following geometric Brownian motion (cf. Section 5.9). The Langevin equation in Ito notation (5.166) is well known as

$$
dk = \mu k\, dt + \sigma k\, dW(t),
\tag{8.65}
$$

with the stock price k at time t, where $\mu$ is the time-independent drift (or growth rate), $\sigma$ is the time-independent fluctuation strength and dW(t) is the increment of a Wiener process W(t). As a result of this description, the logarithmic stock price $x(t) = \ln[k(t)/k_0]$ follows a Gaussian distribution with a time-dependent mean of $(\mu - \sigma^2/2)\, t + x_0$ and a time-dependent variance of $\sigma^2 t$, if the initial value at time t = 0 is fixed at $x_0$. The geometric Brownian motion was the basis for further investigations in econophysics, for instance, of the Black–Scholes formula [180]. But in recent years it was found that the description did not agree with the empirical facts of the stock market [189–192]. Mainly, the changes dx(t) in the logarithmic stock price x(t) are not Gaussian distributed. The empirical distribution is leptokurtic. Compared to a Gaussian, it has fat tails (a higher probability density at large absolute values) and a higher concentration of probability density around the mean, see Figure 8.10.

In models with stochastic volatility, the variance $v = \sigma^2$ is no longer a parameter, but is itself a variable. In addition to this, the variance is, like the stock price, a stochastic variable. This coincides with empirical observations that exhibit random behavior of the variance. The corresponding stochastic differential equations with the Langevin formalism in the Ito notation are as follows:

$$
\begin{aligned}
dk &= \mu k\, dt + \sqrt{v(t)}\, k\, dW_k(t), \\
dv &= a_v(v, k, t)\, dt + b_v(v, k, t)\, dZ_v(t), \\
dZ_v(t) &= \rho\, dW_k(t) + \sqrt{1 - \rho^2}\, dW_v(t).
\end{aligned}
\tag{8.66}
$$
The noise terms dWi (t) are again the increments of a Wiener process and they are independent and identically distributed.
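The construction of the correlated noise increment $dZ_v$ in (8.66) can be checked numerically. The sketch below is illustrative; the function names and the sample size are assumptions. It draws independent Wiener increments $dW_k$ and $dW_v$ and combines them into $dZ_v$, which should then have variance dt and correlation coefficient $\rho$ with $dW_k$.

```python
import math
import random


def correlated_increments(rho, dt, n, seed=3):
    """Draw n pairs (dW_k, dZ_v) with dZ_v = rho*dW_k + sqrt(1 - rho^2)*dW_v."""
    rng = random.Random(seed)
    s = math.sqrt(dt)  # standard deviation of a Wiener increment over dt
    pairs = []
    for _ in range(n):
        dw_k = s * rng.gauss(0.0, 1.0)
        dw_v = s * rng.gauss(0.0, 1.0)
        pairs.append((dw_k, rho * dw_k + math.sqrt(1.0 - rho * rho) * dw_v))
    return pairs


def sample_correlation(pairs):
    """Sample correlation coefficient of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(var_a * var_b)
```

The weights $\rho$ and $\sqrt{1 - \rho^2}$ are chosen so that their squares sum to one, which is why $dZ_v$ is again a Wiener increment with variance dt.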
Figure 8.10 The empirical probability density distribution of the DAX (German stock index) for an increment dx in the logarithmic stock price x(t) for dt = 3 h is plotted against the normal distribution with estimated moments.
The analysis of empirical stock prices shows that the variance exhibits mean reversion, which means that it fluctuates randomly around a long-run average. Therefore, for the drift term $a_v(v, k, t)$ we use the following relation (Ito notation):

$$
a_v(v, k, t) = \gamma (\theta - v).
\tag{8.67}
$$
The parameter $\theta$ is the value of v (usually taken as the long-run average) toward which the variance relaxes on the time scale of the relaxation time $1/\gamma$. Both approaches that we want to analyze, the Hull–White and the Heston model, belong to the models with stochastic volatility and describe the variance by a mean-reverting process. Only the diffusion terms are different. The fluctuation terms of the Hull–White model $b_v^{HW}(v, k, t)$ and of the Heston model $b_v^{H}(v, k, t)$ are known as [192]

$$
b_v^{HW}(v, k, t) = \tilde{\kappa}\, v,
\tag{8.68}
$$

$$
b_v^{H}(v, k, t) = \kappa \sqrt{v},
\tag{8.69}
$$

where the parameters $\kappa$ and $\tilde{\kappa}$ represent the strength of the stochastic fluctuations of the variance v. In fitting the models against empirical data, the correlation coefficient $\rho$ is often set to zero. The empirical analysis shows that $\rho \ll 1$ holds, and we assume it in our further analysis. In the Heston model without correlation, the dynamics of the variance v(t) is described by a stochastic differential equation in the Langevin formalism with Ito notation as follows:

$$
dv = \gamma (\theta - v)\, dt + \kappa \sqrt{v}\, dW_v(t).
\tag{8.70}
$$

The corresponding equation for the Hull–White model is

$$
dv = \gamma (\theta - v)\, dt + \tilde{\kappa}\, v\, dW(t).
\tag{8.71}
$$
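For a further analysis of the variance we transform the Langevin equation into a Fokker–Planck equation

$$
\frac{\partial}{\partial t} p(v, t) + \frac{\partial}{\partial v} S(v, t) = 0
\tag{8.72}
$$

with probability p(v, t) dv that the variance is in the interval [v, v + dv] at time t, where S(v, t) is the probability flux. It is

$$
S(v, t) = \gamma \theta\, p(v, t) - \gamma v\, p(v, t) - \frac{\kappa^2}{2} \frac{\partial}{\partial v} \left[ v\, p(v, t) \right]
\tag{8.73}
$$

for the Heston model and

$$
S(v, t) = \gamma \theta\, p(v, t) - \gamma v\, p(v, t) - \frac{\tilde{\kappa}^2}{2} \frac{\partial}{\partial v} \left[ v^2\, p(v, t) \right]
\tag{8.74}
$$

for the Hull–White model. We are now interested in the stationary solution of the probability density distribution of the variance v. In this case the probability current S(v, t) in (8.73) is equal to zero. We obtain the stationary probability density distribution $p_{st}^{H}(v)$ for the variance v(t) of the Heston model without correlation as

$$
p_{st}^{H}(v) = \frac{a^a}{\Gamma(a)\, \theta^a}\, v^{a-1} \exp\left[ -\frac{a v}{\theta} \right];
\qquad a = \frac{2 \gamma \theta}{\kappa^2}.
\tag{8.75}
$$

For the Hull–White model this reads as follows:

$$
p_{st}^{HW}(v) = \frac{\tilde{a}^{\tilde{a}}\, \theta^{\tilde{a}+1}}{\Gamma(\tilde{a})}\, v^{-(\tilde{a}+2)} \exp\left[ -\frac{\tilde{a}\, \theta}{v} \right];
\qquad \tilde{a} = \frac{2 \gamma}{\tilde{\kappa}^2}.
\tag{8.76}
$$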
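The two stationary densities (8.75) and (8.76) can be evaluated and checked for normalization numerically. The following sketch is illustrative: the function names and the simple midpoint rule are assumptions, and `a_t` stands for $\tilde{a}$. The densities are evaluated through their logarithms (using `math.lgamma`) to avoid overflow for large shape parameters.

```python
import math


def heston_stationary(v, a, theta):
    """Stationary density (8.75): a gamma distribution with shape a and mean theta."""
    log_p = (a * math.log(a / theta) - math.lgamma(a)
             + (a - 1.0) * math.log(v) - a * v / theta)
    return math.exp(log_p)


def hull_white_stationary(v, a_t, theta):
    """Stationary density (8.76): an inverse-gamma distribution (a_t stands for a-tilde)."""
    log_p = (a_t * math.log(a_t) + (a_t + 1.0) * math.log(theta) - math.lgamma(a_t)
             - (a_t + 2.0) * math.log(v) - a_t * theta / v)
    return math.exp(log_p)


def midpoint_integral(f, lower, upper, n):
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    h = (upper - lower) / n
    return h * sum(f(lower + (i + 0.5) * h) for i in range(n))
```

The Heston density decays exponentially for large v, whereas the Hull–White density decays only as the power law $v^{-(\tilde{a}+2)}$, so a numerical integration of the latter needs a much larger upper cutoff.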
The empirical analysis of stock price data reveals that the relaxation time $1/\gamma$ of the variance v is around 22 days (especially for indices). Compared to the time scale of about 1 h of the logarithmic returns that we are interested in, this is very long and leads to the conclusion that we can regard the variance v as constant in the short time window we are considering. Therefore, we consider the probability density distribution of the logarithmic changes y of the stock prices for short time windows $\tau$. We apply the relation for the conditional probabilities

$$
p(k, t) = \int p(k, t \mid v, t)\, p(v, t)\, dv.
\tag{8.77}
$$

In accordance with the above discussion, we can use the stationary distribution $p_{st}(v)$ instead of the time-dependent probability density distribution p(v, t). Furthermore, we do not focus on the stock price k itself, but on the changes y = dx of the logarithmic stock price $x = \ln[k(t)/k_0]$ in the time interval $\tau = dt$. Therefore, according to (8.77) we derive the relation

$$
p_s(y, \tau) = \int_{v=0}^{\infty} p(y, \tau \mid v)\, p_{st}(v)\, dv,
\tag{8.78}
$$

where $p_s(y, \tau)$ indicates that the variance has to be in the stationary regime. The stationary distribution $p_{st}(v)$ we have already calculated. But we need an ansatz for the conditional probability density distribution $p(y, \tau \mid v)$. We know that, in the case of constant variance $v = \sigma^2$, the Heston model and the Hull–White model transform into geometric Brownian motion (for the parameters $\kappa = 0\ \mathrm{d^{-1}}$, $\tilde{\kappa} = 0\ \mathrm{d^{-1/2}}$, $v(t = 0) = \bar{v}$, $\theta = \bar{v}$), where the logarithmic changes y are normally distributed with mean $\mu \tau - \frac{1}{2} \bar{v} \tau$ and variance $\bar{v} \tau$. Because the integral (8.77) is also valid for the geometric Brownian motion (with $p(v, t) = \delta(v - \bar{v})$) we obtain the ansatz for the conditional probability density distribution

$$
p(y, \tau \mid v) = \frac{1}{\sqrt{2 \pi v \tau}} \exp\left[ -\frac{1}{2}\, \frac{\left( y - \mu \tau + \frac{1}{2} v \tau \right)^2}{v \tau} \right],
\tag{8.79}
$$

which also coincides with an analytically obtained result for the Heston model. With the help of the conditional probability density distribution (8.79) and the specific stationary distributions of the variance we calculate the probability density distributions of the logarithmic returns for short times $\tau$. We have fitted the obtained solutions against the returns of 1 hour ($\tau = 1$ h) for the DAX index presented in Figures 8.11 and 8.12. More examples and an extended analysis can be found in [189–192].
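The mixture integral (8.78) with the Gaussian ansatz (8.79) and the Heston stationary density (8.75) can be evaluated by simple quadrature. The sketch below is one illustrative way to do this; the function names, the midpoint rule and the cutoff `v_max_factor` are assumptions and do not reproduce the fitting procedure used for the figures.

```python
import math


def conditional_density(y, tau, v, mu):
    """Gaussian ansatz (8.79) for the log-return y at fixed variance v."""
    var = v * tau
    return (math.exp(-0.5 * (y - mu * tau + 0.5 * var) ** 2 / var)
            / math.sqrt(2.0 * math.pi * var))


def heston_stationary(v, a, theta):
    """Stationary variance density (8.75) of the Heston model."""
    return math.exp(a * math.log(a / theta) - math.lgamma(a)
                    + (a - 1.0) * math.log(v) - a * v / theta)


def return_density_heston(y, tau, mu, a, theta, n=4000, v_max_factor=40.0):
    """Midpoint-rule evaluation of (8.78), truncated at v = v_max_factor * theta."""
    h = v_max_factor * theta / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        total += conditional_density(y, tau, v, mu) * heston_stationary(v, a, theta)
    return total * h
```

Evaluated with the parameter values quoted in the caption of Figure 8.11, the mixture lies above a Gaussian of variance $\theta \tau$ both at the center and far in the tails, which is the leptokurtic shape described above.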
Figure 8.11 The probability density distribution of the logarithmic return y for the Heston model $p_s^{H}(y, \tau)$ is plotted against the calculated probability density distribution $p_{Dax}(y, \tau)$ of the DAX index (02.05.1996–28.12.2001) for time $\tau = 1$ h with $\chi^2 = 24.2$, a = 1.36, $\theta = 5.15 \times 10^{-5}\ \mathrm{h^{-1}}$ and $\mu = 3.03 \times 10^{-4}\ \mathrm{h^{-1}}$.

Figure 8.12 The probability density distribution of the logarithmic return y for the Hull–White model $p_s^{HW}(y, \tau)$ is plotted against the calculated probability density distribution $p_{Dax}(y, \tau)$ of the DAX index (02.05.1996–28.12.2001) for time $\tau = 1$ h with $\chi^2 = 54.1$, a = 0.08, $\theta = 3.21 \times 10^{-4}\ \mathrm{h^{-1}}$ and $\mu = 2.97 \times 10^{-4}\ \mathrm{h^{-1}}$.
8.6 Exercises
E 8.1 Ornstein–Uhlenbeck's paper from 1930
Study the historical paper by George Eugene Uhlenbeck (1900–1988) and Leonard Salomon Ornstein (1880–1941) 'On the Theory of the Brownian Motion' (Physical Review, 1930, vol. 36, pp. 823–841) in detail. As an extension of the situation without additional force (cf. Section 8.2), consider the case with an external force, namely the Brownian motion of a harmonically bound particle. Hint: Take [231] as the starting point for your analysis.

E 8.2 The Ornstein–Uhlenbeck process
Consider the one-dimensional stochastic process (8.1) with a = 0, $c = \gamma$ (friction coefficient) and $b = \sqrt{2B}$ (fluctuation strength). Based on the given Langevin equation (8.8) with initial value $v(t = 0) = v_0$ write down the corresponding Fokker–Planck equation. Find the time-dependent solution for the stochastic process given by v(t) (Langevin approach) as well as p(v, t) (probability density) and $p(v_1, t_1; v_2, t_2)$ (joint probability density, JPD, (3.1)) or $p(v_2, t_2 \mid v_1, t_1)$ (conditional probability density, CPD, (3.2)) from the Fokker–Planck approach. Remember the average or mean value $\langle v(t) \rangle$

$$
\mu(t) = \int v\, p(v, t)\, dv
\tag{8.80}
$$

and the covariance or correlation function $\langle (v(t) - \langle v(t) \rangle)(w(s) - \langle w(s) \rangle) \rangle$

$$
\sigma^2(t, s) = \int\!\!\int \left( v - \mu(t) \right) \left( w - \mu(s) \right) p(v, t; w, s)\, dw\, dv
\tag{8.81}
$$

and show that the solution is a Gaussian normal distribution $N(\mu(t); \sigma^2(t, t))$ with mean $\mu(t)$ and variance $\sigma^2(t, t)$, which should be calculated.

E 8.3 The Heston model
Verify that (8.75) is the stationary solution of the Fokker–Planck equation (8.72) with the probability flux (8.73) describing the variance in the Heston model. Perform simulations of stochastic trajectories for the corresponding Langevin equation and evaluate the stationary probability distribution in the long-time limit. Compare the result with the analytical solution (8.75).
9 Nucleation in Supersaturated Vapors
9.1 Dynamics of First-Order Phase Transitions in Finite Systems
Stochastic processes have many classical applications in physics such as diffusion or Brownian motion as discussed in Chapters 6–8. Here we consider a further important application to the dynamics of first-order phase transitions, taking the formation of liquid droplets in a supersaturated vapor as a particular example. A complete theory of the dynamics of first-order phase transitions would have to account for a variety of physical processes on different time and length scales including nucleation, spinodal decomposition, growth of clusters of the new phase and late-stage coarsening processes like Ostwald ripening and coagulation, see e.g. [237, 244]. If the system is infinitely large, then the depletion of the medium can be neglected in the nucleation process. This allows a theoretical description of a first-order phase transition ignoring any interaction between the growing clusters as presumed in the classical nucleation theory [16, 45, 220, 236, 248]. In this case we obtain a simultaneous nucleation and independent growth of already formed supercritical clusters. The basic kinetic model underlying classical nucleation theory was proposed by Leo Szilard. It yields a steady-state nucleation rate provided that the system is continuously supplied with monomers while large clusters are removed. For a finite system the formation and growth of the clusters result in depletion of the surrounding medium. Thus we come to another scenario of first-order phase transitions in finite closed systems. This general scenario is characterized as follows. First, a process of nucleation occurs in the initial homogeneous supersaturated state requiring a very short time. The critical energy for the formation of a stable droplet is determined through a competition between a volume term (which favors creation of the droplet) and a surface term (which favors its dissolution). On average, the droplets with $n > n_{cr}$ (critical cluster size) grow, while those with $n < n_{cr}$ shrink.
Therefore, the formation of critical nuclei in the initially homogeneous state is possible only due to stochastic fluctuations. Further stable growth of droplets over the critical size ($n > n_{cr}$) can be described in a deterministic manner. In a second stage, the growth of the already formed supercritical clusters predominates and the nucleation rate decreases. This stage is succeeded by a third stage
of competitive growth of the clusters, the so-called Ostwald ripening period. It is characterized by a decrease in the number of clusters and an increase in their mean size or radius during a large interval of time. The theory of Ostwald ripening goes back to the pioneering work by Lifshitz and Slyozov [124] and, independently, that of Wagner [238]. The classical Lifshitz–Slyozov–Wagner (LSW) theory describes a homogeneous or heterogeneous system (binary mixture) in the two-phase region within the droplet model using a cluster size distribution which changes due to monomer condensation and/or evaporation. Ostwald ripening is the process of phase separation in a supersaturated system or binary mixture by diffusional growth of spherical nuclei of the minority phase. At low initial supersaturation, spherical droplets will typically be nucleated at large mutual separation so that the diffusional droplet growth can be described in a single-droplet picture. In the LSW theory of Ostwald ripening the droplet growth is coupled to the concentration field to obey global mass conservation. The LSW theory provides evidence for the existence of a universal cluster distribution function if appropriately scaled variables are used and for universal power laws in time for physical quantities such as cluster density and mean cluster size. Universality here means that the asymptotic long-time behavior is independent of the details of the initial nucleation process. One finds, in particular, that the critical cluster size increases as $t^{1/3}$, the supersaturation follows the inverse $t^{-1/3}$ law, and the number of clusters decreases as 1/t at large times t. Ostwald ripening, the late-stage process of droplet growth by evaporation and condensation, is well understood. To reduce the interfacial free energy of the system, material diffuses away from small, high-curvature droplets (which dissolve), and condenses onto large, low-curvature droplets (which grow).
In this way, the large droplets swallow the small ones. A single large droplet survives in the final stage of this competition. The classical LSW theory considers this process in the noninteracting limit of a vanishing volume fraction $\phi$ of the condensing minority phase. Because of the depletion of the medium, the three stages are not independent of each other. In particular the nucleation rate depends on the growth of the already formed clusters. The outlined scenario of first-order phase transitions is valid if the initial supersaturation is not too high. In this case the first stage of the transition can be described by nucleation as a formation of fluctuations in small regions of space with large differences in the density compared with the initial state. If the initial supersaturation increases, nucleation is replaced continuously by spinodal decomposition. Although the limiting cases are understood, much less is known about the complete evolution of the system from the early nucleation to the late Ostwald ripening stage. This problem has been studied by Schmelzer [202] and others [113, 145, 155]; experimental evidence points to the importance of the interparticle diffusional interactions and of the spatial locations of particles in nucleation and growth. The experiments have confirmed the prediction of self-similar coarsening behavior at long times. However, the measured distributions over cluster sizes generally are broader and more symmetric than the LSW theory predicts.
9.2 Condensation of Supersaturated Vapor
A simple example of a system of many particles is a gas consisting of identical molecules called monomers. If the gas is dilute (the density, as the number of molecules per unit volume, is small), the average separation length between the monomers is large and, correspondingly, their interaction is negligible. The gas is said to be ideal if the average separation length is much larger than the de Broglie wavelength. We treat the monomers as indistinguishable particles moving in a closed volume and making reactive collisions to form aggregates called molecular clusters [139, 229]. If we consider a vapor at equilibrium then a certain change in the thermodynamic parameters enables us to move the system into a nonequilibrium state. The vapor becomes supersaturated. The basic quantity describing the situation is the cluster distribution function $\mathbf{N}$ at time t

$$
\mathbf{N}(t) = (N_0, N_1, N_2, \ldots, N_n, \ldots, N_N)
\tag{9.1}
$$

which gives the number of clusters $N_n$ of size n. The free particles (molecules) are called monomers of size n = 0. It is supposed that $N_1$ molecules are excited. These molecules may be named a precluster of size n = 1. The bound states are clusters of size $n \ge 2$. Investigating a finite system, the overall number of particles $N_{total}$ as well as the volume V and the temperature T are fixed. The particle conservation law

$$
N_{total} = N_0 + N_1 + \sum_{n=2}^{N} n N_n = \text{constant}
\tag{9.2}
$$
takes into account that particles are either free, excited, or bound in clusters. There is always some difficulty in describing the initial stage of formation of a cluster. We have introduced the precluster as an intermediate state between free and bound states to provide an easy and unified description of the aggregation process. Further on, we consider a simplified case where only one single cluster of size n (i.e. $N_n = 1$) coexists with $N_0 = N_{total} - n$ unbound (free) particles, which means that the cluster distribution (9.1) reduces to

$$
\mathbf{N}(t) = (N_0, 0, \ldots, 0, N_n = 1, 0, \ldots, 0)
\tag{9.3}
$$

and the overall particle conservation (9.2) to

$$
N_{total} = N_0 + \sum_{n=1}^{N} n N_n = N_0 + n \cdot 1 = \text{constant},
\tag{9.4}
$$

where the stochastic variable n = n(t) is the number of particles bound in the cluster at time t. The nucleation box (volume V) embedded in a heat bath (temperature T) displaying the situation schematically is shown in Figure 9.1.
Figure 9.1 Isothermal–isochoric nucleation in supersaturated vapor: free molecules (black dots) called monomers in the initial stage (left) of the aggregation process form a cluster (spherical droplet) of certain size n, coexisting with a gas of free molecules, afterwards (right).
The starting point is the one-dimensional one-step master equation (3.38) describing the condensation and evaporation of a single particle on or from a molecular cluster, where $w_+(n)$ and $w_-(n)$ are the transition rates of condensation and evaporation, respectively, which have now to be formulated. The attachment probability per time of a monomer to a (spherical) cluster of size n is proportional to the cluster surface A(n) and to the density of free monomers $N_0/V_{free}$, so

$$
w_+(n) = \alpha A(n)\, (N_{total} - n) / V_{free}, \qquad 1 \le n \le N_{total},
\tag{9.5}
$$

where

$$
V_{free} = V - n\, c_{clust}^{-1}
\tag{9.6}
$$

is the free volume not occupied by the cluster. Here we have assumed that each particle has its own eigenvolume $c_{liquid}^{-1}$ related to the density of the liquid $c_{liquid}$, which is equivalent to the density of particles in a cluster, so $c_{liquid} \equiv c_{clust}$. The coefficient $\alpha$ has not yet been specified. We can interpret it as the velocity of sticking. The surface of a spherical droplet is given by

$$
A(n) = 4 \pi r^2 = 4 \pi \left( c_{clust}\, 4\pi/3 \right)^{-2/3} n^{2/3} \sim n^{2/3}
\tag{9.7}
$$

with known incompressible particle density inside the cluster $c_{clust}$ = constant (liquid density as given experimental value). A special case is the formation of a precluster n = 1 out of an elementary particle (n = 0). The precluster can be understood as an excited monomer which is able to react with some other monomer to form a dimer (n = 2). In the free-particle state n = 0 any of the $N_{total}$ monomers can become excited, so that we can write

$$
w_+(0) = \frac{p\, N_{total}}{\tau}
\tag{9.8}
$$
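The attachment rates (9.5) to (9.8) translate directly into code. The sketch below is illustrative: the function names, argument names and default values are assumptions, and consistent units for the volume, the density $c_{clust}$ and the coefficient $\alpha$ are left to the caller.

```python
import math


def cluster_surface(n, c_clust):
    """Surface A(n) of a spherical cluster of n particles, following (9.7)."""
    return 4.0 * math.pi * (c_clust * 4.0 * math.pi / 3.0) ** (-2.0 / 3.0) * n ** (2.0 / 3.0)


def attachment_rate(n, n_total, volume, c_clust, alpha, p=1.0, tau=1.0):
    """Condensation rate w+(n): (9.8) for n = 0 and (9.5) for 1 <= n <= n_total."""
    if n == 0:
        return p * n_total / tau
    v_free = volume - n / c_clust  # free volume (9.6)
    return alpha * cluster_surface(n, c_clust) * (n_total - n) / v_free
```

The rate vanishes at $n = N_{total}$ because no free monomers are left, and the surface factor grows as $n^{2/3}$, so a cluster of 8 particles has four times the surface of a precluster.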
where the parameter p in this case means the excitation probability per time multiplied by the time constant $\tau$. By using the detailed balance relation (3.49), the evaporation rate $w_-(n)$ of a monomer from a cluster of size n is calculated from the known attachment rate (9.5). For the equilibrium cluster distribution, the most probable value of the cluster size n corresponds to the minimum of the thermodynamic potential, in this case the free energy F, as is evident from (3.50). The latter equation (with the thermodynamic potential identified with F) combined with the detailed balance condition (3.49) allows us to find the relation between the transition rates $w(\mathbf{N}' \mid \mathbf{N})$ and $w(\mathbf{N} \mid \mathbf{N}')$ of opposite stochastic events (transition from state $\mathbf{N}$ to state $\mathbf{N}'$ and vice versa), so

$$
\frac{w(\mathbf{N}' \mid \mathbf{N})}{w(\mathbf{N} \mid \mathbf{N}')} = \exp\left[ \frac{F(T, V, \mathbf{N}) - F(T, V, \mathbf{N}')}{k_B T} \right].
\tag{9.9}
$$

In this way, to find the relation between transition probabilities we need the knowledge of the free energy F. Following the basic principles of statistical mechanics [167], we define the Hamiltonian of our many-particle system and calculate the statistical sum (or integral) Z which is related to the free energy via $F = -k_B T \ln Z$. Our system is described by the cluster distribution $\mathbf{N}$ defined by (9.1). The total Hamiltonian $H(\mathbf{N})$ reads

$$
H = \sum_{n=0}^{N} H_n
\tag{9.10}
$$
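with the contribution $H_n$ for the $N_n$ clusters of size $n$ at coordinates $\mathbf{r}_i^{(n)}$ and momenta $\mathbf{p}_i^{(n)}$ written as kinetic energy plus interaction potential,

$$ H_n\left(\mathbf{p}^{(n)}, \mathbf{r}^{(n)}\right) = \sum_i \frac{\left(\mathbf{p}_i^{(n)}\right)^2}{2 m_i^{(n)}} + \sum_{i<j} U_{ij}^{(n,n)}\left(\left|\mathbf{r}_i^{(n)} - \mathbf{r}_j^{(n)}\right|\right). \tag{9.11} $$

The mass $m_i^{(n)}$ of a cluster containing $n$ monomers ($n \ge 1$) is given by

$$ m_i^{(n)} \equiv m_n = n\,m \tag{9.12} $$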
where $m \equiv m_0$ is the mass of one monomer. The canonical partition function, that is, the statistical integral, is an integral over all space and momentum coordinates. In the semi-classical approximation it reads [167]

$$ Z(T, V, \mathbf{N}) = \prod_{n=0}^{N} \frac{1}{N_n!}\,\frac{1}{h^{3N_n}} \int d^{3N_n}p \; d^{3N_n}q \; \exp\left(-\beta H_n\right), \tag{9.13} $$

where $\beta = 1/(k_B T)$. This partition function can be divided into two factors, one of which represents an ideal part due to the kinetic energy,

$$ Z_{\mathrm{ideal}}(T, V, \mathbf{N}) = \prod_{n=0}^{N} \frac{1}{N_n!}\,\frac{V_{\mathrm{eff}}^{N_n} \left(\sqrt{2\pi m_n k_B T}\right)^{3N_n}}{h^{3N_n}} \tag{9.14} $$
and the second part, $Z_{\mathrm{binding}}(T, V, \mathbf{N})$, is responsible for the energy stored in clusters. Here $V_{\mathrm{eff}} < V$ is the effective volume. By introducing this quantity in (9.14) instead of the total volume $V$ we take into account the fact that particles are not point-like, so the integration over the spatial coordinates in (9.13) effectively takes place in a reduced volume $V_{\mathrm{eff}}$. In a certain approximation, both factors together read

$$ Z(T, V, \mathbf{N}) = \prod_{n=0}^{N} \frac{1}{N_n!} \left[ V_{\mathrm{eff}} \left(\frac{\sqrt{2\pi m_n k_B T}}{h}\right)^{3} \exp\left(-\frac{f_n}{k_B T}\right) \right]^{N_n} \tag{9.15} $$

where the binding energy $f_n(T)$ is the minimum value of the potential energy, sought over all spatial arrangements of the $n$ bound monomers,

$$ f_n(T) = \min_{\{\mathbf{r}\}} \sum_{i<j} U_{ij}^{(n,n)}\left(\left|\mathbf{r}_i - \mathbf{r}_j\right|\right). \tag{9.16} $$
From the canonical partition function $Z$ we can calculate the thermodynamic quantities using the relation between $Z$ and the free energy $F$ as a state function,

$$ F(T, V, \mathbf{N}) = -k_B T \ln Z(T, V, \mathbf{N}). \tag{9.17} $$

According to (9.15) and (9.17), in the isothermal–isochoric situation we obtain

$$ F(T, V, \mathbf{N}) = k_B T \sum_{n=0}^{N} \left\{ N_n \ln\left[\lambda_n(T)^3 / V_{\mathrm{eff}}\right] + \ln N_n! \right\} + \sum_{n=0}^{N} N_n f_n(T) \tag{9.18} $$

with the de Broglie wavelength $\lambda_n(T) = n^{-1/2} \lambda_0(T) = n^{-1/2} \left[h^2/(2\pi m k_B T)\right]^{1/2}$ (at $n \ge 1$). Here $\lambda_0(T)$ is the de Broglie wavelength of a monomer (particle of mass $m$) given by

$$ \lambda_0 = h/(2\pi m k_B T)^{1/2} \approx 10^{-10}\ \mathrm{m}. \tag{9.19} $$
This is the wavelength of a quantum-mechanical free particle with energy $E = p^2/2m = \hbar^2 k^2/2m$, where $k$ is the wave number related to the wavelength $\lambda_0$. Using the Stirling formula $\ln N_n! \simeq N_n \ln N_n - N_n$ in (9.18) we obtain an approximation for large cluster numbers $N_n$,

$$ F(T, V, \mathbf{N}) = k_B T \sum_{n=0}^{N} N_n \left\{ \ln\left[\lambda_n(T)^3 N_n / V_{\mathrm{eff}}\right] - 1 \right\} + \sum_{n=0}^{N} N_n f_n(T). \tag{9.20} $$

In the case of only one cluster of size $n$ (see (9.3)) and at $N_0 = N_{\mathrm{total}} - n \to \infty$ (that is, in the thermodynamic limit, by expansion of $\ln N_0!$) the free energy (9.18) reads

$$ F(T, V, n) = k_B T \left\{ (N_{\mathrm{total}} - n) \left[ \ln\frac{\lambda_0(T)^3 (N_{\mathrm{total}} - n)}{V_{\mathrm{eff}}} - 1 \right] + \left(1 - \delta_{n,0}\right) \ln\frac{\lambda_n(T)^3}{V_{\mathrm{eff}}} \right\} + f_n(T). \tag{9.21} $$
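As a numerical illustration of (9.19) and of the scaling $\lambda_n = n^{-1/2}\lambda_0$, the following minimal sketch evaluates the thermal wavelength. The monomer mass (one water molecule, $m \approx 2.99 \times 10^{-26}$ kg) and the temperature $T = 300$ K are assumptions chosen to match the water example of Section 9.4; the function name is illustrative only.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant, J s
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K


def de_broglie_wavelength(mass_kg, temperature_k, n=1):
    """Thermal wavelength of a cluster of n monomers, cf. (9.19)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    lambda_0 = H_PLANCK / math.sqrt(
        2.0 * math.pi * mass_kg * K_BOLTZMANN * temperature_k)
    return lambda_0 / math.sqrt(n)


# Assumed example values: one water molecule at room temperature.
m_water = 2.99e-26  # kg
lam0 = de_broglie_wavelength(m_water, 300.0)
lam4 = de_broglie_wavelength(m_water, 300.0, n=4)
print(f"lambda_0 = {lam0:.3e} m, lambda_4 = {lam4:.3e} m")
```

For these assumed inputs the monomer wavelength lies slightly below the order-of-magnitude estimate in (9.19), and a cluster of four monomers has half the monomer wavelength.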
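In our special case, where only one cluster of size $n$ is possible, (9.9) in the thermodynamic limit reduces to

$$ \frac{w_-(n)}{w_+(n-1)} = \frac{V_{\mathrm{eff}}\,(1 - 1/n)^{3/2}}{\lambda_0(T)^3\,(N_{\mathrm{total}} - n)} \exp\left[\frac{f_n(T) - f_{n-1}(T)}{k_B T}\right], \qquad n \ge 2, \tag{9.22} $$

$$ \frac{w_-(1)}{w_+(0)} = \frac{1}{N_{\mathrm{total}}} \exp\left[\frac{f_1(T)}{k_B T}\right]. \tag{9.23} $$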
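For orientation, the step from (9.21) to (9.22) can be traced as follows. This is a sketch of the intermediate algebra, assuming $N_{\mathrm{total}} - n \gg 1$ so that the difference of the ideal-gas term may be replaced by its derivative with respect to $n$.

```latex
\begin{align*}
\frac{w_-(n)}{w_+(n-1)}
  &= \exp\!\left[\frac{F(T,V,n)-F(T,V,n-1)}{k_B T}\right]
  && \text{detailed balance (9.9)}\\
\frac{\partial}{\partial n}\Big\{(N_{\mathrm{total}}-n)
   \Big[\ln\frac{\lambda_0^3\,(N_{\mathrm{total}}-n)}{V_{\mathrm{eff}}}-1\Big]\Big\}
  &= -\ln\frac{\lambda_0^3\,(N_{\mathrm{total}}-n)}{V_{\mathrm{eff}}}
  && \text{ideal-gas term of (9.21)}\\
\ln\frac{\lambda_n^3}{V_{\mathrm{eff}}}-\ln\frac{\lambda_{n-1}^3}{V_{\mathrm{eff}}}
  &= \frac{3}{2}\ln\Big(1-\frac{1}{n}\Big)
  && \lambda_n = n^{-1/2}\lambda_0,\quad n\ge 2\\
\Longrightarrow\quad \frac{F(n)-F(n-1)}{k_B T}
  &\simeq \ln\frac{V_{\mathrm{eff}}}{\lambda_0^3\,(N_{\mathrm{total}}-n)}
     +\frac{3}{2}\ln\Big(1-\frac{1}{n}\Big)
     +\frac{f_n-f_{n-1}}{k_B T}
\end{align*}
```

Exponentiating the last line reproduces (9.22). For $n = 1$ the cluster term $\ln(\lambda_1^3/V_{\mathrm{eff}})$ appears for the first time and cancels the volume in the ideal-gas contribution, which leaves the factor $1/N_{\mathrm{total}}$ of (9.23).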
The reduction of the effective integration volume in (9.13) is relevant at large densities, when the cluster includes a relatively large part of all particles. Accordingly, the effective volume $V_{\mathrm{eff}}$ can be approximately replaced with the free volume $V_{\mathrm{free}}$ outside the cluster given by (9.6). Then from (9.22) we obtain the approximation

$$ \frac{w_-(n)}{w_+(n)} = \frac{V_{\mathrm{free}}}{\lambda_0(T)^3\,(N_{\mathrm{total}} - n)} \exp\left[\frac{f_n(T) - f_{n-1}(T)}{k_B T}\right], \tag{9.24} $$

which holds for large enough $n$. As a rough approximation, we assume that (9.24) can be extrapolated down to $n = 1$. Taking into account (9.5), this yields

$$ w_-(n) = \alpha A(n)\,\frac{1}{\lambda_0^3} \exp\left[\frac{f_n - f_{n-1}}{k_B T}\right], \tag{9.25} $$

where $f_n - f_{n-1}$ is the difference in the binding energies between clusters of size $n$ and $n - 1$. Based on (9.23), the stochasticity parameter $p$ in (9.8) can be written as

$$ \frac{p}{\tau} = w_-(1) \exp\left[-\frac{f_1(T)}{k_B T}\right]. \tag{9.26} $$

By (9.5), (9.25), and (9.26) all the transition rates are well defined in a way which is consistent with the basic principles of statistical mechanics. Clusters as bound states of elementary particles (monomers) have negative potential energy, which is the so-called binding energy. The potential function $f_n(T)$ is well known from atomic and nuclear theory, where it appears as the Bethe–Weizsäcker formula which, in a simple nonlinear approximation, reads [141–143, 203]

$$ f_n(T) = \mu_\infty(T)\,n + \sigma A(n). \tag{9.27} $$
The binding energy consists of a negative volume term ($\mu_\infty < 0$) and a positive surface contribution. The quantity $\mu_\infty(T)$ is the chemical potential of one monomer or, in other words, the energy necessary for taking away one elementary particle (monomer) from a cluster with a flat surface. The parameter $\sigma > 0$ can be understood as the surface tension of a flat surface. The ansatz (9.27) is a good approximation for large enough sizes $n$ and also provides the correct normalization condition $f_0 = 0$ for a free particle ($n = 0$). Substituting (9.27) into the detachment rate (9.25) we obtain the approximation

$$ w_-(n) = \alpha A(n)\,\frac{1}{\lambda_0^3} \exp\left\{\frac{\mu_\infty(T) + \sigma\left[A(n) - A(n-1)\right]}{k_B T}\right\} \approx \alpha A(n)\,\frac{1}{\lambda_0^3} \exp\left[\frac{\mu_\infty(T)}{k_B T}\right] \exp\left[\frac{2\sigma k(n)}{c_{\mathrm{clust}} k_B T}\right]. \tag{9.28} $$

This result is valid for large enough clusters (starting with $n \approx 10$) and contains the curvature $k(n)$ of a size-$n$ droplet,

$$ k(n) = 1/r = (c_{\mathrm{clust}}\,4\pi/3)^{1/3}\, n^{-1/3}. \tag{9.29} $$
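The second form of (9.28) replaces the exact difference $A(n) - A(n-1)$ by its leading-order value $2k(n)/c_{\mathrm{clust}}$. A minimal numerical check of this replacement is sketched below, assuming spherical clusters and units in which the monomer radius equals one (so that $c_{\mathrm{clust}} = 3/(4\pi)$ and $A(n) = 4\pi n^{2/3}$); the function names are illustrative.

```python
import math

C_CLUST = 3.0 / (4.0 * math.pi)  # monomer radius set to 1 (assumed units)


def area(n):
    """Surface area of a spherical cluster of n monomers."""
    return 4.0 * math.pi * n ** (2.0 / 3.0)


def curvature(n):
    """Curvature k(n) = 1/r of a size-n droplet, cf. (9.29)."""
    return (C_CLUST * 4.0 * math.pi / 3.0) ** (1.0 / 3.0) * n ** (-1.0 / 3.0)


def relative_error(n):
    """Relative deviation of 2*k(n)/c_clust from A(n) - A(n-1)."""
    exact = area(n) - area(n - 1)
    approx = 2.0 * curvature(n) / C_CLUST
    return abs(approx - exact) / exact


for n in (2, 10, 100, 1000):
    print(n, f"{relative_error(n):.4f}")
```

In these units the deviation is about 2 % at $n = 10$ and decreases roughly as $1/(6n)$, which is in line with the statement that (9.28) is usable from $n \approx 10$ onwards.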
The difference in the surface areas $A(n) - A(n-1)$ in (9.28) can be evaluated by using a series expansion of $A(n-1)$ around $n$ and retaining the leading term only, so

$$ n^{2/3} - (n-1)^{2/3} \approx n^{2/3} - n^{2/3}\left(1 - \frac{2}{3n}\right) \sim n^{-1/3}. \tag{9.30} $$

Taking into account the ideal gas model, the chemical potential $\mu_\infty$ is related in a simple way to the equilibrium density (concentration) $c_{\mathrm{eq}}(\infty)$ of monomers in the case of a flat interface ($r \to \infty$) between the liquid phase (droplet) and the gaseous phase (free monomers). Thus, we have

$$ \mu_\infty(T) = k_B T \ln\left[\lambda_0(T)^3\, c_{\mathrm{eq}}(\infty)\right]. \tag{9.31} $$

In reality, however, the droplets are spherical, so the interface is curved and the equilibrium concentration of free monomers is larger than $c_{\mathrm{eq}}(\infty)$. For large enough clusters, the detachment probability

$$ w_-(n) = \alpha A(n)\, c_{\mathrm{eq}}(\infty) \exp\left[\ell\, k(n)\right] \tag{9.32} $$

is obtained by inserting (9.31) into (9.28), where the length $\ell = \ell(T)$, defined as

$$ \ell(T) = \frac{2\sigma}{c_{\mathrm{clust}}\, k_B T}, \tag{9.33} $$
is explained and depicted in Figure 9.2.

Figure 9.2 Particle concentration $c$ depending on the distance $r$ from the center of the cluster: (a) shows a real density profile, whereas (b) approximates the profile corresponding to the model of a cluster with a sharp border. The length $\ell$ shows the width of the interface between the dense phase ($c_{\mathrm{clust}}$) and the dilute surrounding ($c_{\mathrm{eq}}(n)$) over a size-$n$ droplet.

Considering the cluster size $n(t)$ as a continuous variable which can be measured experimentally, the equation of motion for $dn/dt$ with a given velocity function $v(n)$, describing the time evolution of the cluster size, is well known. Based on phenomenological arguments such as Fick's law, the dynamical equation of interface-reaction-limited aggregation reads

$$ \frac{dn}{dt} = v(n) \qquad \text{with} \qquad v(n) = \frac{D}{\ell}\, A(n) \left[c_{\mathrm{free}} - c_{\mathrm{eq}}(n)\right], \tag{9.34} $$
where $c_{\mathrm{free}} = N_0/V_{\mathrm{free}} = (N_{\mathrm{total}} - n)/V_{\mathrm{free}} = (c - n/V)/(1 - c_{\mathrm{clust}}^{-1}\, n/V)$ is the density of free particles, and $c = N_{\mathrm{total}}/V$ is the total density of particles. The constant $D$ is called the diffusion coefficient; the coefficient $\ell$, called the capillary length, is a small interface thickness defined by (9.33). In the stochastic approach we also have an equation (10.70) for the average cluster size $\langle n \rangle$, which is similar to (9.34). This is a deterministic equation for the mean value,

$$ \frac{d}{dt}\langle n \rangle = \frac{d}{dt} \sum_n n\, p(n, t) = \langle w_+(n) \rangle - \langle w_-(n) \rangle, \tag{9.35} $$

which is obtained by averaging the master equation. In a certain approximation (9.35) may be written as

$$ \frac{d}{dt}\langle n \rangle \approx w_+(\langle n \rangle) - w_-(\langle n \rangle), \tag{9.36} $$
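describing the time evolution of the average cluster size $\langle n \rangle$. According to the definitions of the transition frequencies (9.5) and (9.32), the time evolution of the mean cluster size (9.36) can be written in the same form as (9.34),

$$ \frac{d}{dt}\langle n \rangle = \alpha A(\langle n \rangle) \left[ \frac{N_{\mathrm{total}} - \langle n \rangle}{V_{\mathrm{free}}} - c_{\mathrm{eq}}(\infty)\, e^{\ell k(\langle n \rangle)} \right]. \tag{9.37} $$

By comparing these two equations we find the previously unknown coefficient $\alpha$ and the equilibrium concentration $c_{\mathrm{eq}}(n)$:

$$ \alpha = D/\ell \tag{9.38} $$

and

$$ c_{\mathrm{eq}}(n) = c_{\mathrm{eq}}(\infty)\, e^{\ell k(n)}. \tag{9.39} $$

The only difference between (9.34) and (9.37) is that, in the latter case, we always have the average value of the cluster size $\langle n \rangle$ instead of $n$. Rewriting (9.37) accounting for (9.6) and (9.38) we finally obtain

$$ \frac{d\langle n \rangle}{dt} = \frac{D}{\ell}\, A(\langle n \rangle) \left[ \frac{c - \langle n \rangle/V}{1 - c_{\mathrm{clust}}^{-1} \langle n \rangle/V} - c_{\mathrm{eq}}(\infty)\, e^{\ell k(\langle n \rangle)} \right]. \tag{9.40} $$

In the stationary state $d\langle n \rangle/dt = 0$ holds. Equation (9.40) is valid at large enough $\langle n \rangle$ only, whereas at $\langle n \rangle \to 0$ it should be modified to ensure that the transition rates and, therefore, the expression in the brackets do not diverge. Then we obtain three stationary solutions. The homogeneous situation without any cluster corresponds to $A(\langle n \rangle_{\mathrm{st}}) = 0$, that is, to zero stationary cluster size $\langle n \rangle_{\mathrm{st}} = 0$. The other two solutions, describing the heterogeneous situation, originate from the identity $c_{\mathrm{free}}(\langle n \rangle_{\mathrm{st}}) = c_{\mathrm{eq}}(\langle n \rangle_{\mathrm{st}})$ or, in extended form,

$$ \frac{c - \langle n \rangle_{\mathrm{st}}/V}{1 - c_{\mathrm{clust}}^{-1} \langle n \rangle_{\mathrm{st}}/V} = c_{\mathrm{eq}}(\infty) \exp\left[ \ell\, (c_{\mathrm{clust}}\,4\pi/3)^{1/3}\, \langle n \rangle_{\mathrm{st}}^{-1/3} \right]. \tag{9.41} $$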
Figure 9.3 Terms on the l.h.s. and on the r.h.s. of (9.41) depending on the cluster size $\langle n \rangle$. The two crossing points at $\langle n \rangle = n_{\mathrm{stable}}$ (stable cluster size) and $\langle n \rangle = n_{\mathrm{cr}}$ (unstable or critical cluster size) correspond to two different solutions of (9.41). The thin line related to the critical density $c_1$ has a single common point with the thick curved line at $\langle n \rangle = n^*_{\mathrm{cr}}$. The horizontal dashed line shows the value of the equilibrium concentration $c_{\mathrm{eq}}(\infty)$.
From this equation we can find the stationary cluster size $\langle n \rangle_{\mathrm{st}}$ as a function of the total density $c$. This is a nonlinear equation which cannot be solved analytically. However, it is easily analyzed (solved) graphically. In Figure 9.3, the terms on the left-hand side (l.h.s.) and on the right-hand side (r.h.s.) are plotted versus the mean cluster size $\langle n \rangle$. The two crossing points $\langle n \rangle = n_{\mathrm{cr}}$ and $\langle n \rangle = n_{\mathrm{stable}}$ correspond to two different solutions of (9.41). The quantity $n_{\mathrm{cr}}$ is known as the critical cluster size in nucleation theory, whereas $n_{\mathrm{stable}}$ represents the stable stationary cluster size. Their meaning will be clarified in the further discussion. The crossing points exist only if the total concentration exceeds a critical value, $c > c_1$. The value $c = c_1$ corresponds to a bifurcation point where both solutions merge into one, as shown by the thin line which has only one common point with the thick line at the marginal (largest possible) value of the critical cluster size $\langle n \rangle_{\mathrm{st}} = n^*_{\mathrm{cr}}$. At $c > c_1$ three different regions can be distinguished for the cluster size $\langle n \rangle$:

1. At $\langle n \rangle < n_{\mathrm{cr}}$ we have $d\langle n \rangle/dt < 0$, which means that the cluster dissolves.
2. At $n_{\mathrm{cr}} < \langle n \rangle < n_{\mathrm{stable}}$ we have $d\langle n \rangle/dt > 0$, which means that the cluster grows until it reaches the stable stationary size $n_{\mathrm{stable}}$.
3. At $\langle n \rangle > n_{\mathrm{stable}}$ we have $d\langle n \rangle/dt < 0$, which means that the cluster reduces its size (dissolves) to the stationary value $n_{\mathrm{stable}}$.

According to this, the solution $\langle n \rangle = n_{\mathrm{stable}}(c)$ corresponds to a stable cluster size, whereas the solution $\langle n \rangle = n_{\mathrm{cr}}(c)$ corresponds to an unstable stationary cluster size. In Figure 9.4 we have shown both solutions of (9.41) (branches $n_1(c)$ and $n_2(c)$), providing the stationary cluster size in a supersaturated vapor as a function of the total density $c$. Branch $n_1(c)$, depicted by a solid line, corresponds to the stable cluster size, whereas branch $n_2(c)$, indicated by a dashed line, corresponds to the unstable (critical) cluster size. Several trajectories showing the time evolution of $\langle n \rangle$ to one of the stable stationary values (solid lines) are indicated by arrows. The bifurcation diagram in Figure 9.4 around the critical density $c_1$ is similar to that in Figure 10.25 describing the nucleation on roads. An essential difference, however, is that no increase in the critical cluster size $n_2(c)$ is observed in a supersaturated vapor at large densities [142, 146, 149, 151].

Figure 9.4 The stationary cluster size $\langle n \rangle_{\mathrm{st}}$ depending on the total density $c$ of particles in a supersaturated vapor. The stable cluster size (the horizontal line and the branch $n_1(c)$) is shown by solid lines, whereas the unstable cluster size (the branch $n_2(c)$) is shown by a dashed line. Arrows indicate the time evolution of $\langle n \rangle$.

Expansion of the exponent in (9.39), retaining the linear term only, yields

$$ c_{\mathrm{eq}}(\langle n \rangle) = c_{\mathrm{eq}}(\infty) \left(1 + \ell\, k(\langle n \rangle)\right). \tag{9.42} $$
By using this linearization around the critical cluster size $n_{\mathrm{cr}}$, (9.40) can be written in the well-known form [203]

$$ \frac{d\langle n \rangle}{dt} = D\, c_{\mathrm{eq}}(\infty)\, A(\langle n \rangle) \left[ k(n_{\mathrm{cr}}) - k(\langle n \rangle) \right], \tag{9.43} $$

where

$$ k(n_{\mathrm{cr}}) = \frac{1}{\ell\, c_{\mathrm{eq}}(\infty)} \left[ \frac{c - n_{\mathrm{cr}}/V}{1 - c_{\mathrm{clust}}^{-1}\, n_{\mathrm{cr}}/V} - c_{\mathrm{eq}}(\infty) \right] \tag{9.44} $$

is the curvature of the critical cluster. From this equation the property discussed above is obvious: clusters with an overcritical size ($n > n_{\mathrm{cr}}$, so $k(n) < k(n_{\mathrm{cr}})$) grow, whereas those with an undercritical size ($n < n_{\mathrm{cr}}$, so $k(n) > k(n_{\mathrm{cr}})$) dissolve. In the actual bistable situation the growth from an undercritical to an overcritical cluster size cannot be described by deterministic equations of motion like (9.43). This phenomenon of noise-induced transitions over the critical value of the cluster size can be treated in the stochastic approach only [149, 156, 229, 230].

Using the Monte Carlo method we have simulated three different stochastic trajectories, presented in Figure 9.5. They show the time evolution, that is, the cluster size $n$ versus the dimensionless time $t^* = (\alpha A(1)/V) \cdot t$, of a system with $N_{\mathrm{total}} = 1000$ point-like particles ($c_{\mathrm{clust}}^{-1} \to 0$), starting with $n$ values around the critical cluster size $n_{\mathrm{cr}} \approx 54$ (the lower dashed line). The parameters of the system are chosen such that $\sigma A(1)/(k_B T) = 10$ and $V c_{\mathrm{eq}}(\infty) = 160$. In one of the cases the cluster dissolves, whereas in the other two cases it grows beyond the critical size. Also, in contrast to the prediction of the deterministic equation (9.40), in one of the cases the growth up to the stable cluster size $n \approx 650$ (the upper dashed line) occurs starting with $n = 45 < n_{\mathrm{cr}}$.

Figure 9.5 Three different stochastic trajectories showing the time evolution of the cluster size $n$ versus the dimensionless time $t^*$. The lower dashed line indicates the critical cluster size $n_{\mathrm{cr}} = 54$; the upper dashed line represents the stable cluster size $n_{\mathrm{stable}} = 650$.

The probability distributions at three different times, $t^* = 0.003$, $0.04$, and $0.3$, have been calculated by averaging over a large number of stochastic trajectories starting with $n = n_{\mathrm{cr}} = 54$. The results are shown in Figure 9.6. The probability maximum moves towards larger values of the cluster size $n$ with increasing time, and at $t^* = 0.3$ the probability distribution agrees approximately with the equilibrium distribution $P_{\mathrm{eq}}(n) \propto \exp(-F(n)/(k_B T))$ (smooth curve). The maximum of the equilibrium distribution corresponds to the minimum of the free energy, whereas the critical cluster size corresponds to the local maximum of the free energy, as shown in Figures 9.4 and 9.6.
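The simulation described above can be sketched with a standard continuous-time Monte Carlo (Gillespie-type) scheme. The block below is a minimal illustration under stated assumptions, not the authors' program: the rates are written in units of $\alpha A(1)/V$ for point-like particles ($V_{\mathrm{free}} \to V$), the large-cluster form (9.32) is used for all $n \ge 2$, the exponent is rewritten as $\ell k(n) = \tfrac{2}{3}\,[\sigma A(1)/(k_B T)]\, n^{-1/3}$ (which follows from (9.29) and (9.33) for spherical clusters), and a trajectory is simply stopped once $n$ drops to 1. All function names are hypothetical.

```python
import math
import random

N_TOTAL = 1000     # number of monomers
V_CEQ = 160.0      # V * c_eq(infinity)
SIGMA_A1 = 10.0    # sigma * A(1) / (k_B T)


def w_plus(n):
    """Attachment rate (9.5) in units of alpha*A(1)/V, point-like particles."""
    return n ** (2.0 / 3.0) * (N_TOTAL - n)


def w_minus(n):
    """Detachment rate (9.32) in the same units."""
    return n ** (2.0 / 3.0) * V_CEQ * math.exp(
        (2.0 / 3.0) * SIGMA_A1 * n ** (-1.0 / 3.0))


def drift(n):
    """Deterministic growth rate w+(n) - w-(n), cf. (9.36)."""
    return w_plus(n) - w_minus(n)


def stationary_size(lo, hi, tol=1e-6):
    """Bisection for drift(n) = 0 on an interval that brackets a root."""
    assert drift(lo) * drift(hi) < 0, "interval does not bracket a root"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if drift(lo) * drift(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def trajectory(n_start, t_max, rng):
    """One stochastic trajectory; stops at t_max or when n drops to 1."""
    n, t = n_start, 0.0
    while n > 1:
        up, down = w_plus(n), w_minus(n)
        t += rng.expovariate(up + down)   # waiting time to the next event
        if t > t_max:
            break
        n += 1 if rng.random() * (up + down) < up else -1
    return n


n_cr = stationary_size(20.0, 200.0)
n_stable = stationary_size(200.0, 900.0)
rng = random.Random(1)
finals = [trajectory(54, 0.3, rng) for _ in range(20)]
print(f"n_cr = {n_cr:.1f}, n_stable = {n_stable:.1f}")
print("final sizes:", finals)
```

With these assumptions the deterministic stationary sizes come out near 53 and 655, close to the values $n_{\mathrm{cr}} \approx 54$ and $n_{\mathrm{stable}} \approx 650$ quoted above; the small differences presumably reflect the approximations listed in the lead-in. Trajectories started at the critical size end either dissolved or fluctuating around the stable size.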
9.3 The General Multi-Droplet Scenario
The basic equation describing the whole process of nucleation, cluster growth, and coarsening is the stochastic master equation (3.19)

$$ \frac{\partial}{\partial t} P(\mathbf{N}, t) = \sum_{\mathbf{N}'} \left[ W(\mathbf{N} \mid \mathbf{N}')\, P(\mathbf{N}', t) - W(\mathbf{N}' \mid \mathbf{N})\, P(\mathbf{N}, t) \right] \tag{9.45} $$

which gives the probability $P(\mathbf{N}, t)$ at time $t$ of the system having the cluster distribution $\mathbf{N}$ defined by (9.1). Here $W(\mathbf{N}' \mid \mathbf{N})$ is the transition frequency of the system from state $\mathbf{N}$ to state $\mathbf{N}'$.

Figure 9.6 The probability distribution $P(n, t^*)$ at three different dimensionless times $t^* = 0.003$ (left), $t^* = 0.04$ (middle), and $t^* = 0.3$ (right) calculated by averaging over 20 000 stochastic trajectories simulated by the Monte Carlo method. The equilibrium distribution is shown by a smooth solid line. The maximum of the equilibrium distribution corresponds to the minimum of the free energy, $F(n)/(k_B T)$ (smooth curve).

The growth of liquid droplets in a supersaturated vapor, discussed in the previous section, follows the general scenario outlined here. For simplicity, our calculations have been made for the single-droplet case, which corresponds to the final stage of the Ostwald ripening. A complete description of the process, however, has to include the multi-droplet picture, consistent with the master equation (9.45). The stationary cluster distribution $\mathbf{N}_{\mathrm{st}}$ is determined by the stationary solution of this master equation ($\partial P_{\mathrm{st}}/\partial t = 0$). Following the master equation approach (see Chapter 3), the transition frequencies of opposite stochastic events are related by the detailed balance condition (9.9), which in the general case of a thermodynamic system reads

$$ W(\mathbf{N}' \mid \mathbf{N}) = W(\mathbf{N} \mid \mathbf{N}') \exp\left[-\frac{\Delta\Omega}{k_B T}\right] \tag{9.46} $$

where $\Delta\Omega = \Omega(\mathbf{N}') - \Omega(\mathbf{N})$ is the difference between the thermodynamic potentials of the states $\mathbf{N}'$ and $\mathbf{N}$. In this case the stationary solution is also the equilibrium solution $P_{\mathrm{eq}}(\mathbf{N})$. For any thermodynamic constraints, reflected by the characteristic thermodynamic potential $\Omega$, we may write

$$ P_{\mathrm{eq}}(\mathbf{N}) = P_{\mathrm{norm}}^{-1} \exp\left[-\Omega(\mathbf{N})/k_B T\right] \tag{9.47} $$

in accordance with (3.50), where $P_{\mathrm{norm}}$ is a normalizing factor. The equilibrium states $\mathbf{N}_{\mathrm{eq}}$ have the highest possible equilibrium (stationary) probabilities, $P_{\mathrm{st}}(\mathbf{N}_{\mathrm{eq}}) \to \max$. Similar to classical nucleation theory, we assume that the cluster size distribution $\mathbf{N}(t)$ is changed by monomer–cluster reactions only, so

$$ N_1 + N_n \longleftrightarrow N_{n+1}. \tag{9.48} $$
In this case the master equation (9.45) can be written as

$$
\begin{aligned}
\frac{\partial}{\partial t} P(N_1, N_2, \ldots, N_n, \ldots, N_N, t) = {} & W_1^+(N_1 + 2)\, P(N_1 + 2, N_2 - 1, N_3, \ldots, N_N, t) \\
& + W_2^-(N_2 + 1)\, P(N_1 - 2, N_2 + 1, N_3, \ldots, N_N, t) \\
& + \sum_{n=3}^{N} W_{n-1}^+(N_{n-1} + 1)\, P(N_1 + 1, \ldots, N_{n-1} + 1, N_n - 1, \ldots, N_N, t) \\
& + \sum_{n=2}^{N-1} W_{n+1}^-(N_{n+1} + 1)\, P(N_1 - 1, \ldots, N_n - 1, N_{n+1} + 1, \ldots, N_N, t) \\
& - \sum_{n=1}^{N} \left[ W_n^+(N_n) + W_n^-(N_n) \right] P(N_1, N_2, \ldots, N_n, \ldots, N_N, t).
\end{aligned} \tag{9.49}
$$

Here $W_n^+$ and $W_{n+1}^-$ are the transition rates of the forward and backward monomer–cluster reaction (9.48). The master equation (9.49) also describes the formation of a dimer from two monomers and its decay as special cases. If we do not use the precluster–cluster concept introduced in Section 9.2, then, to estimate the transition frequency $W_1^+$, we have to consider the probability that two monomers meet and form a dimer. This depends on the density of monomers $N_1$ and, according to simple geometrical considerations, in our previous notation is given by

$$ W_1^+(N_1) = \frac{D}{\ell}\, A(1)\, \frac{N_1 (N_1 - 1)}{V}. \tag{9.50} $$

The other transition rates are also consistent with our consideration in Section 9.2. One has to take into account that the rate of the reaction (9.48) is proportional to the number of clusters $N_n$ involved. Therefore, we have

$$ W_n^+(N_n) = N_n\, w_+(n) \tag{9.51} $$

$$ W_n^-(N_n) = N_n\, w_-(n), \tag{9.52} $$
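where $w_+(n)$ is the attachment frequency to a given cluster of size $n$, defined by (9.5), and $w_-(n)$ is the detachment frequency (9.28). Coming back to the interface-reaction-limited aggregation law (9.34),

$$ \frac{dn}{dt} = \frac{D}{\ell}\, A(n) \left[ c_{\mathrm{free}} - c_{\mathrm{eq}}(n) \right] \tag{9.53} $$

with the concentration over a curved surface (9.39) given by

$$ c_{\mathrm{eq}}(n) = c_{\mathrm{eq}}(\infty)\, e^{\ell k(n)} \approx c_{\mathrm{eq}}(\infty) \left[ 1 + \ell\, k(n) \right] \tag{9.54} $$

we get the following linearized growth law

$$ \frac{dn}{dt} = D\, c_{\mathrm{eq}}(\infty)\, A(n) \left[ k(n_{\mathrm{cr}}) - k(n) \right] \tag{9.55} $$

where the curvature of the critical cluster is determined by the concentration of free particles $c_{\mathrm{free}}$ or by the supersaturation $y = (c_{\mathrm{free}} - c_{\mathrm{eq}}(\infty))/c_{\mathrm{eq}}(\infty)$,

$$ k(n_{\mathrm{cr}}) \equiv (c_{\mathrm{clust}}\,4\pi/3)^{1/3}\, n_{\mathrm{cr}}^{-1/3} = \frac{1}{\ell}\, \frac{c_{\mathrm{free}} - c_{\mathrm{eq}}(\infty)}{c_{\mathrm{eq}}(\infty)} = \frac{y}{\ell} \tag{9.56} $$

and therefore the critical cluster size is given by

$$ n_{\mathrm{cr}} = (c_{\mathrm{clust}}\,4\pi/3) \left( \frac{\ell}{y} \right)^{3}. \tag{9.57} $$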
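Because (9.57) rests on the linearization (9.54), it is instructive to compare it with the critical size that follows from the exponential law (9.39), $n_{\mathrm{cr}} = (c_{\mathrm{clust}}\,4\pi/3)\,[\ell/\ln(1+y)]^3$. The sketch below does this in dimensionless form; the value $\ell k(1) = 20/3$ is an assumption taken over from the single-droplet example of Section 9.2, and the function names are illustrative.

```python
import math

L_K1 = 20.0 / 3.0  # assumed value of l*k(1) = (2/3)*sigma*A(1)/(k_B T)


def n_cr_linear(y):
    """Critical size from the linearized growth law, cf. (9.57)."""
    return (L_K1 / y) ** 3


def n_cr_exponential(y):
    """Critical size from c_free = c_eq(inf)*exp(l*k(n)), cf. (9.39)."""
    return (L_K1 / math.log1p(y)) ** 3


for y in (0.01, 0.1, 1.0, 5.0):
    print(y, f"{n_cr_linear(y):.3g}", f"{n_cr_exponential(y):.3g}")
```

The two expressions agree for $y \ll 1$. At supersaturations of order one or larger the linearized formula gives a noticeably smaller critical size, so under these assumptions the exponential form appears to be the safer choice in that regime.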
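Taking into account a multi-cluster distribution of different sizes $\{n_i \mid i = 1, \ldots, m\}$ and the particle conservation law for a finite system (9.4),

$$ c_{\mathrm{free}} = c_{\mathrm{total}} - \sum_{i=1}^{m} n_i N_i / V \tag{9.58} $$

we get, for the temporal change in the monomer concentration,

$$ \frac{d c_{\mathrm{free}}}{dt} = -\sum_i \frac{N_i}{V}\, \frac{dn_i}{dt}. \tag{9.59} $$

Using the growth law (9.55) for each cluster,

$$ \frac{dn_i}{dt} = D\, c_{\mathrm{eq}}(\infty)\, A(n_i) \left[ \frac{1}{\ell}\, \frac{c_{\mathrm{free}} - c_{\mathrm{eq}}(\infty)}{c_{\mathrm{eq}}(\infty)} - k(n_i) \right] \tag{9.60} $$

we obtain from (9.59)

$$ \frac{d c_{\mathrm{free}}}{dt} = -\frac{D}{\ell} \left[ c_{\mathrm{free}} - c_{\mathrm{eq}}(\infty) \right] \sum_i \frac{N_i}{V}\, A(n_i) + D\, c_{\mathrm{eq}}(\infty) \sum_i \frac{N_i}{V}\, A(n_i)\, k(n_i). \tag{9.61} $$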
Inserting the supersaturation, calculated from (9.61), into (9.60) we obtain the final set of nonlinear ordinary differential equations governing the constrained growth and Ostwald ripening of an ensemble of $m$ droplets of sizes $n_i$,

$$ \frac{dn_i}{dt} = D\, A(n_i) \left\{ c_{\mathrm{eq}}(\infty) \left[ \langle k \rangle - k(n_i) \right] - \frac{V}{D\, A_{\mathrm{clust}}}\, \frac{d c_{\mathrm{free}}}{dt} \right\} \tag{9.62} $$

together with

$$ \frac{d c_{\mathrm{free}}}{dt} = -\frac{D}{\ell}\, \frac{A_{\mathrm{clust}}}{V} \left[ c_{\mathrm{total}} - c_{\mathrm{clust}}\, \frac{V_{\mathrm{clust}}}{V} - c_{\mathrm{eq}}(\infty) - \ell\, c_{\mathrm{eq}}(\infty)\, \langle k \rangle \right] \tag{9.63} $$

where

$$
\begin{aligned}
A_{\mathrm{clust}} &= \sum_i N_i\, A(n_i) && \text{total surface of all droplets,} \\
V_{\mathrm{clust}} &= c_{\mathrm{clust}}^{-1} \sum_i N_i\, n_i && \text{total volume of all droplets,} \\
\langle k \rangle &= \frac{\sum_i k(n_i)\, N_i\, A(n_i)}{\sum_i N_i\, A(n_i)} && \text{mean curvature,}
\end{aligned} \tag{9.64}
$$

with $c_{\mathrm{total}} = N_{\mathrm{total}}/V = \mathrm{constant}$ being the total number of free and bound monomers per volume. The surface of a spherical drop $A(n)$ is given by (9.7). The kinetic equations (9.62, 9.63) describe the rapid growth of the droplet ensemble (the second term in (9.62), containing $d c_{\mathrm{free}}/dt$) as well as the slow selection between the droplets (the first term in (9.62)) as a competition process. The numerical solution of the coupled nonlinear equations yields the time evolution of a given droplet ensemble with certain initial sizes. In the short-time regime the size of all droplets increases and the two-phase system reaches the so-called internal equilibrium ($d c_{\mathrm{free}}/dt \approx 0$), a nonstationary situation where the liquid is in quasi-equilibrium with its surrounding vapor. Since the raw material (monomers) is limited, at the end of this growth stage the total volume of the new phase $V_{\mathrm{clust}}$ is very close, but not equal, to its equilibrium value. In the final long-time regime a competitive ripening process takes place, which minimizes the total surface of all droplets $A_{\mathrm{clust}}$, the total volume $V_{\mathrm{clust}}$ being almost unchanged. At the end of the coarsening process the dynamical system has reached a stable fixed point where only one droplet is present, the other, smaller clusters having dissolved and given their monomers to the winner, the largest drop.
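The competition between droplets described above can be illustrated numerically. The following sketch is one possible setup, not the authors' calculation: it integrates the non-linearized growth law (9.34) with (9.39) for each droplet (one droplet per size, $N_i = 1$) by an explicit Euler scheme, in the dimensionless time $t^* = \alpha A(1) t/V$, with point-like particles and the parameter values of the single-droplet example of Section 9.2. Droplets that shrink below one monomer are treated as dissolved. All names and initial sizes are assumptions.

```python
import math

N_TOTAL = 1000.0     # total number of monomers (assumed)
V_CEQ = 160.0        # V * c_eq(infinity) (assumed)
L_K1 = 20.0 / 3.0    # l * k(1) (assumed)


def growth_rate(n, n_free):
    """dn/dt* of one droplet: n^(2/3) * [N_free - V*c_eq(n)], cf. (9.34), (9.39)."""
    return n ** (2.0 / 3.0) * (
        n_free - V_CEQ * math.exp(L_K1 * n ** (-1.0 / 3.0)))


def ripen(sizes, t_max=2.0, dt=1e-4):
    """Explicit Euler integration; droplets below one monomer count as dissolved."""
    sizes = list(sizes)
    for _ in range(int(round(t_max / dt))):
        n_free = N_TOTAL - sum(sizes)     # particle conservation, cf. (9.58)
        rates = [growth_rate(n, n_free) if n > 0.0 else 0.0 for n in sizes]
        sizes = [n + r * dt for n, r in zip(sizes, rates)]
        sizes = [n if n >= 1.0 else 0.0 for n in sizes]
    return sizes


final = ripen([150.0, 120.0, 90.0])
print([round(n, 1) for n in final])
```

Starting from three droplets, the smaller ones dissolve in favor of the largest, which then relaxes to the single-droplet stationary size. The explicit Euler step is the simplest choice and is adequate here only because the time step is small compared with the fastest relaxation time of the system.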
9.4 Detailed Balance and Free Energy
In Section 9.2 we discussed how the detailed balance (9.9) can be used to determine the transition rates. For this purpose one needs to calculate the free energy of the system based on the first principles of statistical mechanics. In many applications the inverse problem can be of interest. Namely, if the transition rates are known from some physical assumptions, then the free energy and chemical potentials can be derived, based on the principle of detailed balance. We will show how this works for the liquid–gas system, with the idea of applying this scheme to other systems such as traffic in transportation and biology [32, 123, 150, 241]. For simplicity, we consider a situation where only one cluster of molecules coexists with the vapor phase. The number $n$ of molecules (called monomers) bound in the cluster is a stochastic variable, whereas their total number $N$ in a given volume $V$ is fixed. The stochastic events of adding or removing one monomer are characterized by the transition rates $w_+(n)$ and $w_-(n)$ depending on the actual cluster size $n$. According to (9.9), the detailed balance reads

$$ \frac{w_+(n-1)}{w_-(n)} = \exp\left[ -\frac{F(n) - F(n-1)}{k_B T} \right], \tag{9.65} $$

where $T$ is the temperature, $k_B$ the Boltzmann constant, and $F(n)$ is the free energy of the state (including all possible microscopic distributions of coordinates and momenta of free monomers) with cluster size $n$. For large enough $n$, (9.65) can be approximated as

$$ \frac{w_+(n)}{w_-(n)} \simeq \exp\left[ -\frac{\partial F/\partial n}{k_B T} \right], \tag{9.66} $$

which leads to the equation

$$ \ln\left[ \frac{w_+(n)}{w_-(n)} \right] = -\frac{1}{k_B T}\, \frac{\partial F}{\partial n}. \tag{9.67} $$
From this we get

$$ F = F_0 - k_B T \int_0^n \ln\left[ \frac{w_+(n')}{w_-(n')} \right] dn', \tag{9.68} $$

where $F_0 = F(n = 0)$ does not depend on the cluster size $n$. It is the free energy of the system without a cluster; in this case, the free energy of an ideal gas. We insert here the physical ansatz for the transition rates for a large system in which only a small fraction of the total volume $V$ is occupied by the condensed (droplet) phase (cf. (9.24) with $V_{\mathrm{free}} \to V$),

$$ \frac{w_+(n)}{w_-(n)} = \frac{\lambda_0^3(T)\,(N - n)}{V} \exp\left[ \frac{f_{n-1}(T) - f_n(T)}{k_B T} \right], \tag{9.69} $$

where $N$ is the total number of particles, previously written as $N_{\mathrm{total}}$, and $f_n(T)$ is the binding energy of a cluster of size $n$. By using the approximation $f_{n-1}(T) - f_n(T) \simeq -\partial f_n(T)/\partial n$, we obtain

$$ F = F_0 - k_B T \int_0^n \ln\left[ \frac{\lambda_0^3(T)\,(N - n')}{V} \right] dn' + f_n(T). \tag{9.70} $$

The integration, using $\int \ln x \, dx = x \ln x - x$, yields

$$ F = F_0 - k_B T N \left[ \ln\left( \lambda_0^3(T)\, \frac{N}{V} \right) - 1 \right] + k_B T (N - n) \left[ \ln\left( \lambda_0^3(T)\, \frac{N - n}{V} \right) - 1 \right] + f_n(T). \tag{9.71} $$

The free energy of an ideal system (gas) $F_0$ cannot be obtained from the detailed balance relation. It is given by $F_0 = -k_B T \ln Z_{\mathrm{ideal}}$, where $Z_{\mathrm{ideal}}$ is the partition function of the ideal gas (9.14), which for the one-cluster system with volume $V = L^3$ reads

$$ Z_{\mathrm{ideal}} = \frac{1}{N!} \left( \frac{L}{\lambda_0(T)} \right)^{3N}. \tag{9.72} $$

Hence, applying the Stirling formula $\ln N! \simeq N \ln N - N$, we obtain

$$ F_0 = k_B T N \left[ \ln\left( \lambda_0^3(T)\, \frac{N}{V} \right) - 1 \right]. \tag{9.73} $$

By inserting (9.73) into (9.71) we recover the known expression (cf. (9.21) for $n > 0$)

$$ F = k_B T (N - n) \left[ \ln\left( \lambda_0^3(T)\, \frac{N - n}{V} \right) - 1 \right] + f_n(T) \tag{9.74} $$

for the free energy of a liquid–gas system under isothermal and isochoric conditions. Equation (9.71) can be written as

$$ \frac{F - F_0}{V k_B T} = \rho \left\{ \left( 1 - \frac{n}{N} \right) \left[ \ln\left( 1 - \frac{n}{N} \right) - 1 \right] + 1 - \frac{n}{N} \ln\left[ \lambda_0^3(T)\, \rho \right] + \frac{\mu_\infty(T)}{k_B T}\, \frac{n}{N} + \frac{3}{2}\, \ell(T)\, (c_{\mathrm{clust}}\,4\pi/3)^{1/3}\, N^{-1/3} \left( \frac{n}{N} \right)^{2/3} \right\}, \tag{9.75} $$
where $\rho = N/V$ is the overall density and $\ell(T)$ is the diffusion length (width) of the liquid–gas interface given by (9.33). In the following we use the dimensionless density $\tilde{\rho} = \lambda_0^3(T)\, \rho$ and the dimensionless volume $\tilde{V} = V/\lambda_0^3(T)$. In this notation (9.69) transforms to

$$ \frac{w_+(n)}{w_-(n)} = \tilde{\rho} \left( 1 - \frac{n}{N} \right) \exp\left[ -\frac{\mu_\infty(T)}{k_B T} \right] \exp\left[ -\ell(T)\, (c_{\mathrm{clust}}\,4\pi/3)^{1/3}\, \tilde{V}^{-1/3}\, \tilde{\rho}^{-1/3} \left( \frac{n}{N} \right)^{-1/3} \right], \tag{9.76} $$

whereas (9.75) becomes

$$ \frac{F - F_0}{\tilde{V} k_B T} = \tilde{\rho} \left\{ \left( 1 - \frac{n}{N} \right) \left[ \ln\left( 1 - \frac{n}{N} \right) - 1 \right] + 1 - \frac{n}{N} \ln \tilde{\rho} + \frac{\mu_\infty(T)}{k_B T}\, \frac{n}{N} + \frac{3}{2}\, \ell(T)\, (c_{\mathrm{clust}}\,4\pi/3)^{1/3}\, \tilde{V}^{-1/3}\, \tilde{\rho}^{-1/3} \left( \frac{n}{N} \right)^{2/3} \right\}. \tag{9.77} $$

These equations allow us to calculate the ratio $w_+(n)/w_-(n)$, as well as the normalized (dimensionless) free energy difference $(F - F_0)/(\tilde{V} k_B T)$, depending on the fraction of condensed molecules $n/N$ at a given overall density for fixed volume and temperature. The results of the calculation for three different dimensionless densities $\tilde{\rho} = 5 \times 10^{-7}$, $10^{-5}$, $1.2 \times 10^{-5}$ at the values of the dimensionless control parameters $\mu_\infty/(k_B T) = -12$ and $\ell(T)\, (c_{\mathrm{clust}}\,4\pi/3)^{1/3}\, \tilde{V}^{-1/3} = 0.003$ are shown in Figures 9.7 and 9.8. Note that the extrema of $F - F_0$ in Figure 9.8 correspond to the crossing points with the horizontal line $w_+(n)/w_-(n) = 1$ in Figure 9.7. At the smallest density (dot-dashed line) there are no crossing points and the free energy is a monotonically increasing function of $n/N$, showing that the stable state of the liquid–gas system contains no liquid droplet. A stable droplet appears at larger densities (dashed and solid lines) by overcoming a nucleation barrier (local free energy maximum in Figure 9.8).

Figure 9.7 The ratio of transition rates $w_+(n)/w_-(n)$ depending on the fraction of condensed particles $n/N$ for three dimensionless densities $\tilde{\rho} = 5 \times 10^{-7}$ (dot-dashed line), $\tilde{\rho} = 10^{-5}$ (dashed line), and $\tilde{\rho} = 1.2 \times 10^{-5}$ (solid line).

Figure 9.8 Normalized free energy difference $(F - F_0)/(\tilde{V} k_B T) = f - f_0$ depending on the fraction of condensed particles $n/N$ for three dimensionless densities $\tilde{\rho} = 5 \times 10^{-7}$ (dot-dashed line), $\tilde{\rho} = 10^{-5}$ (dashed line), and $\tilde{\rho} = 1.2 \times 10^{-5}$ (solid line).

The parameters we have chosen are quite realistic, and are comparable with those of water at $T = 300$ K and $V = 5 \times 10^{-23}\ \mathrm{m^3}$ with about 37 250 molecules (mass $m = 2.99 \times 10^{-26}$ kg) at $\tilde{\rho} = 10^{-5}$. For water at $T = 300$ K we have $\lambda_0(T) = 2.377 \times 10^{-11}$ m and $c_{\mathrm{clust}} = 3.346 \times 10^{28}\ \mathrm{m^{-3}}$. Hence, the dimensionless density in the cluster, $\tilde{\rho}_{\mathrm{clust}} = c_{\mathrm{clust}} \lambda_0^3 = 4.491 \times 10^{-4}$, is about 50 times the critical mean density $\tilde{\rho} = \tilde{\rho}_c \simeq 9.2 \times 10^{-6}$ at which the condensation (that is, a minimum of the free energy at $n/N > 0$) appears in our calculation. Assuming the above parameters of water, we obtain $\ell(T) = 8.953 \times 10^{-10}$ m for the width of the liquid–gas interface and the surface tension $\sigma = 6.20 \times 10^{-2}\ \mathrm{N\,m^{-1}}$. The width $\ell$ is about three times the characteristic intermolecular distance in the cluster, which is roughly $c_{\mathrm{clust}}^{-1/3} \simeq 3.1 \times 10^{-10}$ m. The critical density $\tilde{\rho}_c$ increases with temperature and becomes equal to the cluster density $\tilde{\rho}_{\mathrm{clust}}$ at the critical temperature $T = T_c$. In our description the physically meaningful densities are restricted by $\tilde{\rho} \le \tilde{\rho}_{\mathrm{clust}}$. This means that no condensation phase transition takes place for these physical densities at $T > T_c$. Assuming that $\mu_\infty$ and $\sigma$ do not change with temperature and that the above given values of the dimensionless control parameters correspond to $T = 300$ K, we find $T_c \simeq 430$ K in our example.
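The statement that the extrema of $F - F_0$ coincide with the crossings $w_+/w_- = 1$ can be checked directly from (9.76) and (9.77). The sketch below uses the control parameters quoted above; the grid resolution and the function names are assumptions made for illustration.

```python
import math

MU_INF = -12.0    # mu_inf / (k_B T)
A_SURF = 0.003    # l * (c_clust*4*pi/3)^(1/3) * V~^(-1/3)


def rate_ratio(x, rho):
    """w+(n)/w-(n) from (9.76) as a function of x = n/N."""
    return rho * (1.0 - x) * math.exp(-MU_INF) * math.exp(
        -A_SURF * rho ** (-1.0 / 3.0) * x ** (-1.0 / 3.0))


def free_energy(x, rho):
    """(F - F0)/(V~ k_B T) from (9.77) as a function of x = n/N."""
    return rho * ((1.0 - x) * (math.log(1.0 - x) - 1.0) + 1.0
                  - x * math.log(rho) + MU_INF * x
                  + 1.5 * A_SURF * rho ** (-1.0 / 3.0) * x ** (2.0 / 3.0))


def crossings(rho, points=2000):
    """Grid values of x just after the ratio passes through 1."""
    xs = [(i + 0.5) / points for i in range(points)]
    found = []
    for x0, x1 in zip(xs, xs[1:]):
        if (rate_ratio(x0, rho) - 1.0) * (rate_ratio(x1, rho) - 1.0) < 0.0:
            found.append(x1)
    return found


for rho in (5e-7, 1e-5, 1.2e-5):
    print(rho, [round(x, 3) for x in crossings(rho)])
```

For the smallest density no crossing is found, while the two larger densities each give two crossings, the nucleation barrier and the stable droplet. Differentiating (9.77) with respect to $n/N$ gives $-\tilde{\rho}\,\ln[w_+(n)/w_-(n)]$, which is the dimensionless form of (9.67) and explains why the crossings mark the extrema.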
9.5 Relaxation to the Free Energy Minimum
Now we consider the general behavior of a system in the vicinity of a local or global minimum of $F(n)$. In this case the argument of the exponent in (9.66) is small and we can make a Taylor expansion

$$ \frac{w_+(n)}{w_-(n)} \simeq \exp\left[ -\frac{\partial F/\partial n}{T^*} \right] \simeq 1 - \frac{1}{T^*}\, \frac{\partial F}{\partial n}, \tag{9.78} $$

where $T^* = k_B T$ is the temperature measured in energy units. Such a notation is useful for generalization to other systems such as traffic flow, discussed in the next chapter. Equation (9.78) can be rewritten as

$$ w_+(n) - w_-(n) \simeq -\frac{w_-(n)}{T^*}\, \frac{\partial F}{\partial n}. \tag{9.79} $$

On the other hand, we can write in a deterministic approximation

$$ \frac{dn}{dt} = w_+(n) - w_-(n). \tag{9.80} $$

Comparing (9.79) and (9.80), we obtain

$$ \frac{dn}{dt} \simeq -\frac{w_-(n)}{T^*}\, \frac{\partial F}{\partial n}. \tag{9.81} $$

As in the Landau theory of phase transitions, we can expand the free energy around the minimum point $n = n_0$ defined by

$$ \left. \frac{\partial F}{\partial n} \right|_{n = n_0} = 0. \tag{9.82} $$

In the first approximation, where we retain only the leading term, we also have $w_+(n) = w_-(n) = w_\pm(n_0)$, which leads to the kinetic equation [213]

$$ \frac{dn}{dt} \simeq -\Gamma_0\, (n - n_0), \tag{9.83} $$

where

$$ \Gamma_0 = \frac{w_\pm(n_0)}{T^*} \left. \frac{\partial^2 F}{\partial n^2} \right|_{n = n_0} \tag{9.84} $$

is the relaxation rate. For $\Gamma_0 > 0$, which corresponds to the minimum of $F$, the solution is the exponential relaxation to $n = n_0$:

$$ n(t) = n_0 + \left[ n(0) - n_0 \right] e^{-\Gamma_0 t}. \tag{9.85} $$

This solution is also valid for $\Gamma_0 < 0$, in which case $n_0$ corresponds to a free energy maximum. In this case it describes the deviation from this maximum point.
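The quality of the linear relaxation law can be examined on a concrete example. The sketch below takes the single-droplet rates of Section 9.2 in dimensionless form (the parameter values and the use of (9.32) are assumptions carried over from that example), locates the free energy minimum $n_0$, evaluates the relaxation rate from $\Gamma_0 = -w_\pm(n_0)\, \partial \ln(w_+/w_-)/\partial n$, which follows from (9.67) and (9.84), and compares the exponential law with a direct Euler integration of (9.80). The symbol $\Gamma_0$ and all function names are illustrative.

```python
import math

N_TOTAL = 1000.0     # total number of monomers (assumed)
V_CEQ = 160.0        # V * c_eq(infinity) (assumed)
L_K1 = 20.0 / 3.0    # l * k(1) (assumed)


def w_plus(n):
    """Attachment rate in units of alpha*A(1)/V."""
    return n ** (2.0 / 3.0) * (N_TOTAL - n)


def w_minus(n):
    """Detachment rate (9.32) in the same units."""
    return n ** (2.0 / 3.0) * V_CEQ * math.exp(L_K1 * n ** (-1.0 / 3.0))


def log_ratio(n):
    """ln(w+/w-) = -(1/T*) dF/dn, cf. (9.67)."""
    return math.log(w_plus(n) / w_minus(n))


def bisect(func, lo, hi, tol=1e-9):
    """Root of func on a bracketing interval [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def relax_numerically(n_start, t_end, dt=1e-6):
    """Explicit Euler integration of dn/dt = w+(n) - w-(n), cf. (9.80)."""
    n = n_start
    for _ in range(int(round(t_end / dt))):
        n += (w_plus(n) - w_minus(n)) * dt
    return n


n0 = bisect(log_ratio, 200.0, 900.0)   # free energy minimum
h = 1e-3                               # step for the numerical derivative
gamma0 = -w_plus(n0) * (log_ratio(n0 + h) - log_ratio(n0 - h)) / (2.0 * h)

t_end = 1.0 / gamma0
n_num = relax_numerically(n0 + 20.0, t_end)
n_lin = n0 + 20.0 * math.exp(-gamma0 * t_end)
print(f"n0 = {n0:.1f}, Gamma0 = {gamma0:.1f}")
print(f"after one relaxation time: numerical {n_num:.2f}, linear {n_lin:.2f}")
```

For an initial deviation of 20 monomers the two results differ by only a small fraction of a monomer after one relaxation time, which supports the use of (9.85) close to the minimum; larger initial deviations would make the nonlinear corrections more visible.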
9.6 Chemical Potentials
Our system can be considered as consisting of two phases: the cluster phase with n particles and free energy Fclust (n), and the ideal gas phase with Nideal = N − n particles and free energy Fideal (Nideal ). The total free energy then is F = Fclust + Fideal . While the total number of particles N is fixed, the number of particles in any of the phases fluctuates. According to the definition, we can write µclust = ∂Fclust /∂n and µideal = ∂Fideal /∂Nideal = −∂Fideal /∂n for the chemical potentials of these phases. Hence ∂Fclust ∂Fideal ∂F = + = µclust − µideal (9.86) ∂n ∂n ∂n and the kinetic equation (9.81) can be written as w± (n0 ) dn (9.87) #− (µclust − µideal ) . dt T∗ The latter equation has a certain physical interpretation: the driving force pushing the system to the phase equilibrium is the difference in the chemical potentials in both phases. Equilibrium is reached when the chemical potentials of the coexisting phases are equal, so µclust = µideal . For the liquid–gas system Fideal (T, V, N, n) is given by (9.73), where N is replaced with Nideal = N − n, so,
N−n −1 . (9.88) Fideal (T, V, N, n) = kB T(N − n) ln λ30 (T) V Hence the total free energy (9.74) can be written as F = Fideal (T, V, N, n) + fn (T).
(9.89)
The chemical potential of the liquid phase is thus given by the derivative of the binding energy fn (T) ≡ Fclust (T, V, N, n): ∂A(n) = µ∞ (T) + kB T (T)k(n), (9.90) µclust = µ∞ (T) + σ ∂n where k(n) = 1/r is the curvature (9.29) of the liquid surface for a droplet of size n with radius r and surface area A(n) = 4πr 2 . The chemical potential of the gaseous phase calculated from (9.88) is
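$$\mu_{\mathrm{ideal}} = -\frac{\partial F_{\mathrm{ideal}}}{\partial n} = k_B T \ln\left( \lambda_0^3(T) \, \frac{N - n}{V} \right) = k_B T \ln\left( \tilde{\rho}_{\mathrm{free}} \right), \tag{9.91}$$
where $\tilde{\rho}_{\mathrm{free}} = \lambda_0^3(T)(N - n)/V$ is the dimensionless density of molecules in the gaseous phase. According to these expressions for the chemical potentials, the ansatz (9.69) can be written as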
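$$\frac{w_{+}(n)}{w_{-}(n)} = \exp\left( \frac{\mu_{\mathrm{ideal}}}{k_B T} \right) \exp\left( \frac{f_{n-1}(T) - f_n(T)}{k_B T} \right) \simeq \exp\left( -\frac{\mu_{\mathrm{clust}} - \mu_{\mathrm{ideal}}}{k_B T} \right). \tag{9.92}$$
The latter relation is consistent with (9.66) and (9.86).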
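The relation between (9.88) and (9.91) can be checked numerically. The following is a minimal sketch, assuming units with $k_B T = 1$ and $\lambda_0^3 = 1$; the function names and the values of `N`, `V` and `n` are illustrative only and do not come from the text.

```python
import math

def f_ideal(n, N, V, kT=1.0, lam3=1.0):
    """Ideal-gas free energy (9.88) for the N - n particles left in the gas phase."""
    m = N - n
    return kT * m * (math.log(lam3 * m / V) - 1.0)

def mu_ideal(n, N, V, kT=1.0, lam3=1.0):
    """Chemical potential of the gas phase, Eq. (9.91)."""
    return kT * math.log(lam3 * (N - n) / V)

# Illustrative (assumed) parameters: total particle number, volume, cluster size.
N, V, n = 1000, 5.0e4, 200.0

# Central finite difference for dF_ideal/dn; (9.91) states mu_ideal = -dF_ideal/dn.
h = 1e-3
dF_dn = (f_ideal(n + h, N, V) - f_ideal(n - h, N, V)) / (2.0 * h)
print(-dF_dn, mu_ideal(n, N, V))
```

The two printed numbers should agree to within the finite-difference error, which illustrates that the $-1$ in (9.88) cancels against the derivative of the prefactor $(N - n)$.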
This approach in the calculation of free energy and chemical potentials can be easily generalized to any system, where the principle of detailed balance is valid. In the following chapter we will apply this method to the description of traffic flow, which is a system of many interacting vehicles exhibiting similar features, such as phase transition and phase separation, as do many physical systems and supersaturated vapor, in particular.
9.7 Exercises
E 9.1 Szilard model of nucleation I Consider the Szilard model of nucleation. The nucleation reactor consists of a reservoir of aggregates having size $n = a - 1$, a nucleation box with nuclei of sizes $n = a, a + 1, \ldots, b - 1, b$, and surrounding media, where large aggregates consisting of $n = b + 1$ monomers are collected, as shown in Figure 9.9. Let $p(n, t)$ be the probability that a nucleus has size $n$ at time $t$. The probability obeys the normalization condition
$$p(a - 1, t) + G(t) + p(b + 1, t) = 1, \tag{9.93}$$
where $G(t) = \sum_{n=a}^{b} p(n, t)$ is the total probability of having the size $n \in [a, b]$. The aggregate of size $n$ can add or lose a monomer with transition rates $w_{+}(n)$ and $w_{-}(n)$, respectively. However, there is no transition from the state $n = a$ to $n = a - 1$, or from $n = b + 1$ to $n = b$ (see Figure 9.9). The task is to formulate the master equations describing the time evolution of $p(n, t)$ for this model. Another task is to do this for a modified model, where only the sizes $a \le n \le b$ are considered at a given probability inflow flux into the nucleation box, $P_{\mathrm{in}}$, and the outflow flux from it, $P_{\mathrm{out}}$, in the stationary situation where $P_{\mathrm{in}} = P_{\mathrm{out}}$ holds. In the latter case find the stationary solution for the probability distribution function over the aggregate sizes.
Figure 9.9 Schematic representation of the Szilard model (chain of states $n = a - 1, a, a + 1, \ldots, b - 1, b, b + 1$ connected by the transition rates $w_{+}(n)$ and $w_{-}(n)$).
E 9.2 Szilard model of nucleation II Find the time-dependent solution $p(n, t)$ (where $a - 1 \le n \le b$) for the Szilard model considered in the previous exercise in the special case of constant transition rates $w_{+}(n) = q$ and $w_{-}(n) = 0$ with the initial condition $p(n, 0) = \delta_{n,a-1}$. Find the probability inflow and outflow fluxes $P_{\mathrm{in}}(t)$ and $P_{\mathrm{out}}(t)$ from the reservoir to the nucleation box and from the nucleation box to the surrounding media, respectively.
E 9.3 Nucleation in a supersaturated vapor Construct the phase diagram of the supersaturated vapor–liquid system, showing the different states in the density–temperature plane. Explain the meaning of the critical point. Hint: follow the analogy with the phase diagram for the many-car system showing free flow and congested traffic in Exercise E 10.3.
10 Vehicular Traffic
10.1 The Car-Following Theory
The stochastic processes and methods of their description are powerful tools in many applications. Apart from well known classical applications in physics such as diffusion or Brownian motion, stochastic processes have been successfully used also in many interdisciplinary fields related to physics. Recently, the theoretical and empirical foundations of the physics of traffic flow have come into the focus of the physical community, see e.g. [76, 77, 100, 151]. Like every other field of physics the modeling of traffic flow described by generalized forces [78, 80, 227] should be based on the analysis of empirical data [105, 106] in order to understand the underlying stochastic process [240]. Different approaches such as deterministic and stochastic nonlinear dynamics as well as the statistical physics of many-particle systems have been very successful in understanding empirically observed structure formation on roads such as jam formation in freeway traffic [71–75, 168]. The motion of an individual vehicle has many peculiarities, since it is controlled by motivated driver behavior together with physical constraints. Nevertheless, on macroscopic scales the car ensemble displays phenomena such as phase formation (nucleation), widely encountered in different physical systems. This analogy can be clearly seen on a mesoscopic level of description where, instead of following the motion of each vehicle, a stochastic cluster of congested cars is considered [142, 147–149, 151, 153]. In a wide class of traffic models, the so-called car-following models, the behavior of a given vehicle is determined by the car ahead. One often assumes that each car tries to drive with a certain velocity, which is optimal for the actual headway distance. Car-following models with optimal velocity function, first proposed by Bando et al. [10–12], have been reviewed and analyzed in [44]. 
These optimal velocity models can be related to the traffic model with time lag [173, 243], in which the equation of motion of the $i$th car reads
$$\frac{dx_i(t + \tau)}{dt} = v_{\mathrm{opt}}(\Delta x_i(t)), \tag{10.1}$$
where $x_i(t)$ is the position of the $i$th vehicle at time $t$, $\Delta x_i(t) = x_{i+1}(t) - x_i(t)$ is the headway, and $\tau$ is the delay time. The latter is the time lag with which the car velocity reaches the optimal velocity $v_{\mathrm{opt}}(\Delta x_i(t))$. The optimal velocity function usually is expressed in terms of hyperbolic tangents [10, 44]
$$v_{\mathrm{opt}}(\Delta x) = \frac{v_{\max}}{2} \left[ \tanh(\Delta x - h_c) + \tanh(h_c) \right] \tag{10.2}$$
or Hill's function [151, 153]
$$v_{\mathrm{opt}}(\Delta x) = v_{\max} \frac{(\Delta x)^2}{D^2 + (\Delta x)^2}. \tag{10.3}$$
Both are sigmoidal functions. A common parameter is the maximal velocity $v_{\max}$, whereas $h_c$ and $D$ are parameters with similar meaning which are called the safety distance and interaction distance, respectively. By making the Taylor expansion with respect to the delay time $\tau$ in (10.1),
$$\frac{dx_i(t + \tau)}{dt} = \frac{dx_i(t)}{dt} + \tau \frac{d^2 x_i(t)}{dt^2} + \ldots, \tag{10.4}$$
and taking the two leading terms only, we obtain
$$\tau \frac{d^2 x_i(t)}{dt^2} + \frac{dx_i(t)}{dt} = v_{\mathrm{opt}}(\Delta x_i). \tag{10.5}$$
A rearrangement of terms yields the known optimal velocity model
$$\frac{d^2 x_i(t)}{dt^2} = \frac{1}{\tau} \left[ v_{\mathrm{opt}}(\Delta x_i) - \frac{dx_i(t)}{dt} \right]. \tag{10.6}$$
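The two optimal velocity functions (10.2) and (10.3) and the acceleration law (10.6) can be written down in a few lines. This is only a sketch: the default values of `v_max`, `h_c`, `D` and `tau` are placeholders chosen for illustration, not values taken from the cited works.

```python
import math

def v_opt_tanh(dx, v_max=1.0, h_c=2.0):
    """Optimal velocity (10.2); dx and the safety distance h_c share one length unit."""
    return 0.5 * v_max * (math.tanh(dx - h_c) + math.tanh(h_c))

def v_opt_hill(dx, v_max=1.0, D=1.0):
    """Optimal velocity (10.3) with interaction distance D."""
    return v_max * dx**2 / (D**2 + dx**2)

def ov_acceleration(dx, v, tau=1.0, v_opt=v_opt_hill):
    """Right-hand side of (10.6): relaxation of v towards v_opt(dx) with sensitivity 1/tau."""
    return (v_opt(dx) - v) / tau

# Both functions vanish at zero headway and grow monotonically with the headway.
for dx in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(dx, v_opt_tanh(dx), v_opt_hill(dx))
```

With (10.3) the optimal velocity equals $v_{\max}/2$ exactly at $\Delta x = D$, which is one way to read the interaction distance. In the form (10.2) the limit for $\Delta x \to \infty$ is $v_{\max}[1 + \tanh(h_c)]/2$, which approaches $v_{\max}$ only for sufficiently large $h_c$.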
The parameter $1/\tau$ is called the driver's sensitivity. Car-following models with a certain optimal distance function are also considered. In [250] a force model has been proposed, where the acceleration of a car consists of two components. One of them depends only on the velocity $v$ and describes a tendency to keep the maximal speed
$$a_1(v) = a_0 \left( 1 - \frac{v}{v_{\max}} \right), \tag{10.7}$$
where $a_0$ is the acceleration when the vehicle starts moving. The other component describes a tendency to keep some optimal distance $\Delta x_{\mathrm{opt}}$ from the vehicle in front, which has been defined as
$$\Delta x_{\mathrm{opt}}(v) = d_0 + k_0 v^2, \tag{10.8}$$
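where $d_0$ and $k_0$ are two parameters chosen as 9 m and 0.1 m$^{-1}$ s$^2$ in [250]. The idea of keeping a certain distance is similar to that in molecular dynamics, therefore this component of acceleration $a_2(\Delta x, \Delta x_{\mathrm{opt}})$ is related to the interaction potential of Lennard–Jones type
$$U(\Delta x, \Delta x_{\mathrm{opt}}) \propto \frac{1}{4} \left( \frac{\Delta x_{\mathrm{opt}}}{\Delta x} \right)^4 - \frac{1}{2} \left( \frac{\Delta x_{\mathrm{opt}}}{\Delta x} \right)^2. \tag{10.9}$$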
The difference from molecular dynamics is that the force
$$F(\Delta x_i) = m \, a_2(\Delta x_i, \Delta x_{\mathrm{opt}}(v_i)) = -\frac{\partial U}{\partial x_i} = \frac{\partial U}{\partial \Delta x_i} \tag{10.10}$$
is passed from the front vehicle to its follower but not vice versa, so Newton’s third law does not apply here. In this notation, index i refers to the ith car and ∆xi = xi+1 − xi is the headway. Thus the force, normalized to the vehicle mass m, is given by
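$$\frac{F(\Delta x)}{m} = a_2(\Delta x, \Delta x_{\mathrm{opt}}(v)) = \frac{k}{\Delta x} \left[ -\left( \frac{\Delta x_{\mathrm{opt}}}{\Delta x} \right)^4 + \left( \frac{\Delta x_{\mathrm{opt}}}{\Delta x} \right)^2 \right], \tag{10.11}$$
where $k$ is a constant coefficient taken as 50 m$^2$ s$^{-2}$ in [250]. For these control parameters, the normalized force depending on the headway distance $\Delta x$ at different velocities $v$ is shown in Figure 10.1. The equation of motion is obtained by summing up both components of the acceleration, $a_1(v)$ and $a_2(\Delta x, \Delta x_{\mathrm{opt}}(v))$, to obtain the resulting acceleration $d^2 x_i/dt^2$ of the vehicle $i$, which depends on the headway $\Delta x_i = x_{i+1} - x_i$ and velocity $v_i = dx_i/dt$
$$\frac{d^2 x_i}{dt^2} = a_0 \left( 1 - \frac{v_i}{v_{\max}} \right) + \frac{k}{\Delta x_i} \left[ -\left( \frac{\Delta x_{\mathrm{opt}}(v_i)}{\Delta x_i} \right)^4 + \left( \frac{\Delta x_{\mathrm{opt}}(v_i)}{\Delta x_i} \right)^2 \right]. \tag{10.12}$$
The above are specific car-following models. In a more general formulation, the acceleration can depend on the headway distance, velocity, and also on the velocity difference $v_{i+1} - v_i$.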
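The force model (10.7)–(10.12) is easy to evaluate directly. The sketch below uses the values $d_0 = 9$ m, $k_0 = 0.1$ m$^{-1}$ s$^2$ and $k = 50$ m$^2$ s$^{-2}$ quoted from [250]; the values of `a0` and `v_max` are not given in the text and are assumed here purely for illustration.

```python
def dx_opt(v, d0=9.0, k0=0.1):
    """Optimal distance (10.8) in metres for a velocity v in m/s."""
    return d0 + k0 * v**2

def a1(v, a0=2.0, v_max=30.0):
    """Velocity-dependent component (10.7); a0 and v_max are assumed values."""
    return a0 * (1.0 - v / v_max)

def a2(dx, v, k=50.0):
    """Distance-keeping component (10.11), i.e. the force per unit mass."""
    r = dx_opt(v) / dx
    return (k / dx) * (-r**4 + r**2)

def acceleration(dx, v, a0=2.0, v_max=30.0, k=50.0):
    """Total acceleration (10.12) of a car with headway dx and velocity v."""
    return a1(v, a0, v_max) + a2(dx, v, k)

# The interaction term changes sign at the optimal distance:
# repulsive (negative) for dx < dx_opt(v), weakly attractive (positive) beyond it.
for dx in (5.0, 9.0, 20.0, 40.0):
    print(dx, a2(dx, 0.0), a2(dx, 10.0))
```

Because the interaction term vanishes exactly at $\Delta x = \Delta x_{\mathrm{opt}}(v)$, the zero crossing of the force moves to larger headways as the velocity increases, in line with the quadratic growth of (10.8).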
Figure 10.1 The force (10.11), normalized to the vehicle mass, in the car-following model (10.12) depending on the headway distance $\Delta x$ at different car velocities $v$ (curves for $v = 0$, 5 and 10 m s$^{-1}$). The inset shows the optimal distance $\Delta x_{\mathrm{opt}}$ as a function of velocity.
10.2 The Optimal Velocity Model and its Langevin Approach
Although the cooperative behavior of cars when treated as active particles seems to be more general, we concentrate on a well-investigated approximation known as the safety distance or optimal velocity (OV) model (10.6), first proposed by Bando et al. [10–12]. This model defines, on a microscopic level, the law of motion for circular traffic in terms of headway distances and velocity differences between neighboring cars. We start with the deterministic case, following the OV model, and finally include the stochastic fluctuations, resulting in the Langevin equation. The dynamics of the ensemble of $N$ cars is given by the system of $2N$ differential equations of first order with respect to time. We consider a model of point-like cars moving on a circular road of length $L$. For each car, it is necessary to find its coordinate $x_i(t)$ and velocity $v_i(t)$, where $i = 1, \ldots, N$, at any time $t$. Within the framework of the optimal velocity model (OVM), the law of motion reads
$$\frac{dv_i}{dt} = \frac{1}{\tau} \left( v_{\mathrm{opt}}(\Delta x_i) - v_i \right); \tag{10.13}$$
$$\frac{dx_i}{dt} = v_i \tag{10.14}$$
where $v_{\mathrm{opt}}(\Delta x)$ is the optimal velocity function. As distinct from the original work [10] with (10.2), here we use the simpler expression (10.3)
$$v_{\mathrm{opt}}(\Delta x) = v_{\max} \frac{(\Delta x)^2}{D^2 + (\Delta x)^2} \tag{10.15}$$
for the optimal velocity function, where $\Delta x_i = x_{i+1} - x_i$ is the headway (bumper-to-bumper distance), $v_{\max}$ is the maximum speed allowed and $D$ is a given positive control parameter called the interaction distance. We introduce new dimensionless variables for velocities, coordinates and time in the following way
$$u_i = v_i / v_{\max}, \qquad y_i = x_i / D, \qquad T = t / \tau. \tag{10.16}$$
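After transformation (10.16) the dynamical system (10.13)–(10.14) can be rewritten as
$$\frac{du_i}{dT} = u_{\mathrm{opt}}(\Delta y_i) - u_i; \tag{10.17}$$
$$\frac{dy_i}{dT} = \frac{1}{b} u_i, \tag{10.18}$$
where the dimensionless optimal velocity $u_{\mathrm{opt}}$ and the dimensionless control parameter $b$ are given as
$$u_{\mathrm{opt}}(\Delta y) = \frac{(\Delta y)^2}{1 + (\Delta y)^2} \qquad \text{and} \qquad b = \frac{D}{\tau \, v_{\max}}. \tag{10.19}$$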
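The dimensionless system (10.17)–(10.18) can be integrated numerically on a ring. The following is a minimal sketch, assuming a classical fourth-order Runge–Kutta step and unwrapped coordinates (positions are never reduced modulo the road length); the ring length `L` below denotes the dimensionless road length, and the parameter values and function names are illustrative choices rather than the authors' setup.

```python
import numpy as np

def u_opt(dy):
    """Dimensionless optimal velocity, Eq. (10.19)."""
    return dy**2 / (1.0 + dy**2)

def headways(y, L):
    """Headways on a ring of dimensionless length L (car i follows car i+1)."""
    dy = np.roll(y, -1) - y
    dy[-1] += L  # the last car follows the first one across the ring closure
    return dy

def rhs(y, u, b, L):
    """Right-hand sides (dy/dT, du/dT) of (10.18) and (10.17)."""
    return u / b, u_opt(headways(y, L)) - u

def rk4_step(y, u, b, L, dT):
    """One classical fourth-order Runge-Kutta step for all cars at once."""
    k1y, k1u = rhs(y, u, b, L)
    k2y, k2u = rhs(y + 0.5 * dT * k1y, u + 0.5 * dT * k1u, b, L)
    k3y, k3u = rhs(y + 0.5 * dT * k2y, u + 0.5 * dT * k2u, b, L)
    k4y, k4u = rhs(y + dT * k3y, u + dT * k3u, b, L)
    y_new = y + dT * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    u_new = u + dT * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
    return y_new, u_new

def integrate(y, u, b, L, dT=0.01, steps=1000):
    """Advance the whole system by steps * dT in dimensionless time."""
    for _ in range(steps):
        y, u = rk4_step(y, u, b, L, dT)
    return y, u

# Equidistant cars moving with the optimal velocity: a stationary state of the model.
N, c, b = 60, 0.75, 1.3
L = N / c
y0 = np.arange(N) * L / N
u0 = np.full(N, u_opt(L / N))
y1, u1 = integrate(y0, u0, b, L)
print(np.max(np.abs(u1 - u0)))
```

Starting from equal headways and $u_i = u_{\mathrm{opt}}(\Delta y)$, the velocities should not change apart from rounding errors, since every right-hand side in (10.17) vanishes.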
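The steady-state or free-flow solution for all vehicles $n = 1, \ldots, N$ is
$$u_n^{\mathrm{st}} = u_{\mathrm{opt}}(\Delta y_{\mathrm{hom}}) \tag{10.20}$$
$$y_n^{\mathrm{st}} = y_0 + n \, \Delta y_{\mathrm{hom}} + \frac{1}{b} u_{\mathrm{opt}}(\Delta y_{\mathrm{hom}}) \, T \tag{10.21}$$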
where $y_0$ is an arbitrary constant, $\Delta y_{\mathrm{hom}} = \mathcal{L}/N$ is the headway distance in homogeneous flow, and $\mathcal{L} = L/D$ is the dimensionless length of the road. The stability of the steady-state solution can be investigated by the method of small perturbations. Taking into account the periodic boundary conditions, we consider a perturbation of the form
$$y_n = y_n^{\mathrm{st}} + \delta y \, e^{\lambda T + i k \Delta y_{\mathrm{hom}} n}. \tag{10.22}$$
Here $\delta y \to 0$ is the amplitude of the periodic perturbation with the wave vector $k = 2\pi m/\mathcal{L}$, where $m = 1, 2, \ldots, N - 1$ and $\mathcal{L} = L/D$. In this case it is appropriate to reduce the system of $2N$ first-order differential equations (10.17)–(10.18) to $N$ equations of the second order
$$\frac{\partial^2 y_n}{\partial T^2} + \frac{\partial y_n}{\partial T} - \frac{1}{b} u_{\mathrm{opt}}(\Delta y_n) = 0. \tag{10.23}$$
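By inserting (10.22) into (10.23) and expanding $u_{\mathrm{opt}}(\Delta y_n)$ in the vicinity of $\Delta y = \Delta y_{\mathrm{hom}}$, we obtain the following quadratic equation
$$\lambda^2 + \lambda + \frac{u'_{\mathrm{opt}}}{b} \left[ 1 - \exp\left( \frac{2\pi i \, m}{N} \right) \right] = 0 \tag{10.24}$$
for the Lyapunov exponents $\lambda$, where $u'_{\mathrm{opt}} = du_{\mathrm{opt}}(\Delta y)/d(\Delta y)$ at $\Delta y = \Delta y_{\mathrm{hom}}$. This can be easily solved by putting $\lambda = \mathrm{Re}\,\lambda + i\,\mathrm{Im}\,\lambda = \alpha + i\beta$. Hence, $\alpha$ and $\beta$ are solutions of the system of two equations
$$\alpha^2 - \beta^2 + \alpha + \frac{u'_{\mathrm{opt}}}{b} \left[ 1 - \cos\left( \frac{2\pi m}{N} \right) \right] = 0, \tag{10.25}$$
$$2\alpha\beta + \beta - \frac{u'_{\mathrm{opt}}}{b} \sin\left( \frac{2\pi m}{N} \right) = 0. \tag{10.26}$$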
The homogeneous solution loses stability at a certain value of $u'_{\mathrm{opt}}/b$ when $\alpha$ vanishes for some $m$. According to (10.25)–(10.26), this takes place at
$$\alpha = 0, \qquad \beta = \frac{u'_{\mathrm{opt}}}{b} \sin\left( \frac{2\pi m}{N} \right), \qquad \frac{u'_{\mathrm{opt}}}{b} = \frac{1}{1 + \cos\left( \frac{2\pi m}{N} \right)}. \tag{10.27}$$
The stability border corresponds to the value of $u'_{\mathrm{opt}}/b$ at which the real part of one of the Lyapunov exponents vanishes for the first time and then becomes positive if $u'_{\mathrm{opt}}/b$ is increased. This occurs first with the exponent indexed by $m = 1$. Consider now the stability regions in the $b$-$c$ plane, where $c = N/\mathcal{L}$ is the dimensionless density of cars ($\mathcal{L} = L/D$ being the dimensionless road length). These regions for homogeneous and heterogeneous solutions depend on the initial conditions as well as the total number of cars $N$. The homogeneous flow described by the steady-state solution of (10.17)–(10.18), that is $\Delta y_i^{\mathrm{st}} = \Delta y_{\mathrm{hom}} = \mathcal{L}/N = 1/c$ and $u_i^{\mathrm{st}} = u_{\mathrm{opt}}(\Delta y_i^{\mathrm{st}})$, becomes unstable when entering the region $b < b(c)$, where $b(c)$ is given by
$$b(c) = u'_{\mathrm{opt}}\left( \frac{1}{c} \right) \left[ 1 + \cos\left( \frac{2\pi}{N} \right) \right] \tag{10.28}$$
with
$$u'_{\mathrm{opt}}\left( \frac{1}{c} \right) = \frac{2 c^3}{(1 + c^2)^2}, \tag{10.29}$$
as is consistent with the stability condition (10.27) at $m = 1$. The maximum of $b(c)$ corresponds to the critical concentration $c_{\mathrm{cr}} = \sqrt{3}$. The finite-size effect on the phase diagram, where the regions of stable and unstable homogeneous flow are separated by the $b(c)$ curve, is illustrated in Figure 10.2. The homogeneous stationary solution (10.20)–(10.21) transforms into a heterogeneous limit-cycle solution when entering the region below the $b(c)$ curve. The limit cycle is formed in the phase space of headway distances $\Delta y$ and velocities $u$, as shown in Figure 10.3 for $c = 1.5$ and $b = 1.1$. The trajectory of each car goes around this limit cycle in the long-time limit $T \to \infty$, whereas the homogeneous flow is represented by the unstable fixed point $\left( \Delta y_{\mathrm{hom}}, u_{\mathrm{opt}}(\Delta y_{\mathrm{hom}}) \right)$ (solid circle) on the optimal velocity curve (dashed line). The numerical study of a system of $N = 60$ cars shows that the limit-cycle solution becomes unstable when exiting the region below the dashed curve shown in Figure 10.2(b). In other words, as in many physical systems (e.g. supersaturated vapor), we observe a hysteresis effect which is a property of the first-order phase transition. The stationary state of the system between the solid and dashed curves (at a given $N$) is of the same type as the initial conditions. A phase transition from the heterogeneous to the homogeneous state takes place at a fixed density $c$ when the parameter $b$ is increased and reaches the dashed curve in Figure 10.2(b). The heterogeneous state is characterized by the minimum $u_{\min}$ and the maximum $u_{\max}$ values of the velocity $u$ in the limit cycle. At the critical density $c = c_{\mathrm{cr}} = \sqrt{3}$ both branches $u_{\min} = u_{\min}(b)$ and $u_{\max} = u_{\max}(b)$ continuously merge into one branch $u_{\min}(b) = u_{\max}(b) = u_{\mathrm{hom}}$ at the critical point $b = b_{\mathrm{cr}} = (3\sqrt{3}/8)(1 + \cos(2\pi/N))$, where $u_{\mathrm{hom}} = 1/(1 + c^2)$ is the steady-state velocity of the homogeneous flow. This is the supercritical bifurcation diagram shown in Figure 10.4(b) for a finite number of cars $N = 60$.
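The stability border (10.28) can be cross-checked against the roots of the quadratic equation (10.24). The sketch below solves (10.24) for all modes $m = 1, \ldots, N - 1$ and returns the largest real part; the function names are illustrative and NumPy is assumed to be available.

```python
import numpy as np

def du_opt(dy):
    """Derivative of u_opt(dy) = dy^2 / (1 + dy^2)."""
    return 2.0 * dy / (1.0 + dy**2) ** 2

def b_border(c, N):
    """Stability border b(c) of the homogeneous flow, Eqs. (10.28)-(10.29)."""
    return du_opt(1.0 / c) * (1.0 + np.cos(2.0 * np.pi / N))

def max_growth_rate(b, c, N):
    """Largest Re(lambda) over the modes m = 1..N-1 of the quadratic (10.24)."""
    m = np.arange(1, N)
    q = du_opt(1.0 / c) / b * (1.0 - np.exp(2j * np.pi * m / N))
    disc = np.sqrt(1.0 - 4.0 * q)  # complex square root, principal branch
    roots = np.stack([(-1.0 + disc) / 2.0, (-1.0 - disc) / 2.0])
    return roots.real.max()

N, c = 60, 1.5
bb = b_border(c, N)
print(bb, b_border(c, 10**9))
print(max_growth_rate(1.01 * bb, c, N), max_growth_rate(0.99 * bb, c, N))
```

Slightly above the border all real parts are negative, slightly below it the modes $m = 1$ and $m = N - 1$ acquire a positive real part. For $c = 1.5$ the $N \to \infty$ limit of (10.28) is about 1.2781, while the finite-size value for $N = 60$ is slightly smaller.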
The transformation is discontinuous at $c \neq c_{\mathrm{cr}}$, as is consistent with the subcritical bifurcation diagram at $c = 1.5$ in Figure 10.4(a). The power-like singularities of $u_{\max}$ and $u_{\min}$ at $c = c_{\mathrm{cr}}$ and $b \to b_{\mathrm{cr}}$,
$$u_{\max}(b) - u_{\mathrm{hom}} \propto (b_{\mathrm{cr}} - b)^{\beta_1} \tag{10.30}$$
$$u_{\mathrm{hom}} - u_{\min}(b) \propto (b_{\mathrm{cr}} - b)^{\beta_2}, \tag{10.31}$$
are described by the critical exponents $\beta_1$ and $\beta_2$. It is useful to define the effective critical exponents $\beta_1^{\mathrm{eff}}(h)$ and $\beta_2^{\mathrm{eff}}(h)$ as the slope of the log-log plot evaluated from
Figure 10.2 Phase diagram in the $b$-$c$ plane for systems with different fixed numbers of cars $N$ (panels (a) and (b), density $c$ on the horizontal axis and parameter $b$ on the vertical axis). From the bottom to the top, solid curves $b(c)$ show the stability border of the homogeneous traffic flow at $N = 6$, $N = 10$, and $N = 60$, respectively. The dotted curve is the function $b(c)$ when $N$ tends to infinity. Each $b(c)$ plot has a maximum at the critical density $c_{\mathrm{cr}} = \sqrt{3} \approx 1.732$. The maximum value at $N \to \infty$ is $b_{\mathrm{cr}} = 3\sqrt{3}/4 \approx 1.299$. The dashed curve in panel (b) shows the stability border of the heterogeneous traffic flow at $N = 60$ (numerical simulation). It becomes unstable when exiting the region below this curve.
the data at b = bcr − h and b = bcr − 2h. The results are shown in Figure 10.5. The true (asymptotic) values of the critical exponents are obtained at h → 0. Figure 10.5 represents the numerical evidence that both exponents β1 and β2 have the same universal mean-field value 1/2. It also shows that an evaluation of the critical exponent by simply measuring the slope of the log-log plot at a quite small distance from the critical point, e.g. at h ∼ 0.006, is rather misleading [94]. The dynamics of a car system is described by a trajectory in 2N-dimensional space of all headway distances ∆yi and velocities ui . For any given initial condition, one can consider N individual trajectories (one for each car) in the two-dimensional (∆y, u) phase space. The dynamics represented by such a set of trajectories depends on the initial condition. However, for a certain set of initial conditions at a given
Figure 10.3 The limit cycle (solid curve) in the plane of headway distance $\Delta y$ and velocity $u$ for the system of $N = 60$ cars at the dimensionless density $c = 1.5$ and parameter $b = 1.1$. The dashed line is the optimal velocity curve and the solid circle is the unstable fixed point. Arrows indicate the direction of motion along the limit cycle.
overall density of cars, the movement of the points in the phase space can be characterized by the vector field showing the average flow of these points. In particular, we consider an ensemble of initial conditions with a random spatial distribution of cars, with the only requirement that $\Delta y_i > \Delta y_{\min}$ holds, where $\Delta y_{\min}$ is some minimum headway distance, the initial velocities being distributed uniformly within $u \in [\langle u \rangle - \Delta u, \langle u \rangle + \Delta u]$, where $\langle u \rangle$ is some mean velocity. The ensemble also includes a set of $\langle u \rangle$ values to cover a certain region of the velocities of interest. At each time step of the numerical integration, we have $N$ flow vectors $\mathbf{J}_i = (d\Delta y_i/dT, du_i/dT) = ([u_{i+1} - u_i]/b, du_i/dT)$ associated with $N$ points $(\Delta y_i, u_i)$ in the phase space. The field of the averaged flow vectors $\langle \mathbf{J} \rangle$ is constructed by splitting the considered region of the phase space into cells, and then calculating the mean flow vector for each cell by averaging over the ensemble of initial conditions and over all time steps from $T = 0$ to $T = T_{\max}$. The mean flow vector is depicted by a suitably scaled arrow with its origin in the middle of the corresponding cell. It shows the mean direction in which the points belonging to this cell of the phase space move, and the length of the arrow is proportional to the average speed of this movement. Such a vector field for the system of $N = 60$ cars at the values of the dimensionless control parameters $b = 1.3$ and $c = 0.75$, for which the homogeneous flow (that is, the fixed point) is stable, is depicted in Figure 10.6. The parameters $\Delta y_{\min} = 0.2$, $\Delta u = 0.05$ with a grid of values $\langle u \rangle = \Delta u, 2\Delta u, \ldots, 1 - \Delta u$ have been used for the set of initial conditions, including 100 different random initial distributions of coordinates and velocities for each value of $\langle u \rangle$. The integration has been performed up to $T_{\max} = 100$.
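The cell-averaging step described above can be sketched with two small helper functions. This is an illustrative outline only, assuming NumPy: `flow_vectors` evaluates the flow vector of every car from (10.17)–(10.18) on a ring, and `mean_flow_field` averages arbitrary samples over a rectangular grid; the names, the grid and the synthetic data in the usage lines are assumptions, not the authors' code.

```python
import numpy as np

def u_opt(dy):
    """Dimensionless optimal velocity, Eq. (10.19)."""
    return dy**2 / (1.0 + dy**2)

def flow_vectors(dy, u, b):
    """Flow vectors J_i = ((u_{i+1} - u_i)/b, u_opt(dy_i) - u_i) for cars on a ring."""
    return (np.roll(u, -1) - u) / b, u_opt(dy) - u

def mean_flow_field(dy, u, J_dy, J_u, dy_edges, u_edges):
    """Average the flow-vector components over the cells of a (dy, u) grid.

    All inputs are flat arrays of samples collected over cars, time steps and
    initial conditions. Cells without samples yield NaN in the returned means.
    """
    bins = [dy_edges, u_edges]
    count, _, _ = np.histogram2d(dy, u, bins=bins)
    sum_dy, _, _ = np.histogram2d(dy, u, bins=bins, weights=J_dy)
    sum_u, _, _ = np.histogram2d(dy, u, bins=bins, weights=J_u)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sum_dy / count, sum_u / count, count

# Usage with synthetic samples: a constant flow (1, -2) must be recovered in every cell.
rng = np.random.default_rng(0)
pts_dy, pts_u = rng.random(2000), rng.random(2000)
edges = np.linspace(0.0, 1.0, 5)
mean_dy, mean_u, count = mean_flow_field(
    pts_dy, pts_u, np.full(2000, 1.0), np.full(2000, -2.0), edges, edges
)
print(mean_dy[count > 0].min(), mean_u[count > 0].max())
```

In an actual run the sample arrays would be filled by calling `flow_vectors` at each integration step and concatenating the results over all realizations before the averaging.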
In Figure 10.6(b) a more detailed view of the region around the stable fixed point is shown, obtained by averaging over 1000 random realizations for each mean velocity $\langle u \rangle$. This vector field or phase portrait of the car system represents a nontrivial
Figure 10.4 Subcritical bifurcation diagram at $c = 1.5$ (a) and supercritical bifurcation diagram at the critical density $c = c_{\mathrm{cr}} = \sqrt{3}$ (b) for the minimum and maximum velocities $u_{\min}$ and $u_{\max}$ (together with $u_{\mathrm{hom}}$) depending on the control parameter $b$ for a fixed number of cars $N = 60$.
(averaged) dynamics for reaching the fixed point. As we can see, the relaxation to the fixed point is not straightforward. An important characteristic is the ratio $|\langle \mathbf{J} \rangle|/\sigma$, where $|\langle \mathbf{J} \rangle|$ is the averaged modulus of the flow vector and $\sigma = \sqrt{\langle (\mathbf{J} - \langle \mathbf{J} \rangle)^2 \rangle}$ is the square root of its variance, which characterizes the mean deviation or fluctuation of $\mathbf{J}$. The region between the dotted lines in Figure 10.6 corresponds to $|\langle \mathbf{J} \rangle| < 3\sigma$, whereas $|\langle \mathbf{J} \rangle| > 3\sigma$ holds outside this region. In the latter case the motion in the phase plane of $\Delta y$ and $u$ is very predictable and is usually given with quite good accuracy by the mean flow vector. The motion becomes more diffusive with decreasing $|\langle \mathbf{J} \rangle|/\sigma$. This is observed near the optimal velocity curve (the dashed line) and, particularly, in the vicinity of the fixed point (the solid circle), where $|\langle \mathbf{J} \rangle| \ll \sigma$ holds. In Figure 10.7 the phase portrait is shown for a larger density $c = 1.5$ and $b = 1.28$. It is very close to the border $b = b(c) \simeq 1.2781$ given by (10.28), where the homogeneous stationary solution becomes unstable and transforms into the limit
Figure 10.5 The effective critical exponents $\beta_1^{\mathrm{eff}}(h)$ (solid circles) and $\beta_2^{\mathrm{eff}}(h)$ (empty circles) estimated from the $u_{\max}(b)$ and $u_{\min}(b)$ data, respectively, at finite distances from the critical point: $b = b_{\mathrm{cr}} - h$ and $b = b_{\mathrm{cr}} - 2h$. The universal asymptotic value of the critical exponent $\beta = 0.5$ is obtained at $h \to 0$.
cycle. In fact, the flow vectors indicate an unstable limit cycle, and the convergence to the fixed-point stationary solution is very slow. A disadvantage of the optimal velocity model (10.13)–(10.14) is that it is not collision free. In other words, it does not ensure a solution in which all the headway distances are always positive. In particular, we never succeeded in obtaining a limit-cycle solution without collisions at $b < 0.86$. An important feature of the original as well as our optimal velocity model (10.13)–(10.14) is the symmetry between acceleration and braking. In real traffic, however, the braking (deceleration) can be considerably larger than the acceleration, which is important to avoid collisions. Below, we propose a collision-free model which includes this asymmetry. The dynamics of a car system can be formulated as Newton's law of motion. Returning to the dimensional quantities, we have the following set of $2N$ differential equations
$$m \frac{dv_i}{dt} = F_{\mathrm{det}}(v_i, \Delta x_i); \tag{10.32}$$
$$\frac{dx_i}{dt} = v_i \tag{10.33}$$
where $m$ is the mass of a point-like particle. Now we split the deterministic force $F_{\mathrm{det}}(v_i, \Delta x_i)$ into two parts
$$F_{\mathrm{det}}(v_i, \Delta x_i) = F_{\mathrm{acc}}(v_i) + F_{\mathrm{dec}}(v_i, \Delta x_i) \tag{10.34}$$
with acceleration and deceleration ansatz
$$F_{\mathrm{acc}}(v_i) = \frac{m}{\tau} (v_{\max} - v_i) \ge 0 \tag{10.35}$$
Figure 10.6 The vector field showing an averaged movement of points in the phase space of headway distances $\Delta y$ and velocities $u$ at the values of control parameters $c = 0.75$ and $b = 1.3$ for an ensemble of random initial distributions: (a) overview, (b) detailed view of the region around the fixed point. The dashed line is the optimal velocity curve and the solid circle is the stable fixed point corresponding to the homogeneous flow. The dotted lines are isolines $|\langle \mathbf{J} \rangle| = 3\sigma$.
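$$F_{\mathrm{dec}}(v_i, \Delta x_i) = \frac{m}{\tau} \left( v_{\mathrm{opt}}(\Delta x_i) - v_{\max} \right) \left[ 1 + \left( p \, \frac{v_i}{\Delta x_i} \right)^2 \right] \le 0 \tag{10.36}$$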
taking into account the optimal velocity function (10.15). The deceleration force $F_{\mathrm{dec}}$ includes a symmetry-breaking term with a new parameter $p$. A simple consideration shows that a deceleration $-dv_i/dt > v_0^2/(2\Delta x)$ is large enough for a car moving with initial velocity $v_0$ to stop before reaching a vehicle standing in front of it at an initial distance $\Delta x$. According to safety considerations, a car has to decelerate even faster at small distances $\Delta x$, therefore we have assumed in (10.36) that the additional term in $F_{\mathrm{dec}}$ is proportional to $(v_i/\Delta x_i)^2$. Under this condition a car can never reach, with nonzero velocity $v' = v_i(t')$ (at a time $t = t'$), another car standing in front of it, since the assumption $v' > 0$ leads to a contradiction: it implies that
Figure 10.7 The vector field showing an averaged movement of points in the phase space of headway distances $\Delta y$ and velocities $u$ at the values of control parameters $c = 1.5$ and $b = 1.28$ for an ensemble of random initial distributions. The dashed line is the optimal velocity curve, and the solid circle is the stable fixed point corresponding to the homogeneous flow. The dotted lines are isolines $|\langle \mathbf{J} \rangle| = 3\sigma$.
$-dv_i/dt > v_i^2(t_0)/(2\Delta x(t_0))$ holds within a certain time interval $t_0 < t < t'$, which by itself rules out the collision. The original optimal velocity model (10.13)–(10.14) is recovered at $p = 0$. If $p$ is a small parameter, then the new model behaves practically in the same way as the old one, except for those critical situations where some car produces an accident in the old model, and its motion is now corrected to avoid the collision. The deceleration force term due to interaction between cars is always negative
$$F_{\mathrm{dec}} = -v_{\max} \frac{m}{\tau} \, \frac{1}{1 + (\Delta x_i/D)^2} \left[ 1 + \left( p \, \frac{v_i}{\Delta x_i} \right)^2 \right]. \tag{10.37}$$
Using the acceleration (10.35) and deceleration (10.37) ansatz we are able to write the deterministic force $F_{\mathrm{det}}$ as
$$F_{\mathrm{det}}(v_i, \Delta x_i) = v_{\max} \frac{m}{\tau} \left\{ 1 - \frac{v_i}{v_{\max}} - \frac{1}{1 + (\Delta x_i/D)^2} \left[ 1 + \left( p \, \frac{v_i}{\Delta x_i} \right)^2 \right] \right\}. \tag{10.38}$$
As earlier, it is convenient to make our equations of motion dimensionless using (10.16). Finally we obtain
$$\frac{du_i}{dT} = 1 - u_i - \frac{1}{1 + (\Delta y_i)^2} \left[ 1 + \left( \tilde{p} \, \frac{u_i}{\Delta y_i} \right)^2 \right] \tag{10.39}$$
$$\frac{dy_i}{dT} = \frac{1}{b} u_i \tag{10.40}$$
where $\tilde{p} = p \, v_{\max}/D$.
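The effect of the additional deceleration term in (10.39) can be illustrated with a single car approaching a standing obstacle. The sketch below is a rough illustration, assuming a plain explicit Euler step with a small step size rather than any particular scheme used by the authors; the function name, the step size and the stopping rule are assumptions made for this example.

```python
def approach_wall(p_tilde, b=1.0, u0=0.7, wall=1.0, dT=1e-3, T_max=50.0):
    """Integrate (10.39)-(10.40) for one car starting at y = 0 with velocity u0.

    The obstacle stands at y = wall, so the headway is gap = wall - y.
    Returns (gap, u, T): gap == 0.0 signals a collision with velocity u at time T.
    """
    y, u, T = 0.0, u0, 0.0
    while T < T_max:
        gap = wall - y
        du = 1.0 - u - (1.0 + (p_tilde * u / gap) ** 2) / (1.0 + gap**2)
        y_new = y + dT * u / b
        if y_new >= wall:
            return 0.0, u, T  # the car reaches the wall with velocity u
        u += dT * du
        y = y_new
        T += dT
    return wall - y, u, T

# p_tilde = 0 is the original OV model, p_tilde = 0.2 the modified model.
print(approach_wall(0.0))
print(approach_wall(0.2))
```

For $\tilde{p} = 0$ the car is expected to hit the obstacle with a small but nonzero velocity, whereas for $\tilde{p} = 0.2$ the gap should stay positive while the velocity decays towards zero.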
Figure 10.8 The dimensionless velocity $u$ vs coordinate $y$ for one car with the initial coordinate $y = 0$ and velocity $u = 0.7$ having a wall at the position $y = 1$ in front of it. The dashed line, showing a collision with the wall, corresponds to the optimal velocity model ($\tilde{p} = 0$), whereas the solid line corresponds to the advanced collision-free model with $\tilde{p} = 0.2$.
To illustrate the difference between the new and old models, we have made a numerical calculation at $b = 1$ for one car with the initial coordinate $y = 0$ and velocity $u = 0.7$ having a wall (or, equally, another standing car) in front of it at the position $y = 1$. In Figure 10.8 we have plotted the velocity $u$ depending on the coordinate $y$ for the OV model with $\tilde{p} = 0$ (dot-dashed curve), as well as for the new advanced model with $\tilde{p} = 0.2$ (solid curve). As can be seen, the car reaches the wall with nonzero velocity in the original model. On the contrary, no accident occurs in the new model. In this case the car approaches the wall asymptotically at $T \to \infty$ with vanishing velocity $u \to 0$. The homogeneous steady-state solution $du_i/dT = 0$ of system (10.39)–(10.40) reads
$$1 - u_i - \frac{1}{1 + (\Delta y_i)^2} \left[ 1 + \left( \tilde{p} \, \frac{u_i}{\Delta y_i} \right)^2 \right] = 0. \tag{10.41}$$
This corresponds to equal headways $\Delta y_i = 1/c$. The steady-state velocity can be found as the positive root of the quadratic equation
$$u_i^2 + \frac{(\Delta y_i)^2 \left( 1 + (\Delta y_i)^2 \right)}{\tilde{p}^2} u_i - \frac{(\Delta y_i)^4}{\tilde{p}^2} = 0 \tag{10.42}$$
which follows from (10.41). It yields
$$u_i = \frac{(\Delta y_i)^2 \left( 1 + (\Delta y_i)^2 \right)}{2 \tilde{p}^2} \left[ \sqrt{1 + \frac{4 \tilde{p}^2}{\left( 1 + (\Delta y_i)^2 \right)^2}} - 1 \right]. \tag{10.43}$$
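The closed-form root (10.43) can be verified against the steady-state condition (10.41). The following is a minimal sketch with illustrative function names; the headway $\Delta y = 1$ used in the usage lines is an arbitrary choice.

```python
import math

def u_steady(dy, p_tilde):
    """Steady-state velocity (10.43); reduces to u_opt(dy) for p_tilde = 0."""
    s = 1.0 + dy**2
    if p_tilde == 0.0:
        return dy**2 / s
    return dy**2 * s / (2.0 * p_tilde**2) * (math.sqrt(1.0 + 4.0 * p_tilde**2 / s**2) - 1.0)

def residual(u, dy, p_tilde):
    """Left-hand side of the steady-state condition (10.41)."""
    return 1.0 - u - (1.0 + (p_tilde * u / dy) ** 2) / (1.0 + dy**2)

dy = 1.0
for p_tilde in (0.0, 0.1, 1.0):
    u = u_steady(dy, p_tilde)
    print(p_tilde, u, residual(u, dy, p_tilde))
```

Since $\tilde{p}$ enters (10.43) only through $\tilde{p}^2$, the deviation from the optimal velocity curve shrinks roughly quadratically as $\tilde{p}$ decreases.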
The result of the optimal velocity model ui = uopt (∆yi ) is recovered at p˜ → 0. In Figure 10.9 the steady-state velocity of the new model with p˜ = 1 is compared to that of the optimal velocity model with p˜ = 0. The difference is relatively small
Figure 10.9 The steady-state velocity $u^{\mathrm{st}}$ of a homogeneous traffic flow depending on the headway distance $\Delta y$. The solid line corresponds to the optimal velocity model ($\tilde{p} = 0$), the dashed line indicates the result of the new model with $\tilde{p} = 1$.
at $\tilde{p} = 1$ and much smaller (about 100 times) at $\tilde{p} = 0.1$. The latter implies that the additional term with $\tilde{p}$ practically does not change the properties of the original optimal velocity model at $\tilde{p} \sim 0.1$, except that it makes the model collision free. Up to now we have dealt with the deterministic car-following model. The stochastic optimal velocity (OV) model of a one-lane road with periodic boundary conditions is obtained by adding the noise term. Thus, within the Langevin approach [34, 157], we have the following set of acceleration equations
$$\frac{dv_i}{dt} = \frac{v_{\mathrm{opt}}(\Delta x_i) - v_i}{\tau} + \xi_i(t). \tag{10.44}$$
Neglecting the noise term $\xi_i$, the position $x_i(t)$ and the velocity $v_i(t)$ of each car $i = 1, \ldots, N$ at every time $t$ can be calculated from the initial values by integrating the coupled equations of motion. The coupling is due to the interaction between two successive cars measured by the headway $\Delta x_i = x_{i+1} - x_i$ (the car length is set to zero for point-like cars). The optimal or desired velocity $v_{\mathrm{opt}}(\Delta x)$ (10.15) is the steady-state velocity chosen by drivers as a function of the headway between cars. It increases monotonically with the distance and tends to a constant maximum value $v_{\max}$ for $\Delta x \to \infty$. Our equations of motion can be written as a random dynamical system with multiplicative Gaussian white noise [152]
$$m \, dv_i(t) = F_{\mathrm{det}}(v_i, \Delta x_i) \, dt + \sigma v_i \, dW_i(t) \tag{10.45}$$
$$dx_i(t) = v_i \, dt \tag{10.46}$$
with noise intensity $\sigma > 0$ for a point-like particle (vehicle $i$) of mass $m$ with speed $v_i(t)$ at location $x_i(t)$. The choice of the multiplicative noise is motivated by the fact that, contrary to the additive noise, it ensures the positiveness of all velocities $v_i$. Here $F_{\mathrm{det}} = m \, (v_{\mathrm{opt}}(\Delta x_i) - v_i)/\tau$ is the deterministic force. The fluctuations $dW_i = Z \sqrt{dt}$ are given by the increment of a Wiener process, where $Z$ is a standard normal, $\mathcal{N}(0, 1)$, random number. By using the dimensionless variables introduced in (10.16), the system of equations (10.45)–(10.46) becomes
$$du_i(t) = \left( u_{\mathrm{opt}}(\Delta y_i) - u_i \right) dT + a \, u_i \, dW_i(t) \tag{10.47}$$
$$dy_i(t) = \frac{1}{b} u_i \, dT, \tag{10.48}$$
where $a = \sigma \sqrt{\tau}/m$ is the dimensionless noise amplitude. We have solved the system of equations (10.47)–(10.48) numerically for a finite system of 60 cars at nonvanishing noise intensity $a = 0.1$ by an algorithm called the explicit order 1.5 strong scheme [104]. We have fixed the dimensionless density of cars $c$ and have varied the parameter $b$ to estimate the value $b(c)$ at which the transition from the free-flow regime to congested traffic occurs. A jump-like increase in the variance of the velocity $\mathrm{var}(u) = \langle u^2 \rangle - \langle u \rangle^2$ takes place around the stability curve $b = b(c)$, see Figure 10.2. The noise tends to wash out the phase transition in the finite system which we considered, therefore we could not observe a real jump, and the value of $b(c)$ has been identified with the inflection point of the $\mathrm{var}(u)$ vs $b$ plot, where this curve has maximum steepness. Moreover, the phase transition around the critical density $c_{\mathrm{cr}} = \sqrt{3}$ becomes too diffuse to identify the location of the transition point. Due to the fluctuations at $a > 0$, our system of cars cannot stay in a metastable state for an unlimited time. This means that a sufficiently long simulation will give us one line $b = b(c)$ irrespective of the initial conditions, rather than the two branches of the phase diagram shown in Figure 10.2(b) for the deterministic case $a = 0$. Nevertheless, even at positive $a = 0.1$ a hysteresis effect has been observed in finite-time simulations at certain densities, e.g. $c = 3$. The phase diagram in Figure 10.10 shows the parts of the $b = b(c)$ curve (dotted line) at $a = 0.1$ as well as the two branches with $a = 0$ for comparison. The idea of using the variance of the velocity $u$ to distinguish between the homogeneous and heterogeneous states of the system is similar to that applied in [169], where a closely related quantity, namely the variance of the density, has been considered in the framework of stochastic car-following models.
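The stochastic system (10.47)–(10.48) can be simulated in a few lines. Note that the sketch below uses the simple Euler–Maruyama scheme, which is a deliberately simpler substitute for the explicit order 1.5 strong scheme of [104] mentioned above; the function names, the step size, the run length and the seed are assumptions made for illustration.

```python
import numpy as np

def u_opt(dy):
    """Dimensionless optimal velocity, Eq. (10.19)."""
    return dy**2 / (1.0 + dy**2)

def simulate(N=60, c=0.75, b=1.3, a=0.1, dT=0.01, T_max=50.0, seed=0):
    """Euler-Maruyama integration of (10.47)-(10.48) on a ring of length N / c.

    Starts from the homogeneous state and returns positions and velocities at T_max.
    """
    rng = np.random.default_rng(seed)
    L = N / c
    y = np.arange(N) * L / N
    u = np.full(N, u_opt(L / N))
    for _ in range(int(round(T_max / dT))):
        dy = np.roll(y, -1) - y
        dy[-1] += L  # ring closure: the last car follows the first one
        dW = rng.normal(0.0, np.sqrt(dT), size=N)  # Wiener increments Z * sqrt(dT)
        u_new = u + (u_opt(dy) - u) * dT + a * u * dW
        y = y + u * dT / b
        u = u_new
    return y, u

# Velocity variance across the cars with and without noise.
_, u_det = simulate(a=0.0)
_, u_noise = simulate(a=0.1)
print(np.var(u_det), np.var(u_noise))
```

To locate the transition as described in the text, one would repeat such runs for a range of $b$ values at fixed $c$ and look for the steepest increase of the velocity variance, ideally with much longer runs and with averaging over time and realizations.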
In particular, we expect that the stochastic OVM considered here would produce a variance plot, depending on the density and noise intensity, similar to that of the cellular automaton (CA) model discussed in [169]. This variance plot is seen in Figure 10.11. As distinct from the CA model considered in [169] (for the Nagel–Schreckenberg cellular automata model see also [168, 170, 198, 199, 205]), the noise can produce collisions in the model (10.45)–(10.46), which is highly likely at large noise intensities. A modification of the deterministic part, as proposed in (10.36), is helpful to solve this problem.

Figure 10.10 The phase diagram (b versus c) of a finite system of N = 60 cars, including the phase transition line (dotted curve) at a nonvanishing noise intensity a = 0.1. The solid and dashed lines refer to the case a = 0 and have the same meaning as in Figure 10.2.

Figure 10.11 3D-plot of the density variance (as a function of noise and density) for a cellular automaton (CA) model with slow-to-start rule. The noise is given by a parameter p > 0 [169] of random deceleration. The figure is taken from [169].

We have fixed the noise intensity a = 0.1 and the number of cars N = 60 and have studied the distribution of vehicular velocities and headway distances depending on the dimensionless car density c and parameter b. In Figure 10.12(a) the velocity distribution function f(u) is shown at b = 1.1 for a relatively small density c = 0.5 as well as for higher density c = 2. In the first case we have only one maximum located near u = 0.8, which is the steady-state velocity of the homogeneous free traffic flow without noise. The noise only smears out the delta-like distribution to yield the smooth maximum seen in Figure 10.12. The traffic flow is homogeneous with small fluctuations in velocities and also headway distances, as shown in Figure 10.12(b). The headway-distance distribution function f(∆y) has a maximum near $\Delta y = \Delta y_{\mathrm{hom}} = 1/c = 2$, which is the average headway distance in a homogeneous flow. In the second case (c = 2) the velocity distribution function and the headway distribution function have two maxima. This indicates the coexistence of two phases: free flow with relatively large headways and velocities, and jam with small headways and velocities.

The heavy traffic at c = 3.5, and the critical situation at $c = c_{\mathrm{cr}}$ and $b = b_{\mathrm{cr}}$, are illustrated in Figure 10.13 with the same notation as in Figure 10.12. In both cases the distributions have only one maximum, as is consistent with the existence of only one phase. A distinguishing feature of the critical point is that the distributions over the headway distances and velocities are relatively broad. The simulation of the stochastic equations (10.45) and (10.46) in the coexistence region also allows us to find the probability P(n, t) that just n cars are involved in the jam. In this case the jammed cars are defined as those vehicles whose headway distance is smaller than the homogeneous one, $\Delta y_{\mathrm{hom}} = 1/c$.

Figure 10.12 The probability density distribution of velocities, f(u) versus u (a), and headway distances, f(∆y) versus ∆y (b), for two different car densities c = 0.5 (free flow) and c = 2 (congested flow) at fixed dimensionless noise amplitude a = 0.1, parameter b = 1.1 and N = 60.

Figure 10.13 The probability density distribution of velocities, f(u) versus u (a), and headway distances, f(∆y) versus ∆y (b), in the case of heavy traffic with large density of cars c = 3.5 (at b = 1.1) and at the critical point $c = c_{\mathrm{cr}} = \sqrt{3} \simeq 1.73205$, $b = b_{\mathrm{cr}} \simeq 1.29548$. In both cases the dimensionless noise intensity is a = 0.1 and N = 60.
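The jam criterion stated above (a car counts as jammed when its headway is smaller than the homogeneous one, 1/c) translates directly into a counting rule. The following minimal sketch shows how a cluster-size histogram could be accumulated from sampled headway configurations. The function names `count_jammed` and `cluster_size_histogram` are hypothetical, and the sampling of the configurations themselves is left to the simulation.

```python
def count_jammed(headways, c):
    """Number of cars whose headway is below the homogeneous value 1/c."""
    threshold = 1.0 / c
    return sum(1 for dy in headways if dy < threshold)


def cluster_size_histogram(jam_sizes, n_cars):
    """Relative frequency of each jam size n = 0..n_cars in a list of samples."""
    counts = [0] * (n_cars + 1)
    for n in jam_sizes:
        counts[n] += 1
    total = len(jam_sizes)
    return [k / total for k in counts]
```

Applied to many snapshots of a long run, the histogram gives an estimate of P(n) of the kind discussed next.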
The results of numerical simulation at c = 2 are shown in Figure 10.14. They coincide qualitatively with the investigations within the stochastic master equation approach [147–149, 152, 153] discussed in Sections 10.3–10.5.
10.3 Traffic Jam Formation on a Circular Road
In the previous section we have described the motion of a car ensemble on a microscopic level by considering each car individually. Many properties of traffic flow, including jam formation, can be more easily described on a less detailed mesoscopic level, where we are looking only for the number of jammed cars n. This type of model on a one-lane circular road is illustrated in Figure 10.15, where we have shown two different regimes of traffic flow: free traffic flow (a) and congested traffic flow (b). In the congested traffic several jams can exist simultaneously, for example two clusters in Figure 10.15.

Figure 10.14 The cluster (jam) size distribution P(n) when the density of cars c = 2 for a system of N = 60 cars at the dimensionless noise intensity a = 0.1 and b = 1.1.

Figure 10.15 Free traffic flow (a) and congested traffic flow (b) on a one-lane circular road. In the case of congested traffic (b) there are two clusters of different length coexisting, with free flow shown as an example. The headway between the cars is $\Delta x_{\mathrm{clust}}$ inside a cluster and $\Delta x_{\mathrm{free}}$ in free flow. The direction of movement is indicated by an arrow.

Here we consider a simple model where only one car cluster (a queue of n cars) is allowed. Installation of multiple loop detectors for queue detection in recent years has made it possible to measure the queue length or n(t) as the number of congested cars directly [122]. In the following we consider the attachment of a vehicle to the car cluster and the detachment from it as elementary stochastic events. The traffic is thus treated
as a one-step Markov process described by the general master equation (3.38)
\[ \frac{\partial}{\partial t}\, p(n,t) = w_+(n-1)\, p(n-1,t) + w_-(n+1)\, p(n+1,t) - \left[w_+(n) + w_-(n)\right] p(n,t). \tag{10.49} \]
Now the basic problem is to find an appropriate ansatz for both transition probabilities $w_+(n)$ and $w_-(n)$. Note that physical boundary conditions (0 ≤ n ≤ N) for the master equation (10.49) are ensured by formally setting P(−1, t) = P(N + 1, t) = 0 and $w_+(N) = w_-(0) = 0$. The latter two transitions are physically impossible and they are not included in our further analysis. As before (3.76), we assume a constant value for the escape rate $w_-(n)$, so
\[ w_-(n) = w_- = \frac{1}{\tau}. \tag{10.50} \]
The probability per time unit $w_+(n)$ that a vehicle is added to a car cluster of size n is estimated based on the following physical model. The total number of cars is N. They are moving along a circular one-lane road of length L. If a road is crowded with cars, each car requires some minimum space or length which, obviously, is larger than the real length of a car. We call this the effective length $\ell$ of a car. The distance between the front bumpers of two neighboring cars, in general, is $\ell + \Delta x$. The distance $\Delta x$ can be understood as the headway between two 'effective' cars which, according to our definition, is always smaller than the real bumper-to-bumper distance. The maximum velocity of each car is $v_{\max}$. The desired (optimum) velocity $v_{\mathrm{opt}}$, depending on the distance between two cars $\Delta x$, is given by the formula
\[ v_{\mathrm{opt}}(\Delta x) = v_{\max}\, \frac{(\Delta x)^2}{D^2 + (\Delta x)^2}, \tag{10.51} \]
where the parameter D, called the interaction distance, is the headway at which the optimum velocity equals $v_{\max}/2$. According to the ansatz (10.51) the optimum velocity, see Figure 10.16, is represented by a sigmoidal function with values ranging from 0, corresponding to zero distance between cars, to $v_{\max}$, corresponding to an infinitely large distance or absence of interaction between cars. Our assumption is that a vehicle changes its velocity from $v_{\mathrm{opt}}(\Delta x_{\mathrm{free}})$ in free flow to $v_{\mathrm{opt}}(\Delta x_{\mathrm{clust}})$ in a jam and approaches the cluster as soon as the distance to the next car (the last car in the cluster) reduces from $\Delta x_{\mathrm{free}}$ to $\Delta x_{\mathrm{clust}}$. This assumption allows one to calculate the average number of cars joining the cluster per time unit, or the attachment frequency $w_+(n)$ to an existing car cluster. Thus, we have the ansatz, valid for $1 \le n \le N - 1$,
\[ w_+(n) = \frac{v_{\mathrm{opt}}(\Delta x_{\mathrm{free}}(n)) - v_{\mathrm{opt}}(\Delta x_{\mathrm{clust}})}{\Delta x_{\mathrm{free}}(n) - \Delta x_{\mathrm{clust}}}. \tag{10.52} \]
Figure 10.16 Analytical form of the optimal velocity function $v_{\mathrm{opt}}$ depending on headway $\Delta x$ (with $v_{\mathrm{opt}} = v_{\max}/2$ at $\Delta x = D$).

This equation (10.52) requires the knowledge of $\Delta x_{\mathrm{free}}$ and $\Delta x_{\mathrm{clust}}$ as a function of the cluster size n. Measurements on highways have shown that the density of cars in congested traffic is independent of the size of the dense congested phase (jam). As a consequence, the distance between jammed cars, the spacing $\Delta x_{\mathrm{clust}}$, has a constant value which has to be treated as a given measured quantity or known control parameter. We have defined the length of the car cluster or jam size depending on the number of congested cars n by
\[ L_{\mathrm{clust}} = \ell\, n + \Delta x_{\mathrm{clust}}\, S(n), \tag{10.53} \]
where
\[ S(n) = \begin{cases} 0, & n = 0 \\ n - 1, & n \ge 1 \end{cases} \tag{10.54} \]
is the number of spacings of size $\Delta x_{\mathrm{clust}}$. In this way, we have for the total length of road
\[ L = \underbrace{\ell\, n + \Delta x_{\mathrm{clust}}\, S(n)}_{L_{\mathrm{clust}}} + \underbrace{\ell\,(N - n) + \Delta x_{\mathrm{free}}\,(N - S(n))}_{L_{\mathrm{free}}}, \tag{10.55} \]
where
\[ L_{\mathrm{free}} = L - L_{\mathrm{clust}} = L - \{\ell\, n + \Delta x_{\mathrm{clust}}\, S(n)\} \tag{10.56} \]
denotes the length of the uncongested or free road. For $L_{\mathrm{free}}$ we can also write, according to (10.55),
\[ L_{\mathrm{free}} = \ell\,(N - n) + \Delta x_{\mathrm{free}}\,(N - S(n)). \tag{10.57} \]
Comparing these two equations, we obtain the distance in free flow, depending on cluster size,
\[ \Delta x_{\mathrm{free}}(n) = \frac{L - \ell N - \Delta x_{\mathrm{clust}}\, S(n)}{N - S(n)}. \tag{10.58} \]
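As a small worked illustration of (10.51)–(10.58), the following sketch evaluates the free headway and the attachment rate in dimensional form. The default parameter values ($\ell$ = 6 m, $v_{\max}$ = 34 m/s, D = 13 m, $\Delta x_{\mathrm{clust}}$ = 1 m) are the fitted values quoted in the caption of Figure 10.21; the function names are hypothetical.

```python
def spacings_in_cluster(n):
    """S(n) of (10.54): number of spacings inside a cluster of n cars."""
    return 0 if n == 0 else n - 1


def v_opt(dx, v_max=34.0, D=13.0):
    """Optimal velocity (10.51) in m/s for a headway dx in m."""
    return v_max * dx ** 2 / (D ** 2 + dx ** 2)


def dx_free(n, N, L, ell=6.0, dx_clust=1.0):
    """Headway in free flow (10.58) for cluster size n, N cars, road length L."""
    S = spacings_in_cluster(n)
    return (L - ell * N - dx_clust * S) / (N - S)


def w_plus(n, N, L, ell=6.0, dx_clust=1.0, v_max=34.0, D=13.0):
    """Attachment rate (10.52) in 1/s, intended for 1 <= n <= N - 1."""
    free = dx_free(n, N, L, ell, dx_clust)
    return (v_opt(free, v_max, D) - v_opt(dx_clust, v_max, D)) / (free - dx_clust)
```

For example, `dx_free(0, 200, 5000.0)` gives the homogeneous headway (5000 − 6·200)/200 = 19 m, and inserting `dx_free(n, N, L)` back into (10.55) recovers the road length L.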
By this, all the transition probabilities (10.52) are defined except the transition from the state without any cluster, n = 0, to the smallest cluster size n = 1. This transition and the meaning of the state with a single congested car (n = 1), called a precluster, require some explanation. Some stochastic event or perturbation of the free traffic flow, which is represented by n = 0, is necessary to initiate the formation of a cluster. These stochastic events are simulated assuming that one of the free cars can reduce its velocity to $v_{\mathrm{opt}}(\Delta x_{\mathrm{clust}})$ and thus become a single congested car or a cluster of size n = 1. This process is characterized by the transition frequency $w_+(0)$, which cannot be calculated from the ansatz (10.52) but has to be considered as one of the control parameters of the model. A cluster of size one appears also when a two-car cluster is reduced by one car. In this consideration, the vehicular cluster with size n = 1 is a car which still has not accelerated after this event. In any case, a precluster is defined as a single car moving with the velocity $v_{\mathrm{opt}}(\Delta x_{\mathrm{clust}})$. Since at n = 0 any of the N free cars has an opportunity to become a single congested car, an appropriate ansatz for the transition frequency $w_+(0)$ is
\[ w_+(0) = \frac{p}{\tau}\, N, \tag{10.59} \]
where p > 0 is a dimensionless constant called the stochastic perturbation parameter or stochasticity. In natural sciences, and especially in physics, it is customary to write all the basic equations in dimensionless variables. It is appropriate to introduce the dimensionless time T via $T = t/\tau$ and the dimensionless distances normalized to $\ell$, so $\Delta y = \Delta x/\ell$, $d = D/\ell$, $\Delta y_{\mathrm{clust}} = \Delta x_{\mathrm{clust}}/\ell$ and $\Delta y_{\mathrm{free}} = \Delta x_{\mathrm{free}}/\ell$, as well as the dimensionless optimum velocity $w_{\mathrm{opt}} = v_{\mathrm{opt}}/v_{\max}$. Then the basic equations of this section can be rewritten as follows.
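The master equation for the scaled probability distribution P(n, T) instead of p(n, t):
\[ \frac{1}{\tau}\frac{\partial}{\partial T}\, P(n,T) = w_+(n-1)\, P(n-1,T) + w_-(n+1)\, P(n+1,T) - \left[w_+(n) + w_-(n)\right] P(n,T); \tag{10.60} \]
the optimal velocity definition:
\[ w_{\mathrm{opt}}(\Delta y) = \frac{(\Delta y)^2}{d^2 + (\Delta y)^2}; \tag{10.61} \]
the transition frequencies:
\[ w_-(n) = w_- = \frac{1}{\tau}, \qquad 1 \le n \le N, \tag{10.62} \]
\[ w_+(0) = \frac{1}{\tau}\, p\, N, \tag{10.63} \]
\[ w_+(n) = \frac{v_{\max}}{\ell}\, \frac{\left[v_{\mathrm{opt}}(\Delta x_{\mathrm{free}}) - v_{\mathrm{opt}}(\Delta x_{\mathrm{clust}})\right]/v_{\max}}{\left[\Delta x_{\mathrm{free}} - \Delta x_{\mathrm{clust}}\right]/\ell} = \frac{1}{\tau}\, b\, \frac{w_{\mathrm{opt}}(\Delta y_{\mathrm{free}}(n)) - w_{\mathrm{opt}}(\Delta y_{\mathrm{clust}})}{\Delta y_{\mathrm{free}}(n) - \Delta y_{\mathrm{clust}}}, \qquad 1 \le n \le N - 1, \tag{10.64} \]
with dimensionless parameter
\[ b = v_{\max}\tau/\ell; \tag{10.65} \]
and the ansatz for the cluster length and related quantities:
\[ L_{\mathrm{clust}}\,\ell^{-1} = n + \Delta y_{\mathrm{clust}}\, S(n) = c_{\mathrm{clust}}^{-1}\, n, \tag{10.66} \]
\[ L_{\mathrm{free}}\,\ell^{-1} = N - n + \Delta y_{\mathrm{free}}\,(N - S(n)) = c_{\mathrm{free}}^{-1}\,(N - n), \tag{10.67} \]
\[ \Delta y_{\mathrm{free}}(n) = \frac{L/\ell - N - \Delta y_{\mathrm{clust}}\, S(n)}{N - S(n)}. \tag{10.68} \]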
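A stochastic trajectory of the one-step process defined by the rates (10.62)–(10.64) can be generated event by event. The sketch below is one possible realization using the standard Gillespie (kinetic Monte Carlo) algorithm, which the text does not prescribe. Time is measured in units of $\tau$, the default parameters are those quoted for Figure 10.18, and the function names are hypothetical.

```python
import random


def w_opt(dy, d):
    """Dimensionless optimal velocity (10.61)."""
    return dy * dy / (d * d + dy * dy)


def w_plus(n, N, L_over_ell, b, d, dy_clust, p):
    """Attachment rate in units of 1/tau, (10.63)-(10.64), with w_plus(N) = 0."""
    if n == 0:
        return p * N
    if n >= N:
        return 0.0
    S = n - 1
    dy_free = (L_over_ell - N - dy_clust * S) / (N - S)    # (10.68)
    return b * (w_opt(dy_free, d) - w_opt(dy_clust, d)) / (dy_free - dy_clust)


def gillespie(n0, N, T_end, L_over_ell=833.3, b=8.5, d=13 / 6,
              dy_clust=1 / 6, p=0.001, seed=0):
    """Return event times and cluster sizes of one trajectory up to T_end."""
    rng = random.Random(seed)
    T, n = 0.0, n0
    times, sizes = [0.0], [n0]
    while True:
        up = w_plus(n, N, L_over_ell, b, d, dy_clust, p)
        down = 1.0 if n >= 1 else 0.0          # w_minus = 1/tau, (10.62)
        total = up + down
        T += rng.expovariate(total)            # exponential waiting time
        if T >= T_end:
            return times, sizes
        n += 1 if rng.random() * total < up else -1
        times.append(T)
        sizes.append(n)


def time_average(times, sizes, T_start, T_end):
    """Time-weighted mean cluster size on the interval [T_start, T_end]."""
    acc = 0.0
    for k, n in enumerate(sizes):
        t0 = max(times[k], T_start)
        t1 = min(times[k + 1] if k + 1 < len(times) else T_end, T_end)
        if t1 > t0:
            acc += n * (t1 - t0)
    return acc / (T_end - T_start)
```

Starting from total congestion, `gillespie(200, 200, 2000.0)` relaxes toward a fluctuating cluster size whose long-time average can be compared with the stationary value discussed below.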
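According to the definitions, $c = \ell N/L = \ell\rho$ is the total density of cars, $c_{\mathrm{clust}} = \ell\, n/L_{\mathrm{clust}}$ and $c_{\mathrm{free}} = \ell\,(N - n)/L_{\mathrm{free}}$ are the densities in a jam and in the free flow, respectively. In the stochastic approach an equation can be obtained for the average cluster size $\langle n\rangle$. Based on the master equation (3.38), we obtain a deterministic equation for the mean value
\[ \frac{d}{dt}\langle n\rangle = \frac{d}{dt}\sum_n n\, p(n,t) = \langle w_+(n)\rangle - \langle w_-(n)\rangle, \tag{10.69} \]
which can be written in a certain approximation as follows
\[ \frac{d}{dt}\langle n\rangle \approx w_+(\langle n\rangle) - w_-(\langle n\rangle), \tag{10.70} \]
describing the time evolution of the average cluster size $\langle n\rangle$. The stationary cluster size $\langle n\rangle_{\mathrm{st}}$ can be calculated from the condition $d\langle n\rangle/dt = 0$ or
\[ \frac{w_+(\langle n\rangle)}{w_-(\langle n\rangle)} = b\, \frac{w_{\mathrm{opt}}(\Delta y_{\mathrm{free}}(\langle n\rangle)) - w_{\mathrm{opt}}(\Delta y_{\mathrm{clust}})}{\Delta y_{\mathrm{free}}(\langle n\rangle) - \Delta y_{\mathrm{clust}}} = 1 \tag{10.71} \]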
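consistent with the ansatz for the transition probabilities (10.62) and (10.64). By using the definition (10.61) of the optimal velocity function $w_{\mathrm{opt}}(\Delta y)$ we obtain the equation
\[ b \left[ \frac{(\Delta y_{\mathrm{free}}(\langle n\rangle))^2}{d^2 + (\Delta y_{\mathrm{free}}(\langle n\rangle))^2} - \frac{(\Delta y_{\mathrm{clust}})^2}{d^2 + (\Delta y_{\mathrm{clust}})^2} \right] = \Delta y_{\mathrm{free}}(\langle n\rangle) - \Delta y_{\mathrm{clust}} \tag{10.72} \]
which can be solved with respect to $\Delta y_{\mathrm{free}}$. One solution of the third-order equation (10.72) is trivial, $\Delta y_{\mathrm{free}} = \Delta y_{\mathrm{clust}}$. The other two solutions, which have a particular physical meaning, read
\[ \Delta y_{\mathrm{free}}^{(1,2)} = \frac{d}{2\left[d^2 + (\Delta y_{\mathrm{clust}})^2\right]} \left( b d \pm \sqrt{b^2 d^2 + 4 b\, \Delta y_{\mathrm{clust}} \left[d^2 + (\Delta y_{\mathrm{clust}})^2\right] - 4 \left[d^2 + (\Delta y_{\mathrm{clust}})^2\right]^2} \right). \tag{10.73} \]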
According to (10.73), the headway $\Delta y_{\mathrm{free}}$ between cars in a free flow coexisting with a single cluster has a constant value depending merely on the control parameters of the model. Now, by means of (10.68), which states the relation between $\Delta y_{\mathrm{free}}$ and n, we are able to calculate the stationary cluster size $\langle n\rangle_{\mathrm{st}}$. As already pointed out we set $S(n) = n - 1 + \delta_{n,0} \approx n$, assuming that the cluster (if it exists) contains a large number of cars. This leads to the equation
\[ \frac{\langle n\rangle_{\mathrm{st}}}{L/\ell} = \frac{c \left(1 + \Delta y_{\mathrm{free}}^{(1,2)}\right) - 1}{\Delta y_{\mathrm{free}}^{(1,2)} - \Delta y_{\mathrm{clust}}}, \tag{10.74} \]
where $c = \ell N/L$ is the total density of cars and the term on the left-hand side of the equation is, in fact, the relative part of the road crowded by cars. In this case $\Delta y_{\mathrm{free}}^{(1,2)}$ has a constant value given by (10.73). Result (10.74) makes sense at large enough densities where it provides a positive value of $\langle n\rangle_{\mathrm{st}}$. The solution with the largest value $\Delta y_{\mathrm{free}}^{(1)}$ (a positive sign in (10.73)) gives the average stationary size of a stable cluster depending on the total density c within the region $c > c_1$, where
\[ c_1 = \frac{1}{1 + \Delta y_{\mathrm{free}}^{(1)}} \tag{10.75} \]
is the critical density at which the spontaneous growth of a large car cluster starts. There is another critical density $c_2$, given by
\[ c_2 = \frac{1}{1 + \Delta y_{\mathrm{free}}^{(2)}}, \tag{10.76} \]
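which defines the region $c > c_2$ where an unstable car cluster corresponding to the solution (10.74) with the smallest value $\Delta y_{\mathrm{free}}^{(2)}$ of the headway can exist. In the special case of vanishing bumper-to-bumper distance in a jam, $\Delta y_{\mathrm{clust}} = 0$, our result (10.73) reduces to
\[ \Delta y_{\mathrm{free}}^{(1,2)} = \frac{b}{2} \pm \sqrt{\frac{b^2}{4} - d^2}, \tag{10.77} \]
and (10.74) to
\[ \frac{\langle n\rangle_{\mathrm{st}}}{L/\ell} = c + \frac{c - 1}{\dfrac{b}{2} \pm \sqrt{\dfrac{b^2}{4} - d^2}}. \tag{10.78} \]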
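The closed-form results (10.73)–(10.76) are easy to check numerically. The sketch below evaluates both nontrivial roots, verifies them against the stationarity condition (10.72) and computes the two critical densities; the function names are hypothetical.

```python
import math


def w_opt(dy, d):
    """Dimensionless optimal velocity (10.61)."""
    return dy * dy / (d * d + dy * dy)


def free_headways(b, d, dy_clust):
    """Both nontrivial roots of (10.72), given by (10.73); None if not real."""
    A = d * d + dy_clust ** 2
    disc = (b * d) ** 2 + 4.0 * b * dy_clust * A - 4.0 * A * A
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    prefactor = d / (2.0 * A)
    return prefactor * (b * d + root), prefactor * (b * d - root)


def critical_densities(b, d, dy_clust):
    """Critical densities c1 (10.75) and c2 (10.76)."""
    y1, y2 = free_headways(b, d, dy_clust)
    return 1.0 / (1.0 + y1), 1.0 / (1.0 + y2)


def cluster_fraction(c, dy_free, dy_clust):
    """Relative part of the road occupied by the cluster, (10.74)."""
    return (c * (1.0 + dy_free) - 1.0) / (dy_free - dy_clust)
```

For b = 8.5, d = 13/6 and $\Delta y_{\mathrm{clust}}$ = 1/6 this gives $c_1 \approx 0.11$, which corresponds to about 92 cars on a road with $L/\ell$ = 833.3.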
The value of $\ell\langle n\rangle_{\mathrm{st}}/L$ with sign + in (10.78) corresponds to the stable, while that with sign − corresponds to the unstable stationary cluster size. According to the deterministic equation (10.69), clusters of an undercritical (smaller than the unstable) size dissolve ($d\langle n\rangle/dt < 0$) whereas those of the overcritical size relax to the stable stationary cluster size. The growth of the car cluster starting with the undercritical cluster size is possible, too, but only due to the stochastic fluctuations. Such a process can be described within a stochastic approach only. The diagram shown in Figure 10.17 relates the stationary cluster size $\langle n\rangle_{\mathrm{st}}$ to the total density of cars c in the case of the aggregation in traffic flow at vanishing $\Delta y_{\mathrm{clust}}$. The two branches given by (10.78) (with + and −, respectively) are denoted by $n_1(c)$ and $n_2(c)$. Several trajectories showing the time evolution of $\langle n\rangle$ to one of the stable stationary values (thick lines) are indicated by arrows. The time evolution of the system to the stationary state, consistent with the master equation (10.60), is illustrated in Figure 10.18 by two typical stochastic trajectories.

Figure 10.17 The stationary cluster size $\langle n\rangle_{\mathrm{st}}$ normalized to $L/\ell$ (the relative part of the road crowded by cars) depending on the total density of cars c. The stable cluster size (branch $n_1(c)$ and horizontal lines) is shown by thick solid lines, whereas the unstable cluster size (branch $n_2(c)$ and a horizontal line) is shown by dot-dashed lines. Arrows indicate the time evolution of $\langle n\rangle$. Parameters: b = 8.5, d = 13/6, and $\Delta y_{\mathrm{clust}} = 0$.

Figure 10.18 Two stochastic trajectories showing the time evolution of the cluster size n(t), plotted versus $t/\tau$, starting from a total congestion with n(0) = N = 200 (dotted line) and from a free flow with n(0) = 0 (solid line). The stationary mean value $\langle n\rangle_{\mathrm{st}} = 125$ cars is indicated by a horizontal solid line. Parameters of the system are $L/\ell = 833.3$ (L = 5000 m, $\ell$ = 6 m), b = 8.5, d = 13/6, $\Delta y_{\mathrm{clust}} = 1/6$, and the stochasticity p = 0.001.

The stationary probability distribution (compare (3.24))
\[ P(n) = \lim_{T\to\infty} P(n,T) \tag{10.79} \]
representing the long-time behavior of the master equation (10.60) can be found easily. This is exactly the general stationary solution for one-step processes in closed finite systems given by solution (3.46). In Figure 10.19 we have shown the stationary solution P(n) depending on the total number of cars on a road of given length. The maximum of the probability distribution corresponds to the stable cluster size of congested cars.
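For a one-step process in a closed finite system the stationary distribution follows from detailed balance, $P(n)\, w_+(n) = P(n+1)\, w_-(n+1)$, so that P(n) is proportional to the product of the ratios $w_+(k-1)/w_-(k)$ for k = 1, ..., n. The sketch below evaluates this product for the rates (10.62)–(10.64); it is assumed here, not checked against Section 3, that this is the form referred to as (3.46). Logarithms are accumulated to avoid overflow for large N, and the function names are hypothetical.

```python
import math


def w_opt(dy, d):
    """Dimensionless optimal velocity (10.61)."""
    return dy * dy / (d * d + dy * dy)


def w_plus(n, N, L_over_ell, b, d, dy_clust, p):
    """Attachment rate in units of 1/tau, (10.63)-(10.64), for 0 <= n <= N - 1."""
    if n == 0:
        return p * N
    S = n - 1
    dy_free = (L_over_ell - N - dy_clust * S) / (N - S)    # (10.68)
    return b * (w_opt(dy_free, d) - w_opt(dy_clust, d)) / (dy_free - dy_clust)


def stationary_distribution(N, L_over_ell=833.3, b=8.5, d=13 / 6,
                            dy_clust=1 / 6, p=0.001):
    """Stationary P(n), n = 0..N, from detailed balance with w_minus = 1/tau."""
    log_weights = [0.0]
    for n in range(1, N + 1):
        ratio = w_plus(n - 1, N, L_over_ell, b, d, dy_clust, p)  # divided by w_minus = 1
        log_weights.append(log_weights[-1] + math.log(ratio))
    shift = max(log_weights)                   # rescale before exponentiating
    weights = [math.exp(x - shift) for x in log_weights]
    total = sum(weights)
    return [w / total for w in weights]
```

With the parameters of Figure 10.19, `stationary_distribution(50)` is peaked at n = 0, while `stationary_distribution(200)` has its maximum at a large cluster size, in line with the behavior described in the text.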
Figure 10.19 Series of different stationary probability distributions P(n) (solid lines, plotted as $P(n)/P_{\max}$ versus n) and ratios of transition rates $w_+(n)/w_-(n)$ (dashed lines) showing the formation of a jam of size n depending on the total number of cars N on the road. The values of N and $P_{\max}$ (maximum of P(n)) are (a) N = 50, $P_{\max}$ = 0.905; (b) N = 92, $P_{\max}$ = 0.451; (c) N = 110, $P_{\max}$ = 0.050; (d) N = 200, $P_{\max}$ = 0.042; (e) N = 663, $P_{\max}$ = 0.125; (f) N = 664, $P_{\max}$ = 0.220. The parameters of the system are $L/\ell = 833.3$ (L = 5000 m, $\ell$ = 6 m), b = 8.5, d = 13/6, $\Delta y_{\mathrm{clust}} = 1/6$, and p = 0.001. The maximum of the probability distribution corresponds to the stable cluster size of the congested cars.
One of the most important characteristics of traffic flow is the fundamental diagram showing the flux J of vehicles on the road as a function of the total car density $\rho = N/L$ (or dimensionless total density $c = \ell\rho$). We define J as a local flux $\rho(x,t)\,v(x,t)$ averaged over an infinite time interval, where $\rho(x,t)$ is the local density and $v(x,t)$ is the local velocity of cars at a time t and space coordinate x, so
\[ J = \frac{\Delta N}{\Delta t} = \lim_{t\to\infty} \frac{1}{t} \int_0^t \rho(x,t')\, v(x,t')\, dt'. \tag{10.80} \]
In dimensionless variables this equation reduces to
\[ j = \frac{\Delta N}{\Delta T} = \tau\, \frac{\Delta N}{\Delta t} = b \lim_{T\to\infty} \frac{1}{T} \int_0^T c(x,T')\, u(x,T')\, dT', \tag{10.81} \]
where $j = J\tau$ is the dimensionless flux, $u(x,T)$ is the velocity normalized to $v_{\max}$, and $b = v_{\max}\tau/\ell$ is the already defined dimensionless parameter (10.65). In our model the local velocity and the density of cars are defined by the cluster size n and the distance $x - x'$ between the considered local coordinate x and the coordinate $x'$ of the first car in the jam. Thus, we have $\rho(x,t) = \rho(x - x'(t), n(t))$ and $v(x,t) = v(x - x'(t), n(t))$, and after averaging over time we get
\[ J = \sum_n \int dx'\, P(n,x')\, \rho(x - x', n)\, v(x - x', n), \tag{10.82} \]
where $P(n,x')\,dx'$ denotes the part of the total time during which the size of the cluster is n and the coordinate of the first car of the jam is between $x'$ and $x' + dx'$. The cluster can be found with equal probability at any coordinate $x'$ along the circular road if an average over an infinite time interval t is considered. Thus we have $P(n,x') = P(n)/L$. According to our assumptions, the velocity of congested cars is $v_{\mathrm{opt}}(\Delta x_{\mathrm{clust}})$ and their density is $\rho_{\mathrm{clust}} = n/L_{\mathrm{clust}} = c_{\mathrm{clust}}/\ell$ inside the jam of length $L_{\mathrm{clust}}$. Outside the cluster we have $v = v_{\mathrm{opt}}(\Delta x_{\mathrm{free}}(n))$ and $\rho_{\mathrm{free}} = (N - n)/L_{\mathrm{free}} = c_{\mathrm{free}}/\ell$. By these assumptions the integration (10.82) can be performed easily, and this yields for the dimensionless flux
\[ j = b \sum_n P(n) \left[ \frac{L_{\mathrm{clust}}}{L}\, w_{\mathrm{opt}}(\Delta y_{\mathrm{clust}})\, c_{\mathrm{clust}} + \frac{L_{\mathrm{free}}}{L}\, w_{\mathrm{opt}}(\Delta y_{\mathrm{free}}(n))\, c_{\mathrm{free}} \right]. \tag{10.83} \]
After substituting the definitions of the densities $c_{\mathrm{clust}}$ (10.66) and $c_{\mathrm{free}}$ (10.67) into (10.83) we easily obtain
\[ j = b \sum_n P(n) \left[ w_{\mathrm{opt}}(\Delta y_{\mathrm{clust}})\, \frac{\ell\, n}{L} + w_{\mathrm{opt}}(\Delta y_{\mathrm{free}}(n)) \left( c - \frac{\ell\, n}{L} \right) \right]. \tag{10.84} \]
This equation is suitable for calculation of the flux-density fundamental diagram according to the known stationary probability distribution P(n). Now we consider the behavior of the system in the (thermodynamic) limit $N \to \infty$ under the condition
\[ \sigma = (Rd)^2 + 4R\,\Delta y_{\mathrm{clust}} - 4 > 0, \tag{10.85} \]
where $R = b/(d^2 + (\Delta y_{\mathrm{clust}})^2)$. This is the condition at which the equation $w_+(n)/w_-(n) = 1$ has real physical solution(s) and a cluster with $n/N \neq 0$ emerges, that is, a phase transition takes place at some value of the car density c. In the opposite case there is no phase transition (cluster formation) at all. In the special situation $\Delta y_{\mathrm{clust}} = 0$, condition (10.85) reduces to b > 2d. The analysis of solution (3.46) shows that $P(z) = N^{-1}\delta(z - z_0)$ holds in the thermodynamic limit $N \to \infty$, where z is defined as z = n/N with the value $z_0$ corresponding to the absolute maximum of P(z). $z_0 = 0$ holds if $c \le c_1$ or $c > c_2$. If $c_1 \le c < c_2$, then $z_0 = \tilde z_0$, where $\tilde z_0$ is defined by $\tilde z_0 = (1 + \Delta y_{\mathrm{free}} - 1/c)/(\Delta y_{\mathrm{free}} - \Delta y_{\mathrm{clust}})$ and $\Delta y_{\mathrm{free}}$ has the constant value $(d/2)(Rd + \sqrt{\sigma})$, as follows from the equation $w_+(n)/w_-(n) = 1$. The critical densities $c_1$ and $c_2$ are defined by $\tilde z_0 = 0$, or $c_1 = 1/(1 + (d/2)(Rd + \sqrt{\sigma}))$, and by $\ln(P(z = 0)) = \ln(P(z = \tilde z_0))$, respectively. Their physical meaning is the following. At $c = c_1$ the free traffic flow becomes unstable and a large cluster of cars emerges spontaneously in the thermodynamic limit $N \to \infty$. In a finite system, as seen in Figure 10.19, this situation corresponds to N about 92 (see (b)). At $c = c_2$ the probability distribution P(n) has two competing maxima (with the same value of ln P(n) at $N \to \infty$). In Figure 10.19 this corresponds approximately to N = 663 and N = 664. In the following an equation will be derived (under the assumption $\Delta y_{\mathrm{clust}} = 0$) from which both critical densities $c_1$ and $c_2$ can be determined. Taking account of (3.44), the condition $\ln(P(z = 0)) = \ln(P(z = \tilde z_0))$ in the thermodynamic limit reduces to
\[ \int_0^{\tilde z_0} \ln[Q(z)]\, dz = 0, \tag{10.86} \]
where $Q(z) = w_+(n = zN)/w_-(n = zN)$. This equation is satisfied both at $c = c_2$ and $c = c_1$ because in the latter case we have $\tilde z_0 = 0$. Using partial integration (accounting for $Q(\tilde z_0) = 1$) and changing the integration variable to $h = \Delta y_{\mathrm{free}}$ (defined by (10.68)), we get
\[ \int_{sd}^{Bd} \left( 1 - \frac{sd}{h} \right) \left( \frac{2h}{d^2 + h^2} - \frac{1}{h} \right) dh = 0, \tag{10.87} \]
where $s = (1 - c)/(cd)$ and $B = b/(2d) + \sqrt{b^2/(4d^2) - 1}$. This integral can be calculated analytically, and this yields
\[ \ln\left[ \frac{B(1 + s^2)}{s(1 + B^2)} \right] + \frac{s}{B} - 1 + 2s\,(\arctan B - \arctan s) = 0. \tag{10.88} \]
One of the solutions is obviously $s_1 = B$, corresponding to the first critical value $c_1 = 1/(1 + Bd)$. A complete analytical solution is possible in some asymptotic cases. At $B = 1 + \varepsilon$, where $\varepsilon \to 0$, the solution can be found in the form $s = 1 + \delta$, where $\delta \to 0$. Neglecting terms of fourth and higher orders we get $(\delta - \varepsilon)^2(\delta + 2\varepsilon) = 0$. Thus, we have $s_1 = 1 + \varepsilon = B$ and $s_2 \simeq 1 - 2\varepsilon$. At the critical point $\varepsilon = 0$ or b = 2d (at b > 2d the cluster emerges) we get $s_1 = s_2 = 1$ or $c_1 = c_2 = c_{\mathrm{crit}}$, where $c_{\mathrm{crit}} = 1/(d + 1)$ is the critical value of c. Another asymptotic case is $B \to \infty$ where
we have a solution with $s_2 \to 0$. Retaining only the main terms in the equation we get $\ln(s_2 B) + 1 = 0$ or $s_2 = 1/(eB)$. It should be noted that only cases with $c \le c_{\mathrm{clust}}$ have a physical meaning, because the total density c cannot exceed the density of cars in the cluster $c_{\mathrm{clust}} = 1/(1 + \Delta y_{\mathrm{clust}})$. In general (with $\Delta y_{\mathrm{clust}} > 0$), a situation is possible where the equation for $c_2$ has no solution at $c_2 < c_{\mathrm{clust}}$. In this case the following flux equations for an infinite system are correct, formally setting $c_2 = c_{\mathrm{clust}}$. Thus, taking into account the above discussed solution for P(n), we get the following flux-density relation
\[ j(c) = \begin{cases} \dfrac{b\,c\,(1 - c)^2}{(cd)^2 + (1 - c)^2}, & c \in [0; c_1] \,\cup\, ]c_2; c_{\mathrm{clust}}] \\[2ex] 1 - c + c\left(b\, w_{\mathrm{opt}}(\Delta y_{\mathrm{clust}}) - \Delta y_{\mathrm{clust}}\right), & c \in [c_1; c_2] \end{cases} \tag{10.89} \]
These equations represent an exact analytical solution for the fundamental diagram of traffic flow in the framework of our relatively simple model, calculated in the thermodynamic limit. Since the fundamental diagram represents one of the most important characteristics of traffic flow, this result has a fundamental significance and has to be compared with vehicular experiments. As can be seen from these equations, the fundamental diagram consists of fragments of a nonlinear curve and of a straight line. The nonlinear curve represented by the first formula of (10.89) corresponds to homogeneous flow, whereas the straight line corresponds to nonhomogeneous (or congested) flow. The fundamental diagram calculated for the special case $\Delta y_{\mathrm{clust}} = 0$ is shown in Figure 10.20. In this diagram three different regimes of the traffic flow can be distinguished: free flow as a gaseous phase at small densities ($c < c_1$), heavy traffic as a liquid phase at large densities ($c > c_2$), and a transition regime with free and congested vehicles at intermediate densities ($c_1 < c < c_2$). In Figure 10.21 we show a comparison of our fundamental diagram to experimental traffic data [101]. In this case we have used realistic values of the control parameters found from experimental measurements.

Figure 10.20 Based on the stationary solution of the stochastic master equation the fundamental diagram (dimensionless flow rate (flux) j vs. dimensionless car density c) is calculated. The dimensionless control parameters are b = 10, d = 7/3, and $\Delta y_{\mathrm{clust}} = 0$. The length of road L varies (L = 1000 m, 5000 m, 25000 m, and infinity), the effective length of a car being fixed, $\ell$ = 6 m. For finite roads (L < ∞) and for infinitely long roads (L → ∞) the flow j can be divided into homogeneous regimes (left: free flow as a gaseous phase, right: heavy traffic as a liquid phase) and a transition regime with free and congested vehicles (formation of a car cluster).

Figure 10.21 Comparison of the fundamental diagram of traffic flow (j versus c; equivalently flux in cars/h versus density in cars/km) calculated at fitted parameter values $\ell$ = 6 m, $v_{\max}$ = 34 m/s, $\tau$ = 1.5 s, D = 13 m, and $\Delta x_{\mathrm{clust}}$ = 1 m (b = 8.5, d = 13/6, and $\Delta y_{\mathrm{clust}} = 1/6$) with experimental data of Ref. 101 (denoted by separate points and thin solid line connecting measured points). The thick solid line shows the solution for the finite road of length L = 5000 m; the theoretical curves for the infinite system represented by the first and the second formula in Eq. (10.89) are shown by smooth thin solid line and dashed line, respectively.
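For the special case $\Delta y_{\mathrm{clust}} = 0$ the thermodynamic-limit fundamental diagram (10.89) can be evaluated once the nontrivial root $s_2$ of (10.88) is known. The sketch below finds $s_2$ by bisection on the interval (0, 1), on which the left-hand side of (10.88) changes sign once for B > 1. The bisection and the function names are choices made for this illustration, not a method taken from the text.

```python
import math


def big_b(b, d):
    """Parameter B defined below (10.87); requires b > 2d."""
    return b / (2.0 * d) + math.sqrt(b * b / (4.0 * d * d) - 1.0)


def f_1088(s, B):
    """Left-hand side of (10.88)."""
    return (math.log(B * (1.0 + s * s) / (s * (1.0 + B * B)))
            + s / B - 1.0 + 2.0 * s * (math.atan(B) - math.atan(s)))


def solve_s2(B, n_iter=200):
    """Root s2 < 1 of (10.88) by bisection; f > 0 near s = 0 and f < 0 at s = 1."""
    lo, hi = 1e-12, 1.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if f_1088(mid, B) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_densities(b, d):
    """c1 = 1/(1 + B d) and c2 = 1/(1 + s2 d) for vanishing cluster spacing."""
    B = big_b(b, d)
    return 1.0 / (1.0 + B * d), 1.0 / (1.0 + solve_s2(B) * d)


def flux_homogeneous(c, b, d):
    """First formula of (10.89): homogeneous flow."""
    return b * c * (1.0 - c) ** 2 / ((c * d) ** 2 + (1.0 - c) ** 2)


def flux(c, b, d):
    """Flux-density relation (10.89) for vanishing cluster spacing."""
    c1, c2 = critical_densities(b, d)
    if c1 <= c <= c2:
        return 1.0 - c              # second formula with dy_clust = 0
    return flux_homogeneous(c, b, d)
```

Two checks follow from the text: for $B = 1 + \varepsilon$ the root approaches $1 - 2\varepsilon$, and at $c = c_1$ the two formulas of (10.89) coincide, so the flux is continuous there.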
10.4 Metastability Near Phase Transitions in Traffic Flow
In the previous section we have focused our attention on the stationary characteristics of traffic flow, such as the stationary probability distribution p(n) over cluster (jam) sizes n and the fundamental diagram averaged over an infinite time interval. Here we discuss the time behavior of the probability distribution function P(t), introduced in Section 3.2, for our one-lane traffic model with the transition frequencies (10.62)–(10.64). We are particularly interested in the relaxation dynamics near the critical densities c1 and c2 . By analogy with physical systems of many interacting particles, the relaxation behavior near criticality can be very slow (the critical slowing down), so that the long-time behavior of P(t) is important.
The general time-dependent solution of the master equation is given by (3.37). We need merely the eigenvalues $\lambda_i$ for our specific case. In fact, the problem reduces to finding the eigenvalues of the transition matrix W (cf. (3.26) and (3.27)), which in our case is a tridiagonal matrix. Below we will describe the method of solution of this problem. Consider the determinant Det(m) of a tridiagonal matrix of size m comprised of elements $a_{i,j}$. It is assumed that $a_{i,j}$ are given constants independent of the matrix size m. It follows from the structure of such a matrix that
\[ \mathrm{Det}(m) = a_{m,m}\, \mathrm{Det}(m-1) - a_{m-1,m}\, a_{m,m-1}\, \mathrm{Det}(m-2). \tag{10.90} \]
The initial condition for this recurrence relation is Det(0) = 1 and Det(1) = $a_{1,1}$. The actual problem is to calculate the determinant of a matrix of size N + 1 (N is the total number of cars, 0 ≤ n ≤ N) with diagonal elements $a_{i,i} = -w_+(i-1) - w_-(i-1) - \lambda$ (where $w_-(0) = w_+(N) = 0$, $1 \le i \le N + 1$) and nondiagonal elements $a_{i-1,i} = w_-(i-1)$ and $a_{i,i-1} = w_+(i-2)$ (where $2 \le i \le N + 1$), and to find the values of $\lambda$ (eigenvalues) at which Det(N + 1) = 0. It follows from the specific properties of the matrix W, discussed in Section 3.2, that one eigenvalue is $\lambda_0 = 0$ and all other eigenvalues $0 > \lambda_1 > \lambda_2 > \ldots > \lambda_N$ are negative. They can be calculated from the equation of Nth order
\[ f(\lambda) = \sum_{n=0}^{N} B_n^{(N)} \lambda^n = 0, \tag{10.91} \]
where the coefficients $B_n^{(N)}$ are found based on (10.90). Det(m) can be represented as
\[ \mathrm{Det}(m) = \sum_{n=0}^{m} A_n^{(m)} \lambda^n, \tag{10.92} \]
where the coefficients $A_n^{(m)}$ with $0 \le n \le m$ satisfy the recurrence relation
\[ A_n^{(m)} = -\left[w_+(m-1) + w_-(m-1)\right] A_n^{(m-1)} - w_+(m-2)\, w_-(m-1)\, A_n^{(m-2)} - A_{n-1}^{(m-1)}. \tag{10.93} \]
We can subsequently calculate all the coefficients $A_n^{(m)}$ from (10.93) starting with m = 1 and finishing with m = N + 1. In this case we formally set $A_0^{(0)} = 1$ and $A_n^{(m)} = 0$ if n > m. $A_0^{(N+1)} = 0$ holds since one eigenvalue is zero. According to this, $B_n^{(N)} \equiv A_{n+1}^{(N+1)}$. All N roots of (10.91) are real and negative. This means that the function $f(\lambda)$ has exactly N − 1 extremum points (solutions of the equation $df/d\lambda = 0$ of the (N − 1)th order) located between the zeros of this function. Thus, $f(\lambda)$ is monotonic within $\lambda_1 < \lambda < 0$, which allows us to find the eigenvalue $\lambda_1$ by solving (10.91) numerically (e.g., by the Newton linearization method) with the initial approximation $\lambda = 0$. After this we rewrite (10.91) in the form
\[ (\lambda - \lambda_1) \sum_{n=0}^{N-1} B_n^{(N-1)} \lambda^n = 0 \tag{10.94} \]
based on the recurrence relation
\[ B_{n-1}^{(N-1)} = B_n^{(N)} + \lambda_1 B_n^{(N-1)}, \tag{10.95} \]
where $B_N^{(N-1)} = 0$ and n subsequently takes the values N, N − 1, . . . , 1. Then we find the eigenvalue $\lambda_2$ from the equation
\[ \sum_{n=0}^{N-1} B_n^{(N-1)} \lambda^n = 0 \tag{10.96} \]
with the initial approximation $\lambda = \lambda_1$. This procedure can be continued to find all the eigenvalues by subsequent reduction of the equation order. In exact arithmetic this method allows us to find all eigenvalues for an arbitrarily large matrix. However, due to numerical inaccuracy, in practice this can be done only for quite a small matrix (e.g. N ≤ 20). Nevertheless, the eigenvalues which are the smallest in magnitude can be calculated for large enough N, which allows us to investigate the long-time behavior of the system. In particular, it follows from (3.31) that the probability distribution P(t) tends to the equilibrium distribution $P_{\mathrm{eq}}$ as
\[ P(t) - P_{\mathrm{eq}} \simeq c_1\, u_1\, e^{\lambda_1 t} \tag{10.97} \]
at $t \to \infty$, where $\lambda_1$ is the eigenvalue closest to zero and $\mathbf{u}_1$ is the corresponding eigenvector of the transition matrix $W$. In other words, $-1/\lambda_1$ is the relaxation time. Results of the calculation for $\lambda_1$ (solid lines) and also for $\lambda_2$ (dotted lines) depending on the density of cars $c = N\ell/L$ for different lengths of the road $L$ are shown in Figure 10.22. We have used the same set of control parameters, $b = 8.5$, $d = 13/6$, $\Delta y_{clust} = 1/6$, and $p = 0.001$, as in our previous calculation in Figure 10.19.

[Figure 10.22: plot of ln(−λ) versus c; the densities c1, c2, cjump, and cclust are marked.]

Figure 10.22 Representation of the eigenvalues as ln(−λ1) (solid lines) and ln(−λ2) (dotted lines) vs density of cars c at different sizes of the system L/ℓ. From the top to the bottom (if looking at c = c1 and also at c = cjump) the sizes are L/ℓ = 51.46, 101.08, 151.03, 201.09, 1500, and 4000. The first four sizes are chosen such that the minima of solid curves at c ≈ cjump correspond to N = 40, 80, 120, and 160. Parameters of the system are b = 8.5, d = 13/6, ∆yclust = 1/6, p = 0.001. The time is assumed to be dimensionless, so w−(n) = 1.

It is evident from Figure 10.22 that $-\lambda_1$ has a sharp minimum at $c \approx c_{jump}$, and the density at the minimum tends to $c_{jump}$ with increasing $L$. In this case we have denoted by $c_{jump}$ the second critical value $c_2$ of the density in (10.89), which corresponds to a jump in the maximum of the stationary probability distribution $p(n) \equiv p_{eq}(n)$ (at $N$ between 663 and 664 in Figure 10.19) as well as in the flux $j$ (Figure 10.20). As distinct from (10.89), here we denote by $c_2$ the minimal density at which the stationary probability distribution $p(n)$ has two maxima. In general, two maxima exist at $c_2 < c < c_{clust}$, where $c_{clust}$ is the car density inside a cluster. The value of $c_{jump}$, about 0.796, corresponds to the jump-like first-order phase transition, as is evident from Figure 10.19. The value of $-\lambda_1$ in Figure 10.22 decreases exponentially with increasing system size (linearly in the logarithmic scale used in Figure 10.22) at densities $c_2 < c < c_{clust}$. This result has a simple physical interpretation. At these densities the equilibrium probability distribution $p_{eq}(n)$ has two maxima, one of which corresponds to the stable state and the other to the metastable state of the system. So, $-1/\lambda_1$ represents the time in which the system finds the stable state if one starts from the metastable state. At $c = c_{jump}$ both states have the same probability, therefore the relaxation is the slowest. In general, the increase in the relaxation time is exponential because of the necessity to overcome a ‘potential barrier’ when switching from one state to another. The ‘height of the potential barrier’ is proportional to the number of unstable intermediate states (proportional to $N$) which need to be overcome.
It is easy to construct the analytical asymptotic solution at $t \to \infty$ in the large system size limit $L \to \infty$ in the case when the time-dependent probability distribution $p(n, t)$ meets the initial condition $p(n, 0) \approx p_{me}(n)$, where $p_{me}(n)$ is the probability distribution of the quasi-equilibrium or metastable state. The system exhibits relaxation to the metastable state within a time interval of about $-1/\lambda_2$. This time interval does not diverge exponentially at $L \to \infty$, as can be seen from Figure 10.22 (dotted lines). Thus, for $-1/\lambda_2 \ll t \ll -1/\lambda_1$ we have $p(n, t) \simeq p_{me}(n)$, which means that the term $c_1 \mathbf{u}_1$ in (10.97) is (approximately) $\mathbf{P}_{me} - \mathbf{P}_{eq}$. In this way, we obtain

$$p(n, t) \simeq p_{eq}(n) + \left[ p_{me}(n) - p_{eq}(n) \right] e^{\lambda_1 t} \tag{10.98}$$

at $t \gg -1/\lambda_2$. This is an asymptotically exact equation in the limit $L \to \infty$ and $t|\lambda_2| \to \infty$ for the values of $n$ around both maxima of $p_{eq}(n)$ (except the irrelevant intermediate states where $p_{me}(n)$ is not precisely defined but, in any case, is vanishingly small). In this case $p_{me}(n)$ can be calculated in the same way as $p_{eq}(n)$, but neglecting those states which practically cannot be reached at $t|\lambda_2| \to \infty$ and $t|\lambda_1| \to 0$. The correct result is always ensured by taking into account the states either with $n < n_{unst}$ (at $c_2 < c < c_{jump}$) or with $n > n_{unst}$ (at $c_{jump} < c < c_{clust}$) to include the maximum of $p(n, 0)$. Here $n_{unst}$ is the unstable cluster size corresponding to the minimum of $p_{eq}(n)$.

It is also interesting to investigate the relaxation behavior of the system near the first phase transition point $c = c_1$, where the spontaneous formation of a large car cluster starts when increasing the density. Some singularity of the relaxation time is expected at $c = c_1$ in the limit $L \to \infty$. Our calculations at the larger system sizes $L/\ell = 1500$ and $L/\ell = 4000$ indicate the existence of a breakpoint in the $\ln(-\lambda_1)$ vs $c$ plot at $c = c_1$ in the limit $L \to \infty$. In this case $-\lambda_1$ has an almost constant value at densities somewhat above $c_1$. This value behaves as $\sim 1/L$ (within the estimation accuracy allowed by our calculations at finite $L$), which means that the relaxation time increases proportionally with the system size $L$. The fact that $-\lambda_1 \sim 1/L$ can be understood as follows. The relaxation time at the densities actually considered is proportional to the time of stable (stationary) cluster formation, but the latter is proportional to $L$ at a given $c$. Both the mean velocity of cluster growth and the stable cluster size are proportional to $c - c_1$; therefore, the mean time of the cluster formation is practically independent of $c - c_1$, which explains the fact that $-\lambda_1$ as a function of $c$ is almost constant.
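The eigenvalue procedure of this section, that is, the recurrence (10.93) for the characteristic polynomial, a Newton iteration started at $\lambda = 0$, and the order reduction (10.95), can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the rates `W_PLUS` and `W_MINUS` are hypothetical toy values (not the rates of the traffic model), the term with $A_n^{(m-2)}$ is simply dropped for $m = 1$, and the function names are ours. The result is compared with a standard dense eigenvalue solver.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def char_poly_coeffs(w_plus, w_minus):
    """Return A_n^{(N+1)}, n = 0..N+1, built with recurrence (10.93)."""
    size = len(w_plus)                 # number of states, N + 1
    prev2 = np.zeros(size + 1)         # A^{(m-2)}; not used while m = 1
    prev1 = np.zeros(size + 1)
    prev1[0] = 1.0                     # formal start value A_0^{(0)} = 1
    for m in range(1, size + 1):
        k = m - 1                      # state index of the newly added row
        cur = -(w_plus[k] + w_minus[k]) * prev1
        cur[1:] -= prev1[:-1]          # the -A_{n-1}^{(m-1)} term
        if m >= 2:
            cur -= w_plus[k - 1] * w_minus[k] * prev2
        prev2, prev1 = prev1, cur
    return prev1


def newton_root(coeffs, x0, tol=1e-14, max_iter=200):
    """Newton iteration for a polynomial given by ascending coefficients."""
    deriv = P.polyder(coeffs)
    x = x0
    for _ in range(max_iter):
        step = P.polyval(x, coeffs) / P.polyval(x, deriv)
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x


def deflate(coeffs, root):
    """Divide the polynomial by (lambda - root) using recurrence (10.95)."""
    n_max = len(coeffs) - 1
    reduced = np.zeros(n_max + 1)      # reduced[n_max] = 0 by convention
    for n in range(n_max, 0, -1):
        reduced[n - 1] = coeffs[n] + root * reduced[n]
    return reduced[:-1]


def two_slowest_eigenvalues(w_plus, w_minus):
    a = char_poly_coeffs(w_plus, w_minus)
    b = a[1:]                          # B_n^{(N)} = A_{n+1}^{(N+1)}
    lam1 = newton_root(b, 0.0)         # start at lambda = 0
    lam2 = newton_root(deflate(b, lam1), lam1)
    return lam1, lam2


def generator_matrix(w_plus, w_minus):
    """Tridiagonal transition matrix W of the one-step process."""
    size = len(w_plus)
    w = np.zeros((size, size))
    for n in range(size):
        w[n, n] = -(w_plus[n] + w_minus[n])
        if n + 1 < size:
            w[n + 1, n] = w_plus[n]
            w[n, n + 1] = w_minus[n + 1]
    return w


# Hypothetical toy rates for states n = 0..5 (w_-(0) = 0 and w_+(N) = 0).
W_PLUS = [1.0, 0.9, 0.8, 0.7, 0.6, 0.0]
W_MINUS = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0]

lam1, lam2 = two_slowest_eigenvalues(W_PLUS, W_MINUS)
reference = np.sort(np.linalg.eigvals(generator_matrix(W_PLUS, W_MINUS)).real)[::-1]
print(lam1, lam2, reference[:3])
```

For a matrix this small the two approaches should agree closely. For large $N$ the coefficients span many orders of magnitude, which is one way to see why only the eigenvalues closest to zero remain reliable in practice.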
10.5 Car Cluster Formation as First-Order Phase Transition
Here we propose an essential innovation in our traffic flow model: now the detachment frequency depends on the cluster size. By analogy with physical systems like droplets in supersaturated vapor, one can expect that smaller clusters dissolve more easily, so $w_-(n)$ increases considerably when the cluster size becomes smaller than some characteristic value $n_0$. In traffic flow, this idea is based on the concept of higher irregularity of the cluster structure when the size is small. The increase in $w_-(n)$ might also be related to some multi-lane effect; due to several possible overtaking maneuvers, clusters consisting of a few cars are particularly unstable. In some approximation our one-lane model effectively describes a multilane freeway. In this case $n$ and $N$ are the number of congested cars and the total number of cars per lane. The above-mentioned multi-lane effect is simulated by choosing an appropriate ansatz for $w_-(n)$. Our specific choice is to replace (10.62) by

$$w_-(n) = \frac{1}{\tau} \left[ 1 + \beta \left( \frac{n_0}{n + n_0} \right)^{s} \right] : \quad n \ge 1, \tag{10.99}$$

where $\beta$ and $s$ are positive constants. Assuming $n_0 \gg 1$, the parameter $\beta \approx (w_-(1) - w_-(\infty))/w_-(\infty)$ shows the relative increase in $w_-(n)$ when the cluster size $n$ is reduced from large values to 1. The parameter $s$ is responsible for the speed at which $w_-(n)$ converges to its asymptotic value $w_-(\infty) = 1/\tau$ at $n \to \infty$. The specific form of $w_-(n)$ is chosen somewhat arbitrarily to get a simple, but still realistic, description.

The advanced traffic flow model introduced in this section exhibits features similar to those observed in supersaturated vapor, as discussed in Chapter 9. The most essential distinguishing feature of the new model is the existence of metastability near the phase transition from the free flow to the cluster phase. The latter, therefore, can be interpreted as a jump-like, that is, first-order phase transition where the system goes over from one state to another by overcoming a potential barrier.
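As a small illustration, the ansatz (10.99) can be evaluated directly. The sketch below assumes the form $w_-(n) = \tau^{-1}[1 + \beta (n_0/(n+n_0))^s]$ and uses the parameter values quoted later in this section ($\beta = 0.8$, $s = 1$, $n_0 = 10$) as defaults; the function name `w_minus` is ours.

```python
def w_minus(n, tau=1.0, beta=0.8, s=1.0, n0=10.0):
    """Detachment frequency of ansatz (10.99), valid for cluster sizes n >= 1."""
    if n < 1:
        raise ValueError("(10.99) is defined for n >= 1 only")
    return (1.0 + beta * (n0 / (n + n0)) ** s) / tau


def relative_increase(tau=1.0, beta=0.8, s=1.0, n0=10.0):
    """(w_-(1) - w_-(inf)) / w_-(inf), which approaches beta for n0 >> 1."""
    w_inf = 1.0 / tau
    return (w_minus(1, tau, beta, s, n0) - w_inf) / w_inf


print(relative_increase(n0=10.0))    # noticeably below beta, since n0 is only 10
print(relative_increase(n0=1000.0))  # close to beta
```

The comparison of the two printed values shows how good the approximation $\beta \approx (w_-(1) - w_-(\infty))/w_-(\infty)$ is for a given $n_0$.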
[Figure 10.23: panel (a) shows Φ(n) versus n, panel (b) shows ln T versus b; the curves at c = 0.145 and c = 0.1945 are labeled.]

Figure 10.23 The potential of the circular one-lane-road model depending on the car cluster size n (a), and the logarithm of the mean first-passage time from the free-flow state n = 0 to the congested state with cluster of a certain size n = b + 1 (b). The control parameters of the model are b = 8.5, d = 13/6, ∆yclust = 1/6, β = 0.8, s = 1, and n0 = 10 at the total dimensionless length of the road L/ℓ = 2000. Curves from the top to the bottom correspond to increasing densities c = 0.13, 0.145, 0.154, 0.17, 0.1945, and 0.3. The dotted curves at c = 0.145 and c = 0.1945 indicate the range of bistability with the double-well potential shown in the upper picture by thick solid lines.
The potential (3.48) with the specific attachment frequencies (10.63), (10.64) and the detachment frequencies (10.99) of our advanced model is shown in Figure 10.23(a) for different total densities of cars at realistic values of the dimensionless control parameters b = 8.5, d = 13/6, and ∆yclust = 1/6 estimated in [147]. The other parameters have been chosen as β = 0.8, L/ℓ = 2000, s = 1, and n0 = 10. In a certain range of concentrations from c = 0.145 to c = 0.1945, indicated by dotted lines, the potential has two minima (one at n = 0, another at n > 1) separated by a smooth maximum. This is the so-called double-well potential. The stable state of a large system corresponds to the absolute minimum of the potential. However, the system needs considerable time to go over from one minimum to another by climbing over the potential barrier. Thus, the switching from a free-flow state with n = 0 to the congested state occurs due to rare fluctuations of large amplitude, which means that the system exhibits metastability and a first-order phase transition.
This behavior is completely analogous to nucleation in supersaturated vapor (see Chapter 9), where liquid droplets of overcritical size grow, the critical size being reached due to stochastic fluctuations that carry the system over the free energy barrier.

An essential parameter characterizing the switching process is the mean first-passage time T (3.74) from the free-flow state with n = 0 to the jammed states beyond the maximum of the potential Φ(n). Its logarithm ln T, depending on the cluster size b + 1 which has to be reached for the first time, is shown in Figure 10.23(b) at the same overall car densities c as the potential depicted in Figure 10.23(a) [99]. The time is dimensionless, measured in units of τ. The thick solid lines at c = 0.154 (upper line) and c = 0.17 (lower line) correspond to the bistable situation with the double-well potential. Assuming τ = 1.5 s, in agreement with the estimation in [147], we can evaluate from Figure 10.23 the mean time during which the system overcomes the potential barrier starting from the free-flow state n = 0. In our example the maximum of the potential is located at n = 14 at the density c = 0.154 and at n = 7 at the density c = 0.17. The corresponding mean first-passage times are about 6 min and 80 s, respectively. However, the system still needs some time to reach the stable cluster size, n = 79 at c = 0.154 and n = 124 at c = 0.17, which coincides with the minimum of the potential. In our example, the corresponding mean times are 40 min and 25 min. The local maximum of the potential corresponds to the unstable or critical cluster size ncrit, since smaller clusters with n < ncrit tend, on average, to shrink, while larger clusters with n > ncrit tend to grow up to the stable cluster size nstable. The mean first-passage time increases only slightly within b ∈ [ncrit; nstable]. This corresponds to the middle part of the thick solid lines in Figure 10.23(b).
In principle, the car cluster can also somewhat exceed the stable size due to stochastic fluctuations. However, the mean first-passage time increases dramatically in this case. It corresponds to the sharply increasing right-hand part of the curves in Figure 10.23(b). The sharp increase in the mean first-passage time is observed for all concentrations shown in Figure 10.23 at large enough values of b, which can be reached only by moving against the driving force −∂Φ(n)/∂n of the increasing potential. At small densities (the upper curve at c = 0.13) the formation of a remarkably large car cluster is highly improbable, therefore the mean first-passage time increases sharply starting from b = 0. At intermediate densities, represented by the lower curve at c = 0.3, the formation of a large cluster proceeds without overcoming a potential barrier. There is only a relatively small delay at the beginning of the process, where the driving force −∂Φ(n)/∂n is smaller. Note that the upper cut-off of the ln T vs b curves corresponds to the time 4.85 × 10^8 τ, which is about 23 years. This means that the states with cluster sizes which are considerably larger than the stable stationary cluster size (which is zero at small densities and nonzero at larger densities) are practically never reached. One must mention that the bistability exists also at large densities of cars, that is, at c > 0.5 in our example. In this case the system can switch from a dense homogeneous state to a heterogeneous cluster state by overcoming a potential barrier which is much higher than in the actual range of concentrations c ∈ [0.145; 0.1945] indicated in Figure 10.23.
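The mean first-passage time discussed above can be computed for any one-step process by a simple recursion. The sketch below does not reproduce (3.74); it uses the standard relation $t_k = [1 + w_-(k)\, t_{k-1}]/w_+(k)$ for the mean time $t_k$ needed to go from size $k$ to size $k+1$ for the first time, with a reflecting boundary at $n = 0$, which we assume is equivalent. The rate functions passed in are placeholders supplied by the caller, and the helper `dimensionless_time_to_years` is only there to check the unit conversion quoted in the text.

```python
def mean_first_passage_time(w_plus, w_minus, b):
    """Mean time to reach size b + 1 for the first time, starting from n = 0.

    w_plus(k) and w_minus(k) are the attachment and detachment rates of a
    cluster of size k; w_minus(0) is never evaluated (reflecting boundary).
    """
    total = 0.0
    step_time = 0.0                    # t_{k-1}: mean time for k-1 -> k
    for k in range(b + 1):
        back = w_minus(k) * step_time if k > 0 else 0.0
        step_time = (1.0 + back) / w_plus(k)   # t_k: mean time for k -> k+1
        total += step_time
    return total


def dimensionless_time_to_years(t, tau_seconds=1.5):
    """Convert a time measured in units of tau to years."""
    seconds_per_year = 365.25 * 24 * 3600
    return t * tau_seconds / seconds_per_year


# Sanity checks with hypothetical constant rates (not the traffic-model rates):
print(mean_first_passage_time(lambda k: 2.0, lambda k: 0.0, 4))  # pure growth
print(mean_first_passage_time(lambda k: 1.0, lambda k: 1.0, 2))  # unbiased walk
print(dimensionless_time_to_years(4.85e8))                       # roughly 23
```

With the model rates (10.63), (10.64), and (10.99) supplied as `w_plus` and `w_minus`, the same recursion would give curves of the kind shown in Figure 10.23(b); because the time grows exponentially against the driving force, working with ln T is advisable for large b.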
In our traffic-flow model the inverse time constant $1/\tau$ plays the role of temperature. The detachment frequency $w_-(n)$ (10.99) diverges as $1/\tau$ at $\tau \to 0$, whereas the attachment frequency (10.64) remains finite in this case. This means that, for all vehicle densities, the car cluster tends to dissolve at $\tau \to 0$ or at high temperatures. In other words, the formation of a stable car cluster as well as bistability is possible only at undercritical values of the temperature $1/\tau$. In the simplest case of $\beta = 0$ the critical temperature corresponds to a vanishing parameter $\sigma$ in (10.85), consistent with the fact that the balance condition $w_+(n)/w_-(n) = 1$ for the average number of attached and detached cars can be fulfilled only at $\sigma > 0$.

To show further analogy with a supersaturated vapor, we consider the mean cluster size as described by the deterministic equation (10.70). The stationary equation in our advanced model becomes

$$b\, \frac{w_{opt}(\Delta y_{free}(\langle n \rangle)) - w_{opt}(\Delta y_{clust})}{\Delta y_{free}(\langle n \rangle) - \Delta y_{clust}} = 1 + \beta \left( \frac{n_0}{\langle n \rangle + n_0} \right)^{s} \tag{10.100}$$

consistent with the ansatz for transition probabilities (10.64) and (10.99). The stationary mean cluster sizes $\langle n \rangle_{st}$ correspond, in general, to different stationary solutions of this equation. Also the stable free-flow state with $\langle n \rangle_{st} = 0$ has to be identified as a state with $d\langle n \rangle/dT < 0$ at $\langle n \rangle \to 0$. The corresponding dimensionless fluxes

$$j = J\tau = b \left[ w_{opt}(\Delta y_{clust})\, \frac{\langle n \rangle_{st}\, \ell}{L} + w_{opt}(\Delta y_{free}(\langle n \rangle)) \left( c - \frac{\langle n \rangle_{st}\, \ell}{L} \right) \right] \tag{10.101}$$

can be calculated as well, where $J$ is the flux averaged over the whole road or, which is the same, over an infinitely long time interval at a fixed coordinate. In this case we have neglected the stochastic fluctuations, taking $n \equiv \langle n \rangle_{st}$. The equation for the stationary cluster size on the road, (10.100), has been solved together with (10.61) and (10.68) in the approximation $S(n) \approx n$,

$$\Delta y_{free}(n) = \frac{(L/\ell)\,(1 - c) - n\, \Delta y_{clust}}{c\, (L/\ell) - n} , \tag{10.102}$$
at the same realistic values of the dimensionless control parameters b = 8.5, d = 13/6, ∆yclust = 1/6 as we used in Figure 10.23. Three different values of the parameter β, namely β = 0, 0.8, and 1.6, have been considered. The other parameters have been chosen as L/ℓ = 500, s = 1, and n0 = 10. The results are shown in Figures 10.24 and 10.25.

[Figure 10.24: panel (a) shows the stationary cluster size ⟨n⟩st versus c with the branches n1(c) and n2(c), panel (b) shows the flux j versus c; the densities c1, c2, and cclust are marked.]

Figure 10.24 Stationary cluster size (a) and flux (b) vs concentration calculated for a finite road at dimensionless control parameters β = 0, b = 8.5, d = 13/6, ∆yclust = 1/6, L/ℓ = 500. The thick and the thin solid lines correspond to stable free and congested states, respectively. Dashed lines represent an unstable free flow and the dot-dashed line shows the unstable (critical) cluster size.

In Figure 10.24 we have illustrated the behavior of the stationary cluster size ⟨n⟩st (a) and the corresponding flux (b) in the special case β = 0, where our advanced model reduces to the previous one discussed in Section 10.3. As a result, the stationary cluster size behaves in a similar way as in Figure 10.17, with the only essential difference that ∆yclust > 0. In this case the free flow with ⟨n⟩st = 0 is stable up to some critical density c = c1. Then a car cluster appears which grows linearly in size with increasing density c from c1 to the maximum value cclust = 1/(1 + ∆yclust) (the density in the cluster), as shown in Figure 10.24 by a thin solid line. This corresponds to one of the stationary solutions n = n1(c) of (10.100). At densities c2 < c < cclust another positive solution n = n2(c) exists (dot-dashed line), which corresponds to an unstable or critical cluster size. According to (10.70), the cluster tends to grow (dn/dt > 0) if n > n2(c), and it tends to dissolve (dn/dt < 0) if n < n2(c). The most probable time evolution of n, given by the deterministic equation (10.70), depending on the density c and initial conditions, is indicated in Figure 10.24 by arrows. As in the supersaturated vapor, the growth of a cluster starting from undercritical size n < n2(c) and finishing with the stable cluster size n = n1(c) is also possible, but only due to stochastic fluctuations.
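The stationary cluster sizes can be obtained numerically by scanning (10.100), with the free headway taken from (10.102), for sign changes in n. The sketch below is only illustrative and rests on an assumption that is not confirmed by the text visible here: the dimensionless optimal-velocity function is taken as $w_{opt}(\Delta y) = \Delta y^2/(d^2 + \Delta y^2)$, the dimensionless counterpart of the function appearing in (10.106), since (10.61) is not reproduced in this section. All function and constant names are ours.

```python
D_PARAM = 13.0 / 6.0      # d
DY_CLUST = 1.0 / 6.0      # headway inside the cluster
B_PARAM = 8.5             # b


def w_opt(dy):
    # ASSUMPTION: dimensionless optimal-velocity function (stand-in for (10.61)).
    return dy * dy / (D_PARAM ** 2 + dy * dy)


def dy_free(n, c, size):
    """Free-flow headway (10.102); size is the dimensionless road length."""
    return (size * (1.0 - c) - n * DY_CLUST) / (c * size - n)


def residual(n, c, size, beta=0.0, s=1.0, n0=10.0):
    """Left-hand side minus right-hand side of (10.100)."""
    dy = dy_free(n, c, size)
    lhs = B_PARAM * (w_opt(dy) - w_opt(DY_CLUST)) / (dy - DY_CLUST)
    return lhs - (1.0 + beta * (n0 / (n + n0)) ** s)


def stationary_sizes(c, size, beta=0.0, s=1.0, n0=10.0, steps=2000):
    """Roots of (10.100) in 0 <= n < N, found by grid scan plus bisection.

    A root lying exactly on a grid point would be missed by this simple scan.
    """
    n_cars = c * size
    grid = [n_cars * k / steps for k in range(steps)]
    roots = []
    for lo, hi in zip(grid[:-1], grid[1:]):
        f_lo = residual(lo, c, size, beta, s, n0)
        f_hi = residual(hi, c, size, beta, s, n0)
        if f_lo * f_hi >= 0.0:
            continue
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            f_mid = residual(mid, c, size, beta, s, n0)
            if f_lo * f_mid <= 0.0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        roots.append(0.5 * (lo + hi))
    return roots


print(stationary_sizes(0.5, 500.0))              # beta = 0
print(stationary_sizes(0.5, 500.0, beta=0.8))    # beta > 0
```

Under the stated assumption, the case β = 0 at c = 0.5 yields a single positive root, which would correspond to the branch n1(c); the stability of each root still has to be decided from the sign of the right-hand side of (10.70) on either side of it.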
[Figure 10.25: for each of the rows (a) and (b), the left panel shows the stationary cluster size ⟨n⟩st versus c with the branches n1(c) and n2(c) and the value n*crit, and the right panel shows the flux j versus c; the densities c1 and cclust are marked.]

Figure 10.25 Stationary cluster size (left) and flux (right) vs concentration calculated for a finite road at β = 0.8 (a) and β = 1.6 (b). Other dimensionless control parameters are n0 = 10, s = 1, b = 8.5, d = 13/6, ∆yclust = 1/6, L/ℓ = 500. The meaning of solid, dashed, and dot-dashed lines is the same as in Figure 10.24.
In Figure 10.25 we have shown the same diagrams at β > 0. A distinguishing feature is that a nonzero critical cluster size already appears at c = c1, where a phase transition takes place with a jump-like increase in the stable cluster size ⟨n⟩st from 0 to n*crit. This is a first-order phase transition, a behavior observed in physical systems like supersaturated vapor. At small values of β, however, a density region with vanishing critical cluster size still exists (β = 0.8, Figure 10.25(a)), which disappears at larger β values (β = 1.6, Figure 10.25(b)). Note also that at β = 0.8 the free flow becomes unstable somewhat above c1 (see the j(c) plots in Figure 10.25), while both free and congested states of the system are stable at β = 1.6. The bifurcation diagram at β = 1.6 is qualitatively similar to that of the supersaturated vapor in Figure 9.4. At positive β, a jump-like decrease in the flux j is observed at the critical point c = c1 when switching from the free flow (thick solid line) to the congested flow (thin solid line). This, however, is a finite-size effect which disappears in the limit N → ∞, since n*crit/N → 0.
10.6 Thermodynamics of Traffic Flow
An extension of thermodynamic concepts from equilibrium to nonequilibrium or driven systems is one of the fundamental problems in physics. It refers also to so-called nonphysical systems, such as traffic or granular flow, economics, biological systems, etc., where the laws of microscopic interaction and motion differ from those known in physics. Different approaches have been developed previously. In the geometrical formulation of thermodynamics [62], the latter is regarded as a theory arising in the analysis of dynamics. In this concept, equilibrium thermodynamics is represented by a manifold of time-independent equilibrium states, whereas the thermodynamics of a driven system is represented by a manifold of slowly evolving states. A k-component system undergoing chemical reaction is considered as an example in [62]. A more widely discussed approach is based on the introduction of entropy [188, 218] and usage of the entropy maximization principle in various applications, e.g. linear dissipative driven systems [218] and single-lane traffic [188]. An appropriate definition of temperature is a relevant question when we consider a nonphysical system. In [188] the temperature T and pressure p of traffic flow have been introduced via derivatives of certain thermodynamic functions, and it has been found that T is negative at typical velocities. In another approach [118] similarities between traffic and granular flow have been discussed, proposing two effective temperatures: one characterizing fast or single-car dynamics, and another characterizing slow or collective dynamics of traffic flow. As mentioned in [188], entropy need not occupy a position of primacy in a general theory beyond the classical equilibrium thermodynamics.
We have found that, in cases where the stationary state of a driven system has the property of detailed balance in the space of a suitable stochastic variable, the thermodynamic potential can be easily introduced based on this property in complete analogy with equilibrium systems. This approach can prove to be useful in many applications due to its relative simplicity. As an example we consider the formation of a car cluster in one-lane traffic and show its analogy with the phase separation in a supersaturated vapor–liquid system.

The aggregation of particles out of an initially homogeneous situation is well known in physics, as well as in other branches of natural sciences and engineering. The formation of bound states as an aggregation process is related to self-organization phenomena [203, 210, 229]. The formation of car clusters (jams) at overcritical densities in traffic flow is an analogous phenomenon in the sense that cars can be considered as interacting particles [107, 108, 184]. The development of traffic jams in vehicular flow is an everyday example of the occurrence of nucleation and aggregation in a system of many point-like cars. For previous work focusing on the description of jam formation as a nucleation process, see [112, 147, 149, 151]. This is related to phase separation and metastability in low-dimensional driven systems, a topic which has attracted much recent interest [43, 89, 97, 166, 209]. Metastability and hysteresis effects have been observed in real traffic, see, e.g., [31, 77, 100, 133, 239, 240] for a discussion of empirical data and the various different modeling approaches.

Here we focus on the application of thermodynamics to a many-particle system such as traffic flow. In a first step we do not consider real traffic with its very complicated behavior but instead limit our investigations to simple models of a directional one-lane vehicular flow. We hope this will trigger further development in the description of more realistic situations for multi-lane traffic as well as synchronized flow [100]. We have found a certain analogy with physical systems like supersaturated vapor-liquid, although there are also essential differences, since the traffic flow is a driven system. We would like to outline some basic ideas and concepts.

1. On a microscopic level, traffic flow can be described by the optimal velocity model (OVM). In this case the equations of motion can be written as Newton’s law with accelerating and decelerating forces, and one can define the potential energy V and the kinetic energy T of the car system, as well as the total energy E = T + V. The latter has a thermodynamic interpretation as E = U, where U is the internal energy of the system.

2. Traffic flow is a dissipative system of driven or active particles. It means that the total energy is not conserved, but we have an energy balance equation

$$\frac{dE}{dt} + \Phi = 0$$

with the energy flux Φ following from the equations of motion and consisting of dissipation (due to friction) and energy input (due to the burning of petrol).

3. In the long-time limit the many-car system tends to a certain stationary state. In the microscopic description it is either the fixed point or the limit cycle in the phase space of velocities and headways, depending on the overall car density and control parameters. The stationary state is characterized by a certain internal energy.

4. On a mesoscopic level, traffic flow can be described by a stochastic master equation, where the stochastic variable is the number of congested cars n, that is, the size of the car cluster. In this case the fixed-point solution corresponds to n = 0, and the limit cycle corresponds to the coexistence of a car cluster with n > 0 and the free-flow phase.

5. In the space of the cluster size, the detailed balance holds for the stationary solution just as in equilibrium physical systems. It allows one to describe various properties of the stationary state by equilibrium thermodynamics. In particular, we calculate the free energy of the system and the chemical potentials of coexisting phases in complete analogy with the known treatment for a supersaturated liquid-gas system.

6. As distinct from equilibrium systems, the chemical potential of the cluster phase of traffic flow is not an internal property of this phase, since it depends on an outer parameter, the density of cars in the free-flow phase. It allows one to distinguish between the traffic flow as a driven system and purely equilibrium systems.
Traffic flow can be viewed as a random dynamical system [6, 7] of active or intelligent particles [38, 41, 211]. To describe it on a microscopic level, here we use the optimal velocity (OV) model for point-like cars, moving on a one-lane road with periodic boundary conditions, defined by (10.13)–(10.15). Equation (10.13) can be written as

$$m \frac{dv_i}{dt} = F_{acc}(v_i) + F_{dec}(\Delta x_i) , \tag{10.103}$$

where

$$F_{acc}(v_i) = \frac{m}{\tau} \left( v_{max} - v_i \right) \ge 0 , \tag{10.104}$$

$$F_{dec}(\Delta x_i) = \frac{m}{\tau} \left( v_{opt}(\Delta x_i) - v_{max} \right) \le 0 \tag{10.105}$$

are the accelerating and decelerating forces, respectively. The coordinate-dependent force term is due to the interaction between cars,

$$F_{dec}(\Delta x) = v_{max} \frac{m}{\tau} \left[ \frac{(\Delta x)^2}{D^2 + (\Delta x)^2} - 1 \right] , \tag{10.106}$$

and is always negative, starting at $F_{dec}(\Delta x = 0) = -v_{max}\, m/\tau$ and approaching zero at infinite distances. The potential energy of the car system can be defined as $V = \sum_{i=1}^{N} \phi(\Delta x_i)$, where $\phi(\Delta x_i)$ is the interaction potential of the $i$th car with the car ahead, which is given by

$$F_{dec}(\Delta x_i) = -\frac{\partial \phi(x_{i+1} - x_i)}{\partial x_i} = \frac{d\phi(\Delta x_i)}{d\Delta x_i} . \tag{10.107}$$

By integrating this equation we get

$$\phi(\Delta x) = v_{max} \frac{D m}{\tau} \left[ \frac{\pi}{2} - \arctan\left( \frac{\Delta x}{D} \right) \right] , \tag{10.108}$$

where the integration constant is chosen such that $\phi(\infty) = 0$. The potential energy depending on the headway is shown in Figure 10.26. Note that $F_{dec}(\Delta x_i)$ in this case is not given by $-\partial V/\partial x_i$, since the latter quantity includes an additional term $-\partial \phi(x_i - x_{i-1})/\partial x_i$. This term is absent in our definition of the force because the car behind does not influence the motion of the actual $i$th vehicle. It reflects the fact that, unlike in physical systems, Newton’s third law does not hold here. The total time derivative of the potential energy is

$$\frac{dV}{dt} = \sum_{i=1}^{N} \left[ \frac{\partial \phi(\Delta x_i)}{\partial x_i} \frac{dx_i}{dt} + \frac{\partial \phi(\Delta x_i)}{\partial x_{i+1}} \frac{dx_{i+1}}{dt} \right] = \sum_{i=1}^{N} (v_{i+1} - v_i)\, F_{dec}(\Delta x_i) . \tag{10.109}$$
[Figure 10.26: plot of φ versus ∆x.]

Figure 10.26 Graph of the potential energy function φ(∆x).

The total time derivative of the kinetic energy $T = \sum_{i=1}^{N} m v_i^2/2$ is obtained by multiplying both sides of (10.103) by $v_i$ and summing over $i$. It leads to the following energy balance equation

$$\frac{dE}{dt} + \Phi = 0 \tag{10.110}$$

for the total energy $E = T + V$ of the car system, where

$$\Phi = -\sum_{i=1}^{N} \left[ v_i F_{acc}(v_i) + v_{i+1} F_{dec}(\Delta x_i) \right] \tag{10.111}$$

is the energy flux. It includes both energy dissipation due to friction and energy input from the engine. Equation (10.110) shows that, as distinct from closed mechanical systems, the total energy is not conserved in traffic flow. Nevertheless, it approaches a constant value in the long-time limit, where the system converges to one of two possible stationary states: either to the fixed point $\Delta x_i = \Delta x_{hom}$, $v_i = v_{opt}(\Delta x_{hom})$ (where $\Delta x_{hom} = L/N$ is the distance between $N$ homogeneously distributed cars over the road of length $L$), or to the limit cycle in the phase space of headways and velocities. Both situations are illustrated in Figure 10.27. At a small enough density of cars there is a stable fixed point (solid circle), which lies on the optimal velocity curve (dotted line). An unstable fixed point (empty circle) exists at larger densities. In the latter case any small perturbation of the initially homogeneous fixed-point situation leads to the limit cycle (solid line) in the long-time limit.

In the thermodynamic interpretation the mean energy $E$ is the internal energy $U$ of the system. The latter thus has a certain value in any one of the stationary states. The temporal behavior of $E$ for the same sets of parameters as in Figure 10.27 is shown in Figure 10.28. In the case of the convergence to the limit cycle (solid line) for $\rho = 0.0606\ \mathrm{m}^{-1}$, one can distinguish six plateaus in the energy curve. The first one represents the short-time behavior when starting from an almost homogeneous initial condition with zero velocities, and the second plateau is the unstable fixed-point situation. Further on, four car clusters have been formed in the actual simulation, and this temporal situation is represented by the third, relatively small, plateau. The next three plateaus with three, two and, finally, one car cluster reflect the coarse graining or Ostwald ripening process. The dashed line shows the convergence to the stable fixed-point value at $\rho = 0.0303\ \mathrm{m}^{-1}$.
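The energy balance (10.110) with the flux (10.111) can be checked numerically for an arbitrary state of the car system. The following sketch uses the parameters quoted in the caption of Figure 10.27 and assumes the optimal velocity function $v_{opt}(\Delta x) = v_{max}(\Delta x)^2/(D^2 + (\Delta x)^2)$ implied by (10.105) and (10.106). The test state (slightly perturbed equidistant cars) and all function names are our own illustrative choices; the time derivative of E is estimated by a central finite difference along the flow of (10.103).

```python
import numpy as np

# Parameters from the caption of Figure 10.27
N, D, V_MAX, TAU, M = 60, 33.0, 20.0, 1.5, 1000.0
L = N / 0.0606                       # road length for rho = 0.0606 1/m


def v_opt(dx):
    return V_MAX * dx**2 / (D**2 + dx**2)


def f_acc(v):
    return M / TAU * (V_MAX - v)                 # (10.104)


def f_dec(dx):
    return M / TAU * (v_opt(dx) - V_MAX)         # (10.105)


def phi(dx):
    return V_MAX * D * M / TAU * (np.pi / 2 - np.arctan(dx / D))   # (10.108)


def headways(x):
    """Distance to the car ahead on a ring of length L."""
    return (np.roll(x, -1) - x) % L


def total_energy(x, v):
    return 0.5 * M * np.sum(v**2) + np.sum(phi(headways(x)))


def energy_flux(x, v):
    """Energy flux Phi of (10.111)."""
    dx = headways(x)
    return -np.sum(v * f_acc(v) + np.roll(v, -1) * f_dec(dx))


def energy_rate(x, v, h=1e-5):
    """Central-difference estimate of dE/dt along the flow of (10.103)."""
    a = (f_acc(v) + f_dec(headways(x))) / M
    e_plus = total_energy(x + h * v, v + h * a)
    e_minus = total_energy(x - h * v, v - h * a)
    return (e_plus - e_minus) / (2 * h)


# Illustrative state: equidistant cars with a small smooth perturbation.
i = np.arange(N)
x0 = i * L / N + 2.0 * np.sin(2 * np.pi * i / N)
v0 = 10.0 + 5.0 * np.cos(2 * np.pi * i / N)

print(energy_rate(x0, v0), -energy_flux(x0, v0))   # the two numbers should agree
```

The agreement holds for any state without overtaking, not only for stationary ones, because (10.110) is an identity following from the equations of motion.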
[Figure 10.27: plot of the velocity v versus the headway ∆x.]

Figure 10.27 Fixed points (circles) and the limit cycle (solid line) in the space of headways ∆x and velocities v of cars. The solid circle represents the stable fixed point at the car density ρ = N/L = 0.0303 m^−1. The empty circle is the unstable fixed point at a larger density ρ = 0.0606 m^−1, where the long-time trajectory for any car is the limit cycle shown. The fixed points lie on the optimal velocity curve (dotted line) given by (10.15). The parameters are chosen as N = 60, D = 33 m, vmax = 20 m s^−1, τ = 1.5 s, and m = 1000 kg.
[Figure 10.28: plot of E versus t, with a logarithmic time axis from 10^−1 to 10^6.]

Figure 10.28 The total energy E of the car system, measured in units of m v_max^2/2, depending on the time t given in seconds. The same sets of parameters have been used as in Figure 10.27. The upper solid line corresponds to a larger density ρ = 0.0606 m^−1 where the limit cycle forms, whereas the lower dashed line corresponds to a smaller density ρ = 0.0303 m^−1 where the convergence to a stable fixed point is observed.
Apart from the internal energy, other thermodynamic functions can be introduced as well. In the following we will calculate the free energy F of the traffic flow. By using the known relation F = U − T ∗ S we can also calculate the entropy S of traffic flow for a properly defined ‘temperature’ T ∗ . Up to now we have considered purely deterministic equations of motion. Randomness can be included, e.g. by adding a multiplicative noise term to (10.103).
This leads to stochastic differential equations (cf. (10.45), (10.46))

$$m\, dv_i(t) = \left( F_{acc}(v_i) + F_{dec}(\Delta x_i) \right) dt + \sigma\, v_i\, dW_i(t) , \tag{10.112}$$

$$dx_i(t) = v_i\, dt . \tag{10.113}$$
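A simple way to integrate (10.112) and (10.113) numerically is the Euler–Maruyama scheme, sketched below. This is an illustration under stated assumptions, not the scheme used by the authors: the optimal velocity function is taken as $v_{opt}(\Delta x) = v_{max}(\Delta x)^2/(D^2 + (\Delta x)^2)$, the noise is interpreted in the Itô sense, and the step size, noise amplitude, and function names are arbitrary choices of ours. Note also that the discretized scheme keeps the velocities positive only for sufficiently small time steps, whereas the statement about positiveness in the text refers to the continuous-time equation.

```python
import numpy as np

V_MAX, D, TAU, MASS = 20.0, 33.0, 1.5, 1000.0


def optimal_velocity(dx):
    return V_MAX * dx**2 / (D**2 + dx**2)


def simulate(x, v, length, steps, dt, sigma, rng):
    """Euler-Maruyama integration of (10.112), (10.113) on a ring road."""
    x = np.array(x, dtype=float)
    v = np.array(v, dtype=float)
    for _ in range(steps):
        dx = (np.roll(x, -1) - x) % length                 # headway to the car ahead
        force = MASS / TAU * (optimal_velocity(dx) - v)    # F_acc + F_dec
        dw = rng.normal(0.0, np.sqrt(dt), size=v.shape)    # Wiener increments
        x = (x + v * dt) % length
        v = v + force / MASS * dt + sigma / MASS * v * dw  # multiplicative noise
    return x, v


# Homogeneous start at the stable density rho = 0.0303 1/m (N = 60 cars).
n_cars = 60
road = n_cars / 0.0303
x0 = np.arange(n_cars) * road / n_cars
v0 = np.full(n_cars, optimal_velocity(road / n_cars))

x_det, v_det = simulate(x0, v0, road, 500, 0.01, 0.0, np.random.default_rng(1))
x_sto, v_sto = simulate(x0, v0, road, 500, 0.01, 50.0, np.random.default_rng(1))
print(v_det[:3], v_sto[:3])
```

With σ = 0 the homogeneous state stays at the fixed point, while a nonzero σ produces velocity fluctuations around it.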
A similar equation with an additive noise term has been studied in [79, 108]. An advantage of the version with multiplicative noise is that it guarantees the positivity of the velocities $v_i$. In the deterministic model the departure (leaving a cluster) times are strongly correlated in such a way that, in the stationary regime, one car leaves the cluster after each time interval of a given length $\tau_1$. The arrival (adding to a cluster) times are also strongly correlated due to the repulsive forces. The noise makes these correlations weaker. It allows one to apply the formalism of stochastic Markov processes in order to describe approximately the fluctuations of the cluster size, as discussed later.

It is easier to study the formation of car congestion on a mesoscopic level, as has been done in [123, 147–151, 153], where we do not follow each individual car, but only look at the number of congested cars $n$, that is, the size of the car cluster. In this description it is also very easy to introduce randomness by considering $n$ as a stochastic variable.

Some properties of our stochastic traffic flow model, given by the master equation (10.49), can be described by equilibrium thermodynamics in analogy to the liquid-vapor system, in spite of the fact that the traffic flow is a driven, that is, a nonequilibrium system. Here we consider the simplest version of the model, assuming that cars are point-like. In this case the transition rates (10.50) and (10.52) reduce to
\[ w_+(n) = \frac{v_{\mathrm{opt}}(\Delta x_{\mathrm{free}})}{\Delta x_{\mathrm{free}}}, \tag{10.114} \]
\[ w_-(n) = \frac{1}{\tau}, \tag{10.115} \]
where $\tau$ is a reaction time constant and $\Delta x_{\mathrm{free}}(n) = L/(N - n)$ is the mean headway distance in the free-flow phase. As we have discussed earlier, no large stable cluster forms at low car densities, whereas a macroscopic fraction of the cars is condensed (jammed) into the cluster above a certain critical density. The first situation corresponds to the fixed-point solution of the OVM (optimal velocity model, see Section 10.1), whereas the second corresponds to the limit cycle. The stationary solution $p^{\mathrm{st}}(n) = \lim_{t\to\infty} p(n, t)$ obeys the detailed balance condition $p^{\mathrm{st}}(n)\, w_+(n) = p^{\mathrm{st}}(n + 1)\, w_-(n + 1)$ (cf. (3.49)). This is a very remarkable property of the present model, which allows one to make a connection to thermodynamics. In equilibrium, detailed balance holds in a physical system such as the supersaturated vapor-liquid system discussed in Chapter 9. The free energy and chemical potentials can be derived from this principle in a physical system, as well as in the present traffic flow model. By analogy with (9.65), the detailed balance for a system containing a cluster of size $n$ can be written as
\[ \frac{w_+(n-1)}{w_-(n)} = \exp\left[ -\frac{F(n) - F(n-1)}{T^*} \right], \tag{10.116} \]
where $T^*$ is a parameter with the dimension of energy, which corresponds to $k_B T$ in (9.65), that is, to the temperature measured in energy units, and $F(n)$ is the free energy of the state (including all possible microscopic distributions of coordinates and momenta) with cluster size $n$. In traffic flow, too, $T^*$ can be interpreted as a kind of 'temperature'. Following (9.66)–(9.68), the free energy of a system with a large cluster size $n$ can be expressed through the transition rates as
\[ \ln\frac{w_+(n)}{w_-(n)} = -\frac{1}{T^*}\frac{\partial F}{\partial n} \tag{10.117} \]
or
\[ F = F_0 - T^* \int_0^n \ln\frac{w_+(n')}{w_-(n')}\,dn', \tag{10.118} \]
where $F_0 = F(n = 0)$ does not depend on the cluster size $n$. It is the free energy of the system without a cluster. According to (10.114) and (10.115), the ratio of transition rates reads
\[ \frac{w_+(n)}{w_-(n)} = \tau\,\frac{v_{\mathrm{opt}}(\Delta x_{\mathrm{free}})}{\Delta x_{\mathrm{free}}}, \tag{10.119} \]
where $v_{\mathrm{opt}}(\Delta x)$ is the optimal velocity function given by (10.15). It yields
\[ \frac{w_+(n)}{w_-(n)} = v_{\max}\tau\rho\,\frac{1 - n/N}{1 + \rho^2 D^2 (1 - n/N)^2}, \tag{10.120} \]
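where $\rho = N/L$ is the car density. Introducing the dimensionless density $\tilde\rho = \rho D$ and the dimensionless control parameter $\tilde b = D/(v_{\max}\tau)$, this becomes
\[ \frac{w_+(n)}{w_-(n)} = \frac{1}{\tilde b}\,\frac{\tilde\rho\,(1 - n/N)}{1 + \tilde\rho^2 (1 - n/N)^2} \tag{10.121} \]
or
\[ \ln\frac{w_+(n)}{w_-(n)} = \ln\frac{\tilde\rho}{\tilde b} + \ln\left(1 - \frac{n}{N}\right) - \ln\left[ 1 + \tilde\rho^2 \left(1 - \frac{n}{N}\right)^2 \right]. \tag{10.122} \]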
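By inserting the latter relation into (10.118), the integration using $\int \ln(1 + x^2)\,dx = 2\arctan x + x\ln(1 + x^2) - 2x$ yields
\[ \frac{F - F_0}{\tilde L T^*} = \tilde\rho \left\{ \left(1 - \frac{n}{N}\right)\ln\left(1 - \frac{n}{N}\right) - \frac{n}{N} - \frac{n}{N}\ln\frac{\tilde\rho}{\tilde b} + \ln\left(1 + \tilde\rho^2\right) - \left(1 - \frac{n}{N}\right)\ln\left[ 1 + \tilde\rho^2\left(1 - \frac{n}{N}\right)^2 \right] \right\} + 2\arctan\tilde\rho - 2\arctan\left[ \tilde\rho\left(1 - \frac{n}{N}\right) \right], \tag{10.123} \]
where $\tilde L = L/D$ is the dimensionless length of the road.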
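The closed form (10.123) can be cross-checked against a direct numerical integration of (10.118). The following is a minimal sketch under stated assumptions: the function names are illustrative (they do not come from the text), `x` stands for $n/N$, and the midpoint rule is just one convenient quadrature. In units of $\tilde L T^*$, (10.118) reads $-\tilde\rho\int_0^{x}\ln(w_+/w_-)\,dx'$, because $N = \tilde\rho\tilde L$.

```python
import math

def rate_ratio(x, rho, b):
    """Ratio w+(n)/w-(n) of (10.121); x = n/N, rho and b are the
    dimensionless density and control parameter."""
    u = 1.0 - x
    return rho * u / (b * (1.0 + (rho * u) ** 2))

def free_energy(x, rho, b):
    """Closed form (10.123) for (F - F0)/(L~ T*)."""
    u = 1.0 - x
    u_log_u = u * math.log(u) if u > 0.0 else 0.0  # u ln u -> 0 as u -> 0
    bracket = (u_log_u - x - x * math.log(rho / b)
               + math.log(1.0 + rho ** 2)
               - u * math.log(1.0 + (rho * u) ** 2))
    return rho * bracket + 2.0 * math.atan(rho) - 2.0 * math.atan(rho * u)

def free_energy_numeric(x, rho, b, steps=2000):
    """Midpoint-rule evaluation of (10.118) in units of L~ T*:
    -rho * integral_0^x ln(w+/w-) dx'."""
    h = x / steps
    total = sum(math.log(rate_ratio((k + 0.5) * h, rho, b))
                for k in range(steps))
    return -rho * h * total

if __name__ == "__main__":
    b = 2.0 / 7.0  # value of the control parameter quoted in the text
    for rho in (0.1, 1.0, 3.186, 5.0):
        print(rho, free_energy(0.5, rho, b), free_energy_numeric(0.5, rho, b))
```

The two columns printed by the usage example should agree to within the quadrature error; the densities are those used for Figures 10.29 and 10.30.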
The results for $w_+(n)/w_-(n)$ and $(F - F_0)/(\tilde L T^*)$ as functions of the fraction of congested cars $n/N$ at four different densities are shown in Figures 10.29 and 10.30. The value of the dimensionless control parameter has been chosen as $\tilde b = 2/7 \approx 0.2857$. It corresponds, e.g., to $D = 24$ m, $v_{\max} = 42$ m s$^{-1}$, and $\tau = 2$ s. At small densities (dotted line) the ratio $w_+(n)/w_-(n)$ stays below 1 and no stable car cluster forms. These plots have to be compared with those for the liquid-gas system shown in Figures 9.7 and 9.8. In distinction to the liquid-gas system, the cluster appears without a nucleation barrier in the present traffic-flow model at somewhat larger densities $\tilde\rho$ (dot-dashed line), whereas the nucleation barrier (free energy maximum) shows up only at even larger values of $\tilde\rho$ (solid line).
Figure 10.29 The ratio of transition rates $w_+(n)/w_-(n)$ (vertical axis) depending on the fraction of congested cars $n/N$ (horizontal axis) for four dimensionless densities $\tilde\rho = 0.1$ (dotted line), $\tilde\rho = 1$ (dot-dashed line), $\tilde\rho = 3.186$ (dashed line), and $\tilde\rho = 5$ (solid line).
Figure 10.30 Normalized free energy difference $(F - F_0)/(\tilde L T^*) = f - f_0$ (vertical axis) depending on the fraction of congested cars $n/N$ (horizontal axis) for four dimensionless densities $\tilde\rho = 0.1$ (dotted line), $\tilde\rho = 1$ (dot-dashed line), $\tilde\rho = 3.186$ (dashed line), and $\tilde\rho = 5$ (solid line).
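The detailed balance condition $p^{\mathrm{st}}(n)\,w_+(n) = p^{\mathrm{st}}(n+1)\,w_-(n+1)$ also determines the stationary cluster-size distribution directly, by recursion from $n = 0$. The sketch below is illustrative only: the function names and the choice $N = 60$ are assumptions, the rates are those of (10.114) and (10.115), and the optimal velocity function is taken in the form $v_{\mathrm{opt}}(\Delta x) = v_{\max}(\Delta x)^2/(D^2 + (\Delta x)^2)$. Together with (10.116), the recursion implies $p^{\mathrm{st}}(n) \propto \exp[-F(n)/T^*]$.

```python
import math

def v_opt(dx, v_max, d):
    """Optimal velocity function v_opt(dx) = v_max dx^2 / (D^2 + dx^2)."""
    return v_max * dx ** 2 / (d ** 2 + dx ** 2)

def w_plus(n, n_cars, length, v_max, d):
    """Attachment rate (10.114); no free cars are left when n = N."""
    if n >= n_cars:
        return 0.0
    dx_free = length / (n_cars - n)
    return v_opt(dx_free, v_max, d) / dx_free

def w_minus(n, tau):
    """Detachment rate (10.115); an empty cluster cannot lose a car."""
    return 1.0 / tau if n > 0 else 0.0

def stationary_distribution(n_cars, length, v_max, d, tau):
    """p_st(n) for n = 0..N from p_st(n+1) = p_st(n) w+(n) / w-(n+1)."""
    log_p = [0.0]
    for n in range(n_cars):
        ratio = w_plus(n, n_cars, length, v_max, d) / w_minus(n + 1, tau)
        log_p.append(log_p[-1] + math.log(ratio))
    shift = max(log_p)  # work with logarithms to avoid overflow
    weights = [math.exp(lp - shift) for lp in log_p]
    norm = sum(weights)
    return [w / norm for w in weights]

if __name__ == "__main__":
    n_cars, d, v_max, tau = 60, 24.0, 42.0, 2.0
    for rho_tilde in (0.1, 1.0):
        length = n_cars * d / rho_tilde  # rho~ = N D / L
        p = stationary_distribution(n_cars, length, v_max, d, tau)
        print(rho_tilde, "most probable cluster size:", p.index(max(p)))
```

At the low density the most probable state is the one without a cluster, whereas at $\tilde\rho = 1$ the maximum of the distribution lies at a finite cluster size, in line with the free energy minimum in Figure 10.30.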
In the above calculation we have determined only the difference $F - F_0$, but not the free energy $F_0$ of the ideal system without a car cluster. As in physical systems, e.g. the supersaturated vapor discussed in Chapter 9, the latter cannot be derived from the detailed balance, but should be calculated from a microscopic model. We should take into account that the distribution over momenta for cars is not the same as that for molecules in an ideal gas. As a first approximation we may assume a similar Gaussian distribution with only a shifted mean value $\langle p \rangle = m \langle v \rangle = m v_{\mathrm{opt}}(\Delta x_{\mathrm{hom}})$, in accordance with the optimal velocity $v_{\mathrm{opt}}(\Delta x_{\mathrm{hom}})$ in the homogeneous flow of cars with the mean headway distance $\Delta x_{\mathrm{hom}} = L/N = 1/\rho$. The Gaussian form of the distribution is consistent with the simulation results for the stochastic car-following models [79, 108, 151]. We should also take into account that cars always move in one direction, so that the momentum $p > 0$ always holds. Finally, as distinct from the ideal gas of molecules, the coordinates and momenta of cars are one-dimensional. Hence, by integration over the coordinates and momenta of all cars, for the ideal part of the partition function we can write
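\[ Z_{\mathrm{ideal}} = \frac{1}{N!} \prod_{\alpha=1}^{N} \frac{1}{h} \int_0^L dx_\alpha \int_0^\infty dp_\alpha\, \exp\left[ -\frac{\left(p_\alpha - \langle p \rangle\right)^2}{2 m T^*} \right] \approx \frac{1}{N!} \left( \frac{L}{\lambda_0(T^*)} \right)^N, \tag{10.124} \]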
where $\lambda_0(T^*) = h/(2\pi m T^*)^{1/2}$. The approximate equality in (10.124) holds when $\langle p \rangle^2/(2 m T^*) \gg 1$ or, in other words, when the velocity distribution is narrow compared to the mean velocity. This condition is satisfied for the model (10.112) and (10.113) with a certain set of control parameters used in the simulations (see Figure 10.12). The distribution width, however, increases with the noise amplitude. In fact, the approximation (10.124) is good enough when the distribution function has a small value at zero momentum $p = 0$, as in the simulation results of [79, 108]. According to the above consideration, the temperature in traffic flow is a parameter which controls this distribution width, that is, the amplitude of the velocity and momentum fluctuations, as in the ideal gas of molecules. According to (10.124), the ideal part of the free energy reads
\[ F_0 = -T^* \left[ N \ln\frac{L}{\lambda_0(T^*)} - \ln N! \right] \simeq T^* N \left[ \ln\left( \rho\,\lambda_0(T^*) \right) - 1 \right]. \tag{10.125} \]
As in the liquid-gas system, the congested traffic can be considered as consisting of two phases: the cluster or jam (liquid-like) phase with $n$ particles (vehicles) and free energy $F_{\mathrm{clust}}(n)$, and the free-flow (ideal-gas-like) phase with $N_{\mathrm{ideal}} = N - n$ particles and free energy $F_{\mathrm{ideal}}(N_{\mathrm{ideal}})$. The total free energy is $F = F_{\mathrm{clust}} + F_{\mathrm{ideal}}$, and we can define the same basic relations for the chemical potentials of these phases as in physical systems, so $\mu_{\mathrm{clust}} = \partial F_{\mathrm{clust}}/\partial n$ and $\mu_{\mathrm{ideal}} = \partial F_{\mathrm{ideal}}/\partial N_{\mathrm{ideal}} = -\partial F_{\mathrm{ideal}}/\partial n$. Hence, we also have (cf. (9.86))
\[ \frac{\partial F}{\partial n} = \mu_{\mathrm{clust}} - \mu_{\mathrm{ideal}}. \tag{10.126} \]
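The quality of the approximate equality in (10.124) can be quantified by the fraction of the full Gaussian integral that lies on the half-line $p > 0$. The sketch below uses a hypothetical helper name; it relies only on the identity $\int_0^\infty \exp[-(p - \langle p\rangle)^2/(2mT^*)]\,dp = (2\pi m T^*)^{1/2}\,\tfrac12\,\mathrm{erfc}(-a)$ with $a = \langle p\rangle/(2mT^*)^{1/2}$.

```python
import math

def half_line_fraction(mean_p, m, t_star):
    """Fraction of the Gaussian momentum integral in (10.124) that lies on
    p > 0; it tends to 1 when <p>^2 / (2 m T*) >> 1."""
    a = mean_p / math.sqrt(2.0 * m * t_star)
    return 0.5 * math.erfc(-a)

if __name__ == "__main__":
    # Illustrative values with 2 m T* = 1, so that a = <p>.
    for mean_p in (0.0, 1.0, 2.0, 3.0):
        print(mean_p, half_line_fraction(mean_p, 0.5, 1.0))
```

For $a = 0$ only half of the Gaussian weight is retained, whereas for $a$ of a few units the correction to $(L/\lambda_0)^N$ per car is already very small, which is the regime assumed in (10.124).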
The free energy of the free-flow phase in traffic is
\[ F_{\mathrm{ideal}}(T^*, L, N, n) = T^* (N - n) \left[ \ln\left( \lambda_0(T^*)\,\frac{N - n}{L} \right) - 1 \right], \tag{10.127} \]
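consistent with (10.125), where we put $N \to N_{\mathrm{ideal}} = N - n$ and $\rho \to N_{\mathrm{ideal}}/L$. From (10.127) we get
\[ \mu_{\mathrm{ideal}} = -\frac{\partial F_{\mathrm{ideal}}}{\partial n} = T^* \ln\left( \lambda_0(T^*)\,\frac{N - n}{L} \right) = T^* \ln\left( \tilde\rho_{\mathrm{free}}\,\frac{\lambda_0(T^*)}{D} \right), \tag{10.128} \]
where $\tilde\rho_{\mathrm{free}} = D(N - n)/L$ is the dimensionless density of cars in the free-flow phase. The chemical potential of the cluster phase can be easily calculated from (10.117), (10.122), and (10.126). It yields
\[ \mu_{\mathrm{clust}} = -T^* \left\{ \ln\frac{D}{\lambda_0 \tilde b} - \ln\left[ 1 + \tilde\rho^2 \left(1 - \frac{n}{N}\right)^2 \right] \right\} = -T^* \left\{ \ln\frac{D}{\lambda_0 \tilde b} - \ln\left( 1 + \tilde\rho_{\mathrm{free}}^2 \right) \right\}. \tag{10.129} \]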
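For completeness, the intermediate steps leading to (10.129) can be written out as follows. This is only a sketch that combines relations already stated, namely (10.117), (10.122), (10.126), and (10.128), together with $\tilde\rho_{\mathrm{free}} = \tilde\rho\,(1 - n/N)$.

```latex
\begin{align*}
\mu_{\mathrm{clust}}
  &= \mu_{\mathrm{ideal}} + \frac{\partial F}{\partial n}
   = \mu_{\mathrm{ideal}} - T^* \ln\frac{w_+(n)}{w_-(n)} \\
  &= T^* \ln\frac{\tilde\rho_{\mathrm{free}}\,\lambda_0}{D}
   - T^* \left[ \ln\frac{\tilde\rho}{\tilde b}
   + \ln\left(1 - \frac{n}{N}\right)
   - \ln\left(1 + \tilde\rho_{\mathrm{free}}^2\right) \right] \\
  &= -T^* \left[ \ln\frac{D}{\lambda_0 \tilde b}
   - \ln\left(1 + \tilde\rho_{\mathrm{free}}^2\right) \right].
\end{align*}
```

In the last step the terms $\ln\tilde\rho_{\mathrm{free}} - \ln\tilde\rho - \ln(1 - n/N)$ cancel, so that only the combination $\lambda_0\tilde b/D$ remains under the logarithm.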
It is remarkable that, as distinct from the liquid-gas system, the chemical potential of the cluster phase is not an internal property of this phase, since it depends on the outer parameter – the density of the surrounding free-flow phase ρ˜ free . The physical interpretation of this fact is that the traffic flow is a driven system, which approaches a stationary rather than equilibrium state in the usual sense. However, as we have shown here, various properties of this stationary state can be described by equilibrium thermodynamics. The free energy of the cluster phase can be calculated consistently from (10.123), (10.125), and (10.127) according to F = Fclust + Fideal . The result is
\[ F_{\mathrm{clust}}(T^*, L, N, n) = T^* N \left\{ -\frac{n}{N}\left( 2 + \ln\frac{D}{\lambda_0 \tilde b} \right) + \frac{2}{\tilde\rho}\left( \arctan\tilde\rho - \arctan\left[ \tilde\rho\left(1 - \frac{n}{N}\right) \right] \right) + \ln\left( 1 + \tilde\rho^2 \right) - \left(1 - \frac{n}{N}\right)\ln\left[ 1 + \tilde\rho^2\left(1 - \frac{n}{N}\right)^2 \right] \right\}. \tag{10.130} \]
It is consistent with $\mu_{\mathrm{clust}} = \partial F_{\mathrm{clust}}/\partial n$.

As a summary of this chapter, we have shown how thermodynamics can be applied to a many-particle system such as traffic flow, based on a microscopic (car-following) as well as a mesoscopic (stochastic) cluster description. By analogy with equilibrium physical systems like supersaturated vapor forming liquid droplets (see Chapter 9), we have derived the free energy function and chemical potentials by using the detailed balance in the space of car cluster sizes. Features distinguishing traffic flow as a driven system from equilibrium physical systems have also been discussed.
10.7 Exercises
E 10.1 Optimal velocity model I
Formulate the optimal velocity model, considered in Section 10.6, writing Newton's equation of motion in the dimensionless form
\[ \frac{d\tilde v_i}{d\tilde t} = \tilde F_{\mathrm{acc}}(\tilde v_i) + \tilde F_{\mathrm{dec}}(\Delta\tilde x_i) \]
by introducing the dimensionless variables time $\tilde t = t/\tau$, coordinate $\tilde x = x/D$, and velocity $\tilde v = v/v_{\max}$, with the kinetic and potential energy of the $i$th car given in dimensionless form as $\tilde E_{\mathrm{kin}}(\tilde v_i) = \tilde v_i^2/2$ and $\tilde E_{\mathrm{pot}}(\Delta\tilde x_i) = -\int \tilde F_{\mathrm{dec}}(\tilde x_i)\,d\tilde x_i$. Here $D$ and $v_{\max}$ are the interaction distance and the maximum velocity entering the optimal velocity function $v_{\mathrm{opt}}(\Delta x) = v_{\max}(\Delta x)^2/(D^2 + (\Delta x)^2)$. Derive expressions for the dimensionless quantities: the acceleration and deceleration forces $\tilde F_{\mathrm{acc}}(\tilde v_i)$ and $\tilde F_{\mathrm{dec}}(\Delta\tilde x_i)$, the dimensionless kinetic and potential energy of the car system, and the total energy flux. Find the dimensionless control parameters which determine the behavior of the model.

E 10.2 Optimal velocity model II
Having the energy balance in mind (see Figure 10.31), make numerical calculations of the energy and energy flux for the optimal velocity model in analogy with those presented in Section 10.6, but using the dimensionless variables defined in the previous exercise. Find how the system's behavior changes depending on the values of the dimensionless control parameters and calculate the limit cycle in the headway-velocity phase space.

Figure 10.31 Energy flow in steady state vehicular traffic (fuel inflow, friction outflow, velocity v).
E 10.3 Optimal velocity model III
Consider the optimal velocity model introduced in Section 10.2. Figure 10.32 depicts a phase diagram which shows the different states of the car system on a circular road depending on the average headway ($\tilde L/N$) and the dimensionless control parameter $b$. Find the answers to the following questions.
• What is the meaning of the spinodal function?
• What is the meaning of the binodal function?
• What does $b_{\mathrm{crit}}$ mean?
• What does $b_{\mathrm{coll}}$ mean?
• Identify five different regions in the phase diagram (only homogeneous solution, only limit-cycle solution, homogeneous and limit-cycle solution possible, homogeneous solution with collisions in the limit-cycle, no stable solution) and discuss their meaning in detail.
• Think about the hysteresis effect in this diagram. What is the meaning of this effect in real traffic?

Figure 10.32 Binodal (solid) and spinodal function (dashed) in the phase diagram of traffic (horizontal axis: headway, 0.0 to 1.6; vertical axis: parameter $b$, 0.8 to 1.4). The upper dotted line is $b_{\mathrm{crit}}$. The lower dotted line is $b_{\mathrm{coll}}$.
11 Noise-Induced Phase Transitions
11.1 Equilibrium and Nonequilibrium Phase Transitions
This chapter is devoted to a novel type of cooperative phenomenon that can be observed in systems far from their equilibrium state or in systems where the notion of thermal equilibrium is not applied at all. These are called phase transitions induced by noise. The ability of noise to produce order under certain conditions, in particular, to give rise to various cooperative phenomena and the corresponding phase transitions, is now a well established fact [6,54,86,102,114,200]. Such phase transitions manifest themselves in the phase-space distribution changing its structure, for example, the number of maxima. Popular examples are stochastic resonance [36, 52, 66], coherence resonance [114, 183], noise-induced transport [158], and a number of noise-induced phase transitions of the diffusion type [54, 232, 233]. Typically, the latter phenomena are due to multiplicative noise. However, additive noise in the presence of another multiplicative noise can also induce phase transitions [115,245,246] or individually cause a hidden phase to become visible [116]. Systems of elements with motivated behavior, e.g. fish and bird swarms, car ensembles on highways, stock markets, etc., also can display a wide variety of cooperative effects caused by noise action [77]. However, the theory of these phenomena is far from being well understood. To elucidate the main features of the nonequilibrium phase transitions under consideration, let us first discuss briefly the basic notions of the equilibrium phase transitions of second order. We make use of the Landau order parameter theory and for the sake of simplicity consider a one-dimensional system with a nonconservative scalar order parameter h(x, t) with dynamics governed by the equation [137]
\[ \tau\,\frac{\partial h}{\partial t} = -\frac{\delta H\{h\}}{\delta h} + g\,\xi(t), \tag{11.1} \]
where $\tau$ is the kinetic coefficient and $g$ is the amplitude of the random Langevin force proportional to white noise $\xi(t)$, whose averages satisfy the equalities
\[ \langle \xi(t) \rangle = 0 \quad\text{and}\quad \langle \xi(t)\,\xi(t') \rangle = \delta(t - t'), \tag{11.2} \]
and the functional H{h} is the free energy, typically written in the form
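\[ H\{h\} = \int_{-\infty}^{+\infty} \left[ \frac{1}{2}\,\ell^2 (\nabla h)^2 + H(h) \right] dx \]
with the free energy density $H(h)$ specified by the expression
\[ H(h) = \frac{1}{2}\,\alpha(T)\,h^2 + \frac{1}{4}\,\beta h^4. \]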
Here, by definition, the symbol $\nabla := \partial/\partial x$; the characteristic spatial scale $\ell$, the coefficient $\beta > 0$, the kinetic coefficient $\tau > 0$, and the noise amplitude $g$ are constant parameters of the given system, whereas the coefficient $\alpha(T)$ depends on the temperature $T$. In the case of = 0, it changes sign at the critical temperature $T = T_c$. The following ansatz with some constant $\alpha_0$
\[ \alpha(T) = \alpha_0\,\frac{T - T_c}{T_c} \]
is usually adopted. In these terms the governing equation (11.1) becomes
\[ \tau\,\frac{\partial h}{\partial t} = \ell^2 \nabla^2 h - \alpha(T)\,h - \beta h^3 + g\,\xi(t). \tag{11.3} \]
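The homogeneous stationary states of the noise-free dynamics are the extrema of the free energy density $H(h) = \tfrac12\alpha h^2 + \tfrac14\beta h^4$ given above, that is, the roots of $dH/dh = \alpha h + \beta h^3 = 0$. A minimal sketch with illustrative function names (not taken from the text) is:

```python
import math

def free_energy_density(h, alpha, beta):
    """Landau free energy density H(h) = alpha h^2 / 2 + beta h^4 / 4."""
    return 0.5 * alpha * h ** 2 + 0.25 * beta * h ** 4

def landau_minima(alpha, beta):
    """Stable homogeneous states for beta > 0: h = 0 if alpha >= 0,
    otherwise the pair h = -sqrt(|alpha|/beta), +sqrt(|alpha|/beta)."""
    if alpha >= 0.0:
        return [0.0]
    h = math.sqrt(-alpha / beta)
    return [-h, h]

if __name__ == "__main__":
    beta = 4.0
    for alpha in (1.0, 0.0, -1.0):
        minima = landau_minima(alpha, beta)
        print(alpha, minima, [free_energy_density(h, alpha, beta) for h in minima])
```

For $\alpha < 0$ the two minima have the free energy density $-\alpha^2/(4\beta)$, which lies below the value $H(0) = 0$ of the now unstable state $h = 0$.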
For $T > T_c$ the system possesses only one phase state, matching the stable minimum of the free energy, $h(x) = 0$ (Figure 11.1). When the temperature drops below the critical value, $T < T_c$, the free energy changes in form. Two additional extrema appear which are stable, whereas the previous one becomes unstable. As a result, the homogeneous state decays, giving rise to a composition of domains with $h_\pm = \pm\sqrt{|\alpha|/\beta}$. This bifurcation is shown in Figure 11.1, which also exhibits the resulting evolution of the probability function $P(h)$ of the order parameter. It should be pointed out that the width of the distribution $P(h)$ is determined by the intensity $g$ of the Langevin forces. Within the framework of the given model, the temperature dependence of the coefficient $\alpha(T)$ and the intensity $g$ of the Langevin forces are independent characteristics of the system properties. This is quite a formal statement because, on the microscopic level, both of them are related to the stochastic motion of the particles forming the system. Nevertheless, when dealing with equilibrium phase transitions, one can regard the random forces as just a source of disorder, and the appearance of new phases becomes pronounced when the difference between the new phase branches, $(h_+ - h_-)$, substantially exceeds the Langevin force intensity, $g$, in magnitude. In some sense the nonequilibrium phase transitions are distinguished from the equilibrium ones by the fact that the phenomena causing them cannot be described within models similar to (11.1) without substantial modification. Below, we will consider two types of nonequilibrium phase transitions that are due to the creative action of noise. The former appears in systems where the amplitude $g(h)$ of the random Langevin forces depends essentially on the order parameter $h$, for example, tends to zero as $h \to 0$.
The latter comes into being via the anomalous behavior of the kinetic coefficient, which depends on the order parameter, now being some vector $h = \{h_1, h_2, \ldots\}$. Namely, the phase space $\{h_1, h_2, \ldots\}$ of such systems contains a narrow layer, a 'low-dimensional' domain called the dynamical trap region, where the kinetic coefficient takes extremely large values. So, when the system enters this region its dynamics stagnates, which gives rise to long-lived states that are treated as new dynamical phases.

In all of these cases, noise gives rise to a certain ordering in addition to random perturbations in the system motion. As a result, after the system has undergone the phase transition, the difference $(h_+ - h_-)$ between the mean values of the order parameter $h$ in the new phases and the width of the local extrema of the distribution function $P(h)$ are of the same order of magnitude. In some sense they are due to the same phenomena and, thus, cannot be analyzed independently of each other. So, such transitions are typically detected by analyzing the distribution function and fixing the transition from, e.g., a unimodal distribution to a bimodal one (Figure 11.2) as the control parameter changes. Before passing directly to the nonequilibrium phase transitions, it is necessary to outline the mathematical concepts to be used in describing nonlinear stochastic systems.

Figure 11.1 Schematic illustration of the mechanism of second-order phase transitions in equilibrium systems. (a) shows the free energy density $H$ versus the order parameter $h$ above and below the critical value of the temperature, labeled with the numbers 1 and 2, respectively. The resulting regular force $-dH/dh$ is shown in (b). The averaged order parameter versus the temperature difference $T - T_c$ is shown in (c), whereas (d) depicts the distribution function $P$ of the order parameter, again above and below the phase transition.
Figure 11.2 Illustration of the approach to detecting phase transitions in nonequilibrium systems which are caused by the creative action of noise (axes: order parameter $h$, control parameter, distribution function; the phase transition is marked where the distribution changes its form).
11.2 Types of Stochastic Differential Equations
An introduction to the rigorous description of stochastic processes is given in Chapter 1. Here we apply a more informal construction which is actually justified by the results presented within the chapter. For an ordinary dynamical system with, for example, a one-dimensional phase space $x \in \mathbb{R}$, the governing equation is written as
\[ \frac{dx}{dt} = f(x, t), \tag{11.4} \]
where $f(x, t)$ is a force acting on the system. From a naive point of view one might just replace the force $f(x, t)$ by the sum of the regular component $f(x, t)$ and the random Langevin source $g(x, t)\,\xi(t)$ to describe a similar stochastic system, rewriting the governing equation as
\[ \frac{dx}{dt} = f(x, t) + g(x, t)\,\xi(t). \tag{11.5} \]
Unfortunately, such a generalization is justified only in the case when the intensity of the Langevin sources does not depend on the phase variable, so that $g(x, t) = g(t)$. To understand this fault we consider the model where
\[ f(x, t) = 0, \qquad g(x, t) = 1 + \epsilon x, \tag{11.6} \]
and $\epsilon \ll 1$ is a small parameter. Initially, the system is assumed to be located at the origin, $x(t = 0) = 0$. Then, to first order in $\epsilon$, the solution of (11.5) can be obtained by iteration, yielding
\[ x(t) = \int_0^t dt'\,\xi(t') + \epsilon \int_0^t dt' \int_0^{t'} dt''\,\xi(t')\,\xi(t''). \tag{11.7} \]
By virtue of (11.2) the value $\langle x(t) \rangle$ of $x(t)$ averaged over all the realizations of white noise is given by the expression
\[ \langle x(t) \rangle = \epsilon \int_0^t dt' \int_0^{t'} dt''\,\delta(t' - t''). \tag{11.8} \]
It is exactly such expressions that are responsible for ambiguities in the formal use of differential equations for describing nonlinear stochastic systems. The problem is that the point where the Dirac $\delta$-function differs from zero lies on the boundary $t'' = t'$ of the integration region (Figure 11.3). In this case it is not clear which part of the $\delta$-function should be ascribed to the integration region or, what amounts to the same thing, which part is outside it. In order to overcome this problem, one can appeal to the notion of infinitesimals, namely, the hyperreal numbers. Their rigorous description can be found in the literature [1, 3]. Here we will confine our consideration to the physical level of rigor. Dealing again with a standard dynamical system, its governing equation (11.4) can be rewritten using infinitesimals as
\[ dx = f(x, t)\,dt, \tag{11.9} \]
where $dx$ and $dt$ are the infinitesimally small increment of the variable $x$ and the time step, respectively. Roughly speaking, in standard dynamical systems there is only one infinitesimally small variable, the time step $dt$. All the other infinitesimals are derived from it by multiplying $dt$ by some smooth function. From the standpoint of such a microscopic level, the distinction between systems with regular and stochastic dynamics is mainly due to the stochastic ones possessing two or more truly independent infinitesimals, the time step $dt$ and the infinitesimal moments $\{dW(t)\}$ of the random Langevin forces. In the case under consideration there is only one moment of the random force, $dW(t)$.

Figure 11.3 Illustration of the fault encountered in describing stochastic systems by nonlinear Langevin forces: the $\delta$-function $\delta(t' - t'')$ is concentrated on the boundary of the region of integration in the $(t', t'')$ plane. See the discussion preceding expression (11.8).

Here, the time $t$ is used with the symbol $dW(t)$ in order to relate the random force moment to the current time. At different moments of time $t$, $t'$ the quantities $dW(t)$ and $dW(t')$ are supposed to be mutually independent. It is possible to consider $dW(t)$ as the integral of white noise $\xi(t)$
\[ dW(t) = \int_t^{t+dt} dt'\,\xi(t'), \tag{11.10} \]
which, however, is no more than a qualitative explanation of the properties ascribed to the random force moment $dW(t)$. Namely, they are
\[ \langle dW(t) \rangle = 0 \quad\text{and}\quad \langle [dW(t)]^2 \rangle = dt. \tag{11.11} \]
The latter equality enables us to estimate the amplitude of the random force moment as $dW(t) \sim \sqrt{dt}$. In the following it will be seen that a stochastic infinitesimal quantity affects the system dynamics only if its amplitude scales with the time step as $\sqrt{dt}$, whereas all the essential regular infinitesimals scale as $dt$. Therefore, when dealing with $[dW(t)]^2$ we can ignore its random component and regard it as a regular infinitesimal. In other words, the following relationship
\[ [dW(t)]^2 = dt \tag{11.12} \]
is adopted. In some sense it specifies the algebraic manipulations with infinitesimals of stochastic systems (see [55] for the details). Now we are ready to write down the governing equation for the given stochastic system in infinitesimals. However, before doing this it is worthwhile to note that the notion of infinitesimals enables one to pose a question about the point $x_\theta$ at which the force $f(x_\theta, t)$ in (11.9) should be calculated. In particular, it is natural to set (Figure 11.4)
\[ x_\theta = x + \theta\,dx, \tag{11.13} \]
where $0 \le \theta \le 1$ is a certain constant or a smooth function of $x$. For standard dynamical systems, however, this shift in (11.9) from the point $x$ to the point $x_\theta$ makes no sense because it gives rise to additional terms on the right-hand side of (11.9) that have no effect on the system dynamics. Namely,
\[ f(x, t)\,dt \;\Rightarrow\; f(x_\theta, t)\,dt \approx f(x, t)\,dt + \frac{\partial f}{\partial x}\,\theta\,dx\,dt, \tag{11.14} \]
and since $dx \propto dt$ in the given case, the last term scales as $(dt)^2$, which is negligible. This is not the case if the Langevin force undergoes the shift $x \to x_\theta$, as will be seen below.

Figure 11.4 Intermediate point $x_\theta = x + \theta\,dx$, lying between $x$ and $x + dx$, which determines the values of the forces acting on the system during the time interval $(t, t + dt)$.
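The rule (11.12) adopted above can be made plausible numerically: the sum of $[dW]^2$ over many small steps covering an interval of length $t$ is close to $t$, with relative fluctuations that vanish as the number of steps grows. The sketch below is illustrative only; the function name and the fixed random seed are assumptions, and the increments are drawn as Gaussian numbers with zero mean and variance $dt$, in accordance with (11.11).

```python
import math
import random

def quadratic_variation(t_end, steps, seed=0):
    """Sum of [dW]^2 over `steps` independent Gaussian increments with
    zero mean and variance dt = t_end / steps."""
    rng = random.Random(seed)
    dt = t_end / steps
    total = 0.0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        total += dw * dw
    return total

if __name__ == "__main__":
    for steps in (100, 10_000, 1_000_000):
        print(steps, quadratic_variation(1.0, steps))
```

The standard deviation of the sum is $t\sqrt{2/n}$ for $n$ steps, so the printed values should approach $t = 1$ as the step number increases, whereas a single increment $dW$ remains of the order of $\sqrt{dt}$.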
Keeping in mind this possibility of dealing with the intermediate points $\{x_\theta\}$, we turn to the following equation in infinitesimals (cf. [55])
\[ dx = f_\theta(x_\theta, t)\,dt + g(x_\theta, t)\,dW(t) \tag{11.15a} \]
to describe the stochastic system with a nonlinear Langevin force. This equation actually inherits the structure of the formal equation (11.5) after its 'integration' from $t$ to $t + dt$ and the formal relationship (11.10) between the random infinitesimal $dW(t)$ and white noise $\xi(t)$. In other words, the given stochastic system is characterized by the regular force $f_\theta(x, t)$, the intensity $g(x, t)$ of the random Langevin force, and the parameter $\theta$ specifying the intermediate point $x_\theta$ which determines the magnitudes of these forces. Concluding the construction of the given governing equation, we note the following. First, the introduced basic parameter $\theta$ has no analogy in standard dynamical systems. Moreover, it characterizes the physical properties of a given system, reflecting its features at the microscopic level. The subscript $\theta$ of the function $f_\theta(x, t)$ is used to underline the fact that it belongs to the triplet $\{f, g, \theta\}$. Second, in the general case, $f_\theta(x, t)$ is a vector and $g(x, t)$ as well as $\theta(x, t)$ is some tensor. Third, it follows from (11.15a) that $dx \propto dW \propto \sqrt{dt}$ holds at the leading order in $dt$. Therefore, the difference
\[ f_\theta(x_\theta, t)\,dt - f_\theta(x, t)\,dt \approx \frac{\partial f_\theta}{\partial x}\,\theta\,dx\,dt \propto (dt)^{3/2} \]
can be ignored because it scales as $(dt)^{3/2}$ with the time step $dt$ and, therefore, is an infinitesimal of higher order than $dt$. So the replacement $f_\theta(x_\theta, t) = f_\theta(x, t)$ in equation (11.15a) will be adopted from now on. In this case, (11.15a) reads
\[ dx = f_\theta(x, t)\,dt + g(x_\theta, t)\,dW(t). \tag{11.15b} \]
Fourth, special types of stochastic equations have individual names (Table 11.1). The reasons for this will become clear. As we have mentioned above, a stochastic system is described by the triplet $\{f_\theta, g, \theta\}$, which aggregates the system properties on the microscopic level, and the question concerning the specific value of $\theta$ should correspond to physics rather than mathematics. For example, diffusion in solids is the Hänggi–Klimontovich process, whereas current induced by a temperature gradient is the Ito process.

Table 11.1 Basic types of stochastic systems.

θ      Name of stochastic process
0      Ito process
1/2    Stratonovich process
1      Hänggi–Klimontovich process
Nevertheless, for a given system it is possible to make use of a description $\{f_\theta, g, \theta\}$ in which the value of the parameter $\theta$ is chosen for specific purposes. This is due to the fact that the collection of triplets $\{f_\theta, g, \theta\}_\theta$, where the form of the function $g(x, t)$ is fixed, gives an equivalent description of the stochastic dynamics, provided that the regular forces $\{f_\theta\}$ meet a certain relationship. To show this we expand the function $g(x_\theta, t) = g(x + \theta\,dx, t)$ in a Taylor series with respect to the infinitesimal quantity $dx$ and cut it off at the first-order term
\[ g(x_\theta, t) = g(x, t) + \theta\,\frac{\partial g(x, t)}{\partial x}\,dx. \]
As follows from (11.15), at the leading order in $dt$, that is, within the accuracy of $\sqrt{dt}$, the infinitesimals $dx$ and $dW$ are related by the expression $dx = g(x, t)\,dW$, so
\[ g(x_\theta, t) = g(x, t) + \frac{\theta}{2}\,\frac{\partial g^2(x, t)}{\partial x}\,dW. \]
Substituting the latter expression into (11.15b) and taking into account (11.12), this reduces to the following Ito-type stochastic governing equation
\[ dx = \left[ f_\theta(x, t) + \frac{\theta}{2}\,\frac{\partial g^2(x, t)}{\partial x} \right] dt + g(x, t)\,dW(t). \tag{11.16} \]
Therefore, all the triplets $\{f_\theta, g, \theta\}$ with a given intensity $g(x, t)$ of the Langevin forces actually describe the same stochastic system if the regular forces $\{f_\theta(x, t)\}$ belong to a family of functions such that
\[ f_0(x, t) = f_\theta(x, t) + \frac{\theta}{2}\,\frac{\partial g^2(x, t)}{\partial x}, \tag{11.17} \]
where $f_0(x, t)$ is the regular component of the Ito-type triplet. A similar relationship for the regular force of the Hänggi–Klimontovich type is written as
\[ f_1(x, t) = f_\theta(x, t) - \frac{1 - \theta}{2}\,\frac{\partial g^2(x, t)}{\partial x}. \tag{11.18} \]
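The role of $\theta$ can be illustrated with the model (11.6), $f_\theta = 0$ and $g = 1 + \epsilon x$. According to (11.16), the equivalent Ito drift is $(\theta/2)\,\partial g^2/\partial x = \theta\epsilon(1 + \epsilon x)$, so that to first order in $\epsilon$ one expects $\langle x(t) \rangle \approx \theta\epsilon t$. The Monte Carlo sketch below is an assumption-laden illustration, not the authors' procedure: it integrates the Ito form (11.16) with the simple Euler–Maruyama scheme, and the function name, step number, path number, and seed are arbitrary choices.

```python
import math
import random

def mean_position(theta, eps, t_end=1.0, steps=50, paths=4000, seed=1):
    """Sample mean of x(t_end) for f_theta = 0, g = 1 + eps * x, started at
    x = 0 and integrated in the equivalent Ito form (11.16)."""
    rng = random.Random(seed)
    dt = t_end / steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(paths):
        x = 0.0
        for _ in range(steps):
            g = 1.0 + eps * x
            drift = theta * eps * g  # (theta / 2) * d(g^2)/dx
            x += drift * dt + g * rng.gauss(0.0, sqrt_dt)
        total += x
    return total / paths

if __name__ == "__main__":
    eps = 0.1
    for theta in (0.0, 0.5, 1.0):
        # The same seed gives the same noise realizations for every theta.
        print(theta, mean_position(theta, eps), "expected about", theta * eps)
```

Using the same seed for all values of $\theta$ means the runs share their noise realizations, so the differences between the printed means isolate the $\theta$-dependent drift from the statistical scatter.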
As a closing remark in this section, we note that the Ito representation of a stochastic system has an advantage in describing the system dynamics as an array of succeeding steps { dx} because the probability of the system transition from the state x to the state x + dx is specified completely by its properties at the state x only. The properties distinguishing the other two types of stochastic process will be discussed below.
11.3 Transformation of Random Variables
Conversion from a random variable, $x$, to a new one, $y$, poses the problem of finding the governing equation for the random process $y(t) \Leftarrow x(t)$, provided that the governing equation of the system dynamics in terms of the variable $x$ is known, for example, (11.15). It is normal to consider this problem under the assumption that the old and new random processes are characterized by the same value of the parameter $\theta$. Unfortunately, the standard rules valid for ordinary differential equations do not hold, in general, for stochastic systems. The fact is that, even for a smooth function $x = \phi(y)$, the expansion
\[ \phi(y + dy) = \phi(y) + \phi'(y)\,dy + \tfrac{1}{2}\,\phi''(y)\,(dy)^2 + \cdots \]
contains terms proportional to $dt$ from two sources. The former is the second term in this expansion, due to the presence of the regular force; the latter is the third term, due to the contribution of the random force moment, namely, $(dW)^2 = dt$. To derive the transformation rule for the stochastic system under consideration we make use of the following equalities
\[ x_\theta = \theta\,\phi[y_\theta + (1 - \theta)\,dy] + (1 - \theta)\,\phi[y_\theta - \theta\,dy] = \phi[y_\theta] + \tfrac{1}{2}\,\theta(1 - \theta)\,\phi''[y_\theta]\,(dy)^2 \tag{11.19} \]
and
\[ dx = \phi[y_\theta + (1 - \theta)\,dy] - \phi[y_\theta - \theta\,dy] = \phi'[y_\theta]\,dy + \left( \tfrac{1}{2} - \theta \right) \phi''[y_\theta]\,(dy)^2, \tag{11.20} \]
where $y_\theta = y + \theta\,dy$. In (11.15a) the argument $x_\theta$ of the functions $f_\theta(x_\theta, t)$ and $g(x_\theta, t)$ can be replaced by another quantity deviating from $x_\theta$ by an infinitesimal of higher order than $\sqrt{dt}$. Thus, by virtue of (11.19), the transformation of the variables $x = \phi(y)$ is implemented, first, using the direct replacement $f_\theta(x_\theta, t),\ g(x_\theta, t) \to f_\theta[\phi(y_\theta), t],\ g[\phi(y_\theta), t]$. Second, the substitution of (11.20) into (11.15a) converts it to
\[ \phi'[y_\theta]\,dy + \left( \tfrac{1}{2} - \theta \right) \phi''[y_\theta]\,(dy)^2 = f_\theta[\phi(y_\theta), t]\,dt + g[\phi(y_\theta), t]\,dW(t). \tag{11.21} \]
Whence it follows that, at the leading order in $dt$, that is, within the accuracy of $\sqrt{dt}$, the infinitesimals $dy$ and $dW$ are related as
\[ dy = \frac{g[\phi(y_\theta), t]}{\phi'[y_\theta]}\,dW(t). \]
Taking into account rule (11.12), formula (11.21) leads us to the required governing equation written for the random variable $y$ related to $x$ as $x = \phi(y)$ (cf. [55])
\[ dy = \frac{1}{\phi'[y_\theta]} \left\{ f_\theta[\phi(y_\theta), t] + \left( \theta - \frac{1}{2} \right) g^2[\phi(y_\theta), t]\,\frac{\phi''[y_\theta]}{(\phi'[y_\theta])^2} \right\} dt + \frac{g[\phi(y_\theta), t]}{\phi'[y_\theta]}\,dW(t). \tag{11.22} \]
It is the second term in the braces that makes the transformation of random variables distinct from that for ordinary dynamical systems. However, if the random process is of the Stratonovich type, so that $\theta = 1/2$, this term vanishes and the transformation of random variables takes the standard form [55]
\[ dy = \frac{1}{\phi'[y_{1/2}]}\,f_{1/2}[\phi(y_{1/2}), t]\,dt + \frac{g[\phi(y_{1/2}), t]}{\phi'[y_{1/2}]}\,dW(t). \tag{11.23} \]
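As a brief worked illustration (an example chosen here, not taken from the text), consider $f_\theta = 0$ and $g(x) = \sigma x$ with a constant $\sigma$, and pass to the variable $y = \ln x$, that is, $x = \phi(y) = e^y$. Then $\phi' = \phi'' = e^y$, and (11.22) gives:

```latex
\begin{align*}
dy &= \frac{1}{e^{y_\theta}} \left( \theta - \frac{1}{2} \right)
      \sigma^2 e^{2 y_\theta}\, \frac{e^{y_\theta}}{e^{2 y_\theta}}\, dt
      + \frac{\sigma e^{y_\theta}}{e^{y_\theta}}\, dW(t) \\
   &= \left( \theta - \frac{1}{2} \right) \sigma^2\, dt + \sigma\, dW(t).
\end{align*}
```

For $\theta = 1/2$ the drift vanishes, as expected from (11.23), whereas the Ito case $\theta = 0$ reproduces the familiar correction $-\sigma^2/2$ in the equation for the logarithm.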
This property endows the Stratonovich representation of stochastic processes with an advantage in dealing with a description based on various transformations of the system phase space.
11.4 Forms of the Fokker–Planck Equation
The relationship between the stochastic differential equations dealing with individual realizations of stochastic trajectories and the Fokker–Planck equation describing the dynamics of the distribution function P(x, t) was considered in detail in Chapter 5. Here we will make use of the results presented there. For the systems under consideration the diffusion coefficient is calculated as follows

$$D(x, t) = \frac{\langle (dx)^2 \rangle}{2\, dt} = \frac{g^2(x, t)}{2}.$$

Thus, for a stochastic system governed by the Ito-type triplet {f0, g, 0} the Fokker–Planck equation takes the form

$$\frac{\partial P}{\partial t} = \frac{\partial}{\partial x} \left\{ \frac{1}{2} \frac{\partial (g^2 P)}{\partial x} - f_0 P \right\}. \tag{11.24}$$

In order to write the corresponding equation for a system with a general triplet {fθ, g, θ}, we can make use of the Fokker–Planck equation (11.24) after passing to the equivalent Ito representation according to (11.17). In this way we get

$$\frac{\partial P}{\partial t} = \frac{\partial}{\partial x} \left\{ \frac{g^{2\theta}}{2} \frac{\partial (g^{2(1-\theta)} P)}{\partial x} - f_\theta P \right\}. \tag{11.25}$$

In particular, for the Stratonovich processes, where θ = 1/2, the Fokker–Planck equation is of the form

$$\frac{\partial P}{\partial t} = \frac{\partial}{\partial x} \left\{ \frac{g}{2} \frac{\partial (g P)}{\partial x} - f_{1/2} P \right\}, \tag{11.26}$$

whereas for the Hänggi–Klimontovich processes it becomes

$$\frac{\partial P}{\partial t} = \frac{\partial}{\partial x} \left\{ \frac{g^2}{2} \frac{\partial P}{\partial x} - f_1 P \right\}. \tag{11.27}$$
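As a consistency check, (11.25) can be recovered from (11.24) in a few lines. The sketch below assumes that the equivalent Ito drift referred to as (11.17) has the standard form $f_0 = f_\theta + \theta\, g\, \partial g/\partial x$; that explicit form is not reproduced in this section and is an assumption here.

```latex
\begin{aligned}
\frac{1}{2}\frac{\partial (g^{2} P)}{\partial x} - f_0 P
 &= \frac{1}{2}\frac{\partial}{\partial x}\left[ g^{2\theta}\, g^{2(1-\theta)} P \right]
    - \left( f_\theta + \theta g \frac{\partial g}{\partial x} \right) P \\
 &= \frac{g^{2\theta}}{2}\frac{\partial (g^{2(1-\theta)} P)}{\partial x}
    + \theta g \frac{\partial g}{\partial x} P
    - f_\theta P - \theta g \frac{\partial g}{\partial x} P \\
 &= \frac{g^{2\theta}}{2}\frac{\partial (g^{2(1-\theta)} P)}{\partial x} - f_\theta P .
\end{aligned}
```

Setting θ = 1/2 or θ = 1 in the last line gives the expressions in the braces of (11.26) and (11.27), respectively.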
The Hänggi–Klimontovich representation of stochastic dynamics is singled out by the relation between the steady-state distribution Pst(x), the regular force f(x), and the intensity g(x) of the Langevin source. Namely, for an autonomous system, that is, when its properties do not depend on time, the steady-state solution Pst(x) of (11.27) obeys the equality

$$\frac{g^2(x)}{2} \frac{dP^{\mathrm{st}}}{dx} - f_1(x) P^{\mathrm{st}} = 0.$$

This equality is quite natural for unbounded one-dimensional systems, whereas for multidimensional systems it also is the case when the detailed balance holds. Direct integration of the latter equality gives us the expression for the steady-state distribution

$$P^{\mathrm{st}}(x) = \frac{1}{Z} \exp\left\{ \int^{x} \frac{2 f_1(x')}{g^2(x')}\, dx' \right\}. \tag{11.28}$$

Dealing with the initial triplet {fθ, g, θ} of the general type, formula (11.28) can be rewritten as

$$P^{\mathrm{st}}(x) = \frac{1}{Z\, g^{2(1-\theta)}(x)} \exp\left\{ \int^{x} \frac{2 f_\theta(x')}{g^2(x')}\, dx' \right\} \tag{11.29}$$

by virtue of (11.18). Here the coefficient Z is specified by the normalization condition

$$\int_{-\infty}^{+\infty} P^{\mathrm{st}}(x)\, dx = 1. \tag{11.30}$$
Naturally, all these integrals should exist. Expression (11.28) will be used in the analysis of the noise-induced phase transitions in the next section. In the following two sections we consider typical examples of stochastic systems undergoing nonequilibrium phase transitions caused by nonlinear Langevin sources.
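The steady-state formula (11.29) with the normalization (11.30) can be evaluated numerically when no closed form is at hand. The following is a minimal sketch, not a definitive implementation: the function name `stationary_density`, the grid-based trapezoidal quadrature, and the test case f(x) = −x, g(x) = 1 are illustrative assumptions, and g(x) is assumed to be nonzero on the whole grid.

```python
import math

def stationary_density(f, g, theta, xs):
    """Evaluate the steady-state density (11.29) on the increasing grid xs."""
    # Exponent of (11.29): cumulative trapezoidal integral of 2 f / g^2 from xs[0].
    integrand = [2.0 * f(x) / g(x) ** 2 for x in xs]
    exponent = [0.0]
    for k in range(1, len(xs)):
        step = xs[k] - xs[k - 1]
        exponent.append(exponent[-1] + 0.5 * step * (integrand[k] + integrand[k - 1]))
    # The prefactor 1 / g^{2(1 - theta)} is folded into the exponent as a logarithm.
    log_w = [e - 2.0 * (1.0 - theta) * math.log(abs(g(x))) for e, x in zip(exponent, xs)]
    shift = max(log_w)  # guards against overflow in exp
    weights = [math.exp(v - shift) for v in log_w]
    # Normalization constant Z from (11.30), restricted to the grid.
    z = sum(0.5 * (xs[k] - xs[k - 1]) * (weights[k] + weights[k - 1])
            for k in range(1, len(xs)))
    return [w / z for w in weights]

# Usage: additive noise with f(x) = -x and g(x) = 1, for which P_st is proportional to exp(-x^2).
xs = [-6.0 + 12.0 * k / 1200 for k in range(1201)]
density = stationary_density(lambda x: -x, lambda x: 1.0, 0.5, xs)
```

For additive noise the result does not depend on θ, which makes this case a convenient sanity check before a multiplicative g(x) is supplied.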
11.5 The Verhulst Model of Third Order
This model is related to one-dimensional systems with fading dynamics near the stationary point x = 0 or between two stable points x = ±√λs (when λs > 0), where the returning force contains linear and nonlinear components like this

$$\frac{dx}{dt} = -\lambda(t)\, x - x^3, \tag{11.31}$$
with the coefficient λ(t) = λs + σξ(t) not being a constant but exhibiting random fluctuations in time. The intensity of these fluctuations is quantified by the parameter σ. When the amplitude of these fluctuations exceeds some critical value, the coefficient λ(t) changes sign and for a short time the system behavior becomes anomalous; for example, the point x = 0 becomes unstable. The Verhulst model actually describes the competition between these random events of system instability and regular fading. The corresponding governing equation is of the form [86]

$$dx = (\lambda_s x - x^3)\, dt + \sigma x_\theta\, dW(t) \tag{11.32}$$
and the parameter θ is assumed to be given in the interval 0 < θ < 1. Therefore fθ(x) = λs x − x³ and g(x) = σx hold for this system. In the following we will analyze only the properties of the stationary distribution Pst(x). Formula (11.29) together with the normalization condition (11.30) immediately gives us the desired function

$$P^{\mathrm{st}}(x) = \frac{1}{\Gamma(\beta)\, \sigma} \left( \frac{|x|}{\sigma} \right)^{2\beta - 1} \exp\left\{ -\frac{x^2}{\sigma^2} \right\} \quad (\text{for } \beta > 0), \tag{11.33}$$

where the parameter β = β(λs, σ, θ) is

$$\beta = \frac{\lambda_s}{\sigma^2} + \theta - \frac{1}{2} \tag{11.34}$$
and Γ(·) is the gamma function. Let us consider two possible cases with λs > 0 or λs < 0 individually. When λs > 0 and σ = 0 the system has one unstable stationary point x = 0 and two stable ones x = ±√λs according to (11.31). The random Langevin force of low intensity, σ ≪ 1, can only disturb the system motion near these points. In the given limit β > 1/2 holds and the distribution function is bimodal (Figure 11.5). This bimodality, however, is not due to the noise effect but is a result of the regular force structure. As the noise intensity grows, the Langevin force destroys it, and for 0 < β < 1/2 the stationary distribution Pst(x) becomes unimodal. For large values of σ, corresponding to β < 0, the system undergoes collapse indicated formally by the fact that function (11.33) has a nonintegrable singularity at x = 0. Accordingly, when the system wandering in space reaches the origin, it will never leave it. This ‘ordering’ is really due to the noise effect. In the given case the stationary distribution takes the form of the Dirac δ-function

$$P^{\mathrm{st}}(x) = \delta(x) \quad (\text{for } \beta < 0). \tag{11.35}$$

Figure 11.5 The stationary distribution function of the Ito-type Verhulst random process for different values of the noise intensity σ. (a) depicts the distribution function for σ < σc1, (b) shows it at the critical value σ = σc1, whereas (c) and (d) exhibit the distribution function in the case σc1 < σ < σc2 for two values of σ located near the boundaries of this interval. In drawing the curves the parameter λs = 1 was used, for which σc1 = 1 and σc2 = √2. (Panels: σ = 0.5, 1.0, 1.05, 1.35; axes: variable x, distribution function.)
Figure 11.5 visualizes this evolution of the stationary distribution as the intensity of the random Langevin forces increases. According to (11.33), the system is characterized by the bimodal stationary distribution if

$$\sigma < \sigma_{c1} := \sqrt{\frac{\lambda_s}{1 - \theta}}. \tag{11.36}$$

For σ ≪ 1 its extrema are located near x ≈ ±√λs. In the given case, the random Langevin force affecting the system dynamics mainly disturbs its motion. We note that for the Hänggi–Klimontovich process, regarded here as the limit θ → 1 − 0, it is the only possible behavior of the system. For the intermediate values of the noise intensity σ, namely,

$$\sigma_{c1} < \sigma < \sigma_{c2} := \sqrt{\frac{\lambda_s}{\frac{1}{2} - \theta}}, \tag{11.37}$$

the Langevin force destroys the ordering by a regular force but cannot order the system motion itself. For stochastic processes with θ ≥ 1/2 the critical value σc2 does not exist and the noise effect is purely destructive. This is true, in particular, for the Stratonovich-type processes. When θ < 1/2 and the noise intensity is quite high,

$$\sigma > \sigma_{c2}, \tag{11.38}$$
the Langevin force causes the system to collapse at the origin, which means that, after some finite time, the system will inevitably be trapped at the point x = 0. If the parameter λs < 0, the Verhulst system without noise admits only one stationary point, x = 0, which is stable. In this case a noise of low intensity cannot destroy the system collapse at the origin and the stationary distribution function is the Dirac δ-function. This is justified by the divergence of integral (11.30) for the solution (11.33) with β < 0. The latter inequality holds for small values of σ, as follows from (11.34). For a stochastic system with θ > 1/2 the Langevin force, however, destroys this collapse when its intensity σ exceeds some critical value

$$\sigma > \sigma_c^{*} := \sqrt{\frac{|\lambda_s|}{\theta - \frac{1}{2}}}. \tag{11.39}$$

Here again the Langevin force plays a destructive role. We note that the phenomena described by the given Verhulst model are mainly the effects of the order breakdown proceeding via phase transitions caused by the nonlinearity of the Langevin force. In the next section a system with the creative role of nonlinear Langevin forces will be considered.
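The regimes of the Verhulst model follow directly from the sign of β and of β − 1/2, so they are easy to tabulate. The sketch below evaluates (11.33), (11.34), (11.36), and (11.37); the function names and the regime labels are illustrative choices, not notation from the text.

```python
import math

def beta_param(lam_s, sigma, theta):
    """Parameter beta of (11.34)."""
    return lam_s / sigma ** 2 + theta - 0.5

def verhulst_density(x, lam_s, sigma, theta):
    """Stationary density (11.33); defined only for beta > 0."""
    b = beta_param(lam_s, sigma, theta)
    if b <= 0.0:
        raise ValueError("beta <= 0: the distribution collapses to delta(x)")
    if x == 0.0:
        # Integrable singularity for beta < 1/2, finite value at beta = 1/2, zero otherwise.
        if b < 0.5:
            return math.inf
        return 1.0 / (math.gamma(b) * sigma) if b == 0.5 else 0.0
    return ((abs(x) / sigma) ** (2.0 * b - 1.0)
            * math.exp(-(x / sigma) ** 2) / (math.gamma(b) * sigma))

def critical_sigmas(lam_s, theta):
    """(sigma_c1, sigma_c2) of (11.36) and (11.37) for lam_s > 0 and 0 <= theta < 1."""
    sigma_c1 = math.sqrt(lam_s / (1.0 - theta))
    # sigma_c2 does not exist for theta >= 1/2.
    sigma_c2 = math.sqrt(lam_s / (0.5 - theta)) if theta < 0.5 else None
    return sigma_c1, sigma_c2

def verhulst_regime(lam_s, sigma, theta):
    """Classify the stationary distribution by the value of beta."""
    b = beta_param(lam_s, sigma, theta)
    if b > 0.5:
        return "bimodal"
    if b > 0.0:
        return "unimodal"
    return "collapse"

# Usage with the parameters of Figure 11.5 (Ito case, lam_s = 1).
sigma_c1, sigma_c2 = critical_sigmas(1.0, 0.0)
regimes = [verhulst_regime(1.0, s, 0.0) for s in (0.5, 1.2, 2.0)]
```

For λs < 0 the same classifier reproduces the statement above: β < 0 at small σ, and β becomes positive only if θ > 1/2 and σ exceeds the value in (11.39).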
11.6 The Genetic Model
This model reflects the characteristic features in the description of population genetics. By way of an example we analyze it within a somewhat simplified formulation dealing with the variable X determined inside the interval (0, 1) with dynamics governed by the equation [86]

$$\frac{dX}{dt} = \frac{1}{2} - X + \lambda(t)\, X (1 - X),$$

where the parameter λ(t) undergoes random fluctuations in time. For the sake of simplicity we convert to the new variable x = 2X − 1 and assume the process under consideration to be of the Stratonovich type. In this case the governing equation in infinitesimals is written as

$$dx = -x\, dt + \sigma \left( 1 - x_\theta^2 \right) dW, \tag{11.40}$$

where, as previously, the parameter σ characterizes the noise intensity. Also, the system is assumed to be localized initially at some point of the interval −1 < x < 1. Here the regular force f1/2(x) describes pure damping relaxation toward the origin, x = 0, the unique stationary point being stable without the noise effect. The intensity of the random Langevin forces, g(x) = σ(1 − x²), drops essentially at the points x = ±1, which is responsible for the anomalous system behavior. In this case, (11.29) together with (11.30) gives

$$P^{\mathrm{st}}(x) = \exp\left\{ \frac{1}{2\sigma^2} \right\} \left[ K_0\!\left( \frac{1}{2\sigma^2} \right) \right]^{-1} \frac{1}{1 - x^2} \exp\left\{ -\frac{1}{\sigma^2 (1 - x^2)} \right\}, \tag{11.41}$$

where K0(. . .) is the modified Bessel function of the second kind of order 0. Analyzing this expression directly, we conclude that the distribution function Pst(x) is of unimodal form when the noise intensity is quite low, namely, σ < 1. For σ > 1, the Langevin force induces essential deviation of the system from the origin and the distribution function becomes bimodal (Figure 11.6). In particular, the function Pst(x) attains its maximum at the points

$$x_m = \pm \frac{\sqrt{\sigma^2 - 1}}{\sigma}.$$

In the given system it is the nonlinear Langevin force that causes the formation of two new phases, which is a good example of the constructive action of nonlinear random forces.
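The density (11.41) and the position of its maxima can be checked numerically. The sketch below is an illustration under stated assumptions: K0 is computed from the standard integral representation K0(z) = ∫₀^∞ exp(−z cosh t) dt with a trapezoidal rule (the Python standard library has no Bessel functions), and the helper names `bessel_k0`, `make_genetic_density`, and `genetic_modes` are hypothetical. It is meant for moderate values of σ; very small σ would overflow the prefactor.

```python
import math

def bessel_k0(z, n=4000):
    """K_0(z) for z > 0 from the integral of exp(-z cosh t) over t in (0, inf)."""
    # Beyond t_max the integrand is at most exp(-50), so the tail is dropped.
    t_max = math.acosh(max(50.0 / z, 1.0)) + 1.0
    h = t_max / n
    total = 0.5 * (math.exp(-z) + math.exp(-z * math.cosh(t_max)))
    for k in range(1, n):
        total += math.exp(-z * math.cosh(k * h))
    return total * h

def make_genetic_density(sigma):
    """Return a function x -> P_st(x) of (11.41) for the given noise intensity."""
    a = 1.0 / sigma ** 2
    norm = math.exp(0.5 * a) / bessel_k0(0.5 * a)

    def density(x):
        u = 1.0 - x * x
        if u <= 0.0:  # outside the interval -1 < x < 1
            return 0.0
        return norm * math.exp(-a / u) / u

    return density

def genetic_modes(sigma):
    """Maxima of P_st: the origin for sigma <= 1, two symmetric points otherwise."""
    if sigma <= 1.0:
        return [0.0]
    x_m = math.sqrt(sigma ** 2 - 1.0) / sigma
    return [-x_m, x_m]

# Usage: the bimodal case sigma = 2 shown in Figure 11.6.
density = make_genetic_density(2.0)
modes = genetic_modes(2.0)
```

Setting the derivative of ln Pst to zero gives x = 0 or 1 − x² = 1/σ², which is where the two branches of `genetic_modes` come from.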
Figure 11.6 The stationary distribution function of the Stratonovich-type genetic random process for different values of the noise intensity σ. (Curves: σ = 0.5, 1, 2; axes: variable x, distribution function.)

11.7 Noise-Induced Instability in Geometric Brownian Motion

Both of the examples considered in the previous two sections demonstrate the fact that the system distribution can change its form depending on the noise intensity, when the Langevin force is essentially nonlinear. At first glance it might be possible to assume that the transitions found in the distribution form are no more than ‘artifacts’ and can be eliminated by converting to the appropriate variables y = ψ(x). Indeed, this transformation of variables leads to the conversion of the probability density P(x) → p(y), namely,

$$p(y) = \frac{1}{\psi'(x)}\, P(x), \tag{11.42}$$
and the nonlinearity of the transformation y = ψ(x) can eliminate the distribution bimodality. The remaining part of this section is a counter-example of this statement. It is provided by geometric random walks already introduced in Section 5.9. The process is governed by the following equation in infinitesimals (cf. [206])

$$dx = \beta x\, dt + \sigma x_\theta\, dW(t), \tag{11.43}$$
where the system motion is considered in the half-space x > 0 with the coefficient β = ±1, and, as previously, the parameter σ quantifies the noise intensity. The parameter θ of this process is assumed to be a given constant not equal to 1/2, so 0 ≤ θ < 1/2 or 1/2 < θ ≤ 1. This model provides a rather simplified description of the birth–death processes in many biological and ecological objects. Let us convert to the new variable

$$x = \exp(y). \tag{11.44}$$

Then, by virtue of (11.22), equation (11.43) becomes

$$dy = \left[ \beta + \left( \theta - \frac{1}{2} \right) \sigma^2 \right] dt + \sigma\, dW(t). \tag{11.45}$$
Whence it follows that, if the Langevin source is rather weak, namely,

$$\sigma < \sigma_c := \sqrt{\frac{2}{|2\theta - 1|}}, \tag{11.46}$$
then its presence cannot essentially affect the system dynamics. In other words, in the y-space the system drifts either to −∞ or +∞, depending on the sign of the parameter β. In the x-space this is seen in the system tending to the origin x = 0 or to infinity with probability equal to unity as time increases. When the noise intensity exceeds the critical value, σ > σc, the system motion can change direction. This occurs when β = −1 and θ > 1/2 or β = 1 and θ < 1/2. This actually demonstrates the instability of the point x = 0 caused by multiplicative noise. Finalizing this section, we present the distribution functions p(y, t) and P(x, t) written for the random variables y and x governed by (11.45). The latter describes the standard diffusion process with constant drift, so

$$p(y, t) = \frac{1}{\sqrt{2\pi\sigma^2 t}} \exp\left\{ -\frac{\left[ (y - y_0) - \left( \beta + (\theta - \frac{1}{2}) \sigma^2 \right) t \right]^2}{2\sigma^2 t} \right\}, \tag{11.47a}$$

$$P(x, t) = \frac{1}{x \sqrt{2\pi\sigma^2 t}} \exp\left\{ -\frac{\left[ \ln(x/x_0) - \left( \beta + (\theta - \frac{1}{2}) \sigma^2 \right) t \right]^2}{2\sigma^2 t} \right\}, \tag{11.47b}$$
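The change of drift direction at σ = σc can be read off from the drift coefficient in (11.45). The sketch below evaluates that drift, the critical value (11.46), and the density (11.47b); the function names and the default x0 = 1 are illustrative assumptions.

```python
import math

def y_drift(beta, theta, sigma):
    """Drift of y = ln x in (11.45)."""
    return beta + (theta - 0.5) * sigma ** 2

def sigma_critical(theta):
    """Critical noise intensity (11.46), for beta = +1 or -1 and theta != 1/2."""
    return math.sqrt(2.0 / abs(2.0 * theta - 1.0))

def density_x(x, t, beta, theta, sigma, x0=1.0):
    """Distribution function P(x, t) of (11.47b) for x > 0 and t > 0."""
    var = sigma ** 2 * t
    shift = math.log(x / x0) - y_drift(beta, theta, sigma) * t
    return math.exp(-shift ** 2 / (2.0 * var)) / (x * math.sqrt(2.0 * math.pi * var))

# Usage: beta = -1 with the Haenggi-Klimontovich value theta = 1.
sigma_c = sigma_critical(1.0)        # sqrt(2)
weak = y_drift(-1.0, 1.0, 1.0)       # negative: the system drifts towards x = 0
strong = y_drift(-1.0, 1.0, 2.0)     # positive: the noise reverses the drift
```

At σ = σc the drift in y vanishes, so the y-process reduces to pure diffusion.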
where x0 = exp(y0) specifies the initial position of the particle and (11.42) has also been taken into account. Figure 11.7 visualizes these functions for different moments of time. It also should be noted that the geometric random walks have quite anomalous properties. To demonstrate this fact, let us analyze the time dependence of the moments of the variable x

$$M_p(t) := \int_0^{\infty} x^p P(x, t)\, dx \quad \text{for } p = 1, 2, \ldots \tag{11.48}$$

Substituting (11.47) into (11.48), direct integration yields

$$M_p(t) = x_0^p \exp\left\{ \left[ \beta + \left( \theta + \frac{p - 1}{2} \right) \sigma^2 \right] p\, t \right\}. \tag{11.49}$$
So, when the effect of noise is ignorable, σ → 0, the system either goes to zero or to infinity, depending on the sign of β. For the Langevin force of finite intensity, σ > 0, there is always an order p so that

$$p > 1 - 2 \left( \frac{\beta}{\sigma^2} + \theta \right)$$

and the moment Mp(t) diverges as t → ∞. In particular, even for θ = 0 and β = 0, the first-order moment M1 does not depend on time, whereas the second moment M2(t) goes to infinity as exp{σ²t}. In other words, even if the mean value of the random variable x is fixed, its random fluctuations increase without bound as time increases.
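The moment formula (11.49) can be cross-checked against a direct quadrature of (11.48) written in the variable y = ln x. The sketch below is illustrative only: the function names are hypothetical, the prefactor `x0 ** p` equals unity for the initial position x0 = 1 used in Figure 11.7, and the integration window is assumed wide enough for moderate values of pσ√t.

```python
import math

def moment_closed_form(p, t, beta, theta, sigma, x0=1.0):
    """Moment M_p(t) according to (11.49)."""
    return x0 ** p * math.exp((beta + (theta + 0.5 * (p - 1)) * sigma ** 2) * p * t)

def moment_quadrature(p, t, beta, theta, sigma, x0=1.0, n=4000, half_width=12.0):
    """Trapezoidal estimate of (11.48) as the integral of exp(p y) p(y, t) over y."""
    mean = math.log(x0) + (beta + (theta - 0.5) * sigma ** 2) * t
    std = sigma * math.sqrt(t)
    lo = mean - half_width * std
    h = 2.0 * half_width * std / n
    total = 0.0
    for k in range(n + 1):
        y = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        gauss = math.exp(-(y - mean) ** 2 / (2.0 * std ** 2)) / math.sqrt(2.0 * math.pi * std ** 2)
        total += weight * math.exp(p * y) * gauss
    return total * h

def smallest_divergent_order(beta, theta, sigma):
    """Smallest integer p >= 1 satisfying p > 1 - 2 (beta / sigma^2 + theta)."""
    threshold = 1.0 - 2.0 * (beta / sigma ** 2 + theta)
    return max(1, math.floor(threshold) + 1)

# Usage: theta = 0 and beta = 0, the case singled out in the text.
m1 = moment_closed_form(1, 3.0, 0.0, 0.0, 1.0)   # stays equal to 1
m2 = moment_closed_form(2, 3.0, 0.0, 0.0, 1.0)   # grows as exp(sigma^2 t)
```

Even for a decaying process (β = −1, θ = 0, σ = 1) the helper `smallest_divergent_order` returns a finite order, here p = 4, beyond which the moments grow in time.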
Figure 11.7 The distribution function for geometric random walks governed by (11.43) for different times {ti} such that σ²t1 = 0.1, σ²t2 = 1, σ²t3 = 2, and σ²t4 = 3. The curves in the figure are labeled with the corresponding numbers and the values used, ∆ = β/σ² + θ − 1/2, are also shown. The system was initially located at x0 = 1. (Panels: ∆ = 2, 1, 0, −1; axes: variable x, distribution function P(x, t), both on logarithmic scales.)
11.8 System Dynamics with Stagnation
Now we pass on to the consideration of another type of phase transition induced by the action of noise, namely, phase transitions caused by dynamical traps. These occur in systems where the kinetic coefficient Ω(h), see (11.1), depends essentially on the system state h. Originally, the development of the dynamical trap concept [247] or, more precisely, the dynamical traps of stagnation type were stimulated by a wide class of intricate cooperative phenomena found in the dynamics of various systems, e.g. vehicle ensembles moving on highways, fish and bird swarms, stock markets, etc. (see [77] for a review). The background of the models to be developed is as follows. People, as elements of a certain system, cannot individually control all the governing parameters. Therefore one chooses a few crucial parameters and mainly focuses attention on them. When equilibrium with respect to these crucial parameters is attained, human activity slows down, in turn retarding the system
dynamics as a whole. For example, when driving a car, the control over the relative velocity is of prime importance in comparison with the correction of the headway distance. So, under normal conditions a driver should first reduce the relative velocity between his car and the car ahead and only then optimize the headway. In markets, the deviation from the supply-demand equilibrium, reflected in price changes, also has to exhibit faster time variations than, e.g. the production cost determined by technological capabilities. In physical systems this situation can also be found, e.g. in Pd-metal alloys charged with hydrogen where the structure relaxation exhibits non-monotonic dynamics [8,92]. In these alloys hydrogen atoms and nonequilibrium vacancies form long-lived complexes essentially affecting the structure relaxation. Their generation and disappearance governed, in turn, by the structure evolution causes the non-monotonic dynamics which can be described in terms of different time scales. These speculations have led us to the concept of dynamical traps, that is, a certain ‘low’ dimensional region (trap region) in the phase space where all the main kinetic coefficients exhibit anomalous behavior [129, 130, 132, 133]. As a result, all the time scales of the system dynamics in the trap region become large in comparison with their values outside it. The latter effect, in turn, causes long-lived states to appear in the system. In time patterns these states manifest themselves as a sequence of fragments within which at least one of the phase variables remains approximately constant. These fragments are continuously connected by sharp jumps of the given variable. Paper [133] demonstrated that such long-lived states do exist in dense traffic flow and proposed some model of dynamical traps to explain the observed features of the car velocity time series (Figure 11.8). Papers [129] and [132] simplified this model to single out the dynamical trap effect on its own. 
Paper [130] studied a single oscillator with dynamical traps and demonstrated numerically that the white noise can cause the distribution function of the oscillator position to convert from the unimodal form to the bimodal one. This is due to the fact that, inside the trap region, the regular ‘force’ is only depressed, rather than changing the sign, and the system motion is governed by a random Langevin force. A first step towards this effect in oscillator ensembles was made in [129].

Figure 11.8 Illustration of the dynamical trap effect: (a) depicts the phase space with the region of the dynamical traps where the system motion is stagnated; (b) shows the time pattern for one of the phase state variables. (Labels in (a): dynamical trap locus (stagnation dynamics region), parameter under control, ‘hidden’ parameter; in (b): fragments of stagnated system motion, phase variable versus time.)
11.9 Oscillator with Dynamical Traps
In order to elucidate the mechanism of such nonequilibrium phase transitions, this section analyzes a model derived from the damped harmonic oscillator where the dynamical trap region is a narrow layer in the phase space, inside of which the particle velocity is equal to zero. Namely, the following system is under consideration

$$\frac{dx}{dt} = v, \tag{11.50}$$

$$\frac{dv}{dt} = -\omega_0^2\, \Omega(v) \left[ x + \frac{\sigma}{\omega_0} v \right] + \epsilon_0\, \xi_v(t). \tag{11.51}$$

Here x and v are the dynamical variables treated as the coordinate and velocity of a certain particle, ω0 is the circular frequency of oscillations provided the system is not affected by other factors, σ is the damping decrement, and the term ε0 ξv(t) in (11.51) is a random Langevin force of intensity ε0 proportional to the white noise ξv(t),

$$\langle \xi_v(t) \rangle = 0, \qquad \langle \xi_v(t)\, \xi_v(t') \rangle = \delta(t - t'), \tag{11.52}$$

with unit amplitude. At Ω(v) = 1 and σ = 0 the system of equations (11.50)–(11.51) corresponds to the classical harmonic oscillator. Here we treat another case, where the function Ω(v) describes the dynamical-trap effect in the vicinity of v = 0. The following simple ansatz

$$\Omega(v) = \frac{v^2 + \Delta^2 \vartheta_t^2}{v^2 + \vartheta_t^2}, \tag{11.53}$$
is adopted, where the parameter ϑt characterizes the thickness of the trap region and the parameter ∆ ≤ 1 measures the trapping efficacy. When ∆ = 1 the dynamical-trap effect is negligible and for ∆ = 0 it is most effective. It should be pointed out that the governing equations (11.50) and (11.51) are written in terms of time derivatives because in the given case the noise is additive and the result is independent of the random process parameter θ. The characteristic features of the given system are illustrated in Figure 11.9. The shaded area shows the trap region where the regular force, the first term in (11.51), is depressed. The latter is described by the factor Ω(v) taking small values in the trap region (for ∆ ≪ 1). Inside the trap region the system is mainly governed by the random Langevin force. Outside the trap region it is approximately harmonic.

Figure 11.9 Characteristic structure of the phase space {x, v}. The shaded area represents the trap region where the regular force is depressed and the system motion is random. The regular force depression is described by the factor Ω(v) illustrated in (a). The essence of the trap effect on the system dynamics is shown in (b). Outside the trap region the system dynamics is mainly regular. (Panel (a): trap factor Ω(u) versus relative velocity u = v/ϑt; panel (b): trap region, region of free motion, direction of regular motion, and distribution of residence time in the {x, v} plane.)

In order to analyze the system dynamics, a dimensionless time t and the dynamical variables η and u are used. Namely, the time t is measured in units of 1/ω0, that is, t → t/ω0, and the units of the coordinate x and the velocity v are ϑt/ω0 and ϑt, respectively. So, by introducing the new variables

$$\eta = \frac{x\, \omega_0}{\vartheta_t} \quad \text{and} \quad u = \frac{v}{\vartheta_t},$$

the dynamical equations (11.50), (11.51) read (for the dimensionless time t)

$$\frac{d\eta}{dt} = u, \qquad \frac{du}{dt} = -\Omega[u]\, (\eta + \sigma u) + \epsilon\, \xi(t), \tag{11.54}$$

where the noise ξ(t) obeys conditions like equalities (11.52), the parameter ε is ε = ε0/(√ω0 ϑt), and the function Ω[u] is given by

$$\Omega[u] = \frac{u^2 + \Delta^2}{u^2 + 1}.$$
Without noise, this system has only one stationary point {η = 0, u = 0}, which is stable because the system possesses a Lyapunov function

$$H(\eta, u) = \frac{\eta^2}{2} + \frac{u^2}{2} + \frac{1 - \Delta^2}{2} \ln\left[ \frac{u^2 + \Delta^2}{\Delta^2} \right]. \tag{11.55}$$

This Lyapunov function attains the absolute minimum at the point {η = 0, u = 0} and obeys the inequality

$$\frac{dH(\eta, u)}{dt} = -\sigma u^2 < 0 \quad \text{for } u \neq 0. \tag{11.56}$$
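Inequality (11.56) rests on the identity ∂H/∂u = u/Ω[u], which cancels the trap factor along the noiseless flow. The sketch below checks this numerically with central finite differences; the function names and the step size are illustrative assumptions, and ∆ > 0 is required for the logarithm in (11.55).

```python
import math

def omega(u, delta):
    """Dimensionless trap factor entering (11.54)."""
    return (u * u + delta * delta) / (u * u + 1.0)

def lyapunov(eta, u, delta):
    """Lyapunov function H(eta, u) of (11.55), for delta > 0."""
    return (0.5 * eta * eta + 0.5 * u * u
            + 0.5 * (1.0 - delta ** 2) * math.log((u * u + delta ** 2) / delta ** 2))

def lyapunov_rate(eta, u, sigma, delta, h=1e-6):
    """dH/dt along the noiseless flow of (11.54), gradient by central differences."""
    dh_deta = (lyapunov(eta + h, u, delta) - lyapunov(eta - h, u, delta)) / (2.0 * h)
    dh_du = (lyapunov(eta, u + h, delta) - lyapunov(eta, u - h, delta)) / (2.0 * h)
    deta_dt = u
    du_dt = -omega(u, delta) * (eta + sigma * u)
    return dh_deta * deta_dt + dh_du * du_dt

# Usage: an arbitrary phase-space point; the rate should equal -sigma * u^2.
rate = lyapunov_rate(0.7, -0.4, 0.1, 0.2)
expected = -0.1 * (-0.4) ** 2
```

The agreement holds for any point (η, u), which is what makes H a Lyapunov function for σ > 0 and a first integral for σ = 0.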
In particular, if σ = 0 and ε = 0, then function (11.55) is the first integral of the system. In the following, the values σ and ε will be treated as small parameters. The dynamics of system (11.54) was analyzed numerically using a high-order stochastic Runge–Kutta method [23] (see also [24]). The distribution function P(η, u) was calculated numerically by finding the cumulative time during which the system is located inside a given mesh cell on the (η, u)-plane for a path of a sufficiently long time of motion, t ≈ 500 000. The mesh size was chosen to be about 1% of the dimension characterizing the system location on the (η, u)-plane. The evolution of the distribution function P(η, u) is shown in Figure 11.10 in the form of level contours dividing the variation scale into ten equal parts. Part (a) corresponds to the case of ∆ = 1 where the trap effect is absent and the distribution function is unimodal; (c) illustrates the case when the distribution function has a well pronounced bimodal shape, shown also in Figure 11.11. Comparing (a), (b) and (c) in Figure 11.10, it becomes evident that there is a certain relation between the parameters ∆, σ, and ε at which the system undergoes a second-order phase transition, which manifests itself as a change in the shape of the phase space density P(η, u) from unimodal to bimodal. In particular, for σ = 0.1 and ε = 0.1 the critical value of the parameter ∆ is ∆c(σ, ε) ≈ 0.5, as can be seen in part (b). To understand the mechanism of the noise-induced phase transition observed numerically in the given system, consider a typical fragment of the system motion through the trap region for ∆ ≪ 1 as shown in Figure 11.12. When it goes into the trap region Qt, −ϑt ≲ v ≲ ϑt, the regular force Ω[u](η + σu) containing the trap factor Ω[u] and governing the regular motion becomes small. Hence, inside this region the system dynamics becomes random due to the remaining weak Langevin force εξ(t).
However, the boundaries ∂+Qt (where v ∼ ϑt) and ∂−Qt (where v ∼ −ϑt) are not identical in properties with respect to the system motion. At the boundary ∂+Qt the regular force leads the system inwards to the trap region Qt, whereas at the boundary ∂−Qt it causes the system to leave the region Qt. Outside the trap region Qt the regular force is dominant. So, from the standpoint of the system motion inside the region Qt, the boundary ∂+Qt is ‘reflecting’ whereas the boundary ∂−Qt is ‘absorbing’. As a result, the distribution of the residence time at different points in the region Qt should be asymmetric, as schematically shown in Figure 11.9(b). This asymmetry is also seen in the distribution function P(η, u) obtained numerically. Its maxima are located at the points with nonzero values of the velocity, clearly visible in Figure 11.10(d). Therefore, while the system is located inside the trap region, its mean velocity must be positive and the system tends to move away from the origin. This effect gives rise to an increase in the ‘energy’ H(η, u). Outside the trap region the ‘energy’ H(η, u) decreases according to (11.56). So, when the former effect becomes sufficiently strong, that is, the random force intensity exceeds a certain critical value, ε > εc(∆, σ), the distribution function P(η, u) becomes bimodal.
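A path of system (11.54) can be generated with a few lines of code. The sketch below uses the plain Euler–Maruyama scheme rather than the high-order stochastic Runge–Kutta method cited in the text, so it is a lower-order stand-in suitable only for illustration; the function name, step size, and seeding are assumptions. Because the noise is additive, the scheme does not depend on the parameter θ.

```python
import math
import random

def simulate_trap_oscillator(delta, sigma, eps, eta0=0.0, u0=0.0,
                             dt=0.01, n_steps=100_000, seed=0):
    """Euler-Maruyama path of (11.54); returns a list of (eta, u) pairs."""
    rng = random.Random(seed)            # private generator for reproducibility
    noise_scale = eps * math.sqrt(dt)    # standard deviation of the noise increment
    eta, u = eta0, u0
    path = []
    for _ in range(n_steps):
        trap = (u * u + delta * delta) / (u * u + 1.0)   # trap factor Omega[u]
        d_eta = u * dt
        d_u = -trap * (eta + sigma * u) * dt + noise_scale * rng.gauss(0.0, 1.0)
        eta, u = eta + d_eta, u + d_u
        path.append((eta, u))
    return path

# Usage: a short noisy path with the parameters quoted for Figure 11.10(c).
path = simulate_trap_oscillator(delta=0.2, sigma=0.1, eps=0.1, n_steps=5000, seed=1)
```

Accumulating the residence time of such a path in cells of the (η, u)-plane gives an estimate of P(η, u); reproducing the bimodal shape would require far longer runs than this usage example.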
Figure 11.10 Evolution of the distribution function P(η, u) (shown by level contours) as the parameter ∆ decreases. In numerical calculations the values σ = 0.1 and ε = 0.1 were used; (d) depicts only one maximum of the whole distribution function. (Panels (a)–(d): ∆ = 1.0, 0.5, 0.2, 0.1; axes: coordinate η, velocity u.)
11.10 Dynamics with Traps in a Chain of Oscillators
The previous section was devoted to the mechanism via which phase transitions induced by dynamical traps arise. This section demonstrates that dynamical traps do in fact give rise to cooperative phenomena that can be regarded as the formation of new phases.
Figure 11.11 The form of the distribution function P(η, u) for the parameters σ = 0.1, ε = 0.1, and ∆ = 0.2. (Axes: coordinate η, velocity u, distribution function.)

Figure 11.12 A typical fragment of the system path going through the trap region. The parameters σ = 0.1, ε = 0.1, and ∆ = 0.01 were used in numerical simulations in order to make the trap effect more pronounced. (Annotations: decrease of ‘energy’ due to regular damping; increase of ‘energy’ due to time residence asymmetry in trap region; axes: coordinate η, velocity u.)
We analyze a one-dimensional ensemble of ‘lazy’ particles. These particles are characterized by their positions and velocities {xi, vi} as well as possessing some motives for active behavior. Particle i ‘wishes’ to get the ‘optimal’ middle position between the nearest neighbors. Thus, one of the stimuli causing it to accelerate or decelerate is the difference $\eta_i = x_i - \frac{1}{2}(x_{i-1} + x_{i+1})$ provided its relative velocity $\vartheta_i = v_i - \frac{1}{2}(v_{i-1} + v_{i+1})$ with respect to the pair of nearest neighbors is sufficiently low. Otherwise, especially if particle i is currently located near the optimal position,
it has to eliminate the relative velocity ϑi, being the other stimulus for particle i to change its state of motion. Since a particle cannot predict the dynamics of its neighbors, it has to regard them as moving uniformly with the current velocities. The acceleration dvi/dt is determined directly by both stimuli. The model to be formulated combines both of these stimuli within a linear approximation (ηi + σϑi), where σ is the relative weight of the second stimulus. When, however, the relative velocity ϑi attains sufficiently low values, the current situation for particle i cannot become worse, at least, it cannot occur soon. In this case particle i ‘prefers’ not to change the state of motion and to retard the correction of its relative position. This assumption leads to the appearance of some common cofactor Ω(ϑi) in the corresponding governing equation as

$$\frac{dv_i}{dt} \propto -\Omega(\vartheta_i) \left( \eta_i + \sigma \vartheta_i \right).$$

The cofactor Ω(ϑ) has to satisfy the inequality Ω(ϑ) ≪ 1 for ϑ ≪ ϑc and Ω(ϑ) ≈ 1 when ϑ ≫ ϑc, where ϑc is a critical value quantifying the particle’s ‘perception’ of speed. The inclusion of such a factor is the implementation of the dynamical-trap effect. Now we will describe the model. The following linear chain of N point-like particles is considered (Figure 11.13). Each internal particle i ≠ 1, N can freely move along the x-axis interacting with the nearest neighbors, namely, particles i − 1 and i + 1 interact via ideal elastic springs with some quasi-viscous friction. The dynamics of this particle ensemble is governed by the collection of coupled equations

$$\frac{dx_i}{dt} = v_i, \tag{11.57}$$

$$\frac{dv_i}{dt} = -\Omega(\vartheta_i, h_i) \left[ \eta_i + \sigma \vartheta_i + \sigma_0 v_i \right] + \epsilon\, \xi_i(t). \tag{11.58}$$
Figure 11.13 The particle ensemble under consideration and the structure of the phase space. The darkened region depicts the points where the dynamical-trap effect is pronounced. For the relationship between the variables xi, vi, hi, and ϑi see (11.60) and (11.61). (Labels: region of dynamical traps with Ω(ϑ) ∝ ϑ², regular dynamics, stochastic dynamics; axes: h and ϑ.)
Here for i = 2, 3, . . . , N − 1 the variables ηi and ϑi, to be called the symmetry distortion and the distortion rate, respectively, are specified as

$$\eta_i = x_i - \tfrac{1}{2} (x_{i-1} + x_{i+1}), \tag{11.59}$$

$$\vartheta_i = v_i - \tfrac{1}{2} (v_{i-1} + v_{i+1}), \tag{11.60}$$

the mean distance hi between the particles at the point xi, by definition, is

$$h_i = \tfrac{1}{2} (x_{i+1} - x_{i-1}), \tag{11.61}$$
and {ξi(t)} is the collection of mutually independent white-noise sources of unit amplitude, so

$$\langle \xi_i(t) \rangle = 0, \qquad \langle \xi_i(t)\, \xi_{i'}(t') \rangle = \delta_{ii'}\, \delta(t - t'). \tag{11.62}$$

Also, the parameter ε is the noise amplitude, σ is the viscous friction coefficient of the springs, and σ0 is a small parameter that can be treated as some viscous friction related to the particle motion with respect to the given physical frame. It is introduced to prevent the system motion as a whole reaching an infinitely high velocity. The symbol ⟨. . .⟩ denotes averaging over all the noise realizations; δii′ and δ(t − t′) are the Kronecker symbol and the Dirac δ-function. The factor Ω(ϑi, hi) is due to the effect of dynamical traps and, following our previous ansatz, we write

$$\Omega(\vartheta, h) = \frac{\vartheta^2 + \Delta^2(h)}{\vartheta^2 + 1}, \tag{11.63}$$

where the function ∆(h) of the form

$$\Delta^2(h) = \Delta^2 + \left( 1 - \Delta^2 \right) \frac{h_0^2}{h^2 + h_0^2} \tag{11.64}$$
is used. The parameter ∆ ∈ [0, 1] quantifies the dynamical trap influence and the spatial scale h0 specifies the small distances within which the trap effect is depressed, so for h ≪ h0 its value is ∆(h) ≈ 1, whereas for h ≫ h0/∆ it is ∆(h) ≈ ∆. If this parameter is ∆ = 1, then the dynamical traps do not exist at all. In the opposite case, ∆ ≪ 1, their influence is pronounced inside a certain neighborhood of the h-axis (trap region) whose thickness is about unity (Figure 11.17). The temporal and spatial scales have been chosen so that the thickness of the trap region is about unity, and the oscillation circular frequency is also equal to unity outside the trap region. The terminal particles, i = 1 and i = N, are assumed to be fixed, so

$$x_1(t) = 0, \qquad x_N(t) = (N - 1)\, l, \tag{11.65}$$
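The regular part of the right-hand side of (11.58) can be assembled from definitions (11.59)–(11.61) and the trap factor (11.63), (11.64). The sketch below is illustrative: the function names are hypothetical, particle indices are zero-based (so the text's particle i corresponds to list index i − 1), the noise term and the collision rule are left out, and the fixed terminal particles simply receive zero acceleration.

```python
def trap_factor(rate, h, delta, h0):
    """Trap factor Omega(theta, h) of (11.63) with Delta(h) taken from (11.64)."""
    delta_sq = delta ** 2 + (1.0 - delta ** 2) * h0 ** 2 / (h ** 2 + h0 ** 2)
    return (rate ** 2 + delta_sq) / (rate ** 2 + 1.0)

def regular_accelerations(x, v, sigma, sigma0, delta, h0):
    """Deterministic part of dv_i/dt in (11.58) for every particle of the chain."""
    n = len(x)
    acc = [0.0] * n                      # terminal particles stay fixed, see (11.65)
    for i in range(1, n - 1):
        eta = x[i] - 0.5 * (x[i - 1] + x[i + 1])     # symmetry distortion (11.59)
        rate = v[i] - 0.5 * (v[i - 1] + v[i + 1])    # distortion rate (11.60)
        h = 0.5 * (x[i + 1] - x[i - 1])              # mean distance (11.61)
        acc[i] = -trap_factor(rate, h, delta, h0) * (eta + sigma * rate + sigma0 * v[i])
    return acc

# Usage: a homogeneous chain at rest, then the same chain with one displaced particle.
spacing = 1.0
x = [i * spacing for i in range(7)]
v = [0.0] * 7
at_rest = regular_accelerations(x, v, sigma=0.1, sigma0=0.01, delta=0.2, h0=0.5)
x[3] += 0.1
displaced = regular_accelerations(x, v, sigma=0.1, sigma0=0.01, delta=0.2, h0=0.5)
```

In the homogeneous state all distortions vanish, so the chain is stationary; displacing one particle produces a restoring acceleration on it and opposite pushes on its two neighbors.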
where l is the particle spacing in the homogeneous chain. The particles are treated as mutually impermeable ones. Therefore, when the coordinates xi and xi+1 of an
internal particle pair become identical, an absolutely elastic collision is assumed to happen, so if $x_i(t) = x_{i+1}(t)$ at a certain time t, then the instantaneous velocity exchange

$$v_i(t + 0) = v_{i+1}(t - 0), \qquad v_{i+1}(t + 0) = v_i(t - 0) \tag{11.66}$$
comes into being. Multiparticle collisions are ignored. The system of equations (11.57)–(11.66) forms the model under consideration. We note again that the Langevin sources enter this model linearly, so the governing equations admit the representation with time derivatives. The stationary point $x_i^{\mathrm{st}} = (i - 1)\, l$ is stable with respect to small perturbations; this follows from the linear stability analysis with respect to perturbations of the form

$$\delta x_i(t) \propto \exp\{ \gamma t + \mathrm{i} k l (i - 1) \}, \tag{11.67}$$
where γ is the instability increment, k is the wave number, and the symbol i in the exponent denotes the imaginary unit. The boundary conditions (11.65) are fulfilled by assuming the wave number k to take the values km = πm/[(N − 1)l] for m = ±1, ±2, . . . , ±(N − 2). For large values of the particle number N the parameter k can be treated as a continuous variable. Using the standard technique, the system of equations (11.57), (11.58) for perturbation (11.67) leads us to the following relation between the instability increment γ(k) and the wave number k:

$$\gamma = -\Omega_0 \left[ \frac{1}{2} \sigma_0 + \sigma \sin^2 \frac{kl}{2} \right] + \mathrm{i} \sqrt{ 2 \Omega_0 \sin^2 \frac{kl}{2} - \Omega_0^2 \left[ \frac{1}{2} \sigma_0 + \sigma \sin^2 \frac{kl}{2} \right]^2 }. \tag{11.68}$$

Ansatz (11.63) has been used in deriving (11.68), enabling us to set Ω0 = Ω(0, l) = ∆²(l). Whence it follows that Re γ(k) < 0 for k > 0, so the homogeneous state of the chain is stable with respect to infinitely small perturbations of the particle arrangement. The nonlinear dynamics of the given system has been analyzed numerically. Integration of the stochastic differential equations (11.57) and (11.58) was performed using the E2 high-order stochastic Runge–Kutta method [23,24]. Particle collisions were implemented by analyzing a linear approximation of the system dynamics within one elementary step of the numerical procedure and finding the time at which a collision has happened. Then this step, treated as a complex one, was repeated. An integration time step of 0.02 was used; the results obtained were checked to be stable with respect to decreasing the integration time step. An ensemble of 1000 particles was studied in order to make the statistics sufficient and to avoid the strong effect of the boundary conditions. The integration time T was chosen from 5000 to 8000 time units in order to make the calculated distributions stable. At the initial stage all the particles were distributed uniformly in space, whereas their velocities were randomly and uniformly distributed within the unit interval.
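The dispersion relation (11.68) is the root of a quadratic equation, which offers a quick numerical check. Linearizing (11.57), (11.58) about the homogeneous state with δηi = 2 sin²(kl/2) δxi gives γ² + Ω0(σ0 + 2σ sin²(kl/2))γ + 2Ω0 sin²(kl/2) = 0; the sketch below evaluates (11.68) and the residual of this quadratic. The function names and parameter values are illustrative assumptions.

```python
import cmath
import math

def gamma_increment(k, l, sigma, sigma0, omega0):
    """Instability increment gamma(k) according to (11.68)."""
    s2 = math.sin(0.5 * k * l) ** 2
    a = omega0 * (0.5 * sigma0 + sigma * s2)
    # cmath.sqrt keeps the formula valid when the radicand becomes negative.
    return -a + 1j * cmath.sqrt(2.0 * omega0 * s2 - a * a)

def quadratic_residual(gamma, k, l, sigma, sigma0, omega0):
    """Residual of the linearized characteristic equation for a given gamma."""
    s2 = math.sin(0.5 * k * l) ** 2
    return gamma ** 2 + omega0 * (sigma0 + 2.0 * sigma * s2) * gamma + 2.0 * omega0 * s2

# Usage: a few wave numbers for a chain with l = 1 and Omega_0 = Delta^2(l) = 0.25.
params = dict(l=1.0, sigma=0.1, sigma0=0.01, omega0=0.25)
increments = [gamma_increment(k, **params) for k in (0.1, 1.0, 2.0, 3.0)]
```

For every wave number the real part of γ is negative, in line with the stability of the homogeneous state stated above.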
The results of numerical simulation were used to evaluate the following partial distributions
$$P(z) = \frac{1}{(N - 2M)(T - T_0)} \sum_{i=M}^{N-M} \int_{T_0}^{T} dt\, \delta(z - z_i(t)), \tag{11.69}$$
where the time dependence $z_i(t)$ describes the dynamics of one of the variables $\eta_i(t)$, $\vartheta_i(t)$, and $v_i(t)$ ascribed to particle i. Here z is a given point of the space $R_z$ describing the symmetry distortion η, the distortion rate ϑ, and the particle velocity v, respectively. The variables {η, ϑ, v} enable one to represent the phase portrait of the system dynamics within the space $R_\eta \times R_\vartheta \times R_v$ or its subspace; N is the total number of particles in the ensemble, and M is the number of particles located near each of its boundaries. These are excluded from consideration in order to weaken the possible effect of the specific boundary conditions. The same is true for the lower boundary of time integration $T_0$; its value is chosen to eliminate the effect of the specific initial conditions. The integration over time in (11.69) was implemented numerically by direct summation of the time series obtained. The partition of the corresponding space $R_z$ was chosen such that the results are practically independent of the cell size. The value of M was likewise chosen by requiring the result to be stable with respect to doubling M. The values M ∼ 50 and $T_0$ ∼ 500–1000 were used.

Let us first discuss local properties of these ensembles. The term ‘local’ means that the corresponding state variable can take practically independent values when the particle index i changes by one or two. The variable $\eta_i$ (expression (11.59)) may be regarded in this manner as it describes the symmetry of the particle arrangement in space. When $\eta_i = 0$, particle i takes the middle position between its nearest neighbors, particles i − 1 and i + 1. A nonzero value of $\eta_i$ denotes its deviation from this position; in other words, a local distortion of the ensemble symmetry. This was the reason for the name used for the variables $\eta_i$ as well as the variables $\vartheta_i = d\eta_i/dt$.
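The estimate (11.69) of the partial distributions amounts to a normalized histogram of the recorded time series. A minimal sketch in Python with NumPy follows; the array layout, the function name `partial_distribution`, and the use of sample counts in place of the time T0 are assumptions made for illustration and are not taken from the original simulation code.

```python
import numpy as np

def partial_distribution(z, n_boundary, n_transient, bins=50):
    """Histogram estimate of P(z) in the spirit of (11.69).

    z           -- array of shape (n_samples, N); z[k, i] is the value of the
                   chosen variable (eta, theta or v) of particle i at sample k
    n_boundary  -- M, the number of particles dropped at each end of the chain
    n_transient -- number of initial samples dropped (plays the role of T0)
    """
    n_particles = z.shape[1]
    # Zero-based slice keeping N - 2M particles and the samples after the transient.
    kept = z[n_transient:, n_boundary:n_particles - n_boundary]
    # density=True divides by the number of kept values and by the cell width,
    # mirroring the prefactor 1 / [(N - 2M)(T - T0)] for equally spaced samples.
    density, edges = np.histogram(kept.ravel(), bins=bins, density=True)
    return density, edges

# Synthetic stand-in data (hypothetical): 2000 samples for 100 particles.
rng = np.random.default_rng(1)
series = rng.normal(size=(2000, 100))
density, edges = partial_distribution(series, n_boundary=5, n_transient=200)
```

Re-running the function with a different `bins` value, or with `n_boundary` doubled, is one way to repeat the stability checks described above.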
Figure 11.14 shows the distributions of the variables η and ϑ depending on the dissipation rate σ and the initial distance l between particles, that is, their mean density. In the case of weak dissipation, the distribution function P(η) of the symmetry distortion possesses two maxima, matching the effect described in the previous section. Noise makes the uniform particle distribution unstable and the particles spend most of the time in the vicinity of one or the other neighbor. This leads to the bimodal distribution of the symmetry distortion η. After entering the region of dynamical traps the particle motion stagnates, whereas outside it particles move relatively fast. This fact is reflected in the distribution of the distortion rate ϑ, which is found to contain two components of different scales. The narrow component is due to the particle motion inside the trap region and should be practically independent of the mean distance between particles. By contrast, the wide one depends markedly on the particle density because it matches the fast motion of particles outside the trap region and, thus, has to be affected by their relative dynamics. This effect is clearly demonstrated in Figure 11.14, which also depicts the corresponding properties of the particle paths.
[Figure 11.14: four upper panels showing the normalized distribution functions P(η)/Pmax versus the symmetry distortion η and P(ϑ)/Pmax versus the distortion rate ϑ for weak dissipation (σ = 0.09, σ0 = 0.01) and strong dissipation (σ = 0.99, σ0 = 0.01), with insets magnifying the central peaks in the strong-dissipation panels; four lower panels showing paths on the plane of the distortion rate ϑ and the symmetry distortion η. Curves are labeled 1 and 2.]

Figure 11.14 The distribution functions of the symmetry distortion η and the distortion rate ϑ for the 1000-particle ensemble with low (l = 50, label 1) and high (l = 5, label 2) density and weak (σ ≈ 0.1) and strong (σ ≈ 1.0) dissipation. The lower four windows depict characteristic path fragments of duration 1000 time units formed by a single particle with index i = 500 on the phase plane {η, ϑ}, which was chosen due to its middle position in the given ensemble. The other parameters used are the noise amplitude ε = 0.1, the trap-effect measure Δ = 0.1, the small regularization friction coefficient σ0 = 0.01 and the regularization spatial scale h0 = 0.25. The time interval within which the data were averaged changed from 2000 to 5000 in order to make the obtained distributions stable.
In the case of strong dissipation, σ ≈ 1.0, the situation changes dramatically, although the characteristic scales of the corresponding distributions turn out to be of the same order of magnitude. Now the distribution function P(η) of the symmetry distortion has only one maximum at η = 0; however, its form is characterized by two scales. In other words, it looks like a sum of two monoscale components. One of these is rather wide; its width is about the same as that obtained for the corresponding particle ensemble with weak dissipation. This component exhibits a marked dependence on the particle density, enabling us to relate it to the particle motion outside the trap region. The other is characterized by an extremely narrow and sharp form shown in detail in the inset of Figure 11.14 for the dense particle ensemble. Its sharpness leads us to the assumption that ‘many-particle’ effects in such systems with dynamical traps cause the symmetrical state to be singled out from the other possible states with regard to the system properties.

By contrast, the distortion rate behaves in a similar way to the previous case except for some details. When the mean particle density is high (l = 5) the wide component of the distortion rate distribution disappears and only the narrow one remains, with the latter having a quasi-cusp form ∝ exp{−|ϑ|}. For the system with low density, the peak of the distortion rate distribution splits into two small spikes. These features can be explained by referring to the lower frames in Figure 11.14, which exhibit typical path fragments formed by the motion of a single particle on the {η, ϑ} plane. Roughly speaking, three motion types can be singled out: stagnation inside a narrow neighborhood of the origin {η = 0, ϑ = 0}; slow wandering inside the trap region that, on average, follows a line with a finite positive slope; and fast motion outside the trap region. The fast-motion fragments typically start from an arbitrary point of the slow-motion region and lead to a certain neighborhood of the origin. It seems that systems with a low density of particles can go sufficiently far from the origin and, during the fast motion, rarely come into the stagnation region. As a result, the distortion rate distribution function is of a two-scale form and contains two spikes on the peak.
In the case of high density, the fast motion is suppressed substantially and the system migrates mainly in the slow-motion region, entering the stagnation region many times. Thus, the distortion-rate distribution converts into a single-scale function and the symmetric state occurs often, giving rise to a significant sharp component of the distortion distribution located near the point η = 0.

Now let us discuss the nonlocal characteristics of the 1000-particle ensembles. Figure 11.15 depicts the velocity distributions. As we can see, they depend essentially on both parameters: the mean particle density and the dissipation rate. When the mean particle density is low and the dissipation is weak (l = 50 and σ ≈ 0.1), the velocity distribution is practically of Gaussian form; however, its width is extremely large, about 10. The tenfold increase in the particle density, l : 50 → 5, shrinks the velocity distribution by roughly the same factor, and its scale becomes similar in magnitude to that of the distortion rate distribution. However, in this case the velocity distribution is a monoscale function of the well pronounced cusp form ∝ exp{−|v|}. In the case of strong dissipation (σ ≈ 1.0) the situation is the opposite. The system with low density (l = 50), as previously, is characterized by an extremely wide velocity distribution with a width of about 10. However, now its form deviates substantially from the Gaussian one. For the corresponding ensemble with high density (l = 5) the velocity distribution is Gaussian with a width of about 1. The latter, nevertheless, is much larger than the same width in the absence of dynamical traps.
[Figure 11.15: (a) normalized velocity distributions P(v)/Pmax versus the particle velocity v for weak dissipation (σ = 0.09, σ0 = 0.01) and strong dissipation (σ = 0.99, σ0 = 0.01), curves labeled 1 and 2; (b) particle velocity v versus time; (c) particle positions (in units of l) versus time.]

Figure 11.15 The distribution functions of the particle velocities and the characteristic time patterns formed by the velocity variations of the 500th particle. Dynamics of the 1000-particle ensemble with low (l = 50, label 1) and high (l = 5, label 2) mean density and weak (σ ≈ 0.1) and strong (σ ≈ 1.0) dissipation was implemented for the calculation time up to 8000 time units to make the obtained distributions stable with respect to a time increase. (c) shows the time patterns formed by 200 paths of particle motion during 1000 time units and chosen in the middle of the given ensemble. Here the curve thickness has been chosen so that the difference in brightness can depict local variations in the path spacing due to changes either in the particle density or in the velocities of cooperative particle motion (in this way the different long-lived states of the given particle ensemble become apparent). The other parameters used are the noise amplitude ε = 0.1, the trap-effect measure Δ = 0.1, the small regularization friction coefficient σ0 = 0.01 and the regularization spatial scale h0 = 0.25.
These features of the velocity distribution characterize the cooperative behavior of particles rather than their individual dynamics. In other words, there should be strong correlations in the motion of not only neighboring particles but also distant ones. Therefore, the velocity variations responsible for the formation of such distributions in fact describe the motion of multiparticle clusters. To justify this, we refer to Figure 11.15(b), which demonstrates some typical fragments of the time patterns formed by the velocities of individual particles. When the mean particle density is low (l = 50), these patterns look like a sequence of fragments inside which the particle velocity varies in the vicinity of some level $v_\alpha$. The values $\{v_\alpha\}$ are distributed rather randomly inside a certain region of thickness V ∼ 10 in the vicinity of v = 0. The transitions between these fragments occur via sharp jumps. The typical duration of these fragments is about T ∼ 100, which enables us to regard them as long-lived states because the temporal scales of individual particle dynamics are about several units. Moreover, these long-lived states can persist only if a group of many particles moves as a whole, because the characteristic distance L traveled by a particle involved in such a state is about L ∼ VT ∼ 1000 ≫ l.

The spatial structure of these cooperative states is depicted in Figure 11.15(c), which shows time patterns formed by the paths $\{x_i(t)\}$ of 200 particles over about 1000 time units. These particles were chosen in the middle part of the 1000-particle ensembles with low density. For high-density ensembles such patterns also develop, but are not so pronounced. As we can see, a large number of different mesoscopic states are formed in these systems. They differ from one another in size, direction of motion, speed, life-time, etc. Moreover, the life-time of such a state can be much longer than the characteristic time interval during which the individual particles forming it belong to this state. Besides, the patterns found could be classified as hierarchical structures: some relatively small domains formed by cooperative motion of individual particles, in their turn, together make up larger superstructures.
In other words, the observed long-lived cooperative states have their ‘own’ life independent, in some sense, of the individual particle dynamics. The latter properties are the reason for regarding them as certain dynamical phases arising in the systems under consideration due to the dynamical traps affecting the individual particle motion. The term ‘dynamical’ has been used to underline the fact that the complex cooperative motion of particles is responsible for these long-lived states; without the continuous particle motion such states cannot exist.
11.11 Self-Freezing Model for Multi-Lane Traffic
This section is devoted to a model for multi-lane congested traffic flow, where the dynamical traps give rise to the continuum of long-lived states observed in real traffic. When vehicles move on a multi-lane highway without changing lanes, they interact practically only with the nearest neighbors ahead. The more frequently lane changing occurs, the more correlated is the traffic flow on a multi-lane highway. Therefore, to characterize traffic flow on multi-lane highways, it is reasonable to introduce an additional state variable, the order parameter η [131]. In this case the mean velocity v of multi-lane traffic flow is determined by both the vehicle density ρ and the order parameter η, namely, v = ϑ(η, ρ).
Figure 11.16 Schematic illustration of the car arrangement in the neighboring lane in the synchronized mode that enables overtaking (a) or hinders it (b).
However, to describe phase transitions in cooperative vehicle motion, we have to treat the multi-lane car interaction in more detail. The fact is that, for a car to change lane, the local vehicle arrangement in the neighboring lanes should be of a special form; otherwise the maneuver is prevented for a certain time, as illustrated in Figure 11.16. For car 1 to be able to overtake car 2, the neighboring car 3 should provide room for this maneuver. In the opposite case the driver of car 1 has to wait and the local car arrangement will not vary substantially. In other words, changes in the particular realizations of the local car arrangement can be frozen for a certain time although the globally optimal car configuration is not attained at the current moment of time. Due to this self-freezing effect, the synchronized mode can comprise a great number of locally metastable states and correspond to a certain two-dimensional region in the ρq plane rather than to a line q = ϑ(ρ)ρ. This feature seems to be similar to that met in physical media with local order; for example, in glasses, where phase transitions are characterized by a wide range of controlling parameters (temperature, pressure, etc.) rather than by their fixed values (see, e.g., [252]).

We should specify the evolution of the order parameter a to complete the description of the long-lived state continuum of the cooperative car motion. Since the order parameter a allows for microscopic details of the fundamental cluster structure, its fluctuations will be treated as a random noise whose amplitude depends on the vehicle density only. In contrast, the rate da/dt of time variations in the order parameter a has to be affected substantially by the current value of the order parameter η.
In fact, as the order parameter η tends to the local optimum value η0 (a, ρ) for the given a, the rate da/dt should be depressed because all the drivers forming the fundamental cluster prefer to wait until a more comfortable car configuration arises, therefore inhibiting the evolution of the fundamental cluster structure.
The dimensionless model describing these effects in congested multi-lane traffic takes the form [132]
$$d\eta = -(\eta - a^2)\,dt + \epsilon\,dW_\eta(t), \tag{11.70}$$
$$da = -\Omega_0\,\Omega(\eta, a)\,a\,dt + \Omega_0^{1/2}\,\Omega^{1/2}(\eta, a)\,dW_a(t), \tag{11.71}$$
where the infinitesimal increments $dW_\eta$ and $dW_a$ of the random Langevin forces are mutually independent and the random process is of the Ito type. Finally, the function Ω(η, a) may be specified as
$$\Omega(\eta, a) = \begin{cases} (\eta - a^2)^2/\phi_0^2 & \text{if } |\eta - a^2| \le \phi_0, \\ 1 & \text{if } |\eta - a^2| > \phi_0. \end{cases} \tag{11.72}$$

[Figure 11.17: time series over the dimensionless time 0–100 of the order parameter η, (a) independent of a, with $\langle\eta^2\rangle^{1/2} = 0.1$, and (b) affected by a, together with the order parameter a without trapping.]

Figure 11.17 The time pattern of the order parameters η and a if the self-freezing effect is suppressed: (a) exhibits time variations in the order parameter η in the case where the influence of the order parameter a has been ignored; (b) presents the dynamics of η affected by a.
Here η = a² is the optimal value of the order parameter η attained for the given value of the car ensemble arrangement, and a = 0 matches the totally optimal structure of the car ensemble. When the car arrangement attains a local optimum for a fixed value of a, the drivers ‘do not know what to do’ and the system dynamics stagnates. In other words, the curve η = a² is the locus of dynamical traps and factor (11.72) describes them, with φ0 being the threshold of driver perception.

Let us now discuss the results obtained by numerically simulating the system (11.70), (11.71) for Ω0 = 1, ε = 0.1, and φ0 = 0.5. First of all, Figure 11.17 illustrates the evolution of the order parameters η and a when there is no self-freezing. In this case the order parameter a exhibits the standard random pattern of Brownian movement inside a region of unit width. We see a collection of practically independent spikes of unit width. A similar pattern (see Figure 11.17(a)) is demonstrated by the dynamics of the order parameter η provided the interaction of the parameters η and a has been ignored, so that (η − a²) is replaced by η. Naturally, in this case the synchronized mode matches a line on the ρq plane for ε ≪ 1.

[Figure 11.18: time series of the order parameter η (upper panel) and the order parameter a (lower panel) over the dimensionless time 0–100.]

Figure 11.18 The time pattern of the order parameters η and a when the self-freezing effect is considerable.
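For orientation, the system (11.70)–(11.72) can be integrated with the Euler–Maruyama scheme, which is consistent with the Ito interpretation stated for this model. The sketch below is illustrative only: it reads the trap factor as Ω(η, a) with the rate constant Ω0 and the noise amplitude of η as ε, it assumes the normalization ⟨dW²⟩ = dt, and the step size, the initial state η = a = 0, and all function names are hypothetical choices rather than details taken from [132].

```python
import math
import numpy as np

def trap_factor(eta, a, phi0):
    """Factor (11.72): vanishes on the curve eta = a**2, equals 1 far from it."""
    d = abs(eta - a * a)
    return (d / phi0) ** 2 if d <= phi0 else 1.0

def simulate(t_end=100.0, dt=1e-3, omega0=1.0, eps=0.1, phi0=0.5,
             trapping=True, seed=0):
    """Euler-Maruyama integration of (11.70), (11.71), assuming <dW^2> = dt."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(t_end / dt))
    dw = rng.normal(0.0, math.sqrt(dt), size=(n_steps, 2))
    eta = np.zeros(n_steps + 1)
    a = np.zeros(n_steps + 1)
    for k in range(n_steps):
        # trapping=False replaces the factor by 1, i.e. no self-freezing
        om = trap_factor(eta[k], a[k], phi0) if trapping else 1.0
        eta[k + 1] = eta[k] - (eta[k] - a[k] ** 2) * dt + eps * dw[k, 0]
        a[k + 1] = (a[k] - omega0 * om * a[k] * dt
                    + math.sqrt(omega0 * om) * dw[k, 1])
    return eta, a

eta_free, a_free = simulate(trapping=False)  # self-freezing suppressed
eta_trap, a_trap = simulate(trapping=True)   # full problem
```

Plotting `a_free` and `a_trap` against time is a simple way to compare the case without self-freezing to the full problem; the quantitative details depend on the assumed noise normalization.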
The dynamics of the order parameter η, affected by the variable a but without the reciprocal influence, is shown in Figure 11.17(b). Again a collection of spikes can be seen, whose amplitude as well as width has increased tenfold. The dynamics changes dramatically for the full problem, see Figure 11.18. The time pattern takes a form corresponding to the long-lived state continuum. When the point {a(t), η(t)} representing the current state of the synchronized mode wanders on the aη plane and reaches the curve η = a² at any point, it is trapped for a certain time until it finally escapes from the trap due to the noise $dW_\eta(t)$. After this the system again wanders in the aη plane for a time interval of about unity before being trapped once more. Since the characteristic duration of the trapping is much longer than unity, the pattern looks like a certain collection of local metastable states of the synchronized mode. However, such prolonged stays of the system are not metastable in a rigorous sense and we prefer to call them simply long-lived states. Since each point of the curve η = a² is a trap, the long-lived states make up a certain continuum.

The problem of a car following a lead car driven with constant velocity is considered in [134,135]. To derive the governing equations for the dynamics of the following car, a cost functional is constructed. This functional ranks the outcomes of different driving strategies. Assuming rational driver behavior, the existence of the Nash equilibrium is proved.

11.12 Exercises
E 11.1 Stochastic differential equations and stochastic processes
Verify whether the following stochastic differential equations of different types
$$dx = x\,dt + x_{0}^{2}\,dW, \tag{11.73a}$$
$$dx = (x - x^3)\,dt + x_{1/2}^{2}\,dW, \tag{11.73b}$$
$$dx = (x - 2x^3)\,dt + x_{1}^{2}\,dW \tag{11.73c}$$
describe the same process.

E 11.2 Transformation of variables in stochastic differential equations
Find the transformation of the variables y = φ(x) that converts the following stochastic differential equation of the Hänggi–Klimontovich type
$$dx = \frac{x}{(1 + x^2)^3}\,dt + \frac{1}{1 + x_1^2}\,dW \tag{11.74a}$$
to the stochastic differential equation
$$dy = dW \tag{11.74b}$$
describing the standard diffusion process.
E 11.3 Noise-induced phase transitions in genetic processes
Let us consider a genetic model of stochastic processes specified by the equation
$$dx = -(\beta x + x^3)\,dt + \sigma(1 - x_\theta^2)\,dW \tag{11.75}$$
for x ∈ (−1, 1). Find the threshold value of the noise intensity σ = σc causing the unimodal–bimodal transition depending on the parameter β > 0 and the process type, that is, on the value of θ.

E 11.4 Anomalous properties of stochastic processes with nonlinear noise
Let a random process in R be governed by the following stochastic differential equation of the Hänggi–Klimontovich type
$$dx = -\alpha x\,dt + \sqrt{1 + x_1^2}\,dW, \tag{11.76}$$
where α > 0 is a given constant. Find the stationary distribution $P_{\mathrm{st}}(x)$ of the random variable x and find the conditions under which the moments
$$M_p := \int_{-\infty}^{+\infty} |x|^p P_{\mathrm{st}}(x)\,dx, \quad p = 1, 2, \ldots \tag{11.77}$$
diverge.

E 11.5 Geometric random walk without crossing boundaries
Let one-dimensional geometric random walks be governed by the equation
$$dx = x\,dt + x_{1/2}\,dW \tag{11.78}$$
of the Stratonovich type and let the walker be absorbed immediately when it gets to the point $x = x_{\mathrm{tr}}$. Find the probability for the walker to survive, depending on its initial position $x_0 > x_{\mathrm{tr}}$, as time goes to infinity, t → ∞.
12 Many-Particle Systems
12.1 Hopping Models with Zero-Range Interaction
In Chapters 9 and 10, in which we discussed nucleation in supersaturated vapor as well as jam formation in traffic flow, we mainly focused on the applications of stochastic Markov processes to one-cluster models. Here we will consider in some detail more complex many-particle models, starting with a particular totally asymmetric particle-hopping model where many clusters can coexist, as described by the so-called zero-range stochastic process as a system of interacting random walks [43, 67–69, 89, 97, 166, 209, 221].

In general, the particle-hopping models are those where the spatial coordinates of particles are discretized or split into cells. Each cell can be either empty or occupied by a particle. It can also contain many particles depending on the specific definition of the model. If the time is also discretized, then at each time step particles can jump to other cells. Different updating rules are possible, such as parallel, sequential, or random. Each transition to a new discrete state of the particle system is characterized by a certain probability. If in a one-dimensional model the probabilities for a particle to hop to the right and to the left are different, then the model is called asymmetric. The hopping models can also be defined in continuous time. In this case the hopping probabilities are replaced by transition rates, as is usual in the master equation approach discussed in Chapter 3. In a totally asymmetric particle-hopping model, particles can jump in only one direction. The stochastic particle-hopping process is called the exclusion process if a cell can contain no more than one particle. The totally asymmetric simple exclusion process (TASEP) may be suitable for describing traffic flow, as the motion of cars is indeed totally asymmetric (in one direction), and two vehicles cannot occupy the same position on the road.
The Nagel–Schreckenberg cellular automaton model [170] with the maximum velocity vmax = 1, where at each time step a car can move forwards to the next empty cell with a certain probability, represents a simple TASEP. Characteristic plots showing sets of trajectories generated by the Nagel–Schreckenberg model are shown in Figure 12.1. An extension of the TASEP to two-lane traffic has been made in [187]. In this case two possible internal states, like spin-up and spin-down in the Ising model, are assigned to each particle. These internal states allow one to distinguish between two lanes. The exclusion principle in TASEP is equivalent to the Pauli exclusion principle for electrons in this case. Another application of an asymmetric simple exclusion process was considered in [81] by mapping it onto an XXZ quantum chain.

Figure 12.1 Spatio-temporal diagram of road traffic simulated by the Nagel–Schreckenberg model with vmax = 5. The results for four different car densities: (a) 8 %, (b) 12 %, (c) 15 %, (d) 30 %, are presented.

A widely studied particle-hopping process is the so-called zero-range process, which could be viewed as a generic model for domain dynamics in one dimension [89]. In this model, each cell or box can contain an arbitrary number of particles n, whereas the process is defined by the hopping rates to the neighboring boxes, and these rates depend only on the number of particles in the departure box. Due to the latter property, it is called the zero-range process (ZRP). The zero-range model is attractive because many particle-hopping processes such as the TASEP can be easily mapped to the ZRP [43, 89]. The mapping can be done in such a way that the number of particles in a box in ZRP corresponds to the size of a particle cluster or, alternatively, to the size of a gap between clusters in TASEP. The mapping of a traffic model to ZRP will be discussed in detail in the next section. The zero-range model is also attractive as the stationary probability distribution can be represented exactly as a product measure [42, 221]. Also, it provides exact criteria for the phase separation [89] in a one-dimensional driven system. In the following section we will discuss these points in relation to the model, which is designed to describe some features of traffic flow such as phase separation and metastability [97].
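The cluster picture of this mapping can be sketched in a few lines of Python. The convention assumed here, and detailed in the next section, is that cars move to the right on a ring, that every empty cell corresponds to one box, and that a box holds the cluster of cars standing directly to the left of its vacancy; the function name `clusters_to_boxes` is a hypothetical choice for illustration.

```python
def clusters_to_boxes(cells):
    """Map a ring of cells (1 = car, 0 = empty) to zero-range box occupancies.

    Assumed convention: cars move to the right, every empty cell is one box,
    and the box holds the cluster of cars standing directly to its left.
    """
    length = len(cells)
    if 0 not in cells:
        raise ValueError("the ring needs at least one empty cell")
    boxes = []
    for j, cell in enumerate(cells):
        if cell != 0:
            continue
        size = 0
        k = j - 1
        while cells[k % length] == 1:  # walk left until the previous vacancy
            size += 1
            k -= 1
        boxes.append(size)
    return boxes

ring = [1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
boxes = clusters_to_boxes(ring)  # [2, 0, 1, 3]
```

The number of boxes equals the number of vacancies and the box occupancies add up to the number of cars, so both quantities are preserved by the mapping.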
12.2 The Zero-Range Model of Traffic Flow
The development of traffic jams in vehicular flow is an everyday example of the occurrence of phase separation in low-dimensional driven systems, a topic which has attracted much recent interest (see, e.g., [166, 209] and references therein). In [89] the existence of phase separation is related to the size-dependence of domain currents and a quantitative criterion is obtained by considering the zero-range process (ZRP) as a generic model for domain dynamics. Phase separation corresponds to the phenomenon of condensation in the ZRP (see [43] for a recent review) in which a macroscopic proportion of particles accumulate on a single site. In the following we will use the zero-range picture to study the phase separation in traffic flow.

We consider a model of traffic flow where cars are moving along a circular road. Each car occupies a certain length of road ℓ. We divide the whole road of total length L into cells of size ℓ. Each cell can be either empty or occupied by a car, just as in cellular automaton traffic models (see, e.g., [31, 77, 151] and references therein). Most of these models use a discrete-time update rule; for example, see [120] for a class of traffic models related to a parallel updating version of the ZRP. In contrast, we consider the development of our system in continuous time. The probability per unit time for each car to move to the next cell is given by a certain transition rate, which depends on the actual configuration of empty and occupied cells. This configuration is characterized by the cluster distribution. An uninterrupted string of n occupied cells, bounded by two empty cells, is called a cluster of size n. The clusters of size n = 1 are associated with freely moving cars. The first car in each cluster is allowed to move forward by one cell. The transition rate $w_n$ of this stochastic event depends on the size n of the cluster to which the car belongs.
In this case $w_1$ is the mean of the inverse time necessary for a free car to move forward by one cell. The transition rate $w_1$ is related to the distribution of velocities in the free-flow regime or phase, which is characterized by a certain car density $c_{\mathrm{free}}$. For small densities, expected in the free-flow phase in real traffic, the interaction between cars is weak and, therefore, the transition rate $w_1$ depends only weakly on the density $c_{\mathrm{free}}$. Hence, in the first approximation we may assume that $w_1$ is a constant.

This model can be directly mapped to the zero-range process. Each vacancy (empty cell) in the original model is related to a box in the zero-range model. The number of boxes is fixed, and each box can contain an arbitrary number of particles (cars), which is equal to the size of the cluster located to the left (if cars are moving to the right) of the corresponding vacancy in the original model. If this vacancy has another vacancy to the left, then the box is empty. Since the boundary conditions are periodic in the original model, they remain periodic in the zero-range model. In this representation, one particle moves from a given box to the right with transition rate $w_n$, which depends only on the number of particles n in this box. In the grand canonical ensemble, where the total number of particles is allowed to fluctuate, the stationary distribution over the cluster-size configurations is the product of independent distributions for individual boxes. The probability that there are just n particles in a box in a homogeneous phase is [42, 221] $P(n) \propto z^n / \prod_{m=1}^{n} w_m$ for n > 0, P(0) being given by the normalization condition. Here $z = e^{\mu/k_B T}$ is the fugacity, a parameter which controls the mean number of particles in the system. This result can be obtained and interpreted within the stochastic master equation approach [43]. Assuming the statistical independence of the distributions in different boxes, we have a multiplicative ansatz
$$P_2(k, m, t) = P(k, t)\,P(m, t) \tag{12.1}$$
for the joint probability $P_2(k, m, t)$ that there are m particles in one box and k particles in the neighboring box on the left at time t. This approximation leads to the mean-field dynamics described by the master equations [43]
$$\frac{\partial P(n, t)}{\partial t} = \langle w \rangle P(n - 1, t) + w_{n+1} P(n + 1, t) - [\langle w \rangle + w_n] P(n, t), \quad n \ge 1, \tag{12.2}$$
$$\frac{\partial P(0, t)}{\partial t} = w_1 P(1, t) - \langle w \rangle P(0, t), \tag{12.3}$$
where
$$\langle w \rangle(t) = \sum_{k=1}^{\infty} w_k P(k, t) \tag{12.4}$$
is the mean inflow rate in a box. The ansatz (12.1) is an exact property of the stationary state of the grand canonical ensemble or, alternatively, of an infinitely large system [221]. Hence, in these cases, the master equations (12.2) and (12.3) give the exact stationary state while providing a mean-field approximation to the dynamics of reaching it. The stationary solution P(n) corresponding to ∂P(n, t)/∂t = 0 can be found recursively, starting from n = 0. It yields the known result [42, 43, 221]
$$P(n) = P(0)\,\langle w \rangle^n \prod_{m=1}^{n} \frac{1}{w_m} \tag{12.5}$$
for $n > 0$, where $P(0)$ is found from the normalization condition. Denoting the number of boxes by $M$, which corresponds to the number of vacancies in the original model, the mean number of cars on the road is given by $\langle N \rangle = M \langle n \rangle$, where

$$\langle n \rangle = \sum_{n=1}^{\infty} n P(n) \tag{12.6}$$
is the average number of particles in a box. Note that, in the grand canonical ensemble, the total number of cars and the length of the road $L$ fluctuate. For the mean value, measured in units of the cell length, we have $\langle L \rangle = M + \langle N \rangle$. Hence, the average density of cars is

$$c = \frac{\langle N \rangle}{\langle L \rangle} = \frac{\langle n \rangle}{1 + \langle n \rangle}. \tag{12.7}$$
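According to (12.7), (12.6), and (12.5), we have the following relation

$$\frac{c}{1-c} = \frac{\displaystyle\sum_{n=1}^{\infty} n\, \langle w \rangle^n \prod_{m=1}^{n} \frac{1}{w_m}}{\displaystyle 1 + \sum_{n=1}^{\infty} \langle w \rangle^n \prod_{m=1}^{n} \frac{1}{w_m}} \tag{12.8}$$

from which the stationary mean inflow rate $\langle w \rangle$ can be calculated at a given average density $c$.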
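Relation (12.8) fixes $\langle w \rangle$ only implicitly. A minimal sketch of one way to invert it is given below; it is an illustration, not the procedure used for the figures. It assumes the rates (12.9) of the next section with $w_1$ as a free parameter, truncates the sums once the terms become negligible, and uses plain bisection on $0 < \langle w \rangle \le w_\infty$. The function names `transition_rate`, `density` and `solve_mean_rate` are hypothetical.

```python
def transition_rate(n, b, sigma, w1, w_inf=1.0):
    """Rate w_n: w_1 is a free parameter, w_n follows (12.9) for n >= 2."""
    return w1 if n == 1 else w_inf * (1.0 + b / n**sigma)


def density(w_mean, b, sigma, w1, w_inf=1.0, n_max=100000, tol=1e-16):
    """Density c obtained from (12.5)-(12.7) for a given mean inflow rate <w>."""
    term = 1.0        # <w>^n * prod_{m<=n} 1/w_m, starting with n = 0
    norm = 1.0        # 1 + sum_n term_n, which equals 1/P(0)
    weighted = 0.0    # sum_n n * term_n
    for n in range(1, n_max + 1):
        term *= w_mean / transition_rate(n, b, sigma, w1, w_inf)
        norm += term
        weighted += n * term
        if n * term < tol * norm:   # remaining terms are negligible
            break
    mean_n = weighted / norm
    return mean_n / (1.0 + mean_n)


def solve_mean_rate(c, b, sigma, w1, w_inf=1.0, iterations=60):
    """Solve (12.8) for <w> by bisection; None if c lies above the range
    reachable with <w> <= w_inf (no homogeneous solution)."""
    if density(w_inf, b, sigma, w1, w_inf) < c:
        return None
    low, high = 0.0, w_inf
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if density(mid, b, sigma, w1, w_inf) < c:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)
```

Bisection is applicable because the density grows monotonically with $\langle w \rangle$. A call such as `solve_mean_rate(0.3, b=1.0, sigma=0.5, w1=5.0)` returns a value between 0 and $w_\infty$, while a density above the critical one returns `None`.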
12.3 Transition Rates and Phase Separation
Now we make the following choice for the transition rate dependence on the cluster size n:
$$w_n = w_\infty \left(1 + \frac{b}{n^\sigma}\right) \quad \text{for } n \ge 2, \tag{12.9}$$

the value of $w_1$ being given separately, since it is related to the motion of uncongested cars, whereas $w_n$ with $n \ge 2$ represents escape from a jam of size $n$. Although an individual driver does not know how many cars are jammed behind him, the effective current of cars from a jam, represented by $w_n$, is a collective effect which is expected to depend on the correlations and internal structure (e.g., distribution of headways) within the cluster [89]. A monotonically decreasing dependence on cluster size, such as (12.9), can be considered as a type of slow-to-start rule: the longer a car has been stationary, the larger the probability of a delay when starting (cf. [14, 19, 127, 225]). We now explore the consequences of the choice (12.9) in terms of the ZRP phase behavior and its implications for the description of traffic flow. In numerical calculations we have assumed $w_\infty = 1/\tau_\infty = 1$ and $w_1 = 5$, by choosing the time constant $\tau_\infty$ as a time unit, whereas the control parameters $b$ and $\sigma$ have been varied. If $\sigma > 1$, as well as for $b \le 2$ at $\sigma = 1$, then (12.8) has a solution for any density $0 < c < 1$ (see dashed and dotted curves in Figure 12.2). This implies that the homogeneous phase is stable in the whole range of densities, so there is no phase transition in a strict sense. If $\sigma < 1$ (solid curve in Figure 12.2), as well as for $b > 2$ at $\sigma = 1$ (see Figure 12.3), $\langle w \rangle / w_\infty$ reaches 1 at a critical density $0 < c_{cr} < 1$, and there is no physical solution of (12.8) for $c > c_{cr}$. This means that the homogeneous phase cannot accommodate a larger density of particles and condensation takes place at $c > c_{cr}$. This behavior underlies the known criterion for phase separation in one-dimensional driven systems [89]. For illustration, we comment that, in the multi-cluster model considered in [96], the transition rates do not depend on
Figure 12.2 $\langle w \rangle / w_\infty$ versus density $c$ at $b = 1$ for different $\sigma$ (curves for $\sigma = 0.5$, 1, and 2).

Figure 12.3 $\langle w \rangle / w_\infty$ versus density $c$ at $\sigma = 1$ for different $b$ (curves for $b = 1$, 2, and 3).
the cluster sizes; only the inflow rate in a cluster depends on the overall car density and fraction of congested cars. This corresponds to the case $b = 0$, where, according to the criterion, no macroscopic phase separation takes place, in agreement with the theoretical conclusions and simulation results of [96]. In contrast, a class of microscopic models was introduced in [90] where correlations within the domain (jam) give rise to currents of the form (12.9) with $\sigma = 1$ and $b > 2$; phase separation is then observed. At $c < c_{cr}$ in our model the cluster distribution function $P(n)$ decays exponentially fast for large $n$ (dashed and dotted curves in Figures 12.4 and 12.5), whereas the decay is slower at $c = c_{cr}$ (solid curves in Figures 12.4 and 12.5). It is well known that the decay in this case is power-like for $\sigma = 1$, so that $P(n) \sim n^{-b}$ [89]. The large-$n$ asymptotics for $0 < \sigma < 1$ at the critical density $c = c_{cr}$ is calculated as follows [97]. According to (12.5), at $c = c_{cr}$ where $\langle w \rangle = w_\infty$, we have
Figure 12.4 Probability distribution function $P(n)$ over cluster sizes for different densities $c$ at $\sigma = 0.5$ and $b = 1$ (curves for $c = 0.1$, $c = 0.3$, and $c = c_{cr}$).

Figure 12.5 Probability distribution function $P(n)$ over cluster sizes for different densities $c$ at $\sigma = 1$ and $b = 3$ (curves for $c = 0.1$, $c = 0.3$, and $c = c_{cr}$).
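$$\ln P(n) = \ln\!\left(\frac{P(0)\, w_\infty}{w_1}\right) - \sum_{m=2}^{n} \ln\!\left(1 + \frac{b}{m^\sigma}\right) = \ln\!\left(\frac{P(0)\, w_\infty}{w_1}\right) - \int_1^n \ln\!\left(1 + \frac{b}{m^\sigma}\right) dm + C + \delta(n), \tag{12.10}$$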
where $C$ is a constant, and $\delta(n) \to 0$ at $n \to \infty$. The latter follows from the fact that each term with $m = k$ in the sum generates terms $\propto k^{-y}$ with $y > 0$ when the logarithm is expanded in a Taylor series, and for each of these terms we have

$$k^{-y} = \int_{k-1}^{k} m^{-y}\, dm + O\!\left(k^{-y-1}\right) \quad \text{at } k \to \infty.$$
This ensures that the difference between the integral and the sum in (12.10) is finite and tends to some constant C at n → ∞ for σ > 0. The remainder term δ(n) is
irrelevant for the leading asymptotic behavior of $P(n)$. By expanding the logarithm and integrating term by term, for $0 < \sigma < 1$ and $\sigma \ne 1/2, 1/3, 1/4, \ldots$ we obtain

$$P(n) \propto \exp\!\left[\sum_{k=1}^{[1/\sigma]} \frac{(-b)^k}{k}\, \frac{n^{1-k\sigma}}{1-k\sigma}\right], \tag{12.11}$$

where $[1/\sigma]$ denotes the integer part of $1/\sigma$. The cases where $\sigma$ is an inverse integer are special, since a term $\propto 1/m$ appears in the expansion of the logarithm, giving rise to a power-like correction to the stretched exponential behavior, namely

$$P(n) \propto n^{\sigma (-b)^{1/\sigma}} \exp\!\left[\sum_{k=1}^{\sigma^{-1}-1} \frac{(-b)^k}{k}\, \frac{n^{1-k\sigma}}{1-k\sigma}\right] \tag{12.12}$$
for $\sigma = 1/2, 1/3, 1/4, \ldots$. The known result for $\sigma = 1$ can also be obtained by this method: it corresponds to the power-like prefactor in (12.12). Only the linear expansion term of the logarithm is relevant at $1/2 < \sigma < 1$, so we find $P(n) \propto \exp\!\left[-b\, n^{1-\sigma}/(1-\sigma)\right]$ in the limit of large $n$. The first two terms are relevant for $1/3 < \sigma \le 1/2$, the third one becomes important for $1/4 < \sigma \le 1/3$, and so on. Equations (12.11) and (12.12) represent an exact analytical result at $n \to \infty$ which we have also verified numerically at different values of $\sigma$ and $b$. In this form, where the proportionality coefficient is not specified, Equations (12.11) and (12.12) are universal, that is, they do not depend on the choice of $w_1$. At $\langle w \rangle / w_\infty = 1$ the inflow $\langle w \rangle$ in a macroscopic cluster of size $n \to \infty$ is balanced by the outflow $w_\infty$. This means that at overall density $c > c_{cr}$ the homogeneous phase with density $c_{cr}$ is in equilibrium with a macroscopic cluster, represented by one of the boxes containing a non-vanishing fraction of all particles in the thermodynamic limit [63, 88]. Hence, $\langle w \rangle / w_\infty = 1$ holds in the phase coexistence regime at $c > c_{cr}$. According to (12.7), the critical density $c_{cr}$ is given by

$$c_{cr} = \frac{\langle n \rangle_{cr}}{1 + \langle n \rangle_{cr}}, \tag{12.13}$$

where $\langle n \rangle_{cr}$ is the mean cluster size at the critical density. Since $\langle w \rangle = w_\infty$ holds in this case, we have

$$\langle n \rangle_{cr} = \frac{\displaystyle\sum_{n=1}^{\infty} n\, w_\infty^n \prod_{m=1}^{n} \frac{1}{w_m}}{\displaystyle 1 + \sum_{n=1}^{\infty} w_\infty^n \prod_{m=1}^{n} \frac{1}{w_m}}. \tag{12.14}$$
The critical density, calculated numerically from (12.13) and (12.14) as a function of parameters $\sigma$ and $b$, is shown in Figures 12.6 and 12.7, respectively. In contrast to the situations discussed previously in the literature, in our model $w_1$ is not given by the general formula (12.9) but is an independent parameter. This distinction leads to quantitatively different results; for example, the critical density for $\sigma = 1$ is analytically shown to be

$$c_{cr} = \frac{b(b+1)}{(b-1)\left[2(b+1) + w_1 (b-2)\right]}. \tag{12.15}$$
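The numerical evaluation of (12.13) and (12.14) can be sketched in a few lines and checked against the closed form (12.15). The sketch below is illustrative only: it truncates the sums at a fixed `n_max` (an assumption that is adequate when the terms decay quickly, for instance $\sigma = 1$ with $b$ well above 2), and it takes $w_\infty = 1$ in the closed form, as in the numerical calculations of this section. The function names are hypothetical.

```python
def critical_density(b, sigma, w1, w_inf=1.0, n_max=20000):
    """Critical density from (12.13) and (12.14), sums truncated at n_max."""
    term = w_inf / w1     # n = 1 term of the sums: w_inf / w_1
    norm = 1.0 + term     # denominator of (12.14)
    weighted = term       # numerator of (12.14)
    for n in range(2, n_max + 1):
        term /= 1.0 + b / n**sigma    # multiply by w_inf / w_n, w_n from (12.9)
        norm += term
        weighted += n * term
    mean_n_cr = weighted / norm
    return mean_n_cr / (1.0 + mean_n_cr)


def critical_density_sigma1(b, w1):
    """Closed form (12.15) for sigma = 1 and b > 2, assuming w_inf = 1."""
    return b * (b + 1.0) / ((b - 1.0) * (2.0 * (b + 1.0) + w1 * (b - 2.0)))
```

For $b = 4$ and $w_1 = 5$ the closed form gives $c_{cr} = 1/3$, and the truncated sums reproduce this value closely. For $b$ only slightly above 2 at $\sigma = 1$ the sums converge slowly, so `n_max` would have to be increased.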
Figure 12.6 Critical density as a function of control parameter $\sigma$ for different values of $b$ (curves for $b = 0.5$, 1, 2, and 4).

Figure 12.7 Critical density as a function of control parameter $b$ for different values of $\sigma$ (curves for $\sigma = 0.1$, 0.5, 0.9, and 1).
12.4 Metastability
Suppose that, at the initial time $t = 0$, the system is in a homogeneous state with overall density slightly larger than $c_{cr}$. Here we study the development of such a state in the mean-field approximation provided by (12.2) and (12.3). With this initial condition, the mean inflow rate in a box $\langle w \rangle$ is slightly larger than that at $c = c_{cr}$, so $\langle w \rangle = w_\infty + \varepsilon$ holds with small and positive $\varepsilon$. Hence, only large clusters with $w_n < w_\infty + \varepsilon$ have a stable tendency to grow, whereas any smaller cluster typically (except in rare cases) fluctuates until it finally dissolves. In other words, the initially homogeneous system with no large clusters can stay in this metastable supersaturated state for a long time until a large stable cluster appears due to a rare fluctuation.
Neglecting the fluctuations, the time development of the size $n$ of a cluster is described by the deterministic equation

$$\frac{dn}{dt} = \langle w \rangle - w_n. \tag{12.16}$$

According to this equation, the undercritical clusters with $n < n_{cr}$ tend to dissolve, whereas the overcritical ones with $n > n_{cr}$ tend to grow, where the critical cluster size $n_{cr}$ is given by the condition

$$\langle w \rangle = w_{n_{cr}}. \tag{12.17}$$

Using (12.9) yields

$$n_{cr} \simeq \left(\frac{b}{\langle w \rangle / w_\infty - 1}\right)^{1/\sigma}. \tag{12.18}$$
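As a small worked illustration of (12.16) to (12.18), the following sketch evaluates the real-valued critical size and the sign of the deterministic drift on either side of it. The function names are hypothetical, and $\langle w \rangle$ is simply passed in as a number rather than determined self-consistently.

```python
def escape_rate(n, b, sigma, w_inf=1.0):
    """Escape rate w_n from (12.9), valid for n >= 2."""
    return w_inf * (1.0 + b / n**sigma)


def critical_cluster_size(w_mean, b, sigma, w_inf=1.0):
    """Real-valued critical size from (12.18); requires <w> > w_inf."""
    if w_mean <= w_inf:
        raise ValueError("no finite critical size unless <w> exceeds w_inf")
    return (b / (w_mean / w_inf - 1.0)) ** (1.0 / sigma)


def drift(n, w_mean, b, sigma, w_inf=1.0):
    """Deterministic growth rate dn/dt = <w> - w_n from (12.16)."""
    return w_mean - escape_rate(n, b, sigma, w_inf)
```

With $\langle w \rangle = 1.1\, w_\infty$, $b = 1$ and $\sigma = 0.5$, relation (12.18) gives $n_{cr} \simeq (1/0.1)^2 = 100$; the drift is negative for a cluster of size 50 and positive for one of size 200.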
In this case $n_{cr}$ is rounded to an integer value. This deterministic approach describes only the most probable scenario for an arbitrarily chosen cluster of a given size. It does not allow one to obtain the distribution over cluster sizes: the deterministic equation (12.16) suggests that all clusters shrink to zero size if they are smaller than $n_{cr}$ at the beginning, whereas the real size distribution arises from the competition between opposite stochastic events of shrinking and growing. Assuming that the distribution of relatively small clusters contributing to $\langle n \rangle$ is quasi-stationary, that is, the detailed balance (equality of the terms in (12.2) and (12.3) describing opposite stochastic events) for these clusters is almost reached before any cluster with $n > n_{cr}$ has appeared, we have

$$\langle n \rangle \simeq \sum_{n=1}^{n_{cr}} n P(n) \tag{12.19}$$

for such a metastable state. In this case, from (12.7), we obtain

$$\frac{c}{1-c} \simeq \frac{\displaystyle\sum_{n=1}^{n_{cr}} n\, \langle w \rangle^n \prod_{m=1}^{n} \frac{1}{w_m}}{\displaystyle 1 + \sum_{n=1}^{n_{cr}} \langle w \rangle^n \prod_{m=1}^{n} \frac{1}{w_m}} \tag{12.20}$$
instead of (12.8) for calculation of $\langle w \rangle$ in this homogeneous metastable state. The critical cluster size is found self-consistently by solving (12.18) and (12.20) as a system of equations. From (12.18) we can see that the critical cluster size $n_{cr}$ diverges at $c \to c_{cr}$, since $\langle w \rangle \to w_\infty$. The results of the calculation of $n_{cr}$ (rounding down to an integer value) at $\sigma = 0.5$, $b = 1$ and at $\sigma = 1$, $b = 3$ are shown in Figure 12.8. Within the framework of mean-field dynamics, the mean nucleation time in our model can be evaluated as follows. Let $P(t)$ be the probability density of the first-passage time of exceeding the critical number of particles $n_{cr}$ in a single box. By our
Figure 12.8 Critical cluster size $n_{cr}$ vs density $c$ at $\sigma = 0.5$, $b = 1$ (a) and $\sigma = 1$, $b = 3$ (b). The critical density is indicated by a vertical dashed line.
definition, the nucleation occurs when one of the $M$ boxes reaches the cluster size $n_{cr} + 1$. The probability that it occurs first in a given box within a small time interval $[t; t + dt]$ is thus $P(t)\, dt \times \left[1 - \int_0^t P(t')\, dt'\right]^{M-1}$ according to our assumption that the boxes are statistically independent. The term $\left[1 - \int_0^t P(t')\, dt'\right]^{M-1}$ is the probability that in all other boxes, except the given one, the overcritical cluster size $n_{cr} + 1$ has still not been reached. Since the nucleation can occur in any of $M$ boxes, the nucleation probability density $P_M(t)$ for the system of $M$ boxes is given by

$$P_M(t) = M P(t) \left[1 - \int_0^t P(t')\, dt'\right]^{M-1} \simeq M P(t) \exp\!\left(-M \int_0^t P(t')\, dt'\right). \tag{12.21}$$
The latter equality holds for large $M$, since all $M$ boxes are equivalent, and therefore the probability $\int_0^t P(t')\, dt'$ that the nucleation occurs in a given box within a characteristic time interval $t \sim \langle T \rangle_M$ is a small quantity of order $1/M$. The mean nucleation time for the system of $M$ boxes is

$$\langle T \rangle_M = \int_0^\infty t\, P_M(t)\, dt. \tag{12.22}$$

Here $\langle T \rangle_1$ is the mean first-passage time for a single box. In order to estimate $\langle T \rangle_M$ according to (12.21) and (12.22), one needs some idea about the first-passage time probability density for one box $P(t)$. This is actually the problem of a particle escaping from a potential well. Since we start with an almost homogeneous state of the system, we may assume zero cluster size $n = 0$ as the initial condition. The first-passage time probability density can be calculated as the probability per unit time of reaching the state $n_{cr} + 1$, assuming that the particle is absorbed there. It is reasonable to assume that after a certain equilibration time $t_{eq}$, when a quasi-stationary distribution of the cluster sizes within $n \le n_{cr}$ is reached, the escaping from this region is characterized by a certain transition rate $w_{esc}$. Hence, for $t > t_{eq}$ we have

$$P(t) \simeq w_{esc} \times \left[1 - \int_0^t P(t')\, dt'\right], \tag{12.23}$$

where the expression in square brackets is the probability that the absorption at $n_{cr} + 1$ has still not occurred up to the time $t$. At high enough potential barriers (large mean first-passage times) the short-time contribution to the integral is irrelevant and, by means of (12.22), the solution of (12.23) can be written as

$$P(t) = \frac{1}{\langle T \rangle_1} \exp\!\left(-\frac{t}{\langle T \rangle_1}\right), \tag{12.24}$$

where $\langle T \rangle_1 = w_{esc}^{-1}$. Obviously, this approximate solution of the first-passage problem is not valid for very short times $t \ll t_{eq}$, since the short-time solution should explicitly depend on the initial condition. In particular, if we start at $n = 0$, then the state $n_{cr} + 1$ cannot be reached immediately, so that $P(t) = 0$ at $t = 0$. Nevertheless (12.24) can be used to estimate the mean nucleation time $\langle T \rangle_M$ provided that $\langle T \rangle_M > t_{eq}$. We have checked the correctness of these theoretical expectations within the mean-field dynamics represented by (12.2) and (12.3) by comparing them with the results of the simulation of stochastic trajectories generated according to these equations. The simulation curves for $P(t)$ at two different sets of parameters, $\sigma = 0.5$, $b = 1$, $c = 0.84$ (with $n_{cr} = 48$) and $\sigma = 1$, $b = 3$, $c = 0.61$ (with $n_{cr} = 35$), are shown in Figure 12.9 in two different time scales. As we can see, (12.24) is a good approximation for large enough times $t > t_{eq}$. For definiteness, we have identified the equilibration time $t_{eq}$ with the crossing point of the theoretical and simulated curves. An interesting additional feature is the presence of an apparent nucleation time lag, which is about $t_{lag} \approx 60$ for the first set of parameters and about $t_{lag} \approx 30$ for the second one. Evidently, the first-passage time probability density $P(t)$ tends to zero very rapidly when $t$ decreases below $t_{lag}$.
Figure 12.9 Comparison between the theoretical approximation (12.24) for $P(t)$ (smooth curves) and mean-field simulation results (fluctuating curves) shown in a longer (a) and in a shorter (b) time scale. The vertical dashed line indicates the equilibration time $t_{eq} \approx 500$ for the set of parameters $\sigma = 0.5$, $b = 1$, and $c = 0.84$ represented by the upper curves in both pictures. The other curves correspond to $\sigma = 1$, $b = 3$, and $c = 0.61$.
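By inserting (12.24) in (12.22) we obtain

$$\langle T \rangle_M \simeq M \langle T \rangle_1 \int_0^\infty x\, e^{-x} \exp\!\left[-M\left(1 - e^{-x}\right)\right] dx \tag{12.25}$$

after changing the integration variable $t/\langle T \rangle_1 \to x$. Taking into account that only the region $x \sim 1/M$ contributes to the integral at large $M$, we arrive at a very simple expression

$$\langle T \rangle_M \simeq \frac{\langle T \rangle_1}{M} \tag{12.26}$$

relating the mean first-passage time or nucleation time in a system of $M$ boxes to that of one box. The latter can be calculated easily by the known formula [55]

$$\langle T \rangle_1 = \sum_{n=0}^{n_{cr}} \left[\langle w \rangle \tilde P(n)\right]^{-1} \sum_{m=0}^{n} \tilde P(m), \tag{12.27}$$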
Figure 12.10 Mean nucleation time $\langle T \rangle_M$ versus density $c$ at $\sigma = 0.5$, $b = 1$ (a) and $\sigma = 1$, $b = 3$ (b). In both cases $M = 10^6$. The critical density is indicated by a vertical dashed line.
where $\tilde P(0) = 1$ and $\tilde P(n) = \prod_{k=1}^{n} \left(\langle w \rangle / w_k\right)$ with $n \ge 1$ represent the unnormalized stationary probability distribution. The mean nucleation time versus the density $c$, calculated from (12.26) and (12.27) at $M = 10^6$, is shown in Figure 12.10. These figures show that the mean nucleation time increases dramatically as the critical cluster size $n_{cr}$ increases (see the corresponding plots in Figure 12.8) approaching the critical density $c_{cr}$. According to our previous discussion, estimate (12.26) is valid for large enough mean nucleation times $\langle T \rangle_M > t_{eq}$; in particular, when approaching the critical density $c \to c_{cr}$ at any large but fixed $M$. It is not valid in the thermodynamic limit $M \to \infty$ at a fixed density $c$. Namely, (12.26) suggests that $\langle T \rangle_M$ decreases as $\sim 1/M$, whereas in reality the decrease must be slower for small nucleation times (large $M$) since $P(t) \to 0$ as $t \to 0$. In particular, the mean-field dynamics suggests that, for a wide range of $M$ values, $\langle T \rangle_M$ quasi-saturates at $\langle T \rangle_M \approx t_{lag}$, since the critical cluster size is almost never reached before $t = t_{lag}$.
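Formula (12.27) is a double sum that can be accumulated in a single pass, which the following sketch illustrates. It is a minimal example under stated assumptions: $\langle w \rangle$ and $n_{cr}$ are supplied by the caller (in the text they follow from (12.18) and (12.20)), the rates $w_n$ are passed as a function `rate`, and the names `mean_first_passage_time` and `nucleation_time` are hypothetical.

```python
def mean_first_passage_time(w_mean, rate, n_cr):
    """<T>_1 from (12.27): mean time to reach n_cr + 1 starting from n = 0.

    w_mean is the constant inflow rate <w>; rate(n) returns w_n for n >= 1.
    """
    p_tilde = 1.0       # unnormalized stationary weight, P~(0) = 1
    cumulative = 1.0    # running sum of P~(m) for m = 0..n
    total = cumulative / (w_mean * p_tilde)      # n = 0 term
    for n in range(1, n_cr + 1):
        p_tilde *= w_mean / rate(n)              # P~(n) = prod_k <w> / w_k
        cumulative += p_tilde
        total += cumulative / (w_mean * p_tilde)
    return total


def nucleation_time(w_mean, rate, n_cr, n_boxes):
    """<T>_M from (12.26), that is <T>_1 divided by the number of boxes M."""
    return mean_first_passage_time(w_mean, rate, n_cr) / n_boxes
```

A hand-checkable case is a constant escape rate $w_n = 3$ with $\langle w \rangle = 2$ and $n_{cr} = 1$: the two terms of (12.27) are $1/2$ and $(1 + 2/3)/(2 \cdot 2/3) = 5/4$, so $\langle T \rangle_1 = 1.75$. For the rates (12.9) one would pass, for example, `lambda n: 5.0 if n == 1 else 1.0 + b / n**sigma`.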
12.5 Monte Carlo Simulations of the Hopping Model
Numerical simulations of the zero-range model show clear evidence for the existence of a metastable state prior to condensation. In Figure 12.11 we show the largest cluster size as a function of time for three separate Monte Carlo runs in the case $\sigma = 0.5$, $b = 1$, $w_1 = 5$, $M = 10^5$. For each run the system was started in a random uniform initial condition with density $c = 0.66$ (for these parameters $c_{cr} \simeq 0.56$). It can be clearly seen that, after a short equilibration period, the system fluctuates in a metastable state before a condensate appears. The critical cluster size is observed to be around 400, in good agreement with the prediction $n_{cr} \simeq 330$ from (12.18) and (12.20) (see Figure 12.8). However, the metastable time is about an order of magnitude larger than predicted. In Figure 12.12 we show the distribution of cluster sizes (for small clusters) averaged over the metastable state of one such run. The distribution is in good agreement with (12.5) with $\langle w \rangle = w_{n_{cr}}$, thus supporting the assumption of quasi-stationarity.
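The text does not specify the update scheme used for these runs. As a hedged illustration only, the sketch below simulates the zero-range process on a ring with a continuous-time (Gillespie-type) algorithm, which is an assumption and not necessarily the authors' method. The function names `hop_rate` and `simulate_zrp` are hypothetical, $w_\infty = 1$ by default, and the linear-time rate sum limits the sketch to small systems.

```python
import random


def hop_rate(n, b, sigma, w1, w_inf=1.0):
    """Rate at which a box holding n particles releases one particle."""
    if n == 0:
        return 0.0
    return w1 if n == 1 else w_inf * (1.0 + b / n**sigma)


def simulate_zrp(occupancy, b, sigma, w1, n_events, seed=0):
    """Continuous-time simulation of the zero-range process on a ring.

    occupancy: list with the particle number of each box (modified in place).
    Returns a list of (time, largest cluster size) recorded after each hop.
    """
    rng = random.Random(seed)
    n_boxes = len(occupancy)
    boxes = range(n_boxes)
    rates = [hop_rate(n, b, sigma, w1) for n in occupancy]
    time, history = 0.0, []
    for _ in range(n_events):
        total = sum(rates)
        if total == 0.0:
            break                         # no particles left to move
        time += rng.expovariate(total)    # waiting time until the next hop
        # Source box chosen with probability proportional to its rate.
        src = rng.choices(boxes, weights=rates)[0]
        dst = (src + 1) % n_boxes         # particles hop to the right neighbor
        occupancy[src] -= 1
        occupancy[dst] += 1
        rates[src] = hop_rate(occupancy[src], b, sigma, w1)
        rates[dst] = hop_rate(occupancy[dst], b, sigma, w1)
        history.append((time, max(occupancy)))
    return history
```

Starting from `occupancy = [2] * M` corresponds to a density $c = N/(M+N) = 2/3$, close to the value $c = 0.66$ used above. The total particle number is conserved by construction, so the canonical ensemble is simulated rather than the grand canonical one.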
Figure 12.11 Largest cluster size versus time for $\sigma = 0.5$, $b = 1$, $w_1 = 5$, $c = 0.66$, $M = 10^5$. Results from three independent Monte Carlo runs are shown.

Figure 12.12 Distribution over small cluster sizes in the metastable state. Results for Monte Carlo simulation (crosses) compared to prediction of (12.5) with $\langle w \rangle = w_{n_{cr}}$ and $n_{cr}$ calculated numerically from (12.18) and (12.20) (dashed line).
In the analytical treatment of the previous section we calculated the mean time for the maximum cluster size to exceed ncr under the assumption that the current in the metastable state is constant. In practice, of course, the metastable current also fluctuates (and the fluctuations are greater when wn depends more strongly on cluster size, e.g. for the σ = 1 case compared with, say, σ = 0.5). Simulations suggest that these fluctuations can destroy the metastable state in cases where the metastable current wncr is close to the current of the condensed phase w∞ . In contrast, for parameters where the metastable state is well separated from the condensed state we find relatively good quantitative agreement between theory and simulation. For example, in Figures 12.13 and 12.14 we compare the average
Figure 12.13 Critical cluster size versus density for $\sigma = 0.5$, $b = 3$, $w_1 = 0.5$, $M = 10^5$ ($c_{cr} \simeq 0.27$). Crosses show simulation data (averaged over 10 Monte Carlo histories), the dashed line is a prediction of (12.18) and (12.20).

Figure 12.14 Nucleation time versus density for $\sigma = 0.5$, $b = 3$, $w_1 = 0.5$, $M = 10^5$ ($c_{cr} \simeq 0.27$). Crosses show simulation data (averaged over 10 Monte Carlo histories), the dashed line is a prediction of (12.26) and (12.27).
simulation values of the critical cluster size and nucleation time with the theoretical predictions for a range of densities in the case $\sigma = 0.5$ and $b = 3$. In these simulations we crudely identified the end of the metastable state as the point when the current out of the largest cluster had been less than the average system current for 50 consecutive Monte Carlo time steps. We find that our mean-field theory fairly accurately reproduces the critical cluster size, but systematically underestimates the nucleation time. This discrepancy may be partly due to the presence of (weak) dynamical correlations between the numbers of particles in the boxes in the fluctuating metastable state. Namely, the appearance of a large cluster with $n \simeq n_{cr}$ is likely to be accompanied by a
slight depletion of the surrounding medium. Furthermore, we only calculated the mean first-passage time and ignored the probability that a cluster reaches ncr + 1 and is immediately driven by a fluctuation back below ncr . Monte Carlo histories which involve such a fluctuation back into the metastable state before a condensate is established, would increase the average simulation nucleation time above the theoretical prediction. Despite the neglect of current fluctuations, dynamical correlations, etc., our simulations show that the simple mean-field approach provides a good qualitative description of the metastable state and its dependence on density. It thus represents an important first step towards more refined theories.
12.6 Fundamental Diagram of the Zero-Range Model
The relation between the density $c$ and flux $j$ of cars is known as the fundamental diagram of traffic flow. The average stationary flux can be calculated as follows:

$$j = \sum_{n=1}^{\infty} Q(n)\, w_n, \tag{12.28}$$

where $Q(n)$ is the probability that there is a car in a given cell (in the original model) which can move forwards at a rate $w_n$. Note that only those cars which are the first in some cluster contribute to the flux. Hence, $Q(n) = \varphi P(n) / \sum_{m=1}^{\infty} P(m)$, where $\varphi$ is the fraction of cells which contain such cars. This fraction can be calculated easily as the number of clusters divided by the total number of cells. These quantities fluctuate in our model. For large systems, however, they can be replaced by the mean values. The mean number of clusters is equal to the mean number of non-empty boxes $M \sum_{n=1}^{\infty} P(n)$ in the zero-range model, whereas the mean number of cells, that is, the mean length of the road, is $\langle L \rangle = M + \langle N \rangle = M (1 + \langle n \rangle) = M/(1 - c)$, as we have already discussed in Section 12.2. Hence, $Q(n) = (1 - c) P(n)$ and (12.28) reduces to

$$j = (1 - c) \langle w \rangle. \tag{12.29}$$
The mean stationary transition rate $\langle w \rangle$ depends on the car density $c$. For undercritical densities $c < c_{cr}$, this quantity is the solution of (12.8). For overcritical densities we have $\langle w \rangle = w_\infty$ in the phase coexistence regime, as discussed in Section 12.3; therefore, in this case the fundamental diagram reduces to a straight line

$$j = (1 - c)\, w_\infty, \quad c \ge c_{cr}. \tag{12.30}$$
In the metastable homogeneous state at $c > c_{cr}$ the mean transition rate $\langle w \rangle$ together with the critical cluster size $n_{cr}$ can be found from the system of equations (12.18) and (12.20), which allows calculation of the metastable branch of flux $j$. The resulting fundamental diagrams for $\sigma = 0.5$, $w_1 = 5$ and two values of parameter $b$ are shown in Figure 12.15. As we can see, the shape of the fundamental
Figure 12.15 The fundamental (flux–density) diagram for two different sets of control parameters: $\sigma = 0.5$, $b = 1$, $w_1 = 5$ (a); $\sigma = 0.5$, $b = 3$, $w_1 = 5$ (b). The branches of metastable homogeneous state are shown by dotted lines, the critical densities $c_{cr}$ are indicated by vertical dashed lines.
diagram, as well as the critical density and location of the metastable branch, depend remarkably on the value of b. These features will also depend on the values of σ and w1 . The metastable branch ends abruptly at certain density above which (12.18) and (12.20) have no real solution. This corresponds to a relatively small, but finite value of the critical cluster size ncr . Note that a metastable branch is also observed in simulations of cellular automata with slow-to-start rules [14, 19, 225]. In our examples, however, the metastable branch is located at larger densities and decreases with increasing c over a certain wide range of values depending on b, σ and w1 . The simulations of the previous section suggest that, when this metastable branch is well separated from the condensed section of the fundamental diagram, our picture is robust even in the presence of fluctuations. In summary, therefore, we believe that by suitable variation of parameters our simple model can reproduce some important features of real traffic flow [97].
12.7 Polarization Kinetics in Ferroelectrics with Fluctuations
Here we would like to complete our discussion of many-particle systems with one more application, which is less trivial in the sense that a multidimensional state space is considered to describe collective phenomena in a ferroelectric. The stochastic description of collective phenomena like phase transitions is a key objective in solid-state physics. Problems of this kind are nontrivial and are usually solved by means of perturbation theory [51, 94, 137, 253], with some exceptions like the mean-field model considered in [196]. Here we study the kinetics of polarization switching in ferroelectrics, taking into account the spatio-temporal fluctuations of the polarization field, given by the Langevin and multidimensional Fokker–Planck equations [64, 179]. The problem was studied earlier [93, 95] by means of the Feynman diagram technique. Here we derive the Fokker–Planck equation in the Fourier representation, which is suitable for a numerical approach. We consider a ferroelectric with the Landau–Ginzburg Hamiltonian
$$H = \int \left[\frac{\alpha}{2} P^2(x) + \frac{\beta}{4} P^4(x) + \frac{c}{2} \left(\nabla P(x)\right)^2 - \lambda(x,t) P(x,t)\right] dx, \tag{12.31}$$
where $P(x,t)$ is the local polarization and $\lambda(x,t)$ is the time-dependent external field. Only those configurations of the polarization are allowed which correspond to the cut-off $k < \pi/a$ in the Fourier space, where $a$ is the lattice constant. Hamiltonian (12.31) can be approximated by a sum over discrete cells, where the size of one cell can be even larger than the lattice constant. It is a small domain with almost constant polarization. Thus, the Hamiltonian of a system consisting of $N$ cells with total volume $V$ reads

$$H = \frac{V}{N} \sum_x \left[\frac{\alpha}{2} P^2(x) + \frac{\beta}{4} P^4(x) + \frac{c}{2} \left(\nabla P(x)\right)^2 - \lambda(x,t) P(x,t)\right], \tag{12.32}$$

where $V/N = \Delta V$ is the volume of one cell, and the coordinates of their centers are given by the set of discrete $d$-dimensional vectors $x \in \mathbb{R}^d$. The stochastic dynamics of the system is described by the Langevin equation

$$\dot P(x,t) = -\gamma \frac{\partial H}{\partial P(x,t)} + \xi(x,t), \tag{12.33}$$
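where $\xi(x,t)$ is the white noise, that is,

$$\langle \xi(x,t)\, \xi(x',t') \rangle = 2\gamma\theta\, \delta_{x,x'}\, \delta(t - t'). \tag{12.34}$$

In the case of Gaussian white noise, the probability distribution function $f = f\left(P(x_1), P(x_2), \ldots, P(x_N), t\right)$ is given by the Fokker–Planck equation

$$\frac{1}{\gamma} \frac{\partial f}{\partial t} = \sum_x \frac{\partial}{\partial P(x)} \left[\frac{\partial H}{\partial P(x)}\, f + \theta \frac{\partial f}{\partial P(x)}\right], \tag{12.35}$$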
consistent with (5.4) for the $N$-dimensional state vector with components $P(x_i)$, where $i = 1, 2, \ldots, N$. At equilibrium we have a vanishing flux which corresponds to Boltzmann's distribution $f \propto \exp(-H/\theta)$ with $\theta = k_B T$. Assuming periodic boundary conditions, we consider the Fourier transformation

$$P(x) = N^{-1/2} \sum_k P_k\, e^{ikx}, \qquad P_k = N^{-1/2} \sum_x P(x)\, e^{-ikx}. \tag{12.36}$$
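The Fourier amplitudes are complex numbers $P_k = P'_k + i P''_k$. Since $P(x)$ is real, $P'_{-k} = P'_k$ and $P''_{-k} = -P''_k$ hold. It is assumed that the total number of modes $N$ is an odd number. This means that there is a mode with $k = 0$ and the modes with $\pm k_1, \pm k_2, \ldots, \pm k_m$, where $m = (N - 1)/2$ is the number of independent nonzero modes. The Fokker–Planck equation for the probability distribution function $f = f\left(P_0, P'_{k_1}, P'_{k_2}, \ldots, P'_{k_m}, P''_{k_1}, P''_{k_2}, \ldots, P''_{k_m}, t\right)$ reads

$$\frac{1}{\gamma} \frac{\partial f}{\partial t} = \sum_{k \in \{0, k_1, \ldots, k_m\}} \frac{\partial}{\partial P'_k} \left\{ \frac{1}{2} \left(1 + \delta_{k,0}\right) \left[\frac{\partial H}{\partial P'_k}\, f + \theta \frac{\partial f}{\partial P'_k}\right] \right\} + \sum_{k \in \{k_1, \ldots, k_m\}} \frac{\partial}{\partial P''_k} \left\{ \frac{1}{2} \left[\frac{\partial H}{\partial P''_k}\, f + \theta \frac{\partial f}{\partial P''_k}\right] \right\}, \tag{12.37}$$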
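where $P'_0 \equiv P_0$, the sum over the imaginary parts $P''_k$ runs over the set of $m$ independent nonzero wave vectors, and the sum over the real parts $P'_k$ also includes $k = 0$. Here the Fourier-transformed Hamiltonian is given by

$$H = \Delta V \left\{ \sum_k \left[\frac{1}{2} \left(\alpha + c k^2\right) |P_k|^2 - \lambda_{-k}(t) P_k\right] + \frac{\beta}{4} N^{-1} \sum_{k_1 + k_2 + k_3 + k_4 = 0} P_{k_1} P_{k_2} P_{k_3} P_{k_4} \right\}. \tag{12.38}$$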
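Some of the variables in (12.38) are dependent according to $P'_{-k} \equiv P'_k$ and $P''_{-k} \equiv -P''_k$, and $\lambda_k(t) = \lambda'_k(t) + i \lambda''_k(t)$ is the Fourier transform of $\lambda(x,t)$. The Fokker–Planck equation (12.37) can be written explicitly as

$$\frac{1}{\gamma} \frac{\partial f}{\partial t} = \sum_{k \in \{0, k_1, \ldots, k_m\}} \frac{\partial}{\partial P'_k} \left\{ \Delta V f \left[\left(\alpha + c k^2\right) P'_k + \beta S'_k - \lambda'_k(t)\right] + \frac{\theta}{2} \left(1 + \delta_{k,0}\right) \frac{\partial f}{\partial P'_k} \right\} + \sum_{k \in \{k_1, \ldots, k_m\}} \frac{\partial}{\partial P''_k} \left\{ \Delta V f \left[\left(\alpha + c k^2\right) P''_k + \beta S''_k - \lambda''_k(t)\right] + \frac{\theta}{2} \frac{\partial f}{\partial P''_k} \right\}, \tag{12.39}$$

where

$$S'_k = N^{-1} \sum_{k_1 + k_2 + k_3 = k} \left( P'_{k_1} P'_{k_2} P'_{k_3} - 3 P'_{k_1} P''_{k_2} P''_{k_3} \right), \tag{12.40}$$

$$S''_k = N^{-1} \sum_{k_1 + k_2 + k_3 = k} \left( -P''_{k_1} P''_{k_2} P''_{k_3} + 3 P''_{k_1} P'_{k_2} P'_{k_3} \right). \tag{12.41}$$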
The simplest case is the spatially homogeneous polarization when only the $k = 0$ mode is retained in (12.39) with a spatially homogeneous external field $\lambda(x,t) = \lambda_0(t) = A \sin(\omega t)$. In this case we have

$$\frac{1}{\gamma} \frac{\partial f}{\partial t} = \frac{\partial}{\partial P_0} \left\{ V f \left[\alpha P_0 + \beta P_0^3 - A \sin(\omega t)\right] + \theta \frac{\partial f}{\partial P_0} \right\}. \tag{12.42}$$

This equation has been solved numerically by using a difference scheme with special exponential-type substitution described in [98]. The numerical solution has been found within $P_0 \in [-2; 2]$ at the values of parameters $\gamma = V = \beta = 1$, $\alpha = -1$, $\theta = 0.05$, $A = 0.309$, and $\omega = 10^{-3}$. The boundary conditions $f(\pm 2, t) = 0$ and the initial condition

$$f(P_0, 0) = \frac{1}{\sqrt{2\pi}\, \sigma} \exp\!\left[-\frac{\left(P_0 - \tilde P\right)^2}{2\sigma^2}\right] \tag{12.43}$$

have been used with $\sigma = 0.3$, $\tilde P = -1$ for $P_0 < 0$, and $\tilde P = 1$ for $P_0 > 0$. The calculated mean polarization $\langle P_0 \rangle$ depending on the external field $\lambda_0(t)$ forms a hysteresis loop shown in Figure 12.16.

Further we shall consider a quasi-one-dimensional case, where a three-dimensional ferroelectric sample is stretched out in the $x$ direction, so $L_x \gg L_y$ and $L_x \gg L_z$ hold for the linear sizes. In this case we assume that the polarization as well as the external field depend only on the coordinate $x$. This means that the wave vectors also have only one nonvanishing component, which is a scalar quantity $k = (2\pi/L_x) \cdot n$, where $n = 0, \pm 1, \pm 2, \ldots, \pm m$. As the first step, we include only one ($m = 1$) independent nonzero wave vector $k_1 = 2\pi/L_x$ (in total $N = 3$ wave vectors $k = -k_1, 0, k_1$) and homogeneous external field $\lambda(x,t) = (1/\sqrt{3})\, \lambda_0(t) = (1/\sqrt{3})\, A \sin(\omega t)$. Furthermore, we assume that the probability distribution function in real space is translation invariant at the initial time. Due to the translation symmetry of the model, it holds also at later
Figure 12.16 The polarization hysteresis: the mean polarization $\langle P_0 \rangle$ versus normalized external field $\lambda_0(t)/A$ calculated numerically at the values of parameters $\gamma = V = \beta = 1$, $\alpha = -1$, $\theta = 0.05$, $A = 0.309$, and $\omega = 10^{-3}$.
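times. In the Fourier representation, this means that the probability distribution function depends on the modulus of $P_{k_1}$, but not on its phase. Thus, we have

$$f = f\left(P_0, P'_{k_1}, P''_{k_1}, t\right) = \frac{\hat f\left(P_0, |P_{k_1}|, t\right)}{2\pi\, |P_{k_1}|},$$

where $\hat f\left(P_0, |P_{k_1}|, t\right)$ is the probability density in the $\left(P_0, |P_{k_1}|\right)$ space. It obeys the Fokker–Planck equation

$$\frac{1}{\gamma} \frac{\partial \hat f}{\partial t} = \frac{\partial}{\partial P_0} \left\{ \Delta V \hat f \left[\alpha P_0 + \beta \left(\frac{1}{3} P_0^3 + 2 P_0 |P_{k_1}|^2\right) - A \sin(\omega t)\right] + \theta \frac{\partial \hat f}{\partial P_0} \right\} + \frac{\partial}{\partial |P_{k_1}|} \left\{ \Delta V \hat f \left[\left(\alpha + c k_1^2\right) |P_{k_1}| + \beta \left(|P_{k_1}|^3 + P_0^2 |P_{k_1}|\right)\right] + \frac{\theta}{2} \left(\frac{\partial \hat f}{\partial |P_{k_1}|} - \frac{\hat f}{|P_{k_1}|}\right) \right\}. \tag{12.44}$$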
Since $f$ is finite, $\hat{f}$ vanishes at $|P_{k_1}| = 0$. The physical boundary conditions correspond to zero flux at the boundaries $P_0 = \pm\infty$, $|P_{k_1}| = 0$, and $|P_{k_1}| = \infty$. An appropriate initial condition has to be chosen which fulfills these relations, e.g. $\hat{f}\left(P_0, |P_{k_1}|, 0\right) \propto |P_{k_1}| \exp\left(-a_0 P_0^2 - a_1 |P_{k_1}|^2\right)$.

In the present section a multidimensional Fokker–Planck equation has been derived, which describes the polarization-switching kinetics in a ferroelectric in the presence of an external field. The probability distribution function entering this equation depends on a set of Fourier amplitudes. An example calculation has been performed in a spatially homogeneous approximation, retaining only the zero mode $k = 0$. The calculated mean polarization versus external field forms a hysteresis loop, as observed in real ferroelectrics.
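The homogeneous calculation based on (12.42) and (12.43) can be illustrated with a short numerical sketch. It is not the scheme of [98]: as an assumption, it uses an explicit finite-volume update with an exponentially fitted (Scharfetter–Gummel type) flux, which is one common way to realize an "exponential-type" discretization. The function names, the grid size `n`, the time step `dt` and the sampling interval are illustrative choices, and the mean is divided by the current probability mass, so the overall prefactor of the initial condition does not matter.

```python
import math


def bernoulli(x):
    """Bernoulli function B(x) = x / (exp(x) - 1), with B(0) = 1."""
    if abs(x) < 1e-10:
        return 1.0 - 0.5 * x
    return x / math.expm1(x)


def solve_homogeneous_fpe(A=0.309, omega=1e-3, t_end=10.0, n=80, dt=0.005,
                          alpha=-1.0, beta=1.0, gamma=1.0, V=1.0, theta=0.05,
                          sigma=0.3, p_max=2.0, sample_every=200):
    """Integrate (12.42) on [-p_max, p_max] with f = 0 at both ends.

    Returns a list of (t, field, mean_polarization, mass) samples.
    The default dt is chosen (assumption) to keep the explicit update stable
    for the default grid and parameters.
    """
    h = 2.0 * p_max / n
    grid = [(i - n / 2) * h for i in range(n + 1)]  # symmetric nodes
    # Initial condition (12.43): Gaussians centred at -1 (P0 < 0) and +1 (P0 > 0).
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    f = [norm * math.exp(-(abs(p) - 1.0) ** 2 / (2.0 * sigma ** 2)) for p in grid]
    f[0] = f[n] = 0.0                     # boundary conditions f(+-p_max, t) = 0
    D = gamma * theta                     # diffusion coefficient
    mids = [0.5 * (grid[i] + grid[i + 1]) for i in range(n)]

    def moments(t):
        mass = h * sum(f)
        mean = h * sum(p * fp for p, fp in zip(grid, f)) / mass
        return (t, A * math.sin(omega * t), mean, mass)

    samples = [moments(0.0)]
    steps = int(round(t_end / dt))
    for k in range(steps):
        field = A * math.sin(omega * k * dt)
        flux = []
        for i in range(n):
            pm = mids[i]
            # Drift velocity v = -gamma V (alpha P + beta P^3 - field).
            v = -gamma * V * (alpha * pm + beta * pm ** 3 - field)
            pe = v * h / D                # cell Peclet number
            # Exponentially fitted flux between nodes i and i + 1.
            flux.append(D / h * (bernoulli(-pe) * f[i] - bernoulli(pe) * f[i + 1]))
        for i in range(1, n):             # interior nodes only
            f[i] -= dt * (flux[i] - flux[i - 1]) / h
        if (k + 1) % sample_every == 0 or k + 1 == steps:
            samples.append(moments((k + 1) * dt))
    return samples
```

Plotting the third entry of each sample against the second over at least one full period $2\pi/\omega$ traces a loop of the kind shown in Figure 12.16. With $\omega = 10^{-3}$ this takes more than a million time steps, so a vectorized or implicit variant would be preferable in practice; a larger `omega` is convenient for quick experiments.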
12.8 Exercises
E 12.1 Generalization of zero-range model
Consider a generalization of the zero-range model, where one particle from each of $N$ boxes on a ring can hop forwards with the transition rate $u(n, m)$, where $n$ is the number of particles in the actual box and $m$ the number in the destination box. By analogy with the zero-range model, formulate the master equation describing, in the mean-field approximation, the dynamics of the probability $p(n, t)$ of having just $n$ particles in a box at time $t$.

E 12.2 Particle-hopping model
Consider a string of $M$ subsequent boxes in the particle-hopping model defined in the previous exercise and formulate the master equation for the probability $p_M(n_1, n_2, \ldots, n_M; t)$ of having $n_1$ particles in box 1, $n_2$ particles in box 2, and so on at time $t$. Consider the thermodynamic limit of a large total number of boxes $N \to \infty$ with periodic boundary conditions and use the $M$th order cluster approximation

$$P_{M+1}(n_1, n_2, \ldots, n_{M+1}; t) \approx \frac{P_M(n_1, \ldots, n_M; t)\; P_M(n_2, \ldots, n_{M+1}; t)}{P_{M-1}(n_2, \ldots, n_M; t)} \tag{12.45}$$

to obtain a closed equation for any $M$. Find the condition at which the stationary probability distribution calculated from this master equation factorizes (for any $M$) as in the case of the zero-range model. If the factorization takes place for transition rates which depend both on $n$ and $m$, then we have the so-called misanthrope process.

E 12.3 Flux-density fundamental diagram
Consider the particle-hopping model defined in the previous two exercises with transition rates

$$u(n, m) = f(n)\, g(m), \tag{12.46}$$

where

$$\begin{aligned}
f(0) &= 0, \\
f(1) &= \text{independent constant}, \\
f(n) &= f_\infty \left(1 + \frac{b_1}{n^{\sigma_1}}\right) : n \ge 2, \\
g(n) &= g_\infty \left(1 + \frac{b_2}{(n + 1)^{\sigma_2}}\right).
\end{aligned} \tag{12.47}$$
Calculate the flux-density fundamental diagram based on the master equation in the mean-field approximation. Consider different values of the parameters $b_1$, $b_2$, $\sigma_1$, $\sigma_2$ and compare the results with those of the zero-range model, corresponding to $b_2 = 0$.

E 12.4 Equilibrium fluctuations
Consider the Fokker–Planck equation (12.42) describing the spatially homogeneous polarization fluctuations. Find the condition at which this equation has the equilibrium solution in the form of the Boltzmann distribution.
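For Exercise E 12.3, the rates (12.46) and (12.47) can be coded directly, and a small simulation can serve as a cross-check of a mean-field result. The sketch below is not the master-equation calculation the exercise asks for: it is a continuous-time Monte Carlo simulation of the hopping model on a ring, added here as an illustration. All default parameter values (`f1`, `f_inf`, `b1`, `sigma1`, `g_inf`, `b2`, `sigma2`) and the function names are hypothetical, and the rates are assumed to be non-negative.

```python
import bisect
import random


def hop_rate(n, m, f1=1.0, f_inf=1.0, b1=1.0, sigma1=0.5,
             g_inf=1.0, b2=0.0, sigma2=1.0):
    """Transition rate u(n, m) = f(n) g(m) as in (12.46) and (12.47)."""
    if n == 0:
        f = 0.0                      # an empty box cannot emit a particle
    elif n == 1:
        f = f1                       # independent constant f(1)
    else:
        f = f_inf * (1.0 + b1 / n ** sigma1)
    g = g_inf * (1.0 + b2 / (m + 1) ** sigma2)
    return f * g


def simulate_ring(occupation, hops, seed=0, **rate_params):
    """Simulate forward hops on a ring; return (occupation, flux per box)."""
    rng = random.Random(seed)
    occ = list(occupation)
    n_boxes = len(occ)
    t = 0.0
    done = 0
    for _ in range(hops):
        # Cumulative rates for hops from box i to box i + 1 (periodic).
        cumulative = []
        acc = 0.0
        for i in range(n_boxes):
            acc += hop_rate(occ[i], occ[(i + 1) % n_boxes], **rate_params)
            cumulative.append(acc)
        total = acc
        if total <= 0.0:
            break                    # no particle can move
        t += rng.expovariate(total)  # waiting time to the next hop
        # First box whose cumulative rate exceeds the random threshold.
        i = bisect.bisect_right(cumulative, rng.random() * total)
        occ[i] -= 1
        occ[(i + 1) % n_boxes] += 1
        done += 1
    flux = done / (n_boxes * t) if t > 0.0 else 0.0
    return occ, flux
```

Repeating the simulation for several total particle numbers on a fixed ring, and plotting the flux against the density (particles per box), gives a numerical estimate of the fundamental diagram that can be compared with the mean-field prediction; setting `b2=0` corresponds to the zero-range case.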
Epilog
Finally we would like to classify this textbook within the general framework of the natural and engineering sciences. At the most fundamental level, nature is described by the laws of quantum mechanics and elementary particle physics. The explanation of macroscopic physical phenomena, especially in nonequilibrium, on the basis of a microscopic description of the underlying processes is the subject of many scientific papers and books. To any interested reader we recommend the monograph Statistical Mechanics of Nonequilibrium Processes by our colleagues Dmitrii Zubarev, Vladimir Morozov and Gerd Röpke [254], which studies physical processes on a microscopic level. There are also fundamental mathematical studies of stochastic processes [57, 91].

An exact application of the fundamental laws to complex systems, however, is too complicated, as was already pointed out by Paul Dirac in 1929:

"The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble. It therefore becomes desirable that approximate practical methods of applying quantum mechanics should be developed, which can lead to an explanation of the main features of complex atomic systems without too much computation."

This book tends towards practical applications, starting from classical physical systems like supersaturated vapors, and extending the known methods to other complex systems such as traffic flow and financial markets. Taking into account the difficulties mentioned, we begin from a mesoscopic or a macroscopic level of description, where the microscopic fundamental laws are taken into account in an averaged or integrated way.
Since the pioneering works on Brownian motion and its interpretation at the molecular level by Albert Einstein, the study of open systems at a phenomenological level, based on coarse-grained kinetic equations, has remained crucial in classical statistical mechanics. The stochastic theory of nonequilibrium systems allows one to connect isolated pure-state dynamics (also without noise, as deterministic chaos [208]) with open mixed-state dynamics in order to study actively propelled or motor-driven particles. For further reading we recommend the book Statistical Thermodynamics and Stochastic Theory of Nonequilibrium Systems by Werner Ebeling (my former encouraging supervisor at Rostock University, R. M.) and Igor Sokolov [38] as well as many other important references [6, 86, 211, 220].

The authors thank all readers who have persevered to the end of the whole text including 150 figures!
References
1 R. Abraham: Non-standard Analysis, (Princeton University Press, New York 1996).
2 A.Z. Akcasu, J.P. Holloway: Fokker–Planck description of particle transport in finite media: Boundary conditions, Phys. Rev. E 58, 4321 (1998).
3 S. Albeverio, J.E. Fenstad, R. Hoegh-Krohn, T. Lindstrøm: Nonstandard Methods in Stochastic Analysis and Mathematical Physics, (Academic Press, New York 1986).
4 E. Allen: Modeling with Itô Stochastic Differential Equations, (Springer, Dordrecht 2007).
5 D.J. Amit: Field Theory, the Renormalization Group, and Critical Phenomena, (World Scientific, Singapore 1984).
6 V.S. Anishchenko, V. Astakhov, A. Neimann, T. Vadivasova, L. Schimansky-Geier: Nonlinear Dynamics of Chaotic and Stochastic Systems, 2002, 2nd ed. (Springer, Berlin 2007).
7 L. Arnold: Random Dynamical Systems, (Springer, Berlin 1998).
8 V.M. Avdjukhina, A.A. Anishchenko, A.A. Katsnelson, G.P. Revkevich: Nonmonotonic relaxation in hydrogen-saturated alloys Pd-Mo, Perspekt. Materials, n. 4, 5 (2002) (in Russian).
9 V. Balakrishnan, C. Van den Broeck, P. Hänggi: First-passage times of non-Markovian processes: The case of a reflecting boundary, Phys. Rev. A 38, 4213 (1988).
10 M. Bando, K. Hasebe, A. Nakayama, A. Shibata, Y. Sugiyama: Structure stability of congestion in traffic dynamics, Japan. J. Indust. and Appl. Math. 11, 203 (1994).
11 M. Bando, K. Hasebe, A. Nakayama, A. Shibata, Y. Sugiyama: Dynamical model of traffic congestion and numerical simulation, Phys. Rev. E 51, 1035 (1995).
12 M. Bando, K. Hasebe, K. Nakanishi, A. Nakayama, A. Shibata, Y. Sugiyama: Phenomenological study of dynamical model of traffic flow, J. Phys. I France 5, 1389 (1995).
13 R.B. Banks: Growth and diffusion phenomena, (Springer, Berlin 1994).
14 R. Barlovic, L. Santen, A. Schadschneider, M. Schreckenberg: Metastable states in cellular automata for traffic flow, Eur. Phys. J. B 5, 793 (1998).
15 R.J. Baxter: Exactly Solved Models in Statistical Mechanics, (Academic Press, London 1989).
16 R. Becker, W. Döring: Kinetische Behandlung der Keimbildung in übersättigten Dämpfen, Ann. Phys. 24, 719 (1935).
17 J. Bect: A unifying formulation of the Fokker–Planck–Kolmogorov equation for general stochastic hybrid systems, IFAC World Congress, July 6–11, 2008, Seoul, e-print arXiv:0801.3725v2, 26 Feb 2008.
18 J. Bect, H. Baili, G. Fleury: Fokker–Planck–Kolmogorov equation for stochastic differential equations with boundary hitting resets, e-print arXiv:math/0504583v1, 28 Apr 2005.
19 S.C. Benjamin, N.F. Johnson, P.M. Hui: Cellular automata models of traffic flow along a highway containing a junction, J. Phys. A 29, 3119 (1996).
20 J. Bernardini: Grain boundary diffusion in metallic nano and polycrystals, Interface Science 5, 55 (1997).
21 R.V. Bobryk, A. Chrzeszczyk: Transitions in a Duffing oscillator excited by random noise, Nonlin. Dyn. 51, 541 (2008).
22 A.N. Borodin, P. Salminen: Handbook of Brownian Motion – Facts and Formulae, 2nd ed., (Birkhäuser, Basel 2002).
23 K. Burrage, P.M. Burrage: High strong order explicit Runge–Kutta methods for stochastic ordinary differential equations, App. Num. Math. 22, 81 (1996).
24 P.M. Burrage: Numerical methods for stochastic differential equations, Ph.D. Thesis (University of Queensland, Brisbane, Queensland, Australia, 1999).
25 V. Capasso, D. Bakstein: An Introduction to Continuous–Time Stochastic Processes. Theory, Models, and Applications to Finance, Biology, and Medicine, (Birkhäuser, Berlin 2004).
26 B.K. Chakrabarti, L.G. Benguigui: Statistical Physics of Fracture and Breakdown in Disordered Systems, (Clarendon Press, Oxford 1997).
27 P.H. Chavanis: Exact diffusion coefficient of self-gravitating Brownian particles in two dimensions, Eur. Phys. J. B 57, 391 (2007).
28 A. Chetverikov, J. Dunkel: Phase behavior and collective excitations of the Morse ring chain, Eur. Phys. J. B 35, 239 (2003).
29 M.H. Choi, R.F. Fox: Evolution of escape processes with a time-varying load, Phys. Rev. E 66, 031103 (2002).
30 A.J. Chorin, O.H. Hald: Stochastic Tools in Mathematics and Science, (Springer, New York 2006).
31 D. Chowdhury, L. Santen, A. Schadschneider: Statistical physics of vehicular traffic and some related systems, Phys. Reports 329, 199 (2000).
32 D. Chowdhury, A. Schadschneider, K. Nishinari: Physics of transport and traffic phenomena in biology: From molecular motors and cells to organisms, Phys. Life Rev. 2, 318 (2005).
33 K. Christensen, N.R. Moloney: Complexity and Criticality, (Imperial College Press 2005).
34 W.T. Coffey, Yu.P. Kalmykov, J.T. Waldron: The Langevin Equation, (World Scientific, New Jersey 2004).
35 B. Dybiec, L. Schimansky-Geier: Emergence of bistability in noisy systems with single-well potential, Eur. Phys. J. B 57, 313 (2007).
36 M.I. Dykman, P.V.E. McClintock: What can stochastic resonance do? Nature 391, 344 (1998).
37 W. Ebeling, P.S. Landa, V.G. Ushakov: Self-oscillations in ring Toda chains with negative friction, Phys. Rev. E 63, 046601 (2001).
38 W. Ebeling, I.S. Sokolov: Statistical Thermodynamics and Stochastic Theory of Nonequilibrium Systems, (World Scientific, New Jersey 2005).
39 P. & T. Ehrenfest: Über zwei bekannte Einwände gegen das Boltzmannsche H-Theorem, Physikalische Zeitschrift 8, 311 (1907).
40 A. Einstein: Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen, Annalen der Physik 17, 549 (1905).
41 U. Erdmann, W. Ebeling, L. Schimansky-Geier, F. Schweitzer: Brownian particles far from equilibrium, Eur. Phys. J. B 15, 105 (2000).
42 M.R. Evans: Phase transitions in one-dimensional nonequilibrium systems, Braz. J. Phys. 30, 42 (2000).
43 M.R. Evans, T. Hanney: Nonequilibrium statistical mechanics of the zero-range process and related models, J. Phys. A: Math. Gen. 38, R195 (2005).
44 H. Ez-Zahraouy, Z. Benrihane, A. Benyoussef: The optimal velocity traffic flow models with open boundaries, Eur. Phys. J. B 36, 289 (2003).
45 L. Farkas: The velocity of nucleus formation in supersaturated vapours, Z. Phys. Chemie 125, 236 (1927).
46 W. Feller: An Introduction to Probability Theory and its Applications, Vol. I (John Wiley & Sons, New York 1968).
47 W. Feller: An Introduction to Probability Theory and its Applications, Vol. II, 2nd ed. (John Wiley & Sons, New York 1971).
48 R.F. Fox, M.H. Choi: Rectified Brownian motion and kinesin motion along microtubules, Phys. Rev. E 63, 051901 (2001).
49 T.D. Frank: Nonlinear Fokker–Planck Equations, (Springer, Berlin 2005).
50 G. Gallavotti: Heat and fluctuations from order to chaos, Eur. Phys. J. B 61, 1 (2008).
51 A. Gambassi: Relaxation phenomena at criticality, Eur. Phys. J. B 64, 379 (2008).
52 L. Gammaitoni, P. Hänggi, P. Jung, F. Marchesoni: Stochastic resonance, Rev. Mod. Phys. 70, 223 (1998).
53 F.R. Gantmacher: The Theory of Matrices, (Chelsea, New York 1959); Russian ed. (Nauka, Moscow 1967); German ed. (Deutscher Verlag der Wissenschaften, Berlin 1986).
54 J. García-Ojalvo, J.M. Sancho: Noise in Spatially Extended Systems, Institute for Nonlinear Science (Springer, New York 1999).
55 C.W. Gardiner: Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, 1985, 3rd ed. (Springer, Berlin 2004).
56 N.M. Ghoniem: Stochastic theory of diffusional planar-atomic clustering and its application to dislocation loops, Phys. Rev. B 39, 11810 (1989).
57 I.I. Gikhman, A.V. Skorokhod: Introduction to the Theory of Random Processes, (Dover, New York 1996).
58 D. Gillespie: Exact stochastic simulation of coupled chemical reactions, J. Phys. Chem. 81, 2340 (1977).
59 C. Godrèche: Dynamics of condensation in zero-range process, J. Phys. A 36, 6313 (2003).
60 I.S. Gradshteyn, I.M. Ryzhik, Alan Jeffrey (ed.), Daniel Zwillinger (ed.): Table of Integrals, Series, and Products, 6th ed. (Academic Press, San Diego 2000).
61 M. Grigoriu: Stochastic Calculus. Applications in Science and Engineering, (Birkhäuser, Boston 2002).
62 M. Grmela: Thermodynamics of a driven system, Phys. Rev. E 48, 919 (1993).
63 S. Grosskinsky, G.M. Schütz, H. Spohn: Condensation in the zero-range process: Stationary and dynamical properties, J. Stat. Phys. 113, 389 (2003).
64 H. Haken: Synergetics. Introduction and Advanced Topics, (Springer, Berlin 2004).
65 P. Hänggi, H. Grabert, P. Talkner, H. Thomas: Bistable systems: master equation versus Fokker–Planck modeling, Phys. Rev. A 29, 371 (2002).
66 P. Hänggi, P. Talkner, M. Borkovec: Reaction-rate theory: fifty years after Kramers, Rev. Mod. Phys. 62, 251 (1990).
67 R.J. Harris, A. Rákos, G.M. Schütz: Current fluctuations in the zero-range process with open boundaries, J. Stat. Mech., P08003 (2005).
68 R.J. Harris, A. Rákos, G.M. Schütz: Breakdown of Gallavotti–Cohen symmetry for stochastic dynamics, Europhys. Lett. 75, 227 (2006).
69 R.J. Harris, R.B. Stinchcombe: Scaling approach to related disordered stochastic and free-fermion models, Phys. Rev. E 75, 031104 (2007).
70 S. Harris: Absorbing-boundary limit for Brownian motion: Demonstration for a model, Phys. Rev. A 36, 3392 (1987).
71 D. Helbing: Improved fluid-dynamic model for vehicular traffic, Phys. Rev. E 51, 3164 (1995).
72 D. Helbing: Theoretical foundation of macroscopic traffic models, Physica A 219, 375 (1995).
73 D. Helbing: High-fidelity macroscopic traffic equations, Physica A 219, 391 (1995).
74 D. Helbing: Derivation and empirical validation of a refined traffic flow model, Physica A 233, 253 (1996).
75 D. Helbing: Empirical traffic data and their implications for traffic modelling, Phys. Rev. E 55, R25 (1997).
76 D. Helbing: Verkehrsdynamik. Neue physikalische Modellierungskonzepte, (Springer, Berlin 1997).
77 D. Helbing: Traffic and related self-driven many-particle systems, Rev. Mod. Phys. 73, 1067 (2001).
78 D. Helbing, B. Tilch: Generalized force model of traffic dynamics, Phys. Rev. E 58, 133 (1998).
79 D. Helbing, M. Treiber, A. Kesting: Understanding interarrival and interdeparture time statistics from interactions in queuing systems, Physica A 363, 62 (2006).
80 D. Helbing, M. Treiber: Gas-kinetic-based traffic model explaining observed hysteretic phase transitions, Phys. Rev. Lett. 81, 3042 (1998).
81 M. Henkel, G. Schütz: Boundary-induced phase transitions in equilibrium and non-equilibrium systems, Physica A 206, 187 (1994).
82 J. Hinkel: Applications of Physics of Stochastic Processes to Vehicular Traffic Problems, (Dissertation, Univ. Rostock 2007).
83 J. Hinkel, R. Mahnke: Outflow probability for drift–diffusion dynamics, Int. J. Theor. Phys. 46, 1542 (2007).
84 J. Honerkamp: Stochastische Dynamische Systeme, (VCH, Weinheim 1990); English ed. Stochastic Dynamical Systems, (VCH, New York 1994).
85 J. Honerkamp: Statistical Physics. An Advanced Approach with Applications, (Springer, Berlin 1998).
86 W. Horsthemke, R. Lefever: Noise-Induced Transitions. Theory and Applications in Physics, Chemistry, and Biology, 1984, 2nd ed. (Springer, Berlin 2006).
87 H. Huang, N.M. Ghoniem: Formulation of a moment method for multidimensional Fokker–Planck equations, Phys. Rev. E 51, 5251 (1995).
88 I. Jeon, P. March: Condensation transition for zero range invariant measures, Can. Math. Soc. Conf. Proc. 26, 233 (2000).
89 Y. Kafri, E. Levine, D. Mukamel, G.M. Schütz, J. Török: Criterion for phase separation in one-dimensional driven systems, Phys. Rev. Lett. 89, 035702 (2002).
90 Y. Kafri, E. Levine, D. Mukamel, G.M. Schütz, R.D. Willmann: Phase-separation transition in one-dimensional driven models, Phys. Rev. E 68, 035101(R) (2003).
91 I. Karatzas, St. E. Shreve: Brownian Motion and Stochastic Calculus, 1988, 4-th ed. (Springer, New York 1997).
92 A.A. Katsnelson, A.I. Olemskoi, I.V. Sukhorukova, G.P. Revkevich: Self–oscillation processes during the structure relaxation of palladium–metal alloys (Pd–W) saturated with hydrogen, Physics-Uspekhi 165, 331 (1995).
93 J. Kaupužs: Fluctuations in ferroelectrics and dielectric properties from the Fokker–Planck equation, physica status solidi (b) 195, 325 (1996).
94 J. Kaupužs: Critical exponents predicted by grouping of Feynman diagrams in ϕ⁴ model, Ann. Phys. (Leipzig) 10, 299 (2001).
95 J. Kaupužs, E. Klotins: Spatiotemporal correlations of local polarization in ferroelectrics, Ferroelectrics 296, 239 (2003).
96 J. Kaupužs, R. Mahnke: A stochastic multi-cluster model of freeway traffic, Eur. Phys. J. B 14, 793 (2000).
97 J. Kaupužs, R. Mahnke, R.J. Harris: Zero-range model of traffic flow, Phys. Rev. E 72, 056125 (2005).
98 J. Kaupužs, J. Rimshans: Numerical solution of semiconductor Fokker–Planck kinetic equations, In: Proc. of the European Congress ECCOMAS 2000, Barcelona, Spain, pp. 1–18, 2000.
99 J. Kaupužs, H. Weber, J. Tolmacheva, R. Mahnke: Applications to traffic breakdown on highways, In: Progress in Industrial Mathematics at ECMI 2002, A. Buikis, R. Ciegis, A.D. Fitt, eds., pp. 133–138, (Springer, Berlin 2004).
100 B.S. Kerner: The Physics of Traffic, (Springer, Berlin 2004).
101 B.S. Kerner, H. Rehborn: Experimental properties of complexity in traffic flow, Phys. Rev. E 53, R4275 (1996).
102 D.O. Kharchenko, A.V. Dvornichenko: Phase transitions induced by thermal fluctuations, Eur. Phys. J. B 61, 95 (2008).
103 M. Kiessling, C. Lancellotti: The linear Fokker–Planck equation for the Ornstein–Uhlenbeck process as an (almost) nonlinear kinetic equation for an isolated N-particle system, J. Stat. Phys. 123, 525 (2006).
104 P.E. Kloeden, E. Platen: Numerical Solution of Stochastic Differential Equations, 1992, Corr. 3rd ed. (Springer, Berlin 1999).
105 W. Knospe, L. Santen, A. Schadschneider, M. Schreckenberg: Towards a realistic microscopic description of highway traffic, J. Phys. A 33, L477 (2000).
106 W. Knospe, L. Santen, A. Schadschneider, M. Schreckenberg: Human behavior as origin of traffic phases, Phys. Rev. E 65, 015101 (2002).
107 M. Krbalek: Equilibrium distribution in a thermodynamical traffic gas, J. Phys. A: Math. Theor. 40, 5813 (2007).
108 M. Krbalek, D. Helbing: Determination of interaction potentials in freeway traffic from steady-state statistics, Physica A 333, 370 (2004).
109 N. Krepysheva, L. Di Pietro, M.-C. Néel: Fractional diffusion and reflective boundary condition, Physica A 368, 355 (2006).
110 R. Kühne, R. Mahnke: Controlling traffic breakdowns, In: Transportation and Traffic Theory (Ed.: H.S. Mahmassani), pp. 229–244 (Elsevier Ltd., Oxford 2005).
111 R. Kühne, R. Mahnke, J. Hinkel: Understanding traffic breakdown: A stochastic approach, In: Transportation and Traffic Theory (Eds.: R.E. Allsop, M.G.H. Bell, B.G. Heydecker), pp. 777–790 (Elsevier Ltd., Oxford 2007).
112 R. Kühne, R. Mahnke, I. Lubashevsky, J. Kaupužs: Probabilistic description of traffic breakdowns, Phys. Rev. E 65, 066125 (2002).
113 D. Labudde, R. Mahnke, V. Frischfeld: Monte Carlo simulation of thermodynamic systems with cluster formation under different boundary conditions, Comp. Phys. Comm. 106, 181 (1997).
114 P. Landa: Nonlinear Oscillations and Waves in Dynamical Systems, (Kluwer Academic Publ., Dordrecht 1996).
115 P.S. Landa, A.A. Zaikin, L. Schimansky-Geier: Influence of additive noise on noise-induced phase transitions in nonlinear chains, Chaos, Solitons and Fractals 9, 1367 (1998).
116 P.S. Landa, A.A. Zaikin, V.G. Ushakov, J. Kurths: Influence of additive noise on transitions in nonlinear systems, Phys. Rev. E 61, 4809 (2000).
117 G. Lamm, K. Schulten: Extended Brownian dynamics. II. Reactive, nonlinear diffusion, J. Chem. Phys. 78, 2713 (1983).
118 M.E. Lárraga, J.A. del Río, A. Mehta: Two effective temperatures in traffic flow models: Analogies with granular flow, Physica A 307, 527 (2002).
119 M. Lax, W. Cai, M. Xu: Random Processes in Physics and Finance, (Oxford University Press, New York 2006).
120 E. Levine, G. Ziv, L. Gray, D. Mukamel: Traffic jams and ordering far from thermal equilibrium, Physica A 340, 636 (2004).
121 A.J. Lichtenberg, M.A. Lieberman: Regular and Stochastic Motion, (Springer, New York 1983).
122 Ch. Liebe: Stochastik der Verkehrsdynamik: Von Zeitreihen-Analysen zu Verkehrsmodellen, (Diplom, Univ. Rostock 2006).
123 Ch. Liebe, R. Mahnke, J. Kaupužs, H. Weber: Vehicular motion and traffic breakdown: Evaluation of energy balance, In: Traffic and Granular Flow ’07 (Eds.: C. Appert-Rolland, F. Chevoir, Ph. Gondret, S. Lassarre, J.-P. Lebacque, M. Schreckenberg), (Springer–Verlag, Berlin 2008).
124 I.M. Lifshitz, V.V. Slyozov: The kinetics of precipitation from supersaturated solid solutions, J. Phys. Chem. Solids 19, 35 (1961).
125 V. Linetsky: The spectral representation of Bessel processes with constant drift: Applications in queueing and finance, J. Appl. Prob. 41, 327 (2004).
126 V. Linetsky: On the transition densities for reflected diffusions, Adv. Appl. Prob. 37, 435 (2005).
127 O.J. O’Loan, M.R. Evans, M.E. Cates: Jamming transition in a homogeneous one-dimensional system: the bus route model, Phys. Rev. E 58, 1404 (1998).
128 I. Lubashevsky, R. Friedrich, R. Mahnke, A. Ushakov, N. Kubrakov: Boundary singularities and boundary conditions for the Fokker–Planck equations, e-print arXiv:math-ph/0612037v1, 12 Dec 2006.
129 I. Lubashevsky, M. Hajimahmoodzadeh, A. Katsnelson, P. Wagner: Noise–induced phase transition in an oscillatory system with dynamical traps, Eur. Phys. J. B 36, 115 (2003).
130 I. Lubashevsky, S. Kalenkov, R. Mahnke: Towards a variational principle for motivated vehicle motion, Phys. Rev. E 65, 036140 (2002).
131 I. Lubashevsky, R. Mahnke: Order-parameter model for unstable multilane traffic flow, Phys. Rev. E 62, 6082 (2000).
132 I. Lubashevsky, R. Mahnke, M. Hajimahmoodzadeh, A. Katsnelson: Long-lived states of oscillator chains with dynamical traps, Eur. Phys. J. B 44, 63 (2005).
133 I. Lubashevsky, R. Mahnke, P. Wagner, S. Kalenkov: Long-lived states in synchronized traffic flow: Empirical prompt and dynamical trap model, Phys. Rev. E 66, 016117 (2002).
134 I. Lubashevsky, P. Wagner, R. Mahnke: Rational-driver approximation in car-following theory, Phys. Rev. E 68, 056109 (2003).
135 I. Lubashevsky, P. Wagner, R. Mahnke: Bounded rational driver models, Eur. Phys. J. B 32, 243 (2003).
136 J. Luczka, M. Niemiec, P. Hänggi: First-passage time for randomly flashing diffusion, Phys. Rev. E 52, 5810 (1995).
137 Shang-Keng Ma: Modern Theory of Critical Phenomena, (W.A. Benjamin, New York 1976).
138 A.J. MacConnell: Applications of Tensor Analysis, (Dover Publications, New York 1957).
139 R. Mahnke: Zur Evolution in nichtlinearen dynamischen Systemen, (Habilitation, Univ. Rostock 1990).
140 R. Mahnke: Nichtlineare Physik in Aufgaben, Teubner Studienbücher Physik, (Teubner, Stuttgart 1994).
141 R. Mahnke: Aggregation phenomena to a single cluster regime under different boundary conditions, Zeitschr. f. Phys. Chem. (Leipzig) 204, 85 (1998).
142 R. Mahnke: Probabilistic description of nucleation in vapours and on roads, In: Interface and Transport Dynamics – Computational Modelling (Eds.: H. Emmerich, B. Nestler, M. Schreckenberg), pp. 361–389, (Springer, Berlin 2003).
143 R. Mahnke, A. Budde: A new formula for the binding energy of clusters, Zeitschr. f. Phys. Chem. (Leipzig) 271, 857 (1990).
144 R. Mahnke, A. Budde: Monte-Carlo-experiments to aggregation phenomena in complex systems, In: Models of Selforganisation in Complex Systems MOSES (Eds.: W. Ebeling, M. Peschel, W. Weidlich), Mathematical Research, Vol. 64, pp. 164–171, (Akademie-Verlag, Berlin 1991).
145 R. Mahnke, A. Budde: Pattern formation by cellular automata, Journal of Mathematical Modelling and Simulation in System Analysis (Syst. Anal. Model. Simul.) 10, 133 (1992).
146 R. Mahnke, H. Hartmann: Keimbildung in übersättigten Gasen und auf überfüllten Autobahnen, In: Irreversible Prozesse und Selbstorganisation (Eds.: T. Pöschel, H. Malchow, L. Schimansky-Geier), pp. 97–112, (Logos–Verlag, Berlin 2006).
147 R. Mahnke, J. Kaupužs: Stochastic theory of freeway traffic, Phys. Rev. E 59, 117 (1999).
148 R. Mahnke, J. Kaupužs: Probabilistic description of traffic flow, Networks and Spatial Economics 1, 103 (2001).
149 R. Mahnke, J. Kaupužs, V. Frishfelds: Nucleation in physical and nonphysical systems, Atmospheric Research 65, 261 (2003).
150 R. Mahnke, J. Kaupužs, J. Hinkel, H. Weber: Applications of thermodynamics to driven systems, Eur. Phys. J. B 57, 463 (2007).
151 R. Mahnke, J. Kaupužs, I. Lubashevsky: Probabilistic description of traffic flow, Physics Reports 408, 1–130 (2005).
152 R. Mahnke, R. Kühne, J. Kaupužs, I. Lubashevsky, R. Remer: Stochastic description of traffic breakdown, In: Noise in Complex Systems and Stochastic Dynamics, L. Schimansky-Geier, D. Abbott, A. Neiman, Ch. Van den Broeck, eds., Proc. SPIE 5114, 126 (2003).
153 R. Mahnke, N. Pieret: Stochastic master-equation approach to aggregation in freeway traffic, Phys. Rev. E 56, 2666 (1997).
154 R. Mahnke, J. Schmelzer, G. Röpke: Nichtlineare Phänomene und Selbstorganisation, Teubner Studienbücher Physik, (Teubner, Stuttgart 1992).
155 R. Mahnke, H. Urbschat, A. Budde: Nucleation and condensation to a single equilibrium cluster regime in a Monte Carlo experiment, Zeitsch. f. Physik D 20, 399 (1991).
156 H. Malchow, L. Schimansky-Geier: Noise and Diffusion in Bistable Nonequilibrium Systems, Teubner-Texte zur Physik, Vol. 5, (Teubner, Leipzig 1986).
157 R. Mannella: Integration of stochastic differential equations on a computer, Int. J. Mod. Phys. 13, 1117 (2004).
158 F. Marchesoni: Conceptual design of a molecular shuttle, Phys. Lett. A 237, 126 (1998).
159 M. Martin: The source solution for diffusion with a linearly position dependent diffusion coefficient, Zeitschrift für Physikalische Chemie, NF, 162, 245 (1989).
160 R.M. Mazo: Brownian motion: Fluctuations, Dynamics, and Applications, (Clarendon Press, Oxford 2006).
161 B. McCoy, T.T. Wu: The Two-Dimensional Ising Model, (Harvard University Press 1973).
162 S.V.G. Menon, D.C. Sahni: Derivation of the diffusion equation and radiation boundary condition from the Fokker–Planck equation, Phys. Rev. A 32, 3832 (1985).
163 R. Metzler: Non-homogeneous random walks, generalized master equations, fractional Fokker–Planck equations, and the generalized Kramers–Moyal expansion, Eur. Phys. J. B 19, 249 (2001).
164 M.A. Miller, J.P.K. Doye, D.J. Wales: Structural relaxation in atomic clusters: Master equation dynamics, Phys. Rev. E 60, 3701 (1999).
165 E.W. Montroll, B.J. West: On an enriched collection of stochastic processes, In: Studies in Statistical Mechanics, vol. VII: Fluctuation Phenomena, ed. by E.W. Montroll and J.L. Lebowitz, (North Holland Publ., Amsterdam 1979).
166 D. Mukamel: Phase transitions in nonequilibrium systems, In: Soft and Fragile Matter. Nonequilibrium Dynamics, Metastability and Flow, (Eds.: M.E. Cates and M.R. Evans), p. 205, (Institute of Physics Publishing, Bristol 2000).
167 A. Münster: Statistical Thermodynamics, vol. I, (Springer, Berlin 1969).
168 K. Nagel: Particle hopping models and traffic flow theory, Phys. Rev. E 53, 4655 (1996).
169 D. Jost, K. Nagel: Probabilistic traffic flow breakdown in stochastic car following models, In: Traffic and Granular Flow ’03 (Eds.: S.P. Hoogendoorn, S. Luding, P.H.L. Bovy, M. Schreckenberg, D.E. Wolf), pp. 87–103, (Springer, Berlin 2005).
170 K. Nagel, M. Schreckenberg: A cellular automaton model for freeway traffic, J. Phys. I France 2, 2221 (1992).
171 K.R. Naqvi, K.J. Mork, S. Waldenstrøm: Reduction of the Fokker–Planck equation with an absorbing or reflecting boundary to the diffusion equation and the radiation boundary condition, Phys. Rev. Lett. 49, 304 (1982).
172 E. Nelson: Dynamical Theories of Brownian Motion, (Princeton University Press 1967).
173 G.F. Newell: Mathematical models of freely moving traffic, Oper. Res. 9, 209 (1961).
174 H. Nobach: Vorteile Klassischer Signal- und Datenverarbeitungsverfahren in der Optischen Strömungsmesstechnik, (Habilitation, Univ. Darmstadt 2007).
175 B. Øksendal: Stochastic Differential Equations, 1985, 6th ed. (Springer, Berlin 2003).
176 B. Øksendal, A. Sulem: Applied Stochastic Control of Jump Diffusion, (Springer, Berlin 2005).
177 L. Onsager: Crystal Statistics. I. A two-dimensional model with an order–disorder transition, Phys. Rev. 65, 117 (1944).
178 H. Öttinger: Computer simulation of reptation theories. I. Doi-Edwards and Curtiss-Bird models, J. Chem. Phys. 91, 6455 (1989).
179 G. Parisi, N. Sourlas: Random magnetic fields, supersymmetry, and negative dimensions, Phys. Rev. Lett. 43, 744 (1979).
180 W. Paul, J. Baschnagel: Stochastic Processes. From Physics to Finance, (Springer, Berlin 1999).
181 A. Pelissetto, E. Vicari: Critical phenomena and renormalization-group theory, Physics Reports 368, 549 (2002).
182 E.A.J.F. Peters, Th.M.A.O.M. Barenbrug: Efficient Brownian dynamics simulation of particles near walls. I. Reflecting and absorbing walls, II. Sticky walls, Phys. Rev. E 66, 056701, 056702 (2002).
183 A.S. Pikovsky, J. Kurths: Coherence Resonance in a noise-driven excitable system, Phys. Rev. Lett. 78, 775 (1997).
184 I. Prigogine, R. Herman: Kinetic Theory of Vehicular Traffic, (Elsevier, New York 1971).
185 S. Redner: A Guide to First-Passage Processes, (Cambridge University Press, New York 2001).
186 P. Réfrégier: Noise Theory and Application to Physics. From Fluctuations to Information, (Springer, New York 2004).
187 T. Reichenbach, E. Frey, T. Franosch: Traffic jams induced by rare switching events in two-lane transport, New J. Phys. 9, 159 (2007).
188 H. Reiss, A.D. Hammerich, E.W. Montroll: Thermodynamic treatment of nonphysical systems: Formalism and an example (single-lane traffic), J. Stat. Phys. 42, 647 (1986).
189 R. Remer: Theorie und Simulation von Zeitreihen mit Anwendungen auf die Aktienkursdynamik, (Dissertation, Univ. Rostock 2005).
190 R. Remer, R. Mahnke: Application of Heston model and its solution to German DAX data, Physica A 344, 236 (2004).
191 R. Remer, R. Mahnke: Stochastic volatility models and their application to German DAX data, Fluctuation and Noise Letters 4, R67 (2004).
192 R. Remer, R. Mahnke: Application of the Heston and Hull–White models to German DAX data, Quantitative Finance 4, 685 (2004).
193 H. Risken: The Fokker–Planck Equation: Methods of Solutions and Applications, 1984, 3rd ed. (Springer, Berlin 1996).
194 G. Röpke: Statistische Mechanik für das Nichtgleichgewicht, (Deutscher Verlag der Wissenschaften, Berlin 1987).
195 J. Rudnick, G. Gaspari: Elements of the Random Walk. An Introduction for Advanced Students and Researchers, (Cambridge University Press 2004).
196 S. Ruffo: Equilibrium and nonequilibrium properties of systems with long-range interactions, Eur. Phys. J. B 64, 355 (2008).
197 Yu.B. Rumer, M.Sh. Ryvkin: Thermodynamics, Statistical Physics and Kinetics, (Mir Publishers, Moscow 1980).
198 A. Schadschneider: The Nagel–Schreckenberg model revisited, Eur. Phys. J. B 10, 573 (1999).
199 A. Schadschneider, M. Schreckenberg: Traffic flow models with ‘slow-to-start’ rules, Ann. Phys. (Leipzig) 6, 541 (1997).
200 L. Schimansky-Geier, Th. Pöschel (eds.): Stochastic Dynamics, Lecture Notes in Physics, Vol. 484 (Springer, Berlin 1998).
201 R.B. Schinazi: Classical and Spatial Stochastic Processes, (Birkhäuser, Boston 1999).
202 J. Schmelzer (ed.): Nucleation Theory and Applications, (Wiley-VCH, Weinheim 2005).
203 J. Schmelzer, G. Röpke, R. Mahnke: Aggregation Phenomena in Complex Systems, (Wiley-VCH, Weinheim 1999).
204 J. Schnackenberg: Network theory of microscopic and macroscopic behavior of master equation systems, Rev. Mod. Phys. 48, 571 (1976).
205 M. Schreckenberg, A. Schadschneider, K. Nagel, N. Ito: Discrete stochastic models for traffic flow, Phys. Rev. E 51, 2939 (1995).
206 M. Schulz: Statistical Physics and Economics, (Springer, New York 2003).
207 M.F. Schumaker: Boundary conditions and trajectories of diffusion processes, J. Chem. Phys. 117, 2469 (2002).
208 H.G. Schuster: Deterministic Chaos. An Introduction, 2nd ed. (VCH, Weinheim 1989).
209 G.M. Schütz: Critical phenomena and universal dynamics in one-dimensional driven diffusive systems with two species of particles, J. Phys. A: Math. Gen. 36, R339 (2003).
210 F. Schweitzer (ed.): Self-Organization of Complex Structures, (Gordon and Breach Science Publ., Amsterdam 1977).
211 F. Schweitzer: Brownian Agents and Active Particles. Collective Dynamics in the Natural and Social Sciences, (Springer, Berlin 2003).
212 U. Seifert: Stochastic thermodynamics: Principles and perspectives, Eur. Phys. J. B 64, 423 (2008).
213 J.P. Sethna: Statistical Mechanics: Entropy, Order Parameters and Complexity, (Oxford University Press, New York 2006).
214 R. Seydel: Tools for Computational Finance, (Springer, Berlin 2004).
215 A. Siegert: On the first passage time probability, Phys. Rev. 81, 617 (1951).
216 R. da Silveira: An introduction to breakdown phenomena in disordered systems, Am. J. Phys. 67, 1177 (1999).
217 A. Singer, Z. Schuss, D. Holcman: Narrow escape, Part I–III, J. Stat. Phys. 122, 437; 465; 491 (2006).
218 E. Smith: Thermodynamic dual structure of linear-dissipative driven system, Phys. Rev. E 72, 036130 (2005).
219 I.M. Sokolov: Solution of a class of non-Markovian Fokker–Planck equations, Phys. Rev. E 66, 041101 (2002).
220 D. Sornette: Critical Phenomena in Natural Sciences, (Springer, Berlin 2004).
221 F. Spitzer: Interaction of Markov processes, Adv. Math. 5, 246 (1970).
222 D. Stirzaker: Stochastic Processes and Models, (Oxford University Press, New York 2005).
223 S.X. Sun: Path summation formulation of the master equation, Phys. Rev. Lett. 96, 210602 (2006).
224 P. Szymczak, A.J.C. Ladd: Boundary conditions for stochastic solutions of the convection–diffusion equation, Phys. Rev. E 68, 036704 (2003).
225 M. Takayasu, H. Takayasu: 1/f noise in a traffic model, Fractals 1, 860 (1993).
226 T. Taniguchi, E.G.D. Cohen: Nonequilibrium steady state thermodynamics and fluctuations for stochastic systems, J. Stat. Phys. 130, 633 (2008).
227 M. Treiber, A. Hennecke, D. Helbing: Derivation, properties, and simulation of a gas-kinetic-based nonlocal traffic model, Phys. Rev. E 59, 239 (1999).
228 St. Trimper: Master equation and two heat reservoirs, Phys. Rev. E 74, 051121 (2006).
229 H. Ulbricht, J. Schmelzer, R. Mahnke, F. Schweitzer: Thermodynamics of Finite Systems and the Kinetics of First-Order Phase Transitions, Teubner-Texte zur Physik, vol. 17, (Teubner, Leipzig 1988).
230 H. Ulbricht, F. Schweitzer, R. Mahnke: Nucleation theory and dynamics of first-order phase transitions in finite systems, In: Selforganisation by Nonlinear Irreversible Processes (Eds.: W. Ebeling, H. Ulbricht), pp. 23–36, Springer Series in Synergetics, vol. 33 (Springer, Berlin 1986).
231 G.E. Uhlenbeck, L.S. Ornstein: On the theory of the Brownian motion, Phys. Rev. 36, 823 (1930).
232 C. Van den Broeck, J.M.R. Parrondo, R. Toral: Noise-induced nonequilibrium phase transition, Phys. Rev. Lett. 73, 3395 (1994).
233 C. Van den Broeck, J.M.R. Parrondo, R. Toral, R. Kawai: Nonequilibrium phase transitions induced by multiplicative noise, Phys. Rev. E 55, 4084 (1997).
234 N.G. Van Kampen: Stochastic Processes in Physics and Chemistry, 1981, 2nd edn (North Holland Publ, Amsterdam 1992).
235 J. Voit: The Statistical Mechanics of Financial Markets, (Springer, Berlin 2001).
236 M. Volmer: Kinetik der Phasenbildung, (Th. Steinkopff, Dresden 1939).
237 M. Volmer, A. Weber: Nuclei formation in supersaturated states, Z. Phys. Chemie 119, 227 (1926).
238 C. Wagner: Theory of precipitate change by redissolution, Z. Elektrochemie 65, 581 (1961).
239 P. Wagner: How human drivers control their vehicle, Eur. Phys. J. B 52, 427 (2006).
240 P. Wagner, K. Nagel: Comparing traffic flow models with different number of phases, Eur. Phys. J. B 63, 315 (2008).
241 H. Weber, R. Mahnke, Ch. Liebe, J. Kaupužs: Dynamics and thermodynamics of traffic flow, In: Traffic and Granular Flow ’07 (Eds.: C. Appert-Rolland, F. Chevoir, Ph. Gondret, S. Lassarre, J.-P. Lebacque, M. Schreckenberg), (Springer, Berlin 2008).
242 M.F. Wehner, W.G. Wolfer: Numerical evaluation of path-integral solutions to Fokker–Planck equations. II. Restricted stochastic processes, Phys. Rev. A 28, 3003 (1983).
243 G.B. Whitham: Exact solution for a discrete system arising in traffic flow, Proc. R. Soc. London, Ser. A 428, 49 (1990).
244 D.T. Wu: Nucleation theory, In: Solid State Physics, vol. 50, (Eds.: H. Ehrenreich, F. Spaepen), p. 37 (Academic Press, San Diego 1997).
245 A.A. Zaikin, J. Garcia-Ojalvo, L. Schimansky-Geier: Nonequilibrium first-order phase transitions induced by additive noise, Phys. Rev. E 60, R6275 (1999).
246 A.A. Zaikin, L. Schimansky-Geier: Spatial patterns induced by additive noise, Phys. Rev. E 58, 4355 (1998).
247 G.M. Zaslavsky: Dynamical traps, Physica D 168–169, 292 (2002).
248 Ya.B. Zeldovich: Theory of the formation of a new phase. Cavitation, Sov. Phys. JETP 12, 525 (1942).
249 A. Zettl: Sturm–Liouville Theory, (Providence, American Mathematical Society, 2005).
250 J.W. Zhang, Y. Zou, L. Ge: A force model for single-line traffic, Physica A 376, 628 (2007).
251 X. Zhang, G. Hu: 1/f noise in a two-lane highway traffic model, Phys. Rev. E 52, 4664 (1995).
252 J.M. Ziman: Models of Disorder, (Cambridge University Press, 1979).
253 J. Zinn-Justin: Quantum Field Theory and Critical Phenomena, (Clarendon Press, Oxford 1996).
254 D. Zubarev, V. Morozov, G. Röpke: Statistical Mechanics of Nonequilibrium Processes, Vol. 1 and 2 (Wiley-VCH, Berlin 1997).
Index
a
absorbing boundary, 44, 90, 199, 216
active Brownian particle, 149
active particle, 306, 343
adjoint operator, 119
advection–diffusion problem, 32
aggregation, XV, 85, 342
–reaction limited, 284
anomalous behavior, 357
arithmetic Brownian motion, 173
Arrhenius ansatz, 104
Arrhenius, Svante, 104
attachment, 281, 290, 322

b
Bachelier, Louis, XIV, 5
backward Fokker–Planck equation, 26, 36, 39, 240
backward Kolmogorov equation, 27
backward transition, 92
balance equation, 82, 86, 216
Bessel equation, 135
Bessel function, 136, 368
Bethe–Weizsäcker formula, 283
bifurcation, 287, 356
–subcritical, 308
–supercritical, 308
bifurcation diagram, 154
binding energy, 282, 283
binomial distribution, 183, 185, 212
birth-and-death process, 85
bistability, 99
Black–Scholes equation, 177
Boltzmann distribution, 111, 410
Boltzmann, Ludwig, 115
Boltzmann–Gibbs distribution, 147
boundary condition, 91
boundary layer, 56
–scaling, 61
boundary singularity, 37, 40
–vector, 38
boundary trap, 34
Box–Muller method, 171
breakdown function, 217
breakdown probability
–cumulative, 228
breakdown probability density, 197, 226
breakdown probability distribution, 204
breakdown rate, 92
Brown, Robert, XII, 4, 77, 212
Brownian motion, XIV, XV, 4, 5, 77, 212
–3d velocity space, 160
–direct integration, 161
–first moment, 161
–harmonic analysis, 163
–second moment, 162
–variance, 163
Brownian particle, 123, 146, 176
Brownian path, 29
Brusselator, 116
bumper-to-bumper distance, 306

c
canonical ensemble, 148
capillary length, 285
car cluster, 321
car cluster model, 336
car dynamics, 309
–acceleration, 312
–deceleration, 312
–vector field, 310
car ensemble, 320
car following model, 303
car system
–total energy, 345
Carus, Titus Lucretius, XII
Cauchy problem, 256
Cauchy sequence, 8, 17
cellular automata, XIV, 391, 408
chaos, XII
Chapman–Kolmogorov equation, 23, 31, 32, 34, 35, 79, 81, 89, 120
Chebyshev inequality, 5
chemical potential, 284, 298
circular road
–one-lane, 320
circular traffic, 306
cluster
–binding energy, 283
–critical, 277
cluster dissolution, 93
cluster distribution, 281
cluster distribution function, 279
cluster size
–average, 325
–stationary, 326
–time evolution, 325
collective dynamics, 342
colored noise, 159
complex number, 249
condensation, 85, 395, 404
conditional probability density, 77
congested flow, 331
congested traffic, 317
continuity equation, 117, 146, 187
cooperative behavior, 306
cooperative phenomenon, 355
correlation function, 158, 165, 273
covariance, 273
creative action, 357
critical cluster
–curvature, 287
critical cluster size, 286, 292
critical exponent, 308, 309
critical phenomena, XV
critical point, 308
criticality, 332
cumulative breakdown probability, 232
cumulative life-time distribution, 217
cumulative probability, 217
current fluctuations, 407
curvature, 283

d
Darwin, Charles, 211
de Broglie length, 279, 282
decay process, 92
deceleration force, 314
depletion, 278
detachment, 290
detachment rate, 283, 284
detailed balance, 83, 85, 88, 111, 113, 281, 289, 293, 347, 400
dichotomic process, 99
dichotomous noise, 151
differential equation
–first-order, 306
–second-order, 238
differential operator, second-order, 119
diffusion, XV, 31
–absorbing boundaries, 210
–boundaries, 208
–finite interval, 208
–finite interval, 193
–mixed boundaries, 193
–natural boundaries, 186
–semi-open, 209
–separation ansatz, 135, 194
diffusion coefficient, 133, 146, 186, 364
diffusion equation, XIV
–one-dimensional, 186
–superposition, 200
diffusion layer, 31
diffusion length, 295
diffusion process, 19, 22–24, 27, 32
diffusion tensor, 38, 46, 47
diffusional growth, 278
Dirac function, 37
Dirac, Paul, 415
discrete random walk, 211
displacement, 35
dissipative system, 343
dissolution, 93
double-well potential, 87, 154
drift field, 46
drift velocity, 54
drift–diffusion dynamics
–eigenfunction, 219
drift–diffusion motion
–first moment, 125
drift–diffusion problem
–bounded, 119
–dimensionless, 216
–one-dimensional, 119
driven system, 342, 395
drunkard’s walk, 181
dynamical trap, XV, 372
dynamics, deterministic, 154
e
Ebeling, Werner, 416
econophysics, 85, 177, 269
Ehrenfest, Paul and Tatiana, 115
eigenvalue equation, 84
eigenvalue problem, 195
Einstein formula, 163, 258
Einstein relation, 147, 148
Einstein, Albert, XIV, 5, 212, 260, 415
energy balance, 343, 352
energy dissipation, 149
energy exchange, XI
energy flux, 345
energy input, 343
ensemble-averaging, 165
equilibrium distribution, 83
equilibrium phase transition, 355
equilibrium state, 163
erratic motion, XI
Euler discretization, 29
Euler formula, 157
evaporation, 281
evolution, 7
exponential decay, 218

f
fast diffusion boundary, 44
fat tails, 269
Fermi distribution, 104
Fermi’s golden rule, 81
ferroelectric, 409
Feynman diagram, 409
Fick’s law, 187
Fick, Adolf, 187
filtration, 7
financial market, 177
finite-size effect, 308, 341
first moment, 124
first-passage time, 126, 242
first-passage time distribution, 197, 217
first-passage time problem
–absorbing boundary, 252
–mixed boundaries, 252
first-order phase transition, 277, 336
first-passage problem, 90
first-passage time, 89, 402
first-passage time distribution
–semi-finite interval, 199
flow vector, 310
fluctuation-dissipation relation, 147
fluctuation-dissipation theorem, 163, 165, 258
flux-density relation, 329
Fokker, Adriaan, XIV
Fokker–Planck dynamics, 124
–V-shaped potential, 252
–linear potential, 251
Fokker–Planck equation, XIV, 27, 31, 32, 35, 117, 148, 409
–backward, 118, 128, 240
–boundary conditions, 31, 32
–conservation form, 43
–derivation, 126
–forward, 117, 127, 128
–mixed boundaries, 215
–multidimensional, 117, 118
–multidimensional, 145
–natural boundaries, 213
–one-dimensional, 118
–stationary solution, 167
–Sturm–Liouville type, 131
Fokker–Planck operator, 37, 118, 121
Fokker–Planck dynamics
–comparison, 130
force
–accelerating, 344
–decelerating, 344
forward Fokker–Planck equation, 26, 42
Fourier representation, 157, 409
Fourier space, 189, 409
Fourier transformation, 187, 255, 410
–inverse, 261
fractal media, XV
free energy, 282, 294, 348, 356
–ideal gas, 294
free flow solution, 307
friction
–nonlinear, 149
friction force, 149
fundamental diagram, 329, 407

g
Galton board, 183, 184, 211
Galton, Francis, 183, 211
Gauss theorem, 42
Gaussian distribution, 58, 166, 173, 186, 190, 206, 214, 269
–moments, 214
Gaussian profile, 187
Gaussian white noise, 78, 146, 151, 168, 409
–multiplicative, 316
generating function, 36, 98
Genetic model, 368
geometric Brownian motion, 28, 173, 177, 269, 369
geometric random walks, 369
German stock index, 270
Girsanov transformation, 23
grand canonical ensemble, 393
graph theory, 109
gravitational potential, 148
Green function, 33, 34, 37, 39
ground state, 220

h
Hamiltonian dynamics, XVI
Hamiltonian
–many-particle system, 281
Hänggi–Klimontovich process, 166, 361
harmonic oscillator, 176
–dynamical trap, 373
harmonic potential, 143
Hausdorff dimension, 29
heat bath, XI, 104, 149
Heston model, 270, 274
homogeneous flow, 308
Hull–White model, 270
hyperbolic tangent, 304
hysteresis, 308, 342

i
ideal gas model, 284
impermeability, 37
impermeable boundary, 44
Ingenhousz, Jan, XII
initial-boundary-value problem, 215, 233
interaction potential, 150, 344
internal energy, 345
Ising model, 99
Ito diffusion, 29
Ito formula, 20, 27
Ito integral, 15, 17, 28
Ito process, 18, 361
–diffusion, 19, 20
–drift, 19, 20
–increment, 19
–path, 19
–quadratic variation, 18
Ito stochastic calculus, 167
Ito stochastic integral, 167
Ito stochastic process, 166

j
jam formation, 320
jam shrinkage, 94
joint probability density, 77
Julien Bect, 31

k
kinetic coefficient, 357
kinetic equation, 297
Kirchhoff diagram, 112
Kirchhoff’s method, 110
Kirchhoff, Gustav, 109
Kolmogorov backward equation, 25
Kolmogorov forward equation, 25
Kramers–Moyal approach, 40
Kramers–Moyal expansion, XIV, 31, 127
Kronecker delta, 47

l
Landau theory, 297
Landau–Ginzburg Hamiltonian, 409
Langevin equation, XV, 19, 128, 316, 409
–additive noise, 152
–multidimensional, 145
–one-dimensional, 151
–overdamped limit, 148
Langevin force, 38, 54, 145, 151, 355
Langevin, Paul, XIV
Laplace operator, 147
Laplace transformation, 141, 242
–inverse, 243
lattice random walk, 54, 58, 62
law of large numbers, 36, 169
Lennard–Jones potential, 304
Levy walk, 212
Levy, Paul, 212
life-time, 217
Lifshitz–Slyozov–Wagner theory, 278
limit cycle, 308, 312
Liouville, Joseph, 238
liquid–gas interface, 295
liquid–gas system, 293, 295, 298
local homogeneity, 35
log-normal distribution, 175
logarithmic return, 271
long-lived state, 372, 386
long-range interaction, 148
Lyapunov exponent, 307
Lyapunov function, 374
m
many-particle system, XI, 99
–vehicular traffic, 351
–Hamiltonian, 147
Markov chain, 183
Markov dichotomic system
–diagonalization method, 101
–master equation, 100
–time evolution matrix, 103
Markov process, 14, 20, 23, 79, 167, 322
–history, 80
Markov property, 20, 79, 120, 212
Martin, Manfred, 133
martingale, 18
master equation, XV, 81, 82, 288
–boundaries, 86
–matrix form, 83
–multidimensional, 290
–one-step, 322
–three-level system, 105
–transition rate, 81
–two heat reservoirs, 104
maximal tree, 110
maximum value distribution, 206
Maxwell distribution, 142, 258
mean first-passage time, 123, 124, 338
mean reverting process, 270
mean value, 191, 273
mean-field approximation, 99, 394
mean-field dynamics, 394, 400
mean-reverting process, 253
metastability, XVI, 332, 392, 399
metastable state, 405
metric tensor, 47
mirror method, 200
molecular dynamics, 305
moment, nth order, 191
Monte Carlo method, 288
Monte Carlo simulation, 95, 404
Morse potential, 149, 150
Morse ring chain, 149
multi-lane effect, 336
multi-lane traffic flow, 385

n
Nagel–Schreckenberg model, 317, 391
narrow escape problem, 123
Newton’s third law, 305
Nobach, Holger, 157
noise, 316
–1/f noise, 159
–spectral density, 164
noise amplitude, 317
noise-induced transition, 151
noise-induced transport, 355
nonequilibrium phase transition, 357
non-Gaussian behavior, XVI
nonlinear Langevin force, 361
normal distribution, 27, 28, 78
normalization, 191
normalized eigenfunctions, 197
nucleation, 277, 342
–free energy, 282
–isothermal–isochoric, 280
–multi-droplet case, 289
–on roads, 287
–single-droplet case, 279
–supersaturated vapor, 286
nucleation theory, 290
nucleation time, 402

o
one-step master equation, 86
one-step process, 85, 88, 90
one-step processes, 328
optimal velocity, 322
optimal velocity function, 306
optimal velocity model, 304, 306, 312, 344, 352
order parameter, 357, 385
order parameter theory, 355
Ornstein, Leonard Salomon, 273
Ornstein–Uhlenbeck process, 27, 29, 160, 255, 273
orthogonality, 196
–Bessel functions, 137
oscillator ensemble, 373
Ostwald ripening, 277, 278, 292
outflow probability, 124

p
particle conservation, 279
particle-hopping model, 391
–asymmetric, 391
–totally asymmetric, 391
partition function, 281
Pauli exclusion principle, 392
Pearson, Karl, 181
periodic boundary conditions, 410
periodic perturbation, 307
phase equilibrium, 298
phase separation, XVI, 392
phase transition, XV, 148, 308, 409
–noise-induced, 355
phase-space distribution, 355
Planck, Max, XIV
Poisson distribution, 98
Poisson integral, 190
Poisson process, 93, 114
polar coordinates, 190
Polar method, 171
polarization, 409
polarization hysteresis, 411
Polya, George, 184
Pontryagin technique, 40
population genetics, 368
potential energy, 344
power-like singularity, 308
precluster, 324
Prigogine, Ilya, 116
probability current, 86, 124, 271
probability density
–first-passage time, 171
probability density distribution, 271
probability density function, 148
probability distribution, 78, 324
probability flux, 216
probability flux operator, 43
probability outflow, 203
probability theory, XII, 18

q
quadratic variation, 5, 6

r
Röpke, Gerd, 415
random force, XII, 360
random number, 171
–uniform distribution, 171
random path, XII
random process, XIV
random variable, 4, 8, 16, 17, 22
random walk, XIV, 46, 54, 56, 85, 181, 391
–boundary condition, 184
–moments, 184
–one-dimensional, 182
–recurrence, 184
–symmetric, 182
random walk experiment, 182
randomness, 347
Rayleigh formula, 123
reaction limited aggregation, 290
reaction–diffusion equation, 82
recurrence relation, 87
reduced Fokker–Planck equation, 218
reflected diffusion, 208
reflecting boundary, 215
reflection principle, 170
relaxation, 297
relaxation dynamics, 332
relaxation time, 334
Remer, Ralf, 255
root-mean-square, 182
Runge–Kutta method, 375

s
scale invariance, 181
scaling property, 169
Schlögl model, 115
Schlögl, Friedrich, 115
Schmelzer, Jürn, 278
Schrödinger equation, 143
self-adjoint operator, 119
self-driven motion, 145
self-gravity, 149
self-organization, XV, 342
semi-infinite interval, 200
semigroup, 23, 24
separation ansatz, 218
shadow path, 170
Siegert, Arnold, 124
sigmoidal function, 304
single-lane traffic, 342
slow-to-start rule, 395, 408
soft matter physics, 149
soliton, 150
spectral analysis, 157
spectral density, 158
spectral representation, 189
spinodal decomposition, 278
Spitzer, Frank, XVI
stability analysis, 307, 380
stability region, 307
stagnation, 371
standard deviation, 192
stationarity, 82
stationary distribution, 82
steady state, 307
steady-state distribution, 365
Stirling formula, 185, 282, 294
stochastic collision, 160
stochastic differential equation, 23, 27, 152, 347
–drift–diffusion, 173
–multiplicative noise, 173
–one-dimensional, 166
stochastic dynamics, XVI, 33
stochastic equation, XV
stochastic force, properties, 160
stochastic hybrid systems, 31
stochastic integration
–intermediate point, 361
stochastic modeling, 254
stochastic motion, 31, 38
stochastic process, XIV, 4, 78, 80
–exclusion process, 391
–Markov approximation, XIV, 80
–Stratonovich type, 364
–totally asymmetric, 391
–zero-range, 391
stochastic realization, 153
stochastic system
–Ito type, 362
–Langevin source, 358
stochastic tool, XVI
stochastic trajectory, 327
–maximum, 205
stochastic volatility, 269
stochasticity, XII
stock market, XIV, XVI, 269
stock price dynamics, 269
stopping time, 169
Stratonovich process, 166, 361
Stratonovich stochastic calculus, 167
Sturm, Jacques Ch. F., 238
Sturm–Liouville operator, 132, 238
–self-adjoint, 132, 239
Sturm–Liouville problem, 117, 134
Sturm–Liouville theory, 238
sub-diffusion, 32
subcritical bifurcation, 155
subgraph
–connected, 110
super-diffusion, 32
supercritical bifurcation, 154
superposition, 223
supersaturated vapor, 289
supersaturation, 278, 292
survival function
–open system, 211
–semi-open system, 209
symmetry breakdown, 32
synchronized flow, 343
Szilard, Leo, 277, 299

t
Taylor expansion, 21, 186, 297
thermodynamic equilibrium, 83, 147
thermodynamic limit, 282, 329, 398
thermodynamic potential, 88, 289
–free energy, 281
Thiele, Thorvald N., XIV
time confinement, 35, 72
time delay, 303
time lag, 199, 204, 402
–tangent construction, 205
time-series analysis, XII
time-averaging, 165
Toda chain, 150
Toda potential, 149
traffic
–flux-density relation, 331
–fundamental diagram, 329
traffic breakdown, 215
traffic breakdown probability, 89
traffic flow, XVI, 93, 303
–chemical potential, 351
–force model, 304
–free energy, 351
–phase diagram, 308
–temperature, 348
–thermodynamics, 343
traffic jam, 342
traffic model
–optimal velocity, 303
–time lag, 303
transcendental equation, 219, 229, 245
–roots, 249
transition frequency, 324
transition matrix, 83, 334
transition probability, 324
translation symmetry, 411
transport, 32
transportation, 293
Trimper, Steffen, 104
two-level system, 99

u
Uhlenbeck, George Eugene, 273
uncorrelated process, 78
universality, 181
urn model, 115

v
vector field, 310
vehicular flux, 329
vehicular interaction, 344
vehicular traffic, XV
Verhulst model, 365
volatility, 270
w
wave equation, 194, 218
wave number, 195, 218
–ground state, 221
white noise, 152, 409
–additive, 145
Wiener increment, 128
Wiener process, 4, 6, 15, 22, 27, 32, 152, 168, 317
–boundaries, 171
–finite interval, 171
–increment, 153, 169
–moments, 168
–trajectory, 172
–variance, 168
Wiener trail, 29
Wiener, Norbert, 5, 7
Wiener–Khinchin theorem, 159

z
zero-flux relationship, 87
zero-range model, 393, 404
zero-range process, XVI, 391