Lecture Notes in Mathematics
Editors: J.-M. Morel, Cachan; B. Teissier, Paris
Editors Mathematical Biosciences Subseries: P.K. Maini, Oxford
For further volumes: http://www.springer.com/series/304
2058
Mostafa Bachar Jerry Batzel Susanne Ditlevsen Editors
Stochastic Biomathematical Models with Applications to Neuronal Modeling
Editors Mostafa Bachar King Saud University College of Sciences Department of Mathematics Riyadh, Saudi Arabia
Susanne Ditlevsen University of Copenhagen Department of Mathematical Sciences Copenhagen, Denmark
Jerry Batzel University of Graz Institute for Mathematics and Scientific Computing and Medical University of Graz Institute of Physiology Graz, Austria
ISBN 978-3-642-32156-6
ISBN 978-3-642-32157-3 (eBook)
DOI 10.1007/978-3-642-32157-3
Springer Heidelberg New York Dordrecht London

Lecture Notes in Mathematics ISSN print edition: 0075-8434
ISSN electronic edition: 1617-9692
Library of Congress Control Number: 2012949578
Mathematics Subject Classification (2010): 60Gxx, 92C20, 37N25, 92Bxx

© Springer-Verlag Berlin Heidelberg 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
Why use mathematical or stochastic models to study something as complicated and often poorly understood as dynamics in physiology? Hopefully, this book can provide some partial answers and point to the exciting problems that still remain to be solved, where mathematical and stochastic tools can be useful. In this book we treat basics of stochastic process theory represented by a stochastic differential equation directed towards biological modeling and review the field of neuronal models. Theoretical models must be relevant physiologically to be useful and interesting, and their analysis can provide biological insight and help summarize and interpret experimental data. Predictions can be extracted from the model, and experiments verifying or invalidating the model can be suggested, thereby enhancing physiological understanding. Even if the mechanisms are well understood, simulations from models can explore the consequences of extreme physiological conditions that might be unethical or impossible to reproduce experimentally. The process of building a theoretical model forces one to consider and decide on the essential characteristics of the physiological dynamics, as well as which variables and mechanisms to include. Analysis and numerical simulations of the model illustrate quantitatively and qualitatively the consequences of the assumptions implied in the model. The unifying aim of theoretical modeling and experimental investigation is the elucidation of the underlying biological processes that result in a particular observed phenomenon. Many biological systems are highly irregular, and experiments under controlled conditions show a large trial-to-trial variability, even when keeping the experimental setup fixed. This calls for a stochastic, as opposed to deterministic, modeling approach, especially because ignoring the stochastic phenomena in the modeling may hugely affect the conclusions of the studied biological system. 
In linear systems the noise might only blur the underlying dynamics without qualitatively affecting it, but in nonlinear dynamical systems corrupted by noise, the corresponding deterministic dynamics can be drastically changed. In general, stochastic effects influence the dynamics and may enhance, diminish, or even completely change the dynamic behavior of the system. In certain biological systems, e.g., in auditory
neurons, the noise is even believed to enhance the signal, thus providing a biological justification for the large amount of noise found in living systems. The book treats stochasticity represented by stochastic differential equations, but is not meant to be a comprehensive textbook on stochastic methods. It is primarily intended for mathematicians and life scientists who are interested in seeing an in-depth and motivated presentation of an important application of stochastic methods. Our goal is to provide an illuminating example of where and how stochastic methods can enter into modeling physiological systems. Life scientists generally, with some background in mathematics, will also be able to benefit by seeing how these two areas are linked, and we point the interested reader to references that fill in missing background information. We hope that the material as presented will provide a useful illustration of interdisciplinary research. The reader should have a basic background in probability, differential and integral calculus, and ordinary differential equations. We focus on neuronal modeling, where stochastic models have supplemented deterministic dynamical models since the sixties. There exist already many excellent books on nonlinear dynamics, biomathematical modeling, and computational neuroscience. The aim of this volume is to provide a focused, up-to-date presentation of neuronal modeling using stochastic methods, while in addition both motivating the linkage and providing insight into practical issues. More profit will of course be derived from reading the book, if it is combined with the reading of some introductory texts in computational neuroscience or biomathematical modeling. The book is divided into two parts. The first part introduces some methodology, which is useful when modeling biological systems with stochastic dynamics. 
Chapter 1 is an introduction to stochastic models and a good place to start if the reader has no background in stochastic processes. Also a short overview of statistical methods to estimate model parameters from data is provided. Chapters 2 and 3 need more mathematical preparation and a basic understanding of stochastic processes and probability theory, as well as some notion of measure theory. In Chap. 2, scale and speed measures of one-dimensional diffusion processes are reviewed, which are used to determine expressions for hitting times or first-escape times, in particular, boundary behaviors. These tools are very useful when dealing with neuronal models, as those presented in the second part of the book. Chapter 3 gives an introduction to the theory of large deviations, providing an asymptotic description of the fluctuations of a stochastic system, in particular, providing exponential estimates for the waiting time to rare events. Chapter 4 closes the methodological part indicating how the theory from Chap. 3 can be implemented in practice. These techniques are natural to use when analyzing stochastic models of biological systems. The second part of the book treats neuronal models. Chapter 5 provides a timely review of existing methods and available analytical results for the most common one-dimensional stochastic integrate-and-fire models. These models of neuronal activity collapse the neuronal anatomy into a single point in space, sacrificing realism for mathematical tractability, although they often succeed in predicting neuronal response with considerable accuracy. Chapter 6 goes a step further to more realistic models and includes the spatial dimension of neuronal dynamics. Stochastic
partial differential equation models are reviewed; in particular, the Hodgkin–Huxley and the FitzHugh–Nagumo models are treated in detail. Chapter 7 is dedicated to a probabilistic treatment of FitzHugh–Nagumo systems. Finally, Chap. 8 applies the tools from Chap. 6 to a specific problem, the modeling of spreading cortical depression.

This volume was first conceived as a result of the experience of designing and holding a combined summer school/workshop event on the subject of stochastic modeling in physiology. This event was part of a Marie Curie sponsored series of four training events designed to encourage interdisciplinary research in modeling physiological systems. The events brought together mathematicians, bioengineers, statisticians, medical clinicians, physiologists, and other related life science researchers. The underlying motivation and inspiration for this series of events is that unraveling the complexities of physiological systems and the intricacies of interaction between systems requires development of novel and innovative insights as well as new research approaches and techniques. Furthermore, such new approaches can be strongly stimulated by merging the different perspectives from the mathematical/engineering disciplines on the one side and the life sciences on the other. These observations motivated the design of the events, in which summer schools would introduce new and young researchers to an interdisciplinary treatment of the modeling of key physiological systems, with emphasis on how modeling can address important clinical problems related to these systems. Directly following each summer school an interdisciplinary scientific workshop was held. These workshops focused on the same themes as the preceding summer school and were designed as stand-alone scientific events. Students from the school participated in these workshops, and in this way new and current researchers could interact and develop contacts for collaboration.
The general web page linking and reflecting all four Marie Curie training events can be found at: http://www.uni-graz.at/biomedmath/info.html. The event related to this volume can be found at the event web page: http://www.math.ku.dk/susanne/SummerSchool2008/
Acknowledgements

This book would never have existed without the work of the contributors of this volume. They all participated in and contributed to the success of the combined summer school and workshop, and we would like to thank them all for sharing their insights with us. We are grateful to Michael Sørensen and Franz Kappel for their help and their part in organizing the summer school and the workshop. We wish to thank all the people who attended the summer school and workshop and especially the teachers of the summer school: Andrea De Gaetano, Susanne Ditlevsen, Terese Graversen, Martin Jacobsen, Seema Nanda, Bernt Øksendal, Umberto Picchini, Laura Sacerdote, Michael Sørensen, and Gilles Wainrib. Thanks to Flemming H. Jacobsen for typing parts of the book. We also thank Springer-Verlag and especially Ute McCrory for their support during the production of the book. Finally, this volume would not have been possible without the funds from different sources, mainly the European Union under the program Marie Curie Conferences and Training Course which supported the summer school and workshop under
the project Biomathtech 07-10 (MSCF-CT-2006-045961). We were also supported by the European Society for Mathematical and Theoretical Biology, Forskerskolen i Matematik og Anvendelser, Forskerskolen i Biostatistik, Copenhagen Statistics Network at University of Copenhagen and Biomedical Simulations Resource, University of Southern California. Research by Susanne Ditlevsen was supported by the Danish Council for Independent Research | Natural Sciences, Jerry Batzel was partially funded by FWF (Austria) project P18778-N13, Mostafa Bachar was partially supported by Deanship of Scientific Research, College of Science Research center, King Saud University, Adeline Samson was supported by the Bonus Qualité Recherche from Université Paris Descartes, Laura Sacerdote and Maria Teresa Giraudo were supported by MIUR PRIN 2008, Michèle Thieullen was supported by Agence Nationale de Recherche ANR-09-BLAN-0008-01, and Henry Tuckwell thanks Prof Dr Jürgen Jost for his fine hospitality at MIS MPI.
Mostafa Bachar, Riyadh, Saudi Arabia
Jerry Batzel, Graz, Austria
Susanne Ditlevsen, Copenhagen, Denmark
February 2012
Contents

Part I Methodology

1 Introduction to Stochastic Models in Biology
Susanne Ditlevsen and Adeline Samson
1.1 Introduction
1.2 Markov Chains and Discrete-Time Processes
1.3 The Wiener Process (or Brownian Motion)
1.4 Stochastic Differential Equations
1.5 Existence and Uniqueness
1.6 Itô’s Formula
1.7 Monte Carlo Simulations
  1.7.1 The Euler–Maruyama Scheme
  1.7.2 The Milstein Scheme
1.8 Inference
  1.8.1 Maximum Likelihood
  1.8.2 Bayesian Approach
  1.8.3 Martingale Estimating Functions
1.9 Biological Applications
  1.9.1 Oncology
  1.9.2 Agronomy
References

2 One-Dimensional Homogeneous Diffusions
Martin Jacobsen
2.1 Introduction
2.2 Diffusion Processes
2.3 Scale Function and Speed Measure
2.4 Boundary Behavior
2.5 Expected Time to Hit a Given Level
References

3 A Brief Introduction to Large Deviations Theory
Gilles Wainrib
3.1 Introduction
3.2 Sum of Independent Random Variables
3.3 General Theory
3.4 Some Large Deviations Principles for Stochastic Processes
  3.4.1 Sanov Theorem for Markov Chains
  3.4.2 Small Noise and Freidlin–Wentzell Theory
3.5 Conclusion
References

4 Some Numerical Methods for Rare Events Simulation and Analysis
Gilles Wainrib
4.1 Introduction
4.2 Monte-Carlo Simulation Methods
  4.2.1 Overview of the Different Approaches
  4.2.2 Focus on Importance Sampling
4.3 Numerical Methods Based on Large Deviations Theory
  4.3.1 Quasipotential and Optimal Path
  4.3.2 Numerical Methods
4.4 Conclusion
References

Part II Neuronal Models

5 Stochastic Integrate and Fire Models: A Review on Mathematical Methods and Their Applications
Laura Sacerdote and Maria Teresa Giraudo
5.1 Introduction
5.2 Biological Features of the Neuron
5.3 One Dimensional Stochastic Integrate and Fire Models
  5.3.1 Introduction and Notation
  5.3.2 Wiener Process Model
  5.3.3 Randomized Random Walk Model
  5.3.4 Stein’s Model
  5.3.5 Ornstein–Uhlenbeck Diffusion Model
  5.3.6 Reversal Potential Models
  5.3.7 Comparison Between Different LIF Models
  5.3.8 Jump Diffusion Models
  5.3.9 Boundary Shapes
  5.3.10 Further Models
  5.3.11 Refractoriness and Return Process Models
5.4 Mathematical Methods for the First Passage Time Problem and Their Application to the Study of Neuronal Models
  5.4.1 Analytical Methods
  5.4.2 Numerical Methods
  5.4.3 Simulation Methods
5.5 Estimation Problems for LIF Models
  5.5.1 Samples from Membrane Potential Measures
  5.5.2 Samples of ISIs
References

6 Stochastic Partial Differential Equations in Neurobiology: Linear and Nonlinear Models for Spiking Neurons
Henry C. Tuckwell
6.1 Introduction
6.2 Linear SPDE Neuronal Models: A Brief Summary
  6.2.1 Geometrical or Anatomical Considerations
  6.2.2 Simple Linear SPDE Models
  6.2.3 Two-Component Linear SPDE Systems
6.3 Nonlinear Models for Spiking Neurons
  6.3.1 The Ionic Currents Underlying Neuronal Spiking
  6.3.2 A General SPDE for Nerve Membrane Potential
6.4 Stochastic Spatial Hodgkin–Huxley Model
  6.4.1 Noise-Free Excitation
  6.4.2 Stochastic Stimulation
6.5 A Stochastic Spatial FitzHugh–Nagumo System
  6.5.1 The Effect of Noise on the Probability of Transmission
6.6 Discussion
References

7 Deterministic and Stochastic FitzHugh–Nagumo Systems
Michèle Thieullen
7.1 Introduction
7.2 FN Systems of ODEs
7.3 Large Deviations
7.4 Stochastic Perturbation of FN
7.5 Deterministic FN Including Space Propagation
References

8 Stochastic Modeling of Spreading Cortical Depression
Henry C. Tuckwell
8.1 Introduction
8.2 Reaction–Diffusion Model for Cortical Spreading Depression
  8.2.1 The Standard Parameter Set
8.3 Random Sources of K⁺
  8.3.1 Mainly Uniform K⁺ Sources with An Isolated Region of Higher Activity
8.4 Reduced Exchange-Pump Capacity Over a Small Lesion
8.5 Discussion
References

Glossary
Index
Contributors
Susanne Ditlevsen, Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark
Maria Teresa Giraudo, Department of Mathematics “Giuseppe Peano”, University of Torino, Torino, Italy
Martin Jacobsen, Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark
Laura Sacerdote, Department of Mathematics “Giuseppe Peano”, University of Torino, Torino, Italy
Adeline Samson, CNRS UMR8145, Laboratoire MAP5, Université Paris Descartes, France
Michèle Thieullen, Université Pierre et Marie Curie - Paris 6, Laboratoire de Probabilités et Modèles Aléatoires, Paris cedex 05, France
Henry C. Tuckwell, Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
Gilles Wainrib, Laboratoire Analyse Géométrie et Applications, Institut Galilée, Université Paris 13, Villetaneuse, France
Acronyms

a.s.    Almost surely
ATP     Adenosine triphosphate
CLT     Central limit theorem
CNS     Central nervous system
CTMC    Continuous-time Markov chain
CV      Coefficient of variation
DCEI    Dynamic contrast enhanced imaging
FET     First entrance time
FIM     Fisher information matrix
FN      FitzHugh–Nagumo
FPT     First passage time
GABA    Gamma-aminobutyric acid
GWN     Gaussian white noise
HH      Hodgkin–Huxley
IF      Integrate and Fire
IG      Inverse Gaussian
iid     independent and identically distributed
IS      Importance sampling
ISI     Interspike interval
LDP     Large deviation principle
LIF     Leaky Integrate and Fire
LLN     Law of large numbers
MC      Monte Carlo
MCMC    Markov Chain Monte Carlo
MLE     Maximum likelihood estimator
MLP     Most likely path
MRI     Magnetic resonance imaging
NMDA    N-methyl D-aspartate
NMSE    Normalized mean-square error
ODE     Ordinary differential equation
OU      Ornstein–Uhlenbeck
PDE     Partial differential equation
pdf     Probability density function
PSP     Postsynaptic potential
r.v.    Random variable
RRW     Randomized Random Walk
SD      Spreading cortical depression
SDE     Stochastic differential equation
SPDE    Stochastic partial differential equation
w.r.t.  With respect to
Part I
Methodology
Chapter 1
Introduction to Stochastic Models in Biology Susanne Ditlevsen and Adeline Samson
S. Ditlevsen (✉)
Department of Mathematical Sciences, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark

A. Samson
CNRS UMR8145, Laboratoire MAP5, Université Paris Descartes, 45 rue des Saints Pères, 75006 Paris, France

M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3_1, © Springer-Verlag Berlin Heidelberg 2013

1.1 Introduction

This chapter is concerned with continuous time processes, which are often modeled as a system of ordinary differential equations (ODEs). These models assume that the observed dynamics are driven exclusively by internal, deterministic mechanisms. However, real biological systems will always be exposed to influences that are not completely understood or not feasible to model explicitly. Ignoring these phenomena in the modeling may affect the analysis of the studied biological systems. Therefore there is an increasing need to extend the deterministic models to models that embrace more complex variations in the dynamics. A way of modeling these elements is by including stochastic influences or noise. A natural extension of a deterministic differential equations model is a system of stochastic differential equations (SDEs), where relevant parameters are modeled as suitable stochastic processes, or stochastic processes are added to the driving system equations. This approach assumes that the dynamics are partly driven by noise.

All biological dynamical systems evolve under stochastic forces, if we define stochasticity as the parts of the dynamics that we either cannot predict or understand or that we choose not to include in the explicit modeling. To be realistic, models of biological systems should include random influences, since they are concerned with subsystems of the real world that cannot be sufficiently isolated from effects external to the model. The physiological justification to include erratic behaviors in a model can be found in the many factors that cannot be controlled, such as hormonal oscillations, blood pressure variations, respiration, variable neural control of muscle activity, enzymatic processes, energy requirements, cellular metabolism, sympathetic nerve activity, or individual characteristics like body mass index, genes, smoking, stress impacts, etc. Also to be considered are external influences, such as small differences in the experimental procedure, temperature, differences in preparation and administration of drugs (if this is included in the experiment). In addition, experimental runs may be conducted by different experimentalists who inevitably will exhibit small differences in procedures within the protocols. Different sources of errors will require different modeling of the noise, and these factors should be considered as carefully as the modeling of the deterministic part, in order to make the model predictions and parameter values possible to interpret.

It is therefore essential to understand and investigate the influence of noise in the dynamics. In many cases the noise simply blurs the underlying dynamics without qualitatively affecting it, as is the case with measurement noise or in many linear systems. However, in nonlinear dynamical systems with system noise, the noise will often drastically change the corresponding deterministic dynamics. In general, stochastic effects influence the dynamics, and may enhance, diminish or even completely change the dynamic behavior of the system.
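The contrast between noise that only blurs the dynamics and noise that changes them qualitatively can be sketched numerically. The example below is a minimal illustration under stated assumptions, not a model taken from this chapter: the two drift functions (a linear drift -x and a bistable drift x - x^3), the noise level 0.7, the time horizon, and the step size are all hypothetical choices, and the paths are generated with a basic Euler–Maruyama discretization of dX = f(X) dt + sigma dW.

```python
import numpy as np

def euler_maruyama(drift, sigma, x0, t_end, dt, rng):
    """Simulate one path of dX = drift(X) dt + sigma dW on [0, t_end]."""
    n = int(round(t_end / dt))
    x = np.empty(n + 1)
    x[0] = x0
    # Wiener increments are independent N(0, dt) variables.
    dw = rng.normal(0.0, np.sqrt(dt), size=n)
    for i in range(n):
        x[i + 1] = x[i] + drift(x[i]) * dt + sigma * dw[i]
    return x

def linear(x):
    # Illustrative linear drift: a single stable equilibrium at 0.
    return -x

def double_well(x):
    # Illustrative nonlinear drift: stable equilibria at -1 and +1.
    return x - x**3

rng = np.random.default_rng(seed=0)
for name, drift in (("linear", linear), ("double-well", double_well)):
    det = euler_maruyama(drift, 0.0, 1.0, 200.0, 0.01, rng)  # sigma = 0: the ODE
    sto = euler_maruyama(drift, 0.7, 1.0, 200.0, 0.01, rng)  # same drift plus noise
    print(f"{name:12s} deterministic end value: {det[-1]:+.3f}   "
          f"noisy path range: [{sto.min():+.2f}, {sto.max():+.2f}]")
```

In the linear case the noisy path fluctuates around the equilibrium that the deterministic solution approaches. In the bistable case the deterministic solution started at +1 stays there, whereas the noisy path can cross over to the other well, a behavior that the deterministic equation started at +1 never shows.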
1.2 Markov Chains and Discrete-Time Processes

A sequence of stochastic variables $\{X_n;\ n = 0, 1, \ldots\}$ is called a stochastic process. It could for example be measurements every 5 min of the level of blood glucose for a diabetic patient. The simplest type of stochastic process is one where the random variables are assumed independent, but this is often too simple to capture important features of the data. For example, if the blood glucose is high, we would also expect it to be high 5 min later. The simplest type of stochastic process incorporating dependence between observations is a Markov process.

Definition 1.1 (Markov chain). A stochastic process $\{X_n;\ n = 0, 1, \ldots\}$ which can take values in the state space $I$ is called a discrete-time Markov chain if for each $n = 0, 1, \ldots$,

$$P(X_{n+1} = i_{n+1} \mid X_0 = i_0, \ldots, X_n = i_n) = P(X_{n+1} = i_{n+1} \mid X_n = i_n)$$

for all possible values of $i_0, \ldots, i_{n+1} \in I$, whenever both sides are well-defined. This means that conditionally on the present state of the system, its future and past are independent.

A classical example of a stochastic process in discrete time is a random walk. Consider the random migration of a molecule or a small particle arising from motion due to thermal energy. The particle starts at the origin at time 0. At each time unit the particle moves one distance unit up with probability $p$ or one distance unit down with probability $1 - p$, independent of past movements. The random variable $X_n$ then denotes the position of the particle at time $n$: $X_n = X_{n-1} \pm 1$. This random process $\{X_n\}_{n \in \mathbb{N}_0}$ is a discrete-time Markov chain whose state space is the integers.

Now let $p = 1/2$ and assume that we accelerate the process, so that displacements occur every $\delta$ units of time. At the same time, displacements decrease to $\Delta$ units of distance. What happens in the limit of continuous time and space, i.e. when $\delta \to 0$ and $\Delta \to 0$? Denote by $X(t)$ the position of the particle at time $t$, and assume $X(0) = 0$. Let $K$ denote the number of upward jumps made after a total of $n$ jumps. Then the position of the particle after $n\delta$ units of time is given by

$$X(n\delta) = \Delta\,\big(K \cdot 1 + (n - K)\cdot(-1)\big) = \Delta\,(2K - n).$$

Since displacements occur independently of one another, the random variable $K$ has a binomial distribution with parameters $n$ and $1/2$. Thus,

$$E(X(n\delta)) = \Delta\,(2E(K) - n) = \Delta\,(2n/2 - n) = 0,$$
$$\mathrm{Var}(X(n\delta)) = 4\Delta^2\,\mathrm{Var}(K) = 4\Delta^2\, n\,\tfrac{1}{2}\left(1 - \tfrac{1}{2}\right) = \Delta^2 n.$$

Now let $\delta \to 0$ to obtain a continuous time process. Then

$$\mathrm{Var}(X(t))\big|_{t = n\delta} = \Delta^2 n = \Delta^2 t/\delta.$$

We see that unless $\delta$ and $\Delta$ go to 0 while keeping $\Delta^2$ proportional to $\delta$, the variance will be either 0 or infinite, both cases rather uninteresting! Thus, we put $\Delta^2 = \sigma^2\delta$ for some constant $\sigma > 0$, and obtain a continuous time and space process with $E(X(t)) = 0$ and $\mathrm{Var}(X(t)) = \sigma^2 t$ for all $t \ge 0$. With a little extra work and invoking the central limit theorem, one can show that the limiting process has a Gaussian distribution with zero mean and variance $\sigma^2 t$. For $\sigma = 1$ this process is called the Wiener process (Fig. 1.1).
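The scaling argument above can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `random_walk_endpoint`, the seed, and the values $\delta = 0.01$, $\sigma = 1$, $t = 1$ are illustrative assumptions. It simulates the endpoint $X(t)$ of many accelerated random walks with jump size $\Delta = \sigma\sqrt{\delta}$ and compares the sample mean and variance with $0$ and $\sigma^2 t$.

```python
import math
import random
import statistics

def random_walk_endpoint(t, delta, sigma, rng):
    """Position at time t of a symmetric walk with time step delta
    and jump size sigma * sqrt(delta) (hypothetical helper)."""
    n = round(t / delta)               # number of jumps up to time t
    jump = sigma * math.sqrt(delta)    # Delta, chosen so that Delta^2 = sigma^2 * delta
    ups = sum(1 for _ in range(n) if rng.random() < 0.5)  # K ~ Binomial(n, 1/2)
    return jump * (2 * ups - n)        # X(n * delta) = Delta * (2K - n)

rng = random.Random(1)                 # fixed seed, an arbitrary choice
endpoints = [random_walk_endpoint(1.0, 0.01, 1.0, rng) for _ in range(2000)]
sample_mean = statistics.fmean(endpoints)
sample_var = statistics.pvariance(endpoints)
print(sample_mean, sample_var)         # should be close to 0 and sigma^2 * t = 1
```

Choosing a jump size that is not proportional to $\sqrt{\delta}$ in this sketch makes the sample variance collapse or blow up as $\delta$ shrinks, in line with the argument in the text.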
1.3 The Wiener Process (or Brownian Motion)

The most important stochastic process in continuous time is the Wiener process, also called Brownian motion. It is used as a building block in more elaborate models. In 1828 the Scottish botanist Robert Brown observed that pollen grains suspended in water moved in an apparently random way, changing direction continuously. In 1905, Einstein explained this by the pollen grains being bombarded by water molecules, and Brown only contributed to the theory with his name. The precise mathematical formulation of this phenomenon was given by Norbert Wiener in 1923.
Fig. 1.1 Random walks over the time interval $[0, 1]$ with decreasing time steps $\delta$ and jump sizes $\Delta = \sqrt{\delta}$. (a): $\delta = 0.1$. (b): $\delta = 0.01$. (c): $\delta = 0.001$. (d): $\delta = 0.0001$. The random walk approaches a Wiener process for decreasing step size
The Wiener process can be seen as the limit of a random walk when the time steps and the jump sizes go to 0 in a suitable way (see Sect. 1.2) and can formally be defined as follows.

Definition 1.2 (Wiener process). A stochastic process $\{W(t)\}_{t \ge 0}$ is called a Wiener process or a Brownian motion if
1. $W(0) = 0$.
2. $\{W(t)\}_{t \ge 0}$ has independent increments, i.e. $W_{t_1}, W_{t_2} - W_{t_1}, \ldots, W_{t_k} - W_{t_{k-1}}$ are independent random variables for all $0 \le t_1 < t_2 < \cdots < t_k$.
3. $W(t + s) - W(s) \sim N(0, t)$ for all $t > 0$.

Here, $N(\mu, \sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\sigma^2$. Thus, the Wiener process is a Gaussian process: a stochastic process $X$ is called a Gaussian process if for any finite set of indices $t_1, \ldots, t_k$ the vector of random variables $(X(t_1), \ldots, X(t_k))$ follows a $k$-dimensional normal distribution. In fact, it can be shown that any continuous time stochastic process with independent increments and finite second moments, $E(X^2(t)) < \infty$ for all $t$, is a Gaussian process provided that $X(t_0)$ is Gaussian for some $t_0$.

The Wiener process is continuous with mean zero and variance proportional to the elapsed time: $E(W(t)) = 0$ and $\mathrm{Var}(W(t)) = t$. If $\{X(t)\}_{t \ge 0}$ is a stationary stochastic process, then $\{X(t)\}_{t \ge 0}$ has the same distribution as $\{X(t + h)\}_{t \ge 0}$ for all $h > 0$. Thus, the Wiener process cannot be stationary since the variance increases with $t$. The autocovariance function is given by $\mathrm{Cov}(W_t, W_s) = \min(s, t)$.

The sample paths of a Wiener process behave "wildly" in that they are nowhere differentiable. To see what that means, define the total variation of a real-valued function $f$ on an interval $[a, b] \subset \mathbb{R}$ by the quantity

$$V_a^b(f) = \sup \sum_{k=1}^{n} |f(t_k) - f(t_{k-1})|$$

where the supremum is taken over all finite partitions $a \le t_0 < \cdots < t_n \le b$ of $[a, b]$. When $V_a^b(f) < \infty$ and $f$ is right-continuous we say that $f$ is of bounded variation on $[a, b]$. Functions that behave sufficiently "nicely" are of bounded variation; for example, if $f$ is continuously differentiable it is of bounded variation. It turns out that the Wiener process is everywhere of unbounded variation. This happens because the increments $W(t + \Delta t) - W(t)$ are of the order of $\sqrt{\Delta t}$ instead of $\Delta t$, since the variance is $\Delta t$. Heuristically,

$$V_a^b(W) = \sup \sum_{k=1}^{n} |W(t_k) - W(t_{k-1})| \ge \lim_{n \to \infty} \sum_{k=1}^{n} \left| W\!\left(a + \tfrac{k}{n}(b - a)\right) - W\!\left(a + \tfrac{k-1}{n}(b - a)\right) \right| \approx \lim_{n \to \infty} \sum_{k=1}^{n} \sqrt{\tfrac{1}{n}(b - a)} = \lim_{n \to \infty} \sqrt{n(b - a)} = \infty$$

for any interval $[a, b]$. Trying to differentiate, we see how this affects the limit:

$$\lim_{\Delta t \to 0} \frac{|W(t + \Delta t) - W(t)|}{\Delta t} \approx \lim_{\Delta t \to 0} \frac{|\sqrt{\Delta t}|}{\Delta t} = \infty.$$

Now define the quadratic variation of a real-valued function $f$ on $[a, b] \subset \mathbb{R}$ by

$$[f]_a^b = \sup \sum_{k=1}^{n} \big(f(t_k) - f(t_{k-1})\big)^2$$

where the supremum is taken as before. For continuous functions of bounded variation the quadratic variation is always 0, and thus, if $[f]_a^b > 0$ then $V_a^b(f) = \infty$. The quadratic variation of a Wiener process over an interval $[s, t]$ equals $t - s$, and in the limit we therefore expect

$$\lim_{\Delta t \to 0} \big(W(t + \Delta t) - W(t)\big)^2 \approx \Delta t. \tag{1.1}$$
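The contrast between total and quadratic variation can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not the authors' code: the helper `variations`, the seed and the grid sizes are hypothetical, and the two grids use independent simulated paths rather than refinements of one path. It sums absolute and squared Wiener increments over $[0, 1]$.

```python
import math
import random

def variations(n, rng, horizon=1.0):
    """Total and quadratic variation of one simulated Wiener path
    over an n-step uniform grid on [0, horizon] (hypothetical helper)."""
    dt = horizon / n
    increments = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
    total = sum(abs(dw) for dw in increments)       # grows roughly like sqrt(n)
    quadratic = sum(dw * dw for dw in increments)   # stays near the interval length
    return total, quadratic

rng = random.Random(2)                              # fixed seed, an arbitrary choice
coarse_total, coarse_quad = variations(100, rng)
fine_total, fine_quad = variations(10_000, rng)
print(coarse_total, coarse_quad)
print(fine_total, fine_quad)
```

On the finer grid the sum of absolute increments is much larger, while the sum of squared increments stays close to $t - s = 1$, which is the behaviour summarised in (1.1).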
1.4 Stochastic Differential Equations

Assume that the ODE

$$\frac{dx}{dt} = a(x, t) \tag{1.2}$$

describes a one-dimensional dynamical system. Assume that $a(\cdot)$ fulfills conditions such that a unique solution exists, thus $x(t) = x(t; x_0, t_0)$ is a solution satisfying the initial condition $x(t_0) = x_0$. Given the initial condition, we know how the system behaves at all times $t$, even if we cannot find a solution analytically. We can always solve it numerically up to any desired precision. In many biological systems this is not realistic, and a more realistic model can be obtained if we allow for some randomness in the description. A natural extension of a deterministic ODE model is given by an SDE model, where relevant parameters are randomized or modeled as random processes of some suitable form, or simply by adding a noise term to the driving equations of the system. This approach assumes that some degree of noise is present in the dynamics of the process. Here we will use the Wiener process. It leads to a mixed system with both a deterministic and a stochastic part in the following way [21, 24]:

$$dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t \tag{1.3}$$

where $\{X_t = X(t)\}_{t \ge 0}$ is a stochastic process, not a deterministic function like in (1.2). This is indicated by the capital letter. Here $\{W_t = W(t)\}_{t \ge 0}$ is a Wiener process and since it is nowhere differentiable, we need to define what the differential means. It turns out that it is useful to write $dW_t = \xi_t\,dt$, where $\{\xi_t\}_{t \ge 0}$ is a white noise process, defined as being normally distributed for any fixed $t$ and uncorrelated: $E(\xi_t \xi_s) = 0$ if $s \ne t$. Strictly speaking, the white noise process $\{\xi_t\}_{t \ge 0}$ does not exist as a conventional function of $t$, but could be interpreted as the generalized derivative of a Wiener process.

The functions $\mu(\cdot)$ and $\sigma(\cdot)$ can be nonlinear; $\mu(\cdot)$ is called the drift coefficient or the deterministic component, and $\sigma(\cdot)$ is called the diffusion coefficient or the stochastic component (system noise), which may depend on the state of the system, $X_t$. If $\mu(\cdot)$ and $\sigma(\cdot)$ do not depend on $t$ the process is called time-homogeneous. Equation (1.3) should be interpreted in the following way:

$$X_t = X_0 + \int_{t_0}^{t} \mu(X_s, s)\,ds + \int_{t_0}^{t} \sigma(X_s, s)\,dW_s \tag{1.4}$$

where $X_0$ is a random variable independent of the Wiener process. It could simply be a constant. The first integral on the right hand side can be interpreted as an ordinary integral, but what is the second integral? The Wiener process is nowhere differentiable, so how do we give meaning to this differential?

Let us try the usual tricks from ordinary calculus, where we define the integral for a simple class of functions, and then extend by some approximation procedure to a larger class of functions. We want to define
$$\int_{t_0}^{t} f(s)\,dW_s. \tag{1.5}$$
If $f(t) \equiv \sigma$ is constant we would expect the integral (1.5) to equal $\sigma(W_t - W_{t_0})$. Note that this is a random variable with expectation 0 since the increments of a Wiener process have expectation 0. Assume that $f(t)$ is a non-random step function of the form $f(s) = \sigma_j$ on $t_j \le s < t_{j+1}$ for $j = 1, 2, \ldots, n$ where $t_0 = t_1 < t_2 < \cdots < t_{n+1} = t$. Then we define

$$\int_{t_0}^{t} f(s)\,dW_s = \sum_{j=1}^{n} \sigma_j \big(W_{t_{j+1}} - W_{t_j}\big).$$
It is natural to approximate a given function $f(t)$ by a step function. Now $f(t)$ can be random, but we will only consider functions that are measurable with respect to the $\sigma$-algebra generated by the random variables $\{W_s\}_{s \le t}$. The concepts of $\sigma$-algebra and measurable space will be defined in Chap. 2; for now they are not needed. Intuitively it means that the value of $f(t)$ can be determined from the values of $W_s$ for $s \le t$. For example, we could take $f(t) = W_t$, but not $f(t) = W_{2t}$. We cannot look into the future! Moreover, we require that $E\big[\int_{t_0}^{t} f(s)^2\,ds\big] < \infty$. For the rest of this chapter we will always assume these conditions on the integrands.

Define a partition $\Pi_n$ of the interval $[t_0, t]$ by $t_0 = t_1 < t_2 < \cdots < t_{n+1} = t$ where $|\Pi_n| = \max\{|t_{j+1} - t_j| : j = 1, \ldots, n\}$ is the norm of the partition, and approximate

$$f(t) \approx f(t_j^*) \quad \text{for } t_j \le t < t_{j+1}$$

where the point $t_j^*$ belongs to the interval $[t_j, t_{j+1}]$. Then we define

$$\int_{t_0}^{t} f(s)\,dW_s = \lim_{|\Pi_n| \to 0} \sum_{j=1}^{n} f(t_j^*) \big(W_{t_{j+1}} - W_{t_j}\big).$$
When $f(t)$ is stochastic it turns out that, unlike for ordinary integrals, it makes a difference how $t_j^*$ is chosen! To see this consider $f(t) = W_t$ and define two approximations: $t_j^* = t_j$, the left end point, and $t_j^* = t_{j+1}$, the right end point. Taking expectations we see that the two choices yield different results:

$$E\left[\sum_{j=1}^{n} W_{t_j}\big(W_{t_{j+1}} - W_{t_j}\big)\right] = \sum_{j=1}^{n} E\big[W_{t_j}\big(W_{t_{j+1}} - W_{t_j}\big)\big] = \sum_{j=1}^{n} E\big[W_{t_j}\big]\, E\big[W_{t_{j+1}} - W_{t_j}\big] = 0$$

because the Wiener process has independent increments with mean 0. On the other hand,

$$E\left[\sum_{j=1}^{n} W_{t_{j+1}}\big(W_{t_{j+1}} - W_{t_j}\big)\right] = \sum_{j=1}^{n} E\Big[\big(W_{t_{j+1}} - W_{t_j}\big)^2\Big] = \sum_{j=1}^{n} (t_{j+1} - t_j) = t - t_0$$
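The dependence on the evaluation point can be seen in a small Monte Carlo experiment. This is a sketch under assumptions that are not in the original text: the helper `endpoint_sums`, the seed, the interval $[0, 1]$ (so $t - t_0 = 1$), 100 grid points and 2000 simulated paths. It averages the left- and right-endpoint sums for $f(t) = W_t$.

```python
import math
import random
import statistics

def endpoint_sums(n, rng, horizon=1.0):
    """Left- and right-endpoint sums approximating the integral of W dW
    on [0, horizon] along one simulated path (hypothetical helper)."""
    dt = horizon / n
    w = 0.0
    left = 0.0
    right = 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        left += w * dw            # integrand evaluated at t_j
        right += (w + dw) * dw    # integrand evaluated at t_{j+1}
        w += dw
    return left, right

rng = random.Random(3)            # fixed seed, an arbitrary choice
sums = [endpoint_sums(100, rng) for _ in range(2000)]
mean_left = statistics.fmean([s[0] for s in sums])
mean_right = statistics.fmean([s[1] for s in sums])
print(mean_left, mean_right)      # expected values are 0 and t - t_0 = 1
```

Along each path the two sums differ by the sum of squared increments, which is why the gap between the averages is close to the length of the interval.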
where we have subtracted $E\big[\sum_{j=1}^{n} W_{t_j}(W_{t_{j+1}} - W_{t_j})\big] = 0$ and rearranged in the first equality sign, and the second equality sign is the variance of the Wiener process. Two useful and common choices are the following:
• The Itô integral: $t_j^* = t_j$, the left end point.
• The Stratonovich integral: $t_j^* = (t_j + t_{j+1})/2$, the mid point.

There are arguments for using either one or the other, most of them rather technical, and we will not enter into this discussion here. Fortunately, though, the difference between the two is an ordinary integral and it is possible to calculate one integral from the other. Here we only use the Itô integral, and we call a process given by an equation of the form (1.3) an Itô process.

Properties of the Itô integral

The usual linearity properties are also valid for Itô integrals,

$$\int_{t_0}^{t} f(s)\,dW_s = \int_{t_0}^{t^*} f(s)\,dW_s + \int_{t^*}^{t} f(s)\,dW_s$$

for $t_0 < t^* < t$, and

$$\int_{t_0}^{t} \big(a f(s) + b g(s)\big)\,dW_s = a \int_{t_0}^{t} f(s)\,dW_s + b \int_{t_0}^{t} g(s)\,dW_s$$

where $a$ and $b$ are constants. Note that the terms are random variables. Moreover,

$$E\left[\int_{t_0}^{t} f(s)\,dW_s\right] = 0.$$

Finally, we have Itô's isometry: given the properties of the Wiener process, the variance of the stochastic process $\{\int_{t_0}^{t} f(s)\,dW_s\}_{t \ge 0}$ is equal to

$$\mathrm{Var}\left(\int_{t_0}^{t} f(s)\,dW_s\right) = \int_{t_0}^{t} E\big[f^2(s)\big]\,ds.$$
A very important property is that solutions to (1.3) are Markov processes.
Some important examples of Itô processes are the following.

Wiener process with drift

Imagine a particle suspended in water which is being bombarded by water molecules. The temperature of the water will influence the force of the bombardment, and thus we need a parameter $\sigma$ to characterize this. Moreover, there is a water current which drives the particle in a certain direction, and we will assume a parameter $\mu$ to characterize the drift. To describe the displacements of the particle, the Wiener process can be generalized to the process

$$dX_t = \mu\,dt + \sigma\,dW_t$$

which has solution $X_t = x_0 + \mu t + \sigma W_t$ for $X_0 = x_0$. It is thus normally distributed with mean $x_0 + \mu t$ and variance $\sigma^2 t$, as follows from the properties of the standard Wiener process. This process has been proposed as a simplified model for the membrane potential evolution in a neuron, see Chap. 5, Sect. 5.3.2.

Geometric Brownian motion

Imagine a drug is supplied as a bolus to the blood stream and that the average metabolic process of the drug can be described by an exponential decay through the deterministic equation $x' = -ax$, where $x$ is the concentration of the drug in plasma and $a$ is the decay rate. The prime $'$ denotes derivative with respect to time. Assume now that the decay rate fluctuates randomly due to the complex working of the enzymatic machinery involved in the breakdown of the drug. That could be described by letting $a$ vary randomly as $a = \mu + \sigma\xi_t$, where $\{\xi_t\}_{t \ge 0}$ is a Gaussian white noise process. Then $\xi_t\,dt$ can be written as the differential of a Wiener process, $dW_t$. This leads to a model of the form

$$dX_t = \mu X_t\,dt + \sigma X_t\,dW_t,$$

where the sign of the rate has been absorbed into the parameters, so that decay corresponds to $\mu < 0$. It is shown below (Example 1.2, Sect. 1.6) that the explicit solution is

$$X_t = X_0 \exp\left(\left(\mu - \tfrac{1}{2}\sigma^2\right) t + \sigma W_t\right).$$

The process only takes positive values and $X_t$ conditional on $X_0$ follows a log-normal distribution with parameters $\log(X_0) + (\mu - \sigma^2/2)t$ and $\sigma^2 t$.

Ornstein–Uhlenbeck process

Imagine a process subject to a restoring force, i.e. the process is attracted to some constant level but is continuously perturbed by noise. An example is given by the membrane potential of a neuron that is constantly being perturbed by electrical impulses from the surrounding network, and at the same time is attracted to an equilibrium value depending on the resting potentials for various ions present at different concentrations inside the cell and in the interstitium, see Chap. 5, Sect. 5.3.5. This leads to the model
$$dX_t = -\left(\frac{X_t - \alpha}{\tau}\right) dt + \sigma\,dW_t, \tag{1.6}$$

with $\tau, \sigma > 0$. Here $\tau$ has units of time, and is the typical time constant of the system. The autocorrelation is given by $\mathrm{corr}(X_t, X_{t+s}) = e^{-s/\tau}$, and thus the autocorrelation has decreased by a factor of $1/e$ after $\tau$ units of time. It has the explicit solution (due to (1.9) below)

$$X_t = X_0 e^{-t/\tau} + \alpha\big(1 - e^{-t/\tau}\big) + \sigma e^{-t/\tau} \int_0^t e^{s/\tau}\,dW_s \tag{1.7}$$

and $X_t$ conditional on $X_0$ is normally distributed with mean $E(X_t) = X_0 e^{-t/\tau} + \alpha(1 - e^{-t/\tau})$ and variance $\mathrm{Var}(X_t) = \sigma^2\tau(1 - e^{-2t/\tau})/2$. If $X_0$ is normally distributed with mean $\alpha$ and variance $\sigma^2\tau/2$, then so is $X_t$ for all $t$. Thus, contrary to the processes above, the Ornstein–Uhlenbeck process has a stationary solution.

Square-root process

In many applications an unrestricted state space is unrealistic, and the variance is often observed to decrease with decreasing distance to some lower level. For example, the hyper-polarization caused by inhibitory reversal potentials in neuron membranes is smaller if the membrane potential is closer to the inhibitory reversal potential, see Chap. 5, Sect. 5.3.6. For simplicity we assume this lower limit in the state space equal to 0. This leads to the model

$$dX_t = -\left(\frac{X_t - \alpha}{\tau}\right) dt + \sigma\sqrt{X_t}\,dW_t. \tag{1.8}$$
The process is also called the Cox-Ingersoll-Ross process in the financial literature [6], or the Feller process in the neuronal literature, because [16] proposed it as a model for population growth. If $2\alpha/(\tau\sigma^2) \ge 1$ the process stays positive (see Chap. 2, Example 2.3), and admits a stationary distribution. The transition density is a non-central chi-square distribution with conditional mean and variance

$$E(X_t \mid X_0) = \alpha + (X_0 - \alpha)e^{-t/\tau},$$
$$\mathrm{Var}(X_t \mid X_0) = \alpha\,\frac{\tau\sigma^2}{2}\big(1 - e^{-t/\tau}\big)^2 + X_0\,\tau\sigma^2\big(1 - e^{-t/\tau}\big)e^{-t/\tau}.$$

The asymptotic stationary distribution is a gamma distribution with shape parameter $2\alpha/(\tau\sigma^2)$ and scale parameter $\tau\sigma^2/2$.

When the diffusion term does not depend on the state variable $X_t$, as in the Wiener process with drift and the Ornstein–Uhlenbeck process, we say that it has additive noise. In this case the Itô and the Stratonovich integrals yield the same process, so it does not matter which calculus we choose. In the case of Geometric Brownian motion or the square-root process we say that it has multiplicative noise. The four processes are simulated in Fig. 1.2.
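For the Ornstein–Uhlenbeck process the conditional mean and variance stated above give an exact way to simulate the process on a time grid, since the transition distribution is Gaussian. The sketch below is an illustration, not the code used for Fig. 1.2: the helper `simulate_ou`, the seed and the parameter values $\tau = 1$, $\alpha = 2$, $\sigma = 1$ are hypothetical. It checks that the long-run mean and variance are close to $\alpha$ and $\sigma^2\tau/2$.

```python
import math
import random
import statistics

def simulate_ou(x0, tau, alpha, sigma, dt, n, rng):
    """Sample an Ornstein-Uhlenbeck path on a grid of spacing dt using the
    exact Gaussian transition distribution (hypothetical helper)."""
    decay = math.exp(-dt / tau)
    noise_sd = math.sqrt(sigma ** 2 * tau * (1.0 - decay ** 2) / 2.0)
    path = [x0]
    for _ in range(n):
        mean = path[-1] * decay + alpha * (1.0 - decay)  # conditional mean
        path.append(mean + noise_sd * rng.gauss(0.0, 1.0))
    return path

rng = random.Random(5)            # fixed seed, an arbitrary choice
tau, alpha, sigma = 1.0, 2.0, 1.0
path = simulate_ou(alpha, tau, alpha, sigma, dt=1.0, n=20_000, rng=rng)
long_run_mean = statistics.fmean(path)
long_run_var = statistics.pvariance(path)
print(long_run_mean, long_run_var, sigma ** 2 * tau / 2.0)
```

Because the transition is sampled exactly, the grid spacing `dt` introduces no discretization error here; it only controls how strongly consecutive samples are correlated.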
Fig. 1.2 Sample paths from (a): a Wiener process with drift, (b): a Geometric Brownian motion, (c): an Ornstein–Uhlenbeck process, and (d): a square-root process. Note how the amplitude of the noise does not change over time for the Wiener and the Ornstein–Uhlenbeck process (a and c), whereas for Geometric Brownian motion and the square-root process (b and d), the amplitude of the noise depends on the state variable
1.5 Existence and Uniqueness

To ensure the existence of a solution to (1.3) for $0 \le t \le T$ where $T$ is fixed, the following is sufficient:

$$|\mu(t, x)| + |\sigma(t, x)| \le C(1 + |x|)$$

for some constant $C$ [22, 24]. This ensures that $\{X_t\}_{t \ge 0}$ does not explode, i.e. that $\{|X_t|\}_{t \ge 0}$ does not tend to $\infty$ in finite time. To ensure uniqueness of a solution the Lipschitz condition is sufficient:

$$|\mu(t, x) - \mu(t, y)| + |\sigma(t, x) - \sigma(t, y)| \le D|x - y|$$

for some constant $D$. Note that only sufficient conditions are stated, and in many biological applications these are too strict, and weaker conditions can be found. We will not treat these here, though. In Chap. 2 conditions on the functions $\mu$ and $\sigma$ to ensure that the process stays away from the boundaries without assuming the Lipschitz condition are discussed in detail. Note also that the above conditions are fulfilled for three of the processes described above. The square-root process does not fulfill the Lipschitz condition at 0 and is treated in Chap. 2, Example 2.3. In general, many realistic biological models do not fulfill the Lipschitz condition, and the more advanced tools of Chap. 2 are necessary to check if the model is well behaved.
1.6 Itô's Formula

Stochastic differentials do not obey the ordinary chain rule as we know it from classical calculus [24, 34]. An additional term appears because $(dW_t)^2$ behaves like $dt$, see (1.1). We have

Theorem 1.1 (Itô's formula). Let $\{X_t\}_{t \ge 0}$ be an Itô process given by

$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t$$

and let $f(t, x)$ be a function that is twice continuously differentiable in $x$ and once continuously differentiable in $t$. Then

$$Y_t = f(t, X_t)$$

is also an Itô process, and

$$dY_t = \frac{\partial f}{\partial t}(t, X_t)\,dt + \frac{\partial f}{\partial x}(t, X_t)\,dX_t + \frac{1}{2}\sigma^2(t, X_t)\frac{\partial^2 f}{\partial x^2}(t, X_t)\,dt. \tag{1.9}$$

The first two terms on the right hand side correspond to the chain rule we know from classical calculus, but an extra term appears in stochastic calculus because the Wiener process is of unbounded variation, and thus the quadratic variation comes into play.

Example 1.1. Let us calculate the integral

$$\int_0^t W_s\,dW_s.$$

From classical calculus we expect a term like $\tfrac{1}{2}W_t^2$ in the solution. Thus, we choose $f(t, x) = \tfrac{1}{2}x^2$ and $X_t = W_t$ and apply Itô's formula to

$$Y_t = f(t, W_t) = \tfrac{1}{2}W_t^2.$$

We obtain

$$dY_t = \frac{\partial f}{\partial t}(t, W_t)\,dt + \frac{\partial f}{\partial x}(t, W_t)\,dW_t + \frac{1}{2}\sigma^2(t, W_t)\frac{\partial^2 f}{\partial x^2}(t, W_t)\,dt = 0 + W_t\,dW_t + \frac{1}{2}\,dt$$

because $\sigma^2(t, W_t) = 1$. Hence

$$Y_t = \frac{1}{2}W_t^2 = \int_0^t W_s\,dW_s + \frac{1}{2}\int_0^t ds = \int_0^t W_s\,dW_s + \frac{1}{2}t$$

and finally

$$\int_0^t W_s\,dW_s = \frac{1}{2}W_t^2 - \frac{1}{2}t.$$
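The result of Example 1.1 can be checked along a single simulated path: on a fine grid, the left-endpoint (Itô) sum of $W\,dW$ should be close to $\tfrac{1}{2}W_t^2 - \tfrac{1}{2}t$ rather than to the classical guess $\tfrac{1}{2}W_t^2$. This is a minimal sketch in which the helper `ito_sum_and_formula`, the seed, the horizon $t = 1$ and the grid size are assumptions made for illustration.

```python
import math
import random

def ito_sum_and_formula(n, rng, horizon=1.0):
    """Return the left-endpoint sum of W dW on [0, horizon] and the value
    W_t^2 / 2 - t / 2 computed from the same path (hypothetical helper)."""
    dt = horizon / n
    w = 0.0
    ito_sum = 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        ito_sum += w * dw         # integrand evaluated at the left end point
        w += dw
    return ito_sum, 0.5 * w * w - 0.5 * horizon

rng = random.Random(6)            # fixed seed, an arbitrary choice
ito_sum, closed_form = ito_sum_and_formula(20_000, rng)
print(ito_sum, closed_form)
```

The remaining gap between the two numbers equals half the difference between $t$ and the sum of squared increments, so it shrinks as the grid is refined.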
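Example 1.2. Let us find the solution $\{X_t\}_{t \ge 0}$ to the Geometric Brownian motion

$$dX_t = \mu X_t\,dt + \sigma X_t\,dW_t.$$

Rewrite the equation as

$$\frac{dX_t}{X_t} = \mu\,dt + \sigma\,dW_t.$$

Thus, we have

$$\int_0^t \frac{dX_s}{X_s} = \mu t + \sigma W_t, \tag{1.10}$$

which suggests applying Itô's formula to $f(t, x) = \log x$. Here the diffusion coefficient is $\sigma(t, X_t) = \sigma X_t$, and we obtain

$$dY_t = d(\log X_t) = \frac{\partial f}{\partial t}(t, X_t)\,dt + \frac{\partial f}{\partial x}(t, X_t)\,dX_t + \frac{1}{2}\sigma^2 X_t^2\,\frac{\partial^2 f}{\partial x^2}(t, X_t)\,dt = 0 + \frac{dX_t}{X_t} + \frac{1}{2}\sigma^2 X_t^2\left(-\frac{1}{X_t^2}\right) dt = \frac{dX_t}{X_t} - \frac{1}{2}\sigma^2\,dt$$

and thus

$$\frac{dX_t}{X_t} = d(\log X_t) + \frac{1}{2}\sigma^2\,dt. \tag{1.11}$$

Integrating (1.11) and using (1.10) we finally obtain

$$\log\frac{X_t}{X_0} = \int_0^t \frac{dX_s}{X_s} - \frac{1}{2}\sigma^2 t = \mu t + \sigma W_t - \frac{1}{2}\sigma^2 t$$

and so

$$X_t = X_0 \exp\left(\left(\mu - \tfrac{1}{2}\sigma^2\right) t + \sigma W_t\right).$$

Note that it is simply the exponential of a Wiener process with drift.

The solution (1.7) of the Ornstein–Uhlenbeck process can be found by multiplying both sides of (1.6) by $e^{t/\tau}$ and then applying Itô's formula to $e^{t/\tau}X_t$. We will not do that here.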
1.7 Monte Carlo Simulations

The solution of an Itô process is rarely explicit. When no explicit solution is available we can approximate different characteristics of the process by simulation, such as sample paths, moments, qualitative behavior etc. Usually such simulation methods are based on discrete approximations of the continuous solution to an SDE [19, 22]. Different schemes are available depending on how good we want the approximation to be, which comes at a price of computer time. Assume we want to approximate a solution to (1.3) in the time interval $[0, T]$. Consider the time discretization

$$0 = t_0 < t_1 < \cdots < t_j < \cdots < t_N = T$$

and denote the time steps by $\Delta_j = t_{j+1} - t_j$ and the increments of the Wiener process by $\Delta W_j = W_{t_{j+1}} - W_{t_j}$. Then $\Delta W_j \sim N(0, \Delta_j)$, which we can use to construct approximations by drawing normally distributed numbers from a random number generator. For simplicity assume that the process is time-homogeneous.
1.7.1 The Euler–Maruyama Scheme

The simplest scheme, referred to as the Euler–Maruyama scheme, is the stochastic analogue of the deterministic Euler scheme. Approximate the process $X_t$ at the discrete time-points $t_j$, $1 \le j \le N$, by the recursion

$$Y_{t_{j+1}} = Y_{t_j} + \mu(Y_{t_j})\Delta_j + \sigma(Y_{t_j})\Delta W_j; \qquad Y_{t_0} = x_0 \tag{1.12}$$

where $\Delta W_j = \sqrt{\Delta_j}\,Z_j$, with $Z_j$ being standard normal variables with mean 0 and variance 1 for all $j$. This procedure approximates the drift and diffusion functions by constants between time steps, so obviously the approximation improves for smaller time steps. Evaluating the convergence is more complicated for stochastic processes, and we operate with two criteria of optimality: the strong and the weak orders of convergence [2, 3, 19, 22].

Consider the expectation of the absolute error at the final time instant $T$ of the Euler–Maruyama scheme. It can be shown that there exist constants $K > 0$ and $\delta_0 > 0$ such that

$$E(|X_T - Y_{t_N}|) \le K\delta^{0.5}$$

for any time discretization with maximum step size $\delta \in (0, \delta_0)$. We say that the approximating process $Y$ converges in the strong sense with order 0.5. This is similar to how approximations are evaluated in deterministic systems, only here we take expectations, since $X_T$ and $Y_{t_N}$ are random variables. Compare with the Euler scheme for an ODE which has order of convergence 1. Sometimes we do not need a close pathwise approximation, but only some function of the value at a given final time $T$, e.g. $E(X_T)$, $E(X_T^2)$ or generally $E(g(X_T))$. In this case we have that there exist constants $K > 0$ and $\delta_0 > 0$ such that for any polynomial $g$

$$|E(g(X_T)) - E(g(Y_{t_N}))| \le K\delta$$

for any time discretization with maximum step size $\delta \in (0, \delta_0)$. We say that the approximating process $Y$ converges in the weak sense with order 1.
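Recursion (1.12) translates directly into code. The following sketch is one possible implementation under stated assumptions, not a reference implementation: the function names `euler_maruyama` and `strong_error`, the seed, and the test problem (Geometric Brownian motion $dX_t = mX_t\,dt + sX_t\,dW_t$ with the hypothetical values $m = s = 0.5$, $X_0 = 1$, $T = 1$) are chosen here for illustration. The exact solution from Example 1.2 is evaluated along the same Wiener increments, which gives a Monte Carlo estimate of the strong error $E(|X_T - Y_{t_N}|)$.

```python
import math
import random
import statistics

def euler_maruyama(mu, sigma, x0, dt, increments):
    """Apply recursion (1.12) for a time-homogeneous SDE along the given
    Wiener increments; returns the approximation at the final time."""
    y = x0
    for dw in increments:
        y = y + mu(y) * dt + sigma(y) * dw
    return y

def strong_error(n, n_paths, rng, m=0.5, s=0.5, x0=1.0, horizon=1.0):
    """Monte Carlo estimate of E|X_T - Y_T| for geometric Brownian motion."""
    dt = horizon / n
    errors = []
    for _ in range(n_paths):
        increments = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
        y_end = euler_maruyama(lambda x: m * x, lambda x: s * x, x0, dt, increments)
        w_end = sum(increments)
        # Exact solution driven by the same Wiener path
        x_end = x0 * math.exp((m - 0.5 * s * s) * horizon + s * w_end)
        errors.append(abs(x_end - y_end))
    return statistics.fmean(errors)

rng = random.Random(7)            # fixed seed, an arbitrary choice
err_coarse = strong_error(16, 500, rng)    # step size 1/16
err_fine = strong_error(256, 500, rng)     # step size 1/256
print(err_coarse, err_fine)
```

With strong order 0.5, reducing the step size by a factor of 16 is expected to reduce the error by a factor of roughly 4; the estimates from a run like this are noisy, so only the rough magnitude of the ratio is meaningful.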
1.7.2 The Milstein Scheme

To improve the accuracy of the approximation we add a second-order term that appears from Itô's formula. Approximate $X_t$ by

$$Y_{t_{j+1}} = \underbrace{\underbrace{Y_{t_j} + \mu(Y_{t_j})\Delta_j + \sigma(Y_{t_j})\Delta W_j}_{\text{Euler–Maruyama}} + \frac{1}{2}\sigma(Y_{t_j})\sigma'(Y_{t_j})\big((\Delta W_j)^2 - \Delta_j\big)}_{\text{Milstein}} \tag{1.13}$$

where the prime $'$ denotes derivative. It is not obvious exactly how this term appears, but it can be derived through stochastic Taylor expansions. The Milstein scheme converges in the strong sense with order 1, and could thus be regarded as the proper generalization of the deterministic Euler scheme. If $\sigma(X_t)$ does not depend on $\{X_t\}_{t \ge 0}$ the Euler–Maruyama and the Milstein schemes coincide.
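The effect of the extra term in (1.13) can be seen by running both schemes on the same Wiener increments. The sketch below is illustrative only: the helper `endpoint_errors`, the seed, and the test problem (Geometric Brownian motion with hypothetical values $m = s = 0.5$, $X_0 = 1$, $T = 1$, for which $\sigma(x) = sx$ and $\sigma'(x) = s$) are assumptions made here. The errors are measured against the exact solution from Example 1.2.

```python
import math
import random
import statistics

def endpoint_errors(n, n_paths, rng, m=0.5, s=0.5, x0=1.0, horizon=1.0):
    """Mean absolute endpoint errors of the Euler-Maruyama and Milstein
    schemes for geometric Brownian motion (hypothetical helper)."""
    dt = horizon / n
    euler_errors = []
    milstein_errors = []
    for _path in range(n_paths):
        y_euler = x0
        y_milstein = x0
        w = 0.0
        for _step in range(n):
            dw = rng.gauss(0.0, math.sqrt(dt))
            y_euler += m * y_euler * dt + s * y_euler * dw
            # Milstein correction: 0.5 * sigma(y) * sigma'(y) * (dW^2 - dt)
            y_milstein += (m * y_milstein * dt + s * y_milstein * dw
                           + 0.5 * (s * y_milstein) * s * (dw * dw - dt))
            w += dw
        exact = x0 * math.exp((m - 0.5 * s * s) * horizon + s * w)
        euler_errors.append(abs(exact - y_euler))
        milstein_errors.append(abs(exact - y_milstein))
    return statistics.fmean(euler_errors), statistics.fmean(milstein_errors)

rng = random.Random(8)            # fixed seed, an arbitrary choice
euler_err, milstein_err = endpoint_errors(64, 500, rng)
print(euler_err, milstein_err)
```

For a model with additive noise the correction term vanishes, so the two columns of such a comparison would coincide.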
1.8 Inference

Estimation of parameters of an Itô process is a key issue in applications. Estimation in continuously observed Itô processes has been widely studied [23, 27]. However, biological processes are normally observed at discrete times. Parametric inference for discretely observed Itô processes can be complicated depending on the model considered and on the presence of measurement noise. The transition densities are only known for a few specific Itô processes (Wiener process with drift, Ornstein–Uhlenbeck process, square-root process). Likelihood functions of data sampled from these processes then have an explicit form, and the maximum likelihood estimator (MLE) of the parameters can thus be computed. These estimators have nice statistical properties like consistency and asymptotic normality [7]. Consistency means that when the number of observations goes to infinity, the estimator will converge in probability to the true value. Asymptotic normality means that the large sample distribution of the estimator will be close to normal, which is useful for constructing confidence intervals.

When the transition densities are not available, the likelihood function cannot be directly computed. Several estimation methods have been proposed to circumvent this difficulty: methods based on an approximation of the transition density by Hermite expansions [1], simulation based methods, also called Monte Carlo methods [13, 25], martingale estimating functions [4], see also [19, 23, 27] and references therein.

Estimation is more difficult when the process is observed with measurement noise. The likelihood function is explicit in a few specific cases for which filtering techniques can be applied [15]. Otherwise, methods based on simulations or on the Expectation-Maximization algorithm have been developed [10]. The Bayesian approach, which is an alternative to the maximum likelihood approach, can be applied to a large variety of problems when it is combined with sample path simulations or Euler–Maruyama approximations [11].

We present below some of these methods.
1.8.1 Maximum Likelihood

Observations of an Itô process without measurement noise

Consider discrete observations $x_0, \ldots, x_N$ at time points $0 = t_0 < t_1 < \ldots < t_j < \ldots < t_N = T$ of an Itô process $X$ which depends on an unknown parameter vector $\theta$,

$$dX_t = \mu(t, X_t; \theta)\,dt + \sigma(t, X_t; \theta)\,dW_t. \tag{1.14}$$

The vector of observations is denoted $x_{0:N} = (x_0, \ldots, x_N)$. Bayes' rule combined with the Markovian nature of the process $X$, which the discrete data inherit, implies that the likelihood function of $\theta$ is simply the product of transition densities,

$$L(\theta; x_{0:N}) = p(x_0; \theta) \prod_{j=1}^{N} p(x_j \mid x_{j-1}; \theta), \tag{1.15}$$

where $p(x_0; \theta)$ is the density of the initial variable $X_0$ and $p(x_t \mid x_s; \theta)$, $s < t$, is the transition density of $X$, i.e. the conditional density of $X_t$ at time $t$, given that it was at $X_s = x_s$ at an earlier time $s$. We will normally ignore the asymptotically unimportant distribution of $X_0$ by setting $p(x_0; \theta) = 1$. The vector of partial derivatives of the log-likelihood function with respect to the coordinates of $\theta$ is called the score function,

$$U(\theta) = \frac{\partial}{\partial\theta} \log L(\theta; x_{0:N}) = \sum_{j=1}^{N} \frac{\partial}{\partial\theta} \log p(x_j \mid x_{j-1}; \theta), \tag{1.16}$$

which under mild regularity conditions is a martingale under $P_\theta$.
Definition 1.3 (Martingale). A stochastic process $\{X_n;\ n = 1, 2, \ldots\}$ is called a martingale if for all $n = 1, 2, \ldots$,

$$E(|X_n|) < \infty,$$
$$E(X_{n+1} \mid X_1, \ldots, X_n) = X_n,$$

i.e., the conditional expected value of the next observation, given all the past observations, is equal to the last observation.

The MLE usually solves the estimating equation $U(\theta) = 0$. Under mild regularity conditions it is consistent and asymptotically normally distributed [7].

Example 1.3. Let us calculate the likelihood function of an Ornstein–Uhlenbeck process (1.6). The unknown parameter to be estimated is $\theta = (\tau, \alpha, \sigma)$. Denote the length of the observation time intervals by $\Delta_j = t_j - t_{j-1}$, for $j = 1, \ldots, N$. Equation (1.7) provides an explicit expression of $X_{t_j}$ as a function of $X_{t_{j-1}}$ and $\theta$:

$$X_{t_j} = X_{t_{j-1}} e^{-\Delta_j/\tau} + \alpha\big(1 - e^{-\Delta_j/\tau}\big) + \eta_j, \qquad \eta_j \sim N\left(0, \frac{\sigma^2\tau}{2}\big(1 - e^{-2\Delta_j/\tau}\big)\right). \tag{1.17}$$

The likelihood (1.15) is thus explicit and equal to

$$L(\theta; x_{0:N}) = p(x_0; \theta) \prod_{j=1}^{N} \varphi\left(x_j;\ x_{j-1} e^{-\Delta_j/\tau} + \alpha\big(1 - e^{-\Delta_j/\tau}\big),\ \frac{\sigma^2\tau}{2}\big(1 - e^{-2\Delta_j/\tau}\big)\right),$$

where $\varphi(x; \mu, \sigma^2)$ denotes the density of a Gaussian variable with mean $\mu$ and variance $\sigma^2$. The unique maximum of the likelihood function provides the MLE $\hat\theta = (\hat\tau, \hat\alpha, \hat\sigma^2)$. When $\Delta_j = \Delta$ is constant the MLE is given by the equations (with $n = N$)

$$\hat\alpha = \frac{\sum_{j=1}^{n} \big(X_j - X_{j-1} e^{-\Delta/\hat\tau}\big)}{n\big(1 - e^{-\Delta/\hat\tau}\big)},$$
$$e^{-\Delta/\hat\tau} = \frac{\sum_{j=1}^{n} (X_j - \hat\alpha)(X_{j-1} - \hat\alpha)}{\sum_{j=1}^{n} (X_{j-1} - \hat\alpha)^2},$$
$$\hat\sigma^2 = \frac{2\sum_{j=1}^{n} \big(X_j - \hat\alpha - (X_{j-1} - \hat\alpha)e^{-\Delta/\hat\tau}\big)^2}{n\big(1 - e^{-2\Delta/\hat\tau}\big)\hat\tau}.$$

It requires that $\sum_{j=1}^{n} (X_j - \hat\alpha)(X_{j-1} - \hat\alpha) > 0$. Otherwise there is no solution.

When the transition density function $p(\cdot)$ is unknown, the likelihood is not explicit. A simple approximation to the likelihood function is obtained by approximating the transition density by a Gaussian density with the correct first and second conditional moments,

$$p(y \mid x; \theta) \approx q(y \mid x; \theta) = \frac{1}{\sqrt{2\pi\Phi(\Delta, x; \theta)}} \exp\left(-\frac{(y - F(\Delta, x; \theta))^2}{2\Phi(\Delta, x; \theta)}\right)$$

where $F(\Delta, x; \theta) = E_\theta(X_\Delta \mid X_0 = x)$ and $\Phi(\Delta, x; \theta) = \mathrm{Var}_\theta(X_\Delta \mid X_0 = x)$. In this way we obtain the quasi-likelihood

$$L(\theta) \approx QL(\theta) = \prod_{j=1}^{N} q(X_j \mid X_{j-1}; \theta).$$
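By differentiation with respect to the parameter, we obtain the quasi-score function

$$\frac{\partial}{\partial\theta} \log QL(\theta) = \sum_{j=1}^{N} \left\{ \frac{\frac{\partial}{\partial\theta} F(\Delta, X_{j-1}; \theta)}{\Phi(\Delta, X_{j-1}; \theta)} \big[X_j - F(\Delta, X_{j-1}; \theta)\big] + \frac{\frac{\partial}{\partial\theta} \Phi(\Delta, X_{j-1}; \theta)}{2\Phi(\Delta, X_{j-1}; \theta)^2} \Big[\big(X_j - F(\Delta, X_{j-1}; \theta)\big)^2 - \Phi(\Delta, X_{j-1}; \theta)\Big] \right\}, \tag{1.18}$$

which is clearly a martingale under $P_\theta$. It is a particular case of a quadratic martingale estimating function considered by [4, 5].

Another approach is to compute an approximation to $p(\cdot)$ based on the Euler–Maruyama (1.12) or the Milstein (1.13) schemes. In general, this approximation will converge to $p(\cdot)$ as $\Delta \to 0$. More precisely, the Euler–Maruyama approximation of (1.15) consists in replacing $p(\cdot)$ by the Gaussian density of the Euler–Maruyama scheme:

$$L(\theta; x_{0:N}) \approx p(x_0; \theta) \prod_{j=1}^{N} \varphi\big(x_j;\ x_{j-1} + \Delta_j\,\mu(t_{j-1}, x_{j-1}; \theta),\ \Delta_j\,\sigma^2(t_{j-1}, x_{j-1}; \theta)\big).$$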
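For the Ornstein–Uhlenbeck process of Example 1.3 both the exact likelihood and its Euler–Maruyama approximation are Gaussian with a conditional mean that is linear in $X_{j-1}$, so both can be maximized through a least-squares regression of $X_j$ on $X_{j-1}$. The sketch below rests on that reformulation, which is an assumption of this illustration rather than the procedure described in the text; the function names, the seed and the parameter values ($\tau = 1$, $\alpha = 2$, $\sigma = 0.5$, $\Delta = 0.5$) are hypothetical. The fitted slope $\hat b$ estimates $e^{-\Delta/\tau}$ under the exact likelihood and $1 - \Delta/\tau$ under the Euler–Maruyama approximation.

```python
import math
import random
import statistics

def simulate_ou(x0, tau, alpha, sigma, dt, n, rng):
    """Exact simulation of equidistant OU observations via (1.17)."""
    decay = math.exp(-dt / tau)
    noise_sd = math.sqrt(sigma ** 2 * tau * (1.0 - decay ** 2) / 2.0)
    path = [x0]
    for _ in range(n):
        path.append(path[-1] * decay + alpha * (1.0 - decay)
                    + noise_sd * rng.gauss(0.0, 1.0))
    return path

def ou_estimates(x, dt):
    """Return (tau_hat, alpha_hat, sigma2_hat, tau_euler) from equidistant data."""
    prev, curr = x[:-1], x[1:]
    n = len(curr)
    mean_prev, mean_curr = statistics.fmean(prev), statistics.fmean(curr)
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    b = sxy / sxx                      # least-squares slope
    if not 0.0 < b < 1.0:              # stricter than the text's positivity condition
        raise ValueError("slope outside (0, 1): no admissible estimate of tau")
    alpha_hat = (mean_curr - b * mean_prev) / (1.0 - b)
    tau_hat = -dt / math.log(b)        # exact likelihood: b = exp(-dt / tau)
    rss = sum((c - alpha_hat - b * (p - alpha_hat)) ** 2
              for p, c in zip(prev, curr))
    sigma2_hat = 2.0 * rss / (n * (1.0 - b * b) * tau_hat)
    tau_euler = dt / (1.0 - b)         # Euler approximation: b = 1 - dt / tau
    return tau_hat, alpha_hat, sigma2_hat, tau_euler

rng = random.Random(9)                 # fixed seed, an arbitrary choice
data = simulate_ou(2.0, 1.0, 2.0, 0.5, dt=0.5, n=5000, rng=rng)
tau_hat, alpha_hat, sigma2_hat, tau_euler = ou_estimates(data, dt=0.5)
print(tau_hat, alpha_hat, sigma2_hat, tau_euler)
```

Since $-\log b > 1 - b$ for $0 < b < 1$, the Euler-based estimate of $\tau$ always exceeds the exact one, and the gap is noticeable when $\Delta$ is not small relative to $\tau$.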
When the interval lengths $(\Delta_j)$ are fixed and large, the Euler–Maruyama scheme provides a poor approximation to the diffusion. An alternative is to approximate the transition density via simulation of finer sample paths. A set of auxiliary latent data points are introduced between each pair of observations. Along these auxiliary latent data points, the process can be finely sampled and the likelihood function is then approximated via numerical integration (also called Monte Carlo method) [13, 25].

We detail the approximation of the transition density $p(x_j \mid x_{j-1}; \theta)$ on the interval $[t_{j-1}, t_j]$ for a fixed $j \in \{1, \ldots, N\}$. The time interval $[t_{j-1}, t_j]$ is discretized in $K$ sub-intervals $t_{j-1} = \tau_0^{(j)} < \tau_1^{(j)} < \ldots < \tau_k^{(j)} < \ldots < \tau_K^{(j)} = t_j$. The transition density $p(x_j \mid x_{j-1}; \theta)$ can be written as

$$p(x_j \mid x_{j-1}; \theta) = p\big(x_{\tau_K^{(j)}} \mid x_{\tau_0^{(j)}}; \theta\big) = \int p\big(x_{\tau_K^{(j)}} \mid X_{\tau_{K-1}^{(j)}}, \ldots, X_{\tau_1^{(j)}}, x_{\tau_0^{(j)}}; \theta\big)\, p\big(X_{\tau_{K-1}^{(j)}}, \ldots, X_{\tau_1^{(j)}} \mid x_{\tau_0^{(j)}}; \theta\big)\, dX_{\tau_1^{(j)}} \cdots dX_{\tau_{K-1}^{(j)}} = E\Big[p\big(x_{\tau_K^{(j)}} \mid X_{\tau_{K-1}^{(j)}}; \theta\big)\Big],$$

where the expectation is taken under the distribution $p\big(X_{\tau_{K-1}^{(j)}}, \ldots, X_{\tau_1^{(j)}} \mid x_{\tau_0^{(j)}}; \theta\big)$. By simulating $M$ independent sample paths $\big(x^m_{\tau_{K-1}^{(j)}}, \ldots, x^m_{\tau_1^{(j)}}\big)_{m = 1, \ldots, M}$ under this distribution, the transition density $p(x_j \mid x_{j-1}; \theta)$ can be approximated by

$$p^{(M)}(x_j \mid x_{j-1}; \theta) = \frac{1}{M} \sum_{m=1}^{M} p\big(x_j \mid x^m_{\tau_{K-1}^{(j)}}, \ldots, x^m_{\tau_1^{(j)}}, x_{j-1}; \theta\big) = \frac{1}{M} \sum_{m=1}^{M} p\big(x_j \mid x^m_{\tau_{K-1}^{(j)}}; \theta\big). \tag{1.19}$$

By the law of large numbers, the approximating transition density $p^{(M)}(x_j \mid x_{j-1}; \theta)$ converges to $p(x_j \mid x_{j-1}; \theta)$. For a given $j$, the simulation of the sample paths $\big(x^m_{\tau_{K-1}^{(j)}}, \ldots, x^m_{\tau_1^{(j)}}\big)_{m = 1, \ldots, M}$ can be performed using the Euler–Maruyama or Milstein schemes with the initial condition $x_{j-1}$. The densities on the right side of (1.19) are then explicit Gaussian densities. The Euler–Maruyama approximation gives

$$p^{(M)}_{EM}(x_j \mid x_{j-1}; \theta) = \frac{1}{M} \sum_{m=1}^{M} \varphi\Big(x_j;\ x^m_{\tau_{K-1}^{(j)}} + \Delta_K^{(j)}\,\mu\big(\tau_{K-1}^{(j)}, x^m_{\tau_{K-1}^{(j)}}; \theta\big),\ \Delta_K^{(j)}\,\sigma^2\big(\tau_{K-1}^{(j)}, x^m_{\tau_{K-1}^{(j)}}; \theta\big)\Big)$$
with k D k k1 . However, this approach can have poor convergence properties as the simulations are based on unconditional distributions, especially the variance resulting from the Monte Carlo integration can be large. A more appropriate strategy to reduce the variance consists in importance sampling: instead of simulating the sample paths using the Euler–Maruyama or the Milstein schemes, the trajectories .x m.j / ; : : : ; x m.j / /mD1;:::;M are generated using Brownian bridges, conditioning the 1
K1
proposed bridge on the events xj 1 and xj [12]. More precisely, for k D 1; : : : ; .K 1/, x m.j / is simulated with: k
x m.j / D xtj 1 C k
xtj xtj 1 .j / m . tj 1 / C B .j / ; k tj tj 1 k
(1.20)
where B is a standard Brownian bridge on Œtj 1 ; tj equal to zero for t D tj 1 and t D tj , which can be easily simulated.
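The simulated transition density (1.19) can be sketched in a few lines of Python. This is only an illustration and is not taken from the chapter: the function names, the choice of an Ornstein–Uhlenbeck test model (picked because its exact Gaussian transition density is known), and all numerical values are assumptions. The sketch uses plain Euler–Maruyama sub-paths, not the Brownian-bridge importance sampler of (1.20).

```python
import math
import random


def normal_pdf(x, mean, var):
    """Gaussian density with the given mean and variance, evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def simulated_transition_density(x_prev, x_next, delta, drift, diffusion,
                                 K=20, M=5000, seed=0):
    """Monte Carlo estimate of p(x_next | x_prev) in the spirit of (1.19).

    The interval of length delta is split into K equal sub-intervals. Each of
    the M paths is advanced over the first K-1 sub-intervals with
    Euler-Maruyama; the last sub-step contributes an explicit Gaussian density.
    """
    rng = random.Random(seed)
    h = delta / K
    total = 0.0
    for _ in range(M):
        x = x_prev
        for _ in range(K - 1):
            x += drift(x) * h + diffusion(x) * math.sqrt(h) * rng.gauss(0.0, 1.0)
        total += normal_pdf(x_next, x + drift(x) * h, diffusion(x) ** 2 * h)
    return total / M


# Hypothetical test model: dX = -theta * X dt + sigma dW (Ornstein-Uhlenbeck).
theta, sigma = 1.0, 0.5
drift = lambda x: -theta * x
diffusion = lambda x: sigma
delta, x_prev, x_next = 1.0, 1.0, 0.4

estimate = simulated_transition_density(x_prev, x_next, delta, drift, diffusion)
# Exact Ornstein-Uhlenbeck transition density, used only as a reference value.
exact = normal_pdf(x_next,
                   x_prev * math.exp(-theta * delta),
                   sigma ** 2 * (1.0 - math.exp(-2.0 * theta * delta)) / (2.0 * theta))
# One-step Euler-Maruyama density over the whole interval, for comparison.
one_step = normal_pdf(x_next, x_prev + drift(x_prev) * delta, sigma ** 2 * delta)
print(estimate, exact, one_step)
```

For a large interval such as the one assumed here, the one-step Euler–Maruyama density should be visibly further from the exact value than the simulated estimate, which is affected only by Monte Carlo error and the residual sub-step bias.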
S. Ditlevsen and A. Samson
Observations of an Itô process with measurement noise

Consider that the Itô process is discretely observed with measurement noise. Let $y_{0:N} = (y_0, \ldots, y_N)$ denote the vector of noisy observations:

\[
y_j = X_{t_j} + \gamma\, \varepsilon_j,
\tag{1.21}
\]

where $X$ is defined by (1.14), the $\varepsilon_j$'s are the measurement error random variables, assumed to be independent, identically distributed with a centered normal distribution with unit variance and independent of $\{X_t\}_{t \ge 0}$, and $\gamma$ is the measurement noise level. The observed process is no longer Markov. The likelihood function of the data $y_{0:N}$ can be computed by recursive conditioning:

\[
L(\theta; y_{0:N}) = p(y_0; \theta) \prod_{j=1}^{N} p(y_j \mid y_{0:j-1}; \theta),
\]

where $y_{0:j} = (y_0, \ldots, y_j)$ is the vector of observations until time $t_j$. It is thus sufficient to compute the distribution of $y_j$ given $y_{0:j-1}$, which can be written

\[
p(y_j \mid y_{0:j-1}; \theta) = \int p(y_j \mid X_{t_j}; \theta)\, p(X_{t_j} \mid y_{0:j-1}; \theta)\, dX_{t_j}.
\]

This conditional distribution is rarely explicit, though for the Ornstein–Uhlenbeck process it is. Since the innovation noise $\eta_j$ of the discretization of the Ornstein–Uhlenbeck process (1.17) and the observation noise $\varepsilon_j$ are Gaussian variables, the law of $y_j$ given $y_{0:j-1}$ can be obtained by elementary computations on Gaussian laws if we know the mean and covariance of the conditional distribution of $X_{t_j}$ given $y_{0:j-1}$. This conditional distribution can be exactly computed using Kalman recursions as proposed by [15, 26]. The Kalman filter is an iterative procedure which computes recursively the following conditional quantities: $\hat X_j^-(\theta) = E(X_{t_j} \mid y_{0:j-1}; \theta)$, $V_j^-(\theta) = E((X_{t_j} - \hat X_j^-)^2; \theta)$, $\hat X_j(\theta) = E(X_{t_j} \mid y_{0:j}; \theta)$, $V_j(\theta) = E((X_{t_j} - \hat X_j)^2; \theta)$. The exact likelihood of $y_{0:N}$ is then equal to

\[
L(\theta; y_{0:N}) = \prod_{j=0}^{N} \frac{1}{\sqrt{2\pi\,(V_j^-(\theta) + \gamma^2)}} \exp\left( -\frac{(y_j - \hat X_j^-(\theta))^2}{2\,(V_j^-(\theta) + \gamma^2)} \right).
\tag{1.22}
\]
When the unobserved diffusion is not an Ornstein–Uhlenbeck process, Monte Carlo methods can be used similarly to the case without measurement noise.
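The Kalman recursion behind (1.22) can be sketched as follows for a scalar Ornstein–Uhlenbeck process observed with noise. This is a minimal illustration under stated assumptions, not the implementation of [15, 26]: the model $dX_t = -\theta X_t\,dt + \sigma\,dW_t$, equidistant observations, a Gaussian initial law with mean `m0` and variance `v0`, and all function and argument names are hypothetical choices.

```python
import math


def ou_kalman_loglik(y, delta, theta, sigma, gamma, m0=0.0, v0=1.0):
    """Log of the likelihood (1.22) for a noisily observed scalar OU process.

    Assumed state model: X_j = a * X_{j-1} + eta_j with a = exp(-theta * delta)
    and Var(eta_j) = sigma^2 (1 - a^2) / (2 theta); observations are
    y_j = X_j + gamma * eps_j; X_0 ~ N(m0, v0) is an assumption of the sketch.
    """
    a = math.exp(-theta * delta)
    q = sigma ** 2 * (1.0 - a * a) / (2.0 * theta)
    x_pred, v_pred = m0, v0          # predicted mean and variance of X_0
    loglik = 0.0
    for obs in y:
        s = v_pred + gamma ** 2      # predictive variance of y_j
        innovation = obs - x_pred
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + innovation ** 2 / s)
        gain = v_pred / s            # Kalman gain
        x_filt = x_pred + gain * innovation   # filtered mean given y_{0:j}
        v_filt = (1.0 - gain) * v_pred        # filtered variance
        x_pred = a * x_filt                   # one-step prediction
        v_pred = a * a * v_filt + q
    return loglik


# Usage with two made-up observations.
print(ou_kalman_loglik([0.3, -0.1], delta=1.0, theta=0.5, sigma=0.8, gamma=0.2))
```

Maximizing this function over `theta`, `sigma` and `gamma` with a generic numerical optimizer would give the MLE in this toy setting.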
1.8.2 Bayesian Approach

Bayesian estimation is an alternative to the MLE, which takes advantage of prior knowledge of parameter values. For example, biologists may know that the elimination rate of a drug is most probably close to some pre-specified value. This is incorporated into the model by assuming a prior distribution for the parameters. The Bayesian approach consists in estimating the posterior distribution of the parameter $\theta$ given the observations and the prior distribution. Denote by $p(\theta)$ the prior distribution ($\theta$ is thus a random variable). When the Itô process is observed without measurement noise, the posterior distribution given the observations $x_{0:N}$ is

\[
p(\theta \mid x_{0:N}) = \frac{p(\theta, x_{0:N})}{p(x_{0:N})} = \frac{p(x_{0:N} \mid \theta)\, p(\theta)}{p(x_{0:N})},
\]

where $p(x_{0:N} \mid \theta)$ is the likelihood function, and $p(x_{0:N}) = \int p(\theta, x_{0:N})\, d\theta$ is the marginal distribution of the data $x_{0:N}$. In general, the posterior distribution has no closed form because $p(x_{0:N})$ is not explicit. Classical Bayesian estimators approximate the posterior distribution via simulation of samples $(\theta^m)_{1 \le m \le M}$ using Markov Chain Monte Carlo (MCMC) techniques. The aim is to simulate a Markov chain with the target distribution $p(\theta \mid x_{0:N})$ as stationary distribution. Usual MCMC techniques are the Metropolis–Hastings and the Gibbs algorithms [28]. The Metropolis–Hastings algorithm, an accept–reject algorithm, requires an explicit expression of $p(x_{0:N} \mid \theta)$ for the computation of the acceptance probability. Such an expression is rarely available for Itô processes, and approaches similar to the MLE framework can be used: $p(\theta, x_{0:N})$ can be approximated via the Euler–Maruyama scheme by a Gaussian density [14], and Brownian bridges can be used to reduce the variance of the MCMC integration [14, 29]. When the diffusion is observed with measurement noise (1.21), the posterior distribution given the observations $y_{0:N}$ is

\[
\begin{aligned}
p(\theta \mid y_{0:N}) &= \int p(\theta, X_{t_0}, \ldots, X_{t_N} \mid y_{0:N})\, dX_{t_0} \cdots dX_{t_N} \\
&= \int \frac{p(y_{0:N} \mid \theta, X_{t_0}, \ldots, X_{t_N})\, p(\theta, X_{t_0}, \ldots, X_{t_N})}{p(y_{0:N})}\, dX_{t_0} \cdots dX_{t_N}.
\end{aligned}
\]

Simulations of $(\theta, X_{t_0}, \ldots, X_{t_N})$ under $p(\theta, X_{t_0}, \ldots, X_{t_N} \mid y_{0:N})$ provide samples of $\theta$ under the posterior distribution. Similarly to the case without measurement noise, the MCMC approach combined with Gaussian approximations is used to simulate samples under this target distribution.
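A random-walk Metropolis–Hastings sampler using the Euler–Maruyama Gaussian approximation of the likelihood might look as follows. This is a sketch under assumptions rather than the algorithm of [14]: the Ornstein–Uhlenbeck model, the Exponential(1) prior on the drift parameter, the known diffusion coefficient, the proposal step size and all names are hypothetical choices made for the illustration.

```python
import math
import random


def simulate_ou(theta, sigma, delta, n, x0=0.0, seed=0):
    """Exact simulation of dX = -theta X dt + sigma dW at n equidistant times."""
    rng = random.Random(seed)
    a = math.exp(-theta * delta)
    sd = sigma * math.sqrt((1.0 - a * a) / (2.0 * theta))
    x = [x0]
    for _ in range(n):
        x.append(a * x[-1] + sd * rng.gauss(0.0, 1.0))
    return x


def euler_loglik(theta, x, delta, sigma):
    """Euler-Maruyama (Gaussian) approximation of the log-likelihood."""
    var = sigma ** 2 * delta
    ll = 0.0
    for prev, cur in zip(x[:-1], x[1:]):
        mean = prev - theta * prev * delta
        ll += -0.5 * (math.log(2.0 * math.pi * var) + (cur - mean) ** 2 / var)
    return ll


def log_prior(theta):
    """Assumed Exponential(1) prior on theta > 0."""
    return -theta if theta > 0.0 else -math.inf


def metropolis_hastings(x, delta, sigma, n_iter=2000, step=0.3, theta0=1.0, seed=1):
    """Random-walk Metropolis-Hastings chain targeting p(theta | x_{0:N})."""
    rng = random.Random(seed)
    theta = theta0
    logp = euler_loglik(theta, x, delta, sigma) + log_prior(theta)
    chain = []
    for _ in range(n_iter):
        proposal = theta + step * rng.gauss(0.0, 1.0)
        logp_prop = log_prior(proposal)
        if logp_prop > -math.inf:
            logp_prop += euler_loglik(proposal, x, delta, sigma)
        # Accept with probability min(1, posterior ratio); u lies in (0, 1].
        u = 1.0 - rng.random()
        if math.log(u) < logp_prop - logp:
            theta, logp = proposal, logp_prop
        chain.append(theta)
    return chain


data = simulate_ou(theta=2.0, sigma=1.0, delta=0.05, n=1000)
chain = metropolis_hastings(data, delta=0.05, sigma=1.0)
burn_in = 500
posterior_mean = sum(chain[burn_in:]) / len(chain[burn_in:])
print(posterior_mean)
```

Because the Euler–Maruyama density is only an approximation, the posterior targeted here is itself approximate; refining it with latent points between observations is the role of the data-augmentation schemes cited above.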
1.8.3 Martingale Estimating Functions

The score function (1.16) can be approximated by means of martingales of a similar form. Suppose we have a collection of real valued functions $h_j(x, y; \theta)$, $j = 1, \ldots, N$, satisfying

\[
\int h_j(x, y; \theta)\, p(y \mid x; \theta)\, dy = 0
\tag{1.23}
\]

for all $x$ and $\theta$. Consider estimating functions of the form

\[
G_n(\theta) = \sum_{i=1}^{n} a(X_{i-1}, \theta)\, h(X_{i-1}, X_i; \theta),
\tag{1.24}
\]

where $h = (h_1, \ldots, h_N)^T$, and the $p \times N$ weight matrix $a(x, \theta)$ is a function of $x$ such that (1.24) is $P_\theta$-integrable. It follows from (1.23) that $G_n(\theta)$ is a martingale under $P_\theta$ for all $\theta$. An estimating function with this property is called a martingale estimating function. The matrix $a$ determines how much weight is given to each of the $h_j$'s in the estimation procedure. This weight matrix can be chosen in an optimal way using the theory of optimal estimating functions. We will not treat this here, see [4, 5, 31, 32] for details.

Example 1.4. The martingale estimating function (1.18) is of the type (1.24) with $N = 2$, $h_1(x, y; \theta) = y - F(\Delta, x; \theta)$ and $h_2(x, y; \theta) = (y - F(\Delta, x; \theta))^2 - \phi(\Delta, x; \theta)$. The weight matrix is

\[
\left( \frac{\partial_\theta F(\Delta, x; \theta)}{\phi(\Delta, x; \theta)},\; \frac{\partial_\theta \phi(\Delta, x; \theta)}{2 \phi^2(\Delta, x; \theta)} \right).
\]

Example 1.5. A generally applicable quadratic martingale estimating function for model (1.14) is

\[
G_n(\theta) = \sum_{i=1}^{n} \left\{ \frac{\partial_\theta \mu(X_{i-1}; \theta)}{\sigma^2(X_{i-1}; \theta)} \big[ X_i - F(X_{i-1}; \theta) \big] + \frac{\partial_\theta \sigma^2(X_{i-1}; \theta)}{2 \sigma^4(X_{i-1}; \theta)\, \Delta} \Big[ (X_i - F(X_{i-1}; \theta))^2 - \phi(X_{i-1}; \theta) \Big] \right\}.
\tag{1.25}
\]

For the square-root process (1.8) the quadratic martingale estimating function (1.25) is

\[
G_n(\theta) =
\begin{pmatrix}
\displaystyle \sum_{i=1}^{n} \frac{1}{X_{i-1}} \Big[ X_i - X_{i-1} e^{-\Delta/\tau} - \alpha (1 - e^{-\Delta/\tau}) \Big] \\[2ex]
\displaystyle \sum_{i=1}^{n} \Big[ X_i - X_{i-1} e^{-\Delta/\tau} - \alpha (1 - e^{-\Delta/\tau}) \Big] \\[2ex]
\displaystyle \sum_{i=1}^{n} \frac{1}{X_{i-1}} \Big[ \big( X_i - X_{i-1} e^{-\Delta/\tau} - \alpha (1 - e^{-\Delta/\tau}) \big)^2 - \tau \sigma^2 \big\{ (\alpha/2 - X_{i-1})\, e^{-2\Delta/\tau} - (\alpha - X_{i-1})\, e^{-\Delta/\tau} + \alpha/2 \big\} \Big]
\end{pmatrix}
\]
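To make the construction (1.24) concrete, the sketch below solves a linear martingale estimating equation for a simpler model than the square-root process. Everything in it is an assumption made for illustration: the Ornstein–Uhlenbeck model $dX_t = -\theta X_t\,dt + \sigma\,dW_t$, for which $F(\Delta, x; \theta) = x e^{-\theta \Delta}$, the single function $h(x, y; \theta) = y - F(\Delta, x; \theta)$, the weight $a(x, \theta) = x$, and all names. With these choices $G_n(\theta) = 0$ has an explicit root.

```python
import math
import random


def simulate_ou(theta, sigma, delta, n, x0=0.0, seed=0):
    """Exact simulation of dX = -theta X dt + sigma dW at n equidistant times."""
    rng = random.Random(seed)
    a = math.exp(-theta * delta)
    sd = sigma * math.sqrt((1.0 - a * a) / (2.0 * theta))
    x = [x0]
    for _ in range(n):
        x.append(a * x[-1] + sd * rng.gauss(0.0, 1.0))
    return x


def estimating_function(theta, x, delta):
    """G_n(theta) = sum_i X_{i-1} * (X_i - X_{i-1} * exp(-theta * delta))."""
    a = math.exp(-theta * delta)
    return sum(prev * (cur - a * prev) for prev, cur in zip(x[:-1], x[1:]))


def solve_estimating_equation(x, delta):
    """Explicit root of G_n(theta) = 0 for the linear estimating function."""
    sxy = sum(prev * cur for prev, cur in zip(x[:-1], x[1:]))
    sxx = sum(prev * prev for prev in x[:-1])
    if sxy <= 0.0:
        raise ValueError("no positive root: empirical autocovariance is not positive")
    return -math.log(sxy / sxx) / delta


data = simulate_ou(theta=2.0, sigma=1.0, delta=0.1, n=2000)
theta_hat = solve_estimating_equation(data, delta=0.1)
print(theta_hat, estimating_function(theta_hat, data, delta=0.1))
```

For models without an explicit root, such as the square-root process, the same function `estimating_function` would instead be passed to a numerical root finder.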
1.9 Biological Applications

To end this chapter, we give a few examples of the use of stochastic models in biology. Examples of applications in neuroscience can be found in Chaps. 5–8.

1.9.1 Oncology

This work was carried out by Benjamin Favetto, Adeline Samson, Daniel Balvay, Isabelle Thomassin, Valentine Genon-Catalot, Charles-André Cuenod and Yves Rozenholc. In anti-cancer therapy, it is important to assess tumor aggressiveness as well as to follow and monitor the in vivo effects of treatments [17, 30]. This can be performed via dynamic contrast enhanced imaging (DCEI) by studying the tissue microvascularisation and angiogenesis. This facilitates better treatment monitoring by optimizing the therapeutic strategy in vivo. The DCEI experiment consists in injecting a contrast agent into the patient and recording a sequence of medical images, which measures the evolution of the contrast agent concentration over time. The pharmacokinetics of the contrast agent is modeled by a bidimensional differential system. In this pharmacokinetic model, the contrast agent within a voxel of tissue is assumed to be either in the plasma compartment or inside the interstitial compartment. We assume that exchanges inside a voxel are (1) from the arteries (input) into the blood plasma; (2) from the blood plasma into the veins (output) and (3) between blood plasma and interstitial space. The quantities of contrast agent in a single unit voxel at time $t$ are denoted $AIF(t)$, $Q_P(t)$ and $Q_I(t)$ for artery, plasma and interstitial compartments, respectively. The biological parameters and constraints are as follows: $F_T \ge 0$ is the tissue blood perfusion flow per unit volume of tissue (in ml min$^{-1}$ 100 ml$^{-1}$), $V_b \ge 0$ is the part of whole blood volume (in %), $V_e \ge 0$ is the part of extravascular extracellular space fractional volume (in %), and $PS \ge 0$ is the permeability surface area product per unit volume of tissue (in ml min$^{-1}$ 100 ml$^{-1}$). We have that $V_b + V_e < 100$. The hematocrit is the proportion of blood volume consisting of red blood cells and is assumed to be $h = 0.4$. The delay with which the contrast agent arrives from the arteries to the plasma is denoted $\delta$. Both $t$ and $\delta$ are measured in seconds. The contrast agent kinetics can be modeled by the following ODE model:

\[
\begin{aligned}
\frac{dQ_P(t)}{dt} &= \frac{F_T}{1-h}\, AIF(t - \delta) - \frac{PS}{V_b (1-h)}\, Q_P(t) + \frac{PS}{V_e}\, Q_I(t) - \frac{F_T}{V_b (1-h)}\, Q_P(t) \\
\frac{dQ_I(t)}{dt} &= \frac{PS}{V_b (1-h)}\, Q_P(t) - \frac{PS}{V_e}\, Q_I(t).
\end{aligned}
\tag{1.26}
\]
We assume that no contrast agent exists inside the body before the acquisition and hence the initial conditions are $Q_P(t_0) = Q_I(t_0) = AIF(t_0) = 0$. Note that $AIF(t)$ is a given function for all $t$, controlled by the experimentalist. However, this deterministic model is unable to capture the random fluctuations observed over time. For example, it fails to capture the contrast agent dosing and sampling errors, the measurement errors in the arterial input function, or the random fluctuations over time in the plasma/interstitial permeability. These variations are unpredictable. Our main hypothesis is that a more realistic model can be obtained by a stochastic approach. We introduce an SDE model by adding random components:

\[
\begin{aligned}
dQ_P(t) &= \left( \frac{F_T}{1-h}\, AIF(t - \delta) - \frac{PS}{V_b (1-h)}\, Q_P(t) + \frac{PS}{V_e}\, Q_I(t) - \frac{F_T}{V_b (1-h)}\, Q_P(t) \right) dt + \sigma_1\, dW_t^1 \\
dQ_I(t) &= \left( \frac{PS}{V_b (1-h)}\, Q_P(t) - \frac{PS}{V_e}\, Q_I(t) \right) dt + \sigma_2\, dW_t^2
\end{aligned}
\tag{1.27}
\]

where $W_t^1$ and $W_t^2$ are two independent Wiener processes, and $\sigma_1$, $\sigma_2$ are the standard deviations of the random perturbations. The initial conditions are the same as above. This Itô process is a bidimensional Ornstein–Uhlenbeck process. In our biological context, only the sum $S(t) = Q_P(t) + Q_I(t)$ can be measured. Noisy and discrete measurements $(y_i, i = 0, \ldots, N)$ of $S(t)$ are performed at times $t_0 = 0 < t_1 < \ldots < t_N = T$. The observation model is thus:

\[
y_i = S(t_i) + \sigma\, \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, 1),
\]

where $(\varepsilon_i)_{i=0,\ldots,N}$ are assumed to be independent, and $\sigma$ is the unknown standard deviation of the Gaussian noise. The model parameters are denoted $\theta^{ODE} = (F_T, V_b, PS, V_e, \delta, \sigma^2)$ and $\theta^{SDE} = (F_T, V_b, PS, V_e, \delta, \sigma_1, \sigma_2, \sigma^2)$ for the ODE and SDE models, respectively. MLEs $\hat\theta$ of the model parameters are obtained by applying the standard least squares method for the ODE model and the Kalman filter approach for the SDE model. Predictions of both models are computed as the solution of the differential system (1.26) evaluated at $\hat\theta^{ODE}$ for the ODE model, and as the conditional expectation of $Q_P$ and $Q_I$ given the whole data ($y_{0:N}$) for the SDE model. The ODE and SDE models were applied to two signals to estimate the parameters $\hat\theta^{ODE}$ and $\hat\theta^{SDE}$, their standard deviations and the associated predictions $\hat Q_P$, $\hat Q_I$ and $\hat S$. The ODE and SDE residuals were computed as the difference between the observations $y_{0:N}$ and the predictions $\hat S$ of the corresponding model. Signal 1 results are summarized in Table 1.1 and Fig. 1.3. For this signal, the ODE and SDE estimates and the predictions of the quantity of contrast agent are identical. For signal 2, the ODE and SDE estimates were different. The SDE predicted quantity of contrast agent in the interstitial compartment $\hat Q_I^{SDE}(t)$ was always null ($\hat Q_I^{SDE}(t) = 0\ \forall t$) while the ODE prediction $\hat Q_I^{ODE}(t)$ was not (Fig. 1.4). The ODE
Table 1.1 Estimated parameters for oncology signal 1 data, using the ODE and the SDE models

Parameters    ODE model    SDE model
$F_T$         48.7         48.7
$V_b$         40.5         40.5
$PS$          13.3         13.3
$V_e$         29.4         29.4
$\delta$      6.0          6.0
$\sigma$      8.02         7.86
$\sigma_1$    –            $< 10^{-3}$
$\sigma_2$    –            $< 10^{-3}$

Fig. 1.3 Predictions for signal 1 data, obtained with the ODE model (left) and the SDE model (right): black stars ($*$) are the tissue observations $(y_i)$, the AIF observations are represented by the red line, crosses ($\times$) are the residuals. The plain blue, dashed pink and dash-dotted green lines are the predictions for $S(t)$, $Q_P(t)$ and $Q_I(t)$, respectively
model detected exchanges inside the voxel between the two compartments. The ODE residuals were correlated, especially between times $t = 40$ and $t = 75$, contrary to the SDE residuals. Parameter estimates obtained by the ODE and the SDE models are different (Table 1.2). The SDE estimated blood volume ($\hat V_b^{SDE} = 53.5$) is larger than the ODE estimate ($\hat V_b^{ODE} = 41.3$). The SDE estimated permeability surface ($\widehat{PS}^{SDE} = 0.81$) is much smaller than the ODE estimate ($\widehat{PS}^{ODE} = 2.96$). As $\hat V_b^{ODE} + \hat V_e^{ODE} = 100$, the ODE estimation has stopped at a boundary of the optimization domain. This suggests a more careful look. We removed the last 2 (and then the last 5) observation times. While the SDE estimation remained stable when removing observations (up to changes in the last digits), the ODE estimation changed completely, showing its poor stability. This variability even induces an inversion in the prediction of the quantity of the contrast agent in the two compartments. The results with the last 2 or 5 observations removed are added in Table 1.2, only for the ODE estimation. Figure 1.4 illustrates these results by zooming in on the predictions for each estimation.
Fig. 1.4 Top figures: predictions for oncology signal 2 data, obtained with the ODE model (left) and the SDE model (right): black stars ($*$) are the tissue observations $(y_i)$, crosses ($\times$) are the residuals. The plain blue, dashed pink and dash-dotted green lines are respectively the predictions for $S(t)$, $Q_P(t)$ and $Q_I(t)$. For the SDE model, each prediction curve is surrounded by its 95% confidence intervals. Bottom figures: predictions obtained with the ODE model removing the last 2 observations (left) and the last 5 observations (right)
Table 1.2 Estimated parameters for oncology signal 2 data, using the ODE model, the SDE model and using the ODE model after removing the last 2 and the last 5 observations

Parameters    ODE model    SDE model (a)    ODE without 3 last times    ODE without 5 last times
$F_T$         24.6         20.0             32.4                        20.3
$V_b$         41.3         53.5             6.6                         52.9
$PS$          2.96         0.81             43.2                        0.04
$V_e$         58.7         0.04             27.9                        0.002
$\delta$      10.5         9.68             9.5                         7.49
$\sigma$      7.55         6.51             8.4                         8.19
$\sigma_1$    –            1.22             –                           –
$\sigma_2$    –            0.02             –                           –

(a) The results were exactly the same after dropping the last 2 or 5 observations
In conclusion, the use of a stochastic version of the two-compartment model avoids the instability sometimes observed with the classical two-compartment model. The SDE approach provides a more robust parameter estimation, adding reliability to the two-compartment models.
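For readers who want to experiment with the two-compartment model, (1.27) can be simulated with the Euler–Maruyama scheme as sketched below. This is an illustration only and not the estimation code of [15]: the arterial input function `example_aif`, the time unit (minutes are assumed so that the rate constants need no conversion), the step size and the noise levels are all hypothetical. Setting both noise levels to zero gives an Euler scheme for the ODE (1.26).

```python
import math
import random


def example_aif(t):
    """Hypothetical arterial input function: a smooth bolus starting at t = 0."""
    return 0.0 if t <= 0.0 else 100.0 * t * math.exp(-t / 0.5)


def simulate_two_compartment(aif, FT, Vb, PS, Ve, delta, sigma1, sigma2,
                             t_end, dt, h=0.4, seed=0):
    """Euler-Maruyama path of (Q_P, Q_I) in (1.27), started at (0, 0).

    Returns a list of tuples (t, Q_P(t), Q_I(t)).
    """
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    k_in = FT / (1.0 - h)            # arterial input into plasma
    k_pi = PS / (Vb * (1.0 - h))     # plasma -> interstitium
    k_ip = PS / Ve                   # interstitium -> plasma
    k_out = FT / (Vb * (1.0 - h))    # plasma -> veins
    sqrt_dt = math.sqrt(dt)
    qp, qi = 0.0, 0.0
    path = [(0.0, qp, qi)]
    for i in range(n_steps):
        t = i * dt
        drift_p = k_in * aif(t - delta) - k_pi * qp + k_ip * qi - k_out * qp
        drift_i = k_pi * qp - k_ip * qi
        qp, qi = (qp + drift_p * dt + sigma1 * sqrt_dt * rng.gauss(0.0, 1.0),
                  qi + drift_i * dt + sigma2 * sqrt_dt * rng.gauss(0.0, 1.0))
        path.append(((i + 1) * dt, qp, qi))
    return path


# Deterministic run (ODE limit) and one noisy run with illustrative parameter values.
ode_path = simulate_two_compartment(example_aif, 48.7, 40.5, 13.3, 29.4,
                                    delta=0.1, sigma1=0.0, sigma2=0.0,
                                    t_end=5.0, dt=0.01)
sde_path = simulate_two_compartment(example_aif, 48.7, 40.5, 13.3, 29.4,
                                    delta=0.1, sigma1=1.0, sigma2=0.5,
                                    t_end=5.0, dt=0.01)
total_signal = [qp + qi for _, qp, qi in sde_path]   # S(t) = Q_P(t) + Q_I(t)
print(max(total_signal))
```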
1.9.2 Agronomy

This work was carried out by Sophie Donnet, Jean-Louis Foulley and Adeline Samson [11]. Growth curve data consist of repeated measurements of a growth process over time among a population of individuals. In agronomy, growth data allow differentiating animal or vegetal phenotypes by characterizing the dynamics of the underlying biological process. In gynaecology or pediatrics, height and weight of children are regularly recorded to monitor their development. These data are usually analyzed with a parametric growth function, such as the Gompertz, logistic, Richards or Weibull functions [35], which prescribe monotone increasing growth, whatever the parameter values. These models have proved their efficiency in animal genetics [18, 20] and in pediatrics [33]. However, as pointed out by [8], the function used may not capture the exact process, as responses for some individuals may display some local fluctuations such as weight decreases or growth slow-down. These phenomena are not due to measurement errors but are induced by an underlying biological process that is still unknown today. In animal genetics, a wrong modeling of these curves could affect the genetic analysis. In fetal growth, the detection of growth slow-down is a crucial indicator of fetal development problems. Thus, we propose to model these variations in growth curves by an Itô process. The parameter estimation is based on a Bayesian approach. We focus on the modeling of chicken growth. Data $y$ are noisy weight measurements of chickens at weeks $t = 0, 4, 6, 8, 12, 16, 20, 24, 28, 32, 36, 40$ after birth (Fig. 1.5). These data are classically analyzed with a Gompertz function:

\[
x(t) = A e^{-B e^{-Ct}},
\tag{1.28}
\]

which depends on the three parameters $A$, $B$, $C$ and satisfies the following ODE

\[
x'(t) = BC e^{-Ct} x(t), \qquad x(0) = A e^{-B}.
\tag{1.29}
\]

A heteroscedastic error model is usually required to obtain satisfactory results. For simplicity we model the logarithm of the data $y$ with the logarithm of the Gompertz function (1.28) and add a measurement error with constant variance:

\[
\log y_j = \log A - B e^{-C t_j} + \gamma\, \varepsilon_j, \qquad \varepsilon_j \sim_{i.i.d.} \mathcal{N}(0, 1), \qquad \forall j = 0, \ldots, n_i.
\tag{1.30}
\]
Fig. 1.5 Growth curves of the 50 chickens and mean growth curve in dashed bold line
The log-parametrization for $A$ and $C$ was used to ensure that parameters are positive. We estimate the posterior distribution of the parameters $(\log A, B, \log C, \gamma^2)$ of this ODE model. The SDE model is deduced from the Gompertz equation (1.29):

\[
dX_t = BC e^{-Ct} X_t\, dt + \sigma X_t\, dW_t, \qquad X_0 = A e^{-B},
\tag{1.31}
\]

where the diffusion coefficient is set equal to $\sigma X_t$ given the heteroscedasticity of the process. This means that the standard error of the random perturbations of the growth rate is proportional to the weight. This Itô process belongs to the family of Geometric Brownian motions with time inhomogeneous drift. The Itô process (1.31) has an explicit solution. Indeed, set $Z_t = \log(X_t)$. By Itô's formula (1.9), the conditional distribution of $Z_{t+h}$ given $(Z_s)$, $s \le t$, $h > 0$ is:

\[
Z_{t+h} \mid (Z_s)_{s \le t} \sim \mathcal{N}\Big( Z_t - B e^{-Ct} (e^{-Ch} - 1) - \tfrac{1}{2} \sigma^2 h,\; \sigma^2 h \Big), \qquad Z_0 = \log(A) - B.
\]

Thus, $X_t = A e^{-B e^{-Ct}} e^{-\frac{1}{2} \sigma^2 t + \sigma W_t}$ and $X_0 = A e^{-B}$. As a consequence, $X_t$ is a multiplicative random perturbation of the solution of the Gompertz model. Due to the assumption of the non-negativity of $A$, $X_t$ is almost surely non-negative, which is a natural constraint to model weight records. We then discretize the SDE:

\[
Z_{t_j} \mid Z_{t_{j-1}} \sim \mathcal{N}\Big( Z_{t_{j-1}} - B e^{-C t_{j-1}} \big(e^{-C (t_j - t_{j-1})} - 1\big) - \tfrac{1}{2} \sigma^2 (t_j - t_{j-1}),\; \sigma^2 (t_j - t_{j-1}) \Big).
\]
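Because the transition law of $Z_t = \log X_t$ is Gaussian and explicit, trajectories of the stochastic Gompertz model can be simulated exactly at the observation times, with no discretization error. The following Python sketch does this; the function name, the requirement that the time grid starts at 0, and the parameter values are illustrative assumptions and not estimates taken from the study.

```python
import math
import random


def simulate_log_gompertz(times, A, B, C, sigma, seed=0):
    """Exact simulation of Z_t = log(X_t) for the SDE (1.31) on a time grid.

    Uses Z_{t_j} | Z_{t_{j-1}} ~ N(Z_{t_{j-1}} - B e^{-C t_{j-1}} (e^{-C h} - 1)
    - sigma^2 h / 2, sigma^2 h) with h = t_j - t_{j-1}. The grid must start at 0.
    """
    if times[0] != 0.0:
        raise ValueError("the time grid is assumed to start at t = 0")
    rng = random.Random(seed)
    z = [math.log(A) - B]            # Z_0 = log(A) - B
    for t_prev, t_cur in zip(times[:-1], times[1:]):
        h = t_cur - t_prev
        mean = (z[-1]
                - B * math.exp(-C * t_prev) * (math.exp(-C * h) - 1.0)
                - 0.5 * sigma ** 2 * h)
        z.append(mean + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0))
    return z


# Observation weeks of the chicken data; parameter values are made up.
weeks = [0.0, 4.0, 6.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0, 40.0]
A, B, C = 2000.0, 4.0, 0.1
z_noisy = simulate_log_gompertz(weeks, A, B, C, sigma=0.05)
z_ode = simulate_log_gompertz(weeks, A, B, C, sigma=0.0)
weights = [math.exp(z) for z in z_noisy]   # X_t = exp(Z_t) stays positive
print(weights)
```

With `sigma=0.0` the recursion reproduces the logarithm of the Gompertz curve (1.28), which gives a simple consistency check of the transition mean.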
Table 1.3 Posterior distributions for the ODE and SDE models on chicken growth data: average of estimated parameters and their 95% credibility intervals (95% CI)

              ODE                            SDE
Parameters    Average    95% CI              Average    95% CI
$\log A$      7.77       [7.70; 7.84]        7.75       [7.67; 7.83]
$B$           4.17       [4.11; 4.23]        4.15       [4.08; 4.22]
$\log C$      2.75       [2.70; 2.81]        2.78       [2.71; 2.84]
$\gamma^2$    225.5      [197.4; 255.5]      630.2      [463.8; 797.9]
$\sigma^2$    –          –                   0.09       [0.07; 0.12]
The SDE model on the logarithm of data is thus defined as:

\[
\begin{aligned}
(\log y_0, \log y_1, \ldots, \log y_N)^T &= (\log(A) - B, Z_{t_1}, \ldots, Z_{t_N})^T + \gamma\, \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I_{N+1}), \\
(Z_{t_1}, \ldots, Z_{t_N})^T &= \big(\log(A) - B e^{-C t_1}, \ldots, \log(A) - B e^{-C t_N}\big)^T - \frac{\sigma^2}{2} (t_1, \ldots, t_N)^T + \eta, \\
\eta &\sim \mathcal{N}\big(0_J, \sigma^2 \Sigma_t\big), \qquad \Sigma_t = \big(\min(t_j, t_{j'})\big)_{1 \le j, j' \le n_i},
\end{aligned}
\tag{1.32}
\]

where $T$ denotes transposition. By a Bayesian approach we estimate the posterior distribution of the parameters $(\log A, B, \log C, \gamma^2, \sigma^2)$. We consider Gaussian prior distributions for $(\log A, B, \log C)$, an inverse Gamma prior distribution for $\sigma^2$ as suggested by [9] for hierarchical models, and an inverse Gamma prior distribution for $\gamma^2$. The posterior distribution is approximated with an MCMC algorithm. Posterior expectations of the parameters are presented in Table 1.3. The estimate of $\sigma^2$ is strictly positive and its credibility interval is far away from zero. This implies that the dynamical process that most likely represents the growth is an Itô process with non-negligible noise. Diagnostic tools to validate the models are applied to both ODE and SDE models. Figure 1.6 presents the posterior predictive distributions of both models computed for each time point. Centered and symmetrical posterior predictive distributions correspond to a "good" model. There is a clear improvement in the posterior predictive distributions from the ODE to the SDE model for the whole population, both at early and late ages. The predictive abilities of the two models can be compared on the posterior expectation of the squared errors using cross-validation techniques. New data sets denoted $y_{-j}$ are constructed by dropping the $j$th measurement. The error is then:

\[
r_j^k = E\Big[ \big(\log(y_j) - \log(y_j^{rep,k})\big)^2 \,\Big|\, y_{-j} \Big], \qquad k = 1, 2
\]
Fig. 1.6 Posterior predictive distributions for the ODE and SDE models on chicken growth data
with $y_j^{rep,k}$ drawn from the predictive distribution $p(y_j^{rep,k} \mid y_{-j})$. Averaging in $r_j^k$ is with respect to the posterior uncertainty in the parameters of the model. We performed that comparison for the last observation $j = 12$, which is especially critical with respect to the growth pattern studied here. These quantities are $r_{12}^{ode} = 0.56$ and $r_{12}^{sde} = 0.48$, resulting in a reduction of the squared errors of prediction of 14% when using SDE vs ODE. Figure 1.7 reports, for four subjects, the observed weights, the ODE prediction, the empirical mean of the last 1,000 simulated trajectories of the SDE (1.32) generated during the estimation algorithm, their empirical 95% confidence limits (from the 2.5th percentile to the 97.5th percentile) and one simulated trajectory. Subjects 4 and 13 are examples of subjects with no growth slow-down. Both ODE and SDE models satisfactorily fit the observations. Subject 14 has a small observed weight decrease. For subject 1, the weight decrease is more pronounced. For both subjects, the ODE model fails to capture this phenomenon while the SDE model does. In conclusion, on the presented data set, the introduction of this SDE model leads to a clear validation of the model (Fig. 1.6), which was not the case for the standard model, justifying the introduction of the new stochastic component.
Fig. 1.7 Observations (circles), predictions obtained with the ODE model (long dashed line), mean SDE prediction (smooth solid line), 95% credibility interval obtained with the SDE model (dotted line) and one SDE realization (solid line), for subjects 1, 4, 13 and 14
References

1. Aït-Sahalia, Y.: Maximum likelihood estimation of discretely sampled diffusions: a closed-form approximation approach. Econometrica 70(1), 223–262 (2002)
2. Bally, V., Talay, D.: The law of the Euler scheme for stochastic differential equations (I): convergence rate of the distribution function. Probab. Theor. Relat. Field 104(1), 43–60 (1996)
3. Bally, V., Talay, D.: The law of the Euler scheme for stochastic differential equations (II): convergence rate of the density. Monte Carlo Meth. Appl. 2, 93–128 (1996)
4. Bibby, B.M., Sørensen, M.: Martingale estimation functions for discretely observed diffusion processes. Bernoulli 1(1/2), 017–039 (1995)
5. Bibby, B.M., Sørensen, M.: On estimation for discretely observed diffusions: a review. Theor. Stoch. Process. 2(18), 49–56 (1996)
6. Cox, J.C., Ingersoll, J.E., Ross, S.A.: A theory of the term structure of interest rates. Econometrica 53, 385–407 (1985)
7. Dacunha-Castelle, D., Florens-Zmirou, D.: Estimation of the coefficients of a diffusion from discrete observations. Stochastics 19(4), 263–284 (1986)
8. Davidian, M., Giltinan, D.M.: Nonlinear models for repeated measurements: An overview and update. J. Agr. Biol. Environ. Stat. 8, 387–419 (2003)
9. De la Cruz-Mesia, R., Marshall, G.: Non-linear random effects models with continuous time autoregressive errors: a Bayesian approach. Stat. Med. 25, 1471–1484 (2006)
10. Donnet, S., Samson, A.: Parametric inference for mixed models defined by stochastic differential equations. ESAIM Probab. Stat. 12, 196–218 (2008)
11. Donnet, S., Foulley, J.L., Samson, A.: Bayesian analysis of growth curves using mixed models defined by stochastic differential equations. Biometrics 66(3), 733–741 (2010)
12. Durham, G.B., Gallant, A.R.: Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes. J. Bus. Econ. Stat. 20, 297–338 (2002)
13. Elerian, O., Chib, S., Shephard, N.: Likelihood inference for discretely observed nonlinear diffusions. Econometrica 69(4), 959–993 (2001)
14. Eraker, B.: MCMC analysis of diffusion models with application to finance. J. Bus. Econ. Stat. 19(2), 177–191 (2001)
15. Favetto, B., Samson, A.: Parameter estimation for a bidimensional partially observed Ornstein-Uhlenbeck process with biological application. Scand. J. Stat. 37, 200–220 (2010)
16. Feller, W.: Diffusion processes in genetics. In: Neyman, J. (ed.) Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pp. 227–246. University of California Press, Berkeley (1951)
17. Fournier, L., Thiam, R., Cuénod, C.-A., Medioni, J., Trinquart, L., Balvay, D., Banu, E., Balcaceres, J., Frija, G., Oudard, S.: Dynamic contrast-enhanced CT (DCE-CT) as an early biomarker of response in metastatic renal cell carcinoma (mRCC) under anti-angiogenic treatment. J. Clin. Oncol. ASCO Annu. Meet. Proc. (Post-Meeting Edition) 25 (2007)
18. Hou, W., Garvan, C.W., Zhao, W., Behnke, M., Eyler, F., Wu, R.: A general model for detecting genetic determinants underlying longitudinal traits with unequally spaced measurements and nonstationary covariance structure. Biostatistics 6, 420–433 (2005)
19. Iacus, S.M.: Simulation and Inference for Stochastic Differential Equations. With R examples. Springer, New York (2008)
20. Jaffrézic, F., Meza, C., Lavielle, M., Foulley, J.L.: Genetic analysis of growth curves using the SAEM algorithm. Genet. Sel. Evol. 38, 583–600 (2006)
21. Karlin, S., Taylor, H.M.: A Second Course in Stochastic Processes. Academic, New York (1981)
22. Kloeden, P., Platen, E.: Numerical Solution of Stochastic Differential Equations. Springer, New York (1999)
23. Kutoyants, T.: Parameter Estimation for Stochastic Processes. Helderman Verlag, Berlin (1984)
24. Øksendal, B.: Stochastic Differential Equations. An Introduction with Applications, 6th edn. Universitext. Springer, Berlin (2003)
25. Pedersen, A.: A new approach to maximum likelihood estimation for stochastic differential equations based on discrete observations. Scand. J. Stat. 22(1), 55–71 (1995)
26. Pedersen, A.R.: Statistical analysis of Gaussian diffusion processes based on incomplete discrete observations. Research Report, Department of Theoretical Statistics, University of Aarhus, 297 (1994)
27. Prakasa Rao, B.: Statistical Inference for Diffusion Type Processes. Arnold, London (1999)
28. Robert, C.P.: Bayesian computational methods. In: Handbook of Computational Statistics, pp. 719–765. Springer, Berlin (2004)
29. Roberts, G.O., Stramer, O.: On inference for partially observed nonlinear diffusion models using the Metropolis-Hastings algorithm. Biometrika 88(3), 603–621 (2001)
30. Rosen, M.A., Schnall, M.D.: Dynamic contrast-enhanced magnetic resonance imaging for assessing tumor vascularity and vascular effects of targeted therapies in renal cell carcinoma. Clin. Cancer Res. 13(2), 770–6 (2007)
31. Sørensen, M.: Parametric inference for discretely sampled stochastic differential equations. In: Andersen, T.G., Davis, R.A., Kreiss, J.P., Mikosch, T. (eds.) Handbook of Financial Time Series, pp. 531–553. Springer, Heidelberg (2009)
32. Sørensen, M.: Estimating functions for diffusion-type processes. In: Kessler, M., Lindner, A., Sørensen, M. (eds.) Statistical Methods for Stochastic Differential Equations. Chapman & Hall/CRC Monographs on Statistics & Applied Probability, London (2012)
33. Spyrides, M.H., Struchiner, C.J., Barbosa, M.T., Kac, G.: Effect of predominant breastfeeding duration on infant growth: a prospective study using nonlinear mixed effect models. J. Pediatr. 84, 237–243 (2008)
34. Taylor, H.M., Karlin, S.: An Introduction to Stochastic Modeling, 3rd edn. Academic, San Diego, CA (1998)
35. Zimmerman, D., Núñez-Antón, V.: Parametric modelling of growth curve data: an overview. Test 10, 1–73 (2001)
Chapter 2
One-Dimensional Homogeneous Diffusions Martin Jacobsen
2.1 Introduction

When constructing a model defined by a stochastic differential equation (SDE), the basic problem is whether the equation has a solution and if so, when an initial condition is given, whether the solution is unique. Once the existence and uniqueness of the solution have been established, so that the model is well-defined, one may then proceed to study specific properties of the solution such as its long term behaviour, stationarity and the form of the invariant distribution, boundedness or positivity, and whatever other properties are needed for the problem at hand. The solution to an SDE is a stochastic process, i.e., a randomly generated function of time, so that formally the solution may be viewed as a typically huge collection of ordinary functions of time. It is this that makes SDEs much more difficult to deal with than ordinary differential equations, where a unique solution is just one function of time. The commonly reported sufficient conditions for existence and uniqueness of solutions to SDEs were given in Chap. 1, Sect. 1.5. Unfortunately, these conditions are far too restrictive for dealing with processes that are required to be, e.g., positive (the popular square-root model, see Chap. 1, page 12 and Chap. 5, Sect. 5.3.6; the positivity of the model is discussed in Example 2.3 below) or bounded (the model (5.62) in Chap. 5, which is required to stay between the prescribed boundaries $V_I$ and $V_E$). It is the main purpose of this chapter to discuss some classical methods for deciding when the solution to an SDE stays within certain boundaries and also to determine when an invariant distribution exists and if so, what distribution it is.
M. Jacobsen Department of Mathematical Sciences, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark e-mail:
[email protected] M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3 2, © Springer-Verlag Berlin Heidelberg 2013
In the theory of stochastic processes it is standard to assume given a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ with $\Omega$ the sample space, $\mathcal{F}$ a $\sigma$-algebra of subsets of $\Omega$ (a collection of subsets containing $\Omega$ itself and closed under the formation of complementation, countable unions and countable intersections), $P$ a probability measure on the measurable space $(\Omega, \mathcal{F})$ ($P$ satisfies $0 \le P \le 1$, $P(\Omega) = 1$ and is $\sigma$-additive, $P\big(\bigcup_{n=1}^{\infty} F_n\big) = \sum_{n=1}^{\infty} P(F_n)$ when the $F_n \in \mathcal{F}$ are pairwise disjoint). Finally, for every $t \ge 0$, $\mathcal{F}_t$ is a sub $\sigma$-algebra of $\mathcal{F}$ such that $\mathcal{F}_s \subseteq \mathcal{F}_t$ for $s \le t$, and where $\mathcal{F}_t$ should be thought of as the collection of sets or events, the realisation or non-realisation of which is determined by the evolution of the model on the time interval $[0, t]$. Below we shall consider adapted stochastic processes $X = (X_t)_{t \ge 0}$ that are real-valued solutions to SDEs, i.e., apart from being a solution it is required that each $X_t$ is an $\mathcal{F}_t$-measurable function from $\Omega$ to $\mathbb{R}$. Note that formally the solution to the SDE is the collection $\{(X_t(\omega))_{t \ge 0} : \omega \in \Omega\}$ of functions of $t$, often referred to as the sample paths of the process. The processes to be discussed below, diffusions, are continuous, i.e., every sample path is a continuous function of time. The definition of a martingale in discrete time was presented in Chap. 1, Definition 1.3. The definition of martingales in continuous time is more involved and relies on having everything defined on a filtered probability space.

Definition 2.1 (Martingale). A real-valued process $M$ is a martingale with respect to the filtration $(\mathcal{F}_t)$ if it is adapted, $E|M_t| < \infty$ and $E(M_t \mid \mathcal{F}_s) = M_s$ for all $0 \le s \le t$.

Recall that the martingale property $E(M_t \mid \mathcal{F}_s) = M_s$ is equivalent to the statement that for all $0 \le s \le t$, $M_s$ is $\mathcal{F}_s$-measurable and $\int_F M_t\, dP = \int_F M_s\, dP$ for all $F \in \mathcal{F}_s$. The martingale convergence theorem states that for a continuous (or just right-continuous) martingale $M$ with either $\sup_{t \ge 0} E M_t^+ < \infty$ or $\sup_{t \ge 0} E M_t^- < \infty$ (where $M_t^+ = \max(M_t, 0)$, $M_t^- = -\min(M_t, 0)$) the limit $M_\infty = \lim_{t \to \infty} M_t$ exists almost surely and $E|M_\infty| < \infty$. In particular a continuous martingale which is either positive or bounded always converges a.s. A very important concept is that of stopping times. A stopping time is a random variable $T : \Omega \to [0, \infty]$ such that $(T \le t) \in \mathcal{F}_t$ for all $t \ge 0$. A stopping time should be thought of as a random point in time such that the realisation $T = t$ is determined by the evolution of the model on $[0, t]$. For what follows below the most important stopping times describe the first time a diffusion reaches a given level or the first time a diffusion hits the boundary of an interval. Stopping times are also needed for the following definition.

Definition 2.2 (Local martingale). A real-valued process $M$ is a local martingale with respect to the filtration $(\mathcal{F}_t)$ if it is adapted and there exists a sequence $(T_n)_{n \ge 1}$ of stopping times, increasing to $\infty$ almost surely, such that for each $n$ the stopped process $M^{T_n} = (M_{T_n \wedge t})_{t \ge 0}$ is a martingale.
2 One-Dimensional Homogeneous Diffusions
Local martingales are very important, in particular because all Itô integrals $\left(\int_0^t Z_s \, dW_s\right)_{t \ge 0}$ with $Z$ a continuous adapted process are local martingales. A local martingale $M$ need not be a martingale, not even if all $E|M_t| < \infty$. A simple sufficient condition for a local martingale to be a martingale is that $E \sup_{s : s \le t} |M_s| < \infty$ for all $t$; in particular a bounded local martingale is always a martingale. For the Itô integral above, an alternative sufficient condition for this local martingale to be a martingale is that $E \int_0^t Z_s^2 \, ds < \infty$ for all $t$. As the example with the Itô integral shows, it is often easy to argue that a given process is a local martingale. Often one would like to have a true martingale, and although this may be very difficult to show, it should be done: never trust a local martingale to be a martingale without verification, or your analysis may prove disastrously wrong!
2.2 Diffusion Processes

Suppose given a real-valued process $X = \{X_t = X(t)\}_{t \ge 0}$, which is a solution to an SDE (see also Chap. 1, (1.3)),
$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad X_0 \equiv x_0 \tag{2.1}$$
with $W$ a continuous one-dimensional standard Brownian motion (a BM(1)-process, also called a Wiener process, see Definition 1.2, Chap. 1) defined on some filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$. In particular, $(\mathcal{F}_t)$ is an increasing family of $\sigma$-algebras. You can think of $\mathcal{F}_t$ as the history up to time $t$. Thus
$$X_t = x_0 + \int_0^t \mu(X_s)\,ds + \int_0^t \sigma(X_s)\,dW_s.$$
Such a process $X$ is an example of a (one-dimensional) diffusion process, also called an Itô process, see p. 10, Chap. 1. Suppose also that it is known that
$$P\left(\bigcap_{t \ge 0} \left(X_t \in\, ]l, r[\right)\right) = 1$$
where $]l, r[\, \subset \mathbb{R}$ is an open interval that could be a genuine subinterval of $\mathbb{R}$. In particular $l < x_0 < r$. It is entirely possible to discuss diffusions such that all $X_t$ take their values in half-open or closed intervals such as $[l, r[$ or $[l, r]$ and where it is possible for $X$ to hit the boundary $l$ (or $r$ if relevant). In that case, $l$ will be a reflecting or absorbing boundary. In this chapter, however, we only discuss open intervals $]l, r[$. See Remark 2.1 below for further discussion.
M. Jacobsen
Our first aim is to discuss conditions on the functions $\mu$ and $\sigma$ that ensure that $X$ in fact stays away from the boundary points $l$ and $r$, also in the case where $l = -\infty$ or $r = +\infty$. Assume from now on that $\mu, \sigma$ are continuous functions on $]l, r[$, with $\sigma > 0$, but do not assume Lipschitz conditions on the functions $\mu$ and $\sigma$ as is often done to ensure existence and uniqueness of solutions to (2.1), see Chap. 1, Sect. 1.5. At the moment we are just given a $]l, r[$-valued solution to (2.1).

We shall need the following form of Itô's formula: if $f \in C^2$ (twice differentiable with a continuous second derivative), then
$$df(X_t) = f'(X_t)\,dX_t + \tfrac{1}{2} f''(X_t)\,d[X]_t \tag{2.2}$$
where $'$ denotes derivative with respect to $x$. Here, $[X]$ is the quadratic variation process for $X$, which because of (2.1) is given by
$$[X]_t = \int_0^t \sigma^2(X_s)\,ds. \tag{2.3}$$
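As a quick illustration of how (2.2) and (2.3) combine (a worked example added for orientation, writing $\mu$ and $\sigma$ for the drift and diffusion coefficients in (2.1)), take $f(x) = x^2$, so that $f'(x) = 2x$ and $f''(x) = 2$:

```latex
\begin{aligned}
d\bigl(X_t^2\bigr) &= 2X_t\,dX_t + \tfrac{1}{2}\cdot 2\,d[X]_t \\
                   &= \bigl(2X_t\,\mu(X_t) + \sigma^2(X_t)\bigr)\,dt + 2X_t\,\sigma(X_t)\,dW_t .
\end{aligned}
```

For a standard Brownian motion ($\mu \equiv 0$, $\sigma \equiv 1$) this reduces to $d(W_t^2) = dt + 2W_t\,dW_t$, so that $W_t^2 - t$ is a continuous local martingale.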
2.3 Scale Function and Speed Measure

We start by looking for a twice differentiable function $S :\, ]l, r[\, \to \mathbb{R}$, such that $S(X)$ is a continuous local martingale. By (2.1), (2.2) and (2.3),
$$dS(X_t) = AS(X_t)\,dt + S'(X_t)\sigma(X_t)\,dW_t$$
where $A$ is the second order differential operator (the infinitesimal generator for the diffusion $X$)
$$Af(x) = \mu(x) f'(x) + \tfrac{1}{2}\sigma^2(x) f''(x). \tag{2.4}$$
Thus $S(X)$ is a continuous local martingale (see Definition 2.2) if $AS \equiv 0$. This gives that $S$ satisfies
$$S'(x) = c \exp\left(-\int_{x_0}^{x} \frac{2\mu(y)}{\sigma^2(y)}\,dy\right)$$
for some $c$. If $c > 0$ we have $S' > 0$ so $S$ is strictly increasing. $S$ is called a scale function for the diffusion $X$. If $S$ is a scale function, all other scale functions are of the form $c_1 + c_2 S$ for some $c_1 \in \mathbb{R}$, $c_2 > 0$.

With $x_0 \in\, ]l, r[$ given, fix $a < x_0 < b$, $a, b \in\, ]l, r[$. Define the stopping time
$$T_{a,b} = \inf\{t : X_t = a \text{ or } X_t = b\}.$$
Then $(S(X))^{T_{a,b}} = \{S(X_{T_{a,b} \wedge t})\}_{t \ge 0}$ is a bounded local martingale, hence a true martingale, and so, for all $t$,
$$E\,S(X_{T_{a,b} \wedge t}) = E\,S(X_0) = S(x_0)$$
and for $t \to \infty$, by dominated convergence,
$$E\,S(X_{T_{a,b}}) = S(x_0),$$
where
$$S(X_{T_{a,b}}) = \begin{cases} S(b) & \text{on } (T_b < T_a) \\ S(a) & \text{on } (T_a < T_b) \\ \lim_{t \to \infty} S(X_t) & \text{on } (T_{a,b} = +\infty) \end{cases}$$
exists by the martingale convergence theorem. Here, $T_c = \inf\{t : X_t = c\}$ where $c \in\, ]l, r[$, for $c = a$ or $b$. This is equal to the first passage time (5.1) in Chap. 5 applied in the neuronal Integrate and Fire models to represent the spike time. It will be shown below that $P(T_{a,b} < \infty) = 1$. Believing this for the moment, we find that
$$S(x_0) = E\,S(X_{T_{a,b}}) = S(b)P(T_b < T_a) + S(a)P(T_a < T_b),$$
i.e.,
$$P(T_b < T_a) = 1 - P(T_a < T_b) = \frac{S(x_0) - S(a)}{S(b) - S(a)}, \tag{2.5}$$
the first basic formula. Note that since $\lim_{t \to \infty} S(X_t)$ exists a.s. on $(T_{a,b} = +\infty)$ and because $S$ is strictly increasing and continuous, also $\lim_{t \to \infty} X_t$ exists a.s. on $(T_{a,b} = +\infty)$ (this is what we can say at the moment; remember that we shall show shortly that $P(T_{a,b} = +\infty) = 0$).

With $a < x_0 < b$ as before, let $\varphi : [a, b] \to \mathbb{R}$ be continuous and let $f$ denote the unique solution to
$$Af(x) = -\varphi(x), \quad a \le x \le b, \qquad f(a) = f(b) = 0.$$
Then, see (2.1), (2.2) and (2.4),
$$df(X_t) + \varphi(X_t)\,dt = f'(X_t)\sigma(X_t)\,dW_t$$
so
$$M_t(f) \equiv f(X_t) + \int_0^t \varphi(X_s)\,ds$$
is a continuous local martingale, hence so is $M(f)^{T_{a,b}}$. But since
$$\sup_{s \le t} |M_{T_{a,b} \wedge s}(f)| \le \sup_{a \le x \le b} |f(x)| + t \sup_{a \le x \le b} |\varphi(x)| < \infty,$$
$M(f)^{T_{a,b}}$ is a true martingale, in particular
$$E \int_0^{t \wedge T_{a,b}} \varphi(X_s)\,ds = f(x_0) - E f(X_{t \wedge T_{a,b}}).$$
Since $f$ is continuous, by the fact that $\lim_{t \to \infty} X_t$ exists,
$$\lim_{t \to \infty} f(X_{t \wedge T_{a,b}}) = f(X_{T_{a,b}})$$
exists a.s. and by dominated convergence,
$$E f(X_{T_{a,b}}) = \lim_{t \to \infty} E f(X_{t \wedge T_{a,b}}).$$
Furthermore, by monotone convergence if $\varphi \ge 0$ or $\varphi \le 0$,
$$\lim_{t \to \infty} E \int_0^{t \wedge T_{a,b}} \varphi(X_s)\,ds = E \int_0^{T_{a,b}} \varphi(X_s)\,ds$$
so that for such $\varphi$,
$$E \int_0^{T_{a,b}} \varphi(X_s)\,ds = f(x_0) - E f(X_{T_{a,b}}).$$
Taking $\varphi \equiv 1$ on $[a, b]$ this gives
$$E\,T_{a,b} = f_0(x_0) - E f_0(X_{T_{a,b}}),$$
where $f_0$ solves $Af_0 \equiv -1$, $f_0(a) = f_0(b) = 0$. Since the expression on the right hand side is finite, $E\,T_{a,b} < \infty$ follows. In particular, $P(T_{a,b} < \infty) = 1$, and we have shown the basic scale function formula (2.5). Also, for general $\varphi$, since it is now clear that $E f(X_{T_{a,b}}) = 0$ because $f(a) = f(b) = 0$,
$$E \int_0^{T_{a,b}} \varphi(X_s)\,ds = f(x_0).$$
This is certainly true if $\varphi \ge 0$ or $\varphi \le 0$. For general continuous $\varphi$, write $\varphi = \varphi^+ - \varphi^-$, with $\varphi^+ = \max(\varphi, 0)$ and $\varphi^- = -\min(\varphi, 0)$. Below, in Theorem 2.1, we give an identity which is true for all bounded Borel functions $\varphi : [a, b] \to \mathbb{R}$.

Lemma 2.1. Let $S$ be an arbitrary scale function with
$$S'(x) = c \exp\left(-\int_{x_0}^{x} \frac{2\mu(y)}{\sigma^2(y)}\,dy\right)$$
for some $c > 0$, and define $k :\, ]l, r[\, \to \mathbb{R}_+$ by
$$k(x) = \frac{2}{\sigma^2(x) S'(x)}.$$
Then the unique solution $f$ to $Af \equiv -\varphi$ on $[a, b]$, $f(a) = f(b) = 0$, where $\varphi$ is a given continuous function, is
$$f(x) = \int_a^b G_{a,b}(x, y)\varphi(y)k(y)\,dy,$$
where $G_{a,b}$ is the Green function, $G_{a,b}(x, y) = G_{a,b}(y, x)$,
$$G_{a,b}(x, y) = \frac{(S(x) - S(a))(S(b) - S(y))}{S(b) - S(a)}, \qquad a \le x \le y \le b.$$
Proof. Since $G_{a,b}(a, y) = G_{a,b}(x, b) = 0$, clearly $f(a) = f(b) = 0$. If $x < z \in [a, b]$,
$$\begin{aligned}
f(z) - f(x) = {} & \frac{S(x) - S(z)}{S(b) - S(a)} \int_a^x (S(y) - S(a))\varphi(y)k(y)\,dy \\
& + \frac{S(z) - S(x)}{S(b) - S(a)} \int_z^b (S(b) - S(y))\varphi(y)k(y)\,dy \\
& + \frac{1}{S(b) - S(a)} \int_x^z \bigl((S(y) - S(a))(S(b) - S(z)) - (S(x) - S(a))(S(b) - S(y))\bigr)\varphi(y)k(y)\,dy.
\end{aligned} \tag{2.6}$$
Look at the last term in (2.6). For $x \le y \le z$ we get that
$$\begin{aligned}
(S(y) - S(a))(S(b) - S(z)) - (S(x) - S(a))(S(b) - S(y)) & \le (S(z) - S(a))(S(b) - S(z)) - (S(x) - S(a))(S(b) - S(z)) \\
& = (S(z) - S(x))(S(b) - S(z))
\end{aligned}$$
and by a similar argument
$$(S(y) - S(a))(S(b) - S(z)) - (S(x) - S(a))(S(b) - S(y)) \ge -(S(z) - S(x))(S(x) - S(a)).$$
For the integral itself we therefore see that
$$\int_x^z \bigl((S(y) - S(a))(S(b) - S(z)) - (S(x) - S(a))(S(b) - S(y))\bigr)\varphi(y)k(y)\,dy \le (S(z) - S(x))(S(b) - S(z)) \int_x^z \varphi(y)k(y)\,dy$$
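and
$$\int_x^z \bigl((S(y) - S(a))(S(b) - S(z)) - (S(x) - S(a))(S(b) - S(y))\bigr)\varphi(y)k(y)\,dy \ge -(S(z) - S(x))(S(x) - S(a)) \int_x^z \varphi(y)k(y)\,dy.$$
Dividing in (2.6) by $S(z) - S(x)$ and taking limits as $z \downarrow x$, it now follows that
$$\frac{f'(x)}{S'(x)} = -\frac{1}{S(b) - S(a)} \int_a^x (S(y) - S(a))\varphi(y)k(y)\,dy + \frac{1}{S(b) - S(a)} \int_x^b (S(b) - S(y))\varphi(y)k(y)\,dy$$
and differentiating this with respect to $x$ gives
$$\left(\frac{f'}{S'}\right)'(x) = -\frac{1}{S(b) - S(a)} \bigl((S(x) - S(a))\varphi(x)k(x) + (S(b) - S(x))\varphi(x)k(x)\bigr) = -\varphi(x)k(x).$$
It remains only to verify that
$$\left(\frac{f'}{S'}\right)' = \frac{1}{S'}\bigl(f'' - (\log S')' f'\bigr) = \frac{k\sigma^2}{2}\left(f'' + \frac{2\mu}{\sigma^2} f'\right) = kAf.$$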
□

The measure $m$ on $]l, r[$ with density $k$, $m(dx) = k(x)\,dx$, is called the speed measure for the diffusion $X$. Note that if the scale function $S$ is replaced by $c_1 + c_2 S$ (where $c_1 \in \mathbb{R}$, $c_2 > 0$), $k$ is replaced by $\frac{1}{c_2} k$.

We summarize the results obtained so far, writing $P^{x_0}, E^{x_0}$ instead of $P, E$ to emphasize the initial value $X_0 \equiv x_0$, which is called $x$ in the theorem.

Theorem 2.1. With $X$ given by (2.1), a diffusion with values in $]l, r[$, where $\mu$ and $\sigma > 0$ are continuous, and where $X_0 \equiv x \in\, ]l, r[$, it holds for $a < x < b$, $a, b \in\, ]l, r[$, that $P^x(T_{a,b} < \infty) = 1$,
$$P^x(T_b < T_a) = 1 - P^x(T_a < T_b) = \frac{S(x) - S(a)}{S(b) - S(a)},$$
and for $\varphi : [a, b] \to \mathbb{R}$ bounded and measurable, that
$$E^x \int_0^{T_{a,b}} \varphi(X_s)\,ds = \int_a^b G_{a,b}(x, y)\varphi(y)k(y)\,dy.$$
In particular,
$$E^x T_{a,b} = \int_a^b G_{a,b}(x, y)k(y)\,dy.$$
In the formulas above, $S$, given by (apart from an additive constant)
$$S'(x) = \exp\left(-\int_{x_0}^{x} \frac{2\mu(y)}{\sigma^2(y)}\,dy\right),$$
is an arbitrary scale function and
$$k(x) = \frac{2}{\sigma^2(x) S'(x)}$$
is the corresponding speed measure density.

Example 2.1. If $X$ is a BM(1)-process, $X$ is a martingale, so $S(x) = x$ is a scale function which corresponds to $k \equiv 2$, i.e. the speed measure is two times the Lebesgue measure. Further
$$P^x(T_b < T_a) = \frac{x - a}{b - a}, \qquad E^x T_{a,b} = (x - a)(b - x)$$
for $a \le x \le b \in \mathbb{R}$. If $X$ is a Brownian motion with drift $\mu$, diffusion coefficient $\sigma$, i.e., $X_t = X_0 + \mu t + \sigma B_t$, then
$$S'(x) = e^{-\frac{2\mu}{\sigma^2} x}, \qquad k(x) = \frac{2}{\sigma^2} e^{\frac{2\mu}{\sigma^2} x}.$$
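The formulas of Theorem 2.1 are easy to evaluate numerically once $S'$ is known. The following is a minimal sketch, assuming a Simpson quadrature and using the closed-form $S'$ of Example 2.1; the helper names `simpson`, `exit_probability` and `mean_exit_time` are illustrative choices, not taken from the text.

```python
import math

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def exit_probability(s_prime, a, x, b):
    """P^x(T_b < T_a) = (S(x) - S(a)) / (S(b) - S(a)), with S obtained by integrating S'."""
    return simpson(s_prime, a, x) / simpson(s_prime, a, b)

def mean_exit_time(s_prime, sigma, a, x, b):
    """E^x T_{a,b} = int_a^b G_{a,b}(x, y) k(y) dy with k = 2 / (sigma^2 S')."""
    def S(y):                       # scale function normalised so that S(a) = 0
        return simpson(s_prime, a, y)
    def k(y):                       # speed measure density
        return 2.0 / (sigma(y) ** 2 * s_prime(y))
    Sx, Sb = S(x), S(b)
    # The Green function has a kink at y = x, so the integral is split there.
    lower = simpson(lambda y: S(y) * (Sb - Sx) * k(y), a, x)
    upper = simpson(lambda y: Sx * (Sb - S(y)) * k(y), x, b)
    return (lower + upper) / Sb

# Brownian motion with drift mu = 1 and sigma = 1, started at x = 0.3 in [0, 1].
def s_prime_drift(y, mu=1.0, sig=1.0):
    return math.exp(-2.0 * mu * y / sig ** 2)

p_up = exit_probability(s_prime_drift, 0.0, 0.3, 1.0)
t_exit = mean_exit_time(s_prime_drift, lambda y: 1.0, 0.0, 0.3, 1.0)
print(p_up, t_exit)
```

With `s_prime = lambda y: 1.0` the same functions should reproduce the standard Brownian values $(x-a)/(b-a)$ and $(x-a)(b-x)$ of Example 2.1, which is a convenient sanity check before applying the sketch to other coefficients.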
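2.4 Boundary Behavior

So far we have assumed that
$$P^x\left(\bigcap_{t \ge 0} \left(X_t \in\, ]l, r[\right)\right) = 1,$$
i.e., that $T_r = T_l \equiv \infty$ $P^x$-almost surely. The next result tells us which properties of $S$ and $k$ prevent $X$ from reaching either of the boundaries $l$ and $r$. Throughout $S$ is a given scale, $k$ the matching density for the speed measure.

Theorem 2.2. Define
$$S(r) = \lim_{y \uparrow r} S(y) \le \infty, \qquad S(l) = \lim_{y \downarrow l} S(y) \ge -\infty.$$
(i) Either
$$S(r) = \infty \quad \text{or} \quad \int_y^r (S(r) - S(z))k(z)\,dz = \infty, \quad y \in\, ]l, r[,$$
and similarly, either
$$S(l) = -\infty \quad \text{or} \quad \int_l^y (S(z) - S(l))k(z)\,dz = \infty, \quad y \in\, ]l, r[.$$
(ii) For $l < a < x < b < r$,
$$P^x(T_a < \infty) = \frac{S(r) - S(x)}{S(r) - S(a)}, \qquad P^x(T_b < \infty) = \frac{S(x) - S(l)}{S(b) - S(l)}.$$
In particular $P^x(T_y < \infty) > 0$ for all $x, y \in\, ]l, r[$, $P^x(T_a < \infty) = 1$ if and only if $S(r) = \infty$ and $P^x(T_b < \infty) = 1$ if and only if $S(l) = -\infty$.
(iii) If $S(r) < \infty$, then $\lim_{t \to \infty} X_t = r$ $P^x$-a.s. on $A_-$, where
$$A_- = \bigcup_{a : a < x} (T_a = \infty), \quad \text{and} \quad P^x(A_-) = \frac{S(x) - S(l)}{S(r) - S(l)},$$
and if $S(l) > -\infty$, then $\lim_{t \to \infty} X_t = l$ $P^x$-a.s. on $A_+$, where
$$A_+ = \bigcup_{b : b > x} (T_b = \infty), \quad \text{and} \quad P^x(A_+) = \frac{S(r) - S(x)}{S(r) - S(l)}.$$
(iv) It holds that
$$P^x\left(\lim_{t \to \infty} X_t = r\right) = 1 \quad \text{if } S(r) < \infty \text{ and } S(l) = -\infty,$$
$$P^x\left(\lim_{t \to \infty} X_t = l\right) = 1 \quad \text{if } S(r) = \infty \text{ and } S(l) > -\infty.$$
(v) If $S(r) < \infty$ and $S(l) > -\infty$ then
$$P^x\left(\lim_{t \to \infty} X_t = r\right) = 1 - P^x\left(\lim_{t \to \infty} X_t = l\right) = \frac{S(x) - S(l)}{S(r) - S(l)}.$$
(vi) If $S(r) = \infty$ and $S(l) = -\infty$ then $X$ is recurrent in the sense that
$$P^x\left(\bigcap_{y \in\, ]l, r[}\; \bigcap_{t > 0}\; \bigcup_{s > t} (X_s = y)\right) = 1,$$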
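i.e., $X$ hits any level infinitely often in any interval $[t, \infty[$, $t \ge 0$.

Proof. Let $l < a < x < b < r$. For $b \uparrow r$, $T_b \uparrow T_r \equiv \infty$ (by the assumption that $X$ never hits $r$), so $1_{(T_b < T_a)} \to 1_{(T_a = \infty)}$ $P^x$-a.s., and by (2.5) and dominated convergence
$$P^x(T_a = \infty) = \lim_{b \to r} \frac{S(x) - S(a)}{S(b) - S(a)} = \frac{S(x) - S(a)}{S(r) - S(a)},$$
proving (ii).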
If $S(r) < \infty$, $S(X)^{T_a}$ is a bounded local martingale, hence a true martingale, so the random variable
$$S(X(T_a)) = \begin{cases} S(a) & \text{on } (T_a < \infty) \\ \lim_{t \to \infty} S(X_t) & \text{on } (T_a = \infty) \end{cases}$$
is well defined $P^x$-a.s. and satisfies
$$E^x S(X_{T_a}) = S(x).$$
On the other hand,
$$S(x) = E^x S(X_{T_a}) = S(a)\frac{S(r) - S(x)}{S(r) - S(a)} + E^x\bigl(S(X_{T_a}) 1_{(T_a = \infty)}\bigr)$$
implying that
$$E^x\bigl(S(X_{T_a}) 1_{(T_a = \infty)}\bigr) = S(r)\frac{S(x) - S(a)}{S(r) - S(a)} = S(r)P^x(T_a = \infty).$$
Since $S(X_{T_a}) \le S(r)$, it follows that $S(X_{T_a}) = S(r)$ $P^x$-a.s. on $(T_a = \infty)$, i.e., $\lim_{t \to \infty} X_t = r$ $P^x$-a.s. on $(T_a = \infty)$, and (iii) follows since $(T_a = \infty) \uparrow A_-$ as $a \downarrow l$ so
$$P^x(A_-) = \lim_{a \downarrow l} P^x(T_a = \infty) = \frac{S(x) - S(l)}{S(r) - S(l)}.$$
Now we can prove (i): if $S(r) < \infty$, $\lim_{b \uparrow r} T_{a,b} = T_a$ $P^x$-a.s. and so by monotone convergence
$$E^x T_a = \lim_{b \uparrow r} \int_a^b G_{a,b}(x, y)k(y)\,dy.$$
But since by (ii), $P^x(T_a = \infty) > 0$, the left hand side equals $\infty$. The right hand side equals
$$\begin{aligned}
& \lim_{b \uparrow r} \left[\int_x^b \frac{(S(x) - S(a))(S(b) - S(y))}{S(b) - S(a)} k(y)\,dy + \int_a^x \frac{(S(y) - S(a))(S(b) - S(x))}{S(b) - S(a)} k(y)\,dy\right] \\
& \quad = \frac{S(x) - S(a)}{S(r) - S(a)} \int_x^r (S(r) - S(y))k(y)\,dy + \frac{S(r) - S(x)}{S(r) - S(a)} \int_a^x (S(y) - S(a))k(y)\,dy
\end{aligned}$$
where the last term is finite; consequently the first term equals $+\infty$ and (i) is proved.
It remains to establish (vi). From (ii) we know that $P^x(T_a < \infty) = P^x(T_b < \infty) = 1$ for all $a < x$, $b > x$. Let $a_n \downarrow l$, $b_n \uparrow r$, then $T_{a_n} \uparrow \infty$, $T_{b_n} \uparrow \infty$ $P^x$-a.s. and between $T_{a_n}$ and $T_{b_n}$, $X$ passes through all levels $y \in [a_n, b_n]$ since it is continuous. (vi) follows easily from this. □

Instead of starting with a solution to (2.1), assume given an open interval $]l, r[$ and continuous functions $\mu :\, ]l, r[\, \to \mathbb{R}$, $\sigma :\, ]l, r[\, \to \mathbb{R}_+ =\, ]0, \infty[$ that satisfy the condition from Theorem 2.2 (i):
$$S(r) = +\infty \quad \text{or} \quad \int_y^r (S(r) - S(z))k(z)\,dz = +\infty, \quad y \in\, ]l, r[,$$
$$S(l) = -\infty \quad \text{or} \quad \int_l^y (S(z) - S(l))k(z)\,dz = +\infty, \quad y \in\, ]l, r[,$$
where for some $x_0 \in\, ]l, r[$,
$$S'(x) = \exp\left(-\int_{x_0}^{x} \frac{2\mu(y)}{\sigma^2(y)}\,dy\right), \qquad k(x) = \frac{2}{\sigma^2(x) S'(x)}.$$

Theorem 2.3. Let $]l, r[$, $\mu$, $\sigma$ be as above, let $W$ be a BM(1)-process on the filtered space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ and let $U \in \mathcal{F}_0$ be a given random variable with values in $]l, r[$. Then the SDE
$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad X_0 \equiv U,$$
has a unique solution which is a diffusion. If $U \equiv x_0$, the distribution $\Pi^{x_0}$ of $X$, viewed as a random variable with values in $C_{\mathbb{R}_0}(]l, r[)$, the space of continuous paths $w : \mathbb{R}_0 \to\, ]l, r[$, does not depend on the choice of $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ and $W$ (uniqueness in law), and with an arbitrary initial condition $U \in \mathcal{F}_0$, the distribution of $X$ is the mixture $\int_{]l, r[} \Pi^x\, P(U \in dx)$.

This very important result we cannot prove here. At best we could give a proof when $\mu, \sigma$ are Lipschitz on any interval $]\lambda, \rho[$ where $l < \lambda < \rho < r$. Some of the ideas in a proof are contained in the following.

Example 2.2. Let $W = (W^{(1)}, \ldots, W^{(d)})$ be a BM($d$)-process (Brownian motion in $d$ dimensions) where $d \ge 2$, let $a > 0$ and define
$$X_t = \|\tilde{W}_t\| = \left(\sum_{j=1}^{d} \bigl(\tilde{W}_t^{(j)}\bigr)^2\right)^{1/2}$$
where $\tilde{W}_t^{(j)} = W_t^{(1)} + a$ for $j = 1$, $\tilde{W}_t^{(j)} = W_t^{(j)}$ for $j \ge 2$.
$X$ is a $d$-dimensional Bessel process (BES($d$)); the dimension refers of course to the Brownian motion since $X$ is one-dimensional starting at $a > 0$. We shall first study the properties of $X$ using Itô's formula. However, $x \mapsto \|x\|$ is $C^2$ only on $\mathbb{R}^d \setminus \{0\}$, so it is necessary to stop $X$ before it hits 0. Let $0 < r < a$ and define
$$T = \inf\{t : X_t = r\}.$$
Then $X^T = \|\tilde{W}^T\|$, and by the multi-dimensional version of Itô's formula, using that
$$\frac{\partial}{\partial x_i}\|x\| = \frac{x_i}{\|x\|}, \qquad \frac{\partial^2}{\partial x_i \partial x_j}\|x\| = \frac{\delta_{ij}}{\|x\|} - \frac{x_i x_j}{\|x\|^3},$$
we get
$$dX_t^T = \sum_{j=1}^{d} \frac{1}{X_t^T} \bigl(\tilde{W}_t^{(j)}\bigr)^T d\bigl(\tilde{W}_t^{(j)}\bigr)^T + \frac{1}{2}\frac{d - 1}{X_t^T}\,d(t \wedge T),$$
in particular,
$$[X^T]_t = \sum_{j=1}^{d} \int_0^{T \wedge t} \frac{1}{X_s^2}\bigl(\tilde{W}_s^{(j)}\bigr)^2\,ds = T \wedge t.$$
Next, let $f_d : \mathbb{R}_{>0} \to \mathbb{R}$ solve $\frac{d - 1}{2x} f_d'(x) = -\frac{1}{2} f_d''(x)$, i.e.,
$$f_d(x) = \begin{cases} \log x & \text{if } d = 2 \\ -\dfrac{1}{d - 2}\, x^{-(d - 2)} & \text{if } d \ge 3. \end{cases}$$
Then
$$df_d(X^T) = f_d'(X^T)\,dX^T + \tfrac{1}{2} f_d''(X^T)\,d[X^T] = \frac{1}{(X^T)^d} \sum_{j=1}^{d} \bigl(\tilde{W}^{(j)}\bigr)^T d\bigl(\tilde{W}^{(j)}\bigr)^T,$$
i.e., $f_d(X^T)$ is a continuous local martingale. It follows that for every $n$ and $N \in \mathbb{N}$, $f_d(X^{T_{n,N}})$ is a true martingale (because it is a bounded local martingale) with
$$T_{n,N} = \inf\{t : X_t = 1/n \text{ or } X_t = N\},$$
assuming that $1/n < a < N$. Clearly $P(T_{n,N} < \infty) = 1$ (because $T_{n,N} \le \inf\{t : |W_t^{(2)}| = N\} < \infty$ a.s.) so
$$f_d\bigl(X^{T_{n,N}}(\infty)\bigr) = f_d(X_{T_{n,N}}) = f_d(1/n) \quad \text{or} \quad f_d(N). \tag{2.7}$$
Using optional sampling on the uniformly integrable (since bounded) martingale $f_d(X^{T_{n,N}})$ we get
$$E\bigl(f_d(X_{T_{n,N}}) \mid \mathcal{F}_{T_{n-1,N}}\bigr) = f_d(X_{T_{n-1,N}}),$$
i.e. for $N > a$ fixed, $(f_d(X_{T_{n,N}}))_n$ is a discrete time martingale, bounded above by the constant $f_d(N)$. Hence
$$\lim_{n \to \infty} f_d(X_{T_{n,N}}) = Z_N$$
exists a.s. as a finite limit, but since (2.7) holds and $f_d(1/n) \to -\infty$, necessarily $X_{T_{n,N}} = N$ for $n$ sufficiently large, i.e., $X$ hits any given high level $N$ before it hits levels sufficiently close to 0. It follows that with probability 1, $X$ will never hit 0. But then we may use Itô's formula directly and deduce that
$$dX_t = \frac{1}{2}\frac{d - 1}{X_t}\,dt + \frac{1}{X_t} \sum_{j=1}^{d} \tilde{W}_t^{(j)}\,d\tilde{W}_t^{(j)}.$$
The last term corresponds to a continuous local martingale $Y$ with $Y_0 \equiv 0$ and with quadratic variation process
$$[Y]_t = \sum_{j=1}^{d} \int_0^t \frac{\bigl(\tilde{W}_s^{(j)}\bigr)^2}{X_s^2}\,ds = t,$$
hence by Lévy's characterization of Brownian motion, $Y$ is a BM(1)-process $W$, and $X$ solves
$$dX_t = \frac{1}{2}\frac{d - 1}{X_t}\,dt + dW_t, \qquad X(0) \equiv a.$$
We have now shown that BES($d$) is a diffusion with values in $]0, \infty[$, with scale function $f_d$ and speed measure density
$$k_d(x) = \frac{2}{f_d'(x)} = 2x^{d - 1}.$$
By Theorem 2.2, $X$ is recurrent for $d = 2$ (in particular it gets arbitrarily close to 0 without ever hitting 0), while for $d \ge 3$,
$$P\left(\lim_{t \to \infty} X_t = \infty\right) = 1, \qquad P(T_r < \infty) = \left(\frac{r}{a}\right)^{d - 2} \quad (r \le a).$$
Note that if you are good at integration, you should be able to prove that for $d = 2$ the local martingale $f_d(X) = \log X$ is not a true martingale, simply by showing that
$$E \log X(1) = \log a + \int_a^{\infty} \frac{1}{r} e^{-\frac{1}{2} r^2}\,dr > \log a.$$
□
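The scale function and the hitting probability for BES($d$) can also be read off from the general formulas of Sect. 2.3 and Theorem 2.2 (ii). The short check below is a sketch under the assumptions that the drift is $(d-1)/(2x)$, the diffusion coefficient is 1, the base point is $x_0 = 1$ and $d \ge 3$; the scale function is fixed only up to an additive constant.

```latex
\begin{aligned}
S'(x) &= \exp\Bigl(-\int_1^x \frac{d-1}{y}\,dy\Bigr) = x^{-(d-1)}, \qquad
k(x) = \frac{2}{S'(x)} = 2x^{d-1}, \\
S(x) &= -\frac{x^{-(d-2)}}{d-2}, \qquad S(0+) = -\infty, \quad S(\infty) = 0, \\
P^a(T_r < \infty) &= \frac{S(\infty) - S(a)}{S(\infty) - S(r)}
  = \frac{a^{-(d-2)}}{r^{-(d-2)}} = \Bigl(\frac{r}{a}\Bigr)^{d-2}, \qquad 0 < r \le a .
\end{aligned}
```

Since $S(0+) = -\infty$ and $S(\infty) < \infty$, Theorem 2.2 (iv) gives $X_t \to \infty$ a.s., in agreement with the statements for $d \ge 3$ in Example 2.2.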
Remark 2.1. The results described above give a complete description of diffusions of the form (2.1) that move on an open interval $]l, r[$. The classical theory of so-called regular diffusions (one-dimensional) characterises each such diffusion through its scale function and speed measure. Apart from including half-open or closed intervals from $l$ to $r$ corresponding to reflecting or absorbing boundaries, this theory allows for a much wider collection of scales and speeds. In particular, the appropriate scale functions $S$ are just strictly increasing and continuous, while the speed measures $m$ must satisfy that $0 < m([a, b]) < \infty$ for all $a < b \in\, ]l, r[$, but $m$ need not have a density $k$ and may even have atoms, $m(\{c\}) > 0$ for some $c \in\, ]l, r[$. The behaviour of $S$ and $m$ close to the boundaries $l$ and $r$ depends on whether the boundary point is included in the interval from $l$ to $r$ or not, i.e. on whether the boundary point is a possible value of the process or not. Two classical references on the theory of regular diffusions are [2] and [1].

Let now again $X$ be a diffusion on $]l, r[$,
$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad X(0) \equiv U \in \mathcal{F}_0$$
with scale function $S$, speed measure density $k$, satisfying the critical conditions from Theorem 2.2 (i). As usual $\mu, \sigma$ are continuous with $\sigma > 0$. The problem we shall now study is that of investigating whether there exists a probability $\nu$ on $]l, r[$, such that if $U$ has distribution $\nu$, $X$ is stationary, i.e., for all $t$, $X_t$ has distribution $\nu$. $\nu$ is also called an invariant probability for $X$. If it exists, $\nu$ is uniquely determined and typically, for all $x$, $p_t(x, \cdot) \xrightarrow{w} \nu$ (weak convergence) as $t \to \infty$. Here, $p_t(x, \cdot)$ is the transition probability from $x$, i.e. $p_t(x, B) = P^x(X_t \in B)$.

Theorem 2.4. $X$ has an invariant probability $\nu$ if and only if
$$K \equiv \int_l^r k(x)\,dx < \infty,$$
and in that case
$$\nu(dx) = \frac{1}{K} k(x)\,dx.$$
In particular, in order for the invariant probability to exist, it is necessary that $X$ be recurrent, $S(r) = \infty$, $S(l) = -\infty$.

Proof (partial). Suppose first that the invariant probability measure $\nu$ exists. Let $\mathcal{K}$ denote the class of $f \in C^2(]l, r[)$ such that for some $l' < r'$ it holds that $f$ is constant on $]l, l'[$ and constant on $]r', r[$ (the constant values may be different). Then, since
$$f(X_t) = f(X(0)) + \int_0^t Af(X_s)\,ds + \int_0^t f'(X_s)\sigma(X_s)\,dW_s, \tag{2.8}$$
and since $f$ is bounded and $Af$, $f'$ have compact support so that also $Af$, $f'\sigma$ are bounded, taking expectations and using $X_t \overset{D}{=} X(0)$, we obtain
$$\nu(Af) = 0 \qquad (f \in \mathcal{K}).$$
Consequently,
$$\int_l^r \left(\mu f' + \tfrac{1}{2}\sigma^2 f''\right) d\nu = 0 \qquad (f \in \mathcal{K}).$$
Assuming now that $\nu(dx) = u(x)\,dx$ this yields (use partial integration and that $f'$ has compact support),
$$\int_l^r \left(\mu u - \tfrac{1}{2}(\sigma^2 u)'\right) f'\,dx = 0.$$
But as $f'$ we can obtain any $g \in C^1$ with compact support (because $\int_{x_0}^x g(y)\,dy$ is constant close to $l$ and $r$, respectively, since $g$ vanishes close to $l$ and $r$), and the class of such $g$ is dense in $L^2(\text{Lebesgue}(l, r))$. Deduce that $\mu u - \tfrac{1}{2}(\sigma^2 u)' \equiv 0$ on $(l, r)$ and the desired expression for the invariant density follows.

We still need that the existence of $\nu$ implies $\int_l^r k(x)\,dx < \infty$ and uniqueness of $\nu$. We claim that if $\int_l^r k(x)\,dx < \infty$, then $u = \frac{1}{K} k$ is the density for the invariant measure, but the proof of this requires Markov process theory. We now know that $\nu(Af) = 0$ for $f \in \mathcal{K}$ where $\nu(dx) = u(x)\,dx$. We also have
$$P_t f(x) = f(x) + \int_0^t P_s(Af)(x)\,ds$$
from (2.8). Here $P_t f(x) = E^x(f(X_t))$. Also $\nu(P_t f) = \nu(f) + \int_0^t \nu(P_s(Af))\,ds$. One must now argue that $P_s f \in D(A)$ (standard definition of the domain $D(A)$ of $A$; $f \in C^2$ bounded, $Af$ bounded), that $P_s(Af) = A(P_s f)$ and finally that $\nu(A(P_s f)) = 0$ (or $\nu(A(g)) = 0$ for all $g \in D(A)$). Then
$$\nu(P_t f) = \nu(f) + \int_0^t \nu(A(P_s f))\,ds = \nu(f).$$
It remains to verify that if $\int_l^r k(x)\,dx < \infty$, then $S(r) = \infty$, $S(l) = -\infty$. But if $S(r) < \infty$, the condition from Theorem 2.2 (i) gives, for an arbitrary $x \in\, ]l, r[$,
$$\infty = \int_x^r (S(r) - S(y))k(y)\,dy \le \int_x^r k(y)\,dy\,(S(r) - S(x)),$$
so $\int_x^r k(y)\,dy < \infty$ forces $S(r) = \infty$; the argument for $S(l) = -\infty$ is analogous. □
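Example 2.3. The square root process (also called the Cox–Ingersoll–Ross process) solves the SDE
$$dX_t = (a + bX_t)\,dt + \sigma\sqrt{X_t}\,dW_t$$
with parameters $a, b \in \mathbb{R}$, $\sigma > 0$; see Chap. 1, p. 12, and Chap. 5, model (5.51), for the corresponding parametrizations used there. One is interested in a solution which is strictly positive and finite, i.e. $]l, r[\, =\, ]0, \infty[$. We use Theorem 2.2 to decide for what values of $a$, $b$, $\sigma > 0$ a strictly positive and finite solution exists. By computation
$$S'(x) = \exp\left(-\int_1^x \frac{2(a + by)}{\sigma^2 y}\,dy\right) = x^{-\frac{2a}{\sigma^2}} \exp\left(-\frac{2b}{\sigma^2}(x - 1)\right)$$
and
$$k(x) = \frac{2}{\sigma^2}\, x^{\frac{2a}{\sigma^2} - 1} \exp\left(\frac{2b}{\sigma^2}(x - 1)\right).$$
It follows that
$$S(0) = -\infty \iff \frac{2a}{\sigma^2} \ge 1,$$
$$S(\infty) = \infty \iff b < 0 \ \text{ or } \ b = 0,\ \frac{2a}{\sigma^2} \le 1.$$
In the case where $S(\infty) < \infty$ we compute
$$I = \int_x^{\infty} (S(\infty) - S(y))k(y)\,dy$$
for large $x$.

(i) If $b = 0$, $\frac{2a}{\sigma^2} > 1$,
$$I = K \int_x^{\infty} \left(\int_y^{\infty} z^{-\frac{2a}{\sigma^2}}\,dz\right) y^{\frac{2a}{\sigma^2} - 1}\,dy = +\infty.$$
(ii) If $b > 0$,
$$I = K \int_x^{\infty} \left(\int_y^{\infty} z^{-\frac{2a}{\sigma^2}} e^{-\frac{2b}{\sigma^2} z}\,dz\right) y^{\frac{2a}{\sigma^2} - 1} e^{\frac{2b}{\sigma^2} y}\,dy.$$
Rewrite the inner integral as
$$y^{-\frac{2a}{\sigma^2}} \int_y^{\infty} \left(\frac{z}{y}\right)^{-\frac{2a}{\sigma^2}} e^{-\frac{2b}{\sigma^2} z}\,dz$$
and use that $\left(\frac{z}{y}\right)^{\frac{2a}{\sigma^2}}$ stays bounded when $y \le z \le y + c$ for arbitrary $c > 0$, to deduce that
$$I \ge \tilde{K} \int_x^{\infty} y^{-\frac{2a}{\sigma^2}} e^{-\frac{2b}{\sigma^2} y}\, y^{\frac{2a}{\sigma^2} - 1} e^{\frac{2b}{\sigma^2} y}\,dy = +\infty.$$
Thus the conditions on $S$, $k$ from Theorem 2.2 (i) are always satisfied at $r = \infty$.

Finally we evaluate $J = \int_0^x (S(y) - S(0))k(y)\,dy$ for small $x > 0$ when $S(0) > -\infty$, i.e. when $\frac{2a}{\sigma^2} < 1$. But since $e^{\pm\frac{2b}{\sigma^2} z}$ is close to 1 for small $z$,
$$J \approx K \int_0^x \left(\int_0^y z^{-\frac{2a}{\sigma^2}}\,dz\right) y^{\frac{2a}{\sigma^2} - 1}\,dy < \infty$$
so when $\frac{2a}{\sigma^2} < 1$, the conditions on $S$, $k$ from Theorem 2.2 (i) are not satisfied for $l = 0$.

The conclusion is that the square root SDE has a strictly positive and finite solution if and only if
$$\frac{2a}{\sigma^2} \ge 1.$$
The solution is recurrent ($S(\infty) = \infty$, $S(0) = -\infty$) if and only if either $\frac{2a}{\sigma^2} \ge 1$, $b < 0$ or $\frac{2a}{\sigma^2} = 1$, $b = 0$. In the recurrent case, the process has an invariant probability if and only if $\frac{2a}{\sigma^2} \ge 1$, $b < 0$. The invariant probability is then a $\Gamma$-distribution.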
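The conclusions of Example 2.3 can be collected in a small helper. This is only a sketch: the function name and the dictionary keys are illustrative, and the Gamma shape and rate are read off from the speed density $k(x) \propto x^{2a/\sigma^2 - 1} e^{2bx/\sigma^2}$ under the assumption $b < 0$.

```python
def classify_square_root_process(a, b, sigma):
    """Boundary classification of dX = (a + b X) dt + sigma sqrt(X) dW on ]0, inf[.

    Encodes the conclusions of Example 2.3; names are illustrative only.
    """
    if sigma <= 0:
        raise ValueError("sigma must be strictly positive")
    ratio = 2.0 * a / sigma ** 2                         # the quantity 2a / sigma^2
    stays_positive = ratio >= 1.0                        # equivalent to S(0) = -infinity
    s_inf_infinite = b < 0 or (b == 0 and ratio <= 1.0)  # equivalent to S(infinity) = infinity
    recurrent = stays_positive and s_inf_infinite
    stationary = stays_positive and b < 0
    # For b < 0 the normalised speed density is a Gamma density with this shape and rate.
    gamma = {"shape": ratio, "rate": -2.0 * b / sigma ** 2} if stationary else None
    return {
        "stays_positive": stays_positive,
        "recurrent": recurrent,
        "stationary": stationary,
        "gamma": gamma,
    }

result = classify_square_root_process(a=1.0, b=-1.0, sigma=1.0)
print(result)
```

In the stationary case the mean of the Gamma law, shape divided by rate, equals $-a/b$, the point where the drift $a + bx$ vanishes.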
2.5 Expected Time to Hit a Given Level

We conclude this chapter with a discussion of the expected time for a diffusion to hit a given level.

Proposition 2.1. Let $X$ solve $dX = \mu(X)\,dt + \sigma(X)\,dW$ on $]l, r[$, with scale $S$ and speed density $k$. Then for $l < a < x$ the following holds:
(i) If $S(r) < \infty$ then $P^x(T_a = \infty) > 0$ and $E^x T_a = \infty$,
(ii) If $S(r) = \infty$ then $P^x(T_a < \infty) = 1$ and $E^x T_a < \infty$ if and only if $\int_x^r k(y)\,dy < \infty$,
(iii) If $S(r) = \infty$ and $S(l) = -\infty$, we have $E^x T_a < \infty$, $E^x T_b < \infty$ for all $a \in\, ]l, x[$, all $b \in\, ]x, r[$ if and only if $X$ has an invariant measure, i.e., $\int_l^r k(y)\,dy < \infty$.
Proof. (i) Follows from Theorem 2.2 (ii). (ii) If $S(r) = \infty$, we also know that $P^x(T_a < \infty) = 1$ from Theorem 2.2 (ii). By monotone convergence, as $b \uparrow r$,
$$\begin{aligned}
E^x T_a = \lim_{b \uparrow r} E^x T_{a,b}
& = \lim_{b \uparrow r} \left[\int_x^b \frac{(S(x) - S(a))(S(b) - S(y))}{S(b) - S(a)} k(y)\,dy + \int_a^x \frac{(S(y) - S(a))(S(b) - S(x))}{S(b) - S(a)} k(y)\,dy\right] \\
& = (S(x) - S(a)) \int_x^r k(y)\,dy + \int_a^x (S(y) - S(a))k(y)\,dy
\end{aligned}$$
and (ii) follows. (iii) Is a direct consequence of (ii) (applying also the version of (ii) with $S(l) = -\infty$). □
References
1. Freedman, D.: Brownian Motion and Diffusion. Holden-Day, San Francisco (1971)
2. Itô, K., McKean, H.P.: Diffusion Processes and Their Sample Paths. Springer, Berlin (1965)
Chapter 3
A Brief Introduction to Large Deviations Theory Gilles Wainrib
Abstract In this chapter we introduce the main concepts of large deviations theory. We state some of the main theorems with several examples, from Cramér's theorem for the sum of independent random variables, to the Freidlin–Wentzell theory of random perturbations of dynamical systems.
3.1 Introduction Large deviations theory is concerned with an asymptotic description of the fluctuations of a system around its most probable behavior. The first example of such a description goes back to Boltzmann’s 1877 calculation [4] for a system of independent particles, establishing a fundamental link between the notion of entropy and the asymptotic exponential behavior of multinomial probabilities. The entropy of a system at equilibrium measures the number of microscopic configurations leading to a given macroscopic state, and the state of maximum entropy corresponds to the most probable state. Not only at the core of thermodynamics and statistical physics, entropy has played a major role in many areas of science. In life sciences, entropy is an important concept, from evolution theory to protein unfolding and self assembling and molecular motors, not to mention its links with information theory which is widely applied in genetics or in neuroscience. Sharing the perspective of [11], large deviations theory may be viewed as a mathematical investigation of the concept of entropy. Describing fluctuations beyond the central limit theorem (CLT), this theory provides exponential estimates for rare events analysis, which is a field
G. Wainrib Laboratoire Analyse G´eom´etrie et Applications, Institut Galil´ee, Universit´e Paris 13, Villetaneuse, France e-mail:
[email protected] M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3 3, © Springer-Verlag Berlin Heidelberg 2013
of growing interest in many applied sciences. In Chap. 7 the theory is applied in a probabilistic treatment of the FitzHugh–Nagumo neuronal model.

Let us give an illustration with an elementary example. If one throws $n$ coins, then for large $n$ the proportion of heads will be close to $1/2$ with high probability. This is the law of large numbers. The CLT states that the typical fluctuations of order $1/\sqrt{n}$ around this value are asymptotically normally distributed. This result can be used to evaluate, for instance, the probability of having between 480 and 510 heads if $n = 1{,}000$. However, if one wants to evaluate the probability of having over 700 heads, which is a number exponentially small in $n$, then it is necessary to use the information contained in the higher moments of the random variable "coin toss," whereas the CLT only uses the first two moments. Large deviations theory provides answers to this question through an appropriate transform of the moment generating function (exponential moments) that is related to the concept of relative entropy.

Historically, the first mathematical result describing the large fluctuations around its mean for a sum of independent random variables is due to Cramér [6] (Sect. 3.2). The mathematical counterpart to Boltzmann's calculation is Sanov's theorem [17] (Sect. 3.2) for the empirical measure of independent random variables. The general theoretical framework (Sect. 3.3) for this type of asymptotic results has been developed afterwards, in particular by Stroock and Varadhan. A key result in this framework is the Gärtner–Ellis theorem (Sect. 3.3), generalizing the results of Sanov and Cramér to the case of dependent random variables. Small random perturbations of dynamical systems (Sect. 3.4) have been investigated by Freidlin and Wentzell [13] within the framework of large deviations for sample paths, and have many applications, for example for the problem of exit from a domain.
This is relevant for the leaky Integrate and Fire neuronal models treated in Chap. 5, where spike times of a neuron are represented by first passage times, see e.g. Sect. 5.4.1.3, see also Sect. 4.2.2.3 in Chap. 4. This chapter is not intended to be a detailed and precise account of large deviations results, but rather an introductory guide, and we encourage the curious reader to consult the standard mathematical textbooks [7, 8] on this topic.
3.2 Sum of Independent Random Variables

Consider a sequence of independent and identically distributed (iid) real random variables $X_1, X_2, \ldots$. If $m := E(X_1) < \infty$, then by the strong Law of Large Numbers (LLN), the empirical average
$$A_n = \frac{1}{n} \sum_{i=1}^{n} X_i \tag{3.1}$$
converges a.s. to $m$ when $n \to \infty$. When $n$ is large but finite, it is of interest to characterize the fluctuations of $A_n$ around $m$. A first answer to this question is given by the CLT, and concerns typical fluctuations of order $O(1/\sqrt{n})$ around $m$. More precisely, if $\sigma^2 := \mathrm{Var}(X_1) < \infty$, then $\sqrt{n}(A_n - m)$ converges in law to a Gaussian random variable $N(0, \sigma^2)$. However, the CLT does not describe properly fluctuations larger than $O(1/\sqrt{n})$. From the LLN, we know that with $a > m$, $p_n(a) := P[A_n > a]$ converges to 0 when $n \to \infty$, and we would like to estimate the speed of this convergence according to the value of $a$. The event $\{A_n > a\}$ is often called a rare event, since we will see that $p_n(a)$ becomes exponentially small when $n$ is large.

A first historical example comes from a problem related to the insurance industry: if $X_i$ is the claim of policy holder $i$, what is the probability that the total claim exceeds $na$, with $a > m$? That is, the focus is on the distribution tail of the total claim. Such a question is crucial, since the insurance company may not be able to refund policy holders above a critical value $na^*$, and $p_n(a^*)$ is then the probability of ruin. Contrary to the CLT where only the first two moments of $X_1$ characterize the asymptotic rescaled behavior, describing rare events requires exponential moments to integrate the distribution tail behavior. In the exponential scale, rare events have on average a significant contribution.

Theorem 3.1 (Cramér [6]). Assume $E\left[e^{\theta X_1}\right] < \infty$ for all $\theta \in \mathbb{R}$ and define:
$$\Lambda(\theta) := \ln E\left[e^{\theta X_1}\right] \quad \text{and} \quad I(x) := \sup_{\theta \in \mathbb{R}} \{\theta x - \Lambda(\theta)\} \tag{3.2}$$
the Legendre transform of $\Lambda$. Then, for all $a \in\, ]m, \infty[$ and $a' \in [0, m[$:
$$\lim_{n \to \infty} \frac{1}{n} \ln P[A_n > a] = -\inf_{x > a} I(x) \quad \text{and} \quad \lim_{n \to \infty} \frac{1}{n} \ln P\left[A_n < a'\right] = -\inf_{x < a'} I(x).$$
The complete proof can be found in [8]. An upper bound of $\frac{1}{n} \ln p_n(a)$ can be obtained, for $\theta \ge 0$, with
$$E\left[e^{n\theta A_n}\right] = \left(E\left[e^{\theta X_1}\right]\right)^n \ge e^{n\theta a}\, p_n(a)$$
so that $\frac{1}{n} \ln p_n(a) \le \Lambda(\theta) - \theta a$ for every $\theta \ge 0$. Optimizing over $\theta$ (for $a > m$ the supremum of $\theta a - \Lambda(\theta)$ over $\mathbb{R}$ is attained on $\theta \ge 0$) gives
$$\frac{1}{n} \ln p_n(a) \le -\sup_{\theta \in \mathbb{R}} \{\theta a - \Lambda(\theta)\}.$$
The lower bound is less straightforward and can be derived using an appropriate change of probability. We will see in the next section that $A_n$ is said to satisfy a large deviation principle of speed $n$ and rate function $I$.

Example 3.1 (Coin tossing). We want to estimate the probability of having $k = na$ heads in $n$ throws. The random variables $X_i$ are then Bernoulli variables with $P[X_i = 1] = 1/2$. To apply Cramér's Theorem, we compute $\Lambda(\theta) = \ln\left(\frac{e^{\theta} + 1}{2}\right) < \infty$; then $I(x) = \sup_{\theta \in \mathbb{R}} \{\theta x - \Lambda(\theta)\}$ is obtained by solving $x = \Lambda'(\theta)$, which gives $\theta(x) = \ln\left(\frac{x}{1 - x}\right)$ and
$$I(x) = x \ln(x) + (1 - x) \ln(1 - x) + \ln(2). \tag{3.4}$$
Fig. 3.1 Rate function $I(x) = x \ln(x) + (1 - x) \ln(1 - x) + \ln(2)$ for the coin tossing example
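The rate function (3.4) can be compared with exact binomial tail probabilities. The sketch below is illustrative only: the function names are not from the text, and the tail is computed with exact integer arithmetic so that very small probabilities do not underflow.

```python
import math

def rate_function(x):
    """I(x) = x ln x + (1 - x) ln(1 - x) + ln 2 on [0, 1], with the convention 0 ln 0 = 0."""
    def xlogx(u):
        return 0.0 if u == 0.0 else u * math.log(u)
    return xlogx(x) + xlogx(1.0 - x) + math.log(2.0)

def log_tail_exact(n, k):
    """ln P[number of heads > k] for n fair coins (requires 0 <= k < n)."""
    count = sum(math.comb(n, j) for j in range(k + 1, n + 1))
    # math.log accepts arbitrarily large Python integers.
    return math.log(count) - n * math.log(2.0)

n, k = 1000, 700
empirical_rate = -log_tail_exact(n, k) / n   # -(1/n) ln P[A_n > k/n]
print(empirical_rate, rate_function(k / n))
```

For finite $n$ the empirical rate lies slightly above $I(k/n)$, consistent with the Chernoff upper bound, and the gap shrinks as $n$ grows.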
A plot of $I$ is given in Fig. 3.1: $I$ is non-negative and has a unique zero at $x = 1/2$. Thus, by Cramér's Theorem, for $1/2 < a < 1$:
$$\lim_{n \to \infty} \frac{1}{n} \ln P[A_n > a] = -\inf_{x > a} I(x) = -a \ln(a) - (1 - a) \ln(1 - a) - \ln(2).$$
This asymptotic result for $P[A_n > a]$ can be obtained directly since $P(nA_n = k) = \frac{1}{2^n} \frac{n!}{k!(n - k)!}$, and using Stirling's approximation $n! \approx n^n e^{-n}$ one retrieves the same expression for $I(x)$. An elementary calculation shows that $nI(x)$ is also the relative entropy or Kullback–Leibler distance between the Binomial($n, x$) and Binomial($n, 1/2$) distributions. A given macroscopic state $A_n = a$ can be achieved by many microscopic states $(X_1, \ldots, X_n) \in \{0, 1\}^n$. Essentially, entropy counts the number of those microscopic states. Saying that the maximum of entropy corresponds to the most likely realization $A_n = 1/2$ is equivalent to the fact that $I(x)$ has its minimum (zero) at $x = 1/2$.

Empirical measure. It is possible to generalize Cramér's Theorem to the empirical measure associated with the sequence $(X_i)_{i \ge 1}$. We assume that the $X_i$ take values in a finite set $E = \{1, \ldots, d\}$, and are iid with distribution $(\mu_k)_{k \in E}$, with each $\mu_k > 0$. We define the empirical measure:
$$L_n = \frac{1}{n} \sum_{i=1}^{n} \delta_{X_i} \tag{3.5}$$
3 A Brief Introduction to Large Deviations Theory
The empirical measure is a random probability measure on $E$: it belongs to the probability simplex $\mathcal{M}(E) = \{\nu \in [0,1]^d;\ \sum_{k=1}^{d}\nu_k = 1\}$. Our purpose is to estimate the probability that $L_n$ is away from $\rho$. We thus need a distance on $\mathcal{M}(E)$: we consider the total variation distance $d(\mu,\nu) = \frac{1}{2}\sum_{s=1}^{d}|\mu_s - \nu_s|$. The strong LLN implies that $\lim_{n\to\infty} d(L_n,\rho) = 0$ with probability one. We define the ball of radius $a > 0$ for this distance, $B_a(\rho) = \{\nu\in\mathcal{M}(E);\ d(\nu,\rho)\le a\}$, and its complement $B_a^c(\rho) = \mathcal{M}(E)\setminus B_a(\rho)$.

Theorem 3.2 (Sanov). For all $a > 0$,
\[
\lim_{n\to\infty}\frac{1}{n}\ln P\left[L_n \in B_a^c(\rho)\right] = -\inf_{\nu\in B_a^c(\rho)} I_\rho(\nu) \tag{3.6}
\]
with
\[
I_\rho(\nu) := \sum_{s=1}^{d}\nu_s \ln\frac{\nu_s}{\rho_s}. \tag{3.7}
\]
This result can be proved directly, using Stirling's approximation to the multinomial law satisfied by $L_n$. The quantity $I_\rho(\nu)$ is actually the relative entropy $H(\nu\,|\,\rho)$ of $\nu$ with respect to $\rho$. More details can be found in [8, 17].
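As a small illustration of the relative entropy appearing in Sanov's Theorem, here is a hedged Python sketch (the helper names `relative_entropy` and `total_variation` are illustrative, and the convention $0\ln 0 = 0$ is assumed). For $d = 2$ and a uniform reference law, the relative entropy of $(1-x, x)$ coincides with the coin-tossing rate function of Example 3.1.

```python
import math


def relative_entropy(nu, rho):
    """Sum of nu_s * ln(nu_s / rho_s), with the convention 0 ln 0 = 0."""
    return sum(n * math.log(n / r) for n, r in zip(nu, rho) if n > 0)


def total_variation(mu, nu):
    """Total variation distance: half the l1 distance between the weights."""
    return 0.5 * sum(abs(m - n) for m, n in zip(mu, nu))


if __name__ == "__main__":
    rho = (0.5, 0.5)  # fair coin seen as a law on two states
    for x in (0.5, 0.6, 0.7, 0.9):
        nu = (1 - x, x)
        print(x, total_variation(nu, rho), relative_entropy(nu, rho))
```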
3.3 General Theory

Cramér's and Sanov's Theorems presented in the previous section can be seen as specific examples in a wider theory of asymptotic results concerning large fluctuations of various random objects. Large deviation theory has been developed by several authors, in particular Stroock and Varadhan. A common general framework is to consider a sequence of probability spaces $(\Omega_n, \mathcal{F}_n, \mathbb{P}_n)$ and a sequence of random variables $(X_n)$ taking values in $S$, a complete separable metric space. To the sequence $(X_n)$ is associated a sequence of laws $(P_n)$, defined by $P_n(C) = \mathbb{P}_n(X_n \in C)$. For instance, in the case of stochastic processes, $S$ is a function space and $P_n$ is the law of the process $X_n$. Let $(a_n)$ be such that $\lim_{n\to\infty} a_n = +\infty$.

Definition 3.1 (Large Deviation Principle (LDP)). The sequence $(X_n)$ satisfies a large deviation principle of speed $a_n$ and rate function $I(x)$ if:
1. For every closed subset $C$ of $S$,
\[
\limsup_{n\to\infty}\frac{1}{a_n}\ln P_n(C) \le -\inf_{x\in C} I(x) = -I(C).
\]
2. For every open subset $O$ of $S$,
\[
\liminf_{n\to\infty}\frac{1}{a_n}\ln P_n(O) \ge -\inf_{x\in O} I(x) = -I(O).
\]
3. $I$ is lower semi-continuous with compact level sets.
Instead of (1) and (2), we will write $P_n(K) \asymp e^{-a_n I(K)}$. As a first example, one can show that Cramér's Theorem can be reformulated as: $(A_n)$ satisfies a LDP of speed $n$ and rate function $I(x)$ given in (3.4).
The natural questions arising after this definition are how to prove a LDP and how to compute the rate function $I$. A rather general answer is given by the fundamental theorem of Gärtner–Ellis [10, 14], originally stated in finite dimension and later generalized to infinite dimension [2]. We consider the case where the $X_n$ are $\mathbb{R}^d$-valued. Let $\Lambda_n(\theta) := \frac{1}{a_n}\ln E\left[e^{a_n\,\theta\cdot X_n}\right]$ be the scaled cumulant generating function, for $\theta\in\mathbb{R}^d$.

Theorem 3.3 (Gärtner–Ellis). If $\Lambda(\theta) := \lim_{n\to\infty}\Lambda_n(\theta)$ is finite and differentiable for all $\theta\in\mathbb{R}^d$, then $(X_n)$ satisfies a LDP of speed $a_n$ and rate function
\[
I(x) = \sup_{\theta\in\mathbb{R}^d}\{\theta\cdot x - \Lambda(\theta)\}. \tag{3.8}
\]
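The Legendre transform in (3.8) is easy to evaluate numerically in one dimension. The following minimal Python sketch assumes the fair-coin cumulant generating function $\ln\frac{e^\theta+1}{2}$ of Example 3.1 and a bounded search interval for $\theta$; the names `cgf_fair_coin` and `legendre` are illustrative. Since $\theta \mapsto \theta x - \Lambda(\theta)$ is concave, a ternary search is enough to locate the supremum.

```python
import math


def cgf_fair_coin(theta):
    """ln((e^theta + 1) / 2), evaluated without overflow for large |theta|."""
    return max(theta, 0.0) + math.log1p(math.exp(-abs(theta))) - math.log(2)


def legendre(cgf, x, lo=-50.0, hi=50.0, iters=200):
    """Approximate sup over theta in [lo, hi] of theta * x - cgf(theta).

    Uses ternary search, which is valid because the objective is concave
    in theta whenever cgf is convex. The interval [lo, hi] is an assumption:
    it must contain the maximizer.
    """
    def objective(t):
        return t * x - cgf(t)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


if __name__ == "__main__":
    for x in (0.5, 0.6, 0.7, 0.9):
        closed_form = x * math.log(x) + (1 - x) * math.log(1 - x) + math.log(2)
        print(x, legendre(cgf_fair_coin, x), closed_form)
```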
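To give some heuristics behind the derivation of the rate function, we suppose that a LDP effectively holds. In this case, if $\rho_n(x)$ denotes the density of $X_n$:
\[
E\left[e^{a_n\,\theta\cdot X_n}\right] = \int e^{a_n\,\theta\cdot x}\rho_n(x)\,dx \approx \int e^{a_n\,\theta\cdot x}e^{-a_n I(x)}\,dx \tag{3.9}
\]
\[
\approx \exp\Big(a_n \sup_{x\in\mathbb{R}^d}\{\theta\cdot x - I(x)\}\Big) \tag{3.10}
\]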
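where the last line is obtained by the Laplace Principle. This principle is a general result enabling one to approximate integrals of the form $\int_A \exp(\lambda\varphi(x))\,dx$ by $\exp\big(\lambda\sup_{x\in A}\varphi(x)\big)$ for large $\lambda$, in the sense that:
\[
\lim_{\lambda\to\infty}\frac{1}{\lambda}\ln\int_A \exp(\lambda\varphi(x))\,dx = \sup_{x\in A}\varphi(x).
\]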
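The Laplace Principle can be observed on a simple example. The sketch below is an illustration under stated assumptions: the test function $\varphi(x) = x(1-x)$ on $A = [0,1]$ (with supremum $1/4$), the midpoint quadrature rule and the name `laplace_exponent` are all choices made here, not taken from the text.

```python
import math


def laplace_exponent(phi, lam, a=0.0, b=1.0, n=20000):
    """(1/lam) * ln of a midpoint-rule approximation of int_a^b exp(lam*phi).

    The integral is accumulated in log space so that large lam does not
    overflow.
    """
    dx = (b - a) / n
    vals = [lam * phi(a + (i + 0.5) * dx) for i in range(n)]
    m = max(vals)
    return (m + math.log(sum(math.exp(v - m) for v in vals) * dx)) / lam


if __name__ == "__main__":
    phi = lambda x: x * (1 - x)  # supremum 0.25, attained at x = 0.5
    for lam in (10.0, 100.0, 1000.0):
        print(lam, laplace_exponent(phi, lam))
```

The printed values increase towards $0.25$; the remaining gap is of order $\ln(\lambda)/\lambda$, coming from the Gaussian prefactor that the logarithmic limit discards.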
We refer to [7] for more details about the Laplace Principle. As a consequence, $\Lambda_n(\theta) \to \sup_{x\in\mathbb{R}^d}\{\theta\cdot x - I(x)\}$, and as $\Lambda(\theta)$ is differentiable, one can show that $I$ is strictly convex and that the Legendre transform is reversible, in the sense that $\sup_{x\in\mathbb{R}^d}\{\theta\cdot x - I(x)\} = \Lambda(\theta)$ is equivalent to (3.8).

Example 3.2. Applying the Gärtner–Ellis theorem to $A_n$ as defined in (3.1) with $a_n = n$ yields $\Lambda_n(\theta) = \frac{1}{n}\ln E\left[e^{n\theta A_n}\right] \to \ln E\left[e^{\theta\xi_1}\right] = \Lambda(\theta)$. In this case of a sum of iid random variables, $\Lambda(\theta)$ is analytic and in particular differentiable. The strength of the Gärtner–Ellis Theorem is that it deals with sums of dependent variables as well.

Remark on convex and non-convex rate functions. We have presented here a weak version of the Gärtner–Ellis theorem: the differentiability condition for $\Lambda(\theta)$ can be weakened. It is related to the convexity properties of the rate function. At this stage, remark that if a LDP is obtained with the Gärtner–Ellis theorem, then the rate function is necessarily strictly convex. This comes from the fact that the Legendre transform of a differentiable function (here $\Lambda$) is necessarily strictly convex [16].
Hence, this theorem cannot be used to obtain nonconvex rate functions, in particular rate functions with several minima. A detailed discussion of this question as well as several interesting examples can be found in [19].

Varadhan's Lemma and change of measure. Note that a more general convergence result, known as Varadhan's Lemma, extends the Laplace approximation to a wider setting, namely:
\[
\Lambda(f) = \lim_{n\to\infty} a_n^{-1}\ln E\left[e^{a_n f(X_n)}\right] = \sup_{x}\{f(x) - I(x)\}.
\]
Another form of this result can be used to derive one LDP from another: if $(X_n)$ satisfies a LDP of speed $n$ with rate function $I_X$ and if $(Y_n)$ is such that its law $P^Y_n$ is defined by
\[
P^Y_n(A) = \int_A e^{nF(x)}\,P^X_n(dx) \Big/ \int_S e^{nF(x)}\,P^X_n(dx) \tag{3.11}
\]
(this means that the relative entropy between $P^X_n$ and $P^Y_n$ is of order $n$), then $Y_n$ satisfies a LDP of speed $n$ with rate function
\[
I_Y(x) = \sup_{y\in S}\{F(y) - I_X(y)\} - \big(F(x) - I_X(x)\big). \tag{3.12}
\]
Contraction principle. Another useful tool to obtain a LDP deals with the case of a sequence $Y_n$ defined as $Y_n = F(X_n)$, with $F$ continuous, when $X_n$ is known to satisfy a LDP of speed $a_n$ and rate function $I_X$. The contraction principle states that $Y_n$ satisfies a LDP of the same speed $a_n$ and rate function
\[
I_Y(y) = \inf_{x:\ F(x)=y} I_X(x).
\]
Remark that with the contraction principle, one can deduce Cramér's Theorem from Sanov's Theorem, with a function $F:\mathcal{M}(E)\to\mathbb{R}$ such that $F(\nu) = \sum_{k=1}^{d} k\,\nu_k$.

Relationship between LDP, LLN and CLT. To conclude this section, we go back to the LLN and the CLT, which can be derived from a LDP. We consider the case where the assumptions of the Gärtner–Ellis theorem hold. If the rate function $I(x)$ has a global minimum at $x^*$ and if $I(x^*) = 0$, then $x^* = \Lambda'(0) = \lim_{n\to\infty}E(X_n)$. More precisely, $X_n$ gets concentrated around $x^*$, since the mass that $P_n$ assigns to any neighborhood of $x^*$ converges to 1 exponentially fast when $n\to\infty$. Note that in this case, one also has $I'(x^*) = 0$. Moreover, if $I$ is twice differentiable at $x^*$, then $I(x) \approx \frac{1}{2}I''(x^*)(x-x^*)^2$, so that $P_n(dx) \asymp e^{-\frac{n}{2}I''(x^*)(x-x^*)^2}$. For iid sums, $I''(x^*) = 1/\Lambda''(0) = 1/\sigma^2$, as expected from the CLT. However, this relationship between LDP and CLT requires some specific assumptions to be valid [3], and the two following examples show that this question may be delicate.
Example 3.3. (i) The LDP does not imply the CLT [5]. Consider symmetric random variables $\{X_t\}_{t\ge 1}$ with distributions $P(|X_t| > x) = \exp(-x^2 t)$. The moment generating functions
\[
E\{\exp(tyX_t)\} = 1 + \frac{1}{2}\,y\,t^{1/2}\exp\Big(\frac{1}{4}ty^2\Big)\int_{-y\sqrt{t}/2}^{\,y\sqrt{t}/2} e^{-u^2}\,du
\]
are analytic; their normalized logarithms are real-analytic and converge to the (analytic) function $L(y) = \frac{1}{4}y^2$, but the convergence holds for real arguments $y$ only and the CLT fails. On the other hand, the LDP holds with the Gaussian rate function.

(ii) The CLT does not imply that the rate function has a quadratic minimum [19]. Let $S_n$ be the mean of $n$ iid random variables $X_1, X_2, \dots, X_n$ distributed according to the Pareto density
\[
p(x) = \frac{a}{(|x|+b)^{\beta}}
\]
with $\beta > 3$, $a, b > 0$. For $\beta > 3$, the variance is finite and the CLT holds for $n^{1/2}S_n$. However, the rate function of $S_n$ is everywhere equal to zero, since the density of $S_n$ has the same power-law tails as $p(x)$.
3.4 Some Large Deviations Principles for Stochastic Processes

3.4.1 Sanov Theorem for Markov Chains

Let $\xi_1, \xi_2, \dots$ be a Markov chain on a finite state space $E = \{1, \dots, d\}$, with transition matrix $Q = (Q_{ij})_{i,j\in E}$. We assume that $Q_{ij} > 0$ for all $i, j\in E$. We keep the same notation as in Sect. 3.2, and define the empirical measure:
\[
L_n = \frac{1}{n}\sum_{i=1}^{n}\delta_{\xi_i}. \tag{3.13}
\]
If $n$ is seen as time, $L_n(k)$ is the proportion of time the chain spends in state $k\in E$. With our assumptions, the stationary distribution $\mu$ of the Markov chain is unique, and we know that $L_n$ converges to $\mu$. We ask the question of the deviations of $L_n$ from $\mu$. Here, the appropriate space is $\mathcal{M}(E)$ with the total variation distance, which constitutes a complete separable metric space, on which the general theory applies. We present here a theorem for Markov chains in discrete time, but a similar result exists in the continuous time setting.
Theorem 3.4. The sequence $(L_n)$ satisfies a LDP of speed $n$ with rate function:
\[
I_Q(\nu) = \sup_{u\in(0,\infty)^d}\left[-\sum_{k=1}^{d}\nu_k\ln\frac{(Qu)_k}{u_k}\right]. \tag{3.14}
\]
The rate function $I_Q$ is finite, non-negative, continuous and strictly convex on $\mathcal{M}(E)$, and the stationary distribution $\mu$ is the only zero of $I_Q$. There are two main ways to prove this theorem. For both, the idea is to introduce the pair empirical measure $Z_n = \frac{1}{n}\sum_{i=1}^{n}\delta_{(\xi_i,\xi_{i+1})}$ and then to go back to the empirical measure by applying the contraction principle. The first way [11] is based on the relative entropy method: to obtain a LDP for the Markov chain (dependent sequence), one uses an existing LDP in the independent case and then shifts the rate function for the independent case with the relative entropy between the dependent and independent laws. This method of relative entropy is also useful to prove more difficult LDPs for systems of interacting particles [21]. The other method [8] is to apply the Gärtner–Ellis Theorem to the sequence $Z_n$: with $\theta\in\mathbb{R}^d\times\mathbb{R}^d$,
\[
\Lambda_n(\theta) = \frac{1}{n}\ln E\left[e^{n\,\theta\cdot Z_n}\right] = \frac{1}{n}\ln\sum_{i,j=1}^{d}\mu_i\,P^{n}_{ij}(\theta)\,e^{\theta_{ji}} \tag{3.15}
\]
where $P_{ij}(\theta) = Q_{ij}e^{\theta_{ij}}$. By the Perron–Frobenius theory, one can show that $\Lambda_n(\theta)$ converges to $\Lambda(\theta)$, the logarithm of the unique largest eigenvalue of $P(\theta)$, when $n\to\infty$. Then, working on the Legendre transform of $\Lambda(\theta)$, one ends up with a formula for the rate function, and concludes by the contraction principle.

Example 3.4. Consider a Markov chain with two states 0 and 1 and transition probabilities $Q_{00} = Q_{11} = p$ and $Q_{01} = Q_{10} = 1-p$. Then the rate function for the empirical measure is obtained by finding the supremum in (3.14), which can be rewritten by setting $v = u_1/u_2$:
\[
I_Q(\nu) = \sup_{v>0}\left[-\nu_0\ln\big(p + (1-p)/v\big) - \nu_1\ln\big(p + (1-p)v\big)\right].
\]
The supremum is attained for $v$ solution of $p\,\nu_1 v^2 - (1-p)(\nu_0-\nu_1)v - p\,\nu_0 = 0$, which gives a complicated expression for $I_Q$. In the case $p = 1/2$, the Markov chain is actually just a sequence of iid random variables, and one finds that $I_Q(\nu)$ is the relative entropy between the distribution $\nu$ and the distribution $\rho = (1/2, 1/2)$.
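The two-state rate function of Example 3.4 can be evaluated directly from the quadratic equation for $v$. The following Python sketch is a minimal illustration, assuming $0 < p < 1$ and $0 < \nu_0 < 1$; the function names are chosen here and are not from the text. It also checks the two properties stated above: the rate function vanishes at the stationary distribution $(1/2, 1/2)$, and for $p = 1/2$ it reduces to the relative entropy with respect to $(1/2, 1/2)$.

```python
import math


def rate_two_state(nu0, p):
    """Rate function of Example 3.4 for nu = (nu0, 1 - nu0).

    Assumes 0 < nu0 < 1 and 0 < p < 1.
    """
    nu1 = 1.0 - nu0
    # Positive root of p*nu1*v^2 - (1-p)*(nu0-nu1)*v - p*nu0 = 0.
    b = (1 - p) * (nu0 - nu1)
    disc = b * b + 4 * p * p * nu0 * nu1
    v = (b + math.sqrt(disc)) / (2 * p * nu1)
    return -nu0 * math.log(p + (1 - p) / v) - nu1 * math.log(p + (1 - p) * v)


def relative_entropy_to_uniform(nu0):
    """Relative entropy of (nu0, 1 - nu0) with respect to (1/2, 1/2)."""
    nu1 = 1.0 - nu0
    return nu0 * math.log(2 * nu0) + nu1 * math.log(2 * nu1)


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        print(p, [round(rate_two_state(x, p), 5) for x in (0.1, 0.3, 0.5, 0.7)])
```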
3.4.2 Small Noise and Freidlin–Wentzell Theory

In the above discussion on Markov chains, the asymptotic parameter $n$ can be interpreted as time. Here, we are interested in a situation where the level of noise is the asymptotic parameter and is going to zero. Thus, the focus is on the behavior
of a system around its deterministic trajectory. Two such situations arise naturally in biological models. First, dynamical systems may be subject to small external random perturbations, which leads in particular to the study of stochastic differential equations (SDEs) as $h\to 0$:
\[
dX^h_t = b(X^h_t)\,dt + \sqrt{h}\,\sigma(X^h_t)\,dW_t.
\]
Another common situation is that of a population of continuous-time Markov chains, which appears for instance in chemical kinetics or epidemiological models. In this case, the “noise” parameter is the inverse of the population size: indeed, when the population is infinite, the law of large numbers ensures a deterministic limit for an aggregate variable (such as the proportion of individuals in a given state); however, when the population size is finite, fluctuations remain, and large deviation theory may help in characterizing these finite-size intrinsic fluctuations around the deterministic limit. Both cases are included in a more general setting, where the process $X^h_t$ is a Markov process with initial distribution $P^h_x$ and an infinitesimal generator defined, for $f\in C^2$ with compact support, by:
\[
A^h f(x) = \sum_i b^i(x)f'_i(x) + \frac{h}{2}\sum_{i,j}a^{i,j}(x)f''_{ij}(x) + \frac{1}{h}\int_{\mathbb{R}^r}\Big[f(x+h\beta) - f(x) - h\sum_i\beta_i f'_i(x)\Big]\mu_x(d\beta)
\]
where $\mu_x$ is a measure on $\mathbb{R}^r\setminus\{0\}$ such that $\int|\beta|^2\mu_x(d\beta) < \infty$, and the matrix $a$ is defined as $a = \sigma\sigma^T$. The first term in the above sum corresponds to the drift, the second term to the diffusion and the third term to the jumps. The small noise assumption is twofold here: the diffusion is multiplied by a small parameter $h$, and the jumps are assumed to be small, of order $h$, with a frequency of order $1/h$, which is typically the case when studying proportions in a population of size $1/h$. In the context of stochastic processes, the state space is a function space and the rate function is a functional of a given trajectory. With $A$ being a set of trajectories, we are interested in quantities of the form:
\[
\lim_{h\to 0}\lambda(h)^{-1}\ln P\left[(X^h_t)_{t\ge 0}\in A\right]. \tag{3.16}
\]
If $X^h$ satisfies a LDP when $h\to 0$, with speed $\lambda(h)$ and rate function $I$, then the above limit will be roughly $-\inf_{\varphi\in A}I(\varphi)$.
3.4.2.1 Action Functional

Our aim is to show how to construct the rate function $I$ from the generator $A^h$. Of course, several technical conditions will be required for the LDP to be valid.
We are going to consider exponential moments and Legendre transforms, by analogy with the Cramér theorem for sums of independent variables (here the role of those independent variables is played by the independent increments of the process), or with the Gärtner–Ellis theorem.

Definition 3.2. Let $H(x,\alpha) = h\,e^{-h^{-1}(\alpha,x)}\big(A^h e^{h^{-1}(\alpha,\cdot)}\big)(x)$, called the Hamiltonian:
\[
H(x,\alpha) = \sum_i b^i(x)\alpha_i + \frac{1}{2}\sum_{i,j}a^{i,j}(x)\alpha_i\alpha_j + \int_{\mathbb{R}^r}\big[e^{(\alpha,\beta)} - 1 - (\alpha,\beta)\big]\mu_x(d\beta).
\]
Then we denote by $L(x,\beta)$ the Lagrangian, defined as the Legendre transform of $H(x,\alpha)$:
\[
L(x,\beta) = \sup_{\alpha}\{(\alpha,\beta) - H(x,\alpha)\}.
\]
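Definition 3.3. For an $\mathbb{R}^r$-valued function $\varphi_t$, $T_1\le t\le T_2$, we define the action functional:
\[
S_{T_1T_2}(\varphi) =
\begin{cases}
\int_{T_1}^{T_2} L(\varphi_t,\dot\varphi_t)\,dt & \text{if } \varphi \text{ is absolutely continuous and the integral converges,}\\
+\infty & \text{otherwise.}
\end{cases}
\]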
Under some restrictions on $H$ and $L$, Freidlin and Wentzell prove a theorem (Theorem 2.1, p. 146 in [13]) that establishes a LDP for $X^h$:

Theorem 3.5 (Freidlin–Wentzell). Under the following assumptions:
1. There exists an everywhere finite non-negative convex function $\hat H(\alpha)$ such that $\hat H(0) = 0$ and $H(x,\alpha)\le\hat H(\alpha)$ for all $x, \alpha$.
2. The function $L(x,\beta)$ is finite for all values of the arguments; for any $R > 0$ there exist positive constants $M$ and $m$ such that $L(x,\beta)\le M$, $|\partial_\beta L(x,\beta)|\le M$, and $\sum_{i,j}\frac{\partial^2 L}{\partial\beta^i\partial\beta^j}(x,\beta)\,c_ic_j \ge m\sum_i c_i^2$ for all $x, c\in\mathbb{R}^r$ and all $\beta$, $|\beta| < R$.
3. $\Delta_L(\delta') = \sup_{|y-y'|<\delta'}\ \sup_{\beta}\ \dfrac{L(y',\beta) - L(y,\beta)}{1 + L(y,\beta)} \to 0$ as $\delta'\to 0$.
Then the process $(X^h_t)_{t\in[0,T]}$ satisfies a LDP with action functional $S_{0T}(\varphi)$ and speed $h^{-1}$ as $h\to 0$, uniformly in the initial point $x$.

Case 1: Diffusion process. For the Gaussian perturbation case, under the assumptions that the drift and diffusion parameters are bounded and uniformly continuous, and that the diffusion matrix is uniformly non-degenerate, we can apply the above theorem and we find, with $a_{i,j}$ the inverse matrix of the diffusion matrix $a^{i,j}$:
\[
S_{0,T}(\varphi) = \frac{1}{2}\int_0^T\sum_{i,j}a_{i,j}(\varphi_t)\big(\dot\varphi^i_t - b^i(\varphi_t)\big)\big(\dot\varphi^j_t - b^j(\varphi_t)\big)\,dt.
\]
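To make the diffusion action functional concrete, here is a hedged Python sketch for the scalar case, where the formula above reduces to $\frac{1}{2}\int_0^T(\dot\varphi_t - b(\varphi_t))^2/a(\varphi_t)\,dt$. The midpoint discretization, the function name `action_1d` and the two test paths are assumptions made for illustration only.

```python
import math


def action_1d(path, dt, drift, diffusion):
    """Midpoint-rule approximation of the scalar diffusion action.

    `path` is a list of values of phi sampled with spacing `dt`,
    `drift` is b and `diffusion` is a = sigma^2 (assumed positive).
    """
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        mid = 0.5 * (x0 + x1)
        velocity = (x1 - x0) / dt
        total += 0.5 * (velocity - drift(mid)) ** 2 / diffusion(mid) * dt
    return total


if __name__ == "__main__":
    n, horizon = 10000, 1.0
    dt = horizon / n
    ts = [i * dt for i in range(n + 1)]
    # Pure Brownian case (b = 0, a = 1): a straight line from 0 to 1 costs 1/2.
    print(action_1d(ts, dt, lambda x: 0.0, lambda x: 1.0))
    # A solution of phi' = b(phi) with b(x) = -x has (numerically) zero action.
    flow = [math.exp(-t) for t in ts]
    print(action_1d(flow, dt, lambda x: -x, lambda x: 1.0))
```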
It is still true in the case where the drift $b^h$ depends on $h$, provided that it converges uniformly to $b$ (Theorem 3.1, Chap. 5 in [13]). Note that if $\varphi_t$ is a solution of $\dot\varphi_t = b(\varphi_t)$, then $S_{0,T}(\varphi) = 0$, which is consistent with the limit $h\to 0$. The value of $S_{0,T}(\varphi)$ quantifies on a logarithmic scale the “probability” that $X^h$ follows a given trajectory $\varphi$ between 0 and $T$. A special case is where $X^h_t = \sqrt{h}\,W_t$. This case was considered before Freidlin–Wentzell by Schilder [18], and the action functional reads:
\[
S_{0,T}(\varphi) = \frac{1}{2}\int_0^T|\dot\varphi_t|^2\,dt.
\]
It is possible to use the Contraction Principle to derive a LDP for a wide class of SDEs from Schilder's LDP for the Brownian motion. Notice also that, by scaling, $\sqrt{h}\,W_t$ has the same law as $W_{ht}$, so that the LDP also provides information about the small-time behavior (see [1]) of the Brownian motion.

Case 2: Markov jump process. For $h > 0$, let $X^h_t$ be a Markov jump process with state space $E_h$ (all the points that are multiples of $h$), intensity $\lambda^h(x) = h^{-1}[r(x)+l(x)]$ and jump law $\mu^h(x, x+h) = r(x)/(r(x)+l(x))$ and $\mu^h(x, x-h) = l(x)/(r(x)+l(x))$, where $r$ and $l$ are two non-negative and bounded real functions. This means that the process jumps to $x+h$ with rate $h^{-1}r(x)$ and to $x-h$ with rate $h^{-1}l(x)$. Consider as an example a population of $N = 1/h$ individuals, each one jumping between states 0 and 1 with rates $A_{i,j}$, for $i,j\in\{0,1\}$, $i\ne j$. Then define $X^h(t)$, the proportion of individuals that are in state 1 at time $t$. In this case $r(x) = (1-x)A_{0,1}$ and $l(x) = xA_{1,0}$. Here,
\[
H(x,\alpha) = (e^{\alpha}-1)\,r(x) + (e^{-\alpha}-1)\,l(x),
\]
\[
L(x,\beta) = \beta\ln\left(\frac{\beta+\sqrt{\beta^2+4r(x)l(x)}}{2r(x)}\right) - \sqrt{\beta^2+4r(x)l(x)} + l(x) + r(x),
\]
and the conditions (1)–(3) of Theorem 3.5 are satisfied, so that the action functional for this process is given by $S(\varphi)$ with:
\[
S(\varphi) =
\begin{cases}
\int_0^T L(\varphi_t,\dot\varphi_t)\,dt & \text{if } \varphi \text{ is absolutely continuous and the integral converges,}\\
+\infty & \text{otherwise.}
\end{cases}
\]
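The closed form of the jump-process Lagrangian can be sanity-checked against a direct numerical Legendre transform of the Hamiltonian. The sketch below is illustrative: the values of $r$ and $l$, the search interval for $\alpha$ and all function names are assumptions, and $r, l > 0$ is required.

```python
import math


def hamiltonian(alpha, r, l):
    """Jump-process Hamiltonian (e^alpha - 1) r + (e^-alpha - 1) l."""
    return (math.exp(alpha) - 1) * r + (math.exp(-alpha) - 1) * l


def lagrangian(beta, r, l):
    """Closed-form Legendre transform of the Hamiltonian, for r, l > 0."""
    root = math.sqrt(beta * beta + 4 * r * l)
    return beta * math.log((beta + root) / (2 * r)) - root + l + r


def lagrangian_numeric(beta, r, l, lo=-30.0, hi=30.0, iters=200):
    """sup over alpha in [lo, hi] of alpha*beta - H(alpha), by ternary search.

    The objective is concave in alpha because H is convex in alpha.
    """
    def objective(a):
        return a * beta - hamiltonian(a, r, l)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


if __name__ == "__main__":
    r, l = 1.2, 0.4  # hypothetical values of r(x) and l(x) at some point x
    for beta in (-1.0, 0.3, r - l, 2.0):
        print(beta, lagrangian(beta, r, l), lagrangian_numeric(beta, r, l))
```

The Lagrangian vanishes exactly at $\beta = r - l$, that is, along the deterministic flow, and is positive elsewhere.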
One checks without difficulty that $L(\varphi_t,\dot\varphi_t) = 0$ when $\dot\varphi_t = r(\varphi_t) - l(\varphi_t)$, which corresponds to the deterministic limit $h\to 0$ coming from the LLN.

3.4.2.2 Quasipotential and Asymptotic Estimates for the Problem of Exit from a Domain

The purpose of this section is to show the consequences of a LDP for the asymptotic solution of exit problems, which arise very frequently in applications. For instance, in population models this is related to the question of extinction, and
for neuronal models to a threshold crossing responsible for spike generation. See, for instance, Chaps. 5 and 7. Let $D$ be a domain of $\mathbb{R}^r$ with a smooth boundary $\partial D$. Let us distinguish two different cases:
1. If the deterministic limit $x_t$ starting at a point $x\in D$ exits from $D$ in a finite time $T$, then the stochastic process will also leave $D$ in a finite time with probability 1 as $h\to 0$, at a point of the boundary that is close to the deterministic “exit point” $x_T$.
2. In the case where, for $x\in\partial D$, $(b(x), n(x)) < 0$ with $n$ being the exterior normal, $x_t$ does not leave $D$, but $X^h_t$ will leave $D$ with probability 1 as $h\to 0$.
To determine the exit time and the exit point, we will introduce the quasipotential, which is the infimum of the action functional for trajectories starting at $x\in D$ and ending on the boundary. Suppose we are in the second case, and suppose that 0 is an asymptotically stable equilibrium point, and $\forall x\in D$, $x_t(x)\to 0$ as $t\to\infty$ without leaving $D$ (we say $D$ is attracted by 0). In the case of gradient systems, the problem of exit from a domain is well studied, and the most famous result is of course Kramers' escape rate for a double-well potential. However, in the non-gradient case, a quantity called the quasipotential plays the role of a “probabilistic landscape,” with reference to the language of energy landscapes for potentials.

Definition 3.4. We define the quasipotential as:
\[
V(x,y) = \inf_{\varphi\ \text{abs. cont.},\ T>0}\{S_{0T}(\varphi);\ \varphi_0 = x,\ \varphi_T = y\}.
\]
Note that for gradient systems perturbed by additive white noise (constant diffusion coefficient), the quasipotential is just twice the actual potential. With this setting we have the following theorem:

Theorem 3.6 (Freidlin–Wentzell).
1. For the mean exit time $\tau^h := \inf\{t;\ X^h_t\notin D\}$: for all $x\in D$,
\[
\lim_{h\to 0}\lambda(h)^{-1}\ln E_x[\tau^h] = \inf_{y\in\partial D}V(0,y) = V_0.
\]
2. For the exit point: if there exists a unique $y_0\in\partial D$ such that $V(0,y_0) = \inf_{y\in\partial D}V(0,y)$, then, $\forall\delta > 0$, $\forall x\in D$,
\[
\lim_{h\to 0}P_x\left[|X^h_{\tau^h} - y_0| < \delta\right] = 1.
\]
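The remark that the quasipotential of a gradient system is twice the potential can be illustrated on the simplest example. The following Python sketch assumes the scalar drift $b(x) = -x$ (potential $x^2/2$) with unit diffusion coefficient, and evaluates the diffusion action along the time-reversed deterministic flow $\dot\varphi = +\varphi$, which is the standard candidate for the optimal escape path; starting the path at a small `eps` instead of 0 and the function names are choices made here.

```python
import math


def action_1d(path, dt, drift):
    """Midpoint-rule approximation of 1/2 * int (phi' - b(phi))^2 dt (a = 1)."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        mid = 0.5 * (x0 + x1)
        velocity = (x1 - x0) / dt
        total += 0.5 * (velocity - drift(mid)) ** 2 * dt
    return total


def reversed_flow_action(y, eps=1e-3, n=20000):
    """Action of the path solving phi' = +phi from eps to y, for b(x) = -x."""
    horizon = math.log(y / eps)  # time needed to go from eps to y
    dt = horizon / n
    path = [eps * math.exp(i * dt) for i in range(n + 1)]
    return action_1d(path, dt, lambda x: -x)


def straight_line_action(y, eps=1e-3, horizon=1.0, n=20000):
    """Action of a straight line from eps to y in time `horizon`, for b(x) = -x."""
    dt = horizon / n
    path = [eps + (y - eps) * i / n for i in range(n + 1)]
    return action_1d(path, dt, lambda x: -x)


if __name__ == "__main__":
    for y in (0.5, 1.0, 2.0):
        # Second column should be close to y^2, i.e. twice the potential y^2/2.
        print(y, reversed_flow_action(y), straight_line_action(y))
```

Along the reversed flow the action equals $y^2 - \mathrm{eps}^2$, which tends to $y^2$ as `eps` goes to 0, while other paths, such as the straight line, cost more.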
Our aim is now to consider a situation where one can obtain an analytical expression for the quasipotential, which is generally not possible. Numerical methods are presented in Chap. 4. We also want to illustrate with the following example the point we made in the introduction, namely that a LDP is a sharper result than the CLT, and that this difference can be dramatic when considering rare events. This example and further theoretical results can be found in [15].

Example 3.5. We recall the example stated above in Case 2. With $N = 1/h$, consider a population of $N$ individuals, each one jumping between states 0 and 1 with rates $A_{i,j}$, for $i,j\in\{0,1\}$, $i\ne j$. Then define $X^h(t)$, the proportion of individuals that are in state 1 at time $t$. Our aim is to compare the behavior of this jump process with a diffusion approximation obtained in the asymptotic regime of a large population size. First, we recall from [12] that, when $N\to\infty$, $X^h$ converges in probability on finite time intervals to the solution of a deterministic differential equation: $\dot x = (1-x)A_{0,1} - xA_{1,0}$. Moreover, it is possible to build a diffusion approximation, also called Langevin approximation, $\tilde X^h$ of the process $X^h$ as:
\[
d\tilde X^h(t) = \big[r(\tilde X^h(t)) - l(\tilde X^h(t))\big]dt + \sqrt{h}\,\sqrt{r(\tilde X^h(t)) + l(\tilde X^h(t))}\,dW_t,
\]
where $r(x) = (1-x)A_{0,1}$ and $l(x) = xA_{1,0}$. To compare $X^h$ and $\tilde X^h$, we will consider the problem of exit from a domain and apply Theorem 3.6. To this end, we need to compute the quasipotentials associated with $X^h$ and $\tilde X^h$. Obtaining the Hamiltonians $H_M$ and $H_L$ associated respectively with $X^h$ and $\tilde X^h$ is the first step towards this computation. Theorem 4.3, Chap. 5, p. 159 of [13] gives a way to compute the quasipotential: find a function $U$, vanishing at $x_0 := A_{0,1}/(A_{0,1}+A_{1,0})$, continuously differentiable and satisfying $H(x, U'(x)) = 0$ for $x\ne x_0$ and such that $U'(x)\ne 0$ for $x\ne x_0$, where:
• In the jump Markov case: $H_M(x,\alpha) = (e^{\alpha}-1)\,r(x) + (e^{-\alpha}-1)\,l(x)$
• In the Langevin approximation: $H_L(x,\alpha) = (r(x)-l(x))\,\alpha + \frac{1}{2}(r(x)+l(x))\,\alpha^2$
Here we note that $H_L$ is the second-order expansion of $H_M$ in $\alpha$. Actually, solving $H_L(x, U_L'(x)) = 0$ and $H_M(x, U_M'(x)) = 0$, we can find the quasipotentials $U_L$ and $U_M$ explicitly:
• In the jump Markov case:
\[
U_M(x) = \int_{x_0}^{x}\ln\big(l(u)/r(u)\big)\,du
\]
• In the Langevin approximation:
\[
U_L(x) = -2\int_{x_0}^{x}\frac{r(u)-l(u)}{r(u)+l(u)}\,du
\]
Then, consider the double barrier exit problem. Define the first passage times $\tau^h_a := \inf\{t\ge 0;\ X^h(t) < a\}$ and $\tau^h_b := \inf\{t\ge 0;\ X^h(t) > b\}$, with $0 < a < x_0 < b < 1$, and suppose $X^h(0) = x_0$, the stable equilibrium point of the deterministic equation. Then, from Theorem 3.6, the probability $P\left[\tau^h_a < \tau^h_b\right]$ of escaping first through $a$ tends to 1 if the value of the quasipotential at $a$ is strictly below its value at $b$, and tends to 0 otherwise. For some values of the parameters, the following situation arises: $U_M(a) < U_M(b)$ but $U_L(a) > U_L(b)$. This means that for small $h$, the original Markov jump process will almost always escape through $a$, whereas its diffusion approximation, derived from a CLT, will almost always escape through $b$, as shown in more detail in [15].
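The disagreement between the two quasipotentials can be reproduced numerically. The sketch below is only an illustration: the rates `A01`, `A10` and the barriers `a`, `b` are hypothetical values chosen here (they are not the parameters of [15]), the integrals are computed with a composite Simpson rule, and the function names are ours.

```python
import math

A01, A10 = 3.0, 1.0           # hypothetical transition rates
X0 = A01 / (A01 + A10)        # stable equilibrium of the deterministic limit


def r(x):
    return (1 - x) * A01


def l(x):
    return x * A10


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals; signed if hi < lo."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    s += 4 * sum(f(lo + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(lo + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3


def U_M(x):
    """Quasipotential of the Markov jump process."""
    return simpson(lambda u: math.log(l(u) / r(u)), X0, x)


def U_L(x):
    """Quasipotential of the Langevin (diffusion) approximation."""
    return simpson(lambda u: -2 * (r(u) - l(u)) / (r(u) + l(u)), X0, x)


if __name__ == "__main__":
    a, b = 0.503, 0.95        # hypothetical barriers with a < X0 < b
    print("jump process:", U_M(a), U_M(b))
    print("Langevin    :", U_L(a), U_L(b))
```

With these values the jump process favors the exit through `a`, while the Langevin approximation favors the exit through `b`, even though both quasipotentials agree to second order near `X0`.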
3.5 Conclusion

In this brief chapter, we have presented some of the key results of large deviations theory, from Cramér's Theorem for sums of independent random variables to the Freidlin–Wentzell theory of small random perturbations of dynamical systems. We have chosen to give a synthetic presentation without proofs, in order to provide a concise overview of this theory, which has many ramifications. The results presented here are only a fraction of the available results, and many sharpenings and generalizations exist. We strongly encourage the reader to refer to the mathematical textbooks [7–9, 20], to [13] for the small noise problems, and to [11, 19] for the relationship with statistical physics and entropy. This relationship seems to be one of the key paths towards an application of large deviations theory in biology. For instance, statistical physics techniques have been introduced successfully in the last decade to study complex biological networks, such as neuronal networks. Moreover, the issues raised by small random perturbations are of course of great interest for the study of many biological processes, especially when rare events may be amplified by feedback loops and non-linearities. Large deviations tools are a first step towards the analysis of such events, and can also help in designing efficient simulation techniques, as discussed in Chap. 4 of the present volume.
References

1. Azencott, R.: Grandes déviations et applications. In: École d'Été de Probabilités de Saint-Flour VIII-1978. Lecture Notes in Math., vol. 774, pp. 1–176. Springer, New York (1980)
2. Baldi, P.: Large deviations and stochastic homogenization. Ann. Mat. Pura Appl. 151, 161–177 (1988)
3. Bolthausen, E.: Laplace approximations for sums of independent random vectors. Probab. Theor. Relat. Field 71(2), 167–206 (1987)
4. Boltzmann, L.: Über die Beziehung zwischen dem zweiten Hauptsatze der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung respektive den Sätzen über das Wärmegleichgewicht (On the relationship between the second law of the mechanical theory of heat and the probability calculus). Wiener Berichte 2(76), 373–435 (1877)
5. Bryc, W.: A remark on the connection between the large deviation principle and the central limit theorem. Stat. Probab. Lett. 18(4), 253–256 (1993)
6. Cramér, H.: Sur un nouveau théorème limite dans la théorie des probabilités. Colloque consacré à la théorie des probabilités, Hermann. 3, 2–29 (1938)
7. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer, New York (1998)
8. den Hollander, F.: Large Deviations. Fields Institute Monograph. American Mathematical Society, Providence (2000)
9. Deuschel, J.D., Stroock, D.W.: Large Deviations. Academic, New York (1989)
10. Ellis, R.S.: Large deviations for a general class of random vectors. Ann. Probab. 12, 1–12 (1984)
11. Ellis, R.S.: Entropy, Large Deviations, and Statistical Mechanics. Springer, New York (1985)
12. Ethier, S.N., Kurtz, T.G.: Markov Processes. Wiley, New York (1986)
13. Freidlin, M., Wentzell, A.D.: Random Perturbations of Dynamical Systems, 2nd edn. Springer, New York (1998)
14. Gärtner, J.: On large deviations from the invariant measure. Theor. Probab. Appl. 22, 24–39 (1977)
15. Pakdaman, K., Thieullen, M., Wainrib, G.: Diffusion approximation of birth-death processes: comparison in terms of large deviations and exit point. Stat. Probab. Lett. 80(13–14), 1121–1127 (2010)
16. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1997)
17. Sanov, I.N.: On the probability of large deviations of random variables. Selected translations in Math. Stat. Probab. I 213–244 (1961)
18. Schilder, M.: Some asymptotic formulae for Wiener integrals. Trans. Am. Math. Soc. 125, 63–85 (1966)
19. Touchette, H.: The large deviations approach to statistical mechanics. Phys. Rep. 478, 1–69 (2009)
20. Varadhan, S.R.S.: Large Deviations and Applications. SIAM, Philadelphia (1984)
21. Varadhan, S.R.S.: Lectures on hydrodynamic scaling. In: Hydrodynamic Limits and Related Topics. Fields Institute Communication, vol. 27, pp. 3–42. American Mathematical Society, Providence (2009)
Chapter 4
Some Numerical Methods for Rare Events Simulation and Analysis

Gilles Wainrib
Abstract We present several numerical approaches to investigate rare events in stochastic systems, with a specific focus on applications to biological models. We first present several aspects of variance reduction for Monte-Carlo methods, with a focus on importance sampling, and show how these methods can be applied to basic biological models. We show that these techniques can be useful in dealing with multiscale continuous-time Markov chains, which are important in the context of biochemical reaction networks. We treat in more detail the problem of the first passage time for a linear diffusion process, arising from the integrate-and-fire neuron model, to show the kind of mathematical problems that may arise when trying to design an optimal importance sampling scheme. This leads to the observation that large deviation theory can be helpful to solve these questions. We also review a numerical method designed to estimate large deviations quantities such as the quasipotential and the optimal path, and apply this method to estimate numerically the quasipotential of the Morris–Lecar neuron model perturbed by small noise.
4.1 Introduction

A rare event happens with a very small probability, where “small” obviously depends on the context. Studying such events may seem useless to some, since their occurrence is highly improbable. However, as their consequences may
G. Wainrib Laboratoire Analyse G´eom´etrie et Applications, Institut Galil´ee, Universit´e Paris 13, Villetaneuse, France e-mail: [email protected] M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3 4, © Springer-Verlag Berlin Heidelberg 2013
be huge compared to those of most other, frequently occurring events, one may argue on the contrary that their study is particularly important. Think for instance of natural disasters, nuclear accidents, financial crashes, and so on. Estimating the probability of such an event is very difficult in practice and requires advanced theoretical and simulation tools. The growing fields of application range from industrial and natural risk to finance, and from complex systems reliability to chemistry and biology. For instance, in biology, some spontaneous phenomena, even though they are rare, may be enhanced by feedback loops and non-linearities: a specific mutation that gives rise to a very contagious virus is a rare event that enables the virus to propagate and reproduce.

From a statistical perspective, rare events can be studied within three main frameworks. The first situation is the most usual: the quantity of interest is the mean of many random variables, and the problem is to estimate deviations from the mean. When the central limit theorem (CLT) holds, it partly provides an answer with the Gaussian distribution. The point is that the CLT does not describe fluctuations that are too far away from the mean, and more sophisticated tools are required. The second situation is similar, except that one no longer assumes that the CLT is valid. This is the case if the random variables are heavy-tailed (e.g. power law); another class of universal laws then appears, namely the $\alpha$-stable laws introduced by Lévy, and the Gaussian law is no longer appropriate. Both situations deal with the behavior of the mean. However, it may be more interesting in some contexts to study, for instance, the maximum of the random variables. To this end, the theory of extreme values has been developed, leading to a third class of universal laws encompassing the Fréchet, Gumbel and Weibull distributions.
In this chapter, we are not going to take the statistical point of view. We assume instead that we start from a model of the system of interest and that we want to analyze rare events that may occur in the system. Our purpose is to introduce the reader to some classical methods for the numerical estimation of quantities related to these events, once a model has been specified. The difficulty of simulating such events with a naive Monte-Carlo (MC) approach stems from the fact that one has to run many simulations before a single success happens. To obtain statistically significant results with a low variance, the number of runs becomes prohibitive. Smart variance reduction techniques for MC simulation have been widely used in many areas, and we present some of them here in the context of rare event simulation, with a slight view towards biological models, which are a rather recent field of application. The difficulty in practice is to find the optimal variance reduction scheme for a given problem. This practical difficulty can be addressed with heuristic techniques, but one can also benefit from theoretical results, for instance from large deviations theory. Therefore, as well as showing how theory may help the design of efficient rare event simulation, we will present some numerical methods to analyze rare events through the lens of large deviations theory. These are not stochastic simulation methods but deterministic algorithms that solve optimization problems, for instance to find the most probable trajectory that leads to a given
rare event. Combining both approaches should provide a rather efficient toolbox to deal with many situations coming from various fields of application. The chapter is organized as follows. In Sect. 4.2, we briefly recall Monte-Carlo basics and present an overview of the main techniques for rare event simulation. We then focus on one method, importance sampling (IS), showing its application to continuous-time Markov chains and diffusion processes, which are widely used in biological models. In Sect. 4.3, we introduce minimal concepts of large deviation theory and present in some detail a recent algorithm [13] designed to compute optimal paths and quasipotentials, with an application to a stochastic neuron model.
4.2 Monte-Carlo Simulation Methods

Before introducing improved MC techniques designed for rare event simulation, we briefly recall several useful general results on MC simulation. We refer the reader to [15] for a presentation on the topic. Consider a real random variable $X$ and suppose we are interested in estimating the probability $P[X \in A] = p_A$. To do so, we generate $N \in \mathbb{N}$ independent and identically distributed samples $(X_i)$ and build the unbiased estimator
\[ \hat{p}_A^{(N)} := \frac{1}{N} \sum_{i=1}^{N} 1_{X_i \in A}. \tag{4.1} \]
By the law of large numbers, $\hat{p}_A^{(N)}$ converges to $p_A$. The quality of this estimator depends essentially upon its empirical variance, estimated by:
\[ \left(\hat{\sigma}_A^{(N)}\right)^2 := \frac{1}{N-1} \sum_{i=1}^{N} \left( 1_{X_i \in A} - \hat{p}_A^{(N)} \right)^2. \tag{4.2} \]
The CLT and well-known asymptotic results [4] in the statistical theory of estimators ($\chi$-square law, Student law) provide a way to evaluate a confidence interval for such an estimation. For instance, if $p_A = 10^{-9}$, then the number $N$ required to have a relative error of 10% with a confidence level $\alpha = 5\%$ is a prohibitive $N \simeq 3 \times 10^{11}$. Therefore, variance reduction methods are necessary to tackle this issue.
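The estimators (4.1) and (4.2), and the order of magnitude of the required number of runs, can be illustrated with a minimal sketch. The function names `naive_mc` and `runs_needed` are hypothetical, and the sample-size rule is the usual CLT approximation with Gaussian quantile $z = 1.96$ (an assumption corresponding to a 5% risk level), not a formula taken from [4].

```python
import random


def naive_mc(sample, in_A, n, rng):
    """Naive MC estimator (4.1) together with the empirical variance (4.2).

    sample(rng) draws one realization of X, in_A(x) tests whether x is in A.
    """
    hits = sum(1 for _ in range(n) if in_A(sample(rng)))
    p_hat = hits / n
    # For 0/1 indicators, (4.2) simplifies to n/(n-1) * p_hat * (1 - p_hat).
    var_hat = n / (n - 1) * p_hat * (1.0 - p_hat)
    return p_hat, var_hat


def runs_needed(p, rel_err, z=1.96):
    """CLT-based number of runs for which the confidence half-width
    z * sqrt(p(1-p)/N) equals rel_err * p (assumed rule of thumb)."""
    return z * z * (1.0 - p) / (rel_err * rel_err * p)
```

For example, `runs_needed(1e-9, 0.1)` is of the order of a few $10^{11}$, in line with the figure quoted above.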
4.2.1 Overview of the Different Approaches

In order to estimate rare events, several MC methods have been introduced. Our aim is not to give a comprehensive and detailed account of all these methods and their ramifications. We refer to the recent book [20] for more details. Instead, our purpose
is first to give an overview of the ideas behind the main approaches, and then to focus on the most used method, IS, showing through original examples its potential application in computational biology.

4.2.1.1 Importance Sampling

In practice, a naive MC estimation of a rare event probability consists in simulating $N$ realizations of iid random variables. Almost all of them are unsuccessful. Then, a classic average of the results is made, as in (4.1). Instead, the idea of IS is to change the randomness in order to make the rare event happen more often, and then to associate a weight to each realization accordingly. The classic average is thus replaced by a weighted average. Let us show how the concepts of “change of randomness” and “weight” appear in a simple example. Consider a random variable $X$ as before, and assume it has a density $f(x)$. Suppose we are again interested in evaluating $p_A = P[X \in A]$. With $g(x)$ a strictly positive function such that $\int g(x)\,dx = 1$, we have:
\[ p_A = E[1_{X \in A}] = \int_{\mathbb{R}} 1_{x \in A}\, f(x)\,dx \tag{4.3} \]
\[ \phantom{p_A} = \int_{\mathbb{R}} 1_{x \in A}\, \frac{f(x)}{g(x)}\, g(x)\,dx. \tag{4.4} \]
Then, if $\tilde{X}$ is a random variable with density $g(x)$ and if $L(x) := f(x)/g(x)$, then:
\[ p_A = E\left[ 1_{\tilde{X} \in A}\, L(\tilde{X}) \right]. \tag{4.5} \]
Saying that $A$ is a rare event amounts to saying that $f(x)$ is very small for $x \in A$. The idea is thus to choose $g$ such that $g(x)$ is not small for $x \in A$, in order for the event $\{\tilde{X} \in A\}$ not to be rare. An IS estimator of $p_A$ is then:
\[ \hat{p}_A^{IS,(N)} := \frac{1}{N} \sum_{i=1}^{N} 1_{\tilde{X}_i \in A}\, L(\tilde{X}_i). \tag{4.6} \]
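As an illustration of (4.6), here is a minimal sketch for a case chosen for this note (it is not an example from the chapter): $X$ is standard Gaussian, $A = (a, \infty)$, and the sampling density $g$ is assumed to be the Gaussian density centered at $a$ with unit variance. The likelihood ratio is then $L(x) = f(x)/g(x) = \exp(-ax + a^2/2)$. The function name `is_gaussian_tail` is hypothetical.

```python
import math
import random


def is_gaussian_tail(a, n, rng):
    """IS estimate of p_A = P[X > a] for X ~ N(0, 1), sampling from g = N(a, 1).

    Returns the estimator (4.6) and the empirical per-sample variance.
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        x = rng.gauss(a, 1.0)  # realization of X-tilde, drawn from g
        # Weighted indicator 1_{x in A} * L(x), with L(x) = exp(-a*x + a^2/2).
        w = math.exp(-a * x + 0.5 * a * a) if x > a else 0.0
        total += w
        total_sq += w * w
    mean = total / n
    var = (total_sq - n * mean * mean) / (n - 1)
    return mean, var
```

With $a = 4$ the exact value is $\frac{1}{2}\,\mathrm{erfc}(4/\sqrt{2}) \approx 3.2 \times 10^{-5}$, so a naive MC run with a few $10^4$ samples would typically see at most a handful of successes, whereas about half of the shifted samples fall in $A$.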
Introducing $g(x)$ we have made a change of probability and each realization of $\tilde{X}_i$ must be weighted by a likelihood ratio $L(\tilde{X}_i)$. The question is now how to choose $g$ to minimize the variance of the estimator $\hat{p}_A^{IS,(N)}$. We will discuss this question in more detail on an example in Sect. 4.2.2.

4.2.1.2 Importance Splitting

To introduce the ideas of importance splitting, we consider a stochastic process $X(t)$ on a space $E$ and we wish to estimate the probability that $X(t)$ enters a set $A \subset E$ before going back to its initial value $x_0$. This probability $p$ may
be very small if $A$ is not frequently visited. With a naive MC, one would keep only the trajectories entering $A$ and throw away all the other ones, although they contain some information on the process. The basic idea behind the importance splitting technique is to take advantage of those unsuccessful trajectories that were close to entering $A$. More precisely, considering $A_1 \supset A_2 \supset \dots \supset A_d = A$, make $n_0$ simulations starting at $x_0$ and when $X$ enters $A_1$ at a point $x_1$, launch a number $n_1$ of simulations from $x_1$. Then amongst these $n_1$ trajectories, some of them will reach $A_2$ at some points and be split into $n_2$ copies, etc., until reaching $A_d$ from $A_{d-1}$. Then, a number $N = n_0 n_1 \cdots n_{d-1}$ of trajectories have been simulated, and if $N_A$ denotes the number of trajectories that have reached $A_d = A$, an unbiased estimator for $p$ is thus:
\[ \tilde{p} = \frac{N_A}{N}. \tag{4.7} \]
In the case where the evolutions of the paths at each level are iid, an expression for the variance of $\tilde{p}$ is given by [20]:
\[ \mathrm{Var}(\tilde{p}) = p^2 \sum_{k=1}^{d} \frac{1 - p_k}{p_1 n_0 \cdots p_k n_{k-1}} \tag{4.8} \]
where $p_k$ is the probability of reaching $A_k$ before $x_0$, knowing $X$ has entered $A_{k-1}$. The efficiency of a splitting technique should not only be evaluated with the variance, but also with the total number $N$ of simulations. The question is how to choose the numbers $n_i$, and an answer from (4.8) is to choose them such that $p_i n_{i-1} = 1$: intuitively, when $p_i$ is small, one needs to launch more trajectories $n_{i-1}$, or equivalently, with $n_i$ fixed, one should increase the number of levels, thus increasing $p_i$. With this choice, (4.8) reduces to $p^2 \sum_{k=1}^{d} (1 - p_k)$; taking equal level probabilities $p_i = p^{1/d}$ then gives $\mathrm{Var}(\tilde{p}) = p^2\, d\,(1 - p^{1/d})$, so that $\lim_{d \to \infty} \mathrm{Var}(\tilde{p})/p^2 = \ln(1/p)$.
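The variance formula (4.8) is easy to evaluate numerically. The sketch below is only an illustration: the function names are hypothetical, the number of runs per level is allowed to be non-integer for simplicity, and the second function assumes the balanced choice $p_i n_{i-1} = 1$ with equal level probabilities $p_i = p^{1/d}$.

```python
import math


def splitting_variance(p_levels, n_runs):
    """Evaluate (4.8) for level probabilities p_levels = [p_1, ..., p_d]
    and numbers of runs n_runs = [n_0, ..., n_{d-1}]."""
    p = math.prod(p_levels)
    total = 0.0
    denom = 1.0
    for p_k, n_prev in zip(p_levels, n_runs):
        denom *= p_k * n_prev  # p_1 n_0 ... p_k n_{k-1}
        total += (1.0 - p_k) / denom
    return p * p * total


def balanced_relative_variance(p, d):
    """Var / p^2 when p_i = p^(1/d) and p_i * n_{i-1} = 1, i.e. d * (1 - p^(1/d))."""
    # expm1 avoids cancellation when p^(1/d) is very close to 1.
    return -d * math.expm1(math.log(p) / d)
```

With a single level, `splitting_variance([p], [N])` reduces to the binomial variance $p(1-p)/N$ of the naive estimator, and `balanced_relative_variance(p, d)` approaches $\ln(1/p)$ as $d$ grows.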
4.2.1.3 Other Methods and Refinements

Of course, there exist other variance reduction methods, and we refer to [15] for more details. However, for rare event simulation, sampling and splitting techniques appear as the main methods. Most of the research is focused on the optimal design of a sampling or splitting strategy. A typical example is the cross-entropy method [21], which aims at finding the optimal change of probability by minimizing the distance (relative entropy) to the zero variance change of probability. As we will see below, large deviation theory is also a source of inspiration to build efficient sampling schemes, as well as splitting strategies as recently demonstrated [6]. Moreover, an important point concerning rare event simulation is that the method should be adapted to the specific question at hand. Thus, new applications raise new theoretical and practical issues, such as dealing with heavy-tailed processes coming from Internet and telecommunication network modelling. We refer to [20] for a recent account of these questions.
4.2.2 Focus on Importance Sampling

In this section, we are going to give more details on IS in order to see how the setting of Sect. 4.2.1.1 can be extended to two types of stochastic processes widely used in biological models. First we will focus on continuous-time Markov chains (CTMC), which are building blocks of models for biochemical reaction networks, genetic regulation, or ion channel kinetics, for instance. Then, the interest will be on diffusion processes, and as an illustration we will discuss more specifically the first passage time problem for the leaky integrate-and-fire neuron model, which is similar to an Ornstein–Uhlenbeck process (see also Chap. 5). IS research is mainly concerned with the choice of the change of probability and our aim is not to go into all the subtleties of this topic, but instead to explain how the idea of IS applies in important examples. Only while discussing IS for diffusion processes will we try to show how the theoretical results from large deviation theory help in designing a good IS change of probability.

4.2.2.1 IS for Continuous-Time Markov Chains

Consider a finite state space $E$ and transition rates $\alpha_{i,j} > 0$ for $i, j \in E$. These define a CTMC, or Markov jump process, which jumps from state $i$ to state $j$ with rate $\alpha_{i,j}$. Let us take an example of a rare-event issue for these models. If some transition rates are much smaller, say of order $O(\varepsilon)$, than the other ones, of order $O(1)$, with $\varepsilon \ll 1$, then some transitions can be considered as rare events. This is the case in many biological systems which operate on several timescales. Methods have been suggested to reduce the dimension of such models (e.g. [19]). Here our purpose is to show how IS may be applied to improve MC simulation for these systems.

The main tool is the change of probability: instead of simulating $X$ defined by its rates $\alpha = \{a_{i,j}$ for $i, j \in E_{fast}$ and $\varepsilon a_{i,j}$ for $i, j \in E_{slow}\}$, we simulate another process with rates $\tilde{\alpha}$, and weight the result by a likelihood function, also called the Radon–Nikodym derivative, defined by:
\[ L_T(X) = \prod_{i:\, \tau_i \le T} \frac{\alpha_{X_i, X_{i+1}} \exp\left( -\sum_{x \in E} \alpha_{X_i, x}\, (\tau_{i+1} - \tau_i) \right)}{\tilde{\alpha}_{X_i, X_{i+1}} \exp\left( -\sum_{x \in E} \tilde{\alpha}_{X_i, x}\, (\tau_{i+1} - \tau_i) \right)}, \tag{4.9} \]
where a trajectory $X$ is represented by the visited states $X_j$ and the jump times $\tau_j$, so that $\tau_{j+1} - \tau_j$ is the time spent in state $X_j$. Then assume one wants to estimate $\mu := E[F_T(X)]$, where $F_T$ is a functional of the trajectory of $X$ on $[0, T]$. Instead of simulating $N$ copies of $X$ and computing the naive MC estimator:
\[ \hat{\mu}_N = \frac{1}{N} \sum_{i=1}^{N} F_T(X_i), \tag{4.10} \]
Fig. 4.1 Toy model of multiscale slow–fast continuous-time Markov chain. Horizontal transitions are fast compared to vertical ones
one rather simulates $N$ copies of $\tilde{X}$ and computes:
\[ \hat{\mu}_N = \frac{1}{N} \sum_{i=1}^{N} F_T(\tilde{X}_i)\, L_T(\tilde{X}_i). \tag{4.11} \]
Again, the ultimate question is how to choose the matrix $\tilde{\alpha}$. For a more detailed discussion of this question, see the chapter by W. Sandmann in [20]. Let us examine this issue heuristically on an elementary toy model, which may arise for instance in ion channel gating modeling (see Chap. 6). The state space $E$ is composed of four states, $E = \{A, B, C, D\}$. The state $A$ may correspond to the ion channel open state. Assume transitions between states $A, B$ and states $C, D$ are fast, say $\alpha_{A,B} = \alpha_{B,A} = \alpha_{C,D} = \alpha_{D,C} = 1$, whereas the other transitions are slow, say $\alpha_{A,C} = \alpha_{C,A} = \alpha_{B,D} = \alpha_{D,B} = \varepsilon \ll 1$ (Fig. 4.1). Suppose $X(0) = D$ and we want to estimate the fraction of time spent in the open state $A$ between $0$ and $T$:
\[ \mu_A := E\left[ \frac{1}{T} \int_0^T 1_A(X(s))\,ds \right]. \tag{4.12} \]
Then, as most of the trajectories will stay in states $C$ and $D$ between $0$ and $T$, a naive MC is not well suited. In order to make the event “$X_i(s) \in A$” more frequent, we suggest introducing another process $\tilde{X}$, with all the transition rates equal to 1. Then an IS estimator for $\mu_A$ will be:
\[ \hat{\mu}_A^{IS} := \frac{1}{N} \sum_{i=1}^{N} \left[ \frac{1}{T} \int_0^T 1_A(\tilde{X}_i(s))\,ds \right] L_T(\tilde{X}_i) \tag{4.13} \]
with $L_T(\tilde{X}_i)$ given by Eq. (4.9) where rates $\tilde{\alpha}$ are all equal to 1 and rates $\alpha$ are equal to either 1 or $\varepsilon$ depending on the transition. We show a numerical comparison between naive MC and IS in Table 4.1. The number of simulations is $N = 10^5$ for both methods. When $\varepsilon$ is larger than $1/N = 10^{-5}$, both methods give similar averages, but IS has a lower variance. What is striking is what happens when $\varepsilon$ is smaller than $1/N$: the naive MC returns 0,
Table 4.1 Comparison between naive MC and IS for the estimation of the time spent in the state $A$ between $t = 0$ and $t = 1$, with initial state $D$. For both methods, the number $N$ of simulation runs is $10^5$

$\varepsilon$    Mean naive MC    Variance naive MC    Mean IS MC    Variance IS MC
$10^{-1}$        1.631E-02        7.526E-03            1.609E-02     4.588E-03
$10^{-3}$        1.792E-04        7.770E-05            1.743E-04     1.098E-06
$10^{-4}$        1.598E-05        9.523E-06            1.765E-05     5.014E-08
$10^{-5}$        0.000            0.000                1.790E-06     9.574E-11
$10^{-7}$        0.000            0.000                1.749E-08     1.062E-14
since all the $10^5$ simulated trajectories have stayed in states $\{C, D\}$, whereas the IS method can provide a relevant estimation. To our knowledge, an optimal change of measure for a general CTMC with slow and fast transitions has not been fully investigated yet.
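The toy experiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for Table 4.1: the function names are hypothetical, and the weight uses the fact that, for this particular choice of rates, the likelihood ratio of original over simulated dynamics reduces to $\varepsilon^{k} e^{(1-\varepsilon)T}$, where $k$ is the number of slow jumps (each slow jump contributes $\varepsilon/1$, and the sojourn factors contribute $e^{-(1+\varepsilon)\Delta}/e^{-2\Delta}$ over a total duration $T$).

```python
import math
import random

# Toy model of Fig. 4.1: A<->B and C<->D are fast (rate 1),
# A<->C and B<->D are slow (rate eps).
FAST_NEIGHBOUR = {"A": "B", "B": "A", "C": "D", "D": "C"}
SLOW_NEIGHBOUR = {"A": "C", "C": "A", "B": "D", "D": "B"}


def weighted_time_in_A(eps, T, rng):
    """Simulate the modified chain (all rates 1) on [0, T] from state D and
    return (fraction of time in A) * likelihood ratio."""
    state, t = "D", 0.0
    time_in_A = 0.0
    n_slow = 0
    while True:
        sojourn = rng.expovariate(2.0)  # total exit rate 1 + 1 under the new rates
        if state == "A":
            time_in_A += min(sojourn, T - t)
        t += sojourn
        if t >= T:
            break
        if rng.random() < 0.5:  # both neighbours are equally likely
            state = FAST_NEIGHBOUR[state]
        else:
            state = SLOW_NEIGHBOUR[state]
            n_slow += 1
    weight = eps ** n_slow * math.exp((1.0 - eps) * T)
    return (time_in_A / T) * weight


def is_estimate(eps, T, n, rng):
    """IS estimator (4.13) and empirical per-sample variance."""
    samples = [weighted_time_in_A(eps, T, rng) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, var
```

A call such as `is_estimate(1e-7, 1.0, 20000, random.Random(0))` returns a non-zero estimate even though $\varepsilon$ is far below $1/N$, which is the regime where the naive estimator returns 0.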
4.2.2.2 Reminder of Large Deviations Theory

Before discussing an application of IS to diffusion processes in Sect. 4.2.2.3 and other numerical methods in Sect. 4.3, we present here a brief reminder of the definitions and properties of large deviations theory that are needed for our purposes. Indeed, this theory is focused on estimating exponentially small probabilities, such as the probability of having only heads in a coin toss game. We refer to Chap. 3 for an introduction to this topic and to [7, 8] for a more detailed account of the main results. Let us start with the main definitions. Consider a sequence of probability spaces $(\Omega_\varepsilon, \mathcal{F}_\varepsilon, \mathbb{P}_\varepsilon)_{\varepsilon > 0}$ and a sequence of random variables $(X^\varepsilon)_{\varepsilon > 0}$, taking values in $S$, a complete separable metric space. To the sequence $(X^\varepsilon)$ is associated a sequence of laws $(P^\varepsilon)$, defined by $P^\varepsilon(C) = \mathbb{P}_\varepsilon(X^\varepsilon \in C)$. For instance, in the case of stochastic processes, $S$ is a function space, such as $C([0, T])$, the space of continuous functions on $[0, T]$, and $P^\varepsilon$ is the law of the process $X^\varepsilon$. Let $(a_\varepsilon)$ be such that $\lim_{\varepsilon \to 0} a_\varepsilon = 0$. For instance, we will consider here the following family of stochastic differential equations with $\varepsilon > 0$ a small parameter:
\[ dX^\varepsilon(t) = f(X^\varepsilon(t))\,dt + \varepsilon\,dW_t. \tag{4.14} \]
Consider Definition 3.1 in Chap. 3 of the large deviation principle (LDP), with $1/a_n = a_\varepsilon$. A rough interpretation of this definition is that the probability of an event $E \subset S$ is of order $C(\varepsilon)\, e^{-a_\varepsilon^{-1} I(E)}$ when $\varepsilon \to 0$, where the estimation of the prefactor $C(\varepsilon)$ requires more analysis.

LDP for diffusion processes and Freidlin–Wentzell theory. In [10], large deviations theory is developed for small random perturbations of dynamical systems. A particular case concerns (4.14), and we summarize here the main results. More details can be found in Chap. 3.
To the generator of the Markov process $X^\varepsilon$, one associates a quantity called the Hamiltonian $H$, which may be considered as an analog of the Laplace transform (see below). By a Legendre transform, one constructs an associated Lagrangian $L$, and defines the action functional on the space of trajectories in $[0, T]$. The generator $A^\varepsilon$ of $X^\varepsilon$ is given by:
\[ A^\varepsilon g(x) := f(x)\, g'(x) + \frac{\varepsilon^2}{2}\, g''(x). \tag{4.15} \]
Then, one gets from Definition 3.2 in Chap. 3 with $\varepsilon = h$ and the generator (4.15) the Hamiltonian:
\[ H(x, \alpha) = f(x)\,\alpha + \frac{1}{2} |\alpha|^2 \tag{4.16} \]
and the Lagrangian:
\[ L(x, \beta) = \sup_{\alpha} \{ (\alpha, \beta) - H(x, \alpha) \}. \tag{4.17} \]
With the above, and recalling Definition 3.3 in Chap. 3 of the action functional, we have:

Theorem 4.1 ([10]). Under the assumptions that the drift is bounded and uniformly continuous, $(X^\varepsilon)_{\varepsilon > 0}$ satisfies a LDP of speed $\varepsilon^2$ with good rate function $S_{0,T}$. In the case of (4.14), one finds:
\[ S_{0,T}(\phi) = \frac{1}{2} \int_0^T |\dot{\phi}(s) - f(\phi(s))|^2\,ds. \tag{4.18} \]
Note that if $\phi_t$ is a solution of $\dot{\phi}_t = f(\phi_t)$, then $S_{0,T}(\phi) = 0$, which is consistent with the limit $\varepsilon \to 0$. The value of $S_{0,T}(\phi)$ quantifies on a logarithmic scale the “probability” that $X^\varepsilon$ follows a given trajectory $\phi$ between $0$ and $T$. An interesting consequence of Theorem 4.1 is to provide estimates for quantities related to the exit problem. For instance, if $\dot{x} = f(x)$ admits a unique stable equilibrium point $x^*$, one can consider the first exit time from a domain $D$ containing $x^*$: $\tau^\varepsilon := \inf\{t;\ X^\varepsilon(t) \notin D\}$. For small $\varepsilon$, one expects this time to be very large. In [10], it is proven that the mean time is exponentially large in $1/\varepsilon^2$ in the sense that:
\[ \lim_{\varepsilon \to 0} \varepsilon^2 \ln E[\tau^\varepsilon] = \inf_{y \in \partial D} V(0, y) = V_0 \tag{4.19} \]
where we have introduced the quasipotential
\[ V(x, y) := \inf_{T \ge 0}\ \inf_{\phi:\ \phi(0) = x,\ \phi(T) = y} S_{0,T}(\phi). \tag{4.20} \]
We refer to Sect. 4.3 for more details about the quasipotential and its numerical evaluation.
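To make the action functional (4.18) concrete, here is a minimal sketch for a scalar path sampled on a uniform time grid, using a midpoint discretization. The function name `action` and the discretization are choices made for this note. For the linear drift $f(x) = -\beta x$, a relaxing solution $\phi(s) = x_0 e^{-\beta s}$ has (almost) zero action, whereas the time-reversed path $\phi(s) = y\, e^{-\beta (T - s)}$, which climbs from near $0$ to $y$, has action $\beta y^2 (1 - e^{-2\beta T})$, close to $\beta y^2$ for large $T$, consistent with the exponent appearing in (4.24).

```python
import math


def action(path, drift, T):
    """Midpoint discretization of the action (4.18) for a scalar path
    given by its values on a uniform grid of [0, T]."""
    n = len(path) - 1
    dt = T / n
    total = 0.0
    for a, b in zip(path[:-1], path[1:]):
        velocity = (b - a) / dt
        total += (velocity - drift(0.5 * (a + b))) ** 2 * dt
    return 0.5 * total
```

For instance, with `beta = 1`, `T = 10` and `n = 20000` grid steps, `action([math.exp(-(T - k * T / n)) for k in range(n + 1)], lambda x: -x, T)` is close to 1.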
Laplace Principle and Varadhan’s Lemma. We conclude this brief reminder by presenting an important tool in the study of the large fluctuations of a random variable $Y$. A key point is to understand its high-order moments, which are well captured by the Laplace transform of the law of $Y$: $E[e^{\lambda Y}] = \int e^{\lambda y} \rho(y)\,dy$, assuming $Y$ admits a density $\rho$. The Laplace Principle is a general result which makes it possible to approximate integrals of the form $\int_A \exp(\lambda h(x))\,dx$ by $\exp\left(\lambda \sup_{x \in A} h(x)\right)$ for large $\lambda$, in the sense that:
\[ \lim_{\lambda \to \infty} \frac{1}{\lambda} \ln \int_A \exp(\lambda h(x))\,dx = \sup_{x \in A} h(x). \]
This principle can be extended to infinite dimensions, enabling the asymptotic study of functionals of $X^\varepsilon$. Let $F : S \to \mathbb{R}$ be a smooth functional.

Theorem 4.2 (Varadhan’s Lemma). If $(X^\varepsilon)_{\varepsilon > 0}$ satisfies a LDP of speed $a_\varepsilon$ and rate function $I$, then:
\[ \Lambda(F) = \lim_{\varepsilon \to 0} a_\varepsilon \ln E\left[ e^{a_\varepsilon^{-1} F(X^\varepsilon)} \right] = \sup_{x \in S} \{ F(x) - I(x) \} \tag{4.21} \]
We refer the reader to [7] for more details. This result will prove useful to estimate the variance of the IS estimator, and thus to choose an optimal change of measure.
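The one-dimensional Laplace principle recalled above can be checked numerically. The sketch below is an illustration only: the function name `scaled_log_integral`, the midpoint quadrature and the log-sum-exp rescaling (used to avoid overflow for large $\lambda$) are choices made for this note.

```python
import math


def scaled_log_integral(h, lo, hi, lam, n=20000):
    """Return (1/lam) * ln of a midpoint-rule approximation of the
    integral of exp(lam * h(x)) over [lo, hi]."""
    dx = (hi - lo) / n
    values = [lam * h(lo + (k + 0.5) * dx) for k in range(n)]
    m = max(values)  # factor out the largest exponent before summing
    log_integral = m + math.log(sum(math.exp(v - m) for v in values) * dx)
    return log_integral / lam
```

For $h(x) = -(x - 0.3)^2$ on $[0, 1]$, whose supremum is $0$, the returned value moves towards $0$ as $\lambda$ increases, as the principle predicts.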
4.2.2.3 IS for Diffusion Processes

A second class of widely used processes in biological models is the class of diffusion processes (see Chap. 1). They appear naturally when considering a dynamical system perturbed by an external white noise. For example, with one of the most elementary neuron models, called leaky Integrate and Fire (see Chap. 5, (5.1) and (5.33)), one may study the impact of a white noise stimulation:
\[ \text{Subthreshold dynamics:}\quad dV_t = -\beta V_t\,dt + \sigma\,dW_t \quad \text{for } V_t < V_{th} \tag{4.22} \]
\[ \text{Reset:}\quad V_{t^+} = V_r \quad \text{if } V_t = V_{th} \tag{4.23} \]
Each time the membrane potential $V_t$ reaches the threshold $V_{th}$, a spike is emitted. Finding the spike time distribution is equivalent to a first passage time problem: $\tau := \inf\{t > 0;\ V_t = V_{th}\}$. For this simple example, several methods have been proposed to find the law of $\tau$ [3]. Here we are interested in illustrating how IS can provide an alternative way of estimating this law numerically, through the cumulative distribution function $P(\tau < t)$, with MC simulation. If $\sigma$ is large, or if $t$ is large, or if $V_{th}$ is close to
the starting point, then the event $\{\tau < t\}$ should occur frequently. However, if $\sigma$ or $t$ are very small, or $V_{th}$ very high, then $P(\tau < t)$ should be small and a naive MC will not be efficient. Consider the case $\sigma \ll 1$ and the other parameters fixed, of order 1. The process $(V_t)$ fluctuates around 0 with a small amplitude (i.e. $E[V_t] = 0$ and $\mathrm{Var}[V_t]$ is of order $\sigma^2/(2\beta)$). Moreover, as predicted by large deviation theory (cf. Sect. 4.2.2.2, (4.19)), a crossing of $V_{th}$ occurs in a time of order $\exp(V_{th}^2 \beta / \sigma^2) \gg 1$, meaning that:
\[ \lim_{\sigma \to 0} \sigma^2 \ln E[\tau] = V_{th}^2\, \beta. \tag{4.24} \]
We want to apply the ideas of IS, that is, simulate another process $\tilde{V}$ instead of $V$, and weight the results by a likelihood function. Here, a natural idea is to add a drift so that the expectation of $\tau$ does not tend to $+\infty$ any more. We thus introduce:
\[ d\tilde{V}_t = (\sigma \mu_t - \beta \tilde{V}_t)\,dt + \sigma\,d\tilde{W}_t \tag{4.25} \]
where $\tilde{W}_t = W_t - \int_0^t \mu_s\,ds$. Denoting $\tilde{\tau} = \inf\{t > 0;\ \tilde{V}_t = V_{th}\}$, a sufficient condition to have a finite limit for $E[\tilde{\tau}]$ is $\mu_t > \beta V_{th}/\sigma$. In this way, the hitting time will be much smaller: when $\sigma$ is small, we expect $\tilde{\tau}$ to be concentrated around a finite value $\tau^*$, instead of going to infinity. If $\mu_s = \mu$ is constant and strictly larger than $\beta V_{th}/\sigma$, then $\tau^* = \beta^{-1} \ln\left[ (V_0 - \sigma\mu/\beta) / (V_{th} - \sigma\mu/\beta) \right]$. Thus, for $t > \tau^*$, the event $\{\tilde{\tau} < t\}$ should be frequent. To implement the IS algorithm, we need to know the likelihood functional for the change of probability corresponding to the new process $\tilde{V}$. This likelihood functional $L_T$ is given by the Girsanov theorem:
\[ L_T(\tilde{V}) = \exp\left( -\int_0^T \mu_s\,d\tilde{W}_s - \frac{1}{2} \int_0^T \mu_s^2\,ds \right). \tag{4.26} \]
The dependence of $L_T$ upon $\tilde{V}$ appears only through the trajectory of the underlying Brownian motion $\tilde{W}$. An IS estimator for $P(\tau < t)$ is then:
\[ \hat{p}_N^{IS}(t) = \frac{1}{N} \sum_{i=1}^{N} 1_{\{\tilde{\tau}_i < t\}}\, L_T(\tilde{V}_i). \tag{4.27} \]
Let us now discuss in more detail a strategy to find an optimal choice of $(\mu_t)$. This is a difficult question and to our knowledge only partial answers are known [5, 16, 18]. The following arguments will rely on large deviation theory, see Chap. 3 and Sect. 4.2.2.2. More details can be found in [2, 7]. Finding an optimal $\mu$ means that the variance of the estimator $\hat{p}_N^{IS}(t)$ should be minimal. To be slightly more general, consider that we want to estimate
\[ E\left[ G(\{V_s\}_{0 \le s \le T}) \right] \tag{4.28} \]
where $G$ is a functional of the trajectory. The case of the first passage time enters this framework. The IS estimator is based on the equality:
\[ E\left[ G(\{V_s\}_{0 \le s \le T}) \right] = E\left[ G(\{\tilde{V}_s\}_{0 \le s \le T})\, L_T(\tilde{V}) \right] \tag{4.29} \]
which can be rewritten in a shorter way as:
\[ E_P[G] = E_Q[G L_T]. \tag{4.30} \]
The variance of the estimator is then $1/N$ multiplied by:
\[ \mathrm{Var}_\mu = E_P[G^2 L_T] - E_P[G]^2. \tag{4.31} \]
The aim is to find $\mu$ that minimizes $\mathrm{Var}_\mu$. The first observation is that if $\mu$ is chosen such that:
\[ \frac{1}{L_T} = \frac{G}{E_P[G]} \tag{4.32} \]
then the variance vanishes. In this case, we say that $\mu$ corresponds to the zero variance change of probability. However, this requires knowing $E_P[G]$ in advance, which is precisely the quantity we want to estimate. The first kind of argument is based on this observation, and the idea is to find a reasonable value for $E_P[G]$ with the help of large deviation theory in order to obtain a $\mu$ that is close to the zero variance change of probability. If $G$ is only a function of the process at time $T$ (and not a functional of the full trajectory), then it is possible to obtain a tractable formula for $\mathrm{Var}_\mu$ and for the zero variance change of probability. This idea has been developed in the context of option pricing in [9]. This method requires a prior for $E_P[G]$ and large deviations theory is a powerful tool to obtain such a prior in the limit of small noise. However, as soon as the drift does not belong to the Hamiltonian class, analytical expressions of large deviations quantities such as the quasipotential are very difficult to obtain. Numerical methods may then be useful to obtain this prior. We discuss these methods, which have their own interest, in the next section. Based on the same observation, iterative methods were suggested: starting from a guess for $E_P[G]$, a first run of MC simulations gives an estimated value, which in turn can be re-injected to define a new change of probability, and so on. The second kind of argument has appeared in [11, 12] and is based on the Varadhan Lemma, which extends the Laplace principle (cf. Sect. 4.2.2.2, (4.21) and [7] for more details). Assume $V$ satisfies a large deviation principle when $\sigma \to 0$, with rate function $I$. Then, if $G$ is non-negative, we have from the Varadhan Theorem (under some assumptions):
\[ \lim_{\sigma \to 0} \sigma^2 \ln E_P\left[ \exp\left( \frac{\ln(G^2 L_T)}{\sigma^2} \right) \right] = \sup_{\phi} \{ \ln(G^2(\phi) L_T(\phi)) - I(\phi) \} \tag{4.33} \]
where the supremum is over all the absolutely continuous functions $\phi$ on $[0, T]$ with initial condition $\phi(0) = V_0$. Several authors have then suggested that an interesting choice of $\mu$ would be the minimizer in:
\[ \inf_{\mu} \sup_{\phi} \{ \ln(G^2(\phi) L_T(\phi)) - I(\phi) \}. \tag{4.34} \]
Such a $\mu$ corresponds to a so-called asymptotically optimal change of probability. In our setting, $I$ is given by:
\[ I(\phi) = \frac{1}{2} \int_0^T \left( b(\phi_t) - \dot{\phi}_t \right)^2 dt \tag{4.35} \]
where $b(\phi_t) = -\beta \phi_t$. Then replacing $L_T$ and $I$ by their expressions, the minimization/maximization problem can be rewritten as:
\[ \inf_{\mu} \sup_{\phi} \left\{ 2 \ln(G(\phi)) + \frac{1}{2} \int_0^T \left( \mu_t + b(\phi_t) - \dot{\phi}_t \right)^2 dt - \int_0^T \left( b(\phi_t) - \dot{\phi}_t \right)^2 dt \right\}. \tag{4.36} \]
Then, assuming the conditions for inverting the infimum and the supremum are satisfied (see [12]), this problem reduces to:
\[ \sup_{\mu} \left\{ 2 \ln(G(\phi)) - \int_0^T \mu_t^2\,dt \right\} \tag{4.37} \]
where $\phi$ is the solution of $\dot{\phi} = b(\phi) + \mu_t$. In our setting, $G(\phi) = 1_{\{\sup_{s \in [0,T]} \phi(s) > V_{th}\}}$, so that the problem is now:
\[ \inf_{\mu:\ \sup_{s \in [0,T]} \phi(s) > V_{th}}\ \int_0^T \mu_t^2\,dt. \tag{4.38} \]
Finding a solution to this problem is elementary if one considers only constant drifts $\mu_t \equiv \mu$. In this case, $\mu$ should be such that $\tau^* = T$, which gives:
\[ \mu = \frac{\beta}{\sigma}\, \frac{V_0 - V_{th}\, e^{\beta T}}{1 - e^{\beta T}}. \tag{4.39} \]
To conclude, we display in Fig. 4.2 some simulations of the trajectories for the initial process, and in Fig. 4.3 simulations of the drifted process: the crossing event was very rare in the initial problem and has become frequent in the new one. Finally, in Table 4.2, we compare the estimated value of $P(\tau < 1)$ between the naive MC and the IS strategy: with $N = 10^4$ the naive method fails completely, whereas with the IS strategy, a rather good estimation is obtained.
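The IS strategy for the first passage problem can be sketched as follows. This is an illustration under several assumptions, not the code behind Table 4.2: the function names are hypothetical, the drifted process (4.25) is discretized with an Euler scheme (so crossings between grid points are missed, which biases the estimate slightly downwards), and the Girsanov weight is evaluated at the first crossing time rather than at $T$. The last point is a deliberate variation on (4.27): it leaves the expectation unchanged because the likelihood ratio is a martingale, and it avoids accumulating weight noise after the crossing.

```python
import math
import random


def optimal_constant_drift(beta, sigma, v0, vth, T):
    """Constant drift given by (4.39)."""
    return beta / sigma * (v0 - vth * math.exp(beta * T)) / (1.0 - math.exp(beta * T))


def is_first_passage(beta, sigma, mu, v0, vth, T, n_paths, n_steps, rng):
    """IS estimate of P(tau < T) with constant drift parameter mu.

    Setting mu = 0 gives the naive MC estimator. Returns the estimate and
    the empirical per-sample variance.
    """
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        v = v0
        log_w = 0.0
        weight = 0.0  # stays 0 if the threshold is not reached before T
        for _step in range(n_steps):
            xi = rng.gauss(0.0, 1.0)
            v += (sigma * mu - beta * v) * dt + sigma * sqrt_dt * xi
            # Increment of -int mu dW~ - (1/2) int mu^2 ds, cf. (4.26).
            log_w += -mu * sqrt_dt * xi - 0.5 * mu * mu * dt
            if v >= vth:
                weight = math.exp(log_w)
                break
        total += weight
        total_sq += weight * weight
    mean = total / n_paths
    var = (total_sq - n_paths * mean * mean) / (n_paths - 1)
    return mean, var
```

With $\beta = 1$, $V_{th} = 1$, $T = 1$, $V_0 = 0$ and $\sigma = 0.3$, `optimal_constant_drift` returns the value $\mu \approx 5.2733$ quoted in Table 4.2, and with $\sigma = 0.1$ it returns $\mu \approx 15.82$, consistent with Fig. 4.3.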
Fig. 4.2 A hundred sample trajectories of the original $V_t$ with parameters $\beta = 1$, $V_{th} = 1$, $T = 1$, $\sigma = 0.1$ and $\mu = 0$: the event of crossing $V_{th} = 1$ before $T = 1$ is a (very) rare event. (Horizontal axis: time $t$, from 0 to 1; vertical axis from −1 to 1.)
Fig. 4.3 A hundred sample trajectories of the drifted $V_t$ with parameters $\beta = 1$, $V_{th} = 1$, $T = 1$, $\sigma = 0.1$ and $\mu = 15.81$: the event of crossing $V_{th} = 1$ before $T = 1$ happens approximately half of the time. In the MC estimator (4.27), a very small weight $L_T$ is associated with each sample trajectory to make the average. (Horizontal axis: time $t$, from 0 to 1; vertical axis from −1 to 1.)
Table 4.2 The IS method significantly improves the variance of the estimator of the probability that the first passage time is smaller than $T = 1$. Parameters: $\beta = 1$, $V_{th} = 1$, $T = 1$, $\sigma = 0.3$, $\mu = 5.273257$

Number of runs $N$    Mean naive MC    Variance naive MC    Mean IS MC    Variance IS MC
$10^2$                0.000            0.000                1.217E-06     2.428E-11
$10^3$                0.000            0.000                6.819E-07     1.535E-11
$10^4$                0.000            0.000                9.431E-07     2.718E-10
$10^5$                1.000E-05        1.000E-05            8.287E-07     1.158E-10
$10^6$                2.000E-06        2.000E-06            9.019E-07     4.450E-09
4.3 Numerical Methods Based on Large Deviations Theory

4.3.1 Quasipotential and Optimal Path

While describing simulation methods for rare events, we have seen that having a reasonable prior for the quantity we are trying to estimate can be a key element to design an efficient algorithm. For instance, in the IS method, a way to find a good change of probability (in the sense of minimal variance) is to try to be close to the zero-variance change of probability, which requires a knowledge of the value one wants to estimate. To find such a prior, approximation methods are useful, and often rare events can be studied in a framework of small noise asymptotics. Therefore large deviations theory can provide asymptotic estimates that can be used in the design of simulation methods. Moreover, the results obtained by large deviations techniques have their own interest. A situation of special interest is when considering a dynamical system perturbed by small external noise or a large population close to its deterministic “law of large numbers” description, where finite size effects play the role of internal noise. In this situation, suppose the underlying deterministic system has two stable equilibria. Because of the noise, the system will jump between those two stable states. Two types of questions can be asked about such a system: what is the law of the time between two jumps (in particular its expectation) and what is the most likely path (MLP) in the phase space from one equilibrium to the other. If the dynamics is non-gradient, then the classical Kramers rate theory fails, and one needs other tools. The MLP can be predicted using large deviations theory, and not only does it provide an insight into the mechanism of the rare event, but it can also be used to deduce the jumping rate. In the Freidlin–Wentzell theory of large deviations [10], this MLP is shown to minimize an action functional. 
This functional on a trajectory space can be interpreted as a probabilistic cost function: it assigns a sort of “likelihood” to a given path between two points. Of course, if there exists a trajectory solution of the deterministic system with the prescribed starting and ending points, then it is exactly
the MLP. However, in the situation of interest there is no such deterministic solution: an escape from one stable equilibrium to the other is a noise-induced phenomenon. To find the MLP, one has to find the trajectory $\phi$ starting from $x$ at time $t = 0$ and ending at $y$ at time $t = T$, which minimizes the action functional $S_{0,T}(\phi)$. The minimization should also be taken over all $T \ge 0$, and this leads to the following definition of the so-called quasipotential:
\[ V(x, y) := \inf_{T \ge 0}\ \inf_{\phi:\ \phi(0) = x,\ \phi(T) = y} S_{0,T}(\phi). \tag{4.40} \]
The quasipotential has two interesting properties. First, the average time of exit from a domain $D$ around a stable fixed point $x$ is of order $\exp(V(x)/\varepsilon^2)$, where $V(x) := \inf_{y \in \partial D} V(x, y)$. Secondly, the most likely exit point on the boundary is given, if unique, by $y^*$ such that $V(x, y^*) = V(x)$. We refer to Chaps. 3 and 7 for more details about the quasipotential. Our purpose is to present an algorithm [13] designed to compute numerically the quasipotential corresponding to a given small noise system.
4.3.2 Numerical Methods

The problem we are dealing with is an optimization problem, which defines the quasipotential:
\[ V(x_1, x_2) = \inf_{T \ge 0}\ \inf_{\phi \in C_a^{x_1, x_2}(0, T)} \{ S_{0,T}(\phi) \} \]
where $C_a^{x_1, x_2}(0, T)$ is the set of all absolutely continuous functions $\phi$ from $[0, T]$ to $D$ with $\phi(0) = x_1$ and $\phi(T) = x_2$.

4.3.2.1 General Comments on Quasipotential Computation Algorithms

There exist several numerical methods to solve this minimization problem. The first idea is to use a shooting method [14], considering the boundary value problem for the Hamiltonian equation associated with the minimization problem. Another method, called the minimum action method, has been introduced in [23] and is based on a relaxation method for the associated Euler–Lagrange equation. The drawback of this method is that it is well suited when $T$ is fixed but not when the infimum has to be taken over $T$ as well, since the MLP may be obtained for $T \to \infty$. The method we are going to present here, called the geometric minimum action method and introduced in [13], is based on a geometric reformulation, which prevents difficulties related to large times $T \to \infty$ and is thus more efficient in this context.
4.3.2.2 Presentation of the Algorithm from [13]

The ideas behind the algorithm are the following:
• Reformulate the problem so that the infimum should not be taken over all $T$ and $\phi$ but only over the curves $\psi \in C_a^{x_1, x_2}(0, 1)$ going from $x_1$ to $x_2$.
• Write the Euler–Lagrange equation corresponding to the reformulated optimization problem.
• Solve this equation numerically through a discretization relaxation method.

Three main assumptions are made for the method, and we recall them here. $D$ is an open connected subset of $\mathbb{R}^n$.
A1 $\forall x \in D$, $H(x, 0) \le 0$
A2 $H(\cdot, \cdot)$ is twice continuously differentiable
A3 $H_{\theta\theta}(x, \cdot)$ is uniformly elliptic on compacts

In the case of a diffusion process with a diagonal diffusion matrix $a$ (uncorrelated Brownian motions), $H_{\theta\theta} = a$ and assumption A3 requires the diffusion process to be uniformly non-degenerate.

Geometric reformulation. By definition:
\[ V(x_1, x_2) = \inf_{T}\ \inf_{\phi \in C_a^{x_1, x_2}(0, T)} \{ S_T(\phi) \}. \]
The main result is the following reformulation:
\[ V(x_1, x_2) = \inf_{\psi \in C_a^{x_1, x_2}(0, 1)} \{ \tilde{S}(\psi) \} \]
with the two following expressions for $\tilde{S}$:
\[ \tilde{S}(\psi) = \int_0^1 \langle \psi', \tilde{\theta}(\psi, \psi') \rangle\,d\alpha = \int_0^1 \frac{L(\psi, \lambda \psi')}{\lambda}\,d\alpha, \qquad \lambda = \lambda(\psi, \psi') \tag{4.41} \]
where the functions $\tilde{\theta}(x, y)$ and $\lambda(x, y)$ are defined implicitly for all $x \in D$ and $y \in \mathbb{R}^n \setminus \{0\}$ as the unique solution of the system:
\[ H(x, \tilde{\theta}) = 0, \qquad H_\theta(x, \tilde{\theta}) = \lambda y, \qquad \lambda \ge 0. \]
In the case of a diffusion process (4.14), $H$ is given in (4.16), and the above system reduces to:
\[ f(x) \cdot \tilde{\theta}(x, y) + \frac{1}{2} |\tilde{\theta}(x, y)|^2 = 0, \qquad f(x) + \tilde{\theta}(x, y) = \lambda(x, y)\, y, \qquad \lambda \ge 0. \]
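For the diffusion case, the system above can be solved by hand: substituting $\tilde{\theta} = \lambda y - f(x)$ into the first equation gives $\lambda = |f(x)|/|y|$, so the integrand of (4.41) becomes $\langle \psi', \tilde{\theta} \rangle = |f(\psi)|\,|\psi'| - \langle f(\psi), \psi' \rangle$. The sketch below evaluates this geometric action for a discretized curve. It is only an illustration of the reformulation (the function name `geometric_action` and the midpoint discretization are choices made for this note), not the relaxation algorithm of [13].

```python
import math


def geometric_action(curve, drift):
    """Discretized geometric action (4.41) for the diffusion (4.14).

    curve is a list of points (tuples) from x1 to x2, drift(x) returns f(x)
    as a tuple. Each segment contributes |f||dpsi| - <f, dpsi>, with f
    evaluated at the segment midpoint.
    """
    total = 0.0
    for p, q in zip(curve[:-1], curve[1:]):
        mid = tuple(0.5 * (a + b) for a, b in zip(p, q))
        step = tuple(b - a for a, b in zip(p, q))
        f = drift(mid)
        norm_f = math.sqrt(sum(c * c for c in f))
        norm_step = math.sqrt(sum(c * c for c in step))
        total += norm_f * norm_step - sum(a * b for a, b in zip(f, step))
    return total
```

For the one-dimensional drift $f(x) = -x$, the straight curve from $0$ to $1$ has geometric action $1$, which matches the value $\beta y^2$ of the quasipotential with $\beta = y = 1$, while the reversed curve, which follows the deterministic flow, has zero action. No time horizon $T$ appears, which is the point of the reformulation.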
What is important is that the extremal function for the minimization problem of the reformulation will have the same probabilistic interpretation (optimal path) as the extremal function for the original problem, after of course an appropriate time change.

Derivation. For a rigorous proof we refer the reader to [13], but here we will describe heuristically one way to derive the result (cf. [13], Remark 2.6, p. 19). Every function $\phi \in C_a^{x_1, x_2}(0, T)$ can be written as:
\[ \phi = \psi \circ G^{-1} \]
where $\psi \in C_a^{x_1, x_2}(0, 1)$ follows the path of $\phi$ at constant speed, and $G$ is an appropriate time-rescaling. Then, minimizing over all $\phi$ and $T$ is equivalent to minimizing over all functions $\psi$ and $G$. After a change of variable, setting $g = G'$:
\[ S_T(\phi) = \int_0^T L(\phi, \dot{\phi})\,dt = \int_0^1 L(\psi, \psi'/g)\, g\,d\alpha. \]
So to find $\tilde{S}(\psi)$ we have to minimize over all $g$, which can be done as follows by setting the derivative of the integrand of the second expression equal to zero:
\[ 0 = -\langle \psi'/g^2, L_y(\psi, \psi'/g) \rangle\, g + L(\psi, \psi'/g) \]
\[ \phantom{0} = -\langle \psi'/g, \theta^*(\psi, \psi'/g) \rangle + \langle \theta^*(\psi, \psi'/g), \psi'/g \rangle - H(\psi, \theta^*(\psi, \psi'/g)) \]
\[ \phantom{0} = -H(\psi, \theta^*(\psi, \psi'/g)) \]
with $\theta^*(x, y)$ being the maximizer in the Legendre transform:
\[ L(x, y) = \sup_{\theta} \{ \langle \theta, y \rangle - H(x, \theta) \} = \langle \theta^*(x, y), y \rangle - H(x, \theta^*(x, y)). \]
So that we have:
\[ H_\theta(x, \theta^*(x, y)) = y \quad \text{and} \quad L_y(x, y) = \theta^*(x, y). \]
Thus, the condition:
\[ H(\psi, \theta^*(\psi, \psi'/g)) = 0 \]
is satisfied with $\lambda = 1/g$ such that:
\[ \tilde{\theta}(x, y) = \theta^*(x, \lambda y), \qquad H_\theta(x, \tilde{\theta}(x, y)) = \lambda y, \]
and we then derive expression (4.41) noticing that
\[ L(\psi, \lambda \psi') = \langle \tilde{\theta}(\psi, \psi'), \lambda \psi' \rangle \quad \text{for } \lambda > 0. \]
Euler–Lagrange equation. With this new formulation of the optimization problem defining the quasipotential, we can derive the Euler–Lagrange equation associated with the minimization of $\tilde S$. Let us recall the Euler–Lagrange equation associated with the optimization of $J(v) = \int_0^1 F(v,v')\,dt$. If $J'(u) = 0$ then:
$$\frac{d}{dt}\frac{\partial F}{\partial y}(u,u') - \frac{\partial F}{\partial x}(u,u') = 0.$$
Then, if $J(v) = \tilde S(v)$, we have the following equation:
$$\partial_\alpha\big(\tilde\theta + \tilde\theta_y^T \varphi'\big) - \tilde\theta_x^T \varphi' = 0$$
with all the $\tilde\theta$ being evaluated at $(\varphi,\varphi')$. After some technical calculations, this equation gives the following system:
$$-\lambda^2 \varphi'' + \lambda H_{\theta x}\varphi' - H_{\theta\theta}H_x - \lambda\lambda'\varphi' = 0 \tag{4.42}$$
with initial and final conditions $\varphi(0) = x_1$ and $\varphi(1) = x_2$, with $\lambda = \langle H_\theta, \varphi'\rangle / |\varphi'|^2$, and where $H_x$, $H_{\theta x}$ and $H_{\theta\theta}$ are evaluated at $(\varphi, \tilde\theta(\varphi,\varphi'))$. In terms of implementation, the algorithm is designed to solve the system (4.42) using a classical relaxation method based on a discretized version of (4.42). Examples of applications of this method in chemistry can be found in [22].
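As a worked special case, offered only as an illustration under the assumption of a unit diffusion matrix, take $H(x,\theta) = \langle f(x),\theta\rangle + \frac{1}{2}|\theta|^2$. The quantities entering (4.42) are then explicit:

```latex
\begin{aligned}
H_\theta(x,\theta) &= f(x) + \theta, & H_{\theta\theta}(x,\theta) &= I, \\
H_{\theta x}(x,\theta) &= \nabla f(x), & H_x(x,\theta) &= \nabla f(x)^{T}\,\theta, \\
\tilde\theta(\varphi,\varphi') &= \lambda\varphi' - f(\varphi), & \lambda &= \frac{|f(\varphi)|}{|\varphi'|}.
\end{aligned}
```

In this setting a discretized version of (4.42) requires only the drift $f$, its Jacobian $\nabla f$ and the scalar $\lambda$ along the current curve, with $\lambda'$ approximated by finite differences in $\alpha$.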
4.3.2.3 Towards a Study of Spontaneous Action Potential Generation
In this section, we illustrate on a widely used neuron model the results that can be obtained with the above algorithm. A further development of this approach is the subject of ongoing research. The original three-dimensional Morris–Lecar model was introduced in [17] to account for the voltage oscillations of the barnacle muscle fiber. In order to account for various sources of noise, we add a Wiener process on each variable, so the system reads:
$$C_m\,dV = \big[-g_{Ca}\,m_\infty(V)(V - V_{Ca}) - g_K\,w\,(V - V_K) - g_L(V - V_L) + I\big]\,dt + \sigma_1\,dB_t^{(1)}$$
$$dw = \big[(1-w)\,\alpha_K(V) - w\,\beta_K(V)\big]\,dt + \sigma_2\,dB_t^{(2)}$$
with auxiliary functions and parameters given in Appendix A. Here the variable $V$ is the membrane potential and $w$ is the proportion of open potassium channels.
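A sample path of this system can be generated with an Euler–Maruyama scheme. The Python sketch below is a minimal illustration, not the method used for the figures of this chapter. It assumes the parameter values of Appendix A (with negative reversal potentials for potassium and leak), names the rate constant of the potassium gating `PHI_N`, and uses hypothetical values for the step size and the noise intensities. Note that the scheme does not constrain $w$ to $[0,1]$ when $\sigma_2 > 0$.

```python
import math
import random

# Parameter values from Appendix A; the name PHI_N for the rate constant
# with subscript n is an assumption of this sketch.
V1, V2, V3, V4 = 0.0, 15.0, 10.0, 10.0      # mV
G_CA, G_K, G_L = 4.0, 8.0, 2.0              # mS/cm^2
V_K, V_L, V_CA = -70.0, -50.0, 100.0        # mV
C_M, PHI_N = 20.0, 0.1                      # uF/cm^2, rate constant

def m_inf(v):
    # alpha_Ca / (alpha_Ca + beta_Ca) simplifies to this expression
    return 0.5 * (1.0 + math.tanh((v - V1) / V2))

def alpha_k(v):
    return 0.5 * PHI_N * math.cosh((v - V3) / (2 * V4)) * (1 + math.tanh((v - V3) / V4))

def beta_k(v):
    return 0.5 * PHI_N * math.cosh((v - V3) / (2 * V4)) * (1 - math.tanh((v - V3) / V4))

def drift(v, w, current):
    dv = (-G_CA * m_inf(v) * (v - V_CA) - G_K * w * (v - V_K)
          - G_L * (v - V_L) + current) / C_M
    dw = (1 - w) * alpha_k(v) - w * beta_k(v)
    return dv, dw

def simulate(v0, w0, current, sigma1, sigma2, dt, n_steps, seed=0):
    """Euler-Maruyama path; returns the list of (V, w) states."""
    rng = random.Random(seed)
    v, w = v0, w0
    path = [(v, w)]
    sq = math.sqrt(dt)
    for _ in range(n_steps):
        dv, dw = drift(v, w, current)
        # The noise on V is divided by C_M because the equation is written for C_M dV
        v += dv * dt + (sigma1 / C_M) * sq * rng.gauss(0.0, 1.0)
        w += dw * dt + sigma2 * sq * rng.gauss(0.0, 1.0)
        path.append((v, w))
    return path

# Hypothetical runs: dt = 0.1 ms, 5000 steps, I = 0
noise_free = simulate(-40.0, 0.0, 0.0, 0.0, 0.0, 0.1, 5000)
noisy = simulate(-40.0, 0.0, 0.0, 2.0, 0.01, 0.1, 5000, seed=1)
```

With both noise intensities set to zero the scheme reduces to the explicit Euler method for the deterministic Morris–Lecar system, which is a convenient sanity check before adding noise.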
Fig. 4.4 The background color map represents the quasipotential $V(x_1,x_2)$, with $x_1$ the stable fixed point of the Morris–Lecar system, for different values of $x_2$ in the phase plane. The red curve is an example of an extremal trajectory for the action functional $S$ of the algorithm presented above, starting from $x_1$ (the point in the darkest blue) and ending at another (arbitrary) point. Color code: from low values of the quasipotential (blue) to high values (red). The abscissa is the potential (mV, ticks from −80 to 40) and the ordinate is the proportion of open ion channels (ticks from 0 to 0.40)
As we are dealing with the small noise regime, the only noise parameter that we can introduce is the ratio $\delta = \sigma_1/\sigma_2$. This parameter corresponds to the ratio between the extrinsic (synaptic) and intrinsic (channel) sources of noise. For the input current $I = 0$, we compute the quasipotential $V(x_1,x_2)$ with $x_1$ the stable fixed point of the system, and for all $x_2$ in the phase plane, see Fig. 4.4. This quasipotential can be interpreted as a noise-induced energy landscape, in the sense that small fluctuations can lead the trajectory far from the stable equilibrium but with an associated cost which is precisely the quasipotential. In Fig. 4.5, we explore how this picture is affected by a change of the parameters $I$ and $\delta$. We observe that increasing $I$ or $\delta$ induces an “opening” of the quasipotential. For $I$, this is presumably related to a bifurcation in the underlying deterministic system, since increasing $I$ changes the stability of the fixed point $x_1$. For $\delta$, this is related to a noise-induced increase of the escape rate (see for instance the second line, from right to left: $I$ is fixed and $\delta$ is increased). Further quantitative study of the quasipotential is the subject of ongoing research.
Fig. 4.5 Quasipotential pictures for different values of $\delta \in \{5, 2.5, 1\}$ (from left to right) and $I \in \{0, 10, 20, 30, 40\}$ (from top to bottom) for the Morris–Lecar model. In each figure, the abscissa is the potential variable $V$ (in mV) and the ordinate the recovery variable $w$ (without unit). The red and yellow curves are respectively the $w$- and $V$-nullclines of the system. Their intersection is the equilibrium point $x_1$ from which the quasipotential $V(x_1,x_2)$ is computed, for any $x_2$ in the phase plane. Notice that $x_1$ depends on $I$. Color code: from low values of the quasipotential (blue) to high values (red)
4.4 Conclusion
In this chapter, we have introduced several concepts and methods that help estimate quantities related to rare events. In the first part, we have discussed several aspects of variance reduction for the MC method, with a focus on IS. Our aim
was to illustrate how these methods may apply to elementary situations inspired by biological models. We have shown that these techniques can be useful to deal with multiscale continuous-time Markov chains. We have treated in more detail the problem of the first passage time for a linear diffusion process, to show the kind of mathematical problems that may arise when trying to design an optimal IS scheme. This has led to the observation that large deviation theory can be helpful to solve these questions. Therefore, we have also presented a numerical method designed to estimate large deviation quantities such as the quasipotential and the optimal path. This approach provides asymptotic estimates which have their own interest, but these estimates can in turn be used to initiate an efficient MC simulation, for example to construct an almost optimal change of probability. The link between large deviations and rare event simulation is treated in more detail in [2]. Several rare event simulation problems are not completely solved, and new questions arising from biological modeling will certainly require new techniques. For the interested reader, we recommend the recent book [20], where developments of both theory and applications are presented. The question of rare events is not new, and many philosophers have discussed this topic. We would like to end with a quote from Lucretius [1]: First-beginnings of things (. . . ) have been wont to be borne on, and to unite in every way and essay everything that they might create, meeting one with another, therefore it comes to pass that scattered abroad through a great age, as they try meetings and motions of every kind, at last those come together, which, suddenly cast together, become often the beginnings of great things, of earth, sea and sky, and the race of living things.
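Appendix A: Auxiliary Functions and Parameters
The auxiliary functions of the Morris–Lecar model [17] used in the chapter are:
$$\alpha_K(V) = \frac{1}{2}\,\phi_n \cosh\left(\frac{V - V_3}{2V_4}\right)\left[1 + \tanh\left(\frac{V - V_3}{V_4}\right)\right] \tag{4.43}$$
$$\beta_K(V) = \frac{1}{2}\,\phi_n \cosh\left(\frac{V - V_3}{2V_4}\right)\left[1 - \tanh\left(\frac{V - V_3}{V_4}\right)\right] \tag{4.44}$$
$$m_\infty(V) = \frac{\alpha_{Ca}(V)}{\alpha_{Ca}(V) + \beta_{Ca}(V)} \tag{4.45}$$
with:
$$\alpha_{Ca}(V) = \frac{1}{2}\cosh\left(\frac{V - V_1}{2V_2}\right)\left[1 + \tanh\left(\frac{V - V_1}{V_2}\right)\right] \tag{4.46}$$
$$\beta_{Ca}(V) = \frac{1}{2}\cosh\left(\frac{V - V_1}{2V_2}\right)\left[1 - \tanh\left(\frac{V - V_1}{V_2}\right)\right]. \tag{4.47}$$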
The values of the parameters and initial conditions used for the numerical integration and stochastic simulations are: $V_1 = 0$ mV; $V_2 = 15$ mV; $V_3 = 10$ mV; $V_4 = 10$ mV; $g_{Ca} = 4$ mS/cm$^2$; $g_K = 8$ mS/cm$^2$; $g_L = 2$ mS/cm$^2$; $V_K = -70$ mV; $V_L = -50$ mV; $V_{Ca} = 100$ mV; $C_m = 20$ μF/cm$^2$; $\phi_n = 0.1$.
References
1. Bailey, C.: Lucretius: On the Nature of Things. Angell. Kessinger Publishing, LLC, Whitefish, MT 59937 USA (2008)
2. Bucklew, J.A.: Introduction to Rare Event Simulation. Springer, New York (2004)
3. Burkitt, A.N.: A review of the integrate-and-fire neuron model: I. Homogeneous synaptic input. Biol. Cybern. 95(1), 1–19 (2006)
4. Cox, D.R., Hinkley, D.V.: Theoretical Statistics. Chapman and Hall/CRC, London (1979)
5. Deaconu, M., Lejay, A.: Simulation of diffusions by means of importance sampling paradigm. Ann. Appl. Probab. 20(4), 1389–1424 (2010)
6. Dean, T., Dupuis, P.: Splitting for rare event simulation: A large deviation approach to design and analysis. Stoch. Process. Their Appl. 119(2), 562–587 (2009)
7. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Springer, New York (2009)
8. Den Hollander, F.: Large Deviations. AMS Bookstore, Providence, Rhode Island (2008)
9. Fournie, E., Lasry, J.M., Touzi, N.: Monte Carlo methods for stochastic volatility models. In: Rogers, L.C.G., Talay, D. (eds.) Numerical Methods in Finance, pp. 146–164. Cambridge University Press, Cambridge (1997)
10. Freidlin, M.I., Wentzell, A.D.: Random Perturbations of Dynamical Systems. Springer, New York (1998)
11. Glasserman, P., Heidelberger, P., Shahabuddin, P.: Asymptotically optimal importance sampling and stratification for pricing path-dependent options. Math. Finance 9(2), 117–152 (2001)
12. Guasoni, P., Robertson, S.: Optimal importance sampling with explicit formulas in continuous time. Finance Stochast. 12(1), 1–19 (2008)
13. Heymann, M., Vanden-Eijnden, E.: The Geometric Minimum Action Method: A Least Action Principle on the Space of Curves. Comm. Pure Appl. Math. 61(8), 1052–1117 (2008)
14. Keller, H.B.: Numerical Methods for Two-Point Boundary-Value Problems. Blaisdell, Waltham, MA (1968)
15. Liu, J.B.: Monte Carlo Strategies in Scientific Computing. Springer, New York (2008)
16. Milstein, G.N.: Numerical Integration of Stochastic Differential Equations. Springer, New York (1995)
17. Morris, C., Lecar, H.: Voltage oscillations in the barnacle giant muscle fiber. Biophys. J. 35(1), 193–213 (1981)
18. Newton, N.J.: Variance reduction for simulated diffusions. SIAM J. Appl. Math. 54(6), 1780–1805 (1994)
19. Rao, C.V., Arkin, A.P.: Stochastic chemical kinetics and the quasi-steady-state assumption: application to the Gillespie algorithm. J. Chem. Phys. 118, 4999 (2003)
20. Rubino, G., Tuffin, B.: Rare Event Simulation Using Monte Carlo Methods. Wiley, New York (2009)
21. Rubinstein, R.Y.: Optimization of computer simulation models with rare events. Eur. J. Oper. Res. 99(1), 89–112 (1997)
22. Vanden-Eijnden, E., Heymann, M.: The geometric minimum action method for computing minimum energy paths. J. Chem. Phys. 128, 061103 (2008)
23. Weinan, E., Ren, W., Vanden-Eijnden, E.: Minimum action method for the study of rare events. Comm. Pure Appl. Math. 57(5), 637–656 (2004)
Part II
Neuronal Models
Chapter 5
Stochastic Integrate and Fire Models: A Review on Mathematical Methods and Their Applications Laura Sacerdote and Maria Teresa Giraudo
Abstract Mathematical models are an important tool for neuroscientists. During the last 30 years many papers have appeared on single neuron description and specifically on stochastic Integrate and Fire models. Analytical results have been proved and numerical and simulation methods have been developed for their study. Recent reviews collect the main features of these models but do not focus on the methodologies employed to obtain them. The aim of this paper is to fill this gap by upgrading old reviews. The idea is to collect the existing methods and the available analytical results for the most common one dimensional stochastic Integrate and Fire models to make them available for studies on networks. An effort to unify the mathematical notation is also made. The review is divided into two parts:
1. Derivation of the models with the list of the available closed form expressions for their characterization.
2. Presentation of the existing mathematical and statistical methods for the study of these models.
L. Sacerdote, M.T. Giraudo
Department of Mathematics, University of Torino, Via Carlo Alberto 10, Torino, Italy
e-mail: [email protected]; [email protected]
M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3_5, © Springer-Verlag Berlin Heidelberg 2013

5.1 Introduction
Progress in experimental techniques, with the possibility of recording simultaneously from many neurons, has moved the interest of scientists from single neuron models to small or large network models. Hence, the time seems ripe to summarize the contribution of single neuron models to our knowledge of neuronal coding. Various types of spiking neuron models exist, with different levels of detail in the description. They range from biophysical ones along the lines of the classical 1952 paper by Hodgkin and Huxley [62] (see also Chap. 6), to the “integrate and fire” variants (see, for example,
[41, 61, 98]). Integrate and Fire (IF) type models disregard biological details, which are accounted for through a stochastic term, to focus on causal relationships in neuronal dynamics. Their relative simplicity makes them good candidates for the study of networks. Recent reviews discuss qualitative [16, 17] and quantitative [68] features of stochastic Leaky Integrate and Fire (LIF) models. These models are a variant of IF models in which the spontaneous membrane decay is introduced. An older paper [106] concerns mathematical methods for their study. The aim of this work is to collect the existing mathematical methods for LIF models, to provide a set of methodologies for future studies on networks. Indeed, although the stochastic LIF models are simplified representations of the cells, they are considered good descriptors of the neuron spiking activity [67, 71]. Though some criticisms have appeared, pointing out shortcomings in the fit of experimental data [68], these models are still widely employed. The most used is the Ornstein–Uhlenbeck (OU) version, but all of them have played a role in the understanding of the mechanisms involved in neuronal coding. The first IF models date back to 1907, when Lapicque [85] proposed to describe the membrane potential evolution of a neuron, subject to an input, using the time derivative of the law for the capacitance. In the presence of an input current, the membrane voltage increases until it reaches a constant threshold $S$. Then a spike occurs and the voltage is reset to its resting value, from which it starts to evolve again [133]. Although it fit some experimental data reasonably well, this model remained disregarded until the second half of the last century. Then it became the embryonic idea for “Integrate and Fire” models.
The leading idea in the formulation of stochastic IF and LIF models was to partition the features of the neuron into two groups: the first group of features is accounted for by the mathematical description of the neuronal (deterministic) dynamics, while the second group is considered globally by means of a noise term. In Sect. 5.3 we derive the most popular LIF models after a brief description of the biological features of interest in Sect. 5.2. Improvements of LIF models were proposed in the eighties, following the initial efforts to recognize the main laws governing our brain. The lack of suitable mathematical instruments soon made apparent the difficulties in determining explicit expressions for the input-output relationship. The end of the eighties and the early nineties were characterized by mathematical and numerical advances, accompanied by the development of new, faster computers. Section 5.4 is devoted to a review of the main mathematical methods for the study of stochastic LIF models, updating previous reviews [1, 104, 106]. In the nineties the use of such methodologies, as well as of specific reliable and powerful simulation methods, made it possible to obtain a deeper knowledge of the model features. Unexpected results on the role of noise in neuronal coding have been proved mathematically and confirmed experimentally [127]. Surprisingly, research on LIF models long disregarded their ability to fit real data. The only exception was [75], which did consider the parameter estimation problem. Recently, papers on the statistical estimation of model parameters have started to appear. Section 5.5 considers this subject.
5.2 Biological Features of the Neuron
A comprehensive description of the physiological properties of neurons is outside the scope of this work. We refer to [41, 133, 134] for an exhaustive exposition of neurobiological properties relevant in the modeling context. The neurons are the elementary processing units in the central nervous system, interconnected with intricate patterns. Neurons of different sizes and shapes, but sharing some fundamental features, exist in all the areas of the brain. Their estimated number in the human brain is around $10^{12}$. A typical neuron can be divided into three distinct parts called dendrites, axon, and soma. The dendrites play the role of the input device, collecting signals from other neurons and transmitting them to the soma. The soma is the non-linear processing unit of the neuron. It generates a signal, known as spike or action potential, if the total amount of inputs exceeds a certain threshold. The axon is the output device carrying the signal to the other neurons. The action potentials are electrical pulses, having a duration of about 1–2 ms and an amplitude of around 100 mV. They do not change their shape during transmission through the neuron. A neuron cannot elicit a second spike immediately after a first one has occurred, due to the existence of a refractory period. A chain of action potentials emitted by a single neuron is called a spike train, representing a series of similar events occurring either at regularly spaced instants of time or more randomly. The time between two consecutive spikes is called an interspike interval (ISI). The site where the axon of a neuron is linked with the dendrite or the soma of a second neuron is called the synapse. The linkage between an activated (presynaptic) neuron and a second (postsynaptic) neuron is typically chemical in nature.
When an action potential reaches a synapse, it triggers complex bio-chemical reactions leading to the release of a neurotransmitter and the opening of specific ionic channels on the membrane. The ion influx leads to a change in the potential value at the postsynaptic site and the translation of the chemical signal into an electrical one. This voltage response is called the postsynaptic potential (PSP). The effect of a spike on the postsynaptic neuron is measured in terms of the potential difference between the interior of the cell and its surroundings, called the membrane potential. In the absence of spike inputs, the cell is at a resting level of about −65 mV. If the change in membrane potential is positive, the synapse is excitatory and induces a depolarization; otherwise it is inhibitory and hyperpolarizes the cell. In the absence of inputs, i.e. in the silent state, the neuron membrane potential decays exponentially toward the resting level. The dimensions and number of synapses vary for different neurons. Some neurons, such as Purkinje cerebellar cells, pyramidal neurons and interneurons recorded in vitro [68], have a huge number of synapses and extended dendritic trees. IF models can then be employed for the description of their output behavior since, due to the large number of synapses, limit theorems can be used [68, 100].
5.3 One Dimensional Stochastic Integrate and Fire Models
5.3.1 Introduction and Notation
The huge number of synapses impinging on the neuron determines a stochasticity in the activating current that is not considered in the Lapicque model. The first attempt to formulate a stochastic IF model is due to Gerstein and Mandelbrot. In [40] they fitted a number of recorded ISIs through the Inverse Gaussian (IG) distribution, i.e. the first passage time distribution of a Wiener process through a constant boundary $S$. They described the membrane potential dynamics preceding the release of a spike through a Wiener process. To get a renewal process they assumed that after each spike the membrane potential is instantaneously reset to its initial value (see [25] for an introduction on these processes). This model is the basis of successive, more realistic models. In the models classified as stochastic IF or LIF, one describes the time evolution of the membrane potential by means of a suitable stochastic process $X = \{X_t;\ t > t_0\}$ with $X_{t_0} = x_0$ and identifies the ISIs with the random variable (r.v.) first passage time (FPT) of $X$ through the threshold $S$:
$$T = T_S = \inf\{t > t_0 : X_t > S\}. \tag{5.1}$$
The probability density function (pdf) of $T$, when it exists, is
$$g(t) = g(S,t\,|\,x_0,t_0) = \frac{\partial}{\partial t} P(T < t). \tag{5.2}$$
When $t_0 = 0$ we simply write $g(S,t\,|\,x_0)$. In some instances $S = S(t)$. In Sects. 5.3.2, 5.3.5 and 5.3.6 we focus on models that describe the subthreshold membrane potential as a diffusion process. In Sects. 5.3.3 and 5.3.4 we present two continuous time Markov models, the Randomized Random Walk (RRW) and Stein’s model. Reviews on IF and LIF models have already appeared [80, 104, 106], but here we unify the notation and we list in a single contribution the mathematical results scattered in different papers. In the case of models using a diffusion process $X = \{X_t;\ t > t_0\}$, the diffusion interval is $I = (l,r)$ (see Chap. 2), and the drift coefficient and infinitesimal variance (the infinitesimal moments) are:
$$\mu(x) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} E(\Delta X_t \,|\, X_t = x); \qquad \sigma^2(x) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} E\big[(\Delta X_t)^2 \,|\, X_t = x\big], \tag{5.3}$$
with $\Delta X_t = X_{t+\Delta t} - X_t$. The transition pdf $f(x,t\,|\,x_0,t_0) = \partial P(X_t \le x \,|\, X_{t_0} = x_0)/\partial x$ can be found as the solution to a partial differential equation, either the Kolmogorov equation [100]:
$$\frac{\partial f(x,t\,|\,x_0,t_0)}{\partial t_0} + \mu(x_0)\frac{\partial f(x,t\,|\,x_0,t_0)}{\partial x_0} + \frac{\sigma^2(x_0)}{2}\frac{\partial^2 f(x,t\,|\,x_0,t_0)}{\partial x_0^2} = 0 \tag{5.4}$$
or the Fokker–Planck equation
$$\frac{\partial f(x,t\,|\,x_0,t_0)}{\partial t} = -\frac{\partial}{\partial x}\{\mu(x) f(x,t\,|\,x_0,t_0)\} + \frac{1}{2}\frac{\partial^2}{\partial x^2}\{\sigma^2(x) f(x,t\,|\,x_0,t_0)\} \tag{5.5}$$
with initial delta condition
$$\lim_{t_0 \to t} f(x,t\,|\,x_0,t_0) = \delta(x - x_0). \tag{5.6}$$
Here $\delta$ denotes the Dirac delta function. We suppose that the infinitesimal moments verify some mild conditions [70, 100, 106] to guarantee the existence of the solutions of the Fokker–Planck and Kolmogorov equations. Furthermore, when a dependence of the diffusion coefficients on $t$ is not specified, the processes are time homogeneous, i.e. their properties are invariant with respect to time shifts. When (5.4) is solved in the presence of an absorbing boundary at $x = S$, a further absorption condition must be imposed:
$$\lim_{x \to S} f^a(x,t\,|\,x_0,t_0) = 0, \tag{5.7}$$
where $f^a(x,t\,|\,x_0,t_0) = \frac{\partial}{\partial x} P(X_t \le x,\ T_S > t \,|\, X_{t_0} = x_0)$ is the corresponding transition pdf. To get renewal processes, $X$ is always reset to $x_0$ after each spike. To characterize a diffusion model, one can also make use of the Itô-type stochastic differential equation (SDE) verified by the process (cf. Chap. 1). A comparison between different LIF models is given in Sect. 5.3.7. Jump diffusion models, allowing one to distinguish the effect of neuronal inputs according to their frequency and their size, are presented in Sect. 5.3.8. The role of the threshold shape is illustrated in Sect. 5.3.9, and the most recently introduced IF models are surveyed in Sect. 5.3.10. To switch from the description of the spike times of the neuron to the count of the number of spikes up to a given time $t$, we introduce, in Sect. 5.3.11, the return processes.
5.3.2 Wiener Process Model
Gerstein and Mandelbrot [40] described the time evolution of the subthreshold membrane potential through a Wiener process $X_t$ characterized by infinitesimal moments
$$\mu(x) = \mu; \qquad \sigma^2(x) = \sigma^2 \tag{5.8}$$
with $\mu \in \mathbb{R}$, $\sigma > 0$. Their model was motivated by experimental observations of the ISIs exhibiting histograms typical of stable distributions. Indeed, this property is exhibited by the FPT of a Wiener process. One obtains such a process from the standard Wiener process $W$ (Chap. 1, Definition 1.2 and p. 11) through the transformation
$$X_t = \mu t + \sigma W_t; \qquad \forall\, t \ge 0. \tag{5.9}$$
To relate the use of the Wiener process with the membrane potential evolution, Gerstein and Mandelbrot observed that the Wiener process is the continuous limit of a random walk (Chap. 1, p. 4). The occurrence of jumps then models the arrival of PSPs. The continuous limit is a good approximation when the inputs are of small size and are frequent. The transition pdf of $X$ is
$$f_X(x,t\,|\,x_0,t_0) \equiv \frac{\partial P(X_t < x \,|\, X_{t_0} = x_0)}{\partial x} = \frac{1}{\sqrt{2\pi\sigma^2(t-t_0)}} \exp\left\{-\frac{[x - x_0 - \mu(t-t_0)]^2}{2\sigma^2(t-t_0)}\right\}. \tag{5.10}$$
To mimic the spiking times a constant absorbing boundary $S$ is introduced. The spike times are then identified with the FPT, $T$, of the Wiener process originated at $X_{t_0} = x_0$ through the boundary. To obtain the renewal property, the process is instantaneously reset to $x_0$ after each spike. Hence, the ISIs correspond to the independent identically distributed (iid) r.v. $T_n$, $n = 1, 2, \ldots$, with $T_n \sim T$. The transition pdf of $X$, if $X_{t_0} = x_0$, is Gaussian with mean $E(X_t) = x_0 + \mu t$ and variance $\mathrm{Var}(X_t) = \sigma^2 t$, while the FPT pdf through a constant boundary $S > x_0$ is an IG distribution, hence the pdf and the cumulative distribution are:
$$g(S,t\,|\,x_0) = \frac{S - x_0}{\sqrt{2\pi\sigma^2 t^3}} \exp\left\{-\frac{(S - x_0 - \mu t)^2}{2\sigma^2 t}\right\}; \tag{5.11}$$
$$P(T < t) = \frac{1}{2}\left[\mathrm{Erfc}\left(\frac{S - x_0 - \mu t}{\sigma\sqrt{2t}}\right) + e^{\frac{2\mu(S - x_0)}{\sigma^2}}\, \mathrm{Erfc}\left(\frac{S - x_0 + \mu t}{\sigma\sqrt{2t}}\right)\right]. \tag{5.12}$$
Here Erfc denotes the complementary error function [2]. For $\mu > 0$, the mean and the variance of the FPT are
$$E[T] = \frac{S - x_0}{\mu}; \qquad \mathrm{Var}(T) = \frac{(S - x_0)\,\sigma^2}{\mu^3}. \tag{5.13}$$
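The closed forms (5.11)–(5.13) are easy to check numerically. The following Python sketch codes the FPT density and distribution function with the standard library and recovers the total mass and the mean by a midpoint rule; the parameter values and the integration range are hypothetical choices made only for this illustration.

```python
import math

def fpt_pdf(t, s, x0, mu, sigma):
    """Inverse Gaussian FPT density (5.11) for a constant boundary s > x0."""
    a = s - x0
    return a / math.sqrt(2 * math.pi * sigma**2 * t**3) * math.exp(
        -(a - mu * t) ** 2 / (2 * sigma**2 * t))

def fpt_cdf(t, s, x0, mu, sigma):
    """FPT distribution function (5.12), written with math.erfc."""
    a = s - x0
    root = sigma * math.sqrt(2 * t)
    return 0.5 * (math.erfc((a - mu * t) / root)
                  + math.exp(2 * mu * a / sigma**2) * math.erfc((a + mu * t) / root))

def integrate(func, lo, hi, n):
    """Composite midpoint rule with n cells."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))

# Hypothetical parameter values chosen only for illustration
S, X0, MU, SIGMA = 10.0, 0.0, 1.0, 1.0
mass = integrate(lambda t: fpt_pdf(t, S, X0, MU, SIGMA), 0.0, 60.0, 60000)
mean = integrate(lambda t: t * fpt_pdf(t, S, X0, MU, SIGMA), 0.0, 60.0, 60000)
exact_mean = (S - X0) / MU   # first formula in (5.13)
```

With these values the numerical mean agrees with $(S - x_0)/\mu = 10$, and a centered difference of `fpt_cdf` reproduces `fpt_pdf`, which confirms that (5.11) and (5.12) are consistent with each other.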
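The transition pdf in the presence of a constant absorbing boundary $S$ is [104]:
$$f^a(x,t\,|\,y,s) = \frac{1}{\sqrt{2\pi\sigma^2(t-s)}}\left\{\exp\left[-\frac{(x - y - \mu(t-s))^2}{2\sigma^2(t-s)}\right] - \exp\left[\frac{2\mu}{\sigma^2}(S - y) - \frac{(x - 2S + y - \mu(t-s))^2}{2\sigma^2(t-s)}\right]\right\}. \tag{5.14}$$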
Despite the excellent fit to some experimental data, the Gerstein and Mandelbrot model was criticized for its biological simplifications [134]. However, it allows one to obtain results that help the intuition for more realistic models and it is still used for this purpose, taking advantage of the existence of a closed form FPT pdf through a constant boundary. Its FPT pdf is also known for particular time dependent boundaries. These FPTs can be used to account for the refractory period following a spike. Indeed, a time varying boundary, assuming high values at small times and then decreasing, makes short ISIs rare. The FPT pdf is known for a continuous piecewise-linear boundary $S(t) = \alpha_i + \beta_i t$, $t \in [t_{i-1}, t_i]$, $i \ge 1$, where $t_0 < t_1 < t_2 < \ldots$ and $\alpha_i, \beta_i \in \mathbb{R}$ with $t_0 \ge 0$. If $t \in [0, \infty)$, such a boundary is linear and the FPT pdf is
$$g(\alpha_1 + \beta_1 t, t\,|\,x_0) = \frac{|\alpha_1 - x_0|}{\sqrt{2\pi\sigma^2 t^3}} \exp\left\{-\frac{[\alpha_1 + \beta_1 t - \mu t - x_0]^2}{2\sigma^2 t}\right\}. \tag{5.15}$$
In the general case, choosing the coefficients so that $\alpha_i + \beta_i t_i = \alpha_{i+1} + \beta_{i+1} t_i$ makes $t \mapsto S(t)$ continuous on $[t_0, \infty)$. If we put $S_i = S(t_i)$, the transition pdf for the Wiener process without drift, $W$, in the presence of the absorbing boundary $S(t)$, $f^a(x_1,t_1; x_2,t_2; \ldots; x,t\,|\,x_0,t_0)$, is for $t \in (t_{n-1}, t_n)$ [136]:
$$\begin{aligned}
f^a(x_1,t_1; x_2,t_2; \ldots; x,t\,|\,x_0,t_0)
&= \frac{\partial^n}{\partial x_1 \cdots \partial x_n}\, P\big(W_{t_1} < x_1, \ldots, W_{t_{n-1}} < x_{n-1}, W_t < x;\ T > t \,\big|\, W_{t_0} = x_0 < \alpha_1\big) \\
&= \prod_{i=1}^{n-1} f^a(x_i,t_i\,|\,x_{i-1},t_{i-1})\; f^a(x,t\,|\,x_{n-1},t_{n-1}) \\
&= \prod_{i=1}^{n-1} \left[1 - e^{-\frac{2(S_i - x_i)(S_{i-1} - x_{i-1})}{t_i - t_{i-1}}}\right] \frac{\exp\left\{-\frac{(x_i - x_{i-1})^2}{2(t_i - t_{i-1})}\right\}}{\sqrt{2\pi(t_i - t_{i-1})}}\; f^a(x,t\,|\,x_{n-1},t_{n-1})
\end{aligned} \tag{5.16}$$
for $x_i \le S_i$, $1 \le i \le n$; $x_0 < S_0$ and $x \in (-\infty, S)$. Further closed form expressions for the FPT of a Wiener process have been obtained by the method of images [27] or as solutions of suitable integral equations [103], when $\left|\frac{dS(t)}{dt}\right| \le C t^{-\alpha}$, with $\alpha < 1/2$ and $C$ a constant. This last case is discussed in Sect. 5.4.1.1, and involves series of multiple integrals.
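5.3.3 Randomized Random Walk Model
In the RRW the regularly spaced intertimes of the random walk between PSPs are replaced with exponentially distributed intertimes of parameters $\lambda^+$ and $\lambda^-$ for excitatory and inhibitory PSPs, respectively. The process $X$ with $X_0 = 0$ has mean and variance
$$E(X_t) = \delta(\lambda^+ - \lambda^-)\,t; \qquad \mathrm{Var}(X_t) = \delta^2(\lambda^+ + \lambda^-)\,t. \tag{5.17}$$
Here $\delta > 0$ is the constant amplitude of PSPs. The FPT pdf through the boundary $S$, with $S$ an integer multiple of $\delta$, is [134]:
$$g(S,t\,|\,0) = \frac{S}{\delta}\left(\frac{\lambda^+}{\lambda^-}\right)^{S/2\delta} \frac{e^{-(\lambda^+ + \lambda^-)t}}{t}\, I_{S/\delta}\left(2t\sqrt{\lambda^+\lambda^-}\right), \qquad t > 0 \tag{5.18}$$
where $I_\nu(\cdot)$ is the modified Bessel function of parameter $\nu$ [2]. For $\lambda^+ > \lambda^-$, the mean and variance of the ISI distribution are [134]:
$$E[T] = \frac{S}{\delta(\lambda^+ - \lambda^-)}; \qquad \mathrm{Var}(T) = \frac{S(\lambda^+ + \lambda^-)}{\delta(\lambda^+ - \lambda^-)^3}. \tag{5.19}$$
When $\delta \to 0$, assuming $\lambda^+ \sim \lambda^- \sim \frac{1}{2\delta^2}$, this model converges to the Wiener process with $\mu = 0$.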
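The RRW is straightforward to simulate, which gives an independent check of the moments (5.19). The Python sketch below samples the FPT through the level $S/\delta$ by drawing exponential waiting times at the total rate $\lambda^+ + \lambda^-$ and choosing the direction of each jump at random. The rates, the number of levels and the sample size are hypothetical choices; the sketch assumes $\lambda^+ > \lambda^-$ so that the threshold is reached with probability one.

```python
import random

def rrw_first_passage(levels, lam_plus, lam_minus, rng):
    """Sample the FPT of a randomized random walk started at 0
    through the level `levels` (= S / delta, a positive integer).
    Requires lam_plus > lam_minus so that the loop terminates
    with probability one."""
    total_rate = lam_plus + lam_minus
    p_up = lam_plus / total_rate
    position, time = 0, 0.0
    while position < levels:
        time += rng.expovariate(total_rate)       # waiting time to the next PSP
        position += 1 if rng.random() < p_up else -1
    return time

rng = random.Random(12345)
LEVELS, LAM_PLUS, LAM_MINUS = 3, 2.0, 1.0         # hypothetical values
samples = [rrw_first_passage(LEVELS, LAM_PLUS, LAM_MINUS, rng) for _ in range(20000)]
mc_mean = sum(samples) / len(samples)
mc_var = sum((s - mc_mean) ** 2 for s in samples) / (len(samples) - 1)
exact_mean = LEVELS / (LAM_PLUS - LAM_MINUS)                              # (5.19)
exact_var = LEVELS * (LAM_PLUS + LAM_MINUS) / (LAM_PLUS - LAM_MINUS) ** 3
```

The sample mean and variance should approach the values given by (5.19) as the number of samples grows; the variance estimate converges more slowly because the FPT distribution has a heavy right tail.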
5.3.4 Stein’s Model
In 1965 Stein [131] formulated the first LIF model, i.e. an IF model with the leakage feature, by introducing the spontaneous membrane decay in the absence of PSPs in the RRW model. The process $X$ is the solution of the SDE
$$dX_t = \frac{-X_t + \rho}{\tau}\,dt + \delta^+\,dN_t^+ + \delta^-\,dN_t^-; \qquad X_{t_0} = x_0. \tag{5.20}$$
Here $\tau > 0$ is the membrane time constant, $\rho$ is the resting potential, $N_t^+$ and $N_t^-$ are independent Poisson processes of parameters $\lambda^+$ and $\lambda^-$, respectively, and $\delta^+ > 0$, $\delta^- < 0$ are the amplitudes of excitatory and inhibitory PSPs. Generally, for this model and for all those descending from it, one assumes $\rho = 0$, since the case $\rho \ne 0$ can be obtained by the shift $X \mapsto X - \rho$. Following the IF model structure, the spike times are the first crossing times of the process through the boundary and the membrane potential is instantaneously reset to its resting value after each spike. The infinitesimal moments are:
$$\begin{aligned}
M_1(x) &= \lim_{h \to 0} \frac{E[X_{t+h} - X_t \,|\, X_t = x]}{h} = -\frac{x}{\tau} + \lambda^+\delta^+ + \lambda^-\delta^- \\
M_2(x) &= \lim_{h \to 0} \frac{E[(X_{t+h} - X_t)^2 \,|\, X_t = x]}{h} = \lambda^+(\delta^+)^2 + \lambda^-(\delta^-)^2.
\end{aligned} \tag{5.21}$$
The mean trajectory, in the absence of a threshold, is
$$E(X_t\,|\,x_0) = x_0\,e^{-t/\tau} + \tau\,(\lambda^+\delta^+ + \lambda^-\delta^-)\,(1 - e^{-t/\tau}), \tag{5.22}$$
where we put $E(X_t\,|\,x_0) = E(X_t\,|\,X_0 = x_0)$. The FPT problem for the process (5.20) is still unsolved and the use of simulation techniques is required for its analysis.
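Since simulation is the main tool for Stein's model, a minimal Python sketch may be useful. It simulates the process exactly, with zero resting potential: between Poisson events the potential decays exponentially, and at each event it jumps by the excitatory or inhibitory amplitude. The function names and all numerical values are hypothetical choices for illustration; the Monte Carlo mean at a fixed time is compared with the mean trajectory of a linear leaky model driven by Poisson inputs, as in (5.22).

```python
import math
import random

def stein_path(t_end, x0, tau, lam_plus, lam_minus, d_plus, d_minus, rng, threshold=None):
    """Simulate Stein's model (resting potential 0) up to t_end.

    Between PSPs the potential decays as exp(-t / tau); at each Poisson
    event it jumps by d_plus (> 0) or d_minus (< 0).
    Returns (x, t): the state and time at t_end, or at the first
    crossing of `threshold` if one is given and reached earlier."""
    total_rate = lam_plus + lam_minus
    p_exc = lam_plus / total_rate
    x, t = x0, 0.0
    while True:
        wait = rng.expovariate(total_rate)
        if t + wait >= t_end:
            return x * math.exp(-(t_end - t) / tau), t_end
        x *= math.exp(-wait / tau)           # leak until the next PSP
        t += wait
        x += d_plus if rng.random() < p_exc else d_minus
        if threshold is not None and x >= threshold:
            return x, t                      # spike: t is the first passage time

def stein_mean(t, x0, tau, lam_plus, lam_minus, d_plus, d_minus):
    # Mean trajectory in the absence of a threshold
    decay = math.exp(-t / tau)
    return x0 * decay + tau * (lam_plus * d_plus + lam_minus * d_minus) * (1 - decay)

# Hypothetical parameter values (ms, 1/ms, mV) chosen for illustration
PARAMS = dict(x0=0.0, tau=10.0, lam_plus=1.0, lam_minus=0.5, d_plus=0.5, d_minus=-0.5)
rng = random.Random(2024)
finals = [stein_path(20.0, rng=rng, **PARAMS)[0] for _ in range(4000)]
mc_mean = sum(finals) / len(finals)
exact = stein_mean(20.0, **PARAMS)
fpt_state, fpt_time = stein_path(1000.0, rng=rng, threshold=3.0, **PARAMS)
```

Because the potential only moves upward at excitatory jumps, a positive threshold can be crossed only at jump times, so no time discretization is needed to detect the first passage.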
5.3.5 Ornstein–Uhlenbeck Diffusion Model
The OU process (Chap. 1, p. 11) was proposed as a continuous limit of Stein’s model to facilitate the solution of the FPT problem. The rationale for this limit is the huge number of synapses characterizing certain neurons such as the Purkinje cells. The PSPs determine frequent small jumps and limit theorems can be applied to get a diffusion process. The OU process, already known in the physics literature [135], is both a Markov and a Gaussian stochastic process. Different approaches can be followed to obtain the diffusion limit of Stein’s model. In [69] the convergence of the measure of a Stein’s process to that of the OU process is studied. In [18, 106] it is proved that the continuous limit of the transition pdf for the process (5.20) converges to a transition pdf that verifies the Fokker–Planck equation for the OU process. Alternatively, the OU model can be derived from a differential equation describing the membrane potential dynamics. We sketch these last two approaches in the following. Due to the time continuity and to the Markov property of Stein’s process, the Smoluchowski equation holds for the transition pdf:
$$f(x, t+\Delta t\,|\,x_0,t_0) = \int_{-\infty}^{\infty} f(x, t+\Delta t\,|\,z,t)\, f(z,t\,|\,x_0,t_0)\, dz. \tag{5.23}$$
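In the absence of inputs to the neuron, the process (5.20), initially in the state $z$ at time $t$, decays exponentially to the zero resting potential, reaching at time $t + \Delta t$ the value $x_1 = z e^{-\Delta t/\tau}$. In the case of an excitatory or an inhibitory input at time $u \in (t, t+\Delta t)$ the potential becomes $x_2(u) = z e^{-\Delta t/\tau} + \delta^+ e^{-(t+\Delta t-u)/\tau}$ or $x_3(u) = z e^{-\Delta t/\tau} + \delta^- e^{-(t+\Delta t-u)/\tau}$, respectively. Setting $\bar x_i = \frac{1}{\Delta t}\int_t^{t+\Delta t} x_i(u)\,du$, $i = 2, 3$, the first factor in the integrand of (5.23) becomes:
$$f(x, t+\Delta t\,|\,z,t) = \big[1 - (\lambda^+ + \lambda^-)\Delta t\big]\,\delta(x - x_1) + \lambda^+\Delta t\,\delta(x - \bar x_2) + \lambda^-\Delta t\,\delta(x - \bar x_3) + o(\Delta t), \tag{5.24}$$
where $\delta(\cdot)$ denotes the Dirac delta function. Hence (5.23) becomes:
$$\begin{aligned}
f(x, t+\Delta t\,|\,x_0,t_0) = e^{\Delta t/\tau}\Big\{ &\big[1 - (\lambda^+ + \lambda^-)\Delta t\big]\, f\big(x e^{\Delta t/\tau}, t\,|\,x_0,t_0\big) \\
&+ \lambda^+\Delta t\, f\Big(x e^{\Delta t/\tau} - \frac{\delta^+\tau}{\Delta t}\big(e^{\Delta t/\tau} - 1\big), t\,\Big|\,x_0,t_0\Big) \\
&+ \lambda^-\Delta t\, f\Big(x e^{\Delta t/\tau} - \frac{\delta^-\tau}{\Delta t}\big(e^{\Delta t/\tau} - 1\big), t\,\Big|\,x_0,t_0\Big)\Big\} + o(\Delta t).
\end{aligned} \tag{5.25}$$
Approximating $e^{\Delta t/\tau} \sim \frac{\Delta t}{\tau} + 1$, dividing by $\Delta t$ and letting $\Delta t \to 0$, (5.25) becomes
$$\frac{\partial f(x,t\,|\,x_0,t_0)}{\partial t} = \frac{\partial}{\partial x}\Big[\frac{x}{\tau} f(x,t\,|\,x_0,t_0)\Big] + \lambda^+\big[f(x - \delta^+, t\,|\,x_0,t_0) - f(x,t\,|\,x_0,t_0)\big] + \lambda^-\big[f(x - \delta^-, t\,|\,x_0,t_0) - f(x,t\,|\,x_0,t_0)\big]. \tag{5.26}$$
Developing the terms in square brackets as a Taylor series around $x$:
$$\frac{\partial f(x,t\,|\,x_0,t_0)}{\partial t} = -\frac{\partial}{\partial x}\Big\{\Big[-\frac{x}{\tau} + \lambda^+\delta^+ + \lambda^-\delta^-\Big] f(x,t\,|\,x_0,t_0)\Big\} + \sum_{n=2}^{\infty} \frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}\Big\{\big[\lambda^+(\delta^+)^n + \lambda^-(\delta^-)^n\big] f(x,t\,|\,x_0,t_0)\Big\}. \tag{5.27}$$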
2 , 2ı 2
A ı
C
2 ; 2ı 2
with ; AC ; A
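\[
\frac{\partial f(x, t \mid x_0, t_0)}{\partial t} = -\frac{\partial}{\partial x}\left\{\left[-\frac{x}{\theta} + \mu\right] f(x, t \mid x_0, t_0)\right\} + \frac{\sigma^2}{2}\frac{\partial^2 f(x, t \mid x_0, t_0)}{\partial x^2}, \tag{5.28}
\]
i.e. the Fokker–Planck equation for an OU process with $\mu = A^+ - A^-$. Its solution with an initial delta condition (5.6) is the transition pdf
\[
f_{OU}(x, t \mid x_0, t_0) = \frac{\partial P(X_t < x \mid X_{t_0} = x_0)}{\partial x} = \frac{1}{\sqrt{\pi\sigma^2\theta\left(1 - e^{-2(t-t_0)/\theta}\right)}}\exp\left\{-\frac{\left[x - x_0 e^{-(t-t_0)/\theta} - \mu\theta\left(1 - e^{-(t-t_0)/\theta}\right)\right]^2}{\sigma^2\theta\left(1 - e^{-2(t-t_0)/\theta}\right)}\right\}. \tag{5.29}
\]
A finite volume method to approximate the solution of the time-dependent Fokker–Planck equation for the OU process in the presence of boundary conditions has been proposed in [88]. Such a method allows one to deal both with stationary and time dependent inputs. The OU process $X$ takes values on the real line, and the mean and variance with $X_0 = x_0$ are:
\[
E(X_t \mid x_0) = \mu\theta\left(1 - e^{-t/\theta}\right) + x_0 e^{-t/\theta}, \tag{5.30}
\]
\[
\mathrm{Var}(X_t \mid x_0) = \frac{\sigma^2\theta}{2}\left(1 - e^{-2t/\theta}\right). \tag{5.31}
\]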
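The moments (5.30) and (5.31) and the Gaussian density (5.29) can be evaluated directly. The following is a minimal sketch, not code from the cited references; the function names are illustrative and the time constant is written theta, as in the formulas above.

```python
import math

def ou_mean(t, x0, mu, theta):
    """Conditional mean (5.30) of the OU process started at x0."""
    decay = math.exp(-t / theta)
    return mu * theta * (1.0 - decay) + x0 * decay

def ou_var(t, sigma, theta):
    """Conditional variance (5.31); it does not depend on x0."""
    return 0.5 * sigma**2 * theta * (1.0 - math.exp(-2.0 * t / theta))

def ou_pdf(x, t, x0, mu, sigma, theta, t0=0.0):
    """Gaussian transition pdf (5.29), built from (5.30) and (5.31)."""
    m = ou_mean(t - t0, x0, mu, theta)
    v = ou_var(t - t0, sigma, theta)
    return math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
```

For large t the mean tends to mu * theta and the variance to sigma**2 * theta / 2, the quantities that separate the firing regimes discussed next.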
Fig. 5.1 Mean value (middle line) and curves $E(X_t) - 3\sqrt{\mathrm{Var}(X_t)}$ (lower line) and $E(X_t) + 3\sqrt{\mathrm{Var}(X_t)}$ (upper line) for an OU process with parameters $\mu = 0.8$ mV ms$^{-1}$ (Panel A), $\mu = 2.0$ mV ms$^{-1}$ (Panel B), $\sigma^2 = 0.2$ mV$^2$ ms$^{-1}$ and $\theta = 10$ ms. Horizontal axis: t (ms)
Properties of the models, as well as the range of validity of some approximate formulae for the FPT problem, depend upon the value of the asymptotic mean depolarization of the process $X$. Hence, we distinguish between two distinct firing regimes: subthreshold if $E(X_\infty) < S$ and suprathreshold in the opposite case. Models of the LIF type can be interpreted in the framework of threshold detectors theory. The presence of a feeble noise helps the detection of the signal, a characteristic of any threshold detector. In Fig. 5.1 we plot the mean value of an OU process (5.30) together with $E(X_t) \pm 3\sqrt{\mathrm{Var}(X_t)}$, making use of (5.31). The two panels correspond to examples in the subthreshold regime (Panel A) and in the suprathreshold regime (Panel B). The intrinsic random variability determines crossings even in the subthreshold regime. The OU model can also be obtained from the differential equation for the time evolution of the subthreshold membrane potential in the presence of spontaneous decay of parameter $\theta$ and net input $\mu$:
\[
\frac{dX_t}{dt} = -\frac{X_t}{\theta} + \mu; \quad X_0 = x_0. \tag{5.32}
\]
Adding a noise term of intensity $\sigma$ to account for the random PSPs, one gets:
\[
dX_t = \left(-\frac{X_t}{\theta} + \mu\right)dt + \sigma\, dW_t; \quad X_0 = x_0. \tag{5.33}
\]
This is the SDE of an OU process (Chap. 1, p. 11). The analytical expression for the FPT pdf of the OU process is still an open problem. In [4] three alternative representations of the distribution of $T$ are introduced. The first one involves an eigenvalue expansion in terms of zeros of the parabolic cylinder functions, while the second is an integral representation involving special functions. We report here the third one, which writes the FPT pdf of an OU process with $\mu = 0$ through the boundary $S$ in terms of a three-dimensional Bessel bridge:
\[
g(t) = e^{-(S^2 - x_0^2 - t)/(2\theta)}\, g_W(t)\, E_{Bb}\left[\exp\left(-\frac{1}{2\theta^2}\int_0^t (r_s - S)^2\, ds\right)\right]. \tag{5.34}
\]
Here $g_W(t)$ is the FPT pdf through the boundary $S$ for the standard Wiener process, $r_s$ is the three-dimensional Bessel bridge over the interval $[0, t]$ between $r_0 = 0$ and $r_t = S - x_0$. This process is solution of:
\[
dr_s = \left(\frac{y - r_s}{t - s} + \frac{1}{r_s}\right)ds + dW_s; \quad r_0 = x,\ s < t. \tag{5.35}
\]
In (5.34) $E_{Bb}$ indicates the expectation with respect to the Bessel bridge law. In [137] formula (5.34) is used to approximate the FPT pdf with Monte Carlo techniques. An explicit expression for the FPT density of continuous Gaussian processes to a general boundary is obtained under mild conditions in [36], while an expression for the FPT of the Wiener process to a curved boundary is expanded as a series of multiple integrals in [37]. Existing available closed form expressions include the case of a hyperbolic boundary [13]
\[
S(t) = \mu\theta + A e^{-t/\theta} + B e^{t/\theta}, \tag{5.36}
\]
with $A$ and $B$ arbitrary constants [100]. Furthermore specific boundaries can be obtained through the space time transformations described in Sect. 5.4.1.2 applied to closed form solutions for the case of the Wiener process. The Laplace transform of the FPT pdf in the case of a constant boundary $S$ is [101]:
\[
E\left[e^{-\lambda T}\right] = \exp\left\{\frac{(x_0 - \mu\theta)^2 - (S - \mu\theta)^2}{2\sigma^2\theta}\right\}\frac{D_{-\lambda\theta}\left[\sqrt{\frac{2}{\sigma^2\theta}}\,(\mu\theta - x_0)\right]}{D_{-\lambda\theta}\left[\sqrt{\frac{2}{\sigma^2\theta}}\,(\mu\theta - S)\right]}, \tag{5.37}
\]
where $D_\nu(\cdot)$ is the Parabolic Cylinder Function [2] of parameter $\nu$. No analytical inversion formula is available for (5.37). Reliable and efficient procedures, discussed in Sect. 5.4, can be applied to obtain the FPT pdf either numerically or by means of simulations for constant or time dependent boundaries. The FPT mean has been determined as the derivative of (5.37), computed at $\lambda = 0$ [101]:
\[
E[T] = \theta\left\{\frac{1}{2}\sum_{n=1}^{\infty}\frac{x_S^{2n}}{n\,(2n-1)!!} - \frac{1}{2}\sum_{n=1}^{\infty}\frac{x_1^{2n}}{n\,(2n-1)!!} + \sqrt{\frac{\pi}{2}}\left[x_1\,\Phi\!\left(\frac{1}{2}, \frac{3}{2}; \frac{x_1^2}{2}\right) - x_S\,\Phi\!\left(\frac{1}{2}, \frac{3}{2}; \frac{x_S^2}{2}\right)\right]\right\}, \tag{5.38}
\]
where $x_1 = (\mu\theta - x_0)\sqrt{2/(\sigma^2\theta)}$, $x_S = (\mu\theta - S)\sqrt{2/(\sigma^2\theta)}$ and $\Phi(a, c; z)$ is the Kummer function [2]. The double factorial is defined as $(2n-1)!! = (2n-1)(2n-3)\cdots 1$. Alternatively, the mean is expressed through the Siegert formula [129]:
\[
E[T] = \theta\sqrt{\frac{\pi}{2}}\int_{-x_1}^{-x_S}\left[1 + \mathrm{Erf}\left(\frac{z}{\sqrt{2}}\right)\right]\exp\left(\frac{z^2}{2}\right)dz, \tag{5.39}
\]
where $\mathrm{Erf}(\cdot)$ denotes the error function [2]. Use of (5.38) or (5.39) depends on the value of the parameters since the two formulae present numerical difficulties for different ranges. Approximate formulae [79] hold for specific ranges. If $\mu\theta > S$ and $\sigma \to 0$, i.e. in the quasi-deterministic case, the mean FPT can be approximated by equating the expression of $E(X_t \mid x_0)$ with $S$ to obtain [78]:
\[
E[T] \approx -\theta\ln\left(\frac{S - \mu\theta}{x_0 - \mu\theta}\right). \tag{5.40}
\]
Note that (5.40) disregards the effect of the noise on the crossings. If $x_0 \ll S$, or equivalently if $\sigma$ is sufficiently small and $\mu$ is negative so that the crossing is a rare event, the approximation
\[
E[T] \approx \frac{\sigma\sqrt{\pi\theta^3}}{S - \mu\theta}\exp\left\{\frac{(S - \mu\theta)^2}{\sigma^2\theta}\right\} \tag{5.41}
\]
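The agreement between the series (5.38) and the Siegert integral (5.39) can be checked numerically for moderate parameter values. The sketch below is only illustrative: it evaluates (5.39) after the substitution $z \to z\sqrt{2}$ with a composite Simpson rule, and it sums the Kummer function in (5.38) through the power series of $\int_0^z e^{t^2}dt$ instead of calling a special-function library. The function names, the number of quadrature nodes and the truncation order are assumptions, not part of the cited formulae.

```python
import math

def mean_fpt_siegert(x0, S, mu, sigma, theta, n=2000):
    """Mean FPT from (5.39), in the variable z/sqrt(2), by composite Simpson (n even)."""
    a = (x0 - mu * theta) / (sigma * math.sqrt(theta))
    b = (S - mu * theta) / (sigma * math.sqrt(theta))

    def integrand(z):
        # 1 + Erf(z) is written as erfc(-z) to avoid cancellation for very negative z
        return math.erfc(-z) * math.exp(z * z)

    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return theta * math.sqrt(math.pi) * total * h / 3.0

def mean_fpt_series(x0, S, mu, sigma, theta, terms=200):
    """Mean FPT from the series (5.38), truncated after `terms` terms."""
    c = math.sqrt(2.0 / (sigma**2 * theta))
    x1, xS = (mu * theta - x0) * c, (mu * theta - S) * c

    def even_part(x):
        # (1/2) * sum_{n>=1} x^(2n) / (n (2n-1)!!)
        s, term = 0.0, 1.0
        for n in range(1, terms + 1):
            term *= x * x / (2 * n - 1)   # term = x^(2n) / (2n-1)!!
            s += term / n
        return 0.5 * s

    def odd_part(x):
        # sqrt(pi/2) * x * Phi(1/2, 3/2; x^2/2) = sqrt(pi) * int_0^{x/sqrt(2)} exp(t^2) dt
        z = x / math.sqrt(2.0)
        s, term = 0.0, z
        for k in range(terms):
            s += term / (2 * k + 1)       # term = z^(2k+1) / k!
            term *= z * z / (k + 1)
        return math.sqrt(math.pi) * s

    return theta * (even_part(xS) - even_part(x1) + odd_part(x1) - odd_part(xS))
```

When $|x_1|$ or $|x_S|$ is large the series terms grow like $e^{x^2/2}$ before cancelling, which is one instance of the numerical difficulties mentioned above; the quadrature form is then the safer choice.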
holds [45]. A linear approximation for the firing rate $f = 1/E[T]$, obtained using (5.38), is [79]:
\[
f(\mu) = \frac{1}{\pi\theta S}\left(\sigma\sqrt{\pi\theta} + 2\mu\theta - S\right). \tag{5.42}
\]
This approximation holds when $\mu\sqrt{\theta}/\sigma$ and $S/(\sigma\sqrt{\theta})$ are small enough. When neither (5.38) nor (5.39) is suitable for computations and $\mu\theta \gg S$ but $\sigma$ is not small enough to apply approximation (5.40), an "ad hoc" procedure to evaluate the mean FPT is possible. One establishes at first the time $t_1$ at which $E(X_t) + 2\sqrt{\mathrm{Var}(X_t)}$ first crosses the threshold $S$, i.e. when most trajectories are still below the threshold. For $t > t_1$ the process is then approximated by means of the Wiener process with drift and initial value $E(X_{t_1})$.
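The second moment of the FPT for the OU process is [91]:
\[
\begin{aligned}
E[T^2] = {} & 2\theta E[T]\left\{\psi_1\!\left(\frac{x_S}{\sqrt{2}}\right) - \sqrt{\pi}\,\varphi_1\!\left(\frac{x_S}{\sqrt{2}}\right)\right\} \\
& + 2\theta^2\left\{\sqrt{\pi}\ln 2\left[\varphi_1\!\left(\frac{x_1}{\sqrt{2}}\right) - \varphi_1\!\left(\frac{x_S}{\sqrt{2}}\right)\right] + \sqrt{\pi}\left[\varphi_2\!\left(\frac{x_S}{\sqrt{2}}\right) - \varphi_2\!\left(\frac{x_1}{\sqrt{2}}\right)\right] - \psi_2\!\left(\frac{x_S}{\sqrt{2}}\right) + \psi_2\!\left(\frac{x_1}{\sqrt{2}}\right)\right\}
\end{aligned} \tag{5.43}
\]
where $x_1$, $x_S$ are defined as in (5.38) and
\[
\begin{aligned}
\varphi_1(z) &= \int_0^z e^{t^2}dt = \sum_{k=0}^{\infty}\frac{z^{2k+1}}{k!\,(2k+1)}; \\
\varphi_2(z) &= \sum_{n=0}^{\infty}\frac{z^{2n+3}}{(n+1)!\,(2n+3)}\sum_{k=0}^{n}\frac{1}{2k+1}; \\
\psi_1(z) &= 2\int_0^z e^{u^2}\int_0^u e^{-v^2}dv\,du = \sum_{k=0}^{\infty}\frac{2^k z^{2k+2}}{(2k+1)!!\,(k+1)}; \\
\psi_2(z) &= \sum_{n=0}^{\infty}\frac{2^n z^{2n+4}}{(2n+3)!!\,(n+2)}\sum_{k=0}^{n}\frac{1}{k+1}.
\end{aligned} \tag{5.44}
\]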
In [20] the mean, variance and skewness of the FPT for the OU process are tabulated for neurobiologically compatible choices of the parameters. Asymptotic results for the FPT of the OU process are presented in Sect. 5.4.1.3.
5.3.6 Reversal Potential Models The diffusion interval of the OU process is the real line but large negative values of the membrane potential are unrealistic. Hence, other models introduce a saturation effect on the membrane sensibility. When the value of the membrane potential is close to the reversal potential VI , the incoming inputs produce a reduced effect [43,76]. A diffusion model with reversal potentials is proposed in [43] as a diffusion limit on a birth and death process. A similar diffusion limit is obtained in [76] from a variant of Stein’s model (5.20) where an inhibitory reversal potential VI is introduced: i h p 1 d Yt D Yt dt C ı C dN C t C " .Yt VI / C Yt VI dN t I Y0 D y0 : (5.45)
Here NtC , Nt ı C and are the same as in (5.20), " 2 .1; 0/, VI < x0 are two constants and is a suitably defined random variable. The first two infinitesimal moments (5.3) of this model are: 1 " C ı C C " VI I .y/ D y 2 2 .y/ D C ı C C "2 .y VI /2 C Var ./ .y VI / :
(5.46)
The mean trajectory of the process originated in Y0 D y0 is
C ı C " VI 1 t . 1 " / 1 e E .Yt jy0 / D x0 e t . " / C : 1 "
(5.47)
The diffusion limit of this model (known in the neurobiological literature as the Feller model and in other contexts as the Cox–Ingersoll–Ross process or the square-root model, see Chap. 1, p. 12, or Chap. 2, Example 2.3) is identified with the solution to the SDE [76]
\[
dY_t = \left(-\frac{Y_t}{\tau} + \mu_2\right)dt + \sigma_2\sqrt{Y_t - V_I}\, dW_t; \quad Y_0 = y_0. \tag{5.48}
\]
Here the constants 2 ; 2 and are related with those of model (5.45) by imposing the equality of the infinitesimal moments. One first sets D 1" and C C C C 2 D ı " VI D ı I where I D "VI , then substitutes the variable in (5.45) by a suitable sequence of r.v. such that in the limit for n ! 1 one gets Var.n / D 0. This choice allows one to obtain the same infinitesimal variance at the resting level for the two processes. Hence one gets 22 D
2 C .ı C /2 C . I/ : VI
(5.49)
Note that, due to the expressions for and 2 , the parameters and 2 appearing in the SDE (5.48) bear a different meaning here with respect to the corresponding parameters and in the OU model. Furthermore, < . The diffusion coefficient of the process (5.48) becomes negative if Xt < VI , hence the state space is I D ŒVI ; 1/. The boundary in VI is regular or exit, depending upon the values of 2 and 2 , according to the Feller classification of boundaries [70], see also Chap. 2, Example 2.3. To determine the transition probability density of the process (5.48), a boundary condition should be added. The natural choice, that respects the model features, is the reflecting condition:
lim
x!VI
.x/f .x; t jx0 ; t0 /
@ 2 .x/f .x; t jx0 ; t0 / D 0: @x
(5.50)
The Feller process is generally known in its standardized form
\[
dX_t = \left(-\frac{X_t}{\tau} + \mu_F\right)dt + \sigma_2\sqrt{X_t}\, dW_t; \quad X_0 = y_0 - V_I, \tag{5.51}
\]
that can be easily obtained from (5.48) by performing the space transformation $Y_t \to X_t = Y_t - V_I$ and by setting $\mu_F = \mu_2 - V_I/\tau$. This equation is defined over $I = [0, \infty)$. A different notation for the parameters of the Feller process, largely employed in the literature, sets:
\[
p \equiv \frac{1}{\tau}; \quad q \equiv \mu_F; \quad r \equiv \frac{\sigma_2^2}{2}. \tag{5.52}
\]
The transition pdf of the Feller process $X$ depends upon the nature of the lower boundary $x = 0$ for the process (5.51) or $x = V_I$ for the process (5.48) and on the selected boundary condition for the solution of its Kolmogorov equation [70]. If we impose a zero-flux condition (5.50) at the origin, using the notation (5.52), we obtain the transition pdf [43]:
\[
f_{Fe}(x, t \mid x_0, t_0) = \frac{\partial P(X_t < x \mid X_{t_0} = x_0)}{\partial x} = \frac{p}{r\left(1 - e^{-p(t-t_0)}\right)}\left[\frac{x\, e^{p(t-t_0)}}{x_0}\right]^{\frac{q-r}{2r}}\exp\left\{-\frac{p\left(x + x_0 e^{-p(t-t_0)}\right)}{r\left(1 - e^{-p(t-t_0)}\right)}\right\} I_{\frac{q}{r}-1}\!\left[\frac{2p\sqrt{x x_0 e^{p(t-t_0)}}}{r\left(e^{p(t-t_0)} - 1\right)}\right]. \tag{5.53}
\]
Here $I_\nu(z)$ indicates the modified Bessel function of the first kind [2] of parameter $\nu$. The mean trajectory of the process (5.48) originated in $X_0 = x_0$ is
\[
E(X_t \mid x_0) = x_0 e^{-t/\tau} + \mu_2\tau\left(1 - e^{-t/\tau}\right), \tag{5.54}
\]
while its variance is
\[
\mathrm{Var}(X_t \mid x_0) = \frac{\sigma_2^2\tau}{2}\left(\mu_2\tau - V_I\right)\left(1 - e^{-t/\tau}\right)^2 + \sigma_2^2\tau\,(x_0 - V_I)\, e^{-t/\tau}\left(1 - e^{-t/\tau}\right). \tag{5.55}
\]
The FPT pdf of the Feller process cannot be obtained in a closed form but it can be evaluated by employing the methods described in Sect. 5.4. Furthermore, its Laplace transform is [42]:
\[
g_\lambda(S \mid x_0) = \frac{\Phi\!\left(\frac{\lambda}{p}, \frac{q}{r}; \frac{p x_0}{r}\right)}{\Phi\!\left(\frac{\lambda}{p}, \frac{q}{r}; \frac{p S}{r}\right)}. \tag{5.56}
\]
Here $\Phi$ denotes the Kummer function [2] and the notation in (5.52) has been employed. The mean firing time, when $x_0 < S < V_E$, is [43]:
\[
E[T] = \frac{\tau}{\mu_2\tau - V_I}\left(S - x_0 + \sum_{n=1}^{\infty}\frac{(S - V_I)^{n+1} - (x_0 - V_I)^{n+1}}{(n+1)\prod_{i=1}^{n}\left(\mu_2\tau - V_I + i\sigma_2^2\tau/2\right)}\right). \tag{5.57}
\]
If $\mu_2\tau \gg S$ and $\sigma_2$ is suitably small, the mean FPT can be approximated with a formula analogous to (5.40) for the OU process. Indeed it holds:
\[
E[T] \approx -\tau\ln\left(\frac{S - \mu_2\tau}{x_0 - \mu_2\tau}\right). \tag{5.58}
\]
When the crossing is a rare event, i.e. x0 S or 2 is small, a result analogous to (5.41) can be derived [45]: EŒT
S VI 2
S VI
22 4
2 .2 VI / 22
22 2 .S VI /
2.2 2VI / 2
e
2.S VI / 22
:
(5.59)
Here denotes the Gamma function [2]. The second moment of the FPT has been obtained in [43]: 2 EŒT 2 D
2EŒT 6 4 S VI C 2 VI 1 X 2 kD1
k X j D1
(
2
3 1 X kD1
.S VI / Qk h .k C 1/ i D1 2 VI C
.S VI /kC1 .x0 VI /kC1 .2 VI /.k C 1/
kC1
i 22 2
7 i 5(5.60)
) 1 k Y i22 j 2 VI C : 2 i D1
The expression for the third moment can be found in [106]. A further variant of Stein’s model was proposed in [82] with two reversal potentials, an inhibitory (lower) one VI and an excitatory (upper) one VE : 1 dX t D Xt dt C ı C .VE Xt /dN C t C Œı .Xt VI / i p J .VE Xt /.Xt VI / dN t X0 D x0 :
(5.61)
Here the two independent Poisson processes N C and N have intensities C and , respectively, 1 < ı < 0 < ı C < 1, J is a r.v. defined over the interval .1 ı ; ı / such that E.J / D 0. For a sequence of models (5.61) indexed by n, one can assume that ınC ; ın ! 0 , C n ; n ! 1 in such a way that
2 ınC C n ! 0, ın n ! 0. Simultaneously, E.Jn / ! 0C in such a way that 2 2 n E.Jn / ! 3 > 0. This allows to obtain the diffusion approximation
p Xt dX t D C 3 dt C 3 .Xt VI / .VE Xt /dW t I X0 D x0 3
(5.62)
where the new constants 3 and 3 D VE VI have been introduced. The state space of this process is ŒVI ; VE and its transition pdf is solution of the Fokker–Planck equation with initial delta condition and with reflecting conditions of the type (5.50) on both boundaries VI and VE . The first two moments of the membrane potential [82] are: E.Xt jx0 / D x0 e t =3 C 3 3 .1 e t =3 /;
(5.63)
(
2 ˇ 2ˇ C 32 ˛t .ˇ ˛/ 2ˇ C 3 Ce (5.64) ˛ 2˛ C 32 ˛ ˛ C 32 2 x0 VI ˛t 32 t .ˇ ˛/ 2ˇ 2˛ C 3 ˛t 1 Ce e VE VI 2˛ C 32 ˛ C 32 ! ) 2 x0 VI 2 2ˇ C 32 ˛t 32 t 2˛2ˇC3 2˛t 32 t Ce : 1 Ce VE VI ˛ C 32 ˛ C 32
E Xt2 jx0 D .VE VI /2
Here ˛ D 1=3 and ˇ D .˛VI C 3 / = .VE VI / and a typo in [82] is corrected. Use of (5.63) and (5.64) allows the computation of Var.Xt jx0 /. The mean firing time through a boundary S < VE is: EŒT D C
S x0 ˇ.VE VI /
(5.65)
1 X .2ˇ=32 C 1/ .2˛=32 C 1/ .S VI /nC2 .x0 VI /nC2 : .2ˇ=32 C n C 2/ .2˛=32 / ˇ .n C 2/ .VE VI /nC2 nD0
If the boundary crossing is a rare event, a result analoguous to (5.41) and (5.59) holds [82]: S x0 ˛.S C x0 2VI / EŒT 1C : (5.66) ˇ .VE VI / .2ˇ C 32 /.VE VI / When 3 3 > S , one gets a result analogous to (5.40) and (5.58):
S 3 3 EŒT 3 ln x0 3 3
:
(5.67)
Fig. 5.2 Sample paths of the OU (dashed line), the Feller (dotted line) and the double reversal potential (continuous line) models employing the same leading Wiener process realization. Here $\mu = \mu_2 = \mu_3 = 1$ mV ms$^{-1}$, $\sigma^2 = 0.9$ mV$^2$ ms$^{-1}$, $V_I = -10$ mV, $V_E = 30$ mV. Axes: $X_t$ versus t (ms)
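A comparison of this kind can be sketched with an Euler–Maruyama discretization in which the OU and the Feller paths are driven by the same Wiener increments. The code below is an illustration under stated assumptions, not the procedure used to produce Fig. 5.2: the default values follow the caption, both time constants are set to 10 ms and the Feller noise is scaled so that $\sigma_2^2 |V_I| = \sigma^2$, as in Sect. 5.3.7; the clipping at $V_I$ is an ad hoc guard against discretization overshoot and the function name is hypothetical.

```python
import math
import random

def simulate_ou_feller(T=100.0, dt=0.01, mu=1.0, sigma2=0.9, theta=10.0,
                       v_i=-10.0, x0=0.0, seed=0):
    """Euler-Maruyama paths of the OU and Feller models on shared Wiener increments."""
    rng = random.Random(seed)
    n = int(round(T / dt))
    sigma = math.sqrt(sigma2)
    sigma_f = math.sqrt(sigma2 / abs(v_i))   # same variability at the resting level 0
    ou, fe = [x0], [x0]
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))   # one increment, used by both models
        x, y = ou[-1], fe[-1]
        ou.append(x + (-x / theta + mu) * dt + sigma * dw)
        y_new = y + (-y / theta + mu) * dt + sigma_f * math.sqrt(max(y - v_i, 0.0)) * dw
        fe.append(max(y_new, v_i))           # keep the Feller path inside [V_I, infinity)
    return ou, fe
```

Setting sigma2 to zero makes both paths collapse onto the common deterministic trajectory, which is a quick sanity check of the discretization.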
5.3.7 Comparison Between Different LIF Models

The mathematical complexity of the FPT problem increases with the attempts to make the models more realistic. However, it is certainly desirable to avoid the use of complex models when they do not add any improvement with respect to the simpler ones. In Fig. 5.2 we compare sample paths from the OU, the Feller process and the process with double reversal potential, simulated using the Euler–Maruyama scheme (Chap. 1, Sect. 1.7.1) on the same leading Wiener process trajectory. Furthermore, we consider $\theta = \tau = 10$ ms and we choose the same level of variability at the resting level for all models. Hence, $\sigma_2^2 |V_I| = \sigma^2$ and $\sigma_3^2 V_E |V_I| = \sigma^2$. Since the three processes do not show relevant discrepancies, the first two models are preferable due to their better computational tractability. When one wishes to compare the FPT pdfs one gets different results according to the criterion selected for the parameter values. In [78] the OU and the Feller ISIs, computed through (5.114), are compared, according to three different criteria:
• To get the same values for their corresponding discrete versions: $\mu = \mu_2$; $\sigma^2 = -\sigma_2^2 V_I$; $\theta$ and $\tau$ chosen accordingly (Fig. 5.3A).
Fig. 5.3 Comparisons between the OU (continuous line) and the Feller (dashed lines) models. Panel A: VI D 10 mV, S D 10 mV, D 6:2 ms; parameters controlling the PSP sizes and the intensities of the input processes: ae D iI D 2 mV, D 0:2, D 8= Š 1:290 ms1 , ! D 4= Š 0:641 ms1 . Panel B: x0 D 0 mV, VI D 10 mV, S D 10 mV, D D 6:2 mV, 2 D 4; 9 and 16 mV2 ms1 (from bottom to top), D F D 0 mVms1 . Panel C: Feller model y0 D 0 mV, S D 5 mV, VI D 10 mV, D 6:2 ms, 2 D 0 mVms1 , 22 D 4 mVms1 ; OU model x0 D 0 mV, S D 5 mV, D 6:2 ms, D 0:799 mVms1 , 2 D 48:03 mV2 ms1
• To get the same mean trajectory for both models (Fig. 5.3B): $\theta = \tau$; $\mu = \mu_2$; $\sigma = \sigma_2\sqrt{-V_I}$.
• To get almost equal FPT densities. To this aim, one fixes the parameters for one model and determines a set of parameters, reproducing a similar ISI distribution, for the other model (Fig. 5.3C). To guess possible sets of parameters for the second process we impose equality of mean and variance of their FPTs.

The last case illustrates the situation where a histogram of experimentally obtained ISIs can be fitted by either of the two model distributions. The use of the more complex Feller model seems preferable when one has data of the membrane potential between consecutive spikes [78]. When only the ISI distribution is available, both models fit the data. In [119] qualitative comparisons between the OU and the Feller processes, obtained through the stochastic ordering techniques [120] in Sect. 5.4.1.5, are presented. In [118] the same techniques are used for a sensitivity analysis on the parameters of the FPT pdf. Stochastic ordering properties of the FPTs are used in [28] to select the model. Membrane potential data analyzed in [64] show that the same neuron, under different experimental conditions, can be described either by the OU model, by the Feller model or by an alternative model with a quadratic diffusion coefficient.
5.3.8 Jump Diffusion Models

To perform the diffusion limits in Sects. 5.3.5 and 5.3.6, it was assumed that all the contributions to the changes in the membrane potential were of the same small amplitude and that their frequency was large enough [18]. However, PSPs impinging on the soma can play a different role with respect to the contributions on different areas of the neuron. LIF models where the subthreshold membrane potential dynamics is described by jump diffusion processes allow one to separate inputs according to how strong they are. Jump diffusion models can be obtained from a variant of the Stein-type model:
\[
dX_t = -\frac{X_t}{\theta}dt + \sum_{j=1}^{n}\delta_j^+ dN_t^{+,j} + \sum_{k=1}^{m}\delta_k^- dN_t^{-,k} + \delta^e dN_t^e + \delta^i dN_t^i; \quad X_{t_0} = x_0. \tag{5.68}
\]
Here $N_t^e$, $N_t^i$ are independent Poisson processes of parameters $\lambda^e$ and $\lambda^i$ and amplitudes $\delta^e$ and $\delta^i$ accounting for the strong contributions. $N_t^{+,j}$, $N_t^{-,k}$ are independent Poisson processes of parameters $\lambda_j^+$ and $\lambda_k^-$, independent from $N_t^e$ and $N_t^i$. If $\delta_j^+, \delta_k^- \to 0$ and at the same time $\lambda_j^+, \lambda_k^- \to \infty$ so that $\delta_j^+\lambda_j^+ + \delta_j^-\lambda_j^- \to \mu$ and $(\delta_j^+)^2\lambda_j^+ + (\delta_j^-)^2\lambda_j^- \to \sigma^2$, a diffusion approximation can be performed to get a process solution of the SDE:
\[
dX_t = \left(-\frac{X_t}{\theta} + \mu\right)dt + \sigma\, dW_t + \delta^e dN_t^e + \delta^i dN_t^i; \quad X_{t_0} = x_0 \tag{5.69}
\]
where $W$ is independent from $N_t^e$ and $N_t^i$. The model (5.69) is a jump diffusion process with an OU underlying diffusion. Other jump diffusion models may be obtained introducing reversal potentials. All these models are of LIF type, requiring the superposition of a boundary $S$ to mimic the spike activity. The crossings can occur either during diffusion or during an upward jump when $X \in (S - \delta^e, S)$. Hence, the spike time is the time of first entrance (FET) into the strip $(S, \infty)$. The cases of underlying Wiener with drift and OU processes have been considered in [55, 81, 116, 130]. The exponential distribution for the jump epochs preserves the Markov property of the process (5.69). In [116, 117] the case of IG distributed jump epochs is discussed, but the resulting process is no longer Markov. To compute the ISI distribution for IG and exponential time distributed jumps one resorts to simulation techniques. Unlike the unimodal ISI distribution of diffusion models, that of jump diffusions has a multimodal shape (Fig. 5.4). The only analytical results on the FET problem for jump diffusions refer to an underlying Wiener process with constant boundary. Lower bounds are proposed in [29] for the FET density and in [55] for the FET mean and variance, together with exact formulae for the specific case of large jumps, when the jumps are driven by
Fig. 5.4 Examples of ISI distributions for jump-diffusion processes (5.69); underlying OU diffusion process with different parameters for Panels A and B, C and D. Panel A: IG time distributed jumps. Panel C: Exponential jump interarrivals. Panel B, Panel D: ISI distribution for pure OU diffusion, parameters as in Panels A and C respectively
a counting process. An approximate solution to an integral equation for the FET density of a jump diffusion process is discussed in [56] for the Wiener process.
5.3.9 Boundary Shapes

Constant thresholds are typically employed in LIF models because of their greater mathematical tractability. However, the refractory period following each spike has been modelled by means of threshold shapes [52, 100]. These boundaries attain very high values after a spike, then decrease under the reference value and finally oscillate around a constant value (Fig. 5.5A). In [21] a dynamic threshold obeying a differential equation is considered for the same aim. A boundary which is a linear combination of two exponentials with different time constants is proposed in [73], to fit experimental data. The use of this boundary, together with the lack of the resetting of the membrane potential after a spike, allows for a very good fit of the data. A computational method that can reproduce and predict a variety of spike responses has been devised in [74] using a multi-timescale adaptive threshold predictor and a nonresetting leaky integrator.
In [22, 23] thresholds with fatigue are proposed to account for experimental data showing a progressive decrease of excitability during high frequency firing. This type of threshold destroys the renewal and Markov character of the process, but allows one to describe adaptation phenomena through LIF models. The inclusion of time dependent boundaries prevents the use of many of the mathematical methods described in the next section. However, reliable numerical and simulation techniques can be applied (cf. Sect. 5.4). Finally, the study of periodic boundaries or of noisy thresholds [87] becomes a useful mathematical method to deal with periodic inputs. Indeed, one transforms the original process with time periodic drift and constant boundary into a time homogeneous diffusion process, constrained by a periodic absorbing boundary [93, 99, 128]. To illustrate this idea, consider an OU model with periodic input of frequency $\omega$, phase $\varphi$ and amplitude $A$. The SDE (5.33) for this LIF model is
\[
dX_t = \left(-\frac{X_t}{\theta} + \mu + A\sin(\omega t + \varphi)\right)dt + \sigma\, dW_t; \quad X_0 = x_0 \tag{5.70}
\]
with $x_0 < S$. $X$ is not an OU process; however, the change of space variable
\[
Y_t = X_t - \frac{A\theta}{\sqrt{1 + (\omega\theta)^2}}\left[\sin(\omega t + \varphi - \xi) - e^{-t/\theta}\sin(\varphi - \xi)\right] \tag{5.71}
\]
with $\xi = \arctan(\omega\theta)$ transforms (5.70) into the SDE of an OU process with parameters $\mu$, $\theta$ and $\sigma$, $y_0 = x_0$, and the constant boundary $S$ into
\[
S(t) = S - \frac{A\theta}{\sqrt{1 + (\omega\theta)^2}}\left[\sin(\omega t + \varphi - \xi) - e^{-t/\theta}\sin(\varphi - \xi)\right]. \tag{5.72}
\]
The ISIs of the periodically modulated LIF model with constant threshold are distributed as the ISIs of a LIF model with constant input and appropriate time-dependent threshold (Fig. 5.5).
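The time-dependent threshold (5.72) is easy to tabulate. The sketch below reads the subtracted term in (5.71) as the solution $h$ of $h' = -h/\theta + A\sin(\omega t + \varphi)$ with $h(0) = 0$, which is the property that removes the periodic drift; the function names are illustrative and the phase shift is written xi.

```python
import math

def periodic_shift(t, A, omega, phi, theta):
    """Deterministic term subtracted in (5.71); it vanishes at t = 0."""
    xi = math.atan(omega * theta)
    amp = A * theta / math.sqrt(1.0 + (omega * theta) ** 2)
    return amp * (math.sin(omega * t + phi - xi)
                  - math.exp(-t / theta) * math.sin(phi - xi))

def transformed_threshold(t, S, A, omega, phi, theta):
    """Time-dependent boundary (5.72) for the transformed OU process."""
    return S - periodic_shift(t, A, omega, phi, theta)
```

Since the shift vanishes at $t = 0$, the transformed boundary starts from $S$, and for $t \gg \theta$ it oscillates around $S$ with amplitude $A\theta/\sqrt{1 + (\omega\theta)^2}$.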
5.3.10 Further Models New efforts on LIF models are mainly devoted to the study of input-output relationships or to the analysis of small neuronal networks with units described by LIF models [130]. New variants of LIF models have recently appeared in the literature to catch further features such as plasticity [41] or to improve their flexibility and their predictive power [12, 15, 24]. We quote also the LIF compartmental models [77, 108–110] that extend the one dimensional models by introducing systems of SDEs to describe the dynamics of different components of the neuron, such as the dendritic and the soma zone in the case of two compartmental models. These models
Fig. 5.5 Panel A: Time dependent threshold $S(t) = 10 + 3e^{-t} - 0.6\,e^{-t/100}\sin(t/100)$. Panel B: Threshold of the transformed process for an OU process with additional sinusoidal term in the drift coefficient. Panel C: Simulated FPT pdf for the same process. Axes: $S(t)$ (mV) versus t (ms) in Panels A and B, $g(t)$ versus t (ms) in Panel C
require the study of the FPT of one component through a boundary to describe the ISIs. Up to now this problem can be handled only through simulations.
5.3.11 Refractoriness and Return Process Models

An alternative approach to the study of spike trains focuses on the number of spikes in a prescribed time interval. This study allows one to introduce the refractoriness of the neuron in a quite natural way. To this aim one can associate a return process $Z = \{Z_t;\ t \ge 0\}$ in the interval $(l, S)$, $S \in I$, to any regular diffusion process $X = \{X_t : t \ge 0\}$ on $I = (l, r)$ as follows. Starting at $x_0 \in (l, S)$ at time 0, the process $Z$ coincides with $X$ until it attains the level $S$. At this time it is blocked on the boundary $S$ for a random time and no new crossings can occur during this refractory period. Then $Z$ and $X$ are instantaneously reset to $x_0$ and the process $Z$ evolves as $X$ until the boundary $S$ is reached again, and so on. We show in Fig. 5.6 a sample path of this process when the refractory period is constant. Let $F_i$ and $R_i$, $i = 0, 1, \ldots$, be the r.v. describing the time between the $i$-th reset and the $(i+1)$-th crossing and the $i$-th refractory period. For time-homogeneous diffusions, the r.v. $F_i$ are iid with pdf $g(S, t \mid x_0)$. It is also assumed that the r.v.
Fig. 5.6 Sample path of a return process Z with a constant refractory period
$R_i$, $i = 0, 1, \ldots$, are iid with pdf $\varphi(t)$ depending only on the duration of the refractory period. A counting process $M = \{M_t;\ t \ge 0\}$ can be introduced to describe the number of attainments of the level $S$ by the process $Z$ up to time $t$. Let
\[
q_k(t \mid x_0) = P\{M_t = k \mid Z_0 = x_0\}, \quad k = 0, 1, \ldots \tag{5.73}
\]
be the probability that $k$ firings occur up to $t$. Then [107]:
\[
q_0(t \mid x_0) = 1 - \int_0^t g(S, \tau \mid x_0)\, d\tau, \tag{5.74}
\]
\[
q_k(t \mid x_0) = \left[g(S, t \mid x_0) * \varphi(t)\right]^{(k)} * \left[1 - \int_0^t g(S, \tau \mid x_0)\, d\tau\right] + g(S, t \mid x_0) * \left[\varphi(t) * g(S, t \mid x_0)\right]^{(k-1)} * \left[1 - \int_0^t \varphi(\tau)\, d\tau\right], \quad k = 1, 2, \ldots \tag{5.75}
\]
where $*$ means convolution and the exponent $(k)$ denotes $k$-fold convolution. Expressions for such probabilities have been obtained in [3] for the Wiener process for exponentially distributed refractory periods. In the general time homogeneous case
\[
E\{M_t^r \mid x_0\} = \sum_{k \ge 1} k^r q_k(t \mid x_0), \quad r = 1, 2, \ldots \tag{5.76}
\]
is the $r$-th order moment of $M$. Let $I_i$, $i = 0, 1, 2, \ldots$ be the r.v. describing the ISIs and let $I_0$ be the time of the first firing. One has [47]:
\[
\begin{aligned}
E(I) &= t_1(S \mid x_0) + E(R); \\
E(I^2) &= t_2(S \mid x_0) + E(R^2) + 2E(R)\, t_1(S \mid x_0); \\
E(I^3) &= t_3(S \mid x_0) + E(R^3) + 3E(R)\, t_2(S \mid x_0) + 3E(R^2)\, t_1(S \mid x_0)
\end{aligned} \tag{5.77}
\]
where tr .S jx0 / is the r-th order moment of the FPT. If the first three moments of the refractory period are finite, the following asymptotic expressions for large times hold for the first two moments of M [47]: 1 E.I 2 / t1 .S jx0 / 1 tC I E.I / 2 E2 .I / E.I / n o 1 2 1 2t1 .S jx0 / 2E.I 2 / E ŒMt 2 jx0 Š 2 t t C E .I / E3 .I / E.I / E2 .I / E fMt jx0 g Š
C
3 E2 .I 2 / 2 E.I 3 / 1 E.I 2 / t1 .S jx0 / C 4 3 2 2 E .I / 3 E .I / 2 E .I / E.I /
C
t2 .S jx0 / 2E.I 2 / 3 t1 .S jx0 /: E2 .I / E .I /
(5.78) (5.79)
In [46, 105] the case of absence of a refractory period and that of a random distribution for the return value are discussed for the Wiener, OU and Feller processes. Alternatively, [14] proposes to model refractoriness through return processes characterized by an elastic boundary as firing threshold.
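The ISI moments (5.77) follow from writing each ISI as the sum of an independent FPT and refractory period, and they can be coded in a few lines. The sketch below also evaluates the large-time mean of the counting process, reading the first expression in (5.78) as the standard delayed-renewal expansion $t/E(I) + E(I^2)/(2E^2(I)) - t_1(S \mid x_0)/E(I)$; that reading, like the function names, is an assumption made for illustration.

```python
def isi_moments(t1, t2, t3, r1, r2, r3):
    """First three ISI moments (5.77); t_k are FPT moments, r_k refractory-period moments."""
    m1 = t1 + r1
    m2 = t2 + r2 + 2.0 * r1 * t1
    m3 = t3 + r3 + 3.0 * r1 * t2 + 3.0 * r2 * t1
    return m1, m2, m3

def asymptotic_mean_count(t, t1, m1, m2):
    """Large-time approximation of E{M_t | x0}: first firing has mean t1, later ISIs mean m1."""
    return t / m1 + 0.5 * m2 / m1**2 - t1 / m1
```

For a constant refractory period of duration $\rho$ one simply passes $r_k = \rho^k$.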
5.4 Mathematical Methods for the First Passage Time Problem and Their Application to the Study of Neuronal Models

We update here previous reviews [1, 104, 106] on the methods available up to now to deal with the FPT problem for stochastic LIF models. In the case of diffusion processes, closed form expressions for the transition pdfs are determined as solutions of the Kolmogorov or the Fokker–Planck equations of Sects. 5.3.2, 5.3.5 and 5.3.6. The Fourier transform is the typical method to get these solutions. Closed form solutions for the FPT pdf refer generally to specific time dependent boundaries. The constant boundary case is known for the Wiener process (5.11) or for the OU process if $S = 0$, $\mu = 0$ and $x_0 \ne 0$. The FPT pdf has been determined for the hyperbolic shape boundary (5.36) for the OU process [100] or for the Feller process [114] in the case of boundaries corresponding to symmetries for these processes. Use of the reflection principle allows one to determine the FPT pdf through Daniels' boundary [26, 27, 86], while a particular FPT pdf is found through a symmetry based constructive approach in [44]. The theory of group transformations is used in [48] and [114] to determine the transition pdfs of an OU process between a lower reflecting and an upper absorbing constant boundary and of a Feller process between the origin and a hyperbolic-type boundary, but it is of no interest for neuronal applications. Hence, we omit the description of this method. In Sect. 5.4.1 we review the commonly used analytical techniques for FPT problems: integral equations (Sect. 5.4.1.1), change of variables or of measure (Sect. 5.4.1.2), asymptotic studies (Sect. 5.4.1.3), computation of FPT moments (Sect. 5.4.1.4), and stochastic ordering (Sect. 5.4.1.5). In Sect. 5.4.1.6 we present the available methods for jump diffusion processes. In Sect. 5.4.2 we introduce the direct (Sect. 5.4.2.1) and inverse (Sect. 5.4.2.2) FPT problem. Finally, in Sect. 5.4.3 we sketch specific simulation techniques for FPTs and the numerical tools for their solution.
5.4.1 Analytical Methods

5.4.1.1 Integral Equations

In 1943 Fortet [39] proved, under mild conditions for the boundary $S(t)$, that the Volterra integral equation of the first kind:
\[
f(x, t \mid x_0) = \int_0^t g(S(\tau), \tau \mid x_0)\, f(x, t \mid S(\tau), \tau)\, d\tau, \tag{5.80}
\]
holding for $x > S(t)$, holds also for $x = S(t)$. When the boundary is constant and the process is time homogeneous, (5.80) is of convolution type and the Laplace transform method can be applied. Denoting as $f_\lambda(x \mid x_0)$ and $g_\lambda(S \mid x_0)$ the Laplace transforms of $f(x, t \mid x_0)$ and of $g(S, t \mid x_0)$, for $x > S > x_0$ one gets:
\[
g_\lambda(S \mid x_0) = \frac{f_\lambda(x \mid x_0)}{f_\lambda(x \mid S)}. \tag{5.81}
\]
Generally, the Laplace transform cannot be inverted analytically because of its complicated expression (cf., for example, (5.37)). Equation (5.80) for $x=S(t)$ has a weakly singular kernel for $\tau\to t$. Indeed, any diffusion behaves as a Wiener process for small times. Hence, $f(S(t),t\mid S(\tau),\tau)\sim k(t,\tau)/\sqrt{t-\tau}$ with $k(t,\tau)\to 0$ as $\tau\to t$. This makes numerical methods for its solution unstable. Hence it is convenient to switch to a Volterra equation of the second kind. Integrating (5.80) between the left end $l$ of the diffusion interval and $S(t)$ and then differentiating with respect to time, one gets a second kind Volterra equation [103]:

$$g(S(t),t\mid x_0)=2j(S(t),t\mid x_0)-2\int_0^t g(S(\tau),\tau\mid x_0)\,j(S(t),t\mid S(\tau),\tau)\,d\tau. \tag{5.82}$$
Here we have introduced the probability current through $z$ at time $u$ of the diffusion process $X$, whose transition pdf is the solution of (5.5):

$$j(z,u\mid w,v)=\mu(z)\,f(z,u\mid w,v)-\frac{1}{2}\,\frac{\partial}{\partial y}\left[\sigma^2(y)\,f(y,u\mid w,v)\right]\Big|_{y=z}. \tag{5.83}$$
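Equation (5.82) has a weakly singular kernel. For the Wiener process, when $\left|\frac{dS(t)}{dt}\right|\le C\,t^{-\alpha}$, with $\alpha<1/2$, $C$ a constant and $\lim_{t\to 0}S(t)>W_{t_0}=x_0$, $g(S(t),t\mid x_0)$ is the only $L^2$ solution of (5.82). It can be expressed as [103]

$$\begin{aligned}
g(S(t),t\mid x_0)={}&2j(S(t),t\mid x_0)-4\int_0^t d\tau\,j(S(t),t\mid S(\tau),\tau)\,j(S(\tau),\tau\mid x_0)\\
&+\sum_{n=1}^{\infty}4^n\int_0^t d\tau\,j_n(S(t),t\mid S(\tau),\tau)\left[2j(S(\tau),\tau\mid x_0)-4\int_0^{\tau}d\vartheta\,j(S(\tau),\tau\mid S(\vartheta),\vartheta)\,j(S(\vartheta),\vartheta\mid x_0)\right]
\end{aligned}\tag{5.84}$$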
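where

$$j_n(S(t),t\mid S(\tau),\tau)=\int_{\tau}^{t}d\vartheta\,j_1(S(\vartheta),\vartheta\mid S(\tau),\tau)\,j_{n-1}(S(t),t\mid S(\vartheta),\vartheta) \tag{5.85}$$

for $n=2,3,\ldots$ and

$$j_1(S(t),t\mid S(\tau),\tau)=\int_{\tau}^{t}d\vartheta\,j(S(\vartheta),\vartheta\mid S(\tau),\tau)\,j(S(t),t\mid S(\vartheta),\vartheta). \tag{5.86}$$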
A third integral equation was proposed in [13]:

$$g(S(t),t\mid x_0)=-2\Psi(S(t),t\mid x_0)+2\int_0^t g(S(\tau),\tau\mid x_0)\,\Psi(S(t),t\mid S(\tau),\tau)\,d\tau \tag{5.87}$$

where

$$\Psi(S(t),t\mid x,\tau)=\frac{d}{dt}\left\{F(S(t),t\mid x,\tau)\right\}+k(t)\,f(S(t),t\mid x,\tau). \tag{5.88}$$

Here $F(S(t),t\mid x,\tau)=\int_l^{S(t)}f(y,t\mid x,\tau)\,dy$ and $k(t)$ is an arbitrary continuous function. A suitable choice of $k(t)$ makes the kernel of (5.87) regular, so that (5.87) becomes well suited to numerical integration. The expressions (5.88) for the OU and the Feller processes, respectively, that make the kernel of (5.87) regular are [13]
$$\Psi_{OU}(S(t),t\mid x,\tau)=\left\{\frac{S'(t)+S(t)/\theta-\mu}{2}-\frac{S(t)-\mu\theta-(x-\mu\theta)\,e^{-(t-\tau)/\theta}}{\theta\left[1-e^{-2(t-\tau)/\theta}\right]}\right\}f(S(t),t\mid x,\tau) \tag{5.89}$$
and h F el .S .t/; t jx; // D
p
S.t /p.t / x
i .qr/ 2r
pŒS.t/ C xe p.t / exp rŒe p.t / 1
rŒe p.t / 1
1 r pS.t/e p.t / C ŒpS.t/ C q S 0 .t/
S 0 .t/ p.t / e 1 2 2 " p # 2p S.t/xe p.t /
Iq=r1 rŒe p.t / 1 " p #) p 2p S.t/xe p.t / p S.t/xe p.t / Iq=r C : (5.90) e p.t / 1 rŒe p.t / 1
Other choices for k.t/ make the integral on the right hand side of (5.87) vanish for boundaries with particular symmetry properties, such as (5.36) for the OU process. Infinite sum expansions, bounds and approximations for g .S.t/; t jx0 / can be obtained from (5.87) [121]. Expressions that regularize the kernel of (5.87) can be found also in other specific cases.
5.4.1.2 Change of Variables or Change of Measures

The transition pdf and the FPT pdf of an assigned process can be obtained through changes of variables and/or changes of measure. Let $Y=\{Y_t,\ t\ge 0\}$ be a diffusion process characterized by drift $\mu(y,t)$ and diffusion coefficient $\sigma(y,t)$. One may wish to transform this process into a Wiener process through suitable space-time transformations, when these transformations exist. In [99] it is shown that a transformation, conserving the probability mass, mapping the Kolmogorov equation of the process $Y$ into the analogous equation for the Wiener process

$$\frac{\partial f'}{\partial\tau'}+\frac{1}{2}\,\frac{\partial^2 f'}{\partial y'^2}=0, \tag{5.91}$$

with initial delta condition, is of the form

$$\tau'=\varphi(\tau);\qquad y'=\psi(y,\tau);\qquad f(x,t\mid y,\tau)\,dx=f'(x',t'\mid y',\tau')\,dx'. \tag{5.92}$$
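This transformation exists if and only if the infinitesimal moments verify

$$\mu(y,\tau)=\frac{1}{4}\,\frac{\partial\sigma^2(y,\tau)}{\partial y}+\frac{\sigma(y,\tau)}{2}\left\{c_1(\tau)+\int_z^y\frac{c_2(\tau)\,\sigma^2(x,\tau)+\partial\sigma^2(x,\tau)/\partial\tau}{[\sigma(x,\tau)]^3}\,dx\right\}. \tag{5.93}$$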
Here $z\in I$ is an arbitrary value and $c_1(t)$ and $c_2(t)$ are arbitrary functions of time. If (5.93) holds the transformation is:

$$\begin{aligned}
y'&=\psi(y,\tau)=\sqrt{k_1}\,\exp\left\{-\frac{1}{2}\int_{\tau_0}^{\tau}du\,c_2(u)\right\}\int_z^y\frac{dx}{\sigma(x,\tau)}-\frac{\sqrt{k_1}}{2}\int_{\tau_2}^{\tau}du\,c_1(u)\exp\left\{-\frac{1}{2}\int_{\tau_0}^{u}d\vartheta\,c_2(\vartheta)\right\}+k_2\\
\tau'&=\varphi(\tau)=k_1\int_{\tau_1}^{\tau}du\,\exp\left\{-\int_{\tau_0}^{u}d\vartheta\,c_2(\vartheta)\right\}+k_3\\
f(x,t\mid y,\tau)\,dx&=f'(x',t'\mid y',\tau')\,dx'
\end{aligned}\tag{5.94}$$

where $z\in I$, $\tau_i\in[0,\infty)$ and $k_i$, $i=1,2,3$, are constants with $k_1>0$. For example, the transformation

$$\begin{aligned}
y'&=\psi(y,\tau)=\frac{\sqrt{k_1}}{\sigma}\,e^{(\tau-\tau_0)/\theta}\,(y-z)+\frac{\sqrt{k_1}}{\sigma}\,(\mu\theta-z)\left[e^{(\tau_2-\tau_0)/\theta}-e^{(\tau-\tau_0)/\theta}\right]+k_2\\
\tau'&=\varphi(\tau)=\frac{k_1\theta}{2}\left[e^{2(\tau-\tau_0)/\theta}-e^{2(\tau_1-\tau_0)/\theta}\right]+k_3\\
f(x,t\mid y,\tau)\,dx&=f'(x',t'\mid y',\tau')\,dx'
\end{aligned}\tag{5.95}$$

changes the OU process into the standard Wiener process. Here $\tau_1,\tau_2>0$ are arbitrary times. The transformation (5.95) sends the linear boundary $S(t)=a+bt$ for the Wiener process into the U-shaped boundary (5.36) for the OU process. The Feller process can be transformed into the Wiener process, conserving the probability mass, only if $q/r=1/2$. In [19] a necessary and sufficient condition analogous to (5.93) is given to transform a diffusion process $Y$ into a Feller process. A change between the measures of two processes is considered in [4, 5, 115]. In [4] the Girsanov theorem [111] and the change of measure

$$\frac{dP_{OU}}{dP}=\exp\left\{-\frac{1}{2\theta}\left(W_t^2-x_0^2-t\right)-\frac{1}{2\theta^2}\int_0^t W_s^2\,ds\right\} \tag{5.96}$$

are applied to the OU process to obtain its FPT pdf through a constant boundary. Here $P_{OU}$ and $P$ denote the distributions of an OU process with $\mu=0$ and of a standard Wiener process, respectively.
5.4.1.3 Asymptotic Results

Asymptotic results play an important role in the study of the FPT pdf because they already hold for relatively small values of the involved variable. Studies on the asymptotic behavior of the FPT pdf belong to two different classes: large values of the boundary and large times. Let us first list asymptotic results in the first class. In [91] the asymptotic exponentiality of the FPT for an OU process is proved; this result is extended in [90] to a class of diffusion processes admitting the steady-state density

$$W(x)=\lim_{t\to\infty}f(x,t\mid x_0)=\frac{c}{\sigma^2(x)}\exp\left\{\int^{x}\frac{2\mu(y)}{\sigma^2(y)}\,dy\right\}, \tag{5.97}$$

where $c$ is a normalization constant. When the boundary $S$ approaches the unattainable level $r$ of the diffusion interval and $\lim_{x\to r}\sigma^2(x)\,[W(x)]^2\,E[T]=0$, the following asymptotic result for the FPT pdf $g(t)$ holds:

$$g(t)\sim\frac{1}{E[T]}\exp\left\{-\frac{t}{E[T]}\right\}. \tag{5.98}$$

Numerical studies on the OU and on the Feller processes show that this behavior is attained with a negligible error for quite small values of the boundary $S$ (e.g. for $S=3$ if $\mu=0$, $\theta=1$ and $\sigma^2=1$ for the OU process). In this asymptotic case the mean FPT, $E[T]$, loses its dependency upon the initial value $x_0$. Asymptotic results hold for the same processes in the case of boundaries either asymptotically constant or asymptotically periodic [45]. Periodic boundaries for the OU process are considered also in [113] (see [106] for a review on time dependent boundaries).

Let us now switch to the asymptotic behavior with respect to time. Different techniques, including large deviations theory (see Chaps. 3 and 4), may be used to prove the results. For small times, the FPT can be approximated with the IG distribution. Indeed, near the origin any diffusion process can be approximated by a Wiener process. In [45] the asymptotic behavior, for large $t$, of the FPT pdfs through some time-varying boundaries is considered. This paper deals with a class of one dimensional diffusion processes with steady-state density. The considered boundaries include periodic boundaries. Sufficient conditions for an asymptotic exponential behavior are given for the cases of asymptotically constant and asymptotically periodic boundaries. Explicit expressions are worked out for the processes that can be obtained from the OU process by spatial transformations. The FPT pdf as $t\to\infty$, for periodic or asymptotically periodic boundaries $S(t)$, under mild conditions [45] exhibits damped oscillations with the same period $T$ as the boundary:

$$g(S(t),t\mid x_0)\sim\alpha(t)\exp\left\{-\int_0^t\alpha(\tau)\,d\tau\right\}. \tag{5.99}$$
Here ˛.t/ is a periodic function of period T : ˛.t/ D 2 lim
n!1
ŒS.t C nT /; t C nT jx0
(5.100)
d 2 ŒV .t/ W ŒV .t/ D V 0 .t/ C ŒV .t/ dt 4 and V .t/ D limn!1 S.t C nT /. This behavior holds already also for times not too far from the origin. An exponential asymptotic behavior is also proved for large times and constant boundaries in [126]. An asymptotic evaluation of the probability that the Wiener process first crosses a square root boundary is provided p in [125]. Denoting by Tc the FPT of the Wiener process trough the boundary c 1 C t one has P .Tc > t/ t !1 qt p.c/ ; Here limc!1 p.c/ D 0I limc!0 p.c/ D and 1 with respect to of the equation:
1 2
0 < p.c/ <
1 : 2
(5.101)
and 2p.c/ is a real solution between 0
p n 1 X sin. n 2 / .1 C 2 /. 2c/ . / D 1: nŠ 2 nD1
(5.102)
Using the inverse of transformation (5.95), this result can be applied to get the asymptotic FPT pdf of the OU process through a constant boundary for large times. In [121] truncations of the series expansion of the FPT pdf solution of (5.87) are used to achieve approximate evaluations. Fixed point theorems are used to obtain bounds for the FPT pdf of the OU and the Feller processes. Inequalities are proved to find for which times the FPT pdf can be approximated, within a preassigned error, by means of an assigned distribution such as the FPT distribution of the Wiener process or the exponential one. In [92] the asymptotic behavior of the FPT pdf through time-varying boundaries is determined for a class of Gauss-Markov processes.
5.4.1.4 Moments of the FPT

Analytical formulae for the moments of the FPT are available only for time homogeneous processes with constant boundary. Three approaches are possible:
1. Derivatives of the Laplace transform of the FPT pdf.
2. Solutions of second order differential equations.
3. Solution of the recursive formula proposed by Siegert [129].
Having the Laplace transform of the FPT pdf, one can compute:

$$E[T^n]=(-1)^n\left.\frac{d^n g_\lambda(S\mid x_0)}{d\lambda^n}\right|_{\lambda=0}, \tag{5.103}$$

where $g_\lambda(S\mid x_0)$ is given by (5.81). The presence of special functions in the Laplace transforms for the OU (5.37) or for the Feller process (5.56) leads to very complex computations. Alternatively, using the Kolmogorov equation and (5.81), one can show that the moments of the FPT verify the recursive system of ordinary differential equations:

$$\frac{\sigma^2(x_0)}{2}\,\frac{d^2E[T^n](x_0)}{dx_0^2}+\mu(x_0)\,\frac{dE[T^n](x_0)}{dx_0}=-n\,E[T^{n-1}](x_0),\qquad x_0\in(l,S) \tag{5.104}$$

with boundary conditions:
EŒT 0 .l/ D 0; EŒT 0 .S / D 1:
(5.105)
When the process admits a steady state distribution, one can write the solution of (5.104) through the Siegert formula [129]:

$$E[T^n]=t_n(S\mid x_0)=n\int_{x_0}^{S}\frac{2\,dz}{\sigma^2(z)\,W(z)}\int_l^z W(y)\,t_{n-1}(S\mid y)\,dy. \tag{5.106}$$
Due to the numerical difficulties of these formulae, in [79] approximations are proposed for specific processes (Sects. 5.3.5 and 5.3.6) together with suggestions on the use of each one for specific parameter ranges.
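As an illustration of (5.106) for $n=1$, the following minimal sketch evaluates the double integral with the trapezoidal rule. The function name `siegert_mean_fpt`, the grid size and the truncation of the left end $l$ are assumptions made here for illustration; they are not taken from [129] or [79]. The steady-state density is built from (5.97) up to its normalization constant, which cancels in (5.106).

```python
import math

def siegert_mean_fpt(mu, sigma2, l, S, x0, n=20000):
    """Mean FPT from (5.106) with n = 1 and t_0 = 1 (hypothetical helper).

    mu(x), sigma2(x): drift and squared diffusion coefficient (callables).
    l: left end of the diffusion interval (a finite truncation if l = -inf).
    x0 is rounded to the nearest grid point.
    """
    h = (S - l) / n
    xs = [l + i * h for i in range(n + 1)]
    # log W(x) = -log sigma^2(x) + int_l^x 2 mu / sigma^2 dy, cf. (5.97)
    ratio = [2.0 * mu(x) / sigma2(x) for x in xs]
    logw = [0.0] * (n + 1)
    acc = 0.0
    for i in range(1, n + 1):
        acc += 0.5 * h * (ratio[i - 1] + ratio[i])
        logw[i] = acc
    logw = [lw - math.log(sigma2(x)) for lw, x in zip(logw, xs)]
    ref = max(logw)                       # rescale to avoid overflow
    w = [math.exp(lw - ref) for lw in logw]
    # inner integral int_l^z W(y) dy, accumulated on the grid
    inner = [0.0] * (n + 1)
    for i in range(1, n + 1):
        inner[i] = inner[i - 1] + 0.5 * h * (w[i - 1] + w[i])
    # outer integral from x0 to S of 2 / (sigma^2 W) * inner
    k0 = int(round((x0 - l) / h))
    integrand = [2.0 * inner[i] / (sigma2(xs[i]) * w[i]) for i in range(k0, n + 1)]
    total = 0.0
    for a, b in zip(integrand[:-1], integrand[1:]):
        total += 0.5 * h * (a + b)
    return total
```

For a Wiener process with drift $\mu>0$ the value can be compared with the known mean $(S-x_0)/\mu$, for instance `siegert_mean_fpt(lambda x: 1.0, lambda x: 1.0, -20.0, 1.0, 0.0, n=21000)`.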
5.4.1.5 Stochastic Ordering

A further technique for the study of the FPTs is the stochastic comparison of the FPTs from different models [118–120]. Consider the FPTs $T_1$ and $T_2$ of two diffusion processes $X_1$ and $X_2$ over $I_1=(l_1,r_1)$ and $I_2=(l_2,r_2)$, with drifts $\mu_i(x)$ and diffusion coefficients $\sigma_i(x)$, $i=1,2$, respectively. Let the two processes $Y_1$ and $Y_2$ be obtained from $X_1$ and $X_2$ through the transformation

$$y_i=g_i(x)=\int_{l_i}^{x}\frac{dz}{\sigma_i(z)},\qquad i=1,2.$$
Moreover, let Y1 and Y2 verify the inequalities: Y1 .y/ Y2 .y/ 8y 2 Œ0; g2 .S /I and 12 .x/ 22 .x/.
dY2 .y/ 0 8y 2 Œ0; g2 .S / dy
(5.107)
For x0 2 .max.l1 ; l2 /; S /, S 2 .max.l1 ; l2 /; min.r1 ; r2 //, x0 < S , it a.s. holds: TX1 .S jx0 / TX2 .S jx0 /:
(5.108)
Note that $Y_1$ and $Y_2$ are characterized by unit diffusion coefficient and drift

$$\mu_{Y_i}(y)=\left.\frac{1}{\sigma_i(x)}\left(-\frac{1}{4}\,\frac{d\sigma_i^2}{dx}+\mu_i(x)\right)\right|_{x=g_i^{-1}(y)}. \tag{5.109}$$
5.4.1.6 Jump Diffusion Processes

The following integral equation for the FET pdf $\hat g(S,t\mid y,\tau)$ of the process $X$ in (5.69), defined over $I=[l,S]$ and originated in $y$ at time $\tau$, holds [50]:

$$\begin{aligned}
\hat g(S,t\mid y,\tau)={}&e^{-\lambda(t-\tau)}\,g(S,t\mid y,\tau)\\
&+\int_{\tau}^{t}du\int_l^S dz\,e^{-\lambda(u-\tau)}\left\{\lambda_e f^a(z-a,u\mid y,\tau)+\lambda_i f^a(z+a,u\mid y,\tau)\,I_{(l,S-a)}(z)\right\}\hat g(S,t\mid z,u)\\
&+\lambda_e\,e^{-\lambda(t-\tau)}\int_{S-a}^{S}dz\,f^a(z,t\mid y,\tau).
\end{aligned}\tag{5.110}$$

Here $\lambda=\lambda_e+\lambda_i$, $I_A(\cdot)$ is the indicator function of the set $A$ and the jump amplitudes are $\delta_e=-\delta_i=a$. Furthermore, $g(S,t\mid y,\tau)$ and $f^a(x,t\mid y,s)$ are the FPT pdf and the transition pdf in the presence of the boundary $S$ of the underlying diffusion process. The following approximate solution: $\hat g(S,t\mid y,\tau)$
e .t / gtS
C e e .t / C i e .t /
.y; / C e e .t / Z
Z
t
Z
t
1 S
du
S
d zf a .z; t jy; / S a
S a
du Z
Z
1
d zfa .z; u jy; / gtS a .z; u/
d zf a .z; u jy; / gtS Ca .z; u/ (5.111)
holds [56] for a Wiener process with drift and diffusion coefficient when e > i , e 1. Here gt .z; u/ D g.; tjz; u/. This approximation can be interpreted in terms of sample path behavior for the process X . For jumps of low frequency, but relevant amplitude with respect to S , most of the sample paths cross the boundary either by diffusion without jumps or for an upward jump when Xt 2 ŒS a; S / or by diffusion after at most a single (upward or downward) jump. The possible occurrence of a higher number of jumps is disregarded. Hence, this approximation explains the first two peaks of the observed multimodal behavior exhibited by the FET pdf (Fig. 5.4).
Some results on the moments of two simplified jump diffusion neuronal models are discussed in [50, 51]. In the “large jumps” model the amplitude of exponentially time distributed jumps is large enough to determine a crossing of the threshold at each jump. Assuming that the crossing is a sure event, a recursive relation holds for the FET moments of order n 1: n EŒT D
"
(
n
EŒT
n1
C .1/
n
.n1/
dgC .S jx0 / d n1
#
) :
(5.112)
D0
Here gC .S jx0 / is the Laplace transform of parameter C of the pure diffusion FPT, is the frequency of jumps and the superscript .n 1/ denotes the .n 1/-th derivative with respect to . In the “reset” model exponentially time distributed jumps force the membrane potential to return instantaneously to its resting level VR x0 . Then the dynamics restarts anew till the crossing of the boundary or a new resetting. This model includes both upward and downward jumps of frequencies 1 and 2 , whose time epochs are described by means of two independent Poisson processes. One has: EŒT D EŒT 2 D
1 g .S jx0 / I g .S jx0 /
(5.113)
2 dgC .S jx0 / 2EŒT C g .S jx0 / d Œg .S jx0 /2 D0
with the same notation as in (5.112) but D 1 C 2 .
5.4.2 Numerical Methods

5.4.2.1 The Direct FPT Problem

The direct FPT problem consists in determining the FPT pdf for a given process, assuming that the transition pdf and the boundary shape are known. A large literature exists on numerical methods for solving the integral equations for the FPT pdf, even for non time-homogeneous diffusion processes [6, 13, 35, 38, 59, 102, 106]. The one proposed in [13] seems to be the fastest and most efficient. It consists in discretizing (5.87) when the function $k(t)$ is chosen so as to get a regular kernel for the second kind Volterra equation ((5.89) or (5.90) for the OU and the Feller processes, respectively). Setting $t=t_0+kh$, $k=1,2,\ldots$, $h>0$, the discretized solution of (5.87) is
[Figure 5.7: panel a, FPT pdf $g(t)$ versus $t$ (ms); panel b, cumulative FPT probability distribution versus $t$ (ms).]

Fig. 5.7 FPT pdf (panel A) and FPT probability distribution (panel B) of an OU process obtained numerically with $\mu=0$ mV ms$^{-1}$, $\sigma^2=1$ mV$^2$ ms$^{-1}$, $\theta=1$ ms$^{-1}$, $S=1$ mV; integration steps $h=0.045$ (continuous line), $h=0.6$ (dashed line)
$$\begin{aligned}
\tilde g(S(t_0+h),t_0+h\mid x_0,t_0)&=-2\Psi(S(t_0+h),t_0+h\mid x_0,t_0)\\
\tilde g(S(t_0+kh),t_0+kh\mid x_0,t_0)&=-2\Psi(S(t_0+kh),t_0+kh\mid x_0,t_0)\\
&\quad+2h\sum_{j=1}^{k-1}\tilde g(S(t_0+jh),t_0+jh\mid x_0,t_0)\,\Psi(S(t_0+kh),t_0+kh\mid S(t_0+jh),t_0+jh),\qquad k=2,3,\ldots
\end{aligned}\tag{5.114}$$
A suitable discretization step is necessary to make the numerical integration reliable. The numerical algorithm (5.114) uses previous integration steps to determine the successive values, hence it is important to get good evaluations on the first intervals. A heuristic rule is to execute at least 20 integration steps before the maximum of the FPT pdf occurs. In Fig. 5.7, panel A, we show the FPT pdf of a standard OU process with different integration steps. The error in the FPT pdf due to a wrong choice for h is shown in the evaluation of the FPT distribution (Panel B). In [112] a strategy is proposed to numerically solve (5.87) with an appropriate choice of the integration step. To this aim a time-dependent function, the FPT Location function, is introduced.
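The recursion (5.114) can be sketched as follows. This is only an illustrative implementation under stated assumptions: the OU process is taken as $dX_t=(-X_t/\theta+\mu)\,dt+\sigma\,dW_t$, the regularized kernel for the OU process is written in a form derived here from (5.88), and the Wiener kernel $\Psi=\frac{1}{2}[S'(t)-(S(t)-x)/(t-\tau)]f$ is likewise a derivation made for this example. The names `make_psi_wiener`, `make_psi_ou` and `fpt_density` are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def make_psi_wiener(S, dS, mu, sigma):
    """Regularized kernel for a Wiener process with drift (assumed form)."""
    def psi(t, x, tau):
        s = t - tau
        f = gaussian_pdf(S(t), x + mu * s, sigma ** 2 * s)
        return 0.5 * (dS(t) - (S(t) - x) / s) * f
    return psi

def make_psi_ou(S, dS, mu, theta, sigma):
    """Regularized kernel for the OU process (assumed parametrization)."""
    def psi(t, x, tau):
        s = t - tau
        e1 = math.exp(-s / theta)
        mean = mu * theta + (x - mu * theta) * e1
        var = 0.5 * sigma ** 2 * theta * (1.0 - e1 * e1)
        f = gaussian_pdf(S(t), mean, var)
        a = 0.5 * (dS(t) + S(t) / theta - mu)
        b = (S(t) - mean) / (theta * (1.0 - e1 * e1))
        return (a - b) * f
    return psi

def fpt_density(psi, S, x0, t0, h, n_steps):
    """Discretized solution (5.114) of the second kind Volterra equation (5.87)."""
    times, g = [], []
    for k in range(1, n_steps + 1):
        t = t0 + k * h
        val = -2.0 * psi(t, x0, t0)
        for j in range(k - 1):            # previously computed knots only
            val += 2.0 * h * g[j] * psi(t, S(times[j]), times[j])
        times.append(t)
        g.append(val)
    return times, g
```

For a Wiener process and a constant boundary the kernel vanishes identically, so the scheme returns the inverse Gaussian density at every knot; this gives a simple check of the implementation. For the OU process, `fpt_density(make_psi_ou(S, dS, 0.0, 1.0, 1.0), S, 0.0, 0.0, 0.045, 400)` with `S = lambda t: 1.0` and `dS = lambda t: 0.0` corresponds to the parameter choice of the finer step in Fig. 5.7.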
5.4.2.2 The Inverse FPT Problem

The inverse FPT problem consists in determining the boundary $S(t)$ when the FPT pdf for a diffusion process is known, either in exact form or through a sample of FPTs. Two numerical algorithms are proposed to solve this problem for the Wiener process in [138] and extended to the OU process in [122]. The first algorithm proposes a Monte Carlo procedure to approximate the unknown boundary for the Wiener process stepwise. This algorithm is reliable and easily implemented but it is computationally expensive. The second approach, purely numerical, is computationally more attractive and can be extended to processes different from the Wiener and the OU. We present it here in the case of the Wiener process. By integrating Fortet's equation (5.80) in $x$ from $S(t)$ to infinity [96] one obtains the integral equation

$$\bar\Phi\left(\frac{S(t)}{\sqrt{t}}\right)=\int_0^t\bar\Phi\left(\frac{S(t)-S(s)}{\sqrt{t-s}}\right)g(s)\,ds\qquad(t>0) \tag{5.115}$$

where $\bar\Phi(x)=1-\Phi(x)$, $\Phi(x)=\int_{-\infty}^{x}\varphi(z)\,dz$, $\varphi(z)$ is the standard normal pdf and $g(t)=g(S(t),t\mid x_0)$ is the FPT pdf for the Wiener process through the threshold $S(t)$. Equation (5.115) is a Volterra integral equation of the first kind in $g(s)$ but it is a non-linear Volterra integral equation of the second kind in $S(t)$. Its kernel is nonsingular since it is bounded. Moreover, the functions involved are bounded and $\bar\Phi$ is invertible. Hence, one can obtain numerically the approximate values $S^*(t_i)$ of $S(t_i)$ at the knots $t_i=ih$ for $i=1,\ldots,n$; here $h=t/n$ ($t>0$ fixed). The integral on the right hand side of (5.115) is approximated by the Euler method:

$$\bar\Phi\left(\frac{S^*(t_i)}{\sqrt{t_i}}\right)=\sum_{j=1}^{i}\bar\Phi\left(\frac{S^*(t_i)-S^*(t_j)}{\sqrt{t_i-t_j}}\right)g(t_j)\,h\qquad(i=1,\ldots,n), \tag{5.116}$$

getting a non-linear system of $n$ equations in the $n$ unknowns $S^*(t_1),\ldots,S^*(t_n)$. Note that the $i$-th equation, $i\ge 2$, makes use of the values $S^*(t_1),\ldots,S^*(t_{i-1})$ obtained in the preceding steps. Hence, (5.116) can be solved iteratively to get approximate values for the unknown boundary $S$ at the knots. The convergence of this algorithm and an estimate of its error are considered in [138]. Extensions to the case in which the FPT pdf is known only through samples of ISIs make use of the kernel method to approximate the FPT pdf [123]. Applications to neuronal modeling are proposed in [122, 124] for the OU process. In particular, in [124] this algorithm is employed to propose a classification method of groups of neurons when simultaneously recorded spike trains are available.
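The iterative solution of (5.116) can be sketched as below for a standard Wiener process started at zero. Several choices are assumptions made for this illustration and are not taken from [138]: the term $j=i$ of the sum is evaluated with $\bar\Phi(0)=1/2$, each scalar equation is solved by bisection on a fixed bracket, and the names `surv`, `residual` and `inverse_fpt_wiener` are hypothetical. No claim is made here about the accuracy of the recovered boundary for a given step $h$.

```python
import math

def surv(x):
    """Standard normal survival function, 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def residual(s, i, knots, boundary, g, h):
    """Left minus right hand side of the i-th equation of (5.116) at S*(t_i) = s."""
    t_i = knots[i]
    rhs = 0.5 * g[i] * h                  # j = i term, assuming surv(0) = 1/2
    for j in range(i):
        rhs += surv((s - boundary[j]) / math.sqrt(t_i - knots[j])) * g[j] * h
    return surv(s / math.sqrt(t_i)) - rhs

def inverse_fpt_wiener(g_func, t_end, n, lo=-50.0, hi=50.0):
    """Solve (5.116) knot by knot; g_func is the given FPT pdf."""
    h = t_end / n
    knots = [(i + 1) * h for i in range(n)]
    g = [g_func(t) for t in knots]
    boundary = []
    for i in range(n):
        a, b = lo, hi                     # residual > 0 at lo and < 0 at hi
        for _ in range(200):
            mid = 0.5 * (a + b)
            if residual(mid, i, knots, boundary, g, h) > 0.0:
                a = mid
            else:
                b = mid
        boundary.append(0.5 * (a + b))
    return knots, g, boundary
```

A natural experiment is to pass the inverse Gaussian pdf of a constant boundary as `g_func` and to compare the returned values with that constant for decreasing `h`.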
[Figure 5.8: sample path $X_t$ versus $t$ (ms).]

Fig. 5.8 A sample path of a diffusion process and its discretization. Exemplification of a missed crossing of the boundary inside a time interval of the simulation. Dashes: simulated values of the sample path; Continuous line: sample path
5.4.3 Simulation Methods

In the study of neuronal models, when the process is not time-homogeneous or the boundary is time dependent, the only available general technique is simulation. Unfortunately, simulation may make the results for FPTs unreliable. The standard approach uses discretization algorithms for the SDE describing the membrane potential dynamics. Various reliable discretization schemes exist [72], depending on which degree of strong or weak convergence is required. The easiest are the Euler–Maruyama and the Milstein schemes (Chap. 1, Sects. 1.7.1 and 1.7.2). The major cause of error in the simulation of FPTs is the risk of not detecting the crossing of the boundary because of the discretization of the sample paths (Fig. 5.8). This implies a serious overestimation of the FPT, which does not disappear when the discretization step $h$ decreases [53]. The number of trials where the error may occur increases as $h$ decreases, balancing the corresponding decrease in the probability of error. Different solutions have been proposed to make the simulation of FPTs reliable. They suggest evaluating the crossing probability during each integration step through the bridge process joining the last two values generated for the diffusion process. A bridge process $X^{[t_0,t_1]}=\{X_t^{[t_0,t_1]},\ t\in[t_0,t_1]\}$ is associated to a given diffusion $X$ by constraining $X$ to take fixed values at the time instants $t_0$ and $t_1>t_0$.
The process $X^{[t_0,t_1]}$ is still a diffusion, since its sample paths are a subset of the set of sample paths of $X$. The crossing probability of the original diffusion during a simulation step of amplitude $h$ coincides with that of its associated bridge on the same time interval [111]. Then one can evaluate, on each interval, the probability of hidden crossings for this process. For the Wiener process, the probability that the bridge $W^{[\tau,\tau+h]}$, originated in the state $y$ at time $\tau$ and constrained to assume the value $z<S$ at time $\tau+h$, crosses the boundary $S>y$ during $[\tau,\tau+h]$ is

$$P=P_W(S,h,y,z)=\exp\left\{-\frac{2}{\sigma^2h}\left(S^2-Sy-Sz+zy\right)\right\}. \tag{5.117}$$
One can compare this value with a generated random number $U$ uniform on $(0,1)$ and, if $U<P$, conclude that a crossing has happened in that interval. The crossing probability of a Wiener process is used to approximate the crossing of the bridge associated to the process of interest in [63]. To introduce a more precise estimation of the crossing probability for the bridge we first recall the relationship between the transition pdfs $f(x,t\mid y,\tau)$ and $f^{[t_0,t_1]}(x,t\mid y,\tau)$ of the process $X$ and of its bridge $X^{[t_0,t_1]}$ [111]:

$$f^{[t_0,t_1]}(x,t\mid y,\tau)=f(x,t\mid y,\tau)\,\frac{f(z,t_1\mid x,t)}{f(z,t_1\mid y,\tau)},\qquad t_0<\tau<t<t_1. \tag{5.118}$$
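The correction based on (5.117) can be sketched for a Wiener process with drift, for which the Euler–Maruyama scheme is exact at the grid points and the only error comes from undetected crossings. The sketch below is an illustration under assumptions made here (parameter values, seeds, the convention of recording the crossing at the right end of the step, and the function names are all hypothetical); it compares the simulated mean FPT with and without the bridge test against the known mean $(S-x_0)/\mu$.

```python
import math
import random

def bridge_crossing_prob(S, h, y, z, sigma):
    """Eq. (5.117): probability that a Wiener bridge from y to z over a step h crosses S."""
    if y >= S or z >= S:
        return 1.0
    return math.exp(-2.0 * (S - y) * (S - z) / (sigma ** 2 * h))

def simulate_fpt(mu, sigma, S, x0, h, rng, correct, t_max=200.0):
    """One simulated FPT; with correct=True hidden crossings are tested via (5.117)."""
    x, t = x0, 0.0
    sd = sigma * math.sqrt(h)
    while t < t_max:
        x_new = x + mu * h + sd * rng.gauss(0.0, 1.0)
        if x_new >= S:
            return t + h                  # crossing visible on the grid
        if correct and rng.random() < bridge_crossing_prob(S, h, x, x_new, sigma):
            return t + h                  # hidden crossing inside the step
        x, t = x_new, t + h
    return t_max

def mean_fpt(n_paths, seed, correct, mu=1.0, sigma=1.0, S=1.0, x0=0.0, h=0.05):
    rng = random.Random(seed)
    total = sum(simulate_fpt(mu, sigma, S, x0, h, rng, correct) for _ in range(n_paths))
    return total / n_paths
```

With these default values the exact mean FPT is 1; `mean_fpt(10000, 1, False)` and `mean_fpt(10000, 2, True)` give the uncorrected and the corrected estimates, respectively.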
Denoting by $T^{[t_0,t_1]}$ the FPT of the bridge $X^{[t_0,t_1]}$ through the boundary $S$ and by $g^{[t_0,t_1]}(S,t\mid x_0,t_0)$ its pdf, it holds [38]:

$$g^{[t_0,t_1]}(S,t\mid x_0,t_0)=g(S,t\mid x_0,t_0)\,\frac{f(z,t_1\mid S,t)}{f(z,t_1\mid x_0,t_0)},\qquad t_0<t<t_1. \tag{5.119}$$

Integral equations analogous to (5.82) and (5.87) hold also for the FPT pdf of the bridge process. In [53] an approximate value of $g^{[t_0,t_1]}(S,t\mid x_0)$ is obtained by disregarding the integral on the right hand side of such equations. The approximation using (5.87)

$$g^{[t_0,t_1]}(S,t\mid x_0,t_0)\cong-2\Psi(S,t\mid x_0,t_0)\,\frac{f(z,t_1\mid S,t)}{f(z,t_1\mid x_0,t_0)} \tag{5.120}$$
produces very good estimates in the case of the OU and of the Feller underlying diffusion processes. In [53] the integral of this approximation over the discretization interval is used to estimate the probability of a hidden crossing inside each interval. In [54] a Monte Carlo algorithm is proposed to estimate the crossing probability of the bridge process. A numerical scheme is applied to the SDE for the bridge process to generate N samples. If L samples cross the boundary, the ratio L=N is used to estimate the crossing probability. The SDE for the bridge has drift and diffusion coefficients [54]:
$$\mu^{[t_0,t_1]}(x,t)=\mu(x)+\frac{\sigma^2(x)}{f(z,t_1\mid x,t)}\,\frac{\partial}{\partial x}f(z,t_1\mid x,t);\qquad\sigma^{[t_0,t_1]}(x)=\sigma(x). \tag{5.121}$$

Here $\mu(x)$ and $\sigma(x)$ are the drift and the diffusion coefficient of the original diffusion $X$. For a standardized OU process the coefficients (5.121) are [54]:

$$\mu_{OU}^{[t_0,t_1]}(x,t)=-x+\frac{2\left(z\,e^{(t_1-t)}-x\right)}{e^{2(t_1-t)}-1};\qquad\sigma_{OU}^{[t_0,t_1]}(x)=1, \tag{5.122}$$
and for the Feller process they are [54]: p 2p xzep.t1 t / p q I pe p.t1 t / 2p xze p.t1 t / r r.ep.t1 t / 1/ Œt p F0 ;t1 .x; t/ D q C x p 2 p.t t / C e 1 1 e p.t1 t / 1 2p xzep.t1 t / I q 1 r.ep.t1 t / 1/
r
p Œt F 0 ;t1 .x/ D 2rx:
(5.123)
Here we employed the notation (5.52) and $I_\nu(\cdot)$ denotes the modified Bessel function of parameter $\nu$. A nested algorithm is proposed in [54] for the numerical solution of the SDE for the bridge. This choice avoids evaluating the drift at $t=t_1$, where it becomes singular. An alternative method to evaluate the hidden crossing probabilities, based on large deviation arguments (see Chaps. 3 and 4), is proposed in [7]. This method is less precise, but it does not require the knowledge of the transition pdf of the process $X$. These algorithms can be applied also to jump diffusion processes but, whenever a jump falls in between the two nodes $t_n$ and $t_{n+1}$ of the partition, the right end of the time interval $[t_n,t_{n+1}]$ should be substituted with the epoch $\tilde t_n$, $t_n<\tilde t_n\le t_{n+1}$, of the jump event. To account for the possible hidden crossings inside the discretization intervals, the correction algorithm proposed in [53] should be employed. A novel numerical method for the simulation of FPTs has been recently proposed in [132]. The algorithm makes use of the representation of the stochastic process through an expansion using the Haar functions. It takes advantage of the dichotomic nature of this development to refine the description of the process in intervals where possible hidden crossings may arise. In a recent paper [57] it is remarked that the membrane potential, until the spike, evolves in the presence of the boundary. The SDE for the process constrained by the boundary, i.e. for the process that has not yet attained the boundary until a fixed time, is determined. The SDE for its bridge, conditioned to cross the boundary for the first time at its right end, is also determined. These SDEs can then be simulated with the techniques described above.
5.5 Estimation Problems for LIF Models

A few papers exist on the parameter estimation problem. The literature on this subject is rather recent, except for two older papers. In [75] a sample of membrane potential values observed at discrete times is considered, while [65] uses the moment method on a sample of ISIs. We focus here mainly on the available statistical methods for the OU and the Feller models. Their parameters can be divided into two groups: the intrinsic parameters, $S$, $x_0$ and $\theta$ for the OU process and $S$, $x_0$, $\theta$ and $V_I$ for the Feller process, and the input parameters, $\mu$ and $\sigma^2$ for the OU process and $\mu_F$ and $\sigma_2^2$ for the Feller process. The intrinsic parameters are often disregarded in estimation problems, on the assumption that they can be measured directly. In [60] the estimation of the refractory period is also discussed. We distinguish in the sequel two types of methods, depending upon the available data:
1. Intracellular membrane recordings.
2. ISI time series.
5.5.1 Samples from Membrane Potential Measures

In [83] a regression method and a maximum likelihood technique are applied to estimate $\beta=1/\theta$, $\mu$ and $\sigma$ from intracellular data, supposed to follow an OU process. We report here the maximum likelihood estimators (MLEs) for the case of OU and Feller processes. One assumes that during an ISI the membrane depolarization $X_i=x_i$, $i=0,1,\ldots,N$, is sampled at the $N+1$ points $t_i=ih$. The MLEs are:

$$\hat\beta=\frac{1}{h}\,\frac{\sum_{j=0}^{N-1}x_j^2-\sum_{j=0}^{N-1}x_{j+1}x_j+(x_N-x_0)\,\bar x}{\sum_{j=0}^{N-1}x_j^2-N\bar x^2}, \tag{5.124}$$

$$\hat\mu=\frac{x_N-x_0}{T}+\hat\beta\,\bar x, \tag{5.125}$$

and

$$\hat\sigma^2=\frac{1}{T}\sum_{j=0}^{N-1}\left(x_{j+1}-x_j+x_j\,h\,\hat\beta-h\,\hat\mu\right)^2 \tag{5.126}$$

where $\bar x=\frac{1}{N}\sum_{j=0}^{N}x_j$, $T=Nh$. The MLEs $\hat\alpha$ and $\hat\mu_F$ for $\alpha=1/\theta$ and $\mu_F$ of the Feller process [11] coincide with (5.124) and (5.125), while the estimator for $\sigma_2^2$,
putting h D 1 e , is:
2 F h Xi 1 e Xi b
: b 22 D P N 1 F h2 C 2Xi 1 h e j D1 Xi 1 b 2
PN
1 j D1 Xi 1
(5.127)
Formulae (5.124)–(5.127) are obtained disregarding the existence of the firing boundary. This approximation introduces a bias in the estimated values [9, 10]. The bias of the estimator of $\mu$ is of the same order of magnitude as the standard deviation of the estimate. Unbiased estimators for $\mu$ are not yet available for the OU and the Feller models, while the biases for the RRW and for the Wiener models are computed in closed form in [9]. A comparative study on the estimators for the Feller process is performed in [10]. In [8], MLEs are derived from discrete observations of a Markov process up to the first-hitting time of a threshold, both in discrete and in continuous time. The models considered are the RRW, an autoregressive model of order one (AR(1)), and the Wiener, OU and Feller diffusions. For the last two, approximations are introduced to evaluate the conditional transition pdf and the FPT distribution. These approximations hold when the sampling intervals are small. Their use allows the likelihood function to be evaluated. In [94] an algorithm is proposed to compute likelihoods, based on the numerical solution of the integral equation (5.87). Furthermore, an estimator based on the large deviation principle (see Chaps. 3 and 4) is suggested to deal with the case of very small likelihoods in the tails of the distribution. In [97], an MLE method is employed for a particular LIF model with an additional variance parameter modeling possible slow fluctuations in the parameter $\mu$. In [64] a sample of discrete observations $X_{i\Delta}$, $i_0\le i\le i_1$, $i_0:=\lceil t_0/\Delta\rceil$, $i_1:=\lfloor t_1/\Delta\rfloor$, of the process $X$ in (5.33), over the time interval $[t_0,t_1]$, is considered. The following nonparametric kernel estimators are proposed:

$$\hat\mu(a)=\frac{\sum_{i=i_0}^{i_1-M}K\left(\frac{X_{i\Delta}-a}{h}\right)\frac{X_{(i+M)\Delta}-X_{i\Delta}}{M\Delta}}{\sum_{i=i_0}^{i_1-M}K\left(\frac{X_{i\Delta}-a}{h}\right)},\qquad\hat\sigma^2(a)=\frac{\sum_{i=i_0}^{i_1-M}K\left(\frac{X_{i\Delta}-a}{h}\right)\left(\frac{X_{(i+M)\Delta}-X_{i\Delta}}{\sqrt{M\Delta}}\right)^2}{\sum_{i=i_0}^{i_1-M}K\left(\frac{X_{i\Delta}-a}{h}\right)} \tag{5.128}$$
with $h>0$ a suitable bandwidth and $M$ a suitable integer. The kernel $K(y)$ may be chosen, for example, as a rectangular or a triangular kernel: $K(y)=\frac{1}{2}I_{[-1,+1]}(y)$ or $K(y)=(1-|y|)\,I_{[-1,+1]}(y)$, with $I_A(y)$ the indicator function of the set $A$. Extensions to other processes and to the selection of the model (OU, Feller or others) are used to exemplify the method. The examples considered correspond to rarely spiking neurons, a fact that minimizes the problem underlined in [9], but prevents its use in other instances.
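The estimators (5.124)–(5.126) can be tried on simulated data with the following minimal sketch. It assumes the OU model $dX_t=(-\beta X_t+\mu)\,dt+\sigma\,dW_t$ without any firing boundary, so the bias discussed above does not appear. The names `ou_mle` and `simulate_ou`, the seed and the parameter values are hypothetical, and the sample mean $\bar x$ is taken here over $x_0,\ldots,x_{N-1}$, which is an assumption that makes (5.125) the exact least-squares intercept.

```python
import math
import random

def ou_mle(x, h):
    """Estimators (5.124)-(5.126) from samples x_0..x_N at step h (no boundary)."""
    N = len(x) - 1
    T = N * h
    xbar = sum(x[:N]) / N                 # assumption: mean over x_0..x_{N-1}
    sxx = sum(v * v for v in x[:N])
    sxy = sum(x[j + 1] * x[j] for j in range(N))
    beta = (sxx - sxy + (x[N] - x[0]) * xbar) / (h * (sxx - N * xbar * xbar))
    mu = (x[N] - x[0]) / T + beta * xbar
    sigma2 = sum((x[j + 1] - x[j] + x[j] * h * beta - h * mu) ** 2
                 for j in range(N)) / T
    return beta, mu, sigma2

def simulate_ou(beta, mu, sigma, x0, h, N, rng):
    """Exact sampling of the OU transition law on a grid of step h."""
    decay = math.exp(-beta * h)
    sd = sigma * math.sqrt((1.0 - decay * decay) / (2.0 * beta))
    x = [x0]
    for _ in range(N):
        mean = mu / beta + (x[-1] - mu / beta) * decay
        x.append(mean + sd * rng.gauss(0.0, 1.0))
    return x
```

For example, `ou_mle(simulate_ou(1.0, 1.0, 1.0, 1.0, 0.01, 100000, random.Random(12345)), 0.01)` returns estimates of $(\beta,\mu,\sigma^2)$ that can be compared with the true values $(1,1,1)$.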
5.5.2 Samples of ISIs

The case of ISI data has been recently considered in [89]. In this paper an algorithm is proposed for computing MLEs with their confidence regions for $\mu$ and $\sigma^2$. The algorithm numerically inverts the Laplace transform for the OU model. The method works also to estimate the parameter $\theta$ but it requires larger samples. MLEs for the OU model are obtained in [137] using numerical evaluations of (5.34). In [31] a variant of the moment method is proposed to estimate the input parameters of the OU process. In this paper an optimal stopping theorem is applied to determine the first two exponential moments of the FPT:

$$E\left[e^{T/\theta}\right]=\frac{\mu\theta}{\mu\theta-S};\qquad E\left[e^{2T/\theta}\right]=\frac{2(\mu\theta)^2-\sigma^2\theta}{2(\mu\theta-S)^2-\sigma^2\theta}. \tag{5.129}$$

The moment method is then applied to obtain the estimators $\hat\mu_n$ and $\hat\sigma_n^2$ from a sample of ISIs $\{T_1,\ldots,T_n\}$:

$$\hat\mu_n=\frac{S\,Z_{1,n}}{\theta\,(Z_{1,n}-1)};\qquad\hat\sigma_n^2=\frac{2S^2\left(Z_{2,n}-Z_{1,n}^2\right)}{\theta\,(Z_{2,n}-1)\,(Z_{1,n}-1)^2} \tag{5.130}$$

where

$$Z_{1,n}=\frac{1}{n}\sum_{i=1}^{n}e^{T_i/\theta};\qquad Z_{2,n}=\frac{1}{n}\sum_{i=1}^{n}e^{2T_i/\theta}. \tag{5.131}$$
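The relation between (5.129) and (5.130) can be checked with a short sketch: inserting the exact exponential moments into the estimators must return the parameters they were computed from. The code assumes a resting level $x_0=0$, as in (5.129), and the suprathreshold regime $\mu\theta>S$; the function names are hypothetical.

```python
import math

def ou_exponential_moments(mu, sigma2, theta, S):
    """Exact moments (5.129); requires mu*theta > S and sigma2*theta < 2*(mu*theta - S)**2."""
    m = mu * theta
    z1 = m / (m - S)
    z2 = (2.0 * m * m - sigma2 * theta) / (2.0 * (m - S) ** 2 - sigma2 * theta)
    return z1, z2

def estimators_from_moments(z1, z2, S, theta):
    """Moment estimators (5.130) written as functions of Z_1 and Z_2."""
    mu_hat = S * z1 / (theta * (z1 - 1.0))
    sigma2_hat = 2.0 * S * S * (z2 - z1 * z1) / (theta * (z2 - 1.0) * (z1 - 1.0) ** 2)
    return mu_hat, sigma2_hat

def ou_moment_estimators(isis, S, theta):
    """Empirical version: Z_1 and Z_2 from (5.131), then (5.130)."""
    n = len(isis)
    z1 = sum(math.exp(t / theta) for t in isis) / n
    z2 = sum(math.exp(2.0 * t / theta) for t in isis) / n
    return estimators_from_moments(z1, z2, S, theta)
```

In practice `ou_moment_estimators` would be applied to a recorded or simulated sample of ISIs, with $S$ and $\theta$ assumed known.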
This method can be applied only region since the following in the suprathreshold conditions must be fulfilled: E e T = < 1; E e 2T = < 1: The first condition is 2 verified if > S and the second holds if > S and 2 < . S /2 : In [32] analoguous results are proved for the Feller model, for which F y0 E e T = D ; F S
.F y0 /2 22 .F =2 y0 / E e 2T = D : .F S /2 22 .F =2 S / (5.132) q
2 2F These expectations converge when F > S and 22 1 C 2 1 < .F S /. 2
The estimators are: b F;n D b 22;n
S Z1;n y0 Z1;n 1
(5.133)
2 Z2;n Z1;n 2 .S y0 /2 : D Œ2 .Z1;n 1/ .SZ2;n y0 / .SZ1;n y0 / .Z2;n 1/
Consistency and asymptotic normality of these estimators have been proved in [49]. The sample sizes required to get the asymptotic conditions are not huge (some hundreds).
In [30, 33] an alternative method to estimate and 2 , based on the analoguous of (5.115) for the OU process, 1
0
B .1 e / S C C ˚B
A D @r 2 2t 1e 2 t
0
Z
t 0
p S g.u/˚ @ 2 p
s
1 e
t u
1 C e
t u
1 A d u; (5.134)
is proposed. Numerical results suggest that this approach is preferable to the previous ones. The case of observations of the trajectory on very short time intervals is considered in [95], where a method is proposed to estimate the parameters of an OU process in this particular instance. A recent review [84] summarizes the state of the art of the estimation problem for diffusion processes in neuronal modeling. A comparison of the different estimation methods for the OU process is performed in [34]. Parallel spike trains are discussed in greater depth in [58]. This book presents the methods of correlation analysis together with a review on different approaches for the analysis of single spike trains. The different chapters discuss many important problems related to the statistical analysis of spike trains. Finally, the inverse FPT method is applied in [124] to classify simultaneously recorded spike trains. The values of the parameters for the OU process are assumed constant for all the recorded spike trains. The boundary is determined by the inverse FPT method and a comparison of the different boundaries is employed to classify the data. The case of non-stationary data is not contemplated by these estimation procedures. In [66] a jump diffusion model is built that incorporates the nonstationarity and a firing mechanism with a state dependent intensity. Statistical methods are suggested to estimate all unknown quantities. The problem of models whose noise term has a specific temporal structure has not been solved up to now. In [123] the inverse FPT method is used on samples of FPTs from an OU process to recover the boundary shape and to test nonstationary behaviors. The proposed method makes use of a moving window approach. The inverse FPT method is applied to samples from each window. The comparison of the determined boundary shapes makes it possible to detect changes in the observed dynamics.
References

1. Abrahams, J.: A survey of recent progress on level-crossing problems for random processes. In: Blake, I.F., Poor, H.V. (eds.) Communications and Networks. A Survey of Recent Advances, pp. 6–25. Springer, New York (1986)
2. Abramowitz, M., Stegun, I.A. (eds.): Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York (1972)
3. Albano, G., Giorno, V., Nobile, A.G., Ricciardi, L.M.: A Wiener-type neuronal model in the presence of exponential refractoriness. BioSystems 88, 202–215 (2007)
4. Alili, L., Patie, P., Pedersen, J.L.: Representation of the first hitting time density of an Ornstein–Uhlenbeck process. Stoch. Model 21, 967–980 (2005)
5. Alili, L., Patie, P.: Boundary crossing identities for diffusions having the time-inversion property. J. Theor. Probab. 23, 65–84 (2010)
6. Anderssen, R.S., DeHoog, F.R., Weiss, R.: On the numerical solution of Brownian motion processes. J. Appl. Probab. 10, 409–418 (1973)
7. Baldi, P., Caramellino, L.: Asymptotics of hitting probabilities for general one-dimensional diffusions. Ann. Appl. Probab. 12, 1071–1095 (2002)
8. Bibbona, E., Ditlevsen, S.: Estimation in discretely observed Markov processes killed at a threshold. Scandinavian Journal of Statistics, Early View, online 28 August 2012. DOI 10.1111/j.1467-9469.2012.00810.x
9. Bibbona, E., Lánský, P., Sacerdote, L., Sirovich, R.: Errors in estimation of input signal for integrate and fire neuronal models. Phys. Rev. E 78, Art. No. 011918 (2008)
10. Bibbona, E., Lánský, P., Sirovich, R.: Estimating input parameters from intracellular recordings in the Feller neuronal model. Phys. Rev. E 81, Art. No. 031916 (2010)
11. Bibby, B., Sørensen, M.: On estimation for discretely observed diffusions: A review. Theor. Stoch. Proc. 2, 49–56 (1996)
12. Brette, R., Gerstner, W.: Adaptive exponential integrate-and-fire model as an effective description of neuronal activity. J. Neurophysiol. 94, 3637–3642 (2005)
13. Buonocore, A., Nobile, A.G., Ricciardi, L.M.: A new integral equation for the evaluation of the first-passage-time probability densities. Adv. Appl. Probab. 19, 784–800 (1987)
14. Buonocore, A., Giorno, V., Nobile, A.G., Ricciardi, L.M.: A neuronal modeling paradigm in the presence of refractoriness. BioSystems 67, 35–43 (2002)
15. Buonocore, A., Caputo, L., Pirozzi, E., Ricciardi, L.M.: On a stochastic leaky integrate-and-fire neuronal model. Neural Comput. 22, 2558–2585 (2010)
16. Burkitt, A.N.: A review of the integrate and fire neuron model: I. Homogeneous synaptic input. Biol. Cybern. 95, 1–19 (2006)
17. Burkitt, A.N.: A review of the integrate and fire neuron model: II. Inhomogeneous synaptic input and network properties. Biol. Cybern. 95, 97–112 (2006)
18. Capocelli, R.M., Ricciardi, L.M.: Diffusion approximation and first passage time problem for a model neuron. Kybernetik 8(6), 214–223 (1971)
19. Capocelli, R.M., Ricciardi, L.M.: On the transformation of diffusion processes into the Feller process. Math. Biosci. 29, 219–234 (1976)
20. Cerbone, G., Ricciardi, L.M., Sacerdote, L.: Mean variance and skewness of first passage time for the Ornstein–Uhlenbeck process. Cybern. Syst. 12, 395–429 (1981)
21. Chacron, M.J., Longtin, A., St-Hilaire, M., Maler, L.: Suprathreshold stochastic firing dynamics with memory in P-type electroreceptors. Phys. Rev. Lett. 85, 1576–1579 (2000)
22. Chacron, M.J., Pakdaman, K., Longtin, A.: Interspike interval correlations, memory, adaptation, and refractoriness in a leaky integrate-and-fire model with threshold fatigue. Neural Comput. 15(2), 253–278 (2003)
23. Chacron, M.J., Lindner, B., Longtin, A.: Threshold fatigue and information transmission. J. Comput. Neurosci. 23, 301–311 (2007)
24. Clopath, C., Jolivet, R., Rauch, A., Luscher, H.R., Gerstner, W.: Predicting neuronal activity with simple models of the threshold type: adaptive exponential integrate-and-fire model with two compartments. Neurocomputing 70, 1168–1673 (2007)
25. Cox, D.R., Miller, H.D.: The Theory of Stochastic Processes. Chapman and Hall, London (1977)
26. Daniels, H.E.: The minimum of a stationary Markov process superimposed on a U-shaped trend. J. Appl. Probab. 6, 399–408 (1969)
27. Daniels, H.E.: Sequential tests constructed from images. Ann. Stat. 10, 394–400 (1982)
28. Di Crescenzo, A., Ricciardi, L.M.: On a discrimination problem for a class of stochastic processes with ordered first-passage-times. Appl. Stoch. Model Bus. Ind. 17, 205–219 (2001)
29. Di Crescenzo, A., Di Nardo, E., Ricciardi, L.M.: On certain bounds for first-crossing-time probabilities of a jump-diffusion process. Sci. Math. Jpn. 64(2), 449–460 (2006)
30. Ditlevsen, S., Ditlevsen, O.: Parameter estimation from observations of first-passage times of the Ornstein–Uhlenbeck process and the Feller process. Probabilist. Eng. Mech. 23, 170–179 (2008)
31. Ditlevsen, S., Lánský, P.: Estimation of the input parameters in the Ornstein–Uhlenbeck neuronal model. Phys. Rev. E 71, Art. No. 011907 (2005)
32. Ditlevsen, S., Lánský, P.: Estimation of the input parameters in the Feller neuronal model. Phys. Rev. E 73, Art. No. 061910 (2006)
33. Ditlevsen, S., Lánský, P.: Parameters of stochastic diffusion processes estimated from observations of first hitting-times: application to the leaky integrate-and-fire neuronal model. Phys. Rev. E 76, Art. No. 041906 (2007)
34. Ditlevsen, S., Lánský, P.: Comparison of statistical methods for estimation of the input parameters in the Ornstein–Uhlenbeck neuronal model from first-passage times data. In: Ricciardi, L.M., Buonocore, A., Pirozzi, E. (eds.) American Institute of Physics Proceedings Series, CP1028, Collective Dynamics: Topics on Competition and Cooperation in the Biosciences (2008)
35. Durbin, J.: Boundary crossing probabilities for the Brownian motion and Poisson processes and techniques for computing the power of the Kolmogorov Smirnov test. J. Appl. Probab. 8, 431–453 (1971)
36. Durbin, J.: The first-passage density of a continuous Gaussian process to a general boundary. J. Appl. Probab. 22(1), 99–122 (1985)
37. Durbin, J., Williams, D.: The first-passage density of the Brownian Motion process to a curved boundary. J. Appl. Probab. 29(2), 291–304 (1992)
38. Favella, L., Reineri, M.T., Ricciardi, L.M., Sacerdote, L.: First-passage-time problems and some related computational methods. Cybern. Syst. 13, 95–128 (1982)
39. Fortet, R.: Les fonctions aléatoires du type Markoff associées à certaines équations linéaires aux dérivées partielles du type parabolique. J. Math. Pure Appl. 22(9), 177–243 (1943)
40. Gerstein, G.L., Mandelbrot, B.: Random walk models for the spike activity of a single neuron. Biophys. J. 4, 41–68 (1964)
41. Gerstner, W., Kistler, W.M.: Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge University Press, Cambridge (2002)
42. Giorno, V., Nobile, A.G., Ricciardi, L.M., Sacerdote, L.: Some remarks on the Rayleigh process. J. Appl. Probab. 23, 398–408 (1986)
43. Giorno, V., Lánský, P., Nobile, A.G., Ricciardi, L.M.: Diffusion approximation and first-passage-time problem for a model neuron: III. A birth-and-death process approach. Biol. Cybern. 58, 387–404 (1988)
44. Giorno, V., Nobile, A.G., Ricciardi, L.M.: A symmetry based constructive approach to probability densities for one dimensional diffusion processes. J. Appl. Probab. 27, 707–721 (1989)
45. Giorno, V., Nobile, A.G., Ricciardi, L.M.: On the asymptotic behavior of first-passage-time densities for one dimensional diffusion processes and varying boundary. Adv. Appl. Probab. 22, 883–914 (1990)
46. Giorno, V., Nobile, A.G., Ricciardi, L.M.: Instantaneous return process and neuronal firings. In: Trappl, R. (ed.) Cybernetics and Systems Research 1992, pp. 829–836. World Scientific, New York (1992)
47. Giorno, V., Nobile, A.G., Ricciardi, L.M.C.: On the moments of firing numbers in diffusion neuronal models with refractoriness. In: Mira, J., Alvarez, J.R. (eds.) IWINAC 2005. Lecture Notes in Computer Sciences 3561, pp. 186–194. Springer, New York (2005)
48. Giraudo, M.T.: A similarity solution for the Ornstein–Uhlenbeck diffusion process constrained by a reflecting and an absorbing boundary. Ricerche Matemat. 49(1), 47–63 (2000)
49. Giraudo, M.T., Mininni, R., Sacerdote, L.: On the asymptotic behavior of the parameter estimators for some diffusion processes: application to neuronal models. Ricerche Matemat. 58(1), 103–127 (2009)
50. Giraudo, M.T., Sacerdote, L.: Some remarks on first-passage-time for jump-diffusion processes. In: Trappl, R. (ed.) Cybernetics and Systems ’96, pp. 518–523. University of Wien Press, Wien (1996)
51. Giraudo, M.T., Sacerdote, L.: Jump-diffusion processes as models for neuronal activity. BioSystems 40, 75–82 (1997)
52. Giraudo, M.T., Sacerdote, L.: Simulation methods in neuronal modeling. BioSystems 48, 77–83 (1998)
53. Giraudo, M.T., Sacerdote, L.: An improved technique for the simulation of first passage times for diffusion processes. Comm. Stat. Simulat. Comput. 28(4), 1135–1163 (1999)
54. Giraudo, M.T., Sacerdote, L., Zucca, C.: Evaluation of first passage times of diffusion processes through boundaries by means of a totally simulative algorithm. Meth. Comp. Appl. Probab. 3, 215–231 (2001)
55. Giraudo, M.T., Sacerdote, L., Sirovich, R.: Effects of random jumps on a very simple neuronal diffusion model. BioSystems 67, 75–83 (2002)
56. Giraudo, M.T.: An approximate formula for the first-crossing-time density of a Wiener process perturbed by random jumps. Stat. Probab. Lett. 79, 1559–1567 (2009)
57. Giraudo, M.T., Greenwood, P.E., Sacerdote, L.: How sample paths of Leaky Integrate and Fire models are influenced by the presence of a firing threshold. Neural Comput. 23(7), 1743–67 (2011)
58. Grün, S., Rotter, S.: Analysis of Parallel Spike Trains. Springer, New York (2010)
59. Gutierrez, R., Ricciardi, L.M., Román, P., Torres, F.: First-passage-time densities for time-non-homogeneous diffusion processes. J. Appl. Probab. 34, 623–631 (1997)
60. Hampel, D., Lánský, P.: On the estimation of refractory period. J. Neurosci. Meth. 171, 288–295 (2008)
61. Helias, M., Deger, M., Diesmann, M., Rotter, S.: Equilibrium and response properties of the integrate-and-fire neuron in discrete time. Front. Comput. Neurosci. 3, Article 29 (2010)
62. Hodgkin, A., Huxley, A.: A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544 (1952)
63. Honerkamp, J.: Stochastic Dynamical Systems: Concepts, Numerical Methods, Data Analysis. VCH, New York (1994)
64. Hopfner, R.: On a set of data for the membrane potential in a neuron. Math. Biosci. 207(2), 275–301 (2007)
65. Inoue, J., Sato, S., Ricciardi, L.M.: On the parameter estimation for diffusion models of single neuron’s activities. I. Application to spontaneous activities of mesencephalic reticular formation cells in sleep and waking states. Biol. Cybern. 73(3), 209–221 (1995)
66. Jahn, P., Berg, R.W., Hounsgaard, J., Ditlevsen, S.: Motoneuron membrane potentials follow a time inhomogeneous jump diffusion process. J. Comput. Neurosci. 31, 563–579 (2011)
67. Jolivet, R., Lewis, T.J., Gerstner, W.: Generalized integrate-and-fire models of neuronal activity. Approximate spike trains of a detailed model to a high degree of accuracy. J. Neurophysiol. 92(2), 959–976 (2004)
68. Jolivet, R., Roth, A., Schurmann, F., Gerstner, W., Senn, W.: Special issue on quantitative neuron modeling. Biol. Cybern. 99, 237–239 (2006)
69. Kallianpur, G.: On the diffusion approximation to a discontinuous model for a single neuron. In: Contributions to Statistics, pp. 247–258. North-Holland, Amsterdam (1983)
70. Karlin, S., Taylor, H.M.: A Second Course in Stochastic Processes. Academic, New York (1981)
71. Kistler, W.M., Gerstner, W., van Hemmen, J.L.: Reduction of the Hodgkin–Huxley equations to a single-variable threshold model. Neural Comput. 9(5), 1015–1045 (1997)
72. Kloeden, P., Platen, P.: The numerical solution of Stochastic differential equations. Springer, New York (1992)
73. Kobayashi, R., Tsubo, Y., Shinomoto, S.: Predicting spike times of any cortical neuron. Frontiers in Systems Neuroscience. Conference Abstract: Computational and systems neuroscience. doi: 10.3389/conf.neuro.06.2009.03.196 (2009)
74. Kobayashi, R., Tsubo, Y., Shinomoto, S.: Made-to-order spiking neuron model equipped with a multi-timescale adaptive threshold. Front. Comput. Neurosci. 3, Article 9 (2009)
75. Lánský, P.: Inference for the diffusion models of neuronal activity. Math. Biosci. 67, 247–260 (1983)
76. Lánský, P., Lánská, V.: Diffusion approximations of the neuronal model with synaptic reversal potentials. Biol. Cybern. 56, 19–26 (1987)
77. Lánský, P., Rodriguez, R.: Coding range of a two-compartmental model of a neuron. Biol. Cybern. 81, 161 (1999)
78. Lánský, P., Sacerdote, L., Tomassetti, F.: On the comparison of Feller and Ornstein–Uhlenbeck models for neural activity. Biol. Cybern. 73, 457–465 (1995/1996)
79. Lánský, P., Sacerdote, L.: The Ornstein–Uhlenbeck neuronal model with signal-dependent noise. Phys. Lett. A 285, 132–140 (2001)
80. Lánský, P., Sato, S.: The stochastic diffusion models of nerve membrane depolarization and interspike interval generation. J. Periph. Nerv. Syst. 4, 27–42 (1999)
81. Lánský, P., Musila, M.: Generalized Stein’s model for anatomically complex neurons. Biosystems 25, 179–191 (1991)
82. Lánská, V., Lánský, P., Smith, C.E.: Synaptic transmission in a diffusion model for neural activity. J. Theor. Biol. 166, 393–406 (1994)
83. Lánský, P., Sanda, P., He, J.F.: The parameters of the stochastic leaky integrate-and-fire neuronal model. J. Comput. Neurosci. 21, 211–223 (2006)
84. Lánský, P., Ditlevsen, S.: A review of the methods for signal estimation in stochastic diffusion leaky integrate-and-fire neuronal models. Biol. Cybern. 99, 253–262 (2008)
85. Lapicque, L.: Recherches quantitatives sur l’excitation électrique des nerfs traitée comme une polarisation. J. Physiol. Pathol. Gen. 9, 620–635 (1907)
86. Lerche, H.R.: Boundary Crossing of Brownian Motion. Lecture Notes in Statistics, vol. 40. Springer, Heidelberg (1986)
87. Lindner, B., Chacron, M.J., Longtin, A.: Integrate and fire neurons with threshold noise: A tractable model of how interspike interval correlations affect neuronal signal transmission. Phys. Rev. E 72, 021911 (2005)
88. Marpeau, F., Barua, A., Josic, K.: A finite volume method for stochastic integrate-and-fire models. J. Comput. Neurosci. 26, 445–457 (2009)
89. Mullowney, P., Iyengar, S.: Parameter estimation for a leaky integrate-and-fire neuronal model from ISI data. J. Comput. Neurosci. 24, 179–194 (2008)
90. Nobile, A.G., Ricciardi, L.M., Sacerdote, L.: Exponential trends of first passage time densities for a class of diffusion processes with steady-state distribution. J. Appl. Probab. 22, 611–618 (1985)
91. Nobile, A.G., Ricciardi, L.M., Sacerdote, L.: Exponential trends of Ornstein–Uhlenbeck first-passage-time densities. J. Appl. Probab. 22, 360–369 (1985)
92. Nobile, A.G., Pirozzi, E., Ricciardi, L.M.: On the estimation of first-passage time densities for a class of Gauss–Markov processes. In: Diaz, M. (ed.) EUROCAST 2007. LNCS 4739, pp. 146–153. Springer, Berlin (2007)
93. Pakdaman, K., Mestivier, D.: External noise synchronizes forced oscillators. Phys. Rev. E 64, 030901 (2001)
94. Paninski, L., Haith, A., Szirtes, G.: Integral equation methods for computing likelihoods and their derivatives in the stochastic integrate and fire model. J. Comput. Neurosci. 24, 69–79 (2008)
95. Pawlas, Z., Klebanov, L.B., Prokop, M., Lánský, P.: Parameters of spike trains observed in a short time window. Neural Comput. 20, 1325–1343 (2008)
96. Peskir, G.: Limit at zero of the Brownian first-passage density. Probab. Theor. Relat. Field 124, 100–111 (2002)
97. Picchini, U., Lánský, P., De Gaetano, A., Ditlevsen, S.: Parameters of the diffusion leaky integrate-and-fire neuronal model for a slowly fluctuating signal. Neural Comput. 20, 2696–2714 (2008)
98. Plesser, H.E., Diesmann, M.: Simplicity and efficiency of integrate-and-fire neuron models. Neural Comput. 21, 353–359 (2009)
99. Ricciardi, L.M.: On the transformation of Diffusion Processes into the Wiener process. J. Math. Anal. Appl. 54(1), 185–199 (1976)
100. Ricciardi, L.M.: Diffusion Processes and Related Topics in Biology. Lecture Notes in Biomathematics, vol. 14. Springer, Berlin (1977)
101. Ricciardi, L.M., Sacerdote, L.: The Ornstein–Uhlenbeck process as a model for neuronal activity. Biol. Cybern. 35, 1–9 (1979)
102. Ricciardi, L.M., Sacerdote, L., Sato, S.: Diffusion approximation and first passage time problem for a model neuron II. Outline of a computational method. Math. Biosci. 64, 29–44 (1983)
103. Ricciardi, L.M., Sacerdote, L., Sato, S.: On an integral equation for first-passage-time probability densities. J. Appl. Probab. 21(2), 302–314 (1984)
104. Ricciardi, L.M., Sato, S.: Diffusion processes and first-passage-time problems. In: Ricciardi, L.M. (ed.) Lectures Notes in Biomathematics and Informatics. Manchester University Press, Manchester (1989)
105. Ricciardi, L.M., Di Crescenzo, A., Giorno, V., Nobile, A.G.: On the instantaneous return process for neuronal diffusion models. In: Marinaro, M., Scarpetta, G. (eds.) Structure: From Physics to General Systems, pp. 78–94. World Scientific, New York (1992)
106. Ricciardi, L.M., Di Crescenzo, A., Giorno, V., Nobile, A.: An outline of theoretical and algorithmic approaches to first passage time problems with applications to biological modeling. Math. Jpn. 50(2), 247–321 (1999)
107. Ricciardi, L.M., Esposito, G., Giorno, V., Valerio, C.: Modeling neuronal firing in the presence of refractoriness. In: Mira, J., Alvarez, J.R. (eds.) IWANN 2003. Lecture Notes in Computer Sciences 2686, pp. 1–8. Springer, New York (2003)
108. Rodriguez, R., Lánský, P.: Two-compartment stochastic model of a neuron with periodic input. Lecture Notes in Computer Science 1606 “Foundations and Tools for Neural Modeling (IWANN’99)”. Springer, New York (1999)
109. Rodriguez, R., Lánský, P.: A simple stochastic model of spatially complex neurons. Biosystems 58, 49 (2000)
110. Rodriguez, R., Lánský, P.: Effect of spatial extension on noise-enhanced phase locking in a leaky integrate-and-fire model of a neuron. Phys. Rev. E 62, 8427 (2001)
111. Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes and Martingales. Wiley Series in Probability and Mathematical Statistics, New York (1987)
112. Román, P., Serrano, J.J., Torres, F.: First-passage-time location function: Application to determine first-passage-time densities in diffusion processes. Comput. Stat. Data Anal. 52, 4132–4146 (2008)
113. Sacerdote, L.: Asymptotic behavior of Ornstein–Uhlenbeck first-passage-time density through boundaries. Appl. Stoch. Mod. Data Anal. 6, 53–57 (1988)
114. Sacerdote, L.: On the solution of the Fokker–Planck equation for a Feller process. Adv. Appl. Probab. 22(1), 101–110 (1990)
115. Sacerdote, L., Ricciardi, L.M.: On the transformation of diffusion equations and boundaries into the Kolmogorov equation for the Wiener process. Ricerche Matemat. 41(1), 123–135 (1992)
116. Sacerdote, L., Sirovich, R.: Multimodality of the interspike interval distribution in a simple jump-diffusion model. Scientiae Mathematicae Japonicae Online 8, 359–374 (2003)
117. Sacerdote, L., Sirovich, R.: Noise induced phenomena in jump-diffusion models for single neuron spike activity. IJCNN Proc., Budapest (2004)
118. Sacerdote, L., Smith, C.E.: New parameter relationships determined via stochastic ordering for spike activity in a reversal potential model. BioSystems 58, 59–65 (2000)
119. Sacerdote, L., Smith, C.E.: A qualitative comparison of some diffusion models for neural activity via stochastic ordering. Biol. Cybern. 83(6), 543–551 (2000)
120. Sacerdote, L., Smith, C.E.: Almost sure comparisons for first passage times of diffusion processes through boundaries. Meth. Comput. Appl. Probab. 6(3), 323–341 (2004)
121. Sacerdote, L., Tomassetti, F.: On evaluations and asymptotic approximations of first-passage-time probabilities. Adv. Appl. Probab. 28(1), 270–284 (1996)
122. Sacerdote, L., Zucca, C.: Threshold shape corresponding to a Gamma firing distribution in an Ornstein–Uhlenbeck neuronal model. Scientiae Mathematicae Japonicae 58(2), 295–30 (2003)
123. Sacerdote, L., Zucca, C.: On the relationship between interspikes interval distribution and boundary shape in the Ornstein–Uhlenbeck neuronal model. In: Capasso, V. (ed.) Mathematical Modelling and Computing in Biology and Medicine, pp. 161–168. The MIRIAM Project Series, Progetto Leonardo, Esculapio Pub. Co., Bologna (2003)
124. Sacerdote, L., Villa, A.E.P., Zucca, C.: On the classification of experimental data modeled via a stochastic leaky integrate and fire model through boundary values. Bull. Math. Biol. 68(6), 1257–1274 (2006)
125. Sato, S.: Evaluation of the first-passage time probability to a square root boundary for the Wiener process. J. Appl. Probab. 14(4), 850–856 (1977)
126. Sato, S.: Note on the Ornstein–Uhlenbeck process model for stochastic activity of a single neuron. Lect. Note Biomath. 70, 146–156 (1987)
127. Segundo, J., Vibert, J.-F., Pakdaman, K., Stiber, M., Diez Martinez, O.: Noise and the neuroscience: a long history, a recent revival and some theory. In: Pribram, K.H. (ed.) Origins: Brain & Self Organization. Erlbaum, Hillsdale, NJ (1994)
128. Shimokawa, T., Pakdaman, K., Sato, S.: Time-scale matching in the response of a leaky integrate-and-fire neuron model to periodic stimulus with additive noise. Phys. Rev. E 59, 3427–3443 (1999)
129. Siegert, A.J.F.: On the first passage time probability problem. Phys. Rev. 81, 617–623 (1951)
130. Sirovich, R.: Mathematical models for the study of synchronization phenomena in neuronal networks. Ph.D. Thesis, University of Torino and Université de Grenoble (2006)
131. Stein, R.B.: A theoretical analysis of neuronal variability. Biophys. J. 5, 385–386 (1965)
132. Taillefumier, T., Magnasco, M.O.: A fast algorithm for the first-passage times of Gauss–Markov processes with Hölder continuous boundaries. J. Stat. Phys. 140, 1130–1156 (2010)
133. Tuckwell, H.C.: Introduction to Theoretical Neurobiology. Linear Cable Theory and Dendritic Structure, vol. 1. Cambridge University Press, Cambridge (1988)
134. Tuckwell, H.C.: Introduction to Theoretical Neurobiology. Nonlinear and Stochastic Theories, vol. 2. Cambridge University Press, Cambridge (1988)
135. Uhlenbeck, G.E., Ornstein, L.S.: On the theory of Brownian motion. Phys. Rev. 36, 823–841 (1930)
136. Wang, L., Potzelberger, K.: Boundary crossing probability for Brownian motion and general boundaries. J. Appl. Probab. 34, 54–65 (1997)
137. Zhang, X., You, G., Chen, T., Feng, J.K.: Maximum likelihood decoding of neuronal inputs from an interspike interval distribution. Neural Comput. 19(4), 1319–1346 (2009)
138. Zucca, C., Sacerdote, L.: On the inverse first-passage-time problem for a Wiener process. Ann. Appl. Probab. 19(4), 1319–1346 (2009)
Chapter 6
Stochastic Partial Differential Equations in Neurobiology: Linear and Nonlinear Models for Spiking Neurons

Henry C. Tuckwell
Abstract Stochastic differential equation (SDE) models of nerve cells for the most part neglect the spatial dimension. Including the latter leads to stochastic partial differential equations (SPDEs) which allow for the inclusion of important variations in the densities of ion channels. In the first part of this work, we briefly consider representations of neuronal anatomy in the context of linear SPDE models on line segments with one and two components. Such models are reviewed and analytical methods illustrated for finding solutions as series of Ornstein–Uhlenbeck processes. However, only nonlinear models exhibit natural spike thresholds and admit traveling wave solutions, so the rest of the article is concerned with spatial versions of the two most studied nonlinear models, the Hodgkin–Huxley system and the FitzHugh–Nagumo approximation. The ion currents underlying neuronal spiking are first discussed and a general nonlinear SPDE model is presented. Guided by recent results for noise-induced inhibition of spiking in the corresponding system of ordinary differential equations, in the spatial Hodgkin–Huxley model, excitation is applied over a small region and the spiking activity observed as a function of mean stimulus strength with a view to finding the critical values for repetitive firing. During spiking near those critical values, noise of increasing amplitudes is applied over the whole neuron and over restricted regions. Minima have been found in the spike counts which parallel results for the point model and which have been termed inverse stochastic resonance. A stochastic FitzHugh–Nagumo system is also described and results given for the probability of transmission along a neuron in the presence of noise.
H.C. Tuckwell, Max Planck Institute for Mathematics in the Sciences, Inselstr. 22, 04103 Leipzig, Germany
M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3_6, © Springer-Verlag Berlin Heidelberg 2013
6.1 Introduction

Fundamental observations on neurons include the factors that determine their spiking behaviour, which is usually related to information processing in the nervous system. The need for stochastic, as opposed to deterministic, modeling in neurobiology arose from several fundamental experimental observations, made between 1946 and 1976, of the activity of diverse nerve-cell or nerve-cell-related systems. Neurons usually emit spikes, or action potentials, when they receive sufficient excitatory stimuli in a sufficiently short time interval; an example of neuronal spiking is shown in Fig. 6.1. Early observations included variability in the time intervals between neuronal spikes (the interspike interval, or ISI), random postsynaptic potentials at the neuromuscular junction, the opening and closing of single ion channels, and random fluctuations in electroencephalographic recordings (see [21, 44] for historical references). The first appearance of classical random walk theory and Brownian motion in neuronal modeling came in Gerstein and Mandelbrot’s pioneering work [10]. This was soon followed by the introduction of the Ornstein–Uhlenbeck process as a neuronal model [11], for which the aspects of the first-passage time problem relevant to neuronal spiking were solved by Roy and Smith [32]. From that time on the primary focus has been on models of neuronal activity in which the whole neuronal anatomy, including soma, dendrites and axon, is collapsed into a single point, so to speak. It is surprising that this approximation has been pursued for so long, although it has often succeeded in predicting neuronal responses with considerable accuracy. The reason for the persistent use of point models has probably been the additional mathematical complexity of partial differential equations (PDEs) compared with ordinary differential equations (ODEs). See Chap. 5 for a review of single point neuronal models.
The deterministic spatial Hodgkin–Huxley (HH) system, consisting of the cable PDE and three auxiliary differential equations, is one of the most successful mathematical models in physiology. In reality, linear stochastic cable models are not much more complicated than the corresponding point models. Indeed, if solutions are found by simulation, the same could be said of nonlinear models such as that of HH, although more computing time is required. The main advantage of the spatial models is that more realistic distributions of ion channels, including those related to synaptic input, may be incorporated. Distinguishing locations for various ion channel types has important consequences. For example, in serotonergic neurons, low-threshold calcium current channels are thought to be mainly somatic, whereas high-threshold calcium channels, which in turn activate calcium-gated potassium conductance, occur mainly on dendrites [3]. Neurons, especially those in the mammalian central nervous system, often receive many thousands of synaptic inputs from many different sources, and each source has a different spatial distribution pattern [29, 53]. On the other hand, the disadvantage of spatial models is that knowledge of many more parameters is required, many of which can at best only be approximately estimated.
Fig. 6.1 Action potentials recorded intracellularly from a neuron in the prefrontal cortex of rat under local electrical stimulation. Adapted from [34]. Note that spikes appear at an approximate threshold voltage of −62 mV
6.2 Linear SPDE Neuronal Models: A Brief Summary

A general linear PDE model for nerve membrane potential, called a cable equation, takes the form

\[
c_m \frac{\partial V}{\partial t} = \frac{1}{r_i}\frac{\partial^2 V}{\partial x^2} - \frac{V}{r_m} + I(x,t), \qquad 0 < x < l, \quad t > 0, \tag{6.1}
\]

where the symbols and their units are as follows:

$x$ = distance from left-hand end point in cm
$t$ = time in seconds
$V(x,t)$ = depolarization from rest at $(x,t)$ in volts
$l$ = length of cable in cm
$r_i$ = resistance per unit length of internal medium (cytoplasm) in ohms/cm
$r_m$ = membrane resistance of unit length times unit length in ohms cm
$c_m$ = membrane capacitance per unit length in farads/cm
$I$ = $I(x,t)$ = applied current density in amperes/cm
Fig. 6.2 Showing the anatomy of a pyramidal cell from rat cerebral cortex. Adapted from [33]. The cell body (soma) has a diameter of about 20 μm

However, it is simpler mathematically to use units of time and space called the membrane time constant $\tau_m = c_m r_m$ and characteristic length $\lambda = (r_m/r_i)^{1/2}$, respectively, so that the above equation becomes, still using $x$, $t$ for the space and time variables (see Sect. 4.4 of [42], which also contains historical references),

\[
\frac{\partial V}{\partial t} = \frac{\partial^2 V}{\partial x^2} - V + \frac{I(x,t)}{\bar{c}_m}, \qquad 0 < x < L, \quad t > 0, \tag{6.2}
\]

where $V$ is in volts and $\bar{c}_m = \lambda c_m$ is the capacitance in farads of a characteristic length. Note that with this scaling the time and space variables are dimensionless. $L = l/\lambda$ is called the electrotonic length of the cable, and the units for $I$ are now coulombs. Usually the constant $\bar{c}_m$ is set at unity, as it simply scales the input and, since the system is linear, similarly scales the response. The interval of definition as well as boundary and initial conditions are naturally required to determine specific solutions.
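The change of units described above amounts to a few lines of arithmetic, sketched here in Python. The numerical values of r_m, r_i, c_m and the cable length are hypothetical and chosen only to give round numbers; the function name and the dictionary keys are likewise assumptions of this example, and the capacitance of a characteristic length is taken to be the product of the characteristic length and c_m.

```python
import math

def cable_scales(r_m, r_i, c_m, length):
    """Scaling constants for the cable equation.

    r_m    : membrane resistance of unit length times unit length (ohm cm)
    r_i    : internal resistance per unit length (ohm/cm)
    c_m    : membrane capacitance per unit length (farad/cm)
    length : physical cable length l (cm)
    """
    tau_m = c_m * r_m              # membrane time constant (seconds)
    lam = math.sqrt(r_m / r_i)     # characteristic length (cm)
    return {
        "tau_m": tau_m,
        "lambda": lam,
        "L": length / lam,         # electrotonic length (dimensionless)
        "c_bar": lam * c_m,        # capacitance of one characteristic length (farads)
    }

# Hypothetical parameter values, for illustration only
scales = cable_scales(r_m=1.0e5, r_i=1.0e7, c_m=1.0e-7, length=0.15)
print(scales)
```

With these illustrative numbers a cable of physical length 0.15 cm is 1.5 characteristic lengths long, and one unit of dimensionless time corresponds to 10 ms.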
6.2.1 Geometrical or Anatomical Considerations

Most neurons consist of a cell body (soma), dendritic tree(s) and an axon, which usually also branches prolifically. These structures are illustrated in Fig. 6.2, which depicts a pyramidal cell of rat sensorimotor cortex.
There is a great variety of sizes and forms of neurons. Most neurons in the mammalian brain are classified as excitatory or inhibitory, and the majority of the former are pyramidal cells, of which there are many forms (see [16] and the review of Spruston [36]). A review of inhibitory cell types can be found in [27]. The soma is pivotal in the sense that it is, roughly speaking, the part of the cell that separates the input and output components. Many somas, however, are sites of synaptic input. Spikes which are transmitted along an axon usually originate at or near the soma.

There are 9 basic geometrical and biophysical configurations which may be employed to roughly represent a neuron’s anatomy for modeling with differential equations. Many of these are sketched in Fig. 6.3.

(i) A single point. Somewhat surprisingly, this, which is equivalent to space-clamping, is the most frequent representation of a neuron’s geometry! ODEs are employed, but despite the simplicity, predicting details of neuronal spiking analytically is difficult. The popular model involving an Ornstein–Uhlenbeck process is still an active area in theoretical neurobiology [5, 6, 54].
(ii) A single line segment. Assuming cylindrical symmetry, the line segment represents a nerve cylinder, which may be an axon or an isolated dendritic segment, or part thereof. This is probably quite accurate for such preparations as the squid axon. At one end, a point soma can, as a crude approximation, be represented with a sealed-end condition.
(iii) Line segment plus lumped soma. A soma is often represented by a resistance $R_s$ and capacitance $C_s$ in parallel which are attached to the dendritic compartment. Such a soma circuit is referred to as a lumped (point) soma.
The remaining 6 configurations are (iv) no axon, point soma and dendritic tree, (v) no axon, lumped soma and dendritic tree, (vi) simple axon, point soma and dendritic tree, (vii) simple axon, lumped soma and dendritic tree, (viii) branched axon, point soma and dendritic tree, and lastly (ix) branched axon, lumped soma and dendritic tree(s). The last configuration contains the most anatomical and biophysical reality but is hampered by a large number of constraints.
Fig. 6.3 Several of the geometrical forms for representing nerve cells

6.2.1.1 Reduction to a One-Dimensional Cable

If, as is most often the case, a neuron has many dendritic trunks and an axon, each of which branches many times, then there are three methods of handling the geometrical (anatomical) details.

1. Use a cable equation for each segment.
2. Assume little spatial variation of potential etc. over each segment and use an ODE for the electrical potential on each segment. This is the approach used by many software packages.
3. Use a mapping from the neuronal branching structure to a cylinder and thus reduce the multi-segment problem to that of a single segment, giving a cable equation in one space dimension.

Most modeling studies ignore spatial extent altogether, and many of those that include spatial extent do not include a soma and hardly ever an axon. The reason is of course that the inclusion of all three major neuronal components, soma, axon and dendrites, makes for a complicated system of equations and boundary conditions.
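The second method listed above can be sketched as a chain of coupled ODEs, one per segment. The following Python fragment is only a minimal illustration and not the scheme of any particular software package: the dimensionless passive membrane, the coupling constant `g_c`, the forward-Euler step and the function names are all assumptions introduced here.

```python
def step_compartments(V, I, g_c, dt):
    """One forward-Euler step for a chain of passive compartments.

    Assumed model (dimensionless, for illustration only):
        dV_i/dt = -V_i + g_c * (V_{i-1} - 2 V_i + V_{i+1}) + I_i
    The end compartments see a mirrored neighbour, i.e. sealed ends.
    """
    n = len(V)
    new_V = []
    for i in range(n):
        left = V[i - 1] if i > 0 else V[i]        # no flux through the left end
        right = V[i + 1] if i < n - 1 else V[i]   # no flux through the right end
        coupling = g_c * (left - 2.0 * V[i] + right)
        new_V.append(V[i] + dt * (-V[i] + coupling + I[i]))
    return new_V


def run_compartments(n, I, g_c=5.0, dt=0.01, t_end=20.0):
    """Integrate n compartments from rest (V = 0) up to time t_end."""
    V = [0.0] * n
    for _ in range(int(round(t_end / dt))):
        V = step_compartments(V, I, g_c, dt)
    return V
```

With a uniform input every compartment relaxes to the same level, whereas an input confined to one end produces a potential that decays with distance from the input site, which is the spatial effect that a single-point model cannot represent.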
6.2.2 Simple Linear SPDE Models

Many versions of the input current density $I(x,t)$ in the form of random processes were summarized in Chap. 9 of [43]. For discrete inputs of strengths $a_i$ at space points $x_i$, $i = 1, \ldots, n$, arriving at the times of events in the counting processes $N_i$, we have

\[ I_1(x,t) = \sum_{i=1}^{n} \delta(x - x_i)\, a_i \frac{dN_i}{dt} \tag{6.3} \]

where $\delta(\cdot)$ is Dirac's delta function, $a_i > 0$ for an excitatory input and $a_i < 0$ for an inhibitory input. More commonly, when the $N_i$'s are Poisson processes, then, if the $|a_i|$'s are small enough and the associated frequencies $\lambda_i$ are large enough, a diffusion approximation (with no implied limiting procedure) may be employed so that

\[ I_2(x,t) = \sum_{i=1}^{n} \delta(x - x_i) \left[ a_i \lambda_i + \sqrt{\lambda_i a_i^2}\, \frac{dW_i}{dt} \right] \tag{6.4} \]

where the $W_i$'s are standard Wiener processes, such that $W(t)$ has mean 0 and variance $t$, see Chap. 1. To simplify, the Poisson processes in (6.3) and corresponding Wiener processes in (6.4) are assumed to be independent. With the forms $I_1$ or $I_2$ for the current density, the SPDE, in conjunction with a threshold condition for firing, is the spatial version of the commonly used stochastic leaky integrate and fire models. The method of separation of variables can be used on finite intervals (e.g. $[0, L]$) to obtain an infinite series representation for $V$. Let the Green's function for the cable equation with given boundary conditions be

\[ G(x,y;t) = \sum_{k} \phi_k(x)\,\phi_k(y)\, e^{-\mu_k^2 t} \tag{6.5} \]

where $\{\phi_k\}$ are spatial eigenfunctions and $\{\mu_k^2\}$ are the corresponding eigenvalues. Then, for example, the solution of the cable equation with multiple white noise inputs can be written

\[ V(x,t) = \sum_{i=1}^{n} \sum_{k} V_{ki}(t)\, \phi_k(x) \tag{6.6} \]

where for each $k$ and $i$

\[ dV_{ki} = \left[ -\mu_k^2 V_{ki} + a_i \lambda_i \phi_k(x_i) \right] dt + \phi_k(x_i) \sqrt{\lambda_i a_i^2}\, dW_i. \tag{6.7} \]

That is, each process $V_{ki}$ is an Ornstein–Uhlenbeck process, those carrying different $i$ indices being statistically independent. Moments of $V$ can be readily determined analytically and simulation is a useful method for estimating firing times. Simulation methods for these systems were given in [44, 50].
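As an illustration of how moments of a single mode can be obtained, the following Python sketch evaluates the mean and variance of one Ornstein–Uhlenbeck component. It assumes that the drift is $a_i \lambda_i \phi_k(x_i)$ and that the noise coefficient is $\phi_k(x_i)\sqrt{\lambda_i a_i^2}$, i.e. the projection of the input (6.4) onto the $k$-th eigenfunction; the function name and argument names are hypothetical.

```python
import math


def ou_mode_moments(t, mu_k_sq, a_i, lam_i, phi_k_xi, v0=0.0):
    """Mean and variance at time t of one modal Ornstein-Uhlenbeck process.

    Assumed SDE: dV = (-mu_k_sq * V + drift) dt + noise dW, with
      drift = a_i * lam_i * phi_k_xi
      noise = phi_k_xi * sqrt(lam_i * a_i**2)
    and deterministic initial value v0.
    """
    drift = a_i * lam_i * phi_k_xi
    noise = phi_k_xi * math.sqrt(lam_i * a_i ** 2)
    decay = math.exp(-mu_k_sq * t)
    mean = v0 * decay + (drift / mu_k_sq) * (1.0 - decay)
    var = noise ** 2 / (2.0 * mu_k_sq) * (1.0 - decay ** 2)
    return mean, var
```

For large $t$ the mean tends to drift divided by the decay rate and the variance to the squared noise coefficient divided by twice the decay rate, so higher modes, which have larger $\mu_k^2$, contribute progressively less.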
6.2.2.1 Commonly Employed Boundary Conditions

For cables on $[0, L]$ there are two simple sets of boundary conditions usually considered. Firstly, the cable may be assumed to have sealed ends so that

\[ V_x(0,t) = V_x(L,t) = 0, \tag{6.8} \]

where subscripts denote partial differentiation. The remaining case of interest is that of killed ends

\[ V(0,t) = V(L,t) = 0. \tag{6.9} \]

For sealed ends the eigenvalues are

\[ \mu_n^2 = 1 + n^2\pi^2/L^2, \quad n = 0, 1, \ldots \tag{6.10} \]

and the normalized (to unity) eigenfunctions are $\phi_0(x) = 1/\sqrt{L}$ and

\[ \phi_n(x) = \sqrt{2/L}\, \cos(n\pi x/L), \quad n = 1, 2, \ldots. \tag{6.11} \]

In the killed ends case, the eigenvalues are

\[ \mu_n^2 = 1 + n^2\pi^2/L^2, \quad n = 1, 2, \ldots \tag{6.12} \]

and the normalized eigenfunctions are

\[ \phi_n(x) = \sqrt{2/L}\, \sin(n\pi x/L), \quad n = 1, 2, \ldots. \tag{6.13} \]
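The eigenfunctions above can be checked numerically. The Python sketch below, which uses only the standard library, evaluates them, verifies orthonormality with a simple midpoint rule, and forms a truncated version of the Green's function series for sealed ends. It assumes that the decay rate of mode $n$ is $1 + n^2\pi^2/L^2$; the function names, the quadrature rule and the truncation level are choices made here for illustration.

```python
import math


def eigenvalue(n, L):
    """Decay rate of mode n, assumed to be 1 + (n*pi/L)**2."""
    return 1.0 + (n * math.pi / L) ** 2


def phi_sealed(n, x, L):
    """Normalized eigenfunction for sealed ends (n = 0, 1, 2, ...)."""
    if n == 0:
        return 1.0 / math.sqrt(L)
    return math.sqrt(2.0 / L) * math.cos(n * math.pi * x / L)


def phi_killed(n, x, L):
    """Normalized eigenfunction for killed ends (n = 1, 2, ...)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def inner_product(f, g, L, m=2000):
    """Integral of f*g over [0, L] by the composite midpoint rule."""
    h = L / m
    return h * sum(f((j + 0.5) * h) * g((j + 0.5) * h) for j in range(m))


def green_sealed(x, y, t, L, n_terms=200):
    """Truncated eigenfunction series for the sealed-end Green's function."""
    return sum(phi_sealed(n, x, L) * phi_sealed(n, y, L)
               * math.exp(-eigenvalue(n, L) * t) for n in range(n_terms))
```

Integrating the truncated series over $y$ leaves only the $n = 0$ term, so the result should be close to $e^{-t}$ for any $x$, which gives a quick consistency check on an implementation.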
6.2.2.2 Inclusion of Synaptic Reversal Potentials

In the above model the response to an excitation or inhibition is always of the same magnitude, regardless of the potential when the input arrives. In reality, since synaptic potentials are generated by ion currents whose components have specific Nernst potentials (see [42], page 48), the response to a synaptic input depends on the prior potential. This aspect was introduced in point models in [39] for the Poisson case and for the diffusion approximation in [15]. Inclusion of reversal potentials in the cable model gives the following SPDE for $n_E$ excitatory inputs at the space points $x_{E,j}$ arriving at the event times of $N_{E,j}$ and $n_I$ inhibitory inputs at the points $x_{I,k}$ at the times of events in $N_{I,k}$:

\[ \frac{\partial V}{\partial t} = -V + \frac{\partial^2 V}{\partial x^2} + \sum_{j=1}^{n_E} a_{E,j}\, \delta(x - x_{E,j})(V_E - V) \frac{dN_{E,j}}{dt} - \sum_{k=1}^{n_I} a_{I,k}\, \delta(x - x_{I,k})(V - V_I) \frac{dN_{I,k}}{dt}. \tag{6.14} \]
Here the quantities $a_{E,j} > 0$ and $a_{I,k} > 0$ determine the magnitudes of the responses to synaptic inputs. It is assumed that all excitatory inputs have the reversal potential $V_E$ and all inhibitory inputs have reversal potential $V_I$, though this is a simplification. If the processes $N_{E,j}$ and $N_{I,k}$ have mean rates $\lambda_{E,j}$ and $\lambda_{I,k}$, respectively, and are independent, then a diffusion approximation can be constructed with the SPDE

\[
\begin{aligned}
\frac{\partial V}{\partial t} = {} & -V + \frac{\partial^2 V}{\partial x^2} + \sum_{j=1}^{n_E} a_{E,j} \lambda_{E,j}\, \delta(x - x_{E,j})(V_E - V) \\
& + \sum_{j=1}^{n_E} \delta(x - x_{E,j}) \sqrt{a_{E,j}^2 \lambda_{E,j} (V - V_E)^2}\, \frac{dW_{E,j}}{dt} \\
& - \sum_{k=1}^{n_I} a_{I,k} \lambda_{I,k}\, \delta(x - x_{I,k})(V - V_I) \\
& + \sum_{k=1}^{n_I} \delta(x - x_{I,k}) \sqrt{a_{I,k}^2 \lambda_{I,k} (V - V_I)^2}\, \frac{dW_{I,k}}{dt}
\end{aligned}
\tag{6.15}
\]

where the $W_{E,j}$ and $W_{I,k}$ are (possibly) independent standard Wiener processes. Both the discontinuous model and its diffusion approximation will be the subject of future investigations.
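The dependence of the synaptic response on the prior potential can be made concrete with a small numerical illustration. The Python sketch below treats a single input site in isolation; the jump rule $a(V_{rev} - V)$, the drift $a\lambda(V_{rev} - V)$ and the noise amplitude $\sqrt{a^2 \lambda (V - V_{rev})^2}$ follow the form of the terms above, while the function names and the numerical values in the usage note are hypothetical.

```python
import math


def jump_response(V, a, V_rev):
    """Change in potential caused by one synaptic event of strength a > 0."""
    return a * (V_rev - V)


def diffusion_terms(V, a, lam, V_rev):
    """Drift and noise amplitude of one input in the diffusion approximation."""
    drift = a * lam * (V_rev - V)
    noise = math.sqrt(a ** 2 * lam * (V - V_rev) ** 2)
    return drift, noise
```

For example, with an assumed excitatory reversal potential of 100 (in units of depolarization) and $a = 0.1$, an event arriving at $V = 0$ produces a jump of 10, whereas the same event arriving at $V = 50$ produces a jump of only 5. Both the drift and the noise amplitude vanish as $V$ approaches the reversal potential.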
6.2.2.3 Two-Parameter White Noise Input

Of interest is the case of uniform two-parameter white noise $w(x,t)$ so that

\[ I_3(x,t) = a + b\, w(x,t), \tag{6.16} \]

where $\{w(x,t),\ x \in [0, L],\ t \ge 0\}$ is a space-time white noise with covariance function

\[ \mathrm{Cov}[w(x,s), w(y,t)] = \delta(x - y)\, \delta(s - t). \tag{6.17} \]

The solution has the decomposition

\[ V(x,t) = \sum_{k} V_k(t)\, \phi_k(x) \tag{6.18} \]

which involves Ornstein–Uhlenbeck processes which are all statistically independent, satisfying the SDEs

\[ dV_k = \left[ a_k - \mu_k^2 V_k \right] dt + b\, dW_k, \tag{6.19} \]
where

\[ a_k = a \int_0^L \phi_k(y)\, dy \tag{6.20} \]

and the one-parameter Wiener processes are defined by

\[ W_k(t) = \int_0^L \int_0^t \phi_k(y)\, w(y,s)\, ds\, dy. \tag{6.21} \]
Analytical and simulation approaches for finding the statistical properties of V and firing times were reported in [49].
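To indicate how such statistical properties follow from the modal decomposition, the Python sketch below computes the stationary mean and variance of $V$ for sealed ends. It assumes the sealed-end eigenfunctions $\phi_0 = 1/\sqrt{L}$, $\phi_n = \sqrt{2/L}\cos(n\pi x/L)$ with decay rates $1 + n^2\pi^2/L^2$, and uses the fact that independent Ornstein–Uhlenbeck modes with noise coefficient $b$ have stationary variance $b^2/(2\mu_k^2)$; the function names and the truncation level are choices made here.

```python
import math


def a_k_sealed(k, a, L):
    """Projection of the constant input a onto the k-th sealed-end mode."""
    # Only the constant mode k = 0 has a non-zero integral over [0, L].
    return a * math.sqrt(L) if k == 0 else 0.0


def stationary_mean_sealed(a, L):
    """Stationary mean of V: only the k = 0 mode (decay rate 1) contributes."""
    return a_k_sealed(0, a, L) * (1.0 / math.sqrt(L)) / 1.0


def stationary_variance_sealed(x, b, L, n_terms=20000):
    """Stationary variance of V(x, t): sum of b^2 * phi_k(x)^2 / (2 mu_k^2)."""
    total = (1.0 / L) / 2.0                      # k = 0 term
    for n in range(1, n_terms):
        mu_sq = 1.0 + (n * math.pi / L) ** 2
        phi = math.sqrt(2.0 / L) * math.cos(n * math.pi * x / L)
        total += phi * phi / (2.0 * mu_sq)
    return b * b * total
```

The series converges slowly (terms decay like $1/n^2$), so a large number of terms is kept. The variance is largest at the sealed ends and is symmetric about the midpoint of the cable.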
6.2.2.4 Synaptic Input as a Mixture of Jump and Diffusion Processes

For many neurons, some excitatory postsynaptic potentials have large amplitudes, of order a few to several millivolts, whereas others may be much smaller. Under these conditions a model may be constructed in which some inputs are represented by discontinuous (jump) processes which are not well approximated by diffusions, whereas smaller amplitude inputs are amenable to such approximations. This aspect was introduced in point models in [40]. Inclusion in the spatial model with reversal potentials is immediate by taking linear combinations of input terms from the jump and diffusion cases as follows

\[
\begin{aligned}
\frac{\partial V}{\partial t} = {} & -V + \frac{\partial^2 V}{\partial x^2} + \sum_{j=1}^{n_E} a_{E,j}\, \delta(x - x_{E,j})(V_E - V) \frac{dN_{E,j}}{dt} \\
& - \sum_{k=1}^{n_I} a_{I,k}\, \delta(x - x_{I,k})(V - V_I) \frac{dN_{I,k}}{dt} \\
& + \sum_{j=1}^{n'_E} a'_{E,j} \lambda'_{E,j}\, \delta(x - x'_{E,j})(V_E - V) \\
& + \sum_{j=1}^{n'_E} \delta(x - x'_{E,j}) \sqrt{a'^{\,2}_{E,j} \lambda'_{E,j} (V - V_E)^2}\, \frac{dW'_{E,j}}{dt} \\
& - \sum_{k=1}^{n'_I} a'_{I,k} \lambda'_{I,k}\, \delta(x - x'_{I,k})(V - V_I) \\
& + \sum_{k=1}^{n'_I} \delta(x - x'_{I,k}) \sqrt{a'^{\,2}_{I,k} \lambda'_{I,k} (V - V_I)^2}\, \frac{dW'_{I,k}}{dt},
\end{aligned}
\tag{6.22}
\]
where unprimed quantities refer to the large amplitude inputs and the primed quantities to those for which a diffusion approximation is appropriate.
6.2.3 Two-Component Linear SPDE Systems

The above SPDE containing derivatives of counting processes, such as Poisson processes, has solutions with discontinuities due to the impulsive nature of the derivatives $dN_i/dt$. In real neurons the arrival of, for example, an excitatory synaptic potential is signalled by a rapid but smooth increase in membrane potential, followed by an approximately exponential decay. In order to give a more realistic representation, the time derivative of the current density is set to

\[ \frac{\partial I}{\partial t} = -\alpha I + a_E \frac{\partial^2 N_E}{\partial x\, \partial t} - a_I \frac{\partial^2 N_I}{\partial x\, \partial t}, \tag{6.23} \]

where $N_E$ and $N_I$ are two-parameter counting processes (e.g. Poisson processes), not necessarily independent, so that $I$ has discontinuities but $V$ itself is smooth (continuous). Here $a_E, a_I \ge 0$. In real neurons, rates and amplitudes will, naturally, vary in space and time and amplitudes may be random. However, for simplicity it is assumed that all excitatory events cause $I$ to increase locally by $a_E$ and all inhibitory events result in a jump down in $I$ of magnitude $a_I$. The rates may also be assumed constant so that $\lambda_E$ and $\lambda_I$ are the mean numbers of excitatory and inhibitory events, respectively, per unit area in the $(x,t)$-plane. Because the system is linear and homogeneous, the statistical properties are relatively straightforward to determine analytically. A simpler model with an Ornstein–Uhlenbeck current at a point was analyzed in [51]. Put $K = a_E \lambda_E - a_I \lambda_I$, which is the mean net synaptic drive. With sealed ends it is readily shown that the mean depolarization is

\[ E[V(x,t)] = \frac{K}{\alpha} \left[ 1 - e^{-t} + \frac{1}{1 - \alpha} \left( e^{-t} - e^{-\alpha t} \right) \right], \quad \alpha \ne 1. \tag{6.24} \]

This is the same as the result for the infinite cable, giving a mean which is independent of position. In the case of killed ends the mean voltage along the cable is given by

\[ E[V(x,t)] = \frac{4K}{\pi \alpha} \sum_{n=1}^{\infty} \frac{\sin(n\pi x/L)}{n} \left[ \frac{1 - e^{-\mu_n^2 t}}{\mu_n^2} - \frac{e^{-\alpha t} - e^{-\mu_n^2 t}}{\mu_n^2 - \alpha} \right], \tag{6.25} \]

where summation is over odd values of $n$ only and $\mu_n^2 = 1 + n^2\pi^2/L^2$. A diffusion approximation may be employed here so that, assuming the postsynaptic potential amplitudes and the rates are constant,

\[ \frac{\partial I}{\partial t} = -\alpha I + K + \sigma\, w(x,t) \tag{6.26} \]
where $\sigma = \sqrt{a_E^2 \lambda_E + a_I^2 \lambda_I}$. Many statistical properties of the corresponding membrane potential $V$ were obtained, and the effects on the interspike interval (ISI) distribution of various spatial distributions of synaptic input, based on data for cortical pyramidal cells, were determined [45, 46]. With excitation only, the ISI distribution is unimodal with a decaying exponential appearance and with a large coefficient of variation. As inhibition near the soma grows, two striking effects emerge. The ISI distribution shifts first to bimodal and then to unimodal with an approximately Gaussian shape with a concentration at large intervals. At the same time the coefficient of variation of the ISI drops dramatically to less than 1/5 of its value without inhibition.
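The expressions for the mean depolarization in this subsection are easy to evaluate numerically. The Python sketch below does so under two assumptions that are stated here rather than taken from the text: the potential obeys the dimensionless cable equation $V_t = -V + V_{xx} + I$, and the decay rate of the $n$-th killed-end mode is $1 + n^2\pi^2/L^2$. The function names and the truncation level `n_max` are hypothetical.

```python
import math


def mean_v_sealed(t, K, alpha):
    """Mean depolarization for sealed ends; requires alpha != 1."""
    return (K / alpha) * (1.0 - math.exp(-t)
                          + (math.exp(-t) - math.exp(-alpha * t)) / (1.0 - alpha))


def mean_v_killed(x, t, K, alpha, L, n_max=2001):
    """Mean depolarization for killed ends, summing over odd n up to n_max.

    Assumes alpha differs from every decay rate 1 + (n*pi/L)**2.
    """
    total = 0.0
    for n in range(1, n_max + 1, 2):
        rate = 1.0 + (n * math.pi / L) ** 2
        e_rate = math.exp(-rate * t)
        bracket = (1.0 - e_rate) / rate - (math.exp(-alpha * t) - e_rate) / (rate - alpha)
        total += math.sin(n * math.pi * x / L) / n * bracket
    return 4.0 * K / (math.pi * alpha) * total
```

As $t$ grows, the sealed-end mean tends to $K/\alpha$ at every position, whereas the killed-end mean tends to the steady solution of $V'' - V + K/\alpha = 0$ with zero boundary values, which is $(K/\alpha)[1 - \cosh(x - L/2)/\cosh(L/2)]$. Comparing the truncated series with this expression is a convenient check.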
6.3 Nonlinear Models for Spiking Neurons

In 1952 Hodgkin and Huxley set forth a dynamical system of PDEs to describe action potential generation and propagation in the squid giant axon. From the time of that original work to the present day, their method of dividing the membrane current into a capacitative and ionic component has provided the basis for mathematical models of many nerve cells, as exemplified recently by spatial models of thalamic pacemaker cells [31], and paraventricular neurons [22]. The capacitative component of the current is always assumed to have the simple form $C\, \partial V/\partial t$. The main principle that emerges from all these works is that to quantify the ionic components, each ion-channel type is represented by an activation variable $m$ and, if appropriate, an inactivation variable $h$. The current density for each channel usually has the form $J = g_{\max} m^n h (V - V_i)$ where $g_{\max}$, which may depend on position, is the conductance available with all channels open, although sometimes the constant field form is more accurate [4]. An important effect is the modulation of conductances by various biochemical mechanisms [23]. In some cases, as for example the L-type calcium current, the inclusion of a calcium-dependent inactivation variable is often important [22, 31].
6.3.1 The Ionic Currents Underlying Neuronal Spiking

The original HH-system for squid contained only sodium, potassium and leak currents so that the total ionic current was

\[ I_{ion} = I_{Na} + I_K + I_{leak}. \tag{6.27} \]

Furthermore, in squid axon the distribution of the corresponding ion channels was assumed to be spatially uniform. Models of motoneurons [7, 37] and cortical pyramidal cells [19, 26] have also contained only these three components, but with
varying channel densities over the neuron's surface. However, in the last three decades it has become apparent that one needs to consider many ion channels apart from the original sodium and potassium channels in the HH-model. For example, it is now known that calcium currents are important in the spiking activity of most, if not all, CNS neurons [21, 38]. Calcium currents, which are themselves voltage-gated, not only contribute directly to membrane currents, but also cause increases in intracellular calcium concentration. Such inward currents cause changes in calcium concentration-dependent conductances, of which an important example is the calcium-activated potassium current $I_{KCa}$. To describe nerve cell activity with a degree of biophysical reality it is therefore frequently essential to take into account calcium dynamics. The latter entails buffering, sequestration, diffusion and pumping or active transport; see for example [4]. Unfortunately from the point of view of mathematical modeling, the number of ion-channel types is enormous, there being 10 types of calcium channel (with many subtypes) [8] and 40 types of potassium channel [14]. By 1997 at least 40 types of ion channel had been found in nerve terminals alone [30]. See [25] for an earlier yet classic summary of channel types in various mammalian nerve cells.
6.3.2 A General SPDE for Nerve Membrane Potential

A general HH-type electrophysiological model with the addition of synaptic and applied inputs in the form of an SPDE for the membrane potential $V(x,t)$ on a segment of a nerve can be described in one space dimension, assuming approximately cylindrical geometry. In most cases a neuronal cell body with dendritic trees and an axon will be represented by a collection of such segments and sometimes a special equation (probably an ODE) for the somatic component. Thus, in the case of a real neuron, many boundary conditions will need to be satisfied. Including only purely voltage-dependent channels, on each segment we have

\[ \frac{\partial V}{\partial t} = \frac{\partial^2 V}{\partial x^2} + \sum_{i=1}^{n} g_{i,\max}\, m_i^{p_i} h_i^{q_i} (V_i - V) + I_{syn} + I_{app} \tag{6.28} \]

\[ \frac{\partial m_i}{\partial t} = \alpha_{m_i}(V)(1 - m_i) - \beta_{m_i}(V)\, m_i \tag{6.29} \]

\[ \frac{\partial h_i}{\partial t} = \alpha_{h_i}(V)(1 - h_i) - \beta_{h_i}(V)\, h_i \tag{6.30} \]

\[ I_{syn} = \sum_{k} a_k\, \delta(x - x_k)(V_k - V) \frac{dN_k}{dt} \tag{6.31} \]

where there are $n$ distinct types of ion channel. Here $I_{syn}$ is synaptic input occurring at space points $x_k$ with reversal potentials $V_k$ and amplitudes $a_k$ according to the
point processes $N_k$, and $I_{app}$ is any experimentally applied current. The $i$-th channel type has maximal conductance density $g_{i,\max}$ and the corresponding activation and inactivation variables are $m_i$ and $h_i$, respectively. If there is no inactivation, as is often the case, such as for some high voltage threshold calcium channels and the potassium delayed rectifier channels, then $q_i$ can be set at zero. For a leak current, which may have more than one component, both $p_i$ and $q_i$ can be set to zero. The maximal conductances and the synaptic and applied currents are space- and possibly time-dependent. Calcium-dependence has been omitted because of the complication that some calcium currents and not others are involved in the activation of, for example, potassium channels. Calcium dynamics has been taken into account in many different ways, even for the same neuron type [28, 31]. To illustrate, the L-type calcium channel has calcium-dependent inactivation, so if the internal calcium concentration is $Ca_i$ and the external calcium concentration is $Ca_o$, then all deterministic formulations of the L-type calcium current employed in modeling to date are included in the general form

\[ I_{CaL} = m^{p_1}(V,t)\, h^{p_2}(V,t)\, f(Ca_i, t)\, F(V, Ca_i, Ca_o), \tag{6.32} \]

where $m(V,t)$ is the voltage-dependent activation variable, $h(V,t)$ is the voltage-dependent inactivation variable and $f(Ca_i, t)$ is the (internal) calcium-dependent inactivation variable. The factor $F$ contains membrane biophysical parameters and is either of the Ohmic form used in the original HH model, or the constant-field form, often called the Goldman–Hodgkin–Katz form [43].
6.4 Stochastic Spatial Hodgkin–Huxley Model

Recent studies of the HH-system of ODEs with stochastic input have revealed interesting phenomena which have a character opposite to that of stochastic resonance. In the latter, there is a noise level at which some response variable achieves a maximum. In the HH-system, at mean input current densities near the critical value (about 6.4 $\mu$A/cm$^2$) for repetitive firing, it was found that noise could strongly inhibit spiking. Furthermore, there occurred, for given mean current densities, a minimum in the firing rate as the noise level increased from zero [52]. It is of interest to see if these phenomena extend to the spatial HH-system which we describe forthwith. Historically, a study of the properties of the output spike train of an HH cable with Poisson inputs was described in [12] and simulations of random channel openings were considered in [35]. The simulations below were performed using an Euler method (see [46] and Sect. 1.7.1 in Chap. 1) whose accuracy was confirmed by comparison with analytical results. The following system of differential equations was proposed [17] to describe the evolution in time and space of the depolarization $V$ in the squid giant axon:

\[ C_m \frac{\partial V}{\partial t} = \frac{a}{2R_i} \frac{\partial^2 V}{\partial x^2} + \bar{g}_K n^4 (V_K - V) + \bar{g}_{Na} m^3 h (V_{Na} - V) + g_l (V_l - V) + I(x,t) \tag{6.33} \]

\[ \frac{\partial h}{\partial t} = \alpha_h(V)(1 - h) - \beta_h(V)\, h \tag{6.34} \]

\[ \frac{\partial m}{\partial t} = \alpha_m(V)(1 - m) - \beta_m(V)\, m \tag{6.35} \]

\[ \frac{\partial n}{\partial t} = \alpha_n(V)(1 - n) - \beta_n(V)\, n. \tag{6.36} \]
Here $C_m$, $\bar{g}_K$, $\bar{g}_{Na}$, $g_l$, and $I(x,t)$ are respectively the membrane capacitance, maximal potassium conductance, maximal sodium conductance, leak conductance and applied current density per unit area (1 cm$^2$). $R_i$ is the intracellular resistivity and $a$ is the fiber radius. The variables $n$, $m$ and $h$ are the potassium activation, sodium activation and sodium inactivation variables and their evolution is determined by the voltage-dependent coefficients

\[ \alpha_n(V) = \frac{10 - V}{100\left[ e^{(10 - V)/10} - 1 \right]}, \quad \beta_n(V) = \frac{1}{8} e^{-V/80} \tag{6.37} \]

\[ \alpha_m(V) = \frac{25 - V}{10\left[ e^{(25 - V)/10} - 1 \right]}, \quad \beta_m(V) = 4 e^{-V/18} \tag{6.38} \]

\[ \alpha_h(V) = \frac{7}{100} e^{-V/20}, \quad \beta_h(V) = \frac{1}{e^{(30 - V)/10} + 1}. \tag{6.39} \]
The following standard parameter values are employed: $a = 0.0238$, $R_i = 34.5$, $C_m = 1$, $\bar{g}_K = 36$, $\bar{g}_{Na} = 120$, $g_l = 0.3$, $V_K = -12$, $V_{Na} = 115$ and $V_l = 10$. For the initial values, $V(0) = 0$ and for the auxiliary variables the equilibrium values are used, for example $n(0) = \alpha_n(0)/(\alpha_n(0) + \beta_n(0))$. The units for these various quantities are as follows: all times are in msec, all voltages are in mV, all conductances per unit area are in mS/cm$^2$, $R_i$ is in ohm-cm, $C_m$ is in $\mu$F/cm$^2$, distances are in cm, and current density is in microamperes/cm$^2$. Note that with the standard parameters, the HH-model does not act as a spontaneous pacemaker. One may turn the HH neuron into a spontaneously firing cell by shifting, for example, the half activation potential to $-30.5$ mV from about $-28.4$ mV (assumed resting at $-55$ mV) whereupon there is a threshold for repetitive spiking around $+1.8$ nA (hyperpolarizing). Then for the HH system of ODEs, similar phenomena, including inverse stochastic resonance, are found with noise as with the standard parameter set. This robustness is expected to apply also to the spatial model as discussed below.
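The rate coefficients and the equilibrium initial values can be written down directly in code. The Python sketch below is an illustration using only the standard library; the function names are hypothetical, and the explicit handling of the removable singularities at $V = 10$ and $V = 25$ is an implementation detail added here.

```python
import math


def alpha_n(V):
    x = 10.0 - V
    if abs(x) < 1e-7:
        return 0.1                      # limiting value as V -> 10
    return x / (100.0 * (math.exp(x / 10.0) - 1.0))


def beta_n(V):
    return 0.125 * math.exp(-V / 80.0)


def alpha_m(V):
    x = 25.0 - V
    if abs(x) < 1e-7:
        return 1.0                      # limiting value as V -> 25
    return x / (10.0 * (math.exp(x / 10.0) - 1.0))


def beta_m(V):
    return 4.0 * math.exp(-V / 18.0)


def alpha_h(V):
    return 0.07 * math.exp(-V / 20.0)


def beta_h(V):
    return 1.0 / (math.exp((30.0 - V) / 10.0) + 1.0)


def equilibrium(V=0.0):
    """Equilibrium values (n, m, h) of the auxiliary variables at potential V."""
    n = alpha_n(V) / (alpha_n(V) + beta_n(V))
    m = alpha_m(V) / (alpha_m(V) + beta_m(V))
    h = alpha_h(V) / (alpha_h(V) + beta_h(V))
    return n, m, h
```

Calling `equilibrium(0.0)` gives the resting values used as initial conditions, approximately $n(0) \approx 0.318$, $m(0) \approx 0.053$ and $h(0) \approx 0.596$.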
Fig. 6.4 The number of spikes $N$ on $(0, L)$ at $t = 160$ is plotted against the level of excitation $\mu$ in the absence of noise. The dashed curve is for the smaller region of excitation to $x_1 = 0.1$ whereas the solid curve is for $x_1 = 0.2$. Notice the abrupt increases in spike rates at values of $\mu$ close to the bifurcation to repetitive firing, being about 6.1 for $x_1 = 0.2$ and 6.5 for $x_1 = 0.1$
6.4.1 Noise-Free Excitation

We firstly consider the HH-system with a constant input current over a small interval so that $I(x,t) = \mu(x,t)$ where

\[ \mu(x,t) = \mu > 0, \quad 0 \le x \le x_1 \le L, \quad t > 0, \tag{6.40} \]

and zero current density elsewhere. The initial condition for $V$ is resting level and the auxiliary variables have their corresponding equilibrium values. The length was set at $L = 6$ cm. With $x_1 = 0.2$ the response for $\mu = 4$ is a solitary spike. With $\mu = 6$ a doublet of spikes propagates along the nerve cylinder and beyond some critical value of $\mu$ there ensues a train of regularly spaced spikes, as for example with $\mu = 7.5$. The latter case corresponds to repetitive and periodic firing in the HH-system of ODEs. In order to quantify the spiking activity, the maximum number $N$ of spikes on $(0, 6)$ is found and Fig. 6.4 shows the dependence of $N$ on the mean input current density, $\mu$, for the two values $x_1 = 0.1$ and $x_1 = 0.2$. For $\mu < 2$ no spikes occurred for either value of $x_1$. A solitary spike emerged for $\mu \ge 2$ and when $\mu$ reached 6 in the case of $x_1 = 0.2$ and 6.5 in the case of $x_1 = 0.1$, a doublet arose and propagated along the cylinder. For slightly greater values of $\mu$, there was an abrupt increase in the number of spikes, indicating that a bifurcation had occurred (see [52] for an explanation of such phenomena). Subsequently the number of spikes reached a plateau and when $\mu$ reached 9, the largest value considered here, the number of spikes was 11 for both values of $x_1$. In consideration of the behavior of the HH system of ODEs with noise, it is then of interest to examine the effects of noise on the spike counts near the bifurcation point for the PDE case.
6.4.2 Stochastic Stimulation

The HH-system of PDEs was therefore considered with applied currents of the following form

\[ I(x,t) = \mu(x,t) + \sigma(x,t)\, w(x,t) \tag{6.41} \]

on a cylindrical nerve cell extending from $x = 0$ to $x = L$, where $\mu(x,t)$ is as above and for the random component

\[ \sigma(x,t) = \sigma > 0, \quad 0 \le x_2 \le x \le x_3 \le L, \quad t > 0, \tag{6.42} \]

and zero elsewhere. Here $\{w(x,t),\ x \in [0, L],\ t \ge 0\}$ is a two-parameter white noise with covariance function

\[ \mathrm{Cov}[w(x,s), w(y,t)] = \delta(x - y)\, \delta(t - s), \tag{6.43} \]

$\mu(x,t)$ and $\sigma(x,t)$ being deterministic functions specifying the mean and variance of the noisy input. The numerical integration of the resulting stochastic HH system of PDEs is performed by discretization using an explicit method whose accuracy has been verified by comparison with analytical results in similar systems [46], there being no available analytical results for the HH model.

Figure 6.5 shows examples of the effects of noise with the following parameters: $\mu = 6.7$, $x_1 = 0.1$, $x_2 = 0$, and $x_3 = L = 6$. The records show the membrane potential as a function of $x$ at $t = 160$. In the top record there is no noise and there are 9 spikes. In the middle two records, with a noise level of $\sigma = 0.1$ there is a significant diminution of the spiking activity, with only 1 spike in one case and 3 in the other. With the noise turned up to $\sigma = 0.3$ (bottom record) the number of spikes is greater, there being 6 in the example shown.

Fig. 6.5 Showing the effects of noise on spiking for mean current densities near the bifurcation to repetitive spiking. Parameters are $\mu = 6.7$, $x_1 = 0.1$, $x_2 = 0$, and $x_3 = L = 6$. In the top record with no noise there is repetitive firing which, as shown in the second and third records, is strongly inhibited by a relatively small noise of amplitude $\sigma = 0.1$. A larger noise amplitude $\sigma = 0.3$ leads to much less inhibition

With $x_1 = 0.1$, mean spike counts were obtained at various $\sigma$ for $\mu = 5$, 6.7 and 7. The first of these values is less than the critical value for repetitive firing (see Fig. 6.4) and the other two are close to and just above the critical value. Relatively small numbers of trials were performed as integration of the PDEs naturally takes much longer than the ODEs. Hence, the number of trials in the following is 25, which is a small statistical sample, but is sufficient to show the main effects. Figure 6.6 shows plots of mean spike counts, $E[N]$, as explained above, versus noise level. For $\mu = 5$, $E[N]$ increases monotonically as $\sigma$ increases from 0 to 0.3. When $\mu = 6.7$, which is very close to the critical value for repetitive firing, a small amount of noise causes a substantial decrease in firing (cf. Fig. 6.5) with the appearance of a minimum near $\sigma = 0.1$. For $\mu = 7$, where indefinite repetitive firing occurs without noise, a similar reduction in firing activity occurs at all values of $\sigma$ up to the largest value employed, 0.3. Furthermore, a minimum in mean spike count also occurs near $\sigma = 0.1$, a phenomenon referred to previously as inverse stochastic resonance [13].

Fig. 6.6 Mean number of spikes as a function of noise level for various values of the mean level of excitation $\mu$ with excitation on $(0, 0.1)$. The bottom curve is for a value of $\mu$ well below the critical value at which repetitive firing occurs. 95% confidence limits are indicated

In some trials, secondary phenomena were observed as in the FitzHugh–Nagumo (FN) system [46]. An example of what might be called an anomalous case occurred for $x_1 = 0.1$, with the mean excitation level $\mu = 5$ below the threshold for repetitive firing and noise of amplitude $\sigma = 0.3$ extending along the whole cable. A single spike emerges from the left hand end. By $t = 32$ a pair of spikes is seen to emerge at $x \approx 5$, one traveling towards the emerging spike and one to the right. Not long after $t = 80$ the left-going secondary spike collides with the emerging right-going spike and these spikes annihilate each other. Thus, the spike count on $(0, L)$ ends up at 0 at $t = 160$ due to interference between a noise-generated spike and the spike elicited by the deterministic excitation.

With $x_1 = 0.2$, mean spike counts were similarly obtained with various noise amplitudes for $\mu = 5$, 6.2 and 6.5. Again, the first of these values is less than the critical value for repetitive firing (see Fig. 6.4) and the other two are close to and just above the critical value. As $\sigma$ varied, the spiking activity behaved similarly to that for $x_1 = 0.1$. Thus, these findings of inverse stochastic resonance parallel those found for the HH system of ODEs and although there is no standard bifurcation analysis for the PDE system, it is probable that most of the arguments which apply to the system of ODEs apply to the PDEs.

It was also found that noise over the small region to $x = 0.05$ reduces the mean spike count by 48% and when the extent of the noise is to $x \approx 0.1$ the mean spike count drops to about one third of its value without noise. Thus, there is only a small further reduction in spiking when the noise extends to the whole interval. Similar results were obtained for $x_1 = 0.2$. Thus, noise over even a small region where the excitation occurs may inhibit partially or completely the emergence of spikes from a trigger zone just as or almost as effectively as noise along the whole extent of the neuron. Surprisingly, with $x_1 = 0.1$, $x_2 = 0.1$ and $x_3 = 0.2$ so that the small noise patch was just to the right of the excitatory stimulus, no reduction in spike count occurred. Thus, noise at the site of the excitation causes a significant reduction in spike count, but noise with the same magnitude and extent but disjoint from the region of excitation has, at least in the cases examined, no effect. However, a small amount of interference with the outgoing spike train did occur when the noise amplitude was stronger at $\sigma = 0.3$ with $x_1 = 0.1$ and the noise was on $(0.1, 0.2)$. Such interference is probably due to a different mechanism from switching the system from one attractor (firing regularly) to another (a stable point) and possibly is due to the instigation of secondary wave
phenomena as described above in the anomalous case and in the FN system [46]. For more details of the effects of noise on the instigation and propagation of spikes in the spatial HH system, including both means and variances of spike numbers, see [47, 48].
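To make the explicit discretization concrete, the following Python sketch integrates the spatial HH system with an Euler scheme on a uniform grid. It is a minimal illustration under several assumptions that are not taken from the text: sealed ends are imposed by reflection, the noise acts along the whole cable, the space-time white noise is approximated on each grid cell by a Gaussian variate scaled by $\sqrt{\Delta t/\Delta x}$, and the standard parameter values are inserted exactly as listed, without any unit conversion. All names are hypothetical.

```python
import math
import random

# Standard parameter values as listed in the text (inserted as given).
A, R_I, C_M = 0.0238, 34.5, 1.0
G_K, G_NA, G_L = 36.0, 120.0, 0.3
V_K, V_NA, V_L = -12.0, 115.0, 10.0


def _ratio(x, scale):
    """x / (scale * (exp(x/10) - 1)), with its limit 10/scale at x = 0."""
    if abs(x) < 1e-7:
        return 10.0 / scale
    return x / (scale * (math.exp(x / 10.0) - 1.0))


def rates(V):
    """Return (alpha_n, beta_n, alpha_m, beta_m, alpha_h, beta_h) at V."""
    return (_ratio(10.0 - V, 100.0), 0.125 * math.exp(-V / 80.0),
            _ratio(25.0 - V, 10.0), 4.0 * math.exp(-V / 18.0),
            0.07 * math.exp(-V / 20.0),
            1.0 / (math.exp((30.0 - V) / 10.0) + 1.0))


def simulate_hh_cable(mu, sigma, x1, L=1.0, dx=0.1, dt=0.01, t_end=5.0, seed=0):
    """Euler integration; returns the final potential profile and the peak potential."""
    rng = random.Random(seed)
    n_pts = int(round(L / dx)) + 1
    an, bn, am, bm, ah, bh = rates(0.0)
    V = [0.0] * n_pts                       # resting initial condition
    n = [an / (an + bn)] * n_pts            # equilibrium auxiliary variables
    m = [am / (am + bm)] * n_pts
    h = [ah / (ah + bh)] * n_pts
    coupling = A / (2.0 * R_I * dx * dx)
    noise_scale = sigma * math.sqrt(dt / dx)
    peak = max(V)
    for _ in range(int(round(t_end / dt))):
        V_new = V[:]
        for i in range(n_pts):
            # Reflection at the ends gives zero spatial derivative (sealed ends).
            left = V[i - 1] if i > 0 else V[1]
            right = V[i + 1] if i < n_pts - 1 else V[n_pts - 2]
            ionic = (G_K * n[i] ** 4 * (V_K - V[i])
                     + G_NA * m[i] ** 3 * h[i] * (V_NA - V[i])
                     + G_L * (V_L - V[i]))
            drive = mu if i * dx <= x1 + 1e-12 else 0.0
            dV = dt * (coupling * (left - 2.0 * V[i] + right) + ionic + drive)
            dV += noise_scale * rng.gauss(0.0, 1.0)
            V_new[i] = V[i] + dV / C_M
            an, bn, am, bm, ah, bh = rates(V[i])
            n[i] += dt * (an * (1.0 - n[i]) - bn * n[i])
            m[i] += dt * (am * (1.0 - m[i]) - bm * m[i])
            h[i] += dt * (ah * (1.0 - h[i]) - bh * h[i])
        V = V_new
        peak = max(peak, max(V))
    return V, peak
```

A spike count such as $N$ could then be obtained by counting the local maxima of the returned profile that exceed a chosen threshold. The short cable and coarse grid used as defaults are only for quick experimentation and are far smaller than would be needed to reproduce the results reported above.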
6.5 A Stochastic Spatial FitzHugh–Nagumo System

The FN system has long been employed as a simplification of the HH model as it shares many of its properties and only has two rather than four components. Hence we here briefly discuss the FN spatial model with noise. See also Chap. 7 for a probabilistic treatment of the FN system. In one space dimension, the FN model can be written, using subscript notation for partial differentiation,

\[ u_t = D_1 u_{xx} + \mu\, u(u - a)(1 - u) - \lambda v + I(x,t), \tag{6.44} \]

\[ v_t = D_2 v_{xx} + \epsilon' [u - pv + b], \quad 0 < x < L, \quad t > 0, \tag{6.45} \]

where the voltage variable is $u(x,t)$ and a recovery variable is $v(x,t)$. The quantities $\mu$, $\lambda$, $\epsilon'$, $p$, $D_1$ and $D_2$ are positive and usually taken to be constants, although they could vary with both $x$ and $t$. The parameter $b$ can be positive, negative or zero. The applied current or input signal $I(x,t)$ may be due to external or intrinsic sources. In the majority of applications $D_1 = 1$ and $D_2 = 0$.
6.5.1 The Effect of Noise on the Probability of Transmission

One aspect of interest is the effect of noise on the propagation of an action potential. To this end we consider an FN system with the original parameterization [9]

\[ u_t = u_{xx} + u - \frac{u^3}{3} - v + I(x,t) \tag{6.46} \]

\[ v_t = 0.08\,(u - 0.8 v + 0.7) \tag{6.47} \]

and let

\[ I(x,t) = \sigma(x)\, w(x,t), \tag{6.48} \]

that is, driftless white noise with amplitude $\sigma(x)$ which may depend on position. Sealed end conditions are employed. In order to start an action potential we apply a current $J$ at $x = 0$ for $0 < t \le t^*$. The boundary conditions are thus
\[ u_x(0,t) = J, \quad 0 < t \le t^*, \tag{6.49} \]

\[ u_x(0,t) = 0, \quad t > t^*, \tag{6.50} \]

\[ u_x(L,t) = 0, \quad t > 0. \tag{6.51} \]

For initial conditions for the general system of SPDEs $u_t = D_1 u_{xx} + f(u,v) + \sigma w(x,t)$ and $v_t = D_2 v_{xx} + g(u,v)$ we choose suitable equilibrium values

\[ u(x,0) = u^*, \quad v(x,0) = v^*, \quad 0 < x < L, \tag{6.52} \]

where $u^*$ and $v^*$ satisfy

\[ f(u^*, v^*) = 0, \quad g(u^*, v^*) = 0. \tag{6.53} \]
For the original FN model, these equilibrium values are u D 1:1994 and v D 0:6243, being the unique real solution of u u3 =3 v D 0 and 0:08.u 0:8v C 0:7/ D 0. With this paramaterization an action potential (solitary wave solution) has a speed of about 0.8 space units per time unit. If a space unit is chosen to represent 0.1 mm and a time unit is chosen as 0.04 msec, then the speed of propagation is 2 m/s which is close to the value for unmyelinated fibers. We set the overall length at 50 space units or 5 mm to be near the order of magnitude of real neurons. Suppose that there is a noisy background throughout the length of the nerve so that .x/ D , a constant. Often a solitary wave passes fairly well unchanged but this may not be so even for small amplitude noise over larger distances. When D 0:225 or 0:25 several outcomes are possible. The wave may pass fairly well unchanged, or the wave progresses some distance but dies due to noise interference, or sometimes, as seen above in the HH model, a subsidiary noise-induced wave starts at the right hand end and propagates towards the oncoming wave induced by the end current. Annihilation of the original right-going wave occurs when it collides with the backtravelling wave and the result is a failure of propagation. The dependence on noise amplitude of the probability of successful transmission, ptrans ./, of an action potential, instigated at x D 0, was investigated for both uniform noise on .0; L/, where L D 5 mm, and for noise restricted to 2:5 x 3:5. The results are shown in Fig. 6.7 where probability of transmission is plotted against . The results for uniform noise (blue crosses) can be divided into two regimes, to the left and right of the point P. The rate of decline of ptrans as increases from 0.12 to about 0.26 is slower by a factor of about 4.5 than that as increases from 0.26 to 0.40. Examination of the sample paths showed that there are two kinds of transmission failure. 
One is due to purely noise interference occurring at the smaller values of $\sigma$ and resulting in the annihilation of the traveling wave. The other occurs when the noise itself starts a secondary disturbance of sufficient magnitude that it may grow into a substantial response, which may take the form of another wave or multiple waves. Sometimes the original wave almost dies and noise leads to its revival as a secondary wave. When the noise was restricted to be over a small space interval, transmission failure occurred usually by interference by noise rather than secondary phenomena. That is, the original travelling wave found it difficult to traverse the noisy patch. See [46] for further results and discussion.
170
H.C. Tuckwell
Fig. 6.7 The probability of transmission of the action potential versus noise amplitude. Blue crosses are for uniform noise whereas red crosses are for the case of noise restricted to a small region, based on 100 trials per point. The point P demarcates, for the uniform case, the regime for smaller $\sigma$ where the noise essentially annihilates the oncoming wave from the regime for larger $\sigma$ where the noise is sufficiently strong to give rise to non-local, large, often disruptive responses.
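The equilibrium values and the unit conversion quoted above can be checked numerically. The following is a minimal sketch, assuming the nullcline equations $u - u^3/3 - v = 0$ and $0.08(u - 0.8v + 0.7) = 0$ as stated in the text; the variable names are illustrative only.

```python
import numpy as np

# Nullclines of the original FN model as quoted in the text:
#   u - u**3/3 - v = 0   and   0.08*(u - 0.8*v + 0.7) = 0.
# Eliminating v = (u + 0.7)/0.8 leaves the cubic u**3/3 + 0.25*u + 0.875 = 0.
roots = np.roots([1.0 / 3.0, 0.0, 0.25, 0.875])
real_roots = roots[np.abs(roots.imag) < 1e-9].real   # keep the real root only
u_eq = float(real_roots[0])
v_eq = (u_eq + 0.7) / 0.8

# Unit conversion for the propagation speed quoted in the text.
speed_units = 0.8      # space units per time unit
space_unit_mm = 0.1    # one space unit, in mm
time_unit_ms = 0.04    # one time unit, in ms
speed_m_per_s = speed_units * space_unit_mm / time_unit_ms  # mm/ms equals m/s

print(round(u_eq, 4), round(v_eq, 4), speed_m_per_s)
```

The cubic has a single real root, which is why the resting state is unique for this parameterization.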
6.6 Discussion
We have presented several SPDE models in neurobiology, with focus on single neurons. In Chap. 8 SPDE models are presented for the cerebro-cortical phenomenon of spreading depression. Although the mathematics involved in establishing existence and describing properties of the solutions to such SPDEs is highly abstract [2, 20], simulation techniques, which may be explicit or implicit, enable one to determine many statistical properties relevant to electrophysiological investigations. For the HH system of PDEs, Horikawa [18] undertook a study of the effects of noise on nerve impulse propagation using one of the stochastic models proposed in [41]. In the present article, for the HH PDE model we also have focused on additive noise, but another somewhat different approach is to consider noisy ion channel dynamics, theoretically explored in [1]. Our main finding was that the phenomenon of inverse stochastic resonance, recently elaborated on for the HH system of ODEs, does occur in the HH SPDE system as well. Although noise along the whole neuron was found to suppress spiking near the critical levels of mean excitation for repetitive spiking, with a concomitant minimum as noise amplitude increased away from zero, noise over a small region near the main source of excitation was found to be nearly as potent in its inhibitory effect. The FN system has been employed in numerous settings [24] outside its original domain as an approximation to the HH nerve cell model. Here we have focused briefly on the effects of additive noise in the PDE version, which is elaborated on in [46]. Two modes of inhibition of transmission by noise were found, one local and the other due to secondary wave phenomena, which also occur in the HH SPDE system.
6 SPDEs in Neurobiology
171
References
1. Austin, T.D.: The emergence of the deterministic Hodgkin-Huxley equations as a limit from the underlying stochastic ion-channel mechanism. Ann. Appl. Probab. 18, 1279–1325 (2008)
2. Bergé, B., Chueshov, I.D., Vuillermot, P.A.: On the behavior of solutions to certain parabolic SPDEs driven by Wiener processes. Stoch. Proc. Appl. 92, 237–263 (2001)
3. Burlhis, T.M., Aghajanian, G.K.: Pacemaker potentials of serotonergic dorsal raphe neurons: contribution of a low-threshold Ca2+ conductance. Synapse 1, 582–588 (1987)
4. Destexhe, A., Sejnowski, O.: Thalamocortical Assemblies. Oxford University Press, Oxford (2001)
5. Ditlevsen, S., Ditlevsen, O.: Parameter estimation from observations of first-passage times of the Ornstein-Uhlenbeck process and the Feller process. Probabilist. Eng. Mech. 23, 170–179 (2008)
6. Ditlevsen, S., Lansky, P.: Estimation of the input parameters in the Ornstein-Uhlenbeck neuronal model. Phys. Rev. E 71, Art. No. 011907 (2005)
7. Dodge, F.A., Cooley, J.: Action potential of the motoneuron. IBM J. Res. Devel. 17, 219–229 (1973)
8. Dolphin, A.C.: Calcium channel diversity: multiple roles of calcium channel subunits. Curr. Opin. Neurobiol. 19, 237–244 (2009)
9. FitzHugh, R.: Mathematical models of excitation and propagation in nerve. In: Biological Engineering. McGraw-Hill, New York (1969)
10. Gerstein, G.L., Mandelbrot, B.: Random walk models for the spike activity of a single neuron. Biophys. J. 4, 41–68 (1964)
11. Gluss, B.: A model for neuron firing with exponential decay of potential resulting in diffusion equations for probability density. Bull. Math. Biophys. 29, 233–243 (1967)
12. Goldfinger, M.D.: Poisson process stimulation of an excitable membrane cable model. Biophys. J. 50, 27–40 (1986)
13. Gutkin, B.S., Jost, J., Tuckwell, H.C.: Inhibition of rhythmic neural spiking by noise: the occurrence of a minimum in activity with increasing noise. Naturwissenschaften 96, 1091–1097 (2009)
14. Gutman, G.A., Chandy, K.G., Grissmer, S., Lazdunski, M., McKinnon, D., Pardo, L.A., Robertson, G.A., Rudy, B., Sanguinetti, M.C., Stuhmer, W., Wang, X.: International Union of Pharmacology. LIII. Nomenclature and molecular relationships of voltage-gated potassium channels. Pharmacol. Rev. 57, 473–508 (2005)
15. Hanson, F.B., Tuckwell, H.C.: Diffusion approximations for neuronal activity including synaptic reversal potentials. J. Theoret. Neurobiol. 2, 127–153 (1983)
16. Hellwig, B.: A quantitative analysis of the local connectivity between pyramidal neurons in layers 2/3 of the rat visual cortex. Biol. Cybern. 82, 111–121 (2000)
17. Hodgkin, A.L., Huxley, A.F.: A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544 (1952)
18. Horikawa, Y.: Noise effects on spike propagation in the stochastic Hodgkin-Huxley models. Biol. Cybern. 66, 19–25 (1991)
19. Iannella, N., Tanaka, S., Tuckwell, H.C.: Firing properties of a stochastic PDE model of a rat sensory cortex layer 2/3 pyramidal cell. Math. Biosci. 188, 117–132 (2004)
20. Kallianpur, G., Xiong, J.: Diffusion approximation of nuclear space-valued stochastic differential equations driven by Poisson random measures. Ann. Appl. Probab. 5, 493–517 (1995)
21. Koch, C.: Biophysics of Computation: Information Processing in Single Neurons. Oxford University Press, Oxford (1999)
22. Komendantov, A.O., Tasker, J.G., Trayanova, N.A.: Somato-dendritic mechanisms underlying the electrophysiological properties of hypothalamic magnocellular neuroendocrine cells: A multicompartmental model study. J. Comput. Neurosci. 23, 143–168 (2007)
23. Levitan, I.B., Kaczmarek, L.K.: Neuromodulation. Oxford University Press, Oxford (1987)
24. Lindner, B., Garcia-Ojalvo, J., Neiman, A., Schimansky-Geier, L.: Effects of noise in excitable systems. Phys. Rep. 392, 321–424 (2004)
25. Llinas, R.: The intrinsic electrophysiological properties of mammalian neurons: insights into central nervous system function. Science 242, 1654–1664 (1988)
26. Mainen, Z.F., Joerges, J., Huguenard, J.R., Sejnowski, T.J.: A model of spike initiation in neocortical pyramidal neurons. Neuron 15, 1427–1439 (1995)
27. Markram, H., Toledo-Rodriguez, M., Wang, Y., Gupta, A., Silberberg, G., Wu, C.: Interneurons of the neocortical inhibitory system. Nat. Rev. Neurosci. 5, 793–807 (2004)
28. McCormick, D.A., Huguenard, J.R.: A model of the electrophysiological properties of thalamocortical relay neurons. J. Neurophysiol. 68, 1384–1400 (1992)
29. Megías, M., Emri, Z.S., Freund, T.F., Gulyás, A.I.: Total number and distribution of inhibitory and excitatory synapses on hippocampal CA1 pyramidal cells. Neuroscience 102, 527–540 (2001)
30. Meir, A., Ginsburg, S., Butkevich, A., Kachalsky, S.G., Kaiserman, I., Ahdut, R., Demirgoren, S., Rahamimoff, R.: Ion channels in presynaptic nerve terminals and control of transmitter release. Physiol. Rev. 79, 1020–1088 (1999)
31. Rhodes, P.A., Llinas, R.: A model of thalamocortical relay cells. J. Physiol. 565, 765–781 (2005)
32. Roy, B.K., Smith, D.R.: Analysis of the exponential decay model of the neuron showing frequency threshold effects. Bull. Math. Biophys. 31, 341–357 (1969)
33. Sholl, D.: The Organization of the Cerebral Cortex. Methuen, London (1956)
34. Shu, Y., Hasenstaub, A., Badoual, M., Bal, T., McCormick, D.A.: Barrages of synaptic activity control the gain and sensitivity of cortical neurons. J. Neurosci. 23, 10388–10401 (2003)
35. Skaugen, E., Walloe, L.: Firing behaviour in a stochastic nerve membrane model based upon the Hodgkin-Huxley equations. Acta Physiol. Scand. 107, 343–363 (1979)
36. Spruston, N.: Pyramidal neurons: dendritic structure and synaptic integration. Nat. Rev. Neurosci. 9, 206–221 (2008)
37. Traub, R.D.: Motoneurons of different geometry and the size principle. Biol. Cybern. 25, 163–175 (1977)
38. Traub, R.D.: Neocortical pyramidal cells: a model with dendritic calcium conductance reproduces repetitive firing and epileptic behavior. Brain Res. 173, 243–257 (1979)
39. Tuckwell, H.C.: Synaptic transmission in a model for stochastic neural activity. J. Theor. Biol. 77, 65–81 (1979)
40. Tuckwell, H.C.: Poisson Processes in Biology. In: Stochastic Nonlinear Systems, pp. 162–172. Springer, Berlin (1981)
41. Tuckwell, H.C.: Stochastic equations for nerve membrane potential. J. Theoret. Neurobiol. 5, 87–99 (1986)
42. Tuckwell, H.C.: Introduction to Theoretical Neurobiology, vol. 1: Linear Cable Theory and Dendritic Structure. Cambridge University Press, Cambridge (1988)
43. Tuckwell, H.C.: Introduction to Theoretical Neurobiology, vol. 2: Nonlinear and Stochastic Theories. Cambridge University Press, Cambridge (1988)
44. Tuckwell, H.C.: Stochastic Processes in the Neurosciences. SIAM, Philadelphia (1989)
45. Tuckwell, H.C.: Spatial neuron model with two-parameter Ornstein-Uhlenbeck input current. Phys. A 368, 495–510 (2006)
46. Tuckwell, H.C.: Analytical and simulation results for the stochastic spatial FitzHugh-Nagumo neuron. Neural Comput. 20, 3003–3035 (2008)
47. Tuckwell, H.C., Jost, J.: Weak noise in neurons may powerfully inhibit the generation of repetitive spiking but not its propagation. PLoS Comp. Biol. 6, e1000794 (2010)
48. Tuckwell, H.C., Jost, J.: The effects of various spatial distributions of weak noise on rhythmic spiking. J. Comp. Neurosci. 30, 361–371 (2011)
49. Tuckwell, H.C., Walsh, J.B.: Random currents through nerve membranes. Biol. Cybern. 49, 99–110 (1983)
50. Tuckwell, H.C., Wan, F.Y.M., Wong, Y.S.: The interspike interval of a cable model neuron with white noise input. Biol. Cybern. 49, 155–167 (1984)
51. Tuckwell, H.C., Wan, F.Y.M., Rospars, J.P.: A spatial stochastic neuronal model with Ornstein-Uhlenbeck input current. Biol. Cybern. 86, 137–145 (2002)
52. Tuckwell, H.C., Jost, J., Gutkin, B.S.: Inhibition and modulation of rhythmic neuronal spiking by noise. Phys. Rev. E 80, 031907 (2009)
53. Watts, J., Thomson, A.M.: Excitatory and inhibitory connections show selectivity in the neocortex. J. Physiol. 562.1, 89–97 (2005)
54. Zhang, X., You, G., Chen, T., Feng, J.: Maximum likelihood decoding of neuronal inputs from an interspike interval distribution. Neural Comput. 21, 1–27 (2009)
Chapter 7
Deterministic and Stochastic FitzHugh–Nagumo Systems
Michèle Thieullen
Abstract In this chapter we review some mathematical aspects of FitzHugh–Nagumo systems of ordinary differential equations or partial differential equations. Our treatment is probabilistic. We focus on small noise asymptotics for these systems and their stochastic perturbations. The noise is either an external perturbation or already present when the system involves spatial propagation.
7.1 Introduction
FitzHugh–Nagumo (FN) systems are two-dimensional nonlinear systems of ordinary differential equations (ODEs) or partial differential equations (PDEs). Although originally devised in the framework of neuronal modeling, they have become an archetype for systems exhibiting excitability which are especially common in biology. They are used for instance in cardiac electrophysiology [1]. Deterministic FN systems exhibit two forms depending on whether the spatial propagation is taken into account or not. We will also be interested in their stochastic perturbations. In the deterministic setting, when propagation is neglected, one considers the following family indexed by the real parameters $a$, $b$, $I$ and $\delta > 0$:
$$\delta\,\dot{x}_t = -y_t + f(x_t) + I, \qquad x_0 = x, \tag{7.1}$$
$$\dot{y}_t = x_t - b\,y_t - a, \qquad y_0 = y. \tag{7.2}$$
M. Thieullen, Université Pierre et Marie Curie - Paris 6, Laboratoire de Probabilités et Modèles Aléatoires, Boîte 188, 4, Place Jussieu, 75252 Paris cedex 05, France. e-mail: [email protected]. M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3_7, © Springer-Verlag Berlin Heidelberg 2013
The function $f$ is a cubic polynomial: $f(x) = -x(x - \alpha)(x - \beta)$ with $\alpha < 0 < \beta$ (we will sometimes restrict ourselves to $b = 0$ and $I = 0$). Taking propagation into account, the system is
$$\delta\,\dot{V}_t = \frac{\partial^2 V_t}{\partial x^2} - w_t + f(V_t) + I, \qquad V_0 = v, \tag{7.3}$$
$$\dot{w}_t = V_t - b\,w_t - a, \qquad w_0 = w. \tag{7.4}$$
In neurophysiology, $x_t$ (resp. $V_t$), the activator, denotes the voltage or potential difference across the membrane of a single neuron and $y_t$ (resp. $w_t$), the inhibitor, characterizes the degree of accommodation or refractoriness of the system. The system (7.1)–(7.2) of ODEs deals with the space-clamped situation: it describes the potential at one point, whereas (7.3)–(7.4) includes propagation of the potential along the axon. The control parameter $I$ is an exterior current (or stimulus). Historically, FN systems were first proposed by FitzHugh and Nagumo [3, 8] as a simplification of the Hodgkin–Huxley (HH) model. The genuine HH model was devised to describe the generation and propagation of the nerve impulse along the giant axon of the squid [6]. It is a four-dimensional system of nonlinear partial differential equations for the coupled evolution of the membrane potential and the different ionic currents, which is remarkably efficient in reproducing the physiological data. However, it is difficult to analyze mathematically. Mathematical tractability of FN systems rendered them attractive, even if they may be considered as too simple w.r.t. the biophysical complexity of the electrical activity of the neuron. In particular, FN models can explain the onset of neuronal oscillations, which consist in the emission of periodic spike trains observed experimentally in response to a sustained current input $I$ (indeed FN systems find their origin in the nonlinear oscillator model proposed by Van der Pol). In the present chapter we review selected facts about the mathematical properties of both deterministic and stochastic FN models with an emphasis on the bistability property and the influence of the stochastic component (or noise) on the deterministic behaviour. As we consider the case of small noise (or small diffusion), the theory of large deviations will play a central role. An introduction to this theory is provided in Chap. 3.
Studying the influence of noise on deterministic excitability is a wide and active field of research of which we just give a flavour. The interested reader should consult monographs [7] and [11]. For more information on one-dimensional stochastic models (also called stochastic Integrate and Fire models) of neuronal electrical activity see Chap. 5, and Chap. 6 treats a stochastic spatial FN system, as well as the HH model. Chapter 1 contains an introduction to stochastic differential equations with a view towards applications in biology. The chapter is organized as follows. In Sect. 7.2 we review basic facts about FN systems of ODEs. Section 7.3 contains a reminder of large deviations theory for diffusion processes. In Sect. 7.4 we consider a stochastic perturbation of FN obtained when the stimulus is a white noise. Section 7.5 deals with a FN system including spatial propagation. A figure is provided at the end of the paper.
7.2 FN Systems of ODEs
As mentioned above, FN systems were devised in order to perform the mathematical analysis of neuronal excitability. Such a reduction of dimension, from a four-dimensional model such as HH to a two-dimensional one such as FN, makes sense because the four variables present in HH evolve at different time scales. This is why these systems are often called fast-slow. This is materialized in (7.1)–(7.2) or (7.3)–(7.4) above by choosing the parameter $\delta$ small. Then the two time scales are strongly separated: the membrane potential $x_t$ (or $V_t$) is the fast variable and evolves rapidly whereas the recovery variable $y_t$ (or $w_t$) is the slow variable and evolves slowly. Systems involving different scales, not only in time but also in space, are common in modeling of biological systems. Let us recall that the excitable property of the neuron means that starting close to the resting state, a small perturbation in voltage due to a small input decays back to rest while a sufficiently large perturbation due to a large enough input continues to increase and generates an action potential. If the input is maintained the system can perform oscillations. Mathematically speaking, a large excursion away from an equilibrium point which is expected to be stable occurs when a variation of one parameter of the system causes the equilibrium point to lose its stability. This is called a bifurcation. Systems (7.1)–(7.2) exhibit Hopf bifurcations when the input current $I$ is taken as bifurcation parameter. Since we also want to study the case where the stimulus is random, we prefer to focus on the following form of FN model with the same cubic function $f$ as above:
$$\delta\,\dot{x}_t = -y_t + f(x_t), \qquad x_0 = x, \tag{7.5}$$
$$\dot{y}_t = x_t - a, \qquad y_0 = y. \tag{7.6}$$
For this system the bifurcation parameter is $a$. Let us first look for equilibrium points of (7.5)–(7.6) and their stability. By definition, the equilibrium points of a system
$$\dot{x}_t = f(x_t, y_t), \tag{7.7}$$
$$\dot{y}_t = g(x_t, y_t) \tag{7.8}$$
are those points $(x, y)$ satisfying $f(x, y) = g(x, y) = 0$. Hence, for any value of $a$ there is a unique equilibrium point for (7.5)–(7.6), which is $(a, f(a))$. To study the stability of the equilibrium point $(x^*, y^*)$ of (7.7)–(7.8), we linearize the system, which amounts to considering the linear system $\dot{Z} = AZ$. Here $A$ is the Jacobian matrix given by
$$A = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\[2mm] \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \end{pmatrix},$$
where the partial derivatives are taken at $(x^*, y^*)$. For (7.5)–(7.6) with parameter $a$,
$$A = \begin{pmatrix} \dfrac{f'(a)}{\delta} & -\dfrac{1}{\delta} \\[2mm] 1 & 0 \end{pmatrix},$$
which admits two eigenvalues $\lambda_\pm = \frac{1}{2\delta}\big(f'(a) \pm i\sqrt{4\delta - f'(a)^2}\big)$. Let us introduce $a_0 < a_1$, the two values of the parameter $a$ where $f'$ vanishes. For $a \notin \{a_0, a_1\}$, the real part of the eigenvalues is different from 0 and its sign does not change on a small open interval around $a$. On the contrary, when $a \in \{a_0, a_1\}$, the eigenvalues are pure imaginary complex numbers. The sign of $f'(a_0 + \eta)$ is the sign of $\eta$. Therefore $(a, f(a))$ is stable when $a < a_0$, unstable when $a > a_0$. The situation is analogous when $a$ passes through $a_1$. For $a \in\, ]a_0, a_1[$, the system admits a limit cycle. We are in the presence of a bifurcation. There are several types of bifurcations; the bifurcation considered here is of Hopf type. It can be verified numerically that if $\delta < 0.01$, the limit cycle is very close to the loop made by the two attracting branches of the curve $y = f(x)$ where $x \mapsto f(x)$ is decreasing and $y \in [f(a_0), f(a_1)]$, and the portions of the two horizontal segments $y = f(a_0)$, $y = f(a_1)$ connecting them. The system goes slowly along the two branches of the limit cycle which are portions of $\{(x, y);\ y = f(x)\}$, and much faster on the connecting segments where the second variable $y$ remains approximately constant. As a consequence of the slow–fast property the set of equilibrium points of the one-dimensional family $\dot{x}_t = -y + f(x_t)$ plays an important role. When $y \in\, ]f(a_0), f(a_1)[$, this set consists of three points $x^*_-(y) < x^*_0(y) < x^*_+(y)$. The points $x^*_\pm(y)$ are stable whereas $x^*_0(y)$ is unstable. This family of one-dimensional systems is fundamental when studying the stochastic perturbation below.
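The stability discussion above can be illustrated numerically. The sketch below assumes the cubic $f(x) = x(4 - x^2)$ used in Fig. 7.1 and an arbitrary illustrative value $\delta = 0.05$; the function names are hypothetical. It compares the eigenvalues of the Jacobian at $(a, f(a))$ with the closed-form expression for $\lambda_\pm$ and prints the sign of the real part on either side of $a_0$ and $a_1$.

```python
import numpy as np

def f_prime(x):
    # Derivative of the illustrative cubic f(x) = x*(4 - x**2).
    return 4.0 - 3.0 * x**2

def jacobian(a, delta):
    # Linearization of delta*x' = -y + f(x), y' = x - a at the point (a, f(a)).
    return np.array([[f_prime(a) / delta, -1.0 / delta],
                     [1.0, 0.0]])

def eigen_formula(a, delta):
    # lambda_pm = (f'(a) +/- i*sqrt(4*delta - f'(a)**2)) / (2*delta)
    root = np.sqrt(complex(4.0 * delta - f_prime(a) ** 2))
    return np.array([(f_prime(a) + 1j * root) / (2.0 * delta),
                     (f_prime(a) - 1j * root) / (2.0 * delta)])

delta = 0.05                                   # assumed value, for illustration
a0, a1 = -2.0 / np.sqrt(3.0), 2.0 / np.sqrt(3.0)   # zeros of f'

for a in (-1.5, a0, 0.0, a1, 1.5):
    lam = np.linalg.eigvals(jacobian(a, delta))
    print(f"a = {a:+.3f}   max Re(lambda) = {lam.real.max():+.4f}")
```

The real part is negative outside $[a_0, a_1]$, vanishes (up to rounding) at $a_0$ and $a_1$, and is positive in between, in line with the Hopf scenario described in the text.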
7.3 Large Deviations
We now recall basic results on large deviations theory (see [4]) with a view towards our specific situation; see also Chap. 3 for a more in-depth treatment. In particular, we consider large deviations for diffusion processes. Because of the slow–fast property of FN systems, the evolution of their solutions can be decomposed into successive fast and slow sequences. During the fast ones, the slow variable may be considered as frozen and the theory of large deviations is actually applied in dimension one to the perturbation of
$$dx^y_t = \big(-y + f(x^y_t)\big)\,dt, \qquad x^y_0 = x \tag{7.9}$$
by a small noise, namely to
$$d\tilde{x}^y_t = \big(-y + f(\tilde{x}^y_t)\big)\,dt + \sqrt{\varepsilon}\,dW_t, \qquad \tilde{x}^y_0 = x. \tag{7.10}$$
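Let a real-valued deterministic system be given,
$$dx_t = b(x_t)\,dt, \qquad x_0 = x, \tag{7.11}$$
as well as its perturbation by a Brownian motion with small variance $\tilde{\varepsilon}$,
$$d\tilde{x}_t = b(\tilde{x}_t)\,dt + \sqrt{\tilde{\varepsilon}}\,dW_t, \qquad \tilde{x}_0 = x. \tag{7.12}$$
When $\tilde{\varepsilon} \to 0$, $(\tilde{x}_t)$ converges to the solution $(x_t)$ of (7.11) in probability, uniformly on any bounded time interval:
$$\forall \eta > 0\ \ \forall T > 0, \qquad \lim_{\tilde{\varepsilon} \to 0} P\Big(\sup_{[0,T]} |\tilde{x}_t - x_t| > \eta\Big) = 0. \tag{7.13}$$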
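The convergence in probability stated in (7.13) can be observed in a small Monte Carlo experiment. The following is a minimal sketch, assuming an illustrative bistable drift $b(x) = x - x^3$ (not taken from the text), an Euler–Maruyama discretization, and arbitrary values for the horizon, the threshold $\eta$ and the number of paths.

```python
import numpy as np

def drift(x):
    # Illustrative bistable drift b(x) = x - x**3 (an assumption for this sketch).
    return x - x**3

def sup_deviation(eps, x0=1.5, T=2.0, n_steps=2000, n_paths=500, seed=0):
    """Euler-Maruyama estimate of sup_{[0,T]} |x~_t - x_t| for each sample path."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x_det = x0                          # deterministic solution of dx = b(x) dt
    x_sto = np.full(n_paths, x0)        # perturbed paths, same starting point
    sup_dev = np.zeros(n_paths)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x_sto = x_sto + drift(x_sto) * dt + np.sqrt(eps) * dW
        x_det = x_det + drift(x_det) * dt
        sup_dev = np.maximum(sup_dev, np.abs(x_sto - x_det))
    return sup_dev

eta = 0.3
for eps in (0.5, 0.05, 0.005):
    frac = np.mean(sup_deviation(eps) > eta)
    print(f"eps = {eps}: estimated P(sup |x~ - x| > {eta}) = {frac:.3f}")
```

The estimated probability should shrink as the noise variance decreases; the rate at which it does so is what the action functional introduced next quantifies.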
However, because of the presence of the noise term, some trajectories of the process $(\tilde{x}_t)$ may perform a large deviation from those of the deterministic system $(x_t)$. Such deviations are measured by means of the action functional $S_{T_1}^{T_2}(\varphi)$, independent of $\tilde{\varepsilon}$, defined by
$$S_{T_1}^{T_2}(\varphi) = \frac{1}{2}\int_{T_1}^{T_2} |\dot{\varphi}_u - b(\varphi_u)|^2\,du \tag{7.14}$$
if $\varphi$ is absolutely continuous, and by $S_{T_1}^{T_2}(\varphi) = +\infty$ otherwise. With the help of this action functional one defines quasi-potentials which provide estimates on the mean exit time of $(\tilde{x}_t)$ from a domain. Domains of interest are basins of attraction of the stable equilibrium points of (7.11). In general, the quasi-potential w.r.t. a point $z$ (also called transition rate) associated to the action functional $S$ of (7.14) is defined as the function
$$u \mapsto V(u) := \inf\big\{S_{T_1}^{T_2}(\varphi);\ 0 \le T_1 < T_2,\ \varphi(T_1) = z,\ \varphi(T_2) = u\big\}, \tag{7.15}$$
where the infimum of $S_{T_1}^{T_2}(\varphi)$ is taken over all possible $T_1, T_2, \varphi$ satisfying the constraint $0 \le T_1 < T_2$, $\varphi(T_1) = z$, $\varphi(T_2) = u$. The important point is that in dimension one the quasi-potential is explicitly deduced from the drift by the following result, which holds also in the multidimensional case provided that the drift is gradient.
Proposition 7.1. The quasi-potential of (7.11) w.r.t. $z$ coincides with the function
$$u \mapsto V(u) = -2\int_z^u b(r)\,dr. \tag{7.16}$$
To understand the result of Proposition 7.1, let us set $v(u) = -\int_z^u b(r)\,dr$, so that $b(u) = -v'(u)$. The result is obtained using solutions of the ODE $\dot{\varphi}_t = v'(\varphi_t)$
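which connect $z$ and $u$. Indeed, for such a $\varphi$ and times $0 \le T_1 < T_2$ such that $\varphi(T_1) = z$, $\varphi(T_2) = u$, the following holds:
$$S_{T_1}^{T_2}(\varphi) = 2\int_{T_1}^{T_2} v'(\varphi_s)^2\,ds = 2\int_{T_1}^{T_2} v'(\varphi_s)\,\dot{\varphi}_s\,ds = 2v(u).$$
For general $\varphi$, one can write $S_{T_1}^{T_2}(\varphi) = \frac{1}{2}\int_{T_1}^{T_2} |\dot{\varphi}_s + v'(\varphi_s)|^2\,ds$ using that $b$ coincides with $-v'$. Then
$$S_{T_1}^{T_2}(\varphi) = \frac{1}{2}\int_{T_1}^{T_2} |\dot{\varphi}_s - v'(\varphi_s)|^2\,ds + 2\int_{T_1}^{T_2} \dot{\varphi}_s\,v'(\varphi_s)\,ds \ \ge\ 2\int_{T_1}^{T_2} \dot{\varphi}_s\,v'(\varphi_s)\,ds = 2\big(v(\varphi_{T_2}) - v(\varphi_{T_1})\big).$$
When $\varphi$ connects $z$ to $u$ this implies that $S_{T_1}^{T_2}(\varphi) \ge 2v(u)$. Identity (7.16) follows.
In the following statement we recall in Part 1 the probability of large deviation and in Part 2 the approximation of the exit time as an exponential of the quasi-potential.
Theorem 7.1 (see Chap. 4 in [4]).
1. Let $\eta > 0$. Then
$$P_x\Big(\sup_{[0,T]} |\tilde{x}_t - x_t| \ge \eta\Big) \le \exp\Big(-\frac{1}{\tilde{\varepsilon}}\Big[\inf_{\varphi \in \Gamma} S_0^T(\varphi) + o(1)\Big]\Big) \tag{7.17}$$
when $\tilde{\varepsilon} \to 0$, where $\Gamma := \{\varphi;\ \varphi_0 = x,\ \sup_{[0,T]} |\varphi_t - x_t| \ge \eta\}$ and the subscript in $P_x$ indicates that it is a probability on trajectories starting from $x$ at time 0.
2. Let $x^*$ be a stable equilibrium point of (7.11) such that $b(r) < 0$ for all $r > x^*$, and $b(r) > 0$ for all $r < x^*$. Let $D$ be the basin of attraction of $x^*$ and let $\tilde{\tau}$ denote the first exit time of $\tilde{x}$, solution of (7.12), from $D$. Let us assume that $D =\, ]\alpha_1, \alpha_2[$ with $V(\alpha_1) < V(\alpha_2)$. Then for all $x \in D$,
$$\lim_{\tilde{\varepsilon} \to 0} P_x\big(\tilde{x}_{\tilde{\tau}} = \alpha_1\big) = 1. \tag{7.18}$$
For all $x \in D$ and $h > 0$,
$$\lim_{\tilde{\varepsilon} \to 0} P_x\Big(e^{\frac{V(\alpha_1) - h}{\tilde{\varepsilon}}} < \tilde{\tau} < e^{\frac{V(\alpha_1) + h}{\tilde{\varepsilon}}}\Big) = 1. \tag{7.19}$$
In the particular case where $b \equiv 0$, that is for $W^\varepsilon_t = x + \sqrt{\varepsilon}\,W_t$ with $(W_t)$ a Brownian motion in $\mathbb{R}^d$, large deviations take the form
$$P_x\Big(\sup_{[0,T]} |W^\varepsilon_t - W^\varepsilon_0| \ge \eta\Big) \le 4d\,\exp\Big(-\frac{\eta^2}{2Td\,\varepsilon}\Big). \tag{7.20}$$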
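As an illustration of Proposition 7.1 and of the exit-time estimate (7.19), the quasi-potential can be evaluated by numerical quadrature. The sketch below assumes the frozen drift $b(r) = -y + f(r)$ with the cubic $f(x) = x(4 - x^2)$ of Fig. 7.1 and $y = 0$, so that the stable point is $z = 2$ and the unstable point is $u = 0$; the function names and the noise levels are illustrative.

```python
import numpy as np

def f(x):
    # Illustrative cubic, the one used in Fig. 7.1.
    return x * (4.0 - x**2)

def quasi_potential(u, z, y, n=20001):
    """V(u) = -2 * int_z^u b(r) dr with b(r) = -y + f(r), via the trapezoidal rule."""
    r = np.linspace(z, u, n)            # runs from z to u (decreasing if u < z)
    b = -y + f(r)
    dr = r[1] - r[0]                    # signed step
    integral = np.sum(0.5 * (b[1:] + b[:-1])) * dr
    return -2.0 * integral

# Barrier between the stable point z = 2 and the unstable point u = 0 when y = 0.
V_barrier = quasi_potential(u=0.0, z=2.0, y=0.0)
print("V =", V_barrier)

# Order of magnitude exp(V / eps) of the exit time suggested by (7.19).
for eps_tilde in (2.0, 1.0, 0.5):
    print(f"eps~ = {eps_tilde}: exp(V/eps~) = {np.exp(V_barrier / eps_tilde):.3e}")
```

For this drift the integral can also be done by hand: $-2\int_2^0 (4r - r^3)\,dr = 8$, which the quadrature reproduces. Halving the noise variance squares the exponential factor, which is why exits become extremely rare for small noise.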
7.4 Stochastic Perturbation of FN
The reader will find detailed statements and proofs in [2] for the material of the present section. We simply recall here some results. We study FN perturbed by small noise, namely the model
$$\delta\,dX_t = \big(-Y_t + f(X_t)\big)\,dt + \sqrt{\varepsilon}\,dW_t, \qquad X_0 = x, \tag{7.21}$$
$$dY_t = (X_t - a)\,dt, \qquad Y_0 = y. \tag{7.22}$$
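A direct way to explore (7.21)–(7.22) is an Euler–Maruyama simulation. The following is a minimal sketch, assuming the cubic $f(x) = x(4 - x^2)$ of Fig. 7.1; the parameter values, the step size and the function name `simulate_fn` are illustrative choices, not taken from [2].

```python
import numpy as np

def f(x):
    # Illustrative cubic, the one used in Fig. 7.1.
    return x * (4.0 - x**2)

def simulate_fn(a, delta, eps, x0, y0, T, dt, seed=0):
    """Euler-Maruyama scheme for
       delta dX = (-Y + f(X)) dt + sqrt(eps) dW,   dY = (X - a) dt."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    X = np.empty(n + 1)
    Y = np.empty(n + 1)
    X[0], Y[0] = x0, y0
    noise = np.sqrt(eps * dt) / delta   # std of the noise increment after dividing by delta
    for k in range(n):
        X[k + 1] = X[k] + (-Y[k] + f(X[k])) * dt / delta + noise * rng.standard_normal()
        Y[k + 1] = Y[k] + (X[k] - a) * dt
    return X, Y

# Without noise and with a outside [a0, a1], the path settles at (a, f(a)).
X_det, Y_det = simulate_fn(a=-1.5, delta=0.1, eps=0.0, x0=-2.0, y0=0.0, T=40.0, dt=1e-3)
print("deterministic end point:", X_det[-1], Y_det[-1])

# Same a with noise: upward crossings of x = 0 count excursions to the right branch.
X, Y = simulate_fn(a=-1.5, delta=0.1, eps=0.2, x0=-2.0, y0=0.0, T=40.0, dt=1e-3)
upcrossings = int(np.sum((X[:-1] < 0.0) & (X[1:] >= 0.0)))
print("upward crossings of x = 0:", upcrossings)
```

Comparing the two runs gives a feeling for how noise alone can produce excursions when the deterministic system rests at a stable equilibrium; the precise asymptotic regime is the subject of Theorem 7.2.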
In particular we are interested in oscillating behaviour. As in the deterministic case such behaviour corresponds mathematically to convergence towards a periodic solution. In the previous section we saw that oscillations for (7.5)–(7.6) (which is (7.21)–(7.22) with $\varepsilon = 0$) are possible only when $a \in [a_0, a_1]$. Remember that $a_0$, $a_1$ are defined by $f'(a_0) = f'(a_1) = 0$. In the presence of noise, oscillations can appear even for values of $a \notin [a_0, a_1]$. The evolution is governed by the quasi-potential of the one-dimensional system obtained by freezing the second variable. As already mentioned, this is justified by the slow–fast property. However, oscillations are the result of an appropriate relationship between the intrinsic time scale of the system, which is $\delta$, and the strength of the noise, which is $\varepsilon$. A precise relationship is given in the following theorem. This theorem is illustrated in Fig. 7.1.
Theorem 7.2. For $y \in\, ]f(a_0), f(a_1)[$, let $x^*_-(y) < x^*_0(y) < x^*_+(y)$ be the elements of $\{x \in \mathbb{R};\ f(x) = y\}$, and
$$V_\pm(y) := -2\int_{x^*_\pm(y)}^{x^*_0(y)} \big(-y + f(u)\big)\,du.$$
Let $(y^*, S)$ be the intersection point of the graphs of $V_-$ and $V_+$. Assume that
$$\exists\, c > 0, \qquad \frac{\varepsilon}{\delta}\,|\log \delta| \to c. \tag{7.23}$$
If $c \in\, ]0, S[$, let $y_\pm(c)$ satisfy $y_-(c) < y^* < y_+(c)$ and $V_-(y_-(c)) = c = V_+(y_+(c))$. Set $x_-(c) := x^*_-(y_-(c))$ and $x_+(c) := x^*_+(y_+(c))$. If $a \in\, ]x_-(c), x_+(c)[$, there exist two periodic functions $\Phi^a_c$ and $\Psi^a_c$ such that for all $A, h > 0$, $y \in\, ]f(a_0), f(a_1)[$,
$$\lim P_{(x,y)}\Big(\int_0^A |X_t - \Phi^a_c(t)|^2\,dt > h\Big) = 0, \tag{7.24}$$
$$\lim P_{(x,y)}\Big(\sup_{[0,A]} |Y_t - \Psi^a_c(t)| > h\Big) = 0, \tag{7.25}$$
where we use the notation $\lim$ to indicate that $\varepsilon$ and $\delta$ tend to zero according to (7.23).
Fig. 7.1 Figure for Theorem 7.2 with $f(x) = x(4 - x^2)$. Here the graph of $f$ is symmetric w.r.t. the origin and $y^* = 0$. The area situated above the segment $0 \le x \le 2$ of the $x$-axis and enclosed by the curve of $f$ is equal to $S/2 = V_\pm(0)/2$. For $y \in [f(a_0), f(a_1)]$, $x^*_-(y) < x^*_0(y) < x^*_+(y)$ are the elements of $\{x \in \mathbb{R};\ f(x) = y\}$. Here $\frac{\varepsilon}{\delta}|\log \delta| \to c < S$. $y_\pm(c)$ satisfy $y_-(c) < 0 < y_+(c)$ and $V_-(y_-(c)) = c = V_+(y_+(c))$. The area situated below the horizontal line $\{y = y_-(c)\}$ and enclosed by the curve of $f$ is equal to $c/2$. By symmetry it coincides with the area above the horizontal line $\{y = y_+(c)\}$ enclosed by the curve of $f$. The asymptotic periodic trajectory is composed of the branches $\{(x^*_\pm(y), y);\ y \in [y_-(c), y_+(c)]\}$ and the horizontal segments $y = y_\pm(c)$ connecting them. (Labels shown in the figure: $a_0$, $a_1$, $x^*_-(y)$, $x^*_0(y)$, $x^*_+(y)$, $x_-(c)$, $x_+(c)$, $y_-(c)$, $y_+(c)$, and the areas $S/2$ and $c/2$.)
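The quantities entering Theorem 7.2 can be computed explicitly for the cubic of Fig. 7.1. The sketch below assumes $f(x) = x(4 - x^2)$, uses the antiderivative of $-y + f(u)$ to evaluate $V_\pm(y)$, and finds $y_+(c)$ by bisection; the helper names and the value $c = 4$ are illustrative.

```python
import numpy as np

def f(x):
    # Cubic of Fig. 7.1.
    return x * (4.0 - x**2)

A1 = 2.0 / np.sqrt(3.0)      # f'(A1) = 0, location of the local maximum of f
Y_MAX = f(A1)                # f(a1); by symmetry f(a0) = -Y_MAX

def branches(y):
    # Roots of f(x) = y, i.e. -x**3 + 4x - y = 0, sorted as x_minus < x_zero < x_plus.
    return np.sort(np.roots([-1.0, 0.0, 4.0, -y]).real)

def antiderivative(x, y):
    # Primitive of u -> -y + f(u).
    return -y * x + 2.0 * x**2 - 0.25 * x**4

def V_plus(y):
    _, x_zero, x_plus = branches(y)
    return -2.0 * (antiderivative(x_zero, y) - antiderivative(x_plus, y))

def V_minus(y):
    x_minus, x_zero, _ = branches(y)
    return -2.0 * (antiderivative(x_zero, y) - antiderivative(x_minus, y))

def y_plus(c, tol=1e-12):
    # V_plus decreases from S = V_plus(0) to 0 on [0, f(a1)[, so bisect V_plus(y) = c.
    lo, hi = 0.0, Y_MAX * (1.0 - 1e-9)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if V_plus(mid) > c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

S = V_plus(0.0)              # common value of V_plus and V_minus at y* = 0
c = 4.0                      # illustrative value in ]0, S[
yp = y_plus(c)
ym = -yp                     # by the odd symmetry of f
x_plus_c = branches(yp)[2]
x_minus_c = branches(ym)[0]
print("S =", S, " y+(c) =", yp, " ]x-(c), x+(c)[ = ]", x_minus_c, ",", x_plus_c, "[")
```

The printed interval $]x_-(c), x_+(c)[$ is the range of the parameter $a$ for which the theorem predicts asymptotically periodic behaviour at the chosen $c$; it always contains $[a_0, a_1]$, which reflects the fact that noise enlarges the oscillatory regime.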
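7.5 Deterministic FN Including Space Propagation
There are certainly different ways to treat the FN model with space propagation [5, 10–12]. The following system of reaction–diffusion equations of FN type is considered in [10] using PDE theory:
$$\frac{\partial \varphi^\varepsilon}{\partial t} = \varepsilon\,\Delta\varphi^\varepsilon + \frac{1}{\varepsilon}\,F(\varphi^\varepsilon, \psi^\varepsilon), \tag{7.26}$$
$$\frac{\partial \psi^\varepsilon}{\partial t} = b\,\Delta\psi^\varepsilon + G(\varphi^\varepsilon, \psi^\varepsilon), \tag{7.27}$$
where $F(\varphi, \psi) = f(\varphi) - \psi$ with $f(u) = u(1 - u)(u - \mu)$, $\mu \in\, ]0, 1[$, and $G(\varphi, \psi) = \varphi - \gamma\psi$. This system describes the conduction of electric impulses in nerve axons. $\varphi^\varepsilon$ is the propagator, $\psi^\varepsilon$ the controller. The diffusion parameter $\varepsilon$ is small and the aim is to study the behaviour of the solutions when $\varepsilon \to 0$. The key property is that $\varphi^\varepsilon$ evolves much faster than $\psi^\varepsilon$: the system is a slow–fast system with two distinct time scales. Let us notice that $\varepsilon$ does not play the same role in Eqs. (7.26) and (7.27). Indeed, assume formally that $(\varphi^\varepsilon, \psi^\varepsilon)$ has a limit $(\varphi, \psi)$ when $\varepsilon \to 0$. Then $\psi$ satisfies $\frac{\partial \psi}{\partial t} = b\,\Delta\psi + G(\varphi, \psi)$. Again formally, if we replace $\psi^\varepsilon$ by $\psi$ in (7.26) we see that $\varphi^\varepsilon$ "should be close" to the solution of
$$\frac{\partial u^\varepsilon}{\partial t} = \varepsilon\,\Delta u^\varepsilon + \frac{1}{\varepsilon}\,F(u^\varepsilon, \psi). \tag{7.28}$$
Denote by $m$ and $M$, respectively, the local minimum and local maximum of $f$. Assume for a while that $\psi$ is a constant. Then for each $\psi \in\, ]m, M[$ the equation $F(u, \psi) = 0$ has three solutions $\mu_0(\psi) < \mu(\psi) < \mu_1(\psi)$. The previous formal arguments build the bridge between [10] and [5], where the following reaction–diffusion equation is addressed with probabilistic tools:
$$\frac{\partial u^\varepsilon}{\partial t} = \varepsilon\,\Delta u^\varepsilon + \frac{1}{\varepsilon}\,f(x, u^\varepsilon). \tag{7.29}$$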
When $\varepsilon$ is small as in our case, (7.29) is called a slow-diffusion fast-reaction equation. For all real $x$, $f(x, \cdot)$ is bistable (it admits two stable equilibrium points separated by an unstable one), like for instance $f(x, u) = (u - \mu_0(x))(\mu_1(x) - u)(u - \mu(x))$ with $\mu_0(x) < \mu(x) < \mu_1(x)$. In the sequel we restrict ourselves to (7.29) with $f(x, u) = (u - \mu_0(x))(\mu_1(x) - u)(u - \mu(x))$ such that $\mu_0(x) < \mu(x) < \mu_1(x)$. To understand better the motivations of what follows, let us set $\tilde{u}^\varepsilon(t, x) := u^\varepsilon(\varepsilon t, x)$. The function $\tilde{u}^\varepsilon$ satisfies
$$\frac{\partial \tilde{u}^\varepsilon}{\partial t} = \varepsilon^2\,\Delta\tilde{u}^\varepsilon + f(x, \tilde{u}^\varepsilon), \tag{7.30}$$
which is a reaction–diffusion equation with bistable reaction term $f$ and small diffusion $\varepsilon^2$. We are interested in the long time behaviour of $\tilde{u}^\varepsilon$, which means that we look for $\lim_{t \to +\infty} \tilde{u}^\varepsilon(t, x)$ in order to evaluate the effect of the small diffusion on the asymptotic behaviour of $\frac{dy_t}{dt} = f(x, y_t)$ obtained for $\varepsilon = 0$. We remark first that it is equivalent to study $\tilde{u}^\varepsilon(t, x)$ as $t \to +\infty$ or to study $u^\varepsilon(t, x)$ as $\varepsilon \to 0$. This is why we concentrate on (7.29). Let us start heuristically. Since $\varepsilon \to 0$, the influence of the diffusion term should be negligible on short time intervals and the limit $u$ of $u^\varepsilon$ should satisfy $f(x, u) = 0$ if we only consider a short time duration. Since for each $x$ there are exactly two stable solutions to this equation, which are $\mu_0(x)$ and $\mu_1(x)$ with different attraction basins, we expect two separated regions. In one region, $u^\varepsilon$ will converge to $\mu_0$. In the other one it will converge to $\mu_1$. More precisely, let us define $g(x) := u^\varepsilon(0, x)$, the initial value of $u^\varepsilon$, and the two regions
$$G_0 = \{x \in \mathbb{R};\ g(x) < \mu(x)\}, \qquad G_1 = \{x \in \mathbb{R};\ g(x) > \mu(x)\}.$$
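The passage from (7.29) to (7.30) is a plain time change. As a short worked check, assuming the form of (7.29) given above (diffusion coefficient $\varepsilon$ and reaction term $f/\varepsilon$), the chain rule gives:

```latex
\begin{align*}
\tilde{u}^\varepsilon(t,x) &:= u^\varepsilon(\varepsilon t, x), \\
\frac{\partial \tilde{u}^\varepsilon}{\partial t}(t,x)
  &= \varepsilon\,\frac{\partial u^\varepsilon}{\partial t}(\varepsilon t, x)
   = \varepsilon\Big(\varepsilon\,\Delta u^\varepsilon
       + \frac{1}{\varepsilon}\,f(x,u^\varepsilon)\Big)(\varepsilon t, x) \\
  &= \varepsilon^2\,\Delta\tilde{u}^\varepsilon(t,x)
     + f\big(x,\tilde{u}^\varepsilon(t,x)\big).
\end{align*}
```

A fixed time $t$ for $u^\varepsilon$ corresponds to the time $t/\varepsilon$ for $\tilde{u}^\varepsilon$, which is why small $\varepsilon$ for (7.29) and large times for (7.30) describe the same regime.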
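In Sect. 3 of [5] it is shown that for small positive $t$ and small enough $\varepsilon$, $u^\varepsilon(t, x)$ is close to $\mu_0(x)$ if $x \in G_0$ and to $\mu_1(x)$ if $x \in G_1$. A precise statement is given in Theorem 7.3 below. The proof of this result is probabilistic. It relies on inequality (7.20) and the Feynman–Kac formula [9], which provides an expression of $u^\varepsilon$ as a mean value along some random trajectories. Thanks to the Feynman–Kac formula, the small diffusion term in (7.29) can be connected to a random perturbation by a Brownian motion of variance $\varepsilon$, like $\sqrt{\varepsilon}\,W_t$ introduced below. We now sketch the proof to show how the key arguments are used. We refer the reader to [5] for details on the short time behaviour and more results. Let us first note that given a constant $\lambda$, after the transformation $v^\varepsilon := u^\varepsilon - \lambda$, one is left with a PDE for $v^\varepsilon$ of the form
$$\frac{\partial v^\varepsilon}{\partial t} = \varepsilon\,\Delta v^\varepsilon + c^\lambda(x, v^\varepsilon)\,v^\varepsilon + f^\lambda(x), \tag{7.31}$$
where $c^\lambda(x, v) := \frac{1}{\varepsilon v}\big(f(x, v + \lambda) - f(x, \lambda)\big)$ and $f^\lambda(x) := \frac{1}{\varepsilon}\,f(x, \lambda)$. By the Feynman–Kac formula, $v^\varepsilon$ also solves the following integral equation on $[0, +\infty[\, \times \mathbb{R}^d$:
$$v^\varepsilon(t, x) = E_x\Big(v^\varepsilon(0, W^\varepsilon_t)\,\exp\Big[\int_0^t c^\lambda\big(W^\varepsilon_r, v^\varepsilon(t - r, W^\varepsilon_r)\big)\,dr\Big]\Big) + E_x\Big(\int_0^t f^\lambda(W^\varepsilon_s)\,\exp\Big[\int_0^s c^\lambda\big(W^\varepsilon_r, v^\varepsilon(t - r, W^\varepsilon_r)\big)\,dr\Big]\,ds\Big),$$
where $W^\varepsilon_t := x + \sqrt{\varepsilon}\,W_t$ is a Brownian motion. We will also need the Feynman–Kac formula up to a stopping time $\tau$, deduced from the previous one by applying the Markov property when $\tau \in [0, t]$ a.s.:
$$v^\varepsilon(t, x) = E_x\Big(v^\varepsilon(t - \tau, W^\varepsilon_\tau)\,\exp\Big[\int_0^\tau c^\lambda\big(W^\varepsilon_r, v^\varepsilon(t - r, W^\varepsilon_r)\big)\,dr\Big]\Big) + E_x\Big(\int_0^\tau f^\lambda(W^\varepsilon_s)\,\exp\Big[\int_0^s c^\lambda\big(W^\varepsilon_r, v^\varepsilon(t - r, W^\varepsilon_r)\big)\,dr\Big]\,ds\Big).$$
The convergence of $u^\varepsilon$ towards $\mu_0$ or $\mu_1$ is obtained through successive steps after an appropriate choice of $\lambda$. One shows first that $u^\varepsilon$ is bounded. Choose $\lambda > \sup_{x \in \mathbb{R}} \max(g(x), \mu_1(x))$. Set as before $v^\varepsilon = u^\varepsilon - \lambda$. Then $v^\varepsilon(0, W^\varepsilon_t) \le 0$ a.s. Moreover, $f(x, \lambda) < 0$ for all $x$. Hence $f^\lambda(x)$, which in our case coincides with $\frac{1}{\varepsilon} f(x, \lambda)$, is also strictly negative. The Feynman–Kac formula yields $v^\varepsilon(t, x) \le 0$ for all $(t, x) \in [0, +\infty[\, \times \mathbb{R}$ and $u^\varepsilon$ is bounded above by $\lambda$. It is now sufficient to notice that $-u^\varepsilon$ solves a PDE of the same type as (7.28) to obtain that $u^\varepsilon$ is also bounded below. The bounds are independent of $\varepsilon$.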
The second step is to show that for all 0 < t0 < T and all compact K lim sup
sup .u 1 .x// 0
!0 Œt0 ;T K
lim inf
inf .u 0 .x// 0
!0 Œt0 ;T K
We focus on the first inequality. For fixed k > 0 and x 2 R, let us prove the existence of some ı > 0 such that lim sup
sup
.u 1 .x// k
!0 Œt0 ;T V .x;ı/
where V WD V .x; ı/ denotes the ı-neighbourhood of x. By continuity of 1 there exists ı > 0 such that supV .x;2ı/ 1 < infV .x;2ı/ 1 C k. Choose supV .x;2ı/ 1 < < infV .x;2ı/ 1 C k. Let M be such that supŒ0;T V u .t; x/ M . Since f 0 .x; u/ < 0 for u > 1 .x/, there exists a constant > 0 such that 1v .f .z; v C / f .z; // when z 2 V and v C 2 ŒinfV 1 C k; M . Let 1 WD inffs 0I jWs W0 j ıg and 2 WD inffs 2 Œ0; tI v .t s; Ws / C 1 .W0 / C kg: We apply the Feynman–Kac formula with stopping time D inf.1 ; 2 ; t/. Then for t 2 Œt0 ; T ,
v .t; x/ 1 .x/ C k C jM jPx .1 T / C jM j exp. t0 /: (7.32) This proves the statement using inequality (7.20). We refer to [5] for the proof of the two other steps. However, we describe their content. The third step consists in proving that for all h > 0, all open set G and compact K G satisfying lim sup!0 supG .u .0; / / < h, there exists tQ > 0 such that lim sup sup .u / < 0:
(7.33)
!0 Œ0;Qt K
The last step yields the convergence result to 0 or 1 . Convergence to 0 reads as follows. An analogous statement holds for 1 . Theorem 7.3. Let h > 0, G and K be respectively open and compact sets, K G. If lim sup!0 supG .u .0; / / < h, then there exists tQ > 0 such that 80 < s < tQ;
lim sup ju 0 j D 0:
!0 Œs;Qt K
(7.34)
As previously mentioned, Theorem 7.3 describes the behaviour of u .t; x/ on a small time duration and for small enough . This result is the first step in the study of the behaviour of u when ! 0. To understand its importance let us go back to the simpler situation of the reaction–diffusion equation @U D U C f .U /: @t
(7.35)
Here f .u/ D .u 0 /.u /.1 u/ where 0 < < 1 are constant equilibrium points (0 and 1 are stable, is unstable) which do not depend on x. The behaviour of U.t; x/ for large times is equivalent to q.x ct/ where limz!1 q.z/ D 1 and limz!C1 q.z/ D 0 . Assume that the initial value U.0; x/ is equal to 1 on a bounded region and to 0 in its complement. Then, as time increases, the set of points x such that U.t; x/ D 1 will expand. It is classical to interpret the points x where U.t; x/ is close to 1 (resp. 0 ) as excited (resp. non-excited) states. In these terms, the region of excited states expands as time increases. Building on Theorem 7.3, it is shown in [5] that an analogous situation arises for equation (7.29).
References
1. Aliev, R.R., Panfilov, A.V.: A simple two-variable model of cardiac excitation. Chaos Solitons Fractals 7, 293–301 (1995)
2. Doss, C., Thieullen, M.: Oscillations and Random Perturbations of a FitzHugh-Nagumo System. Preprint, arXiv:0906.2671v1, July (2009)
3. FitzHugh, R.: Impulses and physiological states in theoretical models of nerve membrane. Biophys. J. 1, 445–466 (1961)
4. Freidlin, M.I., Wentzell, A.D.: Random Perturbation of Dynamical Systems. Springer, New York (1984)
5. Gaertner, J.: Bistable Reaction-diffusion equations and excitable media. Math. Nach. 112, 125–152 (1983)
6. Hodgkin, A.L., Huxley, A.F.: A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500–544 (1952)
7. Lindner, B., Garcia-Ojalvo, J., Neiman, A., Schimansky-Geier, L.: Effects of noise in excitable systems. Phys. Rep. 392, 321–424 (2004)
8. Nagumo, J.S., Arimoto, S., Yoshizawa, S.: An active pulse transmission line simulating nerve axon. Proc. IRE 50, 2061–2071 (1962)
9. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Grundlehren der mathematischen Wissenschaften, vol. 293, 3rd edn. Springer, New York (1991)
10. Soravia, P., Souganidis, P.E.: Phase-field theory for FitzHugh-Nagumo-type systems. SIAM J. Math. Anal. 5, 1341–1359 (1996)
11. Tuckwell, H.C.: Introduction to Theoretical Neurobiology, vol. 1 and 2. Cambridge University Press, Cambridge (1988)
12. Tuckwell, H.C.: Nonlinear effects in white-noise driven spatial diffusion: general analytical results and probabilities of exceeding threshold. Phys. A 387, 1455–1463 (2008)
Chapter 8
Stochastic Modeling of Spreading Cortical Depression
Henry C. Tuckwell
Abstract The nonlinear wave phenomenon of cortical spreading depression (SD), which occurs in many brain structures, has mathematical similarities to neuronal spiking but on very different space and time scales. Its properties and previous modeling are briefly reviewed. A model consisting of a 6-component reaction–diffusion system in two space dimensions is described. With 3-parameter Poisson process sources of potassium ions representing extrusions due to the random firings of neurons, the model takes the form of a multi-component set of nonlinear stochastic partial differential equations. Assuming that in a restricted small area the sources have greater strength than background, the probability of an SD wave is found as a function of the patch size. Also investigated is the probability of elicitation of SD through the occurrence of a patch with compromised metabolic activity, as may occur by virtue of an infarct after stroke. The analysis proceeds in terms of the effect of relative decreases in the strength of the ATP-dependent sodium–potassium exchange pump.
8.1 Introduction
Spreading cortical depression (SCD or CSD or SD) is a slowly moving (a few to 10 mm/min) wave occurring in many types of grey matter. It is typified by a surface negative potential and increased potassium ion concentration (to about 15–50 mM above resting levels) along with decreases in sodium, chloride and calcium concentrations. Neurons in its path are first highly excited and then silent. SD was discovered in the rabbit brain [26] but has since been found in the brains of most mammals, including humans [30] (see [38] for discussion) where it plays roles in
H.C. Tuckwell Max Planck Institute for Mathematics in the Sciences, Inselstr. 22, Leipzig 04103, Germany e-mail: [email protected] M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3 8, © Springer-Verlag Berlin Heidelberg 2013
Fig. 8.1 Left: Leão, who discovered SD in 1944. Right: Grafstein, who advanced the potassium hypothesis for SD
pathological phenomena such as stroke and migraine [11, 14, 25, 27, 31]. There is convincing evidence from MRI studies to suggest that SD or an SD-like event generates or coincides with the aura associated with migraine in human visual cortex [17] (Fig. 8.1). The characteristics of SD resemble those of action potentials in that it is a solitary wave (or approximately so) with a threshold, refractory period and admits the possibility of the occurrence of multiple waves in the case of a sustained stimulus. However, SD occurs in two and probably three space dimensions whereas we are accustomed to thinking of action potentials as being a one space-dimensional phenomenon. There are a large number of biophysical, electrophysiological, neurochemical and anatomical elements involved in SD instigation and propagation. Glia, neurons, synapses, many ion-channels both voltage and ligand gated, ion and transmitter concentrations, pumps, blood vessels, degree of hypoxia, cell swelling and gap junctions are some—for a useful discussion and comprehensive review see [38]. Since the 1970s there has been a continued debate about which elements are primary or causal for instigation and/or spread of SD. Originally, a view ascribed to Grafstein, a high concentration of extracellular potassium ions and their subsequent diffusion were considered to be the agent required for SD, but van Harreveld [46] for example raised the possibility that glutamate could also play such a role. On the grounds that rising potassium ion concentrations do not always occur ahead of an SD wave, Herreras [19] refuted their role in the spread. Debate has also centered on the role of gap junctions which were successfully incorporated in Shapiro’s detailed model [37]. Some authors have modeled single cell activity in order to understand SD but this approach seems to not address the S part of SD [21, 39]. 
Whereas such models are expected to give insight into cellular mechanisms, reduced ad hoc models like FitzHugh–Nagumo (FN) seem to be of limited utility given the extreme complexity of SD. It is worth recalling that the FN system is a two
component approximation to the 4-component Hodgkin–Huxley (HH) model, but that SD involves a much larger number of biophysical and physiological variables than HH. See also Chap. 6. There have been many recent significant advances in the mathematical modeling of SD and related complex phenomena. Detailed cellular models have been employed by Makarova et al. [28, 29] to analyze potential distributions during SD in the hippocampus. In an important direction, several authors have employed reaction–diffusion models, some simplified and others endowed with considerable physiological realism, in order to investigate geometric and physiological factors involved in such pathologies as migraine, stroke and cerebral ischemia [2,5,6,9,10], but see also the review by Hossman [20]. With the large amount of computing involved in solving realistic models, it has also been useful to consider efficient computational methods [40]. Recent important and interesting experimental advances and hypotheses have included measurements of ATP released into the extracellular space [36] and that circle of Willis anomalies contribute to migraine susceptibility and ischemic complications of migraine [7]. In addition, stress-induced comas in the locust have properties that closely resemble cortical SD [33], and in epilepsy models, increases in pre-ictal high-frequency (gamma) electroencephalographic activity had no effect on extracellular potassium or calcium ion concentrations [4]. An interesting study was also made of the effects of SD on sleep-related changes in rat brain [8].
8.2 Reaction–Diffusion Model for Cortical Spreading Depression
Although a model with only potassium and calcium ion concentrations can reproduce many of the characteristics of SD [42, 45], a more comprehensive model, initially proposed in [44], considers the four ions $K^+$, $Ca^{2+}$, $Na^+$, $Cl^-$; an excitatory transmitter, denoted by $T_E$, which is expected to be mainly glutamate; and an inhibitory transmitter, denoted by $T_I$, mostly GABA. Letting the vector of external ion and transmitter concentrations at time $t$ be $u(x, y, t)$, with components $u_1, \dots, u_6$ those of $K^+$, $Ca^{2+}$, $Na^+$, $Cl^-$, $T_E$ and $T_I$, respectively, then generally, for $k = 1, \dots, 6$,
\[ \frac{\partial u_k}{\partial t} = D_k \nabla^2 u_k + F_k(u), \tag{8.1} \]
where $D_k$ is the diffusion coefficient for the $k$-th component, with an initial condition
\[ u(x, y, 0) = u_0(x, y), \tag{8.2} \]
and suitable conditions at the boundary of the region under consideration. SD is actually a phenomenon in 3 space dimensions, but we consider only two for economy of computation. Two intracellular compartments are distinguished, one
pertaining to synapses and the other to nonsynaptic elements which may include contributions from glia. These are assigned possibly different ratios of extracellular to intracellular volumes, denoted by $\alpha_1$ and $\alpha_2$, respectively. The internal ion concentrations, denoted by $u_i^{int}$, $i = 1, 2, 3, 4$, are assumed to be given by local conservation equations. With $R$ denoting a resting equilibrium value, these are
\[ u_i^{int}(x, y, t) = u_i^{int,R} + \alpha_1 \left[ u_i^R - u_i(x, y, t) \right], \quad i = 1, 3, 4, \tag{8.3} \]
for potassium, sodium and chloride, whereas for calcium
\[ u_2^{int}(x, y, t) = u_2^{int,R} + \alpha_2 \left[ u_2^R - u_2(x, y, t) \right]. \tag{8.4} \]
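The conservation relations (8.3) and (8.4) amount to simple bookkeeping, which can be sketched as follows. The resting values and volume ratios are those of the standard parameter set of Sect. 8.2.1; the elevated external potassium level (20 mM) and the reduced external calcium level (0.5 mM) are illustrative assumptions, and the function name is hypothetical.

```python
def internal_concentration(internal_rest, alpha, external_rest, external):
    """Local conservation: a drop in the external concentration raises the
    internal one by alpha times that drop, and vice versa."""
    return internal_rest + alpha * (external_rest - external)


# Potassium: external K rises from 3 mM to an illustrative 20 mM (alpha_1 = 0.25).
k_in = internal_concentration(140.0, 0.25, 3.0, 20.0)

# Calcium: external Ca falls from 1 mM to an illustrative 0.5 mM (alpha_2 = 2.0).
ca_in = internal_concentration(0.0001, 2.0, 1.0, 0.5)

print(k_in, ca_in)
```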
It is more transparent to use $K^{o,i}$, $Ca^{o,i}$, $Na^{o,i}$, $Cl^{o,i}$, $T_E^{o,i}$ and $T_I^{o,i}$ for the ion and transmitter concentrations, and it is expeditious to omit the space-time coordinates $(x, y, t)$. The membrane potential is assumed given by the Goldman formula
\[ V_M = \frac{RT}{F} \ln \left[ \frac{K^o + p_{Na} Na^o + p_{Cl} Cl^i}{K^i + p_{Na} Na^i + p_{Cl} Cl^o} \right] \tag{8.5} \]
and the Nernst potentials, given by standard formulas for the ionic species, are denoted by $V_K$, $V_{Ca}$, $V_{Na}$, $V_{Cl}$. The very complex dynamics of calcium at presynaptic terminals have been the subject of many experimental and theoretical studies mainly with a view to quantitatively understanding transmitter release [12, 18, 22]. Other works have concentrated on calcium dynamics in neurons during action potentials [35]. We include a major component of calcium fluxes, that associated with the activation of synapses because of its relevance to transmitter release. Although flows through other membranes are doubtless significant they are for the most part neglected in the present model. Their inclusion is no more difficult, and their quantitative aspects just as uncertain, but to maintain a degree of simplicity they are omitted. The source and sink terms are slightly modified from those given in [44]. For potassium,
\[ f_K = k_1 (V_M - V_K) \left[ \frac{T_E^o}{T_E^o + k_2} + \frac{k_3 T_I^o}{T_I^o + k_4} \right] - P_{K,Na} + f_{K,p} + k_5, \tag{8.6} \]
where $P_{K,Na}$ is the pump term and $f_{K,p}$ is a passive flux term given by
\[ f_{K,p} = k_6 (V_M - V_{M,R})(V_M - V_K) H(V_M - V_{M,R}), \tag{8.7} \]
where $V_{M,R}$ is the resting membrane potential and $H(\cdot)$ is the Heaviside unit step function. The constant $k_5$ ensures that $f_K = 0$ at resting levels and it is assumed that the transmitter-induced conductance changes are zero unless $T_E^o$, $T_I^o$ are positive.
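To make the Goldman formula (8.5), the Nernst potentials and the passive flux (8.7) concrete, the following sketch evaluates them at the resting concentrations of the standard parameter set (Sect. 8.2.1). The value of $RT/F$ is not given in the text; 26.73 mV (about 37 °C) is assumed here, which gives a resting potential of roughly −76 mV. Function names are hypothetical.

```python
import math

RT_OVER_F = 26.73  # mV; assumed value for about 37 degrees C, not given in the text


def goldman(k_o, na_o, cl_o, k_i, na_i, cl_i, p_na=0.05, p_cl=0.4):
    """Membrane potential (mV) from the Goldman formula (8.5).

    Chloride is an anion, so its internal concentration sits in the
    numerator and its external concentration in the denominator.
    """
    num = k_o + p_na * na_o + p_cl * cl_i
    den = k_i + p_na * na_i + p_cl * cl_o
    return RT_OVER_F * math.log(num / den)


def nernst(c_out, c_in, valence):
    """Nernst potential (mV) for an ion of the given valence."""
    return (RT_OVER_F / valence) * math.log(c_out / c_in)


def passive_k_flux(v_m, v_m_rest, v_k, k6=0.00015):
    """Passive potassium flux (8.7); zero unless V_M exceeds its resting value."""
    if v_m <= v_m_rest:
        return 0.0
    return k6 * (v_m - v_m_rest) * (v_m - v_k)


v_rest = goldman(3.0, 120.0, 136.25, 140.0, 15.0, 6.0)
v_k = nernst(3.0, 140.0, +1)
v_cl = nernst(136.25, 6.0, -1)
v_ca = nernst(1.0, 0.0001, +2)
print(v_rest, v_k, v_cl, v_ca)
```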
Although ion pumps have a complicated dependence on concentrations of several ion species [41, 47], we have adopted a model with an explicit and relatively simple form for the sodium–potassium exchange pump [13],
\[ P_{K,Na} = k_{17} \left( 1 + \frac{k_{18}}{Na^i} \right)^{-3} \left( 1 + \frac{k_{19}}{K^o} \right)^{-2}, \tag{8.8} \]
where it is assumed that $Na^i > 0$ and $K^o > 0$. The contributions from action potentials were incorporated as described in [43]. For calcium,
\[ f_{Ca} = k_7 (V_M - V_{Ca}) g_{Ca} + P_{Ca} - k_8, \tag{8.9} \]
where the calcium conductance is
\[ g_{Ca} = \left( 1 + \tanh[k_{31}(V_M + V_M^*)] - k_{32} \right) H(V_M - V_{MT}), \tag{8.10} \]
$V_{MT}$ being a cut-off potential with
\[ k_{32} = 1 + \tanh[k_{31}(V_{MT} + V_M^*)] \tag{8.11} \]
to ensure that $g_{Ca}$ rises smoothly from zero as $V_M$ increases through $V_{MT}$. The calcium pump is simply
\[ P_{Ca} = \frac{k_{20}\, Ca^i\, H(Ca^i)}{Ca^i + k_{21}}. \tag{8.12} \]
The sodium and chloride terms contain transmitter-induced conductance changes and pumps,
\[ f_{Na} = k_9 (V_M - V_{Na}) \left[ \frac{T_E^o}{T_E^o + k_2} + \frac{k_{10} T_I^o}{T_I^o + k_4} \right] + k_{22} P_{K,Na} - k_{11}, \tag{8.13} \]
\[ f_{Cl} = k_{12} (V_M - V_{Cl}) \left[ \frac{k_{13} T_E^o}{T_E^o + k_2} + \frac{T_I^o}{T_I^o + k_4} \right] + P_{Cl} - k_{14}, \tag{8.14} \]
where
\[ P_{Cl} = \frac{k_{25}\, Cl^i\, H(Cl^i)}{Cl^i + k_{26}}. \tag{8.15} \]
Glutamate NMDA receptors have been strongly implicated in SD as known blockers of them prevent SD [32]. Rates of transmitter release are assumed proportional to calcium flux, so
\[ f_{T_E} = k_{15} (V_M - V_{Ca}) g_{Ca} - P_E, \tag{8.16} \]
\[ f_{T_I} = k_{16} (V_M - V_{Ca}) g_{Ca} - P_I, \tag{8.17} \]
where
\[ P_E = \frac{k_{27}\, T_E^o\, H(T_E^o)}{T_E^o + k_{28}}, \tag{8.18} \]
\[ P_I = \frac{k_{29}\, T_I^o\, H(T_I^o)}{T_I^o + k_{30}}. \tag{8.19} \]
Glutamate may also be released from glia during SD [1, 24], but this contribution is not explicitly taken into account here. The pump terms for glutamate and GABA represent the clearance of these transmitters, for example into glial cells [48].
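As a consistency check on the pump terms, the sketch below evaluates (8.8), (8.12) and (8.15) at the resting concentrations of the standard parameter set (Sect. 8.2.1). Under the reading of (8.8) with exponents −3 and −2, the resting pump rates reproduce the magnitudes of the offset constants $k_8$, $k_{11}$ (after scaling by $k_{22}$) and $k_{14}$, which is what one would expect if those constants are there to cancel the pump terms at rest. This interpretation, and the function names, are assumptions of the sketch.

```python
def pump_k_na(na_i, k_o, k17=577.895, k18=2.5, k19=2.5):
    """Sodium-potassium exchange pump (8.8); requires na_i > 0 and k_o > 0."""
    return k17 * (1.0 + k18 / na_i) ** -3 * (1.0 + k19 / k_o) ** -2


def saturating_pump(conc, v_max, k_half):
    """Saturating form shared by (8.12), (8.15), (8.18) and (8.19).

    The Heaviside factor makes the pump vanish for non-positive concentrations.
    """
    if conc <= 0.0:
        return 0.0
    return v_max * conc / (conc + k_half)


p_kna_rest = pump_k_na(na_i=15.0, k_o=3.0)       # Na_i = 15 mM, K_o = 3 mM
p_ca_rest = saturating_pump(0.0001, 0.8, 0.2)    # k20, k21 with Ca_i at rest
p_cl_rest = saturating_pump(6.0, 260.16, 9.0)    # k25, k26 with Cl_i at rest

print(0.3677 * p_kna_rest, "vs k11 = 39.8140")
print(p_ca_rest, "vs k8 = 0.0003998")
print(p_cl_rest, "vs k14 = 104.064")
```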
8.2.1 The Standard Parameter Set
The above system of six reaction–diffusion equations may be integrated using an explicit method. The following set of constants, obtained from experiment or chosen judiciously in order to give results similar to experiment, are called the standard set.

Ratios of extracellular to intracellular volumes:
$\alpha_1 = 0.25$, $\alpha_2 = 2.0$.

Diffusion coefficients in units of $10^{-5}$ cm$^2$ s$^{-1}$:
$D_K = 2.5$, $D_{Ca} = 1.0$, $D_{Na} = 1.7$, $D_{Cl} = 2.5$, $D_{T_E} = D_{T_I} = 1.3$.

Resting concentrations in mM:
$K^{o,R} = 3$, $K^{i,R} = 140$, $Ca^{o,R} = 1$, $Ca^{i,R} = 0.0001$,
$Na^{o,R} = 120$, $Na^{i,R} = 15$, $Cl^{o,R} = 136.25$, $Cl^{i,R} = 6$.

Permeabilities:
$p_{Na} = 0.05$, $p_{Cl} = 0.4$.

Calcium conductance parameters in mV:
$V_M^* = 45$, $V_{MT} = -60$.

Dynamical constants:
$k_1 = 78.091$, $k_2 = 1.5$, $k_3 = 0$, $k_4 = 1.5$, $k_5 = 0$,
$k_6 = 0.00015$, $k_7 = 0.2$, $k_8 = 0.0003998$, $k_9 = 1.6$, $k_{10} = 0$,
$k_{11} = 39.8140$, $k_{12} = 104.05$, $k_{13} = 0$, $k_{14} = 104.064$, $k_{15} = 3.47$,
$k_{16} = 3.15$, $k_{17} = 577.895$, $k_{18} = 2.5$, $k_{19} = 2.5$, $k_{20} = 0.8$,
$k_{21} = 0.2$, $k_{22} = 0.3677$, $k_{23} = 0.11$, $k_{24} = 0.0711$, $k_{25} = 260.16$,
$k_{26} = 9.0$, $k_{27} = 47.124$, $k_{28} = 1.0$, $k_{29} = 47.124$, $k_{30} = 1.00$.
With an initial condition consisting of a supra-threshold Gaussian elevation of potassium chloride concentration, a typical solution is shown in Fig. 8.2. The amplitude and velocity of the wave of ion concentrations are in the experimental range.
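The text states only that an explicit method may be used. As a minimal sketch of what one explicit step could look like, the code below applies forward Euler with a five-point Laplacian to a single component of (8.1) on a square grid. The no-flux (reflecting) boundary treatment, the grid size, the step sizes and all function names are assumptions of this sketch, not the scheme actually used for Fig. 8.2; the full model would update all six components together with their coupled source terms.

```python
def laplacian_noflux(u, h):
    """Five-point Laplacian of a 2-D grid (list of rows) with reflecting edges."""
    ny, nx = len(u), len(u[0])
    lap = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            c = u[j][i]
            # A missing neighbour is replaced by the cell itself (zero flux).
            up = u[j - 1][i] if j > 0 else c
            down = u[j + 1][i] if j < ny - 1 else c
            left = u[j][i - 1] if i > 0 else c
            right = u[j][i + 1] if i < nx - 1 else c
            lap[j][i] = (up + down + left + right - 4.0 * c) / (h * h)
    return lap


def euler_step(u, diff, h, dt, reaction):
    """One forward-Euler step of du/dt = diff * Laplacian(u) + reaction(u)."""
    if diff * dt / (h * h) > 0.25:
        raise ValueError("time step too large for the explicit scheme")
    lap = laplacian_noflux(u, h)
    return [[u[j][i] + dt * (diff * lap[j][i] + reaction(u[j][i]))
             for i in range(len(u[0]))] for j in range(len(u))]


# Pure diffusion of a point elevation on a 9 x 9 grid (no reaction term).
grid = [[0.0] * 9 for _ in range(9)]
grid[4][4] = 1.0
for _ in range(50):
    grid = euler_step(grid, diff=2.5, h=1.0, dt=0.05, reaction=lambda v: 0.0)
total = sum(map(sum, grid))
print(total)
```

The guard `diff * dt / h**2 <= 0.25` is the usual stability restriction for this scheme in two dimensions; with no reaction term and reflecting edges the total amount on the grid is conserved, which makes a convenient sanity check.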
8.3 Random Sources of $K^+$
Neural and other electrophysiological activity across the cerebral cortex is stochastic in nature, so many sources and sinks of ions and transmitters should be modeled with random processes. A prominent example is the release, as the result of action potentials and also due to the synaptic activation of receptors, of bursts into the extracellular compartment of potassium ions, with concomitant absorption of sodium, calcium and chloride ions. A simple way to include these random emissions (in two space dimensions) is by means of a counting process $\{N(x, y, t),\ x_1 \le x \le x_2,\ y_1 \le y \le y_2,\ t \ge 0\}$ which gives the number of action potentials in $[x_1, x] \times [y_1, y]$ in the time interval $(0, t]$. Then the differential equation for the external potassium ion concentration becomes
\[ \frac{\partial u_1}{\partial t} = D_K \nabla^2 u_1 + F_1(u) + \alpha \frac{\partial^3 N(x, y, t)}{\partial x\, \partial y\, \partial t} \tag{8.20} \]
where $\alpha$ is the local increase in $K^+$ concentration due to one extrusion event. Here for simplicity we assume that each pulse releases the same amount of potassium, whereas in the cortex, the amounts released will be random and with rates that may
Fig. 8.2 The response when potassium chloride is added at the center of the square for the deterministic model with the standard parameter set. Shown spreading from the center are solitary waves of increased extracellular potassium and transmitters and decreased calcium, sodium and chloride. The initial distribution in all cases was a Gaussian suprathreshold application of KCl at the center of the space interval. A unit of distance corresponds to about 18.57 mm and a unit of time is 345 s. The times at which the concentrations are shown are time points $t_k = 0$, $t_k = 300$ and $t_k = 600$, corresponding to times at 0, 8.63 min and 17.25 min, respectively
depend on the coordinates $x$, $y$ and $t$. Some preliminary studies have been done by assuming in the first instance that the process $N$ is a three-parameter Poisson process with mean rate $\lambda$, so that $E[N(x, y, t)] = \lambda (x - x_1)(y - y_1) t$.
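A three-parameter Poisson process of this kind can be sampled directly: on a rectangle of area $A$ events arrive in time at total rate $\lambda A$, with independent exponential waiting times and positions uniform over the rectangle. The sketch below does this using only the standard library; the function name, the unit square and the time horizon are illustrative assumptions.

```python
import random


def sample_poisson_sources(rate, x1, x2, y1, y2, t_end, seed=0):
    """Sample (t, x, y) events of a homogeneous space-time Poisson process.

    rate is the mean number of events per unit area per unit time.
    """
    rng = random.Random(seed)
    total_rate = rate * (x2 - x1) * (y2 - y1)  # events per unit time in the box
    events = []
    t = 0.0
    while True:
        t += rng.expovariate(total_rate)  # exponential waiting time
        if t > t_end:
            return events
        events.append((t, rng.uniform(x1, x2), rng.uniform(y1, y2)))


events = sample_poisson_sources(rate=20.0, x1=0.0, x2=1.0, y1=0.0, y2=1.0, t_end=10.0)
expected_count = 20.0 * 1.0 * 1.0 * 10.0  # lambda * (x2 - x1) * (y2 - y1) * t
print(len(events), expected_count)
```

In a discretized version of (8.20), one natural choice (again an assumption here) is to add the amount $\alpha$ to $u_1$ in the grid cell containing each event during the time step in which it occurs.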
8.3.1 Mainly Uniform $K^+$ Sources with an Isolated Region of Higher Activity
It is of great interest to ascertain in the context of a theoretical model if SD can occur spontaneously due to excessive local electrophysiological activity. For example, epileptic activity is a strong predictor of SD [3], as such activity results in excessive emissions of potassium ions. Thus we here consider that there is no initial local elevation of potassium chloride as in Fig. 8.2 but steady random background emissions of $K^+$ due to the firings of nerve cells with rate parameter
Fig. 8.3 The fraction of trials, after 40 min and 25 s, in which SD wave formed with random sources of potassium as described in the text
$\lambda_1 = 20$ per unit area per unit time and with magnitude $\alpha_1 = 2$. However, over a small square of area $A$ mm$^2$ the rate is the same, $\lambda_2 = \lambda_1$, but the amounts of potassium released are greater, at $\alpha_2 = 6$. We examine the system after the substantial time interval of about 40 min to see if an SD wave has formed. This is done many times for various $A$ so that the probability of the occurrence of an SD wave can be estimated. The results are shown in Fig. 8.3. It can be seen that if the area of the patch of heightened emissions is less than a threshold value $A_{thresh}$ of about 0.05 sq mm, then no SD wave forms. However, when $A = 0.06$ sq mm, there is sufficient potassium extruded to give an SD wave in about 10% of trials. This probability of SD formation increases steadily as $A$ increases, eventually reaching unity when $A$ is 0.142 sq mm. Of course, by the fundamentals of probability theory, for all $A > A_{thresh}$, an SD wave will form eventually with probability one, assuming that conditions remain fixed. There will also be threshold values for $\lambda_2$ and $\alpha_2$. A complete investigation will be reported in a later publication.
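The probability estimates described above are fractions of trials in which a wave forms. A minimal sketch of that estimate, with a binomial standard error to indicate how many trials a given precision requires, is given below. The function `toy_trial` is a hypothetical stand-in for one full run of the stochastic model, and the success probability 0.1 is only meant to mimic the roughly 10% figure quoted for $A = 0.06$ sq mm.

```python
import math
import random


def estimate_probability(run_trial, n_trials, seed=0):
    """Fraction of trials returning True, with its binomial standard error."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if run_trial(rng))
    p_hat = hits / n_trials
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_trials)
    return p_hat, std_err


def toy_trial(rng, p_wave=0.1):
    """Hypothetical stand-in for one simulation: True if an SD wave formed."""
    return rng.random() < p_wave


p_hat, std_err = estimate_probability(toy_trial, n_trials=2000)
print(p_hat, std_err)
```

Since each trial of the real model is an expensive simulation, the standard error $\sqrt{p(1-p)/n}$ shows the trade-off directly: with only a few dozen trials per patch size, estimated probabilities near 0.1 carry an uncertainty of several percentage points.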
8.4 Reduced Exchange-Pump Capacity Over a Small Lesion
As a consequence of stroke or other pathologies [15, 34] there is damage to nerve cells, glia and other cell types in regions called infarcts. Such regions are implicated as containing sources of SD, often in conjunction with peri-infarct depolarization
Fig. 8.4 Representation of a small lesion in the brain over which there is compromised metabolic capacity, especially with regard to the sodium-potassium exchange pump (figure labels: 4.15 cm; lesion 2 mm × 2 mm)
[11]. It is expected that over a damaged region there will be less than normal metabolic energy for such processes as the ATP-dependent potassium–sodium exchange pump, amongst others. The amount of $K^+$ pumping for an internal sodium level $Na^i$ and an external potassium level $K^o$ is modeled by the following empirically based simple expression [13], which has the same form as (8.8) with the constants written here as $k_1$, $k_2$, $k_3$:
\[ P_{K,Na} = k_1 \left( 1 + \frac{k_2}{Na^i} \right)^{-3} \left( 1 + \frac{k_3}{K^o} \right)^{-2}, \tag{8.21} \]
where it is assumed that $Na^i > 0$ and $K^o > 0$. To represent the reduced pumping theoretically, the constant $k_1$ is multiplied by a scale factor $s_p$ taking values between 0 and 1. Figure 8.4 shows the geometry of the region under consideration, with uniform background emissions of $K^+$ as explained above with $\lambda = 40$ and $\alpha = 0.5$, values which are normally subthreshold for eliciting SD. In Fig. 8.5 we give computed solutions for a sample path of the model stochastic partial differential equations for the external potassium ion concentration at selected times with a scale factor of $s_p = 0.563$, which is just below the critical value. At $t = 3$, a local elevation of potassium concentration has occurred around the lesion and by $t = 5$ an SD wave has emerged. In the remaining frame at $t = 7$ the SD wave can be seen to have propagated to quite a large distance from the lesion. In Table 8.1 are given results of reducing the strength of the Na/K exchange pump by various scale factors. Here 1 signifies the development of an SD wave within 23 min with random background activity as described above; 0 denotes no wave. It can be seen that the critical value of the multiplicative scale factor $s_p$ is close to 0.56; stronger pumps inhibit the formation of SD whereas weaker pumps promote it because there is more potassium available in the extracellular space. In the runs with $s_p = 0.565$ and $s_p = 0.57$, however, SD waves did form in the extended time period of over 52 min.
Fig. 8.5 Showing the random spontaneous emergence of an SD wave (external potassium ion concentration only) from the lesion over which the exchange K/Na pump has reduced capacity. Note that without the lesion, no SD wave forms with this level of background activity

Table 8.1 Occurrence (1) or not (0) of SD

Scale factor $s_p$   Trial 1   Trial 2   Trial 3
0.5                  1         1         1
0.55                 1         1         1
0.56                 1         1         1
0.562                1         1         0
0.565                0         0         0
0.57                 0         0         0
0.6                  0         0         0
0.65                 0         0         0
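The entries of Table 8.1 can be summarized as the fraction of trials showing SD at each scale factor, which brackets the critical value for the 23 min observation window. The short sketch below does this; the data are those of the table, while the variable names are illustrative.

```python
# Outcomes from Table 8.1: scale factor -> (trial 1, trial 2, trial 3).
table_8_1 = {
    0.5: (1, 1, 1),
    0.55: (1, 1, 1),
    0.56: (1, 1, 1),
    0.562: (1, 1, 0),
    0.565: (0, 0, 0),
    0.57: (0, 0, 0),
    0.6: (0, 0, 0),
    0.65: (0, 0, 0),
}

# Fraction of trials in which an SD wave developed within the window.
fractions = {sp: sum(trials) / len(trials) for sp, trials in table_8_1.items()}

# The critical scale factor lies between these two values.
largest_with_sd = max(sp for sp, f in fractions.items() if f > 0)
smallest_without_sd = min(sp for sp, f in fractions.items() if f == 0)

print(fractions)
print(largest_with_sd, smallest_without_sd)
```

With three trials per scale factor the bracket (0.562, 0.565) is only indicative, and it applies to the 23 min window; the text notes that waves did form at 0.565 and 0.57 over a longer period.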
8.5 Discussion
Preliminary results for the instigation of SD by random sources of potassium, corresponding to nerve cell firings, were presented, showing threshold effects. Furthermore, the effects of reduced metabolic capacity as occurs due to ischemic lesions in the brain were investigated quantitatively in the presence of uniform
random nerve cell activity. One other important application in neurobiology that we have not had space to discuss is in models for epileptiform activity [23], which take the form of large systems of nonlinear SPDEs. Analysis of models of epilepsy and SD can hopefully provide insight into new therapeutic measures for seizure and stroke treatments, as SD, after many repetitions, can inflict damage on neuronal and related structures [16].
References
1. Basarsky, T.A., Feighan, D., MacVicar, B.A.: Glutamate release through volume-activated channels during spreading depression. J. Neurosci. 19, 6439–6445 (1999)
2. Boissel, J.P., Ribba, B., Grenier, E., Chapuisat, G., Dronne, M.A.: Modelling methodology in physiopathology. Prog. Biophys. Molec. Biol. 97, 28–39 (2008)
3. Bonthius, D.J., Stringer, J.L., Lothman, E.W., Steward, O.: Spreading depression and reverberatory seizures induce the upregulation of mRNA for glial fibrillary acidic protein. Brain Res. 645, 215–224 (1994)
4. Broberg, M., Pope, K.J., Nilsson, M., Wallace, A., Wilson, J., Willoughby, J.O.: Preseizure increased gamma electroencephalographic activity has no effect on extracellular potassium or calcium. J. Neurosci. Res. 85, 906–918 (2007)
5. Chapuisat, G.: Discussion of a simple model of spreading depressions. ESAIM Proc. 18, 87–98 (2007)
6. Chapuisat, G., Dronne, M.A., Grenier, E., Hommel, M., Gilquin, H., Boissel, J.P.: A global phenomenological model of ischemic stroke with stress on spreading depressions. Prog. Biophys. Molec. Biol. 97, 4–27 (2008)
7. Cucchiara, B., Detre, J.: Migraine and circle of Willis anomalies. Med. Hypoth. 70, 860–865 (2008)
8. Cui, Y., Katoaka, Y., Inui, T., Mochizuki, T., Onoe, H., Matsumura, K., Urade, Y., Yamada, H., Watanabe, Y.: Up-regulated neuronal COX-2 expression after cortical spreading depression is involved in non-REM sleep induction in rats. J. Neurosci. Res. 86, 929–936 (2008)
9. Dahlem, M.A., Schneider, F.M., Schöll, E.: Efficient control of transient wave forms to prevent spreading depolarizations. J. Theor. Biol. 251, 202–209 (2008)
10. Dahlem, M.A., Schneider, F.M., Schöll, E.: Failure of feedback as a putative common mechanism of spreading depolarizations in migraine and stroke. Chaos 18, 026110 (2008)
11. Fabricius, M., Fuhr, S., Willumsen, L., Dreier, J.P., Bhatia, R., Boutelle, M.G., Hartings, J.A., Bullock, R., Strong, A.J., Lauritzen, M.: Association of seizures with cortical spreading depression and peri-infarct depolarisations in the acutely injured human brain. Clin. Neurophysiol. 119, 1973–1984 (2008)
12. Fossier, P., Tauc, L., Baux, G.: Calcium transients and neurotransmitter release at an identified synapse. Trends Neurosci. 22, 161–166 (1999)
13. Garay, R.P., Garrahan, P.J.: The interaction of sodium and potassium with the sodium pump in red cells. J. Physiol. 231, 297–325 (1973)
14. Gardner-Medwin, A.R.: Possible roles of vertebrate neuroglia in potassium dynamics, spreading depression and migraine. J. Exp. Biol. 95, 111–127 (1981)
15. Gass, A., Ay, H., Szabo, K., Koroshetz, W.J.: Diffusion-weighted MRI for the small stuff: the details of acute cerebral ischaemia. Lancet Neurol. 3, 39–45 (2004)
16. Gorgi, A.: Spreading depression: a review of the clinical relevance. Brain Res. Rev. 38, 33–60 (2001)
17. Hadjikhani, N., Sanchez Del Rio, M., Wu, O., Schwartz, D., Bakker, D., Fischl, B., Kwong, K.K., Cutrer, F.M., Rosen, B.R., Tootell, R.B., Sorensen, A.G., Moskowitz, M.A.: Mechanisms of migraine aura revealed by functional MRI in human visual cortex. Proc. Natl. Acad. Sci. 98, 4687–4692 (2001)
18. Heidelberger, R., Heinemann, C., Neher, E., Matthews, G.: Calcium dependence of the rate of exocytosis in a synaptic terminal. Nature 371, 513–515 (1994)
19. Herreras, O.: Electrical prodromals of spreading depression void Grafstein's potassium hypothesis. J. Neurophysiol. 94, 3656 (2005)
20. Hossman, K.A.: Cerebral ischemia: models, methods and outcomes. Neuropharmacology 55, 257–270 (2007)
21. Kager, H., Wadman, W.J., Somjen, G.G.: Simulated seizures and spreading depression in a neuron model incorporating interstitial space and ion concentrations. J. Neurophysiol. 84, 495–512 (2000)
22. Koester, H.J., Sakmann, B.: Calcium dynamics associated with action potentials in single nerve terminals of pyramidal cells in layer 2/3 of the young rat neocortex. J. Physiol. 529, 625–646 (2000)
23. Kramer, M.A., Szeri, A.J., Sleigh, J.W., Kirsch, H.E.: Mechanisms of seizure propagation in a cortical model. J. Comput. Neurosci. 22, 63–80 (2007)
24. Larrosa, B., Pastor, J., López-Aguado, L., Herreras, O.: A role for glutamate and glia in the fast network oscillations preceding spreading depression. Neuroscience 141, 1057–1068 (2006)
25. Lauritzen, M.: Pathophysiology of the migraine aura: the spreading depression theory. Brain 117, 191–210 (1994)
26. Leão, A.A.P.: Spreading depression of activity in the cerebral cortex. Neurophysiology 7, 359–390 (1944)
27. Leão, A.A.P., Morison, R.S.: Propagation of spreading cortical depression. J. Neurophysiol. 8, 33–45 (1945)
28. Makarova, J., Ibarz, J.M., Canals, S., Herreras, O.: A steady-state model of spreading depression predicts the importance of an unknown conductance in specific dendritic domains. Biophys. J. 92, 4216–4232 (2007)
29. Makarova, J., Makarov, V.A., Herreras, O.: Generation of sustained field potentials by gradients of polarization within single neurons: a macroscopic model of spreading depression. J. Neurophysiol. 103, 2446–2457 (2010)
30. Mayevsky, A., Doron, A., Manor, T., Meilin, S., Zarchin, N., Ouaknine, G.E.: Cortical spreading depression recorded from the human brain. Brain Res. 740, 268–274 (1996)
31. Milner, P.M.: Note on a possible correspondence between the scotomas of migraine and spreading depression of Leão. Electroencephalogr. Clin. Neurophysiol. 10, 705 (1958)
32. Obrenovitch, T.P., Zilkha, E.: Inhibition of cortical spreading depression by L-701,324, a novel antagonist at the glycine site of the N-methyl-D-aspartate receptor complex. Br. J. Pharmacol. 117, 931–937 (1996)
33. Rodgers, C.I., Armstrong, G.A.B., Robertson, R.M.: Coma in response to environmental stress in the locust: a model for cortical spreading depression. J. Insect. Physiol. 56, 980–990 (2010)
34. Rovira, A., Grivé, E., Rovira, A., Alvarez-Sabin, J.: Distribution territories and causative mechanisms of ischemic stroke. Eur. Radiol. 15, 416–426 (2005)
35. Schiller, J., Helmchen, F., Sakmann, B.: Spatial profile of dendritic calcium transients evoked by action potentials in rat neocortical pyramidal neurones. J. Physiol. 487, 583–600 (1995)
36. Schock, S.C., Munyao, N., Yakubchyk, Y., Sabourin, L.A., Hakim, A.M., Ventureyra, E.C., Thompson, C.S.: Cortical spreading depression releases ATP into the extracellular space and purinergic receptor activation contributes to the induction of ischemic tolerance. Brain Res. 1168, 129–138 (2007)
37. Shapiro, B.E.: Osmotic forces and gap junctions in spreading depression: a computational model. J. Comp. Neurosci. 10, 99–120 (2001)
38. Somjen, G.G.: Mechanisms of spreading depression and hypoxic spreading depression-like depolarization. Physiol. Rev. 81, 1065–1096 (2001)
39. Somjen, G.G., Kager, H., Wadman, W.J.: Calcium sensitive nonselective cation current promotes seizure-like discharges and spreading depression in a model neuron. J. Comp. Neurosci. 26, 139–147 (2009)
40. Teixeira, H.Z., Alvarenga, D.J., Almeida, A.C.G., Rodrigues, A.M., Duarte, M.A.: Parallelization of the electrodiffusion mechanism of the computational model of spreading depression. Computational Science and Engineering, Proceedings of 11th International Conference 2008, IEEE Conference publications. DOI: 10.1109/CSE.2008.12 pp. 261–266 (2008)
41. Torok, T.L.: Electrogenic Na+/Ca2+-exchange of nerve and muscle cells. Prog. Neurobiol. 82, 287–347 (2007)
42. Tuckwell, H.C.: Predictions and properties of a model of potassium and calcium ion movements during spreading cortical depression. Int. J. Neurosci. 10, 145–165 (1980)
43. Tuckwell, H.C.: Mathematical modeling of spreading cortical depression: spiral and reverberating waves. Am. Inst. Phys. Conf. Proc. 1028, 46–64 (2008)
44. Tuckwell, H.C., Hermansen, C.L.: Ion and transmitter movements during spreading cortical depression. Int. J. Neurosci. 12, 109–135 (1981)
45. Tuckwell, H.C., Miura, R.M.: A mathematical model for spreading cortical depression. Biophys. J. 23, 257–276 (1978)
46. Van Harreveld, A.: Two mechanisms for spreading depression in the chicken retina. J. Neurobiol. 9, 419–431 (1978)
47. Yingst, D.R., Davis, J., Schiebinger, R.: Effects of extracellular calcium and potassium on the sodium pump of rat adrenal glomerulosa cells. Am. J. Physiol. Cell Physiol. 280, C119–C125 (2001)
48. Zoremba, N., Homola, A., Rossaint, R., Syková, E.: Brain metabolism and extracellular space diffusion parameters during and after transient global hypoxia in the rat cortex. Exp. Neurol. 203, 34–41 (2007)
Glossary
Agronomy The science of agriculture.
Bridge process A continuous-time stochastic process, derived from a given process X and defined only on the time interval $[t_0, t_1]$, whose probability distribution is the conditional probability distribution of X given that it takes fixed values at times $t_0$ and $t_1$, where $t_i < \infty$.
Diffusion process A Markov process in continuous time with continuous sample paths.
First passage time The random variable representing the time until a given stochastic process first hits a set of its state space. For a one-dimensional process it can for example represent the time at which a boundary level is first reached starting from the origin.
Green’s function A Green’s function or impulse response function is a function used to solve ordinary or partial differential equations with given initial and/or boundary conditions. Green’s functions are named after the mathematician George Green who first used them in 1830.
Hodgkin–Huxley spatial model The Hodgkin–Huxley spatial model is a system of nonlinear partial differential equations originally employed in 1952 to study the propagation of action potentials in a squid axon.
Inference Statistical inference is the process of drawing conclusions from data that are subject to random variation, for example measurement noise or sampling variation.
Interspike interval Time between two successive spikes of a neuron. When the neuron membrane potential reaches a critical threshold value it triggers an action potential, commonly called a spike.
M. Bachar et al. (eds.), Stochastic Biomathematical Models, Lecture Notes in Mathematics 2058, DOI 10.1007/978-3-642-32157-3, © Springer-Verlag Berlin Heidelberg 2013
Jump diffusion process A stochastic process consisting of a continuous part driven by a Wiener process, or more generally by a diffusion process, and a jump part with jumps driven by a Poisson process and either fixed or random jump sizes.
Large deviations theory A branch of probability theory concerning the asymptotic behaviour of the tails of sequences of probability distributions.
Leaky Integrate and Fire models Formal spiking neuron models. The basic circuit of a leaky integrate and fire model consists of a capacitor C in parallel with a resistor R driven by a current $I(t)$. The voltage $v(t)$ across the capacitance is compared to a threshold. If $v(t_i)$ equals the threshold, an output pulse $\delta(t - t_i)$ is generated.
Markov process A stochastic process in which the conditional distribution of the next value, given the current and preceding values, depends only on the current value, i.e. conditionally on the present state of the system, its future and past are independent.
Martingale A stochastic process in which the conditional expectation of the next value, given the current and preceding values, is the current value.
Neuron membrane potential The potential difference between the inside and outside of the cell. At rest, neurons have a membrane potential of typically about −70 mV.
Oncology The science of cancer.
Ornstein–Uhlenbeck process A stochastic process that is the solution of an SDE with linear drift and constant diffusion term. It is Gaussian and Markov. The process tends to drift towards its long-term mean: this is called mean reversion. It describes the velocity of a Brownian particle under the influence of friction.
Reaction–diffusion system One or more differential equations, at least one of which is a diffusion partial differential equation, with source and sink terms which determine the dynamics of the system. The simplest example is Fisher’s equation.
SPDE Stochastic partial differential equation. A partial differential equation such as the heat equation or wave equation, with random forcing terms, or parameters or boundary conditions which contain random processes.
Spreading depression A slowly moving surface-negative wave which occurs in many brain structures in which there is at first excitation of neurons followed by silence and then recovery. It is accompanied by massive changes in ionic and other concentrations and is believed to be primarily pathological.
Stochastic (non-deterministic) A stochastic process is one whose behavior contains random elements and which can be described through its probability distributions.
Stochastic differential equation (SDE) A differential equation in which one or more terms are stochastic processes, often Wiener processes, thus resulting in a solution which is itself a stochastic process. However, other types of random fluctuations are possible, such as compound Poisson processes, resulting in jump processes.
Stroke A stroke occurs if blood supply to a part of the brain is disrupted by either blockage or by loss of blood. The disabling effects depend on the part of the brain affected. It is the second leading cause of death worldwide.
Two-parameter white noise The second mixed partial derivative of two-parameter Brownian motion is a random process called two-parameter white noise. The process is delta-correlated in both parameters.
Wiener process or Brownian motion A continuous time random (diffusion) process with independent normally distributed (Gaussian) increments with mean zero and variance proportional to the elapsed time.
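Several of the glossary terms above (Ornstein–Uhlenbeck process, first passage time, interspike interval, leaky integrate and fire model) can be tied together in a small simulation. The following is a minimal Python sketch, not a method taken from the text: it assumes the parametrization dX = (−X/τ + μ) dt + σ dW for the membrane potential, discretizes it with an Euler–Maruyama step, and records the first time the path reaches a fixed threshold. The function name `ou_first_passage_time`, the parameter names, and all numerical values are hypothetical and chosen only for illustration.

```python
import math
import random


def ou_first_passage_time(mu, tau, sigma, threshold,
                          x0=0.0, dt=1e-3, t_max=100.0, rng=None):
    """Simulate dX = (-X/tau + mu) dt + sigma dW by Euler-Maruyama.

    Returns the first grid time at which X >= threshold, or None if the
    threshold is not reached before t_max. The parametrization and all
    names are assumptions made for this sketch.
    """
    rng = rng or random.Random()
    x, t = x0, 0.0
    sqrt_dt = math.sqrt(dt)
    while t < t_max:
        # Wiener increment over one step: Gaussian, mean 0, variance dt.
        dw = sqrt_dt * rng.gauss(0.0, 1.0)
        x += (-x / tau + mu) * dt + sigma * dw
        t += dt
        if x >= threshold:
            return t  # first passage time on the discretization grid
    return None


if __name__ == "__main__":
    # Each simulated first passage time plays the role of one interspike
    # interval when the potential is reset to x0 after every spike.
    rng = random.Random(0)
    samples = [ou_first_passage_time(1.2, 1.0, 0.5, 1.0, rng=rng)
               for _ in range(200)]
    hits = [s for s in samples if s is not None]
    if hits:
        print("mean simulated interspike interval:", sum(hits) / len(hits))
```

Because the path is only observed on a time grid, crossings between grid points are missed, so a sketch like this tends to overestimate the first passage time slightly; a smaller `dt` reduces that bias at the cost of more steps.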
Index
σ-algebra, 37

action functional, 66, 179
agronomy, 29

Bayesian estimation, 22, 29
Bessel process, 49
boundary, 39, 102, 104, 120
boundary conditions, 156
Brownian bridge, bridge process, 21, 110, 136
Brownian Motion, see Wiener process 5

cable equation, 151
calcium, 160
chicken growth, 29
continuous-time Markov, 73, 78
Cox–Ingersoll–Ross process, see square-root process

diffusion process, 39, 67, 82
discrete-time Markov chain, 64

empirical measure, 58
Euler–Maruyama scheme, 16
Euler–Lagrange equation, 88
exchange pump, 191, 195

Feller process, see square-root process
filtered probability space, 37
first passage time, 102
Fokker–Planck equation, 103
Freidlin–Wentzell, 57, 65

Gaussian process, 6
Geometric Brownian motion, 11, 15, 30
glutamate, 188, 189, 191
Green’s function, 43, 155

Hopf bifurcation, 178

importance sampling, 21, 76, 78
infinitesimal generator, 40
interspike interval, 101
inverse first passage time problem, 135, 142
inverse stochastic resonance, 163, 166
ion channels, 150, 161
ion currents, 160
Itô integral, 10
Itô’s formula, 14, 30, 40
Itô’s isometry, 10

Kalman filter, 22, 26
Kolmogorov equation, 102

large deviation principle, 61
large deviations, 57, 73, 80, 178
lesion, 195
likelihood function
  Gaussian approximation, 19
  quasi-likelihood, 20

Markov chain, see Markov process 4
Markov Chain Monte Carlo, 23
Markov jump process, 68
Markov process, 4, 10
martingale, 19, 38
  local, 38
martingale estimating function, 23
  quadratic, 20, 24
maximum likelihood, 18
  estimator, 26
membrane potential, 101
Milstein scheme, 17
model
  FitzHugh–Nagumo, spatial, 168
  FitzHugh–Nagumo, 175
  Gompertz, 29
  Hodgkin–Huxley, spatial, 162
  Hodgkin–Huxley, 176
  integrate and fire, 99
  jump diffusion, 119
  leaky integrate and fire, 78, 100
  linear SPDE, 155
  logistic, 29
  Morris–Lecar, 91
  pharmacokinetic, 25
  reaction–diffusion, 189
  Richards, 29
  Weibull, 29
  Wiener, 102, 103
Monte Carlo method, 16, 20, 74

oncology, 25
ordinary differential equation, 3
Ornstein–Uhlenbeck
  bidimensional, 26
  first passage time, 124, 127, 129, 134
  likelihood, 19
  model, 107, 117, 119, 120
  model with periodic input, 121
  parameter estimation, 139, 141
  transformation to a Wiener process, 128
  parameter estimation, 19
  process, 11, 22

Poisson process
  three-parameter, 194
postsynaptic potential, 101
potassium ions
  random extrusion, 193
probability of transmission, 168

quadratic variation, 7, 40
quasipotential, 68, 87, 179

random walk, 4, 6, 104
randomized random walk, 102, 105, 106
recurrent, 46
refractoriness, 122
repetitive firing, 162, 166
return process, 122
reversal potential, 112, 115, 117, 119
reversal potentials
  synaptic, 156

scale function, 40
simulation methods, 16, 75, 136
speed measure, 44
spike train, 101
spiking neuron, 160
spreading depression (SD), 187
square-root process, 12, 53, 113
  first passage time, 124, 129
  parameter estimation, 139, 141
  quadratic martingale estimating function, 24
  transformation to a Wiener process, 128
  first passage time, 127
statistical inference, 17
  with measurement noise, 22
  without measurement noise, 18
Stein’s model, 102, 106, 107, 112, 115
stochastic differential equation, 3, 8, 39
stochastic model, 3
stochastic ordering, 118, 125, 131
stochastic partial differential equation, 149
stochastic process, 4
stopping time, 38
stroke, 189, 195
strong order of convergence, 16

threshold, see boundary

Volterra integral equation, 125

weak order of convergence, 16
white noise, two-parameter, 157
Wiener process, 5, 8, 123, 126, 132, 135
  first-passage time distribution, 110, 119, 130
  with drift, 11