Springer Series in Cognitive and Neural Systems, Volume 1

Series Editors: Vassilis Cutsuridis, John G. Taylor
For further volumes: http://www.springer.com/series/8572
Mario Negrello
Invariants of Behavior
Constancy and Variability in Neural Systems
Mario Negrello
Okinawa Institute of Science and Technology
Okinawa, Japan
[email protected]
ISBN 978-1-4419-8803-4
e-ISBN 978-1-4419-8804-1
DOI 10.1007/978-1-4419-8804-1
Springer New York Dordrecht Heidelberg London

Library of Congress Control Number: 2011923227

© Springer Science+Business Media, LLC 2011

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
Preface
As a cognitive scientist, I realize that in the study of the brain, challenges of an epistemological nature match those of an empirical nature. Issues have to be resolved on both fronts, and with matching impetus. On the empirical front, one has to deal with the complexities of measurement, system identification, and experimental paradigms. On the epistemological side, function has to be outlined, analyzed, localized, discussed, explained. The two sides must be alloyed if the explanatory summit is to be achieved.

And as a mechanical engineer, I find it difficult to dispel my natural tendency to see mechanism in everything. The meaning of the mechanism analogy in the study of organisms and brain function has to be put under adequate light to be serviceable. In the brain, mechanisms are dynamical processes, highly context dependent, that subserve an incredible breadth of organismic behavior. They are evanescent, but reliable. They are expressible through words and theories, but exhausted by neither.

I found no common denominator for my conflicting quandaries until I was introduced to cybernetics through the works of von Foerster, Braitenberg, Wiener, Pask, and Ashby, and more recently, Varela, Maturana, Pasemann, and Beer. So pervasive as to be invisible, cybernetics percolated through most of the sciences in the second half of the previous century, and is, to my mind, the only true method of achieving understanding amid the vast amount of knowledge that all sciences currently amass. More than once or twice, a powerful epiphany of mine could be retraced 60 years into the past, back to that magnificent set of revolutionary scientists who understood what it means to understand. It is to them that I pay dues where dues are due, and in all humbleness offer this book as a cybernetic contribution from a mechanical engineer, turned cognitive scientist, converted cyberneticist.

Some words about the format of the book are required. Skimming the table of contents, one will notice that I took the liberty of dividing it into two parts. Had I heeded the good advice "keep it short," the book would perhaps indeed have been shorter, more to the point, more concise, and would consist primarily of Part II. But then it would not serve a particular purpose I wanted it to serve, which was to make explicit the path towards conclusions. I wanted it to be a complete snapshot, or better, a long-exposure picture, of the considerations leading to the subsequent results. For that reason primarily, I have very intentionally laid emphasis on epistemology and
theoretical aspects of the problem, hence Part I. This discussion seems to me essential to frame both problems that can be fractally complex and conclusions that are strengthened when contextualized. I believe that the process of committing complex ideas to paper has paid off immensely, and I hope to be harvesting dividends of the extra effort in times to come. For that, I must beg to be excused for a certain independence, and perchance unevenness, in the treatment of topics, at times introductory, at times technical, at times controversial. For that reason also, the references are introduced in the individual chapters. If the reader will indulge me, I believe that he or she may come to appreciate my choice that more (not less) is more.

Parts of this book have appeared in print before, as journal articles (Chaps. 5 and 8), a book chapter (Chap. 7), and a newsletter article (Chap. 9). They have all undergone substantial revision in an attempt to deliver a more unified package. Finally, in the spirit of an open work, the book can also be said to be ongoing. The ideas in it, especially the role of invariance in the study of behavior, and constancy and variability in neural systems and behavior, are vast, unexplored, and, I feel, immensely rich. If, blinded by my excess enthusiasm, I have failed to offer the precision and depth the ideas deserve, as a way of apology I urge you to keep on listening. There is more to come.

Bonn, Germany
Mario Negrello
Acknowledgements
A mind's produce is contingent on other minds and circumstances. Thank you: Jackie Griego and Judith Degen, for jovially enduring my peculiar rants; Steffen Wischmann, Walter de Back, and Robert Maertin, mental brethren; Mark Hallam, worthy opponent and inspired reviser; Peter König, a training example; Hod Lipson, for hospitality; Christian Rempis and Keyan Zahedi, for friendly scientific software; Orlei Negrello, my first scientific collaborator; Thomas Christaller, for mindful interaction from Japan to Sankt Augustin; Frank Pasemann, for enthusing and entrusting me with ideas, with admiration.
Contents
Part I Invariants of Behavior

1 Introduction
   1.1 Variation and Constancy in Neurobiology and Behavior
   1.2 Argument Outline
      1.2.1 Part I: Invariants and Invariants of Behavior
      1.2.2 Models and Dynamical Systems
      1.2.3 Part II: The Organism and Its World
      1.2.4 Convergent Evolution of Behavioral Traits
      1.2.5 Modular Organization
      1.2.6 Concept Duals
   References

2 Invariances in Theory
   2.1 Invariants in Physics and Mathematics
      2.1.1 Invariants Are That Which Remains When Something Else Changes
      2.1.2 Theoretical Invariants: Speed of Light, Planck's Constant
      2.1.3 Theoretical and Relational Invariants: Energy
      2.1.4 Empirical and Relational Invariants: Law of Gases
      2.1.5 Context-Dependent Invariances: Sum of Internal Angles of the Triangle
      2.1.6 Invariants Prefigure Theories
   2.2 Invariants in Biology
      2.2.1 Biology Borrowed Invariants from Physics and Mathematics
      2.2.2 Invariants in Biology Are Context-Dependent
      2.2.3 Three Examples: Life, Form, and Behavior
      2.2.4 Ontogeny Needs the Environment
      2.2.5 Development and Biophysics
      2.2.6 Evolution Operates on Organisms, Not on Genes
      2.2.7 Genes as Invariants of Species Identity
      2.2.8 Invariants in Biology: Only Within Narrow Contextual Boundaries
      2.2.9 Genetic Triggers Do Not Build an Organism
   2.3 Invariants of Behavior
      2.3.1 What Is an Invariant of Behavior?
      2.3.2 Genes as Invariants of Behavior
      2.3.3 Genes Are Untenable as Invariants of Behavior, for They Are a Diluted Cause
      2.3.4 Neuroanatomy: Invariant Connections Between Architecture and Behavior
      2.3.5 Architectonic Invariances
      2.3.6 Some Features of Gross Anatomy Can Be Traced to Behaviors
      2.3.7 Connections Between Anatomy and Function Do Not Always Exist
      2.3.8 Neuroanatomical Variation and Constancy of Function
      2.3.9 One to Very Many Mappings from Anatomy to Function
      2.3.10 Requirements for an Invariant of Behavior
      2.3.11 Cybernetics, Reafference, and Sensorimotor Loops
      2.3.12 Schema Theory and Functional Overlays
      2.3.13 Behavioral Function and Invariants of Behavior
   2.4 Conclusion
   2.5 Summary
   References

3 Empirical Assessments of Invariance
   3.1 Empirical Assessments of Invariance
      3.1.1 Invariants in Neuroscience
      3.1.2 A Brief Typology of Function and Invariance in Neuroscience
   3.2 The Measurement of Neural Activity
      3.2.1 Whatever Is Seen, Is Seen Through Lenses
      3.2.2 Contentions on the Measurement of Brain and Behavioral Function
      3.2.3 Scope of Measurement Tools and Analysis Methods
      3.2.4 Oscillations and Potentials: EEG
      3.2.5 Function Localization: fMRI
      3.2.6 Brain Wiring: DTI and Diffusion Spectrum Imaging
      3.2.7 Electrophysiology: Single Unit Recordings
      3.2.8 Empirical Methods and the Language of Explanation
   3.3 Sources of Variation
      3.3.1 Instrumental Sources of Variation
      3.3.2 Repeatability and Variation in Different Levels
      3.3.3 Sources of Variation
   3.4 Conclusions
      3.4.1 Partial Pictures
      3.4.2 More Epistemological Contentions
   3.5 Summary
   References

4 Modeling and Invariance
   4.1 Invariance and Computational Models
      4.1.1 Shared Invariant Rules from Natural Patterns
      4.1.2 Dynamical Neural Patterns
   4.2 Computational Models of Neural Invariance
      4.2.1 Constancy and Variability in Modeling
      4.2.2 Models and Invariance in Computational Neuroscience
      4.2.3 Hebbian Plasticity
      4.2.4 Kohonen's Self-Organizing Maps
      4.2.5 Backpropagation: Gradient Descent Methods
      4.2.6 Hopfield Networks
      4.2.7 Decorrelation Principles
   4.3 Conclusion
      4.3.1 Invariants and the Structure of the World
      4.3.2 Constancy and Variability in Modeling
   References

5 Dynamical Systems and Convergence
   5.1 Introduction
      5.1.1 Previous Conclusions
      5.1.2 Dynamical Systems Theory as an Integrative Framework
      5.1.3 Outline
   5.2 Dynamical Systems Vocabulary
      5.2.1 Dynamical Systems Basic Terminology
      5.2.2 Coupled Dynamical Systems and Interfaces
   5.3 A Motor Act, as Coupled Dynamical Systems
      5.3.1 Integrating Levels
   5.4 Convergence in the Neuromuscular Junction
      5.4.1 Convergent Level Crossing
   5.5 Conclusion
      5.5.1 Convergence and Level Crossing
   5.6 Summary
   References

6 Neurons, Models, and Invariants
   6.1 Neurons to Models
      6.1.1 The History of the Models of the Action Potential
      6.1.2 The Hodgkin–Huxley Model Illustrates How Variability Converges to Constancy
      6.1.3 Categories Emerge
   6.2 Network Models
      6.2.1 Difficulties with the Hodgkin–Huxley Model
      6.2.2 A General Template To Build a Network Model
      6.2.3 Parameterizing Structure to Analyze Dynamics
      6.2.4 Properties of Units and Properties of the System
   6.3 Recurrent Neural Network Models
      6.3.1 Assumptions
      6.3.2 Discussion of Assumptions
      6.3.3 Fair Assumptions
   6.4 Conclusion
      6.4.1 The Staggering Complexity of the Neuron
      6.4.2 Neurons, Networks, and Organismic Behavior
   6.5 Addendum: Level Crossing in Models
   References

Part II Neurodynamics of Embodied Behavior

7 Neurodynamics and Evolutionary Robotics
   7.1 Crash Course on the Neurodynamics of Recurrent Neural Networks
      7.1.1 Varieties of Neurodynamics
      7.1.2 Definitions
      7.1.3 Neurodynamics and Attractors
   7.2 Evolutionary Robotics at a Glance
      7.2.1 Neurodynamics and Evolutionary Robotics
      7.2.2 From Evolution of Organisms to Evolutionary Algorithms
      7.2.3 Assumptions of an Evolutionary Robotics Problem
      7.2.4 Structural Evolution and Simulation
   7.3 Conclusion
   References

8 Attractor Landscapes and the Invariants of Behavior
   8.1 Introduction
      8.1.1 Outlook
      8.1.2 Definitions
   8.2 Toy Problem in Active Tracking
   8.3 Methods
      8.3.1 Problem Description
      8.3.2 Challenges for the Tracker
      8.3.3 Convergence and Motor Projections of Attractors
   8.4 Results
      8.4.1 Tracking Behavior Across Attractors
      8.4.2 Solutions
      8.4.3 Analysis of Dynamical Entities Generating Behavior
      8.4.4 Convergent Activity and Equivalence of Attractors
      8.4.5 Features of Evolved Attractor Landscapes
      8.4.6 Attractor Shapes and Action
   8.5 Discussion
      8.5.1 Structure–Attractor Landscape–Function
      8.5.2 Invariants of Behavior
      8.5.3 Explanations of Functional Behavior: Negative Feedback
      8.5.4 Behavioral Function Demands a Holistic Description
   8.6 Linking Section: Convergent Landscapes
      8.6.1 Direct Association Between Dynamics and Behavior
   Appendix I: Learning as Deforming Attractor Landscapes
   Appendix II: Related Work
   References

9 Convergent Evolution of Behavioral Function
   9.1 Convergent Evolution
      9.1.1 Outlook
   9.2 Preliminaries
      9.2.1 Is Evolution Gradual or Punctuated?
      9.2.2 The Moment of Invention of Function
      9.2.3 Neutral Mutation and Appearance of Function
      9.2.4 Convergent Evolution Controversy
   9.3 Evolutionary Phenomena in Evolution of Tracking and of Following
      9.3.1 Invariance Organized Through Orderings and Selection
      9.3.2 Gradual Improvement in Evolution of Simple Tracking
      9.3.3 Extensions to the Experiment on the Evolution of Tracking
   9.4 Discussion: Convergent Evolution and Instinct
      9.4.1 Convergent Evolution of Attractor Landscapes
      9.4.2 Instinct: Convergent Evolution of Functional Behavior
   9.5 Constancy and Variability in Structure and Function
      9.5.1 Aspects of Constancy
      9.5.2 Sources of Variability
   9.6 Summary
      9.6.1 Convergent Function and Invariants of Behavior
   9.7 Linking Section: Evolution and Modularity
      9.7.1 Functional Selection of Modular Structures and Dynamics
   References

10 Neural Communication: Messages Between Modules
   10.1 Introduction
      10.1.1 Neural Modularization
      10.1.2 The Meaning of Neurons, Modules, and Attractor Landscapes
      10.1.3 Outline
   10.2 Modularity
      10.2.1 Functional Semiotics of Neurons and Networks
      10.2.2 Anatomical Modularity
      10.2.3 Heuristics for Modularity
   10.3 Types of Modularity
      10.3.1 Vertical Modularity
      10.3.2 Monolithic Modules and Dynamic Modularity
   10.4 Equivalence, Variability, and Noise
      10.4.1 The Geometry of Dynamical Modularity
      10.4.2 Other Mechanisms of Dynamical Modularity
   10.5 Discussion: Modular Function and Attractor Landscapes
      10.5.1 Modules, Attractor Landscapes, and Meaning
   10.6 Summary
   References

11 Conclusion
   11.1 The Search for Mechanism Within a Meaningful World
      11.1.1 Contraptions, Analogies, and Explanation
      11.1.2 Empirical Invariances: Flickering Lights
   11.2 The Current Stage
   Reference

Index
List of Figures
2.1 Acetabularia and its life cycle
2.2 Genes and explanation of behavior
2.3 Invariance in the caterpillar neural system
2.4 A Braitenberg vehicle
2.5 Reafference in the visual system of the fly
2.6 A schema functional overlay
3.1 Brainbow and diffusion tensor imaging
3.2 Scales in electroencephalography
3.3 Brain activation as seen through functional magnetic resonance imaging lenses
3.4 Comparison of diffusion spectrum imaging tract structure of five human subjects
3.5 Causation in spikes
4.1 Belousov–Zhabotinsky reaction and Dictyostelium spirals
4.2 Varieties of action potentials
4.3 Hebbian learning and a model for asymmetric expansion of place fields
4.4 The rigged hill analogy of Hopfield networks
4.5 Decorrelation and the formation of place cells
5.1 The levels of a gesture as coupled dynamical systems
5.2 Convergence to muscle fibers implies equivalence of spike trains
6.1 Hodgkin–Huxley equations and agreement
6.2 Fractal nature of channel activity revealed by the patch clamp (from [22])
6.3 Crystal structure of the calcium-gated potassium ion channel
6.4 Dynamical analysis of the parameter space of the Hodgkin–Huxley model
6.5 The FitzHugh–Nagumo model
7.1 A recurrent neural network
7.2 Orbit, phase space, and time series
7.3 A period 5 periodic attractor
7.4 Basins of attraction
7.5 Bifurcation sequence of a two-neuron network
7.6 Landscape plots
7.7 The ENS3 algorithm for artificial evolution
8.1 Von Uexküll's Funktionskreis
8.2 Neurodynamics explanatory loop
8.3 Simulated environment and the agent
8.4 Stimuli sequence from a trial
8.5 Pattern-evoked functional attractors
8.6 Time series of an evolved period 4 attractor
8.7 Sequence of stimulus–action pairs
8.8 Coexisting attractors not functionally equivalent
8.9 Gravity matching linear increase of tracking velocity
8.10 Motor projections of attractor landscapes
8.11 Chaotic attractor
8.12 Morphing attractor
8.13 Sequence of morphing attractors
8.14 Equivalent attractors
8.15 Agent–environment diagrams
8.16 The coupling of the agent environment and neural systems
8.17 Dynamical schemata for behavior
9.1 Punctuated equilibria
9.2 Divergence and convergence in a ball-drawing experiment
9.3 Gradual improvement of tracking
9.4 Jagged gradual improvement of tracking
9.5 Convergence of attractor landscapes, first generations
9.6 Convergence of attractor landscapes, last generations
9.7 Five different agents
9.8 The Sinai experiment
9.9 The evolution of fitness in the Sinai experiment
9.10 The R2 experiment
9.11 R2 experiment
9.12 R2's holonomic arrangement of wheels
9.13 A flow diagram representing acquisition of function
10.1 Vertical modules
10.2 R2's retinal module and wheel module
10.3 Parameter modulation of the attractor landscape for the wheel module
10.4 R2 wheel module metatransient
10.5 Horizontal modularity
10.6 Modules outlined by a modularity finding algorithm
10.7 Equivalent orbits
10.8 Chaotic itinerancy and stimulus-forced reduction
10.9 Modular function
List of Tables
1.1 The prominent dichotomies and dilemmas in neuroscience and in the study of behavior as they relate to constancy and variability, with cross-references to the chapters where they appear
2.1 Invariants in physics and mathematics
4.1 Invariances in the structure of interaction with the world are represented in invariance in neural activity
7.1 Neuronal models and levels
7.2 Types of attractors encountered in maps
8.1 Simulation details of tracking experiment
9.1 Graph theory connectivity measures
9.2 Graphic measures of evolved networks
9.3 Sinai simulation details
Part I
Invariants of Behavior
The important things in the world appear as the invariants (or more generally [...] quantities with simple transformation properties) of [...] transformations.
Principles of Quantum Mechanics, 4th edition, Paul A.M. Dirac
Where there is symmetry, there is invariance.
Noether's Theorem
Symmetry is a characteristic of the human mind.
Letter to the Prince of Russia, Alexander Pushkin
Chapter 1
Introduction
Abstract The study of the brain and behavior is illuminated by the discovery of invariances. Experimental brain research uncovers constancies amidst variation, with respect to interventions and transformations prescribed by experimental paradigms. Place cells, mirror neurons, event-related potentials, and areas differentially active in fMRI all illustrate the pervasive role of invariances in neural systems in relation to their function. The introduction highlights the complementarity between constancy and variability in the study of neural systems, and outlines the elements of our plan to expose the mediating role of invariances between constancy and variability in behavior and neural systems.
1.1 Variation and Constancy in Neurobiology and Behavior

It may at first seem like a paradox that in neural systems massive variability may peacefully coexist with conspicuous constancy. Variability and constancy are present in substrate (what it is made of – material cause), structure (how it is organized – formal cause), and function (what it is for – final cause). For any identifiable dimension of a phenomenon one may fancy to quantify, be that in morphology or in behavior, one finds simultaneously spreads of measures and consistent means. In neural systems this interplay finds its most impressive expression.

In qualitative neurobiology it may seem as if, at any given scale, almost everything varies. Across species, within species, or among individuals, the size of brains and structures, the number of neurons, their dendritic arbors, the size of cells, the number of synapses, the spread of dendrites, and the length and the diameters of axons and myelin sheaths vary widely. And the list goes on indefinitely, from the macroscopic to the nanoscopic, from the nanosecond to the lifetime.

Such extreme variability notwithstanding, neuroscientists and other brain scientists are assuaged, each to various degrees (usually proportional to the proximity to the biological phenomenon itself), by the fact that despite variations more or less stable patterns condense – invariants – given enough observations. That is, after a choice of measurable dimensions has been assumed, constancy may be revealed from the analysis of observable data, often in the form of statistical characterization
of the spread of variables about their averages. So, in the neuronal variables just mentioned, there will be the "average size" of a brain, an "average density" of neurons, the "mean length of the dendritic arborization," and so forth. Each measurement is associated with its deviations.

Both constancy and variability in biology are reflected in measurements, and afford the first step in understanding the organism as a system composed of variables. Through quantification we get a sense of the complexity of the system under study. But quantification of single variables does little more than provide a sense of scale to the phenomenon. Measurable quantities in an organism are never insular; they represent parts that stand in relation to the whole. To paint a landscape picture of the system, the quantified variables must be integrated and interrelated. Constancy and variability of one variable must tell a tale about another variable for the picture of the workings of the system to be whole. It is the interplay of constancy and variability in measurement that hints at such connections. In short, scientists reason that what varies together belongs together, in explanation as in reality. This is invaluable as a first heuristic, and more than that, it is our most intuitive method for taking co-occurrences as representatives of causal connections. But this method becomes increasingly unreliable and cumbersome (1) the farther away variables are from each other and (2) the more variables are introduced.

Concerning the first point, caution is advised in cases of correlation where the variables are not apparently immediate, to avoid positing spurious relations. In that sense, if one can reliably connect the length of an axon with the transmission time of an action potential, a spike, this connection is quite close and we are entitled to establish a lawlike relation, expressed in mathematical terms, between axon length and propagation time. The case becomes more suspicious, however, when one tries to relate any measurable physiological characteristic of the system with behavior itself. Measurement alone does not reveal connections between levels that far apart. In this case we will need a theory, one that stitches levels and is able to explain the constancies and variabilities of behavior in terms of the constancies and variabilities of the nervous system. It is desirable for any theory of the organization of behavior of living systems to account for the many-leveled assembly, from atoms to behavior. That is no small feat (and indeed it is not clear that it is possible).

Here enters one of the most fascinating facts about organisms. Despite all the variation of organismic processes, variations at one level organize, through a large number of interactions, the next level. Regimented by physical principles, matter is constrained in its modes of interaction across scales. Through constraints and large numbers, matter self-organizes, life appears, behavior emerges. Self-organization principles seem to be the key to understanding the structure and function of nervous systems. Acknowledging this, however, does not simplify the problem. Any description of the workings of the nervous system cannot but be stultified by the myriad complex interactions that occur at all levels of the system. Recent research has shown an active nervous system as never before imagined. Dendritic spines move on the micrometer per second scale [2]. Synaptic terminals bulge and swell with the action potential [1].
Neurotransmitter released from vesicles crosses from the presynaptic to the postsynaptic cell in microseconds. For
any received spike, synapses are altered, and across the brain such events happen in the trillions per second, resulting in a brain that thrives with nonstop activity and change. And yet, the organism seems to act in a regular fashion. How can behavior, not to mention thought, be meaningful in any way amid this relentless agitation? How can variability of the small be an ingredient for constancy of the large? How can one theorize about the order within which components of the system will be put together at any one level, let alone at the system level? Is there a best level for the analysis of behavior? What are the contributions of the variation of individual events to the overall behavior of the organism? What is an instinct? Throughout this book, I will be addressing these questions.
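As a small concrete illustration of the statistical characterization and the "lawlike relation" between axon length and propagation time mentioned above, the following sketch uses entirely hypothetical numbers (the conduction velocity and the scatter are assumptions chosen for illustration, not measured data):

```python
import numpy as np

# Hypothetical data (illustration only): axon lengths and measured spike
# propagation times, with biological scatter around a lawlike relation.
rng = np.random.default_rng(42)
length_mm = rng.uniform(5, 120, size=200)          # axon lengths, mm
velocity_mm_per_ms = 10.0                          # assumed velocity (10 m/s)
delay_ms = length_mm / velocity_mm_per_ms + rng.normal(0, 0.5, 200)

# Constancy amid variation: means and spreads of the single variables...
print("mean delay %.2f ms, sd %.2f ms" % (delay_ms.mean(), delay_ms.std()))

# ...and the relation between them: what varies together belongs together.
slope, intercept = np.polyfit(length_mm, delay_ms, 1)
r = np.corrcoef(length_mm, delay_ms)[0, 1]
print("delay = %.3f ms/mm * length + %.2f ms (r = %.2f)" % (slope, intercept, r))
```

The single-variable averages give a sense of scale; it is the fitted relation between variables that begins to tell a tale about mechanism.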
1.2 Argument Outline

1.2.1 Part I: Invariants and Invariants of Behavior

The first chapter surveys the multifaceted roles that invariants play in theorizing, from physics and mathematics to biology and neurobiology. The question "What is an invariant of behavior?" is posed, and some alternatives are proposed and discussed: genes, neuroanatomy, and reflex theory. From that, the cybernetic take on the issue is introduced and placed in an evolutionary context, in which single behaviors are identified with respect to the goals they achieve and how they subserve the organism's viability. The search for invariants of behavior is framed as a search for mechanisms. This search is far from trivial, as assumptions play a prominent role.

The search for invariance is taken further with the analysis of empirical methods of behavioral invariance in the study of the mammalian brain. These methods are the bread and butter of the neuroscientist and cognitive neuroscientist: electrophysiology, functional magnetic resonance imaging, and diffusion tensor imaging all search for measurable invariances that may illuminate function and mechanism. This is not simple to do, as all methods have inherent particularities, both of an empirical and of an epistemological nature. A clear view on these issues is important, as they have a strong bearing on conclusions about the results of experiments and, by extension, on the mechanisms of behavior.

Further in the analysis of invariants and their explanatory roles, I present particular models of computational neuroscience that elucidate the interplay between constancy and variability and illustrate the appearance of invariances. These methods rely on the interplay between the structure of the input and rules of structural modification, in which a pliable structure organizes around the regularities in the input (see the sketch at the end of this subsection). In analogy, organisms learning and adapting to their bodies and environments also self-organize function.

Although powerful, both empirical assessments of invariances and models thereof present partial pictures. To assemble this mosaic, it is necessary to have a theory apt to cover the causal levels of behavioral phenomena. Dynamical systems theory is proposed as such a theory.
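To make the idea of a pliable structure organizing around regularities in its input more tangible, here is a minimal sketch, for illustration only (a Hebbian-style rule with decay, Oja's rule, rather than any of the specific models treated later):

```python
import numpy as np

# Illustrative sketch: a Hebbian-style rule with weight decay (Oja's rule)
# lets a single linear unit organize around the regularity (the dominant
# correlation direction) present in its input.

rng = np.random.default_rng(0)

# Hypothetical "structured" input: 2-D samples stretched along one direction.
samples = rng.normal(size=(5000, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

w = rng.normal(size=2)           # pliable structure: the unit's weights
eta = 0.01                       # learning rate

for x in samples:
    y = w @ x                    # unit activity
    w += eta * y * (x - y * w)   # Hebbian growth, stabilized by decay

print("learned direction:", w / np.linalg.norm(w))
# The weights align with the input's principal axis: structure in the input
# has been deposited as an invariant in the unit's connectivity.
```

The point is not the particular rule but the pattern: regularities in interaction end up conserved in structure, which is the sense in which such models illustrate the appearance of invariances.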
1.2.2 Models and Dynamical Systems

The theory of dynamical systems is presented as an integrative theory, from the nanoscale to the meter scale. Life in our world, and perhaps in our universe, is marshaled by physics, in time, in recurrent loops, where what happens next is what happened before plus how things work. The models abstract systems into state variables, rules, and parameters. Analysis of structure plus parameters can be done both analytically and computationally to address the invariances of a modeled system. Dynamical systems analysis reveals new facets of the phenomena that are modeled, and may effectively lead to powerful reductions.

The Hodgkin–Huxley model of the action potential is presented as a prototypical example of dynamical systems analysis applied to neurons. It illustrates several interesting aspects of action potential generation, for example, how constancy in action potentials appears despite variability in ion channel distribution. Effectively, dynamical systems modeling demonstrates how a phenomenon can be analyzed with respect to its parameters and structure, and helps resolve the dichotomy between constancy and variability by a mapping between parameter domains and dynamical varieties.

Furthermore, dynamical systems can be coupled to cover organismic phenomena on many scales, since the framework applies to an ample range of phenomena. Describing a system as coupled systems highlights interfaces. Interfaces are the locus of organized transitions, or level crossings, places where constancy and variability merge. At interfaces there is convergence and/or divergence, so the vicissitudes of variation coalesce in meaningful averages, whereas, conversely, constancy may drift and spread into variation. These transformations depend crucially on a number of processes regimented by particular types of physical interactions. Mutatis mutandis, by the same operating principles – large numbers and physical laws – means smear again into variation. In a functioning system, these two stances of constancy and variation complement and define each other.
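As a toy illustration of what "state variables, rules, and parameters" amount to in practice, the sketch below iterates a small discrete-time recurrent network (an assumed example for illustration, not the Hodgkin–Huxley model, and not the specific networks analyzed in Part II):

```python
import numpy as np

# Toy dynamical system: a discrete-time two-neuron recurrent network.
# State variables: the two activations. Rule: the update map below.
# Parameters: the weights W and biases theta.

W = np.array([[2.0, -4.0],
              [4.0,  0.5]])
theta = np.array([0.0, 0.0])

def step(x):
    return np.tanh(W @ x + theta)    # the "rule": how the state evolves

x = np.array([0.3, -0.1])            # an arbitrary initial condition
trajectory = [x]
for _ in range(200):
    x = step(x)
    trajectory.append(x)

# After a transient, the orbit settles onto an attractor of the map.
print(np.round(trajectory[-4:], 3))
```

Different initial conditions within the same basin of attraction produce different transients but the same asymptotic regime, which is one sense in which variability converges to constancy.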
1.2.3 Part II: The Organism and Its World

Add environment and body, and the equation is complete. Behavior is a joint product of the environment, the organism, and their laws, and I will show that complex behavior may very well be produced by the composition of loops of different path lengths: within the neural structure, the neural structure plus the body, and finally adding the environment. In Part II, I will show why the archerfish does not require knowledge of ballistics to be able to spit on a beetle meters away [3]. What this means will only become clear then, but the general idea is simple. The organism does not represent the environment; rather, it is fit to act. Organisms are not only fit to act, they also evolve to be fit to act. The archerfish is knowledge of ballistics.

Returning to the connection between variability and constancy in nervous systems, I will introduce a model showing how constancy in behavior may arise in evolution, despite variability in the nervous structures subserving behavior. For
most of this part, I will be making use of the tools of evolutionary robotics, combined with neurodynamical analysis of neural networks, which are introduced in Chap. 7. In Chap. 8, two conceptual entities will be introduced (attractor landscapes and metatransients) and employed in the analysis of behavior of agents. Building on these results, I will show what a possible path of evolution of function can look like (Chap. 9). Although a large variation appears in the structure of the recurrent neural networks evolved, there is convergence in the evolution of attractor landscapes, and this convergence appears across robot morphologies. The attractor landscape is taken as an exemplary invariant of behavior, a level of the dynamical implementation of function that represents how structural coupling leads to convergent evolution towards function.
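To give a schematic flavor of the evolutionary loop referred to here, the sketch below mutates network weights and keeps those that score best. It is a deliberately simplified illustration: the experiments in Part II use the ENS3 algorithm, which also evolves network structure, and a fitness based on tracking performance rather than the placeholder used here.

```python
import numpy as np

# Schematic evolutionary loop over recurrent network weights (illustration
# only). The fitness function is a stand-in, not the tracking fitness used
# in the experiments described in Part II.

rng = np.random.default_rng(1)
n = 4                                    # number of neurons (arbitrary)

def fitness(W):
    # Placeholder score: reward networks whose dynamics respond to a simple
    # time-varying drive without dying out.
    x = np.zeros(n)
    activity = 0.0
    for t in range(100):
        x = np.tanh(W @ x + np.sin(0.1 * t))   # toy sensory drive
        activity += np.abs(x).mean()
    return activity

population = [rng.normal(scale=0.5, size=(n, n)) for _ in range(20)]
for generation in range(50):
    scored = sorted(population, key=fitness, reverse=True)
    parents = scored[:5]                                   # selection
    population = [p + rng.normal(scale=0.1, size=(n, n))   # mutation
                  for p in parents for _ in range(4)]

best = max(population, key=fitness)
print("best fitness:", round(fitness(best), 2))
```

Selection acts on behavior (the fitness), while mutation acts on structure (the weights); what is conserved across successful lineages need not be the structure itself, which is exactly the gap the attractor landscape is meant to fill.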
1.2.4 Convergent Evolution of Behavioral Traits

To create viable beings, evolution builds on divergence given by mutation and convergence towards function. With mutation in structure, evolution divergently searches the space of possible designs, and environmental tests constrain the possibilities for solutions. Mutations rearrange basic components with inherent potentialities for the invention of behavioral function. Sometimes a mutation may lead to disorganization and collapse, whereas other times mutations are neutral. Between neutral and deleterious mutations are those that beget function and increased viability. The invention of behavioral function is often a punctuated event, whereas the development of function is a gradual process. I will show examples from an evolutionary robotics tracking experiment that exhibits both gradual improvement and punctuated functional invention.

The attractor landscape is a conceptual tool that shows how invariants appear as agents converge to function. In nature, we find analogies in behavioral function across phyla and taxa, which also exhibit analogous solutions. In reference to our results, analogies in form and behavior derive from the ideal implementations of function to which evolution may converge, of which the attractor landscape is an example. Convergent evolution towards behavioral function underlies the appearance of instincts and of analogous behavior in different organisms. Convergence and divergence also collaborate to resolve an old controversy about punctuated equilibria. In the interplay between the two, an answer can be given to Stephen Jay Gould's question of what would happen if the evolutionary tape were replayed.
1.2.5 Modular Organization

Finally, I will present an excursus into some implications concerning the modular organization of nervous systems. Behavioral function depends on useful distinctions gathered from interaction with the environment. When the capability of recognizing distinctions is acquired by an organism, it may be more or less compatible with
previously existing structures. Modularity exists in brains, I propose, because of this compatibility issue. If a new function is not compatible with previous structures, a further structure must be adjoined independently of previously existing ones. In this case we have the appearance of a functional module. The more independent the functions, the more modularized the structure.

Modularity analysis is a crucial analytical heuristic for the study of neural systems. There are, however, epistemological difficulties with the identification and attribution of function to a module. Because most exchange in the brain is carried in the common currency of neural activity (action potentials, neurotransmitters, etc.), the issue of modularity is entangled. A fresh view on this issue is presented, taking a semiotic perspective. Neurons and neural modules are semiotic devices: what they mean is what they are about. This view suggests that the exchange of activity between modules should be conceived as the interpretation of messages. It leads to useful conceptions about what a module may send, what it receives, and how it contextualizes information in its own terms. Activity exchanges in neural systems do not inherently carry meaning; messages arise in the context of the receiver, and meaning is produced locally. Attractor landscapes function as a model for meaning generation in modular systems. They are a conceptual shorthand clarifying the dichotomy between noise and variation, on the one hand, and functional behavior, on the other.
1.2.6 Concept Duals

This thesis builds on the resolution of dual and complementary concepts in the themes of variability and constancy. As these duals coalesce into a unified picture, solutions for prima facie paradoxes surface. Table 1.1 presents a concept map of where these dual concepts are handled.
Table 1.1 The prominent dichotomies and dilemmas in neuroscience and in the study of behavior as they relate to constancy and variability, with cross-references to the chapters where they appear

Duals (concept guide): Variability | Constancy | Chapter
Transformations | Invariances | 2, 3
One to many | Many to one | 2, 5, 6, 8
Divergence | Convergence | 5, 6, 8, 9
Parameters | Varieties | 6, 7, 8
Rate coding | Temporal coding | 5, 8
Complexity | Simplicity | 3, 4, 5, 6
Transients | Attractors | 7, 8, 10
Discontinuous | Gradual | 5, 7, 9
Variability | Equivalence | 8, 10
Generality | Specificity | 10
Noise | Messages | 10
Chapter 2
Invariances in Theory
Abstract This chapter surveys the multifaceted roles that invariants play in theorizing, from physics and mathematics to biology and neurobiology. The question “What is an invariant of behavior?” is posed, and some alternatives are proposed and discussed: genes, neuroanatomy, and reflex theory. From there, the cybernetic take on the issue is introduced and placed in an evolutionary context, in which single behaviors are identified with respect to the goals they achieve and to how they subserve the organism’s viability. The search for invariants of behavior is framed as a search for mechanisms. This search is far from trivial, as assumptions play a prominent role.
2.1 Invariants in Physics and Mathematics 2.1.1 Invariants Are That Which Remains When Something Else Changes Invariants are usually thought of as conserved quantities, relational structures, constant qualities or functions; in essence, invariants are properties of a system that are conserved irrespective of some manipulation. They are something describable about the system, often enough abstract properties existing nowhere but in the language of theory. But despite this dubious ontological status, invariants are invaluable as well as fundamental to theories. The discovery of an invariant may reveal connections between parts of a theory previously thought unrelated. When something is conserved while the system is manipulated, that something becomes a latch around which the other variables of the system change. It becomes a source of explanatory power, equating the other variables in its terms.
2.1.2 Theoretical Invariants: Speed of Light, Planck’s Constant Without invariants both modern physics and mathematics are scarcely conceivable. Most appropriately the term originates from mathematical disciplines. Examples of
explanatory invariants are profuse and pervasive. Heisenberg’s uncertainty principle draws a relation between our knowledge of a particle – how fast it moves, and where it is. The product of the two uncertainties is bounded from below by a number involving Planck’s constant, which caps the amount of knowledge we may extract in an experiment. Planck’s constant is an invariant in quantum mechanics that reappears in the resolution of physical paradoxes, such as the radiation of a black body. It gives a stance of solidity even in uncertainty. Another example of a theory anchored in assuming invariants is Einstein’s special theory of relativity. It is based on the axiom that the speed of light in a vacuum with respect to an inertial reference frame is always constant, a theoretical invariant. In the test of this assumption many physical puzzles found resolution, e.g., the null result of the Michelson–Morley experiment. Incidentally, Einstein eventually regretted having called the theory Relativitätstheorie, later preferring Invariantentheorie. Both the speed of light and Planck’s constant are theoretical invariants; no manipulation affects the conserved quantity, the former determined by postulate, the latter by deduction.1
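For concreteness, the two invariants just mentioned can be written out; these are standard textbook statements added here for illustration, not equations quoted from this text:

\[
\Delta x \, \Delta p \;\ge\; \frac{\hbar}{2}, \qquad \hbar = \frac{h}{2\pi}, \quad h \approx 6.626 \times 10^{-34}\ \mathrm{J\,s}, \qquad c = 299\,792\,458\ \mathrm{m/s}.
\]

No manipulation of the experimental setup pushes the product of the two uncertainties below the bound, and no choice of inertial frame changes c; that immunity to manipulation is what qualifies both as theoretical invariants.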
2.1.3 Theoretical and Relational Invariants: Energy A giant leap in physical science followed the inception of the concept of the energy of a system (Leibniz was the first to propose a formulation), an abstract quantity that is conserved through transformations of the components of a physical system. As an object falls from some height, its increasing velocity results from the transformation of potential energy into kinetic energy. As the object hits the ground, it produces sound, heat, and plastic alterations of the object, all of which are expressible in terms of energy. Through the principle of conservation of energy we know that the heat, sound, and shape deformations produced all relate to the same quantity, which remains unaltered throughout the process. At any given time, the total amount of energy of a closed system is constant, and the invariant relates all components and their states within the system. Energy is the common currency across transformations, and is invariant for a closed system. If the system is open, i.e., other components are injected or removed, we know that the energy of the new state will increase or decrease by an exact amount, representing the energy fluxes in and out.
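A minimal worked example of this bookkeeping, with illustrative numbers that are not from the text: for an object of mass m dropped from height h, conservation of energy equates the potential energy lost with the kinetic energy gained,

\[
m g h = \tfrac{1}{2} m v^{2} \;\Rightarrow\; v = \sqrt{2 g h}; \qquad h = 5\ \mathrm{m},\ g \approx 9.8\ \mathrm{m/s^{2}} \;\Rightarrow\; v \approx 9.9\ \mathrm{m/s}.
\]

The mass cancels, and after impact the same total reappears, redistributed among sound, heat, and deformation.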
2.1.4 Empirical and Relational Invariants: Law of Gases Other examples of invariants relate quantities, such as the Boyle–Mariotte law in thermodynamics, which states that, at fixed temperature, the product of the pressure and volume of an ideal gas is a constant. The gas law was found experimentally; it is not absolute. An ideal
1 Other purely theoretical examples exist, such as the entities in the study of topology. For example, a torus – a donut – will always have one hole irrespective of deformations, because its shape identity is defined by its topological properties. The invariant property is a part of the entity’s definition.
gas is an abstraction; there are variations dependent on particular gases. But all gases follow the rule, within some tolerance bounds, that at constant temperature pressure and volume are inversely proportional. Similarly, the number of molecules in a given volume, at fixed temperature and pressure, is constant. These quantities are related by constants, first encountered in experiments, then derived analytically (among others, by Einstein [49]). Avogadro’s number specifies how volume relates to the number of molecules in a system. Like Boyle’s law, Avogadro’s number closely follows reality; it too is an idealization conjoining empirical measurements. It holds exactly only for ideal gases, which do not exist, but serves as a template for all real gases. Empirical invariants are constants and proportions and may indicate general principles underlying theories. Another example fitting this category is the thermal expansion coefficient, which also fits the next category. Both the thermal expansion coefficient and the gas constant are context-dependent, as both depend on the material or gas.
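For reference, the relations discussed above in their usual textbook form (added here for illustration, not quoted from this text):

\[
pV = \mathrm{const.}\ (\text{fixed } T, n), \qquad pV = nRT, \qquad N_A \approx 6.022 \times 10^{23}\ \mathrm{mol^{-1}}, \qquad R \approx 8.314\ \mathrm{J\,mol^{-1}\,K^{-1}}.
\]

The first is the Boyle–Mariotte rule; the second conjoins it with the laws of Charles and Avogadro into the ideal gas law, with the gas constant R and Avogadro’s number N_A as the empirical, relational invariants.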
2.1.5 Context-Dependent Invariances: Sum of Internal Angles of the Triangle Some invariants are absolute, such as Planck’s constant: no conceivable manipulation affects the quantity bound by it. However, this is the exception; most invariants are context-dependent, requiring a ceteris paribus condition: that everything else should be kept constant. Some invariants hold only within a frame or a context (e.g., the speed of light in a vacuum). Even in mathematics there are examples of context-dependent invariants (Table 2.1). In Euclidean geometry the sum of the internal angles of a triangle is an invariant number (180°). When Euclid’s parallel postulate is removed, the sum of internal angles ceases to be an invariant. This is a simple example of how invariants are dependent on the theories that define them, even in mathematics. Likewise, Planck’s constant is only valid within quantum mechanics, becoming meaningless when interpreted within, for example, Newtonian mechanics. The invariants of different theories may be incompatible, irreducible to each other, or even devoid of meaning.
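A compact way to see this context dependence (a standard result, added here for illustration): on a sphere of radius R, a triangle enclosing area A has angle sum

\[
\alpha + \beta + \gamma = \pi + \frac{A}{R^{2}},
\]

so the Euclidean value of 180° is recovered only in the flat limit R → ∞. The invariant holds inside the frame that defines it and nowhere else.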
Table 2.1 Some examples of invariants in physics and mathematics

Example | Characteristics
Speed of light | Theoretical, postulate
Planck’s constant | Theoretical, formally deduced
Avogadro’s number | Empirical, relational
Thermal expansion coefficient | Empirical, idealization
Energy | Theoretical, relational, idealization
Gas constant | Empirical, relational
Sum of internal angles of a triangle | Theoretical, context-dependent
2.1.6 Invariants Prefigure Theories Hence, invariants are bound by the assumptions of the theories that define them. Invariances are describable properties of a system that putatively remain unaltered irrespective of some class of manipulations, perturbations, or interventions in the system. Both in physics and in mathematics, invariants are the hallmarks of constancy, and often the axiomatic basis of a system of explanation, as in Einstein’s theory of relativity. This has also been phrased as a theorem stating that whenever there is a symmetry, a quantity is conserved [44]. When there is symmetry, variables are coupled, and their interdependence begets predictability and explanation. The theorem finds support in a vast number of examples in physics. Invariants prefigure theories, and often theories are born when invariants are discovered.2
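In the Lagrangian formulation the connection between symmetry and conservation is almost immediate; the following sketch is a standard illustration, not part of the original text. If the Lagrangian does not depend on a coordinate q (a symmetry under shifts of q), the Euler–Lagrange equation gives

\[
\frac{d}{dt}\frac{\partial L}{\partial \dot q} = \frac{\partial L}{\partial q} = 0 \quad\Longrightarrow\quad p \equiv \frac{\partial L}{\partial \dot q} = \mathrm{const.}
\]

Translation symmetry thus yields conservation of momentum and, by the same logic, time-translation symmetry yields conservation of energy.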
2.2 Invariants in Biology 2.2.1 Biology Borrowed Invariants from Physics and Mathematics How about biology? Modern biology, motivated by the successes of physics, has also made extensive use of invariants, more or less explicitly. In biological theories, invariants are as pervasive as physical principles, if less persuasive, because they are less general. In biology, nearly everything seems to be in constant change, contexts vary fluidly, and boundaries are not well defined: there are few absolutes.
2.2.2 Invariants in Biology Are Context-Dependent But there are some constancies that, although not absolute, subsist healthily within particular contexts given that some conditions for existence are met. Through them order can be retrieved. Relational invariants figure prominently in models of systems biology, of which there are several, built around high-level concepts such as order, organization, hierarchy, and structure. Explanations in biology, and particularly in systems biology, have gravitated towards concepts such as balance, equilibrium, and exchange. Many models have been accordingly proposed. Relational invariants in biology produce cycles, equilibrating processes, oscillatory reactions, and the like. Numerous biological models deal with networks of processes assessed for their stabilities and asymptotic states. As examples we have homeostasis (Ashby) [4], the hypercycle (Eigen and Schuster) [18], autopoiesis (Maturana and Varela) [63], and gene regulatory networks [37]. In these cases, constituents of cyclic processes are
Noether’s theorem informally stated: differentiable symmetries generated by local actions have correspondent conserved current. An English translation of the original paper is found in [45].
causally interrelated, where this begets that, begets that, and so on, in a systematized cycle of causes. One studies the bounds of operation of the networks, in terms of stability and in terms of the networks of interactions between constituents. The constancies resulting from networks of coupled processes are akin to self-organization, the idea that some arrangements of matter are endowed with certain equilibrating potentialities. Because of the necessity of assuming particular contexts and outlining a subpart of the phenomenon, most invariants in biology have a more tentative, hypothetical nature. Many are pure abstractions, such as the hypercycle or homeostasis. The further biology moves from physics and chemistry, the flimsier and the more context-dependent its invariants become. They lose the status of principles and axioms that they enjoy in physics, and become conceptual ancillaries for characterizing the boundaries and identity of a biological phenomenon. Therefore, to employ invariants in biology, more than in physics, one is committed to a keen understanding of contextual influences, boundaries, and ceteris paribus assumptions. In biology, invariants dwell on very particular levels of description/analysis/explanation. Somewhat paradoxically, invariants in biology are most useful when contextual (and conceptual) boundaries are clearly discernible.
2.2.3 Three Examples: Life, Form, and Behavior To illustrate this, I will discuss three examples of the search for invariants in biology: life, form, and behavior. Each example will bring in an important set of considerations, historical, logical, epistemological, and ontological. From those we derive conclusions that inform and specify the search for invariants of behavior, which shall occupy us for most of the rest of this book.
2.2.3.1 Life, as an Invariant of the Living, Is a Tautology Some theories assume abstract invariant properties coextensive with what is to be explained. In this case, they become dry tautologies, and fail to provide an explanation. Such is the case, for instance, of life as an invariant property of the living.3 Tautologies, unless they can be framed as an axiomatic basis of a theoretical system, are dry in explanation. Difficult questions such as what is the fundamental property of the living led to many misconceptions, such as the supplanted ideas of élan vital (Bergson) and entelechies (Driesch) [38]; in trying to distill the general property of life, too dilute a concept evaporates. This happens because the abstract property
3 Coextensive: If P is an invariant property of a system, then for all members of set A, it is true that A has P. In the example of life, if all the living A have life (property L), then ∀A: L(A) → L(A), a tautology. An explanatory principle such as this has a vaporous foundation, and no logical derivations are possible. Nothing can be learned; life becomes magic.
(life) has too many instantiations (organisms); taken individually, each instantiation is fundamentally different, and therefore life as a property can refer to all its instantiations only in a highly abstracted form. Attempting such sweeping levels of generality is disappointingly unproductive. The trouble with excess generalization in the search for invariants is that the result often becomes a self-referential tautology, devoid of predictive power, with which one can have no hope of constructing a powerful theory.
2.2.3.2 Autopoiesis: An Invariant of Living Things Tautologies will be explanatory when they can be made pillars of axiomatic theories, when they are self-evident truths at the foundation of explanations. If the tautology is not an end in itself, but may be composed of other tautologies, it can become a principle of a system of explanation. In autopoiesis, Maturana and Varela’s theory of the living, the abstract invariant property “living,” being defined only loosely, is translated into more explanatory relations that take the actual organization of an organism into consideration. Maturana and Varela propose that living should be defined in terms of cyclical processes: processes producing more processes like the originals exhibit the hallmark of the living [61–63]. It is this smart move that replaces the idea that life is something independent of the organisms. Life becomes the organization of processes which maintain themselves by recurrently producing themselves. This is the abstract property that becomes the explanatory invariant. In studying life then, instead of a mystical search for some extra factor that confers life, one focuses the search on the kinds of processes that confer autopoietic properties to the living. Consequently, the study of the living becomes the search for processes that maintain themselves metastably under perturbations. This is a valuable reduction, for it frames the approach at large, and determines the further set of questions to be addressed.
2.2.3.3 Genes as Invariants of Form Many successful biological theories are founded upon invariances. Ever since Linnaeus, and perhaps Aristotle, it is a set of morphological traits of a particular species that defines its identity. It was clear that there must have been some reason for the invariance of morphogenesis, although the reason itself was unclear. The whole exercise of taxonomy, despite its pitfalls, strives for the most explanatory level of morphological invariance: those properties that, congregated, define a species identity [39]. The discovery that the mechanism of inheritance is a crystal lodged in all the cells of an organism (DNA) infused biology with optimism. There was finally a reason for the constancy of form and behavior. The whole endeavor of molecular phylogeny results from the realization that genomes can be taken as invariants of species. But for form and behavior, the story was not complete until the cell and the environment had been brought back into the morphogenesis equation.
2.2.3.4 Developmental Systems Theory Critique: Environment as a Determinant of Form The idea of the gene as the quintessential invariant of ontogeny has recently been called into question by researchers [17, 22, 24, 27, 33, 34, 47, 54].4 Developmental systems theory has emphasized the role of reciprocal interactions with the environment in ontogeny, in an attempt to make context an integral part of explanation. Modern theories emphasize the role of developmental processes such as biochemical triggers, protein production dependent on the cellular environment, energy consumption, intracellular and extracellular transport, and consumable materials. All of these help scaffold a form. If the genes are regarded as invariants of form, they are so only within very complex contextual frames.
2.2.4 Ontogeny Needs the Environment The morphogenesis of some social insects, such as ants and termites, is a striking example of the contextual dependency of form. It turns out that individuals with the very same genetic material, clones in every respect, will develop strikingly different morphologies and roles depending on the chemical context of the nest chamber in which they are born. And those morphological differences will be further expressed in the individual’s role in the colony, some becoming queens, some soldiers, some workers. Although not quite dethroning the gene from its prominent role in the generation of identity, developmental systems theory enforces contextual thinking in what may at first seem the stablest stance of invariance in biology. At the least, it becomes obvious that genes are only coresponsible for morphology and function. Because they are contextual, genes are, at most, contingent invariants of form. In the extreme case, there are examples showing that even complex forms may be generated by a single cell utterly devoid of genetic material, purely scaffolded by physicochemical processes [24].
2.2.5 Development and Biophysics The biophysical category of explanatory principles includes examples such as adhesion, viscosity, permeability, rigidity, and elasticity, and in general the relational forces that generate the potential for rearrangements and self-organization, and by extension for the organization of constancies. Many of these properties are only remotely connected to genetic prescription. Brian Goodwin [24] describes the interesting case of the unicellular macroscopic alga Acetabularia, which despite being only one cell
4 See also [15] for an insightful book review on Oyama’s cycles of contingency.
Fig. 2.1 Acetabularia and its life cycle
has complex form, resembling an umbrella (Fig. 2.1). Owing to the complexity of its morphology, one would initially be inclined to attribute that complexity to genetic assembly. But a simple experiment precludes that conclusion. If the alga is cut at its stem, the nucleus of the cell remains at the bottom, presumably containing all the genetic material with the information about the shape of the organism. There are now two parts, one with the nucleus at the bottom and no umbrella, and the other with no nucleus but with the umbrella. The surprising fact is that both parts regenerate a fully fledged version of the umbrella, irrespective of the presence of the nucleus. That is, the severed umbrella grows a second umbrella at the severed side. So, the production of the material and the conditions that grow it are independent of the nucleus and its genetic material. Brière and Goodwin [25] presented a model of the scaffolding of Acetabularia’s tip (the umbrella), where the assumptions are the physical properties of the cell’s substrate and time, excluding any further role for genetic production of material. This example illustrates well how complex form can originate solely from the physics of cell processes, even without gene transcription.
2.2.6 Evolution Operates on Organisms, Not on Genes The appearance of an invariant trait or feature in an organism is not a function of genes alone. So, instead of taking genes as absolute invariants of form, it is perhaps more adequate to frame them as invariants of cellular processes, as in gene regulatory networks, because at this level we are much more competent in defining a process in terms of components, dimensions of interdependence, and susceptibility to contextual influences. It is more parsimonious to keep genes within a proximal sphere of influence. For the same reason, it is a category mistake (sensu Gilbert Ryle) to say that evolution operates on genes [14], an idea that finds much receptivity in the
scientific community. Evolution is dependent on inheritance mechanisms; whether these are genetic or not is, in principle, irrelevant. Certainly, the gene pool correlates with evolutionary history, and has the potential of giving insight into the history of certain strains of beings. But it is a muddled conclusion that evolution operates on genes. Evolution operates on the level of the organism as a whole.
2.2.7 Genes as Invariants of Species Identity Arguably, genes are at their best in molecular phylogeny, the search for the branches of phyla across time. By comparing genetic data from fossils with those of modern organisms, similarities across genes may help reconstruct the tree of life, tracing back mutations and thereby finding branches that belong together. To an extent, genes are defensible as invariants of phylogenetic identity. Even so, there are many examples where the evolution of adapted traits is essentially regulated by the environment, such as the adaptive shape and color of some seashells [26]. Moreover, there are many mechanisms of epigenetic inheritance, such as chromatin marking, which are only now being demonstrated [33]. Inheritance is about any transmitted trait; if cellular material transmitted to progeny determines features of the progeny, then there is epigenetic inheritance. Owing to these other forms of inheritance, molecular phylogeny is also susceptible to misclassifications, and genes are not infallible invariants of species identity.
2.2.8 Invariants in Biology: Only Within Narrow Contextual Boundaries As we proceed in our explanation, it turns out that most candidates for invariants in biology follow this general rule: invariants in biology, if defensible, are so only within narrow conceptual boundaries. Genes are the paradigmatic example. To say that the gene is an invariant of the form of an organism is to exclude too many formative levels. Genes are indeed crucial invariants of development and behavior, but they are not almighty. By forcefully excluding the cell and the environment and their multifarious influences on the development of stable forms, the scope of explanation shrinks. In biology, it is counterproductive to try to attribute too much to one element of explanation. For the same reason, although genes and form do covary in certain experimental paradigms, it is a mistake to attribute form solely to genes.
2.2.9 Genetic Triggers Do Not Build an Organism So, genes do not hold the position of absolute invariants of form, as the gene’s causal prowess dilutes in development when the environment (also cellular) is taken into the morphogenesis equation. Molecular biology is at a stage where it can often
say what a gene triggers, but it remains silent and circumspect about how. Molecular biology errs if it confuses causes with triggers, because although there are many ways to affect some sort of function – by knocking out a gene, for example – there are fewer ways to build one. To explain how something is built, it does not suffice to say which gene triggers what; rather, the story must be spelled out throughout the developmental cascade, up until the generation of a full organism. The organism is not made up of a set of genetic triggers, because triggers are not stuff [28, 55].
2.3 Invariants of Behavior 2.3.1 What Is an Invariant of Behavior? Following on the definition of invariants, an invariant of behavior is that which remains constant with respect to an identified behavior. But what is it, and more crucially, is there such a thing? What is invariant when conspecifics execute similar motor behaviors? Further, will homologous behavior across phyla have similar invariances? Before these questions may even be asked, we must endure a demanding assumption: that it is possible to define functional behaviors.5 Start out with behaviors instinctive or stereotypical, preferably of simple organisms, and search for what may be the invariant concomitants of those. Think of a fly rubbing its hands, or a fish masticating, or a dog swallowing; simple behaviors. Ask what is invariant in those. At first glance, for the simple behaviors there are several candidates, of which three are discussed: genes, neuroanatomy, and reflexes.
2.3.2 Genes as Invariants of Behavior 2.3.3 Genes Are Untenable as Invariants of Behavior, for They Are a Diluted Cause In Sect. 2.2.6, we discussed the role of the genome as an invariant of form. We found that although genes are certainly fundamental for phylogeny and inheritance, they are insufficient as invariants of form. Cellular processes underlying morphogenesis exist which are independent of the genes. Hence, the environment (also cellular) has a formative and causal role in the creation of a form, where not all processes are directly caused by the genes. The creation of a living organism is not merely a sequence of genetic triggers, but a complex scaffolding process of interdependences. The same argument can be extended to include behavior. Enthusiastic with
5 Our working definition of behavioral function: when a goal can be attributed to a behavior, then the behavior is a functional behavior. Contrast that with behavior in toto.
the stability of genes, some have argued that the genome carries information, not only about species identity, but also a full program for the creation of a form, complete with modes of behavior. At this point, however, we address a matter of scope. Genes certainly have a crucial role in the development of the structure of an organism, and the structure in turn will produce, in interaction with the environment, behavior. However, genes as explanatory entities fail to give proper answers to the “how” question. Genes can be associated with “which” behavior or form, but they alone cannot answer “how.” To say that genes are invariants of behavior is therefore an unwarranted exaggeration, one that simply vaults over too many levels at a time. A simple example will clarify the claim that genes are not the best level of analysis for behavior. Some species of fish are born without eyes. Genes have recently been found that trigger the formation of the eyes [43, 50]. If the eye-controlling motor neurons are present, then the fish uses them; so the presence of eyes modifies behavior. But the changes in behavior are better explained in terms of the newly acquired eyes than in terms of the gene that triggered the production of eyes. Behavior cannot be explained as a set of genetic triggers, for the genes do not see; the organism does. The form of an organism and its modes of interaction with the environment are more immediate to behavior than genes. To be sure, the structure is a consequence of development, genetically guided, so through circuitous routes one may attribute behavior to genes. But this is like saying that the Sarajevo assassination caused the First World War. Although it may have triggered it, all the conjuncture that made it a propitious trigger was given at a much larger scale of interdependences. Triggers are descriptions of linchpin transitions, which require quite a bit of context to be activated. A set of genetic triggers does not explain a developmental process, just as an organism is not a set of triggers. Therefore, to pin a gene as responsible for a behavior is to give too summarized a story about the behavior. Important as they are in their own right, it is a hopeless game to exhaustively list all the genetic triggers of behavior. Invariances of behavior are to be found in the engaged interaction between the being and the world (see Fig. 2.2).
2.3.4 Neuroanatomy: Invariant Connections Between Architecture and Behavior That there must be a connection between neuroanatomy and behavior is nowadays beyond question (in the time of Aristotle it was not so, since he thought that the brain had a cooling function). Organisms of the same species with roughly the same behavioral traits also have, at certain scales, similar neuroanatomies. So, there are architectural characteristics that correlate with function and behavior. Occasionally, conspicuous features of similar neuroanatomies can be traced to function. However, this is the exception rather than the rule: although there are many distinguishable architectonic features, they do not always have obvious relations to function.
Fig. 2.2 Genes and explanation of behavior. CNS central nervous system, Env. environment. The diagram traces genes, the cell, and the cellular environment through development and morphogenesis to the organism and its CNS, and from the organism’s coupling with the environment to behavior. The in-figure notes read: Genes are not solely responsible for morphogenesis and development. The structure of an organism appears together with the environment and the biochemical contexts of the cell. Genes help determine behavior inasmuch as they help determine the structure of an organism. Behavior appears as the organism couples with the environment, expressing the potentialities of behavior. Consider the question: how does gene A produce behavior B? Usually, the behavior–gene expert will not be able to answer. What he does instead is to say: if we impair gene A, behavior B is not produced. Consequently, the gene is a better level for the analysis of processes leading to the development of structure than of those leading to behavior. For behavior, its causal powers become diffuse as levels are crossed. Genes participate in development, as do the cell and the environment. In that, they help determine the structure of the organism, but they do not determine the interactions with the environment that ultimately become behavior.
2.3.5 Architectonic Invariances At first glance, some levels of neuroanatomical organization are distinguishable on the basis of constancy of architecture. Examples of such invariants are abundant: the organization of cortical layers in mammals, macroscopic morphological features of the brain (gyri, sulci), microscopic circuitry, the proportionality between neuronal density and the inverse of the cubic root of brain volume, or that between the square root of brain weight and the cubic root of body weight [10]. As Valentino
Braitenberg writes, “Very likely these quantitative relations reflect some general principles of the architecture of neuronal networks.” When quantified, many features of neuronal circuitry will display invariance in a statistical sense, such as average length of dendritic spines, average number of synapses in a given cortical area, and a long list of other features [6, 11, 19, 20]. The number of distinguishable levels of neural organization correlates with the complexity of the organism. Nervous systems of simpler organisms may have wiring that looks almost fully prespecified, as the wiring of insect ommatidia [32] (the fly is also a neat example, especially in the plates in [8]), whereas in complex organisms the degree of variation is much greater. Still, for both, there are gross6 levels of organization that appear highly patterned and repeatable (as in Fig. 2.3). As Braitenberg writes, this gives clues that, at some level, there must be some invariance of neuroanatomy responsible for constancy in behaviors.
Fig. 2.3 Invariance at the coarse level of architecture in the nervous system of the caterpillar. The brain is marked with a. (From Swammerdam [57])
6 Gross in the sense of coarse, at first glance.
2.3.6 Some Features of Gross Anatomy Can Be Traced to Behaviors As an example, Braitenberg [9], a neuroscientist by training, derived from neuroanatomical observations a model of behavior now known as the Braitenberg vehicle, which abstracts salient (gross) features of neural architecture, such as contralaterality or ipsilaterality, to explain simple functional behaviors, such as following or avoiding a light (see Fig. 2.4). This is a simple and powerful instance of how simple features of architecture can be taken as invariants of behavior (provided, of course, there are a body and an environment). Similarly, simpler organisms with only a few neurons (in the hundreds) may have almost identical wiring, to the extent that neuroscientists can find unique identifiers for every single neuron. At times, single neurons will be responsible for particular behaviors, such as escape responses or simple reflexive behaviors, in which case a simple analysis of connectivity (sometimes of one neuron, as in the zebrafish [58]) explains constancy in behavior. But there is no general method that relates architecture to function. Cortical layers, despite their similarity in construction, are often associated with very distinct functionality.
Fig. 2.4 A Braitenberg vehicle with phototropism. The contralateral connections establish negative feedback producing light-following behavior. Colors indicate contralateral pathways
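To make the contralateral wiring of Fig. 2.4 concrete, here is a minimal simulation sketch. It is an illustration of the idea, not code from the book; the sensor geometry, gains, and the inverse-square light model are arbitrary choices made for the example.

```python
import math

def intensity(px, py, light):
    """Illustrative light model: brightness saturates near the source and
    falls off with the square of the distance."""
    d2 = (px - light[0]) ** 2 + (py - light[1]) ** 2
    return 1.0 / (1.0 + d2)

def step(state, light, dt=0.1, base=0.1, gain=1.0, wheelbase=0.3, mount=0.5):
    """One sensorimotor update of a two-sensor, two-wheel vehicle."""
    x, y, th = state
    # Two light sensors mounted on the front of the body, to the left and right.
    sl = intensity(x + mount * math.cos(th + 1.0), y + mount * math.sin(th + 1.0), light)
    sr = intensity(x + mount * math.cos(th - 1.0), y + mount * math.sin(th - 1.0), light)
    # Contralateral excitatory wiring: the left sensor drives the right wheel and
    # vice versa, so the wheel opposite the brighter side runs faster and the
    # vehicle turns toward the light.
    vl = base + gain * sr
    vr = base + gain * sl
    v = 0.5 * (vl + vr)            # forward speed of the differential drive
    w = (vr - vl) / wheelbase      # turning rate from the wheel-speed difference
    return (x + v * math.cos(th) * dt,
            y + v * math.sin(th) * dt,
            th + w * dt)

if __name__ == "__main__":
    light = (0.0, 0.0)             # light source at the origin
    state = (4.0, -3.0, 2.0)       # arbitrary starting pose (x, y, heading)
    for i in range(601):
        if i % 150 == 0:
            d = math.hypot(state[0] - light[0], state[1] - light[1])
            print("t = %4.1f  distance to light = %.2f" % (i * 0.1, d))
        state = step(state, light)
```

Running the loop shows the distance to the source shrinking as sensing and steering feed back on one another; swapping the wiring to ipsilateral (left sensor to left wheel) turns the same body into a light avoider, which is exactly the point about gross architectural features and behavior.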
2.3.7 Connections Between Anatomy and Function Do Not Always Exist Such a clear connection between features of anatomy and functional behavior exists only for the coarse level of neuroanatomy or for very simple organisms. As architectures and organisms become more complex, this kind of direct relation between architecture and behavior becomes increasingly difficult to retrieve. Already in insects there is considerable difficulty in tracing features of architecture to behaviors. Although many distinguishable regularities may be observed, it is not even clear whether each of them has a function, or whether that architecture is the only way to realize the function. If different architectonic features can express the very same analogous function, then an argument can be made against anatomy as an invariant of behavior. An invariant of behavior, by definition, must be constant when the function is constant. So, if there are multiple neural implementations of a particular function, then those neural implementations reflect the invariant, but are not the invariant. What is more, although we may describe the brain in very fine architectural detail, the relationship between those features and function can be elusive. For historical reasons the anatomical characterization of the human brain is either independent of function or based on historical hypotheses (as in Broca’s aphasia). Wernicke’s areas, for example, are described in terms of the different cytoarchitectures and gross features of anatomy such as sulci and gyri. Even if, on occasion, some localization of function is attempted (see also Sect. 3.2.5 on functional magnetic resonance imaging and epistemic magnetic resonance imaging), the matching between function and structure inherits the arbitrariness and biases of the architectonic description. Incidentally, localizing function in the architecture says little about how the function is executed (as will be discussed in Sect. 3.2.5), which is our primary interest.
2.3.8 Neuroanatomical Variation and Constancy of Function If it is difficult to map constancy of function to architecture, which has an apparent order, it is even more difficult to account for the extreme neuroanatomical variation in complex organisms. Even in simple organisms there are many possible neuroanatomical implementations that express similar behaviors. In insects it is often a single neuron that realizes a complete pathway, and the variation across individuals is large. This suggests that although gross neuroanatomy may be taken as an invariant, it is so only at a coarse scale. One is hard-pressed to isolate features of geometry that are invariant with respect to function. However, this conclusion does service in constraining the search for an invariant of behavior, in that it adds one important requirement. An invariant of behavior has to give an account of which variations in neuroanatomy may express similar behaviors, and how (as we shall see in Part II).
2.3.9 One to Very Many Mappings from Anatomy to Function Even if there is a level where gross neuroanatomy correlates with some simple functions (and there is enough reason to believe there is), the case is very different for finer neuroanatomical details and more sophisticated behavior. The problem is that for more sophisticated brains, the mapping between function and neuroanatomy is one to very many. In one individual, one neural architecture maps to all the functions the organism potentially expresses. Therefore, except for very particular cases, neuroanatomical features cannot explain the breadth of behavior. As organisms become more sophisticated, they express a widening breadth of behavior [56], not necessarily traceable to gross features of the architecture. In a sense, the architecture “lodges” all possible behaviors an organism can express, and hence is not very informative about specific behaviors. This is a strong indication that neuroanatomy is not the most adequate level at which to search for invariants of behavior. This many-to-one function-to-structure mapping is manifest in the current neuroscientific literature, where particular anatomical structures are hypothesized to perform a number of functions, often incommensurable or even contradictory. To take an extreme example, the hippocampus has been associated with a large number of cognitive and behavioral functions. Examples include localization [46], declarative memory [13], emotion and motivation [59], face recognition [51], sequencing and planning [7], and language production [53]. This nonexhaustive list of functions attributed to the hippocampus betrays subjectivity in the attribution of its role, and more than that, it befuddles brain theorists as to whether it is possible to attribute to the hippocampus a unique function subsuming all the instances of function above. As Braitenberg puts it (p. 7 in [8]) (my emphasis): We will not be able, in most cases, to explain the peculiarities of a certain brain structure by invoking the rules and constraints of the mechanism that synthesize brains out of neurons, but will always have to consider explanations in terms of the function it performs.
The problem is that for any behavior that has the hippocampus as an active component, the role of the hippocampus may change, dynamically and within behavioral context, introducing a difficulty for any static functional label.7
2.3.10 Requirements for an Invariant of Behavior A theory of behavior certainly has to explain how constancies in the gross architecture level lead to constant function, but it cannot stop there. We may derive some
7 Schema theory (see more in Sect. 2.3.12) exemplifies the attempts to overlay the brain with a structure of functional components, “boxes,” to aid in the explanation of overall behavior. But although it is a tenable level of functional analysis, it encounters the problem of arbitrariness. That is because, in an important sense, a function only exists while it is being executed, and only within the frame of a theory. Often, there will be no fundamental way to decide, between different overlaying schemas, which one is better or best.
requirements for an invariant of behavior, in that (1) it has to explain the appearance of gross features of neuroanatomy, (2) it has to explain the brain’s “capacity for behavior,” the fact that it may lodge a large number of “latent behaviors,” and (3) it has to explain how constancy in function appears despite variability in neuroanatomy. Much like genes, neuroanatomy subserves behavior but does not fully exhaust its explanation. The fact is that behavior exploits neuroanatomy in the same way that it employs the body, by being embedded in a context. For a Braitenberg vehicle to follow a light, there must be, in addition to the neural architecture, also a light source and a body that turns to it. Invariants of function rely on contingencies of the environment and are only definable in those terms. The behaviors expressible by the Braitenberg vehicle are expressed only when the proper stimuli are present, which usually requires that the environment be introduced into the explanation. Reflex theorists such as Skinner, Pavlov, Thorndike, and Herrnstein took this to heart. They proposed that the environment could be distilled into a number of stimuli, each evoking an action, where the sum of all stimulus–reflex pairs could make up an explanation of behavior. This was soon shown to be too reduced a picture to encompass the multitude of adaptive behavior [40].
2.3.10.1 Reflexes as Invariants of Behavior Although it has not been framed as such (a search for invariance), the most prominent example of the search for invariants of behavior is reflex theory. It departed from the observation that some stimuli were stereotypically associated with particular motor responses. At its inception, it seemed like a productive front of research, infused with optimism, but as the research program developed, dissatisfaction arose in the community. It soon became obvious that an organism is more than a finite list of stimulus–response reflexes [40]. Reflex theory is attractive because it hints at a possible systematic search for stimulus–response couplings. It may have led some researchers to believe, incorrectly, that it is possible to compile a comprehensive, if not complete, list of stimulus–response pairs. Summarizing Merleau-Ponty, the animal is more than the sum of its behaviors. The invariant reflex is too circumscribed, too contingent on a precise definition of ceteris paribus conditions, to succeed as a theory of behavior. But in spite of its shortcomings, it was instrumental in the search for mechanisms behind behavior. The search for an exhaustive list of reflexes was an original and insightful, albeit somewhat rudimentary, attempt to explain the sources of the apparent invariances in behavior.
2.3.10.2 Limitations of Reflex Theory Reflex theory faces two problems, one epistemological, one empirical. Empirical, because the laboratory setup is a far cry from natural behavior. As Merleau-Ponty (pp. 44–46 in [40]) writes, “few pure reflexes are found [because] laboratory reflexes
are not biological realities.” Epistemological, because there was no obvious way to reintegrate the individual stimulus–reflex pairs into a whole picture of behavior. This highlighted the ludicrous nature of the original project of reflex theory. Because the environment is immensely complex, it is not possible to exhaustively list stimulus–reaction pairs. It is also not possible to assume a null context, because every stimulus presentation has a different context (even if that difference is merely the time elapsed).
2.3.11 Cybernetics, Reafference, and Sensorimotor Loops The nervous system and the automatic machine are fundamentally alike in that they are devices which make decisions on the basis of decisions they made in the past. Norbert Wiener, Cybernetics
There would be no adequate framework to resolve the quandaries of reflex theory until the appearance of cybernetics. Cyberneticists realized that single stimulus–reaction pairs were insufficient to explain behavior. Theirs was the insight that if one adds feedback from the environment, then a recurrent structure emerges in time, where one stimulus–reaction pair entails the next. The organism then has an active role in producing its own stimuli, and so on, in a cycle. Behavior appears as stimulus–reactions are sequentially and smoothly chained. Moreover, cyberneticists acknowledged that the organism has internal states, which render it a “nontrivial machine,” sensu von Foerster [21]. That suggested that the structure of the nervous system was organized as a “negative feedback with a purpose” (see Sect. 8.5.3). Mittelstaedt and von Holst [31], Ashby [4], Arbib [1, 2], Wiener [52, 64], and others [9, 23, 60] provided a rich set of examples of organisms whose neural structures could execute the function of negative-feedback controllers. Examples ranged from the flight of a fly to the eye movements of frogs. Their seminal contribution spawned many fields of research [30].
2.3.11.1 Cybernetics Extends Reflex Theory As discussed in Sect. 2.3.10, although in some cases the underpinnings of reflexes could be traced to anatomy and network structures, in most cases behavior still defied a mechanistic explanation. Although an important start had been made, a general theory of behavior was still missing at the time. That remained the case until cybernetics introduced the concepts of reafference, recursion, and the sensorimotor feedback loop. These notable contributions of cybernetics provided new ground towards an explanation of the immense variation of complex nervous systems.
Fig. 2.5 An example of an explanation based on reafference. (From [31])
Despite much variation in neural structure, constancy in behavior was too pervasive for the concept of the reflex to be relinquished altogether. Admittedly incomplete, it nonetheless provided a stepping stone towards the explanation of behavior and its invariances, open to further research on how specific mechanisms could underlie multifarious behavior. Cybernetic views enhanced reflex theory into reafference theory, and suggested a path towards a more sophisticated take on a theory of invariances in behavior (Fig. 2.5). It was to be searched for in the engaged interplay between body, nervous system, and environment.
2.3.11.2 Functions, Mechanisms, and Purpose The principal insight of cybernetics was to suggest that the organism could be regarded as a complex mechanism whose functions emerge as the organism becomes (sensu Varela and von Foerster) in its environment. Wiener and contemporary cyberneticists proposed that purposeful behavior is the outcome of a mechanism: that which a machine is built to do. It presupposes an abstract principle of construction – a mechanism – that entails function when placed in a sensorimotor loop. A machine that acts and corrects towards a goal is behaving purposefully, and its mechanism(s) for doing so is (are) the behavioral invariant of that goal. In that reductionist sense, the simplest example in the literature of a machine with a purpose is a thermostat. Its construction is such that it compares two temperature values, a reference and a measured value, and sends corrective signals (which may command heating or cooling) to reach a temperature equilibrium. More than an analogy, this was proposed as purposeful behavior. Many more such examples exist, such as the Watt governor or radar control, where negative feedback is the operating principle. Distasteful to some [36], the proposal that an organism acts as a feedback mechanism has profound implications. For if a certain mechanism entails a certain function, then it is the mechanism
itself which is the invariant of function. The mechanism is, so to speak, coextensive with its functions. Moreover, it acknowledges the role of the environment as an inextricable part of function: for the function to be exerted, the environmental context must be given.8 The thermostat is the standard example of a cybernetic device: a system in the world, endowed with sensors and actuators that enable it to make feedback corrections to attain its goal. In that, it is similar in description to an organism that pursues a willed goal by deploying a set of actions and corrections to achieve it. The thermostat, however, has a goal in a rather narrow sense; in fact, it has only one goal, about which it is not cognizant. If, for example, the window is opened, the thermostat may huff and puff, but will not do more than keep the heater (cooler) on (off). Hence, compunctions about calling its behavior multitudinal (or even adaptive) are more than justified. Multitudinal and adaptive behavior are hallmarks of organismic behavior, even of insignificantly small mammals that do all sorts of cool stuff, such as finding a mate, finding a home, playing, fleeing, growing, finding food, killing food, and feeding, and all that flexibly, adaptively, and purposefully. Hans Jonas, in contemptuous disagreement, was quick to point out that the analogy of purposeful behavior as a mechanism is misapplied [36]. Concerning the relationship between goal and purpose: in a mechanism, the goal is what the mechanism is built to achieve. To construct a machine is an exercise in defining a goal and constructing mechanisms to achieve it. An organism, on the other hand, seems to define its own goals. Its behavior not only exhibits purpose, but is also guided by it. That is the problem Jonas stresses: a machine has its goals attributed to it by us; an organism defines its own goals by existing and being intrinsically motivated (the concept of mediacy; pp. 106–114 in [36]). Unlike a machine, an organism is not a set of stapled functions. Rather, an organism entails functions, which are exerted or not depending on the internal context and external affordances. Goals for the organism appear as the organism behaves, and as it behaves it changes, and consequently its goals change, and so on recursively. One may not discard Jonas’s criticisms lightly, but a closer look may help bring the two, machine and organism, together. First, we take it as a given that a mechanism may serve as an analogy to an organism; after all, we initially built machines to accomplish purposes, often copying those of organisms. We should not forget, as Jonas reminds us, that they are indeed our purposes, not the machine’s. But as a model for behavior, a cybernetic device is more than a cold machine. It outlines a set of prescriptions for assembling purposeful-like behavior, with an appealingly small set of assumptions. And it does not take too much dialectic effort to bring purpose back into the explanation, not as an analogy, but as something embedded in the control loop, generating the specifications for behavior – even if it fails to show how the drives feel to their owner. My purpose here is to show that a mechanism is a correct, albeit hopelessly incomplete, model of the purposes, or
8 Environmental context is meant broadly: a cell can be an environment for DNA, as blood can be an environment for a cell, or the body can be a contextual environment for a neural system.
drives, or urges, of an organism. My reasons for the incompleteness claim are the same as Merleau-Ponty’s: “to build a model of an organism would be to build the organism itself” (p. 151 in [40]; also Wiener [64]). This does not mean that we should give up building models, but that our models are not organisms – although they may help us understand the functions of an organism. And ultimately, it is not precluded that we will one day assemble such a resourceful contrivance that it will act as we do, although it may not have the same type of (organismic, motivational) reasons for its drives. If we bring evolution into the picture, the issue gains perspective. Goals appear as evolutionary viability imposes order on self-organizing structures endowed with intrinsic potentialities. In evolution, mechanistic, functional behavior bestows organisms with enhanced viability. It is an orderly exploit of chaos, a selection, that which begets function. Life evolves towards function because function enhances viability. Very simple function, mechanical behavior such as chemotaxis in a bacterium, was described as purposeful long before the advent of cybernetics as a theory. A bacterium, as a complex organism, has the purposeful behavior of a mechanism. All the behaviors a bacterium can express are latent in its structure, and are expressed when it encounters contexts, the environment. To the extent that we can disinter the mechanistic underpinnings of the behaviors of a bacterium, we can understand what its mechanisms are. Because an organism is a mechanism coextensive with all its potentials for functional behavior, a quest for function is fulfilled with the description of a mechanism. But the problem is a little deeper than that.
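Before moving on, the thermostat example above can be made explicit in a few lines. This is a sketch under assumed constants (the setpoint, leak rate, and heater power are invented for the illustration), not an implementation from the text; it shows both the negative-feedback correction and the narrowness of the device’s single goal.

```python
def thermostat(measured, setpoint, band=0.5):
    """Compare the measured value with the reference and emit a corrective
    command: the whole 'purpose' of the device is this comparison."""
    error = setpoint - measured
    if error > band:
        return "heat"
    if error < -band:
        return "cool"
    return "off"

def simulate(setpoint=21.0, outside=5.0, hours=8.0, dt=0.1):
    """Toy room model: the room leaks heat to the outside; the heater or
    cooler injects or removes heat when commanded."""
    temp = 14.0                      # arbitrary initial room temperature
    leak, power = 0.1, 3.0           # invented constants for the illustration
    t = 0.0
    while t < hours:
        action = thermostat(temp, setpoint)
        drive = power if action == "heat" else (-power if action == "cool" else 0.0)
        temp += dt * (leak * (outside - temp) + drive)
        t += dt
    return temp

if __name__ == "__main__":
    print("room temperature after 8 h: %.1f C" % simulate())
```

The loop settles near the setpoint and then chatters within the band, which is all the device can ever do: if the window is opened (the leak grows or the outside temperature drops), the controller does nothing smarter than keep the heater on, exactly the single, uncognized goal discussed above.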
2.3.12 Schema Theory and Functional Overlays If we are compelled by the idea that behavior is subserved by a dynamical concoction of mechanisms operating in concert and in context, the simplest way to study brain function is to find a match between mechanisms and functions. One intuitive method would be to overlay the neural structures with hypothetical functional boxes whose actions would produce the analyzed functional behavior. This method is essentially a product of systems theory applied to the study of neural function. The idea that function can be localized to particular substructures has been prolifically employed by cognitive scientists and neuroscientists. Unfortunately, this appealing project suffers from critiques similar to those leveled at reflex theory. Whereas reflex theory falls short of explaining how the sum of functions builds the whole behavior, schema theory strains to show how brain areas shift between functions. As the same brain area may subserve different functions at different times (and even simultaneously), a static attribution of function to brain area is bound to be partial and incomplete. A functional overlay is a metaphor of localization of function, in which an idealized anatomical map of a brain, often hierarchical, is overlaid with functional boxes representing a mapping between brain areas and functions (an example is given in Fig. 2.6). The functional boxes are usually connected with arrows representing some kind of well-formed information exchange. The boxes are in fact black boxes
Fig. 2.6 A schema as an overlay of functional boxes on brain areas. This particular schema proposes to explain the functional roles of brain areas involved in action recognition. Abbreviations: IT – inferotemporal cortex; cIPS – caudal intraparietal sulcus; AIP – anterior intraparietal cortex; STS – superior temporal sulcus; MIP/LIP/VIP – medial, lateral, and ventral intraparietal cortex. (From [48])
of function, implemented neurally somehow – perhaps instantiated with models, or with models pending. A structure of such boxes representing function–area associations is able to generate predictions about interdependencies, for example, in lesion studies. Outlining functional boxes is a basal tool for theory (and hypothesis) making, and in some incarnations is called schema theory [1–3, 16]. A schema is a relational structure of functional boxes standing for an explanation of a certain behavioral or cognitive ability. Hence, systems theory applied to brain function incurs a few critical difficulties. First, a function is a relational abstraction, subjectively outlined, as in [5, 42]. Second, as discussed in Sect. 2.3.9, the same neural structure, such as the thalamus or the hippocampus (can there be a unique function for these structures?), may underlie different functions (also simultaneously!). Third (vide Sects. 3.3 and 10.1–10.3), a neural structure may participate in two different functions in distinct ways. Fourth, to any functional question (what for?), there will be answers (for x), good or bad, as I will analyze in more depth in Chap. 3 on empirical assessments of invariance. Fifth (as in Sect. 8.5.3), there is compelling evidence indicating that brain function is dynamically organized: a function is the product of dynamics that only really exist while the function is executed (otherwise only as a potential). These points are individually addressed in the sections mentioned, and they lead to the upcoming conclusion that these epistemological difficulties appear because the brain is a hermeneutic machine, one that can only be comprehended in terms of function, which only exists (in the eyes of the theory) once it is named. One finds (or does not find) the function one names.
A functional overlay map is similar to a labeled and tagged map used for navigation, where, depending on one's interests and search, restaurants, hotels, and entertainment venues would be highlighted. A functional overlay is like a metadata-laden navigation map in which a restaurant and a hotel may occupy the same place simultaneously. The analogy is wanting, however, because although restaurants and hotels exist even when not shown, functions are only there when they are executed. For instance, it is not productive to analyze object recognition in the absence of the object, nor to analyze speech production in the absence of words.
2.3.12.1 Descriptions of Schema Functions Are Abstractions Over Instances

When a neuroscientist asks questions about cognitive or behavioral function, he or she asks questions about memory, perception, understanding, abstraction, categorization, attention, judgment, decision, volition, and learning. Essentially these abilities are abstract clusters derived from single instances. All these functions have a relational structure, and all these functions are "of" something. As Brentano [12] pointed out (in 1874), for all these cognitive and behavioral functions there are objects. Memory is "of" some event, perception is "of" some thing, and so forth. The functions listed above are abstract categories that cluster a large number of instances in which a similar purpose is recognized. Descriptions of function derive from categorizations across instances, and they inherit properties of the individual instances, which may be contradictory or even mutually exclusive. This is the source of many a misunderstanding in the search for mechanisms underlying function. Hence, the search for a mechanism implementing a function stumbles upon a problem equivalent to Minsky's frame problem in artificial intelligence [41]: the more detailed the description becomes, the more it excludes. Between excessive abstraction and excessive specificity, schema theories of behavior fail to display the flexibility necessary to explain the relations between structure and function.

The important lesson to be drawn is that the search for brain mechanisms must fully embrace the multifunctional character of the brain, as it dynamically shifts from function to function. The explanation of a brain function remains the search for a mechanism, but one that lodges functions only as potentials until they are put to use. An explanation of organismic function based on mechanisms and functional boxes has to explain more than a single function at a time; it must also address the multifarious potentialities for function that one neural structure may deploy in different contexts and moments. Because the functional role of the neural substrate is often dynamically organized, and thus only exists in the context of behavior, box-structure overlays become invalid at different moments. Dynamical systems theory explains how this can be the case, as structures assume different roles in different contexts, vis-à-vis dynamical parameterizations. Invariants of behavior arise within dynamical contexts, and admit only very flexible functional overlays. At the risk of sounding repetitive, behavioral functions are entailments of an organism. Schema theory confuses component functions with de facto structural functionality.
2.3.13 Behavioral Function and Invariants of Behavior

A functional behavior is a phenomenon in the proper sense of the term: the object of a person's perception; what the senses or the mind notices [35]. Functional behavior is immanent in organisms, whereas observers describe and categorize it (after Merleau-Ponty). When purposes and goals can be attributed to a set of actions, this set of actions can be outlined as a functional behavior. Consider an organism as a complex mechanism with the potential for many component functions. Deploying these capabilities in a concerted manner to achieve a particular goal creates an instance of functional behavior. The potential for behavior is fulfilled in interaction with the environment, in context. An invariant of behavior appears with the minimum set of mechanisms with the inherent potential for those functions indispensable for a functional behavior. Incidentally, akin to other invariants of biology, the idea of an invariant of behavior obviates subjectivity in the definition of function and context.

Behaviors have functions as prerequisites. Consequently, the search for an invariant of behavior begins with the description of a behavior, parsed into the component functions necessary for that behavior. The act of outlining the functional components of behavior is, by necessity, a subjective activity. Correspondingly, functional descriptions may be contradictory or fuzzy – the best heuristic being parsimony, as when attempting to avoid excessive anthropomorphism, notwithstanding its ineluctability. In a rough description, a behavioral function of a fly such as ⟨pursuing a mate⟩ requires components to ⟨localize a moving dot in the visual field⟩ and ⟨flight control⟩. Those descriptions are shorthand, implicitly assuming the existence of mechanisms that enable these functions (note, moreover, that the mechanism underlying both functions may be one and the same).

The invariant of a behavior is the set of mechanisms fundamentally necessary for a behavioral function. In that, it is inextricable from the descriptions of the component functions. It implies the required functions for one functional behavior, and defines the outline of mechanisms to be explained. In reflex theory the atom of explanation is a stimulus–reaction pair. In my (constructivist) conception, the atom of explanation is the association of a functional behavior (such as feeding), with all its component functions (e.g., the capacity for hunger, foraging, mastication), and the minimal substrate that can implement that behavioral function, within an organismically relevant frame. The invariant of behavior is a many-to-one association from behavioral function to mechanism(s). Because functions can be implemented differently, the substrates implementing an invariant of behavior may vary widely. The less similar the organisms, the less likely that the mechanisms will be similar in implementation. Conversely, the more similar the organisms, the more probable that the implementation will evince similarities in mechanisms. Evidently, the mechanisms leading to feeding behavior in an elephant are more similar to those of a hippopotamus than to those of an ant. Still, even between conspecifics there will be variation in implementations. That which remains the same across compared organisms is the
invariant of behavior, which may be a null set (e.g., a palm and an elephant) or an equivalent set (e.g., two worker ants). Different organisms with similar capabilities for behavioral function will in one way or another implement the same component functions. Thus, findings about behavioral invariants in one organism may tell tales about the selection of component functions in the behavior of different organisms. In that sense, invariants of behavior can be conceptualized as the overlapping regions of constancy across the variable implementations of functional mechanisms across organisms. Invariants of behavior are conceptual shorthand for the definition of a behavioral function and its possible implementations/instantiations. Invariants of behavior map to the space of (abstract) mechanisms with the potential for behavioral function.9

What is similar between an elephant feeding and an ant feeding? Simple answer: the description. Feeding is the function of acquiring energy through ingestion of food. Quite obviously, an ant and an elephant will have quite different mechanisms for feeding, but both share the description. This is the difficulty of studying many a cognitive function with model organisms: the description is shared, but the mechanism is not. The main problem with that conception is also the necessary frame for any solution: language. A function is a description of a purpose. In the elephant as in the ant, the description of feeding is the same, because the goal is the same. At the abstract level of teleological analogies, the invariant of behavior is a description of that behavior and the possible analogies between component functions/mechanisms [23].

I recently participated in a round table of neuroethologists which included specialists in mammals, fish, and insects. At the event, results of operant conditioning in drosophilas were presented. In the paradigm, drosophila flies were made to couple an unconditioned stimulus with a conditioned stimulus. The source of disagreement was the number of trials until the response of the drosophilas was conditioned. Learning took place in very few trials (two to seven). Mammal experts pointed out that this process is much shorter than in mammals, where operant conditioning takes many more trials (more than 50) until the conditioned response is exhibited. Consequently, the mammal experts argued (with some support from the fish experts) that what happened in the drosophilas could not be called operant conditioning – much to the dismay of the insect experts, who pointed out that apart from the number of trials, the descriptions of the experiments were essentially the same. The lesson to be learned from this event is that descriptions are dangerous. I take it that the problem behind the disagreement was that nothing learned about operant conditioning in insects could be imported to mammals. All the same, because the mechanisms are likely to be fundamentally different, a certain strictness concerning terminology is utterly justified.

Component functions of a behavioral function may be more or less dependent on each other, and this dependence may vary between different organisms. Therefore, some mechanisms may be more modularizable than others, which is reflected in the
9 This is one of the main difficulties with analogies in ethology and neuroethology: descriptions of functions are shared across different organisms, whereas mechanisms are not.
behavioral invariants. Lesion experiments test for this. In monolithic mechanisms, component functions may share the substrate with which they are implemented – the same neural structure may underlie two simultaneous functions. As an example, certain cephalopods use their visual system both to guide escape reactions and to select skin patterns for camouflage [29]. Because the visual system underlies both abilities, damage to the common substrate impairs both behavioral functions. Although the substrate is the same, it may be employed by different subsystems in distinctive modes (as the experiment described in Sect. 10.3.2 will illustrate).
2.3.13.1 Invariants of Behavior from Convergent Evolution

Because organisms share the world, along with some basic physics and biochemistry, some behavioral functions may be stumbled upon anew in different evolutionary lines. The phenomenon denominated "convergent evolution" may lead to similarities of form, of behavior, or both. A conception of evolution as procuring a maximal common denominator in behavioral function may rely on invariants of behavior as a test bed for hypotheses. Invariants of behavior may be defined within individuals, across conspecifics, and beyond phyla. The amounts of overlap are informative about similar problems that different organisms may have encountered in their evolutionary paths, leading to convergence in behavioral function and structure.

Invariants of behavior are proposed as conceptual shorthand in the understanding of the multifarious behavior of organisms. In this chapter I exemplified how invariants of behavior affect the search for a mechanism for behavioral function. The procedure is simple: name a behavioral function, partition the behavioral function into necessary component functions, and finally search for the minimum set of mechanisms with the capacity to subserve the behavioral function named (a toy sketch of this bookkeeping follows). Invariances appear where similar problems are solved similarly. It is in this sense that invariants of behavior will be henceforth employed.
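As a toy illustration of this procedure – not from the book; the organisms, component functions, and mechanism names are invented for the example – one can record, for each component function, the candidate mechanisms per organism and take the cross-organism overlap, which stands in for the invariant of behavior and may well be empty.

```python
# Toy sketch of the procedure: behavioral function -> component functions ->
# candidate mechanisms per organism; the cross-organism overlap (possibly empty)
# stands in for the invariant of behavior. All names are invented examples.
feeding = {
    "locate food": {
        "ant":      {"antennal chemoreception"},
        "elephant": {"olfaction", "vision"},
    },
    "ingest food": {
        "ant":      {"mandible motor program"},
        "elephant": {"trunk-to-mouth motor program"},
    },
}

def invariant(behavior):
    """Per component function, the mechanisms shared by all organisms compared."""
    shared = {}
    for component, per_organism in behavior.items():
        sets = list(per_organism.values())
        shared[component] = set.intersection(*sets) if sets else set()
    return shared

print(invariant(feeding))   # empty sets: the description is shared, the mechanisms are not
```

With the elephant and the ant, every intersection is empty: the description of feeding is shared, the mechanisms are not.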
2.4 Conclusion

At the risk of having belabored the obvious, I argued, among other things, that to compose a powerful theory one has to find those invariants that are most telling about a phenomenon. When aiming at an explanation of behavior, there are many stances not fully suitable for the task. It is unproductive, for example, to say that genes are invariants of behavior. Although that may be true, it is also very coarse. Unless genes show the mechanisms of behavior, they are improper candidates for explanatory invariants; likewise for neuroanatomy regarded in isolation. Neurons, networks, and their properties vary widely, even in conspecifics; and yet, most functions remain unaltered. The space of variation may have to be defined "outside" the
range of neuroanatomy – for instance, in dynamical terms. So the most adequate level of description sees neuroanatomy as subserving the dynamics that mediate between environment and organism.

The derivation of lawlike relationships from correlations is a class of inference common in biology (this is a neutral statement of fact). Nonetheless, the considerable difficulties in measurement and control conditions must be acknowledged. Biologists are frequently forced – often owing to empirical limitations – to abduce relations between variables that may be quite distant in reality. Circumventing the difficulty by quantifying relations between behavior and neurons, or behavior and genes, does not quite solve the problem. Behavior and genes may be related, but not immediately related. Between them there is too big a gap, even for a leap of faith. It is better to build bridges. That bridge is constituted by the environment and the body as links between genes, neurons, and behavior.

This chapter contained an analysis of epistemological aspects involved in a search for invariance in behavior. The following questions were asked: What is an invariant? What forms does it take? What is the nature of invariance in the study of behavior? How have theories employed invariants? The next chapter deals with invariants arising from empirical measurements, and with how invariant function is measured in brains. Essentially, all empirical methods of brain measurement rely on experimental paradigms that attempt to isolate the invariants in brain function. In neurophysiology, invariants are used to indicate the relevant dimensions of change of a phenomenon. If something simply does not change when something else changes, no distinction can be drawn, hence nothing can be learned. At the other extreme, if it varies too much, it is also not very enlightening. Science operates on distinctions and on gradations of change and relationships between variables. Therefore, the susceptibility of a hypothetical invariant to changes of context offers a qualitative assessment of that invariant. The relevance of an invariant is assessed in terms of its dimensions of change. Thus, the next chapter discusses methods of empirical assessment of invariance in brain and behavior.
2.5 Summary

The argument of this (somewhat philosophical) chapter is as follows:
- Invariants are what remains when something else changes (Sect. 2).
- Invariants underlie theories and are pervasive in physics and mathematics, as well as in biology (Sect. 2.1.1).
- Invariants come in different forms, fitting assorted categories. We named five: theoretical, empirical, relational, idealization, context-dependent (Sect. 2).
- Invariants in biology are context-dependent; genes are taken as an example. Although genes help build an organism, and thus may be framed (somewhat narrowly) as invariants of form, they do not offer an appropriate level of analysis for behavior, for behavior appears in the engaged interaction between an agent and the environment (Sect. 2.3.2).
- Similarly, neuroanatomical features can be taken as invariants of functional behavior, either in simple organisms or at the level of gross architecture, but they fail to address three crucial aspects (Sect. 2.3.4):
  1. The mapping between neuroanatomical features and function is not always clear.
  2. One neuroanatomy subserves many behaviors.
  3. Different neuroanatomies subserve similar behaviors.
- The reflex is regarded as a hypothetical invariant of behavior. Reflex theory assumed that behavior was composed of stimulus–reflex pairs (Sect. 2.3.10):
  1. However, behavior is more than concatenated stimulus–reflex pairs.
  2. Moreover, a characterization of behavioral function is inherently subjective (Sect. 2.3.9).
- For cybernetics, purposeful behaviors result from the interaction of complex feedback mechanisms (reafferences) with the environment (Sect. 2.3.11).
- An invariant of behavior is the set of possible mechanistic implementations of a behavioral function (Sect. 2.3.13).
References

1. Arbib MA (1972) The metaphorical brain: An introduction to cybernetics and brain theory. MIT Press, Cambridge
2. Arbib MA (1982) Rana computatrix: An evolving model of visuomotor coordination in frog and toad. In: Machine Intelligence 10. Ellis Horwood, Chichester, pp 501–517
3. Arbib MA, Fellous JM (2004) Emotions: From brain to robot. Trends Cogn Sci 8(12)
4. Ashby W (1960) Design for a brain: The origin of adaptive behavior, 2nd edn. Chapman & Hall, London
5. Bateson G (1972) Steps to an ecology of mind. University of Chicago Press, p 533
6. Benavides-Piccione R, Hamzei-Sichani F, Ballesteros-Yanez I, DeFelipe J, Yuste R (2006) Dendritic size of pyramidal neurons differs among mouse cortical regions. Cereb Cortex 16(7):990–1001
7. Berns GS, Sejnowski TJ (1998) A computational model of how the basal ganglia produce sequences. J Cogn Neurosci 10(1):108–121
8. Braitenberg V (1977) On the texture of brains: An introduction to neuroanatomy for the cybernetically minded. Springer, New York
9. Braitenberg V (1984) Vehicles: Experiments in synthetic psychology. Bradford Books, Cambridge
10. Braitenberg V (2001) Brain size and number of neurons: An exercise in synthetic neuroanatomy. J Comput Neurosci 10(1):71–77
11. Braitenberg V, Schüz A (1998) Cortex: Statistics and geometry of neuronal connectivity. Springer, Berlin
12. Brentano FC (1874) Psychologie vom empirischen Standpunkte. Duncker & Humblot, Leipzig
13. Cohen N, Squire L (1980) Preserved learning and retention of pattern-analyzing skill in amnesia: dissociation of knowing how and knowing that. Science 210(4466):207
14. Dawkins R (1976) The selfish gene. Oxford University Press, New York
15. Di Paolo E (2002) Book review: Cycles of contingency. Artif Life 8(2)
16. Dörner D (1999) Bauplan für eine Seele. Rowohlt, Reinbek
17. Edelman G (1988) Topobiology: An introduction to molecular embryology. Basic Books
18. Eigen M, Schuster P (1978) The hypercycle. Naturwissenschaften 65(1):7–41
19. Elston G, Rockland K (2002) The pyramidal cell of the sensorimotor cortex of the macaque monkey: Phenotypic variation. Cereb Cortex 12(10):1071–1078
20. Elston GN (2005) Cortex, cognition and the cell: New insights into the pyramidal neuron and prefrontal function. Cereb Cortex 13(11):1124–1238
21. von Foerster H, von Glasersfeld E (2005) Einführung in den Konstruktivismus, 9th edn. Piper, Munich
22. Fox Keller E, Harel D (2007) Beyond the gene. PLoS ONE 2(11):e1231
23. von Glasersfeld E (1990) Teleology and the concepts of causation. Philosophica 46(2):17–42
24. Goodwin B (2001) How the leopard changed its spots: The evolution of complexity. Princeton University Press, Princeton
25. Goodwin B, Briere C (1989) A mathematical model of cytoskeletal dynamics and morphogenesis in Acetabularia. In: The cytoskeleton of the algae. CRC Press, Boca Raton, pp 219–238
26. Gould SJ, Lewontin RC (1979) The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proc R Soc Lond B 205:581–598
27. Griffiths PE, Gray RD (2000) Darwinism and developmental systems. MIT Press, Cambridge
28. Griffiths PE, Stotz K (2007) Gene. In: The Cambridge companion to the philosophy of biology. Cambridge University Press, Cambridge, pp 103–119
29. Hanlon R (2007) Cephalopod dynamic camouflage. Curr Biol 17(11):400–404
30. Heylighen F, Joslyn C (2001) Cybernetics and second-order cybernetics. In: Meyers R (ed) Encyclopedia of physical science and technology, 3rd edn. Academic Press, New York
31. von Holst E, Mittelstaedt H (1950) Das Reafferenzprinzip. Naturwissenschaften 37(20):464–476
32. Homberg U, Paech A (2002) Ultrastructure and orientation of ommatidia in the dorsal rim area of the locust compound eye. Arthropod Struct Dev 30(4):271–280
33. Jablonka E, Lamb M (2005) Evolution in four dimensions: Genetic, epigenetic, behavioral, and symbolic variation in the history of life. MIT Press, Cambridge
34. Jablonka E, Lamb M, Avital E (1998) 'Lamarckian' mechanisms in Darwinian evolution. Trends Ecol Evol 13(5):206–210
35. Jewell E, Abate F, McKean E (2001) The new Oxford American dictionary. Oxford University Press, Oxford
36. Jonas H (2001 [1966]) The phenomenon of life. Northwestern University Press, Evanston, IL
37. Kauffman S (1969) Metabolic stability and epigenesis in randomly constructed genetic nets. J Theor Biol 22(3):437–467
38. Mayr E (1961) Cause and effect in biology: Kinds of causes, predictability, and teleology are viewed by a practicing biologist. Science 134(3489):1501–1506
39. Mayr E (1976) Evolution and the diversity of life. Harvard University Press, Cambridge
40. Merleau-Ponty M (1963 [1942]) The structure of behavior. Duquesne University Press, Philadelphia
41. Minsky M (1975) A framework for representing knowledge. In: The psychology of computer vision. McGraw-Hill, New York
42. Nagel E (1979) The structure of science: Problems in the logic of scientific explanation. Hackett Publishing, Indianapolis
43. Niven J (2008) Evolution: Convergent eye losses in fishy circumstances. Curr Biol 18(1):27–29
44. Noether E (1918) Invariante Variationsprobleme. Gött Nachr 235
45. Noether E, Tavel M (2005) Invariant variation problems. Arxiv preprint physics/0503066
46. O'Keefe J, Dostrovsky J (1971) The hippocampus as a spatial map: preliminary evidence from unit activity in the freely moving rat. Brain Res 34:171–175
47. Oyama S (2000) The ontogeny of information: Developmental systems and evolution. Duke University Press, Durham
48. Oztop E, Kawato M, Arbib M (2006) Mirror neurons and imitation: A computationally guided review. Neural Netw 19(3):254–271
49. Pais A (1982) Subtle is the Lord: The science and the life of Albert Einstein. Oxford University Press, Oxford
50. Porter J, Baker R (1997) Absence of oculomotor and trochlear motoneurons leads to altered extraocular muscle development in the Wnt-1 null mutant mouse. Dev Brain Res 100(1):121–126
51. Quiroga R, Reddy L, Kreiman G, Koch C, Fried I (2005) Invariant visual representation by single neurons in the human brain. Nature 435(7045):1102–1107
52. Rosenblueth A, Wiener N, Bigelow J (1943) Behavior, purpose and teleology. Philos Sci 10:18–24
53. Ryan L, Cox C, Hayes SM, Nadel L (2008) Hippocampal activation during episodic and semantic memory retrieval: Comparing category production and category cued recall. Neuropsychologia 46(8):2109–2121. DOI 10.1016/j.neuropsychologia.2008.02.030
54. Smith JM, Burian R, Kauffman S, Alberch P, Campbell J, Goodwin B, Lande R, Raup D, Wolpert L (1985) Developmental constraints and evolution. Q Rev Biol 60(3):265–287
55. Smith PG (2007) Information in biology. In: The Cambridge companion to the philosophy of biology. Cambridge University Press, Cambridge, pp 103–119
56. Sterelny K (2005) Thought in a hostile world. MIT Press, Cambridge
57. Swammerdam J (1737) Biblia Naturae, Sive Historia Insectorum, vol 1. IDC, Leiden
58. Ton R, Hackett J (1984) The role of the Mauthner cell in fast starts involving escape in teleost fishes. In: Neural mechanisms of startle behavior. Springer, Berlin
59. Tracy A, Jarrard L, Davidson T (2001) The hippocampus and motivation revisited: appetite and activity. Behav Brain Res 127(1–2):13–23
60. Turchin VF (1977) The phenomenon of science: A cybernetic approach to human evolution. Electronic version: http://pespmc1.vub.ac.be/POSBOOK.html
61. Varela F (1979) Principles of biological autonomy. North Holland, New York
62. Varela F, Maturana H (1987, 1998) The tree of knowledge, 1st edn. Shambhala, Boston, MA
63. Varela F, Maturana H, Uribe R (1974) Autopoiesis: the organization of living systems, its characterization and a model. Curr Mod Biol 5(4):187–196
64. Wiener N (1961) Cybernetics: or control and communication in the animal and the machine, 2nd edn. MIT Press, Cambridge
Chapter 3
Empirical Assessments of Invariance
Abstract The search for invariances outlined in the previous chapter is taken further with an analysis of empirical methods for assessing behavioral invariance in the study of the mammalian brain. These methods are the bread and butter of the neuroscientist and cognitive neuroscientist: electrophysiology, functional magnetic resonance imaging, and diffusion tensor imaging all search for measurable invariances that may illuminate function and mechanism. This is no simple task, as all these methods have inherent particularities of both an empirical and an epistemological nature. A clear view on these issues is essential, as they have a strong bearing on the conclusions drawn from experiments and, by extension, on the mechanisms of behavior.
3.1 Empirical Assessments of Invariance

3.1.1 Invariants in Neuroscience

Much of the plausibility of hypothetical neural invariants of behavior derives from findings in neuroscience and cognitive neuroscience, whose empirical assessments of the relationship between behavioral function and cortical activity show some constancy in measurements. The rationale consists of assuming that for one type of task, one assumed behavioral function, the brain will act similarly in some way across different instantiations and moments, and that this similarity will be reflected in measurements. In fact, current neuroscience thrives on similarities – and is perplexed at dissimilarities. When correlations in measured activity are conspicuous across trials and similar experiments, there is support for the belief in the existence of a structural level that brings about the correlations. The argument, concisely, is as follows: empirical invariants of brain activity with respect to function exist; consequently there must exist, at some level of analysis, something constant about brain function. What and where those levels of analysis are will concern us in the upcoming chapters. To commence the search, we will consider where and how some forms of empirical invariants appear, and then we will discuss the empirical assessments of invariance
proper, that is, the methods and tools neuroscience employs to highlight invariance. To conclude the chapter, I indicate sources of variation in empirical assessments of invariance.
3.1.2 A Brief Typology of Function and Invariance in Neuroscience

In brain theories invariants play numerous, variegated, and overlapping roles. Some scattered examples may help set the scene.

1. Location to Function. One of the first attempts at a paradigm for brain science was phrenology (Franz Joseph Gall), a defunct field that attempted to associate brain areas with particular cognitive functions and personality traits. Modern methods and instruments (such as functional magnetic resonance imaging, fMRI, Sect. 3.2.5; magnetoencephalography, MEG, Sect. 3.2.4; and positron emission tomography, PET) have shown that, some epistemological contingencies notwithstanding, it is possible to relate locality to function. The same holds for lesion experiments, where the loss of a specific function is related to particular brain areas.

2. Time to Function. Thought and behavior unwind in time, and given the physical reality of the brain, processing takes time. It is therefore clear that some brain processes will correlate with their temporal aspects. The time course of the concomitants of thought is therefore a powerful indicator of the durations of behavioral and thought processes, and may also expose structural underpinnings of behavioral function (Sect. 3.2.4).

3. Dynamical Patterns to Function. Relating measurements of the time and location of brain activity to function brings about invariance in dynamical patterns. Brain measurement instruments have advanced to a stage where it is possible to record both the spatial and the temporal aspects of brain activity simultaneously. Multielectrode electroencephalography (EEG) and fMRI reveal invariant dependencies between different areas and their temporal courses, in terms of brain waves, firing patterns, or the shift of activity of one or more cortical and subcortical areas.

4. Structures to Function. Static neural structures (neuroanatomy) may also be invariant with function. As a simple example we may take the law of reciprocal innervation (due to Sherrington), where agonist–antagonist connectivity in muscle innervation explains, via an invariant structure, how one muscle relaxes when the other contracts. More sophisticated examples of how anatomy underlies function include the preprocessing of visual input in terms of center–surround connectivity patterns. Developments in methods unveiling macroneuroanatomical features, such
Fig. 3.1 Advances in imaging of brain networks both at the macroscopic level (diffusion tensor imaging) and at the microscopic level (Brainbow)
as diffusion tensor imaging (DTI; a magnetic resonance imaging method for revealing the spatial structure of neural tracts, Sect. 3.2.6) and advanced staining techniques that colorize neurons to reveal microscopic neural structure [25], will be fundamental for elucidating the dependencies between structure and function (Fig. 3.1).

5. Invariant (In)dependence Between Functions. Measured invariants often reveal connections between different functions with respect to their neural control structures. Invariant co-occurrences of activities may indicate levels of interdependence between tasks, contexts, and neural activity, such as the covariation of heart rate and emotion in social cognition research [1].

6. Invariants of Behavioral Function. As mentioned in the first point, a fundamental assumption for the procurement of invariances is that a set of behaviors may be invariant with respect to its goals. An invariant behavioral function is one with a single description. A set of behaviors such as ⟨reach for object⟩, which can perform the function ⟨grab the object⟩ in many different ways, is an invariant behavior. Such invariants are of fundamental importance for experimental paradigms, inasmuch as they define classes of functional behavior with which brain activity may be correlated.

7. Biophysical Invariants. Governed by the biophysical implementation of the neuron, alloyed to physical laws (such as ionic balance across a neuron's membrane), the neuron displays invariance. The duration of the absolute refractory period, the shape of an action potential, and the travel velocity of a spike (or action potential) may be invariant relative to the biophysical construction and context of the neuron. Despite much variation in construction, there are constant aspects within certain parameter ranges (a minimal sketch follows).
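The sketch below, a plain leaky integrate-and-fire model whose parameter values are assumptions of this text rather than measurements, illustrates the last point: however strongly the model neuron is driven, the fixed absolute refractory period caps its firing rate, so the cap behaves as a biophysical invariant across inputs.

```python
# Minimal leaky integrate-and-fire sketch: the absolute refractory period caps
# the firing rate regardless of drive. Parameter values are illustrative only.
import numpy as np

def lif_firing_rate(i_input, duration=1.0, dt=1e-4,
                    tau=0.02, v_thresh=1.0, v_reset=0.0, t_refrac=0.002):
    v, spikes, refrac_left = 0.0, 0, 0.0
    for _ in range(int(duration / dt)):
        if refrac_left > 0:
            refrac_left -= dt            # clamp the neuron during refractoriness
            continue
        v += dt * (-v / tau + i_input)   # leaky integration of the input current
        if v >= v_thresh:
            spikes += 1
            v = v_reset
            refrac_left = t_refrac
    return spikes / duration

for current in (100, 1_000, 10_000, 100_000):
    print(f"input {current:7d} -> {lif_firing_rate(current):6.1f} Hz "
          f"(ceiling ~ {1 / 0.002:.0f} Hz)")
```

Whatever the drive, the rate saturates near the ceiling set by the 2 ms refractory period.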
Each of these invariant types will have a subset of suitable methods with which it can be accessed. Any such method has at its disposal a set of measurement tools that are subserved by the associated analysis methods within an experimental paradigm. In the next section we select and examine some of these methods, from the empirical and epistemological standpoints, considering these three elements (measurement tools, analysis methods, and experimental paradigms). The upcoming analysis is crude because it is cursory, but it effectively brings home the main conclusion of this chapter: through different lenses, one sees different pictures.
3.2 The Measurement of Neural Activity

3.2.1 Whatever Is Seen, Is Seen Through Lenses

Modern methods of brain measurement produce an amazing quantity of clues about what may be invariant in the brain for different tasks. However powerful, these methods are not without fault (as some are hasty to add). By necessity, all of the methods produce partial pictures, through their individual magnifying glasses. There are, in all cases, inherent empirical limitations. On occasion, these may also become insurmountable conceptual difficulties. In any case, better partial pictures than no pictures. And as in a mosaic painting by Chuck Close, what from up close may seem to be fragments in utter disorder may, from a distance, expose an identity.
3.2.2 Contentions on the Measurement of Brain and Behavioral Function

Despite proverbial drawbacks – lack of temporal precision in the case of fMRI, lack of spatial precision in EEG, and poor repeatability in electrophysiology (electrodes) – some stable pictures of correlation between area and function, or oscillations and function, or firing rates and function, leave little doubt as to some level of organization behind the invariants produced. Stable correlates of measurement have been encountered in experiments on perception (color [16], motion [15], sound, bodily and feeling states), on action (gestures [39], locomotion), on cognition at large (language understanding [37], surprise and uncertainty [43], recognition [38], localization [32], empathy [5], emotion [33]), and on vegetative and other basal processes (respiration, hormonal cycles). Such a list may trick us into believing that invariants are everywhere. Unavoidably, however, each method reveals only the invariance illuminated by its flashlight, framed in terms of its own analytical tools (methodology, terminology, paradigm) and the form of data it generates. When regarding the phenomenon from different empirical angles, we build a heterogeneous mosaic composed of irregular tiles, whose reintegration into a coherent picture requires a concept of the picture they should compose. The metaphor of a puzzle that coalesces into a picture is evoked. But is there consilience in this picture? And what is it of?
3.2.3 Scope of Measurement Tools and Analysis Methods

A short incursion into how these invariants emerge from measurements may be enlightening, both by (1) showing what kind of information the methods produce about the invariants and by (2) understanding the empirical limitations and epistemological issues of the methods and the tools.
3.2.4 Oscillations and Potentials: EEG

If tomorrow is Tuesday, how long will it take to milk the pink cow? This nonsensical sentence may have raised eyebrows, for it has violated expectations and aroused surprise (or befuddlement). How long did it take for surprise (or whatever the evoked response was) to dawn? Where and in what form will surprise, or any other mental state, show up in brain measurement? Electroencephalograms assess the wave correlates of brain activity, searching for invariants of function in wave characteristics such as profiles, amplitudes, phases, time courses, and synchrony across similar circumstances or conditions. EEG purports to find the times and places of thought, through invariants in potentials and oscillations. Thus, the pink cow and other nonsense evoke similar responses in brain waves.
3.2.4.1 Electrophysiology Has a Long History

Electrophysiology is one of the oldest tools for measuring neural activity and, owing to its comparably long history, has correspondingly many variations and extensions. EEG measures electrical currents emanating from neural activity, with electrodes placed at one or many sites, from the surface of the scalp to the intracranial extracellular medium (intracranial "mesoscopic" recordings) (Fig. 3.2). Measured activity is presumed to correlate with brain function, at times consistently, at other times less so (especially across the brains of different subjects). But ever since its inception it has been a crucial tool in the neuroscientist's portfolio for unearthing invariant temporal and spatial aspects of neural activity. A few well-placed electrodes, on occasion, indicate qualitatively recognizable kinds of activity that are stable given a task.

Nowadays, electroencephalograms are employed in a variety of experimental paradigms. The first and oldest measures the electrical fluctuations at a single electrode positioned somewhere on the scalp. It attempts to find invariant potentials evoked by a given task (event-related potentials), which may range from very simple perceptual tasks to very complex cognitive tasks. A class of stimulus is presented (e.g., a sequence of words), and the time course of the potentials evoked is recorded. In this paradigm, the invariant may take the form of the amplitude of a peak, its sign (positive or negative), and the time frame of the evoked potential. In the EEG community, identifiable potentials receive names which indicate the polarity of the potential and the latency after presentation of the stimulus (seminally,
Fig. 3.2 Electroencephalography (EEG) spans spatial scales. The usual form of the data retrieved from these methods is seen on the right: autocorrelation for single units; waves and principal components for local field potentials (LFPs); power spectrum for electroencephalograms. iEEG intracranial EEG. (From [45])
for example, the P300 [43], a correlate of uncertainty, or the P200, a correlate of the noun/verb distinction [36]). In this nomenclature reside the assumptions about what kind of invariants EEG expects. EEG presumes that, for a given task, brain activity (1) is spatially bound (a place on the scalp corresponding to an area in the brain), (2) has a definite time course, and (3) has specific amplitudes (possibly correlating with the number of neurons active). Hence, since EEG measures the responses of populations, the presumption is that these populations respond in a concerted fashion which is temporally coherent, spatially bound, and consistent in activity. In the ideal case, all of these variables will offer invariants. In the standard (unpublished) case, there will be trivially few or none. High variability both across and within subjects is a constant.

In EEG, invariants also appear in terms of phase correlations of evoked potentials, or synchronous activity. Electrodes on the scalp positioned far away from each other may synchronize, indicating synchronicity between different
brain areas, possibly indicating a collaboration of these areas towards function. To find synchronicity, one primary analysis tool is independent component analysis (ICA), a statistical method that attempts to decompose the recorded signals into independent wave components [23]. ICA attempts to find the components most informative about the underlying processes in terms of patterns of phase and power in the oscillations recorded simultaneously from different electrodes. ICA (and also principal component analysis) highlights evanescent synchronies between areas as they phase-lock and release, shedding light, in addition to temporal aspects, on functional connectivity and the interdependence of areas. Another method of analysis is Granger causality, a mathematical method meant to identify causal connections between the oscillatory activity of different electrodes. Developments of EEG introduced both technical and mathematical sophistication, correlating the activity of many electrodes placed across the whole scalp, or in the brain (mean-field potentials). Local field potentials are measured intracranially and are rather precise, achieving temporal discrimination of the order of 1 ms and spatial resolution of several millimeters (depending on the experiment). MEG is yet another extension, measuring the magnetic correlates of brain activity. Both serve as tools in paradigms similar to those of electroencephalography, giving similar pictures, owing to the kinship of these methods.
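As a concrete illustration of this kind of decomposition, the following minimal sketch – assuming the scikit-learn library, with synthetic sine-and-noise sources standing in for real EEG channels – unmixes simulated multichannel recordings into independent components with FastICA and checks them against the known sources.

```python
# Minimal sketch: unmixing simulated multichannel "EEG" into independent components.
# Synthetic sources stand in for real recordings; the channel mixing is an assumption.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 1000)                   # 2 s sampled at 500 Hz

# Hypothetical underlying "sources": an alpha-like rhythm, a slower theta-like
# rhythm, and broadband noise.
sources = np.c_[np.sin(2 * np.pi * 10 * t),       # ~10 Hz
                np.sin(2 * np.pi * 6 * t + 1.0),  # ~6 Hz
                0.5 * rng.standard_normal(t.size)]

mixing = rng.standard_normal((3, 3))              # unknown mixing into 3 "electrodes"
channels = sources @ mixing.T                     # what the electrodes would record

ica = FastICA(n_components=3, random_state=0)
components = ica.fit_transform(channels)          # estimated independent components

# Crude check: each estimated component should correlate strongly with one source.
corr = np.corrcoef(components.T, sources.T)[:3, 3:]
print(np.round(np.abs(corr), 2))
```

Real EEG pipelines add filtering, artifact rejection, and many more channels, but the unmixing step has exactly this shape.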
3.2.4.2 EEG: Empirical Issues

What we call a brain wave is not properly a wave. In reality, the "wave" is the electrical fluctuation measured by an electrode in response to the activity of large populations of neurons. The brain has no "waves" per se. They are what EEG sees, through its methodological and empirical lenses. The known and often repeated shortcoming of EEG is that it is unable to say which neuron does what, since only the population activity is measured (in 1 mm^3 of cortex there are approximately 10^6 neurons – compare that with the precision of standard EEG of some surface millimeters). Most EEG recordings happen on the surface of the scalp (especially in humans), measuring from even larger populations. Adding to that, only surface activity is addressed, disregarding the substantial portion of activity happening beneath.

Epistemologically, from the standpoint of neural processing, the measured oscillations may be epiphenomena, as some have argued, because intercellular potentials were shown not to cause spikes [12]. Neurons fire, but measured populations oscillate. For that reason, it is hard to attribute causal properties to oscillations per se. Nevertheless, knowing that particular rhythms arise, and that they correlate with function, suggests that oscillations at the population level may be the level at which function is expressed. Alpha rhythms in the occipital cortex have been correlated with the perception of simultaneity and motion [44], theta rhythms with path integration [40], and gamma rhythms with binding [9] and memory consolidation [8]. It is clear that oscillations emerge as concomitants of neural activity and
certain kinds of processing. It has been hypothesized that resonant oscillations are responsible for feature binding [9]. But the extent to which oscillations are necessary, or primarily causative, is open to debate. In any case, it is indisputable that the information they provide is crucial for understanding timing in neural processing. Needless to say, questions remain that the method itself cannot address: Do oscillations have a causal function? How do they arise in the population? What is the role of synchrony? Does it have causative roles? These questions are beyond the scope of the method because they are not about the measured concomitants of function, but about the mechanism of function itself.1 These questions should be addressed through combinations of theory and paradigm.
3.2.5 Function Localization: fMRI

Most fMRI studies pose a question of the sort: What part of the brain is most active when task X is executed? Because of its high spatial resolution and noninvasiveness, fMRI became the primary tool for the localization of mental function in the brain. fMRI measures the magnetic correlates of oxygen in blood flow, which are assumed to correlate with the energy consumption of brain areas, as their neurons consume oxygen in the production of action potentials (blood-oxygen-level dependence, the so-called BOLD signal) [26].2
3.2.5.1 The BOLD Signal

fMRI uses as its signal the magnetic properties of hemoglobin, the oxygen-bearing molecule of red blood cells. Hemoglobin can be in one of two relevant states: oxygen-loaded (oxyhemoglobin) or oxygen-depleted (deoxyhemoglobin), and the two differ in their magnetic properties. As blood is constantly flowing through the brain, and because the action potential is an energy-consuming event, one may take the oxygen consumption of a certain brain region as an indicator of that region's activity during a certain mental task. This assumes that blood oxygen consumption is a reliable indicator of predominantly localized neural activity (Fig. 3.3). Very appropriately, Freeman (p. 20 in [13]) has likened the BOLD signal to smoke in forest fires, where denser smoke indicates more trees being burned. A nice facet of this analogy is that the whole forest may be burning, or the whole brain may be active.

Measurements of blood oxygenation levels are taken with respect to a three-dimensional lattice of the brain, where the spatial resolution is given in voxels (a voxel is the volumetric equivalent of a pixel). The spatial resolution depends on the strength of the magnetic field. Current equipment has a spatial resolution
1 Statements of the sort "These oscillations have multifold functions and act as universal operators or codes of brain functional activity" are too strong, and suffer from the confusion between correlate and cause [2].
2 Timely. Logothetis [27] asks, "What can and what can't we do with fMRI?"
Fig. 3.3 Functional magnetic resonance imaging measures the oxygen consumption level of brain areas. The hippocampus and the occipital cortices are differentially active. (From [34])
of around 128 × 128 voxels, giving a spatial resolution of around 1 mm – roughly corresponding to one million neurons – with a temporal precision of a couple of seconds. This is not very precise, with most of the coarse resolution being due to intrinsic limitations of oxygen diffusion times from the blood to the inside of the cells.3 Another interesting development comes from the physics of magnetic resonance, where superadiabaticity may lead to more precise fMRI (along with smaller scanners) [6]. Brains vary in their morphology, so to find corresponding areas between the brains of different subjects, interesting methods of brain mapping have been developed that show the cortex as if unfolded onto a surface. Gyri and sulci (the ridges and valleys of the brain's convolutions) are used to demarcate brain areas and allow comparison between different brains [4].
3.2.5.2 Areas Comparatively More Active

In most experimental paradigms involving fMRI, activity in areas is found by taking the difference in the BOLD signal between experimental conditions and control conditions. So, instead of "active areas," it is more correct to say "areas comparatively more active." Quite a bit of other activity must be averaged out in the process before some significantly more active area can be found. To return to Freeman's analogy, it is as if the whole forest is on fire, and we expect to find where the highest concentration of burning trees is by comparing patterns of smoke emission at two moments. The method gives only coarse information about where the fire started, or how it propagated.
3 Recent articles have addressed the question of which precise cellular process is the closest correlate of the BOLD signal – whether it stems from all neurons or only from excitatory, inhibitory, or glial cells, whether it applies equally to different areas, and so on – and have competently shown the different sources and how they contribute to the signal [26, 28].
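To make the contrast logic just described concrete, here is a minimal sketch – all values are synthetic, the thresholds and dimensions are illustrative assumptions, and a real analysis would use a general linear model with corrections for multiple comparisons – of a voxel-wise comparison of BOLD signal between task and control conditions.

```python
# Minimal sketch of a voxel-wise "task minus control" BOLD contrast.
# All data are synthetic; thresholds and dimensions are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_voxels, n_task, n_control = 1000, 20, 20

# Baseline BOLD signal for every voxel in both conditions.
task = rng.normal(100.0, 5.0, size=(n_voxels, n_task))
control = rng.normal(100.0, 5.0, size=(n_voxels, n_control))

# Pretend a small cluster of voxels is "comparatively more active" during the task.
active = slice(0, 30)
task[active] += 8.0

# Two-sample t-test per voxel; everything that does not rise above the noise
# of the "burning forest" is effectively averaged out.
t_vals, p_vals = stats.ttest_ind(task, control, axis=1)
significant = np.where(p_vals < 0.001)[0]

print(f"{significant.size} voxels flagged as comparatively more active")
```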
By observing the time courses of activities, one can also employ fMRI to examine functional connectivity. Granger causality, for example, purports to retrieve causal dependence between time series of activities, where one may be able to say which caused the other, and with what confidence [7]. The implicit assumption is that highly active peaks and depressions of the waves are more important for causality. Although a reasonable assumption, it may exclude quite a bit of activity, which may as well be causal, and thus relevant, for the function sought. Incidentally, a recent result showed that a single spike from a single neuron may be sufficient to trigger a postsynaptic action potential, and by extension, a complex cascading set of events [29]. This single spike is invisible to any fMRI study.
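The following is a minimal sketch of the Granger logic mentioned above, under the assumption that the statsmodels library is available and with synthetic series standing in for BOLD time courses from two regions: if the past of x improves the prediction of y beyond y's own past, x is said to Granger-cause y.

```python
# Minimal sketch: does region x Granger-cause region y?
# Synthetic time series stand in for BOLD signals from two regions of interest.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(2)
n = 300
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    # y depends on the past of x (plus noise), so x should "Granger-cause" y.
    y[t] = 0.6 * x[t - 1] + 0.2 * y[t - 1] + 0.3 * rng.standard_normal()

# Column order is (effect, cause): test whether x helps predict y.
# The call prints a per-lag summary; the F-test p-value is also returned.
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=2)
p_value = results[1][0]["ssr_ftest"][1]   # p-value of the F-test at lag 1
print(f"p-value for 'x Granger-causes y' at lag 1: {p_value:.4g}")
```

The sketch also makes the caveat in the text visible: the test only compares variances of prediction errors, so activity that does not shape those errors contributes nothing to the verdict.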
3.2.5.3 Localization of Function

Localization alone does not unearth principles of neural organization, of course. The main issues with functional localization of this sort are twofold. First, it is unable to resolve questions of multiple implementability. Is the sequence of active areas necessary for function, or merely a contingent artifice of the paradigm? Is there a fundamental reason for the existence of particular hubs in the brain? The second issue closely follows our discussion of invariants of behavior in Sect. 2.3.13. It has to do with the type of question posed: What area is most active when function F is necessary? Hence, the answers are – by assumption – given in associations between area and function. For example, the search for a "god" area in the brain reveals activations in the prefrontal orbital cortex [31] (known as a high association area). What function is this association really about? Abstract words associated with cognitive states? Or an antenna for "god" (or, as Descartes postulated, the connection between body and soul of the transempirical self)? When area–function associations are made freely, we may end up with bemusing conclusions.

This is perhaps too rough a critique, and an unreasonable demand on a tool that ex hypothesi assumes function–area associations. Initially, one defines the function as necessary for a task, which will be the guiding hypothesis for the correlation, and only then does one look for activation correlates. If these are consistent across individuals, there is at least good reason to believe that there is a connection. Furthermore, this information integrates with results across experiments. Had the experiments been insulated from each other, which obviously is not the case, it would be impossible to be certain about the connection between function and localization. The responsibility of integrating results, however, is the scientist's, not the method's. All the same, critics are often justified in raising awareness of the fact that (1) functionally relevant activity is likely to be obfuscated and (2) much could vary in implementation. Averaging out all activity which is not large may suppress from the conclusions other areas (or even units) that have functional significance but are either too small or too weak for fMRI to see. An extension of this method may help
address the second question.4 On a lighter note, recent work by Haynes and others [18–22] has addressed the critiques by shifting emphasis from localization to dynamic pattern recognition. These methods have achieved considerable success in determining the content of thought through pattern classification (the pattern being the temporal activity of different voxels), with impressive classification accuracy. As long as there is a (computational) procedure to categorize different patterns, the content of thought can be recognized. Nevertheless, the approach remains a classification procedure and does not focus on explaining the differences between patterns of activity within and across subjects.
3.2.6 Brain Wiring: DTI and Diffusion Spectrum Imaging

Both DTI and diffusion spectrum imaging ask the central question: How is the brain wired? [30] DTI is a modern variation of magnetic resonance imaging used to retrieve the spatial structure of major neural tracts (bundles of axons) by measuring the directional diffusion of water in neural tissue (water diffuses anisotropically, owing to the differential permeability of neural tissues). Knowledge of how brain areas connect will be instrumental for comprehending the relationships between structure and function. The method has already generated impressive data – and stupendous pictures – including for humans and great apes [30, 48]. We can expect that this is just the beginning. For the student of the dynamics of neural networks this is possibly the most important empirical development in neuroscience in this century.

With much knowledge comes much responsibility. Although the preliminary results allow boisterous rejoicing, it is better to take a cautiously circumspect attitude. For one thing, to say that we can retrieve connectivity from individuals does not imply we shall ex machina be enlightened about function. As the neuroscientist has learned to expect, there is staggering variability everywhere, beyond a subset of identifiable hubs. In a recent article [17], Sporns et al. took connectivity maps from diffusion spectrum images of five humans and submitted the connectivity to graph-theoretic analysis. The calculations measured node degree, structural motifs, path lengths, and clustering coefficient distributions. In addition, a k-core decomposition was calculated, derived by recursively pruning nodes with degree smaller than k (the node degree being the number of connections of a unit in a binary matrix; the type of connection and the spatial distance are neglected). It is a measure indicative of the presence of hubs (areas with comparably high connectivity). Among other findings, they confirmed an important hypothesis, also from Strogatz [47], that the brain follows a small-world law [41]: high clustering combined with short path lengths, with a degree distribution commonly reported to follow a power law.
4 Also, a debate has recently arisen concerning unusually high correlations in fMRI studies of social cognition, where a fundamental flaw in some experimental designs has been unearthed [46].
Fig. 3.4 Correlation of connectivity as seen from k-core decomposition from the diffusion spectrum images of five human subjects. Notice the variability in the distribution of cores across subjects. Notice also the variability from two scans of the same subject, possibly due to empirical variability. Network cores for each individual participant were derived by k-core decomposition of a binary connection matrix obtained by thresholding the high-resolution fiber densities such that a total of 10,000 connections remain in each participant. Nodes are plotted according to their core number, counted backwards from the last remaining core
Essentially this means that most nodes connect to relatively few others, while a small number of hubs connect to very many; the distribution of edge degrees is heavy-tailed, with very few nodes connecting to very many. That shows unequivocally the locality of certain hubs, such as the anterior cingulate gyrus.5 The result of the k-core decomposition for the five subjects from [17] is displayed in Fig. 3.4 (a minimal sketch of the procedure follows below). There is considerable variability in k-cores across subjects, which signifies that some people have more hubs than others, and also that people have tract hubs at different brain areas. Also visible in three of the five subjects (A, C, and D) is an asymmetry in connectivity between hemispheres, with the left hemisphere having more hubs. To test the repeatability of the measurement, two scans were performed on separate occasions on subject A and the results were compared. There was consistency, although some variation between the two scans could be seen.

In addition, the results reveal immense variability across subjects. Figure 3.4 shows the amount of correlation as seen from the k-core decomposition of five human subjects. One notices that, apart from known centers of connectivity (such as the cingulate gyrus, the basal ganglia, and the hippocampus), there is considerable difference across the tracts of different subjects. As one moves away from the most salient structural correlations among subjects, variability increases considerably. And yet, despite the variability in brain networks, all the subjects were fully functional human beings, similar to the extent that all humans are. Although we have already come to expect significant differences between normal and abnormal brains, the large variation between normal brains must also be addressed. Apparently our quandaries about variability and constancy are resilient to the greatest developments in neuroscientific tools.
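For readers who want to see what a k-core decomposition does, here is a minimal sketch using the networkx library on a random preferential-attachment graph; the toy graph is an assumption standing in for a thresholded binary connection matrix, not tractography data.

```python
# Minimal sketch: degree distribution and k-core decomposition of a toy
# connectivity graph. The random graph is a stand-in for a thresholded
# binary connection matrix; it is not real tractography data.
import networkx as nx

# A heavy-tailed toy network: 100 "areas", preferential attachment.
g = nx.barabasi_albert_graph(n=100, m=2, seed=0)

degrees = dict(g.degree())
print("Highest-degree nodes (candidate hubs):",
      sorted(degrees, key=degrees.get, reverse=True)[:5])

# Core number of a node = largest k such that the node survives in the k-core
# (the maximal subgraph in which every node has degree >= k).
core_numbers = nx.core_number(g)
k_max = max(core_numbers.values())
innermost = [n for n, k in core_numbers.items() if k == k_max]
print(f"Innermost core is the {k_max}-core with {len(innermost)} nodes")
```

In the toy graph, as in the data of [17], only a minority of nodes survive into the innermost core; they are the hubs the decomposition is designed to expose.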
5 Also noteworthy in Fig. 3.4 is the presence of cores in the left hemisphere in participant C.
3.2.7 Electrophysiology: Single Unit Recordings

Both in time and in space, the most precise technique of brain measurement is single-cell recording (electrophysiology). The process consists in invading the brain with electrodes and recording the time course of the membrane potential of a single neuron. The method gives the closest proximity to the action potential, which presumably is the basis of most neural processes. Modern microelectrode arrays are able to record directly from hundreds of neurons, and indirectly from more. Because it is an invasive method, recordings are mostly done in laboratory animals, ranging from fish to apes. In special conditions humans have also been subjects.6 Electrophysiology has revealed a complex picture of neuronal activity in which variability and constancy coexist.
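Before any of the correlates discussed below can be sought, the raw voltage trace has to be reduced to spike times. The following is a minimal sketch of the simplest such reduction – threshold crossing on a synthetic trace; the sampling rate, threshold, and waveform are illustrative assumptions, and real spike sorting is considerably more involved.

```python
# Minimal sketch: threshold-crossing spike detection on a synthetic voltage trace,
# followed by a firing-rate estimate. All parameters and the trace are illustrative.
import numpy as np

rng = np.random.default_rng(3)
fs = 20_000                        # sampling rate (Hz), an assumption
t = np.arange(0, 1.0, 1 / fs)      # 1 s of "recording"

voltage = 0.05 * rng.standard_normal(t.size)     # baseline noise
true_spike_times = rng.uniform(0, 1.0, size=25)
for ts in true_spike_times:                      # add brief depolarizations
    idx = int(ts * fs)
    voltage[idx:idx + 20] += 1.0

threshold = 0.5
above = voltage > threshold
# A spike is counted at each upward threshold crossing.
crossings = np.flatnonzero(above[1:] & ~above[:-1]) + 1
spike_times = crossings / fs

rate = spike_times.size / 1.0
print(f"Detected {spike_times.size} spikes, firing rate ~ {rate:.1f} Hz")
```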
3.2.7.1 Invariants in Electrophysiology

Electrophysiological recordings produce correspondences between the activity of a cell and some property hypothesized to be a correlate of function. A sampling of these functions (and cell correlates) includes localization (place cells [32], grid cells [24], prospective and retrospective cells [11]), pattern recognition (grandmother cells [38]), decision making (decision cells [35]), object recognition, somatosensory maps (finger cells), motor behavior, action production and recognition (mirror neurons [14]), and emotive behavior (empathy cells). One sees in this small sampling that some of the most staggering invariants of neural processing resulted from electrophysiology. Mirror neurons, those that fire both when an action is performed and when it is observed, were first demonstrated with this technique. The place cells of the hippocampal formation, those most active when the animal is situated in a particular place, were among the first to be reported [32]. Decision cells were reported in the lateral intraparietal cortex of the monkey, and correlate with a particular decision the monkey may not even be conscious of yet [35]. There are other examples of cells that are invariant to very particular properties, for instance, the "Jennifer Aniston cell" or the "Bill Clinton cell" [38].
3.2.7.2 Empirical and Epistemological Remarks

It should nevertheless be quite obvious that these cells do not act alone,7 being members of larger networks that may or may not have been in the electrode's path.
6 Usually when there is already the requirement for surgical interventions, or when the patient has full-body paralysis.
7 Some popular science accounts, as well as renowned scientists, have created much confusion on this particular topic.
Fig. 3.5 Did the spike at electrode 1 cause the spike at electrode 2? The time between two spikes alone is not a reliable indicator of the dependence between two neurons
In a sense, electrophysiological recordings are the narrowest of brain measurement instruments: although the number of recorded cells has been increasing with the development of both technology and analysis methods, it is still quite a small set of cells compared with the whole neural system. The method has limitations of an instrumental nature, such as the small number of cells recorded, the lack of knowledge about local connectivity, and the fact that only after post-mortem histology does one learn which cell was recorded (Fig. 3.5). But the more interesting contention is one of an epistemological nature. When one attempts to find which neurons correlate with a certain property, one is a priori committed to that property. In extreme cases, one asks a yes/no question: does neuron A display property P, or not? Mostly, it is the paradigm that proposes the properties being looked at. The properties do not quite "offer" themselves (although that can be the case, via serendipity); they are sought with hypotheses of various degrees of tenability. For this very reason, it is quite hard to conceptualize how different cells with various properties may compose a picture of the processing in terms other than associating properties. It is hard to reunite a collection of cells with various properties in terms of overall systemic function (the complications of reentrances and recurrences spring to mind). This often leads to the exclusion of those cells that "do not appear" to be particularly telling about the function under scrutiny. It is quite hard to sum the functions of different cells and come up with a consistent picture, for these cells may well have distinctive capacities (e.g., what are both face cells and place cells doing in the hippocampus? With what property does a cerebellar Purkinje cell, which is active all the time, correlate?). Although some theories have been proposed attempting functional integration based on different functions of different cells, integration remains one of the crucial theoretical challenges for the theoretician who is informed by electrophysiology.
3.2.8 Empirical Methods and the Language of Explanation

If all you have is a hammer, treat anything like a nail.
Anonymous
Within its own scope, and despite critiques that can be too harsh (the electrophysiologist and the imaging scientist often point at each other's shortcomings while neglecting their own), the methods reviewed in this chapter form the basis of the empirical methods of cognitive neuroscience, retrieving correlates of function, between and within areas, with admirable success. There exists an indisputable level of constancy in brain activity, although the principles, or mechanisms, behind it are not immediately apparent. This enforces a prominent role for hypotheses, whose formation is far from trivial, is often erroneously made dependent on the terminology associated with a method [2], and becomes heavily biased by the sort of data analyzed. Every method predicts more of what it sees, while tending to neglect what it does not see. Explanations given in the terminology of a specific empirical method are, however, often tethered to the method itself, and the tethers become exposed when the explanations hold only within a small circle around the method to which they are umbilically connected.
3.3 Sources of Variation

All empirical methods of brain measurement exhibit, and are subject to, the vicissitudes of variation. The variation sprouts from different roots, depending on (1) the methods, (2) the experimental design, and (3) the brain itself. A categorization of variation may help weigh the sources for their relative importance in individual cases.
3.3.1 Instrumental Sources of Variation

A subset of those variations will be of a purely instrumental nature, deriving from various imprecisions of the equipment: variations intrinsic to the method at hand will be reflected in a spread of measurements. Averaging out these variations and bringing out core features of the recorded responses is the purpose of much of the statistics applied to the measurements. Although the instrumental sources of variation are perhaps the simplest to grapple with, they may well be some of the toughest to crack.
3.3.1.1 Context and Initial Conditions

Another source of variation is the brain's relentless agitation. The brain now is different from the brain one second from now, whether owing to neural plasticity or simply by being. The brain of an experimental subject does not passively await the presentation of a stimulus, or the moment to act for that matter. The brain is alive and constantly active. Nonstop activity means constant change. Unlike machines that can be reset to an initial state, the brain cannot be reset, at a button press, to the very same state at each execution of a task. For that reason, it is virtually impossible to hold all the surrounding variables constant in different trials. The subject will be performing the task amid all the rest of his or her thoughts, which will mix with the measurement. These variations are often handled by the statistics of large numbers, in which the surrounding factors are clustered and clumped together, ideally having no bearing on the function under scrutiny. This assumption of independence is frequently unwarranted. If the brain is highly precise, small differences in context will lead to variations in execution, which may be irrelevant with respect to function, but may have a smearing role, confusing measurement. If the brain is chaotic, slight variations in the initial conditions will lead to ample variation in measurements.
3.3.1.2 Neural Implementation

Variations in measurement may also be due to variations in the neural implementation. Members of the same species may implement the same function in different ways, and even a single brain may handle the same task in different ways at different times. It is the prerogative of the brain to be creative, so in different instantiations of the same task, it is conceivable that one individual brain might be doing the same thing but slightly differently, not to mention the staggering differences across individuals for one and the same task. This variation can mean one of two things:
1. Equivalent structures subserving one function. There may be multiple coexisting architectonic implementations of one function, each of which is active at a different instantiation. This is the idea of degeneracy: that a function may be executed by multiple networks of the same brain [10]. It appears likely, at least for the early stages of brain development in mammals, that for one single function there may be many possible coexisting redundant implementations, which may be active at different times. There may exist general mechanisms in the development of network topology which effectively build two redundant networks. So, in each instantiation of a task, these may produce effective differences, which may, or may not, be retrievable from experiments (e.g., think of fMRI).
2. Multiple realizability: different structures for one function. More extremely, the brain may implement the same functions in fundamentally different ways, a hypothesis that has been called "multiple realizability." Philosophy is not lacking in arguments both for and against multiple realizability [3]. Ultimately, it may be that the issue is resolvable with harder science, integrating knowledge
about what structure does what function, and how, and what types of variation there can be in implementation. Results from DTI studies of connectivity and neuroanatomy represent the current state of affairs, and on occasion appear to be contradictory. Although some hubs seem to be similar across individuals, this is most definitely not the standard case. So, to summarize, these are the possible sources of variation at the neuronal level, given one type of behavior:
Different networks, same neural mechanism
Different networks, different neural mechanisms
Same network, different context or initial conditions
3.3.1.3 Measurement and the Invariants of Behavior

The implementation of a function may be regarded as a tendency to achieve a certain type of network topology in order to execute a certain behavioral function. It may be that the tendency of the self-organizing substrate is to beget particular functions, with particular mechanisms, by having inherent developmental tendencies (given ontogeny in the environment). If one concedes that experience has a formative role in the establishment of networks, it is almost inconceivable that a given function will be realized by exactly the same neural architecture in every brain. Nevertheless, we may speculate that given substrates (neurons) with certain potentialities of rearrangement have a directed developmental path, a trend so to speak inherent in the ways that matter (neural and otherwise) may organize. In that case, despite variations, the measurements will reflect a level of abstract generality for the function being considered. Measurements are projected shadows of an abstract behavioral invariant, reflections of the set of possible mechanisms implementing a function.
3.3.2 Repeatability and Variation in Different Levels

Some of the most informative experiments in neuroscience have produced explanations of the neural correlates of low-level sensors and sensory sheets. These have elucidated the transduction of worldly stimuli in terms of the physical properties most informative about the stimulus. Examples are many, ranging from correlates of whisking behavior in the mouse's barrel cortex, to the perception of sky polarization in some locusts, to optic flow experiments with insects. These show in detail some of the exquisite world-to-spike translations, in which an organism's receptors are highly reliable measurement instruments. Some of these transduction mechanisms have uncanny repeatability, and show that biological mechanisms may be very precise. However, although recordings from interfaces close to the world show reliable and repeatable firing patterns, as activity is integrated at higher levels, the recordings
appear more unpredictable (noisy?), to the point of being, on occasion, described as stochastic. This creates a mild paradox, one that should not be fleetingly dodged by any theory of brain function.8
3.3.3 Sources of Variation

1. Epistemological
There are varying contexts in the execution of a task, meaning there may be concomitant processes which may influence, or even underpin, the execution of a task.
During the execution of a task there may be the appearance of novel solutions.
The function being sought is inadequately specified.
Statistics tends to obfuscate small contributions that may be functional.

2. Instrumental/empirical
Imprecisions inherent to methods.
Noise in the measuring equipment.

3. Neural
The brain is in constant change; therefore, the initial conditions never repeat.
There may be redundant circuits realizing the same task in different manners.
There may be equivalent circuits realizing the same task in different places.
3.4 Conclusions

3.4.1 Partial Pictures

From this rapid survey of some of the most prominent methods of brain measurement, we see that each tool paints a narrow picture of the brain's workings. EEG and MEG paint a picture of an oscillatory brain where mental function is a product of the coherent organization of oscillations and synchronicity among areas. fMRI paints a picture of an organized spatial structure and localized function, even across individuals. Single-unit recordings show the capacity of a neuron to correlate reliably with high-level behavior, and even to predict behavior. Conversely, fMRI is unable to show the source of synchrony, EEG can only measure diffuse populations, and single-unit recordings are too local. Each method gives a partial picture.
8 "Is the brain precise or noisy?" is a version of this paradox, which is also a rather poor way to pose the question.
In any of the methods, that which does not stand out statistically is assumed to be irrelevant for the phenomenon, and is averaged out. Perhaps the problem would be solved if all measurements could be done simultaneously. If we knew the brain's connectivity from DTI, we would know from which cells we record; we could then integrate that with the EEG oscillations and with the most active brain areas, and we would have a more complete image. As it stands, we do not have the capability to do all of this at once, and our best bet is to reintegrate the scattered results within the frame of a theory. Appropriately, that is the purpose of the following chapters.
3.4.2 More Epistemological Contentions

Arguably, the most pertinacious critique of correlative methods is of an epistemological nature. Any experimental paradigm assessing function through correlated activity, such as fMRI or PET or EEG, is considerably committed to the functional hypothesis inherent in the experimental design. In a certain sense it is correct to affirm that for every function there will be brain correlates, and the most conspicuous of those are levels of activity. So, because the levels of activity are by necessity correlated with the task proposed, much depends on the definition of the task, which reciprocally defines the function searched for (examples abound in current cognitive neuroscience). In choosing a picture of Bill Clinton as a correlate of brain activity, one of two outcomes is possible: either Bill Clinton correlates or he does not [38].9 The answer space is given by the question posed. The fascinating – but contorted – question "What is a good neural correlate?" will not be pursued here. I will say, though, that no method is impermeable to this critique. In the retrieval of invariant correlates of function, the paradigm is an integral part of the answer, and the importance of the paradigm for conclusions can hardly be overemphasized. It is essential to see how the paradigm frames not only the question, but also the answer. On the other hand, the appreciation of this fact exposes the most basal characteristic of science made by people (with brains) who study brains (of people). As the long tradition of constructivism has reiterated, there is an inescapable – because inherent – circularity in explanation. We think with ideas. Every time we search for a brain function we assume that function (to various degrees). There is no escape. The only alternative for solace is to take refuge in purely descriptive analysis. But this in turn may conceal traps that jeopardize conclusions (see the mistakes of behaviorism). The danger is to think that there is no imposition of order on our part. This neglect abuts self-delusion, and paradoxically, the safest way around it is to face it.
9 A variation of this argument applies to the search for dendrites computing logical operations. One finds, or does not find, the logical operations sought. Usually this is done by restricting the context of dendritic function way beyond its biological context.
3.5 Summary

The invariance assumption. For one particular task or function, the brain will act in a similar way. This similarity can be encountered in measurement (Sect. 3.1.1).
Methods of measurement. Because methods paint partial pictures of some of the brain's constancies in function, the results can appear contradictory or difficult to integrate (Sect. 3.2).
The role of statistics. All of the methods rely on some kind of statistics to show invariants. This may (1) smooth out variations that may also be informative and (2) disregard contextual effects (Sect. 3.3.1).
Types of variability. There are many sources of variation in brain measurement: instrumental, contextual (also known as initial conditions), and neural implementation (Sect. 3.3).
Languages of experiments. Measurement tends to produce a language for the conclusions of the experiment (Sect. 3.2.8).
Integration of results. A powerful brain theory must be general, which implies it should encompass the findings of different experimental paradigms. Integration of results depends on hypothesis formation, ideally under the tutelage of an integrative theory (see the next chapter).
References

1. Adolphs R (1999) Social cognition and the human brain. Trends Cogn Sci 3(12):469–479
2. Basar E, Basar-Eroglu C, Karakas S, Schürmann M (1999) Oscillatory brain theory: a new trend in neuroscience. IEEE Eng Med Biol Mag 18(3):56–66
3. Batterman R (2000) Multiple realizability and universality. Br J Philos Sci 51(1):115–145
4. Carman G, Drury H, Van Essen D (1995) Computational methods for reconstructing and unfolding the cerebral cortex. Cereb Cortex 5(6):506–517
5. Decety J, Grèzes J (2006) The power of simulation: imagining one's own and other's behavior. Brain Res 1079(1):4–14
6. Deschamps M, Kervern G, Massiot D, Pintacuda G, Emsley L, Grandinetti P (2008) Superadiabaticity in magnetic resonance. J Chem Phys 129:204110
7. Ding M, Chen Y, Bressler S (2006) Granger causality: basic theory and application to neuroscience. arXiv preprint q-bio/0608035
8. Douglas R, Martin K (1995) Vibrations in the memory. Nature 373(6515):563–564
9. Eckhorn R, Bauer R, Jordan W, Brosch M, Kruse W, Munk M, Reitboeck H (1988) Coherent oscillations: a mechanism of feature linking in the visual cortex? Biol Cybern 60(2):121–130
10. Edelman GM (1987) Neural Darwinism. Basic Books, New York
11. Frank LM, Brown E, Wilson M (2000) Trajectory encoding in the hippocampus and entorhinal cortex. Neuron 27:169–178
12. Freeman W, Baird B (1989) Effects of applied electric current fields on cortical neural activity. In: Computational neuroscience. Plenum Press, New York
13. Freeman WJ (1995) Societies of brains: a study in the neuroscience of love and hate. Lawrence Erlbaum Associates, Hillsdale, NJ
14. Gallese V, Fadiga L, Fogassi L, Rizzolatti G (1996) Action recognition in the premotor cortex. Brain 119:593–609
15. Grossman E, Donnelly M, Price R, Pickens D, Morgan V, Neighbor G, Blake R (2000) Brain areas involved in perception of biological motion. J Cogn Neurosci 12(5):711–720
16. Hadjikhani N, Liu A, Dale A, Cavanagh P, Tootell R (1998) Retinotopy and color sensitivity in human visual cortical area V8. Nat Neurosci 1(3):235–241
17. Hagmann P, Cammoun L, Gigandet X, Meuli R, Honey C, Wedeen V, Sporns O (2008) Mapping the structural core of human cerebral cortex. PLoS Biol 6(7):e159
18. Haynes J, Rees G (2005a) Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nat Neurosci 8:686–691
19. Haynes J, Rees G (2005b) Predicting the stream of consciousness from activity in human visual cortex. Curr Biol 15(14):1301–1307
20. Haynes J, Rees G (2006) Decoding mental states from brain activity in humans. Nat Rev Neurosci 7(7):523–534
21. Haynes J, Deichmann R, Rees G (2005) Eye-specific effects of binocular rivalry in the human lateral geniculate nucleus. Nature 438:496–499
22. Haynes J, Sakai K, Rees G, Gilbert S, Frith C, Passingham R (2007) Reading hidden intentions in the human brain. Curr Biol 17(4):323–328
23. Hyvärinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neural Netw 13(4–5):411–430
24. Jeffery KJ, Burgess N (2006) A metric for the cognitive map: found at last? Trends Cogn Sci 10(1)
25. Livet J, Weissman T, Kang H, Draft R, Lu J, Bennis R, Sanes J, Lichtman J (2007) Transgenic strategies for combinatorial expression of fluorescent proteins in the nervous system. Nature 450:56–62
26. Logothetis NK (2003) The underpinnings of the BOLD functional magnetic resonance imaging signal. J Neurosci 23(10):3963–3971
27. Logothetis NK (2008) What we can do and what we cannot do with fMRI. Nature 453(7197):869–878
28. Logothetis NK, Pfeuffer J (2004) On the nature of the BOLD fMRI contrast mechanism. Magn Reson Imaging 22(10):1517–1531
29. Molnár G, Oláh S, Komlósi G, Füle M, Szabadics J, Varga C, Barzó P, Tamás G (2008) Complex events initiated by individual spikes in the human cerebral cortex. PLoS Biol 6(9):e222
30. Mori S, Zhang J (2006) Principles of diffusion tensor imaging and its applications to basic neuroscience research. Neuron 51(5):527–539
31. Newberg A, Newberg S (2005) The neuropsychology of religious and spiritual experience. In: Handbook of the psychology of religion and spirituality, pp 199–215
32. O'Keefe J, Dostrovsky J (1971) The hippocampus as a spatial map: preliminary evidence from unit activity in the freely moving rat. Brain Res 34:171–175
33. Panksepp J (1998) Affective neuroscience. Oxford University Press, chap The varieties of emotional systems in the brain, pp 41–58
34. Peigneux P, Orban P, Balteau E, Degueldre C, Luxen A, Laureys S, Maquet P (2006) Offline persistence of memory-related cerebral activity during active wakefulness. PLoS Biol 4(4)
35. Platt ML, Glimcher PW (1999) Neural correlates of decision variables in parietal cortex. Nature 400:233–238
36. Preissl H, Pulvermüller F, Lutzenberger W, Birbaumer N (1995) Evoked potentials distinguish between nouns and verbs. Neurosci Lett 197(1):81–83
37. Pulvermüller F (2001) Brain reflections of words and their meaning. Trends Cogn Sci 5(12):517–524
38. Quiroga R, Reddy L, Kreiman G, Koch C, Fried I (2005) Invariant visual representation by single neurons in the human brain. Nature 435(7045):1102–1107
39. Rizzolatti G, Fadiga L, Gallese V, Fogassi L (1996) Premotor cortex and the recognition of motor actions. Cogn Brain Res 3(2):131–141
40. Skaggs W, McNaughton B, Wilson M, Barnes C (1996) Theta phase precession in hippocampal neuronal populations and the compression of temporal sequences. Hippocampus 6(2)
41. Sporns O, Honey C (2006) Small worlds inside big brains. Proc Natl Acad Sci USA 103(51):19219
42. Sporns O, Honey C, Kötter R (2007) Identification and classification of hubs in brain networks. PLoS ONE 2(10)
43. Sutton S, Braren M, Zubin J, John E (1965) Evoked-potential correlates of stimulus uncertainty. Science 150(3700):1187–1188
44. Varela F, Toro A, John E, Schwartz E (1981) Perceptual framing and cortical alpha rhythm. Neuropsychologia 19(5):675–686
45. Varela F, Lachaux J, Rodriguez E, Martinerie J (2001) The brainweb: phase synchronization and large-scale integration. Nat Rev Neurosci 2(4):229–239
46. Vul E, Harris C, Winkielman P, Pashler H (2009) Voodoo correlations in social neuroscience. Perspect Psychol Sci (to appear)
47. Watts D, Strogatz S (1998) Collective dynamics of 'small-world' networks. Nature 393:440–442
48. Wedeen V, Hagmann P, Tseng W, Reese T, Weisskoff R (2005) Mapping complex tissue architecture with diffusion spectrum magnetic resonance imaging. Magn Reson Med 54(6):1377
Chapter 4
Modeling and Invariance
The price of metaphor is eternal vigilance.
Norbert Wiener, Cybernetics
Metaphors are neither true nor false, but they can be more or less productive as heuristics for developing more precise theories.
Peter Gärdenfors
Abstract Continuing the analysis of invariants and their explanatory roles, this chapter introduces models from computational neuroscience that elucidate the interplay between constancy and variability and illustrate the appearance of invariances. These models rely on the interplay between the structure of the input and rules of structural modification, in which a malleable structure organizes by assimilating the regularities in the input. Analogously, organisms learning and adapting to their bodies and environments also self-organize function. Although powerful, empirical assessments of invariances, and models thereof, present partial pictures. To assemble this mosaic of models, a theory is needed that can cover the causal levels of behavioral phenomena. Dynamical systems theory is proposed as such a theory.
4.1 Invariance and Computational Models

4.1.1 Shared Invariant Rules from Natural Patterns

Nature's symmetric patterns have the hypnotic allure of awe-inspiring mysteries, inciting the natural philosopher to unveil them. Once the seductive signs are accepted and nature is stripped bare, graceful beauty often blossoms in the pure form of a mathematical abstraction. Behind stunning patterns in nature there are invariant properties of components and invariant rules, at once defiant and elusive.
Once revealed, an invariant transmutes into a mathematical crystal, whose facets reflect both itself and others, with which it abstractly shares its essence, its mathematical soul.1 In a biological system, an invariant is an abstract property entailed by the substrate and its organizational principles. Physical laws create potential for modifications of structural configurations and relations between components of the substrate, resulting in invariance perchance of form and perchance of function or behavior. The organizations of the substrate are, at least for those subscribing to materialist views, solely regimented by physical principles. In the world, self-organization is superordinate to the physical principles of organization of matter, such as adhesion, viscosity, permeability, rigidity, elasticity, magnetism, electricity, and combinations thereof: relational forces and functions which create potential for change. Physical principles can be distilled into mathematical rules, which on occasion are shared between different systems (mathematical analogies). It is part of the fascination of modeling that, at some level of analysis, models may be shared between diverse systems. Natural processes whose results appear analogous can be modeled via similar invariant rules. The same holds for neural systems.
4.1.1.1 Similar Patterns from Invariant Rules

Studies of morphogenesis are searches for abstract rules of a process that explain the formation of natural patterns. As an example, see Fig. 4.1, comparing spiral patterns occurring in nature and in inorganic chemistry. One is formed by the processes of a colony of organisms, Dictyostelium, whereas the other is formed by a complex chemical reaction involving bromide and an acid, the Belousov–Zhabotinsky reaction (more pictures of spiral patterns from different systems can be found at http://www.uni-magdeburg.de/abp/picturegallery.htm).
Fig. 4.1 Similar abstract rules resolve the similarities in the patterns generated by the chemical Belousov–Zhabotinsky reaction and the biological Dictyostelium spirals
1 Please forgive the somewhat exuberant exposition. It was written during an outburst of awe.
The patterns do not share a substrate, and yet they look exquisitely similar. This similitude can be explained by an abstract description of a dynamical process that holds for both phenomena. Pattern formation results from two complementary processes: a fast autocatalytic reaction (a positive feedback) and a slow inhibitory reaction (a negative feedback). Roughly described, the pattern-forming process unfolds similarly for both. The autocatalysis is fast and exhausts the reactants, allowing a secondary, slower process of regeneration of the catalysts to take place. The temporal dynamics of this process in a two-dimensional excitable medium [7] is oscillatory, producing a wavefront of reaction, and consequently the spiral patterns. The spirals are reliable indications of the existence of rules; both the bacteria and the chemical reaction share abstract, pattern-forming, invariant rules; and both produce, within an ample space of parameters, qualitatively invariant spatial patterns. Many other such examples of spiral patterns appear in nature (as in Fibonacci sequences in plant morphogenesis). Processes producing patterns can be modeled at a variety of levels of abstraction, from very abstract cellular automata to very detailed reaction–diffusion models based on differential equations [21]. Across this range the processes are dynamic, unwinding in time, which introduces issues of stability depending on process rates, initial quantities of interactants, and influxes of energy (or perturbations) initiating the reaction. By analyzing changes to these parameters, one can study the different qualitative behaviors that both systems exhibit. Parameters and potential regulate the processes. When there is too much potential and too few interactants, the process may be like a burst of flame, rapidly consuming the energy and quickly dying out. Likewise, when there is not enough potential, nothing ensues. Between the dynamics of pattern formation in bacteria and in the Belousov–Zhabotinsky reaction there is a similarity that allows both systems to be modeled by the same abstract model, with similar parameters, at the level of pattern formation. Incidentally, a variation of the model has also been applied to the propagation of contraction forces in heart muscle. All three systems embody, in different substrates, the same abstract rules.
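The fast-autocatalysis/slow-inhibition scheme can be made concrete with a small simulation. The sketch below uses a generic activator–inhibitor (FitzHugh–Nagumo-type) reaction–diffusion system on a two-dimensional grid; it is not the published model of either the Dictyostelium colony or the Belousov–Zhabotinsky reaction, and all parameter values are illustrative.

```python
import numpy as np

def laplacian(z):
    """Discrete Laplacian with periodic boundaries: the diffusion term on a 2-D grid."""
    return (np.roll(z, 1, axis=0) + np.roll(z, -1, axis=0) +
            np.roll(z, 1, axis=1) + np.roll(z, -1, axis=1) - 4.0 * z)

def excitable_medium(n=128, steps=4000, dt=0.05,
                     Du=1.0, Dv=0.05, eps=0.08, a=0.7, b=0.8):
    """Fast activator u (autocatalysis + diffusion) and slow inhibitor v (recovery)."""
    rng = np.random.default_rng(0)
    u = rng.random((n, n)) * 0.1          # small random perturbation of the resting state
    v = np.zeros((n, n))
    for _ in range(steps):
        du = u - u**3 / 3.0 - v + Du * laplacian(u)       # fast, self-exciting reaction
        dv = eps * (u + a - b * v) + Dv * laplacian(v)    # slow negative feedback
        u += dt * du
        v += dt * dv
    return u

# with suitably "broken" wave fronts as initial conditions, such media produce rotating
# spiral waves; the random start here only illustrates the update scheme itself
field = excitable_medium()
```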
4.1.1.2 Causal Patterns with Different Entailments

Naturally, bacteria, the heart, and chemicals are not the same; the set of potentialities of each is, of course, very different. The processes in bacteria, with their much more intricate biochemical machinery, are much more complex than those of the chemicals in the Belousov–Zhabotinsky reaction. An explanation of the pattern formation, however, can reunite the two systems – at a certain level of abstraction – under the same model. At the level of pattern formation, the two systems are isomorphic – they share rules. The bacterium has, potentially, many more things it can do, many more ways it can interact with the environment and its colony. But at the level of pattern formation, bacteria are comparable to a chemical reaction, because at that level the abstracted rules are shared. Structural similarities between the two phenomena translate into an abstract relational invariance that, within certain conditions,
explains both patterns in one sweep. The set of implications of each pattern, however, is very different, and a crucial difference it may be. Patterns themselves may have further potential roles to play in some cases, but not in others. Spatial patterns of bacteria, more resourceful than bromide, may play a functional role, e.g., in the development of the colony. The emerged pattern may acquire function, an emergent function, which operates at the level of the population. The pattern may admit the attribution of a causal role. Differential occupation of a surface will lead to differences in light absorption, the distribution of nutrients, and conceivably many other factors. These patterns in turn may shape the development of the colony. It could even be that evolution operated on the types of patterns that a colony of bacteria may express (multilevel selection). Given the extremely sophisticated forms these colonies can take, the conclusion that patterns play causal roles seems necessary. In this sense, the abstract model (fast autocatalysis and inhibition) covers the outcomes of the chemical reaction better than it covers those of the bacteria. For bacteria, a complete explanation would require the inclusion of more levels, genetic and developmental. In that sense, the search for abstract principles in the brain is more similar to the search for invariants in bacteria. Explaining one level does not exhaust the phenomenon. The brain offers a wealth of patterns, at a variety of levels, that exist simultaneously and differ depending on the empirical lenses through which they are seen. Although some brain patterns might be distant correlates of causation, some might be very close to the cause itself. Models help in finding out which is which, by showing what is covered and what is not.
4.1.2 Dynamical Neural Patterns

As seen in Sect. 3.1, the many measurement methods show a brain permeated with patterns. Perhaps the most conspicuous of these is the intricate neuroanatomy, but a simple glance at the pictures produced by any empirical method reveals patterns in a variety of forms, on a variety of spatial and temporal scales. Patterns of activity (and morphology!) in functioning brains are dynamic and evolve in time, raising the question of what rules could be shaping the observed patterns. The appearance of a certain invariant in the nervous system, the qualitative shape of an action potential, for example (as seen through the membrane potential), is a product of physical laws and the structure of the substrate. The electrotonic equilibrium across the cellular membrane is purely a function of the ionic composition of the cell, its membrane structures with differential conductances, and the cell's surroundings. The types of action potentials that a cell is able to emit are eminently dependent on the interplay of components and surroundings. We examine this more closely in Sect. 6.1. For now, it suffices to say that the observed shapes of the action potentials are invariants of neuronal models with respect to some parameter domains.
The action potential is, furthermore, an example of a dynamical pattern with causal prowess, whose influences propagate throughout the organism. At the base of most measurements of functional invariance in the brain there are action potentials. Oscillations, local activity, and receptive fields are all consequences of action potentials plus network (and dendritic) structure. And the action potential itself is a consequence of the properties of the cell producing it. So, the analysis of invariance in the brain begins with the production of action potentials (which we take up in Sect. 6.1). There we will see that the patterns of action potentials can be understood through modeling at the level of the cellular membrane. At that level, a picture of constancy emerging from variability appears in the crossing from the anatomical composition of the neurons to the dynamics of the membrane potential. Distributions of ion channels lead to varieties of dynamical behavior. We call them varieties [11]: qualitatively different dynamical behaviors, in that a neuron can fire different action potentials depending on the channel properties and distributions that determine the dynamics of ionic exchange (a sample of action potential varieties is shown in Fig. 4.2).
Fig. 4.2 Varieties of action potentials. Their behavior can be framed as invariant properties according to parameterizations. DAP depolarizing afterpotential. (From [13])
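To illustrate how a single parameterized membrane model can produce qualitatively different firing varieties, here is a minimal sketch using the Izhikevich model. This is not necessarily the model behind Fig. 4.2; the parameter sets are the standard published ones for a few firing types, and the input current is arbitrary.

```python
import numpy as np

def izhikevich(a, b, c, d, I=10.0, T=500.0, dt=0.25):
    """Simulate one Izhikevich neuron; the tuple (a, b, c, d) selects the firing variety."""
    v, u = -65.0, b * -65.0
    trace, spike_times = [], []
    for step in range(int(T / dt)):
        v += dt * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)   # fast membrane potential
        u += dt * a * (b * v - u)                            # slow recovery variable
        if v >= 30.0:                                        # spike: reset v and bump u
            trace.append(30.0)
            v = c
            u += d
            spike_times.append(step * dt)
        else:
            trace.append(v)
    return np.array(trace), spike_times

# same equations, different parameters, different "varieties" of firing:
regular_spiking = izhikevich(a=0.02, b=0.2, c=-65.0, d=8.0)
chattering      = izhikevich(a=0.02, b=0.2, c=-50.0, d=2.0)
fast_spiking    = izhikevich(a=0.10, b=0.2, c=-65.0, d=2.0)
```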
In the brain the presence of invariants leaves traces at many levels.2 The actual discovery (or invention) of rules connecting levels is anything but trivial, requiring as much knowledge as creativity. The discovery of rules and invariants usually follows a strenuous and winding historical path of experiment, discovery, and invention. In Sect. 6.1.1, we will see some of the facts leading to the Hodgkin–Huxley model of the action potential, one of the neatest examples of modeling neural invariances. Before that, we do a sampling review of models of computational neuroscience that have a bearing on the present discussion. These models show how brain patterns could emerge as the outcomes of rules and components operating on structured input. Then, in the next chapter, we provide a description of the historical facts leading to our model of choice, the discrete-time recurrent neural network.
4.2 Computational Models of Neural Invariance

4.2.1 Constancy and Variability in Modeling

Explanatory theories in biology (as well as in physics, more canonically) attempt to encompass variation with theorizing, where ideally a reductionistic view on the regularities from measurement meets abstract principles that may account for them. This epistemological direction, from constancies to rules and from rules to invariance, is also the prerogative of complex-system models, where invariants in measurement mirror principles of organization, themselves a function of the potentialities of the substrate. The more fundamental the rule, and the less neglectful its exclusions, the more powerful the model or theory obtained. Through an explanatory theorization of this sort, even if by necessity about abstracted reality, one can claim to have found an organizational principle.3
2 The retrieval of rules from brain patterns stumbles upon contingencies. Crucially, data have to be arranged so patterns can be observed. This evokes the question of how much of the observed pattern may be a methodological artifact. Moreover, because humans are pattern-recognizers by nature, a reorganization of data to see patterns on occasion leads to specious patterns, and by extension to spurious conclusions. Rules inferred from brain patterns are more explanatory when they indicate causes, not correlations. Brain patterns are observed by an observer, who introduces his or her biases, which can take the form of theory-bound terminology or theoretical preconceptions. Unfortunately, I have found no general heuristics to distinguish wheat from hay.
3 Different entailments: Such principles belong to levels of explanation, by necessity, and are about the reality of the phenomenon, but are not the phenomenon. Confusion between simulation and reality, which some modelers indulge in, causes many of the quarrels with artificial life approaches. Artificial life, as Takashi Ikegami said, quoting Tom Froese, is actually dead, in the sense that the potentialities of matter in life-artificial and in life-organic are fundamentally different, and the combinations of different patterns will generally have different causal proclivities. Only a full-fledged equivalence among the self-organizing properties of elements could ever lead to life as we know it. And because every chemical component of the periodic table has different properties, it is hard to imagine "component neutrality," that is, that arrangements of different components would ever lead to equivalence expressed at all levels. That is not to say that we cannot build life as we do not yet know it.
Here is a very general template fitting every explanation of the interplay between constancy and variability in biological processes. Both constancy and variability are products of the self-organization of a complex system, governed by physical principles. Loosely speaking, order emerges from chaos when there is potential for lawful interactions between components. By “potential for lawful interaction” I mean that the results of the interactions between components follow some invariant rule. The presence of variability indicates that interaction between components can happen in more than one way, or that the outcomes display divergence. Complementarily, constancy appears when there is convergence. When things can happen in more than one way, they are more likely to happen in one way than in another. Often, the way a process begins biases the outcomes of the process one way rather than the other. Then constancy appears at the side of the bias introduced by the initial structures.
4.2.2 Models and Invariance in Computational Neuroscience

Some of the most beautiful models of computational neuroscience have been proposed to disinter the rules leading to neural patterns concomitant with behavioral function. In this section, I discuss some of these models and their explanatory characteristics. Whether explicitly or not, all of the following models are examples of a search for invariant mechanisms producing observable functions. Models abduct – or select – particular properties of the nervous system and transform them into invariant rules, from which patterns and function may emerge. The exertion of functions produces distinctive neural patterns. The upcoming subsections will exemplify this claim.
4.2.3 Hebbian Plasticity

4.2.3.1 Ongoing Synaptic Modification

In 1949 Donald Hebb [10] proposed a candidate mechanism for the neuronal change underlying memory, a proposal that has since become coextensive with neural models of learning. "Neurons that fire together wire together," or Hebb's rule, is the classic example of a principle of self-organization in the cortex. In Hebb's own words, "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased." Hebb did not propose a particular mathematical formulation, but one of the simplest implementations operates as follows. The strength of a synapse (Hebb's efficacy) between two neurons is increased (or decreased) by a number proportional to their joint activity. If the activity of the receiving neuron is lower than that of the sender, the synapse is reinforced proportionally to the product of
activities. If, conversely, the receiving neuron has higher activity, the synapse is dampened proportionally. So, Hebb's rule has two properties: it is local and it is symmetric. The neurons know nothing about their networks, only about themselves and the incoming signal. There is no signal determining whether some plastic alteration led to a positive result at a higher level. Therefore, Hebbian learning is unsupervised. Many variations on Hebb's theme have taken it from its original abstract phrasing and improved its biological adequacy, such as spike-timing-dependent plasticity, which takes the recent history of spikes into consideration. On the basis of Hebb's rule and its variations, many interesting systemic properties have since been demonstrated (see, e.g., [5]). At the system level, Hebb's rule has been prolific in proposing explanations for a number of behavioral phenomena. Abbott and Blum [1, 2, 4] used it to show how a mouse may implicitly learn to navigate a water maze. Mehta et al. also presented a model of how the predictive firing of hippocampal cells may be explained by Hebbian learning (and its biological counterparts, long-term potentiation and long-term depression). Fundamental to these results and others of the same type is the following: consistent experience leads to consistent alterations of network structure. In the network, an initially random distribution of weights becomes skewed with consistent experience. The skew of the weight distribution represents particular properties of the interactions of the simulated mouse with its environment. The outcome is a distribution of weights that approaches invariance by capturing (i.e., representing, in a weak way) the history of interactions (see Fig. 4.3).4 Hebbian plasticity has motivated biological explanations as well as machine learning applications, such as reinforcement learning, in various guises (some in which the property of locality is forsaken, giving way to a system-level reinforcement signal generating the plasticity). Research has in most cases confirmed Hebb's insight, and it is highly likely that Hebbian learning as an invariant rule represents the standard form of systemic alteration of neural structures.5
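A minimal sketch of the textbook activity-product form of Hebb's rule follows. The variant described above (with its sender/receiver asymmetry) differs in detail; here the weights are simply row-normalized to sidestep the unbounded growth mentioned in footnote 4, and all sizes and rates are arbitrary.

```python
import numpy as np

def hebbian_step(w, pre, post, eta=0.01):
    """Hebb's rule in its simplest form: each weight grows in proportion to the
    product of pre- and postsynaptic activity (local and unsupervised)."""
    return w + eta * np.outer(post, pre)

rng = np.random.default_rng(1)
w = rng.normal(0.0, 0.1, size=(5, 10))        # 10 presynaptic -> 5 postsynaptic units
for _ in range(500):
    pre = rng.random(10)                       # one episode of presynaptic activity
    post = np.tanh(w @ pre)                    # postsynaptic response of simple rate units
    w = hebbian_step(w, pre, post)
    w /= np.linalg.norm(w, axis=1, keepdims=True)   # crude normalization against runaway growth
# repeated, consistent input statistics skew the initially random weight distribution
```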
4.2.4 Kohonen's Self-Organizing Maps

Hebb's insight motivated many models where plasticity begets invariance. Kohonen maps, also called self-organizing maps (SOMs), are an example.
4 In the original proposal, Hebb's rule operates on real numbers representing neural activity, and learning happens when connections are repeatedly active. That leads to an unbounded positive feedback in the weight change, which in turn leads to scaling problems. Solutions inspired by biological mechanisms have been proposed to deal with the scaling problem by introducing homeostatic constraints on saturation [3].
5 Hebb's rule does, however, lead to paradoxes. There is the paradox of permanent change: if all synapses are constantly changing, the system must somehow avoid the so-called superposition catastrophe. How does the system handle contradictory information, always present in the real world, without forgetting?
Fig. 4.3 Asymmetric expansion of the “predictive coding” of place fields as explained by Hebbian learning. (From [17])
Kohonen maps are neural networks that organize multidimensional feature spaces. Building on a principle unearthed by von der Malsburg [16] for the organization of visual feature maps, Kohonen [14] generalized the principle to any number of feature dimensions. Similar with similar is the principle behind SOMs. The maps employ measures of similarity to produce topologically organized feature maps that, resembling topographic maps in the cortex, have a spatial arrangement representing the organization of a feature space. The paradigmatic examples are the original work of von der Malsburg, where the space of visual features becomes organized, and the emergence of somatotopic-like maps of tactile features. Because SOMs are not restricted to particular features, they can generate topological organizations even with abstract concepts, such as semantic maps [20]. Once trained, a SOM becomes an invariant similarity classifier. SOMs show how invariance results from the repeated application of rules to disorganized feature spaces, as long as there is a metric of comparison between inputs. The metric, and by extension the production of maps, depends on particular network topologies for the quantification of similarity. Depending on the topology of the initial network, the resulting map will differ. But the important conclusion is this: given a certain metric and a certain space of training examples, the outcome is necessary.
This is not quite true. In fact, as should be expected, SOMs are highly sensitive to the training examples, the order of presentation, and the initial structure. The larger the dissimilarity between the training examples, the more divergent the resulting maps will be. This is worth mentioning because, to understand particular maps obtained from particular sets of examples, it is important to understand the origin of the examples. Consider the following. Suppose the examples fed into the SOM are produced by a physical system. The distribution of features will follow the rules of the physical system. In that sense, the space of possible invariant maps will follow the particularities of the physical system. Now suppose that there are many maps, each trained from a different set of examples of the physical system. If, as we assumed, the outcomes of the physical system are randomly distributed, then maps trained with few examples will look considerably different from one another. But SOMs that are fed with large numbers of examples will resemble each other, in unbiasedly representing the randomness of the original process. Although all SOMs will attempt to capture the invariances of the original system, only the larger maps will attain larger correspondences, and only to the degree that the metric is compatible with the incoming data. The SOM's malleable structure assimilates the incoming data. The larger and more consistent the set of training samples, the better the emerging correspondences. There must be a match between the complexity of the data and the absorption capacity of the structure.
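A minimal sketch of the SOM training loop for a one-dimensional map follows: find the best-matching unit, then pull it and its map neighbors toward the input. Dimensions, learning rate, neighborhood width, and the annealing schedule are placeholder choices, not Kohonen's original settings.

```python
import numpy as np

def train_som(data, n_units=20, epochs=50, eta=0.5, sigma=3.0):
    """Fit a 1-D self-organizing map: winner-take-all plus a neighborhood pull."""
    rng = np.random.default_rng(0)
    weights = rng.random((n_units, data.shape[1]))        # initial random feature vectors
    positions = np.arange(n_units)                        # unit positions on the 1-D map
    for epoch in range(epochs):
        decay = np.exp(-epoch / epochs)                   # crude annealing of eta and sigma
        for x in rng.permutation(data):
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))   # best-matching unit
            h = np.exp(-((positions - winner) ** 2) / (2 * (sigma * decay) ** 2))
            weights += (eta * decay) * h[:, None] * (x - weights)     # pull neighborhood toward x
    return weights

# after training, nearby units respond to similar inputs: a topologically ordered map
data = np.random.default_rng(1).random((500, 3))
som = train_som(data)
```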
4.2.5 Backpropagation: Gradient Descent Methods

Backpropagation was one of the first learning algorithms for neural networks. It is a method for creating a categorizing network, and is an example of supervised learning, as the patterns are associated with desired responses. It works as follows. A network with input, hidden, and output units is initialized with arbitrary weights. Then the network is fed with training patterns for which the classification is known. If the output units do not match the desired output, the weights are changed in proportion to the error, which is the difference between the achieved and desired outputs. After the network has been fed with enough training examples, it will be able to extrapolate and categorize unseen patterns. Thus, the network becomes an invariant classifier. Analogously to SOMs, the structure of the network will represent the training patterns, and likewise it will be dependent on the training examples. The networks trained by backpropagation may be feedforward or recurrent. In feedforward networks, difficulty appears with contradictory patterns, which cannot be learned. Recurrent neural networks partially solve this problem by introducing internal states dependent on history, thereby providing context to the incoming patterns. Recurrent neural networks can also learn oscillatory patterns. The absorption capacity of these networks will be constrained by their structure as well as their size. Networks trained with backpropagation have been proposed as an explanation for perceptual categorization, as well as interpolation and extrapolation. In the past they were criticized for the lack of a corresponding mechanism in biological networks,
that is, their biological adequacy was questioned. Recently, some neurons have been shown to propagate signals in the reverse direction (axon to dendrites), which has led to the relinquishment of some of the critiques (as cited in [15]). The remaining difficulty is the necessity for supervision: training examples must be classified before being fed to the algorithm. Although it is not difficult to conceive of biological situations in which the training examples are provided by the interaction of an agent and an environment, the biological adequacy is to be judged case by case. The capability of networks to learn with backpropagation (or any variation on gradient descent) is eminently dependent on the structure of the network and the training examples. But provided that the structure and examples can produce proper classifying networks, the network becomes an invariant classifier. For large data sets, large networks can vary wildly both in their structure and in the correctness of their classification of unseen examples. But because they embody the data that was fed to them, the categorization ability is an abstraction over the problem.
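A minimal sketch of the procedure in modern terms: a small two-layer network trained by gradient descent on the output error. The architecture, learning rate, and the XOR training set are illustrative, not a historical implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(x, y, n_hidden=8, eta=1.0, epochs=5000, seed=0):
    """Two-layer feedforward network trained by backpropagating the output error."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], n_hidden))
    w2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    for _ in range(epochs):
        h = sigmoid(x @ w1)                    # hidden activations
        out = sigmoid(h @ w2)                  # achieved output
        err = out - y                          # difference from the desired output
        delta_out = err * out * (1.0 - out)    # error propagated through the output units
        delta_h = (delta_out @ w2.T) * h * (1.0 - h)   # ... and back to the hidden layer
        w2 -= eta * h.T @ delta_out / len(x)   # weight changes proportional to the error
        w1 -= eta * x.T @ delta_h / len(x)
    return w1, w2

# XOR: a classification a single-layer network cannot learn, but a hidden layer can
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
w1, w2 = train_backprop(x, y)
```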
4.2.6 Hopfield Networks

Responsible for the revival of neural nets in the 1980s, Hopfield [12] showed how a recurrent neural network (a spin-glass model) can be made to store and retrieve patterns. Hopfield networks are an example of content-addressable memory: given a pattern similar (but distorted in some way) to one stored in the network, the network will eventually settle on the stored pattern (a minimum of free energy). In dynamical systems terms, the initial (distorted) pattern is an initial condition, and the resulting (stored) pattern is a fixed point. The Hopfield network is a network in which all weights are symmetric (which guarantees the existence of a Lyapunov energy function) and there are no self-connections. Moreover, the update rule is asynchronous, to avoid oscillations. Neurons in the network are Boolean (they have two states, in some versions 1 and 0, in others +1 and -1). Patterns are stored via a Hebbian outer-product prescription or some version of the delta rule, and retrieval happens by inserting a pattern and letting the network settle on one of the stored patterns. In the terminology of dynamical systems the inserted pattern is an initial condition, and the transient follows an energy gradient into the closest minimum of free energy (see Fig. 4.4). The network effectively implements a categorization, in which similar patterns will fall into one of a limited number of basins of attraction, determined by the number of units in the network (there is more about basins of attraction in Sect. 7.1.2.2). Dynamically, Hopfield's model is trivial, because the initial state always finds itself in some basin of attraction and stays in that basin, merely following the gradient. The asynchronous update rule guarantees the absence of oscillations. Every retrieved pattern is a fixed point of the dynamics. Apart from quirks of biological adequacy (such as the necessity for symmetric connections, the asynchronicity of updates, and the Boolean activities), there are other limitations.
Fig. 4.4 The rugged hill analogy for basins of attraction of Hopfield networks. A minimum of free energy between two contiguous peaks is a basin of attraction, and the bottom point is a fixed-point attractor
Because the phase space is partitioned into a limited number of basins of attraction (stored patterns, categories), the storage capacity is proportional to how finely this space can be partitioned. With more patterns, recollection collapses, and the capacity for storage is in general quite limited [5]. The transient states have no functional role, and the attractor states are trivial fixed-point attractors. Nevertheless, Hopfield's seminal idea showed how patterns can be stored and retrieved in recurrent neural networks. It provided the stepping stone for a comprehension of how neural networks could implement content-addressable memory. As with SOMs and Hebb's rule, a trained Hopfield network is an implicit representation of the stored patterns. Essentially, every Hopfield network is a categorization machine that embodies the patterns with which it was trained. Its basin structure is an invariant of categories. Because it takes in distorted or disturbed patterns and retrieves one single pattern, it is also a mechanism that reduces variation into constancy. As in the cases above, the invariant of a Hopfield network is a static representation of a history of interactions, which makes its way into a deformable space of neurons. That is to say, the systems described above all have a tendency: given rules plus experience, the resulting network is a representation of an invariant.
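A minimal sketch of storage and retrieval in a Hopfield network, using the classic Hebbian outer-product prescription for the weights and asynchronous ±1 updates; pattern count, network size, and noise level are arbitrary.

```python
import numpy as np

def store(patterns):
    """Symmetric weights from +/-1 patterns via the outer-product prescription."""
    n_units = patterns.shape[1]
    w = sum(np.outer(p, p) for p in patterns) / n_units
    np.fill_diagonal(w, 0.0)                     # no self-connections
    return w

def retrieve(w, state, sweeps=10, seed=0):
    """Asynchronous updates: the distorted initial condition descends the energy
    gradient until it settles on a fixed point (ideally a stored pattern)."""
    rng = np.random.default_rng(seed)
    state = state.copy()
    for _ in range(sweeps):
        for i in rng.permutation(len(state)):    # one unit at a time, in random order
            state[i] = 1 if w[i] @ state >= 0 else -1
    return state

rng = np.random.default_rng(1)
patterns = rng.choice([-1, 1], size=(3, 64))     # three stored patterns over 64 units
w = store(patterns)
noisy = patterns[0] * rng.choice([1, -1], size=64, p=[0.9, 0.1])  # flip ~10% of the bits
recovered = retrieve(w, noisy)                   # should settle back on patterns[0]
```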
4.2.7 Decorrelation Principles

Recently, neat examples of the appearance of invariances from spatial–temporal influxes of coherent information have been demonstrated in the case of learning of temporal sequences. Wyss et al. [24–26] have tackled the problem of learning spatial–temporal patterns, where a temporal sequence of spatial stimuli is fed to a multilayered network that is altered according to a decorrelation rule. The idea behind a decorrelation rule is that, as it learns, the neural system tends to minimize
redundancy. So, if at the outset both neuron A and neuron B respond to stimulus S, after learning, neuron A responds to stimulus S and neuron B responds to something else, where it can be more useful. In an inspired experiment with a decorrelation rule, a robot stands in for a mouse that learns a maze on the basis of visual streams. At first the robot explores the maze freely, acquiring sequences of pictures that are fed to its network (layered and hierarchically organized). After the training period, in which the decorrelation rules are applied to the network weights, a particular view of the maze (a picture) will evoke an invariant pattern of network activity [23]. This amounts to saying that for each place in the maze there is a corresponding invariant pattern of activity involving all neuronal units – the whole brain of the robot. The invariant representation of place appears because there is structure in the image sequence fed to the robot's network, structure which is about the world and about how the agent moves in it. When the system was tested with scrambled sequences of images, as expected, learning was not stable and no invariant patterns appeared. The structure of the world is acquired in interaction through sensorimotor contingencies; to get from A to B, the robot must necessarily go by a connected path between A and B. Any contiguous movement in four dimensions follows a Lie manifold constrained by the embedding of the robot [18, 19]. It is the very structure of movement in the world that produces the network structure whose activity is invariant with respect to place. By minimizing redundancy, activity patterns end up representing places in the world (more or less univocally, dependent on the complexity of the environment to be learned).6
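Wyss et al.'s rule operates on a hierarchy of layers driven by image sequences, which is beyond a short illustration; but the core intuition – initially redundant responses are pushed apart – can be shown with a two-unit toy and a single anti-Hebbian weight. Everything below (the stimulus stream, the learning rate, the single lateral weight) is an illustrative assumption, not their model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stimulus stream: two input channels that are almost perfectly correlated,
# standing in for neurons A and B that both respond to the same stimulus S at the outset.
T = 20000
s = rng.normal(size=T)                 # the shared "stimulus"
xA = s + 0.1 * rng.normal(size=T)
xB = s + 0.1 * rng.normal(size=T)

# One anti-Hebbian weight: unit B learns to subtract whatever unit A already codes.
w, eta = 0.0, 0.002
yA_all, yB_all = np.empty(T), np.empty(T)
for t in range(T):
    yA = xA[t]
    yB = xB[t] - w * yA                # B's response with the redundant part removed
    w += eta * yA * yB                 # anti-Hebbian update: grows while the outputs covary
    yA_all[t], yB_all[t] = yA, yB

before = np.corrcoef(xA, xB)[0, 1]
after = np.corrcoef(yA_all[-5000:], yB_all[-5000:])[0, 1]
print(f"correlation before learning: {before:.2f}, after: {after:.2f}")
```

After learning, unit B no longer duplicates unit A; in a layered network driven by an exploratory image stream, the same pressure is what frees units to code for whatever independent structure remains, such as place.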
4.2.7.1 Place Cells as Relational Invariants The same mechanism (decorrelation of multilayered networks) may explain the appearance of some empirical invariants, such as place cells, grandmother (or Jennifer Aniston) cells, and even, with a tad of goodwill, mirror neurons. It is an interesting exercise to imagine a simulated electrophysiology experiment in the robot described earlier. Suppose, for instance, that we had no access to the whole pattern of activity, but only a sample of it, through electrodes inserted in some random units of the network whose activity we measure. Then, when a picture from the maze is presented, the corresponding invariant pattern is evoked. The randomly sampled cells will have a level of activity which is invariant with place. Since invariant patterns have a disjunctive character, because of decorrelation, cells measured at random will be more or less active in particular positions of the maze. These hypothetical units are akin to "place cells," and they may (or may not) correlate with particular places in the maze (see Fig. 4.5). Similarly impressive results based on slow feature analysis [22] have been demonstrated by Franzius et al. [6]. More than place cells, other
6 Notice that the cell "abstracts away" the precise information about the incoming picture. A similar pattern of activity will be evoked in a given place P irrespective of viewpoint V. All the viewpoints in place P produce similar activity patterns in the higher layers. The invariant is not a "place" per se, but is about where the "presented picture" fits into the network attractors.
Fig. 4.5 Left: Artificial place cells from sequences of pictures. Right: A place cell from a rat, with a trajectory overlaid. (Left from [23], right from [8])

Table 4.1 Invariances in the structure of interaction with the world are represented in invariance in neural activity

Dimension of transformation | Invariance in interaction with the world | Neural invariance
Rotation | Place | Place cell
Displacement | Orientation | Head-direction cell
Displacement and rotation | Vision target | Viewpoint cell
Velocity | Step size (e.g., 2 cm) | Grid cell
invariants were shown as outcomes of the trained algorithm, such as head-direction cells, viewpoint cells, and grid cells. This makes sense: certain sensorial transformations keep some relations to the world constant, and thereby define dimensions of change relative to constancies in the interaction with the world. The coherent structure of the world, fed to a malleable model suited to acquire this structure (decorrelation, slow feature analysis), will produce the dimensions of invariance in patterns of activity. Examples of dimensions of transformation, with the associated invariants in interaction and the respective neural invariants, are given in Table 4.1. Because the transformations involved in locomotion are smooth (in Lie manifolds), a decorrelation method will generate abstract representations of smooth changes (of perceptual input). This will appear as constancies in the activity measured by electrophysiology, or by functional magnetic resonance imaging (as in [9]). Conceivably, if decorrelation (or slow-feature extraction) is a mechanism implemented by the brain's matter, it explains the appearance of invariant place cells in measurement, as it minimizes redundant representations of the independent dimensions along which an agent's relations with the world can change.
Specific transformations will lead to types of invariants, but it is the act of exploration, the experience itself, that reveals the structure of the interaction with the world within the neural system. Consequently, different experiences lead to different renditions of the invariant patterns of activity, but all will refer to the same modes of interaction with the world. The grounding will be constant. The acquisition of sequences of pictures from an exploratory mode of behavior implies a relational invariant between the being, its malleable structures, and the constancies of the world.
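The "slowness" objective behind the results of Franzius et al. can also be made concrete with a linear toy example. Their model is hierarchical and nonlinear, so the sketch below is not their algorithm; it only shows the core of linear slow feature analysis on invented data: whiten the inputs, then take the direction whose temporal derivative has the smallest variance. The slow latent variable and the mixing coefficients are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "exploration" data: a slowly varying latent (the agent's place/pose)
# mixed into fast, noisy sensory channels.
T = 5000
slow = np.sin(np.linspace(0, 8 * np.pi, T))            # the slow latent variable
fast = rng.normal(size=(T, 3))                          # fast sensory fluctuations
X = np.column_stack([slow, slow, slow]) * [1.0, 0.5, -0.8] + 0.5 * fast

def linear_sfa(X):
    """Return the slowest linear feature: the smallest-eigenvalue direction of the
    covariance of temporal differences, computed in whitened coordinates."""
    Xc = X - X.mean(axis=0)
    d, E = np.linalg.eigh(np.cov(Xc.T))
    white = Xc @ E @ np.diag(1.0 / np.sqrt(d))           # whitening
    dd, U = np.linalg.eigh(np.cov(np.diff(white, axis=0).T))
    return white @ U[:, 0]                               # slowest extracted signal

y = linear_sfa(X)
print(abs(np.corrcoef(y, slow)[0, 1]))   # high: the slow latent is recovered (up to sign)
```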
4.3 Conclusion 4.3.1 Invariants and the Structure of the World The principles underlying the models described in this chapter were inspired by brain properties and functions, and purport to emulate the functions that inspired them. In all cases we see the appearance of invariances, which represent coherent orders in the influxes of data. The functions achieved by all the neural models here are versions of the principles of categorization and classification (prediction in the case of Hebbian learning). Although cogent biological confirmation for these principles is lacking, these models carry something of the actual phenomenon. Generally, the following can be asserted with confidence. All the systems described in this chapter (SOM, decorrelation, backpropagation) take in stimulus data and are incrementally altered to represent the structure of the data. The functions of the systems are relational invariants, which relate the training examples, the structure, and the function. In behavior, the environment is a coherent source of training data, whose sample points, although very varied, are not random. They are as structured as the environment itself, and implicitly represent the rules they obey. In malleable systems, the influx of coherent data will mold the system to the regularities of the data. Variation is unavoidable, resulting from dependence on the initial structure and on the sequences of inputs. Constancy appears as invariant patterns are resolved through the structure of the input and the structural alteration rules. In a sense, the invariant patterns come to represent the data fed to the system. Note that this notion of representation is simple; it requires naught but a certain level of "aboutness." For example, the resulting SOMs are "about" the data. The term "representation" is employed in a sense different from standard notions in cognitive science, in that such representations are only intelligible in relation to the data. There is an inextricable dependence between the influx and the system, and meaning is immanent when they are taken together, but nonexistent if they are taken independently. Malleable systems will self-organize to absorb the data that they are given. In organisms, data is whatever streamed inputs the organism receives. If there is order in the data, and if there are mechanisms to change the structure of the system consistently, the system will, in whatever way, come to represent the data itself. When the
distinctions and orders that were captured from data are used in behavior in some way, the loop is closed. Invariances of neural activity and structure are about how the organism is modified as it interacts in regimented ways with the environment. In that, all invariances thus produced are relational: they relate the organism to the world through experience.
4.3.2 Constancy and Variability in Modeling The previous sections presented a selection of computational models inspired by principles abstracted from neural systems. From those we derived a simple yet important conclusion: these models build on invariant rules and components to form around data fed to them, and come to implicitly represent regularities in the data. Regularities in the data are often about the structure of the world, and the modes of interaction between a system and its environment. When data is regular and components are similar, models develop equivalently. Constancy appears both at the level of function a model emulates (such as pattern categorization) and structure (such as weight distributions of a neural network). On the complementary side, sources of variability are the initial structures and the order of data presentation. Thus framed, the constancy and variability in models are analogous to those in organisms. But there is an important difference. Because models are abstracted from particular levels, they are by necessity only about those levels from which they are abstracted. It becomes difficult therefore to stitch models to cover the phenomenon without incurring gaps in coverage. How does one sum a network that does “place cell” with a network that does “similarity categorization” with a network that does “pattern retrieval”? Empirical assessments show partial pictures of invariance. Correspondingly, models emulate and explain particular levels of self-organization. Given the complexity of the problems of behavior and cognition, usually only one aspect is tackled at a time, only a subset of brain function. Abstract rules derived from the phenomenon reproduce, or emulate, some concomitants that are known to be present in the operation of the brain; however, none of them account single-handedly for all brain function. To sum models, as in uniting results from experiments, is a formidable challenge. To do so, one needs a framework where reintegration of partial models in a holistic model is possible and meaningful. Ideally, a model about brain function has to produce all the outcomes of the models described in this chapter, simultaneously. So, the abstract principles that were found in individual models must talk back to the phenomenon in all its scope.
References
1. Abbott L, Blum K (1996) Functional significance of LTP for sequence learning and prediction. Cereb Cortex 6:406–416
2. Abbott LF, Blum KI (1996) Functional significance of long-term potentiation between hippocampal place cells. Cereb Cortex 6:406–416. URL citeseer.ist.psu.edu/abbott94functional.html
3. Abbott LF, Nelson SB (2000) Synaptic plasticity: Taming the beast. Nat Neurosci 3:1178–1183
4. Blum KI, Abbott LF (1999) A model of spatial map formation in the hippocampus of the rat. In: Neural codes and distributed representations. MIT Press, Cambridge
5. Domany E, van Hemmen J, Schulten K (1995) Models of neural networks, vol I, 2nd edn (updated). Springer, New York
6. Franzius M, Sprekeler H, Wiskott L (2007) Slowness and sparseness lead to place, head-direction, and spatial-view cells. PLoS Comput Biol 3(8):e166
7. Goodwin B (2001) The evolution of complexity: How the leopard changed its spots. Princeton Academic, Princeton
8. Harris K, Csicsvari J, Hirase H, Dragoi G, Buzsáki G (2003) Organization of cell assemblies in the hippocampus. Nature 424(6948):552–556
9. Hartley T, Maguire E, Spiers H, Burgess N (2003) The well-worn route and the path less traveled: Distinct neural bases of route following and wayfinding in humans. Neuron 37(5):877–888
10. Hebb D (1949) Organization of behavior. Wiley, New York
11. Heylighen F, Joslyn C (2001) Cybernetics and second-order cybernetics. In: Meyers R (ed) Encyclopedia of physical science and technology, 3rd edn. Academic, New York
12. Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci USA 79:2554–2558
13. Izhikevich EM (2004) Which model to use for cortical spiking neurons? IEEE Trans Neural Netw 15(5):1063–1070
14. Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43(1):59–69
15. Körding K, König P (2001) Supervised and unsupervised learning with two sites of synaptic integration. J Comput Neurosci 11(3):207–215
16. von der Malsburg C (1973) Self-organization of orientation sensitive cells in the striate cortex. Kybernetik 14(2):85–100
17. Mehta M, Quirk M, Wilson M (2000) Experience-dependent, asymmetric shape of hippocampal receptive fields. Neuron 25:707–715
18. Philipona D, O'Regan J (2004) Perception multimodale de l'espace. In: Philosophie de la nature aujourd'hui? MSH, Boston
19. Philipona D, O'Regan J, Nadal J (2003) Is there something out there? Inferring space from sensorimotor dependencies. Neural Comput 15(9):2029–2049
20. Ritter H, Kohonen T (1989) Self-organizing semantic maps. Biol Cybern 61(4):241–254
21. Sandstede B, Scheel A, Wulff C (1998) Bifurcations and dynamics of spiral waves. J Nonlinear Sci 9(4):439–478
22. Wiskott L, Sejnowski TJ (2002) Slow feature analysis: Unsupervised learning of invariances. Neural Comput 14:715–770
23. Wyss R, Verschure PFMJ (2003) Bounded invariance and the formation of place fields. In: Advances in neural information processing systems (NIPS)
24. Wyss R, König P, Verschure PFMJ (2002) Invariant encoding of spatial stimulus topology in the temporal domain. Neurocomputing 44–46:703–708
25. Wyss R, König P, Verschure PFMJ (2003) Invariant representations of visual patterns in a temporal population code. Proc Natl Acad Sci USA 100(1):324–329
26. Wyss R, König P, Verschure PFMJ (2006) A model of the ventral visual system based on temporal stability and local memory. PLoS Biol 4(5)
Chapter 5
Dynamical Systems and Convergence
Abstract Dynamical systems can be coupled to other dynamical systems to cover organismic phenomena on many scales, as they provide a general frame applicable to an ample range of phenomena. Describing a system as a set of coupled systems highlights interfaces. Interfaces reflect the locus of organized transitions, or level crossings, places where constancy and variability merge. At interfaces there is convergence and/or divergence, so the vicissitudes of variation coalesce into meaningful averages; conversely, constancy may drift and spread into variation. These transformations depend crucially on a number of processes, which are regimented by particular types of physical interactions. Variability may coalesce into averages, whereas constancy may drift. Mutatis mutandis, the same operating principles – large numbers and physical laws – make means smear again into variation. In a functioning system, these two stances of constancy and variation complement and define each other.
5.1 Introduction 5.1.1 Previous Conclusions To highlight correlates of behavioral function, experimental paradigms tend to amalgamate a large number of observations. Averaging results over different observations may eclipse subtle constituents of explanation. This is very much an inherent characteristic of experiment; it is usually not possible to control for all the variables, nor to know whether the selected variables are those most explanatory. The way to handle such vicissitudes and variations of experimental nature is to form hypotheses about what links protuberant correlates from different experimental results with the structure of the phenomenon. In short, to model. Furthermore, the methods of brain measurement, when taken independently, offer distinct viewpoints on the activity of the neural system. Depending on the viewpoint, a different facet of the phenomenon is illuminated. Consilience between the results of different methods can only appear if findings from the different
methods tell tales about each other. Or, better yet, if their results can be integrated in the general frame of an encompassing theory. Such an integrative theory has to explain why the correlates of the different experimental paradigms appear. For instance, it has to explain where the oscillations measured by electroencephalography come from; it has to explain why there is localized activity in functional magnetic resonance imaging; it has to explain the emergence and the role of synchrony; it has to explain why a single cell correlates with Jennifer Aniston; it has to explain both the variability in spike trains and the reliable single neuron responses; and it has to explain constancy in behavior, given variations in neural activity. And the theory has to explain all that in terms of the reality of organisms.
5.1.2 Dynamical Systems Theory as an Integrative Framework In this section, the theory of dynamical systems is presented as an integrative framework for the study of the behavior of organisms. The conception of the organism as best described in dynamical terms has lain latent in many classical works ever since the Greek philosophers (especially Democritus) [16]. Phenomenologists departing from the ideas of Husserl and Heidegger have flirted with the idea, such as in the works of Jonas [10] and Merleau-Ponty [12]. The idea evolved with works that emphasized the organism as a loop in the environment (Funktionskreis – von Uexküll) [17]. Recently, the fertile idea that the brain is best described as a dynamical system has been extensively explored in a number of research areas, such as cognitive development [15], action and perception [7], minimal cognition [3, 4], perception and memory [14], and psychophysics [11], as well as philosophy [8], neuroethology and insect locomotion [5], and even language [6]. Other classical works have more or less implicitly upheld a dynamic conception of intelligence and cognition, despite having espoused apparently incompatible paradigms such as artificial intelligence, for example those of Newell [13] and Hofstadter [9]. The prolificacy of the theory of dynamical systems attests to its generality, and the quality of the authors attests to the richness of these ideas. All and more have been both inestimable inspirations and sources for the present contribution, which humbly hopes to join such a smart lot in espousing dynamical views of cognition. The emphasis here is on the appearance of invariances at various levels, for which dynamical systems theory is fittingly suited. The theory of behavior of organisms based on a theory of neural dynamics is proposed as a framework that absorbs the results of empirical approaches. The analysis of how invariances arise as dynamical systems underlie functional behaviors is, I argue, the most profitable level of analysis for understanding the relations between neural structures and behavioral functions. It is in the language of dynamical systems that we find the common currency of explanation that is able to naturally address a wide range of questions about the behavior of organisms.
There are good reasons for that. Dynamical systems theory is applicable to many levels of description, from macroscopic motor behavior, to the activity of networks, to the single neuron, and to behavior on a variety of temporal and spatial scales. Dynamical systems models can be (analytically) coupled, and so made to cover a substantial part of organismic phenomena relating to behavior, from the neuron, through the body, up to the environment as a driving force, closing the loop around the organism. The manner in which dynamical systems overlay phenomena highlights interfaces, loci of level transitions. Moreover, dynamical systems can represent different kinds of variables, and their relations, in a common language. Further, the dynamical systems approach urges us to make the assumptions about the constituents of a model explicit. Ultimately, dynamical systems theory reveals many aspects of the invariants found in brain measurement, such as oscillations, localized activity, timed potentials, synchrony, and spike trains. With these properties as sufficient justification, one may welcome the dynamical systems approach as a powerful paradigm to study the behavior of systems, their dependencies, their invariants, and their expressions of variation and constancy.
5.1.3 Outline Before moving on to spell out answers, the alphabet of dynamical systems must be introduced. The basic components of a dynamical system – state variables, phase space, rules, and parameters – are introduced in Sect. 5.2.1. Then, I contrive a description of a gesture as coupled dynamical systems, to show how dynamical systems smoothly integrate levels (Sect. 5.3). Within that description, I present an example of how convergence transforms variability into constancy, with a description of the transmission of action potentials at the neuromuscular junction (Sect. 5.4.1). Following that, I present some historical facts leading to the Hodgkin–Huxley model of the action potential (Sect. 6.1.1), and show (1) how a stochastic description of cellular processes may be reduced to a deterministic model under mild assumptions and (2) how variation of parameters representing ionic exchanges begets categories of action potentials, leading to an important class of explanatory reduction, that of convergence from parameter variation to categories of dynamical behavior (Sect. 6.1.2). Finally, I introduce our network model of choice, the recurrent neural network, in relation to the properties of the model and its assumptions (Sect. 6.3). In Part II, I will continue the exposition within the context of the neurodynamics of discrete-time recurrent neural networks.
5.2 Dynamical Systems Vocabulary I have no doubt that a totally reductionistic but incomprehensible explanation of the brain exists; the problem is how to translate it into a language we ourselves can fathom. Douglas Hofstadter, Gödel, Escher, Bach (p. 714)
5.2.1 Dynamical Systems Basic Terminology The basic components of a dynamical system are state variables, phase space, rules, and parameters.
5.2.1.1 State and State Variables In dynamical systems theory, a dynamical system is a theoretical expression of the fact that, for a given subpart of reality (a system), its tendencies are a function of where the system is, plus how it evolves (its dynamics). This is to say that a subsequent state is causally dependent on the previous state, plus its potentialities (transformation rules). Where the system is invokes the notion of state, and how the system evolves is about rules. According to the 1913 edition of Webster’s Dictionary, “state” is appropriately defined as the “circumstances or condition of a being or thing at any given time.” The definition can be applied ipsis literis when the circumstances and the condition of beings and things can be described through measurable variables. Everything that can be quantified within a given system may become a variable representative of the state. And by extension, every variable that can be quantified is also, in principle, relatable to a certain law, or rule, expressing the connection between state variables at subsequent moments.
State Variables May Take Many Forms State variables may take many forms because a phenomenon can be described from many angles. They may be any, or a combination, of the following:
1. They may be measurable variables from experiment (e.g., the membrane potential of a neuron, activation in a brain area).
2. They may be abstract variables, which represent a cluster of variables congregating many events (e.g., the birth rate in a predator–prey model; the birth rate is dependent, presumably, on the reproductive encounters of members, but we subsume this under the all-encompassing birth rate).
3. They may also represent altogether unknown variables, such as hidden variables, hypothesized to play a role in the phenomenon despite not being known.
4. They may stand for stochastic variables (e.g., the probability of opening of an ion channel, or of neurotransmitter vesicle release).
This list is not exhaustive, but gives a flavor. In principle, any dimension is quantifiable, although some dimensions will be more meaningful than others.
State Variables Describing a Phenomenon Give the State a Unique Identity State variables describing a phenomenon give the state a unique identity, at least ideally, so this knowledge permits full reconstruction of a snapshot of the system. State variables come in two main sorts: continuous, when the variable may take any real value (e.g., space dimensions), and discrete, when the variable takes one of a finite number of values. Furthermore, states may be represented at different levels of abstraction. For example, a stochastic phenomenon may be represented by an abstract state variable, representing many stochastic events. Such is the case with variables representing ion conductivity, which abstract the responses of large numbers of ion channels into a membrane conductivity for each ion. Similarly, some neural models represent the activity of a population abstractly, often analogized to firing rates. State variables may also represent singular events, as in spiking neural networks, where an action potential is abstracted into a punctual event (a Dirac delta function). As we see, the choice of state variables is the crux; it determines how the theory views a phenomenon, and what the theory can talk about.
State Variables Are the Nouns of a Dynamical Systems Language Selection, therefore, becomes preponderant. Two aspects should be held in mind. First, that a balance should be kept between the number of variables in a model and the explanatory power they provide. Second, akin to conclusions on empirical assessments (Sect. 3.1), because any answer of a particular model is provided in terms of the state variables, everything beyond them is invisible. An artful selection of state variables is, therefore, central to modeling.
5.2.1.2 Orbits and Phase Space The space formed by state variables is denominated phase space. A combination of state variables describing a system at a certain time is a point in phase space. One may combine different state variables to produce different views of the phase space. One may also define subsets of phase space, which are essentially projections of the whole system, in the subspace of choice.
If one records how the state variables change in time, or as a function of another variable, one may plot a trajectory representing the sequential history of states of a given system. The sequence of states is called an orbit. In dynamical systems models, the focus is on tendencies of the state variables, so the study of orbits in phase space is a central tool.
5.2.1.3 Rules In observing a phenomenon, we may deduce a rule that embodies relationships between measurables. Likewise, abstracted state variables influence each other, and in that case we design rules that specify how interdependent variables change. So, applying knowledge about how the variables in a system are interrelated,1 we construct a set of rules (or laws) that take a given state into the future. The rules, often equations, governing the course of the system may be (1) deduced via modeling the results of experiment, (2) analytically derived, or (3) a mixture of both. Generally, a system composed of rules operating on states to produce a subsequent state is a dynamical system. Rules may be applied recursively, producing sequences of states, or time courses of variables, called orbits. As with state variables, the rules of a dynamical system also take different forms. They might be modeled by differential or difference equations, finite state automata (rules), or even lattices and actions on partially ordered sets. In all these cases, the primary focus is on the temporal development of the system; how states become sequences. Temporality is inherent in the rules, doing away with the need for extraneous introduction of time.2 Mathematically, a dynamical system is a map, taking a selection of state variables to the subsequent state, according to rules, or equations. If V is a vector of state variables, representing a point in phase space, and F is some function which takes a state to the next state, a dynamical system is represented as

V ↦ F(V).

If the dynamical system has an explicit dependence on time, it is called nonautonomous and is written

V ↦ F(V, t);

such a system may be rewritten as autonomous by including time as an additional state variable. In F are represented both the structure of a system and the laws which operate on it. In a mechanical system, for example, F includes the intrinsic properties of masses
1 Causes in a model are abstracted from the phenomenon at a particular level of analysis, and may not reflect the exact interactions of matter, but interesting slices of the phenomenon.
2 This is not inherently the case, for instance, in information theory approaches, nor is it for symbolic computation (good old-fashioned artificial intelligence) approaches.
and objects (inertia, momentum), their relational properties (how the masses interact, e.g., springs, rigid links), and laws of motion. F operating on a state generates the sequence of states (an orbit). In the particular case of neural network models, F would stand for the structure of the network (connectivity), intrinsic parameters (e.g., recovery variables), and how activity is propagated (e.g., delays, capacitances). Through F the state would be taken into the future. Ideally, F would include all the factors necessary to unambiguously calculate the subsequent state (though this ideal brings complications that may render the model analytically intractable).
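To fix ideas, the map V ↦ F(V) can be written down directly for the kind of discrete-time recurrent network used later in the book. The following is only a minimal sketch; the connectivity, biases, and initial condition are arbitrary choices, not a model of anything in particular.

```python
import numpy as np

# A dynamical system as a map V -> F(V): here F is the update rule of a tiny
# discrete-time recurrent neural network; the connectivity is part of the rule's structure.
W = np.array([[0.0, -4.0],
              [4.0,  0.0]])          # connection weights (parameters), arbitrary
theta = np.array([0.5, -0.5])        # biases (parameters), arbitrary

def F(v):
    """The rule: takes the current state to the subsequent state."""
    return np.tanh(W @ v + theta)

def orbit(v0, steps):
    """Iterate the map from an initial condition, recording the sequence of states."""
    states = [np.asarray(v0, dtype=float)]
    for _ in range(steps):
        states.append(F(states[-1]))
    return np.array(states)

traj = orbit([0.1, 0.0], 200)
print(traj[-5:])   # tail of the orbit; with these weights it settles into an oscillation
```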
5.2.1.4 Parameters So, in a dynamical systems model, knowledge about the phenomenon enters in three forms: state variables, laws (rules, equations), and the as yet unmentioned parameters. In a dynamical system, parameters represent something about the structure of the system. They may fall into one or more of the following categories:
Material. They describe intrinsic properties of the matter of the system, such as physical and biophysical properties (e.g., conductance, capacitance, rigidity) or biochemical properties (e.g., rate of reaction).
Structural. They reflect some structural characteristics of the system, such as how components are interrelated. Structural parameters may result from the ways in which two systems couple and interact, so the system has an influence on the parameters, but on a slower timescale (e.g., the efficacy of a synapse, or inputs to a coupled system).
Independent. They reflect factors extrinsic to the system, such as environmental variables (e.g., light, wind, temperature), which may be input to the system. Strictly speaking, they are things that the system is influenced by, but has little influence on, i.e., to which it is loosely coupled.
Amalgamated. They may represent amalgamated dependencies, which are assumed to act in combination. In this case, parameters are abstract. In a predator–prey model (an ecological model of species interdependence), there is a parameter representing death rate, agglomerating knowledge about, e.g., longevity or vulnerability. Also, the more abstract neuronal models become, the more their parameters belong to this category (not exclusively). In neuronal models, "synaptic efficacies" often fall into this category.
In dynamical systems models it is often assumed that the timescale of change of a parameter is slower than that of the state variables. This is due primarily to analytical reasons (if parameters change as fast as state variables, they are state variables). Parameters are assumed to describe slower properties of a phenomenon, so the change of parameters has no bearing on the short-term dynamics. Consider, for example, the flow of water from a faucet. Irrespective of whether the gauge is being opened or closed (and the cross section through which the water flows increases or decreases), the immediate flow is dependent only on the section area opened. The state variable
“flow” is dependent on the parameter “gauge section,” but not on “rate of change of gauge section,” a slow change of parameters. Similarly, the immediate ion flow through a membrane is dependent on the immediate number of opened and closed channels, but not on whether they are closing or opening. The proportion of channels is a parameter, the ion flow is the state variable.
Parameters in Networks In a network model, parameters may be the synaptic strengths between neurons, axonal delays, and neurotransmitter production and release. Parameters may also be the sensory input, if the dynamics of the network is much faster than that of the input. In this case, parameters may result from coupling with an environment, or with a secondary system. Admittedly, "parameters as slow-changing variables" is a somewhat fuzzy description, and in fact much depends on the definition of what are the parameters and what are the state variables. The issue is, of course, the rate of change. I will come back to this later; for now it suffices to say that what moves fast is a variable and what moves slowly or does not move is a parameter.
A Choice of Rules, Variables, and Parameters Instantiates One Dynamical System The parameters themselves, however, are not produced by the model. Parameters are chosen to define a particular system from a set, or to study the space of possible systems (by changing some parameters while keeping others constant; this is denominated "parameter space analysis"). With parameter space analysis one may look at the impact of different parameters (and their combinations) on the behavior of the dynamical system.3
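As a concrete illustration of parameter space analysis, the sketch below reuses the small recurrent map from the previous example and sweeps a single coupling parameter, classifying the asymptotic behavior crudely by how much the late orbit still moves. Both the sweep range and the classification criterion are arbitrary choices made for the illustration.

```python
import numpy as np

def simulate(coupling, steps=1000, transient=500):
    """Iterate a 2-unit recurrent map for a given coupling strength (the parameter)."""
    W = np.array([[0.0, -coupling], [coupling, 0.0]])
    theta = np.array([0.5, -0.5])
    v = np.array([0.1, 0.0])
    tail = []
    for t in range(steps):
        v = np.tanh(W @ v + theta)
        if t >= transient:
            tail.append(v.copy())
    return np.array(tail)

# Parameter space analysis: vary the coupling, keep everything else constant,
# and record a crude signature of the asymptotic dynamics.
for c in [0.5, 1.0, 2.0, 4.0, 8.0]:
    spread = simulate(c).std(axis=0).max()   # ~0 for a fixed point, >0 for oscillations
    print(f"coupling = {c:4.1f}   spread of the late orbit = {spread:.3f}")
```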
5.2.2 Coupled Dynamical Systems and Interfaces With the necessary terminology in place, we now provide a description of how dynamical systems overlay multiple levels of organismic phenomena. The scope of the
3 One word about the role of parameters in a dynamical systems model. An often-voiced criticism has been that "given enough parameters, a model dances and sings (does anything)." Or "give me enough variables and I will create a green elephant"; the humorous comment is only appropriate on occasion, most often when directed towards models fitting curves. But in the case of dynamical systems, it is misapplied, a gross canard. For parameters in dynamical systems are very informative indeed. For one thing, they have the power of talking back to the phenomenon, contingent on the level of detail of the model. When a set of parameters that causes a certain model to behave as expected is found, it also says more about the phenomenon itself. Furthermore, qualitative analysis of dynamics based on parameter sets, as we will see in Sect. 6.1.2, informs about a wide breadth of dynamical behavior.
dynamical systems perspective appears as one connects phenomena at different levels of magnification, both spatial and temporal. Dynamical systems are unique in that they may compose a holistic framework in what may be called coupled dynamical systems, in which the outcomes of one dynamical system produce what another takes in, thereby connecting levels of explanation. An example of this is given in the next section, spelling out some of the implications of level crossing (more general considerations are found in the addendum at the end of Chap. 6). In Sect. 5.4, I describe an example of coupled dynamical systems – the transmission of the action potential at the neuromuscular junction – highlighting the kinds of reductions dynamical systems modeling incurs. This section has the objective of showing how constancy arises from variability through the aliasing of inertial effects in motor actions. Using the conclusions of the section concerning modeling as a hook, I will subsequently present different dynamical systems models of neurons and networks, in order of increasing abstraction, from the Hodgkin–Huxley model to recurrent neural networks.
5.3 A Motor Act, as Coupled Dynamical Systems The essential service of muscle to life is, quickly and reversibly, to shorten so as to pull. Its shortening is called "contraction". The importance of muscular contraction to us can be stated by saying that all man can do is to move things, and this muscular contraction is his sole means thereto. Charles Sherrington, The Man on His Nature
In a Descartian universe, one could define an organism by all its atoms and their interactions, and, in principle, simulate all of what the organism will do. It does not appear very likely, though, that this level of accuracy in description will be explanatory or will ever be reached. What we may do instead is to outline a subset of the observed phenomenon in some meaningful, albeit simplified, subparts more amenable to description and analysis. A gesture may be described as a collection of coupled dynamical systems, after a selection of a number of interrelated levels of analysis and the abduction4 of their respective variables. The idea is that coupled dynamical systems may constitute a whole picture of the system, where the outputs of one level are coupled to the inputs of the other. Concisely, one level interfaces the next. Dynamical systems are coupled where separable levels of explanation and description are presumed. The act of selecting a certain level with its associated variables and rules, as in any exercise of modeling, implies sacrificing and eclipsing variables at levels
4 Abduction is the process of selection of sorts (state variables, parameters, rules) of a phenomenon to construct a model.
above and below. This is equivalent to adopting a certain magnification factor in a microscope, in which selecting any one magnification causes the others to go out of focus or out of sight. Despite much philosophy of science, skeptical of the generality of conclusions because of this necessary neglect, one need not have compunctions about the act of abduction. For although the conjugated levels above and below might have disappeared from sight, they must not have faded from memory. In the process of amplifying the resolution, one encounters first skin, then cell, then membrane, and then molecules, but knows, through the very process of magnification, that these structures are aspects of the same thing. Coupled dynamical systems for the explanation of organisms should be regarded in a similar manner, as integral parts of the same phenomenon, which are differentially magnified to highlight processes at different levels. It is therefore possible to describe organisms (if not experimentally) at least analytically as a set of coupled dynamical systems. Let us consider one such contraption, of the organism as coupled dynamical systems, in terms of control. To be more specific, let us describe the production of some limb motion in terms of coupled dynamical systems. Even simple motions such as hand waving exhibit the fractal complexity of biology. A description of a gesture may then arbitrarily start at the body level, with the mechanics of the body and its physical constants. The arm's masses and skeletal constraints are internal, and are subject to the external influences of gravity, temperature, etc. We may choose as state variables the angles at the joints, the distribution of mass of the body, and the forces applied to the muscles. External variables are made parameters, assumed constant with respect to our level of explanation, over a short time window. Then, the postures of the body result from forces operating on the muscles, constrained by the neuromuscular constitution. With that, it becomes possible to define a dynamical system that will allow prediction about the immediate future of posture, contingent on a theory of the generation of muscle force.5 So we depart to the next level, which regulates the forces applied in the muscles. The level responsible for the generation of muscle force is both faster and smaller than that of the muscle–skeletal constraints of the mechanical level. That is, the level of coalitions of muscle cells forming muscle fibers.6 Combined contraction of a large number of individual muscle cells generates muscle force. A muscle cell can be described as a dynamical system, and at this level an arm motion involves groups of cells, muscle fibers, whose contraction force at a given time is a function of a number of variables, including the cell skeleton, the length of the cell, its elastic properties, its modulatory influences, and, preponderantly, the number of action potentials landing on the cell. These properties become
5 We know of other contributions to the muscle forces, such as bodily factors (e.g., dopamine controls muscle tonicity), whose influence may be modeled as modulatory. These factors can be modeled as parameters which may interface levels. For example, dopamine modulation of the force on a muscle may be modeled as a scaling factor of muscle-cell contraction.
6 This level is not satisfactory for longer intervals, for the forces on the muscles change nonlinearly with a number of extra factors, such as neurotransmitters, and the energy sources available to the contracting cells.
variables and parameters and rules of a dynamical system. In a coupled dynamical system, outcomes of this level propagate to the skeletal level, where the actions of individual muscle fibers will cause motion. The skeletal level (and other mechanical constraints) does its own sort of computing, an embodied computing. The neurons that project to and from muscle cells, motor neurons and proprioceptors, integrate the neural and the motor level in a closed system. In mammals, further simplification is possible, since a motor neuron innervates only one muscle fiber (except during development!). The motor neurons go up the spinal cord and eventually reach higher networks in the cortex and cerebellum, forging networks of interconnections at synaptic terminals. The neurons composing these networks admit a description as dynamical systems both at the individual and at the network level. At this stage, the model will describe variables such as the activity of the networks of neurons, and would be parameterized by its connectivity, and by the inputs from the body, called proprioceptive. The inputs are transduced information about joint angles and velocities. With this step, we complete a reentrant circle. Information about joint angles reenters, as transduced information, and we find that there is a dependency between what the network will tell the muscle to do and what it is already doing. Higher networks integrate information about the current posture with the permanently ongoing activity. Coupling neurons as dynamical systems with the networks as dynamical systems and with the muscles as dynamical systems integrates the whole (see Fig. 5.1). The production of a gesture is thereby covered, because each of these stages may be
Fig. 5.1 The levels of a gesture as coupled dynamical systems. A depiction of how dynamical systems may be coupled, and how the variables of one level are defined by the neighboring levels. Arrows represent parameterized dependencies between coupled levels; rectangles represent a chain of levels that is recurrent. Two reentrances are identified here, based on this particular choice of selection; other models may add different variables as well as change the interfaces. [Figure: the levels of a motor act – body physics (skeletal constraints, mass distribution, body posture, joint angles), muscle contraction (muscle forces, single-cell contraction forces, number of cells, architecture of the muscle fiber), cell contraction (the cell's biochemical and biophysical state, innervation patterns), network activity (activity of networks, network connectivity, incoming activity), and neuron properties – with muscle and joint state re-entering the networks, transduced through proprioception.]
parameterized at the interfaces of neighboring systems. The importance of these interfaces cannot be overstressed. Not only because they provide boundary conditions for the neighboring levels, where one level constrains the other, but also because a complete cycle is closed, from cells through the body, to the environment, and back. This circularity is a strength, not a fault, for it remains true to the cyclic character of the phenomenon itself. Noticeably, the exercise of modeling and outlining subsystems unavoidably commits one to a selection of what are state variables, what are parameters, and what is to be neglected. Indeed, one of the great strengths of the method is that it compels one to make assumptions explicit, also about exclusions. On these two pillars, mindful selection and explicit assumptions, rests the power of dynamical systems modeling. "The art of good theory is to select the variables that really matter", as Arbib (p. 58 in [2]) put it. The idea is to select those variables that are explanatory about the phenomenon, with the least overhead. Doing this for coupled levels, one may study the individual dynamical systems representing slices of the phenomenon, knowing in advance how to reintegrate them.
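Purely to make the bookkeeping of such a coupling concrete, the following sketch chains three toy levels – a small recurrent network, a muscle whose force low-pass filters the network's motor output, and a one-joint "arm" whose angle is fed back as proprioception. Every constant and every coupling term is an arbitrary stand-in invented for the illustration; the point is only the shape of the loop, with the outputs of one level entering the next and the joint angle re-entering the network.

```python
import numpy as np

# Three coupled levels: network -> muscle force -> joint angle -> (proprioception) -> network
dt = 0.01                                   # integration step for the body levels (s)
W = np.array([[0.0, -2.0], [2.0, 0.0]])     # network connectivity (parameters)
w_prop = np.array([1.5, -1.5])              # weights of the proprioceptive input
v = np.array([0.1, 0.0])                    # network state
force, angle, vel = 0.0, 0.0, 0.0           # muscle and joint state variables

angles = []
for step in range(2000):
    # neural level: a discrete-time recurrent network driven by the joint angle
    v = np.tanh(W @ v + w_prop * angle)
    # muscle level: force relaxes towards the network's motor output (smoothing fast activity)
    force += dt * (v[0] - force) / 0.1
    # body level: a damped joint driven by the muscle force and an elastic restoring term
    acc = 5.0 * force - 2.0 * vel - 10.0 * angle
    vel += dt * acc
    angle += dt * vel
    angles.append(angle)

print(min(angles), max(angles))   # the "gesture": the joint moves within a bounded range
```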
5.3.1 Integrating Levels To integrate sets of neighboring levels becomes crucial for any theory of organismic function, since an organism is best described as a collection of interdependent levels. In the study of behavior, three levels are preponderant: the body, the world, and the brain. Similarly, to study organismic function, all the dynamics are important: the world's, the body's, and the neural structures'. So, any theory that addresses problems of the organism has to cope with at least all three. I described roughly how dynamical systems theory handles coupled levels when describing one small aspect of the motor system. The following becomes apparent:
Selection of levels and modeling. The exercise of modeling coupled dynamical systems is an observer-based selection of levels, and of functions associated with those levels.
– No single-level description can account for all behavioral phenomena.
– Levels can cross scales in time and in space.
– When levels are coupled, they also constrain each other. Interfaces provide boundary conditions.
– Levels may be tightly coupled or loosely coupled.
– At interfaces, reentrant information is transduced.
Embodied computation. Some computation is done by the mechanical properties of the body.
The organism as a system exists across scales and levels. It exists as a whole from atoms, to cells, to the body. By slicing the phenomenon and creating interfaces, it may appear that this fact has been neglected. But, paradoxically, the assumption of interrelated levels of description brings about a stance of independence. Different levels can be analyzed individually, and reintegrated through the previously defined interfaces,7 obeying the constraints imposed by the neighboring levels. It is perhaps ironic that it is through the selection of levels that a more holistic understanding is achieved.
5.4 Convergence in the Neuromuscular Junction The following sections, presenting a description of how motor commands are conveyed, are primarily based on a critical reading of [1]. Here I frame the biology of the neuromuscular junction as a dynamical process that may be deterministic at its origin, but that, owing to large numbers, invokes stochastic descriptions. The example will show two instances of how variabilities coalesce into meaningful averages. First, variation in ion channels can be reduced to the dynamics of the membrane potential; then, individual spike trains are aliased into firing rates, in the loose coupling between spiking neurons and inertial, elastic muscles, producing contraction forces. These two are modeling reductions that follow level crossings in reality, which massively reduces the dimensionality of neural problems. To show these points, I will rely on another example, which will compose a conflating picture with the ones before: the description of the transmission of an action potential at a neuromuscular junction as a process with myriad stochastic components that may be modeled deterministically. It is in Part II that I will show how these points conflate, within the paradigm of evolutionary robotics research combined with neurodynamics, where we will model the origins of invariance.
5.4.1 Convergent Level Crossing 5.4.1.1 Stochasticity and Determinism Lead to a Many-to-One Mapping Between Neural Activity and Motor Action What follows is a description of our current knowledge of the neuroscience of the synaptic transmission, from the motor neurons to muscle cells, at the neuromuscular 7
The experiments to come will draw heavily on the idea of coupled dynamics, and therefore I went to lengths to devise how to cover a natural system with coupled dynamical systems. The description was purely conceptual, but it is essentially the same kind of abstraction we employ when building models of behavior. Conclusions about the role and the forms of exchanges at interfaces is where much headway can be made in understanding the interplay between constancy and variability.
junction. The transmission is the result of a complex sequence of physicochemical events, very often based on probabilistic descriptions, accounting for the untraceable variability of nanoscopic events. The events leading to transmission of the activating potential, which is mediated by the neurotransmitter acetylcholine, are thought to happen roughly as follows. On the presynaptic side, the arrival of the action potential at the synaptic terminals of the motor neuron (step 1) creates electrotonic differences at the cell membrane (step 2), which cause the calcium channels to open (step 3), allowing inward flow of positive calcium ions from the interstitial medium to the cytoplasm (step 4), which starts a cascade of protein bindings (secondary cascade) (step 5), subsequently releasing vesicles of acetylcholine (produced intracellularly) into the synaptic cleft (the space between neuron and muscle cell) (step 6). On the postsynaptic side, the process resumes with the uptake of the acetylcholine released from the vesicles (step 7), which binds to receptors at the membrane's sodium and potassium channels (step 8), promoting the exchange of ions (Na+ in, K+ out) (step 9), leading to a postsynaptic potential (step 10), which spreads internally in the cell and the transverse tubules (step 11), causing the release of calcium ions from the sarcoplasmic reticulum (step 12), which finally initiates the mechanical process of contraction in the myofibrils (step 13), where energy is used, mostly in the form of conversion of ATP to ADP. In all these steps we find a stochastic description of a large number of interactions that generates a mean with a dynamical time course. The individual events are not retrievable. We are unable to follow a single neurotransmitter vesicle as it crosses the synaptic cleft, or the precise number of vesicles taken up (at steps 6 and 7). We cannot precisely determine when an individual ion channel undergoes structural conformations (or what kind of conformation), or binding events (at steps 3, 5, 8, and 12). We cannot retrieve the nanoscopic, nanosecond pattern of electrotonic potentials around the membrane, or the exact number of exchanged ions (at steps 4, 9, and 11). But although we have a mass of unknowns, we can make claims about the reality of the minute individual interactions. The operating principles are the rates of molecular diffusion, the density and response of receptors, the density of released vesicles, and so on; owing to the coupling that ensues from these processes, they also admit a statistical theory of the tendencies of the averages representing them. And ultimately they may be introduced into a model that will exhibit similar qualitative dynamics, despite being agnostic about the individual formative processes (e.g., ion diffusion, protein conformation). Deep down reality exists, but at this level the model is independent of it.
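The reduction from myriad stochastic events to a deterministic mean can be illustrated with the simplest possible case: a population of independent two-state channels, each opening and closing at random, whose average open fraction follows a smooth deterministic time course. The rates, numbers, and time step below are arbitrary; this is a caricature of the level crossing, not a model of the neuromuscular junction.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two-state channels: closed -> open at rate alpha, open -> closed at rate beta.
alpha, beta = 5.0, 2.0            # per ms, arbitrary
dt, T = 0.01, 5.0                 # time step and duration (ms)
steps = int(T / dt)
N = 10000                         # number of channels: the "large numbers"

open_state = np.zeros(N, dtype=bool)      # all channels start closed
m = 0.0                                   # deterministic open fraction
frac_stoch = np.empty(steps)
frac_det = np.empty(steps)

for k in range(steps):
    # stochastic level: each channel flips with the appropriate probability within dt
    r = rng.random(N)
    open_state = np.where(open_state, r >= beta * dt, r < alpha * dt)
    frac_stoch[k] = open_state.mean()
    # deterministic level: dm/dt = alpha (1 - m) - beta m, a Hodgkin-Huxley-style gate
    m += dt * (alpha * (1.0 - m) - beta * m)
    frac_det[k] = m

print(abs(frac_stoch[-1] - frac_det[-1]))   # small: the mean tracks the deterministic model
```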
5.4.1.2 Convergence at the Muscle Fiber An understanding of convergences and divergences in level crossing leads to a comprehension of the dynamics of neural commands. As an example, I describe macroscopic muscle contraction from the perspective of an incoming spike train. The simple but crucial point is that there are many different spike trains that will produce indistinguishable contractions. This is due to an aliasing effect that smoothens
high frequencies as they divergently impact on the muscle. This effect implies a broad class of spike trains that, from a functional perspective, are equivalent. We start with the arrival of an action potential from a motor neuron at a muscle fiber. Because of the axonal arborization coming onto muscle fibers, motor neurons discharge on many muscle cells, on many synaptic terminals simultaneously, in a divergent manner. In mammals it is the rule that one muscle fiber is innervated by only one motor neuron (but a motor neuron innervates many fibers). The total contraction force of the muscle fiber is a function of the number of muscle cells that respond to the arriving action potentials by depolarization (discharge, as described in the previous section). How, and how many, cells are going to respond at any given arrival is dependent on the state of the receiving muscle cells. For instance, some cells may be refractory from former discharges, whereas some cells respond with high intensity. Although the individual cell response will be unpredictable, it will conform to its biomechanical laws in terms of maximum contraction, refractory period, energetic constraints such as the conversion of ATP to ADP in the contraction tubules, other metabolic processes, elasticity of the cell, and so on. So the differential contraction force is a function of the number of muscle cells that respond to the incoming signal, and of how they respond. Because there are many cells, their moments of activity will overlap. This smoothens the contraction force in the fiber, irrespective of exactly which of the individual cells are active. From the perspective of the muscle fiber contraction, all cells are alike. It does not matter whether cell A or cell B contracts. What matters is the overall contraction force in the muscle. All that each cell does is contract a little. Therefore, although each individual cell has its own likelihood of discharge, dependent on intrinsic variables and the recent past, the contraction force in the muscle in a given time window is strictly a function of the contraction of the individual cells responding to incoming activity. This effect, which smoothens high-frequency variation, is denominated aliasing. So, the overall contraction force of a muscle fiber results from a large number of incoming action potentials diverging onto the fiber. In this manner, the sequence of muscle cells activated is irrelevant, because the muscle acts as a whole. Because the cell is a system regimented by physical laws, the bounds of the variation are well determined. Therefore, it is not necessary to have a deterministic model of the activation of the individual cell, with all the possible state variables (e.g., the state of every individual ion channel). Nothing is sacrificed in the assumption that the total contraction is purely a function of the incoming action potentials plus mechanical state variables describing the receiving muscle and cells. And indeed, what is true for the muscle fiber is also true of the muscle as a whole. The muscle is formed by many muscle fibers, which may vary in length, width, or elasticity, but in any case will act as a coalition. The same argument holds for torsional forces and more complicated muscle fiber arrangements, which, despite introducing more biomechanical constraints, will also have incoming spike trains that are equivalent with respect to the overall forces they generate.
To control functional behavior, the neural system can select from a vast class of equivalent spike trains. The equivalence is due to convergence onto the muscle fiber. In convergence, variability is aliased into constancy (see Fig. 5.2).
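To make the equivalence concrete, consider a minimal numerical sketch (all parameters here are hypothetical and chosen only for illustration, not taken from any cited study): two spike trains with the same number of spikes per window but different fine timing drive a simple low-pass "muscle", and the window-averaged contraction forces come out nearly identical.

import numpy as np

# Toy illustration (hypothetical parameters): two spike trains that share the
# same number of spikes per window, but differ in fine timing, drive a simple
# low-pass "muscle" to nearly the same window-averaged contraction force.

dt, T = 0.001, 1.0                      # 1-ms steps, 1 s of activity
t = np.arange(0.0, T, dt)
rng = np.random.default_rng(0)

def binary_train(times):
    s = np.zeros_like(t)
    s[(np.asarray(times) / dt).astype(int)] = 1.0
    return s

regular = np.arange(0.02, T, 0.025)                               # 40 Hz train
jittered = np.clip(regular + rng.uniform(-0.005, 0.005, regular.size),
                   0.0, T - dt)                                   # +/- 5 ms jitter
a, b = binary_train(regular), binary_train(jittered)

# Overlapping twitches of many muscle cells act like exponential smoothing
tau = 0.1
kernel = np.exp(-np.arange(0.0, 0.5, dt) / tau)
force_a = np.convolve(a, kernel)[:t.size]
force_b = np.convolve(b, kernel)[:t.size]

# Average force in 50-ms windows: the level the argument in the text cares about
win = 50
avg_a = force_a[:t.size // win * win].reshape(-1, win).mean(axis=1)
avg_b = force_b[:t.size // win * win].reshape(-1, win).mean(axis=1)

print("max relative difference of window-averaged force:",
      np.max(np.abs(avg_a - avg_b)) / np.max(avg_a))

On this toy model the window-averaged forces typically differ by only a few percent even though almost no spike times coincide, which is the sense in which the two trains belong to the same equivalence class.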
Fig. 5.2 Convergence to muscle fibers implies equivalence of spike trains. (The panels trace the path from cortical neurons and the spinal cord, through motor neuron spike trains and dendritic arbors spanning many dimensions, to the muscle cells of a single fiber, and finally to a single dimension: muscle force F, i.e., muscle tone, as a function of the number of spikes arriving in a time window.)
Rate Codes and Convergence
This argument is, of course, equivalent to saying that motor action is controlled by a rate code. But it also accounts for the paradox that the spike times of incoming trains are rather precise. Rate code is a good description only because aliasing occurs at the muscle. The dichotomy between rate and temporal coding is decided with respect to the receiving structures, that is, the muscle ensemble. In the case of muscles, spike trains have equivalent rates. This does not preclude that, at the level of production of spike trains, the spike trains may be unique; only that, within some bounds (the set of spike trains equivalent with respect to muscle tone), spatial and temporal variability converge to constancy in muscle tone and action. Different spike trains leading to similar muscle tone are not redundant, but are selected from among the set of equivalent spike trains. The organization of firing takes into account the structure of the system, which intrinsically defines equivalent sets of spiking patterns. That is to say that the meaning of "incoming" is resolved in terms of the receiving structures (see also Sect. 10.2.1).
5.5 Conclusion

5.5.1 Convergence and Level Crossing

The contraction force on a muscle can therefore be satisfactorily modeled by the incoming activity, within a time window, on a single muscle fiber, significant simplifications notwithstanding.8 Moreover, owing to aliasing effects, many incoming spike trains will be, from the perspective of contraction force, equivalent. This is the same as saying that muscle contraction is invariant with respect to a wide class of incoming spike trains, because many distinct spike trains cause equivalent muscle contractions (some studies in insects have been performed by Hopper). As the explanation of invariants in behavior evolves, it will become clear how the theory depends on this type of conclusion. To put it concisely: for certain systems, variations at one level become constancies at the other. The divergent spike trains converging on a muscle fiber effect a reduction of dimensionality.
8 Detail about the modeled state variables and rules can be added, of course, ad libitum. But the amount of detail introduced should be governed by knowledge gain. If our level of abstraction is that of the overall contraction force in a small time window, other variables can be excluded, such as the production and release of neurotransmitters, or neuromodulators such as noradrenaline (norepinephrine). We abstract away neuromodulators, synaptic alterations (facilitation, depression), and the production of neurotransmitters to better comprehend how motor commands are conveyed. The explanations of motor behavior, even after this blunt abstraction, are not impaired but strengthened, as they introduce further constraints for equivalence classes of neural behavior.
Level interfaces are places of divergence and convergence, where constancy and variability meet. In the next chapter, I discuss another example of this, where level crossing implies reduction of dimensions, as parameter domains of the neuron beget distinctive activity categories.
5.6 Summary

Dynamical systems models (Sect. 5.2.1). State variables are variables unambiguously representing the state of a system. Rules are applied to the state and take it to the future. Parameters are aspects of the system that change slowly with respect to the state variables.

Coupled dynamical systems (Sect. 5.2.2). Coupled dynamical systems cover phenomena at several levels and scales simultaneously. They highlight interfaces, as they enforce explicit assumptions about level crossings.

Level crossing (Sect. 5.4). The neuron–muscle interface exemplifies how large numbers (e.g., of ion channels) coalesce into categories of dynamics. Convergence at the muscle fiber implies equivalence classes of spike trains.
Chapter 6
Neurons, Models, and Invariants
Abstract The theory of dynamical systems is presented as an integrative theory, as it offers a consistent method to cross levels and scales. Behavior happens in recurrent loops where what happens next is what happened before plus how things work. Dynamical systems theory abstracts organisms into state variables, rules, and parameters. Analysis of structure and parameters can be done both analytically and computationally to address the invariances of a modeled system. Dynamical systems analysis reveals new facets of the phenomena modeled, and may effectively lead to powerful reductions. The Hodgkin–Huxley model of the action potential is presented as a prototypical example of dynamical systems analysis applied to neurons. The Hodgkin–Huxley model illustrates several interesting aspects of action potential generation, for example, how constancy in the action potentials appears despite variability in ion channel distribution. Effectively, dynamical systems modeling demonstrates how a phenomenon can be analyzed with respect to its parameters and structure, and helps resolve the dichotomy between constancy and variability through mappings between parameter domains and dynamical varieties.
6.1 Neurons to Models

Now we go from single biological neurons to model neurons to network models. This section describes the neuron as the paradigmatic example of the dynamical systems view of self-organization, as myriad parameters organize into a finite number of equilibrium states. I begin with a historical account of the facts leading to the model that brought dynamical systems into neuroscience, the Hodgkin–Huxley model. The Hodgkin–Huxley model embodies one of the core lines of argument here: how variability (of parameters) becomes constancy (of dynamical behavior), both in theory and in reality. Following that, other models will be presented, in a direction of increasing abstraction, up until recurrent neural networks, the model used for the experiments described in Part II. Ironically, in its extreme simplicity, this model shows the other direction of the arrow: how divergent processes transmute constancy into variability. We begin with history.
6.1.1 The History of the Models of the Action Potential

6.1.1.1 From Galvani to Hodgkin and Huxley

In the years between 1791 and 1798, Galvani showed that the nervous system responds to electrical signals by making the limbs of a dead frog twitch. Emil du Bois-Reymond, building on the discoveries of Galvani, and also of Humboldt (du Bois-Reymond's eminent supervisor), in attempting to replicate results obtained by Matteucci, discovered the flow of nerve energy, what we now call the action potential [5]. Julius Bernstein [1, 2] hypothesized that this energy is produced across the cell membrane, by exchange of ions, and was the first to use the Nernst equation to determine the equilibrium potential of ions across the membrane. As early as 1907, Lapicque hypothesized that the action potential is emitted when a threshold potential has been reached [26]. Culminating with Santiago Ramon y Cajal's demonstration that the nervous system is composed of individual units, a picture was revealed in which the control of higher organisms results from an immense number of electrically active cells acting as middlemen between perception and behavior.
6.1.1.2 Voltage Clamp Revealed Smooth Cross-Membrane Conductance

In 1952, Hodgkin and Huxley, in a set of papers that would eventually give birth to computational neuroscience, showed that the action potential is generated by changes of ionic conductances across the membrane, and that the changes in conductance are a function of the membrane potential [16]. With empirical data from the squid's giant axon, measured with the voltage clamp (not the patch clamp, which came later), and guided by the hypothesis of membrane-dependent ion conductances, they fit the measured data to a set of equations according to a method developed by Hartree [14]. The equations soon became standard, and are likely to be responsible for the introduction of dynamical systems into neuroscientific research.

The agreement must not be taken as evidence that our equations are anything more than an empirical description of the time-course of the changes in permeability to sodium and potassium. An equally satisfactory description of the voltage clamp data could no doubt have been achieved with equations of very different form, which would probably have been equally successful in predicting the electrical behavior of the membrane. (From [16])
The main Hodgkin–Huxley equations are differential equations relating the specific ionic conductances of sodium (Na+) and potassium (K+) to membrane potential, with a number of other constants (Fig. 6.1b has the original equations, and Fig. 6.1a shows the good agreement between measurements and the model). They were able to show that their equations reproduced quite a significant part of the phenomenology of the squid's giant axon. What is more, they were also able to generate a
number of testable predictions, regarding manipulations of the system's parameters. But although they narrowed down the search for mechanisms underlying the action potential, the mechanisms regulating the time-dependent exchange of ions through the membrane were still a matter of debate until the appearance of the patch clamp technique, an experimental method able to measure the fluctuations in ion conductance of microscopic cellular patches.

Fig. 6.1 Hodgkin–Huxley equations and modeling agreement. (From [16])
6.1.1.3 Patch Clamp Led to the Suggestion of the Ion Channel Hypothesis

Neher and Sakmann's patch clamp [32] allowed the measurement of the currents across the membrane in microscopically small patches. Unexpectedly, the ionic exchange events were neither continuous nor smooth, as assumed in the Hodgkin–Huxley model. Changes of potential were stepwise, not gradual. From the perspective of a single patch, the membrane conductance was not a continuous variable, as modeled by Hodgkin and Huxley, but a series of very fast (on the order of 1 ms) events of exchange best described as Boolean: either there was exchange, or there was none. Moreover, the probability that a patch would be in one state or the other was ultimately unpredictable (and some have described it as fractal [22]). Thus, the patch clamp led to the postulation of ion channels: structures (most likely proteins) within the cellular membrane that allow the intracellular environment to communicate with the extracellular environment. Because ion channels are
very tiny molecules, only a limited number of ions can pass in a given interval.1 A picture of simple molecular mechanisms regulating ion exchange explains the stepwise nature of channel behavior and also the source of asymmetries in the exchange of ions (Fig. 6.2). In fact, Hodgkin and Huxley showed remarkable foresight about the nature of ionic exchanges when they played with the idea that there is asymmetry in active channels: ". . . agreement might be obtained by postulating some asymmetry in the structure of the membrane, but this assumption was regarded as too speculative for profitable consideration." In this instance it was soon to be shown that the speculation was not that extravagant.

Fig. 6.2 Fractal nature of channel activity revealed by the patch clamp. (From [22])

1 Other types of channels exist which are passive and do not undergo structural conformation with a functional role, such as gap junctions. These are passive channels between cells that exchange intracellular material.

6.1.1.4 Ion Channels Are Proteins That Undergo Structural Conformation Guided by the Environment

Recently, some of the machinery of ion channels was unveiled by X-ray crystallography, revealing some of the three-dimensional structure of channels [20, 36]. Ion channels are active or passive, depending on their structure and possible conformational states (see Fig. 6.3). Passive ones (such as the Cl− channel) induce a steady leakage current, whereas active ones respond to certain conditions, such as chemical gradients, chemical interactions, and temperature, by assuming different conformational states. But whatever the reality of the individual ion channel, the single event of conformation itself is neither observable nor predictable in its timing, and it continues to be best represented by a stochastic variable, a probability of the channel being open or not, which is statistically independent of (1) the history of openings and (2) other channels.
Fig. 6.3 Stereogram of the crystal structure of the calcium-gated potassium channel as revealed by X-ray crystallography in (a) open and (b) closed conformation
6.1.1.5 The Law of Large Numbers in Ion Channel Activity

As our knowledge of the machinery of the individual channels advances, the reinterpretation of the state variables representing conductivity in the Hodgkin–Huxley model, in terms of ion channel behavior, is of immediate consequence. The model prescribes gradual alterations of membrane conductance, which represent a sum of myriad ion exchange events that antialias (smoothen out) for large numbers. Notably, as far as we can tell, the nanosecond activity of individual channel events is irrelevant from the perspective of the membrane potential on the millisecond scale. At the level of the membrane potential, what matters is the time courses of state variables (conductances and membrane potential) and the choice of parameters. For a large amount of channel activity, there is convergence at the level of the membrane potential. For this reason, Hodgkin–Huxley descriptions of the activity of the squid axon were precise to an uncanny degree of correspondence (as in Fig. 6.1a), ignorance about the mechanisms giving rise to changes in conductance notwithstanding. The bold conversion between measurements of the squid axon and the mathematical model made Hodgkin and Huxley deserving recipients of the Nobel Prize. Their accurate model of the dynamics of membrane potential and the cable properties of axonal and dendritic transmission is a historical achievement. With electricity as the common currency, the membrane potential as the principal state variable and input current as a parameter, Hodgkin and Huxley achieved a sophisticated mathematical model of the action potential. Nowadays, the model is widely employed, and has
been extended to encompass complex models including the geometry of neurons, denominated compartmental models (employing Rall's cable theory). It has generated beautiful explanations and predictions about a range of neural phenomena (see [35] for a review). It is noteworthy that the model, measuring at the level of cellular fluctuations of the membrane potential, achieved a high degree of correspondence, despite being agnostic about the mechanisms of ionic exchange at the molecular level. It is a cogent example for level crossing in modeling, where the lower level may be abstracted away, achieving agreement at the higher level.2 Large numbers of discrete events act in concert and smoothen out, and parameter domains converge to distinguishable categories of action potentials.

2 Other theories exist about the action of ion channels. The dynamical systems model of Hodgkin and Huxley, however, is agnostic about them. It dwells at another level. For instance, a different hypothesis has been proposed considering the cell as a gel which wobbles alive, and whose membrane thins like that of a balloon which is pumped full, thus increasing permeability [28, 29]. This hypothesis sees the activity of ion channels as a result of passive plastic modifications of the cellular environment. Even so, the model is independent of particular theories, no matter how fanciful, just as long as the data matches.
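The point about large numbers can be illustrated with a small stochastic simulation (a sketch with made-up rate constants, not a model of any particular channel type): independent two-state channels flicker unpredictably, yet their summed open fraction converges to the smooth deterministic gating curve as the population grows.

import numpy as np

# Sketch with hypothetical rate constants: N independent two-state channels,
# each opening with rate alpha and closing with rate beta. Individually the
# channels are binary and unpredictable; the population open-fraction
# converges to the smooth deterministic relaxation x(t) as N grows.

rng = np.random.default_rng(1)
alpha, beta = 0.5, 0.2          # per-ms opening / closing rates (made up)
dt, steps = 0.01, 2000          # 0.01-ms steps, 20 ms total
t = np.arange(steps) * dt

def population_open_fraction(n_channels):
    state = np.zeros(n_channels, dtype=bool)   # all channels start closed
    frac = np.empty(steps)
    for k in range(steps):
        u = rng.random(n_channels)
        opening = (~state) & (u < alpha * dt)
        closing = state & (u < beta * dt)
        state = (state | opening) & ~closing
        frac[k] = state.mean()
    return frac

# Deterministic limit of the same kinetics: dx/dt = alpha*(1 - x) - beta*x
x = (alpha / (alpha + beta)) * (1.0 - np.exp(-(alpha + beta) * t))

for n in (10, 100, 10000):
    err = np.max(np.abs(population_open_fraction(n) - x))
    print(f"N = {n:6d}   max deviation from smooth gating variable: {err:.3f}")

The printed deviations shrink roughly as one over the square root of the number of channels, which is the law of large numbers doing the level crossing for us.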
6.1.2 The Hodgkin–Huxley Model Illustrates How Variability Converges to Constancy

Ever since the inception of the Hodgkin–Huxley model, the dynamics of the Hodgkin–Huxley equations has naturally lent itself to parameter space analysis. The initiation, shape, and persistence of the action potential were related to the input parameter (input current) and the initial state. Meanwhile, dynamical systems theory was producing beautiful pictures using the same kind of equations as the Hodgkin–Huxley equations. Bifurcation theory, the part of dynamical systems theory that studies the qualitative changes of behavior with respect to parameter shifts, was introduced early on as an analysis tool for the dynamics of the model [34]. It turns out that the shape of an action potential of the Hodgkin–Huxley model is an invariant cycle (a limit cycle) of the dynamical system, a constancy given by the asymptotic behavior determined by parameterizations of the model. This means the action potential appears as an invariant set with respect to certain parameter changes. By fiddling with the parameters, one is able to show many qualitative correspondences with the dynamical behavior of a neuron's membrane potential. Moreover, the parameter space partitions the dynamical behavior of a neuron (defined by a particular parameterization) into a limited number of dynamical responses, i.e., categories. Categories in this sense appear as the small number of qualitatively distinct membrane potential phenomena onto which the continuous space of parameters maps. In a modern yet classical study, Guckenheimer and Labouriau [13] produced a beautiful
and thorough analysis of these parameter spaces in relation to model responses, a sample of which is seen in Fig. 6.4 [18]. It shows that certain combinations of parameters will fall into qualitatively different categories, each representing one kind of spiking phenomenon. Such a depiction shows the existence of subsets of parameter space with similar asymptotic qualitative behavior (attractors). Parameter domains with qualitatively distinct asymptotic behavior entail categories. According to the dynamical systems analysis of the Hodgkin–Huxley model, the action potential, as well as other membrane potential phenomena, results from the collective dynamics of the amalgamated conductivity variables, generated by changes of membrane conductivity, themselves dependent on voltage and time course. The conductivity variables are the sodium and potassium currents, which are actively dependent on membrane potential and time, and the chloride current, which is passive, dependent only on membrane potential. Together with the membrane potential, this totals four free variables, which fully determine a four-dimensional dynamics. Manipulations of the model's parameters lead to distinctive classes of phenomena, including tonic and bursting spikes, quiescence, subthreshold oscillations, and resonance. An in-depth analysis of the bifurcation phenomena shows further how the underlying dynamics explains how an action potential is initiated, maintained, and terminated.

Fig. 6.4 The parameter space analysis of the Hodgkin–Huxley model, showing a projection of the attractor landscape in two dimensions. The lines separate parameter domains with different qualitative behavior (attractors). Within the bifurcation boundaries indicated in parameter space, variation of parameters aliases into constancy of dynamical behavior. (From [13])
The dynamical systems concept of the action potential reduced, and in effect eliminated, archaic concepts such as rheobase and chronaxie (p. 37 in [27]) by recasting them in terms of their dynamical counterparts. "Rheobase," for example, is the minimum electric current that evokes a spike, whereas "chronaxie" is the minimum duration of a stimulus that evokes a spike. These concepts, derived from experiments, subsisted for a considerable period (and still do in particular fields). It had already been noted that there was a relation between chronaxie and rheobase. Dynamical analysis with the Hodgkin–Huxley model reveals why that is the case. It shows how the relations between the two derive from the dynamics of the membrane potential, within different parameter domains. In so doing it debunks the idea of something like a fixed threshold, and explains threshold-like activity as a function of the regions of the phase space of the dynamics.
6.1.2.1 Other Dynamical Models of the Single Unit

Following the successes of the Hodgkin–Huxley model, other models were proposed with increasing levels of abstraction, mostly to achieve both analytical and computational tractability. Of those, worthy of mention are the Morris–Lecar model, the Fitzhugh–Nagumo model, and a number of variations on the "integrate and fire" theme. One of the most prolific has been the Fitzhugh–Nagumo model, which reduces the number of state variable dimensions to two and the number of parameters to five (see Fig. 6.5).3 The two state variables are simply the membrane potential and a recovery variable, which represents the combined effect of the three individual ion conductivities. Despite this simplification, the Fitzhugh–Nagumo model displays many of the properties of the neuron, including tonic and bursting spikes, subthreshold oscillations, and spike latency, at the expense of failing to exhibit other properties known to neurons, such as tonic bursting, inhibition-induced bursting, and rebound bursts [19]. Both the properties the Fitzhugh–Nagumo model emulates, and those it does not, are dependent on the existence of particular parameter domains. The relation between parameter domains and individual neuronal properties was thoroughly studied in [18], in which many examples of the relationship between parameter domains and the type of activity of neurons appear.

Fig. 6.5 The Fitzhugh–Nagumo model produces oscillations according to a limit cycle. Top: phase space of the two state variables, a recovery variable and the membrane potential. Bottom: the time series generated by the model. The cross-hair indicates the initial condition; blue dots are integration points. For different parameter combinations, the activity of the model also fits into categories, as happens with the Hodgkin–Huxley model. The thin blue line, the yellow curve, and the dashed red line represent the null-clines of the dynamics

3 In the online resources at www.irp.oist.jp/mnegrello/home.html, included with this book, there is a Mathematica notebook where one can explore on-the-fly the dynamical space of the Fitzhugh–Nagumo model according to parameterizations.
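A minimal integration sketch (the standard Fitzhugh–Nagumo equations with commonly used illustrative parameter values; this is not the Mathematica notebook mentioned in the footnote) shows how varying the input parameter I moves the model between quiescence and a limit cycle:

import numpy as np

# Fitzhugh-Nagumo sketch with common textbook parameters (a=0.7, b=0.8,
# eps=0.08). For small input I the trajectory settles to a fixed point
# (quiescence); for larger I it falls onto a limit cycle (tonic spiking).
# Parameter values here are illustrative, not taken from the book's figures.

def simulate(I, v0=-1.0, w0=1.0, dt=0.01, steps=200_000):
    a, b, eps = 0.7, 0.8, 0.08
    v, w = v0, w0
    vs = np.empty(steps)
    for k in range(steps):
        dv = v - v**3 / 3.0 - w + I
        dw = eps * (v + a - b * w)
        v, w = v + dt * dv, w + dt * dw
        vs[k] = v
    return vs

for I in (0.0, 0.5):
    tail = simulate(I)[-50_000:]          # discard the transient
    swing = tail.max() - tail.min()
    regime = "limit cycle (spiking)" if swing > 0.5 else "fixed point (quiescent)"
    print(f"I = {I:3.1f}   late-time swing of v = {swing:5.2f}   -> {regime}")

Sweeping I more finely would locate the bifurcation values at which the fixed point loses stability, which is exactly the kind of mapping from parameter domains to categories of activity discussed above.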
6.1.3 Categories Emerge

There are two main points to pin down in the roll of conclusions about the relation between an action potential and the cell producing it. Although a lot will vary from cell to cell, including shape, ion distributions, connectivity and arborization,
neurotransmitters, and glia, there will be a finite number of qualitatively distinct possibilities for the generation of an action potential. That number is potentially large, but effectively small compared with the immense space of structural possibilities of the cell. From Ramon y Cajal [6] to Braitenberg [4], the literature shows that there are effectively categorizable kinds of cells, and although we lack, up to now, a complete list of cell properties, the space of combinations of these properties will nonetheless lead to distinctive categories, not only morphologically, but also dynamically. Within a category, the qualitative responses of the single neuron will again fall within a limited number of possibilities. Thus, a neuron may spike (or not), burst, oscillate, resonate, habituate, or sensitize, but only a subset of these properties is displayed at the same time, or by the same neuron. The types of activity a neuron exhibits will result from its endowments (morphology and channel distributions) and context (biochemical environment). That means that the neuron's potentialities are subordinate to its constraints, biological or physical, expressed in cell morphology, ion channel distribution, its biochemical environment, and crucially, the networks in which it participates. Of course, the possible combinations of all the properties may appear as an interminable set of possibilities, but they are constrained in ways that beget distinguishable categories. That explains why we do not have Aplysia neurons, but Aplysia can have neurons acting similarly to those of mammals. That follows because both share the search space of constructive possibilities (this may turn out to be a beautiful example of what has recently been called "deep homologies").

Certain biological features, when made into parameters of dynamical neural models, can emulate some aspects of the behavior of real neurons. The shape of an action potential (in membrane potential) can be identified as an attractor state of a dynamical system representing a particular neuron. In addition, the shape and the type of action potentials are dependent on particular choices of parameters. So, it is possible to examine the consequences of certain parameter changes, both in quality and in quantity. In addition, the actual set of distinct dynamical regimes is limited (in modeling as in reality): there are bounded regions of the space of parameters with similar qualitative behavior. Therefore, by tracing the different areas of parameter space to the type of activity they produce, we may discover typologies of neuronal behavior through dynamical analysis. As a consequence, we encounter constancy in spiking phenomena out of variation in parameter spaces. Essentially, classes of spiking phenomena will be robust to variations in input and other parameters. In summary, the arguments supporting the above statements claim:

1. Action potentials produced by models are invariant phenomena within certain parameter changes, where the shape of an action potential is an invariant cycle in phase space. The qualitative behavior of the membrane potential variable largely agrees with experiment.

2. Parameters stand for entities in reality – to various degrees of abstraction. One can find the classes of phenomena that are a priori possible, given the mathematical formulation of the model.
3. Stochastic behavior of the very small and very fast becomes reliable and smooth tendencies for large numbers. Therefore, membrane potential can be predicted in terms of deterministic dynamical systems.

4. Combinations of parameters are potentially infinite in number, but map onto a closed and limited (albeit large) set of possible dynamical responses.

These combined facts speak against the dogma ignoramus et ignorabimus that percolates in some neuroscience circles. Blocked by the potentially infinite number of parameter combinations, critics fail to see the much less obscure possibility: that what neurons potentially do falls into distinguishable categories, with particular functional potentialities. The neuron has in its parameters also its identity. Within certain bounds of variation, neurons of a category will have equivalent properties. Categorization of neuronal properties under particular parameter combinations will effectively reduce the space of possibilities to a number of categories. Incidentally, an analogous argument is valid for networks, as in Sect. 7.3, where despite variability in parameters, constancy appears in behavior. The action potential is a canonical example of how invariances emerge as consequences of dynamical systems formulations. We recast the measured action potential into a frame that explains it, by reducing the complexity of biology into relations of constancy and variability across levels. Or, more precisely, by showing how parameter domains map to qualitative behavior. As we shall see, the same is valid in the study of networks and behavior.
6.2 Network Models

6.2.1 Difficulties with the Hodgkin–Huxley Model

For its rich phenomenology, and despite its multidimensional complexity, the Hodgkin–Huxley model continues to be one of the models with the highest degree of biophysical meaningfulness to date; that is, the parameters are informative about the phenomenon and can be made into testable predictions.4 For that reason, the Hodgkin–Huxley model is the model of choice for the more detailed, more biologically adequate, single neuron research. The Hodgkin–Huxley model, combined with cable models for dendrites, can be employed for a fully fledged description of a geometrical neuron, which increases the complexity of the problem to another level. Much research nowadays is devoted to these highly detailed models. But despite their fitting correspondence to real neurons, they are not almighty. Because of the tremendous complexity and combinatorial explosion of parameters, they
are difficult to integrate in even the simplest networks. Nowadays, compartmental models still strain computers (compare 1,200 flops in the Hodgkin–Huxley computation with 13 flops in the Fitzhugh–Nagumo computation [19]). Also critically, the Hodgkin–Huxley model is at the limits of analytical tractability, because of its vast parameter space. These facts keep the Hodgkin–Huxley model mainly in the sphere of biologically plausible modeling, the computational equivalent of microscopic in vitro research. And for these reasons, some problems are – in my opinion, for the time being – beyond the scope of the Hodgkin–Huxley model. Standing prominently among those is the one that concerns us in the upcoming chapters: organismic behavior (comparable to in vivo research). And since it is the level of analysis that determines the tools at hand, the behavioral level requires a different set of tools. Behavior is embodied and embedded, and because of that, the level to study is the coupled systems level (see Sect. 5.3). A neuron may trigger motor behaviors, for example, the escape response of the zebrafish, the recoiling of Aplysia, or modal conditioning in bees. But still, the neurons are embodied in a system, embedded in the world. The three dynamics together compose the system. The neuron alone does not explain the behavior, as it is necessary to know which events led the neuron to fire, and what resulted from its firing. It is blatantly obvious that most neurons are not solipsistic, but are inextricable members of groups; whether one spikes or not is a function both of itself and of its peers. It is a focus on the systemic and dynamical organization properties of networks that will take the properties of single neurons into the network level.

4 More recently some models have been introduced that take the level of abstraction to the stochastic behavior of individual channels.
6.2.2 A General Template To Build a Network Model

All neural network models are built as follows. First, we assume there is a set of properties for abstracted network units (from now on, a neuron model will be called a unit – following Francis Crick's admonishment [10]), and then connect the units in some way. With this step accomplished, the model is ready for research. The next problem is to ask a question that the model may answer. The adequacy of the model to answer the question is frequently a determinant of the success of the model. A sample set of how single neuron invariant properties express themselves at the system level would include graded activity and memory, membrane oscillations and cortical oscillations, long-term depression and habituation (desensitization), long-term potentiation and sensitization, and so on.5
5 To select the properties of a unit to compose a network model, one may start from two directions. Either one makes informed assumptions about what properties should be included in the model and builds one's own, or one searches the literature to decide on a particular model. A variant of the second method is to inherit models from the cogent arguments of one's thesis supervisor. Whatever the case may be, it is of absolute importance to understand the properties of the model before one goes on to draw conclusions! It is my contention that therein, on a poor appraisal of the model's properties or a narrow scope of usability, lie many of the well-directed critiques towards computational neuroscience. There are plenty of models which prove theorems about themselves and are hard to integrate in bigger theories.
6.2.3 Parameterizing Structure to Analyze Dynamics

To ask what the systemic properties (potentialities) of a network are, in analogy to models of single neurons, we parameterize structure, thus assuming a structure–function relationship of neural phenomena. By focusing on the structural parameters of neural networks, and on their dynamical consequences, we may study the dynamical behavior and transitions (bifurcations) of networks as a function of their structure. Ideally, we would also like to draw conclusions related to neural network control of beings; although abstracting many levels away (and often brutalizing them), properly phrased conclusions can talk back to the phenomenon, abstractions notwithstanding (an excellent review of the levels of analysis of neurons and networks is given in [15]). In Sect. 7.1 we present the analysis tools to study the dynamics of recurrent neural networks that shall abide by this admonishment.
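As a toy illustration of what "parameterizing structure" means (a single self-connected unit, not one of the networks analyzed in Part II), one can sweep the self-connection weight of a discrete-time sigmoidal unit and watch its asymptotic behavior change qualitatively at bifurcation points:

import numpy as np

# Toy example of "parameterizing structure": one discrete-time unit with
# activation a(t+1) = tanh(w * a(t)), swept over its self-weight w.
# For |w| < 1 the origin is a stable fixed point; for w > 1 two stable fixed
# points appear; for w < -1 the unit settles into a period-2 oscillation.

def asymptotic_behavior(w, a0=0.3, transient=2000, sample=8):
    a = a0
    for _ in range(transient):
        a = np.tanh(w * a)
    tail = []
    for _ in range(sample):
        a = np.tanh(w * a)
        tail.append(a)
    tail = np.round(tail, 6)
    n_values = len(set(tail))
    if n_values == 1:
        return f"fixed point at a = {tail[0]: .3f}"
    if n_values == 2:
        return f"period-2 oscillation between {min(tail): .3f} and {max(tail): .3f}"
    return "higher-order or aperiodic behavior"

for w in (-2.0, -0.5, 0.5, 2.0):
    print(f"w = {w:+.1f}  ->  {asymptotic_behavior(w)}")

The same logic, scaled up to full recurrent networks, is what the analysis tools of Sect. 7.1 are for: structural parameters in, categories of dynamical behavior out.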
6.2.4 Properties of Units and Properties of the System

So, it is to be expected that fascinating and mysterious properties of neurons [23] will underlie incredibly interesting systemic behaviors, which we exclude (either out of naivety, or by choice) by selecting the properties our units have. But from the perspective of certain behaviors, it just may be that neurons are like ion channels: many, multifaceted, but irrelevant for the overarching dynamics of behavior – irrelevant at the individual scales. Save singular cases of very simple organisms, neurons act as groups, not as individuals. Neurons connect neurons to other neurons, and in their dynamical dance, they move the organism. So, if we are interested in the systemic effects of neuronal properties, we study their networks. This commits us to making simplifications, both for analytical and for computational amenability. Although this process may cost biological plausibility, it pays dividends in a systemwise comprehension of the phenomenon. What is more, it provides a platform
for hypothesis formation on the consequences at the system level of the neural properties of the individual unit.6,7
6.3 Recurrent Neural Network Models

6.3.1 Assumptions

In the present case, we build a network from units whose abstracted properties are more or less standard in neural network research.8 A unit is something (according to [33]):

– That stands for a population of synchronous neurons, or a brain area
– That does linear summation of the inputs of other units
– Whose activity is represented by a real number
– That saturates above and below
– That sends this activity to other neurons, with a synchronous delay: one time step
– That has biases to account for constant background output and population size differences
– That may be recurrently connected (whereas feedforward networks have trivial dynamics)9
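A compact sketch of such a unit and its synchronous update (a generic discrete-time recurrent network in the spirit of the list above, written here for illustration; it is not the code used for the experiments in Part II):

import numpy as np

# Generic discrete-time recurrent neural network in the spirit of the unit
# properties listed above: real-valued activities, linear summation of inputs,
# a saturating transfer function, biases, and a synchronous one-step delay.

class DiscreteTimeRNN:
    def __init__(self, weights, biases):
        self.W = np.asarray(weights, dtype=float)   # W[i, j]: weight from unit j to unit i
        self.b = np.asarray(biases, dtype=float)    # constant background/bias per unit
        self.a = np.zeros(self.b.shape)             # unit activities (real numbers)

    def step(self, external_input=0.0):
        # Linear summation of the other units' activities plus bias and input,
        # passed through a saturating sigmoid; all units update synchronously.
        net = self.W @ self.a + self.b + external_input
        self.a = np.tanh(net)
        return self.a

# Two units coupled with weights of opposite sign (values chosen arbitrarily)
net = DiscreteTimeRNN(weights=[[0.0, -2.0],
                               [2.0,  0.0]],
                      biases=[0.1, -0.1])
trajectory = np.array([net.step().copy() for _ in range(20)])
print(trajectory[-4:])   # late activities; this kind of coupling tends to oscillate

Here the weight matrix and the biases are the structural parameters; Sect. 6.2.3 is precisely about asking what dynamics different choices of these parameters beget.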
6 How many properties of neurons are exploited at the system level? There is no compelling reason to presuppose that all properties of neurons will be exploited for every organismic function. But some are most certainly crucial, as is the case with types of connections. In the brain, the synapses are predominantly chemical, the electrical ones being reserved for more stereotyped function. For instance, electrical synapses do not generally allow inhibitory action. The model analogizes inhibition with negative synaptic weights, that is, the activity of the neuron is multiplied by a negative number. The complexity of synapses effectively puts this simplification to shame. Synapses present many variations on the theme of connection between cells, which vary in timescales of transmission and recovery, neurotransmitter(s) used, and types of activations. An example of an unexpected type of synapse is the gap junction, in which there is exchange of intracellular material. They are often found in insects. Moreover, the morphology of dendritic trees and synapses also changes quite a bit. But as before, despite the wide variations on the theme, synapses are amenable to anatomical and functional categorization.

7 Independence of properties: Of course, it need not be the case that the properties we include are independent of the others we exclude. Some properties may depend on each other (say, temporal segregation, which depends on delays [30]). This endangers the conclusions we may derive from models, for if properties we include are in fact dependent on those we exclude, then the set of derived consequences is an artifact. There are, though, heuristics for verifying the independence. Questions one may ask are of the sort: Do the timescales of the properties relate? Does property 1 have time to alter property 2 within the time window of choice? If we remove property 2 from the model, what are the consequences of its absence for property 1? In each case, there will be appropriate questions, and they should be addressed in preparation for the experiments.

8 The continuous version of such a recurrent neural network has also been shown to be an approximator for any arbitrary smooth function [9, 12].

9 The feedforward neural network is like an intricate rain gutter with complex piping: water may go circuitous ways, but the ground is always its last stop. There are no feedbacks or refluxes.
6.3.2 Discussion of Assumptions

Although much of the literature assumes such properties tacitly, I personally feel that some of the choices are important enough to have to be discussed and, to the extent that they can be, justified. This does not mean that the justifications make the model particularly plausible; they are meant to make it not particularly implausible. In addition, a discussion of assumptions, by highlighting the properties of the model, indicates what kind of conclusions are warranted:
(1) Units are populations: saturation, activity bias. One of the most delicate simplifications incurred is the stark reduction in the number of units. Even the simplest brains may have 1,000-fold more neurons than the largest networks. There are many justifications for those assumptions, albeit not all equally cogent. First, we point out that the units employed represent the activity of very many more neurons, or neuronal groups. It has already been shown that artificial networks of synchronous spiking neurons responding at the population level resemble sigmoids and their properties [11, 21]. The argument is similar to that in Sect. 5.4, where individual events become amassed in population effects.10 This analogy is especially adequate if populations are taken to have convergent inputs, or if they project to motors. So, we choose to abstract away the individual neurons and work at the population level. We introduce biases to represent background activity, differences in excitability, and the size of the different populations. Likewise, the connections in our models represent "tracts," in other words, major axonal pathways.11
10 Recall that the muscle fiber will act in coalition. If we assume that the production of muscle force is proportional to the number of units activated, we may distill the number of neurons into a number that stands for this proportion. This number is akin to an abstract firing rate, and such characterization has become standard in the literature. It does, however, brutalize the finer temporal details that may exist in neurons. There is evidence that in some cases the temporal aspects of spikes underlie behavioral functions. In those cases, sigmoidal units are obviously unsuited. Nevertheless, as we have been repeating, each model has its own scope; the properties of the models will determine the problems they solve. If the problem is sensorimotor coordination, then sigmoidal units are sufficient. What is more, it is probable that in many cases the function the network effectively implements obeys a sigmoid, as is often the case with motor behavior.

11 The interpretation of units as populations is not the only one possible. Many small organisms have countable numbers of neurons, and very often, those are very close to the units we model. The analogy fails only as the complexity of networks scales up and organisms become more complicated. That is when the reinterpretation as populations is necessary. By assuming populations, one precludes a whole class of potentialities at the system level. But many still remain, and naturally, those are the ones we may learn about.
(2) Dendrites: linear summation. Dendrites in our model compute a signed weighted sum of the incoming activity. Inhibition and excitation are represented by the signs of the incoming weights. This has been shown to be the case in certain arrangements [3, 7, 8, 37], although not in all [24, 30, 31, 35]. At the system level, this precludes logical operations in dendritic computation, as well as multiplications. Although multiplications (amplification) are very likely to happen in neural networks, I am skeptical about logical operations in dendrites.12 Other features that are highly likely to have implications for systemic properties are delays, channel distributions, neurotransmitters, and the geometry of dendritic arbors.
(3) In discrete time steps. Discrete time steps in discrete-time recurrent neural networks can be thought of as a sampling rate of activity recording, as if under a stroboscopic light that reveals network activity in time steps of constant duration. This is primarily a simplification with an eye on the analysis of recurrent neural networks as dynamical systems, especially from a mathematical standpoint. The model defined by difference equations is an approximation (via a Poincaré map) of the continuous-time recurrent neural network. This particular model has slightly different representations of phase space than the continuous-time model (e.g., periodic attractors with natural numbers as periods are nonexistent in continuous time, given the Poincaré map).13
12 On the logical operations of dendrites: The scientific paradigm that supports the hypothesis of logical operations performed in dendrites is highly contrived. Searching for logical operations departs from the premise that there are logical operations being performed. So, one selects what one thinks composes the propositions, a and b, to be evaluated according to dendritic computation. The propositions are in terms of current injections, and the answer (a "truth function") is the appearance (true) or not (false) of a spike. For that to work, the parameters one has to assume are wild, because one has to find precise point-to-point specifications of where to insert the electrodes injecting the currents. Then one tries various combinations of places for the electrodes. Eventually, for some combination, a response that emulates XOR appears. The conclusion is that the dendrites can compute this logical operation. However, there is a fundamental difference between dendritic computation and logics. In logics, operations are compositional and timeless. In neurons, time is fundamental, and it remains to be shown that dendrites can also compute compositional statements with many "simultaneous" propositions. This seems highly unlikely. Although two dendrites can be made to emulate particular logical operations in very artificial setups, in vivo operation is probably more multifaceted. It would not surprise me if in the same dendritic tree many logical operations concurrently coexist which may be logically contradictory. Furthermore, dendritic responses are probably very contextual, and highly dependent on initial conditions and other biophysical factors, such that the conclusion that dendritic arbors compute logical operations is too contingent, too fragile, and too contrived to hold as an underlying principle. That dendritic arbors can "compute" is clear. That they compute with "logic" is like concluding that the earth is flat by measuring the curvature of the floor: the template fits, but only because of the narrow scope of the measurement.

13 In continuous-time recurrent neural networks the state is computed in differential infinitesimals (i.e., it approximates a surface). The differing consequences for these two types of recurrent neural networks are not trivial, and in fact they have different potentialities for equivalent structures. As an example, in discrete time, an inhibitory self-connection generates an oscillation, whereas in continuous time it leads to a hyperbolic fixed point.
6.3.3 Fair Assumptions

No matter how fair an assumption, it shall always be far from incontestable. Indeed, I have put effort into showing how these assumptions can be contested. So far, I have been working under the premise "know your model and you shall understand (better) your problem." Models are always approximations and never free from criticism. All the same, and despite that, the general conclusions we may draw are made tenable for that very same reason.
6.4 Conclusion

6.4.1 The Staggering Complexity of the Neuron

Recent experimental developments have undermined previous tacit assumptions of the neuron as a quasi-static element (on the developmental scale), although perhaps only a few foresaw the astounding speed and breadth of the variations a neuron undergoes. Two recent examples show astounding rates of change on scales formerly thought to be, at most, slowly changing. It was recently shown through two-photon imaging that dendritic spines may move at astonishing speeds, of the order of micrometers per minute [25]. Even more staggering are results that show that the synapse itself, in the event of an action potential, swells and bulges as ions flow through it [17]. At any given time, minute morphological variations are changing the nervous system in astonishingly complex ways. Findings such as these feel so expressive and meaningful that we are compelled to ask what hypothetical functional significance such an amazingly active system could express. And more than that, given all these motions, it is a reason to marvel that the organism is able to make sense of the variability and still function at all!

More cautiously, the question is: Given an interesting property of the substrate observed in biology, does it play a functional role at the level of behavior? For any given property, the answer will be different. At times, it will be apparent that it participates in behavior, but its function is unknown, yet to be discovered. Occasionally it will be clear that the properties are independent, such as "eye color" and behavior. Frequently they are dependent, and may be so across multiple levels, as in the case of memory, where a channel phenomenon (long-term potentiation), a molecular phenomenon (protein production), and a cognitive phenomenon (memory) are all involved in the same function. They cross levels because, whereas long-term potentiation acts at the molecular biology level of the cell, memory is a cognitive property.
They are, in very important ways, incommensurable with each other.14 This is akin to the question of what the role of genes in behavior is. What is more, these properties are often interdependent, and all too frequently in intricate ways. Take, for instance, the transmission of an action potential along the axon. It is first dependent on the space variables relating to axonic lengths and other geometric factors, as well as on the cell's constitution (myelination patterns, size, biochemistry) and its supporting apparatus (glial cells), whose active roles are only now beginning to be unveiled. It is, moreover, dependent on ion channel distributions along the nervous cell's soma, axon, and dendrites, about which remarkably little is known. The form of these dependencies is seldom clear, many of them posing remarkable difficulties to studies both in vivo and in vitro. The intricate complexities of the interdependencies can, in effect, make a single neuron appear as complex as a brain. At this time we have to pause and ask: What is the phenomenon to be explained? Departing from the answer to that question, we can define a level within which to construct an explanation, amounting to the identification of the relevant interdependencies and invariances. In the case of behavior, some neuronal properties will be fundamental, whereas others will be less relevant for the given level of analysis. The search amounts to, in essence, the development of models that give us insight into which properties are which.

14 In these cases, usually one correlates increases in behavioral measures with experience, and with long-term potentiation. But to show that dependence exists is not to show how, it is to show that. So, in the same spirit, we may ask: Are dendritic-spine movements important for development, behavior, reproduction, and pattern recognition?
6.4.2 Neurons, Networks, and Organismic Behavior

The biological neuron is a wonderful piece of evolutionary magic. In contrast, the neural model is a wonderful piece of scientific progress. Whereas a neuron is a cell that produces electrical discharges, a neural model attempts to reproduce the spikes by distilling the essentials of the phenomenon into numbers. Neurons exhibit that fractal and interminable complexity of biological systems, in which increasing magnification does not cause a decrease in complexity. Neural models, conversely, are well-defined mathematical entities that have particular levels of analysis. Everything in a neuron may vary; in a neural model, only parameters may vary. A neuron is an inextricable part of a body; a neural model is not. And yet, neural models have the power to tell us what the important factors regulating the spiking phenomenon are. And what is more, neural models can be made into networks that control contraptions of our own devising, thereby providing a unique tool in the study of behavior: evolutionary robotics. By employing evolutionary robotics and other methods that convert unit properties into systemic properties (such as behavior), we begin to unveil the sources of invariance measured in experiment. Invariances appear with organismic and
behavioral functions based on specific components. Only some structural organizations will offer levels of constancy on which an organism can rely if it is to face the challenges of a variable yet constant world. The neural organizations mediating behavior distill the variability of the world into constancy of activities that becomes behavior. At the single unit level, this can be shown as parameter domains begetting categories of constancy. Analogously, at the network level, parameter domains also become categories of behavior, albeit with a twist. There is more (and in more depth) about that in Part II.
6.5 Addendum: Level Crossing in Models

Modeling is not a purely arbitrary exercise of outlining parts of phenomena. But there is some inherent arbitrariness in it, for the scientist is an observer within the system. A level of analysis is given by a selection of measurable variables and their putative interrelations. Variables are measured, and rules retrieved, according to what is most visible to us. The phenomenon itself need not abide by our outlines, of course. The history of science shows a number of examples in which outlines have been fuzzily drawn, or completely misplaced, only to be corrected by later advances. The outlining of a level of analysis leads to the appearance of interfaces. The appropriateness of interfaces in an explanation is a core issue. There is currently a trend in physics that argues that although models can be extremely explanatory within their own terms, models do not, by themselves, cross levels; we do. Much as when we try to integrate quantum mechanical explanations into macroscopic physical phenomena, the stitching between models is often salient and ugly. Moreover, models are like languages, which by necessity offer an incomplete description of phenomena. That is to say that the phenomenon is not exhausted by words. In models as in languages, a level is selected to better characterize the phenomenon. Models, in a very important sense, underdetermine the phenomenon. By shining light on one level, we cause other levels to be eclipsed.

The question is: What is the correspondence between a level of description and a level of organization of matter? Both have to do with the appearance of patterns. But seeing a pattern is subjective and, as we have seen, tool-dependent. We outline by looking, and in the extreme some patterns appear only in our minds (think of a stereogram: random dots that evoke three-dimensional perception). So, it is hard to distinguish what is a real pattern from what appears patterned to us as a function of our nature. We choose what to measure and model according to the patterns we see. A level of analysis is constituted by variables and rules that relate them to each other, and these are abstracted by us. Rules and variables do not constitute reality, but reflect our perception of reality. We ourselves are reflections of the same reality, and many of our basal distinction-making capabilities are, in essence, about reality. A theory of colors such as Goethe's, for example, must depart from phenomenology. Experiment and explanation necessarily follow, with theories and measurement.
Any justification of modeling based on level crossing relies on the degree of correspondence between our levels of analysis and levels of organization of matter, whether actual or perceived. It may be tougher than it looks at first glance. Two criteria are helpful for judging the correspondence. One is the self-containedness of variables and rules within a level; the second is the ability to predict emergent phenomena. Neither alone will ultimately resolve our innermost quandaries about the "existence" of a level, but when there is much of both, the acutest apprehensions may be soothed.
Part II
Neurodynamics of Embodied Behavior
Every plant and animal is constructed upon the premise of its cyclic nature.
Ernst Mayr, Evolution and the Diversity of Life

The machine is the spiritualization of an organism.
Theo van Doesburg
Outline of Part II

The ideas presented in Part I will be instantiated with experimental examples in Part II. A family of experiments in active tracking conducted within the paradigm of evolutionary robotics will illustrate and legitimate the arguments of Part I, with the language and methods of neurodynamics. Terminological refinements of that language will be introduced as we proceed, connecting results and conclusions. Thus, Part II begins with a brief review of the basic concepts of neurodynamics (attractors, basins, bifurcations) and evolutionary robotics (evolutionary algorithms) in Sects. 7.1 and 7.2. Those familiar with the subjects are invited to skip these sections without a second thought. The role of invariance in the explanation of variability and constancy in behavior is our main focus, and will occupy the rest of the book. In Sect. 8.1.2.4, an explanatory middleman is introduced, the concept of an attractor landscape. It is then used to show what an invariant of behavior is: an abstract representation of a functional mechanism. The fertility of this concept is explored throughout the rest of this book. In the present case, the invariant behavior, or the functional mechanism implemented by the agents, is a two-dimensional version of negative feedback, well known from cybernetics, which subsumes an ample range of both simple and complex organismic functions. I will (1) show how, despite extreme variability in network structures, constancy of behavior reflects invariant features of the attractor
landscapes, (2) show that behavioral function only exists as a potential until it is evoked, and (3) show that even networks with radically different attractors may implement the same embodied function, as long as these networks possess certain invariant features of the attractor landscape. In so doing, the connection between attractor landscapes and organismic function will be elucidated. In Sect. 8.4.4, we address constancy arising from level crossing, or convergence, as variable activity patterns have their dimensions effectively reduced at the level of the actuators. I will show how attractors are made equivalent through convergence, and how the ongoingness of behavior appears as the attractor landscape is explored through organism–environment interaction. This will lead to the concept of a "metatransient," with respect to which we show effectively what it means for an organism to implicitly represent the coupling between itself and its environment. That which renders the organism more viable is the appearance of organismic function. If it is true that the attractor landscape is a behavioral (functional) invariant, then as evolution encounters neurally controlled function, the attractor landscape must correspondingly and incrementally appear in evolution. An experiment on the evolution of attractor landscapes, presented in Chap. 9, shows the gradual trend in the appearance of an attractor landscape, and how it correlates with the fitness landscape. That chapter ends with a discussion of the meaning of instinct, within the framework outlined in Part I. The last chapter and the experiments described there deal with the role of attractor landscapes in modular systems: (1) What is a module? (2) What kind of message is exchanged between modules? (3) What is the meaning of noise? (4) What does it mean, in neural terms, to receive a message? The answers to these questions are tightly coupled, and depend on the simple semiotic observation that a message is interpreted by a receiver, and is only made meaningful therewith. Activity exchange between communicating brain areas admits a characterization in terms of meaning, which in turn admits a neat characterization in neurodynamic terms.
Questions for Part II

What is an invariant of behavior? Where and how can we find it? To what kind of transformations is it invariant? These questions are answered in the next chapter, with respect to a toy problem in active tracking. How does behavioral function evolve? What are the concomitants? Why and how do different network structures produce similar function? In an active tracking toy problem, for instance, we evolve a variety of networks of different connectivity, each of them having invariant features of the attractor landscape. What is a module? How do modules communicate meaning along transmission lines (axons and dendrites)? The short version of the answer is that they communicate activity that will be recontextualized in terms of the receiving attractor landscape.
Chapter 7
Neurodynamics and Evolutionary Robotics
Abstract This chapter presents the essential tools required for the interpretation of the subsequent chapters: neurodynamics and evolutionary robotics. The presentation of neurodynamics is visual rather than formal, and introduces the concepts of attractor, parameter space, and bifurcation. In addition, the algorithm used for the evolutionary robotics experiments, ENS3, is briefly introduced.
7.1 Crash Course on the Neurodynamics of Recurrent Neural Networks

7.1.1 Varieties of Neurodynamics

The idea that the best language to describe the function of the brain's networks is that of dynamical systems is one I heartily exposed in Sect. 4.3.2. The language of dynamical systems applied to the analysis of neural network models becomes neurodynamics. Because there are many levels at which neuronal behavior can be modeled, there are proportionally many guises of neurodynamics. At the core, what all approaches share is some state representation and some rule, taking in state variables and parameters, which computes how the state changes in time. What the state is, and what these rules are, varies quite a bit. The explanations also cover a wide spectrum, with considerable overlap. Table 7.1 presents a typology of network models that have been subject to dynamical analysis, with some of their defining characteristics and references (other variations of these models exist with different combinations of features, although the table lists the canonical or most widely used, according to my review). One will notice that many recurrent neural network (RNN) models share basic characteristics, and often the main distinguishing feature concerns analysis and/or applications. Discrete time RNNs are perhaps the most widely used for the analysis of dynamics in a wide assortment of applications and studies. They may be mathematically analyzed in small networks [1, 18, 21, 22, 25], or they may be used as a dynamical repository in larger networks [13–15, 17].
Table 7.1 Neuronal models and levels modeled. In the column Scale, meso means that the level represented is that of neuronal groups or areas. Continuous and discrete refer to whether the model employs differential or difference equations. RNN recurrent neural network. Columns: Model; Scale (represented); State variables; State dimensions (per unit); Deterministic/stochastic; Continuous/discrete; No. of units (representative).

– Hodgkin–Huxley (HH); Micro; Membrane potential (Vm), Conductances (Na, K, Cl); 4; Deterministic; Continuous; 1
– Fitzhugh–Nagumo (FN); Micro; Membrane potential (Vm), Recovery variable (u); 2; Deterministic; Continuous; 1–100
– Izhikevich (simple spiking model); Micro; Membrane potential (Vm), Recovery variable (u); 2; Deterministic; Continuous; 1–billions
– Mean neural fields (ensemble density); Micro/Meso/Macro; The unit: Membrane potential (Vm), T (inter-spike interval); 2; Stochastic (probability density of spikes); Continuous (representation); 100–millions
– Integrate and fire; Micro; Activity, spike; 1; Deterministic; Discrete; 10–100
– Continuous time RNN; Meso; Activity; 1; Deterministic; Continuous; 10–100
– Discrete time RNN; Meso; Activity; 1; Deterministic; Discrete; 10–100
– Dynamical repository RNN (liquid state machines, echo state networks); Micro (liquid state) / Meso (echo state); Activity; 1; Deterministic; Continuous (liquid state) / Discrete (echo state); 10–1000 (echo), 10–millions (liquid)
The choice of model is usually dependent on the problem being tackled. If the aim is, for example, to find the common currency in neural terms between oscillations and functional magnetic resonance imaging activation, the model of choice is some variant of a mean field model [8]. If it is to control a robot, a small network is usually preferred, and the units can be of various kinds, e.g., Fitzhugh–Nagumo, integrate and fire, discrete time RNNs, or continuous time RNNs. As in Sect. 6.2.4, it is not always clear from the outset which properties of a unit will become the properties of the system that solve a problem. Our own variety of neurodynamics is discrete time, selected for its amenability to computational and mathematical analysis, as well as ease of implementation on embedded systems. Another great feature of the discrete time RNN as a model is the vast assortment of quality resources and literature available. Standing on the shoulders of giants, the vista is vaster. Discrete time RNN models are also vigorous in current neural networks research, and have generated a large number of contributions. The purpose of this short chapter is to provide a visual guide to the analysis tools for the neurodynamics of discrete time RNNs. It is meant as a primer for following the analysis of the experiments in subsequent chapters. The presentation is therefore short and pragmatic; for a less terse and altogether better treatment, see the references collected at the end of this chapter. Nevertheless, the analysis tools presented are for the most part standard practice, so a little review should be sufficient to follow the presentation of results. Emphasizing concision, this exposition is visual rather than formal. Comprehensive discussions of concepts will be shamelessly set aside. The analysis of embodied and embedded networks focuses on the impact of parameterizations – inputs – on dynamics. The presentation to follow emphasizes tools suitable for this kind of analysis, also known as the analysis of parameterized dynamical systems.
7.1.2 Definitions

Dynamical systems theory studies the tendencies and long-term behavior of particular equations, which in the discrete time RNN are difference equations (also called maps). In parameterized dynamical systems, the equations are made dependent on parameters in order to study how the tendencies change as a function of the parameters. Neurodynamics is the subset of parameterized dynamical systems applied to the study of the dynamics of RNNs.
7.1.2.1 The Neuromodule

For the formalisms and definitions of this section, I closely follow Pasemann [25]. A network of units, also called a neuromodule, is a digraph represented by a connectivity matrix of weights (representing synaptic strengths or efficacies), also called an
adjacency matrix, with entries w_ij. The synaptic weights can be either positive or negative, and the units admit self-connections. The activity of a unit is a real number. Each unit has a bias, a number representing background activity, which may be made into a parameter: a constant or slowly varying real number representing input transduced to the unit. Recall that a state is a point in phase space representing the activities of all the units composing a network. So, a neuromodule composed of n units has an associated n-dimensional phase space (A ⊆ R^n). The dynamics of the RNN is given by the map

\[
D_\rho : A \to A, \tag{7.1}
\]

where one particular choice of parameters ρ instantiates one dynamical system. The map D_ρ is defined by the following network equation:

\[
a_i(t+1) = \theta_i + \sum_{j=1}^{n} w_{ij}\,\sigma\!\left(a_j(t)\right), \qquad i = 1, \ldots, n, \tag{7.2}
\]

where a_i represents the activity of unit i, w_ij denotes the synaptic weight from unit j to unit i, and θ_i = β_i + I_i denotes the sum of a bias term β_i and an external input I_i.
Fig. 7.1 A recurrent neural network. See the text for explanation of the symbols
σ(X) represents any suitable function, usually with a saturating nonlinearity, such as the logistic sigmoid in (7.3) or the hyperbolic tangent in (7.4):

\[
\sigma(x) = \frac{1}{1 + e^{-x}} \tag{7.3}
\]

and

\[
\sigma(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. \tag{7.4}
\]

The dynamics of this model, therefore, is always dissipative and bounded on an open domain. The parameter vector ρ has as selected components the θ_i and the w_ij. One instantiation of a particular weight matrix, one bias vector, and a unique parameterization resolves one unique dynamical system D_ρ. A RNN is a set of dynamical systems, with respect to the class of parameters ρ. The analysis of changes in parameters allows for the analysis of neighboring dynamical systems, and of the dynamical repository of a network.
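To make the update rule concrete, here is a minimal Python sketch that iterates the network equation (7.2) with the logistic sigmoid of (7.3). The weights, biases, and network size are arbitrary illustrative values, not parameters taken from the book's experiments.

```python
import numpy as np

def sigmoid(x):
    # Logistic transfer function, as in (7.3).
    return 1.0 / (1.0 + np.exp(-x))

def step(a, W, theta):
    # One application of the network equation (7.2):
    # a_i(t+1) = theta_i + sum_j w_ij * sigma(a_j(t)).
    return theta + W @ sigmoid(a)

# Hypothetical two-unit neuromodule (illustrative values only).
W = np.array([[ 2.0, -4.0],
              [ 4.0,  0.0]])
theta = np.array([0.5, -0.3])   # theta_i = beta_i + I_i (bias plus input)

a = np.zeros(2)                  # initial condition a(0)
for t in range(200):             # iterating the map D_rho generates an orbit
    a = step(a, W, theta)
print(a)                         # a late state, close to the attractor
```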
7.1.2.2 Orbits, Attractors, and Transients

Next, I introduce the concepts of "orbit," "attractor," "basin of attraction," and "transient" via pictorial examples. The recurrent application of the network equation to an initial condition [an activity vector a(t = 0) with components a_1, ..., a_i, ..., a_N, where t = 0 by definition] produces a state sequence, or time series. Any such state sequence thus produced is denominated an orbit (7.5):

\[
O_{a(0)} := \{a(t)\}_{t=1}^{\infty}. \tag{7.5}
\]
In Fig. 7.2 we see an example orbit of a two-neuron discrete time RNN, both in phase space and as a time series.
Attractors

For some network structures and some initial conditions, the repeated application of the network equation will result in convergence to a state that does not change (or is approached asymptotically). This is an equilibrium state called a fixed point attractor (or an asymptotically stable fixed point). But, more generally, an attractor of a dynamical system is the asymptotic set of states of an orbit, or in other words, the set of states to which an orbit tends. The attractor is defined as an invariant set of states of a dynamical system (for a contention on the multiple usage of the label "invariance," see footnote 1).
Fig. 7.2 An orbit in a two-dimensional phase space (output of two units), and translated into a time series (states are plotted as vertices, edges indicate the sequence of states). The states of an orbit before the attractor are denominated "transient states" (approximately at t = 0, 1, 2, 3). The transient converges to a quasi-periodic attractor
In maps (the case is different for continuous time), attractors can be categorized with respect to their "winding number." In Table 7.2 we have four examples of attractors, categorized accordingly as fixed point attractor, periodic attractor, quasi-periodic attractor, and chaotic (or strange) attractor.2 Useful tools to depict attractors are the bifurcation diagram and the first return map. In the first return map, as the name says, the state at t + 1 is plotted as a function of the state at t. This plot is useful to identify the periodicity of attractors. In Fig. 7.3, for example, a period 5 attractor draws a star, as the orbit converges to five sequential points in phase space.
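As an illustration, a first return map can be approximated numerically by iterating a network, discarding an initial stretch of transient states, and pairing successive mean outputs. The sketch below assumes the same kind of two-unit network as in the previous sketch, with made-up weights; it is not the network of Fig. 7.3.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical two-unit network (illustrative values only).
W = np.array([[2.0, -4.0], [4.0, 0.0]])
theta = np.array([0.5, -0.3])

a, outputs = np.array([0.2, 0.2]), []
for t in range(1000):
    a = theta + W @ sigmoid(a)
    outputs.append(sigmoid(a).mean())   # average output, as in Fig. 7.3
out = np.array(outputs[300:])           # drop transient states

# Pairs (out[t], out[t+1]) are the points of the first return map;
# a period-k attractor shows up as k distinct points.
pairs = set(zip(np.round(out[:-1], 5), np.round(out[1:], 5)))
print(len(pairs), "distinct points on the (approximate) first return map")
```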
Transients

The set of states in an orbit before an attractor has been reached is called a transient (see Fig. 7.2a).3 In large networks with many dimensions, it may sometimes be difficult to determine whether a transient has reached an attractor. Note that in the figure the transient part of the orbit is short. This is not necessarily the case.
1. A distinction and a contention concerning attractors have to be drawn at this point. The distinction is that attractors, defined as invariant sets of dynamical systems, are not to be confused with invariants of organismic behavior. They are invariants of particular dynamical systems, that is, of models. The contention is that invariants appear at different levels, and have to be handled in the particular levels where they appear. This point will be stressed in the next chapter, and is an important linchpin of the arguments henceforth.
2. Although in low dimensions the identification of the category to which attractors belong is simple, it becomes increasingly difficult in higher dimensions.
3. The mechanistic analogy is that of a rock falling off a cliff, with a phase space given by the spatial coordinates of the rock. The bottom of the cliff which the rock eventually reaches is an "attractor," and the states until that point are transient states.
Table 7.2 Types of attractors encountered in maps

– Fixed point: For one parameterized dynamical system D_ρ, a fixed point p is a point of phase space that satisfies D_ρ(p) = p. Fixed points are categorized according to the behavior of neighboring orbits. If neighboring orbits converge to the fixed point in all dimensions, it is denominated an attracting fixed point. If orbits close to p are drawn away from it in all dimensions, it is denominated a repellor. In the case that orbits are attracted in some dimensions but not in others, it is denominated a saddle node.
– Periodic attractor: In a periodic attractor, states repeat themselves after a period k, that is, D_ρ^k(p) = p, where D_ρ^k is the kth iterate of the map D_ρ, and k is minimal.
– Chaotic attractor: Chaotic attractors are aperiodic. The criterion for a chaotic attractor takes into account neighboring initial conditions. If the orbits originating from two neighboring initial conditions close to the attractor diverge exponentially from each other, the attractor is denominated chaotic.
– Quasi-periodic attractor: Quasi-periodic attractors are also aperiodic, but orbits originating from neighboring initial conditions do not exponentially diverge.
Fig. 7.3 A period 5 periodic attractor (black), depicted with a first return map. A point in the plot is the average output of the network at t + 1 plotted as a function of the average output at t. Transient sequences appear in light gray. The states of the attractor are vertices, and sequences of states are edges. The circled cross at 0.2 indicates the initial condition
Basins of Attraction

A basin of attraction is the set formed by the collection of initial conditions ending in one attractor. Therefore, the phase space of a dynamical system with multiple attractors is partitioned into basins of attraction. If for all possible initial states only one attractor is reachable, the attractor is denominated a global attractor. In this case all of the phase space is one basin of attraction. If, on the other hand, different initial conditions lead to different attractors, and therefore multiple basins of attraction, then there are coexisting attractors. Basins of attraction may have well-determined boundaries or fractal boundaries (usually in the presence of chaotic attractors). Fractal boundaries happen when an arbitrarily small portion of phase space has orbits ending in different attractors. Incidentally, the determination of basins for high-dimensional systems is a notorious challenge.
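A rough way to chart basins of attraction numerically is to iterate the map from a grid of initial conditions and group the initial conditions by the attractor they end up on. The sketch below does this for fixed point attractors only, with an invented bistable two-unit network; the convergence horizon and the rounding tolerance are crude heuristics, not part of the methods described in this book.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def late_state(a0, W, theta, n=500):
    # Iterate long enough that, for fixed point attractors, the orbit
    # has essentially converged.
    a = np.array(a0, float)
    for _ in range(n):
        a = theta + W @ sigmoid(a)
    return a

# Hypothetical network with strong self-excitation, chosen so that
# several fixed point attractors coexist (illustrative values only).
W = np.array([[8.0, -1.0], [-1.0, 8.0]])
theta = np.array([-3.5, -3.5])

basins = {}
for x in np.linspace(-4, 6, 40):
    for y in np.linspace(-4, 6, 40):
        key = tuple(np.round(late_state([x, y], W, theta), 3))
        basins[key] = basins.get(key, 0) + 1

# Each key approximates one coexisting fixed point attractor; the count is
# the sampled size of its basin of attraction.
for attractor, size in basins.items():
    print(attractor, size, "initial conditions")
```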
Bifurcations of Dynamics Through Parameterizations

What happens to the orbits of a dynamical system when we change its parameters? As mentioned previously, each parameterization defines one individual dynamical system. So, via changes of parameters, one compares dynamical systems in terms of their short-term and long-term behavior (orbits divided into transients and attractors). One way to examine similarities between dynamical systems is to compare, qualitatively, the orbits of one and the other (Fig. 7.4). Given two dynamical systems D_ρ′ and D_ρ″, do the orbits change qualitatively? The bifurcation diagram (or Feigenbaum plot) is a visualization tool that permits the comparison. In it, one aspect of the asymptotic behavior of the network is made dependent on a varying parameter.
Fig. 7.4 Basins of attraction. Two coexisting attractors separated by a basin boundary. Orbits end up in one or the other attractor depending on the basin of the initial condition
Fig. 7.5 Bifurcation sequence of a two-neuron network parameterized by the input to unit 2 (θ_2). The mean output activity of the two neurons (transfer function: logistic sigmoid) is plotted. As the parameter is varied, the network undergoes qualitative changes in dynamical regimes. The gray dots are transient states (first 300 iterations), the blue dots belong to the attractors (next 100 iterations). The region around the dashed red line has a period 5 periodic attractor. Close to bifurcations, the smearing of gray dots indicates longer transients. Around θ_2 = 4, a chaotic attractor coexists with a period 5 attractor (the same as in Fig. 7.3)
There are two ways in which a change of parameter may impact the orbits. Either the asymptotic states vary smoothly, in which case we see "morphing," or they change qualitatively, or "catastrophically," in which case we see a "bifurcation." In many cases bifurcations can be discovered analytically (some examples are found in [9]), whereas in other cases only computational methods will be available. In Fig. 7.5 a bifurcation sequence with respect to the change of input to neuron 2 (θ_2) is shown for an example two-neuron network.4
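A bifurcation diagram such as the one in Fig. 7.5 can be generated by sweeping one parameter, discarding transient states at each parameter value, and plotting the remaining, approximately asymptotic, states. The sketch below uses an invented two-unit network and sweeps θ_2; the weights are illustrative, not those of the figure.

```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical two-unit network; only theta_2 is swept (illustrative values).
W = np.array([[-2.0, 6.0], [-6.0, 0.0]])
theta1 = -0.5

xs, ys = [], []
for theta2 in np.linspace(-4.0, 8.0, 600):
    theta = np.array([theta1, theta2])
    a = np.zeros(2)
    for _ in range(300):                 # transient states (discarded)
        a = theta + W @ sigmoid(a)
    for _ in range(100):                 # states recorded as the attractor
        a = theta + W @ sigmoid(a)
        xs.append(theta2)
        ys.append(sigmoid(a).mean())

plt.plot(xs, ys, ",k")
plt.xlabel("theta_2 (input to unit 2)")
plt.ylabel("mean output activity")
plt.show()
```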
Isoperiodic Plot and Landscape Plot

The isoperiodic plot and the landscape plot are suited for comparing changes of dynamics made dependent on combinations (usually pairs) of parameters.
4. The weight matrix and the bias vector of the network in Figs. 7.3 and 7.5 are
\[
W = \begin{pmatrix} 18 & 8 \\ 8 & 0 \end{pmatrix} \quad \text{and} \quad \theta = \begin{pmatrix} 0.45 \\ \theta_2 \end{pmatrix}.
\]
The isoperiodic plot is constructed in parameter space, where every point of parameter space (a combination of two parameters – two dimensions is easier on paper) is associated with the period of the attractor for that parameter combination. Thus, the isoperiodic plot presents a landscape picture of the kinds of dynamics present in a network. It indicates the locus of qualitative changes of dynamics, as bifurcations. This is analogous to what we saw in Sect. 6.1.1: a model's dynamical behavior is dependent on the parameters, and some regions of parameter space will lead to similar dynamical behaviors. Across bifurcation boundaries, there are different dynamical behaviors. These regions can be seen in Fig. 7.6a, represented by the same colors. Beyond periods, other features of the dynamics can be presented in parameter space; such plots are called landscape plots. The isoperiodic plot is an example of a landscape of dynamics, showing how features of the dynamical landscape change along the parameter dimensions. Similar plots can be produced containing other features of the dynamical landscape. Useful examples are (1) the mean of an orbit, (2) the mean amplitude of the orbit, (3) the length of transients, and so on. Figure 7.6b depicts the method applied to the average amplitude of attractors.5 To find the periods of a dynamical system is not a trivial job. Sometimes it may be difficult to distinguish between high periods and actual chaotic behavior. Another constraint is the presence of multiple attractors in the same region of parameter space, in which case only one of the coexisting periods is computed for a given parameter combination.
Fig. 7.6 Landscape plots for a two-unit recurrent neural network. (a) The periods of the dynamical systems for a combination of two parameters (w11 and w12), color coded (see the color bar). Dithering indicates regions of coexisting attractors (bottom right). White indicates either high periods (more than 18) or chaos. Close to bifurcation boundaries, white is an artifact of computation indicating long transients. (b) The average amplitudes of the attractors for the combination of the same parameters. Note that across some bifurcation boundaries, the amplitude changes smoothly, indicated by a smooth variation of the color. Dithering indicates sensitive dependence on the initial conditions associated with chaos. The structure of the network is given in footnote 5
5. The weight matrix and the bias vector of the network are
\[
W = \begin{pmatrix} w_{11} & w_{12} \\ 8 & 0 \end{pmatrix} \quad \text{and} \quad \theta = \begin{pmatrix} 0.45 \\ 3.9 \end{pmatrix}.
\]
So, the isoperiodic plot has two main limitations.

1. The period results from the computation of an algorithm, and therefore, for high periodicities it may be difficult to distinguish quasi-periodic attractors from chaotic attractors. To address this, I have developed a vector-based algorithm that compares a wave with itself to find periods (this is included in the online resources, at www.irp.oist.jp/mnegrello/home.html).
2. There may be coexisting attractors with different periods, for example, coexisting chaotic attractors. In many cases it may be impossible to determine the period univocally, because there may be more than one for a given parameter combination.

Nevertheless, with cunning, it is possible to account for these limitations. For limitation 1, one would have to rely on an estimation of the separation between orbits for one and the same parameter combination. If, for instance, the attractor is chaotic, there will be a measurable exponential separation between orbits (indicated by a positive Lyapunov exponent). If not, it is likely to be a quasi-periodic attractor (the orbits do not exponentially separate, as they are neighbors on a torus). For limitation 2, one may have to perform bifurcation analysis of specific parameters along with other heuristics to identify the different basins of attraction. Nevertheless, for small neural networks and the in-depth analysis of their dynamics, the isoperiodic plot and, more generally, landscape plots are valuable tools.
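The following sketch shows the skeleton of such an isoperiodic computation: a crude period estimator (iterate past the transient, then look for the smallest return time within a tolerance) evaluated over a grid of two parameters. It is a simple stand-in for, not a reproduction of, the vector-based algorithm mentioned above, and it inherits exactly the limitations just discussed: it returns 0 when no period is found (high period, quasi-periodicity, or chaos), and it only sees the attractor reached from its single initial condition. All numerical values are invented.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def period(W, theta, n_transient=2000, max_period=200, tol=1e-6):
    # Crude estimate of the attractor's period for one parameterization.
    a = np.zeros(len(theta))
    for _ in range(n_transient):
        a = theta + W @ sigmoid(a)
    ref = a.copy()
    for k in range(1, max_period + 1):
        a = theta + W @ sigmoid(a)
        if np.max(np.abs(a - ref)) < tol:
            return k
    return 0   # high period, quasi-periodic, or chaotic

# Illustrative isoperiodic grid over (w11, w12); the other entries are fixed.
for w11 in np.linspace(-4, 4, 5):
    row = []
    for w12 in np.linspace(-8, 8, 5):
        W = np.array([[w11, w12], [-6.0, 0.0]])
        theta = np.array([-0.5, 2.0])
        row.append(period(W, theta))
    print(row)
```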
7.1.3 Neurodynamics and Attractors

Recursive rules are what gives dynamical systems their tendencies (or attractors). The long-term behavior of orbits shows interesting phenomena, such as different kinds of oscillations, coexisting attractors, and fractal basins. The dynamical behavior of orbits can be analyzed with respect to changes in parameters and in the initial conditions. The long-term behavior of the system, referred to as attractors, is the main currency of neurodynamics analysis. One studies the changes in the dynamical behavior of orbits under parameter changes (some more examples of the dynamics of two-unit neural networks are found in the online resources, at www.irp.oist.jp/mnegrello/home.html). Changes in parameters are transformations with respect to which some features of the dynamics may remain invariant. Invariances may be qualitative or quantitative. The length of the period is one example: there are parameter changes that will keep the period unaltered. The amplitude of the attractor can also be invariant with respect to manipulations, and in this case I speak of quantitative invariance. Both can be studied with the tools described above. But for us, the importance of parameter-based comparison of dynamics arises as the dynamics are assigned causal roles in behavior. An example of this was already provided in Sect. 6.1.1, in which parameter changes beget categories. Some of these categories will promote equivalence in behavior. This is to be decided with respect to the problem under analysis. Chapter 8 will explain the functional role of attractor
landscapes. We will see that the analysis of dynamics is eminently dependent on the behavior under consideration. Equivalence between attractors, and by extension between landscapes, can be shown across networks in relation to the attractors of the set of dynamical systems resolved by one RNN structure. In the upcoming chapters, the tools of neurodynamics illuminate the dynamical features of networks responsible for the sensorimotor control of behavior. These networks are produced with the methods of evolutionary robotics, an overview of which is provided in the next section.
7.2 Evolutionary Robotics at a Glance

7.2.1 Neurodynamics and Evolutionary Robotics

Neurodynamics assumes that the dynamics of the networks can teach us about general principles of neural control of embodied behavior. Evolutionary robotics is the method that produces networks subserving embodied behavior. Their marriage is therefore a natural consequence: evolved networks are analyzed for their dynamical potentials, given by the network's structures.
7.2.2 From Evolution of Organisms to Evolutionary Algorithms

Darwin's evolutionary ratchet (variability, fitness, inheritance) is a fabulous heuristic to probe enormously complex search spaces such as body morphology and brain design. Genetic algorithms established a tradition by emulating Darwin's evolutionary tripod, trying to benefit from what seemingly worked so well in nature. In many benchmarks evolutionary algorithms compare favorably with, or outperform, other search algorithms, especially where greedy protocols are prone to lead to local minima/maxima. Evolutionary robotics employs evolutionary computation to shape the morphologies of agents, assemble control structures, find parameter spaces, propose architectures, or simply to solve a problem where evolution's methods are adequate. Evolutionary robotics has been employed in a variety of paradigms that study the evolution of communication [31], the space of morphological designs [16, 29], and social [28], perceptual [5], and psychological phenomena [20]. As with other paradigms based on simulations, experiments in the field are viewed as "opaque thought experiments" [4]. As such, they take in assumptions about the problem as ingredients, but without assuming the specific forms of final solutions. Therefore, something can be learned about both the assumptions and the putative solutions. Bullock argues that opaque thought experiments are invaluable tools for the study of complex phenomena: first, because thought experiments require the observer to carefully choose his or her simplifying assumptions, thereby making
explicit commitments; second, because failure is also informative, as it invites reexamination of what was taken as components; and third, because at its best, it is able to inform about the essence of the phenomenon, as simple rules emulate complex behavior.

7.2.2.1 Literature

A comprehensive review of evolutionary robotics and its applications can be found in [10]. The book by Nolfi and Floreano [19] attempts to ground evolutionary robotics as a discipline. A vast assortment of articles has appeared in a number of journals devoted to cognitive modeling, artificial life, as well as various flavors of robotics. More specific to the marriage between evolutionary robotics and neurodynamics, where networks are generated by evolutionary robotics and analyzed with respect to their dynamics, is work that places emphasis on understanding structure–dynamics–function relationships. A short listing of examples would have to include, but is by no means exhausted by, central pattern generators [26], robust object avoidance and light seeking [11], inverted pendulum balancers [24], multilegged walking machines [3, 30], as well as communication in multiagent systems [31] and flying robots [7]. On the philosophical forefront there are many contributions laying bare the advantages and contingencies of the usage of evolutionary robotics methods as scientific tools. Some good examples are given by Di Paolo et al. [4], who develop the concept of "opaque thought experiments," Beer [2], who thoroughly discusses the applications of these methods in the cognitive sciences, and Pasemann [23], who discusses the implications for the concept of internal representations.
7.2.3 Assumptions of an Evolutionary Robotics Problem

There Is No Way To Make Something out of Nothing

The first step in evolutionary robotics is to assume a problem, a function, that a robot may be able to solve. Second, one assumes components, the building bricks, and the modes in which these components may be rearranged and modified. For example, in the evolution of morphologies, one assumes limbs and joints, and variation proposes combinations. In a neural evolution example, the components may be units and synapses. The third assumption is the environment, the world in which the robot will behave, a simulated world or the real world. And finally, one assumes fitness metrics on the behavior and a method of selection. These assumptions are operative constraints in achieving a solution for the problem posed. But despite these constraints, the strength of the method is that there is no assumption about the forms of solutions. Everything that can be generated by the rearrangement process, and satisfies an evolutionary gradient on the fitness metrics,
is putatively a solution. If the space of possible solutions is large, and one wants to discover something about the space of solutions, this is the best method available.
7.2.4 Structural Evolution and Simulation

7.2.4.1 ENS3, Simba, and Yars

The algorithm employed for the artificial evolutions here was ENS3,6 described in [6], and reimplemented in the Simba environment [27]. The genome of the evolution algorithm is the structure of the networks themselves. The variation operators add or delete units (neurons) and synapses, as well as changing weights. The change operators are specified on the fly, according to probabilities (e.g., probability of adding, removing, or changing the weight of a synapse, probability of adding and removing a neuron, probability of changing the transfer function). The selection of the agents that generate offspring is rank-based, controlled by a probability density function. This probability function represents the probability of an agent with rank r having n offspring, where n is estimated via a Poisson process. The stochastic character of offspring production is meant to avoid niche exploitation by minimizing opportunism and by keeping network diversity, as argued in [6]. A population of RNNs is thus generated and embodied in the simulated agent. For the physical simulations, I used Yars (yet another robot simulator) [32], based on physical dynamics as modeled by the Open Dynamics Engine. After a single trial, fitness values are attributed to the RNNs controlling an agent. At the end of a generation, the networks are ranked according to their fitness values, and the best are selected to generate offspring. The offspring can be either simple copies of the parents or copies with alterations dictated by the change operators defined by the experimenter.
Fig. 7.7 The ENS3 algorithm for the evolution of the structure of neural networks. (From [12])

6. ENS3 stands for evolution of networks through stochastic synthesis and selection.
The process runs for many generations. Furthermore, during the evolutionary procedure the experimenter might also operate as a metafitness, selecting agents with interesting behavior, with an eye on the further analysis of their dynamics. The ENS3 algorithm is summarized in Fig. 7.7. Some examples of Yars simulations are given in Figs. 8.3, 9.8, and 9.10.
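As a rough illustration of this style of algorithm, and emphatically not the actual ENS3/Simba implementation, the sketch below evolves network structures with stochastic variation operators and rank-based, Poisson-distributed offspring counts. The operator probabilities, the genome encoding, and the placeholder fitness function are all invented for the example; in the experiments, fitness comes from embodying each network in a simulated agent and scoring its behavior.

```python
import copy
import random
import numpy as np

def random_network(n_units=4):
    # Genome = the network structure itself: a dict of synapses (i, j) -> weight.
    return {"n": n_units, "weights": {}}

def mutate(net, p_add=0.2, p_del=0.1, sigma=0.2):
    child = copy.deepcopy(net)
    if random.random() < p_add:                       # maybe add a synapse
        i, j = random.randrange(net["n"]), random.randrange(net["n"])
        child["weights"][(i, j)] = random.gauss(0.0, 1.0)
    if child["weights"] and random.random() < p_del:  # maybe delete a synapse
        del child["weights"][random.choice(list(child["weights"]))]
    for k in child["weights"]:                        # jitter existing weights
        child["weights"][k] += random.gauss(0.0, sigma)
    return child

def evolve(fitness, generations=50, pop_size=30):
    population = [random_network() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        offspring = []
        for rank, parent in enumerate(ranked):
            # Better-ranked parents expect more offspring; the actual count is
            # drawn from a Poisson distribution (stochastic selection).
            expected = 2.0 * (pop_size - rank) / pop_size
            offspring += [mutate(parent) for _ in range(np.random.poisson(expected))]
        population = (offspring + ranked)[:pop_size]
    return max(population, key=fitness)

# Placeholder fitness: prefer structures with about six synapses.
best = evolve(lambda net: -abs(len(net["weights"]) - 6))
print(len(best["weights"]), "synapses in the best evolved structure")
```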
7.3 Conclusion

It is thus by generating networks that produce behavior, and by analyzing the successful networks, that a theory of functional invariance of behavior is produced. The underpinnings of behavior can be sought in relation to the networks' structure, parameterizations, and consequent dynamics. Alloyed, neurodynamics and evolutionary robotics provide the tools for a holistic approach to the study of invariants of behavior. The next chapters will show how.
References

1. Amari S (1972) Characteristics of random nets of analog neuron-like elements. IEEE Trans Syst Man Cybern 2:643–657
2. Beer R (1998) Framing the debate between computational and dynamical approaches to cognitive science. Behav Brain Sci 21(05):630–630
3. Beer R, Gallagher J (1992) Evolving dynamical neural networks for adaptive behavior. Adapt Behav 1(1):91–122
4. Di Paolo E, Noble J, Bullock S (2000) Simulation models as opaque thought experiments. In: Artificial life VII: Proceedings of the seventh international conference on artificial life. MIT, Cambridge, pp 497–506
5. Di Paolo E, Rohde M, Iizuka H (2008) Sensitivity to social contingency or stability of interaction? Modelling the dynamics of perceptual crossing. New Ideas Psychol 26(2):278–294
6. Dieckmann U (1995) Coevolutionary dynamics of stochastic replicator systems. PhD thesis, Forschungszentrum Jülich
7. Floreano D, Zufferey JC, Nicoud JD (2005) From wheels to wings with evolutionary spiking circuits. Artif Life 11(1–2):121–138. URL http://www.mitpressjournals.org/doi/abs/10.1162/1064546053278900
8. Ghosh A, Rho Y, McIntosh AR, Kötter R, Jirsa VK (2008) Noise during rest enables the exploration of the brain's dynamic repertoire. PLoS Comput Biol 4(10)
9. Guckenheimer J, Holmes P (1983) Nonlinear oscillations, dynamical systems, and bifurcations of vector fields. Springer, Berlin
10. Harvey I, Paolo ED, Wood R, Quinn M, Tuci E (2005) Evolutionary robotics: A new scientific tool for studying cognition. Artif Life 11(1–2):79–98. URL http://www.mitpressjournals.org/doi/abs/10.1162/1064546053278991
11. Hülse M, Ghazi-Zahedi K, Pasemann F (2002) Dynamical neural Schmitt trigger for robot control. In: Dorronsoro JR (ed) ICANN 2002, LNCS 2415. Springer, Berlin, pp 783–788
12. Hülse M, Wischmann S, Pasemann F (2004) Structure and function of evolved neurocontrollers for autonomous robots. Conn Sci 16(4):249–266
13. Jaeger H (2001) The "echo state" approach to analysing and training recurrent neural networks. Tech. Rep. GMD Report 148, German National Research Center for Information Technology. URL http://www.faculty.iu-bremen.de/hjaeger/pubs/EchoStatesTechRep.pdf
14. Jaeger H (2001) Short term memory in echo state networks. Tech. Rep. GMD Report 152, German National Research Center for Information Technology. URL http://www.faculty.iu-bremen.de/hjaeger/pubs/STMEchoStatesTechRep.pdf
15. Jaeger H, Maass W, Markram H (2007) Special issue: Echo state networks and liquid state machines. Neural Netw 20(3):290–297
16. Lassabe N, Luga H, Duthen Y (2006) Evolving creatures in virtual ecosystems. Lect Notes Comput Sci 4282:11
17. Maass W, Natschlaeger T, Markram H (2002) A fresh look at real-time computation in recurrent neural circuits
18. Molter C, Salihoglu U, Bersini H (2007) The road to chaos by time-asymmetric Hebbian learning in recurrent neural networks. Neural Comput 19:80–110
19. Nolfi S, Floreano D (2004) Evolutionary robotics: The biology, intelligence, and technology of self-organizing machines. Bradford Book, Cambridge, MA
20. Ogai Y, Ikegami T (2008) Microslip as a simulated artificial mind. Adapt Behav 16(2):129–147
21. Pasemann F (1993) Discrete dynamics of two neuron networks. Open Syst Inform Dyn 2(1):49–66
22. Pasemann F (1995) Characterization of periodic attractors in neural ring networks. Neural Netw 8:421–441
23. Pasemann F (1996) Repräsentation ohne Repräsentation – Überlegungen zu einer Neurodynamik modularer kognitiver Systeme. In: Interne Repräsentationen – Neue Konzepte der Hirnforschung, pp 42–91
24. Pasemann F (1998) Evolving neurocontrollers for balancing an inverted pendulum. Netw Comput Neural Syst 9:495–511
25. Pasemann F (2002) Complex dynamics and the structure of small neural networks. Netw Comput Neural Syst 13(2):195–216
26. Pasemann F, Hild M, Zahedi K (2003) SO(2)-networks as neural oscillators. In: Mira J, Alvarez JR (eds) Computational methods in neural modeling. Proceedings IWANN. Springer, Berlin, pp 144–151
27. Rempis C (2007) Simba, a framework for the artificial evolution of neural networks
28. Rohde M, Di Paolo E (2006) An evolutionary robotics simulation of human minimal social interaction. Long abstract, SAB'06
29. Sims K (1994) Evolving virtual creatures. In: SIGGRAPH '94, Computer Graphics Proceedings, Annual Conference Series, pp 15–22
30. von Twickel A, Pasemann F (2007) Reflex-oscillations in evolved single leg neurocontrollers for walking machines. Nat Comput 6(3):311–337
31. Wischmann S, Hülse M, Knabe JF, Pasemann F (2006) Synchronization of internal neural rhythms in multi-robotic systems. Adapt Behav 14(2):117–127. DOI 10.1177/105971230601400203. URL http://adb.sagepub.com/cgi/content/abstract/14/2/117
32. Zahedi K, Twickel A, Pasemann F (2008) YARS: A physical 3D simulator for evolving controllers for real robots. In: Proceedings of the 1st international conference on simulation, modeling, and programming for autonomous robots. Springer, Berlin, pp 75–86
Chapter 8
Attractor Landscapes and the Invariants of Behavior
Abstract This chapter introduces an explanatory middleman, the concept of an attractor landscape. It is used to show what an invariant of behavior is: an abstract representation of a functional mechanism. With an experiment in the evolution of active tracking, recurrent neural networks are evolved that enable agents to track a moving object in a simulated environment. The invariant behavior, i.e., the functional mechanism implemented by the agents, is a two-dimensional version of negative feedback, well known from cybernetics, which subsumes an ample range of both simple and complex organismic functions. I will (1) show how, despite extreme variability in network structures, constancy of behavior reflects invariant features of the attractor landscapes, (2) show that behavioral function only exists as a potential until it is evoked, and (3) show that even networks with radically different attractors may implement the same embodied function, as long as these networks possess certain invariant features of the attractor landscape. This chapter also addresses constancy arising from level crossing, or convergence, as variable activity patterns have their dimensions effectively reduced at the level of the actuators. I will show how attractors are made equivalent through convergence, and how the ongoingness of behavior appears as the attractor landscape is explored through organism–environment interaction. This will lead to the concept of a "metatransient," with respect to which we show effectively what it means for an organism to implicitly represent the coupling between itself and its environment.
8.1 Introduction

8.1.1 Outlook

Now we put the concepts developed so far to good use, and return to the main question of this book: What is an invariant of behavior? In this chapter, the first of a series of simulation experiments on embodied active tracking is presented, its neurodynamics is analyzed, and the implications are discussed. Two new concepts are defined and employed in the analysis, the attractor landscape and the metatransient.
In the following, the core ideas of the previous chapters will be interwoven with the results of the experiment, under the overarching theme of constancy and variability.

– Structural coupling sets the paradigm for the experiments, showing that whatever analysis is being performed, it is only meaningful with the assumption of the modes of interaction between an agent and its environment. Invariants of behavior can only be defined, in fact they only make sense, when referring to a world with which the agent is structurally coupled.
– Coupled dynamical systems. The world, the body, and the neural structures are the three main levels modeled in the experiment. In the simulations, the world and the body are modeled according to physical dynamics, whereas the neural structures have the dynamics of recurrent neural networks (RNNs). Behavior is produced in the engaged interplay of the three.
– Convergence leads to constancy through reduction of dimensionality. Here, this happens in two ways: first, through the projection of multidimensional attractors to the motors, and second, through aliasing effects given by bodily inertia. This reduction of dimensionality implies that there is equivalence of activity patterns executing similar actions. Reduction of dimensions is the principle behind the equivalence of attractor landscapes.
– Divergence, or variability of behavior, is given by the history of interactions, which takes the agent to different internal states. In turn, different internal states signify different responses, which cause a dependence on the history of interactions.
– The invariant of behavior is essentially an abstract mechanism, a reafferent negative feedback through the world. The mechanism can be best understood in terms of the attractor landscape of the evolved networks. Different networks with invariant features of the attractor landscapes exhibit equivalence of functional behavior. Invariant features of the attractor landscapes trace back to the problem posed.
8.1.1.1 The Merkwelt and the Wirkwelt

Behavior Is the Ongoingness of a Sensorimotor Loop

It traverses from the agent to the world and back. How the sensorimotor loop operates depends on the blueprint of an organism, which defines what it can perceive and what it can do. Often, the space between sensors and motors is occupied by one or another control structure, defining how an agent should act when it perceives what it does. Those parts of the world that drive an agent, those which it can perceive, were called by von Uexküll the Merkwelt (crudely, "perception world"). The Merkwelt is an agent's window to reality, including all that which the agent can distinguish, all that which is potentially "informative." The other side is the Wirkwelt (roughly, "action world"), where the agent effects change, either in the world (moving an object) or in its relations to it (moving itself). Connecting the two worlds is the Innenwelt ("inner world"), the control structures of an agent.
Fig. 8.1 Von Uexküll's Funktionskreis (1934), emphasizing the cyclic nature of functional behavior, with an Innenwelt that interfaces a Merkwelt through Merk-Organen ("perceptors") and a Wirkwelt through Wirk-Organen ("effectors")
The Innenwelt given by a RNN is a black box with dynamical potentialities, and to explain it, the contents must be exposed in terms of the dynamics, which only make sense in terms of the Merkwelt and Wirkwelt. In what follows, the Innenwelt black box is disassembled in terms of the potential dynamics of a neural control structure, and it is there that the invariant of behavior is to be sought (see Fig. 8.1).
8.1.1.2 Attractor Landscapes and Behavior

Discrete time RNNs are taken as analogues of the neural control structures of organisms. As described in more detail in Sect. 8.1.2.1, RNNs are in fact a collection of dynamical systems, in which a large repository of dynamics might be encountered. We shall call the attractor landscape the dynamical repository given by one RNN and its possible parameterizations1 (the Merkwelt), or the collection of all dynamical systems that a RNN can be, and by extension, all the behaviors an embodied network can deploy. This is depicted in Fig. 8.2.
1. Parameterizations are meant in the manner that has become usual in the dynamical systems literature, in which parameters are variables that change slowly with respect to the dynamics of the network.
Fig. 8.2 The neurodynamics explanatory loop. RNN recurrent neural network, E environment, ρ parameters, a agent, D dynamical system
8.1.1.3 Transients and Behavior

Meaningful behavior results from the exploitation of the attractor landscapes by transients [2, 4, 11]. In dynamical systems theory, transients are defined as the sequences of states of a dynamical system in phase space before asymptotic states (attractors) are reached. However, during behavior it is unlikely that transients are given enough time to coincide with one attractor, because of ever-present feedback, both from the body (as in proprioception) and from the environment (exteroception, stimuli). Rather, behavior hops under the influences of the attractors: the input to the system, regarded as parameter changes both in the form of feedback and in the form of perturbations, results in the realization of a class of dynamical systems, on whose parameterizations the attractor landscape is dependent. Therefore, behavior happens on a "metatransient." In contrast to a transient, which approaches an attractor in one dynamical system, a metatransient is, roughly, a transient that is subject to varying parameterization (changing stimulus) and therefore to distinct dynamical systems. In other words, the metatransient explores the basins of attraction and respective attractors as they change (in size, number, or shape) under the
influence of the changing stimuli, internal state, and relational situations. In fact, our thesis is that behavior can be regarded as induced by a metatransient across the attractor landscape of a parameterized dynamical system (a RNN) that acts nontrivially through parameter shifts. In an embodied agent the transients are driven by the input, to exploit the dynamical substrate of a RNN as a scaffold for control and behavior [6]. In what follows, we make an empirical offering to demonstrate how this exploration can occur, via a toy problem in active tracking. The goal of the depictions included is to highlight dynamical entities as reductionistic concepts for the explanation of behavior. But before the problem itself, I introduce some new definitions in the language of neurodynamics.
8.1.2 Definitions

8.1.2.1 Structural Coupling

By assumption, we (and others) consider input to the network to be equivalent to parameterizations which change slowly in comparison with the dynamics of the network. In an embodied problem, these parameterizations carry the structure of the interactions. So, after [33], we write the structural coupling (•) of the agent (a) and the environment (E) at a given moment (t) as leading to (⇝) a set of parameters (ρ) of the control structure, the RNN:

\[
\langle E \bullet a \rangle_t \rightsquigarrow \rho_t. \tag{8.1}
\]
8.1.2.2 Path in Parameter Space

A path in parameter space, P = (..., ρ_{t−1}, ρ_t, ρ_{t+1}, ...), refers to the temporal sequence of parameter changes resulting from the structural coupling with the environment, where the ellipses before and after the parameters stress the ongoingness of behavior. In a parameterized RNN such as the ones we handle here, every parameter set defines a distinct dynamical system D. Recall that a discrete time dynamical system is a map (↦) which takes a state in phase space to the next state, so one parameter set (ρ_t) produced by the interaction of a and E will take state s_t to state s_{t+1}:

\[
D_{\rho_t}(s_t) \mapsto s_{t+1}. \tag{8.2}
\]
By extension, because of the sensorimotor loop, the new state s_{t+1} leads to a new parameterization, D_{ρ_{t+1}}, that in turn takes the network to the next state, and so on in an ongoing succession.
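The following sketch makes the loop of (8.1) and (8.2) concrete in Python: a sensed quantity resolves the parameterization ρ_t, the map D_{ρ_t} advances the network state, and the (possibly transient) state is projected to a motor action that changes the agent–environment relation. Every function and number here is an invented placeholder, not the tracking experiment of this chapter; the point is only the shape of the recursion that produces a path in parameter space and, with it, a metatransient.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical two-unit controller (illustrative weights and biases).
W = np.array([[1.0, -3.0], [3.0, 0.5]])
bias = np.array([0.2, -0.2])

def sense(env, agent):
    # Structural coupling <E . a>_t ~> rho_t : the input parameter depends
    # on the current relation between environment and agent.
    return bias + np.array([env - agent, 0.0])

def act(state):
    # Convergence: the multidimensional state is projected to one motor value.
    return 0.5 * (sigmoid(state).mean() - 0.5)

env, agent, s = 1.0, 0.0, np.zeros(2)
path, metatransient = [], []
for t in range(100):
    rho = sense(env, agent)       # rho_t, one point on the path P
    s = rho + W @ sigmoid(s)      # D_{rho_t}(s_t) -> s_{t+1}
    agent += act(s)               # even a transient state produces an action
    env = np.sin(0.1 * t)         # the environment keeps changing as well
    path.append(rho)
    metatransient.append(s.copy())

print(np.array(metatransient)[-3:])   # last few states of the metatransient
```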
8.1.2.3 Metatransient

Any state s of the network's activity in phase space might, or might not, have reached an attractor. Recall that states before the attractor is reached are denominated transient states. It is noteworthy that when the network has motor efferences, transient states may have functional consequences: they produce a change in a's state, possibly a motor action, changing the relations between a and E, as in (8.1), recursively resolving subsequent parameterizations. The network therefore need not have reached an attractor in order to produce behavior. If the state of the network is constantly projected to motor output, a transient state may also produce an action. Therefore, the path in parameter space P is determined by the sequence of interactions between the agent and the environment, which although influenced by the attractors is not strictly a function thereof. Consequently, the metatransient M is simply the set of states of the network in phase space as it is constrained by the structural coupling and its respective parameterizations, M = (…, s_t, …). This can be rewritten so as to outline the fact that the states depend on the application of the map D_ρ with a parameterization that changes over time:

    M = […, D_{ρ_{t−1}}(s_{t−1}), …]    (8.3)
Recall that a bifurcation is a region of structural instability, where the dynamics of two neighboring dynamical systems have qualitatively different attractor sets. Paths in parameter space P can be within or across the bifurcation boundaries. In the first case, attractors undergo smooth changes or morphing, whereas bifurcation boundary crossings lead to qualitatively distinct dynamical behavior.2
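Continuing the sketch above (same assumed helper and shapes, not from the text), a metatransient can be collected by carrying the state along a given path in parameter space:

    import numpy as np

    def metatransient(path, s0, W):
        # Iterate the network along a path P = (rho_0, rho_1, ...), collecting the
        # visited states: a sketch of M in (8.3). Each rho resolves a different map
        # D_rho, and the state is inherited across them.
        s, states = s0.copy(), []
        for rho in path:
            s = W @ np.tanh(s) + rho           # D_{rho_t}(s_t) -> s_{t+1}, cf. (8.2)
            states.append(s.copy())
        return states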
8.1.2.4 Attractor Landscape

The dynamical substrate mentioned in Sect. 8.1 is the attractor landscape of a RNN, which is determined by its structure, i.e., the weights and biases of one RNN, and by its possible input parameterizations, which, as we have seen, are constrained by the structural coupling between the agent and the environment (Fig. 8.2). We assume that the structural variables of the network (weights and biases) remain fixed for the duration of a trial. A depiction of an attractor landscape is given by the bifurcation sequence in Fig. 7.5, representing the projection of the attractor landscape along the dimension of ρ_2. Consequently, for one embodied network with a fixed structure, the agent will do that which the network structure plus parameterizations allows; the agent will act according to its attractor landscape, subject to the paths in parameter space. The
2 It is helpful to be aware that "qualitatively distinct dynamic behavior" might lead to qualitatively similar agent behavior, which might be a source of ambiguity (see Sect. 8.4.6.2).
attractor landscape is, so to speak, the behavioral invariant of one embodied agent; in other words, the capacity for behavior is given by the attractor landscape, accessible via parameterizations and the interaction with the environment. This is the claim which the next sections will attempt to substantiate. Furthermore, we speak of a functional attractor landscape when, with reference to the attractor landscape, we are able to identify the mechanism by which it subserves particular kinds of functional behavior.
8.2 Toy Problem in Active Tracking

To illustrate the concepts above, we present a toy problem in active tracking in which evolutionary robotics methods [12] beget the structural parameters of the networks (weights and biases) embedded in agents. Simply put, the problem is the following. A head with a two-degree-of-freedom neck should be able to follow, with its gaze, a ball that bounces irregularly within a frame (see Fig. 8.3). The primary problems for the head are to know (1) in what direction to turn and (2) with what velocity. A secondary problem is to actively search for the ball in case it is lost from sight. Complementarily, our problem is to identify the dynamical entities that allow the network to do so. That is, the focus of the analysis is to see (1) how the metatransient might hop between attractors while (2) attractors change as a function of changing input patterns, and finally (3) how the projection of attractors becomes proper motor action.
Fig. 8.3 The simulated environment and the agent. The lines represent the distance sensors, where red means that the ray is in contact with the ball. The head is mounted on a neck controlled by two motors for pitch and yaw (the distance from the head to the ball is 3 m)
The main motivation for this formulation is to have a highly dynamical problem in which embodiment by necessity plays a fundamental role. In the case under study, inertial factors are crucial components of behavior and impact the dynamics of the sensorimotor loop. In this way we may observe the behavior that occurs between attractors, in the metatransient. In the results section we describe the following in full detail:

- The convergent projections of the high-dimensional attractors onto the motor units are at the core of the solution. We speak of the projection shape as the action identity, and indicate how attractor morphing may also be responsible for proper action selection.
- Control is established by negative feedback, where the control signal is provided by the environment; the action is supported by the attractor landscape, which determines the attractors employed for control.
- The agents display temporal context dependency, which leads to divergent behavior. We show that there may be two coexisting attractors associated with one parameterization of the network, and that these attractors may lead to different actions. One individual path in parameter space keeps the metatransient under the influence of the same basins of attraction.
- Likewise, robustness appears because multiple possible network responses generate similar actions in response to a given pattern. This means that the attractor landscapes have redundant attractors with respect to their evoked action. Conversely, there might be attractors leading to different actions in response to one stimulus pattern (especially with ambiguous sensory input). This may also be due to chaotic attractors in ambiguous regions of the attractor landscape.
- Through an analysis of the metatransient it becomes possible to infer an implicit modeling of the physical characteristics of the environment, such as gravity.
- Constancy in behavior is not apparent in the network structures, which can vary widely. Constancy is to be found in a particular depiction of the attractor landscapes that takes features of the environment and of the agent into consideration.
8.3 Methods

8.3.1 Problem Description

8.3.1.1 The Toy Problem

The toy problem consists of a cybernetic tracking device following a ball in a virtual environment (see Fig. 8.3). The task of the device (the head) is to track a ball that bounces within a frame, keeping the ball under its gaze at all times. The device is a head mounted on a neck, composed of two motors for yaw and pitch. The input sensors are an array of nine distance sensors that signal linearly in the range from 0 (nothing meets the ray) to 0.15 (ball closest to the head). The head is directed
Fig. 8.4 An example of a path in parameter space: a sequence of 15 frames of stimuli, gathered during an average trial. The stimuli are analogous to a 9-pixel retina. The patterns are resolved by the encounters of retinal rays and the ball, see also Fig. 8.3
Table 8.1 Open Dynamics Engine simulation physics of the tracking experiment

Entity                        Property                                            Quantity
Head                          Mass                                                3 kg
Head                          Height                                              2 m
Yaw and pitch motor           Maximum force                                       5 N
Yaw and pitch motor           Maximum velocity                                    90 deg/s
3 × 3 distance input array    Vertical and horizontal distance between sensors    0.25 m
Ball                          Radius                                              0.5 m
RNN and physics               Update frequency                                    100 Hz

RNN recurrent neural network
towards a frame that constrains a ball, which hops erratically in the two-dimensional plane. The ball is subject to gravity and to the geometry of the frame. As mentioned, by design the ball does not lose energy as it bounces. The head "sees" the ball through the interaction between the distance rays and the ball. An idea of the kind of input the head receives is given in Fig. 8.4, where 15 sequential inputs gathered during a trial are depicted (the input is analogous to a 9-pixel retina). The simulated neck motors take the desired velocity as input, up to a certain force (according to the specifications of the simulation library of rigid-body dynamics used in the simulation, the famed Open Dynamics Engine); see Table 8.1 for other physical properties of the simulation. As usual in evolutionary robotics, an evolutionary algorithm selects RNNs according to a fitness function defining the aptitude of the agent to solve the prescribed problem. So, the telos of the head (keeping the ball under gaze) is reflected in the fitness function: the more the agent is able to keep the ball in sight (sensor input accumulated over trial time) while minimizing oscillations, the fitter the network (see (8.4)).
8.3.1.2 The Fitness Function

The fitness function of a trial for an individual is given in (8.4), in which the first (stimulus) sum is the term LT and the second (quadratic) sum is the term QT:

    fitness_ind = α Σ_{t=1}^{c} Σ_{s=1}^{9} S_s(t) − β Σ_{t=2}^{c} Σ_{m=1}^{2} [M_m(t) − M_m(t−1)]²    (8.4)
where S_s(t) is the input at sensor neuron s at time t and M_m(t) is the activation of motor neuron m at time t. The first term (LT) stands for the sum of stimulus inputs per cycle, and the negative quadratic term (QT) aims to minimize oscillations. α and β are parameters weighting the two terms and are alterable on the fly during evolution, depending on the experimenter's emphasis. c stands for the number of steps, or cycles, in a single trial. Note that the fitness function requires only the sensor and motor activities, so all the information it uses is also available to the agent.
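As an illustrative sketch (array shapes and default weights assumed, not from the text), (8.4) can be computed directly from the logged sensor and motor traces of a trial:

    import numpy as np

    def fitness(S, M, alpha=1.0, beta=1.0):
        # S: (c, 9) sensor inputs per cycle; M: (c, 2) motor activations (yaw, pitch).
        LT = S.sum()                            # accumulated stimulus input, term LT in (8.4)
        QT = np.sum((M[1:] - M[:-1]) ** 2)      # quadratic penalty on oscillations, term QT
        return alpha * LT - beta * QT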
8.3.1.3 Embodied Discrete Time Recurrent Neural Network

The embodied discrete time RNN is the model employed in the experiments, with the hyperbolic tangent as the nonlinear transfer function. The input layer is composed of linear buffers and receives no backward connections. On the online resources, at www.irp.oist.jp/mnegrello/home.html, one can find hundreds of evolved networks, as well as the ones selected for analysis.

    a_i(t+1) = Σ_{j}^{n} w_ij σ[a_j(t)] + ρ_s,    i, j = 10, …, n;  s = 1, …, 9    (8.5)

where a_i is the activity of the i-th unit of the network. The total number of units is n. σ is a sigmoid function, in this case the hyperbolic tangent. w_ij reads "i receives from j with weight w." ρ_s is interpreted as slowly varying input from the sensors. The sensors are units 1–9; the motors are units 10 (yaw) and 11 (pitch).
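A minimal sketch of one update of (8.5), assuming a dense weight matrix and zero-based indexing (the helper name and shapes are illustrative, not from the text):

    import numpy as np

    def rnn_step(a, W, rho):
        # a: activities of all n units; W: weights (row i receives from column j);
        # rho: input vector, nonzero only on the nine sensor channels.
        return W @ np.tanh(a) + rho

    # The motor efferences are read from the same state vector:
    # yaw, pitch = a[9], a[10]   (units 10 and 11 in the text, zero-based here)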
8.3.2 Challenges for the Tracker

Although this is a toy problem, the solution is not trivial, as it results from a physical simulation with rich dynamics. Figure 8.3 depicts the environment; the difficulties ensuing from it are listed here:
1. The head has to cope with rather meager input (a mere nine distance sensors), and every pattern taken individually is ambiguous (is the ball coming into or escaping from view?). Moreover, even small changes in ball position relative to the sensor rays might lead to big input changes (say, in one cycle the input of one sensor might drop from a positive value to zero, as can be seen in Fig. 8.4).
2. The head's foe, the bouncing ball, is designed to bounce erratically owing to (1) being dropped from different initial positions and (2) the different angles of the bottom platforms (see Fig. 8.3). For most initial conditions, the bouncing trajectories of the ball are highly unpredictable.
3. By design the ball does not lose energy as it bounces, implying that when the ball bounces at different positions of the side walls or bottom platforms, it has very different velocities along the z and y Cartesian axes. For example, when the ball bounces sideways at the bottom of the frame, the horizontal velocity is much higher than when it bounces higher up in the frame. The ball is constantly subject to gravity of 9.8 m/s².
4. The network has no knowledge of the frame, so in principle the head has no information about the exit angle of the ball after it bounces against the frame. That means that if the network has a stereotypical ball-following response (such as a pure asymmetry of the left–right weights), it is bound to lose track, as was observed in the first generations (approximately up to generation 50).
8.3.3 Convergence and Motor Projections of Attractors

Effective methods of analyzing small neural networks for their dynamic capacity have shown how paths in parameter space are able to change the behavior of the network both qualitatively and quantitatively. The presence of bifurcation boundaries in parameter space illustrates how even slight changes of parameters can bring the networks to different dynamical regimes. Sadly, some analytical methods used for very small networks are unavailable for large and highly recurrent ones such as the one we treat herein. Nonetheless, complex high-dimensional spaces are not wholly intractable. We therefore rely on projection methods for the analysis, such as projecting the many-dimensional orbits onto motor space according to the definition of connected paths in parameter space, which indicate bifurcations as well as attractor morphs. Motor projections point to many of the relevant aspects of the dynamical entities involved, and provide an intuitive visualization of the meanings of the metatransients. Figure 8.10, for instance, depicts a considerable part of the action set for the whole set of possible inputs. Although this analysis sacrifices, for example, the determination of precise bifurcation boundaries, it nevertheless allows a bird's-eye view of the totality of the agent's action set. Problem-specific knowledge can also simplify the input dimensions. The environment is highly structured, regimented by physical laws, which produce regularities in causation. The same is valid for the form of a perceptual apparatus. In our case we reduce the nine-dimensional input space to the two principal components of the stimulus, which are the relative positions of the ball with respect to the sensory array in the horizontal and vertical dimensions. This reduction has the added benefit of showing how raw sensory input dimensions can be drastically reduced by the mild assumption that changes in input space are correlated. This is because both the embodied agent and the environment are extended in the world, so the inputs are never scrambled, and are always orderly. The possible configurations of the sensory stimulus are in our case constrained to the possible interactions between the rays and the ball, which define the sensory manifold, and thereby all the possible paths
on the sensory space (as in [25]). Through these defensible simplifications it became possible to analyze the “action set” of the agent, despite the high dimensionality of both the input and the internal states.
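As an illustration of this reduction (the geometry below is assumed, not taken from the text: a linear falloff of each ray's reading with distance from the ball center), a relative ball position in head-centered coordinates can be mapped back to a 3 × 3 sensor pattern:

    import numpy as np

    def stimulus_for(x, y, spacing=0.25, radius=0.5, gain=0.15, n_units=11):
        # Map a relative ball position (x, y) to the input vector rho of (8.5):
        # nonzero only on the nine sensor channels. Ray spacing, ball radius, and
        # the maximum reading follow Table 8.1 and Sect. 8.3.1.1; the linear
        # falloff with distance is an illustrative assumption.
        rho = np.zeros(n_units)
        offsets = (-spacing, 0.0, spacing)
        k = 0
        for dy in offsets:            # rows of the 3 x 3 array
            for dx in offsets:        # columns
                d = np.hypot(x - dx, y - dy)
                rho[k] = gain * max(0.0, 1.0 - d / radius)
                k += 1
        return rho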
8.4 Results

8.4.1 Tracking Behavior Across Attractors

Competent behavior happens when the head is able to match the angular velocity of the neck with the linear velocity of the ball. This requires modulation of both the direction and the force applied to the yaw and pitch motors. From a snapshot of the input (a single ρ_t), neither direction nor force is decidable. So, for the best tracking, past states must also be taken into account for the current action; that is, the optimal solution requires memory, found in the internal states of the hidden layer. As we will see, it is the profile of the motor output wave that modulates both the force and the direction of the neck. The choice of wave profile (the attractor translated into a motor output time series, see Fig. 8.6) for control is equivalent to the choice of an attractor invoked by the sensorimotor loop.
8.4.2 Solutions

The first solutions (before the 50th generation) were simple networks, which served as a canvas from which evolution built more resourceful ones. These primitive solutions employed asymmetry of the network weights to drive the motors. An obvious limitation of such networks is that they are unable to actively search, instead remaining at fixed points (say, remaining down left until the ball is again in sight) or settling into trivial oscillatory behavior. These primitive solutions were gradually replaced by networks able to solve the problem more robustly, although aspects of the initial networks were also inherited by their descendants. The agents inspected in the subsequent sections never lose the ball from their gaze under normal conditions (e.g., unchanged gravity, constant size of the ball). Many of the networks were selected not only by their fitness, but also by their observed behavior in different conditions, for example, a smaller ball and higher simulation frequencies (many subtle and interesting properties are hard to define in terms of the fitness function, for example, the search strategies when the ball escapes the gaze). Those whose behavior was seemingly less stereotypical also proved to have more diverse supporting dynamical structures. Nevertheless, the network's size was constrained in evolution to not more than 16 units in the recurrent layer and a maximum of 140 synaptic connections.
8.4.3 Analysis of Dynamical Entities Generating Behavior

Most of the analysis is done on the asymptotic behavior of networks decoupled from the sensorimotor loop of the simulation. Artificial stimulus patterns that emulate possible interactions of the sensors with the ball lead to responses that are then projected onto motor space. This reduction has often been used in such studies and allows an intuitive understanding of the behavior of the network. It produces a picture of the behavior in the case of constant input, depicting the momentary tendency of the metatransient. Moreover, to represent the motor actions the agent's body effectively carries out, we average the motor outputs over 100 network steps, producing the average amplitude of the motor outputs as the action tendency at any given moment.
8.4.4 Convergent Activity and Equivalence of Attractors

8.4.4.1 Motor Projections of Attractors

For our analysis we define the velocity with which the head will move as the average of the output of the motor units over a number of cycles. This is seen in Fig. 8.5, which presents (1) a constant stimulus input, (2) the associated attractor, and (3) the arrow representing the averaged output for both motors. This is consonant with the concept in Sect. 5.3 whereby the dimensionality of an attractor is reduced as the multidimensional attractor of the whole network is translated into an activity time series for the motors. The inertia of the body, in this case the head, antialiases the high-frequency time series, causing a "convergence," a reduction of dimensions.
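A sketch of this measure, under the same assumptions as the earlier snippets (clamped stimulus, tanh update, motors at zero-based indices 9 and 10; the settling and averaging windows are illustrative):

    import numpy as np

    def action_tendency(W, rho, n_settle=200, n_avg=100):
        # Clamp a stimulus, iterate the network, discard the transient, and
        # average the motor projections: the "velocity arrow" of Fig. 8.5.
        a = np.zeros(W.shape[0])
        outputs = []
        for t in range(n_settle + n_avg):
            a = W @ np.tanh(a) + rho
            if t >= n_settle:
                outputs.append(a[9:11])        # (yaw, pitch) projection
        return np.mean(outputs, axis=0)        # averaged motor output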
Fig. 8.5 Left to right: Stimulus pattern, the period 4 attractor itself, and the averaged output of the associated attractor (the vertices of the polygon are the attractor's states). In the middle panel, the arrow indicates the direction of the sequence of states on the attractor
Fig. 8.6 Temporal translation of an orbit on the period 4 attractor for the yaw (top) and pitch (bottom) motors. The shape of the oscillations evoked by the attractor defines the velocity arrow in Fig. 8.5
The period 4 attractor is shown in Fig. 8.6 as a time series of the motor output projections for both motors. The output of the motor units is regulated by the profiles of the activation curves for both motors, and therefore by the shape of the motor projections of the attractor. The amplitudes of the motor projections determine the velocity imprinted on the motors. In the figure one also sees that although the activities of the two motors (at any t) depend on one and the same attractor, each motor reads different characteristics of it. So, although the activity of the network is holistic, the motor efferences are independent readings of this n-dimensional attractor shape, as we claimed in Sect. 8.1.3 For every pattern presented to the input units, the asymptotic behavior of the network is a time series, produced by sequential visiting of the states of the attractor. Such a time series can be, for instance, motor outputs or inputs to other modules. In the case of motor outputs, it will produce a sequence of actions. In the case of the tracking head, it will produce a sequence of gaze motions (see Fig. 8.7). A gaze direction in turn induces another input pattern, and so on in a sequence of couplings where outputs feed back into the inputs through the environment.
8.4.5 Features of Evolved Attractor Landscapes

During behavior, the attractors that are projected to motor actions here are often not trivial. The network does not simply associate a fixed-point attractor in response to
3 The idea that readout units sample different aspects of the same attractor is analogous to the approach of liquid state machines or echo state networks (dynamic reservoir networks) [19], with the difference that their attractors are generated randomly to satisfy certain requirements for complex dynamics, whereas our attractors are incrementally evolved. This difference follows from the role of the attractors: we think that some attractors are more apt to solve certain kinds of problems, and that artificial evolution is a good method to beget them.
Fig. 8.7 Sequence of stimulus–action pairs of one particular network. The stimuli were recorded in an actual trial. The arrows represent the average activity of the motor output attractor in response to the associated stimulus. The stimulus sequence is a path in parameter space. Note the corrective aspect of the action (ball up left, action up left)
Fig. 8.8 Plots of orbits on output phase space for a series of different attractors for one and the same input pattern, with randomized initial conditions of the hidden layer. The ball is centered in the agent’s view, that is, the relative position of the ball is (0,0). The z-axis indicates the number of network iteration steps (time). The widely varying shapes of the attractors indicate distinct basins of attraction, and coexisting attractors on the attractor landscape. This leads to dependence on history (a type of short-term memory)
a stimulus pattern, such as in a Hopfield network. In fact we find a rich portfolio of attractors as can be seen, for example, in the plots in Figs. 8.8, 8.10, and 8.14. We observe that for a large portion of the input patterns (constructed to represent interactions with the ball), the natural asymptotic output of the network is some nontrivial attractor, normally with high periodicity, quasi-periodic orbits, or chaos. This is also true for coexisting attractors which exist for one and the same input pattern. This is verified by randomizing the initial conditions of the hidden layer and comparing the resulting asymptotic states for one single stimulus pattern (as in Fig. 8.8).4 In support of our observations, unlearned randomly connected RNNs with high recurrence have been shown to possess many very informative states [8,23], as they use many different attractors to translate different input patterns. Here it is also the case that the networks “freely” associate different sorts of attractors to stimuli during the evolution, retaining a high capacity for attractor storage.
4 This does not mean, however, that the agent may reach every coexisting attractor during behavior, since the possible states are also bounded by the possible history of interactions.
8.4.5.1 Metatransient and Implicit Mapping of Environmental Asymmetries

It is easy to see that there is no one-to-one mapping of a given input pattern to a velocity of the ball.5 From that it follows that, to be optimal (gaze locked with the ball), the head has to use different velocities even when the input pattern is kept constant (although the acceleration is constant). Therefore, the velocity of the head has to be chosen by considering the recent velocity history, in order to respond to gravity. For example, in the case of a falling ball, the metatransient has to approach attractors whose motor projection increases as the ball accelerates. Furthermore, as the input patterns are rather similar while the ball is under the gaze, small paths in parameter space have a very definite meaning. This is indicated in Fig. 8.9 in a plot of the pitch projection of the metatransient. The activations of the pitch motor unit were recorded during a trial. The jagged profile of the transient is averaged with a causal rectangular convolution window to represent the effected output velocity. One sees that the oscillations on the y-axis lead to a linear increase of the averaged velocity of the tracker's neck. This is consistent with the linear velocity increase imposed by gravity. That means that the network has implicitly imprinted the interaction with gravity into its dynamical substrate. The negative feedback was adjusted to cope with the specific physics of the environment.
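The averaging used for Fig. 8.9 is a causal moving average; a minimal sketch (window length as in the text, ten steps at 100 Hz; the function name is illustrative):

    import numpy as np

    def effected_velocity(motor_trace, window=10):
        # Causal rectangular convolution: each output sample averages the current
        # and the previous (window - 1) raw motor outputs, i.e., a 0.1-s window.
        kernel = np.ones(window) / window
        return np.convolve(motor_trace, kernel, mode="full")[:len(motor_trace)]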
Fig. 8.9 Roughly linear increase and decrease of pitch velocity of the head. Top: The actual output of the network. Bottom: The convolution with a rectangular causal kernel of ten steps (0.1 s), representing the average velocity implemented by the tracker. For the average velocity to increase, the metatransient must switch to attractors of different shapes across the landscape
5 Consider a ball subject to gravity, with its velocity along the vertical axis: even if the head locks its gaze to the ball, thereby keeping the input pattern constant, it must nevertheless accelerate downward.
8.4.5.2 Attractor Landscapes of Negative Feedback

How is the agent able to follow the ball despite the ball's constant change in velocity (gravity, impacts)? Insight into the mechanics of control is gained by reframing the question in terms of the tendencies of the metatransient. By plotting the projection onto motor space as a function of the set of interactions between the sensor array and the ball, we gain insight into the agent's whole action set. At this level we observe invariant features of the attractor landscape shared by all agents solving the problem: a two-dimensional negative feedback. For each output unit we plot the mean amplitude of the respective motor projection for all the states in input space, as in Fig. 8.10. The figure is constructed as follows. The coordinates of each pixel stand for a relative position of the ball in head-centered coordinates. The color of that pixel represents the average amplitude of the respective output unit (yaw or pitch) given one dynamical system (the RNN parameterized by the corresponding interaction with the ball, in head-centered coordinates). The center of the diagram (0, 0) represents the coincidence between the center of the ball and the center of the retina; that is, the relative position of the ball with respect to the retina, both in the horizontal and in the vertical coordinates, is zero (to see how the stimulus appears at the center of the coordinate system, check the last state in the input sequence in Fig. 8.4, which is about the center). To compute the action associated with every pixel, we proceed
Fig. 8.10 The attractor landscape for the totality of the possible input space. On the left, the pitch motor unit and on the right, the yaw motor unit. Every pixel is a head-centered coordinate of the ball (the icons at the corners represent the interaction between the input array and the ball). The behavior of the agent can be seen as the metatransient relating paths in parameter space (see the text for an explanation)
systematically by calculating the actions associated with the pixels along vertical scanlines. For every new input pattern, we then calculate 300 steps of that orbit, drop 200, and average the last 100.6 In Fig. 8.10 we see two projections of the same attractor landscape, one to the pitch motor and one to the yaw motor. Note that they possess different symmetries, the yaw motor having vertical symmetry and the pitch motor having horizontal symmetry. That means that the two units are "reading" different aspects of the attractor landscape. Essentially, they transform the high-dimensional attractor (dimensions given by the number of units in the network) into two behaviorally relevant dimensions. The plot in Fig. 8.10 explains control as follows. For each stimulus and for each given initial condition resulting from the interaction between the head and the ball there is a single pixel. This pixel represents a dynamical system for that parameter. Each plot shows the attractor landscape from the perspective of one motor unit. Assume that for a small number of network iterations, the input remains similar. The color of that pixel then represents roughly the action of the agent, albeit not exactly, since it is dependent on a temporal average, and the metatransient might become entrained in different states of the attractor. The simplest example goes as follows. When the ball is at the precise (0, 0) coordinate, the action of both the yaw and the pitch motors is roughly zero. Assume the ball is falling. If the agent does nothing, the sensory input will change, and the ball will be down in relation to the center of the retina [e.g., coordinate (0, −0.3)]. That new situation will remove the agent from repose and evoke a response of the pitch motor close to −1, while the yaw remains on average near zero. But since −1 is the largest velocity, the head will advance in comparison with the ball, changing the coordinate to, for example, (0, 0.3), which will evoke a small upward velocity. This process unrolls as connected paths in parameter space (stimuli), associated with attractors (motor actions). The actions of the agent result from a negative-feedback loop, where the environment provides the control signal. The agent acts and corrects towards a goal that only appears as the agent behaves. It can be said that the plot in Fig. 8.10 depicts the agent's "invariant of behavior" by showing what the velocity imprinted by the motors would be, had the input been one given input pattern, when the network is given enough time. Recalling that the ball dynamics does not permit the network to lazily settle on one final attractor, the picture is to be read as a collection of input–tendency pairs. Without enough time to settle on the attractor, the behavior is the metatransient that overlays the attractor landscape, as a function of the connected path in sensor/parameter space. The invariant of behavior is the collection of putative tendencies of the network, and only exists as it is exploited. It is also interesting to observe that the functional characteristics of the individual projections on the output dimensions (pitch and yaw) are strongly dissimilar, although both are a function of the high-dimensional space of the evolved network. Conceivably, this resulted from the impact of the asymmetries of the environment and of the
6 The internal states of the hidden layer are inherited across parameterizations.
ball's behavior: in the vertical direction, a bouncing motion driven by gravity; in the horizontal direction, a more constant velocity (contingent on the wall bounces). These landscapes varied in their details among the evolved networks, but some features were invariant whenever the behavior was present at all. In the next chapter, many examples of different agents will exemplify this similarity. That is, despite the randomization of the different evolutionary runs that produced the networks, the evolutionary path generated attractor landscapes that provide for similar function. Moreover, the general trend in evolution was one of complexification of the landscapes: increased complexity of the underlying attractor landscapes usually implied solutions that were both more robust and more resourceful (e.g., an aperiodic active search), displaying generalized hysteresis and autonomous oscillations.
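A sketch of the scan that produces a Fig. 8.10-style map, under the same assumptions as the earlier snippets (per-pixel step counts as in the text: 300 steps, first 200 dropped; `stimulus_for` is the hypothetical helper sketched in Sect. 8.3.3):

    import numpy as np

    def motor_landscape(W, stimulus_for, grid, n_settle=200, n_avg=100):
        # For each head-centered ball position (x, y), clamp the corresponding
        # parameterization and average the motor projections. The hidden state is
        # inherited across parameterizations (see footnote 6), so scan order matters.
        a = np.zeros(W.shape[0])
        yaw_map = np.zeros((len(grid), len(grid)))
        pitch_map = np.zeros_like(yaw_map)
        for j, x in enumerate(grid):           # vertical scanlines
            for i, y in enumerate(grid):
                rho = stimulus_for(x, y)
                outs = []
                for t in range(n_settle + n_avg):
                    a = W @ np.tanh(a) + rho
                    if t >= n_settle:
                        outs.append(a[9:11])
                yaw_map[i, j], pitch_map[i, j] = np.mean(outs, axis=0)
        return yaw_map, pitch_map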
8.4.6 Attractor Shapes and Action

The averaged motor projections of the attractors also indicate that two features of the attractors are responsible for the action, with neither preponderant over the other: the attractor's periodicity and its shape in phase space. Depending on the shape of the attractor, different periods might lead to similar average speeds, and conversely, equal periods and different shapes might lead to different outputs. This is clearly illustrated with the series of chaotic and quasi-periodic attractors in Fig. 8.5, calculated for very similar input patterns.7 Figure 8.5 shows the motor space projection of the simulated asymptotic orbit on the attractor, for 150 steps, with similar averaged velocities. In the simulation, though, such equality would probably not occur. As the actions during the trials happen in a very short time, the average speed might change depending on two factors: where the transient starts and for how long it approaches the attractor. That is particularly true for chaotic attractors, on which transients might become entrained at different positions. In Fig. 8.11 one sees that the attractor has a definite shape and structure. But because it is aperiodic, the average over motor output will obviously vary depending on the time window taken for the computation. Nevertheless, it will only vary within the bounds given by the structure of the attractor itself, i.e., depending on where the metatransient engages the attractor and how long it stays under its influence. The shape and periodicity of the attractor determine the average applied to the motors.
8.4.6.1 Attractor Morph

Conversely, the attractors might also change smoothly with paths in parameter space contained within bifurcation boundaries. A metatransient guided through a
7 These input patterns were obtained with the same network operating under a 500-Hz update frequency, so the differences between steps would be tiny. Note that this test would not have been possible if the network had not been robust to different update frequencies.
Fig. 8.11 The first-return map of a chaotic attractor for the yaw motor output indicates that despite the presence of structure, the putative action of the agent will change depending on where the metatransient engages the attractor
Fig. 8.12 Morphing attractors for a path in parameter space corresponding to a ball’s trajectory from bottom to top (z-axis). For each parameter in z (the relative position of the ball on the vertical axis), the orbit is computed in phase space and plotted as a z slice. The plot is a bifurcation diagram in the two perceptual dimensions, i.e., ball in horizontal and ball in vertical
morphing attractor might wind along the attractor set, changing smoothly as the attractor landscape is explored through parameter shifts. Figures 8.12 and 8.13 show a morphing attractor for a particular path in parameter space. Basically, the plot is
Fig. 8.13 Sequence of morphing attractors. Left to right: The attractors morph smoothly as the stimulus set is varied (each plot represents a ball shown from top to bottom, but with a different left–right offset; that is, the leftmost plot is the ball shown from top to bottom with a left offset, and so on to a right offset). The numbers at the top represent the offset with respect to the center
the exploded version of one vertical scanline in Fig. 8.10, where the output is not squashed into an average. The z-axis represents the parameter path, corresponding to the sensory input from the ball at the bottom of the sensor array to the ball at the top. In other words, the z-axis in the figure represents the relative position of the ball in the vertical direction, with the horizontal position kept constant. The x-axis and y-axis are the motor phase space. Each slice in z determines a dynamical system given the corresponding parameterization (stimulus). For each slice we plot 45 steps of the projection of the orbit, after discarding the first five. The plot is in fact a bifurcation diagram, in which we plot the two output dimensions as a function of the parameter, which is the relative position of the ball. The plot in Fig. 8.12 should be interpreted as follows. Start, for example, at the bottom of the z-axis with the relative ball position on the vertical axis equal to −0.7 m (the relative position on the horizontal axis is constant at −0.23 m). In that situation, with the ball having a slight offset to the left, the network starts out with a saturated response of −1 in yaw and −1 in pitch and very low amplitude oscillations. As the parameter in z is increased, meaning the ball moves up relative to the input array, the amplitude of the oscillations in phase space increases smoothly. The density and form indicate that the morphing attractor might be composed of quasi-periodic attractors. As we continue to move up the z-axis, i.e., ball up, the attractor smoothly saturates again at the maximum pitch. Because these transitions stay within bifurcation boundaries, the shape of the attractor set changes smoothly. It is also worth mentioning that this particular solution was not shared by all the networks, despite the apparent similarity in their action sets and behavior.
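A sketch of how such a bifurcation diagram along a parameter path can be assembled (same assumptions as the earlier snippets; 45 steps kept and 5 dropped per slice, as in the text):

    import numpy as np

    def bifurcation_slices(W, path, n_drop=5, n_keep=45):
        # For each parameterization along a path (e.g., the ball moving bottom to
        # top at a fixed horizontal offset), collect a short orbit projected onto
        # the motor units; stacking the slices along z gives a Fig. 8.12-style plot.
        a = np.zeros(W.shape[0])
        slices = []
        for rho in path:
            pts = []
            for t in range(n_drop + n_keep):
                a = W @ np.tanh(a) + rho
                if t >= n_drop:
                    pts.append(a[9:11].copy())    # (yaw, pitch) projection
            slices.append(np.array(pts))
        return slices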
8.4.6.2 Coexisting Attractors

The presence of coexisting attractors is observed by presenting the same stimulus with randomized initial states of the hidden layer. In theory, two identical parameterizations possess the same attractor structure, because they resolve the same dynamical system. The fact that two initial conditions lead to
Fig. 8.14 For each of the stimulus patterns, the motor output orbit is plotted in phase space (z-axis represents the iteration cycle). Although five different attractors appear, they all lead to very similar actions (compare the velocity vectors of the third row)
different attractors means that there are at least two attractors, each belonging to a distinct basin of attraction, possibly separated by saddle nodes. When different orbits are observed with the input pattern clamped constant (a single dynamical system subject to one parameterization D_{ρ_pattern}), one may conclude that there are coexisting attractors for the dynamical system determined by that parameterization. Coexisting attractors might play similar or different functional roles, as we will see. In Fig. 8.8 one sees that the network might produce distinct outputs when given one and the same stimulus pattern with a randomized initial condition of the hidden layer. This indicates the coexistence of attractors with different functional roles, since they produce different average motor outputs (the arrows point in different directions). Compare this now with Fig. 8.14, which shows a plot of the actual attractors elicited during the trials (the activities of the hidden layer were logged, allowing off-line recomputation of the attractors visited during the trials). The stimulus was the same for all the plots, but the initial condition of the hidden layer was taken from the trials. The z-axis is the iteration step of the network, so one can see the time series of the motor projection. Notice that although the shape of the attractor varies noticeably, the average outputs of the attractors (given by the arrow plot beneath) are in this case very similar. This indicates distinct coexisting attractors which nevertheless play similar functional roles. Because the coexisting attractors are accessed in accordance with the internal state of the hidden layer, context dependency is implied. The system is sensitive to history, implying that it has a sort of transient memory. Transient memory is an issue in current cognitive dynamics research, as in [5], for example. The fact that there are coexisting attractors implies that there are multiple coexisting stable states, and so different actions can be elicited by one and the same stimulus, contingent on history.
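A sketch of this probe, under the same assumptions as the earlier snippets (the number of restarts and the window lengths are illustrative):

    import numpy as np

    def coexisting_attractors(W, rho, n_trials=20, n_settle=300, n_avg=100, seed=0):
        # Clamp one stimulus (one parameterization, hence one dynamical system) and
        # iterate from randomized hidden-layer states. If the averaged motor outputs
        # form distinct clusters, at least that many attractors coexist (cf. Figs. 8.8, 8.14).
        rng = np.random.default_rng(seed)
        tendencies = []
        for _ in range(n_trials):
            a = rng.uniform(-1.0, 1.0, size=W.shape[0])   # randomized initial condition
            outs = []
            for t in range(n_settle + n_avg):
                a = W @ np.tanh(a) + rho
                if t >= n_settle:
                    outs.append(a[9:11])
            tendencies.append(np.mean(outs, axis=0))
        return np.array(tendencies)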
So, the history of interactions between the agent's networks and the environment leads to different instantiations of the metatransient, which, although subserved by an invariant (the unique attractor landscape given by the structure), lead to variation and divergence in behavior. In each instantiation of the behavior, the actions themselves are different. But the behavioral function, tracking, is the same every time.
8.5 Discussion

8.5.1 Structure–Attractor Landscape–Function

Instead of asking the two-tiered question of how function derives from structure, I broke the question into three components and pointed to the attractor landscape as the mediating entity between structure and function. Thus, (1) a given network structure renders (2) an attractor landscape, which in turn (3) is exploited in the interaction of a body with the environment to produce a function. The decomposition is justified because it opens up the black box containing the causal underpinning of the network's function, and relates that to the attractor landscape as an explanatory middleman. The attractor landscape is the set of all attractors of a neural module, resolved by one specific weight matrix and the set of possible inputs (parameters) to the network. In that, one attractor landscape stands for all the potentials for behavior of an embedded network, embodying the behavioral invariant of one agent. Invariance appears in the features of an attractor landscape depicted from a particular perspective. Taking the perceptual space of the agent and plotting the average force applied to the motors, we find a picture which is grossly invariant across agents solving the problem. However, this picture does not exist irrespective of the transformations performed, and they are an integral part of the explanation. Our understanding perforce remains confined within the bounds of the function assumed, and of the transformations with respect to which something remains invariant. This view has a number of implications, many of which are primarily of epistemological interest. Invariants are about the knowledge they bring with them. The discussion to follow is primarily concerned with the ways in which invariances help us understand the mechanistic conception of functional behavior.
8.5.2 Invariants of Behavior

To answer "What is the invariant of a behavior?" a number of prior commitments are necessary. One has to say (1) what the behavior is, (2) what is to be measured, (3) in relation to what, and (4) where. Then, one performs operations on the acquired data to reveal the transformations that keep the invariant so. If these transformations exist and are meaningful, the invariant will hold with respect to all four aspects in the context of the problem itself.
Then, the important question becomes: "How do invariants inform about mechanism?" To reveal an invariant, measurements undergo a number of transformations. Data is organized so as to be better visualized and understood. The transformations are as important as the invariants they beget. In the present experiment, to get to the invariant I assumed, for example, the perceptual dimensions "ball in vertical" and "ball in horizontal." Then, I performed a transformation on the data, averaging the attractor projections for a number of steps. Although in truth neither of these assumptions strictly corresponds to the situation during a trial, there is enough correspondence to be representative of a putative ball–agent encounter. In that, it also led to the discovery that the mechanism behind the function was a negative feedback in both motor dimensions. The invariant features of activity were revealed through assumed transformations. In the study of brain function things work similarly. So, to see that a particular firing rate is invariant with a place, we must make a map of places and overlay the average activity of neurons on top. If the cell is more active in one place than in another, then we have a place invariant. The transformation "averaging the activity of a neuron" informs us that the "knowing where" mechanism can rely on the firing rate. Jennifer Aniston cells, likewise, are a connection between Jennifer Aniston and a certain firing rate that distinguishes Jennifer Aniston from other people. Now, the distinction is crucially dependent on our question, and on how we transform the measurables to answer it. Although we may find a cell that is only active when it sees Jennifer Aniston (and silent when it does not), this does not inform us whether there is a faces dimension where Jennifer stands alone (what does the cell do when introduced to a closely resembling doppelganger of Jenny?). Nevertheless, the knowledge that there is such a cell informs about a class of possible mechanisms. This result may, for example, be taken to resemble the behavior of a feedforward network trained with a backpropagation algorithm. Here too, the transformations revealing the invariant (a set of faces) give clues to the operating mechanisms. However, with such a narrow question posing, the information about mechanism is not very revealing. The transformations performed are dependent on the particular experimental questions, the method, the data, and the functions studied. The invariant may be flimsy, and it may be very dependent on the transformations assumed. Perhaps Jennifer-invariance disappears under some transformations (as with an old Jenny). Those disappearances also tip us off about what may be the currency that the brain employs for a particular function ⟨recognizing people⟩. So, if Jennifer disappears when we perform an autocorrelation of the neuron's activity, then we may infer that the mechanism does not depend on particular spike times. Generally, the more robust the invariant, that is, the more transformations the data can undergo and still closely represent the function, the safer it is to assume that something "is really there." Any new invariant revealed uncovers an aspect of the mechanism: something the brain can rely on to perform a given function. In this chapter, the definition of a behavioral function ⟨tracking⟩ induced a method (and a level) to search for invariance. The method related "perceptual
dimensions" and the "motor dimensions." The invariant features of the attractor landscape, therefore, derive directly from the assumptions on function, morphology, and level. Once the function to be scrutinized has been identified, methods to find invariances are devised. To explain function, first we assume function; this circularity is both unavoidable and necessary. What is invariant with respect to function is about the problem itself. Hence, the invariant found has aboutness. It is about the problem, defined by the agent, by the environment, and by the fitness function. We have found that the attractor landscape is invariant to a wide class of transformations, including:

- Network structure: transfer functions, number of synapses, number of neurons, continuous or discrete RNNs
- Attractor landscape: types of attractors, amplitude of attractors, loci of bifurcations, basins of attraction
- Evolutionary parameters: addition of synapses and neurons, rate of change of synapses, rate of change of biases

I have probably overlooked others. Now three remarks are in order:
1. A behavioral function may determine an invariant, but the invariant does not determine extensions of function. So, if we find place cells in the hippocampus, do we also learn about the existence of grandmother cells? Unlikely. The question is how these two cell-function types are resolved in the same structure, and why. Brain structures are multifunctional in the very tautological sense that they do all that they do (as argued in Sect. 3.2.5). The task is to find the mechanism that has all these functions as entailments (potentialities), and not to find a mechanism for each of the functions.
2. Matter matters. There will be different entailments of different implementations, in terms of further functionality (what other function can be adjoined), in terms of energy efficiency (oscillations may be costly), or in terms of viability. A simulation is free to change structure in ways an organism is not, and conversely. By the same token, an organism has possibilities that are inaccessible to our most detailed simulations. I submit that research on the entailments of structures is the most prolific trail for future structure–function research.
3. It is unassailable that we will find (or not find) invariants when we look for correlates of function. But the guiding question should always concern the transformations imposed on the data for the correlations to be revealed. This has two roles. First, transformations give information about what may have been eclipsed in our data-forming process. Second, transformations inform us about possible equivalences with respect to the behavioral function. In my case, by looking at the invariant by averaging the attractors, I consciously overlook fine temporal details of their profiles. This is only warranted given my particular problem and may not be the case generally. Concerning the second role, that of equivalences, through my transformations I was able to discover that attractors (and attractor landscapes) are equivalent with respect to action. We comprehend as a
bonus that the transformations in the previous list do have bounds where they work. Their bounds are defined by the organism’s viability, as seen from a fitness function. In brain science at large, knowledge of a particular invariant to a function is only relevant inasmuch as possible mechanisms are therewith uncovered. The uncovering of mechanism crucially requires knowledge about the whole system and where the invariant stands in relation to the organism and its problems. There is no understanding of mechanism without comprehension of the relational structure of behavioral functions.
8.5.3 Explanations of Functional Behavior: Negative Feedback

We know nothing accurately in reality, but only as it changes according to the bodily condition, and the constitution of those things that flow upon the body and impinge upon it. It will be obvious that it is impossible to understand how in reality each thing is.

C.C.W. Taylor, The Atomists: Leucippus and Democritus: Fragments
The embedded tradition, going as far back as Democritus, regards the organism as both a part of its world and as an intermediate to it. Between itself and the world, wherein it is an active participant, the organism plays a double role, that of receptor and that of actor, where what it does is a function of what it perceives, where it is, and what it is. This tradition can be exemplified by a collection of sensorimotor diagrams, each of which sees the organism exchanging with its environment through interfaces (Fig. 8.15). Behavior in toto is the sensorimotor loop: an ongoing sequence of nontrivial reafferent loops [13] carrying the state of the body, of the control structures, and of the environment, as coupled dynamical systems, into the future. Functional behavior is different. It is the product of outlining a goal that a set of actions expresses. Obviously, functional behavior is a teleological definition, for it assumes a goal, that which the functional behavior achieves. Scientific attempts purporting to avoid teleology are prepotent rather than effective; witness the shortcomings of behaviorism. Thus, a bacterium feeds, forages, digests, and avoids, as an elephant does. In both cases, an explanation for a functional behavior takes in the assumption of function, and then procures the mechanism of the said function. Then, to be sure, the explanations for the behavioral functions of the elephant and the bacterium are different. Mechanisms are different, albeit prone to analogies; hence, the words for functions are shared. A weak form of teleology is thus inherent to all mechanistic explanations of behavior. An explanation of ongoing behavior is given with the mechanisms for component behavioral functions. A mechanistic explanation of function attempts to match
Fig. 8.15 Agent–environment diagrams. (From [1, 2, 7, 10, 32, 34])
assumed behavioral functions with mechanisms. For that, the explanation selects the relevant levels for the assumed function. In this case, we found the relevant level in the attractor landscape, for it lodges invariant features with respect to equivalences in the activity of networks and in network structures. It is a holistic description that considers the ongoingness of the loop as the context for multiple mechanisms, for multiple functional behaviors.
8.5.3.1 Negative Feedback is a Primordial Function

Negative feedback appears as the loop between the environment and the organism that drives the organism (Fig. 8.16). From bacteria following a gradient, to a sculptor removing that which is not a horse, to a fly following its mate, negative feedback is an abstract and general mechanism, based on act-and-correct protocols, expressing goal directedness. It is prophetic that cybernetics, before it was christened by Norbert Wiener, was referred to by von Foerster [35] as the study of "circular-causal mechanisms in biological and social systems." Negative feedback is a paradigmatic example of a circular-causal mechanism, and tracking is a prototype thereof. So, our mechanistic explanation of the component behavior of tracking, in terms of a negative-feedback mechanism, is as follows. To every point of a path in parameter space and to every state of the agent (hidden layer and body), there corresponds a point in the attractor landscape, whose mapping to an action causes the next state of
Fig. 8.16 The agent’s body is propelled by the activity of its neural structures, which it also carries. The environment is loosely coupled to the agent, represented by the gap with squiggly arrows between the agent and the environment. Movements of the agent change its relations to the environment, symmetrically. The sensory system takes these changes in through its transducers, and recontextualizes them in terms of the ongoing activity of the nervous system (dashed cycle). The fading spiral represents the fading influence of the stimulus (inspired by Varela)
the path in parameter space, closing the loop in a sequence of linked actions.8 This explanation is appropriate, but does not exhaust the phenomenon, the tracking ability. Other subfunctions could be named and mechanistically explained, with more or less success. But whatever the case may be, identification of function prefigures explanations.
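A minimal closed-loop sketch of this circular-causal description (the environment hooks `sense` and `act` are hypothetical placeholders, not part of the simulation described in the text):

    import numpy as np

    def closed_loop(W, sense, act, env, n_steps=1000):
        # The agent's state and the environment jointly resolve the next
        # parameterization (structural coupling, cf. (8.1)); its motor reading
        # is fed back as an action, which changes the agent-environment relations.
        a = np.zeros(W.shape[0])
        for _ in range(n_steps):
            rho = sense(env)               # <E . a>_t  leads to  rho_t
            a = W @ np.tanh(a) + rho       # D_{rho_t}(s_t) -> s_{t+1}
            env = act(env, a[9:11])        # motor efference closes the loop
        return env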
8.5.3.2 Explanations of Mechanism

In nature, examples of control behavior based on reafference and negative feedback abound. Of those, a great number are in direct analogy to our tracker agent. A male fly, in a partner-finding flight, must keep a small black dot (the partner) in sight, controlling the relative position of the retinal spot where the dot is found by modulating the force on its wings. The same fly, as well as many other insects, exhibits similar control behavior with regard to optic flow, adjusting its orientation relative to the speed of the flow [9]. The same compensatory behavior happens with light and polarized light [14]. Although this is a coarse analogy, I claim that these behaviors exploit attractor landscapes similar to the ones found here. Whatever the network architectures are, given the function they execute, there must be an equivalent mechanism. Frogs strike prey guided by visual stimuli in a way that admits negative feedback as a description. The archerfish has the remarkable behavior of
8 Note that this reflexive depiction of behavior need not mean it is stereotypical. In The Structure of Behavior, Merleau-Ponty [22] makes a case for the inadequacy of a physiological theory of behavior based on reflexes, since a quantitative change of the stimulus induces a qualitative change in behavior. But as we have seen in our case, even a small quantitative difference in stimulus may invoke a qualitatively different reaction, not bijective with the stimulus.
striking insects by spitting water at them, and swiftly swimming to the falling insect's expected position. If there are mismatches between the insect's position and that of the fish, the fish performs on-the-fly corrections of movement. The behavior can be described as negative feedback. When contrasted with knowledge-based explanations, those based on negative feedback are more parsimonious. For instance, the archerfish's behavior has been described as requiring knowledge of ballistics, and of gravity, to be able to catch the falling insect [26]. The description is compelling, because the observed behavior is indeed sophisticated. However, to impute to the fish knowledge of the laws of ballistics risks overrationalizing what happens. If we choose to describe the behavior in the language of neurodynamics, we can imagine the stimulus (the position of the insect in the visual field, and parallax) as a parameter for an attractor landscape whose motor consequences induce both the aiming and the swimming (the attractor landscapes can be ontogenetically endowed, and perfected through plasticity and rewards). To explain aiming behavior and catching behavior, one needs only to find the motor consequences of input stimuli and context. The assumed knowledge of ballistics can be reduced to negative-feedback-like attractor landscapes, without demanding that the fish perform any complex computations. If I am permitted the analogy, the ability of the archerfish to perform its fascinating behavior comes down to having structures affording the right kind of negative feedback.
8.5.4 Behavioral Function Demands a Holistic Description

We now come to the last point about the epistemology of function. The negative feedback underlying tracking appears as the agent engages the environment in interaction, and is only meaningful in that context. The agent acts on a proper stimulus, one affording a useful distinction (i.e., a direction), expressed in interaction. In tracking, distinctions appear in time, as the sequences of inputs (paths in parameter space) are associated with corrective action. Although the potential for tracking is given in the network structure, it only exists when all the rest exists. That is why, to find the attractor landscape, we are committed to assumptions about the kinds of "meaningful stimuli" – i.e., stimuli that resemble those employed by the organism (agent) in a behaviorally meaningful situation. Hence, we must assume, for instance, the shape of the stimulus (a ball), and we must also assume that paths in parameter space are the principal components of relative movement of the stimulus. An analysis without reference to the structure of the stimulus or to the properties of motor projections would have been but an exercise in abstraction. Negative feedback as a function emerges holistically. One has to look at the whole input space to see that "ball to the left" invokes "move to the left" and so on. It is meaningless to ask how the activity of a randomly selected neuron correlates with the action. One has to choose a neuron close to the action, in our case the motor neuron. Other neurons will correlate more or less with the action evoked.
An analysis taking in the whole input space is therefore more appropriate. This is possible in a system as simple as ours, where all activities can be observed simultaneously. But it is blatantly not easy in complex systems, where it is illusory to take in the whole input space for the analysis. To circumvent that, one has to make assumptions about which dimensions of the stimulus are the "more independent" ones, with respect to the function outlined. As a consequence, one leaves out a portion of the phenomenon, which may be more or less relevant. The more independent a system appears, the more clearly the variables participating in it can be defined, and the easier it will be to find a mechanism. It is thus that attempts to modularize the brain for explanation arise. Modularization for explanation is as powerful a heuristic as it is perilous, for it severs relations and occludes dependences. Function demands holistic descriptions, and modules that were severed for explanation must be reintegrated – connections reestablished – if the explanatory summit is to be achieved.
8.6 Linking Section: Convergent Landscapes

The artificial evolution of neural structures selected by a fitness function produced a vast assortment of networks, with various configurations, where the number of units, connection matrices, and weight distributions varied widely. Yet those that exhibited the prescribed function all had similar behavior. From the experiment described in this chapter, we observed that every successful agent was able to track a sphere reliably, and in some cases optimally, never losing track of the object. This led us to the question: Is there a level of commonality that explains the equivalences in behavior? What allowed different networks to solve the problem similarly? The answer was to be found in the dynamics, but not quite as initially expected.
8.6.1 Direct Association Between Dynamics and Behavior

In previous work done in our laboratory, we obtained interesting results concerning the direct association of particular dynamical entities with particular behaviors. In many instances it was possible to find different dynamical regions that would underlie different behaviors. For example, in his Ph.D. thesis, Hülse [15] evolved controllers for a robot that would follow light sources while avoiding walls. Hülse was able to associate different parameter domains with qualitatively distinct dynamics, and in turn these distinct dynamical regions with particular behaviors. When the robot turned, turning behavior was associated (for example) with a period-p attractor. Avoiding walls in sharp corners was related to a parameter domain where attractors coexisted, the so-called dynamical hysteresis phenomenon. In a more sophisticated example of a robot able to manage its energy resources, Hülse was also able to associate dynamical entities with different behaviors. The
experiment, concisely, goes as follows. An autonomous robot is given an energy reservoir, which is replenished by staying close to a light source. Without the light, the energy in the reservoir decays linearly with time. The task of the robot was to maximize exploration while not running out of energy (in which case it "died"). In this experiment, the behavior of successful robots was explained through analysis of the internal dynamics of the agent. Taking the amount of energy in the reservoir as a parameter, it was possible to identify regions of distinct dynamics, and to relate the different regions to different behaviors. When the energy reservoir was full, the robot would go exploring (walking around avoiding obstacles), and this behavior was associated with a fixed-point attractor. When the energy reservoir was in need of replenishment, the internal dynamics would enter a region of periodic attractors, which would cause the robot to stand in front of the light (see Fig. 8.17). Hence, behaviors could be explained by a simple mapping onto different types of dynamics. Similar results were obtained with the same paradigm, where dynamical entities were associated with particular behaviors (e.g., generalized hysteresis was associated with ⟨escaping sharp corners⟩, as in [16]). Guided by these promising results, my initial approach was to search for the different dynamical regions, and to see what kinds of attractors and basins the networks had. I took the approach of analyzing the network's dynamics with reconstituted stimuli that the agent might encounter while behaving. I attempted to find associations between behaviors and specific qualitative dynamics of certain regions of parameter space.
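The association "parameter domain, qualitative dynamics, behavior" can be illustrated with the smallest possible ingredient: a single discrete-time neuron with a strong negative self-connection, in which a clamped input theta (standing in, loosely, for the energy level) selects between a period-2 attractor and a fixed point. The model, the numbers, and the behavior labels are illustrative assumptions, not Hülse's controllers.

import numpy as np

# One neuron with self-connection w < -1: o(t+1) = tanh(w*o(t) + theta).
# For small theta the only attractor is a period-2 oscillation; for large
# theta the fixed point in the saturated region becomes stable.
def attractor_type(theta, w=-3.0, transient=500, tol=1e-6):
    o = 0.1
    for _ in range(transient):
        o = np.tanh(w * o + theta)
    o1 = np.tanh(w * o + theta)
    o2 = np.tanh(w * o1 + theta)
    if abs(o1 - o) < tol:
        return "fixed point"
    if abs(o2 - o) < tol:
        return "period-2"
    return "other"

labels = {"period-2": "oscillate (e.g., stay near the light)",
          "fixed point": "settle into steady action (e.g., explore)"}
for theta in [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]:
    kind = attractor_type(theta)
    print(f"theta = {theta:3.1f}: {kind:11s} -> {labels.get(kind, 'unclassified')}")

For small theta the map oscillates between two saturated values; for sufficiently large theta the saturated fixed point becomes stable. A single scalar parameter thus carves the parameter space into qualitatively distinct dynamical regions, each of which can be put in correspondence with a behavior.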
Fig. 8.17 Dynamical schemata for behavior. An example of the association between different kinds of attractors and outlined behaviors of one neural network controller. (From [15])
As we saw in the present chapter, however, the opposite was the case. Instead of a one-to-one association from dynamics to behaviors, what I found were many dynamical entities underlying a behavior. ⟨Looking left⟩, for example, was associated with attractors of many different periods, as well as chaotic attractors, and even, on occasion, fixed-point attractors, all seemingly equivalent with respect to function. For function, it was not the class of dynamics, i.e., the type of attractor, that mattered. Rather, it was the action that those dynamics evoked, and how the behavior appeared in a constant shift of parameters and respective actions. Although the dynamical entities of the different agents did not resemble each other, the behaviors of the agents did. So we are compelled to conclude that it is not one particular attractor dynamics that leads to one behavior. Rather, it is a whole class of dynamical behavior that will satisfy the evolutionary constraints of viability, as posed by the problem. There is a large space of variability of structures where constancy of behavior may be found. This topic is explored in the next chapter, in the context of convergent evolution.
Appendix I: Learning as Deforming Attractor Landscapes

Neurodynamics defines learning as changes of the network during behavior, that is, variation of weights according to synaptic plasticity rules. The networks here were static; plasticity was not involved. This simplification, which permits simpler off-line analysis, has the downside that all of the landscapes have a rigid structure. In spite of that, active tracking as defined here is a simple problem, one that can be solved with evolution alone. The selection process can be seen as operating on attractor landscapes, selecting those with the most useful action set. In living beings, the ecological problems go far beyond the isolated function of tracking. One of the essentials is learning. But thinking about the dynamical substrates of the embodied RNN provides a new facet to conceptualizing learning. We might say that evolution begets initial parameter sets (instinctive responses), whereas learning modulates them to extend behavioral breadth [27]. Stretching this analogy to the development of central nervous systems, it could be said that the initial structure has to carry those sets of conditions that allow for the lodging of the "learnable" attractors, those which the organism is able to learn. This provides a neurodynamic interpretation of the Baldwin effect in terms of the "learning capacity" of the underlying networks. Selection will benefit those individuals endowed with initial structures that are more malleable and permeable to useful attractor landscapes.
Appendix II: Related Work

There has been quite some work carried out in the last 15 years on the topic of the dynamical systems analysis of evolved agents, pioneered by articles by Pasemann [24], Beer [7], Tani [28], and others. My contribution stands on the shoulders of these seminal works, and presents a novel toy problem of pan–tilt tracking, with high environment–sensorimotor dynamics and an analysis of behavior based on evolved (rather than learned, as in [17]) attractor landscapes, a concept that here gains a little more substance than in its usual metaphorical usage [3]. With reference to the concepts of the metatransient and attractor landscapes, a related concept is worthy of mention: chaotic itinerancy, advanced by Ikeda, Kaneko, Tsuda, and others. It has made a deep impact not only on cognition and brain dynamics [30, 31], but also much more broadly [21], formally [20], conceptually [30], and metaphorically. An excellent review of the applications of the concept is presented in [21]. This must be mentioned also because of the important differences that appear when the approaches are contrasted. Mercilessly squeezing the concept into a couple of sentences, chaotic itinerancy is the idea that a chaotic attractor spanning the whole of a high-dimensional phase space of a dynamical system (also neural networks) collapses, at times, onto limit cycles of much lower dimension, called attractor ruins (in contrast to here, no projection is necessary, since the dimensionality reduction is achieved by the shape of the attractor itself). The paths connecting the lower-dimensional attractor ruins are itinerant (chaotic) owing to crossings of unstable manifolds, thus the denomination "chaotic itinerancy." Although the concepts of chaotic itinerancy and metatransients can be made analogous, they differ in important ways. First, the systems we describe are parameterized by the input; therefore, we deal with a collection of dynamical systems, instead of a single one with many dimensions. So, in the case of itinerancy, we talk about one orbit that explores the complexity of basins of attraction of one dynamical system of high dimension, whereas here we talk about the complexity of the attractor landscape, which is the concoction of all basins of attraction of all dynamical systems in an RNN accessible by parameterizations. Second, in chaotic itinerancy the basin crossing of an orbit is usually due to noise and unstable manifolds, whereas here the metatransient crosses dynamical systems owing to parameterizations of the system, which are a function of the structural coupling between the agent and the environment. So, the crucial aspect of this difference is parameterization by structural coupling. Tani and Ikegami [18] touched the core of this issue in the title of their response article "Chaotic itinerancy needs embodied cognition." Finally, because our functional states are efferent projections, the actual dimensionality of the attractor in higher dimensions is not crucial for behavior: there is a many-to-one mapping of different attractors to actions. It is assumedly indifferent whether the period of the attractor is low or high, as long as the motor projection gets the job done.
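Schematically (as shorthand only, not the notation of the cited authors), the first difference can be stated compactly. Chaotic itinerancy concerns a single autonomous system,

    x_{t+1} = F(x_t),

whose orbit visits low-dimensional attractor ruins inside one high-dimensional phase space, whereas the metatransient runs through a family of systems indexed by the momentary input,

    x_{t+1} = F(x_t; \rho_t),    \rho_t = g(\text{agent--environment coupling at time } t),

so that, as \rho_t changes through structural coupling, the orbit crosses the basins of different members of the family.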
References

1. Arbib MA (1972) The metaphorical brain, an introduction to cybernetics and brain theory. MIT, Cambridge, MA
2. Ashby W (1960) Design for a brain: The origin of adaptive behavior, 2nd edn. Chapman & Hall, London
3. Banerjee A (2001) The roles played by external input and synaptic modulations in the dynamics of neuronal systems. Behav Brain Sci 24(5):811–812
4. Barandiaran X, Moreno A (2006) On what makes certain dynamical systems cognitive: A minimally cognitive organization program. Adapt Behav 14(2):171–185. DOI 10.1177/105971230601400208
5. Beer R (2009) Beyond control: The dynamics of brain-body-environment interaction in motor systems. In: Sternad D (ed) Progress in motor control V: A multidisciplinary perspective. Springer, New York
6. Beer R, Gallagher J (1992) Evolving dynamical neural networks for adaptive behavior. Adapt Behav 1(1):91–122
7. Beer RD (1995) A dynamical systems perspective on agent-environment interaction. Artif Intell 72:173–215
8. Berry H, Quoy M (2006) Structure and dynamics of random recurrent neural networks. Adapt Behav 14(2):129–137. DOI 10.1177/105971230601400204
9. Boeddeker N, Egelhaaf M (2005) A single control system for smooth and saccade-like pursuit in blowflies. J Exp Biol 208:1563–1572
10. Edelman GM (1989) The remembered present. Basic Books, New York
11. Freeman W (2000) Mesoscopic neurodynamics: From neuron to brain. J Physiol Paris 94(5–6):303–322
12. Harvey I, Di Paolo E, Wood R, Quinn M, Tuci E (2005) Evolutionary robotics: A new scientific tool for studying cognition. Artif Life 11(1–2):79–98
13. von Holst E, Mittelstaedt H (1950) Das Reafferenzprinzip. Die Naturwissenschaften 37(20):464–476
14. Homberg U, Paech A (2002) Ultrastructure and orientation of ommatidia in the dorsal rim area of the locust compound eye. Arthropod Struct Dev 30(4):271–280
15. Hülse M (2006) Multifunktionalität rekurrenter neuronaler Netze – Synthese und Analyse nichtlinearer Kontrolle autonomer Roboter. PhD thesis, Universität Osnabrück
16. Hülse M, Ghazi-Zahedi K, Pasemann F (2002) Dynamical neural Schmitt trigger for robot control. In: Dorronsoro JR (ed) ICANN 2002, LNCS 2415. Springer, pp 783–788
17. Ijspeert AJ, Nakanishi J, Schaal S (2003) Learning attractor landscapes for learning motor primitives. In: Advances in neural information processing systems. MIT, Cambridge, MA
18. Ikegami T, Tani J (2002) Chaotic itinerancy needs embodied cognition to explain memory dynamics. Behav Brain Sci 24(5):818–819
19. Jaeger H, Maass W, Markram H (2007) Special issue: Echo state networks and liquid state machines. Neural Netw 20(3):290–297
20. Kaneko K (1990) Clustering, coding, switching, hierarchical ordering, and control in network of chaotic elements. Physica D 41(37)
21. Kaneko K, Tsuda I (2003) Chaotic itinerancy. Chaos 13(3):926–936
22. Merleau-Ponty M (1942; translation 1963) The structure of behavior. Duquesne University Press, Philadelphia, PA
23. Molter C, Salihoglu U, Bersini H (2007) The road to chaos by time-asymmetric Hebbian learning in recurrent neural networks. Neural Comput 19:80–110
24. Pasemann F (1993) Discrete dynamics of two neuron networks. Open Syst Inf Dyn 2(1):49–66
25. Philipona D, O'Regan J, Nadal J, Coenen OM (2004) Perception of the structure of the physical world using unknown multimodal sensors and effectors. Adv Neural Inf Process Syst 16:945–952
26. Rossel S, Corlija J, Schuster S (2002) Predicting three-dimensional target motion: How archer fish determine where to catch their dislodged prey. J Exp Biol 205(21):3321–3326
27. Sterelny K (2005) Thought in a hostile world. MIT, Cambridge, MA
28. Tani J (1998) An interpretation of the 'self' from the dynamical systems perspective: A constructivist approach. J Conscious Stud 5(5–6):516–542
29. Taylor C (1999) The atomists Leucippus and Democritus: Fragments: A text and translation. University of Toronto Press, Toronto
30. Tsuda I (1991) Chaotic itinerancy as a dynamical basis of hermeneutics in brain and mind. In: Microcomputers and attention. Manchester University Press, Manchester
31. Tsuda I (2001) Toward an interpretation of dynamic neural activity in terms of chaotic dynamical systems. Behav Brain Sci 24:793–847
32. von Uexküll J (1934) Bedeutungslehre / Streifzüge durch die Umwelten von Tieren und Menschen, 1956 edn. Rowohlt, Hamburg
33. Varela F (1979) Principles of biological autonomy. North Holland, New York
34. Varela F, Maturana H, Uribe R (1974) Autopoiesis: The organization of living systems, its characterization and a model. Curr Mod Biol 5(4):187–196
35. Von Foerster H (2003) Understanding understanding: Essays on cybernetics and cognition. Springer, New York
Chapter 9
Convergent Evolution of Behavioral Function
Abstract The invention of behavioral function is often a punctuated event, whereas the development of function is a gradual process. The chapter includes examples from the evolutionary robotics tracking experiment introduced in the previous chapter and shows how both gradual and discontinuous improvement relate to the discovery of function. Attractor landscapes are a conceptual tool showing how invariants appear as agents converge to function. In nature, we find analogous behavioral functions across phyla and taxa, realized by analogous solutions. Analogies in the form and behavior of organisms derive from the ideal implementations of function to which evolution may converge. Convergent evolution towards behavioral function underlies the appearance of instincts and of analogous behavior in different organisms. Convergence and divergence also collaborate to resolve an old controversy about punctuated equilibria. In the interplay between the two, an answer can be given to Stephen Jay Gould's question of what would happen if the evolutionary tape were replayed. The chapter ends with a list of the sources of constancy and variability in behavior.
9.1 Convergent Evolution

9.1.1 Outlook

In this chapter we discuss the evolution of functional behavior. Departing from general considerations about punctuated equilibria and neutrality, we ask the following question: Does evolution converge, and if so, to what does it converge, and why? Analogous morphological structures and functional behaviors across many phyla suggest that some evolutionary processes do converge towards particular functional mechanisms. Operating on components whose recombination supports a wealth of behaviorally relevant dynamics, evolution has molded the brain as the ultimate multifunctional survival tool. But how much of its structure is necessary, and how much is contingent? In other words, what must be invariant in brains for the organism to exhibit the same behavioral functions?
Evolution builds on the resolution of apparent dichotomies. It merges divergence and convergence, gradual and abrupt improvement, variability in structures and constancy in function, to show a picture of both randomness and necessity. After a preliminary discussion, I illustrate these confluent duals with a set of small experiments:

1. Improvements of function may be gradual and may be abrupt. This is shown in the time course of fitness for variations on the tracking experiment, including a three-dimensional tracking problem (Sinai) and a mobile holonomic ball-following robot (R2) (Sects. 9.3.3.1 and 9.3.3.3).

2. Steady mutations cause forms to diverge. Nonetheless, the accumulation of mutations tends to increase resilience to some modifications. As the organism accumulates changes, it also accumulates constraints. I discuss the significance of this, resorting to an illustrative microexperiment (Sect. 9.2.4.1).

3. The emergence of functional behavior is not dependent on any particular network structure, nor on any particular dynamics, but on the network's attractor landscape, seen from the perspective of the function it subserves. I present comparative results of evolved networks solving variations of the tracking problem presented earlier, and their attractor landscapes. Although structures vary strongly, behavioral function is comparable, which maps back to their attractor landscapes seen from a selected level, where invariance with respect to function appears (Sect. 9.3).
9.2 Preliminaries

9.2.1 Is Evolution Gradual or Punctuated?

Darwin originally conceived of evolution as a gradual process of incremental modifications. In the foreword of the sixth edition of On the Origin of Species, he comments on the discussion that ensued since the first edition of the same book. One of the works Darwin cites is the Vestiges of Creation, tenth edition, by an anonymous writer (1853).

The proposition determined after much consideration is, that the several series of animated beings, from the simplest and oldest up to the highest and most recent, are . . . the results . . . of an impulse which has been imparted to the forms of life, advancing them, in definite times, by generation, through grades of organization . . . (emphasis added.)
Darwin proceeds to point out that: The author apparently believes that organization progresses by sudden leaps, but that the effects produced by the conditions of life are gradual.
Subsequently, Darwin confesses that: I cannot see how the two supposed “impulses” account in a scientific sense for the numerous and beautiful co-adaptations which we see throughout nature; I cannot see that we thus gain any insight how, for instance, a woodpecker has become adapted to its peculiar habits of Life . . . .
Thus, Darwin was left with a puzzle. To him it was a conundrum that this gradual mutation process was not represented in the fossil record. Looking at the paleontological findings of the time, species seemed to occur within determined intervals with little or no change, and not, as he expected, as a continuous gradient of ever-changing body forms. This mystery lingered even after the advancement of paleontology (carbon dating) and despite the discovery of many fossils since. The missing links were a thorn in Darwin's side. Steps towards the resolution of this dilemma came with the suggestion that evolution did not simply change organisms gradually, but did so discontinuously, in jumps, occasioned by the discovery of niches by isolated subpopulations. The evolution of isolated populations was called "peripatric speciation" by Mayr [9] in 1954 and "punctuated equilibria" by Eldredge and Gould [3]. Their idea is that populations evolving in isolation (such as on an island, e.g., Madagascar) would swiftly adapt (in evolutionary time) to new niches. Theirs is a plausible and powerful argument. But I contend that isolation of evolving populations is not a sufficient explanation for Darwin's missing-links riddle. Missing links appear because the invention of behavioral function is often a punctuated event. This momentary event would not easily appear in the fossil record. Given the extended timescales in which paleontology operates, it becomes difficult – as it is – to find missing links, because the time window of this event is a needle in a haystack of evolutionary time.
9.2.2 The Moment of Invention of Function

An example illustrates the point. The youngsters of some species of gecko are able to run on the surface of water (and for this ability have been appropriately called the Jesus gecko). The young do so to avoid being cannibalized by their parents. The parents cannot run over water, given physical constraints (e.g., their weight-to-foot-area ratio, the maximum speed of the legs). The appearance of the function ⟨running over water⟩ occasioned a stepwise advantage, whereby individuals with the trait could survive their parents' appetite, therewith having the potential of forming new colonies and thriving in numbers, carrying the acquired function along. This story can be recounted in many ways, in which a stepwise advantage begets a considerable gain and, therefore, a massive increase in viability. Function is, in a sense, invented [4] – and the moment of invention is a punctuated event (Fig. 9.1). Once the novel function is available, it can be improved, either developmentally, as in the Baldwin effect [14], or with affirmative mutations [10], either in behavior
Fig. 9.1 How behavioral function affects punctuated equilibria. The z-axis is a discontinuous axis, representing qualitatively different functions that individuals of a population can potentially exert. The xy-plane represents the space of structural mutations. In each plane, the dots represent individuals with different phenotypes. A particular phenotype in the bottom plane, on discovering function, gives rise to a population of mutants able to deploy a novel function, thereby enabling the exploitation of areas of the structure and function space that were previously unavailable (the nonintersecting blue region)
or in morphology. Incidentally, unless a conspicuous morphological trait accompanies function, the fossil record will not show traces of the newly acquired behavior (although it may be reflected in the numbers of fossils, representing the viability of a species). If a function appearing anew is behavioral, then the acquired functional advance may be invisible in the fossil record, leading to a Darwinian head scratch.
9.2.3 Neutral Mutation and Appearance of Function

Behavior subserved by neural systems has the advantage that most changes of structure will not destroy function. Mutations that do not lead to novel functional traits are denominated neutral. In beautiful work, Schuster [12, 13] has shown that neutrality and robustness go hand in hand. A system that is robust to mutations, that is, whose mutations lead to roughly equivalent functional capabilities, has more ways to change while avoiding functional collapse. In a talk presented at the Artificial Life Conference 2008 in Lisbon, he concocted the following facts into a cogent argument for the importance of neutrality. His argument is based on gene-regulatory networks and RNA. Schuster demonstrated that it is possible to build RNA molecules with only two bases, instead of the normal four found in nature, that are "functionally indistinguishable" from the natural version. The RNA molecules were interchangeable, as the reactions dependent on that molecule
would continue normally "as if nothing had happened." Two RNA molecules have equivalent functional roles if they have identical geometrical structures. Schuster then posed the questions: "Why have four bases when two suffice? Isn't two more parsimonious?" An excellent case can be built with neutrality as an explanation for the four-base construction. It turns out that a mutation in a two-base RNA is much more likely to be deleterious to the spatial structure of the molecule, and to lead to a collapse of function. By inducing mutations in RNA molecules, experiments conducted by Schuster [13] put the rate of deleterious mutations for two-base RNA at about 90%. With four bases, this number is much smaller; the four-base system introduces redundancy, which translates into robustness. In redundant systems, neutral mutations far outnumber either deleterious or functional mutations. With neutral mutation the potential to evade local optima is increased. In a system with little redundancy there are few paths of change, and maybe none whatsoever; the functional landscape is very rugged. In systems where neutrality reigns, however, there are many neutral paths that may lead to structural domains with the potential for discovery of function.
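The structural point, that redundancy in the code buys neutrality under mutation, can be caricatured without any molecular detail. In the toy below (a cartoon, not a model of RNA folding), the "phenotype" of a string is simply the parity pattern of its letters; with a four-letter alphabet two letters code for each parity, so roughly a third of single-letter mutations are neutral, whereas with two letters none are.

import random

def phenotype(genome):
    # Many-to-one map: only the parity of each letter matters.
    return tuple(g % 2 for g in genome)

def neutral_fraction(alphabet_size, length=50, trials=10000):
    neutral = 0
    for _ in range(trials):
        genome = [random.randrange(alphabet_size) for _ in range(length)]
        site = random.randrange(length)
        mutant = genome[:]
        # Mutate one site to a different letter.
        mutant[site] = random.choice([a for a in range(alphabet_size) if a != genome[site]])
        if phenotype(mutant) == phenotype(genome):
            neutral += 1
    return neutral / trials

for A in (2, 4):
    print(f"alphabet size {A}: neutral fraction ~ {neutral_fraction(A):.2f}")

The redundant code offers paths of change that leave the phenotype untouched; the parsimonious one offers none.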
9.2.3.1 Neutral and Affirmative Mutations

The analogy carries smoothly to neural systems. Most of the changes of the neural system are neutral with respect to function; hence there must be ample redundancy. But unlike gene networks, whose function can be – with some difficulty – retraced to the shape of molecules, in neural systems the structure–function relationship is more elusive. Behavioral function appears as the nervous system is propelled by, and moves, the organism in interaction with the environment. The crux is that behavioral function appears only at the organismic level (not forgetting the population level, in this context a pardonable omission). From the perspective of the organism it is irrelevant how a behavioral function is implemented. The structure of the nervous system connects to function through the body, and thus the only relevant level for function is the organismic level.1 The more ways there are to recombine and rearrange while avoiding functional collapse, the better. If, on the one hand, evolution is not pressed to exhibit its creativity at every new birth, on the other, it cannot prevent change. So, it may leisurely change this or that until function arises – the fewer casualties, the better. Flexibility in structural mutations orthogonal to the space of behavioral functions makes for more
1. The structure of the nervous system is doubtlessly fundamental to function. Nonetheless, its importance is but instrumental: any structure endowed with the same potentialities for subserving behavioral function is, from the perspective of an organism's viability, equivalent. A priori, for functional behavior it is not even necessary that there should be neurons. Any element with the same potentialities may do the same job. Of course, the extraordinary breadth of potentialities of the neuron is not easy to imitate in full, and perhaps only neurons will be able to subserve the full breadth of behavior of complex organisms.
robust organisms. In a world such as ours, being robust is virtually a precondition for existence.2 Before it stumbles on function, evolution tries and tries, neutrality drifts, with its fingers involuntarily probing for function and viability. The invention of function is a step out of the neutral circle.
9.2.4 Convergent Evolution Controversy

To bestow organisms with function, evolution rearranges components in an incremental invention procedure. What can be modified, and how, is determined by the potentialities of the components that evolution has at its disposal. Evolution abides by the rules of reshuffling inherent in the components and the designs from which it departs. Needless to say, it is easier to rearrange previously existing components than it is to invent components anew. In other words, evolution sticks with what it has (therewith deserving its "tinkerer" nickname), towards a future it cannot foresee. At the beginning of an evolutionary process there is much flexibility and incredible variability. The Cambrian explosion is a moment in the fossil record when a large number of astounding new designs appeared. Gould takes the Cambrian explosion as the paradigmatic example of the divergent inventiveness of evolution at the early stages of life [5]. Gould argues that had the evolutionary tape been replayed, beings would be radically different from those we know. Conway Morris [11] argues the opposite. According to him, there is an inherent directionality to evolution, and therefore convergence to particular types of solutions. Since the same environmental and physical constraints act on all life indiscriminately, there is an ideal towards which evolutionary processes tend. This fascinating argument is dependent on what "radically different" means. To settle the issue, one must clearly define "convergence." A couple of preliminary remarks are due here. First, it is clear that as time proceeds new modifications come to depend on those previously incorporated. Second, functions enabling the viability of organisms are likely to be reaffirmed, as Ernst Mayr [10] argues. As a result, evolution may settle on some designs, which become increasingly resilient because of both acquired constraints and reaffirmed mutations. Evolution begets simultaneously functionality and constraints. The dilemma between the views of Gould and Morris is undecidable within the frame they propose, because function, that which renders organisms viable, is often left out of their
2. An interesting issue that arises here is that of historical contingencies in evolution. Neutrality causes organisms' designs to drift across fitness landscapes, in roughly horizontal paths of equivalent viability. But even neutral mutations can eventually be instrumental for the evolution of function. That is, an organism comes to depend on its history of structural modifications, even those that apparently had no role at the time of their appearance. The contingent history of neutral evolution is relevant for the appearance of function. Sometimes a neutral modification of now becomes a fundamental step towards a functional modification in the future. An experimental proof of this in bacteria is found in [1].
evolutionary equation. If Gould is correct in affirming that there is tremendous invention potential for designs, Morris is right in saying that evolution takes place in a highly structured world, and with specific components, which implies certain kinds of solutions – or convergence. The Gould and Morris controversy, like many others, is resolved by merging the opposing views. As Gould himself puts it, "close to the golden ratio" [5]. Divergence and convergence collaboratively contribute to an evolutionary process.
9.2.4.1 Experiment 1: Divergence and Convergence

Progress imposes not only new possibilities for the future but new restrictions.
Norbert Wiener, The Human Use of Human Beings
I would like to illustrate the interplay between divergence and convergence in evolution using the following experiment as an analogy. At the outset, we have a bowl with two balls, one white and one black. We have a virtually inexhaustible supply of black and white balls in another container. One trial of the experiment proceeds as follows. In the first step we perform a blind draw of a ball from the bowl (with a fifty–fifty chance of white or black). If a white (black) ball is drawn, the ball is replaced and a new white (black) ball is added from the supply repository to the bowl. The procedure is then repeated for N blind draws, and the evolution of the ratio is recorded. After the Nth draw, we have N + 2 balls in the bowl. Now we repeat the experiment for a large number of trials and annotate the evolution of the ratios. Intuitively one may expect a number of different outcomes as to the ratio of black and white balls over the many trials. Does the distribution of ratios resemble a Gaussian, with the mean at 50:50? Does a ball of a certain color often overtake the other, thus producing a distribution with two modes concentrated at the edges? Is the distribution of ratios uniform, such that all ratios are equally probable? You may now place your bets. One may hazard a couple of predictions. First, that the resulting proportion is heavily biased by the initial draws. So, if the first four balls drawn are all black, then for the fifth draw, there is a probability of merely 1/6 of selecting white. Thus, one may be led to expect that the initial draws heavily bias the possible outcomes. Second, one may expect that any given trial converges to a stable ratio, and this is indeed the case. But the main question is: What is the white-to-black proportion after a large number of draws? The answer can be seen in Fig. 9.2. Lines of different colors represent the evolution of the ratio between balls in different trials. The outcomes of five specific trials are emphasized with thicker lines. One sees that, indeed, individual trials roughly converge to a stable ratio (which, given the law of large numbers, should be expected).
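The procedure is easy to reproduce; it is a Pólya urn. The sketch below (with illustrative trial counts) runs many independent trials and bins the final proportion of white balls; the bin counts come out roughly equal, i.e., the flat histogram of Fig. 9.2.

import random
from collections import Counter

def polya_trial(draws=1000):
    white, black = 1, 1
    for _ in range(draws):
        if random.random() < white / (white + black):
            white += 1          # drew white: replace it and add another white
        else:
            black += 1          # drew black: replace it and add another black
    return white / (white + black)

ratios = [polya_trial() for _ in range(2000)]
# Crude histogram over ten bins: the counts are roughly equal (uniform outcomes).
bins = Counter(int(r * 10) for r in ratios)
for b in range(10):
    print(f"{b/10:.1f}-{(b+1)/10:.1f}: {bins.get(b, 0)}")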
Fig. 9.2 One thousand runs of the experiment described in the text. Individual runs are color-coded. Five individual runs are emphasized. The histogram to the right represents the frequency of the particular outcomes, i.e., a uniform distribution of outcome ratios between black and white balls
On the other hand, it may be surprising to verify that there are large fluctuations at the outset. In particular, the trial that at about the tenth draw has 90% black balls eventually settles at just under 70%. So, although it does converge, the first draws are poor predictors of the ratio to which the trial converges. Now, regarding the histogram at the right of the plot, one sees something counterintuitive. The outcome of many trials spans the whole ratio space in a uniform fashion, showing remarkable divergence. There are no preferred ratios, despite the heavy dependence on the initial conditions. So, although the individual runs converge, the whole space of ratios is spanned. The probability of any given resultant ratio is equal, and thus the distribution of ratios is uniform. The analogy can be brought to bear on the interplay between divergence and convergence in evolution in the following ways:

1. The space of possible outcomes is free (all arrangements are possible) yet constrained (only combinations of black and white balls are possible). All the possible outcomes are inherent in the formulation of the problem. Although potentially infinite (the space of the real numbers between 0 and 1), all the solutions will remain within the space. The rough analogy notwithstanding, one can claim that the recombination of real elements likewise cannot escape the varieties implied by the inherent potentialities of atoms. Although an obvious point in the example, this is not quite so obvious in real organisms, but retains some truth.
2. The repeated application of invariant rules on basic components produces convergent trends. The evolutionary trend tends to stabilize on certain designs. Although change is permanent, future changes will tend to reaffirm the design. Reaffirmation of the design is inexorable, as constraints are incrementally acquired, as beings become more complex and more dependent on history.

3. The fate of particular trends is influenced by, but not decided in, the first picks. Conversely, there is much divergence in the initial steps, as the first draws have a higher impact than later draws. The space of initial structures is spanned rather quickly. As time proceeds, the modifications become more modest, as a result of accumulated constraints and increased resilience to change.

4. Before the first draw, every outcome ratio is equally probable. Although the individual runs converge, taken together they span the space of possibilities uniformly. There are no a priori favored outcomes. In contrast, after many draws the situation changes, as the history of the draws introduces a significant bias.
9.2.4.2 Selection and Behavioral Function Coarise

This experiment helps us visualize possible trends of structural modification. It says, however, little about evolution itself. The crucial component manifestly lacking from this tiny experiment is a selection procedure, without which there is no pressure for certain outcomes to be preferred. All outcomes are equiprobable, because there is no selection bias whatsoever. Without criteria of viability with which to make a selection, outcomes become equiprobable. This is blatantly not the case in evolution, where selection criteria coarise with organisms [17]. The appearance of organismic and behavioral function is what generates orderings and begets selection criteria. Behavioral functions effectively cause the emergence of dimensions of comparison, making some organisms better than others. They are better only inasmuch as their functional potentials translate into selection biases. As they acquire functions, organisms become qualitatively different from each other. Every behavioral function is a potential selection criterion. Organisms may possess many functions, and by extension many potential ways to order themselves. Furthermore, not all beings will be comparable according to an objective criterion, and this amounts to the difference between viability and fitness.3 Nevertheless,
3. Fitness is an abstraction over viability, where the abilities of organisms can somehow be quantified, and the organisms themselves compared. Fitness is hard, viability is soft. Fitness always produces specific orderings of the quantified agents, totally ordered sets. Conversely, viability's orderings are subjective, contingent, and context-specific – when they are at all possible. In such orderings, relations of transitivity, reflexivity, or equivalence would only hold in isolated cases. Most of the others are neutral or simply incomparable. Even so, when orderings exist, through a selection procedure they can become motors of convergence. In an artificial evolution, orderings and selection mechanisms are the basis for the appearance of interesting evolutionary phenomena. Evolutionary robotics and the artificial evolution of recurrent neural networks (RNNs) for robot control operate according to idealized orderings of fitness, and unlike nature, arrange individuals according to very regular lattices (totally ordered sets). Incidentally, this is the basis for the distinction I have adhered to so far between fitness and viability. I talk about fitness when there exists a well-defined function for the ordering of individuals, whereas "viability" refers to natural and irregular orderings, where relations of transitivity and reflexivity are not well defined. Fitness is essentially an idealization of viability.
behavioral functions that have thriving consequences for the population (or that are not detrimental in any important way) are factors biasing evolution. Underlying convergence to particular designs is the appearance of evolutionarily relevant function.
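One crude way to see how a coarising selection criterion destroys that indifference is to give one color a reproductive advantage in the very same urn, weighting the draw by a "fitness" factor. The weighting scheme and the numbers are toy assumptions; the qualitative effect is that the final ratios no longer span the space uniformly but pile up towards dominance of the favored color.

import random

def biased_urn_trial(draws=1000, fitness_white=2.0):
    # As before, but the chance of drawing white is weighted by a fitness factor.
    white, black = 1, 1
    for _ in range(draws):
        p_white = fitness_white * white / (fitness_white * white + black)
        if random.random() < p_white:
            white += 1
        else:
            black += 1
    return white / (white + black)

ratios = [biased_urn_trial() for _ in range(2000)]
print("mean final white fraction:", sum(ratios) / len(ratios))
print("fraction of trials ending above 0.9 white:", sum(r > 0.9 for r in ratios) / len(ratios))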
9.3 Evolutionary Phenomena in Evolution of Tracking and of Following

In this section I review results from a set of further experiments in tracking, as well as an analysis of how the attractor landscape changes over the course of evolution.
9.3.1 Invariance Organized Through Orderings and Selection

I have argued before that invariants in neural networks appear as structured input flows into a plastic system (as in Sect. 4.3.1). Systems governed by invariant rules organize around regularities of the environment and the substrate, towards goals more or less explicit – thereby coming to represent the regularities. Analogously, evolving organisms are malleable subjects for which viability (or fitness) implies selection criteria. As they evolve, organisms come to implicitly represent their modes of structural coupling with the environment [17]. The environment is a generous source of exploitable structure, and so is the body. Both have the potential of furnishing the organism with sources of distinction – i.e., behaviorally relevant information [16]. Evolution organizes organisms around the useful and potential distinctions offered by the environment, insofar as these increase viability. It is here that the attractor landscape emerges as the invariant to which the evolution of function converges, for it represents the class of equivalent structures with the potential to deploy a given behavioral function.
9.3.2 Gradual Improvement in Evolution of Simple Tracking

This section deals with the experiment described in the previous chapter, and assumes familiarity with the setup and methods of analysis therein, particularly in regard to attractor landscapes and their motor projections. The following presents results of evolutionary runs of the tracking experiment. Afterwards, I will present the methods and results of two extensions of the original problem. In the original tracking problem, the evolutionary dynamics appeared as gradual improvement reaching a saturation. Gradual improvement of function was observed in almost all evolutions. This was reflected in the fitness values, as in Fig. 9.3. Agents organized function from the first generations on. The slope of the fitness curve close to the first generations is usually high. In certain evolutionary runs, the fitness more than doubled between two generations. Around the 120th generation, the tracking function is reliable: (1) the ball is rarely lost and (2) if the ball is lost, it is actively pursued. In Fig. 9.3 one notices that in the first generations the fitness increase is largely monotonic. The monotonicity of the gradual improvement depends on the evolutionary parameters of structural variation. Higher variability in structural modifications, such as the addition and removal of synapses, leads to a more jagged fitness curve, as seen in Fig. 9.4. To an RNN, small changes in structure can lead to
Fig. 9.3 The fitness curve from the best individual of 200 generations in the evolution of tracking. There is a gradual and almost monotonic improvement of fitness values
Fig. 9.4 Impact of mutation rates on fitness. The best individuals of 300 generations of another evolutionary run. Although the main trend is increasing, owing to evolutionary parameters increasing the variability of structure there are jagged edges in the fitness curve, especially in the very first generations. The shaded area indicates the moment when the mutation rate was decreased. The maximum fitness here is 8,511, whereas in the previous evolutionary run it was 2,200. The disparity is due to the duration of the trials: trial durations in the second evolutionary run were 4 times those in the previous one
Fig. 9.5 Pairwise motor projection plots for the ten first individuals of one evolutionary run. The z-axis represents generations. The plot to the left corresponds to the yaw motor unit, that to the right is for the pitch motor unit. See the text for explanations
large alterations in the dynamics, so this effect was to be expected. Nevertheless, the gradual improvement was still the main trend despite the higher variability. Notice also that the variability in function is tightly connected with the parameters of structural modification (evidently, neutrality is not independent of how much the components change: the more the components change, the less neutrality). To see whether gradual improvement was correlated with the attractor landscape, I plotted a sequence of motor projections of attractor landscapes (as in Sect. 8.4.4.1); the results are seen in Fig. 9.5. In this plot, each slice represents the attractor landscape of an agent, where the z-axis represents its generation. The first ten agents and the last ten agents of an evolutionary run are shown. The landscapes display a convergent trend in which a certain pattern becomes distinctive. As we have seen previously, this pattern is associated with the negative feedback that underlies the tracking function. It is a projection to the motor units. The pattern indicates negative feedback in which the action depends on the relative position of the ball in the agent's visual field. When the ball is to the left of the visual field, there is a correction to the left, and vice versa. Feedback works similarly for up and down, the pitch motor. As evolution unfolds, these symmetric patterns emerge and later stabilize. Figure 9.6 shows the attractor landscapes for later generations, between 120 and 130. In the course of evolution other features of the projected attractor landscape become increasingly distinguishable. For example, when the ball is not visible (at the rim areas of the plot), the agent's average motor action is close to zero (green background color). This is an aptitude that becomes organized through evolution. The first agents in an evolutionary run did not have average motor unit activity
Fig. 9.6 Pairwise motor projection plots for the individuals from generations 120–130 of one evolutionary run. The z-axis represents generations. The plot to the left corresponds to the yaw motor unit, that to the right is for the pitch motor unit. See the text for details
close to zero. This is useful for the agent, as it avoids a catatonic response in which, having lost the ball, the agent acts in a stereotypical way (e.g., if the lower-left region of the plot were red, then on losing its gaze the agent would always turn to the right). Thus, the gradual improvement of function seems to imply a particular landscape, albeit with considerable variation in the details of the attractor landscapes of different agents. We take this variation to be an instance of neutrality, as the variation in later generations did not, usually, have an impact on fitness.
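For readers who wish to reproduce plots of this kind, the sketch below computes a motor projection for a generic discrete-time tanh network: each stationary stimulus is clamped as a parameter, the network is iterated past its transient, and the mean activity of a motor unit on the attained attractor is recorded. The network model, the grid, and the averaging window are assumptions standing in for the evolved controllers.

import numpy as np

def motor_projection(W, b, w_in, motor_idx, grid, transient=300, window=100):
    """Average activity of one motor unit on the attractor reached for each
    clamped stimulus (yaw, pitch) in the grid: a motor projection of the
    attractor landscape for an illustrative discrete-time tanh network."""
    n = W.shape[0]
    proj = np.zeros((len(grid), len(grid)))
    for i, yaw in enumerate(grid):
        for j, pitch in enumerate(grid):
            stim = np.array([yaw, pitch])
            x = np.zeros(n)
            outputs = []
            for t in range(transient + window):
                x = np.tanh(W @ x + b + w_in @ stim)   # stimulus held fixed as a parameter
                if t >= transient:
                    outputs.append(x[motor_idx])
            proj[i, j] = np.mean(outputs)              # mean motor output, estimated over a window
    return proj

# Illustrative use with a random network (an evolved one would be loaded instead).
rng = np.random.default_rng(0)
n = 8
W = rng.normal(0, 1.0, (n, n))
b = rng.normal(0, 0.1, n)
w_in = rng.normal(0, 1.0, (n, 2))
grid = np.linspace(-1.0, 1.0, 21)
print(motor_projection(W, b, w_in, motor_idx=0, grid=grid).shape)   # (21, 21)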
9.3.2.1 Convergent Evolution (of Attractor Landscapes)

What is more important for functional improvement, network structure or dynamics? Inasmuch as the network structure is what subserves the possible dynamics, one may intuitively think that structure comes first. On the other hand, a difference in structure need not signify a difference in behaviorally meaningful dynamics. Moreover, in an evolutionary procedure with random mutations, even an infinitesimal modification of synaptic weights may lead to functional disorganization and the consequent demise of the agent (e.g., owing to a bifurcation). Taking these results as an analogy, evolution selects, from those structures available to it given the historical contingencies of mutation, those which possess the proper behavioral dynamics. Gradual improvement arises as evolutionary processes cause networks to be (1) organized – with proper connectivity structures whose dynamics enable negative feedback – and (2) optimized – as the dynamics become better adapted to the physical constraints of the problem (gravity, size of the ball, force applied to the motors).
9.3.2.2 Structural Divergence and Functional Convergence

Evolution aims for function, not for dynamics or structure, albeit evidently both are necessary for function. Therefore, we hypothesized that the differences in structure and dynamics between different successful agents should be quite pronounced, whereas those of the attractor landscape motor projections, because they are a closer representation of the tracking function itself, should be minor. To see the extent to which this holds, I compared the best agents of five evolutionary runs in terms of both graph-theoretical measures (from [15]) and their attractor landscapes. The results are shown in Table 9.2 and Fig. 9.7. In Table 9.2 it is interesting to observe the pronounced disparity in some of the graph measures of the different networks (the graph measures are summarized in Table 9.1). Despite their fitness values being within a standard deviation of the maximum fitness value for a single agent (per unit step of a trial), the graph-theoretical measures indicate the space of structural variability for that fitness value.4 From the table we also observe that the graph measures for the structure of different individuals in late generations may be quite diverse. This indicates a dependency on the early mutations leading to divergence of later forms, akin to the arguments in Sect. 9.2.4.1. Compare now the motor projections of the attractor landscapes for the five individuals from different evolutionary runs. In their overall features they appear very similar: (1) symmetries in the landscapes projected to the motor units – left–right symmetry for the yaw motor unit and up–down symmetry for the pitch motor unit; (2) the
Fig. 9.7 The individuals of five evolutionary runs with comparable fitness (within one standard deviation of each other). Despite different evolutionary parameters and considerable structural differences, the attractor landscapes display reoccurring features

Table 9.1 Graph theory connectivity measures
In strength: sum over weights of the connections coming into a unit (signed)
Out strength: sum over weights of the connections going out from a unit (signed)
k-density: the ratio between the number of connections and the number of possible connections
Modularity: Q value (fraction of connections within communities / expected fraction of such edges)
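The measures of Table 9.1 (except modularity, which requires a community-detection step as in [15]) can be computed directly from a signed weight matrix. The matrix convention below (rows index targets, columns index sources; self-connections counted among the possible connections) is an assumption.

import numpy as np

def connectivity_measures(W):
    """W[i, j]: signed weight of the connection from unit j to unit i (assumed convention)."""
    present = W != 0
    in_strength = W.sum(axis=1)             # summed signed weights into each unit
    out_strength = W.sum(axis=0)            # summed signed weights out of each unit
    n = W.shape[0]
    k_density = present.sum() / (n * n)     # realized / possible connections (self-connections allowed)
    return in_strength, out_strength, k_density

W = np.array([[0.0, 0.8, -0.5],
              [1.2, 0.0,  0.0],
              [0.0, -0.3, 0.0]])
ins, outs, dens = connectivity_measures(W)
print("in strength:", ins, "out strength:", outs, "k-density:", round(dens, 2))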
4. The networks in this chapter can be found on the accompanying online resources, at www.irp.oist.jp/mnegrello/home.html.
Table 9.2 Network connectivity values from individuals of five evolutionary runs

Network (generation) | Units (hidden) | k-density | In strength (motor)       | Out strength (motor)        | Modularity (maximum Q) | Synapses
1 (297)              | 13 (0)         | 0.1410    | 1.3630 (Y) / 1.1970 (P)   | 1.7 (Y) / 0.15 (P)          | 3.4115                 | 22 (26)
2 (281)              | 20 (7)         | 0.1132    | 1.6730 (Y) / 2.660 (P)    | 0.38 (Y) / 0.24 (P)         | 2.4597                 | 43 (220)
3 (292)              | 26 (13)        | 0.0883    | 1.3380 (Y) / 1.0932 (P)   | 1.689 (Y) / 1.5531 (P)      | 5.3725                 | 62 (282)
4 (865)              | 18 (5)         | 0.4118a   | 15.8408 (Y) / 0.6629 (P)  | 33.878a (Y) / 11.0575 (P)a  | 8.3911                 | 126a (198)
5 (173)              | 16 (3)         | 0.2542    | 12.7160 (Y) / 7.8570 (P)a | 37.29 (Y) / 0.4400 (P)      | 9000a                  | 58 (176)

The fitness criterion was to achieve one standard deviation from the maximum fitness achieved by a given agent. Notable differences appear in the graph-theoretical measures and in the structures of these agents. Nevertheless, the attractor landscapes are similar across all. In strength and out strength are given for the motor units, yaw (Y) and pitch (P).
a Outlier values
maximal mean amplitudes of the responses are comparable; (3) the mean activities of the motor units in the absence of a stimulus are comparable and display a tendency to zero (green areas, where the average output of the units approaches zero). Different individuals may use different dynamics to implement actions with similar consequences. The periods of the attractors involved in the implementation of action do vary. Thus, slight differences in the behavior of different agents are found. These differences are minor compared with the behavior required, and are difficult to describe (nevertheless, they could be instrumental in the acquisition of other functionality). The invariance is, of course, at the gross level of motor output, and represents the function ⟨tracking⟩ at that level. Although both the structure of the networks and their dynamics vary, convergence in behavior is represented by the attractor projection to the motors. The networks evolve such that the problem is solved. If one takes the set of possible structures enabling a particular behavioral function, that set is equivalent with respect to that behavioral function. In our case, the set of networks that solve the tracking function (according to some criterion) defines an equivalence class with respect to tracking. This convergence in behavior can be observed in the behavioral invariant, visualized as the motor projections of the attractor landscape.
9.3.2.3 Inventive Evolution Evolution has been said to be creative, but more than that, it is inventive. It conjures up its inventions out of the potentialities of the substrate and the environment. Evolution is also a caring inventor and it perfects its own inventions. Discovery of function and its gradual improvement collaborate to keep an organism viable, and ever more so.5 In this section we discussed an instance of gradual improvement. In the next, we will see instances of invention.
9.3.3 Extensions to the Experiment on the Evolution of Tracking
The potential for function appears as an agent couples with its environment. The coupling between an agent and its environment in the tracking problem was fairly simple. The only capability required from the agent was to track a ball constrained to a plane, a reasonably straightforward embodied problem.
5 Behavior is potential in interaction. Lest I move on too swiftly and forget this important disclaimer, it should be noted that there are more sources of novel behavioral function than structural mutation. An organism couples with its environment, and function appears from this interaction. Changes in environmental circumstance, such as migration or niche alteration, may lead to a new set of possible ways to couple with the environment functionally. But as we focus on the study of the isolated agent, we might be led to overlook the fundamental contributions of the environment to behavioral functions.
The gradual improvement of the fitness of tracking in many evolutionary runs results from the rather narrow definition of the tracking function in two dimensions. There was not much space for invention. I anticipated that a more complex problem, with more space for invention, would mean more multifaceted agents and, consequently, more eventful evolutionary runs. Thus, I extended the experiment in two ways, expecting more flexibility for the appearance of more complex functional behaviors and, by extension, of more interesting fitness phenomena. In the first extension, the ball was allowed to move in three dimensions, introducing one degree of ambiguity (is the ball approaching or moving away from the agent?). In the second, the tracking head was given mobility in space by being mounted on wheels. The evolutionary runs of these extensions exhibited interesting evolutionary phenomena, such as (1) punctuated equilibria interspersed with periods of gradual improvement and (2) contingent dependence of functions on functions (i.e., functions that had to be evolved before other functions could be evolved). Next I present the two variants of the tracking experiment, moving on to indicate their bearing on evolutionary phenomena. Because the discussion of the two experiments has many similarities, it will be presented in a single section, following the presentation of the two experiments and their results.
9.3.3.1 Experiment Setup: Three-Dimensional Tracking – Sinai Environment
In the first extension, the ball was allowed to bounce in three dimensions. That introduced one degree of ambiguity, as the ball could now move towards or away from the tracking agent, as seen in Fig. 9.8. The environment was designed such that the trajectories of the ball were by necessity chaotic. The environment is a triangular prism with a cylinder positioned at one of the edges. This setup was inspired by a variation of the Hadamard billiard, called the Sinai billiard.6
Sensors and Motors
The retinal input sensors are Boolean and respond with 0 (no ball) or 1 (ball very close). The rays are divergent, with an angle of π/64 both vertically and horizontally. This has the added effect that when the ball is far away, there is less chance that all the sensors will be in contact with it.
6 Yakov Sinai was a Russian mathematician who proved that the motion of a spherical particle in such a setup is not only chaotic, but also ergodic. To add to this complexity, the initial state of the ball was randomized. The cylinder protects the tracking head while being invisible to it, meaning the rays do not see the cylinder.
Fig. 9.8 The Sinai experiment. At the center there is a cylinder, which protects a tracking head. The rays are divergent and respond with 1 (ball) or 0 (no ball). Because the rays are divergent, it is in principle possible to distinguish distance (assuming a ball of constant size). The ball bounces erratically in the environment
The proprioceptive angle sensors are linear and return the current angle within [−1, 1]. The output motors have a hyperbolic tangent as the neuronal transfer function, and otherwise share the same implementation with the original problem (Table 9.3).
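To fix ideas, here is a minimal sketch of the sensor and motor conventions just described (Boolean 3×3 retina, proprioceptive angles rescaled to [−1, 1], hyperbolic tangent motor transfer); the function names and the particular discrete-time update are illustrative stand-ins, not the original implementation.

import numpy as np

RAY_DIVERGENCE = np.pi / 64    # angle between neighboring rays (vertical and horizontal)

def sensor_vector(ray_hits, yaw, pitch, yaw_range, pitch_range):
    # ray_hits: 3x3 Booleans (True where a ray touches the ball) -> 0/1 retina values
    retina = np.asarray(ray_hits, dtype=float).ravel()
    rescale = lambda a, lo, hi: 2.0 * (a - lo) / (hi - lo) - 1.0
    proprio = [rescale(yaw, *yaw_range), rescale(pitch, *pitch_range)]   # linear, in [-1, 1]
    return np.concatenate([retina, proprio])

def rnn_step(state, inputs, W_rec, W_in, bias):
    # One plausible discrete-time update; motor units, like all units, pass through tanh,
    # so motor commands lie in (-1, 1).
    return np.tanh(W_rec @ state + W_in @ inputs + bias)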
Evolution Procedure
The evolution was done in two stages. First, a primordial soup was prepared, where agents started out with a large number of units (ten hidden units) and synapses (40), randomly distributed. At the tenth generation the parameters were changed to stabilizing values: equal probabilities of adding and removing neurons and synapses (0.02 per synapse and per neuron), with a high probability of weight change (30% of the synapses are changed by values of up to ±2). The complete lists of evolutionary parameters are quite extensive; they can be found in the online resources, at www.irp.oist.jp/mnegrello/home.html (referenced by section).
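The stabilizing stage of this procedure can be sketched as a simple mutation operator; the genome representation, names, and the reading of "30% of the synapses are changed by values of up to ±2" are my own assumptions, not the original code.

import random

P_STRUCT = 0.02   # probability of adding/removing, per neuron and per synapse
P_WEIGHT = 0.30   # fraction of synapses whose weight is changed
MAX_DELTA = 2.0   # weight changes of up to +/- 2

def mutate(neurons, synapses, rng=random):
    """One mutation pass. neurons: list of hidden-unit ids; synapses: {(src, dst): weight}."""
    # Remove neurons (and their synapses) with probability P_STRUCT each.
    for nid in list(neurons):
        if rng.random() < P_STRUCT:
            neurons.remove(nid)
            synapses = {k: w for k, w in synapses.items() if nid not in k}
    # Add new neurons with the same per-neuron probability.
    for _ in range(sum(rng.random() < P_STRUCT for _ in neurons)):
        neurons.append(max(neurons, default=0) + 1)
    # Remove and add synapses with the per-synapse probability.
    for key in list(synapses):
        if rng.random() < P_STRUCT:
            del synapses[key]
    for _ in range(sum(rng.random() < P_STRUCT for _ in synapses)):
        if neurons:
            src, dst = rng.choice(neurons), rng.choice(neurons)
            synapses[(src, dst)] = rng.uniform(-MAX_DELTA, MAX_DELTA)
    # Perturb roughly 30% of the surviving weights by up to +/- 2 (one reading of the text).
    for key in synapses:
        if rng.random() < P_WEIGHT:
            synapses[key] += rng.uniform(-MAX_DELTA, MAX_DELTA)
    return neurons, synapses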
9.3.3.2 Results: Three-Dimensional Tracking – Sinai
In the Sinai evolutions I observed both gradual improvement and fitness jumps (invention of function) in many of the evolutionary runs.
Table 9.3 Sinai simulation details
Entity | Property | Quantity
Head | Mass | 3 kg
Head | Height | 1 m
Yaw and pitch motor | Maximum force | 5 N
Yaw and pitch motor | Maximum velocity | 90 deg/s
3×3 distance input array | Vertical and horizontal divergence angle of sensors | π/64 rad
Ball | Radius | 0.35 m
RNN and physics | Update frequency | 100 Hz
RNN recurrent neural network
Fig. 9.9 The evolution of fitness in the Sinai experiment. Three moments are seen: a steep increase of fitness, followed by a slower increase, and another smooth increase of fitness, separated by a discontinuous jump. See the text for details
In early generations, the agents were unable to keep track of the ball when it varied its proximity. The agent would follow the ball for a short period and lose it soon afterwards. Upon losing the ball, many of the initial agents would catatonically stare at a corner until the target came to a more amenable tracking condition (possibly as a result of the saturation of the activity of one of the motors). In this period, the fitness of agents would improve gradually until a plateau was reached, where fitness would oscillate (dependent on the initial conditions of ball release, with varying difficulties). In three runs, agents would remain at this plateau for a while, with fitness oscillating about similar values, and in certain cases for the rest of the duration of the evolution (limited to 2,000 generations). An example of this is seen in Fig. 9.9. The next stage in evolution started with a sudden increase of fitness. This stepwise increase would happen in one generation only. The increase would be remarkably salient, as can be seen in Fig. 9.9. After this evolutionary moment, the subsequent generations would have different futures: either the increase in fitness would disappear soon after the stepwise event, or another plateau would be reached, or another period of gradual improvement would be observed. In different evolutionary runs, these event types would appear at different stages and with different durations. Observation shows that moments of invention of function were accompanied by enhanced tracking robustness. This newly acquired quality of tracking, however, would be achieved in different forms in different evolutionary runs. In some runs, the agents would improve by being competent at following a ball approaching from
the left. Other times, the agents would improve by better following a ball coming directly towards them. These behaviors relate to different alterations of the attractor landscapes. In a small number of the evolutionary runs, the agents would become competent in ball following, to the point that they no longer lost sight of the ball. The agents would reach a “saturation” of the fitness function: with the space for invention exhausted, the behavior was effectively optimal.
9.3.3.3 Experiment Setup: Three-Dimensional Tracking – R2
Robot Morphology
From the perspective of emergent behavior, the second extension is the most interesting, for many behaviors had to be evolved that built on previous discoveries of function. A tracking head with yaw and pitch motors and sensors was mounted on a base with a three-wheeled holonomic arrangement (Fig. 9.12). The robot, christened R2, tracks balls as before, but it is moreover required to follow them, as they may move in different directions, as in Fig. 9.10. The fitness function is the same as in all experiments: a sum over the retinal array sensory inputs.
Sensors and Motors
The head: The retinal sensor input array has rays with a range of 8 m, and responds linearly between −1 (no ball) and 1 (ball very close).
Fig. 9.10 The R2 experiment
The angle between the rays is π/32 in the vertical direction and π/16 in the horizontal. The array is otherwise arranged as in the other two experiments. The output motors of the neck have a hyperbolic tangent as the transfer function, and respond from −0.25 to 0.43 rad in the pitch motor, and from −0.35 to 0.35 rad in the yaw motor. The proprioceptive angle sensors are linear within the same range.
The wheels: Three wheels are mounted on a cylindrical body of 1-m height and 0.5-m radius. The holonomic drive is a triangular arrangement of wheels, whose axes of rotation are aligned with the medians of the triangle formed by the wheel centers. The wheels can turn in both directions. A priori, this arrangement enables the robot to turn around its own axis in place, as well as to move along straight lines (Fig. 9.11). The wheels are arranged in a circle, 120° apart from each other. The motor neurons respond with the hyperbolic tangent, between −1 (counterclockwise) and 1 (clockwise). The maximum speed of rotation is 8 rad/s. In this arrangement, one simple way to move forward is to activate two wheels with opposing rotations, although there are other possible ways (examined in [8]).
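To make the wheel arrangement concrete, here is a minimal kinematic sketch of a three-wheel holonomic drive with wheels 120° apart; the wheel mounting angles and the mapping used are the standard ones for such drives and are assumptions, not taken from the original simulator.

import numpy as np

R_BODY = 0.5                                     # body radius in meters, as in the text
WHEEL_ANGLES = np.deg2rad([90.0, 210.0, 330.0])  # wheel mounting angles, 120 degrees apart (assumed layout)

# Row i maps the body twist (vx, vy, omega) to wheel i's rim speed:
# each wheel rolls along the tangent of the body circle at its mounting angle.
M = np.array([[-np.sin(a), np.cos(a), R_BODY] for a in WHEEL_ANGLES])

def wheel_speeds(vx, vy, omega):
    """Rim speeds of the three wheels for a desired body velocity."""
    return M @ np.array([vx, vy, omega])

def body_twist(rim_speeds):
    """Body velocity (vx, vy, omega) produced by given rim speeds (least-squares inverse)."""
    return np.linalg.lstsq(M, np.asarray(rim_speeds), rcond=None)[0]

# Moving straight ahead (+y) uses only two wheels, spinning in opposite directions,
# which matches the observation in the text.
print(np.round(wheel_speeds(0.0, 1.0, 0.0), 3))   # -> [ 0.    -0.866  0.866]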
Fig. 9.11 R2 experiment. This holonomic robot has a tracking head with yaw and pitch motors and is evolved to follow balls that bounce away from the robot (bouncing indefinitely without losing energy). The diagram represents the initial randomized arrangement of the robot, and of the balls (see the text). The arrows indicate possible directions of ball movement (given an initial impulse in a random direction). Notice also that the balls do not come towards R2 – R2 has to come to them (indicated by the arrows on the balls pointing away from R2)
Fig. 9.12 The holonomic arrangement of wheels, and an example of a combination of wheel rotations. The gray triangle of the wheel arrangement is equilateral. The labels at the wheels are the cardinal points, with reference to the way R2 is positioned in the environment (south is represented here as the front). If the wheels have the rotations indicated, the robot moves along an arc to the right (as indicated by the arrows from R2)
Environment
The environment for the holonomic drive (from now on denominated R2) has ten balls (radius 0.5 m) arranged in a cluster away from the initial position of the robot. This cluster of balls is constructed such that, in order to “sense” the balls, the agent must first move towards them. The cluster is constructed anew at each trial by dividing a circle (10-m radius) into ten angles, and then placing each ball at a randomized distance (±1 m). The center of the cluster is positioned in front of the robot, at a distance of 10 m (the sensor arrays are sensitive up to 8 m), with offsets of ±1 m. This arrangement is schematized in Fig. 9.11. The balls are released at different heights (ranging from 1 to 2 m), and receive an impulse (1 N s) at the beginning of the trial, after which they bounce passively in some direction, dependent on the randomized angle of the impulse vector. The initial position of the robot is not randomized. A screenshot of one example initial condition for one arbitrary trial is also seen in Fig. 9.11.
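The trial construction and the fitness criterion can be sketched as follows; this is one possible reading of the cluster geometry described above (ten directions around a cluster center placed 10 m ahead, with radii jittered by ±1 m), and the names and frame conventions are my own.

import numpy as np

rng = np.random.default_rng()

def init_trial(n_balls=10, cluster_distance=10.0, jitter=1.0):
    """Place the ball cluster in front of the robot (assumed to start at the origin, facing +x)."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_balls, endpoint=False)   # circle divided into ten angles
    radii = 1.0 + jitter * rng.uniform(-1.0, 1.0, n_balls)            # randomized distance, +/- 1 m
    x = cluster_distance + radii * np.cos(angles)                     # center 10 m ahead; sensors reach 8 m
    y = radii * np.sin(angles)
    z = rng.uniform(1.0, 2.0, n_balls)                                # release heights between 1 and 2 m
    impulse_dir = rng.uniform(0.0, 2.0 * np.pi, n_balls)              # direction of the initial 1 N s impulse
    return np.stack([x, y, z], axis=1), impulse_dir

def fitness(retina_history):
    """The usual fitness: the sum over retinal array inputs, accumulated over the trial steps."""
    return float(np.sum(retina_history))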
Artificial Evolution
Ten evolutionary runs were performed with two parameter sets.7 The trials ran for the constant duration of 15,000 steps, a rather lengthy trial time. The structural evolution ran in two modes, a “modular” mode in five evolutionary runs, and a “monolithic” mode for another five evolutionary runs. As the problem is not trivial and appears perforated with local minima, successful following is not reliably
7 The log files of the evolution can be found in the online resources, at www.irp.oist.jp/mnegrello/home.html.
evolvable. Out of the ten runs, satisfactory following ability was achieved twice in the monolithic version and four times in the modular version. In what follows, I describe my observations based solely on the monolithic version.8 My purpose for these evolutions was to observe the appearance of behavioral function, qualitatively. Therefore, my analysis is descriptive rather than quantitative. The fitness curves serve merely as guides for the experimenter to search for the appearance of novel embodied behavioral phenomena. In accord with the argument that every evolutionary run is unique, the evolutionary runs are analyzed individually. The analysis focuses on the forms of possible evolutionary improvement, and the path to functional behavior. Here I will not perform a dynamical analysis of the evolved networks (but I will do this in the next chapter, in the context of modularity).
9.3.3.4 Results: Contingent Evolution and Invention of Function
The appearance of novel behavioral function is difficult to observe in the fossil record, but in artificial evolution it is not a rare phenomenon. Leaps of fitness correlate with the invention of behavioral function, i.e., some ability which allows the agent to substantially increase its fitness value. Some abilities associated with following a ball appeared not only gradually, but also discontinuously. As the balls are positioned far from the agent, to follow the balls R2 must first learn to look for them. At the beginning of a trial, the balls are unreachable by the agent's retinal rays. This provides a niche to be exploited by agents that are simply able to engage in ⟨search⟩ behavior, of whatever description. But simply moving around is not the same as tracking, so the fitness of these agents varies considerably. At this stage fitness improves only very gradually and with plenty of variation. The next stage is a primitive appearance of the tracking function, where the appearance of negative feedbacks at the level of the retina increased fitness. This ability subserves the potential for a better control of the wheels. Controlling the wheels in a manner that will cause the agent to follow a ball is subsumed by an ability to track the balls. In this stage the agent often approached the ball and eventually “kicked” it when it bumped into the ball. That caused the ball to be sent off in some direction, usually with a speed larger than that of R2. The agents are then required to ⟨control the distance⟩ between them and the balls, or to ⟨search for ball⟩. In the next stage, as the ability of the agent to track increases, so does the control of the wheels. The best agent was able to keep the ball at a more or less constant distance, by using its neck proprioceptors to modulate the velocity of the wheel arrangement. The coordination of movement from wheels and neck motors need not appear simultaneously. As network mutations change the dynamics of the system, different abilities evolve asynchronously.
8 In the modular version, the evolution of modularity was guided by a procedure which would cut connections between the modules. The arbitrariness of this procedure impairs a proper analogy with natural evolution.
In successful evolutionary trials, abilities emerged in a punctuated manner, accompanied by fitness jumps. The abilities we observed that would only evolve in later generations, causing fitness jumps, were: (1) effective use of the pitch motor to control the overall velocity towards the ball. As the balls distance themselves from the original point, they move in different directions and with different velocities. To follow the ball (without kicking it), R2 engages in an approach–stop movement, whose source is the neck proprioceptor. As the head looks up, the proprioceptor outputs to the motors modulate the speed of the wheels, and as it looks down, the robot accelerates towards the ball. In this way the agent can keep a roughly constant distance from the ball it is following, by approaching when the neck is down and distancing itself when the neck is up (this can be neatly seen in the video included in the online resources, at www.irp.oist.jp/mnegrello/home.html, which simultaneously shows the activations during a trial). Incidentally, the ability of R2 to approach the ball and distance itself from the ball can also be optimized gradually. The reason for the discontinuous jump in fitness can be retraced to alterations of the network structure; in this case, a connection from the yaw motor to a hidden neuron projecting to the two back wheels, causing oscillations which cancel each other out in the overall motion. This step considerably improved the ability of ball following.9 In summary, in the runs where following evolved, the individual functions of R2 were dependent on the previously existing ones, as follows:
1. Moving forward (requirement: concerted activation of the wheels)
2. Tracking (requirement: moving forward)
3. Modulating speed (requirement: tracking)
The inherent direction of evolution may not be unique, but it is guided by the aspects of the problem that can be progressively exploited. Evolutionary phases appear because the problems present themselves contingently on previous achievements. Convergence in evolution is also about sequences of problems, and how they depend on each other. First one has to develop neurons, then one develops protoeyes. Only then can phototactic behavior appear. To understand these historical contingencies is also to find how invariances appear at so many levels.
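The approach–stop coupling described above, in which the neck (pitch) proprioceptor modulates the wheel velocities, can be caricatured in a few lines; the sign convention, gain, and names are hypothetical, not taken from the evolved networks.

import numpy as np

def forward_speed(pitch_proprioceptor, base_speed=1.0, gain=1.0):
    """Caricature of the evolved approach-stop coupling (hypothetical constants).

    pitch_proprioceptor: neck pitch rescaled to [-1, 1]; assumed negative when the
    head looks down and positive when it looks up. The robot accelerates towards
    the ball when looking down and slows when looking up, so that the distance
    to the ball it follows stays roughly constant.
    """
    return float(np.clip(base_speed - gain * pitch_proprioceptor, 0.0, 2.0 * base_speed))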
9.3.3.5 Outlining Behavioral Function
When we observe the behavior of R2, many subbehaviors can be outlined. That is, an observer can outline particular sets of actions and name them as a behavior. The ontological status of function is largely subjective, and depends on analogies. As human observers of evolutionary robotics simulations, we are all too happy to anthropomorphize the behaviors of artificial agents, and to attribute purpose and function to observed behaviors: “the agent does this because it wants to achieve that.”
9 For a more detailed analysis of the ability to control the wheels with the motor proprioceptors, see Sect. 10.3.3.1.
So, when I say that R2 modulated its velocity depending on the proximity of the ball, I am making a behavioristic statement that implies a functional behavior. The reader may then ask: when are we warranted in identifying particular behaviors of an evolved agent? I can see two ways to answer this question. Firstly, we are always warranted in naming a behavior if we are aware of the analogical and subjective character of that definition. In this case, the degree to which the analogy holds true to its real counterpart is the measure of its quality. Preferably, a behavior has to be repeatable under similar circumstances, or at least fall into distinguishable categories. The second answer is utilitarian. If the behavior is measurable according to some fitness parameter and is useful to distinguish between agents, then its naming is also warranted. But in both cases it has to hold that it is possible to give an explanation of why the behavior is so. This may be tougher than it first looks.
9.4 Discussion: Convergent Evolution and Instinct
Die Gegenüberstellung von Ziel des Subjektes und Plan der Natur überstrebt uns auch der Frage nach dem Instinkt, mit dem niemand etwas Rechtes anfangen kann.
Jakob von Uexküll, Der Sinn des Lebens
– Daddy, what is an instinct?
– An instinct, my dear, is an explanatory principle.
– But what does it explain?
– [...] Anything you want it to explain.
Gregory Bateson, Steps to an Ecology of Mind
9.4.1 Convergent Evolution of Attractor Landscapes
In our evolutionary paradigm we implicitly defined a function by prescribing an environment, a body, a fitness function, and the components of a control structure. Although the particular form of the solution was not assumed (as seen in the diversity of networks produced), the attractor landscape regarding the specified function was similar across agents, and is therefore an invariant of the specified function. This inverse problem of evolution, namely, to prescribe a function and to gather those structures that enable it, has led to an interesting insight. The constancy we do not find in the different evolved structures reappears in the attractor landscapes when they are regarded under a specific light. At the risk of sounding mystical, I proposed that the attractor landscape only exists when it is looked at. Looking at it
means assuming what the function is that it executes, and asking how it executes it, that is, what the mechanism is. The recurrence of certain major features of the attractor landscape induced the suspicion that a level of generality in the implementation of function existed: something akin to an abstract function, implicitly defined by the problem posing (the definition of components and environment). By looking at how function evolved, we also saw how the invariant features of the attractor landscape accompanied the evolution.
9.4.2 Instinct: Convergent Evolution of Functional Behavior
Convergence in the behavioral function of artificial agents bears an analogy to an interesting and deep question: what is an instinct? The quest for instinct has historically and famously befuddled biologists and philosophers alike for a good part of the last four centuries. But owing to cybernetic ways of regarding the problem, we have recently gained much ground. Two ideas are the foremost contributors to the modern depiction of instinct which we can now boast. First, the theory of evolution helped us to grapple with the mechanisms behind affirmative mutations. The second idea, largely due to cybernetic thinking, is the proposal that behavior is a consequence of the coupling between an agent and its environment. The paradigm of evolutionary robotics, understood as a set of opaque thought experiments [2], conjoins these ideas into an experimental framework with illuminating analogies.
9.4.2.1 What Is an Instinct?
Instinct is an organism's set of inherited potential functional behaviors. How does it come about that organisms evolve such sophisticated patterns of behavior and indeed inherit them? I believe that given the ideas of evolution and of cybernetics, the answer is simple, although without these ideas instinct is indeed a “nonstarter.” The short version of an answer could be phrased as follows. A given set of rearrangeable components that interacts with its surroundings will assimilate those functions that confer selective advantage. The appearance of a mode of interaction with selective advantage leads to the joint inheritance of the subserving structures of this mode of interaction (provided, evidently, that the subserving structures are inheritable). In the same way that analogous structures (e.g., wings, eyes, brains) appear along different evolutionary paths, so may analogous behaviors, and so on to instinct. Evolution converges to function when that function can be achieved through reorganization of an organism's substrate in some ways, but not in any way. When guided by selective processes, natural processes undergoing iterated structural modifications converge, be that owing to the appearance of a function with selective advantage, or owing to the increased viability of certain organizations (the acquisition of a functional behavior without viability of the organism obviously does not lead to
convergence). Evolution converges to function because function is a result of the potentialities for change inherent in an organism, and of the modes of interaction that they may afford. These potentialities for change and interaction imply selection, which co-arises with function. Invariances of function take primacy over those of structure. From the perspective of function, it is irrelevant whether the mechanism of behavior is implemented with neurons or transistors. Instinct is the product of the primacy of function in evolution. However, given a set of specific components to be rearranged, the space of solutions is constrained to a subset of the possible structures that enable function. That is perhaps the reason why cephalopods have a neural architecture resembling that of vertebrates, vertical peduncles resembling cerebella and horizontal lobes resembling hippocampi [7]. There were functions along the discovery path of structural rearrangement that were found both by invertebrates and by vertebrates. Both invertebrates and vertebrates found a class of architectures with dynamics allowing functions and niche exploitations. Through that discovery/invention of structure, viability was increased. Invariances in architecture and in dynamics ultimately refer to the discovery of function. It is for that reason that invariances are found both in empirical studies and in simulations based on analogous principles. The function of tracking requires a class of structures that refers to the combined influences of the world, the components, and viability, before it is about structure or dynamics. Instincts appear because evolution leads organisms to converge towards behavioral function. Instinct is only a formidable problem if we forget that function appears when organisms interact with the environment, in the ways that their inherited structure affords.
9.5 Constancy and Variability in Structure and Function
As Peter Godfrey-Smith eloquently wrote, “in highly dynamic environments stereotypical responses lead to mediocre rewards.” In highly dynamical environments, it is unlikely that the same situation will appear twice. Note that the body, from the perspective of the brain, is also a highly dynamical environment. On the other hand, situations will resemble each other, to various degrees, having some features that indicate that a certain mode of behavior is in order. The organism is not required to identify the circumstances precisely; rather, it must be fit to act meaningfully, where meaningfulness is related to its viability constraints. The organism must find the proper mode of behavior within ecological categories of environmental problems. But once it is engaged in a particular behavioral mode, although the individual actions may vary, the overall meaning of the behavior is constant. In behavioral function, the key to robustness is variability that subserves constancy. Diverse sources of variability, alloyed with elements of constancy, allow for the evolution and deployment of meaningful behavior. In our toy problems, sources of both constancy and variability are represented, both from the perspective of behavior and from the
perspective of evolution. I now enumerate and discuss the sources of constancy and variability in the foregoing experiments. The following list stands as the overarching conclusion regarding constancy and variability. These conclusions will also be tied to arguments in Part I.
9.5.1 Aspects of Constancy
1. Behavioral sources of constancy
(a) The set of stimuli. In our case, the set of stimuli is rather small and can be exhausted by all possible interactions between the agent's retina and the ball. Because neither the retina nor the object (the ball) changes, the set of all possible parameterizations, i.e., the stimulus space, is definable. For organisms the set of stimuli is not as clear. However, the ways in which they interact are regimented by the structure of the physical world, which implicitly defines the dimensions of invariance, for example, when an organism goes from A to B. The structure of the world presents itself in terms of how the relations between the organism and the world are connected by smooth paths of movement. Such paths are definable through geometrical transformations such as rotation, translation, and scaling, and are sources of constancy operating on neural structures (Sects. 4.3.1, 8.1.2.1).
(b) Attractors and attractor structure. Attractors are the hallmarks of constancy in dynamical systems. From the perspective of behavior, attractors only acquire functional meaning as they guide transients which move the body, changing the relations between the agent and the environment. So, paradoxically, just as some periodic attractors may underlie variable action, chaotic attractors may in some cases implement constant action. Whatever the case may be, it is correct to say that for any given parameterization, plus a state of the hidden layer, there exists an attractor structure, no matter how complicated, that is invariant. It is unlikely, however, that attractors are reached, given the fast dynamics of the loop through the environment. So, there is always a source of variation in the transients trailing attractors. Nonetheless, an attractor structure from one dynamical system unambiguously defines the momentary tendencies of a metatransient. Organisms, although much more complicated, still retain the certainty that the present state determines the future states (contingent, of course, on the past of interactions). This was seen in Chaps. 5 and 6, and especially in Chap. 7.
(c) Convergence to motors. The convergent activity from a hidden layer to motors signifies that many metatransients lead to equivalent action. The effect is akin to that described in Sect. 5.3, where different spike trains caused similar muscle contractions owing to the antialiasing effects of limb inertia. As we have seen, some attractors are comparable, i.e., have the same meaning, if for the same stimulus their resulting action is equivalent. The effective action of
the motor results from transforming incoming activity (often oscillations) into force, whose effects are smoothed via limb inertia. Therefore, actions produced by different attractors with similar time averages are equivalent. To give an example, a fixed point attractor that maps to a force of 0 N is equivalent, from the perspective of behavior, to a period-2 attractor whose time average is approximately zero (a numerical sketch of this equivalence is given at the end of this list). Convergence to motors is a source of equivalence of incoming spike trains, which achieve constancy in behavior (Sect. 8.4.4.1).
(d) The RNN and its attractor landscape. Because the attractor landscape is constant for one RNN, and consequently for one agent, all the behaviors an agent may deploy must perforce be represented in the attractor landscape. These will vary across, but not within, individuals. The whole action set is represented by an RNN and its associated attractor landscape. In that, the attractor landscape is the overarching invariant of all behaviors of a given agent, in a given context. Certainly, across individuals the attractors and attractor landscape will vary, perhaps widely. But from particular levels one may depict functional attractor landscapes where features of problem solving are found, such as when different projections to motor units lead to the discovery of negative feedback. In our case the functional representation of the attractor landscape is given by the average activity projected on the motors. And it is the attractor landscape at that level that has invariant features relating to functional behavior (Sect. 8.1.1.2).
2. Evolutionary sources of constancy
(a) Physics of the environment and the agent's body. In our simulations the body and the environment respond to the emulated rules of the physical world; therefore, every constant in the model is a source of constancy. These determine the boundary conditions (meant metaphorically) for the function prescribed. Such constants become implicitly represented in the solutions themselves, as the structure of the agents forms around these constraints. The same holds for organisms. The structure of the world is the primary source of constancy in behavior (Sect. 4.3.1), although different organisms will rely on different features of the world to exploit their niches.
(b) Bodily features. These include the agent's construction, in regard to mass distributions (moments of inertia) and joint constraints, both in force and in angle. This category also includes the organization of the sensory systems (the normal vectors of the individual retinal pixels, as well as the distance between sensor rays). In organisms, morphological variation will lead to different sources of exploitation, and is therefore also a cause of variability. In simulation this can also occur, although it would not reinforce the conclusions here.
(c) Neural components. The rearrangements performed by the evolutionary process base themselves on uniform components, whose parameters may be modified but which retain the fundamental properties of the RNN model. So, whatever evolution does will by necessity be based on the potentialities of the individual components. Limitations of the components also constrain
the space of possible solutions. Constancy of the components defines the pool of all networks that can be constructed (Sect. 6.3, Chap. 7). In nature, components are analogous, in that they do have a set of potentialities deriving from their constitution. Components can potentially be arranged in many ways, but not freely. From the discussion in Sect. 6.1.1 about the Hodgkin–Huxley model and how it relates to the real neuron, we know that the potentialities of the units are also dependent on “biological parameter domains,” such as the presence of ion channels, or the length of axons (Sect. 6.2.4). The particular construction of the component changes, whereas the categories of behavior expressible by the components remain essentially the same. Within a short evolutionary time, evolution operates on parameters, rather than on the more fundamental Bauplan.
(d) Behavioral function and fitness. In our evolutionary optimization, good behavior is well defined with a fitness function. For artificial evolution, this is a fundamental source of constancy. Because of an objective fitness function, it is ensured that if there is any solution, it will be in relation to the fitness function previously defined. In natural evolution, viability10 is much less explicitly defined, although the orders in the world are also permeated with constancy. Finding constancy in the environment from which to derive useful distinctions is the basic foundation of survival and viability. As it evolves, an organism implicitly defines, by its ontogeny, the fitness gradients to which it will submit.
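A numerical sketch of the equivalence invoked in item 1(c): a period-2 motor oscillation, once smoothed by limb inertia (modeled here, as an assumption, by a first-order low-pass), produces essentially the same effective force as a fixed point at zero.

import numpy as np

def smoothed_force(motor_outputs, tau=20.0):
    """First-order low-pass as a stand-in for the antialiasing effect of limb inertia."""
    f, trace = 0.0, []
    for m in motor_outputs:
        f += (m - f) / tau
        trace.append(f)
    return np.array(trace)

steps = np.arange(2000)
fixed_point = np.zeros(len(steps))            # fixed point attractor mapping to 0 N
period_two = 0.8 * (-1.0) ** steps            # period-2 attractor oscillating between +0.8 and -0.8

print(np.abs(smoothed_force(fixed_point)).max())   # 0.0
print(np.abs(smoothed_force(period_two)).max())    # < 0.05: behaviorally equivalent to zero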
9.5.2 Sources of Variability
1. Behavioral sources of variability
(a) Paths in parameter space. The space of sensory inputs to the agent stands for a geometrical relationship with the environment. But an instantiation of a path in parameter space is variable, and dependent on the history of interactions. Paths in parameter space (defined in Sect. 8.1.2.2) will diverge with even the smallest difference in execution, or with slight differences in the metatransient. This potentiation of differences along paths in parameter space is a source of variability in behavior. Nevertheless, despite variations in the paths in parameter space, tracking is still robust. Actually, tracking is robust precisely because it is able to cope with different paths in parameter space. This illustrates the duality between differences in action and constancy of behavioral function.
(b) Transients and the metatransient. A parameterized RNN can be seen as an input–output machine, where given an initial condition and a parameter set,
10 See also the footnote on page 185 for the difference between viability and fitness.
the output is an orbit tending to an attractor. Within a dynamic world, it is unlikely that initial conditions will ever precisely repeat themselves. That implies that even orbits subject to identical input parameters (equal input, hence identical dynamical systems) may end in distinct attractors (e.g., coexisting attractors; a minimal numerical illustration is given at the end of this list). Moreover, stimuli (as parameters) may be presented for various durations, meaning that the metatransient may approach its attractor to a greater or lesser extent. Transients, and by extension the metatransient (Sect. 8.1.2.3), will be determined by the historical contingencies, from the perspective of the network's activity.
(c) Higher-period attractors and chaotic attractors. Moreover, the attractor itself may have a long period, or be chaotic. In this case, the attractor has many different states where the metatransient can become entrained. Differences in the initial states of the transient potentiate (also because the consequences of actions from different states vary; see paths in parameter space) and behavior diverges. Behavior in a dynamical environment is determined by fast responses. Complicated networks are likely to have complex attractors and long transients. Variation in the entraining will lead to variations in behavior (Sect. 8.4.6). Nonetheless, it is not necessary that chaotic attractors should underlie variability. The point is that, for behavior, the attractor shape in motor space is more important than the complexity of the attractor itself. So, two attractors projecting similar orbits to motor space are a source of constancy, not of variability.
2. Evolutionary sources of variability
(a) Evolutionary modifications. It is almost superfluous to say that variation underlies evolution. It has to be taken to heart, though, that variation and constancy in evolutionary modifications are complementary. In this experiment (and the ones to follow), the modifications are rather circumscribed, but the space of structural and dynamical possibilities is large.
(b) Contingency in historical modifications. As with the behavior of a nontrivial machine (i.e., an organism in von Foerster's sense), where the history of interactions contingently operates on the future states of the organism, and on its behavior, the evolution of function also carries the contingencies of previous modifications. The evolution of the attractor landscape is permeated by this kind of historical contingency [6]. Some modifications can carry the potential for the distal appearance of function; in other words, modifications made now can enable the appearance of function far in the future, as previously discussed.
(c) Individual instantiations of the attractor landscape. For the same reason, since the individuals carry the history of contingent modifications, so will the individual implementations of attractor landscapes be defined by the past. In various evolutionary runs, small differences in attractor landscapes were potentiated into considerably distinct behaviors. The mode of search when the ball is lost, for example, is performed differently by different individuals.
These individual differences were also propagated to the descendants of a particular individual, and putatively, can also be identified with features of the landscape, just as long as the function is named (Sects. 2.3.13, 9.3.3.5).
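A minimal illustration of coexisting attractors under one and the same parameterization, as invoked in item 1(b) above: a single tanh unit with a strong self-connection is bistable, so two nearby initial states converge to different fixed points. The parameter values are illustrative.

import numpy as np

def iterate(x0, w=3.0, theta=0.0, steps=100):
    """Iterate x <- tanh(w * x + theta): one unit, one fixed parameterization."""
    x = x0
    for _ in range(steps):
        x = np.tanh(w * x + theta)
    return x

# Identical dynamical system, slightly different initial states, distinct attractors:
print(iterate(+0.01))   # converges to the upper fixed point, about +0.995
print(iterate(-0.01))   # converges to the lower fixed point, about -0.995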
9.6 Summary
9.6.1 Convergent Function and Invariants of Behavior
This was a chapter about the evolution of behavioral function, to which agents converged. Behavioral functions were loosely specified by a problem posing involving a fitness function, the agent's morphology, and the environment. Although the behavior achieved by the agents was convergent, the control structures of these agents were not themselves invariant (Sect. 8.5.2). Amid the variability of possible network structures, the behavioral function evolved towards a particular family of attractor landscapes, which displayed constancy amid variability. Only through a particular kind of attractor landscape was an agent able to encounter that function. Evolution converged to a function, and that was represented at a certain level of the agent's dynamics: that of the motor projections of the attractor landscapes of the networks. Convergence appeared because of the potentialities for rearrangement in the agent's morphology and neural components. Given those sources of constancy, the path to the invention of function passed through finding network structures enabling a particular set of dynamics. In that, evolution found the set of equivalent network structures corresponding to the specified behavioral function. Different seeds and different evolutionary runs led to divergent paths of structural modification, but convergence appeared through selection. A shortcoming of this analogy is that, in life, a selection criterion co-arises with behavioral function. In our case the selection criterion was rigidly specified. But despite the rigid specification, the space left for structural variability, small as it was, was spanned by the agents from different evolutionary runs. The similarity was found not in the structures, but in a family of dynamics, seen at the level of motor activity in response to a sensory stimulus. In a sense, the agents all evolved the same instinct, an invariant of tracking/following behavior: potential in the structure, exerted in interaction. In evolution we shall find constancy and variability as (1) divergence is constrained by function, substrate, and selection, whereas (2) convergence is dissolved into equivalences and neutrality. Instinct is the evolutionary product of convergence to functional behaviors.
Dialectical Summary
The appearance of behavioral function is often a punctual event (Sect. 9.2.2).
The optimization of behavioral function is usually a gradual process (Sect. 9.2.2).
Neutral mutations increase the likelihood of discovery of function (Sect. 9.2.3).
Neutral mutations increase robustness against loss of function (Sect. 9.2.3.1).
Discovery of function is contingent on the history of structural modifications (Sect. 9.2.3.1).
Evolution converges primarily to behavioral function. Invariances in structure and in dynamics are necessary by-products of the appearance of function (Sect. 9.4.2).

Summary of Results
The experiments with artificial evolution displayed agents that converged to behavioral function.
Discontinuous fitness jumps appear as novel behavioral function is invented.
The appearance of function correlated with the attractor landscape seen at a particular level.
There are multiple possible organizations of neural structures that will enable similar function.
The class of structures equivalent to a function is visualized through a particular depiction of the attractor landscape.
The class of structures equivalent to a function defines an invariant of behavior.
9.7 Linking Section: Evolution and Modularity
This chapter formally ends at the previous section. This section is a link between this chapter and the next, and therefore I opted to keep it as an addendum. I beg you to read it, though, before you read the next chapter.
9.7.1 Functional Selection of Modular Structures and Dynamics
More broadly, how then does nature select the structures that, in enabling particular dynamics, would help the organism to better itself? If there are (1) many equivalent dynamics with respect to one action and (2) different structures that enable similar dynamics, how then are structures preferentially selected? Are there biases for evolution to prefer some structures over others if the structures would a priori solve the same problems? The answer to that requires the realization that the wide structural space of equivalent function found in my results is a by-product of a rather narrow prescription for toy problems. Implicitly, in my experimental design, I prescribed a narrow behavioral function space. Nature, however, does not prescribe singular functions; it offers a multitude of possibilities. Although there are, at first glance, dynamics potentially equivalent to one function, there are reasons for constraining the space of structures. As an organism evolves towards a multifaceted exploitation of its environment, it must select among those structures that allow for compatible dynamics, that is, for coexisting potentials for function. Newly acquired functions, new modes of behavior, are either compatible with the previously existing functions or they are not. (Incidentally, this is not a case of tertium non datur. It may be that the two functions both change, although this appears less probable.) If the functions are not compatible, the organism must relinquish the previously existing function to assimilate the new one. If they are compatible, the organism may adjoin the new function to its behavioral repertoire, and thereby increase its breadth of behavior. Indirectly, previously existing structures provide a canvas for all subsequent mutations, which may or may not lead to the potential for the acquisition of behavioral function. The space of possible functional dynamics (those that may underlie behaviors) is a consequence of the possible structural alterations that will enable the said function. The evolutionary procedure operates as follows (see the sketch below). Structural changes leading to novel dynamics may, or may not, underlie functional behaviors. If they do, there is (as a rule) a selection advantage. In this way evolution promotes the multiple functional exploitation of a structure. In this case the new modification may be structurally dependent on, or independent of, the existing structures. If the structural changes do not underlie functional behaviors, there are two other possibilities: either the structural change is an alteration without a measurable impact on viability or fitness, in which case it may or may not be kept, or it decreases fitness, in which case, all things being equal, it disappears with its bearer. A summary of the alternative routes by which potential function is acquired by a neural structure is depicted in the flow diagram in Fig. 9.13. The next chapter builds upon this diagram.
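As a compact reading of this procedure (and of Fig. 9.13), the decision can be written as a small function; the threshold and the phrasing of the outcomes are mine, not the author's.

def fate_of_structural_change(delta_fitness, epsilon=1e-6):
    """One reading of the selection logic summarized in Fig. 9.13."""
    if delta_fitness > epsilon:
        return "selected: the novel dynamics subserves a functional behavior"
    if abs(delta_fitness) <= epsilon:
        return "neutral: may or may not be kept (drift)"
    return "deleterious: all things being equal, it disappears with its bearer"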
References
Fig. 9.13 Function can be adjoined modularly or incorporated into previously existing structures. Without modularity, nature would have to create a functioning system in one sweep. With modularity, nature can handle problems one at a time. In that case, evolution produces dedicated structures and localized functions
References
1. Blount Z, Borland C, Lenski R (2008) Inaugural article: Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc Natl Acad Sci 105(23):7899
2. Di Paolo E, Noble J, Bullock S (2000) Simulation models as opaque thought experiments. In: Artificial life VII: Proceedings of the seventh international conference on artificial life. MIT Press, Cambridge, MA, pp 497–506
3. Eldredge N, Gould S (1972) Punctuated equilibria: An alternative to phyletic gradualism. In: Models in paleobiology. Freeman Cooper & Company, San Francisco, CA
4. von Foerster H, von Glasersfeld E (2005) Einführung in den Konstruktivismus, 9th edn. Piper, Munich
5. Gould SJ (1990) Wonderful life: The Burgess Shale and the nature of history. WW Norton & Company, New York
6. Gould SJ, Lewontin RC (1979) The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proc R Soc Lond B 205:581–598
7. Hochner B, Shomrat T, Fiorito G (2006) The octopus: A model for a comparative analysis of the evolution of learning and memory mechanisms. Biol Bull 210(3):308
8. Mahn B (2003) Entwicklung von Neurokontrollern für eine holonome Roboterplattform. Diplomarbeit, Fachhochschule Oldenburg/Ostfriesland/Wilhelmshaven
9. Mayr E (1954) Change of genetic environment and evolution. In: Evolution as a process. Allen and Unwin, London, pp 157–180
10. Mayr E (1976) Evolution and the diversity of life. Harvard University Press, Cambridge, MA
11. Morris SC (2003) Life's solution: Inevitable humans in a lonely universe. Cambridge University Press, Cambridge, UK
12. Reidys C, Stadler P, Schuster P (1997) Generic properties of combinatory maps: Neutral networks of RNA secondary structures. Bull Math Biol 59(2):339–397
13. Schuster P (1997) Landscapes and molecular evolution. Physica D 107(2–4):351–365
14. Simpson G (1953) The Baldwin effect. Evolution 7:110–117
15. Sporns O (2002) Graph theory methods for the analysis of neural connectivity patterns. In: Neuroscience databases: A practical guide, pp 169–183
16. Varela F (1979) Principles of biological autonomy. North Holland, New York
17. Varela F, Maturana H (1987, 1998) The tree of knowledge, 1st edn. Shambhala, Boston, MA
Chapter 10
Neural Communication: Messages Between Modules
Abstract This chapter discusses and exemplifies the role of invariance in modular systems. We ask: What is a module? What is the kind of message exchanged between modules? What is the meaning of noise? What does it mean, in neural terms, to receive a message? The answers to these questions are tightly coupled, and rely on the simple observation that a message is interpreted by a receiver, and is only made meaningful therewith. Activity exchange between communicating brain areas admits a characterization in terms of meaning, which in turn admits a neat characterization in neurodynamics terms.
10.1 Introduction
Why do things have outlines?
Gregory Bateson, Steps to an Ecology of Mind
As the gentle reader may have intuited, modularity is a natural consequence of the views developed thus far. Granted that it is correct to say that (1) at any given time the brain is involved in a dynamic pattern including the whole of it (i.e., the state of the whole brain) and that (2) behavior is ultimately indivisible, it is nevertheless both possible and useful to define modules with associated functions. Modularity is a fundamental problem for the understanding of complex neural systems in organisms, and one that is particularly apt to be addressed within the framework developed thus far. Modules are the structures which are invariant with respect to a specified function, defined in terms of the (in)dependence between components for the functions they execute, individually or in concert. Ultimately, all modules in a brain are dependent on the organism, but they may also be more or less dependent on each other for particular behavioral functions. Therefore, the outlining of modules starts with an identification of function, as in Chap. 8, and with the question of what is invariant with respect to the said function, both anatomically and dynamically. This chapter dwells on epistemological issues in neural modularity. Modules are seen as coupled dynamical systems, within the frame of neurodynamics. Recurrent neural network (RNN) modules produce messages in the form of activity, driven by
parameterizations. What meaning these messages convey depends immanently on (1) the neural structure at their source and (2) their destinations (effectors or other neural structures). Neural messages are ultimately about the behavioral functions they may subserve, and this tenet becomes the primary heuristic for the definition of modules. The chapter then moves on to explore major issues in modularity: (1) heuristics for the identification of modules, (2) types of modularity, (3) the meaning of neural messages and of noise between modules, and (4) modulation through attractor landscapes as modules, illustrated through simulation experiments. Throughout, I will continue to make use of the conceptual entities developed previously, the attractor landscape and the metatransient, which provide powerful metaphors for the comprehension of modular function in neural systems.
10.1.1 Neural Modularization
In the conclusion of the foregoing chapter, I presented a flowchart representing the acquisition of novel function (Fig. 9.13). In summary, if an organism evolves towards multiple behavioral functions, it does so either by modifying a single system to do more or by aggregating a novel subsystem, roughly independent of previous structures. Is it possible to reverse-engineer the products of evolution? From the opposite side we may ask the inverse question: to what extent are brains decomposable into substructures with specific functionality? Some localized lesions, impairing behavioral function selectively (such as the speech anomalies in Broca's aphasia), indicate that structures with specific function may indeed be singled out according to behavioral criteria. This has led to the suggestion that “brain organs” would perform particular and localized functions, which would explain why function can be destroyed selectively. But this often turns out to be overly simplistic. The current belief is that some localization exists (as Fodor [12] would say, “to an interesting extent”), but specificity of the structure–function relationship less so.1 On occasion, a system may be modularized and the overall behavior described as a composition of elemental modular functions. At other times, atomization is undesirable, impossible, or simply wrong. Whether and how behavioral functions may be disassembled into subfunctions/structures is largely a case-by-case judgement. Many heuristics have been proposed, both qualitatively [2, 12, 30, 45] and
1 Unlike engineered systems (such as a refrigerator), sophisticated brains are richly multifunctional, and the extent to which functions depend on each other is not always clear. Even given the striking regularities of the neural systems of arthropods, where some substructures are organized according to clear rules (such as the lamina [6] and the central complex [38]), the univocal assignment of functional roles to modules remains elusive. Even in the case of insects, it is not true that regular structures can be associated with a meaningful functional role.
quantitatively [36, 37, 42] for the attribution and identification of modules. I provide a list of heuristics for the identification of modularity based on anatomical and functional interdependences (Sect. 10.2.3). I then categorize modules in terms of how information flows within and between them: vertical, horizontal, and monolithic (Sect. 10.3). At this stage I introduce results of the neurodynamical analysis of the evolutionary robotics experiments performed.
10.1.2 The Meaning of Neurons, Modules, and Attractor Landscapes
Attractor landscapes stand as explanatory devices supporting a modular conception, as they contextualize, translate, modulate, transmit, receive, or disregard activity dynamics: the neural messages that are inherently about the organism and its world. I propose a semiotic conception of modularity, as messages between modules are about the modules themselves (and ultimately about the organism itself, as it depends on the behavioral functions that depend on these modules). An attractor landscape possesses the dynamical potential to convert incoming messages into useful messages for other modules (e.g., spike trains coming into muscles become motor activity). Incoming messages originate from various sources: the body, the perceptual apparatus, or deeper neural structures. Messages change character as they propagate through neural structures, and meaning appears with the functions modules may have, both in terms of their interfaces and in terms of their internal structures. The control dynamics of R2 and of the tracking head come to illustrate and support these ideas (Sect. 10.3).
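A minimal sketch of the "modules as coupled dynamical systems" picture: two small tanh networks in which the activity of one enters the other as a parameter, i.e., as a message that is recontextualized by the receiver's own recurrent structure. Sizes, weights, and names are illustrative, not taken from the experiments.

import numpy as np

rng = np.random.default_rng(0)

class Module:
    """A small discrete-time tanh RNN; incoming activity acts as a parameterization."""
    def __init__(self, n_units, n_inputs):
        self.W = rng.normal(0.0, 1.0, (n_units, n_units))    # recurrent weights
        self.V = rng.normal(0.0, 1.0, (n_units, n_inputs))   # weights on the incoming message
        self.state = np.zeros(n_units)

    def step(self, message):
        # The effect of the same message depends on this module's own weights and state,
        # not on the sender: meaning is made at the receiver.
        self.state = np.tanh(self.W @ self.state + self.V @ message)
        return self.state

sender = Module(n_units=4, n_inputs=1)
receiver = Module(n_units=3, n_inputs=4)
drive = np.array([0.5])                       # external parameterization of the sender
for _ in range(50):
    message = sender.step(drive)              # activity produced by the sender
    output = receiver.step(message)           # ... reinterpreted by the receiver
print(output)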
10.1.2.1 The Body as a Module

To exemplify the idea that attractors convey signals that become messages in the context of the receiver, I frame the body as a module. In recent years, awareness has been raised of the inextricable role of embodiment in cognition [3, 8, 9, 11, 19, 20, 23, 28, 39, 40, 43]. But despite this important realization, embodiment is often considered a nuisance – the body as a challenge to be surmounted. Sensors are noisy and the world is complex. But the body, rather than a burden, is an integral part of the solution. Through the activity of motors, the inertia of the body functions as an extended memory through position integration, and may mediate, through analogical computation, the production of messages about its own state that would otherwise be unavailable. The body, considered as a module whose meaning is itself, provides implicit information about physics. The body is, moreover, a material contextualizer of incoming dynamical activity, as well as a producer of messages that are about itself.
10.1.2.2 Messages and Noise

The question of constancy and variability takes a particular shape within the discussion of modularity. The neurodynamics conception held here shows that activity entering a module may, or may not, have functional consequences. We may take the latter case as an operative definition of noise: variability without identifiable functional consequences is noise. This conception, of course, relies on a subjective outline of what are "identifiable functional consequences," which may be more or less covariant with neural structures. Incoming activity modulated by a module's attractor landscape may, or may not, resolve a meaningful behavioral function. What distinguishes noise from meaning are the functional consequences of the incoming activity. The aspects of incoming activity with functional consequences are meaningful; those without functional consequences are noise. The existence of functional consequences is largely a function of the attractor landscape of a receiving module. Through an analysis of the geometry of phase space, I provide a novel way to look at variability and noise in neural messages.2
10.1.3 Outline

1. Preliminaries on modularity
2. Anatomical considerations
3. The semiotics of neural systems
4. Heuristics for modularity
5. Types of modules
   (a) Vertical modules and the body loop
   (b) Horizontal modules and expansion of dynamics
   (c) Monolithic networks and dynamical modularity
6. Equivalence, variability, and noise
7. Discussion: neural modeling of modules
10.2 Modularity

Modularity is a powerful simplifying assumption for the study of cognitive systems, as it partitions the big unknown into subcomponents, modular functional black boxes, with arrows in and out. If modularity is the case, the problem of the cognitive scientist would be to define how the function of a given black box is translated into a structure.
2 The seed for this idea sprouted from an informal presentation by Frank Pasemann.
If only life were so simple. But structural modularity is far from uncontroversial. Defining the boundaries of modules may be a formidable enterprise, perforated by epistemological and empirical pitfalls relating to the definition of interfaces in continuous flesh [17, 20, 34]. Nonetheless, there are compelling reasons, from both physiological observations and empirical assessments of function, supporting the belief in modular brains. Anatomical characterizations of the brain produce many outlines, more or less clearly defined [7]. Some distinguishable structures are present across spatial scales, with salient features repeatable in similar areas and across individuals. This supports the intuition that the modular organization is not arbitrary or merely accidental [11, 32]. Features of connectivity also suggest modularity: connectivity hubs [18] and insulated nuclei (think of the thalamus or cerebellum) afford a conception of the brain as a collection of "organs," to which function may be attributed.3 The plot thickens when the focus is shifted from anatomy to dynamics. Even though there appears to be architectonic segregation, at any given time most of the brain is thriving with nonstop activity. So the opposing view argues that, since all of the brain is involved in a global dynamical pattern, its decomposition is more likely to confuse than to enlighten, producing artifacts of our bias as scientific observers [15, 20, 31]. The arguments for this view are compelling. For one, brain connectivity has small-world features; some estimates say that the distance between any neuron and any other is merely three hops. Considering that the transmission of an action potential is an extremely fast event (milliseconds), whereas an action usually takes longer (hundredths or tenths of a second), the challenge of localizing independent modules may seem formidable, as modules may be dynamically engaged and context dependent, and their relation to anatomy may be elusive (or, in the extreme, nonexistent). The question is how to tell modules apart if all the brain is simultaneously active and all neurons are causally close. Here I adopt the view that modules should be defined functionally (localized computation) and dynamically (organized via inputs). This existence is reflected in structure, albeit with the contention that there may be different ways to implement modules with similar function, as described in Sect. 8.5.2. It is nonetheless crucial to realize that the functions of brain organs (if they exist) are never independent of the organism, and can only be interpreted in the organism's terms. Importantly, the interpretation frame of modular function must include physical, behavioral, ecological, evolutionary, and developmental considerations. Next, I distinguish three different types of modularity in terms of how modules interact, within and across levels (Sect. 10.3). Essential for my views is a semiotic notion of exchanged messages, where modules are receivers and transmitters of messages whose meaning is created in the exchange (Sect. 10.2.1). This implies that a message sent and a message received need not take the same form or content, as messages are recontextualized in terms of the receiver's dynamical substrate (in our case, the attractor landscape).
3 Quite appropriately, organon in Greek means "instrument" or "tool."
Messages are about where they come from and where they land. The brain is thus regarded as a hermeneutic machine [14], as the meaning of messages between modules is interpretable only in reference to the organism and its problems (thus making at least some form of weak teleological statement unavoidable). Attractor landscapes are conceptual shorthand for the identification of behavioral function with potential dynamics; metaphorically, attractor landscapes are bearers of potential meaning.
10.2.1 Functional Semiotics of Neurons and Networks

10.2.1.1 The Meaning of Neurons

The structure of a module defines all possible dynamics available to it. However, the dynamics of a module only acquire meaning in terms of its context, the messages it receives and produces. The dynamics of modules are more like a preposition ("of") than like a noun. The following exemplifies the claim. In the zebrafish, a single neuron, the Mauthner cell, has been shown to single-handedly control escape behavior [41]. It unites information from many low-level sensors (mechanoreceptors of the lateral line), and decides when, and to which side, to escape. Similarly, in the locust, an interneuron connecting the lamina and the "cerebrum" conveys information about the orientation of sky polarization [33]. In these examples, the single neuron acts as a module which handles a specific kind of environmental information, which is further relayed as behaviorally valuable information. The neuron and its activity are about the relationship between the world and the organism. The activity of the Mauthner cell is about pressure waves in the water signifying escape, therewith acquiring behavioral significance.4 Similarly, the cell of the locust signifies the time of day. The single neuron in the cases above exhibits aboutness. Where the neuron is – what it interfaces with – defines what it is about. Being an all-purpose component, the neuron signifies its inputs and outputs; the meaning of its activity only exists in context. It may be about the environment or about the entire organism. The potential for meaning of a neuron (or network) is the potential for distinction, which is translated into dynamic activity. How this dynamic activity is parsed in terms of the neuron's potentialities depends on the neuron's biochemistry, its structure, and also on its relative location within networks. Similarly, motor (proprioceptor) neurons arriving at (leaving) a muscle are about that muscle and its behavioral relevance. Therefore, the comprehension of the function of a module, be it a neuron or a network, requires the contextualization of the module's intakes and outputs with respect to the organism's endowments and circumstances – its potentialities for behavior.
4 How a neuron or a network conveys meaning is the consequence of its evolutionary path towards viability.
Information transduced from receptors is not only about features of the stimulus, but also about their arrangement in the sensory sheet, and further, about how the transduction is carried out (e.g., light – photosensors, or sound – mechanoreceptors of the cochlea). Abstractly speaking, neuronal activity is simultaneously about ecology, physics, development, and evolution. It is about the world, the body, and the properties of the neuron itself. Neuronal activity is a common currency in the neural system, flexible enough to convey any message, but the message is only given meaning in a behavioral context.
10.2.2 Anatomical Modularity

The brain of mammals invites modular analysis, while resisting a simplistic modular conception. On the basis of macroscopic anatomical features, some "organs" stand out, such as the cerebellum, the amygdala, or the hippocampus. To the naked eye, many of these organs display peculiar textures, indicative of particular wiring patterns – the cerebellum is a canonical example [24]. Neuroarchitectonic patterning is also represented at smaller scales, such as the thalamic and cerebellar nuclei, which appear as little islands of gray matter insulated by white matter. It seems unlikely that the structural segregation between nuclei is wholly arbitrary; more likely, it serves independent functions. Moreover, peculiar architectonic details particular to specific regions5 have led some to suggest that the brain has standard circuits (such as the cortical column, suggested by Mountcastle [32]). This suggestion, however, is far from uncontroversial. For even if there is a columnar organization in the barrel cortex of the mouse, or in the visual cortex of the monkey, many other structures are not distinguishable by conspicuous features of anatomy (most of the cortex, according to Braitenberg [7], with the exception of some specialized cortices). Some theorists have indeed questioned the existence of the cortical column as a meaningful subunit (Jim Bower, personal communication), pointing out that uniform distribution of synaptic connectivity is the rule, whereas modules are the exception.6 Quantification of the distribution of neuronal connectivity has been shown to have stochastic features, e.g., the number of connections is a smooth function of the distance between units [5, 7], indicating that microscopically the modular story is less defensible.
5 The arrangement of cells in cortical layers, connectivity density and average size of dendritic arbors, types of connections, thickness of layers, etc.
6 But even if the distribution of synapses appears uniform in quantifications, this does not preclude the existence of specified point-to-point connections. For example, the quantification of certain features of a fractal may have a uniform distribution despite precise structural details. Such fine structure in the case of neural systems may indeed have functional consequences. That is to say that not all uniform distributions are equivalent, for the functional consequences of two systems with equivalent quantifications may be very different indeed.
Uniformity appears in variables such as connectivity, axonal length, dendritic arbors, cell density, cell types, neurotransmitter types, and of course, efferent and afferent projections. Many of these variables seem to follow smooth distributions, suggesting that whatever partitioning there may be in the mammalian cortex, the borders are likely to be blurry. So, we need criteria to dispel the fog surrounding modules. Because there is no unique way to modularize a brain, the route passes through a set of heuristics that help to outline modules. In these heuristics I emphasize function and its associated neural invariants.
10.2.3 Heuristics for Modularity

All in all, modules are defined with respect to a set of complementary heuristics from anatomy, dynamics, and function. A module can be defined when one or many of the following heuristics hold:

Function. Is it possible to attribute a specific function to a substructure? One way to outline a module is to identify a particular function relating to that structure. Functional magnetic resonance imaging is the canonical method for this test. Clearly, this heuristic falters when a substructure appears to be involved in many different functions. Such is the case with high-connectivity hubs such as the cingulum, the thalamus, the hippocampus, or the cerebellum.

Localized lesions. What function is impaired by a selective lesion of a structure? If specific functions can be impaired selectively, then it is likely that functional modularization is present. Albeit not free from criticism, lesion experiments may help to outline functional independence, localization, and modularity.

Connectivity. Considering the connectivity within a given outlined volume of brain matter, if the connectivity within the outline is greater than that to the outside, then the outline identifies a structural module (a minimal numeric sketch of this heuristic follows at the end of this section). Betweenness centrality is one measure used to quantify how central a unit is relative to other units; it is an important anatomical heuristic for the identification of hubs, although hubs may or may not be modules in a functional sense.7

Temporal contiguity. Closely associated with connectivity is temporal contiguity. If the time it takes for activity to travel between neurons within a module is less than the time it takes to arrive from outside, then the neurons may together form a module. The longer the traveling time of signals, the more likely it is that the activity receives other contributions and diffuse influences, becoming less modular.
7 Objective measures of modularity based on network connectivity can be ancillary to the identification of putative modules and of the dynamic interactions that may lead to function. Recent work in the field by Sporns [35], Leicht [29], and others has furnished quantitative neuroscience with a wide assortment of connectivity and modularity measures, which are becoming standard metrics for brain connectivity.
Homogeneous input. Does the putative module receive inputs from similar sources? Owing to neuronal plasticity, coherent and structured inputs are likely to involve a module in a holistic activity pattern. For instance, somatosensory areas receiving inputs from contiguous patches of skin may form distinguishable modules, which may in turn project to their peers in motor areas. Feature maps (somatosensory or otherwise) indicate modularity.

Homogeneous output. Does the putative module project to similar receivers? If the output projects to similar recipients, then for the message to be unambiguous, it requires a degree of coherence. Outputs to muscles must be coherent, and therefore muscles may be identified with modules. As a gesture involves many muscles in concert, one module may also be about a group of muscles with respect to that particular gesture.

Synchrony. Synchrony may be an indication of dynamically organized modules, which may be neither contiguous nor homogeneous, but integrated in holistic dynamical patterns [13, 44]. This important idea finds support in the precise timings with which neurons may transmit action potentials (two models based on the idea have been called "polychronization" by Izhikevich [25] and "synfire chains" by Abeles [1]). Oscillations in brain activity may also become synchronized, lending support to the dynamical organization of modules.

Of these, I would like to underscore the homogeneity of input and output as heuristics for modular function. In the face of the discussion on the semiotics of neurons and modules, I believe them to be essential. As behavioral function relies on messages originating from different systems, modularity is also about the meaning of putative neural messages in the context of the receiving side. Note, moreover, that homogeneity of inputs and outputs also implies hierarchies. The identification of a module from any of these heuristics may be tortuous. For example, the same neuron within a module may participate in different functions at different times. That is another argument for dynamically organized modularity. Invariants of function exist only in the context of the function itself; likewise for modules.
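As a rough illustration of the connectivity heuristic above, the following sketch compares within-outline and across-outline connection density for a candidate partition of a toy network. All values are illustrative assumptions (a random adjacency matrix, not measured anatomy); graph-theoretic measures such as betweenness centrality or the modularity indices of Sporns and Leicht would serve the same purpose on real connectivity data.

```python
import numpy as np

# Toy adjacency matrix: two densely wired blocks over a sparse background.
# Values are illustrative assumptions, not anatomical data.
rng = np.random.default_rng(0)
n = 12
A = (rng.random((n, n)) < 0.1).astype(float)           # sparse background connectivity
A[:6, :6] = (rng.random((6, 6)) < 0.6).astype(float)   # densely wired candidate module 1
A[6:, 6:] = (rng.random((6, 6)) < 0.6).astype(float)   # densely wired candidate module 2
np.fill_diagonal(A, 0)                                 # no self-connections

def density(block):
    """Fraction of possible connections that are present in a block."""
    return block.sum() / block.size

outline = np.arange(n) < 6                             # candidate outline: first six units
within = density(A[np.ix_(outline, outline)])
across = density(A[np.ix_(outline, ~outline)])
print(f"within-outline density: {within:.2f}, across-outline density: {across:.2f}")
# If the within-outline density is much larger than the across-outline density,
# the outline is a plausible structural module -- a heuristic, not a proof.
```

As the text stresses, such a purely structural check says nothing about whether the outlined units share a behavioral function; it is one heuristic among the several listed above.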
10.3 Types of Modularity

One way to typify modularity is according to the structure–function dependence between modules. On that basis, we distinguish three categories of modularity: vertical, horizontal, and monolithic (no anatomical modularity). Many behavioral functions appear as modules interact. A vertical interaction between modules happens exclusively at the body level and through the environment. Horizontal interaction happens when modules are also interconnected laterally, at the level of neurons. Monolithic modules do not allow any obvious anatomical decomposition, but they may still be dynamically modular, depending on how inputs and outputs exploit the attractor landscape. In the following, I describe and illustrate these types of modularity from the perspective of neurodynamics.
10.3.1 Vertical Modularity

If the function subserved by the module depends only on the module's inputs from the environment, we talk about vertical modularity. Vertical modules are concerned with the behavioral effects brought about by a single source of information. In mammalian brains, where integration is great, vertical modules are rare. But in the neuronal systems of insects, reptiles, and some fish, examples of vertical modularity are profuse. Sjölander recounted a compelling story about the modular composition of a snake catching prey (described in [16]). Prey-catching behavior of a snake is a smooth composition of striking, following, and swallowing the prey, head first. The snake strikes the prey, follows it to the place where the effect of the poison overcomes it, then finds the prey's head and swallows it. Although the behaviors are organized in a smooth sequence, a closer look at the prey-catching behavior shows that this is not a centrally coordinated act. In a snake, the behavior ⟨striking prey⟩ is governed by the eyes (or heat-sensitive organs), whereas ⟨following struck prey⟩ is governed by smell (only) and, finally, ⟨swallowing prey⟩ is governed purely by tactile feedback. In this case the behavior of the snake is a sum of independently controlled subsystems for (1) striking, (2) following, and (3) swallowing – each operating within its own boundaries. In the snake, every component of the prey-catching behavior is dependent on the modality that triggers it. Apparently, the information from the different modalities is encapsulated, i.e., not shared by a central coordinating system. As with the components of a refrigerator, functional behavior emerges meaningfully only with all components operating in concert, at the level of the organism. The main features of vertical modularity, following Menzel [30] (who follows Coltheart, who himself follows Fodor [12]), are:

- Innately specified or tuned to selective and highly prepared forms of simple learning.
- Informationally encapsulated: restricted to a particular sensorimotor task; no horizontal connections between processing modules; vertical processing is mandatory.
- Automatic; that is, fast both with respect to their specific sensorimotor connections and their restricted forms of learning.
Examples of Vertical Modularity

- Phototactic open-sky reaction
- Path integration from the visual flow field experienced during flight
- Distance and direction of indicated site
- Sun compass, relationship of sun azimuth and time of the day as a guide in navigation
- Detection of polarized light pattern in the sky
10.3.1.1 Neurodynamics of Vertical Modularity in R2

Inputs to Modules as Modulation of Attractor Landscapes

Activity coming into a module with a particular attractor landscape parameterizes the internal activity of the module. The parameterization resolves one particular dynamical system, with its associated attractors. When the parameterization is itself a function of some interface, such as the body or a sensory sheet (e.g., retina, proprioceptors), it carries the structure of the originating interfaces in itself. Vertical modules rely on the structure of the input to produce appropriate behavior.

Vertical Modules and Embodied Computation in R2

Vertical modularity is demonstrated through an example with the simulated agent R2, where activity coming into a module is transduced through the body and the environment. The experiment is the same as described in Sect. 9.3.3.3. Recall that R2 has five motor units, two controlling the yaw and pitch motors of the neck and three controlling the rotation of the wheels. In the vertical modularity condition, the yaw and pitch motors receive information exclusively from the retinal sensors, whereas the wheels receive information exclusively from the neck proprioceptors. The modules are thus "informationally encapsulated" (see Fig. 10.2). Proprioception is an informative source that is tightly coupled to the dynamics of the body. It is a reliable informant of the postural consequences of a motor action. It works as a filter for the fast dynamics controlling the neck, and projects through the body loop to the wheels, which orientate the agent towards its targets, the bouncing balls.

Modular Evolution

The structures of the two modules were evolved independently. Connections between modules were prevented. The neck-controlling motors only evolved weights, whereas the wheel control was free to evolve structure. The mutation rate was kept low, which was found to be the most reliable manner to evolve functional agents. The agent depicted here is the best agent from the modular evolution runs. The description of its behavior can be found in Sect. 9.3.3.4. All generations of the evolutionary runs are also available in the online resources, at www.irp.oist.jp/mnegrello/home.html.

Vertical Modules

The network controlling the wheels did not receive information from the distance sensor array, but only through the body loop (3 in Fig. 10.1).
Fig. 10.1 In vertical modularity, individual subsystems operate on specific inputs (1 and 2), generating independent outputs (functions F1 and F2). Different subsystems interact through the consequences of their functions, via the environmental loop E or the body loop 3
Fig. 10.2 (a) The retinal module and (b) the wheel module. Colors of synapses indicate positive or negative weights. Colors of units indicate the momentary activity of the unit
The information from the motor units that move the neck was available indirectly, through the dynamics of the body itself, through the proprioceptive sensors mounted on the neck. So the module only shares the bodily consequences of the activity of the neck motors. The wheel module does not receive information directly from any internal units, but from the yaw and pitch proprioception sensors, closing the loop through the body. These represent the postural consequences of the motor activities of the neck. The activity coming in from the sensors is slower than the activity of the module, so it is taken in as parameters for the wheel module (Fig. 10.2).
Wheel Control

The strategy developed by the agent relied on the south wheel for the control of the turning direction, whereas the antisymmetric activation of the northeast and northwest wheels caused forward movement. The yaw sensor was primarily responsible for the control of the south wheel (see Fig. 9.11). If the average activity of the south wheel is approximately zero, the agent moves forward. Clockwise and counterclockwise averages signify rotation to the left or right.
Modulation of the Attractor Landscape

To see how the south wheel responds to the yaw sensor modulation, we plot the one-dimensional attractor landscape of the south wheel module (south wheel unit plus hidden unit). This amounts to plotting the bifurcation diagram, where the parameter ranges over the yaw sensor output multiplied by the weight. This is found in Fig. 10.3. The attractor landscape has three regions: a period 2 region surrounded by fixed points of opposite signs. The fixed points account for turning directions (the robot turns right for positive input and left for negative input). Most of the oscillatory region averages close to zero, signifying forward movement. Evolution maximized the region where the dynamics of the south wheel average close to zero by increasing the weight of the connection from the yaw sensor (Fig. 10.2). That amplifies the signal from the yaw sensor, which is input to a two-neuron submodule close to the south wheel. The asymmetric connections of this submodule cause it to produce period 2 oscillations. Therefore, the agent moves forward for most yaw angles of the tracking head. This is also an instance of attractor morphing: as the parameter given by the yaw sensor changes, the attractor smoothly morphs from a fixed point, to oscillations, to a fixed point.
Fig. 10.3 The attractor landscape of the south wheel module. In (a) the south wheel is blue and the hidden unit is red. In (b) dark red is the average activity of the south wheel. For both, the parameter is the range of the yaw sensor multiplied by the incoming weight (5.28). The attractor landscape has three regions: a period 2 region surrounded by fixed points of opposite signs (see the text for explanation)
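The bifurcation diagram itself is straightforward to compute for a small discrete-time neuron model. The sketch below is a minimal, hypothetical stand-in for the south wheel submodule: a single tanh unit with strong self-inhibition, swept over a slow input parameter. The weights are illustrative assumptions, not the evolved R2 controller, but the qualitative picture matches the one described above: a period 2 region with near-zero average (forward movement), flanked by fixed points of opposite signs (turning).

```python
import numpy as np

# Minimal sketch: attractor landscape (bifurcation diagram) over one input parameter
# for a discrete-time tanh neuron with strong self-inhibition. Illustrative weights only.
w_self = -3.0  # strong self-inhibition yields a period 2 regime around theta = 0

def asymptotic_states(theta, steps=600, discard=500):
    """Iterate the map and keep only the asymptotic states for parameter theta."""
    x = 0.0
    states = []
    for t in range(steps):
        x = np.tanh(w_self * x + theta)   # theta stands in for yaw sensor times weight
        if t >= discard:
            states.append(x)
    return np.array(states)

for theta in np.linspace(-4.0, 4.0, 9):
    states = asymptotic_states(theta)
    attractor = sorted({round(float(s), 3) for s in states})   # distinct visited states
    print(f"theta={theta:+.1f}  attractor={attractor}  mean={states.mean():+.3f}")
```

For parameter values near zero, the printout shows two alternating states with mean near zero (the oscillatory, forward-moving regime); for the largest magnitudes of the parameter, a single fixed point of the corresponding sign remains. Increasing the incoming weight, as evolution did, stretches the sensor range that falls into the neutral oscillatory region.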
Fig. 10.4 Metatransients of control of the south wheel as the yaw sensor modulates the rotation velocity
In Fig. 10.4 we see an instance of the control strategy from a given trial; the metatransient is depicted. The direction the robot turns is a function of the incoming parameter, i.e., the yaw sensor. As mentioned, the yaw sensor activity is a consequence of the actions coming into the neck motors, and of how the inertia of the head parses the incoming activity from the retinal sensors. The pitch sensor acts by modulating the northwest wheel. When the agent is closer to the ball, the head points higher. When that happens, activity of the pitch sensor modulates the activity of the northwest wheel, causing it to momentarily reduce its velocity. This causes a slight rotation in one direction. This leads to the ball being less centered, which in turn leads to a wider movement of the neck in the opposite direction (yaw). This effects a modulation of the south wheel, which causes the body to orientate itself towards the ball. This is an instance of feedback control through the environment, akin to that described in Sect. 8.4.5.2. A primary mechanism of control is the modulation of the attractor landscape, which evolved to maximize the parameter domain where the south wheel was neutral to variations of the yaw sensor activity. The region where the motor output of the south wheel has a saturated period 2 oscillation is a region where neighboring inputs are equivalent: small variations of the position of the neck do not result in a change of direction, while wider variations lead to changes in rotation. From a neural perspective, vertical modularity is the simplest case. It is also fast with respect to its input. Furthermore, it is usually simpler to attribute specific functionality to vertical modules, as they are easier to outline anatomically. Vertical modules are the foundation blocks upon which multifunctionality is built. By connecting modules internally (horizontally), new classes of dynamics arise, as the activity of neurons is much faster than that of parameters.
10.3.1.2 Horizontal Modularity

When two neural modules with distinct functionality interact to promote a third function that could not exist without the interaction, this is an instance of horizontal modularity. Lateral interaction between modules enables dynamics that were previously unavailable.
Fig. 10.5 In horizontal modularity, two modules communicate at the neuronal level (represented by the touching internal loops), producing a combined function that cannot be produced unless this connection exists
Each individual function is realized by the attractor landscape of a single module; the interaction between the two resolves a bigger module. Each module can still deploy its functions individually, but the lateral connection is a requirement for the third function, as depicted in Fig. 10.5. The integrative possibilities of horizontal modularity are clear. In the example of the snake earlier, if modules could interact horizontally, the snake could deploy behaviors with interchangeable sensory organs. Moreover, horizontal interaction between modules has the potential to be much faster than vertical interaction, because the loop is mediated by fast neuronal activity (not the body or the environment, which are usually slower). The adjoining of modules begets dynamics previously unavailable. The activity from a sending module carries features of the dynamical processes of that module (given by its structure and associated attractor landscape). Thus, integration between the activities of different modules may enable further behavioral function requiring fast integration. As examples of horizontal integration, one may enumerate functions that require higher integration between stimuli. Examples are perceptual categorization, cross-modal abstractions and associations, contextual associations, and the like (note that these abilities are found in birds, e.g., pigeons categorizing pictures, in fish, and even in insects). Another example is the control of movement that does not strictly require feedback, such as reference signals (efferent copies or forward models [24, 46]), where ex hypothesi control and comparison are executed at different substrates. Some examples of horizontal modularity are given below.
Examples of Horizontal Modularity

- Value-based perceptual categorization (as in Herrnstein's law of effect [21])
- Cross-modal operant conditioning (Pavlov's dog)
- Cross-modal navigation in insects (e.g., using the sun's zenith and optic flow)
- Perceptual filling in (interpolation, such as the blind spot)
- Cross-modal associative learning and generalization
Horizontal Modularity and Artificial Evolution

In my experience with artificial evolution, no convincing examples of horizontal modularity have arisen. This is presumably due to multiple factors resulting from the characteristics of the experiments. For one, the dynamics of small modules are very sensitive to the addition of synapses, as has already been observed by Hülse [22]. In my attempts to sum vertical modules to build up lateral interactions between the functions of component modules, what appeared instead was a tendency to render monolithic modules, such as the one in Fig. 10.6. In monolithic networks, modules are difficult to tell apart, and usually no further functionality results. A second factor preventing the appearance of horizontal modules is the nature of the problem, where the desirable integration does not require fast-communicating modules. The body loop and the environmental loop are preponderant over internal loops. In the evolutions, the networks were likely to use the information from the yaw and pitch sensors similarly to the vertical modularity case. Nevertheless, despite not being observed in these experiments, natural evolution is likely to encounter and affirm horizontal modular integration, provided it increases viability by endowing the organism with behavioral function. Integration is more likely to happen if activity from different sources can be reunited to produce useful meaning. For example, it is useful to recognize a predator both from the crackling sound it makes and from the image on the retina. Any precursor of the predator's presence is informative about a reaction to the threat, and internal integration speeds up decisions. In none of my experiments did I have conditions that would require a lateral interaction of component modules. Nevertheless, one could conceive of experiments that would demand more refined distinctions between stimuli, where modular interaction would be necessary for the solutions. One alternative could be an implementation requiring the agent to tell two types of balls apart, approaching one type while avoiding the other. All the same, this experiment has been performed in a simpler version by Beer (reviewed in [4]), without horizontal modules. Maybe an experiment where association is required would be a better bet. Horizontal integration between modules is not trivial, and the lack of horizontal integration between cognitive functions of snakes (and other reptiles and amphibians) attests to this fact.
10.3.2 Monolithic Modules and Dynamic Modularity

According to our heuristics, networks with no clear functional decomposition, owing to a large number of internal reciprocal connections between units, are monolithic. However, the activity of a monolithic artificial recurrent neural network with no obvious anatomical decomposition may still be modular in a dynamical sense. The idea is that projections of the phase space may read out various aspects of the high-dimensional activity. I refer to the results in Sect. 8.4.5.2, where the tracking agent with a monolithic network had the two motor neurons reading different aspects of the high-dimensional state vector, namely, the yaw and pitch motors. Neurons may read out only those facets of the metatransient that concern them, parsing the complex activity of high-dimensional phase spaces into different aspects of action. Consider a single artificial neuron in the efferent stream of a module that receives projections from neurons in a monolithic RNN. The weight vector representing the contributions from the neurons of the monolithic network defines a hyperplane in phase space, and the activity of the receiving neuron amounts to the projection of the states of the orbit onto this plane.8 This is similar to the solutions found in the tracking and Sinai experiments described in Chap. 8. A module receiving inputs from various sources takes in activity relevant for its own function. A motor unit receiving inputs from a wide variety of units selects those dynamics with useful functional consequences. All the example networks evolved in Chaps. 8 and 9 were monolithic: there are no perfect modular decompositions, as units play distinct roles from the perspective of receiving units – a case of unit aboutness. Units that sample the activity of a recurrent neural network and extract information useful for functional behavior are denominated "readout units." They are analogous to those in the literature on dynamical reservoir networks, in which highly recurrent neural networks with complicated dynamics are mapped down by a linear combination to readout units [26]. In a monolithic network, functional decomposition involves redundancy. For instance, when an algorithm for finding modules was applied to the Sinai networks, two modules were found. Each module corresponds to one output unit. But these modules are not independent, and the units of one module contribute to the function of the other. In Fig. 10.6, one sees the output of an algorithm to define modules, from [35]. The algorithm finds two modules, each containing one motor and one sensor, and a subset of hidden units and retinal units. But if these modules were to be taken apart, the function of tracking would be wholly impaired. This shows that algorithmic quantifications of modularity are not sufficient to determine functional modules. To be fair, I applied the algorithm to a network likely to be monolithic. Nevertheless, this shortcoming exemplifies how algorithms that do not take functionality into account may not be informative about functional modularity. Therefore, to understand functional mechanisms behind modules, one needs functional outlining as well, notwithstanding the pitfalls of subjectivity that functional outlining entails (see Sect. 9.3.3.5).
8 This leads to an interesting take on results from electrophysiology. When one introduces electrodes in neural matter and tries to correlate the activity of a neuron with a particular function, one may find "silent" or "uncorrelated" neurons.
Fig. 10.6 Modules outlined by a modularity finding algorithm applied to a Sinai network. Red and blue distinguish units belonging to module 1 or module 2. Although the algorithm finds two modules, their functional independence is not well defined. Units 1 and 11 are proprioceptors, units 12 and 13 are motors, units 2–10 are retinal sensors, and the remaining units are hidden units
In monolithic modules, projections of selected dimensions of high-dimensional spaces lead to dynamical modularity, because messages received by different targets may be very different, both in terms of the dimensions of projection and in terms of the receiving attractor landscape.
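As a toy illustration of dynamical modularity in a monolithic network, the sketch below drives a small random recurrent network with a slowly varying input and reads the same orbit out through two different weight vectors, hypothetical stand-ins for the yaw and pitch motor units. Everything here is an assumption for illustration (random weights, a sinusoidal drive), not the evolved Sinai network; the point is only that two readouts project the same high-dimensional metatransient onto different hyperplanes and therefore receive different messages.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20
W = 1.5 * rng.standard_normal((N, N)) / np.sqrt(N)   # monolithic recurrent weights (illustrative)
b = rng.standard_normal(N)                            # direction of a slow external drive

# Run the network and collect its orbit (a proxy for the metatransient).
x = np.zeros(N)
orbit = []
for t in range(600):
    x = np.tanh(W @ x + 0.5 * np.sin(0.05 * t) * b)   # slowly varying input keeps the orbit moving
    orbit.append(x.copy())
orbit = np.array(orbit[200:])                         # discard the initial transient

# Two hypothetical readout units: same orbit, different projection hyperplanes.
w_yaw = rng.standard_normal(N)
w_pitch = rng.standard_normal(N)
yaw_series = np.tanh(orbit @ w_yaw)
pitch_series = np.tanh(orbit @ w_pitch)

corr = np.corrcoef(yaw_series, pitch_series)[0, 1]
print(f"readout std: yaw={yaw_series.std():.3f}, pitch={pitch_series.std():.3f}, correlation={corr:.3f}")
# Each readout extracts a different facet of the same monolithic dynamics.
```

The two readout series are generally neither identical nor independent, which is precisely the redundancy noted above for the modules found algorithmically in the Sinai networks.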
10.4 Equivalence, Variability, and Noise

Noise is that which is not a signal, that which does not lead to functional consequences. Whether something is noise or signal depends on what the receiver does with the incoming activity. Two different signals distinguishable through quantifications (e.g., of information content) may not be distinguishable by a receiving unit. Neurodynamics affords an intuitive way to picture the equivalence of incoming signals, and therefore also to co-define the space of variability that is perceived as noise. The arguments to follow rely on intuitions about the geometry of phase space.
10.4.1 The Geometry of Dynamical Modularity

A state of an RNN is given by an activity vector at a given moment. This activity vector is a point in phase space. The weights of the connections to a unit receiving input from a subset of units within a network define a weight vector (and, by orthogonality, a family of hyperplanes). The activity of the receiving unit can be seen as the inner product of the activity vector of the subset of units and the weight vector of the input connections; that is, the activity of the receiving unit is determined by a projection of the activity vector onto the weight vector. By the geometrical nature of the inner product, an equivalence class of incoming activity is defined: for any point on the weight vector, the orthogonal direction defines a class of activity vectors that project to that same point. To illustrate the idea, the situation in two dimensions is depicted in Fig. 10.7. A unit u3 receives a linear combination from units u1 and u2, with weights w31 and w21. Units u1 and u2 can also be part of very large networks, but stand as interfaces to u3. The activity of the receiving unit is the length of the segment between a point lying on the weight vector and the origin. So, for any point on the weight vector, an orthogonal direction can be defined along which points in the phase space u1 u2 map to the same incoming activity. These points are equivalent from the perspective of the receiving unit. All combinations projected to the same point are equivalent, and all the putative differences amount, from the perspective of the receiving unit, to noise. In the figure, two equivalent periodic orbits are depicted; a receiving unit would act indistinguishably in response to these incoming orbits. The conclusion carries over to higher dimensions, as orthogonality can be defined there as well. In three dimensions, the equivalence between points in phase space is given by a two-dimensional plane orthogonal to the weight vector; any point in the plane will project to the same point on the weight vector. Similarly, in four dimensions, an orthogonal three-dimensional hyperplane can be defined projecting to a point, and so forth. Thus, a downstream projection of a subset of an attractor's dimensions defines the equivalence between different attractors, in terms of the receiving units. The geometrical intuition is that if the receiving module is to receive a shape (say, a two-dimensional circle), there are many ways for attractors of high dimensions to produce it. And conversely, a high-dimensional shape can project radically distinct shapes (e.g., a four-dimensional solid can simultaneously project a circle and a square). Moreover, the "projection screen," the attractor landscape of the receiving module, is not necessarily a plane (perhaps a spheroid), and can amplify and reveal details of the incoming activity.
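The equivalence class is easy to verify numerically. In the minimal sketch below (hypothetical weights, not those of any evolved network), two different source states that differ only along the direction orthogonal to the weight vector produce exactly the same drive to the receiving unit; from its perspective, their difference is noise.

```python
import numpy as np

# Two source units u1, u2 project to a receiving unit u3 (illustrative weights).
w = np.array([0.8, -0.5])              # weight vector of the connections into u3
w_orth = np.array([w[1], -w[0]])       # direction orthogonal to the weight vector

state_a = np.array([0.3, 0.7])         # one (u1, u2) activity vector
state_b = state_a + 1.2 * w_orth       # the same vector shifted along the orthogonal direction

for s in (state_a, state_b):
    drive = float(np.dot(w, s))        # net input to u3: inner product with the weight vector
    print(f"state={s}, drive to u3={drive:.4f}, activation={np.tanh(drive):.4f}")
# Both states yield the same drive and the same activation of u3:
# they belong to the same equivalence class, and their difference is "noise" to u3.
```

The same construction extends to any number of source units: the set of states producing a given drive is the hyperplane orthogonal to the weight vector through the corresponding point, exactly as described above.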
10.4.2 Other Mechanisms of Dynamical Modularity

It is in this sense that networks without obvious anatomical decomposition may be dynamically modular. This is analogous (albeit not identical, see Sect. 397) to the intuitions underlying chaotic itinerancy [27], where an orbit of a high-dimensional space may effectively occupy a subset of phase space, also defining a shape.
Fig. 10.7 Equivalent states and orbits from the perspective of a receiving unit u3. Shown is the subset of phase space formed by the outputs of units u1 and u2, which project to u3 with weights w21 and w31. The hyperplanes orthogonal to the weight vector (thin lines) define classes of equivalence of incoming activity with respect to the receiving unit. Two equivalent periodic orbits (a) and the resulting time series of the receiving unit (b) are depicted. The sequence of states in the orbits projects to states 1-2-3 (see the text)
The so-called attractor ruins (or Milnor attractors, depicted in Fig. 10.8a) illustrate a mechanism for dynamical modularity. Yet another result related to the lower-dimensional projection of attractors is given in the work of Dauce and colleagues. They show how parameterizations reduce complex chaotic attractors to lower-dimensional periodic attractors, as in Fig. 10.8b. These three ways of thinking are complementary: (1) Tsuda [27] points out that the dynamic activity of complex networks can spontaneously reduce to lower dimensions for some time interval; (2) Dauce [10] points out that reduction of dimensions can be induced by network parameterization (it is called "stimulus-forced"); (3) Beer [3, 4], Kruse, and Tani [23, 39], among others, indicate the role of lower-dimensional projections.
Fig. 10.8 (a) Chaotic itinerancy and (b) stimulus-forced reduction: dynamical views on reduction of dimensionality. See the text for references and explanation. ((b) from [10])
In the context of embodied problems, I add that readout units may select functional dynamics. In addition, I emphasize that the meaning of neural messages and functional equivalence is resolved at the receiving structures. Taken together, these insights show that dynamical modularity can be organized (1) spontaneously, (2) via driving input parameters, (3) via lower-dimensional projections, and (4) in the context of the receiving module. The emphasis on the latter subsumes the realization that neurons, networks, and modules are all semiotic devices – what they mean is what they are for. The space of dynamical variability that maps to a functionally equivalent subset is defined in semiotic terms. Noise, neural variability, and neural equivalence are defined in terms of the meaning of modules and their attractor landscapes.9
9 Relation to biology: The case of RNNs is much simpler compared with biology, but the argument still holds, with the contingencies that biological networks have very complex phase spaces (with time delays and diverse synapse types). But given the statistical regularities of neuronal lattices, one promptly sees that the space of structural variation is constrained to certain architectures. Functional invariance of conspecifics must rely on an analogous argument.
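The stimulus-forced reduction of dimensionality, point (2) above, can be illustrated with a minimal numerical sketch. The network below is a random discrete-time tanh RNN whose autonomous dynamics are typically irregular at this gain (an assumption about the random draw, not a guarantee); as the amplitude of a constant input along a fixed random direction grows, units saturate and the orbit collapses toward a low-dimensional, eventually stationary, attractor. This is only a cartoon of the phenomenon reported by Dauce and colleagues [10], not a reimplementation of their model.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
g = 2.5                                             # gain assumed high enough for irregular autonomous dynamics
W = g * rng.standard_normal((N, N)) / np.sqrt(N)    # random recurrent weights
b = rng.standard_normal(N)                          # fixed direction of the "stimulus"

def orbit_fluctuation(amplitude, steps=3000, discard=1000):
    """Mean per-unit fluctuation on the attractor reached under a constant input."""
    x = 0.1 * rng.standard_normal(N)
    samples = []
    for t in range(steps):
        x = np.tanh(W @ x + amplitude * b)
        if t >= discard:
            samples.append(x.copy())
    return np.array(samples).std(axis=0).mean()

for amplitude in [0.0, 0.5, 1.0, 2.0, 4.0]:
    print(f"stimulus amplitude {amplitude:.1f} -> mean fluctuation {orbit_fluctuation(amplitude):.4f}")
# The fluctuation typically shrinks as the stimulus grows: the input parameterization
# forces the high-dimensional orbit onto a smaller region of phase space.
```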
10.5 Discussion: Modular Function and Attractor Landscapes

Attractor landscapes are a helpful tool for conceptualizing modular function. In particular, the shapes of attractors in phase space, and the respective attractor landscapes given by possible parameterizations, are the explanatory entities providing a connection between structure and function. Regarding the foregoing, I speculate on the significance of the analogy to the analysis of structures and dynamics as they relate to (1) the meaning of modules, and (2) the understanding of the function of biological neural systems.
10.5.1 Modules, Attractor Landscapes, and Meaning

I espoused and exposed the idea that modules might have intrinsic functionality (e.g., tracking) while communicating within a network. This functionality may be dynamically engaged, and the module may only exist when it takes part in a function; otherwise the function only exists as a potential. Message exchanges are lower-dimensional projections of the metatransient trailing attractors, i.e., a time series relating to the attractor landscape and how it is parameterized. Such projections will putatively take the receiving module to particular areas of its attractor landscape, exerting functions necessary for behavior, such as shunting, amplification, modulation, sequencing, and control. If modularization is possible, attractor landscapes will have identifiable inputs and outputs. With R2, I provided a simple example where these are imposed by the sensory inputs (retina, proprioceptors) and the motor ensemble (wheels). Close to interfaces, the activity of neurons is bound by a specific input. The transduction of physical quantities extracted from the environment is lawful, and so is the whole system. Invariances appear as a consequence of the capability of modules to exchange messages meaningful to the system (e.g., through the body loop). Messages between modules should not be interpreted as possessing specific informational content. Although they are projections of the attractors, the projections they send only become meaningful as they are recontextualized, for example, in driving the activity of the receiving module, but always within its own domains. In this sense, it is not the output activation of the hidden layer units that tells the motors what to do. Rather, the motors themselves "understand" the message in their own terms. Only if the attractor landscape of a module is sensitive to the message will it drive the organism meaningfully. Modules interact on the fly across lower-dimensional projections (inputs and outputs), which are produced within modules, and are therefore dependent on what the sender and the receiver are about. According to this view, although networks might lodge an immense number of different attractor landscapes, only a few will have a landscape of the proper shape to solve specific problems, and this adequacy relates to what these modules are about. Figure 10.9 summarizes the present view on functional interactions between modules and how they relate to attractor landscapes.
Fig. 10.9 Modular function with reference to attractor landscapes and functional dynamics. Dashed lines stand for theoretical components
10.6 Summary
Modules and Messages

Neuronal aboutness. Neurons are about their role in the production of organismic function, as they stand in relation to the body and the environment. A proprioceptor is about a muscle, a pain receptor is about damage, a retinal sensor is about incoming light rays (Sect. 10.2.1).
Outlining modules. Outlining modules is a difficult task that requires a number of heuristics (Sect. 10.2.3).

Types of modules. On the basis of functionality, different kinds of modularity can be defined: vertical, horizontal, and monolithic (Sect. 10.3).

What do modules transmit? Modules transmit efferent projections of their metatransient, which results from the interaction of an input with the attractor landscape. Their transmissions are about themselves and the functional potentialities of their attractor landscapes.

What do modules receive? Modules receive inputs from a number of sources, including (1) parameterizations from the body loop, (2) efferent projections of neighboring modules, and (3) input from the world (Sect. 10.3.1.1). Parameters take the metatransient to particular regions of the attractor landscape. Because messages are contextualized in terms of the receiver, the same attractor landscape can project dramatically distinct messages to different receivers (monolithic modules and dynamical modularity).

What is the function of a module? What a module is for is a function of its attractor landscape, and of how inputs and initial conditions exploit it to produce function. Further, the function of a sending module is resolved at the receiving module, or in the dynamical interaction of both (Sect. 10.5.1).

Noise and equivalence. If different inputs do not produce functionally different output, then the inputs are equivalent, and their difference amounts to noise (Sect. 10.4).
References

1. Abeles M (1994) Firing rates and well-timed events in the cerebral cortex. In: Models of neural networks II: Temporal aspects of coding and information processing in biological systems. Springer, Berlin
2. Arbib MA (2007) Modular models of brain function. Scholarpedia 2(3):1869. URL http://www.scholarpedia.org/article/Modular_models_of_brain_function
3. Beer RD (1995) A dynamical systems perspective on agent-environment interaction. Artif Intell 72:173–215
4. Beer RD (2000) Dynamical approaches to cognitive science. Trends Cogn Sci 4(3):91–99
5. Benavides-Piccione R, Hamzei-Sichani F, Ballesteros-Yanez I, DeFelipe J, Yuste R (2006) Dendritic size of pyramidal neurons differs among mouse cortical regions. Cereb Cortex 16(7):990–1001
6. Braitenberg V, Braitenberg C (1979) Geometry of orientation columns in the visual cortex. Biol Cybern 33(3):179–186
7. Braitenberg V, Schüz A (1998) Cortex: Statistics and geometry of neuronal connectivity. Springer, Berlin
8. Clark A (1999) An embodied cognitive science? Trends Cogn Sci 3(9):345–351
9. Clark A, Chalmers D (1998) The extended mind. Analysis 58(1):7–19
10. Dauce E, Quoy M, Cessac B, Doyon B, Samuelides M (1998) Self-organization and dynamics reduction in recurrent networks: stimulus presentation and learning. Neur Netw 11(3):521–533
11. Edelman GM (1987) Neural darwinism. Basic Books, New York
12. Fodor J (1983) The modularity of mind. MIT Press, Cambridge, MA
13. Freeman W, Barrie J (2001) Chaotic oscillations and the genesis of meaning in cerebral cortex. Nonlinear Dynamics in the Life and Social Sciences
14. Freeman WJ (1995) Societies of brains: A study in the neuroscience of love and hate. Lawrence Erlbaum Associates
15. Freeman WJ (2004) How and why brains create meaning from sensory information. International Journal of Bifurcation and Chaos 14(2):515–530
16. Gärdenfors P (1995) Cued and detached representations in animal cognition. Lund University Cognitive Studies 38
17. Grush R (2004) The emulation theory of representation: Motor control, imagery and perception. Behavioral and Brain Sciences 27:377–442
18. Hagmann P, Cammoun L, Gigandet X, Meuli R, Honey C, Wedeen V, Sporns O (2008) Mapping the structural core of human cerebral cortex. PLoS Biology 6(7):e159
19. Harvey I, Paolo ED, Wood R, Quinn M, Tuci E (2005) Evolutionary robotics: A new scientific tool for studying cognition. Artificial Life 11(1–2):79–98. URL http://www.mitpressjournals.org/doi/abs/10.1162/1064546053278991
20. Haugeland J (1995) Mind embodied and embedded. Acta Philosophica Fennica
21. Herrnstein RJ (1970) On the law of effect. Journal of the Experimental Analysis of Behavior 13(2):243–266
22. Hülse M (2006) Multifunktionalität rekurrenter neuronaler Netze – Synthese und Analyse nichtlinearer Kontrolle autonomer Roboter. PhD thesis, Universität Osnabrück
23. Ikegami T, Tani J (2002) Chaotic itinerancy needs embodied cognition to explain memory dynamics. Behavioral and Brain Sciences 24(5):818–819
24. Ito M (2008) Control of mental activities by internal models in the cerebellum. Nature Reviews Neuroscience
25. Izhikevich EM (2006) Polychronization: Computation with spikes. Neural Computation 18:245–282
26. Jaeger H, Maass W, Markram H (2007) Special issue: Echo state networks and liquid state machines. Neural Networks 20(3):290–297
27. Kaneko K, Tsuda I (2003) Chaotic itinerancy. Chaos: An Interdisciplinary Journal of Nonlinear Science 13(3):926–936
28. Krichmar JL, Edelman GM (2003) Brain-based devices: Intelligent systems based on principles of the nervous system. In: Proceedings of the International Conference on Intelligent Robots and Systems
29. Leicht EA, Newman ME (2008) Community structure in directed networks. Physical Review Letters 100:118703
30. Menzel R, Giurfa M (2001) Cognitive architecture of a mini-brain: the honeybee. Trends in Cognitive Sciences 5(2):62–71
31. Merleau-Ponty M (1942; translation 1963) The structure of behavior. Duquesne University Press
32. Mountcastle V (1997) The columnar organization of the neocortex. Brain 120:701–722
33. Pfeiffer K, Homberg U (2007) Coding of azimuthal directions via time-compensated combination of celestial compass cues. Current Biology 17(11):960–965
34. Simon H (1969) The sciences of the artificial. MIT Press, Cambridge
35. Sporns O (2002) Graph theory methods for the analysis of neural connectivity patterns. In: Neuroscience databases: A practical guide, pp 169–183
36. Sporns O, Kötter R (2004) Motifs in brain networks. PLoS Biology 2:1910–1918
37. Sporns O, Honey C, Kötter R (2007) Identification and classification of hubs in brain networks. PLoS ONE 2(10)
38. Strausfeld N (2009) Brain organization and the origin of insects: an assessment. Proceedings of the Royal Society B: Biological Sciences
39. Tani J (1998) An interpretation of the 'self' from the dynamical systems perspective: A constructivist approach. Journal of Consciousness Studies 5(5–6):516–542
40. Thompson E, Varela F (2001) Radical embodiment: neural dynamics and consciousness. Trends in Cognitive Sciences 5(10):418–425
41. Eaton R, Hackett J (1984) The role of the Mauthner cell in fast starts involving escape in teleost fishes. In: Neural mechanisms of startle behavior. Springer
42. Tononi G, Sporns O, Edelman G (1994) A measure for brain complexity: Relating functional segregation and integration in the nervous system. Proceedings of the National Academy of Sciences 91(11):5033–5037
43. Varela F, Thompson E, Rosch E (1991) The embodied mind. MIT Press
44. Varela F, Lachaux J, Rodriguez E, Martinerie J, et al (2001) The brainweb: phase synchronization and large-scale integration. Nature Reviews Neuroscience 2(4):229–239
45. Watson RA, Pollack JB (2005) Modular interdependency in complex dynamical systems. Artificial Life 11(4)
46. Wolpert DM, Doya K, Kawato M (2003) A unifying computational framework for motor control and social interaction. The Royal Society Journal
Chapter 11
Conclusion
11.1 The Search for Mechanism Within a Meaningful World

11.1.1 Contraptions, Analogies, and Explanation

11.1.1.1 We Are the World

Despite our best – if moribund – dualistic intuitions, we cannot efface reality, and it cannot dodge us. This is true of all things in the universe, of which we are an uncanny expression. In building models of organisms, the first thing to acknowledge is the circle that is closed by existence; every object or being in the world is also a part of it. Objects compose the world in the same measure that they are enveloped by it, and this is valid for animate and inanimate alike.

11.1.1.2 To Understand Is to Reconstruct

Add to that the maxim of the Italian philosopher Vico: "one understands what one is able to reconstruct." This is the motivation for building embodied contraptions and embedding them in simulation worlds. What do we understand about beings when we develop simulations and contraptions of organisms in the world? Foremost, we learn much about what we assumed, and that is invaluable. The world does not work on assumptions, but scientists have no choice. But they can and must compare the results of testing their contraptions with what actually happens, and see how well they fit. From organisms we abstract implementations into mechanisms, whereas in machines we implement abstractions of mechanisms. When there are correspondences between the consequences, from machines to organisms, we have learned something in Vico's sense.

11.1.1.3 Reconstructing with Neurodynamics and Evolutionary Robotics

Embodied neural networks are abstractions over organisms. But as abstractions, they have consequences of their own, mapping back to organismic phenomena.
The entities of a dynamical system (attractors, transients) stand for their counterparts in biology, and become analogies that are informative about the consequences of the assumptions. In our elaborate analogy, functional behavior is a consequence of parameter-driven changes in attractor structure that influence the metatransient. As the portions of the attractor landscape accessible to the network change as a function of the input, so does the overlying metatransient. An attractor landscape is the collection of attractors accessible through parameterizations, and it may lodge attractors of different types.
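As a toy illustration of that dependence, consider a minimal sketch (mine, not one of the evolved networks discussed earlier in the book; the weight value and the sweep range are arbitrary): a single recurrent unit whose input parameter reshapes the available fixed points while the iterated state, a crude metatransient, shadows them.

    import numpy as np

    # One discrete-time unit with self-connection w; the input theta is the
    # parameter that reshapes the attractor landscape as the state is iterated.
    w = 3.0  # hypothetical self-connection; w > 1 permits bistability

    def step(x, theta):
        # a single update of the unit
        return np.tanh(w * x + theta)

    x = 0.0
    trajectory = []
    for theta in np.linspace(-1.5, 1.5, 600):  # slow drift of the input parameter
        for _ in range(5):                     # a few settling steps per value
            x = step(x, theta)
        trajectory.append((theta, x))

    # Over part of the sweep two stable fixed points coexist, so the state
    # clings to one branch and later jumps to the other: the trajectory follows
    # the moving landscape rather than any single attractor.
    print(trajectory[0], trajectory[-1])

The sweep is, of course, a caricature of sensor-driven parameterization, but it shows the logic: change the parameter, and the accessible attractors, and with them the metatransient, change too.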
11.1.1.4 Functional Equivalence

Although attractors fall into a small number of categories (i.e., fixed points, cyclic attractors, quasi-periodic attractors, and chaotic attractors), their characteristics are far from exhausted by this categorization. In some sense, the shape of an attractor is its identity. This is meant in the sense that not all four-sided figures are squares, and not all squares are of the same size: the number of edges does not exhaust the relevant aspects of the figure. Attractors can be seen as figures in phase space, which is, after all, a measurable space. In this sense, even chaotic attractors, with all their ineluctable difficulties, may be simple in the immediate functional consequences of their endlessly complex shapes. Equivalence is defined in terms of the function executed: a chaotic attractor, however complex, is equivalent to any other attractor as long as the functional consequence is the same.
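The coarse categorization itself is easy to operationalize; what it leaves out is precisely the shape. Here is a minimal sketch (mine, using one-dimensional maps as stand-ins for parameterized networks; the tolerances and horizons are arbitrary) that sorts a settled orbit into the categories above by searching for a short period:

    def classify_attractor(f, x0, transient=500, horizon=200, tol=1e-6, max_period=32):
        # discard the transient, then record the settled orbit
        x = x0
        for _ in range(transient):
            x = f(x)
        orbit = []
        for _ in range(horizon):
            x = f(x)
            orbit.append(x)
        # report the smallest period that (approximately) repeats
        for p in range(1, max_period + 1):
            if all(abs(orbit[i] - orbit[i + p]) < tol for i in range(horizon - p)):
                return "fixed point" if p == 1 else "period-%d cycle" % p
        return "quasi-periodic or chaotic (no short period found)"

    # Example: the logistic map at two classic parameter values.
    print(classify_attractor(lambda x: 3.2 * x * (1 - x), 0.4))  # a period-2 cycle
    print(classify_attractor(lambda x: 3.9 * x * (1 - x), 0.4))  # no short period

The sketch names the category but says nothing about the attractor's extent, its position in phase space, or how it projects onto the motors; that remainder is exactly what functional equivalence is about.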
11.1.1.5 Functional Invariances Between Constancy and Variability

And thus we came to functional invariances: those aspects of the abstraction that are invariant with respect to a function. They define the equivalence class of a system with respect to a function. Our assumptions and model have helped us to understand the origin of a measured invariance in terms of a behavioral function. They have also helped us to see why these invariances turn up again every time a function appears. They have helped us to see how function may appear in evolution, and why there will be invariances shared across organisms that evolved homologous functions. They have helped us to see the space of variations in neural structures and dynamics, and how these are constrained by function. This is pretty good – for an analogy. How about the real thing?
11.1.2 Empirical Invariances: Flickering Lights

Shining a light onto an object suspended in front of a surface projects a shadow, an aspect of the real thing. Imagine we have no access to the object, but only to its projected shadow. Imagine we can move a point light around the obstruction,
a transformation, and that we can choose the shape of the projection surface. Then, if the line from the point light through the center of the object is perpendicular to a plane projection surface, the shadow is a circle. Bringing the light source closer to the object renders a growing circle, and moving it further away renders a shrinking circle. If we move the point light so that the line between the light and the center of the object is not perpendicular to the plane surface, the shadow becomes an ellipse. Moving it further down renders a hyperbola, where the surface sections the light cone. If the wall is not a plane, then the shadow is a product of the light cone and the projection screen, and will have features of both the object and the wall. Empirical methods of brain measurement are like a combination of a light source and a projection screen. The measurement produced is a product of all three: the object, the light, and the wall. We want to discover the essence of the object; we are bound, however, to the light and the screen. In the brain, every single time one shines a light, the projected shadow is a slightly different one. Moreover, we cannot see the screen, but only the shadow projected onto it. If we cannot know the surface of the screen exactly, it is hard to tell how much of the projected shape is a function of the position of the light and how much is a function of the surface of the screen. Measurements of brain function are like that. It is as if the wall were constantly changing and the light source constantly flickering. We try to make out the occluded object from projected shadows that wobble as the light flickers and the surface changes. We try to see beyond the light and the surface, summing up all the shadows, to discover the object itself. The sum of all shadows is not the object. But their conjunction induces theories about the object, in that they relate the transformations (position of the light, shape of the surface) to the shadow (the empirical invariant). This is what we do with invariant analysis of behavioral function.
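For the record, the geometry behind the first half of the analogy can be stated compactly; this is the standard conic-section relation, not anything specific to the models in this book. If the light cone has half-angle \alpha and the plane screen makes an angle \beta with the cone's axis, the eccentricity of the shadow's boundary is

    \[
      e = \frac{\cos\beta}{\cos\alpha},
    \]

so that \beta = 90^\circ gives e = 0 (a circle), \alpha < \beta < 90^\circ gives 0 < e < 1 (an ellipse), \beta = \alpha gives e = 1 (a parabola), and \beta < \alpha gives e > 1 (a hyperbola). One transformation parameter, smoothly varied, already walks the shadow through qualitatively different figures.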
11.1.2.1 Functional Analysis and Reconstruction

That is why we find place cells, or grandmother cells: because we look for them. The combination of place cells and grandmother cells can enable the function of visiting granny. But none of this is definable without reference to behavioral function. A functional analysis of neural systems cannot exhaust the dynamical potentiality of neural structures. On the other hand, quantifications of anatomical features do not reveal function. To understand the interaction between function and structure, it is thus necessary to identify the functional consequences of the dynamics afforded by structures, regarding the meaning of efferent and afferent streams in terms of the organism. If this identification is not attainable, rather than regarding the system as a composition of functions of parts, it is more useful to regard the dynamics of the whole. This is why I have emphasized the creation of meaning (see also Merleau-Ponty [1]). The meaning of neural structures is coextensive with their potential for behavioral function. Equivalence arises because different structures may have the same meaning for the organism.
Empirical invariances at particular levels, such as place cells, grid cells, or face cells, are not incontestable indicators of localized function. These invariances are aspects of a process, one that may not allow functional decomposition. Even if lesions of the grid cells impair the function of the overall system, grid cells are not thereby modules per se. Rather, they should be regarded as projections of a complex attractor landscape onto a subset of the network, seen from the one perspective selected by the experimental paradigm. Although the invariant features reported by these cells may be traceable to functions at the organismic level (such as knowing one's place), to attribute function to these cells is a case of excessive functional objectivism. Unwarranted assumptions about the functional roles of cells may lead to a mistaken picture of these neurons as pieces of a clockwork mechanism. Against that picture, I surmise that empirical invariants are like projections of cerebral activity, and projections, by their very nature, are not functional. Projections are not functional in the same way that a photograph is not a place. This vacillating ontological status notwithstanding, empirical invariants of function are fundamental for an understanding of brain structures. They provide the necessary criteria that models must address and explain. Anatomical invariances fall into the same category. Any explanation of brain function has to take into account the exquisite structures of cerebral organs such as the cerebellum or the hippocampus, and include them in the explanation. Functional invariances add boundary conditions to the explanatory equation. No explanation is complete without simultaneously addressing all these aspects. Taking all these factors into account seemingly runs against the prescriptions of simplicity of the scientific method, and may cause the stern scientist to cringe in the face of complexity. The brain, however, is not simple. It may alleviate the twinge at Occam's area to point out that by embracing these constraints, one reduces the problem to determining the types of dynamics with functional consequences, and from there to finding the set of structures that possess them. If, on one hand, this is easier said than done, on the other, it is better to work on a difficult problem that is well defined than on an easy problem that is ill defined. Whatever the case, if we aim for consilience with the neuroscientific understanding of brain function, it is incumbent upon modelers to propose how networks resolve meaning by (1) naming the form of activity exchange, (2) saying what the messages are of, (3) or about, (4) or for. The teleological approach (e.g., this one) surmises that processes unfold in the high-dimensional space of the computing network, and that the projection of the attractor carries orderly messages ultimately about the organism. When the four aspects above can be satisfactorily addressed, we will understand more about the brain. Brain prosthetics would then become a viable treatment for a localized lesion. What must these putative brain prosthetics contain, what (mathematical) functions should they deploy, such that the function of a defective structure may be emulated? If functions can be attributed to particular structures, those structures possess attractor landscapes subserving those functions. Emulating an equivalent attractor landscape amounts to engendering the potentiality for equivalent
functional dynamics. Emulating an invariance (e.g., a place cell) is insufficient for function. One must emulate the functional consequences of a substrate that has invariants as a consequence of its function. The brain is a hermeneutic machine. What it is and what it does can only be understood in terms of the whole being and its ecological problems. The brain must be explained in terms of the potential functions of its wholeness. Move a toe, blink an eye, and scratch your head at the same time, and the activity of the whole brain is about these three acts simultaneously. One brain does all. But underneath, two different instances of the behavior "finger wiggling" are likely to share some invariance subserving it. Because, ultimately, finger wiggling is about the finger itself. Invariance of behavior starts at the tip of the finger.
11.1.2.2 How Not to Do It

The analysis of brain function cannot forsake the dynamical contexts, the environment (including the cellular environment), and the body as core constituents of explanation, to the degree that utterances about cerebral function fade to meaninglessness the further they stray from organismic reality, as when logical operations are attributed to dendrites, or when knowledge of ballistics is attributed to the archerfish. Questions become meaningless when the function assumed is not in the interest of the subject, but in the interest of the observer. The fish does not need ballistics; it requires ecologically attuned sensorimotor loops. Dendrites, likewise, may act under certain conditions as a standard logical operator (a particular input current, precise positions of the injecting and measuring electrodes in the dendritic tree, a particular stimulation duration, a particular temporal difference between the two stimuli, and so forth). But to ask this question is to ask the wrong question. Only through a dynamic comprehension of a neuron's capabilities can all of a neuron's potential be unveiled. And I would happily put my money on the proposition that XORs or NANDs are the last thing on neurons' or dendrites' minds. Neurons are not logical operators; we are. And we can make them be whatever we want them to be. But they will not be that; they will be what they are, nothing more, nothing less.
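To make the observer-dependence concrete, here is a toy illustration (the unit and every parameter value are invented for this sketch, not drawn from any model in this book): a thresholded sum reads as a NAND gate only under one particular choice of weights and bias, and the reading dissolves as soon as the conditions shift.

    # A single thresholded summation unit; whether it "is" a logic gate depends
    # entirely on the parameters and the input coding chosen by the observer.
    def unit(x1, x2, w1, w2, b):
        return int(w1 * x1 + w2 * x2 + b > 0)

    inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

    # Under these (arbitrary) conditions the truth table reads as NAND: [1, 1, 1, 0]
    print([unit(x1, x2, w1=-1.0, w2=-1.0, b=1.5) for x1, x2 in inputs])

    # A modest change of "conditions" (here only the bias, standing in for the
    # stimulation protocol) yields a different table, [1, 0, 0, 0]: the gate was
    # in the parameterization, not in the unit.
    print([unit(x1, x2, w1=-1.0, w2=-1.0, b=0.5) for x1, x2 in inputs])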
11.2 The Current Stage

Concerning the progress of our understanding of neural systems, the stage of scientific history in which we find ourselves is a good one, if muddled. We amass large quantities of papers on topics as broad as the scientific horizon. Lest we find ourselves disoriented, we must hone our ability to find clear routes and visible landmarks instrumental for progress. Still, in the distance the shapes on the horizon seem to sharpen, and a coherent picture seems to be forming. There is, however, much to do to justify the optimism that already exists, and to sustain the enthusiastic momentum
enjoyed by the sciences of the brain and of the artificial. This enthusiasm is seen in all branches of brain science: computational neuroscience, neuroinformatics, artificial life, evolutionary robotics, biosemiotics, computational neuroethology, autonomous robotics, biorobotics, and more to come. They are all kin, and share the momentum of optimism, producing threads of understanding as they go. But it can be difficult to sort the wheat from the chaff, even more so in a tempest. Or perhaps I should borrow a better metaphor, from Heraclitus: those who seek gold dig much dirt and find little.
References

1. Merleau-Ponty M (1942) The structure of behavior (trans. 1963). Duquesne University Press, Pittsburgh, PA
Index
A Active tracking, 147–148 Anatomical modularity, 219–220 Attractor landscapes convergence and motor projections, 151–152 description for, 146–147 dynamical entity generating behavior, 153 dynamics vs. behavior, convergent landscapes, 170–172 embodied discrete time RNN, 150 equivalence and convergent activity attractor shapes and action, 159–163 features, 154–159 motor projections, 153–154 fitness function, 149–150 Innenwelt black box, 143 invariants of behavior, 163–166 Merkwelt and Wirkwelt, 142–143 metatransient, 146 negative feedback, 157–159 agent-environment diagrams, 166–167 holistic description, behavioral function, 169–170 mechanism, 168–169 primordial function, 167–168 neurodynamics explanatory loop, 143, 144 path in parameter space, 145 simple network solutions, 152 structural coupling, 145 structure-attractor landscape-function, 163 toy problem active tracking, 147–148 description, 148–150 Open Dynamics Engine simulation physics, 149 tracker challenges, 150–151 tracking across attractors, 152 transients and behavior, 144–145
Attractors bifurcation diagram and first return map, 130, 131 convergence and motor projections, 151–152 definitions, 129–130 Hodgkin–Huxley model, 107 motor projections, 153–154 and neurodynamics, 135–136 shapes and action chaotic attractor, first return map, 159, 160 coexisting attractors, 161–163 morphing attractor, parameter space, 160, 161 types, 130, 131
B Baldwin effect, 172 Basin of attraction, 132 Behavioral function, 43, 44 Behavioral traits, 7 Behavior, invariances in theory anatomy vs. function, 25 architectonic invariances, 22–23 Braitenberg vehicle, 24 cybernetics, 28–31 description for, 20 function and invariants of behavior, 34–36 genes, 20–22 mappings, 26 neuroanatomical variation, 25 requirements, 26–28 schema theory and functional overlays, 31–33 Biology, invariances in theory autopoiesis, 16 boundaries, 19
Biology, invariances in theory (cont.) context-dependent, 14–15 developmental systems theory critique, 17 development and biophysics, 17–18 evolution, 18–19 genes, 16 genetic triggers, 19–20 life, 15–16 ontogeny, environment, 17 physics and mathematics, 14 species identity, genes, 19 Biophysical invariants, 43 Brain function cerebral organs, structures, 242 description, 32 measurements, 241 neuroscientific understanding, 242 Brain wiring, DTI and diffusion spectrum imaging, 51–52 Braitenberg vehicle, 24
C Coexisting attractors, 161–163 Computational models dynamical neural patterns action potentials varieties, 67 appearance, 66 creativity, 68 Hodgkin–Huxley model, 68 oscillations, 67 types, 66 from natural patterns, 63–66 biological system, 64 causal patterns, 65–66 invariant rules, 64–65 neural invariance backpropagation, 72–73 constancy and variability, 68–69 decorrelation principles, 74–77 Hebbian plasticity, 69–70 Hopfield networks, 73–74 Kohonen’s self-organizing maps, 70–72 neuroscience, 69 Constancy behavioral sources attractors and attractor structure, 204 motors, convergent activity, 204–205 RNN, 205 stimuli set, 204 electrophysiology, 53 evolutionary sources behavioral function and fitness, 206 bodily features, 205
environment and agent’s body, 205 neural components, 205–206 Convergence evolution, attractor landscapes, 201–202 and motor projections, attractor, 151–152 neuromuscular junction muscle fiber, 94–97 stochasticity, 93–94 Convergent evolution, behavioral function brain, multifunctional survival tool, 177 constancy behavioral sources, 204–205 evolutionary sources, 205–206 controversy Cambrian explosion, 182 divergence and convergence, experiment, 183–185 Gould and Morris controversy, 182–183 organismic appearance, 185 Darwin’s confessions, 178–179 dichotomies, 178 instinct analogous behavior, 202 cybernetics, 202 definition, 202 invertebrates and vertebrates, 203 theory of evolution, 202 modular structures and dynamics, 210–211 moment of invention, function, 179–180 neutral mutation and function appearance affirmative and neutral mutations, 181–182 RNA molecules, 180, 181 Schuster demonstrations, 180–181 punctuated equilibria, 179, 180 tracking, evolutionary phenomena extensions, 192–201 invariant rules, 186 simple tracking experiment, 187–192 variability behavioral sources, 206–207 evolutionary sources, 207–208 Cybernetics, 28–31, 202
D Decorrelation principles, 74–77 Diffusion tensor imaging (DTI), 43, 51–52 Divergence, level crossing, 94 DTI. See Diffusion tensor imaging (DTI) Dynamical entity generating behavior, 153
Dynamical systems coupled dynamical systems, motor contraction, 90 coupling neurons, 91 description, 89 gesture levels, 91 integrating levels, 92–93 proprioceptive, 91 skeletal level, 91 deterministic model, 83 integrative theory, 82–83 and models, 6 patterns (see Computational models) vocabulary and interfaces, 88–89 terminology, 84–88 Dynamic modularity chaotic itinerancy, 232, 233 geometrical nature, 231 and monolithic modules, 229–230 stimulus-forced reduction dynamics, 232, 233
E Electroencephalography (EEG) electrophysiology, 45–47 empirical issues, 47–48 Embodied discrete time RNN, 150 Empirical assessments, invariance epistemological contentions, 59 neural activity measurement brain and behavioral function, 44 brain wiring, 51–52 DTI and diffusion spectrum imaging, 51–52 EEG, oscillations and potentials, 45–48 electrophysiology, 53–54 fMRI, function localization, 48–51 language of explanation, 55 measurement tools and analysis methods, 45 neuroscience, 41–42 partial pictures, 58–59 sources of variation instrumental sources, 55–57 and repeatability, different levels, 57–58 types, 58 typology of function, 42–44 Empirical invariances, flickering lights brain measurements, 241 functional analysis and reconstruction brain prosthetics, 242
hermeneutic machine, 243 neural systems, 241 standard logical operator, dendrites, 243 teleological approach, 242 ENS3 algorithm, 138 Epistemology, 28, 53–54, 169 Evolutionary robotics Darwin’s evolutionary ratchet, 136 neurodynamics and, 136 problem, assumptions of, 137–138 structural evolution and simulation, 138–139 usage, 137
F Feigenbaum plot, 132 Fitzhugh–Nagumo model, 108, 109 fMRI active areas, 49–50 BOLD signal, 48–49 function localization, 50–51 Francis Crick’s admonishment, 112 Functional semiotics, 218–219
G Genes and explanation of behavior, 22 invariants, 16, 18 species identity, 19 triggers, 19–20 untenable, invariants of behavior, 20 Gould and Morris controversy, convergent evolution, 182–183
H Hebbian plasticity, 69–70 Hermeneutic machine, 32, 218, 243 Hodgkin–Huxley model bifurcation theory, 106 categories, 106–107 chronaxie, 108 dynamical neural patterns, 68 dynamical systems theory analysis, 107 Fitzhugh–Nagumo model, 108, 109 Galvani to Hodgkin and Huxley, 102 invariant cycle, 106 Morris–Lecar model, 108 parameter space analysis, 107 rheobase, 108 Hopfield networks, 73–74
Horizontal modularity and artificial evolution, 228 examples, 227–228 module communication, neuronal level, 227
Ion channel hypothesis, 103–104 Isoperiodic plot construction, 134 dynamical landscape example, 134 limitations, 135
I Instinct, convergence analogous behavior, 202 cybernetics, 202 definition, 202 invertebrates and vertebrates, 203 theory of evolution, 202 Invariances in theory behavior anatomy vs. function, 25 architectonic invariances, 22–23 Braitenberg vehicle, 24 cybernetics, 28–31 description for, 20 function and invariants of behavior, 34–36 genes, 20–22 mappings, 26 neuroanatomical variation, 25 neuroanatomy, 21 requirements, 26–28 schema theory and functional overlays, 31–33 in biology autopoiesis, 16 boundaries, 19 context-dependent, 14–15 developmental systems theory critique, 17 development and biophysics, 17–18 evolution, 18–19 genes, 16 genetic triggers, 19–20 life, 15–16 ontogeny, environment, 17 physics and mathematics, 14 species identity, genes, 19 physics and mathematics changes, 11 energy, 12 invariants prefigure theories, 14 law of gases, 12–13 Planck’s constant, 11–12 speed of light, 11 sum of internal angles of the triangle, 13 Invariant cycle, 106
K Kohonen’s self-organizing maps, 70–72
L Landscape plot, 134 Learning rules, 69–71 Level crossing. See Convergence, neuromuscular junction Level of analysis, 82, 86 Limit cycles. See Invariant cycle
M Mechanism search constancy vs. variability, functional invariances, 240 embodied contraptions, 239 empirical invariances, flickering lights brain function measurements, 241 functional analysis and reconstruction, 241–243 functional equivalence, 240 neurodynamics and evolutionary robotics, 239–240 Messages and noise, 216 Modeling and invariance and computational models dynamical neural patterns, 66–68 from natural patterns, 63–66 neural invariance, 68–77 constancy and variability, modeling, 78 invariants and structure, 77–78 Modularity anatomical, 219–220 dynamical chaotic itinerancy, 231–233 geometrical nature, 231 and monolithic modules, 229–230 stimulus-forced reduction dynamics, 233 functional semiotics, 218–219 heuristics for connectivity, 220 function, 220 homogeneous input and output, 221 localized lesions, 220
synchrony, 221 temporal contiguity, 220 types horizontal, 226–228 monolithic modules and dynamic modularity, 229–230 vertical, 222–226 Modular organization, 7–8 Morphing attractor path in parameter space, 160 sequence, 161 Morris–Lecar model, 108
N Negative feedback, functional behavior agent-environment diagrams, 166–167 attractor landscapes pitch and yaw motor, 157, 158 totality, possible input space, 157, 158 cybernetics, 123 holistic description, 169–170 mechanism, 168–169 nervous system, 28 primordial function, 167–168 Network models Hodgkin–Huxley model difficulties dendrites cable models, 111 motor behaviors, 112 parameterizing structure, 113 template, 112 units and system, properties, 113–114 Neural activity communication (see Neural modularity) convergent level crossing, 93 Hebb’s rule, 70 measurement brain and behavioral function, 44 brain wiring, 51–52 DTI and diffusion spectrum imaging, 51–52 EEG, oscillations and potentials, 45–48 electrophysiology, 53–54 fMRI, function localization, 48–51 language of explanation, 55 measurement tools and analysis methods, 45 structure of interaction, invariance, 76 Neural invariance backpropagation, 72–73 constancy and variability, 68–69 decorrelation principles, 74–77 Hebbian plasticity, 69–70 Hopfield networks, 73–74
Kohonen’s self-organizing maps, 70–72 Neural modularity anatomical modularity, 219–220 body as module, 215 dynamical modularity chaotic itinerancy, 232, 233 geometrical nature, 231 and monolithic modules, 229–230 stimulus-forced reduction dynamics, 232, 233 functional semiotics, 218–219 heuristics for, 220–221 messages and noise, 216 modular function and attractor landscapes, 234–235 neural modularization, 214–215 types, 221–230 Neural network models. See also Computational models assumptions, 114 dendrites, 116 discrete time steps, 116 discussion of, 115 incontestable, 117 units, 115 Neuroanatomy, 21–22, 25, 42 Neurobiology, 3, 5 Neurodynamics attractors, 129–130, 135–136 basin of attraction, 132 bifurcation sequence, 132–133 and evolutionary robotics, 136 isoperiodic and landscape plot, 133–135 neuromodule, 127–129 orbit, two-dimensional phase space, 129, 130 transients, 130–131 varieties of, 125–127 Neuromodule, 127–129 Neurons, models, and invariants categories emerge, 108 Aplysia neurons, 110 deep homologies, 110 neuron parameters, 111 Galvani to Hodgkin and Huxley, 102 Hodgkin–Huxley model, 101, 106–108 ion channel hypothesis, 103–104 law of large numbers, ion channel activity, 105–106 network models Hodgkin–Huxley model difficulties, 111–112 parameterizing structure, 113
Neurons, models, and invariants (cont.) template, 112 units and system, properties, 113–114 neural network models, 114–117 neuron complexity, 117–118 proteins, ion channels, 104 smooth cross-membrane conductance, 102–103 Neutral mutation and function appearance affirmative and neutral mutations, 181–182 RNA molecules, 180, 181 Schuster demonstrations, 180–181
O Orbit defined, 86, 129 plots, output phase space, 155 two-dimensional phase space, 129, 130
P Parameter space analysis, 106–107 Physics and mathematics, invariances in theory changes, 11 energy, 12 invariants prefigure theories, 14 law of gases, 12–13 Planck’s constant, 11–12 speed of light, 11 sum of internal angles of the triangle, 13
R Recurrent neural networks (RNN). See Neurodynamics Reflex cybernetics, 28–29 invariants of behavior, 27 stimulus pairs, 27 theory, limitations, 27–28 Robotics. See Evolutionary robotics R2 three-dimensional tracking behavioral function, 200–201 contingent function and invention of function, 199–200 experimental setup artificial evolution, 198–199 holonomic drive environment, 197, 198 robot morphology, 196 sensors and motors, 196–197
S Self-organizing maps (SOMs). See Kohonen’s self-organizing maps Simba environment, 138 Sinai three-dimensional tracking experiment setup environment, 193, 194 evolution procedure, 194 sensors and motors, 193–194 fitness evolution, 195 quality, 195–196 Single neuron models, 112, 113 Sources of variation instrumental sources behavior, measurement and invariants, 57 context and initial conditions, 56 neural implementation, 56–57 repeatability, different levels, 57–58 types, 58 Structure-attractor landscape-function, 163
T Teleological approach, 242 Tracking, evolutionary phenomena extensions, 192–201 invariant rules, 186 simple tracking experiment fitness function values, 187 graph theory connectivity measures, 190 inventive evolution, 192 monotonicity, 187 motor projection comparison, 190–192 mutation rate impacts, fitness, 187 network connectivity values, 190, 191 pairwise motor projection plots, 188, 189
V Variability, structure and function behavioral sources higher period and chaotic attractors, 207 paths, parameter space, 206 transients and metatransient, 206–207 evolutionary sources, 207–208 Vertical modularity features, 222 neurodynamics in R2 attractor landscape modulation, 225–226
internal activity, module, 223 modular evolution, 223 retinal and wheel module, 224 south wheel control, 225 vertical module and embodied computation, 223 prey-catching behavior, 222 Viability, 31 Vocabulary, dynamical systems interfaces coupled dynamical systems, 89 divergence and convergence, 98
scope, 88 terminology orbit, 86 parameters, 87–88 phase space, 85 rules, 86–87 state and state variables, 84–85
Y Yet another robot simulator (Yars), 138